Audience and Market Research Synthesis

Intermediate

After this you can turn scattered signals — reviews, sales-call language, competitor moves, market chatter — into a market read you have actually verified, then mine that read for campaign angles, treating every AI-surfaced insight as a hypothesis to check rather than a finding to ship.

Understand

Audience research is the highest-return, lowest-risk thing you can point AI at, and it is the thing operators reach for least, because it produces nothing you can publish. There is no asset at the end, just a sharper read. That is exactly why it pays: the read is what makes everything downstream non-generic. A model handed customer language, real objections, and the specific pain your buyers keep naming does not regress to the bland center the way an empty prompt does. The research phase is where you acquire the proprietary context, and the generation phase inherits it. Skip the research and you are asking the model to invent your market from its averaged prior, which is the definition of slop produced faster.

The trap sits inside that same convenience. Ask a model to "summarize what customers think about onboarding" and it will hand you a clean, confident paragraph. Some of it is genuinely synthesized from the reviews you pasted. Some of it is the model filling gaps with the statistical average of everything ever written about onboarding — plausible, well-shaped, and not actually about your customers. The two are indistinguishable on the page. Both read as a finding. The whole discipline of this module is refusing to let them stay indistinguishable.

Synthesis or averaged prior: the same confident paragraph can come from your pasted signals or from the model's generic training average, and on the page the two are indistinguishable until you trace each claim back to a source.

There is a measured reason the averaged version is so seductive. A 2023 HBS/BCG study of 758 consultants found that, on tasks within the model's capabilities, giving them AI raised average output quality by roughly 40% while narrowing the spread of ideas across the group: the work got better and more samey at the same time. That is the failure in miniature. The model lifts you to a competent middle and quietly removes the edges, and the edges are where a real market insight lives. A "trend" the model surfaces with no specific source behind it is usually that middle wearing the costume of a discovery: the consensus take, the thing every competitor's model also produces, dressed up as something you found. Ship it into a campaign and you have spent budget broadcasting the average of your category back at it.

So synthesis is not the act of asking the model to summarize. It is the act of forcing provenance. Every line in the read gets tagged: is this grounded in a signal I actually supplied, or is it the model's inference? A grounded claim cites the review or the call it came from. An inferred claim is a hypothesis — possibly true, not yet load-bearing. You can still act on hypotheses, but you act on them as bets to test, not facts to build on. The verified read is the subset that survived that sort.

The hypothesis gate: the verification step that sits between an AI-surfaced insight and any campaign decision. Grounded claims pass into the read; ungrounded ones get tagged as hypotheses and either tested or held.

Only once you have that verified read does angle generation earn its keep. Here the move inverts: you want quantity, not a single tidy answer. Ask for one campaign angle and the model gives you the obvious one — the angle your three nearest competitors are also running, because it is the center of the same distribution. Ask for eight or ten angles spanning different pains and audiences and you force it out toward the edges, where the non-obvious ones live. Then you select. Generation is cheap and convergent; selection is where your judgment does the work, keeping the angle that ties to a verified customer pain and discarding the ones that sound clever but trace back to nothing.

Where it breaks

This whole approach assumes your signals are real signals. Feed the model fifteen five-star reviews you cherry-picked and the verified read faithfully synthesizes a market that does not exist — every claim is grounded, every claim traces to a quote, and the read is still wrong because the input was skewed. Verification confirms that a claim came from your sources; it cannot confirm your sources represent the market. Thin or biased input produces a confidently-sourced fiction, which is worse than an obvious guess because it passes the provenance check.
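
Provenance tagging cannot catch a skewed input, but a quick look at the input mix before synthesis can flag the most obvious cases. The sketch below is an illustration, not part of the module's method. It assumes each signal carries hypothetical metadata keys `source` and `rating`, and the `dominance` cutoff of 0.8 is an arbitrary choice. The function name `input_mix` is also made up for this example. A clean result does not prove the signals represent the market; it only means nothing is lopsided in the ways the check looks for.

```python
from collections import Counter


def input_mix(signals, dominance=0.8):
    """Summarize where signals come from and flag an obviously lopsided mix.

    `signals` is a list of dicts with hypothetical keys "source" and "rating"
    (rating may be None for calls or tickets). The 0.8 cutoff is arbitrary.
    """
    total = len(signals)
    if total == 0:
        return {"sources": {}, "ratings": {}, "warnings": ["no signals supplied"]}

    warnings = []
    sources = Counter(s["source"] for s in signals)
    ratings = Counter(s["rating"] for s in signals if s.get("rating") is not None)

    # Flag a single source type supplying most of the input.
    top_source, top_count = sources.most_common(1)[0]
    if top_count / total >= dominance:
        warnings.append(f"{top_count}/{total} signals come from '{top_source}'")

    # Flag a single star rating dominating the rated signals.
    rated = sum(ratings.values())
    if rated:
        top_rating, rating_count = ratings.most_common(1)[0]
        if rating_count / rated >= dominance:
            warnings.append(f"{rating_count}/{rated} rated signals are {top_rating}-star")

    return {"sources": dict(sources), "ratings": dict(ratings), "warnings": warnings}


# The cherry-picked case from the text: fifteen five-star reviews.
cherry_picked = [{"source": "review", "rating": 5}] * 15
print(input_mix(cherry_picked)["warnings"])
```

If the warnings list is non-empty, the cheaper fix is usually to add the missing signals (low ratings, churn surveys, lost-deal calls) before running synthesis, not to tune the cutoff.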

The verification step also has a real cost, and it is not always worth paying. For a quick gut-check before a low-stakes social post, tagging every line as grounded-or-inferred is overhead you do not need. The discipline earns out when a read is about to drive real spend, a positioning decision, or a launch — when shipping the averaged prior would be expensive. Match the rigor to the stakes rather than running the full sort on every casual question.

And the failure mode this module exists to prevent does not announce itself. There is no flag on the model's output marking which sentences are synthesized and which are invented. A fabricated "customers increasingly want X" looks exactly like a real one: same confidence, same clean phrasing. You catch it only by tracing it and finding no source underneath, which is why provenance has to be a step you run, not a vibe you trust. The model will never volunteer that it made something up.

Do it now

Run synthesis and ideation as two gated passes, never one. Paste your raw signals — reviews, call snippets, forum threads, support tickets, quoted verbatim, not summarized — and run this:

Paste this

You are synthesizing market research from the signals I paste below. Two passes.

PASS 1 — SYNTHESIS. Read every signal. Produce a market read as a list of claims.
Tag each claim:
  [GROUNDED: "<exact quote from a signal>"] means it traces to something I gave you.
  [INFERRED] means it is your reasoning, not directly in any signal.
For GROUNDED claims about customer language, preserve the customer's actual words —
do not tidy, theme, or compress away the specific phrasing. The weird, visceral wording
is the asset; a clean summary destroys it.
If a claim has no signal behind it, you MUST tag it [INFERRED]. Do not present
inference as observation. If you are unsure, tag [INFERRED].

PASS 2 — ANGLES. From the GROUNDED claims only, generate 8 campaign angles spanning
different pains and audience segments. For each: the angle, the grounded claim it rests on,
and the exact customer phrase it can use. Then rank them and tell me which ONE you'd keep
and why — favor the angle tied to a strong grounded pain over the clever one tied to inference.

SIGNALS:
<paste raw reviews / call notes / forum quotes here>

Splitting synthesis from ideation is what makes the gate hold. Pass 1 makes the model show its sources before it gets to be persuasive; the grounded/inferred tag turns "what customers think" from a paragraph you trust into a list you can audit. Pass 2 draws angles only from the grounded subset, so an inferred guess can never quietly become the basis of a campaign. The verbatim instruction fights the model's strongest reflex in research — compression — because the phrase a customer actually used ("it just silently stops syncing and I look stupid in front of my client") converts where your tidy paraphrase ("reliability concerns") does not. What this produces is a verified read and a set of grounded angle candidates, not a finished campaign; selecting one and executing it at volume is the next module's job, not this one's.

One habit makes this stick: before you act on any line in the read, find the quote. If you cannot, it is a hypothesis, and you either test it against a source the model has not seen or you hold it. Treat the absence of a quote as information, not an inconvenience.
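
The find-the-quote habit can be partly mechanized. The sketch below is one possible approach, not the module's prescribed tooling. It assumes the Pass 1 output keeps the `[GROUNDED: "..."]` and `[INFERRED]` tag format from the prompt, one claim per line with straight double quotes, and that the raw signals are available as plain strings. The name `audit_read`, the normalization rules, and the sample signals are illustrative. A quote counts as found only if it appears verbatim (ignoring case and spacing) in at least one signal, so a paraphrase fails the check and the claim is demoted to a hypothesis.

```python
import re

# Matches lines like: [GROUNDED: "some quote"] claim text
GROUNDED_RE = re.compile(r'^\[GROUNDED:\s*"(?P<quote>.+?)"\]\s*(?P<claim>.*)$')
INFERRED_RE = re.compile(r'^\[INFERRED\]\s*(?P<claim>.*)$')


def normalize(text):
    """Lowercase, straighten curly quotes, and collapse whitespace."""
    text = text.replace("\u2019", "'").replace("\u2018", "'")
    text = text.replace("\u201c", '"').replace("\u201d", '"')
    return " ".join(text.lower().split())


def audit_read(read_text, signals):
    """Split a tagged read into verified claims and hypotheses."""
    haystacks = [normalize(s) for s in signals]
    verified, hypotheses = [], []
    for line in read_text.splitlines():
        line = line.strip()
        if not line:
            continue
        m = GROUNDED_RE.match(line)
        if m:
            needle = normalize(m.group("quote"))
            # Indexes of every signal that contains the quote verbatim.
            sources = [i for i, h in enumerate(haystacks) if needle in h]
            if sources:
                verified.append(
                    {"claim": m.group("claim"), "quote": m.group("quote"), "sources": sources}
                )
            else:
                hypotheses.append(
                    {"claim": m.group("claim"), "reason": "quote not found in any signal"}
                )
            continue
        m = INFERRED_RE.match(line)
        if m:
            hypotheses.append({"claim": m.group("claim"), "reason": "tagged inferred"})
        else:
            hypotheses.append({"claim": line, "reason": "untagged"})
    return verified, hypotheses


# Hypothetical signals and read, for illustration only.
signals = [
    "Review 1: It just silently stops syncing and I look stupid in front of my client.",
    "Call 2: we mostly use it for the shared team calendar.",
]
read = """
[GROUNDED: "it just silently stops syncing"] Sync failures are a named pain.
[GROUNDED: "customers want faster onboarding"] Onboarding speed matters.
[INFERRED] Buyers care about AI features.
"""
verified, hypotheses = audit_read(read, signals)
for item in hypotheses:
    print(item["reason"], "|", item["claim"])
```

The second claim here is tagged grounded but its quote appears in no signal, which is exactly the case the habit is meant to catch. A substring match is deliberately strict; it will also reject honest quotes the model lightly reworded, and those are worth checking by hand.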

Worked example

Illustrative. A constructed synthesis to show the discipline, not a real engagement.

A B2B scheduling tool wants angles for a retention campaign. The signals pasted in: a dozen G2 reviews, four sales-call transcripts, a churned-customer survey. The naive move is "what do customers want?" and a confident paragraph back. Run through Pass 1 instead and the read comes out tagged:

[GROUNDED: "I spent 20 minutes rebuilding my week after it dropped my recurring blocks"] Recurring-event reliability is a named, repeated pain — three reviews, one call.

[GROUNDED: "switched from Calendly because the team view was useless"] Team-visibility is the stated reason several buyers chose this tool over the incumbent.

[INFERRED] Customers value AI-powered scheduling suggestions. (No signal mentions this — it is the model's prior about what scheduling tools should emphasize in 2026.)

That third line is the trap, caught. Nobody in the signals asked for AI suggestions; the model surfaced it because the average article about scheduling software now leads with AI. Shipped unchecked, it becomes a campaign about a feature nobody named, broadcasting the category average. Tagged as inferred, it drops out of the angle pass.

Two passes, gated: the synthesis-then-ideation flow for this example. Inferred claims are filtered out before angles are generated, so the selected angle rests only on a grounded, quoted pain.
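
The prompt in this module runs both passes in one message and relies on the model to honor the gate. One hedged alternative is to enforce the gate in code: take the tagged Pass 1 read, drop everything that is not tagged grounded, and build a separate Pass 2 request from what is left. The sketch below assumes one claim per line in the tag format used above; the function name `build_pass2_prompt` and the exact prompt wording are illustrative, and the sample read reuses the three tagged lines from this example.

```python
def build_pass2_prompt(tagged_read, n_angles=8):
    """Keep only GROUNDED lines and wrap them in a Pass 2 request.

    Returns the prompt text plus the list of lines held back as hypotheses.
    """
    grounded, held = [], []
    for line in tagged_read.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith("[GROUNDED:"):
            grounded.append(line)
        else:
            # [INFERRED] and untagged lines never reach the angle pass.
            held.append(line)
    if not grounded:
        raise ValueError("no grounded claims; nothing to build angles from")
    prompt = (
        f"From the claims below only, generate {n_angles} campaign angles spanning "
        "different pains and audience segments. For each: the angle, the claim it "
        "rests on, and the exact customer phrase it can use.\n\nCLAIMS:\n"
        + "\n".join(grounded)
    )
    return prompt, held


read = """
[GROUNDED: "I spent 20 minutes rebuilding my week after it dropped my recurring blocks"] Recurring-event reliability is a named, repeated pain.
[GROUNDED: "switched from Calendly because the team view was useless"] Team-visibility is the stated reason several buyers chose this tool.
[INFERRED] Customers value AI-powered scheduling suggestions.
"""
prompt, held = build_pass2_prompt(read)
print(prompt)
print("Held as hypotheses:", held)
```

Because the inferred line is removed before the second request is written, the model in Pass 2 never sees the AI-suggestions claim and cannot build an angle on it.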

Pass 2, drawing only from the grounded claims, returns angles spanning reliability, the switching-from-Calendly story, and the team-view differentiator. Two candidates survive selection. One leans on the team-visibility switching reason: strong, but several competitors run the same comparison. The other rests on the recurring-event pain and borrows the customer's own phrase, "recurring blocks": "It remembers your recurring blocks, so a dropped sync never costs you a Monday morning." That one wins selection because it traces to the most-repeated grounded pain and uses language no competitor is using, since no competitor mined these specific reviews. The angle is non-generic for a concrete reason: it inherited a verified, proprietary pain that the research pass surfaced and the verification pass confirmed.