Product
May 12, 2026 · 12 min read

Building a persona discovery system as queryable infrastructure

Most product personas end up as documents nobody references after the workshop ends. Here is how I rebuilt mine as queryable infrastructure, plus the discourse audit methodology that hardens them.


Over the last month, I've been working on what I'm calling the meta layer around a side project. My hypothesis is that building production-grade products with AI requires solid infrastructure, and that this infrastructure isn't articulated well enough for most AI products to succeed.

In this article, I discuss one piece of that meta layer, a persona system, which allowed me to cut roughly 40% of the features from my roadmap. The personas are built with AI, confirmed through a thorough audit against real sources, and stored as queryable artifacts that I can run my product or individual features against at any time.

When my system was ready and I ran an overall analysis of my project against my personas, it identified hundreds of friction points based on their preferences and expectations.

Tools used

AI tooling
  • Claude Code
  • Playwright MCP
  • Claude Skills
Storage
  • Markdown files
  • Two-file persona pattern
  • Grounding knowledge bank
Sources mined
  • Reddit threads
  • App Store reviews
  • Competitor docs

The problem with how most teams use personas

Anyone who has worked in Marketing or Product for long enough has probably seen this play out many times. A polished deck gets created, with headshot photos, quotes, and lists of requirements and emotions that seem to meet every objective of the persona exercise. Then the teams start building around it. At first, everybody keeps the personas top of mind. As time goes on, the attention fades, and eventually the deck sits in the cloud, unused.

[Figure: The lifecycle of a typical persona deck. Workshop: deck created. Week 1: everyone references it. Month 1: attention fades. Month 6: forgotten.]

This was happening in my own project. I had created personas at the beginning, and they felt useful for guiding my initial user flow. Over time, though, I started relying on my own opinions and stopped using them. They were artifacts of a workshop, not tools I was still using.

This phenomenon is worse with AI. Building new features has become fast: what used to take months of work can now be done in hours or days. That also makes it easier than ever to fall into a "build and see" mindset. Without a set of rules to guide our actions, the direction easily becomes capability-driven instead of outcome-driven.

The velocity that makes AI products fast to build also makes them fast to lose focus.

The "personas don't drive decisions" problem isn't new. People in the field have been writing about it for years, from Cooper's original persona work through more recent jobs-to-be-done patterns. What's new here is the angle: AI tooling makes the rigorous version cheap enough to actually run.

Persona work is the constraint that keeps the direction honest.

Building personas as infrastructure

Most persona work produces a single document per persona. That document contains everything: demographics, motivations, goals, pain points. The documents then get stacked together in a deck or a Confluence page. The issue with this approach is that we lose the source of the persona, meaning the work that was done to confirm it. The format is especially problematic when AI is part of your workflow. A human PM can guess the context behind a persona claim, but an AI agent needs that context spelled out explicitly. Without the evidence trail attached, the AI can't tell which claims came from research and which were workshop guesses.

To build personas as infrastructure, the first thing I made was a knowledge bank for the AI agents.

Out of the box, AI agents produce inconsistent persona documents: different sections, different depth, different assumptions about what counts as a finding. So I wrote a grounding document that lives in the personas folder and acts as the reference standard. It defines what a profile should contain, what the discourse audit should produce, what the evidence labels mean, and what counts as a load-bearing finding versus filler.

AI agents pull this document into context when creating a new persona or running an audit. The output starts from a shared standard instead of from whatever the model defaults to.

Once the AI agents had a shared standard, the next problem was storage. I created a persona folder which holds two files for each of my personas plus the research. My structure actually looks like this:

Persona folder structure
markdowns/
└── design/
    └── personas/
        ├── discourse/
        ├── profile/
        └── research/

The profile/ folder contains the living working artifacts. These are the files that new features and the app get stress-tested against, and the ones that get loaded into context when a new feature is being developed. They are small, usually under 150 lines.

The discourse/ folder contains the time-stamped evidence. Every claim in a profile file traces back to a confirmation in the discourse files, with an author and a link. These files are mostly static, but they can be updated from time to time to check whether the information around a persona has shifted.

This split is what makes the persona queryable. A single-file persona document would either burn through too many tokens or lack the information needed. With this setup, I can load the persona profile cheaply and dive deeper into the discourse files when needed.

From there, it becomes pretty straightforward to build a simple skill to test a feature against the personas while keeping the context focused. The skills can then either be integrated into a full workflow, where they are called automatically as a new feature is built, or called when needed through a command.
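
As a rough illustration of what such a skill might do under the hood, here is a minimal sketch in Python. It assumes the folder structure above and that each profile is a `.md` file; the function names, the prompt wording, and the 150-line warning threshold are my own placeholders, not part of any Claude Code API.

```python
from pathlib import Path

# Folder names follow the structure shown above; everything else is hypothetical.
PERSONA_ROOT = Path("markdowns/design/personas")
MAX_PROFILE_LINES = 150  # rough profile size mentioned in the article


def load_profiles(root: Path = PERSONA_ROOT) -> dict[str, str]:
    """Read every profile file, keyed by file stem."""
    profiles = {}
    for path in sorted((root / "profile").glob("*.md")):
        text = path.read_text(encoding="utf-8")
        if len(text.splitlines()) > MAX_PROFILE_LINES:
            print(f"warning: {path.name} exceeds {MAX_PROFILE_LINES} lines")
        profiles[path.stem] = text
    return profiles


def build_stress_test_prompt(feature: str, profiles: dict[str, str]) -> str:
    """Assemble one prompt asking the model to react as each persona."""
    sections = [
        f"## Persona: {name}\n{body.strip()}" for name, body in profiles.items()
    ]
    return (
        "Stress-test the feature below against each persona profile.\n"
        "For every persona, list likely friction points.\n\n"
        f"# Feature\n{feature.strip()}\n\n"
        "# Personas\n" + "\n\n".join(sections)
    )
```

Only the profile files are loaded here. The discourse files stay out of context unless a finding needs to be traced back to its evidence, which is the point of the two-file split.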

What are Claude Code skills?

Skills are reusable capabilities you can package for Claude Code to invoke on demand. Each skill is a small group of instructions and tools that the agent can call when it matches a task. In the persona system, a "test feature against personas" skill loads the relevant profile files into context and runs the stress-test prompt, all from a single command or trigger.

The persona stops being a document that simply exists and becomes an actionable gate instead. Want to know how a persona reacts to a paywall design? Run it through the persona set. You can also do this for your entire user flow if you want! The two-file pattern is what makes that possible.

Running the audits

With the file structure in place, the work is split into two phases per persona.

The first phase builds a standard persona profile with information such as demographics, the motivation profile, goals, and pain points. I built this with the AI model, using the research document on best practices to guide the creation of my profiles. The output is a light profile document for each persona that is easy to read but is mostly a hypothesis at this stage. It's a starting point more than anything.

The second phase confirms or corrects the profiles generated during phase 1. Each profile goes through a full audit, which is the main work of this process. The audit has five steps:

  1. Author first pass. Before any searching, the AI asks me a few questions about the persona. For example, do I know somebody who would fall in this persona? What do I think their pain points are? This serves as a baseline to test against later, so that I don't rely solely on my intuition.
  2. Source identification. Which subreddits, App Store pages, competitor reviews, and so on to mine.
  3. Primary source mining. Here, I had to use Playwright MCP through Claude Code to be able to scan content. For each citation, the agent extracted the author, the score, the quote and the link to the source.
  4. Synthesis into clusters. The evidence gets grouped together to build patterns. Each cluster carries evidence labels: V for verified primary quotes with author and link, D for directional voices (register-confirming but low engagement), F for author priors captured before searching, X for claims that recur across multiple persona audits and get weighted heavier as a result.
  5. Verdict. Does the persona hold up against the evidence? What changed?
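
One way to picture the evidence labels from step 4 is as a small data model. The sketch below is only an illustration: the class names and the numeric weights are assumptions of mine (the only stated rule is that X claims get weighted heavier), and the grounding check encodes the idea that a V entry must carry an author and a link.

```python
from dataclasses import dataclass, field
from enum import Enum


class Label(Enum):
    V = "verified primary quote with author and link"
    D = "directional voice, low engagement"
    F = "author prior captured before searching"
    X = "claim recurring across multiple persona audits"


# Hypothetical weights: the article gives no numbers, only that X counts more.
WEIGHTS = {Label.V: 1.0, Label.D: 0.5, Label.F: 0.25, Label.X: 2.0}


@dataclass
class Evidence:
    label: Label
    quote: str
    author: str = ""
    link: str = ""
    score: int = 0  # engagement score extracted during primary source mining


@dataclass
class Cluster:
    name: str
    evidence: list[Evidence] = field(default_factory=list)

    def weight(self) -> float:
        """Sum the label weights of all evidence in this cluster."""
        return sum(WEIGHTS[e.label] for e in self.evidence)

    def is_grounded(self) -> bool:
        """True if at least one verified quote carries both author and link."""
        return any(
            e.label is Label.V and e.author and e.link for e in self.evidence
        )
```

A cluster made only of F entries would fail `is_grounded()`, which is a useful signal at the verdict step: it means the pattern still rests on my priors and not on the sources.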

[Figure: The five-step discourse audit pipeline: (1) author first pass, (2) source identification, (3) primary-source mining, (4) synthesis into clusters, (5) verdict.]

What is Playwright MCP?

Playwright MCP is a Model Context Protocol server that gives an AI agent controlled access to a browser. Through it, Claude Code can navigate web pages, extract structured data (Reddit threads, App Store reviews, competitor docs), take screenshots, and run tests. It's what makes the discourse audit possible at scale without manual copy-pasting.

The audits go in the discourse/ folder and look like this:

Discourse audit shape
---
name: persona-discourse-audit
description: Phase 2 discourse audit for the [persona] persona...
status: complete
---

# [Persona] Discourse Audit, Phase 2

## Sources & limits
- (a) Verified primary: verbatim quotes from threads fetched via Playwright MCP
- (b) Directional: register-confirming voices with low engagement
- (c) Author-side priors: author testimony

## Cluster conclusions

### Cluster A: [Pattern name]
- **V:** [Username] (Competition App Store): "verbatim quote"
- **V:** [Username] (Reddit): "verbatim quote"
- **F:** Author prior: "verbatim quote"

**Reading:** [the synthesis claim, evidence-anchored]

The audit work itself is done by AI. Claude Code runs the discourse mining, synthesizes the cluster conclusions, drafts the right-sizing verdicts. My role is mostly one of oversight and validation at the end: does this conclusion actually hold against the source quotes?

This is the part that surprised me most. Five evidence-driven audits were done in a few hours, work that would have taken a research team weeks of interview cycles per persona. The methodology is replicable by anyone with the tooling. It just requires discipline about what evidence you're willing to accept and what you're not. It is not a substitute for a full persona study run by a team of researchers, but it is still a real unlock in a high-velocity world.

What the audits surfaced

Once the system was running, the audits started producing findings faster than I could implement fixes for them.

They came in five categories:

  • Friction points. Specific UX failures users described with verbatim source quotes.
  • Feature validation signals. Evidence that real users want a feature, or evidence they don't.
  • Content gaps. Specific places where the product wasn't speaking to its personas correctly.
  • Product and marketing decisions. What to ship, what to cut, what to reposition.
  • Deprecation calls. Personas that didn't drive decisions and shouldn't exist.

The hundreds of friction points mentioned earlier came from the first category. Here is one concrete example: walking the personas through the onboarding flow surfaced an ambiguity I had built in without noticing.

When the user reached an optional step to declare information (think "tell us about any constraints we should know about"), the product couldn't distinguish between a positive declaration ("I have none") and a passive skip ("I didn't bother to answer"). The persona run flagged that these should produce different downstream behavior. Conflating the two (treating "actively said no" as identical to "didn't engage with the question") would have produced wrong behavior in every AI feature that consumed that field.
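
A minimal sketch of one way to model that distinction is an explicit three-state field instead of an empty value. The type names, the `confirmed_none` flag, and the hint strings are hypothetical; this is not the product's actual schema.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class AnswerState(Enum):
    SKIPPED = "skipped"        # user never engaged with the question
    DECLARED_NONE = "none"     # user actively said "I have none"
    DECLARED = "declared"      # user listed one or more constraints


@dataclass(frozen=True)
class ConstraintsAnswer:
    state: AnswerState
    constraints: tuple[str, ...] = ()


def parse_answer(
    raw: Optional[list[str]], confirmed_none: bool = False
) -> ConstraintsAnswer:
    """Map a form payload onto the three states instead of one empty value."""
    items = tuple(s.strip() for s in (raw or []) if s.strip())
    if items:
        return ConstraintsAnswer(AnswerState.DECLARED, items)
    if confirmed_none:
        return ConstraintsAnswer(AnswerState.DECLARED_NONE)
    return ConstraintsAnswer(AnswerState.SKIPPED)


def prompt_hint(answer: ConstraintsAnswer) -> str:
    """What a downstream AI feature might be told about this field."""
    if answer.state is AnswerState.DECLARED:
        return "Respect these constraints: " + ", ".join(answer.constraints)
    if answer.state is AnswerState.DECLARED_NONE:
        return "The user confirmed they have no constraints."
    return "Constraints are unknown; ask before assuming there are none."
```

With this shape, an empty list alone never means "no constraints". The form has to record an explicit confirmation, and a skipped question stays visibly unknown to every feature that reads the field.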

That's one finding. There are hundreds more in the same shape.

The decisions the audits drove were bigger and rarer, for example:

  • A feature I'd been positioning as a separate marketing target became an in-product feature surface. The audience for that capability folded back into a baseline persona with a constraint dimension added. Engineering scope dropped roughly 40% as a result, measured by the feature surfaces I removed from the roadmap.
  • AI should not be the headline. Five audits, five re-confirmations that capability-first marketing has been classified as "AI SLOP" in the relevant communities. I removed every "AI-powered" framing from marketing, onboarding, and so on. AI ships as silent capability.
  • Three personas got cut entirely. They were based on intuition without solid evidence. The corresponding feature surfaces dropped off the roadmap with them.

The other thing the audits surfaced was a marketing-strategy observation I hadn't expected.

By the third persona audit, the personas were sorting themselves less by role than by acquisition path. Some had audiences distinct enough to warrant their own marketing strategies. Others didn't justify separate campaigns and worked better as subsets of existing ones. They were full personas, every bit as real, but their acquisition path overlapped with that of one of the primary personas.

Closing thoughts

A few things stand out coming out of this initiative.

The persona file most teams use is broken in a way that's hard to see until you've built the alternative. Static documents can't be queried; they can only be read. The two-file pattern, the knowledge bank, and the discourse audit turn the persona from a document into a system.

The AI tooling angle matters more than I expected. Five evidence-driven audits in a few hours is the kind of speed of execution I expect to see more of as agents get better at structured research. The methodology is replicable by anyone with the tooling and the structural discipline to run it.

The product hasn't launched yet, so every finding is up for revalidation once real users hit it. There's more I'm building in the meta layer, such as agent orchestration, hook contracts, and skill composition. I'll write about those separately as they take shape. You can follow along on the articles page or read more about the project on the about page.

Static persona decks belong to a model of Product work that AI is making obsolete. The teams that turn personas into infrastructure will move faster than the ones still updating slides.


About the author

Maxim St-Hilaire is a Staff Product Manager at Udemy leading MarTech product strategy. Programmer-turned-PM. Bilingual, based in Toronto.
