
Posts tagged “product discovery”

Output isn’t design

The hard part of design is rarely generating the form. It is understanding the problem well enough to know what and how something should exist at all. There is use and place for these tools, but tools are not the design process.

Christopher Alexander came closer than anyone to naming this clearly. In Notes on the Synthesis of Form, he describes design as the search for a good fit between a form and its context. Context, in his sense, is not a background condition. It is the full set of forces that make a problem what it is: human needs, technical constraints, conflicting requirements, habits, edge cases, and relationships that are easy to miss until you spend time with them. Bad design appears where those forces remain unresolved. Good design appears where those misfits have been worked through carefully.

— Karri Saarinen, Output isn’t design

AI Prototyping Is Changing How We Build Products at Uber

There is no doubt that this post was at least 80% written by AI, but I’m not even super mad about it because that is just the way of the world now, and its summary of how Uber works is actually legit interesting:

A prototype without a PRD can drift away from the problem the team intends to solve. A PRD without a prototype can remain abstract, leaving room for inconsistent interpretations. […] If going from idea to prototype is now fast and cheap, the PRD can no longer be the primary place where ideas are defined. Its value increasingly lies in capturing intent, tradeoffs, success metrics, and decisions.

The PRD as an artifact is in the spotlight right now in a way that I think is really healthy. Should it remain but change its JTBD? Should it be an eval instead? Who knows. Let’s figure it out together…

From Assistant to Collaborator: How My AI Second Brain Grew Up

Over the past few months I’ve been writing about how I use AI for product work. The first post covered the philosophy: context files, opinionated prompts, and how to compose the right inputs for each task. The second added slash commands and daily summaries. The third was a hands-on setup guide. And the fourth introduced project brains for keeping complex initiatives organized.

This post covers a different kind of change. The earlier additions were incremental: more commands, better context, smoother workflows. What changed recently feels more like a threshold. The system went from a tool I invoke for specific tasks to something closer to a collaborator I dispatch to do real work. Three capabilities drove that shift: multi-agent orchestration, cross-session memory, and the encoding of domain expertise into the system itself.

Multi-Agent Workflows

The clearest example is customer escalation investigations. As a PM for data products, I regularly investigate customer-reported issues: logging gaps, data discrepancies, behavior that doesn’t match expectations. These investigations require pulling information from multiple sources and cross-referencing it all into an analysis that engineering can act on.

I built a slash command that handles this as a multi-phase workflow. When I run it with a ticket ID, here’s what happens:

  1. The system reads the customer ticket, extracts the core problem, identifies which product area is involved, and classifies the issue type.
  2. Three specialist agents launch simultaneously, each focused on a different data source. One searches the codebase for the relevant logic and recent changes. Another searches for related tickets and prior incidents across projects. A third checks documentation and internal wiki pages for relevant operational context.
  3. A fourth agent receives the combined findings and produces database queries that can confirm or refute the working hypothesis.
  4. The system combines everything into a structured analysis: issue classification, root cause anchored in code where possible, customer impact, and recommended next steps.
  5. A blind validator independently re-fetches every source cited in the draft to verify the claims hold up. Then an adversarial challenger looks for alternative explanations and tests whether the classification is correct.
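As a rough illustration of the fan-out in step 2, here is a minimal Python sketch. It is not the actual command file, which is written as agent instructions and is not shown here. The names `Finding`, `run_specialist`, `gather_findings`, `status_table`, and the specialist labels are all hypothetical, and the agent call itself is a placeholder. The sketch only shows the shape: launch the specialists together, wait for all of them, and record which sources were unavailable instead of failing the whole run.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class Finding:
    source: str             # which specialist produced this finding
    summary: str            # what it reported
    available: bool = True  # False when the data source could not be checked


# Hypothetical labels for the three specialists described in step 2.
SPECIALISTS = ["codebase", "related_tickets", "docs_and_wiki"]


async def run_specialist(name: str, ticket_id: str) -> Finding:
    """Placeholder for dispatching one specialist agent (assumption)."""
    await asyncio.sleep(0)  # stands in for the real agent call
    return Finding(source=name, summary=f"{name} findings for {ticket_id}")


async def gather_findings(ticket_id: str, runner=run_specialist) -> list[Finding]:
    """Run all specialists concurrently and wait for every result."""
    results = await asyncio.gather(
        *(runner(name, ticket_id) for name in SPECIALISTS),
        return_exceptions=True,  # one failing agent should not sink the run
    )
    findings = []
    for name, result in zip(SPECIALISTS, results):
        if isinstance(result, Exception):
            findings.append(Finding(name, f"unavailable: {result}", available=False))
        else:
            findings.append(result)
    return findings


def status_table(findings: list[Finding]) -> dict[str, str]:
    """Summarize what was checked and what was unavailable."""
    return {f.source: "checked" if f.available else "unavailable" for f in findings}
```

Calling `asyncio.run(gather_findings("TICKET-123"))` returns one `Finding` per specialist. The later phases (query generation, synthesis, validation) would consume that list only after every specialist has finished or failed.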

The output is a document I can review with an engineering colleague or paste into a chat thread. It includes a confidence assessment and a data collection status table showing what was checked and what was unavailable, along with how the analysis compensated for gaps.

The command file that orchestrates all of this isn’t prompting in the traditional sense. It defines which agents to dispatch, what information each one needs, when to wait for results before proceeding, and how to handle failures gracefully. Writing this felt more like designing a workflow than writing a prompt.

I’ve applied the same pattern to other tasks. A “fix feasibility” command evaluates whether a ticket describes a code change simple enough for a PM to implement with AI coding assistance, and produces an implementation brief if the answer is yes. The specific use cases differ, but the architecture is the same: break the problem into specialist tasks that run in parallel, then synthesize and validate the results.

Cross-Session Memory

AI conversations are stateless by default. Every new session starts from zero, which means re-explaining context that should already be established. Over a few weeks of working on the same projects, this friction adds up.

I addressed this with a four-layer memory system:

  • The first layer is stable facts: a compact file that captures the current state of all active work, including project status, recent decisions, and environment constraints. This is the primary orientation file. When I start a session, the AI reads it and immediately knows what’s in flight.
  • The second is a session log: a reverse-chronological list of handoff notes. Each entry records what happened in a session and what threads remain open. The last three entries give enough context to pick up where I left off.
  • Third, a corrections file. This holds behavioral fixes for things the AI consistently gets wrong. It’s a staging area that should shrink over time as fixes get promoted elsewhere.
  • And finally, a decisions log: a cross-cutting record of decisions that don’t belong to a specific project. Each entry captures context and rationale so I don’t relitigate settled questions.

Two commands manage this. /session-start loads all four files and presents a brief summary of current state and recent sessions. /session-end reviews the conversation, writes a handoff note, and then checks whether any learnings should be promoted to infrastructure.
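To make the loading step concrete, here is a minimal Python sketch of what a command like /session-start has to do. The real command is a set of instructions to the AI, not a script. The file names in `MEMORY_FILES` are assumptions (the four layers are named above, the files are not), and so is the convention that each session log entry starts with a `## ` heading.

```python
from pathlib import Path

# Hypothetical file names for the four memory layers.
MEMORY_FILES = {
    "stable_facts": "stable-facts.md",
    "session_log": "session-log.md",
    "corrections": "corrections.md",
    "decisions": "decisions.md",
}


def recent_entries(log_text: str, count: int = 3) -> list[str]:
    """Return the newest `count` entries from a reverse-chronological log.

    Assumes every entry begins with a '## ' heading and the newest is first.
    """
    entries: list[list[str]] = []
    for line in log_text.splitlines():
        if line.startswith("## "):
            entries.append([line])      # a heading opens a new entry
        elif entries:
            entries[-1].append(line)    # body lines belong to the current entry
    return ["\n".join(entry).strip() for entry in entries[:count]]


def session_start(memory_dir: Path) -> dict[str, str]:
    """Load all four layers; a missing file is treated as empty."""
    loaded = {}
    for layer, filename in MEMORY_FILES.items():
        path = memory_dir / filename
        loaded[layer] = path.read_text(encoding="utf-8") if path.exists() else ""
    # Only the last few handoff notes are needed to pick up the thread.
    loaded["session_log"] = "\n\n".join(recent_entries(loaded["session_log"]))
    return loaded
```

Trimming the log to three entries keeps the orientation step small, so the stable facts file stays the primary source of current state.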

“Promote to infrastructure” means taking something learned during a session and baking it into the files the agent actually reads. A correction about how to handle a specific edge case in escalation investigations might start in the corrections file, then get promoted into the escalation command or a domain skill once it’s validated. The corrections file shrinks over time as that knowledge moves into the right places.

This creates a loop where the system improves its own instructions. I approve every change, so it’s not self-modifying in a creepy way. But in practice each work session can make the next one slightly better, and the compound effect over weeks is noticeable.

Domain Expertise

The earlier posts described skills like pm-thinking, which applies product methodology (problem-first thinking, measurable outcomes) to any PM-related conversation. That’s useful, but generic. It works the same way regardless of what product you’re building.

The bigger shift was building skills that encode institutional knowledge about specific products. I now have skills for each major product area my team owns: log delivery, analytics, audit logs, alerting, and data pipelines. Each skill contains the product’s architecture and common failure modes, along with which code repositories to search and which database tables hold relevant data.

This is what makes the multi-agent workflows useful. When the code investigator agent examines an escalation about missing logs, the domain skill tells it which service handles job state and which repository contains the delivery pipeline. It also flags recent architectural changes that might be relevant. Without that context, the agent produces plausible-sounding analysis that misses the specific details engineering needs.

Now every investigation that uses a skill validates or extends the knowledge it contains, and /session-end catches insights that should be added back.

How The Work Changes

The biggest change is in my own role. It’s gone from “write the right prompt” to “design the right process.” The escalation command is a workflow with phases, dependencies, and validation steps, and thinking about it that way beats trying to pack everything into a single conversation. A few other things I’ve noticed:

  • Validation has to be built in. The blind validator exists because agents make mistakes. They cite files that don’t exist, mischaracterize what code does, or draw conclusions the evidence doesn’t support. Catching those issues before they reach anyone else is the whole point.
  • Cross-session memory requires discipline. The system only works if I run /session-end after substantive sessions and keep stable facts current. When I skip it, the next session starts cold and I lose the compounding benefit. Automation helps, but the commitment to maintain the memory is mine.
  • And domain skills need regular maintenance. Products change. Code gets refactored, pipelines get rearchitected. Skills that aren’t periodically updated drift from reality. I haven’t solved this well yet. It’s still a manual process of noticing when a skill’s knowledge is stale and updating it.

The system still makes mistakes. Multi-agent workflows are more thorough than single-prompt conversations, but they’re not infallible. The confidence assessment in the escalation output exists because sometimes the answer is “medium confidence, we couldn’t confirm this from the available data.” That honesty about limitations is more useful than false certainty.

Where This Is Going

I’m sure the specific commands and skills will look different in six months as I learn what works and what doesn’t. But the underlying pattern feels durable: compose specialist agents with deep domain context, validate their output, and feed learnings back into the system.

I’ve published updated files to the Product AI Public repo, including the session memory commands and a generalized version of the multi-agent escalation workflow. If you’re building something similar, those might be useful starting points.

None of these pieces does much on its own. It’s the way they feed each other that turned a pile of separate prompts into something I lean on every day.

Project Brains: Organizing Complex Initiatives for AI-Assisted Work

I’ve written before about how I use AI for product work and how that workflow evolved with slash commands and skills. This post focuses on how to maintain context for complex, long-running projects.

The Problem: Context Fragmentation

When I’m working on a major initiative, relevant information ends up scattered everywhere: PRDs in one tool, tickets in another, meeting notes in a third, plus emails and chat threads. Every time I return to a project after a few days, I spend time reconstructing where things stand.

AI assistants can make this worse because each conversation starts fresh. I can reference files, but the model doesn’t know which files matter for this project, what decisions we’ve already made, or what questions remain open. I end up re-explaining context that should be obvious.

Project brains solve this by creating a dedicated folder for each major initiative with a standard structure that both humans and AI can navigate.

What a Project Brain Looks Like

The structure looks like this:

projects/[project-name]/
├── CONTEXT.md        # The hub: status, stakeholders, decisions, open questions
├── artifacts/        # PRDs, specs, designs, one-pagers
├── decisions/        # Decision logs with rationale and alternatives
├── research/         # Customer feedback, data analysis, technical investigation
└── meetings/         # Meeting notes related to this project

The CONTEXT.md file is a living document that answers the questions I’d need to answer every time I pick up a project:

  • What’s the current status?
  • Who are the stakeholders and what do they care about?
  • What decisions have we made and why?
  • What questions are still open?
  • Where are the relevant artifacts?

When I start a conversation about a project, I point the AI to the project folder. It reads CONTEXT.md first, then can drill into specific artifacts as needed. The model immediately knows the project state without me explaining it.

A Real Example

Say I’m working on adding observability to an internal platform—something that needs coordination across multiple teams over several months. The CONTEXT.md includes:

  • Quick reference table: Status, PM, engineering lead, target dates, links to the PRD and relevant tickets. Everything I’d need to orient myself.
  • Problem statement: A clear articulation of the user pain. In this case: “Platform incidents go undetected until users report them, and debugging takes hours due to lack of visibility.”
  • Success metrics with baselines and targets: Things like uptime targets, reduction in mean time to resolution, and alert accuracy. These anchor every conversation about scope.
  • Key decisions made: A table showing what was decided, when, why, and what alternatives we considered. When someone asks “why aren’t we including component X in v1?”, the answer is already documented.
  • Open questions: A checklist of unresolved issues. This prevents the AI from assuming things are settled when they’re not.
  • Links: Direct paths to the PRD, spec, analysis docs, and related pages.

The decisions/ folder contains detailed decision logs for significant choices. The research/ folder holds whatever analysis informed the project direction. The meetings/ folder captures sync notes that would otherwise disappear into Gemini notes in a Google Drive… somewhere.

When to Create a Project Brain

Not every task needs this treatment. I create a project brain when:

  • The work spans multiple weeks or months. Short-term tasks don’t need the overhead.
  • Multiple stakeholders are involved. If I need to coordinate with other teams, having a single source of context helps.
  • Decisions require documented rationale. If someone might ask “why did you do it this way?” later, a decision log is worth the investment.
  • The project crosses team boundaries. Cross-functional initiatives benefit from dedicated context that doesn’t live in any one team’s space.

For simpler work, I use a flatter folder structure with documents organized by type. Project brains are for the complex initiatives where losing the thread between sessions costs me real time.

How AI Uses Project Brains

This earns its keep when I’m working with AI on project-specific tasks. A few examples:

  • Preparing for a meeting: “Read the CONTEXT.md in the [project] folder. I have a spec review meeting tomorrow. What are the open questions I should raise?”
  • Drafting an update: “Based on the project context, draft a status update for leadership. Focus on progress since the start of the month and remaining blockers.”
  • Decision analysis: “We need to decide whether to include [component] in scope. Read the research folder and the current CONTEXT.md. What would you recommend and why?”

By the time I’m working in it, the AI already knows the project’s history and the people involved, so its recommendations fit this specific situation instead of falling back on generic best practices.

Maintaining the Project Brain

The value depends on keeping CONTEXT.md current. I’ve found a few practices help:

  • Update after significant events. When a decision is made, a meeting happens, or the status changes, update the file immediately. “I’ll do it later” means it won’t happen. LLMs are great at making these updates, so you can simply say “update relevant files based on the session we just concluded.”
  • Move open questions to resolved. When a question gets answered, don’t delete it. Mark it resolved and note the answer. This preserves the reasoning trail.
  • Link, don’t duplicate. CONTEXT.md should point to artifacts, not contain them. Keep PRDs in the artifacts folder. Keep meeting notes in the meetings folder. The context file is a hub, not a repository.
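The "mark it resolved, don’t delete it" practice is easy to picture as a small text transformation. This is a minimal Python sketch, assuming open questions are stored as Markdown checkboxes (`- [ ] question`); that format and the `resolve_question` helper are assumptions for illustration, not something the template prescribes.

```python
def resolve_question(context_md: str, question: str, answer: str) -> str:
    """Mark one open question resolved and keep the answer next to it."""
    open_item = f"- [ ] {question}"
    resolved_item = f"- [x] {question} (Resolved: {answer})"
    lines = context_md.splitlines()
    if open_item not in lines:
        raise ValueError(f"open question not found: {question!r}")
    updated = "\n".join(resolved_item if line == open_item else line for line in lines)
    # splitlines() drops a trailing newline, so restore it if it was there.
    return updated + ("\n" if context_md.endswith("\n") else "")
```

The question stays in the file with its answer attached, which preserves the reasoning trail for the next session.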

Scaffolding New Projects

I have a slash command that scaffolds new project brains:

/new-project platform-observability

This creates the folder structure, generates a CONTEXT.md from a template, and fills out a rough draft based on whatever context I provide. Removing the friction of setup means I’m more likely to actually use the system. You can view the command here.

The template includes the standard sections (Quick Reference, Problem Statement, Success Metrics, etc.) with placeholder text. I fill in what I know and mark other sections as TBD. Even an incomplete project brain is more useful than scattered notes.
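For readers who want the same scaffolding outside a slash command, here is a minimal Python sketch. It is an illustration, not the linked command: `scaffold_project` and the `TBD` placeholder text are assumptions, and the section list mirrors the CONTEXT.md contents described earlier.

```python
from pathlib import Path

# Subfolders from the project brain structure shown earlier.
SUBFOLDERS = ["artifacts", "decisions", "research", "meetings"]

# Section names follow the post; the placeholder wording is an assumption.
CONTEXT_TEMPLATE = """# {name}

## Quick Reference
TBD

## Problem Statement
TBD

## Success Metrics
TBD

## Key Decisions
TBD

## Open Questions
TBD

## Links
TBD
"""


def scaffold_project(root: Path, name: str) -> Path:
    """Create projects/<name>/ with the standard subfolders and a CONTEXT.md."""
    project = root / "projects" / name
    for sub in SUBFOLDERS:
        (project / sub).mkdir(parents=True, exist_ok=True)
    context = project / "CONTEXT.md"
    if not context.exists():  # never overwrite an existing context file
        context.write_text(CONTEXT_TEMPLATE.format(name=name), encoding="utf-8")
    return project
```

Running `scaffold_project(Path("."), "platform-observability")` is safe to repeat: existing folders are left alone and an existing CONTEXT.md is not replaced.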

What Surprised Me

A well-organized project brain with sparse content beats a folder full of undifferentiated documents every time, because the AI (and future me) can work with structure far more easily than with a pile of files. The decision logs have paid off the most: when someone asks why we didn’t do something, I point to the log instead of reconstructing my reasoning from memory. And while I built this for the AI, I reference these files constantly myself. Staying on top of the context keeps me oriented, not just the assistant. The structure is also flexible. Some projects grow extra subfolders like research/customer-interviews/, others need fewer.

This approach requires discipline to maintain, and the upfront setup takes time. But for complex initiatives where context fragmentation is a real problem, project brains have been worth the investment. The AI becomes a more useful collaborator when it has access to the same context I do.

I’m still iterating on the structure. I suspect the template will look different six months from now as I learn what sections actually get used and which ones I skip every time. I’m not trying to get the folder structure perfect. I just want to stop losing context between conversations, so each time I come back to a project I can build on what I already know.

How I Use AI for Product Work

Update! I wrote a follow-up post here: How My AI Product “Second Brain” Evolved.

I’ve been refining my approach to using LLMs for product work, and I figured it’s time to write up how I actually use them day-to-day.

The most valuable thing an AI assistant can do for product work is push back on weak reasoning, spot gaps you missed, and force you to articulate why your idea is good. A peer reviewer who shares your product philosophy, basically. With the right prompts, AI assistants are also good at producing background and framing documents: explainers that synthesize complex topics and summaries of technical concepts.

Here’s how I’ve set it up, what makes it work, and how you might build something similar.

The Philosophy: A Sparring Partner

I believe LLMs are most useful when you give them two things: context and constraints.

Context tells the model who you are, what you’re working on, and what “good” looks like in your world. Constraints keep it from drifting into generic advice or invented frameworks.

Every prompt I use is designed to provide both. They’re opinionated on purpose. I’d rather have an assistant that pushes back on bad ideas than one that says “Great idea!” to everything.

I don’t want AI writing presentations for me. I want a thinking partner that:

  • Challenges weak problem statements before I waste time on solutions
  • Spots missing success criteria I forgot to define
  • Asks “why?” when my reasoning gets hand-wavy
  • Points out when I’m jumping to solutions before understanding the problem
  • Helps me create background docs and explainers that set context for others

The Building Blocks

Here’s the folder structure I maintain, with a series of Markdown files organized by purpose:

llm-prompts/
├── prompts/           # System prompts for different use cases
│   ├── pm/            # Product management prompts
│   └── technical/     # Technical/engineering prompts
├── context/           # Personal context files (who I am, how I work)
├── reference/         # Syntax guides and reference docs
└── work/              # Saved feedback and refined docs

What makes this work is how you combine them.

Layer 1: System Prompts

These are the instructions that tell the AI how to behave for a specific task. I have different prompts for different jobs:

  • General PM sparring: A prompt that knows my product philosophy and pushes back on weak reasoning. I use this for thinking through tradeoffs, preparing for meetings, and sanity-checking my approach.
  • Document review: Prompts specifically designed to critique PRDs, OKRs, strategy docs, and other artifacts. These encode what “good” looks like and call out common anti-patterns.
  • Idea stress-testing: A prompt that I stole from my friend Stephen, which simulates a debate between an optimist and a skeptic to pressure-test new ideas before I get too attached to them.
  • Technical understanding: Prompts that help me understand systems, architectural decisions, and technical concepts well enough to lead effectively (I’m not an engineer, but I need to hold my own in architecture reviews).

Each one carries a specific point of view about what good work looks like.

Layer 2: Personal Context

I maintain files that describe:

  • Who I am: My role, my experience, my communication style
  • How I work: My product philosophy, my management approach, my values
  • What I’m working on: Current projects, team context, company priorities

When I start a conversation, I can pull in the relevant context files alongside my prompt. The model then has the background it needs to give me advice that fits my situation, instead of generic best practices from a blog post.

Layer 3: Reference Materials

Sometimes you need the model to follow specific formats or conventions. I keep reference files for things like wiki markup syntax, documentation templates, or internal style guides. These keep the output usable without a round of reformatting.

How I Actually Use This

I use Windsurf as my daily driver, and it has a feature that makes this whole system work: the @ mention. In the chat panel, I can reference any file by typing @ followed by the path. Windsurf then includes that file’s contents as context for the conversation.

This means I can compose my “assistant” on the fly by combining:

  1. A system prompt for the task at hand
  2. Relevant personal context files
  3. The document or code I’m working on
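Windsurf does this composition for you through @ mentions, but the idea carries over to any tool. Here is a minimal Python sketch of the same layering for readers without that feature. The `compose_prompt` function and the `--- name ---` separators are hypothetical, and it says nothing about how Windsurf assembles context internally.

```python
def compose_prompt(system_prompt: str, context_files: dict[str, str], document: str) -> str:
    """Combine a task prompt, personal context, and the working document.

    `context_files` maps a file name to its contents, in the order to include.
    """
    parts = [system_prompt.strip()]
    for name, text in context_files.items():
        parts.append(f"--- context: {name} ---\n{text.strip()}")
    parts.append(f"--- document ---\n{document.strip()}")
    return "\n\n".join(parts)
```

Keeping the three layers as separate inputs makes it easy to swap the task prompt while reusing the same context files.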

Example: Document Review

When I need feedback on a PRD before sharing it with stakeholders, I’ll start a conversation and reference my PRD review prompt plus my product philosophy context. Then I paste in the PRD and ask for critique.

The model comes back with feedback measured against my own standards. It’ll call out if my problem statement is vague, if my success metrics aren’t measurable, or if I’m jumping to solutions before properly framing the problem—the kind of pushback I’d want from a peer reviewer.

Example: Brainstorming Partner

For early-stage thinking, I use a more conversational prompt that knows how I like to explore ideas. I’ll describe what I’m thinking about and ask it to poke holes, suggest angles I haven’t considered, or help me articulate why something feels off.

This helps most before big meetings. I can rehearse my reasoning and get challenged on the weak spots before I’m in front of stakeholders.

Example: Technical Understanding

I’m not an engineer, but I work with technical teams. When I need to understand how a system works well enough to ask good questions or spot when something doesn’t add up, I use prompts designed for technical explanation.

The key is that these prompts know to explain things without condescension but also without assuming I know the jargon. They cite specific files and line numbers when relevant, and they explain the “why” behind design decisions.

Connecting to Real Data

MCP (Model Context Protocol) servers connect the AI to external data sources such as internal wikis, documentation sites, code repositories, and APIs, so it can answer from real information instead of guessing from training data.

In my prompts, I tell the model which MCP servers are available and when to use them. For example, my technical prompts instruct the model to:

  • Search official documentation first to ground answers in verified information
  • Check internal wikis for known issues, edge cases, and workarounds
  • Look at code repositories when documentation is incomplete
  • Always cite sources with links so I can verify

Now the AI answers like someone with access to your company’s actual knowledge base, citing real docs and real code instead of best-practice boilerplate.

Keeping a Record

Save the conversation output somewhere useful.

I have a work/ folder organized by topic where I save feedback and refined thinking. When the model gives me good critique on a PRD, I’ll ask it to write a summary of the key issues to a Markdown file I can reference later. This keeps the insights from getting lost in chat history.

What I’ve Learned

  1. Context files are worth the investment. I have files that describe who I am, how I work, and what I value. Updating these takes time, but it pays off in every conversation.
  2. Pushback is the point. These prompts are designed to challenge bad thinking. If the model is pushing back on your approach, take it seriously: it might be right.
  3. Iterate on the prompts. I update these regularly based on what works and what doesn’t. If a prompt isn’t helping, change it.
  4. Less context is often more. Including too much context can dilute the signal. Start with the minimum you need, and add more only if the model seems confused.

The thinking is still mine

None of this does the thinking for me. It encodes my preferences and philosophies into something an LLM can use as a baseline for pushing back. I still write my own PRDs, OKRs, and strategy docs, because those represent my actual thinking. But I let AI help me create background documents, explainers, and context-setting materials. And I have a sparring partner that catches the gaps I miss and asks the uncomfortable questions before stakeholders do.

If you build something similar, I’d love to hear how it goes.

AI’s “Just Ship it.” problem

Here’s Leah Tharin with a good reminder of what it means to ship, and how AI can (and cannot) help. In short, building is only one part of creating valuable products. Shipping involves:

  • Ideation: There’s an idea
  • Development: You build the idea
  • Validation: You validate whether what you think the idea does is actually happening

Yes, vibe coding tools like Lovable et al. help you to ship things faster, but only as long as these ideas struggle with the “development” part and don’t need Ideation and Validation.

Source: AI’s “Just Ship it.” problem

Talking to customers

Oh my, Justin (from my favorite newsletter platform Buttondown) nails it here:

Customers make for good historians but poor futurists, and certainly they won’t do the hardest and most important job of identifying your leverage points for you.

That was your shot. Here’s your chaser:

None of this is to say you shouldn’t talk to customers: you should! But it should be neither the first nor the last step in your process: if someone needs to talk with people to figure out what to build next, it means they have insufficient vision; if someone needs to talk with people to figure out if something is truly ready for GA it means your org has insufficient conviction and process.

Draw it until it works

Here’s a quick thought about ramping up on something new as a product manager.

If I don’t understand how something works in an organization, I do two things. I ask questions, and I draw boxes and arrows based on the answers. People sometimes make fun of me for this, but hear me when I say that nothing gets people aligned like a systems diagram they can disagree with.

B2C, B2B, Platform, Internal… the industry/product type doesn’t matter. Draw the flow of information through your product, get people to disagree, adjust until they agree. That’s the moment when you become a PM that can actually be helpful to the team and the business. You cannot improve the system until you understand it.

The Law of Propinquity And The Work From Home Dilemma

Here’s a solid, research-based take on remote work by Paul Taylor. He discusses what type of work is usually more successful when done in person vs. remotely:

If you are doing solid repeatable work, or work that requires intense solo concentration, you can work from pretty much anywhere. If you are in a discovery phase of work, and trying to fuse ideas together from multiple viewpoints remote working might be a hindrance.

It reminds me of a saying I heard somewhere: “In-person is where decisions happen, remote is where work happens.” I also really like this (new to me) concept:

The law of propinquity states that the greater physical (or psychological) proximity between people, the greater the chance that they will form friendships or romantic relationships. Other things being equal, the more we see people and interact with them, the more probable we are to like them.

You are probably not one feature away from success

I like this perspective from Ed Sim on recognizing that you can’t always build yourself into product-market fit…

There is no easy answer for a lack of customer traction, but my one suggestion before you commit to the idea that you are one feature away from success, is to go back to the basics and first ask if this is the right user or customer. If you believe you have that nailed, try multiple messages and keep learning from every interaction. You may have the right product today but for the wrong user. Or you simply may just have a cool technology in search of a problem to solve in which case you should start completely over.