Posts tagged “ai”

7 January 2026

Learning in the Age of AI

Scott H. Young has a thoughtful piece on what’s still worth learning in a world with AI. He cuts through both the panic and the hype to look at what the data actually shows. The biggest finding is probably not all that surprising: early-career workers in AI-exposed fields are getting hit hardest.

Another report from the Stanford Digital Economy Lab notes that early-career workers in AI-exposed fields (such as programming) have seen a relative decline in employment, even as employment among workers aged 30 years and older increased. This matches my intuition that AI coding agents can do a lot of junior developer tasks pretty well, but struggle to match the experience needed to tackle more serious work.

Young’s advice is to cultivate generalist skills. Not the content-free “critical thinking” kind, but genuinely transferable knowledge:

In an environment of change, it’s better to be the hardy dandelion rather than the hothouse orchid. Similarly, I expect with AI-induced change, people who have maintained diverse interests and skills will be best positioned to take advantage of the change, whereas extreme specialists will face a greater risk of extinction.

2 January 2026

Don't Outsource Your Love of Music to AI

I’m late to this one, but I like Liz Pelly’s take on Spotify Wrapped. It’s not just about music—it’s about what happens when we let corporations automate our memories:

Spotify Wrapped now feels like just another example of something personal and precious that is being automated away from us; another example of a supposedly unbearable task of thinking and writing being “offloaded” in order to make life more frictionless.

The post is essentially about friction—and why we need it. She argues that working through the process of remembering what mattered to us and thinking critically about our year is what keeps us sharp and curious. When we just accept what a streaming service tells us about our taste, we’re not just outsourcing a task. We’re losing our own sense of what connected with us and why.

It encourages music fans to believe that the records they streamed the most must be the ones they liked the most, which is surely not always the case.

Her suggestion is straightforward: write your own list. It doesn’t have to be polished—a notes app screenshot, a handwritten list, whatever. Just something that came from you, not from an algorithm optimizing for engagement metrics.

30 December 2025

What's Actually Working with AI

Natalia Quintero wrote about what she’s learned from talking to more than 100 companies about AI implementation. This part about the problem with early adopters and isolated workflows stood out:

AI doesn’t spread like other software. Think about Asana. If one person decides to organize their team’s tasks there, everyone benefits automatically because the work is more organized, and someone on the team has taken responsibility for that organization. You don’t need to learn the tool to get value from your colleague using it. AI doesn’t work that way. If you develop workflows around how you work, that value doesn’t automatically translate to the rest of the company. Your prompts, your GPTs, your automations—they’re built around your context, your processes, and your way of thinking. They don’t transfer.

That’s the adoption problem in a nutshell. A power user’s AI setup is like their personal note-taking system—valuable for them but not portable. It explains why enterprise rollouts don’t work the way everyone expects.

The recruiting firm example is good: they trained 10 champions who built tools their peers wanted to use. One person automated scheduling coordination (saving 2–10 hours per task), and suddenly 30 others got curious. Peer-to-peer beats top-down mandates.

(If you’re curious about my setup, I wrote about it here)

30 December 2025

Humans make mistakes, and so does AI. It's fine.

Will Larson has a good post about implementing “Agent Skills” in their internal agent framework at Imprint. The whole piece is worth reading, but I wanted to highlight this observation:

Humans make mistakes all the time. For example, I’ve seen many dozens of JIRA tickets from humans that don’t explain the actual problem they are having. People are used to that, and when a human makes a mistake, they blame the human. However, when agents make a mistake, a surprising percentage of people view it as a fundamental limitation of agents as a category, rather than thinking that, “Oh, I should go update that prompt.”

There’s a double standard at play here that I’ve noticed too. When a colleague writes a confusing document, we ask them to clarify. When an agent produces something off, we tend to smirk and declare the technology fundamentally broken.

The fix is often as simple as updating a prompt—the same way you’d coach a team member to write better tickets. Skills, in Larson’s implementation, are essentially reusable prompt snippets that encode learned behaviors across workflows. It’s the kind of organizational knowledge we build up with people over time, just made explicit.

30 December 2025

How My AI Product "Second Brain" Evolved

A couple of weeks ago I wrote about how I use AI for product work: the basic setup of context files, prompts, and the @ mention system in Windsurf. Since then I’ve moved tools, added slash commands, and automated a few things, so here’s the update.

The philosophy has shifted. I still don’t use AI to do my core thinking. I write my own PRDs and strategy docs. But I’ve come to rely on it more as a helpful assistant for the work around the work: reviewing documents before I share them, researching technical questions, summarizing my week, preparing for meetings. These days it feels more like having a colleague on call.

From Windsurf to Claude Code / OpenCode

The biggest shift was moving from Windsurf to Claude Code, Anthropic’s terminal-based AI assistant. Claude Code runs in your terminal and has direct access to your filesystem, which changes how you can structure these workflows.

The key feature that made this worthwhile is slash commands. Instead of manually @-mentioning prompt files, I can type /ask-se and Claude Code automatically loads the right context, reads the relevant files, and knows how to respond. Small thing, but removing that friction is the difference between using these tools and just meaning to.

I also started using OpenCode, an open-source alternative that works similarly. Both tools read from the same instruction files, so I maintain one set of prompts that work in either environment.

Slash Commands

The prompts I described in the original post are now wrapped in slash commands. The files live in .claude/commands/ and look like this:

.claude/commands/
├── prd.md          # Review a PRD
├── okr.md          # Review OKRs
├── debate.md       # Stress-test a product idea
├── ask-se.md       # Research technical questions
├── briefing.md     # Calendar briefing
├── today.md        # End-of-day summary
├── weekly.md       # Weekly summary
└── ...

Each command file contains instructions for the AI, plus a $ARGUMENTS placeholder for any input I provide. When I run /debate should we build a developer portal?, Claude Code reads the debate prompt, substitutes my question for the placeholder, and runs through its process.

The commands I use most often:

/prd. Reviews a PRD I’ve written and pushes back on vague problem statements, missing success metrics, or unclear scope. I run this before sharing drafts with stakeholders.
/debate. Simulates a debate between an optimist and a skeptic about a product idea. This is probably the command I reach for most when I’m still forming an opinion about something.
/ask-se. Helps me answer specific technical questions about our products. For example, today I ran /ask-se On Gateway HTTP Logpush jobs, what would trigger "unknown" as the action? The command uses MCP (Model Context Protocol) servers to search our public documentation and internal wiki, then synthesizes an answer I can actually use. It’s how I get product answers without interrupting engineers or digging through docs myself.

I don’t have to remember file paths or compose context any more. I type the command and start the conversation.

Skills: Persistent Methodology

Commands are things I invoke explicitly. Skills are methodology files that get applied automatically when they’re relevant.

I have three skills set up:

pm-thinking. Applies my product philosophy to any PM-related work. When I’m reviewing a PRD, it automatically checks for problem-first thinking, measurable outcomes, and clear non-goals.
cloudflare-context. Loads knowledge about Cloudflare’s products and triggers proactive use of internal data sources (more on this below).
data-team-context. Specific context about my team’s priorities, current initiatives, and constraints.

The skill files live in .claude/skills/ and contain triggers (when to apply) and behaviors (what to do). For example, the pm-thinking skill flags anti-patterns like “vague success metrics” or “jumping to solutions without understanding the problem.”

This means even when I’m not explicitly running a PM-focused command, the AI still knows to apply my methodology.

Daily and Weekly Summaries

One of the more useful additions is automated work journaling. At the end of the day, I run /today. The command:

Finds files I modified that day using filesystem timestamps
Reads the key files to understand what I actually worked on
Asks if there’s anything else to add
Generates a summary focused on outcomes, not tasks
Saves it to a structured folder: work/cloudflare/weeknotes/2025/01/week-01/2025-01-06.md

The output follows a consistent format:

# Tuesday, December 16, 2025

## Summary
Focused on Q1 planning and customer research. Clarified success
metrics for data quality initiative and documented common Logpush questions.

## What I Worked On
- **Q1 Planning:** Reviewed OKR drafts, identified gaps in success metrics
- **Customer Research:** Documented Logpush egress IP questions for support docs

Then on Friday (or Monday morning), I run /weekly. It reads all the daily notes for the week and synthesizes them into a summary I use to prep for 1:1s and status updates.

This works because I can never remember on Friday what I did on Tuesday. The daily notes capture the work while it’s fresh, and the weekly summary rolls it up into something useful.

Calendar Briefings

The most custom piece of this system is the calendar briefing. I got this idea from an interview with Webflow’s CPO on Claire Vo’s podcast, about building an AI chief of staff. Following her example I wrote a Python script that:

Connects to Google Calendar via OAuth
Fetches events for today, tomorrow, or the week
Generates a briefing that flags meetings that could be async, suggests prep work, and warns if the day is overloaded

I run /briefing tomorrow before wrapping up for the day. It gives me a head start on thinking about what’s coming and whether I need to prepare anything.

The briefing uses context from my about-me.md file, so it knows things like which recurring meetings are more important than others, and what kind of prep I typically need for different meeting types.

Syncing Between Tools

One wrinkle: Claude Code and OpenCode have slightly different formats for commands. Claude Code uses plain markdown; OpenCode expects frontmatter with a description field.

So I built a sync command. Running /sync-to-opencode compares the two folders and copies any changed commands, adding the frontmatter OpenCode needs. It means I can maintain one set of prompts (in .claude/commands/) and have them work in both tools.

Most people don’t need this. I switch between tools often enough that manual syncing got annoying.

What I’ve Learned Since the First Post

Automation compounds. The daily notes feed the weekly summary, the skills carry across every conversation, and once the slash commands cut enough friction I actually started using the system instead of just thinking about using it.
Custom workflows are basically one file now. With slash commands and skills, I can create tools that work exactly how I want to work. The /ask-se command knows my products, our customers, and the kinds of questions I typically need answered. The daily summary knows my file structure and my preferred format. I get to encode my workflow into a tool instead of adapting to whatever the tool prefers.
Scripts fill the gaps. The calendar briefing couldn’t be a pure prompt because it needed to fetch data from an external API. Having a scripts/ folder for Python code that extends the system has been useful for anything that requires real I/O.
Keep work output local. The prompts and context files live in a private git repo so I can sync them between machines and version them over time. But the actual work output, like weeknotes, briefings, and anything I’m working on, stays on my local machine, not in the repo. The AI helps me do the work, but the work itself doesn’t need to live in the cloud.

If you’re building something similar, I’d recommend starting with the basics from the first post: context files, opinionated prompts, and a tool that lets you compose them easily. The slash commands and automation came later, once I understood what workflows I actually used repeatedly.

The specific tooling matters less than the approach. Figure out where AI actually helps (reviewing your work, researching questions, reducing busywork) and build the smallest system that makes those things easy to do.

22 December 2025

Introducing TL;DL: AI-Powered Podcast Summaries

Do you ever listen to a podcast episode and wish you could have a summary you could reference later? Not the whole transcript or someone else’s review, just a concise breakdown of the key points in a format you can scan quickly when you need to remember what was covered. Well, that’s why I spent the weekend building TL;DL (Too Long; Didn’t Listen). It generates AI summaries from podcast episodes.

Beyond that, there were a few other podcast use cases I kept running into:

Catching up on episodes I missed. Sometimes a podcast gets 10 episodes ahead while life happens. A summary helps me decide which ones are worth going back to.
Getting a feel for a new podcast. Before committing to a full episode, I want to know if a new show covers topics in a way that works for how I think and learn.
Quick reference after listening. When I want to apply something from an episode—like a framework or technique—I don’t want to re-listen to an hour of audio to find the relevant 5 minutes.

So I built something for myself, and now I’m making it available to others.

How It Works

Registered users can submit an Apple Podcasts episode URL, choose a summary template, and the system does the rest. It transcribes the audio (using OpenAI Whisper), generates a summary (using GPT–5.2), and caches everything for a year.

The TL;DL submission form with three template options

The three templates are designed for different types of content:

Key Takeaways & Practical Steps — This is the default, and it’s what I use most. The summary includes an overview, key insights, actionable steps, and notable quotes. Best for professional development and craft podcasts where you want to walk away with something to implement.
Narrative Summary — For story-driven content and interviews. Instead of bullet points, this generates flowing prose that captures the arc of the conversation, including key moments and themes.
ELI5 (Explain Like I’m 5) — For technical or complex topics. It breaks down dense material using everyday analogies and simple language.

The ELI5 Template Passed the Real Test

My wife is a therapist. She listens to highly technical psychology podcasts about things like Transference-Focused Psychotherapy and pathological narcissism. When I ran a recent episode she listed to through TL;DL, she was genuinely impressed by the “Key Takeaways” summary. It captured the clinical nuances accurately.

I, on the other hand, didn’t understand a word of it.

So I generated another summary using the “ELI5” template, and suddenly I could follow along. Concepts like devaluation got explained as “when a patient puts down the therapist, the therapy, or anything connected to it.” The technical frameworks became accessible. Here’s the episode page if you want to toggle between the two summaries yourself.

A Note about Podcast Creators and Attribution

Attribution matters to me. Every episode page prominently displays the podcast name, creator names, and both a “Listen on Apple Podcasts” link and a “Website” link to the official podcast website. My hope is that TL;DL helps expand a podcast’s audience by making the content more accessible. Summaries should bring people to a podcast, not replace the experience of listening—most podcasts have transcripts available already after all.

That said, if creators would rather not have their podcast processed, they can opt out and I’ll add their show to the blocklist.

The Technical Bits

For those interested in the stack: TL;DL runs entirely on Cloudflare’s edge platform. Cloudflare Workers handles the serverless compute, Workers KV stores the cached transcripts and summaries, and Cloudflare Queues manages the background job processing.

One interesting technical challenge was job status consistency. When you submit an episode, you want to see the status update in real-time as it progresses from “queued” to “transcribing” to “summarizing” to “completed.” Workers KV is eventually consistent, which meant status updates could lag by up to a minute. Users would refresh and still see “queued” even after the job was done.

I solved this with Durable Objects, Cloudflare’s strongly consistent coordination layer. The job status gets written to both the Durable Object (for immediate reads) and KV (for persistence and fallback). The UI now updates instantly.

Audio file handling for long episodes was another challenge. OpenAI Whisper has a 25MB file size limit. For podcasts that exceed this, I implemented MP3 frame-aware chunking—splitting the audio at frame boundaries so transcription can be stitched back together cleanly. The overlap handling ensures no words get lost between chunks.

What’s Next

Beyond solving my own problem, this was one of those projects where the building itself was the reward. The technical challenges were interesting, the product felt useful from day one, and I got to learn Durable Objects properly.

Submitting new episodes is currently invite-only while I iron out the rough edges. If you’re interested in access, reach out. For now, you can browse existing summaries to see how it works.

18 December 2025

Building a music discovery app (and what I learned about Product)

I miss liner notes. In the age of infinite streaming and algorithmic playlists I find myself longing for the days when you’d flip open a CD case and actually read about the music you were listening to. Who produced this? What’s the story behind the album? Why does this track feel different from everything else they’ve made?

Spotify and Apple Music are great at giving you more music. They’re less good at helping you understand why you might love something, or what to explore next. So I built my own solution—and then rebuilt it twice.

The problem I was trying to solve

My relationship with Last.fm goes back to 2007. In case you’re not familiar, Last.fm is a service that “scrobbles” (tracks) everything you listen to, building a comprehensive history of your musical life. It’s become a wonderful archive of my taste evolution over nearly two decades.

Last.fm is great at telling you what you listened to. It’s less useful for helping you understand why you might love something, or what else you should explore. Spotify and Apple Music’s algorithmic playlists are fine, but they often feel like they’re optimizing for engagement rather than genuine discovery.

I wanted a tool that would:

Show me context about the artists and albums in my listening history
Help me discover music through similarity and connection, not just popularity metrics
Give me that “liner notes” depth I was craving
Work with my existing Last.fm data (18 years of listening history is a lot to throw away)

So I started building, first by copy-pasting from GPT–4 (the olden days!), and most recently with Antigravity + Claude Opus 4.5 (we’ve come a long way since 2023). Here’s where it all stands today…

Listen To More: three iterations and counting

Listen To More is the core project—a music discovery platform that combines real-time listening data with AI-powered insights.

The first version was simple: a personal dashboard that pulled my Last.fm data and displayed it nicely. Functional, but limited. The second version added some AI summaries using OpenAI’s API. Better, but still rough around the edges.

The current version—iteration three—is a complete rebuild focused on speed and multi-user support. What started as “a thing I made for myself” is now something anyone can use. Sign in with your Last.fm account, and you get:

Rich album and artist pages with AI-generated summaries, complete with source citations (so you know the AI isn’t just making things up)
Your personal stats showing recent listening activity, top artists and albums over different time periods.
Weekly insights powered by AI that analyze your 7-day listening patterns and suggest albums you might love
Cross-platform streaming links for every album—Spotify, Apple Music, and more
A Discord bot so you can share music discoveries with friends

The tech stack is Hono on Cloudflare Workers, with D1 (SQLite) for the database and KV for caching. The whole thing is server-side rendered with vanilla JavaScript for progressive enhancement. Pages load in about 300ms, then AI summaries stream in asynchronously.

I chose this stack partly because I work at Cloudflare and wanted to understand our developer platform better. More on that later.

Extending the ecosystem with MCP servers

MCP stands for Model Context Protocol. In plain terms, it’s a standard that lets AI assistants (like Claude) connect to external data sources and tools. Think of it as giving an AI the ability to actually use personalized data rather than just answer questions based on pre-training.

I built two MCP servers to extend my music discovery ecosystem:

Last.fm MCP Server

Available at lastfm-mcp.com, this server lets AI assistants access your Last.fm listening data. Once connected, you can have conversations like:

“When did I start listening to Led Zeppelin?”
“What was I obsessed with in summer 2023?”
“Show me how my music taste has evolved over the years”

The AI can pull your actual scrobble data, analyze trends, and give you personalized insights. It supports temporal queries (looking at specific time periods), similar artists discovery, and comprehensive listening statistics.

Discogs MCP Server

This one connects to Discogs—the massive music database and marketplace that’s especially popular with vinyl collectors. If you have a Discogs collection, the MCP server lets AI assistants:

Search your collection with intelligent mood mapping (“find something mellow for a Sunday evening”)
Get context-aware recommendations based on what you own
Provide collection analytics and insights

Both servers run on Cloudflare Workers and use OAuth for secure authentication. They’re open source if you want to poke around or deploy your own.

What I learned

I’m a Product Manager, not an engineer. But I’ve found that having more technical depth broadens the scope of things I am able to contextualize—and makes me more confident in my interactions with engineers. Here’s what building these projects reinforced for me:

Side projects are a low-stakes learning environment. When you’re building for yourself, there’s no pressure to ship by a deadline or meet someone else’s requirements. You can experiment, break things, and iterate freely. I tried approaches that would have been too risky to propose in a work context—some of them broke the site spectacularly, others worked beautifully.
There’s no substitute for using your own product. I use these tools every day. That constant exposure surfaces issues and opportunities that you’d never catch in a quarterly review or user interview. The feature prioritization becomes obvious when you’re feeling your own friction.
Building with your company’s tools is invaluable. I now have deep, practical knowledge of Cloudflare Workers, D1, KV, and the rest of our developer platform. When I’m talking to customers or evaluating feature requests, I’m drawing on real experience, not just documentation. I can empathize with the developer experience because I’ve lived it.
The fun matters. I keep coming back to these projects because I genuinely enjoy working on them. The satisfaction of solving a problem you personally care about is different from the satisfaction of shipping something at work. Both are valuable, but the former is what sustains a side project through the inevitable rough patches.

What’s next

I have a list of features I’d love to add—better recommendations, more sophisticated listening pattern analysis, maybe even integration with other music services. But I’m also learning to pace myself. These projects aren’t going anywhere, and part of the joy is the slow, steady improvement over time.

If you’re curious, you can check them out here:

listentomore.com — The main music discovery platform
lastfm-mcp.com — Connect AI assistants to your Last.fm data
Discogs MCP Server — Connect AI assistants to your Discogs collection

And if you’re a PM thinking about starting a technical side project: do it. Pick something you personally care about, use tools you want to learn, and give yourself permission to build slowly. The lessons transfer in ways you won’t expect.

16 December 2025

Building MCP servers in the real world

This has been my experience with MCP servers as well. As useful as I think my Last.fm MCP server is, I can’t see it every having more than a dozen users. But internal company servers are massively useful:

MCP is being used especially heavily by internal data and platform teams to give internal users access to systems. These are systems that these users perhaps already had access to, but it was either too complex or too broad, or needed a lot of documentation or special skills to use.

Wiki search is so much better now that I can use our internal MCP server for it via Windsurf.

Source: Building MCP servers in the real world →

16 December 2025

Measuring AI's Impact on Shipping Speed and Code Quality

Will Larson has a good post about how they’re adopting AI at his company. The process is interesting, but this is the part that jumped out at me:

My biggest fear for AI adoption is that they can focus on creating the impression of adopting AI, rather than focusing on creating additional productivity. Optics are a core part of any work, but almost all interesting work occurs where optics and reality intersect.

It’s really hard to figure out if AI tools are (1) helping teams ship faster (2) without sacrificing quality.

We’re working on figuring out this problem right now at Cloudflare. Our proposed approach sidesteps the problem of per-commit AI attribution (did Copilot write this line? did Claude?) by correlating team-level AI tool usage with team-level health metrics over time. If a team’s AI adoption increases by 30% and their change failure rate stays stable, that’s a useful signal. If AI usage spikes and incidents start trending up, that’s worth investigating.

The key insight is that you don’t need perfect attribution to get directionally useful data. Correlation isn’t causation, and teams adopting AI tools may already be more experimental or higher-performing. But at least you’re measuring something real instead of the something like “# of lines written by AI”, which leads straight to the Goodhart’s Law problem where metrics become targets.

14 December 2025

How I Use AI for Product Work

Update! I wrote a follow-up post here: How My AI Product “Second Brain” Evolved.

I’ve been refining my approach to using LLMs for product work, and I figured it’s time to write up how I actually use them day-to-day.

The most valuable thing an AI assistant can do for product work is push back on weak reasoning, spot gaps you missed, and force you to articulate why your idea is good. A peer reviewer who shares your product philosophy, basically. With the right prompts, AI assistants are also good at producing background and framing documents: explainers that synthesize complex topics and summaries of technical concepts.

Here’s how I’ve set it up, what makes it work, and how you might build something similar.

The Philosophy: A Sparring Partner

I believe LLMs are most useful when you give them two things: context and constraints.

Context tells the model who you are, what you’re working on, and what “good” looks like in your world. Constraints keep it from drifting into generic advice or invented frameworks.

Every prompt I use is designed to provide both. They’re opinionated on purpose. I’d rather have an assistant that pushes back on bad ideas than one that says “Great idea!” to everything.

I don’t want AI writing presentations for me. I want a thinking partner that:

Challenges weak problem statements before I waste time on solutions
Spots missing success criteria I forgot to define
Asks “why?” when my reasoning gets hand-wavy
Points out when I’m jumping to solutions before understanding the problem
Helps me create background docs and explainers that set context for others

The Building Blocks

Here’s the folder structure I maintain, with a series of Markdown files organized by purpose:

llm-prompts/
├── prompts/           # System prompts for different use cases
│   ├── pm/            # Product management prompts
│   └── technical/     # Technical/engineering prompts
├── context/           # Personal context files (who I am, how I work)
├── reference/         # Syntax guides and reference docs
└── work/              # Saved feedback and refined docs

What makes this work is how you combine them.

Layer 1: System Prompts

These are the instructions that tell the AI how to behave for a specific task. I have different prompts for different jobs:

General PM sparring: A prompt that knows my product philosophy and pushes back on weak reasoning. I use this for thinking through tradeoffs, preparing for meetings, and sanity-checking my approach.
Document review: Prompts specifically designed to critique PRDs, OKRs, strategy docs, and other artifacts. These encode what “good” looks like and call out common anti-patterns.
Idea stress-testing: A prompt that I stole from my friend Stephen, which simulates a debate between an optimist and a skeptic to pressure-test new ideas before I get too attached to them.
Technical understanding: Prompts that help me understand systems, architectural decisions, and technical concepts well enough to lead effectively (I’m not an engineer, but I need to hold my own in architecture reviews).

Each one carries a specific point of view about what good work looks like.

Layer 2: Personal Context

I maintain files that describe:

Who I am: My role, my experience, my communication style
How I work: My product philosophy, my management approach, my values
What I’m working on: Current projects, team context, company priorities

When I start a conversation, I can pull in the relevant context files alongside my prompt. The model then has the background it needs to give me advice that fits my situation, instead of generic best practices from a blog post.

Layer 3: Reference Materials

Sometimes you need the model to follow specific formats or conventions. I keep reference files for things like wiki markup syntax, documentation templates, or internal style guides. These keep the output usable without a round of reformatting.

How I Actually Use This

I use Windsurf as my daily driver, and it has a feature that makes this whole system work: the @ mention. In the chat panel, I can reference any file by typing @ followed by the path. Windsurf then includes that file’s contents as context for the conversation.

This means I can compose my “assistant” on the fly by combining:

A system prompt for the task at hand
Relevant personal context files
The document or code I’m working on

Example: Document Review

When I need feedback on a PRD before sharing it with stakeholders, I’ll start a conversation and reference my PRD review prompt plus my product philosophy context. Then I paste in the PRD and ask for critique.

The model comes back with feedback measured against my own standards. It’ll call out if my problem statement is vague, if my success metrics aren’t measurable, or if I’m jumping to solutions before properly framing the problem—the kind of pushback I’d want from a peer reviewer.

Example: Brainstorming Partner

For early-stage thinking, I use a more conversational prompt that knows how I like to explore ideas. I’ll describe what I’m thinking about and ask it to poke holes, suggest angles I haven’t considered, or help me articulate why something feels off.

This helps most before big meetings. I can rehearse my reasoning and get challenged on the weak spots before I’m in front of stakeholders.

Example: Technical Understanding

I’m not an engineer, but I work with technical teams. When I need to understand how a system works well enough to ask good questions or spot when something doesn’t add up, I use prompts designed for technical explanation.

The key is that these prompts know to explain things without condescension but also without assuming I know the jargon. They cite specific files and line numbers when relevant, and they explain the “why” behind design decisions.

Connecting to Real Data

MCP (Model Context Protocol) servers connect the AI to external data sources: internal wikis, documentation sites, code repositories, APIs. So it can answer from real information instead of guessing from training data.

In my prompts, I tell the model which MCP servers are available and when to use them. For example, my technical prompts instruct the model to:

Search official documentation first to ground answers in verified information
Check internal wikis for known issues, edge cases, and workarounds
Look at code repositories when documentation is incomplete
Always cite sources with links so I can verify

Now the AI answers like someone with access to your company’s actual knowledge base, citing real docs and real code instead of best-practice boilerplate.

Keeping a Record

Save the conversation output somewhere useful.

I have a work/ folder organized by topic where I save feedback and refined thinking. When the model gives me good critique on a PRD, I’ll ask it to write a summary of the key issues to a Markdown file I can reference later. This keeps the insights from getting lost in chat history.

What I’ve Learned

Context files are worth the investment. I have files that describe who I am, how I work, and what I value. Updating these takes time, but it pays off in every conversation.
Pushback is the point. These prompts are designed to challenge bad thinking. If the model is pushing back on your approach, take it seriously: it might be right.
Iterate on the prompts. I update these regularly based on what works and what doesn’t. If a prompt isn’t helping, change it.
Less context is often more. Including too much context can dilute the signal. Start with the minimum you need, add more if the model seems confused.

The thinking is still mine

None of this does the thinking for me. It encodes my preferences and philosophies into something an LLM can use as a baseline for pushing back. I still write my own PRDs, OKRs, and strategy docs, because those represent my actual thinking. But I let AI help me create background documents, explainers, and context-setting materials. And I have a sparring partner that catches the gaps I miss and asks the uncomfortable questions before stakeholders do.

If you build something similar, I’d love to hear how it goes.

« Previous
1
…
3
4
5
6
7
…
15
Next »