Posts tagged “productivity”

23 March 2026

Agentic manual testing

Simon Willison has a practical guide on manual testing with coding agents. Two tips I’ve already started using:

It’s still quick for an agent to write out a demo file and then compile and run it. I sometimes encourage it to use /tmp purely to avoid those files being accidentally committed to the repository later on.

And:

If an agent finds something that doesn’t work through their manual testing, I like to tell them to fix it with red/green TDD. This ensures the new case ends up covered by the permanent automated tests.

14 March 2026

From Assistant to Collaborator: How My AI Second Brain Grew Up

Over the past few months I’ve been writing about how I use AI for product work. The first post covered the philosophy: context files, opinionated prompts, and how to compose the right inputs for each task. The second added slash commands and daily summaries. The third was a hands-on setup guide. And the fourth introduced project brains for keeping complex initiatives organized.

This post covers a different kind of change. The earlier additions were incremental: more commands, better context, smoother workflows. What changed recently feels more like a threshold. The system went from a tool I invoke for specific tasks to something closer to a collaborator I dispatch to do real work. Three capabilities drove that shift: multi-agent orchestration, cross-session memory, and the encoding of domain expertise into the system itself.

Multi-Agent Workflows

The clearest example is customer escalation investigations. As a PM for data products, I regularly investigate customer-reported issues: logging gaps, data discrepancies, behavior that doesn’t match expectations. These investigations require pulling information from multiple sources and cross-referencing it all into an analysis that engineering can act on.

I built a slash command that handles this as a multi-phase workflow. When I run it with a ticket ID, here’s what happens:

The system reads the customer ticket, extracts the core problem, identifies which product area is involved, and classifies the issue type.
Three specialist agents launch simultaneously, each focused on a different data source. One searches the codebase for the relevant logic and recent changes. Another searches for related tickets and prior incidents across projects. A third checks documentation and internal wiki pages for relevant operational context.
A fourth agent receives the combined findings and produces database queries that can confirm or refute the working hypothesis.
The system combines everything into a structured analysis: issue classification, root cause anchored in code where possible, customer impact, and recommended next steps.
A blind validator independently re-fetches every source cited in the draft to verify the claims hold up. Then an adversarial challenger looks for alternative explanations and tests whether the classification is correct.

The output is a document I can review with an engineering colleague or paste into a chat thread. It includes a confidence assessment and a data collection status table showing what was checked and what was unavailable, along with how the analysis compensated for gaps.

The command file that orchestrates all of this isn’t prompting in the traditional sense. It defines which agents to dispatch, what information each one needs, when to wait for results before proceeding, and how to handle failures gracefully. Writing this felt more like designing a workflow than writing a prompt.

I’ve applied the same pattern to other tasks. A “fix feasibility” command evaluates whether a ticket describes a code change simple enough for a PM to implement with AI coding assistance, and produces an implementation brief if the answer is yes. The specific use cases differ, but the architecture is the same: break the problem into specialist tasks that run in parallel, then synthesize and validate the results.

Cross-Session Memory

AI conversations are stateless by default. Every new session starts from zero, which means re-explaining context that should already be established. Over a few weeks of working on the same projects, this friction adds up.

I addressed this with a four-layer memory system:

The first layer is stable facts: a compact file that captures the current state of all active work, including project status, recent decisions, and environment constraints. This is the primary orientation file. When I start a session, the AI reads it and immediately knows what’s in flight.
The second is a session log: a reverse-chronological list of handoff notes. Each entry records what happened in a session and what threads remain open. The last three entries give enough context to pick up where I left off.
Third, a corrections file. This holds behavioral fixes for things the AI consistently gets wrong. It’s a staging area that should shrink over time as fixes get promoted elsewhere.
And finally, a decisions log: a cross-cutting record of decisions that don’t belong to a specific project. Each entry captures context and rationale so I don’t relitigate settled questions.

Two commands manage this. /session-start loads all four files and presents a brief summary of current state and recent sessions. /session-end reviews the conversation, writes a handoff note, and then checks whether any learnings should be promoted to infrastructure.

“Promote to infrastructure” means taking something learned during a session and baking it into the files the agent actually reads. A correction about how to handle a specific edge case in escalation investigations might start in the corrections file, then get promoted into the escalation command or a domain skill once it’s validated. The corrections file shrinks over time as that knowledge moves into the right places.

This creates a loop where the system improves its own instructions. I approve every change, so it’s not self-modifying in a creepy way. But in practice each work session can make the next one slightly better, and the compound effect over weeks is noticeable.

Domain Expertise

The earlier posts described skills like pm-thinking, which applies product methodology (problem-first thinking, measurable outcomes) to any PM-related conversation. That’s useful, but generic. It works the same way regardless of what product you’re building.

The bigger shift was building skills that encode institutional knowledge about specific products. I now have skills for each major product area my team owns: log delivery, analytics, audit logs, alerting, and data pipelines. Each skill contains the product’s architecture and common failure modes, along with which code repositories to search and which database tables hold relevant data.

This is what makes the multi-agent workflows useful. When the code investigator agent examines an escalation about missing logs, the domain skill tells it which service handles job state and which repository contains the delivery pipeline. It also flags recent architectural changes that might be relevant. Without that context, the agent produces plausible-sounding analysis that misses the specific details engineering needs.

Now every investigation that uses a skill validates or extends the knowledge it contains, and /session-end catches insights that should be added back.

How The Work Changes

The biggest change is in my own role. It’s gone from “write the right prompt” to “design the right process.” The escalation command is a workflow with phases, dependencies, and validation steps, and thinking about it that way beats trying to pack everything into a single conversation. A few other things I’ve noticed:

Validation has to be built in. The blind validator exists because agents make mistakes. They cite files that don’t exist, mischaracterize what code does, or draw conclusions the evidence doesn’t support. Catching those issues before they reach anyone else is the whole point.
Cross-session memory requires discipline. The system only works if I run /session-end after substantive sessions and keep stable facts current. When I skip it, the next session starts cold and I lose the compounding benefit. Automation helps, but the commitment to maintain the memory is mine.
And domain skills need regular maintenance. Products change. Code gets refactored, pipelines get rearchitected. Skills that aren’t periodically updated drift from reality. I haven’t solved this well yet. It’s still a manual process of noticing when a skill’s knowledge is stale and updating it.

The system still makes mistakes. Multi-agent workflows are more thorough than single-prompt conversations, but they’re not infallible. The confidence assessment in the escalation output exists because sometimes the answer is “medium confidence, we couldn’t confirm this from the available data.” That honesty about limitations is more useful than false certainty.

Where This Is Going

I’m sure the specific commands and skills will look different in six months as I learn what works and what doesn’t. But the underlying pattern feels durable: compose specialist agents with deep domain context, validate their output, and feed learnings back into the system.

I’ve published updated files to the Product AI Public repo, including the session memory commands and a generalized version of the multi-agent escalation workflow. If you’re building something similar, those might be useful starting points.

None of these pieces does much on its own. It’s the way they feed each other that turned a pile of separate prompts into something I lean on every day.

14 March 2026

When Using AI Leads to “Brain Fry"

I am definitely feeling the “brain fry” right now:

We found that the phenomenon described in these posts—cognitive exhaustion from intensive oversight of AI agents—is both real and significant. We call it “AI brain fry,” which we define as mental fatigue from excessive use or oversight of AI tools beyond one’s cognitive capacity. Participants described a “buzzing” feeling or a mental fog with difficulty focusing, slower decision-making, and headaches.

The research is fascinating and worth reading, with super interesting findings like this:

As employees go from using one AI tool to two simultaneously, they experience a significant increase in productivity. As they incorporate a third tool, productivity again increases, but at a lower rate. After three tools, though, productivity scores dipped. Multitasking is notoriously unproductive, and yet we fall for its allure time and again.

Earlier this week I had this thought: “Oh no, I think I’ve blown out my context window. I wish I could add some more tokens to my brain. Until then I might just have to respond to new requests with 401 Unauthorized.”

And that’s when I realized I probably need to go touch grass or something.

1 March 2026

Toolshed, blueprints, and why good agents need good DevEx

Alistair Gray published part two of Stripe’s “Minions” series, going deeper on how they built their internal coding agents. It’s a great read throughout, but three ideas really stood out to me.

First, blueprints. These are workflows that mix deterministic steps with agentic ones:

Blueprints are workflows defined in code that direct a minion run. Blueprints combine the determinism of workflows with agents’ flexibility in dealing with the unknown: a given node can run either deterministic code or an agent loop focused on a task. In essence, a blueprint is like a collection of agent skills interwoven with deterministic code so that particular subtasks can be handled most appropriately.

If you know a step should always happen the same way, don’t let an LLM decide how to do it. Let the agent handle the ambiguous parts, and hardcode the rest (this can also dramatically reduce token cost).

Second, their centralized MCP server:

We built a centralized internal MCP server called Toolshed, which makes it easy for Stripe engineers to author new tools and make them automatically discoverable to our agentic systems. All our agentic systems are able to use Toolshed as a shared capability layer; adding a tool to Toolshed immediately grants capabilities to our whole fleet of hundreds of different agents.

A shared tool layer that all agents can use… 500 tools, one server, hundreds of agents. Very cool idea.

And third, what they call “shifting feedback left”:

We have pre-push hooks to fix the most common lint issues. A background daemon precomputes lint rule heuristics that apply to a change and caches the results of running those lints, so developers can usually get lint fixes in well under a second on a push.

If you can catch a problem before it hits CI, do it there. A sub-second lint fix on push is better than a 10-minute CI failure, whether you’re a person or an LLM burning tokens.

So much of Stripe’s agent success is built on top of investments they made for human developer productivity. Good dev environments, fast feedback loops, shared tooling. The agents benefit from all of it, and developers remain in control.

23 February 2026

Project Brains: Organizing Complex Initiatives for AI-Assisted Work

I’ve written before about how I use AI for product work and how that workflow evolved with slash commands and skills. This post focuses on how to maintain context for complex, long-running projects.

The Problem: Context Fragmentation

When I’m working on a major initiative, relevant information ends up scattered everywhere: PRDs in one tool, tickets in another, meeting notes in a third, plus emails and chat threads. Every time I return to a project after a few days, I spend time reconstructing where things stand.

AI assistants can make this worse because each conversation starts fresh. I can reference files, but the model doesn’t know which files matter for this project, what decisions we’ve already made, or what questions remain open. I end up re-explaining context that should be obvious.

Project brains solve this by creating a dedicated folder for each major initiative with a standard structure that both humans and AI can navigate.

What a Project Brain Looks Like

The structure looks like this:

projects/[project-name]/
├── CONTEXT.md        # The hub: status, stakeholders, decisions, open questions
├── artifacts/        # PRDs, specs, designs, one-pagers
├── decisions/        # Decision logs with rationale and alternatives
├── research/         # Customer feedback, data analysis, technical investigation
└── meetings/         # Meeting notes related to this project

The CONTEXT.md file is a living document that answers the questions I’d need to answer every time I pick up a project:

What’s the current status?
Who are the stakeholders and what do they care about?
What decisions have we made and why?
What questions are still open?
Where are the relevant artifacts?

When I start a conversation about a project, I point the AI to the project folder. It reads CONTEXT.md first, then can drill into specific artifacts as needed. The model immediately knows the project state without me explaining it.

A Real Example

Say I’m working on adding observability to an internal platform—something that needs coordination across multiple teams over several months. The CONTEXT.md includes:

Quick reference table: Status, PM, engineering lead, target dates, links to the PRD and relevant tickets. Everything I’d need to orient myself.
Problem statement: A clear articulation of the user pain. In this case: “Platform incidents go undetected until users report them, and debugging takes hours due to lack of visibility.”
Success metrics with baselines and targets: Things like uptime targets, reduction in mean time to resolution, and alert accuracy. These anchor every conversation about scope.
Key decisions made: A table showing what was decided, when, why, and what alternatives we considered. When someone asks “why aren’t we including component X in v1?”, the answer is already documented.
Open questions: A checklist of unresolved issues. This prevents the AI from assuming things are settled when they’re not.
Links: Direct paths to the PRD, spec, analysis docs, and related pages.

The decisions/ folder contains detailed decision logs for significant choices. The research/ folder holds whatever analysis informed the project direction. The meetings/ folder captures sync notes that would otherwise disappear into Gemini notes in a Google Drive… somewhere.

When to Create a Project Brain

Not every task needs this treatment. I create a project brain when:

The work spans multiple weeks or months. Short-term tasks don’t need the overhead.
Multiple stakeholders are involved. If I need to coordinate with other teams, having a single source of context helps.
Decisions require documented rationale. If someone might ask “why did you do it this way?” later, a decision log is worth the investment.
The project crosses team boundaries. Cross-functional initiatives benefit from dedicated context that doesn’t live in any one team’s space.

For simpler work, I use a flatter folder structure with documents organized by type. Project brains are for the complex initiatives where losing the thread between sessions costs me real time.

How AI Uses Project Brains

This earns its keep when I’m working with AI on project-specific tasks. A few examples:

Preparing for a meeting: “Read the CONTEXT.md in the [project] folder. I have a spec review meeting tomorrow. What are the open questions I should raise?”
Drafting an update: “Based on the project context, draft a status update for leadership. Focus on progress since the start of the month and remaining blockers.”
Decision analysis: “We need to decide whether to include [component] in scope. Read the research folder and the current CONTEXT.md. What would you recommend and why?”

By the time I’m working in it, the AI already knows the project’s history and the people involved, so its recommendations fit this specific situation instead of falling back on generic best practices.

Maintaining the Project Brain

The value depends on keeping CONTEXT.md current. I’ve found a few practices help:

Update after significant events. When a decision is made, a meeting happens, or the status changes, update the file immediately. “I’ll do it later” means it won’t happen. LLMs are great at making these updates, so you can simply say “update relevant files based on the session we just concluded.”
Move open questions to resolved. When a question gets answered, don’t delete it. Mark it resolved and note the answer. This preserves the reasoning trail.
Link, don’t duplicate. CONTEXT.md should point to artifacts, not contain them. Keep PRDs in the artifacts folder. Keep meeting notes in the meetings folder. The context file is a hub, not a repository.

Scaffolding New Projects

I have a slash command that scaffolds new project brains:

/new-project platform-observability

This creates the folder structure, generates a CONTEXT.md from a template, and fills out a rough draft based on whatever context I provide. Removing the friction of setup means I’m more likely to actually use the system. You can view the command here.

The template includes the standard sections (Quick Reference, Problem Statement, Success Metrics, etc.) with placeholder text. I fill in what I know and mark other sections as TBD. Even an incomplete project brain is more useful than scattered notes.

What Surprised Me

A well-organized project brain with sparse content beats a folder full of undifferentiated documents every time, because the AI (and future me) can work with structure far more easily than with a pile of files. The decision logs have paid off the most: when someone asks why we didn’t do something, I point to the log instead of reconstructing my reasoning from memory. And while I built this for the AI, I reference these files constantly myself. Staying on top of the context keeps me oriented too, not just the assistant. The structure stays flexible too. Some projects grow extra subfolders like research/customer-interviews/, others need fewer.

This approach requires discipline to maintain, and the upfront setup takes time. But for complex initiatives where context fragmentation is a real problem, project brains have been worth the investment. The AI becomes a more useful collaborator when it has access to the same context I do.

I’m still iterating on the structure. I suspect the template will look different six months from now as I learn what sections actually get used and which ones I skip every time. I’m not trying to get the folder structure perfect. I just want to stop losing context between conversations, so each time I come back to a project I can build on what I already know.

21 February 2026

The AI baseline has moved

Geoffrey Huntley wrote about what happens when people finally “get” AI:

If you’re having trouble sleeping because of all the things that you want to create, congratulations. You’ve made it through to the other side of the chasm, and you are developing skills that employers in 2026 are expecting as a bare minimum.

The only question that remains is whether you are going to be a consumer of these tools or someone who understands them deeply and automates your job function? Trust me, you want to be in the latter camp because consumption is now the baseline for employment.

Knowing how to use these tools is no longer a differentiator. The gap is between people who consume AI outputs and people who understand the systems well enough to build on top of them.

For product managers, this means that prompting ChatGPT for a first draft doesn’t count as an AI skill anymore. The question is whether you can wire together agents, automate your own workflows, and spot opportunities others miss because they’re still thinking in manual processes.

2 February 2026

The Jevons Paradox and the Future of Knowledge Work

I keep thinking about this essay by Mike Fisher about what happens when automation makes work easier. His central argument challenges the assumption that’s baked into most AI-and-jobs discourse:

In every domain where automation becomes powerful, the pattern remains consistent. Human expertise becomes more valuable because the total volume of meaningful work increases. Early fears of automation nearly always assume a fixed amount of work being redistributed. But work is not fixed. Work expands when constraints are removed.

He anchors this on the Jevons Paradox—the 19th century observation that improved steam engine efficiency led to more coal consumption, not less. And then he traces the pattern through radiology, where the number of US radiologists grew from 30,723 in 2014 to 36,024 in 2023, despite Hinton’s 2016 prediction that deep learning would make them obsolete within five years.

He concludes:

AI will reshape the profession, but only in the sense that cars reshaped transportation or spreadsheets reshaped finance. Not by eliminating the field, but by expanding its scope. Not by reducing labor, but by elevating it. Not by shrinking opportunity, but by multiplying it. The world does not need fewer people who understand systems. It needs far more of them.

I find this framing useful because it shifts the question from “will AI take my job?” to “how will the work change as the volume increases?” That’s a much more interesting thing to figure out (which is also why I have been so focused on expanding my Product Second Brain).

8 January 2026

How to Set Up OpenCode as Your Product Second Brain

This is the hands-on companion to How I Use AI for Product Work and How My AI Product Second Brain Evolved. Those posts explain the philosophy; this one gets you to a working OpenCode setup in 30 minutes.

By the end, you’ll have a folder structure, three working slash commands, and a context file that makes the AI useful.

Prerequisites

You’ll need OpenCode installed and configured. Follow the installation guide to get started. If you can run opencode in your terminal and get a response, you’re ready.

Step 1: Create the Folder Structure

Create a new directory for your product second brain. I recommend keeping it in a git repo so you can version your prompts over time.

mkdir -p product-ai/{context,prompts/pm,.opencode/command}
cd product-ai
git init

Your structure should look like this:

product-ai/
├── .opencode/
│   └── command/       # Slash commands live here
├── context/           # Personal context files
├── prompts/
│   └── pm/            # PM-specific prompts
├── AGENTS.md          # Instructions for OpenCode
└── opencode.jsonc     # OpenCode config

Step 2: Create Your Context File

The context file tells the AI who you are and how you work. Create context/about-me.md:

# About Me

## Role
[Your title] at [Company], working on [your area].

## What I Care About
- [Your product philosophy, e.g., "Start with the problem, not the solution"]
- [Your working style, e.g., "Bias toward shipping and learning"]
- [Your communication preferences, e.g., "Direct feedback, no hedging"]

## Current Focus
- [Project or initiative 1]
- [Project or initiative 2]

Keep this updated as your focus changes. The more specific you are, the more useful the AI becomes.

Step 3: Create AGENTS.md

This file tells OpenCode how to behave in your repo. Create AGENTS.md in the root:

# Product AI Second Brain

Read this file before responding.

## Who I Am
Read `context/about-me.md` for personal context.

## Slash Commands
Run these by typing the command in OpenCode:

| Command | Purpose |
|---------|---------|
| `/prd` | Review a PRD |
| `/debate` | Stress-test a product idea |
| `/okr` | Review OKRs |

Step 4: Add Your First Commands

Slash commands are markdown files in .opencode/command/. Each command file is a thin wrapper that points to a full prompt file in prompts/pm/. This separation keeps commands simple while allowing prompts to be detailed and shareable.

Command 1: /debate (Stress-test an Idea)

Create .opencode/command/debate.md:

---
description: Stress-test a product idea with pro vs skeptic debate
---

# Product Debate

Read these files before proceeding:
- `prompts/pm/debate-product-idea.md` - **REQUIRED: Full debate framework**
- `context/about-me.md` - Your product beliefs and context

## Your Task

$ARGUMENTS

## Instructions

Then create the prompt file prompts/pm/debate-product-idea.md with your full debate methodology. The basic structure: define two personas (a Visionary who argues for the idea and a Skeptic who pokes holes), have them debate, then synthesize the strongest arguments from both sides.

Command 2: /okr (Review OKRs)

Create .opencode/command/okr.md:

---
description: Review OKRs for clarity and outcome-orientation
---

# OKR Review

Read these files before proceeding:
- `prompts/pm/review-okrs.md` - **REQUIRED: Full OKR review framework**
- `context/about-me.md` - Your product beliefs and context

## Your Task

$ARGUMENTS

## Instructions

Then create prompts/pm/review-okrs.md with your OKR criteria. Mine checks for outcome-orientation (are these outputs or outcomes?), measurability, and whether the key results actually ladder up to the objective.

Command 3: /prd (Review a PRD)

Create .opencode/command/prd.md:

---
description: Review a PRD for completeness and clarity
---

# PRD Review

Read these files before proceeding:
- `prompts/pm/review-prd.md` - **REQUIRED: Full PRD review framework**
- `context/about-me.md` - Your product beliefs and context

## Your Task

$ARGUMENTS

## Instructions

Then create prompts/pm/review-prd.md with your PRD review criteria.

Step 5: Try It Out

Navigate to your product-ai directory and run OpenCode:

cd product-ai
opencode

Test each command:

/debate Should we build a self-serve dashboard for customers to debug their own issues?

/okr [paste your OKRs here]

What’s Next

Once you’re comfortable with this setup, consider adding:

More commands: /retro for retrospectives, /feedback for drafting colleague feedback
Skills: Methodology files that OpenCode loads automatically based on context (see the skills documentation)
Daily summaries: A /today command that summarizes what you worked on
Project-specific context: Folders for major initiatives with their own context files

Start small and add complexity as you identify repeated workflows. If you find yourself doing the same thing more than twice, it’s probably worth automating.

If you build something useful on top of this, I’d love to hear about it!

7 January 2026

Learning in the Age of AI

Scott H. Young has a thoughtful piece on what’s still worth learning in a world with AI. He cuts through both the panic and the hype to look at what the data actually shows. The biggest finding is probably not all that surprising: early-career workers in AI-exposed fields are getting hit hardest.

Another report from the Stanford Digital Economy Lab notes that early-career workers in AI-exposed fields (such as programming) have seen a relative decline in employment, even as employment among workers aged 30 years and older increased. This matches my intuition that AI coding agents can do a lot of junior developer tasks pretty well, but struggle to match the experience needed to tackle more serious work.

Young’s advice is to cultivate generalist skills. Not the content-free “critical thinking” kind, but genuinely transferable knowledge:

In an environment of change, it’s better to be the hardy dandelion rather than the hothouse orchid. Similarly, I expect with AI-induced change, people who have maintained diverse interests and skills will be best positioned to take advantage of the change, whereas extreme specialists will face a greater risk of extinction.

30 December 2025

How My AI Product "Second Brain" Evolved

A couple of weeks ago I wrote about how I use AI for product work: the basic setup of context files, prompts, and the @ mention system in Windsurf. Since then I’ve moved tools, added slash commands, and automated a few things, so here’s the update.

The philosophy has shifted. I still don’t use AI to do my core thinking. I write my own PRDs and strategy docs. But I’ve come to rely on it more as a helpful assistant for the work around the work: reviewing documents before I share them, researching technical questions, summarizing my week, preparing for meetings. These days it feels more like having a colleague on call.

From Windsurf to Claude Code / OpenCode

The biggest shift was moving from Windsurf to Claude Code, Anthropic’s terminal-based AI assistant. Claude Code runs in your terminal and has direct access to your filesystem, which changes how you can structure these workflows.

The key feature that made this worthwhile is slash commands. Instead of manually @-mentioning prompt files, I can type /ask-se and Claude Code automatically loads the right context, reads the relevant files, and knows how to respond. Small thing, but removing that friction is the difference between using these tools and just meaning to.

I also started using OpenCode, an open-source alternative that works similarly. Both tools read from the same instruction files, so I maintain one set of prompts that work in either environment.

Slash Commands

The prompts I described in the original post are now wrapped in slash commands. The files live in .claude/commands/ and look like this:

.claude/commands/
├── prd.md          # Review a PRD
├── okr.md          # Review OKRs
├── debate.md       # Stress-test a product idea
├── ask-se.md       # Research technical questions
├── briefing.md     # Calendar briefing
├── today.md        # End-of-day summary
├── weekly.md       # Weekly summary
└── ...

Each command file contains instructions for the AI, plus a $ARGUMENTS placeholder for any input I provide. When I run /debate should we build a developer portal?, Claude Code reads the debate prompt, substitutes my question for the placeholder, and runs through its process.

The commands I use most often:

/prd. Reviews a PRD I’ve written and pushes back on vague problem statements, missing success metrics, or unclear scope. I run this before sharing drafts with stakeholders.
/debate. Simulates a debate between an optimist and a skeptic about a product idea. This is probably the command I reach for most when I’m still forming an opinion about something.
/ask-se. Helps me answer specific technical questions about our products. For example, today I ran /ask-se On Gateway HTTP Logpush jobs, what would trigger "unknown" as the action? The command uses MCP (Model Context Protocol) servers to search our public documentation and internal wiki, then synthesizes an answer I can actually use. It’s how I get product answers without interrupting engineers or digging through docs myself.

I don’t have to remember file paths or compose context any more. I type the command and start the conversation.

Skills: Persistent Methodology

Commands are things I invoke explicitly. Skills are methodology files that get applied automatically when they’re relevant.

I have three skills set up:

pm-thinking. Applies my product philosophy to any PM-related work. When I’m reviewing a PRD, it automatically checks for problem-first thinking, measurable outcomes, and clear non-goals.
cloudflare-context. Loads knowledge about Cloudflare’s products and triggers proactive use of internal data sources (more on this below).
data-team-context. Specific context about my team’s priorities, current initiatives, and constraints.

The skill files live in .claude/skills/ and contain triggers (when to apply) and behaviors (what to do). For example, the pm-thinking skill flags anti-patterns like “vague success metrics” or “jumping to solutions without understanding the problem.”

This means even when I’m not explicitly running a PM-focused command, the AI still knows to apply my methodology.

Daily and Weekly Summaries

One of the more useful additions is automated work journaling. At the end of the day, I run /today. The command:

Finds files I modified that day using filesystem timestamps
Reads the key files to understand what I actually worked on
Asks if there’s anything else to add
Generates a summary focused on outcomes, not tasks
Saves it to a structured folder: work/cloudflare/weeknotes/2025/01/week-01/2025-01-06.md

The output follows a consistent format:

# Tuesday, December 16, 2025

## Summary
Focused on Q1 planning and customer research. Clarified success
metrics for data quality initiative and documented common Logpush questions.

## What I Worked On
- **Q1 Planning:** Reviewed OKR drafts, identified gaps in success metrics
- **Customer Research:** Documented Logpush egress IP questions for support docs

Then on Friday (or Monday morning), I run /weekly. It reads all the daily notes for the week and synthesizes them into a summary I use to prep for 1:1s and status updates.

This works because I can never remember on Friday what I did on Tuesday. The daily notes capture the work while it’s fresh, and the weekly summary rolls it up into something useful.

Calendar Briefings

The most custom piece of this system is the calendar briefing. I got this idea from an interview with Webflow’s CPO on Claire Vo’s podcast, about building an AI chief of staff. Following her example I wrote a Python script that:

Connects to Google Calendar via OAuth
Fetches events for today, tomorrow, or the week
Generates a briefing that flags meetings that could be async, suggests prep work, and warns if the day is overloaded

I run /briefing tomorrow before wrapping up for the day. It gives me a head start on thinking about what’s coming and whether I need to prepare anything.

The briefing uses context from my about-me.md file, so it knows things like which recurring meetings are more important than others, and what kind of prep I typically need for different meeting types.

Syncing Between Tools

One wrinkle: Claude Code and OpenCode have slightly different formats for commands. Claude Code uses plain markdown; OpenCode expects frontmatter with a description field.

So I built a sync command. Running /sync-to-opencode compares the two folders and copies any changed commands, adding the frontmatter OpenCode needs. It means I can maintain one set of prompts (in .claude/commands/) and have them work in both tools.

Most people don’t need this. I switch between tools often enough that manual syncing got annoying.

What I’ve Learned Since the First Post

Automation compounds. The daily notes feed the weekly summary, the skills carry across every conversation, and once the slash commands cut enough friction I actually started using the system instead of just thinking about using it.
Custom workflows are basically one file now. With slash commands and skills, I can create tools that work exactly how I want to work. The /ask-se command knows my products, our customers, and the kinds of questions I typically need answered. The daily summary knows my file structure and my preferred format. I get to encode my workflow into a tool instead of adapting to whatever the tool prefers.
Scripts fill the gaps. The calendar briefing couldn’t be a pure prompt because it needed to fetch data from an external API. Having a scripts/ folder for Python code that extends the system has been useful for anything that requires real I/O.
Keep work output local. The prompts and context files live in a private git repo so I can sync them between machines and version them over time. But the actual work output, like weeknotes, briefings, and anything I’m working on, stays on my local machine, not in the repo. The AI helps me do the work, but the work itself doesn’t need to live in the cloud.

If you’re building something similar, I’d recommend starting with the basics from the first post: context files, opinionated prompts, and a tool that lets you compose them easily. The slash commands and automation came later, once I understood what workflows I actually used repeatedly.

The specific tooling matters less than the approach. Figure out where AI actually helps (reviewing your work, researching questions, reducing busywork) and build the smallest system that makes those things easy to do.