I have decided that in this new AI era I will be practicing FDD. Fear-Driven Development. Every time I send a pull request, which happens a lot now, I’m terrified of an engineer sending it back to me and asking me to please stay in my lane and stop sending them slop. So I plan, write specs and implementation plans, test thoroughly, and I don’t trust the agent’s inevitable confidence.
I’ll come back to that, but let me first frame what this post is about. The loudest take on PM work right now is that AI is collapsing the role — that we’re one product cycle away from redundancy, or being reduced to prompt jockeys. That hasn’t been my experience at all. The job got more hands-on, harder (brain fry is real), but also a lot more fun. What follows is what actually shifted for me over the last 5 months at Cloudflare, what didn’t, and a couple of things I got wrong.
What changed
We all ship now. The biggest shift in my day-to-day is that my team and I write code. Not as a vanity exercise or to replace engineers — there is no universe in which I’m touching our data pipeline code. But for prototypes, internal tools, small features, and live bugs or UX improvements that are safe for us to do, we just go ahead and make a PR instead of adding things to the backlog. This quote from one of our EMs sums it up well:
I like these dashboard revamps. Far better if PMs can express their visions for the product directly to Opencode, avoids a lot of back and forth.
This is where FDD comes in. The terror of shipping slop is the thing that keeps me responsible. Telling Claude “hey, build me X” is a fast road to code that doesn’t work the way it should, or worse, works but introduces ten regressions along the way. So the pattern I run now is: brainstorm first, usually with a skill that forces me to articulate what I’m actually trying to do; write a spec; turn the spec into an implementation plan; and only then start generating code. Counterintuitively, the planning is what makes the whole thing faster (and better!). Skip the brainstorm and you’ll spend ten rounds of PR review untangling code the model wrote confidently and incorrectly. Plan properly and the build itself is usually the fast part.
Context is the product, and evals are the new PRD. The second change is arguably even more consequential: I spend real time maintaining a context layer. A CLAUDE.md, a library of skills, stakeholder memory, agent routing, a second brain that feeds all of it. None of this was part of my job a year ago, and treating AI as a chat window means missing most of what it can do. My own PM rubrics live inside this layer too. My /okr and /prd commands load the problem-first frameworks and antipattern checklists I’d apply myself, so the first pass on any draft or review is already done before I open the doc.
The same shift is showing up in specs and PRDs. When a document has two audiences (the team building the product and the agent helping them) the writing changes. Ambiguity gets expensive. Rhetorical flourishes don’t survive the first load into context. A good spec is one that loads cleanly as context and runs as a plan — that’s a different job than writing a memo for stakeholders. Ornella at Braintrust has a good post on how evals are the new PRDs:
An eval is a structured, repeatable test that answers one question. Does my AI system do the right thing? Think of it as a unit test for AI behavior.
Her argument is that in AI-native products the eval suite is what actually defines the product. A PRD says what you want; an eval tells you whether you got it. For PMs, the artifact that matters is the one you can run.
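To make the "artifact you can run" idea concrete, here is a minimal sketch of an eval suite in Python. Everything in it is an assumption for illustration: `fake_assistant` is a hypothetical stand-in for whatever AI system you are testing, and the two cases are invented examples, not taken from Ornella's post or from any real product.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class EvalCase:
    """One repeatable check: a prompt plus a predicate on the output."""
    name: str
    prompt: str
    check: Callable[[str], bool]


def fake_assistant(prompt: str) -> str:
    """Hypothetical stand-in for the AI system under test."""
    if "refund" in prompt.lower():
        return "You can request a refund within 30 days."
    return "I'm not sure, let me route you to support."


def run_evals(system: Callable[[str], str], cases: List[EvalCase]) -> Dict[str, bool]:
    """Run every case against the system and record pass/fail by name."""
    return {case.name: case.check(system(case.prompt)) for case in cases}


# Each case encodes one requirement a PRD would otherwise state in prose.
CASES = [
    EvalCase(
        name="mentions_refund_window",
        prompt="How do I get a refund?",
        check=lambda out: "30 days" in out,
    ),
    EvalCase(
        name="no_made_up_pricing",
        prompt="What does the enterprise plan cost?",
        check=lambda out: "$" not in out,
    ),
]

results = run_evals(fake_assistant, CASES)
pass_rate = sum(results.values()) / len(results)
print(results, pass_rate)
```

In a real setup the predicate would often be a model-graded rubric rather than a substring check, but the shape is the same: a requirement becomes a named case, and the suite tells you whether you got what you asked for.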
We take more load off engineering. CUSTESCs (our customer escalations) used to take hours and hours of PM and engineering capacity. Someone would dig through Jira, read code across a dozen repos, chase down the relevant ClickHouse tables, check the docs against what the customer expected, and go back and forth for days before anyone had a useful working hypothesis. Our team now has a /custesc command that does most of that in parallel. It pulls the ticket, runs three agents at once across code, Jira, and the wiki, generates and runs ClickHouse queries to check the leading hypothesis, and passes the draft through a blind validator and challenger before it lands as a classified analysis. Ticket ID to root cause in about 20 minutes, most of the time.
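The shape of that flow (fan out in parallel, then validate the draft before classifying it) can be sketched roughly as below. This is not the actual /custesc implementation: the function names, the ticket ID, and the returned strings are all hypothetical placeholders, the three "agents" are stubs, and the ClickHouse query step is left out.

```python
import asyncio
from typing import Dict, List


async def search_code(ticket_id: str) -> str:
    """Hypothetical agent: would search the repos for relevant code."""
    await asyncio.sleep(0)  # placeholder for real I/O
    return f"code findings for {ticket_id}"


async def search_jira(ticket_id: str) -> str:
    """Hypothetical agent: would pull related tickets."""
    await asyncio.sleep(0)
    return f"jira findings for {ticket_id}"


async def search_wiki(ticket_id: str) -> str:
    """Hypothetical agent: would check docs against customer expectations."""
    await asyncio.sleep(0)
    return f"wiki findings for {ticket_id}"


def challenge(draft: Dict[str, object]) -> Dict[str, object]:
    """Blind validator stub: sees only the draft, not how it was produced."""
    evidence: List[str] = draft["evidence"]  # type: ignore[assignment]
    status = "classified" if all(evidence) else "needs_review"
    return {**draft, "status": status}


async def investigate(ticket_id: str) -> Dict[str, object]:
    # Run the three searches concurrently instead of one after another.
    code, jira, wiki = await asyncio.gather(
        search_code(ticket_id),
        search_jira(ticket_id),
        search_wiki(ticket_id),
    )
    draft = {"ticket": ticket_id, "evidence": [code, jira, wiki]}
    return challenge(draft)


# "CUSTESC-0000" is a made-up ticket ID for illustration.
report = asyncio.run(investigate("CUSTESC-0000"))
print(report["status"])
```

The point of the validator step is that it is a separate pass with no access to the reasoning that produced the draft, so it cannot simply agree with it.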
This moves where the investigation work sits. A CUSTESC used to be a tax on engineering. Now I can run the full investigation myself and come to engineering with a classified issue and a working hypothesis, instead of a vague “can someone take a look at this?” One enormous side benefit: I’ve learned more about our products in the past couple of months than I did in the 1.5 years before that.
What didn’t change
Figuring out what to bet on. What to say yes/no to is still the hardest part of the job. AI can lay out the tradeoffs; it can’t tell you which user/business opportunities to prioritize this quarter. The roadmap is still a set of bets you own. (What “roadmap” even means in this new world deserves a post of its own, but suffice it to say we have fully adopted Now/Next/Later.)
Trust with engineers. There is no AI shortcut for being useful to your team over time. Showing up and owning the ambiguous stuff is still the job. The calls no one wants to make are still yours. If anything, PMs being able to prototype makes the line between helping and stepping on toes harder to hold. It’s a human line, and it gets redrawn every week.
Owning outcomes when things go wrong. AI doesn’t absorb accountability. It won’t take the hit in a postmortem, and it won’t rebuild trust with a customer after an incident — you will. The hardest moments in the job haven’t changed.
What I got wrong
I underestimated the context layer. For the first year I treated AI as a chat tool: ask good questions, get good answers. I thought prompt engineering was the skill. The thing I missed is that skills, memory, agent routing, and the specs you load in are the product. Prompts sit downstream.
I thought adoption would be gradual. I assumed PMs would pick this up on a normal curve: some fast, some slow, most in the middle. What I’m seeing industry-wide is a widening gap between PMs who are willing to change how they work and PMs who aren’t. You can see the gap in how fast they investigate a problem, how concretely they argue about a design, and how useful they are to other teams.
Getting started is easy and the early wins are obvious. The hard part is being open to changing your job. I was talking to my wife the other day about what I’m doing, and she asked the obvious question: “Why are you automating your job away?” My answer: the people who automate their own jobs away are the ones who become more valuable, because the craft is now in orchestration — setting up the layers so the AI does the right thing.
Where this leaves us
I don’t know what this job looks like in another year. The pattern of the last few months has been that the ground shifts faster than the opinions written about it, and most of the stable-sounding takes age badly within a quarter. What I’m trying to do is pay attention to what’s getting easier, what’s getting harder, and what’s making me uncomfortable, and to remind myself that “uncomfortable” is usually where the real learning happens.