Building a music discovery app (and what I learned about Product)

I miss liner notes. In the age of infinite streaming and algorithmic playlists, I find myself longing for the days when you’d flip open a CD case and actually read about the music you were listening to. Who produced this? What’s the story behind the album? Why does this track feel different from everything else they’ve made?

Spotify and Apple Music are great at giving you more music. They’re less good at helping you understand why you might love something, or what to explore next. So I built my own solution—and then rebuilt it twice.

The problem I was trying to solve

My relationship with Last.fm goes back to 2007. In case you’re not familiar, Last.fm is a service that “scrobbles” (tracks) everything you listen to, building a comprehensive history of your musical life. It’s become a wonderful archive of my taste evolution over nearly two decades.

Last.fm is great at telling you what you listened to. It’s less useful for helping you understand why you might love something, or what else you should explore. Spotify and Apple Music’s algorithmic playlists are fine, but they often feel like they’re optimizing for engagement rather than genuine discovery.

I wanted a tool that would:

  • Show me context about the artists and albums in my listening history
  • Help me discover music through similarity and connection, not just popularity metrics
  • Give me that “liner notes” depth I was craving
  • Work with my existing Last.fm data (18 years of listening history is a lot to throw away)

So I started building, first by copy-pasting from GPT-4 (the olden days!), and most recently with Antigravity + Claude Opus 4.5 (we’ve come a long way since 2023). Here’s where it all stands today…

Listen To More: three iterations and counting

Listen To More is the core project—a music discovery platform that combines real-time listening data with AI-powered insights.

The first version was simple: a personal dashboard that pulled my Last.fm data and displayed it nicely. Functional, but limited. The second version added some AI summaries using OpenAI’s API. Better, but still rough around the edges.

The current version—iteration three—is a complete rebuild focused on speed and multi-user support. What started as “a thing I made for myself” is now something anyone can use. Sign in with your Last.fm account, and you get:

  • Rich album and artist pages with AI-generated summaries, complete with source citations (so you know the AI isn’t just making things up)
  • Your personal stats, showing recent listening activity plus top artists and albums over different time periods
  • Weekly insights powered by AI that analyze your 7-day listening patterns and suggest albums you might love
  • Cross-platform streaming links for every album—Spotify, Apple Music, and more
  • A Discord bot so you can share music discoveries with friends

The tech stack is Hono on Cloudflare Workers, with D1 (SQLite) for the database and KV for caching. The whole thing is server-side rendered with vanilla JavaScript for progressive enhancement. Pages load in about 300ms, then AI summaries stream in asynchronously.
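
To make the caching layer concrete, here is a minimal cache-aside sketch. It is written in Python for readability rather than as Workers code, and it assumes (the post does not say) that the cached values are expensive results such as album summaries. `TTLCache`, `get_album_summary`, and the key format are hypothetical names, not the app's actual code.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class TTLCache:
    """In-memory stand-in for a key-value cache with per-entry expiry."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._entries: Dict[str, Tuple[float, str]] = {}

    def get(self, key: str) -> Optional[str]:
        entry = self._entries.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            # Drop expired entries so the next read goes back to the source.
            del self._entries[key]
            return None
        return value

    def put(self, key: str, value: str, ttl_seconds: float) -> None:
        self._entries[key] = (self._clock() + ttl_seconds, value)


def get_album_summary(
    cache: TTLCache,
    album_id: str,
    load_from_source: Callable[[str], str],
    ttl_seconds: float = 3600.0,
) -> str:
    """Cache-aside read: try the cache, fall back to the slow source, then store."""
    key = f"album-summary:{album_id}"  # hypothetical key format
    cached = cache.get(key)
    if cached is not None:
        return cached
    value = load_from_source(album_id)
    cache.put(key, value, ttl_seconds)
    return value
```

The first request for an album pays the full cost of the database or AI call. Repeat requests inside the TTL are served from the cache, which is one plausible way to keep page loads fast while slower work happens in the background.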

I chose this stack partly because I work at Cloudflare and wanted to understand our developer platform better. More on that later.

Extending the ecosystem with MCP servers

MCP stands for Model Context Protocol. In plain terms, it’s a standard that lets AI assistants (like Claude) connect to external data sources and tools. Think of it as giving an AI the ability to actually use personalized data rather than just answer questions based on pre-training.

I built two MCP servers to extend my music discovery ecosystem:

Last.fm MCP Server

Available at lastfm-mcp.com, this server lets AI assistants access your Last.fm listening data. Once connected, you can have conversations like:

  • “When did I start listening to Led Zeppelin?”
  • “What was I obsessed with in summer 2023?”
  • “Show me how my music taste has evolved over the years”

The AI can pull your actual scrobble data, analyze trends, and give you personalized insights. It supports temporal queries (looking at specific time periods), similar artists discovery, and comprehensive listening statistics.
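
As a rough illustration of what a temporal query boils down to, here is a small Python sketch over an in-memory list of scrobbles. The `Scrobble` type, the function names, and the sample history are all made up for the example. The real server works against the Last.fm API, and nothing here is taken from its code.

```python
from collections import Counter
from datetime import datetime, timezone
from typing import Iterable, List, NamedTuple, Optional, Tuple


class Scrobble(NamedTuple):
    artist: str
    track: str
    played_at: datetime


def utc(year: int, month: int, day: int) -> datetime:
    return datetime(year, month, day, tzinfo=timezone.utc)


def first_listen(scrobbles: Iterable[Scrobble], artist: str) -> Optional[datetime]:
    """Earliest time the artist appears in the history, or None if never played."""
    times = [s.played_at for s in scrobbles if s.artist.lower() == artist.lower()]
    return min(times) if times else None


def top_artists_between(
    scrobbles: Iterable[Scrobble], start: datetime, end: datetime, limit: int = 3
) -> List[Tuple[str, int]]:
    """Most-played artists with start <= played_at < end."""
    counts = Counter(s.artist for s in scrobbles if start <= s.played_at < end)
    return counts.most_common(limit)


# Made-up listening history, purely for illustration.
HISTORY = [
    Scrobble("Led Zeppelin", "Kashmir", utc(2009, 3, 14)),
    Scrobble("Led Zeppelin", "Black Dog", utc(2008, 11, 2)),
    Scrobble("Artist A", "Track 1", utc(2023, 6, 21)),
    Scrobble("Artist A", "Track 2", utc(2023, 7, 4)),
    Scrobble("Artist B", "Track 3", utc(2023, 7, 9)),
]
```

“When did I start listening to Led Zeppelin?” maps to `first_listen(HISTORY, "Led Zeppelin")`, and “What was I obsessed with in summer 2023?” maps to `top_artists_between(HISTORY, utc(2023, 6, 1), utc(2023, 9, 1))`. The AI's job is translating the question into calls like these and then explaining the result.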

Discogs MCP Server

This one connects to Discogs—the massive music database and marketplace that’s especially popular with vinyl collectors. If you have a Discogs collection, the MCP server lets AI assistants:

  • Search your collection with intelligent mood mapping (“find something mellow for a Sunday evening”)
  • Get context-aware recommendations based on what you own
  • Provide collection analytics and insights
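
One simple way to picture mood mapping is a lookup from mood words to musical styles, then a filter over the collection. The sketch below assumes exactly that. The `MOOD_STYLES` table, the `Release` type, and the matching logic are hypothetical, and the actual server may do something quite different (for example, letting the model pick the styles itself).

```python
from typing import Dict, Iterable, List, NamedTuple, Set


class Release(NamedTuple):
    title: str
    styles: Set[str]


# Hypothetical mood-to-style table; a real mapping would be far larger.
MOOD_STYLES: Dict[str, Set[str]] = {
    "mellow": {"Ambient", "Downtempo", "Folk"},
    "energetic": {"Punk", "Techno", "Hard Rock"},
}


def moods_in(query: str) -> List[str]:
    """Return the known mood words that appear in a free-text query."""
    words = set(query.lower().replace(",", " ").split())
    return [mood for mood in MOOD_STYLES if mood in words]


def search_by_mood(collection: Iterable[Release], query: str) -> List[Release]:
    """Keep releases that share at least one style with the requested moods."""
    wanted: Set[str] = set()
    for mood in moods_in(query):
        wanted |= MOOD_STYLES[mood]
    return [release for release in collection if release.styles & wanted]
```

With this in place, “find something mellow for a Sunday evening” resolves to the `mellow` styles and returns only the records tagged with one of them.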

Both servers run on Cloudflare Workers and use OAuth for secure authentication. They’re open source if you want to poke around or deploy your own.

What I learned

I’m a Product Manager, not an engineer. But I’ve found that more technical depth widens the range of problems I can put in context, and it makes me more confident when I work with engineers. Here’s what building these projects reinforced for me:

  • Side projects are a low-stakes learning environment. When you’re building for yourself, there’s no pressure to ship by a deadline or meet someone else’s requirements. You can experiment, break things, and iterate freely. I tried approaches that would have been too risky to propose in a work context—some of them broke the site spectacularly, others worked beautifully.
  • There’s no substitute for using your own product. I use these tools every day. That constant exposure surfaces issues and opportunities that you’d never catch in a quarterly review or user interview. The feature prioritization becomes obvious when you’re feeling your own friction.
  • Building with your company’s tools is invaluable. I now have deep, practical knowledge of Cloudflare Workers, D1, KV, and the rest of our developer platform. When I’m talking to customers or evaluating feature requests, I’m drawing on real experience, not just documentation. I can empathize with the developer experience because I’ve lived it.
  • The fun matters. I keep coming back to these projects because I genuinely enjoy working on them. The satisfaction of solving a problem you personally care about is different from the satisfaction of shipping something at work. Both are valuable, but the former is what sustains a side project through the inevitable rough patches.

What’s next

I have a list of features I’d love to add—better recommendations, more sophisticated listening pattern analysis, maybe even integration with other music services. But I’m also learning to pace myself. These projects aren’t going anywhere, and part of the joy is the slow, steady improvement over time.

If you’re curious, you can check them out: Listen To More, the Last.fm MCP Server (lastfm-mcp.com), and the Discogs MCP Server.

And if you’re a PM thinking about starting a technical side project: do it. Pick something you personally care about, use tools you want to learn, and give yourself permission to build slowly. The lessons transfer in ways you won’t expect.

Where Do the Children Play?

Eli Stark-Elster has a piece that reframes the “kids and screens” debate in a way I haven’t seen before. The usual narrative blames addictive tech design, but he offers an alternative:

Why do our children spend more time in Fortnite than forests? Usually, we blame the change on tech companies. They make their platforms as addicting as possible, and the youth simply can’t resist — once a toddler locks eyes with an iPad, game over.

I want to suggest an alternative: digital space is the only place left where children can grow up without us.

The argument is that kids have always needed spaces away from adult supervision. We’ve just paved over the forests and creeks where they used to find it.

What makes this more than speculation is the research he cites: 72% of 8 to 12-year-olds say they’d rather spend time together in person, without screens. 61% wish they had more time to play with friends without adults around. The kids don’t actually want to be on screens all day. They’re looking for something we’ve taken away.

It seems like what they want is to wander together in a forest. But they can’t. So they boot up Fortnite or TikTok instead.

I’m still sitting with this one. It doesn’t let tech companies off the hook, but it does suggest that “just take away the iPad” isn’t addressing the real problem.

Building MCP servers in the real world

This has been my experience with MCP servers as well. As useful as I think my Last.fm MCP server is, I can’t see it ever having more than a dozen users. But internal company servers are massively useful:

MCP is being used especially heavily by internal data and platform teams to give internal users access to systems. These are systems that these users perhaps already had access to, but it was either too complex or too broad, or needed a lot of documentation or special skills to use.

Wiki search is so much better now that I can use our internal MCP server for it via Windsurf.

Source: Building MCP servers in the real world

Measuring AI's Impact on Shipping Speed and Code Quality

Will Larson has a good post about how they’re adopting AI at his company. The process is interesting, but this is the part that jumped out at me:

My biggest fear for AI adoption is that they can focus on creating the impression of adopting AI, rather than focusing on creating additional productivity. Optics are a core part of any work, but almost all interesting work occurs where optics and reality intersect.

It’s really hard to figure out if AI tools are (1) helping teams ship faster (2) without sacrificing quality.

We’re working on figuring out this problem right now at Cloudflare. Our proposed approach sidesteps the problem of per-commit AI attribution (did Copilot write this line? did Claude?) by correlating team-level AI tool usage with team-level health metrics over time. If a team’s AI adoption increases by 30% and their change failure rate stays stable, that’s a useful signal. If AI usage spikes and incidents start trending up, that’s worth investigating.

The key insight is that you don’t need perfect attribution to get directionally useful data. Correlation isn’t causation, and teams adopting AI tools may already be more experimental or higher-performing. But at least you’re measuring something real instead of something like “# of lines written by AI”, which leads straight to the Goodhart’s Law problem where metrics become targets.
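
To show the shape of the idea, here is a small Python sketch that correlates a team's AI usage with its change failure rate over time and flags teams where the two rise together. This is my own illustration, not the approach as actually implemented. The team names, weekly numbers, the 0.7 threshold, and the `teams_to_investigate` helper are all invented.

```python
from math import sqrt
from typing import Dict, List, Sequence, Tuple


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of the same length, at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


# Made-up weekly numbers per team: (share of engineers using AI tools,
# change failure rate). Names and values are purely illustrative.
TEAM_WEEKS: Dict[str, List[Tuple[float, float]]] = {
    "alpha": [(0.10, 0.05), (0.20, 0.04), (0.30, 0.05), (0.40, 0.04)],
    "beta": [(0.10, 0.04), (0.20, 0.05), (0.30, 0.07), (0.40, 0.09)],
}


def teams_to_investigate(
    data: Dict[str, List[Tuple[float, float]]], threshold: float = 0.7
) -> List[str]:
    """Teams whose failure rate rises in step with AI usage (r above threshold)."""
    flagged = []
    for team, weeks in data.items():
        usage = [u for u, _ in weeks]
        failures = [f for _, f in weeks]
        if pearson(usage, failures) > threshold:
            flagged.append(team)
    return flagged
```

In this toy data, team alpha's usage quadruples while its failure rate stays flat, so it is not flagged. Team beta's failure rate climbs alongside usage, so it is. Four data points is far too few to conclude anything in practice, and a flag is a prompt to go look, not a verdict.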

How I Use AI for Product Work

Update! I wrote a follow-up post here: How My AI Product “Second Brain” Evolved.

I’ve been refining my approach to using LLMs for product work, and I figured it’s time to write up how I actually use them day-to-day.

The most valuable thing an AI assistant can do for product work is push back on weak reasoning, spot gaps you missed, and force you to articulate why your idea is good. A peer reviewer who shares your product philosophy, basically. With the right prompts, AI assistants are also good at producing background and framing documents: explainers that synthesize complex topics, summaries of technical concepts, and so on.

Here’s how I’ve set it up, what makes it work, and how you might build something similar.

The Philosophy: Sparring Partner, Not Ghostwriter

I believe LLMs are most useful when you give them two things: context and constraints.

  • Context tells the model who you are, what you’re working on, and what “good” looks like in your world.
  • Constraints keep the model from going off the rails with generic advice or hallucinated frameworks.

Every prompt I use is designed to provide both. They’re opinionated on purpose. I’d rather have an assistant that pushes back on bad ideas than one that says “Great idea!” to everything.

I don’t want AI writing presentations for me. I want a thinking partner that:

  • Challenges weak problem statements before I waste time on solutions
  • Spots missing success criteria I forgot to define
  • Asks “why?” when my reasoning gets hand-wavy
  • Points out when I’m jumping to solutions before understanding the problem
  • Helps me create background docs and explainers that set context for others

The Building Blocks

Here’s the folder structure I maintain, with a series of Markdown files organized by purpose:

llm-prompts/
├── prompts/           # System prompts for different use cases
│   ├── pm/            # Product management prompts
│   └── technical/     # Technical/engineering prompts
├── context/           # Personal context files (who I am, how I work)
├── reference/         # Syntax guides and reference docs
└── work/              # Saved feedback and refined docs

No single prompt does the work. What matters is how you combine them.

Layer 1: System Prompts

These are the instructions that tell the AI how to behave for a specific task. I have different prompts for different jobs:

  • General PM sparring: A prompt that knows my product philosophy and pushes back on weak reasoning. I use this for thinking through tradeoffs, preparing for meetings, and sanity-checking my approach.
  • Document review: Prompts specifically designed to critique PRDs, OKRs, strategy docs, and other artifacts. These encode what “good” looks like and call out common anti-patterns.
  • Idea stress-testing: A prompt that I stole from my friend Stephen, which simulates a debate between an optimist and a skeptic to pressure-test new ideas before I get too attached to them.
  • Technical understanding: Prompts that help me understand systems, architectural decisions, and technical concepts well enough to lead effectively (I’m not an engineer, but I need to hold my own in architecture reviews).

Each prompt is opinionated. Each one encodes a specific philosophy about what good work looks like, rather than defaulting to generic “be helpful” guidance.

Layer 2: Personal Context

I maintain files that describe:

  • Who I am: My role, my experience, my communication style
  • How I work: My product philosophy, my management approach, my values
  • What I’m working on: Current projects, team context, company priorities

When I start a conversation, I can pull in the relevant context files alongside my prompt. The model then has the background it needs to give me advice that fits my situation, instead of generic best practices from a blog post.

Layer 3: Reference Materials

Sometimes you need the model to follow specific formats or conventions. I keep reference files for things like wiki markup syntax, documentation templates, or internal style guides. These keep the output usable without a round of reformatting.

How I Actually Use This

I use Windsurf as my daily driver, and it has a feature that makes this whole system work: the @ mention. In the chat panel, I can reference any file by typing @ followed by the path. Windsurf then includes that file’s contents as context for the conversation.

This means I can compose my “assistant” on the fly by combining:

  1. A system prompt for the task at hand
  2. Relevant personal context files
  3. The document or code I’m working on
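
Windsurf handles this composition for me through @ mentions, but the same idea is easy to reproduce anywhere. Here is a minimal Python sketch that stitches the three pieces into one prompt. The `compose_prompt` function, the section headings, and the file names in the usage note are hypothetical. Only the folder layout comes from my setup.

```python
from pathlib import Path
from typing import Iterable


def compose_prompt(
    root: Path,
    system_prompt: str,
    context_files: Iterable[str],
    document: str,
) -> str:
    """Join a system prompt, context files, and the working document into one prompt.

    Paths are relative to `root`, the llm-prompts/ folder.
    """
    sections = [("System prompt", (root / system_prompt).read_text(encoding="utf-8"))]
    for relative_path in context_files:
        text = (root / relative_path).read_text(encoding="utf-8")
        sections.append((f"Context: {relative_path}", text))
    sections.append(("Document under review", document))
    # Label each part so the model can tell instructions, context, and input apart.
    return "\n\n".join(f"## {title}\n{body.strip()}" for title, body in sections)
```

A call might look like `compose_prompt(Path("llm-prompts"), "prompts/pm/prd-review.md", ["context/product-philosophy.md"], prd_text)`, where both file names are placeholders. Keeping the context list short follows the same “less is more” rule I apply by hand.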

Example: Document Review

When I need feedback on a PRD before sharing it with stakeholders, I’ll start a conversation and reference my PRD review prompt plus my product philosophy context. Then I paste in the PRD and ask for critique.

The model comes back with feedback grounded in my own standards, not generic advice. It’ll call out if my problem statement is vague, if my success metrics aren’t measurable, or if I’m jumping to solutions before properly framing the problem. The kind of pushback I’d want from a peer reviewer.

Example: Brainstorming Partner

For early-stage thinking, I use a more conversational prompt that knows how I like to explore ideas. I’ll describe what I’m thinking about and ask it to poke holes, suggest angles I haven’t considered, or help me articulate why something feels off.

This is particularly useful before big meetings. I can rehearse my reasoning and get challenged on the weak spots before I’m in front of stakeholders.

Example: Technical Understanding

I’m not an engineer, but I work with technical teams. When I need to understand how a system works well enough to ask good questions or spot when something doesn’t add up, I use prompts designed for technical explanation.

The key is that these prompts know to explain things without condescension but also without assuming I know the jargon. They cite specific files and line numbers when relevant, and they explain the “why” behind design decisions.

Connecting to Real Data

MCP (Model Context Protocol) servers connect the AI to external data sources: internal wikis, documentation sites, code repositories, APIs. That lets it ground responses in actual information rather than just its training data.

In my prompts, I tell the model which MCP servers are available and when to use them. For example, my technical prompts instruct the model to:

  • Search official documentation first to ground answers in verified information
  • Check internal wikis for known issues, edge cases, and workarounds
  • Look at code repositories when documentation is incomplete
  • Always cite sources with links so I can verify

This turns the AI from a general-purpose assistant into something closer to an expert with access to your company’s actual knowledge base. Instead of generic advice, I get responses that reference real docs and real code.

Keeping a Record

Save the conversation output somewhere useful.

I have a work/ folder organized by topic where I save feedback and refined thinking. When the model gives me good critique on a PRD, I’ll ask it to write a summary of the key issues to a Markdown file I can reference later. This keeps the insights from getting lost in chat history.

What I’ve Learned

A few things I’ve learned along the way:

  1. Context files are worth the investment. I have files that describe who I am, how I work, and what I value. Updating these takes time, but it pays off in every conversation.
  2. Pushback is the point. These prompts are designed to challenge bad thinking. If the model is pushing back on your approach, take it seriously: it might be right.
  3. Iterate on the prompts. I update these regularly based on what works and what doesn’t. If a prompt isn’t helping, change it.
  4. Less context is often more. Including too much context can dilute the signal. Start with the minimum you need, and add more only if the model seems confused.

Wrapping up

This setup doesn’t make the thinking go away. It just encodes my preferences and philosophies into something an LLM can use as a baseline for pushing back. I still write my own PRDs, OKRs, and strategy docs, because those represent my actual thinking. But I let AI help me create background documents, explainers, and context-setting materials. And I have a sparring partner that catches the gaps I miss, challenges the assumptions I glossed over, and asks the uncomfortable questions before stakeholders do.

If you build something similar, I’d love to hear how it goes.

"Disagree and Let’s See"

I like this alternative to the “Disagree and Commit” saying:

“Disagree and let’s see” allows you to stay aligned with the team without forcing you to pretend you had conviction you didn’t have. It lets you walk into a room with your team and be honest:

“Here’s the path that was chosen. It wasn’t my first pick, but here’s the experiment we’re running, and here’s what we’re trying to learn.”

That’s a much more authentic stance for most leaders than repeating something with a tight smile and hoping no one notices your doubt.

Source: “Disagree and Let’s See”

New side project: Discord Stock & Crypto Bot

Not sure how many people would be interested in this, but it was fun to make so I thought I’d share. This is a Discord bot that provides real-time stock and cryptocurrency information, 30-day price trends, and AI-powered news summaries through slash commands. When you add the bot to Discord you can use the /stock and /crypto commands to get information like this:

Want to add it to your Discord server? Head over here!

Horrible edge cases to consider when dealing with music

Metadata is the hardest problem in software, and these examples prove my point. Don’t @ me!

My favourite: a band named brouillard, with a single member called brouillard, whose every single album is named brouillard, and of course, so is every single track.

Source: Horrible edge cases to consider when dealing with music

Brief thoughts on the recent Cloudflare outage

Lorin Hochstein is a big name in the LFI (Learning From Incidents) space. He often writes about post-incident reviews, and he has a very interesting write-up of the November 18, 2025 Cloudflare outage. I especially loved this part:

Companies generally err on the side of saying less rather than more. After all, if you provide more detail, you open yourself up to criticism that the failure was due to poor engineering. The fewer details you provide, the fewer things people can call you out on. It’s not hard to find people online criticizing Cloudflare online using the details they provided as the basis for their criticism.

I think it would advance our industry if people held the opposite view: the more details that are provided an incident writeup, the higher esteem we should hold that organization. I respect Cloudflare is an engineering organization a lot more precisely because they are willing to provide these sorts of details. I don’t want to hear what Cloudflare should have done from people who weren’t there, I want to hear us hold other companies up to Cloudflare’s standard for describing the details of a failure mode and the inherently confusing nature of incident response.

Source: Brief thoughts on the recent Cloudflare outage

The price of admission

Some tough love here about what it means to have “executive presence”.

When someone tells you that you need more business sense, or that you’re not ready for more scope, or that you need to level up, this is typically what they’re trying to communicate. That you’re more concerned with how work happens than with what work should happen in the first place.

Source: The price of admission