Release: tldl v2.2.0 — RSS-first monitoring and audio-URL dedup

Project: tldl
Summary: Your favorite podcasts, summarized.
Link: tldl-pod.com

Two meaningful changes since v2.1.0. tldl now detects new podcast episodes directly from RSS feeds with conditional GETs instead of relying on Podcast Index re-crawls, so episodes typically land in the queue within minutes of publication. The second change is a dedup fix: it catches episodes that are retitled or have their GUIDs regenerated after publication, a surprisingly common pattern in the wild.

What’s new

  • RSS-first monitoring. The monitor now fetches RSS feeds directly with If-Modified-Since / If-None-Match headers, queues full episode metadata without a Podcast Index (PI) round-trip, and falls back to PI only on RSS errors. Detection lag drops from “hours” (PI re-crawl cadence) to “minutes” for feeds that update frequently.
  • POST /admin/rebuild-index. Backfill endpoint now populates audioUrl on every existing index entry so the new dedup check works retroactively.
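
For readers unfamiliar with conditional GETs, here is a minimal sketch of the idea behind RSS-first monitoring. It is not tldl's actual code: the FeedValidators shape, the function names, and the three outcome labels are assumptions made for illustration. The only parts taken from the notes above are the two request headers and the rule that RSS errors fall back to Podcast Index.

```typescript
// Hypothetical per-feed cache of the validators returned by the previous fetch.
interface FeedValidators {
  etag?: string;         // value of the ETag response header, if the server sent one
  lastModified?: string; // value of the Last-Modified response header, if sent
}

// Hypothetical labels for what the monitor does next.
type FetchOutcome = "unchanged" | "changed" | "fallback";

// Build the conditional request headers from whatever validators we have cached.
function conditionalHeaders(v: FeedValidators): Record<string, string> {
  const headers: Record<string, string> = {};
  if (v.etag) headers["If-None-Match"] = v.etag;
  if (v.lastModified) headers["If-Modified-Since"] = v.lastModified;
  return headers;
}

// Map the HTTP status of the feed response to the monitor's next step.
function classifyStatus(status: number): FetchOutcome {
  if (status === 304) return "unchanged"; // Not Modified: nothing to parse
  if (status >= 200 && status < 300) return "changed"; // fresh body: parse and queue
  return "fallback"; // any RSS error: defer to Podcast Index
}
```

In use, the headers would be passed to the feed request, and a 304 response costs only a header exchange with no body to download or parse, which is why frequent scans stay cheap.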

Fixes

  • Silent duplicate episodes. Episodes that publishers edited after publishing (new title, regenerated GUID, or both) used to slip past dedup and get transcribed twice. A new audio-URL dedup signal (origin + pathname, query-stripped, lowercased) catches them. Confirmed against a real-world retitle in which Lenny’s Podcast re-published an episode with a different title and GUID; on the first force-check after deploy, 100 historical near-duplicates were silently deduped.
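
The normalization described above (origin + pathname, query-stripped, lowercased) can be sketched roughly as follows. The function names audioDedupKey and isDuplicateAudio are hypothetical, as is the choice to return null for unparseable URLs; tldl's real implementation may differ.

```typescript
// Reduce an enclosure URL to a dedup key: origin + pathname, lowercased.
// The query string and fragment are dropped, so tracking parameters
// do not make the same audio file look like a new episode.
function audioDedupKey(rawUrl: string): string | null {
  try {
    const u = new URL(rawUrl);
    return (u.origin + u.pathname).toLowerCase();
  } catch {
    // Assumption: an unparseable URL yields no key, so it cannot match anything.
    return null;
  }
}

// Check a candidate episode's audio URL against keys already in the index.
function isDuplicateAudio(rawUrl: string, seenKeys: Set<string>): boolean {
  const key = audioDedupKey(rawUrl);
  return key !== null && seenKeys.has(key);
}
```

Because the key ignores title and GUID entirely, a re-published episode that still points at the same audio file matches its original entry even when both of those fields changed.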

Under the hood

  • Queue messages carry full episode metadata when the source is RSS, so the consumer branches on rssSourced and skips Podcast Index + iTunes enrichment entirely on that path.
  • New audioUrl field on EpisodeIndexEntry.
  • Monitor cron cadence tuned to every 2 hours — RSS conditional GETs keep the feed-scan cost low, and most monitored feeds don’t publish often enough to justify the previous 30-minute cadence.
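
As an illustration of the consumer branching on rssSourced, here is a sketch using a discriminated union. Only the rssSourced flag and the audioUrl field name come from the notes above; the message shapes, the other field names, the step labels, and planSteps are hypothetical.

```typescript
// Hypothetical episode metadata carried in RSS-sourced queue messages.
interface EpisodeMetadata {
  title: string;
  guid: string;
  audioUrl: string;
}

// Hypothetical queue message shapes, discriminated on rssSourced.
type QueueMessage =
  | { rssSourced: true; feedUrl: string; episode: EpisodeMetadata }
  | { rssSourced: false; feedUrl: string; episodeGuid: string };

// Hypothetical labels for the consumer's processing steps.
type Step = "podcast-index-lookup" | "itunes-enrichment" | "transcribe";

// Decide which steps the consumer runs for a given message.
function planSteps(msg: QueueMessage): Step[] {
  if (msg.rssSourced) {
    // Full metadata is already in the message: skip both enrichment calls.
    return ["transcribe"];
  }
  // Non-RSS path: metadata still has to be looked up before transcription.
  return ["podcast-index-lookup", "itunes-enrichment", "transcribe"];
}
```

Modeling the flag as a discriminant lets the type checker guarantee that msg.episode is only read on the RSS path, where the metadata is actually present.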