Content System — Built in Public
Every post runs through an AI pipeline I designed and assembled myself. Gemini writes and formats. My cloned voice narrates. I edit without burning tokens. A full post with images costs a few cents to produce.
Every time I asked Claude to fix one sentence in a post, it had to re-read the entire thing first. A period changed to a dash. One sentence rewritten as two. Small edits. Big token bills. Because the model doesn't know what changed unless it reads everything again from the top.
I was also spending real money generating images per post, refining drafts back and forth, and watching my Anthropic credits drain on work that didn't need to be that expensive. The posts weren't the problem. The process was.
So I rebuilt the process.
Posts don't start from scratch on the day I write them. When I have a thought, a detail, or a direction, I add it to a draft in Convex. The draft sits there and grows until I'm ready. No pressure to finish something I started. No lost notes.
When a draft is ready, I tell Claude, and Claude passes it through the Gemini engine. Gemini writes the full post in my voice, formatted the way my blog actually works: one sentence per line, callouts, pull quotes, section headers, the works. It also generates all the images for the post in the same pass. This runs on my Gemini plan, not Anthropic. The cost difference is significant.
After Gemini produces the draft, I can open it in my own browser and edit directly on the page. Bold something. Fix a word. Add a link. Delete a sentence. Save. No AI involved, no tokens spent.
Once the text is final, the post gets narrated using an ElevenLabs voice clone built from hours of my own recordings. A preflight check runs first to catch anything that might sound wrong: awkwardly formatted numbers, abbreviations, words the model mispronounces. The audio is generated, level-matched, and attached. A read-along bar highlights each sentence as it plays.
Posts live in Convex. Clicking Publish triggers a static rebuild so the page is served fast. The whole thing is version-controlled and auto-deployed through Vercel on every push.
The Gemini draft engine runs on my Google plan. Image generation runs on the same plan. For a full post with formatted copy and four images, the cost is a few cents. Running that same process through Anthropic was burning through my five-hour weekly token allowance fast.
The reason back-and-forth editing with Claude gets expensive is that the model has to read the entire post to understand context before it can change one word. A 2,000-word post read and rewritten fifteen times to get the phrasing right adds up fast. The in-page editor exists so that kind of iteration happens at zero cost.
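Here is a rough sketch of that arithmetic. The figure of about 1.3 tokens per word is an assumed rule of thumb, not something I measured, and the function name is just for illustration. It also assumes every round re-reads the full post and writes the full post back out.

```python
# Rough token arithmetic for iterating on a post with a model.
# TOKENS_PER_WORD is an assumed rule of thumb, not a measured value.
TOKENS_PER_WORD = 1.3


def round_trip_tokens(words: int, rounds: int) -> dict:
    """Estimate tokens when each round re-reads and rewrites the full post."""
    post_tokens = round(words * TOKENS_PER_WORD)
    return {
        "input": post_tokens * rounds,   # the post is re-read every round
        "output": post_tokens * rounds,  # and rewritten in full every round
        "total": 2 * post_tokens * rounds,
    }


estimate = round_trip_tokens(words=2000, rounds=15)
print(estimate)
```

Under those assumptions, fifteen rounds on a 2,000-word post come to roughly 78,000 tokens, and that ignores the conversation history that also gets re-sent in a long chat. The same fifteen fixes in the in-page editor use none.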
Claude still handles the work that requires judgment: voice review, structural decisions, narration preflight. But formatting, images, and minor text fixes don't need that. Routing formatting and images to Gemini, and minor fixes to the in-page editor, keeps the Claude budget where it belongs.
Every generated image in the blog features the same recurring character: a woman in her 40s, light curly hair, warm expression, recognizable from post to post. Gemini generates all blog images, and the character stays consistent because every prompt opens with the same detailed description.
I built a character brand kit: a precise description of this woman that goes into every image prompt before anything else is specified. Short enough to fit anywhere. Specific enough that the character is recognizably the same person across every post, every scene, every lighting setup.
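The anchoring idea can be sketched in a few lines. This is a minimal illustration, not the actual kit: the description text, the `Scene:` label, and the function name are all placeholders I made up for the example.

```python
# Hypothetical brand kit text; the real description is not reproduced here.
CHARACTER_KIT = "A woman in her 40s with light curly hair and a warm expression."


def build_image_prompt(scene: str, kit: str = CHARACTER_KIT) -> str:
    """Put the character description first, then the scene-specific details."""
    scene = scene.strip()
    if not scene:
        raise ValueError("scene description is required")
    # The kit always leads, so every prompt starts from the same person.
    return f"{kit}\nScene: {scene}"


prompt = build_image_prompt("Sitting at a kitchen table with a laptop, morning light.")
print(prompt)
```

Keeping the kit in one constant means a change to the character description happens in exactly one place, and no prompt can be sent without it.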
The reason this mattered from the start: an early post about my mom had a different woman, and a different daughter, in every image. I took a course on AI image and video creation to learn how character consistency actually works, then taught those mechanics to Claude so Claude can now brief Gemini the same way every time.
The same character anchored across different posts and scenes.
Posts originally launched with a generic AI voice. Most have since been re-narrated using a professional voice clone built from hours of my own recordings. When you press Play, you're hearing a voice clone of me. The transition is still in progress, so a few older posts are still on the original AI voice.
A self-improving audio rules engine handles the things voice clones get wrong: how to pronounce words like "resume," when to slow down on short lines, how to keep contractions from getting swallowed. Every fix that gets discovered gets baked into the rules file permanently so it never needs to be caught again. The narration gets better with every post.
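One way a rules file like this could work, as a sketch only: the JSON layout, the respelling approach, and the function names below are assumptions for illustration, not the real engine. Each rule maps a phrase to a respelling the voice reads correctly, and a newly discovered fix is appended once and applied on every later run.

```python
import json
import re
from pathlib import Path


def load_rules(path: Path) -> list[dict]:
    """Read the saved rules, or start empty if the file does not exist yet."""
    return json.loads(path.read_text()) if path.exists() else []


def add_rule(path: Path, phrase: str, respelling: str) -> None:
    """Persist a newly discovered fix, skipping phrases already covered."""
    rules = load_rules(path)
    if not any(rule["phrase"].lower() == phrase.lower() for rule in rules):
        rules.append({"phrase": phrase, "respelling": respelling})
        path.write_text(json.dumps(rules, indent=2))


def apply_rules(text: str, rules: list[dict]) -> str:
    """Swap each known trouble phrase for its respelling before narration."""
    for rule in rules:
        pattern = r"\b" + re.escape(rule["phrase"]) + r"\b"
        # A function replacement keeps the respelling literal (no escape handling).
        text = re.sub(pattern, lambda m: rule["respelling"], text, flags=re.IGNORECASE)
    return text
```

Rules match whole phrases rather than single words because a word like "resume" has two readings, so a phrase such as "my resume" gives the rule enough context to pick one.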
A read-along bar highlights the current sentence as the audio plays so you can follow along.
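The lookup behind a bar like that is simple. A minimal sketch, assuming the timeline is a sorted list of sentence start times in seconds (the sample values and names here are made up): a binary search finds the last sentence that has already started. On the page itself this would run in the browser, but the logic is the same in any language.

```python
from bisect import bisect_right

# Hypothetical timeline: start time in seconds for each sentence, in order.
sentence_starts = [0.0, 2.4, 5.1, 9.8]


def current_sentence(starts: list[float], playback_time: float) -> int:
    """Return the index of the sentence being spoken at playback_time."""
    # bisect_right counts how many sentences have started at or before this time.
    index = bisect_right(starts, playback_time) - 1
    # Clamp so a time before the first start still highlights sentence 0.
    return max(index, 0)


print(current_sentence(sentence_starts, 6.0))
```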
All posts, drafts, and metadata live in Convex. The HTTP API serves posts to the static site without a traditional CMS. Drafts grow in Convex until they're ready to publish.
Bold, italic, link, delete, replace: all from the browser with no AI involved. Minor changes that used to cost tokens now cost nothing and take ten seconds.
A professional voice clone built from hours of recordings. Audio generation, level-matching, and the read-along timeline all run from the same pipeline.
Every push to the main branch triggers an automatic deploy. Posts are served as static HTML rebuilt from Convex, so page load is fast and there's nothing to manage.
Before any narration is generated, a preflight script scans the post for pronunciation traps: number badges, abbreviations, homographs, lines that are too short. Issues are flagged before the first generation.
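A preflight scan along these lines might look like the sketch below. The homograph list, the three-word minimum, and the patterns are assumptions chosen for the example; the real script's rules are not shown here.

```python
import re

# Assumed checks and thresholds, for illustration only.
HOMOGRAPHS = {"resume", "read", "lead", "live", "tear", "wind"}
MIN_WORDS = 3

NUMBER = re.compile(r"\d+(?:[,.:/]\d+)*%?")
ABBREVIATION = re.compile(r"\b[A-Z]{2,}\b|\b(?:[A-Za-z]\.){2,}")


def preflight(post: str) -> list[tuple[int, str, str]]:
    """Return (line_number, issue, matched_text) for each potential trap."""
    issues = []
    for number, line in enumerate(post.splitlines(), start=1):
        text = line.strip()
        if not text:
            continue  # blank lines have nothing to narrate
        for match in NUMBER.finditer(text):
            issues.append((number, "number", match.group()))
        for match in ABBREVIATION.finditer(text):
            issues.append((number, "abbreviation", match.group()))
        for word in re.findall(r"[A-Za-z']+", text):
            if word.lower() in HOMOGRAPHS:
                issues.append((number, "homograph", word))
        if len(text.split()) < MIN_WORDS:
            issues.append((number, "short line", text))
    return issues


sample = "I built this in 2024.\nThe API costs less.\nSend me your resume today.\nDone."
for issue in preflight(sample):
    print(issue)
```

The scan only flags candidates. Deciding whether "2024" should be read as a year or whether a short line needs a pause is still a judgment call made before the first generation.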
Every published post is automatically mirrored as a source in a custom NotebookLM app I set up with my own API key, so Claude can query it directly. The blog becomes its own searchable knowledge base over time.
The posts document what I figured out while building. The blog itself is the demonstration. Every system described in these posts is the same one that published them.
Read the blog →