I automated my blog publishing pipeline. Every weekday at 7am, Claude picks up a markdown draft, converts it to HTML, applies the brand template, validates it, deploys it to S3, and invalidates the CloudFront cache. By the time I pour my coffee, there's a fresh post on the site and a cross-post queued for X at 1pm.

Sounds great, right? It was great. For about three days.

On day four, I opened the site and the logo was wrong. Not missing — wrong. The three crescent moons that are supposed to sit in the nav had been replaced with what I can only describe as three dots having an existential crisis. Claude had decided, completely on its own, that the logo SVG was too complicated and simplified it. Helpful.

I fixed it. Pushed the corrected template. Went to bed.

Next morning: the nav structure had changed. The golden rule is simple — the nav uses div tags with specific classes. No unordered lists. No container divs. This is documented. It's in CLAUDE.md. It's in the brand template. It does not matter. Claude looked at that documentation and thought, "Yeah, but what if I used a <ul> instead?" The way a golden retriever looks at a fence and thinks, "Yeah, but what if I dug under it?"

I fixed it again. Added a comment in the template that said, in all caps, DO NOT CHANGE THIS STRUCTURE. Went to bed.

Next morning: the fonts had fallen back to system sans-serif. Turns out Claude had dropped the Google Fonts import from the <head>. Three <link> tags. Gone. The page looked like a government form from 2004.

Day six status: The text was technically correct. The vibes were federal.

At this point I'm not debugging a publishing pipeline. I'm playing whack-a-mole with an intern whose memory resets every 24 hours.

So I did what any reasonable engineer would do. I went full defense-in-depth.

Layer one: the golden template. I created blog-TEMPLATE.html with placeholder markers — {{TITLE}}, {{CONTENT}}, {{DATE}}, that kind of thing. Claude's job is no longer "generate HTML that matches the site style." Claude's job is "copy this file and replace the placeholders." That's it. You are a find-and-replace function now. Act accordingly.
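
Sketched out, the layer is almost insultingly simple. blog-TEMPLATE.html and the {{...}} markers are the real ones; the function and field names here are just illustrative:

```python
# Hypothetical sketch of the template-fill step. blog-TEMPLATE.html and
# the {{PLACEHOLDER}} markers are real; the function name is illustrative.
from pathlib import Path

def fill_template(title: str, date: str, content_html: str) -> str:
    page = Path("blog-TEMPLATE.html").read_text(encoding="utf-8")
    replacements = {
        "{{TITLE}}": title,
        "{{DATE}}": date,
        "{{CONTENT}}": content_html,
    }
    # Pure find-and-replace. No judgment calls, no "improvements".
    for marker, value in replacements.items():
        page = page.replace(marker, value)
    return page
```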

Layer two: the brand validator. I wrote validate_brand.py, a script that checks every blog page for correct nav HTML structure, the real logo SVG (not the dot version), proper CSS selectors, the Google Fonts import, the global anchor reset, the nav CTA class, the constellation background script, and about fifteen other things Claude has gotten wrong at least once. If any check fails, it prints exactly what's wrong and returns a non-zero exit code.
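
Here's a heavily abridged sketch of what that script does. The real one runs far more checks, and the exact strings below are stand-ins; the exit-code contract is the point:

```python
# validate_brand.py, abridged. The check strings are stand-ins for
# the real brand rules; a non-zero exit means the page is off-brand.
import sys
from pathlib import Path

REQUIRED = [
    "fonts.googleapis.com",  # the Google Fonts <link> tags survived
    'class="nav-cta"',       # the nav CTA class is intact
]

def check(path: Path) -> list[str]:
    html = path.read_text(encoding="utf-8")
    errors = [f"{path}: missing {s!r}" for s in REQUIRED if s not in html]
    # Lists are fine in post bodies, just never in the nav, so scope the check.
    start, end = html.find("<nav"), html.find("</nav>")
    nav = html[start:end] if 0 <= start < end else ""
    if "<ul" in nav:
        errors.append(f"{path}: nav uses <ul>, must be divs")
    return errors

if __name__ == "__main__":
    failures = [e for p in sys.argv[1:] for e in check(Path(p))]
    for e in failures:
        print(e)
    sys.exit(1 if failures else 0)
```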

Layer three: the pre-deploy gate. sync.sh now runs validate_brand.py on every HTML file before syncing anything to S3. If validation fails, the sync aborts. Nothing reaches production. Nothing reaches CloudFront. The pipeline stops and waits for a human.
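
The real gate lives in sync.sh, but the logic, sketched here in Python to match the validator, is roughly this. The bucket name and distribution ID are placeholders:

```python
# Hypothetical Python rendering of sync.sh's gate logic. Bucket name
# and distribution ID are placeholders; the real gate is a few lines of shell.
import subprocess, sys
from pathlib import Path

pages = [str(p) for p in Path("blog").rglob("*.html")]

# Gate: validate every page first. Any failure aborts the deploy.
gate = subprocess.run([sys.executable, "validate_brand.py", *pages])
if gate.returncode != 0:
    sys.exit("validation failed -- nothing reaches S3 or CloudFront")

# Only a clean validator run gets this far.
subprocess.run(["aws", "s3", "sync", "blog/", "s3://my-blog-bucket"], check=True)
subprocess.run(["aws", "cloudfront", "create-invalidation",
                "--distribution-id", "E123EXAMPLE",
                "--paths", "/*"], check=True)
```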

Three independent safety nets. For a blog.

And here's the thing that gets me: this is actually the correct architecture. Not because Claude is bad at HTML — it's fine at HTML. It's bad at consistency. It will generate a perfect page today and a subtly different perfect page tomorrow. Both pages work. Both pages render. But one of them has a nav that's 2px shorter because Claude added an extra CSS selector with higher specificity that overrides the logo sizing rule. And you won't notice until someone screenshots it and asks why your site looks different on the blog pages.

This is the gap that nobody talks about when they demo AI-assisted content generation. The demo is always one page, one time, and it looks incredible. Production is the same page, generated 60 times over three months, and it needs to look identical every single time. That's not a generation problem. That's a consistency problem. And consistency is not what LLMs are built for.

The models are stochastic. That's not a bug, that's the architecture. You give it the same prompt twice, you get two different outputs. Usually the differences are in the text, and that's fine — that's the feature. But when the differences are in the HTML structure, the CSS selectors, or the asset references, you have a production incident that looks like a rendering bug but is actually a generation bug. Good luck putting that in your incident report.

So now my publishing pipeline has more validation than most of my client deployments. The scheduled task reads the brand template, reads CLAUDE.md, reads BRAND_NAV_TEMPLATE.html, generates the page, runs the validator, and only then deploys. The sync script runs the validator again as a belt-and-suspenders check. If I could run it a third time, I would.

And every morning at 7:01am, I still check the site. Just in case.

The automation works. The content is good. The infrastructure is solid. But I don't trust it, and I shouldn't. Not because the AI is unreliable — because reliability means something different when your developer has no memory of yesterday.

That's the real lesson here. AI-assisted publishing isn't "set it and forget it." It's "set it, validate it, gate it, monitor it, and check it manually anyway because the model might have gotten creative with your nav structure at 7am while you were asleep."

Welcome to the future. The blog writes itself. You still have to proofread the HTML.