Opus 4.7 dropped on a Wednesday. I had a Sonnet-tuned prompt in production that immediately needed re-evaluation. GPT-5.5 dropped the next Thursday. So did the prompt I’d just rewritten.

Welcome to the AI leapfrog game. The unit of measure is no longer months. It’s not even weeks. It’s days.

Look, I’ve lived through churn before. Containers replaced VMs. Kubernetes ate Docker Swarm. Lambda was going to replace EC2 until cold starts ate everyone’s lunch and we all migrated back partway. There was a year, around 2018, where every JavaScript framework on Hacker News was either being born or being declared dead. I was there. I remember.

But that kind of churn had a pattern: APIs are forever, implementations rotate. The EC2 API I wrote against in 2012 still works in 2026. I could pull a Terraform file out of a backup from a decade ago and run terraform apply and it would mostly do the right thing. The names of the resources, the shape of the requests, the response formats — those are stable. What rotates underneath is the implementation. The pattern: write to the API, let the platform handle the upgrades.

AI does not work like that.

The “API” for AI, the part that makes your code do useful things, isn’t function signatures. It’s prompts. And the prompts are tuned to a specific model’s specific quirks. The way one model over-explains. The way another handles edge cases. The way a third one needs you to specify “respond in JSON only” three times before it stops trying to be helpful. You write the prompt around those personalities. You add few-shot examples that work because of how this model interprets them. You build a workflow that respects the failure modes of this model.
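Concretely, the artifact you end up maintaining looks less like a function and more like a config pinned to one model. A minimal sketch of what I mean, with made-up model names and quirks (nothing here is a real vendor's prompt format):

```python
import json

# Illustrative only: a prompt tuned to one specific model's personality,
# checked into version control next to the code that uses it.
PROMPT_V7 = {
    # Pinned model ID: this prompt is tuned to THIS model, not "the API".
    "model": "example-model-2026-01",
    "system": (
        "Extract invoice fields. Respond in JSON only. "
        "Respond in JSON only. Do not explain your reasoning."
        # Yes, twice. This hypothetical model ignores the first instruction.
    ),
    # Few-shot examples that work because of how this model reads them.
    "few_shot": [
        {
            "input": "Acme Corp, net 30, $1,200",
            "output": '{"vendor": "Acme Corp", "terms": "net 30", "amount": 1200}',
        },
    ],
}

# Sanity checks you can run in CI: the quirk workarounds are deliberate,
# so make them explicit instead of folklore.
assert PROMPT_V7["system"].count("JSON only") == 2
assert json.loads(PROMPT_V7["few_shot"][0]["output"])["vendor"] == "Acme Corp"
```

None of this is in the function signature. Swap the pinned model for next month's release and every line of that config is back under review.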

Then a new model drops. The API signature is identical — send messages, get text back. The function call shape hasn’t moved. Your code compiles. Your tests run. But the behavior is different. The over-explaining you trained around is gone, so now your output is too short. The edge case handling shifted, so the failure mode you wrote a guardrail for isn’t the failure mode anymore. The new failure mode is something you haven’t written a guardrail for yet, because you didn’t know it existed until now.

This is not a normal software upgrade. Normal upgrades preserve the contract. AI model upgrades preserve the API but break the contract.

I sell AI automation to small businesses. The pitch is “this saves you eight hours a week, costs you $X a month, and works.” What I am not saying out loud, what is becoming load-bearing for the practice and hard to ignore, is the second half of that sentence: “…as long as none of these three vendors ships a new model in the next sixty days. If they do, I will spend a few hours retesting your stuff. I might rewrite prompts I shipped six weeks ago. I might find that a workflow I built around one model’s strengths is now built around something the new model is bad at. The cost of this is currently absorbed by me, because I haven’t figured out how to price it yet, and because if I billed for it the bill would be larger than the value I delivered, and I would lose the client.”

That last part is the part nobody wants to print on a slide. But it’s true.

The honest engineering answer is: build for the API surface, not the model. Keep prompts version-controlled like code. Maintain a regression suite of inputs and expected outputs that tells you within five minutes whether the new model breaks anything important. Treat model upgrades the way you treat OS patches — not “oh great, free improvement” but “okay, what regressed and how loudly.” If you don’t have a regression suite, you don’t actually know what your AI does. You have a vibes-based deployment.
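The suite doesn't need to be elaborate. A minimal sketch, assuming a `call_model` callable that wraps whatever vendor client you use (everything here — case inputs, checks, the fake model — is illustrative, not a real product's suite):

```python
import json
from typing import Callable

# (input, property check) pairs. Checks assert properties, not exact
# strings, because model output is never byte-stable across versions.
REGRESSION_CASES = [
    ("Summarize: the invoice is 30 days overdue.",
     lambda out: "overdue" in out.lower() and len(out) < 400),
    ("Extract fields as JSON: Acme Corp, net 30, $1,200",
     lambda out: isinstance(json.loads(out), dict)),
]

def run_suite(call_model: Callable[[str], str]) -> list[str]:
    """Run every case against a model; return failure descriptions.

    An empty list means the new model didn't break anything this
    suite knows how to check.
    """
    failures = []
    for text, check in REGRESSION_CASES:
        try:
            ok = check(call_model(text))
        except Exception:
            ok = False  # a crash or malformed output counts as a failure
        if not ok:
            failures.append(f"FAIL: {text!r}")
    return failures

# Deterministic stand-in so the harness runs offline; in real use this
# is your vendor client with the prompt under test baked in.
def fake_model(text: str) -> str:
    if "JSON" in text:
        return '{"vendor": "Acme Corp", "terms": "net 30", "amount": 1200}'
    return "The invoice is overdue."
```

Run it against the old model, run it against the new one, diff the failure lists. Five minutes, and you know whether the Wednesday drop is a Friday incident.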

The honest market answer is: you are in a market where everything you build has a half-life of roughly eight weeks before the platform underneath you mutates and you have to do work to keep your output stable. This is true whether you are a solo consultant or a Fortune 500 platform team. The Fortune 500 has more headcount to absorb the work. That is the only difference. Plan for it. Price for it. Or stop selling the thing.

I’ve been doing infrastructure for fifteen years and the only constant I’ve seen is that the layer underneath you is always changing. Cloud changed it once. SaaS changed it again. AI changes it on a faster clock than any of them. The clock will keep getting faster. The next thing under us, in five years, will probably move faster still.

The half-life problem: The only stable thing is that nothing’s stable. Build for that, or stop building.

That’s the whole craft now: you don’t build a thing. You build the capacity to keep rebuilding the thing. Whatever you ship today is a draft of what you’ll ship next month, and the month after that. The artifacts that survive are the ones with regression suites, observable behavior, and a human in the loop who knows what good looks like. Everything else gets quietly broken by a Wednesday model drop and discovered by a customer on a Friday.

Opus 4.7 will be old by July. GPT-5.5 will be old by August. Whatever I’m using to write this post will be old by the time you read it. My drafts are dated the day I wrote them, because by the publish date even the dates are wrong.

Anyway. The new model just dropped. I should probably go re-test my chatbot.