Senior Engineers Are Back (Thanks Amazon)
Image: ChatGPT and TSD
Amazon have been dealing with a few incidents recently. People are getting pinged in the middle of the night, and there’s a scramble to work out what is going wrong.
As you might expect, Amazon have their own internal way of describing a major incident. They speak of incidents with a “high blast radius”.
Now, according to some leaked meeting notes, these incidents have a root cause: “Gen-AI assisted changes” that have gone wrong.
AI wrote the code.
Nobody properly reviewed it.
Production broke.
And the solution is “senior engineers”.
The “Move Fast and Break Things” Hangover
Tech companies have spent the last two years loudly announcing that AI writes a large chunk of their code.
At my employer a senior member of staff said that they “no longer look for the code” that they push. No wonder our production environment suddenly looks like an abstract game of Jenga.
Because software engineers have been keeping a secret. All of us. Writing code is the easy part.
The hard part is actually understanding systems.
Claude Code can get code to build. It can run tests. It can make sure that the code conforms to a pretty simple specification.
What it can’t do very well is:
understand complex system architecture
predict side effects
understand legacy systems
reason about operational risk
When you leave these things out of your workflow, I’m not surprised that you end up with incidents with a “high blast radius”.
I think we’ve all been there when someone deploys a (small) change and then suddenly the company infrastructure is on fire and product managers are shouting for status updates.
AI makes mistakes faster
AI-generated code is the promise of more code. Everyone will get their features faster, customers will be happy and we will all make bank.
Management is banging the efficiency drum. You need to do more, you need to push more code and you need to do it now.
Yet when something becomes more efficient, people don’t use less of it, they use more. That is an economic principle called the Jevons paradox, and I think that’s exactly what is at play here.
Developers can now produce code much faster.
Which means companies produce more code.
When we produce more code, that means:
more deployments
more complexity
more bugs
We’ve turned a perceived productivity problem (perceived, because it was never a real issue) into a quality control problem.
The solution was always there
Amazon’s response to these outages is interesting. They aren’t panicking. They aren’t withdrawing AI tools from their engineers.
They’re changing their process to require approval from senior engineers. Now we are getting to the solution: peopleware. We (as developers) should be thankful for that.
It’s also the correct solution.
This is the early learning curve of AI coding tools at scale (senior reviews, improved testing), on the way to a longer term of smarter models and intense productivity. Human judgement is going to be key for production systems.
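To make the approval rule concrete, here is a minimal sketch of what a merge gate along those lines might look like. It is not Amazon’s actual process, which the leaked notes don’t spell out. The names Change, Approval, can_merge and the set of “senior” levels are all hypothetical, and the rule itself (any AI-assisted change needs at least one senior approver who isn’t the author) is my assumption.

```python
from dataclasses import dataclass, field

# Hypothetical seniority labels; a real organisation would define its own.
SENIOR_LEVELS = {"senior", "staff", "principal"}


@dataclass
class Approval:
    reviewer: str
    level: str  # e.g. "junior", "mid", "senior"


@dataclass
class Change:
    author: str
    ai_assisted: bool
    approvals: list[Approval] = field(default_factory=list)


def can_merge(change: Change) -> bool:
    """Return True if the change has the approvals this sketch requires."""
    # Self-approval never counts.
    valid = [a for a in change.approvals if a.reviewer != change.author]
    if not valid:
        return False
    if change.ai_assisted:
        # Assumed rule: AI-assisted changes need at least one senior approver.
        return any(a.level in SENIOR_LEVELS for a in valid)
    return True


if __name__ == "__main__":
    change = Change("alice", ai_assisted=True,
                    approvals=[Approval("bob", "junior")])
    print(can_merge(change))  # blocked until a senior engineer approves
    change.approvals.append(Approval("carol", "senior"))
    print(can_merge(change))
```

The gate is deliberately boring. The interesting part is still the human reading the diff.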
The code review problem nobody talks about
The real skill in software engineering isn’t writing syntax. It’s:
understanding the system
predicting the consequences of change
spotting dangerous assumptions
Those are things you learn after years of painful production incidents, bug bashes and staring into space looking for the “right” solution.
AI doesn’t have that experience, and doesn’t produce the solutions that we need.
Only senior engineers do.
This has been clearly exposed by the code review struggles I’ve seen at companies. By struggles I mean, of course, reviews that people do not adequately carry out.
I’ve been left waiting hours (or even days) for code reviews. I’ve posted pull requests and then begged for people to read them in Slack channels. Work gets blocked and delayed, and nothing seems to happen with the code.
With AI we don’t get 10x productivity, we get 10x pull requests. Now I think it’s down to your team what happens with those pull requests.
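A bit of back-of-the-envelope arithmetic shows why that matters. The sketch below assumes review capacity stays fixed while the number of pull requests opened per day goes up. The function name review_backlog and every number in it are made up for illustration, not measurements from any real team.

```python
def review_backlog(prs_opened_per_day: int, reviews_per_day: int, days: int) -> list[int]:
    """Return the number of unreviewed pull requests at the end of each day."""
    backlog = 0
    history = []
    for _ in range(days):
        # New PRs arrive, reviewers clear what they can, and the queue never goes negative.
        backlog = max(0, backlog + prs_opened_per_day - reviews_per_day)
        history.append(backlog)
    return history


if __name__ == "__main__":
    # Illustrative numbers only: a team that can review 6 PRs a day.
    print(review_backlog(5, 6, 5))   # [0, 0, 0, 0, 0]
    print(review_backlog(50, 6, 5))  # [44, 88, 132, 176, 220]
```

With these assumed numbers the queue grows by 44 pull requests a day, every day, until someone either reviews faster or stops reading.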
AI as a tool
Let’s be clear about something.
AI coding tools are useful, and are changing the nature of our jobs. They remove repetitive work and help developers move faster, but it seems like they won’t be able to replace engineers who understand systems.
What they will do is change the job.
Developers will spend less time writing boilerplate and more time:
reviewing AI output
designing systems
debugging weird failures
integrating complex systems
Which, incidentally, is what experienced engineers were already doing. The rest of us are just catching up.
The lesson
This Amazon story isn’t really about AI.
It’s about process. It’s about learning. It’s about doing better, and being better.
Software engineering has always needed:
strong code review
experienced engineers
careful deployments
accountability for changes
AI didn’t remove those requirements.
It just made it easier to ignore them.
And you ignore them at your own peril.
About The Author
Professional Software Developer “The Secret Developer” can be found on Twitter @TheSDeveloper and regularly publishes articles through Medium.com
The Secret Developer is scared of being replaced by AI, no lie.