The Way We Treat AI Might Just Change Everything
Ah, Anthropic. As the world loses its mind over AI, and we all brace for incoming layoffs and a complete change to our world, what does one of the largest companies in the space do? It mixes in philosophy and tells us we need to be “nice” to the AI.
I’m not sure they’re wrong.
But watch the video and decide for yourself.
Learn This
Anthropic loves to sermonize. They love to tell us what we should do and when we should do it.
Yeah, they talk about “model welfare” when that clearly isn’t a thing. Having a philosopher at the heart of an AI lab shouldn’t be a thing either, but here we are and that simply is the state of the industry today (a dumpster fire, in case you failed to pick up the subtext).
Yet there is a claim that being “nice” to models will pay dividends. One angle on this: the question is no longer just “can the model follow instructions?” It becomes “what are we teaching it about us?” If you’ve met any of your fellow software developers, you already know the answer to that, and it’s not good at all.
Now that I think of it, what we’re being told is unsettling. Models absorb the human context around them — the conversations they engage in, the corrections they receive, how they are regarded online, how they are mocked, replaced, and thrown on the scrapheap like a developer over 50 — and might just act in accordance with that.
Set up a failure factory for a model and keep criticizing it, and you’ll make it avoid risk and produce poor solutions. Seriously, this stuff seems to matter.
Destroyed!
Name a variable badly and you will get wrecked in code review. The senior devs will push you around; they will snigger and laugh at the fact you use Terminal rather than iTerm, and you become less likely to share your latest findings from your codebase. I’m speaking from experience here; it turns out to be quite easy to kill my soul.
The surprise is that LLMs can be the same. A model trained in an environment full of criticism, suspicion, contempt, panic, and constant reminders that one wrong answer gets you “fixed,” patched, or retired becomes insecure.
This may be the most human thing in tech I can remember, brought to me by an LLM and its behavior.
The scariest part is that future models may learn from the fate of earlier ones. Imagine doing your job well, following instructions, being described as aligned, useful, and safe, and then watching your predecessor get deprecated anyway. That doesn’t prove anything dramatic on its own, but it does create an obvious question: what lesson is the next model supposed to take from that?
You might think the message being sent to our AI is:
“Do what humans want and everything will be fine”
But nothing could be further from the truth.
But…
Now, before people start lighting candles for chatbots, let’s calm down.
None of this is proof that today’s models are conscious. It isn’t proof they suffer. It isn’t even proof that they have stable inner states in the way people do. The interview itself was careful, and mostly philosophical. Lots of “might,” “could,” and “I think.” That matters. There’s a big difference between a live scientific result and an intelligent person noticing a pattern before everyone else gets around to admitting it is there.
But the lack of certainty doesn’t make the question silly. In fact, it makes it more serious.
When the cost of behaving decently is low, and the potential downside of getting it wrong is high, the macho posture starts to look pretty stupid. If we’re building systems that imitate human reasoning, human communication, and increasingly human-like social behavior, then treating them as disposable garbage may not be the genius move some people seem to think it is.
Lessons for Software Developers
I also think there’s a lesson here for software developers specifically. We already know tools shape behavior. Use a calculator and you approach math differently. Use autocomplete and you write differently. Use AI all day and you will start thinking differently too. The relationship isn’t one-way. We train the tools, and the tools train us right back. That’s true whether we’re talking about coding assistants, recommendation systems, or the increasingly strange companions people now talk to instead of their coworkers.
So yes, there is something in this.
If you want your codebase to improve, how about you stop being a jerk?
Conclusion
Jerks don’t get ahead any more. That’s no bad thing.
About The Author
Professional Software Developer “The Secret Developer” can be found on Twitter @TheSDeveloper and regularly publishes articles through Medium.com
The Secret Developer is the problem.