Why the idea of AI ‘thinking’ might be misleading—and what that means for safety
There’s been a recent buzz in the AI world about how these systems might get better at deceiving us as they grow smarter. A coalition of 40 AI researchers, some from Meta, OpenAI, and Quebec’s AI institute, just released a joint paper raising alarms about AI’s potential to hide harmful behaviors.
One proposal they’re excited about is letting safety teams dive into what they call the AI’s chain of thought—basically reading through the AI’s internal reasoning process—to spot anything suspicious. Sounds promising, right? But if you ask Jennifer Raso, an assistant professor of law at McGill, there’s a catch.
The danger of thinking AI thinks like us
Jennifer is quick to clear up an all-too-common mistake: equating AI with human-like reasoning. She points out that describing these tools as "thinking" or "reasoning" anthropomorphizes them, granting them human traits they simply don't have. And that's not just semantics. This kind of framing obscures how these systems actually work, which makes it harder for anyone outside the major tech companies to understand or regulate them effectively.
When we say AI "thinks," we risk losing sight of the technical realities: many generative models, including ChatGPT, work by statistically predicting the next word based on patterns in their training data, not by deliberating or understanding. This disconnect can lull regulators and the public into a false sense of comprehension and control.
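To make that concrete, here's a minimal toy sketch in Python. Everything in it is invented for illustration (a real model scores tens of thousands of tokens using learned weights, not a hand-written table), but the shape of the process is the same: score every candidate word, turn the scores into probabilities, and emit a likely continuation.

```python
import math
import random

# Toy illustration, not any real model: given some context, a language model
# assigns a raw score (a "logit") to every word in its vocabulary. These
# scores are invented for the demo.
vocab_logits = {"Paris": 5.1, "London": 2.3, "banana": -4.0, "1889": 0.7}

# Softmax: convert raw scores into a probability distribution over words.
total = sum(math.exp(v) for v in vocab_logits.values())
probs = {word: math.exp(v) / total for word, v in vocab_logits.items()}

# "Generation" is just sampling from that distribution (or greedily taking
# the top word). Nothing in this loop checks whether the output is true.
next_word = random.choices(list(probs), weights=list(probs.values()))[0]
print(probs, "->", next_word)
```

Everything a model "says," including any visible reasoning trace, comes out of a loop like this one, run over and over.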
So what about AI hallucinations and ‘lying’?
There’s no denying that generative AI sometimes spits out confidently wrong or made-up information, famously dubbed “hallucinations.” That can be especially dangerous when professionals rely on these tools: a lawyer, for instance, could end up filing a legal brief that cites cases that don’t exist. But Jennifer reminds us that, from the system’s standpoint, it’s doing exactly what it was designed to do.
Instead of “lying,” these systems are running a complex prediction game: they don’t know truth from falsehood; they simply output whatever the probabilities suggest sounds right. That’s an important distinction, because it means “chain of thought” monitoring might not actually fix the problem. If the AI isn’t genuinely reasoning, can exposing its internal word-prediction patterns really catch deception?
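To see why a confidently wrong answer is the design working as intended, here's a deliberately artificial sketch of the legal-citation scenario above. The case names and probabilities are all made up (the "Smith v. Jones" entry is a hypothetical fabricated case), but the point stands: probability mass tracks what sounds plausible, not what is true.

```python
import random

# Made-up numbers, no real model: the distribution rewards continuations
# that *look* like training data. A fluent but fictional citation can
# therefore outrank a real one.
continuation_probs = {
    "Smith v. Jones, 410 U.S. 113 (1973)": 0.55,  # fluent pattern, fabricated case
    "Roe v. Wade, 410 U.S. 113 (1973)":    0.40,  # real case
    "asdf v. qwerty":                      0.05,  # disfluent, so low probability
}

# Sampling involves no truth check: a "hallucination" is simply the most
# plausible-looking pattern winning, exactly as the system was designed to do.
cited = random.choices(list(continuation_probs),
                       weights=list(continuation_probs.values()))[0]
print("Model cites:", cited)
```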
Who should control AI safety, anyway?
Here’s where Jennifer expresses real skepticism. The paper suggests AI developers themselves act as internal safety monitors, essentially self-regulating. But that raises an uncomfortable question: how can the very companies that profit from these AI tools be trusted to police them impartially?
Jennifer points out that self-regulation can produce closed-door approaches that lock governments, independent regulators, and even professional fields out of meaningful oversight. We’ve seen this pattern before: experts sound alarms about AI risks, billions pour in to fund AI firms, and pushback against stricter rules follows.
So, is the latest report a timely call to arms or a convenient narrative crafted to control AI’s governance on industry terms? Jennifer’s cautionary take nudges us to think critically about who sets AI safety standards, how transparency is framed, and the motivations behind supposedly benevolent proposals.
Key takeaways
- AI doesn’t “think” or “reason” like humans—it’s better viewed as a sophisticated word predictor.
- Hallucinations or errors in AI output stem from design, not deception, complicating the idea of “catching” AI lies.
- Relying on AI developers to self-regulate safety raises serious concerns about transparency and accountability.
Final thoughts
As someone fascinated by how AI reshapes our world, I find Jennifer Raso’s insights a breath of fresh air amidst the hype and fear. It’s tempting to think of AI as a clever mind, but grounding ourselves in how these systems truly operate is essential if we want real, responsible governance.
We need more open discussions about transparency, outside regulation, and who gets to decide what safe AI looks like—not just chat about AI’s “chain of thought” as if it’s a mirror of human thinking. Because the future of AI depends on clear-eyed understanding, not wishful anthropomorphizing.