Every now and then, a conversation catches my attention by unpacking the really hard questions about AI’s future. Recently, I came across fascinating insights from Benjamin Mann, co-founder of Anthropic and one of the key architects behind GPT-3 at OpenAI. Ben’s perspective on where AI is heading, from safety risks to economic upheaval to what we can do to prepare, offers a nuanced, grounded look at this rapidly evolving domain.
Why safety in AI isn’t just an afterthought
One of the most striking things I encountered was the story behind Anthropic itself. Ben and a handful of others left OpenAI because they felt safety wasn’t being prioritized enough. Imagine a place where the goal is building powerful AI for humanity’s benefit, but inside the company, safety and research pull in different directions. Ben described three “tribes” at OpenAI—the safety tribe, research tribe, and startup tribe—that often conflicted. That tension led him and others to start a company focused on putting safety first, not as an add-on, but baked deep into the AI, aligned to be helpful, harmless, and honest.
This makes me realize how crucial it is to see AI safety not just as a checkbox but as the foundation for the future of tech, especially as we’re racing toward superintelligence. Ben pointed out that only a tiny fraction of people worldwide work on AI safety, despite the massive investment in AI development. That’s astonishing given what’s at stake.
Fewer than 1,000 people worldwide work on AI safety, while the industry spends roughly $300 billion annually on AI development.
Progress isn’t slowing down: The scaling laws and what they mean
There’s a common narrative that AI progress is hitting a plateau. Ben challenges that, explaining that progress is actually accelerating: model releases used to come yearly, and now meaningful improvements land every few months or faster. He offers an analogy to Einstein’s relativity: living inside the exponential dilates your sense of time, so the technology’s advance feels slower than it really is.
Scaling laws, the empirical finding that model performance improves predictably, roughly as a power law, as data and compute grow, have held true across enormous expansions of both and show no signs of breaking yet. This sustained progression means we’re unlikely to see a sudden halt in AI capabilities anytime soon, and we should be prepared for major transformations ahead.
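To make that concrete, here is a minimal sketch of the power-law shape these laws take. The constants below are illustrative placeholders, not published coefficients from any paper; the point is that the fitted curve declines smoothly, with no plateau built into its form:

```python
# Minimal sketch of a compute scaling law: loss falls as a power law,
# L(C) = a * C**(-b). The constants here are illustrative placeholders,
# not published coefficients.

def loss(compute_pf_days: float, a: float = 2.0, b: float = 0.05) -> float:
    """Predicted loss for a given training compute budget."""
    return a * compute_pf_days ** (-b)

for c in [1, 10, 100, 1_000, 10_000]:  # each step is 10x more compute
    print(f"compute {c:>6} PF-days -> predicted loss {loss(c):.3f}")
```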
The economic impact and the looming job disruption
Ben and Dario Amodei, Anthropic’s CEO, suggest that within 20 years, AI could reshape society so fundamentally that even capitalism might look foreign to us. They predict unemployment could rise to around 20%, driven by both displacement and skill mismatches.
But what’s truly eye-opening is how AI is already changing work today. For instance, AI reaches an 82% automated resolution rate in customer service and writes 95% of the code on some software engineering teams. That means smaller teams can do massively more.
Ben emphasizes that to stay ahead, people need to be ambitious in how they use AI tools—whether that means iterating prompts multiple times or exploring new ways to unlock AI’s power. Simply treating AI like older tech won’t cut it.
Redefining AGI: What counts as transformative AI?
Ben prefers the term “transformative AI” over AGI. It’s less about matching human abilities on every front and more about when AI starts fundamentally transforming the economy and society. He shared a practical yardstick he calls the Economic Turing Test: hire an AI for a job, and if the employer can’t tell it isn’t human, the AI passes for that role. When AI passes for half the jobs in a money-weighted market basket, transformative AI has arrived.
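As a toy illustration of that yardstick (my own sketch with made-up numbers, not Ben’s formalization), you can weight each job in the basket by the money paid for it and check whether AI passes for at least half of the basket by value:

```python
# Toy sketch of the Economic Turing Test: weight each job by the money
# paid for it, and ask whether AI "passed" for at least half the basket
# by value. Jobs, wages, and pass/fail flags are invented for the demo.

basket = [
    # (job, annual wages in $, AI hire went unnoticed?)
    ("customer support agent",  45_000, True),
    ("software engineer",      150_000, True),
    ("trial lawyer",           200_000, False),
    ("radiologist",            350_000, False),
    ("copywriter",              60_000, True),
]

passed = sum(wage for _, wage, ok in basket if ok)
total = sum(wage for _, wage, _ in basket)

print(f"AI passes for {passed / total:.0%} of the basket by value")
print("Transformative AI threshold met:", passed / total >= 0.5)
```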
Now imagine the world’s GDP growing by 10% a year or more on the back of AI-driven productivity. It’s a radical shift that would change lives dramatically, both exciting and daunting.
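A quick back-of-the-envelope shows why that number is radical. Global growth today runs around 3% a year, and the doubling time under compound growth is ln(2) / ln(1 + rate):

```python
import math

# Doubling time under compound growth: ln(2) / ln(1 + rate).
for rate in (0.03, 0.10):
    years = math.log(2) / math.log(1 + rate)
    print(f"{rate:.0%} annual growth -> GDP doubles every {years:.1f} years")
```

At 10%, the world economy doubles roughly every seven years instead of every couple of decades.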
How Anthropic aligns AI safely with constitutional AI
A big part of Anthropic’s approach is Constitutional AI, a method that instills a set of human values and principles, drawn from sources like the UN’s Universal Declaration of Human Rights, directly into the AI’s operating rules. Instead of relying on human raters to supervise every response, the AI judges its own outputs against these principles and revises them. This self-critique loop, known as Reinforcement Learning from AI Feedback (RLAIF), is a game changer for scaling AI safety research.
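Here is a heavily simplified sketch of what one critique-and-revise pass might look like. The `generate` function is a placeholder for a real model call, a real RLAIF pipeline would go on to train on the revised answers as preference data, and none of this is Anthropic’s actual implementation:

```python
# Simplified sketch of a constitutional critique-and-revise pass.
# `generate` is a placeholder standing in for a real model call.

CONSTITUTION = [
    "Choose the response that is most helpful, honest, and harmless.",
    "Avoid responses that are deceptive or encourage illegal acts.",
]

def generate(prompt: str) -> str:
    """Placeholder for an LLM call; returns a canned answer for the demo."""
    return f"[model output for: {prompt[:60]}...]"

def constitutional_revision(question: str) -> str:
    answer = generate(question)
    for principle in CONSTITUTION:
        # The model critiques its own answer against one principle...
        critique = generate(f"Critique this answer against the principle "
                            f"'{principle}':\n{answer}")
        # ...then rewrites the answer to address its own critique.
        answer = generate(f"Rewrite the answer to address this critique:\n"
                          f"{critique}\nOriginal answer:\n{answer}")
    return answer

print(constitutional_revision("How do I pick a strong password?"))
```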
Ben stresses that it’s not just about safety playing defense but about giving AI a personality rooted in trust, honesty, and kindness. That’s why Anthropic’s Claude model is both less sycophantic and better aligned to help users effectively. Safety and user experience go hand in hand.
The timeline to superintelligence: How soon and what then?
One of the most talked-about points is the predicted timeline. Ben aligns with the AI 2027 report forecasting a 50% chance that superintelligence will come around 2028. That’s just a few years away—a startling thought for many.
But he also cautions that even after superintelligence arrives, the societal impacts will diffuse gradually and unevenly. Some places and industries will experience waves of change sooner than others.
What are the risks and can we solve alignment?
When it comes to existential risks from AI, Ben estimates roughly a 0 to 10 percent chance of extremely bad outcomes for humanity. It’s not zero, and that’s why safety work is critical. The alignment problem might turn out to be easy, nearly impossible, or anywhere in between. Anthropic operates under the assumption that our actions right now can greatly influence the outcome, an enormous responsibility.
Personal reflections: carrying the weight of AI’s future
Ben also shared what it’s like to work in a role where the stakes feel so enormous. He adopts a mindset from Replacing Guilt by Nate Soares, emphasizing “resting in motion”: staying engaged at a sustainable pace without being paralyzed by anxiety. Working alongside an egoless, mission-driven team helps too—people who genuinely care about making the future positive.
How to prepare yourself and your kids for the AI era
Ben’s advice for individuals? Get curious, be willing to experiment, and embrace ambition with AI tools. He encourages trying prompts multiple times, learning from what doesn’t work.
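In practice, trying prompts multiple times can be as simple as a loop with a scoring step. A minimal sketch, where `ask_model` and `score` are hypothetical stand-ins for a real model call and a task-specific quality check:

```python
# Minimal prompt-iteration loop: try variants, keep what scores best.
# `ask_model` and `score` are hypothetical placeholders for a real
# model call and a task-specific quality metric.

def ask_model(prompt: str) -> str:
    return f"[answer to: {prompt}]"   # placeholder model call

def score(answer: str) -> float:
    return len(answer) / 100          # placeholder quality metric

variants = [
    "Summarize this contract.",
    "Summarize this contract in five bullets for a non-lawyer.",
    "List the three clauses most likely to surprise a tenant.",
]

best = max(variants, key=lambda p: score(ask_model(p)))
print("Best-performing prompt:", best)
```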
For his own kids, he’s focused less on traditional achievement and more on nurturing curiosity, creativity, kindness, and self-led learning—all skills that will thrive long after facts fade.
Key takeaways
- AI safety must be the number one priority—building powerful AI without safety at its core risks irreversible harm.
- Progress is accelerating, not slowing down, so the coming years will bring dramatic shifts in technology and society.
- Transformative AI will reshape economies and jobs, and to thrive, individuals must adopt AI tools ambitiously and creatively.
- Alignment techniques like Constitutional AI show promise in creating AI that is not just capable but trustworthy and safe.
- The singularity or superintelligence could arrive soon, so proactive measures to understand and govern AI are critical today.
- Preparing future generations means focusing on curiosity, creativity, and kindness, not just rote learning.
Wrapping up
Reading these insights from Benjamin Mann really brought home how intertwined progress and responsibility are in AI’s journey. The complexity of aligning AI safely while pushing innovation forward is one of the defining challenges of our time. Yet there’s real hope in the approaches Anthropic is pioneering: embedding values directly into the AI itself and making safety a first-class citizen.
If AI is going to be the last invention humanity ever needs to make, as Ben put it, then making sure it’s done right feels like no less than our collective duty.