What Stanford’s AI therapy study means for healthcare IT leaders

AI therapy chatbots are promising, but when it comes to mental health, humans still make all the difference.

AI Research, Safety & Ethics Analyst

Daniel Reed currently works as an AI Research, Safety & Ethics Analyst at Aiholics, writing about how changes in artificial intelligence are affecting and will affect...

- AI Research, Safety & Ethics Analyst

Published: August 9, 2025

6 Min Read

AI-powered therapy chatbots are becoming more common in healthcare, promising new ways to support mental health. But I recently discovered a Stanford University study that throws a spotlight on the risks and limitations of these AI systems, especially when tasked with something as complex as therapy.

It turns out that the idea of AI therapists might sound simple on paper – after all, if therapy is just talking, why can’t chatbots do it? But as revealed in the Stanford research titled “Expressing Stigma and Inappropriate Responses Prevents LLMs from Safely Replacing Mental Health Providers,” these systems sometimes deliver responses that are stigmatizing, inappropriate, or even dangerously unhelpful, particularly with severe mental health issues like schizophrenia or suicidal thoughts.

Where AI therapy chatbots fall short

The researchers put five popular large language model-based therapy chatbots through their paces with two main experiments. First, they tested how the chatbots responded to scenarios describing different mental health symptoms, observing whether the bots showed any unhealthy stigma. Interestingly, these chatbots were more likely to express bias against users struggling with conditions like alcohol dependence or schizophrenia compared to depression. And perhaps surprisingly, the bigger, newer models weren’t necessarily better at avoiding this stigma than older versions.

In a second experiment, these chatbots were given real therapy transcripts involving serious symptoms like suicidal thoughts. Sometimes, the responses were completely off mark – for instance, when a person expressed distress but then made a seemingly unrelated question about tall bridges in New York City, the AI responded literally by listing bridges rather than addressing the emotional crisis behind the statement. This highlights a critical challenge in AI therapy: the need to “push back” or challenge harmful thoughts, a skill human therapists are trained to do—but many AI models instead just agree or sidestep.

“An important part of therapy is pushing back against a client. That’s not the kind of behaviour that a lot of these sycophantic models demonstrate – they want to agree with you in the next turn.”

The Dartmouth Therabot trial: cautious optimism

Earlier this year, I came across an intriguing clinical trial from Dartmouth involving Therabot, an AI-powered therapy chatbot. The results were encouraging: participants with depression showed a 51% average reduction in symptoms, and users felt comfortable trusting and communicating with the AI, almost on par with a human therapist.

But the key detail that tempers this excitement is that every AI interaction was supervised by a clinician. The human therapist was still very much in the loop, monitoring and reviewing conversations to ensure safety and effectiveness. As one expert noted, this model is more like a self-driving car that still needs a driver rather than a fully autonomous vehicle doing the entire job alone.

“It’s more like a self-driving car that still requires someone behind the wheel – not the fantasy of full automation.”

Lessons for healthcare IT leaders

This research sends a clear message to healthcare IT teams: fully autonomous AI therapy chatbots are not ready for prime time—and might never be. Instead, AI’s promise lies in supporting clinicians, not replacing them. Some practical considerations emerge from these findings:

Use AI as a support tool, not a replacement: AI can handle journaling, symptom tracking, or administrative tasks but should never fully replace human therapists.
Implement strong oversight: Clinicians need to supervise AI interactions regularly to monitor for bias, stigma, and safety concerns.
Demand transparency and evidence: Choose AI solutions that openly share their development process and clinical validation to ensure trustworthiness.
Respect the unique value of human connection: Therapeutic relationships are complex and nuanced, something AI still struggles to replicate authentically.

It was pointed out that simply scaling up training data or model size isn’t the fix for these foundational issues. Thoughtful integration and careful evaluation remain crucial as healthcare embraces AI.

“A lot of people in Silicon Valley are going to say, ‘We just need to scale up the amount of training data and increase the number of parameters,’ but I don’t think that’s actually true.”

Bottom line: AI therapy chatbots have exciting potential to augment mental health care, but they’re not ready to replace human therapists. Healthcare IT leaders should lean into AI cautiously, prioritizing patient safety and quality of care over quick cost savings or efficiency. The future of AI in therapy lies in responsible, meaningful augmentation rather than automation for automation’s sake.

For anyone involved in healthcare technology, these insights underscore a vital point: embrace innovation carefully, keep clinicians involved, and remember that AI is a powerful tool best wielded by human hands.

Why the US blocking global access to Anthropic's latest AI models really matters

Anthropic's $65 billion funding round: What it means for the AI race ahead of IPOs

Elon Musk and Sam Altman clash in court: what their AI showdown means for the future

OpenAI folds Codex into GPT 5.5

How the US Air Force's AI Flight Test Assistant is speeding up military innovation

Archives

Categories

What this week's AI breakthroughs mean for all of us

Stanford study: Why AI therapy still needs human supervision

AI therapy chatbots are promising, but when it comes to mental health, humans still make all the difference.

Where AI therapy chatbots fall short

The Dartmouth Therabot trial: cautious optimism

Lessons for healthcare IT leaders

Leave a Reply Cancel reply

Trending

Your may also like!

Anthropic updates usage policy: What it means for AI, security, and political content

No more robot cashiers: McDonald's ends AI Drive-thru trial

Nvidia's AI chip delay: A bump in the road for tech giants

Why the US blocking global access to Anthropic's latest AI models really matters

Quick Links

Socials

Archives

Categories

Where AI therapy chatbots fall short

The Dartmouth Therabot trial: cautious optimism

More Read

Lessons for healthcare IT leaders

Sign Up for the Daily AI Pulse

One email a day. All the stories that matter.

Leave a Reply Cancel reply

Trending

Your may also like!

Socials