AI Makers Are Striving Mightily Toward AI-Builds-AI, Which Will Greatly Impact AI For Mental Health
Anthropic and other AI makers are pursuing the use of AI to advance AI. It's dicey. Even more so when it comes to AI for mental health. An AI Insider analysis and scoop.
- Anthropic is developing a system called 'Ambit' to automate the creation of AI models, aiming to reduce human involvement in training by 70% by 2027.
- Mental health chatbots like Woebot have already handled over 1 billion conversations, yet none have been built using fully automated AI design.
- A 2025 paper from Anthropic found that self-improving AI models introduced biases at 3× the rate of human-supervised training, a critical risk for clinical applications.
- The global AI mental health market is projected to reach $16.3 billion by 2030, making it a prime target for automated development.
- OpenAI's internal research suggests that AI-builds-AI could cut development costs for specialized models (like therapy bots) by 40–60%, but safety testing costs could double.
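Whether the last point is a net saving depends on how much of a project's budget already goes to safety testing, which the figures above do not state. The sketch below works through that arithmetic. The function names and the `safety_share` input are hypothetical, introduced only for illustration; the 40–60% cut and the doubling are the figures quoted above, taken at face value.

```python
def relative_total_cost(safety_share, dev_cut, safety_multiplier=2.0):
    """Return the new total cost as a fraction of the original total cost.

    safety_share: fraction of the original budget spent on safety testing
                  (hypothetical input; the article gives no value for it).
    dev_cut: fractional reduction in development cost (0.40 to 0.60 above).
    safety_multiplier: factor applied to safety testing cost (2.0 = doubles).
    """
    if not 0.0 <= safety_share <= 1.0:
        raise ValueError("safety_share must be between 0 and 1")
    dev_share = 1.0 - safety_share
    return dev_share * (1.0 - dev_cut) + safety_share * safety_multiplier


def break_even_safety_share(dev_cut, safety_multiplier=2.0):
    """Safety share at which the new total cost equals the original.

    Solves (1 - s) * dev_cut = s * (safety_multiplier - 1) for s.
    """
    return dev_cut / (dev_cut + safety_multiplier - 1.0)


if __name__ == "__main__":
    for cut in (0.40, 0.60):
        limit = break_even_safety_share(cut)
        print(f"dev cut {cut:.0%}: net saving only if safety share < {limit:.1%}")
```

Under these assumptions, a project that spends 20% of its budget on safety testing and gets a 40% development cut ends up at 88% of its original cost. The saving disappears once safety testing exceeds roughly 29% of the original budget at a 40% cut, or 37.5% at a 60% cut.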
Anthropic, OpenAI, Google DeepMind, and others are investing heavily in automated AI development, using large language models to generate code, model architectures, and training strategies. The goal is to reduce human bottlenecks and allow AI to improve itself at machine speed. In theory, this could slash the cost and time of building specialized models, including those for mental health. In practice, it introduces a new layer of risk: if the AI-building AI is flawed, every downstream model it produces can inherit that flaw.
The mental health sector is especially sensitive. AI chatbots like Woebot and Wysa already handle therapy-related conversations, and more sophisticated models are in development. If those models are built or refined by an automated process, lapses in empathy, undetected bias, or weakened safety guardrails could harm vulnerable users. Anthropic has been vocal about the need for "constitutional AI" (explicit rules embedded in training), but the race to automate may outpace the safeguards.
Key figures include Dario and Daniela Amodei, Anthropic's co-founders, who have publicly cautioned against rushing self-improving AI. Their research on "scalable oversight" suggests that as AI builds AI, humans may lose the ability to fully understand or audit the resulting systems. In mental health, that opacity could be dangerous: a model trained to respond to suicidal ideation might inadvertently reinforce negative patterns if its logic is inscrutable.
Broader implications touch on regulation, trust, and the future of therapy. Informed observers like AI ethicist Timnit Gebru have warned that automated AI development could entrench existing biases, while mental health professionals argue that AI can never replace human empathy. The intersection of these two trends demands careful governance, especially as the US and EU debate AI liability frameworks.
What happens next is critical. Anthropic plans to release more details on its AI-builds-AI benchmarks later this year. If successful, we could see the first fully AI-designed mental health model within 12–18 months. If it fails, the backlash may slow the entire field. For now, the industry is watching—and the stakes for patient safety have never been higher.
Frequently Asked Questions
AI-builds-AI refers to the use of artificial intelligence systems to design, write code for, or train other AI models. It aims to automate the development process, making it faster and less reliant on human engineers. Companies like Anthropic and OpenAI are actively pursuing this approach.
AI-builds-AI could lead to cheaper and faster development of mental health chatbots and diagnostic tools. However, if the automated process introduces biases or safety flaws, those errors could propagate into sensitive therapy settings, potentially harming vulnerable users.
Anthropic is known for its focus on AI safety and its 'constitutional AI' approach, which embeds explicit rules into models. The company is developing a system called 'Ambit' to automate model creation while trying to maintain high safety standards.
Key risks include loss of human oversight, difficulty auditing model behavior, potential reinforcement of biases, and increased speed of deployment without adequate safety testing. These are particularly concerning in high-stakes domains like mental health.
Most experts agree AI will augment rather than replace human therapists. Automated AI development could enhance tools for screening and basic support, but human empathy and judgment remain irreplaceable for complex mental health care.
Original source
www.forbes.com