Oxford, UK – Artificial intelligence chatbots may ace medical licensing exams, but they are not yet reliable sources of health advice for patients, according to a new study published in Nature Medicine on Monday, February 9, 2026.
Researchers from Oxford University and partner institutions found that AI tools such as OpenAI’s GPT‑4o, Meta’s Llama 3, and Cohere’s Command R+ performed no better than traditional internet search engines when participants sought guidance on common health scenarios.
“Despite all the hype, AI just isn’t ready to take on the role of the physician,” said study co-author Rebecca Payne. “Patients need to be aware that asking a large language model about their symptoms can be dangerous, giving wrong diagnoses and failing to recognise when urgent help is needed.”
The Study
Nearly 1,300 UK-based participants were presented with 10 everyday health scenarios, ranging from headaches after drinking to exhaustion in new mothers and symptoms of gallstones.
Participants were randomly assigned to use one of the three AI chatbots or placed in a control group that relied on search engines. Results showed:
- People using chatbots correctly identified their health problem only about one-third of the time.
- Only 45 percent identified the right course of action, such as whether to see a doctor or go to hospital.
- These outcomes were no better than the control group using search engines.
Why the Gap Exists
The researchers attributed the disappointing results to a communication breakdown. Unlike the simulated patient interactions used to benchmark AI, real users often failed to give the chatbots complete information. In other cases, participants misunderstood or ignored the chatbots’ advice.
Growing Use of AI in Health
The study noted that one in six US adults already consults AI chatbots for health information at least once a month, a figure the authors expect to rise.
Bioethicist David Shaw of Maastricht University, who was not involved in the study, warned of the risks:
“This is a very important study as it highlights the real medical risks posed to the public by chatbots. People should only trust medical information from reliable sources, such as the UK’s National Health Service.”
Conclusion
While AI chatbots show promise in structured testing environments, their real-world application in healthcare remains limited. The study underscores the importance of consulting qualified medical professionals and trusted health services rather than relying on AI for diagnosis or urgent medical decisions.
