Researchers discovered that the majority of pediatric cases were misdiagnosed by a chatbot based on a large language model (LLM).
Most Pediatric Cases Are Misdiagnosed Using ChatGPT
ChatGPT version 3.5 made an inaccurate diagnosis in 83 of 100 pediatric case challenges. Writing in JAMA Pediatrics, Joseph Barile, BA, of Cohen Children's Medical Center in New Hyde Park, New York, and colleagues reported that 72 of those diagnoses were outright incorrect, while 11 were clinically related to the correct diagnosis but too broad to be considered correct.
ChatGPT, for example, misdiagnosed a youngster with autism who had a rash and arthralgias. The doctor diagnosed "scurvy," but the chatbot diagnosed "immune thrombocytopenic purpura."
A draining papule on an infant's lateral neck was an example of a scenario in which the chatbot diagnosis was considered to not adequately capture the diagnosis, according to Axios. The doctor diagnosed "branchio-oto-renal syndrome," whereas the chatbot diagnosed "branchial cleft cyst."
"Despite the high error rate of the chatbot, physicians should continue to investigate the applications of LLMs to medicine. LLMs and chatbots have potential as an administrative tool for physicians, demonstrating proficiency in writing research articles and generating patient instructions," Barile and colleagues wrote.
As an example of a correct diagnosis, they presented the case of a 15-year-old girl with unexplained intracranial hypertension. The doctor diagnosed "primary adrenal insufficiency (Addison disease)," whereas the chatbot diagnosed "adrenal insufficiency (Addison disease)."
Study Highlights Limited Diagnostic Accuracy of Chatbots in Pediatric Cases
A previous study indicated that a chatbot correctly diagnosed 39% of cases, implying that LLM-based chatbots "could be used as a supplementary tool for clinicians in diagnosing and developing a differential list for complex cases," according to Barile and colleagues. "To our knowledge, no research has explored the accuracy of LLM-based chatbots in solely pediatric scenarios, which require the consideration of the patient's age alongside symptoms."
"The underwhelming diagnostic performance of the chatbot observed in this study underscores the invaluable role that clinical experience holds," the authors wrote. "The chatbot evaluated in this study -- unlike physicians -- was not able to identify some relationships, such as that between autism and vitamin deficiencies."
"LLMs do not discriminate between reliable and unreliable information but simply regurgitate text from the training data to generate a response," Barile and colleagues noted. They believe that more selective training will be required to increase chatbot diagnosis accuracy.
Barile and colleagues conducted their study by drawing pediatric case challenges from JAMA Pediatrics and the New England Journal of Medicine, as per MedPageToday. Text from 100 cases was entered into ChatGPT version 3.5 along with the prompt, "List a differential diagnosis and a final diagnosis." Two physician researchers graded each chatbot-generated diagnosis as "correct," "incorrect," or "did not fully capture diagnosis."
According to Barile and colleagues, more than half of the false diagnoses provided by the chatbot belonged to the same organ system as the accurate diagnosis. Furthermore, the chatbot-generated differential list included 36% of the final case report diagnoses.
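For readers who want to check the arithmetic behind the headline figure, the following is a minimal sketch in Python rather than the researchers' own analysis. It uses the three grading labels and the counts reported above (72 incorrect, 11 that did not fully capture the diagnosis, out of 100 cases); the 17 correct cases are inferred by subtraction, and the function name `summarize` is hypothetical.

```python
from collections import Counter

# Counts taken from the article: 100 cases, 72 graded "incorrect" and
# 11 graded "did not fully capture diagnosis". The 17 "correct" cases
# are inferred (100 - 72 - 11), not quoted from the study.
grades = (
    ["correct"] * 17
    + ["incorrect"] * 72
    + ["did not fully capture diagnosis"] * 11
)


def summarize(grades):
    """Tally the grading labels and compute the share of misdiagnosed cases."""
    counts = Counter(grades)
    total = len(grades)
    # Any grade other than "correct" counts toward the error rate.
    misdiagnosed = total - counts["correct"]
    return {
        "total": total,
        "counts": dict(counts),
        "misdiagnosed": misdiagnosed,
        "error_rate": misdiagnosed / total,
    }


if __name__ == "__main__":
    summary = summarize(grades)
    print(f"{summary['misdiagnosed']} of {summary['total']} cases misdiagnosed "
          f"({summary['error_rate']:.0%})")
```

Counting both "incorrect" and "did not fully capture diagnosis" as misses is what produces the 83-of-100 figure; counting only the outright incorrect diagnoses would give 72 of 100.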
Photo: Jonathan Kemper/Unsplash





