New trials compare Claude, ChatGPT, and DeepSeek on mammography reports

January 28, 2026

While AI excels at administrative tasks, it still struggles to beat the human eye in complex cancer diagnosis. A comparative study pitted three major language models (ChatGPT-4o, Claude 3 Opus, and DeepSeek-R1) against human radiologists in analyzing mammography reports for breast cancer risk (BI-RADS 4). The results were clear: human radiologists significantly outperformed all three AI models. The AI tools demonstrated high sensitivity, meaning they were good at flagging potential issues, but they suffered from low specificity, which led to a high volume of false alarms. The study concludes that, for now, these tools are best used as "safety net" assistants that ensure nothing is missed, rather than as independent diagnostic agents.
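For readers less familiar with the two metrics, here is a minimal sketch of how sensitivity and specificity are computed from case counts. The counts below are hypothetical and chosen only to illustrate the "high sensitivity, low specificity" pattern; they are not figures from the study, and the function names are our own.

```python
def sensitivity(tp: int, fn: int) -> float:
    """Share of actual cancers that were flagged: TP / (TP + FN)."""
    return tp / (tp + fn)


def specificity(tn: int, fp: int) -> float:
    """Share of benign cases that were correctly cleared: TN / (TN + FP)."""
    return tn / (tn + fp)


# Hypothetical counts for illustration only; NOT taken from the study.
tp, fn = 90, 10  # cancers flagged vs. cancers missed
tn, fp = 40, 60  # benign cases cleared vs. false alarms

print(f"sensitivity = {sensitivity(tp, fn):.2f}")
print(f"specificity = {specificity(tn, fp):.2f}")
```

With these made-up numbers, a tool would catch most cancers while raising a false alarm on more than half of the benign cases, which is the trade-off that makes a "safety net" role more plausible than independent diagnosis.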

Read the original article at: https://medinform.jmir.org/2025/1/e80182
