New trials compare Claude, ChatGPT, and DeepSeek on mammography reports


While AI excels at administrative tasks, it still struggles to beat the human eye in complex cancer diagnosis. A comparative study pitted three major language models (ChatGPT-4o, Claude 3 Opus, and DeepSeek-R1) against human radiologists in analyzing mammography reports for breast cancer risk (BI-RADS category 4). The results were clear: human radiologists significantly outperformed all three AI models. The AI tools demonstrated high sensitivity, meaning they were good at flagging potential issues, but they suffered from low specificity, which led to a high volume of false alarms. The study concludes that, for now, these tools are best used as "safety net" assistants that help ensure nothing is missed, rather than as independent diagnostic agents.

Read the original article at: https://medinform.jmir.org/2025/1/e80182


