A study published in VIEW evaluated how well large multimodal models (LMMs) interpret CT scans for lung cancer diagnosis. The study tested models including GPT-4V, LLaVA-1.5, and BiomedCLIP on a dataset of 100 chest CT images, on tasks such as binary classification and NCCN risk categorization. GPT-4V achieved the highest agreement with expert consensus (75 percent), but all models struggled to assess risk accurately. The findings indicate that current LMMs do not yet match the diagnostic reliability of experts and point to the need for better training approaches and collaboration with radiology professionals.
Which Generative AI Model Is the Best Diagnostician?
Comparative evaluation highlights current limitations and potential of LMMs in chest CT diagnostics
08/11/2025