Study evaluates large multimodal models (LMMs) for diagnosing lung cancer from CT scans.
Models tested include GPT-4V, LLaVA-1.5, and BiomedCLIP.
GPT-4V had the highest agreement, at 75 percent.
Accuracy for NCCN risk assessment was under 50 percent.
LLaVA-1.5 produced the most consistent descriptive results.
Models exhibited hallucinations and alignment issues.
Further improvements are needed, through collaboration with radiology professionals.
