Foundation Models in Medical Imaging: What Radiologists Need to Know

Article Type: Clinical Review | Specialty: Radiology / AI | Estimated Read Time: 12 min | References: 16
Peer Review Status: Expert-reviewed | Last Updated: April 2026
Target Audience: Radiologists, Imaging Informaticists, AI Researchers, Radiology Administrators

🔑 Key Takeaways

Foundation models are large-scale AI systems pre-trained on massive datasets that can be fine-tuned for diverse downstream clinical tasks — unlike traditional single-purpose algorithms.
Aidoc’s rib fracture triage solution, built on its CARE1™ Foundation Model and FDA-cleared in February 2025, is the first foundation model-powered clinical AI device to receive clearance, marking a regulatory milestone for adaptive medical AI.
MedSAM (Segment Anything in Medical Images) enables zero-shot and few-shot segmentation across CT, MRI, X-ray, and ultrasound — dramatically reducing annotation requirements.
BiomedCLIP and similar vision-language models can link medical images with clinical text, enabling natural-language queries of imaging databases.
Key challenges: hallucination risk in generative models, limited clinical validation, regulatory uncertainty for adaptive systems, and computational infrastructure requirements.

Background

The first decade of AI in radiology (2012–2022) was dominated by narrow, task-specific deep learning models — algorithms trained on labeled datasets to perform a single clinical function, such as detecting pulmonary nodules on CT or identifying intracranial hemorrhage on head CT. While these models have achieved impressive accuracy and now comprise the majority of over 1,100 FDA-cleared radiology AI devices, each requires its own dedicated training pipeline, curated dataset, and regulatory submission [1].

Foundation models represent a paradigm shift. Inspired by the success of large language models such as GPT-4 and Claude in natural language processing, medical foundation models are pre-trained on vast, heterogeneous datasets spanning millions of images across modalities, anatomies, and pathologies, and are then fine-tuned or prompted for specific clinical tasks with minimal additional data [2]. This “train once, deploy many” approach promises to accelerate the development of new AI applications, reduce the data requirements for rare conditions, and enable more flexible, adaptive clinical tools. This review examines the current landscape of foundation models in medical imaging, their clinical applications, regulatory implications, and the challenges that must be addressed before they can be widely deployed in radiology practice.

What Makes a Foundation Model Different?

Traditional radiology AI and foundation models differ fundamentally in their architecture, training approach, and deployment flexibility:

Table 1. Traditional AI vs. Foundation Models in Radiology

Characteristic	Traditional AI (Narrow)	Foundation Model
Training data	Thousands to tens of thousands of labeled images for one task	Millions of images (often unlabeled or weakly labeled) across modalities
Task scope	Single task (e.g., detect PE on CT)	Multiple tasks via fine-tuning or prompting
Adaptability	Retrain from scratch for new tasks	Fine-tune with minimal data or zero-shot transfer
Modalities	Typically one (CT, X-ray, mammography)	Cross-modality (CT, MRI, X-ray, ultrasound, pathology)
FDA pathway	510(k) per individual device	Emerging: PCCP for adaptive algorithms; CARE1™ first precedent
Examples	Viz.ai LVO detection, qXR chest X-ray	CARE1™ (Aidoc), MedSAM, BiomedCLIP, RAD-DINO

LVO = large vessel occlusion; PCCP = Predetermined Change Control Plan; PE = pulmonary embolism. Sources: [1, 2, 3].

Key Foundation Models in Medical Imaging

CARE1™ (Aidoc): First FDA-Cleared Foundation Model

In February 2025, Aidoc received FDA clearance for a rib fracture triage solution built on its CARE1™ Foundation Model — marking the first time a foundation model-powered clinical AI device received regulatory authorization anywhere in the world [3]. CARE1™ was pre-trained on Aidoc’s proprietary dataset of millions of medical images and then fine-tuned for the specific task of rib fracture detection and triage. The regulatory significance is substantial: it establishes a precedent for how foundation model-based devices can navigate the FDA clearance pathway, potentially through Predetermined Change Control Plans (PCCPs) that allow the model to be fine-tuned for new clinical tasks post-market within pre-approved boundaries [1].

MedSAM: Segment Anything in Medical Images

MedSAM, adapted from Meta’s Segment Anything Model (SAM), is a foundation model for universal medical image segmentation. Trained on over 1.5 million image-mask pairs across 10 imaging modalities and 30 cancer types, MedSAM enables zero-shot and few-shot segmentation of anatomical structures and lesions across CT, MRI, X-ray, ultrasound, endoscopy, and microscopy — without requiring modality-specific training [4]. For radiologists, this means that organ delineation, tumor contouring, and treatment planning tasks that previously required hours of manual annotation or modality-specific AI tools can potentially be accomplished with a single, general-purpose model.
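When a general-purpose segmentation model is checked against manual contours, the Dice coefficient is one commonly used overlap measure. The sketch below is illustrative only: the two small masks are made-up toy data, and `dice_coefficient` is a hypothetical helper, not part of MedSAM or any cited tool.

```python
def dice_coefficient(mask_a, mask_b):
    """Dice overlap between two binary masks given as equal-sized 2D lists of 0/1."""
    same_rows = len(mask_a) == len(mask_b)
    if not same_rows or any(len(r) != len(s) for r, s in zip(mask_a, mask_b)):
        raise ValueError("masks must have the same shape")
    # Count pixels labeled 1 in both masks.
    intersection = sum(
        a & b
        for row_a, row_b in zip(mask_a, mask_b)
        for a, b in zip(row_a, row_b)
    )
    size_a = sum(map(sum, mask_a))
    size_b = sum(map(sum, mask_b))
    if size_a + size_b == 0:
        return 1.0  # both masks empty: treat as perfect agreement
    return 2.0 * intersection / (size_a + size_b)


# Toy example (hypothetical data): a manual contour vs. a model-generated one.
manual_mask = [
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]
model_mask = [
    [0, 1, 1, 1],
    [0, 1, 0, 0],
    [0, 0, 0, 0],
]

score = dice_coefficient(manual_mask, model_mask)
print(f"Dice = {score:.2f}")  # 3 shared pixels, 4 + 4 total -> 0.75
```

A Dice of 1.0 means identical masks and 0.0 means no overlap; what counts as acceptable agreement depends on the structure and the clinical use.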

BiomedCLIP and Vision-Language Models

BiomedCLIP, developed by Microsoft Research, is a vision-language foundation model trained on 15 million biomedical image-text pairs from PubMed Central. It links medical images with natural-language descriptions, enabling image retrieval from text queries (“find all CT scans showing ground-glass opacities in the right lower lobe”) and zero-shot image classification [5]. Related generative vision-language models are being explored for automated preliminary report generation from imaging findings [7]. While not yet FDA-cleared for clinical use, vision-language models represent a potential bridge between imaging AI and clinical decision support, allowing radiologists to interact with AI systems using natural language rather than predefined menu options.
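Zero-shot classification with a contrastive vision-language model generally works by comparing an image embedding with the embeddings of candidate text prompts. The following is a minimal sketch of that comparison step only. The three-dimensional vectors are invented toy values standing in for real encoder outputs, and the function names and the temperature value are assumptions for illustration, not BiomedCLIP's actual API.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def softmax(scores, temperature=1.0):
    """Convert similarity scores into probabilities that sum to 1."""
    scaled = [s / temperature for s in scores]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def zero_shot_classify(image_embedding, prompt_embeddings, temperature=0.1):
    """Score one image against every text prompt; no task-specific training."""
    labels = list(prompt_embeddings)
    sims = [cosine(image_embedding, prompt_embeddings[label]) for label in labels]
    return dict(zip(labels, softmax(sims, temperature)))


# Hypothetical toy embeddings; a real model would produce these from
# its text encoder and image encoder respectively.
prompt_embeddings = {
    "ground-glass opacity": [0.9, 0.1, 0.0],
    "pleural effusion": [0.1, 0.9, 0.1],
    "normal chest CT": [0.0, 0.2, 0.9],
}
image_embedding = [0.8, 0.2, 0.1]

probabilities = zero_shot_classify(image_embedding, prompt_embeddings)
best_label = max(probabilities, key=probabilities.get)
print(best_label, round(probabilities[best_label], 3))
```

Text-to-image retrieval uses the same similarity in the other direction: one query embedding is compared against many stored image embeddings, and the images are ranked by score.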

RAD-DINO and Self-Supervised Learning

RAD-DINO and similar self-supervised vision transformers are trained on large unlabeled imaging datasets to learn general visual representations that transfer effectively to downstream tasks with minimal fine-tuning. These models are particularly valuable for rare diseases and uncommon imaging findings where labeled training data is scarce — a persistent bottleneck for traditional supervised learning approaches [6].
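To illustrate why good pre-trained representations reduce labeling needs, the sketch below fits a nearest-centroid classifier on a handful of labeled examples, using embeddings that are assumed to come from a frozen encoder. This is a deliberately simple stand-in for a linear probe, not the method used by RAD-DINO, and the two-dimensional embeddings and labels are hypothetical toy values.

```python
def fit_centroids(embeddings, labels):
    """Average the frozen-encoder embeddings of each class (few-shot 'training')."""
    sums, counts = {}, {}
    for vec, label in zip(embeddings, labels):
        if label not in sums:
            sums[label] = [0.0] * len(vec)
            counts[label] = 0
        sums[label] = [s + x for s, x in zip(sums[label], vec)]
        counts[label] += 1
    return {
        label: [s / counts[label] for s in total]
        for label, total in sums.items()
    }


def predict(centroids, vec):
    """Assign the class whose centroid is closest in squared Euclidean distance."""
    def squared_distance(center):
        return sum((a - b) ** 2 for a, b in zip(center, vec))

    return min(centroids, key=lambda label: squared_distance(centroids[label]))


# Hypothetical embeddings for four labeled studies (two per class).
support_embeddings = [[0.9, 0.1], [0.8, 0.2], [0.1, 0.9], [0.2, 0.7]]
support_labels = [
    "finding present",
    "finding present",
    "finding absent",
    "finding absent",
]

centroids = fit_centroids(support_embeddings, support_labels)
prediction = predict(centroids, [0.7, 0.3])  # embedding of a new, unlabeled study
print(prediction)
```

Only the class centroids are computed from labeled data; the encoder itself is never updated, which is why so few labeled examples are needed in this setup.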

Figure 1. Foundation Model Workflow: From Pre-Training to Clinical Deployment

Pre-train: Millions of images; multiple modalities; self-supervised or weakly supervised.
Fine-tune: Small labeled dataset; task-specific (rib fracture, lung nodule, tumor segmentation).
Validate + Clear: Clinical validation; FDA/CE submission; PCCP for future task expansions.
Deploy: PACS integration; multiple clinical tasks; continuous monitoring; periodic re-fine-tuning.

The key advantage: one pre-trained model → many fine-tuned clinical applications, potentially reducing development time from years to weeks per task.

Clinical Applications: Present and Future

Foundation models are already impacting radiology in several domains, with many more applications emerging:

Universal segmentation: MedSAM and similar models can contour organs, tumors, and anatomical structures across modalities with minimal manual input — accelerating radiation therapy planning, quantitative imaging analysis, and surgical navigation [4].
Multi-task triage: CARE1™ demonstrates that a single foundation model can power multiple triage algorithms (rib fracture today, potentially PE, stroke, and spine fractures tomorrow) from the same pre-trained base — reducing deployment complexity for hospitals [3].
Report generation: Vision-language models are being explored for automated generation of preliminary radiology reports, structured findings extraction, and report quality assurance — though no clinical-grade reporting tool has yet received FDA clearance [7].
Cross-modal transfer: Models pre-trained on CT data can transfer learned representations to MRI or X-ray tasks, enabling AI applications in resource-limited settings where labeled training data for specific modalities is unavailable [6].
Rare disease detection: Self-supervised foundation models may help detect rare imaging findings (unusual tumor subtypes, genetic syndromes with imaging manifestations) by leveraging broad visual knowledge learned during pre-training, which is difficult for traditional models limited to their specific training distribution [8].

Challenges and Limitations

Hallucination risk: Generative AI models (including vision-language models) can produce confident but incorrect outputs — a particularly dangerous failure mode in medical imaging where false findings or fabricated report text could directly harm patients [7].
Clinical validation gap: While foundation models demonstrate impressive technical performance on research benchmarks, prospective clinical validation in real-world radiology workflows is limited. Most current evidence comes from retrospective studies or simulated clinical scenarios [9].
Regulatory uncertainty: The FDA is actively developing frameworks for evaluating foundation model-based devices, including guidance on PCCP pathways for adaptive algorithms. The EU AI Act (obligations phasing in from August 2026 through 2027) will classify most clinical imaging AI, including foundation models, as “high-risk,” requiring documented training data provenance, bias audits, and human oversight [1, 10].
Computational infrastructure: Foundation models require substantial GPU resources for inference, potentially limiting deployment in community hospitals and resource-limited settings. Cloud-based deployment can address compute needs but raises data privacy concerns [11].
Bias amplification: Foundation models trained on biased datasets can amplify demographic disparities across all downstream tasks — a particularly concerning risk given the scale at which these models operate. Bias audits must evaluate performance across race, sex, age, and imaging equipment vendor [12].
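A subgroup bias audit of the kind described above can be sketched as computing the same metric per subgroup and flagging large gaps. The example below does this for sensitivity by scanner vendor. The case records, the vendor names, and the 0.05 gap threshold are all hypothetical choices for illustration; a real audit would use locally agreed metrics, thresholds, and confidence intervals.

```python
from collections import defaultdict


def sensitivity_by_group(cases, group_key):
    """Sensitivity (TP / (TP + FN)) per subgroup, using positive cases only."""
    true_pos = defaultdict(int)
    false_neg = defaultdict(int)
    for case in cases:
        if not case["truth"]:
            continue  # sensitivity ignores disease-negative cases
        if case["pred"]:
            true_pos[case[group_key]] += 1
        else:
            false_neg[case[group_key]] += 1
    groups = set(true_pos) | set(false_neg)
    return {g: true_pos[g] / (true_pos[g] + false_neg[g]) for g in groups}


def flag_gaps(per_group, max_gap=0.05):
    """List subgroups trailing the best-performing subgroup by more than max_gap."""
    best = max(per_group.values())
    return sorted(g for g, value in per_group.items() if best - value > max_gap)


# Hypothetical audit data: 'truth' is the reference standard, 'pred' the AI output.
cases = (
    [{"truth": True, "pred": True, "vendor": "Vendor A"}] * 9
    + [{"truth": True, "pred": False, "vendor": "Vendor A"}] * 1
    + [{"truth": True, "pred": True, "vendor": "Vendor B"}] * 7
    + [{"truth": True, "pred": False, "vendor": "Vendor B"}] * 3
    + [{"truth": False, "pred": False, "vendor": "Vendor A"}] * 5
)

per_vendor = sensitivity_by_group(cases, "vendor")
flagged = flag_gaps(per_vendor)
print(per_vendor, flagged)
```

The same function can be rerun with `group_key` set to sex, age band, or race, provided those fields are present in the audit records.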

What Radiologists Need to Know

Figure 2. Foundation Models: What Radiologists Need to Know

Understand: Foundation ≠ infallible

Foundation models are more flexible than traditional AI, but they can still make errors, hallucinate findings, and perform poorly on edge cases. Human oversight remains essential for all clinical outputs.

Demand: FDA clearance + local validation

Any foundation model deployed clinically should have FDA clearance (or equivalent) for its specific application. Request validation data on your scanner types, patient demographics, and clinical protocols before deployment.

Participate: AI governance and oversight

Radiologists should serve on institutional AI committees that evaluate, deploy, and monitor foundation model-based tools. Continuous performance monitoring, quarterly audits, and bias detection are essential governance activities.

Prepare: New competencies needed

Understanding foundation model concepts (pre-training, fine-tuning, zero-shot transfer) is becoming a core competency for radiologists. Engage with professional society AI education (RSNA, ESR, SCCT) and institutional informatics training.
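For the local validation step, a site might summarize a retrospective test set as sensitivity and specificity with confidence intervals, since point estimates alone can mislead on small samples. The sketch below uses the Wilson score interval. The counts are hypothetical, and `local_validation` is an illustrative helper rather than part of any vendor tool.

```python
import math


def wilson_interval(successes, total, z=1.96):
    """Wilson score interval for a proportion (z=1.96 gives roughly 95% coverage)."""
    if total <= 0:
        raise ValueError("total must be positive")
    p = successes / total
    denom = 1 + z * z / total
    center = (p + z * z / (2 * total)) / denom
    half_width = (
        z * math.sqrt(p * (1 - p) / total + z * z / (4 * total * total)) / denom
    )
    return max(0.0, center - half_width), min(1.0, center + half_width)


def local_validation(tp, fp, tn, fn):
    """Sensitivity and specificity, each paired with its Wilson interval."""
    return {
        "sensitivity": (tp / (tp + fn), wilson_interval(tp, tp + fn)),
        "specificity": (tn / (tn + fp), wilson_interval(tn, tn + fp)),
    }


# Hypothetical confusion-matrix counts from a local retrospective sample.
report = local_validation(tp=45, fp=10, tn=190, fn=5)
for metric, (estimate, (low, high)) in report.items():
    print(f"{metric}: {estimate:.2f} (95% CI {low:.2f} to {high:.2f})")
```

Repeating the same calculation on each quarter's cases gives a simple basis for the ongoing monitoring and audits mentioned above.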

Future Directions

Development of foundation models in radiology is accelerating. Multimodal models that integrate imaging with clinical notes, lab values, pathology reports, and genomic data are emerging, potentially enabling holistic patient-level risk assessment from routine clinical data [13]. A 2026 publication in Nature Health described a foundation model for both breast and lung cancer screening using non-contrast CT, demonstrating the feasibility of cross-organ AI screening from a single model [14]. The RSNA 2026 “Radiology Reimagined” initiative showcases agentic AI workflows in which multiple foundation models coordinate autonomously to process imaging, clinical data, and decision support in real time [15].

However, the gap between technical capability and clinical deployment of AI workflows remains wide. Prospective clinical trials demonstrating that foundation model-based tools improve patient outcomes — not just diagnostic accuracy metrics — are essential before these systems can be considered standard of care [9, 16].

Clinical Implications

Foundation models represent the next evolutionary step for AI in radiology — moving from hundreds of single-purpose algorithms to a smaller number of versatile, adaptive systems that can be rapidly deployed across clinical tasks. The FDA clearance of Aidoc’s CARE1™ establishes a regulatory pathway, and models like MedSAM and BiomedCLIP demonstrate the technical feasibility of universal segmentation and vision-language integration. For practicing radiologists, the key action items are to develop foundational AI literacy, participate in institutional AI governance, demand rigorous validation before clinical deployment, and maintain human oversight authority over all AI-generated outputs — particularly those from generative models that carry hallucination risk.

References

1. Innolitics. 2025 Year in Review: AI/ML Medical Device 510(k) Clearances. Published December 2025. innolitics.com
2. Moor M, et al. Foundation models for generalist medical artificial intelligence. Nature. 2023;616(7956):259-265. doi:10.1038/s41586-023-05881-4
3. Aidoc Medical. CARE1™ Foundation Model: first FDA-cleared foundation model in clinical AI. Press release, February 2025.
4. Ma J, et al. Segment anything in medical images. Nat Commun. 2024;15(1):654. doi:10.1038/s41467-024-44824-z
5. Zhang S, et al. BiomedCLIP: a multimodal biomedical foundation model pretrained from 15 million scientific image-text pairs. Published by Microsoft Research, 2024. arxiv.org/abs/2303.00915
6. Pérez-García F, et al. RAD-DINO: self-supervised visual representation learning for radiology. Med Image Anal. 2024;95:103168. doi:10.1016/j.media.2024.103168
7. Bhayana R, et al. GPT-4V in radiology: performance assessment on imaging interpretation tasks. Radiology. 2024;310(1):e232471. doi:10.1148/radiol.232471
8. Azizi S, et al. Robust and data-efficient generalization of self-supervised machine learning for diagnostic imaging. Nat Biomed Eng. 2023;7(6):756-779. doi:10.1038/s41551-023-01049-7
9. Windecker D, et al. Generalizability of FDA-approved AI-enabled medical devices for clinical use. JAMA Netw Open. 2025;8(4):e258052. doi:10.1001/jamanetworkopen.2025.8052
10. EU AI Act. Regulation (EU) 2024/1689 laying down harmonised rules on artificial intelligence. Official Journal of the European Union. 2024.
11. Tu T, et al. Towards generalist biomedical AI (Med-PaLM). Nature. 2024;627(8005):390-398. doi:10.1038/s41586-023-06291-2
12. Yu F, et al. Heterogeneity and predictors of the effects of AI assistance on radiologists. Nat Med. 2024;30(3):837-849. doi:10.1038/s41591-024-02850-w
13. Huang S, et al. A visual-language foundation model for pathology image analysis using medical Twitter. Nat Med. 2023;29(9):2307-2316. doi:10.1038/s41591-023-02504-3
14. Qian X, et al. A foundation model for breast and lung cancer screening using non-contrast computed tomography. Nat Health. 2026;advance online. doi:10.1038/s44360-026-00055-8
15. RSNA. Radiology Reimagined: AI, Innovation and Interoperability. RSNA 2026 demonstration. rsna.org
16. Sivakumar R, et al. FDA approval of AI/ML devices in radiology: a systematic review. JAMA Netw Open. 2025;8(11):e2542338. doi:10.1001/jamanetworkopen.2025.42338

Disclaimer: This article is intended for healthcare professionals and is provided for educational purposes only. It does not constitute medical advice. Clinical decisions should be based on individual patient assessment and current clinical guidelines. MedTrainHub content is AI-researched and expert-reviewed; however, readers should verify key findings against primary sources before applying them in clinical practice.

Conflicts of Interest: None declared.
Funding: This article received no external funding.
Citation: MedTrainHub Editorial Team. Foundation Models in Medical Imaging: What Radiologists Need to Know. MedTrainHub.com. Published April 2026. Available at: https://medtrainhub.com/articles/radiology/foundation-models-medical-imaging
