Abstract: In recent years, artificial intelligence in medical image analysis has been undergoing a paradigm shift from "task-specific models" to "foundation models". Traditional single-task models rely heavily on expert annotations and lack cross-task generalization capabilities. In contrast, Large Medical Imaging Models (LMIMs), pre-trained on massive multimodal data via self-supervised learning, can be adapted to a wide range of downstream tasks with only minimal fine-tuning, representing a critical pathway toward artificial general intelligence in healthcare. This article systematically reviews the latest research progress on LMIMs. First, we categorize existing models into three main classes: vision foundation models, vision-language large models, and generalist and agent models. Second, we provide an in-depth analysis of the core architectures (such as large-kernel Convolutional Neural Networks, Vision Transformers, and their hybrid architectures) and pre-training paradigms, such as contrastive learning and masked modeling. Finally, we discuss the practical challenges of data construction and cross-center generalization, highlight the clinical application potential of LMIMs in major disease areas such as oncology, and provide perspectives on overcoming deployment bottlenecks by integrating technologies like causal inference and retrieval-augmented generation. In summary, LMIMs represent a significant milestone in the development of medical artificial intelligence, holding the promise of profoundly transforming diagnostic workflows, improving the quality and efficiency of clinical care, and ultimately benefiting global patient health.