image: DeepSeek-R1 demonstrates strong performance across multiple educational and reasoning benchmarks. It achieves 79.8% Pass@1 on AIME 2024, slightly surpassing OpenAI-o1-1217. On MATH-500, it reaches 97.3%, matching OpenAI-o1-1217 and outperforming other models. It also improves significantly on DeepSeek-V3, scoring 90.8% on MMLU, 84.0% on MMLU-Pro, and 71.5% on GPQA Diamond. Although it performs slightly below OpenAI-o1-1217 on some benchmarks, it outperforms other closed-source models, highlighting its strength in educational tasks. The comparison data in the figure are taken from Figure 1 of the DeepSeek paper [4], which reports the benchmark performance of DeepSeek-R1.
Credit: Hao Chen, the corresponding author
A joint research team from The Hong Kong University of Science and Technology and The Hong Kong University of Science and Technology (Guangzhou) has published a perspective article in MedComm – Future Medicine, titled “Large Language Models for Transforming Healthcare: A Perspective on DeepSeek-R1”. The article comprehensively evaluates DeepSeek-R1, a Chinese-developed open-source large language model (LLM), and its potential to transform the healthcare landscape.
Since its release by DeepSeek in January 2025, DeepSeek-R1 has attracted wide attention in the medical field for its strong reasoning abilities, cost efficiency, and transparency. Unlike closed-source reasoning models such as OpenAI o1, DeepSeek-R1 is open source, which gives healthcare institutions the flexibility to deploy AI systems locally while protecting data privacy. For example, Nanfang Hospital of Southern Medical University and primary care clinics in Inner Mongolia have already begun deploying DeepSeek-R1 locally to improve healthcare delivery.
The article highlights how DeepSeek-R1 can enhance clinical workflows. The model supports diagnostic reasoning, treatment planning, and risk assessment by providing clinicians with transparent reasoning chains and structured decision-making paths. At The University of Hong Kong-Shenzhen Hospital, it has been used in practice to assist with medical record analysis and treatment recommendations.
In addition to clinical support, DeepSeek-R1 shows promise in patient engagement and medical education. The model has been used by Shenzhen University-affiliated South China Hospital to generate personalized treatment guidance, improving patient adherence. It has also been applied by Qilu Hospital of Shandong University to create large-scale training materials and interactive educational cases for medical students.
Despite these advances, the article acknowledges key challenges that remain for DeepSeek-R1’s clinical integration. These include the model’s current limitation to text-only data, risks of hallucinated outputs, and the need to balance AI-driven safety recommendations with patient autonomy. The authors call for further research into multimodal capabilities and enhanced retrieval-augmented generation methods to address these issues.
The paper concludes that while DeepSeek-R1 has not yet reached its full potential, it marks a significant step toward reliable and equitable AI-driven healthcare solutions. The authors emphasize that continued efforts in technical refinement and ethical governance will be critical for the safe and effective integration of large language models into healthcare systems globally.
See the article:
Large Language Models for Transforming Healthcare: A Perspective on DeepSeek-R1
https://onlinelibrary.wiley.com/doi/10.1002/mef2.70021
Journal: MedComm – Future Medicine
Article Title: Large Language Models for Transforming Healthcare: A Perspective on DeepSeek-R1
Article Publication Date: 12-May-2025