News Release

International study reveals sex and age biases in AI models for skin disease diagnosis

Peer-Reviewed Publication

Health Data Science

Uncovering Demographic Bias in AI Models Evaluating Skin Conditions from Clinical Images

image: 

Scientists from ShanghaiTech University compared the performance of large language models (LLMs) such as ChatGPT-4 and LLaVA in diagnosing skin diseases in male and female patients across different age groups. The findings point to potential biases across age and sex groups that must be addressed before clinical deployment.

Credit: Zhiyu Wan, Health Information Safety and Intelligence Research Lab, ShanghaiTech University (generated with the help of ChatGPT-4o)

An international research team led by Assistant Professor Zhiyu Wan of ShanghaiTech University has recently published findings in the journal Health Data Science that highlight biases in how multimodal large language models (LLMs) such as ChatGPT-4 and LLaVA diagnose skin diseases from medical images. The study systematically evaluated these AI models across sex and age groups.

The study used approximately 10,000 dermatoscopic images and focused on three common skin conditions: melanoma, melanocytic nevi, and benign keratosis-like lesions. Results revealed that while ChatGPT-4 and LLaVA outperformed most traditional deep learning models overall, ChatGPT-4 showed greater fairness across demographic groups, whereas LLaVA exhibited significant sex-related biases.
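For readers who want a sense of what comparing a model across demographic groups can involve, the following is a minimal sketch in Python. It assumes one simple approach: computing diagnostic accuracy separately for each group and reporting the largest gap between groups. The release does not state which fairness metrics the study used, so the function names, the metric, and the toy records below are illustrative assumptions, not the authors' method or data.

```python
from collections import defaultdict


def accuracy_by_group(records):
    """Return {group: accuracy} for (group, true_label, predicted_label) records."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for group, truth, pred in records:
        total[group] += 1
        correct[group] += int(truth == pred)
    return {g: correct[g] / total[g] for g in total}


def accuracy_gap(per_group):
    """Largest accuracy difference between any two groups (0 means parity)."""
    values = list(per_group.values())
    return max(values) - min(values)


# Made-up toy records for illustration only, NOT data from the study.
toy_records = [
    ("female", "melanoma", "melanoma"),
    ("female", "melanocytic nevi", "melanocytic nevi"),
    ("female", "benign keratosis-like lesion", "melanoma"),
    ("female", "melanoma", "melanoma"),
    ("male", "melanoma", "melanocytic nevi"),
    ("male", "melanocytic nevi", "melanocytic nevi"),
    ("male", "benign keratosis-like lesion", "melanoma"),
    ("male", "melanoma", "melanoma"),
]

per_group = accuracy_by_group(toy_records)
gap = accuracy_gap(per_group)
print(per_group)
print(gap)
```

A gap near zero would suggest similar accuracy for both groups on this metric, while a large gap would flag a disparity worth investigating. The same grouping could be applied to age bands instead of sex.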

Dr. Wan emphasized, “While large language models like ChatGPT-4 and LLaVA demonstrate clear potential in dermatology, we must address the observed biases, particularly across sex and age groups, to ensure these technologies are safe and effective for all patients.”

The team plans further research incorporating additional demographic variables like skin tone to comprehensively evaluate the fairness and reliability of AI models in clinical scenarios. This research provides critical guidance for developing more equitable and trustworthy medical AI systems.

