This week, PLOS Medicine launches our Special Issue on Machine Learning in Health and Biomedicine. The issue is Guest Edited by Atul Butte, Priscilla Chan and Mark Zuckerberg Distinguished Professor and Director of the Institute for Computational Health Sciences at UCSF; Suchi Saria, John C. Malone Assistant Professor in the Department of Computer Science, Statistics, and Health Policy at Johns Hopkins University and Research Director of the Malone Center for Engineering in Healthcare; and Aziz Sheikh, Professor of Primary Care Research & Development and Director of the Usher Institute of Population Health Sciences and Informatics, The University of Edinburgh.
In the Special Issue's first featured study, Eric Karl Oermann of the Department of Neurological Surgery, Icahn School of Medicine, New York, and his colleagues trained a convolutional neural network (CNN) to detect signs of pneumonia in chest X-rays from 3 large U.S. hospital systems, individually and in combination, and validated the trained models' performance both internally (with held-out test data) and externally (using test data from a different hospital system).
Overall, the researchers found that the CNN models showed a significant reduction in accuracy when tested on data from outside the training set. For example, the CNN trained on data from Mount Sinai Hospital (MSH; 42,396 radiographs from 12,904 patients) had an area under the receiver operating characteristic curve (AUC) of 0.802 (95% confidence interval 0.793-0.812) in held-out internal validation; an AUC of 0.717 (95% CI 0.687-0.746) in data from the National Institutes of Health Clinical Center (NIH; 112,120 radiographs from 30,805 patients); and an AUC of 0.756 (95% CI 0.674-0.838) in data from Indiana University Network for Patient Care (IU; 3,807 radiographs from 3,683 patients).
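The evaluation scheme behind these numbers — an internal held-out test set versus an external test set from a different institution — can be sketched with scikit-learn on synthetic data. This is purely illustrative, not the study's pipeline: the study used a CNN on radiographs, while here a logistic regression on made-up features stands in, with one feature acting as a site-specific "shortcut" (for example, a scanner or department marker correlated with the outcome at the training site only) to show how internal validation can overstate real-world performance.

```python
# Illustrative sketch of internal vs. external validation (synthetic data).
# Feature 3 is a site-specific confounder: it tracks the label at the
# training site but is pure noise at the external site.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

def make_site(n, confounded):
    """Simulate one hospital system's data."""
    y = (rng.random(n) < 0.5).astype(int)
    X = rng.normal(size=(n, 4))
    X[:, :3] += y[:, None] * 0.8          # genuine disease signal, present at every site
    if confounded:
        X[:, 3] += y * 2.0                # strong shortcut, present only at this site
    return X, y

X_a, y_a = make_site(5000, confounded=True)    # "training" hospital system
X_b, y_b = make_site(2000, confounded=False)   # external hospital system

X_tr, X_int, y_tr, y_int = train_test_split(X_a, y_a, test_size=0.2, random_state=0)
clf = LogisticRegression().fit(X_tr, y_tr)     # model leans heavily on the shortcut

auc_internal = roc_auc_score(y_int, clf.predict_proba(X_int)[:, 1])
auc_external = roc_auc_score(y_b, clf.predict_proba(X_b)[:, 1])
print(f"internal AUC: {auc_internal:.3f}, external AUC: {auc_external:.3f}")
```

Because the external site lacks the shortcut feature, the model's score there carries extra noise, and the external AUC comes out lower than the internal one — the same qualitative gap the study reports.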
Due to limitations of the datasets, and because the features and interrelationships by which CNNs predict outcomes are not easily reduced to simpler, familiar terms, the authors cannot fully assess what factors other than disease prevalence might have led to reduced performance in external validation; hospital system and department characteristics may have contributed to confounding. Nonetheless, the study provides evidence that estimates of real-world CNN performance based on held-out internal test data can be overly optimistic.
In a second featured study from the Special Issue, Yizhi Liu of Sun Yat-sen University, Guangzhou, and colleagues report that a trained Random Forest model can estimate a future diagnosis of high myopia among Chinese school-aged children as early as 8 years in advance.
Myopia is the most prevalent eye disease among school-aged children in China. Myopia can progress to high myopia, which carries increased risk of ocular complications including retinal detachment, glaucoma, and pathological myopia. Preventive approaches for slowing myopia progression exist, but have known side effects. To aid in identifying children at elevated risk of future high myopia, Liu and colleagues leveraged machine learning and data from ophthalmic service providers in China to create a prediction model.
For their study, the researchers extracted clinical refraction data from electronic medical records (EMR) for 129,242 individuals aged 6 to 20 years who visited 8 ophthalmic centers between January 1, 2005 and December 30, 2015. They used age, spherical equivalent (SE), and past annual progression rate (APR) to train a Random Forest algorithm to predict SE and onset of high myopia (SE ≤ −6.0 diopters) up to 10 years in the future. Model training used a single center (517,949 records from 92,484 individuals), and the remaining 7 centers were used for external validation (169,114 records from 36,758 individuals). The researchers obtained additional validation using data from 2 Chinese population-based cohorts (17,113 records from 3,215 participants).
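The approach can be sketched in a few lines of scikit-learn. Everything below is an illustrative stand-in, not the study's code: the records are synthetic, the label is generated by a toy progression rule, and the hyperparameters are arbitrary. It only shows the shape of the task — a random forest mapping age, SE, and APR at one visit to high-myopia status (SE ≤ −6.0 D) at a fixed future horizon.

```python
# Illustrative sketch (synthetic data, not the study's EMR records):
# a random forest predicting high myopia (SE <= -6.0 D) 8 years ahead
# from age, current spherical equivalent (SE), and annual progression rate (APR).
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(42)
n = 10_000
age = rng.uniform(6, 20, n)                        # years at current visit
se = rng.normal(-1.5, 2.0, n)                      # current SE in diopters
apr = np.clip(rng.normal(-0.4, 0.3, n), -1.5, 0)   # annual SE change (D/year)

horizon = 8  # predict 8 years ahead
future_se = se + apr * horizon + rng.normal(0, 0.5, n)  # toy progression model
y = (future_se <= -6.0).astype(int)                # high myopia at the horizon

X = np.column_stack([age, se, apr])
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)

model = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_tr, y_tr)
auc = roc_auc_score(y_te, model.predict_proba(X_te)[:, 1])
print(f"8-year test AUC on synthetic data: {auc:.3f}")
```

In the study itself, training and testing used records from different ophthalmic centers (external validation) rather than a random split of one dataset, which is a stricter test of generalization.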
Liu and colleagues' trained algorithm predicted the onset of high myopia over 10 years with clinically acceptable performance in external EMR validation (AUC ranging from 0.874 to 0.976 for 3 years, 0.847 to 0.921 for 5 years, and 0.802 to 0.886 for 8 years). The algorithm also predicted development of high myopia by 18 years of age, as a surrogate of high myopia in adulthood, with clinically acceptable performance in external EMR validation over 3 years (AUC ranging from 0.940 to 0.985), 5 years (0.856 to 0.901), and 8 years (0.801 to 0.837).
As expected, the performance of Liu and colleagues' prediction model declines as the prediction horizon lengthens; additionally, practitioners may disagree on what constitutes clinically acceptable performance and accuracy. However, in this study, a clinically interpretable prediction model achieved AUCs in the 0.80-0.90 range for predictions up to 8 years ahead in several external validation analyses. The authors state, "[t]his work proposes a novel direction for the use of medical big data mining to transform clinical practice and guide health policy-making and precise individualized interventions."
Research Article - Oermann et al
The Department of Radiology at the Icahn School of Medicine at Mount Sinai (http://icahn.
I have read the journal's policy and the authors of this manuscript have the following competing interests: MAB and ML are currently employees at Verily Life Sciences, which played no role in the research and has no commercial interest in it. EKO and ABC receive funding from Intel for unrelated work.
Zech JR, Badgeley MA, Liu M, Costa AB, Titano JJ, Oermann EK (2018) Variable generalization performance of a deep learning model to detect pneumonia in chest radiographs: A cross-sectional study. PLoS Med 15(11): e1002683. https:/
Image Credit: StockSnap, Pixabay
Department of Medicine, California Pacific Medical Center, San Francisco, California, United States of America
Verily Life Sciences, South San Francisco, California, United States of America
Department of Neurological Surgery, Icahn School of Medicine, New York, New York, United States of America
Department of Radiology, Icahn School of Medicine, New York, New York, United States of America
In your coverage please use this URL to provide access to the freely available paper:
Research Article - Liu et al
This study was funded by the National Key R&D Program of China (2018YFC0116500), the National Natural Science Foundation of China (91546101, 81822010), the Guangdong Science and Technology Innovation Leading Talents (2017TX04R031), and Youth Pearl River Scholar in Guangdong (2016). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
The authors have no conflicts of interest to declare.
Lin H, Long E, Ding X, Diao H, Chen Z, Liu R, et al. (2018) Prediction of myopia development among Chinese school-aged children using refraction data from electronic medical records: A retrospective, multicentre machine learning study. PLoS Med 15(11): e1002674. https:/
State Key Laboratory of Ophthalmology, Clinical Research Center for Ocular Disease, Zhongshan Ophthalmic Centre, Sun Yat-sen University, Guangzhou, China
School of Public Health, Sun Yat-sen University, Guangzhou, China
School of Mathematics, Sun Yat-sen University, Guangzhou, China
Zhongshan School of Medicine, Sun Yat-sen University, Guangzhou, China
UCL Institute of Ophthalmology, University College London and Moorfields Eye Hospital, London, United Kingdom
First Affiliated Hospital of Sun Yat-sen University, Guangzhou, China
Laboratory of Immunology, National Eye Institute, National Institutes of Health, Bethesda, Maryland, United States of America
ARC Centre of Excellence in Vision Science, Research School of Biology, College of Medicine, Biology and Environment, Australian National University, Canberra, Australian Capital Territory, Australia
Centre for Eye Research Australia, University of Melbourne, Royal Victorian Eye and Ear Hospital, East Melbourne, Victoria, Australia
In your coverage please use this URL to provide access to the freely available paper: http://journals.