News Release

Forecasting diseases using Wikipedia

Peer-Reviewed Publication


Analyzing page views of Wikipedia articles could make it possible to monitor and forecast diseases around the globe, according to research publishing this week in PLOS Computational Biology.

Dr. Sara Del Valle and her team from Los Alamos National Laboratory successfully monitored influenza outbreaks in the United States, Poland, Japan and Thailand, dengue fever in Brazil and Thailand, and tuberculosis in China and Thailand.

The team was also able to forecast all but one of these outbreaks (tuberculosis in China) at least 28 days in advance. The results suggest that people start searching for disease-related information on Wikipedia before they seek medical attention.

The paper shows the potential to transfer models across different regions; that is, one can "train" a computer model using public health data in one location and implement the model in another region. For example, researchers could create models using data from Japan to track and forecast disease in Thailand. This is particularly important for countries that do not offer reliable disease data.
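To make the idea of transferring a model concrete, here is a minimal sketch in Python. It is an illustration only, not the researchers' code: it assumes a simple straight-line relationship between an article's share of page views and reported case counts, and every number, variable name and the `fit_line` helper are made up for this example.

```python
from statistics import mean


def fit_line(x, y):
    """Ordinary least squares fit of y = slope * x + intercept."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx


# Synthetic toy data (NOT from the study). Each value is the hypothetical
# weekly share of all page views that went to a disease-related article.
train_view_share = [0.001, 0.002, 0.004, 0.003]
# Hypothetical official case counts for the same weeks in the training region.
train_cases = [7, 9, 13, 11]

# "Train" the model where reliable public health data exist.
slope, intercept = fit_line(train_view_share, train_cases)

# Apply the same fitted line to page-view shares from a second region
# that lacks reliable case counts (again, invented numbers).
target_view_share = [0.002, 0.005]
estimates = [slope * share + intercept for share in target_view_share]

for share, cases in zip(target_view_share, estimates):
    print(f"view share {share:.3f} -> estimated cases {cases:.1f}")
```

The sketch works with shares of traffic rather than raw counts so that regions with very different Wikipedia audiences can be compared. A forecasting version would pair each week's page views with case counts reported some weeks later; that step is left out here.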

Sara Del Valle says: "A global disease-forecasting system will change the way we respond to epidemics. In the same way we check the weather each morning, individuals and public health officials can monitor disease incidence and plan for the future based on today's forecast. The goal of this research is to build an operational disease monitoring and forecasting system with open data and open source code. This paper shows we can achieve that goal."


All works published in PLOS Computational Biology are Open Access, which means that all content is immediately and freely available. Use this URL in your coverage to provide readers access to the paper upon publication:


Dr. Sara Del Valle
Address: Los Alamos National Laboratory,
Systems Engineering & Integration,
P.O. Box 1663, MS M997
Phone: 505-665-7285

Citation: Generous N, Fairchild G, Deshpande A, Del Valle SY, Priedhorsky R (2014) Global Disease Monitoring and Forecasting with Wikipedia. PLoS Comput Biol 10(11): e1003892. doi:10.1371/journal.pcbi.1003892

Funding: This work is supported in part by NIH/NIGMS/MIDAS under grant U01-GM097658-01 and the Defense Threat Reduction Agency (DTRA), Joint Science and Technology Office for Chemical and Biological Defense under project numbers CB3656 and CB10007. Data collected using QUAC; this functionality was supported by the U.S. Department of Energy through the LANL LDRD Program. Computation used HPC resources provided by the LANL Institutional Computing Program. LANL is operated by Los Alamos National Security, LLC for the Department of Energy under contract DE-AC52-06NA25396. Approved for public release: LA-UR-14-22535. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Competing Interests: The authors have declared that no competing interests exist.

About PLOS Computational Biology

PLOS Computational Biology features works of exceptional significance that further our understanding of living systems at all scales through the application of computational methods. All works published in PLOS Computational Biology are Open Access. All content is immediately available and subject only to the condition that the original authorship and source are properly attributed. Copyright is retained. For more information follow @PLOSCompBiol on Twitter or contact

About PLOS

PLOS is a nonprofit publisher and advocacy organization founded to accelerate progress in science and medicine by leading a transformation in research communication. For more information, visit
