News Release 

AI and big data predict which research will influence future medical treatments



IMAGE: This image depicts the co-citation network of seminal fundamental publications that led to the clinical development of cancer immunotherapy treatments. Large dots (center) represent the most influential clinical trials that...

Credit: Ian Hutchins and George Santangelo

Ian Hutchins and colleagues in the Office of Portfolio Analysis (OPA), a team led by George Santangelo at the National Institutes of Health (NIH), have developed an artificial intelligence/machine learning model that predicts which scientific advances are likely to eventually translate to the clinic. The work, described in a Meta-Research article published October 10 in the open-access journal PLOS Biology, aims to shorten the sometimes decades-long interval between scientific discovery and clinical application. The method estimates the likelihood that a research article will be cited by a future clinical trial or guideline, an early indicator of translational progress.

Hutchins and colleagues have quantified these predictions, which are highly accurate with as little as two years of post-publication data, as a novel metric called "Approximate Potential to Translate" (APT). APT values can be used by researchers and decision-makers to focus attention on areas of science that have strong signatures of translational potential. Although numbers alone should never be a substitute for evaluation by human experts, the APT metric has the potential to accelerate biomedical progress as one component of data-driven decision-making.

The model that computes APT values makes predictions based upon the content of research articles and the articles that cite them. A long-standing barrier to research and development of metrics like APT is that such citation data has remained hidden behind proprietary, restrictive, and often costly licensing agreements. To disrupt this impediment to the scientific community, to increase transparency, and to facilitate reproducibility, OPA has aggregated citation data from publicly available resources to create an open citation collection (NIH-OCC), the details of which appear in a Community Page article in the same issue of PLOS Biology. The NIH-OCC comprises over 420 million citation links at present and will be updated monthly as citations continue to accumulate. For publications since 2010, the NIH-OCC is already more comprehensive than leading proprietary sources of citation data.
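To make the idea of a citation link concrete, the following Python sketch treats a citation collection as a list of (citing, cited) pairs and counts, for each article, how many citations it has received in total and how many come from clinical articles. The article IDs, the CLINICAL set, and the function name are hypothetical. This is neither the NIH-OCC data format nor the APT model, only an illustration of the kind of signal that citation links carry.

```python
from collections import Counter

# Hypothetical citation links: (citing article ID, cited article ID).
LINKS = [
    ("T1", "A"), ("T1", "B"), ("P1", "A"),
    ("P2", "A"), ("P2", "C"), ("T2", "A"),
]

# Hypothetical set of citing articles that are clinical trials or guidelines.
CLINICAL = {"T1", "T2"}


def citation_counts(links, clinical_ids):
    """Return (total, clinical) citation counters keyed by cited article."""
    total = Counter()
    clinical = Counter()
    for citing, cited in links:
        total[cited] += 1
        if citing in clinical_ids:
            clinical[cited] += 1
    return total, clinical


total, clinical = citation_counts(LINKS, CLINICAL)
for article in sorted(total):
    # Prints: article ID, total citations, citations from clinical articles.
    print(article, total[article], clinical[article])
```

In this toy data, article A is cited four times, twice by clinical articles, while article C has not yet been cited by any clinical article.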

Citation data from the NIH-OCC are used to calculate both APT values and Relative Citation Ratios (RCRs). The RCR, a measure of scientific influence at the article level that is normalized for field of study and time since publication, was developed previously by Santangelo's team at NIH and has already been widely adopted in both the scientific and evaluator communities. Upon publication of these two articles, APT values and the NIH-OCC will be freely and publicly available as new components of the iCite webtool, which will continue as the primary source of RCR data. The OPA team encourages the use of iCite to improve research assessment and decision-making that can contribute to optimizing the scientific enterprise.
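As a rough illustration of what normalizing for field and time can mean, the Python sketch below divides an article's citations per year by an assumed average citation rate for its field. The function names and the numbers are hypothetical, and this simplified ratio is not the published RCR calculation, which is more involved.

```python
def citations_per_year(citations, years_since_publication):
    """Time normalization: average citations received per year."""
    if years_since_publication <= 0:
        raise ValueError("years_since_publication must be positive")
    return citations / years_since_publication


def normalized_ratio(citations, years_since_publication, field_rate):
    """Field normalization: article rate divided by an assumed field rate.

    field_rate is a hypothetical average citations-per-year figure for
    comparable articles in the same field.
    """
    if field_rate <= 0:
        raise ValueError("field_rate must be positive")
    return citations_per_year(citations, years_since_publication) / field_rate


# An article with 30 citations over 5 years, in a field averaging
# 3 citations per article per year (all values are made up).
ratio = normalized_ratio(30, 5, 3.0)
print(ratio)  # 2.0
```

A value above 1 in this sketch means the article is cited more often per year than the assumed field average; a value below 1 means less often.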


Peer-reviewed; Software / Modelling; N/A

In your coverage please use this URL to provide access to the freely available article in PLOS Biology:

Meta-Research Article:

Community Page:


Citations:

Meta-Research Article: Hutchins BI, Davis MT, Meseroll RA, Santangelo GM (2019) Predicting translational progress in biomedical research. PLoS Biol 17(10): e3000416.

Community Page: Hutchins BI, Baker KL, Davis MT, Diwersy MA, Haque E, Harriman RM, et al. (2019) The NIH Open Citation Collection: A public access, broad coverage resource. PLoS Biol 17(10): e3000385.


Funding:

Meta-Research Article: The authors are employees of or contractors for the US Federal Government but received no specific funding for this work.

Community Page: The authors are employees of or contractors for the US Federal Government, but the author(s) received no specific funding for this work.

Competing Interests:

Meta-Research Article: Since the authors work in the Division of Program Coordination, Planning, and Strategic Initiatives at the National Institutes of Health, our work could have policy implications for how research portfolios are evaluated.

Community Page: The authors have declared that no competing interests exist.
