News Release

Largest dataset of thousands of proteins marks landmark step for research into human health

First-of-its-kind dataset – based on samples from over 54,000 UK Biobank volunteer participants – lays the foundation for future discovery of new drug targets

Peer-Reviewed Publication

UK Biobank

Today [Wednesday 4 October], the scientific journal Nature [1] published the results of the world’s largest and most comprehensive study on the effects of common genetic variation on proteins circulating in the blood and how these associations can contribute to disease. This unprecedented population-scale investigation of proteins, made possible by turning UK Biobank biological samples into data, will help scientists better understand how and why diseases develop, which could help drive the development of new diagnostics and treatments for a wide range of health conditions.

To develop this unparalleled dataset, researchers measured the abundance of nearly 3,000 circulating proteins, many of which were previously difficult to capture, in samples from over 54,000 participants in UK Biobank, which has been collecting data on and tracking the health of 500,000 volunteer participants enrolled between 2006 and 2010. The study identified over 14,000 associations between common genetic variants and proteins circulating in the blood, over 80% of which were previously unknown. Scientists worldwide will be able to access the proteomic data in the coming weeks via UK Biobank [2].

This landmark study was commissioned, funded, and carried out by the Pharma Proteomics Project, a collaboration between 13 leading biopharmaceutical companies [3]. The team carried out several analyses of the data that demonstrate its vast potential for future research. These include:

  • Genome-wide association studies to build an open access library of all the common gene variants that influence protein levels in blood. This can be used to study complex biological processes, such as the immune system, find proteins that are key players in causing disease, identify new drug targets and potentially shorten development time for earlier-stage drug candidates and increase success rates for clinical trials.
  • Profiling of blood protein levels across the 20 most common health conditions in UK Biobank [4]. This revealed that, for example, levels of inflammatory proteins, long thought to contribute towards mental health conditions, are significantly higher in patients with depression.
  • Training machine learning models to determine how successfully blood proteins can predict demographic factors. This analysis found that blood proteins can predict age, sex and body mass index (BMI) with very high accuracy. In the future, this technology could be used to compare chronological age with biological age and determine how this is related to risk of future diseases. 

Professor Naomi Allen, Chief Scientist of UK Biobank, said:

“This momentous study offers whole new avenues of research to the biomedical community, and is a leading example of how cross-sector collaboration can bring about results that are so much greater than the sum of their parts. All of these data will soon be available to bona fide researchers across the globe, alongside the existing genomic, lifestyle and health data that UK Biobank holds for its 500,000 volunteers. I am excited for researchers to use these data to identify patterns that could transform our understanding of how diseases develop, and to identify potential new treatment pathways.”

Dr Chris Whelan, Director, Neuroscience, Data Science & Digital Health, Janssen Research & Development, LLC, a Johnson & Johnson Company, who leads the Pharma Proteomics Project, said:

“To date, the scientific community has invested substantially in genomics for the advancement of precision medicine. However, to identify the right drug for the right patient at the right time, we must move beyond genomics alone. This dataset will help paint a much more nuanced and detailed picture of how the human genome and proteins circulating in the blood influence human health and disease – enabling biomedical researchers to identify new biological associations, find new drug targets and build blood-based diagnostics.”

Other future innovative work expected to result from this study includes using proteins circulating in the blood to predict whether someone will develop a disease several years before the condition occurs, classifying diseases into distinct biological subtypes, and using proteins in the blood to predict drug efficacy and safety prior to clinical trials.




For more information and requests for interview please contact:

Naomi Clarke, Head of Press, UK Biobank 07903 158 979



Notes to editors

About UK Biobank  

UK Biobank is a large-scale biomedical database and research resource containing genetic, lifestyle and health information from half a million UK participants. UK Biobank’s database, which includes blood samples, heart and brain scans and genetic data of the 500,000 volunteer participants, is globally accessible to approved researchers who are undertaking health-related research that is in the public interest. UK Biobank recruited 500,000 people aged 40 to 69 years from across the UK between 2006 and 2010. With their consent, they provided detailed information about their lifestyle and physical measures, and had blood, urine and saliva samples collected and stored for future analysis. UK Biobank’s research resource is a major contributor to the advancement of modern medicine and treatment, enabling better understanding of the prevention, diagnosis, and treatment of a wide range of serious and life-threatening illnesses, including cancer, heart diseases and stroke. Over 30,000 researchers from more than 90 countries are registered to use UK Biobank and more than 9,000 peer-reviewed papers have been published as a result. UK Biobank is supported by Wellcome and the Medical Research Council, as well as the British Heart Foundation, Cancer Research UK and NIHR. The organisation has over 220 dedicated members of staff, based in multiple locations across the UK.


Find out more here:



  1. Plasma proteomic associations with genetics and health in the UK Biobank, Sun & Whelan et al, Nature, October 2023.
  2. Data will be made available to approved researchers through UK Biobank. Researchers can register to apply from around the world. For more information visit:
  3. Biopharmaceutical companies in the Pharma Proteomics Project: Alnylam, Amgen, AstraZeneca, Biogen, Bristol Myers Squibb, Calico, Genentech (a member of the Roche Group), GSK, The Janssen Pharmaceutical Companies of Johnson & Johnson, Novo Nordisk, Pfizer, Regeneron and Takeda.
  4. The 20 most prevalent health conditions in UK Biobank:

     1. Disorders of lipoprotein metabolism and other lipidaemias
     2. Depression
     3. Essential (primary) hypertension
     4. Chronic ischemic heart disease
     5. Acute upper respiratory infections
     6. Unspecified acute lower respiratory infection
     7. Vasomotor and allergic rhinitis
     8. Asthma
     9. Gastro-oesophageal reflux disease
     10. Gastritis and duodenitis
     11. Diaphragmatic hernia
     12. Diverticular disease of intestine
     13. Other diseases of anus and rectum
     14. Other dermatitis
     15. Other disorders of skin and subcutaneous tissue
     16. Other arthrosis
     17. Other joint disorders
     18. Dorsalgia
     19. Other soft tissue disorders
     20. Other disorders of urinary system


