image: This image shows the Virtual Cell Challenge logo. The competition is hosted by Arc Institute and sponsored by NVIDIA, 10x Genomics, and Ultima Genomics.
Credit: Arc Institute
In a commentary published today (June 26, 2025) in the journal Cell, Arc researchers introduce the independent nonprofit’s first “Virtual Cell Challenge,” a public competition with a grand prize worth $100,000 for the machine learning model that best predicts how cells will respond to genetic perturbations. Arc is introducing the competition to catalyze progress at the interface of artificial intelligence and biology, particularly by accelerating the creation of high-quality datasets and sparking a conversation about rigorous standards for assessing how well AI models simulate cellular behavior.
For the inaugural challenge, Arc has generated a new single-cell transcriptomics dataset of 300,000 H1 human embryonic stem cells (H1 hESCs) with 300 genetic perturbations, which will be deployed throughout the competition in segments for fine-tuning, validation, and testing. Competitors are invited to train models on gene expression data for over half a billion cells included in the Arc Virtual Cell Atlas, as well as other public datasets. The challenge will specifically evaluate how well models can predict changes in gene activity when individual genes are silenced. Competitors will make predictions of these effects during the middle phase of the competition, with their interim performance shared on a live leaderboard, before the final assessment leading to a public announcement of the winners.
“A team’s success will depend on their model’s ability to generalize to a new cell context. This is a difficult task, so we have structured this first competition as a few-shot learning challenge by releasing a training subset of H1 hESCs,” said Dave Burke (X: @davey_burke), Arc’s Chief Technology Officer. “The ability for models to generalize to new cell contexts is ultimately key to unlocking virtual cells for drug discovery and we hope this challenge will ultimately help accelerate progress towards that goal.”
In developing the competition’s evaluation framework, Arc also aims to provide consistent benchmarks for virtual cell model performance for the field. While rapid advances in single-cell technologies and machine learning have created new opportunities to model cellular behavior, researchers have struggled to compare different approaches due to inconsistent evaluation methods and varying dataset quality.
"Virtual cells that can capture dynamic cellular responses represent the future of biological research, but we need rigorous ways to test and compare the performance of these models," said Arc Executive Director, Co-Founder, and Core Investigator Silvana Konermann (X: @SKonermann). "This competition will encourage scientists to build the most promising AI models while allowing us to stress-test our evaluation framework with the research community so we can establish benchmarks that the entire field can build upon."
Registration for the Challenge is open as of June 26 at virtualcellchallenge.org, with teams receiving training data at sign-up. Final rankings will be determined solely by model performance on the final test set, which will be released in late October, one week prior to the final submission deadline. Winners will be announced in December.
Individual contributors as well as teams from academic institutions, biotechnology companies, and independent research organizations are eligible to participate. Competitors with experience in computational modeling or single-cell biology are particularly encouraged to enter. The teams with the top three models will receive prizes valued at $100,000, $50,000, and $25,000, respectively, combining cash awards and NVIDIA DGX Cloud credits. The Virtual Cell Challenge is generously sponsored by NVIDIA, 10x Genomics, and Ultima Genomics.
“The Virtual Cell Challenge will help unite virtual cell developers to build new models for discovery in the life sciences. We are supporting this competition to empower the research community in building powerful foundational models that can predict how genetic perturbations impact cell behavior,” said Anthony Costa, Director of Digital Biology, NVIDIA.
"This effort is intended to drive community engagement, and to accelerate progress by providing high-quality benchmark datasets, a public leaderboard, and a mechanism for reproducible and transparent comparison," said Arc Core Investigator Hani Goodarzi (X: @genophoria). “Breakthrough ideas come from anywhere–when researchers push each other to build better models, the entire field advances.”
Arc plans to continue driving progress in the field by repeating the “Virtual Cell Challenge” each year with new single-cell transcriptomics datasets comprising different cell types, and with stiffer challenges requiring entrants’ models to predict the effects of more complicated biological changes.
"CASP competitions transformed protein structure prediction over 25 years, ultimately enabling breakthroughs like AlphaFold,” said Arc Co-Founder and Core Investigator Patrick Hsu (X: @pdhsu). “We believe Arc can use the same approach to accelerate progress toward comprehensive virtual cells that could fundamentally change how we study biology and identify targets to better treat complex diseases."
###
- Learn more about or register for the Virtual Cell Challenge: virtualcellchallenge.org
- Read the commentary in Cell: https://www.cell.com/cell/fulltext/S0092-8674(25)00675-0
- Learn more about Arc’s virtual cell model State: https://arcinstitute.org/tools/state
- Learn more about the Arc Virtual Cell Atlas: https://arcinstitute.org/tools/virtualcellatlas
Arc Institute (X: @arcinstitute) is an independent nonprofit research organization headquartered in Palo Alto, California. Arc operates at the interface of biology and machine learning, providing scientists with no-strings-attached, multi-year funding and complete freedom to pursue curiosity-driven research agendas, while fostering deep interdisciplinary collaboration through specialized Technology Centers. Arc's mission is to accelerate scientific progress, understand the root causes of complex diseases, and narrow the gap between discoveries and impact on patients. The Institute, founded in 2021, operates in close partnership with Stanford University, the University of California, Berkeley, and the University of California, San Francisco.
Method of Research: Commentary/editorial
Subject of Research: Not applicable