Credit: Yiming WANG, Zijia NI, Yinhua HUANG
Chickens are among the most important livestock species worldwide. They are a primary source of high-quality protein for humans, and different breeds carry distinctive genetic characteristics: Leghorn chickens are renowned for high egg production, Tibetan chickens are adapted to hypoxic plateau environments, and Silkie chickens are notable for their unusual melanin deposition. Previous studies have largely relied on a single linear reference genome, which cannot fully capture the genetic differences among breeds. When next-generation sequencing (NGS) is used to detect structural variants (SVs), that is, variants longer than 50 base pairs, many are missed because the reference genome is not representative enough. Third-generation sequencing identifies SVs more accurately, but its high cost limits its use in large-scale population studies. Deciphering genetic variation in chickens efficiently while controlling costs has therefore become a key challenge in livestock and poultry breeding.
Professor Yinhua Huang’s team at China Agricultural University has constructed, for the first time, a graph-based pan-genome for chickens, providing a new tool for efficiently mining genetic variation in the species. The study has been published in Frontiers of Agricultural Science and Engineering (DOI: 10.15302/J-FASE-2024591).
A graph-based pan-genome is a non-linear genetic reference system. This one is built on GRCg6a, the linear reference genome of the red jungle fowl, and integrates high-quality genomic data from 12 samples across 2 commercial breeds and 9 local breeds. Genetic variations among breeds are stored as nodes and edges, forming a non-linear structure with multiple alternative paths through the same region. This design overcomes the one-size-fits-all limitation of traditional linear genomes and reflects the genetic diversity of chicken populations more comprehensively.
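To make the idea of nodes, edges, and multiple paths concrete, here is a minimal toy sketch in Python. It is not the data structure or software used in the study; the class name `SequenceGraph`, the sequences, and the path names are all invented for illustration.

```python
from collections import defaultdict


class SequenceGraph:
    """Toy variation graph (illustrative only, not the study's implementation)."""

    def __init__(self):
        self.nodes = {}                # node id -> DNA sequence segment
        self.edges = defaultdict(set)  # node id -> ids of nodes that may follow it
        self.paths = {}                # path name -> ordered list of node ids

    def add_node(self, node_id, seq):
        self.nodes[node_id] = seq

    def add_path(self, name, node_ids):
        # Register a walk through the graph and create the edges it uses.
        for a, b in zip(node_ids, node_ids[1:]):
            self.edges[a].add(b)
        self.paths[name] = list(node_ids)

    def sequence(self, name):
        # Spell out the linear sequence that a path represents.
        return "".join(self.nodes[n] for n in self.paths[name])


g = SequenceGraph()
g.add_node(1, "ACGT")
g.add_node(2, "CCCCC")  # hypothetical insertion carried by only one sample
g.add_node(3, "GGCA")

g.add_path("reference", [1, 3])                 # skips the insertion
g.add_path("sample_with_insertion", [1, 2, 3])  # passes through it

print(g.sequence("reference"))              # ACGTGGCA
print(g.sequence("sample_with_insertion"))  # ACGTCCCCCGGCA
```

Both haplotypes share nodes 1 and 3, and the insertion appears only as an extra node with its own edges. A read that carries the insertion can therefore align along the second path instead of being forced onto a reference that lacks it.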
The results show that, compared with a traditional linear genome, the graph-based pan-genome achieves higher alignment efficiency for NGS data: even with low-depth (7–15×) NGS data, the median alignment rate exceeds 98.8%. It performs particularly well in structural variant detection. When NGS data from breeds such as Leghorns and Rhode Island Reds were analyzed, the number of SVs detected (e.g., 9944 SVs in Leghorns) far exceeded the number detected by Lumpy, a tool based on the linear genome (e.g., only 3246 SVs in Leghorns), making variant discovery substantially more comprehensive.
Using this graph-based pan-genome, the researchers further identified key variants associated with important traits. In Leghorns, for example, they found 666 breed-specific high-frequency SVs, some of which lie in regions related to follicle development (e.g., the MKI67 gene) and circadian rhythm regulation (e.g., the CLOCK gene) and may be linked to the breed's high egg production. In Tibetan chickens, which are adapted to plateau environments, an analysis combining SVs with transcriptome data identified certain SVs in the promoter regions of mitochondrial protein synthesis genes (e.g., MRPS24); these may aid hypoxic adaptation by influencing gene expression. These findings provide important clues to the genetic mechanisms underlying traits such as egg production and environmental adaptability in chickens.
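As a rough illustration of what "breed-specific high-frequency" could mean in practice, the sketch below keeps SVs that are common in one breed and rare in all others. The thresholds (`min_target`, `max_other`), the function name, and the frequency table are hypothetical. The criteria actually used in the study are not described here.

```python
def breed_specific_svs(freqs, target, min_target=0.8, max_other=0.1):
    """Return SV ids that are frequent in `target` and rare in every other breed.

    freqs: {sv_id: {breed_name: allele_frequency}}
    The two thresholds are illustrative assumptions, not values from the study.
    """
    selected = []
    for sv_id, by_breed in freqs.items():
        others = [f for breed, f in by_breed.items() if breed != target]
        frequent_in_target = by_breed.get(target, 0.0) >= min_target
        rare_elsewhere = all(f <= max_other for f in others)
        if frequent_in_target and rare_elsewhere:
            selected.append(sv_id)
    return selected


# Invented allele frequencies for three made-up SVs.
freqs = {
    "sv_a": {"Leghorn": 0.95, "Tibetan": 0.05, "Silkie": 0.00},
    "sv_b": {"Leghorn": 0.90, "Tibetan": 0.60, "Silkie": 0.10},
    "sv_c": {"Leghorn": 0.20, "Tibetan": 0.05, "Silkie": 0.00},
}

print(breed_specific_svs(freqs, "Leghorn"))  # ['sv_a']
```

Here `sv_b` is excluded because it is also common in another breed, and `sv_c` because it is not frequent in the target breed.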
This study offers a more efficient tool for genetic research in chickens and provides a methodological reference for constructing pan-genomes in other agricultural animals. In future work, the researchers plan to validate the functions of these variants and to refine graph-based pan-genome construction techniques, with the aim of promoting their practical application in livestock and poultry breeding.
Journal
Frontiers of Agricultural Science and Engineering
Method of Research
Experimental study
Subject of Research
Not applicable
Article Title
Graph-based pan-genome analysis reveals diversity of structural variations in native and commercial chicken
Article Publication Date
6-May-2025