News Release

New AI framework Hi4GS revolutionizes wheat yield prediction, boosting accuracy by over 82%

Peer-Reviewed Publication

KeAi Communications Co., Ltd.

Image: In The Crop Journal, researchers introduced Hi4GS, an AI-driven framework that improves wheat yield prediction accuracy by more than 82%. Hi4GS streamlines SNP selection, employs intelligent optimization, and uncovers key genetic markers, enabling transparent genomic insights and advancing cost-effective breeding for global food security.

Credit: Fa Cui

As the global population grows, increasing wheat (Triticum aestivum L.) yields is critical for food security. Genomic selection (GS), which predicts breeding values from genome-wide markers, has become a core technology in modern breeding, but it faces a notable hurdle: the "small n, large p" problem. With hundreds of thousands of genetic markers (single-nucleotide polymorphisms, or SNPs) but relatively few breeding samples, models often suffer from overfitting and high computational costs.

In a study published in The Crop Journal, a research team led by Professor Fa Cui from Ludong University, along with colleagues from several agricultural institutions, has unveiled Hi4GS (Hybrid Feature Selection for Genomic Selection). This novel, interpretable framework streamlines high-dimensional genotypic data, improving prediction accuracy and identifying the potential biological "drivers" behind wheat yield.

"Our goal was to move beyond the 'black box' nature of traditional genomic models," says Shanghui Zhang, the study's first author. "By filtering through the noise of tens of thousands of SNPs, Hi4GS allows us to achieve much higher predictive precision with a fraction of the data, while simultaneously uncovering the actual genes that influence yield."

The Hi4GS framework follows a multi-stage strategy that moves from broad screening, through intelligent optimization, to biological interpretation. First, the system tackles the vast amount of genetic data by creating an "Elite SNP Candidate Pool." It does this through a dual-track approach: integrating multiple importance-ranking algorithms (such as ridge regression and genome-wide association studies, or GWAS) with a novel weighting scheme, while also using quantity-determining algorithms to capture all potentially valuable genetic information without bias.
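To make the screening idea concrete, here is a minimal Python sketch of combining several importance rankings into one candidate pool. The release does not describe the actual weighting scheme, so a plain weighted mean rank is swapped in as a stand-in; the function names, scores, weights, and pool size are all hypothetical and are not taken from the Hi4GS package.

```python
from typing import Dict, List


def rank_scores(scores: Dict[str, float]) -> Dict[str, int]:
    """Rank SNPs by descending score; rank 1 is the most important."""
    ordered = sorted(scores, key=lambda snp: scores[snp], reverse=True)
    return {snp: i + 1 for i, snp in enumerate(ordered)}


def elite_pool(method_scores: Dict[str, Dict[str, float]],
               weights: Dict[str, float],
               pool_size: int) -> List[str]:
    """Combine per-method ranks into a weighted mean rank and keep the best.

    Assumption: a weighted mean rank stands in for the framework's own
    weighting scheme, which the release does not specify.
    """
    ranks = {m: rank_scores(s) for m, s in method_scores.items()}
    total_weight = sum(weights[m] for m in ranks)
    snps = list(next(iter(method_scores.values())))
    combined = {
        snp: sum(weights[m] * ranks[m][snp] for m in ranks) / total_weight
        for snp in snps
    }
    # Lower combined rank is better; the SNP name breaks ties deterministically.
    return sorted(combined, key=lambda snp: (combined[snp], snp))[:pool_size]


# Made-up importance scores from two ranking methods (illustration only).
method_scores = {
    "ridge": {"snp1": 0.90, "snp2": 0.10, "snp3": 0.50, "snp4": 0.05},
    "gwas": {"snp1": 3.0, "snp2": 0.4, "snp3": 6.2, "snp4": 0.2},
}
weights = {"ridge": 0.5, "gwas": 0.5}

pool = elite_pool(method_scores, weights, pool_size=2)
print(pool)  # ['snp1', 'snp3']
```

Working on ranks rather than raw scores keeps methods with very different scales (regression coefficients versus association statistics) comparable before they are weighted.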

Following this broad screening, Hi4GS employs a Prior-guided Grey Wolf Optimizer (P-GWO) for fine-tuned selection. Unlike standard algorithms that search randomly, this intelligent optimizer focuses its search within the pre-screened "Elite Pool."

"This acts like a navigation map," explains Professor Fa Cui, the corresponding author. "By using prior knowledge to guide the starting point, we find the optimal SNP combination faster and more accurately."
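The idea of a prior-guided start can be illustrated with a short Python sketch of a grey wolf optimizer that searches only within an elite pool and initializes every wolf near a vector of prior scores. This is a sketch under stated assumptions, not the published P-GWO: the 0.5 selection threshold, the Gaussian noise, the prior scores, and the toy fitness function (a stand-in for cross-validated prediction accuracy) are all hypothetical.

```python
import random
from typing import Callable, List, Sequence


def prior_guided_gwo(prior: Sequence[float],
                     fitness: Callable[[List[int]], float],
                     n_wolves: int = 12,
                     n_iter: int = 40,
                     seed: int = 0) -> List[int]:
    """Search subsets of the elite pool; higher fitness is better."""
    rng = random.Random(seed)
    dim = len(prior)

    def clip(x: float) -> float:
        return min(1.0, max(0.0, x))

    def decode(pos: Sequence[float]) -> List[int]:
        # An elite SNP is selected when its coordinate exceeds 0.5.
        return [j for j, x in enumerate(pos) if x > 0.5]

    # Prior-guided start: every wolf begins near the prior scores
    # instead of at a uniformly random position.
    wolves = [[clip(p + rng.gauss(0.0, 0.15)) for p in prior]
              for _ in range(n_wolves)]

    best_pos: List[float] = list(wolves[0])
    best_fit = fitness(decode(best_pos))
    for t in range(n_iter + 1):
        ranked = sorted(wolves, key=lambda w: fitness(decode(w)), reverse=True)
        alpha, beta, delta = ranked[0], ranked[1], ranked[2]
        alpha_fit = fitness(decode(alpha))
        if alpha_fit > best_fit:
            best_pos, best_fit = list(alpha), alpha_fit
        if t == n_iter:
            break
        a = 2.0 * (1.0 - t / n_iter)  # exploration factor shrinks from 2 toward 0
        moved = []
        for w in wolves:
            pos = []
            for j in range(dim):
                pulls = []
                for leader in (alpha, beta, delta):
                    A = a * (2.0 * rng.random() - 1.0)
                    C = 2.0 * rng.random()
                    pulls.append(leader[j] - A * abs(C * leader[j] - w[j]))
                # Each wolf moves to the average of the three leader-guided points.
                pos.append(clip(sum(pulls) / 3.0))
            moved.append(pos)
        wolves = moved
    return decode(best_pos)


INFORMATIVE = {0, 2, 4}  # made-up ground truth for the toy problem


def toy_fitness(subset: List[int]) -> float:
    # Stand-in for cross-validated accuracy: reward hits, penalize subset size.
    return len(INFORMATIVE & set(subset)) - 0.1 * len(subset)


prior = [0.9, 0.6, 0.8, 0.3, 0.7, 0.2]  # hypothetical prior scores, one per elite SNP
best = prior_guided_gwo(prior, toy_fitness)
print(best, toy_fitness(best))
```

In a real pipeline the fitness call would train and cross-validate a GS model on the selected SNPs, which is why starting close to a good region matters: each evaluation is expensive.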

For the first time in this context, the team applied SHAP (SHapley Additive exPlanations) values, a technique rooted in cooperative game theory. This allows them to quantify whether a specific SNP has a positive or negative impact on yield and to understand how different markers interact, effectively opening up the model's "black box."
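As a small illustration of what such attributions mean, the Python sketch below computes exact Shapley values by brute force for a toy yield model with three SNPs. It does not use the SHAP library or the Hi4GS code; the model, its coefficients, the genotype, and the all-zero baseline are made up for the example, and exhaustive enumeration is only feasible for a handful of features.

```python
from itertools import permutations
from typing import Callable, List, Sequence


def shapley_values(predict: Callable[[List[int]], float],
                   x: Sequence[int],
                   baseline: Sequence[int]) -> List[float]:
    """Exact Shapley values: average each feature's marginal contribution
    over every order in which features can switch from baseline to x."""
    n = len(x)
    phi = [0.0] * n
    orders = list(permutations(range(n)))
    for order in orders:
        current = list(baseline)
        previous = predict(current)
        for j in order:
            current[j] = x[j]
            updated = predict(current)
            phi[j] += updated - previous
            previous = updated
    return [v / len(orders) for v in phi]


def toy_yield(g: List[int]) -> float:
    # Hypothetical model: SNP 0 raises yield, SNP 1 lowers it,
    # and SNPs 0 and 2 interact.
    return 5.0 + 0.4 * g[0] - 0.3 * g[1] + 0.2 * g[0] * g[2]


x = [2, 1, 1]          # genotype of one line, coded as allele counts
baseline = [0, 0, 0]   # reference genotype used as the comparison point

phi = shapley_values(toy_yield, x, baseline)
print([round(v, 3) for v in phi])  # [1.0, -0.3, 0.2]
```

The signs show direction of effect (SNP 1 pulls the prediction down), the interaction term is split evenly between SNPs 0 and 2, and the three values sum to the gap between the prediction for this genotype and the baseline prediction (5.9 versus 5.0).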

In tests across 11 yield traits in four wheat datasets, the researchers found that GS models using SNPs selected by Hi4GS improved average predictive accuracy by more than 82% compared with models using the entire SNP set.

"Furthermore, these findings are not just statistical coincidences," says Zhang. "The SNPs identified by Hi4GS fell into gene regions at a rate of 9.17%, significantly higher than the genomic background of 5.61%. This confirms that the framework successfully isolates functional biological information."
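For scale, the two rates quoted above can be turned into a fold enrichment with one line of arithmetic, shown here in Python. Only the two percentages come from the release; it does not report the underlying SNP counts or the statistical test used, so this sketch stops at the ratio.

```python
hi4gs_rate = 9.17        # % of Hi4GS-selected SNPs in gene regions (from the release)
background_rate = 5.61   # % for the genomic background (from the release)

fold_enrichment = hi4gs_rate / background_rate
print(f"{fold_enrichment:.2f}x")  # 1.63x
```

In other words, selected SNPs landed in gene regions roughly 1.6 times as often as the genomic background.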

The implications for "Breeding 4.0" are significant. By narrowing down thousands of markers to a few dozen high-impact candidate genes, Hi4GS provides a blueprint for developing low-cost, high-efficiency breeding chips.

"This framework is not limited to wheat," notes Cui. "It can be applied to other major crops and animals, providing a powerful tool for genomic-assisted breeding and helping to accelerate the development of high-yielding varieties worldwide."

The team has released the Hi4GS R package as open-source software on GitHub, making this advanced tool available to the global agricultural research community.

###

Contact the author:

Fa Cui

Email address: 3314@ldu.edu.cn

The publisher KeAi was established by Elsevier and China Science Publishing & Media Ltd to unfold quality research globally. In 2013, our focus shifted to open access publishing. We now proudly publish more than 200 world-class, open access, English language journals, spanning all scientific disciplines. Many of these are titles we publish in partnership with prestigious societies and academic institutions, such as the National Natural Science Foundation of China (NSFC).

