Expanding the reach of leaf spectroscopy: Toward universal models for plant trait prediction
Nanjing Agricultural University The Academy of Science
Analyzing 1,967 samples from 349 tree species across eight forest sites in China, the study reveals that differences in evolutionary history and environmental conditions hinder model transferability. By incorporating training data from diverse sites, the researchers achieved accuracy comparable to site-specific models, showing the feasibility of developing generalizable partial least squares regression (PLSR) models.
Plant leaf traits, which span morphology, physiology, and biochemistry, are critical to understanding how plants balance resource acquisition and ecosystem function. Traditionally, trait measurements depend on destructive sampling and time-consuming laboratory work, restricting the scale of ecological studies. Spectroscopy has emerged as a non-destructive alternative, using reflectance spectra to infer traits such as water content, dry mass, nitrogen, and pigments. Two broad approaches exist: physically based models and empirically based models. Among empirical methods, PLSR is especially popular, linking measured traits to spectral features with high accuracy. Yet a key challenge persists: models built in one environment often underperform elsewhere, raising questions about their broader applicability. Given this challenge, research into generalizable PLSR models is urgently needed.
A study (DOI: 10.1016/j.plaphe.2025.100054) published in Plant Phenomics on 15 May 2025 by Yanjun Su’s team, Chinese Academy of Sciences, provides both a roadmap for building robust tools in plant functional ecology and a call for collaborative databases that can power universal models.
This study evaluated how well partial least squares regression (PLSR) models predict plant leaf traits from reflectance spectra and whether such models can be used beyond their original sites. Researchers first assembled a dataset coupling spectra with six key traits from tree species spanning diverse Chinese forest climates: leaf mass per area (LMA), leaf water content (LWC), equivalent water thickness (EWT), and mass-based carbon (C), nitrogen (N), and phosphorus (P). They trained site-specific PLSR models and then transferred them to other sites to test robustness. They also probed potential drivers of transfer performance, including differences in trait distributions, species’ evolutionary histories (phylogenetic distance), and climate (e.g., mean annual temperature, MAT). Finally, they assessed whether expanding the training set across sites could yield generalizable models.
Results showed low overall transferability: applying a model outside its training site significantly reduced accuracy (P < 0.05) across all traits. Nutrient traits were most affected, with normalized root mean square error (nRMSE) increases averaging 13.61% (C), 7.92% (N), and 12.12% (P), whereas structural and water-related traits were more stable, with smaller nRMSE increases of 5.59% (LWC), 4.74% (LMA), and 5.73% (EWT). Sites with richer tree diversity (e.g., XSBN, GTS, SNJ) showed smaller accuracy drops than less diverse sites (QY, DLS). Transfer performance worsened with greater dissimilarity between sites; phylogenetic distance and MAT emerged as dominant determinants, while trait-distribution differences were less influential.
Crucially, diversifying training data improved predictions: as the number of training sites increased, nRMSE declined for 40 of 42 site–trait combinations, and cross-site models trained on all sites achieved robust performance (R² ≈ 0.53–0.83; nRMSE ≈ 7.43%–13.99%), with LMA and EWT among the best predicted (R² up to 0.83 and 0.82).
Practically, the findings advise evaluating evolutionary and environmental similarity before model transfer and emphasize that broad, multi-site training is key to building generalizable PLSR models for reliable leaf-trait prediction.
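As a rough illustration of the transfer test described above, the following Python sketch fits a single-trait PLSR model at one site, applies it to another, and compares nRMSE against a same-site model and a pooled two-site model. Everything in it is an assumption made for illustration and is not the study's code or data: the spectra are synthetic six-band toy values, the two sites are deliberately built to differ in their spectra-trait relationship, the function names (pls1_fit, pls1_predict, nrmse, simulate_site) are hypothetical, the NIPALS routine is a minimal textbook PLS1, and nRMSE is normalized by the observed trait range because the release does not say which normalization the authors used.

```python
import math
import random


def pls1_fit(X, y, n_components):
    """Fit a single-response PLS model with a minimal NIPALS loop."""
    n, p = len(X), len(X[0])
    x_mean = [sum(col) / n for col in zip(*X)]
    y_mean = sum(y) / n
    E = [[v - m for v, m in zip(row, x_mean)] for row in X]  # centered spectra
    f = [v - y_mean for v in y]                               # centered trait
    components = []
    for _ in range(n_components):
        # Weights: spectral direction with the highest covariance with the residual trait.
        w = [sum(E[i][j] * f[i] for i in range(n)) for j in range(p)]
        norm = math.sqrt(sum(v * v for v in w))
        if norm < 1e-12:  # no covariance left to explain
            break
        w = [v / norm for v in w]
        t = [sum(E[i][j] * w[j] for j in range(p)) for i in range(n)]  # scores
        tt = sum(v * v for v in t)
        load = [sum(E[i][j] * t[i] for i in range(n)) / tt for j in range(p)]
        q = sum(f[i] * t[i] for i in range(n)) / tt
        # Deflate spectra and trait before extracting the next component.
        E = [[E[i][j] - t[i] * load[j] for j in range(p)] for i in range(n)]
        f = [f[i] - q * t[i] for i in range(n)]
        components.append((w, load, q))
    return x_mean, y_mean, components


def pls1_predict(model, X):
    """Predict by repeating the training deflation steps on new spectra."""
    x_mean, y_mean, components = model
    preds = []
    for row in X:
        e = [v - m for v, m in zip(row, x_mean)]
        y_hat = y_mean
        for w, load, q in components:
            t = sum(a * b for a, b in zip(e, w))
            y_hat += q * t
            e = [a - t * b for a, b in zip(e, load)]
        preds.append(y_hat)
    return preds


def nrmse(y_true, y_pred):
    """RMSE as a percentage of the observed trait range (assumed normalization)."""
    mse = sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / len(y_true)
    return 100.0 * math.sqrt(mse) / (max(y_true) - min(y_true))


def simulate_site(seed, x1_coef, offset, n=60, bands=6):
    """Toy site: random 'reflectance' bands and a trait linear in bands 0 and 1."""
    rng = random.Random(seed)
    X = [[rng.uniform(0.2, 0.6) for _ in range(bands)] for _ in range(n)]
    y = [2.0 * r[0] + x1_coef * r[1] + offset + rng.gauss(0, 0.01) for r in X]
    return X, y


# Two hypothetical sites whose spectra-trait relationships differ.
XA, yA = simulate_site(seed=1, x1_coef=1.0, offset=0.0)
XB, yB = simulate_site(seed=2, x1_coef=-1.0, offset=0.5)
k = 40  # first 40 samples per site for training, remaining 20 for testing
X_test, y_test = XB[k:], yB[k:]

site_model = pls1_fit(XB[:k], yB[:k], n_components=3)    # trained at site B
other_model = pls1_fit(XA[:k], yA[:k], n_components=3)   # trained at site A
pooled_model = pls1_fit(XA[:k] + XB[:k], yA[:k] + yB[:k], n_components=3)

within = nrmse(y_test, pls1_predict(site_model, X_test))
transfer = nrmse(y_test, pls1_predict(other_model, X_test))
pooled = nrmse(y_test, pls1_predict(pooled_model, X_test))
print(f"same-site {within:.1f}%  transferred {transfer:.1f}%  pooled {pooled:.1f}%")
```

On this toy data the transferred model shows the largest error and pooling both sites reduces it, but the magnitudes say nothing about real leaf spectra. For actual work, a maintained implementation such as scikit-learn's PLSRegression would be the more usual choice than a hand-written loop.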
The study underscores the power of spectroscopy and PLSR for rapid, non-destructive plant trait estimation, particularly for traits like nitrogen content and leaf mass per area that are central to plant physiology and breeding. Generalizable models could accelerate ecological monitoring, reduce reliance on labor-intensive sampling, and enable high-throughput phenotyping for crop improvement.
###
References
DOI
10.1016/j.plaphe.2025.100054
Original URL
https://doi.org/10.1016/j.plaphe.2025.100054
Funding information
This study was supported by the National Natural Science Foundation of China grants 32422059 and 32271640, the National Key R&D Program of China grants 2022YFF0803100 and 2022YFD2200101, and the Innovation and Technology Fund (funding support to State Key Laboratories in Hong Kong of Agrobiotechnology).
About Plant Phenomics
Plant Phenomics is dedicated to publishing novel research that advances all aspects of plant phenotyping, from the cell to the plant population level, using innovative combinations of sensor systems and data analytics. Plant Phenomics also aims to connect phenomics to other science domains, such as genomics, genetics, physiology, molecular biology, bioinformatics, statistics, mathematics, and computer sciences. In doing so, the journal seeks to advance plant sciences and agriculture, forestry, and horticulture by addressing key scientific challenges in the area of plant phenomics.