An open-source large language model for Chinese education research
Peer-Reviewed Publication
Research on open-source large language models (LLMs) has made significant progress, but most studies focus on general-purpose English data, which poses challenges for LLM research in Chinese education. To address this, the researchers first reviewed and synthesized the core technologies of representative open-source LLMs, then designed an advanced 1.5B-parameter LLM tailored to the Chinese education field. The Chinese education large language model (CELLM) is trained from scratch in two stages: pre-training and instruction fine-tuning. The pre-training stage uses an open-source dataset for the Chinese education domain. For the instruction fine-tuning stage, a Chinese instruction dataset comprising over 258,000 entries was developed and open-sourced. Finally, the results and analysis of CELLM across multiple evaluation datasets are presented, providing a reference baseline for future research. All of the models, data, and code are open-source to foster community research on LLMs in the Chinese education domain.
Calcium (Ca) plays a fundamental role in carbonate weathering and the global carbon cycle. Carbonate weathering contributes approximately 80% of the dissolved Ca flux delivered from global rivers to the oceans. Therefore, it is crucial to elucidate the geochemical behavior of Ca isotopes during carbonate weathering. The research group led by Prof. Han Guilin at the China University of Geosciences (Beijing) integrated multi-isotope datasets (including Li-K-Fe-Zn) to investigate the geochemical behavior of Ca isotopes in river water and suspended sediments of the Lancang River during carbonate weathering. This work provides new insights into the global C-Ca cycle. The findings were published in Science China Earth Sciences in 2025.
A new study uncovers a crucial molecular pathway that enables endometrial cells to survive oxidative stress, fueling endometriosis progression. Researchers found that the CHK1/SGK1 axis plays a pivotal role in promoting cell survival and aging resistance, making it a promising therapeutic target. Antioxidants and CHK1/SGK1-specific inhibitors showed potential in reducing lesions, offering hope for improved treatments for endometriosis.
Mitochondria play an integral role in plant growth, fertility and adaptation. Notably, multiple chromosomal configurations are present in Saccharum complex mitogenomes. Substantial genomic reorganization and gene transfer events have occurred throughout their evolution.
The multidrug-resistant pathogen Acinetobacter baumannii (A. baumannii) is a global health concern. Its surface capsular polysaccharides and lipopolysaccharides, which are structurally diverse and often contain rare, non-classical sugars, are major virulence factors. These glycans represent promising targets for novel therapeutics. Notably, glycoconjugate vaccines based on these structures elicit protective antibodies and confer effective immunity in animal models, highlighting their potential for combating infections.
GPI anchoring is indispensable for cell-wall integrity and full virulence of the maize pathogen Cochliobolus heterostrophus. Deletion of ChGPI7 or ChFEM1 cripples appressorium formation, exposes chitin, and triggers host immune detection. A total of 124 potential GPI-anchored proteins were predicted, indicating that this pathway may serve as an antifungal target.
Functionalized single-walled carbon nanotubes (SWCNTs) effectively delivered the amh/amhy plasmid into all-female mandarin fish via immersion. At 40 mg/L, plasmid DNA and transcripts were detected in gonads within 7-14 days. By 60-120 days, some fish developed masculinized gonads with downregulated foxl2/cyp19a1a and upregulated amh/dmrt1. The study demonstrates SWCNTs as a viable gene delivery tool in fish and confirms the crucial role of amh/amhy in sex determination via the amhrII/smads pathway activating dmrt1.
Researchers at the University of Windsor developed SH17, a large open-source dataset with 8,099 images and 75,994 labeled instances, to improve detection of personal protective equipment (PPE) in manufacturing. Using advanced AI models such as YOLOv9, the study achieved over 70% accuracy, offering industries a scalable tool to enhance worker safety and compliance.