image: The input elemental Wikipedia embeddings are mapped to node representations by MLP layers. Different GNN models are designed to generate the final embeddings through embedding post-processing (e.g. by a Transformer) and advanced message passing. A recommendation system is thus built to score the appearance of a link (binary system) or a triangle (ternary system) in the material networks. (PD: inner product; HDM: Hadamard product)
Credit: Yuan-Chao Hu from Songshan Lake Materials Laboratory.
A research team from the Songshan Lake Materials Laboratory has developed an AI-guided "Recommendation System" to discover new metallic glasses (MGs). By combining element embeddings learned from Wikipedia by a language model with graph neural networks that analyze hidden material relationships, this approach addresses longstanding challenges related to the vast chemical space and limited experimental datasets, opening new horizons for materials design and accelerating the development of next-generation MGs.
Metallic glasses are a class of amorphous alloys valued for their unique mechanical, chemical, and physical properties and are widely used in industries from aerospace to biomedicine. However, discovering new MGs remains a formidable challenge. Traditional trial-and-error methods require considerable time and resources to explore the vast compositional landscape, as the formation of MGs critically depends on complex multi-element interactions that are difficult to predict. Since the 1960s, researchers have explored only several thousand compositions from the vast space of elemental combinations available. While machine learning has brought new opportunities, its application faces fundamental challenges: scarce experimental data and limitations in how materials are represented computationally. Traditional methods encode materials using predefined physical properties, which may miss non-traditional glass-forming mechanisms and limit model generalizability. In this context, the recent study by Ouyang et al. introduces a novel framework that leverages the rich, unstructured knowledge stored in Wikipedia.
First, the authors generated element embeddings using AI models trained to process Wikipedia entries on chemical elements, enabling the extraction of rich, context-dependent information without human bias. Second, they constructed material networks in which elements serve as nodes and known metallic-glass compositions define the connections. On these networks, they applied graph neural networks (GNNs) that integrate Wikipedia-derived element features with the underlying network topology to predict new glass-forming systems.
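The network construction described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function names `build_network` and `candidate_links` are hypothetical, the short list of systems is a handful of well-known glass formers chosen for illustration rather than the study's dataset, and storing binary systems as links and ternary systems as triangles separately is an assumption based on the figure caption.

```python
from itertools import combinations

# Illustrative input only: a few well-known glass-forming systems,
# NOT the dataset used in the study.
known_systems = [
    ("Cu", "Zr"),
    ("Ni", "Nb"),
    ("Pd", "Si"),
    ("Zr", "Cu", "Al"),
]

def build_network(systems):
    """Return (nodes, links, triangles) for a list of element tuples.

    Elements become nodes; a known binary system becomes a link and a
    known ternary system becomes a triangle (assumed representation).
    """
    nodes, links, triangles = set(), set(), set()
    for system in systems:
        # Sort so that ("Zr", "Cu") and ("Cu", "Zr") map to the same key.
        elements = tuple(sorted(set(system)))
        nodes.update(elements)
        if len(elements) == 2:
            links.add(elements)
        elif len(elements) == 3:
            triangles.add(elements)
    return nodes, links, triangles

def candidate_links(nodes, links):
    """List every element pair that is not yet a known link."""
    return [p for p in combinations(sorted(nodes), 2) if p not in links]

nodes, links, triangles = build_network(known_systems)
candidates = candidate_links(nodes, links)
```

The unexplored pairs in `candidates` are the ones a trained model would then be asked to score.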
The core innovation of this work lies in transforming unstructured textual knowledge into structured, learnable representations that inform the search for amorphous alloys. The team processed Wikipedia articles from multiple languages to generate robust, semantic element embeddings, which encode chemical behaviours and relationships without relying solely on experimental data.
Using these embeddings as input, they trained several graph neural network architectures, including graph convolutional networks (GCN), neural graph collaborative filtering (NGCF), and Transformer-based GNNs (TransGNN), to model the complex interactions among elements in metallic glasses. These models serve as recommendation systems, suggesting element pairs for binary MGs and element triplets for ternary systems, thus guiding experimental efforts more intelligently.
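To make the recommendation step concrete, the sketch below scores a candidate pair with an inner product and a candidate triplet with a Hadamard (element-wise) product, the two operations named in the figure caption. Everything else is an assumption for illustration: the sigmoid, the reduction of the Hadamard product by summation, the function names, and the toy two-dimensional embeddings for placeholder elements "A", "B", and "C". In the study the final embeddings come from the trained GNNs.

```python
import math

def inner_product(u, v):
    """Sum of element-wise products of two vectors."""
    return sum(a * b for a, b in zip(u, v))

def hadamard(*vectors):
    """Element-wise product of any number of equal-length vectors."""
    return [math.prod(components) for components in zip(*vectors)]

def sigmoid(x):
    """Squash a raw score into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def score_link(u, v):
    """Assumed binary-system score: sigmoid of the inner product."""
    return sigmoid(inner_product(u, v))

def score_triangle(u, v, w):
    """Assumed ternary-system score: sigmoid of the summed Hadamard product."""
    return sigmoid(sum(hadamard(u, v, w)))

# Toy embeddings for placeholder elements (hypothetical values).
emb = {"A": [1.0, 0.5], "B": [0.8, 0.4], "C": [-1.0, 0.2]}

# Rank candidate pairs from most to least likely under this toy scoring.
pairs = [("A", "B"), ("A", "C"), ("B", "C")]
ranked = sorted(pairs, key=lambda p: score_link(emb[p[0]], emb[p[1]]), reverse=True)
```

Ranking candidates by score, rather than thresholding them, matches the recommendation framing: the highest-ranked unexplored systems are the ones to try first in the lab.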
Results demonstrated that the models could reliably identify promising compositions with high glass-forming ability, some of which had not been previously explored. The TransGNN architecture, enhanced with attention mechanisms, stood out as the most accurate, effectively capturing the long-range and multi-element relationships essential for MG formation. This approach confirmed that high-quality predictive performance could be achieved even when relying on knowledge from diverse natural language sources.
This work marks a paradigm shift from experimental trial-and-error to knowledge-driven computational prediction, one that accelerates the discovery process, reduces costs, and broadens the scope of feasible alloys.
The Future: This work showcases the potential of integrating natural language processing, graph learning, and materials science to revolutionize materials discovery. Future efforts will focus on expanding the knowledge base with further multilingual data, refining models to incorporate thermodynamic and kinetic information, and experimentally validating top predictions. Additionally, the framework can be extended beyond metallic glasses to other complex materials such as high-entropy alloys, composites, and functional ceramics.
By bridging unstructured textual data with sophisticated graph models, this approach paves the way for a new paradigm: knowledge-powered, data-efficient materials design that accelerates innovation while conserving resources. As the methodology matures, it promises to significantly shorten the developmental cycle of advanced materials, fostering rapid progress toward sustainable, high-performance technologies.
The research was recently published in the online edition of AI for Science.
Reference: Kaichen Ouyang, Shiyun Zhang, Song-Ling Liu, Jiachuan Tian, Yuanhao Li, Hua Tong, Hai-Yang Bai, Wei-Hua Wang and Yuan-Chao Hu. Graph learning metallic glass discovery from Wikipedia. AI Sci. DOI: 10.1088/3050-287X/ae1b20
Article Title
Graph learning metallic glass discovery from Wikipedia
Article Publication Date
14-Nov-2025