The Materials Genome Initiative (MGI) and the National Materials Genome Project were launched by the American and Chinese governments, respectively, in the past decade. One of the major goals of these programs is to facilitate the identification of materials data in order to speed up materials discovery and development. Current methods can identify structures effectively, but they have limited ability to handle all structures accurately and automatically in large materials databases, because differences in data sources and measurement errors lead to variations in bond lengths and bond angles.
Feng Pan and his colleagues at Peking University Shenzhen Graduate School propose a new paradigm based on graph theory (the GT scheme) to improve the efficiency and accuracy of material identification. It focuses on the "topological relationship" among structures rather than on the values of bond lengths and bond angles.
In the GT scheme, the researchers first simplify a crystal structure into a graph consisting only of vertices and edges: atoms become vertices, and adjacent atoms joined by actual chemical bonds are "connected" by edges. If the simplified graphs of two structures are isomorphic, that is, if they have the same topological connections, the GT scheme treats them as one structure. Using this method, automatic deduplication of a large materials database was achieved for the first time, identifying 626,772 unique structures among 865,458 original structures.
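The idea of comparing connectivity instead of geometry can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes small finite graphs whose vertices carry element symbols, whereas real crystals are periodic and need a more careful graph construction, and it uses a plain backtracking search for the isomorphism test. The names `make_graph`, `is_isomorphic` and `deduplicate` are hypothetical and chosen only for this illustration.

```python
def make_graph(labels, edges):
    """Build an undirected labelled graph: (element labels, adjacency sets)."""
    adj = [set() for _ in labels]
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)
    return list(labels), adj


def signature(graph):
    """Cheap invariant: sorted (element, degree) pairs. Equal for isomorphic graphs."""
    labels, adj = graph
    return tuple(sorted((labels[i], len(adj[i])) for i in range(len(labels))))


def is_isomorphic(g1, g2):
    """Backtracking test for a label-preserving, bond-preserving vertex mapping."""
    if signature(g1) != signature(g2):
        return False
    labels1, adj1 = g1
    labels2, adj2 = g2
    n = len(labels1)
    mapping = {}   # vertex of g1 -> vertex of g2
    used = set()   # vertices of g2 already assigned

    def extend(v):
        if v == n:
            return True
        for w in range(n):
            if w in used:
                continue
            if labels1[v] != labels2[w] or len(adj1[v]) != len(adj2[w]):
                continue
            # Bonds to every already-mapped vertex must agree in both graphs.
            if all((u in adj1[v]) == (mapping[u] in adj2[w]) for u in mapping):
                mapping[v] = w
                used.add(w)
                if extend(v + 1):
                    return True
                del mapping[v]
                used.discard(w)
        return False

    return extend(0)


def deduplicate(graphs):
    """Keep one representative per isomorphism class."""
    unique = []
    for g in graphs:
        if not any(is_isomorphic(g, u) for u in unique):
            unique.append(g)
    return unique


# Two Li-O rings with different atom numbering (bond lengths and angles are
# not stored at all), plus an open chain with different connectivity.
ring_a = make_graph(["Li", "O", "Li", "O"], [(0, 1), (1, 2), (2, 3), (3, 0)])
ring_b = make_graph(["O", "O", "Li", "Li"], [(0, 2), (2, 1), (1, 3), (3, 0)])
chain = make_graph(["Li", "O", "Li", "O"], [(0, 1), (1, 2), (2, 3)])

unique = deduplicate([ring_a, ring_b, chain])
print(len(unique))  # 2: the two rings collapse into one entry
```

Because only element labels and bonds are stored, two entries that differ in measured bond lengths or angles map to the same graph. The pairwise comparison in `deduplicate` is kept for clarity only; a database of hundreds of thousands of entries would need a faster approach, for example grouping candidates by an invariant such as `signature` before running the full test.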
Moreover, the GT scheme has been modified to solve more advanced problems, such as identifying highly distorted structures, distinguishing structures with strong similarity, and classifying complex crystal structures in materials big data. Compared with traditional structural chemistry methods, the GT scheme can address these issues much more easily, which enhances the efficiency and reliability of material identification.
By using this artificial intelligence technique, the researchers aim to achieve high-throughput calculation, preparation and detection for the materials database. The GT scheme departs from traditional materials research methods and accelerates development in the field of materials research.
This work, "Identify crystal structures by a new paradigm based on graph theory for building materials big data", has been published in SCIENCE CHINA Chemistry, and the paper is available online at: https:/
The authors thank Dr. Lin-Wang Wang from Lawrence Berkeley National Laboratory and Dr. Wenfei Fan from the University of Edinburgh for their helpful discussions. This work was supported by the National Key R&D Program of China (2016YFB0700600), the National Natural Science Foundation of China (21603007, 51672012), the Soft Science Research Project of Guangdong Province (2017B030301013), and the New Energy Materials Genome Preparation & Test Key-Laboratory Project of Shenzhen (ZDSYS201707281026184).
See the article: Mouyi Weng, Zhi Wang, Guoyu Qian, Yaokun Ye, Zhefeng Chen, Xin Chen, Shisheng Zheng, Feng Pan. Identify crystal structures by a new paradigm based on graph theory for building materials big data. Sci. China Chem., 2019, doi: 10.1007/s11426-019-9502-5