News Release

XGCN: A library for large-scale graph neural network recommendations

Peer-Reviewed Publication

Higher Education Press

Image: XGCN’s overall framework and usage example. Credit: Xiran SONG, Hong HUANG, Jianxun LIAN, Hai JIN

Graph Neural Networks (GNNs) have gained widespread adoption in recommendation systems. When it comes to processing large graphs, however, GNNs may encounter scalability issues stemming from their multi-layer message-passing operations. Consequently, scaling GNNs has emerged as a crucial research area in recent years, and numerous scaling strategies have been proposed. To promote the study of GNN-based recommendation, a number of open-source recommendation libraries have incorporated GNNs as a key model category. Yet when dealing with large graphs, most existing libraries still have two limitations. The first is inadequate consideration of scaling strategies: most libraries only implement specific GNN algorithms without taking scaling strategies into account. The second is the lack of large-graph processing optimizations: many official model implementations overlook certain coding details, leading to scaling issues when processing large graphs.

To address these problems, a research team led by Hong Huang and Hai Jin published their new research on 14 March 2024 in Frontiers of Computer Science, co-published by Higher Education Press and Springer Nature.

The team proposed a Python-based library named XGCN, aimed at helping users quickly build and run large-scale GNNs in a single-machine environment. The library supports various scaling strategies, offers optimized implementations for large graphs, and provides an easy-to-use interface for both running models and developing new ones. Specifically, XGCN includes 16 embedding models in total, covering a broad spectrum from shallow models to common GNNs, together with three kinds of mainstream scaling strategies: layer-sampling-based, decoupling-based, and clustering-based methods. It also incorporates implementation-level optimizations, such as a set of Numba-accelerated operation functions, that make the models more suitable for large graphs. Detailed documentation for usage guidance and model-running scripts are also provided.
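To illustrate the kind of optimization described here, the following is a minimal sketch of a Numba-compiled neighbor-aggregation kernel over a graph stored in CSR (compressed sparse row) format. It is not XGCN’s actual code; the function and variable names are illustrative assumptions.

import numpy as np
from numba import njit, prange

@njit(parallel=True)
def mean_aggregate(indptr, indices, emb, out):
    # Average each node's neighbor embeddings; indptr/indices are the
    # CSR representation of the adjacency lists. (Illustrative sketch,
    # not XGCN's actual implementation.)
    num_nodes = indptr.shape[0] - 1
    for u in prange(num_nodes):
        start, end = indptr[u], indptr[u + 1]
        degree = end - start
        if degree == 0:
            continue
        for j in range(start, end):
            out[u] += emb[indices[j]]
        out[u] /= degree

# Toy usage: a 3-node graph with 4-dimensional embeddings.
indptr = np.array([0, 2, 3, 4], dtype=np.int64)
indices = np.array([1, 2, 0, 0], dtype=np.int64)
emb = np.random.rand(3, 4).astype(np.float32)
out = np.zeros_like(emb)
mean_aggregate(indptr, indices, emb, out)

Compiling the hot loop to native, parallel code in this manner avoids Python-level overhead on graphs with millions of edges, which is consistent with the library’s stated goal of single-machine scalability.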

Experiments evaluate several different implementations of a classic graph convolution operation on datasets of different scales, ranging from 0.15 million to 3 million nodes. The existing implementations tend to run out of memory, while XGCN consistently shows superior time and memory efficiency.
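For context, the classic graph convolution in question is commonly the symmetric-normalized propagation E' = D^{-1/2} A D^{-1/2} E used by GCN- and LightGCN-style models. The sketch below (an assumption for illustration, not the paper’s benchmark code) shows why a sparse-matrix formulation matters at this scale: a dense adjacency matrix for 3 million nodes would occupy tens of terabytes, while the sparse version costs memory and time proportional to the number of edges.

import numpy as np
import scipy.sparse as sp

def normalized_propagate(A: sp.csr_matrix, E: np.ndarray) -> np.ndarray:
    # Compute D^{-1/2} from node degrees, guarding against isolated nodes.
    deg = np.asarray(A.sum(axis=1)).ravel()
    d_inv_sqrt = np.where(deg > 0, deg, 1.0) ** -0.5
    D = sp.diags(d_inv_sqrt)
    # One layer of propagation: sparse-dense matmul, O(|edges| * dim).
    return (D @ A @ D) @ E

# Toy usage: an undirected 4-node path graph, 8-dimensional embeddings.
rows = np.array([0, 1, 1, 2, 2, 3])
cols = np.array([1, 0, 2, 1, 3, 2])
A = sp.csr_matrix((np.ones(len(rows)), (rows, cols)), shape=(4, 4))
E = np.random.rand(4, 8).astype(np.float32)
E_next = normalized_propagate(A, E)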

