image: Performance of the compared methods on the selection of important feature groups with different sizes. The higher the curve, the better the performance.
Credit: Fan XU, Zhi-Jian ZHOU, Jie NI, Wei GAO
Recent years have witnessed impressive successes for tree models, yet an important problem is to understand their predictions, especially in critical applications. Previous interpretation methods for tree models focus on measuring the importance of individual features while ignoring the plentiful correlations and structures among multiple features.
To address this problem, a research team led by Wei GAO published new research on 15 May 2025 in Frontiers of Computer Science, co-published by Higher Education Press and Springer Nature.
The team proposed an interpretation method for tree models based on the importance of feature groups, which effectively exploits the inherent structures and complex correlations among multiple features.
They first introduced the BGShapvalue to measure the importance of feature groups, and showed that it satisfies several desirable properties. They then presented a polynomial-time algorithm, BGShapTree, to handle the exponential number of terms in the BGShapvalue for tree models. The basic idea is to decompose the BGShapvalue into leaves' weights and exploit the relationships between features and leaves. Based on this approach, the team presented a greedy algorithm to search for salient feature groups with large BGShapvalues. Extensive experiments on 20 benchmark datasets validated the effectiveness of the proposed approach.
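The release does not give the exact definition of the BGShapvalue or of BGShapTree. As a rough illustration of the kind of quantity involved, the sketch below computes a baseline Shapley value for feature groups on a toy hand-written tree by brute-force enumeration over group coalitions; the tree, the grouping, and all names are hypothetical, and this exponential enumeration is precisely what the authors' polynomial algorithm avoids.

```python
from itertools import combinations
from math import factorial

# Toy decision tree over features x0, x1, x2 (hypothetical example).
def predict(x):
    if x[0] <= 0.5:
        return 1.0 if x[1] <= 0.5 else 3.0
    else:
        return 5.0 if x[2] <= 0.5 else 7.0

def value(selected, groups, x, baseline):
    """Evaluate the tree on a hybrid point: features in the selected
    groups come from x, all other features from the baseline."""
    present = set()
    for g in selected:
        present.update(groups[g])
    z = [x[i] if i in present else baseline[i] for i in range(len(x))]
    return predict(z)

def group_shapley(target, groups, x, baseline):
    """Brute-force baseline Shapley value of group `target` over the
    group partition; exponential in the number of groups."""
    others = [g for g in range(len(groups)) if g != target]
    n = len(groups)
    total = 0.0
    for k in range(len(others) + 1):
        for S in combinations(others, k):
            w = factorial(k) * factorial(n - k - 1) / factorial(n)
            total += w * (value(set(S) | {target}, groups, x, baseline)
                          - value(set(S), groups, x, baseline))
    return total

# Two feature groups: {x0} and {x1, x2}.
groups = [[0], [1, 2]]
x = [0.0, 0.0, 0.0]         # instance to explain
baseline = [1.0, 1.0, 1.0]  # baseline point

phi = [group_shapley(g, groups, x, baseline) for g in range(len(groups))]
# Efficiency: the group values sum to predict(x) - predict(baseline).
assert abs(sum(phi) - (predict(x) - predict(baseline))) < 1e-9
```

The efficiency check at the end holds for any grouping: attributing importance at the group level preserves the Shapley axioms while letting correlated features be scored together.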
One future direction is to extend the proposed approach to more complex tree models such as XGBoost and deep forests; another is to develop more efficient approaches for searching salient feature groups.
Journal
Frontiers of Computer Science
Method of Research
Experimental study
Subject of Research
Not applicable
Article Title
Interpretation with baseline shapley value for feature groups on tree models
Article Publication Date
15-May-2025