The workflow of the study (IMAGE)
Caption
In this study, 773 untreated breast cancer patients from across China were enrolled and followed up for at least 5 years. Clinical data were obtained for all 773 cases, RNA sequencing data for 752 cases, and proteomic data for 271 cases.
We developed and trained models on 5 different data combinations and compared the performance of each feature combination across 5 different algorithms. We then optimized the number of features to select the most important subset from the large candidate pool. This yielded an optimal model containing 13 features.
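The feature-reduction step can be illustrated with a minimal sketch. The caption does not say which selection method the study used, so the code below substitutes a simple univariate ranking (absolute Pearson correlation with the outcome) and keeps the top 13 features. The data are synthetic, and the names `pearson` and `select_top_k` are hypothetical helpers introduced only for this illustration.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 if either input is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / math.sqrt(vx * vy)


def select_top_k(rows, labels, k):
    """Return the indices of the k features with the largest |correlation| to labels."""
    n_features = len(rows[0])
    scores = []
    for j in range(n_features):
        column = [row[j] for row in rows]
        scores.append((abs(pearson(column, labels)), j))
    # Highest score first; ties broken by feature index for determinism.
    scores.sort(key=lambda s: (-s[0], s[1]))
    return [j for _, j in scores[:k]]


# Synthetic stand-in data (NOT the study's data): 200 samples, 50 features,
# of which features 3 and 17 are made informative about the binary outcome.
rng = random.Random(0)
labels = [rng.randint(0, 1) for _ in range(200)]
rows = []
for y in labels:
    row = [rng.gauss(0, 1) for _ in range(50)]
    row[3] += 2.0 * y
    row[17] -= 2.0 * y
    rows.append(row)

# k=13 mirrors the size of the final model reported in the caption.
selected = select_top_k(rows, labels, k=13)
print(selected)
```

In practice the subset size would be chosen by comparing cross-validated performance at several values of k rather than fixed in advance.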
To make the model transparent and trustworthy, we applied interpretability techniques including SHAP and KAN. We will also package the interpretable model as a user-friendly web tool for clinicians, enabling real-time prediction and visualization of results. Finally, to further confirm the model's biological basis and clinical relevance, we performed immunohistochemical validation.
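The idea behind SHAP-style attribution can be sketched on a toy model. SHAP is built on Shapley values, which assign each feature its average marginal contribution to a prediction over all orderings in which features could be added. The code below computes these values exactly by brute force; it does not use the SHAP library or the study's model, and `toy_risk_score` is a hypothetical three-feature function invented for the example.

```python
from itertools import permutations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values for one sample by enumerating all feature orderings.

    Features not yet added in an ordering are held at their baseline value.
    Only feasible for a handful of features, since the cost grows as n!.
    """
    n = len(x)
    contributions = [0.0] * n
    for order in permutations(range(n)):
        current = list(baseline)
        previous = predict(current)
        for j in order:
            current[j] = x[j]
            value = predict(current)
            contributions[j] += value - previous  # marginal contribution of feature j
            previous = value
    total = factorial(n)
    return [c / total for c in contributions]


def toy_risk_score(features):
    """Hypothetical model: two linear terms plus one interaction term."""
    a, b, c = features
    return 0.5 * a - 0.2 * b + 0.1 * a * c


x = [2.0, 1.0, 3.0]
baseline = [0.0, 0.0, 0.0]
phi = shapley_values(toy_risk_score, x, baseline)
print(phi)
```

The attributions sum to the difference between the prediction for `x` and the prediction at the baseline, which is the property that makes per-feature explanations of this kind add up to the model output.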
Credit
Zhixuan Wu, Shengnan Yao, Lingli Jin, Xue Wu, Rongrong Zhang, Ouchen Wang, Erjie Xia
Usage Restrictions
None
License
Original content