Colorectal cancer (CRC) is the third most common cancer among women and men in the USA, and recent studies have shown an increasing incidence in less developed regions, including Sub-Saharan Africa (SSA). A team of researchers from South African universities, in collaboration with the University of South Carolina-Upstate, has developed a method that relies on a hybrid signature (based on patterns in DNA mutation and RNA expression) and has assessed its predictive properties for the mutation status and survival of CRC patients.
The research team, led by Mohanad Mohammed at the University of KwaZulu-Natal, South Africa, investigated publicly available microarray and RNASeq data from 54 matched formalin-fixed paraffin-embedded (FFPE) samples from the Affymetrix GeneChip and RNASeq platforms. The samples were used to identify genes that are differentially expressed between mutant and wild-type samples. The researchers then applied bioinformatics techniques for classification, including support-vector machines, artificial neural networks, random forests, k-nearest neighbor, naïve Bayes, negative binomial linear discriminant analysis, and Poisson linear discriminant analysis. A Cox proportional hazards model was used for survival analysis.
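To give a sense of what one of the classifiers named above does, here is a minimal sketch of k-nearest neighbor classification in plain Python. It is not the study's pipeline: the expression values, the three-gene profiles, the labels, and the function names are all hypothetical and chosen only for illustration.

```python
import math
from collections import Counter


def euclidean(a, b):
    """Euclidean distance between two equal-length expression profiles."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_predict(train_x, train_y, query, k=3):
    """Label a query profile by majority vote among its k nearest training samples."""
    # Rank training samples by distance to the query, nearest first.
    ranked = sorted(zip(train_x, train_y), key=lambda pair: euclidean(pair[0], query))
    # Count the labels of the k nearest samples and return the most frequent one.
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical expression values for three genes in six samples (made-up numbers).
train_x = [
    [5.1, 2.0, 7.3],
    [4.8, 2.2, 7.0],
    [5.3, 1.9, 7.5],
    [2.0, 6.1, 3.2],
    [2.3, 5.8, 3.0],
    [1.9, 6.3, 3.4],
]
train_y = ["mutant", "mutant", "mutant", "wild-type", "wild-type", "wild-type"]

# Classify a new, unlabeled sample.
print(knn_predict(train_x, train_y, [5.0, 2.1, 7.1], k=3))
```

In practice, a real analysis would work with many more genes and samples and would evaluate the classifier on held-out data rather than on a single query.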
The researchers noted that, compared with the gene list from each of the individual platforms, the hybrid gene list had the highest accuracy, sensitivity, specificity, and area under the curve (AUC) for mutation status across all the classifiers, and that it is prognostic for survival in patients with CRC. The negative binomial linear discriminant analysis method was the best performer on the RNASeq data, while the support-vector machine (SVM) method was the most suitable classifier for CRC across the two data types. The researchers concluded that nine genes were predictive of survival.
"This signature could be useful in clinical practice, especially for colorectal cancer diagnosis and therapy," notes Mohammed. Future studies should determine the effectiveness of integration in cancer survival analysis and its application to unbalanced data, where the classes are of different sizes, as well as to data with multiple classes. The research has been published in the journal Current Bioinformatics.
Authors: Mohanad Mohammed1, Henry Mwambi1, and Bernard Omolo2, 3
1School of Mathematics, Statistics and Computer Science, University of KwaZulu-Natal, South Africa
2Division of Mathematics & Computer Science, University of South Carolina-Upstate, USA
3School of Public Health, Faculty of Health Sciences, University of the Witwatersrand, Johannesburg, South Africa