News Release

Predicting cancer patient survival with gene expression data

Peer-Reviewed Publication

PLOS



Cancer specialists often talk about cancer as an umbrella term for over 200 different diseases, each having unique characteristics. But even these categories are too broad, as the same type of cancer can take very different paths in different people. Researchers have traditionally diagnosed and treated cancer based on microscopic analysis of cell size and shape, a method that's especially difficult for very closely related cancers, such as non-Hodgkin's lymphoma, which has 20 subtypes. As scientists learn more about the molecular alterations in cancer, they're beginning to establish cancer subtypes based on the underlying molecular footprint of a tumor. Now Eric Bair and Robert Tibshirani describe a procedure that combines both gene expression data and the patients' clinical history to identify biologically significant cancer subtypes and show that this method is a powerful predictor of patient survival.

Their approach uses clinical data to identify a list of genes that correlate with a particular clinical factor (such as survival time, tumor stage, or metastasis) and then applies statistical analysis to that list to look for additional patterns that reveal clinically relevant subsets of genes. In many retrospective studies, patient survival time is known, even though tumor subtypes are not; Bair and Tibshirani used that survival data to guide their analysis of the microarray data. They calculated the correlation of each gene in the microarray data with patient survival to generate a list of "significant" genes and then used these genes to identify tumor subtypes. Creating a list of candidate genes based on clinical data, the authors explain, reduces the chances of including genes unrelated to survival, increasing the probability of identifying gene clusters with clinical and thus predictive significance. Such "indicator gene lists" could identify subgroups of patients with similar gene expression profiles. These subgroups, defined by the gene expression profiles and clinical outcomes of previous patients, could then be used to assign future patients to the appropriate subgroup.
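The steps described above can be illustrated with a minimal Python sketch. It is not the authors' implementation and rests on several assumptions: it scores genes by plain Pearson correlation with survival time and ignores censoring, it uses a simple two-cluster k-means as the clustering step (the release does not name one), and it assigns a new patient to the nearest cluster centroid. All function names and the synthetic data are hypothetical and exist only for illustration.

```python
import numpy as np

def survival_correlations(expr, survival):
    """Pearson correlation of each gene (column of expr) with survival time.
    Assumption: survival times are fully observed (no censoring)."""
    x = expr - expr.mean(axis=0)
    y = survival - survival.mean()
    denom = np.sqrt((x ** 2).sum(axis=0) * (y ** 2).sum())
    return (x.T @ y) / denom

def select_genes(expr, survival, n_genes):
    """Indices of the n_genes genes most strongly correlated with survival."""
    corr = survival_correlations(expr, survival)
    return np.argsort(-np.abs(corr))[:n_genes]

def nearest_centroid(data, centroids):
    """Label each row of data with the index of its closest centroid."""
    dist = ((data[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return dist.argmin(axis=1)

def two_means(data, n_iter=50):
    """Two-cluster k-means with a deterministic start: the first sample
    and the sample farthest from it serve as initial centroids."""
    first = data[0]
    far = data[np.argmax(((data - first) ** 2).sum(axis=1))]
    centroids = np.stack([first, far])
    for _ in range(n_iter):
        labels = nearest_centroid(data, centroids)
        updated = np.stack([
            data[labels == k].mean(axis=0) if np.any(labels == k) else centroids[k]
            for k in range(2)
        ])
        if np.allclose(updated, centroids):
            break
        centroids = updated
    return nearest_centroid(data, centroids), centroids

def assign_subgroup(profile, genes, centroids):
    """Assign one new patient's expression profile to the nearest subgroup."""
    return int(nearest_centroid(profile[genes][None, :], centroids)[0])

# Synthetic data (hypothetical): 40 patients, 200 genes, two hidden subtypes.
rng = np.random.default_rng(0)
group = np.repeat([0, 1], 20)                 # hidden subtype per patient
expr = rng.normal(size=(40, 200))
expr[:, :10] += 3.0 * group[:, None]          # first 10 genes track the subtype
survival = 60.0 - 30.0 * group + rng.normal(scale=5.0, size=40)

genes = select_genes(expr, survival, n_genes=10)   # survival-guided gene list
labels, centroids = two_means(expr[:, genes])      # subtypes from those genes

# A future patient whose profile resembles the second hidden subtype.
new_patient = rng.normal(size=200)
new_patient[:10] += 3.0
new_label = assign_subgroup(new_patient, genes, centroids)
print("selected genes:", sorted(genes.tolist()))
print("new patient assigned to subgroup", new_label)
```

Clustering only the survival-associated genes, rather than all 200, is the point of the sketch: the unrelated genes would otherwise add noise that can obscure the clinically relevant grouping.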

By providing a method to narrow the thousands of genes measured on a microarray down to those most likely to have clinical relevance, Bair and Tibshirani have created a powerful tool to identify new cancer subtypes, predict expected patient survival, and, in some cases, help suggest the most appropriate course of treatment.

###

citation: Bair E, Tibshirani R (2004) Semi-supervised methods to predict patient survival from gene expression data. PLoS Biol: e108 DOI: 10.1371/journal.pbio.0020108

link: http://www.plosbiology.org/plosonline/?request=get-document&doi=10.1371/journal.pbio.0020108

PLEASE MENTION PLoS BIOLOGY (www.plosbiology.org) AS THE SOURCE FOR THESE ARTICLES. THANK YOU.

All works published in PLoS Biology are open access. Everything is immediately available without cost to anyone, anywhere--to read, download, redistribute, include in databases, and otherwise use--subject only to the condition that the original authorship is properly attributed. Copyright is retained by the authors. The Public Library of Science uses the Creative Commons Attribution License.

CONTACT:

Eric Bair
Stanford University
Palo Alto, CA 94304-2427
United States of America
650-575-2281
ebair@stanford.edu

