Subtyping glioblastoma by combining miRNA and mRNA expression data using compressed sensing-based approach
1 Department of Biomedical Engineering, Tulane University, New Orleans, LA, USA
2 Department of Biostatistics and Bioinformatics, Tulane University, New Orleans, LA, USA
3 Center for Systems Biomedicine, Shanghai University for Science and Technology, Shanghai, China
EURASIP Journal on Bioinformatics and Systems Biology 2013, 2013:2 doi:10.1186/1687-4153-2013-2Published: 14 January 2013
In the clinical practice, many diseases such as glioblastoma, leukemia, diabetes, and prostates have multiple subtypes. Classifying subtypes accurately using genomic data will provide individualized treatments to target-specific disease subtypes. However, it is often difficult to obtain satisfactory classification accuracy using only one type of data, because the subtypes of a disease can exhibit similar patterns in one data type. Fortunately, multiple types of genomic data are often available due to the rapid development of genomic techniques. This raises the question on whether the classification performance can significantly be improved by combining multiple types of genomic data. In this article, we classified four subtypes of glioblastoma multiforme (GBM) with multiple types of genome-wide data (e.g., mRNA and miRNA expression) from The Cancer Genome Atlas (TCGA) project. We proposed a multi-class compressed sensing-based detector (MCSD) for this study. The MCSD was trained with data from TCGA and then applied to subtype GBM patients using an independent testing data. We performed the classification on the same patient subjects with three data types, i.e., miRNA expression data, mRNA (or gene expression) data, and their combinations. The classification accuracy is 69.1% with the miRNA expression data, 52.7% with mRNA expression data, and 90.9% with the combination of both mRNA and miRNA expression data. In addition, some biomarkers identified by the integrated approaches have been confirmed with results from the published literatures. These results indicate that the combined analysis can significantly improve the accuracy of classifying GBM subtypes and identify potential biomarkers for disease diagnosis.