Classification of genomic and proteomic data using support vector machines

Research output: Chapter in Book/Report/Conference proceeding › Book chapter


Supervised learning methods are used to construct a classifier. Such methods require that the correct class is known for at least some samples, which are used to train the classifier. Once trained, the classifier can be used to predict the class of unknown samples. Supervised learning methods have been applied many times in genomics, and we provide only a few examples here. Subtypes of cancers such as leukemia (Golub et al., 1999) and small round blue cell tumors (Khan et al., 2001) have been predicted from gene expression profiles obtained with microarrays. Microarray data have also been used to construct classifiers that predict patient outcome, such as whether a breast tumor is likely to give rise to a distant metastasis (van’t Veer et al., 2002) or whether a medulloblastoma patient is likely to have a favorable clinical outcome (Pomeroy et al., 2002). Proteomic patterns in serum have been used to identify ovarian cancer (Petricoin et al., 2002a) and prostate cancer (Adam et al., 2002; Petricoin et al., 2002b).
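
The train-then-predict workflow described above can be illustrated with a minimal sketch. It assumes a linear support vector machine fitted by sub-gradient descent on the regularized hinge loss, which is only one of several ways to train such a classifier and is not necessarily the procedure used in the chapter or in the cited studies. The two-feature data set, the function names, and the hyperparameters (lr, lam, epochs) are invented for illustration.

```python
# Toy training set: each sample is [gene_a, gene_b], two hypothetical,
# centered expression values. Labels +1 and -1 mark two invented classes.
train_samples = [
    [2.0, 2.5], [3.0, 2.0], [2.5, 3.5], [3.5, 3.0],
    [-2.0, -2.5], [-3.0, -2.0], [-2.5, -3.5], [-3.5, -3.0],
]
train_labels = [1, 1, 1, 1, -1, -1, -1, -1]


def decision_value(w, b, x):
    """Signed score of sample x under the linear model (w, b)."""
    return sum(wj * xj for wj, xj in zip(w, x)) + b


def train_linear_svm(samples, labels, lr=0.01, lam=0.01, epochs=100):
    """Fit a linear SVM by per-sample sub-gradient descent on the hinge loss.

    lr, lam and epochs are illustrative values, not tuned settings.
    """
    w = [0.0] * len(samples[0])
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            # A sample contributes to the hinge loss when its margin is below 1.
            violated = y * decision_value(w, b, x) < 1
            # The L2 regularization term shrinks w on every step.
            w = [wj - lr * lam * wj for wj in w]
            if violated:
                # Move the hyperplane toward classifying this sample correctly.
                w = [wj + lr * y * xj for wj, xj in zip(w, x)]
                b += lr * y
    return w, b


def predict(w, b, x):
    """Predict the class (+1 or -1) of a sample from the sign of its score."""
    return 1 if decision_value(w, b, x) >= 0 else -1


# Train on samples with known classes, then classify unknown samples.
w, b = train_linear_svm(train_samples, train_labels)
unknown_samples = [[3.0, 3.0], [-3.0, -3.0]]
predicted = [predict(w, b, x) for x in unknown_samples]
```

Real expression or proteomic data have thousands of features and far fewer samples, so in practice one would typically rely on an established SVM library and estimate performance with held-out samples or cross-validation rather than on the training set.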


Research areas and keywords

Subject classification (UKÄ)

  • Cancer and Oncology
Original language: English
Title of host publication: Fundamentals of Data Mining in Genomics and Proteomics
Editors: D. P. Berrar, W. Dubitzky, M. Granzow
ISBN (Print): 978-0-387-47508-0
Publication status: Published - 2007
Publication category: Research