Using microarray measurement techniques, it is possible to measure the activity of genes simultaneously across the whole genome. Since genes influence each other's activity levels through complex regulatory networks, such gene expression measurements are state samples of a dynamical system. Gene expression data has proven useful for diagnosis and definition of disease subgroups, for inference of the functional role of a given gene, and for deciphering complex disease mechanisms. However, extracting meaning from data sets of such size and complexity requires the aid of computational methods. Dimensionality reduction methods represent high-dimensional data as point configurations in a lower-dimensional space in a way that optimally preserves geometrical or statistical properties. Nonlinear dimensionality reduction takes into account that data may be sampled from a general Riemannian manifold and attempts to uncover its intrinsic geometry.
This thesis deals with the application of spectral methods of nonlinear dimensionality reduction to gene expression data. It is demonstrated that nonlinear dimensionality reduction often yields more biologically relevant lower-dimensional representations than linear methods do. A method for robust estimation of geodesic distances is also proposed.
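To make the notion of geodesic distance concrete, the following is a minimal sketch of the standard graph-based estimate used in Isomap-style methods: connect each point to its nearest neighbours and take shortest-path lengths in the resulting graph. It is not the robust estimator proposed in the thesis, whose details are not given here; the function names, the choice of `k`, and the half-circle toy data are all assumptions made for illustration.

```python
import math


def euclidean(p, q):
    """Straight-line distance between two points given as tuples."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def geodesic_distances(points, k):
    """Approximate geodesic distances by shortest paths in a symmetric
    k-nearest-neighbour graph (hypothetical helper, for illustration only)."""
    n = len(points)
    d = [[euclidean(p, q) for q in points] for p in points]

    # Start with no edges: infinite distance everywhere except the diagonal.
    g = [[math.inf] * n for _ in range(n)]
    for i in range(n):
        g[i][i] = 0.0
        # Position 0 of the sorted order is the point itself, so skip it.
        nearest = sorted(range(n), key=lambda j: d[i][j])[1:k + 1]
        for j in nearest:
            g[i][j] = g[j][i] = d[i][j]

    # Floyd-Warshall all-pairs shortest paths.
    for m in range(n):
        for i in range(n):
            gim = g[i][m]
            for j in range(n):
                if gim + g[m][j] < g[i][j]:
                    g[i][j] = gim + g[m][j]
    return g


# Toy data (assumed): 50 points on a half circle of radius 1,
# a one-dimensional manifold embedded in the plane.
n = 50
points = [
    (math.cos(math.pi * t / (n - 1)), math.sin(math.pi * t / (n - 1)))
    for t in range(n)
]

geo = geodesic_distances(points, k=2)
chord = euclidean(points[0], points[-1])  # distance through the ambient space
arc = geo[0][-1]                          # distance along the sampled curve
```

In this toy case the Euclidean distance between the two end points is the chord of length 2, whereas the graph estimate follows the curve and comes out close to the arc length π. A linear method sees only the chord; a method built on geodesic distances sees the arc, which is the intrinsic geometry referred to above. Graph-based estimates of this kind are sensitive to noise and to the choice of `k`, which is one reason a robust estimator is of interest.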
- Mathematics (Faculty of Engineering)
- Fontes, Magnus, Supervisor
- Fioretos, Thoas, Supervisor
- Broberg, Per, Supervisor
- Pawitan, Yudi, Supervisor, External person
Publication status: Published - 2006
- Gene expression
- Manifold learning
- Nonlinear dimensionality reduction