Abstract
Human cells usually carry two copies of each chromosome, but in cancer cells abnormalities can occur. These consist of chromosome segments with an altered number of copies. There can be deletions as well as amplifications, and the lengths of the segments vary. Localising the deviant regions is of great importance for increasing knowledge of the disease. In this thesis the copy numbers are modelled using hidden Markov models (HMMs). A hidden Markov process can be described as a Markov process observed in noise; it thus consists of two processes, one an unobservable Markov process and the other the observed process.
In paper A we present a method suitable for aCGH data from tiling BAC arrays, where the probes are rather long and may overlap. The probes are also of unequal lengths and unevenly spread over the genome, which makes a continuous-index process a suitable choice. We assume that the Markov process has a discrete state space, and the parameters are estimated with an MCEM algorithm. The model in paper B modifies the model in paper A so that the Markov process takes values in a continuous state space. This makes the method more realistic, since it can handle larger differences in the data, including systematic errors. In addition, we assume some of the transition rates to be common in order to obtain a parsimonious model. We take a Bayesian approach and use reversible jump MCMC to simulate the Markov process.
In paper C we present a model designed for SNP data, which consist of allelic intensities for the two alleles at each SNP. We assume a discrete set of states, but keep the parsimonious approach from paper B, in which some of the transition rates are common. The SNPs are point measurements but are unevenly spread over the genome, which motivates a continuous-index process. In paper D we present an MCMC sampler suitable for hidden Markov models under a Bayesian approach. We alternate between updating the parameters and the trajectory, and for the latter update we present a sequential Monte Carlo method based on forward filtering-backward simulation. The method is applied to oligonucleotide copy number data with the same model as in paper B.
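As an illustration of two ideas mentioned above, a continuous-index Markov process observed at unevenly spaced positions and forward filtering-backward simulation, the following is a minimal sketch. It is not the method of any of the papers: it assumes only two hidden states (normal and altered), Gaussian observation noise, and exact discrete-state filtering rather than the sequential Monte Carlo variant of paper D. All function names, rates, positions and observations are hypothetical and chosen only for illustration.

```python
import math
import random


def transition_matrix(gap, rate_up, rate_down):
    """Transition probabilities of a two-state continuous-index Markov chain
    over a gap of the given length (closed form of the matrix exponential)."""
    total = rate_up + rate_down
    decay = math.exp(-total * gap)
    p01 = rate_up / total * (1.0 - decay)    # state 0 -> state 1
    p10 = rate_down / total * (1.0 - decay)  # state 1 -> state 0
    return [[1.0 - p01, p01], [p10, 1.0 - p10]]


def gaussian_pdf(y, mean, sd):
    z = (y - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def forward_filter(positions, observations, means, sd, rate_up, rate_down, initial):
    """Return the filtering distributions and the gap-specific transition matrices."""
    filtered, transitions = [], []
    previous = None
    for k, y in enumerate(observations):
        if k == 0:
            predicted = list(initial)
        else:
            # The transition matrix depends on the distance between probes.
            P = transition_matrix(positions[k] - positions[k - 1], rate_up, rate_down)
            transitions.append(P)
            predicted = [sum(previous[i] * P[i][j] for i in range(2)) for j in range(2)]
        weights = [predicted[j] * gaussian_pdf(y, means[j], sd) for j in range(2)]
        norm = sum(weights)
        previous = [w / norm for w in weights]
        filtered.append(previous)
    return filtered, transitions


def backward_sample(filtered, transitions, rng):
    """Draw one hidden trajectory from its joint posterior given all observations."""
    n = len(filtered)
    path = [0] * n
    path[-1] = 0 if rng.random() < filtered[-1][0] else 1
    for k in range(n - 2, -1, -1):
        nxt = path[k + 1]
        # Weight each state at k by how well it explains the sampled state at k + 1.
        w = [filtered[k][i] * transitions[k][i][nxt] for i in range(2)]
        path[k] = 0 if rng.random() < w[0] / (w[0] + w[1]) else 1
    return path


# Hypothetical example: unevenly spaced positions and noisy log-ratio observations.
positions = [0.0, 0.5, 0.7, 3.0, 3.1, 3.4, 8.0]
observations = [0.02, -0.05, 0.10, 0.60, 0.55, 0.70, 0.00]
means = [0.0, 0.58]  # assumed mean signal in the normal and altered states
filtered, transitions = forward_filter(
    positions, observations, means, sd=0.15,
    rate_up=0.2, rate_down=0.5, initial=[0.9, 0.1],
)
path = backward_sample(filtered, transitions, random.Random(1))
print(path)
```

Because the transition matrix is recomputed from the distance between neighbouring measurements, closely spaced probes are strongly coupled while distant ones are nearly independent. With more than two states the closed form no longer applies and a general matrix exponential of the rate matrix would be needed.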
Original language  English 

Qualification  Doctor 
Awarding Institution 

Supervisors/Advisors 

Award date  2010 Oct 22 
Publisher  
Publication status  Published  2010 
Bibliographical note
Defence details
Date: 2010-10-22
Time: 10:15
Place: MH:C
External reviewer(s)
Name: Hössjer, Ola
Title: Professor
Affiliation: Department of Mathematics, Stockholm University

Subject classification (UKÄ)
 Probability Theory and Statistics
Free keywords
 Hidden Markov models
 DNA copy number
 allelic copy number
 Markov chain Monte Carlo