Title: Abstract
Applications of Homozygosity Haplotype in the
Study of Human Genetic Diseases with High Density
SNP Genotype
Haiyan Jiang1, Mark Samuels1,2, Duane Guernsey1,
Andrew Orr1,3
Departments of 1Pathology and 3Ophthalmology and
Visual Sciences, Dalhousie University, Halifax,
NS, Canada
2Department of Medicine, University of Montreal,
Montreal, QC, Canada
Abstract
In a large family with a specific disease,
patients usually share an identical-by-descent
(IBD) haplotype linked to the disease
susceptibility gene. Although many haplotype
analysis methods have been developed to detect
the shared interval, reconstructing haplotypes on
a genome-wide basis remains very difficult. A
non-parametric method, Homozygosity Haplotype
(HH), was recently proposed for the genome-wide
search for shared autosomal segments using high
density SNP genotypes. Rather than phasing the
haplotype, HH uses a reduced form of haplotype
described by the homozygous SNPs only, which
allows it to search the whole genome efficiently.
We studied the applicability and effectiveness of
HH in identifying the candidate region of a
causative gene using Illumina 550k genotype data
from affected members of a large family with
Schnyder crystalline corneal dystrophy (SCCD, MIM
121800), a rare autosomal dominant disease. HH
successfully detected the 1 Mb shared segment
with a minimum set of three samples. We propose
that HH can be applied to screen known causative
genes or loci by searching for the shared
homozygosity haplotype in patients who have
inherited a susceptibility gene from a common
ancestor. We developed a new strategy for the
genome-wide screening of known causative genes or
loci with high density SNP genotype data. It has
the potential to serve as an efficient
alternative to sequencing or microsatellite-based
fine mapping in genetic disease research and
clinical diagnosis.
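The idea of describing a haplotype by its homozygous SNPs only can be sketched in a few lines. The sketch below is an illustration under stated assumptions, not the published HH implementation: the genotype coding ("AA", "AB", "BB", None), the function names, and the rule that a run is broken wherever two subjects carry opposite homozygous calls are simplifications chosen for this example, and positions are assumed to be given in cM.

```python
from typing import List, Optional, Tuple

Genotype = Optional[str]  # "AA", "BB", "AB", or None for a missing call


def homozygosity_haplotype(genotypes: List[Genotype]) -> List[Genotype]:
    """Keep only homozygous calls; heterozygous and missing calls become None."""
    return [g if g in ("AA", "BB") else None for g in genotypes]


def conserved_runs(
    subjects: List[List[Genotype]],
    positions_cm: List[float],
    cutoff_cm: float,
) -> List[Tuple[float, float]]:
    """Return (start, end) runs in which no SNP shows opposite homozygous
    calls among the subjects, keeping runs at least cutoff_cm long."""
    hh = [homozygosity_haplotype(s) for s in subjects]
    runs = []
    start = None
    last = None
    for i, pos in enumerate(positions_cm):
        alleles = {h[i] for h in hh if h[i] is not None}
        if len(alleles) == 2:  # both "AA" and "BB" present: a mismatch
            if start is not None and last - start >= cutoff_cm:
                runs.append((start, last))
            start = None
        else:
            if start is None:
                start = pos
            last = pos
    if start is not None and last - start >= cutoff_cm:
        runs.append((start, last))
    return runs


# Toy data (hypothetical): two subjects, five SNPs, one mismatch at the third SNP.
toy_subjects = [
    ["AA", "AB", "BB", "AA", "AA"],
    ["AA", "AA", "AA", "AB", "AA"],
]
toy_positions = [0.0, 1.0, 2.0, 3.0, 4.0]
toy_runs = conserved_runs(toy_subjects, toy_positions, cutoff_cm=1.0)
```

Because heterozygous calls are simply ignored, no phasing step is needed, which is the property the abstract credits for the method's efficiency.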
3. Application to the screening of known
causative genes
Assuming that patients who have inherited the
disease susceptibility gene from a common
ancestor also share a haplotype in the
surrounding genomic interval, the HH approach can
be applied to screen known causative genes or
loci by searching for a shared homozygosity
haplotype around each gene. If the patients do
not share a significant region with a conserved
homozygosity haplotype (RCHH) around a known
gene, that gene can be excluded.
Impact of genotyping errors
Genotyping errors are difficult to detect when
only a few affected individuals in a family are
available for genotyping. We therefore developed
an approach to calculate the probability that
observed mismatches are due to genotyping error:
- Replace the mismatched compSNPs with concordant
SNPs to create a consistent homozygosity
haplotype.
- Run Monte Carlo (MC) simulations of genotyping
errors on the modified genotypes, using the
selected error model and error rate.
- Model the number of mismatched compSNPs created
by the simulated genotyping errors with a Poisson
distribution.
- Calculate the probability of obtaining N
mismatched compSNPs from genotyping error alone.
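The simulation steps above can be sketched as follows. This is a minimal illustration, not the procedure used on the poster: the genotype coding (0, 1, 2 copies of one allele), the uniform error model (an erroneous call is replaced by one of the other two genotypes at random), and all function names are assumptions made for this example. The poster's own error model is not reproduced here.

```python
import math
import random


def count_mismatches(genos):
    """Count SNPs where the subjects carry opposite homozygous calls (0 and 2)."""
    return sum(1 for calls in zip(*genos) if 0 in calls and 2 in calls)


def add_errors(genos, error_rate, rng):
    """Return a copy of genos in which each call is replaced, with probability
    error_rate, by a different genotype chosen uniformly (assumed error model)."""
    noisy = []
    for subject in genos:
        row = []
        for g in subject:
            if rng.random() < error_rate:
                g = rng.choice([x for x in (0, 1, 2) if x != g])
            row.append(g)
        noisy.append(row)
    return noisy


def simulate_mean_mismatches(genos, error_rate, trials, seed=0):
    """Monte Carlo estimate of the mean number of error-induced mismatches,
    used below as the Poisson rate parameter."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        total += count_mismatches(add_errors(genos, error_rate, rng))
    return total / trials


def poisson_pmf(k, lam):
    """P(X = k) for a Poisson variable with mean lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


def poisson_tail(k, lam):
    """P(X >= k) for a Poisson variable with mean lam."""
    return 1.0 - sum(poisson_pmf(i, lam) for i in range(k))
```

In use, `genos` would hold the genotypes after the mismatched compSNPs have been made concordant, and `poisson_tail(N, lam)` with `lam` taken from `simulate_mean_mismatches` gives the chance that N or more mismatches arise from genotyping error alone under these assumptions.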
2. Use HH to identify the candidate loci for
Schnyder crystalline corneal dystrophy
Background
Lincoln and Lander error model
Results
1. Homozygosity Haplotype Method
The whole-genome screening approach was validated
using a family with myoclonus dystonia (MIM
159900). The known causative genes are SGCE,
DRD2, and DYT1. A published causative mutation,
c.304C>T (R102X) in the SGCE gene, had been
detected in the affected family members by
sequencing. We tested whether the proposed
screening approach correctly excludes the
non-causative genes. Four patients from the
family were genotyped with Illumina HumanHap550
beadchips. HH was run to identify RCHHs shared by
the four patients with a cutoff of 3.0 cM.
Genome-wide mapping of RCHHs shared by four
patients from a Canadian family with Myoclonus
dystonia
Results
I. An RCHH at chr1:10,679,786-11,639,887 was
identified by the HH method from the genotype
data of 10 patients.
II. Minimal subset required to identify the
interval
Results of genotyping error simulation
Sample selection: select distantly related
individuals, because they share fewer RCAs.
Ratio of RCA to the total genetic length shared
by two descendants of a common ancestor, where m
and n are the numbers of generations separating
the two subjects from the common ancestor.
- The two genes DRD2 and DYT1 can be excluded
because no RCHH was detected around them. The
results of genotyping error simulations with P0
suggest the genotype data are reliable.
- The largest RCHH, at chr7:93,168,493-130,965,632
with a size of 37 Mb, includes the gene SGCE
(chr7:94,052,472-94,123,457).
- The study of myoclonus dystonia demonstrated that
the proposed screening approach successfully
excluded all non-causative genes. It also
identified the potential linkage to SGCE.
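The screening decision reported above reduces to asking whether a known gene falls inside any detected RCHH. A minimal sketch follows, assuming simple coordinate overlap as the criterion; the `Interval` type and function names are hypothetical, and the poster's actual rule for what counts as "around the gene" may be broader. Only coordinates stated on the poster are used.

```python
from typing import List, NamedTuple


class Interval(NamedTuple):
    chrom: str
    start: int
    end: int


def overlaps(a: Interval, b: Interval) -> bool:
    """True if the two intervals are on the same chromosome and intersect."""
    return a.chrom == b.chrom and a.start <= b.end and b.start <= a.end


def screen_gene(gene: Interval, rchhs: List[Interval]) -> List[Interval]:
    """Return the RCHHs overlapping the gene; an empty list means the gene
    would be excluded under this simplified criterion."""
    return [r for r in rchhs if overlaps(gene, r)]


# Coordinates as reported for the myoclonus dystonia family.
largest_rchh = Interval("chr7", 93_168_493, 130_965_632)
sgce = Interval("chr7", 94_052_472, 94_123_457)
sgce_hits = screen_gene(sgce, [largest_rchh])
```

Here `sgce_hits` contains the chr7 RCHH, so SGCE is retained as a candidate, while a gene with no overlapping RCHH would return an empty list and be excluded.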
Miyazawa H, et al. Homozygosity haplotype allows
a genomewide search for the autosomal segments
shared among patients. Am J Hum Genet. 2007
Jun;80(6):1090-102.
- Features of the HH method
  - Non-parametric
  - High efficiency
    - Complexity O(n^2), where n is the number of
      subjects
    - For Marfan syndrome (Affymetrix 500k SNP
      genotype, 9 subjects), the computational
      time is 6 s on a laptop.
  - Both dominant and recessive disease loci can be
    detected
- HH analysis may provide an advantage when
  6 ≤ m + n ≤ 50 (m and n are the numbers of
  generations separating the two subjects from
  their common ancestor), a range in which
  haplotype analysis or linkage analysis is
  difficult to perform.
- HH is well suited to the local population in the
  Atlantic region, where m + n < 20.
Conclusions
Our study of the HH approach with Illumina 550k
SNP genotype data from a series of monogenic
disease projects demonstrates that the HH method
is efficient and effective in identifying
disease-linked regions. Based on the idea of the
homozygosity haplotype, we developed a new
approach for the genome-wide screening of known
causative genes or loci using high density SNP
genotype data. Its successful application to a
family with a known causative mutation suggests
that the method has the potential to serve as an
efficient alternative to sequencing or
microsatellite-based fine mapping in the research
and clinical diagnosis of genetic diseases.
July 2008, ISMB 2008