Human Genome Structure and Organization

Transcript and Presenter's Notes


1
Human Genome Structure and Organization
  • Bert Gold, Ph.D., F.A.C.M.G.

2
Genetic Variation
  • Phenotype
  • Expression of the genotype (modified by the
    environment). The structural or functional nature
    of an individual. Includes
  • appearance, physical features, organ structure
  • biochemical, physiologic nature
  • Genotype
  • Genetic status, the alleles an individual carries.

3
Learning Objectives
  • Recap and Update Public and Private Human Genome
    Project Status
  • Provide Reminders of Necessary Background for
    Genetic Disease Association and Linkage Studies

4
Definitions
  • Penetrance - The probability that an individual
    who is at risk for the disorder (i.e., carries
    the gene) develops (expresses) the condition.
    May be age dependent.
  • Expression - The characteristics of a trait or
    disease that are outwardly expressed.
    E.g., myotonic dystrophy: myotonia, cataracts,
    narcolepsy, frontal balding, infertility.
  • Ascertainment - The method used in gathering
    genetic data. Study conclusions differ depending
    on how affected individuals entered the study.
  • Phenocopy - An individual whose phenotype, under
    the influence of non-genetic agents, has become
    like the one normally caused by a specific
    genotype in the absence of non-genetic agents.
  • Pleiotropy - The capacity of an allele to produce
    more than one effect, i.e., to manifest its
    expression in the structure and/or function of
    more than one organ system or tissue.
  • Recurrence Risk - Likelihood that a relative of a
    proband for a rare disease will have the same
    disease.

5
Penetrance and Expressivity
  • Penetrance: proportion that expresses a trait
  • Complete: P = 1.0 or 100%
  • Incomplete (reduced): P < 1.0 or < 100%
  • Expressivity: severity of the phenotype
  • Expressivity may vary
  • Between families (interfamilial) or
  • Within families (intrafamilial)
  • TRY NOT TO CONFUSE VARIABLE EXPRESSIVITY WITH
    INCOMPLETE PENETRANCE
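
To make the distinction above concrete, here is a minimal sketch using made-up data. It assumes a hypothetical scoring scheme in which each carrier of the genotype gets a severity score and 0 means the trait is not expressed at all. Penetrance is then the fraction of carriers with any expression, and expressivity is the spread of severity among those who do express it. The function names are illustrative only.

```python
def penetrance(carrier_scores):
    """Fraction of genotype carriers who express the trait at all.

    carrier_scores: one severity score per carrier; 0 means the trait
    is not expressed (hypothetical scoring scheme).
    """
    if not carrier_scores:
        raise ValueError("need at least one carrier")
    affected = [s for s in carrier_scores if s > 0]
    return len(affected) / len(carrier_scores)


def expressivity_range(carrier_scores):
    """Lowest and highest severity among carriers who express the trait."""
    affected = [s for s in carrier_scores if s > 0]
    if not affected:
        return None
    return min(affected), max(affected)


# Made-up example: 10 carriers, 2 of whom show no sign of the trait.
scores = [0, 3, 1, 2, 0, 3, 1, 1, 2, 3]
print("penetrance:", penetrance(scores))
print("expressivity range:", expressivity_range(scores))
```

In this toy data set penetrance is incomplete (P < 1.0) and expressivity is variable, which are two separate observations about the same carriers.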

6
Chromosomes, Genes and Proteins
  • Genes are on Chromosomes
  • Genes may encode proteins or RNA

7
Non-coding RNA genes
  • tRNAs (497 were counted, 821 when counting genes
    and pseudogenes)
  • tRNAs found are consistent with Wobble
  • Codon bias only roughly correlated with tRNA
    distribution
  • rRNAs
  • small nucleolar RNAs (snoRNAs)
  • snRNAs (spliceosome constituents)
  • 7SL RNA
  • telomerase RNA
  • Xist transcript
  • Vault RNA

8
tRNAs

9
Some chromosomes are richer in genes than others
Number of Nucleotides in Exons

10
HOXA, HOXB, HOXC and HOXD are in regions with a
particularly low density of repeats. This is
believed to result from the presence of
cis-acting elements in this vicinity.

11
Proteins demonstrate patterns and similarity of
function

12
Functionally and structurally similar proteins
are organized into families
e.g., E.C., SWISS-PROT, TrEMBL

13
In silico approaches to characterize genes
include
  • PFAM, searchable via HMMER
  • Other in silico collections include
  • PRINTS
  • PROSITE
  • SMART
  • BLOCKS
  • Creation of an Integrated Protein Index (IPI)

14
How many genes are there?
  • Estimates from the Public Program
  • RefSeq
  • Exons
  • Introns
  • Average Sizes
  • Coding Sequences (CDS)
  • Alternative splice products (about 3)
  • Creation of an Integrated Gene Index (IGI)
  • Genscan to Ensembl to Pfam via GeneWise (31,778)
  • Could be as low as 24,500 using overprediction
    corrections.

15
Estimates from Celera
  • 25,086 in Assembly 3

16
Pre-existing estimates
  • W. Gilbert's back-of-the-envelope calculation
  • Reassociation Kinetics
  • Estimates from Double Twist using Promoter
    Inspector plus
  • Unpublished estimates from Human Genome Sciences

17
Size of Genes
  • Largest: Dystrophin, 2.4 Mb
  • Titin
  • 80,780 bp coding
  • 178 exons
  • largest single exon 17,106 bp

18
GENE HOMOLOGS, ORTHOLOGS, PARALOGS
  • Vacuolar sorting machinery in yeast
  • ABC gene superfamily
  • Ig gene superfamily
  • FGF superfamily
  • Intermediate filament superfamily
  • PROTEIN FAMILY EXPANSION APPEARS TO BE A PRIMARY
    EVOLUTIONARY MECHANISM

19
The proteome
  • Functional categories
  • PRINTS
  • Prosite
  • Pfam
  • InterPro (http://www.ebi.ac.uk/interpro/)

20
GENE ONTOLOGY
  • Standard Vocabulary
  • Hierarchy of terms (Directed ACYCLIC Graph)
  • Ashburner et al., Nature Genetics 25:25-29 (2000)
  • Bushy model
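
Because the hierarchy is a directed acyclic graph rather than a tree, a term may have more than one parent, and annotating a gene to a term implies annotation to every ancestor of that term. The sketch below illustrates this with a tiny made-up graph. The term names are placeholders, not real Gene Ontology identifiers, and the `ancestors` helper is hypothetical.

```python
def ancestors(term, parents):
    """Return every term reachable by following parent links upward.

    parents: dict mapping a term to a list of its direct parents.
    Works for a DAG, where a term may have several parents.
    """
    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        for parent in parents.get(current, ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


# Made-up DAG: term_d has two parents, which share a common root.
PARENTS = {
    "term_d": ["term_b", "term_c"],
    "term_b": ["root"],
    "term_c": ["root"],
}

print(sorted(ancestors("term_d", PARENTS)))
```

The `seen` set ensures that a shared ancestor such as the root is visited only once even though two paths lead to it.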

21
Horizontal Transfer controversy
  • One of the major conclusions of the Public Genome
    effort, published in the Feb. 15, 2001 issue of
    Nature, was:
  • Hundreds of human genes appear likely to have
    resulted from horizontal transfer from bacteria
    at some point in the vertebrate lineage. Dozens
    of genes appear to have been derived from
    transposable elements.
  • This has now been widely disputed. The apparent
    transfers are believed to result from:
  • Microbial contaminants in the sequence.
  • Bacterial gene integration into pre-vertebrates.
  • And: The more probable explanation for the
    existence of genes shared by humans and
    prokaryotes, but missing in nonvertebrates, is a
    combination of evolutionary rate variation, the
    small sample of nonvertebrate genomes, and gene
    loss in the nonvertebrate lineages.
  • Salzberg et al., Science

22
Splice Pattern: 98% GT-AG
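
The GT-AG rule says that most introns begin with the dinucleotide GT and end with AG on the coding strand. A minimal sketch of that check follows. The function name and the example sequences are made up for illustration, and real splice-site prediction uses much more sequence context than these four bases.

```python
def is_gt_ag_intron(intron_seq):
    """True if the intron starts with GT and ends with AG (DNA, sense strand)."""
    seq = intron_seq.upper()
    return len(seq) >= 4 and seq.startswith("GT") and seq.endswith("AG")


# Made-up intron sequences for illustration.
examples = ["GTAAGTCCTTTCAG", "GCAAGTCCTTTCAG", "gtaagtttag"]
for seq in examples:
    print(seq, is_gt_ag_intron(seq))
```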
23
Chromatin Structure
  • Euchromatin
  • Heterochromatin
  • Nucleosomes

24
Chromosome Facts
  • Chromosomes replicate during S phase
  • Chromosomes recombine during Pachytene
  • Recombination is an obligate activity
  • Sex chromosomes recombine with each other

25
Cytogenetics is done by Karyotyping
  • Chromosomes are chemically frozen in metaphase
  • Must be carried out on dividing cells
  • Microfilament inhibitors
  • Microtubule inhibitors
  • Membrane lysis
  • Pronase, trypsin digest
  • Giemsa stain
  • G-bands correspond to regions of relatively low
    GC content
  • http://genome.ucsc.edu/goldenPath/mapPlots/
  • http://genome.ucsc.edu/goldenPath/hgTracks.html

26
Cell Division Meiosis
  • Segregation
  • Defined: alleles are paired; gametes receive one
    of each.
  • Exceptions: trisomy and uniparental disomy
  • Independent Assortment
  • Gene pairs segregate independently
  • Exception: linkage

27
Meiosis Creates Gametes
  • And provides a basis for genetic recombination!

28
Genetic Recombination
  • Crossing Over
  • Resolution
  • Recombinant Chromosomes
  • OBLIGATE ACTIVITY
  • FEMALE RECOMB. RATES HIGHER THAN MALE
  • INCREASED RATES AT TELOMERES
  • PARADOX: SHORT ARMS SHOW MORE THAN LONG ARMS
  • 1 cM is about 1 Mb on long arms, but short arms
    show about 2 cM per Mb and the Yp-Xp
    pseudoautosomal region about 20 cM per Mb.
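
As a rough worked example of the rates quoted above, the sketch below converts a physical distance in Mb into an approximate genetic distance in cM. It assumes a simple linear relationship within each region, which is only a crude average, since real recombination rates vary locally and by sex. The dictionary keys and function name are illustrative.

```python
# Approximate average recombination rates from the slide, in cM per Mb.
RATE_CM_PER_MB = {
    "long_arm": 1.0,
    "short_arm": 2.0,
    "pseudoautosomal_Yp_Xp": 20.0,
}


def genetic_distance_cm(physical_mb, region):
    """Approximate genetic distance (cM) for a physical distance (Mb).

    Assumes a constant rate within the region; a rough average only.
    """
    return physical_mb * RATE_CM_PER_MB[region]


# The same 0.5 Mb interval gives very different genetic distances.
for region in RATE_CM_PER_MB:
    print(region, genetic_distance_cm(0.5, region), "cM")
```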

29
INCREASED RATES AT TELOMERES

30
PARADOX: SHORT ARMS SHOW MORE THAN LONG ARMS

31
Genes
  • Units of heredity
  • Encode proteins (and some RNAs)
  • Human genetics is the study of gene variation in
    humans
  • "Gene" as a term is used ambiguously to refer
    both to the locus and the allele, i.e., there is
    only one locus but two alleles in a given
    individual.
  • Sequencing in both genome projects took place
    upon multiple alleles; this has led to some
    assembly confusions.
  • Ultimately want a haploid genome map.

32
The Human Genome Project
  • International public effort commencing in 1990 to
    sequence the entire human genome by 2005.
  • STS approach chosen in 1991
  • Private effort launched in 1998 by Celera using
    shotgun cloning

33
BAC clones, sequenced into BAC end reads, and
assembled into contigs

34
Markerless contigs in the Celera assembly are
called Scaffolds

35
Markers are BAC ends in the shotgun

36
Mate pair reads provided the core of Celera
sequence

37
Draft human genome sequences complete by February
2001.
  • Published simultaneously in Feb. 2001
  • Public Sequence in NATURE (409: 745-964)
  • Celera Sequence in SCIENCE (291: 1145-1434)

38
More than 50% of the sequence is repetitive

39
45% of the human genome is derived from
transposable elements
  • Long Interspersed Elements, LINEs (21% of genome)
  • LINE1: some still active, autonomous, consist of
    two ORFs (one is a pol).
  • LINE2
  • LINE3
  • Short Interspersed Elements, SINEs (13% of
    genome)
  • Alu: some still active, use L1 enzymes to
    replicate
  • MIR
  • Ther2/MIR3
  • LTR Retroposons
  • Consist of gag and pol
  • Protease, rt, RNAseH, integrase all encoded
  • Reverse transcription occurs cytoplasmically,
    using a tRNA to prime replication
  • DNA Transposons

40
98.5% of sequence is non-coding.
  • Approximately 1/3 of the human genome is
    transcribed (public guess).

41
Allelism
  • Alternate forms of a gene
  • Recessive disease, e.g., Sickle Cell, CFTR
  • Dominant disease, e.g., Achondroplasia, Tuberous
    Sclerosis

42
Heterozygote or Homozygote
  • Genotype 1,2 (heterozygote) or 1,1 (homozygote)
  • Homozygosity: homogeneity of alleles at a locus

43
Genetic Markers
  • RFLPs
  • VNTRs (STRs)
  • Microsatellites
  • STSs
  • SNPs
  • Tools used to find disease genes
  • Flags with locations throughout the genome

44
Polymorphism Information Content versus
Heterozygosity (PIC vs. het)
  • Determining heterozygosity from SNP rare allele
    frequency
  • Information Content in SNPs versus STRs
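
The comparison above can be sketched numerically. The code below assumes Hardy-Weinberg proportions and uses the standard definitions: expected heterozygosity is 1 minus the sum of squared allele frequencies, and PIC (as defined by Botstein et al., 1980) subtracts an additional term for uninformative matings. The function names and example frequencies are illustrative, not taken from the slides.

```python
from itertools import combinations


def heterozygosity(freqs):
    """Expected heterozygosity: 1 - sum(p_i^2), assuming Hardy-Weinberg."""
    return 1.0 - sum(p * p for p in freqs)


def pic(freqs):
    """Polymorphism information content:
    1 - sum(p_i^2) - sum over pairs i<j of 2 * p_i^2 * p_j^2.
    """
    homozygous = sum(p * p for p in freqs)
    uninformative = sum(
        2.0 * (p * p) * (q * q) for p, q in combinations(freqs, 2)
    )
    return 1.0 - homozygous - uninformative


def snp_freqs(rare_allele_freq):
    """Allele frequencies of a biallelic SNP from its rare allele frequency."""
    return [1.0 - rare_allele_freq, rare_allele_freq]


# A biallelic SNP at several rare allele frequencies.
for q in (0.05, 0.15, 0.5):
    f = snp_freqs(q)
    print("SNP q =", q, "het =", heterozygosity(f), "PIC =", pic(f))

# A hypothetical STR with five equally frequent alleles.
str_freqs = [0.2] * 5
print("STR het =", heterozygosity(str_freqs), "PIC =", pic(str_freqs))
```

A biallelic SNP can never exceed a heterozygosity of 0.5, reached when both alleles are equally frequent, whereas a multi-allelic STR can go well above that, which is why a single STR is typically more informative than a single SNP.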

45
Typology of SNPs
  • Type I- Coding, non-synonymous, non-conservative
  • Type II- Coding, non-synonymous, conservative
  • Type III- Coding, synonymous
  • Type IV- Non-coding, 5' UTR
  • Type V- Non-coding, 3' UTR
  • Type VI- Other non-coding
  • Type I and Type II SNPs have lower heterozygosity
    than other SNPs, presumably as a result of
    selective pressure.
  • About 25% of type I and type II SNPs have minor
    allele frequencies > 15%
  • About 60% have minor allele frequencies < 5%

46
Mutation
  • Occurs more often during male meiosis
  • Occurs more often in long genes
  • More easily detected in Dominant Diseases
  • Achondroplasia
  • Duchenne Muscular Dystrophy
  • May often involve CpG mutating to TpG

47
Autosomal Recessive Inheritance
  • Two copies of a gene required to be affected
  • Carriers have one copy of the mutation and are
    unaffected
  • 25% of offspring of two carriers will be affected
  • Males and females affected in equal number
  • E.g., Sickle Cell, beta-thalassemia, CF
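
The 25% figure follows from enumerating the four equally likely allele combinations from two carrier parents. A minimal sketch, using the hypothetical labels "A" for the normal allele and "a" for the recessive mutation:

```python
from fractions import Fraction
from itertools import product


def offspring_genotypes(parent1, parent2):
    """Genotype probabilities for one autosomal locus.

    Each parent is a two-character string of alleles, e.g. "Aa".
    Every combination of one allele from each parent is equally likely.
    """
    counts = {}
    for allele1, allele2 in product(parent1, parent2):
        genotype = "".join(sorted(allele1 + allele2))
        counts[genotype] = counts.get(genotype, 0) + 1
    total = sum(counts.values())
    return {g: Fraction(n, total) for g, n in counts.items()}


# Two carriers: "A" = normal allele, "a" = recessive mutation (labels assumed).
result = offspring_genotypes("Aa", "Aa")
for genotype, prob in sorted(result.items()):
    print(genotype, prob)
```

Only the "aa" offspring are affected, the "Aa" offspring are unaffected carriers, and the "AA" offspring carry no copy of the mutation.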

48
X Linked Recessive (Sex Linked)
  • Females rarely affected
  • No male to male transmission
  • Affected males transmit gene to all daughters
  • E.g., Duchenne Muscular Dystrophy, Hemophilia A

49
Autosomal Dominant Inheritance
  • Each child at 50% risk
  • Does not skip generations
  • Often, lethal in double dose
  • Large genetic load

50
X-linked Dominant Pedigree
  • Example is Hypophosphatemic, Vitamin D Resistant
    Rickets
  • Distinguished from Autosomal Dominant by
  • No male-to-male transmission
  • All daughters of affected fathers are affected

51
IMPORTANT NOTE
  • Dominant and Recessive refer to the phenotypic
    expression of alleles, NOT to intrinsic
    characteristics of gene loci.

52
Inheritance Pattern Complexities
  • Pseudodominant Transmission of a Recessive
  • Pseudorecessive Transmission of a Dominant
  • Misassigned paternity, causal heterogeneity,
    incomplete penetrance, germline mosaicism
  • Mosaicism
  • Mitochondrial Inheritance
  • Penetrance and Expressivity
  • Semi-dominant, gender-influenced, age-related,
    transmission-related, imprinting
  • Uniparental Disomy (UPD)
  • Environmental effects, phenocopies

53
Preview of linkage analysis
  • Characterizing Human Genetics
  • Long generation time
  • Inability to control matings
  • Inability to control study population
  • Inability to control exposures to environmental
    conditions
  • It is possible to define phenotypes well!
  • Can study genetic structures through family
    history
  • Link phenotypes and genetic structures through
    statistical methods