Title: General Introduction to the Genome
1 General Introduction to the Genome
2 Outline
- Molecular Biology Major Events
- DNA, RNA
- Protein Synthesis (Transcription, Translation)
- Genome Anatomy
- Bioinformatics
- Genomic Signal Processing
4 Molecular Biology Major Events
1869: Johann Friedrich Miescher discovers DNA
5 Molecular Biology Major Events
Edward Tatum
6 The Central Dogma
[Figure labels: Target, Book, Book shelves, Nucleus]
7 What is Life made of?
8 Eukaryotes vs Prokaryotes
[Figure: DNA labeled in each cell type]
9 Prokaryotes vs Eukaryotes
Prokaryotes                                  Eukaryotes
Single cell                                  Single or multi cell
No nucleus                                   Nucleus
No organelles                                Organelles
One piece of circular DNA                    Chromosomes
No mRNA post-transcriptional modification    Exons/introns splicing
10 The Cell: Chemical Composition
- 70% water
- 7% small molecules
  - Salts
  - Amino acids (building blocks of proteins)
  - Nucleotides (building blocks of DNA and RNA)
- 23% macromolecules
  - Proteins
  - Polysaccharides
  - Lipids
11 The Cell: The 3 Critical Molecules
- Proteins: form enzymes and the body's components
- DNA: holds genetic information
- RNA: transfers information and synthesizes protein
13 DNA: The Nucleotide
14 DNA: Nitrogenous Bases
Purines: adenine (A), guanine (G)
Pyrimidines: cytosine (C), thymine (T)
15 DNA: Polymerization Reaction
[Figure labels: 5'-P, 3'-OH; strand ends 5' and 3']
16 DNA: Hydrogen Bonds
Number of base pairs = genome size; human genome (HG): 3200 Mbp (Mb)
17 DNA: Watson-Crick Model (1953)
Sugar-phosphate backbone
18 DNA: Watson-Crick Model
Sugar-phosphate backbone
Number of base pairs = genome size; human genome (HG): 3200 Mbp (Mb)
19 RNA versus DNA
DNA bases: G, A, C, T
RNA bases: G, A, C, U
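The difference between the two alphabets can be illustrated with a minimal Python sketch. It assumes the input is the DNA coding strand, so the RNA copy is obtained simply by writing U in place of T; the function name `dna_to_rna` is a hypothetical helper, not something from the slides.

```python
def dna_to_rna(dna: str) -> str:
    """Rewrite a DNA coding-strand sequence in the RNA alphabet (T becomes U)."""
    dna = dna.upper()
    invalid = set(dna) - set("GACT")
    if invalid:
        # Anything outside G, A, C, T is rejected rather than silently copied.
        raise ValueError(f"not DNA bases: {sorted(invalid)}")
    return dna.replace("T", "U")


rna = dna_to_rna("ATGGCATAG")
print(rna)  # AUGGCAUAG
```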
20 Protein Structure
- 1902: Emil Hermann Fischer wins the Nobel Prize; he showed that amino acids are linked together to form proteins.
21 Amino Acid: Basic Unit of Protein
Different side chains, R, determine the properties of the 20 amino acids.
[Figure: an amino acid, with labels Amino group and Carboxylic acid group]
23 Protein Structure
- Primary structure
- Secondary structure
- Super-secondary structure
- Tertiary structure
- Quaternary structure
24 Protein Structure Prediction Problem
Protein sequence -> Protein 3D structure -> Protein function
25 The Central Dogma: Genes Are the Blueprints of Proteins
[Figure labels: Genome, DNA, Gene, Protein]
27 Protein Synthesis: DNA, RNA, and the Flow of Information
- Replication: DNA to DNA
- Transcription: DNA to RNA
- Translation: RNA to protein
28 Protein Synthesis: Gene Expression
29 Splicing
[Figure labels: mRNA; Gene 1, Gene 2, Gene 3; segments 1, 2, 3]
30 Alternative Splicing
[Figure labels: Pre-mRNA; mRNA; Gene 1, Gene 2, Gene 3; segments 1, 3, 2]
31 mRNA Editing
[Figure labels: Pre-mRNA; mRNA; Gene 1, Gene 2, Gene 3; segments 1, 2, 3]
33 Translation
[Figure labels: Pre-mRNA; mRNA; Gene 1, Gene 2, Gene 3; Start codon; Stop codon; sequence as extracted: AUGAUAAC UA G; CV]
34 Protein Synthesis: The Genetic Code
[Figure: genetic code table, with Start and Stop codons marked]
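Reading an mRNA codon by codon, from a start codon to a stop codon, can be sketched in Python as follows. This is an illustrative sketch only: it assumes the standard genetic code, begins at the first AUG it finds, and uses a made-up input sequence; the names `CODON_TABLE` and `translate` are hypothetical.

```python
from itertools import product

# Standard genetic code, with codons enumerated in U, C, A, G order.
# "*" marks the three stop codons (UAA, UAG, UGA).
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(mrna: str) -> str:
    """Translate from the first AUG up to (not including) the first stop codon."""
    mrna = mrna.upper()
    start = mrna.find("AUG")
    if start == -1:
        return ""  # no start codon, nothing to translate
    protein = []
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


# Made-up example: AUG (Met), AUA (Ile), ACU (Thr), then the stop codon UAG.
print(translate("GGAUGAUAACUUAGCC"))  # MIT
```

The output uses one-letter amino acid codes. Characters outside U, C, A, G after the start codon would raise a `KeyError` in this sketch.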
35 Gene Regulation
[Figure labels: Gene 1; 1, 2, 3; R; Regulatory protein]
36 Gene Regulation
We have little knowledge about regulatory mechanisms.
[Figure labels: Regulatory protein, Gene 1, Gene 2]
37 How Big Is a Genome?
- A 12-point font allows approximately 60 nucleotides of DNA sequence to be written in a line 10 cm in length.
- Genome size: the total number of nucleotide base pairs.
- It is typically given in millions of base pairs, or megabases (abbreviated Mb or Mbp).
39 Written at 60 nucleotides per 10 cm line, the human genome sequence would stretch for about 5000 km, the distance from Montreal to London, Los Angeles to Panama, Tokyo to Calcutta, Cape Town to Addis Ababa, or Auckland to Perth.
The sequence would fill about 3000 books of 600 pages each.
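The 5000 km figure can be checked with a few lines of arithmetic. The sketch below uses only numbers stated in the slides (3200 Mbp, 60 nucleotides per 10 cm line, 3000 books of 600 pages); the variable names are illustrative.

```python
GENOME_BP = 3200 * 10**6   # human genome size from the slides: 3200 Mbp
NT_PER_LINE = 60           # nucleotides per printed line
LINE_CM = 10               # length of one printed line in cm

lines = GENOME_BP / NT_PER_LINE
length_km = lines * LINE_CM / 100 / 1000   # cm to m, then m to km
print(round(length_km))                    # 5333

BOOKS = 3000
PAGES_PER_BOOK = 600
nt_per_page = GENOME_BP / (BOOKS * PAGES_PER_BOOK)
print(round(nt_per_page))                      # 1778
print(round(nt_per_page / NT_PER_LINE, 1))     # 29.6
```

So the sequence comes to roughly 5300 km, and each page would hold about 30 lines of 60 nucleotides, which is consistent with the rounded figures on the slide.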
40 Genome sizes differ between organisms
41 Genome size is not a good indicator of gene number
42
- Space is saved in the genomes of less complex organisms because the genes are more closely packed together.
43 C-value Paradox
- The lack of a precise correlation between the complexity of an organism and the size of its genome was looked on as a bit of a puzzle.
44 Genome Anatomy
45 Human Genome Anatomy
Human genome:
- Nuclear genome
- Mitochondrial genome
46 Human Mitochondrial Genome Anatomy
- It is much smaller than the nuclear genome (about 17 kb) and contains just 37 genes.
- 13 genes code for proteins and 24 specify non-coding RNA.
- Its genes do not contain introns.
- It is typical of the mitochondrial genomes of other animals.
48 Nuclear Human Genome Anatomy
49 Nuclear Human Genome Anatomy: Protein-Coding Genes
50 Nuclear Human Genome Anatomy: Protein-Coding Genes
Five exons, separated by four introns.
On average, there are nine exons per gene.
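The "five exons, separated by four introns" layout can be made concrete with a toy example. The sketch below is an assumption-laden illustration: the exon and intron sequences are invented, and `build_pre_mrna` and `splice` are hypothetical helper names.

```python
def build_pre_mrna(exons, introns):
    """Interleave exons and introns; return the transcript and the exon (start, end) spans."""
    if len(introns) != len(exons) - 1:
        raise ValueError("n exons must be separated by n - 1 introns")
    parts, spans, pos = [], [], 0
    for i, exon in enumerate(exons):
        spans.append((pos, pos + len(exon)))  # end index is exclusive
        parts.append(exon)
        pos += len(exon)
        if i < len(introns):
            parts.append(introns[i])
            pos += len(introns[i])
    return "".join(parts), spans


def splice(pre_mrna, spans):
    """Keep only the exon spans, in order, discarding everything between them."""
    return "".join(pre_mrna[start:end] for start, end in spans)


# Toy data: five exons (upper case) separated by four introns (lower case).
exons = ["AUG", "GCU", "UAC", "GAU", "UAG"]
introns = ["guaaag"] * 4
pre_mrna, spans = build_pre_mrna(exons, introns)
mrna = splice(pre_mrna, spans)
print(len(pre_mrna), len(mrna))  # 39 15
print(mrna)                      # AUGGCUUACGAUUAG
```

Passing only a subset of `spans` to `splice` gives a simple model of exon skipping.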
51 Two gene segments (V28 and V29-1)
52 Nuclear Human Genome Anatomy: Pseudogenes
Non-functional genes
53 Nuclear Human Genome Anatomy: Genome-wide Repeats
54 Nuclear Human Genome Anatomy: Genome-wide Repeats
- Tandemly repeated DNA
  - Minisatellite DNA
  - Microsatellite DNA
- Interspersed genome-wide repeats
  - SINEs
  - LINEs
  - LTR elements
  - DNA transposons
55 Nuclear Human Genome Anatomy: Genome-wide Repeats
Minisatellite DNA
- Minisatellite DNA is a type of repetitive DNA that we are familiar with because of its association with structural features of chromosomes.
- An example is telomeric DNA, which in humans comprises hundreds of copies of the motif 5'-TTAGGG-3'.
5'-...TTAGGGTTAGGGTTAGGG...-3'
3'-...AATCCCAATCCCAATCCC...-5'
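The two strands shown for the telomeric repeat are complementary, and tandem copies of the motif can be counted directly. The following Python sketch illustrates both ideas on a made-up sequence; `complement` and `longest_tandem_run` are hypothetical helper names.

```python
import re

# Watson-Crick pairing: A with T, C with G.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def complement(strand: str) -> str:
    """Return the base-by-base complement (not reversed) of a DNA strand."""
    return strand.upper().translate(COMPLEMENT)


def longest_tandem_run(seq: str, motif: str) -> int:
    """Return the largest number of back-to-back copies of motif found in seq."""
    runs = re.findall(f"(?:{re.escape(motif)})+", seq.upper())
    return max((len(run) // len(motif) for run in runs), default=0)


# Made-up fragment containing three tandem copies of the telomeric motif.
fragment = "GC" + "TTAGGG" * 3 + "AT"
print(complement("TTAGGG"))                    # AATCCC
print(longest_tandem_run(fragment, "TTAGGG"))  # 3
```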
56 The Content of the Human Nuclear Genome
Genome-wide repeats: Microsatellite DNA
- Microsatellites with a CA repeat make up 0.25% of the genome, 8 Mb in all.
- Single base-pair repeats make up another 0.15%.
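These shares can be converted into megabases with a short calculation. The sketch below reads the slide's 0.25 and 0.15 as percentages, an assumption that is consistent with the stated 8 Mb for a 3200 Mb genome; `share_to_mb` is a hypothetical helper name.

```python
GENOME_MB = 3200  # human genome size in Mb, as given earlier in the slides


def share_to_mb(percent, genome_mb=GENOME_MB):
    """Convert a percentage of the genome into megabases."""
    return genome_mb * percent / 100


ca_repeat_mb = share_to_mb(0.25)    # CA-repeat microsatellites
single_base_mb = share_to_mb(0.15)  # single base-pair repeats
print(round(ca_repeat_mb, 1))    # 8.0
print(round(single_base_mb, 1))  # 4.8
```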
57 Nuclear Human Genome Anatomy: Genome-wide Repeats
Interspersed repeats
58 Gene Classification: Gene Function
- This system has the advantage that the fairly broad functional categories used can be further subdivided to produce a hierarchy of increasingly specific functional descriptions for smaller and smaller sets of genes.
- The weakness: functions have not yet been assigned to many eukaryotic genes.
59 Gene Classification: Gene Function
- The gene catalog cannot tell us why we are human.
- It may still not be possible, simply from comparisons with the chimpanzee genome, to determine what makes us human.
60 Gene Classification: Gene Function
- The major categories of protein-coding genes represent the most studied areas of cell biology, which means that many of the relevant genes can be recognized because their protein products are known.
- Genes whose products have not yet been identified are more likely to be involved in the less well studied areas of cellular activity.
61 Gene Classification: Protein Domain
- A more powerful method is to base the classification not on the functions of genes but on the structures of the proteins that they specify.
- A protein molecule is constructed from a series of domains, each of which has a particular biochemical function.
62 Gene Classification: Protein Domain
64 What is Bioinformatics?
- Integration of computational and biological methods
- To convert biological information into general theories.
aatgcatgcggctatgctaatgcatgcggctatgctaagctgggatccg
atgacaatgcatgcggctatgctaatgcatgcggctatgcaagctgggat
ccgatgactatgctaagctgggatccgatgacaatgcatgcggctatgct
aatgaatggtcttgggatttaccttggaatgctaagctgggatccgatga
caatgcatgcggctatgctaatgaatggtcttgggatttaccttggaata
tgctaatgcatgcggctatgctaagctgggatccgatgacaatgcatgcg
gctatgctaatgcatgcggctatgcaagctgggatccgatgactatgcta
agctgcggctatgctaatgcatgcggctatgctaagctgggatccgatga
caatgcatgcggctatgctaatgcatgcggctatgcaagctgggatcctg
cggctatgctaatgaatggtcttgggatttaccttggaatgctaagctgg
gatccgatgacaatgcatgcggctatgctaatgaatggtcttgggattta
ccttggaatatgctaatgcatgcggctatgctaagctgggaatgcatgcg
gctatgctaagctgggatccgatgacaatgcatgcggctatgctaatgca
tgcggctatgcaagctgggatccgatgactatgctaagctgcggctatgc
taatgcatgcggctatgctaagctcatgcggctatgctaagctgggaatg
catgcggctatgctaagctgggatccgatgacaatgcatgcggctatgct
aatgcatgcggctatgcaagctgggatccgatgactatgctaagctgcgg
ctatgctaatgcatgcggctatgctaagctcggctatgctaatgaatggt
cttgggatttaccttggaatgctaagctgggatccgatgacaatgcatgc
ggctatgctaatgaatggtcttgggatttaccttggaatatgctaatgca
tgcggctatgctaagctgggaatgcatgcggctatgctaagctgggatcc
gatgacaatgcatgcggctatgctaatgcatgcggctatgcaagctggga
tccgatgactatgctaagctgcggctatgctaatgcatgcggctatgcta
agctcatgcgg
65 Bioinformatics draws on:
- Computer Science: data structures, software engineering (C, C++, Perl)
- Biology: cell structure, genome, genes, DNA, RNA
- Statistics: Markov models, neural networks
- Chemistry: protein structure, molecular bonds
66 Bioinformatics Subareas
- The subareas within bioinformatics include genomics and proteomics.
- Genomics: genome comparison, evolutionary trees, microarray analysis, gene prediction, gene classification, gene regulation
- Proteomics: protein 3D structure prediction, protein-protein interaction, protein alignment
68 What is GSP?
- Using the theory and methods of signal processing
- To gain a global understanding of the genome.
69 GSP Labs
- The Genomic Signal Processing Laboratory at Texas A&M University.
- The Computational Biology Division of the Translational Genomics Research Institute in Phoenix, Arizona.
Goal: to model genomic regulatory mechanisms for the purposes of diagnosis and therapy.
Edward R. Dougherty
70 GSP Labs
- Columbia's Genomic Information Systems Laboratory at Columbia University
Dimitris Anastassiou
71 GSP Labs
- DSP Group, Department of Electrical Engineering, California Institute of Technology
P. P. Vaidyanathan
72 Mapping Character Strings to Numerical Sequences
AAAATTTTCCCGGGTAGCTTTCCCGGGT
0001110101010101111111111000
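One common way to turn a DNA string into numbers is to build a binary indicator sequence for each base: 1 where that base occurs, 0 elsewhere. The Python sketch below applies this to the slide's example string. It is an assumption about the kind of mapping intended and is not claimed to reproduce the numeric row printed on the slide; `indicator_sequences` is a hypothetical helper name.

```python
def indicator_sequences(dna: str) -> dict:
    """Map a DNA string to four 0/1 sequences, one per base (A, C, G, T)."""
    dna = dna.upper()
    return {base: [1 if ch == base else 0 for ch in dna] for base in "ACGT"}


seq = "AAAATTTTCCCGGGTAGCTTTCCCGGGT"
u = indicator_sequences(seq)
for base in "ACGT":
    # One row of 0s and 1s per base, aligned with the original string.
    print(base, "".join(str(bit) for bit in u[base]))
```

At every position exactly one of the four sequences is 1, so the four rows together carry the same information as the original string.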
73 Research Areas of GSP
- Gene prediction
  - Hidden Markov Models (HMM)
  - Fourier transform
  - Wavelet transform
- Resonant Recognition Model (RRM)
  - To identify the common hot spots of many protein molecules using Fourier transform methods.
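As one illustration of how a Fourier transform can be applied to a DNA sequence, the sketch below sums the squared DFT magnitude of the four base-indicator sequences at a chosen frequency index. A peak at index N/3 (a period of three bases) is a feature often discussed in connection with gene prediction. The slides do not specify a method, so this is an assumed example on a toy sequence, and `spectral_power` is a hypothetical helper name.

```python
import cmath


def indicator(dna: str, base: str) -> list:
    """Binary sequence: 1 where the given base occurs, 0 elsewhere."""
    return [1 if ch == base else 0 for ch in dna.upper()]


def spectral_power(dna: str, k: int) -> float:
    """Sum over A, C, G, T of |X_base[k]|^2, where X_base is the DFT of the indicator."""
    n = len(dna)
    total = 0.0
    for base in "ACGT":
        x = indicator(dna, base)
        coeff = sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        total += abs(coeff) ** 2
    return total


# Toy sequence with a strict period of 3, so the power concentrates at k = N/3.
periodic = "ATG" * 30
k3 = len(periodic) // 3
print(spectral_power(periodic, k3) > spectral_power(periodic, k3 + 1))  # True
```

The DFT is written out directly with `cmath` to keep the sketch dependency-free; a library FFT would be the usual choice for long sequences.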
76 Thank you for your attention