Title: Lecture 1A
1Lecture 1A
- Introduction to BioInformatics
2Outline
- Define Bioinformatics
- What questions can be answered?
- History
- Human genome project
- Different applications of BioInformatics
- Omics Revolution
3What is BioInformatics?
- Application of tools of computation and analysis to the capture and interpretation of biological data
- Interdisciplinary field
- Essential for the management of data in modern biology and medicine
4BioInformatics vs Computational Biology
- Often used interchangeably
- Actually distinct terms
- Distinction made by the National Institutes of Health
- Bioinformatics
- Refers to the creation and advancement of algorithms
- An algorithm is a finite sequence of instructions, often used for calculation and data processing
- Computational and statistical techniques to solve problems arising from the management and analysis of biological data
- Computational Biology
- Refers to hypothesis-driven investigation of a specific problem using computers, with experimental or simulated data, to advance scientific knowledge
5Disciplines that have contributed
Physics
Biology
Medicine
Bioinformatics
Mathematics
Computer Science
Chemistry/Biochemistry
Statistics
6Why BioInformatics?
- Solve biological problems, usually on the molecular level
- Problems too great for human discernment
7What areas of biology use bioinformatics?
- Molecular Biology
- Genetic material
- Information in nucleic acids analyzed using bioinformatics
- Evolutionary Biology
- Developmental history of species
- Common ancestor phylogeny and how evolution works
- Structural Biology
- Physical forms of molecules
- Position of atoms within molecules
- Genetics
- Model molecular changes (over time)
- Mutations: gene changes and their protein products
8Tools of Bioinformatics
- Computer software programs
- Internet
- Sequence analysis of DNA and protein using various programs and databases available
- Evolving discipline
- Bioinformaticians
- Pharmaceutical companies
- Biomedical laboratories
9BioInformatics
- Used to answer fundamental questions in the life sciences
- What are the evolutionary origins of this protein?
- What gene does this DNA sequence code for?
- What does this gene do?
- How does this enzyme/ribozyme work and what does it look like?
- When is this gene expressed?
- What genes are expressed before the onset of cancer?
- What drugs can be used to treat this disease?
- What mutations are responsible for this genetic disorder?
10Applications
- No single comprehensive database exists for accessing all the information needed to manage data
- Construction of phylogenetic trees
- Predict gene location and products
- ORF finder
- BLAST or FASTA
- Predict protein structure
- Protein Explorer
- Literature Searches
- PubMed and Galileo
- Sequencing of genomes
- Sequence alignments
- Microarray analysis
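The "ORF finder" bullet above can be illustrated with a small sketch. This is not the implementation of any particular tool. It assumes the simplest possible definition of an open reading frame: an ATG start codon followed in the same frame by a stop codon, scanned on one strand only. The function name `find_orfs`, the `min_codons` parameter, and the toy sequence are hypothetical and chosen for illustration.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(dna, min_codons=3):
    """Return (start, end, frame) for each ATG...stop ORF on the given strand.

    `start` and `end` are 0-based, end-exclusive positions; the stop codon
    is included in the reported span.
    """
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        start = None  # position of the first ATG not yet closed by a stop
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                # Count the coding codons before the stop codon.
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, frame))
                start = None
    return orfs


# Toy sequence (made up for this example): one ORF in the third frame.
TOY_DNA = "CCATGAAATTTGGGTAACC"
for start, end, frame in find_orfs(TOY_DNA):
    print(frame, start, end, TOY_DNA[start:end])
```

A fuller version would also scan the reverse complement, giving six reading frames in total.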
11Human Significance
- Locate mutations responsible for genetic diseases
- Aids in the treatment and diagnosis of those diseases
- Pharmacogenomics
- Designer drugs and therapies
- Biotechnology
- Discover and exploit new enzymes
- Environmental clean-up
- Antibiotics and other chemotherapeutic agents
- Useful products
12Major Events in Molecular Biology History
- 1869 DNA discovered
- Johann Friedrich Miescher's "nuclein"
- 1941 One gene-one enzyme hypothesis proposed
- Beadle and Tatum
- 1950 Complementary base ratios discovered (A = T, G = C)
- Erwin Chargaff
- 1953 DNA is a double helix
- Watson, Crick and Franklin
- 1956 Role of ribosomes
- George Emil Palade
13Major Events in the History of Molecular Biology
- 1950s
- The first protein sequenced
- Frederick Sanger
- Edman degradation
- Simplified Sanger's method
- 1960s
- Ion exchange columns, chromatography and electrophoresis
- Sped up the process
- Pehr Edman
- Sequenator: automated sequencing
- 1977 (Sanger)
- Dideoxy termination sequencing for DNA
14Other Important Dates
- http://www.geocities.com/bioinformaticsweb/his.html
- 1955
- 1973
- 1977
- 1980
- 1985
- 1988
- 1990
- 1995 - 2000
- 2001 - 2003
15History of Bioinformatics
- Margaret Dayhoff (Bioinformatics Founder)
- Established the Atlas of Protein Sequence and Structure
- Annual publication to catalogue all known amino acid sequences
- Protein Information Resource (PIR) database in 1983
- Algorithms to study protein sequences
- Tools to design and utilize sequence databases
16Dayhoff's Contributions
Dayhoff wrote FORTRAN programs to solve a puzzle: sequence assembly went from weeks to minutes!
AVTALWGKVNVDEVG
VHLTPEEKS
AVTALWGKVNV
LVVYPWTQRF
GEALGRLLVVYP
PEEKSAVTA
KVNVDEVGGEALGR
These represent short segments of amino acid
sequences that make up hemoglobin.
17Dayhoff's Contributions
Dayhoff wrote FORTRAN programs to solve a puzzle: sequence assembly went from weeks to minutes!
AVTALWGKVNVDEVG
VHLTPEEKS
LVVYPWTQRF
PEEKSAVTA
AVTALWGKVNV
GEALGRLLVVYP
KVNVDEVGGEALGR
18Dayhoff's Contributions
Dayhoff wrote FORTRAN programs to solve a puzzle: sequence assembly went from weeks to minutes!
VHLTPEEKS
PEEKSAVTA
AVTALWGKVNV
AVTALWGKVNVDEVG
KVNVDEVGGEALGR
GEALGRLLVVYP
LVVYPWTQRF
19Dayhoff's Contributions
Dayhoff wrote FORTRAN programs to solve a puzzle: sequence assembly went from weeks to minutes!
VHLTPEEKSAVTALWGKVNVDEVGGEALGRLLVVYPWTQRF
- Her programs were later added to automated sequencers.
- She also established the Atlas of Protein Sequence and Structure (a book, 1965), which later became the PIR (1980)
- The PIR allowed sequence comparison, which led to molecular evolutionary biology (molecular phylogeny)
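The fragment puzzle on the preceding slides can be reproduced with a short sketch. This is not Dayhoff's FORTRAN program. It is a simple greedy approach, assumed here for illustration, that repeatedly merges the two fragments with the longest suffix-to-prefix overlap. The names `overlap`, `assemble`, and `min_len` are hypothetical; the fragments are the ones shown on the slides.

```python
def overlap(a, b, min_len=3):
    """Length of the longest suffix of `a` that is also a prefix of `b`."""
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def assemble(fragments, min_len=3):
    """Greedily merge the pair of fragments with the largest overlap."""
    unique = sorted(set(fragments))
    # Fragments fully contained in another fragment add no new residues.
    contigs = [f for f in unique if not any(f != g and f in g for g in unique)]
    while len(contigs) > 1:
        best_k, best_a, best_b = 0, None, None
        for a in contigs:
            for b in contigs:
                if a != b:
                    k = overlap(a, b, min_len)
                    if k > best_k:
                        best_k, best_a, best_b = k, a, b
        if best_k == 0:
            break  # no remaining overlaps; leave the contigs separate
        contigs.remove(best_a)
        contigs.remove(best_b)
        contigs.append(best_a + best_b[best_k:])
    return contigs


FRAGMENTS = [
    "AVTALWGKVNVDEVG",
    "VHLTPEEKS",
    "AVTALWGKVNV",
    "LVVYPWTQRF",
    "GEALGRLLVVYP",
    "PEEKSAVTA",
    "KVNVDEVGGEALGR",
]
print(assemble(FRAGMENTS))
```

For these seven fragments the sketch returns a single sequence identical to the assembled one shown on the slide. Greedy merging is not guaranteed to find the best assembly in general, for example when fragments contain repeats.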
20Human Genome Project
- Began 1990
- Estimated cost $3 billion
- Landmark achievement for bioinformatics
- Random sequencing
21Human Genome Project
- http://www.ornl.gov/sci/techresources/Human_Genome/project/journals/journals.shtml
- Click on Feb. 16th
- Human genome
- Read abstract
22Functional Genomics
- Since the sequencing of the human genome
- Emphasis is changing from genes themselves to gene products
- Functional genomics
- Assigns functional relevance to genomic information
- Study of genes, their resulting proteins, and the roles played by the proteins
- Proteomics: analysis of the proteins expressed by the cell
23Comparative Genomics
- Comparison of the sequenced genomes of a number of model organisms
- Study gene structure and function
- Human
- Plants: Arabidopsis thaliana
- Yeast: Saccharomyces cerevisiae
- Fruit fly: Drosophila melanogaster
- Nematode worm: Caenorhabditis elegans
- Mouse: Mus musculus
24Clinical Applications
- New drugs
- Targeted drugs
- Gene therapy for single genes
- Diagnosis and treatment plans
- Information on genetic disorders
- Potential adverse reactions
- Pharmacogenomics
25Human Genome Project and Omics Revolution
- Genomics
- DNA Sequence
- Homology, locations of genes and functional sites, phylogeny, mapping
- Infer function
- Transcriptomics
- mRNA sequence and structure
- Determine expression mechanisms via identifying alternative splicing regions
- Proteomics
- Amino acid sequence and protein structure
- Predict structure
- Solve structure
- Infer function from structure
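The three levels on this slide (DNA sequence, mRNA sequence, amino acid sequence) are linked by transcription and translation, which can be sketched in a few lines. The code below assumes the standard genetic code and a coding-strand DNA input. The function names `transcribe` and `translate` and the toy DNA string are hypothetical. The toy string was made up so that it encodes a start codon followed by the first residues of the hemoglobin fragment from the earlier slides (VHLTPEEKS); it is not taken from a database record.

```python
from itertools import product

# Standard genetic code, listed in U, C, A, G order with the third base
# varying fastest; '*' marks a stop codon.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def transcribe(coding_strand):
    """Coding-strand DNA to mRNA: same sequence with T replaced by U."""
    return coding_strand.upper().replace("T", "U")


def translate(mrna):
    """Translate mRNA from its first base until the first stop codon."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


# Toy coding sequence (illustrative only, not a database record).
TOY_DNA = "ATGGTGCACCTGACTCCTGAGGAGAAGTCTTAA"
mrna = transcribe(TOY_DNA)
print(mrna)
print(translate(mrna))
```

The sketch ignores splicing, reading-frame detection, and non-standard genetic codes, all of which matter for real transcriptomic and proteomic data.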
26Omics
- Metabolomics
- Studying proteins and enzymatic pathways involved in cell metabolism
- Glycomics
- Studying the carbohydrates of a cell
- Interactomics
- Studying the complex interactions of protein networks in a cell
- Nutrigenomics
- Interactions between diet and genes