Title: Introduction to Bioinformatics
1Introduction to Bioinformatics
2What is Bioinformatics
- Easy Answer
- Using computers to solve molecular biology
problems
- Intersection of molecular biology and
computer science
- Hard Answer
- Computational techniques (e.g. algorithms,
artificial intelligence, databases) for
management and analysis of biological data and
knowledge
3Bioinformatics
- Bioinformatics = Biology + Information
- Biology is becoming an information science
- Computational methods are necessary to analyze
the massive amount of information that is coming
out of the genome projects
4Bioinformatics is Another Revolution in Biology
5Three concepts, which remain central to
Bioinformatics
- A complex, dynamic, three-dimensional molecule
is represented as a simple string of characters
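To make the idea concrete, here is a minimal Python sketch of treating a DNA molecule as a plain string. The sequence is a made-up example, and the helper names `base_composition` and `gc_content` are illustrative assumptions, not something taken from the slides.

```python
from collections import Counter


def base_composition(seq: str) -> dict:
    """Count how often each base occurs in a DNA string."""
    return dict(Counter(seq.upper()))


def gc_content(seq: str) -> float:
    """Return the fraction of bases that are G or C."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


dna = "ATGCGCTA"  # made-up example sequence
print(base_composition(dna))
print(gc_content(dna))  # 0.5
```

Once the molecule is a string, ordinary string operations (counting, slicing, comparing) become the basic tools of analysis.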
6Three concepts, which remain central to
Bioinformatics
- The concept of similarity
- Evolution has operated on every sequence
- In biomolecular sequences (DNA, RNA or amino acid
sequences), high sequence similarity usually
implies significant functional or structural
similarity
- The converse is not true: similar function or
structure does not require similar sequence
- Algorithms for comparing sequences and finding
similar regions are at the heart of bioinformatics
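As a rough illustration of sequence similarity, the sketch below computes percent identity between two sequences. It assumes the sequences are already aligned and of equal length. Real comparison algorithms also handle insertions and deletions, which this sketch does not. The function name and the example sequences are hypothetical.

```python
def percent_identity(a: str, b: str) -> float:
    """Percentage of positions at which two pre-aligned sequences match."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length (already aligned)")
    if not a:
        return 0.0
    matches = sum(1 for x, y in zip(a, b) if x == y)
    return 100.0 * matches / len(a)


# Two made-up sequences that differ at 2 of 7 positions.
print(round(percent_identity("GATTACA", "GACTATA"), 1))  # 71.4
```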
7Three concepts, which remain central to
Bioinformatics
- Bioinformatics is not a theoretical science; it
is driven by the data, which in turn is driven by
the needs of biology
- Sequences
- Microarray technologies
8GenBank Growth
9Moore's Law
10What do you need to know?
- It all depends on your background
- Are you a
- Biologist with some computer knowledge, or
- Computer scientist with some biology background?
- Few do both well
11Background
- Biology for Computer Scientists
- Computer Science for Biologists
12Biological Information Flow
Genome
Introns/Exons
Gene Sequence
Protein Sequence
Protein Structure
Protein Functions
Cellular Pathways
Bioinformatics attempts to model this pathway
13Living Things
- Entropy (the tendency toward disorder) always
increases
- Living organisms have low entropy compared with
things like soil
- They are relatively orderly
- The most critical task is to maintain the
distinction between inside and outside
14Living Things
- In order to maintain low entropy, living
organisms must expend energy to keep things
orderly
- They figured out how to do this 4 billion years
ago
- The functions of life, therefore, are meant to
facilitate the acquisition and orderly
expenditure of energy
15Living Things
- The compartments with low entropy are separated
from the world
- Cells are the smallest unit of such compartments
- Bacteria are single-cell organisms
- Humans are multi-cell organisms
17The living things have the following tasks
- Gather energy from environment
- Use energy to maintain inside/outside distinction
- Use extra energy to reproduce
- Develop strategies for being successful and
efficient at the above tasks
- Develop ways to move around
- Develop signal transduction capabilities (e.g.
vision)
- Develop methods for efficient energy capture
(e.g. digestion)
- Develop ways to reproduce effectively
18How to accomplish?
- Living compartments on earth have developed three
basic technologies
- Ability to separate inside from outside (lipids)
- Ability to build three-dimensional molecules that
assist in the critical functions of life
(protein, RNA)
- Ability to compress the information about how
(and when) to build these molecules in a linear
code (DNA)
19Bioinformatics Schematic of a Cell
20Lipids
- Made of a hydrophilic (water-loving) molecular
fragment connected to hydrophobic fragments
- Spontaneously form sheets (lipid membranes) in
which all the hydrophilic ends align on the
outside, and hydrophobic ends align on the inside
- This creates a very stable separation that is not
easy to pass through, except for water and a few
other small atoms/molecules
21What is a Nucleotide?
- A nucleotide consists of a pentose (five-carbon
sugar), a base, and a phosphate group
22Pentose: RNA and DNA
23Base
- Adenine (A), Cytosine (C), Guanine (G), Thymine
(T), Uracil (U)
24Nucleic Acid Chain
- Condensation reaction
- Orientation
- From 5' to 3'
- In DNA or RNA, a nucleic acid chain is called a
strand
- DNA: double-stranded
- RNA: single-stranded
- Length: the number of bases
- Base pairs (bp) in DNA
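Strand orientation and double-strandedness can be illustrated with a reverse complement. The sketch below assumes the standard Watson-Crick pairing rules (A with T, G with C), which the slide text itself does not spell out. The function name and example sequence are hypothetical.

```python
# Standard Watson-Crick pairing for DNA (assumed here, not stated in the slides).
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(strand: str) -> str:
    """Return the partner strand, also written in the 5' to 3' direction."""
    return "".join(COMPLEMENT[base] for base in reversed(strand.upper()))


print(reverse_complement("ATGC"))  # GCAT
```

The reversal is needed because the two strands run in opposite directions, while both are conventionally written from 5' to 3'.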
25DNA Structure
26DNA Structure
27DNA Structure
28RNA Structure and Function
- The major role of RNA is to participate in
protein synthesis
- Messenger RNA (mRNA)
- Transfer RNA (tRNA)
- Ribosomal RNA (rRNA)
29mRNA
30The Genetic Code
31What is a gene?
- A gene includes the entire nucleic acid sequence
necessary for the expression of its product
- Such a sequence may be divided into
- Regulatory region
- Transcriptional region: exons and introns
- Exons encode a peptide or functional RNA
- Introns are removed after transcription
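The removal of introns can be sketched as joining the exon segments of a transcript. This is a toy illustration: the transcript and the exon coordinates are made up, the coordinates are assumed to be known in advance (0-based, end-exclusive), and the function name `splice` is hypothetical.

```python
def splice(pre_mrna: str, exons: list[tuple[int, int]]) -> str:
    """Join exon segments given as (start, end) pairs; everything else is dropped."""
    return "".join(pre_mrna[start:end] for start, end in exons)


# Made-up transcript: exon 1, then an intron, then exon 2.
pre_mrna = "AUGGCC" + "GUAAGUUUUCAG" + "AUUUAA"
exons = [(0, 6), (18, 24)]
print(splice(pre_mrna, exons))  # AUGGCCAUUUAA
```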
32Gene
33Genome
- The total genetic information of an organism.
- For most organisms, it is the complete DNA
sequence
- For RNA viruses, the genome is the complete RNA
sequence
34Genes and Control
- The human genome has 3,000,000,000 bps divided
into 23 linear segments (chromosomes)
- A gene has an average of 1340 DNA bps, thus
specifying a protein of about how many amino
acids?
- Humans have about 35,000 genes, or about
40,000,000 DNA bps, which is about 1.3% of the
total DNA in the genome
- Humans have another 2,960,000,000 bps for control
information (e.g. when, where, how long, etc.)
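The figures on this slide can be checked with a few lines of arithmetic. The sketch below uses the slide's own rounded numbers and assumes three bases per amino acid (one codon), ignoring the stop codon and any untranslated sequence. The variable names are illustrative.

```python
GENOME_BP = 3_000_000_000   # total base pairs, from the slide
AVG_GENE_BP = 1340          # average gene length, from the slide
CODING_BP = 40_000_000      # the slide's rounded total for all genes

# Three bases (one codon) specify one amino acid.
amino_acids_per_gene = AVG_GENE_BP // 3
coding_fraction = CODING_BP / GENOME_BP
remaining_bp = GENOME_BP - CODING_BP

print(amino_acids_per_gene)        # 446
print(f"{coding_fraction:.1%}")    # 1.3%
print(remaining_bp)                # 2960000000
```

So an average gene of 1340 bps corresponds to a protein of roughly 446 amino acids under this assumption.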
35Gene Expression
- An organism may contain many types of cells, each
with distinct shape and function
- However, they all have the same genome
- The genes in a genome do not have any effect on
cellular functions until they are expressed
- Different types of cells express different sets
of genes, thereby exhibiting various shapes and
functions
36Gene Expression
- The production of a protein or a functional RNA
from its gene
- Several steps are required
- Transcription
- RNA processing
- Nuclear transport
- Protein synthesis
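Two of the steps above, transcription and protein synthesis, can be sketched on strings. This is a simplified illustration, not a model of the cell: it assumes the input is the coding strand of the DNA (so transcription reduces to replacing T with U), it skips RNA processing and nuclear transport, and it uses the standard genetic code. The function names and the example sequence are hypothetical.

```python
from itertools import product

# Standard genetic code, listed in U, C, A, G order for each codon position.
# "*" marks a stop codon.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def transcribe(coding_dna: str) -> str:
    """Coding-strand DNA to mRNA: same sequence with T replaced by U."""
    return coding_dna.upper().replace("T", "U")


def translate(mrna: str) -> str:
    """Read codons from the start of the string until a stop codon."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


mrna = transcribe("ATGGCCATTGTAATGGGCCGCTGA")  # made-up example
print(mrna)             # AUGGCCAUUGUAAUGGGCCGCUGA
print(translate(mrna))  # MAIVMGR
```

The protein is written in one-letter amino acid codes, which is itself another example of representing a molecule as a string.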
37Gene Expression
38Central Dogma
DNA -> RNA -> Protein
Next: Protein Structure and Function
39An Amino Acid
- An amino acid is defined as a molecule
containing an amino group (NH2), a carboxyl group
(COOH) and an R group
- R-CH(NH2)-COOH
- The R group differs among various amino acids
- In a protein, the R group is also called a
side chain
40An Amino Acid
41The Twenty Amino Acids of Proteins
42The Twenty Amino Acids of Proteins
43Protein
- Peptide: a chain of amino acids linked together
by peptide bonds
- Polypeptides: long peptides
- Oligopeptides: short peptides (< 10 amino acids)
- Proteins are made up of one or more polypeptides
with more than 50 amino acids
44Protein Structure
- Primary Structure
- Refers to its amino acid sequence
45Secondary structure
- Regular, repeated patterns of folding of the
protein backbone
- Two most common folding patterns
- Alpha helix
- Beta sheet
46Tertiary Structure
- The overall folding of the entire polypeptide
chain into a specific 3D shape
47Quaternary Structure
- Many proteins are formed from more than one
polypeptide chain
- Describes the way in which the different subunits
are packed together to form the overall structure
of the protein
- Example: the hemoglobin molecule
48Quaternary Structure
49Evolution
- Mutation: rare events, sometimes single base
changes, sometimes larger events
- Recombination: how your genome was constructed
as a mixture of your two parents
- Through natural selection
- Homology (similarity): different species are
assumed to have common ancestors
- The genetic variation between different people is
(surprisingly ..)