Title: Introduction to Structural Bioinformatics
1. Introduction to Structural Bioinformatics
- Bruce Byrne, PhD
- Fundamentals of Bioinformatics
- Spring, 2007
2. What we will be doing
- The PowerPoint presentation
- What's important about structural bioinformatics?
- There are themes in common with the rest of bioinformatics.
- How do we get structural information for proteins?
- Laboratory techniques
- Data search and retrieval
- What do the data look like?
- How can we visualize the data?
3. Structural Bioinformatics
- Why might it be useful to understand both the sequence and structure of a protein?
- Better understanding of functional interactions
- Protein/ligand binding
- Catalysis
- Insight into aberrant biological phenomena (disease)
- Cancer, diabetes, etc.
- Develop drugs
- Control disease by interfering with key pathways at the molecular level
- Influence disease course with metabolic inhibitors/activators
- Manipulate genes and proteins
- Gene therapy
- Biopharmaceuticals
4. Pattern Recognition is Fundamental to Bioinformatics
- Texts and Sequences
- Text: exact matches to words or strings
- Sequence: similarity
- Expression Data: DNA Microarray
- Experimental data represented by color and intensity. How do patterns compare from one condition to another?
- More, later, in Gene Expression unit.
- Molecular Structures
- How can a characteristic set of amino acids, arranged in 3-D, account for ligand or receptor binding?
Figure panels: Text, Sequence, Expression
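The contrast between exact text matching and sequence similarity can be sketched in a few lines of Python. This is only a minimal illustration: the function names are made up for this example, and percent identity over two pre-aligned, equal-length sequences is used as the simplest stand-in for similarity (real tools use alignment and substitution scores).

```python
def exact_matches(text, word):
    """Return the start index of every exact occurrence of word in text."""
    hits = []
    start = text.find(word)
    while start != -1:
        hits.append(start)
        start = text.find(word, start + 1)
    return hits


def percent_identity(seq_a, seq_b):
    """Percent of identical positions in two pre-aligned, equal-length sequences."""
    if len(seq_a) != len(seq_b) or not seq_a:
        raise ValueError("sequences must be non-empty and the same length")
    same = sum(a == b for a, b in zip(seq_a, seq_b))
    return 100.0 * same / len(seq_a)


# Text-style search: the string is either present or not.
print(exact_matches("GATTACAGATTACA", "GATTACA"))
# Sequence-style comparison: a graded score instead of yes/no.
print(percent_identity("ACDEFGHIKL", "ACDEYGHIKV"))
```

An exact search either finds the string or does not, while the identity score degrades gradually as the sequences diverge, which is why similarity is the more useful notion for biological sequences.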
5. What Happens with Advanced Structural Bioinformatics?
- Now
- Manipulating sequence or altering structure
- Modeling a similar sequence of unknown structure on a known structure
- Docking a macromolecule and ligand
- Future
- Predicting structure based solely on sequence
- ab initio methods
6. Ab initio methods
- Build a peptide structure with just sequence information.
- Can have more than one low-energy model.
- Iterative process; convergence is a must
- Feasible for short peptide sequences (best model was 112 aa by the Robetta server (David Baker, Nature, 2007))
- Useful to build loop regions and missing protein segments in low-resolution structures.
- Complementary to NMR and X-ray crystallography.
7. Basic Data for Modeling Structure
- Solution NMR: Nuclear Magnetic Resonance
- Does not require crystallization
- Works on molecules in aqueous solution
- Lower resolution
- X-Ray Crystallography
- 80% of PDB entries
- High resolution
- Large quantities of crystallized protein
- Validity for some biochemical microenvironments?
- Highly hydrated crystals
- Good empirical agreement between solution NMR and crystallography
8. Imaging - General
Figure: Image in and out of focus
- Ability to create an image is related to the wavelength of the energy source
- Light
- X-rays
- Neutron beams
- Electrons
- Lenses focus images using a variety of energy sources
- Glass: light
- Magnets: electrons
- Diffraction creates patterns
- Regular, interpretable patterns resulting from the interference of waves
Figure: Waves on the surface of water diffracted while moving through a small hole. Given knowledge of the pattern on the left and the properties of the hole, the waves on the right can be modeled.
Images: Wikipedia
9. X-Ray Crystallography
- Not just an imaging technique
- Data gathering
- Substantial interpretation
- How does it work?
- X-ray wavelength (Ångström range, 10⁻⁸ cm) is similar to the size of an atom, so the beam is scattered by the atom's electron cloud
- Generally yields a unique model
- Cannot resolve the positions of hydrogen atoms except by modeling or at resolution better than about 1.2 Å
- Terminal side-chain atoms are uncertain for Asn, Gln and Thr; their identity must be inferred
10. Solution NMR
- Analyzes proteins in solution
- Especially useful for smaller proteins, < 30 kD
- Very important because
- some proteins resist crystallization
- Yields the positions of some hydrogen atoms
- Solution NMR often yields multiple models, in comparison with crystallography
- Especially useful in the analysis of large complexes
11. The Crystal
- Crystal must be
- Single
- A few tenths of a mm in each direction
- Protein crystals are
- Fragile and sensitive
- Bound by weak hydrogen bonds, salt bridges and hydrophobic interactions
- Contain 50% solvent in channels between stacked molecules
- Jelly-like nature permits soaking crystals in metal solutions or enzyme inhibitors
- Expensive
12. Obtaining a Crystal
- High-concentration, purified protein (2-50 mg/ml)
- Add agents to reduce solubility, without precipitation
- Evaporate agent from reservoir into hanging drop with protein
- Experiment: trial and error
13. The Experimental Set-up
- Rotating anode X-ray generator
- Monochromator or focusing mirrors yield a single wavelength
- Crystal can be repositioned using a goniometer
- Photo-plate or electronic recording of the diffraction pattern
14. Interpretation Theory: Bragg's Law
- nλ = 2d sin θ (1), where n is the order of the reflection, λ the X-ray wavelength, d the spacing between crystal planes and θ the angle of incidence
- Derived by Sir W.H. Bragg and his son Sir W.L. Bragg in 1913
- Explains why crystals reflect X-ray beams at certain angles of incidence (theta, θ).
- Direct evidence for the periodic atomic structure of crystals postulated for several centuries.
- The Braggs were awarded the Nobel Prize in physics in 1915.
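As a worked illustration of Bragg's law, the sketch below solves nλ = 2d sin θ for either the angle or the plane spacing. The function names are invented for this example, and the wavelength used (1.5418 Å, the common copper K-alpha laboratory source) is an assumption that does not come from the slides.

```python
import math


def bragg_angle_deg(wavelength, d_spacing, order=1):
    """Angle theta (degrees) satisfying n*lambda = 2*d*sin(theta)."""
    s = order * wavelength / (2.0 * d_spacing)
    if not 0.0 < s <= 1.0:
        # sin(theta) cannot exceed 1, so no reflection exists for this order.
        raise ValueError("no reflection: n*lambda exceeds 2*d")
    return math.degrees(math.asin(s))


def plane_spacing(wavelength, theta_deg, order=1):
    """Plane spacing d (same units as wavelength) from a measured angle."""
    return order * wavelength / (2.0 * math.sin(math.radians(theta_deg)))


CU_K_ALPHA = 1.5418  # Angstrom; assumed lab source, not from the slides

theta = bragg_angle_deg(CU_K_ALPHA, d_spacing=2.0)
print(f"theta for d = 2.0 A: {theta:.2f} degrees")
print(f"d recovered from theta: {plane_spacing(CU_K_ALPHA, theta):.3f} A")
```

Because sin θ cannot exceed 1, planes closer together than λ/2 give no reflection at all, which is one way to see why the wavelength must be comparable to atomic spacings.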
15. Interpretation Practice
- Obtain diffraction pattern
- Position and intensity apparent in image
- Phases of the waves which formed each spot must also be determined
- Irradiate two or more derivatives of the same crystal which differ only in the presence of heavy metal ions
- Use multiple wavelengths
- Position, intensity and phase constitute a structure factor.
- PDB structure factor data files permit creation of a complete electron density map
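Why the phases matter can be seen in a toy one-dimensional "crystal". The sketch below is a simplified illustration, not a crystallographic program: the two atoms, their positions and their scattering weights are hypothetical, and the unit cell length is taken as 1. It computes structure factors from known atoms, rebuilds the density by Fourier summation, and then repeats the summation with the phases thrown away, which is all a detector actually records.

```python
import cmath
import math


def structure_factors(atoms, h_max):
    """F(h) = sum_j f_j * exp(2*pi*i*h*x_j); x_j are fractional coordinates."""
    return {
        h: sum(f * cmath.exp(2j * math.pi * h * x) for x, f in atoms)
        for h in range(-h_max, h_max + 1)
    }


def density(factors, x):
    """rho(x) = sum_h F(h) * exp(-2*pi*i*h*x), for a unit cell of length 1."""
    total = sum(f_h * cmath.exp(-2j * math.pi * h * x) for h, f_h in factors.items())
    return total.real


# Hypothetical atoms: (fractional position, scattering weight).
atoms = [(0.20, 6.0), (0.55, 8.0)]
F = structure_factors(atoms, h_max=10)
grid = [i / 100 for i in range(100)]

# With amplitudes AND phases, the map peaks at the heavier atom.
rho = [density(F, x) for x in grid]
peak = grid[rho.index(max(rho))]

# Amplitudes only (phases set to zero), as measured from spot intensities.
F_no_phase = {h: complex(abs(f_h), 0.0) for h, f_h in F.items()}
rho_no_phase = [density(F_no_phase, x) for x in grid]
peak_no_phase = grid[rho_no_phase.index(max(rho_no_phase))]

print("peak with phases:   ", peak)
print("peak without phases:", peak_no_phase)
```

With phases the strongest peak sits on the heavier atom; without them the map piles up at the origin and the atomic positions are lost. Heavy-atom derivatives and multiple wavelengths are experimental ways of recovering that missing phase information.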
16. Structural Bioinformatics Repositories
- PDB: Protein Data Bank
- Curated collection
- Prime source for data
- NCBI
- Derived database, a subset of PDB
- One useful structural viewer
- Integrated with other NCBI databases
17. Finding The PDB File
- PDB: http://www.rcsb.org/pdb/home/home.do
- Search for a PDB ID: 1zaa
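The same lookup can be done from a script instead of the web page. The sketch below is a hedged example: the download address `https://files.rcsb.org/download/<ID>.pdb` is an assumption about the current RCSB file server and is not taken from the slides, and the function names are invented for illustration.

```python
import re
import urllib.request

# Assumed RCSB endpoint for PDB-format files (not from the slides).
RCSB_DOWNLOAD = "https://files.rcsb.org/download/{pdb_id}.pdb"


def pdb_url(pdb_id):
    """Build a download URL for a 4-character PDB ID such as '1zaa'."""
    pdb_id = pdb_id.strip().upper()
    # Classic PDB IDs are a digit followed by three letters or digits.
    if not re.fullmatch(r"[0-9][A-Z0-9]{3}", pdb_id):
        raise ValueError(f"not a 4-character PDB ID: {pdb_id!r}")
    return RCSB_DOWNLOAD.format(pdb_id=pdb_id)


def fetch_pdb(pdb_id, timeout=30):
    """Download the entry and return its text (needs network access)."""
    with urllib.request.urlopen(pdb_url(pdb_id), timeout=timeout) as response:
        return response.read().decode("utf-8")


print(pdb_url("1zaa"))
```

Calling `fetch_pdb("1zaa")` on a networked machine should return the same text that the Display File option shows in the browser.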
18. Looking at the PDB Data File
- Note Display File options
- Select PDB File
- Note file structure and similarities to GenBank
- HEADER
- TITLE
- COMPND
- SOURCE
- KEYWDS
- EXPDTA etc
- REMARK
- DBREF
- SEQRES
- etc
- Far down in the file, you will note the position of each heavy atom of each residue of the protein using XYZ coordinates
- How do these data compare to what you have learned about X-ray crystallography?
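The coordinate records near the bottom of a PDB file are fixed-width: each field lives in set columns rather than being separated by a delimiter. The sketch below reads the main fields of ATOM/HETATM records by column position. The two sample records are made up for illustration and are not copied from 1zaa; the `Atom` type and function name are likewise just for this example.

```python
from typing import NamedTuple


class Atom(NamedTuple):
    serial: int
    name: str
    res_name: str
    chain: str
    res_seq: int
    x: float
    y: float
    z: float


def parse_atom_line(line):
    """Parse one fixed-width ATOM/HETATM record; return None for other records."""
    if not line.startswith(("ATOM  ", "HETATM")):
        return None
    return Atom(
        serial=int(line[6:11]),        # columns 7-11
        name=line[12:16].strip(),      # columns 13-16
        res_name=line[17:20].strip(),  # columns 18-20
        chain=line[21],                # column 22
        res_seq=int(line[22:26]),      # columns 23-26
        x=float(line[30:38]),          # columns 31-38, Angstrom
        y=float(line[38:46]),          # columns 39-46
        z=float(line[46:54]),          # columns 47-54
    )


# Made-up records for illustration only (not from a real entry).
SAMPLE = """\
HEADER    ILLUSTRATIVE EXAMPLE
ATOM      1  N   MET A   1      27.340  24.430   2.614  1.00  9.67           N
ATOM      2  CA  MET A   1      26.266  25.413   2.842  1.00 10.38           C
"""

atoms = [a for a in map(parse_atom_line, SAMPLE.splitlines()) if a is not None]
for atom in atoms:
    print(atom.name, atom.res_name, atom.chain, atom.res_seq, atom.x, atom.y, atom.z)
```

Slicing by column rather than splitting on whitespace matters because adjacent fields can run together when values are wide.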
19. Finding Data
- What are Zinc Fingers?
- Navigating structures at NCBI
20. Zinc Fingers (1)
- Well-understood structure with important biological function
- Independently folded domain of many proteins
- Requires 1 or more Zinc ions
- A series of Zinc Fingers recognizes specific DNA sequences
- Found in regulatory proteins such as transcription factors
21. Zinc Fingers (2)
- Very common DNA-binding motif
- Characterized by two anti-parallel beta strands followed by an alpha helix
- Stabilized by Zn ion interacting with conserved histidine (H) and cysteine (C) residues.
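Because the zinc-binding cysteines and histidines are conserved and regularly spaced, a finger can be looked for in a sequence with a simple pattern. The sketch below uses a deliberately simplified Cys2-His2 spacing, C-x(2,4)-C-x(12)-H-x(3,5)-H; that pattern and the test sequence (written to follow this spacing) are assumptions for illustration and do not come from the slides. Curated motif databases use richer patterns and profiles.

```python
import re

# Simplified C2H2 spacing (an assumption for this example):
# Cys, 2-4 residues, Cys, 12 residues, His, 3-5 residues, His.
C2H2_PATTERN = re.compile(r"C.{2,4}C.{12}H.{3,5}H")


def find_zinc_fingers(sequence):
    """Return (start, matched_text) for each non-overlapping candidate finger."""
    return [(m.start(), m.group()) for m in C2H2_PATTERN.finditer(sequence.upper())]


# Illustrative sequence written to follow the spacing above.
candidate = "RPYACPVESCDRRFSRSDELTRHIRIHTGQKP"

for start, motif in find_zinc_fingers(candidate):
    print(start, motif)
```

A pattern match only flags a candidate; whether the segment really folds into two beta strands and a helix around a zinc ion has to be checked against structural data.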
22. Search Using Alternative Strategies
- Use Entrez
- Search term: zinc finger
- Click structure
- 600 hits
- TaxBrowser
- Select Mus musculus
- Click structures
- Search for zinc finger
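The two search strategies can also be expressed as queries to NCBI's E-utilities rather than clicks in the browser. The sketch below only builds the query URLs. It assumes the ESearch endpoint, the `structure` database name and the `[Organism]` field tag; none of these appear in the slides, so check them against the current NCBI documentation before relying on them.

```python
from urllib.parse import urlencode

# NCBI E-utilities search endpoint (assumed; not from the slides).
ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def structure_search_url(term, organism=None, retmax=20):
    """Build an ESearch URL against the Entrez 'structure' database."""
    if organism:
        # Restrict by organism, mirroring the TaxBrowser route.
        term = f"{term} AND {organism}[Organism]"
    query = urlencode({"db": "structure", "term": term, "retmax": retmax})
    return ESEARCH + "?" + query


print(structure_search_url("zinc finger"))
print(structure_search_url("zinc finger", organism="Mus musculus"))
```

The first URL corresponds to the plain Entrez search and the second to narrowing by taxon first; hit counts will differ from those on the slide as the database grows.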
23. Examine the Tabs
- Tabs sort your results depending on the
- Search strategy
- Characteristics of the database
- Structure searches
- NMR
- X-ray
24. MMDB Summary Page (1)
- Note layout and features
- Reference: often but not always a publication
- Search resulted from match of key words, but not a well-controlled vocabulary
- MMDB and PDB index numbers
25. MMDB Summary Page (2)
- Chains (Proteins and Nucleotides, in this example)
- 3D Domains
- Domain Families
26. Exercise: Cn3D, The NCBI Viewer
- Having reviewed how structural data are obtained in the laboratory and catalogued at NCBI, this Unit's Exercise will give you expertise in using the NCBI structural viewer, Cn3D.
- Download the assignment from WebCT
27. Next Week & Beyond
- On-line
- Find structures at NCBI using Entrez tools
- Use Cn3D Viewer to visualize structures
- Use similarity searching tools to find similar structures
- Review remaining schedule of classes.