Spring 2002
1
Function preserves sequences
Christophe Roos - MediCel ltd
christophe.roos_at_medicel.fi
Mutations change sequences
Molecular evolution
Part 3: sequence databases and comparisons
Similarity is the result of conservation or
convergent evolution: it has its reason for being
2
The public biological databases
  • EMBL or GenBank or DDBJ for DNA
  • emblnew for daily updates, merged into the main
    DB 4x/year
  • SwissProt or PIR for proteins
  • TrEMBL, tremblnew, remtrembl
  • PDB for structures
  • In flat file format, yet quite informative and
    convertible
  • Fasta format is a universal sequence format:
    the first line starts with > followed by free
    text. The second line has the start of the
    sequence (50 or 60 characters per line). Use the
    first line for the name or the Accession Number
    (AC)
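
The Fasta layout described above can be read and written with a few lines of code. The following is a minimal sketch, not part of the original slides: the function names `parse_fasta` and `format_fasta` and the two toy records are made up for illustration, and the sketch assumes every record begins with a `>` line.

```python
from typing import Dict, Iterable, List


def parse_fasta(lines: Iterable[str]) -> Dict[str, str]:
    """Map each header line (without the leading '>') to its full sequence."""
    chunks: Dict[str, List[str]] = {}
    header = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # First line of a record: free text, e.g. a name or accession number
            header = line[1:].strip()
            chunks[header] = []
        elif header is not None:
            # Sequence lines are simply concatenated
            chunks[header].append(line)
    return {name: "".join(parts) for name, parts in chunks.items()}


def format_fasta(name: str, sequence: str, width: int = 60) -> str:
    """Write one record with a fixed number of characters per sequence line."""
    rows = [sequence[i:i + width] for i in range(0, len(sequence), width)]
    return "\n".join([">" + name] + rows)


# Made-up records, for illustration only
example = """>TOY1 made-up record
GATCGGAATAG
GACGGATTAG
>TOY2 another made-up record
ACGT
"""
records = parse_fasta(example.splitlines())
print(records)
print(format_fasta("TOY1 made-up record", records["TOY1 made-up record"], width=10))
```

Keying the records by the whole header line is a simplification; in practice one would probably split off the first word (the name or AC) and keep the rest as a description.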

3
Database homes
  • The European database home is in Hinxton,
    Cambridge, UK: European Bioinformatics Institute
    (EBI)
  • http://www.ebi.ac.uk
  • Access through the Sequence Retrieval System, SRS
  • The American database home is in Bethesda,
    Maryland, near Washington DC: National Center for
    Biotechnology Information (NCBI)
  • http://www.ncbi.nlm.nih.gov
  • Access through Entrez
  • Both centers exchange their data on a daily
    basis; however, there are differences in
    annotations, consistency, speed and quality.
  • There is also a Japanese database provider, DDBJ.

4
A look at one entry from EMBL (part 1/3)
5
A look at one entry from EMBL (part 2/3)
6
A look at one entry from EMBL (part 3/3)
The feature table of the entry contains several
linked items, such as exon assembly (mRNA) and
coding sequence (CDS). There are also
cross-references to other databases.
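
The entry shown on these slides is not reproduced in the transcript, so the sketch below uses a made-up fragment written in the general style of an EMBL flat file (two-letter line codes, `FT` for the feature table, `DR` for cross-references). The accession `TOY00001`, the database name `ExampleDB` and the helper names `feature_keys` and `cross_references` are all hypothetical; the aim is only to show how the linked items (mRNA, CDS) and the cross-references could be pulled out of such a file.

```python
from typing import List, Tuple

# Made-up fragment in the style of an EMBL flat file; not a real entry.
TOY_ENTRY = """\
ID   TOY00001; SV 1; linear; genomic DNA; STD; INV; 60 BP.
AC   TOY00001;
DE   Made-up entry for illustration only
DR   ExampleDB; EX0001; toy_protein.
FT   source          1..60
FT   mRNA            join(1..20,41..60)
FT   CDS             join(5..20,41..55)
FT                   /product="toy protein"
SQ   Sequence 60 BP;
//
"""


def feature_keys(entry: str) -> List[Tuple[str, str]]:
    """Return (feature key, location) pairs from the FT lines."""
    features = []
    for line in entry.splitlines():
        # A new feature has its key right after the 5-character prefix;
        # qualifier lines (such as /product=...) are blank at that position.
        if line.startswith("FT") and len(line) > 5 and line[5] != " ":
            parts = line[5:].split()
            location = parts[1] if len(parts) > 1 else ""
            features.append((parts[0], location))
    return features


def cross_references(entry: str) -> List[Tuple[str, str]]:
    """Return (database, identifier) pairs from the DR lines."""
    refs = []
    for line in entry.splitlines():
        if line.startswith("DR"):
            fields = [f.strip() for f in line[5:].rstrip(".").split(";")]
            if len(fields) >= 2:
                refs.append((fields[0], fields[1]))
    return refs


print(feature_keys(TOY_ENTRY))
print(cross_references(TOY_ENTRY))
```

Here the mRNA and CDS features share exon boundaries through their `join(...)` locations, which is the kind of linkage the feature table expresses. Locations that wrap over several lines are not handled by this sketch.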
7
A look at one entry from SwissProt
The eyeless gene: a master regulatory gene in eye
formation
8
The effect of the eyeless gene
  • The eyeless gene is a master regulatory gene in
    eye formation
  • When it is absent, no eyes are formed
  • When it is expressed where it should not be, it
    induces eye formation there

(Figure panels: Normal; Overexpressed in antennae
and wings; Absent)
9
A look at one entry from SwissProt
Part 2: the annotations about function and
location
10
A look at one entry from SwissProt
Part 3: the feature table and the amino acid
sequence
11
A look at one entry from SwissProt
The eyeless gene is also called PAX6 and can be
found in several species: birds, mammals,
reptiles, fish, invertebrates
12
Sequence comparison: why?
  • Function by analogy: if sequences are conserved,
    their function is probably also conserved.
  • Functional domains: if some parts of the
    sequences are more conserved than other parts,
    there must be an underlying biological reason for
    it.
  • Establishing relationships/differences in
    function: by quantifying sequence relationships
    it is possible to estimate the function of novel
    genes
  • Establishing relationships between species

13
Sequence comparison: how?
  • Compare two sequences of similar length
  • Compare two sequences of very different length
  • Compare several sequences
  • Allow gaps or not?
  • Scoring: yes-no or good-intermediate-bad?
  • The best or all above a threshold?
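
To make the "allow gaps or not?" question concrete, the sketch below compares two sequences once position by position without gaps, and once with gaps allowed. The gapped version uses a standard dynamic-programming recurrence for global alignment (Needleman-Wunsch style); the slides do not name a method, so this choice, the function names, and the score values (match +1, mismatch -1, gap -2) are assumptions made for illustration.

```python
def ungapped_matches(a: str, b: str) -> int:
    """Count identical positions when the sequences are laid side by side."""
    return sum(x == y for x, y in zip(a, b))


def global_alignment_score(a: str, b: str, match: int = 1,
                           mismatch: int = -1, gap: int = -2) -> int:
    """Best score of a global alignment with a fixed cost per gap position."""
    # prev[j] holds the best score for a[:i-1] against b[:j]
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]  # a[:i] aligned against nothing: i gap positions
        for j in range(1, len(b) + 1):
            pair = match if a[i - 1] == b[j - 1] else mismatch
            curr.append(max(prev[j - 1] + pair,   # align a[i-1] with b[j-1]
                            prev[j] + gap,        # gap in b
                            curr[j - 1] + gap))   # gap in a
        prev = curr
    return prev[-1]


seq1 = "GACGGATTAG"
seq2 = "GATCGGAATAG"
print(ungapped_matches(seq1, seq2))        # only the shared leading positions line up
print(global_alignment_score(seq1, seq2))  # one gap restores most of the matches
```

With these two sequences a single inserted gap brings most positions back into register, which is why gapped comparison is usually preferred when insertions or deletions are expected.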

14
Sequence comparison metrics
GA-CGGATTAG
GATCGGAATAG
(Figure labels on the alignment: gap, match,
mismatch)
  • The scoring matrix
  • The score for a match
  • The penalty for a mismatch
  • The penalty for the insertion of a gap
    (gap-open)
  • The penalty for elongating a gap (gap-length)
  • Local or global similarities?

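
The metrics listed on this slide can be combined into a single score for a given alignment. The following sketch scores the alignment shown above; the numeric values (match +1, mismatch -1, gap-open -2, gap-length -1) and the function name `score_alignment` are assumptions chosen for illustration, and the sketch assumes the convention that the first position of a gap costs gap-open and every further position costs gap-length.

```python
from typing import Dict, Tuple


def score_alignment(row_a: str, row_b: str, match: int = 1, mismatch: int = -1,
                    gap_open: int = -2, gap_extend: int = -1
                    ) -> Tuple[int, Dict[str, int]]:
    """Score two aligned rows of equal length; '-' marks a gap position."""
    if len(row_a) != len(row_b):
        raise ValueError("aligned rows must have the same length")
    score = 0
    counts = {"match": 0, "mismatch": 0, "gap": 0}
    prev_gap_row = None  # which row carried a gap in the previous column
    for x, y in zip(row_a, row_b):
        if x == "-" and y == "-":
            raise ValueError("a column cannot be a gap in both rows")
        if x == "-" or y == "-":
            gap_row = "a" if x == "-" else "b"
            # Continuing a gap in the same row is cheaper than opening one
            score += gap_extend if prev_gap_row == gap_row else gap_open
            counts["gap"] += 1
            prev_gap_row = gap_row
        else:
            prev_gap_row = None
            if x == y:
                score += match
                counts["match"] += 1
            else:
                score += mismatch
                counts["mismatch"] += 1
    return score, counts


total, detail = score_alignment("GA-CGGATTAG", "GATCGGAATAG")
print(total, detail)
```

A simple match/mismatch pair is the smallest possible scoring matrix; a protein comparison would normally replace it with a full substitution table giving graded scores per residue pair.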