The Cybercell Database (CCDB)

1
The Cybercell Database (CCDB)
Presenter: Shan Sundararaj
Provisional Ph.D. Candidate
Department of Pharmacy
University of Alberta
Supervisor: Dr. D. Wishart
Date: December 12, 2002
2
Outline
  • What is the Cybercell project?
  • What is the CCDB?
  • How was it made and how does it work?
  • What can it do?
  • Conclusion and future plans

3
Project Cybercell
  • Objective
      • Create a virtual biological cell
  • Who's involved
      • Researchers across Canada and internationally
      • Both universities and private companies
      • Dozens of structural biologists, computer
        scientists, bioinformaticists, etc.

4
Project Cybercell
  • Objective: Create a virtual cell
  • simulate all elements and processes of a
    biological cell on a computer
  • use Escherichia coli as a model
  • use it to predict cellular phenomena
  • e.g. response to drugs or environment
  • Test simulations with the best quantitative data
    possible

5
Cybercell example
A model of a cell with a metabolite diffusing
into it through transport proteins and being
converted to a different molecule by an enzyme:
http://129.128.166.250/research/simulation_gallery/4d/cybercell4.mov
This simulation requires very specific
information about enzyme availability and rates
to be useful.
6
Cybercell Database (CCDB)
7
Outline
  • What is the Cybercell project?
  • What is the CCDB?
  • How was it made and how does it work?
  • What can it do?
  • Conclusion and future plans

8
CCDB
  • The CCDB is an evolving repository of
    quantitative data on proteins and genes compiled
    from a variety of data sources
  • Includes browsable, searchable and sortable lists
    and links of:
  • protein names
  • sequences
  • functions
  • structures
  • physicochemical constants
  • cofactors
  • copy numbers
  • products
  • reactants
  • binding partners
  • and more

9
CCDB Statistics
10
The Colicard
  • Fundamental unit of the database (i.e. the
    record)
  • One exists for each of the 4374 proteins
    identified in E. coli K12

11
Interesting Features
  • Can view 3-D structures using WebMol
  • Homology models created for proteins with an
    appropriate template
  • Contains lists of protein interacting partners
    and protein complexes
  • Useful when looking for homologous interacting
    proteins
  • Enzyme information from BRENDA

12
Related Databases
  • CCRNA
  • Equivalent database for tRNA and rRNA molecules
  • CC3D
  • Database that focuses on structural information
  • CCMD
  • Database of data on small metabolic compounds

13
Outline
  • What is the Cybercell project?
  • What is the CCDB?
  • How was it made and how does it work?
  • What can it do?
  • Conclusion and future plans

14
Data gathering
  • There is a VAST amount of varied information
    available about proteins, nucleic acids and other
    molecules!!
  • All stored nicely in the CCDB now, but where does
    it originate?

15
Information Sources
Manual Literature Searches
Other databases
16
Database structure
  • CCDB is a flat-file database
  • Individual text file for each Colicard
  • Create an index file of all the pertinent
    information and use Perl CGI scripts to make a
    searchable, sortable web interface
  • Contains many links to references and other
    databases for more detailed information
  • Use other tools to perform search and data
    extraction functions for extra functionality
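
The flat-file layout described on this slide can be sketched roughly as follows. This is a minimal illustration in Python rather than the Perl CGI scripts the CCDB actually used, and the card format (one "KEY: value" pair per line), the field names and the file names are all assumptions made for the example, not the real Colicard layout.

```python
import tempfile
from pathlib import Path


def parse_card(text):
    """Parse one card made of 'KEY: value' lines into a dict (hypothetical format)."""
    record = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:  # ignore lines that do not look like 'KEY: value'
            record[key.strip()] = value.strip()
    return record


def build_index(card_dir, fields):
    """Read every *.txt card in card_dir and keep only the indexed fields."""
    index = []
    for path in sorted(Path(card_dir).glob("*.txt")):
        record = parse_card(path.read_text())
        row = {name: record.get(name, "") for name in fields}
        row["file"] = path.name  # lets a web page link back to the full card
        index.append(row)
    return index


def sort_index(index, field, numeric=False):
    """Return the index rows ordered by one field, as a browse page might."""
    if numeric:
        return sorted(index, key=lambda row: float(row[field] or 0))
    return sorted(index, key=lambda row: row[field].lower())


# Two made-up cards written to a temporary directory for demonstration.
demo_dir = tempfile.mkdtemp()
Path(demo_dir, "card_a.txt").write_text("NAME: Protein B\nGENE: geneB\nLENGTH: 120\n")
Path(demo_dir, "card_b.txt").write_text("NAME: Protein A\nGENE: geneA\nLENGTH: 450\n")

index = build_index(demo_dir, ["NAME", "GENE", "LENGTH"])
by_name = sort_index(index, "NAME")
by_length = sort_index(index, "LENGTH", numeric=True)
print([row["NAME"] for row in by_name])
print([row["LENGTH"] for row in by_length])
```

Keeping only a handful of fields in the index means the browse page never has to open the individual card files; the full card is read only when a user follows the link.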

17
Data Gathering
  • Would take years to do all manually!
  • Use manual forms to fill in some information
  • Make use of a web robot to query databases and
    build as much of the Colicards as possible

18
Robots Can Be Annoying
  • Many sites do not like robots (e.g. EcoCyc)
  • Some sites disallow robots altogether (e.g.
    BRENDA)
  • Have to be polite: look at the robots.txt file,
    don't query too fast or too often
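
As a rough illustration of what "being polite" can mean in practice, the sketch below uses Python's standard urllib.robotparser to honour a robots.txt file and pause between requests. It is not the CCDB robot itself: the robots.txt content, the agent name "ccdb-demo-bot", the example.org URLs and the stub fetch function are all invented for the example, and no network access takes place.

```python
import time
from urllib.robotparser import RobotFileParser

# Made-up robots.txt content; a real robot would download it from the site.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Crawl-delay: 5
"""


def make_parser(robots_text):
    """Build a parser from robots.txt text that has already been fetched."""
    parser = RobotFileParser()
    parser.parse(robots_text.splitlines())
    return parser


def crawl(parser, agent, urls, fetch, default_delay=1.0, sleep=time.sleep):
    """Fetch only the allowed URLs, pausing after each request."""
    # Use the site's Crawl-delay when it declares one, else a default pause.
    delay = parser.crawl_delay(agent) or default_delay
    results = {}
    for url in urls:
        if not parser.can_fetch(agent, url):
            continue  # the site asked robots to stay away from this path
        results[url] = fetch(url)
        sleep(delay)
    return results


parser = make_parser(ROBOTS_TXT)
pauses = []  # records the requested pauses instead of really sleeping
pages = crawl(
    parser,
    "ccdb-demo-bot",
    ["http://example.org/public/entry1", "http://example.org/private/entry2"],
    fetch=lambda url: "stub page",
    sleep=pauses.append,
)
print(pages)
print(pauses)
```

Passing the fetch and sleep functions in as arguments keeps the sketch testable offline; a real robot would substitute an HTTP client and leave sleep at its default.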


19
Update robots (cont.)
  • Weekly check of web-based databases to look for
    updates
  • If an update is available, ALL updated files are
    downloaded
  • Information is parsed and compared to last
    available local copy (e.g. line-by-line
    comparison of Swiss-Prot files)
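
The line-by-line comparison mentioned above might look something like the following sketch, which uses Python's standard difflib module. The two records are invented, only loosely imitate the two-letter line codes of a Swiss-Prot flat file, and the function name is hypothetical; how the CCDB update robot actually compared files is not described in the slides.

```python
import difflib


def changed_lines(old_text, new_text):
    """Return (removed, added) lines between two versions of a flat file."""
    removed, added = [], []
    for line in difflib.ndiff(old_text.splitlines(), new_text.splitlines()):
        if line.startswith("- "):
            removed.append(line[2:])  # present only in the local copy
        elif line.startswith("+ "):
            added.append(line[2:])    # present only in the new download
        # lines starting with "  " are unchanged; "? " lines are hints, skipped
    return removed, added


# Invented records standing in for the local copy and a fresh download.
local_copy = "ID   DEMO_ECOLI\nDE   Demo protein\nSQ   SEQUENCE 120 AA\n"
downloaded = "ID   DEMO_ECOLI\nDE   Demo protein, revised\nSQ   SEQUENCE 120 AA\n"

removed, added = changed_lines(local_copy, downloaded)
print(removed)
print(added)
```

Only the lines reported as added would need to be re-parsed into the corresponding Colicard, which is presumably the point of comparing against the last local copy.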

20
Outline
  • What is the Cybercell project?
  • What is the CCDB?
  • How was it made and how does it work?
  • What can it do?
  • Conclusion and future plans

21
Search/Sort functions
  • Main browsing page allows you to sort by any of
    11 characteristics, for example:
  • Protein Name
  • Swiss-Prot ID
  • Number of amino acids
  • Function
  • Gene Name
  • Good for broad overview, but what if you want to
    look at a specific protein?

http://redpoll.pharmacy.ualberta.ca/CCDB/cgi-bin/ECARD_BROWS_NEWn.cgi?hits20browsn8pag1acco11

22
Data Extractor
  • Simple-to-use JavaScript-based interface
  • Can choose proteins based on ANY type of
    information in the database
  • Uses pre-parsed pairwise data sets and CGI
    scripts to extract and display data

http://redpoll.pharmacy.ualberta.ca/CCDB/CCDB_Ext.html
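
One way to picture "pre-parsed pairwise data sets" is as one small table per field, each pairing a protein identifier with a single value, so that a query only has to read the fields it needs. The Python sketch below works under that assumption; the tab-separated layout, the identifiers P1 to P3, the field names and the function names are all hypothetical and are not taken from the CCDB itself.

```python
def load_pairs(lines):
    """Parse 'ID<TAB>value' lines into a dict (assumed pairwise layout)."""
    pairs = {}
    for line in lines:
        ident, sep, value = line.rstrip("\n").partition("\t")
        if sep:
            pairs[ident] = value
    return pairs


def extract(datasets, field, predicate, show):
    """Pick IDs whose `field` value passes `predicate`; return the `show` columns."""
    hits = [ident for ident, value in datasets[field].items() if predicate(value)]
    return [
        {"id": ident, **{name: datasets[name].get(ident, "") for name in show}}
        for ident in sorted(hits)
    ]


# Invented data: one pairwise data set per field.
datasets = {
    "gene": load_pairs(["P1\tgeneA", "P2\tgeneB", "P3\tgeneC"]),
    "length": load_pairs(["P1\t120", "P2\t450", "P3\t300"]),
}

# Select proteins of at least 300 amino acids and display two columns.
rows = extract(datasets, "length", lambda v: int(v) >= 300, show=["gene", "length"])
print(rows)
```

Because the selection field and the displayed fields are independent, the same function covers "choose by any field, show any fields", which is the behaviour the slide describes.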
23
WebGlimpse Search
  • Even simpler, though less robust search method
  • Uses a program called WebGlimpse that indexes all
    the Colicard files and returns those that match
    the search criteria somewhere in the file
  • Useful because it allows for misspellings, which
    the data extractor does not
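
To show why tolerance for misspellings matters, here is a small Python sketch of approximate word matching across a set of card files. It substitutes the standard difflib module for WebGlimpse and does not reflect how WebGlimpse indexes or searches internally; the card contents, file names and the 0.8 similarity cutoff are assumptions chosen for the example.

```python
import difflib
import re


def fuzzy_search(cards, query, cutoff=0.8):
    """Return names of cards containing a word that closely resembles `query`."""
    query = query.lower()
    hits = []
    for name, text in cards.items():
        # Break the card into lowercase words and compare each to the query.
        words = sorted(set(re.findall(r"[a-z0-9]+", text.lower())))
        if difflib.get_close_matches(query, words, n=1, cutoff=cutoff):
            hits.append(name)
    return sorted(hits)


# Invented card contents keyed by file name.
cards = {
    "card_a.txt": "NAME: Demo kinase\nFUNCTION: phosphorylation",
    "card_b.txt": "NAME: Demo transporter\nFUNCTION: membrane transport",
}

exact_hits = [name for name, text in cards.items() if "transportr" in text]
fuzzy_hits = fuzzy_search(cards, "transportr")  # note the misspelling
print(exact_hits)
print(fuzzy_hits)
```

An exact substring match finds nothing for the misspelled query, while the approximate match still returns the card that mentions a transporter.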

24
The Whole Thing
  • The entire CCDB database in text files is
    available for download at
    http://redpoll.pharmacy.ualberta.ca/bahram/cgi-bin/Pdownload.cgi
  • This is for people who would like to have a local
    copy of all the information

25
Outline
  • What is the Cybercell project?
  • What is the CCDB?
  • How was it made and how does it work?
  • What can it do?
  • Conclusion and future plans

26
Conclusion
  • The Cybercell project aims to simulate a
    biological cell on a computer in the next 5-10
    years
  • The CCDB will act as a repository of all known
    and discovered information regarding E. coli
    molecules during the project
  • CCDB is updated automatically as well as by
    manual forms to remain up-to-date
  • CCDB can be queried in several ways (browsing,
    data extractor, search)

27
Thanks
  • Dr. Wishart
  • Bahram Habibi-Nazhad
  • An Chi Guo
  • Haiyan Zhang
  • Melania
  • Everyone in lab