Title: MBC Bioinformatics Preecha Leangaramgul Medical
Bioinformatics
Preecha Leangaramgul
Medical Biotechnology Center (MBC), Department of Medical Sciences, Ministry of Public Health
Bioinformatics Training
Introduction to Bioinformatics
- Overview
- System in MBC: GCG Wisconsin Package
- Databases supported in WP
- SeqLab and SeqWeb
What is Bioinformatics?
- Application of computer technology to the management of biological information
- Computers are used to gather, store, analyze and integrate biological and genetic information such as nucleotide and amino acid sequences
- Information can be applied to gene-based drug discovery and development
- Biology, Computer Science, Mathematics and Information Technology merge to form a single discipline
- The development and implementation of tools that enable efficient access to, and use and management of, various types of information
- The development of new algorithms (mathematical formulas) and statistics with which to assess relationships among members of large data sets, such as methods to locate a gene within a sequence, predict protein structure and/or function, and cluster protein sequences into families of related sequences
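As a rough illustration of what "locating a gene within a sequence" can mean, the sketch below scans the forward strand for open reading frames (an ATG followed in frame by a stop codon). It is a teaching example only, not the method used by any Wisconsin Package program; the function name `find_orfs` and the `min_codons` threshold are assumptions made here.

```python
# Minimal forward-strand ORF scan (illustrative only).
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq, min_codons=2):
    """Return sorted (start, end) pairs for forward-strand ORFs.

    start is the index of the A in ATG; end is one past the stop codon.
    min_codons counts the codons from ATG up to, but not including, the stop.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        # Walk the sequence codon by codon in this reading frame.
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3))
                start = None
    return sorted(orfs)


if __name__ == "__main__":
    print(find_orfs("CCATGAAATTTTAGGG"))
```

Real gene finders also consider the reverse strand, codon usage statistics and splice sites; this sketch ignores all of those.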
Bioinformatics System in MBC
GCG Wisconsin Package (Accelrys)
1. Software
- Unix command line
- SeqLab: X-Window user interface
- SeqWeb: Web-based graphical user interface (Internet Explorer)
2. Databases
- Nucleic Acid
- Protein
GCG Wisconsin Package Software
- A complete suite of over 130 programs designed for all aspects of biological sequence analysis
- Modular: each program performs a specific function to achieve different goals, but programs can be combined in different ways to address various analytical problems
- Flexible: other Accelrys bioinformatics products and in-house programs can be added to extend its functionality
- Customizable: each program has a unique set of options to change its behavior and output
System in MBC
System Setup
- Server: SUN Fire 280R
- Server: 2 GB RAM, 80 GB HD
- OS Version: Solaris 9
- Host name: MBC01
- Storage: SUN Storage A1000 and SUN Storage 3320
- - /data with 266 GB on A1000
- - /data2 with 739 GB on 3320
- - /data3 with 739 GB on 3320
- - /data4 with 135 GB on A1000
- Total HD: 1879 GB
- Used: 492 GB
- Available: 1387 GB
- Capacity: 74%
[Figure: server and storage units]
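The storage figures can be cross-checked with a few lines of arithmetic. The sketch below sums the four volumes listed in the system setup and derives the available space from the used figure. Reading the "74" capacity figure as the share of space still available is an assumption made here, based only on the arithmetic.

```python
# Cross-check of the storage figures quoted on the slides.
volumes_gb = {
    "/data": 266,    # on A1000
    "/data2": 739,   # on 3320
    "/data3": 739,   # on 3320
    "/data4": 135,   # on A1000
}

total_gb = sum(volumes_gb.values())      # total disk space across volumes
used_gb = 492                            # figure quoted on the slide
available_gb = total_gb - used_gb        # space still free

# Assumption: the "74" on the slide is the available share, in percent.
available_pct = round(100 * available_gb / total_gb)
used_pct = round(100 * used_gb / total_gb)

if __name__ == "__main__":
    print(total_gb, available_gb, available_pct, used_pct)
```

The totals agree with the slide: 1879 GB overall and 1387 GB available, which is about 74% of the total.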
Databases supported in the Wisconsin Package
- GCG formatted sequence databases
- 2 nucleic acid and 5 protein databases updated regularly
- Usage of these databases greatly enhances the functionalities of the programs and increases the search speed
Databases: Nucleic Acid
- GCG combined GenBank and EMBL to form GenEMBLPlus, containing all of GenBank and an abridged EMBL for data that are not present in GenBank
- Databases are released bi-monthly following GenBank's release schedule
- GCG formatted databases follow the GenBank organization
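The idea behind GenEMBLPlus (all of GenBank, plus only those EMBL entries that GenBank lacks) can be sketched as a simple merge. How GCG actually decides that two entries are the same is not stated on the slide; matching on accession number is an assumption here, and the accessions and function name below are hypothetical.

```python
# Illustrative merge: keep every GenBank entry, add EMBL entries only
# when their accession is absent from GenBank (assumed matching rule).
def combine_genembl(genbank, embl):
    """Return a combined accession -> record mapping, preferring GenBank."""
    combined = dict(genbank)
    for accession, record in embl.items():
        if accession not in combined:
            combined[accession] = record
    return combined


if __name__ == "__main__":
    # Hypothetical accessions and placeholder records.
    genbank = {"X00001": "genbank record 1", "X00002": "genbank record 2"}
    embl = {"X00002": "embl record 2", "X00003": "embl record 3"}
    print(sorted(combine_genembl(genbank, embl)))
```

In this toy example the shared accession keeps its GenBank record, and only the EMBL-only entry is added, which mirrors the "abridged EMBL" described above.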
Nucleotide Database Divisions
- EMBLPlus (abridged): emp
- GenBankPlus: gbp
- GenEMBLPlus: gep
- Expressed Sequence Tags (EST): est, gb_est, em_est
- Genome Survey Sequences (GSS): gss, gb_gss, em_gss
- High-Throughput cDNA Sequences (HTC): htc, gb_htc
Databases: Protein
The Wisconsin Package contains the following public protein databases:
- PIR: Protein Information Resource from the National Biomedical Research Foundation (NBRF)
- SWISS-PROT: Swiss Protein Database by Dr. Amos Bairoch at the Swiss Institute of Bioinformatics (SIB)
- SP-TrEMBL: Translated EMBL database in SWISS-PROT format. A joint effort by Dr. Bairoch and the European Bioinformatics Institute (EBI)
- GenPept: Translated GenBank database, no annotation
- NRL_3D: a structural database produced by PIR from sequence information extracted from 3-D structures in PDB
Databases Protein
- PIR: pir
- - PIR1: pir1
- - PIR2: pir2
- - PIR3: pir3
- - PIR4: pir4
- SWISS-PROT: sw
- SP-TrEMBL: sptr
- SWISS-PROTPlus (combination of SWISS-PROT and SP-TrEMBL): swp
- GenPept: gp
- NRL_3D: nrl_3d
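Some of the short names above stand for groups of databases: pir covers pir1 to pir4, and swp combines sw and sptr. The sketch below shows one way to expand such a name into its component databases. It only encodes the relationships listed on the slide and does not reflect how the Wisconsin Package resolves names internally; `COMPOSITES` and `expand` are names invented for this example.

```python
# Composite short names and their components, as listed on the slide.
COMPOSITES = {
    "pir": ["pir1", "pir2", "pir3", "pir4"],
    "swp": ["sw", "sptr"],
}


def expand(name):
    """Expand a short database name into its component databases."""
    name = name.lower()
    parts = COMPOSITES.get(name)
    if parts is None:
        # Not a composite: the name stands for itself.
        return [name]
    expanded = []
    for part in parts:
        expanded.extend(expand(part))
    return expanded


if __name__ == "__main__":
    for short_name in ("pir", "swp", "gp"):
        print(short_name, "->", expand(short_name))
```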
GCG Wisconsin Package with SeqLab Interface
- Installed and run under UNIX
- - More disk space for databases and results
- - Run on CPUs for computer-intensive analysis
- - Multi-user access and easy maintenance
- Native interface for the Wisconsin Package (WP): UNIX command line
- Graphical user interface (GUI) for WP on UNIX: an X-Window interface
Unix Command line
Available Programs in SeqLab
- Comparison
- - compare two or more sequences
- Database Searching
- - use the sequences to search peptide and nucleic acid databases
- Evolution
- - investigate the evolutionary relationships within a group of sequences
- Mapping
- - display restriction maps or peptide cleavage maps of the sequence
- Gene Finding and Pattern Recognition
- - recognize coding regions, terminators, repeats, and consensus patterns
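To make the Mapping category concrete, the sketch below lists where a few restriction enzyme recognition sites occur in a DNA sequence. It is a simplified illustration, not the Wisconsin Package mapping program: it reports the 0-based start of each recognition site rather than the cut position, and the small enzyme table and the function name `restriction_map` are choices made for this example. All three sites are palindromic, so scanning one strand is enough.

```python
# Recognition sequences of three common restriction enzymes.
SITES = {
    "EcoRI": "GAATTC",
    "BamHI": "GGATCC",
    "HindIII": "AAGCTT",
}


def restriction_map(seq, sites=SITES):
    """Return enzyme -> list of 0-based start positions of its site."""
    seq = seq.upper()
    hits = {}
    for enzyme, site in sites.items():
        positions = []
        pos = seq.find(site)
        while pos != -1:
            positions.append(pos)
            # Continue searching just past the previous hit.
            pos = seq.find(site, pos + 1)
        hits[enzyme] = positions
    return hits


if __name__ == "__main__":
    print(restriction_map("AAGAATTCGGGATCCTTGAATTC"))
```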
- Primer Selection
- - select oligonucleotide primers for a template DNA sequence
- Protein Analysis
- - perform protein analysis tasks such as finding structural and sequence motifs, transmembrane regions, and helix-turn-helix motifs
- Nucleic Acid Secondary Structure
- - predict secondary structure in nucleic acid sequences
- Translation
- - translate nucleic acids into peptide sequences and peptides back into nucleic acids
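The Translation category can be illustrated with a short sketch that translates a nucleic acid sequence into a peptide using the standard genetic code. This is a generic example, not the Wisconsin Package implementation: it reads a single frame from the first base, marks stop codons as "*", and writes unrecognized codons as "X". Back-translation is not covered, since one peptide can correspond to many nucleotide sequences.

```python
from itertools import product

# Standard genetic code, with codons ordered T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(seq):
    """Translate DNA or RNA in frame 1; '*' marks a stop codon."""
    dna = seq.upper().replace("U", "T")
    usable = len(dna) - len(dna) % 3   # ignore a trailing partial codon
    codons = (dna[i:i + 3] for i in range(0, usable, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


if __name__ == "__main__":
    print(translate("ATGGCCTAA"))
```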
Available Programs in SeqLab
GCG Wisconsin Package with SeqWeb Interface
- A Web-based graphical user interface to access the WP over the intranet
- Provides parameter control and extensive on-line help about the algorithms, parameters and results
- Produces results with links to both internal and external data
- Offers multi-program analyses as one-step operations
- Allows users to save results as text, graphics or HTML documents
Available Programs in SeqWeb
- Access the most frequently used programs in the Wisconsin Package
- - Database searching and retrieval
- - Sequence comparison
- - Protein analysis
- - Mapping
- - Evolutionary analysis
- - Pattern recognition and motif searching
- - Nucleic acid secondary structure
- - Primer prediction
Thank you