To understand the structures of existing biological databases,

Slides: 17
Provided by: Eliz173

Transcript and Presenter's Notes

1
(No Transcript)
2
Purpose
  • To understand the definition of homology and its
    importance in supporting the theory of evolution
  • To understand the structures of existing
    biological databases and be able to process
    their data
  • To use sequence alignment tools such as BLAST,
    and to understand its underlying scoring
    mechanisms and statistical significance
  • To construct an easily manageable database system
    with a user-friendly interface

3
Step 1: What is homology?
  • Read related information regarding the origin
    and meaning of homology
  • Recommended reading: Homology in Biology,
    Jonathan Wells
  • Recommended reading: Icons of Evolution?, Alan
    D. Gishlick
  • Before actually starting to work, you should be
    able to answer the following questions:
  • What's the difference between similarity and
    homology?
  • What were biologists' views on the meaning of
    homology?
  • And what is your point of view? (If you have
    any)

4
Definition of Homology
Homologous sequences. Orthologs and Paralogs are
two types of homologous sequences. Orthology
describes genes in different species that derive
from a common ancestor. Orthologous genes may or
may not have the same function. Paralogy
describes homologous genes within a single
species that diverged by gene duplication.
(Definition from NCBI Education)
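
The definition above can be sketched in code. This is a minimal illustration, assuming the two genes are already known to be homologous and applying only the simplified species-based rule from this slide; the `Gene` class and `classify_homologs` function are hypothetical names, not part of any NCBI tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gene:
    """A gene identified by its symbol and the species it belongs to."""
    symbol: str
    species: str


def classify_homologs(a: Gene, b: Gene) -> str:
    """Label a pair of genes that are ALREADY known to be homologous.

    Simplified rule taken from the slide's definition:
      different species -> "ortholog"
      same species      -> "paralog" (diverged by gene duplication)
    Establishing homology itself (e.g. with BLAST) is out of scope here.
    """
    if a.species != b.species:
        return "ortholog"
    return "paralog"


# Illustrative pairs: human alpha- and beta-globin, plus a mouse beta-globin.
human_hbb = Gene("HBB", "Homo sapiens")
human_hba1 = Gene("HBA1", "Homo sapiens")
mouse_hbb = Gene("Hbb-b1", "Mus musculus")
```

Calling `classify_homologs(human_hbb, mouse_hbb)` labels the pair as orthologs, while `classify_homologs(human_hbb, human_hba1)` labels it as paralogs.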
5
Step 2: BLAST
  • Know how to use BLAST and understand the BLAST
    algorithm
  • Use the web-based BLAST at NCBI and play with it
  • We will use blastp in the project
  • Download the BLAST software from the NCBI FTP
    site and install it
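
Once blastp has been run locally, its hits have to be read back into the project. The sketch below assumes the NCBI BLAST+ release of blastp was run with `-outfmt 6`, which writes twelve tab-separated columns per hit (query id, subject id, percent identity, alignment length, mismatches, gap opens, query start/end, subject start/end, E-value, bit score). Older BLAST releases use different options, so treat this as one possible setup. The `Hit` type, `parse_tabular` function, E-value cutoff, and `SAMPLE` rows are all made up for illustration.

```python
from typing import Iterable, List, NamedTuple


class Hit(NamedTuple):
    """The subset of tabular BLAST columns this sketch keeps."""
    query: str
    subject: str
    pct_identity: float
    length: int
    evalue: float
    bitscore: float


def parse_tabular(lines: Iterable[str], max_evalue: float = 1e-5) -> List[Hit]:
    """Parse 12-column tabular BLAST output and keep hits at or below max_evalue."""
    hits = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):  # skip blanks and comment lines
            continue
        f = line.split("\t")
        hit = Hit(f[0], f[1], float(f[2]), int(f[3]), float(f[10]), float(f[11]))
        if hit.evalue <= max_evalue:
            hits.append(hit)
    return hits


# Fabricated rows, only to show the expected column layout.
SAMPLE = (
    "query1\tsubjA\t98.50\t200\t3\t0\t1\t200\t1\t200\t2e-110\t390\n"
    "query1\tsubjB\t31.20\t120\t80\t3\t5\t120\t10\t128\t0.5\t28.1\n"
)
```

With the default cutoff, `parse_tabular(SAMPLE.splitlines())` keeps only the first row, because the second row's E-value of 0.5 is too weak to suggest homology.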

6
Step 3: Review Biological Databases
  • You have to integrate the following databases:
  • HomoloGene (from NCBI)
  • Homophila
  • Improved homology prediction (? construct a
    database like euGenes by hand)
  • The following databases are optional for
    inclusion
  • FlyBase
  • MGD
  • RGD
  • SGD
  • DHMHD (The Dysmorphic Human-Mouse Homology
    Database)
  • Cancer Immunity

7
To complete your project
  • You must give us:
  • A search function so that we can search
    homologies by criteria such as organism, gene
    name, gene symbol, homology group, etc.
  • When we give a query, you must display the
    corresponding homologies and their sources
    (e.g., this homology is from HomoloGene, that
    one is from RGD, etc.)
  • A summary table (like in euGenes) describing the
    number of predicted homologies between different
    organisms
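
The three deliverables above (search by criteria, source display, and a summary table) can be prototyped with an in-memory SQLite database. This is only a sketch under assumed names: the `homology` table, its four columns, the helper functions, and the placeholder gene symbols are all hypothetical, and a real project would load rows exported from HomoloGene, Homophila, RGD, and the other sources.

```python
import sqlite3


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with a few placeholder rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE homology ("
        "group_id TEXT, organism TEXT, gene_symbol TEXT, source TEXT)"
    )
    rows = [  # placeholder data, not real database records
        ("G1", "Homo sapiens", "GENE_A", "HomoloGene"),
        ("G1", "Mus musculus", "Gene_a", "HomoloGene"),
        ("G1", "Rattus norvegicus", "Gene_a", "RGD"),
        ("G2", "Homo sapiens", "GENE_B", "Homophila"),
        ("G2", "Drosophila melanogaster", "gene-b", "Homophila"),
    ]
    conn.executemany("INSERT INTO homology VALUES (?, ?, ?, ?)", rows)
    return conn


def search(conn, organism=None, gene_symbol=None, group_id=None):
    """Return matching rows, including the source database of each one."""
    clauses, params = [], []
    criteria = (("organism", organism), ("gene_symbol", gene_symbol),
                ("group_id", group_id))
    for column, value in criteria:
        if value is not None:
            clauses.append(column + " = ?")  # column names are fixed above
            params.append(value)
    where = " WHERE " + " AND ".join(clauses) if clauses else ""
    sql = ("SELECT group_id, organism, gene_symbol, source FROM homology"
           + where + " ORDER BY group_id, organism")
    return conn.execute(sql, params).fetchall()


def summary(conn):
    """Count homology groups shared by each pair of organisms."""
    sql = (
        "SELECT a.organism, b.organism, COUNT(DISTINCT a.group_id) "
        "FROM homology a JOIN homology b "
        "ON a.group_id = b.group_id AND a.organism < b.organism "
        "GROUP BY a.organism, b.organism "
        "ORDER BY a.organism, b.organism"
    )
    return conn.execute(sql).fetchall()
```

The self-join in `summary` pairs every two organisms that share a homology group, and the `a.organism < b.organism` condition keeps each pair from being counted twice.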

8
Bioinformatics Computational Molecular Biology
Final Project
  • Microarray Analysis
  • Presenter: Elizabeth Tseng

9
Purpose
  • To understand what a microarray is and how it is
    used
  • To obtain an overview of the various analysis
    tools used for microarrays
  • To become familiar with the usage of microarray
    tools
  • To construct a user-friendly platform for
    microarray analysis

10
Step 1: What is a microarray?
  • You already know from previous courses that a
    microarray is an array of DNA or protein samples
    that can be hybridized with probes to study
    patterns of gene expression.
  • Types of microarray include cDNA (spotted) and
    Affymetrix.
  • Definition of gene expression: a term used to
    describe the transcription of the information
    contained within the DNA into mRNA molecules
    that are then translated into the proteins that
    perform most of the critical functions of cells.
  • How it works: by using an array containing many
    DNA samples, scientists can determine in a
    single experiment the expression levels of
    numerous genes within a cell by measuring the
    amount of mRNA bound to each site on the array.
    With the aid of a computer, the amount of mRNA
    bound is measured, generating a profile of gene
    expression in the cell.
  • The ratio of expression (e.g., tumor/normal, or
    red/green) indicates whether the gene at that
    spot is more highly expressed in one sample than
    in the other.
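
The ratio of expression is usually handled on a log2 scale, so that a two-fold increase and a two-fold decrease are symmetric (+1 and -1). The sketch below is one common way to do this, not the method of any particular tool; the pseudocount, the two-fold threshold, and the function names are assumptions chosen for illustration.

```python
import math


def log2_ratio(red: float, green: float, pseudocount: float = 1.0) -> float:
    """log2(red/green), with a pseudocount to avoid dividing by zero."""
    return math.log2((red + pseudocount) / (green + pseudocount))


def call_expression(red: float, green: float, fold: float = 2.0) -> str:
    """Classify a spot as "up", "down", or "unchanged" in the red channel.

    A spot is called only when the ratio differs by at least `fold`
    in either direction (an arbitrary cutoff for this sketch).
    """
    m = log2_ratio(red, green)
    threshold = math.log2(fold)
    if m >= threshold:
        return "up"
    if m <= -threshold:
        return "down"
    return "unchanged"
```

For example, a spot with red intensity 799 and green intensity 199 has a ratio of 800/200 = 4 after the pseudocount, so its log2 ratio is 2 and it is called "up".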

11
Step 2: Why do we need microarrays?
  • A microarray can contain a VERY LARGE number of
    genes
  • A microarray is small in size
  • A microarray may be used to assay gene expression
    within a single sample or to compare gene
    expression in two different cell types or tissue
    samples

GREEN represents Control DNA. RED represents
Sample DNA. YELLOW represents a combination of
Control and Sample DNA. BLACK represents areas
where neither the Control nor Sample DNA
hybridized to the target DNA.
(Steps 1 and 2 from NCBI Education)
12
Step 3: Microarray Analysis
  • (pre-Analysis) Standardization
  • Remove background noise
  • Normalize intensity
  • Example: SNOMAD
  • (pre-Analysis) Picking out the significant genes
    (e.g., SAM)
  • Comparison of gene expression
  • Clustering
  • Hierarchical clustering
  • K-means
  • SOM (Self-Organizing Maps)
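
Of the clustering methods listed above, K-means is the simplest to sketch. The version below is a plain-Python illustration, not the implementation used by any of the tools in this presentation: it seeds the centroids with the first k profiles so the result is repeatable, whereas real tools usually pick random starting points. All function names are hypothetical.

```python
from typing import List, Sequence, Tuple


def squared_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two expression profiles."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean_vector(vectors: List[Sequence[float]]) -> List[float]:
    """Component-wise mean of a non-empty list of profiles."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def k_means(profiles: List[Sequence[float]], k: int,
            max_iter: int = 100) -> Tuple[List[int], List[List[float]]]:
    """Cluster expression profiles into k groups.

    Returns (labels, centroids), where labels[i] is the cluster of profiles[i].
    """
    centroids = [list(p) for p in profiles[:k]]  # deterministic seeding
    labels = [0] * len(profiles)
    for iteration in range(max_iter):
        # Assignment step: each profile goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in profiles
        ]
        if iteration > 0 and new_labels == labels:
            break  # assignments stopped changing, so we have converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(profiles, labels) if label == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = mean_vector(members)
    return labels, centroids
```

Each row passed to `k_means` would be one gene's log-ratio profile across experiments, so genes that rise and fall together end up with the same label.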

13
Actual image
Pseudogram
Source: MeV
14
(No Transcript)
15
Step 4: Review microarray tools
  • SNOMAD - Standardization and NOrmalization of
    MicroArray Data
    http://pevsnerlab.kennedykrieger.org/snomadinput.html
  • SAM - Significance Analysis of Microarrays
    http://www-stat.stanford.edu/~tibs/SAM/index.html
  • PAM - Prediction Analysis for Microarrays
    http://www-stat.stanford.edu/~tibs/PAM/
  • Eisen's Cluster and TreeView
    http://rana.lbl.gov/EisenSoftware.htm
  • EMBL-EBI Expression Profiler
    http://www.ebi.ac.uk/microarray-srv/EP/cgi-bin/ep_ui.pl
  • NCI BRB ArrayTools
    http://linus.nci.nih.gov/BRB-ArrayTools.html
  • TIGR-TM4 http://www.tigr.org/software/tm4/
  • Whitehead Institute - GeneCluster 2
    http://www-genome.wi.mit.edu/cancer/software/genecluster2/gc2.html
  • dChip http://www.dchip.org/
  • Genesis http://genome.tugraz.at/
  • BioConductor http://www.bioconductor.org

16
To complete the project
  • You must construct a website that can do the
    following:
  • For each input microarray file, perform
    statistical analysis (e.g., t-test, Pearson
    correlation) to determine the significance of
    each expression profile
  • For a given Affymetrix CEL file, draw a
    probe-intensity image
  • Compare expression levels between two probe sets
  • Use PCA (Principal Component Analysis) to
    perform clustering
  • Hierarchical clustering
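
The two statistics named above can be sketched with the Python standard library alone. This is a hedged illustration: the t statistic shown is the Welch (unequal-variance) variant, which is one of several t-tests the project could use, and the function names are hypothetical. Turning t into a p-value needs the t-distribution, which the standard library does not provide, so that step is left out.

```python
import math
from statistics import mean, variance
from typing import Sequence


def welch_t(a: Sequence[float], b: Sequence[float]) -> float:
    """Welch's t statistic for two groups of expression values.

    Uses sample variances (n - 1 denominator); each group needs
    at least two values and the groups must not both be constant.
    """
    standard_error = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / standard_error


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation between two equal-length expression profiles."""
    mx, my = mean(x), mean(y)
    covariance = sum((p - mx) * (q - my) for p, q in zip(x, y))
    spread_x = math.sqrt(sum((p - mx) ** 2 for p in x))
    spread_y = math.sqrt(sum((q - my) ** 2 for q in y))
    return covariance / (spread_x * spread_y)
```

A correlation near +1 means two probe sets rise and fall together across samples, while a large absolute t value suggests that a gene's expression differs between the two groups being compared.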

probe-intensity sample image
profile comparison sample