My Masters Work - PowerPoint PPT Presentation

Transcript and Presenter's Notes


1
My Masters Work
  • Richa Tiwari

2
Outline of the talk
  • Analysis of Phylogeny Tree Evaluation Approaches
    (Project done in CS641).
  • Proteomics and 2-D Gel Electrophoresis (Study
    done for CS)
  • Coexpression analysis of dimerization between
    bZIP proteins in groups C, S1 and S2 in
    Arabidopsis thaliana, under the conditions of
    differential light and CO2 levels (Project done
    for BST676).

3
Analysis of Phylogeny Tree Evaluation Approaches
4
Phylogenetic Analysis
  • Aligning the sequences
  • Determining whether a relationship exists
    between the sequences
  • Choosing the most appropriate tree-building
    algorithm
  • Scrutinizing the tree to determine the level of
    confidence

5
  • Algorithmic Method
  • Defines an algorithm that leads to the
    determination of a tree.
  • Criteria Based Method
  • Defines a criterion for comparing different
    phylogenies, so that the phylogenies can be
    ranked and compared.

6
(No Transcript)
7
Maximum Parsimony Method
  • The most parsimonious tree explains the observed
    character distribution with the minimum tree
    length.
  • Tree selection criterion: minimum tree length
    (fewest character-state transformations)
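
The tree-length criterion can be illustrated with a minimal sketch of Fitch's counting procedure on a rooted binary tree. This is not the DNAPARS implementation; the alignment and both tree shapes below are made-up toy data, and the function names are hypothetical.

```python
def fitch_changes(tree, states):
    """Minimum number of state changes for one character on a rooted binary tree.

    tree: nested 2-tuples with taxon names (str) at the leaves.
    states: dict mapping taxon name -> character state at this site.
    """
    changes = 0

    def visit(node):
        nonlocal changes
        if isinstance(node, str):          # leaf: its state set is the observed state
            return {states[node]}
        left, right = (visit(child) for child in node)
        common = left & right
        if common:                         # children agree: no change needed here
            return common
        changes += 1                       # children disagree: one transformation
        return left | right

    visit(tree)
    return changes


def tree_length(tree, alignment):
    """Sum the per-site change counts over all alignment columns."""
    n_sites = len(next(iter(alignment.values())))
    return sum(
        fitch_changes(tree, {taxon: seq[i] for taxon, seq in alignment.items()})
        for i in range(n_sites)
    )


# Toy alignment (invented for illustration only).
alignment = {
    "Human": "AAGCT",
    "Chimp": "AAGCT",
    "Gorilla": "AGGCT",
    "Orang": "AGGTT",
    "Rhesus": "CGGTA",
}
tree_a = ((("Human", "Chimp"), "Gorilla"), ("Orang", "Rhesus"))
tree_b = ((("Human", "Orang"), "Chimp"), ("Gorilla", "Rhesus"))

length_a = tree_length(tree_a, alignment)
length_b = tree_length(tree_b, alignment)
print(length_a, length_b)  # the tree with the smaller length is preferred
```

Under the parsimony criterion, the candidate tree with the smaller length is the one selected.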

8
Maximum Likelihood (ML)
  • ML evaluates the probability that the chosen
    evolutionary model generated the observed
    sequences.
  • The evolutionary model accounts for the changes
    in sequences.
  • Phylogenies are then inferred by finding the
    trees that yield the highest likelihood.
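
As a minimal sketch of the likelihood idea, the simplest possible case is a "tree" of two sequences joined by one branch. The example assumes the Jukes-Cantor substitution model (an assumption, not stated on the slide) and searches a grid of branch lengths for the one that maximizes the likelihood. DNAML does far more than this; the sequences are invented.

```python
import math


def jc69_log_likelihood(seq_a, seq_b, d):
    """Log-likelihood of two aligned sequences separated by distance d
    (expected substitutions per site) under the Jukes-Cantor model."""
    e = math.exp(-4.0 * d / 3.0)
    p_same = 0.25 + 0.75 * e   # probability a site keeps its base
    p_diff = 0.25 - 0.25 * e   # probability of one specific different base
    loglik = 0.0
    for a, b in zip(seq_a, seq_b):
        # 0.25 is the equilibrium frequency of the base in the first sequence
        loglik += math.log(0.25 * (p_same if a == b else p_diff))
    return loglik


# Invented sequences: 20 sites, 2 of which differ.
seq_a = "ACGTACGTACGTACGTACGT"
seq_b = "ACGTACGAACGTACCTACGT"

# Evaluate the likelihood on a grid of candidate distances and keep the best.
grid = [i / 1000 for i in range(1, 2001)]
best_d = max(grid, key=lambda d: jc69_log_likelihood(seq_a, seq_b, d))

# For this model the maximum is also known in closed form.
p = sum(a != b for a, b in zip(seq_a, seq_b)) / len(seq_a)
closed_form = -0.75 * math.log(1.0 - 4.0 * p / 3.0)
print(best_d, closed_form)
```

The grid search and the closed-form estimate should agree to within the grid spacing, which is a quick check that the likelihood function is coded consistently.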

9
Distance Based Method
  • Distance-based methods estimate the distance
    between two taxa, that is, the total number of
    changes since they last shared an ancestor.
  • It is a cluster-based method.
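
A minimal sketch of the first step of a distance-based analysis: build a pairwise distance matrix and find the closest pair, which is where clustering would start. The Jukes-Cantor correction is assumed here for illustration, the sequences are invented, and the function names are hypothetical. DNADIST and Neighbor offer more models and the full neighbor-joining procedure.

```python
import math
from itertools import combinations


def jc_distance(seq_a, seq_b):
    """Jukes-Cantor corrected distance between two aligned sequences."""
    p = sum(a != b for a, b in zip(seq_a, seq_b)) / len(seq_a)
    if p >= 0.75:
        raise ValueError("sequences too divergent for the Jukes-Cantor correction")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


def distance_matrix(alignment):
    """Map each unordered pair of taxa (as a sorted tuple) to its distance."""
    return {
        (s, t): jc_distance(alignment[s], alignment[t])
        for s, t in combinations(sorted(alignment), 2)
    }


# Invented toy alignment.
alignment = {
    "Human": "ACGTACGTAC",
    "Chimp": "ACGTACGTAT",
    "Gorilla": "ACGTACCTGT",
    "Orang": "ACGAACCTGA",
    "Rhesus": "TCGAACCAGA",
}

matrix = distance_matrix(alignment)
closest_pair = min(matrix, key=matrix.get)  # first pair a clustering method would join
print(closest_pair, round(matrix[closest_pair], 4))
```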

10
Software used
  • PHYLIP was used to compare the three phylogeny
    methods. Programs used from the package:
  • Maximum Parsimony: DNAPARS
  • Maximum Likelihood: DNAML
  • Distance-based: DNADIST and NEIGHBOR
  • Trees drawn using DRAWGRAM
  • Consensus tree constructed using CONSENSE

11
Using Sample data
  • Maximum parsimony: DNAPARS
  • Maximum likelihood: DNAML
  • Distance based: Neighbor

12
Consensus tree for given example
[Figure: consensus trees of Human, Chimp, Orang,
Rhesus and Gorilla in three panels (Parsimony Method,
Maximum Likelihood, Distance Based/Neighbor joining);
the internal branches shown carry a value of 1.0.]
13
Observation
  • Reliability of branch length estimates
  • NJ and ML > MP
  • Computational speed (n > 500)
  • NJ/DNADIST: 0.005 seconds
  • DNAPARS: 0.5 seconds
  • DNAML: 230.0 seconds

14
Conclusion
  • Our experiments indicate that the Distance Based
    method is better than the other two methods in
    terms of speed, simplicity, and performance for a
    large number of taxa.
  • With a fast computer and a large dataset, the
    Maximum Likelihood method is better than Maximum
    Parsimony.

15
Proteomics and 2-D gel Electrophoresis
16
Introduction
  • The entire set of proteins expressed by the
    genome in a cell, organ or organism is referred
    to as the proteome.
  • Proteomics: methods that discover and quantify
    proteins and their biochemical changes.

17
Application of Proteomics
  • Protein Mining
  • Network Mapping
  • Mapping Protein Modifications

18
Proteomics Analysis
Reference: www.mbi.osu.edu/sciprograms/prfmaterials/vandre.ppt
19
2-D Gel Electrophoresis
  • The horizontal position tells us about the charge
    of a protein, whereas the intensity of the gel
    spot tells us about the amount of that protein in
    the system.
  • Steps:
  • 1. Prepare the protein sample in solution.
  • 2. Separate the proteins (in each dimension):
  • I. Based on pH, using isoelectric focusing (IEF)
    on immobilized pH gradient (IPG) strips
  • II. Based on molecular weight (size), using gel
    electrophoresis
  • 3. Stain the proteins to enable visualization.

20
Introduction to the project
  • This project focuses on 2D gel electrophoretic
    separation of proteins.
  • We analyzed a few randomly chosen spots from the
    2D gels of rat mammary tissue.
  • We used statistical methods to find the variance
    in pI of the same protein across different gels.
  • We analyzed the reasons for these differences.
  • We inferred the relationship between the
    experimental values and the predicted values.

21
Images of the gels used in the project.
22
One of the gels with Protein Spots
23
  • The gels we used came from a previously completed
    experiment. 28 random protein spots were selected
    from each of the three gels based on their
    intensity.
  • Mass Spectrometry
  • Differentially expressed proteins identified by
    image analysis were excised from the 2D gels and
    digested with trypsin. The resulting peptide
    fragments were analyzed on a MALDI mass
    spectrometer (MS). The MALDI spectrum displays a
    peptide fingerprint of the protein, given by the
    corresponding peptide masses.

24
MALDI TOF MS
25
  • Proteins were identified by entering the peptide
    masses (ions from the MALDI spectrum) into a
    peptide mapping database. Some examples of such
    protein search engines are:
  • Mascot (very popular, and also used in this
    project)
  • Sequest
  • Aldente
  • ProteinLynx
  • Phenyx

26
Image of a search database
27
Results
  • We tabulated the results obtained from the
    internet database search alongside those obtained
    from the experiment.
  • We observed that neither the pI values nor the
    molecular weights were the same in all gels for
    the same protein.
  • The pI values from the three gels were quite
    similar to one another, but they differed from the
    predicted pI values.

28
  • In a 2D gel, the position of a protein spot can
    change for various reasons, and as a result the
    measured molecular weight and pI values may also
    differ.

29
Graphical representation of pI values of three
gels
30
Graph showing the variance among the predicted pI
and observed pI
31
Observations
  • The experimental pI values from the three gels
    are not very different from each other.
  • So we can interpret that the differences due to
    non-biological reasons are very small in the
    experiment.
  • For a few protein spots, the internet search
    returned the same protein name, but our
    experiment gave different results. This can be
    because a different group (such as phosphate or
    sulphate) became attached to the protein. There
    can be other reasons for it too.

32
  • The average deviation between the three observed
    pI values and the predicted pI value was
    calculated as
  • [(pI (gel 12_5) - pred. pI) + (pI (gel 12_5) -
    pred. pI) + (pI (gel 12_5) - pred. pI)] / 3
  • This gave the results shown in the next slide.
    We obtained positive as well as negative values
    for the deviations.
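
The calculation above can be sketched in a few lines. The pI numbers below are invented for illustration and are not taken from the gels; the function name is hypothetical.

```python
def average_deviation(observed_pis, predicted_pi):
    """Mean signed difference between observed pI values and the predicted pI."""
    return sum(pi - predicted_pi for pi in observed_pis) / len(observed_pis)


# Hypothetical spot: one observed pI per gel, plus the database prediction.
observed = [5.8, 5.9, 5.7]
predicted = 6.1

deviation = average_deviation(observed, predicted)
print(round(deviation, 3))
# A negative value means the spot ran at a lower pI than predicted.
```

Keeping the sign, rather than taking absolute values, is what lets positive and negative deviations be told apart.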

33
Average deviations between the three gels and the
predicted pI
34
  • We can interpret that the proteins were modified
    more often by negatively charged groups, such
    that their pI values decreased.
  • The addition of one phosphate group to a serine,
    threonine, or tyrosine residue typically
    decreases the isoelectric point by 0.1 pH unit.

35
Regression results
  • A statistical test was performed to determine
    which of the three gels was closest to the
    predicted pI values, that is, in which of the
    three gels the proteins were least modified.
  • The test was a calibration test. We prepared a
    regression model for each gel. The inverse
    regression equation used was
  • Predicted pI = (Observed pI from Gel - Intercept)
    / slope
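
A minimal sketch of such a calibration, assuming the regression for a gel is fitted as observed pI = intercept + slope × predicted pI and then inverted. The data points are invented and the function names are hypothetical.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def calibrate(observed_pi, intercept, slope):
    """Inverse regression: recover the predicted pI from an observed pI."""
    return (observed_pi - intercept) / slope


# Invented values for one gel: database-predicted pI vs pI read from the gel.
predicted = [4.0, 5.0, 6.0, 7.0, 8.0]
observed = [4.1, 5.0, 5.9, 6.8, 7.7]

intercept, slope = fit_line(predicted, observed)
estimate = calibrate(6.35, intercept, slope)
print(round(intercept, 3), round(slope, 3), round(estimate, 3))
```

Repeating the fit per gel gives one calibrated estimate per gel, which can then be compared with the database value.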

36
Predicted pI values from the Calibration test and
internet database
37
  • The results showed that all three gels predicted
    almost the same pI values, and that these were
    quite far from the original predicted pI values.
  • These similarities between the three gels show
    that the difference between the predicted and
    the experimental pI values is due not so much to
    non-biological factors as to chemical
    modifications of the proteins.

38
Coexpression analysis of dimerization between
bZIP proteins in groups C, S1 and S2 in
Arabidopsis thaliana, under the conditions of
differential light and CO2 levels.
39
Introduction: Transcription factors
  • Transcription factors are proteins involved in
    the regulation of gene expression that bind to
    the promoter region upstream of genes.
  • They are composed of two essential functional
    regions:
  • DNA binding domain: binds to DNA.
  • Activator domain: interacts with other
    regulatory proteins, thereby affecting the
    efficiency of DNA binding.

40
bZIP proteins
  • bZIP proteins are a class of transcription
    factors with a leucine zipper motif, a periodic
    repetition of a leucine residue at every seventh
    position that forms an alpha-helical
    conformation.
  • The segment that comprises the basic region and
    the periodic array of leucine residues is
    referred to as the basic-region leucine zipper,
    or bZIP, motif.

41
(No Transcript)
42
Some facts
  • There are 792 bZIP proteins recorded in the
    nonredundant database.
  • The number of bZIP proteins in selected
    organisms is as follows:
  • Yeast: 16
  • Fruit fly: 110
  • Plant (Arabidopsis thaliana): 75
  • Human: 114

43
Arabidopsis
  • The Arabidopsis genome sequence contains 75
    distinct members of the bZIP family, of which 50
    are not well studied.
  • Using common domains, the bZIP family can be
    subdivided into 10 groups (Groups A - S).

44
(No Transcript)
45
(No Transcript)
46
C and S protein interaction
  • Ehlert et al. measured interactions between C and
    S proteins.
  • C and S1 proteins heterodimerized.
  • Two S2 proteins dimerized.

47
Effect of Light and CO2 on C and S proteins
  • Carbohydrate signaling
  • Carbohydrate partitioning increases in elevated
    CO2 and decreases in low light.
  • Seed development
  • The photosensory system detects the quality,
    quantity, direction and duration of light, and
    controls the developmental pattern.
  • Stress
  • Light-dependent generation of active oxygen
    species is a type of stress called photo-oxidative
    stress.

48
Experiment Selection Criteria
  • a) Chose C and S bZIP proteins
  • Coexpression Engine:
    http://www.ssg.uab.edu/coexpression
  • b) Selected tissue and array type
  • c) Chose specific experiment

49
a) Chose C and S bZIP proteins
50
b) Selected tissue and array type
51
c) Chose specific experiment: NASC Experiments
52
Justification
  • Biologically feasible comparisons due to similar
  • Tissue types
  • Experiment conditions
  • Statistical
  • Measurement protocol

53
The tool used
  • Co-expression Analysis Tool, version 2.0,
    developed at the Section on Statistical Genetics,
    UAB:
    http://obiwan.ssg.uab.edu:8080/coexpression/servlets/CoexpReleasesResponseManager
  • Built mainly to analyze co-expression in the
    Arabidopsis plant.
  • NASC experiments that use Affymetrix GeneChip
    profiling to study the effect of light and CO2 on
    leaf development in Arabidopsis were used.

54
  • The tool uses a database built from the
    Nottingham Arabidopsis Stock Center (NASC)
    AffyWatch Service.
  • Version 2, used in this project, contains a
    total of 566 microarray chips, of which 486 ATH1
    microarray chips were used.

55
NASC Experiments used
  • Four experiments were conducted to examine
    developing leaf insertions under varying
    conditions of light and CO2.
  • Sampling was done at 0, 2, 4, 12, 24, 48 and 96
    hours using a batch of 24 plants.
  • Four replicates were produced for each of the
    seven time points per experiment.

56
Working of the tool
  • Linear regression analysis is performed on the
    probe sets.
  • The regression yields three important values:
    the slope parameter (indicating the direction of
    co-expression), the p-value (stating the
    confidence in the correlation) and the R squared
    value (the strength of the correlation).
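
As a rough sketch of what such a regression produces for one pair of probe sets, the example below computes the slope, the R squared value and the t statistic of the slope in plain Python. It is not the tool's code, and the expression values are invented. The p-value is the two-sided tail probability of the t statistic with n - 2 degrees of freedom; a statistics library such as SciPy (scipy.stats.linregress) reports it directly.

```python
import math


def simple_regression(x, y):
    """Regress y on x; return (slope, r_squared, t_statistic, degrees_of_freedom)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))

    slope = sxy / sxx                          # sign gives direction of co-expression
    r_squared = sxy ** 2 / (sxx * syy)         # strength of the linear relationship
    residual_ss = syy - slope * sxy            # unexplained variation (nonzero here)
    std_err = math.sqrt(residual_ss / (n - 2) / sxx)
    t_statistic = slope / std_err              # basis for the p-value
    return slope, r_squared, t_statistic, n - 2


# Invented expression levels of two probe sets across five chips.
gene_of_interest = [1.0, 2.0, 3.0, 4.0, 5.0]
other_probe = [2.1, 3.9, 6.2, 7.8, 10.1]

slope, r_squared, t_statistic, dof = simple_regression(gene_of_interest, other_probe)
print(round(slope, 3), round(r_squared, 4), round(t_statistic, 1), dof)
```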

57
Procedure
  • 4 genes of the C group, 5 genes of the S1 group
    and 3 genes of the S2 group were studied in the
    project.
  • We submit the AGI IDs, the tissue type (here
    leaf) and the experiment numbers (in our case
    156, 157, 158 and 159) to the tool.
  • Our genes of interest are regressed on all the
    22,810 ATH1 probe sets, and a p-value, R squared
    value and slope parameter are obtained for each.

58
  • The genes were subsequently sorted according to
    their R squared value and p-value and ranked such
    that
  • the higher the R squared value, the higher the
    rank.
  • With an arbitrary cut-off of 15, the top-ranked
    genes were identified as highly co-expressed.
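
The ranking step might look like the sketch below. The probe names and numbers are invented, ties in R squared are broken by the smaller p-value, and the cut-off is treated as a count of top-ranked entries, which is an assumption about how the cut-off was applied.

```python
def rank_coexpressed(results, cutoff):
    """Sort by R squared (descending), then p-value (ascending); keep the top entries."""
    ranked = sorted(results, key=lambda r: (-r["r_squared"], r["p_value"]))
    return ranked[:cutoff]


# Invented regression results for four hypothetical probe sets.
results = [
    {"probe": "probe_1", "r_squared": 0.42, "p_value": 0.003},
    {"probe": "probe_2", "r_squared": 0.81, "p_value": 0.00001},
    {"probe": "probe_3", "r_squared": 0.10, "p_value": 0.2},
    {"probe": "probe_4", "r_squared": 0.81, "p_value": 0.00004},
]

top = rank_coexpressed(results, cutoff=2)  # cutoff=2 only because the toy list is short
top_probes = [r["probe"] for r in top]
print(top_probes)
```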

59
Hypothesis
  • Genes coding for dimerizing proteins should be
    coexpressed at the same time.
  • If genes in group C and S1 lead to
    heterodimerization then they should be
    coexpressed at the same time.

60
Table 2: Mapping information between AtbZIP, AGI,
ATH probe set and AtbZIP group IDs
61
Table 3: Regression estimates between Group C
AtbZIP63 (245925_at) and probes in Groups S1, C
and S2.
62
Table 4: Regression estimates between Group C
AtbZIP25 (251848_at) and probes in Groups S1, C
and S2.
63
Regression estimates between Group C AtbZIP9
(246962_s_at) and probes in Groups S1, C and S2.
64
(No Transcript)
65
(No Transcript)
66
(No Transcript)
67
(No Transcript)
68
(No Transcript)
69
(No Transcript)
70
(No Transcript)
71
Results
  • bZIP1 (Group S1) coexpresses well with bZIP63
    (Group C) under conditions of ambient CO2 and low
    light, but the same coexpression interaction is
    weak under conditions of elevated CO2 and ambient
    light.
  • Also, very minimal interaction was found between
    genes of Group C (bZIP25, bZIP10, bZIP9, and
    bZIP63) and bZIP9 (Group C

72
Conclusion
  • This bZIP study was a good litmus test for the
    SSG Coexpression Tool.
  • The results presented in this study provide
    evidence that a good, if not significant, number
    of AtbZIP proteins that interact as heterodimers
    are co-regulated under varying conditions of
    stress.
  • This study shows evidence that coexpression
    patterns in genes can be studied by pooling
    publicly available microarray data, and that the
    use of a simple linear regression procedure is
    feasible.

73
Discussion
  • The varying trends in coexpression raise some
    questions:
  • Different genes are expressed in different
    tissues. Is a study on leaf good enough to
    support our hypothesis?
  • Time-course data is valuable and should be
    accounted for in the analysis. However, this kind
    of analysis requires more observations recorded
    at different time points.
  • Linear regression is good, but would a robust
    time-series based approach be more appropriate
    for our study?