A Computational Method to Increase Gene Expression

1
A Computational Method to Increase Gene Expression
  • Harlan Robins
  • Computational Biology
  • FHCRC

2
HIV genes are poorly expressed
Cullen BR, Trends Biochem Sci, 2003 Aug;28(8).
3
Problem: an HIV vaccine is likely less effective
with poor protein expression
  • Immune response correlates positively with
    antigen levels
  • For DNA vaccines, the antigen is expressed
    protein

4
Solution: increase protein expression
  • The traditional method is codon optimization
  • The idea is to recode an ORF with codons
    corresponding to the highest-abundance tRNAs
  • This method is designed to optimize translation
  • Codon optimization has had some success in HIV
  • Gag, Pol, Env expression can increase 500-fold
  • For HIV a new method is needed
  • The poor expression of HIV genes is caused by
    nuclear isolation, not translation
  • The gain from codon optimization is an indirect effect
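
As a rough illustration of the traditional approach described on this slide, the sketch below recodes an ORF by swapping each codon for a single "preferred" synonym. The `PREFERRED` table and the function name `codon_optimize` are hypothetical placeholders, not data from the talk; a real table would be chosen for the host organism.

```python
# Subset of the standard genetic code covering only the codons used here.
CODON_TO_AA = {
    "ATG": "M",
    "CTA": "L", "CTG": "L", "CTT": "L", "TTA": "L",
    "CAT": "H", "CAC": "H",
    "TAG": "*",
}

# Hypothetical "preferred" codon per amino acid (an assumption for
# illustration; real preferences depend on the expression host).
PREFERRED = {"M": "ATG", "L": "CTG", "H": "CAC", "*": "TAG"}


def split_codons(orf):
    """Split an in-frame ORF into a list of 3-base codons."""
    if len(orf) % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    return [orf[i:i + 3] for i in range(0, len(orf), 3)]


def translate(orf):
    """Translate an ORF using the toy codon table above."""
    return "".join(CODON_TO_AA[c] for c in split_codons(orf))


def codon_optimize(orf):
    """Replace every codon with the preferred synonym for its amino acid."""
    return "".join(PREFERRED[CODON_TO_AA[c]] for c in split_codons(orf))


original = "ATGCTATTACATTAG"
optimized = codon_optimize(original)
print(optimized)
```

The protein is unchanged by construction; only the choice among synonymous codons differs.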

5
What else is playing a role besides codon usage?
  • Short motifs are a good starting point
  • There are many RNA-binding proteins
  • It has been suggested that more proteins bind RNA than DNA
  • Statistically meaningful for real genome lengths

6
How do we find motifs in coding regions?
  • Two sequences
  • Compare the two sequences: search for motifs that
    are over- or under-represented in one sequence
    compared to the other
  • Single sequence with constraints
  • Create a 'second' sequence: the sequence
    which is maximally random given the constraints.
    This is the maximum entropy sequence
  • Apply the same procedure as for two sequences

7
Form a Distribution (from a sequence)
  • We need to form a probability distribution from a
    sequence.
  • For length n, we have words s_i, i = 1, ..., 4^n. P(s_i) is
    the fraction of length-n words with sequence s_i.

Example: if our sequence is 40 bases,
AGACTAATTGCGTAGCATAATCATGCATGTCGATGCGATT
then P(GCAT) = 2/(40-3) ≈ 0.054
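
The worked example can be checked with a few lines of Python. This is a minimal sketch that counts overlapping length-n windows; the function name `word_distribution` is an illustrative choice, not something from the talk.

```python
from collections import Counter


def word_distribution(seq, n):
    """Return {word: fraction} over all overlapping length-n windows of seq."""
    total = len(seq) - n + 1  # a length-L sequence has L - n + 1 windows
    if total <= 0:
        raise ValueError("sequence is shorter than the word length")
    counts = Counter(seq[i:i + n] for i in range(total))
    return {word: count / total for word, count in counts.items()}


seq = "AGACTAATTGCGTAGCATAATCATGCATGTCGATGCGATT"
dist = word_distribution(seq, 4)
print(round(dist["GCAT"], 3))  # 2 of the 37 windows are GCAT -> 0.054
```

Words that never occur in the sequence are simply absent from the returned dictionary, which is equivalent to a probability of zero.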
8
Back to finding Motifs
  • First goal: start with
  • a probability distribution (that we get from a
    sequence)
  • a set of constraints on that sequence (i.e.
    sequence must code for a particular protein).
  • Produce the most random distribution possible
    given the constraints.
  • This will be our first approximation for a
    background genome.

9
Coding Sequence motifs
  • Constraints
  • Preserve amino acid order
  • Preserve codon usage in each gene
  • Get the Maximum Entropy Distribution (MED) from the
    sequence by randomly permuting the codons for each
    amino acid within each gene.
  • For example, imagine a short gene with the
    following AA and coding sequences

AA:     M   L1  L2  H1  L3  H2  L4  H3  ST
Codon:  ATG CTA CTG CAT TTA CAT CTG CTT TAG

Randomly permute L1,L2,L3,L4 and H1,H2,H3, then
extract the fraction of each word of length n from
the permuted sequence. The MED is the average over many runs.
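
The permutation procedure can be sketched as follows. This is one possible reading of the slide, not the speaker's code: the function names, the number of runs, and the fixed seed are assumptions, and the toy gene uses CAT, CAC, CAT as its three histidine codons (also an assumption) so that the histidine shuffle has a visible effect.

```python
import random
from collections import Counter, defaultdict


def shuffle_synonymous(aa_labels, codons, rng):
    """Permute codons among positions that encode the same amino acid.

    Amino acid order and per-gene codon usage are both preserved.
    """
    positions = defaultdict(list)
    for i, aa in enumerate(aa_labels):
        positions[aa].append(i)
    shuffled = list(codons)
    for idxs in positions.values():
        pool = [codons[i] for i in idxs]
        rng.shuffle(pool)
        for i, codon in zip(idxs, pool):
            shuffled[i] = codon
    return shuffled


def maximum_entropy_distribution(aa_labels, codons, n, runs=1000, seed=0):
    """Average the length-n word fractions over many shuffled sequences."""
    rng = random.Random(seed)
    totals = Counter()
    for _ in range(runs):
        seq = "".join(shuffle_synonymous(aa_labels, codons, rng))
        windows = len(seq) - n + 1
        for i in range(windows):
            totals[seq[i:i + n]] += 1.0 / (windows * runs)
    return dict(totals)


# Toy gene: M L L H L H L H stop (codons are illustrative assumptions).
aa_labels = list("MLLHLHLH*")
codons = "ATG CTA CTG CAT TTA CAC CTG CAT TAG".split()
med = maximum_entropy_distribution(aa_labels, codons, n=4, runs=200)
```

For a genome, the same shuffle would be applied gene by gene and the word counts pooled before averaging.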
10
Now that we have the two distributions, we need a
strategy to find motifs
  • We want motifs that are over- or
    under-represented in the real distribution as
    compared to the MED.
  • Choose the word (motif) that contributes the
    most bits to the entropy difference between
    the two distributions.
  • Reminder: for this talk we are only
    considering non-degenerate motifs
  • For each motif s_i with P(s_i), split the
    distribution into 2 parts, P(s_i) and 1-P(s_i).
  • Find the word s_i which has the maximum relative
    entropy.
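
A minimal sketch of this selection step, assuming the "2 parts" split means scoring each word by the relative entropy between the two-outcome distributions (P(s_i), 1-P(s_i)) from the real sequence and the corresponding pair from the MED. The function names and the toy numbers are illustrative, not taken from the talk.

```python
import math


def binary_relative_entropy(p, q):
    """Bits of relative entropy between (p, 1-p) and (q, 1-q)."""
    def term(a, b):
        # By convention 0 * log(0 / b) contributes nothing.
        return 0.0 if a == 0 else a * math.log2(a / b)
    return term(p, q) + term(1 - p, 1 - q)


def top_motif(real, background):
    """Return (word, bits, direction) for the highest-scoring word.

    Words whose background probability is 0 or 1 are skipped because
    the score is undefined there.
    """
    best = None
    for word, q in background.items():
        if not 0 < q < 1:
            continue
        p = real.get(word, 0.0)
        bits = binary_relative_entropy(p, q)
        if best is None or bits > best[1]:
            best = (word, bits, "over" if p > q else "under")
    return best


# Toy distributions over three words (made-up numbers).
real = {"AAAA": 0.5, "CTAG": 0.1, "GGGG": 0.4}
background = {"AAAA": 0.4, "CTAG": 0.3, "GGGG": 0.3}
word, bits, direction = top_motif(real, background)
print(word, direction)
```

In this toy case the depleted word wins: its frequency differs from the background by a larger factor than either of the enriched words.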

11
A problem arises: how do we find the next motif?
Example: CTAG is strongly selected against (as is
true in many bacteria because it is a restriction
site).
12
Solution (simplified version): rescale the MED to
remove the contribution of word s_i from D_KL
For all i ≠ i_max
This rescaling is proven to monotonically
decrease D_KL.
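
The rescaling formula itself does not survive in this transcript. One rescaling consistent with the description, offered here only as an assumption, sets the background probability of the chosen word equal to its real probability and multiplies every other background probability by (1 - P(s_imax)) / (1 - Q(s_imax)), where Q denotes the MED. Under that assumption D_KL drops by exactly the two-part relative entropy of the chosen word, so it can never increase.

```python
import math


def kl_bits(p, q):
    """D_KL(p || q) in bits for two distributions over the same words."""
    return sum(pi * math.log2(pi / q[w]) for w, pi in p.items() if pi > 0)


def rescale_background(real, background, word):
    """Match the background to the real distribution at `word`.

    Assumed form: Q'(word) = P(word); all other entries are scaled by
    (1 - P(word)) / (1 - Q(word)) so the result still sums to 1.
    """
    p_w, q_w = real[word], background[word]
    scale = (1 - p_w) / (1 - q_w)
    rescaled = {w: q * scale for w, q in background.items() if w != word}
    rescaled[word] = p_w
    return rescaled


# Toy distributions (made-up numbers); CTAG is the selected word.
real = {"AAAA": 0.5, "CTAG": 0.1, "GGGG": 0.4}
background = {"AAAA": 0.4, "CTAG": 0.3, "GGGG": 0.3}

before = kl_bits(real, background)
rescaled = rescale_background(real, background, "CTAG")
after = kl_bits(real, rescaled)
print(after <= before)
```

The next motif would then be chosen by scoring the real distribution against the rescaled background, and the procedure repeated.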
13
List of motifs for E. coli
  • or means over- or under-represented
  • Many motifs are known restriction sites

14
Gag expression in transiently transfected 293
cells
  • The table presents the results from 4 independent
    transfection experiments
  • Two DNA doses (0.5 and 1 µg) were used for
    transfection of 293 cells
  • Culture media were collected at 48 and 72 hours
    post transfection
  • Gag expression was quantified by p24 measurement
  • Results are expressed as the mean value (ng/ml,
    ±SD) of triplicates

15
Immunogenicity of Gag DNA vaccines in mouse
(B-cell or antibody)
Anti-p24 Ab response measured by ELISA
Robins Gag
16
Immunogenicity of Gag DNA vaccines in mouse
(T-cell or cellular)
p24-specific CD4 and CD8 cellular immune
responses measured by IFN-γ ELISpot
Robins Gag
17
Experimental program by Greg Spies with Julie
McElrath, Steve Self, and Larry Corey
  • Optimize VRC version of Gag incorporating full
    set of motifs
  • Create Adenovirus vector with optimized Gag and
    GFP tag
  • Test vaccine potency
  • In vitro
  • Mouse model
  • Move to primates