Welcome to Introduction to Bioinformatics Monday, 11 October - PowerPoint PPT Presentation

About This Presentation

Slides: 27
Provided by: bioinforma6
Learn more at: https://bulletin.vcu.edu

Transcript and Presenter's Notes


1
Welcome to Introduction to Bioinformatics
Monday, 11 October
  • Characteristics of PSSMs
  • How to make a PSSM
  • Uncertainty and information
  • How to score a sequence

Problem sets (Blast, Modeling)
2
Scenario 1: Prediction of regulatory site
3
N2 fixation in cyanobacteria
N2
CO2
O2
Matveyev and Elhai (unpublished)
4
Differentiation in cyanobacteria: What does NtcA bind to?
Herrero et al. (2001) J Bacteriol 183:411-425
5
Differentiation in cyanobacteria
Sequence upstream from hetQ
ttctatgagaatataaaattttccttaagtttct aaaaccgaccattct
gatgaataagtccggtttt tgctttttcgctttatttatctatatttcc
aagt ggggtgacaactatcttgccaatattgtcgttat gaaaaaatct
GTAacatgagaTACacaatagcatttatatttgcttTAgtaTctctctct
tgggtggg
GTA(8)TAC: NtcA binding site
(20-24)TAnnnT: Promoter
6
Differentiation in cyanobacteria: Integration of signals through HetR
HetQ
-N
NtcA
???
Genes needed for differentiation
Position in cell cycle
HetR
Level of PatS
Level of HetN
Master regulator
Stockholm
7
Scenario 1: The aftermath
  • Did you go for it? YES
  • Did it bind NtcA? YES
  • Did killing the site prevent heterocysts? NO
Stockholm
8
Scenario 1: The aftermath
  • Did you go for it? YES
  • Did it bind NtcA? YES
  • Did killing the site prevent heterocysts? NO
  • Fame and fortune? NO
  • Reasonable paper? YES
9
Scenario 1: The aftermath
If hetQ isn't the golden link, then what is?
-N
NtcA
???
Genes needed for differentiation
HetR
  • Gene preceded by NtcA-binding site
  • Blocking NtcA-binding affects gene expression
  • Gene product required for hetR expression

10
Scenario 1: The aftermath
If hetQ isn't the golden link, then what is?
-N
NtcA
???
Genes needed for differentiation
HetR
  • Gene preceded by NtcA-binding site

How to find?
  • Search for GTA(N8)TAC(N20-24)TAT?
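A pattern search of this kind can be sketched with a regular expression. This is only an illustration, not the method used in the course: the function name `find_ntca_sites` is hypothetical, and the promoter element is written as TAnnnT (the form shown with the hetQ upstream sequence), which is an assumption about what the search string should be.

```python
import re

# Consensus taken from the slides: GTA(N8)TAC is the NtcA-binding site,
# followed 20-24 nt later by the promoter element TAnnnT (assumed form).
# The lookahead wrapper lets overlapping candidate sites all be reported.
SITE = re.compile(
    r"(?=(GTA[ACGT]{8}TAC[ACGT]{20,24}TA[ACGT]{3}T))", re.IGNORECASE
)

def find_ntca_sites(sequence):
    """Return (start, matched_text) for each candidate site; start is 0-based."""
    return [(m.start(), m.group(1)) for m in SITE.finditer(sequence)]

# Tail of the hetQ upstream region from the earlier slide (spaces removed).
upstream = (
    "gaaaaaatct"
    "GTAacatgagaTACacaatagcatttatatttgcttTAgtaTctctctct"
    "tgggtggg"
)
hits = find_ntca_sites(upstream)
for start, text in hits:
    print(start, text)
```

An exact-match search like this is all-or-nothing: a site that differs from the consensus at a single position is missed entirely, which is the motivation for scoring matrices.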

11
Position-specific scoring matrices: A better way
12
Position-specific scoring matrices: A better way
13
Position-specific scoring matrices: A better way
Anabaena genome
14
Position-specific scoring matrices: A better way
15
Position-specific scoring matrices: A better way
16
Position-specific scoring matrices: A better way
17
Position-specific scoring matrices: A better way
Score .60 .20 1.0
18
Position-specific scoring matrices: Introduction of pseudocounts
A?
$q_{G,6} = 5$ real counts; $p_G = ?$ pseudocounts
19
Position-specific scoring matrices: Introduction of pseudocounts
$\text{Score}(\text{position}, \text{nucleotide}) = (q + p) / (N + B)$
$p$ = pseudocounts = $B \times$ (overall frequency of nucleotide): A = 0.32, T = 0.32, C = 0.18, G = 0.18
$B$ = total number of pseudocounts = $\sqrt{N}$? or 0.1?
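The pseudocount formula can be tried out with a few lines of Python. This is a minimal sketch under stated assumptions: the function name `pseudocount_score` and the example column are hypothetical, N = 11 is assumed for illustration, and B defaults to the square root of N, which is only one of the two choices the slide raises.

```python
import math

# Overall nucleotide frequencies quoted on the slide.
BACKGROUND = {"A": 0.32, "T": 0.32, "C": 0.18, "G": 0.18}

def pseudocount_score(q, n, nucleotide, b=None):
    """Score = (q + p) / (N + B), with p = B * background frequency.

    q: real count of the nucleotide at this position
    n: number of sequences in the training set (N)
    b: total number of pseudocounts (B); defaults to sqrt(N)
    """
    if b is None:
        b = math.sqrt(n)
    p = b * BACKGROUND[nucleotide]
    return (q + p) / (n + b)

# Hypothetical column: 5 G's among N = 11 sequences (illustrative values).
counts = {"A": 3, "C": 2, "G": 5, "T": 1}
scores = {nt: pseudocount_score(q, 11, nt) for nt, q in counts.items()}
print(scores)
```

Because the background frequencies sum to 1, the four scores in a column still sum to 1, and a nucleotide never seen in the training set gets a small nonzero score instead of 0.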
20
Position-specific scoring matrices: Introduction of pseudocounts
21
Position-specific scoring matrices: Normalization
How to account for similarity due to similar base composition?
Compare $\text{Score}_{\text{PSSM}} / \text{Score}_{\text{background frequency}}$: 0.79 / 0.32 ≈ 2.5 (given as 2.2 on the slide)
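The normalization step amounts to one division per matrix cell. A small sketch, assuming the background frequencies quoted on the pseudocount slide; the function name `normalized_score` is hypothetical.

```python
# Overall nucleotide frequencies quoted on the pseudocount slide.
BACKGROUND = {"A": 0.32, "T": 0.32, "C": 0.18, "G": 0.18}

def normalized_score(pssm_score, nucleotide, background=BACKGROUND):
    """Ratio of the PSSM score to the nucleotide's background frequency."""
    return pssm_score / background[nucleotide]

# The same raw score is more surprising for a rare nucleotide (G)
# than for a common one (A), so its ratio is larger.
ratio_a = normalized_score(0.79, "A")
ratio_g = normalized_score(0.79, "G")
print(round(ratio_a, 2), round(ratio_g, 2))
```

A ratio above 1 means the nucleotide is more frequent at that position than base composition alone would predict.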
22
Position-specific scoring matrices: Log odds form
Log odds = -log(score)
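One practical reason for the log form is that multiplying per-position scores becomes adding their logs. The sketch below illustrates this; base 10 is an assumption (the slide does not state the base), the function names are hypothetical, and the three scores are illustrative values.

```python
import math

def neg_log(score):
    """Log form from the slide: -log(score). Base 10 is an assumption."""
    return -math.log10(score)

def neg_log_total(scores):
    """Sum the per-position -log values; lower totals mean better matches."""
    return sum(neg_log(s) for s in scores)

position_scores = [0.60, 0.20, 1.0]   # illustrative per-position scores
total = neg_log_total(position_scores)
product = math.prod(position_scores)

# Both numbers printed here should agree up to rounding error.
print(total, -math.log10(product))
```

A score of 0 has no logarithm, which is one reason pseudocounts are introduced before taking logs.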
23
Position-specific scoring matrices: Expand training set through orthologs
Table 3. Training set including sequences from two Nostocs
71-devB   CATTACTCCTTCAATCCCTCGCCCCTCATTTGTACAGTCTGTTACCTTTACCTGAAACAGATGAATGTAGAATTTA
Np-devB   CCTTGACATTCATTCCCCCATCTCCCCATCTGTAGGCTCTGTTACGTTTTCGCGTCACAGATAAATGTAGAATTCA
71-glnA   AGGTTAATATTACCTGTAATCCAGACGTTCTGTAACAAAGACTACAAAACTGTCTAATGTTTAGAATCTACGATAT
Np-glnA   AGGTTAATATAACCTGATAATCCAGATATCTGTAACATAAGCTACAAAATCCGCTAATGTCTACTATTTAAGATAT
71-hetC   GTTATTGTTAGGTTGCTATCGGAAAAAATCTGTAACATGAGATACACAATAGCATTTATATTTGCTTTAGTATCTC
71-nirA   TATTAAACTTACGCATTAATACGAGAATTTTGTAGCTACTTATACTATTTTACCTGAGATCCCGACATAACCTTAG
Np-nirA   CATCCATTTTCAGCAATTTTACTAAAAAATCGTAACAATTTATACGATTTTAACAGAAATCTCGTCTTAAGTTATG
71-ntcB   ATTAATGAAATTTGTGTTAATTGCCAAAGCTGTAACAAAATCTACCAAATTGGGGAGCAAAATCAGCTAACTTAAT
Np-ntcB   TTATACAAATGTAAATCACAGGAAAATTACTGTAACTAACTATACTAAATTGCGGAGAATAAACCGTTAACTTAGT
71-urt    ATTAATTTTTATTTAAAGGAATTAGAATTTAGTATCAAAAATAACAATTCAATGGTTAAATATCAAACTAATATCA
Np-urt    TTATTCTTCTGTAACAAAAATCAGGCGTTTGGTATCCAAGATAACTTTTTACTAGTAAACTATCGCACTATCATCA
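Turning the training set into a matrix starts with counting nucleotides column by column. The following is a sketch rather than the course's own code: `column_counts` is a hypothetical helper, and the sequences are the eleven entries of Table 3 joined into single strings.

```python
from collections import Counter

# Training set from Table 3 (eleven aligned upstream sequences).
TRAINING_SET = {
    "71-devB": "CATTACTCCTTCAATCCCTCGCCCCTCATTTGTACAGTCTGTTACCTTTACCTGAAACAGATGAATGTAGAATTTA",
    "Np-devB": "CCTTGACATTCATTCCCCCATCTCCCCATCTGTAGGCTCTGTTACGTTTTCGCGTCACAGATAAATGTAGAATTCA",
    "71-glnA": "AGGTTAATATTACCTGTAATCCAGACGTTCTGTAACAAAGACTACAAAACTGTCTAATGTTTAGAATCTACGATAT",
    "Np-glnA": "AGGTTAATATAACCTGATAATCCAGATATCTGTAACATAAGCTACAAAATCCGCTAATGTCTACTATTTAAGATAT",
    "71-hetC": "GTTATTGTTAGGTTGCTATCGGAAAAAATCTGTAACATGAGATACACAATAGCATTTATATTTGCTTTAGTATCTC",
    "71-nirA": "TATTAAACTTACGCATTAATACGAGAATTTTGTAGCTACTTATACTATTTTACCTGAGATCCCGACATAACCTTAG",
    "Np-nirA": "CATCCATTTTCAGCAATTTTACTAAAAAATCGTAACAATTTATACGATTTTAACAGAAATCTCGTCTTAAGTTATG",
    "71-ntcB": "ATTAATGAAATTTGTGTTAATTGCCAAAGCTGTAACAAAATCTACCAAATTGGGGAGCAAAATCAGCTAACTTAAT",
    "Np-ntcB": "TTATACAAATGTAAATCACAGGAAAATTACTGTAACTAACTATACTAAATTGCGGAGAATAAACCGTTAACTTAGT",
    "71-urt": "ATTAATTTTTATTTAAAGGAATTAGAATTTAGTATCAAAAATAACAATTCAATGGTTAAATATCAAACTAATATCA",
    "Np-urt": "TTATTCTTCTGTAACAAAAATCAGGCGTTTGGTATCCAAGATAACTTTTTACTAGTAAACTATCGCACTATCATCA",
}

def column_counts(sequences):
    """Count A/C/G/T at each aligned position; returns one dict per column.

    zip() stops at the shortest sequence, so unequal lengths are truncated.
    """
    columns = []
    for column in zip(*sequences):
        tally = Counter(column)
        columns.append({nt: tally.get(nt, 0) for nt in "ACGT"})
    return columns

counts = column_counts(list(TRAINING_SET.values()))
print(counts[0])    # column 1
print(counts[30])   # column 31, just upstream of the conserved GTA
```

Dividing each count by the number of sequences gives the raw frequency matrix; pseudocounts and normalization are then applied to those counts.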
24
Position-specific scoring matrices: Decrease complexity through info analysis
Uncertainty: $H_c = -\sum_i p_{ic} \log_2(p_{ic})$
25
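Position-specific scoring matrices: Decrease complexity through info analysis
Uncertainty: $H_c = -\sum_i p_{ic} \log_2(p_{ic})$
$H_1 = -\left[\tfrac{4}{11}\log_2\tfrac{4}{11} + \tfrac{3}{11}\log_2\tfrac{3}{11} + \tfrac{1}{11}\log_2\tfrac{1}{11} + \tfrac{3}{11}\log_2\tfrac{3}{11}\right] = 1.87$
$H_{31} = -\left[\tfrac{1}{11}\log_2\tfrac{1}{11} + \tfrac{1}{11}\log_2\tfrac{1}{11} + \tfrac{1}{11}\log_2\tfrac{1}{11} + \tfrac{8}{11}\log_2\tfrac{8}{11}\right] = 1.28$
Information content $= \sum_c (H_{max} - H_c)$ (summed over all columns)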
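The two worked values can be checked with a short script. This is a minimal sketch: the function names are hypothetical, and Hmax is taken as 2 bits (log2 of 4), the maximum for four equally likely nucleotides, which the slide does not state explicitly.

```python
import math

H_MAX = 2.0  # log2(4): maximum uncertainty with four nucleotides (assumed)

def uncertainty(counts):
    """Shannon uncertainty of one column in bits: H = -sum(p * log2(p))."""
    total = sum(counts)
    h = 0.0
    for c in counts:
        if c > 0:                 # a zero count contributes nothing
            p = c / total
            h -= p * math.log2(p)
    return h

def information_content(columns):
    """Sum of (Hmax - Hc) over all columns."""
    return sum(H_MAX - uncertainty(col) for col in columns)

col_1 = [4, 3, 1, 3]     # counts behind H1 on the slide
col_31 = [1, 1, 1, 8]    # counts behind H31 on the slide
print(round(uncertainty(col_1), 2))    # 1.87
print(round(uncertainty(col_31), 2))   # 1.28
```

Column 31 is closer to being conserved, so its uncertainty is lower and it contributes more information (2 - 1.28 bits) than column 1 (2 - 1.87 bits).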