Title: Review of important points from the NCBI lectures.
1 - Review of important points from the NCBI lectures.
- Example slides 
- Review the two types of microarray platforms. 
- Spotted arrays 
- Affymetrix 
- Specific examples that use microarray technology. 
- Gene expression - role of a transcription factor 
2 Web Access
- Text: Entrez
- Sequence: BLAST
- Structure: VAST
3 Translated BLAST
N = nucleotide
P = protein
Particularly useful for nucleotide sequences
without protein annotations, such as ESTs or
genomic DNA
Program   Query   Database
blastx    N       P
tblastn   P       N
tblastx   N       N
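The query/database pairings for the translated searches can be written as a small lookup table. This is only an illustrative sketch in Python; the dictionary `TRANSLATED_BLAST` and the function `pick_translated_blast` are hypothetical helpers, not part of any NCBI tool.

```python
# Sequence types: "N" = nucleotide, "P" = protein.
# Each entry: program -> (query type, database type, what gets translated).
TRANSLATED_BLAST = {
    "blastx": ("N", "P", "query"),
    "tblastn": ("P", "N", "database"),
    "tblastx": ("N", "N", "query and database"),
}


def pick_translated_blast(query_type, database_type):
    """Return the translated BLAST programs matching a query/database pairing."""
    return [
        program
        for program, (query, database, _) in TRANSLATED_BLAST.items()
        if (query, database) == (query_type, database_type)
    ]


# An EST or genomic DNA query searched against a protein database.
print(pick_translated_blast("N", "P"))
```

A protein query against a protein database returns an empty list here, because that pairing needs no translation and is handled by blastp rather than by a translated search.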
4 Position Specific Score Matrix (PSSM)
Pos AA   A   R   N   D   C   Q   E   G   H   I   L   K   M   F   P   S   T   W   Y   V
206 D    0  -2   0   2  -4   2   4  -4  -3  -5  -4   0  -2  -6   1   0  -1  -6  -4  -1
207 G   -2  -1   0  -2  -4  -3  -3   6  -4  -5  -5   0  -2  -3  -2  -2  -1   0  -6  -5
208 V   -1   1  -3  -3  -5  -1  -2   6  -1  -4  -5   1  -5  -6  -4   0  -2  -6  -4  -2
209 I   -3   3  -3  -4  -6   0  -1  -4  -1   2  -4   6  -2  -5  -5  -3   0  -1  -4   0
210 S   -2  -5   0   8  -5  -3  -2  -1  -4  -7  -6  -4  -6  -7  -5   1  -3  -7  -5  -6
211 S    4  -4  -4  -4  -4  -1  -4  -2  -3  -3  -5  -4  -4  -5  -1   4   3  -6  -5  -3
212 C   -4  -7  -6  -7  12  -7  -7  -5  -6  -5  -5  -7  -5   0  -7  -4  -4  -5   0  -4
213 N   -2   0   2  -1  -6   7   0  -2   0  -6  -4   2   0  -2  -5  -1  -3  -3  -4  -3
214 G   -2  -3  -3  -4  -4  -4  -5   7  -4  -7  -7  -5  -4  -4  -6  -3  -5  -6  -6  -6
215 D   -5  -5  -2   9  -7  -4  -1  -5  -5  -7  -7  -4  -7  -7  -5  -4  -4  -8  -7  -7
216 S   -2  -4  -2  -4  -4  -3  -3  -3  -4  -6  -6  -3  -5  -6  -4   7  -2  -6  -5  -5
217 G   -3  -6  -4  -5  -6  -5  -6   8  -6  -8  -7  -5  -6  -7  -6  -4  -5  -6  -7  -7
218 G   -3  -6  -4  -5  -6  -5  -6   8  -6  -7  -7  -5  -6  -7  -6  -2  -4  -6  -7  -7
219 P   -2  -6  -6  -5  -6  -5  -5  -6  -6  -6  -7  -4  -6  -7   9  -4  -4  -7  -7  -6
220 L   -4  -6  -7  -7  -5  -5  -6  -7   0  -1   6  -6   1   0  -6  -6  -5  -5  -4   0
221 N   -1  -6   0  -6  -4  -4  -6  -6  -1   3   0  -5   4  -3  -6  -2  -1  -6  -1   6
222 C    0  -4  -5  -5  10  -2  -5  -5   1  -1  -1  -5   0  -1  -4  -1   0  -5   0   0
223 Q    0   1   4   2  -5   2   0   0   0  -4  -2   1   0   0   0  -1  -1  -3  -3  -4
224 A   -1  -1   1   3  -4  -1   1   4  -3  -4  -3  -1  -2  -2  -3   0  -2  -2  -2  -3
Serine is scored differently in these two 
positions
Active site nucleophile 
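To make the position-specific idea concrete, the sketch below looks up the score of one residue at one position. It copies rows 211 and 216 of the matrix above purely for illustration; the slide does not say which positions its note about serine refers to, so the choice of these two rows is an assumption. The names `PSSM_ROWS` and `pssm_score` are hypothetical.

```python
# Column order used by the matrix on this slide.
AMINO_ACIDS = "ARNDCQEGHILKMFPSTWYV"

# Two rows copied from the slide (positions 211 and 216, both serine in the query).
PSSM_ROWS = {
    211: [4, -4, -4, -4, -4, -1, -4, -2, -3, -3, -5, -4, -4, -5, -1, 4, 3, -6, -5, -3],
    216: [-2, -4, -2, -4, -4, -3, -3, -3, -4, -6, -6, -3, -5, -6, -4, 7, -2, -6, -5, -5],
}


def pssm_score(position, residue):
    """Score for placing `residue` at `position`, read from the PSSM row."""
    return PSSM_ROWS[position][AMINO_ACIDS.index(residue)]


# The same residue gets a different score depending on the position.
for position in (211, 216):
    print(position, "S:", pssm_score(position, "S"), "T:", pssm_score(position, "T"))
```

Unlike a fixed substitution matrix such as BLOSUM62, where a serine-to-threonine substitution has one score everywhere, the PSSM tolerates threonine at position 211 but penalizes it at position 216.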
5 PSI-BLAST
- Create your own PSSM
- Confirming relationships of purine nucleotide metabolism proteins
[Diagram labels: query, BLOSUM62, alignment, PSSM, alignment]
6 Affymetrix vs. glass-slide-based arrays
- Affymetrix 
- Short oligonucleotides 
- Many oligos per gene 
- Single sample hybridized to chip
- Glass slide 
- Long oligonucleotides or PCR products 
- A single oligo or PCR product per gene 
- Two samples hybridized to chip
7 Bacterial DNA microarrays
- Small genome size 
- Fully sequenced genomes, well annotated 
- Ease of producing biological replicates 
- Genetics 
8 Applications of DNA microarrays
- Monitor gene expression 
- Study regulatory networks 
- Drug discovery - mechanism of action 
- Diagnostics - tumor diagnosis 
- etc. 
- Genomic DNA hybridizations 
- Explore microbial diversity 
- Whole genome comparisons 
- Diagnostics - tumor diagnosis 
- ?
9 Characterization of the stationary phase sigma
factor regulon (σH) in Bacillus subtilis
- Robert A. Britton and Alan D. Grossman - 
 Massachusetts Institute of Technology.
- Patrick Eichenberger, Eduardo Gonzalez-Pastor, 
 and Richard Losick - Harvard University.
10 What is a sigma factor?
- Directs RNA polymerase to promoter sequences 
- Bacteria use many sigma factors to turn on 
 regulatory networks at different times.
- Sporulation 
- Stress responses 
- Virulence
Wosten, 1998 
11 Alternative sigma factors in B. subtilis
sporulation
Kroos and Yu, 2000 
12 The stationary phase sigma factor σH
- Most active at the transition from exponential
 growth to stationary phase
- Mutants are blocked at stage 0 of sporulation
- known targets involved in 
- phosphorelay (kinA, spo0F) 
- sporulation (sigF, spoVG) 
- cell division (ftsAZ) 
- cell wall (dacC) 
- general metabolism (citG) 
- phosphatase inhibitors (phr peptides)
13 Experimental approach
- Compare expression profiles of wt and ΔsigH
 mutant at times when sigH is active.
- Artificially induce the expression of sigH during 
 exponential growth.
- When Sigma-H is normally not active. 
- Might miss genes that depend on additional factors
 other than Sigma-H.
- Identify potential promoters using computer 
 searches.
14 ΔsigH
wild-type 
15 Wild type (Cy5) vs. sigH mutant (Cy3)
Hour -1
Hour 0
Hour 1 
17 Identifying differentially expressed genes
- Many different methods
- Arbitrary assignment of a fold-change cutoff is
 not a valid approach
- Statistical representation of the data 
- Iterative outlier analysis 
- SAM (significance analysis of microarrays) 
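As a rough illustration of the iterative outlier idea, the sketch below repeatedly flags genes whose log ratio lies more than a chosen number of standard deviations from the mean, removes them, and recomputes the statistics until nothing new is flagged. This is one common reading of the method, not necessarily the exact procedure used in the study; the function name, the 2.5 SD cutoff, and the data are all assumptions made for the example.

```python
import statistics


def iterative_outliers(log_ratios, n_sd=2.5):
    """Return genes flagged as outliers over repeated mean/SD passes.

    `log_ratios` maps gene name -> log2 ratio. `n_sd` is an assumed cutoff.
    """
    remaining = dict(log_ratios)
    outliers = {}
    while len(remaining) > 2:
        mean = statistics.mean(remaining.values())
        sd = statistics.stdev(remaining.values())
        flagged = {
            gene: value
            for gene, value in remaining.items()
            if abs(value - mean) > n_sd * sd
        }
        if not flagged:
            break  # no new outliers, so the remaining genes are the background
        outliers.update(flagged)
        for gene in flagged:
            del remaining[gene]
    return outliers


# Hypothetical data: 19 unchanged genes plus one strong and one weaker change.
background = {f"gene{i}": (-0.1, 0.0, 0.1)[i % 3] for i in range(19)}
data = {**background, "strong": 5.0, "masked": 1.0}
print(sorted(iterative_outliers(data)))
```

In this toy data set the weaker change is hidden in the first pass because the strong outlier inflates the standard deviation; it is only flagged once the strong outlier has been removed, which is the reason for iterating.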
18 Data from a microarray are expressed as ratios
- Cy3/Cy5 or Cy5/Cy3 
- Measuring differences in two samples, not 
 absolute expression levels
- Ratios are often log2 transformed before analysis 
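The log2 transformation makes increases and decreases symmetric around zero: a twofold increase becomes +1 and a twofold decrease becomes -1. A minimal sketch, with made-up intensity values and a hypothetical helper name `log2_ratio`:

```python
import math


def log2_ratio(cy5, cy3):
    """Return log2(Cy5/Cy3) for one spot; both intensities must be positive."""
    if cy5 <= 0 or cy3 <= 0:
        raise ValueError("intensities must be positive")
    return math.log2(cy5 / cy3)


print(log2_ratio(2000, 1000))  # twofold higher in the Cy5 sample
print(log2_ratio(1000, 2000))  # twofold lower in the Cy5 sample
print(log2_ratio(1500, 1500))  # no change
```

On the raw ratio scale the same three spots would read 2, 0.5 and 1, which compresses all decreases into the interval between 0 and 1.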
19 Genes whose transcription is influenced by σH
- 433 genes were altered when comparing wt vs.
 ΔsigH.
- 160 genes were altered when sigH was overexpressed.
- Which genes are directly regulated by Sigma-H? 
20 Identifying sigH promoters
- Two bioinformatics approaches 
- Hidden Markov Model database (P. Fawcett) 
- HMMER 2.2 (hmm.wustl.edu) 
- Pattern searches (SubtiList) 
- Identify 100s of potential promoters 
21 Correlate potential sigH promoters with genes
identified with microarray data.
- Genes positively regulated by Sigma-H in a
 microarray experiment that have a putative
 promoter within 500 bp of the gene.
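The correlation step amounts to intersecting two lists: genes that respond positively to Sigma-H on the array, and genes with a predicted promoter nearby. The sketch below uses placeholder gene names and distances that are entirely hypothetical; how the distance is measured (for example, upstream of the start codon) is also an assumption, since the slide gives only the 500 bp cutoff.

```python
MAX_DISTANCE_BP = 500  # cutoff stated on the slide

# Hypothetical inputs for illustration only.
positively_regulated = {"geneA", "geneB", "geneC"}  # from the microarray analysis
promoter_distance_bp = {"geneA": 120, "geneC": 950, "geneD": 60}  # from promoter searches


def candidate_direct_targets(regulated, promoter_distances, max_distance=MAX_DISTANCE_BP):
    """Genes that are both regulated and have a predicted promoter close enough."""
    return sorted(
        gene
        for gene in regulated
        if gene in promoter_distances and promoter_distances[gene] <= max_distance
    )


print(candidate_direct_targets(positively_regulated, promoter_distance_bp))
```

Here geneB is dropped because no promoter was predicted, geneC because its predicted promoter is too far away, and geneD because it did not respond on the array.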
22 Directly controlled sigH genes
- 26 new sigH promoters controlling 54 genes 
- Genes involved in key processes associated with 
 the transition to stationary phase
- generation of new food sources (e.g., proteases)
- transport of nutrients 
- cell wall metabolism 
- cytochrome biogenesis
- Correctly identified nearly all known sigH 
 promoters
- Complete sigH regulon 
- 49 promoters controlling 87 genes. 
23 - Identification of DNA regions bound by proteins.
Iyer et al. 2001 Nature 409:533-538
24 Pathogen 1
Pathogen 2