Title: Ran Libeskind-Hadas, Department of Computer Science
1 Bioinformatics Education at Harvey Mudd College
Ran Libeskind-Hadas, Department of Computer Science
Thanks to Eliot Bush (Biology) and Zach Dodds (Computer Science)
2 Our name is Mudd
- Undergraduate only, 700 students
- Sciences, mathematics, and engineering
5 The HMC Curriculum
[Chart: the curriculum divided into Core, Major, Humanities, and Electives]
Core: includes one semester of CS and one of Biology
6 Experiments in the Core
[Diagram: three paths through the core across Semester 1 and Semester 2]
- The regular path: Introduction to CS and Introduction to Biology (200 students per year)
- An integrated full-year course: Integrated Introduction to CS and Biology (20 students in 2009-2010)
- A one-semester integrated course: Computation and Biology, plus Introduction to Biology or a second Biology course (40 students in 2010-2011); satisfies the CS core requirement but not the Biology requirement
7 Computation and Biology Core Course
Objectives
- Cover the content of the regular CS intro course
- Demonstrate the relationship between computing and biology
- Use computation to teach biology fundamentals and use biology to motivate computing fundamentals
- Provide students with computational tools to perform their own dry lab experiments
11 Course Structure
[Diagram: weekly cycle with the labels Tuesday, Thursday, Friday, Weekend, Biologist, CSist (twice), Lab!, and Assignment]
12
Weeks | Biology | CS | Subset of student HW
1-3 | DNA, RNA, central dogma, genes; case study of lactose intolerance | Introduction to Python: data, functions, and basic constructs | Gene finding, gene expression, lactase expression
4-5 | Population genetics, molecular evolution | Designing a larger program, randomness, simulation | Mitochondrial Eve, diploid populations with selection, molecular evolution simulations
6-7 | Sequence alignment | Recursion | Implement alignment and extend to deal with substitutions
8-9 | Phylogenetics | Recursion on trees and phylogenetic tree algorithms | Implementing a phylogenetic tree algorithm and making inferences from the results
13
Weeks | Biology | CS | Subset of student HW
10-11 | Folding RNA to Proteins | RNA folding algorithm, efficiency, and memoization | Implement RNA folding and visualize results
11-12 | Systems biology and modeling; chemotaxis | Computation and modeling | Chemotaxis simulations and evaluation of models
Weeks 13-14: Capstone Projects; Topics: Limitations of computation
14 Using computation to teach biology fundamentals
- Population genetic model
- Explore effects of drift and selection
- Hardy-Weinberg equilibrium
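The kind of dry lab experiment listed above can be sketched in a few lines of Python. This is a minimal sketch, assuming a haploid Wright-Fisher model with one two-allele locus; the function names `hardy_weinberg` and `simulate_allele_frequency`, the `fitness_a` parameter, and the model choice are illustrative assumptions, not taken from the course materials.

```python
import random

def hardy_weinberg(p):
    """Expected genotype frequencies (AA, Aa, aa) for allele frequency p."""
    q = 1.0 - p
    return p * p, 2 * p * q, q * q

def simulate_allele_frequency(pop_size, p0, generations, fitness_a=1.0, seed=None):
    """Wright-Fisher sketch for a haploid population with alleles 'A' and 'a'.

    fitness_a is the relative fitness of 'A' (1.0 means drift only).
    Returns the frequency of 'A' in each generation, starting value included.
    """
    if fitness_a <= 0:
        raise ValueError("fitness_a must be positive")
    rng = random.Random(seed)
    p = p0
    history = [p]
    for _ in range(generations):
        # Selection: reweight the chance of drawing 'A' by its relative fitness.
        weighted = p * fitness_a / (p * fitness_a + (1.0 - p))
        # Drift: each offspring draws its allele at random from the parents.
        count_a = sum(1 for _ in range(pop_size) if rng.random() < weighted)
        p = count_a / pop_size
        history.append(p)
    return history
```

Running `simulate_allele_frequency(100, 0.5, 200)` several times with `fitness_a=1.0` and then with, say, `fitness_a=1.05` gives a feel for how drift and selection each move the allele frequency.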
15 Using biology to motivate computation: RNA Folding
Recursion and memoization
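To illustrate how RNA folding motivates recursion and memoization, here is a minimal sketch that counts the maximum number of non-crossing base pairs in a sequence. It assumes a simple Nussinov-style pair-maximization model; the slides do not say which folding model the course uses, and the name `max_base_pairs` and the `min_loop` parameter are hypothetical.

```python
from functools import lru_cache

# Watson-Crick pairs plus the G-U wobble pair.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}

def max_base_pairs(rna, min_loop=0):
    """Maximum number of non-crossing base pairs in rna.

    min_loop is the minimum number of unpaired bases required between
    two paired positions (0 keeps the sketch simple).
    """
    @lru_cache(maxsize=None)
    def fold(i, j):
        # Best pair count for the substring rna[i..j], inclusive.
        if j - i <= min_loop:
            return 0
        # Case 1: base i stays unpaired.
        best = fold(i + 1, j)
        # Case 2: base i pairs with some later base k.
        for k in range(i + 1 + min_loop, j + 1):
            if (rna[i], rna[k]) in PAIRS:
                best = max(best, 1 + fold(i + 1, k - 1) + fold(k + 1, j))
        return best

    return fold(0, len(rna) - 1) if rna else 0
```

Without the `lru_cache` decorator the same subproblems `fold(i, j)` are recomputed many times; caching each result once is the memoization step.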
16 Above and Beyond
18 Final project example: What makes cholera pathogenic?
- Pathogenic vs. non-pathogenic strains
19 Final project example: What makes cholera pathogenic?
- Compare all genes in one strain with all genes in the other to find orthologs (use fast global alignment)
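The all-against-all comparison can be sketched as follows. This is a minimal sketch under stated assumptions: the scoring values (`match`, `mismatch`, `gap`) are arbitrary, and calling two genes candidate orthologs when each is the other's best-scoring partner (reciprocal best hits) is a swapped-in criterion, not necessarily the one the students used. Both function names are hypothetical.

```python
def global_alignment_score(a, b, match=1, mismatch=-1, gap=-2):
    """Needleman-Wunsch global alignment score, keeping only two rows in memory."""
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            curr.append(max(diag, prev[j] + gap, curr[j - 1] + gap))
        prev = curr
    return prev[-1]

def reciprocal_best_hits(genes_a, genes_b):
    """Pair genes whose best-scoring partner in the other strain is each other.

    genes_a and genes_b map gene names to sequences. Returns a list of
    (name_a, name_b) pairs, treated here as candidate orthologs.
    """
    if not genes_a or not genes_b:
        return []
    scores = {(x, y): global_alignment_score(sa, sb)
              for x, sa in genes_a.items() for y, sb in genes_b.items()}
    best_for_a = {x: max(genes_b, key=lambda y: scores[(x, y)]) for x in genes_a}
    best_for_b = {y: max(genes_a, key=lambda x: scores[(x, y)]) for y in genes_b}
    return [(x, y) for x, y in best_for_a.items() if best_for_b[y] == x]
```

Genes that end up in no pair are candidates for being unique to one strain, although a real analysis would probably also need a minimum-score threshold.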
20 Final project example: What makes cholera pathogenic?
- Programmatically BLAST unique proteins to see what they are
- Read about these unique genes and explain what they do
21 Microarray data
Courtesy of Prof. Russell Schwartz
- Some genes encode transcription factors that promote or inhibit the expression of other genes
- Purple is highly expressed, green is not expressed
[Figure: microarray heat map; axes: genes, conditions]
22 Intuition Behind Network Inference
Courtesy of Prof. Russell Schwartz
Expression matrix (rows: genes, columns: conditions)
gene 1: 0 1 0 1 1
gene 2: 0 1 0 1 1
gene 3: 1 0 1 0 0
gene 4: 0 1 1 0 1
[Figure: several candidate regulatory networks over genes 1-4]
- Correlated expression implies common regulation
- That intuition still leaves a lot of ambiguity
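The correlation intuition can be made concrete with a short sketch. It uses the four expression rows as they appear in the extracted slide text; choosing Pearson correlation as the measure is an assumption, and the names `pearson` and `correlation_table` are hypothetical.

```python
from math import sqrt

# Binary expression values from the slide (rows: genes, columns: conditions).
EXPRESSION = {
    "gene 1": [0, 1, 0, 1, 1],
    "gene 2": [0, 1, 0, 1, 1],
    "gene 3": [1, 0, 1, 0, 0],
    "gene 4": [0, 1, 1, 0, 1],
}

def pearson(xs, ys):
    """Pearson correlation of two equal-length lists with nonzero variance."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

def correlation_table(expression):
    """Correlation for every unordered pair of genes."""
    names = list(expression)
    return {(a, b): pearson(expression[a], expression[b])
            for i, a in enumerate(names) for b in names[i + 1:]}
```

Genes 1 and 2 have identical rows and gene 3 is their exact complement, so correlation links all three equally strongly. It cannot say which gene regulates which, which is the ambiguity the slide points to.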
23 Assuming a Binary Input Matrix
Courtesy of Prof. Russell Schwartz
- We will assume that genes only have two possible states: 0 (off) or 1 (on)
- We will also assume that we want to find directionality but not strength of regulatory interactions
- We will exclude the possibility of regulatory cycles
[Table: binary expression matrix for genes 1-4 across conditions]
[Figure: two candidate networks over genes 1-4, one labeled OK and one labeled NOT OK]
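The no-cycles assumption can be checked mechanically. The sketch below tests whether a candidate network, given as directed (regulator, target) edges, is acyclic using Kahn's algorithm. The function name `is_acyclic` and the edge lists in the usage note are hypothetical examples, not the networks drawn on the slide.

```python
def is_acyclic(edges, nodes):
    """Return True if the directed graph has no cycle (Kahn's algorithm).

    edges is a list of (regulator, target) pairs; nodes is an iterable
    containing every gene label that appears in edges.
    """
    nodes = list(nodes)
    indegree = {n: 0 for n in nodes}
    targets = {n: [] for n in nodes}
    for regulator, target in edges:
        targets[regulator].append(target)
        indegree[target] += 1
    # Start from genes that nothing regulates.
    ready = [n for n in nodes if indegree[n] == 0]
    visited = 0
    while ready:
        node = ready.pop()
        visited += 1
        for t in targets[node]:
            indegree[t] -= 1
            if indegree[t] == 0:
                ready.append(t)
    # Any gene never reached sits on a cycle or downstream of one.
    return visited == len(nodes)
```

For example, `is_acyclic([(1, 4), (4, 3), (4, 2)], [1, 2, 3, 4])` is `True`, while adding a path back into gene 4, as in `[(1, 4), (4, 3), (3, 2), (2, 4)]`, makes it `False`.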
24 The Project
- Take binary microarray data as input
- Find the acyclic regulatory network with the highest likelihood
- Display the network somehow
25 Student Response
Likert scale (1 low, 7 high) survey
- "This course stimulated my interest in the subject matter": College mean 5.53/7.0 (std. dev. 0.80); Computation and Biology 6.51/7.0
- "I learned a great deal in this course": College mean 5.76/7.0 (std. dev. 0.72); Computation and Biology 6.49/7.0
- Time spent outside of class (per week): College mean 4.98 hours (std. dev. 2.42); Computation and Biology 6.28 hours
26 What did students choose to do the following term?
Students have one elective in the spring term
- Took introductory biology: 0/40
- Took an elective other than CS or biology: 0/40
- Took an upper division biology course: 18/40
- Took the second CS course: 22/40
Outperformed their peers
27
- Students learned the foundational content of Intro CS and Intro Biology
- Students' programs provide rich dry lab experiments and simulations that reinforce understanding of biology
- Students develop general problem-solving and programming skills (e.g. dynamic programming) and have confidence to solve new problems on their own
30 Next steps
- Increasing student demand for more courses and even a major in computational biology
- Mathematical Biology Major redesigned in Spring 2011 to Mathematical and Computational Biology (MCB) major
- Good news: 9 MCB majors in sophomore year (6 Biology majors and 2 Biochemistry majors)
- Bad news: Few faculty in a position to contribute
31 Beyond the core (intro CS, intro Biology, 3 semesters math, 2 chemistry, 1 physics, ...)
- Introductory Sequence
  - Discrete Math
  - Biology laboratory
  - Introduction to Mathematical and Computational Biology
- Biology Foundations
  - Three of: comparative physiology, ecology and environmental biology, evolutionary biology, molecular biology
  - One biology seminar
  - One biology laboratory
- Mathematical and Computational Courses
  - Intermediate Mathematical Biology
  - Computational Biology
  - One upper-division math course
  - One upper-division CS course
  - Three more math and CS courses
32 Future Plans
- Refine and improve introductory course
- Write a book for the introductory course
- Collaborate with sister institutions to expand computational biology curriculum
- New faculty
- New courses
33 Questions, Comments, Heckles