Title: Dr. Rosemary Renaut, renaut@asu.edu Director, Computational Biosciences www.asu.edu/compbiosci
1- Dr. Rosemary Renaut, renaut_at_asu.edu Director,
Computational Biosciences www.asu.edu/compbiosci
2- OUTLINE
- THE CBS PROGRAM AT ASU: OVERVIEW
- CBS CURRICULUM
- REQUIREMENTS
- SOME HISTORY
- FUTURE
- PROJECTS: WHAT DO THEY INVOLVE?
- OUR CASE STUDIES COURSE(S)
- INTRODUCING BASIC MATHEMATICS TO LS/CSE STUDENTS
3- A Professional Science Masters Program
- Mathematics and Statistics
- The School of Life Sciences
- Computer Science and Engineering
- W. P. Carey School of Business
5- Scientific Computing for Biosciences (4)
- Case Studies/Projects in Biosciences (4)
- Structural and Molecular Biology (4)
- Statistics and Experimental Design (6)
- Business Practice and Ethics (6)
- Internship and Applied Project (6)
6- Genomics/Proteomics
- Data Mining, Databases
- Medical Imaging
- Molecular/Functional Genomics
- Microarray Analysis
- Individualized
7- Calculus and Differential Equations
- Basic Statistics (junior)
- Discrete Algorithms and Data Structures
- Programming skills (C/Java)
- Cell biology, genetics (junior level)
- Organic Chemistry and Biochemistry (junior)
- Motivation, creativity, determination!
8- Interdisciplinary Training/Team Work
- Internship/Applied Project Report
- Business, Management and Ethics
- (Health Services Administration MBA)
- Small Groups/Close Faculty Involvement
- Computer Laboratory
- Extensive Project work/Consulting
9- DATA
- Year 4: total 70 students, currently 27
- Graduates: 32 (11 left without graduating)
- Internships: NIH, ASU, TGen, AZ Game and Fish, US Water Conservation Lab, AZ Biodesign
- Jobs: TGen, ASU, Codon Solutions, medical record keeping, Matlab, St. Jude's (Memphis), Walt Disney!, Cisco
- PhD programs (10): Biology, Computer Science, Biochemistry, Mathematics
10- Undergraduate NIH MARC
- Quantitative Skills (sophomore), spring 05
- Modeling Comp Bio (junior), spring 05
- Doctoral Program Computational Biosciences
- Molecular Cellular Biology / Mathematics
11- Database Construction/Mining of Pathology Specimens (TGen)
- Gegenbauer high resolution reconstruction for MRI, ASU
- TLS-SVM for Feature Extraction of Microarray Data, ASU
- Automated video analysis for cell behavior, TGen
- EST DB for Marine Dinoflagellate Crypthecodinium cohnii, ASU
- Data mining for microsatellites in ESTs from Arabidopsis thaliana and Brassica species (US Water Conservation Laboratory)
- The Genome Assembler, TGen
- A user interface to support navigation for scientific discovery, ASU
- Cell Migration Software Tool, TGen
12- EVALUATION OF BIOINFORMATICS RESOURCES (TGen/NIH/ASU)
- Pattern recognition: Automated Cytoskeleton Reconstruction (ASU)
- Develop workable database on crop Lesquerella using Integrated Crop Information Systems (ICIS) (US Water Conservation Laboratory)
- Investigation of Xylella fastidiosa Within an Almond Tree Population: A Model System for Golden Death (ASU Mathecology, AZ)
- Search for Epigenetic Properties of DNA and RNA involved in X Chromosome Inactivation (Codon Solutions LLC)
13- WHAT DID WE NEED FOR THESE PROJECTS?
- Image Analysis
- Data Mining
- Fourier Analysis
- Modeling
- Differential Equations
- Sequence Comparisons
- Mathematics for Genetic Analysis
- Statistics
- Database development for BIOLOGICAL APPLICATIONS
- Geographic Information Systems
- PERL/BIOPERL/MATLAB/MYSQL
14- Bioinformatics: Managing Scientific Data tackles
this challenge head-on by discussing the current
approaches and variety of systems available to
help bioinformaticians with this increasingly
complex issue. The heart of the book lies in the
collaboration efforts of eight distinct
bioinformatics teams that describe their own
unique approaches to data integration and
interoperability. Each system receives its own
chapter where the lead contributors provide
precious insight into the specific problems being
addressed by the system, why the particular
architecture was chosen, and details on the
system's strengths and weaknesses. In closing,
the editors provide important criteria for
evaluating these systems that bioinformatics
professionals will find valuable.
15- Computational Modeling Skills Motivated by Case Studies
- Phylogenetics and Tree Building (for the data, make the tree)
19- An ultrametric tree: what are the distances $e_i$?
- Solve the linear programming problem $\min L(e) = \min \sum_i e_i$, where $L(e)$ is the total length of the tree.
- Moreover, each length is positive, and the total lengths are preserved, e.g. $e_1 = e_2$ and $e_4 + e_8 = e_1 + e_6 + e_7$.
- LP problem with constraints: $\max c^T x$ with $Ax \le b$, $x \ge 0$.
- Students identify $x$, $c$, $b$, $A$? Use matlab linprog.
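One way to connect the minimization over branch lengths with the textbook max form is sketched below. It assumes eight branches, since 8 is the highest index that appears on the slide, and it leaves out the rows of A and b that come from the measured distances, which the slide does not show.

```latex
% Minimizing total length equals maximizing its negative, so the
% unknowns x are the branch lengths and c is a vector of -1's.
% (Eight branches is an assumption taken from the highest index shown.)
\[
  \min_{e} \sum_{i=1}^{8} e_i \;=\; -\max_{x}\, c^{T} x,
  \qquad x = (e_1, \dots, e_8)^{T}, \quad c = (-1, \dots, -1)^{T}.
\]
% An equality such as e_1 = e_2 becomes two rows of A x <= b:
\[
  e_1 - e_2 \le 0, \qquad -e_1 + e_2 \le 0 .
\]
```

Matlab's linprog minimizes f'x and accepts equality constraints directly through Aeq and beq, so the sign flip and the row splitting are only needed when working with the max form by hand.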
20- BUT THERE ARE MANY DIFFERENT TREE SHAPES, AND
WHICH IS CORRECT? WE NEED EXHAUSTIVE
SEARCH? GENETIC ALGORITHMS?
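To see why exhaustive search becomes impractical so quickly, it helps to count the candidate shapes. The sketch below uses the standard count of rooted, bifurcating trees on n labeled leaves, (2n-3)!!. The function name is illustrative, and the slides do not say which class of trees the course actually enumerated.

```python
def rooted_tree_count(n_leaves: int) -> int:
    """Number of rooted, bifurcating, leaf-labeled trees: (2n-3)!!."""
    if n_leaves < 2:
        return 1
    count = 1
    # Multiply the odd numbers 3, 5, ..., 2n-3.
    for k in range(3, 2 * n_leaves - 2, 2):
        count *= k
    return count


for n in (3, 4, 5, 10, 20):
    print(n, rooted_tree_count(n))
```

With 10 leaves there are already more than 34 million trees, which is the usual motivation for heuristic searches such as genetic algorithms.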
21- Introduction to
- data fitting,
- optimization, genetic algorithms, exhaustive search,
- matlab routines,
- Realistic solutions (positive branch lengths)
- Start on some multivariable calculus to derive
normal equations
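As a small illustration of the data fitting and normal equations mentioned above, here is a sketch in plain Python that fits a straight line y = a + b*t by solving the 2x2 normal equations directly. The data points and the function name are hypothetical, and the course itself used Matlab routines.

```python
def fit_line(t, y):
    """Least-squares fit of y ~ a + b*t via the 2x2 normal equations.

    Normal equations:  [ n    sum(t)   ] [a]   [ sum(y)   ]
                       [ sum(t) sum(t^2)] [b] = [ sum(t*y) ]
    """
    n = len(t)
    st = sum(t)
    stt = sum(ti * ti for ti in t)
    sy = sum(y)
    sty = sum(ti * yi for ti, yi in zip(t, y))
    det = n * stt - st * st
    if det == 0:
        raise ValueError("need at least two distinct t values")
    # Solve the 2x2 system with Cramer's rule.
    a = (stt * sy - st * sty) / det
    b = (n * sty - st * sy) / det
    return a, b


# Hypothetical data lying exactly on y = 1 + 2t.
a, b = fit_line([0, 1, 2, 3], [1, 3, 5, 7])
print(a, b)
```

The same equations come from setting the partial derivatives of the sum of squared residuals with respect to a and b to zero, which is where the multivariable calculus enters.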
23- New technology allows hundreds of pathology
specimens from human diseases to be sampled as
0.6 mm punches of tissue that are arrayed into new
TMA paraffin blocks; these blocks are then
sectioned with microtomes to produce hundreds of
slides containing hundreds of human tissue
specimens (tissue microarrays, TMAs). Databases
to support analysis of these high throughput TMAs
will include information on diagnosis, treatment,
disease response, and multiple images from
follow-on studies linked to the coordinates of
each of the hundreds of punches on the TMA. Data
mining from the results of TMA experiments will
allow text mining and image feature extraction.
In this project, we present the requirements,
design, and a prototype of a web based TMA
database application.
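A minimal sketch of how such a database might be laid out follows. It uses Python's built-in sqlite3 as a stand-in for the project's actual database, and every table name, column name, and sample row is a hypothetical placeholder rather than the prototype's real schema.

```python
import sqlite3


def build_tma_db(conn):
    """Create illustrative tables: specimens, punches on a block, images."""
    conn.executescript("""
        CREATE TABLE specimen (
            specimen_id INTEGER PRIMARY KEY,
            diagnosis   TEXT,
            treatment   TEXT,
            response    TEXT
        );
        CREATE TABLE punch (
            punch_id    INTEGER PRIMARY KEY,
            block_name  TEXT NOT NULL,
            row_idx     INTEGER NOT NULL,
            col_idx     INTEGER NOT NULL,
            specimen_id INTEGER NOT NULL REFERENCES specimen(specimen_id),
            UNIQUE (block_name, row_idx, col_idx)
        );
        CREATE TABLE image (
            image_id  INTEGER PRIMARY KEY,
            punch_id  INTEGER NOT NULL REFERENCES punch(punch_id),
            stain     TEXT,
            file_path TEXT NOT NULL
        );
    """)


def images_for_diagnosis(conn, diagnosis):
    """Return (block, row, col, file_path) for every image of a diagnosis."""
    rows = conn.execute("""
        SELECT p.block_name, p.row_idx, p.col_idx, i.file_path
        FROM specimen s
        JOIN punch p ON p.specimen_id = s.specimen_id
        JOIN image i ON i.punch_id = p.punch_id
        WHERE s.diagnosis = ?
        ORDER BY p.block_name, p.row_idx, p.col_idx, i.image_id
    """, (diagnosis,))
    return rows.fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
build_tma_db(conn)
# Placeholder rows only; not real patient data.
conn.executemany("INSERT INTO specimen VALUES (?, ?, ?, ?)",
                 [(1, "diag-A", "none", "unknown"),
                  (2, "diag-B", "none", "unknown")])
conn.executemany("INSERT INTO punch VALUES (?, ?, ?, ?, ?)",
                 [(1, "TMA-1", 0, 0, 1), (2, "TMA-1", 0, 1, 2)])
conn.executemany("INSERT INTO image VALUES (?, ?, ?, ?)",
                 [(1, 1, "H&E", "img/p1_he.png"),
                  (2, 2, "H&E", "img/p2_he.png")])
print(images_for_diagnosis(conn, "diag-A"))
```

Keying each punch by block and grid position is what lets follow-on images be linked back to the coordinates on the array.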
26- Clustering has recently been demonstrated to be
an important preprocessing step prior to
parametric estimation from dynamic PET images.
Clustering, as a form of segmentation, is useful
in improving the accuracy of voxel level
quantification in PET images. Classical
clustering algorithms such as hierarchical
clustering and K-means clustering can be applied
to dynamic PET data using an appropriate
weighting technique. New variants of hierarchical
clustering with different preprocessing criteria
were developed by Dr. Guo recently. Our research
focus is to validate these different algorithms
with respect to their efficiency and accuracy.
Different inter- and intra-cluster measures and
statistical tests are considered to assess the
quality of the different cluster results.
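For readers unfamiliar with K-means, the following is a minimal sketch applied to voxel time-activity curves with a per-frame weight in the distance. The weighting scheme, the toy curves, and the function names are all assumptions made for illustration. The slide does not specify the weighting actually used, and this is not one of Dr. Guo's hierarchical variants.

```python
def weighted_sq_dist(a, b, w):
    """Squared distance between two curves with one weight per time frame."""
    return sum(wi * (ai - bi) ** 2 for ai, bi, wi in zip(a, b, w))


def kmeans(curves, init_centers, weights, n_iter=20):
    """Plain K-means: assign each curve to its nearest center, then re-average."""
    centers = [list(c) for c in init_centers]
    labels = [0] * len(curves)
    for _ in range(n_iter):
        # Assignment step: index of the nearest center for every curve.
        labels = [
            min(range(len(centers)),
                key=lambda j: weighted_sq_dist(c, centers[j], weights))
            for c in curves
        ]
        # Update step: each center becomes the mean of its members.
        new_centers = []
        for j in range(len(centers)):
            members = [c for c, lab in zip(curves, labels) if lab == j]
            if members:
                new_centers.append(
                    [sum(col) / len(members) for col in zip(*members)])
            else:
                new_centers.append(centers[j])  # keep an empty cluster's center
        if new_centers == centers:
            break
        centers = new_centers
    return labels, centers


# Toy curves (three time frames each); values are made up.
curves = [[1.0, 2.0, 3.0], [1.1, 2.1, 2.9], [5.0, 1.0, 0.5], [5.2, 0.9, 0.4]]
weights = [1.0, 1.0, 2.0]  # hypothetical, e.g. proportional to frame duration
labels, centers = kmeans(curves, [curves[0], curves[2]], weights)
print(labels)
```

Because the weights act per time frame, the ordinary mean of the member curves still minimizes the weighted within-cluster distance, so the update step needs no change.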
31- Otolith Aging and Analysis
William T. Stewart
Advisors: Dr. Rosemary Renaut, Dr. Paul Marsh (Arizona State University);
Scott Byan, Kirk Young, Marianne Meding (Arizona Game and Fish Department)
- Otoliths, also known as earstones, are paired
calcified structures used for balance and hearing
in teleost fish. An otolith is acellular and
metabolically inert, providing biologists with a
record of exposure to both the temperature and
composition of the ambient water. Otoliths
provide an abundance of information ranging from
temperature history, detection of anadromy,
determination of migration pathways, stock
identification, use as a natural tag, and, most
importantly, age validation. Growth rings
(annuli) on the otolith record the age and growth
of a fish from birth to death. The goal of this
project is to design a Matlab program that uses
digital otolith images to semi-automate the
aging process. There are three
main components to this task.
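The slide does not list the three components, so the following is only a guess at what one ring-counting step could look like: counting dark bands along a one-dimensional intensity profile sampled from the otolith core to its edge. The thresholding rule, the function name, and the sample profile are assumptions, and the project itself was written in Matlab.

```python
def count_dark_bands(profile, threshold=None):
    """Count runs of consecutive samples that fall below the threshold.

    profile: grayscale intensities along a core-to-edge transect.
    threshold: defaults to the mean intensity of the profile.
    """
    if threshold is None:
        threshold = sum(profile) / len(profile)
    bands = 0
    in_band = False
    for value in profile:
        if value < threshold and not in_band:
            bands += 1       # entering a new dark band
            in_band = True
        elif value >= threshold:
            in_band = False  # back in a bright zone
    return bands


# Hypothetical profile with three dark bands.
profile = [9, 9, 2, 1, 9, 8, 2, 9, 9, 1, 1, 9]
print(count_dark_bands(profile))
```

A real image would need smoothing and a human check of the result, which is presumably why the aging process is described as semi-automated.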
32- Sequencing a Microbial Genome
Maulik Shah
Advisors: Dr. Jeffrey Touchman, Dr. Rosemary Renaut, Dr. Phillip Stafford
- Although many genomes are available for download
today, the underlying technologies should not be
taken for granted. By using shotgun sequencing
techniques and a gauntlet of informatics, we are
able to produce high-quality DNA sequence. We
will first look at some of the robotics and
chemistries of preparing DNA as samples for the
sequencing instruments. Then we will look at the
series of applications used in taking raw data
signals, converting them to sequence and then
finally assembling the data into a single genome.
Highlighted will be some of the techniques used
to speed the informatics processes as well as
some of the challenges that informatics faces in
processing data and assembling the genome.
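To give a feel for the final assembly step, here is a toy sketch that repeatedly merges the pair of reads with the longest suffix-prefix overlap. It is a textbook simplification on error-free reads, not the pipeline described in the talk, and the reads and function names are invented for the example.

```python
def overlap(a, b, min_len=3):
    """Length of the longest suffix of a that equals a prefix of b."""
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a[-k:] == b[:k]:
            return k
    return 0


def greedy_assemble(reads, min_len=3):
    """Merge the best-overlapping pair until no overlap of min_len remains."""
    contigs = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while True:
        best_k, best_i, best_j = 0, None, None
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            return contigs
        merged = contigs[best_i] + contigs[best_j][best_k:]
        contigs = [c for idx, c in enumerate(contigs)
                   if idx not in (best_i, best_j)] + [merged]


# Invented reads that tile a 12-base sequence.
reads = ["ATGGCGT", "GCGTACG", "TACGGA"]
print(greedy_assemble(reads))
```

The all-pairs overlap search is quadratic in the number of reads, which hints at why speeding up the informatics is a challenge at genome scale.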
33- Supertree Analysis of the Plant Family Fabaceae
Tiffany J. Morris
Advisor: Martin F. Wojciechowski, School of Life Sciences, Arizona State University
The Tree-of-Life is a national and
international project to collect information
about the origin, evolution, and diversity of
organisms, with the goal of producing a tree of
all life on Earth (Pennisi, 2003). The obstacles
to achieving this goal are many, from questions
related to the kinds and number of data to be
used, to building that phylogeny, to the
methodological and computational resources
required to analyze the massive amounts of data
expected to be necessary to bring this to
fruition. The goal of this project is to obtain a
Supertree for the plant family Fabaceae utilizing
phylogenetic trees found in previously published
studies.
34- PROTEIN INTERACTION MAPPING: USE OF OSPREY TO MAP
SURVIVAL OF MOTOR NEURON PROTEIN INTERACTIONS
by Margaret Barnhart
Advisor: Dr. Ron Nieman
- Spinal Muscular Atrophy is one of the leading
genetic causes of death in infants. In humans,
the disease state is characterized by homozygous
deletion of the telomeric copy of the survival of
motor neuron gene (SMN1). The centromeric copy,
SMN2, rescues lethality by producing a small
amount of full-length SMN protein as its minor
product. The SMN gene was first characterized in
1995, and research efforts to describe the
molecular mechanisms of SMN protein in the cell
have since revealed a highly complex set of
functions and interactions for SMN. The large
amount of protein-protein interaction data
collected for SMN exceeds the limitations imposed
by current methods of interaction
visualization. Osprey allows a network
representation of protein-protein interactions
and has been used to describe the recorded sets
of interactions of SMN. This method of
interaction visualization allows relationships to
be drawn between the functions of SMN and
analogous proteins, clustering of interactions
based on level of interaction or function, and
ultimately, the derivation of clues to the
critical function of SMN.
35- Internship is at least 400 hours, possibly full time over the summer
- Student must write a project report with required format
- Student presents report to committee in oral exam
- International students can work off campus using EIP program
- Encourage students to seek projects outside AZ
36- More information: please contact renaut_at_asu.edu
- More information on projects: www.asu.edu/compbiosci
- Do you have projects?
- Internships?
- Jobs?