Title: Computational Biology Applications
1Computational Biology Applications
- Bartosz Nowierski
- Poznan University of Technology
2Laboratory of Bioinformatics (1)
- Institute of Bioorganic Chemistry (Polish Academy of Sciences)
- Founded 1.11.1999
- Members
- Prof. Dr. Habil. Jacek Blazewicz - director
- Ph.D. Piotr Formanowicz
- Ph.D. Marta Kasprzak
- M.Sc. Marcin Jaroszewski
- M.Sc. Piotr Lukasiak
- M.Sc. Piotr Wierzejewski
3Laboratory of Bioinformatics (2)
- Basic research area
- algorithms for sequencing by hybridisation
- analysis of DNA graphs
- analysis of NMR spectra for RNA chains
- restriction map construction
- constructing phylogenetic trees
- constructing a bio-server for selected problems of computational biology
- prediction of protein secondary structures
- DNA sequence assembly
4Laboratory of Bioinformatics (3)
- International cooperation
- Universite LeHavre, France
- Max Planck Institutes, Germany
- TmBioscience, Canada
- Rutgers University, USA
- TU Clausthal, Germany
- RIKEN, Japan
5Applications
- DNA sequence assembly
- Prediction of protein secondary structures
- Constructing phylogenetic trees
- Motivation
- popular subject
- great demand
- faster -> distributed
6DNA Sequence Assembly: Problem specification
- Alphabet: {A, C, G, T}
- Problem
7DNA Sequence Assembly: Overlap graph
- input sequence -> vertex
- overlap of sequences -> arc
- shift
- weight
- ACTGCCTA
- CTAGGATC
- TCAAGA
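A minimal sketch of how such an overlap graph could be built from the three example sequences. It assumes that the weight of an arc is the length of the longest suffix-prefix overlap and that the shift is the offset of the second sequence relative to the first; the slide does not define either, and the function names are illustrative only.

```python
from itertools import permutations

def overlap(a: str, b: str) -> int:
    """Length of the longest suffix of a that is also a prefix of b."""
    for k in range(min(len(a), len(b)), 0, -1):
        if a.endswith(b[:k]):
            return k
    return 0

def overlap_graph(seqs):
    """Return {(i, j): (shift, weight)} for every ordered pair that overlaps.

    Assumption: weight = overlap length, shift = len(seqs[i]) - weight.
    """
    arcs = {}
    for i, j in permutations(range(len(seqs)), 2):
        w = overlap(seqs[i], seqs[j])
        if w > 0:
            arcs[(i, j)] = (len(seqs[i]) - w, w)
    return arcs

seqs = ["ACTGCCTA", "CTAGGATC", "TCAAGA"]
graph = overlap_graph(seqs)
for (i, j), (shift, weight) in sorted(graph.items()):
    print(f"{seqs[i]} -> {seqs[j]}: shift={shift}, weight={weight}")
```

Each vertex is an index into `seqs`; pairs without any overlap get no arc.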
8DNA Sequence Assembly: Redundant arcs
- Example sequences: ATGACTACT, GACTACTGA, ACTGAATCA
- Arc shifts: 2, 4, 6
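A sketch of one way to detect redundant arcs, using the three sequences from this slide. It assumes an arc a -> c is redundant when some intermediate vertex b gives shift(a, b) + shift(b, c) = shift(a, c), so the direct arc carries no information beyond the two-step path; the slide does not spell out the criterion, and the function names are hypothetical.

```python
from itertools import permutations

def shift(a: str, b: str):
    """Smallest positive offset at which b continues the tail of a, else None."""
    for s in range(1, len(a)):
        if b.startswith(a[s:]):
            return s
    return None

def build_arcs(seqs):
    """Return {(i, j): shift} for every ordered pair of overlapping sequences."""
    arcs = {}
    for i, j in permutations(range(len(seqs)), 2):
        s = shift(seqs[i], seqs[j])
        if s is not None:
            arcs[(i, j)] = s
    return arcs

def remove_redundant(arcs, n):
    """Drop arcs whose shift equals the summed shifts of a two-arc path (assumed criterion)."""
    reduced = dict(arcs)
    for (a, c), s in arcs.items():
        for b in range(n):
            if (a, b) in arcs and (b, c) in arcs and arcs[(a, b)] + arcs[(b, c)] == s:
                reduced.pop((a, c), None)
                break
    return reduced

seqs = ["ATGACTACT", "GACTACTGA", "ACTGAATCA"]
arcs = build_arcs(seqs)
reduced = remove_redundant(arcs, len(seqs))
print(arcs[(0, 1)], arcs[(1, 2)], arcs[(0, 2)])
print((0, 2) in reduced)
```

For these sequences the arc from the first to the third sequence (shift 6) is implied by the arcs with shifts 2 and 4, so it is dropped.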
9DNA Sequence Assembly: Hamiltonian path with max. weight
- NP-hard problem -> heuristic
- Selection of first element
- unattractive successor of any vertex
- Selection of next elements
- attractive successor, but not attractive to others
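The selection rules above can be read in more than one way; the following greedy sketch is one possible interpretation, not the authors' implementation. It assumes the first vertex is the one whose best incoming arc is weakest, and that each next vertex maximises the weight of the arc from the current vertex minus the best weight any other unvisited vertex could offer it. The scoring formula and all names are assumptions.

```python
def greedy_path(n, w):
    """Greedy heuristic for a heavy Hamiltonian path.

    n: number of vertices (0..n-1); w: {(u, v): weight}, missing arcs count as 0.
    """
    def weight(u, v):
        return w.get((u, v), 0)

    unvisited = set(range(n))
    # First element: the vertex that is the least attractive successor of any vertex.
    first = min(
        sorted(unvisited),
        key=lambda v: max((weight(u, v) for u in range(n) if u != v), default=0),
    )
    path = [first]
    unvisited.remove(first)

    while unvisited:
        cur = path[-1]

        def score(v):
            # Attractive to the current vertex, but not attractive to the others.
            rivals = max((weight(u, v) for u in unvisited if u != v), default=0)
            return weight(cur, v) - rivals

        nxt = max(sorted(unvisited), key=score)
        path.append(nxt)
        unvisited.remove(nxt)
    return path

# Overlap weights for ACTGCCTA (0), CTAGGATC (1), TCAAGA (2).
weights = {(0, 1): 3, (1, 2): 2, (2, 0): 1}
path = greedy_path(3, weights)
total = sum(weights.get(arc, 0) for arc in zip(path, path[1:]))
print(path, total)
```

Being a heuristic, this gives no optimality guarantee; ties are broken by the lowest vertex index.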
10DNA Sequence Assembly: Parallelization
- Overlaps: distribute the set of sequences
- Arc reduction: distribute the set of vertices
- First vertex: distribute the set of vertices
- Next vertices: ?
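As an illustration of the first point, the overlap computation can be split so that each worker handles the arcs leaving its own subset of sequences and a master merges the partial results. The sketch below uses Python threads purely as a stand-in for grid workers; the chunking scheme and function names are assumptions, not the project's actual design.

```python
from concurrent.futures import ThreadPoolExecutor

def overlap(a: str, b: str) -> int:
    """Length of the longest suffix of a that is also a prefix of b."""
    for k in range(min(len(a), len(b)), 0, -1):
        if a.endswith(b[:k]):
            return k
    return 0

def arcs_for_chunk(chunk, seqs):
    """Worker task: overlaps for all arcs leaving the vertices in `chunk`."""
    part = {}
    for i in chunk:
        for j in range(len(seqs)):
            if i != j:
                w = overlap(seqs[i], seqs[j])
                if w > 0:
                    part[(i, j)] = w
    return part

def parallel_overlaps(seqs, workers=2):
    """Master: distribute source vertices round-robin and merge the partial graphs."""
    chunks = [range(k, len(seqs), workers) for k in range(workers)]
    arcs = {}
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for part in pool.map(arcs_for_chunk, chunks, [seqs] * workers):
            arcs.update(part)
    return arcs

result = parallel_overlaps(["ACTGCCTA", "CTAGGATC", "TCAAGA"])
print(sorted(result.items()))
```

The workers share no state, which is why this step distributes easily; the sequential choice of next vertices has no such obvious split.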
11Protein Secondary Structures: Problem specification
- amino acid -> secondary structure
- A, C, D, E, F, G, H, I, K, ... -> H, E, X
12Protein Secondary Structures: Rule usage
- Logical Analysis of Data approach
VASYDYLVIGGGSGG
13Protein Secondary Structures: Rule generation
- Good rule properties
- rule says e.g. H -> it must be right
- rule says e.g. not H -> it should be right
- The best rules
- if one variable is removed, the rule is no longer good
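In Logical Analysis of Data terms, these properties roughly correspond to pure patterns (a rule that covers no counterexample) and prime patterns (a rule that stops being pure as soon as any one variable is dropped). The sketch below checks both on binary observations; the encoding of a rule as an index-to-value dictionary and the toy data are assumptions made for illustration.

```python
def covers(pattern, x):
    """A pattern {index: value} covers x if x agrees with it on every listed index."""
    return all(x[i] == v for i, v in pattern.items())

def is_pure(pattern, pos, neg):
    """Covers at least one positive example and no negative example."""
    return any(covers(pattern, x) for x in pos) and not any(covers(pattern, x) for x in neg)

def is_prime(pattern, pos, neg):
    """Pure, and dropping any single variable makes it cover a negative example."""
    if not is_pure(pattern, pos, neg):
        return False
    for i in pattern:
        reduced = {j: v for j, v in pattern.items() if j != i}
        if not any(covers(reduced, x) for x in neg):
            return False
    return True

# Toy binarised observations (hypothetical): pos = class H, neg = not H.
pos = [(1, 1, 0), (1, 1, 1)]
neg = [(1, 0, 0), (0, 1, 1)]

print(is_prime({0: 1, 1: 1}, pos, neg))        # every variable is needed
print(is_prime({0: 1, 1: 1, 2: 0}, pos, neg))  # variable 2 is superfluous
```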
14Protein Secondary Structures: Rule generation - algorithm
- Algorithm
- generate all reasonable 0-1 arrays
- Classifier generation
- set of rules (logical OR)
- mathematical formula
- Parallelization
- division of array space
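A rough sketch of these three steps, assuming that a "0-1 array" means an assignment of 0/1 values to a small subset of binary attributes, that a rule is kept when it covers some positive and no negative example, and that the array space is divided by simply slicing the candidate list. The degree limit, the data, and the names are hypothetical.

```python
from itertools import combinations, product

def candidates(n_attrs, max_degree):
    """Enumerate all 0-1 assignments over attribute subsets of size <= max_degree."""
    for d in range(1, max_degree + 1):
        for idx in combinations(range(n_attrs), d):
            for vals in product((0, 1), repeat=d):
                yield dict(zip(idx, vals))

def covers(pattern, x):
    return all(x[i] == v for i, v in pattern.items())

def rules_in_chunk(chunk, pos, neg):
    """Work unit for one worker: keep candidates covering a positive and no negative."""
    return [p for p in chunk
            if any(covers(p, x) for x in pos) and not any(covers(p, x) for x in neg)]

def classify(rules, x):
    """Classifier as a logical OR over the set of rules."""
    return any(covers(p, x) for p in rules)

pos = [(1, 1, 0), (1, 1, 1)]   # toy observations of the target class
neg = [(1, 0, 0), (0, 1, 1)]   # toy observations of the other classes

space = list(candidates(n_attrs=3, max_degree=2))
chunks = [space[k::2] for k in range(2)]      # division of the array space
rules = [r for chunk in chunks for r in rules_in_chunk(chunk, pos, neg)]

print(len(space), len(rules))
print(classify(rules, (1, 1, 0)), classify(rules, (1, 0, 0)))
```

Each chunk can be filtered independently, so the chunks could be handed to separate workers and the resulting rule lists concatenated.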
15Phylogenetic Trees: Problem specification
16Phylogenetic Trees: Algorithm
- Parallelization
- distribution of subtrees
- exchange of information about solutions
17Usage of GridLab (1)
- Resource management (GRMS)
- assignment of resources
- application structure (master-slave)
- checkpointing, migration
- dynamic assignment of new resources
- frameworks
- distributed shared memory (?)
18Usage of GridLab (2)
- Monitoring
- dynamic link/processor states (tuning)
- estimation of end time (GRMS, end user)
- Others
- visualisation
- security
- mobile users
19Contact
- Director
- Prof. Dr. Habil. Jacek Blazewicz
- Jacek.Blazewicz_at_cs.put.poznan.pl
- DNA Sequence Assembly
- B.Sc. Bartosz Nowierski
- Bartosz.Nowierski_at_cs.put.poznan.pl
- Protein Secondary Structures
- M.Sc. Piotr Lukasiak
- Piotr.Lukasiak_at_cs.put.poznan.pl
- Phylogenetic Trees
- Ph.D. Piotr Formanowicz
- Piotr.Formanowicz_at_cs.put.poznan.pl