Title: The FDK Unit Scientific Advisory Board Meeting
1The FDK Unit Scientific Advisory Board Meeting
Monday, March 22, 2004, Department of Computer Science, University of Helsinki
2Who are we?
- Host institutions
- University of Helsinki, Department of Computer Science
- Helsinki University of Technology, Laboratory of Computer and Information Science
- about 60 people altogether
- professors: Esko Ukkonen (director), Heikki Mannila, (Tapio Elomaa -> Tampere Univ of Technology), Helena Ahonen-Myka, Jaakko Hollmen, Hannu Toivonen
3 - Postdocs, senior researchers: Greger Linden, Juha Muilu, (Kimmo Fredriksson -> Joensuu University), Juha Kärkkäinen, Kjell Lemström, Juho Rousu, Marko Salmenkivi, Saara Hyvönen, Päivi Onkamo, Floris Geerts, Bart Goethals, Aristides Gionis, Veli Mäkinen, Shunsuke Inenaga, Stefan Burkhardt, Mikko Koivisto
4What do we do?
- The FDK unit develops methods for forming useful knowledge from large masses of data. The unit operates in a multidisciplinary fashion, integrating in its research groups excellence in computational methods, statistical techniques, and application sciences.
- data -> computational methods -> knowledge
5Core competence
- Combinatorial Pattern Matching: searching and matching of strings and more complicated (discrete) patterns, deriving their combinatorial properties, and exploiting these to achieve superior performance for the corresponding computational problems
- Data Mining: finding interesting and useful patterns from masses of data
- -> Combinatorial algorithms and probabilistic models
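As an illustration of exploiting a pattern's combinatorial properties for performance, here is a minimal sketch of the classic Knuth-Morris-Pratt exact string matching algorithm. It is a textbook example chosen for this note, not a method attributed to the FDK unit; the function names `border_table` and `find_all` are hypothetical.

```python
def border_table(pattern: str) -> list[int]:
    """border[i] is the length of the longest proper prefix of
    pattern[:i + 1] that is also a suffix of it."""
    border = [0] * len(pattern)
    k = 0  # length of the current matched border
    for i in range(1, len(pattern)):
        # Fall back to shorter borders until the next character fits.
        while k > 0 and pattern[i] != pattern[k]:
            k = border[k - 1]
        if pattern[i] == pattern[k]:
            k += 1
        border[i] = k
    return border


def find_all(text: str, pattern: str) -> list[int]:
    """Return the start index of every occurrence of pattern in text.

    An empty pattern is treated as having no occurrences (a choice
    made for this sketch).
    """
    if not pattern:
        return []
    border = border_table(pattern)
    hits = []
    k = 0  # number of pattern characters currently matched
    for i, ch in enumerate(text):
        # On a mismatch, reuse the border instead of rescanning the text.
        while k > 0 and ch != pattern[k]:
            k = border[k - 1]
        if ch == pattern[k]:
            k += 1
        if k == len(pattern):
            hits.append(i - k + 1)
            k = border[k - 1]  # allow overlapping occurrences
    return hits


if __name__ == "__main__":
    print(find_all("abababa", "aba"))
    print(find_all("GATTACA", "TA"))
```

The border table is the combinatorial property here: it describes how the pattern overlaps with itself, which lets the scan run in time linear in the combined length of text and pattern without ever moving backwards in the text.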
6The 6 step FDK model
1. Find an interesting computational problem in an application area in collaboration with application experts
2. Find a conceptual basis and formalize the problem
3. Develop an algorithm
4. Analyse the algorithm
5. Implement the algorithm as prototype software
6. Experiment with the method and tune it using real data
7 Themes
- Theme I: Data mining and machine learning
- Theme II: Computational methods in medical genetics and systems biology (bioinformatics)
- Theme III: Combinatorial pattern matching and information retrieval
- Theme IV: Computational structural biology
8Groups
- Group Mannila: data mining, biology, ...
- Group Elomaa: algorithmic learning theory
- Group Toivonen: data mining, biology, ...
- Group Ahonen-Myka: IR, linguistics
- Group Hollmen: data analysis
- Groups Ukkonen/Lemström/Rousu: string algorithmics, music retrieval, computational biology
9Activities
- research work in groups
- teaching
- study seminars, internal FDK seminars
- ECML02 / PKDD02 conference in Aug 2002
- Master's program in bioinformatics (Sami Kaski just started in the new professorship of data analysis and bioinformatics at the CS Dept)
- PhD education (grad schools etc)
- visitors coming and going
- joint research consortia: Biomedicum, Viikki Biocenter, VTT, industry, EU
10Teaching 2002 (sample)
- Advanced algorithms in data mining (Mannila)
- Special topics in bioinformatics (Hollmen, Wikman)
- Methods of genetic linkage (Koivisto)
- Knowledge extraction (Lindén)
- Analysis of genome structure (Mannila)
- Spatial data processing (Geerts)
- Tutoring of research students (Ollikainen)
- Computational biology (Ukkonen, Salmenkivi, Rousu)
- FDK seminar (Ukkonen, Vasko)
- Postdocs and PhD students serve as teaching assistants
11Teaching 2003 (sample)
- Game theoretic concepts in computer science (Geerts)
- Seminar on data mining of spatial data (Leino)
- Tutoring Scientific writing (Mäkinen, Kivioja)
- Genetics for computer scientists (Onkamo)
- Special course on data mining (Goethals)
- Algorithmic methods of data mining (Mannila)
- Computational genomics (Hollmen, Kaski)
- Image processing (Lemström)
- String matching algorithms (Mäkinen)
- Analysis of binary data (seminar) (Hollmen, Mannila)
- Seminar on modeling and mining the web (Hollmen, Patrikainen)
- Seminar on algorithmic problems in data mining (Gionis)
- Seminar on data mining of biomolecular data (Toivonen)
- FDK seminar (Ukkonen, Vasko, Mäkinen)
12PhD Education
- Graduate schools: ComBi, HeCSE, KIT, COMMIT
- student recruiting: local departments (connection to teaching necessary!), international contacts (BREW 2004 etc), Internet
- so far relatively good students
- heavy supervision load; postdocs will help
- international collaboration of grad schools in bioinformatics: Bioinformatics Research and Education Workshops (Bielefeld, Bergen, Berlin, EBI, Helsinki)
13Research collaboration within the university
- Rolf Nevanlinna institute
- Institute of Biotechnology
- Finnish Genome Center
- Coordination of bioinformatics curriculum and
teaching
14Research collaboration within the university (cont)
- Bioinformatics: Liisa Holm
- Biometry: Elja Arjas
- Gene expression, gene regulation: M. Makarov, P. Auvinen, M. Frilander
- Medical genetics: L. Palotie, J. Kere, S. Knuutila, I. Järvelä
- Melanoma and other cancer research: E. Hölttä, T. Mäkelä
- Finnish names: R.-L. Pitkänen
- Paleontology: M. Fortelius, J. Jernvall
- Environmental research: A. Korhola
- Structural biology: D. Bamford
- Systems biology: L. Aaltonen, J. Taipale
- Human language technology: K. Koskenniemi, L. Carlson, K. Jokinen (Kouvola)
15Research collaboration in Finland
- Nokia, Lingsoft, Fujitsu-Invia
- Institute of Public Health
- Research Institute for the Languages of Finland
- Institute of Forest Research
- Institute of Occupational Health
- VTT Biotechnology, VTT Processes
- National teaching network of language technology
- Biocenter Turku
16International collaboration
- Joint publications with researchers from the following institutions:
- University of California at Irvine, Albert-Ludwigs Universität Freiburg, INSA Lyon, Aalborg University, Imperial College, City University London, European Bioinformatics Institute (UK), Univ Haifa, Bar-Ilan University, University of Chile, Rensselaer Polytechnic Institute, Limburg University, Stanford University, MIT, Microsoft Research, New Jersey Institute of Technology, Oxford University, The University of Wales (Aberystwyth), TU Munich, University of California at Riverside, Tufts University.
17Personnel 2002-2003
                               Male/02    Fem/02  Total/02   Male/03    Fem/03  Total/03
A. FINNISH
Professors                           5         1         6         4         1         5
Other senior researchers             1         -         1         2         -         2
Postdocs                             6         1         7         6         2         8
Grad school students (ME)           14         5        19        14         6        20
Other postgrad students              1         3         4         2         2         4
Other academic                      10         -        10        10         1        11
Auxiliary personnel                  -         -         -         -         -         -
B. FOREIGN
Professors                           -         -         -         -         -         -
Other senior researchers             -         -         -         -         -         -
Postdocs                             -         -         -         4         -         4
Grad school students (ME)            1         -         1         1         -         1
Other postgrad students              2         1         3         2         1         3
Other academic                       3         -         3         1         -         1
Auxiliary personnel                  -         -         -         -         -         -
TOTAL                               43        11        54        46        13        59
18Funding in 2002 - 03 (KEuros)
- Domestic 2002 2003
- Own basic funding (home inst) 332 375
- Academy of Finland 60 256
- National Technology Agency 110 61
- Other domestic
- - Academy of Finland 537 355
- - Univ Helsinki 75 80
- - Univ Helsinki (project) 35 -
- - Min of Education (grad schools) 140 288
- - Helsinki Univ Technology (home inst) 90 70
- - enterprises 230 28
- - VTT 10
- B. Foreign
- - Max-Planck-Institut 40 40
- - EU
- TOTAL 1616 1554
Some growth expected in 2004
19Publications and other outcomes 2002-03
- 2002 2003
- Int journal articles 18 23
- - accepted 9
- Int refereed proceedings articles 31 47
- Domestic journal articles - 1
- Domestic proceedings articles 5 1
- Int monographs/proceedings 3 3
- Other scientific publications 4 14
- Patents 1 -
- Computer programs (and algorithms) 1 8
- Lectures and visiting lectures n/a 52
- Radio TV programs, popular articles 3 1
- Degrees
- - Masters theses 13 5
- - PhD Theses 2 2
20PhD Degrees
- V Ollikainen (2002): Simulation techniques for disease gene location in isolated populations
- J Vilo (2002): Pattern discovery from biosequences
- V Mäkinen (2003): Parametrized approximate string matching and local-similarity-based point-pattern matching
- M Koivisto (2004): Sum-product algorithms for the analysis of genetic risks
- under review: Sevon, M K Vasko, P Kääriäinen
- approaching review: T Kivioja, H Tamm, J Ravantti, J Makkonen
- by the end of 2004: 7-10 PhD degrees completed
21New projects/funding
ACADEMY OF FINLAND
- Systems biology program
1) J Hollmen, S Kaski: Gene expression
2) H Mannila, J Kere, L Peltonen..: Genome structure
3) J Rousu, E Ukkonen, VTT: Metabolic modeling
- Proactive computing program
4) program coordination: H Mannila, G. Linden
5) H Toivonen: Data mining for context awareness
TEKES
6) H Ahonen-Myka: Mobile and multilingual maintenance man
7) E. Ukkonen, VTT: Software for metabolic flux anal. (Neobio)
8) H. Toivonen: Gene mapping (ALTTI)
22New projects/funding (cont.)
EUROPEAN UNION
9) BIOSAPIENS (E Ukkonen): NoE on genome annotation, EBI / M Thornton
10) Pascal (E Ukkonen/H Tirri): NoE on machine learning, Royal Holloway Univ London / J Shawe-Taylor
11) April II (H Mannila): STREP on combining logical and probabilistic framework for biological data, Univ Freiburg / L de Raedt
12) MobiLife (H Mannila): IP on user centered mobile applications
23FDK Future Goals (2002/2004)
- about the right size ( )
- more postdocs, more visitors ( )
- EU projects, TEKES, Systems biology programme/Academy ( )
- Focus: combinatorial algorithms and probabilistic models
- - basic algorithms for combinatorial pattern matching
- - basic algorithms for data mining
- - genome structure
- - regulatory networks in comp biology
- - human language applications