Title: AI and Bioinformatics
1. AI and Bioinformatics
- From Database Mining to the Robot Scientist
2. History of Bioinformatics
- The definition of bioinformatics is debated.
- In 1973, Herbert Boyer and Stanley Cohen invented DNA cloning.
- By 1977, a method for sequencing DNA had been developed.
- In 1981, the Smith-Waterman algorithm for sequence alignment was published.
3. History of Bioinformatics
- By 1981, 579 human genes had been mapped.
- In 1985, the FASTP algorithm was published.
- In 1988, the Human Genome Organisation (HUGO) was founded.
4. History of Bioinformatics
- Bioinformatics was fuelled by the need to create huge databases.
- AI and heuristic methods can provide key solutions for the new challenges posed by the progressive transformation of biology into a data-massive science.
  - Data mining
- In 1990, the BLAST program was implemented.
  - BLAST: Basic Local Alignment Search Tool
  - A program for searching biosequence databases
5. History of Bioinformatics
- Scientists use computer scripting languages such as Perl and Python.
- By 1991, a total of 1879 human genes had been mapped.
- In 1996, Genethon published the final version of the Human Genetic Map. This concluded the first phase of the Human Genome Project.
6. History of Bioinformatics
Year  Subject name              Size (Mbp, millions of base pairs)
1995  Haemophilus influenzae    1.8
1996  Baker's yeast             12.1
1997  E. coli                   4.7
2000  Pseudomonas aeruginosa    6.3
2000  A. thaliana               100
2000  D. melanogaster           180
2001  Human genome              3,000
2002  House mouse               2,500
7. Bioinformatics Today
- There are several important problems where AI approaches are particularly promising:
  - Prediction of protein structure
  - Semiautomatic drug design
  - Knowledge acquisition from genetic data
8. Functional Genomics and the Robot Scientist
- Robot scientist developed by University of Wales researchers
- Designed for the study of functional genomics
- Tested on yeast metabolic pathways
- Uses logical and associationist knowledge representation schemes
Ross D. King, et al., Nature, January 2004
9. The Robot Scientist
Source: BBC News
10. Yeast Metabolic Pathways
11. Hypothesis Generation and Experimentation Loop
Ross D. King, et al., Nature, January 2004
12. Integration of Artificial Intelligence
- Uses a Prolog database to store background biological information
- Prolog can inspect biological information, infer knowledge, and make predictions
- The optimal hypothesis is determined using machine learning, which weighs the probabilities of hypotheses against the associated cost of experiments
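
The idea of weighing probabilities against cost can be illustrated with a minimal Python sketch. This is not the method used by King et al.; it assumes a toy setting in which each candidate hypothesis has a prior probability, each experiment has a fixed cost and a yes/no outcome predicted by every hypothesis, and the experiment chosen is the one that is expected to rule out the most probability mass per unit cost. All names and numbers (`Experiment`, `h1`, `A`, the costs) are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Experiment:
    name: str
    cost: float
    # predictions[h] is True if hypothesis h predicts a positive outcome
    predictions: dict


def expected_eliminated_mass(exp, priors):
    """Expected probability mass of hypotheses refuted by running exp."""
    p_true = sum(p for h, p in priors.items() if exp.predictions[h])
    # A positive outcome (probability p_true) refutes mass 1 - p_true;
    # a negative outcome (probability 1 - p_true) refutes mass p_true.
    return 2.0 * p_true * (1.0 - p_true)


def choose_experiment(experiments, priors):
    """Pick the experiment with the best expected elimination per unit cost."""
    return max(
        experiments,
        key=lambda exp: expected_eliminated_mass(exp, priors) / exp.cost,
    )


def update(priors, exp, observed):
    """Drop hypotheses inconsistent with the observation and renormalise."""
    consistent = {
        h: p for h, p in priors.items() if exp.predictions[h] == observed
    }
    total = sum(consistent.values())
    if total == 0:
        raise ValueError("observation refutes every hypothesis")
    return {h: p / total for h, p in consistent.items()}


# Hypothetical example data, for illustration only.
priors = {"h1": 0.5, "h2": 0.3, "h3": 0.2}
experiments = [
    Experiment("A", 1.0, {"h1": True, "h2": True, "h3": False}),
    Experiment("B", 4.0, {"h1": True, "h2": False, "h3": False}),
    Experiment("C", 1.0, {"h1": True, "h2": True, "h3": True}),
]

best = choose_experiment(experiments, priors)
print(best.name)
```

In this toy data, experiment B splits the hypotheses most evenly but is four times as expensive, and experiment C is cheap but cannot distinguish any hypotheses, so the cost-weighted score prefers A. Repeating `choose_experiment` and `update` until one hypothesis remains gives a simple version of a hypothesis generation and experimentation loop.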
13. Experimental Results
- Performance similar to humans
- Performance significantly better than naïve or random selection of experiments
- For 70% classification accuracy:
  - A hundredth the cost of random selection
  - A third the cost of naïve selection
Ross D. King, et al., Nature, January 2004
14. Major Challenges and Research Issues
- Requires individuals with knowledge of both disciplines
- Requires collaboration of individuals from diverse disciplines
15. Major Challenges and Research Issues
- Data generation in biology/bioinformatics is outpacing methods of data analysis
- Data interpretation and generation of hypotheses require intelligence
- AI offers established methods for knowledge representation and intelligent data interpretation
- The use of AI in bioinformatics is predicted to increase
16. References and Additional Resources
- Ross D. King, Kenneth E. Whelan, Ffion M. Jones, Philip G. K. Reiser, Christopher H. Bryant, Stephen H. Muggleton, Douglas B. Kell, and Stephen G. Oliver. Functional Genomic Hypothesis Generation and Experimentation by a Robot Scientist. Nature 427 (15), 2004.
- A Short History of Bioinformatics - http://www.netsci.org/Science/Bioinform/feature06.html
- History of Bioinformatics - http://www.geocities.com/bioinformaticsweb/his.html
- National Center for Biotechnology Information - http://www.ncbi.nih.gov
- PubMed - http://www.pubmed.gov