Title: Biological Text Mining
1 ECML PKDD 2008 European Conference on Machine
Learning and Principles and Practice of
Knowledge Discovery in Databases, Workshop on
High-Level Information Extraction, 15-19
September, Antwerp, Belgium
What's an Event? How Ontologies and
Linguistic Semantics Shape Upcoming Challenges
for Machine Learning Models of Event Extraction
Udo Hahn
Jena University Language and Information
Engineering (JULIE) Lab
www.julielab.de
2 Relations, Events, Semantic Roles, ...
SemEval
TDT
ACE
Narrative Report: John Doe is a naturalized
United States citizen from Iran. His immediate
family (father, mother and two sisters) still
live in Iran. He has traveled to Gaza 6x w/in
the last year.
3 Relational Knowledge Types
- Representation of Static Knowledge (invariable
states, are there any?)
- Conceptual: Is-a, Instance-of
- Conceptual: Part-of
- Attributive (properties, binary relations)
- located-in, citizen-of, has-socialsecurityno, ...
- Representation of Dynamic Knowledge (state
changes, dependencies among states)
- processes, actions, events
- sales, rating changes, transports, launches,
mergers & acquisitions, kidnappings, ...
4 Hand Washing as an Event
5 Hand Washing Event (1/2)
- http://www.mayoclinic.com/health/hand-washing/HQ00407
- Hand washing: A simple way to prevent infection
- Hand washing is a simple habit that can help keep
you healthy. Learn about the benefits of good
hand hygiene, as well as when to wash your hands
and how to clean them properly.
- Hand washing is a simple habit, one that
requires minimal training and no special
equipment. Yet it's one of the best ways to avoid
getting sick. This simple habit requires only
soap and warm water or an alcohol-based hand
sanitizer, a cleanser that doesn't require
water. Do you know the benefits of good hand
hygiene and when and how to wash your hands
properly?
6 Hand Washing Event (2/2)
- Proper hand-washing techniques
- Good hand-washing techniques include washing your
hands with soap and water or using an
alcohol-based hand sanitizer. Antimicrobial wipes
or towelettes are just as effective as soap and
water in cleaning your hands but aren't as good
as alcohol-based sanitizers.
- Antibacterial soaps have become increasingly
popular in recent years. However, these soaps are
no more effective at killing germs than are
regular soap and water. Using these soaps may
lead to the development of bacteria that are
resistant to the products' antimicrobial agents,
making it even harder to kill these germs in the
future. In general, regular soap is fine. The
combination of scrubbing your hands with soap
(antibacterial or not) and rinsing them with
water loosens and removes bacteria from your
hands.
- Proper hand washing with soap and water: Follow
these instructions for washing with soap and
water:
- Wet your hands with warm, running water and apply
liquid or clean bar soap. Lather well.
- Rub your hands vigorously together for at least
15 seconds.
- Scrub all surfaces, including the backs of your
hands, wrists, between your fingers and under
your fingernails.
- Rinse well.
- Dry your hands with a clean or disposable towel.
- Use a towel to turn off the faucet.
- Proper use of an alcohol-based hand
sanitizer: Alcohol-based hand sanitizers, which
don't require water, are an excellent
alternative to hand washing, particularly when
soap and water aren't available. They're actually
more effective than soap and water in killing
bacteria and viruses that cause disease.
Commercially prepared hand sanitizers contain
ingredients that help prevent skin dryness. Using
these products can result in less skin dryness
and irritation than hand washing.
- Not all hand sanitizers are created equal,
though. Some "waterless" hand sanitizers don't
contain alcohol. Use only the alcohol-based
products.
- To use an alcohol-based hand sanitizer:
- Apply about 1/2 tsp of the product to the palm of
your hand.
- Rub your hands together, covering all surfaces of
your hands, until they're dry.
- If your hands are visibly dirty, however, wash
with soap and water rather than a sanitizer.
7 ECML PKDD 2008 European Conference on Machine
Learning and Principles and Practice of
Knowledge Discovery in Databases, Workshop on
High-Level Information Extraction, 15-19
September, Antwerp, Belgium
What's Hand Washing? How Mayo Clinic
Guidelines Shape Upcoming Challenges for Machine
Learning Models of Event Extraction
Udo Hahn
Jena University Language and Information
Engineering (JULIE) Lab
www.julielab.de
8 Approaches to Event Modeling
- Events as entities
- Events as flat relations
- Features for flat events
- Decomposition into subevents
- Interleaving of subevents
- Hard-wired ordinal connectivity
- Triggered connectivity (e.g., via integrity
constraints, inference rules, etc.)
- Scripting for connectivity
9 Events as Entities: unary relations ...
HandWashing
- Issues
- just naming of a relation (variant of ER)
- no interrelations between any arguments
10 Flat Events: Yet another breed of n-ary relations
...
HandWashing( Agent, Patient,
Instruments, InState, OutState, ...)
- Issues
- Are there core arguments (complement/adjunct)?
- How many arguments (it's endless!)?
- Type checking/compatibility
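
The flat n-ary view and its type-checking issue can be sketched in Python. This is only an illustration under stated assumptions, not the speaker's formalization: the role names come from the slide, while the `Entity` class, the type labels (`Person`, `BodyPart`, `Substance`, `Artifact`) and the `ROLE_TYPES` table are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Entity:
    name: str
    etype: str  # hypothetical ontology type label


# Assumed type constraints per role; the slide only names the roles.
ROLE_TYPES = {
    "agent": {"Person"},
    "patient": {"BodyPart"},
    "instruments": {"Substance", "Artifact"},
}


@dataclass
class HandWashing:
    """Flat event: one relation, all arguments side by side."""
    agent: Entity
    patient: Entity
    instruments: list = field(default_factory=list)
    in_state: str = "dirty"
    out_state: str = "clean"

    def type_errors(self):
        """Return (role, filler) pairs that violate the assumed constraints."""
        errors = []
        if self.agent.etype not in ROLE_TYPES["agent"]:
            errors.append(("agent", self.agent.name))
        if self.patient.etype not in ROLE_TYPES["patient"]:
            errors.append(("patient", self.patient.name))
        for inst in self.instruments:
            if inst.etype not in ROLE_TYPES["instruments"]:
                errors.append(("instruments", inst.name))
        return errors


ok = HandWashing(
    Entity("John", "Person"),
    Entity("hands", "BodyPart"),
    [Entity("soap", "Substance"), Entity("water", "Substance")],
)
bad = HandWashing(Entity("soap", "Substance"), Entity("hands", "BodyPart"))
print(ok.type_errors(), bad.type_errors())
```

Note that the sketch says nothing about which arguments are core and which are adjuncts; every new role would simply become another field, which is the "endless arguments" problem in miniature.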
11 A remark on type constraints ...
12 Features for Flat Events
- HandWashing( Agent, Patient,
Instruments, InState, OutState, ...)
- Telicity: +/- (is there a point of completion?)
- Aspectuality: n (I swim vs. I am swimming)
- Tense: n (location on time axis:
past, now, future, ...)
- Issues
- Classifies verbs, ..., use for inferences?
- Classifies knowledge states (ongoing, result):
the splicing, I am splicing, I spliced, ...
13 Hand Washing as a Complex Activity
14 Decomposition into Subevents
HandWashing( ...)
Wet-w-Water( ... )
RubHands( ...)
Dry-w-Towel( ...)
ApplySoap( ... )
Rinse( ...)
- Issues
- Abstraction between event cover and subevents
- Subevent granularity
- Subevent reusability
- Completeness required?
- Or are there mandatory vs. optional subevents?
- Or are probabilities associated with subevents?
- Relevant vs. irrelevant intermediate steps
- the latter are often skipped in event descriptions
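
The completeness question can be made concrete with a small Python sketch. The subevent names follow the slide; the split into mandatory and optional subevents and the `match_cover_event` helper are assumptions introduced for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Subevent:
    name: str
    mandatory: bool = True


# Subevent names are from the slide; which ones are optional is an assumption.
HAND_WASHING = [
    Subevent("Wet-w-Water"),
    Subevent("ApplySoap"),
    Subevent("RubHands"),
    Subevent("Rinse"),
    Subevent("Dry-w-Towel", mandatory=False),
]


def match_cover_event(mentioned, template=HAND_WASHING):
    """Compare subevents mentioned in a text with the template.

    Returns (missing mandatory subevents, mentions unknown to the template).
    """
    known = {s.name for s in template}
    mentioned = set(mentioned)
    missing = [s.name for s in template
               if s.mandatory and s.name not in mentioned]
    unknown = sorted(mentioned - known)
    return missing, unknown


print(match_cover_event(["ApplySoap", "Rinse", "Shave"]))
```

A text that skips intermediate steps yields a non-empty `missing` list, so an extractor has to decide whether that signals a different event or merely an abbreviated description.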
15 Interleaving of Subevents: Hard-wired Ordinal
Connectivity
HandWashing( ...)
1. Wet-w-Water( ... )
2. ApplySoap( ... )
3. RubHands( ...)
4. Rinse( ...)
5. Dry-w-Towel( ...)
- Issues
- Orderings everywhere?
- Strict vs. partial
- Linear vs. parallel
- Many orderings? Defaults vs. exceptions
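
The difference between a strict and a partial ordering can be sketched with the standard-library `graphlib` module (Python 3.9+). The `STRICT` table encodes the slide's numbering 1 to 5; the `PARTIAL` variant, in which wetting and soaping may happen in either order, is an assumption made for the example, as are the helper names.

```python
from graphlib import TopologicalSorter  # Python 3.9+

# Each subevent maps to the set of subevents that must precede it.
STRICT = {
    "ApplySoap": {"Wet-w-Water"},
    "RubHands": {"ApplySoap"},
    "Rinse": {"RubHands"},
    "Dry-w-Towel": {"Rinse"},
}

# Assumed relaxation: wetting and soaping are unordered with respect to each other.
PARTIAL = {
    "RubHands": {"Wet-w-Water", "ApplySoap"},
    "Rinse": {"RubHands"},
    "Dry-w-Towel": {"Rinse"},
}


def one_valid_order(constraints):
    """Return one linearization that is consistent with the constraints."""
    return list(TopologicalSorter(constraints).static_order())


def respects(sequence, constraints):
    """Check an observed sequence; subevents that are not mentioned are ignored."""
    pos = {name: i for i, name in enumerate(sequence)}
    return all(
        pos[before] < pos[after]
        for after, befores in constraints.items() if after in pos
        for before in befores if before in pos
    )


soap_first = ["ApplySoap", "Wet-w-Water", "RubHands", "Rinse", "Dry-w-Towel"]
print(one_valid_order(STRICT))
print(respects(soap_first, STRICT), respects(soap_first, PARTIAL))
```

Because `respects` ignores unmentioned subevents, a description that skips steps still passes as long as the steps it does mention appear in a permitted order.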
16 Interleaving of Subevents: Triggered Connectivity
(ICs, Rules)
HandWashing( ...)
Wet-w-Water( ... )
RubHands( ...)
Dry-w-Towel( ...)
ApplySoap( ... )
Rinse( ...)
Conditions: A = hands fully soaped, B = no more soap left,
C = hands clean & dry
- Issues
- Formal reasoning required
- IC checker
- Inference engine
- How many relevant ICs/rules are there?
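
A minimal forward-chaining sketch shows how conditions, rather than hard-wired numbers, can sequence the subevents. Everything beyond the subevent names and the three conditions A, B, C is an assumption: which subevent each condition triggers, the auxiliary facts `wet` and `soap_applied`, the integrity constraint, and the rule format are all invented for illustration.

```python
# Each rule: subevent -> (preconditions, facts added, facts removed).
# The rules are listed in a scrambled order on purpose.
RULES = {
    "Wet-w-Water": (set(), {"wet"}, set()),
    "RubHands": ({"soap_applied"}, {"fully_soaped"}, set()),        # yields A
    "Dry-w-Towel": ({"no_soap_left"}, {"clean_and_dry"}, {"wet"}),  # needs B, yields C
    "ApplySoap": ({"wet"}, {"soap_applied"}, set()),
    "Rinse": ({"fully_soaped"}, {"no_soap_left"},
              {"soap_applied", "fully_soaped"}),                    # needs A, yields B
}


def violates_ic(facts):
    """Assumed integrity constraint: clean and dry hands are neither soaped nor wet."""
    return "clean_and_dry" in facts and bool(facts & {"fully_soaped", "wet"})


def run(rules, goal="clean_and_dry"):
    """Fire each triggered rule once until the goal fact holds."""
    facts, fired = set(), []
    while goal not in facts:
        ready = [name for name, (pre, _, _) in rules.items()
                 if name not in fired and pre <= facts]
        if not ready:
            raise RuntimeError("no rule is triggered; goal unreachable")
        name = ready[0]
        _, add, remove = rules[name]
        facts = (facts - remove) | add
        if violates_ic(facts):
            raise RuntimeError(f"integrity constraint violated after {name}")
        fired.append(name)
    return fired, facts


fired, facts = run(RULES)
print(fired)
```

With these rules exactly one subevent is triggered at every step, so the order emerges from the conditions alone. In a realistic rule base several rules may be ready at once, which is where a real inference engine and a conflict-resolution strategy would be needed.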
17 Interleaving of Subevents: Scripting for
(Inter)Connectivity
- Issues
- Massive knowledge acquisition bottleneck
- Representation format
- Doable at all?
- Hand Washing
- ...
- Drying your hands
- Pre: hands are wet & no soap left
- Act: fetch towel
- If towel not available: call towel maintenance
unit
- Towel alternatives: other paper or textile
tissues such as handkerchiefs, toilet paper, ...
- Post: hands are dry & clean
- NonDefaultPre: hands are wet & alcohol-based hand
sanitizer was used
- NonDefaultAct: wait until alcohol has evaporated
- Post: hands are dry & clean
- If not clean: wash hands with soap
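
The drying script above can be written down as data plus a small interpreter. This is a sketch under assumptions: the `Branch` record, the fact names, and the rule that the non-default branch is tried first are choices made for the example, not a proposed representation format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Branch:
    pre: frozenset   # facts that must hold before the act
    act: str         # action to perform
    post: frozenset  # facts that hold afterwards


DRY_CLEAN = frozenset({"dry", "clean"})

# The non-default branch comes first so the more specific precondition wins.
DRYING_SCRIPT = [
    Branch(frozenset({"wet", "sanitizer_used"}),
           "wait until alcohol has evaporated", DRY_CLEAN),
    Branch(frozenset({"wet", "no_soap_left"}), "fetch towel", DRY_CLEAN),
]


def dry_hands(state, towel_available=True):
    """Pick the first applicable branch; return (act, resulting state)."""
    state = frozenset(state)
    for branch in DRYING_SCRIPT:
        if branch.pre <= state:
            act = branch.act
            if act == "fetch towel" and not towel_available:
                act = "call towel maintenance unit or use a towel alternative"
            return act, (state - {"wet"}) | branch.post
    raise ValueError("no branch of the drying script applies")


print(dry_hands({"wet", "no_soap_left"}))
print(dry_hands({"wet", "sanitizer_used"}))
```

Even this one sub-script needs two branches, a fallback, and a list of alternatives, which gives a feel for the knowledge acquisition bottleneck named in the issues.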
18 Lexical Encoding of Events
- Verbs (mind your DG parser!)
- He washed his hands
- I'm washing my hands
- Perfective, progressive, aspectuality ...
- but cf. also stative verbs such as know, like, ...
- Nominalizations
- My hand washing was a nightmare
- Washing machines help you save time
- Adjectives, adverbs
- Hand-washed shirts are cleaner than those which
are machine-washed
19 Textual Encoding of Events
- Event cohesion
- He washed his hands. They were covered with mud
and thus needed extensive brushing.
- Event coherence
- He washed his hands. The soap smelt like peaches.
- He had washed his hands. Still the oil remained
on his skin.
20 What's so Special about Events?
- Moving from single, mostly binary relations to
sets of interrelated n-ary relations (n usually
> 2)
- Types of interrelations
- Precedence/succession
- (symbolic) temporal relations
- Temporal Interval Calculus: 13 atomic relations
- vs. time point (numerical clock) calculi
- logical entailment, causality
- Event granularity
- Default events & (lots of) exceptions
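
The 13 atomic relations of the temporal interval calculus (Allen's interval algebra) can be computed from numeric endpoints, which also illustrates the link between symbolic relations and time point calculi. The function name, the relation labels, and the example timings in seconds are assumptions made for this sketch.

```python
def allen_relation(a, b):
    """Return the atomic interval relation holding between a and b.

    Each interval is a (start, end) pair with start < end.
    """
    (a1, a2), (b1, b2) = a, b
    if a2 < b1:
        return "before"
    if b2 < a1:
        return "after"
    if a2 == b1:
        return "meets"
    if b2 == a1:
        return "met-by"
    if a1 == b1 and a2 == b2:
        return "equals"
    if a1 == b1:
        return "starts" if a2 < b2 else "started-by"
    if a2 == b2:
        return "finishes" if a1 > b1 else "finished-by"
    if b1 < a1 and a2 < b2:
        return "during"
    if a1 < b1 and b2 < a2:
        return "contains"
    # Remaining case: proper overlap with all four endpoints distinct.
    return "overlaps" if a1 < b1 else "overlapped-by"


# Hypothetical timings in seconds for two subevents of one hand washing.
wet = (0, 5)
rub = (5, 20)
print(allen_relation(wet, rub))
```

Enumerating all pairs of intervals over a small grid of endpoints yields exactly 13 distinct labels, matching the count on the slide.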
21 Formal Description of Knowledge Types
- Description of Static Knowledge
- Logic (FOL, in particular)
- Description of Dynamic Knowledge (state changes)
- Differential equations
- Qualitative physics, biology, economics,...
- Petri Nets (and other graph/network reps)
- Planning languages (STRIPS, PDDL, ASBRU, ...)
- Logics considered harmful
- dynamic PL, TLs (point/interval), MLs
22 (Some of) The Challenges of Event Representation
- Associating ontologies with textual realizations
- Linguistic categories (on-going process, result
of a process, etc.) matter → features
- Scalability from simple to advanced
representations (granularity sliding)
- Different breed of inference rules
- real world modeling
- Frame axioms (tracking changes of the world)
- Given all that representational richness: How
tractable are event calculi?
- Stay on the poor side of life? How adequate
will your results be?
23 Machine Learning Challenges of Event Extraction
- Formal basis of event description
- Symbolic, discrete KRs
- Learning the building blocks of complex events
- Sets of n-ary relations
- Learning connectivity criteria for these
(sub)relations
- Precedence/succession
- Temporal orderings
- ICs, inference rules → causality
- Numerical, continuous KRs
- Quantitative data → induction of differential
equations
- Methodological Frameworks
- Learning (timed, probabilistic) FSAs, Bayesian networks
- Event (process) mining: Petri Nets (IPM@ECML08)
- ILP
- Temporal logic-based learning (see also TimeML)
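
Learning precedence criteria from data can be illustrated with a deliberately naive sketch: keep every ordered pair of subevents that is never observed in the opposite order. This is not one of the frameworks listed above (it is neither a process-mining algorithm nor ILP), and the traces and shortened subevent names are invented for the example.

```python
from itertools import combinations


def induce_precedence(traces):
    """Return pairs (a, b) where a was seen before b and never after it."""
    observed = set()
    for trace in traces:
        for i, j in combinations(range(len(trace)), 2):
            if trace[i] != trace[j]:
                observed.add((trace[i], trace[j]))
    return sorted((a, b) for a, b in observed if (b, a) not in observed)


# Hypothetical subevent sequences, e.g. as extracted from three texts.
TRACES = [
    ["Wet", "Soap", "Rub", "Rinse", "Dry"],
    ["Soap", "Wet", "Rub", "Rinse", "Dry"],
    ["Wet", "Soap", "Rub", "Rinse"],
]

constraints = induce_precedence(TRACES)
print(constraints)
```

Because `Wet` and `Soap` occur in both orders, no constraint is induced between them, so the result is a partial rather than a strict ordering. A single noisy trace would delete a constraint outright, which is one motivation for the probabilistic models mentioned on the slide.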
24 Clinical Events: Guidelines
(for diabetes type 2)
National Institute for Health and Clinical
Excellence: http://www.nice.org.uk/nicemedia/pdf/CG66T2DQRG.pdf
25 Clinical Events: Formalization of Guidelines
Y. Shahar, S. Miksch, P. Johnson: AI in Medicine,
1997
26 Biological Events: Gene Regulation
"The process that modulates the frequency, rate
or extent of gene expression, where gene
expression is the process in which the coding
sequence of a gene is converted into a gene
product(s) (protein, RNA)." (Gene
Ontology)
Figure: Positive and Negative Regulation of Gene
Transcription (Expression). http://employees.csbsju.edu/hjakubowski/classes/ch331/bind/olbindtransciption.html