QuASI: Question Answering using Statistics, Semantics, and Inference
1
QuASI: Question Answering using Statistics,
Semantics, and Inference
  • Marti Hearst, Jerry Feldman, Chris Manning,
    Srini Narayanan
  • Univ. of California-Berkeley / ICSI / Stanford
    University

2
Outline
  • Project Overview
  • Three topics
  • Assigning semantic relations via lexical
    hierarchies
  • From sentences to meanings via syntax
  • From text analysis to inference using conceptual
    schemas

3
Main Goals
  • Support Question-Answering and NLP in general
    by
  • Deepening our understanding of concepts that
    underlie all languages
  • Creating empirical approaches to identifying
    semantic relations from free text
  • Developing probabilistic inferencing algorithms

4
Two Main Thrusts
  • Text-based
  • Use empirical corpus-based techniques to extract
    simple semantic relations
  • Combine these relations to perform simple
    inferences
  • statistical semantic grammar
  • Concept-based
  • Determine language-universal conceptual
    principles
  • Determine how inferences are made among these

5
Relation Recognition (UCB)
  • Abbreviation Definition Recognition
  • TREC Genomics Track
  • Semantic Relation Identification

6
Abbreviation Detection (UCB)
  • Abbreviation Definition Recognition
  • Developed and evaluated new algorithm
  • Better results than existing approaches
  • Simpler and faster as well
  • Semantic Relation Identification
  • Developed syntactic chunker
  • Analyzed sample relations
  • Began development of a new computational model
  • Incorporates syntax and semantic labels
  • Test example: identify treatment for disease

7
Abbreviation Examples
  • Heat-shock protein 40 (Hsp40) enables Hsp70 to
    play critical roles in a number of cellular
    processes, such as protein folding, assembly,
    degradation and translocation in vivo.
  • Glutathione S-transferase pull-down experiments
    showed the direct interaction of in vitro
    translated p110, p64, and p58 of the essential
    CBF3 kinetochore protein complex with Cbf1p, a
    basic region helix-loop-helix zipper protein
    (bHLHzip) that specifically binds to the CDEI
    region on the centromere DNA.
  • Hpa2 is a member of the Gcn5-related
    N-acetyltransferase (GNAT) superfamily, a family
    of enzymes with diverse substrates including
    histones, other proteins, arylalkylamines and
    aminoglycosides.

8
The Algorithm
  • Much simpler than other approaches.
  • Extracts abbreviation-definition candidates
    adjacent to parentheses.
  • Finds correct definitions by matching characters
    in the abbreviation to characters in the
    definition, starting from the right.
  • The first character in the abbreviation must
    match a character at the beginning of a word in
    the definition.
  • To increase precision, a few simple heuristics
    are applied to eliminate incorrect pairs.
  • Example: Heat shock transcription factor (HSF).
  • The algorithm finds the correct definition, but
    not the correct alignment: the S matches the s in
    transcription rather than shock (see the sketch
    below).
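As a rough illustration, here is a minimal Python sketch of the right-to-left matching step described above. It is our reading of the slide, not the authors' code: the candidate extraction around parentheses and the precision heuristics are omitted.

```python
def find_definition(definition: str, abbrev: str):
    """Match abbreviation characters into the candidate definition,
    scanning from the right (a sketch of the matching rule above)."""
    d, a = definition.lower(), abbrev.lower()
    di = len(d) - 1
    for ai in range(len(a) - 1, -1, -1):
        c = a[ai]
        if not c.isalnum():
            continue
        # Scan left for a matching character; the first abbreviation
        # character must additionally start a word in the definition.
        while di >= 0 and (d[di] != c or
                           (ai == 0 and di > 0 and d[di - 1].isalnum())):
            di -= 1
        if di < 0:
            return None  # no valid definition found
        di -= 1
    return definition[di + 1:]

# Finds the correct definition even though the S aligns with the s
# in "transcription" rather than "shock":
print(find_definition("Heat shock transcription factor", "HSF"))
```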

9
Results
  • On the gold standard, the algorithm achieved 83%
    recall at 96% precision.
  • On a larger test collection, the results were 82%
    recall at 95% precision.
  • These results show that a very simple algorithm
    produces results comparable to those of the
    existing, more complex algorithms.

Counting partial matches and abbreviations missing
from the gold standard, our algorithm achieved 83%
recall at 99% precision.
10
TREC Task 1 Overview
  • Search 525,938 MedLine records
  • Titles, abstracts, MeSH category terms, citation
    information
  • Topics
  • Taken from the GeneRIF portion of the LocusLink
    database
  • We are supplied with gene names
  • Definition of a GeneRIF
  • For gene X, find all MEDLINE references that
    focus on the basic biology of the gene or its
    protein products from the designated organism. 
    Basic biology includes isolation, structure,
    genetics and function of genes/proteins in normal
    and disease states.

11
TREC Task 1 Sample Query
  • 3 2120 Homo sapiens OFFICIAL_GENE_NAME ets
    variant gene 6 (TEL oncogene)
  • 3 2120 Homo sapiens OFFICIAL_SYMBOL ETV6
  • 3 2120 Homo sapiens ALIAS_SYMBOL TEL
  • 3 2120 Homo sapiens PREFERRED_PRODUCT ets variant
    gene 6
  • 3 2120 Homo sapiens PRODUCT ets variant gene 6
  • 3 2120 Homo sapiens ALIAS_PROT TEL1 oncogene
  • The first column is the official topic number
    (1-50).
  • The second column contains the LocusLink ID for
    the gene.
  • The third column contains the name of the
    organism.
  • The fourth column contains the gene name type.
  • The fifth column contains the gene name.

12
TREC Task 1 Approach
  • Two main components
  • Retrieve relevant docs
  • May miss many because of variation in how gene
    names are expressed
  • Rank order them

13
TREC Task 1 Approach
  • Retrieval
  • Normalization of query terms
  • Special characters are replaced with spaces in
    both queries and documents.
  • Term expansion
  • A set of pattern-based rules is applied to the
    original list of query terms to expand the
    original set and increase recall (see the sketch
    after this list).
  • Some rules with lower confidence get a lower
    weight in the ranking step.
  • Stop word removal
  • Organism identification
  • Gene names are often shared across different
    organisms
  • Developed a method to automatically determine
    which MeSH terms correspond to LocusLink Organism
    terms
  • Retrieved Medline docs indicated by LocusLink
    links corresponding to a given organism
  • Organism terms were the most frequent MeSH
    categories among the selected docs
  • Used these terms to identify the organism term in
    Medline
  • An example of playing two databases off each
    other.
  • MeSH concepts
  • When an exact match is found between one of the
    query terms and a MeSH term assigned to a
    document, the document is retrieved.
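To make the normalization and expansion steps concrete, here is a small illustrative sketch. The regular expression and the expansion rules are assumptions chosen for illustration, not the system's actual rule set.

```python
import re

def normalize(term: str) -> str:
    # Slide 13: special characters are replaced with spaces
    # (queries and documents get the same treatment).
    return re.sub(r"[^A-Za-z0-9]+", " ", term).lower().strip()

def expand(term: str) -> set[str]:
    # Hypothetical pattern-based rules: vary spacing/hyphenation and
    # split trailing digits, to catch gene-name variants.
    variants = {term, term.replace(" ", ""), term.replace(" ", "-")}
    m = re.fullmatch(r"([a-z]+)(\d+)", term)
    if m:
        variants.add(f"{m.group(1)} {m.group(2)}")
    return variants

print(expand(normalize("Hsp-40")))  # {'hsp 40', 'hsp40', 'hsp-40'}
```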

14
TREC Task 1 Approach
  • Relevance ranking
  • IBM's DB2 Net Search Extender was used as the
    text search engine.
  • Scoring (see the sketch after this list)
  • Each query is a union of 5 different sub-queries:
  • titles,
  • abstracts,
  • titles using low confidence expansion rules,
  • abstracts using low confidence expansion rules,
    and
  • MeSH concepts.
  • Each sub-query returns a set of documents with a
    relevance score from the text search engine (or a
    fixed value for MeSH matches)
  • The aggregated score is the weighted SUM of the
    individual scores with optional weights applied
    to each sub-query score.
  • SUM performs better than MAX, since it gives
    higher confidence to documents found in multiple
    sub-queries.
  • Scores are normalized to be in the (0,1) range,
    by dividing the score by the highest aggregated
    score achieved for the query.
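A minimal sketch of this weighted-SUM aggregation with max-normalization; the sub-query names and weights below are invented for illustration.

```python
from collections import defaultdict

def aggregate(subquery_hits, weights):
    """subquery_hits: {sub_query_name: {doc_id: relevance_score}};
    weights: optional per-sub-query weights (illustrative values)."""
    combined = defaultdict(float)
    for name, hits in subquery_hits.items():
        w = weights.get(name, 1.0)
        for doc, score in hits.items():
            combined[doc] += w * score  # SUM rewards multi-sub-query hits
    top = max(combined.values(), default=1.0)
    return {doc: s / top for doc, s in combined.items()}  # normalize

ranked = aggregate(
    {"title": {"d1": 2.0, "d2": 0.5},
     "mesh":  {"d1": 1.0}},          # fixed value for MeSH matches
    {"title": 1.0, "mesh": 0.8},
)
print(ranked)  # d1 outranks d2: it was found by two sub-queries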

15
TREC Task 1 Approach
  • GeneRIF classification
  • A Naïve Bayes model is used to assign to each
    document the probability it is a GeneRIF.
  • MeSH terms are used as features.
  • Combination of text retrieval score and GeneRIF
    classification score.
  • We tried both an additive and a multiplicative
    approach; both behave similarly, with slightly
    better performance from the additive one (see the
    sketch below).
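A hedged sketch of the classification-and-combination step, using scikit-learn's Bernoulli Naive Bayes over binary MeSH-term features; the toy data and the mixing weight alpha are assumptions, not values from the system.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import BernoulliNB

# Toy MeSH-term documents and GeneRIF labels (illustrative data only).
mesh = ["Humans Apoptosis Liver", "Mice Kidney", "Humans Gene_Expression"]
is_generif = [1, 0, 1]

vec = CountVectorizer(binary=True)
clf = BernoulliNB().fit(vec.fit_transform(mesh), is_generif)

def final_score(retrieval_score, mesh_terms, alpha=0.3):
    # Additive combination of the two scores (slide 15); alpha is an
    # assumed weight chosen for illustration.
    p_generif = clf.predict_proba(vec.transform([mesh_terms]))[0, 1]
    return (1 - alpha) * retrieval_score + alpha * p_generif

print(final_score(0.8, "Humans Apoptosis"))
```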

16
TREC Task 1 Results
  • Performance is measured using the standard
    trec_eval program.
  • On training data
  • Best published result: 0.4125
  • With GeneRIF classifier: 0.5101
  • Without GeneRIF classifier: 0.5028
  • On testing data (turned in 8/4/03)
  • With GeneRIF classifier: 0.3933
  • Without GeneRIF classifier: 0.3768

17-26
(No Transcript)
27
The Stanford Lexicalized Parser: An open source
Java parser
  • Dan Klein, Roger Levy, and Chris Manning
  • Computer Science and Linguistics, Stanford
    University
  • http://nlp.stanford.edu/

28
Probabilistic parsing
  • Standard solutions: Collins 96, 99; Charniak 97,
    00
  • Capture word-specific trends by lexicalizing
    symbols.
  • Capture environment-specific trends by marking
    ancestors.
  • Benefits
  • Model context-freedom matches data
    context-freedom better.
  • Maximum posterior parses are correct more often.
  • Costs
  • State space becomes huge.
  • Joint estimates become extremely sparse.
  • Exact inference becomes infeasible.
  • Parsers become difficult to engineer.

NP becomes NP[rates] (lexicalization)
NP[rates] becomes NP^VP^S[rates] (ancestor marking)
We want to address these issues (see the sketch
below).
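A toy sketch of the two state-splitting moves above, on trees encoded as (label, children) tuples; the head rule is a stand-in for the hand-built head tables real parsers use.

```python
def head_word(tree):
    """Toy head rule: take the rightmost child's head
    (real parsers use head-percolation tables)."""
    label, kids = tree
    return kids if isinstance(kids, str) else head_word(kids[-1])

def lexicalize(tree):
    """Annotate each phrasal symbol with its head word."""
    label, kids = tree
    if isinstance(kids, str):
        return (label, kids)  # leave POS tags as-is
    return (f"{label}[{head_word(tree)}]",
            [lexicalize(k) for k in kids])

def parent_annotate(tree, parent="ROOT"):
    """Mark each phrasal symbol with its parent (deeper ancestor
    marking repeats the same move up the tree)."""
    label, kids = tree
    if isinstance(kids, str):
        return (label, kids)
    return (f"{label}^{parent}",
            [parent_annotate(k, label) for k in kids])

t = ("S", [("NP", [("NNS", "rates")]), ("VP", [("VBD", "fell")])])
print(lexicalize(t))       # NP becomes NP[rates]
print(parent_annotate(t))  # NP becomes NP^S
```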
29
Factoring Syntax and Semantics
Lexicalized tree T = (C, D), with P(T) = P(C) P(D)
Syntax C: P(C) is a standard PCFG; captures
structural patterns
Semantics D: P(D) is a dependency grammar; captures
word-word patterns
30
Efficient exact inference: the Factored A* Estimate
  • A* parsing will be efficient if we can find a
    tight upper bound on the true outside score α_T.
  • Finding the score of the best coherent pair (C, D)
    is as hard as parsing, but P(C) and P(D) alone are
    very simple, and so we can quickly find the bounds
    α_C and α_D.
  • These maximizations, considered jointly,
    effectively range over all pairs (C, D) instead of
    only coherent ones, so we know that α_T(E) ≤
    α_C(E) + α_D(E). We can therefore use a(E) =
    α_C(E) + α_D(E) as a good admissible estimate
    (see the sketch below).
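A schematic sketch of how the factored estimate drives best-first exploration; the edge and score types are placeholders, since the chart-parser machinery itself is not shown on the slide.

```python
import heapq

def best_first(edges, alpha_C, alpha_D, is_goal):
    """edges: {edge: inside log-score found so far};
    alpha_C / alpha_D: cheap outside bounds from the PCFG-only and
    dependency-only passes. Priority = inside + admissible outside
    estimate, so the first goal edge popped is optimal."""
    agenda = [(-(inside + alpha_C[e] + alpha_D[e]), e)
              for e, inside in edges.items()]
    heapq.heapify(agenda)
    while agenda:
        _, e = heapq.heappop(agenda)
        if is_goal(e):
            return e
        # (a real parser would combine e with neighboring edges
        # here and push the newly built edges onto the agenda)
    return None
```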

31
Results: Accuracy
Details: the syntactic Basic model is the unsmoothed
parent-annotated treebank covering grammar; best
includes other annotation. The semantic Basic model
is a word-word model smoothed by tags; best includes
a simple distance and valence model. Results on Penn
Treebank WSJ Section 23. Labeled bracketing is
average sentence F1. Gold dependencies induced
heuristically from gold parse trees. (Klein and
Manning, IJCAI 2003)
(Charts: Labeled Bracketing Accuracy (F1) and
Dependency Accuracy)
32
Results: Efficiency
  • The factored A* estimate reduces work by a factor
    of between 100 and 10,000 compared to exhaustive
    parsing.

(Chart: search work vs. sentence length)
  • Details
  • The parser uses the Eisner & Satta 99 O(n^4)
    schema (though the exponential observed growth
    suggests that so little work is being done that
    the dominant effect is the small-constant
    exponential function of the A* gap, not the
    large-constant polynomial function of the sentence
    length).
  • The total time is dominated by the plain-PCFG
    parse phase, which can be reduced.

33
Recent Focus: Accurate unlexicalized parsing
  • Most of the emphasis in the last decade has been
    on exploiting lexical dependencies
  • We show that accurate structural (syntactic)
    modeling has been highly underexploited
  • Strategy: deterministically refine the category
    set of a treebank so it better reflects important
    linguistic distinctions (and hence better models
    probabilistic dependencies); see the sketch after
    this list
  • Our best unlexicalized parsers outperform early
    lexicalized parsers (Klein and Manning, ACL 2003;
    cf. Magerman 1995: 84.7, Collins 1996: 86.0)
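For instance, a minimal sketch of one such deterministic refinement: splitting the IN tag into subordinating-complementizer vs. preposition uses (cf. slide 34). The trigger conditions and word list are an illustrative guess, not the paper's exact rule.

```python
# Illustrative word list, not the paper's actual criterion.
SUBORDINATING = {"that", "because", "although", "if", "while"}

def refine_tag(tag: str, word: str, parent_label: str) -> str:
    """Deterministically split IN by grammatical function."""
    if tag == "IN":
        if parent_label == "SBAR" or word.lower() in SUBORDINATING:
            return "IN^SC"  # subordinating complementizer
        return "IN^P"       # true preposition
    return tag

print(refine_tag("IN", "because", "SBAR"))  # IN^SC
print(refine_tag("IN", "into", "PP"))       # IN^P
```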

34
Recent Focus: Accurate unlexicalized parsing
  • E.g., representing subordinating complementizers
    in the category set fixes the PP parse in the
    example on the left (figure not reproduced)

35
Recent Focus: Accurate unlexicalized parsing
Note: development-set performance; final test set
(sentences of up to 40 words) F1 = 86.32
  • Illustrates the strength of the Factored Parser
    architecture: we can quickly and easily improve
    one component
  • Unlexicalized grammar is more domain-independent

36
Unlexicalized Sec. 23 Results
  • Beats first-generation lexicalized parsers.
  • Much of the power of lexicalization comes from
    closed-class monolexicalization.

37
Multilingual Parsing: Chinese. Syntactic sources
of ambiguity
  • English: PP attachment (well-understood);
    coordination scoping (less well-understood)
  • Chinese: modifier attachment is less of a problem,
    as verbal modifiers and direct objects aren't
    adjacent, and NP modifiers are overtly marked.

38
Chinese Performance
  • Close to state-of-the-art for Chinese parsing
  • Considerable difference in precision/recall split
    from other work suggests complementary
    strengths
  • Levy and Manning, ACL 2003

39
Recent Chinese results learning curve
  • New release of Chinese Treebank provides more
    data (~300,000 words)

40
Multilingual Parsing: German
  • Linguistic characteristics, relative to English
  • Ample derivational and inflectional morphology
  • Freer word order
  • Verb position differs in matrix/embedded clauses
  • Target corpus: Negra
  • 400,000 words of newswire text
  • Flatter phrase structure annotations (few PPs!)
  • Explicitly marked phrasal discontinuities

41
Current results (preliminary)
  • Area needing investigation: the word dependency
    model currently gives relatively little
    improvement.
  • Consistent with Dubey and Keller's findings that
    basic head-complement lexical dependencies harm
    performance for German (Negra)

42
Upcoming
  • Incorporation of morphological information into
    parsing model
  • Recently released TIGER corpus (similar to Negra,
    800,000 words)
  • Additional languages (Czech, Arabic)
  • Reconstruction of dislocated argument positions
    (common in German, Czech, many other languages)

43
Semantic Role Identification: Problem Statement
  • Given a sentence and a word of interest (the
    predicator) in that sentence
  • Find
  • The constituents related to that word and the
    nature of those relationships
  • The overarching relationship (the frame) for the
    word and its roles
  • Example: Tim drove his car to the store.
  • [Driver Tim] drove [Vehicle his car] [Goal to the
    store]
  • Relationship: Transportation (a sketch of this
    output structure follows)
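One possible way to encode the target output, as a sketch; the class and field names are ours, not the project's API.

```python
from dataclasses import dataclass

@dataclass
class RoleFiller:
    role: str   # e.g. "Driver"
    span: str   # the constituent text

@dataclass
class FrameAnalysis:
    predicator: str   # the word of interest
    frame: str        # the overarching relationship
    roles: list       # of RoleFiller

analysis = FrameAnalysis(
    predicator="drove",
    frame="Transportation",
    roles=[RoleFiller("Driver", "Tim"),
           RoleFiller("Vehicle", "his car"),
           RoleFiller("Goal", "to the store")],
)
```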

44
Annotated Examples
  • [Judge We] praised [Evaluee the syrup tart]
    extravagantly.
  • Her verse circulated to [Manner warm] [Judge
    critical] praise.
  • [Agent His brothers] avenged [Injured_party him].
  • [Selector The president] appoints [Leader a Prime
    Minister] [Conditions each year].
  • She bought [Count three] [Unit kilos] [Stuff of
    apples].
  • [Beh It] was [Degree really] mean [Evaluee of
    me].

45
Benefits of Solving the Problem
  • Identify that two syntactically different phrases
    play the same role
  • The board changed their ruling yesterday.
  • The ruling changed because of protests.
  • NLP: question answering, WSD, translation,
    summarization, speech recognition
  • Computational Biology: operon prediction
  • Security
  • Intrusion Detection
  • Credit Card Fraud

46
A Generative Model
47
Results: FrameNet I
48
Confusion Table: Roles Contributing Most to Error
(rows: correct; columns: guesses)
49
Results: FrameNet II
Test Set Accuracy
Comparable numbers for FrameNet I
50
Concept-based Analysis
  • Uniform formalism for encoding conceptual
    relations and grammatical constructions
  • Initial version of construction parser
  • Coordinated Relational Probabilistic Models for
    inference
51
Inference and Conceptual Schemas: Background
  • Hypothesis
  • Linguistic input is converted into a mental
    simulation based on bodily-grounded structures.
  • Components
  • Semantic schemas
  • image schemas and executing schemas are
    abstractions over neurally grounded perceptual
    and motor representations
  • Linguistic units
  • lexical and phrasal construction representations
    invoke schemas, in part through metaphor
  • Inference links these structures and provides
    parameters for a simulation engine

52
Conceptual Schemas
  • We have developed a formalism for encoding
    conceptual schemas.
  • Structured feature-structure representation (ECG).
  • Uniform representation for conceptual relations
    and for grammatical constructions.
  • Supports structured probabilistic inference.
  • Initial DAML+OIL implementation.
  • Produced by a construction parser.

53
Construction Parser
  • The parser maps from language input to a deep
    semantic specification
  • The semantic specification is a network of linked
    conceptual ECG schemas
  • Language and domain independent
  • Supports structured probabilistic inference.
  • First system running since November 2002
  • Uses novel parsing techniques combining chunking,
    unification, and semantic fit

54
State of Resource Development
  • MetaNet
  • Pilot System implemented
  • SQL-based backend (Michael Meisel, CS Undergrad).
  • Data-Entry GUI.
  • Database is being populated with Image Schemas
    (Ellen Dodge, Ling Grad)
  • FrameNet
  • DAML+OIL version of FrameNet-1
  • Combining FrameNet and WordNet for Semantic
    Extraction (Behrang Mohit, SIMS and ICSI,
    recently UTD)
  • Good use of FrameNet for QA (UTD, Stanford, CU)
  • Linking to external ontologies
  • ECG-OpenCyc Link (Preslav Nakov, Marco Barreno)

55
Dynamic Probabilistic Inference for Event
Structure
Srini Narayanan and Jerry Feldman, ICSI and UC
Berkeley
56
Scenario Question (CNS data)
  • How has Al-Qaida conducted its efforts to acquire
    WMD capability and what are the results of this
    endeavor?
  • Even with perfect parsing, to answer this
    question, we have to go beyond words in the input
    in at least the following ways
  • Multiple sources (reports, evidence, news)
  • Fusing information from unreliable sources
    (P(information true | source))
  • Non-monotonicity. Previous assertions or
    predictions may have to be retracted in the light
    of new evidence.
  • Modeling complex events
  • Evolving events with complex dynamics including
    sequence, concurrency, coordination,
    interruptions and resources.

57
Reasoning about Events for QA
  • Reasoning about dynamics
  • Complex event structure
  • Multiple stages, interruptions, resources
  • Evolving events
  • Conditional events, presuppositions.
  • Nested temporal and aspectual references
  • Past, future event references
  • Metaphoric references
  • Use of motion domain to describe complex events.
  • Reasoning with Uncertainty
  • Combining Evidence from Multiple, unreliable
    sources
  • Non-monotonic inference
  • Retracting previous assertions
  • Conditioning on partial evidence

58
Cognitive Semantics
  • Much of language and thought is directly embodied
    and relies on recurrent patterns of familiar
    experience
  • Image Schemas
  • Containment, Force Dynamics, Spatial Relations
  • Motor Schemas
  • Homeostasis, Source-Path-Goal, Monitoring,
    Aspect
  • Social Cognition
  • Authority, Care-giving, Play
  • Abstract Language and Thought derive a
    significant amount of their meaning from mappings
    to embodied schemas
  • Event Structure Metaphor, Projection invariants
    and Cogs (Aspect, topological relations), Frames,
    Mental Spaces.

59
Previous work
  • Models of event structure that are able to deal
    with the temporal and aspectual structure of
    events
  • Based on an active semantics of events and a
    factorized graphical model of complex states.
  • Models event stages, embedding, multi-level
    perspectives and coordination.
  • Event model based on a Stochastic Petri Net
    representation with extensions allowing
    hierarchical decomposition.
  • State is represented as a Temporal Bayes Net
    (T(D)BN).

60-62
(No Transcript)
63
Factorized Inference
64
Quantifying the model
65
Pilot System Results
  • Captures fine-grained distinctions needed for
    interpretation
  • Frame-based inferences (COLING 02)
  • Aspectual inferences (CogSci 98, IJCAI 99,
    COLING 02)
  • Metaphoric inferences (AAAI 99)
  • Sufficient inductive bias for verb learning
    (Bailey 97, CogSci 99) and construction learning
    (Chang 02, to appear)
  • Model for DAML-S (WWW 02, Computer Networks 03)

66
Extensions to Pilot System
  • Scalable Data Resources
  • Language Resources/Ontology
  • Lexicon (Open Source, WordNet, FrameNet)
  • Conceptual Relations
  • Schemas, Maps, Frames, Mental Space
  • General principle: use Semantic Web resources
    (DAML, DAML-S, OpenCyc, IEEE SUMO)
  • Language Analyzer
  • Construction Parser (ICSI/EML)
  • Statistical techniques (UCB/Stanford, CU, UTD)
  • Scalable Domain Representation
  • Coordinated Probabilistic Relational Models

67
Problems with DBN
  • Scaling up to relational structures
  • Supports linear (sequence) but not branching
    (concurrency, coordination) dynamics

68
Structured Probabilistic Inference
69
Probabilistic inference for QA
  • Filtering
  • P(X_t | o_1:t, X_1:t)
  • Update the state based on the observation
    sequence and state set
  • MAP Estimation
  • argmax_{h_1..h_n} P(X_t | o_1:t, X_1:t)
  • Return the best assignment of values to the
    hypothesis variables given the observations and
    states
  • Smoothing
  • P(X_{t-k} | o_1:t, X_1:t)
  • Modify assumptions about previous states, given
    the observation sequence and state set
  • Projection/Prediction/Reachability
  • P(X_{t+k} | o_1:t, X_1:t)
  • Predict future states based on the observation
    sequence and state set (a filtering sketch
    follows)
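As a concrete special case, the filtering query reduces to the familiar forward update of an HMM/DBN. A minimal NumPy sketch with made-up numbers; the CPRM system generalizes this to relational, branching dynamics.

```python
import numpy as np

def filter_step(belief, T, obs_lik):
    """One filtering update for P(X_t | o_1:t).
    belief: P(X_{t-1} | o_1:t-1); T[i, j] = P(X_t=j | X_{t-1}=i);
    obs_lik[j] = P(o_t | X_t=j)."""
    predicted = belief @ T           # project the state forward
    updated = predicted * obs_lik    # condition on the new observation
    return updated / updated.sum()   # renormalize

belief = np.array([0.5, 0.5])                 # two-state toy model
T = np.array([[0.9, 0.1], [0.2, 0.8]])        # transition model
belief = filter_step(belief, T, np.array([0.7, 0.1]))
print(belief)
```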

70
The CPRM algorithm
  • Combines insights from
  • the SVE algorithm for PRMs (Pfeffer 2000),
  • the frontier algorithm for temporal models
    (Murphy 2002), and
  • inference algorithms for complex, coordinated
    events (Narayanan 1999)
  • Expressive Probabilistic Modeling paradigm with
    relations and branching dynamics.
  • Offers principled methods to bound inferential
    complexity.

71
Summary
  • QA with complex scenarios (such as the CNS
    scenario/data) needs complex inference that deals
    with
  • Relational Structure
  • Uncertain source and domain knowledge
  • Complex dynamics and evolving events
  • We have developed a representation and inference
    algorithm that is capable of tractable inference
    for a variety of domains.
  • We are collaborating with UTD (Sanda Harabagiu)
    to apply these techniques to QA systems.

72
Putting it all Together
  • We explored two related levels of semantics
  • Universal conceptual schemas
  • Extracting semantic relations from text
  • In Phase I they remained separate
  • However, we came up with CPRMs as a common
    representational format
  • In Phase II we propose to combine them in a
    semantically based integrated QA system.