Title: Sep 2003, Chicago,1
1Eliciting Formal Models From Informal Requirements
- Some issues and an approach.
Nikhil Dinesh, David E. Arney, Aravind K. Joshi, Owen Rambow, Martha Palmer and Insup Lee
University of Pennsylvania
Chicago, September 24 2003
2Outline
- Introduction
- Example
- Specification Language for Requirements
- The approach
- Future work
3Issues
- What is the right specification language for requirements?
  - Natural Language
  - Formal Specification Language
  - Formal Specification Language with an NL-looking restricted language interface
- Evaluation
  - What metrics can be applied in the evaluation of such a system?
4Overall Approach
NL requirements
NL-based Finite State Machine
Corrections to requirements in NL
Requirements Engineer
Policy Specifier
Extended Finite State Machine
Verification Validation
Errors
5Approach to elicit an EFSM
- Work is preliminary, no implementation
- Outline the stages in eliciting an EFSM
  - Desired output representation at each stage
  - How to compute from the NL requirements in terms of
    - What is available
    - What is needed
- What is achievable with existing tools and what research is needed for the long range
An Extended Finite State Machine (EFSM) is an FSM with variables (as in the HASTEN project).
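To make "an FSM with variables" concrete, here is a minimal Python sketch in which each transition carries a guard over the variables and an update to them. The class names, the state names and the variable `tested` are hypothetical illustrations; they are not taken from the HASTEN project or from the authors' system.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Vars = Dict[str, object]


def always(variables: Vars) -> bool:
    """Default guard: the transition is always enabled."""
    return True


def no_update(variables: Vars) -> None:
    """Default update: leave the variables unchanged."""


@dataclass
class Transition:
    source: str
    target: str
    guard: Callable[[Vars], bool] = always
    update: Callable[[Vars], None] = no_update


@dataclass
class EFSM:
    state: str
    variables: Vars = field(default_factory=dict)
    transitions: List[Transition] = field(default_factory=list)

    def step(self) -> bool:
        """Take the first enabled transition; return False if none is enabled."""
        for t in self.transitions:
            if t.source == self.state and t.guard(self.variables):
                t.update(self.variables)
                self.state = t.target
                return True
        return False


# Hypothetical three-state machine: a sample arrives, is tested, is released.
machine = EFSM(
    state="sample_arrived",
    variables={"tested": False},
    transitions=[
        Transition("sample_arrived", "test_done",
                   guard=lambda v: not v["tested"],
                   update=lambda v: v.update(tested=True)),
        Transition("test_done", "released", guard=lambda v: bool(v["tested"])),
    ],
)
while machine.step():
    pass
print(machine.state, machine.variables)
```

The guard on the second transition is what distinguishes this from a plain FSM: the move to `released` depends on the value of a variable, not only on the current state.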
6Outline
- Introduction
- Example
- Specification Language for Requirements
- The approach
- Future work
7Example from FDA CFR 610.40
(a) Except as specified in paragraphs (c) and (d) of this section, you must test all samples for evidence of infection due to the following communicable disease agents:
(i) Human immunodeficiency virus
(ii) Hepatitis-B virus
A policy document from the Food and Drug Administration, Code of Federal Regulations. There are several volumes, each of which is updated once each calendar year and issued on a quarterly basis.
8Example from FDA CFR 610.40
(a) Except as specified in paragraphs (c) and (d) of this section, you must test all samples for evidence of infection due to the following communicable disease agents:
(i) Human immunodeficiency virus
(ii) Hepatitis-B virus
(b) To test for evidence of infection due to communicable disease agents in paragraph (a), you must use a screening test approved by the FDA.
9Example from FDA CFR 610.40
(a) Except as specified in paragraphs (c) and (d) of this section, you must test all samples for evidence of infection due to the following communicable disease agents:
(i) Human immunodeficiency virus
(ii) Hepatitis-B virus
(b) To test for evidence of infection due to communicable disease agents in paragraph (a), you must use a screening test approved by the FDA. You must perform one or more such tests, as necessary, to reduce adequately and appropriately the risk of transmission of communicable disease.
10Outline
- Goals
- Example
- Specification Language for requirements
- The approach
- Future work
11Specification Language for Requirements
- Natural Language (NL)
  - Specification is accessible to people
  - Properties should correspond to the requirements
  - Hard to compute properties
- Formal Specification Language (FSL)
  - Allows specification to be easily verified
  - Not easily accessible to people such as policy users
- Application should motivate the choice
  - What is the right choice for the policy domain?
12Policy
- Large number of policy documents in NL
  - Hand conversion to FSL is expensive and error-prone.
- Human readability
  - Policy is interpreted by humans.
  - Is this just an interface problem?
13Interfaces to Specification Languages
- Suppose policy requirements were in a formal specification language
  - Interface provides NL-looking, restricted language
  - How easily can it be read?
- We examine one such interface
14Specifying a Property
- PROPEL: An approach supporting property elucidation (Smith et al., 2002).
- For our example,
  - Core phrase: Exactly one occurrence of "arrival of a sample" eventually leads to one or more occurrences of "test for diseases".
  - Repetition phrase: The above behavior is repeatable.
  - Scope phrase: This property must hold before the first occurrence of P (where P is a state corresponding to "Exceptions in paragraphs (c) and (d)").
15How do properties relate to the NL document?
- Core phrase: Exactly one occurrence of "arrival of a sample" eventually leads to one or more occurrences of "test for diseases".
- Repetition phrase: The above behavior is repeatable.
- Scope phrase: This property must hold before the first occurrence of P.
Test for disease
Sentence 1
Exceptions in (c) and (d)
16How do properties relate to the NL document?
- Core phrase: Exactly one occurrence of "arrival of a sample" eventually leads to one or more occurrences of "test for diseases".
- Repetition phrase: The above behavior is repeatable.
- Scope phrase: This property must hold before the first occurrence of P.
One or more tests
Sentence 3
17How do properties relate to the NL document?
- Core phrase: Exactly one occurrence of "arrival of a sample" eventually leads to one or more occurrences of "test for diseases".
- Repetition phrase: The above behavior is repeatable.
- Scope phrase: This property must hold before the first occurrence of P.
Arrival of a sample
?
Repeatability
18How do properties relate to the NL document?
- To enforce use of a screening test
  - "test for disease" -> "screening test for disease"
  - "test for disease" from Sentence 1
  - "screening" from Sentence 2
- Correspondence becomes harder
  - Interface starts to look like an FSL
  - Not easily accessible to policy users
19Outline
- Goals
- Example
- Specification Language for requirements
- The approach
- Future work
20NL Document
EFSM-like representation from NL document
NLFSM
EFSM
21NLFSM
Agree to perform test
Agree to use screening test
Do the test
Have you tested ?
Have you reduced risk?
22NLFSM to EFSM
Clause a
V(a)
true
V(a) true
Clause b
23EFSM
24NL Document
Clause Connective Dependency Structure
NLFSM
Temporal Ordering Of Clauses
EFSM
25Clauses and Connectives
(a) Except as specified in paragraphs (c) and (d) of this section, you must test all samples for evidence of infection due to the following communicable disease agents:
(i) Human immunodeficiency virus
(ii) Hepatitis-B virus
Explicit Connective: Except
Clauses:
- as specified in paragraphs (c) and (d) of this section
- you must test all samples for evidence of infection due to the following communicable disease agents: (i) Human immunodeficiency virus (ii) Hepatitis-B virus
26Clauses and Connectives
(b) To test for evidence of communicable disease agents in paragraph (a), you must use a screening test approved by the FDA.
Implicit Connective: In order to
Clauses:
- To test for evidence of communicable disease agents in (a)
- You must use a screening test approved by the FDA
27Predicates and Arguments
John ate the apple
Predicate: ate(Agent, Object)
Dependency Structure:
ate
- (Agent) John
- (Object) The apple
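The predicate-argument structure on this slide can be held in a very small data type. The sketch below is only an illustration; the class name `Predicate` and its `render` method are hypothetical and not part of any parser mentioned in the talk.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class Predicate:
    """A head word together with its role-labelled arguments."""
    head: str
    args: Dict[str, str] = field(default_factory=dict)

    def render(self) -> str:
        # Roles are printed in the order in which they were supplied.
        roles = ", ".join(f"{role}={filler}" for role, filler in self.args.items())
        return f"{self.head}({roles})"


ate = Predicate("ate", {"Agent": "John", "Object": "the apple"})
print(ate.render())  # ate(Agent=John, Object=the apple)
```

The same shape would serve for a connective, with the connective as the head and its clauses as role fillers.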
28Arguments of Connectives
29Arguments of Connectives
- In order to
- Action
- Purpose
30Arguments of Connectives
- In order to
- Action
- Purpose
31Arguments across sentences
- If X is true do action A. Otherwise do action B.
Otherwise
(Condition) X is true
(Action) Do action B
32Computing the Dependency Structure
- Discourse Connectives vs Verbs
  - Arity: Verbs (1-3), Connectives (2)
  - Locating the arguments: easier for verbs
  - Roles of the arguments: verb-specific (Propbank), connective-specific (our approach)
33Computing the Dependency Structure
- Chunking
  - Except as specified in paragraphs (c) and (d) of this section you must test all samples for evidence of infection due to the communicable disease agents (1) Human immunodeficiency virus (2) Hepatitis B virus
  - Simple for the cases where there is only one connective in the sentence.
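For the single-connective case, a toy heuristic is enough to illustrate the idea: if the sentence opens with a known connective, split the remainder at the first comma. This is a sketch under strong assumptions (sentence-initial connective, comma as clause boundary); the connective list and the function name are hypothetical, and a real system would rely on a chunker or parser instead.

```python
from typing import Dict, Optional

# Hypothetical, deliberately tiny list of sentence-initial connectives.
CONNECTIVES = ("Except", "If", "Unless")


def split_on_connective(sentence: str) -> Optional[Dict[str, str]]:
    """Split 'CONNECTIVE subordinate clause, main clause' into its three parts."""
    for connective in CONNECTIVES:
        prefix = connective + " "
        if sentence.startswith(prefix):
            rest = sentence[len(prefix):]
            subordinate, comma, main = rest.partition(", ")
            if comma:  # only accept the split if a clause boundary was found
                return {"connective": connective,
                        "subordinate": subordinate,
                        "main": main}
    return None


chunks = split_on_connective(
    "Except as specified in paragraphs (c) and (d) of this section, "
    "you must test all samples for evidence of infection."
)
print(chunks)
```

A sentence with two or more connectives, or with commas inside the subordinate clause, defeats this heuristic, which is why richer dependency structure is needed in general.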
34Computing the Dependency Structure
- You should test donation for evidence of all
disease, except if it is a dedicated donation,
you need not test for diseases (a)(5) and (a)(6).
35Computing the Dependency Structure
- Three Generative, Lexicalized Models for Statistical Parsing, Michael Collins, Proceedings of the 35th Annual Meeting of the ACL, Madrid (1997)
- Statistical Parsing with an automatically-extracted tree adjoining grammar, David Chiang, Proceedings of the ACL, Hong Kong (2000)
- Discriminative Reranking for Natural Language Parsing, Michael Collins, ICML (2000)
36Computing the Dependency Structure
- Resources available
  - Parsers compute rich dependency structure at the sentence level
  - Accuracy of approximately 90% (Collins 2000)
  - Efforts underway for parsers at the discourse level (DLTAG, Discourse Treebank)
- Resources required
  - Some annotation of requirements documents
  - Labeling of connective-specific roles
37NL Document
Clause Connective Dependency Structure
NLFSM
Temporal Ordering Of Clauses
EFSM
38Temporal Ordering of Clauses
- Related to programs
  - Cannot impose an order from an operational perspective (there may be cycles)
  - But can an order be imposed from a syntactic perspective, the way we write programs?
39Temporal Ordering of Clauses
[Figure: the program fragment "while (i < 9) i = i + 1; i = i - 1" drawn two ways, as an Operational FSM and as a Temporal Tree, with nodes "i < 9", "i = i + 1" and "i = i - 1" and edges labeled True and False.]
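The contrast between the two views can be sketched in Python. The example assumes the fragment on this slide reads `while (i < 9) i = i + 1; i = i - 1`, which is a reconstruction of the slide text; the function `run` and the node labels are hypothetical. Executing the fragment visits nodes repeatedly (the operational view has a cycle), while the order in which nodes are first reached matches the order in which the program is written.

```python
def run(i):
    """Execute the fragment and log every node visited, in operational order."""
    visited = []
    while True:
        visited.append("i < 9")        # loop test
        if not i < 9:
            break                      # the False edge leaves the loop
        visited.append("i = i + 1")    # the True edge enters the body
        i = i + 1
    visited.append("i = i - 1")        # statement after the loop
    i = i - 1
    return i, visited


# Order in which the three nodes appear in the program text.
SYNTACTIC_ORDER = ["i < 9", "i = i + 1", "i = i - 1"]

final_i, visited = run(7)
first_visits = list(dict.fromkeys(visited))  # drop repeats, keep first occurrences
print(visited)
print(first_visits == SYNTACTIC_ORDER)
```

The operational trace has no fixed length and revisits the loop test, whereas the syntactic order is a fixed linear sequence that can be imposed without running anything.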
40Computing the Temporal Ordering (Between Clauses in a Sentence)
Exception before the Default
41Computing the Temporal Ordering (Between Clauses in a Sentence)
Action before the check for Purpose
42Computing the Temporal Ordering (Between Clauses in a Sentence)
Action before the check for Purpose
43Computing the Temporal Ordering (Between Clauses in Different Sentences)
Agreement to use a screening test precedes agreement to test
Noun-Verb link through the light verb (use)
Check that test has taken place should follow the test
Finite Verb Nonfinite Verb link
44Computing the Temporal Ordering (Between Clauses in Different Sentences)
45Computing the Temporal Ordering (Between Clauses in Different Sentences)
"tests" and "screening tests" are related, but the order is not quite clear from these in isolation. However, these clauses are both scoped by purpose clauses.
"reduce" follows "test". Desirable to keep scopes nested.
46(No Transcript)
47Computing the Temporal Ordering
- Resources available
  - Work on noun coreference (Morton)
  - New Machine Learning algorithms (Maximum Entropy Models, Conditional Random Fields, etc.)
- Research needed
  - Granularity of annotation
  - Incorporating world knowledge
48NL Document
Clause Connective Dependency Structure
NLFSM
Temporal Ordering Of Clauses
EFSM
49From Temporal Trees to NLFSM
- Find the phrases/connectives indicative of iterative constructs
  - while, for (connectives)
  - one or more (phrases)
- Identify the scope and add the back-edges in the temporal tree.
  - Scope given by purpose clauses or relations between verbs
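Once the scope of an iterative construct is known, adding the back-edge is mechanical. The sketch below assumes the temporally ordered clauses are already available as a list and that each loop scope is given as a pair of indexes. The clause labels are borrowed from the NLFSM slide; the scope (2, 4) is picked by hand for illustration, since identifying it automatically is the open problem, and the function name is hypothetical.

```python
from typing import List, Tuple

Edge = Tuple[str, str, str]  # (source, target, kind)


def to_fsm_edges(clauses: List[str], loop_scopes: List[Tuple[int, int]]) -> List[Edge]:
    """Chain the clauses in temporal order, then add one back-edge per loop scope."""
    edges = [(clauses[k], clauses[k + 1], "next") for k in range(len(clauses) - 1)]
    for first, last in loop_scopes:
        # Return from the last clause in scope to the first clause in scope.
        edges.append((clauses[last], clauses[first], "back"))
    return edges


clauses = [
    "Agree to perform test",
    "Agree to use screening test",
    "Do the test",
    "Have you tested?",
    "Have you reduced risk?",
]
# "one or more such tests": assume the loop runs from doing the test
# to the check that risk has been reduced.
edges = to_fsm_edges(clauses, loop_scopes=[(2, 4)])
for edge in edges:
    print(edge)
```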
50(No Transcript)
51Outline
- Goals
- Example
- Specification Language for requirements
- The approach
- Future work
52Future Work NLP
- Extending and adapting ongoing NLP work
- Discourse Structure
- Discourse Connectives
- Temporal relations between clauses
- Adapt existing tools
- Parsers
- Chunkers
- Shallow semantic parsers
53Future Work Verification
- Completeness
  - NL documents are usually underspecified
  - Harmless vs. harmful underspecification
    - Harmless: What happens if an FDA-approved screening test cannot be used?
    - Harmful: What happens if you cannot do the test? (because in repeated testing one might run out of a sample)
  - Relations between connectives
    - "if" without a corresponding "else" or "otherwise"
    - Some domain-specific way?
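One crude way to look for an "if" without a corresponding "else" or "otherwise" is a surface scan over sentences. The sketch below flags a sentence containing "if" when neither it nor the next sentence contains "else" or "otherwise". It is an assumption-laden illustration, not the authors' method: the one-sentence look-ahead window, the function name and the third example sentence are all hypothetical.

```python
import re
from typing import List

IF_RE = re.compile(r"\bif\b", re.IGNORECASE)
ELSE_RE = re.compile(r"\b(else|otherwise)\b", re.IGNORECASE)


def unmatched_ifs(sentences: List[str]) -> List[int]:
    """Return indexes of sentences with an 'if' and no nearby 'else'/'otherwise'."""
    flagged = []
    for k, sentence in enumerate(sentences):
        if not IF_RE.search(sentence):
            continue
        following = sentences[k + 1] if k + 1 < len(sentences) else ""
        if not (ELSE_RE.search(sentence) or ELSE_RE.search(following)):
            flagged.append(k)
    return flagged


sentences = [
    "If X is true do action A.",
    "Otherwise do action B.",
    "If Y is true do action C.",  # hypothetical sentence with no alternative given
]
print(unmatched_ifs(sentences))  # [2]
```

Whether a flagged sentence is a harmless or a harmful gap still has to be judged by a person or by domain knowledge; the scan only points at candidates.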
54Future Work Verification
- Consistency
  - Relations between variables preventing certain transitions from being taken
    - A donation shipped prior to testing cannot have been tested
  - Requires lexical knowledge and/or world knowledge
    - A shipped donation is no longer in the possession of the establishment
- Safety
  - Check if the system behaves as intended
    - A sample of blood should not be shipped before testing unless in the case of dire emergencies
  - Associating this statement with ones in the document
55Future Work Evaluation
- Evaluation
- Corpus-based evaluation of NLP techniques
- Comparison with models from other systems
- Expert evaluation of the models generated
56NP
D
N
the
end