Part 3 Real World Applications: SumTime-Mousam

Provided by: Somay
Transcript and Presenter's Notes

1
Part 3 Real World Applications: SumTime-Mousam
2
In this lecture you learn
  • SumTime-Mousam
  • Knowledge acquisition
  • Design
  • Document planning
  • Microplanning
  • Realization
  • Evaluation
  • Post-edit
  • End-user

3
Introduction
  • So far we studied
  • Data analysis techniques
  • Time series data
  • Spatial data
  • Visualization techniques
  • NLG techniques
  • Now we will study
  • SumTime-Mousam
  • a weather forecast text generation system
  • HCE 3.0
  • a visual knowledge discovery tool

4
SumTime-Mousam
  • NLG system that automates the task of writing
    weather forecasts
  • Developed in our department
  • Input: Numerical Weather Prediction (NWP) data
  • Data samples for a few dozen parameters every
    hour or every 3 hours from two NWP models
  • Output: marine forecasts (forecasts for offshore
    oil rig applications)
  • Has been used by our industrial collaborator
    since June 2002.
  • Forecasts for 150 locations per day

5
Example
6
Example
7
Knowledge Acquisition (KA)
  • KA Tasks
  • Think aloud sessions
  • Direct Acquisition of knowledge
  • Onsite Observations
  • Corpus analysis
  • Collaborative prototype development

8
Corpus Description
  • SumTime-Meteo - parallel Text-Data Corpus
  • Size - 1045 parallel Text-Data units
  • Unit
  • NWP Model Data
  • Human Written Forecast Text
  • Similar in concept to statistical MT (Machine
    Translation)
  • Naturally Occurring
  • written for oilrig staff in the North Sea
  • Distribution of the Corpus
  • Available in the public domain

9
Parallel Text - Data
WSW 10-15 increasing 17-22 by early morning,
then gradually easing 9-14 by midnight.
10
Corpus Analyses
  • Meanings of Time phrases
  • Meanings of time phrases in terms of numerical
    data
  • required for lexical choice in summarization
  • No standard time phrase mappings exist
  • Numerical time values not mentioned in forecasts

11
Alignment
  • Step 1
  • Parsing the forecast texts
  • parser tuned for forecast text syntax
  • break the text into phrases
  • extract information such as wind speed and wind
    direction
  • parser carried forward values for the missing
    fields (shown later in the example)

12
Example
SSW 12-16 BACKING ESE 16-20 IN THE MORNING,
BACKING NE EARLY AFTERNOON THEN NNW 24-28 LATE
EVENING
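
A minimal sketch of Step 1 applied to the forecast text above. The slides do not describe the actual parser, so the keyword list, the regular expressions and the function name parse_wind_text are all assumptions made for illustration. The sketch splits the text into phrases, extracts wind direction and speed, and carries forward a value when a phrase omits it.

```python
import re

# Hypothetical change keywords; the real parser's vocabulary is not given.
VERBS = ("BACKING", "VEERING", "INCREASING", "EASING", "THEN")
DIRECTION = re.compile(
    r"\b(N|NNE|NE|ENE|E|ESE|SE|SSE|S|SSW|SW|WSW|W|WNW|NW|NNW)\b"
)
SPEED = re.compile(r"\b(\d{1,2})-(\d{1,2})\b")
# Split at the whitespace just before each change keyword.
SPLIT = re.compile(r"\s+(?=(?:" + "|".join(VERBS) + r")\b)")


def parse_wind_text(text):
    """Break a wind statement into phrases with direction, speed and time."""
    text = " ".join(text.upper().replace(",", " ").split())
    phrases = []
    prev_direction, prev_speed = None, None
    for chunk in SPLIT.split(text):
        d = DIRECTION.search(chunk)
        s = SPEED.search(chunk)
        # Carry forward the previous value when a field is missing.
        direction = d.group(1) if d else prev_direction
        speed = (int(s.group(1)), int(s.group(2))) if s else prev_speed
        # Whatever is left after removing keyword, direction and speed
        # is treated as the time phrase.
        rest = [
            tok for tok in chunk.split()
            if tok not in VERBS
            and not DIRECTION.fullmatch(tok)
            and not SPEED.fullmatch(tok)
        ]
        phrases.append({
            "text": chunk,
            "direction": direction,
            "speed": speed,
            "time": " ".join(rest) or None,
        })
        prev_direction, prev_speed = direction, speed
    return phrases


forecast = ("SSW 12-16 BACKING ESE 16-20 IN THE MORNING, "
            "BACKING NE EARLY AFTERNOON THEN NNW 24-28 LATE EVENING")
for phrase in parse_wind_text(forecast):
    print(phrase)
```

In this sketch the third phrase (BACKING NE EARLY AFTERNOON) has no speed of its own, so it inherits 16-20 from the phrase before it. Texts such as "THEN GRADUALLY EASING" would need extra handling that is left out here.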
13
Alignment (2)
  • Step 2
  • Associate each phrase with an entry in the input
    data set
  • 43% of the phrases matched with a single entry
    (without ambiguity)
  • heuristics used for improving the accuracy of
    alignment to 70%
  • Further improvements in alignment under
    investigation

14
Example (2)
Example Phrase: VEERING SW 10-14 BY EVENING
Input Data: 1800 SW
By evening -> 1800 hours
Example Phrase: BACKING ESE 16-20 IN THE MORNING
Input Data: 0600 ESE 18, 0900 ESE 16
In the morning -> 0600 hours
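
A minimal sketch of Step 2 for the second example above. A phrase is matched against input rows with the same wind direction and a speed inside the phrase's range. The slides do not say which heuristics resolve ambiguous matches, so taking the earliest matching hour is an assumption, chosen only because it agrees with the 0600 result shown. The function names are hypothetical.

```python
def candidate_entries(direction, speed_range, rows):
    """Hours of input rows whose direction matches and whose speed is in range."""
    low, high = speed_range
    return [hour for hour, row_dir, speed in rows
            if row_dir == direction and low <= speed <= high]


def align(direction, speed_range, rows):
    """Return (hour, unambiguous) for a phrase, or (None, False) if no match."""
    hours = candidate_entries(direction, speed_range, rows)
    if not hours:
        return None, False
    # Assumed tie-break (not from the slides): prefer the earliest hour.
    return min(hours), len(hours) == 1


# Input data from the example: 0600 ESE 18 and 0900 ESE 16.
rows = [(600, "ESE", 18), (900, "ESE", 16)]
hour, unambiguous = align("ESE", (16, 20), rows)
print(f"IN THE MORNING -> {hour:04d} hours (unambiguous: {unambiguous})")
```

Both rows match the phrase here, so this is one of the ambiguous cases that needs a heuristic rather than one of the phrases that matches a single entry.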
15
Results
16
Limitations of Corpus Analysis
  • Quality of knowledge acquired
  • good in some cases
  • poor in many cases
  • required clarifications from experts
  • Useful when used along with other KA techniques

17
KA Methodology
Directly Ask Experts for Knowledge
Initial Prototype
Structured KA with Experts
Corpus Analysis
Initial Version of Full System
Expert Revision
Final System
18
SumTime-Mousam Architecture
Control Data
  • Document planning
  • content selection and organisation
  • Microplanning
  • selecting words and phrases
  • ellipsis
  • Realisation
  • output text using the words and phrases by
    applying grammar rules
  • Control Data
  • derived from end user profile

19
Content Selection
  • What data items are worth picking up for the
    summary?
  • Reasoning from first principles - no detailed
    user model
  • Reusing data analysis techniques used by KDD
    community
  • Attractive
  • but not developed for communication
  • Adapting data analysis techniques to suit needs
    of communication using the Gricean Maxims

20
Data Analysis
  • Experts' View
  • Step Method
  • Report changes above thresholds (Significant
    changes)
  • Corpus View
  • Segmentation Method
  • Report changes in slope (report trends)

21
Example
  • MAGNUS / THISTLE / NW HUTTON, EAST OF SHETLAND

    day      hour  wind dir  wind speed (knots)
    20-1-01   6    S          4
    20-1-01   9    S          6
    20-1-01  12    S          7
    20-1-01  15    S         10
    20-1-01  18    S         12
    20-1-01  21    S         16
    21-1-01   0    S         18

  • FORECAST FOR 06-24 GMT, 20 Jan 2001
  • S 02-06 INCREASING 16-20 BY EVENING
22
Experts' View: Step Model
S 3-8 INCREASING 8-13 BY AFTERNOON AND 13-18 BY
EVENING.
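
A minimal sketch of the step method on the MAGNUS wind speeds from the earlier example. The 5 knot threshold is an assumption (the slides do not give the value used), and midnight on 21-1-01 is written as hour 24 to keep the hours increasing.

```python
def step_method(samples, threshold):
    """Report a sample whenever speed differs from the last reported
    speed by at least `threshold` knots."""
    reported = [samples[0]]
    for hour, speed in samples[1:]:
        if abs(speed - reported[-1][1]) >= threshold:
            reported.append((hour, speed))
    return reported


# (hour, wind speed in knots) from the example slide
samples = [(6, 4), (9, 6), (12, 7), (15, 10), (18, 12), (21, 16), (24, 18)]
print(step_method(samples, threshold=5))
```

With this assumed threshold the method reports changes at 1500 and 2100, which lines up with the two change points (BY AFTERNOON, BY EVENING) in the forecast above.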
23
Corpus View: Segmentation Model
S 3-8 INCREASING 15-20 BY MIDNIGHT.
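
A minimal sketch of a segmentation method on the same MAGNUS wind speeds. The slides do not name the segmentation algorithm, so this uses a simple top-down split with linear interpolation between retained points; the algorithm choice, the tolerance values and the function names are all assumptions.

```python
def interpolate(p, q, x):
    """Value at x on the straight line through points p and q."""
    (x0, y0), (x1, y1) = p, q
    return y0 + (y1 - y0) * (x - x0) / (x1 - x0)


def segment(samples, tolerance):
    """Top-down segmentation: keep the end points, and split at the
    sample with the largest vertical error if it exceeds `tolerance`."""
    first, last = samples[0], samples[-1]
    worst_err, worst_idx = 0.0, None
    for i in range(1, len(samples) - 1):
        x, y = samples[i]
        err = abs(y - interpolate(first, last, x))
        if err > worst_err:
            worst_err, worst_idx = err, i
    if worst_idx is None or worst_err <= tolerance:
        return [first, last]
    left = segment(samples[:worst_idx + 1], tolerance)
    right = segment(samples[worst_idx:], tolerance)
    return left[:-1] + right  # drop the duplicated split point


# (hour, wind speed in knots); midnight on 21-1-01 written as hour 24
samples = [(6, 4), (9, 6), (12, 7), (15, 10), (18, 12), (21, 16), (24, 18)]
print(segment(samples, tolerance=2.0))
print(segment(samples, tolerance=1.0))
```

With a tolerance of 2 knots the whole period collapses to a single trend from 0600 to midnight, the same shape as the one-change forecast above. A tighter tolerance of 1 knot adds one intermediate point.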
24
Gricean Maxims (Grice 1975)
  • Maxim of Quality: Try to make your contribution
    one that is true. More specifically:
  • Do not say what you believe to be false.
  • Do not say that for which you lack adequate
    evidence.
  • Maxim of Quantity:
  • Make your contribution as informative as is
    required (for the current purposes of the
    exchange).
  • Do not make your contribution more informative
    than is required.
  • Maxim of Relevance: Be relevant.
  • Maxim of Manner: Be perspicuous. More
    specifically:
  • Avoid obscurity of expression.
  • Avoid ambiguity.
  • Be brief.
  • Be orderly.

25
Application of Gricean Maxims - Example
  • Maxim of Quality
  • Try to report true values from the input data
  • Use linear interpolation instead of linear
    segmentation
  • Uncertainty in the input data needs to be
    communicated to the user

26
Sample Data
27
Linear Regression Vs Linear Interpolation
28
Linear Regression Vs Linear Interpolation (2)
  • Linear Regression
  • S 03-07 INCREASING 16-20 BY MIDNIGHT
  • Linear Interpolation
  • S 06-10 INCREASING 18-22 BY MIDNIGHT
  • Human Written Forecast
  • S 06-10 INCREASING 18-22 BY MIDNIGHT
  • Although linear regression looks better
    visually, forecasters do not use it.
  • Uncertainty
  • Speed values are given as ranges, e.g. 06-07,
    18-22
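
A minimal sketch of why interpolation fits the Maxim of Quality better than regression. The sample data for this comparison is not reproduced in the transcript, so the sketch reuses the MAGNUS wind speeds from the earlier example; its numbers will therefore not match the forecasts listed above. The function names are hypothetical.

```python
def least_squares_line(samples):
    """Slope and intercept of the ordinary least-squares line."""
    n = len(samples)
    mean_x = sum(x for x, _ in samples) / n
    mean_y = sum(y for _, y in samples) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in samples)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in samples)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def regression_endpoints(samples):
    """Fitted speeds at the first and last hour (may not occur in the data)."""
    slope, intercept = least_squares_line(samples)
    return (slope * samples[0][0] + intercept,
            slope * samples[-1][0] + intercept)


def interpolation_endpoints(samples):
    """Interpolation passes through the actual first and last samples."""
    return samples[0][1], samples[-1][1]


# (hour, wind speed in knots); midnight written as hour 24
samples = [(6, 4), (9, 6), (12, 7), (15, 10), (18, 12), (21, 16), (24, 18)]
print("regression:   ", regression_endpoints(samples))
print("interpolation:", interpolation_endpoints(samples))
```

The regression line starts and ends at speeds that never occur in the input, while interpolation reports values taken directly from the data. This is the sense in which interpolation "reports true values".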

29
Intrinsic Evaluation of content determination
  • Metrics
  • Short - Size (Accessibility)
  • Accurate - Error (Informativeness)
  • Size Computation
  • measured at the conceptual level
  • number of wind states
  • Error Computation
  • Vertical distance from the line of approximation
  • combined error in wind speed and wind direction
  • normalized

30
Results of Evaluation
  • Segmentation produces shorter summaries without
    losing accuracy
  • Details
  • In 16.5% of cases segmentation is better than
    step in both size and error
  • In 0.56% of cases the step method is better than
    segmentation in both size and error
  • In 2.5% of cases segmentation is better than
    step error wise but worse size wise
  • In 32% of cases segmentation is better than
    step size wise but worse error wise
  • In 31% of cases segmentation is better than
    step error wise but equal size wise

31
Micro-planning and Realization
  • Based on Parallel corpus analysis (described
    earlier) and
  • Expert KA/Revision
  • Details in Papers at
  • www.csd.abdn.ac.uk/research/sumtime/papers.html

32
SumTime-Mousam at Weathernews (UK) Ltd.
33
Post-edit Evaluation
  • Total number of forecasts analysed: 2728
  • 2728 texts divided into 73041 phrases
  • 7608 (10%) phrases could not be aligned
  • Alignment failures imply that forecasters are not
    happy with our content determination
  • Which is dependent on a process called
    segmentation
  • Forecasters seem to perform more sophisticated
    reasoning than simple segmentation

34
Analysis results (1)
  • Out of the successfully aligned phrases
  • 43914 phrases matched perfectly
  • 21519 phrases are mismatches
  • Detailed analysis of the mismatches
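
The post-edit counts can be cross-checked with a few lines of arithmetic. This sketch uses only the figures quoted on the post-edit slides; the helper name pct is made up for illustration.

```python
total_phrases = 73041
unaligned = 7608
perfect_matches = 43914
mismatches = 21519

aligned = total_phrases - unaligned
# The matched and mismatched phrases should account for every aligned phrase.
assert aligned == perfect_matches + mismatches


def pct(part, whole):
    """Percentage of `whole` represented by `part`, to one decimal place."""
    return round(100 * part / whole, 1)


print("unaligned:      ", pct(unaligned, total_phrases), "% of all phrases")
print("perfect matches:", pct(perfect_matches, aligned), "% of aligned phrases")
print("mismatches:     ", pct(mismatches, aligned), "% of aligned phrases")
```

The counts are consistent: the aligned phrases split into roughly two thirds perfect matches and one third mismatches.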

35
Analysis Results (2)
36
End-user Evaluation
  • 73 End-users (oil company staff supporting
    offshore oilrigs) participated in this evaluation
  • used forecasts produced by the following three
    methods
  • human written weather forecasts
  • SumTime-Mousam generated weather forecasts
  • SumTime-Mousam expressing human-selected content
  • Each participant completed a questionnaire that
    has two parts
  • Part 1
  • forecast produced by one of the above three
    methods (anonymous)
  • Participant is required to answer comprehension
    questions based on the forecast
  • Part 2
  • showed any two forecasts from the above three
    methods (anonymous)
  • Participant specified his/her preference for one
    of the two forecasts
  • The main result
  • end-users consider the SumTime-Mousam generated
    output linguistically better than human written
    forecasts
  • Content of SumTime-Mousam is not as good as human
    selected content

37
Conclusion
  • SumTime-Mousam is the result of knowledge
    obtained from
  • several knowledge acquisition studies
  • Expert based
  • Corpus based
  • Several evaluation studies
  • Intrinsic evaluation
  • Post-edit evaluation
  • End-user evaluation
  • The development of SumTime-Mousam went through
    many cycles
  • Building novel technology requires an iterative
    approach with multiple KA and evaluation studies