Title: Computational Models of Text Quality
1Computational Models of Text Quality
- Ani Nenkova
- University of Pennsylvania
- ESSLLI 2010, Copenhagen
2The ultimate text quality application
- Imagine your favorite text editor
- With spell-checker and grammar checker
- But also functions that tell you
- Word W is repeated too many times
- "Fill the gap" is a cliché
- You might consider using this more figurative expression
- This sentence is unclear and hard to read
- What is the connection between these two sentences?
- ...
3Currently
- It is our friends who give such feedback
- Often conflicting
- We might agree that a text is good, but find it hard to explain exactly why
- Computational linguistics should have some answers
- Though far from offering a complete solution yet
4In this course
- We will overview research dealing with various aspects of text quality
- A unified approach does not yet exist, but many proposals
- have been tested on corpus data
- have been integrated in applications
5 Current applications: education
- Grading student writing
- Is this a good essay?
- One of the graders of SAT and GRE essays is in fact a machine! [1]
- http://www.ets.org/research/capabilities/automated_scoring
- Providing appropriate reading material
- Is this text good for a particular user?
- Appropriate grade level
- Appropriate language competency in L2 [2, 3]
- http://reap.cs.cmu.edu/
6 Current applications: information retrieval
- Particularly user generated content
- Questions and answers on the web
- Blogs and comments
- Searching over such content poses new problems [4]
- What is a good question/answer/comment?
- http://answers.yahoo.com/
- Relevant for general IR as well
- Of the many relevant documents, some are better written
7 Current applications: NLP
- Models of text quality
- lead to improved systems [5]
- offer possibilities for automatic evaluation [6]
- Automatic summarization
- Select important content and organize it into well-written text
- Language generation
- Select, organize and present content at the document, paragraph, sentence and phrase level
- Machine translation
8Text quality factors
- Interesting
- Style (clichés, figurative language)
- Vocabulary use
- Grammatical and fluent sentences
- Coherent and easy to understand
- In most types of writing, well-written means clear and easy to understand. Not necessarily so in literary works.
- Problems with clarity of instructions motivated a fair amount of early work.
9 Early work (keep in mind these predate modern computers!)
- Common words are easier to understand
- stentorian vs. loud
- myocardial infarction vs. heart attack
- Common words are short
- Standard readability metrics
- percentage of words not among the N most frequent
- average number of syllables per word
- Syntactically simple sentences are easier to understand
- average number of words per sentence
- Flesch-Kincaid, Automated Readability Index, Gunning-Fog, SMOG, Coleman-Liau (see the sketch below)
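The classic formulas combine exactly these surface counts. As an illustration only, a minimal sketch of the Flesch-Kincaid grade level, 0.39 * (words per sentence) + 11.8 * (syllables per word) - 15.59, with a crude vowel-group heuristic standing in for a real syllable counter:

```python
import re

def count_syllables(word):
    """Crude approximation: count groups of consecutive vowels."""
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

def flesch_kincaid_grade(text):
    """Flesch-Kincaid grade level:
    0.39 * (words per sentence) + 11.8 * (syllables per word) - 15.59"""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(count_syllables(w) for w in words)
    return (0.39 * len(words) / len(sentences)
            + 11.8 * syllables / len(words)
            - 15.59)

print(round(flesch_kincaid_grade(
    "Common words are short. Uncommon words are often long."), 1))
```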
10Modern equivalents
- Language models
- Word probabilities from a large collection
- http://www.speech.cs.cmu.edu/SLM_info.html
- Features derived from syntactic parse [2, 7, 8, 9]
- Parse tree height
- Number of subordinating conjunctions
- Number of passive voice constructions
- Number of noun and verb phrases
11Language models
- Unigram and bigram language models
- Really, just huge tables
- Smoothing necessary to account for unseen words
12Features from language models
- Assessing the readability of text t consisting of m words, for intended audience class c
- Number of out-of-vocabulary words in the text with respect to the language model for c
- Text likelihood and perplexity (see the sketch below)
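A minimal sketch of these three features, assuming a smoothed unigram model for audience class c is available as a simple word-to-probability table (the toy model and the smoothing floor are illustrative assumptions):

```python
import math

def lm_features(words, model, oov_prob=1e-7):
    """Readability features for a text given a unigram LM for audience class c.
    `model` maps word -> probability; `oov_prob` is an assumed smoothing floor."""
    oov = sum(1 for w in words if w not in model)
    log_lik = sum(math.log(model.get(w, oov_prob)) for w in words)
    perplexity = math.exp(-log_lik / len(words))
    return {"oov_count": oov, "log_likelihood": log_lik, "perplexity": perplexity}

# Toy model for an audience class c (probabilities would come from a large corpus)
model_c = {"the": 0.06, "cat": 0.001, "sat": 0.0008, "on": 0.02, "mat": 0.0005}
print(lm_features("the cat sat on the mat".split(), model_c))
```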
13 Application to grade level prediction (Collins-Thompson and Callan, NAACL 2004) [10]
14 Application to grade level prediction (Collins-Thompson and Callan, NAACL 2004) [10]
15 Results on predicting grade level (Schwarm and Ostendorf, ACL 2005) [11]
- Flesch-Kincaid Grade Level index
- number of syllables per word
- sentence length
- Lexile
- word frequency
- sentence length
- SVM features
- language models and syntax
16Models of text coherence
- Global coherence
- Overall document organization
- Local coherence
- Adjacent sentences
17 Text structure can be learnt in an unsupervised manner
- Human-written examples from a domain
- (Figure: recurring topics in earthquake reports, e.g. location, time, damage, magnitude, relief efforts)
18 Content model (Barzilay and Lee, 2004) [5]
- Hidden Markov Model (HMM)-based
- States: clusters of related sentences (topics)
- Transition prob.: sentence precedence in corpus
- Emission prob.: bigram language model
- (Figure: HMM over earthquake-report topics such as location, magnitude, casualties, relief efforts; transitions move between topics, emissions generate sentences within the current topic; a toy scoring sketch follows)
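To make the model concrete, a toy sketch in the spirit of the content model: a document ordering is scored by topic-transition probabilities and per-topic emission probabilities. The numbers are made up, and unigram emissions stand in for the bigram language models used in the paper:

```python
import math

# Toy content model: states are topics, transitions reflect topic precedence,
# emissions are per-topic language models (unigram here, purely for brevity).
transition = {("location", "damage"): 0.7, ("location", "relief"): 0.3,
              ("damage", "relief"): 0.8, ("damage", "damage"): 0.2}
emission = {"location": {"quake": 0.2, "struck": 0.2, "city": 0.1},
            "damage":   {"buildings": 0.2, "collapsed": 0.2},
            "relief":   {"aid": 0.3, "teams": 0.2}}

def score_ordering(sentences, topics, floor=1e-6):
    """Log-probability of a document under the toy content model,
    given an assignment of each sentence to a topic."""
    logp = 0.0
    for i, (sent, topic) in enumerate(zip(sentences, topics)):
        if i > 0:
            logp += math.log(transition.get((topics[i - 1], topic), floor))
        for word in sent.split():
            logp += math.log(emission[topic].get(word, floor))
    return logp

doc = ["quake struck city", "buildings collapsed", "aid teams"]
print(score_ordering(doc, ["location", "damage", "relief"]))
```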
19 Generating Wikipedia articles (Sauper and Barzilay, 2009) [12]
- Articles on diseases and American film actors
- Create templates of subtopics
- Focus only on subtopic level structure
- Use paragraphs from documents on the web
20Template creation
- Cluster similar headings
- signs and symptoms, symptoms, early symptoms
- Choose k clusters
- average number of subtopics in that domain
- Find majority ordering for the clusters
Biography: Early life, Career, Personal life, Death
Diseases: Symptoms, Causes, Diagnosis, Treatment
21Extraction of excerpts and ranking
- Candidates for a subtopic
- Paragraphs from top 10 pages of search results
- Measure relevance of candidates for that subtopic
- Features: unigrams, bigrams, number of sentences
22Need to control redundancy across subtopics
- Integer Linear Program
- Variables
- One per excerpt (value 1 if chosen, 0 otherwise)
- Objective
- Minimize sum of the ranks of the excerpts chosen
- (Figure: candidate excerpts ranked 1-5 for each subtopic: causes, symptoms, diagnosis, treatment)
- Constraints
- Cosine similarity between any selected pair < 0.5
- One excerpt per subtopic (see the ILP sketch below)
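A minimal sketch of such a selection ILP using the PuLP library; the excerpt names, ranks and similarities are illustrative, not taken from the paper:

```python
from pulp import LpProblem, LpMinimize, LpVariable, lpSum, LpBinary

# Illustrative data: (subtopic, rank) per candidate excerpt, plus pairwise cosine similarities.
excerpts = {"e1": ("causes", 1), "e2": ("causes", 2),
            "e3": ("symptoms", 1), "e4": ("symptoms", 3)}
cosine = {("e1", "e3"): 0.6, ("e1", "e4"): 0.2, ("e2", "e3"): 0.1, ("e2", "e4"): 0.3}

prob = LpProblem("excerpt_selection", LpMinimize)
x = {e: LpVariable(e, cat=LpBinary) for e in excerpts}   # 1 if the excerpt is chosen

# Objective: minimize the sum of ranks of the chosen excerpts
prob += lpSum(rank * x[e] for e, (_, rank) in excerpts.items())

# Exactly one excerpt per subtopic
for topic in {t for t, _ in excerpts.values()}:
    prob += lpSum(x[e] for e, (t, _) in excerpts.items() if t == topic) == 1

# No two selected excerpts may be too similar (cosine >= 0.5)
for (a, b), sim in cosine.items():
    if sim >= 0.5:
        prob += x[a] + x[b] <= 1

prob.solve()
print([e for e in excerpts if x[e].value() == 1])   # here: ['e2', 'e3']
```

With a real candidate pool the same program scales to one binary variable per excerpt and one constraint per over-similar pair.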
23 Linguistic models of coherence (Halliday and Hasan, 1976) [13]
- Coherent text is characterized by the presence of various types of cohesive links that facilitate text comprehension
- Reference and lexical reiteration
- Pronouns, definite descriptions, semantically related words
- Discourse relations (conjunction)
- I closed the window because it started raining.
- Substitution (one) or ellipsis (do)
24Referential coherence
- Centering theory
- tracking focus of attention across adjacent sentences [14, 15, 16, 17]
- Syntactic form of references
- Particularly first and subsequent mention [18, 19], pronominalization
- Lexical chains
- Identifying and tracking topics within a text [20, 21, 22, 23]
25Discourse relations
- Explicit vs. implicit
- I stayed home because I had a headache
- Signaled by a discourse connective
- Inferred without the presence of a connective
- I took my umbrella. (because) The forecast was for rain in the afternoon.
26Lexical chains
- Often discussed as a cohesion indicator and implemented in systems, but not used in text quality tasks
- Find all words that refer to the same topic
- Find the correct sense of the words
- LexChainer Tool: http://www1.cs.columbia.edu/nlp/tools.cgi [23]
- Applications: summarization, IR, spell checking, hypertext construction
- John bought a Jaguar. He loves the car.
- LC: jaguar, car, engine, it
27 Centering theory ingredients (Grosz et al., 1995)
- Deals with local coherence
- What happens to the flow from sentence to sentence
- Does not deal with global structuring of the text (paragraphs/segments)
- Defines coherence as an estimate of the processing load required to understand the text
28Processing load
- Upon hearing a sentence, a person
- Expends cognitive effort to interpret the expressions in the utterance
- Integrates the meaning of the utterance with that of the previous sentence
- Creates some expectations on what might come next
29Example
- John met his friend Mary today.
- He was surprised to see her.
- He thought she was still in Italy.
- Form of referring expressions
- Anaphora needs to be resolved
- Create a discourse entity at first mention with a full noun phrase
- Creating expectations
30Creating and meeting expectations
- (1) a. John went to his favorite music store to buy a piano.
- b. He had frequented the store for many years.
- c. He was excited that he could finally buy a piano.
- d. He arrived just as the store was closing for the day.
- (2) a. John went to his favorite music store to buy a piano.
- b. It was a store John had frequented for many years.
- c. He was excited that he could finally buy a piano.
- d. It was closing just as John arrived.
31Interpreting pronouns
- Terry really goofs sometimes.
- Yesterday was a beautiful day and he was excited about trying out his new sailboat.
- He wanted Tony to join him on a sailing expedition.
- He called him at 6am.
- He was sick and furious at being woken up so early.
32Basic centering definitions
- Centers of an utterance
- Set of entities serving to link that utterance to the other utterances in the discourse segment that contains it
- Not words or phrases themselves
- Semantic interpretations of noun phrases
33Types of centers
- Forward looking centers
- An ordered set of entities
- What could we expect to hear about next
- Ordered by salience as determined by grammatical function
- Subject > Indirect object > Object > Others
- John gave the textbook to Mary.
- Cf = {John, Mary, textbook}
- Preferred center, Cp
- The highest ranked forward looking center
- High expectation that the next utterance in the segment will be about Cp
34Backward looking center
- Single backward looking center, Cb(U)
- For each utterance other than the segment-initial one
- The backward looking center of utterance Un+1 connects with one of the forward looking centers of Un
- Cb(Un+1) is the most highly ranked element of Cf(Un) that is also realized in Un+1
35 Centering transitions ordering
- Cb(Un+1) = Cp(Un+1) and Cb(Un+1) = Cb(Un) (or Cb(Un) undefined): continue
- Cb(Un+1) != Cp(Un+1) and Cb(Un+1) = Cb(Un) (or Cb(Un) undefined): retain
- Cb(Un+1) = Cp(Un+1) and Cb(Un+1) != Cb(Un): smooth-shift
- Cb(Un+1) != Cp(Un+1) and Cb(Un+1) != Cb(Un): rough-shift
- (these two tests are implemented in the sketch below)
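The four transition types follow directly from the two tests above; a minimal sketch of the classification (the entity names in the usage line are from the Terry/Tony example later in the course):

```python
def centering_transition(cb_prev, cb_next, cp_next):
    """Classify the transition from utterance Un to Un+1 given
    Cb(Un), Cb(Un+1) and Cp(Un+1); cb_prev may be None (undefined)."""
    same_cb = cb_prev is None or cb_next == cb_prev
    if cb_next == cp_next:
        return "continue" if same_cb else "smooth-shift"
    return "retain" if same_cb else "rough-shift"

# "He called him at 6am." -> "Tony was sick and furious ..."
print(centering_transition(cb_prev="Terry", cb_next="Tony", cp_next="Tony"))  # smooth-shift
```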
36Centering constraints
- There is precisely one backward-looking center Cb(Un)
- Cb(Un+1) is the highest-ranked element of Cf(Un) that is realized in Un+1
37Centering rules
- If some element of Cf(Un) is realized as a pronoun in Un+1, then so is Cb(Un+1)
- Transitions are not equally preferred
- continue > retain > smooth-shift > rough-shift
38Centering analysis
- Terry really goofs sometimes.
- Cf = {Terry}, Cb = undefined
- Yesterday was a beautiful day and he was excited about trying out his new sailboat.
- Cf = {Terry, sailboat}, Cb = Terry, continue
- He wanted Tony to join him in a sailing expedition.
- Cf = {Terry, Tony, expedition}, Cb = Terry, continue
- He called him at 6am.
- Cf = {Terry, Tony}, Cb = Terry, continue
39 Centering analysis (continued)
- He called him at 6am.
- Cf = {Terry, Tony}, Cb = Terry, continue
- Tony was sick and furious at being woken up so early.
- Cf = {Tony}, Cb = Tony, smooth-shift
- He told Terry to get lost and hung up.
- Cf = {Tony, Terry}, Cb = Tony, continue
- Of course, Terry hadn't intended to upset Tony.
- Cf = {Terry, Tony}, Cb = Tony, retain
40 Rough shifts in evaluation of writing skills (Miltsakaki and Kukich, 2002)
- Automatic grading of essays by E-rater
- Syntactic variety
- Represented by features that quantify the occurrence of clause types
- Clear transitions
- Cue phrases in certain syntactic constructions
- Existence of main and supporting points
- Appropriateness of the vocabulary content of the essay
- What about local coherence?
41Essay score model
- Human score available
- E-rater prediction available
- Percentage of rough-shifts in each essay (analysis done manually)
- Negative correlation between the human score and the percentage of rough-shifts
42 Essay score model (continued)
- Linear multi-factor regression
- Approximate the human score as a linear function of the e-rater prediction and the percentage of rough-shifts
- Adding rough shifts significantly improves the model of the score
- 0.5 improvement on a 1-6 scale
- How easy/difficult would it be to fully automate the rough-shift variable?
43 Variants of centering and application to information ordering
- Karamanis et al., 2009 is the most comprehensive overview of variants of centering theory and an evaluation of centering in a specific task related to text quality
44Information ordering task
- Given a set of sentences/clauses, what is the best presentation?
- Take a newspaper article and jumble the sentences---the result will be much more difficult to read than the original
- Negative examples constructed by randomly permuting the original
- Criteria for deciding which of two orderings is better
- Centering would definitely be applicable
45Centering variations
- Continuity (NOCB = lack of continuity)
- Cf(Un) and Cf(Un+1) share at least one element
- Coherence
- Cb(Un) = Cb(Un+1)
- Salience
- Cb(U) = Cp(U)
- Cheapness (fulfilled expectations)
- Cb(Un+1) = Cp(Un)
46Metrics of coherence
- M.NOCB (no continuity)
- M.CHEAP (expectations not met)
- M.KP: sum of the violations of continuity, cheapness, coherence and salience
- M.BFP: seeks to maximize transitions according to Rule 2
47Experimental methodology
- Gold-standard ordering
- The original order of the text (object description, news article)
- Assume that other orderings are inferior
- Classification error rate
- Percentage of orderings that score better than the gold-standard + 0.5 * percentage of the orderings that score the same (see the sketch below)
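A minimal sketch of this error rate, under the assumption that a higher metric score means a more coherent ordering:

```python
def classification_error_rate(gold_score, permutation_scores):
    """Fraction of random permutations that a coherence metric ranks at least
    as high as the gold ordering (ties counted as half an error).
    Assumes higher score = more coherent."""
    better = sum(1 for s in permutation_scores if s > gold_score)
    ties = sum(1 for s in permutation_scores if s == gold_score)
    return (better + 0.5 * ties) / len(permutation_scores)

print(classification_error_rate(0.8, [0.9, 0.8, 0.5, 0.3]))  # (1 + 0.5*1)/4 = 0.375
```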
48Results
- NOCB gives best results
- Significantly better than the other metrics
- Consistent results for three different corpora
- Museum artifact descriptions (2)
- News
- Airplane accidents
- M.BFP is the second best metric
50 Entity grid (Barzilay and Lapata, 2005, 2008)
- Inspired by centering
- Tracks entities across adjacent sentences, as well as their syntactic positions
- Much easier to compute from raw text
- Brown Coherence Toolkit
- http://www.cs.brown.edu/melsner/manual.html
51Entity grid applications
- Several applications, with very good results
- Information ordering
- Comparing the coherence of pairs of summaries
- Distinguishing readability levels
- Child vs. adult
- Improves over Petersen & Ostendorf
52Entity grid example
- 1. The Justice Department [S] is conducting an anti-trust trial [O] against Microsoft Corp. [X] with evidence [X] that the company [S] is increasingly attempting to crush competitors [O].
- 2. Microsoft [O] is accused of trying to forcefully buy into markets [X] where its own products [S] are not competitive enough to unseat established brands [O].
- 3. The case [S] revolves around evidence [O] of Microsoft [S] aggressively pressuring Netscape [O] into merging browser software [O].
- 4. Microsoft [S] claims its tactics [S] are commonplace and good economically.
- 5. The government [S] may file a civil suit [O] ruling that conspiracy [S] to curb competition [O] through collusion [X] is a violation of the Sherman Act [O].
- 6. Microsoft [S] continues to show increased earnings [O] despite the trial [X].
53Entity grid representation
54 16 entity grid features
- The probability of each type of transition in the text (a computation sketch follows below)
- Four syntactic distinctions
- S, O, X, _
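A minimal sketch of computing the 16 transition probabilities from a grid, assuming the grid is given as entity-to-role-sequence mappings (the tiny grid below is illustrative, not the full Microsoft example above):

```python
from collections import Counter
from itertools import product

def entity_grid_features(grid):
    """16 transition probabilities from an entity grid.
    `grid` maps entity -> list of roles per sentence ('S', 'O', 'X' or '_')."""
    counts = Counter()
    for roles in grid.values():
        for a, b in zip(roles, roles[1:]):          # transitions between adjacent sentences
            counts[(a, b)] += 1
    total = sum(counts.values()) or 1
    return {f"{a}{b}": counts[(a, b)] / total for a, b in product("SOX_", repeat=2)}

# Tiny grid for a 3-sentence text (roles are illustrative)
grid = {"Microsoft": ["S", "O", "S"], "trial": ["O", "_", "X"], "Netscape": ["_", "_", "O"]}
print(entity_grid_features(grid)["SO"])   # probability of an S -> O transition
```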
55 Type of reference and info ordering (Elsner and Charniak, 2008)
- Entity grid features are not concerned with how an entity is mentioned
- Discourse old vs. discourse new
- Kent Wells, a BP senior vice president, said on Saturday during a technical briefing that the current cap, which has a looser fit and has been diverting about 15,000 barrels of oil a day to a drillship, will be replaced with a new one in 4 to 7 days.
- The new cap will take 4 to 7 days to be installed, and in case the new cap is not effective, Mr. Wells said engineers were prepared to replace it with an improved version of the current cap.
56 Type of reference and info ordering (continued)
- The probability of a given sequence of discourse-new and discourse-old realizations gives a further indication about ordering
- Similarly, pronouns should have reasonable antecedents
- Adding both models to the entity grid improves performance on the information ordering task
57Sentence Ordering
- n sentences
- Output from a generation or summarization system
- Find most coherent ordering
- n! permutations
- With local coherence metrics
- Adjacent sentence flow
- Finding the best ordering is NP-complete
- Reduction from Traveling Salesman Problem
58 Word co-occurrence model (Lapata, ACL 2003; Soricut and Marcu, 2005) [23, 24]
- Idea from statistical machine translation
- Alignment models
- (Figure: word-alignment analogy. MT aligns "John went to a restaurant. He ordered fish. The waiter was very attentive." with its French translation "John est allé à un restaurant. Il ordonna de poisson. Le garçon était très attentif.", giving pairs such as P(fish | poisson). Ordering aligns each sentence with the next one, e.g. "John went to a restaurant. He ordered fish. The waiter was very attentive." with "He ordered fish. The waiter was very attentive. John gave him a huge tip.", and "We ate at a restaurant yesterday." with "We also ordered some take away.", giving pairs such as P(ordered | restaurant), P(waiter | ordered), P(tip | waiter). A small training/scoring sketch follows.)
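A minimal sketch of the adjacent-sentence word co-occurrence idea: estimate P(w2 | w1) for word pairs spanning adjacent sentences and score an ordering with these probabilities. Whitespace tokenization and the smoothing floor are simplifying assumptions, not the actual models of Lapata or Soricut and Marcu:

```python
from collections import Counter
from itertools import product
import math

def train_cooccurrence(documents):
    """Count (w1, w2) pairs with w1 in sentence i and w2 in sentence i+1."""
    pair_counts, prev_counts = Counter(), Counter()
    for doc in documents:
        for s1, s2 in zip(doc, doc[1:]):
            for w1, w2 in product(s1.split(), s2.split()):
                pair_counts[(w1, w2)] += 1
                prev_counts[w1] += 1
    return pair_counts, prev_counts

def score_ordering(sentences, pair_counts, prev_counts, floor=1e-4):
    """Log-probability of an ordering: sum over adjacent sentence pairs
    of log P(w2 | w1) for all cross-sentence word pairs."""
    logp = 0.0
    for s1, s2 in zip(sentences, sentences[1:]):
        for w1, w2 in product(s1.split(), s2.split()):
            p = pair_counts[(w1, w2)] / prev_counts[w1] if prev_counts[w1] else floor
            logp += math.log(max(p, floor))
    return logp

docs = [["we ate at a restaurant", "we ordered fish", "the waiter was attentive"]]
pc, prc = train_cooccurrence(docs)
print(score_ordering(["we ate at a restaurant", "we ordered fish"], pc, prc) >
      score_ordering(["we ordered fish", "we ate at a restaurant"], pc, prc))  # True
```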
59Discourse (coherence) relations
- Only recently have empirical results shown that discourse relations are predictive of text quality (Pitler and Nenkova, 2008)
60PDTB discourse relations annotations
- Largest corpus of annotated discourse relations
- http://www.seas.upenn.edu/pdtb/
- Four broad classes of relations
- Contingency
- Comparison
- Temporal
- Expansion
- Explicit and implicit
61Implicit and explicit relations
- (E1) He is very tired because he played tennis all morning.
- (E2) He is not very strong but he can run amazingly fast.
- (E3) We had some tea in the afternoon and later went to a restaurant for a big dinner.
- (I1) I took my umbrella this morning. (because) The forecast was for rain.
- (I2) She is never late for meetings. (but) He always arrives 10 minutes late.
- (I3) She woke up early. (afterwards) She had breakfast and went for a walk in the park.
62 What is the relative importance of factors in determining text quality?
- Competent readers (native English speakers)
- graduate students at Penn
- Wall Street Journal texts
- 30 texts rated on a scale of 1 to 5
- How well-written is this article?
- How well does the text fit together?
- How easy was it to understand?
- How interesting is the article?
63 (continued)
- Several judgments for each text
- Final quality score was the average
- Scores range from 1.5 to 4.33
- Mean 3.2
64 (continued)
- Which of the many indicators will work best?
- Usually research studies focus on only one or two
- How do indicators combine?
- Metrics
- Correlation coefficient
- Accuracy of pair-wise ranking prediction
65 Correlation coefficients between assessor ratings and different features
66Baseline measures
- Average Characters/Word
- r -.0859 (p .6519)
- Average Words/Sentence
- r .1637 (p .3874)
- Max Words/Sentence
- r .0866 (p .6489)
- Article length
- r -.3713 (p .0434)
67Vocabulary factors
- Language model probability of the article
- M estimated from PTB (WSJ)
- M estimated from general news (NEWS)
68Correlations with well-written assessment
- Log likelihood, WSJ
- r .3723 (p .0428)
- Log likelihood, NEWS
- r .4497 (p .0127)
- Log likelihood with length, WSJ
- r .3732 (p .0422)
- Log likelihood with length, NEWS
- r .6359, p .0002
69Syntactic features
- Average parse tree height
- r -.0634 (p .7439)
- Avr. number of noun phrases per sentence
- r .2189 (p .2539)
- Average SBARs
- r .3405 (p .0707)
- Avr. number of verb phrases per sentence
- r .4213 (p .0228)
70Elements of lexical cohesion
- Avr. cosine similarity between adjacent sents
- r -.1012 (p .5947)
- Avr. word overlap between adjacent sentences
- r -.0531, p .7806
- Avr. Noun+Pronoun Overlap
- r .0905, p .6345
- Avr. Pronouns/Sent
- r .2381, p .2051
- Avr Definite Articles
- r .2309, p .2196
71Correlation with well-written score
- Prob. of S-S transition
- r -.1287 (p .5059)
- Prob. of S-O transition
- r -.0427 (p .8261)
- Prob. of S-X transition
- r -.1450 (p .4529)
- Prob. of S-N transition
- r .3116 (p .0999)
- Prob. of O-S transition
- r .1131 (p .5591)
- Prob. of O-O transition
- r .0825 (p .6706)
- Prob. of O-X transition
- r .0744 (p .7014)
- Prob. of O-N transition
- r .2590 (p .1749)
72 (continued)
- Prob. of X-S transition
- r .1732 (p .3688)
- Prob. of X-O transition
- r .0098 (p .9598)
- Prob. of X-X transition
- r -.0655 (p .7357)
- Prob. of X-N transition
- r .1319 (p .4953)
- Prob. of N-S transition
- r .1898 (p .3242)
- Prob. of N-O transition
- r .2577 (p .1772)
- Prob. of N-X transition
- r .1854 (p .3355)
- Prob. of N-N transition
- r -.2349 (p .2200)
73 Well-writtenness and discourse
- Log likelihood of discourse rels
- r .4835 (p .0068)
- # of discourse relations
- r -.2729 (p .1445)
- Log likelihood of rels with # of rels
- r .5409 (p .0020)
- # of relations with # of words
- r .3819 (p .0373)
- Explicit relations only
- r .1528 (p .4203)
- Implicit relations only
- r .2403 (p .2009)
74 Summary: significant factors
- Log likelihood of discourse relations
- r .4835
- Log likelihood, NEWS
- r .4497
- Average verb phrases per sentence
- r .4213
- Log likelihood, WSJ
- r .3723
- Number of words
- r -.3713
75Text quality prediction as ranking
- Every pair of texts with ratings differing by 0.5
- Features are the difference of feature values for each text
- Task: predict which of the two articles has the higher text quality score (see the sketch below)
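A minimal sketch of turning rated articles into pairwise ranking examples; the 0.5 gap follows the setup above, while the toy feature vectors are made up:

```python
from itertools import combinations

def pairwise_examples(articles, min_gap=0.5):
    """Turn rated articles into pairwise ranking examples.
    `articles`: list of (feature_vector, quality_rating). For each pair whose
    ratings differ by at least `min_gap`, the example is the feature difference
    and the label says whether the first article is the better one."""
    examples = []
    for (f1, r1), (f2, r2) in combinations(articles, 2):
        if abs(r1 - r2) < min_gap:
            continue
        diff = [a - b for a, b in zip(f1, f2)]
        examples.append((diff, 1 if r1 > r2 else 0))
    return examples

# Toy articles: features could be [log-likelihood, avg VPs/sentence, ...]
arts = [([-310.2, 2.1], 4.3), ([-402.7, 1.4], 2.5), ([-350.0, 1.9], 3.1)]
for x, y in pairwise_examples(arts):
    print(y, [round(v, 1) for v in x])
```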
76Prediction accuracy (10-fold cross validation)
- None (Majority Class): 50.21
- Number of words: 65.84
- ALL: 88.88
- Grid only: 79.42
- Log likelihood of discourse rels: 77.77
- Avg VPs/sentence: 69.54
- Log likelihood, NEWS: 66.25
77Findings
- Complex interplay between features
- Entity grid features not significantly correlated with well-written score but very useful for the ranking task
- Discourse information is very helpful
- But here we used gold-standard annotations
- Developing automatic classifier underway
78Implicit and explicit discourse relations
Class Explicit Implicit
Comparison 69 31
Contingency 47 53
Temporal 80 20
Expansion 42 58
79Sense classification based on connectives only
- Four-way classification
- Explicit relations only
- 93% accuracy
- All relations (implicit + explicit)
- 75% accuracy
- Implicit relations are the real challenge
80 Explicit discourse relations, tasks (Pitler and Nenkova, 2009) [25]
- Discourse vs. non-discourse use
- I will be happier once the semester is over.
- I have been to Ohio once.
- Relation sense
- Contingency, comparison, temporal, expansion
- I haven't been to Paris since I went there on a school trip in 1998. (Temporal)
- I haven't been to Antarctica since it is very far away. (Contingency)
81Penn Discourse Treebank
- Largest available annotated corpus of discourse relations
- Penn Treebank WSJ articles
- 18,459 explicit discourse relations
- 100 connectives
- although vs. or
- 91% discourse use vs. 3% discourse use
82Discourse Usage Experiments
- Positive examples: discourse connectives
- Negative examples: the same strings in the PDTB, unannotated
- 10-fold cross validation
- Maximum Entropy classifier
83Discourse Usage Results
84Discourse Usage Results
85 Sense Disambiguation: Comparison, Contingency, Expansion, or Temporal?
Features / Accuracy
Connective: 93.67
Connective + Syntax: 94.15
Interannotator Agreement: 94
86Tool
- Automatic annotation of discourse use and sense of discourse connectives
- Discourse Connectives Tagger
- http://www.cis.upenn.edu/epitler/discourse.html
87What about implicit relations?
- Is there hope to have a usable tool soon?
- Early studies on unannotated data gave reason for optimism
- But when recently tested on the PDTB, their performance is poor
- Accuracy for contingency, comparison and temporal is below 50%
88Classify implicits and explicits together
- Not easy to infer from combined results how early systems performed on implicits
- As we saw, one can get reasonable overall performance by doing nothing for explicits
- Same sentence [26]
- Graphbank corpus doesn't distinguish implicit and explicit [27]
89Classify on large unannotated corpus
- Create artificial implicits by deleting the connective [28, 29, 30]
- I am in Europe, but I live in the United States.
- First proposed by Marcu and Echihabi, 2002
- Very good initial results
- Accuracy of distinguishing between two rels > 75%
- But these were on balanced classes
- Not the case in real text
- Not tested on real implicits (but see [30, 29])
90Experiments with PDTB
- Pitler et al., ACL 2009 [31]
- Wide variety of features to capture semantic opposition and parallelism
- Lin et al., EMNLP 2009 [32]
- (Lexicalized) syntactic features
- Results improve over baselines, better understanding of features, but the classifiers are not suitable for application in real tasks
91Word pairs as features
- Most basic feature for implicits
- I_there, I_is, ..., tired_time, tired_difference
- (Example spans: "I am a little tired" / "there is a 13 hour time difference")
- Marcu and Echihabi, 2002 (see the sketch below)
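A minimal sketch of the word-pair feature extraction; lowercasing and whitespace tokenization are simplifications:

```python
from itertools import product

def word_pair_features(arg1, arg2):
    """Cross product of the words in the two spans of a relation,
    in the style of Marcu and Echihabi (2002)."""
    return [f"{w1}_{w2}" for w1, w2 in product(arg1.lower().split(),
                                               arg2.lower().split())]

print(word_pair_features("I am a little tired",
                         "there is a 13 hour time difference")[:4])
# ['i_there', 'i_is', 'i_a', 'i_13']
```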
92 Intuition: with large amounts of data, will find semantically-related pairs
- The recent explosion of country funds mirrors the closed-end fund mania of the 1920s, Mr. Foot says, when narrowly focused funds grew wildly popular.
- They fell into oblivion after the 1929 crash.
93Meta error analysis of prior work
- Using just content words reduces performance (but has a steeper learning curve)
- Marcu and Echihabi, 2002
- Nouns and adjectives don't help at all
- Lapata and Lascarides, 2004 [33]
- Filtering out stopwords lowers results
- Blair-Goldensohn et al., 2007
94 Word pairs experiments (Pitler et al., 2009)
- Synthetic implicits: Cause/Contrast/None
- Explicit instances from Gigaword with the connective deleted
- Because -> Cause, But -> Contrast
- At least 3 sentences apart -> None
- Blair-Goldensohn et al., 2007
- Random selection
- 5,000 Cause
- 5,000 Other
- Computed information gain of word pairs
95 Function words have highest information gain
- But didn't we remove the connective?
96but signals Not-Comparison in synthetic data
- The government says it has reached most isolated townships by now, but because roads are blocked, getting anything but basic food supplies to people remains difficult.
- but because -> Comparison
- but because -> Contingency
97 Results: Word pairs
- Pairs of words from the two text spans
- What doesn't work
- Training on synthetic implicits
- What really works
- Use synthetic implicits for feature selection
- Train on PDTB
98 Best Results: f-scores
Comparison: 21.96 (17.13), Contingency: 47.13 (31.10), Expansion: 76.41 (63.84), Temporal: 16.76 (16.21)
Baselines in parentheses: synthetic implicits word pairs for Comparison/Contingency; real implicits word pairs for Expansion/Temporal
99Further experiments using context
- Results from classifying each relation independently
- Naïve Bayes, MaxEnt, AdaBoost
- Since context features were helpful, tried CRF
- 6-way classification, word pairs as features
- Naïve Bayes accuracy 43.27
- CRF accuracy 44.58
100 Do we need more coherence factors? (Louis and Nenkova, 2010) [34]
- If we had perfect co-reference and discourse relation information, would we be able to explain local discourse coherence?
- Our recent corpus study indicates the answer is NO
- 30% of adjacent sentences in the same paragraph in PDTB
- Neither share an entity nor have an implicit comparison, contingency or temporal relation
- Lexical chains?
101References
- [1] Burstein, J. & Chodorow, M. (in press). Progress and new directions in technology for automated essay evaluation. In R. Kaplan (Ed.), The Oxford handbook of applied linguistics (2nd Ed.). New York: Oxford University Press.
- [2] Heilman, M., Collins-Thompson, K., Callan, J., and Eskenazi, M. (2007). Combining Lexical and Grammatical Features to Improve Readability Measures for First and Second Language Texts. Proceedings of the Human Language Technology Conference. Rochester, NY.
- [3] S. Petersen and M. Ostendorf. A machine learning approach to reading level assessment. Computer, Speech and Language, vol. 23, no. 1, pp. 89-106, 2009.
- [4] Eugene Agichtein, Carlos Castillo, Debora Donato, Aristides Gionis, Gilad Mishne. Finding High Quality Content in Social Media. ACM Web Search and Data Mining Conference (WSDM), 2008.
- [5] Regina Barzilay and Lillian Lee. Catching the Drift: Probabilistic Content Models, with Applications to Generation and Summarization. HLT-NAACL 2004: Proceedings of the Main Conference, pp. 113-120, 2004.
102References
- [6] Emily Pitler, Annie Louis and Ani Nenkova. Automatic Evaluation of Linguistic Quality in Multi-Document Summarization. Proceedings of ACL 2010.
- [7] Schwarm, S. E. and Ostendorf, M. 2005. Reading level assessment using support vector machines and statistical language models. In Proceedings of ACL 2005.
- [8] Jieun Chae, Ani Nenkova. Predicting the Fluency of Text with Shallow Structural Features: Case Studies of Machine Translation and Human-Written Text. In Proceedings of EACL 2009, 139-147.
- [9] Charniak, E. and Johnson, M. 2005. Coarse-to-fine n-best parsing and MaxEnt discriminative reranking. In Proceedings of ACL 2005.
- [10] K. Collins-Thompson and J. Callan. (2004). A language modeling approach to predicting reading difficulty. Proceedings of HLT/NAACL 2004.
- [11] Sarah E. Schwarm and Mari Ostendorf. Reading Level Assessment Using Support Vector Machines and Statistical Language Models. In Proceedings of ACL, 2005.
103References
- [12] C. Sauper and R. Barzilay. Automatically generating Wikipedia articles: A structure-aware approach. ACL-IJCNLP 2009.
- [13] Halliday, M. A. K., and Ruqaiya Hasan. 1976. Cohesion in English. London: Longman.
- [14] B. Grosz, A. Joshi, and S. Weinstein. 1995. Centering: a framework for modelling the local coherence of discourse. Computational Linguistics, 21(2):203-226.
- [15] E. Miltsakaki and K. Kukich. 2000. The role of centering theory's rough-shift in the teaching and evaluation of writing skills. In Proceedings of ACL'00, pages 408-415.
- [16] Karamanis, N., Mellish, C., Poesio, M., and Oberlander, J. 2009. Evaluating centering for information ordering using corpora. Computational Linguistics 35, 1 (Mar. 2009), 29-46.
- [17] Regina Barzilay, Mirella Lapata. "Modeling Local Coherence: An Entity-based Approach". Computational Linguistics, 2008.
- [18] Ani Nenkova, Kathleen McKeown. References to Named Entities: a Corpus Study. HLT-NAACL 2003.
104References
- [19] Micha Elsner, Eugene Charniak. Coreference-inspired Coherence Modeling. ACL (Short Papers) 2008, 41-44.
- [20] Morris, J. and Hirst, G. 1991. Lexical cohesion computed by thesaural relations as an indicator of the structure of text. Computational Linguistics 17, 1 (Mar. 1991), 21-48.
- [21] Regina Barzilay and Michael Elhadad. "Text summarization with lexical chains". In Inderjeet Mani and Mark Maybury, editors, Advances in Automatic Text Summarization. MIT Press, 1999.
- [22] Silber, H. G. and McCoy, K. F. 2002. Efficiently computed lexical chains as an intermediate representation for automatic text summarization. Computational Linguistics 28, 4 (Dec. 2002), 487-496.
- [23] Mirella Lapata. Probabilistic Text Structuring: Experiments with Sentence Ordering. Proceedings of ACL 2003.
- [24] R. Soricut and D. Marcu. Discourse generation using utility-trained coherence models. COLING-ACL 2006.
105References
- [25] Emily Pitler and Ani Nenkova. Using Syntax to Disambiguate Explicit Discourse Connectives in Text. Proceedings of ACL, short paper, 2009.
- [26] Radu Soricut and Daniel Marcu. 2003. Sentence Level Discourse Parsing using Syntactic and Lexical Information. Proceedings of the Human Language Technology and North American Association for Computational Linguistics Conference (HLT/NAACL-2003).
- [27] Ben Wellner, James Pustejovsky, Catherine Havasi, Roser Sauri and Anna Rumshisky. Classification of Discourse Coherence Relations: An Exploratory Study using Multiple Knowledge Sources. In Proceedings of the 7th SIGDIAL Workshop on Discourse and Dialogue.
- [28] Daniel Marcu and Abdessamad Echihabi (2002). An Unsupervised Approach to Recognizing Discourse Relations. Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics (ACL-2002).
- [29] Sasha Blair-Goldensohn, Kathleen McKeown, Owen Rambow. Building and Refining Rhetorical-Semantic Relation Models. HLT-NAACL 2007, 428-435.
106References
- [30] Sporleder, C. and Lascarides, A. 2008. Using automatically labelled examples to classify rhetorical relations: An assessment. Natural Language Engineering 14, 3 (Jul. 2008), 369-416.
- [31] Emily Pitler, Annie Louis, and Ani Nenkova. Automatic Sense Prediction for Implicit Discourse Relations in Text. Proceedings of ACL, 2009.
- [32] Ziheng Lin, Min-Yen Kan and Hwee Tou Ng (2009). Recognizing Implicit Discourse Relations in the Penn Discourse Treebank. In Proceedings of EMNLP.
- [33] Lapata, Mirella and Alex Lascarides. 2004. Inferring Sentence-internal Temporal Relations. In Proceedings of the North American Chapter of the Association for Computational Linguistics, 153-160.
- [34] Annie Louis and Ani Nenkova. Creating Local Coherence: An Empirical Assessment. Proceedings of NAACL-HLT 2010.