Title: Lexical Pragmatics: Some Principles and Formalisms
1 Lexical Pragmatics: Some Principles and Formalisms
- Anne L. Bezuidenhout
Trondheim, September 18-22, 2006
2 Lexical Pragmatics
Lexical Pragmatics is a particular account of the
division of labor between lexical semantics and
pragmatics. It combines the idea of (radical)
semantic underspecification in the lexicon with a
theory of pragmatic strengthening (based on
conversational implicatures). In the core of this
approach is a precise treatment of Atlas and
Levinson's (1981) Q- and I-Principles and the
formalization of the balance between
informativeness and efficiency in natural
language processing (Horn's (1984) division of
pragmatic labor). In a roughly simplified
formulation, the I-Principle seeks to select the
most coherent interpretation, and the Q-Principle
acts as a blocking mechanism which blocks all the
outputs which can be grasped more economically by
an alternative linguistic input. Recently, these
mechanisms have been implemented within a
bi-directional version of optimality theory which
aims to integrate expressive and interpretive
optimization. (Blutner & Solstad: 11)
3 Components of OT
Optimality Theory (OT) assumes three formal
components, namely a Generator (Gen), an Evaluator
(Eval) and a system Con of ranked constraints.
Given some input, Gen creates a set of candidate
outputs, and Eval then selects the optimal
candidate for that input. The optimal candidate is
the one that best satisfies the ranked
constraints: a violation of a higher-ranked
constraint counts for more than any number of
violations of lower-ranked ones. The principal
innovation of a bi-directional version of OT (BOT)
is that Gen delivers a set of input-output pairs.
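The Gen/Eval/Con architecture can be illustrated with a small sketch. It is only an illustration under stated assumptions: the toy generator `gen`, the three constraints (`max_io`, `dep_io`, `no_coda`) and the input "pat" are hypothetical and do not come from the slides. Eval is modeled as picking the candidate whose violation profile is smallest when constraints are compared in ranking order.

```python
from typing import Callable, List, Sequence, Tuple

# A constraint maps (input, candidate) to a number of violations.
Constraint = Callable[[str, str], int]

def gen(inp: str) -> List[str]:
    """Toy Gen (hypothetical): faithful, shortened and lengthened candidates."""
    return [inp, inp[:-1], inp + "a"]

def max_io(inp: str, cand: str) -> int:
    """Hypothetical constraint: penalize each deleted segment."""
    return max(0, len(inp) - len(cand))

def dep_io(inp: str, cand: str) -> int:
    """Hypothetical constraint: penalize each inserted segment."""
    return max(0, len(cand) - len(inp))

def no_coda(inp: str, cand: str) -> int:
    """Hypothetical constraint: penalize a candidate ending in a consonant."""
    return 0 if cand and cand[-1] in "aeiou" else 1

def profile(inp: str, cand: str, ranked: Sequence[Constraint]) -> Tuple[int, ...]:
    """Violation counts listed from the highest-ranked constraint down."""
    return tuple(c(inp, cand) for c in ranked)

def eval_ot(inp: str, candidates: Sequence[str], ranked: Sequence[Constraint]) -> str:
    """Eval: tuples compare lexicographically, which models strict ranking."""
    return min(candidates, key=lambda cand: profile(inp, cand, ranked))

# The same candidates under two different rankings of the same constraints.
winner_a = eval_ot("pat", gen("pat"), [max_io, dep_io, no_coda])
winner_b = eval_ot("pat", gen("pat"), [no_coda, max_io, dep_io])
```

In this toy case every candidate violates exactly one constraint, so the winner is decided entirely by which violation the ranking treats as most serious.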
4 Lexical-Pragmatics Interface
Comprehension Perspective
5 The Generator
Applying BOT to lexical pragmatics, and thinking
of the encoded meaning of an expression as its
context change potential, we assume that Gen
delivers a set of expression-interpretation pairs
$\langle u, i \rangle$, defined as follows:
$\mathrm{Gen}_{cg} = \{ \langle u, i \rangle : i \in cg([\![u]\!]) \}$
Here $i$ is a potential result of updating the
common ground $cg$ with $[\![u]\!]$, which is the
encoded meaning associated with the expression
$u$. This encoded meaning is assumed to be
semantically underspecified.
6 The Cost Function
We also need to assume that there is an ordering
relation $<$ (being less costly than) that ranks
the elements in $\mathrm{Gen}_{cg}$. The total
cost function can be represented as follows:
$\mathrm{cost}(u, i) = \mathrm{compl}(u) \cdot c([\![u]\!], i)$
In English: Cost = Complexity of expression ×
Surprise value of interpretation. (The less
complex and the less surprising, the less costly.)
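A minimal numerical sketch of the cost function follows. The slide does not say how surprise is measured; the sketch assumes, as one standard choice, that the surprise value of an interpretation is its surprisal, the negative base-2 logarithm of its probability given the encoded meaning. The function names and the numbers are illustrative only.

```python
import math

def surprise(prob: float) -> float:
    """Assumed surprise measure: surprisal in bits, -log2 of the probability."""
    return -math.log2(prob)

def cost(complexity: float, prob_of_interpretation: float) -> float:
    """Cost = complexity of the expression times surprise of the interpretation."""
    return complexity * surprise(prob_of_interpretation)

simple_likely = cost(1, 0.5)     # simple form, likely interpretation
complex_likely = cost(2, 0.5)    # more complex form, same interpretation
simple_unlikely = cost(1, 0.25)  # simple form, less likely interpretation
```

Under these assumptions, raising either the complexity of the form or the improbability of the interpretation raises the cost, which is all the slide's gloss requires.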
7 Bi-Directional OT
BOT (strong version) asserts that an
expression-interpretation pair $\langle u, i \rangle$
is optimal if and only if it satisfies both of the
following constraints, where $<$ means 'is less
costly than':
(Q) There is no other pair $\langle u', i \rangle \in \mathrm{Gen}_{cg}$
such that $\langle u', i \rangle < \langle u, i \rangle$.
(I) There is no other pair $\langle u, i' \rangle \in \mathrm{Gen}_{cg}$
such that $\langle u, i' \rangle < \langle u, i \rangle$.
These constraints (formulated from the
Comprehension perspective) could be formulated in
English as follows:
(Q) There is no less costly way for the speaker
to express her meaning.
(I) There is no meaning that is less costly to
attach to the speaker's utterance.
8 Lexical Blocking
Total blocking is a situation in which some
productive form is disallowed because there is
already a lexicalized form that serves to cover
the intended range of meaning. E.g., although one
can talk of 'pale yellow', 'pale blue', and 'pale
green', one would not talk of 'pale red', since
'pink' already exists as an expression covering
the relevant range of the color spectrum. The
strong version of BOT would explain why the pair
$\langle \text{`pale red'}, \mathrm{PINK} \rangle$
is non-optimal, because there is another pair
$\langle \text{`pink'}, \mathrm{PINK} \rangle$
that is less costly than it (in this case because
the lexicalized form 'pink' is less complex than
the productive form 'pale red').
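The blocking of 'pale red' can be checked mechanically. The following is a sketch only: the cost values are invented for illustration, and the label `PINK` for the shared interpretation is an assumption of the sketch. A pair counts as strongly optimal when no competitor sharing its interpretation (Q) or its expression (I) is cheaper.

```python
from typing import Dict, Tuple

Pair = Tuple[str, str]  # (expression, interpretation)

def strongly_optimal(pair: Pair, cost: Dict[Pair, float]) -> bool:
    """Strong BOT: the pair must pass both the Q- and the I-condition."""
    u, i = pair
    # (Q) no other expression conveys the same interpretation more cheaply
    q_ok = all(cost[(u2, i2)] >= cost[pair] for (u2, i2) in cost if i2 == i)
    # (I) no other interpretation of the same expression is cheaper
    i_ok = all(cost[(u2, i2)] >= cost[pair] for (u2, i2) in cost if u2 == u)
    return q_ok and i_ok

# Hypothetical costs: same interpretation, but 'pale red' is the more complex form.
costs: Dict[Pair, float] = {
    ("pink", "PINK"): 1.0,
    ("pale red", "PINK"): 2.0,
}

pink_ok = strongly_optimal(("pink", "PINK"), costs)
pale_red_ok = strongly_optimal(("pale red", "PINK"), costs)
```

With these values 'pink' passes both conditions while 'pale red' fails the Q-condition, which is the total-blocking pattern described above.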
9 Partial Blocking
Partial blocking is the phenomenon in which a
lexicalized or more productive expression exists
to cover some (stereotypical) part of some
relevant domain of meaning, and a less productive
form is used to refer to the remaining (somehow
unusual or special) elements in the domain. E.g.,
consider the contrast between lexical and
productive causatives:
a. Black Bart killed the sheriff.
b. Black Bart caused the sheriff to die.
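10 Weak Version of BOT
An expression-interpretation pair $\langle u, i \rangle$
is optimal in the weak sense if and only if it
satisfies both of the following constraints
(where $<$ means 'is less costly than'):
(Qw) There is no other pair $\langle u', i \rangle \in \mathrm{Gen}_{cg}$
that satisfies the I-principle such that
$\langle u', i \rangle < \langle u, i \rangle$.
(Iw) There is no other pair $\langle u, i' \rangle \in \mathrm{Gen}_{cg}$
that satisfies the Q-principle such that
$\langle u, i' \rangle < \langle u, i \rangle$.
The weak version of BOT is one in which the
directions of optimization make reference to each
other.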
11 Horn's Division of Pragmatic Labor
The weak version of BOT can explain why unmarked
expressions are associated with stereotypical
meanings/situations and marked forms are
associated with non-stereotypical situations.
The marked, less productive form is not totally
blocked by the unmarked form, because it is less
costly to associate the unmarked form with the
typical rather than the atypical interpretation,
so the atypical interpretation remains available
for the marked form.
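The difference between total and partial blocking can be sketched for the 'kill' / 'cause to die' contrast. Two things here are assumptions rather than content of the slides: the cost values are invented, and the mutual reference between the two directions is made precise in one common recursive way, namely that a pair is weakly optimal when no cheaper pair that is itself weakly optimal shares its expression or its interpretation.

```python
from typing import Dict, Set, Tuple

Pair = Tuple[str, str]  # (expression, interpretation)

def weakly_optimal(cost: Dict[Pair, float]) -> Set[Pair]:
    """Collect weakly optimal pairs, visiting candidates from cheapest up."""
    winners: list = []
    for pair in sorted(cost, key=cost.get):
        u, i = pair
        # Only competitors already accepted as weakly optimal can block.
        blocked = any(
            (u2 == u or i2 == i) and cost[(u2, i2)] < cost[pair]
            for (u2, i2) in winners
        )
        if not blocked:
            winners.append(pair)
    return set(winners)

# Hypothetical costs: 'kill' is the simpler form, 'direct' the typical meaning.
costs: Dict[Pair, float] = {
    ("kill", "direct"): 1.0,
    ("kill", "indirect"): 2.0,
    ("cause to die", "direct"): 2.0,
    ("cause to die", "indirect"): 4.0,
}

result = weakly_optimal(costs)
```

Under these assumptions the unmarked form pairs with the typical interpretation and the marked form with the atypical one: the two competitors of the marked-atypical pair are themselves blocked, so they cannot block it.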
12 Scalar and Clausal Implicatures
BOT can explain why 'P or Q' gets interpreted in
the exclusive sense, as ruling out the joint
truth of the two disjuncts. E.g., "You can have
soup or salad" gets interpreted as "The hearer can
have either soup or salad but not both soup and
salad." This is because there is an alternative
expression, 'P and Q', that is a less costly way
of expressing the possibility of the joint truth
of P and Q. It is less costly because the
surprise value of learning that the state of the
world is such that both P and Q hold is less
after learning that 'P and Q' is true than after
learning that 'P or Q' is true. See Blutner
(1998: 134); van Rooy (2002a: 3). So the
Q-principle blocks the joint truth of P and Q
being part of the interpretation of 'P or Q',
forcing an exclusive interpretation of the
disjunction.
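The comparison of surprise values can be made concrete with a small calculation. This is a sketch under two assumptions not stated on the slide: the four combinations of truth values for P and Q are treated as equally likely, and surprise is measured as surprisal in bits.

```python
import math
from itertools import product

# The four possible worlds as (P, Q) truth values, assumed equiprobable.
worlds = list(product([True, False], repeat=2))

def surprisal_of_both(statement) -> float:
    """Surprisal (bits) of 'P and Q both hold' after learning `statement`."""
    live = [w for w in worlds if statement(*w)]
    both = [w for w in live if w[0] and w[1]]
    return -math.log2(len(both) / len(live))

after_and = surprisal_of_both(lambda p, q: p and q)  # joint truth is certain
after_or = surprisal_of_both(lambda p, q: p or q)    # joint truth is 1 of 3 live worlds
```

On these assumptions the joint truth of P and Q carries no surprise after 'P and Q' but about 1.58 bits after 'P or Q', so 'P and Q' is the cheaper way to convey it.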
13 Pragmatic Scales
Bob has just returned from a visit to the Kruger
National Park game reserve. Suppose that it is
part of the common ground that on visits to game
reserves it is desirable to see lions, and that
hippo sightings are not as prized as lion
sightings. In this context, consider the
following dialog:
Anne: Did you see any lions?
Bob: I saw some hippos.
Implicature: Bob did not see any lions.
The scale here only exists because of assumptions
in the common ground. It wouldn't exist if
assumptions were changed. For example, suppose it
is part of the common ground that lions are always
to be spotted whenever hippos are spotted. The
implicature would disappear.
14 Problem 1 for BOT
This implicature is derived by means of what has
been called a pragmatic scale. Unlike the
elements in a Horn scale or other comparable
linguistic scale, we cannot say that the lexical
entry for the expression 'lion' records the fact
that 'lion' is informationally stronger than the
expression 'hippo'. 'Lion' and 'hippo' are not
expression-alternates. So, it looks as though the
blocking function of the Q-principle cannot work
in this case to yield the desired implicature. So
BOT cannot account for such cases.
15 Abductive Extension of BOT
The idea is to extend the coverage of BOT beyond
the sort of cases mentioned in the previous
section (cases of lexical blocking, scalar and
clausal implicatures, etc.) to cases where
interpretation depends on specific aspects of
world and discourse knowledge. Blutner discusses
the pragmatics of adjectives and cases of
systematic polysemy in particular.
16 Pragmatics of Adjectives
(1) a. The apple is red.
b. Its peel is red.
c. Its pulp is red.
Blutner (1998: 148) claims that (1b) but not (1c)
is a conversational implicature of (1a). He argues
that part of the underspecified meaning of 'apple'
is that apples have parts that are colored and so
the underspecified meaning of (1a) is that the
apple in question has a part that is red. Context
will have to supply something to fill the
part-slot in this underspecified representation.
17 Adjectives (cont.)
Blutner (1998: 149) writes: Given the assumption
that the colour of the peel is more diagnostic
for classifying apples than the colour of other
apple parts, for example, the colour of the pulp,
the red peel-specification is arguably the cost
minimal specification. The red
peel-specification comes out as the cost-minimal
specification if its total costs are smaller than
the costs of any other specification. Suppose
that, as is rather plausible, this condition is
satisfied, then the I-principle selects the red
peel-interpretation and blocks the red
pulp-interpretation.
18 Adjectives (cont.)
Notice here that it is the I-principle that is
said to do the blocking, not the
Q-principle. 'Red peel' and 'red pulp' are not
expression-alternates of 'red'. Rather, the red
peel- and red pulp-specifications are what
Blutner (1998: 144) calls abductive
variants. Presumably it is our world knowledge of
apple parts that will suggest what variants are
to be considered by the comprehension mechanism
and are to be ruled in or out by the I-principle.
19 Systematic Polysemy
(2) a. The school has strict hiring policies.
b. The school has a flat roof.
c. The school building has a flat roof.
'School' is polysemous, and can be understood as
referring either to an institution, as in (2a), or
to the physical buildings that house that
institution, as in (2b). Blutner (1998: 152-155)
argues that the abductive extension of his basic
BOT mechanism can explain why (2b) is interpreted
as (2c).
20 Systematic Polysemy (cont.)
Blutner claims that the institution and building
senses of 'school' are equally salient, but that
the building interpretation can be integrated
into the conceptual frame created by the
predicate 'has a flat roof' more readily than the
institution interpretation can. So, since the
building interpretation is less costly, it is
selected by the I-principle and the institution
interpretation is blocked, resulting in the
conversational implicature in (2c). Note that the
blocking of the institution interpretation of
'school' in (2b) is not a result of the operation
of the Q-principle.
21 Some Initial Worries
- We no longer have a bi-directional version of
OT, if the Q-principle drops out of the picture.
- It is not clear that the blocking that the
I-principle does is blocking in the same sense as
that done by the Q-principle. (Abductive variants
are not expression-alternates.)
- If the Q-principle drops out of the picture, we
can't account for pragmatic scales. 'Lion' is not
an abductive variant of the underspecified
meaning of 'hippo'.
22 Problem 2 for BOT
Accounting for novel uses: Suppose that we are
playing a board game where the game pieces are
Granny Smith apples and the game board consists
of a grid of colored squares. If a speaker were
to utter "Play the red apple", she would be
taken to have suggested that the hearer play the
Granny Smith apple on the red square of the game
board. The apple counts as red in this context
not in virtue of the color of any of its parts,
but in virtue of the fact that it is currently
placed on the red square of the game board.
23 Problem 3 for BOT
Pragmatic loosening: Blutner (1998: 131) says
that a pair $\langle u, i \rangle \in \mathrm{Gen}_{cg}$
iff $u$ holds in $i$, written as $i \models u$.
This would allow only interpretations that are
logical strengthenings. Pragmatic strengthenings
that result in logical weakenings (e.g., cases in
which the domain of a universal quantifier is
restricted) or pragmatic loosenings are not
covered by this definition. Since the
comprehension mechanism of Relevance Theory (RT)
deals in the same way with both loosenings and
strengthenings (see Carston, 1997), its mechanism
is more general than the BOT mechanism.
24 van Rooy's LOT
The utility value of a proposition B, $UV(B)$, can
be thought of as the increase in expected utility
that results from adding B to the common ground.
One interpretation B is better than another C
just in case $UV(B) > UV(C)$. A special case of
this can be applied in a situation in which we
have a partition of logical space Q, and we must
decide which element of Q is true. The addition
of new information might make this decision
problem easier, by eliminating cells in the
partition. We say that the entropy value of a
proposition B with respect to a decision problem
Q, $EV_Q(B)$, is its usefulness in resolving this
decision problem. So if learning a proposition B
eliminates more cells of the partition Q than
learning C, $EV_Q(B) > EV_Q(C)$.
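The idea that eliminating more cells makes a proposition more useful can be sketched numerically. The slide gives no formula, so the sketch assumes one natural reading: the entropy value of B is the reduction in Shannon entropy of the partition Q that results from learning B, with all worlds treated as equally likely. The partition and the propositions are invented for illustration.

```python
import math
from typing import FrozenSet, List, Set

def entropy(partition: List[FrozenSet[int]], worlds: Set[int]) -> float:
    """Entropy (bits) of the partition restricted to `worlds`, uniform over worlds."""
    total = len(worlds)
    h = 0.0
    for cell in partition:
        p = len(cell & worlds) / total
        if p > 0:
            h -= p * math.log2(p)
    return h

def entropy_value(partition: List[FrozenSet[int]], all_worlds: Set[int],
                  prop: Set[int]) -> float:
    """Assumed EV_Q(B): entropy of Q before learning B minus entropy after."""
    return entropy(partition, all_worlds) - entropy(partition, all_worlds & prop)

W = set(range(8))
Q = [frozenset({0, 1}), frozenset({2, 3}), frozenset({4, 5}), frozenset({6, 7})]
B = {0, 1, 2, 3}        # eliminates two cells of Q
C = {0, 1, 2, 3, 4, 5}  # eliminates only one cell of Q

ev_b = entropy_value(Q, W, B)
ev_c = entropy_value(Q, W, C)
```

In this toy setting B removes two of the four cells and C only one, and the assumed measure ranks B above C accordingly.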
25 Argumentative Value
Another special case is the argumentative value
of a proposition B relative to a hypothesis h,
$AV_h(B)$. Learning B might increase or decrease
the probability of h. We can say that B is
positively relevant to h if the conditional
probability of h given B is greater than the prior
probability of h, i.e., $p(h/B) > p(h)$. It is
negatively relevant if $p(h/B) < p(h)$. The
argumentative value of B with respect to h is thus
a measure of its degree of relevance to the
argument, and B is a better argument for h than is
C if $AV_h(B) > AV_h(C)$.
26 $AV_h$ vs. Informativeness
The argumentative value of a proposition B with
respect to a hypothesis h may be greater than the
argumentative value of a proposition C with
respect to h even though C entails B. For
example:
B: Bob's knife was found at the scene of the
murder.
C: Carol's blood (as well as the victim's blood)
was found on the knife.
h: Bob is the murderer.
$p(h/B) > p(h/B \wedge C)$
So $AV_h(B) > AV_h(B \wedge C)$.
Yet $(B \wedge C)$ entails B. Hence $(B \wedge C)$
is more informative than B. van Rooy notes that
in this situation it would be more useful/
relevant to say only B, even though this would
not be very cooperative.
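The knife example can be given a toy numerical form. Everything quantitative here is an assumption: the slides do not define the argumentative value formally, so the sketch takes it to be the change in the probability of h brought about by learning the proposition, and the joint probabilities are invented purely to illustrate the pattern.

```python
from typing import Callable, Dict, Tuple

World = Tuple[bool, bool, bool]  # truth values of (h, B, C)
Prop = Callable[[World], bool]

# Hypothetical joint distribution over worlds; the weights sum to 1.
P: Dict[World, float] = {
    (True, True, False): 0.30,
    (True, True, True): 0.02,
    (False, True, True): 0.08,
    (False, True, False): 0.10,
    (True, False, False): 0.05,
    (False, False, False): 0.45,
}

def prob(prop: Prop) -> float:
    return sum(p for w, p in P.items() if prop(w))

def cond(hyp: Prop, given: Prop) -> float:
    """Conditional probability of `hyp` given `given`."""
    return prob(lambda w: hyp(w) and given(w)) / prob(given)

def av(hyp: Prop, prop: Prop) -> float:
    """Assumed argumentative value: p(h/B) - p(h)."""
    return cond(hyp, prop) - prob(hyp)

h: Prop = lambda w: w[0]
B: Prop = lambda w: w[1]
B_and_C: Prop = lambda w: w[1] and w[2]

av_b = av(h, B)
av_bc = av(h, B_and_C)
```

With these invented weights, B raises the probability of h while the logically stronger B ∧ C lowers it, so the more informative proposition is the worse argument for h.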
27 Problem for LOT?
(1) (a) People who are getting married should
consult a doctor about possible hereditary risks
to their children.
(b) Two people both of whom have thalassemia
should be warned against having children.
(c) Susan has thalassemia.
(2) (a) Susan, who has thalassemia, is getting
married to Bill.
(b) Bill, who has thalassemia, is getting married
to Susan.
(c) Bill, who has thalassemia, is getting married
to Susan, and 1967 was a very good year for
French wines.
28 Problem (cont.)
RT says (2b) is more relevant than (2a) because
(2b) has more contextual effects than (2a), while
we can assume that they are roughly equal in
terms of processing costs. As van Rooy puts it,
we can think of the above discourse as raising
the following questions:
(i) Who should consult a doctor?
(ii) Who should be warned against having
children?
(2a) resolves question (i) for both Susan and
Bill. But (2b) resolves both questions for both
individuals. Thus it has greater entropy value and
hence is more useful/relevant than (2a). So far,
RT and LOT are in agreement.
29 van Rooy's Full Account
However, RT says that (2c) is less relevant than
(2b), intuitively because (2c) gives extra
irrelevant information that adds to the
processing costs without any compensatory
contextual effects. On the other hand, it
appears that LOT has to say that (2b) and (2c)
are equally relevant, because they resolve the
same questions. van Rooy proposes the following
as a solution: Proposition B is more relevant
than proposition C just in case either
$UV(B) > UV(C)$, or $UV(B) = UV(C)$ and
$inf(B) < inf(C)$. On this definition it will turn
out that (2c) is less relevant than (2b). So once
more RT and LOT are in agreement.
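The two-step comparison can be sketched as a simple ordering: utility value decides first, and informativeness only breaks ties, with the less informative proposition winning. The numerical values assigned to (2a), (2b) and (2c) are invented for illustration, and the class and function names are hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Proposition:
    label: str
    uv: int   # hypothetical utility value (e.g. questions resolved)
    inf: int  # hypothetical informativeness score

def more_relevant(b: Proposition, c: Proposition) -> bool:
    """B beats C on utility, or ties on utility and is less informative."""
    return b.uv > c.uv or (b.uv == c.uv and b.inf < c.inf)

p2a = Proposition("(2a)", uv=1, inf=2)  # resolves question (i) only
p2b = Proposition("(2b)", uv=2, inf=2)  # resolves questions (i) and (ii)
p2c = Proposition("(2c)", uv=2, inf=3)  # same questions, plus the wine remark

b_over_a = more_relevant(p2b, p2a)
b_over_c = more_relevant(p2b, p2c)
c_over_b = more_relevant(p2c, p2b)
```

On these assumed values (2b) outranks (2a) on utility and outranks (2c) on the tie-breaker, matching the verdicts attributed to RT above.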
30 Differences
31 A Meta-Framework?
Does LOT allow us to unify RT and the neo-Gricean
theories of Horn/Levinson? No, because to
construe RT in terms of LOT is to ignore RT's
central aim of offering a psychological
theory. One could of course construe LOT as a
psychological processing account too, where
processing effort is measured in quantitative
terms (e.g., in terms of duration). Such
quantitative measures of effort are routinely
used in psycholinguistics. Moreover, RT (with
certain provisos) can accept a methodology that
measures effort quantitatively. Does this make
unification any more likely? No. There are still
the incompatibilities between BOT and RT that
I've already catalogued in my discussion of
Blutner's views. Besides, RT and neo-Gricean
accounts make differential predictions about
processing, and there is at least some
psycholinguistic evidence that favors RT over
neo-Gricean accounts.