Ann Devitt - PowerPoint PPT Presentation

1
Wednesday 18 February 2009
  • The Languages of Emotion and Financial News
  • Ann Devitt
  • Khurshid Ahmad

2
Sentiment and the Markets
4
Specialised Language of Financial News
Global chipmakers, battling slower technology
demand, are betting size matters as they pin
their hopes for future growth on small and easy
to carry mobile devices such as netbooks and
smartphones.
Bloomberg.com, 18/2/09
8
Sentiment and the Markets
10
Engle and Ng (1993) Asymmetry Curve
11
Outline
  • Current psychological theory of emotion
  • Evaluation of lexical emotion resources
  • Corpus analysis of language of emotion

13
Cognitive Theory of Emotion: Categorical
Ekman (1975)
14
Cognitive Theory of Emotion: Dimensions
  • Osgood / Russell: Evaluation, Activity, Potency
  • Mehrabian PAD: Pleasure, Activation, Dominance

15
Cognitive Theory of Emotion
Watson and Tellegen (1985)
16
Outline
  • Current psychological theory of emotion
  • Evaluation of lexical emotion resources
  • Corpus analysis of language of emotion

17
Lexical Resource Evaluation
SentiWordNet
Whissell
General Inquirer
WordNet Affect (WNA)
19
Lexical Resource Evaluation: SentiWordNet
  • Word    PositiveVal  NegativeVal
  • Happy   0.9          0.0
  • Sad     0.0          0.9
  • 39066 terms
  • Evaluation dimension, scale 0 to 1
  • Low average scores: Pos = 0.18, Neg = 0.23
  • More extreme Neg values
  • Error-prone: rude (pos 0.875), gladsome (neg
    0.875)

21
Lexical Resource Evaluation: General Inquirer
  • ECSTATIC: Pos, Pleasure
  • SORROWFUL: Neg, Pain
  • Hand-coded, content analysis basis
  • 8641 terms
  • 184 binary categories (including MAB dimensions)
  • Negative > Positive
  • Active > Passive
  • Strong > Weak

23
Lexical Resource Evaluation: Whissell Dictionary
of Affect
  • Word        Eval    Activ   Imag
  • great       2.6250  2.1250  1.0
  • disastrous  1.4444  2.4000  2.0
  • Corpus selection, hand-coded
  • 8742 terms
  • Dimensional representation, 1-3 scale
  • Evaluation, Activation, Imagery

25
Lexical Resource Evaluation: WordNet Affect
  • Word        BinaryFeatures
  • Loneliness  cognitive state, emotion
  • Happiness   cognitive state, emotion
  • 5432 terms
  • Domains of emotional experience
  • No polarity
  • Short-term: Mood, Manner
  • Long-term: Attribute, Trait

26
Lexical Resource Evaluation: Lexical Overlap
  • Are the lexica consistent?
  • Are they mutually exclusive?
  • Dice, Jaccard, Asymmetric coefficients
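The overlap measures named above can be sketched for two term lists as follows. This is a minimal illustration, not the presenters' code: the function name and toy word lists are invented here, and the "asymmetric" coefficients are assumed to mean the share of each lexicon's terms that also occur in the other.

```python
def overlap_coefficients(lexicon_a, lexicon_b):
    """Return Dice, Jaccard and asymmetric overlap for two term collections."""
    a, b = set(lexicon_a), set(lexicon_b)
    shared = len(a & b)
    union = len(a | b)
    return {
        # Dice: twice the shared terms over the summed lexicon sizes.
        "dice": 2 * shared / (len(a) + len(b)) if union else 0.0,
        # Jaccard: shared terms over all distinct terms.
        "jaccard": shared / union if union else 0.0,
        # Asymmetric (assumed definition): share of one lexicon found in the other.
        "a_in_b": shared / len(a) if a else 0.0,
        "b_in_a": shared / len(b) if b else 0.0,
    }


# Toy term lists, invented for illustration only.
lexicon_a = {"happy", "sad", "great", "rude"}
lexicon_b = {"happy", "sad", "disastrous"}
scores = overlap_coefficients(lexicon_a, lexicon_b)
```

Dice and Jaccard are symmetric, so a small lexicon fully contained in a large one still scores low; the two asymmetric values make that containment visible.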

28
Lexical Resource Evaluation: Lexical Overlap
  • Statistically significant agreement for polarity
    assignment (chi-square test)
  • Very weak correlation for activation features

Diagram labels: SentiWordNet, General Inquirer, Whissell, WNA
29
Lexical Resource Evaluation: Lexical Overlap
  • Weak correlation of SWN with Whissell evaluation
    dimension
  • No correlation with Whissell activation
    dimension
  • SWN positive values negatively correlated with
    imageability

Diagram labels: SentiWordNet, Whissell, General Inquirer, WNA
30
Lexical Resource Evaluation: Lexical Overlap
  • SWN tends to negative for short-term WNA features
  • SWN tends to positive for long-term WNA features

Diagram labels: SentiWordNet, WNA, Whissell, General Inquirer
32
Lexical Resource Evaluation: Lexical Overlap
  • WNA feature division
  • Short-term     Long-term
  • Negative       Positive
  • Physical       Cognitive
  • More active    Less active
  • Internal       External
  • Less abstract  More concrete

33
Lexical Resource Evaluation: Some conclusions
  • The lexica are quite consistent
  • The lexica can be used in combination
  • SentiWordNet: largely unexplored territory

34
Outline
  • Current psychological theory of emotion
  • Evaluation of lexical emotion resources
  • Corpus analysis of language of emotion
  • General Language

35
Emotion in General Language: Corpus Study Aims
  • Does emotion constitute a distinct
    sub-language?
  • Is there a polarity bias in General Language?
    (the Pollyanna Hypothesis of Boucher and Osgood)
  • What is the impact of using different lexica?

36
Corpus Analysis: The Data
  • BNC
  • 100 million words
  • Balanced, broad corpus

37
Corpus Analysis: Methodology
  • Is emotion a distinct sub-language?
  • Examine distribution type
  • Examine distribution spread
  • Bootstrap sampling distribution

38
Corpus Analysis: Distribution Type
  • Zipfian: BNC, Emotion Lexica
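One common way to check whether a frequency list looks Zipfian is to fit a straight line to log frequency against log rank; a slope near -1 is the classic Zipf pattern. The sketch below assumes that approach (the slide does not say how the distribution type was assessed), and the function name and toy data are invented for illustration.

```python
import math


def zipf_slope(frequencies):
    """Least-squares slope of log(frequency) against log(rank)."""
    # Rank terms from most to least frequent, ignoring zero counts.
    ranked = sorted((f for f in frequencies if f > 0), reverse=True)
    xs = [math.log(rank) for rank in range(1, len(ranked) + 1)]
    ys = [math.log(freq) for freq in ranked]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


# Toy frequencies that follow an exact 1/rank law (illustration only).
toy_frequencies = [1000 / rank for rank in range(1, 51)]
slope = zipf_slope(toy_frequencies)
```

The function needs at least two non-zero frequencies; real corpus counts will only approximate a straight line, especially at the extremes of the rank range.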

39
Corpus Analysis: Distribution shape
  • Comparison of means: Student's t-test
  • BNC ≠ Emotion Lexica (p < 0.000)
  • Different sample means
  • 5-30 times more frequent than general language
  • Assumptions of test?

40
Corpus Analysis: Bootstrap Sampling Distribution
  • Are sentiment-bearing terms a statistically
    distinct and highly frequent subset of English?
  • 1000 random samples of terms from BNC
  • Sample size = size of sentiment lexicon
  • H0: Observed sample mean falls within 95% of
    bootstrap random sampling distribution of means
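The procedure on this slide can be sketched roughly as follows. This is a hedged illustration rather than the presenters' implementation: the function names, the seed, the percentile method for the 95% interval, and sampling without replacement are all assumptions, and the toy frequencies stand in for real BNC term frequencies.

```python
import random
import statistics


def bootstrap_interval(corpus_freqs, sample_size, n_samples=1000, seed=0):
    """Central 95% interval of mean frequency over random term samples."""
    rng = random.Random(seed)
    # Each sample has as many terms as the lexicon (assumed: without replacement).
    means = sorted(
        statistics.fmean(rng.sample(corpus_freqs, sample_size))
        for _ in range(n_samples)
    )
    lower = means[int(0.025 * n_samples)]
    upper = means[int(0.975 * n_samples) - 1]
    return lower, upper


def outside_interval(lexicon_freqs, corpus_freqs, n_samples=1000, seed=0):
    """True if the lexicon's mean frequency lies outside the 95% interval (reject H0)."""
    lower, upper = bootstrap_interval(
        corpus_freqs, len(lexicon_freqs), n_samples, seed
    )
    observed = statistics.fmean(lexicon_freqs)
    return observed < lower or observed > upper


# Toy data, invented for illustration: 200 corpus terms, 20 lexicon terms.
corpus_freqs = list(range(1, 201))
frequent_lexicon = [500] * 20          # far more frequent than any corpus sample
typical_lexicon = [100, 101] * 10      # close to the corpus mean
rejects_frequent = outside_interval(frequent_lexicon, corpus_freqs)
rejects_typical = outside_interval(typical_lexicon, corpus_freqs)
```

Unlike the t-test, this comparison makes no normality assumption about term frequencies, which matters for heavily skewed frequency data.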

41
Corpus Analysis: Bootstrap Sampling Distribution
  • Are sentiment-bearing terms a statistically
    distinct and highly frequent subset of English?
  • For all lexica
  • Mean term frequency of lexicon well outside 95%
    interval
  • Sentiment lexica are not representative of BNC
    (p < 0.05)

42
Corpus Analysis: Sentiment Features
  • Is there a polarity bias in General Language?
  • Positive polarity bias
  • Statistically significant for all lexica (χ²
    test of independence)

43
Corpus Analysis: Sentiment Features
  • Is there a polarity bias in General Language when
    you include intensity of polarity?
  • Positive polarity bias
  • Statistically significant for all lexica
  • χ² = 158.5, df = 1, p < 0.0001 for General Inquirer
  • χ² = 63.6, df = 1, p < 0.0001 for Whissell
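As a quick consistency check on the statistics reported above, the p-value of a chi-square statistic with one degree of freedom can be computed from the complementary error function, since a chi-square variable with df = 1 is the square of a standard normal. The helper name below is invented for this sketch.

```python
import math


def chi2_pvalue_df1(statistic):
    """Upper-tail p-value of a chi-square statistic with 1 degree of freedom."""
    # P(X > x) = P(|Z| > sqrt(x)) = erfc(sqrt(x / 2)) for standard normal Z.
    return math.erfc(math.sqrt(statistic / 2.0))


# Statistics as reported on the slide; both should fall below 0.0001.
reported = {"General Inquirer": 158.5, "Whissell": 63.6}
below_threshold = {
    name: chi2_pvalue_df1(stat) < 0.0001 for name, stat in reported.items()
}
```

Both reported values sit far beyond the threshold, so the stated significance level is conservative for either lexicon.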

44
Corpus Analysis: Some conclusions
  • Sentiment-bearing terms are a distinct subset of
    English
  • Positive polarity bias in BNC
  • General Inquirer and Whissell: low coverage and
    high frequency
  • SentiWordNet: wide coverage and much lower
    frequency

45
Outline
  • Current psychological theory of emotion
  • Evaluation of lexical emotion resources
  • Corpus analysis of language of emotion
  • Comparative

46
Comparative Corpus Analysis: Aims
  • Examine affective term use
  • Identify statistically different distributions
  • Is there a dominant feature/polarity?

47
Comparative Corpus Analysis: The Data
  • Financial Language: 2 million words
  • On-line financial news (Reuters, CNN, Bloomberg)
  • Newspapers
  • General Language: BNC, 100 million words
  • Balanced, broad corpus

48
Comparative Corpus Analysis: The Data
  • BNC sub-corpora
  • Imaginative written English: 16 million words
  • Informative written English: 70 million words

49
Comparative Corpus Analysis: Methodology
  • Compare proportions of Sentiment Features
  • χ² Test of Independence
  • H0: p(FinCorpus) = p(BNC)

50
Comparative Corpus Analysis: Methodology
  • Statistical significance of different proportions
  • χ² > 7.8794 (df = 1)
  • p < 0.005
  • Features
  • 41 Lexicon Sentiment Features from 4 lexica
  • Frequency per million words
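The comparison of proportions can be sketched as a Pearson chi-square test on a 2x2 table (feature tokens vs. all other tokens, in each corpus), with counts normalised to frequency per million words for reporting. The function names and the counts below are invented for illustration; only the corpus sizes and the 7.8794 threshold come from the slides.

```python
def per_million(count, corpus_size):
    """Normalise a raw count to frequency per million words."""
    return count * 1_000_000 / corpus_size


def chi2_2x2(hits_a, size_a, hits_b, size_b):
    """Pearson chi-square (df = 1) for one feature's share of two corpora."""
    table = [[hits_a, size_a - hits_a], [hits_b, size_b - hits_b]]
    row_totals = [size_a, size_b]
    col_totals = [hits_a + hits_b, (size_a - hits_a) + (size_b - hits_b)]
    total = size_a + size_b
    stat = 0.0
    for i in range(2):
        for j in range(2):
            # Expected count if the feature were independent of the corpus.
            expected = row_totals[i] * col_totals[j] / total
            stat += (table[i][j] - expected) ** 2 / expected
    return stat


CRITICAL_VALUE = 7.8794  # threshold from the slide: df = 1, p < 0.005

# Hypothetical counts for one sentiment feature (not taken from the study).
fin_hits, fin_size = 30_000, 2_000_000
bnc_hits, bnc_size = 1_000_000, 100_000_000
statistic = chi2_2x2(fin_hits, fin_size, bnc_hits, bnc_size)
significant = statistic > CRITICAL_VALUE
```

With corpora this large, even small differences in proportion exceed the threshold, so the per-million frequencies are worth reading alongside the test result.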

51
Comparative Corpus Analysis: Financial Corpus
  • WRT Imaginative: more affective terms
  • WRT Informative: many more affective terms
  • WRT BNC: dependent on feature type
  • Distributions are statistically distinct

52
Comparative Corpus Analysis: Positive GI Features
54
Comparative Corpus Analysis: Negative GI Features
57
Some conclusions
  • Lexical resources for sentiment are consistent
  • Financial news is a sub-language
  • Affective content is statistically distinct
    relative to general language
  • Text polarity is asymmetric, positive skew
  • Different skews for different domains

58
Something to think about
  • If different language varieties and domains have
    distinct use of sentiment terms and their own
    polarity bias
  • Individual sentiment values are not informative
  • So what do we need?

59
Thank You!
  • Ann.Devitt_at_cs.tcd.ie