11-719 Computational Models of Discourse Analysis


1
11-719 Computational Models of Discourse Analysis
  • Carolyn Penstein Rosé
  • Language Technologies Institute
  • and Human-Computer Interaction Institute

2
New York Times Article: What strikes you about the
agent's style of speaking?
  • June 24, 2010
  • Computers Learn to Listen, and Some Talk Back
  • By STEVE LOHR and JOHN MARKOFF
  • "Hi, thanks for coming," the medical assistant
    says, greeting a mother with her 5-year-old son.
    "Are you here for your child or yourself?"
  • The boy, the mother replies. He has diarrhea.
  • "Oh no, sorry to hear that," she says, looking
    down at the boy.
  • The assistant asks the mother about other
    symptoms, including fever ("slight") and
    abdominal pain ("He hasn't been complaining").
  • She turns again to the boy. "Has your tummy been
    hurting?" Yes, he replies.
  • After a few more questions, the assistant
    declares herself "not that concerned at this
    point." She schedules an appointment with a
    doctor in a couple of days. The mother leads her
    son from the room, holding his hand. But he keeps
    looking back at the assistant, fascinated, as if
    reluctant to leave.
  • Maybe that is because the assistant is the
    disembodied likeness of a woman's face on a
    computer screen, a no-frills avatar. Her words
    of sympathy are jerky, flat and mechanical. But
    she has the right stuff: the ability to
    understand speech, recognize pediatric conditions
    and reason according to simple rules to make an
    initial diagnosis of a childhood ailment and its
    seriousness. And to win the trust of a little
    boy.

3
Not all so rosy
4
Are we missing something? Sociolinguists and
discourse analysts have been studying social
aspects of language since the '20s and '30s!!!
5
Ask yourself this: Where do I sound like I'm from?
Actually from California, but picked up some
accent from my dad from New York... Did you
notice the "a" in "Carolyn"? But not the back-open
"r". And if you heard me say "daughter"... But how
often do I say that in class?
Note that context is everything. The "a" in "sat"
doesn't have the same significance as the "a" in
"Carolyn".
6
What information are we throwing away or
ignoring that would allow us to distinguish
meaningful variation from meaningless variation?
7
What will you get out of this class?
  • Learn to read the primary literature in
    sociolinguistics, discourse analysis, and
    pragmatics
  • Get a more intimate familiarity with the
    state-of-the-art in language processing applied
    to analysis of social media, especially
    conversation and narrative
  • Explore what insights these fields of linguistics
    can contribute to language technologies
  • Explore what language technologies might be able
    to do to advance these fields of linguistics
  • Get hands-on experience working on both

8
Please Introduce Yourself
  • What experience do you have with discourse
    analysis?
  • What do you most want to get out of this class?

9
Review from my LTI Colloquium talk
10
Discourse and Identity
  • Identity is reflected in the way we present
    ourselves in conversational interactions
  • Reflects who we are, how we think, and where we
    belong
  • Also reflects how we think of our audience
  • Examples
  • Regional dialect shows my identification with
    where I am from, but also shows I am comfortable
    letting you identify me that way
  • Jargon and technical terms show my
    identification with a work community, but also
    show I expect you to be able to relate to that
    part of my life
  • Level of formality shows where we stand in
    relation to one another
  • Explicitness in reference shows whether I am
    treating you like an insider or an outsider

11
Discourse and Identity
  • Discourse is text above the clause level (Martin
    & Rose, 2007)
  • A Discourse is an ongoing conversation: a type
  • Socialization is the process of joining a
    Discourse (Lave & Wenger, 1991; Sfard, 2010)
  • We join Discourses that match our core identity
    (de Fina, Schiffrin, & Bamberg, 2006)
  • In moving from the periphery to the core of a
    Discourse community, we sound more and more like
    the community (Arguello et al., 2006)
  • A discourse is one instance of it: a token
  • All discourses contain echoes of previous
    discourses (Bakhtin, 1983)

Lakoff & Johnson, 1980
Lave & Wenger, 1991
12
Metaphors Structure our Experience
  • We describe arguments using terms related to war
  • Using a typical war script to structure a story
    about an argument
  • We orient towards arguments as though they were
    wars
  • Our conversational partner is our opponent
  • We may feel that we won or lost
  • We may feel wounded as a result

13
Discourses, Frames, and Metaphors
  • Frame: A portion of a discourse belonging to a
    distinct Discourse
  • Metaphor: One linguistic device that can be used
    to define a set of discourse practices that
    constitute a frame
  • Topic models: A technical approach that makes
    sense for identifying frames within a discourse
  • A discourse could be drawn from a mixture of
    Discourses
  • Within the same conversation, we may wear a
    variety of hats
  • E.g., the same discourse with a co-worker may
    contain exchanges pertaining to our relationship
    as colleagues and others to our relationship as
    friends
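
The idea that a discourse is drawn from a mixture of Discourses lines up with the mixture assumption behind topic models. The following is a minimal sketch only, not the method used in the course: a small collapsed Gibbs sampler for LDA in plain Python, run on made-up toy "turns" from a conversation with a co-worker. The function name `gibbs_lda`, the hyperparameter values, and the toy data are all assumptions made for illustration.

```python
import random
from collections import Counter


def gibbs_lda(docs, num_topics, iters=200, alpha=0.1, beta=0.01, seed=0):
    """Collapsed Gibbs sampling for LDA over tokenized documents.

    Returns (mixtures, topic_word): a smoothed topic mixture per document
    and a word-count table per topic.
    """
    rng = random.Random(seed)
    vocab_size = len({w for doc in docs for w in doc})

    doc_topic = [[0] * num_topics for _ in docs]       # tokens per (doc, topic)
    topic_word = [Counter() for _ in range(num_topics)]  # tokens per (topic, word)
    topic_total = [0] * num_topics                     # tokens per topic
    assign = []                                        # topic of every token

    # Start from a random topic assignment for every token.
    for d, doc in enumerate(docs):
        zs = []
        for w in doc:
            z = rng.randrange(num_topics)
            zs.append(z)
            doc_topic[d][z] += 1
            topic_word[z][w] += 1
            topic_total[z] += 1
        assign.append(zs)

    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the token's current assignment from the counts.
                z = assign[d][i]
                doc_topic[d][z] -= 1
                topic_word[z][w] -= 1
                topic_total[z] -= 1
                # Resample its topic given all other assignments.
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][w] + beta)
                    / (topic_total[k] + vocab_size * beta)
                    for k in range(num_topics)
                ]
                z = rng.choices(range(num_topics), weights=weights)[0]
                assign[d][i] = z
                doc_topic[d][z] += 1
                topic_word[z][w] += 1
                topic_total[z] += 1

    mixtures = [
        [(c + alpha) / (len(doc) + num_topics * alpha) for c in counts]
        for doc, counts in zip(docs, doc_topic)
    ]
    return mixtures, topic_word


# Toy data (invented): turns about work, turns about friendship, one mixed.
docs = [
    "deadline report meeting budget report".split(),
    "meeting budget deadline review".split(),
    "weekend movie dinner kids".split(),
    "dinner weekend movie birthday".split(),
    "report deadline weekend dinner".split(),
]
mixtures, topic_word = gibbs_lda(docs, num_topics=2)
for doc, mix in zip(docs, mixtures):
    print(" ".join(doc), [round(p, 2) for p in mix])
```

Each row of `mixtures` is one turn's estimated proportion of each topic. On real conversational data, a turn whose mixture is split across topics is a candidate for a place where the speaker switches "hats".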

14
Now it's your turn
15
http://video.google.com/videoplay?docid=-6547777336881961043&hl=en
16
Discussion Questions
  • What other stories/movies/genres does this remind
    you of?
  • What is the message being communicated about
    Hummers?
  • What is communicated about the company that makes
    them?
  • What is communicated about the assumed audience?
  • What are other messages?
  • E.g., are any political statements being made?

17
Semester Plan
  • In each Unit
  • Readings from Discourse Analysis and
    Sociolinguistics
  • Readings from Language Technologies
  • Hands-on assignment
  • Implementation and corpus-based experiment
  • Competitive error analysis
  • Student Presentations
  • Unit 1: Theoretical Foundation
  • Unit 2: Linguistic Structure
  • Unit 3: Sentiment
  • Unit 4: Identity and Personality
  • Unit 5: Social Positioning

18
Grading: people who make a good-faith effort
always do well in my courses
  • 15% for each of 5 Unit assignments
  • First one is a discourse analysis
  • Others are corpus-based experiments
  • We provide the corpus
  • You implement a feature extractor, test it, do an
    error analysis, and present your well-motivated
    idea and evaluation in class
  • 10% for class participation
  • Doing readings (will be posted to course Drupal)
  • Posting to Drupal discussion by 10pm the night
    before class
  • Actively contributing to class discussions
  • 15% for final critique of a technical paper
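
As a rough illustration of what "implement a feature extractor" can mean, here is a minimal sketch in Python. It is not SIDE's plugin interface (which the slides describe as Java-based); the function name `extract_features`, the word lists, and the feature names are all hypothetical and chosen only to show the shape of the task: text in, named numeric features out.

```python
import re
from collections import Counter

# Hypothetical word lists, for illustration only.
FIRST_PERSON = {"i", "me", "my", "we", "us", "our"}
HEDGES = {"maybe", "perhaps", "probably", "might"}


def extract_features(text):
    """Map one utterance to a dictionary of named numeric features."""
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter(tokens)
    n = max(len(tokens), 1)  # avoid division by zero on empty input
    return {
        "first_person_rate": sum(counts[w] for w in FIRST_PERSON) / n,
        "hedge_rate": sum(counts[w] for w in HEDGES) / n,
        "is_question": float("?" in text),
        "num_tokens": float(len(tokens)),
    }


features = extract_features("Maybe we should move the meeting?")
print(features)
```

A theory-motivated version would replace the word lists with categories drawn from the Unit readings; the dictionary output makes it easy to inspect which features fired on a misclassified example.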

19
Corpora for experimentation
  • Unit 2: Maptask data (Negotiation coding)
  • Possibly other chat corpora with same coding as
    well
  • Unit 3: Product Reviews (Sentiment)
  • Unit 4: Blog corpus (Age and Gender)
  • Unit 5: AMI meeting corpus (Dialogue Acts)
  • Other corpora
  • Email discussion list (Social Support coding)

20
SIDE Workbench for Experimentation
  • http://www.cs.cmu.edu/~cprose/SIDE.html

21
SIDE
22
SIDE
23
Two Options
  • Create your own feature extractor plugins
  • We will provide documentation and abstract classes
    that you create specializations of
  • Programmed in Java
  • Elijah is the developer and can answer your
    questions
  • Use SIDE's feature creation functionality to
    create novel functions
  • Grades will be based on
  • The extent to which your features are
    theory-motivated or data-motivated
  • The depth of your error analysis
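
One simple starting point for an error analysis is to group misclassified examples by which label was confused with which, then read the largest group first. The sketch below assumes you already have gold labels and predictions as parallel lists; the function name `error_buckets` and the toy sentiment examples are invented for illustration and are not part of SIDE.

```python
from collections import defaultdict


def error_buckets(examples, gold, predicted):
    """Group misclassified examples by (gold, predicted) label pair.

    Returns a list of ((gold, predicted), [examples]) pairs, with the
    most frequent confusion first.
    """
    buckets = defaultdict(list)
    for text, g, p in zip(examples, gold, predicted):
        if g != p:
            buckets[(g, p)].append(text)
    return sorted(buckets.items(), key=lambda kv: len(kv[1]), reverse=True)


# Toy data (invented) in the style of a product-review sentiment task.
examples = ["not bad at all", "I expected more", "not great, not terrible", "love it"]
gold = ["pos", "neg", "neg", "pos"]
predicted = ["neg", "pos", "pos", "pos"]

buckets = error_buckets(examples, gold, predicted)
for (g, p), texts in buckets:
    print(f"gold={g} predicted={p}: {len(texts)} error(s)")
    for t in texts:
        print("   ", t)
```

Reading the examples inside each bucket, rather than only counting them, is what suggests the next feature to try.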

24
SIDE
25
SIDE
26
SIDE
27
Setting up the course Drupal
  • If you are not registered, please do so
  • If you don't have an Andrew account, make sure I
    have your email address
  • We will manage the course through Drupal
  • All materials, including pdfs for required
    readings, will be posted to Drupal
  • Slides for all lectures will be posted to Drupal
    after class
  • Discussion threads in preparation for each
    lecture will be found on Drupal

28
Assignment 1 (not due until Jan 26)
  • Transcribe a scene from a favorite movie, play, or
    TV show
  • As a shortcut, you can find a script online
  • Excerpt should be no more than one page of text
  • Select one of the methodologies we are discussing
    in Unit 1 (e.g., from Gee, Martin & Rose, or
    Levinson)
  • Do a qualitative analysis of the script and write
    it up
  • Use readings from Unit 1 as a collection of
    models to choose from
  • Due on Week 3 lecture 2
  • Turn in transcript, raw analysis (can be
    annotations added to the transcript), and
    write-up (your interpretation of the analysis)
  • Prepare a PowerPoint presentation for class (no
    more than 5 minutes of material)

29
For next time.
  • You will receive login information for Drupal
  • http://kanagawa.lti.cs.cmu.edu/11719/
  • Read excerpts from James Gee's book (linked to
    syllabus entry for Wednesday's lecture)
  • Post to Drupal (in response to discussion
    question posted for Week 1 Lecture 2)

30
Questions?
  • Carolyn Penstein Rosé
  • http://www.cs.cmu.edu/~cprose
  • cprose_at_cs.cmu.edu
  • Gates-Hillman Center 5415