A Probabilistic Approach to Semantic Representation

Provided by: tomgri
Learn more at: http://vw.indiana.edu

Transcript and Presenter's Notes

Title: A Probabilistic Approach to Semantic Representation


1
A Probabilistic Approach to Semantic
Representation
  • Tom Griffiths
  • Mark Steyvers
  • Josh Tenenbaum

2
  • How do we store the meanings of words?
  • question of representation
  • requires efficient abstraction

3
  • How do we store the meanings of words?
  • question of representation
  • requires efficient abstraction
  • Why do we store this information?
  • function of semantic memory
  • predictive structure

4
Latent Semantic Analysis (Landauer & Dumais, 1997)
co-occurrence matrix, mapped by SVD into a high dimensional space
$X = U D V^T$
5
Mechanistic Claim
  • Some component of word meaning can be extracted
    from co-occurrence statistics

6
Mechanistic Claim
  • Some component of word meaning can be extracted
    from co-occurrence statistics
  • But
  • Why should this be true?
  • Is the SVD the best way to treat these data?
  • What assumptions are we making about meaning?

7
Mechanism and Function
  • Some component of word meaning can be extracted
    from co-occurrence statistics
  • Semantic memory is structured to aid
    retrieval via context-specific prediction

8
Functional Claim
  • Semantic memory is structured to aid
    retrieval via context-specific prediction
  • Motivates sensitivity to co-occurrence statistics
  • Identifies how co-occurrence data should be used
  • Allows the role of meaning to be specified
    exactly, and finds a meaningful decomposition of
    language

9
A Probabilistic Approach
  • The function of semantic memory
  • The psychological problem of meaning
  • One approach to meaning
  • Solving the statistical problem of meaning
  • Maximum likelihood estimation
  • Bayesian statistics
  • Comparisons with Latent Semantic Analysis
  • Quantitative
  • Qualitative

11
The Function of Semantic Memory
  • To predict what concepts are likely to be needed
    in a context, and thereby ease their retrieval
  • Similar to rational accounts of categorization
    and memory (Anderson, 1990)
  • Same principle appears in semantic networks
    (Collins & Quillian, 1969; Collins & Loftus, 1975)

12
The Psychological Problem of Meaning
  • Simply memorizing whole word-document
    co-occurrence matrix does not help
  • Generalization requires abstraction, and this
    abstraction identifies the nature of meaning
  • Specifying a generative model for documents
    allows inference and generalization

13
One Approach to Meaning
  • Each document a mixture of topics
  • Each word chosen from a single topic
  • $P(w \mid z = j)$ from parameters $\phi^{(j)}$
  • $P(z = j)$ from parameters $\theta^{(d)}$

14
One Approach to Meaning
w             $P(w \mid z = 1) = \phi^{(1)}$    $P(w \mid z = 2) = \phi^{(2)}$
              (topic 1)                         (topic 2)
HEART         0.2                               0.0
LOVE          0.2                               0.0
SOUL          0.2                               0.0
TEARS         0.2                               0.0
JOY           0.2                               0.0
SCIENTIFIC    0.0                               0.2
KNOWLEDGE     0.0                               0.2
WORK          0.0                               0.2
RESEARCH      0.0                               0.2
MATHEMATICS   0.0                               0.2
15
One Approach to Meaning
Choose mixture weights for each document, generate bag of words
$\theta = \{P(z = 1), P(z = 2)\}$
{0, 1}: MATHEMATICS KNOWLEDGE RESEARCH WORK MATHEMATICS RESEARCH WORK SCIENTIFIC MATHEMATICS WORK
{0.25, 0.75}: SCIENTIFIC KNOWLEDGE MATHEMATICS SCIENTIFIC HEART LOVE TEARS KNOWLEDGE HEART
{0.5, 0.5}: MATHEMATICS HEART RESEARCH LOVE MATHEMATICS WORK TEARS SOUL KNOWLEDGE HEART
{0.75, 0.25}: WORK JOY SOUL TEARS MATHEMATICS TEARS LOVE LOVE LOVE SOUL
{1, 0}: TEARS LOVE JOY SOUL LOVE TEARS SOUL SOUL TEARS JOY
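The generative process on this slide can be sketched in a few lines. This is a minimal illustration, assuming the two uniform five-word topics from the previous slide; the function name `generate_document`, the document length of 10, and the random seed are assumptions made for the example, not part of the original talk.

```python
import random

# Topic-word distributions from the slide: each topic is uniform (0.2) over five words.
TOPIC_WORDS = {
    1: ["HEART", "LOVE", "SOUL", "TEARS", "JOY"],
    2: ["SCIENTIFIC", "KNOWLEDGE", "WORK", "RESEARCH", "MATHEMATICS"],
}


def generate_document(p_topic1, n_words=10, rng=None):
    """Sample a bag of words: choose a topic for each word, then a word from that topic."""
    if rng is None:
        rng = random.Random()
    words = []
    for _ in range(n_words):
        # z ~ theta, where theta = (p_topic1, 1 - p_topic1)
        z = 1 if rng.random() < p_topic1 else 2
        # w ~ phi(z); the topics are uniform, so a uniform choice is enough here
        words.append(rng.choice(TOPIC_WORDS[z]))
    return words


if __name__ == "__main__":
    rng = random.Random(0)  # arbitrary seed, for repeatability only
    for p in (0.0, 0.25, 0.5, 0.75, 1.0):
        print((p, 1 - p), " ".join(generate_document(p, rng=rng)))
```

With weight 0 on topic 1 every word comes from topic 2, and vice versa; intermediate weights give mixed documents like the middle rows above.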
16
One Approach to Meaning
[Graphical model: $\theta \rightarrow z \rightarrow w$]
  • Generative model for co-occurrence data
  • Introduced by Blei, Ng, and Jordan (2002)
  • Clarifies pLSI (Hofmann, 1999)

17
Matrix Interpretation
$C = \Phi \Theta$
  • $C$ (words × documents): normalized co-occurrence matrix
  • $\Phi$ (words × topics): mixture components
  • $\Theta$ (topics × documents): mixture weights
A form of non-negative matrix factorization
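As a small numerical check of the factorization, the sketch below multiplies the topic-word matrix and the mixture weights from the earlier two-topic example. It is an illustration only: the variable names and the pure-Python `matmul` helper are assumptions, and the numbers are the toy values from the earlier slides rather than corpus estimates.

```python
# Phi: words x topics, each column is P(w | z) for one topic (toy values from the slides).
PHI = [[0.2, 0.0] for _ in range(5)] + [[0.0, 0.2] for _ in range(5)]

# Theta: topics x documents, each column is (P(z = 1), P(z = 2)) for one document.
THETA = [
    [0.0, 0.25, 0.5, 0.75, 1.0],
    [1.0, 0.75, 0.5, 0.25, 0.0],
]


def matmul(a, b):
    """Plain matrix product of two lists-of-rows."""
    inner = len(b)
    return [
        [sum(a[i][k] * b[k][j] for k in range(inner)) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


# C: words x documents, entry (w, d) is P(w | document d).
C = matmul(PHI, THETA)

# Each column of C is a probability distribution over words.
column_sums = [sum(row[j] for row in C) for j in range(len(C[0]))]

if __name__ == "__main__":
    for row in C:
        print(["%.2f" % v for v in row])
    print("column sums:", column_sums)
```

Every entry of `PHI`, `THETA`, and `C` is non-negative and every column sums to one, which is what distinguishes this factorization from the signed, orthogonal factors produced by the SVD.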
18
Matrix Interpretation
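Topic model: $C = \Phi \Theta$
  • $C$ (words × documents), $\Phi$ (words × topics), $\Theta$ (topics × documents)
LSA: $C = U D V^T$
  • $C$ (words × documents), $U$ (words × vectors), $D$ (vectors × vectors), $V^T$ (vectors × documents)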
19
The Function of Semantic Memory
  • Prediction of needed concepts aids retrieval
  • Generalization aided by a generative model
  • One generative model: mixtures of topics
  • Gives non-negative, non-orthogonal factorization
    of word-document co-occurrence matrix

20
A Probabilistic Approach
  • The function of semantic memory
  • The psychological problem of meaning
  • One approach to meaning
  • Solving the statistical problem of meaning
  • Maximum likelihood estimation
  • Bayesian statistics
  • Comparisons with Latent Semantic Analysis
  • Quantitative
  • Qualitative

21
The Statistical Problem of Meaning
  • Generating data from parameters is easy
  • Learning parameters from data is hard
  • Two approaches to this problem
  • Maximum likelihood estimation
  • Bayesian statistics

22
Inverting the Generative Model
  • Maximum likelihood estimation: WT + DT parameters
  • Variational EM (Blei, Ng & Jordan, 2002): WT + T parameters
  • Bayesian inference: 0 parameters

23
Bayesian Inference
  • Sum in the denominator over $T^n$ terms
  • Full posterior only tractable up to a constant

24
Markov Chain Monte Carlo
  • Sample from a Markov chain which converges to
    target distribution
  • Allows sampling from an unnormalized posterior
    distribution
  • Can compute approximate statistics from
    intractable distributions

(MacKay, 2002)
25
Gibbs Sampling
  • For variables $x_1, x_2, \ldots, x_n$
  • Draw $x_i^{(t)}$ from $P(x_i \mid x_{-i})$
  • $x_{-i} = x_1^{(t)}, x_2^{(t)}, \ldots, x_{i-1}^{(t)}, x_{i+1}^{(t-1)}, \ldots, x_n^{(t-1)}$
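The update rule above can be illustrated on a tiny discrete distribution. This is a sketch under stated assumptions: the joint weights over two binary variables are made-up numbers, and the names `WEIGHTS`, `sample_conditional`, and `gibbs` are hypothetical. It only shows that sampling each variable from its full conditional, using unnormalized weights, approaches the target distribution.

```python
import random

# Unnormalized joint over two binary variables (hypothetical values for illustration).
# The normalized target puts probability 0.4 on (0, 0) and (1, 1), 0.1 on the others.
WEIGHTS = {(0, 0): 4.0, (0, 1): 1.0, (1, 0): 1.0, (1, 1): 4.0}


def sample_conditional(i, x, rng):
    """Draw x_i from P(x_i | x_-i); only ratios of unnormalized weights are needed."""
    candidates = []
    for value in (0, 1):
        y = list(x)
        y[i] = value
        candidates.append(WEIGHTS[tuple(y)])
    p_zero = candidates[0] / (candidates[0] + candidates[1])
    return 0 if rng.random() < p_zero else 1


def gibbs(n_iter, rng):
    """Run n_iter sweeps and return the empirical frequency of each joint state."""
    x = [0, 0]
    counts = {state: 0 for state in WEIGHTS}
    for _ in range(n_iter):
        # One sweep: update each variable in turn, conditioning on the latest values.
        for i in range(len(x)):
            x[i] = sample_conditional(i, x, rng)
        counts[tuple(x)] += 1
    return {state: c / n_iter for state, c in counts.items()}


if __name__ == "__main__":
    print(gibbs(20000, random.Random(0)))
```

The normalizing constant of `WEIGHTS` is never computed, which is the property that makes the method usable when the posterior is only known up to a constant.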

26
Gibbs Sampling
(MacKay, 2002)
27
Gibbs Sampling
  • Need full conditional distributions for variables
  • Since we only sample z we need $P(z_i = j \mid z_{-i}, w)$, which depends on two counts:
    number of times word w assigned to topic j
    number of times topic j used in document d
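The formula on this slide did not survive in the transcript. The sketch below assumes the standard collapsed Gibbs update for this model with symmetric Dirichlet priors, $P(z_i = j \mid z_{-i}, w) \propto \frac{n_{wj} + \beta}{n_j + W\beta}\,(n_{dj} + \alpha)$, where $n_{wj}$ and $n_{dj}$ are the two counts named above (excluding the current token), $n_j$ is the total number of words assigned to topic j, and W is the vocabulary size. The function name `lda_gibbs`, the hyperparameter values, and the toy corpus are assumptions for illustration, not the authors' implementation.

```python
import random


def lda_gibbs(docs, n_topics, vocab_size, alpha=0.1, beta=0.01, n_iter=200, seed=0):
    """Collapsed Gibbs sampling over topic assignments z. docs: lists of word ids."""
    rng = random.Random(seed)
    n_wj = [[0] * n_topics for _ in range(vocab_size)]  # times word w assigned to topic j
    n_dj = [[0] * n_topics for _ in docs]               # times topic j used in document d
    n_j = [0] * n_topics                                # total words assigned to topic j

    def bump(w, d, j, delta):
        n_wj[w][j] += delta
        n_dj[d][j] += delta
        n_j[j] += delta

    # Random initial assignment of every token to a topic.
    z = []
    for d, doc in enumerate(docs):
        z.append([rng.randrange(n_topics) for _ in doc])
        for w, j in zip(doc, z[d]):
            bump(w, d, j, +1)

    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                bump(w, d, z[d][i], -1)  # remove the current token from the counts
                weights = [
                    (n_wj[w][k] + beta) / (n_j[k] + vocab_size * beta) * (n_dj[d][k] + alpha)
                    for k in range(n_topics)
                ]
                z[d][i] = rng.choices(range(n_topics), weights=weights)[0]
                bump(w, d, z[d][i], +1)  # add it back under the new topic
    return z, n_wj, n_dj


if __name__ == "__main__":
    # Toy corpus: five documents over word ids 0-4, five over word ids 5-9.
    docs = [[0, 1, 2, 3, 4] * 4 for _ in range(5)] + [[5, 6, 7, 8, 9] * 4 for _ in range(5)]
    z, n_wj, n_dj = lda_gibbs(docs, n_topics=2, vocab_size=10)
    for w, row in enumerate(n_wj):
        print(w, row)
```

The per-document normalizer is omitted because it is the same for every topic j and cancels when the weights are normalized. Estimates of $\phi$ and $\theta$ can be read off the returned count tables after smoothing with $\beta$ and $\alpha$.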
28-36
Gibbs Sampling
[Figure sequence: state of the sampler at iterations 1, 2, ..., 1000]
37
A Visual Example: Bars
sample each pixel from a mixture of topics
pixel = word, image = document
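A minimal sketch of how such images could be generated, treating each horizontal or vertical bar as a topic. The 5 × 5 grid size, the number of pixels sampled per image, and the names `bar_topics` and `sample_image` are assumptions; the slide states only the pixel/word and image/document correspondence.

```python
import random

SIZE = 5  # assumed grid size; the slide does not state it


def bar_topics():
    """One topic per bar: each topic is uniform over the pixels of one row or column."""
    rows = [[(r, c) for c in range(SIZE)] for r in range(SIZE)]
    cols = [[(r, c) for r in range(SIZE)] for c in range(SIZE)]
    return rows + cols


def sample_image(theta, topics, n_pixels, rng):
    """Return a SIZE x SIZE grid of counts, sampling each pixel from a mixture of topics."""
    image = [[0] * SIZE for _ in range(SIZE)]
    for _ in range(n_pixels):
        z = rng.choices(range(len(topics)), weights=theta)[0]  # topic for this pixel
        r, c = rng.choice(topics[z])                           # pixel from that bar
        image[r][c] += 1
    return image


if __name__ == "__main__":
    topics = bar_topics()
    theta = [0.0] * len(topics)
    theta[0] = 0.5         # top horizontal bar
    theta[SIZE + 1] = 0.5  # second vertical bar
    for row in sample_image(theta, topics, 100, random.Random(0)):
        print(row)
```

An image drawn this way shows a superposition of a few bars, and recovering the individual bars from many such images is the decomposition problem the example poses.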
38
A Visual Example: Bars
39
From 1000 Images
40
Interpretable Decomposition
  • SVD gives a basis for the data, but not an
    interpretable one
  • The true basis is not orthogonal, so rotation
    does no good

41
Application to Corpus Data
  • TASA corpus: text from first grade to college
  • Vocabulary of 26414 words
  • Set of 36999 documents
  • Approximately 6 million words in corpus

42
A Selection of Topics
THIRD FIRST SECOND THREE FOURTH FOUR GRADE TWO FIFTH SEVENTH SIXTH EIGHTH HALF SEVEN SIX SINGLE NINTH END TENTH ANOTHER
BRAIN NERVE SENSE SENSES ARE NERVOUS NERVES BODY SMELL TASTE TOUCH MESSAGES IMPULSES CORD ORGANS SPINAL FIBERS SENSORY PAIN IS
CURRENT ELECTRICITY ELECTRIC CIRCUIT IS ELECTRICAL VOLTAGE FLOW BATTERY WIRE WIRES SWITCH CONNECTED ELECTRONS RESISTANCE POWER CONDUCTORS CIRCUITS TUBE NEGATIVE
NATURE WORLD HUMAN PHILOSOPHY MORAL KNOWLEDGE THOUGHT REASON SENSE OUR TRUTH NATURAL EXISTENCE BEING LIFE MIND ARISTOTLE BELIEVED EXPERIENCE REALITY
ART PAINT ARTIST PAINTING PAINTED ARTISTS MUSEUM WORK PAINTINGS STYLE PICTURES WORKS OWN SCULPTURE PAINTER ARTS BEAUTIFUL DESIGNS PORTRAIT PAINTERS
STUDENTS TEACHER STUDENT TEACHERS TEACHING CLASS CLASSROOM SCHOOL LEARNING PUPILS CONTENT INSTRUCTION TAUGHT GROUP GRADE SHOULD GRADES CLASSES PUPIL GIVEN
SPACE EARTH MOON PLANET ROCKET MARS ORBIT ASTRONAUTS FIRST SPACECRAFT JUPITER SATELLITE SATELLITES ATMOSPHERE SPACESHIP SURFACE SCIENTISTS ASTRONAUT SATURN MILES
THEORY SCIENTISTS EXPERIMENT OBSERVATIONS SCIENTIFIC EXPERIMENTS HYPOTHESIS EXPLAIN SCIENTIST OBSERVED EXPLANATION BASED OBSERVATION IDEA EVIDENCE THEORIES BELIEVED DISCOVERED OBSERVE FACTS
43
A Selection of Topics
JOB WORK JOBS CAREER EXPERIENCE EMPLOYMENT OPPORTUNITIES WORKING TRAINING SKILLS CAREERS POSITIONS FIND POSITION FIELD OCCUPATIONS REQUIRE OPPORTUNITY EARN ABLE
SCIENCE STUDY SCIENTISTS SCIENTIFIC KNOWLEDGE WORK RESEARCH CHEMISTRY TECHNOLOGY MANY MATHEMATICS BIOLOGY FIELD PHYSICS LABORATORY STUDIES WORLD SCIENTIST STUDYING SCIENCES
BALL GAME TEAM FOOTBALL BASEBALL PLAYERS PLAY FIELD PLAYER BASKETBALL COACH PLAYED PLAYING HIT TENNIS TEAMS GAMES SPORTS BAT TERRY
FIELD MAGNETIC MAGNET WIRE NEEDLE CURRENT COIL POLES IRON COMPASS LINES CORE ELECTRIC DIRECTION FORCE MAGNETS BE MAGNETISM POLE INDUCED
STORY STORIES TELL CHARACTER CHARACTERS AUTHOR READ TOLD SETTING TALES PLOT TELLING SHORT FICTION ACTION TRUE EVENTS TELLS TALE NOVEL
MIND WORLD DREAM DREAMS THOUGHT IMAGINATION MOMENT THOUGHTS OWN REAL LIFE IMAGINE SENSE CONSCIOUSNESS STRANGE FEELING WHOLE BEING MIGHT HOPE
DISEASE BACTERIA DISEASES GERMS FEVER CAUSE CAUSED SPREAD VIRUSES INFECTION VIRUS MICROORGANISMS PERSON INFECTIOUS COMMON CAUSING SMALLPOX BODY INFECTIONS CERTAIN
WATER FISH SEA SWIM SWIMMING POOL LIKE SHELL SHARK TANK SHELLS SHARKS DIVING DOLPHINS SWAM LONG SEAL DIVE DOLPHIN UNDERWATER
45
A Probabilistic Approach
  • The function of semantic memory
  • The psychological problem of meaning
  • One approach to meaning
  • Solving the statistical problem of meaning
  • Maximum likelihood estimation
  • Bayesian statistics
  • Comparisons with Latent Semantic Analysis
  • Quantitative
  • Qualitative

46
Probabilistic Queries
  • can be computed in different ways
  • Fixed topic assumption
  • Multiple samples
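One way such a query could be computed from a single set of estimated topics is $P(w_2 \mid w_1) = \sum_z P(w_2 \mid z)\,P(z \mid w_1)$, with $P(z \mid w_1)$ obtained by Bayes' rule. The sketch below is an assumption about the form of the query, since the slide's formula is missing, and the two toy topics and their probabilities are invented for illustration.

```python
# Hypothetical topic-word distributions P(w | z) for two toy topics.
PHI = [
    {"PLANETS": 0.3, "STARS": 0.6, "MOVIE": 0.1},  # an "astronomy" topic
    {"PLANETS": 0.0, "STARS": 0.5, "MOVIE": 0.5},  # a "film" topic
]
P_Z = [0.5, 0.5]  # assumed prior over topics


def conditional_word_prob(w2, w1, phi=PHI, p_z=P_Z):
    """P(w2 | w1) = sum_z P(w2 | z) P(z | w1), with P(z | w1) from Bayes' rule."""
    joint = [phi[z][w1] * p_z[z] for z in range(len(p_z))]  # P(w1, z)
    total = sum(joint)                                      # P(w1)
    return sum(phi[z][w2] * joint[z] / total for z in range(len(p_z)))


if __name__ == "__main__":
    print("P(STARS | PLANETS) =", conditional_word_prob("STARS", "PLANETS"))
    print("P(PLANETS | STARS) =", conditional_word_prob("PLANETS", "STARS"))
```

In this toy setting the two directions differ (0.6 versus roughly 0.16), so the measure is asymmetric, unlike a cosine between word vectors. Averaging the same quantity over several samples of the topics would correspond to the "multiple samples" option.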

47
Quantitative Comparisons
  • Two types of task
  • general semantic tasks: dictionary, thesaurus
  • prediction of memory data
  • All tests use LSA with 400 vectors, and a
    probabilistic model with 100 samples each using
    500 topics

48
Fill in the Blank
  • 12856 sentences extracted from WordNet
  • Overall performance
  • LSA gives median rank of 3393
  • Probabilistic model gives median rank of 3344

his cold deprived him of his sense of _
silence broken by dogs barking _
a _ hybrid accent
49
Fill in the Blank
50
Synonyms
  • 280 sets of five synonyms from WordNet, ordered
    by number of senses
  • Two tasks
  • Predict first synonym
  • Predict last synonym
  • Increasing number of synonyms

BREAK (78) EXPOSE (9) DISCOVER (8) DECLARE (7) REVEAL (3)
CUT (72) REDUCE (19) CONTRACT (12) SHORTEN (5) ABRIDGE (1)
RUN (53) GO (34) WORK (25) FUNCTION (9) OPERATE (7)
51
First Synonym
52
Last Synonym
53
Synonyms and Word Frequency
54
Synonyms and Word Frequency
Probabilistic
LSA
55
Synonyms and Word Frequency
Probabilistic
LSA
56
Word Frequency and Filling Blanks
LSA
Probabilistic
57
Performance on Semantic Tasks
  • Performance comparable, neither great
  • Difference in effects of word frequency due to
    treatment of co-occurrence data
  • Probabilistic approach useful in addressing
    psychological data: frequency important

58
Intrusions in Free Recall
CHAIR FOOD DESK TOP LEG EAT CLOTH DISH WOOD DINNER
MARBLE TENNIS
  • Intrusion rates from Deese (1959)
  • Used average word vectors in LSA, $P(\text{word} \mid \text{list})$ in
    probabilistic model
  • Favors LSA, since probabilistic combination can
    be multimodal

59
Intrusions in Free Recall
60
Intrusions in Free Recall
word frequency
models
61
Word Frequency is Not Enough
  • An explanation needs to address two questions
  • Why do these words intrude?
  • Why do other words not intrude?

62
Word Frequency is Not Enough
  • An explanation needs to address two questions
  • Why do these words intrude?
  • Why do other words not intrude?
  • Median word frequency rank: 1698.5
  • Median rank in model: 21

63
Word Association
  • Word association norms from Nelson et al. (1998)

PLANETS
associate number   1      2      3      4      5      6         7       8
people             EARTH  STARS  SPACE  SUN    MARS   UNIVERSE  SATURN  GALAXY
model              STARS  STAR   SUN    EARTH  SPACE  SKY       PLANET  UNIVERSE
64
Word Association
65
Performance on Memory Tasks
  • Outperforms LSA on simple memory tasks, both far
    better at predicting memory data
  • Improvement due to role of word frequency
  • Not a complete account, but can form a part of
    more complex memory models

66
Qualitative Comparisons
  • Naturally deals with complications for LSA
  • Polysemy
  • Asymmetry
  • Respects natural statistics of language
  • Easily extends to other models of meaning

67
Beyond the Bag of Words
[Graphical model: one $\theta$ node, three topic nodes z, three word nodes w]
68
Beyond the Bag of Words
[Graphical models: $\theta$ nodes, topic nodes z, word nodes w, and three additional nodes s]
69
Semantic categories
PLANTS PLANT LEAVES SEEDS SOIL ROOTS FLOWERS WATER FOOD GREEN SEED STEMS FLOWER STEM LEAF ANIMALS ROOT POLLEN GROWING GROW
GOLD IRON SILVER COPPER METAL METALS STEEL CLAY LEAD ADAM ORE ALUMINUM MINERAL MINE STONE MINERALS POT MINING MINERS TIN
BEHAVIOR SELF INDIVIDUAL PERSONALITY RESPONSE SOCIAL EMOTIONAL LEARNING FEELINGS PSYCHOLOGISTS INDIVIDUALS PSYCHOLOGICAL EXPERIENCES ENVIRONMENT HUMAN RESPONSES BEHAVIORS ATTITUDES PSYCHOLOGY PERSON
CELLS CELL ORGANISMS ALGAE BACTERIA MICROSCOPE MEMBRANE ORGANISM FOOD LIVING FUNGI MOLD MATERIALS NUCLEUS CELLED STRUCTURES MATERIAL STRUCTURE GREEN MOLDS
DOCTOR PATIENT HEALTH HOSPITAL MEDICAL CARE PATIENTS NURSE DOCTORS MEDICINE NURSING TREATMENT NURSES PHYSICIAN HOSPITALS DR SICK ASSISTANT EMERGENCY PRACTICE
BOOK BOOKS READING INFORMATION LIBRARY REPORT PAGE TITLE SUBJECT PAGES GUIDE WORDS MATERIAL ARTICLE ARTICLES WORD FACTS AUTHOR REFERENCE NOTE
MAP NORTH EARTH SOUTH POLE MAPS EQUATOR WEST LINES EAST AUSTRALIA GLOBE POLES HEMISPHERE LATITUDE PLACES LAND WORLD COMPASS CONTINENTS
FOOD FOODS BODY NUTRIENTS DIET FAT SUGAR ENERGY MILK EATING FRUITS VEGETABLES WEIGHT FATS NEEDS CARBOHYDRATES VITAMINS CALORIES PROTEIN MINERALS
70
Syntactic categories
BE MAKE GET HAVE GO TAKE DO FIND USE SEE HELP KEEP GIVE LOOK COME WORK MOVE LIVE EAT BECOME
ONE SOME MANY TWO EACH ALL MOST ANY THREE THIS EVERY SEVERAL FOUR FIVE BOTH TEN SIX MUCH TWENTY EIGHT
HE YOU THEY I SHE WE IT PEOPLE EVERYONE OTHERS SCIENTISTS SOMEONE WHO NOBODY ONE SOMETHING ANYONE EVERYBODY SOME THEN
MORE SUCH LESS MUCH KNOWN JUST BETTER RATHER GREATER HIGHER LARGER LONGER FASTER EXACTLY SMALLER SOMETHING BIGGER FEWER LOWER ALMOST
ON AT INTO FROM WITH THROUGH OVER AROUND AGAINST ACROSS UPON TOWARD UNDER ALONG NEAR BEHIND OFF ABOVE DOWN BEFORE
THE HIS THEIR YOUR HER ITS MY OUR THIS THESE A AN THAT NEW THOSE EACH MR ANY MRS ALL
GOOD SMALL NEW IMPORTANT GREAT LITTLE LARGE BIG LONG HIGH DIFFERENT SPECIAL OLD STRONG YOUNG COMMON WHITE SINGLE CERTAIN
SAID ASKED THOUGHT TOLD SAYS MEANS CALLED CRIED SHOWS ANSWERED TELLS REPLIED SHOUTED EXPLAINED LAUGHED MEANT WROTE SHOWED BELIEVED WHISPERED
71
Sentence generation
RESEARCH S THE CHIEF WICKED SELECTION OF
RESEARCH IN THE BIG MONTHS S EXPLANATIONS S
IN THE PHYSICISTS EXPERIMENTS S HE MUST QUIT
THE USE OF THE CONCLUSIONS S ASTRONOMY PEERED
UPON YOUR SCIENTISTS DOOR S ANATOMY ESTABLISHED
WITH PRINCIPLES EXPECTED IN BIOLOGY S ONCE BUT
KNOWLEDGE MAY GROW S HE DECIDED THE MODERATE
SCIENCE LANGUAGE S RESEARCHERS GIVE THE
SPEECH S THE SOUND FEEL NO LISTENERS S WHICH
WAS TO BE MEANING S HER VOCABULARIES STOPPED
WORDS S HE EXPRESSLY WANTED THAT BETTER VOWEL
72
Sentence generation

LAW S BUT THE CRIME HAD BEEN SEVERELY POLITE
OR CONFUSED S CUSTODY ON ENFORCEMENT RIGHTS IS
PLENTIFUL CLOTHING S WEALTHY COTTON PORTFOLIO
WAS OUT OF ALL SMALL SUITS S HE IS CONNECTING
SNEAKERS S THUS CLOTHING ARE THOSE OF
CORDUROY S THE FIRST AMOUNTS OF FASHION IN THE
SKIRT S GET TIGHT TO GET THE EXTENT OF THE
BELTS S ANY WARDROBE CHOOSES TWO SHOES THE
ARTS S SHE INFURIATED THE MUSIC S ACTORS
WILL MANAGE FLOATING FOR JOY S THEY ARE A SCENE
AWAY WITH MY THINKER S IT MEANS A CONCLUSION
73
Conclusion
  • Taking a probabilistic approach can clarify
    some of the central issues in semantic
    representation
  • Motivates sensitivity to co-occurrence statistics
  • Identifies how co-occurrence data should be used
  • Allows the role of meaning to be specified
    exactly, and finds a meaningful decomposition of
    language

75
Probabilities and Inner Products
  • Single word
  • List of words

w
77
Model Selection
  • How many topics does a language contain?
  • Major issue for parametric models
  • Not so much for non-parametric models
  • Dirichlet process mixtures
  • Expect more topics than tractable
  • Choice of number is choice of scale

80
Gibbs Sampling and EM
  • How many topics does a language contain?
  • EM finds fixed set of topics, single estimate
  • Sampling allows for multiple sets of topics, and
    multimodal posterior distributions

83
Natural Statistics
  • Treating co-occurrence data as frequencies
    preserves the natural statistics of language
  • Word frequency
  • Zipf's Law of Meaning

84
Natural Statistics
85
Natural Statistics
86
Natural Statistics
88
Word Association
CROWN
people   KING  JEWEL  QUEEN  HEAD   HAT      TOP    ROYAL  THRONE
model    KING  TEETH  HAIR   TOOTH  ENGLAND  MOUTH  QUEEN  PRINCE
89
Word Association
SANTA
people   CHRISTMAS  TOYS     LIE
model    MEXICO     SPANISH  CALIFORNIA