Title: Some Thoughts on Tagging

1
Some Thoughts on Tagging
Marti Hearst UC Berkeley
2
Outline

What are Tags?
Organizing Tags for Navigation
Facets and faceted navigation
How to (semi)automatically create facet
hierarchies
What's up with Tag Clouds?

3
Social Tagging

Metadata assignment without all the bother
Spontaneous, easy, and tends towards single terms
Usually used in the context of social media

4
Example from del.icio.us
5
The Tagging Opportunity

At last! Content-oriented metadata in the large!
Attempts at metadata standardization always end
up with something like the Dublin Core
author, date, publisher, yaaawwwwnnn.
I've always thought the action was in the subject
metadata, and have focused on how to navigate
collections given such data.

6
The Tagging Opportunity

Tags are inherently faceted!
It is assumed that multiple labels will be
assigned to each item
Rather than placing them into a folder
Rather than placing them into a hierarchy
Concepts are assigned from many different content
categories
Helps alleviate the metadata wars
Allows for both splitters and lumpers
Is this a bird or a robin?
Doesn't matter, you can do both!
Allows for differing organizational views
Does NASCAR go under sports or entertainment?
Doesn't matter, you can do both!
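
A minimal sketch of what "you can do both" means in data terms: each item carries several tags, and an index from tag to items lets the same item be reached from any of them. The item names and the helper `build_tag_index` are hypothetical, made up for illustration.

```python
from collections import defaultdict

def build_tag_index(item_tags):
    """Map each tag to the set of items that carry it."""
    index = defaultdict(set)
    for item, tags in item_tags.items():
        for tag in tags:
            index[tag].add(item)
    return index

# Hypothetical items; each one gets several labels instead of one folder.
item_tags = {
    "photo-17": ["bird", "robin"],
    "nascar-article": ["sports", "entertainment"],
}
index = build_tag_index(item_tags)
print(index["bird"], index["robin"])
print(index["sports"] == index["entertainment"])
```

Nothing forces a choice between the splitter's label and the lumper's label; the item simply shows up under both.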

7
Tagging Problems

Tags aren't organized
Thorough coverage isn't controlled for
The haphazard assignments lead to problems with
Synonymy
Homonymy
See how this author attempts to compensate

8
Tagging Problems / Opportunities

Some tags are fleeting in meaning or too personal
toread, todo
Tags are not professional
(I personally don't think this matters)
Great example from Trant
"Anecdotal evidence also shows that
professional cataloguers find the basic
description of visual elements surprisingly
difficult; a curator exhibited significant
discomfort during this description task. When
asked what was wrong, he blurted out 'everything
I know isn't in the picture'."
"Investigating social tagging and folksonomy in
the art museum with steve.museum", J. Trant, B.
Wyman, WWW 2006 Collaborative Tagging Workshop

9
"Investigating social tagging and folksonomy in
the art museum with steve.museum", J. Trant, B.
Wyman, WWW 2006 Collaborative Tagging Workshop
10
What about Browsing?

I think tags need some organization
Currently most tags are used as a direct index
into items
Click on tag, see items assigned to it, end of
story
Co-occurring tags are not shown
Grouping into small hierarchies is not usually
done
del.icio.us now has bundles, but navigation isn't
good
IBM's dogear and RawSugar come the closest
I think the solution is to organize tags into
faceted hierarchies and do browsing in the
standard way
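
As a rough illustration of what showing co-occurring tags could involve, the sketch below counts how often pairs of tags are assigned to the same item and lists the tags most often seen alongside a given one. The bookmark data and the function names are assumptions for illustration, not taken from any of the sites mentioned.

```python
from collections import Counter
from itertools import combinations

def cooccurrence_counts(item_tags):
    """Count how many items each unordered pair of tags shares."""
    counts = Counter()
    for tags in item_tags.values():
        for a, b in combinations(sorted(set(tags)), 2):
            counts[(a, b)] += 1
    return counts

def related_tags(tag, counts, top_n=3):
    """Tags that most often co-occur with `tag`."""
    scores = Counter()
    for (a, b), n in counts.items():
        if a == tag:
            scores[b] += n
        elif b == tag:
            scores[a] += n
    return [t for t, _ in scores.most_common(top_n)]

# Hypothetical bookmarks and their tags.
bookmarks = {
    "url1": ["python", "tutorial", "web"],
    "url2": ["python", "web"],
    "url3": ["python", "tutorial"],
    "url4": ["recipes", "thai"],
}
counts = cooccurrence_counts(bookmarks)
print(related_tags("python", counts))
```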

11
Faceted Navigation and Flamenco
12
The Problem With Hierarchy

Most things can be classified in more than one
way.
Most organizational systems do not handle this
well.
Example: Animal Classification

Animals: otter, penguin, robin, salmon, wolf, cobra, bat
Classified by: Skin Covering, Locomotion, Diet
13
The Problem with Hierarchy

Inflexible
Force the user to start with a particular
category
What if I don't know the animal's diet, but the
interface makes me start with that category?
Wasteful
Have to repeat combinations of categories
Makes for extra clicking and extra coding
Difficult to modify
To add a new category type, must duplicate it
everywhere or change things everywhere

14
The Problem With Hierarchy
start
15
The Idea of Facets

Facets are a way of labeling data
A kind of Metadata (data about data)
Can be thought of as properties of items
Facets vs. Categories
Items are placed INTO a category system
Multiple facet labels are ASSIGNED TO items

16
The Idea of Facets

Create INDEPENDENT categories (facets)
Each facet has labels (sometimes arranged in a
hierarchy)
Assign labels from the facets to every item
Example: recipe collection

Ingredient: Chicken, Bell Pepper
Cooking Method: Stir-fry, Curry
Course: Main Course
Cuisine: Thai
17
The Idea of Facets

Break out all the important concepts into their
own facets
Sometimes the facets are hierarchical
Assign labels to items from any level of the
hierarchy

Preparation Method: Fry, Saute, Boil, Bake, Broil, Freeze
Desserts: Cakes, Cookies, Dairy (Ice Cream, Sorbet, Flan)
Fruits: Cherries, Berries (Blueberries, Strawberries), Bananas, Pineapple
18
Using Facets

Now there are multiple ways to get to each item

Preparation Method: Fry, Saute, Boil, Bake, Broil, Freeze
Desserts: Cakes, Cookies, Dairy (Ice Cream, Sherbet, Flan)
Fruits: Cherries, Berries (Blueberries, Strawberries), Bananas, Pineapple

Fruit > Pineapple, Dessert > Cake, Preparation > Bake
Dessert > Dairy > Sherbet, Fruit > Berries > Strawberries, Preparation > Freeze
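
The two labeled items above can be used to sketch how faceted browsing reaches the same item by several routes. This is a minimal sketch: the item names ("pineapple cake", "strawberry sherbet") and the functions `matches` and `browse` are hypothetical, and a label path is modeled as a tuple from the facet name down to the assigned label.

```python
def matches(item_paths, selection):
    """True if one of the item's label paths starts with the selected path."""
    return any(path[:len(selection)] == selection for path in item_paths)

def browse(items, selections):
    """Items that satisfy every selected facet path."""
    return [name for name, paths in items.items()
            if all(matches(paths, sel) for sel in selections)]

# Hypothetical item names; the label paths are the ones on the slide.
items = {
    "pineapple cake": [
        ("Fruit", "Pineapple"), ("Dessert", "Cake"), ("Preparation", "Bake"),
    ],
    "strawberry sherbet": [
        ("Dessert", "Dairy", "Sherbet"),
        ("Fruit", "Berries", "Strawberries"),
        ("Preparation", "Freeze"),
    ],
}

print(browse(items, [("Fruit",)]))
print(browse(items, [("Fruit", "Berries")]))
print(browse(items, [("Dessert",), ("Preparation", "Bake")]))
```

Selecting a higher-level label such as Fruit > Berries still finds an item labeled Fruit > Berries > Strawberries, which is what assigning labels from any level of the hierarchy buys you.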
19
The Flamenco Interface

Fine Arts Museum Example

20-30
(No Transcript)
31
Advantages of the Approach

Systematically integrates search results
reflect the structure of the info architecture
retain the context of previous interactions
Gives users control and flexibility
Over order of metadata use
Over when to navigate vs. when to search
Allows integration with advanced methods
Collaborative filtering, predicting users'
preferences

32
Advantages of Facets

Can't end up with empty result sets
(except with keyword search)
Helps avoid feelings of being lost.
Easier to explore the collection.
Helps users infer what kinds of things are in the
collection.
Evokes a feeling of browsing the shelves
Is preferred over standard search for collection
browsing in usability studies.
(Interface must be designed properly)
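
One common way to get the "no empty result sets" property is to offer only those facet labels that occur in the current result set, each with its item count. The sketch below assumes that design; the recipe ids, facet names, and `facet_counts` are illustrative and do not describe how Flamenco itself is implemented.

```python
from collections import Counter

def facet_counts(items, current_ids):
    """For the current result set, count items under each facet label."""
    counts = {}
    for item_id in current_ids:
        for facet, labels in items[item_id].items():
            counts.setdefault(facet, Counter()).update(labels)
    return counts

# Hypothetical collection: item id -> facet -> assigned labels.
items = {
    "r1": {"Cuisine": ["Thai"], "Course": ["Main Course"]},
    "r2": {"Cuisine": ["Thai"], "Course": ["Dessert"]},
    "r3": {"Cuisine": ["Italian"], "Course": ["Dessert"]},
}

# Suppose the user has already selected Cuisine > Thai.
current = ["r1", "r2"]
counts = facet_counts(items, current)
print(counts["Course"])
```

Because "Italian" never appears in the counts for the current set, the interface has no link to show for it, so every link that is shown leads to at least one item.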

33
Related Work: Automated Tag Organization

Some efforts are on tag prediction
Mishne 06
Uses IR techniques to find the closest tagged
documents, uses their tags to assign new tags.
Measures how well the new tags are predicted
Xu et al. 06
Use tags that have already been predicted for a
document to predict which to show to a new user
who is tagging the document
Some efforts on tag organization
Brooks & Montanez 06
Tries to see if tags can predict document
clusters, which in my book aren't really
categories
After clustering based on text they try to induce
a tag hierarchy by agglomerative clustering the
text. Results not described in detail
Begelman et al. 06
Use clustering and tag co-occurrence to find
associated tags. Not clear what the
organizational goal is

34
RawSugar

A company/website that organizes tags from blogs
into facets
They are undergoing a revamp, will move to
channels
However, nothing published on this
(presumably, patents filed)

35-37
(No Transcript)
38
How to Create Facet Hierarchies?

Our Approach: Castanet
(Stoica & Hearst, to appear at HLT-NAACL 07)

39
Example: Recipes (3500 docs)
40
Castanet Output (shown in Flamenco)
41
Castanet Output (shown in Flamenco)
42
Castanet Output (shown in Flamenco)
43
Example: Biology Journal Titles
Castanet Output (shown in Flamenco)
44
Castanet Algorithm

Leverage the structure of WordNet

Documents
45
1. Select Terms
Pipeline: Documents -> Select terms -> Get hypernym paths (from WordNet) -> Build tree -> Compress tree

Select well-distributed terms from the collection
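
The slide does not say how "well distributed" is measured. One plausible reading, assumed here purely for illustration, is to keep terms that occur in at least a few documents but not in most of them. The thresholds `min_docs` and `max_fraction` and the function `select_terms` are hypothetical, not Castanet's actual criterion.

```python
from collections import Counter

def select_terms(docs, min_docs=2, max_fraction=0.5):
    """Keep terms whose document frequency falls inside an assumed band."""
    doc_freq = Counter()
    for doc in docs:
        # Count each term once per document.
        doc_freq.update(set(doc.lower().split()))
    limit = max_fraction * len(docs)
    return sorted(t for t, df in doc_freq.items() if min_docs <= df <= limit)

# Toy collection standing in for recipe titles.
docs = ["red apple pie", "blue berry pie", "red berry tart", "green apple pie"]
print(select_terms(docs))
```

Here "pie" is dropped for being too common and "tart" for being too rare.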
46
2. Get Hypernym Path
red
blue
47
3. Build Tree
red
blue
48
4. Compress Tree
color
chromatic color
red, redness
blue, blueness
green, greenness
red
blue
green
49
4. Compress Tree (cont.)
color
color
chromatic color
red
blue
green
red
blue
green
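
The build and compress steps can be sketched as below. This is a minimal sketch, not the Castanet implementation: the hypernym paths are typed in by hand rather than looked up in WordNet, and the two compression rules (splice out a non-root node that has a single child; splice out a child whose name contains its parent's name) are assumptions chosen to reproduce the color example on the slides.

```python
def build_tree(paths):
    """Merge root-to-leaf hypernym paths into one nested-dict tree."""
    root = {}
    for path in paths:
        node = root
        for label in path:
            node = node.setdefault(label, {})
    return root

def splice_single_child_nodes(tree, is_root=True):
    """Assumed rule 1: replace a non-root node that has one child by that child."""
    result = {}
    for label, children in tree.items():
        children = splice_single_child_nodes(children, is_root=False)
        if not is_root and len(children) == 1:
            result.update(children)
        else:
            result[label] = children
    return result

def splice_redundant_children(tree):
    """Assumed rule 2: drop a child whose name contains the parent's name."""
    result = {}
    for label, children in tree.items():
        children = splice_redundant_children(children)
        merged = {}
        for child, grandchildren in children.items():
            if label in child.split():
                merged.update(grandchildren)  # promote the grandchildren
            else:
                merged[child] = grandchildren
        result[label] = merged
    return result

# Hand-written paths mirroring the slide (not fetched from WordNet).
paths = [
    ["color", "chromatic color", "red, redness", "red"],
    ["color", "chromatic color", "blue, blueness", "blue"],
    ["color", "chromatic color", "green, greenness", "green"],
]
tree = build_tree(paths)
step1 = splice_single_child_nodes(tree)
step2 = splice_redundant_children(step1)
print(step1)
print(step2)
```

After the first rule the tree is color > chromatic color > red, blue, green; after the second it is color > red, blue, green.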
50
5. Divide into Facets
Divide into facets
51
Disambiguation

Ambiguity in
Word senses
Paths up the hypernym tree

52
How to Select the Right Senses and Paths?

First build core tree
(1) Create paths for words with only one sense
(2) Use Domains
WordNet has 212 Domains
medicine, mathematics, biology, chemistry,
linguistics, soccer, etc.
Automatically scan the collection to see which
domains apply
The user selects which of the suggested domains
to use or may add their own
Paths for terms that match the selected domains
are added to the core tree
Then add remaining terms to the core tree.

53
Castanet Evaluation Method

Information architects assessed the category
systems
For each of 2 systems' output
Examined and commented on top-level
Examined and commented on two sub-levels
Also compared to a baseline system
Then commented on overall properties
Meaningful?
Systematic?
Likely to use in your work?

54
Castanet Evaluation Results

Results on recipes collection for
"Would you use this system in your work?"
Answered "Yes in some cases" or "Yes, definitely":
Castanet: 29/34
LDA: 0/18
Subsumption: 6/16
Baseline: 25/34
Average response to questions about quality
(4 = strongly agree)

55
Will Castanet Work on Tags?

Class project by Simon King and Jeff Towle, 2004
1650 captions captured from mobile phones
"Blocks with Grandpa", "Weezer", "A veterans day
tour of berkeley in front of south hall.", "Bad
photo", "Kitchen", "Jgj"
Wanted to organize them.
Use the Castanet WordNet-based facet-hierarchy
creation algorithm
by Stoica & Hearst, to appear at HLT-NAACL 07
Had to first remove proper names

56
Example Photo Captions (King & Towle)
very scary x-mas tree
Hp presentation
chasing a cat in the dark
My cat
57

instrumentality, (112)
vehicle (26)
car (9)
bike (8)
vessel, watercraft (4)
mayflower (2)
ferry (1)
gig (1)
truck (3)
airplane (2)
device (20)
machine (7)
computer (4)
laptop (1)
sander (1)

container (16)
vessel (7)
bottle (5)
water_bottle (2)
jug (1)
pill_bottle (1)
bath (2)
bowl (1)
can (2)
backpack (1)
bumper (1)
empty (1)
salt_shaker (1)
furniture, piece of furniture, article of
furniture (12)
seat (8)
bench (2)
chair (2)
couch (2)
lounge (1)

58
Research Questions for Tags Search

The role of interface on tag convergence
There seems to be a big effect
Would be really interesting to experiment with
this
Also, for facet grouping
Anchor text vs. tags?
How are they the same? How do they differ?
How to get tag expertise?
Right now, in many cases it is least-common-denominator
ESP-game

59
What's up with Tag Clouds?
What does a typical tag cloud look like?
60
Definition

Tag Cloud: A visual representation of social
tags, organized into a paragraph-style layout,
usually in alphabetical order, where the relative
size and weight of the font for each tag
corresponds to the relative frequency of its use.
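
The definition can be made concrete with a small sketch: sort the tags alphabetically and map each tag's frequency to a font size. Linear scaling between a minimum and maximum point size is an assumption made here for simplicity (the definition only says the size corresponds to relative frequency), and the counts and the function `tag_cloud` are made up for illustration.

```python
def tag_cloud(tag_counts, min_pt=10, max_pt=32):
    """Return (tag, font size in points) pairs in alphabetical order."""
    lo = min(tag_counts.values())
    hi = max(tag_counts.values())
    span = (hi - lo) or 1  # avoid dividing by zero when all counts are equal
    cloud = []
    for tag in sorted(tag_counts, key=str.lower):
        scale = (tag_counts[tag] - lo) / span
        cloud.append((tag, round(min_pt + scale * (max_pt - min_pt))))
    return cloud

# Hypothetical usage counts.
counts = {"web": 50, "art": 10, "python": 30}
print(tag_cloud(counts))
```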

62
flickr's tag cloud
63
del.icio.us
64
del.icio.us
65
blogs
66

ma.gnolia.com

67
NYTimes.com: tags from most frequent search terms
68
IBM's manyeyes project
69
Amazon.com: Tag clouds on term frequencies
70
Alternative Semantic Layout

"Improving Tag-Clouds as Visual Information
Retrieval Interfaces", Yusef Hassan-Montero
and Víctor Herrero-Solana, InSciT2006
Tags grouped by similarity, based on clustering
techniques and co-occurrence analysis

71
I was puzzled by the questions

What are designers' and authors' intentions in
creating or using tag clouds?
How do they expect their readers to use them?

72
On the positive side

Compact
Draws the eye towards the most frequent
(important?) tags
You get three dimensions simultaneously!
alphabetical order
size indicating importance
the tags themselves

73
Weirdnesses

Initial encounters unencouraging
Some reports from industry
Is the computer broken?
Is this a ransom note?

74
Weirdnesses

Violates principles of perceptual design
Longer words grab more attention than shorter
Length of tag is conflated with its size
White space implies meaning when there is none
intended
Ascenders and descenders can also affect focus
Eye moves around erratically, no flow or guides
for visual focus
Proximity does not hold meaning
The paragraph-style layout makes it quite
arbitrary which terms are above, below, and
otherwise near which other terms
Position within paragraph has saliency effects
Visual comparisons difficult (see Tufte)

75
Weirdnesses

Meaningful associations are lost
Where are the different country names in this tag
cloud?

76
Weirdnesses

Which operating systems are mentioned?

77
Tag Cloud Study (1)

First part compared tag cloud layouts
Independent Variables
Tag size
Tag proximity to a large font
Tag quadrant position
Task: recall after a distractor task
13 participants; effects for size and quadrant
Second part compared tag clouds to lists
11 participants
Tested recognition (from a set of like words) and
impression formation
Alphabetical lists were best for the latter; no
differences for the former

"Getting our head in the clouds: Toward
evaluation studies of tagclouds", Walkyria
Rivadeneira, Daniel M. Gruen, Michael J. Muller,
David R. Millen, CHI 2007 note

78
Tag Cloud Study (2)

62 participants did a selection task
(find this country out of a list of 10 countries)
Independent Variables
Horizontal list
Horizontal list, alphabetical
Vertical list
Vertical list, alphabetical
Spatial tag cloud
Spatial tag cloud, alphabetical
Order for non-alphabetical not described
Alphabetical fastest in all cases, lists faster
than spatial
May have used poor clouds (some people couldn't
see larger font answers)

"An Assessment of Tag Presentation Techniques",
Martin Halvey, Mark Keane, poster at WWW 2007.

79
A Justifying Claim

You get three dimensions simultaneously!
alphabetical order
size indicating importance
the tags themselves
but is this really a conscious design decision?

80
Solution: Celebrity Interviews

I was really confused about tag clouds, so I
decided to ask the people behind the puffs
15 interviews, conducted at foocamp06
Several web 2.0 leaders
5 more interviews at Google and Berkeley

81
A Surprise

7 interviewees DID NOT REALIZE that alphabetical
ordering is standard.
2 of these people were in charge of such sites
but had had others write the code
What was the answer given to "what order are tags
shown in?"
hadn't thought about it
don't think about tag clouds that way
random order
ordered by semantic similarity
Suggests that perhaps people are too distracted
by the layout to use the alphabetical ordering

82
Suggested main purposes

To signal the presence of tags on the site
A good way to get the gist of the site
An inviting and fun way to get people interacting
with the site
To show what kinds of information are on the site
Some of these said they are good for navigation
Easy to implement

83
Tag Clouds as Self-Descriptions

Several noted that a tag cloud showing one's own
tags can be evocative
A good summary of what one is thinking and
reading about
Useful for self-reflection
Useful for showing others one's thoughts
One example: comparing someone else's tags to
one's own to see what you have in common, and
what special interests differentiate you
Useful for tracking changes in friends' lives
"Oh, a new girl's name has gotten larger; he must
have a new girlfriend!"

84
Tag Clouds as showing Trends

Several people used this term, that tag clouds
show trends in someone's behavior
Trends are usually patterns across time, which
are not inherently visible in tag clouds
To note a trend using a tag cloud, one must
remember what was there at an earlier time, and
what changed
tracking the girls' names example
This suggests a reason for the importance of the
large tags: they draw one's attention to what is big
now versus what used to be large.
Suggests also why it doesn't matter that you
can't see small tags.

85
New Perspective: Tag Clouds are Social!

It's not about the information!
Not surprising in retrospect; tagging is in large
part about the social aspect
Seems to work mainly when the tags can be seen by
many
Even better when items can be tagged by many and
seen by many
What does this mean though when tag clouds are
applied to non-social information?

86
Follow-up Study

Informed by the interview results, we searched for,
read, and coded web pages that mentioned tag
clouds.
Looked at about 140 discussions
Developed 21 codes
Looked at another 90 discussions
Used web queries "tag clouds", "usability tag
clouds", etc.
Sampled every 10th url
58 personal blogs
20 commercial blogs
10 commercial web pages
rest from group blogs and discussion lists
Doesn't tell us what people who don't write about
tag clouds think.

87
The Role of Popularity

Popularity in the sense that tag clouds (and
tagging) are trendy and popular.
Some people liked the visualization, but their
popularity made them less appealing
Famous post: "Tag clouds are the new mullets"
Led to self-consciousness about liking them
Many complained about unaesthetic cloud designs
Little consensus on whether they are a fad or have
staying power
Popularity also in the sense of the large font
size for more popular tags
Many people like the prominence of large tags,
but several commented on the tyranny of the
popular

88
The Role of Navigation

Opinions vary
Many simply state they are useful for navigation,
but with no support for this claim
Some claim the compactness makes navigation
easier than a vertical list
Some object to the effect of varying font size on
scannability
Others object to the lack of organization
Overall, there is no evidence either way that we
could find in the blog community

89
Aesthetic Considerations

Disagreement on the aesthetic and emotional
appeal, especially for lay users.
Those who like them find them fun and appealing
Those who don't find them messy, strange, like a
ransom note
Informal reports with first time users who are
not in the Web 2.0 community are negative

90
Trends again

As in the interviews, the benefit of trends was
mentioned many times.
There is another sense of trend as tendency or
inclination, and this might be what people mean.

91
Summary of Stated Reasons for Tag Clouds
(Note: some refuted by studies)
92
Tag Clouds as Social Information

An emphasis that tag clouds are meant to show
human behavior.
We found reports of people commenting on other
uses that were invalid because they did not
reflect live user input
One blogger noted the incongruity of an online
library using keyword frequencies in a tag cloud
rather than having it reflect patrons' usage of
the collection.
An online community noticed one site's cloud
didn't change over time and realized the sizes
were decided by marketing. This was greeted with
derision.

93
Implications

Assume tag clouds are meant to reflect human
mental activity (individual or group)
Then what might seem design flaws from an
information conveyance perspective may not be
A large part of the appeal is the fun and
liveliness.
The informality of the layout reflects the human
activity beneath it.

94
Judith Donath, CACM 45(4), 2002

"Traditional data visualization focuses on
making abstract numbers and relationships into
concrete, spatialized images; the goal is to
highlight important patterns while also
representing the data accurately. This is a fine
approach for social scientists studying the
dynamics of online interactions. Yet for our
purpose it is also important that the
visualization evoke an appropriate intuitive
response, representing the feel of the
conversation as well as depicting its dynamics."

95
Judith Donath, CACM 45(4), 2002

"One argument for deliberately designing
evocative visualizations for online social
environments is the existing default textual
interfaces are themselves evocative, they simply
evoke an aura of business-like monotony rather
than the lively social scene that actually
exists."

96
Tag Cloud Alternatives

Provided by Martin Wattenberg

97
Conclusions

Social tagging is, in my view, a terrific way to
get good content metadata.
I think automated techniques can do a lot to help
clean them up and organize them.
They are an inherently social phenomenon, part of
social media, which is a really exciting area.
The socialness of social media can yield
surprises, like tag clouds.
