Labels - PowerPoint PPT Presentation

About This Presentation
Title:

Labels


Slides: 51
Provided by: sjf
Tags: labels | zappos


Transcript and Presenter's Notes


1
Labels
  • the practice and science of classification

2
Classification
  • Classification provides a system for organizing
    knowledge.

3
Classification Types
  • Controlled Vocabulary
  • Classification
  • Taxonomy
  • Thesaurus
  • Ontology

4
Commonalities of Controlled Vocabulary,
Classification, Taxonomy, Thesaurus, Ontology
  • Standardized naming conventions
  • Highly structured
  • Generally hierarchical
  • Highly specific application
  • Highly specific audience
  • Prescriptive, rather than descriptive
  • Low adaptability
  • Enables more efficient indexing and searching

5
Controlled Vocabulary
  • List of terms that have been enumerated
    explicitly.
  • Controlled by, and available from, a controlled
    vocabulary registration authority.
  • All terms in a controlled vocabulary should have
    an unambiguous, non-redundant definition.

6
Purpose of Controlled Vocabulary
  • Translation: Provide a means for converting the
    natural language of authors, indexers, and users
    into a vocabulary that can be used for indexing
    and retrieval.
  • Consistency: Promote uniformity in term format
    and in the assignment of terms.
  • Indication of relationships: Indicate semantic
    relationships among terms.
  • Label and browse: Provide consistent and clear
    hierarchies in a navigation system to help users
    locate desired content objects.
  • Retrieval: Serve as a searching aid in locating
    content objects.

7
Control Rules
  • Synonyms (two words with the same meaning, like
    jeans and dungarees)
  • Homonyms (words that sound the same, but have
    different meanings, like bank the financial
    institution and bank the side of a stream or
    river)
  • Common misspellings

8
Control Rules
  • Changes in content (e.g., countries that change
    their name or have multiple spellings)
  • Identifying Best Bets (e.g., connecting a woman's
    married name to her maiden name)
  • Connecting abbreviations to the full word (e.g.,
    NY and New York, the chemical symbol Si with the
    element Silicon)
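The control rules on these two slides can be sketched as a lookup from variant terms to one preferred term. This is only a minimal Python illustration: the jeans/dungarees, NY/New York, Si/Silicon and bank examples come from the slides, while the misspelling entry, the context labels and the function name `normalize` are assumptions invented for the sketch.

```python
# Hypothetical variant-to-preferred-term table. The synonym, abbreviation
# and chemical-symbol pairs come from the slides; the misspelling entry
# is invented for illustration.
VARIANTS = {
    "dungarees": "jeans",        # synonym
    "dungerees": "jeans",        # common misspelling (assumed example)
    "ny": "new york",            # abbreviation
    "si": "silicon",             # chemical symbol
}

# Homonyms cannot be resolved by lookup alone, so each sense gets a
# qualified preferred term and the caller must supply a context label.
HOMONYMS = {
    "bank": {
        "finance": "bank (financial institution)",
        "geography": "bank (side of a river)",
    },
}


def normalize(term, context=None):
    """Map a user's term to the preferred term of the vocabulary."""
    key = term.strip().lower()
    if key in HOMONYMS:
        senses = HOMONYMS[key]
        if context not in senses:
            raise ValueError(f"ambiguous term {term!r}; context needed")
        return senses[context]
    # Terms with no recorded variant are passed through unchanged.
    return VARIANTS.get(key, key)


print(normalize("Dungarees"))
print(normalize("NY"))
print(normalize("bank", context="finance"))
```

Keeping homonyms in a separate table makes the ambiguity explicit: a bare "bank" is rejected rather than silently mapped to one sense.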

9
(No Transcript)
10
Classification Types
  • Controlled Vocabulary
  • Classification
  • Taxonomy
  • Thesaurus
  • Ontology

11
Classifying a work with the DDC
  • Requires determination of:
  • The subject
  • The disciplinary focus
  • The approach or form (if applicable)

12
Dewey Decimal Classification - 1876
  • 000 Computers, information & general reference
  • 100 Philosophy & psychology
  • 200 Religion
  • 300 Social sciences
  • 400 Language
  • 500 Science
  • 600 Technology
  • 700 Arts & recreation
  • 800 Literature
  • 900 History & geography

13
Notational hierarchy
600 Technology (Applied sciences)
630 Agriculture and related technologies
636 Animal husbandry
636.7 Dogs
636.8 Cats
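In a notational hierarchy, the broader classes of a number can be read off its digits. The following Python sketch shows the idea for the five numbers on this slide. The function name `broader_classes` and the lookup table are assumptions for illustration, and real DDC schedules contain exceptions that this simple prefix rule does not cover.

```python
# Captions copied from the slide; the table is sample data only.
DDC = {
    "600": "Technology (Applied sciences)",
    "630": "Agriculture and related technologies",
    "636": "Animal husbandry",
    "636.7": "Dogs",
    "636.8": "Cats",
}


def broader_classes(notation):
    """Return the chain of class numbers from the main class down to notation."""
    digits = notation.replace(".", "")
    chain = []
    for length in range(1, len(digits) + 1):
        prefix = digits[:length]
        if length <= 3:
            # Pad to three digits: "6" -> "600", "63" -> "630".
            number = prefix.ljust(3, "0")
        else:
            # Digits beyond the third follow the decimal point.
            number = prefix[:3] + "." + prefix[3:]
        if number not in chain:
            chain.append(number)
    return chain


for number in broader_classes("636.7"):
    print(number, DDC.get(number, "(caption not in sample table)"))
```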
14
Table of Last Resort (tie breaker)
  • Kinds of things
  • Parts of things
  • Materials from which things, kinds, or parts are
    made
  • Properties of things, kinds, parts, or materials
  • Processes within things, kinds, parts, or
    materials
  • Operations upon things, kinds, parts, or
    materials
  • Instrumentalities for performing such operations

15
Book Industry Study Group Subject Headings
http://www.bisg.org/standards/bisac_subject/major_subjects.html
16
Taxonomy
  • Word comes from the Greek τάξις, taxis, 'order'
    and νόμος, nomos, 'law' or 'science'.
  • Collection of controlled vocabulary terms
    organized into a hierarchical structure

17
Taxonomy Classification
  • Biological classification has two basic
    objectives:
  • To serve as a basis for generalization in
    comparative studies.
  • To serve as an information storage system.

18
Linnaean System of classification
19
Search Dilemma
Peter Morville
20
Example
  • Taxonomy Warehouse

21
Classification Types
  • Controlled Vocabulary
  • Classification
  • Taxonomy
  • Thesaurus
  • Ontology

22
Thesaurus
  • Networked collection of controlled vocabulary
    terms.
  • Uses associative relationships in addition to
    parent-child relationships.

23
Thesaurus Provides Variant Terms
24
http://www.visualthesaurus.com/
25
Classification Types
  • Controlled Vocabulary
  • Classification
  • Taxonomy
  • Thesaurus
  • Ontology

26
Z39.50
  • Client-server protocol for searching and
    retrieving information from remote computer
    databases.
  • Guidelines for the Construction, Format, and
    Management of Monolingual Controlled Vocabularies
  • ANSI/NISO Z39.19-2005

http://www.niso.org/standards/resources/Z39-19-2005.pdf
27
Extend taxonomies to be more descriptive
  • Thesaurus: BT/NT, USE/UF, SN and RT
  • ISO standard 2788 - Properties
  • BT: Broader term (one level up in the hierarchy)
  • NT: Narrower term (inverse of BT)
  • SN: Scope note (explanation of the meaning of the
    term)
  • RT: Related term (neither a synonym nor BT/NT;
    "see also")
  • USE: Another term is preferred (synonym); inverse
    of UF
  • These provide a much richer vocabulary for
    describing the terms than taxonomies do.
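One way to picture these properties is as fields on a term record, with the inverse relationships (BT/NT, USE/UF) written together so they cannot drift apart. The Python sketch below is an assumption-laden illustration, not an implementation of ISO 2788: the class and method names are invented, and apart from jeans/dungarees the sample terms are made up.

```python
from dataclasses import dataclass, field


@dataclass
class Term:
    name: str
    scope_note: str = ""                          # SN
    broader: set = field(default_factory=set)     # BT
    narrower: set = field(default_factory=set)    # NT
    related: set = field(default_factory=set)     # RT
    used_for: set = field(default_factory=set)    # UF


class Thesaurus:
    def __init__(self):
        self.terms = {}   # preferred terms, keyed by name
        self.use = {}     # non-preferred term -> preferred term (USE)

    def term(self, name):
        """Return the record for a preferred term, creating it if needed."""
        return self.terms.setdefault(name, Term(name))

    def add_bt(self, narrow, broad):
        """Record BT and its inverse NT in one step."""
        self.term(narrow).broader.add(broad)
        self.term(broad).narrower.add(narrow)

    def add_rt(self, a, b):
        """RT is symmetric, so store it on both terms."""
        self.term(a).related.add(b)
        self.term(b).related.add(a)

    def add_use(self, variant, preferred):
        """Record USE and its inverse UF in one step."""
        self.use[variant] = preferred
        self.term(preferred).used_for.add(variant)

    def lookup(self, name):
        """Follow a USE reference, if any; return None for unknown terms."""
        return self.terms.get(self.use.get(name, name))


t = Thesaurus()
t.add_use("dungarees", "jeans")   # pair taken from the Control Rules slide
t.add_bt("jeans", "trousers")     # assumed sample relationship
t.add_rt("jeans", "denim")        # assumed sample relationship
print(t.lookup("dungarees"))
```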

28
Z39.50
  • It is covered by ANSI/NISO standard Z39.50, and
    ISO standard 23950.
  • (ANSI) American National Standards Institute
  • (ISO) International Organization for
    Standardization
  • (NISO) National Information Standards
    Organization
  • The standard's maintenance agency is the Library
    of Congress.

29
5.2.2 Content Objects
  • There are two classes of content objects, primary
    and secondary, although this distinction is
    rarely made.
  • A primary content object is the item itself,
    whether it exists in physical form (e.g. print,
    audiotape, DVD, movie) or exists solely in
    electronic form.
  • A secondary content object is the metadata that
    describes the primary content object.
  • Many data stores combine the primary content
    object and its metadata into a single, hybrid
    content object.

30
(No Transcript)
31
9.3.1 Alphabetical Displays
  • An alphabetical listing is the most basic type of
    vocabulary display. It may contain both preferred
    terms and entry terms with their respective USE
    and USED FOR references.

32
9.3.1.2 Flat Format Displays
  • Most commonly used controlled vocabulary display
    format.
  • All terms arranged in alphabetical order,
    including their term details, and one level of
    BT/NT hierarchy.
  • (BT/NT) Broader Term / Narrower Term
  • (SN) Scope Note
  • (UF) Used For
  • (RT) Related Term
33
9.3.4.1.1 Multilevel Broader and Narrower Terms
Hierarchical Displays
  • In a multilevel hierarchical display format, all
    levels of the broader and narrower terms related
    to a given term are immediately visible.
  • This is in contrast with the flat format
    described above, in which only one level of
    broader or narrower terms is displayed and the
    user is required to navigate from term to term
    one level at a time to discover the full
    hierarchy.
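The difference between the two display formats can be sketched in a few lines of Python. The narrower-term table reuses captions from the notational-hierarchy slide purely as sample data, and both function names are assumptions made for this illustration.

```python
# Narrower-term (NT) links; sample data only.
NARROWER = {
    "Technology": ["Agriculture"],
    "Agriculture": ["Animal husbandry"],
    "Animal husbandry": ["Cats", "Dogs"],
}


def flat_display(term):
    """Flat format: the term plus one level of narrower terms."""
    lines = [term]
    for child in sorted(NARROWER.get(term, [])):
        lines.append(f"  NT {child}")
    return lines


def multilevel_display(term, depth=0):
    """Multilevel format: every level of narrower terms, indented by depth."""
    lines = ["  " * depth + term]
    for child in sorted(NARROWER.get(term, [])):
        lines.extend(multilevel_display(child, depth + 1))
    return lines


print("\n".join(flat_display("Technology")))
print("\n".join(multilevel_display("Technology")))
```

With the flat format a user starting at "Technology" sees only "Agriculture" and has to look up each term in turn; the multilevel format shows the whole chain down to "Cats" and "Dogs" at once.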

34
Multilevel Hierarchy using Tree Structure
35
Ontology
  • Ontology is a controlled vocabulary expressed in
    an ontology representation language.
  • Includes a grammar for using vocabulary terms
    within a domain of interest.
  • The grammar contains formal constraints.
  • Rules define what constitutes a well-formed
    statement, assertion, or query.
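As a rough illustration of a grammar with formal constraints, the Python sketch below accepts a statement only if its property is declared and its subject and object belong to the required classes. Every name in it is hypothetical, and real ontology representation languages such as OWL are far richer than this.

```python
# Hypothetical mini-ontology: instances, their classes, and properties
# with domain and range constraints. None of these names come from the
# slides.
INSTANCE_OF = {"Rex": "Dog", "Alice": "Person"}
PROPERTIES = {
    # property: (required class of subject, required class of object)
    "owns": ("Person", "Dog"),
}


def is_well_formed(subject, prop, obj):
    """Check a (subject, property, object) statement against the constraints."""
    if prop not in PROPERTIES:
        return False
    domain, range_ = PROPERTIES[prop]
    return INSTANCE_OF.get(subject) == domain and INSTANCE_OF.get(obj) == range_


print(is_well_formed("Alice", "owns", "Rex"))
print(is_well_formed("Rex", "owns", "Alice"))
```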

36
Commonalities of Controlled Vocabulary,
Classification, Taxonomy, Thesaurus, Ontology
  • Standardized naming conventions
  • Highly structured
  • Generally hierarchical
  • Highly specific application
  • Highly specific audience
  • Prescriptive, rather than descriptive
  • Low adaptability
  • Enables more efficient indexing and searching

37
http://magia3e.wordpress.com/2007/02/04/information-classification-part-ii-taxonomy/
38
Creating a Controlled Vocabulary
  • What is this content object?
  • How can I describe it?
  • What distinguishes it from other content objects?
  • How can I make it findable?

39
Creating a Top-Down Classification
  • Define the site's target audience.
  • List all topics, actions, concepts, theories,
    functions, roles, and other aspects of the site's
    projected content.
  • Categorize the items according to
    functional-topical, verb or noun, action or actor,
    entity or relationship splits.
  • Analyze the categories from a user perspective
    and build a tentative structure.
  • Develop a unifying metaphor to incorporate into
    the design, expressing the relationships in a
    holistic, symbolic way.
  • User-test the site structure and labeling.

Louise Gruenberg
40
Consideration: Specificity
http://www.zappos.com/welcome.zhtml
41
Consideration: Stability
1998, 2000, 2002, 2004, 2006, 2007
42
Faceted System: Bottom-up Approach
  • Focuses on the important, essential or persistent
    characteristics of content objects.
  • Useful for fine-grained, rapidly changing
    repositories.
  • Easy to add a new facet at any time.
  • Example: a facet map file,
    http://facetmap.com/conf/wine.txt
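A faceted system can be sketched as items that each carry a value per facet, plus a filter that combines any facets the user picks. The Python example below is a hedged illustration only: the catalogue, facet names and function names are invented and do not reproduce the facet map file mentioned in the last bullet.

```python
# Hypothetical catalogue; facet names and values are invented.
WINES = [
    {"name": "Wine A", "color": "red", "region": "France", "price": "under $20"},
    {"name": "Wine B", "color": "white", "region": "France", "price": "over $20"},
    {"name": "Wine C", "color": "red", "region": "Chile", "price": "under $20"},
]


def select(items, **facets):
    """Keep the items whose values match every requested facet."""
    return [
        item for item in items
        if all(item.get(facet) == value for facet, value in facets.items())
    ]


def facet_counts(items, facet):
    """Count the items under each value of one facet, e.g. for a browse menu."""
    counts = {}
    for item in items:
        value = item.get(facet)
        if value is not None:
            counts[value] = counts.get(value, 0) + 1
    return counts


print([w["name"] for w in select(WINES, color="red", region="France")])
print(facet_counts(WINES, "color"))

# Adding a new facet later needs no restructuring: tag the items that
# have it and start filtering on it.
WINES[0]["vintage"] = 2005
print([w["name"] for w in select(WINES, vintage=2005)])
```

Because each facet is just another key on the item, nothing has to be reorganized when a facet is added, which is the property the third bullet points to.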

43
Other Considerations
  • Implicit information?
  • Classification of objects over fuzzy boundaries.

44
Faceted Classification
  • Bottom-Up approach
  • Central concept: How do I describe this?
  • Oct 2003: 69% of sites made at least some use of
    faceted classification.

45
Faceted Classification is not new!
  • S. R. Ranganathan (1892-1972)
  • Facets: clearly defined, mutually exclusive, and
    collectively exhaustive aspects, properties, or
    characteristics of a class or specific subject.
  • Describing documents from various perspectives
  • Prolegomena to Library Classification (1967):
    definitive resource on faceted classification

46
Potential Facets
  • Topic
  • Product
  • Document Type / Format?
  • technical report
  • white paper
  • news article
  • Source / Creator?
  • Intended Audience
  • Geography
  • Price

Peter Morville
47
Usability is Key
  • Don't have to predict what facets the users will
    find most intuitive.
  • MUST create an intuitive interface, so that the
    user, with a minimum amount of effort, can use the
    facets to search the site.

Kathryn La Barre
48
Principles for Choice of Facets
  • Principle of Differentiation
  • Principle of Relevance
  • Principle of Ascertainability
  • Principle of Permanence
  • Principle of Homogeneity
  • Principle of Mutual Exclusivity
  • Principle of Fundamental Categories

Louise Spiteri
49
Creating a Faceted Classification
  • Content analysis
  • Gather a representative sample of the site's
    content.
  • Adopt a Noah's Ark approach: try to capture a
    couple of each type of animal.

Peter Morville
50
Resources
  • Introduction to Dewey Decimal Classification
    system from OCLC
  • BISAC Subject Headings 2006 Edition - September
    2006
  • Everything is Miscellaneous blog by David
    Weinberger
  • What are the differences between a vocabulary, a
    taxonomy, a thesaurus, an ontology, and a
    meta-model? by Woody Pidcock
  • Building a Synonymous Search Index by Peter
    Morville
  • Creating a Controlled Vocabulary by Leise, Fast
    and Steckel
  • Guidelines for the Construction, Format, and
    Management of Monolingual Controlled Vocabularies,
    Z39.19 standard document (PDF) from NISO
  • A Simplified Model for Facet Analysis by Dr.
    Louise Spiteri