Title: KNOWLEDGE ORGANIZATION: bringing order to the information universe
1KNOWLEDGE ORGANIZATION: bringing order to the
information universe
2FRAN ALEXANDER
- Library Assistant
- Reference Editor
- Editorial Director at Keesings
- MRes at UCL (DIS UCLIC)
- Taxonomy Manager at BBC
All views expressed here are entirely my own
personal views and in no way represent the BBC or
official BBC policy.
3STRUCTURE OF THE AFTERNOON
- purpose of Knowledge Organisation (KO)
- some formal KO methods
- group exercise sorting out objects
- systematic structures
- break
- how do we go about making KO structures?
- complex subjects and citation ordering
- group exercise creating a classification
- representation and labelling
4KNOWLEDGE ORGANISATION IS NOTHING NEW
- Hittite and Sumerian catalogues and lists
- hypertext links in The Talmud
- libraries in ancient Greece and Rome
- Renaissance quests for Memory Palaces and Universal Languages
- Linnaean taxonomy
- Otlet's Mundaneum and search engine service
- Ranganathan to Rosenfeld
- Internet of Things, AI and onwards
6WHY ORGANIZE?
- have you ever tried to find files on someone else's desktop?
- even if you only have a few hundred files, finding them again can take ages
- media archives have millions of files
- footage/recordings/documents that can't be found have no value
- free text search only takes you so far (scissors and scalpels)
7WHY NOT JUST USE GOOGLE?
- synonyms and misspellings
- disambiguation - which Titanic? Budget and Spain?
- imperfect prior knowledge - trying to learn about a topic
- aboutness - meaning beyond the words
- comprehensiveness
- audio-visual assets
- 2 billion users of the Internet
8WHAT IS THIS FOOTAGE OF?
- Nothing will ever surpass the first flight I
made on 6th October 1941. Dressed for the first
time in overalls, helmet and goggles, I sat in
the rear seat and bumped across the grass until
the aircraft suddenly stopped bumping and we had
left the ground behind. 35 minutes of ecstasy
until we touched down.
9PROFESSIONAL KNOWLEDGE ORGANIZATION
- a core function of the information professional
- to avoid chaos!
- how many published items?
- USA (2008) 288,000, UK (2009) 133,000 (up from a total in USA and UK in 2005 of 378,000)
- how many websites?
- 156 million (2008), 266 million (December 2010)
- to present resources in an orderly and predictable manner
- to enable access to specific content
- to aid retrieval of specific items
- to support exchange of information through the use of standard formats
10HOW DO USERS LOOK FOR INFORMATION?
- Retrieval function of KO
- users may search for specific items - known item retrieval
- they may search for items characterized by some particular feature - books by a certain author, document forms, etc.
- they may look for specific information
- Browsing function of KO
- they may want to see what is available
- they may not know what terms to use
11HOW DOES KNOWLEDGE ORGANIZATION SUPPORT THESE TWO
APPROACHES?
- the processes of enabling access to knowledge
- labelling resources
- classification
- indexing
- tagging
- building vocabularies
- creating formal records to represent resources
- cataloguing
- bibliographic description
- metadata schemes
- creating systematic structures to hold information
- classifications, taxonomies, concept and topic maps, ontologies
12LABELLING RESOURCES
- adding information to a resource about its subject content
- classification
- classification schemes and codes
- subject cataloguing
- subject heading lists
- indexing
- controlled vocabularies, thesauri, keyword lists
- metadata schema
- tagging
- usually uncontrolled
13Group exercise categorising objects
14SYSTEMATIC STRUCTURES FOR THE ORDERING OF
KNOWLEDGE
- sometimes there is a need to present information in a structured way
- physical organization: materials in a physical collection
- listing: presentation of items such as a subject bibliography or index
- display: browsing interface of a digital collection
16CREATING FORMAL RECORDS TO REPRESENT ITEMS
- listing characteristics of an item that represent it
- what's it called? name, title
- who created it? author, creator
- who published it? publisher (commercial, institutional, personal)
- when and where? place of publication, web address
- what's it about? subject descriptors, classification codes
- physical attributes? size, dimensions, file type, references, illustrations
- representing these as fields in a database or equivalent structure
- using rules to ensure conformity of entries
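The idea of a formal record with fields and conformity rules can be sketched in a few lines of Python. This is only an illustration: the class name `ResourceRecord`, the field names, the sample values and the three rules (required title, four-digit year, normalised subject descriptors) are all assumptions made for the example, not part of any cataloguing standard.

```python
from dataclasses import dataclass, field


@dataclass
class ResourceRecord:
    """One formal record; fields mirror the questions asked about an item."""
    title: str       # what's it called?
    creator: str     # who created it?
    publisher: str   # who published it?
    date: str        # when? (assumed rule: four-digit year)
    location: str    # where? place of publication or web address
    subjects: list = field(default_factory=list)  # what's it about?
    file_type: str = ""                           # physical attributes

    def __post_init__(self):
        # Assumed rule 1: trim stray whitespace and require a title.
        self.title = self.title.strip()
        self.creator = self.creator.strip()
        if not self.title:
            raise ValueError("title is required")
        # Assumed rule 2: the date must be a four-digit year.
        if not (len(self.date) == 4 and self.date.isdigit()):
            raise ValueError("date must be a four-digit year")
        # Assumed rule 3: descriptors are lower-cased, de-duplicated, sorted.
        self.subjects = sorted({s.strip().lower() for s in self.subjects})


# Hypothetical sample entry; the rules tidy it up on creation.
record = ResourceRecord(
    title="  An Example Title ",
    creator="Example, Author",
    publisher="Example Press",
    date="2010",
    location="London",
    subjects=["Taxonomies", "classification", "taxonomies"],
)
```

Because the rules run every time a record is created, two cataloguers entering the same item in slightly different ways should end up with the same record.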
21SYSTEMATIC STRUCTURES FOR THE ORDERING OF
KNOWLEDGE
- it will be necessary to group items according to subject
- this is often described as classification or categorization
- the structure can be linear (as in a classification)
- the structure can be two-dimensional (as in a concept map)
- hypertext can be used to represent different levels of a hierarchy (as in taxonomies)
25HOW DO WE MAKE A KO STRUCTURE?
- don't muddle the design of the interface with the structure of the information
- data must be well structured to support browsing and retrieval
- sequence of topics must be logical
- relationships between topics must be clear
- structure must be understandable and predictable
26TOP-DOWN AND BOTTOM-UP CLASSIFICATIONS
- traditionally classifications were made by repeated subdivision of classes into smaller and smaller units
- this tends to create rather rigid and abstract classifications
- modern methods tend to work by clustering or grouping concepts to form classes
- this method creates more flexible systems, more closely related to reality
27TREE STRUCTURES
28CLUSTERING STRUCTURES
30SORTING AND GROUPING
- this is the first stage in organizing a collection of objects or concepts
- different attributes may be used as the basis of the classification
- a whole variety of different (but quite valid) classifications can be made by varying the criteria for arrangement
- this explains why classifications almost always show cultural bias of some kind
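The point that different attributes give different, equally valid groupings can be shown with a small Python sketch. The four instruments come from this session; the attribute names `family` and `pitch_control` and the helper `group_by` are assumptions chosen for the illustration.

```python
from collections import defaultdict

# Each object is described by more than one attribute (illustrative values).
instruments = [
    {"name": "trumpet", "family": "brass", "pitch_control": "valves"},
    {"name": "violin", "family": "string", "pitch_control": "fingerboard"},
    {"name": "French horn", "family": "brass", "pitch_control": "valves"},
    {"name": "trombone", "family": "brass", "pitch_control": "slide"},
]


def group_by(items, attribute):
    """Sort items into groups according to one chosen attribute."""
    groups = defaultdict(list)
    for item in items:
        groups[item[attribute]].append(item["name"])
    return dict(groups)


# Two criteria, two different but equally valid classifications.
by_family = group_by(instruments, "family")
by_pitch_control = group_by(instruments, "pitch_control")
```

Grouping by `family` isolates the violin, while grouping by `pitch_control` isolates both the violin and the trombone, so which item looks "odd" depends on the criterion chosen.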
31Can you name the odd one out?
- trumpet
- violin
- French horn
- trombone
33- London
- China
- Brazil
- France
35- French
- Spanish
- Hebrew
- English
37ENTITY CLASSIFICATIONS
- used in scientific classifications
- used to arrange objects or entities themselves
- each entity has a unique place in the classification
- typical examples are
- chemical elements
- minerals
- astronomical bodies
- biological organisms
40ASPECT CLASSIFICATIONS
- bibliographic or documentary classifications arrange first by subject field
- objects or entities are scattered
- a rabbit could appear in many different places in a library classification
- zoology
- cookery
- agriculture
- fashion
- pet keeping
- conjuring
- medical research
- mythology
- the rabbit is a distributed relative
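One common way to bring a distributed relative back together is an index from the entity to every subject field where it appears. The sketch below is a minimal illustration of that idea; the function name `build_relative_index` is hypothetical, only some of the rabbit placements listed above are used, and the dog placements are assumptions added to give the index a second entry.

```python
from collections import defaultdict

# (subject field, entity) pairs as they would sit in an aspect classification.
placements = [
    ("zoology", "rabbit"),
    ("cookery", "rabbit"),
    ("agriculture", "rabbit"),
    ("pet keeping", "rabbit"),
    ("zoology", "dog"),       # assumed placement, for illustration only
    ("pet keeping", "dog"),   # assumed placement, for illustration only
]


def build_relative_index(pairs):
    """Map each entity to the sorted list of subject fields that hold it."""
    index = defaultdict(list)
    for subject_field, entity in pairs:
        index[entity].append(subject_field)
    return {entity: sorted(fields) for entity, fields in index.items()}


relative_index = build_relative_index(placements)
rabbit_locations = relative_index["rabbit"]
```

The classification itself stays arranged by subject field; the index simply tells a user looking for "rabbit" where all the scattered material is.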
45Break
46COMPLEXITY IN SUBJECT MATTER
- life would be easier if subjects were simple
- the subjects of most documents or resources are very complicated
- decisions must be made about the location of a complex topic
- this involves giving priority to some attributes over others
- this should be done
- consistently
- with the needs of the users in mind
- there are lots of examples of this problem in the
real world, not just in the information world
47- dog
- cow
- potato
- rabbit
- Michelangelo
- Titian
- Dante
- Van Gogh
51CITATION ORDER
- the order in which the various constituent elements of a subject are listed when arranging, classifying, or indexing documents
- citation order should be consistent and predictable so that things can be retrieved
- in a taxonomy, the choice of citation order will determine which are the top levels of the taxonomy
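The effect of citation order on the top levels of a structure can be sketched as follows. This is an illustrative example only: the facet names (`subject`, `place`, `form`), the sample resources and the `classify` helper are all assumptions, not a method prescribed in this session.

```python
def classify(resources, citation_order):
    """Nest resources into a tree whose levels follow the citation order."""
    tree = {}
    for resource in resources:
        node = tree
        for facet in citation_order:
            # Descend one level per facet, creating the class if needed.
            node = node.setdefault(resource[facet], {})
        node.setdefault("_items", []).append(resource["title"])
    return tree


# Hypothetical resources, each described by the same three facets.
resources = [
    {"title": "A", "subject": "music", "place": "France", "form": "book"},
    {"title": "B", "subject": "music", "place": "Brazil", "form": "DVD"},
    {"title": "C", "subject": "painting", "place": "France", "form": "book"},
]

# Same resources, two citation orders, two different sets of top levels.
by_subject = classify(resources, ["subject", "place", "form"])
by_form = classify(resources, ["form", "subject", "place"])
```

With subject cited first the top levels are music and painting; with form cited first they are book and DVD, and material on music is split between the two.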
53Examples of citation order in physical
arrangement
- Tesco arrangement is by storage/preservation
- fresh food
- tinned food
- chilled food
- frozen food
- W. H. Smith arrangement is by form
- newspapers
- magazines
- books
- videos
- DVDs/CD-ROMs
54Group exercise
- Creating classifications: card sorting technique
55DOCUMENT REPRESENTATION
- indexing terms or descriptors
- informal tagging
- a formal record of the document or resource,
e.g. a bibliographic record, metadata schemes, a
database structure with separate fields for the
different elements
56FORMAL SYSTEMS OF REPRESENTATION
- the last century has seen the development of a number of standard formats for records, and, more recently, for metadata
- these enable interoperability and the exchange of information
- we're concerned here primarily with the structure of content, and not the exchange formats per se
57INTERNATIONAL AGREEMENTS
- most countries had their own standards for records
- in 1950 UNESCO held an international conference aimed at achieving a universal record of information output
- also aimed to standardize bibliographic document formats
- Anglo-American Cataloguing Rules and the MARC format
- potential uniformity at a global level
58Catalogue record showing MARC21 tags
59MARC tags indicate fields in foreign language
record
60MARC tags indicate fields in foreign language
record
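Why numeric tags make a foreign language record readable can be sketched in Python. The tag meanings below are standard MARC 21 bibliographic fields, but the record itself is invented, and real MARC records also carry indicators and subfield codes that this simplified sketch leaves out.

```python
# A few MARC 21 bibliographic tags and what they identify.
MARC_LABELS = {
    "100": "Main entry, personal name",
    "245": "Title statement",
    "260": "Publication, distribution, etc.",
    "300": "Physical description",
    "650": "Subject added entry, topical term",
}

# Invented record with placeholder values in an unspecified language.
foreign_record = [
    ("100", "<author name>"),
    ("245", "<title in the original language>"),
    ("260", "<place, publisher, year>"),
    ("650", "<subject heading>"),
]


def label_fields(fields):
    """Pair each field value with the meaning of its tag."""
    return [
        (MARC_LABELS.get(tag, "Unlabelled field"), value)
        for tag, value in fields
    ]


labelled = label_fields(foreign_record)
```

Even if none of the values can be read, the tags still show which string is the author, which is the title and which is the subject.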
62METADATA STANDARDS
- in the late 20th century similar initiatives were put in place for metadata for digital resources
- best known is the Dublin Core, developed by OCLC in Dublin, Ohio
- DC close in format and coverage to the MARC standard
- many metadata standards have been created to cover other aspects of digital resources
- Semantic Web and Linked Data are underpinned by metadata standards
63METADATA UNIVERSE
64Metadata embedded in coding for website
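As a rough illustration of metadata embedded in a web page, the sketch below generates Dublin Core `meta` elements for the head of an HTML document. It follows the widely used `DC.` prefix convention with a `schema.DC` link; the function name `dublin_core_meta` and the sample values are assumptions made for the example.

```python
from html import escape

# The fifteen elements of the Dublin Core Metadata Element Set.
DC_ELEMENTS = {
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
}


def dublin_core_meta(record):
    """Render a dict of Dublin Core elements as HTML head markup."""
    lines = ['<link rel="schema.DC" href="http://purl.org/dc/elements/1.1/">']
    for element, value in record.items():
        if element not in DC_ELEMENTS:
            raise ValueError(f"not a Dublin Core element: {element}")
        # Escape the value so quotes or angle brackets cannot break the tag.
        lines.append(f'<meta name="DC.{element}" content="{escape(value)}">')
    return "\n".join(lines)


# Sample values chosen for illustration.
page = {
    "title": "Knowledge Organization",
    "creator": "Fran Alexander",
    "language": "en",
}
html_head = dublin_core_meta(page)
```

The output is plain text that would be pasted into, or templated into, the `head` section of a page.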
65SUBJECTIVE AND OBJECTIVE METADATA
- none of these standards prescribes how the subject of a resource should be specified
- this is mainly because subject cannot be inferred automatically from the resource itself
- subject evaluation is a very subjective process and presents some specific problems
- a separate subject indexing tool must be used
- this usually takes the form of a controlled vocabulary
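A controlled vocabulary can be sketched as a table that maps the words people actually type (synonyms, misspellings, ambiguous names) to preferred terms. The example below is a toy illustration: the terms, the variants and the `lookup` helper are all assumptions, and a real thesaurus would also record broader, narrower and related terms.

```python
# Preferred term -> entry terms (synonyms, variants, common misspellings).
VOCABULARY = {
    "Titanic (ship)": ["titanic", "rms titanic", "titantic"],
    "Titanic (film)": ["titanic", "titanic movie"],
    "Aircraft": ["aeroplane", "airplane", "plane", "aircraft"],
}


def build_entry_index(vocabulary):
    """Map every entry term to the preferred terms it may stand for."""
    index = {}
    for preferred, variants in vocabulary.items():
        for variant in set(variants) | {preferred}:
            index.setdefault(variant.lower(), set()).add(preferred)
    return index


ENTRY_INDEX = build_entry_index(VOCABULARY)


def lookup(term):
    """Return the preferred terms for a user's term, or an empty list."""
    return sorted(ENTRY_INDEX.get(term.strip().lower(), set()))


misspelt = lookup("Titantic")
ambiguous = lookup("titanic")
synonym = lookup("plane")
```

A single result means the term can be assigned directly; more than one result means the indexer or searcher has to disambiguate; an empty result means the term is not yet in the vocabulary.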
66SUMMARY
- knowledge organization matters
- as digital content explodes, the need for KO increases
- natural language usually too fuzzy for effective retrieval
- many systems and methods of controlling vocabulary used, various strengths and weaknesses, many viewpoints
67Image credits
- Labels http://www.flickr.com/photos/mrsmagic/5870198525/
- Electrolux Design Lab - The robotic kitchen assistant http://www.flickr.com/photos/electrolux-design-lab/3811899536/sizes/l/in/photostream/
- Hieroglyphs at the Karnak temple http://www.flickr.com/photos/tamburix/2900907735/
- Metadata Universe http://www.dlib.indiana.edu/jenlrile/metadatamap/