Title: Taxonomy Development: An Infrastructure Model
1 Taxonomy Development: An Infrastructure Model
- Tom Reamy, Chief Knowledge Architect
- KAPS Group
- Knowledge Architecture Professional Services
- http://www.kapsgroup.com
2 Agenda
- Introduction
- Types of Taxonomies
- The Enterprise Context
- Making the Business Case
- Infrastructure Model of Taxonomy Development
- Taxonomy in 4 Contexts
- Content, People, Processes, Technology
- Infrastructure Solutions: the Elements
- Applying the Model: Practical Dimension
- Starting and Resources
- Conclusion
3 KAPS Group
- Knowledge Architecture Professional Services (KAPS)
- Consulting, strategy recommendations
- Knowledge architecture audits
- Partners: Convera, Inxight, FAST, and others
- Taxonomies: Enterprise, Marketing, Insurance, etc.
- Taxonomy customization
- Intellectual infrastructure for organizations
- Knowledge organization, technology, people and processes
- Search, content management, portals, collaboration, knowledge management, e-learning, etc.
4 Two Types of Taxonomies: Browse and Formal (Browse Taxonomy: Yahoo)
5 Two Types of Taxonomies: Formal
6 Browse Taxonomies: Strengths and Weaknesses
- Strengths: Browse is better than search
- Context and discovery
- Browse by task, type, etc.
- Weaknesses
- Mix of organization
- Catalogs, alphabetical listings, inventories
- Subject matter, functional, publisher, document type
- Vocabulary and nomenclature issues
- Problems with maintenance, new material
- Poor granularity and little relationship between parts
- Web site unit of organization
- No foundation for standards
7 Formal Taxonomies: Strengths and Weaknesses
- Strengths
- Fixed resource: little or no maintenance
- Communication platform: share ideas, standards
- Infrastructure resource
- Controlled vocabulary and keywords
- More depth, finer granularity
- Weaknesses
- Difficult to develop and customize
- Don't reflect users' perspectives
- Users have to adapt to language
8 Facets and Dynamic Classification
- Facets are not categories
- Entities or concepts belong to a category
- Entities have facets
- Facets are metadata: properties or attributes
- Entities or concepts fit into one category
- All entities have all facets, defined by a set of values
- Facets are orthogonal: mutually exclusive dimensions
- An event is not a person is not a document is not a place.
- Facets: variety of units, of structure
- Date or price: numerical range
- Location: big to small (partonomy)
- Winery: alphabetical
- Hierarchical: taxonomic
9 Faceted Navigation: Strengths and Weaknesses
- Strengths
- More intuitive: easy to guess what is behind each door
- 20 questions: we know and use
- Dynamic selection of categories
- Allow multiple perspectives
- Trick users into using Advanced Search
- wine where color = red, price = x-y, etc.
- Weaknesses
- Difficulty of expressing complex relationships
- Simplicity of internal organization
- Loss of browse context
- Difficult to grasp scope and relationships
- Limited domain applicability: type and size
- Entities, not concepts, documents, web sites
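The "wine where color red, price x-y" idea on this slide can be illustrated with a small sketch. It is not taken from the presentation or from any particular product: the `Wine` record, the `facet_filter` and `facet_counts` helpers, and the sample catalog are all hypothetical, and the sketch only assumes that each entity carries a value for every facet.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Wine:
    """A hypothetical entity; each field below is one facet."""
    name: str
    color: str
    price: float
    region: str


def facet_filter(items, color=None, price_range=None, region=None):
    """Return the items that match every selected facet (None = not selected)."""
    results = []
    for item in items:
        if color is not None and item.color != color:
            continue
        if price_range is not None:
            low, high = price_range
            if not (low <= item.price <= high):
                continue
        if region is not None and item.region != region:
            continue
        results.append(item)
    return results


def facet_counts(items, facet):
    """Count how many items fall under each value of one facet."""
    counts = {}
    for item in items:
        value = getattr(item, facet)
        counts[value] = counts.get(value, 0) + 1
    return counts


# Made-up sample data, for illustration only.
CATALOG = [
    Wine("Hypothetical Merlot", "red", 18.0, "Napa"),
    Wine("Hypothetical Pinot", "red", 42.0, "Sonoma"),
    Wine("Hypothetical Chardonnay", "white", 15.0, "Napa"),
]

reds_in_range = facet_filter(CATALOG, color="red", price_range=(10, 30))
print([wine.name for wine in reds_in_range])
print(facet_counts(CATALOG, "color"))
```

Each facet selection narrows the result set the way a field in an advanced search form would, which is the sense in which faceted browsing gets users to run an advanced search without calling it one.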
10 Dynamic Classification / Faceted Navigation
- Search and browse: better than either alone
- Categorized search: context
- Browse as an advanced search
- Dynamic search and browse is best
- Can't predict all the ways people think
- Advanced cognitive differences
- Panda, Monkey, Banana
- Can't predict all the questions and activities
- Intersections of what users are looking for and what documents are often about
- China and Biotech
- Economics and Regulatory
11 Business Case for Taxonomies: The Right Context
- Traditional Metrics
- Time savings: 22 minutes per user per day, $1 million a year
- Apply to your organization: customer service, content creation, knowledge industry
- Cost of not finding: re-creating content
- Research
- Advantages of browsing: Marti Hearst, Chen and Dumais
- Nielsen: Poor classification costs a 10,000-user organization $10M each year, about $1,000 per employee.
- Stories
- Pain points, success and failure in your corporate language
12 Business Case for Taxonomies: IDC White Paper
- Information Tasks
- Email: 14.5 hours a week
- Create documents: 13.3 hours a week
- Search: 9.5 hours a week
- Gather information for documents: 8.3 hours a week
- Find and organize documents: 6.8 hours a week
- Gartner: Businesses spend an estimated $750 billion annually seeking information necessary to do their jobs. 30-40% of a knowledge worker's time is spent managing documents.
13 Business Case for Taxonomies: IDC White Paper
- Time Wasted
- Reformat information: $5.7 million per 1,000 per year ($400M)
- Not finding information: $5.3 million per 1,000 ($370M)
- Recreating content: $4.5 million per 1,000 ($315M)
- Small percent gain: large savings
- 1%: $10 million
- 5%: $50 million
- 10%: $100 million
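The "small percent gain" figures on this slide appear to be rounded shares of the three totals given in parentheses. The following is a minimal arithmetic sketch under that assumption; it also assumes the parenthesized figures are dollar amounts in millions, and the names `WASTE_MILLIONS` and `savings` are illustrative only.

```python
# Totals from the slide, assumed to be millions of dollars per year.
WASTE_MILLIONS = {
    "reformatting information": 400,
    "not finding information": 370,
    "recreating content": 315,
}


def savings(total_millions, percent_gain):
    """Savings, in millions, from recovering percent_gain of the wasted cost."""
    return total_millions * percent_gain / 100


total = sum(WASTE_MILLIONS.values())
print(f"Total wasted: {total} million")
for pct in (1, 5, 10):
    print(f"{pct}% gain: about {savings(total, pct):.2f} million")
```

The total comes to 1,085 million, so a 1% gain is 10.85 million, which is consistent with the rounded 10, 50, and 100 million shown on the slide.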
14 Business Case for Taxonomies: The Right Context
- Justification
- Search engine: $500K-$2M
- Content management: $500K-$2M
- Portal: $500K-$2M
- Plus maintenance and employee costs
- Taxonomy
- Small comparative cost
- Needed to get full value from all the above
- ROI: asking the wrong question
- What is ROI for having an HR department?
- What is ROI for organizing your company?
15 Infrastructure Model of Taxonomy Development: Taxonomy in 4 Basic Contexts
- Ideas: Content Structure
- Language and Mind of your organization
- Applications: exchange meaning, not data
- People: Company Structure
- Communities, Users, Central Team
- Activities: Business processes and procedures
- Central team: establish standards, facilitate
- Technology / Things
- CMS, Search, portals, taxonomy tools
- Applications: BI, CI, Text Mining
16 Taxonomy in Context: Structuring Content
- All kinds of content and content structures
- Structured and unstructured, Internet and desktop
- Metadata standards: Dublin Core
- Keywords: poor performance
- Need controlled vocabulary, taxonomies, semantic network
- Other Metadata
- Document Type
- Form, policy, how-to, etc.
- Audience
- Role, function, expertise, information behaviors
- Best bets metadata
- Facets: entities and ideas
- Wine.com
17 Taxonomy in Context: Structuring People
- Individual People
- Tacit knowledge, information behaviors
- Advanced personalization: category priority
- Sales forms ---- New Account Form
- Accountant ---- New Accounts ---- Forms
- Communities
- Variety of types: map of formal and informal
- Variety of subject matter: vaccines, research, scuba
- Variety of communication channels and information behaviors
- Community-specific vocabularies, need for inter-community communication (Cortical organization model)
18 Taxonomy in Context: Structuring Processes and Technology
- Technology: infrastructure and applications
- Enterprise platforms: from creation to retrieval to application
- Taxonomy as the computer network
- Applications: integrated meaning, not just data
- Creation: content management, innovation, communities of practice (CoPs)
- When, who, how, and how much structure to add
- Workflow with meaning, distributed subject matter experts (SMEs) and centralized teams
- Retrieval: standalone and embedded in applications and business processes
- Portals, collaboration, text mining, business intelligence, CRM
19 Taxonomy in Context: The Integrating Infrastructure
- Starting point: knowledge architecture audit, K-Map
- Social network analysis, information behaviors
- People: knowledge architecture team
- Infrastructure activities: taxonomies, analytics, best bets
- Facilitation: knowledge transfer, partner with SMEs
- Taxonomies of content, people, and activities
- Dynamic dimension: complexity, not chaos
- Analytics based on concepts, information behaviors
- Taxonomy as part of a foundation, not a project
- In an infrastructure context
20 Taxonomy in Context: The Integrating Infrastructure
- Integrated enterprise requires both an infrastructure team and distributed expertise.
- Software and SMEs is not the answer: keywords
- Taxonomies are not stand-alone
- Metadata, controlled vocabularies, synonyms, etc.
- Variety of taxonomies, plus categorization, classification, etc.
- Important to know the differences, when to use which
- Multiple Applications
- Search, browse, content management, portals, BI, CI, etc.
- Infrastructure as Operating System
- Word vs. WordPerfect
- Instead of sharing clipboard, share information and knowledge.
21 Infrastructure Solutions, the Start and Foundation: Knowledge Architecture Audit
- Knowledge Map: understand what you have, what you are, what you want
- The foundation of the foundation
- Contextual interviews, content analysis, surveys, focus groups, ethnographic studies
- Category modeling: "Intertwingledness", learning new categories influenced by other, related categories
- Natural level categories mapped to communities, activities
- Novices prefer higher levels
- Balance of informativeness and distinctiveness
- Living, breathing, evolving foundation is the goal
22 Infrastructure Solutions: Resources - People and Processes: Roles and Functions
- Knowledge Architect and learning object designers
- Knowledge engineers and cognitive anthropologists
- Knowledge facilitators and trainers and librarians
- Part Time
- Librarians and information architects
- Corporate communication editors and writers
- Partners
- IT, web developers, applications programmers
- Business analysts and project managers
23 Infrastructure Solutions: Resources - People and Processes: Central Team
- Central team supported by software and offering services
- Creating, acquiring, evaluating taxonomies, metadata standards, vocabularies
- Input into technology decisions and design: content management, portals, search
- Socializing the benefits of metadata, creating a content culture
- Evaluating metadata quality, facilitating author metadata
- Analyzing the results of using metadata, how communities are using it
- Research: metadata theory, user-centric metadata
- Design content value structure: more nuanced than good / poor content.
24 Infrastructure Solutions: Resources - People and Processes: Facilitating Knowledge Transfer
- Need for Facilitators
- Amazon: hiring humans to refine recommendations
- Google: humans answering queries
- Facilitate projects, KM project teams
- Facilitate knowledge capture in meetings, best practices
- Answering online questions, facilitating online discussions, networking within a community
- Design and run KM forums, education and innovation fairs
- Work with content experts to develop training, incorporate intelligence into applications
- Support innovation, knowledge creation in communities
25 Infrastructure Solutions: Resources - People and Processes: Location of Team
- KM/KA Dept.: cross-organizational, interdisciplinary
- Balance of dedicated and virtual, partners
- Library, Training, IT, HR, Corporate Communication
- Balance of central and distributed
- Industry variation
- Pharmaceutical: dedicated department, major place in the organization
- Insurance: small central group with partners
- Beans: a librarian and part-time functions
- Which design: knowledge architecture audit
26 Infrastructure Solutions: Resources - Technology
- Taxonomy Management
- Text and Visualization
- Entity and Fact Extraction
- Text Mining
- Search for professionals
- Different needs, different interfaces
- Integration Platform technology
- Enterprise Content Management
27 Taxonomy Development Tips and Techniques - Stage One: How to Begin
- Step One: Strategic questions: why, what value from the taxonomy, how are you going to use it
- Variety of taxonomies: important to know the differences, when to use what.
- Step Two: Get a good taxonomist! (or learn)
- Library Science, Cognitive Science, Cognitive Anthropology
- Step Three: Software shopping
- Automatic software: fun diversion for a rainy day
- Uneven hierarchy, strange node names, weird clusters
- Taxonomy management, entity extraction, visualization
- Step Four: Get a good taxonomy!
- Glossary, index, pull from multiple sources
- Get a good document collection
28 Infrastructure Solutions: Taxonomy Development - Stage Two: Taxonomy Model
- Enterprise Taxonomy
- No single subject matter taxonomy
- Need an ontology of facets or domains
- Standards and Customization
- Balance of corporate communication and departmental specifics
- At what level are differences represented?
- Customize pre-defined taxonomy: additional structure, add synonyms and acronyms and vocabulary
- Enterprise Facet Model
- Actors, Events, Functions, Locations, Objects, Information Resources
- Combine and map to subject domains
29 Taxonomy Development Tips and Techniques - Stage Three: Development and/or Customization
- Combination of top down and bottom up (and Essences)
- Top: Design an ontology, facet selection
- Bottom: Vocabulary extraction: documents, search logs, interview authors and users
- Develop essential examples (Prototypes)
- Most intuitive level: genus (oak, maple, rabbit)
- Quintessential chair: all the essential characteristics, no more
- Work toward the prototype and out and up and down
- Repeat until dizzy or done
- Map the taxonomy to communities and activities
- Category differences
- Vocabulary differences
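The bottom-up step on this slide (vocabulary extraction from documents and search logs) could start from something as simple as counting recurring words and word pairs. The sketch below is one possible starting point, not the method used in the presentation: the `candidate_terms` helper, the stopword list, and the sample queries are all hypothetical.

```python
import re
from collections import Counter

# A tiny illustrative stopword list; a real project would use a fuller one.
STOPWORDS = {"the", "a", "an", "of", "and", "for", "to", "in", "on", "is", "how"}


def candidate_terms(texts, min_count=2):
    """Return (term, count) pairs for words and word pairs seen at least min_count times."""
    counts = Counter()
    for text in texts:
        words = [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS]
        counts.update(words)
        # Pairs are formed after stopword removal, so "form for sales" yields "form sales".
        counts.update(" ".join(pair) for pair in zip(words, words[1:]))
    return [(term, n) for term, n in counts.most_common() if n >= min_count]


# Made-up search log entries, for illustration only.
SEARCH_LOG = [
    "new account form",
    "account form for sales",
    "expense policy",
    "new account setup",
]

for term, n in candidate_terms(SEARCH_LOG):
    print(n, term)
```

The output is only a list of candidates. Deciding which terms become nodes, synonyms, or facet values is still the job of the taxonomist and the subject matter experts.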
30 Taxonomy Development Tips and Techniques - Stage Four: Evaluate and Refine
- Formal Evaluation
- Quality of corpus: size, homogeneity, representativeness
- Breadth of coverage: main ideas, outlier ideas (see next)
- Structure: balance of depth and width
- Kill the verbs
- Evaluate speciation steps: understandable and systematic
- Person - Unwelcome person - Unpleasant person - Selfish person
- Avoid binary levels, duplication of contrasts
- Primary and secondary education, public and private
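Two of the structural checks on this slide, the balance of depth and width and the avoidance of binary levels, can be partly automated. The sketch below assumes the taxonomy is held as nested dictionaries (label to children), which is an illustrative choice rather than anything the presentation prescribes. The sample tree reuses the slide's own examples, and the function names are hypothetical.

```python
# Sample tree built from the slide's examples; an empty dict marks a leaf.
TAXONOMY = {
    "Person": {
        "Unwelcome person": {
            "Unpleasant person": {
                "Selfish person": {},
            },
        },
    },
    "Education": {
        "Primary education": {},
        "Secondary education": {},
    },
}


def depth(tree):
    """Number of levels in the tree (0 for an empty tree)."""
    if not tree:
        return 0
    return 1 + max(depth(children) for children in tree.values())


def max_width(tree):
    """Largest number of siblings found at any single branching point."""
    if not tree:
        return 0
    return max(len(tree), max(max_width(children) for children in tree.values()))


def binary_nodes(tree, path=()):
    """Paths of nodes that split into exactly two children."""
    flagged = []
    for label, children in tree.items():
        here = path + (label,)
        if len(children) == 2:
            flagged.append(" > ".join(here))
        flagged.extend(binary_nodes(children, here))
    return flagged


print("depth:", depth(TAXONOMY))
print("max width:", max_width(TAXONOMY))
print("binary nodes:", binary_nodes(TAXONOMY))
```

A check like this only flags shapes worth a second look. Whether a given speciation step is understandable and systematic remains a human judgment.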
31 Taxonomy Development Tips and Techniques - Stage Four: Evaluate and Refine
- Practical Evaluation
- Test in real-life application
- Select representative users and documents
- Test node labels with Subject Matter Experts
- Balance of making sense and jargon
- Test with representative key concepts
- Test for unrepresentative, strange little concepts that only mean something to a few people, but the people and ideas are key and are normally impossible to find
32 Sources
- Books
- Women, Fire, and Dangerous Things: What Categories Reveal about the Mind
- George Lakoff
- The Geography of Thought
- Richard E. Nisbett
- Software
- Convera RetrievalWare
- Inxight Smart Discovery: entity and fact extraction
- Courses
- Convera Taxonomy Certification
33 Conclusion
- Taxonomy development is not just a project
- It has no beginning and no end
- Taxonomy development is not an end in itself
- It enables the accomplishment of many ends
- Taxonomy development is not just about search or browse
- It is about language, cognition, and applied intelligence
- Strategic vision (articulated by K-Map) is important
- Even for your under-the-radar vocabulary project
- Paying attention to theory is practical
- So is adapting your language to business speak
34 Conclusion
- Taxonomies are part of your intellectual infrastructure
- Roads, transportation systems, not cars or types of cars
- Taxonomies are part of creating smart organizations
- Self-aware, capable of learning and evolving
- Think Big, Start Small, Scale Fast
- If we really are in a knowledge economy
- We need to pay attention to
- Knowledge!
35 Questions?
- Tom Reamy, tomr_at_kapsgroup.com
- KAPS Group
- Knowledge Architecture Professional Services
- http://www.kapsgroup.com