Title: MGT8004 Electronic Records Management
1 MGT8004 Electronic Records Management
- Module 6
- Classification and Indexing Electronic Records
- (Much of the content of this module is drawn from
ISO 15489, which in October 2001 replaced AS 4390 (1996),
the standard referred to in this course. Students taking
this course in 2002 may refer to either standard
in assignments and examination work.)
2 The Electronic Records Pyramid
LEVEL 4 Retention and Disposition (covered in Module 7)
LEVEL 3 Classification and indexing (THIS MODULE)
LEVEL 2 Capturing full and accurate records (covered in Module 5)
LEVEL 1 Appraisal and Disposal
3 The need for efficient classification and
indexing: information retrieval
- "Every minute of delay in finding a record is
costly - in user or requester waiting time and in
filer searching time - to say nothing of possible
loss of business as an ultimate result"
(Smith and Kallaus, 1997)
- Effective retrieval requires a knowledge of
classification and indexing techniques and a
thorough understanding of the organisation's
activities
4 The complexity of the retrieval process
- Consider the increasing volume of organisational
information (quantities of records)
- Consider the increasingly diverse formats in
which information can be stored
- Consider the variety of locations for paper
records, ie work area, central records, off-site
storage, archives etc
- Consider the need for compatibility in technology to
retrieve records
- Compare searching for organisational records with
searching an electronic database
- Need for a matching classification scheme for
electronic and paper records
5 Classification and Indexing - Definition
according to AS 4390-1996
- Classification .. Allowing for appropriate
grouping, naming, security protection, user
permissions and retrieval
- Indexing - allocating attributes or codes to
particular records to assist in their retrieval
6 Classification and ISO 15489
- Classification systems provide an organisation
with a tool to
- Organize, describe and link its records
- Link and share interdisciplinary records, either
internally or externally to the organisation, and
- Provide improved access, retrieval, use and
dissemination of its records as appropriate
- (Section 4.2.2.1)
7 Classification and ISO 15489 (cont)
- Supported by instruments such as vocabulary
controls, classification systems promote
consistency of titling and description to
facilitate retrieval and use.
- Classification systems can be used to support a
variety of records management processes in
addition to facilitating access and use, for
example, storage and protection, and retention
and disposition
- (Section 4.2.2.1)
8 Degree of classification control needed
- Classification systems reflect the simplicity or
complexity of the organisation from which they
derive
- Degree of control needed will depend on
- Organisational structures
- Nature of the business
- Accountabilities
- Technology deployed
9 Functions of classification
To assist in a number of Records Management
processes
- Provide linkages between individual records which
accumulate to provide a continuous record of
activity
- Ensure records are named in a consistent manner
over time
- Assist in retrieval of all records relating to a
particular function or activity
- Determine appropriate retention periods and
disposition actions for records
- Determine security protection and access
appropriate for sets of records
- Allocate user permissions for access to, or
action on, particular groups of records
- Distribute responsibility for management of
particular sets of records
- Distribute records for action
10 Classification Systems
- Classification systems reflect the business of
the organisation from which they derive and are
normally based on an analysis of the
organisation's business activities.
11 Analysis of business activity
- Should provide a conceptual model of what an
organisation does and how it does it
- Demonstrates how records relate to both the
organisation's business and its business
processes
- Will contribute to decisions in subsequent steps
about the creation, capture, control, storage and
disposition of records and about access to them
(particularly important in an electronic business
environment where adequate records will not be
captured and retained unless the system is
properly designed)
12 Outcomes of analysis of business activity
- Description of the organisation's business and
business processes
- Business classification scheme that shows the
organisation's functions, activities and
transactions in a hierarchical relationship, and
- A map of the organisation's business processes
that shows the points at which records are
produced or received as products of business
activity
- Basis for developing
- Thesaurus
- Disposition authority
- Help for identifying and implementing appropriate
metadata strategies and in formally assigning
responsibilities for keeping records
13 Business Classification Revisited: Steps
involved in developing a BCS
- Gather documentary information and conduct
interviews
- Gain an understanding of the overall mission/objectives
of the organisation
- Derive and list the functions needed to achieve
objectives
- Identify hierarchies of activities which support
each function
- Identify the transactions which operationalise
each activity
- Identify processes/activities which are common
across functions
- Produce a map of the hierarchies for each function
14 The structure of the BCS
- Structure of the classification system is usually
hierarchical, as follows
- First level usually reflects the business
function
- Second level is based on the activities constituting the
function
- Third and subsequent levels are further
refinements of the activities or groups of
transactions that take place within each activity
- Degree of refinement reflects the complexity of
the functions undertaken within the organisation
- See next slide for example
- ISO 15489 4.2.2.2
15 Example of BCS
- 1 Managing Human Resources
- 1.1 Determining allowances
- 1.2 Establishing conditions of employment
- 1.2.1 Appointments
- 1.2.2 Apprenticeships
- 1.2.3 Childcare
- 1.2.4 Flexible work arrangements
- 1.3 Calculating leave
- 1.3.1 Accrual
- 1.3.2 Entitlements
- 1.3.3 Holidays
- 1.4 Recruiting employees
- 1.5 Determining salaries
- 1.5.1 Deductions
- 1.5.2 Overtime
- 1.5.3 Remuneration
- 1.5.4 Superannuation
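The hierarchy in this example can be sketched as a nested data structure. The following is a minimal illustration only, not part of ISO 15489: the `BCS` dictionary layout and the `numbered_entries` and `classify` helpers are hypothetical names, and numbering the function itself as "1" is an assumption inferred from the 1.x codes of its activities.

```python
# Hypothetical nested representation of the example scheme:
# function -> activity -> third-level terms.
BCS = {
    "Managing Human Resources": {
        "Determining allowances": [],
        "Establishing conditions of employment": [
            "Appointments", "Apprenticeships", "Childcare",
            "Flexible work arrangements",
        ],
        "Calculating leave": ["Accrual", "Entitlements", "Holidays"],
        "Recruiting employees": [],
        "Determining salaries": [
            "Deductions", "Overtime", "Remuneration", "Superannuation",
        ],
    },
}


def numbered_entries(bcs):
    """Return (code, path) pairs, e.g. ('1.3.2', [function, activity, term])."""
    entries = []
    for f_no, (function, activities) in enumerate(bcs.items(), start=1):
        entries.append((str(f_no), [function]))
        for a_no, (activity, terms) in enumerate(activities.items(), start=1):
            a_code = f"{f_no}.{a_no}"
            entries.append((a_code, [function, activity]))
            for t_no, term in enumerate(terms, start=1):
                entries.append((f"{a_code}.{t_no}", [function, activity, term]))
    return entries


def classify(bcs, code):
    """Look up the full hierarchical path for a classification code."""
    return dict(numbered_entries(bcs)).get(code)


print(classify(BCS, "1.3.2"))
# ['Managing Human Resources', 'Calculating leave', 'Entitlements']
```

Storing the full path against each code means a record classified at the third level can still be retrieved by its function or activity.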
16 Some ISO 15489 guidelines for developing a BCS
- Terminology should be derived from business
functions and activities, not from names of
organisational units
- Should be specific to each organisation
- Should provide a consistent and standard way of
communicating across organisational units and
sharing the same information for interrelated
functions
- Should be hierarchical, moving from the most
general to the most specific concept, ie from
high-level function to specific transaction
- Should consist of unambiguous terms reflecting
organisational usage
- Should consist of sufficient groupings and
sub-groupings to include all of the business
functions and activities being documented
- Should consist of discrete groupings
- Should be devised in consultation with the
records creators
- Should be maintained to reflect changing business
needs and to ensure that the scheme is up to date,
reflecting changes in the functions and
activities of the organisation.
17 BCS v knowledge based classification systems
- Read Section 6.1.3 and explain in your own words
the differences between these two classification
systems.
18 Series, files and documents
- Covered in Module 5
- Read Section 6.2 and then complete Activity 6.1
and Activity 6.2
19 File based v document based systems
- Individual retrieval units may be either files or
documents
- A file is a group of related documents located
within a file cover or folder
- Which description, ie file based or document based,
is best applied to electronic records systems?
20 Registration of records
- "The purpose of registration is to provide
evidence that a record has been created or
captured in a recordkeeping system."
(Australian Standard AS 4390-1996, pt 4, p 4,
6.1)
- Involves recording brief descriptive information
about the records in a register, and assigning
the record a unique identifier
- Not commonly used for paper-based systems
21 Registration and electronic records systems
- In electronic record systems a register may
include classification and determination of
disposition and access status
- Electronic records systems can be designed to
register records through automatic processes,
transparent to the user of the business system
from which the record is captured and without the
intervention of a records management practitioner
- Even if not totally automated, elements of the
registration process can be automatically derived
from the computing and business environment from
which the record originates
- The register is usually unalterable. If changes
are made there must be an audit trail.
22 Minimum metadata required at registration
- Unique identifier assigned by the system
- Date and time of registration
- Author (person or corporate body), sender or
recipient
- Title or abbreviated description
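A minimal sketch of registration, assuming an in-memory register. The `Register` and `RegisterEntry` classes, the `REC-` identifier format and the sample authors and titles are all hypothetical and not taken from AS 4390 or ISO 15489. It shows the system, rather than the user, assigning the unique identifier and the registration date and time.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from itertools import count


@dataclass(frozen=True)  # frozen: a registered entry cannot be altered in place
class RegisterEntry:
    identifier: str          # unique identifier assigned by the system
    registered_at: datetime  # date and time of registration
    author: str              # author, sender or recipient
    title: str               # title or abbreviated description


class Register:
    def __init__(self):
        self._entries = {}
        self._next_number = count(1)

    def register(self, author, title):
        """Capture the minimum metadata and return the new entry."""
        if not author or not title:
            raise ValueError("author and title are required at registration")
        identifier = f"REC-{next(self._next_number):06d}"
        entry = RegisterEntry(identifier, datetime.now(timezone.utc), author, title)
        self._entries[identifier] = entry
        return entry

    def get(self, identifier):
        return self._entries[identifier]


register = Register()
first = register.register("J. Citizen", "Leave entitlement enquiry")
second = register.register("Payroll Unit", "Overtime claim, March")
print(first.identifier, second.identifier)  # REC-000001 REC-000002
```

A production register would also keep an audit trail of any change; that is outside the scope of this sketch.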
23 Other information which may be required at
registration
- Will depend upon
- The nature of business recorded
- The organisation's evidence requirements
- The technology deployed
- See other information listed on next slide
24 Other metadata which may be required at
registration
- Date and time of communication and receipt
- Author (with his/her affiliation)
- Incoming, outgoing or internal
- Text description or abstract
- Document date and title
- Date of creation
- Business system from which record captured
- Recipient (with his/her affiliation)
- Classification according to classification scheme
- Sender (with his/her affiliation)
- Links to related records
- Physical form
- Application software version under which record
created
- Details of embedded document links
- Standard with which the record's structure complies, eg
XML
- Access
- Templates required to interpret document structure
- Retention period
- Other structural and contextual information
useful for management purposes
25 Indexing
- "..the process of establishing and applying terms
or codes to particular records by which they may
be retrieved"
- AS 4390 (1996)
- Appropriate allocation of index terms extends the
possibilities of retrieval of records across
classifications, categories and media
- (ISO 15489)
- Indexing terms may be derived from a document by
computer or assigned manually using
pre-established categories or indexing terms such
as a thesaurus
26 File titling
- Titles need to be representative of a record's
context as well as its content
- File title (possibly a set of index terms) acts as
a label for the file
- File titles aim to achieve two objectives
- to help minimise confusion over what file to
place a document on
- to aid retrieval
- When automated retrieval software is used,
sequential file numbering is often used. Construction of
the title is then important, as each word in the title should
be searchable.
27 Types of computerised indexing systems
- Free text retrieval (full text retrieval)
locates records by searching their content
- Indexing may also be based on
- User profiles
- Document and subject profiles
- Document content
- Use of intelligent agents
28 Indexing terms
- May be restricted to terminology established in
BCS and/or thesaurus
- Commonly derived from
- Format or nature of the record
- Title or main heading of the record
- Subject content of the record (in accord with
business activity)
- Abstract of a record
- Dates associated with transactions recorded in
the record
- Names of clients or organisations
- Particular handling or processing requirements
- Attached documentation not otherwise identified,
or
- The uses of the record
29 Keyword AAA Thesaurus
- Used widely in public sector organisations
- complies with AS 4390-1996, ie
- based on business classification rather than
knowledge-based classification
- tight hierarchical structure employing three
levels of terms, ie
- keyword
- activity descriptor (may be more than one)
- subject descriptor/free text (may be more than
one)
- can be used with electronic or paper records
- http://www.unimelb.edu.au/CSD/image/execserv/keyintro.htm
30 File Titling
- File titling aims to achieve two objectives
- To help minimise confusion over what file to place a
document on
- To aid retrieval
- Organisations using automated retrieval software
often use sequential numbering filing systems, in
which case the construction of the file title is
vital as the file number gives no indication of
the contents of the file
- Automated retrieval software allows each
individual word within the title to be searchable
31 Hierarchical file titling
- Allows printing or browsing of alphabetical
listings with file titles grouped together within
their broad class terms (keywords) and activities
- Allows broad searching (at level of keyword) or
very specific searching (at the level of free
text)
- Allows representation of both contextual and
content aspects of the record
- Possible disadvantages include
- Need to prespecify as many hierarchies as
possible
- Tendency to force each title into an
inappropriate hierarchical order
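A minimal sketch of hierarchical file titling built on the three levels named earlier (keyword, activity descriptor, subject descriptor/free text). The `FileTitle` type, the helper names and the sample terms are hypothetical and are not drawn from the Keyword AAA thesaurus itself.

```python
from typing import NamedTuple


class FileTitle(NamedTuple):
    keyword: str   # broad class term (first level)
    activity: str  # activity descriptor (second level)
    subject: str   # subject descriptor / free text (third level)

    def __str__(self):
        return f"{self.keyword.upper()} - {self.activity} - {self.subject}"


titles = [
    FileTitle("Personnel", "Leave", "Accrual rules 2001"),
    FileTitle("Personnel", "Recruitment", "Graduate intake 2002"),
    FileTitle("Finance", "Budgeting", "Estimates 2002"),
]


def alphabetical_listing(file_titles):
    """Sort by keyword, then activity, then subject, so related files group together."""
    return [str(t) for t in sorted(file_titles)]


def search(file_titles, keyword=None, free_text=None):
    """Broad search by keyword, specific search by free text, or both."""
    hits = []
    for t in file_titles:
        if keyword and t.keyword.lower() != keyword.lower():
            continue
        if free_text and free_text.lower() not in t.subject.lower():
            continue
        hits.append(t)
    return hits


for line in alphabetical_listing(titles):
    print(line)
print(search(titles, keyword="personnel"))
print(search(titles, free_text="graduate"))
```

Because the keyword comes first in each title, a plain alphabetical sort is enough to group files within their broad class terms.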
32 Consolidation: file titling
- Complete Activity 6.3 and Activity 6.4 on page
6.8 of your study book now.
33 Metadata and electronic records
- Metadata .. A description or profile of a
document or other information object which may
contain data about its context, form and content.
- Additional information required at the
registration stage for electronic documents
because of higher risk of loss
- Addition of metadata can be automated by records
management software programs
- http://www.gmb.com.au/products/button/intro.htm
- Metadata is considered a vital ingredient of
electronic recordkeeping because the risk of loss
of electronic documents is much higher than for
paper records
34 Metadata and electronic records (cont)
- Recordkeeping metadata meets records management
objectives by enabling records to be uniquely
- identifiable,
- contextualised,
- retrievable,
- understandable,
- managed,
- accountable and
- migratable
- (Cumming 2001)
- Specific business benefits which accrue from
application of a full range of recordkeeping
metadata include
- Increased control
- Understanding
- Authenticity
- Security
- Accessibility
of organisational information, and the ability to
reuse data as required
- The need for metadata is increasing in the transition from
paper to electronic records, so that a
search will reveal both hard copy and digital
copies of information
35 Automating metadata
- Addition of metadata to electronic records is
being facilitated by electronic records
management software programs which can automate
the process
- Access the gmb website shown at the top of page
6.10 and explain how RecFind automates the
inclusion of metadata for electronic records.
36 Steps in the Indexing Process (also involves
classification)
- 1 Examine the document in an attempt to classify
and find suitable indexing terms - look for
- title
- names of originating persons or organisations
- opening and closing paragraphs
- groups of words underlined or printed in
different typefaces
- 2 Identify useful retrieval concepts by asking
questions such as
- Does the document/file record a transaction?
- Does the document/file record an activity or
course of action?
- Does the document/file refer to methods for
accomplishing a course of action?
- Does the document/file deal with a particular
product, organisation, or condition?
- Does the subject of the document/file contain an
action concept, ie an operation or process?
37 Steps in the Indexing Process (cont)
- 1 Examine the document
- 2 Identify useful retrieval concepts
- 3 Translate concepts into the indexing
vocabulary. Issues to be considered include
- controlled and/or natural language
- method of indexing proper names
- pre-coordinate or post-coordinate method
- how specific index headings will be
- how to achieve consistency in indexing
NOW COMPLETE ACTIVITY 6.5 ON PAGE 124 OF YOUR TEXT
38 Controlled v Natural Language vocabulary
- Controlled vocabulary
- indexer translates identified concepts into the
standardised or authorised terms in an
alphabetical thesaurus
- Natural language
- non-thesaurus terms and phrases assigned by the
indexer in an extra field, eg Narrative
- often include proper names
- Summaries or abstracts can also be used as index
terms where terms in any field are searchable
online
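A minimal sketch of the distinction, assuming a tiny hand-made thesaurus. The `THESAURUS` mapping and the `index_terms` helper are hypothetical. Concepts that match an entry are translated to the authorised term; everything else, such as proper names, is kept as natural-language terms for a separate field.

```python
# Hypothetical thesaurus: variant term (lower case) -> authorised term.
THESAURUS = {
    "recruitment": "Recruitment",
    "hiring": "Recruitment",
    "staff selection": "Recruitment",
    "leave": "Leave",
    "holidays": "Leave",
}


def index_terms(concepts, thesaurus=THESAURUS):
    """Split concepts into controlled terms and natural-language terms."""
    controlled, natural = set(), []
    for concept in concepts:
        key = concept.strip().lower()
        if key in thesaurus:
            controlled.add(thesaurus[key])  # translate to the authorised term
        else:
            natural.append(concept.strip())  # keep as entered, eg a proper name
    return sorted(controlled), natural


print(index_terms(["Hiring", "staff selection", "Jane Smith"]))
# (['Recruitment'], ['Jane Smith'])
```

Two indexers who describe the same document as "hiring" and "staff selection" end up with the same controlled term, which is the consistency a thesaurus is meant to provide.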
39 Consistency in indexing
CONSISTENT INDEXING
- Use of Thesaurus
- Rules for formulating names
- Guidelines on translation
PREDICTABILITY IN RETRIEVAL
Problems where indexing is not consistent
- Scattering of documents on same topic in
different files
- Incomplete files
- Low retrieval rates and difficulty in finding
individual documents
- Problems with reliable and efficient retention
and disposal
40 Impact of technology on indexing and retrieval
Increasingly sophisticated indexing software
Increasingly sophisticated search engines
Increasingly sophisticated navigational mechanisms
41 Indexing and Search Methods for Full Text
Databases and Networked information
- The nature and extent of human classification and
indexing required will depend on the storing,
indexing, and searching software capabilities of
the system
- LANs and intranets allow the requesting of
information by a client from document collections
stored on a server
- Geographically dispersed organisations can access
corporate documents stored at different points on
the network
Technology is changing the nature of a document.
Explain.
What do we index, ie an individual object or the entire
document?
42 New concepts for classification and indexing in
technological environments
GROUPWARE - Digitally based team collaboration. An
organisational intranet could be regarded as a very
simple example of groupware
WORKFLOW SOFTWARE - Automates the flow of tasks and
information around an organisation
COMPOUND DOCUMENTS - Digital documents which may be a
combination of text, audio or graphic objects, with
elements not necessarily stored together on one server
but brought together through hypertext links
Complete Activity 6.6 on page 6.12 of your study
book
43 Evaluating retrieval performance
RECALL - the proportion of the relevant documents
in the collection that are retrieved
PRECISION - the proportion of the documents
retrieved that are found to be relevant
[Figure: Recall and Precision]
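The two measures can be worked through with a small calculation. This sketch assumes the standard information-retrieval ratios (precision = relevant retrieved / all retrieved; recall = relevant retrieved / all relevant in the collection); the function name and document identifiers are hypothetical.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one search.

    retrieved: identifiers the search returned
    relevant:  identifiers in the collection that are actually relevant
    """
    retrieved, relevant = set(retrieved), set(relevant)
    hits = retrieved & relevant  # relevant documents that were retrieved
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


# Four documents retrieved, two of them relevant; five relevant documents exist.
p, r = precision_recall(
    retrieved={"d1", "d2", "d3", "d4"},
    relevant={"d2", "d4", "d7", "d8", "d9"},
)
print(p, r)  # 0.5 0.4
```

Broadening a search tends to raise recall at the cost of precision, and narrowing it tends to do the reverse.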
44 Indexing and Searching Technologies
- Free text searching - computer searches for word
or phrase in one or more database fields or
document full text
- Free text scanning - computer sequentially scans
terms in each document or a database to find a
match
- N-grams or suffix arrays - index stores word
fragments on which matching takes place
- Pattern recognition - index stores binary
representations - overcomes need for correct
spelling but reduces precision - great for sound,
video and images
- Document clustering - document assigned a theme
which is used as the index value
- Hypertext systems - nodes or chunks of
information (including text, images or sound) are
stored and connected by means of links or pathways
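A minimal sketch of an n-gram index, assuming three-letter fragments (trigrams) and whitespace-separated words. The function names and sample documents are hypothetical. The index maps each word fragment to the documents containing it, and a query is matched on its own fragments.

```python
from collections import defaultdict


def ngrams(word, n=3):
    """Return the set of n-letter fragments of a word, lower-cased."""
    word = word.lower()
    return {word[i:i + n] for i in range(len(word) - n + 1)}


def build_index(documents, n=3):
    """Map each word fragment to the ids of the documents that contain it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for word in text.split():
            for gram in ngrams(word, n):
                index[gram].add(doc_id)
    return index


def search_fragment(index, fragment, n=3):
    """Return candidate documents containing every n-gram of the fragment.

    These are candidates only: the n-grams could come from different words,
    so a real system would verify each hit against the document text.
    """
    grams = ngrams(fragment, n)
    if not grams:
        return set()
    return set.intersection(*(index.get(g, set()) for g in grams))


docs = {
    "doc1": "Superannuation deductions for staff",
    "doc2": "Student loans register",
    "doc3": "Annual leave entitlements",
}
index = build_index(docs)
print(sorted(search_fragment(index, "annua")))  # ['doc1', 'doc3']
```

Matching on fragments finds "annua" inside both "Superannuation" and "Annual", which a whole-word index would miss.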
45 Search Approaches (1)
What will each search find?
- Boolean searches
- and - Loans and Students
- or - Loans or Students
- not - Loans not Students
- Wildcard searching (to search for a word where
some letter/s are missing at the beginning, in the middle
or at the end of the word)
- McG ( more than one letter)
- McGra@y (@ just one letter)
- Truncation - used where you are not sure of
ending of word
- Har
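A minimal sketch of what these searches return, assuming a small hand-made index of terms to document ids. The index contents and the name list are hypothetical. The wildcard symbols `*` (any number of letters) and `?` (exactly one letter) are those of Python's `fnmatch` module, used here for illustration; a given retrieval system may use different symbols.

```python
from fnmatch import fnmatchcase

# Hypothetical index: term -> ids of the documents containing it.
INDEX = {
    "loans": {"doc1", "doc2", "doc4"},
    "students": {"doc2", "doc3"},
}
loans, students = INDEX["loans"], INDEX["students"]

both = loans & students        # Loans AND Students: documents with both terms
either = loans | students      # Loans OR Students: documents with at least one
loans_only = loans - students  # Loans NOT Students: loans without students

print(sorted(both))        # ['doc2']
print(sorted(either))      # ['doc1', 'doc2', 'doc3', 'doc4']
print(sorted(loans_only))  # ['doc1', 'doc4']

names = ["McGrath", "McGraw", "McGregor", "Harris", "Harrison"]


def wildcard(pattern, terms):
    """Return the terms matching a pattern, case-sensitively."""
    return [t for t in terms if fnmatchcase(t, pattern)]


print(wildcard("McG*", names))     # ['McGrath', 'McGraw', 'McGregor']
print(wildcard("McGra?h", names))  # ['McGrath']
print(wildcard("Har*", names))     # truncation: ['Harris', 'Harrison']
```

AND narrows a result set, OR widens it and NOT excludes, which is why the three Boolean searches on the same two terms return different documents.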
46 Search approaches (2)
- Proximity operators - used to stipulate that
terms must be adjacent, in same sentence, in same
paragraph etc
- Student ADJ Loans
- Fuzzy logic - search specifications made more
vague than that input by researcher
- an AND search may return both AND and OR matches
COMPLETE ACTIVITY 6.7 AND 6.8 ON PAGE 6.13 OF
YOUR STUDY BOOK
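A minimal sketch of two proximity tests, assuming simple word and sentence splitting. The function names are hypothetical, and real search software defines operators such as ADJ in its own way; here `adjacent` requires the first term to be immediately followed by the second.

```python
import re


def tokens(text):
    """Lower-case the text and split it into words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def adjacent(text, first, second):
    """True if `first` is immediately followed by `second` (an ADJ-style test)."""
    words = tokens(text)
    first, second = first.lower(), second.lower()
    return any(a == first and b == second for a, b in zip(words, words[1:]))


def same_sentence(text, first, second):
    """True if both terms occur within one sentence, in any order."""
    for sentence in re.split(r"[.!?]", text):
        words = set(tokens(sentence))
        if first.lower() in words and second.lower() in words:
            return True
    return False


print(adjacent("Student loans are rising.", "Student", "Loans"))        # True
print(adjacent("Loans to each student rose.", "Student", "Loans"))      # False
print(same_sentence("Loans to each student rose.", "Student", "Loans")) # True
```

The tighter the proximity condition, the fewer documents match, so ADJ gives higher precision than a same-sentence or same-paragraph test.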
47 Consolidation for Module 6
- COMPLETE ACTIVITY 6.9 ON PAGE 6.14 OF YOUR STUDY
BOOK. This is a particularly valuable reading
for revising the concepts provided in Module 6.