Title: Electronic Records Retention: Enterprise Strategies
1. Electronic Records Retention: Enterprise Strategies
- Presented by David O. Stephens, CRM, FAI
- To the Nebraska Chapter ARMA International
- April 19, 2006
2. Our objective today: The What, Why and How of Electronic Records Retention
- Introduction: Basic principles of records retention
- ERR: A truly historic opportunity for records management
- The role of ERR in enhanced data storage and life cycle management
- The business case for ERR
- ERR in desktop computing environments
- ERR in IT-managed production applications
3. ERR Defined . . .
- . . . The act of retaining computer-based records in digital format for specified, pre-determined periods of time, commensurate with their value, with subsequent disposal or permanent preservation as a matter of official organizational policy.
- Or, in plain English, getting rid of dead data as soon as it dies!
4. Introduction
- Few organizations do records retention well.
- Still fewer do ERR well, or indeed at all!
- Let's introduce the subject of ERR by making some brief comments about records retention in general.
5. Common mistakes in retention
- Retention schedules poorly developed: too general or too detailed.
- Inadequate coverage for electronic records.
- Inadequate implementation strategy.
6. The goal: 80 to 90% perfect!
- Records retention is never perfect!!!
- Some employees will always circumvent whatever policy or guidelines are in place.
- But 80 to 90% perfect is a big success!
7. Getting to (nearly) perfect in enterprise records retention
- Clear, comprehensive policies and schedules
- Aggressive purge day strategies
- Employee / departmental compliance requirements, with penalties for non-compliance
- Retention audits by corporate compliance officer
8. Enterprise retention strategies must be successful in five recordkeeping environments
- 1. Active paper records at departmental workstations
- 2. Inactive paper records in storage facilities
- 3. Personal working papers kept in desks, credenzas and bookcases
- 4. Data in computer applications managed by IT
- 5. Electronic records in desktops, controlled by their creators
9. Basic questions concerning data life cycle management
- What happens to computer data as it ages?
- Does the value of data increase or decrease as time passes?
- Do storage management requirements change as data ages through its life cycle?
- In the world of paper, these are questions that records managers have addressed for decades!
- But not in the world of IT, where retention has not been widely practiced.
10. ERR: It's in the embryonic stage!
- According to one study, 47% of the respondents reported that electronic records have not been specifically included in their organizations' retention schedules.
- Another study: 81% of the respondents reported that rules for automatic purging of data, under authority of retention schedules, have not been incorporated into their organizations' computer applications.
11. ERR: A truly historic opportunity for RM
- As it is practiced in the USA, records retention is, by far, the most important component of RM.
- And ERR is, arguably, the single most important issue in RM today.
12. The ten most important issues in RM today (in my opinion)
- 1. The role of RM in improving business performance by adding value to business processes
- 2. Sarbanes-Oxley and the role of RM in corporate governance: demonstrating the integrity of records and recordkeeping systems
- 3. The role of RM in transitioning to a (nearly) all-digital recordkeeping environment
- 4. ISO 15489, benchmarking and best global practices for RM
- 5. The impact of September 11th and RM's role in enhancing information protection and security
13. In my opinion . . . The ten most important issues in RM today
- 6. RM's role in enhancing the enterprise accessibility of information
- 7. Implementing best practices for records retention: getting to (nearly) perfect in the five major recordkeeping environments
- 8. Bringing data life cycle to the desktop and to system applications through retention strategies
- 9. RM's role in the improved management of e-mail
- 10. The role of RM in long-term data retention / digital preservation
14. ERR is related, either directly or indirectly, to nearly all these issues!
- ERR is key to the future of RM.
- If RM can do ERR, it will be relevant to what
organizations want and need from this
professional discipline.
15. ERR: It's required for advanced professional practice in RM!
- If records managers presume to operate enterprise records retention programs at an advanced level of professional practice, they must:
- Bring records retention to every desktop in the organization.
- Bring records retention to every IT-managed production application that requires it.
16. ERR: The two key components
- 1. A policy prescribing how long electronic records must be retained, usually referred to as a records retention schedule.
- 2. Strategies and tools for implementing that policy in two major computing environments: at the desktop and for IT-managed production applications.
17. And, separate methodologies are required for data of temporary and permanent values
- Ultimately, every byte of data will be either deleted or otherwise rendered unprocessable, or it will be retained, sometimes indefinitely or for extended periods of time.
- Retention strategies are required for both sets of disposition actions.
18. Selling ERR must be based on a viable business case
- Given the fact that ERR has not been widely practiced, if it is to be done, it must be sold.
- So, is there a viable business case for ERR, and, if so, how compelling is it?
19. Both sides of the business case
- First, we will look at why ERR has not been widely practiced and the factors militating against a compelling business case.
- Then, we will look at the logic of ERR and why it's in an organization's best interest.
20. If getting rid of dead data is such a good idea, why hasn't it been widely practiced???
- A largely invisible problem: no physical / visible manifestations.
- In some situations, it's cheaper to retain than purge.
- For decades, IT had carte blanche to buy all the storage they wanted, no questions asked!
- No strong advocate among key stakeholder groups.
21. None of the key stakeholders in business computing strongly advocated ERR, so it didn't happen!
- IT departments: Data retention not a priority; no methodology or expertise.
- Vendors: Driven by customer priorities. Data retention not historically an issue. But this is changing!
- Data owners: Usually content to take whatever data they can get.
22. The business case for ERR is weak if the following four circumstances are present
- 1. The growth of data is modest and the budget for storage is stable.
- 2. Data retention needs do not exceed the life of the system.
- 3. Data retention does not pose a performance problem for the system or accessibility problems for the users.
- 4. Ongoing retention does not pose a legal risk to the organization.
23. The biggest perceived weakness in the business case: Media capacity and the cost of storage
- With media capacity increasing at 60% per year and the cost-per-megabyte declining at 35% per year, it is often assumed that there is no viable business case for ERR.
- But . . . this view neglects the explosive growth of data and increases in the total cost of data ownership!
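A minimal sketch of why falling media prices do not, on their own, settle the argument. It assumes that raw media spending scales as (data growth) x (price change), which is a simplification, and it ignores staff and administration costs entirely. The function names are illustrative; the 35% price decline comes from the slide, while the data growth rates tried in the loop are assumptions chosen for comparison.

```python
def media_spend_multiplier(data_growth: float, price_decline: float) -> float:
    """Year-over-year change in raw media spending.

    data_growth   -- annual growth in stored data (0.60 means +60%)
    price_decline -- annual drop in cost per megabyte (0.35 means -35%)
    """
    return (1.0 + data_growth) * (1.0 - price_decline)


def cumulative_multiplier(data_growth: float, price_decline: float, years: int) -> float:
    """Compound the yearly multiplier over a number of years."""
    return media_spend_multiplier(data_growth, price_decline) ** years


if __name__ == "__main__":
    # Assumed growth scenarios: modest, 60%, and 125% per year.
    for growth in (0.10, 0.60, 1.25):
        yearly = media_spend_multiplier(growth, 0.35)
        five_year = cumulative_multiplier(growth, 0.35, 5)
        print(f"data growth {growth:.0%}: x{yearly:.2f} per year, x{five_year:.2f} over 5 years")
```

Under these assumptions, media spending only shrinks when data grows slowly. Once growth reaches about 60% per year, the price decline is fully absorbed and spending on media alone starts to rise.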
24. The total cost of data ownership
- The total cost of data ownership is unknown or poorly understood in most organizations.
- The fundamental problem: The total cost of data ownership continues to rise, even while media costs continue to decline.
- As we've noted, increasing media capacity and declining cost per MB lead many IT specialists to conclude, erroneously, that there is no viable business case for ERR.
- Actually, what is happening is that these two trends create a virtually unlimited demand for, and consumption of, data storage!
25. The growth of data: The traditional IT response
- Buy a bigger electronic warehouse; that is, purchase additional storage capacity.
- This is counter-productive, because additional storage hardware requires additional staff support (which is the biggest component of storage costs), as well as other resources to administer the storage function.
26. The logic of the business case: Five key factors
- 1. How much does it cost to store and maintain all the computer data in the organization? When building the business case, the total cost of data storage and maintenance must include the cost of administering the data over its entire life cycle, not just the cost of the storage media on which the data resides.
- 2. How much of the current volume of stored data is useless; that is, it is inactive and is no longer needed for any purpose?
- 3. What would be the cost and benefits of disposing of all such data? On the other hand, what will be the costs, risks and benefits of not doing so?
27. The logic of the business case: Five key factors
- 4. What has been the rate of growth in stored data in recent years, and what growth rates are forecast in the foreseeable future? How will this growth affect the overall data storage situation, including the additional costs that must be borne?
- 5. To the extent that data storage is currently being under-managed or even mismanaged, how could the systematic disposal of useless data contribute to better overall storage management? What costs and benefits would be quantifiable as a result of this?
28. Key trends in data storage
- As we've noted, storage capacity is growing at over 60% per year while performance improves at less than 10% per year.
- Average annual growth in storage demand for all platforms is 50 to 60%.
- The value and criticality of data is increasing exponentially while the percentage of data that is actually managed is declining.
29. Key trends in data storage
- Storage device capacity is growing more than 10 times faster than device performance.
- Storage is growing faster than disks are getting cheaper.
- More data is being accumulated for longer periods of time without effective management of its life cycle.
- SOURCE: Computer Technology Review.
30. The explosive and unprecedented growth in data storage
- Hard disk drive capacity and server capacity are doubling each year.
- SOURCE: How Much Information study, Univ. of California, Berkeley, 2003.
- Relational databases are growing at the rate of 125% each year.
- SOURCE: META Group.
31. Why the unprecedented growth of data???
- Accelerated digitization of business processes throughout the enterprise.
- Greater capacity of data storage systems (60 percent annual growth).
- Corresponding declining cost / megabyte (declining at 30 to 40% annually).
- Ten years ago, the cost / MB was approx. $15; today, it's just a few cents!
32. The explosive and unprecedented growth in data storage
- Annual storage growth rates are expected to range from 60% to over 100% annually for the next five years.
- Generally, the quantity of stored data doubles every 2 to 3 years.
- SOURCE: Computer Technology Review
33. The explosive and unprecedented growth in data storage
- The META Group forecasts data increases of a hundred-fold within the next five years.
- The total expenditures necessary to accommodate this growth will escalate more than ten-fold during the next five years.
- Over the next five years, given a six-fold decrease in price per terabyte and a hundred-fold increase in the quantity of data to be stored, a thirteen-fold increase can be expected in total data management costs.
34. The explosive and unprecedented growth in data storage
- The total cost of managed storage now rivals or exceeds the investment in systems and servers, and often accounts for 50% or more of total IT spending.
- Data storage costs will rise to three-quarters of all IT spending over the next few years.
- SOURCE: Storage Inc.
35. Simple formulae for calculating the cost of data ownership
- For every dollar spent on acquiring storage hardware, another $5 to $7 will be required to operate the devices over their service lives.
- For every $1 per MB spent on disk storage, the total spent to manage that storage ranges from $3 to $8 per MB per year.
- The cost of managing storage hardware ranges from 2 to 10 times the cost of its acquisition.
- SOURCE: These metrics were gleaned from various computer journals.
36. The cost-avoidance impact of ERR: Buying a new disk drive
- Cost of acquisition: $10,000.
- Annual cost of operation: $3 to $8 for every dollar spent on the hardware.
- Average service life: 5 years.
- Total cost: $250,000.
- Cost if this device was not needed / never acquired: ZERO!!!
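The arithmetic behind the disk drive example can be sketched as follows. This is an illustration under stated assumptions rather than the presenter's own worksheet: the function names are hypothetical, and the mid-range rate of $5 per hardware dollar per year is an assumption chosen because it reproduces the $250,000 figure for operation alone.

```python
def lifetime_operating_cost(acquisition: float, annual_rate: float, years: int) -> float:
    """Operating cost over the service life.

    annual_rate is dollars of operating cost per dollar of hardware, per year.
    """
    return acquisition * annual_rate * years


def lifetime_total_cost(acquisition: float, annual_rate: float, years: int) -> float:
    """Acquisition plus operating cost over the service life."""
    return acquisition + lifetime_operating_cost(acquisition, annual_rate, years)


if __name__ == "__main__":
    acquisition = 10_000   # from the slide
    service_life = 5       # years, from the slide
    # 3 and 8 are the slide's range; 5 is an assumed mid-range value.
    for rate in (3, 5, 8):
        operating = lifetime_operating_cost(acquisition, rate, service_life)
        total = lifetime_total_cost(acquisition, rate, service_life)
        print(f"${rate}/yr per hardware dollar: operating ${operating:,.0f}, total ${total:,.0f}")
```

Across the quoted range, operating costs run from $150,000 to $400,000 over five years, so the purchase price is a small fraction of what the device costs to own.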
37. Cost-avoidance resulting from better allocation of storage resources
- A company has 24 terabytes of data residing on its high-end storage array, but only 17 TBs require very high performance and availability.
- Thus, the company migrates the 7 TBs to near-line servers, which cost one-fourth the price of high-end storage.
- Re-deployment of these 7 TBs costs $140k, but frees $560k of high-end storage for re-use, for an immediate net payback of $420k, an effective ROI of 200%.
- SOURCE: SMS, 2003.
38. The business case for ERR: Conclusions
- The case is viable and based on sound logic.
- However, ERR doesn't sufficiently affect competitive position or quality of service to rise to a priority issue for IT.
- In other words, it's in the realm of a good idea: nice to do rather than have to do!
39. ERR and data life cycle management
- There's never been a better time than now for information lifecycle management, because the growth of information, propelled by business continuity, compliance, and the proliferation of unstructured content such as rich media and e-mail, is far outpacing the growth of IT budgets. ILM helps companies around the world deal with this growth while lowering the overall cost of data ownership.
- Joe Tucci, CEO, EMC
40. ERR's contribution to better data life cycle management
- Identification of data aging rates, access requirements, protection levels, and data values and how they change over time.
- Development of data retention and storage policies and procedures, prescribing where data should be stored in the several stages of its life cycle.
- Automatic, policy-based migration of aging data onto optimum storage platforms.
- Purging of expired data as per the retention policy.
41. The four stages in the life cycle of data
- Active stage: Typically lasts for 30 days and requires high-performance disk storage.
- Reference stage: Typically lasts for 60 days and usually requires disk and automated tape storage.
- Archival stage: Frequently lasts up to 7 years and sometimes longer. Storage solutions include non-automated tape and the newer ATA disks.
- Delete / destroy stage: The end of the data life cycle.
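The four stages above can be sketched as a simple age-based classifier, the kind of rule a policy-based migration tool might apply. Several things here are assumptions and not part of the slide: the stages are treated as consecutive and measured from the date the data was created, the archival stage is capped at exactly 7 years, and the names `Stage` and `stage_for_age` are hypothetical. In practice the archival limit would come from the retention schedule, not from a constant.

```python
from enum import Enum


class Stage(Enum):
    ACTIVE = "active"
    REFERENCE = "reference"
    ARCHIVAL = "archival"
    DELETE = "delete / destroy"


ACTIVE_DAYS = 30        # typical length of the active stage
REFERENCE_DAYS = 60     # typical length of the reference stage
ARCHIVAL_YEARS = 7      # assumed upper bound; the real value is set by the retention schedule


def stage_for_age(age_days: int) -> Stage:
    """Map the age of a body of data, in days since creation, to a life cycle stage."""
    if age_days < 0:
        raise ValueError("age_days must be non-negative")
    if age_days <= ACTIVE_DAYS:
        return Stage.ACTIVE
    if age_days <= ACTIVE_DAYS + REFERENCE_DAYS:
        return Stage.REFERENCE
    if age_days <= ARCHIVAL_YEARS * 365:
        return Stage.ARCHIVAL
    return Stage.DELETE


if __name__ == "__main__":
    for age in (10, 45, 400, 3000):
        print(age, "days ->", stage_for_age(age).value)
```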
42. Probability of data access by stage of life cycle
- Active stage: > 0.5
- Reference stage: > 0.1
- Archival stage: < 0.01
- Delete / destroy stage: < 0.001
- SOURCE: Computer Technology Review, 2003.
43. The major driver for ERR
- Unrestrained retention of electronic records poses an unacceptable risk for many organizations.
- This risk is greatest for the desktop and email, less so for production applications.
44. The biggest risk of unrestrained retention: The desktop!
- While undefined / indefinite retention of data residing in IT-managed production applications poses some degree of risk and often has adverse storage or other consequences, the single biggest ERR challenge is the desktop!
- The desktop is an RM basket case!
45. Needed: Desktop Records Management for Dummies
- There are approximately 100 million office desktop users in the USA.
- Very few of them are furnished with good guidance concerning how to manage and retain these records.
- This is, arguably, the single greatest failure in RM today!
- But . . . the desktop is where over half of all digital content resides!
46. Bringing RM and retention to the desktop
- Electronic records management will be successfully implemented at the desktop level when every desktop user will be routinely declaring records on a daily basis, as a part of their everyday business processes. All declared records will have a correctly assigned classification code, with a 95% or higher accuracy rate. And the disposition of all declared e-records will be governed under approved retention rules that are fully compliant with regulatory requirements. And finally, the volume of stored e-docs will be shrinking rather than growing, because large scale, accountable destruction will be occurring routinely.
- Bruce Miller, IBM
47. A specious argument: Bringing ERR to the desktop is not feasible
- Another way to increase storage capacity is for IT administrators to demand that users themselves clear out their older files from the system (with the added threat that files over a certain age will be automatically deleted from the system if users take no action). This is clearly ludicrous. The cost to a business to (a) spend considerable amounts of time reviewing all their data . . .
- SOURCE: Document World
48. A specious argument: Bringing ERR to the desktop is not feasible (cont'd)
- . . . and then (b) to have to make decisions about the potential value of data and act accordingly is inestimable. Users will always believe they use their data more often than they do. What is really needed is a method of automatically moving older, less frequently used data from high performance disk drives attached to the server, to less expensive secondary storage devices.
- SOURCE: Document World
49. Bringing ERR to the desktop: Six basic principles
- 1. Responsibility for retention rests with individual desktop users.
- 2. Retention should be content-driven / records series related (users are required to comply with the retention schedules).
50. Bringing ERR to the desktop: Six basic principles
- 4. Users may dispose of all desktop documents of unofficial character at their discretion (provided they do not exceed the retention requirement for official documents).
- 5. Most desktop documents are of relatively short-term value. Consider a default retention of 2 years.
51. Bringing ERR to the desktop: Six basic principles
- 6. Desktop Purge Days are strongly recommended as the single most effective step in retention implementation.
- Labor requirements: Between 4 and 16 hours per user per year, exclusive of email.
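As one way to support a desktop purge day, the sketch below lists files older than the suggested 2-year default so a user can review them. It rests on assumptions that should be stated plainly: file modification time is only a proxy for a document's age, real retention is content-driven and set by the retention schedule, and the function name `purge_candidates` is hypothetical. The script only reports candidates and deletes nothing.

```python
from datetime import datetime, timedelta
from pathlib import Path
from typing import List, Optional

DEFAULT_RETENTION_DAYS = 2 * 365  # the 2-year default suggested for desktop documents


def purge_candidates(
    root: Path,
    now: Optional[datetime] = None,
    retention_days: int = DEFAULT_RETENTION_DAYS,
) -> List[Path]:
    """Return files under root whose last modification is older than the retention period."""
    now = now or datetime.now()
    cutoff = (now - timedelta(days=retention_days)).timestamp()
    return sorted(
        path
        for path in root.rglob("*")
        if path.is_file() and path.stat().st_mtime < cutoff
    )


if __name__ == "__main__":
    # Report only: the user decides what is an official record before anything is removed.
    for path in purge_candidates(Path.home() / "Documents"):
        print(path)
```

Keeping the tool in report-only mode reflects the first principle above: responsibility for the retention decision stays with the individual user.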
52. User's Guide to Managing Electronic Records at the Desktop Level
- Bringing Professional Records Management to the Desktop
- Filing Records Created by Desktop Applications
- Protecting Records Created by Desktop Applications
- General Retention Policies for Electronic Records
- Retention of Records Created by Desktop Applications
- Responsibilities for Disposing of Electronic Records
- Records Management Practices for Email
- Saving E-mail
- Deleting E-mail
- Using AutoArchive to Save and Delete Messages
53. Bringing ERR to IT-managed System Applications: Six Steps
- 1. Obtain cooperation / participation of IT
- 2. Collect summary data describing system applications
- 3. Solicit data via questionnaire
- 4. Interview applications developers and data owners
- 5. Integrate data retention requirements into enterprise schedules
- 6. Implement application retention requirements under a new IT policy for data retention
54. Step 1: Obtain cooperation / participation of IT department
- Retention schedule development: 3 to 9 month project
- Retention implementation: Integrate data purge functionality into application environments
- A multi-year project
55. Step 2: Collect summary data describing system applications
- You need:
- A general description of all system applications IT is running, usually contained in an Applications Portfolio or Systems Directory.
- A list of data owners and applications developers.
56. Step 2: Summary data describing system applications (Sample)
- System name: Personnel Information Management System
- Business function: Comprehensive data about the employment history / current status of individual employees.
- Data owner: Jane Smith
- Applications developer: John Doe
- Platform: UNIX
- Applications software: PeopleSoft
- Database: Oracle
- Disk size / space: 32 GB
57. Certain applications may be disregarded as non-schedulable
- Schedule real business records in system applications.
- Applications containing pass-through data (data feeds, system interfaces) are not usually scheduled since they are not, themselves, repositories of retained business records.
- Schedule the lakes and ponds, not the rivers and streams!
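The application inventory and the pass-through rule can be sketched as a small data structure plus a filter. This is illustrative only: the class name `ApplicationRecord`, the `pass_through` flag, and the second application in the example are hypothetical, while the first record reuses the sample values from the Step 2 slide.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ApplicationRecord:
    """One row of the summary data collected for each system application."""
    system_name: str
    business_function: str
    data_owner: str
    developer: str
    platform: str
    software: str
    database: str
    disk_gb: float
    pass_through: bool = False  # hypothetical flag: True for data feeds and interfaces


def schedulable(apps: List[ApplicationRecord]) -> List[ApplicationRecord]:
    """Keep the 'lakes and ponds': applications that hold retained business records."""
    return [app for app in apps if not app.pass_through]


inventory = [
    ApplicationRecord(
        system_name="Personnel Information Management System",
        business_function="Employment history / current status of individual employees",
        data_owner="Jane Smith",
        developer="John Doe",
        platform="UNIX",
        software="PeopleSoft",
        database="Oracle",
        disk_gb=32,
    ),
    # Hypothetical interface application, included only to show the filter at work.
    ApplicationRecord(
        system_name="Payroll Interface Feed",
        business_function="Passes data between two systems",
        data_owner="Jane Smith",
        developer="John Doe",
        platform="UNIX",
        software="Custom",
        database="None",
        disk_gb=1,
        pass_through=True,
    ),
]

if __name__ == "__main__":
    for app in schedulable(inventory):
        print(app.system_name)
```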
58. Solicit data via questionnaire: Six key questions
- 1. Briefly describe the business process or function performed by this application.
- 2. Has any provision been made to "flag" or identify inactive data in the application?
- 3. What data retention practices are currently in place for this application, if any?
- 4. Is any inactive data routinely archived or otherwise deleted or "purged" from this application?
59. Solicit data via questionnaire: Six key questions
- 5. If yes, describe the functionality for purging inactive data and indicate when this occurs. For example, can the software effectuate a system-wide purge of expired data based on specified, pre-defined retention criteria, or can it only accomplish manual deletions of individual data records?
- 6. Please share any opinions concerning how long you believe inactive data from this application should be retained for operational or business purposes, and elaborate on the reason(s) justifying these opinions.
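Question 5 distinguishes a system-wide purge driven by pre-defined retention criteria from manual, record-by-record deletion. A minimal sketch of the former, using an in-memory SQLite database, might look like this. Everything specific in it is an assumption made for illustration: the `records` table layout, the series names, the retention periods in `RETENTION_YEARS`, and the choice to count retention from a `closed_on` date.

```python
import sqlite3
from datetime import date

# Hypothetical retention rules: records series -> years to retain after closing.
RETENTION_YEARS = {"invoices": 7, "web_logs": 1}


def cutoff_date(today: date, years: int) -> date:
    """The date `years` before today; Feb 29 falls back to Feb 28."""
    try:
        return today.replace(year=today.year - years)
    except ValueError:
        return today.replace(year=today.year - years, day=28)


def purge_expired(conn: sqlite3.Connection, today: date) -> int:
    """Delete every record whose retention period has expired; return the count."""
    deleted = 0
    for series, years in RETENTION_YEARS.items():
        cutoff = cutoff_date(today, years).isoformat()
        # ISO dates stored as text compare correctly as strings.
        cursor = conn.execute(
            "DELETE FROM records WHERE series = ? AND closed_on < ?",
            (series, cutoff),
        )
        deleted += cursor.rowcount
    conn.commit()
    return deleted


def demo() -> tuple:
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE records (id INTEGER PRIMARY KEY, series TEXT, closed_on TEXT)"
    )
    conn.executemany(
        "INSERT INTO records (series, closed_on) VALUES (?, ?)",
        [
            ("invoices", "1998-03-01"),
            ("invoices", "2003-06-30"),
            ("web_logs", "2005-01-15"),
            ("web_logs", "2006-02-01"),
        ],
    )
    deleted = purge_expired(conn, date(2006, 4, 19))
    remaining = conn.execute("SELECT COUNT(*) FROM records").fetchone()[0]
    conn.close()
    return deleted, remaining


if __name__ == "__main__":
    print(demo())
```

A production purge would also need legal hold checks and an audit trail of what was destroyed, both of which are left out of this sketch.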
60. Step 4: Interview applications developers and data owners: Four key issues
- (1) Validate the data contained on the survey forms for each application
- (2) Define the electronic records series: the schedulable bodies of data contained in each application
- (3) Define the retention values of each electronic records series and to make preliminary retention decisions for each of the
- (4) Discuss strategies and functionality for implementing the retention periods.
61. Step 5: Integrate data retention requirements into enterprise schedules: Three format options
- (1) Media specific (separate schedules for paper and electronic)
- (2) Media independent (one schedule that doesn't distinguish between media types)
- (3) Multimedia (one schedule but separate, specific guidance for redundant data on multiple media)
62. Step 6: A new IT policy for data retention implementation
- For most organizations, the only feasible strategy for incorporating purge functionality into all applications software requiring it is to adopt a new IT policy requiring that such functionality be incorporated either at the time of initial systems design or at the time of the next technology upgrade.
63. Step 6: A new IT policy for data retention implementation
- This policy is designed to make all applications retention-capable within three to five years.
- This would be a huge victory for ERR!!!