Title: MARC COOP Workshop
1. MARC COOP Workshop
2. Dan Esser, CBCP
Columbia, MO
daniel_at_contingencynow.com
3. Expectations for Session
- Help those who have started expand their plans
- Help those who have not started with how to start
- Use the Template to launch the COOP Planning process
- The session will not tell you everything you need to know, but will put you on the right track
4. Format
- For the most part there will be an overview of points to be covered. After that we will cover each in detail. You may hear some things more than once.
- If I'm going too fast, tell me to slow down.
- If I'm going too slow, tell me to speed up.
- If I digress, tell me to get back on point.
- If you have a question or comment that is pertinent to what we are discussing, stop me and ask or comment.
5. Points for Discussion
- Definition of COOP
- Why COOP is Important
- How to Start / Expand Current Planning
- Things to Add to the Template
- Why "Complete" Isn't
- Things you want to talk about...
6. Jackson Article on Making Plans Actionable
- Continuity Plans are often not used during Testing or Recovery
- Why Not?
  - The Information is There
  - The Actions to Recover are Not
7. RTO / RPO: Definitions We Need Throughout
- RTO: Recovery Time Objective
  - The time within which a business process must be restored after a disaster to avoid unacceptable consequences associated with a break in service
- RPO: Recovery Point Objective
  - The point in time to which data must be recovered; the "acceptable loss" in a distressed situation
8. Essential Functions
- C-Now: Functions that, if interrupted beyond the RTO, would cause serious or irreparable harm to people or institutions in the jurisdiction.
- Template: Functions that enable city/county agencies to provide vital services, exercise civil authority, maintain the safety and well-being of the citizens, and sustain the industrial/economic base in an emergency.
9. Essential Functions (cont.)
- Just keeping the essential stuff going is not enough
- Each Jurisdiction has to be planning toward full recovery
- "Not Essential" does not mean "Not Important"
10. COOP Review
- What is COOP?
- Why COOP?
- What are the Components?
- What are Sources of Risk?
11. What is COOP?
- COOP is the ability to continue essential functions, or restore them within a predetermined time frame (RTO), with data loss held within the maximum acceptable loss (RPO).
- Creating a COOP Plan is about making all the decisions that can be made before an incident actually happens.
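As a rough illustration of how the two targets are measured, here is a minimal Python sketch. The function names, the timestamps, and the 4-hour and 24-hour targets are hypothetical values chosen for the example; they do not come from the Template.

```python
from datetime import datetime, timedelta

def met_rto(incident: datetime, restored: datetime, rto: timedelta) -> bool:
    """True if the function was restored within its RTO, counted from the incident."""
    return restored - incident <= rto

def met_rpo(incident: datetime, last_good_backup: datetime, rpo: timedelta) -> bool:
    """True if the data lost since the last usable backup is within the RPO."""
    return incident - last_good_backup <= rpo

# Hypothetical incident timeline (illustrative values only)
incident = datetime(2024, 3, 4, 9, 0)
restored = datetime(2024, 3, 4, 12, 30)     # outage lasted 3.5 hours
last_backup = datetime(2024, 3, 3, 23, 0)   # 10 hours of work at risk

print("RTO met:", met_rto(incident, restored, timedelta(hours=4)))
print("RPO met:", met_rpo(incident, last_backup, timedelta(hours=24)))
```

RTO looks forward from the incident (how long until service is back), while RPO looks backward from it (how much recent data can be lost).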
12. Why COOP?
- At a Departmental or Building Level
- Any Department that cannot perform its essential functions
  - Does not help the rest of the jurisdiction
  - May impede the rest of the jurisdiction
- Applies to Small and Large Incidents
- YES, IT CAN HAPPEN TO YOU!
13. COOP Components
- Department/Building Safety Response
- Department/Building Incident Response
- Continuation/Recovery of Essential Functions
- Recovery of All Functions
- Return to Normal Operations / Permanent Space
14. Hazards
- Natural Hazards
- Earth, Wind, Fire, Water
- Human Intervention
- Terrorism
- Human Error
- Health Hazards
- Infrastructure Disruptions
15. Dept./Building Safety Response
- Knowledge and Practice of
- Evacuation to Safe Area
- Accounting for Occupants and Visitors
- Shelter in Place
- Establishment of an Organizational Structure and
Occupant Emergency Plan
16. Dept./Building Incident Response
- Structure Related to Size of Jurisdiction
- At its Most Basic
- Take Care of People
- Gather Information
- Assess the Situation
- Determine and Act on Next Steps
- Information to Management
- Information to Public/Press by Jurisdiction
Leaders
17. Building on the Current Template
- Components outlined in FPC 65 are present
- Legal Underpinnings are present
- Annexes are Repository for Recovery Information
- Open Ended - Jurisdictions can add things they
need or re-order what is there
18. Getting Started
- Six Major Pieces for creating an Actionable COOP
- Preparation / Management Buy-In
- Impact Analysis / Risk Assessment
- Mitigation Strategies
- Operations Restoration
- Information Technology Restoration
- Regular Exercise / Learning / Updating
19. Preparation / Management Buy-In
- Identify COOP Plan Coordinator
- Identify other resources available to help with planning and their roles
- Management Support and Funding
20. Impact Analysis / Risk Assessment
- Identify functions
- Determine scope
- Set initial RTOs and RPOs
- Identify essential functions
- Identify key enabling Technologies and Processes
- Identify Risks that could disable service
- Identify Mitigation strategies
- Identify Recovery Strategies
21. Mitigation? What's That?
- How to keep an incident from being a disaster
- Workarounds unique to each organization
- Examples for
  - Payroll Direct Deposit
  - Payroll Checks
  - Power Flickers during storms
  - Power Outage
  - Everything is stored in the office
  - Homes and Cars
22. COOP for Operations
- Manage the incident and start the recovery process
- Plan for total building loss
- Consider how to provide alternate work space
- Set final function RTOs and RPOs and identify critical services
- Identify technologies required for critical services
- Look out for SPOFs (single points of failure)
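One way to look for SPOFs is to list the technologies each critical service requires and flag any that have no alternate. The sketch below is only an illustration; the service names, technologies, and alternate counts are hypothetical.

```python
from collections import defaultdict

def single_points_of_failure(service_needs, alternates):
    """Map each technology with no alternate to the critical services that need it.

    service_needs: {service: [technology, ...]}
    alternates:    {technology: number of backup options}; missing means none
    """
    affected = defaultdict(list)
    for service, technologies in service_needs.items():
        for tech in technologies:
            if alternates.get(tech, 0) == 0:
                affected[tech].append(service)
    return dict(affected)

# Hypothetical inventory (illustrative values only)
service_needs = {
    "Payroll": ["finance server", "check printer"],
    "Dispatch": ["PBX", "radio"],
}
alternates = {"finance server": 1, "radio": 2}

spofs = single_points_of_failure(service_needs, alternates)
for tech, services in spofs.items():
    print(f"{tech}: no alternate, needed by {', '.join(services)}")
```

Each flagged item is a candidate for a workaround, a spare, or a pre-arranged replacement source.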
23. COOP for Operations (cont.)
- Create workarounds for absence of technology
- Identify services that can go forward with limited or no technology
- Pre-arrange for ongoing communications
  - A capability to put messages on incoming phone lines, located outside the PBX equipment of the jurisdiction
  - Pre-arranged capacity to move lines to an alternate place
- Identify and pre-position key start-up items or supplies (mitigation)
24. COOP for Operations (cont.)
- Set up contact information and processes for employees and officials
- Set up contact information and processes for key vendors/suppliers
- Set up processes for quarterly or semi-annual plan updates and annual testing
25. COOP for Information Technology
- Coordinate IT recovery with the overall incident manager or team
- Identify critical servers and associated infrastructure
  - Base this on user functional needs and which parts of the technology support those
- Arrange for equipment and a place to restore critical servers
  - May include redundant systems located in a different place for those things that must be up 24/7/365
26. COOP for Information Technology (cont.)
- Pre-position supplies, documentation of restoration steps and copies of software
- Set up contact information and processes for IT employees and officials
- Set up processes for quarterly or semi-annual updates and annual testing
27. Exercising and Testing
- It's a TEST! EEK! -- Failure Anxiety!
- Sometimes this causes people to plan the test rather than test the plan
- Establish a Safe Environment for the exercise
28. OK, but where do I start?
29. Step 1 - Management Buy-In
- Things you may hear
  - There's no budget for that.
  - Why spend money on something that may not happen?
  - We have insurance. Why do we need this?
  - Information Technology has all of that handled. Everything is backed up.
  - Get out of my office. I don't want to talk about this.
  - Your Favorites?
30. Management Buy-In (cont.)
- Leadership needs to know the impact of not having essential functionality
- If there is resistance to the concept, ask for a Pilot study to see if more is needed
  - Low Cost and can be restricted to a few buildings or departments
  - Results will determine whether more Analysis is needed
  - Pick High Impact Departments (IT, Payroll, Accounts Payable, Tax Collection, Courts)
31. Management Buy-In (cont.)
- Note the need to identify
  - Essential Functions that cannot be interrupted
  - Essential Functions that can be interrupted but must be back soon, and how soon that is
  - Important Functions whose recovery can be delayed, and for how long
  - Systems, processes and assets already in place
  - Effect of the absence of functions on
    - Departments and buildings of the jurisdiction
    - Residents and businesses in the jurisdiction
32. Management Buy-In (cont.)
- The information gathering step provides data for Impact Analysis
- If doing the entire jurisdiction, it may be more than a one-person job
  - Appoint a Study Coordinator
  - Identify others of like mind
  - Put a team together
  - If a team is not practical, consider surveys
33. Step 2 - Scope
Impact Analysis Information Gathering
- For the Jurisdiction
  - How many buildings?
  - How many departments?
  - Which departments are housed in multiple buildings?
  - How many employees?
  - What are the dependencies?
  - How many Servers and Where are they?
  - Who operates in normal business hours?
  - Who operates outside normal hours?
  - Are there Organization Charts?
34. Step 3 - Functions Ranking
Impact Analysis Information Gathering
- Contact knowledgeable people from areas being studied
- ID functions and priorities according to depts.
- Tools available include Worksheets 1-5 and Annex A
- Try to get both RTO and RPO to assist with ranking
35. Functions Ranking (cont.)
Impact Analysis Information Gathering
- Consider Revising Annex A to use something similar to this
- Use it to give functions an initial ranking
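The worksheet this slide points to is not reproduced in this text. As a stand-in, here is a minimal Python sketch of one possible initial ranking: shortest RTO first, with ties broken by shortest RPO. The function names and hour values are hypothetical, and a jurisdiction may well choose a different ranking rule.

```python
from dataclasses import dataclass

@dataclass
class Function:
    name: str
    rto_hours: float  # how quickly the function must be back
    rpo_hours: float  # how much data loss is acceptable

def initial_ranking(functions):
    """Order functions by shortest RTO, then by shortest RPO."""
    return sorted(functions, key=lambda f: (f.rto_hours, f.rpo_hours))

# Hypothetical functions gathered from departments (illustrative values only)
functions = [
    Function("Tax Collection", 72, 24),
    Function("911 Dispatch", 0, 0),
    Function("Payroll", 72, 4),
]

for rank, f in enumerate(initial_ranking(functions), start=1):
    print(rank, f.name, f"RTO {f.rto_hours}h", f"RPO {f.rpo_hours}h")
```

This is only a starting order; later steps revisit the RTOs and RPOs once the impacts and IT capabilities are known.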
36. Step 4 - IT: Set Server Priorities
Impact Analysis Information Gathering
- Use the functional priorities to set server priorities
- Calculate preliminary RTOs / RPOs for each server in light of
  - Current Backup Schedules
  - Current Replacement Hardware Availability
  - Current Alternate Facility Availability
  - Current Required building infrastructure
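One way to carry functional priorities over to servers is to give each server the tightest RTO and RPO among the functions it supports. The following Python sketch assumes that rule; the server names, function names, and hour values are hypothetical.

```python
def server_targets(function_targets, server_supports):
    """Give each server the tightest RTO and RPO of the functions it supports.

    function_targets: {function: (rto_hours, rpo_hours)}
    server_supports:  {server: [function, ...]}
    Returns {server: (rto_hours, rpo_hours)}, shortest RTO first.
    """
    targets = {}
    for server, funcs in server_supports.items():
        rto = min(function_targets[f][0] for f in funcs)
        rpo = min(function_targets[f][1] for f in funcs)
        targets[server] = (rto, rpo)
    # Sort so the servers that must be restored soonest come first.
    return dict(sorted(targets.items(), key=lambda item: item[1]))

# Hypothetical data (illustrative values only)
function_targets = {
    "Payroll": (72, 4),
    "Tax Collection": (72, 24),
    "Court Records": (24, 1),
}
server_supports = {
    "finance-db": ["Payroll", "Tax Collection"],
    "courts-app": ["Court Records"],
}

priorities = server_targets(function_targets, server_supports)
for server, (rto, rpo) in priorities.items():
    print(server, f"RTO {rto}h", f"RPO {rpo}h")
```

These are preliminary targets only; the backup schedule, hardware, facility, and infrastructure items listed above determine whether they can actually be met.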
37. Step 5 - Reality Check
Impact Analysis Information Gathering
- Virtually everything the jurisdiction does is going to depend on IT in some form.
- How do the RTOs and RPOs desired by the operating departments compare with the Information Technology recovery capabilities?
- Boil out any happy talk about what people think they can do without having ever tested it.
- If the needs of the operating departments cannot be met, what next steps are appropriate?
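The comparison between what departments want and what IT can deliver can be laid out as a simple gap list. This Python sketch is an illustration under assumed inputs; the function names and hour values are hypothetical, and the capability figures should come from tested restores rather than estimates.

```python
def recovery_gaps(desired, capability):
    """Return functions whose desired RTO/RPO IT cannot currently meet.

    desired / capability: {function: (rto_hours, rpo_hours)}
    Returns {function: (rto_gap_hours, rpo_gap_hours)} for unmet targets only.
    """
    gaps = {}
    for func, (want_rto, want_rpo) in desired.items():
        can_rto, can_rpo = capability[func]
        rto_gap = max(0, can_rto - want_rto)
        rpo_gap = max(0, can_rpo - want_rpo)
        if rto_gap or rpo_gap:
            gaps[func] = (rto_gap, rpo_gap)
    return gaps

# Hypothetical figures (illustrative values only)
desired = {"Payroll": (72, 4), "Court Records": (24, 1)}
capability = {"Payroll": (48, 24), "Court Records": (24, 1)}

gaps = recovery_gaps(desired, capability)
for func, (rto_gap, rpo_gap) in gaps.items():
    print(f"{func}: RTO short by {rto_gap}h, RPO short by {rpo_gap}h")
```

Each gap is a decision point: either improve the recovery capability or negotiate a longer target with the department.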
38. Step 6 - Analysis of the Impacts
- For each essential operating function, what is the impact on the jurisdiction if it is not functioning?
- When is x hours/days/weeks intolerable?
- Revenue lost
- Public dissatisfaction with services
- Bills not paid
- Danger to the public
- Danger to employees
- Lost Records?
39. Analysis of the Impacts (cont.)
- Based on analysis, revisit and revise RTOs / RPOs
  - Operations (work with managers)
  - Information Technology
- It is an iterative process
40. Risk Assessment
- The Template has good reviews of Natural Hazards in the Kansas City Area.
- The review of Technological Hazards is also worthwhile, and jurisdictions need to review these for each building.
- Where an asset is at high risk, it is important to take steps to mitigate the risk or relocate the asset.
- An available tool (handout) has Risk Analysis Process questions each jurisdiction can use.
41. Situational Awareness
- Each Department Head / Building Manager needs to be aware of what is nearby and what threats it might pose.
  - Railroads
  - Rivers (Barge Traffic, Dams, Levees)
  - Highways
  - Airports
  - Federal Office Buildings
42. Step 7 - Results to Leaders
- Present differences between current recovery ability and operational RTOs / RPOs for essential functions
- Present the costs of being unprepared: money, human cost, citizen dissatisfaction, bad press
- Identify existing capacities and strengths
- Resist assumptions that things will happen without preplanning
43. HOORAY (sort of)
- Leaders say go ahead with COOP
- Don't spend much money
- Don't give up hope. Not everything happens at once.
44. Step 8 - Plan as Repository
- Earlier we noted the COOP is a Repository for key information
- Complete the Annex Development Process
- Add new annexes as necessary.
45. Adding to Annexes
- You already have some of this
- Each jurisdiction decides how much they need
- List of Essential Functions with revised RTOs and RPOs in Recovery Priority order by building or department (update annually)
- Current Organization Chart (update annually)
- Contact Information and a jurisdiction-chosen method for using it (call trees to vendor-supplied systems)
  - Employees (update semi-annually)
  - Local Emergency Numbers (update annually)
  - Regulatory entities (update annually)
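Because each annex item carries its own update cycle, a small check can flag items that are past due. The Python sketch below is a minimal illustration; the item names, dates, and the choice of 365 and 182 days for the two cycles are assumptions made for the example.

```python
from datetime import date, timedelta

# Assumed day counts for the update cycles named on the slides
REVIEW_INTERVAL_DAYS = {"annual": 365, "semi-annual": 182}

def overdue_items(items, today):
    """Return names of annex items whose last update is older than their cycle.

    items: list of (name, cycle, last_updated) tuples
    """
    overdue = []
    for name, cycle, last_updated in items:
        limit = timedelta(days=REVIEW_INTERVAL_DAYS[cycle])
        if today - last_updated > limit:
            overdue.append(name)
    return overdue

# Hypothetical annex inventory (illustrative values only)
items = [
    ("Organization Chart", "annual", date(2023, 1, 15)),
    ("Employee Contacts", "semi-annual", date(2023, 11, 1)),
    ("Local Emergency Numbers", "annual", date(2023, 9, 1)),
]

overdue = overdue_items(items, today=date(2024, 3, 1))
print("Overdue for update:", overdue)
```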
46. Adding to Annexes (cont.)
What? Go Kit?
- Prepositioned Materials (Off Site Boxes)
  - Department start up (forms, schedules, reference material, custom stamps)
  - Jurisdiction Command Center if needed (tool kit)
- Internal Dependencies (departments that depend on each other) (update annually)
- External Dependencies (Vendors, Other Jurisdictions) (update semi-annually)
- Work-in-progress recovery procedures (update annually)
- Minimum Standards for Replacements (update annually)
  - Workstations
  - Desk Equipment
47. Adding to Annexes (cont.)
- Desktop PC Hardware
- Laptop PC Hardware
- Desktop and Laptop Software Configuration
- Personal Workstation Printers
- Networked Printers
- Copiers
- Fax Machines
- Scanners
48. Adding to Annexes (cont.)
- Vital Records and Databases
- Records that, if absent, would cripple services or the legal position of the jurisdiction
- List by Department (identify if multiple users)
- Classify Paper or Electronic or Both
- Develop Strategies to Recover or Rebuild
- Work with IT on backup/archiving for Electronic Records
- In the plan, note each vital record, its RTO and RPO
- There are going to be some paper records that cannot be backed up or stored off site, and the jurisdiction may have to assume the risk of loss
49. Step 9 - Management Structure
- Establish who can declare and activate COOP
- Establish EOC or Command Center, as needed
  - Coordinates recovery efforts between IT and Operations
  - Handles employee notifications / updates
  - Uses prepositioned supplies if needed (tool kit)
  - Reports to the top authority in the jurisdiction
  - Initiates damage assessment and repair of primary facilities
- One objective of the EOC is to put itself out of business as soon as possible
50. Where's Information Technology?
- This is the point in the planning process where IT planning becomes different from Operations planning
- We will return to IT Step 10 after the remainder of the Operations steps
51. Step 10 - Succession Planning
- Annex F in the Template
- Not necessary to name a Successor for every position
- Concentrate on the top 5 to 10 plus heads of critical departments, but go deeper if you want to
- For top positions, try to name successors not physically located right next to those they might succeed
52. Step 11 - Plan to Secure/Replace Assets
- Assess Damage to building(s)
- Assess Loss of Contents
- Streamline the Purchasing Process
- Adequate Controls
- Special Accounts
- Safeguard jurisdiction Insurance Information
- Use the standards for replacement equipment
53. Step 12 - Media Policy
- Name the Spokesperson
- Limit Interaction by all others
- Disseminate policy to all employees
- Give them something they can say if cornered
(toolkit)
54. Great! Now we have this large pile of information, but we still don't really know how to regain lost functionality.
55. Step 13 - Alternate Facility Acquisition
- If your workspace is destroyed, you have to have a place to go
- It's not just space.
  - Furniture
  - Wiring for Power
  - Wiring for Data
  - Wiring for Phones
56. Alternate Facility Acquisition (cont.)
- Potential Options
  - Reciprocal agreements within the Jurisdiction
  - Reciprocal agreements with other Jurisdictions
  - Recovery Vendor provided Mobile Space (comes with phones and PCs)
- Reciprocal Agreements
  - Negotiate and Document as carefully as commercial leases
  - Review Annually
  - Be careful about double booking
  - Consider shift work and space sharing
57. Step 14 - Departmental Planning
- More than just a computer and a phone
- What has become invisible?
  - Reference Materials
  - Documentation
  - Things tacked to the wall
- Document, step by step, what it takes to recover each function (more later)
58. Departmental Planning (cont.)
- Each Department needs to consider prepositioned materials or an off site box with things needed to restart operations. Examples include
  - Small supplies of any forms
  - Custom rubber stamps, stationery, envelopes
  - Software CDs for any software installed by the Department that IT could not replace easily
  - Copies of templates or form letters not available on the network
  - Copies of brochures or printed materials where the whole supply is on site
59. Step 15 - Distribution Lists
- Decide who gets plans and how they should handle them
- Plans contain confidential information
- Holders need a copy accessible away from work, but copies should not be left lying around
60. Step 16 - Damage Evaluation Process
- Facility Manager works with COOP Coordinator
- Plan to move quickly; wet stuff deteriorates
- Assess damage to building and equipment
- Consider use of specialists for salvage
  - Freeze Dry paper
  - Special care for Electronics
61. Information Technology
62. IT Step 10 - IT Recovery Organization
- Possibilities (Each Jurisdiction Customizes)
  - Network Team
  - Desktop Technology Team
  - Database Team
  - Image Systems Team
- It is also possible to contract for Recovery Services
- Make sure any organization selected is unlikely to be hit by a common event
63. IT Step 11 - Server and Network Plan
- If the Server building is destroyed, needs include
  - Suitable Space with connectivity
  - Replacement equipment deliverable in time to meet RTOs
  - Backups current enough to meet RPOs
  - Detailed information on server configuration, operating systems
  - Step by step instructions on restoration of critical systems
  - Redundant failover systems for critical 24/7/365 systems, located away from primary location
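Two of the needs above reduce to simple arithmetic: backups are "current enough" when the backup interval does not exceed the RPO, and equipment is "deliverable in time" when delivery plus restore time fits inside the RTO. The Python sketch below assumes that simplified model (it ignores, for example, time to retrieve off-site media); all names and hour values are hypothetical.

```python
def server_plan_check(rto_hours, rpo_hours, backup_interval_hours,
                      equipment_delivery_hours, restore_hours):
    """Check one server's recovery plan against its targets.

    Assumes worst-case data loss equals one full backup interval, and that
    recovery time is equipment delivery plus restore time.
    """
    return {
        "meets_rpo": backup_interval_hours <= rpo_hours,
        "meets_rto": equipment_delivery_hours + restore_hours <= rto_hours,
    }

# Hypothetical server: nightly backups, 24 h hardware delivery, 12 h restore
result = server_plan_check(
    rto_hours=48,
    rpo_hours=4,
    backup_interval_hours=24,
    equipment_delivery_hours=24,
    restore_hours=12,
)
print(result)
```

In this example the server can be back within its RTO, but nightly backups cannot satisfy a 4-hour RPO, so either the backup schedule or the RPO has to change.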
64. IT Step 12 - Desktop Technology Plan
- Standard hardware configuration - Documented
- Standard software configuration - Documented
- Install files available on LAN and on CDs
- Source for replacement equipment that meets RTOs
- May require a Recovery Vendor
65. IT Step 13 - Telephone Plan
- Jurisdiction wide phone system?
- The PBX phone switch is usually in IT
- Review restoration requirement if destroyed
- Recovery Vendor
- PBX Vendor
- Are there phones that cannot be off line?
- 911
- Police
- Fire
- Others?
- What does it take to assure continuity?
66. IT Step 14 - Return to Data Center
- Moving a Recovered Data Center is an opportunity for another disaster
- Develop a plan in advance to use as an aid for the real plan (tool kit)
67. So how do I know I can really recover?
68. Recovery Definition
- Recovery is when an area is again able to perform its essential functions after an incident that interrupts normal work
- Examples
  - False Fire Alarm: Reentry to building
  - Real Fire: When people can again perform their essential functions at a new location
    - May not be all the functions
    - May not be room for all the people
69. Our Plan is Perfect
- No. Every plan has holes
  - Things forgotten
  - Things changed since last update
  - Things not covered in the Template because they are unique to a jurisdiction
70. How do I Improve the Plan?
- Individual Action Plans by Department
- Table Top Exercises
- Full Recovery Exercises
71. Action Plans by Department
- Each Department in the Jurisdiction needs to have an action plan that includes
  - Employee safety
  - Evacuation / Shelter in Place
  - Gathering point
  - Contact protocols
  - Phone Trees
  - Other?
  - Who outside Dept. to Notify
72. Action Plans by Department (cont.)
- What to do to restart functions
- A place to go
- How to recover work in progress
- Workarounds until Technology is in place
- Something to NOT do: Do not establish outside email accounts with Gmail, Yahoo, Hotmail, etc. and try to do jurisdiction business on them
  - Not Secure, could violate privacy laws
  - Will confuse people when switching back
73. Table Top Exercise
- Representatives of affected areas gather
- No forewarning of the scenario
- Involve as many people as reasonable
- Each area notes critical functions and steps through what they would do to restart them
  - Note losses (records, supplies, work in progress)
  - Note all dependencies
  - Note material required that is not pre-positioned
74. Table Top Exercise (cont.)
- Identify Assumptions: Verify during or after exercise
- Ask "How" and "When" a lot to avoid surprises in a real incident
- Disallow unrealistic assumptions: If it is not preplanned, it doesn't exist. Some common ones:
  - Acquisition of New Space
  - Acquisition of Furniture
  - Acquisition, Installation, Configuration of PCs, phones, copiers, fax machines
  - Network Connectivity
75. Table Top Exercise (cont.)
- Clock starts running on RTO at the time of the incident
- Use what is learned to improve the plan
- Revisit Department RTOs and discuss what happens if not met. Some may be movable, others not
- Develop Workarounds where possible to cover functions without waiting for technology
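The point about the clock can be made concrete during an exercise by tracking time remaining against each RTO from the moment of the incident, not from when the team assembles. A minimal Python sketch, with hypothetical times and an assumed 8-hour RTO:

```python
from datetime import datetime, timedelta

def rto_time_remaining(incident_time, now, rto):
    """Time left before the RTO is missed; negative if already missed.

    The deadline is measured from the incident itself.
    """
    deadline = incident_time + rto
    return deadline - now

# Hypothetical scenario: overnight incident, team assembles six hours later
incident_time = datetime(2024, 3, 4, 2, 0)
team_assembled = datetime(2024, 3, 4, 8, 0)

remaining = rto_time_remaining(incident_time, team_assembled, timedelta(hours=8))
print("Time remaining when the team starts work:", remaining)
```

Here most of the 8-hour RTO is already spent before recovery work begins, which is the kind of finding worth feeding back into the plan.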
76. Full Recovery Exercise
- Information Technology and Limited Operations
- Table Top exercises are useful for IT, but at some point they need to know they can actually restore servers supporting critical systems on new (and probably dissimilar) equipment
- Cost is probably going to be an issue
- A limited group of operations personnel needs to participate to know the systems work as they are supposed to after restoration.
77. Full Recovery Exercise (cont.)
- Have IT Recovery Priorities been mapped to Jurisdiction Functional Priorities?
- If IT is able to restore at the recovery site
  - Review the pile of stuff brought with them: How much of it is replicated off site and would have been available if the primary site were destroyed?
  - Review the people: Is anyone so key that the restoration would have failed without them? What if they were unavailable?
  - Review documentation and make sure you are not depending on Joe to be there. He may not be...
78. Full Recovery Exercise (cont.)
- There is a reasonable probability that IT will not be able to restore critical systems within the RTO the first time they try
- Problems may include
  - Problems with System State restores on dissimilar gear
  - Incompatible tape drives
  - Slow tape drives
  - Bad tape
  - Lack of specialized knowledge to restore stable, but complex equipment (example: AS/400 iSeries)
79. Tool Kit
- Some will need these; others will not
- Bomb Threat Checklist and Instructions
- Command Center Supply List (very basic)
- Event Description Form
- Event Log Form
- Generic Initial Recovery Steps
- Incident Report Form
- Risk Analysis Questions
- Site Assessment Form
- IT Server Hardware/Software/Switch Worksheet
- IT Generic Data Center Move List
- IT Server Data Sheet
- Recovery and Cleanup Vendors
80. Conclusion
- Start Now
- Do it one step at a time
- Take advantage of what is already in place and build on it
- Thank you for your time
- Do not hesitate to email or call with questions
- daniel_at_contingencynow.com