Title: DB-12: Atlantis, The Holy Grail, and 5-9s: In Search of the Perfect World
Brian Bowman
Senior Solution Engineer
- Are you in the right place?
- Definitions, Myths, Realities
- 5-9s for the Rest of Us
- Building an Ark
- So where does OpenEdge fit in?
- Summary Questions
What is your company asking?
- To get to five 9s?
- To do the Operations job too?
- And Systems Administration?
- And DR planning?
- And
- To perform miracles?
What we are going to focus on today
- What 5-9s means to the DBA / SA
- How to make the concept manageable
- How to convert business requirements into technological needs
Did Atlantis really exist?
Holy Grail
20,000 books listed on Amazon.com under Holy Grail
61 movies made about the Holy Grail
Myth of five 9s
- Everybody needs to be at 5-9s
- The business can't afford to lose data
- Our customers will leave us if they can't get to us
- It's always the software
- So what does 5-9s really mean?
Realities of five 9s
Availability   Downtime / Year           Downtime / Week
90%            876 hours                 16 hours
99%            87 hours, 36 minutes      101 minutes
99.9%          8 hours, 45.5 minutes     10 minutes
99.99%         52 minutes, 33 seconds    61 seconds
99.999%        5 minutes, 15 seconds     6.05 seconds
- This presentation has been running for 8 minutes
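The downtime figures on this slide follow from simple arithmetic: allowed downtime is the unavailable fraction multiplied by the length of the period. A minimal sketch, assuming a 365-day year and a 7-day week (the function names are illustrative, not from the presentation). The slide's figures are rounded, so small differences from the computed values are expected.

```python
# Downtime allowed per year and per week for a given availability target.
# Assumes a 365-day year and a 7-day week; all names are illustrative.

HOURS_PER_YEAR = 365 * 24   # 8760
HOURS_PER_WEEK = 7 * 24     # 168


def downtime_seconds(availability_pct: float, period_hours: float) -> float:
    """Return the seconds of downtime permitted in a period."""
    return (1 - availability_pct / 100) * period_hours * 3600


def describe(seconds: float) -> str:
    """Format a duration as hours, minutes, and seconds."""
    hours, rest = divmod(seconds, 3600)
    minutes, secs = divmod(rest, 60)
    return f"{int(hours)}h {int(minutes)}m {secs:.1f}s"


for pct in (90, 99, 99.9, 99.99, 99.999):
    per_year = downtime_seconds(pct, HOURS_PER_YEAR)
    per_week = downtime_seconds(pct, HOURS_PER_WEEK)
    print(f"{pct:>7}%  year: {describe(per_year):>16}  week: {describe(per_week)}")
```

At 99.999%, the yearly budget works out to roughly 315 seconds, which is why a single reboot can consume the whole year's allowance.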
What is 99.999%?
According to Forsythe Technology Inc.
- If systems are up... Then yearly downtime is...
- So while 99% uptime sounds good, the reality is that IT would be down 1.7 hours per week!
- Today's definition of "high availability" is a system at "five 9's" (99.999%).
- Can <Insert Company> achieve 99.999%?
- A pipe dream
Another View of Achieving 5-9s
- Getting very high availability depends on many things:
- Minimizing the number of failures, typically by eliminating the faults that cause them.
- Quickly detecting and accurately diagnosing the failures that do occur.
- Having spare capacity to allow recovery to be performed.
- Having recovery actions that can be rapidly completed.
- (http://home.att.net/wamontgomery/communications/99999.htm, Extracted 13 April 2007)
- Faster
- Better
- Cheaper
- pick any two
Now in our terms
- HA means making sure the application is available when you need it
- Access
- Performance
- BC and DR are part of HA
- Without BC and DR, HA cannot exist
- HA costs more
- Usually means a combo of HW and SW
- More work to achieve a higher level of 5-9s
Figuring out the cost of downtime
- Factors that go into calculating the cost of downtime
- Lost Sales
- Lost Wages
- Lost Time
- Lost Customers
- Lost Partners
- Lost Reputation
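The more quantifiable factors above (lost sales, lost wages, lost time) can be combined into a rough hourly figure. The sketch below is one hypothetical way to do that. The class, its field names, and every number in the example are assumptions for illustration only. Lost customers, partners, and reputation are left out because they are much harder to put a number on.

```python
from dataclasses import dataclass


@dataclass
class DowntimeCost:
    """Hourly cost estimates; every value supplied is a placeholder."""
    lost_sales_per_hour: float
    idle_staff: int                          # people who cannot work during the outage
    wage_per_hour: float                     # average loaded wage for idle staff
    recovery_overhead_per_hour: float = 0.0  # lost time: overtime, catch-up work

    def hourly(self) -> float:
        """Estimated cost of one hour of downtime."""
        lost_wages = self.idle_staff * self.wage_per_hour
        return self.lost_sales_per_hour + lost_wages + self.recovery_overhead_per_hour

    def for_outage(self, hours: float) -> float:
        """Estimated cost of an outage lasting the given number of hours."""
        return self.hourly() * hours


# Made-up example figures.
cost = DowntimeCost(lost_sales_per_hour=5000, idle_staff=40,
                    wage_per_hour=30, recovery_overhead_per_hour=500)
print(cost.hourly())         # cost of one hour down
print(cost.for_outage(4.5))  # cost of a 4.5-hour outage
```

Even a rough number like this gives the business something concrete to weigh against the cost of more hardware and software.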
Did Atlantis really exist?
What Does it Really Take to Achieve 5-9s?
- Commitment
- Focus
- Organization
- Agreement
- Business Need
- Need to build the bridge between the DR plan and BC
- Are you in the right place?
- Definitions, Myths, Realities
- 5-9s for the Rest of Us
- Before You Start Your Quest
- So where does OpenEdge fit in?
- Summary Questions
What affects 5-9s in Your Environment?
- Identifying Events
- Grouping Events
- Setting Achievable Goals
- Risk vs. Reward
Identifying Events (Putting a Box around
Network Outage
Application Problem
Planned DB Maintenance
Unplanned HW Maintenance
HW Outage
Unplanned Upgrade
Planned Upgrade
Planned HW Maintenance
Application Deployment
Bad Disk
Unplanned DB Maintenance
CPU Panic
Planned vs. Unplanned
- Planned
- DB Maintenance
- Application Maintenance
- Hardware Maintenance
- Unplanned
- Application Outage
- Hardware Outage
- Database Outage
- Network Outage
Putting a Box around Events
Planned Events
- Planned Upgrade
- Application Deployment
- Planned HW Maintenance
- Planned DB Maintenance
Unplanned Events
- Application Problem
- Unplanned Upgrade
- Network Outage
- HW Outage
- CPU Panic
- Bad Disk
- Unplanned HW Maintenance
- Unplanned DB Maintenance
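Once events are boxed into planned and unplanned, the downtime in each box can be totalled and turned into an availability figure. A minimal sketch, assuming a simple in-memory event log. The log layout and all durations are made-up examples, not data from the presentation.

```python
HOURS_PER_YEAR = 365 * 24  # assumes a 365-day year

# (event name, planned?, hours of downtime); durations are made-up examples
event_log = [
    ("Planned DB Maintenance", True, 6.0),
    ("Planned Upgrade", True, 4.0),
    ("Bad Disk", False, 3.0),
    ("Network Outage", False, 0.5),
]


def availability_pct(downtime_hours, period_hours=HOURS_PER_YEAR):
    """Availability as a percentage of the period."""
    return 100 * (1 - downtime_hours / period_hours)


planned = sum(hours for _, is_planned, hours in event_log if is_planned)
unplanned = sum(hours for _, is_planned, hours in event_log if not is_planned)

print(f"Unplanned only:      {availability_pct(unplanned):.3f}%")
print(f"Planned + unplanned: {availability_pct(planned + unplanned):.3f}%")
```

Whether planned maintenance counts against the target is a business decision, and the two results show how much that definition moves the number.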
Identifying Problems in your Environment
- History: do you have one?
- Can you find it, or is it in Atlantis?
- Tech Support history from vendor
What are my top X Events?
- Don't try to boil the ocean
- Categorize problems
- Is it a problem?
- What is the cost?
- Can I control it?
- Select the ones with the highest impact that are also the most achievable.
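One way to make the selection above repeatable is to score each event and sort. The sketch below is hypothetical: the 1-to-5 scales, the `impact * achievable` score, and the example ratings are all assumptions, not something the presentation prescribes.

```python
from dataclasses import dataclass


@dataclass
class Event:
    name: str
    impact: int         # 1 (minor) to 5 (severe); hypothetical scale
    achievable: int     # 1 (hard to change) to 5 (easy to change)
    controllable: bool  # answers "Can I control it?"


def top_events(events, x):
    """Return the x controllable events with the highest impact * achievable score."""
    candidates = [e for e in events if e.controllable]
    ranked = sorted(candidates, key=lambda e: e.impact * e.achievable, reverse=True)
    return ranked[:x]


# Made-up ratings for illustration.
events = [
    Event("Disk Loss", impact=5, achievable=4, controllable=True),
    Event("Index Rebuild", impact=3, achievable=5, controllable=True),
    Event("Upgrade to OE 10", impact=4, achievable=2, controllable=True),
    Event("Network Outage", impact=5, achievable=1, controllable=False),
]

for event in top_events(events, 2):
    print(event.name)
```

Events outside your control are filtered out first, so effort goes only to problems you can actually change.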
Charting my Events
- Disk Loss
- DB Maintenance
- Index Rebuild
- Upgrade to OE 10
- New Hardware
- Application Upgrade
[Chart axes: Ability to Change, Impact of Change]
What Do These Events Cost Me?
- Down time
- Reputation
- Lost sleep
Database vs. Application vs. Hardware
- Where does the problem lie?
- If it is the application, what has changed?
- Do you have a plan for that failure?
- Who will solve the problem?
- If it is a HW problem and it will take them 4 hours to fix it
- How long will it take them?
- If it will take 30 minutes to reboot the server, has this been taken into account?
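The hardware example above is worth working through: a 4-hour repair plus a 30-minute reboot is 270 minutes of downtime. A minimal sketch, assuming a 365-day year, that checks which availability tier a single incident of that length still fits within (the function names are illustrative):

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600, assuming a 365-day year


def yearly_budget_minutes(availability_pct):
    """Minutes of downtime a given availability target allows per year."""
    return (1 - availability_pct / 100) * MINUTES_PER_YEAR


def best_tier_after(outage_minutes, tiers=(99.999, 99.99, 99.9, 99, 90)):
    """Return the strictest tier whose yearly budget still covers the outage."""
    for pct in tiers:
        if outage_minutes <= yearly_budget_minutes(pct):
            return pct
    return None


repair_minutes = 4 * 60  # vendor needs 4 hours to fix the hardware
reboot_minutes = 30      # server takes 30 minutes to come back up
outage = repair_minutes + reboot_minutes

print(outage, "minutes of downtime")
print("Strictest tier still reachable this year:", best_tier_after(outage))
```

Under these assumptions, one such incident rules out anything better than three 9s for the year, and the reboot alone exceeds the five 9s budget.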
Placing limitations on downtime requirements
- Defining downtime
- Internal or External
- Determining when to throw in the towel
- What is your escape plan?
Incremental steps
- Establish Plans to address problems
- Communicate plans at all levels
- Document completed plans
Delivering the final product
- Complete documentation of process and findings
- Executive Summary
- What is our vulnerability?
- What is the risk?
- What will it cost me to fix it?
Summarizing the Process
- Define Events
- Plan to address the Events
- Execute the Plan
- Deliver the solution
Quantifying five 9s with OpenEdge
- Ongoing improvements to the database
- Ongoing improvements to the Application Server
- Increased Availability = Increased 9s
- Planned vs. Unplanned Events
Increasing Planned Availability
Features That Increase Planned Availability
- Online Schema Updates
- Auto-Defragging of the database
- Self Healing Attributes
- Truncate log file online
- Adding extents online
- Enabling AI Online
- SQL Revoke Privileges
- OpenEdge Replication Failback
Increasing Unplanned Availability
Features that Increase Unplanned Availability
- Database Storage Clusters
- More Self-tuning Database
- Show I/O by Table by User
- Raise Database Limits
- OpenEdge Replication Failback
- OpenEdge Management
Increasing All Availability
Have you found the Holy Grail?
What is your Holy Grail?
What to do when you're done
- What are you going to do when you have more money than Bill Gates and will live forever?
- OR
- What are you going to do when you have achieved five 9s?
How do you maintain High Availability?
- The moment a change is introduced to your system, all of the work you just did could have just been thrown out the window.
- Change Management!!!
- 5-9s is both a myth and a reality
- Identify what is important to your business
- Clearly build a box around what is under your control and what is not
- Wash, Rinse, Repeat
Is this your Shangri-La?
Where to go from here
- Other Exchange sessions
- COMP-10 OpenEdge Management and Replication: Divide et impera! (June 13, 8am)
- COMP-15 Disaster Recovery Planning (June 13, 3:30pm)
- PSDN for doc link
- http://www.psdn.com/library/kbcategory.jspa?categoryID555
- Professional Services for assistance
Thank you for your time!
Additional Information
- The Resilient Enterprise, by Yossi Sheffi
- Attainium Business Continuity Web Site, http://www.attainium.net/index.php
- Complying with New Business Continuity and Contingency Plan Rules, by Nick Benvenuto and Brian Zawada, http://www.dmreview.com/editorial/dm (Extracted 25 May 2007)
- NASD Rule 3510 Web Page, http://nasd.complinet.com