OpenEdge High Availability

Title: Progress Client/Server for developers Author: Alan Wilkinson Last modified by: Adam Backman Created Date: 7/2/2006 12:05:35 PM Document presentation format – PowerPoint PPT presentation

Slides: 40

Provided by: AlanW192

Transcript and Presenter's Notes

1
OpenEdge High Availability

Adam Backman
Grand Poobah, White Star Software

2
About the speaker

Head Winemaker, White Star Software
One of the oldest and most respected consulting
and training companies in the Progress OpenEdge
sector
Lackey, DBAppraise
Managed database services backed up by
experienced Progress OpenEdge professionals, not
rookies off the bench
Read a book or two
Snappy Dresser
Knows a bit about systems and OpenEdge

3
Agenda

Are you really 24X7?
Redundancy
Replication
Maintenance
Failing over
Conclusion

4
What is High Availability?

A real business need that requires full access to
current data at any time of the day or night
Many sites are kind of 24X7 but only a small
percentage of companies have real business
requirements that necessitate access to the data
24 hours a day.
Some applications have high availability needs
but only during given hours which simplifies
maintenance
The need is growing every day

5
Are You Really 24X7?

Business runs 24 hours a day
3-shift manufacturing, Utility, Casino, Website,
Business needs access 24 hours
Work during the day, report and plan at night
Weekend requirements

6
What is High Availability?

The ability to keep running your business
Continuous Access which allows for failures with
zero impact to the users
Minimally Invasive failure management like using
HACMP clustering with OpenEdge as a cluster
service
Major Failover where physical location of the
application must be changed
Minimal recovery time in case of disaster
It is not disaster recovery: DR is only used
when HA fails

7
Before you begin

Understand your business
Understand the cost of downtime
Do not build a solution that costs more than what
you are protecting
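
The cost comparison above can be sketched as simple arithmetic. This is a minimal illustration, not a costing method from the presentation: the function names and every figure are hypothetical, and a real estimate would come from the business owners of the data.

```python
def expected_annual_downtime_cost(outages_per_year, hours_per_outage, cost_per_hour):
    """Expected yearly loss from downtime under the given assumptions."""
    return outages_per_year * hours_per_outage * cost_per_hour


def solution_is_justified(annual_solution_cost, loss_without, loss_with):
    """True when the downtime cost avoided exceeds the yearly cost of the solution."""
    return (loss_without - loss_with) > annual_solution_cost


# Hypothetical figures, for illustration only.
without_ha = expected_annual_downtime_cost(2, 8.0, 5_000.0)  # two 8-hour outages a year
with_ha = expected_annual_downtime_cost(2, 0.5, 5_000.0)     # same outages, 30 minutes each

print(solution_is_justified(60_000.0, without_ha, with_ha))
print(solution_is_justified(90_000.0, without_ha, with_ha))
```

With these made-up numbers the solution avoids 75,000 a year in downtime, so a 60,000 solution pays for itself and a 90,000 one does not.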

8
People

Who owns the data?
Be inclusive with invites; most will drop out
This is not solely an IT decision
You are the keeper, not the owner, of the data
You know what is technically possible
You know the cost of the tech needed to build the
solution
The goal is to eliminate surprises if/when a
problem occurs

9
Planning

Budget: it is not free
Hardware: fault tolerant, redundancy, ...
Software: OpenEdge plus ALL the other stuff you
have to run the operation
Knowledge: buy or rent
Time: schedule and outage time
Personnel constraints: who is on call and who is
their backup

10
Causes of Downtime

Hardware
Disks are the most vulnerable, as they are the only
moving part, unless you have SSDs
Power - All the hardware requires power
Software
OS bug
OpenEdge (core or application) bug
Natural disaster
Fire
Flood
Sabotage
Human Error

11
Basic Rules

Good Hardware
Trusted vendor
Good support (local support if possible)
No Windows (OK, maybe 2008)
You need a good recovery plan
You will run with after-imaging enabled

12
Redundancy

Hardware
Software
Personnel

13
Redundancy - Hardware

Power (UPS or UPS + generator)
Mirrored disks
Network - in machine and general network
Non-interleaved memory (some use FT memory)
Multiple CPUs
Support hardware (PCs, terminals, phone, ...)
Complete failover environment

14
Hardware

Why have a UPS and a generator?
UPS has limited capacity
Generators can run for a long time
Have a reliable source of extra fuel

15
Hardware

Do not let standby systems sit idle
Use them for development or test
Keep copies of all support files
.pf
.ini
.d
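
Keeping copies of these support files on the standby system is easy to script. The sketch below is one possible approach, not something from the presentation: the function name and directory layout are hypothetical, and it assumes the standby location is reachable as an ordinary directory outside the source tree.

```python
import shutil
from pathlib import Path

# File types named on the slide; extend the set to suit your own environment.
SUPPORT_SUFFIXES = {".pf", ".ini", ".d"}


def copy_support_files(source_dir, standby_dir):
    """Copy support files to standby_dir, preserving the relative directory layout.

    Returns the list of destination paths that were written.
    standby_dir is assumed to live outside source_dir.
    """
    source = Path(source_dir)
    target = Path(standby_dir)
    copied = []
    for path in sorted(source.rglob("*")):
        if path.is_file() and path.suffix.lower() in SUPPORT_SUFFIXES:
            dest = target / path.relative_to(source)
            dest.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(path, dest)  # copy2 also preserves timestamps
            copied.append(dest)
    return copied
```

Run from a scheduler, a call such as `copy_support_files("/apps/prod", "/mnt/standby/prod")` (both paths are placeholders) keeps the standby copy current without anyone having to remember to do it.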

16
Redundancy - Software

Host-based applications are the least fault tolerant
Web-based applications can provide a good environment,
provided the AppServer calls are stateless
In the client/server model, remember that file servers
need to be redundant as well

17
Redundancy - Software

NameServer on the broadcast and clustered
Don't use the NameServer
Cluster your AppServers so if a single AppServer
fails there is another to pick up the load

18
Redundancy - Staffing

Is the failover machine close?
Can it reliably be accessed remotely? (failure
point)
Possible to call in additional resources?
More hands
Different skills
Relief of tired staff
Is it necessary to support all functions or only
core?

19
Replication of Data

Database data
OpenEdge replication (synchronous)
Log-based replication (asynchronous)
Hardware-based replication (?)
Application and User files
OS utility (fsync, rsync, ...)
Hardware (remote mirroring)
Third-party (PolyServe)

20
Replication - OpenEdge

Pros
Supported product
Synchronous
Fast (Really Fast)
Cons
Cost
Yet another thing to support
Additional resource usage

21
Replication - Log-based

Pros
Cheap (Not free, but close)
Easy to setup and maintain
Cons
No formal support
Additional resource utilization

22
Hardware Replication

Pros
Easy setup
Easy Maintenance
Cons
Expensive
Possibility of data corruption unless ALL writes
are guaranteed

23
Maintenance

Script everything to eliminate human error
Scheduled Maintenance
Application changes
Backups
Index maintenance
Adding space
Unscheduled maintenance
Eliminate unscheduled maintenance by monitoring
and trending
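
As one illustration of trending, the sketch below fits a straight line to periodic disk-usage samples and estimates how long until the volume fills. It is an assumption-laden example, not the presenter's tooling: the function name is hypothetical, growth is assumed to be roughly linear, and the samples would come from whatever monitoring you already run.

```python
def days_until_full(samples, capacity_gb):
    """Estimate days from the last sample until usage reaches capacity_gb.

    samples is a list of (day_number, used_gb) pairs. A least-squares line is
    fitted to them. Returns None when usage is flat or shrinking.
    """
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_x = sum(x for x, _ in samples) / n
    mean_y = sum(y for _, y in samples) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in samples)
    if sxx == 0:
        raise ValueError("samples must span more than one day")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in samples)
    slope = sxy / sxx                      # growth in GB per day
    if slope <= 0:
        return None
    intercept = mean_y - slope * mean_x
    day_full = (capacity_gb - intercept) / slope
    last_day = max(x for x, _ in samples)
    return max(0.0, day_full - last_day)


# Hypothetical samples: 2 GB of growth per day on a 126 GB volume.
print(days_until_full([(0, 100.0), (1, 102.0), (2, 104.0), (3, 106.0)], 126.0))
```

An estimate like this turns "add space" into scheduled maintenance: the alert fires weeks before the volume fills rather than when it does.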

24
Maintenance - Application

Schema
Use fast schema add, then add the default value
Still requires an outage for some changes due to
table locks
Code changes
If you are n-tier you can stop the AppServer to
reduce the interruption
Switch to a different propath and move clients
over through natural attrition

25
Maintenance - Backups

Progress backup
Reliable
Online option
Split mirror backup
Replication backup
Eliminate overhead on production db
Must be a no recover backup for log-based
replication

26
Maintenance - Index

Index rebuild cannot be run against a replicated
database
Use index compact online
proutil <dbname> -C idxcompact <table.index>
Notes
Watch for open transactions, as idxcompact will
do a significant amount of logging
Schedule outside of busy times to allow
replication to keep up

27
Maintenance - Add Space (online and offline
approaches)

prostrct addonline to add space while you are
running
Process (online)
Make sure your umask is correct
Validate your add.st file
prostrct addonline db add.st
prostrct is supported for both source and target
databases with the exception of prostrct unlock
Process (offline)
Shut down source and target
Make changes to source
Make changes to target
Start both databases

28
Maintenance

All maintenance should be scripted and tested in
a test environment before proceeding with the
Production run
Eliminate the human element (no typos)
Know how long it will take
Make sure maintenance does not cause a problem
Apply and test schema changes thoroughly
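
One way to script a maintenance run so that it is repeatable and timed is sketched below. This is a generic runner, not a tool from the presentation: the function name and step list are hypothetical, and each step is simply a command line you have already tested by hand.

```python
import subprocess
import time


def run_steps(steps, dry_run=False):
    """Run (name, argv) steps in order and stop at the first failure.

    Returns a list of (name, return_code, seconds) for each step attempted.
    With dry_run=True the commands are only printed, never executed.
    """
    results = []
    for name, argv in steps:
        if dry_run:
            print(f"[dry-run] {name}: {' '.join(argv)}")
            results.append((name, 0, 0.0))
            continue
        started = time.monotonic()
        completed = subprocess.run(argv)
        elapsed = time.monotonic() - started
        results.append((name, completed.returncode, elapsed))
        print(f"{name}: rc={completed.returncode} in {elapsed:.1f}s")
        if completed.returncode != 0:
            break  # do not continue maintenance after a failed step
    return results
```

A step list might contain an entry such as `("add space", ["prostrct", "addonline", "db", "add.st"])`. Running the same list in the test environment first shows how long each step takes and catches typos before the production run.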

29
Building a failover plan

Who
Business and technical personnel
Gets informed: email, conference call, call
tree, ...
Makes Decisions
Does the work
What
What resources are affected?
Where
Location of physical resources
Location of personnel
Location of replacement/replication target

30
Building a failover plan - continued

When
Times of backups
Times of data archiving
Times of backup archiving
Times of log archiving
Why
What are we protecting ourselves from
Why did we choose not to deal with some event

31
Risk Assessment

Things to consider
Risk: natural disaster, human-caused, hardware, ...
Likelihood
Impact to application environment
Time to recover
It is OK to say, "We considered that, and it was not
likely enough in our eyes to justify creating a
solution"
Determine the dependency of each level
Hardware requires power
OpenEdge application requires PostalSoft
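
The considerations above can be turned into a simple ranking. The sketch below scores each risk as likelihood times recovery time and splits the list at a threshold. The scoring formula, the names, and all of the numbers are assumptions made for illustration; the presentation does not prescribe a formula.

```python
from dataclasses import dataclass


@dataclass
class Risk:
    name: str
    likelihood: float      # estimated chance of occurring in a year, 0.0 to 1.0
    recovery_hours: float  # estimated time to recover if it happens

    @property
    def exposure(self):
        """Expected hours lost per year under these estimates."""
        return self.likelihood * self.recovery_hours


def triage(risks, threshold):
    """Split risks into (mitigate, accept), each sorted by descending exposure."""
    ranked = sorted(risks, key=lambda r: r.exposure, reverse=True)
    mitigate = [r for r in ranked if r.exposure >= threshold]
    accept = [r for r in ranked if r.exposure < threshold]
    return mitigate, accept


# Hypothetical estimates, for illustration only.
risks = [
    Risk("flood", 0.01, 72.0),
    Risk("disk failure", 0.5, 8.0),
    Risk("human error", 0.3, 4.0),
]
mitigate, accept = triage(risks, threshold=1.0)
print([r.name for r in mitigate])
print([r.name for r in accept])
```

The "accept" list is the documented record of risks that were considered and deliberately left without a solution.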

32
Solutions

Document redundancy where it exists
Document places where redundancy is missing or
unknown (on purpose or omission)
Ensure reasonable software update procedures are
in place and documented
Verify security, division of responsibilities and
software release policies per layer
Need to develop Risk Assessment form

33
Aspects of a failover plan

When
When do we decide to move to the standby
environment?
Who makes the decision?
Who does the work, and who is their backup?
Defined process
Service level agreements with customers
Milestones in the process
Why
This is a tougher decision than you think
Fix or flee: lost time vs. lost data

34
Documenting your plan

Your plan should be able to be executed by anyone
You cannot have enough detail
Automate as much of the process as possible to
eliminate the human element
Document and automate both the failover and the
failback

35
Test your plan

Switch over to your standby environment and run
for a day or more
You don't want to cause an extended outage
testing your plan
You will only find issues if you run at full load
Do this at least once a year
Follow your document and correct mistakes as you
go

36
Keep documents and support files up-to-date

Keep your failover and failback documents
up-to-date
Keep contact lists up-to-date
Keep all individual process documents up-to-date
Keep copies of your support files
Scripts
Application (.pf, .ini, .properties, ...)
Good password management
Keep everything accessible (online and hard
copies)

37
Points to Remember

Build redundancy into all aspects of your
operation
Look at the likelihood of a failure and its
impact to the customer
Protect your entire application environment, both
hardware and software
Build a total solution but think about the
cost/benefit of each component
Automate tasks to eliminate human error
Test your failover plan at least once a year

38
Questions?

Adam Backman adam_at_wss.com

39
Thank you for your time!
