Title: OpenVMS Solutions Center Lab Project Spring 2004: Oracle 9i RAC DT/HA in a distributed OpenVMS Envir
1. OpenVMS Solutions Center Lab Project - Spring 2004
Oracle 9i RAC DT/HA in a distributed OpenVMS Environment
Phase I: Failover
2. RAC DT/HA - Goals, Phase I
- First
  - Demonstrate that Oracle 9i RAC continues to run during a simulated network failure when LAN Failover and failSAFE IP are configured.
- Second
  - Measure the latency effect of failover when RAC instances are connected over a long distance (100 km).
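To put the 100 km figure in perspective, the propagation delay contributed by the fiber alone can be estimated from the speed of light. The following is a rough sketch, not a measurement from the lab: the refractive index of 1.47 is an assumed typical value for single-mode fiber, and the function name is illustrative only. Switch, NIC, and protocol overhead are not included.

```python
# Rough estimate of fiber propagation delay over a long-distance interconnect.
# Assumption: refractive index ~1.47 (typical single-mode fiber); not lab data.
SPEED_OF_LIGHT_KM_PER_S = 299_792.458  # speed of light in vacuum
REFRACTIVE_INDEX = 1.47                # assumed value, varies by fiber type


def one_way_delay_ms(distance_km: float,
                     refractive_index: float = REFRACTIVE_INDEX) -> float:
    """Return the one-way propagation delay in milliseconds."""
    speed_in_fiber = SPEED_OF_LIGHT_KM_PER_S / refractive_index
    return distance_km / speed_in_fiber * 1000.0


one_way = one_way_delay_ms(100)
round_trip = 2 * one_way
print(f"one-way: {one_way:.2f} ms, round trip: {round_trip:.2f} ms")
```

Under these assumptions the cable adds roughly half a millisecond in each direction, so every request/reply exchange between the RAC instances pays about one millisecond before any processing time is counted.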
3. RAC DT/HA - What is Failover?
- Oracle RAC failover: the ability to resume work on an alternate instance upon instance failure
- Oracle TAF (Transparent Application Failover): runtime failover that enables client applications to reconnect automatically to the database if the connection fails
- LAN Failover: hardware failover from a failed network interface card (NIC) to another NIC configured as part of the LAN failover set
- failSAFE IP: address failover to alternate interfaces
4. RAC DT/HA - Hardware Config
- 2 4-CPU GS160 systems, with a shared cluster system disk and a shared Oracle install disk on an Enterprise Storage Array connected via a Fibre SAN switch
- DE602-AA (EIA) NICs, using twisted pair on 100 Mbit LAN, Extreme Summit4 Switch
- 5 DEGPA-SA, 1 DEGXA-SA (EWA-D) NICs, 1 Gbit fiber on 1 Gbit LAN, Digital Networks DNSwitch 800
- 100 km cable
- Gbit SCS Extreme Summit 7i Switch
5. RAC DT/HA - Server Config
- OpenVMS 7.3-2, TCPIP 5.4
- Oracle Server 9.2.0.4, with Oracle patch for bug fix 3026720: "Excessive CPU and BUFIO for LMD0 and SMON processes when >2 CPU"
- Running 2 RAC instances in a 2-node cluster
- Requires the INIT<SID>.ORA parameter CLUSTER_INTERCONNECTS to specify an alternate network interface for RAC communication
6. RAC DT/HA - Client Config
- 9.2 SQL*Net client, on a PC running Windows 2000
- Benchmark/load-generating software
  - Swingbench 2.1f: an unofficial, Java-based client load-generating tool from Oracle, which allows a load to be generated and the transactions/response times to be charted
  - Configured to connect 100 clients, load balanced between the 2 instances, and run 50,000 typical Order Entry transactions
7. RAC DT/HA - Test Plan
- Restore from disk backup before each test run to ensure the same starting point
- Ensure RAC instances are communicating over the specified network interface
- Run 3 iterations of the same benchmark load while collecting data
  - Run benchmark load, no failures
  - Run benchmark load, fail instance
  - Run benchmark load, fail network connection between instances
8. RAC DT/HA - Data Collection
- T4 running on both nodes, 10-second sampling interval
- Saved Swingbench data results after each run
- Executed and saved output of VMS commands during network failures to see the status of network devices and Oracle processes
  - MC LANCP SHOW DEVICE/CHARACTERISTICS LLA0
  - TCPIP SHOW INTERFACES/FULL
  - PIPE SHO SYS | SEA TT ORA_CPU
9. Tabular Timeline Tracking Tool (T4)
- Created by OpenVMS Sustaining Engineers to help diagnose OS functionality. Uses OpenVMS Monitor data, stored in comma-separated value (.csv) file format, which can then be used by a variety of applications (spreadsheets, TLViz, etc.)
- Download from the web. Shipped with OpenVMS 7.3-2, in the SYS$ETC directory
  - http://h71000.www7.hp.com/openvms/products/t4/index.html
- Users are able to queue data collection and configure data collection frequency
- Helpful in establishing a baseline performance footprint, which can then be used in before-and-after comparisons of system changes
- T4 hooks for Oracle and Rdb Server being created
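The before-and-after comparison described above can be sketched in a few lines of Python. This is only an illustration: it assumes the T4 .csv has been trimmed to a single header row (real T4 files may carry extra preamble lines that would need to be skipped first), and the column name `CPU Busy` and all sample values are hypothetical, not data from this project.

```python
import csv
import io
from statistics import mean


def column_mean(csv_text: str, column: str) -> float:
    """Average one numeric column of a CSV with a single header row."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return mean(float(row[column]) for row in reader)


# Hypothetical 10-second samples; column names and values are made up.
BASELINE = """Sample Time,CPU Busy
10:00:00,40.0
10:00:10,44.0
10:00:20,42.0
"""

AFTER_CHANGE = """Sample Time,CPU Busy
11:00:00,50.0
11:00:10,54.0
11:00:20,52.0
"""

before = column_mean(BASELINE, "CPU Busy")
after = column_mean(AFTER_CHANGE, "CPU Busy")
print(f"CPU Busy: baseline {before:.1f}, after {after:.1f}, delta {after - before:+.1f}")
```

Averaging a whole run hides short spikes such as a failover event, so for the failure tests a per-sample view (as in TLViz) is likely more informative than a single mean.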
10. RAC DT/HA - EIA Network
11. RAC DT/HA - T4 data - EIA
12. RAC DT/HA - LAN Failover Network
13. RAC DT/HA - LAN Failover DCL
- MCR LANCP SHOW DEVICE/CHAR LLA0
- Before NIC fails

  Device Characteristics LLA0
     Value  Characteristic
    ------  --------------
       256  Max receive buffers
       Yes  Full duplex enable
         .  .
         .  .
      1000  Line speed (mbps)
    "EWB0"  Failover device
    "EWA0"  Failover device (active)
         .  .
         .  .
         0  Failover priority

- After NIC fails

  Device Characteristics LLA0
     Value  Characteristic
    ------  --------------
       256  Max receive buffers
       Yes  Full duplex enable
         .  .
         .  .
      1000  Line speed (mbps)
    "EWB0"  Failover device (active)
    "EWA0"  Failover device
         .  .
         .  .
         0  Failover priority
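When the LANCP output is saved during each test run, the active member of the failover set can be picked out automatically. The sketch below assumes the saved text contains lines of the form `"EWA0" Failover device (active)` as shown in this slide; the function name and the trimmed sample strings are illustrative only.

```python
import re
from typing import Optional

# Matches a quoted device name followed by the "(active)" marker.
ACTIVE_PATTERN = re.compile(r'"([A-Z0-9]+)"\s+Failover device \(active\)')


def active_failover_device(lancp_output: str) -> Optional[str]:
    """Return the device marked active in saved LANCP output, or None."""
    for line in lancp_output.splitlines():
        match = ACTIVE_PATTERN.search(line)
        if match:
            return match.group(1)
    return None


# Trimmed copies of the two listings from this slide.
BEFORE = '''
  1000  Line speed (mbps)
"EWB0"  Failover device
"EWA0"  Failover device (active)
'''

AFTER = '''
  1000  Line speed (mbps)
"EWB0"  Failover device (active)
"EWA0"  Failover device
'''

print(active_failover_device(BEFORE), "->", active_failover_device(AFTER))
```

Comparing the value before and after the cable pull confirms that LLA0 switched from EWA0 to EWB0.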
14. RAC DT/HA - T4 LAN Failover EWA/B
[Chart annotations: EWA0 cable pulled; EWB0 cable pulled]
15. RAC DT/HA - T4 LAN Failover LLA0
16. RAC DT/HA - T4 Overlay of EWA/LLA0
17. RAC DT/HA - failSAFE IP Network
18. RAC DT/HA - failSAFE IP DCL
- TCPIP SHOW INTERFACE/FULL

  Route Tree for Protocol Family 2
  default     161.114.69.1  UGS    0     7999  IE0
  10.4.4/24   10.4.4.2      U    274   408185  WE3
  10.4.4/24   10.4.4.3      U    274   445714  WE4
  10.4.4.2    10.4.4.2      UHL    0        0  WE3
  10.4.4.3    10.4.4.3      UHL    0       14  WE4

  WE3: flags=c43<UP,BROADCAST,RUNNING,MULTICAST,SIMPLEX>
    failSAFE IP Addresses
      inet 10.4.4.3 netmask ffffff00 broadcast 161.114.69.63 (on QBB3 WE4)
    inet 10.4.4.2 netmask ffffff00 broadcast 10.4.4.255 ipmtu 1500

  WE4: flags=c43<UP,BROADCAST,RUNNING,MULTICAST,SIMPLEX>
    failSAFE IP Addresses
      inet 10.4.4.2 netmask ffffff00 broadcast 161.114.69.63 (on QBB3 WE3)
    inet 10.4.4.3 netmask ffffff00 broadcast 10.4.4.255 ipmtu 1500
19. RAC DT/HA - failSAFE IP DCL, Failed 1
- TCPIP SHOW INTERFACE/FULL

  Route Tree for Protocol Family 2
  default     161.114.69.1  UGS    0     7999  IE0
  10.4.4/24   10.4.4.2      U    274   408185  WE3
  10.4.4/24   10.4.4.3      U    274   445714  WE4
  10.4.4.2    10.4.4.2      UHL    0        0  WE3
  10.4.4.3    10.4.4.3      UHL    0       14  WE4

  WE3: flags=c43<UP,BROADCAST,RUNNING,MULTICAST,SIMPLEX>
    failSAFE IP - interface is in a failed state
    failSAFE IP Addresses
      inet 10.4.4.3 netmask ffffff00 broadcast 161.114.69.63 (on QBB3 WE4)
      inet 10.4.4.2 netmask ffffff00 broadcast 10.4.4.255 (on QBB3 WE4)

  WE4: flags=c43<UP,BROADCAST,RUNNING,MULTICAST,SIMPLEX>
    inet 10.4.4.3 netmask ffffff00 broadcast 10.4.4.255 ipmtu 1500
    inet 10.4.4.2 netmask ffffff00 broadcast 161.114.69.63 ipmtu 1500
20. RAC DT/HA - failSAFE IP DCL, Failed 2
- TCPIP SHOW INTERFACE/FULL

  Route Tree for Protocol Family 2
  default     161.114.69.1  UGS    0     7999  IE0
  10.4.4/24   10.4.4.2      U    274   408185  WE3
  10.4.4/24   10.4.4.3      U    274   445714  WE4
  10.4.4.2    10.4.4.2      UHL    0        0  WE3
  10.4.4.3    10.4.4.3      UHL    0       14  WE4

  WE3: flags=c43<UP,BROADCAST,RUNNING,MULTICAST,SIMPLEX>
    inet 10.4.4.2 netmask ffffff00 broadcast 10.4.4.255 ipmtu 1500
    inet 10.4.4.3 netmask ffffff00 broadcast 161.114.69.63 ipmtu 1500

  WE4: flags=c43<UP,BROADCAST,RUNNING,MULTICAST,SIMPLEX>
    failSAFE IP - interface is in a failed state.
    failSAFE IP Addresses
      inet 10.4.4.2 netmask ffffff00 broadcast 161.114.69.63 (on QBB3 WE3)
      inet 10.4.4.3 netmask ffffff00 broadcast 10.4.4.255 (on QBB3 WE3)
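Saved interface listings like the ones on these slides can be scanned for the failed-state message to record which interface was down in each capture. This is a minimal sketch under stated assumptions: it assumes each interface block starts with a header such as `WE3: flags=...` and that a failed interface carries the text "interface is in a failed state" before the next header. The function name and the abbreviated sample are illustrative only.

```python
import re
from typing import Dict, Optional

# Interface header, e.g. "WE3: flags=c43<UP,...>" (the colon is optional here).
HEADER = re.compile(r"^\s*(\w+):?\s+flags")
FAILED_TEXT = "interface is in a failed state"


def failed_interfaces(output: str) -> Dict[str, bool]:
    """Map each interface name to True if its block reports a failed state."""
    status: Dict[str, bool] = {}
    current: Optional[str] = None
    for line in output.splitlines():
        header = HEADER.match(line)
        if header:
            current = header.group(1)
            status[current] = False
        elif current is not None and FAILED_TEXT in line:
            status[current] = True
    return status


# Abbreviated version of the "Failed 1" listing (WE3 cable pulled).
SAMPLE = """
WE3: flags=c43<UP,BROADCAST,RUNNING,MULTICAST,SIMPLEX>
  failSAFE IP - interface is in a failed state
  failSAFE IP Addresses
    inet 10.4.4.2 netmask ffffff00 broadcast 10.4.4.255 (on QBB3 WE4)
WE4: flags=c43<UP,BROADCAST,RUNNING,MULTICAST,SIMPLEX>
  inet 10.4.4.3 netmask ffffff00 broadcast 10.4.4.255 ipmtu 1500
"""

print(failed_interfaces(SAMPLE))
```

Note that the flags line still reports UP and RUNNING for the failed interface in these listings, which is why the sketch keys on the failSAFE IP message rather than on the flags.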
21. RAC DT/HA - T4 data failSAFE IP
[Chart annotations: EWD0 cable pulled; EWE0 cable pulled]
22. RAC DT/HA - 100km cable Network
23. RAC DT/HA - T4 EWA0 w/100km cable
24. RAC DT/HA - T4 EIA compared w/ EWA
25. RAC DT/HA - Load Generation Data
- 50k transactions, no RAC or network failure
26. RAC DT/HA - Load Generation Data
- 50k transactions, network failover
27. RAC DT/HA - Load Generation Data
- 50k transactions, 1 RAC instance failed
28. RAC DT/HA - Conclusions
- RAC seemed to have no problems when running with the network configured to use LAN Failover or failSAFE IP (on the same node).
- There seems to be a definite distributing effect on network traffic when the Oracle init.ora parameter CLUSTER_INTERCONNECTS is used.
29. RAC DT/HA - Phase II and III
- Phase II: Configure an Oracle 9i RAC 2-node cluster using RAID-1 shadow sets for database and log files, and test the recently released Host-Based Mini-Merge (HBMM) functionality in a variety of configurations.
  - Refer to http://h71000.www7.hp.com/news/hbmm.htm
- Phase III: Distribute nodes in the cluster over a 100 km distance and test failover and HBMM functionality
30. RAC DT/HA - References
- OpenVMS Technical Journal
  - Matt Muggeridge's July 2003 V2 article, "Configuring TCP/IP for High Availability"
    http://h71000.www7.hp.com/openvms/journal/v2/articles/tcpip.pdf
  - Steve Lieman's January 2004 V3 article, "TimeLine-Driven Collaboration with T4 & Friends: A Time-saving Approach to OpenVMS Performance"
    http://h71000.www7.hp.com/openvms/journal/v3/t4.pdf
31. RAC DT/HA - References (cont)
- TCPIP docs
  - http://h71000.www7.hp.com/doc/tcpip54.html
- OpenVMS docs: http://h71000.www7.hp.com/doc/os732_index.html
- HP TCP/IP Services for OpenVMS Management, Chapter 5, Configuring and Managing failSAFE IP
  - http://h71000.www7.hp.com/doc/732final/documentation/pdf/aa-lu50m-te.pdf
32. RAC DT/HA - References (cont)
- HP OpenVMS System Management Utilities Reference Manual, Chapter 13, LAN Control Program (LANCP) Utility
  - http://h71000.www7.hp.com/doc/732FINAL/DOCUMENTATION/PDF/aa-pv5ph-tk.PDF
- HP OpenVMS System Manager's Manual, Volume 2: Tuning, Monitoring, and Complex Systems, Chapter 10, Managing the Local Area Network (LAN) Software
  - http://h71000.www7.hp.com/doc/732FINAL/aa-pv5nh-tk/aa-pv5nh-tk.pdf
33. RAC DT/HA - References (cont)
- Oracle References
  - Swingbench: an unofficial load-generating benchmarking tool, developed in Java, which allows a load to be generated and the transactions/response times to be charted
    http://www.dominicgiles.com/swingbench.php
  - OTN (otn.oracle.com): "Real 24/7: Use Oracle9i RAC and TAF to guarantee availability."
    http://otn.oracle.com/oramag/oracle/02-may/o32clusters.html
34. RAC DT/HA - References (cont)
- Oracle Metalink articles (metalink.oracle.com)
  - Note 183340.1 - "Frequently Asked Questions About the CLUSTER_INTERCONNECTS Parameter in 9i"
  - Note 220970.1 - "Which network is Oracle using for RAC traffic?"
  - Note 162725.1 - "OPS/RAC VMS: Using alternate TCP Interconnects on 8i OPS and 9i RAC on OpenVMS"
  - Note 226880.1 - "Configuration of Load Balancing and Transparent Application Failover"
35. OpenVMS Solutions Lab
- Available to customers to test new hardware, software, and applications
- Alpha and Integrity systems available for use
- To get the most benefit from the Lab, the customer is expected to be prepared with an exact list of hardware and software requirements, a test plan, and goals