In RAC We Trust

ORACLE - From Dream To Production. Presented by Plamen Zyumbyulev at BGOUG Gabrovo, 22.04.2005.

Provided by: PlamenZy
Learn more at: https://bgoug.org

1
In RAC We Trust
  • ORACLE - From Dream To Production

BGOUG Gabrovo 22.04.2005
"Let someone know"
Plamen Zyumbyulev
2
Presentation Goals
  • Describe the major steps in RAC implementation
  • Show and explain the main problems and obstacles
  • Show and explain solutions and workarounds
  • Give some practical ideas and variations
  • Try to look at the problems from different angles
  • Non-Goals
  • Explain RAC concepts and fundamentals
  • Show ALL aspects of RAC

3
Agenda
  • Introduction
  • RAC Installation
  • High Availability Configuring
  • Testing and Tuning RAC
  • Implementing RAC in production

4
Introduction
5
RAC Installation
  • Problems
  • Solutions

6
Problems
  • Cluster hardware has to be evaluated and purchased
  • Knowledgeable OS, High Availability,
    Network and Storage professionals are needed
  • All this takes time

7
Solutions
  • RAC on Single Node
  • RAC on Single VMware Node
  • RAC on Multiple VMware Nodes
  • RAC and Network Block Device
  • Other Solutions

All of the solutions presented here are for
testing purposes ONLY. These configurations are
not certified or supported by Oracle Support
Services.
8
RAC on Single Node
  • Why not?
  • Metalink Note 241114.1 - Step-By-Step
    Installation of RAC on Linux - Single Node
    (Oracle9i 9.2.0 with OCFS)
  • Key Points
  • No need for a fencing configuration
  • No need for a clustered file system or raw
    devices
  • No need for multiple Oracle homes (ORACLE_HOME)

9
RAC on Single Node (contd)
  • Key Points (contd)
  • One oracle user with two or more profiles, one for
    every instance (e.g. .rac1, .rac2, ...)
  • zyumix/ # su - oracle
  • oracle@zyumix$ . rac1
  • oracle@zyumix$ echo $ORACLE_SID
  • rac1
  • zyumix/ # su - oracle
  • oracle@zyumix$ . rac2
  • oracle@zyumix$ echo $ORACLE_SID
  • rac2
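The per-instance profiles mentioned on this slide could look like the following minimal sketch. The profile names .rac1 and .rac2 come from the slide; the ORACLE_HOME path, the PROFILE_DIR variable and the write_profile helper are assumptions made only for illustration.

```shell
#!/bin/sh
# Sketch: generate one small profile file per instance.
# PROFILE_DIR defaults to a scratch directory so the sketch is safe to run;
# on a real system the profiles would live in the oracle user's home.
PROFILE_DIR="${PROFILE_DIR:-$(mktemp -d)}"

write_profile() {
    # $1 = instance name (e.g. rac1); writes $PROFILE_DIR/.<name>
    cat > "$PROFILE_DIR/.$1" <<EOF
ORACLE_HOME=/u01/app/oracle/product/9.2.0   # assumed location
ORACLE_SID=$1
export ORACLE_HOME ORACLE_SID
EOF
}

for sid in rac1 rac2; do
    write_profile "$sid"
done

# Source one profile in a subshell and show which instance it selects
( . "$PROFILE_DIR/.rac1"; echo "$ORACLE_SID" )
```

Both profiles share one ORACLE_HOME and differ only in ORACLE_SID, which matches the "no need of multiple oracle homes" key point.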

10
RAC on Single Node (contd)
  • Key Points (contd)
  • Oracle Universal Installer needs the Clusterware
    software in order to install the RAC option.
  • Disadvantages

11
RAC on Single Node (contd)
[Diagram: client-side load balancing across Listener A and Listener B; server-side load balancing across Server A (Instance A) and Server B (Instance B); both instances share one database]
12
RAC on Single VMware Node
  • Even easier!
  • The Oracle-on-Linux VMware Cookbook -
    http://www.oracle.com/technology/tech/linux/vmware/cookbook/index.html
  • An easy, hands-on, step-by-step guide describing
    how to install VMware, Linux (RHEL/SLES) and
    Oracle RAC (again on a single node)
  • VMware Workstation (90-day free evaluation,
    registration required)
  • RHEL3 (not free), SLES8 (not free)
  • Disadvantages

13
RAC on Multiple VMware Nodes
  • VMware GSX/ESX Server permits the sharing of
    plain disks with multiple virtual machines
    running on the same host, provided the disk in
    question is a SCSI disk.
  • This approach is very powerful but complex. You
    can create very complex environments: multiple
    NICs, switches, disks etc.
  • Now there are several nodes, although virtual
  • Disadvantages

14
RAC and Network Block Device
  • This solution allows you to build a scalable and
    highly available database system using only common
    Intel PCs connected to an Ethernet network.
  • In this solution, a standard shared disk
    subsystem is replaced by a native Linux
    technology, Network Block Device (NBD), which maps
    remote files to local block devices (e.g.
    /dev/nb0) over a TCP/IP network. One computer (not
    necessarily a Linux machine) serves as data storage
    for all cluster nodes (Linux machines) instead of
    an expensive disk array.

15
RAC and Network Block Device (contd)
  • With NBD compiled into the kernel, Linux
    can use a remote server as one of its block
    devices. Every time the client computer wants to
    read /dev/nb0, it sends a request to the
    server via TCP, and the server replies with the
    requested data.
  • The remote resource doesn't need to be a whole
    disk or even a partition. It can be a file.

16
RAC and Network Block Device (contd)
[Diagrams: typical configuration; simple NBD configuration]
17
RAC and Network Block Device (contd)
  • Installation
  • Both client and server machines run RHEL3
  • Download the source from http://nbd.sourceforge.net/
  • As root, run:
  • bunzip2 nbd-2.7.3.tar.bz2
  • tar -xvf nbd-2.7.3.tar
  • cd nbd-2.7.3
  • ./configure
  • make
  • make install

18
RAC and Network Block Device (contd)
  • Creating new empty files on the NBD server
  • [root@rac2 root]# dd if=/dev/zero of=/u01/oradata/rac/system.01 count=300 bs=1M
  • 300+0 records in
  • 300+0 records out
  • 314572800 bytes transferred in 1.683993 seconds
    (186801738 bytes/sec)
  • [root@rac2 root]#
  • Running the NBD server
  • Syntax: nbd-server <port> <filename>
  • [root@rac2 root]# nbd-server 4101 /u01/oradata/rac/system.01
  • [root@rac2 root]#
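A real database needs more than one file, so the two server-side steps are likely to be repeated per file. The following is a dry-run sketch that only prints the dd and nbd-server commands, one TCP port per file. The directory, the 300 MB size and port 4101 are taken from the slide; the extra file names (undo.01, users.01) and the helper names are hypothetical.

```shell
#!/bin/sh
# Dry-run sketch: print (do not run) the commands that create and export
# several backing files, using consecutive ports starting at BASE_PORT.
DATA_DIR=/u01/oradata/rac
BASE_PORT=4101
SIZE_MB=300

# Size in bytes of a file made of $1 blocks of 1 MiB each
size_bytes() {
    echo $(( $1 * 1024 * 1024 ))
}

# Print the dd and nbd-server command for every file name in "$@"
plan_exports() {
    port=$BASE_PORT
    for name in "$@"; do
        echo "dd if=/dev/zero of=$DATA_DIR/$name count=$SIZE_MB bs=1M"
        echo "nbd-server $port $DATA_DIR/$name"
        port=$(( port + 1 ))
    done
}

plan_exports system.01 undo.01 users.01
echo "each file: $(size_bytes "$SIZE_MB") bytes"
```

The byte count printed for 300 blocks of 1 MiB is the same 314572800 that dd reports on the slide, which is a quick way to check that count and bs were typed as intended.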

19
RAC and Network Block Device (contd)
  • NBD client
  • The NBD client must be run as root (because of
    the kernel parts of NBD). Before starting the NBD
    client you have to install the Linux kernel NBD module
  • Installing the nbd module on RHEL3
  • [root@rac3 root]# rpm -Uvh kernel-unsupported-2.4.21-4.EL.i686.rpm
  • warning: kernel-unsupported-2.4.21-4.EL.i686.rpm:
    V3 DSA signature: NOKEY, key ID db42a60e
  • Preparing...              [100%]
  • 1:kernel-unsupported      [100%]
  • [root@rac3 root]#

20
RAC and Network Block Device (contd)
  • NBD client (contd)
  • Loading the nbd module
  • [root@rac3 dev]# lsmod | grep nbd
  • [root@rac3 dev]# modprobe nbd
  • [root@rac3 dev]# lsmod | grep nbd
  • nbd 16388 0 (unused)
  • [root@rac3 dev]#
  • Running the nbd client
  • Syntax: nbd-client <data server> <port> /dev/nb<n>
  • [root@rac3 dev]# nbd-client rac2 4101 /dev/nb0
  • Negotiation: ..size = 307200KB
  • bs=1024, sz=307200
  • [root@rac3 dev]#

21
RAC and Network Block Device (contd)
  • Now the block devices are configured and it is
    possible to access remote data. Oracle Real
    Application Clusters needs raw access to the shared
    disk subsystem, so raw devices must be mapped to
    the block devices. This can be done with the
    standard raw command.
  • Syntax: raw /dev/raw/raw<N> /dev/<blockdev>
  • [root@rac3 root]# raw /dev/raw/raw1 /dev/nb0
  • /dev/raw/raw1: bound to major 43, minor 0
  • [root@rac3 root]#
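The client-side steps (load the module, attach each export, bind a raw device) can be sketched as one dry-run helper that prints the commands for N exports. The host rac2, port 4101 and the /dev/nb0 to /dev/raw/raw1 pairing follow the slides; the assumption that export i listens on port 4101+i, and the helper name plan_client, are illustrative only.

```shell
#!/bin/sh
# Dry-run sketch: print the client-side commands for N exported files.
# Assumes export i is served by NBD_SERVER on port BASE_PORT + i.
NBD_SERVER=rac2
BASE_PORT=4101

plan_client() {
    # $1 = number of exports to attach
    echo "modprobe nbd"
    i=0
    while [ "$i" -lt "$1" ]; do
        echo "nbd-client $NBD_SERVER $(( BASE_PORT + i )) /dev/nb$i"
        echo "raw /dev/raw/raw$(( i + 1 )) /dev/nb$i"
        i=$(( i + 1 ))
    done
}

plan_client 2
```

Printing the plan first makes it easy to review the device numbering before running anything as root on every cluster node.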

22
Other Solutions
  • RAC and FireWire
  • Build Your Own Oracle RAC 10g Cluster on Linux
    and FireWire
  • http://www.oracle.com/technology/pub/articles/hunter_rac10g.html
  • RAC and NFS
  • Locking
  • Caching
  • Write through cache

23
HA Configuration
24
Simplified RAC schema
[Diagram: nodes connected through the network (NET) to a shared database]
25
HA System
26
Maximum Availability Architecture
[Diagram: an application server at the primary site and one at the secondary site; the two sites are linked by Data Guard]
27
Extended Distance Clusters
[Diagram: an application server at the primary site and one at the secondary site; a single RAC spans both sites on top of a virtualization storage layer]
What about the quorum server?
28
Extended Distance Clusters (contd)
  • Resolving Distance Problems
  • Application partitioning
  • gc_files_to_locks
  • Wavelength Division Multiplexing
  • Dense Wavelength Division Multiplexing (DWDM)
  • Coarse Wavelength Division Multiplexing (CWDM)
  • ACTIVE_INSTANCE_COUNT
  • *.active_instance_count = 1
  • *.cluster_database_instances = 2

29
Testing and Tuning RAC
  • Introduction
  • RAC testing steps
  • Functional Application Tests
  • RAC High Availability tests
  • Scalability tests
  • Digging into RAC performance problems

30
Introduction
  • Testing isn't trivial!
  • Classical testing/tuning methods.
  • Always tune single instance first!
  • Specific RAC issues
  • RAC aware tools

31
RAC testing steps
  • Functional Application Tests
  • RAC High Availability tests
  • Be aware of the timeouts!
  • Scalability tests

32
RAC testing steps (contd)
  • Scalability tests (contd)
  • Patterns of application scalability

[Chart: performance of one user (TPS, 1/response time) versus load (# of users, size of tables); curve shapes: nearly static, linear, constrained, exponential]
33
RAC testing steps (contd)
  • Scalability tests (contd)
  • Good scalability

[Chart: performance in TPS (for all users) versus # of concurrent users; curves: 2-node RAC, single node]
34
RAC testing steps (contd)
  • Scalability tests (contd)
  • Problem!!!

Possible disk bottleneck
[Chart: performance in TPS (for all users) versus # of concurrent users; curves: 2-node RAC, single node]
35
RAC testing steps (contd)
  • Scalability tests (contd)
  • Problem!!!

Possible interconnect bottleneck
[Chart: performance in TPS (for all users) versus # of concurrent users; curves: single node, 2-node RAC]
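One simple way to put a number on these scalability charts is the ratio of cluster throughput to single-node throughput at the same user count. The sketch below is an illustration only: the formula (speedup divided by node count) is a common convention rather than something the slides prescribe, and the TPS figures are made-up sample values.

```shell
#!/bin/sh
# speedup    = TPS on N nodes / TPS on one node
# efficiency = speedup / N   (1.00 would be perfectly linear scaling)
scaling_efficiency() {
    # $1 = single-node TPS, $2 = N-node TPS, $3 = number of nodes
    awk -v s="$1" -v r="$2" -v n="$3" 'BEGIN { printf "%.2f\n", (r / s) / n }'
}

# Hypothetical measurement: 100 TPS on one node, 180 TPS on a 2-node RAC
scaling_efficiency 100 180 2
```

An efficiency well below 1 at moderate load is a hint to look at the disk and interconnect bottlenecks named on the preceding slides.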
36
Digging into RAC performance problems
  • Interconnect and shared storage are the two most
    likely performance problem areas in RAC
  • Interconnect speed
  • Throughput
  • Latency
  • Average latency of a consistent block request:
    AVG CR BLOCK RECEIVE TIME should typically be
    about 15 milliseconds, depending on your system
    configuration and volume
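The average CR block receive time is derived from two cumulative instance statistics. The sketch below assumes the Oracle9i statistic names "global cache cr block receive time" and "global cache cr blocks received", and assumes the time statistic is reported in centiseconds (hence the factor of 10 to get milliseconds); both should be checked against racdiag.sql on the actual release. The input values are sample numbers, not measurements.

```shell
#!/bin/sh
# Average CR block receive time in ms from two cumulative statistics.
avg_cr_block_receive_ms() {
    # $1 = "global cache cr block receive time" (assumed centiseconds)
    # $2 = "global cache cr blocks received"
    awk -v t="$1" -v b="$2" 'BEGIN {
        if (b == 0) { print "n/a"; exit }   # no blocks received yet
        printf "%.2f\n", (t / b) * 10
    }'
}

# Sample values: 1500 centiseconds spent receiving 10000 CR blocks
avg_cr_block_receive_ms 1500 10000
```

Because both statistics are cumulative since instance startup, taking the difference between two snapshots gives a more useful figure for a specific test run.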

37
Digging into RAC performance problems (contd)
  • Interconnect types

38
Digging into RAC performance problems (contd)
  • cluster_interconnects parameter
  • It provides Oracle with information about
    additional cluster interconnects available for
    use and can be used to load balance the
    interconnect traffic across different physical
    interconnects, thus increasing interconnect
    bandwidth.
  • When you set CLUSTER_INTERCONNECTS in cluster
    configurations, the interconnect high
    availability features are not available. In other
    words, an interconnect failure that is normally
    unnoticeable would instead cause an Oracle
    cluster failure as Oracle still attempts to
    access the network interface which has gone down.

39
Digging into RAC performance problems (contd)
  • STATSPACK reports
  • The STATSPACK report shows statistics ONLY for the
    node or instance on which it was run
  • Top 5 Timed Events
  • Global Cache Service and Global Enqueue Service
  • Note 135714.1 Script to Collect RAC Diagnostic
    Information (racdiag.sql)

40
Digging into RAC performance problems (contd)
BAD PERFORMANCE
Top 5 Timed Events
                                                    % Total
Event                             Waits  Time (s)  Ela Time
------------------------------ -------- --------- ---------
global cache cr request          34,568       958     31.44
buffer busy global CR             6,513       620     20.35
db file sequential read          64,214       455     14.92
latch free                       13,542       453     14.88
buffer busy waits                10,971       295      9.69

GOOD PERFORMANCE
Top 5 Timed Events
                                                    % Total
Event                             Waits  Time (s)  Ela Time
------------------------------ -------- --------- ---------
latch free                       10,969       666     51.28
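The Waits and Time (s) columns of the Top 5 Timed Events section can be combined into an average wait per event, which is often easier to compare between a bad and a good run than the totals. This is a small sketch; the helper name avg_wait_ms is hypothetical, and the inputs are the two rows from the reports above with thousands separators removed.

```shell
#!/bin/sh
# Average wait in milliseconds = total wait time (s) * 1000 / number of waits
avg_wait_ms() {
    # $1 = waits, $2 = total wait time in seconds
    awk -v w="$1" -v t="$2" 'BEGIN { printf "%.2f\n", (t * 1000) / w }'
}

avg_wait_ms 34568 958    # global cache cr request (bad run)
avg_wait_ms 10969 666    # latch free (good run)
```

For the bad run this works out to roughly 28 ms per global cache cr request wait, which points at the interconnect rather than at local I/O.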

41
Digging into RAC performance problems (contd)
Global Cache Service - Workload Characteristics    BAD    GOOD
-----------------------------------------------
Ave global cache get time (ms)                    11.8     2.2
Ave global cache convert time (ms)                51.7    11.2
Ave build time for CR block (ms)                   0.7     0.0
Ave flush time for CR block (ms)                   0.2     0.2
Ave send time for CR block (ms)                    0.0     0.2
Ave time to process CR block request (ms)          0.9     0.4
Ave receive time for CR block (ms)                 1.6     0.4
Ave pin time for current block (ms)                0.2     0.2
Ave flush time for current block (ms)              0.0     0.0
Ave send time for current block (ms)               0.1     0.1
Ave time to process current block request (ms)     0.3     0.3
Ave receive time for current block (ms)           33.4     7.5
Global cache hit ratio                             9.5     3.9
Ratio of current block defers                      0.0     0.0
% of messages sent for buffer gets                 6.7     2.5
% of remote buffer gets                            1.8     0.7
Ratio of I/O for coherence                         1.2     1.3

42
Digging into RAC performance problems (contd)
Global Enqueue Service Statistics                      BAD    GOOD
---------------------------------
Ave global lock get time (ms)                          0.2     0.0
Ave global lock convert time (ms)                      0.0     0.0
Ratio of global lock gets vs global lock releases      1.2     1.1

GCS and GES Messaging statistics                       BAD    GOOD
--------------------------------
Ave message sent queue time (ms)                      16.5     1.7
Ave message sent queue time on ksxp (ms)              29.4     2.5
Ave message received queue time (ms)                   1.9     0.3
Ave GCS message process time (ms)                      0.1     0.1
Ave GES message process time (ms)                      0.1     0.0
% of direct sent messages                             49.7    63.4
% of indirect sent messages                           50.3    36.6
% of flow controlled messages                          0.0     0.0

43
Implementing RAC in production
  • Smooth transition from single instance to RAC
  • Change ORACLE_HOME
  • Relinking the RAC Option ON/OFF
  • CLUSTER_DATABASE = TRUE/FALSE
  • Start/Stop the second instance
  • Start gradual movement of clients from one
    instance to another

44
Relinking the RAC Option
  • 1. Log in as the Oracle software owner and shut down
    all database instances on all nodes in the
    cluster.
  • 2. cd $ORACLE_HOME/rdbms/lib
  • 3. make -f ins_rdbms.mk rac_on (or rac_off)
  • If this step did not fail with fatal errors then
    proceed to step 4.
  • 4. make -f ins_rdbms.mk ioracle
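The relink steps above can be sketched as a small dry-run helper that prints the commands for either direction, so the same procedure serves both enabling and disabling the RAC option. The make targets are the ones named on the slide; the function name relink_plan and the argument check are assumptions for illustration, and the helper deliberately does not shut down instances or run make itself.

```shell
#!/bin/sh
# Dry-run sketch: print the relink commands for switching the RAC option.
relink_plan() {
    # $1 = on | off
    case "$1" in
        on|off) ;;
        *) echo "usage: relink_plan on|off" >&2; return 1 ;;
    esac
    echo 'cd "$ORACLE_HOME/rdbms/lib"'
    echo "make -f ins_rdbms.mk rac_$1"
    echo "make -f ins_rdbms.mk ioracle"
}

relink_plan on
```

Run as the Oracle software owner, the printed commands correspond to steps 2 to 4; step 1 (shutting down all instances on all nodes) still has to be done first.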

45
Reference
  • Metalink Note 211177.1 RAC Survival Kit: Rac On /
    Rac Off - Relinking the RAC Option
  • Metalink Note 183340.1 Frequently Asked Questions
    About the CLUSTER_INTERCONNECTS Parameter in 9i.
  • http://www.fi.muni.cz/~kripac/orac-nbd/
  • The Oracle-on-Linux VMware Cookbook
  • http://www.oracle.com/technology/tech/linux/vmware/cookbook/index.html
  • Build Your Own Oracle RAC 10g Cluster on Linux
    and FireWire
  • http://www.oracle.com/technology/pub/articles/hunter_rac10g.html
  • Metalink Note 135714.1 Script to Collect RAC Diagnostic
    Information (racdiag.sql)

46
Thank You
Q&A
zyumbyulev@mobiltel.bg