Monitoring ObjectStore Performance - PowerPoint PPT Presentation

1 / 57
About This Presentation
Title:

Monitoring ObjectStore Performance

Description:

Network Usage - indicates heavy. client - server. communications ... Memory Usage. should have enough physical memory to keep critical processes in real memory ... – PowerPoint PPT presentation

Number of Views:156
Avg rating:3.0/5.0
Slides: 58
Provided by: dirkb6
Category:

less

Transcript and Presenter's Notes

Title: Monitoring ObjectStore Performance


1
Monitoring ObjectStore Performance
  • Dirk Butson (dbutson_at_progress.com)

2
Why Am I Here?
  • To enumerate the tools available to everyone
    using R6.1 for monitoring ObjectStore behavior.
  • System Commands
  • ObjectStore Commands
  • ObjectStore APIs
  • unsupported library

3
Agenda
  • What and Why
  • System Resources
  • ObjectStore Server
  • Database / Log I/O
  • ObjectStore Client
  • Summary

4
Why Monitor Performance?
  • Critical step in order to tune the system
  • Helps analyze problems / issues that might occur
    during production e.g. -
  • whats the effect of doing an online backup on
    the system?
  • how has 30 growth of our business affected our
    system?
  • how does the new release compare with the
    previous release?

5
What Do We Monitor?
  • ObjectStore architecture / components
  • Hardware
  • CPU, Disk I/O, Memory Usage, Network I/O
  • Software
  • client(s), cache manager(s), server(s)
  • Files
  • databases, transaction log
  • cache and commseg files (UNIX only)
  • Interprocess Communications
  • messages to server, callbacks to the cache
    manager, messages to the cache manager from client

6
ObjectStore Architecture
Server Host
Client Host
Application Process
Transaction Log
Program I-Session
?
PageCache
?
Client Library
Commseg
Database Files
ADB
Cache Manager
7
Agenda
  • What and Why
  • System Resources
  • ObjectStore Server
  • Database / Log I/O
  • ObjectStore Client
  • Summary

8
System Tools
  • Gives us an understanding of what system
    resources we are using heavily
  • and possibly inefficiently
  • We want to look at
  • Disk Usage - which drives, files
  • CPU Usage - which processes
  • Memory Usage - are we swapping
  • Network Usage - indicates heavy client -
    server communications

9
System Tools (cont)
  • What we should expect to see
  • Disk Usage
  • low on client, maybe high on server
  • CPU
  • high on client might imply non database
    activities
  • high on server not expected
  • unless in kernel time (related to I/O calls)
  • Memory Usage
  • should have enough physical memory to keep
    critical processes in real memory
  • Network Usage
  • not expected to be a bottleneck
  • in a LAN environment

10
Win XP/2000/NT
  • Use Window Task Manager to visually inspect
    snapshot performance meters of CPU, I/O, and
    Memory usage for all processes on a given
    machine.
  • Use Microsoft Management Console to selectively
    record performance meters of CPU, I/O, and Memory
    usage for selective processes, processors, files,
    and disk.
  • Other utilities can be found in Resource Kit or
    http//www.sysinternals.com

11
Microsoft Mgmt Console
12
Unix System Tools
13
Agenda
  • What and Why
  • System Resources
  • ObjectStore Server
  • Database / Log I/O
  • ObjectStore Client
  • Summary

14
ObjectStore Server Tools
  • ossvrstat -meters -clients lthostnamegt
  • gives you an overview of server activities
  • totals
  • client-based activities
  • perl script ossvrstat.pl will collect and
    tabularize this data on a periodic basis.
  • ossvrdebug
  • provides a debug trace of server
  • use when number of messages are high and you are
    uncertain of what is going on.

15
ossvrstat Server Meters
  • ossvrstat gives you a snapshot of how much
    client/server work has been done since the
    osserver started, over the last hour, 10 minutes,
    and one minute.

promptgt ossvrstat meters MyHost Server Meters
for all clients that have used this
osserver Total since server start up Client
Meters 3530452 messages received 15582
callback msgs sent 88648 callback sectors
87808 succeeded sectors 13233499 KB read
13557865 KB written 95755
commits 15094 readonly commits
69820 aborts 0 two phase
txns 0 lock timeouts 486
lock waits (avg..) 44 deadlocks
0 msg buffer waits
16
ossvrstat Client Meters
  • ossvrstat gives you a snapshot of how much work
    each client has done since it started.

promptgt ossvrstat -clients MyHost Client Meters
(12 active client(s)) Client 543
(host1/7381/CallStartpjh_at_host1 1374 messages
received 246 callback messages sent
3904 callback sectors 3872 succeeded sectors
6350 KB read 7197 KB written
57 commits 51 readonly commits
4 aborts 0 two phase transactions 0 lock
timeouts 24 lock waits (average 15 ms)
1 deadlocks Client connections awaiting a client
message Client 543 (zubrus/7381/CallStartpjh_at_zu
brus) priority0x8000, duration2851 seconds,
work0, no txn on server
17
ossvrstat Limitations
  • Snapshots are not that useful when trying to
    analyze how the system is behaving
  • You need to look at trends
  • Snapshots throughout the day
  • Compare the trends to understand how components
    of the system are affecting other components

18
ossvrstat.pl
  • Perl script that takes server status at regular
    intervals.
  • Reformats data into two tables for easy plotting,
    comparison, trend analysis
  • Meter table (file)
  • client meters and server meter
  • Log table (file)
  • Log meter log size message buffer waits
  • The script is available after 6.0.SP7 in
    OS_ROOTDIR/unsupported/osu/scripts.

19
ossvrstat Meter Output
  • Each row is one client or server meter data for
    one loop interval.

20
ossvrstat Log Output
  • Each row is one loop interval.

21
Reviewing ossvrstat
  • After running ossvrstat.pl you have a history of
    how your system behaved during the run.
  • Recommend that you run this in production.
  • Keep a daily log of this output so that you can
  • Review the trends during the day, week, month,
    etc.
  • Establish a baseline to understand why the system
    feels sluggish
  • Use scripts to gather
  • throughput statistics
  • Plot to quickly visualize system behavior

22
ossvrstat.pl Scripting
  • Meter data
  • Use client id as key for splitting file by
    client.
  • awk NF 17 print 0 gt 3 .txt
  • Look at throughput (write / txn)
  • awk
  • if (11 ! LC)
  • print d f\n,3,
  • (8-LW3)/(11-LT3)
  • LW3 8
  • LT3 11
  • LC 11

23
ossvrstat.pl Visualizing
  • Graphs can quickly point out issues and patterns
    of behaviors.

24
ossvrdebug
  • ossvrdebug lthostnamegt ltlevelgt
  • level 0-9 , 0 turns debugging off
  • must have root permissions to set debugging
  • Use to understand what messages are being sent
    from a client to the server
  • Or to find out where lock contention is

25
ossvrdebug Example()
  • Given the following ossvrstat output

promptgt ossvrstat -clients MyHost Client Meters
(12 active client(s)) Client 8
(mylaptop/1840/whysomanymsgs) 69195 messages
received 0 callback messages sent 0
callback sectors 0 succeeded sectors
1997 KB read 0 KB written
0 commits 1 readonly commits
0 aborts 0 two phase
transactions 0 lock timeouts 0
lock waits 0 deadlocks
Turn on server debugging for a little
while ossvrdebug butsonhp6000 9 wait (10
secs) ossvrdebug butsonhp6000 0
26
ossvrdebug (Example)
  • Look at c\temp\osserver.txt
  • Heres a simple awk script run on this example.

CID 10 /RECEIVED/ MSGID (NF-1)
MESIDCID MSGID /FINISHED/ printf "s
s s\n",CID, MESIDCID,MESSAGECID
MESSAGECID 4 substr(0,index(0,")")2)

27
ossvrdebug (Example)
  • awk -f messages.awk C\temp\osserver.txt

8 101182 132352 server_actions
read_with_read_own 8 101183 132352
server_actions get_cluster_size 8 101184
132352 server_actions is_cluster_deleted 8
101185 132352 server_actions
get_cluster_size 8 101186 132352
server_actions is_cluster_deleted 8 101187
132352 server_actions get_cluster_size 8
101188 132352 server_actions
is_cluster_deleted 8 101189 132352
server_actions get_cluster_size ... 8 103229
132353 server_actions get_cluster_size 8
103230 132353 server_actions is_cluster_deleted
28
Agenda
  • What and Why
  • System Resources
  • ObjectStore Server
  • Database / Log I/O
  • ObjectStore Client
  • Summary

29
ObjectStore Database / Log Tools
  • Time spent in the server will most likely be
    related to disk I/O or Concurrency Management
    (lock waits, resource waits)
  • There are several commands to look at time spent
    doing disk I/O
  • What database files are opened?
  • What AIO threads are databases assigned to?
  • How much time is being spent reading,writing or
    flushing file?

30
Server I/O Commands
  • ossvrstat -databases lthostnamegt
  • to get a list of databases and which AIO threads
    they are assigned to
  • osdbcontrol -statistics ltdatabasegt ...
  • to get I/O statistics related to a given database
  • ossvrstat -log lthostnamegt
  • to get detailed output related to transaction log
    including time spent in I/O to/from transaction
    log

31
ossvrstat -databases
  • New in R6.1
  • displays information about the databases
    currently opened by each client

ossvrstat databases localhost Client Open
Databases (1 active client(s)) Client 22
(butsonhp6000/1180/(unknown)) RD 12s CR RL TR
3 localhostX\ADB\osquery.adb WR 16s CR RL TR
3 localhostX\dbs\mydb.odb RD 12s CR RL TR
1 localhostX\ADB\ossevol.adb RD 15s CR RL TR
1 localhostX\ADB\osji.adb RD 15s CR RL TR
2 localhostX\ADB\os_coll.adb
32
ossvrstat -databases
33
osdbcontrol -statistics
  • Displays I/O statistics for each given database
  • How long has statistics been accumulated
  • For each I/O operation (read, write, flush)
  • Number of operations performed (N)
  • Total number of sectors handled (S)
  • Total elapsed time of operation (t)
  • Number of seconds db has been opened (T)

34
osdbcontrol -statistics (cont)
  • Also includes some basic calculations for each
    operation
  • operations / sec N/T
  • million bytes / sec (S512)/T/(1000000)
  • time / operation (t/N)
  • time / million bytes (t / (S512/(1000000)))
  • total time in operation (100t/T)

35
osdbcontrol (Example)
osdbcontrol statistics test.odb osdbcontrol
statistics for database test.odb Statistics
accumulated over 0 hrs 0 mins 28 secs 4662
reads of 33805 sectors took 19.888 seconds of 28
secs. 166.500000 reads per second 0.618149
million bytes per second read 4.27 milliseconds
per read operation 1.15 seconds per million
bytes read 71.030357 of elapsed time spent
reading this file 0 writes of 0 sectors took
0.000 seconds of 28 seconds 0 flushes of 0
sectors took 0.000 seconds of 28 seconds
36
ossvrstat -log lthostnamegt
  • Information about servers transaction log
  • Divided into 4 parts
  • amount of disk space used
  • metering info over 4 time intervals
  • total, recent hour, recent 10 mins, recent min
  • databases where propagation has been delayed due
    to backup or MVCC
  • server parameter settings relevant to the
    transaction log

37
ossvrstat -log lthostnamegt
  • Disk space usage

Data segment
2 record segments
Log size
Not assigned
overhead
Size 3840KB 3MB data 512KB records 255KB
free 0KB overhead. Transaction log AIO is
served by thread 1.
38
Agenda
  • What and Why
  • System Resources
  • ObjectStore Server
  • Database / Log I/O
  • ObjectStore Client
  • Summary

39
ObjectStore C Client
  • Core Services
  • transactions
  • relocation
  • schema management
  • cache management
  • page faulting
  • placements
  • sessions

40
ObjectStore Java Client
  • OSJI Implementation API is java code
  • DMA Interface is JNI code (C) bridging java to
    C mapping
  • Core Services and IPC are the only measurable
    components in an OSJI application

OSJI API
OSJI Implementation
DMA Interface
Soft Pointers
Core Services
IPC to Server and Cache Manager
41
API to Monitor Client
  • Use Hooks to instrument code
  • Transaction Hooks
  • Page Event Hooks
  • Instruments to collect
  • N Counters
  • Realtime Counters
  • Server Meters
  • CPU Usage
  • Time Spent
  • Page Event Activity

42
Transaction Hooks
  • Can be used to transparently add instrumentation
    around ostore transactions

int main() before_retry_transaction_hook()
OS_BEGIN_TXN(tx1,0,os_transactionupdate)
after_begin_transaction_hook() if ()
before_abort_transaction_hook()
os_transactionabort()
after_abort_transaction_hook()
before_commit_transaction_hook()
OS_END_TXN(tx1) after_commit_transaction_hook
()
43
Page Event Hooks
  • Static methods on class objectstore
  • Can register callbacks to be invoked when
  • page created
  • page fault
  • page fetch
  • page evicted
  • page relocated
  • page committed
  • page upgraded
  • page created

44
Counters API
  • Measure counts and times of core ObjectStore
    client
  • N-Counters
  • measure only counts
  • Realtime Counters
  • measure counts and time
  • Counters are on a per session basis

45
ObjectStore API toEnable and Read Counters
void objectstoreenable_realtime_counters() void
objectstoreenable_counters() void
objectstorerecord_realtime_counters(os_boolean)
void objectstorereset_counters() os_unsigned_int
32 objectstoreread_counter(...) os_int32
objectstoreget_n_counters() void
objectstoreget_all_counters()
46
OSU_MONITOR
  • Unsupported library to collect and log various
    instrumentation (record) at page event and
    transaction boundaries
  • The client collects (writes) event records
  • Event records are ASCII but not formatted
  • Scripts are written to view and/or analyze the
    data
  • ASCII format for event records eases script
    writing to parse and analyze

47
OSU_MONITOR
  • Easily added to your application via a single
    line of code
  • osu_monitorinitialize() or
  • osu_monitorinitialize_all()
  • No overhead if not enabled
  • leave in your production code to analyze
    unexpected behavior
  • Enable by setting environment variables before
    execution of application
  • OSU_MONITOR
  • selective filters which events to record
  • if not set then no events are recorded and no
    overhead incurred
  • OSU_MONITOR_URL
  • where to write event records - filename or socket

48
OSU_MONITOR (API)
  • include ltosu/osu_monitor.hhgt
  • objectstoreinitialize()
  • osu_monitorinitialize()

setenv OSU_MONITOR TMN test db test.db
49
OSU_MONITOR Environment Variable
  • Set to any value will output duration and CPU
    usage for each transaction
  • You can turn on various instrumentation by
    including any of the following
  • T activate and output realtime counters
  • M activate and output srvstat meters
  • N activate and output n-counters
  • other flags for turning on each individual
  • page event hook (see next slide).

50
Enabling Page Events
  • AWCDFRSEU
  • A Page active hook
  • W Page commit hook (write)
  • C Page create hook
  • D Page deadlock hook
  • F Page fault hook
  • R Page fetch hook (read)
  • S Page relocation hooks (swizzle)
  • E Page evict hook
  • U Page upgrade hook

51
OSU_MONITOR Events
int main() OS_BEGIN_TXN(tx1,0,os_trans
actionupdate) if ()
os_transactionabort()
OS_END_TXN(tx1)
BEFORE TXN RETRY
AFTER TXN BEGIN
BEFORE TXN ABORT
AFTER TXN ABORT
BEFORE TXN COMMIT
AFTER TXN COMMIT
52
osu_monitor Scripts
  • There are some basic awk scripts in
  • OS_ROOTDIR/unsupported/osu/scripts
  • These scripts reformat the raw output into
    something more understandable
  • msgs.awk - creates a table of which msgs are
    sent from client to server (TM)
  • osu_monitor.awk - reformats and displays on a per
    transaction basis
  • txntime.awk - reformats in a table format where
    time is spent for each transaction

53
msgs (durations)
awk f msgs.awk v OPTD osu_monitior_2324.txt R
ealtime times of server calls for session
6qu2w txnname nmsgs nacct READ
READS UPGRD EVICT WRITE 2 d\r6.0_bl26\sr 73
9 0.041 0.000 0.000 0.000 0.000 ... 1
setup 117 9 0.066 0.000 0.000 0.000
0.570 ... 1 read_queue 1 0 0.000
0.000 0.000 0.000 0.028 ... 1 read_queue
13 1 6.269 0.000 0.000 0.000 0.000 ... 1
read_queue 5 0 10.279 0.000 0.000
0.000 0.000 ... 1 execute 1774 5
1.502 0.000 10.436 0.040 0.566 ... 1
read_queue 85 0 0.056 0.000 0.006 0.003
0.203 ... 1 execute 903 0 1.204
0.000 0.022 0.102 1.060 ... 1 read_queue
0 -1 0.000 0.000 0.000 0.000 0.000 ... 1
read_queue 3 0 0.330 0.000 0.000
0.000 0.000 ... 1 execute 870 0
1.079 0.000 0.028 0.142 1.252 ... 1
read_queue 0 -1 0.000 0.000 0.000 0.000
0.000 ... 1 read_queue 3 0 0.064
0.000 0.000 0.000 0.001 ...
54
txnsummary
  • 37203 txn-begin create CREATE
  • allocate_normal_object 256 0.0660
  • commit_rpc 1 1.1792
  • map_page 256 0.0026
  • protect_page 771 0.0070
  • reloc_page_in 256 0.0014
  • return_rpc 50 0.0125
  • server_side_allocation 16 0.0109
  • signal 3 0.0001
  • validate_schema 1 0.0001
  • n_create_page 256
  • n_evict 250
  • n_page_fault 3
  • n_page_fault_active 3
  • n_pages_committed 256
  • n_sigsegv 3
  • n_top_level_commits 1
  • n_touch_page 3
  • n_unwire 250

55
Agenda
  • What and Why
  • System Resources
  • ObjectStore Server
  • Database / Log I/O
  • ObjectStore Client
  • Summary

56
Summary
  • Monitor your system
  • There are some tools in ObjectStores
    unsupported directory to help you
  • osu_monitor
  • ossvrstat.pl
  • There are various ObjectStore commands and C
    APIs you can use
  • And dont forget the system commands available

57
?
Questions
Write a Comment
User Comments (0)
About PowerShow.com