Title: Monitoring ObjectStore Performance
1Monitoring ObjectStore Performance
- Dirk Butson (dbutson_at_progress.com)
2Why Am I Here?
- To enumerate the tools available to everyone
using R6.1 for monitoring ObjectStore behavior. - System Commands
- ObjectStore Commands
- ObjectStore APIs
- unsupported library
3Agenda
- What and Why
- System Resources
- ObjectStore Server
- Database / Log I/O
- ObjectStore Client
- Summary
4Why Monitor Performance?
- Critical step in order to tune the system
- Helps analyze problems / issues that might occur
during production e.g. - - whats the effect of doing an online backup on
the system? - how has 30 growth of our business affected our
system? - how does the new release compare with the
previous release?
5What Do We Monitor?
- ObjectStore architecture / components
- Hardware
- CPU, Disk I/O, Memory Usage, Network I/O
- Software
- client(s), cache manager(s), server(s)
- Files
- databases, transaction log
- cache and commseg files (UNIX only)
- Interprocess Communications
- messages to server, callbacks to the cache
manager, messages to the cache manager from client
6ObjectStore Architecture
Server Host
Client Host
Application Process
Transaction Log
Program I-Session
?
PageCache
?
Client Library
Commseg
Database Files
ADB
Cache Manager
7Agenda
- What and Why
- System Resources
- ObjectStore Server
- Database / Log I/O
- ObjectStore Client
- Summary
8System Tools
- Gives us an understanding of what system
resources we are using heavily - and possibly inefficiently
- We want to look at
- Disk Usage - which drives, files
- CPU Usage - which processes
- Memory Usage - are we swapping
- Network Usage - indicates heavy client -
server communications
9System Tools (cont)
- What we should expect to see
- Disk Usage
- low on client, maybe high on server
- CPU
- high on client might imply non database
activities - high on server not expected
- unless in kernel time (related to I/O calls)
- Memory Usage
- should have enough physical memory to keep
critical processes in real memory - Network Usage
- not expected to be a bottleneck
- in a LAN environment
10Win XP/2000/NT
- Use Window Task Manager to visually inspect
snapshot performance meters of CPU, I/O, and
Memory usage for all processes on a given
machine. - Use Microsoft Management Console to selectively
record performance meters of CPU, I/O, and Memory
usage for selective processes, processors, files,
and disk. - Other utilities can be found in Resource Kit or
http//www.sysinternals.com
11Microsoft Mgmt Console
12Unix System Tools
13Agenda
- What and Why
- System Resources
- ObjectStore Server
- Database / Log I/O
- ObjectStore Client
- Summary
14ObjectStore Server Tools
- ossvrstat -meters -clients lthostnamegt
- gives you an overview of server activities
- totals
- client-based activities
- perl script ossvrstat.pl will collect and
tabularize this data on a periodic basis. - ossvrdebug
- provides a debug trace of server
- use when number of messages are high and you are
uncertain of what is going on.
15ossvrstat Server Meters
- ossvrstat gives you a snapshot of how much
client/server work has been done since the
osserver started, over the last hour, 10 minutes,
and one minute.
promptgt ossvrstat meters MyHost Server Meters
for all clients that have used this
osserver Total since server start up Client
Meters 3530452 messages received 15582
callback msgs sent 88648 callback sectors
87808 succeeded sectors 13233499 KB read
13557865 KB written 95755
commits 15094 readonly commits
69820 aborts 0 two phase
txns 0 lock timeouts 486
lock waits (avg..) 44 deadlocks
0 msg buffer waits
16ossvrstat Client Meters
- ossvrstat gives you a snapshot of how much work
each client has done since it started.
promptgt ossvrstat -clients MyHost Client Meters
(12 active client(s)) Client 543
(host1/7381/CallStartpjh_at_host1 1374 messages
received 246 callback messages sent
3904 callback sectors 3872 succeeded sectors
6350 KB read 7197 KB written
57 commits 51 readonly commits
4 aborts 0 two phase transactions 0 lock
timeouts 24 lock waits (average 15 ms)
1 deadlocks Client connections awaiting a client
message Client 543 (zubrus/7381/CallStartpjh_at_zu
brus) priority0x8000, duration2851 seconds,
work0, no txn on server
17ossvrstat Limitations
- Snapshots are not that useful when trying to
analyze how the system is behaving - You need to look at trends
- Snapshots throughout the day
- Compare the trends to understand how components
of the system are affecting other components
18ossvrstat.pl
- Perl script that takes server status at regular
intervals. - Reformats data into two tables for easy plotting,
comparison, trend analysis - Meter table (file)
- client meters and server meter
- Log table (file)
- Log meter log size message buffer waits
- The script is available after 6.0.SP7 in
OS_ROOTDIR/unsupported/osu/scripts.
19ossvrstat Meter Output
- Each row is one client or server meter data for
one loop interval.
20ossvrstat Log Output
- Each row is one loop interval.
21Reviewing ossvrstat
- After running ossvrstat.pl you have a history of
how your system behaved during the run. - Recommend that you run this in production.
- Keep a daily log of this output so that you can
- Review the trends during the day, week, month,
etc. - Establish a baseline to understand why the system
feels sluggish - Use scripts to gather
- throughput statistics
- Plot to quickly visualize system behavior
22ossvrstat.pl Scripting
- Meter data
- Use client id as key for splitting file by
client. - awk NF 17 print 0 gt 3 .txt
- Look at throughput (write / txn)
- awk
- if (11 ! LC)
- print d f\n,3,
- (8-LW3)/(11-LT3)
- LW3 8
- LT3 11
-
- LC 11
-
23ossvrstat.pl Visualizing
- Graphs can quickly point out issues and patterns
of behaviors.
24ossvrdebug
- ossvrdebug lthostnamegt ltlevelgt
- level 0-9 , 0 turns debugging off
- must have root permissions to set debugging
- Use to understand what messages are being sent
from a client to the server - Or to find out where lock contention is
25ossvrdebug Example()
- Given the following ossvrstat output
promptgt ossvrstat -clients MyHost Client Meters
(12 active client(s)) Client 8
(mylaptop/1840/whysomanymsgs) 69195 messages
received 0 callback messages sent 0
callback sectors 0 succeeded sectors
1997 KB read 0 KB written
0 commits 1 readonly commits
0 aborts 0 two phase
transactions 0 lock timeouts 0
lock waits 0 deadlocks
Turn on server debugging for a little
while ossvrdebug butsonhp6000 9 wait (10
secs) ossvrdebug butsonhp6000 0
26ossvrdebug (Example)
- Look at c\temp\osserver.txt
- Heres a simple awk script run on this example.
CID 10 /RECEIVED/ MSGID (NF-1)
MESIDCID MSGID /FINISHED/ printf "s
s s\n",CID, MESIDCID,MESSAGECID
MESSAGECID 4 substr(0,index(0,")")2)
27ossvrdebug (Example)
- awk -f messages.awk C\temp\osserver.txt
8 101182 132352 server_actions
read_with_read_own 8 101183 132352
server_actions get_cluster_size 8 101184
132352 server_actions is_cluster_deleted 8
101185 132352 server_actions
get_cluster_size 8 101186 132352
server_actions is_cluster_deleted 8 101187
132352 server_actions get_cluster_size 8
101188 132352 server_actions
is_cluster_deleted 8 101189 132352
server_actions get_cluster_size ... 8 103229
132353 server_actions get_cluster_size 8
103230 132353 server_actions is_cluster_deleted
28Agenda
- What and Why
- System Resources
- ObjectStore Server
- Database / Log I/O
- ObjectStore Client
- Summary
29ObjectStore Database / Log Tools
- Time spent in the server will most likely be
related to disk I/O or Concurrency Management
(lock waits, resource waits) - There are several commands to look at time spent
doing disk I/O - What database files are opened?
- What AIO threads are databases assigned to?
- How much time is being spent reading,writing or
flushing file?
30Server I/O Commands
- ossvrstat -databases lthostnamegt
- to get a list of databases and which AIO threads
they are assigned to - osdbcontrol -statistics ltdatabasegt ...
- to get I/O statistics related to a given database
- ossvrstat -log lthostnamegt
- to get detailed output related to transaction log
including time spent in I/O to/from transaction
log
31ossvrstat -databases
- New in R6.1
- displays information about the databases
currently opened by each client
ossvrstat databases localhost Client Open
Databases (1 active client(s)) Client 22
(butsonhp6000/1180/(unknown)) RD 12s CR RL TR
3 localhostX\ADB\osquery.adb WR 16s CR RL TR
3 localhostX\dbs\mydb.odb RD 12s CR RL TR
1 localhostX\ADB\ossevol.adb RD 15s CR RL TR
1 localhostX\ADB\osji.adb RD 15s CR RL TR
2 localhostX\ADB\os_coll.adb
32ossvrstat -databases
33osdbcontrol -statistics
- Displays I/O statistics for each given database
- How long has statistics been accumulated
- For each I/O operation (read, write, flush)
- Number of operations performed (N)
- Total number of sectors handled (S)
- Total elapsed time of operation (t)
- Number of seconds db has been opened (T)
34osdbcontrol -statistics (cont)
- Also includes some basic calculations for each
operation - operations / sec N/T
- million bytes / sec (S512)/T/(1000000)
- time / operation (t/N)
- time / million bytes (t / (S512/(1000000)))
- total time in operation (100t/T)
35osdbcontrol (Example)
osdbcontrol statistics test.odb osdbcontrol
statistics for database test.odb Statistics
accumulated over 0 hrs 0 mins 28 secs 4662
reads of 33805 sectors took 19.888 seconds of 28
secs. 166.500000 reads per second 0.618149
million bytes per second read 4.27 milliseconds
per read operation 1.15 seconds per million
bytes read 71.030357 of elapsed time spent
reading this file 0 writes of 0 sectors took
0.000 seconds of 28 seconds 0 flushes of 0
sectors took 0.000 seconds of 28 seconds
36ossvrstat -log lthostnamegt
- Information about servers transaction log
- Divided into 4 parts
- amount of disk space used
- metering info over 4 time intervals
- total, recent hour, recent 10 mins, recent min
- databases where propagation has been delayed due
to backup or MVCC - server parameter settings relevant to the
transaction log
37ossvrstat -log lthostnamegt
Data segment
2 record segments
Log size
Not assigned
overhead
Size 3840KB 3MB data 512KB records 255KB
free 0KB overhead. Transaction log AIO is
served by thread 1.
38Agenda
- What and Why
- System Resources
- ObjectStore Server
- Database / Log I/O
- ObjectStore Client
- Summary
39ObjectStore C Client
- Core Services
- transactions
- relocation
- schema management
- cache management
- page faulting
- placements
- sessions
40ObjectStore Java Client
- OSJI Implementation API is java code
- DMA Interface is JNI code (C) bridging java to
C mapping - Core Services and IPC are the only measurable
components in an OSJI application
OSJI API
OSJI Implementation
DMA Interface
Soft Pointers
Core Services
IPC to Server and Cache Manager
41API to Monitor Client
- Use Hooks to instrument code
- Transaction Hooks
- Page Event Hooks
- Instruments to collect
- N Counters
- Realtime Counters
- Server Meters
- CPU Usage
- Time Spent
- Page Event Activity
42Transaction Hooks
- Can be used to transparently add instrumentation
around ostore transactions
int main() before_retry_transaction_hook()
OS_BEGIN_TXN(tx1,0,os_transactionupdate)
after_begin_transaction_hook() if ()
before_abort_transaction_hook()
os_transactionabort()
after_abort_transaction_hook()
before_commit_transaction_hook()
OS_END_TXN(tx1) after_commit_transaction_hook
()
43Page Event Hooks
- Static methods on class objectstore
- Can register callbacks to be invoked when
- page created
- page fault
- page fetch
- page evicted
- page relocated
- page committed
- page upgraded
- page created
44Counters API
- Measure counts and times of core ObjectStore
client - N-Counters
- measure only counts
- Realtime Counters
- measure counts and time
- Counters are on a per session basis
45ObjectStore API toEnable and Read Counters
void objectstoreenable_realtime_counters() void
objectstoreenable_counters() void
objectstorerecord_realtime_counters(os_boolean)
void objectstorereset_counters() os_unsigned_int
32 objectstoreread_counter(...) os_int32
objectstoreget_n_counters() void
objectstoreget_all_counters()
46OSU_MONITOR
- Unsupported library to collect and log various
instrumentation (record) at page event and
transaction boundaries - The client collects (writes) event records
- Event records are ASCII but not formatted
- Scripts are written to view and/or analyze the
data - ASCII format for event records eases script
writing to parse and analyze
47OSU_MONITOR
- Easily added to your application via a single
line of code - osu_monitorinitialize() or
- osu_monitorinitialize_all()
- No overhead if not enabled
- leave in your production code to analyze
unexpected behavior - Enable by setting environment variables before
execution of application - OSU_MONITOR
- selective filters which events to record
- if not set then no events are recorded and no
overhead incurred - OSU_MONITOR_URL
- where to write event records - filename or socket
48OSU_MONITOR (API)
- include ltosu/osu_monitor.hhgt
-
- objectstoreinitialize()
- osu_monitorinitialize()
setenv OSU_MONITOR TMN test db test.db
49OSU_MONITOR Environment Variable
- Set to any value will output duration and CPU
usage for each transaction - You can turn on various instrumentation by
including any of the following - T activate and output realtime counters
- M activate and output srvstat meters
- N activate and output n-counters
- other flags for turning on each individual
- page event hook (see next slide).
50Enabling Page Events
- AWCDFRSEU
- A Page active hook
- W Page commit hook (write)
- C Page create hook
- D Page deadlock hook
- F Page fault hook
- R Page fetch hook (read)
- S Page relocation hooks (swizzle)
- E Page evict hook
- U Page upgrade hook
51OSU_MONITOR Events
int main() OS_BEGIN_TXN(tx1,0,os_trans
actionupdate) if ()
os_transactionabort()
OS_END_TXN(tx1)
BEFORE TXN RETRY
AFTER TXN BEGIN
BEFORE TXN ABORT
AFTER TXN ABORT
BEFORE TXN COMMIT
AFTER TXN COMMIT
52osu_monitor Scripts
- There are some basic awk scripts in
- OS_ROOTDIR/unsupported/osu/scripts
- These scripts reformat the raw output into
something more understandable - msgs.awk - creates a table of which msgs are
sent from client to server (TM) - osu_monitor.awk - reformats and displays on a per
transaction basis - txntime.awk - reformats in a table format where
time is spent for each transaction
53msgs (durations)
awk f msgs.awk v OPTD osu_monitior_2324.txt R
ealtime times of server calls for session
6qu2w txnname nmsgs nacct READ
READS UPGRD EVICT WRITE 2 d\r6.0_bl26\sr 73
9 0.041 0.000 0.000 0.000 0.000 ... 1
setup 117 9 0.066 0.000 0.000 0.000
0.570 ... 1 read_queue 1 0 0.000
0.000 0.000 0.000 0.028 ... 1 read_queue
13 1 6.269 0.000 0.000 0.000 0.000 ... 1
read_queue 5 0 10.279 0.000 0.000
0.000 0.000 ... 1 execute 1774 5
1.502 0.000 10.436 0.040 0.566 ... 1
read_queue 85 0 0.056 0.000 0.006 0.003
0.203 ... 1 execute 903 0 1.204
0.000 0.022 0.102 1.060 ... 1 read_queue
0 -1 0.000 0.000 0.000 0.000 0.000 ... 1
read_queue 3 0 0.330 0.000 0.000
0.000 0.000 ... 1 execute 870 0
1.079 0.000 0.028 0.142 1.252 ... 1
read_queue 0 -1 0.000 0.000 0.000 0.000
0.000 ... 1 read_queue 3 0 0.064
0.000 0.000 0.000 0.001 ...
54txnsummary
- 37203 txn-begin create CREATE
- allocate_normal_object 256 0.0660
- commit_rpc 1 1.1792
- map_page 256 0.0026
- protect_page 771 0.0070
- reloc_page_in 256 0.0014
- return_rpc 50 0.0125
- server_side_allocation 16 0.0109
- signal 3 0.0001
- validate_schema 1 0.0001
- n_create_page 256
- n_evict 250
- n_page_fault 3
- n_page_fault_active 3
- n_pages_committed 256
- n_sigsegv 3
- n_top_level_commits 1
- n_touch_page 3
- n_unwire 250
55Agenda
- What and Why
- System Resources
- ObjectStore Server
- Database / Log I/O
- ObjectStore Client
- Summary
56Summary
- Monitor your system
- There are some tools in ObjectStores
unsupported directory to help you - osu_monitor
- ossvrstat.pl
- There are various ObjectStore commands and C
APIs you can use - And dont forget the system commands available
57?
Questions