PTW083 Super Database Reorganization - PowerPoint PPT Presentation

1 / 49
About This Presentation
Title:

PTW083 Super Database Reorganization

Description:

... Dump ... Use Record Counts before the Dump and after the Load to Validate that the Load ... proutil command can only Dump/Load one table at a time (one ... – PowerPoint PPT presentation

Number of Views:415
Avg rating:3.0/5.0
Slides: 50
Provided by: danfo
Category:

less

Transcript and Presenter's Notes

Title: PTW083 Super Database Reorganization


1
PTW-083Super Database Reorganization
  • Dan Foreman
  • Progress Expert

2
Introduction- Dan Foreman
  • Progress user since 1984
  • Author of
  • Progress Performance Tuning Guide
  • Progress Database Admin Guide
  • Progress Virtual System Tables
  • V9 Database Admin Jumpstart
  • Online Search Availability for each Publication
  • ProMonitor - Database Monitoring Tool
  • Pro DL - Dump/Load with minimal downtime
  • Progress DBA Resource Kit
  • STGEN - V9 Structure File Generator
  • Logical Scatter Analyzer

3
Why Dump/Load (DL)?
  • Improve Performance by reducing Table Scatter
    Factor Record Fragmentation
  • Reclaim Space in the Database
  • Migrate between different Progress Versions (e.g.
    from V9 to R10)
  • Migrate between different Platforms
  • (e.g. Windows to HP/UX)

4
Why DL?
  • Change the Database Blocksize (V8.2)
  • Change the Records per Block (RPB)
  • Convert to Multiple Storage Areas (V9)
  • Convert to R10 Type 2 Storage Areas
  • Recover Data from a Corrupted Database

5
How Often?
  • Linear method - every X number of months
  • Subjective method - because it feels like about
    the right time to do it
  • Serendipitous method - Whens the next 3 day
    weekend?
  • Empirical method - predictive - decide when is
    the best time based upon various statistics
  • Scatter Factor
  • Fragmentation Factor
  • Application and/or Utility Performance

6
Scatter Factor
  • How close are the Records in a Table to each
    other physically
  • Also known as Physical Scatter Factor
  • Obtained from proutil dbanalys/tabanalys
  • The Progress recommendations in the documentation
    are somewhat low (conservative) for most
    databases you will spend many weekends doing
    DLs

7
Scatter Factor
  • For Large, Frequently Accessed Tables
  • .9 Data overlap problem (bug)
  • 1.0 Green Flag - Perfect
  • 1.1-2.0 Probably OK
  • 2.1-3.0 Yellow Flag - Monitor
  • 3.1-4.0 Performance deteriorating
  • 4.1 Red Flag - take action ASAP

8
Scatter Factor
  • Scatter Factor is not as crucial in V9 if
    selected Tables are stored in dedicated Storage
    Areas
  • But dont forget there is also a Logical Scatter
    Factor to consider - does the Physical sequence
    of the records match the sequence of the keys in
    the most commonly used index
  • New tool to measure Logical Scatter Factor
    (Logical Scatter Analyzer)

9
Fragmentation Factor
  • A Fragment is a Record in proutil dbanalys
  • A Record can consist of one or more Fragments
  • The more Fragments per Record the more I/O is
    required to read and update a Record
  • VSTs (_ActRecord) can tell you how much Excess
    Fragment I/O is actually occurring but
    unfortunately not on a table level also the
    statistics are somewhat suspicious

10
Index Statistics
  • Obtained from proutil dbanalys/ixanalys
  • Block Utilization Percentage
  • 50 2X more I/O to read all the Keys
  • 25 4X more I/O to read all the Keys
  • B-Tree Levels
  • Each level means a possible extra I/O to retrieve
    a key
  • How often an Index is used can be found using the
    _IndexStat VST

11
General Preparation
  • Archive Purge as much data as possible
  • The DL will take less time
  • The Scattering caused by the purging will be
    eliminated by the DL
  • Perform an Index Rebuild (if possible)
  • Might improve performance enough to avoid a DL
    but unlikely in my experience
  • Might reveal hidden DB corruption that needs to
    be dealt with first (i.e. 1124 errors)
  • May make the dump process go faster

12
General Preparation
  • Run Time some Heavy Duty Reports (multi-table,
    many records, reproducible)
  • Run Time proutil dbanalys
  • Use Record Counts before the Dump and after the
    Load to Validate that the Load was successful
  • Might Reveal Corruption (i.e. 1124 errors)
  • The Elapsed Time to run proutil is an important
    performance measurement
  • 40 Hours ? 40 Minutes (24gb DB that had never
    been Reorganized)

13
General Preparation
  • Consider Disabling Disk Mirroring
  • Extra Dump Space
  • Availability of 2nd Disk Controller
  • Potential (usually minor) Performance Gain
  • But dont forget there may be overhead of
    re-syncing the mirrors
  • Dump to non-database disks if possible
  • AI/BI/Temp File Disks are good choices

14
4GL Dump
  • Progress Dictionary/Custom 4GL Code
  • Dictionary has a Simple Interface
  • Usually the Slowest Method
  • Dictionary Dump/Load Programs can be run
    non-interactively
  • Custom Coding requires 4GL experience
  • Dump files are subject to 2GB limit but can code
    around it

15
Binary Dump
  • Old Binary Dump
  • In Progress since V5.2I
  • Undocumented Unsupported
  • An Amateur could lose data
  • Dumps all data to one file unless you know the
    trick
  • Obsolete unless you need to migrate from V6 to V8
    quickly because it no longer works in V9

16
Binary Dump
  • New Binary Dump (V8.2 and later)
  • Documented Supported
  • Option on proutil
  • proutil command can only Dump/Load one table at a
    time (one proutil per Table)
  • Multi-Threading - parallel dump are possible
  • Dump files are portable

17
Binary Dump
  • New Binary Dump (V8.2 and later)
  • No 2GB limit (dump file automatically splits)
  • Single dump file on V9.1B and later unless there
    is an external (non-Progress) size limit
  • Possibility exists of carrying Database
    corruption to the new Database

18
Binary Dump
  • Tip Use the Read Only (-RO) option for faster
    dump performance
  • Benchmark - no difference using -RO in
    V9.1D09/R10.0B, used to be 10 faster
  • Tip Dont specify a large -B with -RO
  • Use bin.p (or bin-nt.p) on the BravePoint web
    site to generate DL scripts (UNSUPPORTED) or
    dumpload.p in DBA Resource Kit

19
Binary Dump
  • R10 Performance Tip for T2 Areas
  • proutil dbname -C dump table -index 0
  • Kbase P99891
  • Benchmark

20
Dump Methodology
  • Do Parallel Dumps if
  • You want to go home earlier
  • You have multiple CPUs
  • You have multiple (non-DB) disks with some free
    space
  • 1-4 dump threads per CPU is usually safe
  • If you have many non-DB disks or another machine,
    once a table is finished dumping, you can start
    loading into a database on the other system or
    set of disks

21
Dump Methodology
  • If a table is large (10 million records), the
    Binary dump might not be the fastest method
    because it is single threaded
  • Consider multiple, parallel, 4GL dumps
  • Potential Danger
  • / dump1.p / where custnum LT 1000000
  • / dump2.p / where custnum GT 1000000
  • Bulk Load the Data (proutil bulkload)
  • Alternative in V9.1D you can use proutil
    dumpspecified instead of dump but major
    limitations - next slide

22
proutil dumpspecified
  • Cant dump multiple sections of the table to the
    same directory because the dump file is named
    table.bd
  • No Complex Operators allowed (And, Or, Not)
  • It is not possible to select an alternate index
    with the -index option
  • Primary Index needs to be a single Field
  • The Index must have the same name as the Field
  • Progress says they will fix in V10.1X

23
Dump Methodology
  • Sometimes dumping using a non-primary index is
    faster particularly if the secondary index is
    smaller
  • Benchmark - 7X slower (default -B)
  • Dont forget to dump Sequences and the User table
    (cant Binary dump)
  • Dont forget SQL92 Privileges

24
Pre-Load
  • Build a Multi-Volume Void Structure or reuse the
    existing one to save time
  • Consider using 4k/8k Database Block Size
  • Match to Largest possible F/S Block Size
  • In V8 8k may cause the DB to radically grow in
    size if Average Record Size of Large Tables is
    small
  • In V9/R10 adjust Records/Block (1 to 256)
  • Don't forget to reduce -B proportionately

25
Load Methods
  • Dictionary Load/Custom 4GL Code
  • Same Advantages Disadvantages as the
    Dictionary/custom Dump
  • Usually the Slowest Option
  • Deactivate Indexes first (V8 and earlier)
  • Parallel with Indexes Active (V9)
  • No more than one Load per Area

26
Load Methods
  • Bulk Load
  • Option on proutil
  • Can load Dictionary or 4GL dump (.d) files
  • Very Fast but not quite as fast as Binary
  • Single Threaded only
  • Index Rebuild is required
  • Possible problems if New Line characters are
    embedded in the data

27
Load Methods
  • Binary Load
  • Single or Parallel loads are possible
  • Start the Database Broker to
  • Observe the speed (Records Created) of the Load
    in promon
  • Avoid Crash Recovery overhead each time proutil
    is invoked!!!!!!
  • For better Performance use the No Integrity (-i)
    Option
  • Benchmark 2.2X slower without -i

28
Load Methods
  • Binary Load
  • Use V8.3C or higher for bug fixes
  • NOTE Cant Binary Dump/Load a Table if the Table
    has Deleted Fields (until V9 and later) Use
    rpos.p from the BravePoint web site to see if you
    have tables with deleted fields

29
Parallel Loads
  • V8 - NO, Increases the Scatter Factor slow
  • V9/R10 - Maybe, if using Areas and a maximum of
    one load thread per Area
  • Benchmark 2 threads 3844 (ser) vs 2523 (para)
  • Dont Forget to Tune the Broker
  • No Integrity (-i)
  • Big Cluster Size (16mb)
  • APWs, BIW
  • -spin
  • -bibufs
  • No AI
  • -directio (maybe)

30
Index Rebuild
  • Backup the Database BEFORE you start the Index
    Rebuild
  • If the Index Rebuild fails you might not be able
    to restart it

31
Index Rebuild - proutil idxbuild
  • Disk Sort Method
  • Fastest (but single threaded)
  • Builds a more compact index
  • A Sort file is created on disk
  • Sort File on Non-DB Disks 20 Faster

32
Index Rebuild - proutil idxbuild
  • Disk Sort Method
  • Sort File Size Estimate 1X-2X Data Size to be
    absolutely safe
  • The Sort file is subject to the 2GB limit until
    V9.1D SP08
  • V8. allows Multi-Volume Sort files
  • V8 Bug Don't put TABs in the .srt file
  • Bug the Sort files are not always removed until
    V9.1D SP07

33
Index Rebuild - proutil idxbuild
  • Memory Sort Method
  • Much Slower than Disk Sort
  • Less Compact Index
  • Use V9 idxcompact to compact the Index
  • A compaction percentage can be specified
  • Can be run online or offline
  • Potential DB Corrupting Bug in idxcompact fixed
    in V9.1D SP05
  • No Disk Space Required
  • Use a larger -B (but not too large)

34
Index Rebuild - proutil idxfix
  • V8.0 and Later
  • Can run Online or Offline
  • Can be run in parallel
  • Can be restarted if interrupted
  • Sounds good, right?

35
Index Rebuild - proutil idxfix
  • Much Slower than idxbuild because granularity is
    higher (key level) and idxfix generates record
    locks and transactions like a 4GL Client
    (benchmarks on next slide)
  • Output is VERY Verbose
  • Prior to V8.2C went to .lg and could hit 2gb!
  • Output now goes to Standard Out
  • Very useful -silent parameter (undocumented but
    broken in V9.1D, but fixed in SP06 or SP07)

36
Index Rebuild - proutil idxfix Benchmarks
  • V8.3B, Unix
  • 1,000,000 Records, 3 keys
  • Ran with No Integrity (-i) and other options
  • Single Thread, Output to Screen 4037
  • Single Thread, Output to /dev/null 2500

37
Index Rebuild proutil idxfix Benchmarks
  • 3 Parallel Threads (one for each index)
  • Killed the job after 1.25 hours
  • Only 169,000 records had been Indexed (17)
  • Commits per Sec dropped from 400 to 100
  • Continuous TXE Latch Contention
  • This may be fixed in V9 (Concurrent Lock
    Protocol enhancement)
  • By comparison idxbuild took 523 (mmss)

38
Index Rebuild Options
  • -TB 31 Disk Sort don't use 24 as recommended in
    Progress manual
  • -TM 32 Disk Sort
  • db.srt Multi-volume Sort file
  • Keep Sort files on non-DB disks (10-20)
  • -t Disk Sort Unix
  • -B Useful for Memory Sort only
  • -SS V9.1B build indexes option (shortly)
  • -SG Sort Group SP07 default 48 use 64

39
V9/R10 Considerations
  • Pre-V9 Dump/Load of an individual Table has No
    value other than eliminating Fragmentation
  • In V9 it might if the Table lives in a
    Dedicated Area and the Logical Scatter Factor is
    high
  • V9.1B has an option on the Binary Load to rebuild
    the Keys during the Load
  • Each index is logged as a single transaction
  • Benchmark coming up later

40
V9/R10 Table Move
  • Can be Performed Online
  • Reorganizes and defragments if moving the table
    to a clean (i.e. empty) Area
  • But .

41
V9/R10 Table Move
  • Table is Exclusively Locked for entire move
  • All Indexes on the table are rebuilt
  • One Transaction - Watch out for BI growth 16/32TB
    limit in V9 but the 2GB file size limit exists
    until V9.1C (EnableLargeFiles) so what happens if
    you kill the process (it was taking too long)
    too late and crash recovery grows the BI file
    into the 2gb limit?
  • AI Space required

42
Are We Done Yet?
  • Check the BI Cluster Size easy to forget
  • Check the AI BI Block Sizes
  • Re-Enable Virtual System Tables (pre-V9)
  • Re-Enable After Imaging - What no AI???
  • Re-Format the BI File (proutil bigrow)
  • Record the DB High Water Mark
  • Run Time the same Heavy Duty Reports
  • Run Time proutil dbanalys

43
Did We Succeed?
  • Improvement in speed of dbanalys reports -
    record the numbers for next time
  • But you might not have a perfect (physical)
    Scatter Factor because of how Progress allocates
    Space
  • You might have a perfect Physical Scatter Factor
    (particularly if you dedicate a V9 Area to one
    Table) but not a perfect Logical Scatter Factor

44
Benchmarks V9.1D SP06
  • Dict Load/Idx Inactive/idxbuild 2707
  • Dict Load/Idx Active (3 threads) 2357
  • Bulk Load/idxbuild 1610
  • Serial Binary/idxbuild 1515
  • Parallel Binary/idxbuild 1529
  • Serial Binary with -SS 1956
  • Parallel Binary with -SS 3804
  • Note that SP06 is the last release before the
    huge index rebuild performance changes were added

45
Benchmarks V10.0B
  • Serial (2 tables)
  • Load 519 (240 239)
  • Idxbuild 1212
  • Total 1731
  • Parallel with build indexes option
  • Total 552

46
Exotic Methods
  • Exotic Labor Intensive Higher Risk
  • Multi-Phase
  • Dump the Static portions of large tables anytime
    using 4gl Dump
  • Use Schema Triggers to ensure any changes made to
    records already dumped are captured
  • RECID Dump
  • Keys point to highly scattered records

47
Additional Resources
  • Progress Database Administration Guide has an
    entire chapter on Database Reorganization
  • Pro DL
  • A tool to minimize DL down time to a very short
    duration (as little as a few minutes) regardless
    of the size of the database

48
Miscellaneous Tips (If we have time)
  • Failure in the middle of a load?
  • proutil truncate area
  • Parallel Balancing
  • Record size is more important than record count

49
Conclusion
  • Questions?
  • Thank you for coming
  • Please dont forget your conference evaluations
  • danf_at_prodb.com
  • www.BravePoint.com
  • Tel 1 541-754-2116 (24 hours)
Write a Comment
User Comments (0)
About PowerShow.com