Title: D0 Taking Stock


1
D0 Taking Stock
  • By Anil Kumar
  • CD/CSS/DSG
  • June 06, 2005

2
Production/Integration Infrastructure
  • 8 x 900 MHz CPUs, 16 GB RAM
  • The machine has a Clariion 4500 hardware RAID
    array with 80 drives.
  • Oracle Server 9.2.0.6 (64 bit) on Solaris 2.9 (64
    bit).
  • Load average 2-3, CPU usage 61%
  • Average db response time per execute: 0.01-0.05 secs
  • Uptime, excluding scheduled downtime:
  • 99.85269% uptime (based on 420 min of total db
    unavailability) since Nov 15, 2004
  • System Performance
  • http://d0om.fnal.gov/d0admin/sysperf/
  • Db Performance Charts
  • https://mistest12.fnal.gov/cp/
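
The uptime figure above can be checked with a short calculation. This is a minimal sketch, not the method used by the presenter: the slide gives the 420 minutes of unavailability and the Nov 15, 2004 start date, but not the end of the measurement window, so `ASSUMED_WINDOW_END` is an assumption chosen for illustration.

```python
from datetime import date


def uptime_percent(downtime_minutes: float, start: date, end: date) -> float:
    """Return availability over the window [start, end) as a percentage."""
    window_minutes = (end - start).days * 24 * 60
    if window_minutes <= 0:
        raise ValueError("end must be after start")
    return 100.0 * (1.0 - downtime_minutes / window_minutes)


# From the slide: 420 min of unavailability since Nov 15, 2004.
WINDOW_START = date(2004, 11, 15)
# Assumption: the slide does not state when the measurement ended.
ASSUMED_WINDOW_END = date(2005, 6, 1)

uptime = uptime_percent(420, WINDOW_START, ASSUMED_WINDOW_END)
print(f"{uptime:.5f}%")
```

With the assumed end date of June 1, 2005 (198 days after the start), the result agrees with the slide's 99.85269% to five decimal places.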

3
D0 offline development Infrastructure
  • 8 x 400 MHz CPUs, 4 GB of RAM
  • OS and Oracle versions same as int/prd:
  • 64-bit OS and 64-bit Oracle 9.2.0.6
  • Load average 1-2, CPU usage 10-15%, Mem free 7.8
  • System Performance URL
  • http://d0om.fnal.gov/d0admin/sysperf/

4
D0 Calib Servers Deployment Infrastructure
[Diagram labels: Linux PC for Analysis Farm; Linux PC for Reconstructed Farm; Sun Solaris; Linux PC (x2); Failover (x2)]
Note: There has been 1 failure for the User Servers and
Farm Servers since Nov 15, 2004.

5
Space Usage

6
Space Usage Summary
  • d0ofprd1: 786 GB used.
  • d0ofint1: 77 GB used.
  • 800 GB is available for use for int and
    production.
  • d0ofdev1: 82 GB used.
  • 190 GB is available for use.

7
Capacity Planning
  • Expected growth over the next three years: 825 GB.
  • SAM growth is 250 GB/year and other apps grow
    25 GB/year. This excludes the Luminosity DB.
  • We have around 800 GB available.
  • Should start planning to upgrade disk capacity
    next year.
  • Luminosity growth is 125 GB/year.
  • Sun v40z machine with a Sun StorEdge 3310
    SCSI disk array w/ 1.752 TB, 2 Ultra160 RAID
    controllers.
  • URL for Capacity Planning
  • http://www-css.fnal.gov/dsg/internal/d0_ofl_dbs/D0_database_servers(sun)/d0ora_index_page/d0ora2_d0ofprd/d0ora2_disk_planning.htm
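
The capacity numbers above can be worked through as follows. This is a rough sketch that assumes growth is linear. The second estimate further assumes the Luminosity DB would share the same 800 GB, which the slide does not say (it lists a separate Sun v40z machine), so treat it as a worst case only.

```python
def years_until_full(available_gb: float, growth_gb_per_year: float) -> float:
    """Years until the available space is consumed at a constant growth rate."""
    if growth_gb_per_year <= 0:
        raise ValueError("growth rate must be positive")
    return available_gb / growth_gb_per_year


# Figures from the slide, in GB and GB/year.
AVAILABLE_GB = 800
SAM_GROWTH = 250
OTHER_APPS_GROWTH = 25
LUMINOSITY_GROWTH = 125

base_growth = SAM_GROWTH + OTHER_APPS_GROWTH
three_year_growth = 3 * base_growth  # matches the slide's 825 GB

years_left = years_until_full(AVAILABLE_GB, base_growth)
# Hypothetical worst case: Luminosity DB on the same storage.
years_left_with_lum = years_until_full(AVAILABLE_GB, base_growth + LUMINOSITY_GROWTH)

print(f"Three-year growth: {three_year_growth} GB")
print(f"Years until full (SAM + other apps): {years_left:.1f}")
print(f"Years until full (including Luminosity): {years_left_with_lum:.1f}")
```

At 275 GB/year the 800 GB runs out in a little under three years, which is consistent with the recommendation to start planning a disk upgrade next year.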

8
Accomplishments
  • Upgraded D0 offline databases to 9.2.0.6
  • Quarterly database security is up to date.
  • Tested complete database recovery of the d0ofprd1
    database. It took 4 hours. This assumes hardware
    is already configured and backup files are
    available on disk.
  • Moving D0 offline to a standardized backup and
    recovery method using a SAN and Enstore.
    Parallel testing of SAN as backup media for
    development and production instances is going well.
  • Luminosity db deployed in 9i and 10g versions on
    a loaner CDF machine.

9
Monitoring And Data Modeling Tools
  • Monitoring Tools
  • dbatool/toolman
  • To monitor the space usage, users, SQL,
    tempspace, sniping of inactive sessions, auto
    start of Listener, IA, estimate table/Index stats
  • OEM (Oracle Enterprise Manager)
  • - DB Monitoring tool/ Monthly charts posted on
    web
  • Db Performance Charts
  • http://www-cdserver.fnal.gov/cd_public/css/dsg/db_stats/data/db_stats.html
  • The URL for the Ganglia charts (monitoring tools)
    is http://fcdfmon2.fnal.gov/
  • Data Modeling Tool
  • Oracle Designer is used for Data Modeling and
    space estimates.

10
Back-up/Recovery
  • d0ofprd1
  • - Daily, 7 days of archives, one always on DISK
  • - Bi-weekly backup of READ ONLY tablespaces
  • - Allocated 1179 GB, used 755 GB, daily tape backup.
    RMAN backup time: 5 hrs 45 min (3 hrs excluding
    READ ONLY; 2 hrs 45 min for READ ONLY)
  • No Export
  • Tape rotation: 1 week for daily backups and 2
    months for READ ONLY backups.
  • d0ofint1: once a week on SAN
  • d0ofdev1
  • - Daily, 3 days of archives; Saturday on DISK,
    otherwise on SAN
  • - Allocated 100 GB, used 58 GB, daily tape backup
  • RMAN backup time: 1.5 hrs.
  • Tape rotation: 2 months.
  • - Backup strategy for the d0of lum boxes will be the
    centralized SAN strategy

11
RMAN Backup on SAN
  • Inexpensive, large disk array can accommodate
    growing RMAN backups
  • Fast reliable backup and recovery
  • 24 x 7 and 8 x 5 support tiers available
  • Can serve various O/S platforms
  • A briefing on database backup/recovery
    standardization on June 16 will discuss the
    SAN testing in more detail
  • Multiplexing of archives to local disk and SAN

12
RMAN to SAN test case on d0ofdev1
  • d0ofdev1 has run RMAN backups to SAN since Nov. 2004
  • Two 1 TB SAN mount points available
  • Keep 2 alternating days of RMAN backups on SAN,
    once/week to local backup disk
  • RMAN validation to determine backup file
    integrity
  • One validation failure since Nov. 2004
  • Recoveries from SAN were all successful
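
The rotation described above (alternating days on SAN, once a week to local disk) could be expressed as a small destination picker. This is an illustrative sketch only, not the actual backup script: the mount-point paths are hypothetical, and the choice of Saturday for the local-disk backup is taken from the d0ofdev1 entry on the Back-up/Recovery slide.

```python
from datetime import date

# Hypothetical paths; the slides only say that two 1 TB SAN mount
# points and a local backup disk exist.
SAN_MOUNTS = ("/san/rman1", "/san/rman2")
LOCAL_DISK = "/backup/local"
SATURDAY = 5  # date.weekday() counts Monday as 0


def backup_destination(day: date) -> str:
    """Pick the RMAN backup target for a given calendar day."""
    if day.weekday() == SATURDAY:
        return LOCAL_DISK
    # Alternate SAN mount points by day parity, so consecutive
    # calendar days use different mount points.
    return SAN_MOUNTS[day.toordinal() % 2]


for d in (date(2005, 6, 3), date(2005, 6, 4), date(2005, 6, 6), date(2005, 6, 7)):
    print(d.isoformat(), backup_destination(d))
```

Alternating by day parity keeps the previous day's backup intact on the other mount point while the current one is being written.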

13
Production backups to SAN
  • Initial problems due to incompatible
    PCI cards are now solved
  • Two 1 TB SAN mount points in use
  • 2 daily backups: one to SAN, one to local backup
    disk
  • Always 2 backups on disk, plus X200 tape library
    backup of RMAN from local disk
  • Read-only portion of database backed up
    twice/month to local backup disk

14
SAN issues
  • Current SAN does not have 24 x 7 support
  • IDE disks are not as reliable as other, more
    expensive disks
  • Purchasing a 24 x 7 SAN requires licensing and
    changes to the O/S to be able to use it
  • Firewall issues (CDF D0 online)

15
SAM Schema
  • Production Deployments
  • - Autodestination Sub-System of SAM schema
  • - Indexes on Param Values Deployed in
    production.
  • - Data Types correction cut.
  • - Indexes for Volumes to be
    deployed on 06/07/05
  • - Partition Cut to be deployed on
    06/07/05
  • Work-in-progress
  • - Request Sub-System of SAM schema. Cut in
    Mini-SAM.
  • Upgrade of Mini-SAM as the SAM schema evolves.
    This lets individual developers have a
    copy of SAM metadata and seed data available for
    a server software rewrite if needed.
  • Mini-SAM in Postgres. Initiative to move towards
    freeware databases for SAM.
  • Proof of product not complete; requires
    testing with a dbserver from the SAM development
    team.
  • 1.61B events in 32 partitions. Now averaging 1
    partition per 3 running weeks.
  • Partitions Rollover dates URL
  • http://www-css.fnal.gov/dsg/internal/databs_appl/sam_event_partitions.html
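
The partition figures above imply a rough average partition size. The sketch below assumes events are spread evenly across the 32 partitions and that the current rollover rate applies to an average-sized partition; neither is stated on the slide.

```python
TOTAL_EVENTS = 1_610_000_000  # 1.61B events, from the slide
PARTITIONS = 32
WEEKS_PER_PARTITION = 3       # running weeks per partition, from the slide

# Assumption: events are evenly distributed across partitions.
events_per_partition = TOTAL_EVENTS / PARTITIONS
events_per_running_week = events_per_partition / WEEKS_PER_PARTITION

print(f"{events_per_partition:,.0f} events per partition")
print(f"{events_per_running_week:,.0f} events per running week")
```

Under these assumptions, a partition holds about 50.3 million events, or roughly 16.8 million events per running week.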

16
What's Next?
  • Deploy SAN/Enstore backup recovery plan.
  • (Testing of SAN on d0ofprd1 is work-in-progress.)
  • Deployment of Lum db in production on 10g.
  • Possible upgrade of d0ofprd1 to 10g for its
    enhanced incremental database backup feature.
  • Upgrade OEM to 10g.
  • Rewrite of dbatools/toolman for enhanced monitoring
    features and 10g support.
  • SAM schema deployment for SAM Request System.
  • Testing of Postgres Mini-SAM for proof of
    product.

17
Concerns
  • Backups will get bigger, raising the issue of
    backing up a VLDB.
  • Speaker Bureau application to be moved to
    production ASAP. It is on dev but being used in
    production mode.
  • SAM servers on Linux?
  • Not enough space for the integration db to do a full
    refresh of SAM.
  • Single points of failure with the D0 offline
    database.
  • Future of the aging Clariion array must be
    addressed in the next budget.