3178: 24 x 7 StarTeam


1
3178: 24 x 7 StarTeam
  • Randy Guck
  • Chief Scientist, DSP
  • Borland

2
Overview
  • High availability fundamentals
  • How available is highly-available?
  • High availability at what price?
  • Enemies of high availability

3
Overview
  • StarTeam high availability best practices
  • Administrative practices
  • Flash (demand peak) control
  • Backup procedures
  • Redundancy
  • Failover and clustering
  • Disaster recovery and replication

4
High Availability Fundamentals: How available is
highly-available?
5
A Distorted Term
  • Depending on who you ask, high availability
    means:
  • 24 x 7 uptime
  • Clustering
  • Failover
  • Online backups
  • Five nines or Six sigma
  • Currently not dating

6
Availability by the Numbers
7
The Myth of the Nines
  • Most people want more than they need
  • Actual reliability difficult to compute (complex
    mathematics)
  • Example: 99.99% reliability (downtime = 52
    minutes/year) of 7 components results in 99.93%
    (downtime = 6 hours/year).
  • Downtime often affected by future, unforeseeable
    business decisions
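The arithmetic behind the seven-component example can be sketched as follows. This is a minimal illustration that assumes the components fail independently and that all of them must be up for the service to be up (a series chain); the function names are illustrative, not from the presentation.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes; leap years ignored


def downtime_minutes_per_year(availability: float) -> float:
    """Expected yearly downtime for an availability given as a fraction (0.0 to 1.0)."""
    return (1.0 - availability) * MINUTES_PER_YEAR


def serial_availability(component_availability: float, count: int) -> float:
    """Availability of `count` independent components that must all be up at once."""
    return component_availability ** count


single = 0.9999
chain = serial_availability(single, 7)
print(f"one component:   {single:.2%}, {downtime_minutes_per_year(single):.1f} min/year")
print(f"seven in series: {chain:.2%}, {downtime_minutes_per_year(chain) / 60:.1f} h/year")
```

Under these assumptions one 99.99% component allows about 52.6 minutes of downtime per year, while seven in series drop to about 99.93%, or roughly 6.1 hours per year, which agrees with the rounded figures on the slide.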

8
MTBF versus MTTR
  • MTBF = mean time between failures
  • MTTR = mean time to repair
  • Availability:
  • A = MTBF / (MTBF + MTTR)
  • Availability is good if MTTR is low
  • 99.9999% availability (six sigma) = 6 mins
    downtime in 11.4 years!
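A short sketch of the formula A = MTBF / (MTBF + MTTR), showing how repair time drives availability. The MTBF and MTTR values below are made-up inputs chosen only for illustration.

```python
def availability(mtbf_hours: float, mttr_hours: float) -> float:
    """A = MTBF / (MTBF + MTTR); both arguments must use the same time unit."""
    return mtbf_hours / (mtbf_hours + mttr_hours)


def downtime_minutes(avail: float, years: float) -> float:
    """Expected downtime in minutes over a period, using 365-day years."""
    return (1.0 - avail) * years * 365 * 24 * 60


# Hypothetical server that fails about once every 1000 hours.
for mttr in (0.1, 1.0, 10.0):
    print(f"MTTR {mttr:>4} h -> availability {availability(1000.0, mttr):.4%}")

# Cross-check of the six-nines figure quoted above.
print(f"{downtime_minutes(0.999999, 11.4):.1f} minutes in 11.4 years")
```

With the failure rate held constant, cutting the repair time is what moves the result toward additional nines.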

9
A Better Approach
  • Focus on scenarios and probabilities
  • Examine the organization's needs
  • Identify possible service disruptions
  • Prioritize failures by probability
  • Address scenarios on a cost/benefit basis
  • Test each failure scenario
  • The result is your high availability plan!
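One way to make the prioritization steps concrete is sketched below. It is an assumption-laden illustration, not the presenter's method: the scenario names come from failure types mentioned in this talk, but every probability and cost is invented, and the model assumes a countermeasure removes the whole expected loss.

```python
from dataclasses import dataclass


@dataclass
class Scenario:
    name: str
    yearly_probability: float  # estimated chance of occurring in a year (invented)
    outage_cost: float         # estimated cost of one occurrence (invented)
    mitigation_cost: float     # estimated yearly cost of the countermeasure (invented)

    @property
    def expected_loss(self) -> float:
        """Expected yearly loss if nothing is done."""
        return self.yearly_probability * self.outage_cost


def prioritize(scenarios):
    """Order scenarios by probability, most likely first."""
    return sorted(scenarios, key=lambda s: s.yearly_probability, reverse=True)


def worth_addressing(scenario: Scenario) -> bool:
    """Cost/benefit test: mitigate only if it costs less than the expected loss."""
    return scenario.mitigation_cost < scenario.expected_loss


scenarios = [
    Scenario("Disk failure", 0.30, 40_000, 2_000),
    Scenario("Site-wide flood", 0.01, 500_000, 60_000),
    Scenario("Out-of-disk on database", 0.20, 15_000, 1_000),
]

for s in prioritize(scenarios):
    verdict = "address" if worth_addressing(s) else "accept the risk"
    print(f"{s.name}: expected loss {s.expected_loss:,.0f}/year -> {verdict}")
```

Each scenario that passes the cost/benefit test still needs its own test run before it counts as part of the plan.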

10
High Availability Fundamentals: High availability
at what price?
11
Availability versus Investment
[Figure: availability (vertical axis) versus investment (horizontal axis); practices listed from highest to lowest on both axes]
  • Disaster recovery planning
  • Failover management
  • Redundancy (no SPOFs)
  • Backup procedures
  • Demand peak management (flash control)
  • Basic administrative practices
12
What kind of systems need the highest
availability?
  • Life-rated systems
  • Space shuttle onboard systems
  • Emergency response systems
  • Command-and-control systems
  • High financial cost systems
  • Stock-trading systems
  • Reservation systems
  • Banking systems

13
ALM High Availability in Perspective
  • Although ALM systems are becoming more
    mission-critical, they do not have the same
    financial or loss-of-life impact as some
    systems, so it doesn't make sense to model high
    availability after them
  • Bottom line: strike a reasonable balance between
    investment and high availability

14
High Availability Fundamentals: Enemies of
availability
15
Infrastructure Issues
  • Hardware failures
  • Disk, CPU, memory, power supply, fans, network
    card, motherboard, disk controller, etc.
  • Environmental failures
  • Power, cooling, fire, flood, hurricane,
    earthquake, terrorism, etc.

16
Infrastructure Issues
  • Network outages
  • LAN outages: switch/cable failures (server-to-DB
    network segment, client-to-server network
    segment, etc.)
  • WAN outages: VPN failure, ISP failure, physical
    network issues, etc.
  • Service outages: DNS, DHCP, directory server,
    email, etc.

17
Infrastructure Issues
  • Database outages
  • Out-of-disk issues, recovery time after reboot,
    index corruption, etc.
  • Bandwidth issues
  • Network congestion, database congestion, resource
    starvation
  • Denial-of-service attacks
  • Viruses, worms, DDoS

18
Application Issues
  • Application brown-outs
  • Locking/bottleneck issues, demand peaks, etc.
  • Application outages
  • Hangs, fatal exceptions, out-of-memory
  • Scheduled outages
  • Offline backups, application patches, database
    upgrades, etc.

19
Plan of Attack
  • To a specific user, a service is down when it
    is not available for any reason
  • A comprehensive high availability plan must
    consider all potential outages from end-to-end,
    on a cost/benefit basis

20
StarTeam High Availability Best Practices
[Figure: availability (vertical axis) versus investment (horizontal axis)]
  • Disaster recovery planning
  • Failover management
  • Redundancy (no SPOFs)
  • Backup procedures
  • Demand peak management (flash control)
  • Basic administrative practices
21
Administrative Practices
  • Administrative best practices: top 10 list
  • 10. Don't be cheap
  • 9. Enforce security
  • 8. Centralize your servers
  • 7. Enforce change control
  • 6. Document everything

22
Administrative Practices
  • Administrative best practices: top 10 list
  • 5. Test everything
  • 4. Design for growth
  • 3. Choose mature software
  • 2. Choose mature hardware
  • 1. K.I.S.S.

23
StarTeam High Availability Best Practices
[Figure: availability (vertical axis) versus investment (horizontal axis)]
  • Disaster recovery planning
  • Failover management
  • Redundancy (no SPOFs)
  • Backup procedures
  • Demand peak management (flash control)
  • Basic administrative practices
24
Flash Control
  • Client/server systems have natural demand peaks
  • Peaks are often time-based, e.g.:
  • Everyone logs in in the morning
  • Big reports launched just before lunch
  • Peaks are often calendar-based, e.g.:
  • End-of-week builds
  • End-of-month reports

25
Client/Server Architecture
[Diagram: StarTeam Clients connect to the StarTeam Server through the Command API; the server is backed by the DB and Vault. Demand peak congestion areas are marked.]
All information is pulled by clients using a
request/reply command API
26
StarTeamMPX
[Diagram: the StarTeam Server, backed by the DB and Vault, sends an event publish stream to the Message Broker, which serves the StarTeam Clients.]
Updated objects are pushed to clients, preventing
poll and refresh requests, smoothing demand peaks
27
New for 7.0: MPX Cache Agent
[Diagram: the StarTeam Server (DB, Vault) sends a file publish stream through the Message Broker to the Cache Agent and its encrypted cache; StarTeam Clients issue check-out requests.]
The Cache Agent is trickle-charged with file
contents, providing an alternate check-out source
for remote clients.
28
StarTeam High Availability Best Practices
[Figure: availability (vertical axis) versus investment (horizontal axis)]
  • Disaster recovery planning
  • Failover management
  • Redundancy (no SPOFs)
  • Backup procedures
  • Demand peak management (flash control)
  • Basic administrative practices
29
Backups for High Availability
  • Mirroring does not replace backups
  • Backups are an important part of high
    availability
  • Test integrity of backups periodically
  • Consider a rotating/hierarchical storage system,
    which can serve disaster recovery scenarios

30
StarTeam Backups
  • StarTeam 6.0 backup procedure:
  • Lock the server
  • Back up the database and vault
  • Disk-to-disk and differential dumps can speed
    things up
  • Unlock the server
  • Why does the server need to be locked?

31
Review: StarTeam 6.0 Vault
[Diagram: two folders, each on a single volume]
  • Archive Folder: text files stored as a base
    version plus deltas (Delta 1, Delta 2, Delta 3);
    binary files stored as a base version plus
    revisions (Rev 1, Rev 2, Rev 3)
  • Cache Folder: uncompressed full versions
32
New StarTeam 7.0 Vault
[Diagram: the StarTeam Server uses the DB and a Hive Index; the Vault consists of multiple Hives.]
33
7.0 Vault: Inside the Hive
[Diagram: a Hive uses MD5-based storage in subfolders]
Archive Root
  00/0/
    000a807b9f393f58a69998b2cd7db7d2.gz (compressed)
    000752242cc7e16d573f299a127903f2.gz
  ff/f/
    fff16c26e911ac72abad5557ac44d84c (uncompressed)
Cache Root
  00/0/
    000a807b9f393f58a69998b2cd7db7d2 (uncompressed)
    000752242cc7e16d573f299a127903f2
  ff/f/
    fffb865605a09eef1f06be92a38bc8da
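The file names on this slide suggest that each file is named by a 32-character MD5 hex digest and placed in a subfolder built from the first two characters, then the third (for example `00/0` or `ff/f`), with a `.gz` suffix when stored compressed. The sketch below reproduces that layout. Hashing the uncompressed file content is an assumption, since the slide shows only the resulting names, and both function names are hypothetical.

```python
import hashlib
from pathlib import PurePosixPath


def path_for_digest(digest: str, compressed: bool = True) -> PurePosixPath:
    """Map an MD5 hex digest to '<first two chars>/<third char>/<digest>[.gz]'."""
    name = digest + (".gz" if compressed else "")
    return PurePosixPath(digest[:2]) / digest[2] / name


def hive_relative_path(content: bytes, compressed: bool = True) -> PurePosixPath:
    """Assumed scheme: the digest is the MD5 of the uncompressed content."""
    return path_for_digest(hashlib.md5(content).hexdigest(), compressed)


# Names taken from the slide.
print(path_for_digest("000a807b9f393f58a69998b2cd7db7d2"))
print(path_for_digest("fff16c26e911ac72abad5557ac44d84c", compressed=False))
```

Because the name depends only on the content, a stored file never has to be rewritten in place, which is what makes on-line backup and incremental copying of the vault practical.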
34
StarTeam 7.0 On-line Backup Procedure
  • The new vault allows on-line backups
  • 1. Back up the database on-line
  • 2. When complete, back up archive and attachment
    folders
  • 2.1 Perform full backups weekly
  • 2.2 Perform incremental backups daily
  • No need to lock the server!
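The weekly-full, daily-incremental schedule for the archive and attachment folders could be driven by something like the sketch below. It is not a StarTeam tool: choosing Sunday for the full backup and selecting incremental files by modification time are both assumptions made for illustration.

```python
import os
from datetime import date


def backup_kind(day: date, full_backup_weekday: int = 6) -> str:
    """Return 'full' on the chosen weekday (default 6 = Sunday, an assumption)."""
    return "full" if day.weekday() == full_backup_weekday else "incremental"


def files_changed_since(root: str, since_epoch: float) -> list:
    """List files under `root` modified after `since_epoch` (seconds since 1970)."""
    changed = []
    for folder, _dirs, names in os.walk(root):
        for name in names:
            path = os.path.join(folder, name)
            if os.path.getmtime(path) > since_epoch:
                changed.append(path)
    return sorted(changed)
```

A nightly job would call `backup_kind(date.today())` and, for an incremental run, pass the timestamp of the previous backup to `files_changed_since`.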

35
StarTeam 7.0 Recovery Procedure
  • To recover a full StarTeam configuration:
  • 1. Reload the database
  • 2. Simultaneously reload archive and attachment
    folders
  • 2.1 Load latest full backup
  • 2.2 Load all incrementals since last full
    backup in parallel
  • Modify this procedure for partial recoveries
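Selecting which backup sets to reload for the archive and attachment folders can be sketched as below. The `Backup` record and `restore_plan` function are hypothetical helpers, not part of StarTeam; they only encode the rule of the latest full backup plus every incremental taken after it.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Backup:
    taken_on: date
    kind: str  # "full" or "incremental"


def restore_plan(backups):
    """Return the latest full backup followed by all later incrementals, oldest first."""
    fulls = [b for b in backups if b.kind == "full"]
    if not fulls:
        raise ValueError("no full backup available")
    latest_full = max(fulls, key=lambda b: b.taken_on)
    later = [
        b for b in backups
        if b.kind == "incremental" and b.taken_on > latest_full.taken_on
    ]
    return [latest_full] + sorted(later, key=lambda b: b.taken_on)


catalog = [
    Backup(date(2024, 1, 7), "full"),
    Backup(date(2024, 1, 8), "incremental"),
    Backup(date(2024, 1, 14), "full"),
    Backup(date(2024, 1, 15), "incremental"),
    Backup(date(2024, 1, 16), "incremental"),
]
for b in restore_plan(catalog):
    print(b.taken_on, b.kind)
```

The list is sorted only for readability; the procedure above says the incrementals can be loaded in parallel.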

36
StarTeam High Availability Best Practices
[Figure: availability (vertical axis) versus investment (horizontal axis)]
  • Disaster recovery planning
  • Failover management
  • Redundancy (no SPOFs)
  • Backup procedures
  • Demand peak management (flash control)
  • Basic administrative practices
37
Reducing SPOFs
  • Servers
  • Dual power supplies, ECC/mirrored memory, dual
    fans, etc.
  • Storage
  • Dual controllers, mirrored/RAID disks
  • Network
  • Dual network cards, redundant switches, dual ISP
    connections, etc.

38
Redundant Everything
[Diagram: a StarTeam Server (ECC memory, dual fans, etc.) and a Database Server, connected through two switches with dual NICs; dual controllers attach the RAID vault disks and RAID DB disks.]
39
StarTeam High Availability Best Practices
[Figure: availability (vertical axis) versus investment (horizontal axis)]
  • Disaster recovery planning
  • Failover management
  • Redundancy (no SPOFs)
  • Backup procedures
  • Demand peak management (flash control)
  • Basic administrative practices
40
Failover Checklist
  • At least two identically configured systems
  • Shared disks
  • Network connections
  • Heartbeat/server network
  • Client-facing service network
  • Optional administrative network
  • Failure Management System (FMS)
  • Cluster set: app, db connections, IP address

41
StarTeam Active/Passive Configuration Requirements
  • Each system identically configured
  • StarTeam release (including patches)
  • starteam-server-configs.xml
  • EventServices\<config>\.xml
  • ServerLicenses.st
  • Access to shared vault and database
  • Only one instance can be running at a time
  • Failover time is the secondary's startup time
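Whether the two systems really are identically configured can be checked before relying on failover. The sketch below compares SHA-256 digests of configuration files that exist on both nodes. The function names are hypothetical, and where the files live on each node is an assumption, not something stated in the talk.

```python
import hashlib
from pathlib import Path


def file_digest(path: Path) -> str:
    """SHA-256 hex digest of a file's bytes."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def config_mismatches(active_dir, passive_dir, filenames):
    """Return the names that are missing on either node or differ between them."""
    mismatched = []
    for name in filenames:
        active = Path(active_dir) / name
        passive = Path(passive_dir) / name
        if not (active.is_file() and passive.is_file()):
            mismatched.append(name)
        elif file_digest(active) != file_digest(passive):
            mismatched.append(name)
    return mismatched


# File names from the checklist above; directory locations are up to the caller.
FILES_TO_COMPARE = ["starteam-server-configs.xml", "ServerLicenses.st"]
```

An empty result from `config_mismatches(active_dir, passive_dir, FILES_TO_COMPARE)` means the listed files match; it says nothing about release or patch level, which still needs its own check.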

42
Active/Passive Configuration
[Diagram: an Active Server and a Passive Server linked by a heartbeat network and sharing mirrored disks; the client-facing service network uses address 12.34.56.78.]
43
Failover Condition
[Diagram: the same active/passive configuration with a failure marked X; labels: heartbeat network, mirrored disks, Active Server, Passive Server, 12.34.56.78, client-facing service network.]
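The detection side of a failover condition can be sketched as a heartbeat timeout. This is a simplified illustration of what a failure management system does, not how any particular product works; the class name and the five-second timeout are assumptions. The caller supplies timestamps (for example from `time.monotonic()`), which keeps the logic easy to test.

```python
class HeartbeatMonitor:
    """Tracks the last heartbeat from the active server, as seen by the passive one."""

    def __init__(self, timeout_seconds: float, now: float):
        self.timeout_seconds = timeout_seconds
        self.last_seen = now

    def record_heartbeat(self, now: float) -> None:
        """Call whenever a heartbeat arrives over the heartbeat network."""
        self.last_seen = now

    def peer_is_down(self, now: float) -> bool:
        """True once no heartbeat has arrived for longer than the timeout."""
        return (now - self.last_seen) > self.timeout_seconds


monitor = HeartbeatMonitor(timeout_seconds=5.0, now=0.0)
monitor.record_heartbeat(now=3.0)
if monitor.peer_is_down(now=9.0):
    print("no heartbeat for more than 5 s: start the passive server and take over the service IP")
```

Because only one instance may run against the shared vault and database, a real deployment also has to make sure the failed server is truly stopped before the secondary starts.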
44
StarTeam and BDOC
  • Borland Deployment Op-Center can assist with
    process monitoring and restart
  • StarTeam Server process
  • MPX Processes
  • Message Broker
  • Multicast Service
  • Cache Agent
  • Workflow Notification Agent

45
Op-Center Example
46
StarTeam High Availability Best Practices
[Figure: availability (vertical axis) versus investment (horizontal axis)]
  • Disaster recovery planning
  • Failover management
  • Redundancy (no SPOFs)
  • Backup procedures
  • Demand peak management (flash control)
  • Basic administrative practices
47
Replication for DR
  • Types of replication based on latency:
  • Synchronous: remote site is always up-to-date
  • Asynchronous: remote site lags by a small amount
    of time
  • Batch: remote site receives periodic snapshots
    (e.g., backups)

48
Synchronous Replication
  • Long-distance mirroring
  • Fibre Channel: 10 km or more with newer
    technologies
  • Variation: disk replication software (e.g.,
    Veritas Volume Replicator)
  • Advantages: real-time replication
  • Disadvantages: cost

49
Asynchronous Replication
  • Possible strategy for StarTeam
  • Database-provided replication, e.g.:
  • SQL Server Log Shipping
  • Oracle Standby Database Replication
  • Continuous/incremental copy of attachment and
    archive files
  • Exploits write-once feature of StarTeam 7.0 vault
  • Possible because not yet in use!
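The continuous/incremental copy of archive and attachment files could look like the sketch below. It leans on the write-once property: a file that already exists at the replica is assumed to be complete and unchanged, so only missing files are copied. This is an untested-in-production illustration in the same spirit as the slide, and the function name and `.partial` suffix are hypothetical.

```python
import shutil
from pathlib import Path


def copy_new_files(source_root, replica_root) -> list:
    """Copy files that exist under source_root but not yet under replica_root.

    Returns the relative paths that were copied.
    """
    source_root, replica_root = Path(source_root), Path(replica_root)
    copied = []
    for src in sorted(source_root.rglob("*")):
        if not src.is_file():
            continue
        rel = src.relative_to(source_root)
        dst = replica_root / rel
        if dst.exists():
            continue  # write-once assumption: an existing file never changes
        dst.parent.mkdir(parents=True, exist_ok=True)
        tmp = dst.with_name(dst.name + ".partial")
        shutil.copy2(src, tmp)  # copy under a temporary name first
        tmp.replace(dst)        # then rename, so readers never see a half-copied file
        copied.append(rel)
    return copied
```

Run on a schedule, each pass transfers only what was added since the previous one, so the lag at the remote site is bounded by the schedule interval plus transfer time.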

50
Asynchronous Replication
  • Advantages
  • Less network bandwidth needed than synchronous
    replication
  • Database currency window can be tuned
  • Disadvantages
  • Requires reliable network
  • Not yet tested!

51
Batch Replication
  • Sending backups offsite
  • Never underestimate the bandwidth of a station
    wagon filled with tapes barreling down the
    highway
  • Make copies of backups or rotate backups through
    offsite storage
  • Send backups via FedEx, UPS, Volvo net, etc.

52
Batch Replication
  • Advantages
  • Reliable
  • Low cost
  • Full backups ensure recoverability
  • Disadvantages
  • Asynchronous (time lag)
  • Manual process (media handling) unless network
    bandwidth is available

53
StarTeam High Availability Best Practices: Other
Topics
54
Other High Availability Features for StarTeam 7.0
  • New StarTeam 7.0 Vault
  • Conversion from StarTeam 6.0 vault can occur in
    real-time as background or scheduled process
  • Vault space can be increased dynamically by
    adding new hives
  • Archive files can be offloaded/ reloaded
    dynamically

55
Other High Availability Features for StarTeam 7.0
  • New StarTeam 7.0 Memory Management
  • New memory management caps memory growth with
    XxxCaching values > 0 (where Xxx = Files,
    ChangeRequests, etc.)
  • Allows the server to run for very long periods
    without restarting

56
Summary
  • High availability is a cost/benefit pursuit
  • Review administrative practices
  • Smooth demand peaks (MPX)
  • Establish on-line backup procedures
  • Eliminate SPOFs
  • Consider clustering for failover
  • Create a disaster recovery plan
  • Document and test everything!

57
References
  • Blueprints for High Availability, 2nd Edition,
    Evan Marcus and Hal Stern, Wiley Publishing Inc.
    (2003): detailed discussion of all issues related
    to high availability
  • Applied Reliability, Paul Tobias and David
    Trindade, Kluwer Academic Publishers (1995):
    detailed mathematical treatment of failure rates
    and renewability

58
Questions?
59
Thank You
  • 3178
  • 24 x 7 StarTeam
  • Please fill out the speaker evaluation
  • You can contact me further at randy.guck_at_borland.com