Network Availability Management and Reporting - PowerPoint PPT Presentation

1
Network Availability Management and Reporting
  • ESnet Network Management
  • Mike O'Connor
  • ESnet Network Engineering Group
  • Lawrence Berkeley National Lab
  • moc_at_es.net

2
Objective
  • Develop and deploy a system which will track
    ESnet network availability, producing clear and
    concise reports.
  • Accurately reflect both planned and unplanned
    network outages for ESnet, its customer sites,
    contracted carriers and network peers.
  • Provide a measure of "uptime" for any given site
    and for the ESnet Backbone.
  • Improve maintenance planning and eliminate or
    improve response time to repetitive systematic
    network failures.
  • Increase network availability for ESnet customers.

3
Network Management Systems Fall Short
  • Commercial off-the-shelf tools don't provide the
    level of integration necessary to correlate and
    categorize planned maintenance with the high
    volume of alarms reported by an NMS.
  • Producing accurate, customer-centric availability
    reports based on alarm data requires a
    comprehensive and well-integrated planning,
    scheduling and reporting system.
  • An NMS will report what alarmed and how long it
    alarmed, but almost never why it alarmed.

4
Over-Reporting Outages
  • Network management systems typically over-report
    events.
  • Outages are usually reported from multiple
    devices and multiple subsystems of the same
    device.
  • Reporting is device-centric, with no correlation
    to maintenance or to the effect on specific customers.

Where and when? Yes.
Why? No.
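
The over-reporting described on this slide can be reduced by collapsing overlapping alarms into one outage. The following is a minimal sketch of that idea, not the ESnet implementation: the function name `merge_outages` and the helper `t` are hypothetical, and the sample intervals are modeled on alarm times that appear in the event log later in this transcript.

```python
from datetime import datetime

def merge_outages(reports):
    """Collapse overlapping (start, end) alarm intervals into distinct outages."""
    merged = []
    for start, end in sorted(reports):
        if merged and start <= merged[-1][1]:
            # Overlaps the previous outage: extend it instead of counting a new one.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged

def t(hms):
    """Hypothetical helper: build a timestamp on 08/02/2003 from HH:MM:SS."""
    return datetime.strptime("08/02/2003 " + hms, "%m/%d/%Y %H:%M:%S")

# Three alarms raised by different devices during one circuit outage,
# plus one unrelated alarm later in the morning.
reports = [
    (t("10:01:59"), t("10:08:11")),  # gac-rt2 POS0/1/0
    (t("10:02:35"), t("10:08:10")),  # bechtel-rt1
    (t("10:03:11"), t("10:08:11")),  # bechtel-rt1 Serial0/0
    (t("10:47:39"), t("10:50:27")),  # llnl-rt3
]
outages = merge_outages(reports)
print(len(reports), "alarms reduced to", len(outages), "outages")
```

Merging by time alone is a simplification; a real system would presumably also check that the alarming devices share a path before treating them as one outage.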
5
  • Formulate objective
  • Devise a solution
  • Allocate resources
  • Select a maintenance window and record in calendar
  • Schedule resources
  • Send out advance notification
  • Execute the plan
  • Verify results
  • Review planned outages
  • Investigate unplanned outages
  • Feed back into planning process (contracted,
    external, internal)
  • Capture network alarms
  • Correlate with planned events
  • Format report

6
Prior to report and review integration
7
(No Transcript)
8
Planned Maintenance Calendar: Month View
9
Planned Maintenance Calendar: Day View
10
Planning and Notification Tools
  • Event Input Forms
  • Outage Footprint Calculator (OFC)
  • Email Templates
  • Contact Database (automatically referenced by
    the OFC)

11
Outage Footprint Calculator
12
(No Transcript)
13
(No Transcript)
14
Email Templates
ESnet Outage Notification <TTS>-<Title>
Begin: <Date> <GMT>
End: <End> <EndGMT>
Location: <Location>
This outage has been estimated to be <EOD-hour>(hrs) <EOD-min>(min) within the above maintenance window
Description: <description>
Affected Devices: <affected>
------------------------------------------------------------
ENERGY SCIENCES NETWORK (ESnet)
24x7 NOC (510)486-7607
Email trouble_at_es.net
http://www.es.net
------------------------------------------------------------
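
A template like this can be filled in mechanically from the event input form. The sketch below assumes the placeholders are angle-bracketed field names such as `<TTS>` and `<EOD-hour>`; the function `fill_template` and every field value are illustrative assumptions, not taken from the ESnet tooling.

```python
import re

TEMPLATE = """ESnet Outage Notification <TTS>-<Title>
Begin: <Date> <GMT>
End: <End> <EndGMT>
Location: <Location>
This outage has been estimated to be <EOD-hour>(hrs) <EOD-min>(min) within the above maintenance window
Description: <description>
Affected Devices: <affected>"""

def fill_template(template, fields):
    """Replace each <Name> placeholder with fields[Name]; fail on missing values."""
    def substitute(match):
        name = match.group(1)
        if name not in fields:
            raise KeyError(f"no value supplied for <{name}>")
        return str(fields[name])
    return re.sub(r"<([A-Za-z-]+)>", substitute, template)

# Illustrative values only (hypothetical ticket number, times and description).
fields = {
    "TTS": "2003",
    "Title": "SNV-to-GAC maintenance",
    "Date": "08/02/2003 10:00 US/Pacific",
    "GMT": "(17:00 GMT)",
    "End": "08/02/2003 10:30 US/Pacific",
    "EndGMT": "(17:30 GMT)",
    "Location": "Sunnyvale, CA",
    "EOD-hour": 0,
    "EOD-min": 10,
    "description": "Circuit maintenance",
    "affected": "bechtel-rt1 gac-rt2",
}
message = fill_template(TEMPLATE, fields)
print(message)
```

Raising an error on a missing field, rather than leaving the placeholder in place, keeps a half-filled notification from being queued for sending.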
15
Planned Maintenance Calendar: Queued Email View
16
Event Correlation Example: Sunnyvale Circuit Maintenance
  • Bechtel Path to ESnet Core
  • bechtel-rt1 - bechtel-ga.es.net
  • bechtel-rt1, Serial0/0
  • gac-rt2, Serial5/0/0
  • gac-rt2 - gac-pos-snv.es.net
  • gac-rt2, POS0/1/0
  • snv-cr1, so-3/1/0.0 (maintenance)
  • snv-cr1: ESnet core router in Sunnyvale, CA.
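
One way an outage footprint could be derived from this path is a reachability check: take the maintenance interface out of the topology and see which routers can no longer reach the core. The slides do not describe how the Outage Footprint Calculator works internally, so the sketch below is an assumption throughout; `LINKS`, `affected_routers` and the use of the physical interface name `so-3/1/0` (rather than the logical unit `so-3/1/0.0`) are illustrative choices.

```python
from collections import deque

# (router A, interface A, router B, interface B), transcribed from the path above.
LINKS = [
    ("bechtel-rt1", "Serial0/0", "gac-rt2", "Serial5/0/0"),
    ("gac-rt2", "POS0/1/0", "snv-cr1", "so-3/1/0"),
]

def affected_routers(links, core, maintenance):
    """Return routers that lose every path to the core during maintenance."""
    down = set(maintenance)
    adjacency = {}
    routers = set()
    for a, a_if, b, b_if in links:
        routers.update((a, b))
        if (a, a_if) in down or (b, b_if) in down:
            continue  # this link is out of service during the maintenance
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    # Breadth-first search outward from the core routers.
    reachable = set(core)
    queue = deque(core)
    while queue:
        node = queue.popleft()
        for neighbor in adjacency.get(node, ()):
            if neighbor not in reachable:
                reachable.add(neighbor)
                queue.append(neighbor)
    return sorted(routers - reachable)

print(affected_routers(LINKS, {"snv-cr1"}, [("snv-cr1", "so-3/1/0")]))
# → ['bechtel-rt1', 'gac-rt2']
```

With only the two links listed on the slide there is no alternate path, so both downstream routers fall inside the footprint; a redundant link added to `LINKS` would remove them from the result.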

17
Event Correlation: SNV so-3/1/0 Example
18
Event Correlation Engine
EVENT 08/01/2003 22:58:57 08/01/2003 23:38:57 000:00:40:00 snv-cr1 so-0/1/0
EVENT 08/02/2003 08:21:30 08/02/2003 10:00:17 000:01:38:47 snll-rt1 rtr_cisco
START 08/02/2003 10:00:00 US/Pacific 2003 SNV-to-GACmaintenance
EVENT 08/02/2003 10:01:53 08/02/2003 10:11:53 000:00:10:00 snv-cr1 so-3/1/0
EVENT 08/02/2003 10:01:59 08/02/2003 10:08:11 000:00:06:12 gac-rt2 POS0/1/0
EVENT 08/02/2003 10:01:59 08/02/2003 10:08:11 000:00:06:12 gac-rt2 Serial5/0/0
EVENT 08/02/2003 10:02:35 08/02/2003 10:08:10 000:00:05:35 bechtel-rt1 rtr_cisco
EVENT 08/02/2003 10:02:35 08/02/2003 10:08:11 000:00:05:36 gac-rt2 rtr_cisco
EVENT 08/02/2003 10:03:11 08/02/2003 10:08:11 000:00:05:00 bechtel-rt1 Serial0/0
EVENT 08/02/2003 10:27:51 08/02/2003 10:39:12 000:00:11:21 snll-rt1 rtr_cisco
STOP 08/02/2003 10:30:01 US/Pacific 2003 SNV-to-GACmaintenance
EVENT 08/02/2003 10:47:39 08/02/2003 10:50:27 000:00:02:48 llnl-rt3 rtr_cisco
EVENT 08/02/2003 10:47:46 08/02/2003 10:50:27 000:00:02:41 llnl-rt3 gen_if_port

Outage Footprint Calculation
affected -i snv-cr1,so-3/1/0
Maintenance Interfaces: snv-cr1,so-3/1/0
Affected Routers: bechtel-rt1 gac-rt2
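
The correlation step can be sketched as a two-part test: an alarm counts as planned only if it overlaps the maintenance window and its device lies inside the outage footprint. This is a guess at the logic, not the actual engine; `classify` and `ts` are hypothetical names, and the timestamps are written as HH:MM:SS on the assumption that this is how the log records them.

```python
from datetime import datetime

def ts(text):
    """Parse a 'MM/DD/YYYY HH:MM:SS' timestamp."""
    return datetime.strptime(text, "%m/%d/%Y %H:%M:%S")

def classify(events, window, footprint):
    """Label each (begin, end, device, component) event planned or unplanned."""
    start, stop = window
    labels = []
    for begin, end, device, component in events:
        overlaps = begin < stop and end > start
        planned = overlaps and device in footprint
        labels.append((device, component, "planned" if planned else "unplanned"))
    return labels

# Maintenance window from the START/STOP records; footprint is the maintenance
# router plus the affected routers reported by the footprint calculation.
window = (ts("08/02/2003 10:00:00"), ts("08/02/2003 10:30:01"))
footprint = {"snv-cr1", "gac-rt2", "bechtel-rt1"}
events = [
    (ts("08/01/2003 22:58:57"), ts("08/01/2003 23:38:57"), "snv-cr1", "so-0/1/0"),
    (ts("08/02/2003 08:21:30"), ts("08/02/2003 10:00:17"), "snll-rt1", "rtr_cisco"),
    (ts("08/02/2003 10:01:53"), ts("08/02/2003 10:11:53"), "snv-cr1", "so-3/1/0"),
    (ts("08/02/2003 10:03:11"), ts("08/02/2003 10:08:11"), "bechtel-rt1", "Serial0/0"),
    (ts("08/02/2003 10:47:39"), ts("08/02/2003 10:50:27"), "llnl-rt3", "rtr_cisco"),
]
for device, component, label in classify(events, window, footprint):
    print(f"{device:12} {component:10} {label}")
```

The snll-rt1 alarm shows why both conditions matter: it overlaps the window by a few seconds but sits outside the footprint, so it is left for investigation as an unplanned event.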
19
Event Review
  • Review planned events: maintenance window,
    expected outage duration, outage footprint
  • Investigate unplanned events: circuit providers,
    hardware vendors, the NMS

20
Event Categorization and Outage Reports
  • Accurate availability statistics in a clear,
    concise format suitable for distribution
  • Planned vs. Unplanned
  • ESnet, Site, Carrier, Peer categorization
  • Consolidation of multiple reporters
  • Network regional reports
  • Backbone specific report

21
Future Work
  • Unplanned outage categorization tools.
  • Customer version of the planned maintenance
    calendar.
  • Achieve 100% event categorization.
  • Uptime metrics

22
Questions?