OpenBSI Trouble Shooting - PowerPoint PPT Presentation

Provided by: eddieph
Transcript and Presenter's Notes


1
OpenBSI Trouble Shooting
  • Presented by: Bob Findley
  • Director of Marketing and Business Development
  • Written by: Steve Hill
  • Director, Software Product Business Development
  • Email: shill_at_bristolbabcock.com
  • Phone: 860-945-2501

2
Basic BSAP primer
  • Polled protocol created to maximize bandwidth and
    allow multilevel data access for difficult
    communication environments
  • Basics of SCADA

(Diagram: Host PC)
3
As found in Manual 5080 Std BSAP
4
Immediate Response
5
OpenBSI Trouble Shooting - Introduction
  • Messages, Buffers and Wait packets
  • Ten Most Common Problems examined
  • Starting a System up
  • Where to start when a system breaks

6
What is a message?
A BSAP message is like a package: you send it
somewhere via the network.
7
What is a buffer?
A Buffer is like a shelf that can only fit one
BSAP message
8
Where are the buffers?
  • Every depot on the network must have storage
    buffers
  • They are in the PCs
  • And they are in the RTUs
  • Messages are kept in buffers when they are not in
    transit
  • In 33xx and ControlWave, you must specify how
    many you need.

9
What is a wait packet?
  • A wait packet is like the air waybill receipt.
    You must hang on to it until you know the message
    has arrived OK.
  • There is one in use for every message
    outstanding on the network

10
What happens when there's a problem?
  • If the network is down, or slow, messages start
    to build up in buffers, waiting for pickup
  • We will run out of buffers

11
... or the destination is unavailable?
  • If the destination is unavailable, our wait
    packets will build up at the PC

12
Wait packets and buffers in use
are controlled by the message timeout. Both are
discarded when it expires.
13
Buffers and Wait Packets - Summary
  • A buffer is a piece of memory that holds a
    message waiting to go somewhere.
  • If the network is busy, or large, messages have to
    wait a while before being sent, so they are
    queued in buffers.
  • A wait packet is a piece of memory that holds the
    information necessary to process a reply.
  • If the network is slow, a reply takes a long
    time, so there are a lot of wait packets in use.
  • When a message times out, the wait packet is
    discarded.
  • If there are not enough buffers or wait packets,
    you can't communicate even with healthy RTUs.
  • Symptoms: timeouts, RTUs going dead in the HMI
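The link between slow replies and wait packet usage can be sketched with a rough steady-state estimate. This is an illustration only, not how OpenBSI accounts for its resources: it assumes requests are issued at a constant rate, and the function name, the request rate and the 2-second reply time are all made up for the example. The 45-second figure matches the DEF_MESSAGE_TIMEOUT value shown in the NDF slide.

```python
import math


def wait_packets_in_use(requests_per_second, seconds_held):
    """Estimate wait packets held at steady state.

    Each outstanding message holds one wait packet until its reply
    arrives or the message timeout expires, so the count is roughly
    (request rate) x (time each wait packet is held).
    """
    return math.ceil(requests_per_second * seconds_held)


# Healthy network: replies come back in about 2 seconds (assumed figure).
healthy = wait_packets_in_use(requests_per_second=5, seconds_held=2)

# Dead RTUs: no reply ever arrives, so each wait packet is held
# for the full message timeout (45 seconds here).
dead = wait_packets_in_use(requests_per_second=5, seconds_held=45)

print(healthy, dead)
```

Under these assumed numbers the healthy case needs about 10 wait packets, while the dead-RTU case needs about 225, which would already exceed a pool of 200.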

14
Top Ten Problems...
15
1. Not Enough Buffers and Wait Packets
  • There should be at least one buffer for every
    message being transmitted at a time. No harm in
    having too many!
  • Recommend (tags/10), i.e. 2000 tags = 200
    buffers, but this will depend on the HMI software
  • Consider the worst case: the number of messages
    sent at startup.
  • Recommend wait packets be at least buffers x 2
  • Look at what's going on in Netview->Monitor
  • Configured in the NDF file with text editor
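The two rules of thumb above (buffers = tags/10, wait packets = buffers x 2) reduce to a few lines of arithmetic. The sketch below is only a starting point; the function name is hypothetical, and the slide itself notes that the right figure depends on the HMI software and on the startup worst case.

```python
import math


def recommended_resources(tag_count):
    """Return (buffers, wait_packets) from the slide's rules of thumb."""
    buffers = math.ceil(tag_count / 10)  # tags / 10, rounded up
    wait_packets = buffers * 2           # at least buffers x 2
    return buffers, wait_packets


buffers, wait_packets = recommended_resources(2000)
print(buffers, wait_packets)  # 200 400
```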

16
NDF File Contents
  • CONSTANTS
  • MESSAGE_EXCHANGES=15
  • WAIT_PACKETS=200
  • TOTAL_BUFFERS=100
  • RTU_BLOCKS=100
  • GOAL_FREE_BUFFERS=30
  • RTU_RETRIES=4
  • DEF_MESSAGE_TIMEOUT=45
  • DELETE_JOURNAL=1
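A quick sanity check of these constants can be scripted. The sketch below assumes the constants are stored one per line as KEY=VALUE, which is an assumption about the file layout rather than a documented format, and both function names are hypothetical. It only checks the "wait packets at least buffers x 2" rule of thumb.

```python
def parse_constants(text):
    """Parse KEY=VALUE lines into a dict of ints, skipping anything else."""
    constants = {}
    for line in text.splitlines():
        key, sep, value = line.strip().partition("=")
        if sep and value.strip().isdigit():
            constants[key.strip()] = int(value)
    return constants


def check_resources(constants):
    """Return warnings where the values break the buffers x 2 rule."""
    warnings = []
    buffers = constants.get("TOTAL_BUFFERS", 0)
    waits = constants.get("WAIT_PACKETS", 0)
    if waits < 2 * buffers:
        warnings.append(
            f"WAIT_PACKETS={waits} is below 2 x TOTAL_BUFFERS={buffers}"
        )
    return warnings


# Values taken from the slide; the KEY=VALUE layout is assumed.
sample = """\
CONSTANTS
MESSAGE_EXCHANGES=15
WAIT_PACKETS=200
TOTAL_BUFFERS=100
"""
constants = parse_constants(sample)
print(check_resources(constants))
```

With the slide's values (200 wait packets, 100 buffers) the check passes and no warnings are produced.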

17
2. Not Enough Buffers in the RTU
  • For 33xx, you specify the number of buffers in
    the load
  • You get some, but often not enough by default.
  • Increase (NEVER DECREASE) the number when you add
    more global signals, alarms, etc.
  • A sure sign of not enough buffers is NAKs being
    transmitted from the RTU.
  • Don't overload a Pseudo Slave Port! You can't
    change the number of buffers for this!
  • Check for NAKs in OpenBSI and in the communications
    stats from the RTU.

18
3. Message Timeout Too Short
  • There are two timeouts. Message Timeout and Link
    Level Timeout
  • Message Timeout is the time for a message to go
    from the application (HMI) on the PC, to the RTU
    and return
  • I'd recommend it being at least 3x the combined poll
    period of all levels the message travels through
    (e.g. 1st level = 5 secs, 2nd level = 60 secs, then
    make it 65 x 3 = 195 seconds)
  • Better is to look at the actual turnaround time;
    this requires looking at analyzer, DLM or HMI stats
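The 3x rule can be written as a one-line calculation. This is a minimal sketch of the slide's recommendation with a hypothetical function name; measured turnaround times, where available, are the better input.

```python
def recommended_message_timeout(poll_periods_s, factor=3):
    """Message timeout as `factor` times the summed poll periods (seconds)."""
    return factor * sum(poll_periods_s)


# 1st level polled every 5 s, 2nd level every 60 s, as in the slide.
timeout = recommended_message_timeout([5, 60])
print(timeout)  # 195
```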

19
Message Timeout Too Short
  • The longer the message timeout, the more wait
    packets you will use, if RTUs die.
  • If you have too short a timeout, you will see
    this in the OpenBSI Journal File if using
    Dataview or Harvester.

Wed Dec 01 14:59:42 2004 DATAVIE Wait packet for message id 0010 not found, msg discarded
Wed Dec 01 14:59:48 2004 DATAVIE Wait packet for message id 0011 not found, msg discarded
Wed Dec 01 15:00:02 2004 DATAVIE Wait packet for message id 0013 not found, msg discarded
Wed Dec 01 15:00:03 2004 HARVEST Wait packet for message id 0013 not found, msg discarded
Wed Dec 01 15:00:05 2004 DATAVIE Wait packet for message id 002A not found, msg discarded
Wed Dec 01 15:00:08 2004 DATAVIE Wait packet for message id 002B not found, msg discarded
  • Use Dataview as a test
  • Then use the OpenBSI Journal Tool to view the journal
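If the journal is available as text, the discarded-message entries can be counted per application to see which client is timing out most. The sketch below assumes entries worded like the excerpt above (with the timestamp colons restored); the pattern and function name are illustrative and do not come from any OpenBSI tool.

```python
import re
from collections import Counter

# Matches the application name just before "Wait packet ... not found".
# Whitespace is matched loosely so wrapped lines still count.
PATTERN = re.compile(
    r"(?P<app>\S+)\s+Wait\s+packet\s+for\s+message\s+id\s+"
    r"(?P<msg_id>[0-9A-Fa-f]+)\s+not\s+found"
)


def count_discards(journal_text):
    """Count 'wait packet not found' entries per application name."""
    return Counter(m.group("app") for m in PATTERN.finditer(journal_text))


sample = (
    "Wed Dec 01 14:59:42 2004 DATAVIE Wait packet for message id 0010 "
    "not found, msg discarded\n"
    "Wed Dec 01 15:00:03 2004 HARVEST Wait packet for message id 0013 "
    "not found, msg discarded\n"
    "Wed Dec 01 15:00:05 2004 DATAVIE Wait packet for message id 002A "
    "not found, msg discarded\n"
)
counts = count_discards(sample)
print(counts["DATAVIE"], counts["HARVEST"])  # 2 1
```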

20
4. Message Timeout too short in IP RTU
  • An Ethernet RTU contains similar code to OpenBSI
  • There is a message timeout it uses when talking
    to its slaves
  • The default is only 30 seconds
  • Configure this (and various other parameters)
    using the Internet_Protocol Module (33xx) or
    System Variable Wizard (ControlWave)

21
5. Link Level Timeout incorrect
  • Link Level timeout is the time expected for an
    ACK response from a top level RTU.
  • If it's too short, the message is timed out too
    early and then the response tramples on the
    next message
  • If it's too long, a single dead RTU will use up
    most of your bandwidth.
  • Use a network analyzer or the DLM to check.
    Here are some recommendations:
  • Direct serial: 0.2 - 0.5 seconds
  • Radio: 1 second (depends on configuration)
  • Internet (via re-director): < 1 second
  • Satellite/VSAT/multi-drop cellular: 5 seconds
    (9 has been seen, though)
  • Configured in Netview, as part of the Line
    Properties
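Why a long link level timeout lets one dead RTU eat the bandwidth can be shown with a simplified model. The sketch assumes each RTU on the line is polled once per cycle, healthy RTUs answer in a fixed time, and there are no retries; the function name and all figures are made up for illustration.

```python
def dead_rtu_share(link_timeout_s, healthy_poll_s, rtu_count):
    """Fraction of one poll cycle spent waiting on a single dead RTU."""
    healthy_time = (rtu_count - 1) * healthy_poll_s
    return link_timeout_s / (link_timeout_s + healthy_time)


# 10 RTUs on the line, healthy ones answer in 0.1 s (assumed figures).
long_share = dead_rtu_share(link_timeout_s=5.0, healthy_poll_s=0.1, rtu_count=10)
short_share = dead_rtu_share(link_timeout_s=0.5, healthy_poll_s=0.1, rtu_count=10)

print(f"{long_share:.0%} vs {short_share:.0%}")
```

In this model a 5-second timeout leaves the line idle for roughly 85% of each cycle, against roughly 36% with a 0.5-second timeout.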

22
6. No Alarms with Serial RTU
  • For serial RTUs, make sure you didn't plug into
    the Pseudo Slave port!
  • For all RTUs, use the Alarm Router to check that
    you don't have an HMI problem

24
7. No Alarms with IP RTU
  • Check the NHP Address is correct in Netview
  • Check the NHP Address is correct in the RTU
  • Check you reset (33xx)/power-cycled (CW) the RTU
    since last NHP address change

25
8. Two NHPs fighting for control
  • Avoid having two PCs configured as NHPs online
    at the same time
  • Use SCADA software that ensures one is offline
  • If you do have two, make certain they have the same
    OpenBSI files (and hence NRT versions)
  • Make sure only one Server is time synching the
    RTUs

26
9. Accidentally Configuring an RBE Signal
  • If there's an RBE signal in the load, the host
    will try to communicate with the RBE Module in
    Task 0
  • If it doesn't respond, a timeout occurs
  • And the RTU is declared dead
  • If using OpenEnterprise, take a look at the
    numrbe attributes on nw3000device
  • If using another HMI, search the .ACC for RBE

27
10. Broadcast Storms
  • Various Network protocols transmit broadcast
    packets
  • Every one of these will need to be processed by
    the OS on the machines they hit.
  • In the case of an RTU, this means an interrupt
    and no control for a few ms.
  • The worst offenders are ARP and NetBIOS
  • Use a network analyzer to look at the network
  • Segment the network using switches and routers, and
    disable NetBIOS, or make sure it's configured
    correctly
  • Keep RTUs off the corporate network. 8 MHz vs. 2 GHz
    is a battle you won't win!

28
Problem with a new Network?
  • Add RTUs one at a time
  • Check communications after each one
  • Keep data requests slow until you have them all
    running OK. Speed it up SLOWLY.
  • Don't ever skip a problem thinking it will go
    away
  • If one appears, go back a step
  • Once you have it all working, then speed it up
  • But check that it still works with 50 dead RTUs
  • Check all log files even when you think it's
    working

29
Something has stopped working: where to start?
  • If the system WAS working, what changed?
  • Look at the Netview Journal Files
  • Check OpenBSI Resources (buffers, wait packets)
  • Look outside for physical problems!
  • Use a network analyzer (if IP network)
  • Check your system logs! (you do have them,
    right?)

30
Configuring DLM logging for OpenBSI
Save this file as BSBSAP.INI in the Windows
folder to monitor serial communications to
Bristol RTUs on COM1:

DLM
ENABLED=1
COM1=C:\COM1-LOG.TXT

Save this file as BSIPDRV in the Windows folder
to monitor Ethernet communications to a Bristol
RTU:

DLM
Enabled=1
Filter=120.0.210.46
File=C:\BSAPIP-DLM.LOG
Data_Dump=1
Dump_Data=1
31
OpenBSI Troubleshooting
  • For further information, please contact:
  • Steve Hill
  • Director, Software Product Business Development
  • Email: shill_at_bristolbabcock.com
  • Phone: 860-945-2501