Health Check Monitoring - PowerPoint PPT Presentation

Transcript and Presenter's Notes

1
Health Check Monitoring
  • Ivan M. Redondo
  • Technical Relation Manager

2
What is Health Check Monitoring?
  • A service that performs health checks on various
    subsystems
  • Performs self-healing actions
  • Tests can be configured and extended by the user
  • HCA relies on the IMA Advanced service to
    perform privileged actions, such as:
      • stopping IMA
      • rebooting a server

3
Typical Problems to be addressed
  • Enumeration of published applications
  • XML service hangs (Black Hole Issue)
  • IMA service hangs
  • IMA Ticket requests fail
  • XML Ticket requests fail
  • Terminal Service hangs

4
Issues NOT addressed
  • SSL Relay bad certificate test
  • SG authentication test
  • Queued ASP Requests on the IIS WI server
  • AAC Authentication test
  • IMA slowness to enumerate user permissions
  • End user client cannot obtain a valid Citrix
    license
  • No ICA listener test

5
Farm Properties

6
Server Properties

7
Four Tests
  • IMA Service test
      • IMATest.exe
  • Terminal Services test
      • CheckTermSrv.exe
  • XML Ticket Request test
      • RequestTicket.exe
  • Logon/Logoff test
      • LogonMonitor.dll

8
IMA Service Test

9
IMA Service Test
  • Performs an application enumeration on the IMA
    service to ensure it is up and running
  • Description: IMA Test
  • Interval: 60
  • Filename: imatest.exe
  • Threshold: 2
  • Timeout: 120
  • ActionOnError: Alert only

10
Terminal Services test
  • Enumerates the list of sessions running on the
    server and the session user information
  • Interval: 30
  • Filename: checktermsrv.exe
  • Threshold: 2
  • Timeout: 120
  • ActionOnError: Alert only

11
XML Service test
  • Requests a ticket from the XML service running on
    the server.
  • Interval: 60
  • Filename: requestticket.exe
  • Threshold: 2
  • Timeout: 120
  • ActionOnError: Remove server from LB

12
Logon Monitor test
  • Monitors the logon/logoff cycles to determine if
    there is a problem with session initialization
  • Interval: 30
  • Filename: LogonMonitor.dll
  • Threshold: 2
  • Timeout: 120
  • ActionOnError: Alert only

13
Logon Monitor test parameters
  • Three parameters unique to Logon Monitor
  • Must be updated manually through command line
  • SessionTime: 5
  • SessionThreshold: 50
  • SampleInterval: 600
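
The slides do not spell out what the three parameters mean. The sketch below shows one plausible reading, stated as an assumption: a session that ends within SessionTime seconds of logon counts as a short logon/logoff cycle, and the test fails once SessionThreshold such cycles fall within the last SampleInterval seconds. The `LogonMonitor` class here is hypothetical and is not the logic of LogonMonitor.dll.

```python
from collections import deque


class LogonMonitor:
    """Sliding-window counter of short logon/logoff cycles (assumed semantics)."""

    def __init__(self, session_time=5, session_threshold=50, sample_interval=600):
        self.session_time = session_time            # max length of a "short" session
        self.session_threshold = session_threshold  # short cycles needed to fail
        self.sample_interval = sample_interval      # length of the sampling window
        self._short_cycles = deque()                # logoff times of short sessions

    def record_session(self, logon_ts, logoff_ts):
        """Record one logon/logoff cycle; return True while the test passes."""
        if logoff_ts - logon_ts <= self.session_time:
            self._short_cycles.append(logoff_ts)
        # Drop short cycles that have fallen out of the sampling window.
        cutoff = logoff_ts - self.sample_interval
        while self._short_cycles and self._short_cycles[0] < cutoff:
            self._short_cycles.popleft()
        return len(self._short_cycles) < self.session_threshold


if __name__ == "__main__":
    monitor = LogonMonitor()
    print(monitor.record_session(logon_ts=0, logoff_ts=2))
```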

14
Recovery Actions
  • Performed when a test fails
  • Intended to self-heal the server
  • Configured by admin
  • Regardless of the action, an event will be triggered
  • The event will cause an alert in SMA

15
List of recovery actions
  • Alert Only
  • Remove Server From Load Balance Tables
  • Shutdown IMA service
  • Restart IMA Service
  • Reboot Server

16
Alert Only
  • HCA will write an error message to the event log
  • HCA will continue running its tests but will not
    post further event log entries on subsequent failures
  • If the test starts passing again, another event
    will be sent to the Application log indicating
    that the test passed

17
Remove Server From LB Tables
  • HCA will communicate with the local IMA service
  • The IMA service will remove the server's entry from
    the dynamic store server load table
  • The ZDC will stop resolving incoming connections to
    the server
  • Existing connections continue to run
  • Disconnected sessions will continue to resolve to
    this server
  • Direct connections can still be made (bypassing LB)

18
Farm Exclusion Limit

19
Shutdown IMA service
  • HCA will send a message to the IMA Advanced
    service to shut down IMA
  • The IMA Advanced service tries to shut down IMA
  • If not successful, it will kill the IMA process
  • HCA will continue running its tests
  • Any test that relies on IMA will fail, triggering
    an event and the configured action

20
Restart IMA Service
  • HCA will send a message to the IMA Advanced
    service to restart IMA
  • The IMA Advanced service tries to shut down IMA
  • If not successful, it will kill the IMA process
  • The IMA Advanced service will then attempt to start
    the IMA service
  • Only shutting down IMA is guaranteed to succeed;
    starting IMA is not
  • Any test that relies on IMA will fail, triggering
    an event and the configured action
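
The restart sequence (stop, fall back to kill, then start) can be sketched as below. The three callables are hypothetical stand-ins for whatever the IMA Advanced service does internally; each is assumed to return True on success.

```python
def restart_ima(stop, kill, start):
    """Run the restart sequence and return (started, steps taken).

    stop, kill and start are caller-supplied callables returning True on
    success; they are placeholders, not real service-control calls.
    """
    steps = []
    if stop():
        steps.append("stopped")
    else:
        # A graceful shutdown failed, so the process is killed instead.
        kill()
        steps.append("killed")
    # Starting is the only step that is not guaranteed to succeed.
    started = start()
    steps.append("started" if started else "start failed")
    return started, steps


if __name__ == "__main__":
    print(restart_ima(stop=lambda: False, kill=lambda: True, start=lambda: True))
```

If `start` fails, IMA stays down, and tests that depend on it keep failing until their configured action fires.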

21
Reboot Server
  • HCA will send a message to IMA Advanced service
    to reboot the server
  • The IMA Advanced service will make a system call
    to reboot the machine
  • After reboot, HCA will operate normally.

22
Questions?