Title: Application Quality of Service
1Application Quality of Service
Tuesday, 1400, 14 Sept 2004
- Karl Schopmeyer k.schopmeyer_at_opengroup.org
V 1.1, 14 Sept 2004
2Agenda
- Introduction
- Objectives
- Overview of the Architecture
- Technologies involved
- State of the Standards
- What we need to make AQOS work
- Future Directions for AQOS
3Overview
The Objective: Increasing QUALITY while
decreasing COST of the services you provide.
- Some of the Issues
- Resolve bottlenecks quickly or, better yet,
prevent them from happening.
- Make sure important users are not waiting
- Let the right apps get through and not break the
network
- Make efficient use of resources and have
flexibility to move them quickly to where they
are required
AQRM: Application Quality and Resource Management
An open management framework that all component
providers, applications or hardware, can plug
their own product into and work together with
all other components to deliver priority tasks
and adjust to meet critical workloads based on
policy from a business viewpoint.
4AQRM Work Group
- Work Group of the Open Group Enterprise
Management Forum
- Application Quality and Resource Management
- Actively working with DMTF WGs
- Application Management Work Group
- Policy Work Group
- Etc.
- Participating in CIM/WBEM Implementations
5Charter Overview - AQRM
- Understand the current state of the industry
regarding Application Quality Resource
Management
- Develop a framework for the accomplishment of
automated Application Quality Resource
Management
- Identify the standards and open tools that fill
the needs of the framework
- Identify gaps that need to be filled
- Provide solutions for the gaps or challenge other
organizations working in closely related fields
to fill the gaps
6A Perception of The Solution
- AQRM is about management, not monitoring
- AQRM is a value-add management function
- Build on existing management
- Do not duplicate what exists
- Depend on the existence of effective
instrumentation (networks, applications, systems)
- AQRM is an abstraction
- Requires management concepts that allow
abstracting information to the resource
management model
- Abstract from managed elements to resource view
- AQRM must be based on standards
- AQRM is only one component of the general
management solution (e.g. it must work with other
FCAPS management components).
7The Major AQRM Objectives
- To develop
- a set of architecture principles,
- profiles of standards, and
- appropriate standards for an adaptive approach to
measuring and controlling the Quality of
Experience and the Quality of Service of existing
and new applications across one or more
real-time enterprises.
- Includes
- Recognition/acceptance/endorsement of an adaptive
approach to managing distributed applications.
- Industry ratification of architecture principles
and profiles of standards for the above
objective.
- Identification of standards gaps and an action
plan to close gaps.
- Ensure management interoperability, across
management domains, and plug-and-play
integration, within resource domains.
8Applications and Resources
9The AQRM Functional Model
[Diagram: Resources, Monitor, Analysis / Decision, Control]
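The four boxes of the functional model can be read as a feedback loop around the resources. The following is a minimal Python sketch of one pass through such a loop, not something defined by the AQRM work itself. The `Resource` class, its `response_ms` and `workers` fields, the 200 ms target, and the "add one worker" rule are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Resource:
    """Hypothetical managed resource with one monitor and one control."""
    name: str
    response_ms: float  # monitor: observed response time
    workers: int        # control: number of worker threads


def monitor(resource: Resource) -> dict:
    """Monitor: read the current measurements from the resource."""
    return {"response_ms": resource.response_ms, "workers": resource.workers}


def analyze(sample: dict, target_ms: float) -> float:
    """Analysis: how far is the observed response time above the target?"""
    return sample["response_ms"] - target_ms


def decide(excess_ms: float, sample: dict) -> int:
    """Decision: pick a new worker count (illustrative rule only)."""
    if excess_ms > 0:
        return sample["workers"] + 1
    return sample["workers"]


def control(resource: Resource, workers: int) -> None:
    """Control: apply the decision back to the resource."""
    resource.workers = workers


def run_once(resource: Resource, target_ms: float = 200.0) -> None:
    """One pass: Monitor, then Analysis / Decision, then Control."""
    sample = monitor(resource)
    excess = analyze(sample, target_ms)
    control(resource, decide(excess, sample))


db = Resource(name="order-db", response_ms=350.0, workers=4)
run_once(db)
print(db.workers)  # 5
```

In a real deployment the monitor and control functions would sit on top of the common instrumentation rather than on an in-memory object.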
10Integrating Standard Management Functionality
into the Model
[Diagram: the functional model (Resources, Monitor,
Analysis / Decision, Control) extended with Behavioral
Control, Common Management Instrumentation
Infrastructure, SLAs, Policies, and AQRM Abstractions]
11A Management Approach For AQRM
Components
- Application Management
- Manages the application and interfaces to
underlying infrastructure
- Web Services
- Manages Web services, including interfaces to
network and database elements
- Database
- Manages database elements, including server and
storage interfaces, as well as elements required
for distributed operation
- Server Management
- Manages server system resources
- Storage Management
- Manages storage disk / SAN elements
- Network Management
- Manages the connections between all the systems
and networks (including storage, as in the case
of IP-SAN)
- Common Infrastructure
- Manages the common elements required to
facilitate disaggregated rack-based systems
(e.g., cabling, power, etc.)
[Diagram labels: Application Management, Web Services
Management, Database Management, Network Management,
Server Management, Storage Management, Common
Low-Level Elements]
12What is required to bring this together
- Monitoring
- A common instrumentation infrastructure
- Information on performance of objects of interest
to App management
- Application performance model
- A Service Measurement model
- Control
- Control of managed resources
- A state and behavior control model so that
behavior can be controlled, not simply executed.
- Analysis and Decision
- State model for managed resources
- Policy objectives models
- Mapping between objective level policies and
decision implementation
- Abstraction between decision and Monitor/Control
13The components of AQRM
- A lot of things have to come together for AQRM to
work
- Management of Networks, Systems, etc.
- Management of Applications
- Management of information flow
- Measurement of Service oriented metrics
- Service Level concepts (SLAs, SLOs)
- Policies
14AQRM Basic Decisions
- Use CIM as the modeling base
- CIM is the only standards-based management
instrumentation that meets AQRM requirements.
- Work with the CIM/WBEM groups to extend/complete
the functionality needed for AQRM
- SLAs
- Policy
- Application Management
- Behavior and State control
- Etc.
15Early conclusions
- This is not just an application problem.
- This is not just a network or OS problem.
- Typically just a few of the many monitor
variables are significantly indicative of service
performance.
- Typically just a few of the many control
variables available have a significant impact on
service performance.
16Measuring Service Time UOW and ARM
[Diagram: Clients, Network, and Servers running Apps,
annotated with the measurement models below]
- App response time measurement API: ARM
- Response time measurement: the Unit of Work
(UofW) model
- App component definition, performance and
resource measurement: App runtime model
17Modeling The Transaction - UOW
- Measure a time interval
- Identify the transaction
- Identify the application
- Provides information for correlation of multiple
measurements
- Provides information to understand component
UofWork (parent/child units of work)
- Provides places for metric information (resource
information, etc.)
- Marry with the instrumentation technology: ARM
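The ideas on this slide (time an interval, identify the transaction and application, correlate parent and child units of work, attach metrics) can be sketched in a few lines of Python. This is an illustration only and is not the ARM API or the CIM UoW model: the `unit_of_work` context manager, the `Measurement` record, the `RECORDS` list, and the application and transaction names are all hypothetical.

```python
import time
import uuid
from contextlib import contextmanager
from dataclasses import dataclass, field
from typing import Iterator, List, Optional


@dataclass
class Measurement:
    """One measured unit of work (hypothetical record layout)."""
    application: str
    transaction: str
    correlator: str            # identifies this unit of work
    parent: Optional[str]      # correlator of the parent, if any
    elapsed_s: float = 0.0
    metrics: dict = field(default_factory=dict)


RECORDS: List[Measurement] = []  # stands in for a measurement collector


@contextmanager
def unit_of_work(application: str, transaction: str,
                 parent: Optional[Measurement] = None) -> Iterator[Measurement]:
    """Time the enclosed block and record it with correlation data."""
    m = Measurement(application, transaction,
                    correlator=uuid.uuid4().hex,
                    parent=parent.correlator if parent else None)
    start = time.perf_counter()
    try:
        yield m
    finally:
        # Recorded even if the block raises, so aborted work is visible.
        m.elapsed_s = time.perf_counter() - start
        RECORDS.append(m)


with unit_of_work("web-shop", "checkout") as checkout:
    with unit_of_work("order-db", "update-balance", parent=checkout) as update:
        update.metrics["rows"] = 1  # resource metric attached to the child

for record in RECORDS:
    print(record.application, record.transaction, record.parent)
```

The child finishes first, so it is recorded first; its `parent` field carries the correlator of the enclosing checkout, which is what lets a collector rebuild the parent/child tree later.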
18Unit of Work
- Defines a type of work
- Represents a UOW that has started and may have
completed executing
- Associated to a UOW definition
- Provides information such as
- Response or elapsed time
- Status
- Active, Suspended, Completed (with status),
Aborted
- Metric Information about the UOW
- Examples
- Update account balance
- Execute batch
- Query Data server
- Execute subroutine
19Status
- UOW model
- Model Developed by DMTF Application Work Group
- Corresponds today to ARM 1, 2, 3
- Working on ARM 4 equivalent model
- ARM
- ARM API for C and Java today (Open Group Standard)
- Version 4 extends the model with more useful
metrics and correlation.
20Service Levels and Objectives
- Service Level Agreement
- Documented result of a negotiation between a
customer/consumer and a provider of a service,
that specifies the levels of availability,
serviceability, performance, operation or other
attributes of the service. (RFC 2475)
- Service Level Objective
- Partitions an SLA into individual metrics and
operational information to enforce and/or monitor
the SLA. It is a set of parameters and their
values. The actions of enforcing and reporting
monitored compliance can be implemented as one or
more policies.
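Since an SLO is described here as a set of parameters and their values, a compliance check can be sketched as a comparison of observed metrics against that set. The sketch below is an assumption-laden illustration, not a format defined by the DMTF or The Open Group: the `Objective` class, the metric names, and the limits are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Objective:
    """One SLO parameter and its required value (hypothetical layout)."""
    metric: str
    limit: float
    higher_is_better: bool

    def met_by(self, value: float) -> bool:
        if self.higher_is_better:
            return value >= self.limit
        return value <= self.limit


def violations(objectives: List[Objective],
               observed: Dict[str, float]) -> List[str]:
    """Return the metrics that miss their objective.

    A metric with no observation is reported as a violation, because
    an objective that cannot be measured cannot be shown to be met.
    """
    missed = []
    for objective in objectives:
        value = observed.get(objective.metric)
        if value is None or not objective.met_by(value):
            missed.append(objective.metric)
    return missed


slos = [
    Objective("availability_pct", 99.5, higher_is_better=True),
    Objective("response_ms_p95", 500.0, higher_is_better=False),
]
observed = {"availability_pct": 99.7, "response_ms_p95": 640.0}
print(violations(slos, observed))  # ['response_ms_p95']
```

The reporting half of the SLO is the returned list; the enforcing half would be a policy that reacts to it.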
21Intersections of SLAs and Management
- Formalize the concepts of SLAs and their metrics
- Standard and exchangeable definitions
- Monitor and programmatically enforce SLAs
- Decompose high-level business SLAs to
device/resource configurations and monitored
parameters.
- View the SLAs as goals
- Define identity information to support
per-customer billing and SLAs
- Within and across boundaries
- Define and enforce application-specific SLAs
22SLAs and Semantics
- SLAs require understanding of syntax and
semantics
- Describe what is managed and what happens when
problems occur (e.g. policies)
- SLAs cannot go from general to specific without a
general semantic framework
- What does it mean to understand a specific
application without understanding what an
application is?
23Service Level: Measuring the User Experience
- Service levels are a measure of the user
experience
- How much
- How fast
- How available
- How reactive to problems
- SLAs define what will keep the user happy
24Application Management and SLAs
- Typical runtime service level parameters
- User perspective on performance
- Interactive responsiveness
- Transaction Response time / Time to accomplish
- Throughput / How many simultaneous users or how
many things can be done in a defined time
- Batch turnaround
- Critical deadlines (e.g. end-of-month processing)
- Availability
- Percentage of time service is available
- Maximum limits on service-down times
- Other non-runtime SLA issues
- Recoverability
- Data Integrity
- Problem responsiveness
- Affordability
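The two availability parameters above (percentage of time available, and a maximum limit on service-down time) are simple to compute once outages are recorded. The following is a small sketch under stated assumptions: outages are given as non-overlapping (start, end) pairs in hours within the reporting period, and the function names and sample figures are hypothetical.

```python
from typing import List, Tuple

Outage = Tuple[float, float]  # (start_hour, end_hour), non-overlapping


def availability_pct(period_hours: float, outages: List[Outage]) -> float:
    """Percentage of the period during which the service was up."""
    down = sum(end - start for start, end in outages)
    return 100.0 * (period_hours - down) / period_hours


def longest_outage_hours(outages: List[Outage]) -> float:
    """Longest single service-down interval, 0.0 if there were none."""
    return max((end - start for start, end in outages), default=0.0)


# A 30-day month (720 hours) with two outages totalling 2 hours.
outages = [(10.0, 10.5), (300.0, 301.5)]
print(round(availability_pct(720.0, outages), 2))  # 99.72
print(longest_outage_hours(outages))               # 1.5
```

Note that the two parameters can disagree: a month at 99.72% may still break an SLA whose limit on any single outage is one hour.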
25Goals of Application Measurement
- Provide Monitors for
- Service Level management
- Need information and controls so that analysis
can be done and decisions made and implemented
- Business and Business Process management
- Provide Application Controls for
- Fault Determination
- Performance characteristic attribution
- Application monitor, management and manipulation
in terms of application components, aggregation
into whole to support SLOs
OR
- Monitor to provide information for SLA reporting
- Provide controls for SLA tuning
- Provide means to find why not meeting SLAs
It is not enough to know you have a problem if
you do not know why or how to solve the
problem. It is even less useful to have a
means for defining SLAs and SLOs but no means to
measure them on the system.
26What are Applications?
- Complex collections of software components
- Multilayered functionally
- E.g. Presentation, application, database, etc.
- Dynamically assembled
- GOALS
- Model the components as viewed in runtime
including the interactions
- Aggregate the information into the whole
- Disaggregate information from whole into the
components
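The aggregate and disaggregate goals can be illustrated with a component tree. This is a minimal sketch, not the DMTF application model: the `Component` class is hypothetical, and it assumes that component times are sequential and do not overlap, so that they can simply be summed.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Component:
    """A runtime component with its own measured time (hypothetical)."""
    name: str
    own_ms: float = 0.0
    children: List["Component"] = field(default_factory=list)

    def total_ms(self) -> float:
        """Aggregate: this component plus all of its sub-components."""
        return self.own_ms + sum(c.total_ms() for c in self.children)

    def breakdown(self, prefix: str = "") -> Dict[str, float]:
        """Disaggregate: map each component path to its own time."""
        path = prefix + self.name
        result = {path: self.own_ms}
        for child in self.children:
            result.update(child.breakdown(path + "/"))
        return result


app = Component("shop", 5.0, [
    Component("presentation", 20.0),
    Component("application", 35.0, [Component("database", 90.0)]),
])
print(app.total_ms())  # 150.0
for path, ms in app.breakdown().items():
    print(path, ms)
```

The same tree answers both questions: `total_ms` gives the figure an SLO cares about, and `breakdown` shows which layer contributed to it.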
27Application Runtime Management
- Model being developed by DMTF Application Work
Group
[Diagram: application life cycle. App status:
deployable, installable, executable, running. Initial
life cycle sub-model: transport, setup, installation,
runtime.]
Requirements: Architecture, Management,
Manageability, Meta
28Application Runtime Manageability Requirements
- Define logical runtime structure of complex
applications
- Define Application components/layers
- Support distributed and dynamic applications
- Relate physical structures and logical runtime
structures
- Model usage of system resources as viewed by the
application
- Model dataflow between components and
applications and between applications
- Relate Unit of Work information to runtime
structure
- Allow monitor and control of application state
- Support fault management
- Aggregate information from components to the whole
29Modeling FCAPS
- Fault
- Indications
- Error and status properties (counter,
information)
- Log-entries, traces, etc.
- Performance
- Base metrics (IO, timebound metrics, etc.)
- UoW
- Metric properties
- Statistics
- Configuration
- Persistent configuration information:
configuration, settings
- Control methods
- Current configuration object properties, support
classes, associations
30App Runtime Model Concepts (Simplified)
31Application Model Hierarchy
- In development today
- Application System submodel (CIM 2.8)
- Components of Function submodel (CIM 2.9 prelim)
- Data submodel, structure submodel planned for
future CIM versions (2.10, etc.)
32SLAs, SLOs and Policy
[Diagram: SLAs, SLOs and policy; labels in order of
appearance]
- Service Level Agreements: contractual, based on
business process and requirements
- Manual Translation
- Incorporates
- Service Level Policy Specification: common
expression and metrics
- Enforces
- Policy Refinement: decompose service into elements
- Element Policy (shown twice)
- Policy Management: create, update, maintain
(model, schema, access methods, error detection,
recovery, creation, deletion, modification,
security, etc.)
33Policy
- Why is Policy important
- Automation: encapsulating management tasks,
expressing a desired state or result to be
achieved.
- Policy can be used to express quality of service
parameters to meet required service levels.
- Model developed by DMTF Policy and Service Level
Work Group
- New work includes
- Generic conditions (query based)
- Generic actions
- Issues
- Aggregation and disaggregation
- Language as an alternative to models.
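The pairing of generic conditions with generic actions can be sketched as a list of rules evaluated against observed facts. This is an illustration only and does not reproduce the CIM policy model: the `PolicyRule` class, the rule names, the fact keys, and the thresholds are hypothetical, and actions here merely return a description instead of changing anything.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Facts = Dict[str, float]


@dataclass
class PolicyRule:
    """A condition over observed facts plus the action to take."""
    name: str
    condition: Callable[[Facts], bool]
    action: Callable[[Facts], str]


def evaluate(rules: List[PolicyRule], facts: Facts) -> List[str]:
    """Run the action of every rule whose condition holds."""
    return [rule.action(facts) for rule in rules if rule.condition(facts)]


rules = [
    PolicyRule(
        name="protect-gold-users",
        condition=lambda f: f["gold_response_ms"] > 300.0,
        action=lambda f: "raise priority of gold traffic",
    ),
    PolicyRule(
        name="shed-batch-load",
        condition=lambda f: f["cpu_pct"] > 90.0,
        action=lambda f: "suspend batch jobs",
    ),
]

facts = {"gold_response_ms": 420.0, "cpu_pct": 70.0}
print(evaluate(rules, facts))  # ['raise priority of gold traffic']
```

Expressing the conditions as queries against the model, as the new work proposes, would replace the lambdas with query strings evaluated by the instrumentation.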
34Behavior and State
- Objective
- Extend CIM to allow expressing behavior, behavior
control, and interobject behavior
- State and State Control
- Allow model to define states and to define
allowable state transitions
- Provide means for Actions on other components of
the model to be defined as part of state
transition (Actions)
- Definition of interobject Actions
- Possibly common with Policy
- Behavior and State (BH) work group in DMTF today.
In process of defining mechanisms for state
transition and actions definitions as extensions
to CIM model.
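A state model with allowable transitions and actions attached to transitions can be sketched as follows. This is a hypothetical illustration of the concept, not the mechanism the DMTF work group is defining: the `StateModel` class, the state names, and the logging action are assumptions.

```python
from typing import Callable, Dict, List, Tuple

Action = Callable[[str, str], None]  # called with (source, target)


class StateModel:
    """States, allowable transitions, and actions run on transition."""

    def __init__(self, initial: str, allowed: Dict[str, List[str]]) -> None:
        self.state = initial
        self._allowed = allowed
        self._actions: Dict[Tuple[str, str], List[Action]] = {}

    def on_transition(self, source: str, target: str, action: Action) -> None:
        """Attach an action to one specific transition."""
        self._actions.setdefault((source, target), []).append(action)

    def request(self, target: str) -> bool:
        """Move to target if allowed; return False and stay put if not."""
        if target not in self._allowed.get(self.state, []):
            return False
        source, self.state = self.state, target
        for action in self._actions.get((source, target), []):
            action(source, target)
        return True


log: List[str] = []
svc = StateModel("stopped", {
    "stopped": ["running"],
    "running": ["suspended", "stopped"],
    "suspended": ["running", "stopped"],
})
# Stands in for an action on another component of the model.
svc.on_transition("stopped", "running",
                  lambda s, t: log.append(f"notify dependents: {s} -> {t}"))

print(svc.request("suspended"))  # False: not allowed from "stopped"
print(svc.request("running"))    # True
print(log)                       # ['notify dependents: stopped -> running']
```

Rejecting a disallowed request, instead of executing it blindly, is the difference between controlling behavior and merely invoking a method.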
35The AQRM Functional Model
[Diagram: Resources, Monitor, Decision Process, Control]
36Growth of the Information model to a Management
model
Managed Services: from information model to
information and behavior model
[Diagram labels: Management Services; Managed
Services Model (tomorrow); Managed Services Model;
Manageability Objects (today); Manageability Model]
37The Object Model
Policy
This will be a key interface between the QOS
objects and the resource objects. Typically the
resource objects will be dynamically created and
deleted, and the QOS objects must support this.
QOS Management Model
Detailed Resource Management Objects (NOTE: this
is effectively the majority of the current CIM
model)
Resources
38Some Characteristics of the AQRM model
- AQRM Decision Making
- Based on abstractions that support the QOS
concepts and define policies that operate on the
controls based on the monitors.
- Policy Objects
- Scripts
- State management
- AQRM Objects
- Represent concepts like
- Capacity
- Throughput
- Latency
- Queue Length.
- In effect a network of queues
- Every step up through the model we are
abstracting the information required for QOS.
Note that the relationships are the way this is
accomplished.
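Viewing the AQRM objects as a network of queues ties capacity, throughput, latency, and queue length together. The sketch below uses Little's law (the mean number of requests in a stable stage equals throughput times mean time in the stage), which is standard queueing theory and is not taken from the slides. The `QueueStage` class, the stage names, and the rates are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class QueueStage:
    """One stage in a chain of queues (hypothetical abstraction)."""
    name: str
    arrival_rate: float  # requests per second offered to the stage
    service_rate: float  # requests per second it can complete (capacity)

    @property
    def throughput(self) -> float:
        """A stage cannot complete more than it receives or can serve."""
        return min(self.arrival_rate, self.service_rate)

    @property
    def utilization(self) -> float:
        return self.arrival_rate / self.service_rate

    def mean_requests_in_stage(self, mean_latency_s: float) -> float:
        """Little's law: L = throughput * mean time in the stage."""
        return self.throughput * mean_latency_s


def bottleneck(stages: List[QueueStage]) -> QueueStage:
    """The stage running closest to (or beyond) its capacity."""
    return max(stages, key=lambda s: s.utilization)


stages = [
    QueueStage("web", arrival_rate=80.0, service_rate=200.0),
    QueueStage("db", arrival_rate=80.0, service_rate=100.0),
]
print(bottleneck(stages).name)  # db
```

The point of the abstraction is that a decision process can reason about a "db" stage at 80% utilization without knowing which detailed resource objects lie underneath it.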
39The Model Components for AQRM
Objective Policy (SLO, etc)
Action Policy
Map Policy to Resource
Manageability Service Layer
Performance Resource Layer
Application Model
Current CIM Model(s)
SNMP
Resources
SNMP Resources
40Conclusions
- AQRM is based on CIM and will extend CIM models
- AQRM depends on the existence of widely used
common instrumentation implementations that
interoperate
41Working Together
OpenGroup AQRM WG
Application model
SLAs, Service Approach
DMTF Application WG
DMTF Utility WG
DMTF Policy WG
Other DMTF WGs, Oasis WSDM,
DMTF Behavior and State WG
Policy and SLA
42Next Steps
- Continue working with these groups to develop the
components needed by AQRM
- Define preliminary models for AQRM concepts to
allow us to demonstrate better what we are trying
to accomplish
Join Us
43AQRM: Application Quality / Resource
Management and The Open Group
44Questions?