Title: PI: Katerina Goseva
1 Performability of Web-based Applications
- PI: Katerina Goseva Popstojanova
- Students: Ajay Deep Singh, Sunil Mazimdar
- Lane Dept. of Computer Science and Electrical Engineering
- West Virginia University, Morgantown, WV
- katerina_at_csee.wvu.edu
2 Problem
- The World Wide Web is the largest distributed system in existence
  - Huge number of Web clients: tens of millions and rising
  - Users demand 24/7 availability and response times within several seconds
  - However, they very often experience long and unpredictable delays
- Problem: traditional analysis and prediction methods do not work for the Web
3 Relevance to NASA
- Increasing use of Web-based technology at NASA
  - Agency-wide Web sites
  - Control of daily mission operations from multiple geographically distributed locations via the Internet (e.g., Web Interface for Telescience at JPL)
  - Real-time applications remotely controlled or monitored over the Internet or an intranet (e.g., Tempest embedded Web server at Glenn Research Center)
4 Relevance to NASA
- Our empirical analysis is based on data extracted from the actual Web logs of ten servers
  - Three public and three private Web servers at the NASA IV&V Facility
  - Lane Department of Computer Science and Electrical Engineering (CSEE) Web server
  - NASA Kennedy Space Center (NASA-KSC) Web server
  - Campus-wide Web server at the University of Saskatchewan
  - Web server of the commercial Internet provider ClarkNet
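As an illustration of the log-extraction step, the following minimal sketch parses one access log entry. It assumes the entries follow the NCSA Common Log Format; the slides do not state the format of the ten servers' logs, and the names `CLF_PATTERN` and `parse_clf_line` as well as the sample entry are made up for this example.

```python
import re
from datetime import datetime

# Assumption: entries follow the NCSA Common Log Format; the slides do not
# state the log format of the ten servers.
CLF_PATTERN = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+)'
)

def parse_clf_line(line):
    """Return a dict for one access log entry, or None if it does not match."""
    match = CLF_PATTERN.match(line)
    if match is None:
        return None
    return {
        "host": match["host"],
        "time": datetime.strptime(match["time"], "%d/%b/%Y:%H:%M:%S %z"),
        "request": match["request"],
        "status": int(match["status"]),
        # A size of "-" means that no response body was returned.
        "bytes": 0 if match["size"] == "-" else int(match["size"]),
    }

# Made-up sample entry, for illustration only.
sample = '192.0.2.1 - - [10/Oct/2000:13:55:36 -0700] "GET /index.html HTTP/1.0" 200 2326'
entry = parse_clf_line(sample)
print(entry["host"], entry["status"], entry["bytes"])
```

Lines that do not match the pattern return None, so malformed entries can be counted and set aside rather than silently dropped.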
5 Approach
- Develop methods and tools that are general and powerful enough to provide flexible analysis and quality assurance of Web reliability, availability, and performance
- Develop a scalable framework that combines measurements and models at different levels of detail and abstraction
  - Reliability/availability: based on typical usage patterns
  - Performance: non-Poisson queuing theory
- Combine reliability/availability and performance, and analyze their tradeoffs
6 Approach
[Figure: layered framework combining measurements and models]
- Layers: session layer (user view); service layer (software architectural view); system layer (deployment view); resource layer (hardware device view)
- Models: performance model; performability model; reliability/availability model
- Measurements and analyses: Web access log analysis; user session characterization; realistic workload; software/hardware resource utilization; application hardware resource monitoring; Web error log analysis; request-based and session-based error characterization; software/hardware failure/recovery characterization
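User session characterization requires grouping individual requests into sessions. The sketch below shows one common way to do this, splitting each client's requests wherever the gap between consecutive requests exceeds an inactivity timeout. The 30-minute threshold, the `(client, timestamp, status)` tuple layout, and the function name `sessionize` are assumptions for illustration; the slides do not specify how sessions were identified.

```python
from collections import defaultdict
from datetime import datetime, timedelta

# Assumed inactivity threshold; the slides do not state the value used.
SESSION_TIMEOUT = timedelta(minutes=30)

def sessionize(requests, timeout=SESSION_TIMEOUT):
    """Group (client, timestamp, status) tuples into per-client sessions.

    A new session starts whenever the gap between two consecutive
    requests from the same client exceeds `timeout`.
    """
    by_client = defaultdict(list)
    for client, ts, status in requests:
        by_client[client].append((ts, status))

    sessions = []
    for client, hits in by_client.items():
        hits.sort(key=lambda h: h[0])
        current = [hits[0]]
        for prev, hit in zip(hits, hits[1:]):
            if hit[0] - prev[0] > timeout:
                sessions.append((client, current))
                current = []
            current.append(hit)
        sessions.append((client, current))
    return sessions

# Made-up requests: client "a" returns after a two-hour pause.
t0 = datetime(2004, 1, 1, 12, 0, 0)
requests = [
    ("a", t0, 200),
    ("a", t0 + timedelta(minutes=10), 404),
    ("a", t0 + timedelta(hours=2), 200),
    ("b", t0, 200),
]
for client, hits in sessionize(requests):
    print(client, len(hits))
```

Identifying clients by host alone is itself a simplification, since proxies can hide many users behind one address.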
7 Accomplishments
Create relational database
8 Accomplishments
- Empirical analysis of the Web workload, errors, and request-based and session-based reliability for ten Web servers
- Fixing the errors with the highest frequency of occurrence is the most cost-effective way to improve Web quality
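The point about fixing the most frequent errors first can be illustrated by ranking errors by count and tracking the cumulative share of all error occurrences they cover. This is a minimal sketch with made-up data; the `(status, path)` representation of an error and the function name `rank_errors` are assumptions, not the analysis actually performed.

```python
from collections import Counter

def rank_errors(error_events):
    """Return (error, count, cumulative share) sorted by descending count."""
    counts = Counter(error_events)
    total = sum(counts.values())
    ranked, running = [], 0
    for error, count in counts.most_common():
        running += count
        ranked.append((error, count, running / total))
    return ranked

# Made-up error events: (HTTP status, requested file).
events = (
    [(404, "/images/logo.gif")] * 6
    + [(404, "/old/index.html")] * 3
    + [(500, "/cgi-bin/search")] * 1
)
for error, count, share in rank_errors(events):
    print(error, count, f"{share:.0%}")
```

In this toy data, repairing the single most frequent error removes 60% of all error occurrences, which is the kind of concentration that makes frequency-ordered fixing cost-effective.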
9 Accomplishments
- We argue that session-based reliability is a better indicator of the users' perception of Web quality than request-based reliability
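The difference between the two measures can be seen in a small worked example. The sketch below assumes that request-based reliability is the fraction of requests that succeed, that session-based reliability is the fraction of sessions containing no failed request, and that HTTP 4xx and 5xx status codes count as failures. These definitions and the data are illustrative assumptions; the slides do not give the exact formulas.

```python
def is_error(status):
    # Assumption: 4xx and 5xx HTTP status codes count as failures.
    return status >= 400

def request_reliability(sessions):
    """Fraction of all requests that completed without an error."""
    statuses = [s for session in sessions for s in session]
    return 1 - sum(map(is_error, statuses)) / len(statuses)

def session_reliability(sessions):
    """Fraction of sessions in which no request failed."""
    failed = sum(any(is_error(s) for s in session) for session in sessions)
    return 1 - failed / len(sessions)

# Made-up sessions, each a list of HTTP status codes.
sessions = [
    [200, 200, 404, 200],
    [200, 200],
    [200, 500],
    [200, 200, 200, 200],
]
print(round(request_reliability(sessions), 3))  # 10 of 12 requests succeed
print(session_reliability(sessions))            # 2 of 4 sessions are error-free
```

Here only 2 of 12 requests fail, yet half of the sessions are affected, which is why a session-level measure can be closer to what users actually experience.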
10 Importance/benefits
- Innovative theoretical and empirical research results
  - Introduced and empirically analyzed new measures for session-based workload and reliability
  - Conducted a detailed empirical study of Web errors, including the severity level of errors, unique errors, and unique files with errors: measures that have not been considered earlier
- Practical value
  - The results of our research were used by the Web administrators of the NASA IV&V and CSEE Web servers to improve their quality in a cost-effective way
11 Next steps
- Performance attributes
  - Develop non-Poisson queuing theory
  - Standard performance models (Queuing Networks, Layered Queuing Networks) assume Poisson arrivals
  - Web workload is bursty (highly non-Poisson)
- Dependability attributes (reliability, availability)
  - Develop architecture-based models based on typical usage patterns
- Combine performance and reliability/availability models
  - Analyze tradeoffs among multiple quality attributes
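One simple way to check whether an arrival stream departs from the Poisson assumption is the index of dispersion for counts: the variance of the number of arrivals per interval divided by its mean, which equals 1 for a Poisson process. The sketch below compares a simulated steady stream with a toy on/off stream of the same average rate. The simulation, the rates, and the function names are assumptions for illustration only and are not drawn from the measured workloads.

```python
import random
from statistics import mean, pvariance

def counts_per_interval(rng, rates):
    """Count arrivals in consecutive unit-length intervals.

    rates[i] is the arrival rate in interval i; inter-arrival times within
    an interval are exponential, and a rate of 0 means no arrivals.
    """
    counts = []
    for rate in rates:
        n = 0
        if rate > 0:
            t = rng.expovariate(rate)
            while t < 1.0:
                n += 1
                t += rng.expovariate(rate)
        counts.append(n)
    return counts

def index_of_dispersion(counts):
    """Variance-to-mean ratio of the counts; 1 for a Poisson process."""
    return pvariance(counts) / mean(counts)

rng = random.Random(42)
# Steady stream: constant rate of 5 arrivals per interval.
steady = counts_per_interval(rng, [5.0] * 2000)
# Toy bursty stream: 10 intervals at rate 10, then 10 silent intervals.
bursty = counts_per_interval(rng, ([10.0] * 10 + [0.0] * 10) * 100)

print(index_of_dispersion(steady))  # expected to be close to 1
print(index_of_dispersion(bursty))  # expected to be well above 1
```

Both streams average about 5 arrivals per interval, so a model fitted only to the mean rate would treat them as identical even though their queuing behavior differs substantially.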