How to live with low/intermittent bandwidth/connectivity

1
How to live with low/intermittent
bandwidth/connectivity
Krithi Ramamritham, IIT Bombay, krithi_at_cse.iitb.ernet.in
2
Web Content
  • Web sites have traditionally served static
    content
  • But, dynamic content generation has come into
    vogue
  • generated on the fly by running dynamic scripts,
    e.g., Active Server Pages (ASP), Java Server
    Pages (JSP), Servlets
  • allows generation of different content for the
    same request

3
Dynamic Web Pages
Web Page
A News content site
4
Generic Architecture
[Diagram: data sources (sensors, servers) reach end-hosts (wired hosts, mobile hosts) across the network]
5
Coherency of Dynamic Data
  • Strong coherency
  • The client and source are always in sync with each
    other
  • Strong coherency is expensive!
  • Relax strong coherency: Δ-coherency
  • Time domain: Δt-coherency
  • The client is never out of sync with the source
    by more than Δt time units
  • e.g., traffic data not stale by more than a minute
  • Value domain: Δv-coherency
  • The difference in the data values at the client
    and the source is bounded by Δv at all times
  • e.g., only interested in temperature changes larger
    than 1 degree
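
The two relaxed coherency bounds can be written as simple predicates. This is a minimal sketch, not code from the presentation; the function names and the parameters `delta_t` and `delta_v` are assumptions chosen to mirror the time-domain and value-domain bounds described above.

```python
def within_time_coherency(last_sync: float, now: float, delta_t: float) -> bool:
    """True while the client copy is at most delta_t time units old."""
    return (now - last_sync) <= delta_t


def within_value_coherency(client_value: float, source_value: float,
                           delta_v: float) -> bool:
    """True while client and source values differ by at most delta_v."""
    return abs(source_value - client_value) <= delta_v


# Traffic data synced 45 s ago with a 60 s bound: still coherent.
traffic_ok = within_time_coherency(last_sync=0.0, now=45.0, delta_t=60.0)

# Client shows 21.0 degrees, source reads 22.5, bound is 1 degree: refresh needed.
temperature_ok = within_value_coherency(21.0, 22.5, delta_v=1.0)
```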

6
Generic Architecture
[Diagram: data sources (sensors, servers) reach end-hosts (wired host, mobile host) across the network through proxies/caches]
7
The Push Approach
  • Proxy registers the data item of interest and the
    coherency requirement with the server
  • Server pushes interesting changes
  • Achieves Strong Consistency
  • Keeps network overhead to a minimum
  • -- Poor Scalability (has to maintain state and
    has to keep connections open)
  • -- Low Resiliency
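
A rough sketch of the push approach, assuming a value-domain bound per proxy. The class `PushServer` and its methods are hypothetical names for illustration. The per-proxy dictionary is the state the server has to maintain, which is the scalability cost listed above.

```python
from collections import defaultdict


class PushServer:
    """Holds per-proxy state: the registered bound and the last value pushed."""

    def __init__(self):
        # item -> {proxy_id: [delta_v, last_pushed_value]}
        self.subscriptions = defaultdict(dict)

    def register(self, proxy_id, item, delta_v, current_value):
        self.subscriptions[item][proxy_id] = [delta_v, current_value]

    def update(self, item, new_value):
        """Apply a source update; return the proxies that must be notified."""
        notified = []
        for proxy_id, state in self.subscriptions[item].items():
            delta_v, last_pushed = state
            if abs(new_value - last_pushed) > delta_v:
                state[1] = new_value  # remember what this proxy now holds
                notified.append(proxy_id)
        return notified


server = PushServer()
server.register("proxy-a", "temperature", delta_v=1.0, current_value=20.0)
server.register("proxy-b", "temperature", delta_v=0.2, current_value=20.0)
small_change = server.update("temperature", 20.5)  # only proxy-b is interested
large_change = server.update("temperature", 21.5)  # both bounds are exceeded
```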

8
The Pull Approach
  • Proxy pulls when the Time to Live (TTL) or
    Time To next Refresh (TTR / TNR) expires
  • Can be implemented using the HTTP protocol
  • Stateless and hence is generally scalable with
    respect to state space and computation
  • Weak cache consistency
  • Heavy polling for stringent coherence requirement
    or highly dynamic data
  • Network overheads higher than for Push
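
The pull approach can be sketched as a proxy that re-polls once a fixed TTR has elapsed. This is an illustrative sketch only: `PullProxy`, the `fetch` callable (standing in for an HTTP GET), and the fixed 60-second TTR are assumptions. Time is passed in explicitly so the behaviour is easy to follow.

```python
class PullProxy:
    """Caches one value per key and re-polls the source once the TTR expires."""

    def __init__(self, fetch, ttr):
        self.fetch = fetch  # callable(key) -> value; stands in for an HTTP GET
        self.ttr = ttr      # seconds between refreshes
        self.cache = {}     # key -> (value, time_fetched)
        self.polls = 0

    def get(self, key, now):
        entry = self.cache.get(key)
        if entry is None or now - entry[1] >= self.ttr:
            self.polls += 1
            entry = (self.fetch(key), now)
            self.cache[key] = entry
        return entry[0]


source = {"price": 100}
proxy = PullProxy(fetch=source.__getitem__, ttr=60)

first = proxy.get("price", now=0)   # miss: polls the source
source["price"] = 105               # the source changes
stale = proxy.get("price", now=30)  # within TTR: serves the cached value
fresh = proxy.get("price", now=60)  # TTR expired: polls again
```

The stale read at `now=30` is the weak consistency noted above; tightening the TTR reduces staleness at the cost of more polls.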

9
Typical End-to-end Web Site Architecture
[Diagram: Web Server Cluster, Application Server Cluster, Data]
10
WS vs. AS
  • Web servers
  • Do well-defined and quantifiable local work
  • e.g., processing HTTP headers, serving static
    content
  • Application servers
  • Run multi-layer programs
  • e.g., scripts involving calls to backends

11
Inside the Application Layer: 3-tier model
PRESENTATION (HTML)
  • JSP
  • ASP

BUSINESS LOGIC (Objects)
  • Servlets
  • COM
  • EJB

ADDTL SERVICES
  • Commerce
  • Content Mgt.
  • Personalization

DATA CONNECTOR (Row Set)
  • JDBC
  • ODBC

Legacy Systems
Databases
12
Inside the Application Layer
Code Block(s)
PRESENTATION
. . .
ADDTL SERVICES
Code Block(s)
BUSINESS LOGIC
. . .
  • Commerce
  • Content Mgt.
  • Personalization

DATA CONNECTOR
  • JDBC
  • ODBC

4. DBMS calls storage system
Legacy Systems
Databases
13
Performance and Scalability Issues
  • Computationally-intensive logic executed
    at multiple tiers
  • Cross-tier communication
  • Object instantiation and cleanup processing
  • External I/O calls
  • Database connection pool latencies
  • Content conversion and formatting

14
Optimizing the Application Layer: Traditional Means
  • Optimize each tier independently
  • Presentation-level caches built inside
    application server processes
  • Main memory database employed over persistent
    DBMS
  • Persistent object storage techniques employed
    inside content management systems and so on

Local cache and optimization code
15
Query result caching
  • Many application server products offer this
    feature
  • -- mitigates only local database access latency
  • -- only a subset of query results may be reused
    in page generation
  • -- page fragments may not all be from databases

16
Middle tier database caching
  • Caching database tables in main memory
  • Oracle 9i Cache
  • Main-memory databases, e.g., TimesTen
  • -- mitigates only database access latency
  • -- caching at table granularity results in poor
    cache utilization
  • -- main-memory databases are difficult to
    integrate and maintain and can be expensive

17
Page Level Caching
  • Dynamically generated HTML pages are cached
  • Can completely offload work from web/app
    server
  • Low reusability for highly personalized web pages
  • URL may not uniquely identify a page
  • -- increasing the risk of delivering
    incorrect pages
  • Often introduces excessive invalidations
  • -- e.g., even if a single element on the
    page changes
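
The risk that a URL does not uniquely identify a page can be shown with a toy page cache. This is a sketch under assumptions: `render_page` is a hypothetical stand-in for a page generation script, and the cache is a plain dictionary.

```python
def render_page(url, user):
    """Stand-in for a page generation script with a personalized greeting."""
    return f"<h1>{url}</h1><p>Hello, {user}</p>"


def cached_render(cache, key, url, user):
    """Return the cached page for key, generating it only on a miss."""
    if key not in cache:
        cache[key] = render_page(url, user)
    return cache[key]


# Keyed by URL alone: the second user receives the first user's page.
by_url = {}
cached_render(by_url, "/home", "/home", "alice")
wrong_page = cached_render(by_url, "/home", "/home", "bob")

# Keyed by (URL, user): correct, but each entry is reusable by one user only.
by_url_and_user = {}
cached_render(by_url_and_user, ("/home", "alice"), "/home", "alice")
right_page = cached_render(by_url_and_user, ("/home", "bob"), "/home", "bob")
```

Adding the user to the key removes the incorrect delivery but gives the low reusability listed above for personalized pages.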

18
Optimizing the Application Layer: Issues
  • Traditional techniques impact specific components
    within the application, but not the entire
    application
  • No mitigation of component-to-component
    interaction latencies
  • Different synchronization and invalidation
    policies risk data integrity
  • Each optimization scheme consumes programmer
    time for development and maintenance

19
Key ideas
  • Re-use program results to eliminate redundant
    work
  • Facilitate single-point, architecture-wide
    optimization
  • Apply to both programmatic objects and result
    fragments

20
Optimizing the Application Layer
PRESENTATION
  • JSP
  • ASP

BUSINESS LOGIC
  • Servlets
  • COM
  • EJB

ADDTL SERVICES
  • Commerce
  • Content Mgt.
  • Personalization

DATA CONNECTOR
  • JDBC
  • ODBC

Legacy Systems
Databases

Enables the results of programs to be re-used.
21
Usually.
Legacy Systems
Plus, at each step there are communication delays
and logic processing delays
22
Novel Solution
The storage engine can hold any program output; most
commonly this is an HTML fragment or a programmatic
object.
Appl. Programming Interface
Chutney tags
Real-time storage engine
Code Block(s)
PRESENTATION
. . .
Function
Parameter(s)
Result
Code Block(s)
BUSINESS LOGIC
. . .
Tags trigger calls to the storage engine.
When the Result of a Function with a
specific Parameter set is already known (and
up-to-date), the work normally necessary to
produce that Result is bypassed.
DATA CONNECTOR
  • JDBC
  • ODBC
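
The Function / Parameter(s) / Result idea resembles memoization keyed by function name and parameter set. The sketch below is not the Chutney API, which the slides do not show; `ResultStore`, `get_or_compute`, and `headline_fragment` are hypothetical names used only to illustrate bypassing work when a result is already known.

```python
class ResultStore:
    """Maps (function name, parameters) to a previously computed result."""

    def __init__(self):
        self.results = {}
        self.hits = 0

    def get_or_compute(self, name, params, compute):
        key = (name, params)
        if key in self.results:
            self.hits += 1         # result known: the work is bypassed
            return self.results[key]
        result = compute(*params)  # normal path: run the code block
        self.results[key] = result
        return result

    def invalidate(self, name, params):
        """Forget a result so that the next request recomputes it."""
        self.results.pop((name, params), None)


def headline_fragment(section, count):
    """Hypothetical code block; imagine database calls and HTML formatting."""
    return f"<ul class='{section}'>" + "<li>story</li>" * count + "</ul>"


store = ResultStore()
first = store.get_or_compute("headline", ("sports", 3), headline_fragment)
second = store.get_or_compute("headline", ("sports", 3), headline_fragment)
```

The "up-to-date" condition is represented here only by explicit `invalidate` calls; a fuller version would combine this with the invalidation techniques listed later in the deck.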

23
Code Blocks Perform Work
Page generation script
Write to Out
Write to Out
. . .
. . .
24
Code Blocks <-> Components
Page generation script
Web Page
Ad Component
Write to Out
Headline Component
Headline Component
Navigation Component
Headline Component
Headline Component
Write to Out
. . .
Personalized Component
(Example News content site)
Certain components can be cached
25
DCA: Our Solution
Page generation script
Code block
Request
Dynamic Content Accelerator
Code Block Output
Application logic
Code block
Work bypassed
Database calls
HTML formatting
. . .
26
DCA in a Typical End-to-end Web Site Architecture
  • A single instance of the DCA serves a rack of
    application servers
  • Application servers communicate with DCA through
    a lightweight API

Application Server Cluster
Web Server Cluster
Data
Dynamic Content Accelerator
27
Cache Management
  • A critical aspect of any caching solution
  • DCA supports novel cache management strategies
  • Prediction-based cache replacement
  • Observation-based cache invalidation

28
Cache Replacement
  • Prediction-based replacement
  • fragments having lowest probability of access
    replaced
  • Least-Likely-to-be-Used (LLU)
  • Access probabilities based on
  • Current user navigational patterns over site
    graph
  • (in the form of clickstreams)
  • Historical user navigational patterns over site
    graph
  • (in the form of association rules)

(News, Sports, Hockey) → Schedules: 20%
(News, Sports, Hockey) → Players: 15%
(News, Sports, Hockey) → Teams: 10% (LLU)
(News, Sports, Hockey) → Scores: 55%
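
Least-Likely-to-be-Used replacement reduces to picking the cached fragment with the lowest predicted access likelihood. The sketch below takes the figures from the hockey example as relative likelihoods; how the DCA derives them from clickstreams and association rules is not shown in the slides, so the `predicted` dictionary is simply assumed as input.

```python
def least_likely_to_be_used(access_likelihood):
    """Return the cached fragment with the lowest predicted access likelihood."""
    return min(access_likelihood, key=access_likelihood.get)


# Predicted next-page likelihoods for a user at (News, Sports, Hockey).
predicted = {"Schedules": 20, "Players": 15, "Teams": 10, "Scores": 55}

victim = least_likely_to_be_used(predicted)  # "Teams"
```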
29
Cache Invalidation
  • DCA supports common cache invalidation
    techniques
  • Time-based: each cache element is assigned a TTL
  • Event-based: updates to the database send an
    invalidation message to the cache
  • On demand: manual invalidation of selected
    elements
  • DCA supports additional invalidation techniques.

30
Cache Invalidation
  • Other invalidation techniques supported
  • Observation-based
  • User-initiated updates are observed in scripts;
    each such update sends an invalidation message to
    the cache
  • Most appropriate for auction sites, online
    trading sites
  • Invalidation does not require communication with
    the databases
  • Keyword-based
  • Elements can be associated with keywords, e.g.,
    a retailer may wish to invalidate all
    seasonal items
  • Regular expression-based
  • Elements can be invalidated based on regular
    expression matching
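
Keyword-based and regular-expression-based invalidation might look like the following. This is a sketch, not the DCA interface: `TaggedCache` and its methods are hypothetical, and the assumption that cache keys are URL-like strings is made only so the regular expression has something to match.

```python
import re


class TaggedCache:
    """Cache whose elements can carry keywords for group invalidation."""

    def __init__(self):
        self.entries = {}   # key -> cached element
        self.keywords = {}  # key -> set of keywords

    def put(self, key, element, keywords=()):
        self.entries[key] = element
        self.keywords[key] = set(keywords)

    def _drop(self, keys):
        for key in keys:
            del self.entries[key]
            del self.keywords[key]
        return len(keys)

    def invalidate_keyword(self, keyword):
        """Remove every element tagged with keyword; return how many."""
        return self._drop([k for k, kws in self.keywords.items() if keyword in kws])

    def invalidate_regex(self, pattern):
        """Remove every element whose key matches pattern; return how many."""
        regex = re.compile(pattern)
        return self._drop([k for k in self.entries if regex.search(k)])


cache = TaggedCache()
cache.put("/catalog/sweaters", "<div>...</div>", keywords={"seasonal"})
cache.put("/catalog/sandals", "<div>...</div>", keywords={"seasonal"})
cache.put("/catalog/socks", "<div>...</div>")
cache.put("/news/today", "<div>...</div>")

seasonal_removed = cache.invalidate_keyword("seasonal")  # 2
catalog_removed = cache.invalidate_regex(r"^/catalog/")  # 1
```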

31
Performance Study
  • Test Site
  • Fictitious online retail site, allows browsing of
    product catalog
  • Pages generated using JSP scripts
  • Site content stored in Oracle database
  • Database schema based on Dublin Core Metadata
    Open Standard
  • Contains 200,000 products and 44,000 categories
  • Each page consists of 3 components, each
    involving a database call

32
Performance Study
  • Test Setup
  • Content Database Server
  • Oracle 8.1.6
  • Web/Application Server
  • WebLogic 6.0 running on cluster of 2 machines
  • Server machines
  • have 1 GB RAM, dual Pentium III 933 MHz processors
  • run Windows 2000 Advanced Server

33
Testing Methodology...
  • Baseline Parameters
  • Cache size, i.e., percentage of fragments that
    fit into cache: 75%
  • Cache replacement policy: LLU
  • User load is varied by sending requests from
    client machines running RadView's WebLOAD
  • Simulated users navigate site according to Zipf
    80-20 distribution (i.e., 80% of users follow 20%
    of navigation links)
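
A simulated user following an 80-20 navigation pattern could be approximated as below. This is a two-bucket simplification rather than a true Zipf distribution, and it is not the WebLOAD configuration used in the study; `pick_link`, the link list, and the seed are assumptions for illustration.

```python
import random


def pick_link(links, rng, hot_fraction=0.2, hot_probability=0.8):
    """Pick a link so the first 20% of links receive about 80% of clicks."""
    hot_count = max(1, int(len(links) * hot_fraction))
    hot = links[:hot_count]
    cold = links[hot_count:] or hot  # fall back if there are no cold links
    if rng.random() < hot_probability:
        return rng.choice(hot)
    return rng.choice(cold)


rng = random.Random(42)  # fixed seed so repeated runs give the same clicks
links = [f"/category/{i}" for i in range(10)]
hot_links = set(links[:2])

clicks = [pick_link(links, rng) for _ in range(10_000)]
hot_share = sum(click in hot_links for click in clicks) / len(clicks)
```

Over many draws `hot_share` settles near 0.8, which is the skew that makes a cache holding a subset of fragments effective.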

34
Performance Impact
80% faster response times through existing
application infrastructure
Source: Fortune 100 client results
35
Chutney Throughput Impact
250% increase in transaction rates
Source: Fortune 100 client results
36
Alternative: CDNs
  • e.g., Akamai

Content Distribution Networks
Push-based core infrastructure
37
Conclusion
  • Increased use of dynamic page generation
    technologies
  • => increases load on application servers
  • => serious performance and scalability
    problems for e-business sites
  • DCA (Dynamic Content Acceleration)
  • => significantly reduces the load on the
    server-side infrastructure, allowing e-business
    sites to scale
  • => significantly outperforms existing middle-tier
    caching solutions

38
IIT Bombay's aAQUA Community Forum
  • Farmers get information and get their questions
    answered, in the local context and in their
    local language
  • Capitalizes on existing human and infrastructural
    resources
  • Agri-extension center: KVK, Baramati
  • NGO: Vigyan Ashram, Pabal
  • Government: MCIT

www.aAQUA.org
39
Access over low bandwidth: Resource Optimization
  • Resource constraints: low/unpredictable bandwidth
    => disconnected operation/access
  • Exploit caching, prefetching (through prediction
    of future needs)
  • Profiling by user type, location => offline aAQUA
  • Data characteristics
  • Static data (text, images): land records, photos
    can be cached/hoarded
  • Dynamic data (weather/price information): cached
    info needs to be refreshed carefully
  • Continuous media (VoIP, video data): QoS
    considerations