Project Overview - PowerPoint PPT Presentation

About This Presentation
Title:

Project Overview

Description:

University of Mining and Metallurgy, Cracow, Poland. Cracow Grid Workshop, Nov. 5 ... Carries information about user or programmer wishes. Expert system processes ...

Slides: 36
Provided by: rafalw

Transcript and Presenter's Notes


1
(No Transcript)
2
Optimisation of Data Access in Grid Environment
  • Darin Nikolow (1), Renata Slota (1)
  • Lukasz Dutka (1), Jacek Kitowski (1, 2)
  • Piotr Nyczyk (1), Mariusz Dziewierz (1)
  • (1) Institute of Computer Science - AGH
  • (2) Academic Computer Centre CYFRONET - AGH
  • University of Mining and Metallurgy, Cracow,
    Poland

CrossGrid Project - Task 3.4
Cracow Grid Workshop, Nov. 5-6, 2001
3
Outline
  • Background
  • Bottom-top approach
  • Media management software
  • middleware for existing HSM
  • dedicated VTSS
  • Local component-expert systems
  • Global policy for migration/replication

FOR MORE INFO...
http://www.icsr.agh.edu.pl/
4
Motivation
  • Large and growing volumes of data
  • Multimedia database systems (applications -
    medical, educational, virtual reality, virtual
    laboratories, digital libraries, advanced
    simulations, ...)
  • Solution: Tertiary Storage Systems (TSS): media
    libraries and management software
  • Examples of existing TSS
  • HPSS, DataCutter, APRIL, Condor, OmniStore,
    UniTree, ...
  • Possible directions
  • Data access time estimation system - efficient
    usage
  • Data distribution and grid implementation - large
    scale experiments
  • Expert system for data management
  • Replication policies

5
Background
  • PARMED Project (Univ. of Klagenfurt and Univ. of
    Mining and Metallurgy, Cracow)
  • to support physicians with telematic services
    for:
  • long-distance collaboration of medical centers,
  • medical tele-education,
  • case archives

6
Bottom-top approach - Major Components
  • Assumptions
  • mechanism neutrality
  • policy neutrality
  • compatibility with grid infrastructure
  • uniformity of information infrastructure

[Diagram labels: Replica Selection; Replica Management;
Resource Management; Storage System; Metadata Repository;
HSM (UniTree, Castor, HPSS); LDAP ...]
7
Media Management Software
  • Nikolow, D., Slota, R., Kitowski, J., Nyczyk, P.,
    Otfinowski, J., "Tertiary Storage System for
    Index-Based Retrieving of Video Sequences",
    Proc. Int. Conf. HPCN, Amsterdam, June 25-27,
    2001, Lect. Notes in Comp. Sci. 2110, pp. 62-71,
    Springer, 2001.
  • Nikolow, D., Slota, R., Kitowski, J.,
    "Benchmarking Tertiary Storage Systems with File
    Fragmentation", PPAM Conf., Naleczów, Lect. Notes
    in Comp. Sci., accepted.

8
Media Management Software and its usage in CrossGrid
  • Darin Nikolow
  • darin_at_uci.agh.edu.pl

9
Motivation
  • Main purpose of the developed TSS: efficient
    index-based retrieval of video fragments
    (instead of file fragments)
  • specific requirements for frequent data reading:
  • startup latency
  • transfer time
  • minimal transfer rate > video bitrate
  • Two prototypes proposed and benchmarked:
  • middleware layer for existing HSM
  • dedicated TSS
  • The developed systems are of general use ->
    possible grid implementations

10
Multimedia Storage and Retrieval System (MMSRS)
  • Requirements
  • use existing software (UniTree HSM)
  • reduce latency (start-up delay), i.e. reduce
    file granularity
  • file fragmentation (subfiles)
  • Implementation
  • splitting files into pieces of similar size
  • Middleware layer on HSM
  • Consists of
  • Automated Media Library
  • UniTree HSM managing system
  • MPEG extension for HSM (MEH)
  • MEH receives the name of video file and the frame
    range - start/end frames
  • output stream via HTTP

11
Video Tertiary Storage System (VTSS)
  • Repository Daemon (REPD)
  • keeps repository information
  • Tertiary File Manager Daemon (TFMD)
  • manages filedb (tape ID and start position of
    each fragment) and tapedb (information about tape
    usage)
  • Dedicated TSS
  • Client requests to VTSS can be of the following
    kinds:
  • write a new file to VTSS, read a file fragment
    from VTSS, delete a file from VTSS.
  • The fragment range is defined in frame units
  • Two daemons implemented in C using Unix sockets

12
MMSRS and VTSS performance
  • Hardware (AML Quantum ATL)
  • ATL 4/52 (DLT 2000)
  • ATL 7100 (DLT 7000)
  • HP D-class server (with UniTree HSM)
  • Data
  • 790 MB MPEG1 file with approx. 0.4 MB/s bitrate (33
    min.)
  • subfile for MMSRS - 16 MB (8, 16, 32 MB tested)
  • as short as possible while keeping playback smooth
    (low latency)
  • optimal subfile length depends on
  • positioning time
  • drive transfer rate
  • bitrate of the video file

13
Benchmarks
  • Startup latency - time elapsed from issuing the
    request to receiving the first byte
  • Transfer time - time from receiving the first
    byte till the end of transmission
  • Minimal rate - minimal transfer rate experienced
    by a client with an unbounded buffer (should be
    greater than the bitrate of the video stream for
    smooth playback)
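
The three benchmark quantities can be computed from a log of packet arrivals. The sketch below is one possible reading, not the authors' measurement code: it takes the minimal rate to be the lowest cumulative average rate (bytes received so far divided by time since the first byte) over all packets. That interpretation and the function name are assumptions.

```python
def stream_metrics(request_time, packets):
    """Compute (startup_latency, transfer_time, minimal_rate).

    packets: list of (arrival_time_s, size_bytes) tuples in arrival order.
    minimal_rate is in bytes/s and follows the assumed definition:
    min over packets of (bytes received so far) / (time since first byte).
    """
    if not packets:
        raise ValueError("no packets received")
    first_t = packets[0][0]
    last_t = packets[-1][0]
    startup_latency = first_t - request_time   # request -> first byte
    transfer_time = last_t - first_t           # first byte -> end of transmission
    received = 0
    min_rate = float("inf")
    for t, size in packets:
        received += size
        elapsed = t - first_t
        if elapsed > 0:                        # skip the first packet (zero elapsed time)
            min_rate = min(min_rate, received / elapsed)
    return startup_latency, transfer_time, min_rate


# Hypothetical log: request at t=0, four 100-byte packets.
log = [(10, 100), (11, 100), (13, 100), (14, 100)]
print(stream_metrics(0, log))
```

For smooth playback, the returned minimal rate would have to exceed the video bitrate expressed in the same units.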

14
Startup latency
[Plot series: VTSS (DLT2000), MMSRS (DLT2000), VTSS (DLT7000)]
UniTree reference startup latency: 718 s
15
Transfer time (beginning part shown only)
[Plot series: VTSS (DLT2000), MMSRS (DLT2000), VTSS (DLT7000)]
UniTree reference transfer time: 135 s
16
System performance for the whole video file
transfer (DLT2000)
17
Minimal transfer rate
  • Definitions (for VTSS)
  • Minimal transfer rate
  • Time offset for tape changing direction
  • n - number of packets
  • Bj - number of bytes in j-th packet
  • ti - time when i-th packet was received
  • T - tape capacity in MB
  • N - number of tracks
  • Br - bitrate of video file in MB/s
  • no bad blocks

18
Minimal transfer rate
[Plot series: VTSS (DLT2000), MMSRS (DLT2000), VTSS (DLT7000)]
  • For DLT2000
  • T = 10 GB
  • N = 64
  • Br = 0.4 MB/s
  • Θdt = 400 s
  • For DLT7000
  • T = 35 GB
  • N = 52
  • Br = 0.4 MB/s
  • Θdt = 1723 s
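
The formula for the time offset did not survive in this transcript. The sketch below is an inference, not the authors' stated definition: dividing the capacity of one track (T / N) by the video bitrate Br reproduces both reported offsets (400 s and 1723 s) when 1 GB is taken as 1024 MB. The function name is hypothetical.

```python
def direction_change_offset_s(capacity_gb, tracks, bitrate_mb_s):
    """Seconds of video stored on one tape track (assumed meaning of the offset).

    Assumptions: data is spread evenly over the tracks, there are no bad
    blocks, and 1 GB = 1024 MB.
    """
    capacity_mb = capacity_gb * 1024
    mb_per_track = capacity_mb / tracks
    return mb_per_track / bitrate_mb_s


print(round(direction_change_offset_s(10, 64, 0.4)))  # DLT2000 -> 400
print(round(direction_change_offset_s(35, 52, 0.4)))  # DLT7000 -> 1723
```

Under this reading, the tape changes direction once per track, so the offset is the playback time between consecutive direction changes.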
19
Access Time Estimation - Motivation for CrossGrid
  • Retrieving a file from TSS can take from a few
    seconds to a few hours
  • User satisfaction increases when the access
    time of data is known (e.g. a user waiting to watch
    a selected video, an administrator recovering from
    backup)
  • Efficient use of storage resources in Grid
    environment (data replication subsystem)

20
Access Time Estimation Approaches
  • Open TSS approach
  • source code changes
  • will be used as experimental platform
  • Black Box TSS approach - for existing HSMs at
    CrossGrid sites
  • retrieving TSS state information via its native
    tools and available internal files

21
Access Time Estimation - Open TSS Approach
[Diagram: Client, TSS, TSS Simulator. Numbered messages:
1 - req.; 2 - req. id; 3 - ETA of req. id?; 4 - ETA.
Unnumbered flows: events, data.]
TSS source code changes - adding event reporting
functions
22
Access Time Estimation - Black Box TSS Approach
[Diagram: Client, Request Monitor Proxy, TSS Monitor,
TSS Simulator, TSS, Disk cache, Monitoring tools.
Numbered messages: 1 - fileid, ETA?; 2 - fileid;
3 - queue state; 4 - update; 5 - TSS state; 6 - ETA;
7 - ETA; 8 - fileid; 9 - fileid; 10 - data; 11 - data;
12 - feedback. Other labels: events collecting;
databases, logs, conf. files.]
  • Information needed by the Simulator
  • number of drives
  • tape labels
  • media types
  • position of file in media
  • number of requests
  • ...
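
To illustrate how a simulator might turn the listed state (number of drives, queued requests, position of the file on the medium) into an ETA, here is a deliberately simplified sketch. It is not the TSS Simulator described in the slides: the FIFO queue, the "earliest free drive" assignment, and every parameter name are assumptions made for illustration.

```python
import heapq


def estimate_eta(queued_service_s, drive_busy_s, mount_s, position_s,
                 size_mb, drive_rate_mb_s):
    """Estimate seconds until a newly queued file has been fully read.

    queued_service_s: service times (s) of requests already queued, FIFO order.
    drive_busy_s: for each drive, seconds until it becomes idle.
    mount_s / position_s: tape mount time and time to seek to the file.
    """
    if not drive_busy_s:
        raise ValueError("at least one drive is required")
    drives = list(drive_busy_s)
    heapq.heapify(drives)
    # Each queued request is served by whichever drive becomes free first.
    for service in queued_service_s:
        free_at = heapq.heappop(drives)
        heapq.heappush(drives, free_at + service)
    start = heapq.heappop(drives)  # when a drive is free for the new request
    return start + mount_s + position_s + size_mb / drive_rate_mb_s


# Hypothetical state: two drives, two queued requests, a 16 MB file.
print(estimate_eta([100, 50], [0, 20], mount_s=30, position_s=60,
                   size_mb=16, drive_rate_mb_s=2.0))
```

A real estimator would also need tape labels and media types to decide whether a mount is required at all and which positioning model applies.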
23
Conclusions
  • MMSRS and VTSS are more efficient than standard
    UniTree HSM
  • MMSRS is efficient enough to be used as middleware
    for existing HSMs of UniTree type (at CrossGrid sites)
  • Proposed measurements could be used for
  • building more sophisticated distributed storage
    systems (faster access to files stored in TSS)
  • building an access time estimation subsystem
  • Access time estimation subsystem -> an
    information provider for CrossGrid replication and
    migration of data

http://www.icsr.agh.edu.pl/
24
(No Transcript)
25
Component-expert Systems
  • Dutka, L., and Kitowski, J., "Implementation of
    expert technologies in information systems based
    on a component methodology", MSK 2001 Conf.,
    Nov. 19-21, 2001, Cracow, accepted (in Polish).
  • Dutka, L., and Kitowski, J., "Component-expert
    technology in mass-storage grid applications",
    ICCS 2002 Conf., April 2002, Amsterdam, in
    preparation.

26
Basics of Component-Expert Technology and its
usage in CrossGrid
  • Lukasz Dutka
  • dutka_at_agh.edu.pl

27
Classical component strategy
28
Component-expert strategy
29
Component structure
30
Component header structure
31
Structure of component code
32
Call-Environment
  • Describes the state of the call place
  • Describes the call place requirements
  • Carries information about user or programmer
    wishes
  • Expert system processes the Call-Environment and
    finds the best component for the given Call-Environment

33
Expert Subsystem
  • Rule-based expert system
  • A typical rule looks like: If log-expr Then action1
    Else action2
  • The rules describe what is meant by "the best
    component for a given Call-Environment"
  • Expert system logs calls and stores deduction
    results for further analysis
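
The rule form "If log-expr Then action1 Else action2" and the logging of deduction results can be sketched as follows. This is an illustrative toy, not the authors' expert subsystem: the `Rule` class, the component names, and the Call-Environment keys are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple

CallEnv = Dict[str, object]  # the Call-Environment as attribute -> value


@dataclass
class Rule:
    condition: Callable[[CallEnv], bool]   # the "log-expr"
    then_component: str                    # "action1": component to select
    else_component: Optional[str] = None   # "action2"; None means "try next rule"


def select_component(rules: List[Rule], env: CallEnv, default: str,
                     log: List[Tuple[CallEnv, str]]) -> str:
    """Return the first component chosen by a rule and record the decision."""
    choice = default
    for rule in rules:
        picked = rule.then_component if rule.condition(env) else rule.else_component
        if picked is not None:
            choice = picked
            break
    log.append((dict(env), choice))  # stored for further analysis
    return choice


# Hypothetical rules and component names.
rules = [
    Rule(lambda e: e.get("required_throughput_mb_s", 0) > 1.0, "disk_cache_reader"),
    Rule(lambda e: e.get("data_type") == "video",
         "video_fragment_reader", "generic_tape_reader"),
]
decisions: List[Tuple[CallEnv, str]] = []
print(select_component(rules, {"data_type": "video"}, "generic_tape_reader", decisions))
```

The caller only describes its Call-Environment; which component serves the call is decided by the rules, so new components can be added by editing rules rather than call sites.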

34
Benefits of Component-Expert technology
  • System can be expanded dynamically
  • Ease of solving new problems
  • Minimising programmer responsibility for
    component choice
  • Ease of programming in heterogeneous environments
  • Maximal reusability of components
  • Internal simplicity of component code
  • Increased efficiency of the programming process

35
Component-Expert Technology for CrossGrid Task 3.4
36
Basic analysis of data-access problems in CrossGrid
  • Different data set types
  • Huge data files
  • Distributed environment
  • Long distance connections
  • Mission critical applications
  • Heterogeneous data storing systems
  • Heterogeneous computing systems
  • Open system
  • Unpredictable file types

37
Basic connection diagram
38
Sequence Diagram
39
Example of Component-Expert technology usage for
data access in CrossGrid
  • Sample Attributes
  • User ID
  • Computing Node ID
  • Preferred replica localisation
  • Required throughput
  • Application purpose
  • Data sharing
  • Critical level
  • Replica expiration, ...
  • Example of local decisions
  • Device selection (according to availability and
    type)
  • Storage format (blocks, multimedia
    streams, ...)
  • Available delivery performance (network,
    storage devices, ...)
  • ... and much more ...

40
Control System for Migration/Replication
Strategies (1/2)
  • Assumptions
  • replica <--> file instances
  • read only
  • no update, no coherence

From replica manager
41
System Management for Migration/Replication
Strategies (2/2)
  • In cooperation with other projects
  • High-level control system (e.g. cooperating with
    LDAP)
  • Two possible realizations
  • heuristic reinforcement learning based on
    heuristic strategies for migration/replication
    and system state
  • classical rule-based expert system

42
Conclusions
  • Some elements have been defined and implemented
  • Working on higher-level structure and cooperation
    with other CrossGrid modules and services

43
(No Transcript)