Overview of LOCKSS - PowerPoint PPT Presentation

About This Presentation
Title:

Overview of LOCKSS

Slides: 22
Provided by: kylef1
Transcript and Presenter's Notes

1
Overview of LOCKSS
2
Session Learning Objectives
  • Provide an overview of the LOCKSS architecture.
  • Describe the LOCKSS polling process.
  • Describe how LOCKSS private networks differ.
  • Provide a vocabulary of technical terms used
    frequently with LOCKSS networks.

3
Architectural Components
  • Provider Sites (digital collections)
  • LOCKSS nodes (aka peers)
  • Plugins / Plugin Repository
  • Cache Manager
  • Title Database / Conspectus Database

4
Provider Sites
  • Prepare a digital collection so that it is web
    accessible to the preservation nodes
  • Expose a manifest web page for each collection,
    according to LOCKSS specifications. The manifest
    page:
  • Grants permission for LOCKSS to crawl
  • Gives the starting point for the crawl
  • Provide information sufficient to create a LOCKSS
    plugin for the collection (or else create the
    plugin themselves and reposit that plugin with
    the LOCKSS network)
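
The manifest page's two jobs (granting permission and giving the crawl starting point) can be illustrated with a minimal Python sketch. The permission wording in PERMISSION_TEXT, the example.org URLs, and the function name inspect_manifest are assumptions made for illustration; consult the LOCKSS specifications for the exact statement a network expects.

```python
from html.parser import HTMLParser

# Assumed wording of the permission statement (not taken from the slides).
PERMISSION_TEXT = (
    "LOCKSS system has permission to collect, preserve, and serve "
    "this Archival Unit"
)


class ManifestParser(HTMLParser):
    """Collects visible text and link targets from a manifest page."""

    def __init__(self):
        super().__init__()
        self.text_parts = []
        self.links = []

    def handle_starttag(self, tag, attrs):
        # Links on the manifest page are treated as crawl starting points.
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)

    def handle_data(self, data):
        self.text_parts.append(data)


def inspect_manifest(html):
    """Return (permission_granted, start_urls) for a manifest page."""
    parser = ManifestParser()
    parser.feed(html)
    parser.close()
    # Normalize whitespace so line breaks do not hide the statement.
    text = " ".join(" ".join(parser.text_parts).split())
    return PERMISSION_TEXT in text, parser.links


sample = """
<html><body>
<p>LOCKSS system has permission to collect, preserve, and serve
this Archival Unit</p>
<a href="http://example.org/dc1/au1/index.html">AU 1</a>
</body></html>
"""
granted, start_urls = inspect_manifest(sample)
```

A crawler built along these lines would refuse to harvest a collection when granted is False.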

5
LOCKSS Peer Nodes
  • Data caches for harvested content
  • Caches organized into archival units (AUs)
  • Nodes can select which AUs to crawl and preserve
  • There must be > 6 copies of an AU in order for
    the polling process to work properly

6
Plugins / Plugin Repository
  • Tell LOCKSS where, how and how often to crawl a
    provider site for AUs
  • Plugins are Java based
  • Distinct from core LOCKSS software
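
As a rough illustration of the "where, how and how often" information a plugin carries, here is a Python sketch. It is not the LOCKSS plugin format (real plugins are Java based, as noted above); the class CrawlPlugin, its fields, and the example URLs and interval are all hypothetical.

```python
import re
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class CrawlPlugin:
    """Hypothetical stand-in for the information a plugin carries."""

    start_url: str               # where: the crawl starting point
    include_pattern: str         # how: regex for URLs that belong to the AU
    recrawl_interval: timedelta  # how often: minimum time between crawls

    def in_scope(self, url):
        """True if the URL should be collected as part of the AU."""
        return re.match(self.include_pattern, url) is not None

    def crawl_due(self, last_crawl, now):
        """True if enough time has passed since the last crawl."""
        return now - last_crawl >= self.recrawl_interval


plugin = CrawlPlugin(
    start_url="http://example.org/dc1/manifest.html",
    include_pattern=r"^http://example\.org/dc1/",
    recrawl_interval=timedelta(days=14),
)
```

Keeping these rules in a separate object mirrors the slide's point that plugins are distinct from the core software.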

7
Cache Manager
  • Distributed separately from LOCKSS
  • Can remotely inspect and manage the caches on the
    various peer nodes

8
Title / Conspectus Databases
  • Title database on each node describes and manages
    which AUs to preserve on that node
  • Conspectus Database designed for MetaArchive
    Project, provides more extensive metadata about
    the preserved digital collections, and feeds the
    Title database with entries

9
[Diagram: a Plugin Repository; Digital Collection 1
(DC1) and Digital Collection 2 (DC2), each with a Web
Site and a Manifest page and divided into archival
units (AU 1, AU 2, AU 3), with content labeled Source
Code and SQL Dump; and Private LOCKSS Network Nodes
1 through 9, each labeled with the collections (DC1,
DC2) it holds.]
10
The Polling Process
11
Polling Process resulting in landslide loss, AU
repair
  • Node 5 calls a poll on AU 1 of Digital Collection 2
    (DC2-AU1).
  • Node 5 invites some recently encountered peers to
    vote. (Each node maintains a reference list of
    the recently encountered peers.) Those invited
    are the inner circle for this opinion poll.
  • An affirmative PollChallenge message response
    allows that inner circle node to participate in
    the poll.
  • The Poll Effort Proof (PollProof) is
    cryptographically derived and sent in response
    to affirmative voters' challenges.
  • Invited nodes create a fresh SHA1 digest of the
    AU.
  • Node 5 discovers new peers through the nomination
    process: Node 9 nominates Nodes 7 and 8.
  • Nominated Nodes 7 and 8 belong to the outer
    circle and can be invited to subsequent voting
    rounds by Node 5.
  • There is a landslide of valid, disagreeing votes
    against Node 5's SHA1 digest of DC2-AU1.
  • Since agreeing votes are below threshold, Node 5
    picks a random disagreeing voter from the inner
    circle. An encrypted RepairRequest message is
    sent and the repair is made.
  • Once the repair is completed, Node 5 immediately
    calls a new poll, which effectively verifies, or
    invalidates and corrects, the repair.
[Diagram: Nodes 1, 2, 4, 5, 7, 8, and 9 with their
copies of DC2-AU1 and SHA1 digests; arrows labeled
Invitation, PollChallenge, PollProof, "Valid vote
disagrees", "Valid vote agrees", "Encrypted
RepairRequest message", and "Repair made".]
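
The digest comparison at the center of this poll can be sketched in Python. This is a simplified illustration, not the LOCKSS implementation: the functions au_digest and tally, the file contents, and the choice of which nodes vote are assumptions, and the effort proofs, challenges, and encryption in the real protocol are omitted.

```python
import hashlib


def au_digest(files):
    """SHA-1 over an AU's files, visited in sorted name order.

    `files` maps a file name to its bytes. The digest is recomputed
    from the stored content on every call, i.e. it is always fresh.
    """
    h = hashlib.sha1()
    for name in sorted(files):
        h.update(name.encode("utf-8"))
        h.update(files[name])
    return h.hexdigest()


def tally(poller_digest, votes):
    """Split voters into those agreeing and disagreeing with the poller."""
    agree = [peer for peer, d in votes.items() if d == poller_digest]
    disagree = [peer for peer, d in votes.items() if d != poller_digest]
    return agree, disagree


good_copy = {"index.html": b"<html>ok</html>"}
damaged_copy = {"index.html": b"<html>0k</html>"}  # the poller's bad copy

# Hypothetical voters, each holding an undamaged copy of the AU.
votes = {peer: au_digest(good_copy) for peer in (1, 2, 4, 9)}

# First poll: every valid vote disagrees with the damaged copy.
agree_before, disagree_before = tally(au_digest(damaged_copy), votes)

# After repairing from a disagreeing voter, a new poll verifies the repair.
repaired_copy = dict(good_copy)
agree_after, disagree_after = tally(au_digest(repaired_copy), votes)
```

The second tally corresponds to the immediate follow-up poll: if the repair source had itself been bad, the new poll would again show disagreement.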
12
Polling Refresh Timer
  • A peer sets a refresh timer for a given AU to
    determine the interval between successive polls
  • System parameter R is the mean for the possible
    random values generated for the refresh timer
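
A refresh timer with mean R might be drawn as in the Python sketch below. The slide only states that R is the mean; the uniform distribution, the spread parameter, and the 90-day value are assumptions chosen for illustration.

```python
import random


def next_poll_delay(mean_r, spread=0.5, rng=random):
    """Draw a random inter-poll interval whose mean is `mean_r`.

    Assumption: uniform over [R * (1 - spread), R * (1 + spread)].
    """
    low = mean_r * (1 - spread)
    high = mean_r * (1 + spread)
    return rng.uniform(low, high)


R_DAYS = 90  # hypothetical value, for illustration only
rng = random.Random(42)  # seeded so the example is repeatable
delays = [next_poll_delay(R_DAYS, rng=rng) for _ in range(10_000)]
average = sum(delays) / len(delays)  # close to R_DAYS for a large sample
```

Randomizing the interval keeps peers from polling on a predictable schedule while the long-run polling rate stays tied to R.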

13
System Parameter Quorum
  • Q = number of valid inner circle votes required
    to conclude a poll successfully
  • Q = 6 is the thoroughly tested value in use
  • If votes < Q, poller invites additional peers, or
    else aborts the opinion poll
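
The quorum rule can be expressed as a small decision function. This Python sketch assumes the poller knows how many peers it could still invite; the function name quorum_step and the returned labels are hypothetical.

```python
Q = 6  # quorum value quoted on the slide


def quorum_step(valid_votes, peers_left_to_invite, quorum=Q):
    """Decide the poller's next action from its valid inner circle votes."""
    if valid_votes >= quorum:
        return "tally"        # enough valid votes to evaluate the poll
    if peers_left_to_invite > 0:
        return "invite_more"  # below quorum, but more peers can be asked
    return "abort"            # below quorum and nobody left to invite
```

For example, quorum_step(4, 3) asks for more invitations, while quorum_step(4, 0) aborts the poll.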

14
Polling Outcome - Landslide Win
  • The poller considers its current copy to have
    integrity
  • This is the only scenario in which an opinion
    poll concludes successfully
  • The poller updates its reference list and then
    waits until the next polling period (determined
    by the refresh timer)

15
Reference List Update
  • Happens only after a successful poll
  • Poller removes the inner circle peers who had
    valid votes in the last opinion poll
  • Culls peers it has not been able to contact for
    some time
  • Adds outer circle peers whose votes were valid
    and eventually agreeing
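
The three update steps map naturally onto set operations. The Python sketch below assumes peers are identified by simple integer ids; the function name and the example peer ids are hypothetical.

```python
def update_reference_list(reference, inner_valid, unreachable, outer_agreeing):
    """Return the new reference list after a successful poll.

    All arguments are iterables of peer ids.
    """
    updated = set(reference)
    updated -= set(inner_valid)     # remove inner circle peers with valid votes
    updated -= set(unreachable)     # cull peers not contacted for some time
    updated |= set(outer_agreeing)  # add valid, eventually agreeing outer peers
    return updated


new_list = update_reference_list(
    reference={1, 2, 3, 4, 6, 9},
    inner_valid={1, 2, 4, 9},
    unreachable={3},
    outer_agreeing={7, 8},
)
```

Here the list turns over from mostly inner circle peers to peer 6 plus the newly discovered peers 7 and 8, so successive polls sample different parts of the network.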

16
Polling Outcome - Inconclusive
  • D = max allowed minority votes
  • If Agreeing Votes > D, and
  • Agreeing Votes < Total valid votes - D,
  • Then the poll is inconclusive, raises alarm
  • Human intervention needed to determine if nodes
    have been compromised
  • Peers voting in agreement with a known bad copy
    are blacklisted if that peer node can't be
    identified or it won't cooperate
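
The inconclusive condition, together with the two landslide outcomes, can be written as one classification function. In this Python sketch the inconclusive test follows the slide; the landslide win and loss tests are inferred as its complement (at most D disagreeing votes, and at most D agreeing votes, respectively), and the value D = 2 used in the examples is hypothetical.

```python
def classify_poll(agreeing, total_valid, d):
    """Classify a poll from its count of agreeing valid votes.

    Assumes total_valid > 2 * d, so the three outcomes cannot overlap.
    """
    if total_valid <= 2 * d:
        raise ValueError("total_valid must exceed 2 * d")
    if not 0 <= agreeing <= total_valid:
        raise ValueError("agreeing must be between 0 and total_valid")
    if d < agreeing < total_valid - d:
        return "inconclusive"    # raises an alarm; needs human intervention
    if agreeing <= d:
        return "landslide_loss"  # poller's copy is suspect; repair it
    return "landslide_win"       # at most d votes disagree


outcomes = [
    classify_poll(6, 6, 2),  # all six voters agree
    classify_poll(4, 8, 2),  # voters split four against four
    classify_poll(1, 7, 2),  # only one voter agrees
]
```

With D = 2, a poll of 8 valid votes is inconclusive when between 3 and 5 of them agree with the poller.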

17
Further Details on Polling Process
  • Petros Maniatis, Mema Roussopoulos, TJ Giuli,
    David S. H. Rosenthal, Mary Baker, and Yanto
    Muliadi, "LOCKSS: A Peer-to-Peer Digital
    Preservation System", ACM Transactions on
    Computer Systems (TOCS).
    http://www.eecs.harvard.edu/mema/publications/TOCS2005.pdf
  • See also LOCKSS related publications at
    http://www.lockss.org/lockss/Publications

18
The LOCKSS Private Network Difference
  • More flexible (not appliance based)
  • Can run on any operating system that supports
    Java
  • LOCKSS Team maintains rpm packages for Linux
    installations
  • Peer Node administrators have greater discretion
    configuring access, customizing functionality,
    e.g. altering system parameters

19
The LOCKSS Private Network Difference (cont.)
  • Can extend LOCKSS core functionality with
    supplemental tools and methods to fit new use
    cases
  • E.g. the MetaArchive Conspectus database

20
Vocabulary
  • (Please refer to the workshop binder for
    terminology and definitions)

21
Overview of LCAP version 3