Title: the ECHO DEPository Project


1
the ECHO DEPository Project
  • A project of the University of Illinois at
    Urbana-Champaign and OCLC in partnership with the
    Library of Congress
  • ALA Annual Chicago
  • June 2005
  • Taylor Surface, OCLC

2
The digital preservation problem
  • Information is being produced in greater
    quantities and with greater frequency than at any
    time in history.
  • How will society preserve this information and
    make it available to future generations?
  • How will libraries and other repositories
    classify this information so that their patrons
    can find it with the same ease that they can
    locate a book on a shelf?
  • The ease with which electronic information can be
    created and "published" makes much of what is
    available today, gone tomorrow. Thus there is an
    urgent need to preserve this information before
    it is forever lost.
  • Library of Congress
    (http://www.digitalpreservation.gov)

3
About NDIIPP
  • The National Digital Information Infrastructure
    and Preservation Program (NDIIPP) is a $99.8M
    national digital strategy effort led by the
    Library of Congress.
  • Its mission:
      • Develop a national strategy to collect,
        archive and preserve the burgeoning amounts of
        digital content, especially materials that are
        created only in digital formats, for current and
        future generations.
  • http://www.digitalpreservation.gov

4
Library of Congress NDIIPP Program
  • Building Digital Preservation Infrastructure
  • Partnerships
  • Policy
  • Standards
  • Technical components

5
NDIIPP key areas of interest
  • Digital Preservation
  • Practical applications and models
  • National technical architecture
  • Basic research

6
ECHO DEPository Overview
  • Design selection methodology
  • Develop software implementing theory
      • Machine-assisted
      • Open source
  • Evaluate various repositories
      • Using content gathered from tools
      • Other content providers
  • Study semantic preservation techniques

7
Three objectives
  • Comparative test of repositories with various
    digital collections
  • Development of Web Archives Workbench
  • Investigations of semantic digital preservation
    and alternate applications of workbench tools

8
Project Partners
  • University of Illinois, Urbana-Champaign
      • Libraries, GSLIS, NCSA, WILL, DMI
  • OCLC
  • State Libraries of Arizona, Connecticut,
    Illinois, North Carolina and Wisconsin
  • Tufts Perseus Project
  • Michigan State Sounds Archives
  • Library of Congress, NDIIPP Program
  • $3 million funding over 3 years

9
ECHO DEPository Project
  • [Diagram] Universe of Content: Digitized Texts,
    Admin Data, G.I.S. Photos, Video, Audio
  • Tools from this project: Comparative repository
    testing
  • Service Provider: UIUC, OCLC, NCSA
  • Repository: DSpace, Greenstone, Digital Archive,
    SRB, Fedora

10
ECHO DEPository Project
  • [Diagram] Universe of Content: Digitized Texts,
    Admin Data, State Pubs, G.I.S. Photos, Video,
    Audio
  • Tools from this project: Web Archives Workbench
    Arizona model; W.A.W. development
  • Service Provider: UIUC, OCLC, NCSA
  • Repository: DSpace, Greenstone, Digital Archive,
    SRB, Fedora

11
The Arizona Model
  • Web domains as archival collections
  • Creates efficiencies for
      • Selection of documents
      • Name authority and other metadata
      • Browseable access

12
Arizona Model: a new approach
  • Assumptions
      • Content creators won't help
      • Item-by-item selection is unsatisfactory
      • Bulk harvesting is unsatisfactory
  • An archival approach
      • Identifying groups of similar material (series)
      • Automatic identification of new series items
      • Series description
      • Item-level description is possible if warranted
      • Ingest of documents into an archive
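
One way to picture the archival approach on this slide is to treat each directory on a web domain as a candidate series, then flag newly harvested documents that fall into a series that has already been described. The sketch below is an illustration only: the directory-based grouping rule, the function names, and the example.gov URLs are assumptions, not the project's actual series-identification method.

```python
from collections import defaultdict
from urllib.parse import urlparse
import posixpath


def series_key(url):
    """Return (domain, directory) as a hypothetical series identifier."""
    parts = urlparse(url)
    directory = posixpath.dirname(parts.path) or "/"
    return (parts.netloc.lower(), directory)


def group_into_series(urls):
    """Group document URLs into series keyed by domain and directory."""
    series = defaultdict(list)
    for url in urls:
        series[series_key(url)].append(url)
    return dict(series)


def new_series_items(known_series, harvested_urls):
    """Return harvested URLs that belong to a known series but are not in it yet."""
    new_items = defaultdict(list)
    for url in harvested_urls:
        key = series_key(url)
        if key in known_series and url not in known_series[key]:
            new_items[key].append(url)
    return dict(new_items)
```

With this rule a curator would describe a series once (for example, an agency's annual reports directory), and later harvests would only surface the documents that are new to that series.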

13
Web Archives Workbench
  • [Architecture diagram] Tools: Discovery Tool,
    Properties Tool, Analysis Tool, Packager Tool
  • Components: Heritrix Harvester, Cloudscape DB,
    Apache, Tomcat, Linux

14
Web Archives Workbench (WAW)
  • Tools for curators
  • Discovery: identify and manage domains
  • Properties: associate metadata, content, and
    providers
  • Analysis: select content from structure
  • Packager: package content and metadata

15
WAW - Discovery Tool
  • Currently available (May 2005)
  • Helps curators identify domains that are within
    their collecting scope
  • Crawls web sites and extracts domains of possible
    interest from content
  • Maintains lists of domains
  • Monitors selected domains for changes
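
The domain-extraction step described on this slide can be sketched roughly as follows: parse a crawled page, collect the host names its links point to, and keep those that fall inside a collecting scope and are not already on the curator's list. This is a minimal illustration using only the Python standard library. The class and function names, the suffix-based scope test, and the example hosts are assumptions, not the Discovery Tool's actual design.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkDomainExtractor(HTMLParser):
    """Collect the host names of all <a href> links in one HTML page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.domains = set()

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Resolve relative links against the page URL first.
                host = urlparse(urljoin(self.base_url, value)).hostname
                if host:
                    self.domains.add(host.lower())


def candidate_domains(html, base_url, scope_suffixes, known_domains):
    """Return in-scope domains linked from the page that are not yet listed."""
    parser = LinkDomainExtractor(base_url)
    parser.feed(html)
    return sorted(
        d for d in parser.domains
        if d not in known_domains
        and any(d == s or d.endswith("." + s) for s in scope_suffixes)
    )
```

A curator-facing tool would presumably present these candidates for review rather than add them automatically, since the slide describes the tool as helping curators identify domains.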

16
WAW - Properties Tool
  • Currently available (May 2005)
  • Relates content providers to web sites
  • Organizes a group of web sites hierarchically
  • Associates metadata with content providers and,
    later, with selected content
  • Metadata can be subject headings, preferred
    names, aliases, etc.
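
A rough sketch of the data model this slide implies: content providers arranged in a hierarchy, each carrying a preferred name, aliases, subject headings, and the web sites it is responsible for. Everything here is hypothetical, including the `Provider` class, the field names, and in particular the rule that a provider inherits the subject headings of its parents. The slides do not say how the Properties Tool stores or resolves metadata.

```python
from dataclasses import dataclass, field


@dataclass
class Provider:
    """A content provider node in a hypothetical provider hierarchy."""
    preferred_name: str
    aliases: list = field(default_factory=list)
    subject_headings: list = field(default_factory=list)
    web_sites: list = field(default_factory=list)
    children: list = field(default_factory=list)


def metadata_for_site(provider, site, inherited=()):
    """Find the provider that owns `site` and return its metadata.

    Subject headings are accumulated from the root down (an assumed rule).
    Returns None when no provider in the hierarchy lists the site.
    """
    headings = list(inherited) + provider.subject_headings
    if site in provider.web_sites:
        return {
            "provider": provider.preferred_name,
            "aliases": provider.aliases,
            "subject_headings": headings,
        }
    for child in provider.children:
        found = metadata_for_site(child, site, headings)
        if found:
            return found
    return None
```

Looking up a site then yields the metadata that could later be attached to content selected from that site.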

17
WAW - Analysis Tool
  • Available January 2006
  • Content selection at varying levels of
    granularity
  • Harvests an entire site or one document
  • Scheduled harvesting of content
  • Shows site structure
  • Understands serials
  • Content is automatically associated with content
    providers' metadata

18
WAW - Packager Tool
  • Available January 2006
  • Combines descriptive metadata about content
    creator, series, and object
  • Creates administrative and preservation metadata
  • Packages web content and metadata into an XML
    standard package (METS)
  • Neutral format for ingest into OCLC archive and
    other repositories
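
To make the packaging step more concrete, here is a minimal sketch of a METS document built with Python's standard `xml.etree.ElementTree`. It wraps Dublin Core descriptive metadata in a `dmdSec`, lists harvested files in a `fileSec`, and ties them together in a `structMap`. The function name, the choice of Dublin Core, the ID values, and the sample URL are assumptions for illustration. The sketch omits the administrative and preservation metadata (`amdSec`) that the slide mentions, and its output has not been validated against the METS schema.

```python
import xml.etree.ElementTree as ET

METS = "http://www.loc.gov/METS/"
XLINK = "http://www.w3.org/1999/xlink"
DC = "http://purl.org/dc/elements/1.1/"

ET.register_namespace("mets", METS)
ET.register_namespace("xlink", XLINK)
ET.register_namespace("dc", DC)


def q(ns, tag):
    """Build a namespace-qualified tag name in ElementTree's {ns}tag form."""
    return f"{{{ns}}}{tag}"


def build_mets(title, creator, files):
    """Return a METS XML string; `files` is a list of (url, mime_type) pairs."""
    root = ET.Element(q(METS, "mets"))

    # Descriptive metadata section with embedded Dublin Core elements.
    dmd = ET.SubElement(root, q(METS, "dmdSec"), ID="DMD1")
    wrap = ET.SubElement(dmd, q(METS, "mdWrap"), MDTYPE="DC")
    data = ET.SubElement(wrap, q(METS, "xmlData"))
    ET.SubElement(data, q(DC, "title")).text = title
    ET.SubElement(data, q(DC, "creator")).text = creator

    # File inventory and a flat structural map pointing at each file.
    file_sec = ET.SubElement(root, q(METS, "fileSec"))
    group = ET.SubElement(file_sec, q(METS, "fileGrp"))
    struct = ET.SubElement(root, q(METS, "structMap"))
    div = ET.SubElement(struct, q(METS, "div"), DMDID="DMD1")
    for i, (url, mime) in enumerate(files, start=1):
        file_id = f"FILE{i}"
        f = ET.SubElement(group, q(METS, "file"), ID=file_id, MIMETYPE=mime)
        ET.SubElement(
            f, q(METS, "FLocat"), {"LOCTYPE": "URL", q(XLINK, "href"): url}
        )
        ET.SubElement(div, q(METS, "fptr"), FILEID=file_id)

    return ET.tostring(root, encoding="unicode")
```

Because the package is plain XML, any repository that accepts METS could in principle ingest it, which is the point of the "neutral format" bullet above.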

19
ECHO DEPository Project
  • [Diagram] Universe of Content: Digitized Texts,
    Admin Data, State Pubs, G.I.S. Photos, Video,
    Audio
  • Tools from this project: Web Archives Workbench,
    Digital preservation investigation
  • Service Provider: UIUC, OCLC, NCSA
  • Repository: DSpace, Greenstone, Digital Archive,
    SRB, Fedora

20
the ECHO DEPository Project
  • A project of the University of Illinois at
    Urbana-Champaign and OCLC in partnership with the
    Library of Congress
  • ECHO DEPository project web site
      • http://www.ndiipp.uiuc.edu/index.html
  • NDIIPP web site
      • http://digitalpreservation.gov
  • Me: Taylor Surface, OCLC
      • taylor_surface_at_oclc.org