The Web Archiving Service - PowerPoint PPT Presentation


1
The Web Archiving Service
and the Web-at-Risk NDIIPP Project
Tracy Seneca, California Digital Library
National Digital Information Infrastructure and Preservation Program, Library of Congress
California Digital Library
New York University
University of North Texas
2
Overview
  • Web archiving: what and why
  • Web-at-Risk grant: scope and purpose
  • Web Archiving Service: sample screens

3
Web archiving: what and why
4
Web Archiving Assumptions
  • Using automated methods to gather web content
  • Building some kind of collection composed of more
    than one site
  • Intent on preserving captured content
  • Results are searchable
  • Public access may not be available

5
How is the material at risk?
  • Vulnerability of:
    • Digital publications
    • Web publications
    • Government web publications
    • Local government web publications

6
The Ephemeral Web
7
Issues Unique to Government and Political Web Documents
  • Publication notification streams
  • Elections, political change
  • Security vs. freedom of information
  • Local agencies often don't have the resources to
    archive their own publications

8
Web-at-Risk grant: scope and purpose
9
Grant Scope: Jan 2005 to Jun 2009
  • Build tools to allow librarians to capture,
    curate and preserve web-based government and
    political information.
  • Create topical and event-based archives
  • Capture individual sites and documents
  • Assess the impact of these tools on traditional
    collection development practices.
  • Explore web archiving service sustainability.

10
Project Partners
11
Web-at-Risk Collections
12
Beyond the Grant
  • Support web archiving for the University of
    California
  • Enable collaboration across campuses
  • Enable collaboration between librarians and
    researchers/faculty

13
Web Archiving Service (WAS)
  • Tangible outcome of grant work
  • Being developed and released over a series of
    pilot tests
  • Pilot test 5 underway until May 23
  • 2008-2009: develop rights management and public
    access features

14
WAS Production
  • Early summer 2008, Web Archiving Service goes
    into limited production.
  • Available 24/7 to the curators who have taken
    part in the pilot tests so far
  • Expand user community within UC as CDL confirms
    that WAS infrastructure, user support and
    training are sufficient.

15
Web Archiving Service: Workflow and Sample Screens
16
WAS workflow: Project > Site > Capture > Collection
  • Set up a project (usually a topic or event)
  • Define the sites to capture
  • Run single or multiple captures of each site
  • Choose which results to add to a single,
    searchable collection
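
The four workflow steps above can be pictured as a small data model. This is only an illustrative sketch: the class, method and field names are hypothetical and are not taken from the Web Archiving Service itself, and the "search" here simply matches file URLs rather than full text.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Capture:
    """One crawl of a site; `files` lists the URLs retrieved (hypothetical)."""
    site_url: str
    label: str
    files: List[str] = field(default_factory=list)


@dataclass
class Site:
    """A site defined for capture within a project."""
    url: str
    captures: List[Capture] = field(default_factory=list)

    def run_capture(self, label: str, files: List[str]) -> Capture:
        # A real service would crawl the site here; this sketch only
        # records the file list it is handed.
        capture = Capture(site_url=self.url, label=label, files=list(files))
        self.captures.append(capture)
        return capture


@dataclass
class Project:
    """A topic or event that groups sites and the collection built from them."""
    name: str
    sites: List[Site] = field(default_factory=list)
    collection: List[Capture] = field(default_factory=list)

    def add_site(self, url: str) -> Site:
        site = Site(url=url)
        self.sites.append(site)
        return site

    def add_to_collection(self, capture: Capture) -> None:
        # The curator chooses which capture results join the collection.
        self.collection.append(capture)

    def search(self, term: str) -> List[str]:
        # Stand-in for full-text search: match the term against file URLs.
        return [f for c in self.collection for f in c.files if term in f]
```

In this sketch a site can be captured many times, but only the captures a curator explicitly adds to the collection are searched.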

17
(No Transcript)
18
Capture sites individually
19
Set Frequency
20
Add metadata (or not)
21
(No Transcript)
22
Sites can be captured in batches
23
When Capture Finishes
24
(No Transcript)
25
Display Results (QA capture effectiveness)
26
Display Results: Overview Reports
27
Display Results: Full Text Search
28
Display Results
29
Display Results (metadata)
30
(No Transcript)
31
Create Collection
32
Build Collection (add entire captures)
33
Build Collection
34
WAS features for analysis
  • It's impossible to know what a web site
    contains until after you capture it!
  • Tools for understanding where the data comes from
    and how it has changed.

35
What's the nature of this content?
36
What new publications are in this capture?
37
Build Collection (select files from Compare screen)
38
How volatile is this site? (Not yet available)
39
Potential
  • We can now capture the chit chat, the popular
    reaction to historic events, in ways never before
    possible.
  • How will researchers interact with captured
    content once it is in an archive?
    • Visualization
    • Text analysis
  • What is the potential, beyond simple search and
    display?

40
Web Archive Visualization: Doantam Phan, Stanford University
41
Questions?
Web-at-Risk Wiki: http://wiki.cdlib.org/WebAtRisk
YouTube Video: 'Web-at-Risk Collections'
tracy.seneca_at_ucop.edu