Multivalent Documents: Anytime, Anywhere, Any Type, Every Way User-Improvable Digital Document System

1
Multivalent Documents: Anytime, Anywhere, Any
Type, Every Way User-Improvable Digital
Document System
  • Richard Fateman
  • The UCB Digital Library Project Team
  • Thanks to T. Phelps, R. Wilensky for slides

http://elib.cs.berkeley.edu/
2
Multivalent Documents: Motivation
  • Document manipulation is ubiquitous
  • e-mail, web browsing, word processing, net news,
    help systems, program editors,
  • Most existing systems are
  • Pre-specified in format/genre/delivery mechanism
  • not well integrated
  • not very extensible
  • Word, FrameMaker: API for given functions
  • OpenDoc, HTML + Applet: juxtaposition, not
    integration
  • Netscape: Open source, hard to integrate and
    distribute diverse changes
  • Situation inhibits experimentation with new
    functionality, modes of interaction

3
Goal
  • Anytime - Add content (annotations or core) and
    functionality on demand
  • Anywhere - over network, read-only media, mobile
    devices
  • Any Type - scanned page images, HTML,...
    Implement functionality once, works on any type
  • Every Way - content, functionality, operation
  • User - End-user dynamically loads easily, hacker
    gets deep access and easy distribution
  • Improvable - Seamless integration of improvements
    for inexpressive (i.e., all current) formats
  • Digital Document System - Conform to modern
    practices: multimedia, structure-based, style
    sheets, XML, WYSIWYG, GUI, networked, incremental
    algorithms, ...

4
Multivalent Documents
  • A highly open, extensible model of documents
  • Content: Multiple, distributed layers of
    intimately related information.
  • Functionality: Implemented via behaviors:
    small, dynamically loadable, reusable, composable
    program units.
  • Infrastructure supports composition via
    well-defined protocols.

5
Implementation
  • Implemented MVD infrastructure in Java
  • Initially an applet, now an application
  • In alpha, beta ASN
  • Several applications developed
  • Enlivening scanned page images
  • Extensible HTML
  • Distributed, in situ annotations
  • Video scripting
  • Other individual behaviors
  • Available at http://www.cs.berkeley.edu/wilensky/MVD.html

8
This is saved as http://...arpa-anno.mvd
9
Behaviors with temporal extent
10
The MVD Protocol Suite: Reify fundamental
document lifecycle
[Diagram: protocols grouped by lifecycle stage. Hub doc: restore, save. Runtime: build, paint, format, user-events, undo, clipboard. The diagram labels protocols with Before and After phases and with High/Low priority ordering.]
  • Protocols execute methods according to their
    behaviors' priority.
  • Some have a second (after) phase in which
    additional methods are executed in low-to-high
    order
  • Some protocols traverse trees
  • Hub document is the persistent MVD object.
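
The ordering rule in the bullets above can be sketched in Java. This is a minimal illustration under stated assumptions, not the MVD API: the class and method names are hypothetical, larger numbers are taken to mean higher priority, and the first phase is assumed to run high-to-low since the slide only states the low-to-high order for the after phase.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Illustrative sketch only: every name here is hypothetical, not the MVD API.
class ProtocolDispatchSketch {
    /** A behavior with a numeric priority; larger numbers mean higher priority (assumption). */
    static final class Behavior {
        final String name;
        final int priority;

        Behavior(String name, int priority) {
            this.name = name;
            this.priority = priority;
        }

        int priority() { return priority; }
        void before(List<String> log) { log.add("before:" + name); }
        void after(List<String> log) { log.add("after:" + name); }
    }

    /** Runs the before phase high-to-low, then the after phase low-to-high. */
    static List<String> run(List<Behavior> behaviors) {
        List<Behavior> ordered = new ArrayList<>(behaviors);
        Comparator<Behavior> byPriority = Comparator.comparingInt(Behavior::priority);
        ordered.sort(byPriority.reversed());

        List<String> log = new ArrayList<>();
        for (Behavior b : ordered) {
            b.before(log);
        }
        for (int i = ordered.size() - 1; i >= 0; i--) {
            ordered.get(i).after(log);
        }
        return log;
    }

    public static void main(String[] args) {
        List<Behavior> behaviors = List.of(
                new Behavior("span", 5), new Behavior("lens", 9), new Behavior("media", 1));
        // Prints [before:lens, before:span, before:media, after:media, after:span, after:lens]
        System.out.println(run(behaviors));
    }
}
```

The after phase mirrors the before phase, so the highest-priority behavior both sees the protocol first and gets the last word on its result.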

11
Behaviors
  • All user-level functionality implemented as
    behavior extensions
  • Behaviors invoked by framework according to
    protocols (function signatures)
  • Some types of behaviors:
  • Media Adaptor: OCR (Xdoc), HTML, ASCII
  • Search with visualization
  • Structural: alt. select-and-paste, Notemarks
  • Span: hyperlink, highlight, copy editor mark
  • Lens: OCR, magnify, notes, Pilot notes
  • Manager: lens coordination, user interface

12
Layers of Content

13
Multivalent Protocols: Restore Protocol and Hub
Document
  • External to internal
  • Instantiates relevant behaviors
  • behaviors initialize
  • some load corresponding layer(s)
  • Document components stored as hub document
  • spontaneous hubs system built from cascading hubs

14
Hub Example
  <MULTIVALENT
      URL="file/H/wilensky/mvd122/demo/xdoc/620/OCR-XDOC/00000001.xdc"
      PAGES="9" ORGANIZATION="" NOTES="" TYPE="Varian" SEARCHNB="ON"
      ABSTRACT="" AUTHOR="Hal Varian" GENRE="Xdoc"
      BIB-VERSION="CS-TR-v2.0" TITLE="A Model of Sales" ID="ELIB//620"
      ENTRY="February 8, 1996" DATE="February 1996">
    <Layer NAME="Personal" BEHAVIOR="multivalent.Layer" URL="inline">
      <Span BEHAVIOR="HighlightSpan" CREATEDAT="941142025281"
          NB="ANNONB" COLOR="YELLOW" LENGTH="16">
        <Start BEHAVIOR="multivalent.Location"
            TREE="0 49/Stiglitz 1/PARA 2/REGION 0/OCR 0/IROOT"
            CONTEXT="Stiglitz and 28197729.">
        </Start>
        <End BEHAVIOR="multivalent.Location"
            TREE="7 50/28197729. 1/PARA 2/REGION 0/OCR 0/IROOT"
            CONTEXT="28197729. Stiglitz They">
        </End>
      </Span>
    </Layer>
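
As a rough illustration of the first step of the restore protocol, the sketch below reads a hub-like document and collects the BEHAVIOR attributes it names. It assumes the hub is well-formed XML and uses the standard JDK DOM parser as a stand-in for MVD's own loader; the class name `HubBehaviorLister` and the shortened sample hub are hypothetical. Turning each name into a live behavior object is not shown.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of MVD: lists the behaviors a hub asks for.
class HubBehaviorLister {
    /** Returns the BEHAVIOR attribute of every element that declares one, in document order. */
    static List<String> behaviorNames(String hubXml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(hubXml)));
            NodeList all = doc.getElementsByTagName("*"); // "*" matches every element
            List<String> names = new ArrayList<>();
            for (int i = 0; i < all.getLength(); i++) {
                Element e = (Element) all.item(i);
                if (e.hasAttribute("BEHAVIOR")) {
                    names.add(e.getAttribute("BEHAVIOR"));
                }
            }
            return names;
        } catch (Exception ex) {
            throw new IllegalArgumentException("hub is not well-formed XML", ex);
        }
    }

    public static void main(String[] args) {
        // A shortened hub modeled on the slide's example.
        String hub = "<MULTIVALENT GENRE=\"Xdoc\">"
                + "<Layer NAME=\"Personal\" BEHAVIOR=\"multivalent.Layer\" URL=\"inline\">"
                + "<Span BEHAVIOR=\"HighlightSpan\" COLOR=\"YELLOW\" LENGTH=\"16\">"
                + "<Start BEHAVIOR=\"multivalent.Location\"/>"
                + "<End BEHAVIOR=\"multivalent.Location\"/>"
                + "</Span></Layer></MULTIVALENT>";
        // Prints [multivalent.Layer, HighlightSpan, multivalent.Location, multivalent.Location]
        System.out.println(behaviorNames(hub));
    }
}
```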

15
Build Protocol and Document Tree
  • Isolation to union: Iterates over behaviors,
    constructing tree for document content (and,
    soon, a separate tree for the user interface)
  • Runtime data structure: logical/structural tree
  • All media types/genres expose structure for
    manipulation by other behaviors
  • E.g., augmenting scanned with table, biblio
  • Media adaptor behavior bridges between concrete
    and abstract through leaves, throughout lifecycle
  • E.g., scanned page images: parse XDOC, load
    image, draw image/OCR
  • Behaviors request UI categories and elements;
    system groups and instantiates all requests

16
Document Represented Internally as Graph
[Diagram: three roots (annotation root, table root, base root) over shared content. Node labels include section1 to section3, p1 to p4, table, col1, col2, line, and word leaves (w). Media adaptors (text, image) sit at the leaves.]
17
Format Protocol and Graphics Context
  • Logical to physical: Traverse document tree
    placing elements at geometric locations
  • Media-specific leaves report dimensions, internal
    structural nodes position children, i.e.,
    implement layout policies (line breaking, table
    cells, frames)
  • Three coordinate systems: screen, absolute
    document, and for efficiency, parent-child
    relative
  • Current display properties in graphics context:
    font, colors, line spacing, underline, signals,
    ...
  • Graphic context changed structurally (style
    sheets), linear range (spans), geometrically
    (lenses)
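
A minimal Java sketch of the format traversal, assuming a single layout policy (internal nodes stack their children vertically) and hypothetical class names. Leaves report a fixed size, internal nodes place children in parent-relative coordinates, and an absolute document position is recovered by summing offsets up to the root.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical names; the vertical-stacking policy is an assumption for illustration.
class FormatSketch {
    static final class Node {
        final String name;
        final List<Node> children = new ArrayList<>();
        Node parent;
        int x, y;          // position relative to the parent
        int width, height; // reported by leaves, computed for internal nodes

        Node(String name, int leafWidth, int leafHeight) {
            this.name = name;
            this.width = leafWidth;
            this.height = leafHeight;
        }

        Node add(Node child) {
            child.parent = this;
            children.add(child);
            return this;
        }

        /** Leaves keep their reported size; internal nodes stack children top to bottom. */
        void format() {
            if (children.isEmpty()) {
                return;
            }
            int yCursor = 0;
            int maxWidth = 0;
            for (Node c : children) {
                c.format();
                c.x = 0;
                c.y = yCursor;
                yCursor += c.height;
                maxWidth = Math.max(maxWidth, c.width);
            }
            width = maxWidth;
            height = yCursor;
        }

        /** Absolute document y: the sum of relative offsets from this node up to the root. */
        int absoluteY() {
            int sum = 0;
            for (Node n = this; n != null; n = n.parent) {
                sum += n.y;
            }
            return sum;
        }
    }

    public static void main(String[] args) {
        Node p1 = new Node("p1", 40, 10);
        Node p2 = new Node("p2", 60, 12);
        Node p3 = new Node("p3", 50, 8);
        Node section1 = new Node("section1", 0, 0).add(p1).add(p2);
        Node section2 = new Node("section2", 0, 0).add(p3);
        Node root = new Node("root", 0, 0).add(section1).add(section2);
        root.format();
        System.out.println(root.width + "x" + root.height); // prints 60x30
        System.out.println(p3.y + " -> " + p3.absoluteY()); // prints 0 -> 22
    }
}
```

Storing only parent-relative positions means that moving a subtree changes one offset rather than every descendant, which is the efficiency argument the slide makes for the third coordinate system.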

18
Paint Protocol
  • System to user: Paint representation of content
    on screen within viewport
  • Printing: reformat and repaint on a different canvas
  • Incremental, for good performance with lenses,
    editing
  • In fact, Paint invokes Format of dirty nodes on
    demand
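
One way to picture "Paint invokes Format of dirty nodes on demand" is the sketch below. The names are hypothetical, and the choice to mark every ancestor dirty on an edit is an assumption made for this sketch, not something the slides state.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical names; a sketch of on-demand formatting during paint.
class LazyPaintSketch {
    static final class Node {
        final String name;
        final List<Node> children = new ArrayList<>();
        Node parent;
        boolean dirty = true; // new nodes have never been formatted
        int formatCount = 0;  // counts how often format() ran, for demonstration

        Node(String name) { this.name = name; }

        Node add(Node child) {
            child.parent = this;
            children.add(child);
            return this;
        }

        /** An edit invalidates this node and, in this sketch, all of its ancestors. */
        void markDirty() {
            for (Node n = this; n != null; n = n.parent) {
                n.dirty = true;
            }
        }

        void format() {
            formatCount++;
            dirty = false;
        }

        /** Formats only if needed, then records the node as painted and recurses. */
        void paint(List<String> canvas) {
            if (dirty) {
                format();
            }
            canvas.add(name);
            for (Node c : children) {
                c.paint(canvas);
            }
        }
    }

    public static void main(String[] args) {
        Node a = new Node("a");
        Node b = new Node("b");
        Node root = new Node("root").add(a).add(b);
        root.paint(new ArrayList<>()); // first paint formats everything
        root.paint(new ArrayList<>()); // nothing dirty, nothing reformatted
        a.markDirty();
        root.paint(new ArrayList<>()); // only a and root are reformatted
        System.out.println(root.formatCount + " " + a.formatCount + " " + b.formatCount); // prints 2 2 1
    }
}
```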

19
Events Protocol and Grabs
  • User to system: User mouse clicks and keypresses
    passed as events to system
  • Events distributed to behaviors according to
    declared interest within tree region
  • e.g., table sorting
  • Defaults for usual editing commands (as
    replaceable behavior)
  • Grabs - behavior gets future events directly
  • e.g., hyperlinks
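
The routing described above, interest by tree region plus grabs, might look like the following sketch. All names are hypothetical, regions are reduced to plain strings, and the hyperlink-like behavior that grabs on "down" and releases on "up" is an assumed example, not MVD's actual hyperlink code.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical names; regions are simplified to string labels.
class EventSketch {
    interface Behavior {
        void onEvent(String event, Dispatcher dispatcher, List<String> log);
    }

    static final class Dispatcher {
        private final Map<String, List<Behavior>> interest = new HashMap<>();
        private Behavior grabber; // when set, receives all events directly

        void declareInterest(String region, Behavior b) {
            interest.computeIfAbsent(region, k -> new ArrayList<>()).add(b);
        }

        void grab(Behavior b) { grabber = b; }
        void release() { grabber = null; }

        void dispatch(String event, String region, List<String> log) {
            if (grabber != null) {
                grabber.onEvent(event, this, log);
                return;
            }
            for (Behavior b : interest.getOrDefault(region, List.of())) {
                b.onEvent(event, this, log);
            }
        }
    }

    /** Hyperlink-like behavior: grabs on press, releases on the following "up". */
    static final class Link implements Behavior {
        public void onEvent(String event, Dispatcher d, List<String> log) {
            log.add("link:" + event);
            if (event.equals("down")) {
                d.grab(this);
            } else if (event.equals("up")) {
                d.release();
            }
        }
    }

    /** Table-sorting stand-in that only logs what it sees. */
    static final class Sorter implements Behavior {
        public void onEvent(String event, Dispatcher d, List<String> log) {
            log.add("sorter:" + event);
        }
    }

    static List<String> demo() {
        Dispatcher d = new Dispatcher();
        d.declareInterest("link", new Link());
        d.declareInterest("table", new Sorter());
        List<String> log = new ArrayList<>();
        d.dispatch("down", "link", log);  // routed by region; the link grabs
        d.dispatch("up", "table", log);   // goes to the grabber despite the region
        d.dispatch("down", "table", log); // grab released, normal routing resumes
        return log;
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints [link:down, link:up, sorter:down]
    }
}
```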

20
Save Protocol and Robust Locations
  • Internal to external: Save reconstitutable
    description to hub (tag + attributes)
  • Robust Locations
  • Documents change, but rely on registration of
    parts, especially for annotations
  • Redundant descriptions
  • ID - guaranteed correct when available
  • Tree Path - robust to insertion, deletion, change
    in hierarchy
  • Context - most flexible, but less reliable
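
The three redundant descriptions suggest a fallback chain when a saved location is reattached to a changed document. The sketch below is a simplified illustration with hypothetical names: the document is flattened to a list of word leaves, the tree path is matched exactly and accepted only if the word there is unchanged, and "context" is reduced to the word plus its successor. The real tree-path matching is described as more tolerant than this.

```java
import java.util.List;

// Hypothetical names and a deliberately simplified document model.
class RobustLocationSketch {
    static final class Leaf {
        final String id;   // may be null when the format provides no IDs
        final String path; // e.g. "0/PARA"
        final String word;

        Leaf(String id, String path, String word) {
            this.id = id;
            this.path = path;
            this.word = word;
        }
    }

    static final class Location {
        final String id, path, word, nextWord;

        Location(String id, String path, String word, String nextWord) {
            this.id = id;
            this.path = path;
            this.word = word;
            this.nextWord = nextWord;
        }
    }

    /** Returns the index of the leaf the location refers to, or -1 if it cannot be found. */
    static int resolve(Location loc, List<Leaf> leaves) {
        // 1. ID: trusted whenever it is available.
        if (loc.id != null) {
            for (int i = 0; i < leaves.size(); i++) {
                if (loc.id.equals(leaves.get(i).id)) return i;
            }
        }
        // 2. Tree path: accepted only if the word found there still matches.
        for (int i = 0; i < leaves.size(); i++) {
            Leaf l = leaves.get(i);
            if (l.path.equals(loc.path) && l.word.equals(loc.word)) return i;
        }
        // 3. Context: scan for the word followed by its recorded neighbor.
        for (int i = 0; i + 1 < leaves.size(); i++) {
            if (leaves.get(i).word.equals(loc.word)
                    && leaves.get(i + 1).word.equals(loc.nextWord)) return i;
        }
        return -1;
    }

    public static void main(String[] args) {
        Location saved = new Location(null, "0/PARA", "Stiglitz", "and");
        List<Leaf> edited = List.of(
                new Leaf(null, "0/PARA", "See"), // a word was inserted in front
                new Leaf(null, "1/PARA", "Stiglitz"),
                new Leaf(null, "2/PARA", "and"));
        System.out.println(resolve(saved, edited)); // prints 1 (found via context)
    }
}
```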

21
Clipboard Protocol and Behavior Interaction
  • System to other system: Iterate over
    corresponding media adapters in the selection,
    each contributing medium-specific representation
  • Chunky spans - step by largest wholly-contained
    subtrees in span
  • Before/After/Short-circuit (available on all
    protocols)
  • e.g., alternative select and paste for biblio

22
Means of Composition
  • Overall coordination by core framework protocols
    and behavior adherence
  • Side-effect on document tree (e.g., table
    sorting)
  • Before/After/Short-circuit (e.g., alternative
    select-and-paste)
  • Global and graphics context attributes (e.g.,
    current page number, view as image/OCR)
  • Namespaces of variables with ESIS values
  • Manager behaviors (e.g., lens, UI)

23
Packaged support for
  • Robust locations references
  • Spans (across structural boundaries)
  • Tree manipulations, traversal
  • Templates for lens
  • Style sheet-based and fixed-format layout

24
MVD Third Party Work
  • Printing, support for other OCR formats, by HP
  • Palm Pilots notes, PDF, ink (Francis Li)
  • Temporal-extent behaviors (Wojciech Matusik)
  • Japanese support by NEC; application to office
    document management
  • Chinese character and multilingual lens by UCB
    Instructional Support staff (Owen McGrath)

25
Supporting Services
  • Annotation Server (ByungHoon Kang)
  • Service allows annotation search and storage on
    DBMS.
  • MVD interface via a behavior.
  • Emailer (ByungHoon Kang)

26
GIS Viewer
  • MVD applied to geo-referenced data
  • GIS data comprise
  • geo-rectified images (raster data)
  • vectors (points, lines, polygons, etc.)
  • geo-positioned data items
  • Fits naturally into a layers and behaviors
    framework

27
GIS Viewer 3.0
  • Supported layer types
  • Raster: geo-rectified GIF, JPEG
  • Vectors: internal, DLG, ArcInfo Shape files
  • Grouping format/protocol: tilePix (data pyramid
    for raster and vector data)
  • Behaviors are
  • pan
  • zoom (with automatic projection transformation)
  • change projection
  • display context
  • display semi-transparently
  • spatial hyperlinks
  • user authoring for annotation
  • issue query

33
GIS Viewer Example: http://elib.cs.berkeley.edu/annotations/gis/buildings.html
34
GIS Viewer Plans
  • Harden and distribute
  • About to inter-operate with MS Terraserver for
    wide-scale coverage of US.
  • Possible future developments
  • support for OGIS/OGDI for interoperation with
    other formats, services
  • support for queries as first class objects
  • implementation by MVD proper

35
MVD Related Work
  • Integration vs Juxtaposition: OpenDoc, OLE,
    Quill, HTML with Java applets and plug-ins
  • Composition vs Shared Library: GNU Emacs
  • Deep Extension vs Scripting: Dynamic HTML,
    Microsoft Word, FrameMaker
  • Union vs Confederation: Microcosm, UNIX Guide,
    Firefly
  • Fundamental API vs Open Source: Netscape 5, Tk
    text widget
  • Cross-format vs Single Format: IDVI

36
MVD Challenges
  • The W3C standards set continues to grow and
    become more powerful.
  • Some (but not all) MVD-only functionality could
    be done by browsers implementing CSS2,
    ECMAScript, XLINK, XPOINTER, etc.
  • probably also requiring limited extensions
    (applets or plug-ins)
  • The MVD approach requires providing all this
    functionality within our framework.
  • Even providing good layout, robust HTML parsing
    is a lot of work.
  • Fortunately, most of the hard part has been done.
  • We (at least somewhat) ride the wave of Java
    improvements.
  • We will ultimately depend on a Linux-like network
    of developments to have a competitive open-source
    platform.

37
MVD Ongoing Developments
  • Support for more genres
  • Fixed image formats: PDF, other OCR wordboxes
  • Niche formats: LaTeX (alpha DVI adaptor exists)
  • XML, XSL, etc.
  • Multi-page documents
  • homogeneous, heterogeneous, internal, tours, ...
  • Re-do temporal data (video, sound)
  • Spatial data (3-D graphics, GIS?)
  • Still a few HTML features to add.

38
MVD Developments (cont)
  • Support for CSSn.
  • Proto-CSS1 exists
  • More annotation tools
  • E.g., move text
  • Work out multi-user annotation discipline
  • Annotation service issues
  • security, groups, self-administering documents
  • Harden, tune user interface, programmer's guide
  • See on-line behavior writer's guide.
  • Release to community, get feedback, iterate, use
    in DLIB project.
  • Hedge with MVD Lite?
  • MVD-like HTML annotation supported using CSS2,
    scripting