Genericity and flexibility in the NIR schema

1
Genericity and flexibility in the NIR schema
  • Fabio Vitali, University of Bologna

2
Genericity
  • Defined as the capacity of a document structure
    to correctly deal with an open set of document
    types,
      • adapting when possible,
      • extending when not possible
  • Ideally, a generic document structure allows the
    correct and precise description of document types
    that did not exist when the structure was
    invented
  • Genericity does not imply the lack of
    specificity, but the existence of a number of
    tricks to invent specificity when none existed in
    advance

3
Flexibility
  • Defined as the capacity of a document structure
    to correctly deal with an open set of uses of the
    documents,
      • describing when possible,
      • deducing when not possible
  • Ideally, a flexible document structure allows the
    correct and precise applicability of documents in
    situations and for uses that did not exist when
    the document was marked up.
  • Flexibility does not mean the lack of precise
    support for specific features, but the existence
    of tricks to invent precise support for specific
    features when none existed in advance

4
The size of the problem
  • Italian citizens are affected by thousands of
    norms expressed at three different levels
  • Local documents (organization, municipality,
    region)
  • National (Laws, decrees, etc.)
  • International (EU, treaties, etc.)
  • Together these account for more than 200 different
    types of documents, from Legge to Bando del
    Duce del Fascismo to Direttiva della Comunità
    Europea, etc.
  • Each has a different name, use and degree of
    success, but most share the same overall structure.
  • We have identified three structures
  • Strictly hierarchical (documento articolato)
  • Partially hierarchical (documento semiarticolato)
  • Without any visible structure (documento NIR)

5
Shared and peculiar vocabulary
  • One of the issues we have to face is that
    although the overall structure is the same, names
    and order of containment may vary concretely and
    frequently.
  • E.g.
  • Articolo, article, 1.
  • Part, title, book and head appear in different
    orders in different types of documents
  • Section only appears in some documents
  • Solutions
  • Lack of prescription (plenty for editors, none
    for authors)
  • Abundance of description (72 elements for norms,
    132 metadata elements, 30 xhtml elements for
    exceptions)
  • Generic elements (16 elements)

6
Applications
  • This is an incomplete list of applications (i.e.,
    uses) for a NIR document
  • Sophisticated on-screen display (open with respect
    to devices, operating systems, browsers, skins,
    additional info, user wishes)
  • Sophisticated print on paper (ditto)
  • Support for references (through hypertext links)
  • Reflection (provide management information about
    itself even outside of document management
    systems)
  • Support for consolidation (e.g., answering
    questions such as "What was the enacted text on
    2002/03/05?" or "Which modification norm, and
    when, caused the existence of this fragment?")
  • Sophisticated search (e.g., "all laws signed by
    Berlusconi")
  • Support for provisions (i.e., formal description
    of types and arguments of individual norm
    snippets).
  • from authoritative sources, professionals and
    amateurs.
  • Etc
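
The consolidation question above ("What was the enacted text on 2002/03/05?") amounts to a point-in-time lookup over the versions of a fragment. The following is only a minimal sketch of that idea, not the NIR data model: the `FragmentVersion` class, its fields and the sample history are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class FragmentVersion:
    text: str                   # wording of the fragment in this version
    valid_from: date            # day this wording entered into force
    modified_by: Optional[str]  # hypothetical id of the modifying norm; None for the original


def version_on(history: List[FragmentVersion], day: date) -> Optional[FragmentVersion]:
    """Return the version in force on `day`, or None if the fragment did not exist yet."""
    candidates = [v for v in history if v.valid_from <= day]
    return max(candidates, key=lambda v: v.valid_from, default=None)


# Invented sample data: one original wording and one later amendment.
history = [
    FragmentVersion("original wording", date(2001, 1, 15), None),
    FragmentVersion("amended wording", date(2002, 6, 1), "hypothetical-modifying-norm"),
]

in_force = version_on(history, date(2002, 3, 5))
```

The `modified_by` field answers the second question (which modification norm caused this fragment), and `valid_from` answers when.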

7
Design issues for NIR (1)
  • Data structure rather than application
  • Norme In Rete knows about applications, but is
    not dependent on any use of the data and is not
    specifically targeted towards any specific
    application (except presentation)
  • Rigorous distinction of roles
  • The author of a norm is the legislator, the
    provider of the actual XML document is the
    editor. The legislator is GOD (his decisions
    cannot be discussed), but He only speaks through
    the text of the norms.
  • The editor can add a large quantity of
    information, which has no official status. The
    very act of adding tags is an editorial operation,
    subjective and open to discussion.
  • In fact, any addition coming from editors
    (structure identification, notes, comments,
    interpretation) happens outside of the document
    content (in markup structures or in special
    metadata sections)
  • Nonetheless, the editor can and must provide as
    precise and specific markup as possible

8
Design issues for NIR (2)
  • Complexity of the access to texts
  • Many editors, many publishing systems, many
    copies in different stages of evolution
  • There is no authoritative source of XML documents
    (only of printed documents).
  • One web site could forget to update a law to
    the latest version
  • Using URNs makes it possible to refer to the text
    of a law without identifying a single existing
    authoritative source.
  • Support for description and prescription
  • Tagging of existing texts can only be descriptive
    (supporting any possible mess that the legislator
    may have put in)
  • Support for legal drafting can be provided,
    suggesting or enforcing legal drafting rules in
    the writing.

9
Design issues for NIR (3)
  • Everything has a reliable name
  • Every legal structure needs to be referenced and
    accessible.
  • References need to be unambiguous, universal,
    definitive.
  • URN for whole documents,
  • Mandatory id attributes for substructures and
    spans
  • XPointers for even smaller entities.
  • Specific support for multiple interpretations
  • Disposizioni (law provisions) can be identified
    and specified on the text.
  • Multiple different interpretations of the same
    text must be allowed
  • So they can be placed outside of the main
    document.
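
The naming layers above (a URN for the whole document, an id for a substructure) can be sketched as a two-step lookup. This is an illustration under stated assumptions: the URN string, the `#` separator between URN and id, and the element names in the sample document are all hypothetical and are not taken from the NIR specification.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document; element names are for illustration only.
DOC = """<law>
  <article id="art1"><num>1</num>
    <clause id="art1-com1">First clause.</clause>
    <clause id="art1-com2">Second clause.</clause>
  </article>
</law>"""


def split_reference(ref):
    """Split 'urn#id' into (urn, fragment id); the id is None when absent."""
    urn, _, frag = ref.partition("#")
    return urn, frag or None


def find_by_id(root, frag_id):
    """Return the first element whose id attribute equals frag_id, else None."""
    for el in root.iter():
        if el.get("id") == frag_id:
            return el
    return None


# The URN below is an invented example, not a real NIR name.
urn, frag = split_reference("urn:nir:example:law:2000-01-01;1#art1-com2")
root = ET.fromstring(DOC)
target = find_by_id(root, frag)
```

Because the URN names the law rather than a location, any copy of the document can serve the lookup; the id then pins down the substructure inside whichever copy was retrieved.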

10
Design issues for NIR (6)
  • Clean separation between objective properties and
    interpretation
  • Objective properties can be marked by low-level
    editors, while interpretation requires experts
    and high-level editors.
  • Objective (manifest) properties include
    identification of boundaries (articles, clauses,
    etc.) and official facts about texts (publication
    dates, etc.)
  • Interpretation includes identification of dates,
    identification of the normative content of the
    texts (provisions), application of modifications.
  • Objective properties need to be added when
    marking the document rather than later on (more
    expensive). Subjective properties can be added at
    any time.

11
Flexibility through variability
  • High description, low prescription level
  • Mostly no constraint on the content, much more on
    the metacontent
  • Systematic extensions for local purposes
  • Clear distinction between
  • Mandatory structures (very few)
  • Recommended structures
  • Optional structures
  • Extension model

12
Generic elements
  • Most NormeInRete elements are organized into four
    categories (or, rather, design patterns)
  • Containers (hierarchies and separators)
  • Blocks (containers of text with vertical
    arrangement)
  • Inline (containers of text with horizontal
    arrangement)
  • Properties (atomic values outside of the main
    document flow. E.g., metadata, signatures, etc.)
  • Each category also provides a generic element
    that can be used whenever there is no specific
    element of that category to be used. The name
    attribute supplies the detail for the generic
    element
  • <nir:part> = <nir:container name="part">
  • <nir:block type="foobar"> = <nir:foobar> with
    block content model
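
The equivalence between a specific element and its generic counterpart can be sketched as a rewrite from one form to the other. The `CATEGORY` table below is a hypothetical, partial mapping made up for this example, the namespace prefix is omitted for brevity, and the `name` attribute follows the wording of the slide text.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping from specific element names to the generic element
# of their category; the real schema defines far more elements.
CATEGORY = {"part": "container", "book": "container", "emphasis": "inline"}


def to_generic(root):
    """Rewrite known specific elements in place as <generic name="specific">."""
    for node in root.iter():
        generic = CATEGORY.get(node.tag)
        if generic is not None:
            node.set("name", node.tag)  # keep the description in the name attribute
            node.tag = generic
    return root


doc = to_generic(ET.fromstring("<part><num>I</num><emphasis>text</emphasis></part>"))
```

Nothing is lost in the rewrite: the specific name survives in the attribute, so the reverse transformation is equally mechanical.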

13
Genericity and description
  • The risk is of using generic elements ignoring
    description completely.
  • Wrong
  • Description costs little
  • Description allows identification (given two
    containers without name attribute, are they the
    same type of element, or are they two different
    containers?)
  • Description does not have to be subjected to
    policies, but if any exist, they can be enforced.
  • In NIR the name attribute is mandatory for all
    generic elements, and an equivalence exists
    between a named element and the generic
    element of the same category with the appropriate
    name attribute
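
The identification argument above can be made concrete with a small check. This sketch assumes only three of the generic element names (container, block, inline) and omits namespaces; the function names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Assumed subset of the generic element names, for illustration only.
GENERIC = {"container", "block", "inline"}


def unnamed_generics(root):
    """List the tags of generic elements that lack a (non-empty) name attribute."""
    return [el.tag for el in root.iter() if el.tag in GENERIC and not el.get("name")]


def same_type(a, b):
    """Two generic elements count as the same type only if tag and name both match."""
    return a.tag == b.tag and a.get("name") is not None and a.get("name") == b.get("name")


root = ET.fromstring(
    '<doc><container name="part"/><container/><container name="part"/></doc>'
)
first, unnamed, third = list(root)
```

Without the name attribute, `same_type` has nothing to compare, which is exactly the ambiguity the mandatory attribute removes.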

14
The Schemas for NIR documents
  • 3 different DTDs
  • Strict rules (prescriptive: legal drafting)
  • Loose rules (descriptive: existing norms)
  • Light rules (support for most common cases:
    simple, everyday norms)
  • They are intercompatible
  • The vocabulary is exactly the same
  • All light documents are also loose
  • All strict documents are also loose

15
The overall structure of the NIR DTD
[Figure: module diagram of the NIR DTDs. Labels shown: light.dtd, strict.dtd, loose.dtd, globals.dtd, text.dtd, norms.dtd, meta.dtd, proprietary.dtd, and the ISO entity sets isodia.pen, isogrk3.pen, isolat1.pen, isolat2.pen, isonum.pen, isopub.pen, isotech.pen.]

16
An example of descriptive markup
  • NIR Complete
  • <!ELEMENT book (num, heading?, (part | title | head |
    article))>
  • NIR Flexible
  • <!ELEMENT book (num?, heading?,
    (part | title | head | section | paragraph | article |
    partition | container))>
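
The difference between the two declarations of book can be sketched as two small content models and one acceptance check. This is a simplification under stated assumptions: the transcript does not show occurrence indicators on the choice group, so the check ignores how many children may appear; the dictionary layout and the function name are hypothetical.

```python
# Child element names taken from the two declarations of <book>.
COMPLETE = {
    "num_required": True,
    "children": {"part", "title", "head", "article"},
}
FLEXIBLE = {
    "num_required": False,
    "children": {"part", "title", "head", "section", "paragraph",
                 "article", "partition", "container"},
}


def accepts(model, child_names):
    """Check a sequence of child names: optional/required num, optional heading, then allowed children."""
    names = list(child_names)
    if names and names[0] == "num":
        names = names[1:]
    elif model["num_required"]:
        return False
    if names and names[0] == "heading":
        names = names[1:]
    # Repetition counts are deliberately not checked (see the assumption above).
    return all(n in model["children"] for n in names)


sample = ["heading", "section", "article"]
```

Here `sample` has no num and contains a section, so only the flexible model accepts it; anything the complete model accepts is also accepted by the flexible one, because its required num is optional there and its child set is a subset.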

17
An example of generic markup
  • <!ELEMENT hierarchy (l1 | l2 | l3 | l4 | l5 | l6)>
  • <!ELEMENT l1 (num?, tit?, (block | l2 | l3 | l4 | l5 |
    l6))>
  • <!ELEMENT l2 (num?, tit?, (block | l3 | l4 | l5 | l6))>
  • <!ELEMENT l3 (num?, tit?, (block | l4 | l5 | l6))>
  • <!ELEMENT l4 (num?, tit?, (block | l5 | l6))>
  • <!ELEMENT l5 (num?, tit?, (block | l6))>
  • <!ELEMENT l6 (num?, tit?, (block))>
  • All generic elements have a mandatory name
    attribute.
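
The content models for l1 to l6 share one rule: a level may contain only strictly deeper levels (or block). A minimal sketch of that rule as a check over a parsed document follows; the function names and the sample documents are hypothetical, and the check covers level nesting only, not num, tit or the name attribute.

```python
import re
import xml.etree.ElementTree as ET

LEVEL = re.compile(r"l([1-6])$")


def level_of(tag):
    """Return 1..6 for the generic level elements l1..l6, else None."""
    m = LEVEL.match(tag)
    return int(m.group(1)) if m else None


def nesting_errors(root):
    """Collect (parent, child) tag pairs where a level contains an equal or shallower level."""
    errors = []
    for parent in root.iter():
        p = level_of(parent.tag)
        if p is None:
            continue
        for child in parent:
            c = level_of(child.tag)
            if c is not None and c <= p:
                errors.append((parent.tag, child.tag))
    return errors


good = ET.fromstring(
    '<hierarchy><l1 name="book"><l3 name="article"><block name="p"/></l3></l1></hierarchy>'
)
bad = ET.fromstring('<hierarchy><l2 name="a"><l1 name="b"/></l2></hierarchy>')
```

Note that levels may be skipped (l1 directly containing l3 is fine), which is what lets one generic hierarchy describe documents whose containers appear in different orders.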

18
Examples of extensibility
  • Editor footnotes
  • Editor inline notes
  • Global vs. proprietary metadata
  • Additional arguments to provisions
  • xhtml elements for typographical properties
    without semantic justification (e.g., bold for
    emphasis)
  • <span class="foobar"> and <inline name="foobar">
    for specifying the inline element foobar
  • <div class="foobar"> and <block name="foobar">
    for specifying the block element foobar

19
Conclusions
  • Genericity and flexibility do not need to come
    at the expense of detail and appropriate
    description.
  • The temptation to choose the easiest road is strong.
  • It must be resisted. Markup time is the best time
    to record all available information about a
    document, information that in the future could be
    hard to find or to associate with the content
  • The document should come out of the markup
    session as complete and rich in information as
    possible.
  • The genericity and flexibility mechanisms of Norme
    In Rete should help with that.