Xerces2: The Sequel With No Equal

Description:

ApacheCon US, Las Vegas, Nevada, 20 November 2002. Xerces2: The Sequel With No Equal, presented by Andy Clark.

Slides: 28

Provided by: AndyC69

Learn more at: http://people.apache.org

Transcript and Presenter's Notes

1
Xerces2: The Sequel With No Equal

Andy Clark

2
Introduction

Speaker
Worked for IBM
Currently unemployed
Parser
First developed in IBM's Tokyo research lab
Maintained and expanded in California
Donated to Apache
Work continues in Toronto

3
Agenda

Xerces1 Overview
Design and problems
Xerces2 Overview
Challenges and design
Q&A

4
Xerces1 Overview: Design and Problems

5
Design

XML4J/Xerces1 designed for performance
Parser Implementation
Parsing pipeline
Custom reader implementations
StringPool
Defers transcoding of byte buffers until needed
Symbol table for common document strings
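
The symbol-table idea on this slide can be illustrated with a minimal sketch. This is not the Xerces1 StringPool; the class and method names (SymbolTableSketch, addSymbol) are hypothetical, and the only point shown is that repeated names in a document resolve to one shared String instance, so later comparisons can use reference equality.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch, not the Xerces1 StringPool: returns one canonical
// String instance per distinct character sequence seen in the document.
class SymbolTableSketch {
    private final Map<String, String> symbols = new HashMap<>();

    // Looks up the characters in buffer[offset, offset + length) and returns
    // the canonical instance, adding it on first sight. A production table
    // would hash the char[] directly to avoid the temporary String.
    String addSymbol(char[] buffer, int offset, int length) {
        String candidate = new String(buffer, offset, length);
        String existing = symbols.putIfAbsent(candidate, candidate);
        return existing != null ? existing : candidate;
    }

    public static void main(String[] args) {
        SymbolTableSketch table = new SymbolTableSketch();
        char[] doc = "<item><item>".toCharArray();
        String first = table.addSymbol(doc, 1, 4);   // first "item"
        String second = table.addSymbol(doc, 7, 4);  // second "item"
        System.out.println(first == second);         // same instance: true
    }
}
```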

6
Pipeline Configuration

Intended to be generic

[Diagram: XML → Scanner → Validator → Parser → API]
7
Pipeline Configuration Problems

Hard-coded dependencies on implementation
Inconsistent Interfaces

[Diagram: the XML-to-API pipeline, annotated "Dependency" and "Different Interfaces"]
8
Custom Readers
9
Custom Readers Problems

Duplicated code
Allows more bugs to appear
Bugs differ by encoding because the code is not shared
More complicated

10
Deferred Transcoding
11
Deferred Transcoding Problems

All components need reference to StringPool
Strings not immediately available to methods
Must make call to StringPool to query String
Memory management is complicated
Responsibility of callee to free resources
Uses more memory
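
The problems listed above can be seen in a small sketch of a handle-based pool. Everything here is hypothetical (DeferredPoolSketch, ElementNameCollector, the fixed UTF-8 encoding); it is not the Xerces1 code. It only shows why each component needs a pool reference, why a string costs an extra call, and why the receiver must remember to release the handle.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

class DeferredPoolDemo {
    public static void main(String[] args) {
        DeferredPoolSketch pool = new DeferredPoolSketch();
        int handle = pool.addBytes("root".getBytes(StandardCharsets.UTF_8));
        ElementNameCollector collector = new ElementNameCollector(pool);
        collector.startElement(handle);
        System.out.println(collector.names);
    }
}

// Hypothetical pool: stores raw bytes and transcodes only when asked.
class DeferredPoolSketch {
    private final List<byte[]> pending = new ArrayList<>();

    // Stores the bytes untranscoded and returns an integer handle.
    int addBytes(byte[] raw) {
        pending.add(raw);
        return pending.size() - 1;
    }

    // Transcoding happens here, on demand (UTF-8 is an assumption of this sketch).
    String resolve(int handle) {
        return new String(pending.get(handle), StandardCharsets.UTF_8);
    }

    // The receiver of a handle is responsible for freeing it.
    void release(int handle) {
        pending.set(handle, null);
    }

    boolean isReleased(int handle) {
        return pending.get(handle) == null;
    }
}

// Hypothetical component: it cannot do anything with a handle without the pool.
class ElementNameCollector {
    final List<String> names = new ArrayList<>();
    private final DeferredPoolSketch pool;

    ElementNameCollector(DeferredPoolSketch pool) {
        this.pool = pool;
    }

    void startElement(int nameHandle) {
        names.add(pool.resolve(nameHandle)); // extra call to obtain the String
        pool.release(nameHandle);            // forgetting this leaks the entry
    }
}
```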

12
Xerces2 Overview: Challenges and Design

13
Challenges

Requirements
Simple design and implementation
Easy to maintain
More modularity and configurability
Support current and future features
Design Decisions
Always transcode bytes into Unicode characters
Removes StringPool and dependencies
Clean architecture
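
The "always transcode" decision can be sketched with the standard java.io classes. This is only an illustration of the idea, not the Xerces2 reader code: the class name EagerTranscodingSketch and the readAll helper are hypothetical. Bytes are decoded to characters once, at the input boundary, so nothing downstream needs a pool or a handle.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch: decode bytes to Unicode characters up front.
class EagerTranscodingSketch {
    // Reads the whole input through a decoding Reader and returns the text.
    static String readAll(byte[] bytes, Charset encoding) {
        StringBuilder out = new StringBuilder();
        try (Reader reader = new InputStreamReader(new ByteArrayInputStream(bytes), encoding)) {
            char[] buffer = new char[1024];
            int count;
            while ((count = reader.read(buffer)) != -1) {
                out.append(buffer, 0, count);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        byte[] utf16 = "<doc/>".getBytes(StandardCharsets.UTF_16BE);
        // Downstream code sees ordinary characters, whatever the source encoding.
        System.out.println(readAll(utf16, StandardCharsets.UTF_16BE));
    }
}
```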

14
Xerces Native Interface (XNI)

Streaming Information Set
Similar to SAX
No loss of document information
Parser configuration and layering
Future extensions
Native pull-parser, tree model, etc.
Does not preserve all document information, but communicates more information to the application than DOM or SAX.

16
Parsing Pipeline

Handlers communicate information between parser components
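
A handler-based pipeline of this kind can be sketched as follows. The interface below is a deliberately simplified, hypothetical stand-in (DocumentHandlerSketch); the real XNI handler interfaces carry richer arguments and more callbacks. The sketch only shows the shape: each stage implements the handler interface and forwards events to the next stage.

```java
import java.util.ArrayList;
import java.util.List;

class PipelineSketch {
    public static void main(String[] args) {
        EventRecorder recorder = new EventRecorder();
        // Source -> WhitespaceFilter -> EventRecorder
        DocumentHandlerSketch pipeline = new WhitespaceFilter(recorder);
        pipeline.startElement("doc");
        pipeline.characters("   ");
        pipeline.characters("hello");
        pipeline.endElement("doc");
        System.out.println(recorder.events); // [start:doc, chars:hello, end:doc]
    }
}

// Hypothetical, simplified handler interface (not the XNI signatures).
interface DocumentHandlerSketch {
    void startElement(String name);
    void characters(String text);
    void endElement(String name);
}

// A middle stage: drops whitespace-only text, forwards everything else.
class WhitespaceFilter implements DocumentHandlerSketch {
    private final DocumentHandlerSketch next;

    WhitespaceFilter(DocumentHandlerSketch next) {
        this.next = next;
    }

    public void startElement(String name) { next.startElement(name); }

    public void characters(String text) {
        if (!text.trim().isEmpty()) {
            next.characters(text);
        }
    }

    public void endElement(String name) { next.endElement(name); }
}

// The final stage: records the events it receives.
class EventRecorder implements DocumentHandlerSketch {
    final List<String> events = new ArrayList<>();

    public void startElement(String name) { events.add("start:" + name); }
    public void characters(String text) { events.add("chars:" + text); }
    public void endElement(String name) { events.add("end:" + name); }
}
```

Because every stage speaks the same interface, a stage can be inserted or removed without changing its neighbours.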

17
Handler Overview
XMLDocumentHandler
XMLDTDHandler
XMLDTDContentModelHandler
[Diagram: handlers connecting the pipeline from XML to API]
18
Parser Layout

Components and Manager

Component Manager
Regular Components
19
Reader Management
20
Parser Configuration

Before

Parser pipeline is part of the document parser base class.
Required duplication to re-configure parser and still take advantage of API generator code.
21
Parser Configuration

After

Parser pipeline and settings are specified in a separate parser configuration object.
Allows re-use of framework without rewriting existing code.
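
The "after" arrangement can be sketched like this. All names are hypothetical (ParserConfigurationSketch, DocumentParserSketch, and the two configurations), and this is not the Xerces2 API. The point is only that the parser class stays the same while the pipeline is swapped by passing a different configuration object.

```java
import java.util.Arrays;
import java.util.List;

class ConfigurationDemo {
    public static void main(String[] args) {
        // Same parser class, two different pipelines.
        System.out.println(new DocumentParserSketch(new ValidatingConfiguration()).describe());
        System.out.println(new DocumentParserSketch(new NonValidatingConfiguration()).describe());
    }
}

// Hypothetical configuration: owns the pipeline layout.
interface ParserConfigurationSketch {
    List<String> pipelineStages();
}

class ValidatingConfiguration implements ParserConfigurationSketch {
    public List<String> pipelineStages() {
        return Arrays.asList("scanner", "validator");
    }
}

class NonValidatingConfiguration implements ParserConfigurationSketch {
    public List<String> pipelineStages() {
        return Arrays.asList("scanner");
    }
}

// Hypothetical parser: the API-generating code is written once and
// works with whatever pipeline the configuration supplies.
class DocumentParserSketch {
    private final ParserConfigurationSketch configuration;

    DocumentParserSketch(ParserConfigurationSketch configuration) {
        this.configuration = configuration;
    }

    String describe() {
        return String.join(" -> ", configuration.pipelineStages()) + " -> API";
    }
}
```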
22
API Generators