Title: D
1 FRE 2645
Unformatting SVG Documents: Application to Graphic Document Indexing
Master's training by Norolala Ramangaseheno
Tutors: Eric Trupin, Tony Pridmore
Date: 2005-09-06
2 Unformatting SVG Documents
- Presentation of the Project
  - Environment of Work
  - Introducing the Subject
  - Unformat Process Applied to Graphics Indexing
- Contributions
- Applications
PSI 2005 N. Ramangaseheno
3 Environment of Work
- Laboratories
  - DI (Document Interaction) team of the PSI (Perception, Systèmes, Information) laboratory, University of Rouen, France
  - IPI (Image Processing and Interpretation) research group of SCSIT (School of Computer Science and Information Technology), University of Nottingham, England
- Tutors
  - Eric Trupin, Directeur de Recherche (research director) within the PSI
  - Tony Pridmore, Senior Lecturer, member of the IPI research group
- Collaborations
  - Mathieu Delalandre, post-doc within the IPI group
  - Karim Zouba, master's trainee within the PSI
- Integrated project: Indexation de graphiques vectoriels (vector graphics indexing)
- Programming language: Java
5 Vector Graphics
- Common vector formats
  - AI (Adobe Illustrator)
  - SVG (Scalable Vector Graphics)
  - WMF (Windows Metafile)
  - EPS (Encapsulated PostScript)
  - DXF (AutoCAD)
6 Image Indexing
Graphics types
- raster images or bitmaps
- vector images
Indexing: automatic extraction of informative characteristics from multimedia containers to aid retrieval and browsing through large databases.
Architecture of an image indexing system
Graphic document
Conception of visual descriptors
Image signature
Measure of similarity
Image database
Classification
8 Diagram of the Indexing System
Generator of synthetic documents
Vector graphic
Unformat process
Low-level representation
Analyzer
High-level representation
9 Problem with Unformatting
Why unformat before analysing?
10 Problem with Unformatting
acquiring
modelling
filtering
11 Unformatting SVG Documents
- Presentation of the Project
- Contributions
  - Diagram of the Unformat System
  - The SVG Format
  - Parsing
  - Modelling
  - Filtering
  - Intersection Search
- Applications
12 Diagram of the Unformat System
parsing
filtering
intersection search
modelling
14 The SVG Format
- Norm and advantages
  - W3C norm for describing 2D graphics => open standard
  - Growing format => growing number of viewers and users
  - Vector description of graphical objects => scalability
  - Based on XML (described by a DTD), compatible with XLink, XPointer, CSS/XSL, and SMIL (animation language) => textual, separation between semantics and presentation
  - Scripts and animations triggered by associated events => interactive
- Drawback
  - Lack of realism
- Suited for
  - Interactive geographic maps
  - Technical drawings
  - XML accounts
15 Structure of an SVG Document
Example
<?xml version="1.0" encoding="iso-8859-1" standalone="no"?>
<!DOCTYPE svg PUBLIC "-//W3C//DTD SVG 1.0//EN" "http://www.w3.org/TR/2001/REC-SVG-20010904/DTD/svg10.dtd">
<svg width="5cm" height="4cm">
  <desc>A nice rectangle</desc>
  <rect x="3cm" y="0.5cm" width="1.5cm" height="2cm"/>
</svg>
SVG tag                        Corresponding declaration
<svg>                          SVG document
<g>                            group of objects
<symbol>                       geometrical shape
<text>, <tspan>, or <tref>     text
<image>                        image
<defs>                         definition of links
<use>                          link towards an internal graphical object
16 Geometrical Shapes
- Common shapes
  - Ellipse: <ellipse cx="400" cy="300" rx="72" ry="50" />
  - Rectangle: <rect x="150" y="50" width="135" height="100" />
  - Circle: <circle cx="70" cy="100" r="50" />
  - Line: <line x1="375" y1="50" x2="425" y2="150" />
  - Polyline: <polyline points="50,250,75,350,100,250,125,350" />
  - Polygon: <polygon points="250,250,297,284,279,340,220,340" />
- Complex shape
  - Path: <path d="M 50 250 L 100 250 L 150 300"/>
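As a hedged illustration of how polyline and polygon shapes could be reduced to plain line segments before filtering, the sketch below splits a `points` attribute into consecutive segments. The class `PointsToSegments`, the method `toSegments`, and the `{xb, yb, xe, ye}` array layout are hypothetical names chosen for this example; the slides do not describe how the presented system performs this step.

```java
import java.util.ArrayList;
import java.util.List;

class PointsToSegments {
    /**
     * Splits an SVG "points" attribute into segments {xb, yb, xe, ye}.
     * When closed is true (polygon), the last point is linked back to the first.
     * Hypothetical helper, not taken from the presented system.
     */
    static List<double[]> toSegments(String points, boolean closed) {
        // Coordinates may be separated by commas and/or whitespace.
        String[] tokens = points.trim().split("[\\s,]+");
        if (tokens.length % 2 != 0) {
            throw new IllegalArgumentException("odd number of coordinates");
        }
        int n = tokens.length / 2;
        double[] xs = new double[n];
        double[] ys = new double[n];
        for (int i = 0; i < n; i++) {
            xs[i] = Double.parseDouble(tokens[2 * i]);
            ys[i] = Double.parseDouble(tokens[2 * i + 1]);
        }
        List<double[]> segments = new ArrayList<>();
        for (int i = 0; i + 1 < n; i++) {
            segments.add(new double[] {xs[i], ys[i], xs[i + 1], ys[i + 1]});
        }
        if (closed && n > 2) {
            segments.add(new double[] {xs[n - 1], ys[n - 1], xs[0], ys[0]});
        }
        return segments;
    }
}
```

Calling `PointsToSegments.toSegments("50,250,75,350,100,250,125,350", false)` on the polyline above yields three segments; passing `true` for the polygon adds the closing segment.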
18 Parsing
- Any XML handling needs a parser
  - a parser is a syntactic analyzer; it sits between the XML file and the application
- A parser can be used
  - from a program (script, Java, C)
  - from a browser
- SAX, an event-driven parser
  - handler methods are called on specific events
  - the file is analyzed sequentially before being transmitted to the application
Handler: startDocument(), startElement(), endElement(), endDocument()
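The event-driven style described above can be sketched with the standard Java SAX API (`javax.xml.parsers` and `org.xml.sax`). This is a minimal illustration, not the parser of the presented system: the class name `SvgLineCollector`, the method `collect`, and the choice to keep only `<line>` elements with unitless coordinates are assumptions made for the example.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

class SvgLineCollector extends DefaultHandler {
    /** Collected lines, each stored as {x1, y1, x2, y2}. */
    final List<double[]> lines = new ArrayList<>();

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        // Called by the parser each time an opening tag is read.
        // Assumption: all four attributes are present and unitless.
        if ("line".equals(qName)) {
            lines.add(new double[] {
                Double.parseDouble(atts.getValue("x1")),
                Double.parseDouble(atts.getValue("y1")),
                Double.parseDouble(atts.getValue("x2")),
                Double.parseDouble(atts.getValue("y2"))
            });
        }
    }

    /** Parses an SVG string sequentially and returns the lines found. */
    static List<double[]> collect(String svg) {
        SvgLineCollector handler = new SvgLineCollector();
        try {
            SAXParserFactory.newInstance().newSAXParser().parse(
                new ByteArrayInputStream(svg.getBytes(StandardCharsets.UTF_8)), handler);
        } catch (Exception e) {
            throw new IllegalStateException("SVG parsing failed", e);
        }
        return handler.lines;
    }
}
```

The handler never sees the whole document at once, which matches the sequential analysis noted above. Input that carries a DOCTYPE may cause the default parser to try to load the referenced DTD, so the example is best tried on fragments without one.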
21 Graphical Objects Modelling
- GOMLib (Delalandre 2004)
  - Graphical Objects Modelling Library
  - XML and SVG export
  - Multi-model: different representations possible
23 Filtering
Filtering is needed to enforce the rule: one point of the 2D plane has one single representation.
24 Preliminary Tests (1/2)
- Given two lines L1 and L2
  - b1(xb1, yb1): begin point of L1; b2(xb2, yb2): begin point of L2
  - e1(xe1, ye1): end point of L1; e2(xe2, ye2): end point of L2
- L1 isEqual L2: L1 and L2 are equal if
  - xb1 = xb2, yb1 = yb2, xe1 = xe2, ye1 = ye2
- L1 isParallel L2: L1 and L2 are parallel if
  - (xe1 - xb1) * (ye2 - yb2) - (ye1 - yb1) * (xe2 - xb2) = 0
- L1 isColinear p: a point p(x, y) is colinear to L1 if
  - y = t * x + o
  - with t = (ye1 - yb1) / (xe1 - xb1) and o = yb1 - t * xb1 (defined when xe1 != xb1, i.e. L1 is not vertical)
- L1 isColinear L2: L2 is colinear to L1 if
  - L1 isColinear b2 and L1 isColinear e2
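The tests above can be sketched in Java as follows. This is an illustrative sketch under stated assumptions: the class name `LineTests`, the `{xb, yb, xe, ye}` array layout, and the tolerance `EPS` are hypothetical. The colinearity test uses a cross product instead of the slope form `y = t * x + o`, a substitution made here so that vertical lines need no special case.

```java
class LineTests {
    /** Tolerance for floating-point comparisons (assumed value). */
    static final double EPS = 1e-9;

    /** L1 isEqual L2: same begin point and same end point. */
    static boolean isEqual(double[] l1, double[] l2) {
        for (int i = 0; i < 4; i++) {
            if (Math.abs(l1[i] - l2[i]) > EPS) {
                return false;
            }
        }
        return true;
    }

    /** L1 isParallel L2: the cross product of the direction vectors is zero. */
    static boolean isParallel(double[] l1, double[] l2) {
        double cross = (l1[2] - l1[0]) * (l2[3] - l2[1])
                     - (l1[3] - l1[1]) * (l2[2] - l2[0]);
        return Math.abs(cross) < EPS;
    }

    /** L1 isColinear p: p lies on the infinite line through L1. */
    static boolean isColinear(double[] l1, double x, double y) {
        double cross = (l1[2] - l1[0]) * (y - l1[1])
                     - (l1[3] - l1[1]) * (x - l1[0]);
        return Math.abs(cross) < EPS;
    }

    /** L1 isColinear L2: both end points of L2 are colinear to L1. */
    static boolean isColinear(double[] l1, double[] l2) {
        return isColinear(l1, l2[0], l2[1]) && isColinear(l1, l2[2], l2[3]);
    }
}
```

For non-vertical lines the cross-product test agrees with the slope form: substituting `y = t * x + o` into the cross product gives zero.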
25 Preliminary Tests (2/2)
- L1 overlaps p: L1 overlaps a point p(x, y) if
  - ((x - xb1) * (x - xe1)) < 0 or ((y - yb1) * (y - ye1)) < 0
- L1 overlaps L2: L1 overlaps L2 if
  - L1 overlaps(b2) or L1 overlaps(e2)
- L1 isConnected p: L1 is connected to the point p(x, y) if
  - b1 = p
  - or e1 = p
- L1 isConnected L2: L1 is connected to L2 if
  - L1 isConnected b2
  - or L1 isConnected e2
Figure: L1 is connected to L2
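A minimal Java sketch of the overlap and connection tests follows. The class name `OverlapTests`, the `{xb, yb, xe, ye}` array layout, and the tolerance `EPS` are assumptions made for the example. As on the slide, `overlaps` only compares coordinate ranges, so it is meaningful when the point is already known to be colinear to the line.

```java
class OverlapTests {
    /** Tolerance for point equality (assumed value). */
    static final double EPS = 1e-9;

    /** L1 overlaps p: p falls strictly between the end points of L1 on x or on y. */
    static boolean overlaps(double[] l1, double x, double y) {
        return (x - l1[0]) * (x - l1[2]) < 0
            || (y - l1[1]) * (y - l1[3]) < 0;
    }

    /** L1 overlaps L2: L1 overlaps the begin point or the end point of L2. */
    static boolean overlaps(double[] l1, double[] l2) {
        return overlaps(l1, l2[0], l2[1]) || overlaps(l1, l2[2], l2[3]);
    }

    /** L1 isConnected p: p coincides with the begin or end point of L1. */
    static boolean isConnected(double[] l1, double x, double y) {
        return samePoint(l1[0], l1[1], x, y) || samePoint(l1[2], l1[3], x, y);
    }

    /** L1 isConnected L2: L1 is connected to the begin or end point of L2. */
    static boolean isConnected(double[] l1, double[] l2) {
        return isConnected(l1, l2[0], l2[1]) || isConnected(l1, l2[2], l2[3]);
    }

    private static boolean samePoint(double x1, double y1, double x2, double y2) {
        return Math.abs(x1 - x2) < EPS && Math.abs(y1 - y2) < EPS;
    }
}
```

Because the inequality is strict, an end point of L1 is connected to L1 but not overlapped by it, which keeps the two tests disjoint.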
26 Filtering Tests (1/3)
- L1 sameAs L2: L1 and L2 are the same if
  - L1 isEqual L2
  - or xb1 = xe2, yb1 = ye2, xe1 = xb2, ye1 = yb2 (same segment with swapped end points)
  - in this case, line L2 is filtered (erased)
Figure: L1 same as L2
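The duplicate filter can be sketched as below, assuming lines are stored as `{xb, yb, xe, ye}` arrays. The names `DuplicateFilter`, `sameAs`, and `filter`, and the tolerance `EPS`, are hypothetical; the slides give the test but not the surrounding loop, so the pairwise scan is one plausible arrangement rather than the author's code.

```java
import java.util.ArrayList;
import java.util.List;

class DuplicateFilter {
    /** Tolerance for coordinate equality (assumed value). */
    static final double EPS = 1e-9;

    /** L1 sameAs L2: equal, or equal after swapping the end points of L2. */
    static boolean sameAs(double[] l1, double[] l2) {
        return matches(l1, l2[0], l2[1], l2[2], l2[3])
            || matches(l1, l2[2], l2[3], l2[0], l2[1]);
    }

    private static boolean matches(double[] l, double xb, double yb, double xe, double ye) {
        return Math.abs(l[0] - xb) < EPS && Math.abs(l[1] - yb) < EPS
            && Math.abs(l[2] - xe) < EPS && Math.abs(l[3] - ye) < EPS;
    }

    /** Keeps the first occurrence of each line and erases later duplicates. */
    static List<double[]> filter(List<double[]> lines) {
        List<double[]> kept = new ArrayList<>();
        for (double[] candidate : lines) {
            boolean duplicate = false;
            for (double[] k : kept) {
                if (sameAs(k, candidate)) {
                    duplicate = true;
                    break;
                }
            }
            if (!duplicate) {
                kept.add(candidate);
            }
        }
        return kept;
    }
}
```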
27 Filtering Tests (2/3)
- L1 includes L2: L1 includes L2
  - case (a): L2 totally included inside L1
    - if L1 isColinear L2
    - and L1 overlaps b2
    - and L1 overlaps e2
  - case (b): L2 included inside and connected to L1
    - or L1 isConnected L2
    - and L1 isParallel L2
    - and (L1 overlaps b2 or L1 overlaps e2)
Figure: L1 includes L2, cases (a) and (b)
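The inclusion test combines the preliminary tests, and a self-contained sketch might look like the following. The class name `IncludesTest`, the `{xb, yb, xe, ye}` layout, and the tolerance `EPS` are assumptions; colinearity is checked with a cross product rather than the slope form, a substitution made here to cover vertical lines.

```java
class IncludesTest {
    /** Tolerance for floating-point comparisons (assumed value). */
    static final double EPS = 1e-9;

    static boolean isColinear(double[] l, double x, double y) {
        return Math.abs((l[2] - l[0]) * (y - l[1]) - (l[3] - l[1]) * (x - l[0])) < EPS;
    }

    static boolean isParallel(double[] l1, double[] l2) {
        return Math.abs((l1[2] - l1[0]) * (l2[3] - l2[1])
                      - (l1[3] - l1[1]) * (l2[2] - l2[0])) < EPS;
    }

    static boolean overlaps(double[] l, double x, double y) {
        return (x - l[0]) * (x - l[2]) < 0 || (y - l[1]) * (y - l[3]) < 0;
    }

    static boolean isConnected(double[] l, double x, double y) {
        return (Math.abs(l[0] - x) < EPS && Math.abs(l[1] - y) < EPS)
            || (Math.abs(l[2] - x) < EPS && Math.abs(l[3] - y) < EPS);
    }

    /** L1 includes L2, following cases (a) and (b) of the slide. */
    static boolean includes(double[] l1, double[] l2) {
        boolean b2Inside = overlaps(l1, l2[0], l2[1]);
        boolean e2Inside = overlaps(l1, l2[2], l2[3]);
        boolean colinear = isColinear(l1, l2[0], l2[1]) && isColinear(l1, l2[2], l2[3]);
        boolean connected = isConnected(l1, l2[0], l2[1]) || isConnected(l1, l2[2], l2[3]);
        // (a) L2 lies strictly inside L1.
        boolean caseA = colinear && b2Inside && e2Inside;
        // (b) L2 shares an end point with L1 and its other end lies inside L1.
        boolean caseB = connected && isParallel(l1, l2) && (b2Inside || e2Inside);
        return caseA || caseB;
    }
}
```

When `includes(l1, l2)` holds, L2 carries no information beyond L1 and is a candidate for removal by the filter.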
28 Filtering Tests (3/3)
- L1 isJoined L2: L1 and L2 join together
  - case (a): L1 is extended by L2, without overlapping
    - if L1 isConnected L2
    - and L1 isParallel L2
    - and L1 overlaps e2 is false
  - case (b): L1 is extended by L2, with overlapping
    - or L1 isColinear L2
    - and L1 overlaps L2
    - and L2 overlaps L1
Figure: L1 and L2 join together, cases (a) and (b)
30 Get Line Intersection (1/3)
X junction
T junction
Segments separation
Multi-degree junction
31 Get Line Intersection (2/3)
- Intersections processing algorithm
  - Search and list all junctions of the document
  - For each line, test if it contains the junction
  - If yes, break the line in two at the junction point
- Tests used
  - L1 isIntersected L2: L1 and L2 intersect at a point p(x, y)
    - so that p <- L1 getIntersection L2
    - case (a): X junction
      - if p is not null
    - case (b): T junction
      - if p is null
      - L1 overlaps p and L2 overlaps p
      - or L1 isConnected p and L2 overlaps p
      - or L2 isConnected p and L1 overlaps p
    - In both cases, add junction p(x, y) to the junctions list
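The "break the line in two at the junction point" step can be sketched as follows, assuming the junction list has already been computed. The names `JunctionSplitter`, `containsStrictly`, and `split`, the `{xb, yb, xe, ye}` layout, and the tolerance `EPS` are hypothetical. The containment check uses a cross product plus a dot product, a substitution made for this example in place of the `overlaps` test of the slides.

```java
import java.util.ArrayList;
import java.util.List;

class JunctionSplitter {
    /** Tolerance for floating-point comparisons (assumed value). */
    static final double EPS = 1e-9;

    /** True if point p = {x, y} lies on segment s, strictly between its end points. */
    static boolean containsStrictly(double[] s, double[] p) {
        double cross = (s[2] - s[0]) * (p[1] - s[1]) - (s[3] - s[1]) * (p[0] - s[0]);
        if (Math.abs(cross) > EPS) {
            return false; // p is not on the supporting line
        }
        // (p - b) . (p - e) is negative only when p is between b and e.
        double dot = (p[0] - s[0]) * (p[0] - s[2]) + (p[1] - s[1]) * (p[1] - s[3]);
        return dot < -EPS;
    }

    /** Breaks every segment in two at each junction it strictly contains. */
    static List<double[]> split(List<double[]> segments, List<double[]> junctions) {
        List<double[]> work = new ArrayList<>(segments);
        for (double[] p : junctions) {
            List<double[]> next = new ArrayList<>();
            for (double[] s : work) {
                if (containsStrictly(s, p)) {
                    next.add(new double[] {s[0], s[1], p[0], p[1]});
                    next.add(new double[] {p[0], p[1], s[2], s[3]});
                } else {
                    next.add(s);
                }
            }
            work = next;
        }
        return work;
    }
}
```

With this arrangement an X junction splits both crossing lines, while a T junction splits only the line whose interior contains the point.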
32 Get Line Intersection (3/3)
- L1 isIntersected L2 returns
  - the intersection point p(xc, yc) between lines L1 and L2
  - null if the lines are parallel or colinear
  - null if x < 0, y < 0
- Four cases to take into account
  - 1. L1 and L2 are regular
    - y1 = a1 * x1 + b1
    - y2 = a2 * x2 + b2
    - yc = y1 = y2
    - xc = x1 = x2
    - xc = (b2 - b1) / (a1 - a2)
    - yc = a1 * xc + b1 = a2 * xc + b2
  - 2. L1 is regular, L2 irregular; two cases
    - L2 is horizontal
      - y1 = a1 * x1 + b1
      - y2 = c, for any x2
      - yc = c
      - xc = (c - b1) / a1
    - L2 is vertical
      - y1 = a1 * x1 + b1
      - x2 = c, for any y2
      - xc = c
      - yc = a1 * c + b1
  - 3. L1 is irregular, L2 regular (see case 2.)
  - 4. L1 and L2 are irregular (see case 2.)
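The slope-intercept formulas above can be sketched in Java as follows. The class `LineIntersection`, the method signature, and the tolerance `EPS` are assumptions for illustration. The sketch departs from the slide in two labelled ways: horizontal lines are handled by the regular formula with slope 0, so only vertical lines need a special branch, and the result is the intersection of the infinite supporting lines, without the `x < 0, y < 0` rule or any check against segment extents.

```java
class LineIntersection {
    /** Tolerance for floating-point comparisons (assumed value). */
    static final double EPS = 1e-9;

    /**
     * Intersection {xc, yc} of the infinite lines through
     * (xb1, yb1)-(xe1, ye1) and (xb2, yb2)-(xe2, ye2),
     * or null if the lines are parallel or colinear.
     */
    static double[] getIntersection(double xb1, double yb1, double xe1, double ye1,
                                    double xb2, double yb2, double xe2, double ye2) {
        boolean vertical1 = Math.abs(xe1 - xb1) < EPS;
        boolean vertical2 = Math.abs(xe2 - xb2) < EPS;
        if (vertical1 && vertical2) {
            return null; // both vertical: parallel or colinear
        }
        if (vertical1 || vertical2) {
            // One line is x = c; the other is y = a * x + b.
            double c = vertical1 ? xb1 : xb2;
            double a = vertical1 ? (ye2 - yb2) / (xe2 - xb2) : (ye1 - yb1) / (xe1 - xb1);
            double b = vertical1 ? yb2 - a * xb2 : yb1 - a * xb1;
            return new double[] {c, a * c + b};
        }
        // Regular case: y = a1 * x + b1 and y = a2 * x + b2.
        double a1 = (ye1 - yb1) / (xe1 - xb1);
        double b1 = yb1 - a1 * xb1;
        double a2 = (ye2 - yb2) / (xe2 - xb2);
        double b2 = yb2 - a2 * xb2;
        if (Math.abs(a1 - a2) < EPS) {
            return null; // equal slopes: parallel or colinear
        }
        double xc = (b2 - b1) / (a1 - a2);
        return new double[] {xc, a1 * xc + b1};
    }
}
```

A caller that needs a junction between two segments would still have to confirm that the returned point lies on both of them, for example with the overlap and connection tests.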
33 Unformatting SVG Documents
- Presentation of the Project
- Contributions
- Applications
34 Experimental Results (1/3)
Unformatting results on SVG documents created with the 2gT system ("graphic ground Truth").
35 Experimental Results (2/3)
- Colour index
  - black vectors: no change
  - red vectors: filtered vectors
  - blue vectors: broken vectors
- Visually, all original lines are retrieved
- After reduction, we do see that all intersections have been erased
36 Experimental Results (3/3)
- Algorithm complexity: n(n-1)
  - Filtering: n + (n-1) + (n-2) + ... + 1 comparisons
  - Intersections retrieval: n + (n-1) + (n-2) + ... + 1 comparisons
- Runtime: about 1.5 min for 100 documents (2500 vectors and 100 intersections per document)
37 Conclusion
- Outcome
  - Technical achievement
  - Unformatting system effective and functional
- Perspectives
  - Use as a step upstream of pattern recognition tools