Title: Finding the Story: Generating Large-Scale Document Structure in Semantics-to-Hypermedia Transformation
1Finding the Story: Generating Large-Scale Document Structure in Semantics-to-Hypermedia Transformation
- Lloyd Rutledge
- CWI, Amsterdam
2The Topia Project
- Principles and Goals
- Topiary Hypermedia: plant once and trim
- Presentation generation
- User-controlled and on-demand
- Automated propagation of each author change
- Structure-focussed
- Domain-independence and facilitated specificity
3Document Request à la Google
generate new multimedia
find existing document
4The Topia Demo
5Document Engineering Triangle
6Document Engineering History
the past 5000 years
the past 5 years
post/archive
browse results
select archive
enter query
7The Topia Architecture
user
engineer
stylist
archivist
online
topic
clustering
style sheet
selection
structure
presentation
semantics
8Principles
- User Control
- pick expert, set options, become author
- Cross-applicability
- each expert's contribution applies to any from the others
- Show what and why
- why archivist selected content for user request
- why engineer put concept where it is in structure
- why stylist picked each media for its concept
9Archivist's Responsibilities
- To user
- reasonable (amount of) content for reasonable requests
- To engineer
- enough relations between subset to derive structure
- To stylist
- media for presenting concepts in different structural contexts
- Node-based interaction with all levels
10Archival RDF Code
text
<rdf:Description rdf:about='ARIAArtefactSK-A-2670'>
  <rdf:type rdf:resource='ARIAArtefact'/>
  <dc:title>Pinks in the Breakers</dc:title>
  <topia:artefactImage rdf:resource='RMSK/Org/SK-A-2670.org.jpg'/>
  <dc:creator rdf:resource='ARIAArtist11960'/>
  <dc:date>1875-1885</dc:date>
  <dc:description> ... along the beach by horses. Scheveningen did not ... </dc:description>
  <topia:artefactMaterial>Oil on canvas</topia:artefactMaterial>
  <topia:artefactSize>90 x 181 cm</topia:artefactSize>
  ...
</rdf:Description>
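As an illustration of how such an archival record can be read, the following minimal sketch turns one rdf:Description into subject/property/value triples using only the Python standard library. It is not part of Topia. The wrapping rdf:RDF element, the shortened record, and the topia namespace URI (copied as it appears later in these slides) are assumptions made for the example; the function name `triples` is hypothetical.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC = "http://purl.org/dc/elements/1.1/"
TOPIA = "http://www.telin.nl/rdf/topia"  # assumption: taken as printed on the XSLT slide

# Shortened copy of the slide's record, wrapped in rdf:RDF so it parses on its own.
RECORD = f"""<rdf:RDF xmlns:rdf="{RDF}" xmlns:dc="{DC}" xmlns:topia="{TOPIA}">
  <rdf:Description rdf:about="ARIAArtefactSK-A-2670">
    <rdf:type rdf:resource="ARIAArtefact"/>
    <dc:title>Pinks in the Breakers</dc:title>
    <dc:creator rdf:resource="ARIAArtist11960"/>
    <dc:date>1875-1885</dc:date>
    <topia:artefactMaterial>Oil on canvas</topia:artefactMaterial>
  </rdf:Description>
</rdf:RDF>"""


def triples(xml_text):
    """Yield (subject, property, value, is_resource) for each property element."""
    root = ET.fromstring(xml_text)
    for desc in root.iter(f"{{{RDF}}}Description"):
        subject = desc.get(f"{{{RDF}}}about")
        for prop in desc:
            resource = prop.get(f"{{{RDF}}}resource")
            if resource is not None:
                # The property points at another concept (a relation).
                yield subject, prop.tag, resource, True
            else:
                # The property holds a literal string.
                yield subject, prop.tag, (prop.text or "").strip(), False


record_triples = list(triples(RECORD))
for triple in record_triples:
    print(triple)
```

Separating resource-valued from literal-valued properties matters later in the pipeline: relations between concepts feed the structure, while literals such as the title feed the presentation.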
11ARIA Concept Map
12User's Request Interface
13Selection SeRQL Result Code
<tableQueryResult>
  <header>
    <columnName>Concept</columnName>
    <columnName>Property</columnName>
    <columnName>String</columnName>
  </header>
  ...
  <tuple>
    <uri>ARIAArtefactSK-A-2670</uri>
    <uri>http://purl.org/dc/elements/1.1/description</uri>
    <literal> ... along the beach by horses. Scheveningen did not ... </literal>
  </tuple>
  ...
</tableQueryResult>
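A selection result in this table form can be read into rows keyed by column name. The sketch below assumes the element names shown on the slide (tableQueryResult, header, columnName, tuple, uri, literal) and uses a one-tuple sample with placeholder literal text; `read_rows` is a hypothetical helper, not a Sesame API.

```python
import xml.etree.ElementTree as ET

# One-tuple sample in the slide's result format; the literal is a placeholder.
RESULT = """<tableQueryResult>
  <header>
    <columnName>Concept</columnName>
    <columnName>Property</columnName>
    <columnName>String</columnName>
  </header>
  <tuple>
    <uri>ARIAArtefactSK-A-2670</uri>
    <uri>http://purl.org/dc/elements/1.1/description</uri>
    <literal>(description text elided)</literal>
  </tuple>
</tableQueryResult>"""


def read_rows(xml_text):
    """Return one dict per <tuple>, keyed by the header's column names."""
    root = ET.fromstring(xml_text)
    columns = [c.text.strip() for c in root.find("header").findall("columnName")]
    rows = []
    for tup in root.findall("tuple"):
        # Cells appear in column order, whether they are <uri> or <literal>.
        values = [(cell.text or "").strip() for cell in tup]
        rows.append(dict(zip(columns, values)))
    return rows


rows = read_rows(RESULT)
print(rows[0]["Concept"], rows[0]["Property"])
```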
14Clustering for Structure
original selection
cluster node
15Document Structure
hierarchical nodes
from clustering
form introduction and summary displays
recurrence
parent-child
from user query
leaf nodes
form main displays
16Proximity Principle
- Proximity Matrix
- each pair of selected concepts has a proximity measure
- Matching conceptual and structural proximity
- grouping, sequence and recurrence convey proximity
- Let's not forget why
- presentation should convey why structurally proximate concepts were measured as proximate
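The proximity matrix idea can be sketched as follows. The slides do not say which measure Topia uses, so this example assumes Jaccard similarity over each concept's set of property values; the concept names and properties are invented toy data, and `shared` is a hypothetical helper that keeps the "why" behind each score.

```python
from itertools import combinations


def jaccard(a, b):
    """Share of properties two concepts have in common (assumed measure)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def proximity_matrix(properties):
    """Map every ordered pair of concepts to a proximity score."""
    names = sorted(properties)
    matrix = {(n, n): 1.0 for n in names}
    for x, y in combinations(names, 2):
        score = jaccard(properties[x], properties[y])
        matrix[(x, y)] = matrix[(y, x)] = score
    return matrix


def shared(properties, x, y):
    """The 'why': the properties that make two concepts proximate."""
    return properties[x] & properties[y]


# Toy data, not taken from the ARIA repository.
toy = {
    "artefact_a": {"theme:beach", "material:oil on canvas"},
    "artefact_b": {"theme:beach", "material:paper"},
    "artefact_c": {"material:oil on canvas"},
}

m = proximity_matrix(toy)
why_ab = shared(toy, "artefact_a", "artefact_b")
print(m[("artefact_a", "artefact_c")], why_ab)
```

Keeping the shared properties next to the score is what lets a later presentation step explain why two concepts were placed close together.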
17Engineer's Interface
18Concept Lattices
19Beyond Lattices
- Inferred properties to beef up the link metrics
- we use art genre sub-class inheritance
- rules provided by archivist as domain-specific
- Relational clustering
- property (i.e. lattice) clustering is a functional subset of relational clustering
- Can infer relations just like properties
- Axial (numeric) clustering
- creates virtual group nodes, without RDF resource
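To make the first two points concrete, the sketch below first expands each concept's terms with their broader terms (a stand-in for sub-class inheritance rules supplied by the archivist) and then lists the formal concepts of the resulting property table, which are the nodes of a concept lattice. The genre terms, the work names, and both function names are assumptions for illustration; the brute-force enumeration is only suitable for a small selection.

```python
from itertools import combinations


def inherit(properties, broader):
    """Add every broader term reachable from a concept's own terms."""
    expanded = {}
    for concept, terms in properties.items():
        closed = set(terms)
        frontier = list(terms)
        while frontier:
            parent = broader.get(frontier.pop())
            if parent is not None and parent not in closed:
                closed.add(parent)
                frontier.append(parent)
        expanded[concept] = closed
    return expanded


def formal_concepts(properties):
    """Return all (extent, intent) pairs: groups of concepts with exactly
    the properties they share. Brute force over all subsets."""
    names = sorted(properties)
    all_props = set().union(*properties.values())
    found = set()
    for size in range(len(names) + 1):
        for group in combinations(names, size):
            intent = set(all_props)
            for name in group:
                intent &= properties[name]
            # Close the group: every concept that has the whole intent.
            extent = frozenset(n for n in names if intent <= properties[n])
            found.add((extent, frozenset(intent)))
    return found


# Toy data: hypothetical works and genre terms.
works = {"w1": {"seascape"}, "w2": {"landscape"}, "w3": {"portrait"}}
broader = {"seascape": "landscape"}

before = formal_concepts(works)
after = formal_concepts(inherit(works, broader))
group = (frozenset({"w1", "w2"}), frozenset({"landscape"}))
print(group in before, group in after)
```

Without inference the two works share no property and never form a group; once "seascape" inherits "landscape", they cluster under it, which is the sense in which inferred properties strengthen the link metrics.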
20Structure Code
<node>
  <concept literal="beach"/>
  <node>
    <concept property="artefactThema" resource="ARIAThema6292"/>
    <node><concept resource="ARIAArtefactRP-P-OB-4635"/></node>
    <node><concept resource="ARIAArtefactSK-A-2670"/></node>
  </node>
  <node>
    <concept property="type" resource="ARIATerm26402"/>
    <concept property="type" resource="ARIABroaderTerm24480"/>
    <node><concept resource="ARIAArtefactRP-P-FM-1157-A"/></node>
    <node>
      <concept literal="Oil on canvas" property="artefactMaterial"/>
      <concept property="type" resource="ARIATopTerm4"/>
      <node><concept resource="ARIAArtefactSK-A-4868"/></node>
      <node><concept resource="ARIAArtefactSK-A-2670"/></node>
      <node><concept resource="ARIAArtefactSK-A-3602"/></node>
      <node><concept resource="ARIAArtefactSK-A-3597"/></node>
      <node><concept resource="ARIAArtefactSK-A-4644"/></node>
    </node>
  </node>
</node>
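Recurrence, where the same leaf concept appears under more than one cluster, can be read straight off this structure. The sketch below walks a shortened and flattened variant of the tree (an assumption made to keep the example small) and reports each leaf that occurs on more than one path; the function names are hypothetical.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Shortened, flattened variant of the slide's structure code.
STRUCTURE = """<node>
  <concept literal="beach"/>
  <node>
    <concept property="artefactThema" resource="ARIAThema6292"/>
    <node><concept resource="ARIAArtefactRP-P-OB-4635"/></node>
    <node><concept resource="ARIAArtefactSK-A-2670"/></node>
  </node>
  <node>
    <concept literal="Oil on canvas" property="artefactMaterial"/>
    <node><concept resource="ARIAArtefactSK-A-4868"/></node>
    <node><concept resource="ARIAArtefactSK-A-2670"/></node>
  </node>
</node>"""


def label(node):
    """Join the literal or resource of the node's own <concept> children."""
    parts = [c.get("literal") or c.get("resource") or "" for c in node.findall("concept")]
    return " + ".join(parts)


def leaf_paths(node, path=()):
    """Yield (leaf label, labels of its ancestors) for every leaf node."""
    children = node.findall("node")
    if not children:
        yield label(node), path
        return
    here = path + (label(node),)
    for child in children:
        yield from leaf_paths(child, here)


def recurrences(root):
    """Return leaves that appear under more than one parent path."""
    seen = defaultdict(list)
    for leaf, path in leaf_paths(root):
        seen[leaf].append(path)
    return {leaf: paths for leaf, paths in seen.items() if len(paths) > 1}


rec = recurrences(ET.fromstring(STRUCTURE))
for leaf, paths in rec.items():
    print(leaf, "recurs under", paths)
```

Each recorded path also names the cluster concepts above the leaf, which is the information a presentation needs to say why a recurring artefact shows up in both places.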
21Make it Presentable
22Stylist Responsibilities
- Good presentation of each concept
- retrieval of good media
- Good presentation of structure
- global view and local context
- Use media, layout and timing to show why
- why primary content in presentation
- why structure was chosen
- group, sequence, (adjacency) and recurrence
23One Example of Style
original user request
outline (structure)
main display (node)
<dc:title>
<dc:title>
default progression
seen
current
contextual recurrence access
recurrence
<dc:creator>
<dc:date>
<dc:description>
24Media for the Stylist
- Dublin Core for Main Display Text
- title, description, date, creator
- Media URIs for Main Display
- Titles and thumbnails for outline and context
- <rdfs:label> for why
- describes what type of concept a concept is
- describes property types, thus relations
- Titus is the son of the painter Rembrandt
concept
property type
concept type
concept
25Media Selection XSLT
<xsl:template match="" mode="getDesc">
  <xsl:variable name="server" select='sesame:setServer("http://media.cwi.nl:8080/sesame/")'/>
  <xsl:variable name="rep" select='sesame:setRepository("topia")'/>
  <xsl:variable name="handle" select="@resource"/>
  <xsl:variable name="desc">
    <sesame:serql query=" SELECT DISTINCT desc FROM <!handle> <dc:description> desc USING NAMESPACE topia = <!http://www.telin.nl/rdf/topia> "/>
  </xsl:variable>
  <xsl:text> </xsl:text>
  <xsl:apply-templates select="xalan:nodeset($desc)/tableQueryResult/tuple"/>
  <xsl:text> </xsl:text>
</xsl:template>
character escaping removed
26New Topia Domain: Google
27New Topia Interface: Spectacle
28DISC: Domain-specific Discourse
29SampLe: More User Control
30Topia Take-home Message
- Content/Style/Structure all separate
- defined apart and interchangeable
- full user control from selection as such
- Structure is current challenge for generation
- can be defined apart and domain-independent
- facilitated user/engineer control
- Result is user-controlled on-demand hypermedia generation