Title: Content Management: Lessons Learned
1 Content Management: Lessons Learned
- Brian Stoll
- Content Management Developer
- stollb_at_seattleu.edu
2 Overview
- Site and System Description
- Source Management
- Transforms
- Hardware and Software
- Conclusions
- Questions? Feedback
3 Site and System Description
Platform
Production Server
Linux OS
Apache Web Server
Staging Server
Apache Tomcat Web Container
Apache Cocoon Content Framework
Apache Ant Build Tool
Redhawk CMS
Personal Workstations
Google Search Appliance
4 Site and System Description
Properties and Sources: The web properties of the Seattle University Law School receive content from multiple sources, and many of these properties share the same sources.
Web Properties
- Main Website
- Student Portal
- Library Portal
- Law Docket
- Library Docket
- Departmental Portals, etc.
Content Sources
- Redhawk Repository
- Static Files
- Java, ASP, Databases
- Syndicated Feeds (e.g., Jurist, NOAA)
5 Site and System Description
Content Versions: Content is rendered by the various properties in multiple format versions.
Source -> XSLT transform -> Flash, Standard HTML, Text only
Source:
  <?xml version="1.0" encoding="UTF-8"?>
  <allAnnouncements>
    <announcement priority="normal" headline="Summer 2005 Registration Validation">
      <p><b>Complete and sign your Summer 2005 Business Office no later than 4:30 p.m. on Friday, June 3. </b>This form must have been received by Wednesday, May 25 if you plan to pick up your refund check on Thursday, June 2. Refund checks will be available after 1 p.m. Questions? Contact the <b><font color="8C0000"><a href="javascript:EventLink('person','Business Office')">Business Office</a></font></b> at (206) 398-4050.</p>
    </announcement>
  </allAnnouncements>
6 Site and System Description
Examples
Law Docket touch screen: http://www.law.seattleu.edu/docket/kiosk
Main site: http://www.law.seattleu.edu
Text-only version
Student Portal: http://students.law.seattleu.edu
7 Source Management
Document Types/Schemas: Redhawk supports development of custom "document classes", which correspond to XML document types (or schemas).
XML Repository: When Redhawk was first designed and created over three years ago, it had one specific publishing purpose: announcements and events. We now have sixteen different publishing templates and three years of content. This has significantly slowed Redhawk, because the XML repository does not index content directly the way a database would.
Lessons Learned: In our coming revision we will move to database-driven content that is created and queried in XML format, so XML becomes the medium rather than the repository. Content management systems have developed a great deal since Redhawk was designed, and we will consider these as well. OSCOM (http://www.oscom.org) and OpensourceCMS (http://www.opensourcecms.com) are two examples of helpful resources.
8 Source Management
WYSIWYG: Pluggable WYSIWYG editing environment. We are using Altova's free browser-based XML editor, Authentic. Authentic WYSIWYG forms are easily created to match the schemas of our content.
ActiveX: ActiveX controls can be difficult for users to install, and they limit which browsers can be used.
Lessons Learned: Consider other WYSIWYG controls, such as DHTML- or Flash-based tools.
9 Source Management
Workflow: Redhawk provides basic CRUD (Create, Read, Update, Delete) and role-based workflow functionality. There are two types of user for each document class: Author and Editor. Create, Update, or Delete requests by an Author must be approved by an Editor before taking effect.
- Author creates content and submits it.
- Editor reviews the submission and approves or rejects it.
- Upon approval, content is published; it expires on its expiration date.
Lessons Learned: This simple workflow has been effective, but a customizable workflow based on document type would be helpful.
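The Author/Editor flow above can be sketched as a small state model. This is a minimal illustration, not Redhawk's actual implementation: the `Submission` class, the role strings, and the status names are all hypothetical.

```python
from dataclasses import dataclass
from datetime import date
from enum import Enum


class Status(Enum):
    PENDING = "pending"      # submitted by an Author, awaiting review
    PUBLISHED = "published"  # approved by an Editor
    REJECTED = "rejected"    # turned down by an Editor


@dataclass
class Submission:
    """One piece of content moving through the workflow (hypothetical model)."""
    author: str
    expires: date
    status: Status = Status.PENDING

    def review(self, reviewer_role: str, approve: bool) -> None:
        # Only an Editor may approve or reject; an Author can only submit.
        if reviewer_role != "editor":
            raise PermissionError("only an Editor can review a submission")
        if self.status is not Status.PENDING:
            raise ValueError("submission has already been reviewed")
        self.status = Status.PUBLISHED if approve else Status.REJECTED

    def is_live(self, today: date) -> bool:
        # Published content is shown up to and including its expiration date.
        return self.status is Status.PUBLISHED and today <= self.expires


item = Submission(author="alice", expires=date(2005, 6, 3))
item.review("editor", approve=True)
```

A customizable workflow, as suggested in the lessons learned, would replace the fixed two-role check with rules looked up per document class.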
10 Source Management
Calendar Based Content: A significant amount of the content that we publish is tied to an event or schedule that is maintained externally to the system. Integrating our content management system with calendaring functionality allows for more efficient content creation and eliminates redundancy. For example, a department can use such calendaring functionality to schedule its events, submit appropriate events for publishing, and sync with personal calendars (such as Outlook).
11 Source Management
Syndicated Content: Some of the content we publish is derived from syndicated sources such as RSS and other XML feeds. Examples include Jurist legal news and NOAA weather data. Sometimes this requires combining multiple syndications into one source document. In the case of NOAA we require both current and forecast data, so we request both and combine them as required for our publishing targets.
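As a rough sketch of combining two syndications into one source document, the example below wraps two XML payloads under a single root element. The element names and sample values are made up for illustration and do not reflect the actual NOAA feed format; a real version would fetch the payloads over HTTP.

```python
import xml.etree.ElementTree as ET

# Hypothetical payloads standing in for two separate feed responses.
CURRENT_XML = "<current><temp_f>61</temp_f></current>"
FORECAST_XML = "<forecast><day name='Friday'><high_f>68</high_f></day></forecast>"


def combine_feeds(current_xml: str, forecast_xml: str) -> ET.Element:
    """Wrap two feed documents in a single <weather> source document."""
    root = ET.Element("weather")  # hypothetical wrapper element
    root.append(ET.fromstring(current_xml))
    root.append(ET.fromstring(forecast_xml))
    return root


combined = combine_feeds(CURRENT_XML, FORECAST_XML)
print(ET.tostring(combined, encoding="unicode"))
```

The combined document can then be handed to the publishing transforms as if it were a single source.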
12 Transforms
- Transforms make extensible content repositories and publishing work. They allow content to be reused across multiple formats and targets.
- Apache Cocoon: At the heart of our publishing system lies the Cocoon web framework.
- Cocoon is an open-source, Java-based XML web publishing framework.
- Designed to enable the separation of concerns between content, logic, and style.
- A SAX-based pipeline mechanism allows XML content to go through a series of transformations, configurable by the sitemap, Cocoon's central point of configuration.
13 Transforms
- Cocoon Pipeline
- Each pipeline consists of
  - Exactly one generator to produce XML content
  - Zero or more transformers to process XML
  - Exactly one serializer into a specific format
Generate (Requests, Files, Database) -> Transform (XSLT) -> Serialize (XHTML, PDF, SVG)
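The generator, transformer, and serializer stages can be illustrated with plain functions. This is only a sketch of the pipeline shape, not Cocoon's API: Cocoon is Java-based and configured through its sitemap, and the transformer here is an ordinary Python function standing in for an XSLT stylesheet. All function names are hypothetical.

```python
import xml.etree.ElementTree as ET
from typing import Callable, Iterable

Transformer = Callable[[ET.Element], ET.Element]


def generate(xml_text: str) -> ET.Element:
    """Generator stage: produce XML content (here, by parsing a string)."""
    return ET.fromstring(xml_text)


def headlines_to_list(doc: ET.Element) -> ET.Element:
    """Transformer stage: turn each announcement headline into a list item."""
    ul = ET.Element("ul")
    for announcement in doc.iter("announcement"):
        li = ET.SubElement(ul, "li")
        li.text = announcement.get("headline")
    return ul


def serialize(doc: ET.Element) -> str:
    """Serializer stage: write the final tree out as markup text."""
    return ET.tostring(doc, encoding="unicode")


def run_pipeline(source: str, transformers: Iterable[Transformer]) -> str:
    # Exactly one generator, zero or more transformers, exactly one serializer.
    doc = generate(source)
    for transform in transformers:
        doc = transform(doc)
    return serialize(doc)


SOURCE = (
    "<allAnnouncements>"
    '<announcement priority="normal" '
    'headline="Summer 2005 Registration Validation"/>'
    "</allAnnouncements>"
)
html = run_pipeline(SOURCE, [headlines_to_list])
print(html)
```

Swapping the transformer list or the serializer is what lets the same source feed several output formats.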
14 Transforms
- Transforms can be expensive in terms of both time and memory. There are a number of ways to organize content in order to manage transforms efficiently.
- Use a build process such as Ant to transform static content prior to the request.
- Separate common content, such as the header and footer, so that each requires only one transform rather than one per page.
Page layout: Header, Static Content, Dynamic Content, Footer
- Pre-transformed content: Header, Static Content, Footer
- Transformed at request: Dynamic Content
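The saving from transforming shared fragments once can be shown with a small counter. Note that this sketch uses in-memory caching rather than the Ant build step described above, and `expensive_transform` is a hypothetical stand-in for a real XSLT transform.

```python
from functools import lru_cache

transform_calls = {"count": 0}


def expensive_transform(fragment: str) -> str:
    """Stand-in for a costly transform; counts how often it runs."""
    transform_calls["count"] += 1
    return f"<div>{fragment}</div>"


@lru_cache(maxsize=None)
def shared_fragment(name: str) -> str:
    # The header and footer are the same on every page: transform once, reuse.
    return expensive_transform(name)


def render_page(body: str) -> str:
    # Only the page body is transformed on each request.
    return (
        shared_fragment("header")
        + expensive_transform(body)
        + shared_fragment("footer")
    )


pages = [render_page(f"page {n}") for n in range(3)]
print(transform_calls["count"])
```

Three pages need five transforms here (header, footer, and three bodies) instead of the nine that transforming every fragment on every page would take.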
15 Transforms
Personalization: The pipeline process allows for multiple levels of personalization, both at the transform and at data generation.
Example: A blind student can have their default site set to text only.
Example: Based on a profile in a database, a student could be alerted that one of their classes has been canceled.
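Both examples can be sketched with a simple profile record: one setting selects the output version (personalization at the transform), and the course list filters the data shown (personalization at data generation). The `Profile` fields, version names, and course codes are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Set, Tuple


@dataclass
class Profile:
    """Hypothetical per-student profile, e.g. loaded from a database."""
    name: str
    text_only: bool = False
    courses: Tuple[str, ...] = ()


def choose_version(profile: Profile) -> str:
    # Personalization at the transform: pick which rendering to apply.
    return "text-only" if profile.text_only else "html"


def cancellation_alerts(profile: Profile, canceled: Set[str]) -> List[str]:
    # Personalization at data generation: keep only this student's classes.
    return [f"{course} is canceled" for course in profile.courses if course in canceled]


student = Profile(name="Sam", text_only=True, courses=("LAW-100", "LAW-210"))
print(choose_version(student))
print(cancellation_alerts(student, {"LAW-210", "LAW-305"}))
```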
16 Hardware and Software
Open Source Software: A great resource that saves money and allows for a great degree of flexibility.
Linux OS: We currently run Red Hat 9 and must now decide whether to pay Red Hat licensing fees or move to another distribution.
Memory: Web services such as Tomcat and Cocoon can consume a great deal of memory, so adding memory and managing its usage are critical. This can take a toll on your own memory, so manage that as well!
17 Conclusions
Extensible Content Repository: Allows multiple publishing targets and eliminates redundancy in creation and management.
XML as a Medium: We will use XML more as a medium, with a database storing our dynamic content repository to improve retrieval and indexing.
WYSIWYG: Move to more browser-compliant, non-ActiveX authoring controls (e.g., Flash, JavaScript).
18 Conclusions
- Customizable Workflow
  - Varying content can require various workflows. Having a configurable workflow will allow for this.
- Calendar Based Content
  - Integrate event management and scheduling into the publishing process to eliminate redundancy and improve communication.
- Cocoon allows for Cohesion
  - We have multiple content sources publishing to multiple targets, and Cocoon integrates this process into one system via the pipelines of the sitemap.
19 Conclusions
- Manage Transforms
  - Eliminate redundancy in transformed content and pre-transform any static content to improve performance.
- Open Source Software
  - Allows for flexible development and implementation while reducing or eliminating licensing costs.
20 Questions, Feedback
Thanks for your time and interest.
Brian Stoll, Content Management Developer, stollb_at_seattleu.edu