Title: Antarctic Master Directory Metadata Tools
1. Antarctic Master Directory Metadata Tools
__________________________________________________
______ GCMD tools and developments
JCADM-7 Brussels, Belgium
2.
- The Next Generation of Metadata Authoring Tools
3. Background on Old Tools
- Thanks Taco!
- GCMD wrote DIFBuilder, SERFBuilder, DIFModBuilder, SERFModBuilder, DIFBuildlet, ...
- Perl based and very hard to maintain
- Did not work standalone
- Written by a summer high school student
4. DocBuilder DEMO
5. DocBuilder Features
- Object-oriented design. Allows code reuse.
- Java/Jython implementation. Offers platform independence and maintenance reduction.
- XML support. Promotes extensibility and standardized information exchange.
- Multiple versions. Supports many types of users and environments by offering both a Web and a stand-alone application.
- MD8 integration as a plugin. Provides added functionality.
- Multi-document support. Increases software flexibility by allowing the user to choose what type of document to build, e.g. DIF, SERF, Project Supplemental, or even an FGDC or ISO document.
- Customization capabilities. Strengthens integration with Portals.
6.
- Offer both a Web and a stand-alone application
  - Abstract the majority of the Java code, with implementation-specific code separated out into each of the Swing and HTML user-interface classes.
- Integrate standard MD8 components
  - Write DocBuilder in Java to allow the integration of already-written MD8 components, such as the validation and loading utilities.
- Create a generic, yet highly customizable product
  - Allow users to build any type of document: DIF, SERF, Supplemental, FGDC, ISO, and more! Also, allow users to put a unique face on the tool by customizing the look and feel.
7.
- Inputs
  - DTD/Schema: Existing file detailing the structure of the metadata document.
  - Layout file: Customizable XML document specifying element order, alias, visibility, etc. (to override schema). Can also detail widget type and default values.
  - Cache: Repository of existing XML metadata (including templates).
  - .inf file: Portal configuration file listing preferred color, logo, etc.
- Output
  - Automatically generated Web or Java Swing interface
8.
- DTD Example: dif.dtd
- <!ELEMENT DIF ( Entry_ID, Entry_Title, Parameters, Data_Set_Citation, Personnel, ... )>
- <!ELEMENT Entry_ID (#PCDATA)>
- <!ELEMENT Entry_Title (#PCDATA)>
- <!ELEMENT Parameters (Category, Topic, Term, Variable?, Detailed_Variable?)>
- <!ELEMENT Category (#PCDATA)>
- <!ELEMENT Topic (#PCDATA)>
- <!ELEMENT Term (#PCDATA)>
- <!ELEMENT Variable (#PCDATA)>
- <!ELEMENT Detailed_Variable (#PCDATA)>
- ...
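A DTD like the one above tells a form builder which child fields each element has and which of them are optional. The following is a minimal sketch of that idea in Python, not DocBuilder's actual code (which the slides describe as Java/Jython). The function name parse_dtd_elements and the regular expression are illustrative assumptions, and the sketch handles only simple comma-separated content models like those shown on the slide.

```python
import re

# Assumption: only simple declarations of the form <!ELEMENT Name (a, b?, ...)>
ELEMENT_RE = re.compile(r"<!ELEMENT\s+(\w+)\s*\(([^)]*)\)\s*>")


def parse_dtd_elements(dtd_text):
    """Return {element: [(child, is_optional), ...]}; text-only elements map to []."""
    model = {}
    for name, content in ELEMENT_RE.findall(dtd_text):
        children = []
        for part in content.split(","):
            part = part.strip()
            if not part or part == "#PCDATA":
                continue  # text-only element: no child fields
            # A trailing "?" marks an optional child in DTD syntax.
            children.append((part.rstrip("?"), part.endswith("?")))
        model[name] = children
    return model


SAMPLE_DTD = """
<!ELEMENT Parameters (Category, Topic, Term, Variable?, Detailed_Variable?)>
<!ELEMENT Category (#PCDATA)>
<!ELEMENT Variable (#PCDATA)>
"""

model = parse_dtd_elements(SAMPLE_DTD)
for child, optional in model["Parameters"]:
    print(child, "(optional)" if optional else "(required)")
```

A real tool would use a proper DTD or schema parser; the regular expression here only covers the subset of syntax that appears in the example.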
9.
- Layout Example: dif-layout.xml
<Layout>
  <Item name="DIF">
    <Items>
      <Item name="Entry_ID" order="1" required="true" alias="Identifier"/>
      <Item name="Entry_Title" order="2" required="true"/>
      <Item name="Parameters" order="3" type="listbox" options="/servlets/md/get_valids.py?type=parametersvalid&amp;format=colon" repeated="true" required="true">
        <Items>
          <Item name="Category" order="1" visible="false" required="true"/>
          <Item name="Topic" order="2" visible="false" required="true"/>
          <Item name="Term" order="3" visible="false" required="true"/>
          <Item name="Variable" order="4" visible="false"/>
          <Item name="Detailed_Variable" order="5"/>
        </Items>
      </Item>
      ...
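The layout file drives the order, label, visibility and widget type of each generated form field. The sketch below, in Python with the standard xml.etree.ElementTree module, walks a shortened layout and lists the visible fields in display order. It is an illustration only: the function form_fields, the default widget name "textbox", and the shortened sample document are assumptions, not part of DocBuilder.

```python
import xml.etree.ElementTree as ET

# Shortened, hypothetical layout using the attribute names shown on the slide.
SAMPLE_LAYOUT = """
<Layout>
  <Item name="DIF">
    <Items>
      <Item name="Entry_Title" order="2" required="true"/>
      <Item name="Entry_ID" order="1" required="true" alias="Identifier"/>
      <Item name="Parameters" order="3" type="listbox" repeated="true" required="true">
        <Items>
          <Item name="Category" order="1" visible="false" required="true"/>
          <Item name="Detailed_Variable" order="5"/>
        </Items>
      </Item>
    </Items>
  </Item>
</Layout>
"""


def form_fields(item, depth=0):
    """Yield (depth, label, widget, required) for visible items, sorted by 'order'."""
    items = item.find("Items")
    children = [] if items is None else items.findall("Item")
    for child in sorted(children, key=lambda c: int(c.get("order", "0"))):
        if child.get("visible", "true") == "true":
            label = child.get("alias", child.get("name"))  # alias overrides the name
            widget = child.get("type", "textbox")  # assumed default widget
            yield depth, label, widget, child.get("required") == "true"
        yield from form_fields(child, depth + 1)


root = ET.fromstring(SAMPLE_LAYOUT.strip())
fields = list(form_fields(root.find("Item")))
for depth, label, widget, required in fields:
    print("  " * depth + f"{label} [{widget}]" + (" *" if required else ""))
```

In this sketch Entry_ID is listed first under its alias, and Category is skipped because the layout marks it as not visible.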
10.
- .inf Example: ceos.inf
- bg_color fefefe
- link_color 000000
- vlink_color 000000
- background_url /Data/portals/ceos/Images/white_greenbar2.jpg
- horiz_line_url /Data/portals/ceos/Images/horiz_line_blue.gif
- section_bar_color 383866
- title CEOS International Directory Network
- portal_logo_url /Data/portals/ceos/Images/ceoslogo.gif
- portal_name ceos
- help_page_url /Data/portals/ceos/frame_page_help.html
- border_color 336699
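A portal configuration of this kind can be read into a simple key/value mapping before the interface is rendered. The sketch below assumes one setting per line with the key separated from the value by whitespace. The real .inf separator is not visible in the slide text, so that layout and the function name parse_inf are assumptions.

```python
def parse_inf(text):
    """Parse 'key value' lines into a dict (assumed whitespace-separated layout)."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        # Split on the first space only, so values may contain spaces.
        key, _, value = line.partition(" ")
        settings[key] = value.strip()
    return settings


SAMPLE_INF = """
bg_color fefefe
title CEOS International Directory Network
portal_name ceos
border_color 336699
"""

settings = parse_inf(SAMPLE_INF)
print(settings["title"])
```

Splitting only on the first space keeps multi-word values such as the portal title intact.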
11. DocBuilder HTML Version
- Very similar functionality to the Java/Swing (stand-alone) version
- Will look similar to the current Perl tool (DIFbuilder) so users won't need to learn a whole new tool.
12. DocBuilder HTML Version
- HTML Version: The initial focus of effort was on the Java/Swing version. As this has become a more stable product, attention has shifted to the HTML version, which will replace the current Web tools (DIFbuilder, SERFbuilder, etc.).
- All components are written strictly in HTML and JavaScript. No Java components (applets) are included in the Web client.
- A fully functional widget has been written that is comparable to the Java/Swing searchable list widget.
- Anticipated release date of the HTML version is September 2003.
13.
- Sample screens
  - Main Page: Overview showing fields checklist.
14.
- Sample screens
  - Field Using SearchableTable Widget
  - Widget for use with fields whose values have both a controlled and an uncontrolled portion.
(Screenshot labels: Controlled; Uncontrolled)
15.
- Sample screens
  - Field Using Basic Widget
  - Generic widget for use with simple fields.
(Screenshot labels: Field name; Form element (text box); Description; Passed-in value)
16.
- Sample screens
  - Field Using ComboBox Widget
  - Basic widget for use with fields whose values must be selected from an existing list.
17.
- Sample screens
  - Field Using Searchable List Widget
  - Widget for use with fields whose values are selected from a long list.
(Screenshot labels: Filter term box; Full list of values)
18. Other Tools
19. AMD Supports Two Search Interfaces
- Free-text search
  - DIFs stored in XML files on a file system are indexed by free-text words (currently Isite)
- Keyword or Science Parameter search
  - Uses a relational database (Oracle) of DIFs
20. Why a New Free-text Search Engine?
- Reduce maintenance
- Improve capabilities and functionalities
- Increase usability to help users find data
21. Lucene
- Part of the Jakarta project
- High-performance, full-featured text search engine written in Java.
- Technology suitable for nearly any application requiring full-text search, especially cross-platform.
- Simplifies security model: doesn't use its own dedicated port.
- Well-designed, well-documented
- Incremental indexing; flexible data sources (file or stream).
- Supports fielded content.
- Adaptable relevance ranking.
- No spatial or temporal search.
22. What About Isite? Yes, we still need it!
- Provides Z39.50 access that supports remote queries to the AMD content
- Meets USA requirement to participate in the FGDC Clearinghouse
23. Reduce Maintenance?
- No need to deal with Zgate client
  - Zgate client is very complex due to Z39.50 protocol
  - Zgate client is not well supported
- Simplifies security model. Lucene doesn't use its own dedicated port.
- Lucene is Java based and thus interfaces very nicely with MD8 and Jython
  - Allows for lots of MD8 reuse
  - Allows for more free-text functionality
- Lucene is well designed, well documented and has a large open source user community
24. Lucene Capabilities
- Incremental indexing
- Flexible data sources (file or stream)
- Supports fielded content
- Stop word processing, e.g. exclude "a", "and", "the"
- Stemming
- Lots of query features
- Adaptable relevance ranking
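Several of these capabilities (incremental indexing, fielded content, stop word processing) can be illustrated with a toy inverted index. The Python sketch below is not Lucene's API or data structure; the class name TinyIndex, the document ids and the sample text are invented for illustration, and stemming and relevance ranking are left out.

```python
from collections import defaultdict

STOP_WORDS = {"a", "and", "the"}  # the stop words named on the slide


class TinyIndex:
    """Toy fielded inverted index; a teaching sketch, not Lucene."""

    def __init__(self):
        # (field, term) -> set of document ids
        self.postings = defaultdict(set)

    def add(self, doc_id, fields):
        """Index one document; may be called at any time (incremental indexing)."""
        for field, text in fields.items():
            for token in text.lower().split():
                token = token.strip(".,;:!?")
                if token and token not in STOP_WORDS:
                    self.postings[(field, token)].add(doc_id)

    def search(self, field, term):
        """Return sorted ids of documents whose `field` contains `term`."""
        return sorted(self.postings.get((field, term.lower()), set()))


index = TinyIndex()
index.add("dif-1", {"title": "Sea ice and the ocean", "content": "A study of sea ice"})
index.add("dif-2", {"title": "Ice cores", "content": "The Vostok ice core record"})

print(index.search("title", "ice"))       # both documents
print(index.search("title", "the"))       # stop word, so nothing is indexed
print(index.search("content", "vostok"))  # fielded lookup
```

Because each (field, term) pair has its own posting set, a new document can be added without rebuilding anything that was indexed earlier.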
25. Lucene Query Parser
- Single terms (hello) or phrases ("hello dolly")
- Booleans: AND, +, OR, NOT, -
  - The OR operator is the default conjunction operator
  - Also && for AND, || for OR, ! for NOT
- Complex fielded searches
  - e.g. parent:Data_Center AND tagname:Personnel AND content:Smith
- Wildcard searches
  - Single character using ?, e.g. te?t matches test and text
  - Multiple characters using *, e.g. test* matches test, tester or tests
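The wildcard behaviour in the examples above (? for exactly one character, * for any run of characters) can be tried out with Python's standard fnmatch module, which uses the same two symbols. This only mimics the matching rule on a hypothetical term list; it is not how Lucene evaluates wildcard queries, and fnmatch additionally understands [seq] patterns that are not part of the slide.

```python
from fnmatch import fnmatchcase

# Hypothetical list of indexed terms.
terms = ["test", "text", "tester", "tests", "toast"]


def wildcard_matches(pattern, candidates):
    """Return candidates matching a pattern where ? is one char and * is any run."""
    return [t for t in candidates if fnmatchcase(t, pattern)]


single = wildcard_matches("te?t", terms)
multiple = wildcard_matches("test*", terms)
print(single)
print(multiple)
```

With this term list, te?t selects test and text, and test* selects test, tester and tests, matching the slide's examples.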
26. Lucene Query Parser
- Fuzzy search using ~
  - e.g. snow~ matches terms spelled similarly to snow, such as snowy
- Proximity search
  - e.g. "international nodes"~10 matches international and nodes within 10 words of each other
- Term boosting using ^ with a number
  - e.g. coke^4 pepsi makes documents with coke more relevant than those with pepsi
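The proximity example on this slide amounts to checking how far apart two terms occur in a document. The Python sketch below shows a simplified version of that check by comparing word positions. The function name within_proximity and the sample sentence are assumptions, and Lucene's own phrase-slop calculation is more involved than a plain position difference.

```python
def within_proximity(text, first, second, max_distance):
    """True if `first` and `second` occur at most max_distance word positions apart."""
    words = [w.strip(".,;:!?").lower() for w in text.split()]
    first_pos = [i for i, w in enumerate(words) if w == first.lower()]
    second_pos = [i for i, w in enumerate(words) if w == second.lower()]
    return any(abs(i - j) <= max_distance for i in first_pos for j in second_pos)


# Hypothetical document text.
sentence = "International cooperation links the data nodes."

near = within_proximity(sentence, "international", "nodes", 10)
far = within_proximity(sentence, "international", "nodes", 3)
print(near, far)
```

Here the two terms are five positions apart, so the check succeeds for a distance of 10 and fails for a distance of 3.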
27. The E-Government Geospatial One-Stop Initiative
- Geospatial One-Stop is an initiative sponsored by the U.S. Office of Management and Budget (OMB) to improve access to geographic information throughout local, state and federal governments.
- This encourages increased sharing of and access to data at all levels of government.
- The implementation of Geospatial One-Stop will involve an online, interactive metadata portal for access to all these data sets.
- NASA, along with other organizations, is required to document future projects based on the Geospatial One-Stop standard.
- Additional information on Geospatial One-Stop is available at http://www.geo-one-stop.gov/
28. Enhanced Project Metadata
- Project_Name
- Group: Project_Organization
- Organization_Name
- Organization_URL
- End_Group
- Keyword
- ISO_Topic_Category
- Summary
- Group: Personnel
- Project_Status
- Group: Project_Timeframe
- Start_Date
- Stop_Date
- End_Group
- Group: Spatial_Coverage
- Location
- Data_Resolution
- Scale
- Purpose
- Group: Project_Investment
- Funds_Type
- Group: Costs
- Projected_Costs
- Budgeted_Costs
- Other_Cost
- End_Group
- Partnered_Funds
- Agency_Partner
- Group: Timing
- Estimated_Start_Date
- Estimated_Stop_Date
- End_Group
- End_Group
- IDN_Node
- Metadata_Creation_Date
- Metadata_Revision_Date
- Metadata_Revision_History
- Same field or group as specified in the DIF
31. The Laundry List
32. Open Source Packages Integrated With MD8 Software
- Ant (Apache), Java-based build tool.
  - Similar to Make, without Make's wrinkles.
  - Used to build MD8 software on a variety of different platforms.
- Apache
  - Public domain web server.
  - Hosts more than 50% of web sites in the world.
- Linux Operating System
  - Wide support; robust.
  - Flexible; runs on multiple platforms.
  - Competition: weak or costly.
33. Open Source Packages Integrated With MD8 Software
- Jakarta Tomcat, a standalone web server
  - Supports servlets 2.3 and JSP 1.2.
  - Plugs into Apache.
- Jython (Jython is Python for Java, so Jython is easily integrated with Java code.)
  - Intuitive, uncluttered, elegant language.
  - For scripting web-based servlets without need to recompile and reload servlet. Examples: getdif.py, get_entry-IDS.py
  - For generating reports.
34. Free-Text Search Package Integrated With MD8 Software
- Isite
  - Full-text/fielded-text Boolean search with spatial/temporal coverage search capabilities.
  - ANSI/NISO Z39.50 based search and retrieval protocol.
  - High maintenance with complex Zgate client (not well supported).
35. Open Source Packages Used Peripherally With MD8 Software
- AutoUpdate, Java API program
  - Automatically installs code from web server to client.
  - Clients will mirror target directory structure from server.
  - Supports compressed server files and incremental updates.
  - Application extended, providing mechanism for OPS users to retrieve latest software updates automatically.
- SourceForge (recently unsupported as Open Source)
  - Intuitive.
  - Supports many aspects of project development: CVS repository, mailing lists, message forums, task management software, web site testing, permanent file archival, full backups, bug tracking, and total web-based administration.
  - Bugzilla is a viable competitor for bug tracking; ZOPE for shared documents.
36. Open Source Packages Used Peripherally With MD8 Software
- ZOPE, the Zen of Object Publishing
  - Web application server (in Python) used for content management.
  - Maintained using a browser-based GUI. Allows participants located anywhere access to management utilities.
  - Scripts can be used to enhance the web site. Examples: Python, Perl, Document Template Markup Language (DTML), ZOPE Page Templates (ZPT).
- Concurrent Versions System (CVS)
  - Network-transparent program allowing developers to track different development versions of source code.
  - Keeps single copy and records all changes made.
  - Any version can be reconstructed.
37. Open Source Packages Used Peripherally With MD8 Software
- NetBeans (may be eclipsed by Eclipse)
  - Powerful and flexible Integrated Development Environment (IDE).
  - Competition: Eclipse (IDE tool from IBM), XEmacs, Forte, JBuilder (COTS).
- GAIM, an Instant Messenger application
  - Stable; runs well on Linux.
  - Essential for geographically dispersed software group.
  - Competitors are AOL Instant Messenger (AIM), Yahoo! Instant Messenger, etc.
- GIMP, a photograph editor and drawing application
  - Competitors: Macromedia Flash, Freehand and Fireworks; Adobe Photoshop and Illustrator; Corel Photopaint and Draw.
  - Powerful; skilled users may find it more powerful than Photoshop.
38. Open Source Packages Used Peripherally With MD8 Software
- Robust Audio Tool (RAT), an audio conferencing and streaming application.
  - One of the best voice over IP applications.
  - High quality voice transmission.
  - Linux and Windows.
  - Used with TightVNC.
- TightVNC (Virtual Network Computing)
  - Client/server software allowing remote network access to graphical desktops.
  - Access your computer from anywhere with graphical desktop (rather than text-based session).
  - Integrated RAT and TightVNC to provide shared desktop for developers in dispersed locations to work from one desktop and communicate via voice through the internet.
  - Useful for demos and training.
39. Open Source Packages Used Peripherally With MD8 Software
- RMIC, RMI Callback Component (MD8 OPS Client)
  - Custom socket factory provides capability for RMI callbacks to function when client is behind a firewall and server needs to communicate back to client.
  - Tunnels communication from server to client back through the connection initially established by client.
- Squid
  - A full-featured Web component for proxying and caching of HTTP, FTP, and other URLs.
  - Used for GCMD's proxy hardware architecture.
- Rsync
  - File transfer program for Unix.
  - Used to mirror GCMD web tree.
40. Open Source Packages Currently Being Investigated For Use
- Web Statistical Packages
  - Analog: easy setup, fast, works on .gz files, output in HTML and GIFs.
  - WebAnalyzer: adequate temporal breakdown, output in HTML, nice graphics.
  - Funnel Web Analyzer (www.quest.com): Enterprise edition available at $995 with added functionality; doesn't operate on referrer logs (only access logs), output in HTML, detailed reports, nice graphics. Automatically checks format of access logs, permitting easy running of ZOPE log.
  - Commercial competitors: 123LogAnalyzer, Webtrends, Virtual Webtrends, NetTracker, Deep Metrix. Costs range from $129 to $10,500. Shortcomings include speed of analyzers, platform incompatibilities, inadequate temporal resolution of stats, and lack of filtering.
41. Open Source Packages Currently Being Investigated For Use
- Eclipse (for Integrated Development Environment) (http://www.eclipse.org/)
  - Code completion with Java documentation; code formatting.
  - Refactoring, version control, compiling, debugging, and syntax highlighting.
  - Native GUI.
  - Hundreds of plug-ins to create tools for other languages and applications.
  - Wizards for implementing classes.
- Bugzilla for bug tracking.
  - Mark tasks for specific releases.
  - Use also for task tracking.
- Simple Object Access Protocol (SOAP)
  - XML/HTTP-based protocol for accessing services, objects, and servers in a platform-independent manner.
42. Open Source Packages Currently Being Investigated For Use
- PostgreSQL
  - Database with triggers.
  - Subselects.
  - Views.
- Mckoi (http://mckoi.com)
  - SQL database written for the Java(TM) platform.
  - Can be embedded as stand-alone application.
  - Optimized to run as client/server database server for multiple clients.
  - Highly multi-threaded.
  - Features extendable object-oriented engine.
  - Views and subselects.