Title: Optimising Data Accessibility
1 Optimising Data Accessibility via Reference
Metadata Management Principles
Q2006 - April 2006 (Russell Penlington OECD)
2 Search Engine Metadata Indexing
Query Search Engine
Get Results
Find Web Page with Data Link
- Linkage
- Relevancy
- Metatags
- Uniqueness
- Page Structure
Download Data
Understanding Search Engines
- Search engines crawl and index content by following hyperlinks found in online reference metadata.
- A set of core concepts serves as significant components in major search engine algorithms.
- Statistical organisations should provide well structured reference metadata, reported online in a way that lets search engines index it optimally.
3 Linkage - Structuring Navigation
Linking Strategy
Multidimensional Data Structure
Reference Metadata Attachment Levels
- Common or shared Level
- Dataset Level
- Dimension Level
- Dimension Member Combinations (Coordinate Slices) Level
- Series Level
- Observation Level
Fundamentals of Linking
- Metadata content must be linked together so that search engines are able to index the entire content of the domain.
- Underlying metadata management systems should allow reference metadata to be attached to any level of the multidimensional data space.
- Online metadata reporting systems should use friendly URLs so that each content page can be represented independently and linked to.
- Unique metadata content pages should be easy to link to, so that external domains can create links that aid discovery.
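
The friendly-URL point above can be sketched as follows. This is a minimal illustration, not the scheme used by any particular reporting system: the base address `http://stats.example.org/metadata`, the helper names `slugify` and `friendly_url`, and the choice of one path segment per attachment level are all assumptions made for the example.

```python
import re

# Hypothetical base address, used only for illustration.
BASE = "http://stats.example.org/metadata"


def slugify(label):
    """Lower-case a label and replace runs of non-alphanumerics with hyphens."""
    return re.sub(r"[^a-z0-9]+", "-", label.lower()).strip("-")


def friendly_url(base, levels):
    """Build one readable URL from the labels of each level, top level first."""
    path = "/".join(slugify(level) for level in levels)
    root = base.rstrip("/")
    return f"{root}/{path}/" if path else f"{root}/"


# One independent, linkable page per attachment level (dataset, subject, country).
levels = ["Main Economic Indicators", "Monetary Aggregates", "United States"]
for depth in range(1, len(levels) + 1):
    print(friendly_url(BASE, levels[:depth]))
```

Because every level maps to its own stable address, each page can be linked to directly from within the domain or from external sites.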
4 Relevancy - Establish and Maintain
Online Content Example
Gross Domestic Product > Transaction > Final Consumption
Content Inherited from higher level pages
Fundamentals of Relevancy
- Each metadata page is attributed a relevancy score for each of the keywords and phrases found in the page.
- The overall relevancy of a particular phrase for a page is derived from the combination of the relevancy scores of the linking pages and of the page content itself.
- The text chosen for linking reference metadata together plays an important role in carrying relevancy through the hierarchical structure of the content.
- Maintaining relevance when linking reference metadata together also provides an intuitive navigation system for end users.
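
One way to carry relevant link text through the hierarchy is a breadcrumb trail whose anchor text is the descriptive label of each level. The sketch below is illustrative only: the function name `breadcrumb_html` and the relative URLs are assumptions, and the labels reuse the Gross Domestic Product example from this slide.

```python
from html import escape


def breadcrumb_html(trail):
    """Render (label, url) pairs, top level first, as a chain of descriptive links."""
    links = [
        f'<a href="{escape(url, quote=True)}">{escape(label)}</a>'
        for label, url in trail
    ]
    # "&gt;" is the HTML entity for the ">" separator shown to the user.
    return " &gt; ".join(links)


# Hypothetical URLs; the labels double as the link text that search engines read.
trail = [
    ("Gross Domestic Product", "/gdp/"),
    ("Transaction", "/gdp/transaction/"),
    ("Final Consumption", "/gdp/transaction/final-consumption/"),
]
print(breadcrumb_html(trail))
```

Using the level label as link text, rather than a generic phrase, gives both the crawler and the end user the same navigational cue.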
5 Metatags - Create Dynamically
Metatags
Characteristics
Index Focus
Constructing Optimised Metatags
- Title - The first 50 characters are typically indexed by the search engine and represent the most important content in terms of relevancy.
- Description - The first 200-250 characters are indexed and should contain grammatically correct sentences with no more than 2 commas.
- Keywords - The first 5 to 10 keywords are used by some minor search engines; however, they have diminished in importance for large search engines.
- Content for each of the important Metatags can be constructed dynamically from the underlying reference metadata management system.
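
Dynamic construction of the Metatags could look roughly like the sketch below. The record fields (`title`, `definition`, `keywords`), the function names and the placeholder text are assumptions for illustration; only the length limits come from the guidance above.

```python
from html import escape

TITLE_LIMIT = 50         # characters typically indexed for the title
DESCRIPTION_LIMIT = 250  # upper end of the 200-250 character range
KEYWORD_LIMIT = 10       # upper end of the 5 to 10 keyword range


def truncate_at_word(text, limit):
    """Cut text to at most `limit` characters without splitting a word."""
    text = " ".join(text.split())  # collapse runs of whitespace
    if len(text) <= limit:
        return text
    # Look one character past the limit, then drop the trailing partial word.
    head = text[: limit + 1].rsplit(" ", 1)[0]
    return head[:limit]


def build_metatags(record):
    """Build title, description and keywords tags from one metadata record."""
    title = truncate_at_word(record["title"], TITLE_LIMIT)
    description = truncate_at_word(record["definition"], DESCRIPTION_LIMIT)
    keywords = ", ".join(record["keywords"][:KEYWORD_LIMIT])
    return "\n".join([
        f"<title>{escape(title)}</title>",
        f'<meta name="description" content="{escape(description, quote=True)}">',
        f'<meta name="keywords" content="{escape(keywords, quote=True)}">',
    ])


# Hypothetical record standing in for a row from the metadata management system.
record = {
    "title": "United States Monetary Aggregates - Sources and Definitions",
    "definition": "Placeholder definition text drawn from the metadata store.",
    "keywords": ["monetary aggregates", "United States", "definition"],
}
print(build_metatags(record))
```

Truncating on a word boundary keeps the most relevant leading words intact inside the portion a search engine is likely to index.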
6 Page Structure - Content and Code
Uniqueness
- Content deemed to be a copy is inevitably discredited and given a low weighting in the final search results.
- It is important not to replicate the same reference metadata content at different levels of the reference metadata reporting structure.
- Metadata should be attached at the highest possible level of the data's multidimensional space, with only the exceptions detailed at the lower levels.
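
The attachment rule above can be sketched as a lookup that shows only the note attached at a page's own level and links to parent levels instead of copying their text. The `NOTES` store, its keys and the function name `page_content` are hypothetical and stand in for a real metadata management system.

```python
# Hypothetical store: each key is a path in the multidimensional space and
# each value is the note attached at exactly that level (never a copy).
NOTES = {
    ("mei",): "General note shared by the whole dataset.",
    ("mei", "monetary-aggregates"): "Note specific to the subject.",
    ("mei", "monetary-aggregates", "usa"): "Exception that applies to one country only.",
}


def page_content(path):
    """Return the note attached at `path` and the parent levels to link to."""
    path = tuple(path)
    own = NOTES.get(path)
    # Walk from the nearest parent up to the top level; link, do not duplicate.
    parents = [
        path[:i] for i in range(len(path) - 1, 0, -1) if path[:i] in NOTES
    ]
    return own, parents


own, parents = page_content(("mei", "monetary-aggregates", "usa"))
print(own)
print(parents)
```

Each page therefore carries unique text, while the shared text stays reachable through the parent links.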
Fundamentals of Page Content
- Metadata content found at the beginning of a page is given a higher weighting than content found at the end.
- Some display methods improve a page without modifying the actual content, such as inserting introductory headings and text drawn from the metadata management system.
- Textual content should be grammatically correct and free of spelling errors.
Fundamentals of Coding Techniques
- Externalisation - Client-side script and styles (JavaScript, CSS) should be referenced in separate files to avoid diluting the weighting of the visible text content.
- No Frames - Reference metadata displayed in an unlinked frame separate from the base access page cannot be indexed by a search engine.
- Valid HTML - Search engines favour HTML pages that adhere to the HTML standards specified by the World Wide Web Consortium (W3C).
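
Two of the coding techniques above, externalisation and avoiding frames, lend themselves to a simple automated check. The sketch below uses Python's standard `html.parser` module; the class name `StructureAudit`, the function `audit` and the wording of the messages are assumptions, and the check does not validate a page against the W3C standards.

```python
from html.parser import HTMLParser


class StructureAudit(HTMLParser):
    """Collect warnings about frames and inline script or style blocks."""

    def __init__(self):
        super().__init__()
        self.issues = []

    def handle_starttag(self, tag, attrs):
        attr_names = {name for name, _ in attrs}
        if tag in ("frame", "frameset"):
            self.issues.append(f"<{tag}> found: framed content may not be indexed")
        elif tag == "script" and "src" not in attr_names:
            self.issues.append("inline <script>: consider an external file")
        elif tag == "style":
            self.issues.append("inline <style>: consider an external stylesheet")


def audit(html_text):
    """Return the list of warnings for one HTML page."""
    parser = StructureAudit()
    parser.feed(html_text)
    parser.close()
    return parser.issues


page = "<html><head><style>p{}</style></head><body><p>Definition</p></body></html>"
for issue in audit(page):
    print(issue)
```

A check of this kind could run whenever the reporting system regenerates its pages, with full HTML validation left to a dedicated validator.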
7 Summary and Online Examples
- Linkage - Online metadata reporting systems should use a system of inheritance when linking content together, allowing all content to be traversed from a single entry point by using friendly URLs.
- Relevance - Unique pages within a metadata reporting system should retain and pass on relevance relative to the neighbouring pages linking into and out of them.
- Metatags - Metatag content for each page in a metadata reporting system should be relevant to the page's content and be dynamically constructed from the underlying metadata management system.
- Uniqueness - Each page within an online metadata reporting system should contain unique content and provide links to higher levels of the multidimensional data structure's metadata to inherit parent level metadata.
- Page Structure - HTML pages that report metadata online should be free of textual errors, not be displayed in frames, and adhere to HTML version specifications.
OECD Main Economic Indicators - Sources and Definitions
Pilot project reporting single dataset metadata online from the MetaStore metadata management system.
http://stats.oecd.org/mei/
Google Keyword Search Examples
- Consumer Opinion Surveys Definition
- United Kingdom Labour Compensation
- United States Monetary Aggregates
8 Screenshot Backups
9 Google Search - Consumer Opinion Surveys Definition
10 Google Search - United Kingdom Labour Compensation
11 Google Search - United States Monetary Aggregates
12 MEI Sources and Definitions - United States Monetary Aggregates
http://stats.oecd.org/mei/default.asp?subject=14&country=USA
13 MEI Sources and Definitions - Monetary Aggregates
http://stats.oecd.org/mei/default.asp?subject=14
14 MEI Sources and Definitions - Home Page
http://stats.oecd.org/mei/