Title: Metadata workshop, 15 December 2003
1Metadata workshop, 15 December 2003 Durham
University
2Metadata Workshop Timetable
- project overview
- metadata and geo-spatial datasets
- HFE Metadata Application Profile and guidelines
- metadata tools
- benefits of creating metadata
- a Go-Geo! Portal overview and hands-on
evaluation session.
3Aims of the Workshop
- introduce geo-spatial metadata concepts and available resources with the intention of establishing a new mindset amongst data developers and users in academia
- encourage metadata creation and publication
- seek your feedback on the design and functionality of the Go-Geo! portal.
4Project Overview
- The driving impetus of this project was the recognition that a data sharing and management solution needed to be developed for the academic community to address the increasing number of geo-spatial datasets that academics and students were creating with GI systems, database technologies and conventional means.
- Portal technology and metadata were identified as the resources for delivering these capabilities to the academic community, especially with regard to using portal technology as a mechanism to publicise and deliver existing datasets to a range of users.
- This led to the development of...
5The Go-Geo! Portal
- a simple interface designed to run queries to discover geo-spatial datasets. The portal enables searching by various options including free text, date, resource type and geographic location.
- Geo-spatial datasets refer to data...
6Geo-spatial dataset
- ...data that have some form of spatial or geographic reference that enables them to be located in two- or three-dimensional space
7Project History Phase I - Scoping Study
- 10-month project phase (Aug 2000 - June 2001), JISC funded
- undertaken by EDINA and the History Data Service (now UKDA) and involved other key players, e.g. JISC, MIMAS, ADS, UKDA
- feasibility study
  - understand requirements and demand for a portal and browser
  - explore options: investigate technical and organisational issues
- activities included
  - undertaking requirements analysis
  - reviewing metadata standards v. needs of HFE
  - identifying geo-spatial resources and assessing what metadata existed
8Phase II - Portal Demonstrator
- running from July 2002 to June 2003, portal-related activities included
  - logo, name and design, hence the Go-Geo! Portal
  - portal Help pages
  - the development of a demonstrator portal with simple query interfaces allowing for search by subject, date, resource type and geographic location
  - further development to demonstrate cross-searching of
    - a database local to the portal
    - an existing, remote, structured geo-spatial data directory service to find geo-spatial data (HDS database)
    - an existing resource catalogue containing geo-related resources (GEsource)
    - the GIgateway and its directory services.
9Go-Geo! portal architecture
[Diagram. Labels: Portal; Other IE Content Providers; Geo-data Gateway; NGDF/GIgateway Network; Metadata or resource servers; Geo-data Network (proposed)]
10Phase II Go-Geo Content
- GI-related resources, tied together by location, which include
  - software, learning resources, courses and training, etc.
  - information about studies and projects, articles, reports, organisations, personal contacts, mailing-lists
  - conferences
  - guidance and reference documents for understanding and creating geo-spatial metadata
- a workshop was held at the University of Essex in January 2003 to introduce the Go-Geo! Portal demonstrator to stakeholders as part of an effort to publicise the portal and receive suggestions for its improvement.
11Phase II Metadata Activities
- amended and finalised the HFE Metadata Application Profile, derived in large part from the NGDF Discovery Metadata Guidelines and mandatory ISO 19115 elements
- produced a 150-page guideline document for the HFE Profile to support metadata creation
- cross-mapped between the HFE Profile, the ISO 19115 Metadata Standard, the FGDC Standard and the NGDF
- created 100 metadata records for portal content and demonstrator purposes. This included 25 records converted from the Archaeology Data Service (ADS), the History Data Service (HDS) and Manchester Information and Associated Services (MIMAS), and 75 newly created records from EDINA
- reviewed and selected potential sources for geo-spatial datasets, hence contacts for metadata; also distributed questionnaires
- developed an MS Access-based tool for creating metadata records.
12Phase III Go-Geo Portal Trial Service and Metadata Initiative
- running from August 2003 to July 2004, Phase III project efforts will entail running portal evaluation sessions at selected metadata workshops and creating an on-site questionnaire. Both efforts are meant to encourage feedback that will lead to improvements in portal functionality and design in preparation for rolling the portal out as a full service
- a sister project, the JISC-funded Metadata Initiative, will involve the promotion of geo-spatial metadata through workshops and presentations. These are to be organised and presented at up to 18 universities across the UK. The workshops, such as this one, will provide an introduction to geo-spatial metadata, the HFE Metadata Application Profile, supporting guidelines and metadata tool.
13Metadata represent an ordered summary of
information that describes something, in this
case, a geo-spatial dataset. The details include
the What, Where, When, Who and Why of the
dataset, plus the means to access and use it. A
metadata record may answer the following
questions about a dataset:
- What is the purpose of the dataset?
- Where did the dataset originate?
- What attribute information does it contain?
- What processes or algorithms were employed to create it?
- What spatial reference system does the dataset use?
- What is the granularity of the data?
- When was the dataset created?
- How do I obtain the dataset?
- What geographic area or extent does it cover?
- Whom do I contact for more information or access to the dataset?
- What are the access and use restrictions and how much will it cost?
- Who is responsible for creating the metadata record for the dataset?
- What time period does the dataset content cover?
14
- Metadata reveal information that isn't apparent when looking at geo-spatial dataset files in a directory. The details of a dataset file are revealed in the metadata record.
15
- A geo-spatial dataset file opened in a GIS software package doesn't always reveal detailed information without further investigation.
- What do these polygons represent?
  - which application?
  - what are the attributes?
  - where is this study area?
  - which projection and co-ordinate system?
  - what is the spatial accuracy?
  - when were the data captured and processed?
- These questions can be answered with one metadata record.
16
- Think also of defining metadata in terms of food product labelling. Labels provide specific information about the ingredients in these tins. Remove the labels and decide which tin to open. One tin contains tuna-flavoured cat food and the other tuna fish. Would you select Tin A or B?
17Metadata Standards
- Metadata Standards represent precise specifications applied to information documentation operations/procedures to enforce and ensure consistency and interoperability.
- Metadata Standards are organised in a hierarchy of compound elements or entities and data elements that define the information content for metadata to document a set of data.
- Metadata Standards also assign structure and conditions to elements and entities. These include Element and Entity Definitions and Identifiers, Obligations, Data Type and Domain. Obligations refer to whether or not a value must be entered for the element. Data Type defines the format of the value entered, such as character string, date, numeric or a list.
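The structure and conditions described above can be pictured as a small data structure. The following is only a sketch: the class name, field names and `check_value` function are assumptions made for illustration, not part of any standard, and the date type is left out for brevity. The example element is Language (element 14 of the HFE profile), with the three values listed on its slide as its domain.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class ElementDefinition:
    """One element of a metadata standard or profile (illustrative only)."""
    identifier: int                           # element number within the profile
    name: str                                 # element name
    definition: str                           # what the element records
    obligation: str                           # "mandatory" or "optional"
    data_type: str                            # "string", "numeric" or "list"
    domain: Optional[Tuple[str, ...]] = None  # permitted values for "list"


def check_value(element: ElementDefinition, value: Optional[str]) -> List[str]:
    """Return a list of problems found with one entered value."""
    if value is None or not value.strip():
        # Obligation: an empty value is only a problem for mandatory elements.
        if element.obligation == "mandatory":
            return [f"{element.name}: a value must be entered"]
        return []
    if element.data_type == "numeric":
        try:
            float(value)
        except ValueError:
            return [f"{element.name}: '{value}' is not numeric"]
    if element.data_type == "list" and element.domain is not None:
        # Domain: the value has to be one of the permitted terms.
        if value not in element.domain:
            return [f"{element.name}: '{value}' is not in the permitted list"]
    return []


# Element 14 (Language) as described in the HFE profile.
LANGUAGE = ElementDefinition(
    identifier=14,
    name="Language",
    definition="The language(s) used within the dataset.",
    obligation="mandatory",
    data_type="list",
    domain=("English", "Gaelic", "Welsh"),
)

print(check_value(LANGUAGE, "Welsh"))  # no problems reported
print(check_value(LANGUAGE, ""))       # mandatory value missing
```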
18Metadata Standard Initiatives
- Perhaps the most well-known metadata initiative is the Dublin Core.
- The Dublin Core element set defines 15 metadata elements for simple resource discovery. It also serves as an intermediary source for use between the numerous community-specific formats.
- 1) Title 2) Creator 3) Subject and Keywords 4) Description 5) Publisher 6) Contributor 7) Date 8) Resource Type 9) Format 10) Resource Identifier 11) Source 12) Language 13) Relation 14) Coverage 15) Rights Management
19Geo-spatial Metadata Standard Initiatives
- The Federal Geographic Data Committee's Content Standard for Digital Geo-spatial Metadata (CSDGM) contains 334 elements. This standard was produced during a mid-1990s initiative for documenting geo-spatial datasets. The National Geo-spatial Data Framework (NGDF)/GIgateway Metadata Guidelines are based on the FGDC standard. The NGDF/GIgateway Guidelines represent an application profile created for the UK geo-spatial community and the GIgateway gateway web service.
- The International Organisation for Standardisation (ISO) has recently produced an approved version of the ISO 19115 Metadata Standard for Geographic Information. This standard contains 337 elements and will replace the FGDC's Standard. Currently, the NGDF/GIgateway Guidelines are being reviewed in order to make them compliant with ISO 19115.
20Application Profiles
- The geo-spatial metadata standards contain too many elements, and many organisations turn to the development of application profiles to meet their needs.
- an application profile brings a significant reduction in the number of entities and elements each organisation selects from the standards
- this allows for selecting specific elements that are best suited for specific applications. The NGDF/GIgateway Metadata Guidelines contain 42 entities and elements, which were selected to meet the needs of the UK geo-spatial community.
- additional elements that aren't part of a standard can also be added, though this reduces cross-searching capabilities across a wider network and other organisations using their own profiles and standards.
- the careful selection of a core element set is always critical to assure success in cross-searches.
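The point about core element sets and cross-searching can be illustrated with plain set operations. This is a sketch under stated assumptions: the two profiles and their element names are hypothetical placeholders, not real application profiles, and the function names are invented for the example.

```python
from typing import Dict, Set

# Hypothetical element sets; the names are placeholders, not real profiles.
PROFILES: Dict[str, Set[str]] = {
    "profile_a": {"Title", "Creator", "Description", "Topic"},
    "profile_b": {"Title", "Creator", "Description", "Local Project Code"},
}


def cross_searchable(profiles: Dict[str, Set[str]]) -> Set[str]:
    """Elements present in every profile, i.e. usable in a cross-search."""
    element_sets = list(profiles.values())
    if not element_sets:
        return set()
    return set.intersection(*element_sets)


def local_only(profiles: Dict[str, Set[str]], name: str) -> Set[str]:
    """Elements that only the named profile defines."""
    others = [elements for key, elements in profiles.items() if key != name]
    return profiles[name] - set().union(*others)


print(sorted(cross_searchable(PROFILES)))
print(sorted(local_only(PROFILES, "profile_b")))
```

An element added outside the shared set, such as the placeholder "Local Project Code", can still be recorded locally but cannot be searched across the other profile.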
21The HFE Metadata Application Profile and
Guidelines
- derived from the NGDF Metadata Guidelines and the ISO 19115 Metadata Standard, the HFE Metadata Application Profile was created to support the needs of the UK academic community
- it contains 71 elements categorised and separated under eight entity groups
- has 27 mandatory elements, of which 12 are used for contact details. With the exception of one element (Description), the remaining 15 elements require only short answers or the selection of appropriate term(s) from lists
- Guidelines are embedded in the Go-Geo! Portal and contain 150 pages of support material and examples to assist portal users and metadata creators from numerous academic disciplines.
22The HFE Metadata Application Profile Eight Groups
(Entities)
- G1 Citation
- G2 Identification Information (What)
- G3 Data Capture Period (When)
- G4 Time Period Covered by Dataset (When)
- G5 Spatial Extent of Dataset (Where)
- G6 Custodian (Who)
- G7 Distributor (Access)
- G8 Metadata Creator/Record Creator
23..and subgroup entities
- G2 Identification Information
  - G2.sg1 Spatial Reference System
  - G2.sg2 Level of Spatial Detail
- G5 Spatial Extent of Dataset
  - G5.sg1 Spatial Referencing using Geographic Co-ordinates
    - G5.sg1-a Spatial Referencing using Co-ordinates of a Bounding Rectangle
    - G5.sg1-b Spatial Referencing using Co-ordinates of a Bounding Polygon
  - G5.sg2 Spatial Referencing using Geographic Identifiers
- G7 Distributor
  - G7.sg1 Access and Use Constraints
24G1 Citation
- 1. Title (Mandatory) (1)
  - The name by which the dataset is known.
- 2. Alternative Title (Optional)
  - Short name, other name, acronym or alternative language title.
- 3. Creator (Mandatory) (2)
  - Organisation or person that developed the dataset and has primary responsibility for the intellectual content of the dataset.
25
- 4. Identifier (Optional)
  - A unique string or number used to identify the dataset.
- 5. Edition (Mandatory) (3)
  - The number of the edition of the dataset.
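As a rough illustration of how the Mandatory and Optional labels might be applied to a whole record, the sketch below lists the five G1 Citation elements with their obligations as given above and reports which mandatory ones are empty. The dictionary layout, the function name and the sample record are assumptions for the example, not the format used by the project's tool.

```python
from typing import Dict, List, Optional

# Obligations of the five G1 Citation elements, as listed on the slides.
G1_CITATION: Dict[str, str] = {
    "Title": "mandatory",
    "Alternative Title": "optional",
    "Creator": "mandatory",
    "Identifier": "optional",
    "Edition": "mandatory",
}


def missing_mandatory(record: Dict[str, Optional[str]]) -> List[str]:
    """Names of mandatory G1 elements that are absent or blank in a record."""
    missing = []
    for name, obligation in G1_CITATION.items():
        value = record.get(name)
        if obligation == "mandatory" and (value is None or not value.strip()):
            missing.append(name)
    return missing


# Hypothetical, partly filled record.
draft = {"Title": "Example land use survey", "Creator": ""}
print(missing_mandatory(draft))
```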
26G2 Identification Information (What)
- 6. Topic (Mandatory) (4)
  - Main theme(s) of the dataset.
  - 1) Farming 2) Biota 3) Boundaries 4) Climatology/Meteorology/Atmosphere 5) Economy 6) Elevation 7) Environment 8) Geo-scientific Information 9) Health 10) Imagery/Base Maps/Earth Cover 11) Intelligence/Military 12) Inland Waters 13) Location 14) Oceans 15) Planning/Cadastre 16) Society 17) Structure 18) Transportation 19) Utilities/Communication
27
- 7. Controlled Vocabulary (Mandatory) (5)
  - Name of the controlled vocabulary used as a source for the controlled keywords.
  - UNESCO Thesaurus (United Nations Educational, Scientific and Cultural Organization)
  - GEMET (GEneral Multilingual Environmental Thesaurus)
  - HASSET (Humanities and Social Science Electronic Thesaurus)
- 8. Controlled Keywords (Mandatory) (6)
  - Keywords taken from a controlled vocabulary summarising the subject of the dataset.
- 9. Other Keywords (Optional)
  - Other words or phrases summarising the subject of the dataset.
28
- 10. Controlled Place Name Vocabulary (Optional)
  - Name of the controlled vocabulary used as a source for the controlled place name keywords.
  - Getty Thesaurus of Geographic Names
  - Ordnance Survey 1:50000 Gazetteer
  - geoXwalk
- 11. Controlled Place Name Keywords (Optional)
  - The geographic name of a location(s) covered by a dataset.
29
- 12. Description (Mandatory) (7)
  - A brief description of the dataset. This should include some explanation as to why the dataset was produced and how it has been used since its creation.
- 13. Quality (Optional)
  - A general assessment of the quality of a dataset for determining its fitness for use. Quality is stated in terms of accuracy, completeness, and consistency for both the data and the dataset.
- 14. Language (Mandatory) (8)
  - The language(s) used within the dataset.
  - English, Gaelic, Welsh
30
- 15. Further Information (Optional)
  - Source of further information about the dataset.
- 16. Related Datasets (Optional)
  - Information about other, related datasets of a similar theme or derived from a common source, which may be of interest to the user.
31G2 Identification Information (What) G2.sg1
Spatial Reference System
- 17. Co-ordinate System (Conditional Mandatory) (9)
  - Name or description of the spatial referencing system used within the dataset, which is based on co-ordinates, e.g. British National Grid, Irish National Grid, latitude and longitude.
- 18. Geographic Identifiers (Conditional Mandatory) (10)
  - Name or description of the spatial referencing system used within the dataset, which is based on geographic identifiers, e.g. postcodes, postal addresses, administrative units, or countries.
32G2 Identification Information (What) G2.sg2
Level of Spatial Detail
- 19. Source Scale Denominator (Optional)
  - Denominator of the representative fraction on the source map(s) (e.g. on a 1:50000 scale map, the source scale denominator is 50000). If no source map was used, enter 0. If multiple source map scales were used, enter the Source Scale Denominator of the smallest scale map (largest denominator).
33
- 20. Imagery or Grid Raster Cell or Pixel Size X-Value (Optional)
  - The column width of a raster cell expressed in distance units of measure.
- 21. Imagery or Grid Raster Cell or Pixel Size Y-Value (Optional)
  - The row height of a raster cell expressed in distance units of measure.
- 22. Smallest Administrative Unit (Optional)
  - The smallest representative unit associated with disaggregated statistical data.
34G3 Data Capture Period (When)
- 23. Status of the Start Date for Dataset Capture (Optional)
  - Declaration on the status of the starting date for data capture.
  - Known, Not Known, Not Applicable
- 24. Start Date of Dataset Capture Process (Optional)
  - Date on which data for dataset were first collected.
  - e.g. 20031215
35
- 25. Status of the Completion Date for Dataset Capture (Optional)
  - Declaration on the status of the completion date for data capture.
  - Known, Not Known, Not Applicable, Ongoing
- 26. Completion Date of Dataset Capture Process (Optional)
  - Date on which data for dataset were last collected.
  - e.g. 20031215
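The sample value 20031215 on these slides matches the workshop date, 15 December 2003, which suggests an eight-digit YYYYMMDD form. Treating that as an assumption rather than a stated rule, a date entry could be checked along these lines; both function names are invented for the sketch, and the start-before-completion check is a plausible sanity test, not a rule taken from the guidelines.

```python
from datetime import date, datetime


def parse_yyyymmdd(value: str) -> date:
    """Parse an eight-digit date such as 20031215 (format is an assumption)."""
    if len(value) != 8 or not value.isdigit():
        raise ValueError(f"expected eight digits (YYYYMMDD), got {value!r}")
    # strptime rejects impossible dates such as month 13.
    return datetime.strptime(value, "%Y%m%d").date()


def capture_period_is_ordered(start: str, completion: str) -> bool:
    """True when the start date is not later than the completion date."""
    return parse_yyyymmdd(start) <= parse_yyyymmdd(completion)


print(parse_yyyymmdd("20031215"))
print(capture_period_is_ordered("20030801", "20031215"))
```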
36
- 27. Update Frequency (Optional)
  - The frequency with which revisions and updates are made to the dataset after its initial completion.
  - Hourly, Daily, Weekly, Fortnightly, Monthly, Quarterly, Biannually, Annually, Biennially, Triennially, Quinquennially, Decennially, Continuous, Irregular, Never, Not Known, Other
37G4 Time Period Covered by Dataset (When)
- 28. Start Date for Time Period Covered by Dataset (Optional)
  - The start date of the actual time period the dataset covers.
- 29. End Date for Time Period Covered by Dataset (Optional)
  - The end date of the actual time period the dataset covers.
38G5 Spatial Extent of Dataset (Where) G5.sg1
Spatial Referencing using Geographic
Co-ordinates
- 30. System of Spatial Referencing by Co-ordinates (Mandatory) (11)
  - Name of the spatial reference system used for the geographic co-ordinates.
  - British National Grid, Irish Grid, Latitude and Longitude
39 G5.sg1 Spatial Referencing using Geographic
Co-ordinates G5.sg1-a Spatial Referencing
using Co-ordinates of a Bounding Rectangle
- 31. West Bounding Co-ordinate (Mandatory) (12)
  - Westernmost co-ordinate of a bounding rectangle. (Grid Value/Longitude)
- 32. East Bounding Co-ordinate (Mandatory) (13)
  - Easternmost co-ordinate of a bounding rectangle. (Grid Value/Longitude)
- 33. North Bounding Co-ordinate (Mandatory) (14)
  - Northernmost co-ordinate of a bounding rectangle. (Grid Value/Latitude)
- 34. South Bounding Co-ordinate (Mandatory) (15)
  - Southernmost co-ordinate of a bounding rectangle. (Grid Value/Latitude)
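The four bounding co-ordinates lend themselves to a simple consistency check. The sketch below assumes values in one reference system where eastings or longitudes grow eastwards and northings or latitudes grow northwards, as is the case for grid values; the class, the function and the sample numbers are all illustrative.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class BoundingRectangle:
    """Elements 31 to 34, all in the same spatial reference system."""
    west: float
    east: float
    north: float
    south: float


def rectangle_problems(rect: BoundingRectangle) -> List[str]:
    """Report co-ordinates that contradict each other."""
    problems = []
    if rect.west > rect.east:
        problems.append("West bounding co-ordinate is greater than East")
    if rect.south > rect.north:
        problems.append("South bounding co-ordinate is greater than North")
    return problems


# Hypothetical grid values in metres.
study_area = BoundingRectangle(west=420000, east=430000, north=550000, south=540000)
print(rectangle_problems(study_area))
```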
40 G5.sg1 Spatial Referencing using Geographic
Co-ordinates G5.sg1-b Spatial Referencing
using Co-ordinates of a Bounding Polygon
- 35. Spatial Referencing using Co-ordinates of the Bounding Polygon (Optional)
  - The set of x and y co-ordinates (first number easting of a point, second number northing of a point) that make up the bounding polygon.
41 G5.sg2 Spatial Referencing by
Geographic Identifiers
- 36. Nations (Optional)
  - Geographic coverage expressed in terms of nations within the British Isles.
  - England, Northern Ireland, Scotland, Wales, Isle of Man, Channel Islands, United Kingdom, Republic of Ireland
42
- 37. Administrative Areas (Optional)
  - Geographic coverage expressed in terms of administrative areas.
- 38. Postcode Districts (Optional)
  - Geographic coverage expressed in terms of postcode districts.
43G6 Custodian (Who)
- 39. Name of Custodian (Mandatory) (16)
  - The name of the organisation or person responsible for the maintenance of the dataset.
- 40. Postal Street Address of Custodian (Mandatory) (17)
- 41. Postal City of Custodian (Mandatory) (18)
- 42. Postal County of Custodian (Optional)
- 43. Postal Code of Custodian (Mandatory) (19)
- 44. Postal Country of Custodian (Mandatory) (20)
44
- 45. Telephone Number of Custodian (Optional)
- 46. Facsimile Number of Custodian (Optional)
- 47. Email Address of Custodian (Optional)
- 48. Web Address of Custodian (Optional)
45G7 Distributor (Access)
- 49. Name of Distributor (Mandatory) (21)
  - The name of the organisation or person from whom the dataset may be obtained.
- 50. Full Postal Street Address of Distributor (Mandatory) (22)
- 51. Postal Code of Distributor (Mandatory) (23)
- 52. Telephone Number of Distributor (Optional)
- 53. Facsimile Number of Distributor (Optional)
- 54. Email Address of Distributor (Optional)
- 55. Web Address of Distributor (Optional)
46
- 56. Presentation Type (Optional)
  - Form in which the dataset is available.
  - Image, Graphic, Map, Numeric, Text, Other
- 57. Dataset Format (Optional)
  - Format in which digital data can be provided (e.g. DXF, DLG, MapInfo, IDRISI, ARC/INFO, ERDAS, DBF)
- 58. Supply Media (Optional)
  - Media format in which the dataset can be supplied.
  - Paper, Magnetic, Optical, Online, Other
47
- 59. Sample (Optional)
  - A sample of the dataset and its approximate file size (Megabytes).
- 60. Online Linkage (Optional)
  - The name of the World Wide Web site or other on-line source that contains the dataset.
48G7 Distributor (Access) G7.sg1 Access and
Use Constraints
- 61. Access Constraints (Optional)
  - Restrictions and legal prerequisites for accessing the dataset. These include any access constraints applied to assure the protection of privacy or intellectual property, and any special restrictions or limitations on obtaining the dataset.
  - Financial, Legal, Other, Not Known, None
49
- 62. Access Details (Optional)
  - Description of the restrictions and legal prerequisites for accessing the dataset.
- 63. Use Constraints (Optional)
  - Restrictions and legal prerequisites on using the dataset after access is granted. These include any access constraints applied to assure the protection of privacy or intellectual property, and any special restrictions or limitations on obtaining the dataset.
50G8 Metadata Creator/Record Creator
- 64. Name of Metadata Creator (Mandatory) (24)
  - The name of the organisation or person responsible for the metadata updates.
- 65. Full Postal Street Address of Metadata Creator (Mandatory) (25)
- 66. Postal Code of Metadata Creator (Mandatory) (26)
- 67. Telephone Number of Metadata Creator (Optional)
- 68. Facsimile Number of Metadata Creator (Optional)
51
- 69. Email Address of Metadata Creator (Optional)
- 70. Web Address of Metadata Creator (Optional)
- 71. Metadata Last Updated (Mandatory) (27)
  - Date on which the metadata (file) were created or last updated.
  - e.g. 20031215
52Metadata Tool
- temporary metadata tool designed within MS Access, which can be downloaded from the project web pages and used for creating metadata records
- includes all 71 elements of the HFE Metadata Application Profile
- metadata records can be saved as database files and sent to the UK Data Archive, where they'll be validated and sent to EDINA for conversion into an XML format and stored on the Go-Geo! Portal's node. XML (eXtensible Markup Language) is a system for marking up documents and data using tags that indicate or define structural elements
- the UKDA and EDINA are developing a Java-based internet metadata tool that will further simplify the process of metadata creation and validation
- finally, some GIS software packages contain metadata editors that can extract some information, but ultimately there are no easy alternatives because most of the information lies here...
53
- ...in our brains, and we need to move it from here to the computer.
- Some day we might find solutions to this problem....
54
- we may encounter beings from other worlds who could use telepathy to extract dataset information from our heads...
55
- or we'll discover a technological solution that extracts dataset information from our heads and transfers it to the computer.
- Until then, we'll need to depend on available tools.
56MS Access Metadata Tool Demonstration
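To give a feel for the XML conversion step mentioned under Metadata Tool, the sketch below turns a flat record into tagged elements using Python's standard library. The tag names `metadataRecord` and `element`, the `name` attribute and the sample values are all hypothetical; the actual schema used on the Go-Geo! Portal node is not described here.

```python
import xml.etree.ElementTree as ET
from typing import Dict


def record_to_xml(record: Dict[str, str]) -> str:
    """Serialise a flat metadata record as XML (tag names are hypothetical)."""
    root = ET.Element("metadataRecord")
    for name, value in record.items():
        # One tagged element per metadata element; special characters in
        # the value are escaped by ElementTree on output.
        child = ET.SubElement(root, "element", {"name": name})
        child.text = value
    return ET.tostring(root, encoding="unicode")


# Hypothetical record with three of the 71 elements filled in.
sample = {
    "Title": "Example soils & geology survey",
    "Language": "English",
    "Metadata Last Updated": "20031215",
}
print(record_to_xml(sample))
```

Reading the output back with `ET.fromstring` recovers the same values, which is the property that makes a tagged format convenient for passing records between organisations.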
57The Importance of Metadata Creation
- provides support to create a mindset and operational structure for managing and storing dataset information for departmental and intra-departmental use
- assures integrity of existing and new datasets using metadata as a tracking mechanism to monitor changes and edits to datasets
- maintains an inventory of datasets to reduce redundancy and time required to reassess existing datasets for new and future applications
- eliminates or reduces the risk of redundancy in data collection or deletion of existing datasets
- reduces the disruptive effects of staff turnover
- protects investments of time and cost dedicated to data development
58
- assures that other organisations will not replicate data at added cost and time
- provides potential users with a dataset catalogue to view and select datasets to complement or augment their existing in-house datasets, which can be used together for other applications
- allows for more spontaneity amongst users as they browse the Go-Geo! portal and metadata. The discovery of a dataset may prompt the user to develop an idea for a new application
- metadata on the portal can be referenced and cited for project proposals
- the portal's node can serve as a repository for organisations to store and manage their metadata and use the portal as an internal resource to access and share datasets. This will save an organisation the time and cost associated with establishing and maintaining internal servers to store, manage and update metadata
59
- metadata and the portal can provide a quick, short-term solution for data developers to protect their intellectual rights using metadata to announce their data and applications on the portal
- some organisations and individuals may wish to advertise and sell their datasets to other interested parties in academia and in the private and public sectors
- the portal will be linked to the other portals and the UK gateway, thus allowing for advertisement to a large audience of data users
- the metadata and portal will complement and augment other UK academic portals and the UK's GIgateway gateway site.
60GIgateway
- Just a few words regarding GIgateway (www.gigateway.org.uk). It is a geo-data gateway site serving the UK's geo-spatial community. The Go-Geo! Portal will be a service that complements the GIgateway site. The Go-Geo! portal will focus specifically on the needs of the academic community. Students moving on to employment in government and public and private sector organisations will be able to create metadata for GIgateway using its support system of metadata tools and guidelines.
61Go-Geo! Portal Trial Service
- the trial of the Go-Geo! portal demonstrator service began on 17th November 2003.
- an initial evaluation is being held to allow for further feedback at the start of the trial.
- a second evaluation will take place towards the end of the trial.
- the trial Go-Geo! service and further information about the project can be found at http://www.gogeo.ac.uk.
62Go-Geo! Portal Evaluation
- The evaluation session will last for 45 minutes.
- try out the portal at http://www.gogeo.ac.uk.
- complete and return the questionnaire