Development of a Grid Enabled Occupational Data Environment

Description:

GEODE - eSS Manchester, June 2006. Development of a Grid Enabled Occupational ... issues in support for alternative security levels to allow modification of ...

Slides: 19
Provided by: geo98

1
Development of a Grid Enabled Occupational Data
Environment
  • GEODE: www.geode.stir.ac.uk
  • Paper presented to the Second International
    Conference on e-Social Science, Manchester, 28-30
    June 2006

2
Development of a Grid Enabled Occupational Data
Environment
  • Introduction: Occupational Information
  • Activities in two areas
  • Occupational Information Depository
  • Access to occupational information
  • Conclusions and prospects

3
What's the problem?
  • Occupational information is indexed mainly by
    Occupational Unit Group (OUG). But:
  • Numerous alternative occupational data files
    (by time, country, format)
  • Alternative OUG schemes and other index factors
    (e.g. employment status)
  • Inconsistent translations to social
    classifications, by file or by fiat
  • Dynamic updates to occupational data resources
  • Low uptake of existing occupational information
    resources
  • Strict security constraints on users'
    micro-social survey data

4
Some illustrative occupational information
resources
5
GEODE: Grid Enabled Occupational Data Environment
  • Objectives
  • Operate as a portal
  • Facilitate linking occupational information to
    users' datasets
  • (initial focus on CAMSIS occupational information
    resources)
  • GEODE data resources: occupational information
    data curated as a data service in Stirling,
    accessed by users via the portal
  • Create an international Virtual Organization for
    occupational data community
  • Sharing, indexing, curating diverse
    occupational data
  • Other analytical functions on occupational data?

6
GEODE Building blocks
  • Globus Toolkit 4 (WSRF implementation)
  • To build grid application services
  • GridSphere 2.1.2 (portal framework, JSR 168)
  • OGSA-DAI (data access grid middleware)
  • http://www.ogsadai.org.uk/
  • DDI (social science metadata in XML)
  • http://www.icpsr.umich.edu/DDI/
  • Development environment
  • Jakarta Tomcat 5.x
  • Axis SOAP Engine
  • Java

7
2) Occupational Information Depository
  • Grid as a system that (e.g. Foster et al. 2001):
  • coordinates resources that are not subject to
    centralized control
  • uses standard, open, general-purpose protocols
    and interfaces
  • delivers non-trivial qualities of service
  • Use with occupational information depository
  • Create a community where members have abstract
    access to heterogeneous resources securely, and
    achieve wider collaboration

8
GEODE - architecture
9
GEODE Occupational Information Depository
  • Data Index Service uses DDI and OGSA-DAI
  • User Requirements / Evaluations
  • Three elements
  • Semantic data curation
  • Data storage
  • Data indexing / access

10
Occupational information depository
  • 2.1) Semantic curation of occupational
    information
  • Establish a GEODE-M meta-data subset (.xml)
  • Founded on Michigan Data Documentation Initiative
  • Minimise curation requirements to suit occ.
    information resource providers (pilots)
  • Web proforma entry
  • via Portal using GridSphere
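
As a rough illustration of what a minimal curation record entered through a web proforma might look like, the sketch below builds a small XML document in Python. The element names (`geodeRecord`, `title`, `country`, `ougScheme`, `fileFormat`) are hypothetical placeholders, not the actual GEODE-M or DDI schema, and the sample values are illustrative only.

```python
import xml.etree.ElementTree as ET


def build_record(fields):
    """Build a small XML metadata record from proforma-style fields.

    Element names are hypothetical; they are NOT the real GEODE-M or DDI schema.
    """
    root = ET.Element("geodeRecord")
    for name, value in fields.items():
        child = ET.SubElement(root, name)
        child.text = str(value)
    return root


# Illustrative values only, standing in for what a provider might enter.
record = build_record({
    "title": "Example occupational scale",
    "country": "GB",
    "ougScheme": "SOC2000",
    "fileFormat": "csv",
})
xml_text = ET.tostring(record, encoding="unicode")
print(xml_text)
```

Keeping the record to a handful of flat fields mirrors the stated aim of minimising curation requirements for resource providers.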

11
Occupational information depository
  • 2.2) Storing occupational information resources
  • Considerations
  • All data stored at GEODE vs Linkage to external
    data
  • Proprietary software (plain text / SPSS / STATA)
  • Rectangular index files vs other formats (e.g.
    pdf)
  • index file format is easy and aids data storage
    / indexing
  • Finite number of occ info. files / model of
    plurality of supply
  • International community of data providers
  • Negligible security restrictions (free online
    resources)
  • Strategy
  • (1) GEODE-M proforma, suits all formats, completed
    online
  • (2) Translation to csv index file
  • (3) Modify GEODE-M record for index file
  • (2) and (3) performed automatically or manually
  • Storage: OGSA-DAI framework to link index files
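
The translation to a csv index file could, under simple assumptions, look like the sketch below. It assumes the deposited resource is a whitespace-delimited plain-text file whose first column is the OUG code. The function name `to_csv_index`, the column names and the sample rows are all hypothetical, and real resources (SPSS, STATA, pdf) would need format-specific handling or manual work.

```python
import csv
import io


def to_csv_index(raw_text, columns):
    """Convert whitespace-delimited rows into a csv index file (as a string).

    `columns` names the fields; the first column is assumed to be the OUG code.
    """
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(columns)
    for line in raw_text.splitlines():
        parts = line.split()
        if not parts:
            continue  # skip blank lines
        if len(parts) != len(columns):
            raise ValueError(f"unexpected number of fields: {line!r}")
        writer.writerow(parts)
    return out.getvalue()


# Made-up rows: OUG code, employment status, scale score.
raw = """
1111 1 72.5
1112 1 68.0
2311 2 55.3
"""
index_csv = to_csv_index(raw, ["oug", "empst", "score"])
print(index_csv)
```

Rejecting rows with an unexpected field count is one way to flag files that need the manual route rather than silently producing a misaligned index.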

12
Occupational information depository
  • OGSA-DAI implementations on prototype service
  • Testing dynamic deployment of selected data
    resources (CAMSIS)
  • Registration with index service (pilot tests)
  • Searchable via portal service
  • OGSA-DAI evaluations
  • Foundations suited to collation of diverse occ
    data resources
  • Also facilitates data access functions (see 3)
  • Accepts GEODE-curated resources, externally
    curated resources, and potential connections with
    other Grid data services
  • Issues in support for alternative security levels
    to allow modification of initially deposited
    resources

13
Occupational information depository
  • 2.3) Virtual Organisation for Occupational
    Information Depository
  • MDS (via GT4) to manage VO access to and
    distribution of occupational information
    resources
  • International virtual community
  • Dynamic data supply

14
3) Access to Occupational Information
15
GEODE portal access
  • 3.1) File linkage mechanisms
  • Multiple occupational variables on (A)
  • Strict security constraints on (A)
  • Inconsistent OUG formats on (A)
  • Prototype linkages (e.g. CAMSIS) require full
    access to (A)
  • Cater to limited access to (A)
  • Investigate digital certification (X.509) to
    allow restricted data transfer A_OUGs
    A_context
  • Requirements analysis
  • Minimal user certification process
  • Avoid application installation by users
  • Users' complex survey data (e.g. multiple
    occupational records)

Micro-social data (A) ↔ Occupational information
resources (B)
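
One possible reading of catering to limited access to (A) is that only the occupational codes leave the user's environment, while the rest of each survey record stays local. The sketch below illustrates that idea. It is an assumption made for illustration, not the GEODE design: the function names, field names and data are hypothetical, and no certification (X.509) step is modelled.

```python
def extract_ougs(records, oug_field="oug"):
    """Return the distinct OUG codes in the user's data (the only part sent out)."""
    return sorted({r[oug_field] for r in records})


def lookup_scores(ougs, index):
    """Service side: map each requested OUG code to its score, or None if unknown."""
    return {oug: index.get(oug) for oug in ougs}


def attach_scores(records, scores, oug_field="oug"):
    """User side: merge the returned scores back onto the full records."""
    return [{**r, "score": scores.get(r[oug_field])} for r in records]


# Made-up micro-social data (A) and occupational index (B).
survey = [
    {"id": 1, "oug": "1111", "income": 30000},
    {"id": 2, "oug": "2311", "income": 21000},
    {"id": 3, "oug": "9999", "income": 18000},
]
index_b = {"1111": 72.5, "2311": 55.3}

a_ougs = extract_ougs(survey)
linked = attach_scores(survey, lookup_scores(a_ougs, index_b))
print(linked)
```

Codes with no match in (B) come back as `None`, so the user can see which occupational records could not be linked.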
16
GEODE portal access
  • 3.2) Analytical queries
  • Process analytical tasks on aggregate
    occupational information resources
  • Summary data
  • Coverage searches
  • Summary statistics
  • Consider more complex analyses?
  • CAMSIS derivations
  • Involve interactive data management tasks
  • cf. Nesstar / Data Web
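
The simpler analytical queries listed above (summary statistics and coverage searches) can be sketched against an in-memory index as follows. The mapping from OUG code to score, the function names and the values are hypothetical. A real service would query the deposited index files rather than a Python dictionary.

```python
from statistics import mean


def summarise(index):
    """Return simple summary statistics for an OUG -> score mapping."""
    scores = list(index.values())
    return {
        "n": len(scores),
        "min": min(scores),
        "max": max(scores),
        "mean": mean(scores),
    }


def coverage(index, ougs):
    """Report which requested OUG codes the resource covers and which it lacks."""
    wanted = set(ougs)
    covered = wanted & set(index)
    return {"covered": sorted(covered), "missing": sorted(wanted - covered)}


# Made-up occupational index: OUG code -> scale score.
index = {"1111": 72.0, "1112": 68.0, "2311": 55.0}

stats = summarise(index)
report = coverage(index, ["1111", "9999"])
print(stats)
print(report)
```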

17
4) Conclusions and prospects
  • Occupational Information Depository
  • OGSA-DAI implementations
  • Index-files annotated through GEODE-M
  • Some ongoing manual support requirements
  • Portal framework
  • Accessible GT4 / GSI structures
  • Curation of occupational data
  • Contribution widely used international resources
  • Semantic data annotation (DDI)
  • Generic data service
  • Hinges on numeric OUG index (cf. CASCOT)
  • Other application areas, e.g. education,
    geography

18
GEODE, eScience and eSocial Science
  • Some tentative comparisons...