Title: Case Study in e-Social Science
1 Case Study in e-Social Science
Rob Allan (CCLRC Daresbury Laboratory), Rob Crouchley (University of Lancaster)
- Building Collaborative e-Research Environments
- JISC Consultation Workshops, 23/2/04 and 5/3/04
2 Specific Social Scientists' Problems
- They have much less experience and expertise in the use of the Grid than researchers typically have in other research council areas
- There is a significant intellectual gap between such disciplines and computer science
- Distributed systems are also inherently complex and associated middleware products are not easy to use
- The Open Middleware Infrastructure Institute (OMII) is likely to provide generic (open-source) middleware and associated services.
- E-Science middleware is currently not specifically targeted at the social science community.
3 Social Scientists Need
- Help to develop a more computer-literate collaborative culture
- Help to develop component-based software, visual composition tools and scripting languages which are easy to use
- To exploit state-of-the-art software development technologies such as aspect-oriented programming to enhance flexibility.
- Middleware could be the catalyst for re-use and sharing in the e-Social Sciences. Some examples and ideas follow.
4 Some Features of Social Science Research
- Research is motivated by a desire to determine causality
- It involves
  - identifying the various factors which influence the behaviour or outcome of interest and quantifying their effects
  - controlling for all the different confounding factors which would otherwise result in spurious relationships and misleading results.
- Randomised experiments are not feasible: we cannot randomly allocate individuals to different levels of training in order to evaluate programmes.
- We rely on observational data, i.e. data that have been obtained from surveys and censuses.
- This is different from exact sciences like physics and chemistry, where repeatable experiments can be performed.
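The confounding point can be made concrete with a small worked example. The sketch below uses invented counts (not data from any real survey) for a training programme that is mostly taken up by people with low prior qualifications. The names `COUNTS`, `naive_effect` and `effect_within_strata` are hypothetical, chosen only for illustration.

```python
# Invented counts, for illustration only (not data from any real survey).
# Key: (prior qualification, received training?) -> (people, employed afterwards)
COUNTS = {
    ("low", True): (80, 24),
    ("low", False): (20, 4),
    ("high", True): (20, 16),
    ("high", False): (80, 56),
}


def employment_rate(cells):
    """Share employed across a list of (people, employed) cells."""
    people = sum(n for n, _ in cells)
    employed = sum(e for _, e in cells)
    return employed / people


def naive_effect(counts):
    """Trained minus untrained employment rate, ignoring qualification."""
    trained = [cell for (_, t), cell in counts.items() if t]
    untrained = [cell for (_, t), cell in counts.items() if not t]
    return employment_rate(trained) - employment_rate(untrained)


def effect_within_strata(counts):
    """Trained minus untrained employment rate inside each qualification group."""
    strata = sorted({stratum for stratum, _ in counts})
    return {
        s: employment_rate([counts[(s, True)]]) - employment_rate([counts[(s, False)]])
        for s in strata
    }


if __name__ == "__main__":
    print("pooled difference:", round(naive_effect(COUNTS), 2))
    for stratum, diff in effect_within_strata(COUNTS).items():
        print(stratum, "qualification difference:", round(diff, 2))
```

With these invented counts the pooled comparison makes training look harmful (a difference of about -0.20), while within each qualification group the trained do better (about +0.10). Controlling for the confounder reverses the apparent relationship.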
5 Three Related Aspects of Soc. Sci. Research
6 Soc. Sci. Needs Comprehensive Models
- Interdependent sub-models: we need joint models for the data complexities and the core processes we want to understand
- Models are not linear in the parameters, require special procedures and are highly computationally intensive due to the high dimensionality and the interdependent sub-models.
- Simple analyses are usually very misleading about the role of the controls (ethnicity, sex, etc.).
- Soc. Sci. research is complex: large parameter space, many interpretations and models which need to be tested. This cannot be done in isolation.
- Increasing need to link components and access large computers/data sets from the desktop.
7 E-Science Technology can link Components!
8 New Tools: The Analysis Cycle
Main ESDS Data Sets
TTWA Data, NOMIS
Select Data Set and Appropriate Variables
Merge Files Add Variables
Contextual Data
Working Data
Results
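The select, merge and add-variables steps of the cycle can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the tooling used in the project: the records, the variable names (`person_id`, `ttwa`, `unemployment_rate`, and so on) and the helper functions are all hypothetical, and the values are invented.

```python
# Hypothetical individual-level extract standing in for a survey data set.
SURVEY = [
    {"person_id": 1, "ttwa": "A", "sex": "F", "employed": 1, "interviewer": 17},
    {"person_id": 2, "ttwa": "A", "sex": "M", "employed": 0, "interviewer": 17},
    {"person_id": 3, "ttwa": "B", "sex": "F", "employed": 1, "interviewer": 4},
    {"person_id": 4, "ttwa": "Z", "sex": "M", "employed": 1, "interviewer": 9},
]

# Hypothetical contextual data keyed by travel-to-work area (invented values).
AREA_CONTEXT = {
    "A": {"unemployment_rate": 0.08},
    "B": {"unemployment_rate": 0.03},
}


def select_variables(records, variables):
    """Keep only the variables needed for the analysis."""
    return [{v: record[v] for v in variables} for record in records]


def add_contextual_variables(records, context_by_area, key="ttwa"):
    """Attach area-level variables to each record; report records with no match."""
    merged, unmatched = [], []
    for record in records:
        context = context_by_area.get(record[key])
        if context is None:
            unmatched.append(record)
        else:
            merged.append({**record, **context})
    return merged, unmatched


def build_working_data(survey, context_by_area, variables):
    """Select variables, then merge in contextual data to give the working data."""
    selected = select_variables(survey, variables)
    return add_contextual_variables(selected, context_by_area)


if __name__ == "__main__":
    working, unmatched = build_working_data(
        SURVEY, AREA_CONTEXT, ["person_id", "ttwa", "sex", "employed"]
    )
    print(len(working), "working records;", len(unmatched), "without contextual data")
```

Returning the unmatched records separately, rather than silently dropping them, is a design choice made here so that gaps in the contextual data stay visible before any model is fitted.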
9 New Tools: Simultaneous Analysis
Example research in educational attainment
10 E-Science can enhance Collaboration!
- Particularly important in qualitative research
- Enable comparison of different markup/interpretation
- Direct access to datasets for validation
- Direct input of data from fieldwork involving questionnaires, photography etc.
- Delivery/input devices (some mobile) may include portals, Access Grid, PC tablets, PDA, camera, phone etc.
11 New Tools: Collaboration in Video Markup
VIDGRID: Multiple video streams can be delivered into an AG or portlet environment
12 Training and Awareness in e-Social Science!
- Project ReDReSS: Resource Discovery for Researchers in e-Social Science
- To accelerate the development and awareness of a new kind of computing and data infrastructure for the Social Sciences, and to support the increasingly national and global collaborations emerging in many areas of Social Science
- To help illustrate appropriate methodologies and software that admit the full complexity of substantive problems
- To help articulate the middleware needs of social researchers
- To help nurture and support a community of social researchers
- To help provide critical mass and improve the efficiency of interactions between the interested researchers, thus reducing the number of lost opportunities for social science.
14 We will use/contribute to existing technologies
- Resource discovery
- Sharing tools
- Personalised workspaces
- Flexible delivery
15 Samples showing use of CHEF framework for ReDReSS
and delivery of lecture material by video
16 E-Science enabling a Virtual Research Environment!
- To make the use of e-Science technologies, methodologies and resources easier and more transparent than simply developing bespoke applications on an infrastructure toolkit (such as Globus GT2 or OGSI/WSRF).
- We need to
  - Bridge the gap between different types of technology (database management, computational methods, data collection, networks, Condor resources, visualization systems, collaborative working, Access Grid, etc.)
  - Build on pilot projects and take input from other disciplines
  - Link to core JCSR clusters and resources at other e-Science Centres
  - Provide an environment to enhance the programmability and usability of such a Grid by integrating work from a number of ongoing projects and encouraging community input.
17 The Grid Client Problem
Many clients want to access a few Grid-enabled resources
Grid Core
Consumer clients: PC, TV, video, AG
Middleware, e.g. Globus
Workplace desktop clients
Portable clients: phones, laptop, PDA, data collection
18 Some VRE Functions
- Authentication, Authorisation and Accounting: use Shibboleth and Permis in line with JISC proposals
- Community development of content: Content Management and Editing tools
- Access to middleware resources and documentation
- Access to training materials and resources
- Enable shared development of services/applications
- Access to a consultancy/support service
- Application Management Services: user access via pre-defined tools and applications to the UK e-Science Grid
- Data Management Services: discovery, authorisation, transfer, replication, upload, validation, curation
- Access to Broadcasts: on the Access Grid network
- Management Functions: for experts to maintain the system and guide non-experts, e.g. via expert systems and workflow.
19 Functionality/Content of the VRE
20 Sanity Check
- However, a number of areas significant for a production Grid environment have hardly yet been tackled. Issues include
  - Grid information systems, service registration, discovery and definition of facilities
  - Security, in particular role-based authorisation
  - Portable parallel job specifications
  - Meta-scheduling, resource reservation and on-demand access
  - Dynamic linking and interacting with remote data sources
  - Wide-area computational/experimental steering
  - Workflow composition and optimisation for complex procedures
  - Distributed user and application management
  - Data management and replication services
  - Grid programming environments, PSEs and user interfaces
  - Auditing, advertising and billing in a Grid-based resource market
  - Semantic and autonomic tools
  - Usability issues, ethics, etc.
21 Human Factors
- Customised delivery may be key to long-term uptake
- Use an environment familiar to the researchers, e.g.
  - Web portals: training, awareness, search tools (search engines are popular)
  - Libraries, e.g. C for programmers
  - Programming environment, e.g. R for statistical analysis with well-known packages
  - Sound, video for virtual collaboration (TV is a popular medium)
- Bottom line
  - There is a lot we can/need to do, but
  - Social Science is already hard: the scientists need tools that do not make it harder!
22 UK E-Social Science Programme
- There is currently a growing body of work and projects in this area
- Pilot projects - ESRC
- ReDReSS: Resource Discovery for Researchers in e-Social Science - JISC
- UK National Grid Service e-Science Grid - JCSR and DTI Core Programme
- NCeSS: National Centre for e-Social Science - ESRC
- CQeSSS: Centre for Quantitative e-Social Science Support - ESRC (future NCeSS nodes)