Engaging researchers with e-Infrastructure - PowerPoint PPT Presentation

About This Presentation
Title:

Engaging researchers with e-Infrastructure

Description:

Demonstrate success and inspire trust in e-Infrastructure ... MCTP: new users at: Swansea, Galway and Liverpool making use of the updated system; ... – PowerPoint PPT presentation

Slides: 13
Provided by: simonhe

Transcript and Presenter's Notes

Title: Engaging researchers with e-Infrastructure


1
Engaging researchers with e-Infrastructure
  • Leaping hurdles planning IT provision for
    research
  • 6 June 2009
  • Neil Chue Hong / Steve Brewer

2
Engaging Research with e-Infrastructure
  • What do people want to do? What are they doing
    already?
  • Trivial barriers can seem insurmountable
  • Demonstrate success and inspire trust in
    e-Infrastructure
  • Get users to engage with e-Infrastructure to
    improve research output
  • Interview researchers to identify what works and
    what's needed
  • Analyse requirements and propose interventions
  • Develop solutions and disseminate best practice
  • www.engage.ac.uk

3
Engaging Research with e-Infrastructure
Interviews
Wider deployment
Projects
Dissemination
Adoption
New requirements
4
ENGAGE Researcher Interviews
  • 53 interviews
  • semistructured
  • 36 face-to-face
  • 17 telephone
  • 60 people
  • 24 institutions
  • Triage process to identify development projects
    and best practice

5
The Analysis of Data in ENGAGE
Interview Summary
  1. Writes own software in Python - looking at
    better ways of getting it used
  2. Software works on multiple datasets mapping to
    own data format
  3. Cytoscape not automated. Cannot automate
    visualisations.
  4. Data visualisation is restricted as there are
    multiple datasets to download and access
  6. Runs take 1-2 weeks. Not interactive.
  7. Cannot submit large jobs.
Transcription
Analysis Triage
Obstacles
Best Practice
Interviews
  1. Tools not easily accessible to other researchers
  2. Unable to run large jobs on current resources
  3. Difficult to reuse / repurpose workflows

  • Sourcing a system on which to run the services
  • Assumptions made by software installation
  • Assumptions made handling I/O with WS framework
  • Timing issues when checking for secure services
  • Teaching, admin and other research commitments
    meant that the primary researcher had
    insufficient time
  • Undertake feasibility study and investigate
    making the protein sequence databases available
    as web services before the wrapped applications
    can use their data.
  • Get the wrapped applications and workflows to
    work in a production environment on
    institutional facilities
  • Investigate and carry out the migration from the
    production environment to the NGS.

Commission
Evaluation
Evaluation Report
Development Project
Project Brief
6
Case 2 - Interview Summary
  • Lost support of software
  • Certificates cause problems
  • Analysis takes 3 days
  • Cannot log in to MetaData Database 50% of the time
    due to proxy certificate problems
  • Would use grid tools if they had more confidence
    in their stability
  • Security an issue as human nature has made it
    less secure
  • Morphology and Growth Rate calculations better if
    distributed
  • Lots of data uploaded to NGS
  • Process takes 3 months from beginning to end

7
Post Interview Questionnaire
  • Research/project objective
  • Name of main sponsor of the work
  • Research/project main challenge(s)
  • Software application(s) used; also provide
    relationship role (analyst, developer,
    contributor, user, support/advisor, non-user
    researcher) if applicable
  • Number of current users; number of expected users
  • Infrastructure/platform(s) used/desired

8
Triage Questionnaire
  • What is the degree of applicability of the work
    to OMII-UK/NGS?
  • Suggested other partner(s) and partner
    institution(s) (if applicable)
  • Research/project objective: observation/clarification
    required
  • Related/relevant software application(s)
  • Estimation of probable users; suggested target
    for eventual users
  • Infrastructure/platform used
  • Research/project main obstacle(s)/issue(s)
    (maximum of 3)
  • Research/project main technological
    obstacle(s)/issue(s) (maximum of 3)
  • Potential solution(s) (maximum of 3)
  • Salient points, comments or brief proposal
    (elevator speech)
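
The triage questionnaire above can be read as a small record type. The following is a minimal sketch in Python, not part of the original slides: the class name TriageRecord, its field names, and the validation rule are all hypothetical, chosen only to mirror the questionnaire items and their "maximum of 3" limits. The example values are paraphrased from the Case 2 slides; the applicability value is a placeholder because the slides do not state it.

```python
from dataclasses import dataclass, field
from typing import List

MAX_ITEMS = 3  # the questionnaire caps obstacles and solutions at three each


@dataclass
class TriageRecord:
    """Hypothetical container for one completed triage questionnaire."""
    applicability: str                 # degree of applicability to OMII-UK/NGS
    objective_notes: str = ""          # observation/clarification required
    partners: List[str] = field(default_factory=list)
    related_software: List[str] = field(default_factory=list)
    probable_users: int = 0
    target_users: int = 0
    platform: str = ""
    obstacles: List[str] = field(default_factory=list)
    technical_obstacles: List[str] = field(default_factory=list)
    solutions: List[str] = field(default_factory=list)
    elevator_speech: str = ""

    def __post_init__(self) -> None:
        # Enforce the "maximum of 3" rule on the three capped lists.
        for name in ("obstacles", "technical_obstacles", "solutions"):
            items = getattr(self, name)
            if len(items) > MAX_ITEMS:
                raise ValueError(
                    f"{name}: at most {MAX_ITEMS} entries, got {len(items)}"
                )


# Example entries paraphrased from the Case 2 slides.
case2 = TriageRecord(
    applicability="not stated on the slides",  # placeholder, an assumption
    obstacles=["Lack of continuity of support"],
    technical_obstacles=[
        "Lost key people who tinkered with the code",
        "Time spent getting the computational systems working",
        "Security will become increasingly important",
    ],
    solutions=[
        "Installation and wrapping support for specific applications",
        "Consultancy and support gluing it all together",
        "Partnership with their consultancy",
    ],
)
print(len(case2.technical_obstacles), "technical obstacles recorded")
```

Capping the lists in code mirrors the way the form forces the triage team to prioritise rather than list every issue raised in an interview.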

9
Case 2 Identified Obstacles
  • Research/Project Obstacles
  • Lack of continuity of support. They would use
    grid resources more if they were confident about
    stability and continuity.
  • Started second phase of project to look at making
    the tools accessible. More marketing and
    streamlining of the tools.
  • Technical Obstacles
  • Have lost key people who tinkered with the code
    to get it to run.
  • Amount of time spent getting the computational
    systems working and efficient.
  • Future work might be more sensitive (possible
    forging of polymorphs, pharmaceutical data), so
    security will become increasingly important.

10
Case 2 Potential Approaches
  • Installation and wrapping support for specific
    applications in different contexts (e.g. NGS,
    Legion) and security (e.g. easier authentication)
  • Consultancy and support gluing it all together
  • Partnership with their consultancy

11
Case 2 Project Brief
  1. Replace DMAREL with DMACRYS, which is capable of
    dealing with much larger molecules and crystal
    structures, and using better models for the
    intermolecular forces.
  2. Expand the BPEL workflow to perform
    post-processing of the results, such as
    consistency checks and the re-submission of
    minima that are transition states. Enhance the
    presentation during the search and the storage
    for retrieval of the results.
  3. Port the deployment to run on both Legion and
    Condor pool for testing, and design it to then
    also run on the NGS so that polymorph
    calculations can be performed by the wider range
    of users.
  4. Consider whether either larger machine would be
    able to also offer the alternative search method
    Crystal Predictor.

12
First Phase ENGAGE Development Projects
  • High Throughput Humanities for e-Research
  • Exposing bioinformatic programs as Web Services
  • Protein Molecule Simulation on the Grid
  • Enable workflows in a Shared Genomics causality
    workbench
  • Linking and Querying Ancient Texts
  • SWARMCloud
  • Rapid Chemistry Portals by Engaging Users

13
Second Phase ENGAGE Development Projects
  • Monte Carlo Treatment Planning
  • Crystal Energy Landscape Application
  • Epigraphy and papyrology image processing
  • Strengthening and support for eMinerals RMCS
    system
  • Configuration parameters for the GENIE simulator
  • Lab Blog Book
  • Strengthening and supporting the text and data
    analysis toolkit OSCAR

14
Planning IT provision for research
Putting the Team together
Creating a Common language
  • Need to bring together researchers, developers
    and infrastructure providers. Can be difficult to
    retain experienced staff.
  • Shared vocabulary for information exchange, e.g.
    analysis, ontologies. Experience can make it
    easier to broker this process.
Lab Blog Book
Best Practice
  • Link researchers analysing molecular structure
    and function via crystallography and MD
    simulation. Understand where approaches can be
    reused.
  • Virtual server provided on NGS2 hosting Lab Blog
    server. Databases ported for wider use. Working
    with IT administrators makes provisioning faster.
  • Follow how researcher constructs the DL Poly
    simulation files, recreate at Southampton, link
    it to servers. Evaluation improves the usability
    of the work.
  • Unix, Apache 2, PHP 5, MySQL 5, ImageMagick, 1GB
    storage. Well defined requirements drive wider
    infrastructure adoption.
Defining Requirements
Provisioning infrastructure
Evaluating Usage
15
First Phase ENGAGE Development Projects
  • HiTheR: implemented different document similarity
    algorithms; 75% reduction in run-time on small
    Condor cluster; positive researcher evaluations -
    discovered various chains of related articles and
    misclassified articles; looking to transfer to
    NGS
  • Exposing bioinformatic programs as Web Services:
    nine protein sequence analysis applications
    hosted on 144 CPU cluster; workflows created, now
    in daily use by postgraduates; has impressed
    infrastructure providers
  • ProSim: tools connected and made available in
    portal; workflows evaluated; workshop ran from
    20-24 April with 40 attendees
  • Shared Genomics: workbenches integrated, leading
    to new ideas for innovative user interfaces based
    on coverflow techniques
  • LaQuAT: three databases integrated, in different
    languages; researcher about to complete formal
    evaluation
  • RCPER: 3 portals complete, 1 portal underway; 1
    portal evaluated and about to be used by 100
    undergraduates; dissemination at ScotChem workshop

16
Second Phase ENGAGE Development Projects
  • MCTP: new users at Swansea, Galway and
    Liverpool making use of the updated system
  • Crystal Energy Landscape application-CPOSS: new
    DMACRYS system now working; re-engineered
    workflows being evaluated by Sally Price's
    research team at UCL
  • RMCS: remote job submission for molecular
    simulation. Project complete and good progress
    achieved. Examining link to other projects
  • Integration of image processing tools within the
    VRE-SDM: new integrated system previewed at
    recent Image, Text, Interpretation workshop in
    Oxford; user interface well received
  • Aladdin 2: a launchpad for the GENIE Earth-System
    Model. Ported GENIE simulator now operational;
    configurable parameters can be rendered; MatLab
    logic has been ported from GENIELab.

17
ENGAGE Summary
  • From interview to exemplar project showing the
    use of e-Infrastructure
  • Provide publicly available information to improve
    uptake

18
www.engage.ac.uk