October 12-15, 2004 PowerPoint PPT Presentation


Title: October 12-15, 2004


1
Emerging Standards for Interoperable Biological
Systems
  • Technology for Life: North Carolina Symposium on
    Biotechnology and Bioinformatics

2
Standards: Why do we care?
  • IEEE standards for plugs, outlets and wiring: I
    can buy an appliance and use it (most of the
    time)
  • Any international traveler will tell you that
    standards vary around the world

3
Without Standards -
  • Custom builds by experts
  • Build once, use once
  • Need expertise in specific domain
  • Expensive
  • Most of us would still be using candles

4
Standards and Software
  • World Wide Web
  • Plug and play
  • Plug-in / modular components
  • XML (Extensible Markup Language)
  • Web Services
  • Federated Search
  • Grid Services

5
My Standards Journey
  • Middleware to integrate learning systems with
    enterprise resource planning systems
  • IMS / IEEE learning technology standards:
    learning object metadata
  • National Science Digital Library: STEM LOM
    repository
  • NCCU BBRI Cardiovascular Study: similar issues

6
Bioinformatics Community
  • Embraced open source
  • Philosophy of sharing of data and tools
  • Community involvement yields foundation for
    standards development

7
Emerging Standards
  • tools/middleware: web services for harvesting,
    federated searches
  • grid computing
  • ontologies: developing controlled vocabularies
  • analysis: standards for sharing results, e.g.
    microarray analysis
  • models: Systems Biology standards for
    interchange

8
Sharing Data, Tools, and Middleware
  • XML: see http://www.w3.org/XML/
  • Specifications for data interchange in biology
    applications (XML schemas)
  • Web services
  • Define WSDL for biology applications
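
To make the idea of a shared XML format for data interchange concrete, here is a minimal sketch in Python. The record, its element names, and its attribute names are invented for illustration and do not come from any published biology schema; the point is only that two tools agreeing on one structure can exchange and check data without custom glue.

```python
import xml.etree.ElementTree as ET

# Hypothetical record: element and attribute names are illustrative only
# and are not taken from any real biology XML schema.
GENE_XML = """
<geneRecord id="G0001">
  <symbol>ABC1</symbol>
  <organism>Arabidopsis thaliana</organism>
  <sequence length="12">ATGGCCATTGTA</sequence>
</geneRecord>
"""


def parse_gene_record(xml_text):
    """Parse one record into a plain dict that any tool could consume."""
    root = ET.fromstring(xml_text.strip())
    seq = root.find("sequence")
    return {
        "id": root.get("id"),
        "symbol": root.findtext("symbol"),
        "organism": root.findtext("organism"),
        "sequence": seq.text,
        "declared_length": int(seq.get("length")),
    }


record = parse_gene_record(GENE_XML)
# A consistency check that an agreed structure makes possible.
length_ok = len(record["sequence"]) == record["declared_length"]
```

A real exchange format would also publish an XML schema so that receivers can validate documents before parsing them.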

9
XML for data exchange
  • AnatML,
  • CellML,
  • BIOML,
  • GEML,
  • MSAML,
  • GeneXML,
  • MAGE-ML,
  • BSML,
  • CDISC, and
  • HL7

10
Virginia Bioinformatics Institute
  • toolbus
  • PathPort
  • Middleware for web services
  • query multiple databases
  • facilitate decision making and data
    interpretation
  • http://staff.vbi.vt.edu/pathport/services/

11
BioMOBY
  • simple extensible protocols
  • Web services for interoperable databases
  • http://biomoby.org/

12
Grid Computing
  • user authentication and authorization (like
    X.509 certificates)
  • Open Grid Computing Environment (OGCE) portal
    toolkit
  • Open Grid Services Architecture (OGSA)
  • Globus Toolkit

13
Grid Applications
  • iNquiry: commercial product
  • NC BioGrid: prototype / planning stages
  • statewide Bioinformatics Portal being created by
    the University of North Carolina at Chapel Hill
  • GridNexus project

14
Ontologies
  • Controlled vocabulary
  • Crosswalks between controlled vocabularies
  • Interoperability
  • Browse and search services across disparate
    repositories
  • www.geneontology.org
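
As a rough illustration of how a crosswalk lets one query reach disparate repositories, consider the Python sketch below. The repository names, record identifiers, and terms are all made up for the example; a real crosswalk would map each local vocabulary onto a shared ontology such as the Gene Ontology.

```python
# Hypothetical crosswalks: each repository's local term -> shared term.
CROSSWALK = {
    "repo_a": {"heart muscle": "cardiac muscle", "liver cell": "hepatocyte"},
    "repo_b": {"myocardium": "cardiac muscle", "hepatic cell": "hepatocyte"},
}

# Hypothetical repository contents: (record id, local term) pairs.
REPOSITORIES = {
    "repo_a": [("rec-1", "heart muscle"), ("rec-2", "liver cell")],
    "repo_b": [("rec-7", "myocardium"), ("rec-8", "hepatic cell")],
}


def search(shared_term):
    """Return (repository, record id) pairs whose local term maps to shared_term."""
    hits = []
    for repo, records in REPOSITORIES.items():
        mapping = CROSSWALK[repo]
        for record_id, local_term in records:
            if mapping.get(local_term) == shared_term:
                hits.append((repo, record_id))
    return hits


cardiac_hits = search("cardiac muscle")
```

Neither repository has to rename its own terms; only the mapping to the shared vocabulary needs to be maintained.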

15
Data Analysis
  • MIAME: Minimum Information About a Microarray
    Experiment
  • http://mged.sourceforge.net/ontologies/index.php

16
Systems Biology
  • Historically many custom, small scale models
    with little reuse
  • Goal of Systems Biology is to construct the
    system with modular models where data can be
    supplied via web service queries to databases

17
Model Integration
  • Systems Biology Workbench (SBW) strives to support
    model integration through
  • Systems Biology Markup Language (SBML): XML to
    represent biochemical networks; a common framework
    to document models
  • SBW provides a framework for interoperation across
    heterogeneous modeling tools
  • http://sbml.org/index.psp

18
Implications
  • expose databases with web services
  • construct queries to locate the data
  • standards for grid services
  • community developed XML schemas for sharing
    biological data

19
GridNexus

20
UNCW Grid Initiative: GridNexus
  • The UNCW Grid Computing Project is a two-year
    collaborative project among a multi-discipline,
    multi-investigator core research team at UNCW and
    several discipline-focused researchers at partner
    institutions: NCSU, WCU, NCCU, ECU, and CFCC. The
    research areas and institutional interests of
    this project are:
  • Advanced Grid Software Development (UNCW)
  • Computational Chemistry (UNCW and ECU)
  • Bioinformatics (UNCW, NCSU, and NCCU)
  • Combinatorics (UNCW)
  • Business Computing (UNCW and NCCU)
  • Education and Training (UNCW, WCU, CFCC)
  • This project proposes to develop a Grid interface
    that is easy to use and may be used by a
    wide range of applications and users. We have
    developed an innovative graphical user interface
    (GUI) for grid applications. In particular, we
    introduced a new scripting language (JXPL)
    designed for web-based services and a GUI for
    creating scripts, and have demonstrated the use
    of these tools with grid services.

21
GridNexus
  • This initiative grew in part out of a need for
    HPC resources following the closure of the NCSC
    in June 2003, coupled with the availability of
    faculty with software programming expertise and
    others with computing applications that could
    benefit from use of a Grid.
  • UNC-OP funded UNCW's proposal for $557,634
    over two years to develop Grid portals (GUI
    middleware that allows users to access software
    on computers on a Grid).

22
Resources of UNCW Grid
  • Beowulf cluster: 16 PIII processors in Computer
    Sciences Department
  • Fire and FireDev servers plus disk storage
    devices
  • PQS Quantum Cube: 8-CPU cluster with PQS and
    Gaussian 03 computational chemistry software,
    plus TCP-Linda environment.
  • An 8-processor IBM blade cluster with 0.5 TB disk
    storage will be added soon.
  • Other computers may be added, including the
    possibility of using all computing lab computers,
    or possibly even all faculty/staff computers
    (when not in use).

23
GridNexus
  • The objective is to make accessing HPC resources
    (wherever they may be located) easy for scientists
    who are not computer savvy.
  • Most computation involves doing various
    mathematical operations on a dataset.
  • A GUI approach is employed, in which the user,
    after a single login that checks authentication
    and authorization, can create a workflow of
    functions/operations graphically by connecting
    boxes dragged from a series of lists of options,
    then applying that series of steps to a dataset.
  • Such a workflow can be saved for subsequent
    application to another dataset.
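
The idea of a workflow as a saved series of steps that can be re-applied to a new dataset can be sketched in a few lines of Python. This is only an analogy: the step functions below are invented for the example and are not GridNexus modules.

```python
def drop_negatives(values):
    """Keep only the non-negative entries."""
    return [v for v in values if v >= 0]


def square_all(values):
    """Square every entry."""
    return [v * v for v in values]


def total(values):
    """Reduce the dataset to a single sum."""
    return sum(values)


def run_workflow(steps, dataset):
    """Apply each step in order, feeding each output into the next step."""
    result = dataset
    for step in steps:
        result = step(result)
    return result


# The "saved" workflow is just the ordered list of steps.
saved_workflow = [drop_negatives, square_all, total]

first_result = run_workflow(saved_workflow, [1, -2, 3])
second_result = run_workflow(saved_workflow, [4, -1])
```

The same `saved_workflow` list is reused unchanged for the second dataset, which is the property the slide describes.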

24
GridNexus
  • Job submission: Ideally, in a grid, the grid
    middleware should select the best resource, that
    is, those computers that are available, capable,
    and have the software needed to handle the job.
  • The user need not select nor know where the
    computation is taking place. In fact, the job
    may even be passed from one computer to another
    for various aspects of the calculation.
  • The output is returned to the user's workstation
    or account, rather than the user having to access
    and download the output file from a remote
    computer.
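
The selection rule described above (available, capable, and having the needed software) might look roughly like the following Python sketch. The `Resource` fields, the cluster names, and the "most free CPUs wins" tie-break are assumptions made for illustration; they are not how any particular grid middleware is implemented.

```python
from dataclasses import dataclass, field


@dataclass
class Resource:
    # Hypothetical description of one grid resource.
    name: str
    available: bool
    free_cpus: int
    software: set = field(default_factory=set)


def select_resource(resources, cpus_needed, software_needed):
    """Pick a resource that is available, capable, and has the software.

    Among the candidates, prefer the one with the most free CPUs
    (an assumed policy). Returns None when nothing qualifies.
    """
    candidates = [
        r for r in resources
        if r.available
        and r.free_cpus >= cpus_needed
        and software_needed <= r.software  # subset test
    ]
    if not candidates:
        return None
    return max(candidates, key=lambda r: r.free_cpus)


grid = [
    Resource("cluster-a", True, 16, {"blast"}),
    Resource("cluster-b", True, 8, {"gaussian"}),
    Resource("cluster-c", False, 32, {"gaussian"}),
]

chosen = select_resource(grid, cpus_needed=4, software_needed={"gaussian"})
```

Here `cluster-c` has the most CPUs and the right software but is skipped because it is not available, so the user's job lands on `cluster-b` without the user having to choose.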

25
GridNexus
  • GridNexus is a GUI that allows the user to
    create/edit/run workflows
  • Based on Ptolemy II
    (http://ptolemy.eecs.berkeley.edu/ptolemyII).
    Ptolemy provides the GUI and workflow features.
    We have extended it to provide the functionality
    we want (JXPL and GridServices)
  • Release 1.0.0 download available at
    www.gridnexus.org

26
Getting Started
  • The right frame is the palette for building
    workflows
  • The upper left frame provides the library of
    modules
  • The lower left is a thumbnail of the entire
    workflow

27
The Basics
  • Sources produce data without needing input
  • Sinks consume data but may have side effects
    (such as displaying results)
  • All workflows must start with sources and end
    with sinks

28
Simple Example 1
  • Click and drag the Const source to the
    workflow.
  • Click and drag the JxplDisplay sink to the
    workflow

29
Simple Example 1
  • Double click on the Const module
  • Change its value to 10
  • Click commit
  • The new value is shown on the icon

30
Simple Example 1
  • Input ports are on the left-hand side and output
    ports are on the right-hand side of each module
  • Click and drag from the output port of the Const
    module to the JxplDisplay

31
Simple Example 1
  • A link (or relation) is created between the two
    modules
  • The output of Const is consumed by the JxplDisplay

32
Simple Example 1
  • Click on the run button
  • The JxplDisplay evaluates the input and produces
    a display window to show the results.
  • Notice the output is in XML (actually JXPL)

33
Simple Example 2
  • Transformers are modules that take input,
    transform it, and produce new output
  • This example computes the expression (23 + 6) * -2

34
Simple Example 2
  • The Multiplication module takes the result of the
    addition (its first input) and multiplies that by
    -2 (its second input)
  • The result is consumed by JxplDisplay

35
What's Going On?
  • The workflow is not actually performing the
    operations. Instead it is creating a script
    (JXPL) that, when executed, produces the result
  • The JxplDisplay is evaluating the script and
    displaying the results

36
What's Going On?
  • Double-click on the JxplDisplay and deselect the
    Evaluate Jxpl parameter
  • This parameter tells JxplDisplay whether or not
    to evaluate the script that is generated

37
What's Going On?
  • Now when we run it, we see the actual script that
    is produced by the workflow
  • The script is written in XML using a language
    developed at UNCW called JXPL
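
The distinction between building a script and evaluating it can be mimicked in plain Python. In the sketch below, nested lists stand in for the generated script and a flag stands in for the Evaluate Jxpl parameter; none of the function names are GridNexus or JXPL APIs. The expression is the one from Simple Example 2: 23 and 6 are added and the sum is multiplied by -2.

```python
import operator

# Operations the toy evaluator understands.
OPS = {"+": operator.add, "*": operator.mul}


def const(value):
    """Source: produces a value without needing input."""
    return value


def transformer(op, left, right):
    """Transformer: records the operation instead of performing it."""
    return [op, left, right]


def evaluate(script):
    """Recursively evaluate a nested-list script."""
    if not isinstance(script, list):
        return script  # an atom evaluates to itself
    op, left, right = script
    return OPS[op](evaluate(left), evaluate(right))


def display(script, evaluate_script=True):
    """Sink: return the evaluated result, or the raw script if asked."""
    return evaluate(script) if evaluate_script else script


# Wiring the modules together only builds the script.
script = transformer("*", transformer("+", const(23), const(6)), const(-2))

evaluated = display(script)                          # evaluation on
raw_script = display(script, evaluate_script=False)  # evaluation off
```

With evaluation on, the sink yields -58; with it off, the sink yields the unevaluated structure `["*", ["+", 23, 6], -2]`.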

38
A Little Bit about JXPL
  • JXPL is based on LISP
  • The corresponding LISP to the JXPL on the right
    looks like
  • (* (+ 23 6) -2)

39
A Little Bit about JXPL
  • Why?
  • XML is used to transport data between web/grid
    services
  • XML opening/closing tags <-> LISP opening/closing
    parens
  • Everything is either an atom or a list
    (functions, data structures)
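
The correspondence between XML tags and LISP parentheses can be shown with a small Python sketch that renders one expression both ways. The `list` and `atom` tag names are placeholders chosen for this example; the actual JXPL element names are not given in these slides.

```python
import xml.etree.ElementTree as ET


def to_xml(expr):
    """Convert an atom or nested list into an XML element.

    The tag names 'list' and 'atom' are hypothetical, not real JXPL tags.
    """
    if isinstance(expr, list):
        node = ET.Element("list")
        for item in expr:
            node.append(to_xml(item))
        return node
    node = ET.Element("atom")
    node.text = str(expr)
    return node


def to_lisp(expr):
    """Render the same structure with LISP-style parentheses."""
    if isinstance(expr, list):
        return "(" + " ".join(to_lisp(item) for item in expr) + ")"
    return str(expr)


expr = ["*", ["+", 23, 6], -2]
lisp_text = to_lisp(expr)
xml_text = ET.tostring(to_xml(expr), encoding="unicode")
```

Each opening parenthesis in `lisp_text` lines up with one opening `<list>` tag in `xml_text`, which is the sense in which the two notations mirror each other.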

40
GridNexus and JXPL with Grid Services
  • create workflows that can make use of web and
    grid services
  • implement primitives in JXPL that are generic web
    and grid clients
  • inspect the WSDL of the service to determine its
    interface
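
Inspecting a WSDL document to discover a service's interface can be sketched with the Python standard library. The WSDL below is a heavily trimmed, hypothetical WSDL 1.1 fragment (the service, port type, operation, and message names are invented), and the sketch only lists operation names; a generic client would also need the message and binding details.

```python
import xml.etree.ElementTree as ET

# Namespace of WSDL 1.1 elements.
WSDL_NS = {"wsdl": "http://schemas.xmlsoap.org/wsdl/"}

# Hypothetical, trimmed WSDL: names are made up for illustration.
SAMPLE_WSDL = """<definitions xmlns="http://schemas.xmlsoap.org/wsdl/"
             name="SequenceService">
  <portType name="SequencePortType">
    <operation name="getSequence">
      <input message="getSequenceRequest"/>
      <output message="getSequenceResponse"/>
    </operation>
    <operation name="searchByKeyword">
      <input message="searchRequest"/>
      <output message="searchResponse"/>
    </operation>
  </portType>
</definitions>"""


def list_operations(wsdl_text):
    """Map each portType name to the list of operation names it declares."""
    root = ET.fromstring(wsdl_text)
    interface = {}
    for port_type in root.findall("wsdl:portType", WSDL_NS):
        operations = port_type.findall("wsdl:operation", WSDL_NS)
        interface[port_type.get("name")] = [op.get("name") for op in operations]
    return interface


interface = list_operations(SAMPLE_WSDL)
```

A generic client could use such a listing to offer the user the available operations before building a call.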

41
GSClient module
  • GSClient module: the user can specify
    the factory URL, the instance name of the
    service, the stub class, and the port type
  • The primitive uses the OGSIServiceGridLocator to
    find the grid service and invoke the appropriate
    method with the arguments

42
GridNexus and OGSA-DAI
  • OGSA-DAI Grid Data Services are designed so that
    the output of one can be delivered to another
  • GridNexus allows non-programmers to create JXPL
    to control GDS interaction in a graphical
    environment

43
Using OGSA-DAI grid service clients

44
  • Molecular biology workflow created in GridNexus

45
Molecular chemistry workflow in GridNexus

46
Build the Library
  • Identify tasks in scientific workflows
  • Investigate existing open source modules for
    possible integration with GridNexus
  • Design for reuse incorporating appropriate
    standards
  • Implement library module in GridNexus

47
GridNexus
  • Release 1.0.0 download available at www.gridnexus.org

48
Acknowledgments
  • UNC-OP for funding the UNCW Grid Initiative
    Proposal
  • Fostering Undergraduate Research Partnerships
    through a Graphical User Environment for the
    North Carolina Computing Grid, Dr. Ron Vetter,
    PI
  • Co-PIs: Dr. Rebecca S. Boston, NCSU; Dr. Anthony
    Wilkinson, WCU; Dr. Marilyn McClelland, NCCU; Dr.
    Libero Bartolotti, ECU; Ms. Judy Porter, CFCC.
  • UNCW Participants: Computer Science: Dr. Ron
    Vetter, Dr. Clayton Ferner, Dr. David Berman, and
    Dr. Tom Hudson; Information Technology Systems:
    Dr. Bob Tyndall and Mr. Bobby Miller;
    Mathematics and Statistics: Dr. Jeff Brown;
    Chemistry and Biochemistry: Dr. Ned H. Martin;
    Biological Sciences: Dr. Ann Stapleton;
    Information Systems and Operations Management:
    Dr. Tom Janicki.
  • UNCW Computer Science students working on the
    Chemistry portal: Tristan Carland, Jerry Martin,
    Andrew Martin

49
Acknowledgments
  • Grid Computing: Harnessing Underutilized
    Resources, Dr. Ned H. Martin
  • GridNexus: UNCW GUI for Workflow Management, Dr.
    Clayton Ferner
  • GridNexus: A Grid Services Scientific Workflow
    System, Jeffrey L. Brown, Clayton S. Ferner,
    Thomas C. Hudson, Ann E. Stapleton, Ronald J.
    Vetter, Andrew Martin, Jerry Martin, Allen Rawls,
    William J. Shipman, and Michael Wood
