Genome Browsing and AJAX: Advancing GMOD's GBrowse to the Next Level

1
Genome Browsing and AJAX: Advancing GMOD's
GBrowse to the Next Level
  • by Andrew Uzilov
  • for Holmes Lab group meeting
  • October 13, 2006

2
Genome browsers are not just a good idea... they
are The Way
  • Necessary for visualizing and understanding
    large amounts of genomic information
  • genome organization (including synteny)
  • multiple splicing
  • comparing predictions against known data
  • some insights may be more obvious visually than
    through flat files, database queries, or writing
    custom programs for data analysis

3
What else are they good for?
  • Retrieving information
  • point-and-click on features of interest: a better
    interface for exploring
  • BLAST and other database searches get you a
    visual of the genomic context, not just text
  • Prepare pretty pictures for publications
  • annotation upload feature is a must for this
  • Better interface for community annotation (genome
    wiki)
  • Genome feature WYSIWYG editor?

4
What are the problems with current genome
browsers? (1)
  • Most Web-based genome browsers are static HTML
    pages: the entire page is refreshed (HTML
    generated anew by the server) any time the user
    navigates, changes layout, etc.
  • Delay incurred while the page reloads: annoying
  • Vertical scroll position lost: also annoying
  • Sometimes, JavaScript or Flash is used to provide
    some dynamic content (you can change certain
    things without triggering reload), but usually
    navigation still causes reloads

5
What are the problems with current genome
browsers? (2)
  • Most (all?) Web-based genome browsers rely on
    the "server renders graphics from scratch upon
    client request" model
  • Images for genome views are rendered on demand,
    after the user navigates, changes layout, etc.,
    making the user wait
  • Rendered images aren't reused or reusable: not
    saved or cached, rendered anew each time
  • There are difficulties in preparing pre-rendered
    content

6
Pre-rendering difficulties
  • It would be nice to have all images rendered ahead
    of time, then just serve them up, requiring no
    live rendering overhead/delay
  • obstacle is that the pixel width of genome views is
    quite, quite, quite large: can't render a view as a
    single image, will run out of memory
  • can't render in small parts either, as
    BioPerl/GBrowse will not produce parts that
    concatenate into a nicely contiguous genome view
  • and probably neither will other rendering frameworks

7
The Insight
  • Make BioPerl think it is rendering a massively
    wide single image, but instead intercept all
    rendering calls to the graphics library (i.e. the
    graphics primitives) and store them in database
  • Now, we can query the database for only a
    manageable subset of primitives (i.e. only those
    required for a single tile, the basic unit out
    of which the total genome view is constructed)
    and render only them, producing a
    reasonably-sized tile image
  • primitives' coordinates are offset if they start
    in tiles prior to (left of) the current one
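
The query-and-offset step above can be sketched roughly as follows. This is an illustration only, not the TiledImage.pm code: the primitive records, the 1000-pixel tile width, and the function name primitivesForTile are all assumptions made for the example.

```javascript
// Assumed tile width in pixels (hypothetical value).
const TILE_WIDTH = 1000;

// Hypothetical stored primitives, with x coordinates in the virtual
// full-width genome image.
const primitives = [
  { op: "line", x1: 200, y1: 10, x2: 800, y2: 10 },
  { op: "rectangle", x1: 950, y1: 20, x2: 1300, y2: 30 }, // spans tiles 0 and 1
  { op: "line", x1: 2100, y1: 10, x2: 2500, y2: 10 },
];

// Return the primitives that overlap one tile, with x coordinates shifted
// into that tile's local coordinate system.
function primitivesForTile(allPrimitives, tileIndex, tileWidth = TILE_WIDTH) {
  const left = tileIndex * tileWidth;
  const right = left + tileWidth - 1;
  return allPrimitives
    .filter((p) => p.x1 <= right && p.x2 >= left)
    .map((p) => ({ ...p, x1: p.x1 - left, x2: p.x2 - left }));
}

const tile0 = primitivesForTile(primitives, 0);
const tile1 = primitivesForTile(primitives, 1);
```

A primitive that starts in an earlier tile ends up with a negative local x1 (here the rectangle starts at -50 in tile 1), so the part left of the tile would simply fall outside the drawn area.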

8
Enter THE WEB 2.0 GBROWSE
  • Basic philosophy
  • the client is an application
  • maintains internal state (no longer a static
    page)
  • knows how to render itself (old way: server
    generates the whole page's HTML for you)
  • knows how to change itself dynamically (old way:
    server generates new HTML for you)
  • the server is a... well, literally, a server
  • pre-processes as much as possible to reduce
    session-time delays/overhead
  • off-loads as much work as possible onto the client
  • all this reduces server load, speeding up the session
  • less trite name under review

9
So how does it work?
  • Based on GMOD's GBrowse framework
  • The server-side GBrowse Perl code for rendering
    genome views (i.e. the gbrowse_img script for the
    CGI) was hacked apart and back together to be a
    standalone pre-rendering program that uses
    BioPerl and GD libraries in the same way as
    GBrowse
  • except TiledImage.pm intercepts calls to cache
    primitives and render tiles
  • The client was written in JavaScript from scratch

10
Server side - the original way
The GBrowse framework (from Stein LD, Mungall C,
Shu S, Caudy M, Mangone M, Day A, Nickerson E,
Stajich JE, Harris TW, Arva A, Lewis S (2002).
The generic genome browser: a building block for
a model organism system database. Genome
Research 12(10), 1599-1610.)
11
Server side - the new way
or at least one current proposed new way
(subject to change)
12
Server side features currently implemented (1)
  • MySQL database
  • tile rendering (Perl): TiledImage.pm
  • intercepts Bio::Graphics::Panel calls to
    GD::Image (using AUTOLOAD) and stores them in
    the database, keyed on the bounding box to which they
    apply
  • now, if we want to know which GD primitives need
    to be rendered for some tile, we just search the
    database for all primitives overlapping with the
    tile bounding box
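
The real interception is done in Perl with AUTOLOAD. As a loose analogy only, the same idea can be sketched in JavaScript with a Proxy that records every method call instead of drawing. The names recorded and makeRecordingImage are hypothetical, the array stands in for the MySQL database, and the bounding box assumes the first four arguments are two corner points (true for the two calls shown, not for every GD method).

```javascript
// Stand-in for the database table of recorded graphics primitives.
const recorded = [];

// Build an "image" object on which any method call is logged, together
// with a bounding box, rather than executed.
function makeRecordingImage(log) {
  return new Proxy({}, {
    get(_target, methodName) {
      return (...args) => {
        // Assumption: args[0..3] are x1, y1, x2, y2.
        const bbox = {
          x1: Math.min(args[0], args[2]),
          y1: Math.min(args[1], args[3]),
          x2: Math.max(args[0], args[2]),
          y2: Math.max(args[1], args[3]),
        };
        log.push({ method: String(methodName), args, bbox });
      };
    },
  });
}

const image = makeRecordingImage(recorded);
image.line(200, 10, 800, 10, "black"); // color is a placeholder value
image.filledRectangle(950, 20, 1300, 30, "blue");
```

The rendering code never needs to know it is talking to a recorder, which is the point of the trick: the drawing library's callers stay unchanged.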

13
Server side features currently implemented (2)
  • tile rendering (Perl): generate-tiles.pl
  • uses TiledImage.pm to:
  • fill MySQL database with graphics primitives
  • render tiles from a given database of primitives
  • generate XML containing client config info
  • do any combination of the above, including on
    subsets of tiles (allows rendering to be broken
    down into jobs, suitable for rendering on multiple
    CPUs)
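
Breaking rendering into jobs over subsets of tiles might look something like the sketch below. How generate-tiles.pl actually accepts tile ranges is not stated in the slides, so splitIntoJobs and its output shape are assumptions for illustration.

```javascript
// Split tiles 0..tileCount-1 into at most jobCount contiguous ranges of
// nearly equal size, one range per rendering job.
function splitIntoJobs(tileCount, jobCount) {
  const jobs = [];
  const base = Math.floor(tileCount / jobCount);
  const extra = tileCount % jobCount; // the first `extra` jobs get one more tile
  let first = 0;
  for (let j = 0; j < jobCount && first < tileCount; j++) {
    const size = base + (j < extra ? 1 : 0);
    jobs.push({ firstTile: first, lastTile: first + size - 1 });
    first += size;
  }
  return jobs;
}

const jobs = splitIntoJobs(10, 3);
```

Each range could then be handed to a separate process or CPU, since tiles are rendered independently of one another once the primitives are in the database.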

14
Server side features to be implemented,
short-term (1)
  • tile serving module
  • pre-fill the database with all primitives, but
    only render them selectively as users request
    tiles
  • store already rendered tiles to prevent
    re-rendering
  • maybe idle server CPU cycles can be used to
    render arbitrary tiles, always filling the tile
    space
  • generate-tiles.pl should process externally
    rendered tiles, e.g.
  • dotplot tracks
  • histograms
  • supporting material for features such as pictures
    of fluorescent gene expression profiles,
    physiological changes due to gene knockout
    experiments, etc.
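
The proposed tile serving module (render only on request, keep what has been rendered) amounts to a cache in front of the renderer. A minimal sketch, assuming a hypothetical renderTile callback and a track/zoom/tile key, neither of which comes from the slides:

```javascript
// Wrap a renderer so each tile is rendered at most once.
function makeTileServer(renderTile) {
  const cache = new Map(); // stand-in for tiles stored on disk
  let renderCount = 0;
  return {
    getTile(track, zoom, tileIndex) {
      const key = `${track}/${zoom}/${tileIndex}`;
      if (!cache.has(key)) {
        renderCount += 1;
        cache.set(key, renderTile(track, zoom, tileIndex));
      }
      return cache.get(key);
    },
    get renderCount() {
      return renderCount;
    },
  };
}

// The renderer here just returns a label in place of image data.
const server = makeTileServer(
  (track, zoom, i) => `image for ${track}, zoom ${zoom}, tile ${i}`
);
const firstRequest = server.getTile("genes", 1, 7);
const secondRequest = server.getTile("genes", 1, 7); // served from the cache
```

Idle-time rendering would fit the same shape: a background loop calling getTile on tiles nobody has asked for yet.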

15
Server side features to be implemented,
short-term (2)
  • database optimizations
  • query for primitives is the slow step: rendering
    takes much longer than loading the database
  • key on tile number (1 key), not bounding box (4
    keys)
  • go to whiteboard
  • gridlines account for >50% of the primitives, but
    are the same for every tile
  • maybe load gridlines for just one tile, and
    return them for every query?
  • GUI to wrap generate-tiles.pl
  • should be built into Web interface for annotation
    upload
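
The two database optimizations above (key on tile number, store gridlines once) can be sketched in memory like this. The Map stands in for a database index, and TILE_PX, buildTileIndex, and lookupTile are names made up for the example.

```javascript
const TILE_PX = 1000; // assumed tile width in pixels

// Index each primitive under every tile number it touches, so a tile
// lookup uses a single key instead of a four-sided bounding-box test.
function buildTileIndex(featurePrimitives, tileWidth = TILE_PX) {
  const index = new Map();
  for (const p of featurePrimitives) {
    const firstTile = Math.floor(p.x1 / tileWidth);
    const lastTile = Math.floor(p.x2 / tileWidth);
    for (let t = firstTile; t <= lastTile; t++) {
      if (!index.has(t)) index.set(t, []);
      index.get(t).push(p);
    }
  }
  return index;
}

// Gridlines stored once, in tile-local coordinates, and reused for
// every tile (assumes they really are identical across tiles).
const gridlinesForOneTile = [
  { op: "line", x1: 0, y1: 0, x2: 0, y2: 100 },
  { op: "line", x1: 500, y1: 0, x2: 500, y2: 100 },
];

function lookupTile(index, tileNumber) {
  return [...gridlinesForOneTile, ...(index.get(tileNumber) || [])];
}

const tileIndex = buildTileIndex([
  { op: "line", x1: 200, y1: 10, x2: 800, y2: 10 },
  { op: "rectangle", x1: 950, y1: 20, x2: 1300, y2: 30 },
  { op: "line", x1: 2100, y1: 10, x2: 2500, y2: 10 },
]);
```

The trade-off is that a primitive spanning several tiles is stored under several keys, which costs space but keeps the per-tile query trivial.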

16
Server side features to be implemented,
long-term (1)
  • how to serve up feature info?
  • short-term solution
  • have generate-tiles.pl produce an XML file with
    feature data (bounding boxes, etc.) for each
    tile, since it has easy access to that info
  • client loads and parses an XML file for each tile
  • more robust solution
  • need a database of features (but what kind?)
  • necessary to support efficient search for
    features
  • necessary for community annotation, because
    people will be changing the feature info
    constantly

17
Server side features to be implemented,
long-term (2)
  • community annotation
  • concurrency is an issue (updating changes,
    notifying client of updates since start of
    session, locking features for editing, etc.)
  • feature upload seems (to me) to be a special case
    of community annotation and should use its
    framework
  • quality control (registration, security)
  • are there existing database schemas or other
    frameworks that can serve this purpose?

18
Client side features currently implemented
  • Dragging works, but with bugs when large views
    are involved (fix is non-trivial, in progress)
  • Also working: jumping, centering, zooming, dynamic
    resize
  • Tracks can be toggled hidden/visible
  • Hovering labels (either all on, or pop up on
    mouseover), with adjustable transparency

19
Brief aside: what is AJAX?
  • Asynchronous JavaScript and XML
  • A combination of technologies to make clients
    behave more like applications
  • JavaScript client code that uses XMLHttpRequest
    to asynchronously query the server for things
  • Implies XHTML (well-formed HTML) and DHTML (DOM
    manipulation), use of CSS
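
For the XMLHttpRequest bullet, a bare-bones asynchronous request looks roughly like the sketch below. The function name fetchText is hypothetical, and the XhrCtor parameter exists only so the sketch can be exercised outside a browser; in a page it defaults to the built-in XMLHttpRequest.

```javascript
// Ask the server for a URL without blocking the page, then hand the
// response text to a callback once it arrives.
function fetchText(url, onSuccess, onError, XhrCtor = globalThis.XMLHttpRequest) {
  const xhr = new XhrCtor();
  xhr.open("GET", url, true); // third argument true = asynchronous
  xhr.onreadystatechange = function () {
    if (xhr.readyState !== 4) return; // 4 = request finished
    if (xhr.status === 200) {
      onSuccess(xhr.responseText);
    } else {
      onError(xhr.status);
    }
  };
  xhr.send(null);
}
```

In a browser this might be called as fetchText("config.xml", handleConfig, reportFailure), where the URL and both callbacks are placeholders. The page stays responsive while the request is in flight, which is what lets the client behave like an application.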

20
Why I am avoiding existing AJAX frameworks
  • Useful for flashy graphics effects, but don't
    help with the engine of the client (except maybe
    Prototype and Google Web Toolkit)
  • but GWT is closed source and an early version:
    even the online demo has bugs
  • None support dragging, track management, tile
    caching, etc., so that needs to be done ourselves
    (and has, so far, consumed most of the effort)
  • But I'm willing to consider them for:
  • adding graphics effects after the engine is more
    developed
  • asynchronous communication with the server
21
DOM: from XHTML to a tree
  <table>
    <tbody>
      <tr>
        <td>Shady Grove</td>
        <td>Aeolian</td>
      </tr>
      <tr>
        <td>Over the River, Charlie</td>
        <td>Dorian</td>
      </tr>
    </tbody>
  </table>

This is from a W3 page, so you know it's
right: http://www.w3.org/TR/2004/REC-DOM-Level-3-Core-20040407/introduction.html
22
Client: the nitty-gritty (1)
  • Code is broken down into multiple JavaScript
    classes
  • by which I mean just separate .js files, most of
    which are object instances that provide:
  • class functions and methods
  • namespaces
  • modularity, organization
  • Static classes (standalone file, no instance):
  • Other.js: misc. helpers
  • Load.js: loads the XML; when it is loaded,
    instantiates all objects in the correct order

23
Client: the nitty-gritty (2)
  • The Component system
  • An attempt to bring order to chaos
  • Each discrete UI element (e.g. main view,
    navigation panel, panel with track control
    buttons, etc.) is a Component
  • code for each component in its own file
  • Components are
  • instantiated by Load.js
  • connected through ComponentInterface.js
  • should not modify other Component properties
    directly (although JavaScript allows this), but
    rather use ComponentInterface.js for sanity!

24
Client: the nitty-gritty (3)
  • Each Component must define
  • constructor
  • renderComponent()
  • returns the DOM node for this Component
  • will (eventually) be called by Load.js, which
    will then take the DOM node and append it to
    document
  • once fully implemented, there will be no need for
    content in <body> of the XHTML: JavaScript will
    render everything dynamically
  • which allows for possibility of having a
    server-side config file specifying client-side
    layout, thus further removing users from the
    necessity of doing any programming

25
Client: the nitty-gritty (4)
  • Each Component must also define
  • getState()
  • for setting bookmarks/history
  • setState()
  • for restoring bookmarked/history points
  • some bookmarking object will eventually use the
    above methods to store/load bookmarked states by
    polling all Components
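
The getState()/setState() contract and the "bookmarking object that polls all Components" could be sketched as below. The class names and the state fields (center, zoom, hidden) are invented for the example, and renderComponent() is left out because it needs a browser DOM.

```javascript
// Two toy Components that follow the getState()/setState() contract.
class NavigationSketch {
  constructor() {
    this.center = 0;
    this.zoom = 1;
  }
  getState() {
    return { center: this.center, zoom: this.zoom };
  }
  setState(state) {
    this.center = state.center;
    this.zoom = state.zoom;
  }
}

class TrackControlSketch {
  constructor() {
    this.hidden = [];
  }
  getState() {
    return { hidden: [...this.hidden] };
  }
  setState(state) {
    this.hidden = [...state.hidden];
  }
}

// A bookmarking object that polls every registered Component.
class Bookmarks {
  constructor(components) {
    this.components = components; // { name: component }
  }
  save() {
    const snapshot = {};
    for (const [name, c] of Object.entries(this.components)) {
      snapshot[name] = c.getState();
    }
    return snapshot;
  }
  restore(snapshot) {
    for (const [name, c] of Object.entries(this.components)) {
      if (name in snapshot) c.setState(snapshot[name]);
    }
  }
}

const nav = new NavigationSketch();
const tracks = new TrackControlSketch();
const bookmarks = new Bookmarks({ nav, tracks });
```

Because the bookmarking object only ever calls the two methods, a new Component joins bookmarking and history for free as long as it implements them.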

26
Client: the nitty-gritty (5)
  • If a programmer writes a new Component, they have
    to
  • add accessors/modifiers for its object properties
    to ComponentInterface.js
  • add calls to constructor and renderComponent() to
    Load.js
  • However, eventually, accessor/modifier
    construction will be done automatically by
    ComponentInterface.js (in theory, it's possible)
  • this means that a Component programmer never has
    to look outside their own Component code, using
    the API for the other Components to access/modify
    them
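
One way automatic accessor/modifier construction could work is to walk a Component's own properties and generate a getter and setter for each. This is a guess at the idea, not the ComponentInterface.js code: registerComponent and the getViewerLeftPixel-style naming are assumptions.

```javascript
// Generate get<Name><Prop>() and set<Name><Prop>() on the interface
// object for every own property of the component.
function registerComponent(iface, name, component) {
  for (const prop of Object.keys(component)) {
    const suffix = name + prop[0].toUpperCase() + prop.slice(1);
    iface["get" + suffix] = () => component[prop];
    iface["set" + suffix] = (value) => {
      component[prop] = value;
    };
  }
  return iface;
}

const componentInterface = {};
const viewer = { leftPixel: 0, zoomLevel: 1 }; // hypothetical Component state
registerComponent(componentInterface, "Viewer", viewer);

// Other Components go through the interface instead of touching
// viewer.leftPixel directly.
componentInterface.setViewerLeftPixel(2500);
```

With something like this in place, the author of a new Component would only write the Component itself, which is the goal the slide describes.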

27
Gods below! Was it really necessary to take 5
slides for this?
  • Yes, because object-oriented programming in
    JavaScript requires discipline, and it's important
    to work these things out early on
  • with multiple people working on this code, it
    needs to be compartmentalized somehow
  • otherwise, debugging may cause blood pressure to
    rise to dangerous levels (although the Venkman
    debugger will alleviate that)
  • see ComponentTemplate.js in SVN for a template,
    with guidelines on how to write a component of
    your own

28
Client: the nitty-gritty (6)
  • Current components
  • ViewerComponent.js
  • NavigationComponent.js
  • TrackControlComponent.js
  • DebugComponent.js
  • Other classes
  • View.js
  • stores limited information about current view
  • intended to be the class that manages feature
    info fetching, caching, etc.
  • TracksAndZooms.js
  • just a data structure to hold config info from
    the XML file and current state info about what
    zoom level we're at, and what tracks are
    hidden/visible
  • These should really be prototypes for other
    objects

29
Client: the nitty-gritty (7)
  • Dragging and genome view events
  • brace yourself, this is going to be ugly
  • go to whiteboard
  • Ideally, no one should have to deal with this
    after it's been programmed, as it will be wrapped
    up in ViewerComponent.js, and navigation can be
    accomplished by using accessors to move the view
    around

30
Client side features to be implemented (1)
  • Client has no idea what the information on the
    tiles actually means (no knowledge of where and
    what the features are)
  • must be made aware of what it is displaying:
    short-term solution is to load this from an XML file
    for each tile (remember the server-side to-do?)
  • the client JavaScript class for doing this can
    be later replaced with something more
    sophisticated, e.g. an XSL transformational
    grammar and XHR for fetching feature info from
    a database; there are many possibilities
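
Once per-tile feature data has been loaded, making the client "aware of what it is displaying" mostly comes down to a bounding-box lookup. A minimal sketch, assuming the XML has already been parsed into plain objects; the feature names and the field layout are invented for the example.

```javascript
// Hypothetical feature records for one tile, in tile-local pixels.
const tileFeatures = [
  { name: "geneA", x1: 10, y1: 20, x2: 400, y2: 30 },
  { name: "geneB", x1: 600, y1: 20, x2: 900, y2: 30 },
];

// Return the first feature whose bounding box contains the point,
// or null when the point is over empty space.
function featureAt(features, x, y) {
  const hit = features.find(
    (f) => x >= f.x1 && x <= f.x2 && y >= f.y1 && y <= f.y2
  );
  return hit || null;
}

const underMouse = featureAt(tileFeatures, 650, 25);
const overGap = featureAt(tileFeatures, 500, 25);
```

A mouseover handler could call this with the pointer position relative to the tile and use the result to fill a hovering label or pop-up.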

31
Client side features to be implemented (2)
  • How can the user actually see the information
    about features?
  • pop-up menu on mouseover?
  • would have option to pop up details in separate
    window, manage annotation, etc.
  • displayed in a sidebar a la Google Local?
  • There is no one True Answer, so maybe we can
    build all of the above and provide options to
    toggle between things

32
Client side features to be implemented (3)
  • Feature search
  • by feature, keyword, regular expression, etc.
  • search results display
  • pop open a table (load Component) displaying
    results; clicking on results in the table will center
    the view on them
  • multiple views can open up stacked on one another
  • can be used to display synteny: link them all to
    a single horizontal dragging ruler

33
Client side features to be implemented (4)
  • Posting things to server (what protocol? XML?
    JSON?)
  • community annotation
  • feature upload
  • automated bug reporting system
  • Needs to check for changes in server-side
    database, tiles rendered, etc., since community
    annotation may change contents that you are
    looking at

34
Client side features to be implemented (5)
  • Bookmarking
  • entire state of browser encoded in URL
  • can use Web browser bookmarking to save
  • have internal tracking of history
  • internal back/forward buttons, log of what you
    did
  • every Component must have getState() and
    setState() defined to implement this
  • JSON would be perfect for this, no?
  • Output current view to image (PNG, SVG, etc.)
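
For the bookmarking bullets, encoding the whole browser state in the URL with JSON could look like this. It is a sketch under assumptions: the state is put in the URL fragment, and stateToUrl, urlToState, and the example address are all placeholders.

```javascript
// Serialize a state object into the fragment part of a URL.
function stateToUrl(baseUrl, state) {
  return baseUrl + "#" + encodeURIComponent(JSON.stringify(state));
}

// Recover the state object from such a URL, or null if there is none.
function urlToState(url) {
  const hashAt = url.indexOf("#");
  if (hashAt === -1) return null;
  return JSON.parse(decodeURIComponent(url.slice(hashAt + 1)));
}

// Example state, shaped like the result of polling getState() on
// every Component (field names are made up).
const currentState = {
  viewer: { center: 5000, zoom: 2 },
  tracks: { hidden: ["repeats"] },
};
const bookmarkUrl = stateToUrl("http://example.org/browser.html", currentState);
const restoredState = urlToState(bookmarkUrl);
```

Since the result is an ordinary URL, the Web browser's own bookmarks can store it, and an internal history log could keep a list of the same strings.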

35
Client side features to be implemented (6)
  • The genome browser as a plug-in
  • runs in a little box on someone else's website to
    show an example

36
This was written to the sounds of
  • Tortoise - Standards
  • Jazz History Vol. 5 Now As Then-Revival
  • Tosca - Suzuki
  • Aphex Twin - I Care Because You Do
  • Squarepusher - Ultravisitor