Genome Browsing and AJAX: Advancing GMOD's GBrowse to the Next Level

1
Genome Browsing and AJAX: Advancing GMOD's
GBrowse to the Next Level

by Andrew Uzilov
for Holmes Lab group meeting
October 13, 2006

2
Genome browsers are not just a good idea... they
are The Way

Necessary for visualizing and understanding
large amounts of genomic information
genome organization (including synteny)
multiple splicing
comparing predictions against known data
some insights may be more obvious visually than
through flat files, database queries, or writing
custom programs for data analysis

3
What else are they good for?

Retrieving information
point-and-click on features of interest: better
interface for exploring
BLAST and other database searches: get you a
visual of the genomic context, not just text
Prepare pretty pictures for publications
annotation upload feature is a must for this
Better interface for community annotation (genome
wiki)
Genome feature WYSIWYG editor?

4
What are the problems with current genome
browsers? (1)

Most Web-based genome browsers are static HTML
pages: the entire page is refreshed (HTML
generated anew by server) anytime the user
navigates, changes layout, etc.
Delay incurred while page reloads: annoying
Vertical scroll position lost: also annoying
Sometimes, JavaScript or Flash is used to provide
some dynamic content (you can change certain
things without triggering reload), but usually
navigation still causes reloads

5
What are the problems with current genome
browsers? (2)

Most (all?) Web-based genome browsers rely on
the "server renders graphics from scratch upon
client request" model
Images for genome views are rendered on demand,
after user navigates, changes layout, etc.,
making the user wait
Rendered images aren't reused or reusable: not
saved or cached, rendered anew each time
There are difficulties in preparing pre-rendered
content

6
Pre-rendering difficulties

It would be ideal to have all images rendered ahead
of time, then just serve them up, requiring no
live rendering overhead/delay
obstacle is that pixel width of genome views is
quite, quite, quite large: can't render view as
single image, will run out of memory
can't render in small parts either, as
BioPerl/GBrowse will not produce parts that
concatenate into a nicely contiguous genome view
(the same probably holds for other rendering frameworks)

7
The Insight

Make BioPerl think it is rendering a massively
wide single image, but instead intercept all
rendering calls to the graphics library (i.e. the
graphics primitives) and store them in database
Now, we can query the database for only a
manageable subset of primitives (i.e. only those
required for a single tile, the basic unit out
of which the total genome view is constructed)
and render only them, producing a
reasonably-sized tile image
primitives' coordinates are offset if they start
in tiles prior to (left of) the current one
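
A minimal sketch of the query-and-offset step described on this slide, written in JavaScript rather than the Perl/MySQL of the actual server code. The in-memory array stands in for the database, and the names `TILE_WIDTH` and `primitivesForTile` and the record layout are assumptions made for illustration, not taken from the talk.

```javascript
// Hypothetical tile width in pixels; the real value is not given in the talk.
const TILE_WIDTH = 1000;

// Stand-in for the database: each recorded primitive keeps its horizontal
// extent in the coordinates of the (virtual) massively wide image.
const primitives = [
  { op: "line", x1: 200, x2: 800 },
  { op: "rectangle", x1: 900, x2: 1300 }, // spans tiles 0 and 1
  { op: "line", x1: 2500, x2: 2600 },
];

// Return the primitives overlapping tile `n`, with x coordinates shifted
// so that the tile's left edge sits at x = 0.
function primitivesForTile(n, store = primitives, tileWidth = TILE_WIDTH) {
  const left = n * tileWidth;
  const right = left + tileWidth - 1;
  return store
    .filter((p) => p.x1 <= right && p.x2 >= left)
    .map((p) => ({ ...p, x1: p.x1 - left, x2: p.x2 - left }));
}
```

A primitive that starts in an earlier tile comes back with a negative `x1`; the drawing step would then clip it at the tile's left edge.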

8
Enter THE WEB 2.0 GBROWSE

Basic philosophy
the client is an application
maintains internal state (no longer a static
page)
knows how to render itself (old way: server
generates the whole page's HTML for you)
knows how to change itself dynamically (old way:
server generates new HTML for you)
the server is a... well, literally, a server
pre-processes as much as possible to reduce
session-time delays/overhead
off-loads as much work as possible on the client
all this reduces server load, speeding up session
less trite name under review

9
So how does it work?

Based on GMOD's GBrowse framework
The server-side GBrowse Perl code for rendering
genome views (i.e. the gbrowse_img script for the
CGI) was hacked apart and back together to be a
standalone pre-rendering program that uses
BioPerl and GD libraries in the same way as
GBrowse
except TiledImage.pm intercepts calls to cache
primitives and render tiles
The client was written in JavaScript from scratch

10
Server side - the original way
The GBrowse framework (from Stein LD, Mungall C,
Shu S, Caudy M, Mangone M, Day A, Nickerson E,
Stajich JE, Harris TW, Arva A, Lewis S (2002).
The generic genome browser: a building block for
a model organism system database. Genome
Research 12(10), 1599-1610.)
11
Server side - the new way
or at least one current proposed new way
(subject to change)
12
Server side features currently implemented (1)

MySQL database
tile rendering Perl: TiledImage.pm
intercepts Bio::Graphics::Panel calls to
GD::Image (using AUTOLOAD) and stores them in
database, keyed on the bounding box to which they
apply
now, if we want to know which GD primitives need
to be rendered for some tile, we just search the
database for all primitives overlapping with the
tile bounding box
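
To illustrate the interception pattern only: TiledImage.pm relies on Perl's AUTOLOAD, and a rough JavaScript analogue (swapped in here, not what the server uses) is a `Proxy` whose `get` trap hands back a recording function for any method name. The name `makeRecorder`, the log layout, and the method names in the usage lines are assumptions; the calls are merely recorded, never drawn.

```javascript
// Returns an object that accepts any method call and records it instead
// of drawing, together with the log of recorded calls.
function makeRecorder() {
  const log = [];
  const recorder = new Proxy({}, {
    get(_target, name) {
      // Every property access yields a function that stores the call.
      return (...args) => {
        log.push({ method: String(name), args });
      };
    },
  });
  return { recorder, log };
}

// Usage: the caller believes it is drawing; we only collect primitives.
const { recorder, log } = makeRecorder();
recorder.line(10, 5, 90, 5, "black");
recorder.filledRectangle(20, 10, 60, 20, "blue");
```

Each log entry could then be stored with the bounding box derived from its arguments, which is the part the slide describes as keyed in the database.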

13
Server side features currently implemented (2)

tile rendering Perl: generate-tiles.pl
uses TiledImage.pm to:
fill MySQL database with graphics primitives
render tiles from a given database of primitives
generate XML containing client config info
do any combination of the above, including on
subsets of tiles (allows breaking rendering down
into jobs, suitable for rendering on multiple
CPUs)
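
One way the "subsets of tiles" idea could be turned into jobs is sketched below. The talk does not show how generate-tiles.pl selects tile subsets, so `splitIntoJobs` and its contiguous-range scheme are purely hypothetical.

```javascript
// Split tile numbers 0..tileCount-1 into at most `jobCount` contiguous
// ranges, each of which could be rendered by a separate process.
function splitIntoJobs(tileCount, jobCount) {
  const jobs = [];
  const size = Math.ceil(tileCount / jobCount);
  for (let first = 0; first < tileCount; first += size) {
    jobs.push({ first, last: Math.min(first + size, tileCount) - 1 });
  }
  return jobs;
}
```

Because tiles are rendered independently from the stored primitives, the ranges need no coordination beyond agreeing on the tile numbering.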

14
Server side features to be implemented,
short-term (1)

tile serving module
pre-fill the database with all primitives, but
only render them selectively as users request
tiles
store already rendered tiles to prevent
re-rendering
maybe idle server CPU cycles can be used to
render arbitrary tiles, always filling the tile
space
generate-tiles.pl should process externally
rendered tiles, e.g.
dotplot tracks
histograms
supporting material for features such as pictures
of fluorescent gene expression profiles,
physiological changes due to gene knockout
experiments, etc.
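
The "render selectively, store already rendered tiles" idea amounts to a render-on-demand cache. A small sketch, assuming a `renderTile(track, zoom, tileNumber)` function exists somewhere; that signature, `makeTileServer`, and the key format are illustrative assumptions, not the planned module's design.

```javascript
// Wrap a (possibly slow) render function so that each tile is rendered
// at most once and later requests are served from the cache.
function makeTileServer(renderTile) {
  const cache = new Map(); // key: "track/zoom/tileNumber"
  return function getTile(track, zoom, tileNumber) {
    const key = `${track}/${zoom}/${tileNumber}`;
    if (!cache.has(key)) {
      cache.set(key, renderTile(track, zoom, tileNumber));
    }
    return cache.get(key);
  };
}
```

Idle-time pre-rendering would simply call `getTile` for tiles nobody has asked for yet, filling the same cache.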

15
Server side features to be implemented,
short-term (2)

database optimizations
query for primitives is the slow step: rendering
takes much longer than loading database
key on tile number (1 key), not bounding box (4
keys)
go to whiteboard
gridlines account for >50% of the primitives, but
are the same for every tile
maybe load gridlines for just one tile, and
return them for every query?
GUI to wrap generate-tiles.pl
should be built into Web interface for annotation
upload
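
A sketch of the two database optimizations above, with a `Map` standing in for a table keyed on tile number. The class name `TileIndex` and the record fields are assumptions. Each primitive is filed under every tile it touches at load time, so a query becomes a single-key lookup, and the gridlines are stored once and returned with every tile.

```javascript
class TileIndex {
  constructor(tileWidth) {
    this.tileWidth = tileWidth;
    this.byTile = new Map(); // tile number -> array of primitives
    this.gridlines = [];     // identical for every tile, stored once
  }

  // File the primitive under each tile its horizontal extent overlaps.
  add(primitive) {
    const first = Math.floor(primitive.x1 / this.tileWidth);
    const last = Math.floor(primitive.x2 / this.tileWidth);
    for (let t = first; t <= last; t++) {
      if (!this.byTile.has(t)) this.byTile.set(t, []);
      this.byTile.get(t).push(primitive);
    }
  }

  setGridlines(lines) {
    this.gridlines = lines;
  }

  // One-key lookup; the shared gridlines are prepended for every tile.
  query(tile) {
    return [...this.gridlines, ...(this.byTile.get(tile) || [])];
  }
}
```

The trade-off is that a primitive spanning several tiles is stored several times, in exchange for avoiding the four-column bounding-box comparison at query time.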

16
Server side features to be implemented,
long-term (1)

how to serve up feature info?
short-term solution
have generate-tiles.pl produce an XML file with
feature data (bounding boxes, etc.) for each
tile, since it has easy access to that info
client loads and parses an XML file for each tile
more robust solution
need a database of features (but what kind?)
necessary to support efficient search for
features
necessary for community annotation, because
people will be changing the feature info
constantly

17
Server side features to be implemented,
long-term (2)

community annotation
concurrency is an issue (updating changes,
notifying client of updates since start of
session, locking features for editing, etc.)
feature upload seems (to me) to be a special case
of community annotation and should use its
framework
quality control (registration, security)
are there existing database schemas or other
frameworks that can serve this purpose?

18
Client side features currently implemented

Dragging works, but with bugs when large views
are involved (fix is non-trivial, in progress)
Also work: jumping, centering, zooming, dynamic
resize
Tracks can be toggled hidden/visible
Hovering labels (either all on, or pop up on
mouseover), with adjustable transparency

19
Brief aside: what is AJAX?

Asynchronous JavaScript and XML
A combination of technologies to make clients
behave more like applications
JavaScript client code that uses XMLHttpRequest
to asynchronously query the server for things
Implies XHTML (well-formed HTML) and DHTML (DOM
manipulation), use of CSS
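
For readers who have not seen it, a bare-bones asynchronous request with `XMLHttpRequest` looks roughly like this. The function name `fetchText` is made up for this sketch, and the `XhrCtor` parameter is an assumption added only so the code can be exercised outside a browser; in a page it defaults to the browser's own `XMLHttpRequest`.

```javascript
// Fetch `url` asynchronously and hand the response text to `onDone`.
function fetchText(url, onDone, XhrCtor = XMLHttpRequest) {
  const xhr = new XhrCtor();
  xhr.open("GET", url, true); // third argument: asynchronous
  xhr.onreadystatechange = function () {
    // readyState 4 means the request has completed.
    if (xhr.readyState === 4 && xhr.status === 200) {
      onDone(xhr.responseText);
    }
  };
  xhr.send(null);
}
```

The page keeps running while the request is in flight; the callback fires later, which is what lets the client update itself without a reload.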

20
Why I am avoiding existing AJAX frameworks

Useful for flashy graphics effects, but don't
help with the engine of the client (except maybe
Prototype and Google Web Toolkit)
but GWT is closed source and an early version:
even the online demo has bugs
None support dragging, track management, tile
caching, etc., so that needs to be done ourselves
(and has, so far, consumed most of the effort)
But I'm willing to consider them for:
adding graphics effects after engine is more
developed
asynchronous communication with the server

21
DOM: from XHTML to a tree

<table>
<tbody>
<tr>
<td>Shady Grove</td>
<td>Aeolian</td>
</tr>
<tr>
<td>Over the River, Charlie</td>
<td>Dorian</td>
</tr>
</tbody>
</table>

This is from a W3 page, so you know it's right:
http://www.w3.org/TR/2004/REC-DOM-Level-3-Core-20040407/introduction.html
22
Client: the nitty-gritty (1)

Code is broken down into multiple JavaScript
classes
by which I mean just separate .js files, most of
which are object instances that provide
class functions and methods
namespaces
modularity, organization
Static classes (standalone file, no instance)
Other.js: misc. helpers
Load.js: loads XML; when it is loaded,
instantiates all objects in the correct order

23
Client: the nitty-gritty (2)

The Component system
An attempt to bring order to chaos
Each discrete UI element (e.g. main view,
navigation panel, panel with track control
buttons, etc.) is a Component
code for each component in its own file
Components are
instantiated by Load.js
connected through ComponentInterface.js
should not modify other Component properties
directly (although JavaScript allows this), but
rather use ComponentInterface.js, for sanity!

24
Client: the nitty-gritty (3)

Each Component must define
constructor
renderComponent()
returns the DOM node for this Component
will (eventually) be called by Load.js, which
will then take the DOM node and append it to
document
once fully implemented, there will be no need for
content in <body> of XHTML: JavaScript will
render everything dynamically
which allows for possibility of having a
server-side config file specifying client-side
layout, thus further removing users from the
necessity of doing any programming

25
Client: the nitty-gritty (4)

Each Component must also define
getState()
for setting bookmarks/history
setState()
for restoring bookmarked/history points
some bookmarking object will eventually use the
above methods to store/load bookmarked states by
polling all Components
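
A sketch of how a bookmarking object might poll Components through `getState()` and `setState()`. The two toy Components, the `Bookmarks` class, and the JSON snapshot format are assumptions for illustration; only the two method names come from the slides.

```javascript
// Toy Components exposing only the state half of the contract.
class ZoomComponent {
  constructor() { this.zoom = 1; }
  getState() { return { zoom: this.zoom }; }
  setState(state) { this.zoom = state.zoom; }
}

class PositionComponent {
  constructor() { this.center = 0; }
  getState() { return { center: this.center }; }
  setState(state) { this.center = state.center; }
}

// Hypothetical bookmarking object: polls every Component for its state
// on save, and pushes the saved state back to each one on restore.
class Bookmarks {
  constructor(components) {
    this.components = components; // e.g. { zoom: ..., position: ... }
    this.saved = new Map();
  }
  save(name) {
    const snapshot = {};
    for (const [id, c] of Object.entries(this.components)) {
      snapshot[id] = c.getState();
    }
    this.saved.set(name, JSON.stringify(snapshot));
  }
  restore(name) {
    if (!this.saved.has(name)) return false;
    const snapshot = JSON.parse(this.saved.get(name));
    for (const [id, c] of Object.entries(this.components)) {
      c.setState(snapshot[id]);
    }
    return true;
  }
}
```

Serializing the snapshot to a string keeps the bookmark independent of later changes to the live Components, and the same string could be put in a URL for history support.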

26
Client: the nitty-gritty (5)

If a programmer writes a new Component, they have
to
add accessors/modifiers for its object properties
to ComponentInterface.js
add calls to constructor and renderComponent() to
Load.js
However, eventually, accessor/modifier
construction will be done automatically by
ComponentInterface.js (in theory, it's possible)
this means that a Component programmer never has
to look outside their own Component code, using
the API for the other Components to access/modify
them
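
Automatic accessor/modifier construction could, in principle, look like the sketch below. This is not the code in ComponentInterface.js; `buildInterface` and the `get<Component><Property>` naming scheme are assumptions chosen for the example.

```javascript
// Build getX()/setX() functions for every own enumerable property of
// each registered Component, so no accessor has to be written by hand.
function buildInterface(components) {
  const api = {};
  for (const [name, component] of Object.entries(components)) {
    for (const prop of Object.keys(component)) {
      const suffix = name + prop.charAt(0).toUpperCase() + prop.slice(1);
      api["get" + suffix] = () => component[prop];
      api["set" + suffix] = (value) => { component[prop] = value; };
    }
  }
  return api;
}
```

For example, registering `{ Viewer: { center: 5000 } }` would yield `getViewerCenter()` and `setViewerCenter(value)`, which other Components could call instead of touching the Viewer's properties directly.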

27
Gods below! Was it really necessary to take 5
slides for this?

Yes, because object-oriented programming in
JavaScript requires discipline, and it's important
to work these things out early on
with multiple people working on this code, it
needs to be compartmentalized somehow
otherwise, debugging may cause blood pressure to
rise to dangerous levels (although the Venkman
debugger will alleviate that)
see ComponentTemplate.js in SVN for a template,
with guidelines on how to write a component of
your own

28
Client: the nitty-gritty (6)

Current components
ViewerComponent.js
NavigationComponent.js
TrackControlComponent.js
DebugComponent.js
Other classes
View.js
stores limited information about current view
intended to be the class that manages feature
info fetching, caching, etc.
TracksAndZooms.js
just a data structure to hold config info from
the XML file and current state info about what
zoom level we're at, and what tracks are
hidden/visible
These should really be prototypes for other
objects

29
Client: the nitty-gritty (7)

Dragging and genome view events
brace yourself, this is going to be ugly
go to whiteboard
Ideally, no one should have to deal with this
after it's been programmed, as it will be wrapped
up in ViewerComponent.js, and navigation can be
accomplished by using accessors to move view
around

30
Client side features to be implemented (1)

Client has no idea what the information on the
tiles actually means (no knowledge of where and
what the features are)
must be made aware of what it is displaying
short-term solution is to load this from XML file
for each tile (remember the server-side to-do?)
the client JavaScript class for doing this can
be later replaced with something more
sophisticated, e.g. an XSL transformational
grammar and XHR for fetching feature info from
database: there are many possibilities
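
Once per-tile feature data is available on the client, "knowing what it is displaying" can be as simple as a bounding-box lookup. In this sketch the XML is assumed to be already parsed into plain objects; the record fields, the sample features, and `featureAt` are all hypothetical.

```javascript
// Per-tile feature records as they might look after the tile's XML file
// has been parsed; coordinates are pixels within the tile.
const tileFeatures = {
  7: [
    { name: "geneA", x1: 100, y1: 20, x2: 400, y2: 35 },
    { name: "geneB", x1: 350, y1: 50, x2: 900, y2: 65 },
  ],
};

// Return the first feature whose bounding box contains (x, y) on the
// given tile, or null if the pointer is over empty space.
function featureAt(tile, x, y, features = tileFeatures) {
  const list = features[tile] || [];
  const hit = list.find(
    (f) => x >= f.x1 && x <= f.x2 && y >= f.y1 && y <= f.y2
  );
  return hit || null;
}
```

A mouseover handler would translate the pointer position into a tile number and tile-local coordinates, then call `featureAt` to decide what to show.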

31
Client side features to be implemented (2)

How can the user actually see the information
about features?
pop-up menu on mouseover?
would have option to pop up details in separate
window, manage annotation, etc.
displayed in a sidebar a la Google Local?
There is no one True Answer, so maybe we can
build all of the above and provide options to
toggle between things

32
Client side features to be implemented (3)

Feature search
by feature, keyword, regular expression, etc.
search results display
pop open a table (load Component) displaying
results; clicking on results in table will center
the view on them
multiple views can open up stacked on one another
can be used to display synteny: link them all to
a single horizontal dragging ruler
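
A small sketch of keyword/regular-expression search plus the "center the view on a result" step. The feature record fields (`name`, `note`, `start`, `end`) and both function names are assumptions; a real implementation would query whatever feature store the server ends up providing.

```javascript
// Return the features whose name or note matches `pattern`, which may
// be a plain keyword (case-insensitive substring) or a RegExp.
function searchFeatures(features, pattern) {
  const matches = pattern instanceof RegExp
    ? (s) => {
        pattern.lastIndex = 0; // keep global/sticky regexes stateless
        return pattern.test(s);
      }
    : (s) => s.toLowerCase().includes(String(pattern).toLowerCase());
  return features.filter((f) => matches(f.name) || matches(f.note || ""));
}

// Genome coordinate the view should be centered on for a given result.
function centerOn(feature) {
  return Math.round((feature.start + feature.end) / 2);
}
```

Clicking a row in the results table would pass that feature to `centerOn` and hand the coordinate to the viewer's navigation accessor.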

33
Client side features to be implemented (4)

Posting things to server (what protocol? XML?
JSON?)
community annotation
feature upload
automated bug reporting system
Needs to check for changes in server-side
database, tiles rendered, etc., since community
annotation may change contents that you are
looking at

34
Client side features to be implemented (5)