Title: XBrain
1 XBrain
XQuerying the Brain Mapping Database
- Stacy Tang, Yana Kadiyska
- Jim Brinkley, Dan Suciu
2Human Brain Project
- Problem
- Explosion of information due to proliferation of
techniques. - NIH Goal
- WWW based information tools that allow
management, integration, and sharing of research
data.
3Brain Mapping Database
- Study of language function through invasive
neurosurgical method, Cortical Stimulation
Mapping - Combined with non-invasive methods such as MRI,
fMRI, PET scan - 64 patients with 13 of them published
4XML
- Document markup language
- Become the standard for data exchange between
inter-enterprise applications - Platform independent
- Self-describing data
5XML example
- ltbibgt
- ltbook year"2000"gt
- lttitlegtData on the Weblt/titlegt
- ltauthorgtltlastgtAbiteboullt/lastgtltfirstgtSergelt/firs
tgtlt/authorgt - ltauthorgtltlastgtBunemanlt/lastgtltfirstgtPeterlt/firstgt
lt/authorgt - ltauthorgtltlastgtSuciult/lastgtltfirstgtDanlt/firstgtlt/au
thorgt - ltpublishergtMorgan Kaufmann Publisherslt/publisher
gt - ltpricegt39.95lt/pricegt
- lt/bookgt
- ltbook year1992gtlt/bookgt
- ...
- lt/bibgt
6SilkRoute
- Data stored in relational database, how to
translate to XML for exchange. - Is a tool for publishing relational data in XML.
- Allows querying of the data using XQuery.
- Developed by Dan and Yana, along with
collaborators from other institutions.
7What SilkRoute Does
PublicQuery
RelationalSchema
SilkRoute
OutputXML
User InputUser Query
SQL
tuples
RDBMS
8Objective of Project
- To demonstrate the usability of SilkRoute
- and XQuery for data sharing by applying it to
- a real relational database -- the Brain Mapping
- database.
9Project Background
- Started as a CSE544 (Intro to Database) project
(Spring 2002 Quarter). - Original project members Hao Li and myself.
- Demonstrated feasibility of project.
- Unfinished
- Covered small part of database
- Depended on manual tweaking of data
- Minimal web interface
10Tasks of the Project
- 1. Migrate database from MySQL to PostgreSQL -
automate as much as possible. - 2. Complete XQuery-based public view for the
entire database. - 3. Work with Yana to smooth out SilkRoute issues
- bug fixes, error handling, etc. - 4. Web interface - add new features, improve look
and feel, improve UI.
111.MySQL to PostgreSQL
- Why is this necessary?
- Robustness
- Sub-select queries
- Problems MySQL and PostgreSQL are very
different, and the data needs to be cleaned up. - The previous process involved too much manual
tweaking, need to improve. Wrote scripts for
this.
12MySQL to PostgreSQL - Step 1
- Make a dump of the MySQL database
- - MySQL database is on tela.biostr
- - Use a perl script to create a dump in a
specified directory.
13MySQL to PostgreSQL - Step 2
- Translate MySQL dump to PostgreSQL. Use scripts
to - - clean up syntax
- - rename table/column names that are reserved
words (user, public) in PostgreSQL. - - designate primary keys when lacking
- - get rid of WIRM related tables
14MySQL to PostgreSQL - Step 3
- Create SQL files for running later (generated
using python scripts). The SQL files - - correct some of the bad data
- - add foreign key constraints (lacking in the
MySQL dump)
15MySQL to PostgreSQL - Step 4
- Import the data into PostgresSQL
- - run the dump and generated SQL files in a
specific order to allow the data to be entered - - reorder the insert statements as to not
violate foreign key constraints - - still errors about bad rows, those arent
inserted
162. The Public View
- Provides a virtual view of the relational
database - Very large (over 1000 lines)
- Data Privacy
- Choose not to publish some fields.
- Protect patient privacy, e.g. patient.initials,
patient.research_num, etc. - Protect unpublished research data.
- How to translate graph to tree
- DB tables may not be hierarchical, so have to
force parent-child relationships for the DTD.
17Brain Mapping DB Schema
Patient(oid,initials,first_name,last_name,locatio
n,registered,age,sex,viq,pnum, is_public,handednes
s,wada,size,copy,pre,description,gao_research_num)
Surgery(oid,patient,surgery_date,surgeon,diagn
osis,side,lobe,grid) CSMStudy(oid,surgery,funct
ion,trial_data,site_data) File(oid,label,domain
,locator,source,mime_type,submit_date,submitted_by
, version,context,description) Photo(oid,prefer
ence,image,csmstudy,image_pathname,image_filename)
StimSite(oid,site_label,zone,lobe,csmstudy,an
atomical_name) Trial(oid,trial_num,site_label,t
rial_time,current,slide,eeg_score,miriam_code,
confidence,comments,km_score,site_suffix,csmstudy,
stimulation_site) UserPerson(oid,login,first_nam
e,last_name,email,password,user_group)
18Brain Mapping DB Schema (cont)
SiteToAnatomyMap(oid,csmstudy,photo,scene,author,
map_date, sitetoanatomyfile,rendered_map,sitetoana
tomy_pathname, sitetoanatomy_filename,
preference,modtime) SiteToAnatomyMapElement(o
id,sitetoanatomymap,stimsite,site_label,
ant_coord,sup_coord,right_coord,x,y,confidence)
Scene(oid,imaging_study,description,description_
file,preference, ismapscene) ImagingStudy(oid,p
atient,image_date,billed,prefix,subject,suffix,
computed_image_pathname,computed_image_filename,
computed_coords_pathname,computed_coords_filename,
lowres_surface_pathname,lowres_surface_filename,a
ligned_pathname) MRExam(oid,imaging_study,exam_
num,description,import_date, import_info,location)
Rendering(oid,rendering_type,preference,image,
scene,image_pathname, image_filename)
19Brain Mapping DB Schema (cont)
SceneComponent(oid,scene,description,surface_mode
l,volume) SurfaceModel(oid,volume,model_instanc
e,format,model_file, model_pathname,model_filename
,preference) RadialSliceModelInstance(oid,volum
e,model,landmarks_file,instance_file,expansion_fac
tor,instance_pathname,instance_filename,preference
, landmarks_pathname,landmarks_filename,derived_fr
om) RadialSliceModel(oid,pathname,filename,comm
ent,theta_radials,slices, training_set,model_file,
preference) MRSeries(oid,mrexam,location,showin
g,total_images,plane,scan_start,
scan_end,psd,type,description,fov_x,fov_y,height,w
idth,bytes_per_pixel, bits_per_pixel,optical_disk,
start_img,stop_img,threshold,tissue,first,last,
label,thickness,spacing) MRSlice(oid,sequence_
num,image_file,mrseries) AlignedVolume(oid,seri
es,format,volume_file,filename,tissue,patient)
20Brain Mapping DB Schema Diagram
Patient
Surgery
ImagingStudy
21Brain Mapping DB Schema Diagram (cont)
22Brain Mapping DB Schema Diagram (cont)
23The Public View DTD Graph
24The Public View DTD Graph (cont)
25The Public View - In XQuery
- ltrootgt
-
- for patient in cv/Patient
- where patient/is_public/text() "1"
- return
- ltpatient oid"patient/oid/text()"gt
- ltfirst_namegt xxx lt/first_namegt
- ltlast_namegt xxx lt/last_namegt
- ltlocationgt patient/location/text()
lt/locationgt - ltsexgt patient/sex/text() lt/sexgt
- ...
-
- for surgery in cv/Surgery
- where data(surgery/patient)
data(patient/oid) - return
- ltsurgery oid"surgery/oid/text()"gt
- ltdiagnosisgt surgery/diagnosis/text()
lt/diagnosisgt -
26User Queries - A Simple Example
Sample Query 1 (written in XQuery) List the last
names of all patients who DID NOT have surgery.
ltresultsgt for p in pv/root/patient
where empty(p/surgery) return ltlast_namegtp
/last_name/text()lt/last_namegt lt/resultsgt
27User Queries - A Simple Example
Alternative (written in XPath) XPath is a
subset of the XQuery language, and thus perfectly
acceptable to use for queries. You cant do as
much with XPath, but it is very simple to write.
ltresultsgt pv/root/patientempty(surgery)/last
_name lt/resultsgt
28User Queries - A Simple Example (cont)
Sample Query 1 Intermediate SQL query generated
by SilkRoute.
SELECT P78.last_name, P78.oid FROM Patient as
P78 WHERE NOT EXISTS ( SELECT
FROM Surgery as S99 WHERE S99.patient
P78.oid)
29User Queries - A Simple Example (cont)
Results (in XML)
ltresultsgt ltlast_namegtChopralt/last_namegt
ltlast_namegtTowneslt/last_namegt lt/resultsgt
303. Improve plumbing between SilkRoute and web
application
- Worked with Yana to improve error handling.
- If user inputs bad query, then return the parse
error back to the user. - When SilkRoute encounters an error, gracefully
exit instead of bringing down the web page.
314. Web Interface
- Located at http//quad.biostr.washington.edu8080
/xbrain/ - Make application available over web.
- Written in JSP and served by Tomcat.
- Talks to SilkRoute through a Java interface.
- Allows users to input their own queries and get
XML results. - Added feature for letting certain super users
to access a version of the public view that
contains all the patients (not just the 13 public
ones).
32Web Interface - System Diagram
quad.biostr.washington.edu
SilkRoute
Postgres
Tomcat4
XQuery
XML
MySQL
Web Browser(Internet)
33Web Interface - System Architecture
Tomcat (Application Server) Runs JSP/Servlets
XBrain pages
JSPs
Java API
Java Classes
SilkRoute
DB
34Web Interface - Screen Shots
35Web Interface - Screen Shots
36Web Interface - Screen Shots
37Web Interface - Screen Shots
38Web Interface - Screen Shots
39Web Interface - Screen Shots
40Web Interface - Screen Shots
41Current Status Future Work
- Currently, the website is up and running at
http//quad.biostr.washington.edu8080/xbrain/ - Immediate Future
- Figure out who the super users are by looking in
the UserPerson table. - Store user input in temporary files, to better
handle simultaneous users. - Add Secure Socket Layer (SSL) to ensure secure
transfer of XML data when user is logged in. - SilkRoute bug fixes.
42Future Work
- Future
- Graphical User Interface to help users formulate
user queries. - Flexible format for visualizing results (i.e.,
comma separated values instead of XML). - Extend this to other databases.
- Eventual goal of allowing multiple applications
to cooperate in a peer data management system.
43Team/Resources
- SilkRoute support Yana
- Faculty Dan Suciu, Jim Brinkley
44Questions?
For more information, go to the XBrain webpage
http//quad.biostr.washington.edu8080/xbrain/