Title: System Evolvability Features of the SenseLab Project
1 System Evolvability Features of the SenseLab Project
- Luis Marenco, MD 
- Center for Medical Informatics 
- Yale University School of Medicine
2 Outline
- The nature of some bioscience applications (e.g.
 neuroscience), where domain knowledge is under
 constant revision, requires an application
 infrastructure capable of evolving over time.
- The reasons for this problem and one possible
 solution to it are reviewed in the following
 topics:
- Motivation: the SenseLab project
- Background: issues of standard applications
- Evolvable application goals
- Some possible solution scenarios
- EAV/CR: features for evolvable applications
- EAV/CR-derived methodologies for evolvable
 applications
- EAV/CR application demo (SenseLab)
- EAV/CR solution framework
3 Motivation: The SenseLab Project
- The SenseLab project is an ongoing effort to
 integrate multidisciplinary sensory data using
 the olfactory system as a model domain.
- The process involves the development of
 neuroinformatics databases and tools in support
 of neuroscience research.
- The SenseLab web portal contains the following
 web databases:
- Neuronal research: NeuronDB, ModelDB, and
 CellPropDB
- Olfactory research: ORDB, OdorDB, and OdorMapDB
- The fundamental problem is the maintenance
 burden due to:
- Constant domain evolution
- Research on poorly understood processes such as
 olfaction involves constantly factoring in new
 variables or disciplines
4 Background: Issues of Standard Applications
- Standard database applications are characterized
 by code entwined with metadata descriptors from
 back-end databases. The limitations of this
 approach are:
- Increased coding as database complexity grows
- Limited code reusability
- Lack of robust data interoperability (messages
 mirror the schema)
- Complexity arising from the use of multiple tools
 to maintain the schema, data editing, and security
- To advance knowledge represented as metadata, the
 necessary schema changes lead to:
- Downtime and application breakdown
- Interface redesign (GUI and inter-application
 recoding)
- Increased code complexity
- Increased probability of coding errors
5 Background: Issues of Standard Applications (2)
- Traditional web-database applications
- Data entry and security: cumbersome, expensive,
 and not portable to other applications
- Searching mechanisms: limited, difficult to
 standardize, and expensive to create. The hidden
 web remains an issue
- Site-wide architectures are cumbersome to adapt
 to new web formats (e.g. Semantic Web types)
- Metadata maintenance
- Data dictionary: incomplete
- Complex, non-centralized, and requires more than
 one tool
- Requires specific database expertise; the
 knowledge is not portable
- Tools and software libraries are specific to
 each database vendor
6 Evolvable Application Goals
- PRIMARY
- Create a programmatic approach that allows
 structural database changes without disrupting
 the existing data and code
- Minimize code/metadata dependency, focusing on
 automated interface generation (GUI and
 inter-application)
- Simplify code as the project matures (Extreme
 Programming principles)
- SECONDARY
- Facilitate system integration with a Web platform
- Accessibility from common web browsers
- Incorporate role-based security with public and
 private data
- Create generic interfaces and formats for data
 exchange
- Improve code reusability, leveraging the previous
 approach
- Plan for robust interoperability with
 standardized protocols
7 Possible Solution Scenarios (some)
- Use object-oriented or object-relational
 databases: immature and unsupported
- Leverage other application approaches (e.g.
 Protégé) for the part related to flexible data
 structures: lacks features (e.g. not distributed
 or web-based, no security implied). Future
 versions may cover these features.
- Build a new ground-up solution to provide the
 needed features: the EAV/CR Application Framework
 (a combination of a data storage approach and
 software practices)
8 EAV/CR Storage Approach
- The EAV/CR (Entity-Attribute-Value with Classes and
 Relationships) data storage system is derived
 from the row-based EAV data modeling approach
 widely used in electronic patient record systems
 and the MS Windows Registry, among others.
9 Relational (left) to EAV/CR (right) Comparison
EAV/CR uses a limited number of tables to
represent any number of tables from a relational
DB. EAV/CR treats data (VALUES) and metadata
(CLASSES, ENTITIES, and ATTRIBUTES) as
relational data, allowing flexible domain
representation.
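The idea of a fixed set of tables holding both data and metadata can be sketched as follows. This is a minimal illustration using Python's built-in sqlite3 module, not the SenseLab schema: the table names, column names, and sample neuron rows are all assumptions made for the example.

```python
import sqlite3

# Hypothetical four-table layout; the real EAV/CR schema is richer than this.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Metadata: classes and their attributes are ordinary rows.
    CREATE TABLE classes    (class_id INTEGER PRIMARY KEY, name TEXT UNIQUE);
    CREATE TABLE attributes (attr_id INTEGER PRIMARY KEY, class_id INTEGER,
                             name TEXT, datatype TEXT);
    -- Data: entities (instances of a class) and one row per attribute value.
    CREATE TABLE entities   (entity_id INTEGER PRIMARY KEY, class_id INTEGER);
    CREATE TABLE eav_values (entity_id INTEGER, attr_id INTEGER, value TEXT);
""")

def add_class(name):
    """Register a class (the counterpart of a relational table)."""
    return conn.execute("INSERT INTO classes (name) VALUES (?)", (name,)).lastrowid

def add_attribute(class_id, name, datatype):
    """Register an attribute (the counterpart of a relational column)."""
    return conn.execute(
        "INSERT INTO attributes (class_id, name, datatype) VALUES (?, ?, ?)",
        (class_id, name, datatype)).lastrowid

def add_entity(class_id, **values):
    """Store one entity as a set of attribute-value rows."""
    entity_id = conn.execute(
        "INSERT INTO entities (class_id) VALUES (?)", (class_id,)).lastrowid
    for attr_name, value in values.items():
        (attr_id,) = conn.execute(
            "SELECT attr_id FROM attributes WHERE class_id = ? AND name = ?",
            (class_id, attr_name)).fetchone()
        conn.execute("INSERT INTO eav_values VALUES (?, ?, ?)",
                     (entity_id, attr_id, str(value)))
    return entity_id

def get_entity(entity_id):
    """Reassemble an entity into an attribute-name -> value mapping."""
    rows = conn.execute(
        "SELECT a.name, v.value FROM eav_values v "
        "JOIN attributes a ON a.attr_id = v.attr_id WHERE v.entity_id = ?",
        (entity_id,)).fetchall()
    return dict(rows)

neuron = add_class("Neuron")
add_attribute(neuron, "name", "text")
first = add_entity(neuron, name="mitral cell")

# Domain evolution: a new attribute is one more metadata row, not an ALTER TABLE.
add_attribute(neuron, "region", "text")
second = add_entity(neuron, name="granule cell", region="olfactory bulb")
```

In this sketch the number of physical tables stays at four no matter how many classes or attributes are added, which is the property the comparison slide points to.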
10 EAV/CR Storage Approach (2)
- EAV/CR augments standard EAV by:
- Grouping entities in Classes (the "C")
- Using strong data typing for value storage
- Allowing computed attributes (functions)
- Allowing entity Relationships (the "R": related and
 hierarchical attributes)
- Including implicit data and metadata versioning
 and timestamps
- Including Web-oriented features: metadata have
 been enriched with web parameters to automate
 web-interface generation (Web forms, XML, ...)
- Assisting ontological representation: mapping
 standardized vocabulary and semantic
 relationship identifiers to data and metadata
 elements
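Two of the augmentations above, strong data typing and implicit versioning with timestamps, can be sketched in a few lines. The sketch assumes one value table per data type, which is a common way to type EAV values but is not confirmed here as the SenseLab implementation. TypedStore, TYPE_MAP, and the sample attribute are hypothetical names.

```python
from dataclasses import dataclass
from datetime import datetime, timezone

# Hypothetical mapping from metadata datatype names to Python types.
TYPE_MAP = {"integer": int, "real": float, "text": str}

@dataclass
class Attribute:
    name: str
    datatype: str  # one of the keys of TYPE_MAP

@dataclass
class ValueRow:
    entity_id: int
    attribute: str
    value: object
    version: int
    stamped: datetime

class TypedStore:
    """Keeps one value table per data type and never overwrites a row."""

    def __init__(self):
        self.tables = {datatype: [] for datatype in TYPE_MAP}

    def _rows(self, entity_id, attr):
        return [r for r in self.tables[attr.datatype]
                if r.entity_id == entity_id and r.attribute == attr.name]

    def put(self, entity_id, attr, value):
        """Append a new version of a value after checking its type."""
        expected = TYPE_MAP[attr.datatype]
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, expected):
            raise TypeError(f"{attr.name} expects {attr.datatype}")
        version = len(self._rows(entity_id, attr)) + 1
        self.tables[attr.datatype].append(
            ValueRow(entity_id, attr.name, value, version,
                     datetime.now(timezone.utc)))
        return version

    def current(self, entity_id, attr):
        """Return the latest version of a value, or None if never set."""
        rows = self._rows(entity_id, attr)
        return max(rows, key=lambda r: r.version).value if rows else None

store = TypedStore()
diameter = Attribute("soma_diameter_um", "real")  # illustrative attribute
store.put(1, diameter, 20.0)
store.put(1, diameter, 22.5)  # kept as version 2; version 1 is preserved
```

Because every write appends a row, earlier versions remain available for auditing, and a value of the wrong type is rejected before it reaches storage.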
11 EAV/CR Features for Evolvable Applications
- Automatic system adaptability to DB structural
 changes
- Generic metadata-driven database navigation
- Robust generation of data-entry and
 schema-maintenance web forms
- Ability to create database portals that present
 different subsets of the data to users with a
 particular research focus
- Centralized role-based security, using a
 compartmentalized, distributed administration
 model to minimize dedicated administration costs
- Monitoring tools
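Metadata-driven form generation, listed above, can be illustrated with a short sketch. It assumes that each attribute's metadata carries a few web parameters (a label and a required flag); the metadata layout, the /save/ URL, and the function name are hypothetical and are not taken from the SenseLab code, which is written in ASP.

```python
from html import escape

# Hypothetical attribute metadata enriched with web parameters.
MODEL_ATTRIBUTES = [
    {"name": "name", "datatype": "text", "label": "Model name", "required": True},
    {"name": "year", "datatype": "integer", "label": "Year", "required": False},
]

# Assumed mapping from storage data types to HTML input types.
INPUT_TYPES = {"text": "text", "integer": "number", "date": "date"}

def render_form(class_name, attributes):
    """Build an HTML data-entry form purely from attribute metadata."""
    lines = [f'<form method="post" action="/save/{escape(class_name)}">']
    for attr in attributes:
        input_type = INPUT_TYPES.get(attr["datatype"], "text")
        required = " required" if attr.get("required") else ""
        lines.append(
            f'  <label>{escape(attr["label"])} '
            f'<input type="{input_type}" name="{escape(attr["name"])}"{required}>'
            f'</label>')
    lines.append('  <button type="submit">Save</button>')
    lines.append("</form>")
    return "\n".join(lines)

form = render_form("Models", MODEL_ATTRIBUTES)

# Adding an attribute to the metadata changes the form without new GUI code.
extended = MODEL_ATTRIBUTES + [
    {"name": "published", "datatype": "date", "label": "Published", "required": False},
]
extended_form = render_form("Models", extended)
```

The same function serves every class, which is the sense in which the interface adapts automatically to structural changes.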
12 EAV/CR Features for Evolvable Applications (2)
- Expandable system architecture: allows parallel
 processing by scaling out. Parallel web servers
 can connect to the same EAV/CR database while
 preserving security and data and metadata
 concurrency
- Delegated user profile management: users are
 responsible for their own profiles, and
 administrators grant users access to specific
 database resources (Web portal model)
13 EAV/CR-Derived Methodologies for Evolvability
- Data services: creation of the EDSP InfoSet
 protocol to allow description of database
 ontology, metadata, and data in a simple XML
 format. (It brings the EAV/CR approach to the XML
 world.)
- The following processes depend on EDSP:
- Data transfer
- Middle-tier components
- Automated ad hoc query interface generation
- Using EDSP as the source for these processes
 improves the stability and reusability of
 software components
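The general idea of carrying metadata and data together in one XML message can be sketched as below. The actual EDSP element and attribute names are not given in these slides, so the infoset, class, attribute, entity, and value names used here are invented for illustration only, as is the sample receptor data.

```python
import xml.etree.ElementTree as ET

def to_infoset(class_name, attributes, entities):
    """Serialize class metadata and entity data into one XML string.

    The element names are hypothetical and do not reproduce the EDSP format.
    """
    root = ET.Element("infoset")
    cls = ET.SubElement(root, "class", attrib={"name": class_name})
    for attr_name, datatype in attributes.items():
        ET.SubElement(cls, "attribute",
                      attrib={"name": attr_name, "datatype": datatype})
    for entity_id, values in entities.items():
        ent = ET.SubElement(root, "entity",
                            attrib={"id": str(entity_id), "class": class_name})
        for attr_name, value in values.items():
            val = ET.SubElement(ent, "value", attrib={"attribute": attr_name})
            val.text = str(value)
    return ET.tostring(root, encoding="unicode")

# Made-up sample data, used only to exercise the function.
message = to_infoset(
    "Receptor",
    {"name": "text", "length": "integer"},
    {1: {"name": "example receptor", "length": 312}},
)

# A consumer needs no prior knowledge of the schema: it reads it from the message.
parsed = ET.fromstring(message)
declared = {a.get("name"): a.get("datatype") for a in parsed.iter("attribute")}
```

Because the message describes its own structure, components that consume it (data transfer, middle tier, query interface generation) do not have to be recoded when a class gains an attribute.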
14 EAV/CR Application Framework
- Programming model
- Component programmer
- Domain programmer
- EAV/CR Framework Toolkit (version 1. Codename?)
- Database component: encapsulates EAV/CR logic,
 presenting interfaces for domain programmers.
 Created in MS C.NET
- Plumbing code: generic Web portal scripts.
 IIS-ASP-VBScript
15 Summary
- EAV/CR and evolvability
- High data integration
- Flexibility in database schema evolution and
 maintenance
- Code reuse and increased reliability
- Extensible application architecture
- Disadvantages
- Querying complexity
- Performance penalty for multi-parameter queries
- Complex programming of EAV/CR components
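The querying disadvantages can be made concrete with a small sketch. In a row-per-value layout, each additional search criterion typically needs another self-join on the value table, which is where both the complexity and the performance penalty come from. The single-table layout, the function name, and the sample rows below are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE eav (entity_id INTEGER, attribute TEXT, value TEXT)")
conn.executemany("INSERT INTO eav VALUES (?, ?, ?)", [
    (1, "species", "rat"),   (1, "region", "olfactory bulb"),
    (2, "species", "mouse"), (2, "region", "olfactory bulb"),
    (3, "species", "rat"),   (3, "region", "cortex"),
])

def find_entities(criteria):
    """Return ids of entities that match ALL attribute = value criteria.

    A relational table would need one WHERE clause; here every criterion
    adds one more self-join on the value table.
    """
    joins, params = [], []
    for i, (attribute, value) in enumerate(criteria.items()):
        joins.append(
            f"JOIN eav v{i} ON v{i}.entity_id = e.entity_id "
            f"AND v{i}.attribute = ? AND v{i}.value = ?")
        params += [attribute, value]
    sql = ("SELECT DISTINCT e.entity_id FROM eav e "
           + " ".join(joins) + " ORDER BY e.entity_id")
    return [row[0] for row in conn.execute(sql, params)]

rat_bulb = find_entities({"species": "rat", "region": "olfactory bulb"})
```

A two-criterion search already joins the value table to itself twice, and the join count grows linearly with the number of parameters.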
16 Demo: Metadata-Driven Ad Hoc Interface Generation
- Boolean expressions can be added for complex
 associations. Results can be retrieved in HTML,
 XML, text, and other formats.
17 Demo: Metadata-Driven Ad Hoc Interface Generation (2)
- The same generic code behind this interface is
 reused in other databases, increasing the value
 added by this robust, evolvable design.
18 Demo: EAV/CR Centralized Schema Management
- The Schema Manager tool displays the database
 structure and allows it to be edited. This figure
 shows the database inventory of SenseLab's
 EAV/CR data store, with links to specific elements.
Next >> After selecting CellPropDB
19 Demo: EAV/CR Centralized Schema Management (2)
- Selecting a database (e.g. ModelDB) displays
 the web database information, which can be changed
 at any time.
20 Demo: EAV/CR Centralized Schema Management (3)
- On the left, selecting Classes displays the
 list of classes for ModelDB; on the right, the
 class Models is being edited.
21 Demo: EAV/CR Centralized Schema Management (4)
- While in the class Models, selecting the
 Attributes tab shows all of its attributes (left).
 On the right, the attribute neurons shows its
 relation to neuron objects from NeuronDB.
22 Demo: EAV/CR Centralized Schema Management (5)
- As in the previous slides, the Schema Manager
 allows new users to be entered and granted rights
 to specific databases. Lastly, clicking on the
 diagram link shows the ER representation of the
 ModelDB database.
23 Demo: InfoSets and Evolvable Interoperability
- The creation of EDSP (EAV/CR dataset
 protocol) allows the transfer of database schema
 and data in a simple, consistent format based on
 the universally accepted XML format. This picture
 shows a partial rendering of some olfactory
 receptor molecules from ORDB.
24 Demo: InfoSets and Evolvable Interoperability (2)
- Meanwhile, exchange of data with other standard
 protocols is achieved through XML
 transformations. Below is the previous EDSP
 message transformed into Microsoft XDR, the format
 used by the MS Office suite to import into MS
 Access and MS SQL Server.
25 Demo: InfoSets and Evolvable Interoperability (4)
- A practical use of XDR is demonstrated here by
 importing data directly from a SenseLab URL
 into an Access or SQL Server database.
26 Demo: InfoSets and Evolvable Interoperability (5)
- This example points to a particular olfactory
 receptor at
- (http://senselab.med.yale.edu/senselab/site/dbGate/Xtract.asp?o1798xsledsp-officedata)
27 Demo: InfoSets and Evolvable Interoperability (6)
- Access shows the tables to be generated
28 Demo: InfoSets and Evolvable Interoperability (7)
29 Demo: InfoSets and Evolvable Interoperability (8)
- ... relationships, and the data (preserving strong
 data typing)
- All in one deEAVfication process.
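The deEAVfication step, turning attribute-value rows back into a conventional table with typed columns, can be sketched as a pivot. In the demo this is done through an XML transformation and the Access import; the plain-Python version below only illustrates the reshaping, and the function name, type names, and sample rows are assumptions.

```python
from collections import defaultdict

# Assumed mapping from metadata datatype names to Python casts.
CASTS = {"integer": int, "real": float, "text": str}

def de_eav(rows, datatypes):
    """Pivot (entity_id, attribute, value) rows into one row per entity.

    `datatypes` maps each attribute name to its declared type and fixes
    the column order. Missing values become None.
    """
    records = defaultdict(dict)
    for entity_id, attribute, value in rows:
        records[entity_id][attribute] = CASTS[datatypes[attribute]](value)
    columns = ["entity_id"] + list(datatypes)
    table = [[entity_id] + [record.get(name) for name in datatypes]
             for entity_id, record in sorted(records.items())]
    return columns, table

# Made-up EAV rows; values arrive as text and are cast on the way out.
eav_rows = [
    (1, "name", "receptor A"), (1, "length", "312"),
    (2, "name", "receptor B"),
]
columns, table = de_eav(eav_rows, {"name": "text", "length": "integer"})
```

Casting each value according to its declared type is what preserves strong data typing in the resulting relational table.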