Title: CS 290C: Formal Models for Web Software Lecture 9: Analyzing Data Models Using Alloy Analyzer and SMT-Solvers Instructor: Tevfik Bultan
1 CS 290C: Formal Models for Web Software
Lecture 9: Analyzing Data Models Using Alloy Analyzer and SMT-Solvers
Instructor: Tevfik Bultan
2 Three-Tier Architecture
Browser
Web Server
Backend Database
3 Three-Tier Architecture and the MVC Pattern
- The MVC pattern has become the standard way to structure web applications
  - Ruby on Rails
  - Zend for PHP
  - CakePHP
  - Struts for Java
  - Django for Python
[Diagram: Browser; Web Server containing Views, Controller, and Model; Backend Database]
4 Benefits of the MVC Architecture
- Benefits of the MVC architecture
  - Separation of concerns
  - Modularity
  - Abstraction
- These are the basic principles of software design
- Can we exploit these principles for analysis?
5 A Data Model Verification Approach
MVC Design Principles
Automatic Extraction
Add data model properties
6 Rails Data Models
- Data model verification: analyzing the associations/relations between data objects
- Specified in Rails using association declarations inside the ActiveRecord files
- The basic relation types
  - One-to-one
  - One-to-many
  - Many-to-many
- Extensions to the basic relations using options
  - :through, :conditions, :polymorphic, :dependent
7 The Three Basic Relations in Rails
- One-to-One (One-to-ZeroOrOne)

    class User < ActiveRecord::Base
      has_one :account
    end

    class Account < ActiveRecord::Base
      belongs_to :user
    end

  [Diagram: User 1 -- 0..1 Account]

- One-to-Many

    class User < ActiveRecord::Base
      has_many :projects
    end

    class Project < ActiveRecord::Base
      belongs_to :user
    end

  [Diagram: User (1) to Project (many)]
8 The Three Basic Relations in Rails
- Many-to-Many

    class Author < ActiveRecord::Base
      has_and_belongs_to_many :books
    end

    class Book < ActiveRecord::Base
      has_and_belongs_to_many :authors
    end

  [Diagram: Author and Book, many-to-many]
9 Options to Extend the Basic Relations
- :through option
  - To express transitive relations, or
  - To express a many-to-many relation using a join model as opposed to a join table
- :conditions option
  - To relate a subset of objects to another class
- :polymorphic option
  - To express polymorphic relations
- :dependent option
  - On delete, this option expresses whether to delete the associated objects or not
10 The :through Option

    class Book < ActiveRecord::Base
      has_many :editions
      belongs_to :author
    end

    class Author < ActiveRecord::Base
      has_many :books
      has_many :editions, :through => :books
    end

    class Edition < ActiveRecord::Base
      belongs_to :book
    end

[Diagram: the relations among Book, Author, and Edition, each with a multiplicity of 1 on one end]
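The :through option can be read as relational composition: an author's editions are the editions of that author's books. The sketch below illustrates this reading with plain Ruby sets. It does not use ActiveRecord, and the constants `BOOKS_OF` and `EDITIONS_OF`, the method `editions_through`, and the sample data are all hypothetical names introduced for illustration.

```ruby
require "set"

# Hypothetical in-memory stand-ins for the two has_many associations.
BOOKS_OF = {
  "orwell" => Set["1984", "animal_farm"],
  "huxley" => Set["brave_new_world"]
}.freeze

EDITIONS_OF = {
  "1984"            => Set["1984-hc", "1984-pb"],
  "animal_farm"     => Set["af-pb"],
  "brave_new_world" => Set["bnw-pb"]
}.freeze

# Author.editions declared with :through => :books corresponds to the
# composition books.editions: collect the editions of every book of the author.
def editions_through(author, books_of, editions_of)
  books_of.fetch(author, Set.new)
          .flat_map { |book| editions_of.fetch(book, Set.new).to_a }
          .to_set
end

puts editions_through("orwell", BOOKS_OF, EDITIONS_OF).to_a.sort.inspect
```

An author with no books yields the empty set, which matches the reading of has_many as a possibly empty set of associated objects.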
11 The :conditions Option

    class Account < ActiveRecord::Base
      has_one :address, :conditions => "address_type='Billing'"
    end

    class Address < ActiveRecord::Base
      belongs_to :account
    end

[Diagram: Account 1 -- 0..1 Address, restricted to address_type='Billing']
12 The :polymorphic Option

    class Address < ActiveRecord::Base
      belongs_to :addressable, :polymorphic => true
    end

    class Account < ActiveRecord::Base
      has_one :address, :as => :addressable
    end

    class Contact < ActiveRecord::Base
      has_one :address, :as => :addressable
    end

[Diagram: Account 1 -- 0..1 Address; Contact 1 -- 0..1 Address]
13 The :dependent Option

    class User < ActiveRecord::Base
      has_many :contacts, :dependent => :destroy
    end

    class Contact < ActiveRecord::Base
      belongs_to :user
      has_one :address, :dependent => :destroy
    end

[Diagram: User 1 -- many Contact; Contact 1 -- 0..1 Address]
- :delete directly deletes the associated objects without looking at their dependencies
- :destroy first checks whether the associated objects themselves have associations with the :dependent option set
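The difference between :delete and :destroy can be sketched as a recursive versus a one-level removal. The following is a simplified model in plain Ruby, not the ActiveRecord implementation; the tables `DEPENDENTS` and `CHILDREN`, the set `ALL_OBJECTS`, and the method `remove` are hypothetical names, and objects are represented as `[class, id]` pairs.

```ruby
require "set"

# Hypothetical schema table: class => list of [child_class, dependent_mode],
# mirroring User -> Contact -> Address from the slide.
DEPENDENTS = {
  user:    [[:contact, :destroy]],
  contact: [[:address, :destroy]],
  address: []
}.freeze

# Hypothetical instance data: which objects each object owns.
CHILDREN = {
  [:user, 1]    => [[:contact, 1]],
  [:contact, 1] => [[:address, 1]]
}.freeze

ALL_OBJECTS = Set[[:user, 1], [:contact, 1], [:address, 1]].freeze

# Remove obj from store, then handle its associated objects according to the
# dependent mode: :destroy recurses into the child's own dependent options,
# :delete removes the child only.
def remove(store, children, obj, schema)
  store.delete(obj)
  schema.fetch(obj.first, []).each do |child_class, mode|
    children.fetch(obj, []).each do |child|
      next unless child.first == child_class
      if mode == :destroy
        remove(store, children, child, schema)
      else
        store.delete(child)
      end
    end
  end
  store
end

# With :destroy on User -> Contact, the address is removed as well.
puts remove(ALL_OBJECTS.dup, CHILDREN, [:user, 1], DEPENDENTS).to_a.inspect

# With :delete on User -> Contact, the contact goes but its address stays behind.
delete_schema = DEPENDENTS.merge(user: [[:contact, :delete]])
puts remove(ALL_OBJECTS.dup, CHILDREN, [:user, 1], delete_schema).to_a.inspect
```

In this model the second call leaves an Address whose Contact no longer exists, which is the kind of dangling reference the later properties check for.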
14 Formalizing Rails Semantics
- Formal data model: $M = \langle S, C, D \rangle$
  - $S$: the sets and relations of the data model (the data model schema)
    - e.g., Account, Address, Project, User and the relations between them
  - $C$: constraints on the relations
    - Cardinality constraints, transitive relations, conditional relations, polymorphic relations
  - $D$: dependency constraints, which express conditions on two consecutive instances of a relation such that deletion of an object from the first instance leads to the other instance
15 Formalizing Rails Semantics
- Data model instance: $I = \langle O, R \rangle$, where $O = \{o_1, o_2, \ldots, o_n\}$ is a set of object classes and $R = \{r_1, r_2, \ldots, r_m\}$ is a set of object relations, and for each $r_i \in R$ there exist $o_j, o_k \in O$ such that $r_i \subseteq o_j \times o_k$
- $I = \langle O, R \rangle$ is an instance of the data model $M = \langle S, C, D \rangle$, denoted by $I \models M$, if and only if
  - the sets in $O$ and the relations in $R$ follow the schema $S$, and
  - $R \models C$
16 Formalizing Rails Semantics
- Given a pair of data model instances $I = \langle O, R \rangle$ and $I' = \langle O', R' \rangle$, $(I, I')$ is a behavior of the data model $M = \langle S, C, D \rangle$, denoted by $(I, I') \models M$, if and only if
  - $O$ and $R$, and $O'$ and $R'$, follow the schema $S$,
  - $R \models C$ and $R' \models C$, and
  - $(R, R') \models D$
17 Data Model Properties
- Given a data model $M = \langle S, C, D \rangle$, we define four types of properties
  - state assertions ($A_S$): properties that we expect to hold for each instance of the data model
  - behavior assertions ($A_B$): properties that we expect to hold for each pair of instances that form a behavior of the data model
  - state predicates ($P_S$): predicates we expect to hold in some instance of the data model
  - behavior predicates ($P_B$): predicates we expect to hold in some pair of instances that form a behavior of the data model
18 Data Model Properties
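The formulas on this slide did not survive extraction. Reading the four property types against the definitions of an instance and a behavior of a data model, one plausible formalization is sketched below. The notation $M \models A_S$ (and likewise for the other three) is an assumption introduced here, not something the slides confirm.

```latex
\begin{align*}
M \models A_S &\iff \forall I.\; \bigl(I \models M \Rightarrow I \models A_S\bigr) \\
M \models A_B &\iff \forall (I, I').\; \bigl((I, I') \models M \Rightarrow (I, I') \models A_B\bigr) \\
M \models P_S &\iff \exists I.\; \bigl(I \models M \wedge I \models P_S\bigr) \\
M \models P_B &\iff \exists (I, I').\; \bigl((I, I') \models M \wedge (I, I') \models P_B\bigr)
\end{align*}
```

Assertions quantify universally over instances or behaviors, so a single counterexample refutes them; predicates quantify existentially, so a single witness establishes them.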
19 Data Model Verification
- The data model verification problem: given a data model property, determine if the data model satisfies the property
- An enumerative (i.e., explicit-state) search technique is not likely to be efficient for bounded verification
- We can use SAT-based bounded verification!
  - Main idea: translate the verification query to a Boolean SAT instance and then use a SAT solver to search the state space
20 Data Model Verification
- SAT-based bounded verification: this is exactly what the Alloy Analyzer does!
- The Alloy language allows specification of objects and relations, and the specification of constraints on relations using first-order logic
- In order to do bounded verification of Rails data models, automatically translate the Active Record specifications to Alloy specifications
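To make the notion of a bound concrete, here is a brute-force sketch of bounded checking for a User/Account model with a one-to-zero-or-one relation. It uses explicit enumeration, the technique the slides call unlikely to scale, rather than the SAT encoding that the Alloy Analyzer uses. The methods `instances` and `counterexample` and the instance representation are hypothetical.

```ruby
# Enumerate every instance with at most `bound` users and `bound` accounts in
# which each account belongs to exactly one user and each user has at most
# one account. owner_of[i] is the user that owns account i.
def instances(bound)
  Enumerator.new do |out|
    (0..bound).each do |n_users|
      (0..bound).each do |n_accounts|
        users = (0...n_users).to_a
        # An injective map from accounts to users is an ordered selection
        # of distinct users, which is what Array#permutation enumerates.
        users.permutation(n_accounts) do |owners|
          out << { users: users, owner_of: owners }
        end
      end
    end
  end
end

# Return the first instance that violates the assertion given as a block,
# or nil if the assertion holds for every instance within the bound.
def counterexample(bound)
  instances(bound).find { |inst| !yield(inst) }
end

# "Every user has an account" is not implied by the model, so a
# counterexample is found within the bound.
p counterexample(3) { |i| i[:owner_of].length == i[:users].length }

# "No two accounts share a user" holds in every enumerated instance.
p counterexample(3) { |i| i[:owner_of].uniq == i[:owner_of] }
```

A `nil` result only means that no violation exists within the bound; it says nothing about larger instances.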
21 Translation to Alloy

RAILS:
    class ObjectA
      has_one :objectB
    end
ALLOY:
    sig ObjectA {
      objectB: lone ObjectB
    }

RAILS:
    class ObjectA
      has_many :objectBs
    end
ALLOY:
    sig ObjectA {
      objectBs: set ObjectB
    }

RAILS:
    class ObjectA
      belongs_to :objectB
    end
ALLOY:
    sig ObjectA {
      objectB: one ObjectB
    }

RAILS:
    class ObjectA
      has_and_belongs_to_many :objectBs
    end
ALLOY:
    sig ObjectA {
      objectBs: set ObjectB
    }
    fact { ObjectA <: objectBs = ~(ObjectB <: objectA) }
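The mapping from association kinds to Alloy multiplicities can be sketched as a small lookup-based generator. This is not the authors' translator; the input format (a list of `[kind, field, target]` triples), the constant `ALLOY_MULTIPLICITY`, and the method `to_alloy_sig` are hypothetical, and the sketch omits the inverse-relation facts.

```ruby
# Alloy multiplicity keyword for each Rails association kind,
# following the correspondence shown on the slide.
ALLOY_MULTIPLICITY = {
  has_one: "lone",
  has_many: "set",
  belongs_to: "one",
  has_and_belongs_to_many: "set"
}.freeze

# Build an Alloy signature from a class name and its association list.
# Each association is a [kind, field_name, target_class] triple.
def to_alloy_sig(class_name, associations)
  fields = associations.map do |kind, field, target|
    "  #{field}: #{ALLOY_MULTIPLICITY.fetch(kind)} #{target}"
  end
  "sig #{class_name} {\n#{fields.join(",\n")}\n}"
end

puts to_alloy_sig("Book", [[:has_many, "editions", "Edition"],
                           [:belongs_to, "author", "Author"]])
```

Using `fetch` makes an unsupported association kind fail loudly instead of silently producing an invalid signature.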
22 Translating the :through Option

RAILS:
    class Book < ActiveRecord::Base
      has_many :editions
      belongs_to :author
    end

    class Author < ActiveRecord::Base
      has_many :books
      has_many :editions, :through => :books
    end

    class Edition < ActiveRecord::Base
      belongs_to :book
    end

ALLOY:
    sig Book {
      editions: set Edition,
      author: one Author
    }
    sig Author {
      books: set Book,
      editions: set Edition
    } {
      editions = books.editions
    }
    sig Edition {
      book: one Book
    }
    fact {
      Book <: editions = ~(Edition <: book)
      Book <: author = ~(Author <: books)
    }

[Diagram: the relations among Book, Edition, and Author, each with a multiplicity of 1 on one end]
23 Translating the :dependent Option
- The :dependent option specifies what behavior to take on deletion of an object with regard to its associated objects
- To incorporate this dynamism, the model must allow analysis of how sets of objects and their relations change from one state to the next

RAILS:
    class User < ActiveRecord::Base
      has_one :account
    end

    class Account < ActiveRecord::Base
      belongs_to :user, :dependent => :destroy
    end

ALLOY:
    sig User {}
    sig Account {}
    one sig PreState {
      accounts: set Account,
      users: set User,
      relation1: Account lone -> one User
    }
    one sig PostState {
      accounts': set Account,
      users': set User,
      relation1': Account set -> set User
    }
24 Translating the :dependent Option

    pred deleteAccount [s: PreState, s': PostState, x: Account] {
      all x0: Account | x0 in s.accounts
      all x1: User | x1 in s.users
      s'.accounts' = s.accounts - x
      s'.users' = s.users - x.(s.relation1)
      s'.relation1' = s.relation1 - (x <: s.relation1)
    }

- We also update the relations of the associated object(s) based on the use of the :dependent option
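The deleteAccount predicate relates a pre-state to a post-state. The sketch below reads it as plain set operations, under the assumption that `:dependent => :destroy` on the Account side also removes the associated User and every pair of the relation that starts at the deleted account. The method `delete_account` and the hash-of-sets state representation are hypothetical.

```ruby
require "set"

# A state holds the sets of objects plus relation1 as a set of
# [account, user] pairs. Returns the post-state after deleting account x.
def delete_account(pre, x)
  # Users reachable from x through relation1, i.e. x.(s.relation1).
  owners = pre[:relation1].select { |a, _| a == x }.map { |_, u| u }.to_set
  {
    accounts:  pre[:accounts] - [x],
    users:     pre[:users] - owners,
    # Drop every pair whose first component is x, i.e. x <: s.relation1.
    relation1: pre[:relation1].reject { |a, _| a == x }.to_set
  }
end

pre = {
  accounts:  Set[:a1, :a2],
  users:     Set[:u1, :u2],
  relation1: Set[[:a1, :u1], [:a2, :u2]]
}
post = delete_account(pre, :a1)
puts post[:accounts].to_a.inspect
puts post[:users].to_a.inspect
puts post[:relation1].to_a.inspect
```

The pre-state is left untouched, which mirrors the two-state style of the Alloy model: properties are stated over the pair of states rather than over a mutated one.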
25 Translating the :dependent Option

    pred deleteContext [s: PreState, s': PostState, x: Context] {
      all x0: Context | x0 in s.contexts
      all x1: Note | x1 in s.notes
      all x2: Preference | x2 in s.preferences
      all x3: Project | x3 in s.projects
      all x4: RecurringTodo | x4 in s.recurringtodos
      all x5: Tag | x5 in s.tags
      all x7: Todo | x7 in s.todos
      all x8: User | x8 in s.users
      s'.contexts' = s.contexts - x
      s'.notes' = s.notes
      s'.preferences' = s.preferences
      s'.projects' = s.projects
      s'.recurringtodos' = s.recurringtodos
      s'.tags' = s.tags
      s'.todos' = s.todos - x.(s.context_todos)
      s'.users' = s.users
      s'.notes_user' = s.notes_user
      s'.completed_todos_user' = s.completed_todos_user
      s'.recurring_todos_user' = s.recurring_todos_user
      s'.todos_user' = s.todos_user - (x.(s.context_todos) <: s.todos_user)
      s'.active_contexts_user' = s.active_contexts_user
      s'.active_projects_user' = s.active_projects_user
      s'.projects_user' = s.projects_user
      s'.contexts_user' = s.contexts_user - (x <: s.contexts_user)
      s'.recurring_todo_todos' = s.recurring_todo_todos - (s.recurring_todo_todos :> x.(s.context_todos))
      ...
    }
26 Verification Overview
[Diagram: verification pipeline. Labels: Active Records, Translator, Alloy Specification, Data Model Properties, Alloy Analyzer; outcomes: Verified, Counter-example Data Model Instance]
27 Experiments
- We used two open-source Rails applications in our experiments
  - TRACKS: an application to manage things-to-do lists
  - Fat Free CRM: customer relations management software
- We wrote 10 properties for TRACKS and 20 properties for Fat Free CRM

                        TRACKS        Fat Free CRM
    LOC                 6062 lines    12069 lines
    Data model classes  13 classes    20 classes
    Alloy spec LOC      301 lines     1082 lines
28 Types of Properties Checked
- Relationship Cardinality
  - Is an Opportunity always assigned to some Campaign?
- Transitive Relations
  - Is a Note's User the same as the Note's Project's User?
- Deletion Does Not Cause Dangling References
  - Are there any dangling Todos after a User is deleted?
- Deletion Propagates to Associated Objects
  - Does the User related to a Lead still exist after the Lead has been deleted?
[Diagram: Note, User, Project]
29 Experimental Results
- Of the 30 properties we checked, 7 failed
- For example, in TRACKS a Note's User can be different from the Note's Project's User
  - Currently being enforced by the controller
  - Since this could have been enforced using the :through option, we consider this a data-modeling error
- Another example from TRACKS: User deletion creates dangling Todos
  - User deletion does not get propagated into the relations of the Context object, including the Todos
[Diagram: User, Context, Todo; multiplicity labels 1 and 1; annotation :dependent => :delete]
30 Performance
- To measure performance, we recorded
  - the amount of time it took for Alloy to run and check the properties
  - the number of variables in the Boolean formula generated for the SAT solver
- The time and number of variables are averaged over the properties for each application
- Taken over an increasing bound, from at most 10 objects for each class to at most 35 objects for each class
31 Summary
- An approach to automatically discover data model errors in Ruby on Rails web applications
- Automatically extract a formal data model, verify using the Alloy Analyzer
- An automatic translator from Rails ActiveRecords to Alloy
  - Handles three basic relationships and several options (:through, :conditions, :polymorphic, :dependent)
- Found several data model errors in two open-source applications
- Bounded verification of data models is feasible!
32 What About Unbounded Verification?
- Bounded verification does not guarantee correctness for arbitrarily large data model instances
- Is it possible to do unbounded verification of data models?
33 An Approach for Unbounded Verification
MVC Design Pattern
Automatic Extraction
Automatic Translation
Automatic Projection
Properties
34 Another Rails Data Model Example

    class User < ActiveRecord::Base
      has_and_belongs_to_many :roles
      has_one :profile, :dependent => :destroy
      has_many :photos, :through => :profile
    end

    class Role < ActiveRecord::Base
      has_and_belongs_to_many :users
    end

    class Profile < ActiveRecord::Base
      belongs_to :user
      has_many :photos, :dependent => :destroy
      has_many :videos, :dependent => :destroy,
               :conditions => "format='mp4'"
    end

    class Tag < ActiveRecord::Base
      belongs_to :taggable, :polymorphic => true
    end

    class Video < ActiveRecord::Base
      belongs_to :profile
    end

[Diagram: Role, User, Profile, Photo, Video, Taggable, and Tag with their multiplicities; the Profile-Video relation is annotated with the format='mp4' condition]
35 Translation to SMT-LIB
- Given a data model $M = \langle S, C, D \rangle$, we translate the constraints $C$ and $D$ to formulas in the theory of uninterpreted functions
- We use the SMT-LIB format
- We need quantification for some constraints
36 Translation to SMT-LIB

RAILS:
    class Profile
      has_many :videos
    end
    class Video
      belongs_to :profile
    end

SMT-LIB:
    (declare-sort Profile 0)
    (declare-sort Video 0)
    (declare-fun my_relation (Video) Profile)
37 Translation to SMT-LIB

RAILS:
    class User
      has_one :profile
    end
    class Profile
      belongs_to :user
    end

SMT-LIB:
    (declare-sort User 0)
    (declare-sort Profile 0)
    (declare-fun my_relation (Profile) User)
    (assert (forall ((x1 Profile) (x2 Profile))
      (=> (not (= x1 x2))
          (not (= (my_relation x1) (my_relation x2))))))
38 Translation to SMT-LIB

RAILS:
    class User
      has_and_belongs_to_many :roles
    end
    class Role
      has_and_belongs_to_many :users
    end

SMT-LIB:
    (declare-sort Role 0)
    (declare-sort User 0)
    (declare-fun my_relation (Role User) Bool)
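The one-to-many and one-to-one encodings differ only by an injectivity axiom on the uninterpreted function. The sketch below generates both as SMT-LIB text from plain Ruby. It is an illustration and not the authors' translator; the method `smt_relation` and its parameters are hypothetical.

```ruby
# Emit SMT-LIB declarations for a relation encoded as a function from the
# belongs_to side (owned) to the other side (owner). With one_to_one: true,
# an axiom is added stating that distinct owned objects have distinct owners.
def smt_relation(owner, owned, name, one_to_one: false)
  lines = [
    "(declare-sort #{owner} 0)",
    "(declare-sort #{owned} 0)",
    "(declare-fun #{name} (#{owned}) #{owner})"
  ]
  if one_to_one
    lines << "(assert (forall ((x1 #{owned}) (x2 #{owned})) " \
             "(=> (not (= x1 x2)) (not (= (#{name} x1) (#{name} x2))))))"
  end
  lines.join("\n")
end

puts smt_relation("Profile", "Video", "my_relation")
puts
puts smt_relation("User", "Profile", "my_relation", one_to_one: true)
```

Encoding the relation as a total function already captures "each Video belongs to exactly one Profile", so only the one-to-one case needs an extra quantified constraint.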
39 Translating the :through Option

RAILS:
    class Profile < ActiveRecord::Base
      belongs_to :user
      has_many :photos
    end

    class Photo < ActiveRecord::Base
      belongs_to :profile
    end

    class User < ActiveRecord::Base
      has_one :profile
      has_many :photos, :through => :profile
    end

SMT-LIB:
    (declare-sort Profile 0)
    (declare-sort Photo 0)
    (declare-sort User 0)
    (declare-fun profile_photo (Photo) Profile)
    (declare-fun user_profile (Profile) User)
    (declare-fun user_photo (Photo) User)
    (assert (forall ((u User) (ph Photo))
      (iff (= u (user_photo ph))
           (exists ((p Profile))
             (and (= u (user_profile p))
                  (= p (profile_photo ph)))))))

[Diagram: User 1 -- 0..1 Profile; Profile 1 -- many Photo; User 1 -- many Photo]
40 Translating the :dependent Option
- The :dependent option specifies what behavior to take on deletion of an object with regard to its associated objects
- To incorporate this dynamism, the model must allow analysis of how sets of objects and their relations change from one state to the next

RAILS:
    class User < ActiveRecord::Base
      has_one :profile, :dependent => :destroy
    end

    class Profile < ActiveRecord::Base
      belongs_to :user
    end

SMT-LIB:
    (declare-sort Profile 0)
    (declare-sort User 0)
    (declare-fun Post_User (User) Bool)
    (declare-fun Post_Profile (Profile) Bool)
    (declare-fun user_profile (Profile) User)
    (declare-fun Post_user_profile (Profile User) Bool)
41 Translating the :dependent Option

    (assert (not (forall ((x User))
      (=> (and
            (forall ((a User))
              (ite (= a x) (not (Post_User a)) (Post_User a)))
            (forall ((b Profile))
              (ite (= x (user_profile b)) (not (Post_Profile b)) (Post_Profile b)))
            (forall ((a Profile) (b User))
              (ite (and (= b (user_profile a)) (Post_Profile a))
                   (Post_user_profile a b)
                   (not (Post_user_profile a b)))))
          ; remaining property-specific constraints go here
      ))))

- Update the sets and relations of the associated object(s) based on the use of the :dependent option
42 Verification
- Once the data model is translated to SMT-LIB format, we can state properties about the data model, again in SMT-LIB, and then use an SMT solver to check if the property holds in the data model
- However, when we do that, the SMT solver times out for some large models!
- Can we improve the efficiency of the verification process?
43 Property-Based Data Model Projection
- Basic idea: given a property to verify, reduce the size of the generated SMT-LIB specification by removing declarations and constraints that the property does not depend on
- Formally, given a data model $M = \langle S, C, D \rangle$ and a property $p$, $\Pi(M, p) = M_P$
  - where $M_P = \langle S, C_P, D_P \rangle$ is the projected data model such that $C_P \subseteq C$ and $D_P \subseteq D$
44 Property-Based Data Model Projection
- Key property: for any property $p$,
  - $M \models p \iff \Pi(M, p) \models p$
- Projection input: Active Record files, property $p$
- Projection output: the projected SMT-LIB specification
- Removes constraints on those classes and relations that are neither explicitly mentioned in the property nor related to the mentioned ones through transitive relations, dependency constraints, or polymorphic relations
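The projection step can be sketched as a reachability computation: start from the classes the property mentions, follow the links that make other classes relevant, and keep only the constraints that fall inside the resulting set. The sketch below is a simplification and not the authors' algorithm. The input formats, the undirected treatment of links, and the methods `relevant_classes` and `project` are all assumptions introduced for illustration.

```ruby
require "set"

# links: pairs [a, b] meaning class a is tied to class b by a :through,
# :dependent, or :polymorphic declaration (hypothetical, simplified input).
# Returns every class reachable from the classes mentioned in the property.
def relevant_classes(mentioned, links)
  keep = mentioned.to_set
  frontier = keep.to_a
  until frontier.empty?
    current = frontier.pop
    links.each do |a, b|
      other = if a == current then b elsif b == current then a end
      next if other.nil? || keep.include?(other)
      keep << other
      frontier << other
    end
  end
  keep
end

# constraints: hashes with :classes (the classes a constraint talks about)
# and :text. Keep only constraints whose classes are all relevant.
def project(constraints, mentioned, links)
  keep = relevant_classes(mentioned, links)
  constraints.select { |c| c[:classes].all? { |k| keep.include?(k) } }
end

links = [["User", "Profile"], ["Profile", "Photo"], ["Tag", "Video"]]
constraints = [
  { classes: ["User", "Profile"], text: "c1" },
  { classes: ["User", "Role"],    text: "c2" },
  { classes: ["Tag", "Video"],    text: "c3" }
]
puts project(constraints, ["User", "Photo"], links).map { |c| c[:text] }.inspect
```

Because the projected constraint set is a subset of the original one, the solver sees a smaller formula; whether a particular link kind should pull in a class is a design decision of the real tool that this sketch does not reproduce.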
45 Data Model Projection Example
Property p: A User's Photos are the same as the User's Profile's Photos.
46 Verification Overview
[Diagram: verification pipeline. Labels: Data Model Properties, Formal Data Model, Projection, Translator, SMT-LIB Specification, SMT Solver (Z3); outcomes: Verified, Counter-example Data Model Instance, Unknown]
47 Experiments
- We used five open-source Rails apps in our experiments
  - LovdByLess: social networking site
  - Tracks: an application to manage things-to-do lists
  - OpenSourceRails (OSR): social project gallery application
  - Fat Free CRM: customer relations management software
  - Substruct: an e-commerce application
- We wrote 10 properties for each application

                        LovdByLess  Tracks  OSR   Fat Free CRM  Substruct
    LOC                 3787        6062    4295  12069         15639
    Data model classes  13          13      15    20            17
48 Types of Properties Checked
- Relationship Cardinality
  - Is an Opportunity always assigned to some Campaign?
- Transitive Relations
  - Is a Note's User the same as the Note's Project's User?
- Deletion Does Not Cause Dangling References
  - Are there any dangling Todos after a User is deleted?
- Deletion Propagates to Associated Objects
  - Does the User related to a Lead still exist after the Lead has been deleted?
[Diagram: Note, User, Project]
49 Experimental Results
- 50 properties checked, 16 failed, 11 were data model errors
- For example, in Tracks a Note's User can be different from the Note's Project's User
  - Currently being enforced by the controller
  - Since this could have been enforced using the :through option, we consider this a data-modeling error
- From OpenSourceRails: User deletion fails to propagate to associated Bookmarks
  - Leaves orphaned bookmarks in the database
  - Could have been enforced in the data model by setting the :dependent option on the relation between User and Bookmark
[Diagram: User 1 -- Bookmark]
50 Performance
- To measure performance, we recorded
  - the amount of time it took for Z3 to run and check the properties
  - the number of variables produced in the SMT specification
- The time and number of variables are averaged over the properties for each application
51 Performance
- To compare with bounded verification, we repeated these experiments using the tool from our previous work and the Alloy Analyzer
  - the amount of time it took for Alloy to run
  - the number of variables in the Boolean formula generated for the SAT solver
- Taken over an increasing bound, from at most 10 objects for each class to at most 35 objects for each class
52 Performance: Verification Time
53 Performance: Formula Size (Variables)
54 Unbounded vs Bounded Performance
- Why does unbounded verification outperform bounded verification so drastically?
- Possible reasons
  - SMT solvers operate at a higher level of abstraction than SAT solvers
  - Z3 uses many heuristics to eliminate quantifiers in formulas
  - Implementation languages are different
    - Z3 is implemented in C++
    - Alloy (as well as the SAT solver it uses) is implemented in Java
55 Summary
- Automatically extract a formal data model, translate it to the theory of uninterpreted functions, and verify using an SMT solver
  - Use property-based data model projection for efficiency
- An automatic translator from Rails ActiveRecords to SMT-LIB
  - Handles three basic relationships and several options (:through, :conditions, :polymorphic, :dependent)
- Found multiple data model errors in five open-source applications
- Unbounded verification of data models is feasible and more efficient than bounded verification!
56 Possible Extensions
- Analyzing dynamic behavior
  - Model object creation in addition to object deletion
  - Fuse the data model with the navigation model in order to analyze dynamic data model behavior
  - Check temporal properties
- Automatic property inference
  - Manual property writing is error prone
  - Use the inherent graph structure in the data model to automatically infer properties about the data model
- Automatic repair
  - When the verifier concludes that the data model violates a property, automatically generate a repair that establishes the violated property