Title: Object Persistence
1Object Persistence
2Who am I?
- Michael (Mick) Hollins
- PhD in Computer Science
- Distinguished Engineer at Bullant Technology
3What Do I do?
- Kernel Team Architect focusing on
- Compiler Virtual Machine
- Persistence / Transaction Processing
- Scalability
- Straight Line Performance
- Huh?
- Software designs: writing, reviewing, brainstorming.
- Coding: writing, reviewing.
- Team processes, e.g. source code control management, risk management, automated testing.
- User-level documentation.
- Maintenance: defect fixing, customer support (occasionally), change request processing, branch merging.
4Object Persistence - Outline
- Lecture 1
- What is Persistence?
- Persistence Scalability
- Persistence Mechanisms
- Persistent Programming Languages
- Lecture 2
- ACID Transactions
- Atomicity, Consistency, Isolation, Durability
- Lecture 3
- ACID (cont)
- Schema Evolution
- Summary
5What is Persistence?
6What is Persistence?
Persistence is the ability for data to outlive
the program that created it. Typically the data
is stored on a disk storage system.
7Examples
- Documents
- E.g. Microsoft Word file (.doc), Microsoft Excel spreadsheet file (.xls), Acrobat (.pdf)
- Program data stored in a relational database
- E.g. Oracle, SQL Server, Access, etc.
- Images
- .jpg, .gif, .bmp etc
8An Experiment
9Experiment - Tasks
10Back to Persistence
Persistence is the ability for data to outlive
the program that created it. Typically the data
is stored on a disk storage system.
11Degrees of Persistence
method dumpAsString
  returns rRet string
  locals lNameStr string, lAgeStr string
begin
  lNameStr = "My Name is " + aName
  lAgeStr = "And I am " + aAge + " years old"
  rRet = lNameStr + "\n" + lAgeStr + "\n"
  return rRet
end method
12Persistence and Software Engineering Projects
- For small projects and systems the persistence of data is often an after-thought.
- This can be unfortunate as you typically end up with a solution that is inflexible, cumbersome, unreliable and un-scalable.
- For large projects and systems the persistence of data often dominates the architecture.
- This can be unfortunate as it should be the application itself that dominates the thought processes of developers and designers.
13Persistence Desirable Properties
- Durable/Reliable
- Simple to program
- Efficient/Fast
- Scalable
- Support Application Longevity
- E.g. support evolution of application including
schema changes
14Persistence and Scalability
- Scalable with respect to
- Size of persistent data
- Rate of change of data
- Number of Concurrently accessing/updating tasks
- Desirable properties for achieving scalability
- Fast
- Incremental
- Multi-threaded / Multi User
- Safe, i.e. protect tasks from each other
- Non-exclusive, i.e. should allow
- Concurrent readers
- Concurrent writers
- Optimise Disk access
15Experiment
16Just how slow is Disk Access?
- Latency is the killer (not bandwidth)
- Combination of
- Seek Time: 0.008 seconds
- Rotational Latency: 0.004 seconds
- Transfer Time: 0.0005 seconds
- Total: about 0.012 seconds
- Compare with Memory Access
- 0.00000001 seconds
- Memory access is roughly 1.2 million times faster than disk access
From O'Neil & O'Neil, see reference [OO 01]
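The arithmetic behind this comparison can be checked with a short sketch. It uses only the figures quoted on the slide, converted to integer nanoseconds so that no floating-point rounding creeps in; the function names are illustrative. The unrounded sum is 0.0125 seconds, which gives a ratio of 1.25 million, so the slide's "0.012 seconds" and "1.2 million" are the same numbers rounded down.

```python
# Figures from the slide, expressed in integer nanoseconds.
SEEK_NS = 8_000_000          # 0.008 s seek time
ROTATIONAL_NS = 4_000_000    # 0.004 s rotational latency
TRANSFER_NS = 500_000        # 0.0005 s transfer time
MEMORY_NS = 10               # 0.00000001 s memory access


def disk_access_ns() -> int:
    """Total time for one random disk access, in nanoseconds."""
    return SEEK_NS + ROTATIONAL_NS + TRANSFER_NS


def slowdown_factor() -> int:
    """How many memory accesses fit into the time of one disk access."""
    return disk_access_ns() // MEMORY_NS


print(disk_access_ns() / 1e9, "seconds per disk access")
print(slowdown_factor(), "memory accesses per disk access")
```

Seek time and rotational latency together account for 12 of the 12.5 milliseconds, which is why latency rather than bandwidth dominates.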
17Keep Walking!
18Persistence Mechanisms
19Some Mechanisms for achieving Persistence
- Ad hoc Persistence
- Object Serialization
- External Database System
- Object-Relational Mappings
- Persistent Programming Languages
20Ad Hoc Persistence
- The programmer writes whatever code it takes to get the damn stuff out to (and back in from) a file on disk.
- Examples
- Hard to tell from the outside, but many single-user applications use this approach to store data to disk files.
- e.g. desktop applications
- Comma-separated files are an example of a file format commonly used by applications using an ad hoc approach to persistence.
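A minimal sketch of ad hoc persistence through a comma-separated file, written in Python for illustration. The `Person` class, the file name and the function names are assumptions made up for this example; the point is that the programmer writes both directions of the conversion by hand and rewrites the whole file on every save.

```python
import csv
import os
import tempfile
from dataclasses import dataclass


@dataclass
class Person:
    name: str
    age: int


def save_people(path, people):
    """Rewrite the whole file: this approach is not incremental."""
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        for p in people:
            writer.writerow([p.name, p.age])


def load_people(path):
    """Read every record back, converting each field by hand."""
    with open(path, newline="") as f:
        return [Person(name, int(age)) for name, age in csv.reader(f)]


def round_trip(people):
    """Save to a temporary file and load it back again."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "people.csv")
        save_people(path, people)
        return load_people(path)
```

Adding a field to `Person` means changing both functions and deciding what to do with files written in the old layout, which is the application-longevity problem in miniature.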
21Ad Hoc Persistence
- Pros
- No need to design up front ??
- Cons
- Manual / Difficult to use
- Not scalable
- Single task
- Not incremental
- Application Longevity typically not considered
(or dealt with manually)
22Object Serialization
- Makes use of reflective capabilities of the programming language to provide an automated, transparent persistence facility
- Uses simple disk files (typically text) to store data
- Example
- Java object serialization
23Object Serialization - Example output file
Serialized version of Person object:
  object ID 1 Class Person
    attributes
      number 2
      aName 2
      aAge 3
    end attributes
  end object
Serialized version of string Mr Chips:
  object ID 2 Class String Length 8 Value Mr Chips end object
Serialized version of integer 34:
  object ID 3 Class Integer Value 34 end object
24Object Serialization - sample implementation code
method SerializeAttributes
  parameters pObject object, pSerializer Serializer
  returns rRet String = ""
  locals lIndex Integer, lAttrCount Integer, lAttribute object,
         lType Type, lIter Iterator, lVar Variable
begin
  lAttrCount = pObject.GetAttributeCount()
  rRet = rRet + "\t\tnumber " + lAttrCount + "\n"
  lType = pObject.GetObjectType()
  for lIndex = 1
      lIter = lType.GetAttributesFlattened().createIterator()
      lVar = downcast(lIter.First(), Variable)
  step lIndex = lIndex + 1
      lVar = downcast(lIter.Next(), Variable)
  test lVar is not null
    lAttribute = pObject.GetAttribute(lIndex)
    if lAttribute is not null
      rRet = rRet + "\t\t" + lVar.getName() + " " + Serializer.allocateID(lAttribute) + "\n"
    end if
  end for
  return rRet
end method
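The same idea can be sketched in Python, which is used here only as an illustrative stand-in for the slide's language. The sketch assumes a hypothetical `Serializer` class that allocates one ID per object and uses `vars()` as the reflective step to discover attributes at run time. The record layout imitates the example output file, except that Python reports its own type names (`str`, `int`) rather than `String` and `Integer`.

```python
class Person:
    def __init__(self, name, age):
        self.aName = name
        self.aAge = age


class Serializer:
    """Hands out one ID per object and emits records in the slide's style."""

    def __init__(self):
        self._ids = {}       # id(obj) -> allocated object ID
        self._pending = []   # objects that still need a record

    def allocate_id(self, obj):
        """Return the ID for obj, allocating a fresh one on first sight."""
        key = id(obj)
        if key not in self._ids:
            self._ids[key] = len(self._ids) + 1
            self._pending.append(obj)
        return self._ids[key]

    def serialize(self, root):
        """Serialize root and everything it refers to."""
        self.allocate_id(root)
        lines = []
        while self._pending:
            lines.extend(self._record(self._pending.pop(0)))
        return "\n".join(lines) + "\n"

    def _record(self, obj):
        header = f"object ID {self._ids[id(obj)]} Class {type(obj).__name__}"
        if isinstance(obj, str):
            return [f"{header} Length {len(obj)} Value {obj} end object"]
        if isinstance(obj, int):
            return [f"{header} Value {obj} end object"]
        attrs = vars(obj)  # reflection: discover the attributes at run time
        lines = [header, "\tattributes", f"\t\tnumber {len(attrs)}"]
        for name, value in attrs.items():
            lines.append(f"\t\t{name} {self.allocate_id(value)}")
        lines += ["\tend attributes", "end object"]
        return lines
```

Calling `Serializer().serialize(Person("Mr Chips", 34))` produces three records, one for the person and one for each attribute value, with the person's attributes referring to the other two by ID.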
25Object Serialization
- Pros
- Reasonably simple to use
- Cons
- Not scalable
- Single task
- Not incremental
- Typically inefficient
- Makes use of reflective mechanisms which tend to be slow
- Application Longevity?
26External Database System (DBMS)
- Uses the API provided by an external DBMS to store and retrieve objects as relational data
- Essentially ad hoc in nature, as the programmer determines the mapping on a case-by-case basis
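A minimal sketch of this approach, using Python's built-in `sqlite3` module as a stand-in for an external DBMS. The table layout, the `Person` class and the function names are assumptions chosen for the example. Each function is a hand-written mapping between one class and one table.

```python
import sqlite3
from dataclasses import dataclass


@dataclass
class Person:
    name: str
    age: int


def open_db(path=":memory:"):
    """Open the database and make sure the hand-designed table exists."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS person (name TEXT PRIMARY KEY, age INTEGER)"
    )
    return conn


def store(conn, person):
    """Map one object to one row; only that row is written."""
    with conn:  # commits on success, rolls back on an exception
        conn.execute(
            "INSERT OR REPLACE INTO person (name, age) VALUES (?, ?)",
            (person.name, person.age),
        )


def fetch(conn, name):
    """Map a row back to an object, or return None if there is no such row."""
    row = conn.execute(
        "SELECT name, age FROM person WHERE name = ?", (name,)
    ).fetchone()
    return Person(*row) if row else None
```

Unlike the file-based approaches, an update touches a single row, and the database supplies transactions and indexing. The cost is that every new class needs its own table definition and its own pair of mapping functions.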
27External Database System
- Pros
- Can be incremental, i.e. more scalable than a simple ad hoc mechanism.
- Access to DBMS facilities
- Querying
- Transactions
- Indexing
- Schema modification: good for application longevity
- Cons
- Manual, i.e. the user defines mappings on a case-by-case basis
- Application longevity: must ensure that code and data are evolved in parallel
- Suffers from Impedance Mismatch
28The Impedance Mismatch
- Object Identity
- Representing Collections
- Type system differences, e.g. inheritance, subtyping
- All of the solutions described so far suffer from the Impedance Mismatch in one form or another
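The object identity item can be illustrated with a small sketch, again using Python and `sqlite3` as stand-ins; the table and the `load_person` helper are hypothetical. Loading the same row twice yields two unrelated objects, so the program's notion of identity and the database's notion of identity (the key) no longer agree.

```python
import sqlite3


class Person:
    def __init__(self, name, age):
        self.name = name
        self.age = age


def load_person(conn, name):
    """Build a fresh object from the matching row on every call."""
    row = conn.execute(
        "SELECT name, age FROM person WHERE name = ?", (name,)
    ).fetchone()
    return Person(*row)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE person (name TEXT PRIMARY KEY, age INTEGER)")
conn.execute("INSERT INTO person VALUES ('Mr Chips', 34)")

first = load_person(conn, "Mr Chips")
second = load_person(conn, "Mr Chips")

# Same row, but two unrelated objects in memory: identity is not preserved.
same_object = first is second

# An update made through one object is invisible through the other.
first.age = 35
ages = (first.age, second.age)
```

Here `same_object` is false and `ages` ends up as `(35, 34)`. Mapping layers usually work around this with an identity cache keyed on the primary key, which is extra machinery that exists only because of the mismatch.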
29Object-Relational Mappings
- Provide a transparent mapping between an Object
Oriented Language and a Relational Database
30Object-Relational Mapping - Example
    Implementation impl = new com.vendor.odmg.Implementation();
    Database db = impl.newDatabase();
    Transaction txn = impl.newTransaction();
    try {
        db.open("addressDB", Database.OPEN_READ_WRITE);
        txn.begin();
        // perform query
        OQLQuery query = new OQLQuery(
            "select x from Person x where x.name = \"Doug Barry\"");
        Collection result = (Collection) query.execute();
        Iterator iter = result.iterator();
        // iterate over the results
        while ( iter.hasNext() ) {
            Person person = (Person) iter.next();
            // do some additional processing on the person
            ...
            person.address.street = "13504 4th Avenue South";
        }
        txn.commit();
        db.close();
    }
    // exception handling not shown
http://www.object-relational.com/articles/java_and_object-relational_mapping.html
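To show what such a mapping layer does behind the scenes, here is a deliberately tiny sketch in Python over `sqlite3`. It is not the ODMG API used in the slide's example. The `Mapper` class, the flat `Person` class with a `street` field, and the `SQL_TYPES` table are all assumptions invented for illustration. The mapper derives the table, the columns and the SQL from the class definition, so the programmer no longer writes a mapping per class.

```python
import sqlite3
from dataclasses import astuple, dataclass, fields


@dataclass
class Person:
    name: str
    street: str


SQL_TYPES = {str: "TEXT", int: "INTEGER"}


class Mapper:
    """Derives table name, columns and SQL from a dataclass definition."""

    def __init__(self, conn, cls):
        self.conn = conn
        self.cls = cls
        self.table = cls.__name__.lower()
        self.columns = [f.name for f in fields(cls)]
        column_defs = ", ".join(
            f"{f.name} {SQL_TYPES[f.type]}" for f in fields(cls)
        )
        conn.execute(f"CREATE TABLE IF NOT EXISTS {self.table} ({column_defs})")

    def save(self, obj):
        """Insert one object as one row."""
        placeholders = ", ".join("?" for _ in self.columns)
        with self.conn:
            self.conn.execute(
                f"INSERT INTO {self.table} VALUES ({placeholders})", astuple(obj)
            )

    def find(self, **criteria):
        """Return objects whose columns equal the given keyword values.

        Column names come from trusted code here; values are bound safely.
        """
        sql = f"SELECT {', '.join(self.columns)} FROM {self.table}"
        if criteria:
            sql += " WHERE " + " AND ".join(f"{name} = ?" for name in criteria)
        rows = self.conn.execute(sql, tuple(criteria.values()))
        return [self.cls(*row) for row in rows]
```

A call such as `Mapper(conn, Person).find(name="Doug Barry")` plays the role of the OQL query in the slide. Nested objects, collections and inheritance are exactly the cases this sketch leaves out, and they are where the mismatch shows through in real products.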
31Object-Relational Mapping
- Pros
- Object-Relational Mapping solutions can be simpler, faster and more scalable than the previously described mechanisms.
- Cons
- Still suffer from the Impedance Mismatch under the hood.
- Not as transparent as you'd like. Impedance Mismatch seeps through.
32Persistent Programming Languages
33Persistent Programming Languages
- The programming language (and/or runtime environment) contains mechanisms for directly specifying the persistence of data and for directly manipulating that data.
- Removes Impedance Mismatch
- Examples
- Napier
- Bullant
- Mozzie
Mozzie holds the world record for the smallest recorded programmer population for a computer language. Peak programmer population was 3.
34Persistent Programming Languages
- A persistent programming language provides
mechanisms for direct manipulation of persistent
data
35Example - Bullant
- The Bullant system inherently supports the persistence of objects.
- The writer of a class need do nothing special in order to make instances of that class persistence-capable.
36Bullant Persistence by Reachability
- Persistence by Reachability
- Determines which objects need to persist by calculating those objects that are reachable from a persistent root.
- The programmer need only define which objects are persistent roots and the Bullant system takes care of the rest.
37The Persistent Graph
- The Bullant system contains a table of persistent roots.
- All objects reachable from persistent roots make up a directed graph known as the persistent graph.
- In a real system, the persistent graph may consist of millions of objects.
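How the persistent graph follows from the root table can be sketched as a plain graph traversal. This is an illustration in Python under simple assumptions, not a description of how Bullant computes it. The `Node` class, the `referents` helper and the dictionary used as a root table are all hypothetical.

```python
class Node:
    """A list cell; `next` refers to another Node or None."""

    def __init__(self, value, next=None):
        self.value = value
        self.next = next


def referents(obj):
    """Objects directly referenced by obj (containers and plain instances)."""
    if isinstance(obj, dict):
        return list(obj.values())
    if isinstance(obj, (list, tuple, set)):
        return list(obj)
    if hasattr(obj, "__dict__"):
        return list(vars(obj).values())
    return []


def persistent_graph(roots):
    """Return every object reachable from the root table, each one once."""
    seen = {}                      # id(obj) -> obj, also guards against cycles
    stack = list(roots.values())
    while stack:
        obj = stack.pop()
        if id(obj) in seen:
            continue
        seen[id(obj)] = obj
        stack.extend(referents(obj))
    return list(seen.values())
```

An object that no root leads to is simply absent from the result, which is the whole rule: the programmer names the roots and reachability decides the rest.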
38The Persistent Graph - Example
39Example - Creating a Persistent Linked List
class ListCreator
  library Examples
  inherits command, RootHandler

  method createList
    locals lList LinkedList(Integer), lInt Integer
  begin
    lList = construct(LinkedList(Integer))
    for lInt = 1 step lInt = lInt + 1 test lInt lt 10
      lList.addTail(lInt)
    end for
    transaction
      setPersistentRoot("My List", lList)
    end transaction
  end method
Create Transient List (the construct call and the for loop)
Make List Persistent (the transaction block)
40Example - Traversing a Persistent Linked List
method displayList
  locals lList LinkedList(Integer),
         lIterator LinkedListIterator(Integer),
         lObject object, lInt integer
begin
  lObject = getPersistentRoot("My List")
  lList = downcast(lObject, LinkedList(Integer))
  lIterator = lList.constructIterator()
  lInt = lIterator.getFirst()
  while lInt is not null
    lInt.print()
    lInt = lIterator.getNext()
  end while
end method
Retrieve list from persistent root (the getPersistentRoot call and downcast)
Display contents of list (the while loop)
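For readers without access to Bullant, the two examples can be approximated in Python with the standard `shelve` module, which stores named objects in a file. This is only a library-level analogy. The store path and function names are made up, and Python itself is not a persistent programming language.

```python
import os
import shelve
import tempfile


def create_list(store_path):
    """Build a transient list, then bind it to a named persistent root."""
    numbers = list(range(1, 10))            # transient until stored below
    with shelve.open(store_path) as roots:  # closing the shelf writes it out
        roots["My List"] = numbers


def display_list(store_path):
    """Fetch the list back through its root name and return its contents."""
    with shelve.open(store_path) as roots:
        return list(roots["My List"])


def demo():
    """Run both steps against a store in a temporary directory."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "roots")
        create_list(path)
        return display_list(path)
```

The analogy breaks down quickly: with `shelve`, changing the retrieved list in place is not saved unless it is assigned back to the root (or the shelf is opened with `writeback=True`), whereas a persistent language treats the retrieved object as the persistent object itself.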
41Advantages of Persistent Programming Languages
- Reduced Complexity
- Efficiency
- Protection mechanisms that operate over the whole environment
- Referential Integrity preserved over the whole environment
- From Morrison & Atkinson [MA 90]
42Advantages - Reduced Complexity
43Advantages - Simplicity & Efficiency
- Removes Impedance Mismatch
- Reduced code size and time to compute
44Advantages - Protection Mechanisms
- All data, no matter its longevity, has the same representation
- Can provide a uniform protection mechanism
- E.g. Type Safety
- Compare with external DBMS approach
- Use DBMS protection mechanisms for data held in the database
- Use programming language mechanisms for data loaded into memory
45Advantages - Referential Integrity
- In a persistent programming language, different
programs may be permitted to execute over shared
persistent data allowing common objects to be
shared rather than copied.
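The difference between sharing and copying can be seen in a small Python sketch that uses `pickle` as a stand-in for a storage layer. The `household` data and the names in it are invented for the example. When the whole graph is stored as one unit the shared reference survives. When each holder's data is stored separately, as it would be in separate files or rows, each gets a private copy.

```python
import pickle

address = ["13504 4th Avenue South"]          # one shared, mutable object
household = {"alice": address, "bob": address}

# Stored as one graph: the sharing survives the round trip.
together = pickle.loads(pickle.dumps(household))
shared_after_one_graph = together["alice"] is together["bob"]

# Stored as two separate units: each holder gets a private copy.
alice_copy = pickle.loads(pickle.dumps(household["alice"]))
bob_copy = pickle.loads(pickle.dumps(household["bob"]))
shared_after_split = alice_copy is bob_copy

# An update through one reference is seen by the other only when shared.
together["alice"][0] = "1 New Street"
alice_copy[0] = "1 New Street"
seen_when_shared = together["bob"][0]
seen_when_copied = bob_copy[0]
```

In the copied case the two holders now disagree about the address, which is the kind of inconsistency that preserving referential integrity over the whole environment is meant to rule out.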
46Experiment
47References