Title: Object Persistence CC438
1 Object Persistence CC438
2 Need for Persistence
- To transmit objects over a network
- To save objects in a database
- To save objects for later use by the same program
- To save objects for later use by different programs
- Version resilience
- Will objects still be loadable when their defining classes have changed?
- An important consideration!
3 Persistence Mechanisms
- Manual
- Java's Object Serialisation
- WOX (Serialisation to XML) (uses Reflection)
- Storing Objects in Relational Databases
- Object-relational mapping
- JDBC
- Object databases
- Heavyweight (mentioned briefly)
- Lightweight: db4o (covered in some depth)
4 Manual
- Naïve approach
- Write methods to save objects to, and load them from, data streams
- Methods are specific to particular classes
- Enables full control of data formats
- But at the cost of extensive programming effort
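A minimal sketch of the manual approach, assuming a hypothetical class Point (not part of the course material). The class chooses its own data format, and the load method has to mirror the save method field by field, which is where the programming effort goes.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;

public class ManualPersistence {
    // Hypothetical domain class, used only for this illustration.
    static class Point {
        final int x, y;
        final String label;

        Point(int x, int y, String label) {
            this.x = x;
            this.y = y;
            this.label = label;
        }

        // The class decides its own data format: two ints, then a UTF string.
        void save(DataOutputStream out) throws IOException {
            out.writeInt(x);
            out.writeInt(y);
            out.writeUTF(label);
        }

        // Loading must read the fields in exactly the order save() wrote them.
        static Point load(DataInputStream in) throws IOException {
            int x = in.readInt();
            int y = in.readInt();
            String label = in.readUTF();
            return new Point(x, y, label);
        }
    }

    // Saves a point to an in-memory byte array and loads it back.
    static Point roundTrip(Point p) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            p.save(out);
        }
        try (DataInputStream in = new DataInputStream(
                new ByteArrayInputStream(bytes.toByteArray()))) {
            return Point.load(in);
        }
    }

    public static void main(String[] args) throws IOException {
        Point q = roundTrip(new Point(3, 4, "corner"));
        System.out.println(q.x + " " + q.y + " " + q.label);
    }
}
```

Every further class needs its own pair of save/load methods, which is the cost the slide refers to.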
5 Simple Persistence: Object Serialisation
- Write/read one or more objects to/from streams
- Objects can be complex
- includes reading/writing of all dependent objects, i.e. object fields, and object fields of object fields, etc.
- Seems a powerful, lightweight approach to persistence at first, but how useful is it really?
- major problem of scalability
- reads/writes large object networks in one go
- infeasible for very large objects
- Useful for sending/receiving remote objects (RMI)
- involves object duplication (deep copying)
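The deep-copying effect can be seen without any network: writing an object to a stream and reading it back yields an independent copy of the whole object graph. The sketch below assumes a hypothetical helper deepCopy and uses an in-memory byte array in place of a remote connection.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;

public class DeepCopyDemo {
    // Serialises obj into memory and reads it back; the result shares no
    // mutable state with the original (hypothetical helper for illustration).
    @SuppressWarnings("unchecked")
    static <T extends Serializable> T deepCopy(T obj)
            throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(bytes)) {
            oos.writeObject(obj);
        }
        try (ObjectInputStream ois = new ObjectInputStream(
                new ByteArrayInputStream(bytes.toByteArray()))) {
            return (T) ois.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        ArrayList<int[]> original = new ArrayList<>();
        original.add(new int[] {1, 2, 3});
        ArrayList<int[]> copy = deepCopy(original);
        copy.get(0)[0] = 99;                      // modify the copy only
        System.out.println(original.get(0)[0]);   // the original array is untouched
    }
}
```

Because the complete graph is written and read in one go, the same mechanism becomes expensive for very large object networks.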
6 Serialising an Array
    import java.io.*;

    public class WriteArray {
        public static void main(String[] args) throws Exception {
            int n = Integer.parseInt(args[0]);
            int[] x = new int[n];
            for (int i = 0; i < n; i++) x[i] = i;
            ObjectOutputStream oos =
                new ObjectOutputStream(
                    new FileOutputStream("tmp.obj"));
            oos.writeObject(x);
            oos.close();
        }
    }
7 And De-serialising
    import java.io.*;

    public class ReadArray {
        public static void main(String[] args) throws Exception {
            ObjectInputStream ois =
                new ObjectInputStream(
                    new FileInputStream("tmp.obj"));
            Object o = ois.readObject();
            int[] x = (int[]) o;
            int tot = 0;
            for (int i = 0; i < x.length; i++) tot += x[i];
            System.out.println("tot " + tot);
        }
    }
8 Running the Programs
    > java cc438.exercises.WriteArray 1000
    > dir tmp.obj
    10/11/2004  16:40  4,027 tmp.obj
    1 File(s)  4,027 bytes
    > java cc438.exercises.ReadArray
    tot 499500
- Q. Can you explain the file size of tmp.obj?
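One way to investigate the question is to compare the serialised size of an empty int array with that of a 1000-element array. The sketch below (the class name ArraySize is an assumption) serialises into memory instead of a file. The difference is the payload of 1000 four-byte ints, and the remainder is fixed overhead for the stream header, the array's class description and its length.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;

public class ArraySize {
    // Number of bytes Java serialisation produces for an int array of length n.
    static int serialisedSize(int n) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(bytes)) {
            oos.writeObject(new int[n]);
        }
        return bytes.size();
    }

    public static void main(String[] args) throws IOException {
        int empty = serialisedSize(0);     // fixed overhead only
        int full = serialisedSize(1000);   // overhead plus 1000 ints
        System.out.println("overhead = " + empty + " bytes");
        System.out.println("payload  = " + (full - empty) + " bytes");
        System.out.println("total    = " + full + " bytes");
    }
}
```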
9 How Serialisation Works
- Each newly encountered object is saved
- gets a serial number in the stream
- When saving an object
- save the object's type
- for each attribute of the object
- save attribute name
- if attribute is primitive, save value
- if attribute is an object
- If object has already been saved, write a reference
- Else call this algorithm recursively to save this object
- This is just the concept; details may differ
- note that this algorithm works for cyclic data structures
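The algorithm above can be imitated in a few lines of reflection code. The following is a toy sketch, not Java's actual serialiser: it writes a text dump rather than a binary stream, treats strings as plain values, and uses hypothetical classes Named and Node. An IdentityHashMap assigns serial numbers, so an object that has already been saved is written as a reference, which is what makes the cyclic structure terminate.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.IdentityHashMap;
import java.util.Map;

public class ToySerialiser {
    // Hypothetical classes used only for the demonstration.
    static class Named {
        private String name;
        Named(String name) { this.name = name; }
    }

    static class Node extends Named {
        private int weight;
        private Node next;
        Node(String name, int weight) { super(name); this.weight = weight; }
        void setNext(Node next) { this.next = next; }
    }

    // Maps each saved object (by identity) to its serial number.
    private final Map<Object, Integer> seen = new IdentityHashMap<>();
    private final StringBuilder out = new StringBuilder();

    static String dump(Object root) throws IllegalAccessException {
        ToySerialiser s = new ToySerialiser();
        s.save(root);
        return s.out.toString();
    }

    private void save(Object obj) throws IllegalAccessException {
        if (obj == null) { out.append("null"); return; }
        Integer known = seen.get(obj);
        if (known != null) {               // already saved: write a reference
            out.append("ref#").append(known);
            return;
        }
        int id = seen.size();              // new object: next serial number
        seen.put(obj, id);
        out.append("#").append(id).append(":")
           .append(obj.getClass().getSimpleName()).append("{");
        // Walk up the class hierarchy so inherited fields are included.
        for (Class<?> c = obj.getClass(); c != Object.class; c = c.getSuperclass()) {
            for (Field f : c.getDeclaredFields()) {
                if (Modifier.isStatic(f.getModifiers())) continue;
                f.setAccessible(true);     // allows reading private fields
                Object v = f.get(obj);
                out.append(f.getName()).append("=");
                if (f.getType().isPrimitive() || v instanceof String) out.append(v);
                else save(v);              // object attribute: recurse
                out.append(";");
            }
        }
        out.append("}");
    }

    public static void main(String[] args) throws IllegalAccessException {
        Node a = new Node("a", 1);
        Node b = new Node("b", 2);
        a.setNext(b);
        b.setNext(a);                      // cycle: a -> b -> a
        System.out.println(dump(a));
    }
}
```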
10 Serialisation Considerations
- How to deal with inheritance of fields?
- How to access private fields?
- How to check whether an object has already been written?
- Versioning issues
- suppose a class is changed, i.e. attributes are added, deleted, or modified
- how to read in previously saved objects in this case?
- this is a serious problem in case of long-term persistence
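Java's built-in serialisation gives a partial answer to the versioning question. Each serialisable class has a serialVersionUID that is written to the stream, and reading fails with an InvalidClassException if the stored value differs from that of the current class. When the field is not declared, the value is computed from the class structure, so most edits change it. Declaring it explicitly keeps old data loadable across compatible changes such as added fields. The sketch below uses a hypothetical class Account and only shows how the declared value is picked up.

```java
import java.io.ObjectStreamClass;
import java.io.Serializable;

public class VersionDemo {
    // Hypothetical persistent class with an explicitly declared version identifier.
    static class Account implements Serializable {
        private static final long serialVersionUID = 1L;
        String owner;
        int balance;
    }

    // Returns the version identifier that serialisation records for the class.
    static long versionOf(Class<?> cls) {
        return ObjectStreamClass.lookup(cls).getSerialVersionUID();
    }

    public static void main(String[] args) {
        System.out.println(versionOf(Account.class));
    }
}
```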
11 Serialisation Continued
- Serialization is enabled by a class implementing interface Serializable
- this is another tagging interface like Cloneable
- many standard classes implement this interface
- When defining a new class, implementing the Serializable interface is possible without writing any methods
- get the default serializer/de-serializer in this case
- Example
    public abstract class Shape implements Serializable { ... }
- Default serialisation works as long as it does not encounter any fields that are not serializable.
12 Custom Serialization
- Programmers can write their own writeObject() and readObject() methods. Why?
- default serializer might be slow
- perhaps only some fields need to be made persistent
- object might depend on/be composed from other objects which are themselves not serializable, see for example java.awt.geom.Ellipse2D
- Declare non-persistent attributes as transient.
- Can reuse defaultWriteObject()/defaultReadObject()
- For more detail see
- http://java.sun.com/developer/technicalArticles/Programming/serialization/
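A minimal sketch of custom serialisation, assuming a hypothetical non-serialisable class Ellipse as a stand-in for a library class. The field is declared transient so the default serialiser skips it, and writeObject()/readObject() store and rebuild it by hand after delegating the remaining fields to defaultWriteObject()/defaultReadObject().

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class CustomSerialisationDemo {
    // Hypothetical class that does not implement Serializable.
    static class Ellipse {
        final double w, h;
        Ellipse(double w, double h) { this.w = w; this.h = h; }
    }

    static class Shape implements Serializable {
        private static final long serialVersionUID = 1L;
        String name;
        transient Ellipse outline;          // skipped by the default serialiser

        Shape(String name, Ellipse outline) {
            this.name = name;
            this.outline = outline;
        }

        private void writeObject(ObjectOutputStream out) throws IOException {
            out.defaultWriteObject();       // writes the non-transient fields
            out.writeDouble(outline.w);     // then the data needed to rebuild outline
            out.writeDouble(outline.h);
        }

        private void readObject(ObjectInputStream in)
                throws IOException, ClassNotFoundException {
            in.defaultReadObject();         // restores name
            double w = in.readDouble();     // same order as in writeObject
            double h = in.readDouble();
            outline = new Ellipse(w, h);
        }
    }

    // Writes a shape to memory and reads it back.
    static Shape roundTrip(Shape s) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(bytes)) {
            oos.writeObject(s);
        }
        try (ObjectInputStream ois = new ObjectInputStream(
                new ByteArrayInputStream(bytes.toByteArray()))) {
            return (Shape) ois.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        Shape copy = roundTrip(new Shape("egg", new Ellipse(2.0, 3.0)));
        System.out.println(copy.name + " " + copy.outline.w + " x " + copy.outline.h);
    }
}
```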
13 WOX Serialiser
- Serialises most objects to XML
- WOX: Web Objects in XML
- Written by Simon Lucas in 2004
- More flexible than Java's serialisation
- Works with more classes
- Classes do not need to be declared serialisable
- Still has some bugs (does not do HashMap yet, due to problems with final declarations)
- Objects in XML are human readable
- Many advantages to this
- See http://algoval.essex.ac.uk/wox/serial/readme.html
14 Object-Relational Mapping
15 Object-Relational Mismatch
- Objects
- data and behaviour
- traverse classes by following associations and attributes
- can be structured by inheritance
- Relational database
- data stored in form of tables
- relational algebra operations; SQL: CREATE, ALTER, INSERT, DELETE, UPDATE, SELECT, ...
- keys (primary, foreign, composite)
- no general behaviour, no inheritance
16 Object-Relational Mapping Basics
- Classes are mapped to tables
- Objects are mapped to rows
- identified by a primary key
- Attributes are mapped to columns.
- object attributes are mapped to a foreign key identifying the object.
- No mapping of
- methods
- attributes which do not contain persistent data
17 Simple Object-Relational Mapping: Class Order
18 Mapping Collections and Associations
- Collection attributes are mapped to tables with two columns.
- A row with entries x and y means that the collection belonging to object x contains element y
- Associations can be eliminated by transforming them first to collection attributes and then mapping them to tables.
- Associations can also be mapped directly
- linked objects are identified by foreign keys
- many-to-many associations are usually mapped to a separate table with two columns
- Note: the choice of mapping can have performance and storage efficiency implications.
19 Example: Mapping a 1-to-Many Association
[Diagram: class Customer associated with class Order, multiplicity 1 at the Customer end]
Table Customer: customerID, ...
Table Order: orderID, dateReceived, customerID (foreign key)
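20 Example: Mapping a Many-to-Many Association
[Diagram: class Airport (code, airportName) linked to class City (cityName) by the association Serves]
Tables: Airport, City, and a separate table Serves (diagram labels: Primary Key, Separate Table)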
21 O/R Mapping Exercise: Association Class
[Diagram: classes Company (companyName) and Person (natInsurance, taxCode), linked by association class Employment (salary)]
22 Object-Relational Mapping: Inheritance
- Three principal methods for mapping inheritance
- One table for the whole class hierarchy
- attributes which are not used get a null value
- subtype can be indicated as an attribute
- One table for each concrete subclass
- no mapping of abstract superclasses.
- One separate table for each class
- to retrieve data for a subclass, both its own
table and that of the superclass have to be
accessed.
23 Example: Inheritance Hierarchy
[Diagram, adapted from Ambler: class Person (name, phoneNumber) with subclasses Employee (startDate) and Customer (customerNumber, preferences)]
24 Mapping Inheritance: Methods 1) and 2)
Single table for hierarchy
Table Person: OID, name, phoneNumber, customerNumber, preferences, startDate, objectType
One table per concrete class
Table Employee: OID, name, phoneNumber, startDate
Table Customer: OID, name, phoneNumber, customerNumber, preferences
25 Mapping Inheritance: Method 3)
One table per class
Table Person: OID, name, phoneNumber, objectType
Table Customer: OID (FK), customerNumber, preferences
Table Employee: OID (FK), startDate
Exercise: compare the efficiency (storage, access time) of the three approaches to mapping inheritance
26 RDB Programming: Good Practice
- Try to avoid vendor lock-in
- keep code portable even if it slows development
- Avoid polluting your code with SQL statements
- use specialised classes that handle the mapping to databases
- Avoid composite primary keys
- use object identifiers (numbers, strings, ...) as primary keys.
- there are methods for the efficient generation of unique object identifiers in concurrent systems.
- Consider advanced methods
- avoid the repetitive work of writing manual, custom SQL code for each application
- reusable persistence code using reflection or meta-data
- use persistence frameworks which operate on top of RDBs
27 A Brief Introduction to JDBC
28 Caveat and Links
- It is assumed you have a working knowledge of relational databases and SQL.
- Hands-on material by Dick Williams
- http://cscourse.essex.ac.uk/course/cc351/dbase/
- Online tutorials
- Sun: http://java.sun.com/docs/books/tutorial/jdbc/index.html
- IBM: http://www-106.ibm.com/developerworks/java/library/j-jdbc-objects/
- Bruce Eckel: Thinking in Enterprise Java
- http://www.mindview.net/Books.
29 JDBC API
- Interfaces and classes that allow you to
- load the database driver
- connect to a database
- create a statement
- execute a statement (update or query)
- look at a result set
- More interfaces than classes
- There are also other methods, in particular for accessing database metadata
- these allow you to write generic code such as printing a row of a table in general, see literature
30 Loading a Database Driver
    import java.sql.*;
    static String driverClassName = "sun.jdbc.odbc.JdbcOdbcDriver";
    Class.forName(driverClassName);
- The JDBC API is defined in package java.sql
- simplest to import the whole package
- Adjust driver class name for other drivers
- see database driver documentation
- The JDBC-ODBC bridge driver is not very efficient
- extra layer
- More efficient alternatives
- MS SQLServer JDBC driver, MySQL Connector/J, etc
31 Connecting to a Database
    static String jdbcOdbcDriver =
        "jdbc:odbc:Driver={Microsoft Access Driver (*.mdb)};DBQ=";
    static String dbLocation = "m:\\src\\java\\cc438\\cc438Exercises.mdb";
    static String dbUrl = jdbcOdbcDriver + dbLocation;
    String user = "";
    String pwd = "";
    Connection c = DriverManager.getConnection(dbUrl, user, pwd);
- The database URL needs to be adjusted for each database
- Note the use of \\ in dbLocation declaration.
- User and password might be needed.
- Good design: use configuration files to read database connection parameters.
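A minimal sketch of the configuration-file idea using java.util.Properties. The key names db.url, db.user and db.password and the data source name myDataSource are assumptions made for this example. The text is read from a string here so that the sketch runs without a file; a real program would pass a FileReader for its own configuration file.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.Properties;

public class DbConfig {
    // Loads connection parameters from any character source.
    static Properties load(Reader source) throws IOException {
        Properties p = new Properties();
        p.load(source);
        return p;
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for the contents of a configuration file (hypothetical values).
        String text = "db.url=jdbc:odbc:myDataSource\n"
                    + "db.user=\n"
                    + "db.password=\n";
        Properties p = load(new StringReader(text));
        System.out.println(p.getProperty("db.url"));
        // The values would then be handed to JDBC, for example:
        // DriverManager.getConnection(p.getProperty("db.url"),
        //         p.getProperty("db.user"), p.getProperty("db.password"));
    }
}
```

With this arrangement, moving to a different database only requires editing the configuration file, not recompiling the program.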
32 SQL Statement Strings (JDBC Tutorial)
    String createTableCoffees =
        "CREATE TABLE COFFEES " +
        "(COF_NAME VARCHAR(32), ... " + "SALES INTEGER, TOTAL INTEGER)";

    String query = "SELECT COF_NAME, PRICE FROM COFFEES";
- JDBC passes SQL statements as strings to the database.
- do not miss blanks
- special meaning of single quotes to enclose text-type values
- escape (double) the single quote if data contains a single quote
- also be careful with the format of dates, keywords (order by), etc.
- Alternative: prepared statements
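The quoting rule can be sketched as a small helper. The method name quote and the sample value are assumptions for this example, and it covers only the standard SQL convention of doubling a single quote; individual databases may have further escaping rules, which is one reason prepared statements are usually the safer choice.

```java
public class SqlQuote {
    // Encloses a text value in single quotes, doubling any single quote inside it.
    static String quote(String value) {
        return "'" + value.replace("'", "''") + "'";
    }

    public static void main(String[] args) {
        String sql = "SELECT COF_NAME, PRICE FROM COFFEES WHERE COF_NAME = "
                   + quote("Joe's Blend");
        System.out.println(sql);
    }
}
```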
33 Using Statement Objects
    Statement stmt = con.createStatement();
    stmt.executeUpdate(createTableCoffees);
    stmt.executeUpdate("INSERT INTO COFFEES " +
        "VALUES ('Colombian', 101, 7.99, 0, 0)");
    ResultSet rs = stmt.executeQuery(query);
- executeUpdate() returns row count or 0.
- A ResultSet object represents a number of db rows.
- There are various types of ResultSet objects
- default: not scrollable, not updatable
- can be specified via createStatement() parameters
- See also subinterface javax.sql.RowSet
34 Navigating a ResultSet
    String query = "SELECT COF_NAME, PRICE FROM COFFEES";
    ResultSet rs = stmt.executeQuery(query);
    while (rs.next()) {
        String s = rs.getString("COF_NAME");
        float n = rs.getFloat("PRICE");
        System.out.println(s + " " + n);
    }
- Cursor initially before the first row.
- use rs.next() to go to next row
- In a scrollable ResultSet, the cursor can be moved backwards and forwards by more than one row
- for example rs.absolute(5) or rs.relative(-2)
35 ResultSet getXXX() Methods
    String s = rs.getString("COF_NAME");
    float n = rs.getFloat("PRICE");
- Retrieves the values in the columns of the current row.
- getXXX() converts SQL types to Java types
- This can be tricky for some types, see literature
- Columns can also be identified by position
- starting from 1
    String s = rs.getString(1);
    float n = rs.getFloat(2);
- There are similar updateXXX() methods
- requires updatable ResultSet object
36 Prepared Statements
- Use a prepared statement for frequently repeated database queries and updates.
- When you send a prepared statement to a database
- the database formulates a query strategy and
- saves it with the prepared statement
- Prepared statements can also help with tricky queries that contain apostrophes.
- Variable substitution adds proper escape sequences.
37 Prepared Statements: JDBC Code
    // prepare a query
    String query = "SELECT ... WHERE Account_Number = ?";
    PreparedStatement prepStat = conn.prepareStatement(query);
    ...
    // The ? denotes variables that need to be filled in when you
    // make an actual query. Use the set methods to set the variables
    prepStat.setString(1, accountNumber);
    // Execute the query when all variables are set
    ResultSet result = prepStat.executeQuery();
38 Transactions
- A transaction is a sequence of database updates (or other operations) that should either all succeed or not happen at all.
- COMMIT makes the updates permanent.
- ROLLBACK undoes all changes since last COMMIT.
- Transactions are supported by many databases.
- JDBC default is automatic commit of updates
- disable it for manual transaction processing
    Connection conn = . . .;
    conn.setAutoCommit(false);
    Statement stat = conn.createStatement();
39 Transactions: JDBC Code Structure
    try {
        con = ...;
        con.setAutoCommit(false);
        ...
        updateSales.executeUpdate();
        updateTotal.executeUpdate();
        con.commit();
        ...
    } catch (SQLException ex) {
        if (con != null) {
            try {
                System.err.print("Transaction is being rolled back");
                con.rollback();
            } catch (SQLException ex2) {
                // the rollback itself failed
            }
        }
    }
40 RDBMS: What's Missing
- Stored procedures
- Triggers
- see CC433 (Advanced Relational and Object-Oriented Databases)
- More on transactions
- see CC434 (Enterprise Component Architectures)
- Relationship with XML
- see C432
- Automated support for O/R mapping
- APIs: Entity EJBs, see CC434
- tools: Hibernate, Oracle object-relational features, ...
41 DB Access: Architectural Patterns
- It is bad style to have SQL statements in the middle of your code.
- better put the SQL code into special methods (save, load) that deal with database access
- There are various architectural patterns for organising database access
- add database access methods to domain classes or
- put database access methods into specialised classes
- one class per table: table gateway class
- one class per database: database façade
- End of JDBC crash course
42 Object Oriented Databases
- OO databases with Java bindings can provide natural persistence for Java objects.
- OODB does not need to support SQL
- OO query languages
- no single dominating standard yet
- relatively popular: Java Data Objects (JDO)
- new: persistence in EJB 3.0
- OODBMSs: ObjectDB, FastObjects, Versant, etc.
- In CC438, we will take a look at db4o
- small company: db4objects Inc.
- GPL, also commercial versions
43 Old-Style Object DB (ObjectStore, FastObjects)
- Strict adherence to principles
- Enforced referential integrity
- An object could not be deleted while other objects still referred to it
- Enforced transactions
- Objects could only be accessed within a transaction
- This effect would be cascaded (so that objects referenced by root object would be inaccessible while root was opened for access)
- Hard to work with Java (classes had to be modified)
44 Lightweight Object DB: db4o (http://www.db4o.com)
- Abandons all above principles!
- Programmer is given more responsibility
- Con: possible to get things badly wrong
- Pro: DB is simple to use
- Pro: DB is MUCH easier to implement
- Code can be pure Java
- No problem with including SQL strings which can't be syntax and type checked
- An interesting idea: let's see how it works
45 db4o (DB for Objects)
- Native Java/.NET OODBMS
- db engine is a 400KB executable jar file
- aims at embedded devices market
- not for large-scale client/server environments
- Explicit storage and retrieval of objects
- Query by example, Native Queries, SODA
- Supports transactions and optimistic locking
- Installation: add db4o-5.0-java5.jar to class path
- GUI tool: ObjectManager
- See db4o tutorial and API doc for more details.
46 Interface com.db4o.ObjectContainer
    ObjectContainer db = Db4o.openFile("mydb.yap");
- Storage and query interface
- reminiscent of connection in other databases
- Single database file (here mydb.yap, binary)
- database file will be created if it does not exist.
- Every ObjectContainer owns one transaction
- all work is transactional; after commit() and rollback(), a new transaction starts immediately
- Every ObjectContainer maintains its own references to stored and instantiated objects
47 Example Domain Classes
Student: fields name, scheme; methods getName(), getScheme(), setName(), setScheme(), toString()
Course: fields code, teacher, students; method toString()
CourseList: fields name, courseList; method toString()
48 Sample Objects
advancedCourses: name = advancedCourse, courseList = [cc362, cc384]
cc362: name = cc362, teacher = hhu, students = [ann, ben, claire]
cc384: name = cc384, teacher = udo, students = [ben, claire, dave]
ann: name = Ann, scheme = AI
dave: name = Dave, scheme = RO
ben: name = Ben, scheme = SE
claire: name = Claire, scheme = AI
49 Storing Objects
    try {
        db.set(advancedCourses);
        db.commit();
    } finally {
        db.close();
    }
- Stores object advancedCourses including all attributes, i.e. students and courses.
- Closing an ObjectContainer will release the resources associated with it.
- Warning: do not forget it, otherwise you can get problems with a locked database file.
50 More on the Code in Storing Objects
- db4o works with object identity, not equality
- inserting two different objects with the same fields will result in both objects being stored in the database
- careful you do not get many copies of the same object!
- Storing an object that already exists in the database will result in an update
- if the object was stored in a previous program run, then you usually need to retrieve it first, then modify it, then call the set method.
- update is shallow by default, see update depth discussion later.
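The difference between identity and equality can be illustrated in plain Java, without db4o. This is only an analogy under stated assumptions: the Student class here is a simplified stand-in that defines equals(), and the two sets play the role of a store. A store keyed on identity keeps both of two equal objects, while storing the same object twice adds nothing new.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.IdentityHashMap;
import java.util.Objects;
import java.util.Set;

public class IdentityDemo {
    // Simplified stand-in for the Student class of the slides.
    static class Student {
        final String name, scheme;

        Student(String name, String scheme) {
            this.name = name;
            this.scheme = scheme;
        }

        @Override public boolean equals(Object o) {
            return o instanceof Student
                && Objects.equals(name, ((Student) o).name)
                && Objects.equals(scheme, ((Student) o).scheme);
        }

        @Override public int hashCode() { return Objects.hash(name, scheme); }
    }

    // Number of entries when objects are distinguished by equals().
    static int storedByEquality(Student... objs) {
        Set<Student> store = new HashSet<>();
        Collections.addAll(store, objs);
        return store.size();
    }

    // Number of entries when objects are distinguished by identity (==).
    static int storedByIdentity(Student... objs) {
        Set<Student> store =
            Collections.newSetFromMap(new IdentityHashMap<Student, Boolean>());
        Collections.addAll(store, objs);
        return store.size();
    }

    public static void main(String[] args) {
        Student a = new Student("Anna", "SE");
        Student b = new Student("Anna", "SE");   // equal fields, different object
        System.out.println(storedByEquality(a, b));     // equal objects collapse
        System.out.println(storedByIdentity(a, b, a));  // a counted once, b separately
    }
}
```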
51 Query by Example (QBE)
- Interface ObjectContainer has method ObjectSet get(java.lang.Object template)
- As a template use a sample object
- some attributes have a proper value
- some attributes are set to null (= wild card)
- db.get(template) will return all objects of that type in db which fit the template.
- templates can be composite, see db4o tutorial example
- QBE is not a full query language.
52 Using Query by Example
- Retrieve all students
    ObjectSet results = db.get(new Student(null, null));
- Retrieve all students with name = Claire
    ObjectSet results = db.get(new Student("Claire", null));
- Retrieve all students with scheme = AI
    ObjectSet results = db.get(new Student(null, "AI"));
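The matching rule behind query by example can be sketched in plain Java with reflection. This is an illustration of the idea only, not db4o's implementation: a list stands in for the database, the Student class is a simplified stand-in, and only fields declared directly in the template's class are compared. A null field in the template acts as a wild card.

```java
import java.lang.reflect.Field;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class QbeSketch {
    // Simplified stand-in for the Student class of the slides.
    static class Student {
        private final String name, scheme;

        Student(String name, String scheme) {
            this.name = name;
            this.scheme = scheme;
        }

        @Override public String toString() { return name + "/" + scheme; }
    }

    // Returns the candidates that agree with every non-null field of the template.
    static <T> List<T> get(T template, List<T> candidates) throws IllegalAccessException {
        List<T> hits = new ArrayList<>();
        for (T c : candidates) {
            if (template.getClass().isInstance(c) && matches(template, c)) hits.add(c);
        }
        return hits;
    }

    private static boolean matches(Object template, Object candidate)
            throws IllegalAccessException {
        for (Field f : template.getClass().getDeclaredFields()) {
            f.setAccessible(true);
            Object wanted = f.get(template);
            // null in the template is a wild card; otherwise values must be equal
            if (wanted != null && !wanted.equals(f.get(candidate))) return false;
        }
        return true;
    }

    public static void main(String[] args) throws IllegalAccessException {
        List<Student> db = Arrays.asList(
            new Student("Ann", "AI"), new Student("Ben", "SE"),
            new Student("Claire", "AI"), new Student("Dave", "RO"));
        System.out.println(get(new Student(null, "AI"), db));      // students on scheme AI
        System.out.println(get(new Student("Claire", null), db));  // students named Claire
    }
}
```

The sketch also shows why QBE is not a full query language: a template can only express "field equals value", not ranges, negation or alternatives.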
53 Interface ObjectSet<Item>
- interface ObjectSet extends Collection<Item>, Iterable<Item>, Iterator<Item> and List<Item>
- can use hasNext(), next() for iteration
- also reset(), size() and ext()
- Example: print all elements of an ObjectSet
    static <Item> void showObjectSet(ObjectSet<Item> objs) {
        for (Item itm : objs) System.out.println(itm);
    }
54 getStudentByName()
    public static Student getStudentByName(
            ObjectContainer db, String name) {
        Student student = new Student(name, null);
        ObjectSet resultSet = db.get(student);
        if (resultSet.hasNext())
            student = (Student) resultSet.next();
        return student;
    }

    Student anna = getStudentByName(db, "Anna");
55 Retrieving All Students on a Course
    public static Course getCourseByName(
            ObjectContainer db, String name) {
        Course course = new Course(name, null, null);
        ObjectSet resultSet = db.get(course);
        if (resultSet.hasNext())
            course = (Course) resultSet.next();
        return course;
    }

    Course cc384 = getCourseByName(db, "cc384");
    System.out.println("cc384 students: " + cc384.students);
56 Database Model = Object Model
- Why did we need different code for showing all students on a course?
- The database model is the same as the object model.
- The object model treats course membership differently from being a member of a scheme.
- In a relational model, belonging to a scheme and belonging to a course might be modelled in the same way, as binary tables.
- very similar program code in both cases
- Observation: the object model affects queries
- in order to make querying effective, you might need to adapt the object model.
57 db4o Native Queries
    List<Student> result =
        db.query(new Predicate<Student>() {
            public boolean match(Student stud) {
                return stud.getName().startsWith("A");
            }
        });
- Will return all Student objects in the database with a name starting with A.
- Relies on extending abstract class Predicate with boolean method match().
- Potential performance problems?
58 Query API (S.O.D.A)
- S.O.D.A (= Simple Object Database Access)
- provides query API
- construct constraints on object fields
- Methods for building constraints
- greater(), smaller(), contains(), equal(), identity(), ...
- and(), or(), not()
- S.O.D.A is an older API than native queries
- native queries are now recommended
59 S.O.D.A: Finding Anna's Courses
    query = db.query();
    query.constrain(Course.class);
    Constraint constrAnna =
        query.descend("students").constrain(anna).contains();
    query.constrain(constrAnna);
    ObjectSet coursesContainingAnna = query.execute();
    showResults(coursesContainingAnna);
60 Updating Objects in the Database
- Replace old copy of obj in db with new version: db.set(obj)
- Need to ensure that object to be stored is the same object as the one stored in the database
- identical, not just equal
- often need to retrieve old object first from database
- Example
    Student anna = getStudentByName(db, "Anna");
    anna.setScheme("SE");
    db.set(anna);
- Question: what happens if you call
    db.set(new Student("Anna", "SE"));
61 Shallow Update
- Answer: The last line on the page above does not update Anna in the db.
- Instead it creates a second object with the same fields.
- A different issue: update is by default shallow
- the update depth is by default one
- it does not update mutable object fields that were changed, i.e. object fields whose attributes changed.
- Example
    cc384.students.remove(benny);
    db.set(cc384);
- will not remove benny from cc384 in db.
62 Cascading Update
- Shallow update avoids a lot of work when updating a complex object
- completely analogous to shallow cloning of objects
- If you want a deep update, you can tell db4o to cascade the update for certain classes
    Db4o.configure().objectClass(Course.class).cascadeOnUpdate(true);
- After applying this setting before opening the database file, the command
    db.set(cc384);
- will update cc384.students in the database, if this object was changed in the program.
63 Deleting Objects
- db.delete(obj) will remove obj from db
- this is a shallow delete, i.e. deleting a Course object does not delete any students!
- Again, a deep (recursive) delete can be achieved by an appropriate configuration
    Db4o.configure().objectClass(Course.class).cascadeOnDelete(true);
- Now db.delete(cc438) will also delete all objects in cc438.students from the database.
- What about referential integrity: does db4o check whether a deleted object is referenced elsewhere?
- Not checked: the value of such references becomes null.
- Programmers need to ensure themselves that a delete operation does not create problems.
64 Methods in Interface ObjectContainer
- void activate(java.lang.Object obj, int depth)
- boolean close()
- void commit()
- void deactivate(java.lang.Object obj, int depth)
- void delete(java.lang.Object obj)
- ExtObjectContainer ext()
- ObjectSet get(java.lang.Object template)
- Query query()
- void rollback()
- void set(java.lang.Object obj) // store or update object
- Extended facilities provided by interface com.db4o.ext.ExtObjectContainer
65 Inheritance and db4o
- Assume class InternationalStudent extends class Student with an additional field country and
    Student claire = new InternationalStudent("Claire", "AI", "Greece");
- How does this affect the database code?
- Storage code needs no change
- db.set(cc384) will store all elements of cc384.students, whether they belong to class InternationalStudent or not.
- The query code also still works
- db.get(new Student("Claire", null)) matches any object of class Student or its subclasses, provided name = Claire
- Problem: QBE with interfaces/abstract classes?
- Solution: db.get(xyz.class) or use Query API.
66 A Recursive Data Structure: Labelled Binary Trees
Class LBinTree: fields LBinTree left, int value, LBinTree right; constructors LBinTree(int depth), LBinTree(.., .., ..); method countNodes(..)
Sample objects:
root: LBinTree, value = 1, left = t1, right = t2
t1: LBinTree, value = 0, left = null, right = null
t2: LBinTree, value = 0, left = null, right = null
67 Storing and Retrieving LBinTree Objects
- Storing
    LBinTree original = new LBinTree(8);
    db.set(original);
- Retrieving
    public static LBinTree getLBinTree(ObjectContainer db, int value) {
        LBinTree btree = new LBinTree(null, value, null);
        ObjectSet result = db.get(btree);
        if (result.hasNext()) btree = (LBinTree) (result.next());
        return btree;
    }
68 Testing LBinTree Persistence
- First session: store LBinTree with 511 nodes
    LBinTree original = new LBinTree(8);
    db.set(original);
    System.out.println("Stored tree with " + LBinTree.countNodes(original) + " nodes.");
    > Stored tree with 511 nodes.
- Later session
    LBinTree retrieved = getLBinTree(db, 8);
    System.out.println("Retrieved tree with " + LBinTree.countNodes(retrieved) + " nodes");
    > Retrieved tree with 63 nodes.
- What went wrong?
69 Activation Depth
- Similar to update and delete, db4o queries are also depth-restricted.
- The default activation depth for any object is 5
- Retrieval runs into nulls after traversing 5 references.
- As usual, this can be set in the configuration
- there are also ways of dynamically activating nodes
    Db4o.configure().objectClass(LBinTree.class).cascadeOnActivate(true);
- This provides us with deep retrieval of LBinTrees
    > Retrieved tree with 511 nodes.
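The two node counts can be reproduced without a database. The sketch below assumes that LBinTree(depth) builds a complete binary tree whose root value equals its depth, which is consistent with the sample objects but is not taken from the course code. A complete tree of depth 8 has 2^9 - 1 = 511 nodes; following at most 5 references from the root reaches 6 levels, i.e. 2^6 - 1 = 63 nodes.

```java
public class ActivationDepthSketch {
    // Minimal stand-in for the LBinTree class (constructor semantics are an assumption).
    static class LBinTree {
        LBinTree left, right;
        int value;

        LBinTree(int depth) {
            value = depth;
            if (depth > 0) {
                left = new LBinTree(depth - 1);
                right = new LBinTree(depth - 1);
            }
        }
    }

    // Counts all nodes of the tree.
    static int countNodes(LBinTree t) {
        return t == null ? 0 : 1 + countNodes(t.left) + countNodes(t.right);
    }

    // Counts the nodes reachable by following at most maxRefs references,
    // imitating a retrieval that yields nulls beyond that depth.
    static int countNodes(LBinTree t, int maxRefs) {
        if (t == null) return 0;
        if (maxRefs == 0) return 1;
        return 1 + countNodes(t.left, maxRefs - 1) + countNodes(t.right, maxRefs - 1);
    }

    public static void main(String[] args) {
        LBinTree tree = new LBinTree(8);
        System.out.println(countNodes(tree));      // prints 511
        System.out.println(countNodes(tree, 5));   // prints 63
    }
}
```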
70 db4o Indexing
- Can index on particular fields
    Db4o.configure().objectClass(Student.class).objectField("name").indexed(true);
- Without such an index, retrieval is very slow
- 10000 students, nearly 1 sec to retrieve one student
- Building indexes is slow
- can take minutes
- Once the index is built, queries are fast
- 10000 students, 15 ms to find one student
71 db4o Miscellaneous
- Suggested to call commit() regularly
- avoid potential problems with stack overflows
- db4o claims to be thread-safe.
- offers support for usage from servlets/JSP
- Special database-aware collection types.
- Might need to start a new session (i.e. db.close()) to see some effects as db4o uses caching.
- An ObjectContainer can either be
- a database in single-user mode (embedded)
- or a client to a db4o server (client-server)
72 db4o OODBMS Summary
- Very simple to use
- but be aware of depth in updates, deletes and queries
- Query performance good after indexing
- but indexing is slow
- Lean database
- suitable for embedded or specialised applications
- not enough features for large-scale deployment
- Storing objects directly takes experience
- database model = object model ≠ relational model
- object model might need adjustments in order to improve query times