Title: Distributed Software Engineering
1. Serialization
Flatten your object for automated storage or network transfer
2. Software object persistence
- Persistence: saving information about an object so that it can be recreated at a different time, in a different place, or both.
- Object serialization: a means of implementing persistence. It converts an object's state into a byte stream that can be used later to reconstruct (deserialize) a virtually identical copy of the original object.
- Default serialization for an object writes
  - the class of the object,
  - the class signature,
  - the values of all non-transient and non-static fields.
3. Serialization protocol
- For serialization:
  - java.io.ObjectOutputStream via writeObject, which by default does the work of defaultWriteObject.
- For deserialization:
  - java.io.ObjectInputStream via readObject, which by default does the work of defaultReadObject.
- Any object instance that belongs to the graph of the object being serialized must be serializable as well.
- The superclass should be Serializable too. If it is not, it must provide an accessible no-argument constructor, and its fields are not saved by default serialization.
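The rule about the object graph can be checked with a small experiment. The sketch below is illustrative only: the Car and Engine classes and the canSerialize helper are hypothetical names, not part of the slides. It tries to write an object whose graph contains a non-serializable member and reports whether the write succeeded.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class GraphDemo {
    // Hypothetical class that does NOT implement Serializable.
    static class Engine {
        int horsepower = 90;
    }

    // Serializable itself, but its object graph contains an Engine.
    static class Car implements Serializable {
        private static final long serialVersionUID = 1L;
        String model = "demo";
        Engine engine = new Engine();
    }

    // Returns true if obj and everything reachable from it can be written.
    static boolean canSerialize(Object obj) throws IOException {
        try (ObjectOutputStream out = new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(obj);
            return true;
        } catch (NotSerializableException ex) {
            // Thrown when some object in the graph is not serializable.
            return false;
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("String serializable: " + canSerialize("hello"));
        System.out.println("Car serializable: " + canSerialize(new Car()));
    }
}
```

Marking the engine field transient, or making Engine implement Serializable, would let the Car be written.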
4. Serialization protocol
- Customize the default: implement extended versions of the default methods in
  - writeObject
  - readObject
- But final fields cannot be assigned in a custom readObject. They need to be restored by defaultReadObject.
- Create your own complete serialization by implementing the interface Externalizable.
5. Specifying persistent objects
- The class of an object that is to be serialized must implement the interface
  - java.io.Serializable
- This interface is empty (a marker interface) and is used to mark the objects of such a class as persistent.
6. Deserialization
- It reads the values written during serialization.
- Static fields in the class are left untouched.
- If the class needs to be loaded, then normal initialization of the class takes place, giving static fields their initial values.
- Transient fields will be initialized to default values.
- Recreation of the object graph will occur in reverse order from its serialization.
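The treatment of static and transient fields can be observed with an in-memory round trip. This is a minimal sketch under stated assumptions: the Counter class and the roundTrip helper are hypothetical, and a byte array stands in for a file or socket.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class FieldsDemo {
    static class Counter implements Serializable {
        private static final long serialVersionUID = 1L;
        static int shared = 0;            // static: belongs to the class, never written
        int value;                        // ordinary field: written and restored
        transient String cache = "warm"; // transient: skipped when writing

        Counter(int value) {
            this.value = value;
        }
    }

    // Serializes obj to a byte array and reads it straight back.
    static Object roundTrip(Object obj) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(obj);
        }
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            return in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        Counter original = new Counter(42);
        Counter.shared = 7;
        Counter copy = (Counter) roundTrip(original);
        System.out.println("value  = " + copy.value);     // restored from the stream
        System.out.println("cache  = " + copy.cache);     // null: initializer is not re-run
        System.out.println("shared = " + Counter.shared); // untouched by deserialization
    }
}
```

Note that the field initializer for cache does not run during deserialization, which is why the copy sees null rather than "warm".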
7. Example
import java.io.Serializable;
import java.util.Calendar;
import java.util.Date;

// A serializable class that records the time at which it was created.
public class PersistentTime implements Serializable {
    public PersistentTime() {
        time = Calendar.getInstance().getTime();
    }

    public Date getTime() {
        return time;
    }

    private Date time;
}
8. Class java.io.ObjectOutputStream
- An ObjectOutputStream instance writes primitive
data types and graphs of Java objects to an
OutputStream. The objects can be read
(reconstituted) using an ObjectInputStream.
Persistent storage of objects can be accomplished
by using a file for the stream. If the stream is
a network socket stream, the objects can be
reconstituted on another host or in another
process.
- Only objects that support the java.io.Serializable
interface can be written to streams. The class
of each serializable object is encoded including
the class name and signature of the class, the
values of the object's fields and arrays, and the
closure of any other objects referenced from the
initial objects.
9. Class java.io.ObjectOutputStream
- The method writeObject is used to write an object
to the stream. Any object, including Strings and
arrays, is written with writeObject. Multiple
objects or primitives can be written to the
stream. The objects must be read back from the
corresponding ObjectInputStream with the same
types and in the same order as they were written.
- Primitive data types can also be written to the
stream using the appropriate methods from
DataOutput. Strings can also be written using the
writeUTF method.
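The ordering requirement is easiest to see when primitives and objects are mixed in one stream. The following sketch is illustrative (the class and method names are hypothetical): it writes an int, a UTF string, an array and a String object, then reads them back with the matching calls in the same order.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.util.Arrays;

class StreamOrderDemo {
    // Writes two primitives-style values and two objects into one stream.
    static byte[] write() throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeInt(3);                       // DataOutput method
            out.writeUTF("hello");                 // DataOutput method
            out.writeObject(new int[] {1, 2, 3});  // arrays are objects
            out.writeObject("a string object");    // Strings can go through writeObject too
        }
        return bytes.toByteArray();
    }

    // Reads the values back with the same types and in the same order.
    static String read(byte[] data) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            int count = in.readInt();
            String label = in.readUTF();
            int[] values = (int[]) in.readObject();
            String text = (String) in.readObject();
            return count + " " + label + " " + Arrays.toString(values) + " " + text;
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(read(write()));
    }
}
```

Swapping two of the read calls would make the reader misinterpret the stream and fail with an exception.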
10. Example
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;

public class FlattenTime {
    public static void main(String[] args) {
        // Default file name; the first argument overrides it.
        String filename = "time.ser";
        if (args.length > 0) {
            filename = args[0];
        }
        PersistentTime time = new PersistentTime();
        // try-with-resources closes both streams even if writing fails.
        try (FileOutputStream fos = new FileOutputStream(filename);
             ObjectOutputStream out = new ObjectOutputStream(fos)) {
            out.writeObject(time);
        } catch (IOException ex) {
            ex.printStackTrace();
        }
    }
}
11. Example
import java.io.FileInputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.util.Calendar;

public class InflateTime {
    public static void main(String[] args) {
        // Default file name; the first argument overrides it.
        String filename = "time.ser";
        if (args.length > 0) {
            filename = args[0];
        }
        PersistentTime time = null;
        try (FileInputStream fis = new FileInputStream(filename);
             ObjectInputStream in = new ObjectInputStream(fis)) {
            time = (PersistentTime) in.readObject();
        } catch (IOException | ClassNotFoundException ex) {
            ex.printStackTrace();
            return; // nothing was read, so there is nothing to print
        }
        // Print the stored time next to the current time.
        System.out.println("Flattened time: " + time.getTime());
        System.out.println("Current time: " + Calendar.getInstance().getTime());
    }
}
12. Serializable vs. Non-Serializable objects
- java.lang.Object does not implement Serializable, so you must decide which of your classes need to implement it.
- AWT and Swing components, strings, and arrays are defined as serializable.
- Certain classes and their subclasses are not serializable: Thread, OutputStream, Socket.
- When a serializable class contains instance variables that are not, or should not be, serializable, they should be marked with the keyword transient.
13. Transient fields
- These fields will not be serialized.
- When deserialized, these fields will be initialized to default values:
  - null for object references
  - zero for numeric primitives
  - false for boolean fields
- If these values are unacceptable:
  - Provide a readObject() that invokes defaultReadObject() and then restores the transient fields to acceptable values.
  - Or, the fields can be initialized when used for the first time (lazy initialization).
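Both repair strategies for transient fields can be sketched in one class. The Rectangle class below is a hypothetical example: its area is recomputed inside readObject(), while its label is rebuilt lazily on first use.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class TransientDemo {
    static class Rectangle implements Serializable {
        private static final long serialVersionUID = 1L;
        private final int width;
        private final int height;
        private transient int area;     // derived value, restored in readObject
        private transient String label; // derived value, rebuilt lazily

        Rectangle(int width, int height) {
            this.width = width;
            this.height = height;
            this.area = width * height;
        }

        // Strategy 1: restore the transient field right after the default read.
        private void readObject(ObjectInputStream in)
                throws IOException, ClassNotFoundException {
            in.defaultReadObject(); // restores width and height
            area = width * height;
        }

        int area() {
            return area;
        }

        // Strategy 2: lazy initialization on first use.
        String label() {
            if (label == null) {
                label = width + "x" + height;
            }
            return label;
        }
    }

    static Object roundTrip(Object obj) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(obj);
        }
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            return in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        Rectangle copy = (Rectangle) roundTrip(new Rectangle(3, 4));
        System.out.println(copy.area() + " " + copy.label());
    }
}
```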
14. Serial version UID
- You should explicitly declare a serial version UID in every serializable class.
  - Eliminates the serial version UID as a potential source of incompatibility.
  - Small performance benefit, as Java does not have to compute this number.
- private static final long serialVersionUID = rlv;
  - rlv can be any long value picked out of thin air. It does not have to be unique across classes, because it only distinguishes versions of the same class.
- If you want to make a new version of the class incompatible with existing versions, choose a different UID. Deserialization of the previous version will then fail with InvalidClassException.
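A declared UID can be inspected at run time through java.io.ObjectStreamClass. The sketch below is illustrative: Account and Legacy are hypothetical classes, the first with an explicit UID and the second relying on the value Java computes from the class structure.

```java
import java.io.ObjectStreamClass;
import java.io.Serializable;

class VersionDemo {
    // Explicit UID: stays the same however the class body changes.
    static class Account implements Serializable {
        private static final long serialVersionUID = 42L;
        String owner;
    }

    // No explicit UID: Java derives one from the class details,
    // so adding a field or method may change it.
    static class Legacy implements Serializable {
        String owner;
    }

    // Looks up the UID that the serialization runtime associates with a class.
    static long uidOf(Class<?> type) {
        return ObjectStreamClass.lookup(type).getSerialVersionUID();
    }

    public static void main(String[] args) {
        System.out.println("Account UID: " + uidOf(Account.class));
        System.out.println("Legacy UID (computed): " + uidOf(Legacy.class));
    }
}
```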
15. Customizing ObjectOutputStream, ObjectInputStream
- To provide special behavior in the writing or reading of stream object bytes, implement
  - private void writeObject(ObjectOutputStream out) throws IOException
  - private void readObject(ObjectInputStream in) throws IOException, ClassNotFoundException
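One common reason to implement this pair of methods is a field whose type is not serializable. In the hypothetical User class below, the nickname is held in a java.util.Optional (which does not implement Serializable), so the field is transient and the two private methods write and read its content by hand. This is a sketch of the technique, not a prescribed design.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.Optional;

class CustomFormDemo {
    static class User implements Serializable {
        private static final long serialVersionUID = 1L;
        private final String name;
        private transient Optional<String> nickname; // Optional is not Serializable

        User(String name, String nickname) {
            this.name = name;
            this.nickname = Optional.ofNullable(nickname);
        }

        private void writeObject(ObjectOutputStream out) throws IOException {
            out.defaultWriteObject();               // writes name
            out.writeObject(nickname.orElse(null)); // extra data, written by hand
        }

        private void readObject(ObjectInputStream in)
                throws IOException, ClassNotFoundException {
            in.defaultReadObject();                 // restores name
            nickname = Optional.ofNullable((String) in.readObject());
        }

        String name() {
            return name;
        }

        Optional<String> nickname() {
            return nickname;
        }
    }

    static Object roundTrip(Object obj) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(obj);
        }
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            return in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        User copy = (User) roundTrip(new User("Robert", "Bob"));
        System.out.println(copy.name() + " / " + copy.nickname().orElse("(none)"));
    }
}
```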
16. Creating your own protocol: Externalizable
- Instead of implementing the Serializable interface, implement Externalizable.
- interface Externalizable
  - public void writeExternal(ObjectOutput out) throws IOException
  - public void readExternal(ObjectInput in) throws IOException, ClassNotFoundException
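A minimal Externalizable class might look as follows. The Point class is a hypothetical example. Note the public no-argument constructor: the runtime calls it to create the instance before invoking readExternal.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.Externalizable;
import java.io.IOException;
import java.io.ObjectInput;
import java.io.ObjectInputStream;
import java.io.ObjectOutput;
import java.io.ObjectOutputStream;

class ExternalizableDemo {
    public static class Point implements Externalizable {
        private int x;
        private int y;

        // Required: a public no-argument constructor.
        public Point() {
        }

        public Point(int x, int y) {
            this.x = x;
            this.y = y;
        }

        @Override
        public void writeExternal(ObjectOutput out) throws IOException {
            // The class decides exactly what goes into the stream.
            out.writeInt(x);
            out.writeInt(y);
        }

        @Override
        public void readExternal(ObjectInput in) throws IOException, ClassNotFoundException {
            // Must read in the same order as writeExternal wrote.
            x = in.readInt();
            y = in.readInt();
        }

        public int x() {
            return x;
        }

        public int y() {
            return y;
        }
    }

    static Object roundTrip(Object obj) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(obj);
        }
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            return in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        Point copy = (Point) roundTrip(new Point(3, 4));
        System.out.println(copy.x() + "," + copy.y());
    }
}
```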
17. Performance
- Serialization is a very expensive process. You must clearly have reasons to serialize instead of directly writing what you need to save about the state of an object.
18. Default or Customized serialization?
Or: Implementing Serializable judiciously
- Allowing a class's instances to be serializable can be as simple as adding the words "implements Serializable" to the class specification.
- This is a common misconception; the truth is far more complex.
- While efficiency is one cost associated with it, there are other long-term costs that are much more substantial.
- Using default serialization is very easy, but this ease is specious.
19. Serialization Costs
- Your object's private structure is out for viewing!!!! It has become part of the API.
- A major cost is that it decreases the flexibility to change a class's implementation once the class has been released.
- Increases the likelihood of bugs and security holes.
- Increases the testing associated with releasing a new version of the class.
20. Serialization caveats
- Implementing Serializable is not a decision to be undertaken lightly.
- Classes designed for inheritance should rarely implement Serializable, and interfaces should rarely extend it.
- You should provide a parameterless constructor on non-serializable classes designed for inheritance, in case the class is subclassed and the subclass wants to provide serialization.
- Inner classes should rarely, if ever, implement Serializable.
- A static member class can be serializable.
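The role of the parameterless constructor can be seen in a small sketch. Shape and Circle are hypothetical classes: Shape is not serializable, so during deserialization of a Circle its no-argument constructor runs and its fields are not restored from the stream.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class InheritanceDemo {
    // Not serializable, but designed for inheritance: offers a no-arg constructor.
    static class Shape {
        String label;

        Shape() {
            this.label = "unnamed";
        }

        Shape(String label) {
            this.label = label;
        }
    }

    // Serializable subclass of a non-serializable superclass.
    static class Circle extends Shape implements Serializable {
        private static final long serialVersionUID = 1L;
        double radius;

        Circle(String label, double radius) {
            super(label);
            this.radius = radius;
        }
    }

    static Object roundTrip(Object obj) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(obj);
        }
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            return in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        Circle copy = (Circle) roundTrip(new Circle("wheel", 2.5));
        // radius comes from the stream; label comes from Shape's no-arg constructor.
        System.out.println(copy.radius + " " + copy.label);
    }
}
```

If Shape had no accessible no-argument constructor, reading the Circle back would fail with an InvalidClassException. A subclass that needs the superclass state preserved has to write and read it itself, for example in its own writeObject and readObject.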
21. Consider using a custom serialized form
- The default serialized form of an object is an encoding of the physical representation of the object graph rooted at the object:
  - Data contained in the object.
  - Data contained in every object reachable from it.
  - Topology by which all of these objects are interlinked.
- The ideal serialized form contains only the logical data represented by the object. It is independent of its physical representation.
22. Consider using a custom serialized form
- Default serialization is likely to be appropriate if an object's physical representation is identical to its logical content.
  - Appropriate: a Name class.
  - Not appropriate: a doubly linked List class.
23. Consider using a custom serialized form
- Disadvantages of default serialization when physical and logical representation differ:
  - Permanently ties the exported API to the internal representation.
  - Can consume excessive space.
  - Can consume excessive time.
  - Can cause stack overflow.
24. Consider using a custom serialized form
- A reasonable serialized form for a List is the number of entries followed by each of the entries.
- Although the default serialized form is correct in the List case, that may not hold for an object whose invariants are tied to implementation-specific details.
- Example: a hash table using buckets. The bucket an entry lands in is based on the hash code of the key, which may change from JVM to JVM, or between different runs of the same JVM. Thus the default serialized form can violate the hash table's invariants in this case.
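The "number of entries followed by each of the entries" form can be sketched for a doubly linked list of strings. The StringList class below is a hypothetical illustration: all link fields are transient, so the node topology never reaches the stream, and readObject rebuilds the list by re-adding the entries.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;

class StringListDemo {
    static final class StringList implements Serializable {
        private static final long serialVersionUID = 1L;
        private transient int size;   // physical details stay out of the stream
        private transient Entry head;
        private transient Entry tail;

        // Internal node; deliberately not Serializable.
        private static final class Entry {
            final String data;
            Entry next;
            Entry previous;

            Entry(String data) {
                this.data = data;
            }
        }

        void add(String s) {
            Entry e = new Entry(s);
            if (head == null) {
                head = e;
            } else {
                tail.next = e;
                e.previous = tail;
            }
            tail = e;
            size++;
        }

        List<String> toList() {
            List<String> result = new ArrayList<>();
            for (Entry e = head; e != null; e = e.next) {
                result.add(e.data);
            }
            return result;
        }

        // Logical form: the count, then each entry in order.
        private void writeObject(ObjectOutputStream out) throws IOException {
            out.defaultWriteObject();
            out.writeInt(size);
            for (Entry e = head; e != null; e = e.next) {
                out.writeObject(e.data);
            }
        }

        private void readObject(ObjectInputStream in)
                throws IOException, ClassNotFoundException {
            in.defaultReadObject();
            int count = in.readInt();
            for (int i = 0; i < count; i++) {
                add((String) in.readObject()); // rebuilds the links
            }
        }
    }

    static Object roundTrip(Object obj) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(obj);
        }
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            return in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        StringList list = new StringList();
        list.add("a");
        list.add("b");
        list.add("c");
        System.out.println(((StringList) roundTrip(list)).toList());
    }
}
```

Because the entries are written in a loop rather than by following next references recursively, a long list does not risk the stack overflow mentioned for the default form.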
25. readObject() and security attacks
- Deserialization uses defaultReadObject() and readObject() to create a new instance of a class.
- Thus readObject is a constructor!!!!!
- So, readObject must behave like any other constructor:
  - Check arguments for validity if need be.
  - Make copies of parameters where needed.
- Otherwise, it is a very simple job for an attacker to violate the object's invariants:
  - Provide a hand-made serialization of the attack object.
26. Guide for writing a bulletproof readObject
- Private reference fields should be initialized with defensive copies of their values.
- Check invariants and throw an InvalidObjectException if they fail.
- As with constructors, do not invoke any overridable methods.
- If an entire object graph must be checked for validity after deserialization, the ObjectInputValidation interface should be used.
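These guidelines can be sketched with a hypothetical Period class whose invariant is that start never comes after end. As an assumption made purely for the demonstration, the "forged stream" is simulated by corrupting a valid object in memory before writing it; a real attacker would craft the bytes directly.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InvalidObjectException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.Date;

class PeriodDemo {
    static final class Period implements Serializable {
        private static final long serialVersionUID = 1L;
        private Date start; // non-final so readObject can assign defensive copies
        private Date end;

        Period(Date start, Date end) {
            this.start = new Date(start.getTime());
            this.end = new Date(end.getTime());
            if (this.start.compareTo(this.end) > 0) {
                throw new IllegalArgumentException("start after end");
            }
        }

        Date start() {
            return new Date(start.getTime());
        }

        Date end() {
            return new Date(end.getTime());
        }

        // Behaves like a constructor: copies first, then checks invariants.
        private void readObject(ObjectInputStream in)
                throws IOException, ClassNotFoundException {
            in.defaultReadObject();
            if (start == null || end == null) {
                throw new InvalidObjectException("missing date");
            }
            start = new Date(start.getTime());
            end = new Date(end.getTime());
            if (start.compareTo(end) > 0) {
                throw new InvalidObjectException("start after end");
            }
        }
    }

    static Object roundTrip(Object obj) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(obj);
        }
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            return in.readObject();
        }
    }

    // Simulates a forged stream and reports whether readObject rejected it.
    static boolean rejectsForgedState() throws IOException, ClassNotFoundException {
        Period forged = new Period(new Date(1000L), new Date(2000L));
        forged.end = new Date(0L); // break the invariant behind the constructor's back
        try {
            roundTrip(forged);
            return false;
        } catch (InvalidObjectException expected) {
            return true;
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("forged state rejected: " + rejectsForgedState());
    }
}
```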
27. writeReplace()
- Sometimes it may not be appropriate to serialize the actual object, but rather some specifically designated object.
- <access> Object writeReplace() throws ObjectStreamException
- Returns an object that will replace the current object during serialization. Any object may be returned, including the current one.
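As an illustration, consider an object that wraps a live resource. In the sketch below, LiveConnection and ConnectionInfo are hypothetical classes and a plain Object stands in for the real resource. writeReplace() substitutes a small descriptive object, so the stream contains a ConnectionInfo and that is what the reader gets back.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.ObjectStreamException;
import java.io.Serializable;

class WriteReplaceDemo {
    // Plain description of a connection; this is what ends up in the stream.
    static class ConnectionInfo implements Serializable {
        private static final long serialVersionUID = 1L;
        final String host;
        final int port;

        ConnectionInfo(String host, int port) {
            this.host = host;
            this.port = port;
        }
    }

    // Holds a live resource that should not be written as-is.
    static class LiveConnection implements Serializable {
        private static final long serialVersionUID = 1L;
        private final String host;
        private final int port;
        private transient Object socket = new Object(); // stand-in for a real socket

        LiveConnection(String host, int port) {
            this.host = host;
            this.port = port;
        }

        // Called by the serialization runtime before the object is written.
        private Object writeReplace() throws ObjectStreamException {
            return new ConnectionInfo(host, port);
        }
    }

    static Object roundTrip(Object obj) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(obj);
        }
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            return in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        Object result = roundTrip(new LiveConnection("example.org", 8080));
        System.out.println(result.getClass().getSimpleName());
    }
}
```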
28. A comment about access qualifier
- These methods can be of any accessibility.
- They will be used if they are accessible to the object type being serialized.
  - If a class has a private readResolve, it only affects serialization of objects that are exactly its type.
  - A package-accessible readResolve affects only subclasses within the same package.
  - public and protected readResolve affect objects of all subclasses.
29. readResolve()
- Recall that deserialization produces a new instance of a class.
- If a given class should only have one instance (singleton pattern), then via deserialization we can produce a different instance!!!
- In general, you need to be concerned about what is being created for instance-controlled classes.
- Enter readResolve(): this is a method that returns the appropriate instance of the class in place of the one handed back by the readObject() or defaultReadObject() methods.
- <access> Object readResolve() throws ObjectStreamException
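The singleton case can be sketched with two hypothetical classes: Registry defines readResolve() and keeps its single instance, while Plain does not and ends up with a second instance after a round trip.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.ObjectStreamException;
import java.io.Serializable;

class SingletonDemo {
    // Singleton protected by readResolve.
    static final class Registry implements Serializable {
        private static final long serialVersionUID = 1L;
        static final Registry INSTANCE = new Registry();

        private Registry() {
        }

        // Replaces the freshly deserialized object with the one true instance.
        private Object readResolve() throws ObjectStreamException {
            return INSTANCE;
        }
    }

    // Same shape, but without readResolve.
    static final class Plain implements Serializable {
        private static final long serialVersionUID = 1L;
        static final Plain INSTANCE = new Plain();

        private Plain() {
        }
    }

    static Object roundTrip(Object obj) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(obj);
        }
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            return in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("Registry still unique: "
            + (roundTrip(Registry.INSTANCE) == Registry.INSTANCE));
        System.out.println("Plain still unique: "
            + (roundTrip(Plain.INSTANCE) == Plain.INSTANCE));
    }
}
```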