Title: The R.oo package: Robust object-oriented design
1 The R.oo package: Robust object-oriented design and implementation with support for references
Henrik Bengtsson, hb_at_maths.lth.se
Mathematical Statistics, Centre for Mathematical Sciences, Lund University, Sweden
DSC-2003, Vienna, March 20-22, 2003
2 Outline
- Purpose and what the package is and is not.
- RCC: R Coding Conventions (draft).
- Reference variables.
- The root class Object.
- setMethodS3() and setConstructorS3().
- Rdoc comments.
- Static methods.
- Virtual fields.
- trycatch(): exception handling based on class.
3 Purposes
- End user (the most important person at the end of the day!)
  - Provide consistent object-oriented APIs across different packages, e.g. by having a well-defined naming convention for classes, methods, fields and variables.
  - Make class inheritance more explicit.
  - Provide a simpler API, e.g. fewer arguments.
  - More memory-efficient packages.
- Developer / programmer
  - Provide reference variables to reduce memory requirements and data redundancy.
  - R Coding Convention, e.g. naming conventions.
  - Create generic functions automatically.
  - Make code cleaner and remove the need for tedious code repetitions.
  - Minimize the risk of package conflicts.
  - More code checking when creating methods and classes to catch errors early on.
  - Catch rare but classical bugs, e.g. using reserved words in method names.
  - Keep help pages up to date with the source code by allowing Rd documentation to be placed together with the code in the source files.
4 Real-world example
  # Read all GenePix Result files
  gpr <- MicroarrayData$read(pattern=".gpr")

  # Extract the foreground and background signals of the red and
  # the green channels. The slide layout is also included.
  raw <- as.RawData(gpr)

  # Get the background corrected signal as M=log(R/G) and A=log(R*G)/2.
  ma <- getSignal(raw, bgSubtract=TRUE)
  normalizeWithinSlide(ma, method="p")   # print-tip normalization.

  # highlight() highlights the data points from the correct slide
  # in the correct space.
  knownGenes <- c(50,194,3433,5541,6384)
  plot(ma);           highlight(ma, knownGenes)
  plotPrintorder(ma); highlight(ma, knownGenes)
  plotSpatial(ma);    highlight(ma, knownGenes)
  plotSpatial3d(gpr, field="area", col=getColor(ma))

  # Write the normalized data to a tab-delimited file
  write(ma, "NormalizedExpressions.dat")
5 What the package is and isn't
- Is not supposed to replace S3 or S4, but
- is an extra layer on top of S3 (eventually S4), to
- move the focus from S3 and S4 details to object-oriented design and implementation.
[Diagram: R.oo as a layer on top of the R environment (S3 and eventually S4)]
- It has been tested and verified for > 2 years!
6 RCC: R Coding Conventions (draft)
http://www.maths.lth.se/help/R/RCC/
- Standardizes the coding style.
- Examples of the naming conventions:
  - Variables, objects, fields and methods should be verbs starting with a lower case letter, e.g. shape$side and normalize().
  - Classes should be nouns starting with an upper case letter, e.g. MicroarrayData.
  - Constants should be in all upper case, e.g. Colors$RED.HUE.
  - Similar to Java.
- Standards
  - make the code (and the design) easier to read, share and maintain.
  - reduce the risk of bugs and misunderstandings.
7 Reference variables
- Memory efficient.
- Minimizes the amount of redundant data.
- Very useful for some data structures, e.g. graphs.
- References in R.oo are implemented using the environment data type.
- Collected by the R garbage collector.
- (More user-friendly method interfaces, since methods can communicate with each other by updating the state of the object.)
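The difference between ordinary (copied) values and environment-based references can be illustrated in base R alone. The following is only a minimal sketch; the variable names and the helper grow() are hypothetical and are not part of R.oo.

```r
# A list is copied on assignment, so the two variables diverge.
listA <- list(side = 2)
listB <- listA
listB$side <- 23
listA$side                      # still 2

# An environment is a reference, so both names see the update.
envA <- new.env()
assign("side", 2, envir = envA)
envB <- envA
assign("side", 23, envir = envB)
get("side", envir = envA)       # now 23

# Hypothetical helper: updates the shared state without returning the object.
grow <- function(env, factor) {
  assign("side", factor * get("side", envir = env), envir = env)
  invisible(NULL)
}
grow(envA, 2)
get("side", envir = envB)       # 46
```

Because no copy is made when the environment is passed to grow(), large data structures do not have to be duplicated for each call.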
8 A common root class: Object
- All classes should have the common root class Object.
- A similar idea exists in R today, e.g. print(), as.character() etc., but a common root class makes it more explicit.
9 Object: the common root class
10 A common root class: Object
- Fields of an Object can be accessed as elements of a list, e.g.
  - square$side and
  - square$side <- 23
- Methods can also be called as
  - square$getArea()
- The implementation of reference variables is taken care of within the Object class. Under the hood, we roughly have

  "$.Object" <- function(object, name) {
    # Look up the field in the environment attached to the object.
    get(name, envir=attr(object, ".env"))
  }

  "$<-.Object" <- function(object, name, value) {
    # Store the field in the environment attached to the object.
    assign(name, value, envir=attr(object, ".env"))
    # A replacement function must return the object itself.
    invisible(object)
  }
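The mechanism outlined above can be tried out in base R without R.oo. This is a minimal sketch under stated assumptions: the class name RefSketch and the constructor newRefSketch() are hypothetical stand-ins for Object and Object(), and the real implementation does considerably more.

```r
# Hypothetical constructor: a small stand-in for Object(), not R.oo code.
newRefSketch <- function(...) {
  env <- new.env()
  fields <- list(...)
  for (name in names(fields)) {
    assign(name, fields[[name]], envir = env)
  }
  # The object itself is a dummy value; all state lives in the environment.
  structure(NA, .env = env, class = "RefSketch")
}

# Field lookup is redirected to the attached environment.
"$.RefSketch" <- function(x, name) {
  get(name, envir = attr(x, ".env"))
}

# Field assignment updates the environment and returns the same object.
"$<-.RefSketch" <- function(x, name, value) {
  assign(name, value, envir = attr(x, ".env"))
  invisible(x)
}

square <- newRefSketch(side = 2)
alias <- square          # copies the handle, not the state
alias$side <- 23
square$side              # 23: both names refer to the same fields
```

The design choice worth noting is that copying the object only copies the handle, so alias and square keep sharing the same fields.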
11 setMethodS3()
(Does not require the Object class.)
- Defines a method of a class.
- Creates a generic function automatically iff missing.
- RCC
  - Methods should start with a lower case letter.
  - Asserts that a correct method name is used: reserved words and names of basic functions that must not be overwritten or redefined are protected.

  setMethodS3("plotPrintorder", "MAData", function(object, ...) {
    ...
  })

  setMethodS3("next", "Iterator", function(object, ...) {
    ...
  })
  Error 2003-03-18 16:28:00 RccViolationException: Method names must not be same as a reserved keyword in R: next, cf. http://www.maths.lth.se/help/R/RCC/
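The two ideas on this slide, creating the generic function only if it is missing and rejecting reserved words as method names, can be sketched in base R as follows. The helper defineMethodSketch() and the class SquareSketch are hypothetical; this is not how setMethodS3() is implemented, and unlike the real function the sketch makes no attempt to handle an existing non-generic function of the same name.

```r
# Hypothetical helper illustrating the idea; not R.oo's setMethodS3().
defineMethodSketch <- function(name, class, fn, envir = globalenv()) {
  reserved <- c("if", "else", "repeat", "while", "function", "for",
                "next", "break", "TRUE", "FALSE", "NULL", "NA", "Inf", "NaN")
  if (name %in% reserved) {
    stop("Method names must not be a reserved word in R: ", name)
  }
  # Create the generic function only if no function of that name exists.
  if (!exists(name, envir = envir, mode = "function")) {
    generic <- function(...) NULL
    body(generic) <- call("UseMethod", name)
    environment(generic) <- envir
    assign(name, generic, envir = envir)
  }
  # Define the S3 method itself as <name>.<class>.
  assign(paste(name, class, sep = "."), fn, envir = envir)
  invisible(NULL)
}

defineMethodSketch("areaOf", "SquareSketch", function(object, ...) {
  object$side^2
})
sq <- structure(list(side = 3), class = "SquareSketch")
areaOf(sq)    # 9
```

Calling defineMethodSketch("next", "IteratorSketch", ...) stops with an error before anything is defined, which mirrors the RccViolationException shown above.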
12 Problems with generic functions
- Hard to check if a function (generic or not) already exists.
- Ad hoc solutions for creating generic functions automatically.
- Under the S3 schema, it is possible to create generic functions that are truly generic:

  normalize <- function(...) UseMethod("normalize")

  Note that the first argument is omitted. If not, it would be impossible to have default functions with no arguments, e.g. search().
- The R.oo package automatically creates generic functions as above.
- We are not aware of how to do the same in S4 (this is the main reason why R.oo is currently staying with S3).
13 setConstructorS3()
(Does not require the Object class.)
- Defines the constructor method of a class, but also the class.
- RCC
  - Asserts that a correct class name is used: reserved words and names of basic functions that must not be overwritten or redefined are protected.
  - Class and constructor names should start with an UPPER CASE letter.
  - Constructors should be named the same as the class.

  setConstructorS3("MAData", function(M, A, layout=NULL) {
    extend(MicroarrayData(layout=layout), "MAData",
      M = as.matrix(M),
      A = as.matrix(A)
    )
  })

Constructor/class definition hybrid: Creates an object of the super class, which is then extended into an MAData object with additional fields.
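The "create the superclass object, then extend it" pattern can be mimicked with plain S3 lists. The sketch below is an assumption-laden illustration: extendSketch(), MicroarrayDataSketch() and MADataSketch() are hypothetical names, the objects are ordinary lists rather than references, and none of this is R.oo's own extend().

```r
# Hypothetical helper: adds fields and prepends the new class name.
extendSketch <- function(parent, className, ...) {
  fields <- list(...)
  for (name in names(fields)) {
    parent[[name]] <- fields[[name]]
  }
  class(parent) <- c(className, class(parent))
  parent
}

# Hypothetical superclass constructor with a single field.
MicroarrayDataSketch <- function(layout = NULL) {
  structure(list(layout = layout),
            class = c("MicroarrayDataSketch", "ObjectSketch"))
}

# Constructor/class definition hybrid, following the slide's example.
MADataSketch <- function(M, A, layout = NULL) {
  extendSketch(MicroarrayDataSketch(layout = layout), "MADataSketch",
    M = as.matrix(M),
    A = as.matrix(A)
  )
}

ma <- MADataSketch(M = 1:4, A = 5:8)
class(ma)    # "MADataSketch" "MicroarrayDataSketch" "ObjectSketch"
```

Since the class vector lists the superclasses in order, methods defined for MicroarrayDataSketch are found by S3 dispatch when no MADataSketch method exists.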
14 Quick inspection of a class
- print(<class name>) or simply type the class name at the prompt and press ENTER, e.g.

  > MAData
  MAData extends MicroarrayData, Object
    public A
    public layout
    public M
    ...
    normalizeWithinSlide(...)
    ...
    public plot(what="MvsA", ...)
    public plot3d(...)
    public plotPrintorder(what="M", ...)
    ...
    public print(...)
    public save(file=NULL, path=NULL, ...)

[Class diagram: Object, MicroarrayData, Layout and MAData.
 Layout fields: ngrid.c, ngrid.r, nspot.c, nspot.r (all integer).
 MAData fields: A, M (both matrix); MAData methods: as.RGData(): RGData, ..., normalizeWithinSlide(...), normalizeAcrossSlides(...), ...
 Other methods shown: plot(...), plot3d(...), plotPrintorder(...), getName(...): character, getId(...): character, nbrOfSpots(): integer, nbrOfGrids(): integer, ...]
15 Quick inspection of an object
- print(<object>) or simply <object> and ENTER at the prompt, which by default is equal to print(as.character(<object>)), e.g.

  > ma
  [1] "MAData: M (5184x4), A (5184x4), Layout: Grids: 4x4 (16), spots in grids: 18x18 (324), total number of spots: 5184. Spot name's are specified. Spot id's are specified."

- ll(<object>) gives detailed information about the (public) fields, e.g.

  > ll(ma)
    member  data.class      dimension  object.size
  1 A       SpotSlideArray  c(5184,4)  143940
  2 layout  Layout          1          428
  3 M       SpotSlideArray  c(5184,4)  143940

  > ll(ma$layout)   # or ll(getLayout(ma))
     member        data.class  dimension  object.size
  1  geneGrps      NULL        0          0
  2  geneSpotMap   NULL        0          0
  3  id            character   5184       63868
  4  ngrid.c       numeric     1          36
  ...
  11 printtipGrps  NULL        0          0
16 Rdoc: Source-to-Rd converter

  # /**
  # @Class Matlab
  #
  # \title{Matlab client for remote or local Matlab access}
  #
  # \description{
  #   @include "Matlab.declaration.Rdoc"
  # }
  #
  # \usage{matlab <- Matlab(host="localhost", port=9999, remote=FALSE)}
  #
  # \arguments{
  #   \item{host}{Name of host to connect to.
  #     Default value is \code{localhost}.}
  #   \item{port}{Port number on host to connect to.
  #     Default value is \code{9999}.}
  #   \item{remote}{If \code{TRUE}, all data to and from the Matlab
  #     server will be transferred through the socket connection,
  #     otherwise the data will be transferred via a temporary file.
  #     Default value is \code{FALSE}.}
  # }
  #
  # \section{Fields and Methods}{
  #   @include "Matlab.methods.Rdoc"
  #   @include "Matlab.inheritedMethods.Rdoc"
  # }
  #
  # \examples{\dontrun{@include "Matlab.Rex"}}
  #
  # \author{Henrik Bengtsson, \url{http://www.braju.com/R/}}
  #
  # \seealso{
  #   Stand-alone methods \code{\link{readMAT}()} and
  #   \code{\link{writeMAT}()} for reading and writing MAT file structures.
  # }
  #
  # @visibility public
  # */
  setConstructorS3("Matlab", function(host="localhost", port=9999, remote=FALSE) {
    extend(Object(), "Matlab",
    ...

- Rdoc comments are Rd documentation within the source files
  - easy to generate complete Rd files from source files.
  - less risk of forgetting to update Rd files.
  - automatically generates class hierarchy and method lists.
  - extra tags to include external files, e.g. example code.
(Does not require the Object class.)
17 Static methods
- Methods that are specific to a class and do not belong to a certain object.
- Keeps the focus on classes/objects, not methods.
- For instance, static method names are easy to remember for the end user (first class then method), e.g.
  - MicroarrayData$read("slide1.gpr")
  - Sound$read("chime.wav")
  - Colors$getHeatColors(110)
- instead of
  - readMicroarrayData("slide1.gpr")
  - readSound("chime.wav")
  - getHeatColors(110)
- which might not even be unique!
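One way to obtain the Class$method() calling style in base R is to give the constructor function itself a class attribute and let a "$" method look up <method>.<class>. The sketch below rests on several assumptions: the names SoundSketch, ClassSketch and read.SoundSketch are hypothetical, no file is actually read, and R.oo's own mechanism is more elaborate (this only illustrates the calling convention).

```r
# Hypothetical constructor for a toy class.
SoundSketch <- function(name = NULL) {
  structure(list(name = name), class = "SoundSketch")
}

# Mark the constructor as a "class object" and remember its class name.
attr(SoundSketch, "className") <- "SoundSketch"
class(SoundSketch) <- "ClassSketch"

# SoundSketch$read is resolved to the function read.SoundSketch.
"$.ClassSketch" <- function(x, name) {
  get(paste(name, attr(x, "className"), sep = "."), mode = "function")
}

# Hypothetical static method: pretends to read a file and only records its name.
read.SoundSketch <- function(filename, ...) {
  SoundSketch(name = filename)
}

snd <- SoundSketch$read("chime.wav")
class(snd)    # "SoundSketch"
snd$name      # "chime.wav"
```

The class name comes first in the call, so a second class can define its own read method without any clash between function names visible to the end user.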
18 Virtual fields
- Virtual fields are fields that do not exist, but appear to do so because of existing methods get<Field>() and set<Field>().
- Example 1: The virtual field area of the Square class is defined by defining getArea() and setArea().
  - square$area will call getArea(square), which will return the area (calculated from the field side or in some other way).
  - square$area <- -12 will call setArea(square, -12), which then throws an OutOfRangeException.
- Example 2: Private fields, e.g. side, can be protected by defining setSide(), which throws a NoSuchFieldException.
- Example 3: The constant field RED.HUE can be write-protected by defining setRED.HUE(), which throws an AssignmentException.
- Example 4: Provide cached fields that can be calculated from the other fields, but can be cached in case they are accessed often and it takes a long time to calculate them. The cache can be removed in case of low memory.
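Example 1 above can be sketched in base R by letting "$" and "$<-" look for a get<Field>() or set<Field>() method before touching the stored fields. All names below (ShapeSketch, newShapeSketch(), capitalizeSketch()) are hypothetical, and a plain stop() stands in for throwing an OutOfRangeException; the lookup rules in R.oo itself may differ.

```r
# Hypothetical constructor: state lives in an environment, as for references.
newShapeSketch <- function(side = 0) {
  env <- new.env()
  assign("side", side, envir = env)
  structure(NA, .env = env, class = "ShapeSketch")
}

# Hypothetical helper: "area" -> "Area".
capitalizeSketch <- function(s) {
  paste0(toupper(substring(s, 1, 1)), substring(s, 2))
}

# Reading a field: prefer get<Field>() if it exists, else the stored field.
"$.ShapeSketch" <- function(x, name) {
  getter <- paste0("get", capitalizeSketch(name), ".ShapeSketch")
  if (exists(getter, mode = "function")) {
    return(get(getter, mode = "function")(x))
  }
  get(name, envir = attr(x, ".env"))
}

# Writing a field: prefer set<Field>() if it exists, else store the value.
"$<-.ShapeSketch" <- function(x, name, value) {
  setter <- paste0("set", capitalizeSketch(name), ".ShapeSketch")
  if (exists(setter, mode = "function")) {
    get(setter, mode = "function")(x, value)
  } else {
    assign(name, value, envir = attr(x, ".env"))
  }
  invisible(x)
}

# The virtual field "area" is defined only through these two methods.
getArea.ShapeSketch <- function(this) {
  get("side", envir = attr(this, ".env"))^2
}
setArea.ShapeSketch <- function(this, value) {
  if (!is.numeric(value) || value < 0) {
    stop("The area of a square must be zero or greater: ", value)
  }
  assign("side", sqrt(value), envir = attr(this, ".env"))
}

square <- newShapeSketch(side = 3)
square$area          # 9, computed by getArea.ShapeSketch()
square$area <- 16    # handled by setArea.ShapeSketch()
square$side          # 4
```

An assignment such as square$area <- -12 stops with an error inside the setter, so the stored side is left unchanged.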
19 Summary example

  setConstructorS3("Square", function(side=0) {
    # Creates an object of class Square. Square, whose fields are
    # defined at the same time, extends the class Shape.
    extend(Shape(), "Square",
      side = side      # side is public
    )
  })

  setMethodS3("setSide", "Square", function(this, side) {
    # sq$side <- "a" will throw a NonNumericException
    if (!is.numeric(side))
      throw(NonNumericException("Trying to set the side of a square to a non-numeric value: ", side))

    # sq$side <- -12 will throw an OutOfRangeException
    if (side < 0)
      throw(OutOfRangeException("The side of a square must be zero or greater: ", side))

    this$side <- side  # Assignment remains also after returning!
  })
20 Extended exception handling
(Does not require the Object class.)
- Throw Exception objects, which can be caught (quietly) based on class, e.g.

  trycatch({
    # Calls setSide(), which throws an OutOfRangeException.
    sq$side <- -12
  }, NonNumericException = {
    cat("The side of a square must be a numeric value.\n")
  }, ANY = {
    # catches any other types of Exception (also try-error).
    print(Exception$getLastException())
  }, finally = {
    # always double the side whatever happens.
    sq$side <- 2*sq$side
  })

[Class diagram (R.oo): Object, Exception, RccViolationException, OutOfRangeException, NonNumericException.
 Exception methods: static getLastException(): Exception, getMessage(): character, getWhen(): POSIX time, throw().]

  Error 2003-03-08 12:11:43 OutOfRangeException: The side of a square must be zero or greater: -12
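Catching by class can also be illustrated with base R's condition system. In the sketch below, base tryCatch() is used as a stand-in for the package's trycatch(), and newExceptionSketch(), setSideSketch() and trySetSide() are hypothetical helpers; the handler semantics are those of base R, not necessarily those of R.oo.

```r
# Hypothetical helper: builds a condition object with an exception-like class chain.
newExceptionSketch <- function(class, message) {
  structure(
    list(message = message, call = NULL, when = Sys.time()),
    class = c(class, "Exception", "error", "condition")
  )
}

# Hypothetical setter that signals class-specific conditions.
setSideSketch <- function(side) {
  if (!is.numeric(side)) {
    stop(newExceptionSketch("NonNumericException",
      "The side of a square must be a numeric value."))
  }
  if (side < 0) {
    stop(newExceptionSketch("OutOfRangeException",
      paste("The side of a square must be zero or greater:", side)))
  }
  side
}

# The first handler whose class matches the condition is used.
trySetSide <- function(side) {
  tryCatch({
    setSideSketch(side)
    "ok"
  }, NonNumericException = function(ex) {
    "caught NonNumericException"
  }, Exception = function(ex) {
    paste("caught other Exception:", class(ex)[1])
  }, finally = {
    cat("finally: runs whatever happens\n")
  })
}

trySetSide("a")    # "caught NonNumericException"
trySetSide(-12)    # "caught other Exception: OutOfRangeException"
trySetSide(3)      # "ok"
```

The handler for the general class Exception plays the role of ANY in the slide's example, since every exception created by the helper inherits from it.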
21 Future
- Make the API (even) more similar to the S4 API
  - Makes transitions to and from R.oo (and S4) easier.
  - Less confusing for beginners.
- Make an S4 version of the package
  - When the problem that generic functions are too restricted on matching arguments is solved.
- Make it easier to declare private fields or constants.
- Implement the mechanisms for field access in native code.
- Publish R.oo on CRAN
  - Requires a stable API. After 2 years it is indeed very stable, but any major changes after v1.0 will be annoying for the user.
22 Acknowledgements
- The R development team
- People on the r-help mailing list
- All users that have given feedback to the project
- See http://www.maths.lth.se/help/R/ for
  - RCC, more documentation, help, examples, and installation of
  - the R.classes bundle: R.audio, R.base, R.graphics, R.io, R.lang, R.matlab, R.oo, R.tcltk, R.ui,
  - the cDNA microarray package: com.braju.sma.