The R.oo package: Robust object-oriented design

1
The R.oo package: Robust object-oriented design and implementation with support for references
Henrik Bengtsson, hb_at_maths.lth.se
Mathematical Statistics, Centre for Mathematical Sciences, Lund University, Sweden
DSC-2003, Vienna, March 20-22, 2003
2
Outline
  • Purpose and what the package is and is not.
  • RCC: R Coding Conventions (draft).
  • Reference variables.
  • The root class Object.
  • setMethodS3() and setConstructorS3().
  • Rdoc comments.
  • Static methods.
  • Virtual fields.
  • trycatch() - exception handling based on class.

3
Purposes
  • End user (the most important person at the end of the day!)
      • Provide consistent object-oriented APIs across different packages,
        e.g. by having a well-defined naming convention for classes,
        methods, fields and variables.
      • Make class inheritance more explicit.
      • Provide a simpler API, e.g. fewer arguments.
      • More memory-efficient packages.
  • Developer / programmer
      • Provide reference variables to reduce memory requirements and
        data redundancy.
      • R Coding Conventions, e.g. naming conventions.
      • Create generic functions automatically.
      • Make code cleaner and remove the need for tedious code repetition.
      • Minimize the risk of package conflicts.
      • More code checking when creating methods and classes, to catch
        errors early on.
      • Catch rare but classic bugs, e.g. using reserved words in method
        names.
      • Keep help pages up to date with the source code by allowing Rd
        documentation to be placed together with the code in the source
        files.

4
Real world example
    # Read all GenePix Result files
    gpr <- MicroarrayData$read(pattern=".gpr")

    # Extract the foreground and background signals of the red
    # and the green channels. The slide layout is also included.
    raw <- as.RawData(gpr)

    # Get the background corrected signal as M=log(R/G) and A=log(R*G)/2.
    ma <- getSignal(raw, bgSubtract=TRUE)

    normalizeWithinSlide(ma, method="p")   # print-tip normalization.

    knownGenes <- c(50,194,3433,5541,6384)

    # highlight() highlights the data points from the correct slide
    # in the correct space.
    plot(ma);           highlight(ma, knownGenes)
    plotPrintorder(ma); highlight(ma, knownGenes)
    plotSpatial(ma);    highlight(ma, knownGenes)

    plotSpatial3d(gpr, field="area", col=getColor(ma))

    # Write the normalized data to a tab-delimited file
    write(ma, "NormalizedExpressions.dat")
5
What the package is and isn't
  • Is not supposed to replace S3 or S4, but
  • is an extra layer on top of S3 (eventually S4),
    to
  • move the focus from S3 and S4 details to
    object-oriented design and implementation.

[Diagram: R.oo layered on top of the R environment (S3 and eventually S4)]
  • It has been tested and verified for more than 2 years!

6
RCC: R Coding Conventions (draft)
http://www.maths.lth.se/help/R/RCC/
  • Standardizes the coding style.
  • Examples of the naming conventions:
      • Names of variables, objects, fields and methods (which should be
        verbs) start with a lower case letter, e.g. shape$side and
        normalize().
      • Classes should be nouns starting with an upper case letter, e.g.
        MicroarrayData.
      • Constants should be in all upper case, e.g. Colors$RED.HUE.
      • Similar to Java.
  • Standards
      • make the code (and the design) easier to read, share and maintain.
      • reduce the risk of bugs and misunderstandings.

7
Reference variables
  • Memory efficient.
  • Minimizes the amount of redundant data.
  • Very useful for some data structures, e.g.
    graphs.
  • References in R.oo are implemented using the
    environment data type.
  • Collected by the R garbage collector.
  • (More user-friendly method interfaces, since
    methods can communicate with each other by
    updating the state of the object.)
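
The reference behaviour described on this slide can be illustrated in plain R. The following is a minimal sketch, not code from the R.oo package; the names incrementCopy, incrementRef, asList and asEnv are made up for illustration. It contrasts a list, which is copied when modified inside a function, with an environment, which is shared between caller and function.

```r
# A list is copied on modification: the caller's object is unchanged.
incrementCopy <- function(counter) {
  counter$count <- counter$count + 1
  invisible(counter)
}

# An environment is not copied: the caller sees the update.
incrementRef <- function(counter) {
  counter$count <- counter$count + 1
  invisible(counter)
}

asList <- list(count = 0)
incrementCopy(asList)

asEnv <- new.env()
asEnv$count <- 0
incrementRef(asEnv)

cat("list:", asList$count, " environment:", asEnv$count, "\n")
```

After both calls the list still holds 0 while the environment holds 1, which is the property that makes environments usable as reference variables.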

8
A common root class: Object
  • All classes should have the common root class
    Object.
  • A similar idea exists in R today, e.g. print(),
    as.character() etc, but a common root class
    makes it more explicit.

9
Object: the common root class
10
A common root class: Object
  • Fields of an Object can be accessed as elements of a list, e.g.
        square$side
    and
        square$side <- 23
  • Methods can also be called as
        square$getArea()
  • The implementation of reference variables is taken care of within
    the Object class. Under the hood, we roughly have

        "$.Object" <- function(object, name) {
          get(name, envir=attr(object, ".env"))
        }

        "$<-.Object" <- function(object, name, value) {
          assign(name, value, envir=attr(object, ".env"))
          # A replacement function must return the (updated) object.
          invisible(object)
        }
11
setMethodS3()
Does not require the Object class
  • Defines a method of a class.
  • Creates a generic function automatically iff missing.
  • RCC
      • Methods should start with a lower case letter.
      • Asserts that a correct method name is used: reserved words and
        names of basic functions that must not be overwritten or
        redefined are protected.

    setMethodS3("plotPrintorder", "MAData", function(object, ...) {
      ...
    })

    setMethodS3("next", "Iterator", function(object, ...) {
      ...
    })
    Error: [2003-03-18 16:28:00] RccViolationException: Method names must
    not be same as a reserved keyword in R: next, cf.
    http://www.maths.lth.se/help/R/RCC/
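
The kind of name check described on this slide could look roughly as follows in base R. This is only a sketch under assumptions: the helper assertMethodName and the vector reservedWords are hypothetical names, the list of keywords is not taken from R.oo, and plain stop() is used instead of an RccViolationException.

```r
# Reserved words of the R language (see ?Reserved).
reservedWords <- c("if", "else", "repeat", "while", "function", "for",
                   "in", "next", "break", "TRUE", "FALSE", "NULL",
                   "Inf", "NaN", "NA")

# Hypothetical helper: stop if the method name violates the conventions.
assertMethodName <- function(name) {
  if (name %in% reservedWords) {
    stop("Method names must not be same as a reserved keyword in R: ", name)
  }
  if (!grepl("^[a-z]", name)) {
    stop("Method names should start with a lower case letter: ", name)
  }
  invisible(TRUE)
}

assertMethodName("plotPrintorder")                   # passes silently
result <- try(assertMethodName("next"), silent = TRUE)
inherits(result, "try-error")                        # the name was rejected
```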
12
Problems with generic functions
  • Hard to check if function (generic or not)
    already exists.
  • Ad hoc solutions for creating generic function
    automatically.
  • Under the S3 scheme, it is possible to create generic functions
    that are truly generic:
        normalize <- function(...) UseMethod("normalize")
    Note that the first argument is omitted. If it were not, it would be
    impossible to have default functions with no arguments, e.g.
    search().
  • The R.oo package automatically creates generic functions as above.
  • We are not aware of how to do the same in S4 (this is the main
    reason why R.oo is currently staying with S3).
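
A minimal base R sketch of creating such a generic only when it is missing is given below. It is not the setMethodS3() implementation; the helper name defineMethodS3 and the two example methods are assumptions made for illustration.

```r
# Hypothetical helper (not the R.oo API): define an S3 method and create
# the generic function only if no function of that name exists yet.
defineMethodS3 <- function(name, class, definition, envir = parent.frame()) {
  if (!exists(name, envir = envir, mode = "function")) {
    # Build function(...) UseMethod("<name>") with the name filled in.
    generic <- eval(substitute(function(...) UseMethod(NAME),
                               list(NAME = name)))
    environment(generic) <- envir
    assign(name, generic, envir = envir)
  }
  assign(paste(name, class, sep = "."), definition, envir = envir)
  invisible(NULL)
}

defineMethodS3("normalize", "numeric", function(x, ...) x - mean(x))
defineMethodS3("normalize", "default", function(...) "nothing to normalize")

centered <- normalize(c(1, 2, 3))  # dispatches to normalize.numeric
fallback <- normalize()            # no arguments: normalize.default
```

The second call to defineMethodS3() finds the existing generic and only adds the method. The call without arguments shows why the generic takes only `...`: there is no named first argument that must be supplied.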

13
setConstructorS3()
Does not require the Object class
  • Defines the constructor method of a class, but also the class.
  • RCC
      • Asserts that a correct class name is used: reserved words and
        names of basic functions that must not be overwritten or
        redefined are protected.
      • Class and constructor names should start with an UPPER CASE
        letter.
      • Constructors should be named the same as the class.

    setConstructorS3("MAData", function(M, A, layout=NULL) {
      extend(MicroarrayData(layout=layout), "MAData",
        M = as.matrix(M),
        A = as.matrix(A)
      )
    })

Constructor/class definition hybrid: creates an object of the super
class, which is then extended into an MAData object with additional
fields.
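
The "create the super class object, then extend it" pattern can be sketched in base R as follows. This is an illustration only: extendSketch is a hypothetical stand-in for extend(), and it uses plain lists with copy semantics rather than the reference objects of R.oo.

```r
# Hypothetical stand-in for extend(): add fields and prepend the class name.
extendSketch <- function(object, className, ...) {
  fields <- list(...)
  for (name in names(fields)) {
    object[[name]] <- fields[[name]]
  }
  class(object) <- c(className, class(object))
  object
}

# Constructor of the super class.
Shape <- function() {
  structure(list(), class = "Shape")
}

# Constructor of the subclass: builds a Shape and extends it.
Square <- function(side = 0) {
  extendSketch(Shape(), "Square", side = side)
}

sq <- Square(side = 2)
class(sq)               # the subclass comes first, then the super class
inherits(sq, "Shape")
```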
14
Quick inspection of a class
  • print(<class name>) or simply type the class name at the prompt and
    press ENTER, e.g.

    > MAData
    MAData extends MicroarrayData, Object
      public A
      public layout
      public M
      ...
      normalizeWithinSlide(...)
      ...
      public plot(what="MvsA", ...)
      public plot3d(...)
      public plotPrintorder(what="M", ...)
      ...
      public print(...)
      public save(file=NULL, path=NULL, ...)

[Class diagram]
Object
MicroarrayData
  methods: ..., plot(...), plot3d(...), plotPrintorder(...), ...
Layout
  fields:  ngrid.c: integer, ngrid.r: integer, nspot.c: integer,
           nspot.r: integer
  methods: ..., getName(...): character, getId(...): character, ...,
           nbrOfSpots(): integer, nbrOfGrids(): integer, ...
MAData
  fields:  A: matrix, M: matrix
  methods: as.RGData(): RGData, ..., normalizeWithinSlide(...),
           normalizeAcrossSlides(...), ...
15
Quick inspection of an object
  • print(<object>) or simply <object> and ENTER at the prompt, which by
    default is equal to print(as.character(<object>)), e.g.

    > ma
    [1] "MAData: M (5184x4), A (5184x4), Layout: Grids: 4x4 (=16), spots
    in grids: 18x18 (=324), total number of spots: 5184. Spot name's are
    specified. Spot id's are specified."

  • ll(<object>) gives detailed information about the (public) fields,
    e.g.

    > ll(ma)
      member  data.class      dimension  object.size
    1 A       SpotSlideArray  c(5184,4)       143940
    2 layout  Layout          1                  428
    3 M       SpotSlideArray  c(5184,4)       143940

    > ll(ma$layout)   # or ll(getLayout(ma))
       member        data.class  dimension  object.size
    1  geneGrps      NULL                0            0
    2  geneSpotMap   NULL                0            0
    3  id            character        5184        63868
    4  ngrid.c       numeric             1           36
    ...
    11 printtipGrps  NULL                0            0
16
Rdoc: Source-to-Rd converter
Does not require the Object class

    #/**
    # @Class Matlab
    #
    # \title{Matlab client for remote or local Matlab access}
    #
    # \description{
    #   @include "Matlab.declaration.Rdoc"
    # }
    #
    # \usage{
    #   matlab <- Matlab(host="localhost", port=9999, remote=FALSE)
    # }
    #
    # \arguments{
    #   \item{host}{Name of host to connect to.
    #     Default value is \code{localhost}.}
    #   \item{port}{Port number on host to connect to.
    #     Default value is \code{9999}.}
    #   \item{remote}{If \code{TRUE}, all data to and from the Matlab
    #     server will be transferred through the socket connection,
    #     otherwise the data will be transferred via a temporary file.
    #     Default value is \code{FALSE}.}
    # }
    #
    # \section{Fields and Methods}{
    #   @include "Matlab.methods.Rdoc"
    #   @include "Matlab.inheritedMethods.Rdoc"
    # }
    #
    # \examples{\dontrun{@include "Matlab.Rex"}}
    #
    # \author{Henrik Bengtsson, \url{http://www.braju.com/R/}}
    #
    # \seealso{
    #   Stand-alone methods \code{\link{readMAT}()} and
    #   \code{\link{writeMAT}()} for reading and writing MAT file
    #   structures.
    # }
    #
    # @visibility public
    #*/
    setConstructorS3("Matlab", function(host="localhost", port=9999,
                                        remote=FALSE) {
      extend(Object(), "Matlab",
        ...

  • Rdoc comments are Rd documentation within the source files:
      • easy to generate complete Rd files from source files.
      • less risk of forgetting to update Rd files.
      • automatically generates class hierarchy and method lists.
      • extra tags to include external files, e.g. example code.
17
Static methods
  • Methods that are specific to a class and do not belong to a certain
    object.
  • Keeps the focus on classes/objects, not methods.
  • For instance, static method names are easy to remember for the end
    user (first class, then method), e.g.
        MicroarrayData$read("slide1.gpr")
        Sound$read("chime.wav")
        Colors$getHeatColors(1:10)
    instead of
        readMicroarrayData("slide1.gpr")
        readSound("chime.wav")
        getHeatColors(1:10)
    which might not even be unique!
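
The naming benefit can be shown with a small base R sketch in which each "class" is just a list of functions. This is not how R.oo implements static methods, and the function bodies are placeholders that only return a description instead of reading a file.

```r
# Hypothetical stand-ins: each "class" is a plain list holding its own
# read() function, so the two functions cannot clash by name.
Sound <- list(
  read = function(filename) paste("Sound read from", filename)
)

MicroarrayData <- list(
  read = function(filename) paste("Microarray data read from", filename)
)

# Class first, then method: both are called read() without any conflict.
Sound$read("chime.wav")
MicroarrayData$read("slide1.gpr")
```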

18
Virtual fields
  • Virtual fields are fields that do not exist, but appear to do so
    because of existing methods get<Field>() and set<Field>().
  • Example 1: The virtual field area of the Square class is defined by
    defining getArea() and setArea().
      • square$area will call getArea(square), which will return the
        area (calculated from the field side or in some other way).
      • square$area <- -12 will call setArea(square, -12), which then
        throws an OutOfRangeException.
  • Example 2: Private fields, e.g. side, can be protected by defining
    setSide(), which throws a NoSuchFieldException.
  • Example 3: The constant field RED.HUE can be write protected by
    defining setRED.HUE(), which throws an AssignmentException.
  • Example 4: Provide cached fields that can be calculated from the
    other fields, but can be cached in case they are accessed often and
    it takes a long time to calculate them. The cache can be removed in
    case of low memory.
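
A base R sketch of Example 1 is shown below. It is an illustration under assumptions, not the R.oo code: the helper capitalize, the lookup of get<Field>()/set<Field>() by name, and the use of stop() instead of an OutOfRangeException are all choices made for this sketch.

```r
# Hypothetical Square class: the only real field, side, lives in an
# environment attached through the ".env" attribute.
Square <- function(side = 0) {
  object <- NA
  env <- new.env()
  assign("side", side, envir = env)
  attr(object, ".env") <- env
  class(object) <- "Square"
  object
}

capitalize <- function(name) {
  paste0(toupper(substring(name, 1, 1)), substring(name, 2))
}

# Accessors that define the virtual field "area".
getArea <- function(this) attr(this, ".env")$side^2
setArea <- function(this, value) {
  if (value < 0) stop("The area of a square must be zero or greater: ", value)
  assign("side", sqrt(value), envir = attr(this, ".env"))
  invisible(this)
}

# Field access: use get<Field>() if it exists, else the stored field.
"$.Square" <- function(object, name) {
  getter <- paste0("get", capitalize(name))
  if (exists(getter, mode = "function")) {
    return(get(getter, mode = "function")(object))
  }
  get(name, envir = attr(object, ".env"), inherits = FALSE)
}

# Field assignment: use set<Field>() if it exists, else store the value.
"$<-.Square" <- function(object, name, value) {
  setter <- paste0("set", capitalize(name))
  if (exists(setter, mode = "function")) {
    get(setter, mode = "function")(object, value)
  } else {
    assign(name, value, envir = attr(object, ".env"))
  }
  invisible(object)
}

square <- Square(side = 3)
areaBefore <- square$area   # calls getArea(square)
square$area <- 16           # calls setArea(square, 16)
result <- tryCatch({ square$area <- -12; "assigned" },
                   error = function(e) "rejected")
```

Reading square$area never touches a stored value, and the rejected assignment leaves the side unchanged.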

19
Summary example

    setConstructorS3("Square", function(side=0) {
      # Creates an object of class Square. Square, whose fields are
      # defined at the same time, extends the class Shape.
      extend(Shape(), "Square",
        side = side   # side is public
      )
    })

    setMethodS3("setSide", "Square", function(this, side) {
      # sq$side <- "a" will throw a NonNumericException
      if (!is.numeric(side))
        throw(NonNumericException(
          "Trying to set the side of a square to a non-numeric value: ",
          side))
      # sq$side <- -12 will throw an OutOfRangeException
      if (side < 0)
        throw(OutOfRangeException(
          "The side of a square must be zero or greater: ", side))
      this$side <- side   # Assignment remains also after returning!
    })
20
Extended exception handling
Does not require the Object class
  • Throw Exception objects, which can be caught (quietly) based on
    class, e.g.

    trycatch({
      # Calls setSide(), which throws an OutOfRangeException.
      sq$side <- -12
    }, NonNumericException = {
      cat("The side of a square must be a numeric value.\n")
    }, ANY = {
      # catches any other types of Exception (also try-error).
      print(Exception$getLastException())
    }, finally = {
      # always double the side whatever happens.
      sq$side <- 2*sq$side
    })

[Class diagram, package R.oo]
Object
  Exception
    RccViolationException
    OutOfRangeException
    NonNumericException
Exception methods: static getLastException(): Exception,
getMessage(): character, getWhen(): POSIX time, throw()

    Error: [2003-03-08 12:11:43] OutOfRangeException: The side of a
    square must be zero or greater: -12
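
For comparison, catching by class can be sketched with the condition system of base R and its tryCatch() function. This is not the trycatch() of R.oo: newException is a hypothetical constructor, the exception class names are borrowed from the slide only as labels, and the square is a plain list rather than a reference object.

```r
# Hypothetical constructor for a classed error condition.
newException <- function(class, message) {
  structure(
    list(message = message, call = NULL),
    class = c(class, "Exception", "error", "condition")
  )
}

setSide <- function(square, side) {
  if (!is.numeric(side)) {
    stop(newException("NonNumericException", "The side must be numeric."))
  }
  if (side < 0) {
    stop(newException("OutOfRangeException",
                      "The side must be zero or greater."))
  }
  square$side <- side
  square
}

sq <- list(side = 1)
caught <- tryCatch({
  sq <- setSide(sq, -12)      # signals an OutOfRangeException
  "no exception"
}, NonNumericException = function(e) {
  "non-numeric side"
}, Exception = function(e) {
  class(e)[1]                 # any other Exception ends up here
}, finally = {
  sq$side <- 2 * sq$side      # always double the side
})
```

The first handler whose class matches is used, so the out-of-range condition skips the NonNumericException handler and is caught by the more general Exception handler; the finally expression runs in either case.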
21
Future
  • Make the API (even) more similar to the S4 API.
      • Makes transitions between R.oo and S4 easier.
      • Less confusing for beginners.
  • Make an S4 version of the package
      • once the problem that generic functions are too restricted on
        matching arguments is solved.
  • Make it easier to declare private fields or constants.
  • Implement the mechanisms for field access in native code.
  • Publish R.oo on CRAN.
      • Requires a stable API. After 2 years it is indeed very stable,
        but any major changes after v1.0 will be annoying for the user.

22
Acknowledgements
  • The R development team
  • People on the r-help mailing list
  • All users that have given feedback to the project
  • See http://www.maths.lth.se/help/R/ for RCC, more documentation,
    help, examples, and installation of the R.classes bundle (R.audio,
    R.base, R.graphics, R.io, R.lang, R.matlab, R.oo, R.tcltk, R.ui) and
    the cDNA microarray package com.braju.sma.