Title: J2ME: Design, Performance and Efficiency With WSDD
1 J2ME: Design, Performance and Efficiency With WSDD
- Randy Faust
- Embedded Java Activist
- IBM OTI Labs Phoenix
- IBM OTI Labs Zürich
2 Past, Present, and (near) Future.
- What's happened so far?
- Why have people failed?
- Where do we go from here?
- What to consider.
- Best practices.
- Tools and Tuning.
- Questions?
3 What's Happened So Far.
- False starts
- Personal Java, Embedded Java, Micro Java, Pico Java
- IBM builds J9.
- IBM invents custom class libraries
- Extreme, Core, Max
- IBM is on its 9th embedded Java release
- VAES 1.0, VAME 1.0-1.4, WSDD 4.0-5.5
- J2ME is created
- Configurations, profiles, WME, WCE
- Java powered and TCK
4 Why Have People Failed?
- Not enough hardware. 5 developers 5 boards
- Little or no in-field development.
- Not enough testing throughout the project lifecycle.
- Failing to appreciate system constraints.
- Embedded guys trying to do Java.
- Java guys trying to do embedded.
5 Why Have People Failed?
- Having insufficient knowledge of the OS.
- Believing WORA in the embedded space.
- Trying to extend the good-enough approach.
- Creating too many threads.
- Forgetting that the Java heap isn't the only memory required. The VM is an application which also uses memory. How much depends a good deal on how many classes you load.
6 Where Do We Go From Here?
- Greater use of Java in the embedded space.
- More and better choices for embedded graphics.
- Faster VMs.
- Faster chips, cheaper memory, lower power.
- Plugins for WSDD from Eclipse or WSWB enabled 3rd
parties.
7 What to consider.
- OS, Processor, Tools (self hosted or X-compiled, debugger)
- Hardware configuration
- memory, processor speed, interfaces (serial, ethernet, MOST, flash)
- Java feature set
- Class libs, Frameworks, UI, big Int, zip support, compressed zips, verification, one-shot or command-line, JIT, AOT
- Development cycle turn-around time
- time = generate + build + download + run
- Available documentation and support
8 Best Practices (This is all a lie)
- Choose the right parts to code in Java. Device drivers are usually not good candidates.
- Beware Vector.
- Fastest is ArrayList, although it's unsynchronized. Stack just extends Vector.
- Be smart about GCs. Tune the app, and ask for GCs yourself.
- Reuse existing objects. Take the effort to reinitialize their values.
- Take into account the connectivity of your platform. Don't forget the background traffic use cases.
- Know the fastest possible startup time, before you run a single line of code.
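The "reuse existing objects" advice above can be sketched roughly as follows. The `SensorReading` class, its fields, and the `set` method are hypothetical names invented for this illustration. The only point is that one instance is reinitialized on each pass instead of a new one being allocated.

```java
// Hypothetical example: reuse one mutable object instead of allocating per iteration.
final class SensorReading {
    int channel;
    long timestamp;
    int value;

    // Reinitialize every field so no stale state leaks between uses.
    SensorReading set(int channel, long timestamp, int value) {
        this.channel = channel;
        this.timestamp = timestamp;
        this.value = value;
        return this;
    }
}

final class ReuseDemo {
    // Sums the values of `count` simulated readings using a single instance.
    static long sumReadings(int count) {
        SensorReading reading = new SensorReading(); // allocated once
        long sum = 0;
        for (int i = 0; i < count; i++) {
            reading.set(i % 4, i, i * 2); // reinitialized, not reallocated
            sum += reading.value;
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println(sumReadings(1000));
    }
}
```

Reuse only pays off if every field is reset. A forgotten field carries stale data into the next use, which is the effort the slide warns about.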
9 Best Practices (This is also a lie)
- Avoid method calls as loop termination criteria.
- for (int i = 0; i < str.length(); i++)
- Avoid synchronization
- JIT may be able to remove sync blocks, but don't count on it
- Avoid monster objects.
- Avoid finally blocks.
- Avoid String concatenation. Use StringBuffer objects instead.
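Two of the points above, the loop termination test and StringBuffer, can be illustrated with a small sketch. The class and method names (`LoopAndStringDemo`, `countChar`, `join`) are made up for this example. Whether hoisting the bound actually helps depends on the VM and JIT in use.

```java
final class LoopAndStringDemo {
    // Counts occurrences of c, calling length() once rather than on every pass.
    static int countChar(String str, char c) {
        int count = 0;
        for (int i = 0, len = str.length(); i < len; i++) {
            if (str.charAt(i) == c) {
                count++;
            }
        }
        return count;
    }

    // Joins the parts with commas using one StringBuffer instead of repeated '+'.
    static String join(String[] parts) {
        StringBuffer buf = new StringBuffer();
        for (int i = 0; i < parts.length; i++) {
            if (i > 0) {
                buf.append(',');
            }
            buf.append(parts[i]);
        }
        return buf.toString();
    }

    public static void main(String[] args) {
        System.out.println(countChar("embedded", 'e'));
        System.out.println(join(new String[] {"a", "b", "c"}));
    }
}
```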
10 Best Practices (Yet more lies)
- The amount of code is bytecodes, not the size of your source.
- Message sends are costly. In-line small methods, getters/setters, use JIT in-lining.
- Always ask yourself, "Does this make sense? Is it easy to understand? Could we make it simpler?"
- Bigger methods take a longer time to inline and cost more memory. Small methods JIT faster, but large methods give you more value when they're JITted.
- Don't optimize too early. Optimizing 5% usually yields most of the potential improvement.
- Use Lazy Initialization.
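Lazy initialization, the last point above, might look something like this. `LazyTable` and its lookup table are hypothetical, and the sketch assumes a single caller thread (it is deliberately unsynchronized, in line with the earlier advice to avoid synchronization where possible).

```java
final class LazyTable {
    private int[] squares;       // not built until first use
    private int buildCount = 0;  // instrumentation for the demo only

    // Returns the table, building it on the first call only.
    // Not thread-safe: assumes a single caller thread.
    int[] squares() {
        if (squares == null) {
            squares = new int[256];
            for (int i = 0; i < squares.length; i++) {
                squares[i] = i * i;
            }
            buildCount++;
        }
        return squares;
    }

    int buildCount() {
        return buildCount;
    }

    public static void main(String[] args) {
        LazyTable table = new LazyTable();
        System.out.println(table.squares()[12]);
        System.out.println(table.buildCount());
    }
}
```

The cost of building the table moves from startup to first use, which helps startup time but adds a one-time delay later.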
11 Best Practices (Some truth here)
- Erase code.
- Manage your resources carefully.
- Version control your use cases and requirements.
- Hard writing makes for easy reading.
- The easiest way is probably the best way.
- Save time for certification.
- Objects die young.
12 Tools and Tuning: The J9 VM
- Bottom-up design, Cleanroom
- JDK 1.3 compliant
- Fast interpreter written in assembler using register-based calling convention
- Multi-VM, JVMP, JDWP, INL
- Pluggable class libraries
- Mixed-mode execution AOT/JIT/interpreter
- New GC: accurate, incremental, compacting
- Statically or dynamically linked
- Native threads, widgets and memory management
- Thin port layer to isolate use of OS
13 Tools and Tuning: Configuring the VM
- Three configuration levels
- At runtime using command options
- changing default memory settings or garbage collector characteristics
- JIT
- At integration time by selecting the desired set of modules
- removing shared objects (math, dbg, zip, jxe, etc.)
- At build time by changing the VM-OS code or Application code
- Jxe
- AOT
- changing the thread model or scheduler
14 Tools and Tuning: Jxes
- Java Executable. Split representation of Java classes
- WHY?
- Organized more thoroughly than .class files. Future JSR-202?
- Reduces memory footprint
- Faster execution as class loading time is reduced
- Allows sharing of class libraries, user code
- Multi-VM size reductions for servers and handheld
- No file system or network required
- Allows Execution in Place (XIP) from Flash
- Classes loaded on demand, not all at once
15 Tools and Tuning: VM Memory Tuning
-Xmca<x>  Set RAM class segment increment to <x>
-Xmco<x>  Set ROM class segment increment to <x>
-Xmx<x>   Set memory maximum to <x>
16 Tools and Tuning: The Garbage Collector
-Xmn<x>           Set new space size to <x>
-Xms<x>, -Xmo<x>  Set old space size to <x>
-Xmoi<x>          Set old space increment to <x>
-Xmx<x>           Set maximum memory to <x>
-Xmr<x>           Set remembered set size to <x>
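Values for options such as -Xmx are easier to choose with measurements from the running application. The following is a minimal sketch that uses only the standard `Runtime.totalMemory()` and `Runtime.freeMemory()` calls. The `HeapProbe` class name and the 256 KB allocation are assumptions made for the example. It reports Java heap only, not the memory the VM itself uses.

```java
final class HeapProbe {
    // Bytes of Java heap currently in use, as reported by the VM.
    static long usedHeapBytes() {
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    public static void main(String[] args) {
        long before = usedHeapBytes();
        byte[] block = new byte[256 * 1024]; // simulated working set
        long after = usedHeapBytes();
        System.out.println("used before: " + before + " bytes");
        System.out.println("used after:  " + after + " bytes");
        System.out.println("block length: " + block.length);
    }
}
```

The numbers are approximate, since a collection can run between the two samples. Logging them at known points in a use case still gives a starting figure for heap sizing.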
17 Tools and Tuning: JIT and AOT
- JIT-compiled code is better than AOT-compiled code because it uses runtime profiling information.
- AOT is typically more useful when you have prior information about the runtime characteristics of the application.
- Adaptive recompilation is possible on certain platforms.
- Code is never un-JITed. There's almost never a reason to do so.
- AOT and JIT can be mixed in jxe files.
18 Tools and Tuning: JIT and AOT
- Use a limit file after profiling when you know what you want to JIT.
- Big methods take a longer time to compile or inline and will use more memory.
- High JIT optimization levels are often best left for benchmarks.
- X-compiler for x86.
- Dumptrucks only go a little faster with turbo.
19 Tools and Tuning: Smartlinker
- Preprocesses Java classes into target executable form (jxe)
- Splits code into RAM/ROM format
- Reduces code by removing unused classes, methods, fields
- Creates jxe file in target platform endian and addressing
- Supports XIP by prelocating jxe for ROM addressing
20 Tools and Tuning: Smartlinker
- Package jxe with preverification info
- Precompile methods
- Segment the jxe
- Obfuscate method and class names
- Specify the startup class if applicable
- Create component jxes
21 Tools and Tuning: MicroAnalyzer
- Capture and timestamp key events with low overhead (5%)
- Target or Workstation triggering and tracing options
- Measure time between user events
- Measure memory usage
- See context switch events between threads
22 Tools and Tuning: Remote Debugging
- Implements the 1.4.0 JDWP wire protocol, including hot code replace
- Runs over a proxy to save space on the target
- -verbose
- class, jni, gc, dynload, stack, debug
- -memorycheck
- all: Lots of checking, slow
- quick: Minimal checking, fairly fast
- nofree: Don't free any of the memory
- failat X: Fail allocation X
- skipto X: Only start memory checking at X
- Use a friendlier platform
23 Questions?