Title: JAVA Performance Tuning
1JAVA Performance Tuning
- Dean Andreakis
- 02/19/2001
2Scope
- Performance enhancements from a Java application
developer's perspective
- What is not covered: hardware improvements (more
memory, faster CPU, faster disk or network I/O),
Java VM improvements (e.g. specific VM versions,
JIT, HotSpot), optimizing compilers, etc.
3What to Tune?
- Three resources typically limit applications: CPU
speed and availability, system memory, and
disk/network I/O
- In general, the first step in tuning is to
determine which of these is causing your
application to run slowly
- If your application is CPU-bound, concentrate
your efforts on the code: inefficient algorithms,
too many short-lived objects, etc.
- If your application exceeds system memory limits,
check whether too many objects (or a few large
objects) are being erroneously held in memory.
Also check whether too many large arrays are being
allocated.
- If external data access is slowing your
application, examine the parts of your code
that interface with disk or network I/O and take
appropriate action
4Tuning Strategy
- Measure performance via benchmarks, profilers,
instrumented code
- Identify bottlenecks
- Create a hypothesis for the cause of the bottleneck
- Consider factors that may refute your hypothesis
- Create a test to isolate the factor identified by
your hypothesis
- Test the hypothesis
- Alter the application to reduce the bottleneck
- Test that the alteration improves the performance
of your application
- Repeat from the first step (measure)
5Perceived Performance
- If the user interface of an application provides
feedback and allows the user to carry on with other
tasks, or to abort and start another function, then
the user typically sees this as a responsive
interface and will not consider the application
as slow as he/she otherwise might
- Threading to appear quicker: An application can
be designed to anticipate what the user may want
to do, using a low-priority background thread while
the user is reacting to the application interface,
so that pre-calculated results are ready to
assist the user immediately. Ensuring that the
application stays responsive to the user even
while other functions are being executed also
makes the program seem fast.
Threading is the key to these methods and can be
used to ensure that a service is available and
unblocked when needed.
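As a rough illustration of the threading idea above, the sketch below starts an anticipated calculation on a low-priority daemon thread and only blocks if the result is requested before it is ready. The class name BackgroundPrecompute and the summing workload are hypothetical stand-ins, not part of the original slides.

```java
import java.util.concurrent.CountDownLatch;

// Hypothetical sketch: precompute a result in the background while the
// user is busy with the interface.
class BackgroundPrecompute {
    private volatile long result;
    private final CountDownLatch done = new CountDownLatch(1);

    // Starts the anticipated work without blocking the calling (UI) thread.
    void start(final int n) {
        Thread worker = new Thread(() -> {
            long sum = 0;
            for (int i = 1; i <= n; i++) {
                sum += i; // stand-in for an expensive calculation
            }
            result = sum;
            done.countDown();
        }, "precompute");
        worker.setPriority(Thread.MIN_PRIORITY); // keep the UI thread favoured
        worker.setDaemon(true);                  // do not keep the VM alive
        worker.start();
    }

    // Returns the precomputed value, waiting only if it is not ready yet.
    long get() {
        try {
            done.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return result;
    }

    public static void main(String[] args) {
        BackgroundPrecompute p = new BackgroundPrecompute();
        p.start(1000);
        // ... the user reads the screen here ...
        System.out.println(p.get());
    }
}
```

Whether a low thread priority actually helps depends on the VM and operating system scheduler, so this is worth measuring rather than assuming.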
6Perceived Performance
- Streaming to appear quicker: For activities
that take a long time and for which some results
can be returned more quickly than others, an
application can display the first set of results
as soon as they are available. This
gives the user the impression of a much more
responsive system. Ex: Web browsers that display
partial page results while images etc. are still
being downloaded.
- Caching to appear quicker: Use a local cache when
the same piece of data would otherwise have to be
moved from one place to another repeatedly to
calculate results. Ex: Web browsers that keep a
local copy of a downloaded page so that subsequent
requests do not necessarily have to perform
network I/O again to retrieve the page.
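The caching bullet above can be sketched with a small in-memory map placed in front of a slow fetch. PageCache and slowFetch are hypothetical names; slowFetch merely simulates the network I/O that a real application would perform.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of a local cache in front of a slow data source.
class PageCache {
    private final Map<String, String> cache = new HashMap<String, String>();
    private int fetches = 0;

    // Stand-in for slow network or disk I/O (an assumption for this sketch).
    private String slowFetch(String url) {
        fetches++;
        return "<html>content of " + url + "</html>";
    }

    // Returns the cached copy if present, otherwise fetches and stores it.
    synchronized String get(String url) {
        String page = cache.get(url);
        if (page == null) {
            page = slowFetch(url);
            cache.put(url, page);
        }
        return page;
    }

    // Number of times the slow path was actually taken.
    synchronized int fetchCount() {
        return fetches;
    }
}
```

This sketch never evicts entries, so a real cache would also need a size limit or expiry policy to avoid the memory problems described earlier.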
7Profiling
- Instrument your code: You can get fairly accurate
elapsed time data by calling
System.currentTimeMillis() at various points in your program.
- Use the -verbose:gc option with the VM to print out
garbage collector statistics, such as time and
space values for objects reclaimed and space
recycled, as the reclaims occur. This can give you
an idea of how often the garbage collector is
running (a very expensive operation!).
- Use the JDK profiler for object profiling by
running your program with the -Xrunhprof option
(-Xrunprof for HotSpot VM). This produces a file
called java.hprof.txt that contains profile data.
The profile data shows object creation statistics
from the heap at the time the output is written.
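For the instrumentation bullet above, a minimal sketch of timing a section of code with System.currentTimeMillis() might look like the following. ElapsedTimer and the workload in main are hypothetical.

```java
// Hypothetical sketch: wrap a task with System.currentTimeMillis() calls.
class ElapsedTimer {
    // Runs the task once and returns the elapsed wall-clock time in ms.
    static long timeMillis(Runnable task) {
        long start = System.currentTimeMillis();
        task.run();
        return System.currentTimeMillis() - start;
    }

    public static void main(String[] args) {
        long elapsed = timeMillis(() -> {
            StringBuffer sb = new StringBuffer();
            for (int i = 0; i < 100000; i++) {
                sb.append(i); // stand-in for the code being measured
            }
        });
        System.out.println("elapsed ms: " + elapsed);
    }
}
```

The clock resolution can be several milliseconds on some platforms, so short sections are usually best measured over many repetitions. Running the same class with java -verbose:gc ElapsedTimer also shows whether garbage collections occurred during the timed section.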
8Profiling
- Use the JDK profiler for method profiling by
running your program with the
-Xrunhprof:cpu=samples,thread=y <classname> option
(-Xrunprof for HotSpot VM). This provides
information on the amount of time spent in each
method accessed for that particular run of the
program. Sample output:
    CPU SAMPLES BEGIN (total = 2294) Sat Feb 17 07:04:57 2001
    rank   self  accum   count trace method
       1 15.21% 15.21%     349   274 sun/awt/windows/WToolkit.eventLoop
       2 13.95% 29.16%     320    73 SymantecJITCompilationThread.DoCompileMethod
       3  2.18% 31.34%      50    36 java/lang/ClassLoader.findBootstrapClass
       4  2.01% 33.35%      46   696 sun/awt/windows/WGraphics.W32LockViewResources
       5  1.26% 34.61%      29   611 sun/awt/font/NativeFontWrapper.getFontMetrics
9Profiling
- You can monitor gross memory usage via the
java.lang.Runtime.totalMemory() and
java.lang.Runtime.freeMemory() methods.
totalMemory() returns a long representing the
number of bytes currently allocated to the
runtime system for this particular VM process.
freeMemory() returns a long representing the
number of bytes available to the VM to create
objects from the section of memory it controls.
This increases after garbage collection and
decreases each time an object is created.
- Client/Server or distributed applications can use
system-level tools such as snoop, netstat, and ndd
on Solaris to monitor communications traffic.
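The Runtime methods described above can be combined into a rough "bytes in use" figure. The sketch below is illustrative only; MemoryProbe and the allocation sizes are hypothetical, and the reported numbers vary between VMs and runs.

```java
// Hypothetical sketch: approximate heap usage from Runtime statistics.
class MemoryProbe {
    // Bytes allocated to the VM minus bytes still free for new objects.
    static long usedBytes() {
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    public static void main(String[] args) {
        long before = usedBytes();
        byte[][] blocks = new byte[64][];
        for (int i = 0; i < blocks.length; i++) {
            blocks[i] = new byte[64 * 1024]; // allocate roughly 4 MB in total
        }
        long after = usedBytes();
        System.out.println("used before: " + before + " bytes");
        System.out.println("used after:  " + after + " bytes");
        System.out.println("blocks held: " + blocks.length); // keeps the arrays reachable
    }
}
```

Because the garbage collector may run at any point, the difference between two readings is only a coarse indication of what was allocated in between.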
10Object Creation
- Avoid creating objects in frequently used
routines. Can primitives be used in these
routines, or can singleton objects be used instead
of creating and destroying many short-lived
objects?
- Employ an object reuse strategy where
appropriate. Container objects such as Vectors,
Hashtables etc. can be reused rather than created
and thrown away. Of course, while you are not using
the retained objects you are holding on to more
memory than necessary, but this needs to be
balanced against the need for performance in the
particular case.
11Object Creation
Code Ex:
    // In your program
    public static VectorPoolManager vpm = new VectorPoolManager(25);

    public void somemethod() {
        Vector v = vpm.getVector();
        // ... use v ...
        vpm.returnVector(v);
    }

    // VectorPoolManager.java
    import java.util.Vector;

    public class VectorPoolManager {
        Vector[] pool;
        boolean[] inUse;

        public VectorPoolManager(int initialPoolSize) {
            pool = new Vector[initialPoolSize];
            inUse = new boolean[initialPoolSize];
            for (int i = pool.length - 1; i >= 0; i--) {
                pool[i] = new Vector();
                inUse[i] = false;
            }
        }

        public synchronized Vector getVector() { /* code to return a vector from the pool */ }
        public synchronized void returnVector(Vector v) { /* code to return a vector object to the pool */ }
    }
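The slide leaves the bodies of getVector() and returnVector() out. One possible way to fill them in is sketched below under a different, hypothetical name (SimpleVectorPool) to make clear that this is not the author's implementation. Handing out a fresh, unpooled Vector when the pool is exhausted is an assumption of this sketch; growing the pool would be another option.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Vector;

// Hypothetical sketch of a fixed-size pool of reusable Vector objects.
class SimpleVectorPool {
    private final List<Vector<Object>> pool = new ArrayList<Vector<Object>>();
    private final boolean[] inUse;

    SimpleVectorPool(int initialPoolSize) {
        inUse = new boolean[initialPoolSize];
        for (int i = 0; i < initialPoolSize; i++) {
            pool.add(new Vector<Object>());
        }
    }

    // Hands out the first free pooled Vector and marks it as in use.
    synchronized Vector<Object> getVector() {
        for (int i = 0; i < inUse.length; i++) {
            if (!inUse[i]) {
                inUse[i] = true;
                return pool.get(i);
            }
        }
        // Assumption: when the pool is exhausted, fall back to a new Vector.
        return new Vector<Object>();
    }

    // Clears a pooled Vector and marks it free; other Vectors are ignored.
    synchronized void returnVector(Vector<Object> v) {
        for (int i = 0; i < inUse.length; i++) {
            if (pool.get(i) == v) {
                v.clear();
                inUse[i] = false;
                return;
            }
        }
    }
}
```

Callers must not keep using a Vector after returning it, since the next caller of getVector() may receive the same instance.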
12Object Creation
- Use StringBuffer objects in place of String
objects when the string will be altered at
runtime via concatenation operators,
re-assignments etc. String objects are immutable,
so any change to them actually creates
another String object in memory. StringBuffer
objects do not create new instances when altered
at runtime.
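A small side-by-side sketch of the point above: both methods build the same text, but the String version creates a new String object on each pass through the loop, while the StringBuffer version appends to one buffer. ConcatDemo and its method names are hypothetical.

```java
// Hypothetical sketch comparing String concatenation with StringBuffer.
class ConcatDemo {
    // Each += produces a new String object holding the longer text.
    static String joinWithString(int n) {
        String s = "";
        for (int i = 0; i < n; i++) {
            s += i;
        }
        return s;
    }

    // A single StringBuffer is extended in place; one String is made at the end.
    static String joinWithBuffer(int n) {
        StringBuffer sb = new StringBuffer();
        for (int i = 0; i < n; i++) {
            sb.append(i);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(joinWithString(5)); // prints 01234
        System.out.println(joinWithBuffer(5)); // prints 01234
    }
}
```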
13Loops Switches
- Array access has more overhead than temporary
variable access because the VM performs
bounds-checking for array element access. Do the
array access once outside a loop rather than
repeating it for each loop iteration. Code Ex:
    for (int i = 0; i < Repeat; i++) countArr[0] += 10;
- Optimized:
    count = countArr[0];
    for (int i = 0; i < Repeat; i++) count += 10;
    countArr[0] = count;
14Loops Switches
- Switch statements in Java are compiled into one
of two bytecodes (tableswitch or lookupswitch)
depending on the case value ranges. The optimized
version (tableswitch) is used when the case values
are all contiguous. Some compilers will optimize
by inserting dummy case values to make a case
value set contiguous, but if you find a switch to
be a bottleneck in your program, don't leave it up
to the compiler: go ahead and design the switch so
that the case values are contiguous.
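To make the distinction above concrete, the sketch below shows the same mapping written with contiguous and with widely spaced case values. SwitchDemo and the values are hypothetical. A typical compiler is expected to emit tableswitch for the first method and lookupswitch for the second, which can be checked by disassembling the class with javap -c; the exact choice is up to the compiler.

```java
// Hypothetical sketch: contiguous versus sparse case values.
class SwitchDemo {
    // Case values 0..3 are contiguous, which suits a jump table.
    static String contiguous(int code) {
        switch (code) {
            case 0: return "zero";
            case 1: return "one";
            case 2: return "two";
            case 3: return "three";
            default: return "other";
        }
    }

    // Case values are far apart, so a key lookup is the likely encoding.
    static String sparse(int code) {
        switch (code) {
            case 10: return "zero";
            case 200: return "one";
            case 3000: return "two";
            case 40000: return "three";
            default: return "other";
        }
    }
}
```

If the sparse codes come from outside the program, mapping them once to small contiguous integers before the hot loop is one way to get the contiguous form.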
15Additional Areas
- File and Network I/O utilizing caching and
buffering techniques
- Using more efficient sorting algorithms
- Utilizing thread pools, analyzing and fixing
thread deadlock and race conditions, etc.
- Optimizations for distributed applications
including message reduction, caching, data
compression, application partitioning, etc.
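As one small illustration of the buffering item in the list above, the sketch below reads a file one byte at a time through a BufferedInputStream, so that most read() calls are served from memory rather than from the underlying file. BufferedReadDemo, the 8192-byte buffer size and the temporary file are assumptions made for this example.

```java
import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;

// Hypothetical sketch: buffered file I/O.
class BufferedReadDemo {
    // Counts the bytes in a file, reading through an 8 KB buffer.
    static long countBytes(File f) throws IOException {
        try (InputStream in = new BufferedInputStream(new FileInputStream(f), 8192)) {
            long n = 0;
            while (in.read() != -1) {
                n++;
            }
            return n;
        }
    }

    // Writes 10000 bytes to a temporary file, counts them, then deletes the file.
    static long demo() {
        try {
            File f = File.createTempFile("buffered-demo", ".bin");
            try {
                try (OutputStream out = new BufferedOutputStream(new FileOutputStream(f))) {
                    for (int i = 0; i < 10000; i++) {
                        out.write(i & 0xFF);
                    }
                }
                return countBytes(f);
            } finally {
                f.delete();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints 10000
    }
}
```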
16Further Resources
- Java Performance Tuning by Jack Shirazi
(O'Reilly)
- General performance tuning information at
www.java.sun.com, www.alphaworks.ibm.com
- Commercial profilers such as OptimizeIt!
(www.optimizeit.com) or JProbe (www.klgroup.com)