The Java Virtual Machine Internal Architecture and Function

Provided by: davidg64

1
The Java Virtual Machine Internal Architecture
and Function

Catalin Constantin

2
Contents

Overview
The Architecture
Class Loader Subsystem
Method Area
Method Tables
The Heap
Object and Class Data Representation
Object Representation
Local Variables Representation
Resolution, Exceptions, and Abrupt Method
Completion
Execution Engine and Execution Techniques
The Instruction Set
Native methods interaction
Execution Order and Optimizations
Summary

3
Overview

The Java Virtual Machine is the environment for
running Java programs. It is called a virtual
machine because it is an abstract computer
defined by a specification.
The term can be understood in three ways
Abstract Specification
Concrete Implementation
Runtime Instance
Each Java application runs in its own virtual
machine instance and has exclusive access to the
structures that instance creates to support its
runtime.

4
The Architecture

JVM architecture is based on subsystems, memory
areas, data types and instructions, organized as
Class Loader Subsystem: mechanism for loading
types (classes and interfaces) given fully
qualified names.
Execution Engine: mechanism responsible for
executing the instructions contained in the
methods of loaded classes.
Runtime Data Areas: organized memory used to
store bytecodes loaded from class files, object
instances, method parameters, return values,
local variables and intermediate results.
Native Method Interface: not always publicly
available to the programmer. Some implementations
hide this part, while others emphasize it to make
code optimization more feasible.
Note: There are no general-purpose registers;
instead the JVM uses a stack to simulate
register-like operations

6
Class Loader Subsystem

Responsible for
Loading: finding and importing binary data for
each type
Linking: verification, preparation, resolution
Initializing: invoking Java code that performs
initialization
Two kinds of loaders
Bootstrap class loader: part of the virtual
machine implementation
User-defined class loader: part of the running
application
Classes loaded by each class loader are placed in
separate namespaces
Loaders must be able to recognize and load
classes stored in files that conform to the Java
class file format

Bootstrap class loader
Loads trusted classes, including the Java API
Is unique and has its own namespace
User-defined class loader
Is not necessarily unique
Inherits four gateway methods into the JVM from
java.lang.ClassLoader. One of them,
resolveClass(), accepts a reference to a Class
object on the heap and causes the type it
represents to be linked
By using namespaces, the JVM can load multiple
types with the same fully qualified name through
different loaders
When it resolves symbolic references from one
class to another, it requests the referenced
class from the same loader that loaded the
referencing class
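
As a small illustration of the two kinds of loaders, the sketch below asks a type which loader defined it. It assumes an implementation in which Class.getClassLoader() returns null for types defined by the bootstrap class loader (the API documentation allows this, and HotSpot behaves this way). The class name LoaderSketch and its helper method are hypothetical.

```java
class LoaderSketch {
    // True when the type was defined by the bootstrap class loader,
    // assuming the implementation represents that loader as null.
    static boolean loadedByBootstrap(Class<?> type) {
        return type.getClassLoader() == null;
    }

    public static void main(String[] args) {
        // A Java API class: defined by the bootstrap loader.
        System.out.println(loadedByBootstrap(String.class));
        // An application class: defined by a loader that is itself a Java object.
        System.out.println(loadedByBootstrap(LoaderSketch.class));
    }
}
```

The second call reports a loader object rather than null because application classes are loaded by a loader that lives in the running application, as the slide describes.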

8
Method Area

Properties
Complex data area
Stores information about each loaded type
The code of an object's methods is kept in the
method area, not in the heap along with the
object's instance data
Shared among all running threads, so access must
be thread safe: when two threads request the same
type, only one of them actually loads the type
while the other waits
Not fixed in size
Can be garbage collected
Here collection means unloading types that are
no longer referenced, which is slightly different
from collecting unreferenced objects

Contents
Basic Information
Fully qualified name
Relationship to the superclass
Type modifiers (public, abstract, final, etc.)
Advanced Information
Constant Pool
Ordered set of constants used by the type:
literals and symbolic references to types, fields
and methods. It plays a major role in dynamic
linking. Entries are referenced by index, much
like elements of an array
Field Information
Method Information
Static variables
Static variables of a loaded class must
retain changes across multiple calls. Because
related classes are in the same namespace, they
share one copy of each static variable, so later
accesses observe earlier updates
References to ClassLoader and Class

10
Method Tables

A method table is an array of direct
references to all the instance methods that may
be invoked on a class instance, including the
inherited methods.
Properties
Allows the virtual machine quick access to
instance methods
Each instantiated object will have a reference to
the method table associated with the class
In conjunction with information stored in the
heap, plays an important role in dynamic linking
and polymorphism
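
The effect of a method table can be observed from Java code, even though the table itself is internal to the virtual machine. The sketch below is only an illustration with hypothetical class names (DispatchSketch, Shape, Circle): an inherited method calls an overridden one, and the version that runs is chosen from the actual class of the object, not from the declared type of the reference.

```java
class DispatchSketch {
    static class Shape {
        String name() { return "shape"; }
        // Inherited by Circle; the call to name() is dispatched on the actual class.
        String describe() { return "a " + name(); }
    }

    static class Circle extends Shape {
        @Override
        String name() { return "circle"; }
    }

    public static void main(String[] args) {
        Shape s = new Circle();            // declared type Shape, actual class Circle
        System.out.println(s.describe());  // prints "a circle"
    }
}
```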

11
The Heap

Is the location where all class instances and
arrays (which are also viewed as objects) are
instantiated.
Properties
One common heap for each running instance of the
JVM
The JVM has an instruction for allocating space
on the heap, but has no explicit instruction for
de-allocating space. Just as an object cannot be
freed in Java code, it cannot be freed explicitly
in virtual machine code either
The garbage collector is solely responsible for
eliminating unreferenced objects from the heap

12
Object and Class Data Representation

Object Representation
The JVM specification is not strict about object
representation
Given an object instance, the JVM must be able to
quickly locate the instance data and the class
data
Memory allocated for an object in the heap must
contain a pointer into the method area, where the
class data is stored.
The two most important models are presented next
Arrays are represented as objects
Local variables representation
Local variables are stored in the Java stack
frame associated with each method invocation
Each running thread gets its own Java stack, and
each method invocation pushes a frame onto the
stack of the thread in which it is called
Parameters are passed through the Java stack
The storage size is one entry for int, float,
reference and returnAddress
The storage size is two entries for long and
double, which are addressed by the index of the
first entry
Note: Variables of type byte, short and char
are stored as int on the Java stack. The boolean
type is not directly supported by the JVM
instruction set; it is translated into int

13
Object Representation (model 1)

Divide the heap into two parts: the handle pool
and the object pool
An object reference is a native pointer to a
handle pool entry
Each handle pool entry has two components
A pointer to instance data (in the object pool)
A pointer to class data (in the method area)
Advantage
Makes it easy to combat fragmentation. When an
object is moved, only one pointer (the one in its
handle) needs to be changed
Disadvantage
Every access to instance data requires the
virtual machine to dereference two pointers:
one to the handle and another one to the data
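
The handle model can be imitated with a toy data structure. This is a sketch under simplifying assumptions, not how any real virtual machine is written: array indexes stand in for native pointers, each "object" has a single int field, and all names (HandlePoolSketch, Handle, allocate, read, move) are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

class HandlePoolSketch {
    // One handle: where the instance data lives now, plus the class data.
    static final class Handle {
        int dataAddress;           // index into objectPool (stands in for a pointer)
        final Class<?> classData;  // stands in for a pointer into the method area

        Handle(int dataAddress, Class<?> classData) {
            this.dataAddress = dataAddress;
            this.classData = classData;
        }
    }

    final int[] objectPool = new int[16];            // toy object pool
    final List<Handle> handlePool = new ArrayList<>();

    // Stores the value at the given address and returns an object reference,
    // which here is simply the index of the new handle.
    int allocate(int address, int value, Class<?> type) {
        objectPool[address] = value;
        handlePool.add(new Handle(address, type));
        return handlePool.size() - 1;
    }

    int read(int reference) {
        Handle handle = handlePool.get(reference);   // first dereference
        return objectPool[handle.dataAddress];       // second dereference
    }

    // Moving the data only requires updating the single pointer in the handle;
    // every object reference held elsewhere stays valid.
    void move(int reference, int newAddress) {
        Handle handle = handlePool.get(reference);
        objectPool[newAddress] = objectPool[handle.dataAddress];
        handle.dataAddress = newAddress;
    }

    public static void main(String[] args) {
        HandlePoolSketch heap = new HandlePoolSketch();
        int ref = heap.allocate(10, 42, Integer.class);
        heap.move(ref, 0);
        System.out.println(heap.read(ref));  // prints 42
    }
}
```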

15
Object Representation (model 2)

An object reference is a native pointer to a
bundle of data that contains the object instance
data and a pointer to class data
Advantage
Only one dereference is needed to reach the
instance data from the object reference.
Disadvantage
Moving objects to prevent fragmentation becomes
more complicated. When the Java virtual machine
moves an object within the heap, it must update
every reference to that object anywhere in the
runtime data areas.

17

Object Representation and Casts
One important reason the JVM needs to get from an
object reference to the class data is to check
attempted casts
It must check whether the type being cast to is
Either the actual type of the object: the cast
is allowed immediately
Or one of its ancestors: the check involves
walking the superclasses up the class
inheritance tree
Note
Earlier we noted that the class data records the
relationship to the superclass. Imagine a cast
this way. The Java virtual machine attempts a
cast. If the object's real type is the same as
the type being cast to, the cast is allowed
immediately. If the two do not match, the Java
virtual machine follows the reference to the
superclass and checks again, and so on up to the
Object class. A successful cast corresponds to a
direct path up the class tree from the actual
type to the cast type.
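
The walk up the class tree described in the note can be sketched with reflection. This is an illustration only: it ignores interfaces and array types, which a real cast check must also handle, and the names CastSketch and castAllowed are hypothetical.

```java
class CastSketch {
    // Follows the superclass chain of the object's actual class and reports
    // whether the target class lies on that path. Interfaces are ignored.
    static boolean castAllowed(Object obj, Class<?> target) {
        for (Class<?> c = obj.getClass(); c != null; c = c.getSuperclass()) {
            if (c == target) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        Object n = Integer.valueOf(7);
        System.out.println(castAllowed(n, Integer.class)); // true: the actual type
        System.out.println(castAllowed(n, Number.class));  // true: the direct superclass
        System.out.println(castAllowed(n, Object.class));  // true: the root of the tree
        System.out.println(castAllowed(n, String.class));  // false: not on the path
    }
}
```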

18
Most Common Model (2)
19
Similarities and Differences: Java vs. C++

Java object representation is somewhat similar to
the VTBL structure in C++.
In Java, objects are represented by instance
data and a pointer to class data (and implicitly
the method table)
In C++, objects are represented by instance
data and an array of pointers to any virtual
functions that can be invoked on the object
The main difference between Java objects and C++
objects is that, while in C++ functions are not
virtual by default, in Java instance methods
behave as virtual by default.
If Java adopted the same layout as the C++ VTBL,
it would need to store (redundantly) pointers to
all instance methods
Java can accomplish the same results by storing
only one pointer to the class data
This hurts performance (by dereferencing twice),
but it saves memory.

20
Array Representation

Java arrays are objects: they are stored in the
heap, and they are associated with a class type
Example: A one-dimensional array of int
elements and a two-dimensional array of int
elements have different class types. Symbolically
they are represented as [I and [[I. A
two-dimensional array of objects would be
symbolically represented as [[Ljava.lang.Object;
Multidimensional arrays are represented as arrays
of arrays, so an element of such an array is
itself an array and can be assigned or cast to a
compatible array type
The length of an array (or of any of its
dimensions) does not determine the type of the
array; it is only instance data (a field)
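
The symbolic class names of arrays can be printed directly with Class.getName(). The short sketch below (the class name ArrayTypeSketch and its helper are hypothetical) also shows that two arrays of different lengths share one class.

```java
class ArrayTypeSketch {
    static String typeName(Object array) {
        return array.getClass().getName();
    }

    public static void main(String[] args) {
        System.out.println(typeName(new int[3]));        // [I
        System.out.println(typeName(new int[2][5]));     // [[I
        System.out.println(typeName(new Object[1][1]));  // [[Ljava.lang.Object;
        // Length is instance data, not part of the type.
        System.out.println(new int[3].getClass() == new int[300].getClass()); // true
    }
}
```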

21
Array Representation
22
Local Variables Representation

Local variables can be laid out in any order
by the compiler inside the Java stack frame
associated with a method
Some locations on the stack can be reused once
the local variables that occupied them go out of
scope.
Parameters are also passed using the Java stack,
and they are placed in the order in which they
are declared, from left to right
There is one important difference between the
Java stack frames of class (static) methods and
instance methods
An instance method holds in the first entry of
its local variables a reference corresponding to
the hidden this, used to access the instance data
in the heap associated with the invoking object

class Example {
    public static int runClassMethod(int i, long l,
            float f, double d, Object o, byte b) {
        return 0;
    }
    public int runInstanceMethod(char c, double d,
            short s, boolean b) {
        return 0;
    }
}
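
To make the slot layout concrete, the sketch below counts how many local variable entries the parameters of the two example methods occupy, applying the rules stated earlier: two entries for long and double, one entry for everything else, plus one entry for the hidden this in an instance method. The class SlotSketch and its helpers are hypothetical; the two method signatures are copied from the example.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;

class SlotSketch {
    static int runClassMethod(int i, long l, float f, double d, Object o, byte b) {
        return 0;
    }

    int runInstanceMethod(char c, double d, short s, boolean b) {
        return 0;
    }

    // Number of local variable entries taken by the receiver and the parameters.
    static int parameterSlots(Method m) {
        int slots = Modifier.isStatic(m.getModifiers()) ? 0 : 1; // entry 0 holds this
        for (Class<?> p : m.getParameterTypes()) {
            slots += (p == long.class || p == double.class) ? 2 : 1;
        }
        return slots;
    }

    // Looks up one of the methods above by name (names are unique in this class).
    static int slotsOf(String methodName) {
        for (Method m : SlotSketch.class.getDeclaredMethods()) {
            if (m.getName().equals(methodName)) {
                return parameterSlots(m);
            }
        }
        throw new IllegalArgumentException("no such method: " + methodName);
    }

    public static void main(String[] args) {
        System.out.println(slotsOf("runClassMethod"));    // 8
        System.out.println(slotsOf("runInstanceMethod")); // 6
    }
}
```

For the static method the entries are int (0), long (1-2), float (3), double (4-5), Object (6) and byte (7). For the instance method they are this (0), char (1), double (2-3), short (4) and boolean (5).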

24
Resolution, Exceptions, and Abrupt Method
Completion

Resolution
The references to types, fields and methods in
the constant pool are initially symbolic.
When the JVM first needs to use one of them, it
is still in symbolic form, and the virtual
machine must resolve it
Resolution is performed using data from the
Method Area together with information obtained
from the class loaders
Exceptions
The JVM uses exception tables to handle exceptions
Exception table entries describe ranges within
the bytecode of a method that are protected by a
handler for a certain exception type
Entries contain a starting and an ending point,
and also a pointer to the exception handler
Abrupt Method Completion
Every exception that is not caught inside a
method causes that method to complete abruptly
The JVM uses the Java frame data in the
processing of abrupt method completion to restore
the stack of the invoking method and propagate
the exception to it; if no handler is found
anywhere on the stack, the running thread
terminates
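
The exception table lookup described above can be sketched as a search over entries. This is a toy model, not the virtual machine's actual data structure: the Entry class, its field names and the sample addresses are hypothetical, and the range is treated as start inclusive, end exclusive, which is how the class file format defines it.

```java
import java.util.Arrays;
import java.util.List;

class ExceptionTableSketch {
    // One entry: protects the bytecode range [startPc, endPc) for catchType.
    static final class Entry {
        final int startPc;
        final int endPc;
        final int handlerPc;
        final Class<? extends Throwable> catchType;

        Entry(int startPc, int endPc, int handlerPc, Class<? extends Throwable> catchType) {
            this.startPc = startPc;
            this.endPc = endPc;
            this.handlerPc = handlerPc;
            this.catchType = catchType;
        }
    }

    // Returns the handler address for an exception thrown at pc,
    // or -1 when no entry matches (the method then completes abruptly).
    static int findHandler(List<Entry> table, int pc, Throwable thrown) {
        for (Entry e : table) {
            boolean inRange = pc >= e.startPc && pc < e.endPc;
            if (inRange && e.catchType.isInstance(thrown)) {
                return e.handlerPc;
            }
        }
        return -1;
    }

    // Made-up table: an inner range for ArithmeticException, an outer one
    // for any RuntimeException.
    static List<Entry> sampleTable() {
        return Arrays.asList(
                new Entry(0, 8, 20, ArithmeticException.class),
                new Entry(0, 16, 30, RuntimeException.class));
    }

    public static void main(String[] args) {
        List<Entry> table = sampleTable();
        System.out.println(findHandler(table, 4, new ArithmeticException()));    // 20
        System.out.println(findHandler(table, 12, new IllegalStateException())); // 30
        System.out.println(findHandler(table, 12, new Exception()));             // -1
    }
}
```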

25
Execution Engines and Execution Techniques

The Execution Engine is part of the core of any
JVM. Its specification consists of the
instruction set and describes what an
implementation should do, not how it should do
it.
Possible implementations can interpret,
just-in-time compile, natively execute, or use a
combination of these
Each thread of a running Java application is a
distinct instance of the execution engine in
action
Important aspects
The Instruction Set
Native Method Interaction
Execution Order and Optimization

26
The Instruction Set

Each instruction is a one-byte opcode followed by
zero or more operands
The opcode indicates the operation and the
operands supply the data needed by the JVM to
complete the operation. The number of operands
an instruction takes is implied by the opcode
itself
The execution engine processes one opcode at a
time
When running, the execution engine has direct
access to the current constant pool, current
frame, and current operand stack
The operand stack is part of the Java stack,
organized as an array of words, accessed solely
by push and pop operations, and used as a
workspace to perform stack based register-like
operations.
All instructions in the JVM are associated with
mnemonics. The listing of a class file can
produce an assembly-like language file
To be able to understand how the JVM works, we
can look inside a class file using the javap
program distributed with any Java 2 SDK
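
A toy interpreter helps show how one-byte opcodes, operands and the operand stack fit together. The sketch below handles only six instructions and is not a real execution engine; the opcode values are the ones the JVM instruction set assigns to these mnemonics, while the class StackMachineSketch, the run method and the sample program are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.Deque;

class StackMachineSketch {
    static final int BIPUSH = 0x10, ILOAD_0 = 0x1a, ILOAD_1 = 0x1b,
                     IADD = 0x60, IMUL = 0x68, IRETURN = 0xac;

    // Executes the bytecode one opcode at a time against a local variable
    // array and an operand stack, and returns the value passed to ireturn.
    static int run(byte[] code, int[] locals) {
        Deque<Integer> stack = new ArrayDeque<>();
        int pc = 0;
        while (true) {
            int opcode = code[pc++] & 0xff;  // opcodes are unsigned bytes
            switch (opcode) {
                case BIPUSH:  stack.push((int) code[pc++]); break; // one operand: a signed byte
                case ILOAD_0: stack.push(locals[0]); break;
                case ILOAD_1: stack.push(locals[1]); break;
                case IADD:    stack.push(stack.pop() + stack.pop()); break;
                case IMUL:    stack.push(stack.pop() * stack.pop()); break;
                case IRETURN: return stack.pop();
                default: throw new IllegalStateException("unsupported opcode " + opcode);
            }
        }
    }

    // Bytecode for "return (a + b) * 2", with a and b in local variables 0 and 1.
    static int sumTimesTwo(int a, int b) {
        byte[] code = { ILOAD_0, ILOAD_1, IADD, BIPUSH, 2, IMUL, (byte) IRETURN };
        return run(code, new int[] { a, b });
    }

    public static void main(String[] args) {
        System.out.println(sumTimesTwo(3, 4)); // prints 14
    }
}
```

Only bipush carries an operand here; the other instructions find everything they need in the local variables or on the operand stack.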

27
Example A class method, primitive types
28
Example B class method, object types
29
Example C instance method, primitive types
30
Native Methods Interaction

The execution engine may be asked to invoke a
native method
Depending on the implementation, the virtual
machine may or may not be able to invoke
native methods
Implementations that allow it provide an
interface, the Java Native Interface (JNI). The
execution engine must be able to invoke a native
method, wait idle until the native method
returns, and then continue the execution of
bytecodes. It must also be able to deal with
exceptions that come from the native method
This adds a layer of complexity to the execution
scheme, because the native methods themselves
need to be able to access information in the JVM
while running native code

31
Execution Order and Optimizations

Execution Order
Execution engines are responsible for determining
the next instruction to be executed
Generally the flow is straightforward: most
instructions are executed in order
Instructions like goto and return use data to
specify the next instruction
The only abnormal paths of execution arise from
exception handling
Optimizations
Interpretation: first-generation JVMs
Just-in-time compilation: second-generation JVMs
Adaptive optimization: the contemporary trend
Native execution: a form of JIT and adaptive
optimization

32
Adaptive Optimization

Implemented by most modern versions of the JVM,
like Sun's HotSpot virtual machine
Pure interpretation and pure just-in-time
compilation are both too extreme when applied in
absolute terms
A purely interpreted program will be slow at
runtime, but it does not take extra time to get
started
JIT compilation allows for fast execution, but
would delay the beginning of execution by the
time needed to completely compile the bytecode to
native code
In Adaptive Optimization, the JVM takes advantage
of information available at runtime and attempts
to combine the bytecode interpretation with
compilation to native code.

Based on a well-known observation:
most programs spend 80 to 90 percent of their
time executing 10 to 20 percent of the code
The JVM
Begins by interpreting the bytecodes
Monitors execution of that code
Identifies the hot spots of the code and starts a
background thread to compile them to native
code
Avoids premature optimization, which is typical
of static compilers
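
The monitoring step can be pictured as a per-method invocation counter. The sketch below is a deliberately simplified model, not HotSpot's actual mechanism: the class HotSpotSketch, the invoke method and the threshold of 3 are all hypothetical, and compilation is reduced to adding the method name to a set.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

class HotSpotSketch {
    // Hypothetical threshold, kept tiny so the effect is easy to see.
    static final int THRESHOLD = 3;

    private final Map<String, Integer> invocationCounts = new HashMap<>();
    private final Set<String> compiled = new HashSet<>();

    // Records one invocation and reports how that call would be executed.
    String invoke(String method) {
        if (compiled.contains(method)) {
            return "native";
        }
        int count = invocationCounts.merge(method, 1, Integer::sum);
        if (count >= THRESHOLD) {
            compiled.add(method);  // stand-in for compilation on a background thread
        }
        return "interpreted";
    }

    public static void main(String[] args) {
        HotSpotSketch vm = new HotSpotSketch();
        for (int i = 0; i < 4; i++) {
            System.out.println(vm.invoke("hotLoop"));  // three times interpreted, then native
        }
        System.out.println(vm.invoke("rarelyUsed"));   // interpreted
    }
}
```

Code that runs rarely never pays the compilation cost, while the small hot portion of the program ends up running as native code.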

Too good to be true? Correct!
There are some issues with this, and depending
on how well these issues are dealt with, one
implementation can greatly differ in performance
from another
Known Issues
Adaptive optimization does not work well across
method invocations.
Inlining? This can have issues too once
polymorphism is involved
Moving in and out of the hot spot

35
Summary
36
Practice

Play with javap to determine
A) The assembly-like listing of a compiled class,
using javap -c
B) The method signatures (public and protected)
of a compiled class, using javap -s
C) The complete profile (including the constant
pool) of a compiled class, using javap -verbose
