
Annotating Java Bytecode

Project Report
308-621 Optimizing Compilers
McGill University
April 2000

Patrice Pominville
9745398

Abstract

The emergence of a new class of highly dynamic, object-oriented programming languages presents new challenges to the established field of compiler optimization. With the advent of Java and its popularity, there is now a great incentive for addressing these issues. This paper describes how the runtime performance of Java could benefit from annotating Java code by means of classfile attributes, and details how support for this feature has been added to the Soot framework.

∗ A complementary web site to this paper can be found at http://www.cs.mcgill.ca/~patrice/cs621

1 Introduction

Java is a clean, dynamic object-oriented language that is compiled into classfiles that contain bytecode. Bytecode is a high-level, platform-independent program representation that is interpreted by Java Virtual Machines (JVMs). Performing optimizations at the classfile level has distinct advantages with respect to optimizing JVMs:

• Classfiles are portable across all platforms having JVM implementations; optimizing classfiles has the potential of improving performance across all these JVMs.

• Classfile optimizations are performed statically and need only be done once; dynamic on-the-fly JVM optimizations incur runtime costs and must be repeated for each program run.

Hence optimizing classfiles is highly desirable; unfortunately, although significant gains can be achieved by performing such optimizations [1], this approach is inherently limited by the nature of bytecode. Bytecode, as a secure, semantically rich, platform-independent abstraction, is a much higher-level code representation than native machine code; typically one bytecode instruction will map to several native machine instructions. As a result many traditional compiler optimizations cannot be expressed at the bytecode level and thus cannot be generated statically by an ahead-of-time compiler.

On another front, because of the nature of the Java runtime there is an opportunity for significant dynamic, profile-based optimizations. However at present such optimizations are only done by high-performance JVM implementations, as there is no standard way to express such information in bytecode.

As the previous comments should have made clear, there is an information gap between the static analysis phase and the Java runtime, which consequently imposes a heavy performance burden on JVM implementations. What is needed is a mechanism to bridge this gap while preserving all the characteristics of the Java platform as well as the heavy software engineering investments made in current JVMs. One solution to this problem is to make use of classfile attributes. Classfile attributes are a very flexible mechanism to attach arbitrary information to classfiles. Such attributes can be user defined and attached to classfiles without altering their semantics: a JVM implementation will safely ignore attributes it does not understand.

2 Classfile Attributes

The de facto file format for Java bytecode is the classfile format [11]. Built into this format is the notion of attributes. Classfile attributes are simple data structures present inside classfiles having the format specified by Figure 1.

    attribute_info {
        u2 attribute_name_index;
        u4 attribute_length;
        u1 info[attribute_length];
    }

Figure 1: Classfile Attribute Data Structure

In Figure 1, attribute_name_index is a 2-byte unsigned integer value corresponding to the index of the attribute’s name in the classfile’s Constant Pool, attribute_length is a 4-byte unsigned integer specifying the length of the attribute’s data, and info is an array of attribute_length bytes that contains the actual uninterpreted raw attribute data.

A handful of attributes are already defined as part of the Java Virtual Machine specification [11]; in particular the actual bytecode for a non-abstract, non-native method is contained in the Code classfile attribute. However the power of the attribute mechanism lies in the fact that arbitrary attributes can be user defined, as long as their names do not clash with those of the standard attributes. Adding custom attributes will not break compatibility with existing JVMs, as these will simply ignore unknown attributes without altering the semantics of the given classfile.

Attributes can be associated with 4 different structures within a classfile. In particular classfiles have one class_info structure as well as method_info and field_info structures for each of the class’ methods and fields respectively. Each of these 3 structures contains an attribute table which can hold an arbitrary number of attribute_info structures. Each non-native, non-abstract method’s attribute table will contain a unique Code attribute to hold the method’s bytecode. This Code attribute itself has an attribute table of its own, which can contain a few standard attributes and arbitrary custom attributes.

2.1 Possible Uses for Attributes

As hinted at in the previous section, the nature of performance-enhancing classfile attributes could be dual: attributes could convey either static compiler analysis information or execution profile information. We shall examine both these application domains presently.

2.1.1 Static Compiler Analysis Information

As stated previously, many traditional compiler optimizations cannot be fully expressed in bytecode because of its abstract high-level nature; however by defining custom classfile attributes it is possible to get around this problem. The following is a non-exhaustive list of possible applications for such attributes.

Register Allocation: For each method, register allocation can be abstracted and computed statically. This is definitely a promising avenue for attributes and has already been shown effective by 2 research groups [2] [3].

Array Bound Checks: The Java Language mandates that each array access be checked to be valid (within the array’s bounds). This is done by JVMs at runtime; however as previous work has shown, many such accesses can be shown valid at compile time. These could be annotated by means of attributes.

Null Pointer Checks: The Java Language mandates that each object reference be determined non-null before it is dereferenced. Although this is often achieved by hardware traps which incur no runtime performance overhead, some architectures could benefit from a nullness attribute.

Stack Allocation of Objects: In bytecode there is only one instruction for allocating memory for objects, and the memory is always allocated on the heap. However certain static analyses such as escape analysis can determine that some objects can be safely allocated on the stack, thus potentially reducing some memory management runtime overhead [4].

Runtime Static Method Binding: Virtual call sites that cannot be safely resolved statically can often be resolved dynamically if certain conditions hold. These conditions could be supplied as annotations.

Parallel Computations: In a method, if two regions of code, or method calls, are deemed to be independent then this could be annotated and the JVM would be given a chance to execute them in parallel.

Exception Handling as Control Flow: Some applications use exception handling intensively as a control flow mechanism. Providing such a hint could allow a JVM to use a different exception handling mechanism tailored to this type of situation. The expected target of an exception handler could be specified, and given certain constraints, the exception handler could be called directly from the catch clause.

2.1.2 Profile Information

The most successful paradigm to date for squeezing speed out of Java has been for JVMs to tune their execution according to the execution profile of a program by natively compiling the most heavily used pieces of code. Hence JIT-enabled JVMs typically compile only the most executed methods based on their dynamic appraisal of a specific run. This has yielded significant speed improvements over purely interpreted bytecode, and much effort is still being invested in this approach, as the Java HotSpot project attests [12].

The drawback with this approach is that an overly heavy burden is placed on JVMs, which must dynamically profile the execution of the code they run and make optimization decisions on the fly. All this incurs a runtime cost and can lead to highly complex and buggy JITs. A solution would be to use attributes to provide such profiling information gathered on previous, ahead-of-time runs of a program. Given that most programs present a similar execution profile from run to run, such information could ease the burden of JIT implementors and reduce runtime profiling cost. The following is a non-exhaustive list of possible applications for such attributes.

Hot Methods: Methods could be given a hotness rating. This could provide useful hints for JITs in helping them decide which methods to compile.

Persistent Objects: Allocation sites could be given a persistence rating based on the expected lifetime of an allocated object. This could provide a useful hint to high-performance garbage collectors on how to efficiently manage object allocation.

Garbage Collection: There is no garbage collector that is optimal for all programs. For some programs generational garbage collection is a big plus, while for others it is overkill or the generational allocation assumption simply does not hold. Based on profiling, annotations could be produced to specify which type of GC would be best for a program. A JVM could then use this hint to select an appropriate GC at runtime.

Branch Prediction Annotation: Annotating which bytecode branches are most frequently taken could help a JVM produce more efficient native code.

Hot Data: Based on profiling, if certain objects are often accessed as a group, this information could be conveyed to the JVM for better memory allocation and data locality. This attribute could provide a virtual memory map for the JVM.

2.2 Annotation Issues

In deciding what to annotate and in designing specifications for custom attributes, many issues must be considered. First, it is highly desirable that the annotations be compatible with Java’s execution model. That is, annotations should be platform independent and ideally should not compromise the verifiability of classfiles. The latter requirement might mandate special considerations while designing an attribute scheme. For example in [3]’s virtual register allocation scheme, virtual registers are monotyped to ensure speedy verifiability at runtime. Luckily, for many of the annotation ideas presented previously, verifiability is a non-issue: for example most profile annotations can simply be viewed as ’hints’ given to the runtime.

In designing annotations it is also important to take into account the classfile bloat that will result from these. This could have negative performance effects due to network bandwidth and caches; furthermore, the time needed to process the annotations themselves could overshadow any possible gains that could be obtained from these. Finally, annotations should be general enough so that they provide benefits for most programs on most architectures.

3 Attribute Support in Soot

Soot [8] is an object-oriented bytecode analysis and optimization framework implemented in Java and developed by the Sable Compiler Research Group at McGill University [7]. In the context of this project, we have extended the framework to support the embedding of custom, user-defined attributes in classfiles. The Soot framework enables one to easily define and implement various compiler analyses. The added attribute support, as presented in this section, enables the implementation of a wider range of such analyses, encompassing those whose results cannot be expressed directly in bytecode.
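Since all of the support described in this section ultimately produces attribute_info records (Figure 1), a minimal Java sketch of that layout may be useful as a reference point. It writes one record and walks a table of records, stepping over any record whose name index it does not recognize by using attribute_length; this is the property that lets existing JVMs ignore custom attributes. The class and method names are hypothetical and are not part of Soot or of any JVM. java.nio.ByteBuffer is used only because its default byte order is big-endian, as in the classfile format; lengths above 2^31 - 1 are not handled.

```java
import java.nio.ByteBuffer;

/** Illustrative sketch only; names are hypothetical, not Soot API. */
final class AttributeInfoSketch {

    /** Serializes one attribute_info record: u2 name index, u4 length, raw data. */
    static byte[] write(int nameIndex, byte[] info) {
        ByteBuffer buf = ByteBuffer.allocate(2 + 4 + info.length); // big-endian by default
        buf.putShort((short) nameIndex); // u2 attribute_name_index
        buf.putInt(info.length);         // u4 attribute_length
        buf.put(info);                   // u1 info[attribute_length]
        return buf.array();
    }

    /**
     * Walks 'count' consecutive records in 'table' and returns the data of the
     * first one whose name index equals 'wantedNameIndex', or null if none does.
     * Records with any other name index are skipped without being interpreted.
     */
    static byte[] find(byte[] table, int count, int wantedNameIndex) {
        ByteBuffer buf = ByteBuffer.wrap(table);
        for (int i = 0; i < count; i++) {
            int nameIndex = buf.getShort() & 0xFFFF; // read u2 as unsigned
            int length = buf.getInt();               // assumes length fits in an int
            if (nameIndex == wantedNameIndex) {
                byte[] info = new byte[length];
                buf.get(info);
                return info;
            }
            buf.position(buf.position() + length);   // unknown attribute: skip its data
        }
        return null;
    }

    public static void main(String[] args) {
        byte[] record = write(7, new byte[] {1, 2, 3});
        System.out.println("record size in bytes: " + record.length);
    }
}
```

In a real classfile the name index would be resolved through the Constant Pool to a string such as the attribute's name; the sketch compares raw indices only to keep the example self-contained.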
    public interface Host
    {
        public List getTags();
        public Tag getTag(String aName);
        public void addTag(Tag t);
        public void removeTag(String name);
        public boolean hasTag(String aName);
    }

Figure 2: The Host Interface

    public interface Tag
    {
        public String getName();
        public byte[] getEncoding();
        public String toString();
        public void setValue(byte[] value);
    }

Figure 3: The Tag Interface

3.1 The Host and Tag Interfaces

Attribute support in Soot has been achieved by adding two key interfaces: Host and Tag. Hosts are objects that can hold Tags; conversely Tags are objects that can be attached to Hosts. These interfaces are listed in Figures 2 and 3. There are 4 Soot classes that implement the Host interface; these are SootClass, SootField, SootMethod and Unit, the latter of which is Soot’s abstract notion of a bytecode instruction. Application-specific implementations of Tag can be created and attached to these Hosts by implementors of Soot-based analyses. As can be easily inferred, there is a natural mapping between these 4 Soot classes and the attribute architecture present in classfiles as described in Section 2. Tags attached to a SootClass will be compiled into an entry in the attribute table of the corresponding class, and similarly for methods and fields. Tags attached to Soot Units will be compiled into entries of the Unit’s method’s Code attribute table, along with the bytecode Program Counter (PC) of the specific instruction they index.

3.2 Producing Annotated Classfiles

The process of translating Tags held by Soot Hosts into actual classfile attributes is now detailed. In the current state of affairs Soot produces Jasmin code [9] for its processed classfiles. The Jasmin code is then transformed by Jasmin into actual classfiles. To enable attribute support, we have extended the Jasmin syntax with 4 new directives: .class_attribute, .method_attribute, .field_attribute and .code_attribute. These all have the following format:

    attribute_directive attribute_name attribute_value

where attribute_name corresponds to the attribute’s name that will be stored in the classfile’s ConstantPool and attribute_value is the actual value of the raw byte array for the attribute, which is encoded in Base64 in order to maintain the textual format of Jasmin code. Our custom Jasmin version will compile the appropriate attribute in the resulting classfile from these triples, translating the attribute values in Base64 back to a raw byte array. There is a peculiarity to this scheme: for .code_attribute, Jasmin will replace the first 2 bytes of the attribute’s data by the PC of the instruction it is referring to, as Soot currently lacks a mechanism for abstracting the Program Counter for a method’s bytecode (see Section 5.1). Hence at present, all Soot-generated Code attributes in classfiles start with a 2-byte PC index that specifies an instruction context.

As expected, each Tag attached to a SootClass will generate a corresponding .class_attribute directive in the Jasmin code, and similarly SootField attributes translate to .field_attribute directives, SootMethod attributes to .method_attribute directives and Unit attributes to .code_attribute directives.

These directives must be produced in Jasmin code at specific locations:

.class_attribute: These must be found immediately before the class’ field declarations.

.field_attribute: These must be found immediately after the field declaration they relate to.

.method_attribute: These must be found immediately after the method declaration they relate to.

.code_attribute: These must be found immediately after the instruction they relate to.

Sample Jasmin code embedding .code_attribute directives is given in Figure 4.

3.3 Auxiliary Support in Soot for Tags

Several utility classes and interfaces have been added to Soot to provide additional support for Tags. An overview of these is now given.
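Before turning to these utilities, the directive format of Section 3.2 can be made concrete with a short Java sketch. It formats a directive line from a raw byte array and decodes the Base64 value back, mirroring the round trip described for the custom Jasmin version. The class and method names are hypothetical, and java.util.Base64 is a modern standard-library class used here for illustration; it is an assumption that it matches the Base64 variant used by Soot and Jasmin. The three bytes 0x00 0x00 0x01 encode to AAAB, the value that appears in Figure 4; under the scheme described above, the two leading zero bytes would be the placeholder that Jasmin overwrites with the PC.

```java
import java.util.Base64;

/** Illustrative sketch only; names are hypothetical, not Soot or Jasmin API. */
final class JasminDirectiveSketch {

    /**
     * Builds a directive line of the form
     *   .<kind>_attribute <attribute_name> <Base64 of value>
     * where kind is one of "class", "field", "method" or "code".
     */
    static String directive(String kind, String attributeName, byte[] value) {
        String encoded = Base64.getEncoder().encodeToString(value);
        return "." + kind + "_attribute " + attributeName + " " + encoded;
    }

    /** Recovers the raw byte array from the third field of a directive line. */
    static byte[] decodeValue(String directiveLine) {
        String[] fields = directiveLine.trim().split("\\s+");
        return Base64.getDecoder().decode(fields[2]);
    }

    public static void main(String[] args) {
        // Two placeholder bytes for the PC followed by one data byte.
        byte[] data = {0, 0, 1};
        System.out.println(directive("code", "ArrayCheckTag", data));
    }
}
```

Running main prints the line `.code_attribute ArrayCheckTag AAAB`.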
    iadd
    daload
    dastore
    .code_attribute ArrayCheckTag AAAB
    dastore
    .code_attribute ArrayCheckTag AAAB
    aload_0

Figure 4: Sample Annotated Jasmin Code

3.3.1 The TagManager class

This class is meant to contain static methods to provide Tag-related functionality. At present it provides a flexible facility for printing out Tags: a TagPrinter can be registered and will subsequently be used for printing calls made through its interface.

TagManager also currently provides a lookup mechanism for mapping an attribute name onto the proper Soot class (if any) corresponding to the attribute. This is useful to decode Soot attributes found when reading classfiles.

3.3.2 The TagPrinter Interface and the StdTagPrinter class

The TagPrinter interface is meant to be implemented by classes that can print Tags. For example a PrettyPrinter class or an XML printer class could implement this interface, formatting tags in a distinct fashion. As previously noted, a TagPrinter is registered with the TagManager to configure the latter’s Tag printing behavior.

One such class that has been implemented and is now available in Soot is the StdTagPrinter, which prints out attributes in an easily parsable format. This facility is used by the PrintAttributes utility (see Section 4.1).

3.4 A Sample Attribute

A first Soot attribute is already being developed and is currently successfully supported by the framework. This attribute has been tentatively named ArrayCheckTag and can be used to annotate array accesses that have been proven by some analysis to be within bounds, thus indicating to a JVM that it can safely omit the corresponding runtime array bound checks. This attribute is currently being used in Feng Qian’s work at McGill University in implementing a Soot-based analysis for the elimination of unnecessary array bound checks. Figure 5 exhibits the salient point in his analysis.

    // The upper bound check is unnecessary if the index is known
    // to be less than the array length.
    if (maxValueMap.containsKey(index)) {
        AbstractValue indexV =
            (AbstractValue) maxValueMap.get(index);
        if (indexV.lessThan(arrayLength))
            upCheck = false;
    }
    else if (index instanceof IntConstant) {
        AbstractValue tmpAv =
            AbstractValue.newConstantValue(index);
        if (tmpAv.lessThan(arrayLength))
            upCheck = false;
    }
    Tag checkTag = new ArrayCheckTag(lowCheck, upCheck);
    if (!lowCheck || !upCheck) {
        // Attach the tag to the statement s holding the array reference.
        s.addTag(checkTag);
    }

Figure 5: Adding an ArrayCheckTag to a Unit

In the given code, having determined that either an upper array bound or lower array bound need not be checked, he creates an ArrayCheckTag and attaches it to the Jimple statement that contains the array reference. Soot then automatically takes care of propagating these tags to the appropriate bytecode array access instruction at code generation time.

The current encoding of the ArrayCheckTag attribute’s data at the bytecode level comprises 3 bytes. First, as with all Code-level attributes currently generated by Soot, the first 2 bytes are the PC of the bytecode instruction it references. The remaining byte is used to encode which of the upper and lower bound checks can be omitted. This in fact requires only 2 bits of the byte. If the first bit is on, then the upper bound check can be omitted, and if the second bit is on then the lower bound check can be omitted.

Thus the total size of an ArrayCheckTag attribute in a classfile is 9 bytes (6 bytes for the header and 3 bytes for the data), plus the cost of the ArrayCheckTag’s ConstantPool entry, which is shared by all such attributes in a given class. Hence annotating array bound checks can be done effectively in terms of code size. Note however that the size and format of this attribute are likely to grow somewhat as we standardize the encoding of Soot attributes. In particular we plan on introducing major/minor version numbering of attributes for future scalability.
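The 3-byte encoding and the 9-byte total described above can be checked with a small Java sketch. The text does not say which bit counts as the "first"; the sketch assumes it is the least significant bit (0x01), which is an assumption and not a statement about Soot's actual ArrayCheckTag. The class, constants and method names are likewise hypothetical. Note that the flags here mean "check can be omitted", whereas the booleans in Figure 5 mean "check is needed".

```java
import java.nio.ByteBuffer;

/** Illustrative sketch only; not the actual ArrayCheckTag implementation. */
final class ArrayCheckEncodingSketch {

    // Assumption: "first bit" is the least significant bit of the flag byte.
    static final int UPPER_OMITTABLE = 0x01;
    static final int LOWER_OMITTABLE = 0x02;

    // attribute_info header: u2 attribute_name_index + u4 attribute_length.
    static final int HEADER_BYTES = 2 + 4;

    /** Encodes the attribute data: 2-byte big-endian PC followed by one flag byte. */
    static byte[] encodeData(int pc, boolean upperOmittable, boolean lowerOmittable) {
        int flags = (upperOmittable ? UPPER_OMITTABLE : 0)
                  | (lowerOmittable ? LOWER_OMITTABLE : 0);
        return ByteBuffer.allocate(3).putShort((short) pc).put((byte) flags).array();
    }

    /** Reads the PC back as an unsigned 16-bit value. */
    static int pcOf(byte[] data) {
        return ByteBuffer.wrap(data).getShort() & 0xFFFF;
    }

    static boolean upperOmittable(byte[] data) {
        return (data[2] & UPPER_OMITTABLE) != 0;
    }

    static boolean lowerOmittable(byte[] data) {
        return (data[2] & LOWER_OMITTABLE) != 0;
    }

    /** Size of the whole attribute in the classfile, excluding the shared ConstantPool entry. */
    static int totalSize(byte[] data) {
        return HEADER_BYTES + data.length;
    }

    public static void main(String[] args) {
        byte[] data = encodeData(61, true, false);
        System.out.println("data bytes: " + data.length + ", total bytes: " + totalSize(data));
    }
}
```

With 3 data bytes and the 6-byte header, totalSize returns 9, matching the figure given in the text.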
    > java PrintAttributes FFT.class
    <FFT:public void <init>()>+61/ArrayCheckTag AA==
    <FFT:public void <init>()>+73/ArrayCheckTag AA==
    <FFT:public void <init>()>+74/ArrayCheckTag AA==

Figure 6: Sample Output of the PrintAttributes Utility

4 Tools to Visualize Annotated Classfiles

4.1 PrintAttributes Utility

This is a simple utility to print out custom attributes in an easily parsable format. It uses the JavaClass API [10] to extract the attributes from the specified classfile and uses Soot’s StdTagPrinter class to print them out. The utility currently only takes one argument, the filename of the class to print out. Sample output for a classfile that has been annotated by the array bounds check analysis is given in Figure 6.

4.2 Hypertext Browsing of Attributes

We have extended and modified JavaClass’ class2HTML utility in support of custom attributes. The resulting utility processes classfiles and produces corresponding HTML files that have hyperlinks to attributes. The utility uses Soot to format the attributes it finds. If an attribute is understood by Soot it will be formatted in a human-friendly fashion, usually by instantiating a class for the attribute and calling upon its toString method. Sample output produced by this utility can be found at http://www.cs.mcgill.ca/~patrice/cs621/FFT.html

5 What’s Missing for more Complete Attribute Support

5.1 Abstracting the PC

It is at times useful to produce attributes whose data contains references to the explicit value of the PC (Program Counter) of certain bytecode instructions. This cannot be done easily in Soot since Soot works in terms of Units and not absolute bytecodes. Although it would be possible to let Tags refer to Units and then translate these references into PCs at the Jasmin level, this scheme still presents many problems, as Units in Soot can be edited, reordered or deleted by various optimizations, the effect of which would have to be reflected in the referencing Tags. We are currently still evaluating how abstracting the PC could best be achieved in the Soot framework.

5.2 Reading Soot Attributes back into Soot

At present we can produce custom Soot-annotated classfiles but we cannot read these same attributes, or any other custom attributes, back into Soot. It would also be desirable to parse attributes from text files when processing a set of class files. These issues are presently being addressed.

6 Related Work

To the best of our knowledge there has been little work done in investigating the possible uses of classfile attributes to improve the performance of bytecode. We are aware of only 2 research groups that have been investigating this topic, and both are focused on conveying register allocation information through attributes [2] [3]. This involves developing a virtual register allocation scheme where one assumes an infinite number of registers and then proceeds to statically minimize the number that are actually used. The scheme developed by [3] monotypes each virtual register, which allows for efficient runtime verifiability of their attributes; attributes to deal with spills are also presented. Experimental results obtained to date by both groups exhibit significant code speedups.

7 Future Work

Most of the attribute support in Soot is now complete; although some important features still require our attention, work can now be more focussed on developing innovative analyses that make use of this new facility. It will also be crucial that runtime systems understand and make use of the classfile attributes generated by these. In this respect we expect to collaborate with IBM’s HPCJ [6] group and investigate how the Kaffe OpenVM [5] could be modified in support of these. Once we have such runtime support we will conduct experiments to validate the soundness and effectiveness of our annotations.
8 Conclusion

Classfile attributes can be exploited to convey extra information to JVMs and allow for faster code execution speeds. Support for generating custom classfile attributes has now been added to the Soot bytecode optimization framework. Soot analyses can now make use of a simple API to annotate various objects in the framework; classfile attributes will automatically be generated from these annotated objects. Two tools have been developed to view custom annotated classfiles. One of these allows for hypertext browsing of the generated annotated classfiles. Although attribute support in the Soot framework is now extensive, there still remains some work to be done in order to have a more complete and flexible implementation; most notably we must find a way to abstract the PC for methods in Soot. Nonetheless Soot is now attribute enabled and can currently be used as an effective tool to generate attributes by those who require such functionality.

Acknowledgements

Thanks are in order for Raja Vallée-Rai, principal author of the Soot framework, for helping conceive the attribute support facilities that have been described in this paper; and Feng Qian for providing the array bounds check analysis here described, effectively the first analysis to make use of Soot attributes.

References

[1] Raja Vallée-Rai, Etienne Gagnon, Laurie Hendren, Patrick Lam, Patrice Pominville and Vijay Sundaresan, “Optimizing Java Bytecode using the Soot Framework: Is it Feasible?”, CC/ETAPS 2000, LNCS 1781, pp. 18-34, 2000.

[2] Ana Azevedo, Alex Nicolau and Joe Hummel, “Annotating the Java Bytecodes in Support of Optimization”, ACM Java Grande Conference, San Francisco, CA, June 1999.

[3] Joel Jones and Samuel Kamin, “Annotating Java Bytecodes in Support of Optimization”, Concurrency: Practice and Experience, forthcoming.

[4] David Gay and Bjarne Steensgaard, “Fast Escape Analysis and Stack Allocation for Object-Based Programs”, CC/ETAPS 2000, LNCS 1781, pp. 82-93, 2000.

[5] The Kaffe OpenVM, http://www.kaffe.org

[6] IBM’s High Performance Compiler for Java, http://www.research.ibm.com/topics/popups/innovate/java/html/hpcj.html

[7] The Sable Compiler Research Group, http://www.sable.mcgill.ca

[8] The Soot Bytecode Optimization Framework, http://www.sable.mcgill.ca/soot

[9] The Jasmin Bytecode Assembler, http://mrl.nyu.edu/meyer/jvm/jasmin.html

[10] JavaClass Bytecode Engineering Framework, http://www.inf.fu-berlin.de/~dahm/JavaClass

[11] Tim Lindholm and Frank Yellin, The Java Virtual Machine Specification, Second Edition, http://java.sun.com/docs/books/vmspec/2nd-edition/html/VMSpecTOC.doc.html

[12] Java HotSpot Technology, http://java.sun.com/products/hotspot
