Annotating Java Bytecode
Project Report
308-621 Optimizing Compilers
McGill University
April 2000
Patrice Pominville
9745398
[email protected]
Garbage Collection  There is no garbage collector that is optimal for all programs. For some programs a generational garbage collector is a big plus, while for others it is overkill or the generational allocation assumption simply does not hold. Based on profiling, annotations could be produced to specify which type of GC would be best for a program. A JVM could then use this hint to select an appropriate GC at runtime.

Branch Prediction Annotation  Annotating which bytecode branches are most

Soot [8] is an object-oriented bytecode analysis and optimization framework implemented in Java and developed by the Sable Compiler Research Group at McGill University [7]. In the context of this project, we have extended the framework to support the embedding of custom, user-defined attributes in classfiles. The Soot framework enables one to easily define and implement various compiler analyses. The added attribute support, as presented in this section, enables the implementation of a wider range of such analyses, encompassing those whose results cannot be expressed directly in bytecode.
    public interface Host
    {
        public List getTags();
        public Tag getTag(String aName);
        public void addTag(Tag t);
        public void removeTag(String name);
        public boolean hasTag(String aName);
    }

Figure 2: The Host Interface

    public interface Tag
    {
        public String getName();
        public byte[] getEncoding();
        public String toString();
        public void setValue(byte[] value);
    }

Figure 3: The Tag Interface

3.1 The Host and Tag Interfaces

Attribute support in Soot has been achieved by adding two key interfaces: Host and Tag. Hosts are objects that can hold Tags; conversely, Tags are objects that can be attached to Hosts. These interfaces are listed in Figures 2 and 3. There are 4 Soot classes that implement the Host interface: SootClass, SootField, SootMethod and Unit, the latter of which is Soot’s abstract notion of a bytecode instruction. Application-specific implementations of Tag can be created and attached to these Hosts by implementors of Soot-based analyses. There is a natural mapping between these 4 Soot classes and the attribute architecture present in classfiles as described in Section 2. Tags attached to a SootClass will be compiled into entries in the attribute table of the corresponding class, and similarly for methods and fields. Tags attached to Soot Units will be compiled into entries of the Code attribute table of the Unit’s method, along with the bytecode Program Counter (PC) of the specific instruction they index.

3.2 Producing Annotated Classfiles

The process of translating Tags held by Soot Hosts into actual classfile attributes is now detailed. At present, Soot produces Jasmin code [9] for its processed classfiles. The Jasmin code is then transformed by Jasmin into actual classfiles. To enable attribute support, we have extended the Jasmin syntax with 4 new directives: .class_attribute, .method_attribute, .field_attribute and .code_attribute. These all have the following format:

    attribute_directive attribute_name attribute_value

where attribute_name corresponds to the attribute’s name that will be stored in the classfile’s ConstantPool and attribute_value is the actual value of the raw byte array for the attribute, which is encoded in Base64 in order to maintain the textual format of Jasmin code. Our custom Jasmin version will compile the appropriate attribute in the resulting classfile from these triples, translating the attribute values in Base64 back to a raw byte array. There is a peculiarity to this scheme: for .code_attribute, Jasmin will replace the first 2 bytes of the attribute’s data by the PC of the instruction it is referring to, as Soot currently lacks a mechanism for abstracting the Program Counter for a method’s bytecode (see Section 5.1). Hence at present, all Soot generated Code attributes in classfiles start with a 2 byte PC index that specifies an instruction context.

As expected, each Tag attached to a SootClass will generate a corresponding .class_attribute directive in the Jasmin code, and similarly SootField attributes translate to .field_attribute directives, SootMethod attributes to .method_attribute directives and Unit attributes to .code_attribute directives.

These directives must be produced in Jasmin code at specific locations:

.class_attribute  These must be found immediately before the class’ field declarations.

.field_attribute  These must be found immediately after the field declaration they relate to.

.method_attribute  These must be found immediately after the method declaration they relate to.

.code_attribute  These must be found immediately after the instruction they relate to.

Sample Jasmin code embedding .code_attribute directives is given in Figure 4.

3.3 Auxiliary Support in Soot for Tags

Several utility classes and interfaces have been added to Soot to provide additional support for Tags. An overview of these is now given.
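Before turning to these utilities, the Host and Tag interfaces of Section 3.1 and the directive format of Section 3.2 can be illustrated with a minimal, self-contained sketch. It does not use Soot itself: the two interfaces are transcribed from Figures 2 and 3 (with a type parameter added to List), while RawTag, SimpleHost and toDirective are hypothetical names introduced only for this illustration, and the Base64 encoding relies on the standard java.util.Base64 class rather than on whatever encoder Soot or the extended Jasmin actually use.

```java
import java.util.ArrayList;
import java.util.Base64;
import java.util.List;

public class TagSketch {

    // Transcribed from Figure 3.
    interface Tag {
        String getName();
        byte[] getEncoding();
        String toString();
        void setValue(byte[] value);
    }

    // Transcribed from Figure 2 (List is given a type parameter here).
    interface Host {
        List<Tag> getTags();
        Tag getTag(String aName);
        void addTag(Tag t);
        void removeTag(String name);
        boolean hasTag(String aName);
    }

    // Hypothetical Tag that simply carries an opaque byte payload.
    static class RawTag implements Tag {
        private final String name;
        private byte[] value;

        RawTag(String name, byte[] value) {
            this.name = name;
            this.value = value.clone();
        }

        public String getName() { return name; }
        public byte[] getEncoding() { return value.clone(); }
        public void setValue(byte[] value) { this.value = value.clone(); }
        @Override public String toString() { return name + "[" + value.length + " bytes]"; }
    }

    // Hypothetical Host standing in for SootClass, SootField, SootMethod or Unit.
    static class SimpleHost implements Host {
        private final List<Tag> tags = new ArrayList<>();

        public List<Tag> getTags() { return tags; }

        public Tag getTag(String aName) {
            for (Tag t : tags) {
                if (t.getName().equals(aName)) return t;
            }
            return null; // no Tag with that name is attached
        }

        public void addTag(Tag t) { tags.add(t); }
        public void removeTag(String name) { tags.removeIf(t -> t.getName().equals(name)); }
        public boolean hasTag(String aName) { return getTag(aName) != null; }
    }

    // Renders one Tag as "directive attribute_name attribute_value",
    // with the raw bytes encoded in Base64 as described in Section 3.2.
    static String toDirective(String directive, Tag tag) {
        return directive + " " + tag.getName() + " "
                + Base64.getEncoder().encodeToString(tag.getEncoding());
    }

    public static void main(String[] args) {
        Host unit = new SimpleHost();
        // Two placeholder PC bytes followed by one data byte.
        unit.addTag(new RawTag("ArrayCheckTag", new byte[] {0, 0, 1}));
        for (Tag t : unit.getTags()) {
            System.out.println(toDirective(".code_attribute", t));
        }
    }
}
```

Run as is, the sketch prints `.code_attribute ArrayCheckTag AAAB`, since the three bytes 0, 0, 1 encode to AAAB in Base64. The two leading zero bytes are placeholders of the kind that Jasmin overwrites with the instruction's PC.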
    iadd
    daload
    dastore
    .code_attribute ArrayCheckTag AAAB
    dastore
    .code_attribute ArrayCheckTag AAAB
    aload_0

Figure 4: Sample Annotated Jasmin Code

3.3.1 The TagManager class

This class is meant to contain static methods to provide Tag related functionality. At present it provides a flexible facility for printing out Tags: a TagPrinter can be registered and will subsequently be used for printing calls made through its interface.

TagManager also currently provides a lookup mechanism for mapping an attribute name onto the proper Soot class (if any) corresponding to the attribute. This is useful to decode Soot attributes found when reading classfiles.

3.3.2 The TagPrinter Interface and the StdTagPrinter class

The TagPrinter interface is meant to be implemented by classes that can print Tags. For example, a PrettyPrinter class or an XML printer class could implement this interface, each formatting tags in a distinct fashion. As previously noted, a TagPrinter is registered with the TagManager to configure the latter’s Tag printing behavior.

One such class that has been implemented and is now available in Soot is the StdTagPrinter, which prints out attributes in an easily parsable format. This facility is used by the PrintAttributes utility (see Section 4.1).

3.4 A Sample Attribute

A first Soot attribute is already being developed and is currently successfully supported by the framework. This attribute has been tentatively named ArrayCheckTag and can be used to annotate array accesses that have been proven by some analysis to be within bounds, thus indicating to a JVM that it can safely omit the corresponding runtime array bound checks. This attribute is currently being used in Feng Qian’s work at McGill University in implementing a Soot based analysis for the elimination of unnecessary array bound checks. Figure 5 exhibits the salient point in his analysis. In the given code, having determined that either an upper array bound or lower array bound need not be checked, he creates an ArrayCheckTag and attaches it to the Jimple statement that contains the array reference. Soot then automatically takes care of propagating these tags to the appropriate bytecode array access instruction at code generation time.

    // The upper bound check is unnecessary when the index is known
    // to be less than the array length.
    if (maxValueMap.containsKey(index)) {
        AbstractValue indexV =
            (AbstractValue)maxValueMap.get(index);
        if (indexV.lessThan(arrayLength))
            upCheck = false;
    }
    else if (index instanceof IntConstant) {
        AbstractValue tmpAv =
            AbstractValue.newConstantValue(index);
        if (tmpAv.lessThan(arrayLength))
            upCheck = false;
    }
    Tag checkTag = new ArrayCheckTag(lowCheck, upCheck);
    // Attach the tag only when at least one check can be omitted.
    if (!lowCheck || !upCheck) {
        s.addTag(checkTag);
    }

Figure 5: Adding an ArrayCheckTag to a Unit

The current encoding of the ArrayCheckTag attribute’s data at the bytecode level comprises 3 bytes. As with all Code level attributes currently generated by Soot, the first 2 bytes are the PC of the bytecode instruction it references. The remaining byte is used to encode which of the upper and lower bound checks can be omitted. This in fact requires only 2 bits of the byte. If the first bit is on, then the upper bound check can be omitted, and if the second bit is on then the lower bound check can be omitted.

Thus the total size of an ArrayCheckTag attribute in a classfile is 9 bytes (6 bytes for the header and 3 bytes for the data), plus the cost of the ArrayCheckTag’s ConstantPool entry, which is shared by all such attributes in a given class. Hence annotating array bound checks can be done at a small cost in code size. Note however that the size and format of this attribute are likely to grow somewhat as we standardize the encoding of Soot attributes. In particular we plan on introducing major/minor version numbering of attributes for future scalability.
4 Tools to Visualize Annotated Classfiles

4.1 PrintAttributes Utility

This is a simple utility to print out custom attributes in an easily parsable format. It uses the JavaClass API [10] to extract the attributes from the specified classfile and uses Soot’s StdTagPrinter class to print them out. The utility currently only takes one argument, the filename of the class to print out. Sample output for a classfile that has been annotated by the array bounds check analysis is given in Figure 6.

    > java PrintAttributes FFT.class
    <FFT:public void <init>()>+61/ArrayCheckTag AA==
    <FFT:public void <init>()>+73/ArrayCheckTag AA==
    <FFT:public void <init>()>+74/ArrayCheckTag AA==

Figure 6: Sample Output of the PrintAttributes Utility

4.2 Hypertext Browsing of Attributes

We have extended and modified JavaClass’ class2HTML utility in support of custom attributes. The resulting utility processes classfiles and produces corresponding HTML files that have hyperlinks to attributes. The utility uses Soot to format the attributes it finds. If an attribute is understood by Soot it will be formatted in a human friendly fashion, usually by instantiating a class for the attribute and calling upon its toString method. Sample output produced by this utility can be found at http://www.cs.mcgill.ca/~patrice/cs621/FFT.html

5 What’s Missing for More Complete Attribute Support

5.1 Abstracting the PC

It is at times useful to produce attributes whose data contains references to the explicit value of the PC (Program Counter) of certain bytecode instructions. This cannot be done easily in Soot since Soot works in terms of Units and not absolute bytecodes. Although it would be possible to let Tags refer to Units and then translate these references into PCs at the Jasmin level, this scheme still presents many problems, as Units in Soot can be edited, reordered or deleted by various optimizations, the effect of which would have to be reflected in the referencing Tags. We are currently still evaluating how abstracting the PC could best be achieved in the Soot framework.

5.2 Reading Soot Attributes back into Soot

At present we can produce custom Soot annotated classfiles but we cannot read these same attributes, or any other custom attributes, back into Soot. It would also be desirable to parse attributes from text files when processing a set of class files. These issues are presently being addressed.

6 Related Work

To the best of our knowledge there has been little work done in investigating the possible uses of classfile attributes to improve the performance of bytecode. We are aware of only 2 research groups that have been investigating this topic and both are focused on conveying register allocation information through attributes [2] [3]. This involves developing a Virtual Register allocation scheme where one assumes an infinite number of registers and then proceeds to statically minimize the number that are actually used. The scheme developed by [3] monotypes each virtual register, which allows for efficient runtime verifiability of their attributes; attributes to deal with spills are also presented. Experimental results obtained to date by both groups exhibit significant code speedups.

7 Future Work

Most of the attribute support in Soot is now complete; although some important features still require our attention, work can now be more focused on developing innovative analyses that make use of this new facility. It will also be crucial that runtime systems understand and make use of the classfile attributes generated by these. In this respect we expect to collaborate with IBM’s HPCJ [6] group and investigate how the Kaffe OpenVM [5] could be modified in support of these. Once we have such runtime support we will conduct experiments to validate the soundness and effectiveness of our annotations.
8 Conclusion

Classfile attributes can be exploited to convey extra information to JVMs and allow for faster code execution speeds. Support for generating custom classfile attributes has now been added to the Soot bytecode optimization framework. Soot analyses can now make use of a simple API to annotate various objects in the framework; classfile attributes will automatically be generated from these annotated objects. Two tools have been developed to view custom annotated classfiles. One of these allows for hypertext browsing of the generated annotated classfiles. Although attribute support in the Soot framework is now extensive, there still remains some work to be done in order to have a more complete and flexible implementation; most notably we must find a way to abstract the PC for methods in Soot. Nonetheless Soot is now attribute enabled and can currently be used as an effective tool to generate attributes by those who require such functionality.

Acknowledgements

References

[5] The Kaffe OpenVM, http://www.kaffe.org

[6] IBM’s High Performance Compiler for Java, http://www.research.ibm.com/topics/popups/innovate/java/html/hpcj.html

[7] The Sable Compiler Research Group, http://www.sable.mcgill.ca

[8] The Soot Bytecode Optimization Framework, http://www.sable.mcgill.ca/soot

[9] The Jasmin Bytecode Assembler, http://mrl.nyu.edu/meyer/jvm/jasmin.html

[10] JavaClass Bytecode Engineering Framework, http://www.inf.fu-berlin.de/~dahm/JavaClass

[11] Tim Lindholm, Frank Yellin, The Java Virtual Machine Specification, Second Edition, http://java.sun.com/docs/books/vmspec/2nd-edition/html/VMSpecTOC.doc.html

[12] Java HotSpot Technology, http://java.sun.com/products/hotspot