In static program analysis, Soot is a language manipulation and optimization framework consisting of intermediate languages for the Java programming language. It was developed by the Sable Research Group at McGill University, which is also known for SableVM, a Java virtual machine, and for the AspectBench Compiler, an open research compiler for AspectJ. In 2010, two research papers on Soot (Vallée-Rai et al. 1999 and Pominville et al. 2000) were selected as IBM CASCON First Decade High Impact Papers among 12 other papers from the 425 entries.
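Soot provides four intermediate representations, available through its API for other analysis programs to access and build upon: Baf (a streamlined, stack-based representation of bytecode), Jimple (a typed three-address representation), Shimple (a static single assignment variant of Jimple) and Grimp (an aggregated version of Jimple suited to decompilation and code inspection).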
The current Soot software release also contains detailed program analyses that can be used out of the box, such as context-sensitive flow-insensitive points-to analysis, call graph analysis and domination analysis (answering the question "must event a follow event b?"). It also includes a decompiler called Dava.
Soot is free software available under the GNU Lesser General Public License (LGPL).
Jimple
Jimple is an intermediate representation of a Java program designed to be easier to optimize than Java bytecode. It is typed, has a concrete syntax and is based on three-address code.
Jimple includes only 15 different operations, thus simplifying flow analysis. By contrast, Java bytecode includes over 200 different operations.
Unlike in Java bytecode, local and stack variables in Jimple are typed, and Jimple is inherently type safe.
Converting to Jimple, or "Jimplifying" (after "simplifying"), is the conversion of bytecode to three-address code. The idea behind the conversion, first investigated by Clark Verbrugge, is to associate a variable with each position in the stack. Stack operations then become assignments involving these stack variables.
Example
Consider the following bytecode, which is from the
    iload 1   // load variable x1, and push it on the stack
    iload 2   // load variable x2, and push it on the stack
    iadd      // pop two values, and push their sum on the stack
    istore 1  // pop a value from the stack, and store it in variable x1

The above translates to the following three-address code:
    stack1 = x1               // iload 1
    stack2 = x2               // iload 2
    stack1 = stack1 + stack2  // iadd
    x1 = stack1               // istore 1

In general the resulting code does not have static single assignment form.
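The idea of associating one variable with each stack position can be sketched in a few lines of Java. This is only an illustration under stated assumptions, not Soot's actual implementation: the class name JimplifySketch and the method translate are hypothetical, only the three instructions from the example (iload, iadd, istore) are handled, and the names stackN and xN follow the naming used in the example above. The sketch tracks the current stack height and turns each push or pop into an assignment to, or from, the variable for that height.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration; not part of Soot.
class JimplifySketch {

    // Translates a straight-line sequence of iload/iadd/istore instructions
    // into three-address statements, using one variable per stack position.
    static List<String> translate(List<String> bytecode) {
        List<String> out = new ArrayList<>();
        int height = 0; // number of values currently on the simulated stack

        for (String insn : bytecode) {
            String[] parts = insn.trim().split("\\s+");
            switch (parts[0]) {
                case "iload":
                    // A push becomes an assignment to the next stack variable.
                    height++;
                    out.add("stack" + height + " = x" + parts[1]);
                    break;
                case "istore":
                    // A pop becomes a read of the topmost stack variable.
                    if (height < 1) {
                        throw new IllegalStateException("stack underflow at: " + insn);
                    }
                    out.add("x" + parts[1] + " = stack" + height);
                    height--;
                    break;
                case "iadd":
                    // Two pops and one push: the result lands in the lower slot.
                    if (height < 2) {
                        throw new IllegalStateException("stack underflow at: " + insn);
                    }
                    out.add("stack" + (height - 1) + " = stack" + (height - 1)
                            + " + stack" + height);
                    height--;
                    break;
                default:
                    throw new IllegalArgumentException("unsupported instruction: " + insn);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> code = List.of("iload 1", "iload 2", "iadd", "istore 1");
        translate(code).forEach(System.out::println);
    }
}
```

Running main on the four instructions from the example prints the same four three-address statements shown above. Because the variable stack1 is assigned twice, the output also illustrates why the result is generally not in static single assignment form.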