AI - OPAL - Abstract Interpretation Framework

A highly-configurable framework for the (abstract) interpretation of Java bytecode that relies on OPAL's resolved representation (org.opalj.br) of Java bytecode.

The framework traverses the instructions of a method in depth-first order until it reaches an instruction where multiple control flows may join. Such an instruction is analyzed only when no instruction remains that can be evaluated and at which no paths join (see org.opalj.br.Code.cfPCs). Each instruction is evaluated using the given (abstract) org.opalj.ai.Domain. The evaluation of a subroutine (Java code < 1.5) is, in case of an unhandled exception, always completed before the evaluation of the parent (sub)routine continues.
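The scheduling idea described above can be illustrated with a small, self-contained sketch. It is not OPAL's implementation: the object DeferredJoinTraversal, the successor map, and the rule that every program counter is scheduled at most once are assumptions made for illustration (the real framework reschedules an instruction whenever its incoming state changes).

```scala
import scala.collection.mutable

object DeferredJoinTraversal {

  /** Returns the order in which the program counters (PCs) are evaluated.
    * `successors` and `joinPCs` are hypothetical inputs describing a toy CFG.
    */
  def evaluationOrder(
      successors: Map[Int, List[Int]],
      joinPCs: Set[Int]
  ): List[Int] = {
    var worklist: List[Int] = List(0)  // regular instructions, used as a stack (depth-first)
    var deferredJoins: List[Int] = Nil // join instructions, taken only when worklist is empty
    val scheduled = mutable.Set(0)     // toy simplification: schedule every PC at most once
    val order = mutable.ListBuffer.empty[Int]

    while (worklist.nonEmpty || deferredJoins.nonEmpty) {
      val pc =
        if (worklist.nonEmpty) {
          val next = worklist.head
          worklist = worklist.tail
          next
        } else {
          val next = deferredJoins.head
          deferredJoins = deferredJoins.tail
          next
        }
      order += pc

      // Keep only successors that were not scheduled before.
      val newSuccessors = successors.getOrElse(pc, Nil).filter(s => scheduled.add(s))
      val (joins, regular) = newSuccessors.partition(joinPCs.contains)
      worklist = regular ::: worklist         // continue depth-first
      deferredJoins = deferredJoins ::: joins // postpone join points
    }
    order.toList
  }
}
```

For a toy graph in which PC 3 is reachable directly from PC 0 and also via 1 and 2, the join at PC 3 is evaluated only after both paths have reached it.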

Interacting with OPAL's Abstract Interpreter

The primary way to use this framework is to perform an abstract interpretation of a method using a customized Domain. That customized domain can be used, e.g., to build a call graph or to perform other intra-/interprocedural analyses while the code is analyzed. Additionally, it is possible to analyze the result of an abstract interpretation.

Thread Safety

This class is thread-safe. However, to make it possible to use one abstract interpreter instance for the concurrent abstract interpretation of independent methods, the AITracer (if any) has to be thread-safe too.

Hence, it is possible to use a single instance to analyze multiple methods in parallel. However, if you want to be able to selectively abort the abstract interpretation of some methods or want to selectively trace the interpretation of some methods, then you should use multiple abstract interpreter instances. Creating new instances is usually extremely cheap as this class does not have any significant associated state.

Subclasses are not required to be thread-safe and may have more complex state.
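A minimal sketch of why an instance without significant state can be shared across threads. StatelessToyAI is a hypothetical stand-in, not OPAL's AI class, and "interpreting" a method is reduced to summing a vector so that the example stays self-contained.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object SharedInterpreterSketch {

  /** Toy interpreter: all working data lives in local variables, so
    * concurrent calls cannot interfere with each other.
    */
  final class StatelessToyAI {
    def interpret(code: Vector[Int]): Int = code.sum
  }

  /** Analyzes all "methods" in parallel using ONE shared interpreter instance. */
  def analyzeAll(methods: Seq[Vector[Int]]): Seq[Int] = {
    val ai = new StatelessToyAI
    val results = Future.traverse(methods)(m => Future(ai.interpret(m)))
    Await.result(results, 10.seconds)
  }
}
```

If the interpreter carried per-run state (for example, an abort flag or a tracer that is not thread-safe), one instance per method would be the safer choice, as described above.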

Note

Useless Joins Avoidance

OPAL tries to minimize unnecessary joins by using the results of a naive live variables analysis (limited to the registers only!). This analysis helps to prevent unnecessary joins and also reduces the overall number of processing steps. E.g., in the following case the swallowed exceptions that may occur whenever transformIt is called would lead to an unnecessary join even though the exception is not required!

if (enc != null) {
  try {
    return transformIt(transformIt(enc));
  } catch (RuntimeException re) {}
}
return "";

This analysis reduces the overall number of evaluated instructions by about 4.5%. Additionally, it reduces the effort spent on "expensive" joins, which leads to an overall(!) improvement of ~8.5% for the l1.DefaultDomain.
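The effect of liveness information on joins can be sketched as follows. This is an illustrative toy, not OPAL's code: the abstract value type (a set of possible constants), the function join, and the representation of live registers as a set of indexes are all assumptions, and both register vectors are assumed to have the same length.

```scala
object LivenessAwareJoin {

  /** Hypothetical abstract value: the set of constants a register may hold. */
  type AbstractValue = Set[Int]

  /** Joins `incoming` into `current`, looking only at live registers.
    * Returns None if nothing changed, i.e., the join target does not
    * have to be scheduled for re-evaluation.
    */
  def join(
      current: Vector[AbstractValue],
      incoming: Vector[AbstractValue],
      liveRegisters: Set[Int]
  ): Option[Vector[AbstractValue]] = {
    val joined = current.indices.map { i =>
      if (liveRegisters.contains(i)) current(i) ++ incoming(i)
      else current(i) // dead register: its value is never read, so the join is skipped
    }.toVector
    if (joined == current) None else Some(joined)
  }
}
```

With current = Vector(Set(1), Set(2)) and incoming = Vector(Set(1), Set(3)), the join reports no change when only register 0 is live, because the differing register 1 is dead at the join point.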

Dead Variables Elimination based on Definitive Paths

(STILL IN DESIGN!!!!)

Idea

Consider an instruction i that may result in a fork of the control flow (e.g., a conditional branch or an invoke instruction that may throw a caught exception). If the (first) evaluation of i definitively rules out several possible paths and, on all paths that are taken, some values are dead but live on some of the other paths, then the respective current values will never be propagated to the remaining paths, even if the remaining paths are eventually taken! This helps in a variety of cases such as, e.g.,

var s : Object = null
for{/* it can statically be determined that this path is taken at least once!*/} {
    s = "something else"
}
doIt(s); // here, "s" is guaranteed not to reference the original value "null"!

Implementation

When we have a fork, check if all paths...

Customizing the Abstract Interpretation Framework

Customization of the abstract interpreter is done by creating new subclasses that override the relevant methods (in particular: AI#isInterrupted and AI#tracer).

OPAL does not make assumptions about the number of domain objects that are used. However, if a single domain object is used by multiple instances of this class and the abstract interpretations are executed concurrently, then the domain has to be thread-safe. The latter is trivially the case when the domain object itself does not have any state; however, most domain objects have some state.
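The subclassing pattern can be sketched with a toy interpreter. ToyAI, CountBoundedToyAI, and the evaluated hook are hypothetical names introduced for illustration; they do not mirror the signatures of OPAL's AI class, only the idea that isInterrupted is checked directly before each instruction is evaluated.

```scala
object InterruptionSketch {

  /** Toy stand-in for an interpreter; NOT OPAL's AI class. */
  class ToyAI {

    /** Checked directly before each instruction (default: never interrupt). */
    protected def isInterrupted: Boolean = false

    /** Hypothetical hook that lets subclasses observe progress. */
    protected def evaluated(pc: Int): Unit = ()

    /** "Evaluates" the PCs 0 until codeSize; returns the PC at which it stopped. */
    final def interpret(codeSize: Int): Int = {
      var pc = 0
      while (pc < codeSize && !isInterrupted) {
        evaluated(pc)
        pc += 1
      }
      pc
    }
  }

  /** Aborts after at most `maxEvaluations` evaluated instructions.
    * The counter is per-instance state, so this subclass is not thread-safe.
    */
  class CountBoundedToyAI(maxEvaluations: Int) extends ToyAI {
    private var count = 0
    override protected def isInterrupted: Boolean = count >= maxEvaluations
    override protected def evaluated(pc: Int): Unit = count += 1
  }
}
```

Because the bounded variant keeps a counter, it illustrates why subclasses may have more complex state and why one instance per analyzed method is then the safer choice.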

Linear Supertypes

AnyRef, Any

Known Subclasses

BaseAI, BaseAI, BoundedInterruptableAI, CountingAI, InstructionCountBoundedAI, InterruptableAI, PerformAI, TimeBoundedAI

Instance Constructors

new AI(IdentifyDeadVariables: Boolean = true)

Type Members

type SomeLocals[V <: SomeLocals.V.d.type.DomainValue forSome {val d: D}] = Option[IndexedSeq[V]]

Value Members

final def !=(arg0: Any): Boolean

Definition Classes
AnyRef → Any
final def ##(): Int

Definition Classes
AnyRef → Any
def +(other: String): String

Implicit information
This member is added by an implicit conversion from AI[D] to any2stringadd[AI[D]] performed by method any2stringadd in scala.Predef.
Definition Classes
any2stringadd
def ->[B](y: B): (AI[D], B)

Implicit information
This member is added by an implicit conversion from AI[D] to ArrowAssoc[AI[D]] performed by method ArrowAssoc in scala.Predef.
Definition Classes
ArrowAssoc
Annotations
@inline()
final def ==(arg0: Any): Boolean

Definition Classes
AnyRef → Any
final val IdentifyDeadVariables: Boolean
def apply(method: Method, theDomain: D): AIResult { val domain: theDomain.type }

Performs an abstract interpretation of the given method using the given domain.
method
A non-native, non-abstract method of the given class file that will be analyzed. All parameters are automatically initialized with sensible default values.
theDomain
The domain that will be used to perform computations related to values.
final def asInstanceOf[T0]: T0

Definition Classes
Any
def clone(): AnyRef

Attributes
protected[java.lang]
Definition Classes
AnyRef
Annotations
@throws( ... )
def continueInterpretation(code: Code, cfJoins: BitSet, liveVariables: LiveVariables, theDomain: D)(initialWorkList: Chain[PC], alreadyEvaluated: Chain[PC], theOperandsArray: D.OperandsArray, theLocalsArray: D.LocalsArray, theMemoryLayoutBeforeSubroutineCall: Chain[(PC, D.OperandsArray, D.LocalsArray)], theSubroutinesOperandsArray: D.OperandsArray, theSubroutinesLocalsArray: D.LocalsArray): AIResult { val domain: theDomain.type }

Continues the interpretation of/performs an abstract interpretation of the given method (code) using the given domain.
code
The bytecode that will be interpreted using the given domain.
cfJoins
The set of instructions where two or more control flow paths join. The abstract interpretation framework will only perform a join operation for those instructions.
theDomain
The domain that will be used to perform the domain dependent computations.
initialWorkList
The list of program counters with which the interpretation will continue. If the method was never analyzed before, the list should just contain the value "0"; i.e., we start with the interpretation of the first instruction (see initialWorkList). Note that the worklist may contain negative values. These values are not related to a specific instruction per se but encode the information necessary to handle subroutines. In case of calls to a subroutine we add the special values SUBROUTINE and SUBROUTINE_START to the list to encode when the evaluation started. This is needed to completely process the subroutine (to explore all paths) before we finally return to the main method.
alreadyEvaluated
The list of the program counters (PC) of the instructions that were already evaluated. Initially (i.e., if the given code is analyzed the first time) this list is empty. This list is primarily needed to correctly resolve jumps to subroutines (JSR(_W) and RET instructions). For each instruction that was evaluated, the operands array and the locals array must be non-empty (not null).
theOperandsArray
The array that contains the operand stacks. Each value in the array contains the operand stack before the instruction with the corresponding index is executed. This array can be empty except for the indexes that are referred to by the initialWorkList. The operandsArray data structure is mutated by OPAL-AI and it is recommended that a Domain does not directly mutate the state of this array.
theLocalsArray
The array that contains the local variable assignments. Each value in the array contains the local variable assignments before the instruction with the corresponding program counter is executed. The localsArray data structure is mutated by OPAL-AI and it is recommended that a Domain does not directly mutate the state of this array.
theSubroutinesOperandsArray
The array that contains the intermediate information about the subroutines' operands. This value should be null unless we are continuing an aborted computation and a subroutine was already analyzed.
theSubroutinesLocalsArray
The array that contains the intermediate information about the subroutines' locals. This value should be null unless we are continuing an aborted computation and a subroutine was already analyzed.
def continueInterpretation(code: Code, theDomain: D)(initialWorkList: Chain[PC], alreadyEvaluated: Chain[PC], theOperandsArray: D.OperandsArray, theLocalsArray: D.LocalsArray): AIResult { val domain: theDomain.type }
def ensuring(cond: (AI[D]) ⇒ Boolean, msg: ⇒ Any): AI[D]

Implicit information
This member is added by an implicit conversion from AI[D] to Ensuring[AI[D]] performed by method Ensuring in scala.Predef.
Definition Classes
Ensuring
def ensuring(cond: (AI[D]) ⇒ Boolean): AI[D]

Implicit information
This member is added by an implicit conversion from AI[D] to Ensuring[AI[D]] performed by method Ensuring in scala.Predef.
Definition Classes
Ensuring
def ensuring(cond: Boolean, msg: ⇒ Any): AI[D]

Implicit information
This member is added by an implicit conversion from AI[D] to Ensuring[AI[D]] performed by method Ensuring in scala.Predef.
Definition Classes
Ensuring
def ensuring(cond: Boolean): AI[D]

Implicit information
This member is added by an implicit conversion from AI[D] to Ensuring[AI[D]] performed by method Ensuring in scala.Predef.
Definition Classes
Ensuring
final def eq(arg0: AnyRef): Boolean

Definition Classes
AnyRef
def equals(arg0: Any): Boolean

Definition Classes
AnyRef → Any
def finalize(): Unit

Attributes
protected[java.lang]
Definition Classes
AnyRef
Annotations
@throws( classOf[java.lang.Throwable] )
def formatted(fmtstr: String): String

Implicit information
This member is added by an implicit conversion from AI[D] to StringFormat[AI[D]] performed by method StringFormat in scala.Predef.
Definition Classes
StringFormat
Annotations
@inline()
final def getClass(): Class[_]

Definition Classes
AnyRef → Any
def hashCode(): Int

Definition Classes
AnyRef → Any
def initialLocals(method: Method, domain: D)(someLocals: SomeLocals[D.DomainValue] = None): D.Locals

Returns the initial register assignment (the initialized locals) that is used when analyzing a new method.
Initially, only the registers that contain the method's parameters (including the self reference (this)) are used. If no initial assignment is provided (someLocals == None) a valid assignment is automatically created using the domain. See perform(...) for further details regarding the initial register assignment.
This method is called by the perform method with the same signature. It may be overridden by subclasses to perform some additional processing. In that case, however, it is highly recommended to call this method to finalize the initial assignment.
method
A non-native, non-abstract method. I.e., a method that has an implementation in Java bytecode (e.g., method.body.isDefined === true).
domain
The domain that will be used to perform computations related to values.
def initialOperands(method: Method, domain: D): D.Operands

Returns the initial set of operands that will be used for the abstract interpretation of the given method.
In general, an empty list is returned as the JVM specification mandates that the operand stack is empty at the very beginning of a method.
This method is called by the perform method with the same signature. It may be overridden by subclasses to perform some additional processing.
final def isInstanceOf[T0]: Boolean

Definition Classes
Any
def isInterrupted: Boolean

Determines whether a running (or to be started) abstract interpretation should be interrupted (default: false).
In general, interrupting the abstract interpreter may be meaningful if the abstract interpretation takes too long, if the currently used domain is not sufficiently precise, or if additional information is needed to continue with the analysis.
Called during the abstract interpretation of a method to determine whether the computation should be aborted. This method is always called directly before the evaluation of the first/next instruction; i.e., before the very first instruction or after the AI has completely evaluated an instruction, updated the memory and stated all constraints.

Attributes
protected
Note
When the abstract interpreter is currently waiting on the result of the interpretation of a called method it may take some time before the interpretation of the current method (this abstract interpreter) is actually aborted. This method needs to be overridden in subclasses to identify situations in which a running abstract interpretation should be interrupted.
final def ne(arg0: AnyRef): Boolean

Definition Classes
AnyRef
final def notify(): Unit

Definition Classes
AnyRef
final def notifyAll(): Unit

Definition Classes
AnyRef
def perform(method: Method, theDomain: D)(someLocals: Option[IndexedSeq[D.DomainValue]] = None): AIResult { val domain: theDomain.type }

Analyzes the given method using the given domain and the pre-initialized parameter values (if any). Basically, first the set of initial operands and locals is calculated before the respective perform(...,initialOperands,initialLocals) method is called.
Controlling the AI
The abstract interpretation of a method is aborted if the AI's isInterrupted method returns true.
method
A non-abstract, non-native method of the given class file. I.e., a method with a body.
theDomain
The abstract domain that will be used for the abstract interpretation of the given method.
someLocals
The initial register assignment (the parameters passed to the method). If the values passed to a method are already known, the abstract interpretation will be performed under that assumption. The specified number of locals has to be equal to or larger than the number of parameters (including this in case of a non-static method). If the number is lower than method.body.maxLocals it will be adjusted as required.
returns
The result of the abstract interpretation. Basically, the calculated memory layouts; i.e., the list of operands and local variables before each instruction. Each calculated memory layout represents the layout before the instruction with the corresponding program counter was interpreted. If the interpretation was aborted, the returned result object contains all necessary information to continue the interpretation if needed/desired.
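The size adjustment of someLocals described for perform can be sketched as follows. This is a simplified illustration under stated assumptions, not OPAL's implementation: the function adjust is hypothetical, unused registers are modelled as None, and every parameter is assumed to occupy exactly one register.

```scala
object InitialLocalsSketch {

  /** Pads caller-provided parameter values to `maxLocals` registers.
    * Registers without an initial value are modelled as None (an
    * assumption of this sketch).
    */
  def adjust[V](
      someLocals: IndexedSeq[V],
      parametersCount: Int,
      maxLocals: Int
  ): Vector[Option[V]] = {
    require(
      someLocals.size >= parametersCount,
      s"expected at least $parametersCount values, got ${someLocals.size}"
    )
    val provided: Vector[Option[V]] = someLocals.map(v => Some(v)).toVector
    provided.padTo(maxLocals, None) // no effect if enough values were given
  }
}
```

For example, passing the two values for this and one argument to a method with four registers yields the two values followed by two unset registers, while passing fewer values than parameters is rejected.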
def performInterpretation(strictfp: Boolean, code: Code, theDomain: D)(initialOperands: D.Operands, initialLocals: D.Locals): AIResult { val domain: theDomain.type }

Performs an abstract interpretation of the given (byte)code using the given domain and the initial operand stack and initial register assignment.
def preInterpretationInitialization(code: Code, instructions: Array[Instruction], cfJoins: BitSet, liveVariables: LiveVariables, theDomain: D)(theOperandsArray: D.OperandsArray, theLocalsArray: D.LocalsArray, theMemoryLayoutBeforeSubroutineCall: Chain[(PC, D.OperandsArray, D.LocalsArray)], theSubroutinesOperandsArray: D.OperandsArray, theSubroutinesLocalsArray: D.LocalsArray): Unit

Performs additional initializations of the Domain, if the Domain implements the trait TheAI, TheCodeStructure, TheMemoryLayout or CustomInitialization.
This method is called before the abstract interpretation is started/continued.

Attributes
protected[this]
final def synchronized[T0](arg0: ⇒ T0): T0

Definition Classes
AnyRef
def toString(): String

Definition Classes
AnyRef → Any
def tracer: Option[AITracer]

The tracer (default: None) that is called by OPAL while performing the abstract interpretation of a method.
This method is called at different points to report on the analysis progress (see org.opalj.ai.AITracer for further details).
It is possible to attach/detach a tracer at any time.
To attach a tracer to the abstract interpreter override this method in subclasses and return some tracer object.
final def wait(): Unit

Definition Classes
AnyRef
Annotations
@throws( ... )
final def wait(arg0: Long, arg1: Int): Unit

Definition Classes
AnyRef
Annotations
@throws( ... )
final def wait(arg0: Long): Unit

Definition Classes
AnyRef
Annotations
@throws( ... )
def →[B](y: B): (AI[D], B)

Implicit information
This member is added by an implicit conversion from AI[D] to ArrowAssoc[AI[D]] performed by method ArrowAssoc in scala.Predef.
Definition Classes
ArrowAssoc

AI

Related Doc: package ai

abstract class AI[D <: Domain] extends AnyRef
