semanticcpg apidocs 1.1.1036

package semanticcpg
Domain specific language for querying code property graphs
This is the API reference for the CPG query language, a language for mining code for defects and vulnerabilities, either interactively on a code analysis shell (REPL) or using non-interactive scripts.
Queries written in the CPG query language express graph traversals (see https://en.wikipedia.org/wiki/Graph_traversal). Similar to the standard graph traversal language "Gremlin" (see https://en.wikipedia.org/wiki/Gremlin_(programming_language)), these traversals are formulated as sequences of primitive language elements referred to as "steps". You can think of a step as a small program, similar to a Unix shell utility; however, instead of processing lines one by one, a step processes nodes of the graph.
Starting a traversal
All traversals begin by selecting a set of start nodes, e.g.,
```
cpg.method
```
will start the traversal at all methods, while
```
cpg.local
```
will start at all local variables. The complete list of starting points can be found at
```
io.shiftleft.codepropertygraph.Cpg
```
Lazy evaluation
Queries are lazily evaluated, e.g., cpg.method creates a traversal which you can add more steps to. You can, for example, evaluate the traversal by converting it to a list:
```
cpg.method.toList
```
Since toList is such a common operation, we provide the shorthand l, meaning that
```
cpg.method.l
```
provides the same result as the former query.
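Lazy evaluation can be illustrated without a CPG at all. The following is a minimal sketch that uses a plain Scala Iterator as a stand-in for a traversal; the names in `methodNames` and the `visited` counter are hypothetical and exist only for this illustration, they are not part of the query language.

```scala
object LazyTraversalSketch {
  // Hypothetical stand-in for the method nodes of a graph.
  val methodNames: List[String] = List("main", "exec", "parse")

  // Counts how many elements have actually been processed.
  var visited = 0

  // Building the pipeline does no work yet: Iterator.map is lazy.
  val traversal: Iterator[String] = methodNames.iterator.map { name =>
    visited += 1
    name.toUpperCase
  }
  val visitedBeforeEvaluation: Int = visited

  // Converting to a list forces evaluation, much like toList (or l) on a query.
  val result: List[String] = traversal.toList
  val visitedAfterEvaluation: Int = visited

  def main(args: Array[String]): Unit =
    println(s"visited before: $visitedBeforeEvaluation, after: $visitedAfterEvaluation, result: $result")
}
```

In this sketch no element is touched until `toList` runs, which is the behaviour to keep in mind when adding steps to a query.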
Properties
Nodes have "properties", key-value pairs where keys are strings and values are primitive data types such as strings, integers, or Booleans. Properties of nodes can be selected based on their key, e.g.,
```
cpg.method.name
```
traverses to all method names. Nodes can also be filtered based on properties, e.g.,
```
cpg.method.name(".*exec.*")
```
traverses to all methods whose name matches the regular expression ".*exec.*". You can see a complete list of properties by browsing to the API documentation of the corresponding step. For example, you can find the properties of method nodes at io.shiftleft.semanticcpg.language.types.structure.MethodTraversal.
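The difference between selecting a property and filtering on it can be sketched with ordinary Scala collections. `MethodNode` and its fields are hypothetical stand-ins, and the use of full-match semantics (`String.matches`) is an assumption suggested by the leading and trailing `.*` in the pattern above, not a statement about how the real step is implemented.

```scala
object PropertyFilterSketch {
  // Hypothetical node type: a method carrying two key-value properties.
  final case class MethodNode(name: String, lineNumber: Int)

  val methods: List[MethodNode] = List(
    MethodNode("main", 1),
    MethodNode("execute", 10),
    MethodNode("do_exec_cmd", 42)
  )

  // Property selection: project every node to the value stored under one key.
  val names: List[String] = methods.map(_.name)

  // Property filter: keep the nodes (not the names) whose name matches the pattern.
  // String.matches requires the whole string to match the regular expression.
  def nameMatches(regex: String): List[MethodNode] =
    methods.filter(_.name.matches(regex))

  val execLike: List[MethodNode] = nameMatches(".*exec.*")

  def main(args: Array[String]): Unit = {
    println(names)
    println(execLike)
  }
}
```

Note that the selection yields strings while the filter still yields nodes, so further steps can be chained onto the filtered result.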
Side effects
The sideEffect step is useful if you want to mutate something outside the traversal, or simply to debug it. The following example prints every typeDecl name as it traverses the graph and increments i for each one.
```
var i = 0
cpg.typeDecl.sideEffect{typeTemplate => println(typeTemplate.name); i = i + 1}.exec
```
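How a side-effecting step interacts with lazy evaluation can be sketched on a plain Iterator. The `sideEffect` helper below is a hypothetical stand-in written for this illustration, not the library's implementation, and the type names are made up.

```scala
import scala.collection.mutable.ListBuffer

object SideEffectSketch {
  // Hypothetical stand-in for the names of type declarations in a graph.
  val typeDeclNames: List[String] = List("Foo", "Bar", "Baz")

  // Runs f on each element and passes the element on unchanged.
  def sideEffect[A](it: Iterator[A])(f: A => Unit): Iterator[A] =
    it.map { a => f(a); a }

  var i = 0
  val seen: ListBuffer[String] = ListBuffer.empty[String]

  // Nothing happens yet: the traversal is only described, not executed.
  val traversal: Iterator[String] = sideEffect(typeDeclNames.iterator) { name =>
    seen += name
    i += 1
  }
  val countBeforeRun: Int = i

  // Draining the iterator triggers the side effects, one per element.
  traversal.foreach(_ => ())
  val countAfterRun: Int = i

  def main(args: Array[String]): Unit =
    println(s"before: $countBeforeRun, after: $countAfterRun, seen: ${seen.toList}")
}
```

The side effects fire only once the traversal is executed, which is why the example above ends with an explicit evaluation step.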
[advanced] Selecting multiple things from your traversal
If you are interested in multiple things along the way of your traversal, you can label any step using the as modulator and use select at the end. Note that the compiler automatically derives the correct return type as a tuple of the labelled steps, in this case with two elements.
```
cpg.method.as("method").definingTypeDecl.as("classDef").select.toList
// return type: List[(Method, TypeDecl)]
```
[advanced] For comprehensions
You can always start a new traversal from a node, e.g.,
```
val someMethod = cpg.method.head
someMethod.start.parameter.toList
```
You can use this, for example, in a for comprehension, which in this context is essentially an alternative way to select multiple intermediate things. It is more expressive, but also more computationally expensive.
```
val query = for {
  method <- cpg.method
  param <- method.start.parameter
} yield (method.name, param.name)

query.toList
```
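One way to see where the extra cost may come from is to look at how Scala desugars a for comprehension. The sketch below uses hypothetical `Method` and `Param` case classes over plain collections rather than a CPG; it shows that the comprehension is equivalent to a `flatMap` that starts one inner traversal per method.

```scala
object ForComprehensionSketch {
  // Hypothetical stand-ins for method and parameter nodes.
  final case class Param(name: String)
  final case class Method(name: String, params: List[Param])

  val methods: List[Method] = List(
    Method("main", List(Param("argc"), Param("argv"))),
    Method("exit", List(Param("code")))
  )

  // The for comprehension, written as in the example above.
  val viaFor: List[(String, String)] = (for {
    method <- methods.iterator
    param  <- method.params.iterator
  } yield (method.name, param.name)).toList

  // What the compiler generates: a new inner traversal for every method.
  val viaFlatMap: List[(String, String)] = methods.iterator.flatMap { method =>
    method.params.iterator.map(param => (method.name, param.name))
  }.toList

  def main(args: Array[String]): Unit = {
    println(viaFor)
    println(viaFlatMap)
  }
}
```

Both forms produce the same pairs; the comprehension is simply the more readable spelling when several intermediate values are needed.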
Definition Classes
shiftleft
package accesspath
Definition Classes
semanticcpg
AccessElement
AccessPath
AddressOf
ConstantAccess
Elements
FullMatchResult
IndirectionAccess
MatchResult
PointerShift
TrackedAlias
TrackedBase
TrackedFormalReturn
TrackedLiteral
TrackedMethod
TrackedMethodOrTypeRef
TrackedNamedVariable
TrackedReturnValue
TrackedTypeRef
TrackedUnknown
VariableAccess
VariablePointerShift
package codedumper
Definition Classes
semanticcpg
package dotgenerator
Definition Classes
semanticcpg
package language
Language for traversing the code property graph
Implicit conversions to specific steps, based on the node at hand. Automatically in scope when using anything in the steps package, e.g. Steps
Definition Classes
semanticcpg
package layers
Definition Classes
semanticcpg
package testing
Definition Classes
semanticcpg
package utils
Definition Classes
semanticcpg

io.shiftleft.semanticcpg

accesspath

package accesspath

Type Members

sealed abstract class AccessElement extends Comparable[AccessElement]
case class AccessPath(elements: Elements, exclusions: Seq[Elements]) extends Product with Serializable
case class ConstantAccess(constant: String) extends AccessElement with Product with Serializable
final class Elements extends Comparable[Elements]
case class FullMatchResult(stepOverPath: Option[AccessPath], stepIntoPath: Option[AccessPath], extensionDiff: Elements) extends Product with Serializable
Result of matchFull comparison
stepOverPath
The unaffected part of the access path. Some(this) for no match, None for perfect match; may have additional exclusions to this.
stepIntoPath
The affected part of the access path, mapped to be relative to this. stepIntoPath.isDefined if and only if there is a match in paths, i.e. if the call can affect the tracked variable at all. Outside of overtainting, if stepIntoPath.isDefined && stepIntoPath.elements.nonEmpty then:
- path.elements == other.elements ++ path.matchFull(other).stepIntoPath.get.elements
- extensionDiff.isEmpty
extensionDiff
extensionDiff is non-empty if and only if a proper subset is affected. Outside of overtainting, if extensionDiff is non-empty then:
- path.elements ++ path.matchFull(other).extensionDiff == other.elements
- path.matchFull(other).stepIntoPath.get.elements.isEmpty
Invariants:
- Exclusions have no invertible tail
- Only paths without overTaint can have exclusions
TODO: Figure out sensible assertions to defend these invariants
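The two identities stated for stepIntoPath and extensionDiff can be made concrete with a deliberately simplified model. The sketch below is an assumption-laden toy: paths are plain `Vector[String]` values, `ToyMatch` and `matchFull` are hypothetical names defined here, and exclusions, overtainting, stepOverPath and invertible elements are ignored entirely. It is not the library's algorithm.

```scala
object MatchFullSketch {
  type Elems = Vector[String]

  // Toy result: only the two fields whose identities are illustrated here.
  final case class ToyMatch(stepInto: Option[Elems], extensionDiff: Elems)

  def matchFull(path: Elems, other: Elems): ToyMatch =
    if (path.startsWith(other))
      // other is a prefix of path: path == other ++ stepInto, nothing is left over.
      ToyMatch(Some(path.drop(other.length)), Vector.empty)
    else if (other.startsWith(path))
      // path is a proper prefix of other: path ++ extensionDiff == other.
      ToyMatch(Some(Vector.empty), other.drop(path.length))
    else
      // No common prefix relation: the call cannot affect the tracked path.
      ToyMatch(None, Vector.empty)

  def main(args: Array[String]): Unit = {
    println(matchFull(Vector("a", "b", "c"), Vector("a", "b")))
    println(matchFull(Vector("a"), Vector("a", "b")))
    println(matchFull(Vector("a"), Vector("x")))
  }
}
```

In this toy model at most one of `stepInto.get` and `extensionDiff` is non-empty, mirroring the mutually exclusive cases described above.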
sealed trait MatchResult extends AnyRef
case class PointerShift(logicalOffset: Int) extends AccessElement with Product with Serializable
case class TrackedAlias(argIndex: Int) extends TrackedBase with Product with Serializable
trait TrackedBase extends AnyRef
case class TrackedLiteral(literal: Literal) extends TrackedBase with Product with Serializable
case class TrackedMethod(method: MethodRef) extends TrackedMethodOrTypeRef with Product with Serializable
sealed trait TrackedMethodOrTypeRef extends TrackedBase
case class TrackedNamedVariable(name: String) extends TrackedBase with Product with Serializable
case class TrackedReturnValue(call: CallRepr) extends TrackedBase with Product with Serializable
case class TrackedTypeRef(typeRef: TypeRef) extends TrackedMethodOrTypeRef with Product with Serializable

Value Members

object AccessPath extends Serializable
case object AddressOf extends AccessElement with Product with Serializable
object Elements
For handling of invertible elements, cf AccessPathAlgebra.md. The general rule is that elements concatenate normally, except for:
Elements(&) ++ Elements(*) == Elements()
Elements(*) ++ Elements(&) == Elements()
Elements(<0>) == Elements()
Elements(<i>) ++ Elements(<j>) == Elements(<i+j>)
Elements(<?>) ++ Elements(<?>) == Elements(<?>)
Elements(<i>) ++ Elements(<?>) == Elements(<?>)
Elements(<?>) ++ Elements(<i>) == Elements(<?>)
From this, one can see that <i>, * and & are invertible, <?> is idempotent, and <0> is a convoluted way of describing an empty sequence of tokens. Nevertheless, we mostly consider * as noninvertible (because it is, in real computers!) and <?> as invertible (because it is in real computers, we just don't know the offset).
Elements get a private constructor. Users should use the no-argument Elements.apply() factory method to get an empty path, and the specific concat operators for building up paths. The Elements.normalized(iter) factory method serves to build this in bulk. The unnormalized factory method is more of an escape hatch.
The elements field should never be mutated outside of this file: we compare and hash Elements by their contents, not by identity, and this breaks in case of mutation. The reason for using a mutable Array instead of an immutable Vector is that this is the lightest-weight data structure for the job. The reason for making this non-private is simply that it is truly annoying to write wrappers for all possible uses.
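The concatenation rules can be sketched as a small normalizing fold. Everything below is a toy model under stated assumptions: the types `Amp` (for &), `Star` (for *), `Shift` (a constant pointer shift <i>), `UnknownShift` (a shift of unknown offset, written <?>) and `Const` are hypothetical names defined here, not the library's classes, and the sketch assumes that adjacent constant shifts add up while an unknown shift absorbs neighbouring shifts. It applies the general rule literally and does not model the caveat about treating * as noninvertible.

```scala
object ElementsAlgebraSketch {
  sealed trait Elem
  case object Amp extends Elem                        // & (address-of)
  case object Star extends Elem                       // * (indirection)
  final case class Shift(offset: Int) extends Elem    // <i>
  case object UnknownShift extends Elem               // <?>
  final case class Const(name: String) extends Elem   // ordinary element, concatenates normally

  // Appends one element to an already normalized path, applying the cancellation rules.
  def append(path: Vector[Elem], next: Elem): Vector[Elem] =
    (path.lastOption, next) match {
      case (_, Shift(0))                      => path                              // <0> is empty
      case (Some(Amp), Star)                  => path.init                         // & ++ * cancels
      case (Some(Star), Amp)                  => path.init                         // * ++ & cancels
      case (Some(Shift(i)), Shift(j))         => append(path.init, Shift(i + j))   // shifts add up
      case (Some(UnknownShift), UnknownShift) => path                              // <?> is idempotent
      case (Some(Shift(_)), UnknownShift)     => append(path.init, UnknownShift)   // <?> absorbs <i>
      case (Some(UnknownShift), Shift(_))     => path                              // <?> absorbs <i>
      case _                                  => path :+ next
    }

  // Bulk construction: normalize a whole sequence left to right.
  def normalized(elems: Seq[Elem]): Vector[Elem] =
    elems.foldLeft(Vector.empty[Elem])(append)

  // Concatenation of two paths is normalization of their plain concatenation.
  def concat(a: Seq[Elem], b: Seq[Elem]): Vector[Elem] = normalized(a ++ b)

  def main(args: Array[String]): Unit =
    println(concat(Seq(Const("a"), Amp, Shift(1)), Seq(Shift(-1), Star, Const("b"))))
}
```

Because each step only inspects the last element of the accumulated path, cancellations cascade naturally: once two shifts sum to zero and disappear, the elements on either side become adjacent and can cancel in turn.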
case object IndirectionAccess extends AccessElement with Product with Serializable
object MatchResult extends Enumeration
object TrackedFormalReturn extends TrackedBase
object TrackedUnknown extends TrackedBase
case object VariableAccess extends AccessElement with Product with Serializable
case object VariablePointerShift extends AccessElement with Product with Serializable
