Backward Slicing

A program slice is a subset of statements that is obtained from the original program, usually by removing zero or more statements. Slicing is often helpful in debugging and program understanding. For instance, it’s usually easier to locate the source of a variable on a program slice.

A backward slice is constructed from a target in the program, and all data flows in this slice end at the target.

angr has a built-in analysis, called BackwardSlice, to construct a backward program slice. This section will act as a how-to for angr’s BackwardSlice analysis, and followed by some in-depth discussion over the implementation choices and limitations.

First Step First

To build a BackwardSlice, you will need the following information as input.

  • Required CFG. A control flow graph (CFG) of the program. This CFG must be an accurate CFG (CFGAccurate).
  • Required Target, which is the final destination that your backward slice terminates at.
  • Optional CDG. A control dependence graph (CDG) derived from the CFG. angr has a built-in analysis CDG for that purpose.
  • Optional DDG. A data dependence graph (DDG) built on top of the CFG. angr has a built-in analysis DDG for that purpose.

A BackwardSlice can be constructed with the following code:

>>> import angr
# Load the project
>>> b = angr.Project("examples/fauxware/fauxware", load_options={"auto_load_libs": False})

# Generate a CFG first. In order to generate data dependence graph afterwards,
# you’ll have to keep all input states by specifying keep_stat=True. Feel free 
# to provide more parameters (for example, context_sensitivity_level)for CFG 
# recovery based on your needs.
>>> cfg = b.analyses.CFGAccurate(context_sensitivity_level=2, keep_state=True)

# Generate the control dependence graph
>>> cdg = b.analyses.CDG(cfg)

# Build the data dependence graph. It might take a while. Be patient!
>>> ddg = b.analyses.DDG(cfg)

# See where we wanna go... let’s go to the exit() call, which is modeled as a 
# SimProcedure.
>>> target_func = cfg.kb.functions.function(name="exit")
# We need the CFGNode instance
>>> target_node = cfg.get_any_node(target_func.addr)

# Let’s get a BackwardSlice out of them!
# `targets` is a list of objects, where each one is either a CodeLocation 
# object, or a tuple of CFGNode instance and a statement ID. Setting statement 
# ID to -1 means the very beginning of that CFGNode. A SimProcedure does not 
# have any statement, so you should always specify -1 for it.
>>> bs = b.analyses.BackwardSlice(cfg, cdg=cdg, ddg=ddg, targets=[ (target_node, -1) ])

# Here is our awesome program slice!
>>> print bs

Sometimes it’s difficult to get a data dependence graph, or you may simply want build a program slice on top of a CFG. That’s basically why DDG is an optional parameter. You can build a BackwardSlice solely based on CFG by doing:

>>> bs = b.analyses.BackwardSlice(cfg, control_flow_slice=True)
BackwardSlice (to [(<CFGNode exit (0x10000a0) [0]>, -1)])

Using The BackwardSlice Object

Before go ahead and using BackwardSlice object, you should be noticed that the design of this class is pretty arbitrary right now, and it is by no means to be stable in the near future. I’ll try my best to keep this documentation updated when a refactoring/redesigning occurs.

Members

After construction, a BackwardSlice has the following members that describe a program slice:

Member Mode Meaning
runs_in_slice CFG-only A networkx.DiGraph instance showing addresses of blocks and SimProcedures in the program slice, as well as transitions between them
cfg_nodes_in_slice CFG-only A networkx.DiGraph instance showing CFGNodes in the program slice and transitions in between
chosen_statements With DDG A dict mapping basic block addresses to lists of statement IDs that are part of the program slice
chosen_exits With DDG A dict mapping basic block addresses to a list of “exits”. Each exit in the list is a valid transition in the program slice

Each “exit” in chosen_exit is a tuple including a statement ID and a list of target addresses. For example, an “exit” might look like the following:

(35, [ 0x400020 ])

If the “exit” is the default exit of a basic block, it’ll look like the following:

(“default”, [ 0x400085 ])

Export an Annotated Control Flow Graph

TODO

User-friendly Representation

Take a look at BackwardSlice.dbg_repr()!

TODO

Implementation Choices

TODO

Limitations

TODO

Completeness

TODO

Soundness

TODO

results matching ""

    No results matching ""