Changelog

This lists the major changes in angr. Tracking minor changes are left as an exercise for the reader :-)

angr 5.6.8.22

Major point release! An incredible number of things have changed in the month run-up to the Cyber Grand Challenge.

  • Integration with Unicorn Engine supported for concrete execution. A new SimRun type, SimUnicorn, may step through many basic blocks at once, so long as there is no operation on symbolic data. Please use our fork of unicorn engine, which has many patches applied. All these patches are pending merge into upstream.
  • Lots of improvements and bug fixes to CFGFast. Rumors are angr’s CFG was only "optimized" for x86-64 binaries (which is really because most of our test cases are compiled as 64-bit ELFs). Now it is also “optimized” for x86 binaries :) (editor's note: angr is built with cross-architecture analysis in mind. CFG construction is pretty much the only component which has architecture-specific behavior.)
  • Lots of improvements to the VFG analysis, including speed and accuracy. However, there is still a lot to be done.
  • Lots of speed optimizations in general - CFGFast should be 3-6x faster under CPython with much less memory usage.
  • Now data dependence graph gives you a real dependence graph between variable definitions. Try data_graph and simplified_data_graph on a DDG object!
  • New state option simuvex.o.STRICT_PAGE_ACCESS will cause a SimSegfaultError to be raised whenever the guest reads/writes/executes memory that is either unmapped or doesn't have the appropriate permissions.
  • Merging of paths (as opposed to states) is performed in a much smarter way.
  • The behavior of the support_selfmodifying_code project option is changed: Before, this would allow the state to be used as a fallback source of instruction bytes when no backer from CLE is available. Now, this option makes instruction lifting use the state as the source of bytes always. When the option is disabled and execution jumps outside the normal binary, the state will be used automatically.
  • Actually support self-modifying code - if a basic block of code modifies itself, the block will be re-lifted before the next instruction starts.
  • Syscalls are handled differently now - Before you would see a SimRun for a syscall helper, now you'll just see a SimProcedure for the given syscall. Additionally, each syscall has its own address in a "syscalls segment", and syscalls are treated as jumps to this segment. This simplifies a lot of things analysis-wise.
  • CFGAccurate accepts a base_graph keyword to its constructor, e.g. CFGFast().graph, or even .graph of a function, to use as a base for analysis.
  • New fast memory model for cases where symbolic-addressed reads and writes are unlikely.
  • Conflicts between the find and avoid parameters to the Explorer otiegnqwvk are resolved correctly. (credit clslgrnc)
  • New analysis StaticHooker which hooks library functions in unstripped statically linked binaries.
  • Lifter can be used without creating an angr Project. You must manually specify the architecture and bytestring in calls to .lift() and .fresh_block(). If you like, you can also specify the architecture as a parameter to the constructor and omit it from the lifting calls.
  • Add two new analyses developed for the CGC (mostly as examples of doing static analysis with angr): Reassembler and BinaryOptimizer.

angr 4.6.6.28

In general, there have been enormous amounts of speed improvements in this release. Depending on the workload, angr should run about twice as fast. Aside from this, there have also been many submodule-specific changes:

angr

Quite a few changes and improvements are made to CFGFast and CFGAccurate in order to have better and faster CFG recovery. The two biggest changes in CFGFast are jump table resolution and data references collection, respectively. Now CFGFast resolves indirect jumps by default. You may get a list of indirect jumps recovered in CFGFast by accessing the indirect_jumps attribute. For many cases, it resolves the jump table accurately. Data references collection is still in alpha mode. To test data references collection, just pass collect_data_references=True when creating a fast CFG, and access the memory_data attribute after the CFG is constructed.

CFG recovery on ARM binaries is also improved.

A new paradigm called an "otiegnqwvk", or an "exploration technique", allows the packaging of special logic related to path group stepping.

SimuVEX

Reads/writes to the x87 fpu registers now work correctly - there is special logic that rotates a pointer into part of the register file to simulate the x87 stack.

With the recent changes to Claripy, we have configured SimuVEX to use the composite solver by default. This should be transparent, but should be considered if strange issues (or differences in behavior) arise during symbolic execution.

Claripy

Fixed a bug in claripy where __div__ was not always doing unsigned division, and added new methods SDiv and SMod for signed division and signed remainder, respectively.

Claripy frontends have been completely rewritten into a mixin-centric solver design. Basic frontend functionality (i.e., calling into the solver or dealing with backends) is handled by frontends (in claripy.frontends), and additional functionality (such as caching, deciding when to simplify, etc) is handled by frontend mixins (in claripy.frontend_mixins). This makes it considerably easier to customize solvers to your specific needE. For examples, look at claripy/solver.py.

Alongside the solver rewrite, the composite solver (which splits constraints into independent constraint sets for faster solving) has been immensely improved and is now functional and fast.

angr 4.6.6.4

Syscalls are no longer handled by simuvex.procedures.syscalls.handler. Instead, syscalls are now handled by angr.SimOS.handle_syscall(). Previously, the address of a syscall SimProcedure is the address right after the syscall instruction (e.g. int 80h), which collides with the real basic block starting at that address, and is very confusing. Now each syscall SimProcedure has its own address, just as a normal SimProcedure. To support this, there is another region mapped for the syscall addresses, Project._syscall_obj.

Some refactoring and bug fixes in CFGFast.

Claripy has been given the ability to handle annotations on ASTs. An annotation can be used to customize the behavior of some backends without impacting others. For more information, check the docstrings of claripy.Annotation and claripy.Backend.apply_annotation.

angr 4.6.5.25

New state constructor - call_state. Comes with a refactor to SimCC, a refactor to callable, and the removal of PathGroup.call. All these changes are thoroughly documented, in angr-doc/docs/structured_data.md

Refactor of SimType to make it easier to use types - they can be instanciated without a SimState and one can be added later. Comes with some usability improvements to SimMemView. Also, there's a better wrapper around PyCParser for generating SimType instances from c declarations and definitions. Again, thoroughly documented, still in the structured data doc.

CFG is now an alias to CFGFast instead of CFGAccurate. In general, CFGFast should work under most cases, and it's way faster than CFGAccurate. We believe such a change is necessary, and will make angr more approachable to new users. You will have to change your code from CFG to CFGAccurate if you are relying on specific functionalities that only exist in CFGAccurate, for example, context-sensitivity and state-preserving. An exception will be raised by angr if any parameter passed to CFG is only supported by CFGAccurate. For more detailed explanation, please take a look at the documentation of angr.analyses.CFG.

angr 4.6.3.28

PyVEX has a structural overhaul. The IRExpr, IRStmt, and IRConst modules no longer exist as submodules, and those module names are deprecated. Use pyvex.expr, pyvex.stmt, and pyvex.const if you need to access the members of those modules.

The names of the first three parameters to pyvex.IRSB (the required ones) have been changed. If you were passing the positional args to IRSB as keyword args, consider switching to positional args. The order is data, mem_addr, arch.

The optional parameter sargc to the entry_state and full_init_state constructors has been removed and replaced with an argc parameter. sargc predates being able to have claripy ASTs independent from a solver. The new system is to pass in the exact value, ast or integer, that you'd like to have as the guest program's arg count.

CLE and angr can now accept file-like streams, that is, objects that support stream.read() and stream.seek() can be passed in wherever a filepath is expected.

Documentation is much more complete, especially for PyVEX and angr's symbolic execution control components.

angr 4.6.3.15

There have been several improvements to claripy that should be transparent to users:

  • There's been a refactoring of the VSA StridedInterval classes to fix cases where operations were not sound. Precision might suffer as a result, however.
  • Some general speed improvements.
  • We've introduced a new backend into claripy: the ReplacementBackend. This frontend generates replacement sets from constraints added to it, and uses these replacement sets to increase the precision of VSA. Additionally, we have introduced the HybridBackend, which combines this functionality with a constraint solver, allowing for memory index resolution using VSA.

angr itself has undergone some improvements, with API changes as a result:

  • We are moving toward a new way to store information that angr has recovered about a program: the knowledge base. When an analysis recovers some truth about a program (i.e., "there's a basic block at 0x400400", or "the block at 0x400400 has a jump to 0x400500"), it gets stored in a knowledge-base. Analysis that used to store data (currently, the CFG) now store them in a knowledge base and can share the global knowledge base of the project, now accessible via project.kb. Over time, this knowledge base will be expanded in the course of any analysis or symbolic execution, so angr is constantly learning more information about the program it is analyzing.
  • A forward data-flow analysis framework (called ForwardAnalysis) has been introduced, and the CFG was rewritten on top of it. The framework is still in alpha stage - expect more changes to be made. Documentation and more details will arrive shortly. The goal is to refactor other data-flow analysis, like CFGFast, VFG, DDG, etc. to use ForwardAnalysis.
  • We refactored the CFG to a) improve code readability, and b) eliminate some bad designs that linger due to historical reasons.

angr 4.5.12.?

Claripy has a new manager for backends, allowing external backends (i.e., those implemented by other modules) to be used. The result is that claripy.backend_concrete is now claripy.backends.concrete, claripy.backend_vsa is now claripy.backends.vsa, and so on.

angr 4.5.12.12

Improved the ability to recover from failures in instruction decoding. You can now hook specific addresses at which VEX fails to decode with project.hook, even if those addresses are not the beginning of a basic block.

angr 4.5.11.23

This is a pretty beefy release, with over half of claripy having been rewritten and major changes to other analyses. Internally, Claripy has been unified -- the VSA mode and symbolic mode now work on the same structures instead of requiring structures to be created differently. This opens the door for awesome capabilities in the future, but could also result in unexpected behavior if we failed to account for something.

Claripy has had some major interface changes:

  • claripy.BV has been renamed to claripy.BVS (bit-vector symbol). It can now create bitvectors out of strings (i.e., claripy.BVS(0x41, 8) and claripy.BVS("A") are identical).
  • state.BV and state.BVV are deprecated. Please use state.se.BVS and state.se.BVV.
  • BV.model is deprecated. If you're using it, you're doing something wrong, anyways. If you really need a specific model, convert it with the appropriate backend (i.e., claripy.backend_concrete.convert(bv)).

There have also been some changes to analyses:

  • Interface: CFG argument keep_input_state has been renamed to keep_state. With this option enabled, both input and final states are kept.
  • Interface: Two arguments cfg_node and stmt_id of BackwardSlicing have been deprecated. Instead, BackwardSlicing takes a single argument, targets. This means that we now support slicing from multiple sources.
  • Performance: The speed of CFG recovery has been slightly improved. There is a noticeable speed improvement on MIPS binaries.
  • Several bugs have been fixed in DDG, and some sanity checks were added to make it more usable.

And some general changes to angr itself:

  • StringSpec is deprecated! You can now pass claripy bitvectors directly as arguments.

results matching ""

    No results matching ""