The Binary Analysis Platform from Carnegie Mellon University

Analyze binary code without reinventing the wheel. BAP lifts binary code into an easy-to-analyze language called BIL. BAP also comes with many popular analyses and program representations built in.

Meet BAP

Do you know what the add %rax, %rbx instruction does? It seems pretty simple at first: it stores the sum of %rax and %rbx in %rbx. But that's not quite the whole story. It also sets six different flags, which can decide the program's control flow. Subtle details, such as exactly when each flag is set, can make precise binary analysis extremely tedious. And remember, this is just addition, a relatively simple operation. Understanding the behavior of every assembly instruction and its side effects is a huge task.

Luckily, we've already done the work for you! Meet BAP, the Binary Analysis Platform. BAP makes it easy to analyze binary code by first lifting assembly instructions into a simple language called BIL (the BAP Intermediate Language), shown in the figure below. BIL explicitly represents the side effects of assembly instructions, such as flag computations. Your analysis only needs to examine the lifted BIL code to understand what the binary will do.

addr 0x0 @asm "add    %rax,%rbx"
label pc_0x0
T_t1:u64 = R_RBX:u64
T_t2:u64 = R_RAX:u64
R_RBX:u64 = R_RBX:u64 + T_t2:u64
R_CF:bool = R_RBX:u64 < T_t1:u64
R_OF:bool = high:bool((T_t1:u64 ^ ~T_t2:u64) & (T_t1:u64 ^ R_RBX:u64))
R_AF:bool = 0x10:u64 == (0x10:u64 & (R_RBX:u64 ^ T_t1:u64 ^ T_t2:u64))
R_PF:bool =
  ~low:bool(let T_acc:u64 := R_RBX:u64 >> 4:u64 ^ R_RBX:u64 in
            let T_acc:u64 := T_acc:u64 >> 2:u64 ^ T_acc:u64 in
            T_acc:u64 >> 1:u64 ^ T_acc:u64)
R_SF:bool = high:bool(R_RBX:u64)
R_ZF:bool = 0:u64 == R_RBX:u64

BIL code for add %rax, %rbx
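
To make the listing concrete, here is a minimal Python sketch that mirrors each BIL statement for a 64-bit add. It is an illustration only, not part of BAP: the function name add64_with_flags and the dictionary keys are hypothetical, and the operator grouping in the parity computation assumes that each shift in the listing is applied before the XOR.

```python
MASK64 = (1 << 64) - 1  # keep every value within 64 bits, like u64 in BIL


def add64_with_flags(rax: int, rbx: int) -> dict:
    """Model `add %rax, %rbx` (AT&T syntax) by following the BIL listing.

    Hypothetical helper for illustration; not a BAP API.
    """
    t1 = rbx & MASK64               # T_t1: original destination value
    t2 = rax & MASK64               # T_t2: source value
    result = (t1 + t2) & MASK64     # R_RBX after 64-bit wraparound

    # CF: unsigned overflow happened if the sum wrapped below the old value.
    cf = result < t1
    # OF: operands had the same sign, but the result's sign differs.
    of = ((((t1 ^ (~t2 & MASK64)) & (t1 ^ result)) >> 63) & 1) == 1
    # AF: carry out of bit 3 into bit 4.
    af = ((result ^ t1 ^ t2) & 0x10) == 0x10
    # PF: set when the low byte of the result has an even number of 1 bits.
    acc = (result >> 4) ^ result
    acc = (acc >> 2) ^ acc
    acc = (acc >> 1) ^ acc
    pf = (acc & 1) == 0
    # SF: the high bit of the result. ZF: the result is zero.
    sf = (result >> 63) == 1
    zf = result == 0

    return {"rbx": result, "CF": cf, "OF": of, "AF": af,
            "PF": pf, "SF": sf, "ZF": zf}
```

For example, add64_with_flags(rax=1, rbx=0x7FFFFFFFFFFFFFFF) wraps into the sign bit, so OF and SF are set while CF stays clear. Adding 1 to 0xFFFFFFFFFFFFFFFF instead yields zero with CF and ZF set and OF clear.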

The goal of the Binary Analysis Platform (BAP) is to make it easy to develop binary analysis techniques and tools. In particular, BAP provides:

Instruction Modeling

BAP lifts assembly instructions into a simple language called BIL. Unlike assembly, BIL has only a few language constructs, which makes it easy to analyze. In addition, BIL explicitly represents side effects such as flag computations.

Built-in Analyses

BAP has many analyses and representations built in for you to build on. BAP can represent programs in Control Flow Graph (CFG) and Static Single Assignment (SSA) forms. BAP also has a suite of common compiler optimizations and analyses.
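
As a rough illustration of what SSA form means, the following Python sketch renames a straight-line sequence of assignments so that every variable is defined exactly once. It is not BAP's implementation or API: the to_ssa function and the (destination, operands) tuple encoding are assumptions made for this example, and real SSA construction also needs phi functions at control-flow joins, which this sketch leaves out.

```python
def to_ssa(stmts):
    """Rename straight-line assignments into SSA form.

    `stmts` is a list of (dest, [operands]) pairs. Hypothetical encoding
    for illustration; branches and phi functions are not handled.
    """
    version = {}  # latest version number assigned to each variable
    out = []
    for dest, operands in stmts:
        # Uses refer to the latest definition; a variable that was never
        # defined in this block is an input and gets version 0.
        renamed = [f"{v}_{version.get(v, 0)}" for v in operands]
        # Each definition creates a fresh version of the destination.
        version[dest] = version.get(dest, 0) + 1
        out.append((f"{dest}_{version[dest]}", renamed))
    return out


# The first four statements of the add listing, reduced to their variables.
block = [
    ("t1", ["rbx"]),
    ("t2", ["rax"]),
    ("rbx", ["rbx", "t2"]),
    ("cf", ["rbx", "t1"]),
]
ssa_block = to_ssa(block)
```

After renaming, the write to rbx becomes rbx_1 while its operand stays rbx_0, so the later carry-flag computation unambiguously reads the new value.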

Conversion Utilities

BAP is written in OCaml. Don't want to learn it? No problem! BIL can be exported in a variety of formats, including protobuf, XML, and JSON. BAP can also convert BIL into LLVM bitcode.

Our research has two basic thrusts. First, we work to improve our binary analysis techniques and tools. Second, these improvements motivate new applications for binary analysis. The two thrusts are synergistic: better binary analysis yields better results in every application that builds on it.

Specifically, some of the areas we are working on include:

  • Scalable formal verification techniques
  • Automatic reverse engineering
  • Vulnerability-based signature generation
  • Automatic exploit generation
  • Crypto verification
  • Malware analysis
  • Vulnerability detection in COTS software

If you are interested in collaborating in any of these areas, please contact David Brumley.

Documentation

Support

Download

If you find BAP useful, we would appreciate an email. This will help us secure funding to continue this project.

Stable releases

The current stable version of BAP is 0.7.

Development

The BAP development repository is currently not available.

Projects Using BAP

Have a project that uses BAP that you'd like listed here? Send us an email.

Credits

The BAP team

Contributors

We would especially like to thank everyone who has contributed to BAP and to the general development and direction of our platform. In particular, we would like to recognize the following people for their ideas and contributions:

  • Chris Williamson
  • Eric Lee
  • Spencer Whitman
  • Ivan Jager
  • Jonghyup Lee
  • Matthew Maurer

Funding

BAP is made possible by grants from CyLab and DARPA.

The History of BAP

BAP is the successor to the binary analysis techniques developed for Vine as part of David Brumley's work on the BitBlaze project. Although BAP is a complete rewrite of Vine, BitBlaze is still a great project.