Abstract
We present a compiler-based scheme to protect the confidentiality of sensitive data in low-level applications (e.g. those written in C) in the presence of an active adversary. In our scheme, the programmer marks sensitive data by lightweight annotations on the top-level definitions in the source code. The compiler then uses a combination of static dataflow analysis, runtime instrumentation, and a novel taint-aware form of control-flow integrity to prevent data leaks even in the presence of low-level attacks. To reduce runtime overheads, the compiler uses a novel memory layout.
We implement our scheme within the LLVM framework and evaluate it on the standard SPEC-CPU benchmarks, and on larger, real-world applications, including the NGINX webserver and the OpenLDAP directory server. We find that the performance overheads introduced by our instrumentation are moderate (average 12% on SPEC), and the programmer effort to port the applications is minimal.
Methodology
Key insight: complete memory safety and perfect control-flow integrity (CFI) are neither sufficient nor necessary for preventing data leaks.
- Compiler performs dataflow analysis: analyzer infer expected type of taint
- Different types are placed into two separate regions
- Not requiring alias analysis
- Exclude the compiler from the TCB via a tiny verifier