ConfLLVM: A Compiler for Enforcing Data Confidentiality in Low-Level Code

October 22, 2023 at 11:16 am

Abstract

We present a compiler-based scheme to protect the confidentiality of sensitive data in low-level applications (e.g. those written in C) in the presence of an active adversary. In our scheme, the programmer marks sensitive data by lightweight annotations on the top-level definitions in the source code. The compiler then uses a combination of static dataflow analysis, runtime instrumentation, and a novel taint-aware form of control-flow integrity to prevent data leaks even in the presence of low-level attacks. To reduce runtime overheads, the compiler uses a novel memory layout.

We implement our scheme within the LLVM framework and evaluate it on the standard SPEC-CPU benchmarks, and on larger, real-world applications, including the NGINX webserver and the OpenLDAP directory server. We find that the performance overheads introduced by our instrumentation are moderate (average 12% on SPEC), and the programmer effort to port the applications is minimal.

Methodology

Key insight: complete memory safety and perfect control-flow integrity (CFI) are neither sufficient nor necessary for preventing data leaks.

Compiler performs dataflow analysis: analyzer infer expected type of taint
Different types are placed into two separate regions
Not requiring alias analysis
Exclude the compiler from the TCB via a tiny verifier