Доверя́й, но проверя́й: SFI safety for native-compiled Wasm

January 24, 2022 at 11:34 pm

Abstract

WebAssembly (Wasm) is a platform-independent bytecode that offers both good performance and runtime isolation. To implement isolation, the compiler inserts safety checks when it compiles Wasm to native machine code. While this approach is cheap, it also requires trust in the compiler's correctness—trust that the compiler has inserted each necessary check, correctly formed, in each proper place. Unfortunately, subtle bugs in the Wasm compiler can break—and emph{have broken}—isolation guarantees. To address this problem, we propose verifying memory isolation of Wasm binaries post-compilation. We implement this approach in VeriWasm, a static offline verifier for native x86-64 binaries compiled from Wasm; we prove the verifier's soundness, and find that it can detect bugs with no false positives. Finally, we describe our deployment of VeriWasm at Fastly.

Scalable Memory Protection in the Penglai Enclave

January 19, 2022 at 5:13 pm

Abstract

Secure hardware enclaves have been widely used for protecting security-critical applications in the cloud. However, existing enclave designs fail to meet the requirements of scalability demanded by new scenarios like serverless computing, mainly due to the limitations in their secure memory protection mechanisms, including static allocation, restricted capacity and high-cost initialization. In this paper, we propose a software-hardware co-design to support dynamic, fine-grained, large-scale secure memory as well as fast-initialization. We first introduce two new hardware primitives: 1) Guarded Page Table (GPT), which protects page table pages to support page-level secure memory isolation; 2) Mountable Merkle Tree (MMT), which supports scalable integrity protection for secure memory. Upon these two primitives, our system can scale to thousands of concurrent enclaves with high resource utilization and eliminate the high-cost initialization of secure memory using fork-style enclave creation without weakening the security guarantees.

We have implemented a prototype of our design based on Penglai, an open-sourced enclave system for RISC-V. The experimental results show that Penglai can support 1,000s enclave instances running concurrently and scale up to 512GB secure memory with both encryption and integrity protection. The overhead of GPT is 5% for memory-intensive workloads (e.g., Redis) and negligible for CPU-intensive workloads (e.g., RV8 and Coremarks). Penglai also reduces the latency of secure memory initialization by three orders of magnitude and gains 3.6x speedup for real-world applications (e.g., MapReduce).

Anatomy of Paper

Although Penglai is a very large system, this paper mainly focuses on scalability, security, and (initialization) performance of their design. This paper first introduces their motivations and the limitations of current systems (TEEs). The author them list their goals for the new design, and presents an overview.

After that, they demonstrate the new design focusing on several components.

  1. Memory Isolation Model (guarded page table);
  2. Memory Integrity (Mountable Merkle Tree), claimed challenges and novelty
  3. Secure Enclave Initialization (shadow fork)
  4. Cache-side channel mitigation (Cache Line Locking)

I don't think they follow a clear logic order to introduce the details of these aspects, but I think they tried to mention the difficulties & their solutions, and they explained them in a sensible way.

Their implementation is 2-fold, incorporating the hardware and software parts. They introduce the newly added commands, the tools & components used to realized the system, and some unimplemented functionalities.

In evaluation, they first present some microbenchmarks to show Penglai (especially the new components) only induces minor performance degradation. Then in they compare Penglai with similar systems (keystone, native, etc.) on bigger tasks. Besides, they also have 2 case studies which exhibits Penglai's strength on FaaS and distributed tasks.

Comments

I think this is a good and dense paper in general. Considering Penglai is a very huge system, the authors are clever since they focus on what made Penglai different: scalable enclave memory. However, some details are too vague and require a lot of background knowledge to understand, which is not covered in this paper. The 4 aspects in their design are scattered, and in each subsection I found no logic order, which made it difficult for understanding the big map.

Using Lightweight Formal Methods to Validate a Key-Value Storage Node in Amazon S3

January 21, 2022 at 3:09 pm

Abstract

This paper reports our experience applying lightweight formal methods to validate the correctness of ShardStore, a new key-value storage node implementation for the Amazon S3 cloud object storage service. By “lightweight formal methods" we mean a pragmatic approach to verifying the correctness of a production storage node that is under ongoing feature development by a full-time engineering team. We do not aim to achieve full formal verification, but instead emphasize automation, usability, and the ability to continually ensure correctness as both software and its specification evolve over time. Our approach decomposes correctness into independent properties, each checked by the most appropriate tool, and develops executable reference models as specifications to be checked against the implementation. Our work has prevented 16 issues from reaching production, including subtle crash consistency and concurrency problems, and has been extended by non-formal-methods experts to check new features and properties as ShardStore has evolved.

Summary

This paper explains the experience of verifying ShardStore, a KV DB implemented in Rust on Amazon S3. They first introduces what's ShardStore, and then describe their goals and the anatomy of their tasks. There are 3 major parts for validation:

  1. Correctness
  2. Crash Consistency
  3. Concurrent Execution

Correctness refers to the functionalities should not deviate from the specification. So they build a reference model alone with source code in Rust and launch conformance checking on the model. When testing, they can further use the mock model in memory and avoid using low-speed disks for performance.

For crash consistency, they want to verify persistence(operations should be consistent if they are persisted before a crash) and forward progress(grace shutdown implies the operations are persisted). They use a dependency graph to depict the interdependencies of operations and such Dependency type can also be passed by the operations. By adding DirtyReboot to the aforementioned model, they can check this property.

Finally for potential race conditions. they developed their own tool called Shuttle, since current tools cannot scale up. Their tool recruits random algorithm for model checking.

However, there are still some properties relying on unsafe Rust and cannot be covered by the tools. So they use the Rust interpreter validate these parts. And for some security-critical components, they also utilize a formal verifier.

It turns out that they found several problems, and more potential problems are eliminated before code review.

Comments

This paper is really well-written. However, I don't think there is something pretty novel in the traditional way and I'm surprised this is the best paper in the System top conference. It's more like to share readers their methodology and experiences of tuning a huge system. However, I do believe the way they present solved a real-world problem in a lightweight manner, which is really valuable.

Besides, I think the concept, Continuous Validation, is interesting. Although their method is not formal enough, they can build a system which allow the engineers without formal verification background to amend the specification(expected) model when contributing code for new functionalities, and therefore permits verifiability in the future.

Related Artifacts

Proof-carring Code

January 19, 2022 at 5:13 pm

SGX Papers

December 7, 2021 at 7:28 am

Meta

Official Documents

Reviews

Attacks:

Defenses:

Applications:

See also:

Related materials other than paper

Program Verification

Missing tiddler "Program Verification" – click to create