Exploiting Unprotected I/O Operations in AMD’s Secure Encrypted Virtualization

 February 16, 2024 at 7:58 pm

USENIX'19 Paper

System

Attacks

  • Monitoring nPT (nested page table) and gPT (guest page table)
  • Page faults => a list of accessed pages => find the actual address of the target page
  • Decryption oracle: move the ciphertext into the SWIOTLB and let the guest decrypt it

Technical Details

  • Monitoring/MITM the DMA operations
  • Pattern matching to find the accessed address in private memory (see the sketch after this list)
  • ioremap to replace the ciphertext
  • QEMU notifies the VM about a DMA write. Does it also notify the device to read via DMA?
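
A minimal sketch of the page-fault pattern matching, assuming the attacker has already collected an ordered trace of faulting guest-physical frames by clearing present bits in the nPT; the frame signature and all names are illustrative, not from the paper's artifact:

    # Locate a victim page from a page-fault trace: slide a known access
    # signature over the trace and report the frame touched right after it.
    def find_target_frame(fault_trace, signature):
        candidates = []
        n = len(signature)
        for i in range(len(fault_trace) - n):
            if fault_trace[i:i + n] == signature:
                candidates.append(fault_trace[i + n])
        return candidates

    # Toy usage: the victim code path always touches frames 0x12, 0x47
    # right before the target page.
    trace = [0x03, 0x12, 0x47, 0x9a, 0x05, 0x12, 0x47, 0x9a]
    print(find_target_frame(trace, [0x12, 0x47]))  # [154, 154], i.e. 0x9a twice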

Building GPU TEEs using CPU Secure Enclaves with GEVisor

 February 8, 2024 at 4:07 pm

SoCC'23

System

  • Trust TPM & Secure Boot
  • Security Monitor implemented at the VMX-root level; both the model and the implementation are formally verified.
  • Enforces access control by trapping instructions (see the sketch after this list)
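
A minimal sketch of trap-based access control, assuming a monitor that intercepts sensitive accesses and consults a per-resource allowlist; the resource names and policy below are made up for illustration, not GEVisor's actual data structures:

    # A security monitor at VMX-root level traps an access and allows it
    # only if the (requester, region) pair is on the allowlist.
    ALLOWED = {
        ("gpu_tee_vm", "mmio_bar0"),
        ("gpu_tee_vm", "dma_buf_0"),
    }

    def on_trap(requester, region, write):
        # `write` is kept for a finer-grained policy; unused in this toy.
        if (requester, region) in ALLOWED:
            return "allow"
        return "deny"  # a real monitor might inject a fault instead

    print(on_trap("gpu_tee_vm", "mmio_bar0", write=True))    # allow
    print(on_trap("untrusted_vm", "dma_buf_0", write=True))  # deny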

Covered Attack Surfaces

  • DMA buffer
  • MMIO mapping
  • GPU context

Attack vectors targeting these surfaces:

  • Malicious peripherals
  • Concurrent CPU Access (specific to GEVisor)

Question

  • How to ensure the integrity of Hypercalls? (i.e., protect the ring buffer)
  • How to prove active attackers would be detected by GEVisor? (and how to define active attackers?)
  • Why does the formal verification pass for GEVisor with flaws? (or what does pass mean for GEVisor?)

Honeycomb: Secure and Efficient GPU Executions via Static Validation

 February 8, 2024 at 1:28 pm

OSDI'23

  • Huge implementation effort
  • Type 1 hypervisor
  • Static analysis (validator-based; see the sketch after this list)
  • Still need to rely on the driver (but not fully trust it)
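
A minimal sketch of the validator idea, assuming a toy IR in which every load/store names a declared buffer and a constant offset; Honeycomb's real validator analyzes AMD GPU binaries, so everything below is illustrative:

    # Accept a kernel only if every memory access is provably in bounds.
    BUFFERS = {"buf": (0x1000, 0x2000)}  # name -> (base, limit)

    def validate(instructions):
        for op, buf, offset in instructions:       # e.g. ("load", "buf", 8)
            base, limit = BUFFERS[buf]
            if not (0 <= offset and base + offset < limit):
                return f"reject: {op} {buf}+{offset} out of bounds"
        return "accept"

    print(validate([("load", "buf", 8), ("store", "buf", 0xff8)]))  # accept
    print(validate([("store", "buf", 0x1000)]))                     # reject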

SinClave: Hardware-assisted Singletons for TEEs

 February 5, 2024 at 4:54 pm

PDF Middleware ’23

Attack

If software configuration that is not reflected in the enclave measurement (§3.3) dictates the to-be-loaded program, the adversary can simply start its own report-server implementation.

  • Is it correct? Yes, at least for SCONE.

Defense

The basic idea behind our solution is that system software adds an additional page, the instance page (see Figure 5), to the enclave dynamically during enclave construction. This page, in particular, contains an attestation token, and the verifier’s cryptographic identity. The attestation token is unique and generated by the verifier in a previous step.

  • How can SinClave ensure that such a token is securely protected (i.e., not leaked to the adversary)?
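
A minimal sketch of the instance-page mechanism quoted above, with toy SHA-256 hashes standing in for SGX measurements; the page layout and all names are illustrative, not SinClave's:

    # The verifier issues a one-time token; the launcher places it (plus the
    # verifier's identity) on the instance page added during construction,
    # so the attested measurement is bound to this specific instance.
    import hashlib, os

    def measure(code, instance_page):
        return hashlib.sha256(code + instance_page).hexdigest()

    token = os.urandom(16)                    # unique, from the verifier
    page = token + b"verifier-pubkey"
    quote = measure(b"enclave-code", page)    # reported via attestation

    # Verifier side: recompute the expected measurement for *this* token.
    assert quote == measure(b"enclave-code", token + b"verifier-pubkey")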

Detection: Logic bugs, inconsistency, typo, etc.

 February 5, 2024 at 3:25 pm

Semantic Inconsistency

  • Bugs as Deviant Behavior: A General Approach to Inferring Errors in Systems Code

This is a very interesting paper, maybe the earliest work in this area. The idea is simple: some operations imply implicit beliefs about program state; for example, a dereference implies the belief that the pointer is valid.
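
A minimal sketch of belief inference over hand-written facts (the real analysis extracts them from code and is path- and order-sensitive):

    # A dereference implies "p is non-NULL"; a NULL check implies "p may be
    # NULL". Holding both beliefs about the same variable is a contradiction
    # worth reporting.
    facts = [
        ("f", "p", "checked_null"),   # if (p == NULL) ...
        ("f", "p", "dereferenced"),   # *p
    ]

    def contradictions(facts):
        beliefs = {}
        for fn, var, event in facts:
            beliefs.setdefault((fn, var), set()).add(event)
        return [k for k, ev in beliefs.items()
                if {"checked_null", "dereferenced"} <= ev]

    print(contradictions(facts))  # [('f', 'p')]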

  • Detecting Missing-Check Bugs via Semantic- and Context-Aware Criticalness and Constraints Inferences PDF

Solely focuses on missing checks in the Linux kernel.

  • APISAN: Sanitizing API Usages through Semantic Cross-checking PDF

To find API usage errors, APISAN automatically infers semantic correctness, called semantic beliefs, by analyzing the source code of different uses of the API.

the more API patterns developers use in similar contexts, the more confidence we have about the correct API usage.

This is a probabilistic approach. Symbolic execution helps construct a "semantic belief" about the outcome (and the pre- and post-conditions) of executing an API, and the belief becomes more precise as more samples enter the dataset.

This work is based mainly on relaxed symbolic execution.
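
A minimal sketch of the majority-vote scoring behind semantic beliefs, over hand-written (api, pattern) observations rather than symbolic traces:

    # If most call sites follow one usage pattern, minority sites are
    # ranked as likely misuses, with confidence tied to the majority share.
    from collections import Counter

    def rank_deviations(observations):
        by_api = {}
        for api, pattern in observations:
            by_api.setdefault(api, Counter())[pattern] += 1
        reports = []
        for api, counts in by_api.items():
            majority, _ = counts.most_common(1)[0]
            total = sum(counts.values())
            for pattern, n in counts.items():
                if pattern != majority:
                    reports.append((api, pattern, 1 - n / total))
        return sorted(reports, key=lambda r: -r[2])

    obs = [("foo", "retval_checked")] * 9 + [("foo", "retval_ignored")]
    print(rank_deviations(obs))  # [('foo', 'retval_ignored', 0.9)]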

  • Scalable and systematic detection of buggy inconsistencies in source code Paper

Analyzes bugs in cloned code. Search works by hashing code snippets at the AST level.

  • Gap between theory and practice: an empirical study of security patches in solidity
  • Which bugs are missed in code reviews: An empirical study on SmartSHARK dataset

Semantic bugs are missed most often (51.34%) in the review process; 7 out of 96 missed bugs are typos.

Careless programming causes many bugs in the evaluated software. For example, simple bugs such as typos account for 7.8–15.0% of semantic bugs in Mozilla, Apache, and the Linux kernel.

  • Automating Code Review Activities by Large-Scale Pre-training

Finding Bugs Using Your Own Code: Detecting Functionally-similar yet Inconsistent Code

 February 1, 2024 at 4:01 pm

Applies clustering to code to find inconsistencies.

Key insight: Our approach is inspired by the observation that many bugs in software manifest as inconsistencies deviating from their non-buggy counterparts, namely the code snippets that implement the similar logic in the same codebase. Such bugs, regardless of their types, can be detected by identifying functionally-similar yet inconsistent code snippets in the same codebase
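
A minimal sketch of the outlier idea, with random vectors standing in for embeddings of program dependency graphs; the real system clusters at two levels and trains repo-specific embeddings:

    # Within a cluster of functionally similar snippets, the snippet
    # farthest from the centroid is flagged as a potential inconsistency.
    import numpy as np

    rng = np.random.default_rng(0)
    embeddings = rng.normal(size=(5, 8))   # 5 snippets, 8-dim embeddings
    embeddings[4] += 3.0                   # one snippet deviates

    centroid = embeddings.mean(axis=0)
    dist = np.linalg.norm(embeddings - centroid, axis=1)
    print("suspect snippet:", int(dist.argmax()))  # -> 4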

Pros

  • Two-level clustering
  • Embedding on program dependency graph
  • Found bugs (at function level) in large projects
  • Embedding on code structures
  • Generality

Cons

  • Need repo-specific training
  • Literals are removed at the IR level

To abstract Constructs, we preserve only the variable types for each program statement and remove all variable names and versions.

  • Needs repository-specific configuration of thresholds
  • Still a very high false-positive rate

Others

  • Granularity at the function level

If an inconsistent cluster contains more than a fixed number (e.g., 2) of deviating nodes (i.e., nodes in Table 2), the inconsistency is deprioritized because it is unlikely to be a true inconsistency (i.e., a single inconsistency rarely involves many deviations).

Detecting Misuses of Security APIs: A Systematic Review

 December 23, 2023 at 11:23 am

Prompt Engineering

 October 25, 2023 at 9:47 pm

Reference

  • Few Shot

Providing Examples

  • Chain-of-Thought

Let's think step by step.

  • Tree of Thoughts

Imagine three different experts are answering this question. All experts will write down 1 step of their thinking, then share it with the group. Then all experts will go on to the next step, etc. If any expert realises they're wrong at any point then they leave. The question is...

  • Retrieval Augmented Generation

High-level idea: retrieve from an external source of information and prepend the results to the prompt (a combined sketch follows this list).

  • Step-back Prompting

Paper
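
A minimal sketch combining the techniques above (few-shot examples, a chain-of-thought trigger, and retrieved context); retrieve() is a stand-in for any search index, not a real API:

    def retrieve(question, k=2):
        corpus = {"doc1": "SWIOTLB is a bounce buffer for DMA.",
                  "doc2": "SEV encrypts guest memory with per-VM keys."}
        return list(corpus.values())[:k]   # toy: no actual ranking

    def build_prompt(question, examples):
        shots = "\n\n".join(f"Q: {q}\nA: {a}" for q, a in examples)
        context = "\n".join(retrieve(question))
        return (f"Context:\n{context}\n\n{shots}\n\n"
                f"Q: {question}\nA: Let's think step by step.")

    print(build_prompt("Why can the SWIOTLB act as a decryption oracle?",
                       [("What is DMA?", "Direct memory access by devices.")]))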

RTFM! Automatic Assumption Discovery and Verification Derivation from Library Document for API Misuse Detection

 October 24, 2023 at 4:03 pm

Abstract

To use library APIs, a developer is supposed to follow guidance and respect some constraints, which we call integration assumptions (IAs). Violations of these assumptions can have serious consequences, introducing security-critical flaws such as use-after-free, NULL-dereference, and authentication errors. Analyzing a program for compliance with IAs involves significant effort and needs to be automated. A promising direction is to automatically recover IAs from a library document using Natural Language Processing (NLP) and then verify their consistency with the ways APIs are used in a program through code analysis. However, a practical solution along this line needs to overcome several key challenges, particularly the discovery of IAs from loosely formatted documents and interpretation of their informal descriptions to identify complicated constraints (e.g., data-/control-flow relations between different APIs).

In this paper, we present a new technique for automated assumption discovery and verification derivation from library documents. Our approach, called Advance, utilizes a suite of innovations to address those challenges. More specifically, we leverage the observation that IAs tend to express a strong sentiment in emphasizing the importance of a constraint, particularly those security-critical, and utilize a new sentiment analysis model to accurately recover them from loosely formatted documents. These IAs are further processed to identify hidden references to APIs and parameters, through an embedding model, to identify the information-flow relations expected to be followed. Then our approach runs frequent subtree mining to discover the grammatical units in IA sentences that tend to indicate some categories of constraints that could have security implications. These components are mapped to verification code snippets organized in line with the IA sentence's grammatical structure, and can be assembled into verification code executed through CodeQL to discover misuses inside a program. We implemented this design and evaluated it on 5 popular libraries (OpenSSL, SQLite, libpcap, libdbus and libxml2) and 39 real-world applications. Our analysis discovered 193 API misuses, including 139 flaws never reported before.

System

  1. Extract IAs from the documents
  2. Identify the functions referred to in the IA
  3. Generate CodeQL queries for the target API (mis)uses (see the sketch below)
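
A minimal sketch of step 3 as template instantiation; the query text is schematic and untested, assembled from a made-up template rather than produced by Advance:

    # Instantiate a per-API CodeQL template from an extracted assumption.
    IA = {"api": "BIO_new"}

    TEMPLATE = """import cpp

    from FunctionCall call
    where call.getTarget().hasName("{api}")
    select call, "call to {api}: verify the documented constraint here"
    """

    print(TEMPLATE.format(**IA))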

Strengths and Weaknesses

  • High accuracy and low FP (how?)
  • Identified a lot of previously-unknown bugs
  • New dataset
  • Relies heavily on well-formed documentation?
  • The false-negative rate is not very low
  • Generating verification code takes a long time

When GPT Meets Program Analysis: Towards Intelligent Detection of Smart Contract Logic Vulnerabilities in GPTScan

 October 22, 2023 at 2:14 pm

Arxiv Paper

Using LLM (GPT3.5) to discover logic bugs in smart contracts

Questions

  • How to reduce false positives? Vulnerability confirmation.
  • Does temperature = 0 really increase reliability? To my understanding, it just makes the answer deterministic. They do “mimic-in-the-background” prompting; is that a good method? Any evaluations?
  • How are the Filtering Types determined for different types of vulnerabilities?

Highlights

Importance of Logic Vulnerabilities

The third group of vulnerabilities requires high-level semantical oracles for detection and is closely related to the business logic. Most of these vulnerabilities are not detectable by existing static analysis tools. This group comprises six main types of vulnerabilities: (S1) price manipulation, (S2) ID-related violations, (S3) erroneous state updates, (S4) atomicity violation, (S5) privilege escalation, and (S6) erroneous accounting.

Use of Cheaper GPT-3.5

Methodology

Challenges

  • A big code base cannot be directly fed into GPT as a whole => narrow down the scope

Can we break down vulnerability types in a manner that allows GPT, as a generic and intelligent code understanding tool, to recognize them directly from code-level semantics?

  • Unreliable answers (FP?) => confirm vulnerabilities

Some Gadgets

GPT's Disadvantages (may not be general to LLMs)

However, we found that GPT struggles to comprehend the concept of “before,”

Missing Context Can Cause FP

For these two types, the main reason for the false alarms is that these vulnerabilities require specific triggering conditions involving other related logic, which may not be contained within a single function and its callers or callees.

New Vulnerabilities

GPTScan successfully discovered 9 vulnerabilities from 3 different types, which did not appear in the audit reports of Code4rena

GPT4

We also conducted a preliminary test using GPT-4, but we did not observe a notable improvement, while the cost increased 20 times.

Prompts

Prompt for scenario and property matching

System: You are a smart contract auditor. You will be asked questions related to code properties. You can mimic answering them in the background five times and provide me with the most frequently appearing answer. Furthermore, please strictly adhere to the output format specified in the question; there is no need to explain your answer.

Scenario Matching

Given the following smart contract code, answer the questions below and organize the result in a json format like {"1": "Yes" or "No", "2": "Yes" or "No"}.

"1": [%SCENARIO_1%]?

"2": [%SCENARIO_2%]?

[%CODE%]

Property Matching

Does the following smart contract code "[%SCENARIO, PROPERTY%]"? Answer only "Yes" or "No".

[%CODE%]

Prompt for finding related variables/statements.

In this function, which variable holds the value of total minted share or amount? Please answer in a section starts with "VariableA:".

In this function, which variable or function holds the total supply/liquidity AND is used by the conditional branch to determine the supply/liquidity is 0? Please answer in a section starts with "VariableB:".

In this function, which variable or function holds the value of the deposit/mint/add amount? Please answer in a section starts with "VariableC:".

Please answer in the following json format: {"VariableA":{"Variable name":"Description"}, "VariableB":{"Variable name":"Description"}, "VariableC":{"Variable name":"Description"}}

[%CODE%]
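
A minimal sketch of driving the scenario-matching prompt above with the openai-python v1 chat interface at temperature 0; treat it as an illustration, not GPTScan's actual harness:

    import json
    from openai import OpenAI

    SYSTEM = ("You are a smart contract auditor. ... strictly adhere to the "
              "output format specified in the question.")

    def scenario_match(code, scenarios):
        numbered = "\n".join(f'"{i + 1}": [{s}]?' for i, s in enumerate(scenarios))
        user = ("Given the following smart contract code, answer the questions "
                'below and organize the result in a json format like '
                '{"1": "Yes" or "No"}.\n' + numbered + "\n\n" + code)
        client = OpenAI()
        resp = client.chat.completions.create(
            model="gpt-3.5-turbo", temperature=0,
            messages=[{"role": "system", "content": SYSTEM},
                      {"role": "user", "content": user}])
        return json.loads(resp.choices[0].message.content)  # {"1": "Yes", ...}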

Expressing Information Flow Properties

 September 6, 2023 at 12:06 pm

Abstract

Industries and governments are increasingly compelled by regulations and public pressure to handle sensitive information responsibly. Regulatory requirements and user expectations may be complex and have subtle implications for the use of data. Information flow properties can express complex restrictions on data usage by specifying how sensitive data (and data derived from sensitive data) may flow throughout computation. Controlling these flows of information according to the appropriate specification can prevent both leakage of confidential information to adversaries and corruption of critical data by adversaries. There is a rich literature expressing information flow properties to describe the complex restrictions on data usage required by today’s digital society. This monograph summarizes how the expressiveness of information flow properties has evolved over the last four decades to handle different threat models, computational models, and conditions that determine whether flows are allowed. In addition to highlighting the significant advances of this area, we identify some remaining problems worthy of further investigation.

Considerations

Theoretical Foundations

  • Lattice theory
  • Noninterference (formal statement below)
  • Logic models (e.g., temporal logic)
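
For reference, the classical termination-insensitive noninterference condition, written in LaTeX with =_L denoting equality of the low (public) projections of states:

    % A program P is noninterferent if low-equivalent inputs
    % always produce low-equivalent outputs.
    \forall s_1, s_2.\quad s_1 =_L s_2 \;\Longrightarrow\; P(s_1) =_L P(s_2)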

Threat Models

  • Termination: does termination leak high (secret) information?
  • Time: timing differences
  • Interaction: input/output flow
  • Program code

Computational Models

  • Nondeterminism: the program is not deterministic
  • Composition of systems: e.g., feedback, where the output becomes the next input. Composition can lead to nondeterminism
  • Concurrency

Re(de)classification

  • Need to consider: what, where, when, and who
  • Introduce Delimited Release to declassify

Assisting Static Analysis with Large Language Models: A ChatGPT Experiment

 September 6, 2023 at 11:49 am

PDF Repo

Abstract

Recent advances of Large Language Models (LLMs), e.g., ChatGPT, exhibited strong capabilities of comprehending and responding to questions across a variety of domains. Surprisingly, ChatGPT even possesses a strong understanding of program code. In this paper, we investigate where and how LLMs can assist static analysis by asking appropriate questions. In particular, we target a specific bug-finding tool, which produces many false positives from the static analysis. In our evaluation, we find that these false positives can be effectively pruned by asking carefully constructed questions about function-level behaviors or function summaries. Specifically, with a pilot study of 20 false positives, we can successfully prune 8 out of 20 based on GPT-3.5, whereas GPT-4 had a near-perfect result of 16 out of 20, where the four failed ones are not currently considered/supported by our questions, e.g., involving concurrency. Additionally, it also identified one false negative case (a missed bug). We find LLMs a promising tool that can enable a more effective and efficient program analysis.

Noticeable Points

Prompt Design

  • Chain of Thought: Step-by-step result
  • Task decomposition: breaking down the huge tasks into smaller pieces
  • Progressive prompt: interactively feed information

Challenges in traditional code analysis

  • Inherent Knowledge Boundaries.

Static analysis requires domain knowledge to model special functions which cannot be analyzed. E.g., assembly code, hardware behaviors, concurrency, and compiler built-ins.

  • Exhaustive Path Exploration.

GPU TEE: Potential Problems

 September 6, 2023 at 11:31 am

Partial GPU

Passing a partial GPU to a VM using vGPU and SR-IOV.

Such sharing can be dangerous.

Isolation inside the CVM

How exactly does the GPU access the CVM's memory? Part of the memory is marked as shared, while the rest remains encrypted.

Constant-Time Foundations for the New Spectre Era

 February 27, 2023 at 5:48 pm

Abstract

The constant-time discipline is a software-based countermeasure used for protecting high assurance cryptographic implementations against timing side-channel attacks. Constant-time is effective (it protects against many known attacks), rigorous (it can be formalized using program semantics), and amenable to automated verification. Yet, the advent of micro-architectural attacks makes constant-time as it exists today far less useful.

This paper lays foundations for constant-time programming in the presence of speculative and out-of-order execution. We present an operational semantics and a formal definition of constant-time programs in this extended setting. Our semantics eschews formalization of microarchitectural features (that are instead assumed under adversary control), and yields a notion of constant-time that retains the elegance and tractability of the usual notion. We demonstrate the relevance of our semantics in two ways: First, by contrasting existing Spectre-like attacks with our definition of constant-time. Second, by implementing a static analysis tool, Pitchfork, which detects violations of our extended constant-time property in real world cryptographic libraries.

Methodology

  • A model to incorporate speculative execution.
  • Abstract machine: fetch, execute, retire.
  • Reorder buffer => out-of-order and speculative execution
  • Directives can be issued to model the detailed schedule of the reordered micro-ops (see the sketch after this list)
  • Observations are modeled in this process to mimic the leakage visible to the attacker
  • Modeled instructions: op, fence, load, store, br, call, ret
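
A heavily simplified sketch of the directive/observation style, assuming a toy machine whose adversary-chosen schedule decides which buffered op runs next and where each memory op leaks its address; no real speculation or retirement is modeled:

    # The adversary's directives pick entries out of a tiny reorder buffer;
    # every executed load/store emits its address as an observation.
    program = [("load", "secret_idx"), ("load", "table+secret")]

    def run(program, directives):
        rob, observations = list(program), []
        for d in directives:              # d = index into the reorder buffer
            op, addr = rob[d]
            if op in ("load", "store"):
                observations.append(addr)
        return observations

    # Worst-case schedule: execute the dependent load early.
    print(run(program, directives=[1, 0]))  # ['table+secret', 'secret_idx']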

Artifact

  • Symbolic execution based on angr
  • Schedules the worst-case reorder buffer => maximizes potential leakage
  • Can indeed find threats

Comments

  • Are the threats confirmed by the developers?
  • Is the model formalized in a proof checker?
  • Execution time of instructions is not modeled? (or does it not need to be modeled?)

PrivGuard: Privacy Regulation Compliance Made Easier

 March 1, 2023 at 4:58 pm

Paper

Abstract

Continuous compliance with privacy regulations, such as GDPR and CCPA, has become a costly burden for companies from small-sized start-ups to business giants. The culprit is the heavy reliance on human auditing in today's compliance process, which is expensive, slow, and error-prone. To address the issue, we propose PrivGuard, a novel system design that reduces human participation required and improves the productivity of the compliance process. PrivGuard is mainly comprised of two components: (1) PrivAnalyzer, a static analyzer based on abstract interpretation for partly enforcing privacy regulations, and (2) a set of components providing strong security protection on the data throughout its life cycle. To validate the effectiveness of this approach, we prototype PrivGuard and integrate it into an industrial-level data governance platform. Our case studies and evaluation show that PrivGuard can correctly enforce the encoded privacy policies on real-world programs with reasonable performance overhead.

Methodology

Users prescribe their privacy policies, and the analyst can then leverage user data for data analysis tasks. The prescribed privacy policies are automatically enforced by PrivGuard, which executes inside a TEE.

The policy is prescribed in a formal language, and the data analysis program is statically analyzed by PrivAnalyzer to check privacy-policy compliance. PrivAnalyzer uses a Python interpreter as an abstract interpreter to check whether the program might break the privacy policies. Since the Python program may use many third-party libraries, the authors propose function summaries for these libraries and over-approximate the result.
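
A minimal sketch of the abstract-interpretation idea, with policy labels propagated through hand-written function summaries; the labels, summaries, and policy are invented for illustration:

    # Run the analysis program over labeled abstract values; third-party
    # calls are over-approximated by summaries instead of being analyzed.
    class Labeled:
        def __init__(self, label):
            self.label = label

    SUMMARIES = {
        "mean":      lambda v: Labeled(v.label),  # keeps the label
        "anonymize": lambda v: Labeled(None),     # summarized as declassifying
    }

    def check(program, data):
        v = data
        for fn in program:
            v = SUMMARIES[fn](v)
        return "compliant" if v.label is None else f"violates {v.label}"

    print(check(["mean"], Labeled("no-raw-output")))               # violates
    print(check(["anonymize", "mean"], Labeled("no-raw-output")))  # compliant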

Weakness

  • Intentional information leakage (analyst is assumed trusted)
  • Language level bypasses

Sidenote

Some papers mentioned in this work are also interesting, especially those related to dealing with loops and branches in static analysis.

This paper may need to be checked again!

Privado: Practical and Secure DNN Inference with Enclaves

 March 1, 2023 at 4:57 pm

Abstract

Cloud providers are extending support for trusted hardware primitives such as Intel SGX. Simultaneously, the field of deep learning is seeing enormous innovation as well as an increase in adoption. In this paper, we ask a timely question: "Can third-party cloud services use Intel SGX enclaves to provide practical, yet secure DNN Inference-as-a-service?" We first demonstrate that DNN models executing inside enclaves are vulnerable to access pattern based attacks. We show that by simply observing access patterns, an attacker can classify encrypted inputs with 97% and 71% attack accuracy for MNIST and CIFAR10 datasets on models trained to achieve 99% and 79% original accuracy respectively. This motivates the need for PRIVADO, a system we have designed for secure, easy-to-use, and performance efficient inference-as-a-service. PRIVADO is input-oblivious: it transforms any deep learning framework that is written in C/C++ to be free of input-dependent access patterns thus eliminating the leakage. PRIVADO is fully-automated and has a low TCB: with zero developer effort, given an ONNX description of a model, it generates compact and enclave-compatible code which can be deployed on an SGX cloud platform. PRIVADO incurs low performance overhead: we use PRIVADO with Torch framework and show its overhead to be 17.18% on average on 11 different contemporary neural networks.

Model

  • Service: in-enclave ML model; the model is not public
  • Users: data providers; inputs & outputs are secret

Attack

Infer the output label from the memory access trace collected while the user's input is being processed.

  • DNNs contain data-dependent branches
  • An ML model (linear regression) is trained on memory access traces and output labels
  • It achieves high accuracy in inferring the output label from the memory trace

Defense

  • Data dependency usually occurs in activation functions (e.g., ReLU) and max pooling; other layers contain hardly any data-dependent memory accesses.
  • Eliminate the input/secret dependency in the Torch library (a branchless-ReLU sketch follows this list)
  • End-to-end model compilation
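
A minimal sketch of removing the data-dependent branch in ReLU; Python itself is not constant-time, so this only illustrates the transformation Privado applies at the C/C++ level:

    # The naive version branches on the secret input; the oblivious version
    # computes the same result with a predicated multiply, so the branch
    # trace no longer depends on the input.
    def relu_branchy(x):
        if x > 0:            # branch direction depends on the secret
            return x
        return 0.0

    def relu_oblivious(x):
        return x * (x > 0)   # no secret-dependent branch

    for x in (-2.0, 3.0):
        assert relu_branchy(x) == relu_oblivious(x)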

Proof Complexity vs. Code Complexity

 December 26, 2022 at 9:52 pm

Potential Threats of Memory Integrity on SEV(SNP), (Scalable) SGX2, and TDX

 December 6, 2022 at 12:09 am

SGX2 Memory Integrity

Documents

Potential Attacks

  • Inside-in Aliasing
  • Outside-in Aliasing

Possible sources of aliasing

Server’s RAS feature

  • Memory Address Range Mirroring
  • Memory Predictive Failure Analysis (PFA)

PFA: if a physical memory page is believed to be affected by an underlying hardware fault (e.g., a weak cell or faulty row in a memory chip or DRAM), the affected page can be retired by relocating its content to another physical page, and placing the retired page on a list of physical pages that should not be subsequently allocated by the virtual memory system.

Documents

Possible attack from OS?

Via Memory Components (System Software)

  • Program critical system hardware devices, e.g., the memory controller or DMA engines (doc1, p113). Note: DMA is controlled by the CPU on x86-64 systems
  • Program page tables/EPT => inside-in alias (doc1, p113)

Other possible attack

  • Firmware <= defended by secure boot/PFR/Intel Hardware Shield (doc2)

Documents

  1. https://www.intel.com/content/dam/develop/external/us/en/documents/332680-001-720907.pdf
  2. Intel PFR Github

Related Papers

TDX Problems

References

  1. Intel TDX

A Systematic Look at Ciphertext Side Channels on AMD SEV-SNP

 November 11, 2022 at 10:29 am

Abstract

Hardware-assisted memory encryption offers strong confidentiality guarantees for trusted execution environments like Intel SGX and AMD SEV. However, a recent study by Li et al. presented at USENIX Security 2021 has demonstrated the CipherLeaks attack, which monitors ciphertext changes in the special VMSA page. By leaking register values saved by the VM during context switches, they broke state-of-the-art constant-time cryptographic implementations, including RSA and ECDSA in the OpenSSL. In this paper, we perform a comprehensive study on the ciphertext side channels. Our work suggests that while the CipherLeaks attack targets only the VMSA page, a generic ciphertext side-channel attack may exploit the ciphertext leakage from any memory pages, including those for kernel data structures, stacks and heaps. As such, AMD’s existing countermeasures to the CipherLeaks attack, a firmware patch that introduces randomness into the ciphertext of the VMSA page, is clearly insufficient. The root cause of the leakage in AMD SEV’s memory encryption—the use of a stateless yet unauthenticated encryption mode and the unrestricted read accesses to the ciphertext of the encrypted memory—remains unfixed. Given the challenges faced by AMD to eradicate the vulnerability from the hardware design, we propose a set of software countermeasures to the ciphertext side channels, including patches to the OS kernel and cryptographic libraries. We are working closely with AMD to merge these changes into affected open-source projects.

Paper

Background

  • CipherLeak: ciphertext can be accessed by the hypervisor
  • In SEV, XEX encryption mode is applied => for a fixed address, same plaintext yields same ciphertext
  • Controlled side-channel: NPT (nested page table) present bit clear => PF
  • Before SEV-ES, the registers are saved without encryption
  • SNP: hypervisor cannot modify or remap guest VM pages (integrity protection)

Attack

  • Nginx SSL key generation -> 384-bit ECDSA key recovery

This exploits the constant-time swap in the ECDSA implementation: the ciphertext pattern of the decision (swap) bit is observable, so the nonce can be derived by watching the mask across 384 iterations (see the sketch below).
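
A minimal sketch of why the XEX-style mode leaks, with a keyed hash standing in for the real address-tweaked block cipher; addresses and values are invented:

    # Ciphertext is a deterministic function of (physical address, plaintext
    # block): the hypervisor sees *when* a 16-byte block changes and whether
    # an old value recurs, which is enough to read off the swap bit.
    import hashlib

    KEY = b"vm-memory-key"

    def encrypt_block(addr, plain16):
        data = KEY + addr.to_bytes(8, "little") + plain16
        return hashlib.sha256(data).digest()[:16]

    addr = 0x7F000000
    c0 = encrypt_block(addr, b"swap-mask=0".ljust(16, b"\0"))
    c1 = encrypt_block(addr, b"swap-mask=1".ljust(16, b"\0"))

    # Deterministic: the same value at the same address re-encrypts to the
    # same ciphertext, so {c0, c1} observed over 384 iterations leaks bits.
    assert c0 == encrypt_block(addr, b"swap-mask=0".ljust(16, b"\0"))
    assert c0 != c1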

Leaky DNN: Stealing Deep-learning Model Secret with GPU Context-switching Side-channel

 November 8, 2022 at 9:58 pm

Abstract

Machine learning has been attracting strong interests in recent years. Numerous companies have invested great efforts and resources to develop customized deep-learning models, which are their key intellectual properties. In this work, we investigate to what extent the secret of deep-learning models can be inferred by attackers. In particular, we focus on the scenario that a model developer and an adversary share the same GPU when training a Deep Neural Network (DNN) model. We exploit the GPU side-channel based on context-switching penalties. This side-channel allows us to extract the fine-grained structural secret of a DNN model, including its layer composition and hyper-parameters. Leveraging this side-channel, we developed an attack prototype named MosConS, which applies LSTM-based inference models to identify the structural secret. Our evaluation of MosConS shows the structural information can be accurately recovered. Therefore, we believe new defense mechanisms should be developed to protect training against the GPU side-channel.

Method

  • Shared resource: CUPTI (NVIDIA's CUDA Profiling Tools Interface) performance counters
  • CUDA kernels/blocks => construct concurrency & GPU contention => a spy
  • LSTM model training => 5 models: splitting iterations; recognizing long ops; recognizing other ops; inferring hyper-parameters; model correction

Questions

  • How is the GPU shared between VMs? How to avoid different drivers interfering with each other (especially when both run as root)?
  • Will these features (CUPTI) be available on GPUs in data centers?

Towards Formal Verification of State Continuity for Enclave Programs

 October 11, 2022 at 10:54 pm

Trusted Computing Base (TCB)

 September 8, 2022 at 9:48 pm

Blog posts

What’s a Trusted Compute Base?

Early Documents

SPECIFICATION OF A TRUSTED COMPUTING BASE (TCB)

  • G. H. Nibaldi, 30 November 1979 PDF

A Trusted Computing Base (TCB) is the totality of access control mechanisms for an operating system.

A TCB is a hardware and software access control mechanism that establishes a protection environment to control the sharing of information in computer systems. A TCB is an implementation of a reference monitor, as defined in [Anderson 72], that controls when and how data is accessed.

Proof that the TCB will indeed enforce the relevant protection policy can only be provided through a formal, methodological approach to TCB design and verification... Because the TCB consists of all the security-related mechanisms, proof of its validity implies the remainder of the system will perform correctly with respect to the policy.

Reference Monitor

a TCB is an implementation of a reference monitor.

  • complete mediation of access
  • self-protecting
  • verifiable

Minimizing the complexity of TCB software is a major factor in raising the confidence level that can be assigned to the protection mechanisms it provides.

...two general design goals to follow after identifying all security relevant operations for inclusion in the TCB are (a) to exclude from the TCB software any operations not strictly security-related so that one can focus attention on those that are, and (b) to make as full use as possible of protection features available in the hardware.

DEPARTMENT OF DEFENSE TRUSTED COMPUTER SYSTEM EVALUATION CRITERIA

  • DoD 5200.28-STD, 15 Aug 83 PDF

The heart of a trusted computer system is the Trusted Computing Base (TCB) which contains all of the elements of the system responsible for supporting the security policy and supporting the isolation of objects (code and data) on which the protection is based.

... In the interest of understandable and maintainable protection, a TCB should be as simple as possible consistent with the functions it has to perform. Thus, the TCB includes hardware, firmware, and software critical to protection and must be designed and implemented such that system elements excluded from it need not be trusted to maintain protection.

Trusted Computing Base (TCB) - The totality of protection mechanisms within a computer system – including hardware, firmware, and software – the combination of which is responsible for enforcing a security policy. A TCB consists of one or more components that together enforce a unified security policy over a product or system. The ability of a trusted computing base to correctly enforce a security policy depends solely on the mechanisms within the TCB and on the correct input by system administrative personnel of parameters (e.g., a user's clearance) related to the security policy.

The concept of a TCB now applies not only to operating systems but also to embedded systems, and it focuses on the security-critical portion of the system, including hardware and software.

Some systems (Class A1) still require a formal design specification and verification of the TCB to ensure a high degree of assurance.

Authentication in Distributed Systems: Theory and Practice

Another important concept is the ‘trusted computing base’ or TCB [9], a small amount of software and hardware that security depends on and that we distinguish from a much larger amount that can misbehave without affecting security

Some weaknesses of the TCB model

S&P 1997 Paper

Authentication in Distributed Systems: Theory and Practice, ACM Transactions on Computer Systems, 1992

It’s not quite true that components outside the TCB can fail without affecting security. Rather, the system should be ‘fail-secure’: if an untrusted component fails, the system may deny access it should have granted, but it won’t grant access it should have denied.

An Efficient TCB for a Generic Content Distribution System

2012 International Conference on Cyber-Enabled Distributed Computing and Knowledge Discovery PDF

The trusted computing base (TCB) [1] for a system is a small amount of hardware and/or software that need to be trusted in order to realize the desired assurances. More specifically, the assurances are guaranteed even if all elements outside the TCB misbehave.

The lower the complexity of the elements in the TCB, the lower is the ability to hide malicious/accidental functionality in the TCB components. Consequently, in the design of any security solution it is necessary to lower the complexity of components in the TCB to the extent feasible.

More Recent Study

TCB Minimizing Model of Computation (TMMC)

Bushra, Naila. Mississippi State University, ProQuest Dissertations Publishing, 2019. 27664004. Paper

Reducing TCB Complexity for Security-Sensitive Applications: Three Case Studies

EuroSys, 2006 PDF

The security requirements fall into four main categories: confidentiality, integrity, recoverability, and availability. For clarity, we present the definition of these terms.

  • Confidentiality: Only authorized users (entities, principals, etc.) can access information (data, programs, etc.).
  • Integrity: Either information is current, correct, and complete, or it is possible to detect that these properties do not hold.
  • Recoverability: Information that has been damaged can be recovered eventually.
  • Availability: Data is available when and where an authorized user needs it.

Justifications for Reducing the TCB

Relationships between selected software measures and latent bug-density: Guidelines for improving quality

Paper

It seems that nearly all code size/complexity measurements contribute to bug density, except the Method Hiding Factor and Polymorphism Factor.

This work just focuses on C++ programs. What about using a different language, e.g., Rust?

Choice of Language also Matters

Rust in the Android platform

Rust modernizes a range of other language aspects, which results in improved correctness of code.

Other Related Work

Œuf: Minimizing the Coq Extraction TCB

Reducing TCB complexity for security-sensitive applications: Three case studies