Detection: Logic bugs, inconsistency, typo, etc.

October 22, 2024 at 3:40 pm

Semantic Inconsistency

Bugs as Deviant Behavior: A General Approach to Inferring Errors in Systems Code

This is a very interesting paper, maybe the earliest work in this area. The idea is simple: some operations are based on implicit assumptions. For example, dereference implies a valid pointer.

Detecting Missing-Check Bugs via Semantic- and Context-Aware Criticalness and Constraints Inferences PDF

Solely focus on missing checks in Linux kernel.

APISAN: Sanitizing API Usages through Semantic Cross-checking PDF

To find API usage errors, APISAN automatically infers semantic correctness, called semantic beliefs, by analyzing the source code of different uses of the API.

the more API patterns developers use in similar contexts, the more confidence we have about the correct API usage.

This is a probabilistic approach. Symbolic execution can help to construct a "semantic belief" of the outcome (and pre- and post- conditions) of executing an API, and such construction will be more and more precise as there are more samples in the dataset.

This work is based mainly on relaxed symbolic execution.

Scalable and systematic detection of buggy inconsistencies in source code Paper

Analysis of bugs on cloned code. Searching: hash of code snippet, AST level.

Gap between theory and practice: an empirical study of security patches in solidity
Which bugs are missed in code reviews: An empirical study on SmartSHARK dataset

Semantic bugs are mostly missed (51.34%) from the review process, 7 out of 96 are typos.

VarCLR: Variable Semantic Representation Pre-training via Contrastive Learning
Bug characteristics in open source software

Careless programming causes many bugs in the evaluated software. For example, simple bugs such as typos account for 7.8–15.0 % of semantic bugs in Mozilla, Apache, and the Linux kernel

Automating Code Review Activities by Large-Scale Pre-training