BASS research group

Summary

This post provides an overview our paper Harm-DoS, accepted at the RAID 2022 conference. In Harm-DoS we present an approach to mitigate hash-collision denial-of-service (DoS) vulnerabilities automatically, in binary programs. We analyze 105,831 real-world programs and confirm the use of 796 weak hash functions in the same number of programs. We successfully replace 759 of these in a non-disruptive manner. Among the real-world programs analyzed, we discovered, disclosed and mitigated a zero-day hash-collision vulnerability in Reddit. The implementation of Harm-DoS can be found here.

Citing Harm-DoS

Please use the following BibTex string to cite Harm-DoS.

@inproceedings{harmdos,
  abbr={RAID},
  title={Harm-DoS: Hash Algorithm Replacement for Mitigating Denial-of-Service Vulnerabilities in Binary Executables},
  author={Weideman, Nicolaas and Wang, Haoda and Kann, Tyler and Zahabizadeh, Spencer and Wu, Wei-Cheng and Tandon, Rajat and Mikrovic, Jelena and Hauser, Christophe},
  booktitle={The 25th International Symposium on Research in Attacks, Intrusions and Defenses},
  year={2022},
selected={true}
}

Approach Overview

Harm-DoS is divided into five phases, namely Analysis Preparation, Vulnerability Diagnosis, Pre-patch Examination, Hash Transplant, and Post-patch Examination. In preparation for analysis, Harm-DoS disassembles the target binary, recovers control flow and identifies the function boundaries therein. In spite of the inherent complexity of working with binary code, Harm-DoS surgically analyzes the program to diagnose hash-collision vulnerabilities and perform a hash transplant – replacing the weak hash algorithm with a secure alternative. Similar to a medical organ transplant, the entire process must be conducted with utmost precision. We introduce hash-collision vulnerability diagnosis, a novel static analysis to detect weak hash functions at scale. After diagnosis, Harm-DoS conducts a thorough pre-patch examination, a novel use of symbolic execution, to ensure the patch can be performed safely, without introducing critical errors (like accessing memory out of bounds). Next, Harm-DoS performs the hash transplant, replacing the weak hash function with an appropriate secure alternative, crafted with the insights gained from the pre-patch examination. Finally, Harm-DoS conducts a post-patch examination, a second phase of symbolic execution to confirm that the replacement was successful and no errors were introduced. Since the weak hash function is removed from the patched program, the program is now resilient against hash-collision DoS attacks. The full details of these phases can be found in the paper.

Vulnerability Diagnosis

In the Vulnerability Diagnosis phase, Harm-DoS analyzes functions in the target binary, recovered in the Analysis Preparation phase. For each target function, it decides whether it is an implementation of a known weak hash algorithm. This decision is based on whether the control-flow, memory accesses and computational constants of the target function matches a template commonly used by weak hash functions.

Pre-patch Examination

In order to patch a weak hash function, it is important to understand how it interacts with the rest of the program, in terms of control flow and data flow. Harm-DoS performs four symbolic execution tests on the diagnosed weak hash functions to build a profile of the behavior in order to determine if a nondisruptive patch can be made. The behavior profile is built with regard to signature, input-output relationships, case sensitivity, and memory accesses. Next, we explain the importance of each of these aspects of the built profile.

Weak Hash Function Signature

A function’s signature dictates how it receives input via input parameters and how it yields output via a return value. In order to replace one function with another, it is vital that both the original and the replacement implement the same signature. We include information regarding the signature of the weak hash function in its profile to allow us to construct a replacement hash function implementing the same signature.

Understanding the signature of the weak hash function has another benefit: it allows us to perform symbolic execution on the function with controlled input. This is useful in the following steps.

Input-output Relationships

Each hash algorithm deterministically produces the output value for every given input. Since this relationship is unique to a specific algorithm, this can be used to identify the algorithm. Harm-DoS performs symbolic execution on the weak hash function with concrete input to observe the calculated hash value. This concrete hash value is compared to the expected hash value, as defined by the algorithm. This allows Harm-DoS to identify weak hash functions with no false positives.

Case Sensitivity

A case insensitive hash function yields equal hash values for inputs that differ only in case (e.g. AbC and abc). Using a case-sensitive hash function to replace a case-insensitive hash function will cause incorrect lookups in the hash table. Conversely, using a case-insensitive hash function to replace a case-sensitive hash function makes exploitation trivial. When replacing a weak hash function, it is important that the replacement hash function matches the original in terms of case sensitivity.

Memory Accesses

It is important to know of any side effects of the weak hash function. Side effects, in this context, are modifications to the program state that persist past the return of the hash function. Replacing the original hash function while omitting these side effects will change the program behavior in an unknowable way. Therefore, it is impossible to guarantee correctness while omitting these side effects. Since such side effects are usually very program specific, they cannot be reproduced by an automatic and generic approach like Harm-DoS. Harm-DoS detects these side effects to avoid patch the weak hash function.

Hash Transplant

If Pre-patch Examination shows that the weak hash function can be replaced nondisruptively, it is time to perform the Hash-Transplant. This is achieved by constructing a secure hash function that matches the original in terms of signature and case-sensitivity. We use the hash algorithms SipHash and Multilinear hash to this end.

The replacement itself is performed by overwriting the binary instructions of the original hash function with those of the replacement.

Post-patch Examination

After the Hash Transplant, Harm-DoS performs two more symbolic execution tests on the secure hash function in the patched binary. The first test monitors the memory accesses of the secure hash function and ensures these match those of the original exactly. This ensures no illegal memory accesses are introduced during the patching. The second test performs symbolic execution on the secure hash function with concrete input to calculate a hash value. Using an SMT solver, it then calculates a preimage to the original hash function that yields the same hash value. This ensures that the replacement hash function does not produce hash values that could not also be produced by the original. This prevents illegal lookups using the hash value.

Experimental Results

We evaluate Harm-DoS in three steps. First, we analyze a large data set of real-world binaries to test the ability of Harm-DoS to detect and patch vulnerabilities at scale. Second, we constitute and analyze a subset of these binaries, containing manually identified hash functions. We use this to determine the accuracy of our approach based on a known ground truth. Third, we discuss a case study.

Full Scale Analysis

Our data set consists of 105,831 unique AMD64 ELF executable files, extracted from the AllStar data set. This data set contains the binaries obtained when building the Jessie distribution of the Debian packages. We analyze these real-world programs and confirm the use of 796 weak hash functions in the same number of programs. We successfully replace 759 of these in a non-disruptive manner.

Ground Truth Analysis

In order to measure how well Harm-DoS detects all known-weak hash functions, we construct ground truth data set. We select 202 hash functions, for which we have manually verified from the source code that they implement a given known weak hash algorithm. We draw our functions from a subset of 156 binaries used in our full-scale analysis of the AllStar data set.

A true positive is a hash function that is correctly detected by Harm-DoS. Conversely, a false negative is a hash function either not detected by Harm-DoS, or is detected as a hash function, but mistaken for a different hash algorithm. We observe that Harm-DoS detects all hash functions correctly.

Case Study: Reddit Vulnerability

To show the effectiveness of Harm-DoS in a real-world context, we discuss a remotely-exploitable zero-day hash-collision vulnerability that we discovered, and patched, in Snudown, a component of Reddit. We disclosed this vulnerability to Reddit in accordance to the coordinated disclosure policy. The vulnerability was assigned ID CVE-2021-41168. We also implemented a mitigation that replaces the weak hash function with SipHash, which was accepted by the developers.

Snudown is a library used in Reddit to convert markdown to HTML. This library uses a hash table to map reference labels to their links, using a weak hash algorithm. We launch a proof-of-concept attack against Snudown, running locally, by parsing a large number of references with labels crafted to cause collisions in the hash table. We measure the parsing time of an increasing number of colliding reference labels. As a sanity check, we repeat the experiment with random labels. We plot the parsing time against the input size in the figure below.

The significant difference in parsing time growth between the malicious and random labels confirms the vulnerability. Note, the superlinear growth in parsing time for random labels, is caused by coincidental collisions due to the small table size.

We run Harm-DoS on the executable, which successfully produces a patched Snudown. To show the malicious growth in parsing time no longer occurs, we relaunch the same attack on the patched executable. It is clear that patched Snudown does not suffer from the same vulnerability. To show that we have mitigated the vulnerability without introducing errors, we run the test cases that are supplied with the Snudown project, which all pass. This shows that Harm-DoS can be used to identify and mitigate real-world hash-collision vulnerabilities.

More information about this vulnerability is available here.