ZIP bomb countermeasures
How to defend against ZIP bombs in Go.
A ZIP bomb is a malicious ZIP archive designed to crash the program or system reading it. When such a ZIP archive is extracted, it expands to terabytes or even petabytes of data [1], which would quickly overwhelm most systems.
This is why ZIP bombs are often used in attacks to disable antivirus scanners, crash file processing services, or conduct denial-of-service attacks against systems that (automatically) extract archived files.
How it works
ZIP is a container format (and not a compression algorithm). A ZIP archive contains a central directory, which is basically a list of headers that reference the actual files in the archive. The files in the ZIP archive are often compressed using DEFLATE.
ZIP bombs achieve extreme compression ratios by exploiting the container format:
- Recursive ZIP bombs contain ZIP files nested within ZIP files, which creates a chain reaction when extracted. But this only works if the program reads ZIP archives recursively.
- Non-recursive ZIP bombs overlap compressed files in the ZIP archive. This works by first creating a highly compressed file (e.g. a long string of repeated bytes), and then making (all) the headers in the ZIP’s central directory reference that compressed file. This technique can achieve compression ratios over 28 million, far beyond DEFLATE’s maximum compression ratio of about 1032.
So how can we defend against this in Go?
Countermeasures
When reading ZIP archives in Go, the following works in our favor:
- Go’s zip.Reader does not read recursively.
- Non-recursive ZIP bombs can be detected by their characteristics:
  - Many headers will point to the same compressed data (i.e. “many files” in the ZIP archive).
  - Files in the ZIP archive will typically have an unusually high compression ratio.
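Both characteristics can be read from the headers without decompressing anything. The following is a minimal sketch, assuming the archive fits in memory; archiveStats, inspect and buildArchive are hypothetical names introduced for illustration, and the sizes it reports are the ones declared in the central directory, which an attacker controls.

```go
package main

import (
	"archive/zip"
	"bytes"
	"fmt"
)

// archiveStats summarizes what the central directory declares.
// (Hypothetical type, introduced for this sketch.)
type archiveStats struct {
	Files    int
	MaxRatio float64 // highest declared uncompressed/compressed ratio
}

// inspect reads only the headers; it does not decompress any file data.
func inspect(data []byte) (archiveStats, error) {
	zr, err := zip.NewReader(bytes.NewReader(data), int64(len(data)))
	if err != nil {
		return archiveStats{}, err
	}
	stats := archiveStats{Files: len(zr.File)}
	for _, f := range zr.File {
		if f.CompressedSize64 == 0 {
			continue // no ratio can be computed for this entry
		}
		ratio := float64(f.UncompressedSize64) / float64(f.CompressedSize64)
		if ratio > stats.MaxRatio {
			stats.MaxRatio = ratio
		}
	}
	return stats, nil
}

// buildArchive is a hypothetical helper that creates an in-memory archive
// with the given number of files, each holding size repeated bytes.
func buildArchive(files, size int) []byte {
	var buf bytes.Buffer
	zw := zip.NewWriter(&buf)
	for i := 0; i < files; i++ {
		w, err := zw.Create(fmt.Sprintf("file%d.txt", i))
		if err != nil {
			panic(err)
		}
		if _, err := w.Write(bytes.Repeat([]byte{'a'}, size)); err != nil {
			panic(err)
		}
	}
	if err := zw.Close(); err != nil {
		panic(err)
	}
	return buf.Bytes()
}

func main() {
	stats, err := inspect(buildArchive(3, 1<<20))
	if err != nil {
		panic(err)
	}
	fmt.Printf("files=%d max declared ratio=%.0f\n", stats.Files, stats.MaxRatio)
}
```

Because the declared sizes can be forged, these numbers are only a first filter and do not replace limits on what is actually read.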
While Go currently lacks a resource limits API, we can easily implement the following countermeasures:
- For the ZIP archive:
  - Limit the number of files allowed.
- For the files in the ZIP archive:
  - Apply a maximum uncompressed size.
  - Apply a maximum compression ratio.
  - Limit the number of bytes that can actually be read.
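The four countermeasures above could be combined roughly as follows. This is a sketch under stated assumptions, not a definitive implementation: the limits (maxFiles, maxUncompressedSize, maxRatio) are arbitrary illustrative values, extractAll, extractFile and buildArchive are hypothetical names, and the decompressed data is simply discarded rather than written anywhere.

```go
package main

import (
	"archive/zip"
	"bytes"
	"errors"
	"fmt"
	"io"
)

// Hypothetical limits, chosen for illustration only.
const (
	maxFiles            = 1000
	maxUncompressedSize = 10 << 20 // 10 MiB per file
	maxRatio            = 100      // uncompressed size / compressed size
)

var errLimit = errors.New("zip: limit exceeded")

// extractAll decompresses every file to io.Discard and returns the number
// of bytes read, or an error as soon as one of the limits is exceeded.
func extractAll(data []byte) (int64, error) {
	zr, err := zip.NewReader(bytes.NewReader(data), int64(len(data)))
	if err != nil {
		return 0, err
	}
	if len(zr.File) > maxFiles {
		return 0, fmt.Errorf("%w: %d files", errLimit, len(zr.File))
	}
	var total int64
	for _, f := range zr.File {
		n, err := extractFile(f)
		total += n
		if err != nil {
			return total, fmt.Errorf("%s: %w", f.Name, err)
		}
	}
	return total, nil
}

func extractFile(f *zip.File) (int64, error) {
	// Checks on the declared sizes: cheap, but the header may lie.
	if f.UncompressedSize64 > maxUncompressedSize {
		return 0, fmt.Errorf("%w: declared size %d", errLimit, f.UncompressedSize64)
	}
	if f.UncompressedSize64 > 0 &&
		(f.CompressedSize64 == 0 || f.UncompressedSize64/f.CompressedSize64 > maxRatio) {
		return 0, fmt.Errorf("%w: compression ratio", errLimit)
	}
	rc, err := f.Open()
	if err != nil {
		return 0, err
	}
	defer rc.Close()
	// Cap the bytes actually read; the one extra byte reveals an overrun.
	n, err := io.Copy(io.Discard, io.LimitReader(rc, maxUncompressedSize+1))
	if err != nil {
		return n, err
	}
	if n > maxUncompressedSize {
		return n, fmt.Errorf("%w: read more than %d bytes", errLimit, maxUncompressedSize)
	}
	return n, nil
}

// buildArchive is a hypothetical helper that creates an in-memory archive
// with the given number of files, each holding size repeated bytes.
func buildArchive(files, size int) []byte {
	var buf bytes.Buffer
	zw := zip.NewWriter(&buf)
	for i := 0; i < files; i++ {
		w, err := zw.Create(fmt.Sprintf("file%d.txt", i))
		if err != nil {
			panic(err)
		}
		if _, err := w.Write(bytes.Repeat([]byte{'a'}, size)); err != nil {
			panic(err)
		}
	}
	if err := zw.Close(); err != nil {
		panic(err)
	}
	return buf.Bytes()
}

func main() {
	for _, size := range []int{100, 1 << 20} {
		n, err := extractAll(buildArchive(1, size))
		fmt.Printf("size=%d read=%d err=%v\n", size, n, err)
	}
}
```

The header checks reject suspicious archives before any data is decompressed, while the io.LimitReader guards against headers that understate the real size. The ratio check uses integer division rather than multiplication so that a forged, very large compressed size cannot overflow the comparison.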