xhu.buzz

How to shrink whisper files for fun and profit

(originally written at June 17, 2019)

Table of Contents

The problem we’re trying to solve is making whisper files smaller. The profit we’ll get from doing this is increased storage capacity of our graphite clusters.

To achieve this, a new file format needs to be designed and implemented from scratch - writing bits rather than bytes to files.

The fun part? End results are very rewarding. Not only is disk space reduced by ~60% - ~55% (seen in different testing environments and with different XFS configurations), but server utilities are greatly increased as well (see production test below).

What is a whisper file?

Whisper is a fixed-size database, similar in design and purpose to RRD (round-robin-database). It provides fast, reliable storage of numeric data over time. Whisper allows higher-resolution (seconds per point) recent data to degrade into lower resolutions, for long-term retention of historical data.

The Gorilla algorithm

Facebook’s paper Gorilla: A Fast, Scalable, In-Memory Time Series Database, introduced an algorithm for efficiently compressing time-series data. According to this paper, time-series data could be saved in 1.37 bytes per data point according, which when uncompressed is 16 bytes long in size (or 12 bytes if the timestamp is saved in 32 bits, like whisper. And yes, one more thing to fix before 2038).

Since its publication, this algorithm has been adopted by Prometheus, M3, and many other notable time-series databases. This makes it easier to adopt.

Whisper + Compression = CWhisper

Main format changes:

To illustrate the changes simply:

Original whisper file format:

Compressed whisper file format (detailed example):

Why introduce blocks?

The way the decompression algorithms work is by starting from the first data point. In order to read the 100th data point, all 99 data points before it need to be decompressed.

Therefore:

Tradeoff

So what’s the trade off we are making here in order to shrink files?

Test result on production

The test result on one of our clusters is good and conclusive. It outperforms the classic whisper format on almost all grounds like disk space usage, cpu usage and memory usage - and even shorter time-to-disk if we compare servers hosting the same number of metrics:

Metrics Whisper (standard) CWhisper (compressed)
Total Metrics 50.6 Millions 53.1 Millions
Num of Servers 32 9
Disk Usage (45.75% less) 32.28 TB 14.77 TB
Total Disk Space (2.9TB Per Server) 92.8 TB 26.1 TB
Theoretical Capacity Per Server (Metrics) ~4.5 Millions ~10.43 Millions

Source code: