Ever since I worked on my amazing little single-header C/C++ library utest.h, which mimics enough of Google's googletest framework but with a significantly leaner footprint (and a single header!), I've wanted something similar for benchmarking. Google has their own benchmarking library, Google Benchmark, but it fails a few key tests that I have for libraries:

  • It isn’t a single header library.
  • It doesn’t support C benchmarks.
  • It is huge.
  • It has a build process that makes integration non-trivial.

While none of these are a death knell for an otherwise incredibly useful library, it just didn’t meet enough of the requirements for my own uses.

So I wrote my own - ubench.h.

Introducing ubench.h

Given all the infrastructure I had already worked out for my utest.h library, I could re-use a significant amount of the code. I started by copying the code from utest.h and doing a rename of UTEST -> UBENCH and utest -> ubench. Next up, I removed all the code that was only required for testing - all the asserts and expects that make up a unit testing framework. I also removed fixtures and indexed test cases because I wanted to push for a minimum viable product to begin with.

UBENCH

To declare a benchmark you use the UBENCH macro:

#include "ubench.h"

UBENCH(foo, bar) {
  usleep(100 * 1000);
}

The macro takes two parameters - the set the benchmark belongs to, and the name of the individual benchmark. In the body of the benchmark you specify the code you want to profile - in this case I'm just sleeping for a fixed amount of time, which makes a good no-op example.
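Multiple benchmarks can share the same set, which groups them together in the output. As a quick sketch continuing the example above (the name and sleep duration here are just illustrative):

UBENCH(foo, baz) {
  usleep(50 * 1000); /* a second benchmark in the same "foo" set */
}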

In one file you then need to specify UBENCH_MAIN() - this macro will define the main function of your executable, define all the global state the library requires, and call into ubench.h to run the benchmarks:

#include "ubench.h"

UBENCH_MAIN()

Alternatively, you can call the ubench_main function yourself if you want to use your own main function. First you need to declare the global state for ubench.h:

UBENCH_STATE();

And then when you are ready to call into the ubench.h framework do:

int main(int argc, const char *const argv[]) {
  // do your own thing
  return ubench_main(argc, argv);
}

Note - you need to specify UBENCH_MAIN() or UBENCH_STATE() in exactly one source file; every other source file just includes ubench.h.
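For example, a project split across multiple source files might look something like this (the file names are just illustrative):

/* main.c - the only file that defines the entry point and global state. */
#include "ubench.h"

UBENCH_MAIN()

/* sleep_bench.c - every other file just includes the header and
   declares its benchmarks. */
#include "ubench.h"

#include <unistd.h> /* for usleep on POSIX platforms */

UBENCH(foo, qux) {
  usleep(10 * 1000);
}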

Command Line Output

The command line output for the library is similar to utest.h and thus also similar to googletest’s output:

[==========] Running 1 benchmarks.
[ RUN      ] foo.bar
[       OK ] foo.bar (102289906ns +- 1.751264%)
[==========] 1 benchmarks ran.
[  PASSED  ] 1 benchmarks

The one notable difference is how benchmarks pass or fail, and the addition of +- 1.751264% to the timing information. Benchmarks are only useful if repeated runs of them produce similar numbers. You need to run a benchmark a number of times and then take an average of the results. The problem is that the average can be vastly affected by huge outliers in the data set (e.g. was the test run from a cold start with dirty caches?), which can drastically affect the reliability of your results.

To combat this, the benchmarks are timed multiple times over multiple passes. Only when the standard deviation of the timings is lower than 2.5 percent (a good initial value for a reliable and reproducible result) is the benchmark considered useful. The +- 1.751264% is the standard deviation that was recorded for that example. If, after a number of runs, the deviation still does not fall below that threshold, the benchmark is regarded as having failed, and is reported as such.
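To make that concrete, here's a rough sketch of the statistics involved - this illustrates computing the standard deviation as a percentage of the mean, and is not the library's actual implementation (the function name and sample values are made up):

#include <math.h>
#include <stdio.h>

/* Illustrative only: given nanosecond timings from repeated runs,
   compute the standard deviation as a percentage of the mean. */
static double deviation_percent(const double *samples_ns, int count) {
  double mean = 0.0;
  double variance = 0.0;
  int i;

  for (i = 0; i < count; i++) {
    mean += samples_ns[i];
  }
  mean /= (double)count;

  for (i = 0; i < count; i++) {
    const double diff = samples_ns[i] - mean;
    variance += diff * diff;
  }
  variance /= (double)count;

  return 100.0 * sqrt(variance) / mean;
}

int main(void) {
  const double runs_ns[] = {102289906.0, 100500000.0, 104100000.0};
  /* A benchmark would only "pass" once this drops below 2.5%. */
  printf("+- %f%%\n", deviation_percent(runs_ns, 3));
  return 0;
}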

UBENCH_DO_NOTHING

Benchmarks are best when they are simple enough to measure a single thing and give reproducible results. One thing that defeats this is that compilers are so good now at removing dead code - optimizing away the very thing you want to measure! To get around this I've included the UBENCH_DO_NOTHING macro, which takes a pointer argument and looks, to the compiler, like it will trash the memory within (read and write it). This way the compiler shouldn't optimize out the data you are using, and thus the code that produces it:

#include "ubench.h"

#include <string.h> /* for memcpy */

UBENCH(do, nothing) {
  static char a[8 * 1024 * 1024];
  static char b[8 * 1024 * 1024];
  UBENCH_DO_NOTHING(a);
  memcpy(b, a, sizeof(a));
  UBENCH_DO_NOTHING(b);
}

The above makes the compiler think that a is getting some data placed within it, and that after the memcpy something is being done with b. The end result is that we get a meaningful performance number for how expensive the memcpy is.
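For contrast, consider a sketch like the following (the benchmark name is made up), where the result is never observed - the optimizer is free to delete the loop entirely, leaving you timing nothing at all:

UBENCH(sum, maybe_deleted) {
  int sum = 0;
  int i;
  for (i = 0; i < 1000000; i++) {
    sum += i; /* result never used - the compiler may remove the loop */
  }
  /* Adding UBENCH_DO_NOTHING(&sum); here would make the compiler assume
     sum is read, keeping the loop alive. */
}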

Summary

The library is already simple yet useful for measuring real code - but I obviously want to do more. Fixtures are the obvious addition - you may want some state to be initialized once, and then profile modifications to that data on their own, rather than alongside the cost of initialization / destruction. I'd also like to give the user more options for controlling how many times the benchmark is run to reach a low enough standard deviation.

But I think that as programmers our number one job is to get something simple done and then use it in production. Production always helps shake out the things actually missing from the code, rather than trying to work out the full set of requirements up front in the abstract. I hope the library proves useful to some of you out there, and you can keep track of progress on GitHub at https://github.com/sheredom/ubench.h.