One thing that was missing from ubench.h (intentionally to keep the initial code drop simple) was fixtures. In this PR I’ve added them.

So what are fixtures and why should you use them?

What are fixtures?

Fixtures are a way to set up and tear down state that doesn’t contribute to the timing of the actual benchmark itself:

// These snippets need ubench.h plus the standard headers for malloc/free and memset.
#include "ubench.h"
#include <stdlib.h>
#include <string.h>

// First you declare a struct that contains the state you need.
struct foo {
  unsigned size;
  char* foo;
};

// Then you set up the benchmark state, using the implicit ubench_fixture pointer.
UBENCH_F_SETUP(foo) {
  ubench_fixture->size = 1024 * 1024 * 128;
  ubench_fixture->foo = (char*)malloc(ubench_fixture->size);
}

// In the teardown you can free up anything you allocated.
UBENCH_F_TEARDOWN(foo) {
  free(ubench_fixture->foo);
}

// And then declare the benchmarks that use the state.
UBENCH_F(foo, bar) {
  UBENCH_DO_NOTHING(ubench_fixture->foo);
  memset(ubench_fixture->foo, 0, ubench_fixture->size);
  UBENCH_DO_NOTHING(ubench_fixture->foo);
}

// You can declare multiple benchmarks that use the same fixture too.
UBENCH_F(foo, haz) {
}

As you can see, setting up and declaring a fixture isn’t difficult.
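
If you want to turn the snippets above into a standalone benchmark binary, the only other thing the file needs is the usual ubench.h entry point (a minimal sketch, assuming everything above lives in the same translation unit):

// UBENCH_MAIN() provides the global benchmark state and a main() that runs
// every registered benchmark.
UBENCH_MAIN();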

Why should you use them?

Let’s take the counterpart of the fixture benchmark above, written using the non-fixture approach:

UBENCH(foo, bar) {
  const unsigned size = 1024 * 1024 * 128;
  char* foo = (char*)malloc(size);
  UBENCH_DO_NOTHING(foo);
  memset(foo, 0, size);
  UBENCH_DO_NOTHING(foo);
  free(foo);
}

And if we look at the results of both of these benchmarks, we can see that the non-fixture one runs like:

[==========] Running 1 benchmarks.
[ RUN      ] foo.bar
[       OK ] foo.bar (mean 15.506ms, confidence interval +- 1.553863%)
[==========] 1 benchmarks ran.
[  PASSED  ] 1 benchmarks.

Whereas the fixture one runs like:

[==========] Running 1 benchmarks.
[ RUN      ] foo.bar
[       OK ] foo.bar (mean 4.125ms, confidence interval +- 0.844092%)
[==========] 1 benchmarks ran.
[  PASSED  ] 1 benchmarks.

So we can see that the fixtured benchmark is roughly 3.8x faster than the non-fixtured one (a mean of 4.125ms versus 15.506ms), because the malloc/free cost isn’t being timed as part of the benchmark. This means that if you really only care about what the memset costs, you’ve now got a much more accurate way to measure it.

This is where fixtures are really great: they hide the cost of setting up the data for your benchmark from the thing you actually want to measure.
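
For example (a hypothetical sketch, not something that ships in the PR), a fixture could fill a large buffer in the setup so that only the summing loop in the benchmark body is timed:

// Hypothetical fixture: the buffer is filled in the setup, outside of the
// timed region, so only the summing loop below is measured.
struct blob {
  unsigned size;
  unsigned char* data;
};

UBENCH_F_SETUP(blob) {
  unsigned i;
  ubench_fixture->size = 1024 * 1024 * 64;
  ubench_fixture->data = (unsigned char*)malloc(ubench_fixture->size);
  for (i = 0; i < ubench_fixture->size; i++) {
    ubench_fixture->data[i] = (unsigned char)(i * 31u);
  }
}

UBENCH_F_TEARDOWN(blob) {
  free(ubench_fixture->data);
}

UBENCH_F(blob, sum) {
  unsigned i;
  unsigned total = 0;
  for (i = 0; i < ubench_fixture->size; i++) {
    total += ubench_fixture->data[i];
  }
  UBENCH_DO_NOTHING(&total);
}

Only the body of UBENCH_F(blob, sum) contributes to the reported mean; the buffer fill and the free stay out of the measurement.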

One last thing…

One thing that might not be obvious is that fixtured benchmarks can have a longer wall-clock time than their non-fixtured variants. For the examples above, the non-fixture one runs in:

real 0m0.592s
user 0m0.368s
sys  0m0.177s

Whereas the fixture benchmark runs in:

real 0m3.631s
user 0m3.508s
sys  0m0.071s

So why is this? In general, the longer a benchmark sample runs for, the more accurate the result will be. Because we’re measuring less work in each sample of the fixtured benchmark, the framework has to run more samples to get an accurate result, and that pushes the total wall-clock time up.

Just thought you might appreciate the heads up!