One of the first questions I had when I open sourced my json.h library was how fast it parses compared to other commonly used JSON libraries. I’ve gone through all of the JSON libraries in C/C++ that I find tolerable (and I’ll document which ones I wouldn’t touch with a barge pole, and why, too!) and run a performance comparison on them.
First off, I tested the following JSON C/C++ libraries:
- gason - C++ library with MIT license.
- json.h - my own JSON library, written in C and licensed under the unlicense.
- minijson_reader - C++11 library with BSD 3-Clause license.
- rapidjson - C++ library with MIT license.
- sajson - C++ library with MIT license.
- ujson4c - C library with BSD license.
And I rejected the following JSON C/C++ libraries (with some comments on why I haven’t used them):
- jsmn - I initially did add this to the benchmark, but I found that when testing large input files, the time taken to parse grew exponentially with the file size. After around 1 minute of trying to parse a 190MB file my benchmark timed out, so I removed this library from my benchmarking.
- cson - Uses some bizarre SCM called Fossil SCM which was enough of a roadblock to stop any further explorations.
- frozen - GPL license means it isn’t worth considering any further.
- jansson - absolutely crazy CMake files required to get it to build.
- js0n - only allows for searching of arbitrary tokens in a JSON stream, does not do a full parse and exploration of the DOM.
- jsoncpp - another crazy CMake mess, not touching this library.
- json++ - requires Flex/Bison, stupid requirement for a simple library.
- json-c - Absolutely no idea what is going on with this repository. Clusterfuck is too kind an explanation for what is going on.
- mjson - Requires compile time knowledge of JSON structure (only allows parsing of ‘known’ JSON structures).
- nxjson - Another GPL licensed library means it is a write-off.
- vjson - code did not have a license, too risky to explore it further.
I ran two kinds of test: the cost of parsing some JSON files and storing them in the intermediate form, and the cost of parsing and then traversing the JSON to count all the numbers in the JSON structure.
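To make the second kind of test concrete, here is a minimal sketch of a traversal that counts every number in a parsed tree, plus a timing wrapper around it. The `struct node` layout, `count_numbers`, and `time_count` are hypothetical names invented for illustration; they are not the API or in-memory representation of json.h or of any of the libraries tested, and `clock()` is just one simple way to take a timing.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical DOM node, for illustration only. This is NOT the real
   layout used by json.h or any other library in the benchmark. */
enum node_type { NODE_NUMBER, NODE_STRING, NODE_ARRAY, NODE_OBJECT, NODE_OTHER };

struct node {
  enum node_type type;
  const struct node *first_child; /* first element or member value, or NULL */
  const struct node *next;        /* next sibling in the parent, or NULL */
};

/* Count every number in the tree rooted at n (siblings of n are not visited). */
static size_t count_numbers(const struct node *n) {
  size_t total = 0;
  const struct node *child;

  if (n == NULL) return 0;
  if (n->type == NODE_NUMBER) return 1;

  if (n->type == NODE_ARRAY || n->type == NODE_OBJECT) {
    /* Walk the children and recurse into each one. */
    for (child = n->first_child; child != NULL; child = child->next) {
      total += count_numbers(child);
    }
  }
  return total;
}

/* Run the traversal once and return the CPU time it took, in seconds. */
static double time_count(const struct node *root, size_t *out_count) {
  clock_t start;
  assert(out_count != NULL);
  start = clock();
  *out_count = count_numbers(root);
  return (double)(clock() - start) / CLOCKS_PER_SEC;
}
```

In a real harness the parse call for each library would be timed the same way, so the traversal cost is the difference between the parse + traverse time and the parse-only time.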
The first JSON file I tested was a 9KB file. In this test my own json.h library is second worst of the six tested, a full 4x slower than the best performing gason library.
Ouch, on the 12.9MB file we perform worst of all. The reason minijson_reader seems to beat us here is that the input file AllSets.json is simply one large JSON array containing identical JSON objects in each element. The deeper the nesting of the JSON objects (e.g. objects, within arrays, within objects, etc.) the worse minijson_reader performs.
On the largest of our inputs, the 190MB file, we perform second worst again. This file has many levels of objects and arrays, which is why minijson_reader performs atrociously.
One thing to note among all the examples is the difference between parsing and parsing + traversing. Our library has a much narrower gap between these two than the other libraries, which means that once the parsing has completed, iterating through our in-memory representation is much quicker than the alternatives. I just need to work out now how to improve the parsing speed!
So in conclusion, my library is slower than I’d like - the only heartening thing is that the library’s parsing speed is constant regardless of file size (at least I don’t have exponentially bad parsing!). I’ve already started to profile and rewrite the offending part of the parsing, something I’ll cover in a future blog post.
Please check out the library here - I am more than happy to accept merge requests and feedback (I’ve already changed things based on user feedback!).