11 Mar

Adding JSON 5 to json.h

I’ve added JSON 5 support to my json.h library.

For those not in the know, JSON 5 (http://json5.org/) is a modern update to the JSON standard, including some cool features like unquoted keys, single-quoted keys and strings, hexadecimal numbers, Infinity and NaN numbers, and C-style comments!

Sticking with the design of my lib – each of the features can be turned on individually if you don’t want the full shebang, or you can just add json_parse_flags_allow_json5 to enable the entire feature set.
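Putting those features together, a snippet like the following (contents purely illustrative) would parse with json_parse_flags_allow_json5 set – every line exercises one of the features listed above:

```json5
{
  // C-style line comments are allowed
  unquoted: 'a single-quoted string',
  'quoted': 0xDEADBEEF, /* block comments and hex numbers too */
  big: Infinity,
  nope: NaN
}
```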

The GitHub pull request brings in the functionality, and it is merged into master too!

29 May

A simple Vulkan Compute example

With all the buzz surrounding Vulkan and its ability to make graphics more shiny/pretty/fast, one key thing seems to have been lost in the ether of information – Vulkan isn’t just a graphics API, it supports compute too! Quoting the specification (bold added for effect):

Vulkan is an API (Application Programming Interface) for graphics and compute hardware

And:

This specification defines four types of functionality that queues may support: graphics, compute, transfer, and sparse memory management.

We can see that, through how carefully the language is crafted, Vulkan is not only allowed to support compute – there are cases where a Vulkan driver could expose only compute.

In this vein, I’ve put together a simple Vulkan compute sample – VkComputeSample. The sample:

  • allocates two buffers
  • fills them with random data
  • creates a compute shader that will memcpy from one buffer to the other
  • then checks that the data copied over successfully

Key Vulkan principles covered:

  • creating a device and queue for compute only
  • allocating memories and buffers from them
  • writing a simple compute shader
  • executing the compute shader
  • getting the results

So without further ado, let us begin.

creating a device and queue for compute only

Vulkan has a ton of boilerplate code you need to use to get ready for action.

First up we need a VkInstance. To get this, we need to look at two of Vulkan’s structs – VkApplicationInfo and VkInstanceCreateInfo:

typedef struct VkApplicationInfo {
    VkStructureType    sType;
    const void*        pNext;
    const char*        pApplicationName;
    uint32_t           applicationVersion;
    const char*        pEngineName;
    uint32_t           engineVersion;
    uint32_t           apiVersion; // care about this
} VkApplicationInfo;

typedef struct VkInstanceCreateInfo {
    VkStructureType             sType;
    const void*                 pNext;
    VkInstanceCreateFlags       flags;
    const VkApplicationInfo*    pApplicationInfo; // care about this
    uint32_t                    enabledLayerCount;
    const char* const*          ppEnabledLayerNames;
    uint32_t                    enabledExtensionCount;
    const char* const*          ppEnabledExtensionNames;
} VkInstanceCreateInfo;

I’ve flagged the only two fields we really need to care about here – apiVersion and pApplicationInfo. The most important is apiVersion, which lets us write an application against the current Vulkan specification and record, in the code itself, exactly which version of Vulkan the application was written against.

Why is this important you ask?

  1. It helps future you. You’ll know which version of the specification to look at.
  2. It allows the validation layer to understand which version of Vulkan you think you are interacting with, and potentially flag up any cross version issues between your application and the drivers you are interacting with.

I recommend you always at least provide an apiVersion.

pApplicationInfo is easier to justify – you need it to point to a valid VkApplicationInfo if you want to specify an apiVersion, which again I highly recommend you do.

Next, we need to get all the physical devices the instance can interact with:

uint32_t physicalDeviceCount = 0;
vkEnumeratePhysicalDevices(instance, &physicalDeviceCount, 0);

VkPhysicalDevice* const physicalDevices = (VkPhysicalDevice*)malloc(
   sizeof(VkPhysicalDevice) * physicalDeviceCount);

vkEnumeratePhysicalDevices(
  instance, &physicalDeviceCount, physicalDevices);

We do this by using a pair of vkEnumeratePhysicalDevices calls – one to get the number of physical devices the instance knows about, and one to fill a newly created array with handles to these physical devices.

For the purposes of the sample, I iterate through these physical devices and run my sample on each one present in the system – but for a ‘real-world application’ you’d want to find which device best suits your workload by using vkGetPhysicalDeviceFeatures, vkGetPhysicalDeviceFormatProperties, vkGetPhysicalDeviceImageFormatProperties, vkGetPhysicalDeviceProperties, vkGetPhysicalDeviceQueueFamilyProperties and vkGetPhysicalDeviceMemoryProperties.

For each physical device we need to find a queue family for that physical device which can work for compute:

uint32_t queueFamilyPropertiesCount = 0;
vkGetPhysicalDeviceQueueFamilyProperties(
  physicalDevice, &queueFamilyPropertiesCount, 0);

VkQueueFamilyProperties* const queueFamilyProperties =
  (VkQueueFamilyProperties*)malloc(
    sizeof(VkQueueFamilyProperties) * queueFamilyPropertiesCount);

vkGetPhysicalDeviceQueueFamilyProperties(physicalDevice,
  &queueFamilyPropertiesCount, queueFamilyProperties);

We do this by using a pair of calls to vkGetPhysicalDeviceQueueFamilyProperties, the first to get the number of queue families available, and the second to fill an array of information about our queue families. In each queue family:

typedef struct VkQueueFamilyProperties {
    VkQueueFlags    queueFlags; // care about this
    uint32_t        queueCount;
    uint32_t        timestampValidBits;
    VkExtent3D      minImageTransferGranularity;
} VkQueueFamilyProperties;

We care about the queueFlags member, which specifies what workloads can execute on a particular queue. A naive approach would be to find any queue that can handle compute workloads. A better approach is to find a queue that handles only compute workloads (though you need to ignore the transfer bit, and for our purposes the sparse binding bit too).

Once we have a valid index into our queueFamilyProperties array we allocated, we need to keep this index around – it becomes our queue family index used in various other places of the API.

Next up, create the device:

typedef struct VkDeviceQueueCreateInfo {
    VkStructureType             sType;
    const void*                 pNext;
    VkDeviceQueueCreateFlags    flags;
    uint32_t                    queueFamilyIndex; // care about this
    uint32_t                    queueCount;
    const float*                pQueuePriorities;
} VkDeviceQueueCreateInfo;

typedef struct VkDeviceCreateInfo {
    VkStructureType                    sType;
    const void*                        pNext;
    VkDeviceCreateFlags                flags;
    uint32_t                           queueCreateInfoCount; // care about this
    const VkDeviceQueueCreateInfo*     pQueueCreateInfos;    // care about this
    uint32_t                           enabledLayerCount;
    const char* const*                 ppEnabledLayerNames;
    uint32_t                           enabledExtensionCount;
    const char* const*                 ppEnabledExtensionNames;
    const VkPhysicalDeviceFeatures*    pEnabledFeatures;
} VkDeviceCreateInfo;

The queue family index we just worked out is used in our VkDeviceQueueCreateInfo struct’s queueFamilyIndex member, and our VkDeviceCreateInfo has queueCreateInfoCount set to 1, with pQueueCreateInfos pointing to our single VkDeviceQueueCreateInfo struct.

Lastly we get our device’s queue using:

VkQueue queue;
vkGetDeviceQueue(device, queueFamilyIndex, 0, &queue);

Et voilà, we have our device, we have our queue, and we are done (with getting our device and queue at least).

allocating memories and buffers from them

To allocate buffers for use in our compute shader, we first have to allocate the memory that backs the buffer – the physical location of the buffer on the device. Vulkan supports many different memory types, so we need to query for one that matches our requirements. We do this with a call to vkGetPhysicalDeviceMemoryProperties, and then find a memory type that has the properties we require and is big enough for our uses:

const VkDeviceSize memorySize; // whatever size of memory we require
for (uint32_t k = 0; k < properties.memoryTypeCount; k++) {
  const VkMemoryType memoryType = properties.memoryTypes[k];

  if ((VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT & memoryType.propertyFlags)
  && (VK_MEMORY_PROPERTY_HOST_COHERENT_BIT & memoryType.propertyFlags)
  && (memorySize < properties.memoryHeaps[memoryType.heapIndex].size)) {
    // found our memory type!
  }
}

If we know how big a memory we require, we can find an index into our VkPhysicalDeviceMemoryProperties struct for a memory type that has the properties we require set, and is big enough. For the sample I’m using memory that is host visible and host coherent (for ease of sample writing).

With the memory type index we found above we can allocate a memory:

typedef struct VkMemoryAllocateInfo {
    VkStructureType    sType;
    const void*        pNext;
    VkDeviceSize       allocationSize;
    uint32_t           memoryTypeIndex; // care about this
} VkMemoryAllocateInfo;

We need to care about the memoryTypeIndex – which we’ll set to the index we worked out from VkPhysicalDeviceMemoryProperties before.

For the sample, I allocate one memory, and then subdivide it into two buffers. We create two storage buffers (using VK_BUFFER_USAGE_STORAGE_BUFFER_BIT), and since we do not intend to use overlapping regions of memory for the buffers our sharing mode is VK_SHARING_MODE_EXCLUSIVE. Lastly we need to specify which queue families these buffers will be used with – in our case it’s the one queueFamilyIndex we discovered at the start.

The link between our memories and our buffers is vkBindBufferMemory:

vkBindBufferMemory(device, in_buffer, memory, 0);
vkBindBufferMemory(device, out_buffer, memory, bufferSize);

The crucial parameter for us to use the one memory for two buffers is the last one – memoryOffset. For our second buffer we set it to begin after the first buffer has ended. Since we are creating storage buffers, we need to be sure that our memoryOffset is a multiple of the minStorageBufferOffsetAlignment member of the VkPhysicalDeviceLimits struct. For the purposes of the sample, we choose a memory size that is a large power of two, satisfying the alignment requirements on our target platforms.

The last thing we can do is fill the memory with some initial random data. To do this we map the memory, write to it, and unmap, prior to using the memory in any queue:

VkDeviceSize memorySize; // whatever size of memory we require

int32_t *payload;
vkMapMemory(device, memory, 0, memorySize, 0, (void **)&payload);

for (uint32_t k = 0; k < memorySize / sizeof(int32_t); k++) {
  payload[k] = rand();
}

vkUnmapMemory(device, memory);

And that is it – we have our memory and buffers ready to fill with data later.

writing a simple compute shader

My job with Codeplay is to work on the Vulkan specification with the Khronos group. My real passion within this is making compute awesome. I spend a good amount of my time working on Vulkan compute but also on SPIR-V for Vulkan. I’ve never been a happy user of GLSL compute shaders – and luckily now I don’t have to use them!

For the purposes of the sample, I’ve hand written a little compute shader to copy from a storage buffer (set = 0, binding = 0) to another storage buffer (set = 0, binding = 1). As to the details of my approach, I’ll leave that to a future blog post (it’d be a lengthy sidetrack for this post I fear).

To create a compute pipeline that we can execute with, we first create a shader module with vkCreateShaderModule. Next we need a descriptor set layout using vkCreateDescriptorSetLayout, with the following structs:

VkDescriptorSetLayoutBinding descriptorSetLayoutBindings[2] = {
  {0, VK_DESCRIPTOR_TYPE_STORAGE_BUFFER, 1, VK_SHADER_STAGE_COMPUTE_BIT, 0},
  {1, VK_DESCRIPTOR_TYPE_STORAGE_BUFFER, 1, VK_SHADER_STAGE_COMPUTE_BIT, 0}
};

VkDescriptorSetLayoutCreateInfo descriptorSetLayoutCreateInfo = {
  VK_STRUCTURE_TYPE_DESCRIPTOR_SET_LAYOUT_CREATE_INFO,
  0, 0, 2, descriptorSetLayoutBindings
};

We are describing the bindings within the set we are using for our compute shader, namely we have two descriptors in the set, both of which are storage buffers being used in a compute shader.

We then use vkCreatePipelineLayout to create our pipeline layout:

typedef struct VkPipelineLayoutCreateInfo {
    VkStructureType                 sType;
    const void*                     pNext;
    VkPipelineLayoutCreateFlags     flags;
    uint32_t                        setLayoutCount; // care about this
    const VkDescriptorSetLayout*    pSetLayouts;    // care about this
    uint32_t                        pushConstantRangeCount;
    const VkPushConstantRange*      pPushConstantRanges;
} VkPipelineLayoutCreateInfo;

Since we have only one descriptor set, we set setLayoutCount to 1, and pSetLayouts to the two-binding descriptor set layout we created before.

And then lastly we use vkCreateComputePipelines to create our compute pipeline:

VkComputePipelineCreateInfo computePipelineCreateInfo = {
  VK_STRUCTURE_TYPE_COMPUTE_PIPELINE_CREATE_INFO,
  0, 0,
  {
    VK_STRUCTURE_TYPE_PIPELINE_SHADER_STAGE_CREATE_INFO,
    0, 0, VK_SHADER_STAGE_COMPUTE_BIT, shader_module, "f", 0
  },
  pipelineLayout, 0, 0
};

Our shader module has one entry point called “f”, and it is a compute shader. We also need the pipeline layout we just created, and et voilà – we have our compute pipeline ready to execute with.

executing the compute shader

To execute a compute shader we need to:

  1. Create a descriptor set with two VkDescriptorBufferInfos – one for each of our buffers (one per binding in the compute shader).
  2. Update the descriptor set to set the bindings of both of the VkBuffer’s we created earlier.
  3. Create a command pool with our queue family index.
  4. Allocate a command buffer from the command pool (we’re using VK_COMMAND_BUFFER_USAGE_ONE_TIME_SUBMIT_BIT as we aren’t resubmitting the buffer in our sample).
  5. Begin the command buffer.
  6. Bind our compute pipeline.
  7. Bind our descriptor set at the VK_PIPELINE_BIND_POINT_COMPUTE.
  8. Dispatch a compute shader for each element of our buffer.
  9. End the command buffer.
  10. And submit it to the queue!

getting the results

To get the results from a submitted command buffer, the coarse-grained way is to use vkQueueWaitIdle – wait for all command buffers submitted to a queue to complete. For our purposes, we are submitting one command buffer to one queue and waiting for it to complete, so it is the perfect tool for our sample – but broadly speaking you are better off chaining dependent submissions together with VkSemaphores, and using a VkFence only for the submission at the end of the workload to ensure execution has completed.

Once we’ve waited on the queue, we simply map the memory and check that the first half of the buffer equals the second half – i.e. the memcpy of the elements succeeded:

int32_t *payload;
vkMapMemory(device, memory, 0, memorySize, 0, (void **)&payload);

for (uint32_t k = 0, e = bufferSize / sizeof(int32_t); k < e; k++) {
  assert(payload[k + e] == payload[k]);
}

And we are done! We have written our first memcpy sample in Vulkan compute shaders.

fin

The sample is dirty in ‘real-world application’ terms – it doesn’t free any of the Vulkan objects that need to be freed on completion. The reason: one of the drivers I am testing on loves to segfault on perfectly valid code (and yes, for any IHVs reading this, I have already flagged it up with the relevant vendor!).

But for the purposes of explaining an easy Vulkan compute sample to all the compute lovers among my readership, I hope the above gives you a good overview of exactly how to do that. Yes, there are many hoops to jump through to get something executing, but the sheer level of control that can be achieved through the Vulkan API far outweighs the few extra lines of code we need.

The full sample is available at the GitHub gist here.

Stay tuned for more Vulkan compute examples to come in future posts!

27 Mar

Allocators in json.h!

While at GDC I got a feature request from the awesome folks over at Blastbay Studios requesting that they could feed an allocator to json.h, my JSON parsing library. This has long been on my TODO list, so I took the opportunity of a plane ride back from San Francisco to the Isle of Skye to remedy the situation!

The latest master version of json.h changes the signature of json_parse_ex to include two new fields:

struct json_value_s *json_parse_ex(
  const void *src,
  size_t src_size,
  size_t flags_bitset,
  void*(*alloc_func_ptr)(void *, size_t), // new field!
  void *user_data,                        // new field!
  struct json_parse_result_s *result);

These are alloc_func_ptr and user_data.

To recap, json.h uses one allocation to store the entire structure and contents of the JSON file (see my introductory post on json.h for details).

If alloc_func_ptr is not NULL then, instead of calling malloc, json.h will call the provided allocator, passing user_data as the first parameter and the size of the requested allocation as the second.

I hope this added functionality proves useful to the many people who are already using json.h in their applications!

11 Jan

git pre-commit clang-format hook

Most of my company projects have a requirement that we run clang-format on all commits to keep a consistent style. Unfortunately, I’m as forgetful as the fabled goldfish (which actually isn’t that forgetful it turns out) so I often forget to run clang-format.

After the umpteenth time of forgetting, and suffering some quite deep-seated rage from my colleagues, I decided to investigate a way to automate the process.

It turns out git has a really cool feature, introduced in 1.7.0, that lets you specify a folder of hooks, such that when you run git clone or git init to create a new local copy of a repository, it will, as part of the creation process, take a copy of these ‘template’ git hooks and use them in the repository.

I’ve created a GitHub repo git-hooks that contains my pre-commit hook for running clang-format on all .c, .cpp, .cc and .h files before committing them.

To use, simply clone the git repo to a folder <whatever folder name you choose>, and then run;

git config --global init.templatedir "<whatever folder name you choose>"

Then whenever you clone or init a git repository this hook will be copied and be used in your repositories. If you have existing repositories that you want to use the hook, simply re-run git init in the repository and it will copy the hook across.

One minor caveat, if you have any pre-existing git pre-commit hook in the repository then git will not overwrite it.

I’ve tested this on Windows and Linux, and I hope it proves as useful to others as it has been to me!

23 Dec

Full Simplified JSON support in json.h

In a previous post, (Partial) Simplified JSON support in json.h, I covered the partial simplified JSON support I had added to json.h. One of the things I covered was my unwillingness to implement two of the features of simplified JSON: commas being optional, and replacing : with =. I argued that both were unnecessary and stupid additions to the library.

I was wrong. The immediate feedback I received detailed good reasons why these were useful – but more importantly, why did I half-arse implementing a feature that was requested by a user?

So today, I’ve implemented all of simplified JSON. You can now parse;

foo = "bar"
yada = 2
meow = true

by using the following code;

const char payload[] = ...; // the string above!
json_value_s* json = json_parse_ex(
  payload,
  strlen(payload),
  json_parse_flags_allow_simplified_json,
  0);

A few caveats to remember when writing/using this code though:

  • json_parse_flags_allow_simplified_json is a bitmask enabling many other json_parse_flags that you can enable separately. As such, the behaviour of each of them applies collectively when using this flag
  • commas aren’t banned, they are just not required. You can mix commas/no-commas throughout your simplified JSON. You can also have trailing commas after the last element in an object/array now too
  • unquoted keys aren’t banned, you can mix and match quoted/unquoted keys
  • you always have a global object. Even if your JSON string was ‘{}’, you would have a global object that contained one empty object with simplified JSON enabled
  • colons aren’t banned, you can mix ‘:’ and ‘=’ when separating your key/value pairs within objects
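Putting a few of those caveats together, something like the following (contents purely illustrative) parses fine with the flag set – a trailing comma, mixed quoted/unquoted keys, and mixed ‘:’/‘=’ separators:

```
foo = "bar",
"yada" : 2
meow = true,
```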

I hope you find this option useful, and I intend to keep working on json.h over my Christmas holidays so stay tuned!

11 Dec

Introducing utest.h!

So as you may (or may not) already know – I’ve written two tiny C libraries – utf8.h and json.h. One of the questions that most plagued me when writing these libraries was ‘How should I write the tests for them?’ I could put them in a separate repository, I could put them within the repository, should I create a new testing framework every time? It would be useful to be able to have test cases spread across multiple files… and the thoughts went on and on.

Basically what I wanted was a single header variant of googletest – and for it to be able to work in C. I scoured the interwebs a bit, but I couldn’t really find exactly what I wanted. When in doubt, write your own I say!

So I’m introducing utest.h – a single header C/C++ unit tester. It’s licensed under the public domain (via the unlicense), and it works and is tested on Mac OS X, Linux and Windows, with both C and C++ files in the same tested executable. I’ve tried to mimic googletest’s command line output as much as possible, even using coloured text output where possible.

To start testing with utest.h, the absolute minimum you need is a TESTCASE(set, name) and UTEST_MAIN() to be in one of your source files. UTEST_MAIN() defines an int main(…) entry point function, so don’t try and define your own! For example;

#include "utest.h"

TESTCASE(foo, bar) {
 ASSERT_TRUE(1);
}

UTEST_MAIN();

I use the gcc/clang extension __attribute__((constructor)) to allow multiple files to register test cases pre-main, and an even uglier MSVC workaround to mimic __attribute__((constructor)) on Windows. The bottom line though – it works, and everyone can test across C/C++ to their heart’s delight.

Future work will include having command line arguments to select what test cases to run, and being able to output an xunit/junit xml file so that continuous integrations can easily pick up the results of the test cases. Stay tuned!

04 Dec

(Partial) Simplified JSON support in json.h

With the help of the rather awesome @ocornut – I’ve managed to get a variant of simplified JSON support into json.h!

So first off – what is simplified JSON?  Taken straight from the Bitsquid blog;

  1. Assume an object definition at the root level (no need to surround entire file with { })
  2. Commas are optional
  3. Quotes around object keys are optional if the keys are valid identifiers
  4. Replace : with =

Of the four points above, I’m going to argue that only 1. and 3. are actually useful.


1. Assume an object definition at the root level (no need to surround entire file with { })

If you are always going to start a JSON file with an object (which is the recommended behaviour when using JSON), it can be quite tedious to surround the parent object with { }’s. Imagine we have the following;

{
  "a" : true,
  "b" : false,
  "c" : null
}

Whereas with 1. this could read;

"a" : true,
"b" : false,
"c" : null

As we can see, it reads that little bit nicer overall. It also means you don’t have to indent the parent key/value elements of the main object – if you, like me, are rather over the top about indentation, this is a nice space saver. The one downside for the JSON security purists is that someone could easily append new elements onto the object – something that I know is a problem in certain domains.

2. Commas are optional

I really dislike this idea. Essentially the idea is that;

{
  "a" : true
  "b" : false
  "c" : null
}

Would be entirely valid with simplified JSON. I dislike this because you can end up with some really disgusting code. The code above looks like the newlines are basically replacing the commas to denote new elements, but the above would be functionally equivalent to;

{
  "a" : true"b" : false"c" : null
}

Which looks utterly hideous. This also adds some pretty funky parsing variants for json.h which I just wasn’t keen to add.

3. Quotes around object keys are optional if the keys are valid identifiers

It can be a real chore, and also quite expensive in terms of file size, to surround all the keys of objects with " "s too! For the example;

{
  "a" : true,
  "b" : false,
  "c" : null
}

Given that “a”, “b” and “c” don’t contain funky characters or whitespace, why not allow them to be specified like;

{
  a : true,
  b : false,
  c : null
}

It looks pretty nice, and saves some space, to be able to specify them without the " "s.

4. Replace : with =

This rule I dislike simply because it is a stylistic choice. Even in the Bitsquid blog that specified simplified JSON, it was done to make the code read more like Lua. This is something we could add, but I don’t see the point – it’s not a functional change, it’s a stylistic one.

Solving the problem with 2.

So as I said in my comments on 2., I don’t like the ‘commas are optional’ rule. To my rescue came @ocornut, who happened to suggest that allowing trailing commas on elements would be a useful helper for some work he was doing with my json.h library – and it got me thinking: this seems to solve at least part of the problem with 2.! Making commas a little easier to use makes them that little bit more innocuous for developers.

So I’ve settled on a happy medium of the features from simplified JSON that I think are useful. In the next post I’ll explain how I changed the API to allow both pure/original/unadulterated JSON to survive alongside my partial simplified JSON support!

14 Sep

json.h performance (update!)

So previously I tested my json.h library’s performance against other JSON libraries in the wild (json.h performance (vs other C/C++ JSON parsers)). My JSON parser wasn’t performing as I’d expected in the tests, so I’ve spent a good amount of time looking at why.

Before we begin, you can find the sources to my public domain, one .c/one .h file json.h library here – https://github.com/sheredom/json.h

A little information on my initial approach. The idea was that we would try and keep the values that make up arrays and objects together in memory. Using the worked example;

{"a" : [123, null, true, false, "alphabet"]}

Would produce the following DOM structure for the above JSON;

chart2

In the above diagram, we can see that the one json_object_s contains an array to the names and values contained within the object, and the one json_array_s contains an array to the values.

In practice though, it was this design decision that caused the main performance bottleneck of the approach – effectively, each time you came to a new object or array, you’d first have to skip over all its contents simply to find out the number of values within! In big O notation, our worst case was O(n²) for parsing a given JSON file.

To fix this issue, we need to decouple the allocation of the memory that stores the values contained within objects and arrays, such that we can create a single value, fill out the information about its contents, and then allocate the next value. Effectively we need to sacrifice our array and instead have a linked list of values.

To do this I’ve introduced json_array_element_s and json_object_element_s structures which form the new linked-list to the json.h library, and it is through these that you’ll have to iterate when traversing the DOM. Our worst case now becomes O(n) for parsing a JSON file.

Our new DOM for the above worked example is;

dom2

Our new DOM is admittedly not as concise or pretty, but performance wise – it is a winner.

I used three files, parsed and traversed them, and recorded the results in my previous post. I’ve updated the graphs below but this time added in a new row ‘json.h – old’ for the previous approach, and json.h is using the newer approach.

json-generator2

AllSets2

sf-city-lots-json2

As can be seen from the three charts, our performance is massively improved over the previous approach – an easy 3x+ faster overall. gason still performs the best of all the libraries benchmarked, but at least our parser is now in the region of the other parsers in terms of performance, notwithstanding the massive philosophical design difference that json.h decided on.

I hope you enjoyed this post, and I’ll keep beavering away on my json.h library to improve it further!

24 Aug

json.h performance (vs other C/C++ JSON parsers)

One of the first questions I had when I open sourced my json.h library was how fast it can parse compared to other commonly used JSON libraries around. I’ve gone through all of the JSON libraries in C/C++ that I find tolerable (and I’ll document which ones I wouldn’t touch with a barge pole and why too!) and performed a performance comparison on them.

First off, I tested the following JSON C/C++ libraries;

  • gason – C++ library with MIT license.
  • json.h – my own JSON library, written in C and licensed under the unlicense.
  • minijson_reader – C++11 library with BSD 3-Clause license.
  • rapidjson – C++ library with MIT license.
  • sajson – C++ library with MIT license.
  • ujson4c – C library with BSD license.

And I rejected the following JSON C/C++ libraries (with some comments reasoning why I haven’t used them);

  • jsmn – I initially did add this to the benchmark, but I found that when testing large input files, the time taken to parse was exponentially linked to the file size. Thus after around 1 minute trying to parse a 190MB file I made my benchmark timeout, and have removed this library from my benchmarking.
  • cson – Uses some bizarre SCM called Fossil SCM which was enough of a roadblock to stop any further explorations.
  • frozen – GPL license means it isn’t worth considering any further.
  • jansson – absolutely crazy CMake files required to get it to build.
  • js0n – only allows for searching of arbitrary tokens in a JSON stream, does not do a full parse and exploration of the DOM.
  • jsoncpp – another crazy CMake mess, not touching this library.
  • json++ – requires Flex/Bison, stupid requirement for a simple library.
  • json-c – Absolutely no idea what is going on with this repository. Clusterfuck is too kind an explanation for what is going on.
  • mjson – Requires compile time knowledge of JSON structure (only allows parsing of ‘known’ JSON structures).
  • nxjson – Another GPL licensed library means it is a write-off.
  • vjson – code did not have a license, too risky to explore it further.

I ran two kinds of testing, the cost of parsing some JSON files and storing them in the intermediate form, and also the cost of parsing then traversing the JSON to count all the numbers in the JSON structure.

json-generator

The first JSON file I tested was a 9KB file. In this test my own json.h library is second worst of the six tested, a full 4x slower than the best performing gason library.

AllSets

Ouch, on the 12.9MB file we perform worst of all. The reason minijson_reader seems to beat us here is that the input file AllSets.json is simply one large JSON array containing identical JSON objects in each element. The deeper the nesting of the JSON objects (e.g. objects, within arrays, within objects, etc.) the worse minijson_reader performs.

sf-city-lots-json

On the largest of our inputs, the 190MB file, we perform second worst again. This file has many levels of objects and arrays, which is why minijson_reader performs atrociously badly.

One thing to note across all the examples is the difference between parsing, and parsing + traversing. Our library has a much narrower gap between these two than the other libraries, which means that once parsing has completed, iterating through our in-memory representation is much quicker than the alternatives. Now I just need to work out how to improve the parsing speed!

So in conclusion, my library is slower than I’d like – the only heartening thing is that its speed is consistent (at least I don’t have exponentially bad parsing!). I’ve already started to profile and rewrite the offending part of the parsing, something I’ll cover in a future blog post.

Please check out the library here – I am more than happy to accept merge requests and feedback (I’ve already changed things based on user feedback!).

18 Aug

json.h

Back in May, the rather awesome imgui creator @ocornut asked the following;

And it got me thinking – why isn’t there a simple JSON reader/writer lib, in the same vein as the stb_* libraries, that performs a single call to malloc to encode the state? I couldn’t find one, so I decided to write my own.

I’m introducing json.h – my one header/one source json library that will parse a JSON source into a single allocation buffer, and also has functions to write out the minified version of the JSON, and a pretty print function (for human readable JSON).

Let’s go through a worked example – take the following trivial JSON;

{"a" : [123, null, true, false, "alphabet"]}

The above example covers all the core concepts inherent within JSON, so serves as a good coverage tool for our parsing. The above will be parsed (using json_parse) into a single-malloc’ed buffer, with the start of that buffer being a json_value_s* – a pointer to the root value. The Document Object Model (DOM) for this JSON is;

chart2

And the single allocation in-memory view of the above is;

chart3

In terms of speed of the library I’ve used these JSON files for reference;

Which produces the following chart;

chart

Currently, parsing is averaging around 55 MB/s, pretty writing around 300 MB/s and minified writing around 500 MB/s on my Intel Core i7-2700k 3.5GHz.

My next step will be to look into my parsing approach and see if anything can be done to speed up parsing of JSON!

I hope this library is useful, and I’m happy to have any comments/critiques on my approach.