Since my last post introducing utf8.h I’ve been frantically working on fleshing out the core utf8* functions to match the str* ones, and also listening to developer feedback!
Firstly, you can check out the single-header C/C++ library here – utf8.h.
- @daniel_collin suggested adding an ASCII-only utf8casecmp, which has been added. I’m looking into extending this to support more of the characters in Unicode (the most obvious candidates, and the ones I understand best, are the Latin letters with accents).
- @mcclure111 suggested I actually document the code where appropriate, and I’ve undertaken efforts to remedy this.
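To make "ASCII-only" concrete, here is a minimal sketch of what such a comparison can look like. This is not the code from utf8.h; `sketch_ascii_casecmp` and `ascii_lower` are hypothetical names made up for this post, and the sketch assumes both inputs are NUL-terminated. Only the bytes 'A' to 'Z' are case-folded, and every byte of a multi-byte sequence (0x80 and above) is compared as-is.

```c
#include <stddef.h>

/* Hypothetical helper (not part of utf8.h): lower-case only ASCII 'A'..'Z'.
   Bytes >= 0x80, which belong to multi-byte UTF-8 sequences, pass through. */
static unsigned char ascii_lower(unsigned char c) {
    return (c >= 'A' && c <= 'Z') ? (unsigned char)(c + ('a' - 'A')) : c;
}

/* Hypothetical sketch: compare two NUL-terminated UTF-8 strings byte by byte,
   ignoring case for ASCII letters only. Returns <0, 0 or >0 like strcmp. */
int sketch_ascii_casecmp(const char *a, const char *b) {
    const unsigned char *pa = (const unsigned char *)a;
    const unsigned char *pb = (const unsigned char *)b;

    /* Advance while the folded bytes match and we have not hit the end. */
    while (*pa != 0 && ascii_lower(*pa) == ascii_lower(*pb)) {
        pa++;
        pb++;
    }
    return (int)ascii_lower(*pa) - (int)ascii_lower(*pb);
}
```

With this sketch, "Hello" and "hELLO" compare equal, while "É" and "é" do not, because their UTF-8 encodings differ in a non-ASCII byte. That gap is exactly what the extension to accented letters would need to cover.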
Next up I plan to tackle the utf8canon that @KmBenzie suggested, to canonicalize overlong UTF-8 sequences into their shortest form (for example, an ASCII value can be erroneously encoded as a 4-byte sequence, which is not well-formed UTF-8).
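As an illustration of the problem, the letter 'A' (U+0041) should be the single byte 0x41, but the 4-byte pattern 0xF0 0x80 0x81 0x81 carries the same 21 bits of payload. The sketch below shows one possible approach, under my own assumptions: decode leniently (accepting overlong forms), then re-encode in the shortest form. It is not the planned utf8canon implementation; `sketch_decode_lenient` and `sketch_encode_shortest` are hypothetical names, and the sketch deliberately does not deal with surrogates or other validity checks.

```c
#include <stddef.h>

/* Hypothetical sketch: decode one sequence starting at s WITHOUT rejecting
   overlong forms. Stores the codepoint in *cp and returns the number of
   bytes consumed, or 0 if the lead or a continuation byte is malformed. */
static size_t sketch_decode_lenient(const unsigned char *s, unsigned long *cp) {
    size_t len, i;

    if (s[0] < 0x80) { *cp = s[0]; return 1; }
    else if ((s[0] & 0xE0) == 0xC0) { *cp = s[0] & 0x1F; len = 2; }
    else if ((s[0] & 0xF0) == 0xE0) { *cp = s[0] & 0x0F; len = 3; }
    else if ((s[0] & 0xF8) == 0xF0) { *cp = s[0] & 0x07; len = 4; }
    else return 0; /* stray continuation byte or invalid lead byte */

    for (i = 1; i < len; i++) {
        /* A NUL terminator fails this check too, so we never read past it. */
        if ((s[i] & 0xC0) != 0x80) return 0;
        *cp = (*cp << 6) | (unsigned long)(s[i] & 0x3F);
    }
    return len;
}

/* Hypothetical sketch: write cp to out using the fewest bytes possible.
   Returns the number of bytes written (1..4), or 0 if cp > 0x10FFFF.
   Assumption: surrogates (U+D800..U+DFFF) are not checked here. */
static size_t sketch_encode_shortest(unsigned long cp, unsigned char *out) {
    if (cp < 0x80) {
        out[0] = (unsigned char)cp;
        return 1;
    }
    if (cp < 0x800) {
        out[0] = (unsigned char)(0xC0 | (cp >> 6));
        out[1] = (unsigned char)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp < 0x10000) {
        out[0] = (unsigned char)(0xE0 | (cp >> 12));
        out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (unsigned char)(0x80 | (cp & 0x3F));
        return 3;
    }
    if (cp <= 0x10FFFF) {
        out[0] = (unsigned char)(0xF0 | (cp >> 18));
        out[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
        out[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[3] = (unsigned char)(0x80 | (cp & 0x3F));
        return 4;
    }
    return 0;
}
```

Feeding the overlong bytes 0xF0 0x80 0x81 0x81 through the decoder gives U+0041, and re-encoding that gives the single byte 0x41. Since the canonical form is never longer than the input, a pass like this could in principle rewrite a string in place.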