SICK: Streams of Independent Constant Keys
SICK is an approach to handling JSON-like structures, together with various libraries that implement it.
SICK allows you to achieve the following:
- Store JSON-like data in an efficient indexed binary form
- Avoid reading and parsing whole JSON files, and access only the data you need, just in time
- Store multiple JSON-like structures in one deduplicating storage
- Implement perfect streaming parsers for JSON-like data
- Efficiently stream updates for JSON-like data
The tradeoff for these benefits is a somewhat more complicated and less efficient encoder.
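To make the idea of indexed, deduplicating storage more concrete, here is a minimal Python sketch. It is not SICK's actual format or API: the `ToyStore` class, its `put`, `resolve` and `get` methods, and the tuple-based node encoding are all hypothetical names invented for illustration. It only shows the general principle of a flat table of nodes addressed by index, where equal subtrees are stored once and a lookup touches only the nodes on its path.

```python
from typing import Any


class ToyStore:
    """Hypothetical flat, deduplicating store (illustration only, not SICK's format)."""

    def __init__(self) -> None:
        self.entries: list[tuple] = []        # index -> encoded node
        self.index: dict[tuple, int] = {}     # encoded node -> index

    def _intern(self, node: tuple) -> int:
        # Reuse the existing index if an equal node was stored before.
        ref = self.index.get(node)
        if ref is None:
            ref = len(self.entries)
            self.entries.append(node)
            self.index[node] = ref
        return ref

    def put(self, value: Any) -> int:
        """Store a JSON-like value bottom-up and return the index of its root."""
        if isinstance(value, dict):
            fields = tuple(sorted((k, self.put(v)) for k, v in value.items()))
            return self._intern(("obj", fields))
        if isinstance(value, list):
            return self._intern(("arr", tuple(self.put(v) for v in value)))
        # The type name keeps 1, 1.0 and True from collapsing into one entry.
        return self._intern(("val", type(value).__name__, value))

    def resolve(self, ref: int, *path: Any) -> int:
        """Follow object keys / array positions, visiting only nodes on the path."""
        for key in path:
            kind, payload = self.entries[ref][0], self.entries[ref][1]
            if kind == "obj":
                ref = dict(payload)[key]
            elif kind == "arr":
                ref = payload[key]
            else:
                raise KeyError(key)
        return ref

    def get(self, ref: int, *path: Any) -> Any:
        """Return the scalar stored at the end of the path."""
        node = self.entries[self.resolve(ref, *path)]
        if node[0] != "val":
            raise TypeError("path does not end at a scalar")
        return node[2]


store = ToyStore()
a = store.put({"name": "svc-a", "limits": {"cpu": 2, "mem": 512}})
b = store.put({"name": "svc-b", "limits": {"cpu": 2, "mem": 512}})

# Both roots point at the same stored "limits" node.
shared = store.resolve(a, "limits") == store.resolve(b, "limits")
mem_b = store.get(b, "limits", "mem")
```

The encoder has to hash and intern every node, which reflects the kind of extra encoding work that the tradeoff above describes. A real binary format would additionally need a stable on-disk layout, which this sketch does not attempt.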
The problem
JSON has a Type-2 (context-free) grammar and requires a pushdown automaton to parse. As a result, it is not possible to implement an efficient streaming parser for JSON. Just imagine a huge hierarchy of nested JSON objects: you won't be able to finish parsing the top-level object until you have processed the whole file.
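The nesting problem can be shown with a small Python sketch. The helper names `scan_top_level` and `nested` are hypothetical and exist only for this illustration. The scanner tracks bracket depth (a stand-in for the parser's stack) and reports where the top-level value closes; for a deeply nested document, that position is the very last character.

```python
import json


def scan_top_level(text: str) -> tuple[int, int]:
    """Return (index of the character closing the top-level value, max depth seen)."""
    depth = 0
    max_depth = 0
    in_string = False
    escaped = False
    for i, ch in enumerate(text):
        if in_string:
            # Brackets inside string literals must not affect the depth.
            if escaped:
                escaped = False
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                in_string = False
            continue
        if ch == '"':
            in_string = True
        elif ch in "{[":
            depth += 1
            max_depth = max(max_depth, depth)
        elif ch in "}]":
            depth -= 1
            if depth == 0:
                return i, max_depth
    raise ValueError("top-level value never closed")


def nested(levels: int) -> str:
    """Build a JSON document with `levels` wrapper objects around one leaf object."""
    doc = {"leaf": True}
    for _ in range(levels):
        doc = {"child": doc}
    return json.dumps(doc)


text = nested(50)
close_at, max_depth = scan_top_level(text)
```

Here `close_at` equals `len(text) - 1`: the consumer cannot consider the top-level object complete until the entire input has been read, and it has to remember one open bracket per nesting level along the way.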
JSON is frequently used to store and transfer large amounts of data and these transfers tend to grow over time. Just imagine a typical JSON config file for a large enterprise product.
The non-streaming nature of almost all JSON parsers means a lot of work has to be done every time you need to deserialize a huge chunk of JSON data: you need to read it from disk, parse it in memory into an AST representation, and, usually, map the raw JSON tree to object instances. Even if you use token streams and know the type of your object ahead of time, you still have to deal with the Type-2 grammar.
This may be very inefficient and cause unnecessary delays, pauses, CPU activity, and memory consumption spikes.