ordered_map unchecked insertions for writing large JSON files #2633

xelatihy · 2021-02-11T12:40:35Z

The use of ordered_map makes the library quite helpful in cases where one needs to maintain key order. One caveat is that key-based operations (lookup and insertion) become O(n) in the number of keys, which is something one might need to avoid in large documents. When deserializing objects from JSON, the O(n) behavior can be sidestepped by iterating over the keys instead of looking them up. When creating a JSON document, though, it cannot be avoided, since every insertion first checks whether the key is already present.

An alternative would be to use a more complex data structure for the ordered_map behavior. But one of the benefits of the current implementation is that it saves memory in large JSON files, and that feature should be maintained.

Another alternative would be to use a SAX-like writer for large documents. This is a challenge on its own, and it is unclear whether it would solve all problems. It would certainly require library users to rewrite a lot of code.

What I am proposing here is to add a method to ordered_map that performs insertions without checking whether the element is already in the dictionary, maybe called uncheck_insert(...). This addition has a few benefits:

it is a small addition to the current setup; the estimated cost is a few lines of code;
it is easy to implement, backward compatible, and without side effects;
it is safe to introduce, since it is fully opt-in and only people who specifically ask for it will use it;
it sidesteps a big performance issue without introducing the additional costs of new data structures.
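
To make the idea concrete, here is a minimal sketch of what I have in mind. It is not the library's code: simple_ordered_map is a hypothetical stand-in that assumes a vector-of-pairs storage, and uncheck_insert is only the name suggested above, not an existing method. The checked emplace scans all keys before appending, while the unchecked variant just appends and relies on the caller to guarantee that the key is new.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a vector-backed ordered map.
// Assumption: entries are stored as key/value pairs in insertion order.
template <typename Key, typename T>
class simple_ordered_map {
public:
    using value_type = std::pair<Key, T>;

    // Checked insertion: linear scan for an existing key, so O(n) per call
    // and O(n^2) to build an object with n keys.
    // Returns false (and leaves the map unchanged) if the key already exists.
    bool emplace(const Key& key, T value) {
        for (const auto& entry : entries_) {
            if (entry.first == key) {
                return false;
            }
        }
        entries_.emplace_back(key, std::move(value));
        return true;
    }

    // Proposed unchecked insertion (hypothetical name): appends without
    // scanning, so amortized O(1) per call.
    // Precondition: the caller guarantees that key is not already present;
    // otherwise the map ends up holding duplicate keys.
    void uncheck_insert(const Key& key, T value) {
        entries_.emplace_back(key, std::move(value));
    }

    std::size_t size() const { return entries_.size(); }
    const std::vector<value_type>& entries() const { return entries_; }

private:
    std::vector<value_type> entries_;
};

// Usage sketch: writing out keys that are known to be unique.
inline void usage_example() {
    simple_ordered_map<std::string, int> object;
    for (int i = 0; i < 1000; ++i) {
        object.uncheck_insert("key" + std::to_string(i), i);
    }
    assert(object.size() == 1000);
    assert(object.entries().front().first == "key0");  // insertion order kept
}
```

The trade-off is that uniqueness becomes the caller's responsibility. That seems acceptable for a writer that generates keys it already knows are distinct, and code that never calls the new method is unaffected.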

If there were interest in this, I would be happy to add the few lines of code needed and test them.
