Fast, immutable string interning for C.
- A way of assigning a unique integer ID to each unique string, without collisions
- Two-way lookup: ID => string, string => ID
- Each string is stored only once in memory
- Optional inlining of unsigned integer strings
- Very low fragmentation via a custom block allocator
- Minimal overhead per string: currently ~40 bytes, which could be lower at the cost of additional fragmentation
- Fast: intern many millions of strings per second
- String repository optimization based on frequency analysis (improves locality)
- Support for snapshots (restore to a previous state)
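The core idea in the list above (one ID per unique string, with lookup in both directions) can be sketched with a small chained hash table plus an ID-indexed array. This is a toy illustration only: every name here (`toy_interner`, `toy_intern`, `toy_lookup`, `toy_string`) is hypothetical, and it uses plain `malloc` rather than the block allocator the library describes. It is not libintern's API or implementation; see strings.h for the real interface.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Toy interner for illustration; NOT libintern's API or implementation. */
#define TOY_BUCKETS 1024u /* fixed bucket count, must be a power of two */

struct toy_entry {
    char *string;           /* owned copy; each unique string is stored once */
    uint32_t id;            /* IDs start at 1; 0 is reserved for "not found" */
    struct toy_entry *next; /* chain of entries sharing a bucket */
};

struct toy_interner {
    struct toy_entry *buckets[TOY_BUCKETS]; /* string => ID direction */
    struct toy_entry **by_id;               /* ID => string direction */
    uint32_t count, capacity;
};

/* FNV-1a hash of a NUL-terminated string. */
static uint32_t toy_hash(const char *s) {
    uint32_t h = 2166136261u;
    for (; *s; s++) { h ^= (unsigned char)*s; h *= 16777619u; }
    return h;
}

/* string => ID without inserting; returns 0 if the string is not interned. */
static uint32_t toy_lookup(const struct toy_interner *t, const char *s) {
    const struct toy_entry *e = t->buckets[toy_hash(s) & (TOY_BUCKETS - 1)];
    for (; e; e = e->next)
        if (strcmp(e->string, s) == 0) return e->id;
    return 0;
}

/* Returns the existing ID, or assigns the next one; 0 on allocation failure. */
static uint32_t toy_intern(struct toy_interner *t, const char *s) {
    uint32_t id = toy_lookup(t, s);
    if (id) return id;
    if (t->count == t->capacity) { /* grow the ID => entry array */
        uint32_t cap = t->capacity ? t->capacity * 2 : 16;
        struct toy_entry **p = realloc(t->by_id, cap * sizeof *p);
        if (!p) return 0;
        t->by_id = p;
        t->capacity = cap;
    }
    struct toy_entry *e = malloc(sizeof *e);
    if (!e) return 0;
    size_t len = strlen(s) + 1;
    e->string = malloc(len);
    if (!e->string) { free(e); return 0; }
    memcpy(e->string, s, len);
    uint32_t b = toy_hash(s) & (TOY_BUCKETS - 1);
    e->id = t->count + 1;
    e->next = t->buckets[b]; /* push onto the front of the bucket chain */
    t->buckets[b] = e;
    t->by_id[t->count++] = e;
    return e->id;
}

/* ID => string; returns NULL for an ID that was never assigned. */
static const char *toy_string(const struct toy_interner *t, uint32_t id) {
    return (id >= 1 && id <= t->count) ? t->by_id[id - 1]->string : NULL;
}
```

Starting from a zero-initialized `struct toy_interner`, interning the same string twice returns the same ID, and `toy_string` maps that ID back to the stored copy. Hash collisions only lengthen a bucket chain; they never cause two different strings to share an ID.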
$ cmake -G 'Unix Makefiles' -Wno-dev [OPTIONS]
$ make install
Options are:
- -DBUILD_STATIC=1: Build a static library (libintern.a) rather than a shared library (libintern.so or libintern.dylib)
- -DMMAP_PAGES=1: Allocate pages with mmap(2) rather than malloc(3)
- -DPAGE_SIZE=4096: Set the page size
- -DINLINE_UNSIGNED=1: Inline unsigned integers between 0 and INT_MAX
- -DCMAKE_BUILD_TYPE=Release: Do a release build / enable optimization
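To make the INLINE_UNSIGNED option more concrete, here is one way inlining could work: a string that is a canonical decimal integer between 0 and INT_MAX gets an ID computed directly from its value, so nothing needs to be stored for it. The flag bit and the function names (`TOY_INLINE_FLAG`, `toy_inline_unsigned`, `toy_is_inlined`, `toy_inlined_value`) are assumptions made for this sketch and are not taken from the library's source. The sketch also assumes INT_MAX fits in 31 bits.

```c
#include <limits.h>
#include <stdint.h>

/* Hypothetical encoding: the top bit marks an ID that carries an inlined
   integer. This is an assumption for illustration, not libintern's format. */
#define TOY_INLINE_FLAG 0x80000000u

/* If s is a canonical decimal integer in [0, INT_MAX], writes the inlined ID
   to *id and returns 1. Otherwise returns 0 and leaves *id untouched. */
static int toy_inline_unsigned(const char *s, uint32_t *id) {
    uint32_t n = 0;
    if (*s == '\0') return 0;                  /* empty string */
    if (s[0] == '0' && s[1] != '\0') return 0; /* "007" would not round-trip */
    for (; *s; s++) {
        if (*s < '0' || *s > '9') return 0;    /* not a digit */
        uint32_t d = (uint32_t)(*s - '0');
        /* Reject before n * 10 + d could exceed INT_MAX. */
        if (n > ((uint32_t)INT_MAX - d) / 10) return 0;
        n = n * 10 + d;
    }
    *id = n | TOY_INLINE_FLAG;
    return 1;
}

/* Nonzero if the ID carries an inlined integer. */
static int toy_is_inlined(uint32_t id) {
    return (id & TOY_INLINE_FLAG) != 0;
}

/* Recovers the integer value from an inlined ID. */
static uint32_t toy_inlined_value(uint32_t id) {
    return id & ~TOY_INLINE_FLAG;
}
```

Strings with leading zeros, signs, or non-digit characters are rejected here so that converting the ID back to text reproduces the original string exactly; such strings would go through the normal interning path instead.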
Build your project with -lintern and include <intern/strings.h>.
See strings.h and optimize.h for more details.
License: MIT