chriso/intern

Intern

Fast, immutable string interning for C.

What is this?

  • A way of assigning a unique integer ID to each unique string, without collisions
  • Two-way lookup: ID => string, string => ID
  • Each string is stored only once in memory
  • Optional inlining of unsigned integer strings
  • Very low fragmentation via a custom block allocator
  • Minimal overhead per string: currently ~40 bytes, which could be lower at the cost of additional fragmentation
  • Fast: intern many millions of strings per second
  • String repository optimization based on frequency analysis (improves locality)
  • Support for snapshots (restore to a previous state)
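The two-way lookup and snapshot ideas in the list above can be illustrated with a deliberately naive sketch. This is not the library's implementation or API: every `toy_*` name is hypothetical, lookup is a linear scan rather than whatever index the library uses, and each string gets its own `malloc` instead of a block allocator. It only shows the contract: interning the same string twice yields the same ID, IDs map back to strings, and restoring a snapshot discards everything interned after it.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical toy interner. IDs start at 1; 0 means "not found" or error. */
struct toy_interner {
    char **strings;    /* strings[id - 1] holds the string for id */
    uint32_t count;    /* number of interned strings */
    uint32_t capacity; /* allocated slots in strings */
};

/* string => ID, or 0 if the string has not been interned. */
static uint32_t toy_lookup(const struct toy_interner *t, const char *s) {
    for (uint32_t i = 0; i < t->count; i++)
        if (strcmp(t->strings[i], s) == 0)
            return i + 1;
    return 0;
}

/* Returns the existing ID, or stores one copy of s and assigns the next ID. */
static uint32_t toy_intern(struct toy_interner *t, const char *s) {
    uint32_t id = toy_lookup(t, s);
    if (id)
        return id;
    if (t->count == t->capacity) {
        uint32_t cap = t->capacity ? t->capacity * 2 : 16;
        char **grown = realloc(t->strings, cap * sizeof *grown);
        if (!grown)
            return 0;
        t->strings = grown;
        t->capacity = cap;
    }
    size_t len = strlen(s) + 1;
    char *copy = malloc(len);
    if (!copy)
        return 0;
    memcpy(copy, s, len);
    t->strings[t->count] = copy;
    return ++t->count;
}

/* ID => string, or NULL if the ID is unknown. */
static const char *toy_lookup_id(const struct toy_interner *t, uint32_t id) {
    return (id >= 1 && id <= t->count) ? t->strings[id - 1] : NULL;
}

/* A snapshot here is just the count at some point in time; restoring
 * drops every string interned after that point. */
static void toy_restore(struct toy_interner *t, uint32_t snapshot_count) {
    while (t->count > snapshot_count)
        free(t->strings[--t->count]);
}

static void toy_free(struct toy_interner *t) {
    toy_restore(t, 0);
    free(t->strings);
    t->strings = NULL;
    t->capacity = 0;
}
```

A zero-initialized `struct toy_interner t = {0};` is a valid empty interner in this sketch. Because IDs are assigned sequentially, a snapshot needs no bookkeeping beyond the count, which is one reason append-only designs make restore cheap.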

Installation

$ cmake -G 'Unix Makefiles' -Wno-dev [OPTIONS]
$ make install

Options are:

  • -DBUILD_STATIC=1: Build a static library (libintern.a) rather than a shared library (libintern.so or libintern.dylib)
  • -DMMAP_PAGES=1: Allocate pages with mmap(2) rather than malloc(3)
  • -DPAGE_SIZE=4096: Set the page size
  • -DINLINE_UNSIGNED=1: Inline unsigned integers between 0 and INT_MAX
  • -DCMAKE_BUILD_TYPE=Release: Do a release build / enable optimization
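To make the INLINE_UNSIGNED option more concrete, here is one way inlining unsigned integer strings could work. This is a sketch under stated assumptions, not the library's code: the `toy_*` names and the choice of the top bit as a tag are hypothetical, and it assumes a 32-bit `int` so that any value up to INT_MAX fits in the low 31 bits of the ID. The idea is that a string such as "12345" needs no storage at all if its value can be carried inside the ID itself.

```c
#include <limits.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical tag bit marking an ID that carries an inlined integer.
 * Assumes INT_MAX == 0x7FFFFFFF, so values never touch this bit. */
#define TOY_INLINE_FLAG 0x80000000u

/* Returns true and stores the value if s is a canonical decimal integer
 * in [0, INT_MAX]: digits only, no sign, no leading zeros (except "0").
 * Non-canonical forms such as "007" are rejected because converting the
 * value back to text would not reproduce the original string. */
static bool toy_parse_unsigned(const char *s, uint32_t *out) {
    if (s[0] == '\0' || (s[0] == '0' && s[1] != '\0'))
        return false;
    uint64_t value = 0;
    for (const char *p = s; *p; p++) {
        if (*p < '0' || *p > '9')
            return false;
        value = value * 10 + (uint64_t)(*p - '0');
        if (value > (uint64_t)INT_MAX)
            return false; /* out of range; also keeps value from overflowing */
    }
    *out = (uint32_t)value;
    return true;
}

/* Pack a value into a tagged ID, and unpack it again. */
static uint32_t toy_inline_id(uint32_t value) {
    return value | TOY_INLINE_FLAG;
}

static bool toy_is_inlined(uint32_t id) {
    return (id & TOY_INLINE_FLAG) != 0;
}

static uint32_t toy_inlined_value(uint32_t id) {
    return id & ~TOY_INLINE_FLAG;
}
```

In a scheme like this, an interner would try `toy_parse_unsigned` first and fall back to storing the string only when it returns false. The trade-off is that half of the ID space is reserved for inlined integers.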

Usage

Build your project with -lintern and include <intern/strings.h>.

See strings.h and optimize.h for more details.

Extra

Bindings for Go

License

MIT
