Add Column Families to SplinterDB #586

Open. Wants to merge 13 commits into base: main.


etwest
Contributor

@etwest etwest commented Jun 15, 2023

This pull request is a foundational attempt to add support for column families in SplinterDB.

This is accomplished with the splinterdb_column_family struct, which wraps splinterdb. Each column family has an associated struct, and operations on the family are performed by passing that struct to the insert/delete/update/etc. functions.

When a column family performs an operation on a key, the user key is prefixed with a 4-byte column family id. The prefixed keys are then commingled in a single splinterdb instance, though all key/value pairs for a single column family are stored contiguously in the key space.
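The key prefixing described above can be sketched as follows. This is only an illustration, not the code in this PR: the names `cf_encode_key` and `cf_decode_id` are hypothetical, the big-endian encoding of the id is an assumption (chosen so that a bytewise comparison keeps each family's keys contiguous and orders families by id), and treating the 512-byte limit as a limit on the user key alone is also an assumption.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical names; not the API declared in splinterdb/column_family.h. */
#define CF_ID_LEN           4
#define CF_MAX_USER_KEY_LEN 512 /* assumed to apply to the user key only */

/*
 * Write [4-byte column family id][user key] into out.
 * Returns the total encoded length, or 0 if the key is too long or the
 * output buffer is too small. The id is written big-endian (an assumption).
 */
static size_t
cf_encode_key(uint32_t    cf_id,
              const void *user_key,
              size_t      user_len,
              uint8_t    *out,
              size_t      out_cap)
{
   if (user_len > CF_MAX_USER_KEY_LEN || out_cap < CF_ID_LEN + user_len) {
      return 0;
   }
   out[0] = (uint8_t)(cf_id >> 24);
   out[1] = (uint8_t)(cf_id >> 16);
   out[2] = (uint8_t)(cf_id >> 8);
   out[3] = (uint8_t)cf_id;
   if (user_len > 0) {
      memcpy(out + CF_ID_LEN, user_key, user_len);
   }
   return CF_ID_LEN + user_len;
}

/* Recover the column family id from the first 4 bytes of an encoded key. */
static uint32_t
cf_decode_id(const uint8_t *key)
{
   return ((uint32_t)key[0] << 24) | ((uint32_t)key[1] << 16)
          | ((uint32_t)key[2] << 8) | (uint32_t)key[3];
}
```

With this layout, a key from family 1 compares below every key from family 2 under `memcmp`, regardless of the user key bytes, which is what keeps each family's pairs contiguous when the underlying comparison is bytewise.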

The API for column families is defined in the splinterdb/column_family.h header.

Limitations/To-Do:

  • Column families all use xxh32 for the key_hash. There is no support for user defined hash functions.
  • Deletion of column families is not well supported. The space in the data config table is not reclaimed and, more importantly, the data persists in the database even after the column family is deleted.
  • The onus is on the user to support closing and reopening a splinterdb instance that contains column families. The user must create exactly the same column families, in the same order, when reopening the instance. This includes column families that have been deleted, since some of their key/value pairs may persist in the database.
  • Maximum key size of 512 bytes for column families.
  • Mixing regular splinterdb calls and column families results in undefined behavior.
  • Column family iterators are a bit awkward. Currently the user must allocate the iterator so that it is relatively long-lived, which means the user is managing its memory. It would be better to make the iterator a contiguous struct (i.e. embed the splinter iterator directly rather than hold a pointer to it) and perform all the allocation in one shot in the init function.
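The iterator layout suggested in the last item might look roughly like the sketch below. Every name here is hypothetical: `inner_iter` is a stand-in for the underlying splinter iterator, not the real type, and `cf_iter_create` / `cf_iter_destroy` are illustrative rather than functions from this PR.

```c
#include <stdint.h>
#include <stdlib.h>

/* Stand-in for the underlying iterator type (hypothetical). */
typedef struct inner_iter {
   uint64_t pos;
} inner_iter;

/* Pointer-based shape: the inner iterator is a separate allocation
 * whose lifetime has to be managed alongside the wrapper. */
typedef struct cf_iter_ptr {
   uint32_t    cf_id;
   inner_iter *inner;
} cf_iter_ptr;

/* Contiguous shape: the inner iterator is embedded, so one allocation
 * covers both the wrapper and the iterator it wraps. */
typedef struct cf_iter_embedded {
   uint32_t   cf_id;
   inner_iter inner;
} cf_iter_embedded;

/* Allocate and initialize everything in one shot. Returns NULL on failure. */
static cf_iter_embedded *
cf_iter_create(uint32_t cf_id)
{
   cf_iter_embedded *it = malloc(sizeof(*it));
   if (it == NULL) {
      return NULL;
   }
   it->cf_id     = cf_id;
   it->inner.pos = 0;
   return it;
}

/* A single free releases the wrapper and the embedded iterator together. */
static void
cf_iter_destroy(cf_iter_embedded *it)
{
   free(it);
}
```

One trade-off of embedding: the inner iterator type must be complete wherever the wrapper struct is defined, so it cannot remain an opaque type in the public header.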

netlify bot commented Jun 15, 2023

Deploy Preview for splinterdb canceled.

Latest commit: e874985
Latest deploy log: https://app.netlify.com/sites/splinterdb/deploys/64c5ef223b03290008b25a4d

@etwest etwest requested a review from rtjohnso July 11, 2023 22:51
2 participants