Append mode is not working correctly for string keys with character codes larger than 255 (MDB_KEYEXIST error) #207

Open
hjerabek opened this issue Oct 4, 2022 · 3 comments

hjerabek commented Oct 4, 2022

The following sample code throws an MDB_KEYEXIST: Key/data pair already exists error when performing a put operation using the {append:true} option with a string containing character codes larger than 255, even though the key does not already exist:

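var fs=require("fs"),node_lmdb=require("node-lmdb");
var path="append.mdb";
// start from a fresh database file
if (fs.existsSync(path)) fs.unlinkSync(path);
var env=new node_lmdb.Env();
env.open({path:path,maxDbs:1,mapSize:1e7,noSubdir:true});
// keyIsBuffer:false means keys are passed as JavaScript strings
var dbi=env.openDbi({create:true,keyIsBuffer:false});
var i,k,n=65536,txn=env.beginTxn(),opts={append:true};
// insert one single-character key per UTF-16 code unit, in ascending order
for (i=0;i<n;i++) {
    k=String.fromCharCode(i);
    txn.putBoolean(dbi,k,true,opts); // throws MDB_KEYEXIST for char codes > 255
}
txn.commit();
dbi.close();
env.close();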

I assume the reason is that node-lmdb internally encodes strings as little-endian rather than big-endian UTF-16. As soon as the character code hits 256, the first of the two bytes wraps around to 0, so from LMDB's point of view the key is out of order: the database already contains keys whose first byte goes up to 255. My assumption is based on the fact that I get the same error if I use a utf16le-encoded buffer as the key:
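To make the byte-order argument concrete, here is a minimal sketch that needs only Node.js, not node-lmdb. It assumes that keys are compared byte by byte (LMDB's default behaviour) and uses `Buffer.compare` as a stand-in for that comparison. The helper names `encodeLE` and `encodeBE` are made up for this illustration.

```javascript
// Encode a single UTF-16 code unit as little-endian bytes.
function encodeLE(code) {
  return Buffer.from(String.fromCharCode(code), "utf16le");
}

// Same code unit as big-endian bytes. swap16() swaps each byte pair
// in place and returns the same buffer.
function encodeBE(code) {
  return encodeLE(code).swap16();
}

const le255 = encodeLE(255), le256 = encodeLE(256);
const be255 = encodeBE(255), be256 = encodeBE(256);

console.log(le255, le256); // <Buffer ff 00> <Buffer 00 01>
console.log(be255, be256); // <Buffer 00 ff> <Buffer 01 00>

// Buffer.compare orders buffers lexicographically by byte value.
console.log(Buffer.compare(le255, le256)); // 1  (255 sorts AFTER 256 in little-endian)
console.log(Buffer.compare(be255, be256)); // -1 (255 sorts before 256 in big-endian)
```

In little-endian form the key for 256 starts with byte `00`, so it sorts before every key from 1 to 255, which is exactly the situation that append mode rejects.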

var fs=require("fs"),node_lmdb=require("node-lmdb");
var path="append.mdb";
if (fs.existsSync(path)) fs.unlinkSync(path);
var env=new node_lmdb.Env();
env.open({path:path,maxDbs:1,mapSize:1e7,noSubdir:true});
var dbi=env.openDbi({create:true,keyIsBuffer:true});
var i,k,n=65536,txn=env.beginTxn(),opts={append:true};
for (i=0;i<n;i++) {
    k=Buffer.from(String.fromCharCode(i),"utf16le");
    txn.putBoolean(dbi,k,true,opts);
}
txn.commit();
dbi.close();
env.close();

...yet it works without error if the key is utf16be-encoded (since Node.js does not provide that encoding itself, I call swap16() on the utf16le-encoded buffer):

var fs=require("fs"),node_lmdb=require("node-lmdb");
var path="append.mdb";
if (fs.existsSync(path)) fs.unlinkSync(path);
var env=new node_lmdb.Env();
env.open({path:path,maxDbs:1,mapSize:1e7,noSubdir:true});
var dbi=env.openDbi({create:true,keyIsBuffer:true});
var i,k,n=65536,txn=env.beginTxn(),opts={append:true};
for (i=0;i<n;i++) {
    k=Buffer.from(String.fromCharCode(i),"utf16le").swap16();
    txn.putBoolean(dbi,k,true,opts);
}
txn.commit();
dbi.close();
env.close();
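The difference between the two buffer encodings can also be checked without a database. The following sketch scans the keys in the same ascending order as the samples and reports the first character code whose encoded key is not strictly greater than the previous one. It assumes byte-wise key comparison, and `firstOutOfOrder`, `utf16le` and `utf16be` are hypothetical helper names, not part of node-lmdb.

```javascript
// Little-endian encoding of a single UTF-16 code unit.
function utf16le(code) {
  return Buffer.from(String.fromCharCode(code), "utf16le");
}

// Big-endian encoding, obtained by swapping each byte pair.
function utf16be(code) {
  return utf16le(code).swap16();
}

// Returns the first code in [1, n) whose encoded key does not sort
// strictly after the key for code - 1, or -1 if the order never breaks.
function firstOutOfOrder(encode, n) {
  let prev = encode(0);
  for (let i = 1; i < n; i++) {
    const cur = encode(i);
    if (Buffer.compare(prev, cur) >= 0) return i;
    prev = cur;
  }
  return -1;
}

console.log(firstOutOfOrder(utf16le, 1024)); // 256
console.log(firstOutOfOrder(utf16be, 1024)); // -1
```

The little-endian scan breaks at 256, matching the character code named in the issue title, while the big-endian keys stay in ascending order.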

PS: The sample code has been tested with Node.js v12.19 and node-lmdb v0.9.4.

PPS: I know this is a niche issue. I don't expect it to get fixed; I just wanted to report it so others can find the error code in a web search and see the workaround.

DarbyBurbidge commented Jan 14, 2023

I couldn't get append in put options working at all.

const someString = "someString"
const val = txn.getString(db, 1)
console.log(val)
if (val != someString) {
    txn.putString(db, 1, someString);
} else {
    txn.putString(db, 1, 'someOtherString', {append: true})
}

Simply removing the append option does allow me to overwrite the value, but I can't get append itself working.

hjerabek (Author) commented

Your code will use the append mode only if val==someString, in which case the database already contains a value for the key 1. AFAIK, appending only works if the key is larger than any other existing key in the given database. It should even fail if the key you are overwriting is the largest one.
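The rule described above can be illustrated with a toy model in plain Node.js. This is not LMDB or node-lmdb code: `appendKey` is a hypothetical function that mimics the behaviour as stated in this comment (a new key must sort strictly after the current largest key), again using `Buffer.compare` as the assumed byte-wise ordering.

```javascript
// Toy model of append mode: keys live in an array kept in ascending order.
// A key is accepted only if it sorts strictly after the current last key.
function appendKey(sortedKeys, key) {
  const last = sortedKeys[sortedKeys.length - 1];
  if (last !== undefined && Buffer.compare(key, last) <= 0) {
    // Mirrors the error text reported in this issue.
    throw new Error("MDB_KEYEXIST: Key/data pair already exists");
  }
  sortedKeys.push(key);
}

const keys = [];
appendKey(keys, Buffer.from([1])); // accepted: database is empty
appendKey(keys, Buffer.from([2])); // accepted: 2 > 1

try {
  appendKey(keys, Buffer.from([2])); // rejected: equal to the largest key
} catch (e) {
  console.log(e.message); // MDB_KEYEXIST: Key/data pair already exists
}
```

Under this model, overwriting an existing key with the append option always fails, which matches what the snippet with key 1 runs into.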

DarbyBurbidge commented

> Your code will use the append mode only if val==someString, in which case the database already contains a value for the key 1. AFAIK, appending only works if the key is larger than any other existing key in the given database. It should even fail if the key you are overwriting is the largest one.

I completely misunderstood what append is for. I have since figured out what I actually wanted to know, which was one of two things: extending the existing entry (in which case I would just write the existing value plus the new one together as the new value), or storing multiple entries under the same key (solved by creating the db with dupSort: true).

I appreciate the response and apologize for cluttering up the thread.
