LevelDB compaction performance issue with large worlds (especially converted worlds) #6580

Open
dktapps opened this issue Dec 20, 2024 · 1 comment
Labels
Category: Core Related to internal functionality Performance Status: Debugged Cause of the bug has been found, but not fixed

Comments

@dktapps
Member

dktapps commented Dec 20, 2024

Problem description

LevelDB automatic level compaction tends to continuously hammer the disk with larger worlds (several GB).

Compacting 10 GB of data (at level 4) can take around 30 minutes of continuous I/O usage. LevelDB attempts to spread this work out so as not to max out I/O, but in practice this manifests as continuous background load on disks.

Wtf is compaction? Why is this happening?

Compaction orders all the data keys in a DB level according to the DB's comparator (by default a simple bytewise comparator that sorts lexicographically, like alphabetical order but over byte values).
When a level in the DB fills up, its data is merged into the next (higher) level, and that level's files may be rebuilt to keep the keys correctly ordered.

During compaction, any files whose key ranges do not overlap with the higher level are directly moved to the higher level without rebuilding. This is cheap, so we want this to be the norm.
Files whose key ranges do overlap are costly to merge into the higher level, because the new data must be sorted with the set of existing data in the higher level.
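The move-vs-merge decision described above can be sketched roughly like this (a simplified illustration in Python, not LevelDB's actual code; `compact` and the range representation are hypothetical):

```python
# Each "file" is represented as a (smallest_key, largest_key) pair of bytes.

def overlaps(file_a, file_b):
    # Two key ranges overlap if neither ends before the other begins.
    return file_a[0] <= file_b[1] and file_b[0] <= file_a[1]

def compact(file, next_level_files):
    victims = [f for f in next_level_files if overlaps(file, f)]
    if not victims:
        return "move"  # cheap: the file is re-linked into the next level as-is
    return f"merge {len(victims)} files"  # costly: all overlapping data is rewritten

# Non-overlapping key range: a trivial move.
print(compact((b"a", b"c"), [(b"d", b"f"), (b"g", b"k")]))  # move
# Overlapping key range: an expensive merge.
print(compact((b"c", b"h"), [(b"d", b"f"), (b"g", b"k")]))  # merge 2 files
```

The more a new file's key range overlaps with the next level, the more existing data must be rewritten, which is exactly why key layout matters so much below.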

This means that key structure is critical to getting best compaction performance. Keys that are accessed at similar times (e.g. chunks with similar coordinates) should have leading key bits that are as similar as possible to minimize range overlap.

However, thanks to Mojang's choice of key format, the data from lower levels is practically guaranteed to overlap because the key ordering is so disorganized relative to actual data usage patterns. This means that every compaction is I/O intensive. This is a big problem with large worlds, where the I/O cost becomes very obvious and sometimes continuous.

There are two main problems:

  • Keys are formatted XXXXZZZZ, so chunks with the same X coordinate appear next to each other regardless of how far apart they are in the actual world. e.g. if you have chunks (0,0), (0,1), (1,0) and (1,1), chunk (0,2) will land in the middle of this order, requiring a lot of data to be moved. Scale this up 10 million times and you can see why large DBs are a problem.
  • Keys use little-endian packed int32s, and since keys are compared byte-by-byte, chunks with similar coordinates appear about as far apart in the DB as they possibly could (e.g. X coordinates are ordered like 0, 65536, 256, 1, 65537, 257). This is like a library index sorting book titles by strrev(title).
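The byte-order problem is easy to reproduce. A small sketch (in Python purely for illustration; the server itself is PHP) sorting X coordinates by their encoded bytes, the way LevelDB's default bytewise comparator would:

```python
import struct

xs = [0, 1, 256, 257, 65536, 65537]

# The comparator sees only bytes, so the encoding determines the sort order.
little = sorted(xs, key=lambda x: struct.pack("<i", x))  # Mojang's encoding
big = sorted(xs, key=lambda x: struct.pack(">i", x))     # big-endian alternative

print(little)  # [0, 65536, 256, 1, 65537, 257] -- neighbours scattered
print(big)     # [0, 1, 256, 257, 65536, 65537] -- numeric order preserved
```

Note that big-endian only matches numeric order here because all the example values are non-negative; signed coordinates would additionally need a sign-flip or offset to sort correctly.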

Proposed solution

There are a couple of ways to solve this problem:

  1. Use a custom comparator to sort the keys optimally - not very portable, because the whole DB must be rebuilt if the comparator changes
  2. Change the key structure - also not ideal, as it would also require a full DB rebuild
    2a) Use Z-order (Morton) codes or another space-filling curve - this would structure keys to look like XZXZXZXZ at the bit level instead of XXXXZZZZ.
    2b) Encode in big-endian so that related chunks' data appears nearer each other. Little-endian doesn't make sense for this case.
  3. Split the DB up into "regions", similar to Anvil & MCRegion. This would largely avoid the key overlap problem by ensuring that overlapping keys don't appear in the same DB in the first place.
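Option 2a can be sketched with a minimal Morton encoder (a hypothetical illustration in Python, not the actual implementation), which interleaves the bits of X and Z so that nearby chunks get numerically nearby codes:

```python
def morton2d(x: int, z: int, bits: int = 32) -> int:
    """Interleave the low `bits` bits of x and z: ... x1 z1 x0 z0."""
    code = 0
    for i in range(bits):
        code |= ((z >> i) & 1) << (2 * i)      # z bits land in even positions
        code |= ((x >> i) & 1) << (2 * i + 1)  # x bits land in odd positions
    return code

# Nearby chunks get nearby codes: the 2x2 block at the origin maps to 0..3.
print(sorted(morton2d(x, z) for x in (0, 1) for z in (0, 1)))  # [0, 1, 2, 3]
```

Real chunk coordinates can be negative, so a production version would first bias them into an unsigned range (and would use bit-twiddling rather than a loop for speed); this sketch only shows the locality property.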

The bottom line is that there's no way to fix this without deviating from the Mojang world format and forcing users to rebuild their worlds.
Really, this is a change that Mojang should make, but I doubt it'll ever happen, since vanilla worlds rarely grow big enough for this kind of problem to show up.

Alternative solutions that don't require API changes

There doesn't seem to be a good alternative solution to this problem. I considered triggering compactions more frequently and on specific key sets, but I don't think that would address the problem of constantly hammering I/O. In addition, compacting by key range isn't very helpful anyway, given the poor key locality.

@dktapps dktapps added Category: Core Related to internal functionality Status: Debugged Cause of the bug has been found, but not fixed Performance labels Dec 20, 2024
@dktapps
Member Author

dktapps commented Dec 21, 2024

Sadly local experiments with improved key structure didn't make a huge difference. It definitely improved things but not nearly as much as I'd hoped.

dktapps added a commit that referenced this issue Dec 21, 2024
This new impl (which is not loadable by vanilla) is targeted at very large worlds, which experience significant I/O performance issues due to a variety of issues described in #6580.

Two main changes are made in RegionizedLevelDB:
- First, multiple LevelDBs are used, each covering a fixed NxN segment of terrain, similar to Anvil in Java. However, there's no constraint on these region sizes. Several experimental sizes are supported by default in WorldProviderManager.
- Second, bigEndianLong(morton2d(chunkX, chunkZ)) is used for chunk keys instead of littleEndianInt(chunkX).littleEndianInt(chunkZ). This new scheme has much better cache locality than Mojang's version, which reduces overlap and costly DB compactions.
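A rough sketch of the new key scheme versus the old one (in Python for illustration; the actual implementation is PHP, and this `morton2d` is a hypothetical loop-based stand-in):

```python
import struct

def morton2d(x: int, z: int, bits: int = 32) -> int:
    # Bit-interleave x and z (x in odd bit positions, z in even).
    code = 0
    for i in range(bits):
        code |= ((z >> i) & 1) << (2 * i)
        code |= ((x >> i) & 1) << (2 * i + 1)
    return code

def new_key(x: int, z: int) -> bytes:
    # bigEndianLong(morton2d(chunkX, chunkZ))
    return struct.pack(">Q", morton2d(x, z))

def old_key(x: int, z: int) -> bytes:
    # littleEndianInt(chunkX) . littleEndianInt(chunkZ)
    return struct.pack("<ii", x, z)

# Under the new scheme a 4x4 neighbourhood sorts in Z-order, so the 2x2
# block nearest the origin is contiguous in the DB:
chunks = [(x, z) for x in range(4) for z in range(4)]
print(sorted(chunks, key=lambda c: new_key(*c))[:4])
# first four: (0, 0), (0, 1), (1, 0), (1, 1)
```

Keeping spatially adjacent chunks adjacent in key order is what reduces range overlap between levels, and with it the cost of compactions.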

The following new provider options are available as a result of this change:
- custom-leveldb-regions-32
- custom-leveldb-regions-64
- custom-leveldb-regions-128
- custom-leveldb-regions-256

Smaller sizes will likely be less space-efficient, but will also probably have better performance.
Once a sweet spot is found, a default will be introduced.

Note that the different variations of custom-leveldb-regions-* are not cross-compatible.
Conversion between the different formats is necessary if you want to change formats.