Creating zstd dictionaries for compressible mime types #429

RubenKelevra · 2020-04-02T19:23:39Z

I did some testing with the zstd dictionary feature, which significantly speeds up compression and decompression and also improves the compression ratio.

I'd like to create a set of dictionaries for future use in IPFS compression.

Currently, I only have a dictionary for the Linux kernel source code, which achieves a compression ratio of 5.931x. It should work quite well on other C projects, too (though that still needs testing).
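To illustrate why a shared dictionary helps with small, similar files, here is a minimal sketch. It is not the zstd workflow described above: it swaps in zlib's preset-dictionary support (`zdict`) from the Python standard library, and the "trainer" is a naive concatenation of samples rather than zstd's dictionary builder. The sample snippets and all function names are hypothetical, made up for this illustration.

```python
import zlib

# Hypothetical training samples: tiny C files sharing kernel-style boilerplate.
SAMPLES = [
    b'#include <linux/kernel.h>\n#include <linux/module.h>\n\n'
    b'static int __init foo_init(void)\n{\n\treturn 0;\n}\n'
    b'module_init(foo_init);\nMODULE_LICENSE("GPL");\n',
    b'#include <linux/kernel.h>\n#include <linux/module.h>\n\n'
    b'static void __exit bar_exit(void)\n{\n}\n'
    b'module_exit(bar_exit);\nMODULE_LICENSE("GPL");\n',
]


def naive_dictionary(samples, max_size=32 * 1024):
    """Concatenate the samples and keep the trailing max_size bytes.

    Assumption: this stands in for a real trainer. zstd's trainer selects
    common substrings; this simply reuses the raw sample text.
    """
    return b"".join(samples)[-max_size:]


def compress(data, zdict=None):
    """Compress data with zlib, optionally primed with a preset dictionary."""
    if zdict is None:
        comp = zlib.compressobj(level=9)
    else:
        comp = zlib.compressobj(level=9, zdict=zdict)
    return comp.compress(data) + comp.flush()


def decompress(blob, zdict=None):
    """Decompress; the same dictionary must be supplied if one was used."""
    if zdict is None:
        decomp = zlib.decompressobj()
    else:
        decomp = zlib.decompressobj(zdict=zdict)
    return decomp.decompress(blob) + decomp.flush()


DICTIONARY = naive_dictionary(SAMPLES)

# A new file that was not part of the samples but shares their patterns.
NEW_FILE = (
    b'#include <linux/kernel.h>\n#include <linux/module.h>\n\n'
    b'static int __init baz_init(void)\n{\n\treturn 0;\n}\n'
    b'module_init(baz_init);\nMODULE_LICENSE("GPL");\n'
)

plain = compress(NEW_FILE)
primed = compress(NEW_FILE, DICTIONARY)

# Sizes in bytes: original, compressed without and with the dictionary.
print(len(NEW_FILE), len(plain), len(primed))
```

The decompressor needs exactly the same dictionary, so any real deployment would have to distribute and identify dictionaries alongside the data.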

IMHO there's no need for those files to have an open license, since the dictionary training only analyzes the whole dataset for common patterns.

If you'd like to help: I'm looking for large datasets of files that are considered compressible, so that the dictionary for each file type can be as general as possible. Feel free to share links here, via mail, or via Matrix (ruben_kelevra).

More details on my research: ipld/specs#76 (comment)

(Please assign this ticket to me)


hsanjuan · 2020-04-02T22:05:12Z

This does not seem to be describing a go-ipfs feature or how it should work. Right now this looks like a message for https://discuss.ipfs.io. Would you mind moving it or fixing the feature request? Thanks!

RubenKelevra · 2020-04-03T22:24:17Z

> This does not seem to be describing a go-ipfs feature or how it should work. Right now this looks like a message for https://discuss.ipfs.io. Would you mind moving it or fixing the feature request? Thanks!

This is actually just a ticket for work I'd like to complete. It will take some time to create a set of precompiled dictionaries, and this ticket is meant to track the progress.

Maybe move it to ipfs/notes? :)

ribasushi · 2020-04-04T06:07:52Z

Yeah, let's move it to ipfs/notes. Having the go-ipfs binary carry predefined dictionaries (of any kind) is something I would be very opposed to. The research is extremely valuable on its own, but its results would fit in a separate standalone tool, not go-ipfs.

wallabra · 2022-01-24T04:46:48Z

How's progress on this? This entire feature is quite interesting and I can't wait to see where it goes.

Stebalien transferred this issue from ipfs/kubo Apr 4, 2020
