A more optimized stable B-Tree #95

crusso · 2023-10-05T03:19:30Z

No description provided.


github-actions · 2023-10-06T01:21:12Z

Note
Diffing the performance results against the published results from the main branch.
Unchanged benchmarks are omitted.

Map

Note
Same as main branch, skipping.

Priority queue

Note
Same as main branch, skipping.

Growable array

Note
Same as main branch, skipping.

Warning
Skipping table 3 (## Stable structures) from _out/collections/README.md because its shape does not match the table on the main branch.

Statistics

binary_size: no change
max_mem: no change
cycles: no change

Overall Statistics

binary_size: no change
max_mem: no change
cycles: no change

github-actions · 2023-10-06T01:21:14Z

Note
The flamegraph link only works after you merge.
Unchanged benchmarks are omitted.

Collection libraries

Measure different collection libraries written in both Motoko and Rust.
The library names with the _rs suffix are written in Rust; the rest are written in Motoko.
The _stable and _stable_rs suffixes indicate that the library writes its state directly to stable memory, using Region in Motoko and ic-stable-structures in Rust.

We use the same random number generator with a fixed seed to ensure that all collections contain
the same elements and that the queries are exactly the same. Below we explain the measurement in each column of the table:

generate 1m. Insert 1m Nat64 integers into the collection. For Motoko collections, this usually triggers the GC; the remaining columns are unlikely to trigger GC.
max mem. For Motoko, it reports rts_max_heap_size after the generate call. For Rust, it reports the number of Wasm memory pages * 64 KiB. For stable benchmarks, it reports the region size of the stable memory storing the map.
batch_get 50. Find 50 elements in the collection.
batch_put 50. Insert 50 elements into the collection.
batch_remove 50. Remove 50 elements from the collection.
upgrade. Upgrade the canister with the same Wasm module. For non-stable benchmarks, the map state is persisted by serializing it into stable memory and deserializing it afterwards. For stable benchmarks, the upgrade only needs to initialize the metadata, as the state is already in stable memory.
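
The fixed-seed setup described above can be sketched as follows. This is a minimal illustration, not the benchmark's code: the report does not show the actual generator, so the `XorShift64` type, the `generate_keys` helper, and the seed value are all assumptions made for the example.

```rust
use std::collections::{BTreeMap, HashMap};

/// Small deterministic generator (xorshift64). A stand-in for illustration;
/// the generator used by the benchmark is not shown in this report.
struct XorShift64 {
    state: u64,
}

impl XorShift64 {
    fn new(seed: u64) -> Self {
        // xorshift must not start from zero, so substitute a nonzero default.
        let state = if seed == 0 { 0x9E37_79B9_7F4A_7C15 } else { seed };
        XorShift64 { state }
    }

    fn next_u64(&mut self) -> u64 {
        let mut x = self.state;
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        self.state = x;
        x
    }
}

/// Produce the same `n` keys every time for a given seed.
fn generate_keys(seed: u64, n: usize) -> Vec<u64> {
    let mut rng = XorShift64::new(seed);
    (0..n).map(|_| rng.next_u64()).collect()
}

fn main() {
    const SEED: u64 = 42; // hypothetical seed, chosen only for this sketch
    let keys = generate_keys(SEED, 1_000);

    // Feed the identical key sequence to two different collections.
    let mut btree = BTreeMap::new();
    let mut hash = HashMap::new();
    for (i, k) in keys.iter().enumerate() {
        btree.insert(*k, i as u64);
        hash.insert(*k, i as u64);
    }

    // Both collections now hold exactly the same key set.
    assert_eq!(btree.len(), hash.len());
    assert!(btree.keys().all(|k| hash.contains_key(k)));
    println!("{} keys in both collections", btree.len());
}
```

Because every collection consumes the same sequence, differences in the measured columns come from the data structure rather than from the input.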

💎 Takeaways

The platform charges only for instruction count, so data structures that exploit caching and locality gain no cost advantage.
There is a limit on the maximum cycles per round, so asymptotic behavior matters less than performance up to a fixed N. In extreme cases, you may see an $O(10000 n\log n)$ algorithm hitting the limit, while an $O(n^2)$ algorithm runs just fine.
Amortized algorithms/GC may need to be more eager to avoid hitting the cycle limit on a particular round.
Rust costs more cycles to process complicated Candid data, but it is more efficient in performing core computations.
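
To make the point about fixed limits concrete, here is a small sketch that compares the two cost models from the takeaway under a per-round budget. The budget of 1e9 instructions is a made-up number for illustration, not the platform's actual limit, and the helper names are hypothetical.

```rust
/// Cost model with a large constant factor: 10000 * n * log2(n).
fn cost_nlogn(n: u64) -> f64 {
    let n = n as f64;
    10_000.0 * n * n.log2()
}

/// Cost model with worse asymptotics but no constant factor: n^2.
fn cost_quadratic(n: u64) -> f64 {
    let n = n as f64;
    n * n
}

/// Largest n in 1..=upper whose cost stays within `budget`.
/// Assumes `cost` is non-decreasing and that cost(1) <= budget.
fn max_n_within(budget: f64, cost: fn(u64) -> f64, upper: u64) -> u64 {
    let (mut lo, mut hi) = (1u64, upper);
    while lo < hi {
        // Round up so the loop always makes progress.
        let mid = lo + (hi - lo + 1) / 2;
        if cost(mid) <= budget {
            lo = mid;
        } else {
            hi = mid - 1;
        }
    }
    lo
}

fn main() {
    let budget = 1e9; // hypothetical per-round instruction budget
    let upper = 1_000_000;
    println!(
        "largest n within budget: n log n model = {}, n^2 model = {}",
        max_n_within(budget, cost_nlogn, upper),
        max_n_within(budget, cost_quadratic, upper)
    );
}
```

Under this assumed budget the quadratic model accommodates a larger input than the model with the better asymptotic bound, which is the situation the takeaway describes. With a much larger n the ordering flips.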

Note

The Candid interface of the benchmark is minimal, therefore the serialization cost is negligible in this measurement.

Due to the instrumentation overhead and cycle limit, we cannot profile computations with very large collections.

The upgrade column uses Candid for serializing stable data. In Rust, you may get a lower cycle cost by using a different serialization format. Another source of slowdown in Rust is that ic-stable-structures tends to be slower than the region memory in Motoko.

Different libraries persist data across upgrades in different ways, which fall into three main categories:

Use stable variable directly in Motoko: zhenya_hashmap, btree, vector

Expose and serialize external state (share/unshare in Motoko, candid::Encode in Rust): rbtree, heap, btreemap_rs, hashmap_rs, heap_rs, vec_rs

Use pre/post-upgrade hooks to convert data into an array: hashmap, splay, triemap, buffer, imrc_hashmap_rs
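
The third category can be pictured with plain Rust functions. This sketch does not use the real canister hooks or any IC library; `pre_upgrade` and `post_upgrade` here are ordinary functions named after the hooks purely for illustration, and the `Vec` stands in for whatever is written to stable memory.

```rust
use std::collections::BTreeMap;

/// Stand-in for a pre-upgrade hook: flatten the map into an array of entries.
fn pre_upgrade(map: &BTreeMap<u64, u64>) -> Vec<(u64, u64)> {
    map.iter().map(|(k, v)| (*k, *v)).collect()
}

/// Stand-in for a post-upgrade hook: rebuild the map from the saved entries.
fn post_upgrade(entries: Vec<(u64, u64)>) -> BTreeMap<u64, u64> {
    entries.into_iter().collect()
}

fn main() {
    let mut map = BTreeMap::new();
    for i in 0..10u64 {
        map.insert(i, i * i);
    }

    // Every entry is visited once on the way out and once on the way back in.
    let saved = pre_upgrade(&map);
    let restored = post_upgrade(saved);
    assert_eq!(map, restored);
    println!("restored {} entries", restored.len());
}
```

Both directions touch every element, so under this approach the upgrade cost grows with the size of the collection.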

The stable benchmarks are much more expensive than their non-stable counterparts, because the stable memory API is much more expensive. The benefit is a fast upgrade, although the upgrade still needs to parse the metadata when initializing the upgraded Wasm module.
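
The size of this trade-off can be read off the Stable structures table in this report. The sketch below copies the btree and btree_stable rows from that table and computes the ratios; the `Row` type and `ratio` helper are introduced only for this example.

```rust
/// Instruction counts copied from the "Stable structures" table in this report.
struct Row {
    name: &'static str,
    batch_get: u64,
    batch_put: u64,
    batch_remove: u64,
    upgrade: u64,
}

const BTREE: Row = Row {
    name: "btree",
    batch_get: 219_328,
    batch_put: 337_463,
    batch_remove: 368_143,
    upgrade: 125_813_601,
};

const BTREE_STABLE: Row = Row {
    name: "btree_stable",
    batch_get: 5_161_096,
    batch_put: 7_847_782,
    batch_remove: 14_878_672,
    upgrade: 24_593,
};

fn ratio(numerator: u64, denominator: u64) -> f64 {
    numerator as f64 / denominator as f64
}

fn main() {
    let (heap, stable) = (&BTREE, &BTREE_STABLE);
    println!("{} relative to {}", stable.name, heap.name);
    println!("batch_get:    {:.1}x more instructions", ratio(stable.batch_get, heap.batch_get));
    println!("batch_put:    {:.1}x more instructions", ratio(stable.batch_put, heap.batch_put));
    println!("batch_remove: {:.1}x more instructions", ratio(stable.batch_remove, heap.batch_remove));
    println!("upgrade:      {:.0}x fewer instructions", ratio(heap.upgrade, stable.upgrade));
}
```

For these two rows the per-operation cost is tens of times higher in the stable variant, while the upgrade is thousands of times cheaper.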

hashmap uses an amortized data structure. When the initial capacity is reached, it has to copy the whole array, so the cost of batch_put 50 is much higher than for the other data structures.
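
The effect can be reproduced with a toy model. The `DoublingArray` below is a hypothetical structure written for this illustration; it is not the Motoko hashmap implementation, only a sketch of the general grow-by-copying behavior that the note describes.

```rust
/// Toy growable array that doubles its capacity when full and reports
/// how many elements each insertion had to copy.
struct DoublingArray {
    data: Vec<u64>,
    capacity: usize,
}

impl DoublingArray {
    fn with_capacity(capacity: usize) -> Self {
        // Keep the capacity at least 1 so that doubling always grows it.
        let capacity = capacity.max(1);
        DoublingArray { data: Vec::with_capacity(capacity), capacity }
    }

    /// Insert a value and return the number of elements copied by this call.
    fn push(&mut self, value: u64) -> usize {
        let mut copied = 0;
        if self.data.len() == self.capacity {
            // Grow: allocate twice the space and copy every existing element.
            let mut bigger = Vec::with_capacity(self.capacity * 2);
            bigger.extend_from_slice(&self.data);
            copied = self.data.len();
            self.data = bigger;
            self.capacity *= 2;
        }
        self.data.push(value);
        copied
    }
}

/// Total number of elements copied while inserting `count` values.
fn batch_put_cost(arr: &mut DoublingArray, count: usize) -> usize {
    (0..count).map(|i| arr.push(i as u64)).sum()
}

fn main() {
    let mut arr = DoublingArray::with_capacity(1024);
    println!("fill 1000:      copied {}", batch_put_cost(&mut arr, 1000));
    println!("batch of 50:    copied {}", batch_put_cost(&mut arr, 50));
    println!("another 50:     copied {}", batch_put_cost(&mut arr, 50));
}
```

Most batches copy nothing, but the one batch that crosses the capacity boundary pays for copying the entire contents. Averaged over many insertions the cost is small, yet a single round can still be expensive, which matters when each round has a cycle limit.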

btree comes from mops.one/stableheapbtreemap.

btree_stable comes from github.com/sardariuss.

zhenya_hashmap comes from mops.one/map.

vector comes from mops.one/vector. Compared with buffer, put has better worst-case time and space complexity ($O(\sqrt{n})$ vs $O(n)$); get has a slightly larger constant overhead.

hashmap_rs uses the fxhash crate, which behaves the same as std::collections::HashMap, but with a deterministic hasher. This ensures reproducible results.

imrc_hashmap_rs uses the im-rc crate, which provides an immutable hash map for Rust.
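
The idea of pairing the standard map with a deterministic hasher can be shown with the standard library alone. The sketch below does not use fxhash; it substitutes a small FNV-1a hasher as a stand-in so the example stays self-contained, and the `Fnv1a`, `DeterministicMap`, and `hash_key` names are assumptions of this example.

```rust
use std::collections::HashMap;
use std::hash::{BuildHasher, BuildHasherDefault, Hasher};

/// FNV-1a, used here only as a simple deterministic stand-in.
/// The benchmark itself uses the fxhash crate, which is not reproduced here.
struct Fnv1a(u64);

impl Default for Fnv1a {
    fn default() -> Self {
        Fnv1a(0xcbf2_9ce4_8422_2325) // 64-bit FNV offset basis
    }
}

impl Hasher for Fnv1a {
    fn write(&mut self, bytes: &[u8]) {
        for b in bytes {
            self.0 ^= u64::from(*b);
            self.0 = self.0.wrapping_mul(0x0000_0100_0000_01b3); // 64-bit FNV prime
        }
    }

    fn finish(&self) -> u64 {
        self.0
    }
}

/// std HashMap with the random default hasher swapped for a fixed one.
type DeterministicMap<K, V> = HashMap<K, V, BuildHasherDefault<Fnv1a>>;

/// Hash a single key with a freshly built hasher.
fn hash_key(key: u64) -> u64 {
    let mut hasher = BuildHasherDefault::<Fnv1a>::default().build_hasher();
    hasher.write_u64(key);
    hasher.finish()
}

fn main() {
    let mut map: DeterministicMap<u64, u64> = HashMap::default();
    for i in 0..5u64 {
        map.insert(i, i * 10);
    }
    // The same key hashes to the same value on every run and every call.
    assert_eq!(hash_key(3), hash_key(3));
    println!("map has {} entries", map.len());
}
```

The default `HashMap` hasher is seeded randomly per process, so replacing it with a fixed hasher removes one source of run-to-run variation in measurements.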

Map

	binary_size	generate 1m	max mem	batch_get 50	batch_put 50	batch_remove 50	upgrade
hashmap	160_221	6_984_044_999	61_987_852	288_670	5_536_856_410	310_195	9_128_784_003
triemap	163_474	11_463_655_150	74_216_172	222_926	549_435	540_205	13_075_158_546
rbtree	158_149	5_979_229_900	57_996_060	88_905	268_573	278_352	5_771_880_608
splay	159_956	11_568_250_103	53_995_996	552_014	581_765	810_321	3_722_474_749
btree	187_897	8_224_242_789	31_104_012	277_542	384_171	429_041	2_517_941_583
zhenya_hashmap	160_509	2_201_622_562	22_773_100	48_627	61_839	70_872	2_695_448_620
btreemap_rs	477_612	1_651_590_463	27_590_656	66_862	112_477	76_234	2_660_975_747
imrc_hashmap_rs	479_773	2_392_906_831	244_973_568	32_763	163_245	98_394	5_191_575_323
hashmap_rs	467_997	403_296_648	73_138_176	16_851	21_680	20_263	1_144_828_025

Priority queue

	binary_size	heapify 1m	max mem	pop_min 50	put 50	pop_min 50	upgrade
heap	147_638	4_684_519_403	29_995_956	511_505	186_471	487_225	2_655_609_909
heap_rs	463_840	121_602_221	18_284_544	51_661	18_245	51_802	440_739_988

Growable array

	binary_size	generate 5k	max mem	batch_get 500	batch_put 500	batch_remove 500	upgrade
buffer	151_004	2_082_623	65_644	73_092	671_517	127_592	2_474_639
vector	152_551	1_588_260	24_580	105_191	149_932	148_094	3_844_445
vec_rs	459_655	265_683	1_310_720	13_014	25_363	21_247	2_743_831

Stable structures

	binary_size	generate 50k	max mem	batch_get 50	batch_put 50	batch_remove 50	upgrade
btree	187_897	351_889_189	1_554_152	219_328	337_463	368_143	125_813_601
btree_stable	205_322	7_130_493_804	2_621_440	5_161_096	7_847_782	14_878_672	24_593
btreemap_rs	477_612	70_026_986	2_555_904	57_181	86_494	75_309	113_837_931
btreemap_stable_rs	478_668	4_224_209_849	2_621_440	2_528_769	4_605_548	7_817_380	653_359
heap_rs	463_840	6_139_838	2_293_760	44_362	18_477	44_345	23_149_372
heap_stable_rs	451_018	279_422_369	458_752	2_346_843	241_158	2_329_183	653_433
vec_rs	459_655	2_866_886	2_228_224	13_014	14_113	13_710	21_249_908
vec_stable_rs	446_031	65_186_210	458_752	58_992	77_387	79_383	653_447
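
As a quick check on the max mem column, the Rust and stable rows above are whole multiples of the 64 KiB Wasm page size. The helper names in this sketch are hypothetical; the byte values are copied from the tables in this report.

```rust
/// Size of one WebAssembly memory page in bytes (64 KiB).
const WASM_PAGE_BYTES: u64 = 64 * 1024;

fn pages_to_bytes(pages: u64) -> u64 {
    pages * WASM_PAGE_BYTES
}

/// Returns the page count if `bytes` is a whole number of pages.
fn bytes_to_pages(bytes: u64) -> Option<u64> {
    if bytes % WASM_PAGE_BYTES == 0 {
        Some(bytes / WASM_PAGE_BYTES)
    } else {
        None
    }
}

fn main() {
    // max mem values taken from the "Stable structures" table.
    let rows = [
        ("btree", 1_554_152u64),
        ("btree_stable", 2_621_440),
        ("btreemap_rs", 2_555_904),
        ("heap_stable_rs", 458_752),
    ];
    for (name, bytes) in rows {
        match bytes_to_pages(bytes) {
            Some(pages) => println!("{name}: {bytes} bytes = {pages} pages"),
            None => println!("{name}: {bytes} bytes is not page-aligned (heap size, not a page count)"),
        }
    }
}
```

The Motoko btree row is not page-aligned because, per the column description, it reports the heap size rather than a page count.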

Environment

dfx 0.15.1

Motoko compiler 0.10.0 (source a3ywvw0a-p5a03qy6-vscbl9j8-qxszbxa6)

rustc 1.73.0 (cc66ad468 2023-10-03)

ic-repl 0.5.1

ic-wasm 0.6.0

collections/motoko/mops.template.toml


target branch claudio

0071d15

crusso changed the base branch from main to claudio/sardariussBTree October 5, 2023 03:21

crusso added 6 commits October 5, 2023 23:23

Merge branch 'claudio/sardariussBTree' into claudio/sardariussBTree-c…

a8b76f1

…laudio

try something new

1e911af

hackity hack

d9235de

adjust to API changes

2a44e70

disablec crashing rust test

f1c7f96

focus on collections

29feadf

crusso commented Oct 6, 2023

View reviewed changes

collections/motoko/mops.template.toml Outdated Show resolved Hide resolved

crusso added 4 commits October 5, 2023 21:34

Update collections/motoko/mops.template.toml

ab661cf

re-enable btreemap_stable_rs

3031697

Merge branch 'claudio/sardariussBTree-claudio' of github.com:dfinity/…

495a8b6

…canister-profiling into claudio/sardariussBTree-claudio

Merge branch 'claudio/sardariussBTree' into claudio/sardariussBTree-c…

17c9f65

…laudio

crusso mentioned this pull request Oct 6, 2023

WIP: reduce allocations by refactoring sardariuss/MotokoStableBTree#9

Draft

6 tasks

chenyan-dfinity and others added 3 commits October 10, 2023 20:46

bump ic-repl (#96)

8d9e0a8

* Bump ic-repl/ic-wasm to fix the heap out of bound bug in `__get_profiling`
* bump profiling trace to 256M (8M for collection)
* Append version info at the end of the report

Merge remote-tracking branch 'origin/main' into claudio/sardariussBTr…

c100f6e

…ee-claudio

fix

f2f0bcc
