use btreemap for type_map and release #542

Merged
merged 6 commits into from
Apr 11, 2024

Conversation

chenyan-dfinity
Contributor

@chenyan-dfinity chenyan-dfinity commented Apr 9, 2024

According to the encoding flamegraph, with a large type table the encoder spends a lot of time in `hashbrown::raw::RawTable::reserve_rehash`. Changing the data structure from `HashMap` to `BTreeMap` significantly improves performance. Another reason for the improvement is that comparing type ASTs is usually cheaper than hashing them.
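To illustrate the data-structure change, here is a minimal sketch of a type table keyed by a type AST and stored in a `BTreeMap`. The `Ty` enum and `TypeTable` struct are hypothetical simplifications made up for this example; they are not the encoder's actual types. The point is only that an ordered map needs `Ord` on the key, and a derived `Ord` can stop at the first difference between two ASTs, whereas a derived `Hash` has to walk the whole AST on every lookup.

```rust
use std::collections::BTreeMap;

// Hypothetical, simplified type AST (an assumption for illustration only).
// Deriving `Ord` gives a structural comparison that returns as soon as
// two values differ.
#[derive(Debug, Clone, PartialEq, Eq, PartialOrd, Ord)]
pub enum Ty {
    Nat,
    Text,
    Opt(Box<Ty>),
    Vec(Box<Ty>),
    Record(Vec<(String, Ty)>),
}

// Hypothetical type table: maps each distinct type to its table index.
#[derive(Default)]
pub struct TypeTable {
    index_of: BTreeMap<Ty, usize>,
}

impl TypeTable {
    // Returns the existing index for `ty`, or assigns the next free index.
    pub fn intern(&mut self, ty: &Ty) -> usize {
        if let Some(&idx) = self.index_of.get(ty) {
            return idx;
        }
        let idx = self.index_of.len();
        self.index_of.insert(ty.clone(), idx);
        idx
    }

    // Number of distinct types stored so far.
    pub fn len(&self) -> usize {
        self.index_of.len()
    }
}
```

Interning the same type twice returns the same index, so the table holds one entry per distinct type. Unlike a hash table, a `BTreeMap` grows node by node and never rehashes all of its entries, which is the cost the flamegraph pointed at.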

This PR also disables the memoization step that unrolls types. The type table can now be slightly larger, but encoding takes much less time.

For the list_proposal benchmark, the encoding cost drops from 19M to 10M instructions, and further to 6M after disabling type unrolling.
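As a quick sanity check, the rounded figures above can be compared against the benchmark report in this PR, which lists 6_636_110 encode instructions for nns_list_proposal with a change of -65.58%. The helper names below are made up for this sketch.

```rust
// Signed percentage change from `old` to `new` (negative means fewer instructions).
pub fn percent_change(old: f64, new: f64) -> f64 {
    (new - old) / old * 100.0
}

// Baseline implied by a new measurement and its signed percentage change.
pub fn implied_baseline(new_value: f64, percent: f64) -> f64 {
    new_value / (1.0 + percent / 100.0)
}

// Implied pre-change encode cost for nns_list_proposal, using the two
// numbers taken from the report: 6_636_110 instructions at -65.58%.
pub fn list_proposal_baseline() -> f64 {
    implied_baseline(6_636_110.0, -65.58)
}
```

The implied baseline comes out at roughly 19.3M instructions, which agrees with the "19M" quoted in the description.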


github-actions bot commented Apr 9, 2024

| Name | Max Mem (Kb) | Encode | Decode |
| --- | --- | --- | --- |
| blob | 4_224 | 20_458_812 ($\textcolor{green}{-0.03\%}$) | 12_083_520 ($\textcolor{green}{-0.01\%}$) |
| btreemap | 75_456 | 4_814_024_296 ($\textcolor{green}{-0.00\%}$) | 15_380_030_660 ($\textcolor{green}{-0.00\%}$) |
| nns | 192 | 2_172_806 ($\textcolor{green}{-4.57\%}$) | 12_210_232 ($\textcolor{green}{-14.86\%}$) |
| nns_list_proposal | 1_920 ($\textcolor{red}{11.11\%}$) | 6_636_110 ($\textcolor{green}{-65.58\%}$) | 182_353_563 ($\textcolor{red}{0.13\%}$) |
| option_list | 576 | 7_436_043 ($\textcolor{red}{1.71\%}$) | 31_705_099 ($\textcolor{green}{-6.34\%}$) |
| text | 6_336 | 28_844_735 ($\textcolor{green}{-0.01\%}$) | 17_839_178 ($\textcolor{green}{-0.00\%}$) |
| variant_list | 128 | 7_431_973 ($\textcolor{green}{-1.15\%}$) | 23_873_932 ($\textcolor{green}{-6.32\%}$) |
| vec_int16 | 16_704 | 168_581_831 ($\textcolor{green}{-0.01\%}$) | 1_031_829_277 ($\textcolor{green}{-0.00\%}$) |

  • Parser cost: 18_351_038
  • Extra args: 3_348_203 ($\textcolor{green}{-0.54\%}$)
Raw report:

---------------------------------------------------

Benchmark: blob
  total:
    instructions: 32.54 M (-0.02%) (change within noise threshold)
    heap_increase: 66 pages (no change)
    stable_memory_increase: 0 pages (no change)

  1. Encoding (scope):
    instructions: 20.46 M (-0.03%) (change within noise threshold)
    heap_increase: 66 pages (no change)
    stable_memory_increase: 0 pages (no change)

  2. Decoding (scope):
    instructions: 12.08 M (-0.01%) (change within noise threshold)
    heap_increase: 0 pages (no change)
    stable_memory_increase: 0 pages (no change)

---------------------------------------------------

Benchmark: text
  total:
    instructions: 46.69 M (-0.01%) (change within noise threshold)
    heap_increase: 99 pages (no change)
    stable_memory_increase: 0 pages (no change)

  1. Encoding (scope):
    instructions: 28.84 M (-0.01%) (change within noise threshold)
    heap_increase: 99 pages (no change)
    stable_memory_increase: 0 pages (no change)

  2. Decoding (scope):
    instructions: 17.84 M (-0.00%) (change within noise threshold)
    heap_increase: 0 pages (no change)
    stable_memory_increase: 0 pages (no change)

---------------------------------------------------

Benchmark: vec_int16
  total:
    instructions: 1.20 B (-0.00%) (change within noise threshold)
    heap_increase: 261 pages (no change)
    stable_memory_increase: 0 pages (no change)

  1. Encoding (scope):
    instructions: 168.58 M (-0.01%) (change within noise threshold)
    heap_increase: 261 pages (no change)
    stable_memory_increase: 0 pages (no change)

  2. Decoding (scope):
    instructions: 1.03 B (-0.00%) (change within noise threshold)
    heap_increase: 0 pages (no change)
    stable_memory_increase: 0 pages (no change)

---------------------------------------------------

Benchmark: btreemap
  total:
    instructions: 20.19 B (-0.00%) (change within noise threshold)
    heap_increase: 1179 pages (no change)
    stable_memory_increase: 0 pages (no change)

  1. Encoding (scope):
    instructions: 4.81 B (-0.00%) (change within noise threshold)
    heap_increase: 257 pages (no change)
    stable_memory_increase: 0 pages (no change)

  2. Decoding (scope):
    instructions: 15.38 B (-0.00%) (change within noise threshold)
    heap_increase: 922 pages (no change)
    stable_memory_increase: 0 pages (no change)

---------------------------------------------------

Benchmark: option_list
  total:
    instructions: 39.14 M (improved by 4.91%)
    heap_increase: 9 pages (no change)
    stable_memory_increase: 0 pages (no change)

  1. Encoding (scope):
    instructions: 7.44 M (1.71%) (change within noise threshold)
    heap_increase: 0 pages (no change)
    stable_memory_increase: 0 pages (no change)

  2. Decoding (scope):
    instructions: 31.71 M (improved by 6.34%)
    heap_increase: 9 pages (no change)
    stable_memory_increase: 0 pages (no change)

---------------------------------------------------

Benchmark: variant_list
  total:
    instructions: 31.31 M (improved by 5.14%)
    heap_increase: 2 pages (no change)
    stable_memory_increase: 0 pages (no change)

  1. Encoding (scope):
    instructions: 7.43 M (-1.15%) (change within noise threshold)
    heap_increase: 0 pages (no change)
    stable_memory_increase: 0 pages (no change)

  2. Decoding (scope):
    instructions: 23.87 M (improved by 6.32%)
    heap_increase: 2 pages (no change)
    stable_memory_increase: 0 pages (no change)

---------------------------------------------------

Benchmark: nns
  total:
    instructions: 33.48 M (improved by 6.27%)
    heap_increase: 3 pages (no change)
    stable_memory_increase: 0 pages (no change)

  0. Parsing (scope):
    instructions: 18.35 M (no change)
    heap_increase: 3 pages (no change)
    stable_memory_increase: 0 pages (no change)

  1. Encoding (scope):
    instructions: 2.17 M (improved by 4.57%)
    heap_increase: 0 pages (no change)
    stable_memory_increase: 0 pages (no change)

  2. Decoding (scope):
    instructions: 12.21 M (improved by 14.86%)
    heap_increase: 0 pages (no change)
    stable_memory_increase: 0 pages (no change)

---------------------------------------------------

Benchmark: nns_list_proposal
  total:
    instructions: 188.99 M (improved by 6.16%)
    heap_increase: 30 pages (regressed by 11.11%)
    stable_memory_increase: 0 pages (no change)

  1. Encoding (scope):
    instructions: 6.64 M (improved by 65.58%)
    heap_increase: 5 pages (improved by 16.67%)
    stable_memory_increase: 0 pages (no change)

  2. Decoding (scope):
    instructions: 182.35 M (0.13%) (change within noise threshold)
    heap_increase: 25 pages (regressed by 19.05%)
    stable_memory_increase: 0 pages (no change)

---------------------------------------------------

Benchmark: extra_args
  total:
    instructions: 3.35 M (-0.54%) (change within noise threshold)
    heap_increase: 0 pages (no change)
    stable_memory_increase: 0 pages (no change)

---------------------------------------------------
Successfully persisted results to canbench_results.yml

@chenyan-dfinity chenyan-dfinity marked this pull request as ready for review April 11, 2024 16:34
@chenyan-dfinity chenyan-dfinity changed the title use btreemap for type_map use btreemap for type_map and release Apr 11, 2024
@chenyan-dfinity chenyan-dfinity merged commit ae4d0f7 into master Apr 11, 2024
5 checks passed
@chenyan-dfinity chenyan-dfinity deleted the perf-ser branch April 11, 2024 16:59