ICU4X has test-c-tiny and test-js-tiny to show how far codesize can be optimized.
These are incremental, applying optimization on top of optimization to slowly reduce codesize. This shows a nice progression, but it does not help in understanding what the effect of each optimization is in isolation.
I think showing each effect in isolation is an important function of such a benchmark, because many of these techniques are not uniformly available and impose additional constraints on the build: some require nightly, some require paired Rust/Clang versions, some force build-std, some require a particular C compiler, some reduce debuggability, and so on.
Furthermore, a lot of these optimizations build on top of each other: using a release build will of course help LTO be more effective (percent-wise).
Providing numbers for every combination is going to be a lot of work and likely an overwhelming amount of data. However, I think what we could do is identify a list of optimizations that are potentially relevant but not necessarily always possible, and then provide numbers for:
- plain release build
- release build with each of these optimizations individually applied
  - for optimizations that build on each other, e.g. `-Clinker-plugin-lto` needs LTO, apply its dependencies too
- release build with all but one of these optimizations applied
  - similar setup for dependent optimizations: remove both
- release build with all optimizations applied
This would both give us an idea of the immediate wins of individual optimizations, and how they cumulatively work together.
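To make the proposed set of builds concrete, here is a minimal sketch of how the configurations could be enumerated. Everything in it is an assumption for illustration: the short labels, the `DEPENDS_ON` map, and the function names are hypothetical, and the only dependency taken from this issue is that `-Clinker-plugin-lto` needs LTO. It does not run any builds; it only lists which optimizations each build would enable.

```python
from typing import Dict, FrozenSet, List, Set, Tuple

# Hypothetical labels for a subset of the optimizations under discussion.
# Each entry maps an optimization to the optimizations it requires.
# Only the linker-plugin-lto -> lto edge comes from the issue text.
DEPENDS_ON: Dict[str, Set[str]] = {
    "lto": set(),
    "linker-plugin-lto": {"lto"},
    "gc-sections": set(),
    "strip-all": set(),
    "panic-abort": set(),
}


def with_dependencies(opt: str, deps: Dict[str, Set[str]]) -> FrozenSet[str]:
    """Return `opt` plus everything it transitively requires."""
    result: Set[str] = set()
    stack = [opt]
    while stack:
        current = stack.pop()
        if current not in result:
            result.add(current)
            stack.extend(deps[current])
    return frozenset(result)


def with_dependents(opt: str, deps: Dict[str, Set[str]]) -> FrozenSet[str]:
    """Return `opt` plus everything that transitively requires it."""
    return frozenset(o for o in deps if opt in with_dependencies(o, deps))


def build_matrix(deps: Dict[str, Set[str]]) -> List[Tuple[str, FrozenSet[str]]]:
    """List (name, enabled optimizations) for every build in the proposal."""
    everything = frozenset(deps)
    matrix: List[Tuple[str, FrozenSet[str]]] = [("baseline", frozenset())]
    # Each optimization on its own, pulling in what it depends on.
    for opt in sorted(deps):
        matrix.append((f"only:{opt}", with_dependencies(opt, deps)))
    # Everything except one optimization and whatever depends on it.
    for opt in sorted(deps):
        matrix.append((f"all-but:{opt}", everything - with_dependents(opt, deps)))
    matrix.append(("all", everything))
    return matrix


if __name__ == "__main__":
    for name, enabled in build_matrix(DEPENDS_ON):
        print(f"{name:28} {sorted(enabled)}")
```

For n optimizations this yields 2n + 2 builds rather than the 2^n needed for every combination, which is what keeps the amount of data manageable.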
The optimizations I can identify are:
- LTO (off, on, thin; it seems like thin gives us the best perf?)
- `-Clinker-plugin-lto`
- `--gc-sections`
- `--strip-all`
- panic=abort
- panic-abort std
- panic-immediate-abort std
- one-step vs two-step clang
- use of lld (?)
- inclusion of debug symbols in the first place (same as strip? unclear)
This list might be larger than necessary, so we could merge some entries if desired. I might also be missing something. I didn't include Rust debug vs release here because I don't think debug build codesize numbers really mean much, and I can't think of a use case for caring about those numbers.
Came up in #5935
Thoughts? @sffc