Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improved c-tiny benchmark #5948

Draft
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

Manishearth
Copy link
Member

@Manishearth Manishearth commented Dec 25, 2024

Progress on #5945

This does not fully enact the vision in #5945, but it does start testing various combinations of tactics. I didn't want to test all combinations and I didn't want to spend time figuring out which combinations we really want, so I went ahead and tested the ones most relevant to the current investigation (#5935).

This shows the effects of LTO, linker-plugin-lto, and stripping+gc-sections on panic-abort (with panic-immediate-abort std) release Rust builds.

Benchmarks with just panic=abort but no panic-immediate-abort/panic-abort std would probably be useful to clients but I haven't added them now. It may be worth using makefile magic to truly build a matrix of targets.

@Manishearth
Copy link
Member Author

Manishearth commented Dec 25, 2024

The current numbers are:

-rwxr-x--- 1 manishearth primarygroup   47416 Dec 25 12:03 panic-abort-clang.elf
-rwxr-x--- 1 manishearth primarygroup   47416 Dec 25 12:03 panic-abort-clang-stripped.elf
-rwxr-x--- 1 manishearth primarygroup   33296 Dec 25 12:09 panic-abort-linker-plugin-lto-clang-twostep.elf
-rwxr-x--- 1 manishearth primarygroup   33296 Dec 25 12:08 panic-abort-linker-plugin-lto-clang-twostep-stripped.elf
-rwxr-x--- 1 manishearth primarygroup  140400 Dec 25 12:03 panic-abort-lto-clang.elf
-rwxr-x--- 1 manishearth primarygroup   43320 Dec 25 12:03 panic-abort-lto-clang-stripped.elf
-rwxr-x--- 1 manishearth primarygroup  133520 Dec 25 12:03 panic-abort-lto-clang-twostep.elf
-rwxr-x--- 1 manishearth primarygroup   34040 Dec 25 12:03 panic-abort-lto-clang-twostep-stripped.elf
-rwxr-x--- 1 manishearth primarygroup 5273536 Dec 25 11:53 release.elf
-rwxr-x--- 1 manishearth primarygroup  387664 Dec 25 11:46 release-gcc-stripped.elf

An important wrinkle: LTO bloats binaries unless paired with stripping + gc-sections.

@sffc
Copy link
Member

sffc commented Dec 26, 2024

Praise: this is a good change to make. I'm glad there's someone else who has gone deep into this test.

Question: I don't know if I fully understand the difference between "one step" LTO and "two step" LTO. What I think it means: "one step" runs LTO on the C code and Rust code separately, whereas "two step" runs LTO on both at the same time.

Question: even in "one step", you have Rust emitting bitcode. Is this necessary? And I think this is one of the things that causes LLVM version mismatches to be problematic?

@sffc
Copy link
Member

sffc commented Dec 26, 2024

Observation: you found that "panic-abort-linker-plugin-lto-clang-twostep.elf" and "panic-abort-linker-plugin-lto-clang-twostep-stripped.elf" are the same size, but not one-step. My expectation is that this is because in "two-step", LTO is already running on the whole binary, so there is nothing left to strip, whereas in "one-step", there is no dead code elimination until you turn on braindead gc-sections.

@sffc
Copy link
Member

sffc commented Dec 26, 2024

In other words, my understanding has been that the thing you're calling "two-step LTO", aka "linker-plugin-lto", aka the thing I've always just called simply "LTO", applies -Os optimizations even at link time, so it can do things like inline function calls between Rust and C and as a result juice a bit extra code size and performance.

In my mental model, the thing you're calling "one-step LTO" to me is "LTO separately in C and Rust, with standard non-LTO cross-language linking". I agree that this does not depend on LLVM version between C and Rust since there is no LLVM bitcode or metadata that needs to be shared.

@Manishearth
Copy link
Member Author

Manishearth commented Dec 26, 2024

Question: even in "one step", you have Rust emitting bitcode. Is this necessary? And I think this is one of the things that causes LLVM version mismatches to be problematic?

No, pure rust LTO still requires that flag (it refuses to even initiate compilation without it, try it!)

Remember, Rust is still linking together a giant pile of rlibs! Each rlib does contain some machine code, so if Rust wishes to do proper LTO it needs to include LLVM bitcode in its rlibs alongside the machine code.

Why that's not implied I don't know. Cargo's LTO setting implies the other flag. I suppose they wanted explicitness in the lower level CLI. 🤷

I considered using Cargo profiles to do the LTO toggle but didn't want to make too many changes. That would have the nice benefit of not needing --target-dir hacks since I can use --profile.


linker-plugin-lto is documented as the thing needing LLVM concordance. No other mention of LTO in Rust has that requirement, and it matches my previous testing. I can do some more testing with the new makefile to verify.

"two-step LTO", aka "linker-plugin-lto", aka the thing I've always just called simply "LTO"

I don't think the "two step" and "linker-plugin-lto" are the same thing (though I suspect linker-plugin-lto needs two-step to work?), and I have some theories as to why the numbers are different (largely to do with ThinLTO or LLD) but I haven't yet investigated.

Please let's not make our terminology "two step aka linker-plugin-lto", these test results clearly show them as having different results. two-step has an impact even without linker-plugin-lto.

I mostly retained this distinction from the previous test but I haven't dug deeper into it. It's possible it's just a matter of the precise combinations of flags rather than the phasing (in which case we should benchmark those flags independently), or it's deficiencies in the ThinLTO model (in which case we should just try with regular LTO).

From a quick glance, it could literally just be the distinction between using LLD and the default system linker. I didn't poke too deeply, my focus was on linker-plugin-lto whilst trying not to lose any innovations in the old benchmark.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants