-
Notifications
You must be signed in to change notification settings - Fork 1.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Use greg7mdp/parallel-hashmap for faster execution #2468
base: master
Are you sure you want to change the base?
Conversation
@digit-google @jhasse how do you think about this PR instead of #2462? |
Thanks, I have tested this CL with a local Fuchsia build on Linux, I get a consistent 2% (x1.02) speedup there. Can you fuse all commits into a single one though? I'll add a few inline comments too. I suggest adding a top-level |
c1a74de
to
de4f281
Compare
Done.
Done. |
de4f281
to
52d0c4a
Compare
Thanks, the latest commit is much better, but please change the commit message to something informative. This is what will be submitted into the final git repository, so the current message will be useless for people looking at the git log in the future. And the github conversation here, and its description message at the top, will essentially be lost in the wind after that. @jhasse , what do you think? |
This implements option 1 in ninja-build#2462 (comment). This uses https://github.com/greg7mdp/parallel-hashmap instead of unordered_map for faster ninja execution. Some more context is in https://crbug.com/346910589. This makes noop build for chromium * 12% faster on Linux * 2% slower on Windows * 96% faster on macOS ``` $ hyperfine --warmup 1 \ '~/ninja_master -C out/non_component build:buildflag_header_h' \ '~/ninja_faster_hash -C out/non_component build:buildflag_header_h' Benchmark 1: ~/ninja_master -C out/non_component build:buildflag_header_h Time (mean ± σ): 5.251 s ± 0.027 s [User: 4.641 s, System: 0.608 s] Range (min … max): 5.222 s … 5.309 s 10 runs Benchmark 2: ~/ninja_faster_hash -C out/non_component build:buildflag_header_h Time (mean ± σ): 4.689 s ± 0.022 s [User: 4.055 s, System: 0.633 s] Range (min … max): 4.648 s … 4.737 s 10 runs Summary ~/ninja_faster_hash -C out/non_component build:buildflag_header_h ran 1.12 ± 0.01 times faster than ~/ninja_master -C out/non_component build:buildflag_header_h ``` ``` Benchmark 1: C:\Users\tikuta\source\repos\ninja-build\ninja\ninja_master -C out/non_component build:buildflag_header_h Time (mean ± σ): 13.161 s ± 0.105 s [User: 10.657 s, System: 2.194 s] Range (min … max): 13.005 s … 13.328 s 10 runs Benchmark 2: C:\Users\tikuta\source\repos\ninja-build\ninja\ninja_faster_hash -C out/non_component build:buildflag_header_h Time (mean ± σ): 13.206 s ± 0.210 s [User: 10.653 s, System: 2.194 s] Range (min … max): 12.951 s … 13.578 s 10 runs Summary C:\Users\tikuta\source\repos\ninja-build\ninja\ninja_master -C out/non_component build:buildflag_header_h ran 1.00 ± 0.02 times faster than C:\Users\tikuta\source\repos\ninja-build\ninja\ninja_faster_hash -C out/non_component build:buildflag_header_h ``` ``` $ hyperfine --warmup 1 '~/ninja_master -C out/non_component build:buildflag_header_h' '~/ninja_faster_hash -C out/non_component build:buildflag_header_h' Benchmark 1: ~/ninja_master -C out/non_component build:buildflag_header_h Time (mean ± σ): 14.360 s ± 0.706 s [User: 9.006 s, System: 1.681 s] Range (min … max): 12.832 s … 15.297 s 10 runs Benchmark 2: ~/ninja_faster_hash -C out/non_component build:buildflag_header_h Time (mean ± σ): 7.340 s ± 0.260 s [User: 6.263 s, System: 0.828 s] Range (min … max): 6.948 s … 7.799 s 10 runs Summary ~/ninja_faster_hash -C out/non_component build:buildflag_header_h ran 1.96 ± 0.12 times faster than ~/ninja_master -C out/non_component build:buildflag_header_h ```
52d0c4a
to
d91e07c
Compare
Thank you for your review, I updated commit message. |
This is much faster than std::unordered_map, and also slightly faster than phmap::flat_hash_map that was included in PR ninja-build#2468. It is MIT-licensed, and we just include the .h file wholesale. I haven't done a detailed test of all the various unordered_maps out there, but this is the overall highest-ranking contender on https://martin.ankerl.com/2022/08/27/hashmap-bench-01/ except for ankerl::unordered_dense::map, which requires C++17. For a no-op build of Chromium (Linux, Zen 2), this reduces time spent from 5.14 to 4.62 seconds.
This is much faster than std::unordered_map, and also slightly faster than phmap::flat_hash_map that was included in PR ninja-build#2468. It is MIT-licensed, and we just include the .h file wholesale. I haven't done a detailed test of all the various unordered_maps out there, but this is the overall highest-ranking contender on https://martin.ankerl.com/2022/08/27/hashmap-bench-01/ except for ankerl::unordered_dense::map, which requires C++17. For a no-op build of Chromium (Linux, Zen 2), this reduces time spent from 5.14 to 4.62 seconds.
This is much faster than std::unordered_map, and also slightly faster than phmap::flat_hash_map that was included in PR ninja-build#2468. It is MIT-licensed, and we just include the .h file wholesale. I haven't done a detailed test of all the various unordered_maps out there, but this is the overall highest-ranking contender on https://martin.ankerl.com/2022/08/27/hashmap-bench-01/ except for ankerl::unordered_dense::map, which requires C++17. For a no-op build of Chromium (Linux, Zen 2), this reduces time spent from 5.14 to 4.62 seconds.
This is much faster than std::unordered_map, and also slightly faster than phmap::flat_hash_map that was included in PR ninja-build#2468. It is MIT-licensed, and we just include the .h file wholesale. I haven't done a detailed test of all the various unordered_maps out there, but this is the overall highest-ranking contender on https://martin.ankerl.com/2022/08/27/hashmap-bench-01/ except for ankerl::unordered_dense::map, which requires C++17. For a no-op build of Chromium (Linux, Zen 2), this reduces time spent from 5.14 to 4.62 seconds.
This is much faster than std::unordered_map, and also slightly faster than phmap::flat_hash_map that was included in PR ninja-build#2468. It is MIT-licensed, and we just include the .h file wholesale. I haven't done a detailed test of all the various unordered_maps out there, but this is the overall highest-ranking contender on https://martin.ankerl.com/2022/08/27/hashmap-bench-01/ except for ankerl::unordered_dense::map, which requires C++17. For a no-op build of Chromium (Linux, Zen 2), this reduces time spent from 5.14 to 4.62 seconds.
This is much faster than std::unordered_map, and also slightly faster than phmap::flat_hash_map that was included in PR ninja-build#2468. It is MIT-licensed, and we just include the .h file wholesale. I haven't done a detailed test of all the various unordered_maps out there, but this is the overall highest-ranking contender on https://martin.ankerl.com/2022/08/27/hashmap-bench-01/ except for ankerl::unordered_dense::map, which requires C++17. For a no-op build of Chromium (Linux, Zen 2), this reduces time spent from 5.14 to 4.62 seconds.
This is much faster than std::unordered_map, and also slightly faster than phmap::flat_hash_map that was included in PR ninja-build#2468. It is MIT-licensed, and we just include the .h file wholesale. I haven't done a detailed test of all the various unordered_maps out there, but this is the overall highest-ranking contender on https://martin.ankerl.com/2022/08/27/hashmap-bench-01/ except for ankerl::unordered_dense::map, which requires C++17. For a no-op build of Chromium (Linux, Zen 2), this reduces time spent from 5.14 to 4.62 seconds.
This is much faster than std::unordered_map, and also slightly faster than phmap::flat_hash_map that was included in PR ninja-build#2468. It is MIT-licensed, and we just include the .h file wholesale. I haven't done a detailed test of all the various unordered_maps out there, but this is the overall highest-ranking contender on https://martin.ankerl.com/2022/08/27/hashmap-bench-01/ except for ankerl::unordered_dense::map, which requires C++17. For a no-op build of Chromium (Linux, Zen 2), this reduces time spent from 5.14 to 4.62 seconds.
That it's slower on Windows is a bit unfortunate. I'm not sure if adding a third party hash map is worth it for a 12 % speed up - the macOS speed up is nice though. |
This implements option 1 in #2462 (comment).
This uses https://github.com/greg7mdp/parallel-hashmap instead of unordered_map for faster ninja execution.
Some more context is in https://crbug.com/346910589.
This makes noop build for chromium
Linux
Windows
macOS