Rust crate compile times on macOS arm64 are incredibly slow #3784

Closed
prrao87 opened this issue Jul 9, 2024 · 10 comments · Fixed by #3826

Comments

@prrao87
Member

prrao87 commented Jul 9, 2024

Description

I'm on macOS Sonoma 14.5 (M3 MacBook), and I'm revisiting the Rust crate compile time issue I'd noticed earlier on macOS. In Kùzu 0.4.2, running cargo build gets stuck on step 35/38 of the compilation process (the step in which the Kùzu linking happens). After 5-8 minutes it finally compiles and I can then do cargo run. I'd expect cargo build --release to take even longer than this.

Changing environment variables as we discussed earlier (setting CARGO_BUILD_JOBS, or manually setting CMAKE_BUILD_PARALLEL_LEVEL) may help when building from source, but I'm thinking from an end-user perspective: setting either of these prior to cargo build has no effect.

Is there any way at all the compile times for the Rust crate could be improved on arm64 Macs? Waiting 5-8 mins seems unreasonable, even if it's for the first build only.

@prrao87
Member Author

prrao87 commented Jul 9, 2024

@benjaminwinger not urgent - just wanted to put this here as I don't think it's been pointed out before (most users of the Rust crate seem to be on Linux).

prrao87 added the apis label Jul 9, 2024
@benjaminwinger
Collaborator

To clarify, is this just an issue with parallelism in the bundled C++ build, or is it something else? That is, does setting CMAKE_BUILD_PARALLEL_LEVEL fix it, meaning the build is just not detecting the correct parallelism by default? I think I've seen warnings similar to rust-lang/cmake-rs#177 in our CI on some platforms (compile in very verbose mode, -vv, to see output from the bundled build).

@prrao87
Member Author

prrao87 commented Jul 9, 2024

Just to clarify, I'm behaving like the general user and doing the following to use the Rust API to construct a graph:

cargo new my_project
cd my_project
cargo add kuzu
cargo build

Because it takes so long to build, I cleared out my cargo cache, deleted the project, recreated it, and then tried doing export CMAKE_BUILD_PARALLEL_LEVEL=20. Still the same effect, and it takes just as long.

@mewim
Member

mewim commented Jul 15, 2024

I can confirm that this is caused by not detecting the correct parallelism by default.

On my MacBook Pro with M2 Max, the default configuration for cargo build reports "Finished dev profile [unoptimized + debuginfo] target(s) in 13m 09s".

However, if I manually set CMAKE_BUILD_PARALLEL_LEVEL:

env CMAKE_BUILD_PARALLEL_LEVEL=$(sysctl -n hw.ncpu) cargo build

the build reports "Finished dev profile [unoptimized + debuginfo] target(s) in 2m 02s".

@mewim
Member

mewim commented Jul 15, 2024

This should have nothing to do with whether the machine is x64 or arm64 architecture.

@prrao87
Member Author

prrao87 commented Jul 15, 2024

That's great! I'll make a note to document this as well; this is something macOS users should know. We can go ahead and close this if you think no further improvements in performance are possible.

@mewim
Member

mewim commented Jul 15, 2024

I am not very familiar with the Rust toolchain, but I think @benjaminwinger might have a way to detect the number of threads and inject it into the build configuration at https://github.com/kuzudb/kuzu/blob/master/tools/rust_api/build.rs
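As a rough illustration of what "detect the number of threads and inject it" could look like inside a build script, here is a minimal sketch. It is not the code in the linked build.rs; the helper names `detected_parallelism` and `cmake_parallel_env` are hypothetical. It only assumes the standard library's `std::thread::available_parallelism` and the CMAKE_BUILD_PARALLEL_LEVEL variable already used in this thread.

```rust
use std::num::NonZeroUsize;
use std::thread;

/// Number of threads the OS reports as available to this process,
/// falling back to 1 if the value cannot be determined.
/// (Hypothetical helper, not taken from the Kùzu build script.)
fn detected_parallelism() -> usize {
    thread::available_parallelism()
        .map(NonZeroUsize::get)
        .unwrap_or(1)
}

/// The name/value pair a build script could pass to the CMake build,
/// for example via `std::process::Command::env`.
/// (Hypothetical helper, not taken from the Kùzu build script.)
fn cmake_parallel_env(jobs: usize) -> (&'static str, String) {
    ("CMAKE_BUILD_PARALLEL_LEVEL", jobs.to_string())
}
```

A build script could call `cmake_parallel_env(detected_parallelism())` and apply the pair before starting the bundled C++ build, which would mirror the manual `sysctl -n hw.ncpu` workaround above without requiring the user to type it.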

@prrao87
Member Author

prrao87 commented Jul 15, 2024

Actually yes, Ben mentioned that he was able to do this automatically on Linux, but on macOS he was only able to test on x86. He would need a way to test on arm64 (or at least suggest to you or me what we could try) so that we can avoid having to manually set the number of build processes each time. That command you showed is really tedious.

@mewim
Member

mewim commented Jul 15, 2024

He can ssh as runner@ac2 (macOS x64) or runner@ac3 (macOS arm64) for testing.

@benjaminwinger
Collaborator

I had actually seen this when testing on the x86_64 macOS server before. However, @prrao87 said that setting CMAKE_BUILD_PARALLEL_LEVEL wasn't helping, so I thought the issue must be something else.

It looks like Cargo sets a NUM_JOBS environment variable. The docs note that newer versions try to use Cargo's jobserver, but I suspect that there is simply no jobserver available on macOS for some reason (a warning message is displayed about it being unavailable), so our only option is to use NUM_JOBS manually. It seems to be working on the x86_64 macOS server anyway; I'll also check on the arm64 server.

NUM_JOBS — the parallelism specified as the top-level parallelism. This can be useful to pass a -j parameter to a system like make. Note that care should be taken when interpreting this environment variable. For historical purposes this is still provided but recent versions of Cargo, for example, do not need to run make -j, and instead can set the MAKEFLAGS env var to the content of CARGO_MAKEFLAGS to activate the use of Cargo’s GNU Make compatible jobserver for sub-make invocations.
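To make the "use NUM_JOBS manually" idea concrete, here is a small sketch of the decision a build script might make. It is an assumption about one reasonable policy, not the change that closed this issue: the function name `resolve_parallel_level` is hypothetical, and the rule that an explicit CMAKE_BUILD_PARALLEL_LEVEL takes precedence is a design choice made for this example.

```rust
/// Decide which value, if any, a build script should export as
/// CMAKE_BUILD_PARALLEL_LEVEL. Returns `None` when nothing should change.
/// (Hypothetical helper; the precedence rule is an assumption.)
fn resolve_parallel_level(
    user_level: Option<&str>,
    num_jobs: Option<&str>,
) -> Option<String> {
    // A non-empty value exported by the user always wins, so leave it alone.
    if user_level.map_or(false, |v| !v.trim().is_empty()) {
        return None;
    }
    // Otherwise forward NUM_JOBS, but only if it is a positive integer.
    num_jobs
        .and_then(|v| v.trim().parse::<usize>().ok())
        .filter(|&n| n > 0)
        .map(|n| n.to_string())
}
```

In a build script, the two arguments would come from `std::env::var("CMAKE_BUILD_PARALLEL_LEVEL")` and `std::env::var("NUM_JOBS")`, each converted with `.ok()` and `.as_deref()`. Keeping the decision in a pure function makes it easy to check on a machine where the jobserver warning does not appear.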


3 participants