[7.4.1] Files created on disk from remote cache or build are marked dirty on first rebuild #24763

Open
Bradshawz opened this issue Dec 19, 2024 · 1 comment
Labels: team-Performance (Issues for Performance teams), team-Remote-Exec (Issues and PRs for the Execution (Remote) team), type: bug, untriaged

Comments

@Bradshawz

Description of the bug:

We currently use BwoB toplevel with remote build/remote cache, but do our linking locally using build --strategy=CppLink=local. As a result, a large number of files end up in our output directory. On the first rebuild after any build, Bazel marks all the downloaded files as changed outputs. This does not happen on the second rebuild. It can make the first rebuild take a very long time.

I tried to track down where the issue comes from, and it seems to be this check:

if (!trustRemoteValue && fileMetadata.couldBeModifiedSince(lastKnownData)) {

On the first rebuild, fileMetadata.couldBeModifiedSince(lastKnownData) does not seem to successfully compare anything. It appears to compare the two FileArtifactValue digests and, if that fails, to fall back to comparing their FileContentsProxy. On the first rebuild, fileMetadata has a proxy but no digest, while lastKnownData has a digest but no proxy, so neither comparison has matching data on both sides and the file is treated as modified. On the second rebuild, fileMetadata still does not have a digest, but lastKnownData now has a proxy, so the proxy comparison works and the file isn't marked as modified.
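
To make the suspected mismatch concrete, here is a minimal Java sketch of the comparison as described above. It is not Bazel's actual implementation: MetadataSketch, its fields, and the string used as a "proxy" are hypothetical stand-ins, and the real FileArtifactValue may check more than this (for example file type or size).

```java
import java.util.Arrays;
import java.util.Objects;

/** Hypothetical, simplified stand-in for the metadata discussed in this issue; not Bazel's class. */
final class MetadataSketch {
  final byte[] digest; // null when no digest was recorded
  final String proxy;  // null when no contents proxy was recorded (assumed representation)

  MetadataSketch(byte[] digest, String proxy) {
    this.digest = digest;
    this.proxy = proxy;
  }

  /** Compares digests when both sides have one, otherwise falls back to the proxies. */
  boolean couldBeModifiedSince(MetadataSketch lastKnown) {
    if (digest != null && lastKnown.digest != null) {
      return !Arrays.equals(digest, lastKnown.digest);
    }
    // With a proxy on only one side, this is always "modified".
    return !Objects.equals(proxy, lastKnown.proxy);
  }

  public static void main(String[] args) {
    MetadataSketch onDisk = new MetadataSketch(null, "proxy-A");
    // First rebuild: last known data has a digest but no proxy.
    MetadataSketch firstRebuild = new MetadataSketch(new byte[] {1, 2, 3}, null);
    // Second rebuild: last known data now carries a proxy.
    MetadataSketch secondRebuild = new MetadataSketch(null, "proxy-A");
    System.out.println(onDisk.couldBeModifiedSince(firstRebuild));  // prints true
    System.out.println(onDisk.couldBeModifiedSince(secondRebuild)); // prints false
  }
}
```

In this model the first rebuild reports a modification even though the file content never changed, which matches the behavior seen in the logs.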

Which category does this issue belong to?

Performance, Remote Execution

What's the simplest, easiest way to reproduce this bug? Please provide a minimal example if possible.

Using a build whose outputs are either built remotely or fetched from a remote cache and then downloaded locally:
1. bazel clean
2. bazel build //src:someTarget
3. bazel build //src:someTarget - in the java log, there should be something like: 241219 17:47:12.558:I 42 [com.google.devtools.build.lib.skyframe.SequencedSkyframeExecutor.detectModifiedOutputFiles] Found 1 modified files from last build
4. bazel build //src:someTarget - the java log now shows: 241219 18:08:46.021:I 42 [com.google.devtools.build.lib.skyframe.SequencedSkyframeExecutor.detectModifiedOutputFiles] Found 0 modified files from last build
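
When comparing rebuilds, it may help to pull the count out of the java log automatically. The following is a small sketch that assumes the message format is exactly as in the two log lines quoted above; ModifiedFilesLogSketch and modifiedCount are hypothetical names, not part of Bazel.

```java
import java.util.OptionalInt;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper for reading the "modified files" count from a java log line. */
final class ModifiedFilesLogSketch {
  // Assumed format, taken from the log lines quoted in this issue.
  private static final Pattern FOUND =
      Pattern.compile("detectModifiedOutputFiles\\] Found (\\d+) modified files from last build");

  /** Returns the reported count, or empty if the line is not a matching log entry. */
  static OptionalInt modifiedCount(String logLine) {
    Matcher m = FOUND.matcher(logLine);
    return m.find() ? OptionalInt.of(Integer.parseInt(m.group(1))) : OptionalInt.empty();
  }

  public static void main(String[] args) {
    String line =
        "241219 17:47:12.558:I 42 [com.google.devtools.build.lib.skyframe."
            + "SequencedSkyframeExecutor.detectModifiedOutputFiles] Found 1 modified files from last build";
    System.out.println(modifiedCount(line).getAsInt()); // prints 1
  }
}
```

A non-zero count on the rebuild immediately after the initial build, followed by zero on the next one, is the pattern reported here.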

Which operating system are you running Bazel on?

Ubuntu 22.04

What is the output of bazel info release?

release 7.4.1

If bazel info release returns development version or (@non-git), tell us how you built Bazel.

No response

What's the output of git remote get-url origin; git rev-parse HEAD ?

No response

If this is a regression, please try to identify the Bazel commit where the bug was introduced with bazelisk --bisect.

No response

Have you found anything relevant by searching the web?

I found #22367, but it seems to be the opposite issue, where outputs that weren't downloaded were still marked as dirty.

Any other information, logs, or outputs that you want to share?

No response

github-actions bot added the team-Performance and team-Remote-Exec labels on Dec 19, 2024
@fmeum
Collaborator

fmeum commented Dec 19, 2024

@coeuvre


5 participants