This repository has been archived by the owner on Oct 2, 2023. It is now read-only.

How to create deterministic layers? #2180

Open
njlr opened this issue Oct 31, 2022 · 5 comments

Comments

@njlr

njlr commented Oct 31, 2022

🐞 bug report

Affected Rule

The issue is caused by the rule:
  • container_run_and_commit_layer
  • container_image (maybe)

Is this a regression?

Unsure

Description

When a container_run_and_commit_layer target is built multiple times from a clean state, the hash of the resulting image archive changes from build to build.

However, container-diff reports no differences at the file level.
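
If the difference lies in tar metadata (timestamps, ownership, entry order) rather than in file contents, a file-level comparison may not show it. One way to check is to compare the tar entries directly. The following is a minimal Python sketch and not part of the original reproduction: describe_members and diff_tars are hypothetical helper names, and it assumes copies of the archives from both builds were saved to separate paths.

```python
import hashlib
import tarfile


def describe_members(tar_path):
    """Map each entry name to its metadata and, for regular files, a content hash."""
    members = {}
    with tarfile.open(tar_path, "r:*") as archive:
        for info in archive.getmembers():
            digest = None
            if info.isfile():
                # Reads the whole entry into memory; fine for a sketch.
                digest = hashlib.sha256(archive.extractfile(info).read()).hexdigest()
            members[info.name] = {
                "type": info.type,
                "mode": info.mode,
                "uid": info.uid,
                "gid": info.gid,
                "uname": info.uname,
                "gname": info.gname,
                "mtime": info.mtime,
                "size": info.size,
                "linkname": info.linkname,
                "sha256": digest,
            }
    return members


def diff_tars(left_path, right_path):
    """Return a list of human-readable differences between two tar archives."""
    left = describe_members(left_path)
    right = describe_members(right_path)
    differences = []
    for name in sorted(set(left) | set(right)):
        if name not in left:
            differences.append(f"{name}: only in {right_path}")
        elif name not in right:
            differences.append(f"{name}: only in {left_path}")
        else:
            for field, left_value in left[name].items():
                right_value = right[name][field]
                if left_value != right_value:
                    differences.append(
                        f"{name}: {field} {left_value!r} != {right_value!r}"
                    )
    # The same entries in a different order still change the archive's hash.
    if set(left) == set(right) and list(left) != list(right):
        differences.append("entry order differs")
    return differences
```

Calling diff_tars on the two saved copies of image_archive.tar lists which top-level entries differ and in which field. If an entry is itself a layer tar, the same function can be run on the extracted layer files to narrow the difference down further.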

🔬 Minimal Reproduction

https://github.com/njlr/bazel-run-commit

WORKSPACE

load("@bazel_tools//tools/build_defs/repo:http.bzl", "http_archive")

http_archive(
  name = "bazel_skylib",
  urls = [
    "https://mirror.bazel.build/github.com/bazelbuild/bazel-skylib/releases/download/1.2.1/bazel-skylib-1.2.1.tar.gz",
    "https://github.com/bazelbuild/bazel-skylib/releases/download/1.2.1/bazel-skylib-1.2.1.tar.gz",
  ],
  sha256 = "f7be3474d42aae265405a592bb7da8e171919d74c16f082a5457840f06054728",
)

load("@bazel_skylib//:workspace.bzl", "bazel_skylib_workspace")

bazel_skylib_workspace()

http_archive(
  name = "io_bazel_rules_docker",
  sha256 = "b1e80761a8a8243d03ebca8845e9cc1ba6c82ce7c5179ce2b295cd36f7e394bf",
  urls = ["https://github.com/bazelbuild/rules_docker/releases/download/v0.25.0/rules_docker-v0.25.0.tar.gz"],
)

load(
    "@io_bazel_rules_docker//repositories:repositories.bzl",
    container_repositories = "repositories",
)
container_repositories()

load("@io_bazel_rules_docker//repositories:deps.bzl", container_deps = "deps")

container_deps()

load(
  "@io_bazel_rules_docker//container:container.bzl",
  "container_pull",
)

container_pull(
  name = "dotnet_runtime_deps_6_0_10",
  registry = "mcr.microsoft.com",
  repository = "dotnet/runtime-deps",
  tag = "6.0.10-bullseye-slim-amd64",
  digest = "sha256:24554fadd483d8305974ded44bb1dbe4916e2f02500b9e2d78e7beb557cfebd0",
)

BUILD.bazel

load("@io_bazel_rules_docker//container:container.bzl", "container_image")
load("@io_bazel_rules_docker//docker/util:run.bzl", "container_run_and_commit_layer")
load("@bazel_skylib//rules:copy_file.bzl", "copy_file")

container_run_and_commit_layer(
  name = "install_git",
  image = "@dotnet_runtime_deps_6_0_10//image",
  commands = [
    " && ".join([
      "apt-get update -y",
      "apt-get install -y git=1:2.30.2-1",
      "apt-get clean",
      "rm -rf /var/lib/apt/lists/*",
      "rm -rf /var/cache/ldconfig/aux-cache",
      "rm -rf /var/log/alternatives.log",
      "rm -rf /var/log/apt/term.log",
      "rm -rf /var/log/apt/history.log",
      "rm -rf /var/log/dpkg.log",
      "rm -rf /var/log/*",
      "rm -rf /var/cache/debconf/templates.dat",
      "rm -rf /var/lib/dpkg/status-old",
      "rm -rf /var/lib/dpkg/status",
      "rm -rf /var/cache/debconf/config.dat",
      "rm -rf /etc/ld.so.cache",
      "rm -rf /var/lib/apt/extended_states",
      "rm -rf /var/log/apt/eipp.log.xz",
      "git --version",
    ]),
  ],
)

container_image(
  name = "image",
  base = "@dotnet_runtime_deps_6_0_10//image",
  layers = [
    ":install_git",
  ],
)

copy_file(
  name = "image_archive",
  src = ":image.tar",
  out = "image_archive.tar",
  is_executable = False,
  allow_symlink = False,
)

test.sh

#!/bin/bash

set -e
set -o pipefail

# First build from a clean state.
rm -rf ./bazel-*

bazel clean

bazel build //:image_archive

sha256sum bazel-bin/image_archive.tar

# Second build from a clean state; the digest should match the first one.
rm -rf ./bazel-*

bazel clean

bazel build //:image_archive

sha256sum bazel-bin/image_archive.tar

Output

INFO: Starting clean (this may take a while). Consider using --async if the clean takes more than several minutes.
INFO: Analyzed target //:image_archive (111 packages loaded, 7332 targets configured).
INFO: Found 1 target...
Target //:image_archive up-to-date:
  bazel-bin/image_archive.tar
INFO: Elapsed time: 42.970s, Critical Path: 42.10s
INFO: 73 processes: 24 internal, 49 linux-sandbox.
INFO: Build completed successfully, 73 total actions
d51cbfa26560fe671e13655b0baa94a3d8426b4cc3a8726c2e4a2e05585ebc6b  bazel-bin/image_archive.tar
INFO: Starting clean (this may take a while). Consider using --async if the clean takes more than several minutes.
INFO: Analyzed target //:image_archive (111 packages loaded, 7332 targets configured).
INFO: Found 1 target...
Target //:image_archive up-to-date:
  bazel-bin/image_archive.tar
INFO: Elapsed time: 60.050s, Critical Path: 59.20s
INFO: 73 processes: 24 internal, 49 linux-sandbox.
INFO: Build completed successfully, 73 total actions
3b80585ed7dcf7f27590e48bb48b89d59ce6a1660f6ced7f081711c5e64fd064  bazel-bin/image_archive.tar

🔥 Exception or Error

N/A

🌍 Your Environment

Operating System:

lsb_release -a
No LSB modules are available.
Distributor ID:	Ubuntu
Description:	Ubuntu 22.04 LTS
Release:	22.04
Codename:	jammy

Output of bazel version:

bazel --version
bazel 5.3.1

Rules_docker version:

http_archive(
  name = "io_bazel_rules_docker",
  sha256 = "b1e80761a8a8243d03ebca8845e9cc1ba6c82ce7c5179ce2b295cd36f7e394bf",
  urls = ["https://github.com/bazelbuild/rules_docker/releases/download/v0.25.0/rules_docker-v0.25.0.tar.gz"],
)

Anything else relevant?

Nope

@njlr
Author

njlr commented Oct 31, 2022

Curiously, passing the layer through tars instead of layers seems to work:

container_image(
  name = "image",
  base = "@dotnet_runtime_deps_6_0_10//image",
  layers = [
-    ":install_git",
  ],
+  tars = [
+    ":install_git",
+  ],
)

@njlr
Author

njlr commented Oct 31, 2022

Also strange: the hashes on GitHub CI and on my machine differ.

@alexeagle
Collaborator

You call tools in your container which aren't hermetic, like apt-get install - so that tool produces a different output. Bazel can only provide determinism if the tools it runs do.

@njlr
Author

njlr commented Mar 13, 2023

> You call tools in your container which aren't hermetic, like apt-get install - so that tool produces a different output. Bazel can only provide determinism if the tools it runs do.

The build already runs commands to clean up the noise from apt-get (although it is possible something was missed). The result appears to be deterministic when the layer is passed via tars but not via layers.
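
For context on what byte-for-byte determinism of a layer involves: deleting noisy files is not necessarily enough, because a tar archive also records timestamps, ownership, and entry order. The following minimal Python sketch rewrites a tar with those fields fixed. It is an illustration only, not how rules_docker builds layers and not a confirmed fix for this issue; normalize_tar is a hypothetical helper name.

```python
import tarfile


def normalize_tar(source_path, dest_path, fixed_mtime=0):
    """Rewrite a tar with sorted entries and fixed timestamps and ownership."""
    # GNU format is used so that no pax extended headers are written.
    with tarfile.open(source_path, "r:*") as source, tarfile.open(
        dest_path, "w", format=tarfile.GNU_FORMAT
    ) as dest:
        # Sort by name, but keep hard links after regular entries so that
        # each link's target appears earlier in the archive.
        ordered = sorted(source.getmembers(), key=lambda m: (m.islnk(), m.name))
        for info in ordered:
            info.mtime = fixed_mtime
            info.uid = 0
            info.gid = 0
            info.uname = ""
            info.gname = ""
            # Only regular files carry data; other entry types are header-only.
            fileobj = source.extractfile(info) if info.isfile() else None
            dest.addfile(info, fileobj)
```

Two archives that hold the same files with the same modes should produce identical bytes after normalization, even if they were created at different times or list their entries in a different order. File contents that differ between runs (for example, logs or caches written during apt-get install) would still need to be removed or made reproducible separately.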

@codersasha

This fix (using tars instead of layers) also seems to improve remote cacheability, and may help solve #2195.
