Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

mkcomposefs: add --sandbox flag #358

Closed
wants to merge 1 commit into from

Conversation

giuseppe
Copy link
Member

This commit adds a new --sandbox flag to mkcomposefs, enabling an attempt to isolate the process within a restricted environment. The sandbox limits the process's access to host resources, reducing potential attack surfaces. It is a best-effort attempt and does not guarantee full isolation.

@giuseppe giuseppe force-pushed the sandbox-mkcomposefs branch 3 times, most recently from 97a23bd to 43db64e Compare September 23, 2024 20:32
@@ -1,10 +1,12 @@
libcomposefs_dep = declare_dependency(link_with : libcomposefs, include_directories : config_inc)

seccomp_dep = dependency('libseccomp', required : true)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks like the libcap will also be required, shouldn't it also be a dependency on meson?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes thanks for pointing it out. Fixed now

This commit adds a new --sandbox flag to mkcomposefs, enabling an
attempt to isolate the process within a restricted environment.  The
sandbox limits the process's access to host resources, reducing
potential attack surfaces.  It is a best-effort attempt and does not
guarantee full isolation.

Signed-off-by: Giuseppe Scrivano <[email protected]>
@cgwalters
Copy link
Contributor

In the past my concern here is that basically everything invoking composefs is already (or should be) in a position to sandbox it. Especially containers/storage - it's already linked into an entire container runtime infrastructure. Having another low level implementation of setting up seccomp rules and using pivot_root etc. doesn't seem to make a lot of sense to me versus the alternatives:

  • Teach c/storage to invoke mkcomposefs in a new user namespace and with the default seccomp policy - in fact basically the default policy as podman run from the root should be totally fine right?
  • In this project provide recommended examples of using e.g. bubblewrap and systemd-run style unit instructions UnshareUser= etc.

@giuseppe
Copy link
Member Author

neither Podman nor c/storage build the seccomp profile (buildah does for --isolation=chroot), they just pass down the information to the OCI runtime.
I am fine if we want to add a dependency on bwrap, (if it is worth though, since the new code is less than 200 lines), but I don't think we should go through systemd. Podman can run in environments where systemd is not present and systemd-run+dbus has a cost, on my system it is ~8ms:

$ hyperfine 'systemd-run --scope --user true'
Benchmark 1: systemd-run --scope --user true
  Time (mean ± σ):       7.9 ms ±   3.8 ms    [User: 1.5 ms, System: 2.2 ms]
  Range (min … max):     5.6 ms …  27.9 ms    148 runs

$ sudo hyperfine 'systemd-run --scope true'
Benchmark 1: systemd-run --scope true
  Time (mean ± σ):       8.2 ms ±   4.2 ms    [User: 1.5 ms, System: 2.1 ms]
  Range (min … max):     5.7 ms …  31.5 ms    93 runs

@cgwalters
Copy link
Contributor

I'm suggesting we don't add a hard dependency on any containerization (bwrap/systemd-run/crun/etc) here, but just provide recommendations for how to do it externally.

Us trying to do it internally creates a "fuzzy" boundary for what external callers should be doing and what we should be doing. I had a really similar debate with the virtiofsd maintainer (which has a ton of internal sandboxing) but that all started to break down when we wanted to run it inside an unprivileged Kubernetes pod because it wanted to do things like create a userns itself which wasn't allowed, etc.

Of course in theory creating a userns is safe...in practice, see threads like https://lwn.net/Articles/978846/

neither Podman nor c/storage build the seccomp profile (buildah does for --isolation=chroot), they just pass down the information to the OCI runtime.

Right but surely we could teach c/storage how to basically run an "internal container" with the rootfs of / (i.e. we use crun for the isolation)? Or perhaps crun would do things we don't want here like having it show up in podman ps? I haven't dug in at that level.


In the end my opinion here is that we should start at least changing c/storage to create a temporary unprivileged userns for helpers we invoke like gunzip and mkcomposefs and that gives us a strong baseline. The seccomp stuff seems much more optional.

@giuseppe giuseppe closed this Sep 26, 2024
cgwalters added a commit to cgwalters/composefs that referenced this pull request Sep 26, 2024
Alternative to containers#358
which would have put some internal sandboxing.

Signed-off-by: Colin Walters <[email protected]>
cgwalters added a commit to cgwalters/composefs that referenced this pull request Sep 26, 2024
Alternative to containers#358
which would have put some internal sandboxing.

Signed-off-by: Colin Walters <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants