Fix stdio permission error in userns container #4478
It looks like this test case does not reproduce the original issue (which is not being able to write to /dev/stderr).
What touch does is change the atime and mtime of /dev/stderr (using utimensat(2)), and this is what fails here as far as I can see.
Changed, thanks. If we revert the first commit, the fail msg looks as follows:
So it looks like it works with runc run but doesn't work with runc create + runc start. Why?
In the case of runc create, setupIO is called with detach=true, which (together with config.Terminal=false) results in runc init using whatever stdin/stdout/stderr runc create has. In the case of this test, stdout and stderr are pointing to log:
__runc create test_busybox >log 2>&1
So you end up with a runc init that has fd 1 and 2 open to something like /tmp/bats-run-NxJCXf/runc.mipFpg/bundle/log.
Next, runc start tells runc init to go ahead and run the binary, which can't write to log because it belongs to another user. I'm a bit puzzled by why the error message is "can't create /dev/stderr: Permission denied" (this is probably the shell trying to recreate the file), but what the code in this PR does is change the host file permissions to 666 (you can check that by adding ls -l log to the test case), which is not good.
Now, if you change echo errormsg > /dev/stderr to echo errormsg >&2 (which tells sh to redirect echo output to fd 2), it starts working again.
So, it looks like neither the test case nor the fix is correct.
Here is my detailed test:
root@iZj6cgggwb62cxurec74geZ:/opt/bb# # runc run can work
root@iZj6cgggwb62cxurec74geZ:/opt/bb# /root/go/src/github.com/opencontainers/runc/runc run test
errorlog
root@iZj6cgggwb62cxurec74geZ:/opt/bb# # runc run -d can't work
root@iZj6cgggwb62cxurec74geZ:/opt/bb# /root/go/src/github.com/opencontainers/runc/runc run -d test
root@iZj6cgggwb62cxurec74geZ:/opt/bb# sh: can't create /dev/stderr: Permission denied
root@iZj6cgggwb62cxurec74geZ:/opt/bb# /root/go/src/github.com/opencontainers/runc/runc delete test
root@iZj6cgggwb62cxurec74geZ:/opt/bb# # runc create & start can't work
root@iZj6cgggwb62cxurec74geZ:/opt/bb# /root/go/src/github.com/opencontainers/runc/runc create test
root@iZj6cgggwb62cxurec74geZ:/opt/bb# /root/go/src/github.com/opencontainers/runc/runc start test
root@iZj6cgggwb62cxurec74geZ:/opt/bb# sh: can't create /dev/stderr: Permission denied
root@iZj6cgggwb62cxurec74geZ:/opt/bb# /root/go/src/github.com/opencontainers/runc/runc delete test
root@iZj6cgggwb62cxurec74geZ:/opt/bb#
After using my patch, both cases work:
root@iZj6cgggwb62cxurec74geZ:/opt/bb# /root/go/src/github.com/opencontainers/runc/runc run -d test
root@iZj6cgggwb62cxurec74geZ:/opt/bb# errorlog
root@iZj6cgggwb62cxurec74geZ:/opt/bb#
root@iZj6cgggwb62cxurec74geZ:/opt/bb# /root/go/src/github.com/opencontainers/runc/runc delete test
root@iZj6cgggwb62cxurec74geZ:/opt/bb# /root/go/src/github.com/opencontainers/runc/runc create test
root@iZj6cgggwb62cxurec74geZ:/opt/bb# /root/go/src/github.com/opencontainers/runc/runc start test
root@iZj6cgggwb62cxurec74geZ:/opt/bb# errorlog
root@iZj6cgggwb62cxurec74geZ:/opt/bb# /root/go/src/github.com/opencontainers/runc/runc delete test
root@iZj6cgggwb62cxurec74geZ:/opt/bb#
config.json:
{
"ociVersion": "1.0.2-dev",
"process": {
"terminal": false,
"user": {
"uid": 0,
"gid": 0
},
"args": [
"sh", "-c", "echo errorlog > /dev/stderr"
],
"env": [
"PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin",
"TERM=xterm"
],
"cwd": "/",
"capabilities": {
"bounding": [
"CAP_AUDIT_WRITE",
"CAP_KILL",
"CAP_NET_BIND_SERVICE"
],
"effective": [
"CAP_AUDIT_WRITE",
"CAP_KILL",
"CAP_NET_BIND_SERVICE"
],
"permitted": [
"CAP_AUDIT_WRITE",
"CAP_KILL",
"CAP_NET_BIND_SERVICE"
],
"ambient": [
"CAP_AUDIT_WRITE",
"CAP_KILL",
"CAP_NET_BIND_SERVICE"
]
},
"rlimits": [
{
"type": "RLIMIT_NOFILE",
"hard": 1024,
"soft": 1024
}
],
"noNewPrivileges": true
},
"root": {
"path": "rootfs",
"readonly": true
},
"hostname": "runc",
"mounts": [
{
"destination": "/proc",
"type": "proc",
"source": "proc"
},
{
"destination": "/dev",
"type": "tmpfs",
"source": "tmpfs",
"options": [
"nosuid",
"strictatime",
"mode=755",
"size=65536k"
]
},
{
"destination": "/dev/pts",
"type": "devpts",
"source": "devpts",
"options": [
"nosuid",
"noexec",
"newinstance",
"ptmxmode=0666",
"mode=0620",
"gid=5"
]
},
{
"destination": "/dev/shm",
"type": "tmpfs",
"source": "shm",
"options": [
"nosuid",
"noexec",
"nodev",
"mode=1777",
"size=65536k"
]
},
{
"destination": "/dev/mqueue",
"type": "mqueue",
"source": "mqueue",
"options": [
"nosuid",
"noexec",
"nodev"
]
},
{
"destination": "/sys",
"type": "sysfs",
"source": "sysfs",
"options": [
"nosuid",
"noexec",
"nodev",
"ro"
]
},
{
"destination": "/sys/fs/cgroup",
"type": "cgroup",
"source": "cgroup",
"options": [
"nosuid",
"noexec",
"nodev",
"relatime",
"ro"
]
}
],
"linux": {
"resources": {
"devices": [
{
"allow": false,
"access": "rwm"
}
]
},
"namespaces": [
{
"type": "pid"
},
{
"type": "network"
},
{
"type": "ipc"
},
{
"type": "uts"
},
{
"type": "mount"
},
{
"type": "cgroup"
},
{
"type": "user"
}
],
"uidMappings": [{"hostID": 100000, "containerID": 0, "size": 65534}],
"gidMappings": [{"hostID": 100000, "containerID": 0, "size": 65534}],
"maskedPaths": [
"/proc/acpi",
"/proc/asound",
"/proc/kcore",
"/proc/keys",
"/proc/latency_stats",
"/proc/timer_list",
"/proc/timer_stats",
"/proc/sched_debug",
"/sys/firmware",
"/proc/scsi"
],
"readonlyPaths": [
"/proc/bus",
"/proc/fs",
"/proc/irq",
"/proc/sys",
"/proc/sysrq-trigger"
]
}
}
This is what my patch does.
But this was also my concern: we have silently changed the host file's permissions. Will that cause security issues?
But this change is not made in the container; it is made in the runc parent process.
I really don't like that we are changing host file permissions -- file passthrough stdio is already kind of dangerous (and should be avoided if possible), but this will cause a container using a file as stdio to silently change the permissions such that any user on the entire system can read and write to the file. This seems like something that is going to lead to security issues.
IMHO, starting a container with file passthrough to a file that the container process doesn't have access to is a configuration bug. Since we opened stdio as a privileged user before creating the container and dropping privileges, pid1 can already read/write to stdio (DAC permissions aren't checked during read/write). The issue is when a container process tries to re-open /proc/1/fd/... . IMHO it's not really our job to fix this -- they should configure the stdio file to have the permissions they like.
If we really need to do this, then I think we should change the owner instead. At least that way we won't allow any user to read the file, but this could cause issues with a monitor process being unable to read from the file (imagine there is a cron job that reads the output file -- once the file has its owner changed, the cron job can't read from it anymore). But at least the security story makes a little more sense than blindly doing chmod 0666.
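The trade-off between chmod 0666 and changing the owner can be made concrete with a toy permission check. This is a deliberately simplified sketch: the file type, the allowed function and all uid/gid values are hypothetical, and the check ignores root, capabilities, supplementary groups and ACLs, so it is not what the kernel or runc actually does.

```go
package main

import "fmt"

// file is a hypothetical, simplified view of a host file.
type file struct {
	UID, GID int
	Mode     uint32 // permission bits only, e.g. 0o600
}

const (
	read  uint32 = 4
	write uint32 = 2
)

// allowed applies the classic owner/group/other check for an unprivileged
// process. Root, capabilities, supplementary groups and ACLs are ignored.
func allowed(f file, uid, gid int, want uint32) bool {
	var bits uint32
	switch {
	case uid == f.UID:
		bits = f.Mode >> 6
	case gid == f.GID:
		bits = f.Mode >> 3
	default:
		bits = f.Mode
	}
	return (bits&7)&want == want
}

func main() {
	const monitor, container, stranger = 1001, 100000, 12345 // assumed host IDs

	original := file{UID: monitor, GID: monitor, Mode: 0o600}
	chmodded := file{UID: monitor, GID: monitor, Mode: 0o666}
	chowned := file{UID: container, GID: monitor, Mode: 0o600}

	fmt.Println("original: container can write:", allowed(original, container, container, write))
	fmt.Println("chmod 0666: container can write:", allowed(chmodded, container, container, write))
	fmt.Println("chmod 0666: stranger can write:", allowed(chmodded, stranger, stranger, write))
	fmt.Println("chown: container can write:", allowed(chowned, container, container, write))
	fmt.Println("chown: stranger can write:", allowed(chowned, stranger, stranger, write))
	fmt.Println("chown: monitor can read:", allowed(chowned, monitor, monitor, read))
}
```

In this model, chmod 0666 opens the file to every user, while changing the owner keeps strangers out but also locks out the original owner (the monitor process in the cron job example).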
It will be fixed in containerd, so it’s not very urgent to fix it in runc. Let’s remove the 1.2.1 milestone.
libcontainer/container_linux.go
// fixStdioPermissions fixes the permissions of STDIO to the specified user.
func fixStdioPermissions(uid int) error {
	var null unix.Stat_t
Can you please move it back to init_linux.go? This will make the patch shorter and the review simpler.
Since this function is needed in both the parent and the init process, I think it should be in container_linux.go.
And I have updated the PR, so it really should be moved to container_linux.go.
libcontainer/init_linux.go
- if err := fixStdioPermissions(execUser); err != nil {
+ if err := fixStdioPermissions(execUser.Uid); err != nil {
Do we still need this here if we have already done it in the parent?
If there are different cases (in some we should do this from the parent, in some others from the child), maybe we should make calls to fixStdioPermissions conditional.
> If there are different cases (in some we should do this from the parent, in some others from the child), maybe we should make calls to fixStdioPermissions conditional.
I assumed that root in the userns != the user in the userns, so I thought it was still needed in the init process.
But after adding a test case for it, I found that this is also a bug in userns containers.
So I updated this PR to move fixStdioPermissions from the init process to the parent process.
Edit:
So I updated this PR to change the chown target from the root user in the container to the real user in the container.
Thanks, your comment helped me find another bug.
Please see this test: @test "check stdio permission for non-root user in userns [terminal=false]"
Signed-off-by: lifubang <[email protected]>
This indeed fixes the issue with the nginx image too. Do the objections still apply to the latest version? It seems @kolyshkin's might not, since it no longer changes permissions for "others"? I can review in the next few days if no one objects to this.
Yes, as I mentioned in #4478 (comment), this should be fixed by users, and @fuweid has fixed it in containerd (containerd/containerd#10906). But for direct users of runc, we also need to fix this issue in runc. This PR fixes two issues; I'll split it once I have time, and I'll describe the issues in more detail.
I was aware of the containerd PR, but thanks anyways :) Sure, feel free to ping me for review when you open the PRs.
Fix #4475
If the user in the container is different from the current user, we need
to change the owner of stdio before we enter the user namespace, or
else we may not be able to access stdio in the container.
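
Because the ownership change happens before entering the user namespace, the target presumably has to be expressed as a host ID rather than a container ID. The sketch below shows only that translation, using the uidMappings values from the config.json earlier in this thread. The idMap type and hostID function are hypothetical names for illustration and are not runc's API.

```go
package main

import "fmt"

// idMap mirrors one entry of "uidMappings"/"gidMappings" in config.json.
// The type and field names are assumptions made for this sketch.
type idMap struct {
	ContainerID int
	HostID      int
	Size        int
}

// hostID translates an ID as seen inside the user namespace into the
// corresponding host ID. The second result is false if the ID is unmapped.
func hostID(maps []idMap, containerID int) (int, bool) {
	for _, m := range maps {
		if containerID >= m.ContainerID && containerID < m.ContainerID+m.Size {
			return m.HostID + (containerID - m.ContainerID), true
		}
	}
	return 0, false
}

func main() {
	// Same mapping as in the config.json above.
	maps := []idMap{{ContainerID: 0, HostID: 100000, Size: 65534}}

	for _, uid := range []int{0, 1000, 65534} {
		id, ok := hostID(maps, uid)
		fmt.Printf("container uid %d -> host uid %d (mapped: %v)\n", uid, id, ok)
	}
}
```

With this mapping, container uid 0 corresponds to host uid 100000 and container uid 1000 to host uid 101000, while uid 65534 falls just outside the mapped range.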