ci: fix a race in TestExecIn and TestExecInTTY #4445

lifubang · 2024-10-14T09:36:31Z

Fixes: #4437
We can use a channel to wait for output from the init process. Once we have received
the content, we know that the init process has started. Then we can exec into the container
and use the ps command to check that both the init process and the exec process exist.

Signed-off-by: lifubang [email protected]

kolyshkin · 2024-10-14T18:49:44Z

We can use a chan to wait the output from init process, after we received the content, it means that the init process has started. Then we can exec into this container to use ps command.

Please add this into the commit message.

kolyshkin

Maybe we should just not expect cat in ps output.

kolyshkin · 2024-10-14T18:58:25Z

Maybe we should just not expect cat in ps output.

Or do a retry.

lifubang · 2024-10-15T02:02:17Z

Maybe we should just not expect cat in ps output.

Since these are integration tests, maybe we should keep this check to make sure we didn't exec into an unrelated container?

rata · 2024-10-21T15:18:22Z

libcontainer/integration/execin_test.go

+		_ = stdoutR.Close()
+		_ = stdoutW.Close()


Can't we just defer the close directly, instead of wrapping it in a function? We are ignoring the error anyway.

I mean it for all the cases, not just this.

rata · 2024-10-21T15:19:14Z

libcontainer/integration/execin_test.go

-	_ = stdinR.Close()
-	defer stdinW.Close() //nolint: errcheck
+	defer func() {
+		_, _ = stdinW.Write([]byte("hello"))


Why do we write this? To stop the goroutine? Can you add a comment, or make the written message self-explanatory instead of this hello?

If it's that, is it needed? Won't the close of the pipe already free the goroutine?

kolyshkin

I think the proper fix belongs in func (c *Container) exec(), which signals runc init to go ahead and exec the real init process (cat in our case). I see there is some complicated logic in there, but it looks like it's just there to handle the error case.

What it does in the normal (non-error) case is open the exec fifo, read from it, and remove the exec fifo file.

The other side (runc init, see the tail of func (l *linuxStandardInit) Init()):

1. writes 0 to the exec fifo (after which the parent returns from container.Start and thus container is expected to be running);
2. runs the StartContainer hook;
3. calls utils.UnsafeCloseFrom;
4. calls system.Exec(name, l.config.Args, os.Environ()).

The test execs the second process after (1) has happened, and if it manages to execute before (4) then we have this issue.

I guess that #4325 makes the race window smaller because os.Environ takes some time (it actually copies the environment), but we still haven't merged that.

I wish there were a cheap way to find out whether runc init has completed. Maybe read /proc/$INIT_PID/exe to see that it's no longer /proc/self/exe? Or something similar, ideally polling on something.

And, this can be added to func (c *Container) exec() so when it returns we're sure the container is running.

lifubang added area/ci easy-to-review labels Oct 14, 2024

lifubang force-pushed the fix-ci-race-execin branch from a6c839a to ee42499 Compare October 14, 2024 09:39

lifubang requested review from rata and kolyshkin October 14, 2024 09:54

kolyshkin reviewed Oct 14, 2024

lifubang force-pushed the fix-ci-race-execin branch from ee42499 to f08116b Compare October 15, 2024 09:47

rata reviewed Oct 21, 2024

rata mentioned this pull request Oct 23, 2024

libct/int: add exec benchmark #4432

Merged

kolyshkin reviewed Dec 20, 2024
