crun should never checkpoint the netns #1210

Open
Luap99 opened this issue May 9, 2023 · 5 comments

@Luap99
Member

Luap99 commented May 9, 2023

This is basically the same as 2a0947e, but that commit made an exception: the netns is still checkpointed when the netns path is empty in the runtime spec. So the fix only works when podman creates the netns in advance, which is not always the case, e.g. when a custom userns is used (which also doesn't work in crun right now, but that is a separate issue, #1207).

The problem now is that I want to consolidate the network setup code in podman (containers/podman#18468) so that it uses a single setup path instead of two, one with and one without a userns. Going forward, I always want to let the runtime create the netns (empty netns path in the config) and then configure the netns in podman after the create call. This works just fine except for the checkpoint/restore case. On restore, criu tries to restore the netns, which fails:

(00.048662)      1: Try to restore a link 10:2:eth0
(00.048676)      1: Restoring link eth0 type 2
(00.048691)      1: Restoring netdev eth0 idx 2
(00.048705)      1: Restore ll addr (8e:../6) for device
(00.048712)      1: Error (criu/net.c:1462): Unknown peer net namespace
(00.063166)      1: Error (criu/libnetlink.c:54): -16 reported by netlink: Device or resource busy
(00.063205)      1: Error (criu/net.c:1816): Can't restore link: -16
(00.063270)      1: Error (criu/util.c:1411): Can't wait or bad status: errno=0, status=65280
(00.064717) Error (criu/cr-restore.c:2536): Restoring FAILED.

The same commands work just fine with runc because it always ignores the netns: opencontainers/runc@8187fb7

So my ask is to always ignore the netns, to match runc's behavior and allow podman to work correctly.
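
To make the difference between the current and the requested behavior concrete, here is a minimal sketch. It is not crun's code: the function names are hypothetical, and the "current" rule is only my reading of the description above (netns checkpointed exactly when the netns path is empty). The only thing taken from the OCI runtime spec is that `linux.namespaces` is a list of entries with a `type` and an optional `path`.

```python
def netns_path(spec):
    """Return the path of the network namespace entry in an OCI runtime spec.

    Returns "" when the entry exists without a path (the runtime creates the
    netns) and None when there is no network namespace entry at all.
    """
    for ns in spec.get("linux", {}).get("namespaces", []):
        if ns.get("type") == "network":
            return ns.get("path", "")
    return None


def checkpoints_netns_current(spec):
    # Assumption: behavior as described in this issue, i.e. the netns is
    # ignored unless the netns path in the spec is empty.
    return netns_path(spec) == ""


def checkpoints_netns_requested(spec):
    # Requested behavior: never checkpoint the netns, regardless of the path.
    return False


if __name__ == "__main__":
    # Hypothetical specs: one where the runtime creates the netns,
    # one where the caller passes a pre-created netns path.
    runtime_created = {"linux": {"namespaces": [{"type": "network"}]}}
    pre_created = {
        "linux": {"namespaces": [{"type": "network", "path": "/run/netns/example"}]}
    }
    for name, spec in (("runtime-created", runtime_created), ("pre-created", pre_created)):
        print(name, checkpoints_netns_current(spec), checkpoints_netns_requested(spec))
```

The two functions only differ for the runtime-created case, which is exactly the case podman would always hit after the consolidation.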

@adrianreber
Contributor

I had a look at it and can provide a fix. It will be different from the one in runc, but similar.

@adrianreber
Contributor

This first needs changes in libcriu. The needed interface has not been exported yet. I will open a PR in CRIU first.

@adrianreber
Contributor

See checkpoint-restore/criu#2175 for the CRIU changes.

@Luap99
Member Author

Luap99 commented Jun 21, 2023

@adrianreber Is there a way we can move forward here, or do we need to wait for a new criu release?

@adrianreber
Contributor

I see that the CI runs are based on Ubuntu. I think we can update CRIU in Fedora and in the Ubuntu PPA to include the necessary patch in the current release, without waiting for a new release.

@rst0git Can you update the CRIU PPA to 3.18 with the patch from checkpoint-restore/criu#2175 (maybe also the Sapphire Rapids patch)?
