As we know, we can interact with cgroups in two ways: cgroupfs and systemd. The former is achieved by reading and writing cgroup tmpfs files under /sys/fs/cgroup, while the latter is done by asking systemd to configure a transient unit. Kata agent uses cgroupfs by default, unless you pass the parameter --systemd-cgroup.
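To make the cgroupfs path concrete, here is a minimal sketch of reading and writing a control file. The helper names `write_control` and `read_control` are hypothetical and are not part of the kata agent; the cgroup root is passed in as a parameter (it would normally be /sys/fs/cgroup), and `group` is assumed to be a relative path.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Hypothetical helper: write `value` to the control file `file` of the
/// cgroup `group` (a relative path) below `root`, e.g. /sys/fs/cgroup.
pub fn write_control(root: &Path, group: &str, file: &str, value: &str) -> io::Result<()> {
    fs::write(root.join(group).join(file), value)
}

/// Hypothetical helper: read a control file and strip trailing whitespace,
/// since the kernel terminates most values with a newline.
pub fn read_control(root: &Path, group: &str, file: &str) -> io::Result<String> {
    let raw = fs::read_to_string(root.join(group).join(file))?;
    Ok(raw.trim_end().to_string())
}
```

With the systemd approach none of these files are touched directly; the same limits are expressed as unit properties and systemd writes the files itself.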
For systemd, kata agent configures cgroups according to the linux.cgroupsPath format defined by runc ([slice]:[prefix]:[name]). If you don't provide a valid linux.cgroupsPath, kata agent will treat it as "system.slice:kata_agent:<container-id>".
Here slice is a systemd slice under which the container is placed. If empty, it defaults to system.slice, except when cgroup v2 is used and rootless container is created, in which case it defaults to user.slice.
Note that slice can contain dashes to denote a sub-slice (e.g. user-1000.slice is a correct notation, meaning a subslice of user.slice), but it must not contain slashes (e.g. user.slice/user-1000.slice is invalid).
A slice of - represents a root slice.
Next, prefix and name are used to compose the unit name, which is <prefix>-<name>.scope, unless name has a .slice suffix, in which case prefix is ignored and the name is used as is.
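The rules above can be sketched as a small parser. This is an illustration only, not the agent's actual code: `SystemdCgroupPath` and `parse_cgroups_path` are hypothetical names, the validity check (three colon-separated fields, no slash in the slice, non-empty name) is an assumption, and the rootless user.slice default is deliberately left out.

```rust
/// Hypothetical parsed form of a systemd-style linux.cgroupsPath.
#[derive(Debug, PartialEq)]
pub struct SystemdCgroupPath {
    /// Slice the unit is placed under; "-" denotes the root slice.
    pub slice: String,
    /// Resulting unit name, e.g. "kata_agent-abc.scope".
    pub unit: String,
}

/// Split "[slice]:[prefix]:[name]" into a slice and a unit name.
/// Assumption: anything that is not three fields, has a slash in the slice,
/// or has an empty name is invalid and is replaced by
/// "system.slice:kata_agent:<container-id>".
pub fn parse_cgroups_path(path: &str, container_id: &str) -> SystemdCgroupPath {
    let fields: Vec<&str> = path.split(':').collect();
    let (slice, prefix, name) = match fields.as_slice() {
        [s, p, n] if !s.contains('/') && !n.is_empty() => (*s, *p, *n),
        _ => ("system.slice", "kata_agent", container_id),
    };

    // An empty slice defaults to system.slice (rootless case not handled here).
    let slice = if slice.is_empty() { "system.slice" } else { slice };

    // A name ending in ".slice" is used as is; otherwise compose a scope name.
    let unit = if name.ends_with(".slice") {
        name.to_string()
    } else {
        format!("{}-{}.scope", prefix, name)
    };

    SystemdCgroupPath { slice: slice.to_string(), unit }
}
```

Under these assumptions, `parse_cgroups_path(":docker:foo", "c1")` yields the slice system.slice and the unit docker-foo.scope, while an empty path falls back to system.slice and kata_agent-c1.scope.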
The kata agent translates the parameters in linux.resources of config.json into systemd unit properties and sends them to systemd for configuration. Since systemd supports only a limited set of properties, only the following parameters in linux.resources will be applied. Hybrid mode is simply treated as legacy mode.
- CPU
  - v1

    | runtime spec resource | systemd property name |
    | --- | --- |
    | cpu.shares | CPUShares |

  - v2

    | runtime spec resource | systemd property name |
    | --- | --- |
    | cpu.shares | CPUShares |
    | cpu.period | CPUQuotaPeriodUSec (v242) |
    | cpu.period & cpu.quota | CPUQuotaPerSecUSec |

- MEMORY
  - v1

    | runtime spec resource | systemd property name |
    | --- | --- |
    | memory.limit | MemoryLimit |

  - v2

    | runtime spec resource | systemd property name |
    | --- | --- |
    | memory.low | MemoryLow |
    | memory.max | MemoryMax |
    | memory.swap & memory.limit | MemorySwapMax |

- PIDS

  | runtime spec resource | systemd property name |
  | --- | --- |
  | pids.limit | TasksMax |

- CPUSET

  | runtime spec resource | systemd property name |
  | --- | --- |
  | cpuset.cpus | AllowedCPUs (v244) |
  | cpuset.mems | AllowedMemoryNodes (v244) |

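The mapping listed above can be written down as a lookup. The sketch below is hypothetical and is not the agent's implementation: `Hierarchy`, `systemd_property` and `is_supported` are illustrative names, the resource keys are the strings from the list, and the (v242)/(v244) annotations are read here as the minimum systemd version that accepts the property, which is an assumption.

```rust
/// cgroup hierarchy the agent runs on; hybrid is handled like legacy.
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum Hierarchy {
    Legacy,
    Hybrid,
    Unified,
}

/// Look up the systemd property for a runtime spec resource.
/// Returns the property name plus the minimum systemd version, if one is listed.
pub fn systemd_property(
    resource: &str,
    hierarchy: Hierarchy,
) -> Option<(&'static str, Option<u32>)> {
    // Hybrid mode is treated as legacy mode, so only Unified counts as v2.
    let v2 = hierarchy == Hierarchy::Unified;
    match (resource, v2) {
        ("cpu.shares", _) => Some(("CPUShares", None)),
        ("cpu.period", true) => Some(("CPUQuotaPeriodUSec", Some(242))),
        ("cpu.period & cpu.quota", true) => Some(("CPUQuotaPerSecUSec", None)),
        ("memory.limit", false) => Some(("MemoryLimit", None)),
        ("memory.low", true) => Some(("MemoryLow", None)),
        ("memory.max", true) => Some(("MemoryMax", None)),
        ("memory.swap & memory.limit", true) => Some(("MemorySwapMax", None)),
        ("pids.limit", _) => Some(("TasksMax", None)),
        ("cpuset.cpus", _) => Some(("AllowedCPUs", Some(244))),
        ("cpuset.mems", _) => Some(("AllowedMemoryNodes", Some(244))),
        // Everything else is not applied.
        _ => None,
    }
}

/// True if the resource is in the list and the running systemd is new enough.
pub fn is_supported(resource: &str, hierarchy: Hierarchy, systemd_version: u32) -> bool {
    match systemd_property(resource, hierarchy) {
        Some((_, Some(min))) => systemd_version >= min,
        Some((_, None)) => true,
        None => false,
    }
}
```

For example, `is_supported("cpuset.cpus", Hierarchy::Unified, 243)` is false in this sketch, because AllowedCPUs is annotated with v244.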
session.rs and system.rs in src/agent/rustjail/src/cgroups/systemd/interface are automatically generated by zbus-xmlgen, an accompanying tool provided by zbus to generate Rust code from D-Bus XML interface descriptions. The specific commands to generate these two files are as follows:
# system.rs
zbus-xmlgen --system org.freedesktop.systemd1 /org/freedesktop/systemd1
# session.rs
zbus-xmlgen --session org.freedesktop.systemd1 /org/freedesktop/systemd1
The current implementation of cgroups/systemd uses system.rs, while session.rs could be used to build rootless containers in the future.