Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cgroup: Use kernel command line to disable memory cgroup #6439

Merged

Conversation

mairacanal
Copy link
Contributor

Commit 94a23e9 ("cgroup: Disable cgroup "memory" by default") disabled the memory cgroup by default when initing the cgroups. However, it's possible to disable the memory cgroup by a kernel command line. Hard-coding such a feature can be problematic as some memory management features depend on the order that things are set.

For example, it is possible to see a NULL pointer dereference caused by commit 94a23e9. The NULL pointer dereference is triggered by the memory shrinker and ends up in a kernel crash.

[ 50.028629] ==================================================================
[ 50.028645] BUG: KASAN: null-ptr-deref in do_shrink_slab+0x1fc/0x978
[ 50.028663] Write of size 8 at addr 0000000000000000 by task gfxrecon-replay/1965

[ 50.028676] CPU: 3 UID: 1000 PID: 1965 Comm: gfxrecon-replay Tainted: G C 6.12.0-rc4-v8-thp-kasan+ #85
[ 50.028685] Tainted: [C]=CRAP
[ 50.028689] Hardware name: Raspberry Pi 5 Model B Rev 1.0 (DT)
[ 50.028694] Call trace:
[ 50.028697] dump_backtrace+0xfc/0x120
[ 50.028706] show_stack+0x24/0x38
[ 50.028711] dump_stack_lvl+0x40/0x88
[ 50.028720] print_report+0xe4/0x708
[ 50.028728] kasan_report+0xcc/0x130
[ 50.028733] kasan_check_range+0x254/0x298
[ 50.028738] __kasan_check_write+0x20/0x30
[ 50.028745] do_shrink_slab+0x1fc/0x978
[ 50.028751] shrink_slab+0x318/0xc38
[ 50.028756] shrink_one+0x254/0x6d8
[ 50.028762] shrink_node+0x26b4/0x2848
[ 50.028767] do_try_to_free_pages+0x3e4/0x1190
[ 50.028773] try_to_free_pages+0x5a4/0xb40
[ 50.028778] __alloc_pages_direct_reclaim+0x144/0x298
[ 50.028787] __alloc_pages_slowpath+0x5c4/0xc70
[ 50.028793] __alloc_pages_noprof+0x4a8/0x6a8
[ 50.028800] __folio_alloc_noprof+0x24/0xa8
[ 50.028806] shmem_alloc_and_add_folio+0x2ec/0xce0
[ 50.028812] shmem_get_folio_gfp+0x380/0xc20
[ 50.028818] shmem_read_folio_gfp+0xe0/0x160
[ 50.028824] drm_gem_get_pages+0x238/0x620 [drm]
[ 50.029039] drm_gem_shmem_get_pages_sgt+0xd8/0x4b8 [drm_shmem_helper]
[ 50.029053] v3d_bo_create_finish+0x58/0x1e0 [v3d]
[ 50.029083] v3d_create_bo_ioctl+0xac/0x210 [v3d]
[ 50.029105] drm_ioctl_kernel+0x1d8/0x2b8 [drm]
[ 50.029220] drm_ioctl+0x4b4/0x920 [drm]
[ 50.029330] __arm64_sys_ioctl+0x11c/0x160
[ 50.029337] invoke_syscall+0x88/0x268
[ 50.029345] el0_svc_common+0x160/0x1d8
[ 50.029351] do_el0_svc+0x50/0x68
[ 50.029358] el0_svc+0x34/0x80
[ 50.029364] el0t_64_sync_handler+0x84/0x100
[ 50.029371] el0t_64_sync+0x190/0x198
[ 50.029376] ==================================================================

This happens because the memory shrinker is unaware that we are artificially disabling the memory cgroups and therefore it doesn't allocate nr_deferred (as it would if we used the kernel command line).

To avoid such an issue, revert the artificial disablement and disable it through the command line. If a user wants to enable the feature, it can use the cgroup_enable= command line.

@pelwell
Copy link
Contributor

pelwell commented Oct 25, 2024

Hi Maira,

Thanks for this. Your patches missed out a few platforms because of the non-trivial include hierarchy. You may find it easier to build on top of #6442, which is effectively a cosmetic change that reduces the number of files to change to 6:

arch/arm/boot/dts/broadcom/bcm2708-rpi-bt.dtsi:27:              bootargs = "coherent_pool=1M 8250.nr_uarts=1 snd_bcm2835.enable_headphones=0";
arch/arm/boot/dts/broadcom/bcm270x.dtsi:7:              bootargs = "coherent_pool=1M snd_bcm2835.enable_headphones=0";
arch/arm/boot/dts/broadcom/bcm2711-rpi-cm4s.dts:151:            bootargs = "coherent_pool=1M snd_bcm2835.enable_headphones=0 numa_policy=interleave";
arch/arm/boot/dts/broadcom/bcm2711-rpi-ds.dtsi:6:               bootargs = "coherent_pool=1M 8250.nr_uarts=1 snd_bcm2835.enable_headphones=0 numa_policy=interleave";
arch/arm/boot/dts/broadcom/bcm271x-rpi-bt.dtsi:27:              bootargs = "coherent_pool=1M 8250.nr_uarts=1 snd_bcm2835.enable_headphones=0";
arch/arm64/boot/dts/broadcom/bcm2712-rpi.dtsi:102:              bootargs = "reboot=w coherent_pool=1M 8250.nr_uarts=1 pci=pcie_bus_safe numa_policy=interleave iommu_dm

@mairacanal
Copy link
Contributor Author

Thanks @pelwell! I'll wait #6442 to land and then I'll rebase this PR modifying the DTS needed.

@pelwell
Copy link
Contributor

pelwell commented Oct 25, 2024

#6442 is merged, so over to you.

Commit 94a23e9 ("cgroup: Disable cgroup "memory" by default")
disabled the memory cgroup by default when initing the cgroups. However,
it's possible to disable the memory cgroup by a kernel command line.
Hard-coding such a feature can be problematic as some memory management
features depend on the order that things are set.

For example, it is possible to see a NULL pointer dereference caused by
commit 94a23e9. The NULL pointer
dereference is triggered by the memory shrinker and ends up in a kernel
crash.

[   50.028629] ==================================================================
[   50.028645] BUG: KASAN: null-ptr-deref in do_shrink_slab+0x1fc/0x978
[   50.028663] Write of size 8 at addr 0000000000000000 by task gfxrecon-replay/1965

[   50.028676] CPU: 3 UID: 1000 PID: 1965 Comm: gfxrecon-replay Tainted: G         C         6.12.0-rc4-v8-thp-kasan+ raspberrypi#85
[   50.028685] Tainted: [C]=CRAP
[   50.028689] Hardware name: Raspberry Pi 5 Model B Rev 1.0 (DT)
[   50.028694] Call trace:
[   50.028697]  dump_backtrace+0xfc/0x120
[   50.028706]  show_stack+0x24/0x38
[   50.028711]  dump_stack_lvl+0x40/0x88
[   50.028720]  print_report+0xe4/0x708
[   50.028728]  kasan_report+0xcc/0x130
[   50.028733]  kasan_check_range+0x254/0x298
[   50.028738]  __kasan_check_write+0x20/0x30
[   50.028745]  do_shrink_slab+0x1fc/0x978
[   50.028751]  shrink_slab+0x318/0xc38
[   50.028756]  shrink_one+0x254/0x6d8
[   50.028762]  shrink_node+0x26b4/0x2848
[   50.028767]  do_try_to_free_pages+0x3e4/0x1190
[   50.028773]  try_to_free_pages+0x5a4/0xb40
[   50.028778]  __alloc_pages_direct_reclaim+0x144/0x298
[   50.028787]  __alloc_pages_slowpath+0x5c4/0xc70
[   50.028793]  __alloc_pages_noprof+0x4a8/0x6a8
[   50.028800]  __folio_alloc_noprof+0x24/0xa8
[   50.028806]  shmem_alloc_and_add_folio+0x2ec/0xce0
[   50.028812]  shmem_get_folio_gfp+0x380/0xc20
[   50.028818]  shmem_read_folio_gfp+0xe0/0x160
[   50.028824]  drm_gem_get_pages+0x238/0x620 [drm]
[   50.029039]  drm_gem_shmem_get_pages_sgt+0xd8/0x4b8 [drm_shmem_helper]
[   50.029053]  v3d_bo_create_finish+0x58/0x1e0 [v3d]
[   50.029083]  v3d_create_bo_ioctl+0xac/0x210 [v3d]
[   50.029105]  drm_ioctl_kernel+0x1d8/0x2b8 [drm]
[   50.029220]  drm_ioctl+0x4b4/0x920 [drm]
[   50.029330]  __arm64_sys_ioctl+0x11c/0x160
[   50.029337]  invoke_syscall+0x88/0x268
[   50.029345]  el0_svc_common+0x160/0x1d8
[   50.029351]  do_el0_svc+0x50/0x68
[   50.029358]  el0_svc+0x34/0x80
[   50.029364]  el0t_64_sync_handler+0x84/0x100
[   50.029371]  el0t_64_sync+0x190/0x198
[   50.029376] ==================================================================

This happens because the memory shrinker is unaware that we are
artificially disabling the memory cgroups and therefore it doesn't
allocate `nr_deferred` (as it would if we used the kernel command line).

To avoid such an issue, revert the artificial disablement and disable it
through the command line. If a user wants to enable the feature, it can
use the `cgroup_enable=` command line.

Signed-off-by: Maíra Canal <[email protected]>
@mairacanal
Copy link
Contributor Author

I rebased and updated the DTS files.

@pelwell pelwell merged commit 86099de into raspberrypi:rpi-6.6.y Oct 25, 2024
11 of 12 checks passed
@pelwell
Copy link
Contributor

pelwell commented Oct 25, 2024

Thanks!

popcornmix added a commit to raspberrypi/firmware that referenced this pull request Nov 1, 2024
kernel: drivers: media: bcm2835_isp: Cache LS table dmabuf
See: raspberrypi/linux#6429

kernel: arm64: dts: Sort out CM5 and I/O board I2C ports
See: raspberrypi/linux#6441

kernel: cgroup: Use kernel command line to disable memory cgroup
See: raspberrypi/linux#6439
popcornmix added a commit to raspberrypi/rpi-firmware that referenced this pull request Nov 1, 2024
kernel: drivers: media: bcm2835_isp: Cache LS table dmabuf
See: raspberrypi/linux#6429

kernel: arm64: dts: Sort out CM5 and I/O board I2C ports
See: raspberrypi/linux#6441

kernel: cgroup: Use kernel command line to disable memory cgroup
See: raspberrypi/linux#6439
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants