cgroup: Use kernel command line to disable memory cgroup #6439
Merged: pelwell merged 1 commit into raspberrypi:rpi-6.6.y from mairacanal:cgroup/disable-cgroup-cmdline on Oct 25, 2024
Hi Maira, Thanks for this. Your patches missed out a few platforms because of the non-trivial include hierarchy. You may find it easier to build on top of #6442, which is effectively a cosmetic change that reduces the number of files to change to 6:
#6442 is merged, so over to you.
Commit 94a23e9 ("cgroup: Disable cgroup "memory" by default") disabled the memory cgroup by default when initializing the cgroups. However, it's already possible to disable the memory cgroup from the kernel command line. Hard-coding such a feature can be problematic, as some memory management features depend on the order in which things are set.

For example, it is possible to see a NULL pointer dereference caused by commit 94a23e9. The NULL pointer dereference is triggered by the memory shrinker and ends up in a kernel crash.

[ 50.028629] ==================================================================
[ 50.028645] BUG: KASAN: null-ptr-deref in do_shrink_slab+0x1fc/0x978
[ 50.028663] Write of size 8 at addr 0000000000000000 by task gfxrecon-replay/1965
[ 50.028676] CPU: 3 UID: 1000 PID: 1965 Comm: gfxrecon-replay Tainted: G C 6.12.0-rc4-v8-thp-kasan+ #85
[ 50.028685] Tainted: [C]=CRAP
[ 50.028689] Hardware name: Raspberry Pi 5 Model B Rev 1.0 (DT)
[ 50.028694] Call trace:
[ 50.028697] dump_backtrace+0xfc/0x120
[ 50.028706] show_stack+0x24/0x38
[ 50.028711] dump_stack_lvl+0x40/0x88
[ 50.028720] print_report+0xe4/0x708
[ 50.028728] kasan_report+0xcc/0x130
[ 50.028733] kasan_check_range+0x254/0x298
[ 50.028738] __kasan_check_write+0x20/0x30
[ 50.028745] do_shrink_slab+0x1fc/0x978
[ 50.028751] shrink_slab+0x318/0xc38
[ 50.028756] shrink_one+0x254/0x6d8
[ 50.028762] shrink_node+0x26b4/0x2848
[ 50.028767] do_try_to_free_pages+0x3e4/0x1190
[ 50.028773] try_to_free_pages+0x5a4/0xb40
[ 50.028778] __alloc_pages_direct_reclaim+0x144/0x298
[ 50.028787] __alloc_pages_slowpath+0x5c4/0xc70
[ 50.028793] __alloc_pages_noprof+0x4a8/0x6a8
[ 50.028800] __folio_alloc_noprof+0x24/0xa8
[ 50.028806] shmem_alloc_and_add_folio+0x2ec/0xce0
[ 50.028812] shmem_get_folio_gfp+0x380/0xc20
[ 50.028818] shmem_read_folio_gfp+0xe0/0x160
[ 50.028824] drm_gem_get_pages+0x238/0x620 [drm]
[ 50.029039] drm_gem_shmem_get_pages_sgt+0xd8/0x4b8 [drm_shmem_helper]
[ 50.029053] v3d_bo_create_finish+0x58/0x1e0 [v3d]
[ 50.029083] v3d_create_bo_ioctl+0xac/0x210 [v3d]
[ 50.029105] drm_ioctl_kernel+0x1d8/0x2b8 [drm]
[ 50.029220] drm_ioctl+0x4b4/0x920 [drm]
[ 50.029330] __arm64_sys_ioctl+0x11c/0x160
[ 50.029337] invoke_syscall+0x88/0x268
[ 50.029345] el0_svc_common+0x160/0x1d8
[ 50.029351] do_el0_svc+0x50/0x68
[ 50.029358] el0_svc+0x34/0x80
[ 50.029364] el0t_64_sync_handler+0x84/0x100
[ 50.029371] el0t_64_sync+0x190/0x198
[ 50.029376] ==================================================================

This happens because the memory shrinker is unaware that we are artificially disabling the memory cgroups, and therefore it doesn't allocate `nr_deferred` (as it would if we used the kernel command line).

To avoid such an issue, revert the artificial disablement and disable the memory cgroup through the command line instead. If a user wants to enable the feature, they can use the `cgroup_enable=` command-line option.

Signed-off-by: Maíra Canal <[email protected]>
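For anyone who wants to confirm what a given boot actually ended up with, here is a minimal user-space sketch (not part of the patch). It parses text in the format of `/proc/cgroups` to read a controller's `enabled` flag, and lists any `cgroup_disable=` / `cgroup_enable=` options found in a command-line string. The helper names `controller_enabled` and `cmdline_cgroup_options` are made up for illustration, and splitting the option value on commas is an assumption about how controllers are listed.

```python
from typing import List, Optional, Tuple


def controller_enabled(proc_cgroups: str, name: str = "memory") -> Optional[bool]:
    """Return the 'enabled' column for a controller from /proc/cgroups text.

    Returns None if the controller is not listed at all.
    """
    for line in proc_cgroups.splitlines():
        # The header line starts with '#': subsys_name hierarchy num_cgroups enabled
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        if len(fields) >= 4 and fields[0] == name:
            return fields[3] == "1"
    return None


def cmdline_cgroup_options(cmdline: str) -> List[Tuple[str, str]]:
    """List (option, controller) pairs in the order they appear on the command line.

    Assumption: the option value is a comma-separated list of controller names.
    """
    found = []
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        if sep and key in ("cgroup_disable", "cgroup_enable"):
            for controller in value.split(","):
                if controller:
                    found.append((key, controller))
    return found


# Hypothetical sample inputs, only to show the expected shapes.
sample_cgroups = (
    "#subsys_name\thierarchy\tnum_cgroups\tenabled\n"
    "cpuset\t0\t1\t1\n"
    "memory\t0\t1\t0\n"
)
sample_cmdline = "console=tty1 cgroup_disable=memory rootwait"

print("memory enabled:", controller_enabled(sample_cgroups))
print("cgroup options:", cmdline_cgroup_options(sample_cmdline))
```

On a running system the two inputs would come from `/proc/cgroups` and `/proc/cmdline`. The sketch only reports which options are present; it does not try to model which one takes precedence when both are given.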
mairacanal force-pushed the cgroup/disable-cgroup-cmdline branch from 0974ecb to a870565 on October 25, 2024 15:13
I rebased and updated the DTS files.
Thanks!
popcornmix added a commit to raspberrypi/firmware that referenced this pull request on Nov 1, 2024:
kernel: drivers: media: bcm2835_isp: Cache LS table dmabuf
See: raspberrypi/linux#6429
kernel: arm64: dts: Sort out CM5 and I/O board I2C ports
See: raspberrypi/linux#6441
kernel: cgroup: Use kernel command line to disable memory cgroup
See: raspberrypi/linux#6439
popcornmix added a commit to raspberrypi/rpi-firmware that referenced this pull request on Nov 1, 2024:
kernel: drivers: media: bcm2835_isp: Cache LS table dmabuf
See: raspberrypi/linux#6429
kernel: arm64: dts: Sort out CM5 and I/O board I2C ports
See: raspberrypi/linux#6441
kernel: cgroup: Use kernel command line to disable memory cgroup
See: raspberrypi/linux#6439