linux-yocto/kernel/cgroup
Chen Ridong 4a1e3ec28e cgroup: split cgroup_destroy_wq into 3 workqueues
[ Upstream commit 79f919a89c9d06816dbdbbd168fa41d27411a7f9 ]

A hung task can occur during LTP cgroup testing [1] when repeatedly
mounting/unmounting perf_event and net_prio controllers with
systemd.unified_cgroup_hierarchy=1. The hang manifests in
cgroup_lock_and_drain_offline() during root destruction.

Related cases:
cgroup_fj_function_perf_event cgroup_fj_function.sh perf_event
cgroup_fj_function_net_prio cgroup_fj_function.sh net_prio

Call Trace:
	cgroup_lock_and_drain_offline+0x14c/0x1e8
	cgroup_destroy_root+0x3c/0x2c0
	css_free_rwork_fn+0x248/0x338
	process_one_work+0x16c/0x3b8
	worker_thread+0x22c/0x3b0
	kthread+0xec/0x100
	ret_from_fork+0x10/0x20

Root Cause:

CPU0                            CPU1
mount perf_event                umount net_prio
cgroup1_get_tree                cgroup_kill_sb
rebind_subsystems               // root destruction enqueues
                                // cgroup_destroy_wq
// kill all perf_event css
                                // one perf_event css A is dying
                                // css A offline enqueues cgroup_destroy_wq
                                // root destruction will be executed first
                                css_free_rwork_fn
                                cgroup_destroy_root
                                cgroup_lock_and_drain_offline
                                // some perf descendants are dying
                                // cgroup_destroy_wq max_active = 1
                                // waiting for css A to die

Problem scenario:
1. CPU0 mounts perf_event (rebind_subsystems)
2. CPU1 unmounts net_prio (cgroup_kill_sb), queuing root destruction work
3. A dying perf_event CSS gets queued for offline after root destruction
4. Root destruction waits for offline completion, but offline work is
   blocked behind root destruction in cgroup_destroy_wq (max_active=1)

Solution:
Split cgroup_destroy_wq into three dedicated workqueues:
cgroup_offline_wq – Handles CSS offline operations
cgroup_release_wq – Manages resource release
cgroup_free_wq – Performs final memory deallocation

This separation eliminates blocking in the CSS free path while waiting for
offline operations to complete.
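
The ordering problem and the effect of the split can be illustrated with a small user-space model. This is only a sketch under stated assumptions, not the kernel implementation: `struct work`, `drains()`, the queue numbering and both scenario functions are hypothetical names invented for illustration. The model assumes each queue behaves like an ordered queue with max_active = 1, so only its oldest unfinished item may run and a waiting item keeps the slot. It also assumes, based on the call trace above, that root destruction runs from the free path.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_ITEMS 16

/* One queued work item in the toy model (not a kernel structure). */
struct work {
	int queue;      /* which workqueue the item sits on */
	int waits_for;  /* index of the item it blocks on, or -1 */
};

/*
 * Items are listed in enqueue order. Returns true if every item
 * eventually completes, false if the model gets stuck (a hang).
 */
static bool drains(const struct work *items, size_t n)
{
	bool done[MAX_ITEMS] = { false };
	size_t finished = 0;
	bool progress = true;

	if (n > MAX_ITEMS)
		return false;

	while (progress && finished < n) {
		progress = false;
		for (size_t i = 0; i < n; i++) {
			bool head = true;

			if (done[i])
				continue;
			/* Only the oldest unfinished item of a queue may run. */
			for (size_t j = 0; j < i; j++)
				if (!done[j] && items[j].queue == items[i].queue)
					head = false;
			if (!head)
				continue;
			/* A head item that is still waiting keeps its slot. */
			if (items[i].waits_for >= 0 && !done[items[i].waits_for])
				continue;
			done[i] = true;
			finished++;
			progress = true;
		}
	}
	return finished == n;
}

/* Before the fix: both items share one queue (queue 0). */
bool single_queue_drains(void)
{
	const struct work items[] = {
		{ .queue = 0, .waits_for = 1 },  /* root destruction, queued first */
		{ .queue = 0, .waits_for = -1 }, /* css A offline, queued second */
	};

	return drains(items, 2);
}

/* After the fix: offline work (queue 0) and free work (queue 2) are separate. */
bool split_queues_drains(void)
{
	const struct work items[] = {
		{ .queue = 2, .waits_for = 1 },  /* root destruction on the free queue */
		{ .queue = 0, .waits_for = -1 }, /* css A offline on the offline queue */
	};

	return drains(items, 2);
}
```

In this model `single_queue_drains()` returns false, because root destruction holds the only slot while waiting for an offline item queued behind it, and `split_queues_drains()` returns true, because the offline item is the head of its own queue and can finish first.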

[1] https://github.com/linux-test-project/ltp/blob/master/runtest/controllers
Fixes: 334c3679ec ("cgroup: reimplement rebind_subsystems() using cgroup_apply_control() and friends")
Reported-by: Gao Yingjie <gaoyingjie@uniontech.com>
Signed-off-by: Chen Ridong <chenridong@huawei.com>
Suggested-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-09-25 11:00:05 +02:00
cgroup-internal.h cgroup: Make operations on the cgroup root_list RCU safe 2024-08-19 06:04:25 +02:00
cgroup-v1.c kernfs: Convert kernfs_path_from_node_locked() from strlcpy() to strscpy() 2024-08-03 08:53:21 +02:00
cgroup.c cgroup: split cgroup_destroy_wq into 3 workqueues 2025-09-25 11:00:05 +02:00
cpuset.c cgroup/cpuset: Use static_branch_enable_cpuslocked() on cpusets_insane_config_key 2025-08-28 16:28:47 +02:00
debug.c kernel: cgroup: fix misuse of %x 2019-05-06 08:47:48 -07:00
freezer.c cgroup: cleanup comments 2022-03-13 19:19:27 -10:00
legacy_freezer.c Revert "cgroup_freezer: cgroup_freezing: Check if not frozen" 2025-07-24 08:53:20 +02:00
Makefile cgroup: Add misc cgroup controller 2021-04-04 13:34:46 -04:00
misc.c cgroup/misc: Store atomic64_t reads to u64 2023-07-21 08:10:06 -10:00
namespace.c cgroup:namespace: Remove unused cgroup_namespaces_init() 2023-08-14 14:29:47 -10:00
pids.c cgroup: add pids.peak interface for pids controller 2022-09-04 09:26:51 -10:00
rdma.c rdmacg: fix kernel-doc warnings in rdmacg 2023-06-05 09:45:14 -10:00
rstat.c cgroup: Remove steal time from usage_usec 2025-02-21 13:57:08 +01:00