linux-yocto/kernel
Frederic Weisbecker 335cd42062 rcu: Fix racy re-initialization of irq_work causing hangs
commit 61399e0c5410567ef60cb1cda34cca42903842e3 upstream.

RCU re-initializes the deferred QS irq work everytime before attempting
to queue it. However there are situations where the irq work is
attempted to be queued even though it is already queued. In that case
re-initializing messes-up with the irq work queue that is about to be
handled.

The chances for that to happen are higher when the architecture doesn't
support self-IPIs and irq work are then all lazy, such as with the
following sequence:

1) rcu_read_unlock() is called when IRQs are disabled and there is a
   grace period involving blocked tasks on the node. The irq work
   is then initialized and queued.

2) The related tasks are unblocked and the CPU quiescent state
   is reported. rdp->defer_qs_iw_pending is reset to DEFER_QS_IDLE,
   allowing the irq work to be requeued in the future (note the previous
   one hasn't fired yet).

3) A new grace period starts and the node has blocked tasks.

4) rcu_read_unlock() is called when IRQs are disabled again. The irq work
   is re-initialized (but it's queued! and its node is cleared) and
   requeued. Which means it's requeued to itself.

5) The irq work finally fires with the tick. But since it was requeued
   to itself, it loops and hangs.

Fix this with initializing the irq work only once before the CPU boots.

Fixes: b41642c877 ("rcu: Fix rcu_read_unlock() deadloop due to IRQ work")
Reported-by: kernel test robot <oliver.sang@intel.com>
Closes: https://lore.kernel.org/oe-lkp/202508071303.c1134cce-lkp@intel.com
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Reviewed-by: Joel Fernandes <joelagnelf@nvidia.com>
Signed-off-by: Neeraj Upadhyay (AMD) <neeraj.upadhyay@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-08-20 18:41:43 +02:00
..
bpf bpf: Make reg_not_null() true for CONST_PTR_TO_MAP 2025-08-20 18:41:20 +02:00
cgroup cgroup: Add compatibility option for content of /proc/cgroups 2025-08-15 16:39:10 +02:00
configs Kbuild updates for v6.16 2025-06-07 10:05:35 -07:00
debug TTY/Serial driver updates for 6.15-rc1 2025-04-02 18:17:33 -07:00
dma dma-contiguous: hornor the cma address limit setup by user 2025-06-12 08:38:40 +02:00
entry entry: Inline syscall_exit_to_user_mode() 2025-04-29 08:27:10 +02:00
events perf/core: Prevent VMA split of buffer mappings 2025-08-15 16:39:31 +02:00
futex futex: Use user_write_access_begin/_end() in futex_put_value() 2025-08-20 18:41:35 +02:00
gcov kbuild: require gcc-8 and binutils-2.30 2025-04-30 21:53:35 +02:00
irq genirq/irq_sim: Initialize work context pointers properly 2025-06-13 15:36:35 +02:00
kcsan kcsan: test: Initialize dummy variable 2025-08-15 16:38:50 +02:00
livepatch sched,livepatch: Untangle cond_resched() and live-patching 2025-05-14 13:16:24 +02:00
locking Generic: 2025-06-02 12:24:58 -07:00
module module: Prevent silent truncation of module name in delete_module(2) 2025-08-20 18:41:28 +02:00
power PM: sleep: console: Fix the black screen issue 2025-08-20 18:41:00 +02:00
printk printk: nbcon: Allow reacquire during panic 2025-08-20 18:41:30 +02:00
rcu rcu: Fix racy re-initialization of irq_work causing hangs 2025-08-20 18:41:43 +02:00
sched sched/fair: Bump sd->max_newidle_lb_cost when newidle balance fails 2025-08-20 18:41:10 +02:00
time timekeeping: Zero initialize system_counterval when querying time from phc drivers 2025-07-22 14:25:21 +02:00
trace tracing: fprobe: Fix infinite recursion using preempt_*_notrace() 2025-08-20 18:41:42 +02:00
.gitignore kheaders: rebuild kheaders_data.tar.xz when a file is modified within a minute 2025-08-20 18:41:31 +02:00
acct.c acct: block access to kernel internal filesystems 2025-02-12 12:24:16 +01:00
async.c
audit_fsnotify.c
audit_tree.c replace collect_mounts()/drop_collected_mounts() with a safer variant 2025-06-23 14:01:49 -04:00
audit_watch.c fs: add kern_path_locked_negative() 2025-04-15 11:32:34 +02:00
audit.c audit: record AUDIT_ANOM_* events regardless of presence of rules 2025-04-11 14:14:41 -04:00
audit.h audit,module: restore audit logging in load failure case 2025-08-15 16:38:20 +02:00
auditfilter.c audit: fix suffixed '/' filename matching 2024-12-05 19:22:38 -05:00
auditsc.c audit,module: restore audit logging in load failure case 2025-08-15 16:38:20 +02:00
backtracetest.c
bounds.c
capability.c capability: Remove unused has_capability 2025-03-07 22:03:09 -06:00
cfi.c Modules changes for 6.15-rc1 2025-03-30 15:44:36 -07:00
compat.c
configs.c
context_tracking.c context_tracking: Make RCU watch ct_kernel_exit_state() warning 2025-03-04 18:44:29 -08:00
cpu_pm.c
cpu.c perf: Remove too early and redundant CPU hotplug handling 2025-05-08 21:50:19 +02:00
crash_core.c crash: Use note name macros 2025-02-10 16:56:58 -08:00
crash_dump_dm_crypt.c crash_dump: retrieve dm crypt keys in kdump kernel 2025-05-21 10:48:21 -07:00
crash_reserve.c crash: fix spelling mistake "crahskernel" -> "crashkernel" 2025-05-11 17:54:10 -07:00
cred.c cred: remove old {override,revert}_creds() helpers 2024-12-02 11:25:09 +01:00
delayacct.c delayacct: remove redundant code and adjust indentation 2025-05-27 19:40:33 -07:00
dma.c
elfcorehdr.c
exec_domain.c
exit.c - Avoid a crash on a heterogeneous machine where not all cores support the 2025-06-22 10:11:45 -07:00
exit.h
extable.c
fail_function.c
fork.c - The 11 patch series "Add folio_mk_pte()" from Matthew Wilcox 2025-05-31 15:44:16 -07:00
freezer.c sched,freezer: Remove unnecessary warning in __thaw_task 2025-07-17 07:56:50 -10:00
gen_kheaders.sh kheaders: rebuild kheaders_data.tar.xz when a file is modified within a minute 2025-08-20 18:41:31 +02:00
groups.c
hung_task.c hung_task: show the blocker task if the task is hung on semaphore 2025-05-11 17:54:08 -07:00
iomem.c mm/memremap: Pass down MEMREMAP_* flags to arch_memremap_wb() 2025-02-21 15:05:38 +01:00
irq_work.c kasan: make kasan_record_aux_stack_noalloc() the default behaviour 2025-01-13 22:40:36 -08:00
jump_label.c jump_label: Use RCU in all users of __module_text_address(). 2025-03-10 11:54:46 +01:00
kallsyms_internal.h
kallsyms_selftest.c kallsyms: Use kthread_run_on_cpu() 2025-01-02 22:12:12 +01:00
kallsyms_selftest.h
kallsyms.c kallsyms: Remove KALLSYMS_ABSOLUTE_PERCPU 2025-02-18 10:16:04 +01:00
kcmp.c kcmp: improve performance adding an unlikely hint to task comparisons 2025-02-21 10:25:33 +01:00
Kconfig.freezer
Kconfig.hz kernel: Fix "select" wording on HZ_250 description 2025-02-21 09:20:30 +01:00
Kconfig.kexec crashdump: add CONFIG_KEYS dependency 2025-06-25 15:55:04 -07:00
Kconfig.locks
Kconfig.preempt sched: No PREEMPT_RT=y for all{yes,mod}config 2024-11-07 15:25:05 +01:00
kcov.c kcov: mark in_softirq_really() as __always_inline 2024-12-30 17:59:08 -08:00
kexec_core.c kexec_core: Fix error code path in the KEXEC_JUMP flow 2025-08-15 16:38:34 +02:00
kexec_elf.c kexec: initialize ELF lowest address to ULONG_MAX 2025-03-16 22:30:47 -07:00
kexec_file.c - The 3 patch series "hung_task: extend blocking task stacktrace dump to 2025-05-31 19:12:53 -07:00
kexec_handover.c kho: initialize tail pages for higher order folios properly 2025-06-19 20:48:02 -07:00
kexec_internal.h kexec: add KHO support to kexec file loads 2025-05-12 23:50:40 -07:00
kexec.c
kheaders.c kheaders: Simplify attribute through __BIN_ATTR_SIMPLE_RO() 2024-12-24 09:46:49 +01:00
kprobes.c kprobes: Use RCU in all users of __module_text_address(). 2025-03-10 11:54:46 +01:00
ksyms_common.c
ksysfs.c kernel/ksysfs.c: simplify bin_attribute definition 2025-01-07 16:59:15 +01:00
kthread.c ipvs: Fix estimator kthreads preferred affinity 2025-08-20 18:40:52 +02:00
latencytop.c treewide: const qualify ctl_tables where applicable 2025-01-28 13:48:37 +01:00
Makefile kheaders: rebuild kheaders_data.tar.xz when a file is modified within a minute 2025-08-20 18:41:31 +02:00
module_signature.c
notifier.c reboot: move reboot_notifier_list to kernel/reboot.c 2024-11-05 17:12:31 -08:00
nsproxy.c kernel/nsproxy: remove unnecessary guards 2025-05-09 13:13:54 +02:00
padata.c padata: Fix pd UAF once and for all 2025-08-15 16:38:58 +02:00
panic.c - The 3 patch series "hung_task: extend blocking task stacktrace dump to 2025-05-31 19:12:53 -07:00
params.c module: ensure that kobject_put() is safe for module type kobjects 2025-05-07 20:24:59 +02:00
pid_namespace.c pid: Do not set pid_max in new pid namespaces 2025-03-06 10:18:36 +01:00
pid_sysctl.h treewide: const qualify ctl_tables where applicable 2025-01-28 13:48:37 +01:00
pid.c pidfs: detect refcount bugs 2025-05-06 13:59:00 +02:00
profile.c
ptrace.c ptrace: introduce PTRACE_SET_SYSCALL_INFO request 2025-05-11 17:48:15 -07:00
range.c
reboot.c - The 7 patch series "powerpc/crash: use generic crashkernel 2025-04-01 10:06:52 -07:00
regset.c
relay.c relay: remove unused relay_late_setup_files 2025-05-11 17:54:09 -07:00
resource_kunit.c
resource.c resource: fix false warning in __request_region() 2025-07-24 17:57:59 -07:00
rseq.c rseq: Fix segfault on registration when rseq_cs is non-zero 2025-03-06 22:26:49 +01:00
scftorture.c scftorture: Handle NULL argument passed to scf_add_to_free_list(). 2024-11-14 16:09:51 -08:00
scs.c
seccomp.c seccomp: avoid the lock trip seccomp_filter_release in common case 2025-02-24 11:17:10 -08:00
signal.c signal: Move signal ctl tables into signal.c 2025-04-09 13:32:16 +02:00
smp.c CSD-lock pull request for v6.14 2025-01-28 11:34:03 -08:00
smpboot.c
smpboot.h
softirq.c lockdep: Fix wait context check on softirq for PREEMPT_RT 2025-03-25 10:46:44 +01:00
stackleak.c treewide: const qualify ctl_tables where applicable 2025-01-28 13:48:37 +01:00
stacktrace.c
static_call_inline.c Modules changes for 6.15-rc1 2025-03-30 15:44:36 -07:00
static_call.c
stop_machine.c sched/core: Fix migrate_swap() vs. hotplug 2025-07-01 15:02:03 +02:00
sys_ni.c
sys.c futex: Add basic infrastructure for local task local hash 2025-05-03 12:02:07 +02:00
sysctl-test.c sysctl: move u8 register test to lib/test_sysctl.c 2025-04-14 14:13:41 +02:00
sysctl.c sparc: mv sparc sysctls into their own file under arch/sparc/kernel 2025-04-09 13:32:16 +02:00
task_work.c kasan: make kasan_record_aux_stack_noalloc() the default behaviour 2025-01-13 22:40:36 -08:00
taskstats.c fdget(), more trivial conversions 2024-11-03 01:28:06 -05:00
torture.c torture: Add get_torture_init_jiffies() for test-start time 2025-02-05 07:14:24 -08:00
tracepoint.c tracepoint: Print the function symbol when tracepoint_debug is set 2025-03-21 15:30:10 -04:00
tsacct.c
ucount.c ucount: fix atomic_long_inc_below() argument type 2025-08-15 16:39:17 +02:00
uid16.c
uid16.h
umh.c treewide: const qualify ctl_tables where applicable 2025-01-28 13:48:37 +01:00
up.c
user_namespace.c uidgid: add map_id_range_up() 2025-02-12 12:12:27 +01:00
user-return-notifier.c
user.c
usermode_driver.c
utsname_sysctl.c treewide: const qualify ctl_tables where applicable 2025-01-28 13:48:37 +01:00
utsname.c
vhost_task.c vhost_task: fix vhost_task_create() documentation 2025-04-18 10:08:11 -04:00
vmcore_info.c crash: export PAGE_UNACCEPTED_MAPCOUNT_VALUE to vmcoreinfo 2025-05-11 17:54:04 -07:00
watch_queue.c vfs-6.15-rc1.pipe 2025-03-24 09:52:37 -07:00
watchdog_buddy.c
watchdog_perf.c - The 7 patch series "powerpc/crash: use generic crashkernel 2025-04-01 10:06:52 -07:00
watchdog.c kernel/watchdog: add /sys/kernel/{hard,soft}lockup_count 2025-05-21 10:48:22 -07:00
workqueue_internal.h
workqueue.c workqueue: Initialize wq_isolated_cpumask in workqueue_init_early() 2025-06-17 08:58:29 -10:00