linux-yocto/include/asm-generic
Kumar Kartikeya Dwivedi 43b6c312b8 rqspinlock: Enclose lock/unlock within lock entry acquisitions
[ Upstream commit beb7021a6003d9c6a463fffca0d6311efb8e0e66 ]

Ritesh reported in [0] that timeouts occurred frequently for rqspinlock
despite reentrancy on the same lock on the same CPU. This patch closes
one of the races leading to this behavior and reduces the frequency of
timeouts.

We currently have a tiny window between the fast-path cmpxchg and the
grabbing of the lock entry where an NMI could land, attempt the same
lock that was just acquired, and end up timing out. This is not ideal.
Instead, move the lock entry acquisition from the fast path to before
the cmpxchg, and remove the grabbing of the lock entry in the slow path,
assuming it was already taken by the fast path. The TAS fallback is
invoked directly without being preceded by the typical fast path,
therefore we must continue to grab the deadlock detection entry in that
case.

Case on lock leading to missed AA:

cmpxchg lock A
<NMI>
... rqspinlock acquisition of A
... timeout
</NMI>
grab_held_lock_entry(A)
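
As a rough user-space illustration of the window above (not the kernel
code), the sketch below models the lock word and the per-CPU held-locks
table and records what an NMI would observe if it landed between the two
steps under each ordering. All toy_* names, the table size, and the
"view" struct are hypothetical and exist only for this example.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_HELD_MAX 8	/* arbitrary size, an assumption for this sketch */

struct toy_lock { atomic_int locked; };

/* Stand-in for the per-CPU table of held locks. */
struct toy_held { int cnt; struct toy_lock *locks[TOY_HELD_MAX]; };

/* What an NMI on the same CPU observes at the moment it lands. */
struct toy_view { bool word_locked; bool entry_present; };

static void toy_grab_entry(struct toy_held *h, struct toy_lock *l)
{
	assert(h->cnt < TOY_HELD_MAX);
	h->locks[h->cnt++] = l;
}

static bool toy_cmpxchg_lock(struct toy_lock *l)
{
	int expected = 0;

	return atomic_compare_exchange_strong_explicit(&l->locked, &expected, 1,
			memory_order_acquire, memory_order_relaxed);
}

static struct toy_view toy_nmi_view(struct toy_held *h, struct toy_lock *l)
{
	struct toy_view v = { atomic_load(&l->locked) != 0, false };

	for (int i = 0; i < h->cnt; i++)
		if (h->locks[i] == l)
			v.entry_present = true;
	return v;
}

/* Old order: cmpxchg first, entry second. */
static struct toy_view toy_lock_old_order(struct toy_held *h, struct toy_lock *l)
{
	struct toy_view v;

	toy_cmpxchg_lock(l);
	v = toy_nmi_view(h, l);	/* simulated NMI lands in the window */
	toy_grab_entry(h, l);
	return v;
}

/* New order: entry first, cmpxchg second. */
static struct toy_view toy_lock_new_order(struct toy_held *h, struct toy_lock *l)
{
	struct toy_view v;

	toy_grab_entry(h, l);
	v = toy_nmi_view(h, l);	/* simulated NMI lands in the window */
	toy_cmpxchg_lock(l);
	return v;
}
```

In the old order the simulated NMI sees a locked word with no matching
table entry, which is the state in which AA goes undiagnosed. In the new
order the entry is already visible whenever the word is locked.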

There is a similar case when unlocking the lock. If the NMI lands
between the WRITE_ONCE and smp_store_release, it is possible that we end
up in a situation where the NMI fails to diagnose the AA condition,
leading to a timeout.

Case on unlock leading to missed AA:

WRITE_ONCE(rqh->locks[rqh->cnt - 1], NULL)
<NMI>
... rqspinlock acquisition of A
... timeout
</NMI>
smp_store_release(A->locked, 0)

The patch changes the order on unlock to smp_store_release() followed
by WRITE_ONCE() of NULL. This avoids the missed AA detection described
above, but may lead to a false positive if the NMI lands between these
two statements, which is acceptable (and preferred over a timeout).
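
The two unlock orderings can be compared with a similar user-space
sketch. It is only an approximation under stated assumptions: C11
atomics stand in for smp_store_release() and WRITE_ONCE(), the ulk_*
names and table size are hypothetical, and the "view" merely records
what a same-CPU NMI would see if it landed between the two stores.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define ULK_HELD_MAX 8	/* arbitrary size, an assumption for this sketch */

struct ulk_lock { atomic_int locked; };

/* Stand-in for the per-CPU table of held locks. */
struct ulk_held { int cnt; struct ulk_lock *locks[ULK_HELD_MAX]; };

/* What an NMI on the same CPU observes at the moment it lands. */
struct ulk_view { bool word_locked; bool entry_present; };

/* Put the toy lock into the "held, entry recorded" state. */
static void ulk_acquire(struct ulk_held *h, struct ulk_lock *l)
{
	assert(h->cnt < ULK_HELD_MAX);
	h->locks[h->cnt++] = l;
	atomic_store_explicit(&l->locked, 1, memory_order_relaxed);
}

static struct ulk_view ulk_nmi_view(struct ulk_held *h, struct ulk_lock *l)
{
	struct ulk_view v = { atomic_load(&l->locked) != 0, false };

	for (int i = 0; i < h->cnt; i++)
		if (h->locks[i] == l)
			v.entry_present = true;
	return v;
}

/* Old order: clear the table entry, then release the lock word. */
static struct ulk_view ulk_unlock_old_order(struct ulk_held *h, struct ulk_lock *l)
{
	struct ulk_view v;

	h->locks[h->cnt - 1] = NULL;
	v = ulk_nmi_view(h, l);	/* simulated NMI lands between the stores */
	atomic_store_explicit(&l->locked, 0, memory_order_release);
	h->cnt--;
	return v;
}

/* New order: release the lock word, then clear the table entry. */
static struct ulk_view ulk_unlock_new_order(struct ulk_held *h, struct ulk_lock *l)
{
	struct ulk_view v;

	atomic_store_explicit(&l->locked, 0, memory_order_release);
	v = ulk_nmi_view(h, l);	/* simulated NMI lands between the stores */
	h->locks[h->cnt - 1] = NULL;
	h->cnt--;
	return v;
}
```

With the old order the simulated NMI sees a locked word and no entry
(the missed AA case). With the new order it sees a free word and a stale
entry, which corresponds to the window in which a false positive is
possible.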

The original intention of the reverse order on unlock was to prevent the
following possible misdiagnosis of an ABBA scenario:

grab entry A
lock A
grab entry B
lock B
unlock B
   smp_store_release(B->locked, 0)
							grab entry B
							lock B
							grab entry A
							lock A
							! <detect ABBA>
   WRITE_ONCE(rqh->locks[rqh->cnt - 1], NULL)

If the store release were after the WRITE_ONCE, the other CPU would
not observe B in the table of the CPU unlocking the lock B.  However,
since the threads are obviously participating in an ABBA deadlock, it
is no longer appealing to use the order above since it may lead to a
250 ms timeout due to missed AA detection.

  [0]: https://lore.kernel.org/bpf/CAH6OuBTjG+N=+GGwcpOUbeDN563oz4iVcU3rbse68egp9wj9_A@mail.gmail.com

Fixes: 0d80e7f951 ("rqspinlock: Choose trylock fallback for NMI waiters")
Reported-by: Ritesh Oedayrajsingh Varma <ritesh@superluminal.eu>
Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Link: https://lore.kernel.org/r/20251128232802.1031906-2-memxor@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-12-18 14:03:28 +01:00
bitops bitops: Add __attribute_const__ to generic ffs()-family implementations 2025-09-08 14:58:50 -07:00
vdso vdso: Drop Kconfig GENERIC_VDSO_DATA_STORE 2025-09-04 11:23:50 +02:00
access_ok.h uaccess: remove CONFIG_SET_FS 2022-02-25 09:36:06 +01:00
agp.h char/agp: introduce asm-generic/agp.h 2023-02-13 22:13:29 +01:00
archrandom.h random: handle archrandom with multiple longs 2022-07-25 13:26:14 +02:00
asm-offsets.h
asm-prototypes.h
atomic.h locking/atomic: make atomic*_{cmp,}xchg optional 2023-06-05 09:57:14 +02:00
atomic64.h
audit_change_attr.h fs/xattr: add *at family syscalls 2024-11-06 12:59:44 -05:00
audit_dir_write.h
audit_read.h
audit_signal.h
audit_write.h
barrier.h sched: Add missing memory barrier in switch_mm_cid 2024-04-16 13:59:45 +02:00
bitops.h include: move find.h from asm_generic to linux 2022-01-15 08:47:31 -08:00
bitsperlong.h
bug.h bug: Improve comment 2024-05-07 14:20:48 +02:00
cache.h
cacheflush.h mm: Introduce flush_cache_vmap_early() 2023-12-14 00:23:17 -08:00
cfi.h cfi: Flip headers 2023-12-15 16:25:55 -08:00
checksum.h asm-generic: Improve csum_fold 2024-01-17 17:52:29 -08:00
cmpxchg-local.h asm-generic: Fix 32 bit __generic_cmpxchg_local 2024-01-05 23:19:14 +01:00
cmpxchg.h asm-generic: avoid __generic_cmpxchg_local warnings 2023-04-04 17:58:11 +02:00
codetag.lds.h codetag: avoid unused alloc_tags sections/symbols 2025-07-09 22:42:14 -07:00
compat.h asm-generic: compat: fix compat_arg_u64() and compat_arg_u64_dual() 2022-11-01 10:20:11 +11:00
current.h asm-generic: current: Don't include thread-info.h if building asm 2023-08-26 22:38:49 +02:00
delay.h delay: Fix ndelay() spuriously treated as udelay() 2024-11-29 11:40:22 +01:00
device.h
div64.h __arch_xprod64(): make __always_inline when optimizing for performance 2024-10-28 21:44:28 +00:00
dma-mapping.h dma-mapping: no need to pass a bus_type into get_arch_dma_ops() 2023-02-15 12:35:20 +01:00
dma.h
early_ioremap.h mm/early_ioremap: add null pointer checks to prevent NULL-pointer dereference 2025-01-13 22:40:59 -08:00
emergency-restart.h
error-injection.h docs: fault-injection: add requirements of error injectable functions 2023-02-02 22:50:00 -08:00
exec.h
extable.h
fixmap.h fixmap: Remove unused set_fixmap_offset_io() 2024-07-11 17:41:23 +02:00
flat.h
fprobe.h fprobe: Add fprobe_header encoding feature 2024-12-26 10:50:05 -05:00
ftrace.h
futex.h futex: Fix additional regressions 2021-12-11 23:31:51 +01:00
getorder.h
hardirq.h
hugetlb.h mm: drop hugetlb_free_pgd_range() 2025-07-24 19:12:32 -07:00
hw_irq.h
int-ll64.h
io.h asm-generic/io.h: Skip trace helpers if rwmmio events are disabled 2025-09-24 16:21:13 +02:00
ioctl.h
iomap.h asm-generic/io.h: rework split ioread64/iowrite64 helpers 2025-03-01 21:00:22 +01:00
irq_regs.h
irq_work.h
irq.h
irqflags.h
Kbuild unwind_user: Add user space unwinding API with frame pointer support 2025-07-29 14:46:07 -04:00
kdebug.h
kmap_size.h
kprobes.h
kvm_para.h
kvm_types.h
linkage.h
local.h locking/generic: Wire up local{,64}_try_cmpxchg() 2023-04-29 09:09:09 +02:00
local64.h locking/generic: Wire up local{,64}_try_cmpxchg() 2023-04-29 09:09:09 +02:00
logic_io.h logic_io instance of iounmap() needs volatile on argument 2021-12-21 21:31:08 +01:00
mcs_spinlock.h locking: Move MCS struct definition to public header 2025-03-18 10:28:21 -07:00
memory_model.h mm: convert page_to_section() to memdesc_section() 2025-09-13 16:55:07 -07:00
mm_hooks.h mm: remove arch_unmap() 2024-09-01 20:26:13 -07:00
mmiowb_types.h
mmiowb.h
mmu_context.h
mmu.h
mmzone.h arch, mm: move definition of node_data to generic code 2024-09-03 21:15:28 -07:00
module.h asm-generic: Always define Elf_Rel and Elf_Rela 2025-03-26 15:56:43 -07:00
module.lds.h
mshyperv.h mshv: Fix deposit memory in MSHV_ROOT_HVCALL 2025-12-18 14:03:01 +01:00
msi.h irqchip/gic-v5: Add GICv5 IWB support 2025-07-08 18:35:52 +01:00
nommu_context.h
numa.h arch_numa: switch over to numa_memblks 2024-09-03 21:15:32 -07:00
param.h alpha: regularize the situation with asm/param.h 2025-06-24 22:02:05 -04:00
parport.h
pci_iomap.h PCI: Stub __pci_ioport_map() for arches that don't support it at all 2022-07-29 12:01:00 -05:00
pci.h asm-generic: Add new pci.h and use it 2022-07-22 17:34:57 -05:00
percpu.h percpu: repurpose __percpu tag as a named address space qualifier 2025-03-16 22:05:53 -07:00
pgalloc.h mm: call ctor/dtor for kernel PTEs 2025-05-11 17:48:21 -07:00
pgtable_uffd.h
pgtable-nop4d.h
pgtable-nopmd.h mm: recover pud_leaf() definitions in nopmd case 2024-03-13 12:12:21 -07:00
pgtable-nopud.h
preempt.h riscv: support PREEMPT_DYNAMIC with static keys 2023-08-31 00:18:34 -07:00
qrwlock_types.h locking/qrwlock: Change "queue rwlock" to "queued rwlock" 2022-05-11 16:27:04 +02:00
qrwlock.h asm-generic changes for 5.19 2022-05-26 10:50:30 -07:00
qspinlock_types.h
qspinlock.h riscv: Add qspinlock support 2024-11-11 07:33:20 -08:00
resource.h
rqspinlock.h rqspinlock: Enclose lock/unlock within lock entry acquisitions 2025-12-18 14:03:28 +01:00
runtime-const.h runtime constants: add default dummy infrastructure 2024-06-19 12:34:34 -07:00
rwonce.h rwonce: fix crash by removing READ_ONCE() for unaligned read 2025-03-26 22:16:50 +01:00
seccomp.h
sections.h percpu: Remove __per_cpu_load 2025-02-18 10:16:00 +01:00
serial.h
set_memory.h
shmparam.h
signal.h asm-generic: Remove empty #ifdef SA_RESTORER 2022-09-10 09:56:53 +02:00
simd.h asm-generic: Add sched.h inclusion in simd.h 2025-05-30 20:56:48 +08:00
softirq_stack.h asm-generic: Conditionally enable do_softirq_own_stack() via Kconfig. 2022-09-05 17:20:55 +02:00
spinlock_types.h asm-generic: ticket-lock: Reuse arch_spinlock_t of qspinlock 2024-11-11 07:33:16 -08:00
spinlock.h asm-generic: ticket-lock: Add separate ticket-lock.h 2024-11-11 07:33:17 -08:00
statfs.h
string.h
switch_to.h
syscall.h syscall.h: introduce syscall_set_nr() 2025-05-11 17:48:15 -07:00
syscalls.h syscalls: mmap(): use unsigned offset type consistently 2024-06-25 15:57:38 +02:00
text-patching.h asm-generic: introduce text-patching.h 2024-11-07 14:25:15 -08:00
thread_info_tif.h asm-generic: Provide generic TIF infrastructure 2025-09-17 08:14:03 +02:00
ticket_spinlock.h riscv: Add qspinlock support 2024-11-11 07:33:20 -08:00
timex.h
tlb.h mmu_gather: move tlb flush for VM_PFNMAP/VM_MIXEDMAP vmas into free_pgtables() 2025-05-31 22:46:12 -07:00
tlbflush.h
topology.h
trace_clock.h
uaccess.h move asm/unaligned.h to linux/unaligned.h 2024-10-02 17:23:23 -04:00
unwind_user.h unwind_user: Add user space unwinding API with frame pointer support 2025-07-29 14:46:07 -04:00
user.h
vermagic.h
vga.h empty include/asm-generic/vga.h 2024-11-11 21:51:42 +01:00
video.h arch: Rename fbdev header and source files 2024-05-03 17:07:50 +02:00
vmlinux.lds.h kbuild: align modinfo section for Secureboot Authenticode EDK2 compat 2025-10-27 16:21:24 -07:00
word-at-a-time.h kernel.h: removed REPEAT_BYTE from kernel.h 2024-02-01 09:47:59 -08:00
xor.h lib/xor: make xor prototypes more friendly to compiler vectorization 2022-02-11 20:39:39 +11:00