linux-yocto

mirror of git://git.yoctoproject.org/linux-yocto.git synced 2025-08-21 16:31:14 +02:00

Author	SHA1	Message	Date
Greg Kroah-Hartman	259f497740	Linux 6.12.38 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2025-07-14 16:02:59 +02:00
Borislav Petkov (AMD)	faac2abe89	x86/CPU/AMD: Properly check the TSA microcode In order to simplify backports, I resorted to an older version of the microcode revision checking which didn't pull in the whole struct x86_cpu_id matching machinery. My simpler method, however, forgot to add the extended CPU model to the patch revision, which lead to mismatches when determining whether TSA mitigation support is present. So add that forgotten extended model. This is a stable-only fix and the preference is to do it this way because it is a lot simpler. Also, the Fixes: tag below points to the respective stable patch. Fixes: `7a0395f660` ("x86/bugs: Add a Transient Scheduler Attacks mitigation") Reported-by: Thomas Voegtle <tv@lio96.de> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Tested-by: Thomas Voegtle <tv@lio96.de> Message-ID: <04ea0a8e-edb0-c59e-ce21-5f3d5d167af3@lio96.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2025-07-14 16:02:59 +02:00
Greg Kroah-Hartman	fbad404f04	Linux 6.12.37 Link: https://lore.kernel.org/r/20250708162241.426806072@linuxfoundation.org Tested-by: Salvatore Bonaccorso <carnil@debian.org> Tested-by: Mark Brown <broonie@kernel.org> Tested-by: Harshit Mogalapalli <harshit.m.mogalapalli@oracle.com> Tested-by: Ron Economos <re@w6rz.net> Tested-by: Jon Hunter <jonathanh@nvidia.com> Tested-by: Miguel Ojeda <ojeda@kernel.org> Tested-by: Linux Kernel Functional Testing <lkft@linaro.org> Tested-by: Shuah Khan <skhan@linuxfoundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2025-07-10 16:05:15 +02:00
Borislav Petkov (AMD)	0029b3c132	x86/process: Move the buffer clearing before MONITOR Commit `8e786a85c0` upstream. Move the VERW clearing before the MONITOR so that VERW doesn't disarm it and the machine never enters C1. Original idea by Kim Phillips <kim.phillips@amd.com>. Suggested-by: Andrew Cooper <andrew.cooper3@citrix.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2025-07-10 16:05:14 +02:00
Borislav Petkov (AMD)	331cfdd274	x86/microcode/AMD: Add TSA microcode SHAs Commit `2329f250e0` upstream. Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2025-07-10 16:05:14 +02:00
Borislav Petkov (AMD)	d5d66e31fd	KVM: SVM: Advertise TSA CPUID bits to guests Commit `31272abd59` upstream. Synthesize the TSA CPUID feature bits for guests. Set TSA_{SQ,L1}_NO on unaffected machines. Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2025-07-10 16:05:14 +02:00
Borislav Petkov (AMD)	7a0395f660	x86/bugs: Add a Transient Scheduler Attacks mitigation Commit `d8010d4ba4` upstream. Add the required features detection glue to bugs.c et all in order to support the TSA mitigation. Co-developed-by: Kim Phillips <kim.phillips@amd.com> Signed-off-by: Kim Phillips <kim.phillips@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2025-07-10 16:05:14 +02:00
Borislav Petkov (AMD)	0720e436e5	x86/bugs: Rename MDS machinery to something more generic Commit `f9af88a3d3` upstream. It will be used by other x86 mitigations. No functional changes. Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2025-07-10 16:05:14 +02:00
Kairui Song	4c443046d8	mm: userfaultfd: fix race of userfaultfd_move and swap cache commit `0ea148a799` upstream. This commit fixes two kinds of races, they may have different results: Barry reported a BUG_ON in commit `c50f8e6053`, we may see the same BUG_ON if the filemap lookup returned NULL and folio is added to swap cache after that. If another kind of race is triggered (folio changed after lookup) we may see RSS counter is corrupted: [ 406.893936] BUG: Bad rss-counter state mm:ffff0000c5a9ddc0 type:MM_ANONPAGES val:-1 [ 406.894071] BUG: Bad rss-counter state mm:ffff0000c5a9ddc0 type:MM_SHMEMPAGES val:1 Because the folio is being accounted to the wrong VMA. I'm not sure if there will be any data corruption though, seems no. The issues above are critical already. On seeing a swap entry PTE, userfaultfd_move does a lockless swap cache lookup, and tries to move the found folio to the faulting vma. Currently, it relies on checking the PTE value to ensure that the moved folio still belongs to the src swap entry and that no new folio has been added to the swap cache, which turns out to be unreliable. While working and reviewing the swap table series with Barry, following existing races are observed and reproduced [1]: In the example below, move_pages_pte is moving src_pte to dst_pte, where src_pte is a swap entry PTE holding swap entry S1, and S1 is not in the swap cache: CPU1 CPU2 userfaultfd_move move_pages_pte() entry = pte_to_swp_entry(orig_src_pte); // Here it got entry = S1 ... < interrupted> ... <swapin src_pte, alloc and use folio A> // folio A is a new allocated folio // and get installed into src_pte <frees swap entry S1> // src_pte now points to folio A, S1 // has swap count == 0, it can be freed // by folio_swap_swap or swap // allocator's reclaim. <try to swap out another folio B> // folio B is a folio in another VMA. <put folio B to swap cache using S1 > // S1 is freed, folio B can use it // for swap out with no problem. ... folio = filemap_get_folio(S1) // Got folio B here !!! ... < interrupted again> ... <swapin folio B and free S1> // Now S1 is free to be used again. <swapout src_pte & folio A using S1> // Now src_pte is a swap entry PTE // holding S1 again. folio_trylock(folio) move_swap_pte double_pt_lock is_pte_pages_stable // Check passed because src_pte == S1 folio_move_anon_rmap(...) // Moved invalid folio B here !!! The race window is very short and requires multiple collisions of multiple rare events, so it's very unlikely to happen, but with a deliberately constructed reproducer and increased time window, it can be reproduced easily. This can be fixed by checking if the folio returned by filemap is the valid swap cache folio after acquiring the folio lock. Another similar race is possible: filemap_get_folio may return NULL, but folio (A) could be swapped in and then swapped out again using the same swap entry after the lookup. In such a case, folio (A) may remain in the swap cache, so it must be moved too: CPU1 CPU2 userfaultfd_move move_pages_pte() entry = pte_to_swp_entry(orig_src_pte); // Here it got entry = S1, and S1 is not in swap cache folio = filemap_get_folio(S1) // Got NULL ... < interrupted again> ... <swapin folio A and free S1> <swapout folio A re-using S1> move_swap_pte double_pt_lock is_pte_pages_stable // Check passed because src_pte == S1 folio_move_anon_rmap(...) // folio A is ignored !!! Fix this by checking the swap cache again after acquiring the src_pte lock. And to avoid the filemap overhead, we check swap_map directly [2]. The SWP_SYNCHRONOUS_IO path does make the problem more complex, but so far we don't need to worry about that, since folios can only be exposed to the swap cache in the swap out path, and this is covered in this patch by checking the swap cache again after acquiring the src_pte lock. Testing with a simple C program that allocates and moves several GB of memory did not show any observable performance change. Link: https://lkml.kernel.org/r/20250604151038.21968-1-ryncsn@gmail.com Fixes: `adef440691` ("userfaultfd: UFFDIO_MOVE uABI") Signed-off-by: Kairui Song <kasong@tencent.com> Closes: https://lore.kernel.org/linux-mm/CAMgjq7B1K=6OOrK2OUZ0-tqCzi+EJt+2_K97TPGoSt=9+JwP7Q@mail.gmail.com/ [1] Link: https://lore.kernel.org/all/CAGsJ_4yJhJBo16XhiC-nUzSheyX-V3-nFE+tAi=8Y560K8eT=A@mail.gmail.com/ [2] Reviewed-by: Lokesh Gidra <lokeshgidra@google.com> Acked-by: Peter Xu <peterx@redhat.com> Reviewed-by: Suren Baghdasaryan <surenb@google.com> Reviewed-by: Barry Song <baohua@kernel.org> Reviewed-by: Chris Li <chrisl@kernel.org> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: David Hildenbrand <david@redhat.com> Cc: Kairui Song <kasong@tencent.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> (cherry picked from commit `0ea148a799`) [ lokeshgidra: resolved merged conflict caused by the difference in move_swap_pte() arguments ] Signed-off-by: Lokesh Gidra <lokeshgidra@google.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2025-07-10 16:05:14 +02:00
Jeongjun Park	ead91de35d	mm/vmalloc: fix data race in show_numa_info() commit `5c5f0468d1` upstream. The following data-race was found in show_numa_info(): ================================================================== BUG: KCSAN: data-race in vmalloc_info_show / vmalloc_info_show read to 0xffff88800971fe30 of 4 bytes by task 8289 on cpu 0: show_numa_info mm/vmalloc.c:4936 [inline] vmalloc_info_show+0x5a8/0x7e0 mm/vmalloc.c:5016 seq_read_iter+0x373/0xb40 fs/seq_file.c:230 proc_reg_read_iter+0x11e/0x170 fs/proc/inode.c:299 .... write to 0xffff88800971fe30 of 4 bytes by task 8287 on cpu 1: show_numa_info mm/vmalloc.c:4934 [inline] vmalloc_info_show+0x38f/0x7e0 mm/vmalloc.c:5016 seq_read_iter+0x373/0xb40 fs/seq_file.c:230 proc_reg_read_iter+0x11e/0x170 fs/proc/inode.c:299 .... value changed: 0x0000008f -> 0x00000000 ================================================================== According to this report,there is a read/write data-race because m->private is accessible to multiple CPUs. To fix this, instead of allocating the heap in proc_vmalloc_init() and passing the heap address to m->private, vmalloc_info_show() should allocate the heap. Link: https://lkml.kernel.org/r/20250508165620.15321-1-aha310510@gmail.com Fixes: `8e1d743f2c` ("mm: vmalloc: support multiple nodes in vmallocinfo") Signed-off-by: Jeongjun Park <aha310510@gmail.com> Suggested-by: Eric Dumazet <edumazet@google.com> Suggested-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: "Uladzislau Rezki (Sony)" <urezki@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2025-07-10 16:05:13 +02:00
Madhavan Srinivasan	679bf9a0cc	powerpc/kernel: Fix ppc_save_regs inclusion in build commit `93bd4a80ef` upstream. Recent patch fixed an old commit 'fc2a5a6161a2 ("powerpc/64s: ppc_save_regs is now needed for all 64s builds")' which is to include building of ppc_save_reg.c only when XMON and KEXEC_CORE and PPC_BOOK3S are enabled. This was valid, since ppc_save_regs was called only in replay_system_reset() of old irq.c which was under BOOK3S. But there has been multiple refactoring of irq.c and have added call to ppc_save_regs() from __replay_soft_interrupts -> replay_soft_interrupts which is part of irq_64.c included under CONFIG_PPC64. And since ppc_save_regs is called in CRASH_DUMP path as part of crash_setup_regs in kexec.h, CONFIG_PPC32 also needs it. So with this recent patch which enabled the building of ppc_save_regs.c caused a build break when none of these (XMON, KEXEC_CORE, BOOK3S) where enabled as part of config. Patch to enable building of ppc_save_regs.c by defaults. Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com> Link: https://patch.msgid.link/20250511041111.841158-1-maddy@linux.ibm.com Cc: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2025-07-10 16:05:13 +02:00
Andrei Kuchynski	c782f98eef	usb: typec: displayport: Fix potential deadlock commit `099cf1fbb8` upstream. The deadlock can occur due to a recursive lock acquisition of `cros_typec_altmode_data::mutex`. The call chain is as follows: 1. cros_typec_altmode_work() acquires the mutex 2. typec_altmode_vdm() -> dp_altmode_vdm() -> 3. typec_altmode_exit() -> cros_typec_altmode_exit() 4. cros_typec_altmode_exit() attempts to acquire the mutex again To prevent this, defer the `typec_altmode_exit()` call by scheduling it rather than calling it directly from within the mutex-protected context. Cc: stable <stable@kernel.org> Fixes: `b4b38ffb38` ("usb: typec: displayport: Receive DP Status Update NAK request exit dp altmode") Signed-off-by: Andrei Kuchynski <akuchynski@chromium.org> Reviewed-by: Heikki Krogerus <heikki.krogerus@linux.intel.com> Link: https://lore.kernel.org/r/20250624133246.3936737-1-akuchynski@chromium.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2025-07-10 16:05:13 +02:00
Kurt Borja	f65ad436e4	platform/x86: think-lmi: Fix sysfs group cleanup commit `4f30f946f2` upstream. Many error paths in tlmi_sysfs_init() lead to sysfs groups being removed when they were not even created. Fix this by letting the kobject core manage these groups through their kobj_type's defult_groups. Fixes: `a40cd7ef22` ("platform/x86: think-lmi: Add WMI interface support on Lenovo platforms") Cc: stable@vger.kernel.org Reviewed-by: Mark Pearson <mpearson-lenovo@squebb.ca> Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Kurt Borja <kuurtb@gmail.com> Link: https://lore.kernel.org/r/20250630-lmi-fix-v3-3-ce4f81c9c481@gmail.com Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2025-07-10 16:05:13 +02:00
Kurt Borja	5805edbea5	platform/x86: think-lmi: Fix kobject cleanup commit `9110056fe1` upstream. In tlmi_analyze(), allocated structs with an embedded kobject are freed in error paths after the they were already initialized. Fix this by first by avoiding the initialization of kobjects in tlmi_analyze() and then by correctly cleaning them up in tlmi_release_attr() using their kset's kobject list. Fixes: `a40cd7ef22` ("platform/x86: think-lmi: Add WMI interface support on Lenovo platforms") Fixes: `30e78435d3` ("platform/x86: think-lmi: Split kobject_init() and kobject_add() calls") Cc: stable@vger.kernel.org Reviewed-by: Mark Pearson <mpearson-lenovo@squebb.ca> Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Kurt Borja <kuurtb@gmail.com> Link: https://lore.kernel.org/r/20250630-lmi-fix-v3-2-ce4f81c9c481@gmail.com Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2025-07-10 16:05:13 +02:00
Kurt Borja	b11397bf9a	platform/x86: think-lmi: Create ksets consecutively commit `8dab34ca77` upstream. Avoid entering tlmi_release_attr() in error paths if both ksets are not yet created. This is accomplished by initializing them side by side. Reviewed-by: Mark Pearson <mpearson-lenovo@squebb.ca> Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Cc: stable@vger.kernel.org Signed-off-by: Kurt Borja <kuurtb@gmail.com> Link: https://lore.kernel.org/r/20250630-lmi-fix-v3-1-ce4f81c9c481@gmail.com Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2025-07-10 16:05:13 +02:00
Vivian Wang	f5fe094f35	riscv: cpu_ops_sbi: Use static array for boot_data commit `2b29be967a` upstream. Since commit `6b9f29b81b` ("riscv: Enable pcpu page first chunk allocator"), if NUMA is enabled, the page percpu allocator may be used on very sparse configurations, or when requested on boot with percpu_alloc=page. In that case, percpu data gets put in the vmalloc area. However, sbi_hsm_hart_start() needs the physical address of a sbi_hart_boot_data, and simply assumes that __pa() would work. This causes the just started hart to immediately access an invalid address and hang. Fortunately, struct sbi_hart_boot_data is not too large, so we can simply allocate an array for boot_data statically, putting it in the kernel image. This fixes NUMA=y SMP boot on Sophgo SG2042. To reproduce on QEMU: Set CONFIG_NUMA=y and CONFIG_DEBUG_VIRTUAL=y, then run with: qemu-system-riscv64 -M virt -smp 2 -nographic \ -kernel arch/riscv/boot/Image \ -append "percpu_alloc=page" Kernel output: [ 0.000000] Booting Linux on hartid 0 [ 0.000000] Linux version 6.16.0-rc1 (dram@sakuya) (riscv64-unknown-linux-gnu-gcc (GCC) 14.2.1 20250322, GNU ld (GNU Binutils) 2.44) #11 SMP Tue Jun 24 14:56:22 CST 2025 ... [ 0.000000] percpu: 28 4K pages/cpu s85784 r8192 d20712 ... [ 0.083192] smp: Bringing up secondary CPUs ... [ 0.086722] ------------[ cut here ]------------ [ 0.086849] virt_to_phys used for non-linear address: (____ptrval____) (0xff2000000001d080) [ 0.088001] WARNING: CPU: 0 PID: 1 at arch/riscv/mm/physaddr.c:14 __virt_to_phys+0xae/0xe8 [ 0.088376] Modules linked in: [ 0.088656] CPU: 0 UID: 0 PID: 1 Comm: swapper/0 Not tainted 6.16.0-rc1 #11 NONE [ 0.088833] Hardware name: riscv-virtio,qemu (DT) [ 0.088948] epc : __virt_to_phys+0xae/0xe8 [ 0.089001] ra : __virt_to_phys+0xae/0xe8 [ 0.089037] epc : ffffffff80021eaa ra : ffffffff80021eaa sp : ff2000000004bbc0 [ 0.089057] gp : ffffffff817f49c0 tp : ff60000001d60000 t0 : 5f6f745f74726976 [ 0.089076] t1 : 0000000000000076 t2 : 705f6f745f747269 s0 : ff2000000004bbe0 [ 0.089095] s1 : ff2000000001d080 a0 : 0000000000000000 a1 : 0000000000000000 [ 0.089113] a2 : 0000000000000000 a3 : 0000000000000000 a4 : 0000000000000000 [ 0.089131] a5 : 0000000000000000 a6 : 0000000000000000 a7 : 0000000000000000 [ 0.089155] s2 : ffffffff8130dc00 s3 : 0000000000000001 s4 : 0000000000000001 [ 0.089174] s5 : ffffffff8185eff8 s6 : ff2000007f1eb000 s7 : ffffffff8002a2ec [ 0.089193] s8 : 0000000000000001 s9 : 0000000000000001 s10: 0000000000000000 [ 0.089211] s11: 0000000000000000 t3 : ffffffff8180a9f7 t4 : ffffffff8180a9f7 [ 0.089960] t5 : ffffffff8180a9f8 t6 : ff2000000004b9d8 [ 0.089984] status: 0000000200000120 badaddr: ffffffff80021eaa cause: 0000000000000003 [ 0.090101] [<ffffffff80021eaa>] __virt_to_phys+0xae/0xe8 [ 0.090228] [<ffffffff8001d796>] sbi_cpu_start+0x6e/0xe8 [ 0.090247] [<ffffffff8001a5da>] __cpu_up+0x1e/0x8c [ 0.090260] [<ffffffff8002a32e>] bringup_cpu+0x42/0x258 [ 0.090277] [<ffffffff8002914c>] cpuhp_invoke_callback+0xe0/0x40c [ 0.090292] [<ffffffff800294e0>] __cpuhp_invoke_callback_range+0x68/0xfc [ 0.090320] [<ffffffff8002a96a>] _cpu_up+0x11a/0x244 [ 0.090334] [<ffffffff8002aae6>] cpu_up+0x52/0x90 [ 0.090384] [<ffffffff80c09350>] bringup_nonboot_cpus+0x78/0x118 [ 0.090411] [<ffffffff80c11060>] smp_init+0x34/0xb8 [ 0.090425] [<ffffffff80c01220>] kernel_init_freeable+0x148/0x2e4 [ 0.090442] [<ffffffff80b83802>] kernel_init+0x1e/0x14c [ 0.090455] [<ffffffff800124ca>] ret_from_fork_kernel+0xe/0xf0 [ 0.090471] [<ffffffff80b8d9c2>] ret_from_fork_kernel_asm+0x16/0x18 [ 0.090560] ---[ end trace 0000000000000000 ]--- [ 1.179875] CPU1: failed to come online [ 1.190324] smp: Brought up 1 node, 1 CPU Cc: stable@vger.kernel.org Reported-by: Han Gao <rabenda.cn@gmail.com> Fixes: `6b9f29b81b` ("riscv: Enable pcpu page first chunk allocator") Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com> Tested-by: Alexandre Ghiti <alexghiti@rivosinc.com> Signed-off-by: Vivian Wang <wangruikang@iscas.ac.cn> Link: https://lore.kernel.org/r/20250624-riscv-hsm-boot-data-array-v1-1-50b5eeafbe61@iscas.ac.cn Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2025-07-10 16:05:13 +02:00
Zhang Rui	d8ca2036f3	powercap: intel_rapl: Do not change CLAMPING bit if ENABLE bit cannot be changed commit `964209202e` upstream. PL1 cannot be disabled on some platforms. The ENABLE bit is still set after software clears it. This behavior leads to a scenario where, upon user request to disable the Power Limit through the powercap sysfs, the ENABLE bit remains set while the CLAMPING bit is inadvertently cleared. According to the Intel Software Developer's Manual, the CLAMPING bit, "When set, allows the processor to go below the OS requested P states in order to maintain the power below specified Platform Power Limit value." Thus this means the system may operate at higher power levels than intended on such platforms. Enhance the code to check ENABLE bit after writing to it, and stop further processing if ENABLE bit cannot be changed. Reported-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Fixes: `2d281d8196` ("PowerCap: Introduce Intel RAPL power capping driver") Cc: All applicable <stable@vger.kernel.org> Signed-off-by: Zhang Rui <rui.zhang@intel.com> Link: https://patch.msgid.link/20250619071340.384782-1-rui.zhang@intel.com [ rjw: Use str_enabled_disabled() instead of open-coded equivalent ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2025-07-10 16:05:12 +02:00
Simon Xue	53892dc686	iommu/rockchip: prevent iommus dead loop when two masters share one IOMMU commit `62e062a29a` upstream. When two masters share an IOMMU, calling ops->of_xlate during the second master's driver init may overwrite iommu->domain set by the first. This causes the check if (iommu->domain == domain) in rk_iommu_attach_device() to fail, resulting in the same iommu->node being added twice to &rk_domain->iommus, which can lead to an infinite loop in subsequent &rk_domain->iommus operations. Cc: <stable@vger.kernel.org> Fixes: `25c2325575` ("iommu/rockchip: Add missing set_platform_dma_ops callback") Signed-off-by: Simon Xue <xxm@rock-chips.com> Reviewed-by: Robin Murphy <robin.murphy@arm.com> Link: https://lore.kernel.org/r/20250623020018.584802-1-xxm@rock-chips.com Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2025-07-10 16:05:12 +02:00
Jens Wiklander	5f28563f0c	optee: ffa: fix sleep in atomic context commit `312d02adb9` upstream. The OP-TEE driver registers the function notif_callback() for FF-A notifications. However, this function is called in an atomic context leading to errors like this when processing asynchronous notifications: \| BUG: sleeping function called from invalid context at kernel/locking/mutex.c:258 \| in_atomic(): 1, irqs_disabled(): 1, non_block: 0, pid: 9, name: kworker/0:0 \| preempt_count: 1, expected: 0 \| RCU nest depth: 0, expected: 0 \| CPU: 0 UID: 0 PID: 9 Comm: kworker/0:0 Not tainted 6.14.0-00019-g657536ebe0aa #13 \| Hardware name: linux,dummy-virt (DT) \| Workqueue: ffa_pcpu_irq_notification notif_pcpu_irq_work_fn \| Call trace: \| show_stack+0x18/0x24 (C) \| dump_stack_lvl+0x78/0x90 \| dump_stack+0x18/0x24 \| __might_resched+0x114/0x170 \| __might_sleep+0x48/0x98 \| mutex_lock+0x24/0x80 \| optee_get_msg_arg+0x7c/0x21c \| simple_call_with_arg+0x50/0xc0 \| optee_do_bottom_half+0x14/0x20 \| notif_callback+0x3c/0x48 \| handle_notif_callbacks+0x9c/0xe0 \| notif_get_and_handle+0x40/0x88 \| generic_exec_single+0x80/0xc0 \| smp_call_function_single+0xfc/0x1a0 \| notif_pcpu_irq_work_fn+0x2c/0x38 \| process_one_work+0x14c/0x2b4 \| worker_thread+0x2e4/0x3e0 \| kthread+0x13c/0x210 \| ret_from_fork+0x10/0x20 Fix this by adding work queue to process the notification in a non-atomic context. Fixes: `d0476a59de` ("optee: ffa_abi: add asynchronous notifications") Cc: stable@vger.kernel.org Reviewed-by: Sumit Garg <sumit.garg@oss.qualcomm.com> Tested-by: Sudeep Holla <sudeep.holla@arm.com> Link: https://lore.kernel.org/r/20250602120452.2507084-1-jens.wiklander@linaro.org Signed-off-by: Jens Wiklander <jens.wiklander@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2025-07-10 16:05:12 +02:00
Oliver Neukum	ccdc472b4d	Logitech C-270 even more broken commit `cee4392a57` upstream. Some varieties of this device don't work with RESET_RESUME alone. Signed-off-by: Oliver Neukum <oneukum@suse.com> Cc: stable <stable@kernel.org> Link: https://lore.kernel.org/r/20250605122852.1440382-1-oneukum@suse.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2025-07-10 16:05:12 +02:00
Michael J. Ruhl	4c37963d67	i2c/designware: Fix an initialization issue commit `3d30048958` upstream. The i2c_dw_xfer_init() function requires msgs and msg_write_idx from the dev context to be initialized. amd_i2c_dw_xfer_quirk() inits msgs and msgs_num, but not msg_write_idx. This could allow an out of bounds access (of msgs). Initialize msg_write_idx before calling i2c_dw_xfer_init(). Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Fixes: `17631e8ca2` ("i2c: designware: Add driver support for AMD NAVI GPU") Cc: <stable@vger.kernel.org> # v5.13+ Signed-off-by: Michael J. Ruhl <michael.j.ruhl@intel.com> Signed-off-by: Andi Shyti <andi.shyti@kernel.org> Link: https://lore.kernel.org/r/20250627143511.489570-1-michael.j.ruhl@intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2025-07-10 16:05:12 +02:00
Christian König	c745744a82	dma-buf: fix timeout handling in dma_resv_wait_timeout v2 commit `2b95a7db6e` upstream. Even the kerneldoc says that with a zero timeout the function should not wait for anything, but still return 1 to indicate that the fences are signaled now. Unfortunately that isn't what was implemented, instead of only returning 1 we also waited for at least one jiffies. Fix that by adjusting the handling to what the function is actually documented to do. v2: improve code readability Reported-by: Marek Olšák <marek.olsak@amd.com> Reported-by: Lucas Stach <l.stach@pengutronix.de> Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Lucas Stach <l.stach@pengutronix.de> Cc: <stable@vger.kernel.org> Link: https://lore.kernel.org/r/20250129105841.1806-1-christian.koenig@amd.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2025-07-10 16:05:12 +02:00
Shyam Prasad N	631f9de9a7	cifs: all initializations for tcon should happen in tcon_info_alloc commit `74ebd02163` upstream. Today, a few work structs inside tcon are initialized inside cifs_get_tcon and not in tcon_info_alloc. As a result, if a tcon is obtained from tcon_info_alloc, but not called as a part of cifs_get_tcon, we may trip over. Cc: <stable@vger.kernel.org> Signed-off-by: Shyam Prasad N <sprasad@microsoft.com> Reviewed-by: Paulo Alcantara (Red Hat) <pc@manguebit.org> Signed-off-by: Steve French <stfrench@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2025-07-10 16:05:12 +02:00
Philipp Kerling	7b02e09fc0	smb: client: fix readdir returning wrong type with POSIX extensions commit `b8f89cb723` upstream. When SMB 3.1.1 POSIX Extensions are negotiated, userspace applications using readdir() or getdents() calls without stat() on each individual file (such as a simple "ls" or "find") would misidentify file types and exhibit strange behavior such as not descending into directories. The reason for this behavior is an oversight in the cifs_posix_to_fattr conversion function. Instead of extracting the entry type for cf_dtype from the properly converted cf_mode field, it tries to extract the type from the PDU. While the wire representation of the entry mode is similar in structure to POSIX stat(), the assignments of the entry types are different. Applying the S_DT macro to cf_mode instead yields the correct result. This is also what the equivalent function smb311_posix_info_to_fattr in inode.c already does for stat() etc.; which is why "ls -l" would give the correct file type but "ls" would not (as identified by the colors). Cc: stable@vger.kernel.org Signed-off-by: Philipp Kerling <pkerling@casix.org> Signed-off-by: Steve French <stfrench@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2025-07-10 16:05:11 +02:00
Heikki Krogerus	7cb8750160	usb: acpi: fix device link removal commit `3b18405763` upstream. The device link to the USB4 host interface has to be removed manually since it's no longer auto removed. Fixes: `623dae3e70` ("usb: acpi: fix boot hang due to early incorrect 'tunneled' USB3 device links") Cc: stable <stable@kernel.org> Signed-off-by: Heikki Krogerus <heikki.krogerus@linux.intel.com> Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com> Link: https://lore.kernel.org/r/20250611111415.2707865-1-heikki.krogerus@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2025-07-10 16:05:11 +02:00
Xu Yang	c68a27bbeb	usb: chipidea: udc: disconnect/reconnect from host when do suspend/resume commit `31a6afbe86` upstream. Shawn and John reported a hang issue during system suspend as below: - USB gadget is enabled as Ethernet - There is data transfer over USB Ethernet (scp a big file between host and device) - Device is going in/out suspend (echo mem > /sys/power/state) The root cause is the USB device controller is suspended but the USB bus is still active which caused the USB host continues to transfer data with device and the device continues to queue USB requests (in this case, a delayed TCP ACK packet trigger the issue) after controller is suspended, however the USB controller clock is already gated off. Then if udc driver access registers after that point, the system will hang. The correct way to avoid such issue is to disconnect device from host when the USB bus is not at suspend state. Then the host will receive disconnect event and stop data transfer in time. To continue make USB gadget device work after system resume, this will reconnect device automatically. To make usb wakeup work if USB bus is already at suspend state, this will keep connection for it only when USB device controller has enabled wakeup capability. Reported-by: Shawn Guo <shawnguo@kernel.org> Reported-by: John Ernberg <john.ernberg@actia.se> Closes: https://lore.kernel.org/linux-usb/aEZxmlHmjeWcXiF3@dragon/ Tested-by: John Ernberg <john.ernberg@actia.se> # iMX8QXP Fixes: `235ffc17d0` ("usb: chipidea: udc: add suspend/resume support for device controller") Cc: stable <stable@kernel.org> Reviewed-by: Jun Li <jun.li@nxp.com> Signed-off-by: Xu Yang <xu.yang_2@nxp.com> Acked-by: Peter Chen <peter.chen@kernel.org> Link: https://lore.kernel.org/r/20250614124914.207540-1-xu.yang_2@nxp.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2025-07-10 16:05:11 +02:00
Kuen-Han Tsai	3b1407caac	usb: dwc3: Abort suspend on soft disconnect failure commit `630a1dec3b` upstream. When dwc3_gadget_soft_disconnect() fails, dwc3_suspend_common() keeps going with the suspend, resulting in a period where the power domain is off, but the gadget driver remains connected. Within this time frame, invoking vbus_event_work() will cause an error as it attempts to access DWC3 registers for endpoint disabling after the power domain has been completely shut down. Abort the suspend sequence when dwc3_gadget_suspend() cannot halt the controller and proceeds with a soft connect. Fixes: `9f8a67b65a` ("usb: dwc3: gadget: fix gadget suspend/resume") Cc: stable <stable@kernel.org> Acked-by: Thinh Nguyen <Thinh.Nguyen@synopsys.com> Signed-off-by: Kuen-Han Tsai <khtsai@google.com> Link: https://lore.kernel.org/r/20250528100315.2162699-1-khtsai@google.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2025-07-10 16:05:11 +02:00
Pawel Laszczak	27199ab790	usb: cdnsp: Fix issue with CV Bad Descriptor test commit `2831a81077` upstream. The SSP2 controller has extra endpoint state preserve bit (ESP) which setting causes that endpoint state will be preserved during Halt Endpoint command. It is used only for EP0. Without this bit the Command Verifier "TD 9.10 Bad Descriptor Test" failed. Setting this bit doesn't have any impact for SSP controller. Fixes: `3d82904559` ("usb: cdnsp: cdns3 Add main part of Cadence USBSSP DRD Driver") Cc: stable <stable@kernel.org> Signed-off-by: Pawel Laszczak <pawell@cadence.com> Acked-by: Peter Chen <peter.chen@kernel.org> Link: https://lore.kernel.org/r/PH7PR07MB95382CCD50549DABAEFD6156DD7CA@PH7PR07MB9538.namprd07.prod.outlook.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2025-07-10 16:05:11 +02:00
Peter Chen	b68e355a61	usb: cdnsp: do not disable slot for disabled slot commit `7e2c421ef8` upstream. It doesn't need to do it, and the related command event returns 'Slot Not Enabled Error' status. Fixes: `3d82904559` ("usb: cdnsp: cdns3 Add main part of Cadence USBSSP DRD Driver") Cc: stable <stable@kernel.org> Suggested-by: Hongliang Yang <hongliang.yang@cixtech.com> Reviewed-by: Fugang Duan <fugang.duan@cixtech.com> Signed-off-by: Peter Chen <peter.chen@cixtech.com> Link: https://lore.kernel.org/r/20250619013413.35817-1-peter.chen@cixtech.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2025-07-10 16:05:11 +02:00
Jeff LaBundy	46f7589281	Input: iqs7222 - explicitly define number of external channels commit `63f4970a12` upstream. The number of external channels is assumed to be a multiple of 10, but this is not the case for IQS7222D. As a result, some CRx pins are wrongly prevented from being assigned to some channels. Address this problem by explicitly defining the number of external channels for cases in which the number of external channels is not equal to the total number of available channels. Fixes: `dd24e202ac` ("Input: iqs7222 - add support for Azoteq IQS7222D") Signed-off-by: Jeff LaBundy <jeff@labundy.com> Link: https://lore.kernel.org/r/aGHVf6HkyFZrzTPy@nixie71 Cc: stable@vger.kernel.org Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2025-07-10 16:05:10 +02:00
Nilton Perim Neto	dbdd2a2320	Input: xpad - support Acer NGR 200 Controller commit `22c69d786e` upstream. Add the NGR 200 Xbox 360 to the list of recognized controllers. Signed-off-by: Nilton Perim Neto <niltonperimneto@gmail.com> Link: https://lore.kernel.org/r/20250608060517.14967-1-niltonperimneto@gmail.com Cc: stable@vger.kernel.org Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2025-07-10 16:05:10 +02:00
Hongyu Xie	195597e0be	xhci: Disable stream for xHC controller with XHCI_BROKEN_STREAMS commit `cd65ee8124` upstream. Disable stream for platform xHC controller with broken stream. Fixes: `14aec58932` ("storage: accept some UAS devices if streams are unavailable") Cc: stable <stable@kernel.org> Signed-off-by: Hongyu Xie <xiehongyu1@kylinos.cn> Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Link: https://lore.kernel.org/r/20250627144127.3889714-3-mathias.nyman@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2025-07-10 16:05:10 +02:00
Mathias Nyman	8bfd11dae3	xhci: dbc: Flush queued requests before stopping dbc commit `efe3e3ae5a` upstream. Flush dbc requests when dbc is stopped and transfer rings are freed. Failure to flush them lead to leaking memory and dbc completing odd requests after resuming from suspend, leading to error messages such as: [ 95.344392] xhci_hcd 0000:00:0d.0: no matched request Cc: stable <stable@kernel.org> Fixes: `dfba2174dc` ("usb: xhci: Add DbC support in xHCI driver") Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Link: https://lore.kernel.org/r/20250627144127.3889714-5-mathias.nyman@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2025-07-10 16:05:10 +02:00
Łukasz Bartosik	9f3b2e497d	xhci: dbctty: disable ECHO flag by default commit `2b857d69a5` upstream. When /dev/ttyDBC0 device is created then by default ECHO flag is set for the terminal device. However if data arrives from a peer before application using /dev/ttyDBC0 applies its set of terminal flags then the arriving data will be echoed which might not be desired behavior. Fixes: `4521f16139` ("xhci: dbctty: split dbc tty driver registration and unregistration functions.") Cc: stable <stable@kernel.org> Signed-off-by: Łukasz Bartosik <ukaszb@chromium.org> Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Link: https://lore.kernel.org/stable/20250610111802.18742-1-ukaszb%40chromium.org Link: https://lore.kernel.org/r/20250627144127.3889714-4-mathias.nyman@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2025-07-10 16:05:10 +02:00
Raju Rangoju	fbebc2254a	usb: xhci: quirk for data loss in ISOC transfers commit `cbc889ab01` upstream. During the High-Speed Isochronous Audio transfers, xHCI controller on certain AMD platforms experiences momentary data loss. This results in Missed Service Errors (MSE) being generated by the xHCI. The root cause of the MSE is attributed to the ISOC OUT endpoint being omitted from scheduling. This can happen when an IN endpoint with a 64ms service interval either is pre-scheduled prior to the ISOC OUT endpoint or the interval of the ISOC OUT endpoint is shorter than that of the IN endpoint. Consequently, the OUT service is neglected when an IN endpoint with a service interval exceeding 32ms is scheduled concurrently (every 64ms in this scenario). This issue is particularly seen on certain older AMD platforms. To mitigate this problem, it is recommended to adjust the service interval of the IN endpoint to not exceed 32ms (interval 8). This adjustment ensures that the OUT endpoint will not be bypassed, even if a smaller interval value is utilized. Cc: stable <stable@kernel.org> Signed-off-by: Raju Rangoju <Raju.Rangoju@amd.com> Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Link: https://lore.kernel.org/r/20250627144127.3889714-2-mathias.nyman@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2025-07-10 16:05:10 +02:00
Roy Luo	9f75893189	Revert "usb: xhci: Implement xhci_handshake_check_state() helper" commit `7aed15379d` upstream. This reverts commit `6ccb83d6c4`. Commit `6ccb83d6c4` ("usb: xhci: Implement xhci_handshake_check_state() helper") was introduced to workaround watchdog timeout issues on some platforms, allowing xhci_reset() to bail out early without waiting for the reset to complete. Skipping the xhci handshake during a reset is a dangerous move. The xhci specification explicitly states that certain registers cannot be accessed during reset in section 5.4.1 USB Command Register (USBCMD), Host Controller Reset (HCRST) field: "This bit is cleared to '0' by the Host Controller when the reset process is complete. Software cannot terminate the reset process early by writinga '0' to this bit and shall not write any xHC Operational or Runtime registers until while HCRST is '1'." This behavior causes a regression on SNPS DWC3 USB controller with dual-role capability. When the DWC3 controller exits host mode and removes xhci while a reset is still in progress, and then tries to configure its hardware for device mode, the ongoing reset leads to register access issues; specifically, all register reads returns 0. These issues extend beyond the xhci register space (which is expected during a reset) and affect the entire DWC3 IP block, causing the DWC3 device mode to malfunction. Cc: stable <stable@kernel.org> Fixes: `6ccb83d6c4` ("usb: xhci: Implement xhci_handshake_check_state() helper") Signed-off-by: Roy Luo <royluo@google.com> Link: https://lore.kernel.org/r/20250522190912.457583-3-royluo@google.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2025-07-10 16:05:10 +02:00
Roy Luo	8caccd2eac	usb: xhci: Skip xhci_reset in xhci_resume if xhci is being removed commit `3eff494f6e` upstream. xhci_reset() currently returns -ENODEV if XHCI_STATE_REMOVING is set, without completing the xhci handshake, unless the reset completes exceptionally quickly. This behavior causes a regression on Synopsys DWC3 USB controllers with dual-role capabilities. Specifically, when a DWC3 controller exits host mode and removes xhci while a reset is still in progress, and then attempts to configure its hardware for device mode, the ongoing, incomplete reset leads to critical register access issues. All register reads return zero, not just within the xHCI register space (which might be expected during a reset), but across the entire DWC3 IP block. This patch addresses the issue by preventing xhci_reset() from being called in xhci_resume() and bailing out early in the reinit flow when XHCI_STATE_REMOVING is set. Cc: stable <stable@kernel.org> Fixes: `6ccb83d6c4` ("usb: xhci: Implement xhci_handshake_check_state() helper") Suggested-by: Mathias Nyman <mathias.nyman@intel.com> Signed-off-by: Roy Luo <royluo@google.com> Link: https://lore.kernel.org/r/20250522190912.457583-2-royluo@google.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2025-07-10 16:05:09 +02:00
Trond Myklebust	1a81dfc9d1	NFSv4/flexfiles: Fix handling of NFS level errors in I/O [ Upstream commit `38074de35b` ] Allow the flexfiles error handling to recognise NFS level errors (as opposed to RPC level errors) and handle them separately. The main motivator is the NFSERR_PERM errors that get returned if the NFS client connects to the data server through a port number that is lower than 1024. In that case, the client should disconnect and retry a READ on a different data server, or it should retry a WRITE after reconnecting. Reviewed-by: Tigran Mkrtchyan <tigran.mkrtchyan@desy.de> Fixes: `d67ae825a5` ("pnfs/flexfiles: Add the FlexFile Layout Driver") Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2025-07-10 16:05:09 +02:00
Harry Austen	5e110e8679	drm/xe: Allow dropping kunit dependency as built-in [ Upstream commit `aa18d5769f` ] Fix Kconfig symbol dependency on KUNIT, which isn't actually required for XE to be built-in. However, if KUNIT is enabled, it must be built-in too. Fixes: `08987a8b68` ("drm/xe: Fix build with KUNIT=m") Cc: Lucas De Marchi <lucas.demarchi@intel.com> Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com> Cc: Jani Nikula <jani.nikula@linux.intel.com> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Signed-off-by: Harry Austen <hpausten@protonmail.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Acked-by: Randy Dunlap <rdunlap@infradead.org> Tested-by: Randy Dunlap <rdunlap@infradead.org> Link: https://lore.kernel.org/r/20250627-xe-kunit-v2-2-756fe5cd56cf@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> (cherry picked from commit `a559434880`) Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2025-07-10 16:05:09 +02:00
Vinay Belgaumkar	994b0bc2a0	drm/xe/bmg: Update Wa_22019338487 [ Upstream commit `84c0b4a006` ] Limit GT max frequency to 2600MHz and wait for frequency to reduce before proceeding with a transient flush. This is really only needed for the transient flush: if L2 flush is needed due to 16023588340 then there's no need to do this additional wait since we are already using the bigger hammer. v2: Use generic names, ensure user set max frequency requests wait for flush to complete (Rodrigo) v3: - User requests wait via wait_var_event_timeout (Lucas) - Close races on flush + user requests (Lucas) - Fix xe_guc_pc_remove_flush_freq_limit() being called on last gt rather than root gt (Lucas) v4: - Only apply the freq reducing part if a TDF is needed: L2 flush trumps the need for waiting a lower frequency Fixes: `aaa08078e7` ("drm/xe/bmg: Apply Wa_22019338487") Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Vinay Belgaumkar <vinay.belgaumkar@intel.com> Link: https://lore.kernel.org/r/20250618-wa-22019338487-v5-4-b888388477f2@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> (cherry picked from commit `deea6a7d6d`) Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2025-07-10 16:05:09 +02:00
Or Har-Toov	beb89ada57	IB/mlx5: Fix potential deadlock in MR deregistration [ Upstream commit `2ed25aa7f7` ] The issue arises when kzalloc() is invoked while holding umem_mutex or any other lock acquired under umem_mutex. This is problematic because kzalloc() can trigger fs_reclaim_aqcuire(), which may, in turn, invoke mmu_notifier_invalidate_range_start(). This function can lead to mlx5_ib_invalidate_range(), which attempts to acquire umem_mutex again, resulting in a deadlock. The problematic flow: CPU0 \| CPU1 ---------------------------------------\|------------------------------------------------ mlx5_ib_dereg_mr() \| → revoke_mr() \| → mutex_lock(&umem_odp->umem_mutex) \| \| mlx5_mkey_cache_init() \| → mutex_lock(&dev->cache.rb_lock) \| → mlx5r_cache_create_ent_locked() \| → kzalloc(GFP_KERNEL) \| → fs_reclaim() \| → mmu_notifier_invalidate_range_start() \| → mlx5_ib_invalidate_range() \| → mutex_lock(&umem_odp->umem_mutex) → cache_ent_find_and_store() \| → mutex_lock(&dev->cache.rb_lock) \| Additionally, when kzalloc() is called from within cache_ent_find_and_store(), we encounter the same deadlock due to re-acquisition of umem_mutex. Solve by releasing umem_mutex in dereg_mr() after umr_revoke_mr() and before acquiring rb_lock. This ensures that we don't hold umem_mutex while performing memory allocations that could trigger the reclaim path. This change prevents the deadlock by ensuring proper lock ordering and avoiding holding locks during memory allocation operations that could trigger the reclaim path. The following lockdep warning demonstrates the deadlock: python3/20557 is trying to acquire lock: ffff888387542128 (&umem_odp->umem_mutex){+.+.}-{4:4}, at: mlx5_ib_invalidate_range+0x5b/0x550 [mlx5_ib] but task is already holding lock: ffffffff82f6b840 (mmu_notifier_invalidate_range_start){+.+.}-{0:0}, at: unmap_vmas+0x7b/0x1a0 which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #3 (mmu_notifier_invalidate_range_start){+.+.}-{0:0}: fs_reclaim_acquire+0x60/0xd0 mem_cgroup_css_alloc+0x6f/0x9b0 cgroup_init_subsys+0xa4/0x240 cgroup_init+0x1c8/0x510 start_kernel+0x747/0x760 x86_64_start_reservations+0x25/0x30 x86_64_start_kernel+0x73/0x80 common_startup_64+0x129/0x138 -> #2 (fs_reclaim){+.+.}-{0:0}: fs_reclaim_acquire+0x91/0xd0 __kmalloc_cache_noprof+0x4d/0x4c0 mlx5r_cache_create_ent_locked+0x75/0x620 [mlx5_ib] mlx5_mkey_cache_init+0x186/0x360 [mlx5_ib] mlx5_ib_stage_post_ib_reg_umr_init+0x3c/0x60 [mlx5_ib] __mlx5_ib_add+0x4b/0x190 [mlx5_ib] mlx5r_probe+0xd9/0x320 [mlx5_ib] auxiliary_bus_probe+0x42/0x70 really_probe+0xdb/0x360 __driver_probe_device+0x8f/0x130 driver_probe_device+0x1f/0xb0 __driver_attach+0xd4/0x1f0 bus_for_each_dev+0x79/0xd0 bus_add_driver+0xf0/0x200 driver_register+0x6e/0xc0 __auxiliary_driver_register+0x6a/0xc0 do_one_initcall+0x5e/0x390 do_init_module+0x88/0x240 init_module_from_file+0x85/0xc0 idempotent_init_module+0x104/0x300 __x64_sys_finit_module+0x68/0xc0 do_syscall_64+0x6d/0x140 entry_SYSCALL_64_after_hwframe+0x4b/0x53 -> #1 (&dev->cache.rb_lock){+.+.}-{4:4}: __mutex_lock+0x98/0xf10 __mlx5_ib_dereg_mr+0x6f2/0x890 [mlx5_ib] mlx5_ib_dereg_mr+0x21/0x110 [mlx5_ib] ib_dereg_mr_user+0x85/0x1f0 [ib_core] uverbs_free_mr+0x19/0x30 [ib_uverbs] destroy_hw_idr_uobject+0x21/0x80 [ib_uverbs] uverbs_destroy_uobject+0x60/0x3d0 [ib_uverbs] uobj_destroy+0x57/0xa0 [ib_uverbs] ib_uverbs_cmd_verbs+0x4d5/0x1210 [ib_uverbs] ib_uverbs_ioctl+0x129/0x230 [ib_uverbs] __x64_sys_ioctl+0x596/0xaa0 do_syscall_64+0x6d/0x140 entry_SYSCALL_64_after_hwframe+0x4b/0x53 -> #0 (&umem_odp->umem_mutex){+.+.}-{4:4}: __lock_acquire+0x1826/0x2f00 lock_acquire+0xd3/0x2e0 __mutex_lock+0x98/0xf10 mlx5_ib_invalidate_range+0x5b/0x550 [mlx5_ib] __mmu_notifier_invalidate_range_start+0x18e/0x1f0 unmap_vmas+0x182/0x1a0 exit_mmap+0xf3/0x4a0 mmput+0x3a/0x100 do_exit+0x2b9/0xa90 do_group_exit+0x32/0xa0 get_signal+0xc32/0xcb0 arch_do_signal_or_restart+0x29/0x1d0 syscall_exit_to_user_mode+0x105/0x1d0 do_syscall_64+0x79/0x140 entry_SYSCALL_64_after_hwframe+0x4b/0x53 Chain exists of: &dev->cache.rb_lock --> mmu_notifier_invalidate_range_start --> &umem_odp->umem_mutex Possible unsafe locking scenario: CPU0 CPU1 ---- ---- lock(&umem_odp->umem_mutex); lock(mmu_notifier_invalidate_range_start); lock(&umem_odp->umem_mutex); lock(&dev->cache.rb_lock); * DEADLOCK * Fixes: `abb604a1a9` ("RDMA/mlx5: Fix a race for an ODP MR which leads to CQE with error") Signed-off-by: Or Har-Toov <ohartoov@nvidia.com> Reviewed-by: Michael Guralnik <michaelgur@nvidia.com> Link: https://patch.msgid.link/3c8f225a8a9fade647d19b014df1172544643e4a.1750061612.git.leon@kernel.org Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2025-07-10 16:05:09 +02:00
Michael Guralnik	f658855702	RDMA/mlx5: Fix cache entry update on dereg error [ Upstream commit `24d693cf6c` ] Fix double decrement of 'in_use' counter on push_mkey_locked() failure while deregistering an MR. If we fail to return an mkey to the cache in cache_ent_find_and_store() it'll update the 'in_use' counter. Its caller, revoke_mr(), also updates it, thus having double decrement. Wrong value of 'in_use' counter will be exposed through debugfs and can also cause wrong resizing of the cache when users try to set cache entry size using the 'size' debugfs. To address this issue, the 'in_use' counter is now decremented within mlx5_revoke_mr() also after a successful call to cache_ent_find_and_store() and not within cache_ent_find_and_store(). Other success or failure flows remains unchanged where it was also decremented. Fixes: `8c1185fef6` ("RDMA/mlx5: Change check for cacheable mkeys") Signed-off-by: Michael Guralnik <michaelgur@nvidia.com> Reviewed-by: Yishai Hadas <yishaih@nvidia.com> Link: https://patch.msgid.link/97e979dff636f232ff4c83ce709c17c727da1fdb.1741875692.git.leon@kernel.org Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2025-07-10 16:05:09 +02:00
Shivank Garg	f94c422157	fs: export anon_inode_make_secure_inode() and fix secretmem LSM bypass [ Upstream commit `cbe4134ea4` ] Export anon_inode_make_secure_inode() to allow KVM guest_memfd to create anonymous inodes with proper security context. This replaces the current pattern of calling alloc_anon_inode() followed by inode_init_security_anon() for creating security context manually. This change also fixes a security regression in secretmem where the S_PRIVATE flag was not cleared after alloc_anon_inode(), causing LSM/SELinux checks to be bypassed for secretmem file descriptors. As guest_memfd currently resides in the KVM module, we need to export this symbol for use outside the core kernel. In the future, guest_memfd might be moved to core-mm, at which point the symbols no longer would have to be exported. When/if that happens is still unclear. Fixes: `2bfe15c526` ("mm: create security context for memfd_secret inodes") Suggested-by: David Hildenbrand <david@redhat.com> Suggested-by: Mike Rapoport <rppt@kernel.org> Signed-off-by: Shivank Garg <shivankg@amd.com> Link: https://lore.kernel.org/20250620070328.803704-3-shivankg@amd.com Acked-by: "Mike Rapoport (Microsoft)" <rppt@kernel.org> Signed-off-by: Christian Brauner <brauner@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2025-07-10 16:05:09 +02:00
Peter Zijlstra	cdd9862252	module: Provide EXPORT_SYMBOL_GPL_FOR_MODULES() helper [ Upstream commit `707f853d7f` ] Helper macro to more easily limit the export of a symbol to a given list of modules. Eg: EXPORT_SYMBOL_GPL_FOR_MODULES(preempt_notifier_inc, "kvm"); will limit the use of said function to kvm.ko, any other module trying to use this symbol will refure to load (and get modpost build failures). Requested-by: Masahiro Yamada <masahiroy@kernel.org> Requested-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Reviewed-by: Petr Pavlu <petr.pavlu@suse.com> Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Stable-dep-of: `cbe4134ea4` ("fs: export anon_inode_make_secure_inode() and fix secretmem LSM bypass") Signed-off-by: Sasha Levin <sashal@kernel.org>	2025-07-10 16:05:08 +02:00
Al Viro	e036efbe58	add a string-to-qstr constructor [ Upstream commit `c1feab95e0` ] Quite a few places want to build a struct qstr by given string; it would be convenient to have a primitive doing that, rather than open-coding it via QSTR_INIT(). The closest approximation was in bcachefs, but that expands to initializer list - {.len = strlen(string), .name = string}. It would be more useful to have it as compound literal - (struct qstr){.len = strlen(string), .name = string}. Unlike initializer list it's a valid expression. What's more, it's a valid lvalue - it's an equivalent of anonymous local variable with such initializer, so the things like path->dentry = d_alloc_pseudo(mnt->mnt_sb, &QSTR(name)); are valid. It can also be used as initializer, with identical effect - struct qstr x = (struct qstr){.name = s, .len = strlen(s)}; is equivalent to struct qstr anon_variable = {.name = s, .len = strlen(s)}; struct qstr x = anon_variable; // anon_variable is never used after that point and any even remotely sane compiler will manage to collapse that into struct qstr x = {.name = s, .len = strlen(s)}; What compound literals can't be used for is initialization of global variables, but those are covered by QSTR_INIT(). This commit lifts definition(s) of QSTR() into linux/dcache.h, converts it to compound literal (all bcachefs users are fine with that) and converts assorted open-coded instances to using that. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Stable-dep-of: `cbe4134ea4` ("fs: export anon_inode_make_secure_inode() and fix secretmem LSM bypass") Signed-off-by: Sasha Levin <sashal@kernel.org>	2025-07-10 16:05:08 +02:00
Uladzislau Rezki (Sony)	42c5a4b47d	rcu: Return early if callback is not specified [ Upstream commit `33b6a1f155` ] Currently the call_rcu() API does not check whether a callback pointer is NULL. If NULL is passed, rcu_core() will try to invoke it, resulting in NULL pointer dereference and a kernel crash. To prevent this and improve debuggability, this patch adds a check for NULL and emits a kernel stack trace to help identify a faulty caller. Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com> Reviewed-by: Joel Fernandes <joelagnelf@nvidia.com> Signed-off-by: Joel Fernandes <joelagnelf@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2025-07-10 16:05:08 +02:00
Pablo Martin-Gomez	c40b207caf	mtd: spinand: fix memory leak of ECC engine conf [ Upstream commit `6463cbe08b` ] Memory allocated for the ECC engine conf is not released during spinand cleanup. Below kmemleak trace is seen for this memory leak: unreferenced object 0xffffff80064f00e0 (size 8): comm "swapper/0", pid 1, jiffies 4294937458 hex dump (first 8 bytes): 00 00 00 00 00 00 00 00 ........ backtrace (crc 0): kmemleak_alloc+0x30/0x40 __kmalloc_cache_noprof+0x208/0x3c0 spinand_ondie_ecc_init_ctx+0x114/0x200 nand_ecc_init_ctx+0x70/0xa8 nanddev_ecc_engine_init+0xec/0x27c spinand_probe+0xa2c/0x1620 spi_mem_probe+0x130/0x21c spi_probe+0xf0/0x170 really_probe+0x17c/0x6e8 __driver_probe_device+0x17c/0x21c driver_probe_device+0x58/0x180 __device_attach_driver+0x15c/0x1f8 bus_for_each_drv+0xec/0x150 __device_attach+0x188/0x24c device_initial_probe+0x10/0x20 bus_probe_device+0x11c/0x160 Fix the leak by calling nanddev_ecc_engine_cleanup() inside spinand_cleanup(). Signed-off-by: Pablo Martin-Gomez <pmartin-gomez@freebox.fr> Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2025-07-10 16:05:08 +02:00
Rafael J. Wysocki	18ff4ed6a3	ACPICA: Refuse to evaluate a method if arguments are missing [ Upstream commit `6fcab27915` ] As reported in [1], a platform firmware update that increased the number of method parameters and forgot to update a least one of its callers, caused ACPICA to crash due to use-after-free. Since this a result of a clear AML issue that arguably cannot be fixed up by the interpreter (it cannot produce missing data out of thin air), address it by making ACPICA refuse to evaluate a method if the caller attempts to pass fewer arguments than expected to it. Closes: https://github.com/acpica/acpica/issues/1027 [1] Reported-by: Peter Williams <peter@newton.cx> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Hans de Goede <hansg@kernel.org> Tested-by: Hans de Goede <hansg@kernel.org> # Dell XPS 9640 with BIOS 1.12.0 Link: https://patch.msgid.link/5909446.DvuYhMxLoT@rjwysocki.net Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2025-07-10 16:05:08 +02:00
Johannes Berg	327997afbb	wifi: ath6kl: remove WARN on bad firmware input [ Upstream commit `e7417421d8` ] If the firmware gives bad input, that's nothing to do with the driver's stack at this point etc., so the WARN_ON() doesn't add any value. Additionally, this is one of the top syzbot reports now. Just print a message, and as an added bonus, print the sizes too. Reported-by: syzbot+92c6dd14aaa230be6855@syzkaller.appspotmail.com Tested-by: syzbot+92c6dd14aaa230be6855@syzkaller.appspotmail.com Acked-by: Jeff Johnson <jeff.johnson@oss.qualcomm.com> Link: https://patch.msgid.link/20250617114529.031a677a348e.I58bf1eb4ac16a82c546725ff010f3f0d2b0cca49@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2025-07-10 16:05:08 +02:00
Johannes Berg	1b10265639	wifi: mac80211: drop invalid source address OCB frames [ Upstream commit `d1b1a5eb27` ] In OCB, don't accept frames from invalid source addresses (and in particular don't try to create stations for them), drop the frames instead. Reported-by: syzbot+8b512026a7ec10dcbdd9@syzkaller.appspotmail.com Closes: https://lore.kernel.org/r/6788d2d9.050a0220.20d369.0028.GAE@google.com/ Signed-off-by: Johannes Berg <johannes.berg@intel.com> Tested-by: syzbot+8b512026a7ec10dcbdd9@syzkaller.appspotmail.com Link: https://patch.msgid.link/20250616171838.7433379cab5d.I47444d63c72a0bd58d2e2b67bb99e1fea37eec6f@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2025-07-10 16:05:07 +02:00

1 2 3 4 5 ...

1319709 Commits