linux-yocto/mm
David Hildenbrand d70ca21f7b mm/mremap: fix WARN with uffd that has remap events disabled
commit 772e5b4a5e8360743645b9a466842d16092c4f94 upstream.

Registering userfaultd on a VMA that spans at least one PMD and then
mremap()'ing that VMA can trigger a WARN when recovering from a failed
page table move due to a page table allocation error.

The code ends up doing the right thing (recurse, avoiding moving actual
page tables), but triggering that WARN is unpleasant:

WARNING: CPU: 2 PID: 6133 at mm/mremap.c:357 move_normal_pmd mm/mremap.c:357 [inline]
WARNING: CPU: 2 PID: 6133 at mm/mremap.c:357 move_pgt_entry mm/mremap.c:595 [inline]
WARNING: CPU: 2 PID: 6133 at mm/mremap.c:357 move_page_tables+0x3832/0x44a0 mm/mremap.c:852
Modules linked in:
CPU: 2 UID: 0 PID: 6133 Comm: syz.0.19 Not tainted 6.17.0-rc1-syzkaller-00004-g53e760d89498 #0 PREEMPT(full)
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2~bpo12+1 04/01/2014
RIP: 0010:move_normal_pmd mm/mremap.c:357 [inline]
RIP: 0010:move_pgt_entry mm/mremap.c:595 [inline]
RIP: 0010:move_page_tables+0x3832/0x44a0 mm/mremap.c:852
Code: ...
RSP: 0018:ffffc900037a76d8 EFLAGS: 00010293
RAX: 0000000000000000 RBX: 0000000032930007 RCX: ffffffff820c6645
RDX: ffff88802e56a440 RSI: ffffffff820c7201 RDI: 0000000000000007
RBP: ffff888037728fc0 R08: 0000000000000007 R09: 0000000000000000
R10: 0000000032930007 R11: 0000000000000000 R12: 0000000000000000
R13: ffffc900037a79a8 R14: 0000000000000001 R15: dffffc0000000000
FS:  000055556316a500(0000) GS:ffff8880d68bc000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000001b30863fff CR3: 0000000050171000 CR4: 0000000000352ef0
Call Trace:
 <TASK>
 copy_vma_and_data+0x468/0x790 mm/mremap.c:1215
 move_vma+0x548/0x1780 mm/mremap.c:1282
 mremap_to+0x1b7/0x450 mm/mremap.c:1406
 do_mremap+0xfad/0x1f80 mm/mremap.c:1921
 __do_sys_mremap+0x119/0x170 mm/mremap.c:1977
 do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
 do_syscall_64+0xcd/0x4c0 arch/x86/entry/syscall_64.c:94
 entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7f00d0b8ebe9
Code: ...
RSP: 002b:00007ffe5ea5ee98 EFLAGS: 00000246 ORIG_RAX: 0000000000000019
RAX: ffffffffffffffda RBX: 00007f00d0db5fa0 RCX: 00007f00d0b8ebe9
RDX: 0000000000400000 RSI: 0000000000c00000 RDI: 0000200000000000
RBP: 00007ffe5ea5eef0 R08: 0000200000c00000 R09: 0000000000000000
R10: 0000000000000003 R11: 0000000000000246 R12: 0000000000000002
R13: 00007f00d0db5fa0 R14: 00007f00d0db5fa0 R15: 0000000000000005
 </TASK>

The underlying issue is that we recurse during the original page table
move, but not during the recovery move.

Fix it by checking for both VMAs and performing the check before the
pmd_none() sanity check.

Add a new helper where we perform+document that check for the PMD and PUD
level.

Thanks to Harry for bisecting.

Link: https://lkml.kernel.org/r/20250818175358.1184757-1-david@redhat.com
Fixes: 0cef0bb836 ("mm: clear uffd-wp PTE/PMD state on mremap()")
Signed-off-by: David Hildenbrand <david@redhat.com>
Reported-by: syzbot+4d9a13f0797c46a29e42@syzkaller.appspotmail.com
Closes: https://lkml.kernel.org/r/689bb893.050a0220.7f033.013a.GAE@google.com
Tested-by: Harry Yoo <harry.yoo@oracle.com>
Cc: "Liam R. Howlett" <Liam.Howlett@oracle.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Jann Horn <jannh@google.com>
Cc: Pedro Falcato <pfalcato@suse.de>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-08-28 16:34:35 +02:00
..
damon mm/damon/core: fix damos_commit_filter not changing allow 2025-08-28 16:34:35 +02:00
kasan kasan/test: fix protection against compiler elision 2025-08-28 16:34:26 +02:00
kfence kfence: skip __GFP_THISNODE allocations on NUMA systems 2025-02-01 03:53:26 -08:00
kmsan kmsan: test: add module description 2025-06-05 22:02:25 -07:00
backing-dev.c treewide: Switch/rename to timer_delete[_sync]() 2025-04-05 10:30:12 +02:00
balloon_compaction.c balloon_compaction: update the NR_BALLOON_PAGES state 2025-03-21 22:03:13 -07:00
bootmem_info.c mm/sparse: allow for alternate vmemmap section init at boot 2025-03-16 22:06:27 -07:00
cma_debug.c mm, cma: support multiple contiguous ranges, if requested 2025-03-16 22:06:25 -07:00
cma_sysfs.c mm/cma: export total and free number of pages for CMA areas 2025-03-16 22:06:24 -07:00
cma.c - The 11 patch series "Add folio_mk_pte()" from Matthew Wilcox 2025-05-31 15:44:16 -07:00
cma.h mm: cma: set early_pfn and bitmap as a union in cma_memrange 2025-05-22 14:55:36 -07:00
compaction.c mm: page_alloc: tighten up find_suitable_fallback() 2025-05-11 17:48:18 -07:00
debug_page_alloc.c mm/debug_page_alloc: improve error message for invalid guardpage minorder 2025-05-12 23:50:38 -07:00
debug_page_ref.c
debug_vm_pgtable.c mm/debug_vm_pgtable: clear page table entries at destroy_args() 2025-08-28 16:34:35 +02:00
debug.c mm/debug: fix parameter passed to page_mapcount_is_type() 2025-05-11 17:48:17 -07:00
dmapool_test.c
dmapool.c dmapool: add NUMA affinity support 2025-05-20 05:34:27 +02:00
early_ioremap.c mm/early_ioremap: add null pointer checks to prevent NULL-pointer dereference 2025-01-13 22:40:59 -08:00
execmem.c Revert "mm/execmem: Unify early execmem_cache behaviour" 2025-06-11 11:20:52 +02:00
fadvise.c
fail_page_alloc.c
failslab.c
filemap.c readahead: fix return value of page_cache_next_miss() when no hole is found 2025-08-28 16:34:24 +02:00
folio-compat.c mm: Remove grab_cache_page_write_begin() 2025-03-04 17:02:25 +00:00
gup_test.c
gup_test.h
gup.c mm/gup: revert "mm: gup: fix infinite loop within __get_longterm_locked" 2025-06-19 20:48:01 -07:00
highmem.c
hmm.c mm/hmm: move pmd_to_hmm_pfn_flags() to the respective #ifdeffery 2025-08-15 16:39:35 +02:00
huge_memory.c mm/huge_memory: don't ignore queried cachemode in vmf_insert_pfn_pud() 2025-08-20 18:41:41 +02:00
hugetlb_cgroup.c page_counter: track failcnt only for legacy cgroups 2025-03-17 00:05:35 -07:00
hugetlb_cma.c mm/hugetlb: use separate nodemask for bootmem allocations 2025-05-12 23:50:35 -07:00
hugetlb_cma.h mm/hugetlb: move hugetlb CMA code in to its own file 2025-03-16 22:06:31 -07:00
hugetlb_vmemmap.c mm, hugetlb: increment the number of pages to be reset on HVO 2025-04-17 20:10:08 -07:00
hugetlb_vmemmap.h mm/hugetlb: do pre-HVO for bootmem allocated pages 2025-03-16 22:06:29 -07:00
hugetlb.c mm/hugetlb: don't crash when allocating a folio if there are no resv 2025-07-09 21:07:53 -07:00
hwpoison-inject.c
init-mm.c mm: replace vm_lock and detached flag with a reference count 2025-03-16 22:06:20 -07:00
internal.h mm: move dup_mmap() to mm 2025-05-12 23:50:48 -07:00
interval_tree.c
io-mapping.c mm/io-mapping: track_pfn() -> "pfnmap tracking" 2025-05-22 14:55:37 -07:00
ioremap.c mm/ioremap: pass pgprot_t to ioremap_prot() instead of unsigned long 2025-03-16 22:06:23 -07:00
Kconfig LoongArch changes for v6.16 2025-06-07 09:56:18 -07:00
Kconfig.debug mm: rename GENERIC_PTDUMP and PTDUMP_CORE 2025-03-17 00:05:32 -07:00
khugepaged.c mm/khugepaged: clean up refcount check using folio_expected_ref_count() 2025-05-31 22:46:16 -07:00
kmemleak.c mm/kmemleak: avoid deadlock by moving pr_warn() outside kmemleak_lock 2025-08-20 18:41:41 +02:00
ksm.c mm/ksm: fix -Wsometimes-uninitialized from clang-21 in advisor_mode_show() 2025-07-19 19:26:17 -07:00
list_lru.c mm/list_lru: make the case where mlru is NULL as unlikely 2025-03-17 00:05:32 -07:00
maccess.c maccess: fix strncpy_from_user_nofault() empty string handling 2025-05-11 17:54:10 -07:00
madvise.c mm: close theoretical race where stale TLB entries could linger 2025-06-11 22:42:35 -07:00
Makefile mm: perform VMA allocation, freeing, duplication in mm 2025-05-12 23:50:48 -07:00
mapping_dirty_helpers.c
memblock.c memblock: add KHO support for reserve_mem 2025-05-12 23:50:42 -07:00
memcontrol-v1.c memcg: make count_memcg_events re-entrant safe against irqs 2025-05-22 14:55:38 -07:00
memcontrol-v1.h memcg: move do_memsw_account() to CONFIG_MEMCG_V1 2025-03-21 22:03:11 -07:00
memcontrol.c Revert "sched/numa: add statistics of numa balance task" 2025-07-09 21:07:56 -07:00
memfd.c mm: move folio_index to mm/swap.h and remove no longer needed helper 2025-05-12 23:50:50 -07:00
memory_hotplug.c mm: use for_each_valid_pfn() in memory_hotplug 2025-05-12 23:50:44 -07:00
memory-failure.c mm/memory-failure: fix infinite UCE for VM_PFNMAP pfn 2025-08-28 16:34:35 +02:00
memory-tiers.c
memory.c mm/shmem, swap: fix softlockup with mTHP swapin 2025-06-19 20:48:01 -07:00
mempolicy.c mm/mempolicy: fix incorrect freeing of wi_kobj 2025-06-05 22:02:23 -07:00
mempool.c
memremap.c mm: introduce pfnmap_track() and pfnmap_untrack() and use them for memremap 2025-05-22 14:55:37 -07:00
memtest.c
migrate_device.c - The 6 patch series "Enable strict percpu address space checks" from 2025-04-01 09:29:18 -07:00
migrate.c mm/migrate: fix do_pages_stat in compat mode 2025-07-09 21:07:54 -07:00
mincore.c mm: mincore: use pte_batch_hint() to batch process large folios 2025-05-22 14:55:36 -07:00
mlock.c mm: allow compound zone device pages 2025-03-17 22:06:39 -07:00
mm_init.c - The 2 patch series "zram: support algorithm-specific parameters" from 2025-06-02 16:00:26 -07:00
mm_slot.h
mmap_lock.c mm: fix a UAF when vma->mm is freed after vma->vm_refcnt got dropped 2025-08-15 16:39:36 +02:00
mmap.c mm: convert VM_PFNMAP tracking to pfnmap_track() + pfnmap_untrack() 2025-05-22 14:55:37 -07:00
mmu_gather.c mmu_gather: move tlb flush for VM_PFNMAP/VM_MIXEDMAP vmas into free_pgtables() 2025-05-31 22:46:12 -07:00
mmu_notifier.c Update Christoph's Email address and make it consistent 2025-05-12 23:50:31 -07:00
mmzone.c
mprotect.c mm/huge_memory: remove useless folio pointers passing 2025-05-12 23:50:34 -07:00
mremap.c mm/mremap: fix WARN with uffd that has remap events disabled 2025-08-28 16:34:35 +02:00
mseal.c mseal: remove can_do_mseal() 2025-01-13 22:40:51 -08:00
msync.c
nommu.c - The 11 patch series "Add folio_mk_pte()" from Matthew Wilcox 2025-05-31 15:44:16 -07:00
numa_emulation.c mm/fake-numa: allow later numa node hotplug 2025-01-25 20:22:29 -08:00
numa_memblks.c mm: numa_memblks: introduce numa_add_reserved_memblk 2025-05-22 14:55:36 -07:00
numa.c mm/numa: remove unnecessary local variable in alloc_node_data() 2025-05-12 23:50:38 -07:00
oom_kill.c mm/oom_kill: fix trivial typo in comment 2025-03-16 22:05:55 -07:00
page_alloc.c - The 11 patch series "Add folio_mk_pte()" from Matthew Wilcox 2025-05-31 15:44:16 -07:00
page_counter.c page_counter: track failcnt only for legacy cgroups 2025-03-17 00:05:35 -07:00
page_ext.c mm: page_ext: add an iteration API for page extensions 2025-03-17 22:06:57 -07:00
page_frag_cache.c mm/page_alloc: export free_frozen_pages() instead of free_unref_page() 2025-01-13 22:40:31 -08:00
page_idle.c mm/page_idle: handle device-exclusive entries correctly in page_idle_clear_pte_refs_one() 2025-03-16 22:05:59 -07:00
page_io.c mm: Remove swap_writepage() and shmem_writepage() 2025-04-07 09:36:50 +02:00
page_isolation.c mm: page_isolation: avoid calling folio_hstate() without hugetlb_lock 2025-04-01 15:14:43 -07:00
page_owner.c mm: rename try_alloc_pages() to alloc_pages_nolock() 2025-05-22 14:55:37 -07:00
page_poison.c
page_reporting.c
page_reporting.h
page_table_check.c mm/page_table_check: Batch-check pmds/puds just like ptes 2025-05-09 13:43:07 +01:00
page_vma_mapped.c mm: make page_mapped_in_vma() hugetlb walk aware 2025-03-16 22:06:42 -07:00
page-writeback.c treewide, timers: Rename from_timer() to timer_container_of() 2025-06-08 09:07:37 +02:00
pagewalk.c mm: pagewalk: add the ability to install PTEs 2024-11-11 00:26:44 -08:00
percpu-internal.h
percpu-km.c
percpu-stats.c
percpu-vm.c
percpu.c - The 6 patch series "Enable strict percpu address space checks" from 2025-04-01 09:29:18 -07:00
pgalloc-track.h
pgtable-generic.c mm: add RCU annotation to pte_offset_map(_lock) 2024-12-18 19:04:43 -08:00
process_vm_access.c
pt_reclaim.c mm: pgtable: reclaim empty PTE page in madvise(MADV_DONTNEED) 2025-01-13 22:40:48 -08:00
ptdump.c mm/ptdump: take the memory hotplug lock inside ptdump_walk_pgd() 2025-08-20 18:41:41 +02:00
readahead.c fs: add S_ANON_INODE 2025-04-21 13:20:14 +02:00
rmap.c mm/rmap: fix potential out-of-bounds page table access during batched unmap 2025-07-09 21:07:53 -07:00
rodata_test.c mm/rodata_test: verify test data is unchanged, rather than non-zero 2025-01-13 22:40:38 -08:00
secretmem.c secretmem: use SB_I_NOEXEC 2025-07-07 13:26:09 +02:00
shmem_quota.c
shmem.c mm/shmem, swap: improve cached mTHP handling and fix potential hang 2025-08-20 18:41:41 +02:00
show_mem.c - The 11 patch series "Add folio_mk_pte()" from Matthew Wilcox 2025-05-31 15:44:16 -07:00
shrinker_debug.c mm/shrinker: fix name consistency issue in shrinker_debugfs_rename() 2025-03-17 00:05:40 -07:00
shrinker.c
shuffle.c
shuffle.h
slab_common.c Update Christoph's Email address and make it consistent 2025-05-12 23:50:31 -07:00
slab.h Merge branch 'slab/for-6.15/kfree_rcu_tiny' into slab/for-next 2025-03-20 10:33:38 +01:00
slub.c mm, slab: restore NUMA policy support for large kmalloc 2025-08-20 18:41:41 +02:00
sparse-vmemmap.c mm/hugetlb: do pre-HVO for bootmem allocated pages 2025-03-16 22:06:29 -07:00
sparse.c drivers/base/memory: improve add_boot_memory_block() 2025-03-17 22:07:01 -07:00
swap_cgroup.c mm: swap_cgroup: remove double initialization of locals 2025-03-17 22:06:58 -07:00
swap_state.c - The 11 patch series "Add folio_mk_pte()" from Matthew Wilcox 2025-05-31 15:44:16 -07:00
swap.c memcg: make count_memcg_events re-entrant safe against irqs 2025-05-22 14:55:38 -07:00
swap.h mm/shmem, swap: fix softlockup with mTHP swapin 2025-06-19 20:48:01 -07:00
swapfile.c mm: swap: move nr_swap_pages counter decrement from folio_alloc_swap() to swap_range_alloc() 2025-08-15 16:39:35 +02:00
truncate.c - The 2 patch series "zram: support algorithm-specific parameters" from 2025-06-02 16:00:26 -07:00
usercopy.c mm: security: Check early if HARDENED_USERCOPY is enabled 2025-02-28 11:51:31 -08:00
userfaultfd.c userfaultfd: fix a crash in UFFDIO_MOVE when PMD is a migration entry 2025-08-20 18:41:40 +02:00
util.c mm: add mmap_prepare() compatibility layer for nested file systems 2025-06-12 21:39:02 -07:00
vma_exec.c mm: abstract initial stack setup to mm subsystem 2025-05-12 23:50:48 -07:00
vma_init.c mm: convert VM_PFNMAP tracking to pfnmap_track() + pfnmap_untrack() 2025-05-22 14:55:37 -07:00
vma_internal.h mm/vma: move brk() internals to mm/vma.c 2025-01-13 22:40:42 -08:00
vma.c mm: add mmap_prepare() compatibility layer for nested file systems 2025-06-12 21:39:02 -07:00
vma.h mm: add mmap_prepare() compatibility layer for nested file systems 2025-06-12 21:39:02 -07:00
vmalloc.c mm/vmalloc: leave lazy MMU mode on PTE mapping error 2025-07-09 21:07:52 -07:00
vmpressure.c
vmscan.c mm/vmscan: fix hwpoisoned large folio handling in shrink_folio_list 2025-07-19 19:26:15 -07:00
vmstat.c Revert "sched/numa: add statistics of numa balance task" 2025-07-09 21:07:56 -07:00
workingset.c mm: workingset: simplify lockdep check in update_node 2025-05-12 23:50:44 -07:00
zpdesc.h mm: rename page->index to page->__folio_index 2025-05-31 22:46:06 -07:00
zpool.c zsmalloc: prefer the the original page's node for compressed data 2025-05-11 17:48:06 -07:00
zsmalloc.c mm/zsmalloc: do not pass __GFP_MOVABLE if CONFIG_COMPACTION=n 2025-07-19 19:26:15 -07:00
zswap.c zsmalloc: prefer the the original page's node for compressed data 2025-05-11 17:48:06 -07:00