linux-yocto/mm
Liu Shixin 1d117f79b5 mm: khugepaged: fix call hpage_collapse_scan_file() for anonymous vma
commit f1897f2f08 upstream.

syzkaller reported such a BUG_ON():

 ------------[ cut here ]------------
 kernel BUG at mm/khugepaged.c:1835!
 Internal error: Oops - BUG: 00000000f2000800 [#1] SMP
 ...
 CPU: 6 UID: 0 PID: 8009 Comm: syz.15.106 Kdump: loaded Tainted: G        W          6.13.0-rc6 #22
 Tainted: [W]=WARN
 Hardware name: QEMU KVM Virtual Machine, BIOS 0.0.0 02/06/2015
 pstate: 00400005 (nzcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
 pc : collapse_file+0xa44/0x1400
 lr : collapse_file+0x88/0x1400
 sp : ffff80008afe3a60
 ...
 Call trace:
  collapse_file+0xa44/0x1400 (P)
  hpage_collapse_scan_file+0x278/0x400
  madvise_collapse+0x1bc/0x678
  madvise_vma_behavior+0x32c/0x448
  madvise_walk_vmas.constprop.0+0xbc/0x140
  do_madvise.part.0+0xdc/0x2c8
  __arm64_sys_madvise+0x68/0x88
  invoke_syscall+0x50/0x120
  el0_svc_common.constprop.0+0xc8/0xf0
  do_el0_svc+0x24/0x38
  el0_svc+0x34/0x128
  el0t_64_sync_handler+0xc8/0xd0
  el0t_64_sync+0x190/0x198

This indicates that the pgoff is unaligned.  After analysis, I confirm the
vma is mapped to /dev/zero.  Such a vma certainly has vm_file, but it is
set to anonymous by mmap_zero().  So even if it's mmapped by 2m-unaligned,
it can pass the check in thp_vma_allowable_order() as it is an
anonymous-mmap, but then be collapsed as a file-mmap.

It seems the problem has existed for a long time, but actually, since we
have khugepaged_max_ptes_none check before, we will skip collapse it as it
is /dev/zero and so has no present page.  But commit d8ea7cc854 limit
the check for only khugepaged, so the BUG_ON() can be triggered by
madvise_collapse().

Add vma_is_anonymous() check to make such vma be processed by
hpage_collapse_scan_pmd().

Link: https://lkml.kernel.org/r/20250111034511.2223353-1-liushixin2@huawei.com
Fixes: d8ea7cc854 ("mm/khugepaged: add flag to predicate khugepaged-only behavior")
Signed-off-by: Liu Shixin <liushixin2@huawei.com>
Reviewed-by: Yang Shi <yang@os.amperecomputing.com>
Acked-by: David Hildenbrand <david@redhat.com>
Cc: Chengming Zhou <chengming.zhou@linux.dev>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Kefeng Wang <wangkefeng.wang@huawei.com>
Cc: Mattew Wilcox <willy@infradead.org>
Cc: Muchun Song <muchun.song@linux.dev>
Cc: Nanyong Sun <sunnanyong@huawei.com>
Cc: Qi Zheng <zhengqi.arch@bytedance.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
[acsjakub: backport, clean apply]
Signed-off-by: Jakub Acs <acsjakub@amazon.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-08-01 09:47:31 +01:00
..
damon mm/damon/sysfs-schemes: free old damon_sysfs_scheme_filter->memcg_path on write 2025-07-06 11:00:11 +02:00
kasan kasan: use vmalloc_dump_obj() for vmalloc error reports 2025-08-01 09:47:30 +01:00
kfence kfence: skip __GFP_THISNODE allocations on NUMA systems 2025-02-17 09:40:32 +01:00
kmsan dma: kmsan: export kmsan_handle_dma() for modules 2025-03-13 12:58:27 +01:00
backing-dev.c blk-wbt: Fix detection of dirty-throttled tasks 2024-02-23 09:25:16 +01:00
balloon_compaction.c
bootmem_info.c
cma_debug.c
cma_sysfs.c
cma.c mm/cma: drop incorrect alignment check in cma_init_reserved_mem 2024-06-16 13:47:42 +02:00
cma.h
compaction.c NFS: fix nfs_release_folio() to not deadlock via kcompactd writeback 2025-03-13 12:58:27 +01:00
debug_page_alloc.c
debug_page_ref.c
debug_vm_pgtable.c mm/debug_vm_pgtable: drop RANDOM_ORVALUE trick 2024-08-19 06:04:31 +02:00
debug.c
dmapool_test.c
dmapool.c
early_ioremap.c
fadvise.c mm: remove unnecessary pagevec includes 2023-06-23 16:59:31 -07:00
fail_page_alloc.c
failslab.c
filemap.c mm: fix filemap_get_folios_contig returning batches of identical folios 2025-04-25 10:45:48 +02:00
folio-compat.c filemap: Add fgf_t typedef 2023-07-24 18:04:30 -04:00
gup_test.c
gup_test.h
gup.c mm/gup: fix wrongly calculated returned value in fault_in_safe_writeable() 2025-04-25 10:45:48 +02:00
highmem.c
hmm.c mm: enable page walking API to lock vmas during the walk 2023-08-21 13:07:20 -07:00
huge_memory.c mm/huge_memory: fix dereferencing invalid pmd migration entry 2025-06-27 11:09:00 +01:00
hugetlb_cgroup.c
hugetlb_vmemmap.c mm: hugetlb_vmemmap: fix a race between vmemmap pmd split 2023-08-18 10:12:14 -07:00
hugetlb_vmemmap.h
hugetlb.c mm/hugetlb: unshare page tables during VMA split, not before 2025-06-27 11:09:00 +01:00
hwpoison-inject.c
init-mm.c mm: move dummy_vm_ops out of a header 2023-08-21 13:37:46 -07:00
internal.h Rename .data.once to .data..once to fix resetting WARN*_ONCE 2024-12-09 10:32:59 +01:00
interval_tree.c
io-mapping.c
ioremap.c mm: ioremap: remove unneeded ioremap_allowed and iounmap_allowed 2023-08-18 10:12:36 -07:00
Kconfig mm: z3fold: deprecate CONFIG_Z3FOLD 2024-10-10 11:58:02 +02:00
Kconfig.debug
khugepaged.c mm: khugepaged: fix call hpage_collapse_scan_file() for anonymous vma 2025-08-01 09:47:31 +01:00
kmemleak.c mm: kmemleak: fix upper boundary check for physical address objects 2025-02-17 09:40:35 +01:00
ksm.c mm/ksm: fix ksm_zero_pages accounting 2024-06-16 13:47:41 +02:00
list_lru.c
maccess.c
madvise.c mm,madvise,hugetlb: check for 0-length range after end address adjustment 2025-02-27 04:10:52 -08:00
Makefile - Some swap cleanups from Ma Wupeng ("fix WARN_ON in add_to_avail_list") 2023-08-29 14:25:26 -07:00
mapping_dirty_helpers.c mm: fix clean_record_shared_mapping_range kernel-doc 2023-08-24 16:20:30 -07:00
memblock.c memblock: Accept allocated memory before use in memblock_double_array() 2025-05-22 14:12:25 +02:00
memcontrol.c memcg: always call cond_resched() after fn() 2025-06-04 14:42:20 +02:00
memfd.c revert "memfd: improve userspace warnings for missing exec-related flags". 2023-09-05 11:11:52 -07:00
memory_hotplug.c hwpoison, memory_hotplug: lock folio before unmap hwpoisoned folio 2025-05-22 14:12:25 +02:00
memory-failure.c mm/hwpoison: do not send SIGBUS to processes with recovered clean pages 2025-04-25 10:45:31 +02:00
memory-tiers.c memory tier: use helper macro __ATTR_RW() 2023-08-18 10:12:38 -07:00
memory.c mm: fix apply_to_existing_page_range() 2025-04-25 10:45:48 +02:00
mempolicy.c mm/mempolicy: fix migrate_to_node() assuming there is at least one VMA in a MM 2024-12-14 20:00:18 +01:00
mempool.c
memremap.c
memtest.c memtest: use {READ,WRITE}_ONCE in memory scanning 2024-04-03 15:28:33 +02:00
migrate_device.c Add x86 shadow stack support 2023-08-31 12:20:12 -07:00
migrate.c mm/migrate: correct nr_failed in migrate_pages_sync() 2025-05-22 14:12:25 +02:00
mincore.c mm: enable page walking API to lock vmas during the walk 2023-08-21 13:07:20 -07:00
mlock.c merge mm-hotfixes-stable into mm-stable to pick up depended-upon changes 2023-08-21 14:26:20 -07:00
mm_init.c efi: disable mirror feature during crashkernel 2024-01-31 16:18:56 -08:00
mm_slot.h
mmap_lock.c mm: mmap_lock: replace get_memcg_path_buf() with on-stack buffer 2024-08-03 08:54:12 +02:00
mmap.c mm/hugetlb: unshare page tables during VMA split, not before 2025-06-27 11:09:00 +01:00
mmu_gather.c mm: fix kernel-doc warning from tlb_flush_rmaps() 2023-08-24 16:20:30 -07:00
mmu_notifier.c mmu_notifiers: rename invalidate_range notifier 2023-08-18 10:12:41 -07:00
mmzone.c
mprotect.c mm: refactor map_deny_write_exec() 2024-11-22 15:38:37 +01:00
mremap.c mm/mremap: correctly handle partial mremap() of VMA starting at 0 2025-04-25 10:45:31 +02:00
msync.c
nommu.c mm: add nommu variant of vm_insert_pages() 2025-03-22 12:50:44 -07:00
oom_kill.c memcg: fix soft lockup in the OOM process 2025-02-27 04:10:45 -08:00
page_alloc.c mm/page_alloc.c: avoid infinite retries caused by cpuset race 2025-06-04 14:42:20 +02:00
page_counter.c
page_ext.c mm/page_ext: move functions around for minor cleanups to page_ext 2023-08-18 10:12:31 -07:00
page_idle.c
page_io.c zswap: make zswap_load() take a folio 2023-08-21 13:37:27 -07:00
page_isolation.c mm/hugetlb: get rid of page_hstate() 2023-08-18 10:12:39 -07:00
page_owner.c mm/page_ext: use page_ext_data helper in page_owner 2023-08-21 13:37:27 -07:00
page_poison.c mm/page_poison: remove unused page_ext.h from page_poison 2023-08-21 13:37:30 -07:00
page_reporting.c mm, treewide: introduce NR_PAGE_ORDERS 2024-05-02 16:32:41 +02:00
page_reporting.h
page_table_check.c mm/page_table_check: support userfault wr-protect entries 2024-08-19 06:04:29 +02:00
page_vma_mapped.c mm: make page_mapped_in_vma() hugetlb walk aware 2025-04-25 10:45:31 +02:00
page-writeback.c mm: fix ratelimit_pages update error in dirty_ratio_handler() 2025-06-27 11:08:49 +01:00
pagewalk.c mm/pagewalk: fix bootstopping regression from extra pte_unmap() 2023-09-02 08:39:21 -07:00
percpu-internal.h
percpu-km.c
percpu-stats.c
percpu-vm.c
percpu.c mm: Introduce flush_cache_vmap_early() 2024-02-16 19:10:52 +01:00
pgalloc-track.h
pgtable-generic.c mm: fix race between __split_huge_pmd_locked() and GUP-fast 2024-06-16 13:47:40 +02:00
process_vm_access.c
ptdump.c
readahead.c mm/readahead: fix large folio support in async readahead 2025-01-09 13:32:08 +01:00
rmap.c mm/rmap: reject hugetlb folios in folio_make_device_exclusive() 2025-04-25 10:45:31 +02:00
rodata_test.c
secretmem.c fs: export anon_inode_make_secure_inode() and fix secretmem LSM bypass 2025-07-10 16:03:18 +02:00
shmem_quota.c tmpfs: fix race on handling dquot rbtree 2024-04-03 15:28:54 +02:00
shmem.c Revert "libfs: Add simple_offset_empty()" 2025-02-01 18:37:54 +01:00
show_mem.c mm, treewide: introduce NR_PAGE_ORDERS 2024-05-02 16:32:41 +02:00
shrinker_debug.c
shuffle.c
shuffle.h
slab_common.c mm: krealloc: Fix MTE false alarm in __do_krealloc 2024-11-17 15:08:58 +01:00
slab.c Randomized slab caches for kmalloc() 2023-07-18 10:07:47 +02:00
slab.h mm/slub: Avoid list corruption when removing a slab from the full list 2024-12-09 10:33:06 +01:00
slub.c mm/slub: Avoid list corruption when removing a slab from the full list 2024-12-09 10:33:06 +01:00
sparse-vmemmap.c mm/vmemmap: allow architectures to override how vmemmap optimization works 2023-08-18 10:12:53 -07:00
sparse.c x86/kaslr: Expose and use the end of the physical memory address space 2024-09-12 11:11:25 +02:00
swap_cgroup.c
swap_slots.c
swap_state.c mm/swap: inline folio_set_swap_entry() and folio_swap_entry() 2023-08-24 16:20:28 -07:00
swap.c mm: page_alloc: move mlocked flag clearance into free_pages_prepare() 2024-12-14 19:59:51 +01:00
swap.h mm/swap: fix race when skipping swapcache 2024-03-01 13:35:00 +01:00
swapfile.c mm/swapfile: skip HugeTLB pages for unuse_vma 2024-10-22 15:46:21 +02:00
truncate.c mm: Fix missing folio invalidation calls during truncation 2024-09-04 13:28:23 +02:00
usercopy.c
userfaultfd.c userfaultfd: fix checks for huge PMDs 2024-09-12 11:11:27 +02:00
util.c mm: only enforce minimum stack gap size if it's sensible 2024-10-04 16:30:02 +02:00
vmalloc.c mm/vmalloc: leave lazy MMU mode on PTE mapping error 2025-07-17 18:35:15 +02:00
vmpressure.c net-memcg: Fix scope of sockmem pressure indicators 2023-08-16 12:21:32 +01:00
vmscan.c mm: add missing release barrier on PGDAT_RECLAIM_LOCKED unlock 2025-04-25 10:45:31 +02:00
vmstat.c vmstat: call fold_vm_zone_numa_events() before show per zone NUMA event 2024-12-09 10:33:05 +01:00
workingset.c mm: ratelimit stat flush from workingset shrinker 2024-06-16 13:47:31 +02:00
z3fold.c mm/z3fold: remove obsolete comment for struct z3fold_pool 2023-08-21 13:37:51 -07:00
zbud.c
zpool.c
zsmalloc.c mm/zsmalloc: do not pass __GFP_MOVABLE if CONFIG_COMPACTION=n 2025-08-01 09:47:31 +01:00
zswap.c mm: zswap: fix missing folio cleanup in writeback race path 2024-03-01 13:35:10 +01:00