linux-yocto/mm
Alistair Popple 82ba975e4c mm: allow compound zone device pages
Zone device pages are used to represent various types of device memory
managed by device drivers.  Currently compound zone device pages are not
supported.  This is because MEMORY_DEVICE_FS_DAX pages are the only user
of higher order zone device pages and have their own page reference
counting.

A future change will unify FS DAX reference counting with normal page
reference counting rules and remove the special FS DAX reference counting.
Supporting that requires compound zone device pages.

Supporting compound zone device pages requires compound_head() to
distinguish between head and tail pages whilst still preserving the
special struct page fields that are specific to zone device pages.

A tail page is distinguished by having bit zero set in
page->compound_head, with the remaining bits pointing to the head page.
For zone device pages page->compound_head is shared with page->pgmap.

The page->pgmap field must be common to all pages within a folio, even if
the folio spans memory sections.  Therefore pgmap is the same for both
head and tail pages and can be moved into the folio, which allows the
standard scheme to be used to find compound_head from a tail page.
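
As an illustration only, the tail-page encoding described above can be
sketched in user-space C.  Everything named toy_* here is hypothetical and
greatly simplified; it is not the kernel's struct page layout or API.  The
sketch assumes the head pointer is at least 2-byte aligned so that bit zero
is free, and it keeps pgmap on the head entry only, which mirrors the idea
of moving pgmap into the folio.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a device pagemap; not the kernel's dev_pagemap. */
struct toy_pgmap {
	int id;
};

/* Hypothetical, simplified page descriptor; not the kernel's struct page. */
struct toy_page {
	uintptr_t compound_head;	/* bit 0 set: tail page, other bits: head pointer */
	struct toy_pgmap *pgmap;	/* only meaningful on the head page in this sketch */
};

/* Mark @tail as a tail page of @head by storing the head pointer with bit 0 set. */
static void toy_set_tail(struct toy_page *tail, struct toy_page *head)
{
	tail->compound_head = (uintptr_t)head | 1UL;
}

/* A page is a tail page when bit 0 of compound_head is set. */
static int toy_is_tail(const struct toy_page *page)
{
	return (int)(page->compound_head & 1UL);
}

/* Return the head page: clear bit 0 for a tail page, otherwise the page itself. */
static struct toy_page *toy_compound_head(struct toy_page *page)
{
	uintptr_t head = page->compound_head;

	return (head & 1UL) ? (struct toy_page *)(head - 1UL) : page;
}

/* Look up pgmap through the head page, so head and tail pages agree. */
static struct toy_pgmap *toy_page_pgmap(struct toy_page *page)
{
	return toy_compound_head(page)->pgmap;
}

/* Build a four-page compound and check every page resolves to the same head. */
static int toy_demo(void)
{
	static struct toy_pgmap pgmap = { .id = 7 };
	struct toy_page pages[4] = {{0}};
	size_t i;

	pages[0].pgmap = &pgmap;
	for (i = 1; i < 4; i++)
		toy_set_tail(&pages[i], &pages[0]);

	for (i = 0; i < 4; i++) {
		assert(toy_compound_head(&pages[i]) == &pages[0]);
		assert(toy_page_pgmap(&pages[i]) == &pgmap);
	}
	assert(!toy_is_tail(&pages[0]));
	assert(toy_is_tail(&pages[3]));
	return 0;
}
```

Because the tail entries never store their own pgmap pointer in this sketch,
the word that holds compound_head is free to carry the head pointer, and a
lookup from any page goes through the head first.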

Link: https://lkml.kernel.org/r/67055d772e6102accf85161d0b57b0b3944292bf.1740713401.git-series.apopple@nvidia.com
Signed-off-by: Alistair Popple <apopple@nvidia.com>
Signed-off-by: Balbir Singh <balbirs@nvidia.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Acked-by: David Hildenbrand <david@redhat.com>
Tested-by: Alison Schofield <alison.schofield@intel.com>
Cc: Alexander Gordeev <agordeev@linux.ibm.com>
Cc: Asahi Lina <lina@asahilina.net>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Christian Borntraeger <borntraeger@linux.ibm.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Chunyan Zhang <zhang.lyra@gmail.com>
Cc: "Darrick J. Wong" <djwong@kernel.org>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Dave Jiang <dave.jiang@intel.com>
Cc: Gerald Schaefer <gerald.schaefer@linux.ibm.com>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: Ira Weiny <ira.weiny@intel.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: linmiaohe <linmiaohe@huawei.com>
Cc: Logan Gunthorpe <logang@deltatee.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Michael "Camp Drill Sergeant" Ellerman <mpe@ellerman.id.au>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Sven Schnelle <svens@linux.ibm.com>
Cc: Ted Ts'o <tytso@mit.edu>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Vishal Verma <vishal.l.verma@intel.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: WANG Xuerui <kernel@xen0n.name>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-03-17 22:06:39 -07:00
..
damon mm/damon/paddr: respect ops_filters_default_reject 2025-03-17 00:05:39 -07:00
kasan kasan: don't call find_vm_area() in a PREEMPT_RT kernel 2025-02-17 22:40:04 -08:00
kfence kfence: skip __GFP_THISNODE allocations on NUMA systems 2025-02-01 03:53:26 -08:00
kmsan dma: kmsan: export kmsan_handle_dma() for modules 2025-03-05 21:36:14 -08:00
backing-dev.c
balloon_compaction.c
bootmem_info.c mm/sparse: allow for alternate vmemmap section init at boot 2025-03-16 22:06:27 -07:00
cma_debug.c mm, cma: support multiple contiguous ranges, if requested 2025-03-16 22:06:25 -07:00
cma_sysfs.c mm/cma: export total and free number of pages for CMA areas 2025-03-16 22:06:24 -07:00
cma.c mm/cma: introduce interface for early reservations 2025-03-16 22:06:30 -07:00
cma.h mm/cma: introduce interface for early reservations 2025-03-16 22:06:30 -07:00
compaction.c mm/page_alloc: clarify terminology in migratetype fallback code 2025-03-17 00:05:35 -07:00
debug_page_alloc.c
debug_page_ref.c
debug_vm_pgtable.c
debug.c mm/debug: print vm_refcnt state when dumping the vma 2025-03-16 22:06:20 -07:00
dmapool_test.c
dmapool.c
early_ioremap.c mm/early_ioremap: add null pointer checks to prevent NULL-pointer dereference 2025-01-13 22:40:59 -08:00
execmem.c
fadvise.c
fail_page_alloc.c
failslab.c
filemap.c filemap: remove redundant folio_test_large check in filemap_free_folio 2025-03-16 22:06:16 -07:00
folio-compat.c
gup_test.c
gup_test.h
gup.c mm/gup: remove redundant check for PCI P2PDMA page 2025-03-17 22:06:38 -07:00
highmem.c
hmm.c mm: allow compound zone device pages 2025-03-17 22:06:39 -07:00
huge_memory.c mm: avoid splitting pmd for lazyfree pmd-mapped THP in try_to_unmap 2025-03-16 22:06:17 -07:00
hugetlb_cgroup.c page_counter: track failcnt only for legacy cgroups 2025-03-17 00:05:35 -07:00
hugetlb_cma.c mm/hugetlb: move hugetlb CMA code in to its own file 2025-03-16 22:06:31 -07:00
hugetlb_cma.h mm/hugetlb: move hugetlb CMA code in to its own file 2025-03-16 22:06:31 -07:00
hugetlb_vmemmap.c mm/hugetlb: do pre-HVO for bootmem allocated pages 2025-03-16 22:06:29 -07:00
hugetlb_vmemmap.h mm/hugetlb: do pre-HVO for bootmem allocated pages 2025-03-16 22:06:29 -07:00
hugetlb.c mm: hugetlb: log time needed to allocate hugepages 2025-03-17 00:05:37 -07:00
hwpoison-inject.c
init-mm.c mm: replace vm_lock and detached flag with a reference count 2025-03-16 22:06:20 -07:00
internal.h mm/page_alloc: clarify terminology in migratetype fallback code 2025-03-17 00:05:35 -07:00
interval_tree.c
io-mapping.c
ioremap.c mm/ioremap: pass pgprot_t to ioremap_prot() instead of unsigned long 2025-03-16 22:06:23 -07:00
Kconfig mm/sparse: allow for alternate vmemmap section init at boot 2025-03-16 22:06:27 -07:00
Kconfig.debug mm: rename GENERIC_PTDUMP and PTDUMP_CORE 2025-03-17 00:05:32 -07:00
khugepaged.c The various patchsets are summarized below. Plus of course many 2025-01-26 18:36:23 -08:00
kmemleak.c mm: kmemleak: add support for dumping physical and __percpu object info 2025-03-16 22:06:08 -07:00
ksm.c mm/ksm: handle device-exclusive entries correctly in write_protect_page() 2025-03-16 22:05:58 -07:00
list_lru.c mm/list_lru: make the case where mlru is NULL as unlikely 2025-03-17 00:05:32 -07:00
maccess.c kasan: migrate copy_user_test to kunit 2024-11-11 00:26:44 -08:00
madvise.c mm: allow guard regions in file-backed and read-only mappings 2025-03-16 22:06:14 -07:00
Makefile mm: rename GENERIC_PTDUMP and PTDUMP_CORE 2025-03-17 00:05:32 -07:00
mapping_dirty_helpers.c
memblock.c mm/memblock: add memblock_alloc_or_panic interface 2025-01-25 20:22:38 -08:00
memcontrol-v1.c mm: memcontrol: move memsw charge callbacks to v1 2025-03-16 22:05:55 -07:00
memcontrol-v1.h mm: memcontrol: move memsw charge callbacks to v1 2025-03-16 22:05:55 -07:00
memcontrol.c memcg: bypass root memcg check for skmem charging 2025-03-17 00:05:36 -07:00
memfd.c mm/memfd: fix spelling and grammatical issues 2025-03-16 22:06:04 -07:00
memory_hotplug.c hwpoison, memory_hotplug: lock folio before unmap hwpoisoned folio 2025-03-05 21:36:13 -08:00
memory-failure.c mm: memory-failure: update ttu flag inside unmap_poisoned_folio 2025-03-05 21:36:13 -08:00
memory-tiers.c
memory.c mm: allow compound zone device pages 2025-03-17 22:06:39 -07:00
mempolicy.c mm/hugetlb: rename isolate_hugetlb() to folio_isolate_hugetlb() 2025-01-25 20:22:41 -08:00
mempool.c
memremap.c mm: allow compound zone device pages 2025-03-17 22:06:39 -07:00
memtest.c
migrate_device.c mm: allow compound zone device pages 2025-03-17 22:06:39 -07:00
migrate.c mm: use READ/WRITE_ONCE() for vma->vm_flags on migrate, mprotect 2025-03-16 22:06:09 -07:00
mincore.c mm/mincore: improve performance by adding an unlikely hint 2025-03-16 22:06:32 -07:00
mlock.c mm: allow compound zone device pages 2025-03-17 22:06:39 -07:00
mm_init.c mm: allow compound zone device pages 2025-03-17 22:06:39 -07:00
mm_slot.h
mmap_lock.c mm: mmap_lock: optimize mmap_lock tracepoints 2025-01-13 22:40:34 -08:00
mmap.c mm: make vma cache SLAB_TYPESAFE_BY_RCU 2025-03-16 22:06:21 -07:00
mmu_gather.c mm/mmu_gather: update comment on RCU freeing 2025-03-16 22:06:12 -07:00
mmu_notifier.c
mmzone.c
mprotect.c mm: use READ/WRITE_ONCE() for vma->vm_flags on migrate, mprotect 2025-03-16 22:06:09 -07:00
mremap.c mm: clear uffd-wp PTE/PMD state on mremap() 2025-01-12 19:03:37 -08:00
mseal.c mseal: remove can_do_mseal() 2025-01-13 22:40:51 -08:00
msync.c
nommu.c mm: introduce vma_iter_store_attached() to use with attached vmas 2025-03-16 22:06:18 -07:00
numa_emulation.c mm/fake-numa: allow later numa node hotplug 2025-01-25 20:22:29 -08:00
numa_memblks.c mm/fake-numa: allow later numa node hotplug 2025-01-25 20:22:29 -08:00
numa.c mm/memblock: add memblock_alloc_or_panic interface 2025-01-25 20:22:38 -08:00
oom_kill.c mm/oom_kill: fix trivial typo in comment 2025-03-16 22:05:55 -07:00
page_alloc.c mm/page_alloc: clarify should_claim_block() commentary 2025-03-17 00:05:35 -07:00
page_counter.c page_counter: track failcnt only for legacy cgroups 2025-03-17 00:05:35 -07:00
page_ext.c
page_frag_cache.c mm/page_alloc: export free_frozen_pages() instead of free_unref_page() 2025-01-13 22:40:31 -08:00
page_idle.c mm/page_idle: handle device-exclusive entries correctly in page_idle_clear_pte_refs_one() 2025-03-16 22:05:59 -07:00
page_io.c mm, swap: clean up device availability check 2025-01-25 20:22:36 -08:00
page_isolation.c mm/hugetlb: wait for hugetlb folios to be freed 2025-03-05 21:36:14 -08:00
page_owner.c
page_poison.c
page_reporting.c
page_reporting.h
page_table_check.c mm: use single SWP_DEVICE_EXCLUSIVE entry type 2025-03-16 22:05:58 -07:00
page_vma_mapped.c mm: make page_mapped_in_vma() hugetlb walk aware 2025-03-16 22:06:42 -07:00
page-writeback.c writeback: fix calculations in trace_balance_dirty_pages() for cgwb 2025-03-17 00:05:37 -07:00
pagewalk.c mm: pagewalk: add the ability to install PTEs 2024-11-11 00:26:44 -08:00
percpu-internal.h
percpu-km.c
percpu-stats.c
percpu-vm.c
percpu.c mm, percpu: do not consider sleepable allocations atomic 2025-03-16 22:06:08 -07:00
pgalloc-track.h
pgtable-generic.c mm: add RCU annotation to pte_offset_map(_lock) 2024-12-18 19:04:43 -08:00
process_vm_access.c
pt_reclaim.c mm: pgtable: reclaim empty PTE page in madvise(MADV_DONTNEED) 2025-01-13 22:40:48 -08:00
ptdump.c
readahead.c The various patchsets are summarized below. Plus of course many 2025-01-26 18:36:23 -08:00
rmap.c mm: avoid splitting pmd for lazyfree pmd-mapped THP in try_to_unmap 2025-03-16 22:06:17 -07:00
rodata_test.c mm/rodata_test: verify test data is unchanged, rather than non-zero 2025-01-13 22:40:38 -08:00
secretmem.c add a string-to-qstr constructor 2025-01-27 19:25:45 -05:00
shmem_quota.c
shmem.c mm: shmem: factor out the within_size logic into a new helper 2025-03-17 00:05:42 -07:00
show_mem.c
shrinker_debug.c mm/shrinker: fix name consistency issue in shrinker_debugfs_rename() 2025-03-17 00:05:40 -07:00
shrinker.c
shuffle.c
shuffle.h
slab_common.c mm/slab/kvfree_rcu: Switch to WQ_MEM_RECLAIM wq 2025-03-04 08:51:53 +01:00
slab.h mm/slab: fix kernel-doc func param names 2025-01-13 10:22:04 +01:00
slub.c alloc_tag: uninline code gated by mem_alloc_profiling_key in slab allocator 2025-03-16 22:06:03 -07:00
sparse-vmemmap.c mm/hugetlb: do pre-HVO for bootmem allocated pages 2025-03-16 22:06:29 -07:00
sparse.c mm/sparse: allow for alternate vmemmap section init at boot 2025-03-16 22:06:27 -07:00
swap_cgroup.c mm: memcontrol: fix swap counter leak from offline cgroup 2025-03-16 17:40:24 -07:00
swap_state.c mm, swap: simplify folio swap allocation 2025-03-16 22:06:44 -07:00
swap.c mm/filemap: add read support for RWF_DONTCACHE 2025-01-25 20:22:43 -08:00
swap.h mm, swap: simplify folio swap allocation 2025-03-16 22:06:44 -07:00
swapfile.c mm, swap: simplify folio swap allocation 2025-03-16 22:06:44 -07:00
truncate.c fs/dax: always remove DAX page-cache entries when breaking layouts 2025-03-17 22:06:37 -07:00
usercopy.c
userfaultfd.c mm: allow vma_start_read_locked/vma_start_read_locked_nested to fail 2025-03-16 22:06:18 -07:00
util.c mm: add comments to do_mmap(), mmap_region() and vm_mmap() 2025-01-13 22:40:59 -08:00
vma_internal.h mm/vma: move brk() internals to mm/vma.c 2025-01-13 22:40:42 -08:00
vma.c mm: make vma cache SLAB_TYPESAFE_BY_RCU 2025-03-16 22:06:21 -07:00
vma.h mm: make vma cache SLAB_TYPESAFE_BY_RCU 2025-03-16 22:06:21 -07:00
vmalloc.c mm: don't skip arch_sync_kernel_mappings() in error paths 2025-03-05 21:36:18 -08:00
vmpressure.c
vmscan.c mm, swap: simplify folio swap allocation 2025-03-16 22:06:44 -07:00
vmstat.c vmstat: disable vmstat_work on vmstat_cpu_down_prep() 2025-01-12 19:03:38 -08:00
workingset.c mm/mglru: rework workingset protection 2025-01-25 20:22:39 -08:00
zpdesc.h mm/zsmalloc: introduce __zpdesc_clear/set_zsmalloc() 2025-01-25 20:22:35 -08:00
zpool.c mm: zpool: remove zpool_malloc_support_movable() 2025-03-17 00:05:41 -07:00
zsmalloc.c mm: zpool: remove zpool_malloc_support_movable() 2025-03-17 00:05:41 -07:00
zswap.c mm: zpool: remove zpool_malloc_support_movable() 2025-03-17 00:05:41 -07:00