This fixes a real bug, too, because xstate_size() was assuming that
the stack variable xstate_size was initialized to zero. That's not
guaranteed nor even especially likely.
Link: https://lkml.kernel.org/r/20230606071637.267103-8-jhubbard@nvidia.com
Signed-off-by: John Hubbard <jhubbard@nvidia.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Tested-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
The uffd tests generate two compile time warnings from clang's
-Wformat-security setting. These trigger at the call sites for
uffd_test_start() and uffd_test_skip().
1) Fix the uffd_test_start() issue by removing the intermediate
test_name variable (thanks to David Hildenbrand for showing how to do
this).
2) Fix the uffd_test_skip() issue by observing that there is no need for
a macro and a variable args approach, because all callers of
uffd_test_skip() pass in a simple char* string, without any format
specifiers. So just change uffd_test_skip() into a regular C function.
Link: https://lkml.kernel.org/r/20230606071637.267103-7-jhubbard@nvidia.com
Signed-off-by: John Hubbard <jhubbard@nvidia.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Tested-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
These new build products were left out of .gitignore, so add them now.
Link: https://lkml.kernel.org/r/20230606071637.267103-6-jhubbard@nvidia.com
Signed-off-by: John Hubbard <jhubbard@nvidia.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Tested-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
We cannot depend upon git to reliably retain the executable bit on shell
scripts, or so I was told several years ago while working on this same
run_vmtests.sh script. And sure enough, things such as test_hmm.sh are
lately failing to run, due to lacking execute permissions.
Fix this by explicitly adding "bash" to each of the shell script
invocations. Leave fixing the overall approach to another day.
Link: https://lkml.kernel.org/r/20230606071637.267103-5-jhubbard@nvidia.com
Signed-off-by: John Hubbard <jhubbard@nvidia.com>
Acked-by: David Hildenbrand <david@redhat.com>
Tested-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
The stop variable is a char*, and the code was assigning a char value to
it. This was generating a warning when compiling with clang.
However, as both David and Peter pointed out, stop is not even used
after the problematic assignment to a char type. So just delete that
line entirely.
Link: https://lkml.kernel.org/r/20230606071637.267103-4-jhubbard@nvidia.com
Signed-off-by: John Hubbard <jhubbard@nvidia.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Tested-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Dummy variables are required in order to make these two (similar)
routines work, so in both cases, declare the variables as volatile in
order to avoid the clang compiler warning.
Furthermore, in order to ensure that each test actually does what is
intended, add an asm volatile invocation (thanks to David Hildenbrand
for the suggestion), with a clarifying comment so that it survives
future maintenance.
Link: https://lkml.kernel.org/r/20230606071637.267103-3-jhubbard@nvidia.com
Signed-off-by: John Hubbard <jhubbard@nvidia.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Tested-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Patch series "A minor flurry of selftest/mm fixes", v3.
A series that fixes up build errors and warnings for at least the 64-bit
builds on x86 with clang.
The series also includes an optional "improvement" of moving some uffd
code into uffd-common.[ch], which is proving to be somewhat controversial,
and so if that doesn't get resolved, then patches 9 and 10 may just get
dropped. They are not required in order to get a clean build, now that
"make headers" is happening.
[1]: https://lore.kernel.org/all/20230602013358.900637-1-jhubbard@nvidia.com/
This patch (of 11):
uffd_minor_feature() was unused. Remove it in order to fix the associated
clang build warning.
Link: https://lkml.kernel.org/r/20230606071637.267103-1-jhubbard@nvidia.com
Link: https://lkml.kernel.org/r/20230606071637.267103-2-jhubbard@nvidia.com
Signed-off-by: John Hubbard <jhubbard@nvidia.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Muhammad Usama Anjum <usama.anjum@collabora.com>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Currently the MM selftests attempt to work out the target architecture by
using CROSS_COMPILE or otherwise querying the host machine, storing the
target architecture in a variable called MACHINE rather than the usual
ARCH though as far as I can tell (including for x86_64) the value is the
same as we would use for architecture.
When cross compiling with LLVM we don't need a CROSS_COMPILE as LLVM can
support many target architectures in a single build so this logic does not
work, CROSS_COMPILE is not set and we end up selecting tests for the host
rather than target architecture. Fix this by using the more standard ARCH
to describe the architecture, taking it from the environment if specified.
Link: https://lkml.kernel.org/r/20230614-kselftest-mm-llvm-v1-1-180523f277d3@kernel.org
Signed-off-by: Mark Brown <broonie@kernel.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Tom Rix <trix@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Similar to the COW selftests, also use io_uring fixed buffers to test if
long-term page pinning works as expected.
Link: https://lkml.kernel.org/r/20230519102723.185721-4-david@redhat.com
Signed-off-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Lorenzo Stoakes <lstoakes@gmail.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Jason Gunthorpe <jgg@nvidia.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Let's add a new test for checking whether GUP long-term page pinning works
as expected (R/O vs. R/W, MAP_PRIVATE vs. MAP_SHARED, GUP vs.
GUP-fast). Note that COW handling with long-term R/O pinning in private
mappings, and pinning of anonymous memory in general, is tested by the COW
selftest. This test, therefore, focuses on page pinning in file mappings.
The most interesting case is probably the "local tmpfile" case, as that
will likely end up on a "real" filesystem such as ext4 or xfs, not on a
virtual one like tmpfs or hugetlb where any long-term page pinning is
always expected to succeed.
For now, only add tests that use the "/sys/kernel/debug/gup_test"
interface. We'll add tests based on liburing separately next.
[akpm@linux-foundation.org: update .gitignore for gup_longterm, per Peter]
Link: https://lkml.kernel.org/r/20230519102723.185721-3-david@redhat.com
Signed-off-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Lorenzo Stoakes <lstoakes@gmail.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Jason Gunthorpe <jgg@nvidia.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Patch series "selftests/mm: new test for FOLL_LONGTERM on file mappings".
Let's add some selftests to make sure that:
* R/O long-term pinning always works of file mappings
* R/W long-term pinning always works in MAP_PRIVATE file mappings
* R/W long-term pinning only works in MAP_SHARED mappings with special
filesystems (shmem, hugetlb) and fails with other filesystems (ext4, btrfs,
xfs).
The tests make use of the gup_test kernel module to trigger ordinary GUP
and GUP-fast, and liburing (similar to our COW selftests). Test with
memfd, memfd hugetlb, tmpfile() and mkstemp(). The latter usually gives
us a "real" filesystem (ext4, btrfs, xfs) where long-term pinning is
expected to fail.
Note that these selftests don't contain any actual reproducers for data
corruptions in case R/W long-term pinning on problematic filesystems
"would" work.
Maybe we can later come up with a racy !FOLL_LONGTERM reproducer that can
reuse an existing interface to trigger short-term pinning (I'll look into
that next).
On current mm/mm-unstable:
# ./gup_longterm
# [INFO] detected hugetlb page size: 2048 KiB
# [INFO] detected hugetlb page size: 1048576 KiB
TAP version 13
1..50
# [RUN] R/W longterm GUP pin in MAP_SHARED file mapping ... with memfd
ok 1 Should have worked
# [RUN] R/W longterm GUP pin in MAP_SHARED file mapping ... with tmpfile
ok 2 Should have worked
# [RUN] R/W longterm GUP pin in MAP_SHARED file mapping ... with local tmpfile
ok 3 Should have failed
# [RUN] R/W longterm GUP pin in MAP_SHARED file mapping ... with memfd hugetlb (2048 kB)
ok 4 Should have worked
# [RUN] R/W longterm GUP pin in MAP_SHARED file mapping ... with memfd hugetlb (1048576 kB)
ok 5 Should have worked
# [RUN] R/W longterm GUP-fast pin in MAP_SHARED file mapping ... with memfd
ok 6 Should have worked
# [RUN] R/W longterm GUP-fast pin in MAP_SHARED file mapping ... with tmpfile
ok 7 Should have worked
# [RUN] R/W longterm GUP-fast pin in MAP_SHARED file mapping ... with local tmpfile
ok 8 Should have failed
# [RUN] R/W longterm GUP-fast pin in MAP_SHARED file mapping ... with memfd hugetlb (2048 kB)
ok 9 Should have worked
# [RUN] R/W longterm GUP-fast pin in MAP_SHARED file mapping ... with memfd hugetlb (1048576 kB)
ok 10 Should have worked
# [RUN] R/O longterm GUP pin in MAP_SHARED file mapping ... with memfd
ok 11 Should have worked
# [RUN] R/O longterm GUP pin in MAP_SHARED file mapping ... with tmpfile
ok 12 Should have worked
# [RUN] R/O longterm GUP pin in MAP_SHARED file mapping ... with local tmpfile
ok 13 Should have worked
# [RUN] R/O longterm GUP pin in MAP_SHARED file mapping ... with memfd hugetlb (2048 kB)
ok 14 Should have worked
# [RUN] R/O longterm GUP pin in MAP_SHARED file mapping ... with memfd hugetlb (1048576 kB)
ok 15 Should have worked
# [RUN] R/O longterm GUP-fast pin in MAP_SHARED file mapping ... with memfd
ok 16 Should have worked
# [RUN] R/O longterm GUP-fast pin in MAP_SHARED file mapping ... with tmpfile
ok 17 Should have worked
# [RUN] R/O longterm GUP-fast pin in MAP_SHARED file mapping ... with local tmpfile
ok 18 Should have worked
# [RUN] R/O longterm GUP-fast pin in MAP_SHARED file mapping ... with memfd hugetlb (2048 kB)
ok 19 Should have worked
# [RUN] R/O longterm GUP-fast pin in MAP_SHARED file mapping ... with memfd hugetlb (1048576 kB)
ok 20 Should have worked
# [RUN] R/W longterm GUP pin in MAP_PRIVATE file mapping ... with memfd
ok 21 Should have worked
# [RUN] R/W longterm GUP pin in MAP_PRIVATE file mapping ... with tmpfile
ok 22 Should have worked
# [RUN] R/W longterm GUP pin in MAP_PRIVATE file mapping ... with local tmpfile
ok 23 Should have worked
# [RUN] R/W longterm GUP pin in MAP_PRIVATE file mapping ... with memfd hugetlb (2048 kB)
ok 24 Should have worked
# [RUN] R/W longterm GUP pin in MAP_PRIVATE file mapping ... with memfd hugetlb (1048576 kB)
ok 25 Should have worked
# [RUN] R/W longterm GUP-fast pin in MAP_PRIVATE file mapping ... with memfd
ok 26 Should have worked
# [RUN] R/W longterm GUP-fast pin in MAP_PRIVATE file mapping ... with tmpfile
ok 27 Should have worked
# [RUN] R/W longterm GUP-fast pin in MAP_PRIVATE file mapping ... with local tmpfile
ok 28 Should have worked
# [RUN] R/W longterm GUP-fast pin in MAP_PRIVATE file mapping ... with memfd hugetlb (2048 kB)
ok 29 Should have worked
# [RUN] R/W longterm GUP-fast pin in MAP_PRIVATE file mapping ... with memfd hugetlb (1048576 kB)
ok 30 Should have worked
# [RUN] R/O longterm GUP pin in MAP_PRIVATE file mapping ... with memfd
ok 31 Should have worked
# [RUN] R/O longterm GUP pin in MAP_PRIVATE file mapping ... with tmpfile
ok 32 Should have worked
# [RUN] R/O longterm GUP pin in MAP_PRIVATE file mapping ... with local tmpfile
ok 33 Should have worked
# [RUN] R/O longterm GUP pin in MAP_PRIVATE file mapping ... with memfd hugetlb (2048 kB)
ok 34 Should have worked
# [RUN] R/O longterm GUP pin in MAP_PRIVATE file mapping ... with memfd hugetlb (1048576 kB)
ok 35 Should have worked
# [RUN] R/O longterm GUP-fast pin in MAP_PRIVATE file mapping ... with memfd
ok 36 Should have worked
# [RUN] R/O longterm GUP-fast pin in MAP_PRIVATE file mapping ... with tmpfile
ok 37 Should have worked
# [RUN] R/O longterm GUP-fast pin in MAP_PRIVATE file mapping ... with local tmpfile
ok 38 Should have worked
# [RUN] R/O longterm GUP-fast pin in MAP_PRIVATE file mapping ... with memfd hugetlb (2048 kB)
ok 39 Should have worked
# [RUN] R/O longterm GUP-fast pin in MAP_PRIVATE file mapping ... with memfd hugetlb (1048576 kB)
ok 40 Should have worked
# [RUN] io_uring fixed buffer with MAP_SHARED file mapping ... with memfd
ok 41 Should have worked
# [RUN] io_uring fixed buffer with MAP_SHARED file mapping ... with tmpfile
ok 42 Should have worked
# [RUN] io_uring fixed buffer with MAP_SHARED file mapping ... with local tmpfile
ok 43 Should have failed
# [RUN] io_uring fixed buffer with MAP_SHARED file mapping ... with memfd hugetlb (2048 kB)
ok 44 Should have worked
# [RUN] io_uring fixed buffer with MAP_SHARED file mapping ... with memfd hugetlb (1048576 kB)
ok 45 Should have worked
# [RUN] io_uring fixed buffer with MAP_PRIVATE file mapping ... with memfd
ok 46 Should have worked
# [RUN] io_uring fixed buffer with MAP_PRIVATE file mapping ... with tmpfile
ok 47 Should have worked
# [RUN] io_uring fixed buffer with MAP_PRIVATE file mapping ... with local tmpfile
ok 48 Should have worked
# [RUN] io_uring fixed buffer with MAP_PRIVATE file mapping ... with memfd hugetlb (2048 kB)
ok 49 Should have worked
# [RUN] io_uring fixed buffer with MAP_PRIVATE file mapping ... with memfd hugetlb (1048576 kB)
ok 50 Should have worked
# Totals: pass:50 fail:0 xfail:0 xpass:0 skip:0 error:0
This patch (of 3):
Let's factor detection out into vm_util, to be reused by a new test.
Link: https://lkml.kernel.org/r/20230519102723.185721-1-david@redhat.com
Link: https://lkml.kernel.org/r/20230519102723.185721-2-david@redhat.com
Signed-off-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Lorenzo Stoakes <lstoakes@gmail.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Jason Gunthorpe <jgg@nvidia.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
- Some KSM work from David Hildenbrand, to make the PR_SET_MEMORY_MERGE
ioctl's behavior more similar to KSM's behavior.
-----BEGIN PGP SIGNATURE-----
iHUEABYIAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCZFLsxAAKCRDdBJ7gKXxA
jl8yAQCqjstPsOULf9QN0z4bGAUhY+Wj4ERz1jbKSIuhFCJWiQEAgQvgRXObKjmi
OtUB0Ek4CMDCQzbyIQ1Bhp3kxi6+Jgs=
=AbyC
-----END PGP SIGNATURE-----
Merge tag 'mm-stable-2023-05-03-16-22' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull more MM updates from Andrew Morton:
- Some DAMON cleanups from Kefeng Wang
- Some KSM work from David Hildenbrand, to make the PR_SET_MEMORY_MERGE
ioctl's behavior more similar to KSM's behavior.
[ Andrew called these "final", but I suspect we'll have a series fixing
up the fact that the last commit in the dmapools series in the
previous pull seems to have unintentionally just reverted all the
other commits in the same series.. - Linus ]
* tag 'mm-stable-2023-05-03-16-22' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
mm: hwpoison: coredump: support recovery from dump_user_range()
mm/page_alloc: add some comments to explain the possible hole in __pageblock_pfn_to_page()
mm/ksm: move disabling KSM from s390/gmap code to KSM code
selftests/ksm: ksm_functional_tests: add prctl unmerge test
mm/ksm: unmerge and clear VM_MERGEABLE when setting PR_SET_MEMORY_MERGE=0
mm/damon/paddr: fix missing folio_sz update in damon_pa_young()
mm/damon/paddr: minor refactor of damon_pa_mark_accessed_or_deactivate()
mm/damon/paddr: minor refactor of damon_pa_pageout()
Let's test whether setting PR_SET_MEMORY_MERGE to 0 after setting it to 1
will unmerge pages, similar to how setting MADV_UNMERGEABLE after setting
MADV_MERGEABLE would.
Link: https://lkml.kernel.org/r/20230422205420.30372-3-david@redhat.com
Signed-off-by: David Hildenbrand <david@redhat.com>
Acked-by: Stefan Roesch <shr@devkernel.io>
Cc: Christian Borntraeger <borntraeger@linux.ibm.com>
Cc: Claudio Imbrenda <imbrenda@linux.ibm.com>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Janosch Frank <frankja@linux.ibm.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Rik van Riel <riel@surriel.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Sven Schnelle <svens@linux.ibm.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
- User events are finally ready!
After lots of collaboration between various parties, we finally locked
down on a stable interface for user events that can also work with user
space only tracing. This is implemented by telling the kernel (or user
space library, but that part is user space only and not part of this
patch set), where the variable is that the application uses to know if
something is listening to the trace. There's also an interface to tell
the kernel about these events, which will show up in the
/sys/kernel/tracing/events/user_events/ directory, where it can be
enabled. When it's enabled, the kernel will update the variable, to tell
the application to start writing to the kernel.
See https://lwn.net/Articles/927595/
- Cleaned up the direct trampolines code to simplify arm64 addition of
direct trampolines. Direct trampolines use the ftrace interface but
instead of jumping to the ftrace trampoline, applications (mostly BPF)
can register their own trampoline for performance reasons.
- Some updates to the fprobe infrastructure. fprobes are more efficient than
kprobes, as it does not need to save all the registers that kprobes on
ftrace do. More work needs to be done before the fprobes will be exposed
as dynamic events.
- More updates to references to the obsolete path of
/sys/kernel/debug/tracing for the new /sys/kernel/tracing path.
- Add a seq_buf_do_printk() helper to seq_bufs, to print a large buffer line
by line instead of all at once. There's users in production kernels that
have a large data dump that originally used printk() directly, but the
data dump was larger than what printk() allowed as a single print.
Using seq_buf() to do the printing fixes that.
- Add /sys/kernel/tracing/touched_functions that shows all functions that
was every traced by ftrace or a direct trampoline. This is used for
debugging issues where a traced function could have caused a crash by
a bpf program or live patching.
- Add a "fields" option that is similar to "raw" but outputs the fields of
the events. It's easier to read by humans.
- Some minor fixes and clean ups.
-----BEGIN PGP SIGNATURE-----
iIoEABYIADIWIQRRSw7ePDh/lE+zeZMp5XQQmuv6qgUCZEr36xQccm9zdGVkdEBn
b29kbWlzLm9yZwAKCRAp5XQQmuv6quZHAQCzuqnn2S8DsPd3Sy1vKIYaj0uajW5D
Kz1oUJH4F0H7kgEA8XwXkdtfKpOXWc/ZH4LWfL7Orx2wJZJQMV9dVqEPDAE=
=w0Z1
-----END PGP SIGNATURE-----
Merge tag 'trace-v6.4' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace
Pull tracing updates from Steven Rostedt:
- User events are finally ready!
After lots of collaboration between various parties, we finally
locked down on a stable interface for user events that can also work
with user space only tracing.
This is implemented by telling the kernel (or user space library, but
that part is user space only and not part of this patch set), where
the variable is that the application uses to know if something is
listening to the trace.
There's also an interface to tell the kernel about these events,
which will show up in the /sys/kernel/tracing/events/user_events/
directory, where it can be enabled.
When it's enabled, the kernel will update the variable, to tell the
application to start writing to the kernel.
See https://lwn.net/Articles/927595/
- Cleaned up the direct trampolines code to simplify arm64 addition of
direct trampolines.
Direct trampolines use the ftrace interface but instead of jumping to
the ftrace trampoline, applications (mostly BPF) can register their
own trampoline for performance reasons.
- Some updates to the fprobe infrastructure. fprobes are more efficient
than kprobes, as it does not need to save all the registers that
kprobes on ftrace do. More work needs to be done before the fprobes
will be exposed as dynamic events.
- More updates to references to the obsolete path of
/sys/kernel/debug/tracing for the new /sys/kernel/tracing path.
- Add a seq_buf_do_printk() helper to seq_bufs, to print a large buffer
line by line instead of all at once.
There are users in production kernels that have a large data dump
that originally used printk() directly, but the data dump was larger
than what printk() allowed as a single print.
Using seq_buf() to do the printing fixes that.
- Add /sys/kernel/tracing/touched_functions that shows all functions
that was every traced by ftrace or a direct trampoline. This is used
for debugging issues where a traced function could have caused a
crash by a bpf program or live patching.
- Add a "fields" option that is similar to "raw" but outputs the fields
of the events. It's easier to read by humans.
- Some minor fixes and clean ups.
* tag 'trace-v6.4' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: (41 commits)
ring-buffer: Sync IRQ works before buffer destruction
tracing: Add missing spaces in trace_print_hex_seq()
ring-buffer: Ensure proper resetting of atomic variables in ring_buffer_reset_online_cpus
recordmcount: Fix memory leaks in the uwrite function
tracing/user_events: Limit max fault-in attempts
tracing/user_events: Prevent same address and bit per process
tracing/user_events: Ensure bit is cleared on unregister
tracing/user_events: Ensure write index cannot be negative
seq_buf: Add seq_buf_do_printk() helper
tracing: Fix print_fields() for __dyn_loc/__rel_loc
tracing/user_events: Set event filter_type from type
ring-buffer: Clearly check null ptr returned by rb_set_head_page()
tracing: Unbreak user events
tracing/user_events: Use print_format_fields() for trace output
tracing/user_events: Align structs with tabs for readability
tracing/user_events: Limit global user_event count
tracing/user_events: Charge event allocs to cgroups
tracing/user_events: Update documentation for ABI
tracing/user_events: Use write ABI in example
tracing/user_events: Add ABI self-test
...
switching from a user process to a kernel thread.
- More folio conversions from Kefeng Wang, Zhang Peng and Pankaj Raghav.
- zsmalloc performance improvements from Sergey Senozhatsky.
- Yue Zhao has found and fixed some data race issues around the
alteration of memcg userspace tunables.
- VFS rationalizations from Christoph Hellwig:
- removal of most of the callers of write_one_page().
- make __filemap_get_folio()'s return value more useful
- Luis Chamberlain has changed tmpfs so it no longer requires swap
backing. Use `mount -o noswap'.
- Qi Zheng has made the slab shrinkers operate locklessly, providing
some scalability benefits.
- Keith Busch has improved dmapool's performance, making part of its
operations O(1) rather than O(n).
- Peter Xu adds the UFFD_FEATURE_WP_UNPOPULATED feature to userfaultd,
permitting userspace to wr-protect anon memory unpopulated ptes.
- Kirill Shutemov has changed MAX_ORDER's meaning to be inclusive rather
than exclusive, and has fixed a bunch of errors which were caused by its
unintuitive meaning.
- Axel Rasmussen give userfaultfd the UFFDIO_CONTINUE_MODE_WP feature,
which causes minor faults to install a write-protected pte.
- Vlastimil Babka has done some maintenance work on vma_merge():
cleanups to the kernel code and improvements to our userspace test
harness.
- Cleanups to do_fault_around() by Lorenzo Stoakes.
- Mike Rapoport has moved a lot of initialization code out of various
mm/ files and into mm/mm_init.c.
- Lorenzo Stoakes removd vmf_insert_mixed_prot(), which was added for
DRM, but DRM doesn't use it any more.
- Lorenzo has also coverted read_kcore() and vread() to use iterators
and has thereby removed the use of bounce buffers in some cases.
- Lorenzo has also contributed further cleanups of vma_merge().
- Chaitanya Prakash provides some fixes to the mmap selftesting code.
- Matthew Wilcox changes xfs and afs so they no longer take sleeping
locks in ->map_page(), a step towards RCUification of pagefaults.
- Suren Baghdasaryan has improved mmap_lock scalability by switching to
per-VMA locking.
- Frederic Weisbecker has reworked the percpu cache draining so that it
no longer causes latency glitches on cpu isolated workloads.
- Mike Rapoport cleans up and corrects the ARCH_FORCE_MAX_ORDER Kconfig
logic.
- Liu Shixin has changed zswap's initialization so we no longer waste a
chunk of memory if zswap is not being used.
- Yosry Ahmed has improved the performance of memcg statistics flushing.
- David Stevens has fixed several issues involving khugepaged,
userfaultfd and shmem.
- Christoph Hellwig has provided some cleanup work to zram's IO-related
code paths.
- David Hildenbrand has fixed up some issues in the selftest code's
testing of our pte state changing.
- Pankaj Raghav has made page_endio() unneeded and has removed it.
- Peter Xu contributed some rationalizations of the userfaultfd
selftests.
- Yosry Ahmed has fixed an issue around memcg's page recalim accounting.
- Chaitanya Prakash has fixed some arm-related issues in the
selftests/mm code.
- Longlong Xia has improved the way in which KSM handles hwpoisoned
pages.
- Peter Xu fixes a few issues with uffd-wp at fork() time.
- Stefan Roesch has changed KSM so that it may now be used on a
per-process and per-cgroup basis.
-----BEGIN PGP SIGNATURE-----
iHUEABYIAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCZEr3zQAKCRDdBJ7gKXxA
jlLoAP0fpQBipwFxED0Us4SKQfupV6z4caXNJGPeay7Aj11/kQD/aMRC2uPfgr96
eMG3kwn2pqkB9ST2QpkaRbxA//eMbQY=
=J+Dj
-----END PGP SIGNATURE-----
Merge tag 'mm-stable-2023-04-27-15-30' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull MM updates from Andrew Morton:
- Nick Piggin's "shoot lazy tlbs" series, to improve the peformance of
switching from a user process to a kernel thread.
- More folio conversions from Kefeng Wang, Zhang Peng and Pankaj
Raghav.
- zsmalloc performance improvements from Sergey Senozhatsky.
- Yue Zhao has found and fixed some data race issues around the
alteration of memcg userspace tunables.
- VFS rationalizations from Christoph Hellwig:
- removal of most of the callers of write_one_page()
- make __filemap_get_folio()'s return value more useful
- Luis Chamberlain has changed tmpfs so it no longer requires swap
backing. Use `mount -o noswap'.
- Qi Zheng has made the slab shrinkers operate locklessly, providing
some scalability benefits.
- Keith Busch has improved dmapool's performance, making part of its
operations O(1) rather than O(n).
- Peter Xu adds the UFFD_FEATURE_WP_UNPOPULATED feature to userfaultd,
permitting userspace to wr-protect anon memory unpopulated ptes.
- Kirill Shutemov has changed MAX_ORDER's meaning to be inclusive
rather than exclusive, and has fixed a bunch of errors which were
caused by its unintuitive meaning.
- Axel Rasmussen give userfaultfd the UFFDIO_CONTINUE_MODE_WP feature,
which causes minor faults to install a write-protected pte.
- Vlastimil Babka has done some maintenance work on vma_merge():
cleanups to the kernel code and improvements to our userspace test
harness.
- Cleanups to do_fault_around() by Lorenzo Stoakes.
- Mike Rapoport has moved a lot of initialization code out of various
mm/ files and into mm/mm_init.c.
- Lorenzo Stoakes removd vmf_insert_mixed_prot(), which was added for
DRM, but DRM doesn't use it any more.
- Lorenzo has also coverted read_kcore() and vread() to use iterators
and has thereby removed the use of bounce buffers in some cases.
- Lorenzo has also contributed further cleanups of vma_merge().
- Chaitanya Prakash provides some fixes to the mmap selftesting code.
- Matthew Wilcox changes xfs and afs so they no longer take sleeping
locks in ->map_page(), a step towards RCUification of pagefaults.
- Suren Baghdasaryan has improved mmap_lock scalability by switching to
per-VMA locking.
- Frederic Weisbecker has reworked the percpu cache draining so that it
no longer causes latency glitches on cpu isolated workloads.
- Mike Rapoport cleans up and corrects the ARCH_FORCE_MAX_ORDER Kconfig
logic.
- Liu Shixin has changed zswap's initialization so we no longer waste a
chunk of memory if zswap is not being used.
- Yosry Ahmed has improved the performance of memcg statistics
flushing.
- David Stevens has fixed several issues involving khugepaged,
userfaultfd and shmem.
- Christoph Hellwig has provided some cleanup work to zram's IO-related
code paths.
- David Hildenbrand has fixed up some issues in the selftest code's
testing of our pte state changing.
- Pankaj Raghav has made page_endio() unneeded and has removed it.
- Peter Xu contributed some rationalizations of the userfaultfd
selftests.
- Yosry Ahmed has fixed an issue around memcg's page recalim
accounting.
- Chaitanya Prakash has fixed some arm-related issues in the
selftests/mm code.
- Longlong Xia has improved the way in which KSM handles hwpoisoned
pages.
- Peter Xu fixes a few issues with uffd-wp at fork() time.
- Stefan Roesch has changed KSM so that it may now be used on a
per-process and per-cgroup basis.
* tag 'mm-stable-2023-04-27-15-30' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (369 commits)
mm,unmap: avoid flushing TLB in batch if PTE is inaccessible
shmem: restrict noswap option to initial user namespace
mm/khugepaged: fix conflicting mods to collapse_file()
sparse: remove unnecessary 0 values from rc
mm: move 'mmap_min_addr' logic from callers into vm_unmapped_area()
hugetlb: pte_alloc_huge() to replace huge pte_alloc_map()
maple_tree: fix allocation in mas_sparse_area()
mm: do not increment pgfault stats when page fault handler retries
zsmalloc: allow only one active pool compaction context
selftests/mm: add new selftests for KSM
mm: add new KSM process and sysfs knobs
mm: add new api to enable ksm per process
mm: shrinkers: fix debugfs file permissions
mm: don't check VMA write permissions if the PTE/PMD indicates write permissions
migrate_pages_batch: fix statistics for longterm pin retry
userfaultfd: use helper function range_in_vma()
lib/show_mem.c: use for_each_populated_zone() simplify code
mm: correct arg in reclaim_pages()/reclaim_clean_pages_from_list()
fs/buffer: convert create_page_buffers to folio_create_buffers
fs/buffer: add folio_create_empty_buffers helper
...
- sh: Use generic GCC library routines
- sh: sq: Use the bitmap API when applicable
- sh: sq: Fix incorrect element size for allocating bitmap buffer
- sh: pci: Remove unused variable in SH-7786 PCI Express code
- sh: mcount.S: fix build error when PRINTK is not enabled
- sh: remove sh5/sh64 last fragments
- sh: math-emu: fix macro redefined warning
- sh: init: use OF_EARLY_FLATTREE for early init
- sh: nmi_debug: fix return value of __setup handler
- sh: SH2007: drop the bad URL info
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEEYv+KdYTgKVaVRgAGdCY7N/W1+RMFAmRIz0UACgkQdCY7N/W1
+RMm1Q/9Hw5xMnxHbryDoBAqgwEOZRH+MUMBnAyMw3shqxO/Cp/nIAacvdNmF4Me
iszDjATleshk8vbTwUE6cFPzKuLM8r4o1JfBvYSEBgkfs5YEEhoa+1TQZ6aYl3zD
v6vcVQnobaV5dUc9yUA3FdG/vuXEj7wctZuqO0QYsC/bE5g/r1fFTEd37Jbo2qwg
6sJ+xL8KEa29Abq9OP0QmeOWvHBuGcCLZNgagA4JxT7U4+jYhg0ddphw+c3yybnP
FX1eFMulB98V/oDPCOlfrYsZAkQGoYPWwY0WI/nVg8ujA3lbRkSu6Fd9ic95/PGG
KVjr6Mol6/+ESy4k/MB46bJzq0un2FPWhZzyfL0RoCbX2zQWBtC/1XbT0PmTsRud
CzcPAMpNPDwUTcoSWdUpOfEAbxjIgGNhQBth9lRMNFhNkk8cwgk1UAN0LjBRm5nq
MteTim3qCyiFkNlngpvSVbIokBKWllKAtPSL3wCi6OgQCNm7XWZxme2z8G5tVkit
Q9bTVD5qMt24pRJsGsVho8wvRsqMmtl5hwMzFVP02WBNxb9csHpQHrhG7MRLN9kt
0BPYU6erCcRl9DQ9HonUaKCmJDJEyxUcXan48TSyGzajFDnURS7AfkreO7NHQIbO
YAaCvqCDwGVygBjUQtHLrBWlORjAD8IoMEJ1sivRzCeHXGlmI6s=
=RGSv
-----END PGP SIGNATURE-----
Merge tag 'sh-for-v6.4-tag1' of git://git.kernel.org/pub/scm/linux/kernel/git/glaubitz/sh-linux
Pull sh updates from John Paul Adrian Glaubitz:
"This is a bit larger than my previous one and mainly consists of
clean-up work in the arch/sh directory by Geert Uytterhoeven and Randy
Dunlap.
Additionally, this fixes a bug in the Storage Queue code that was
discovered while I was reviewing a patch to switch the code to the
bitmap API by Christophe Jaillet.
So this contains both a fix for the original bug in the Storage Queue
code that can be backported later as well as the Christophe's patch to
swich the code to the bitmap API.
Summary:
- Use generic GCC library routines
- sq: Use the bitmap API when applicable
- sq: Fix incorrect element size for allocating bitmap buffer
- pci: Remove unused variable in SH-7786 PCI Express code
- mcount.S: fix build error when PRINTK is not enabled
- remove sh5/sh64 last fragments
- math-emu: fix macro redefined warning
- init: use OF_EARLY_FLATTREE for early init
- nmi_debug: fix return value of __setup handler
- SH2007: drop the bad URL info"
* tag 'sh-for-v6.4-tag1' of git://git.kernel.org/pub/scm/linux/kernel/git/glaubitz/sh-linux:
sh: Replace <uapi/asm/types.h> by <asm-generic/int-ll64.h>
sh: Use generic GCC library routines
sh: sq: Use the bitmap API when applicable
sh: sq: Fix incorrect element size for allocating bitmap buffer
sh: pci: Remove unused variable in SH-7786 PCI Express code
sh: mcount.S: fix build error when PRINTK is not enabled
sh: remove sh5/sh64 last fragments
sh: math-emu: fix macro redefined warning
sh: init: use OF_EARLY_FLATTREE for early init
sh: nmi_debug: fix return value of __setup handler
sh: SH2007: drop the bad URL info
This adds three new tests to the selftests for KSM. These tests use the
new prctl API's to enable and disable KSM.
1) add new prctl flags to prctl header file in tools dir
This adds the new prctl flags to the include file prct.h in the
tools directory. This makes sure they are available for testing.
2) add KSM prctl merge test to ksm_tests
This adds the -t option to the ksm_tests program. The -t flag
allows to specify if it should use madvise or prctl ksm merging.
3) add two functions for debugging merge outcome for ksm_tests
This adds two functions to report the metrics in /proc/self/ksm_stat
and /sys/kernel/debug/mm/ksm. The debug output is enabled with the
-d option.
4) add KSM prctl test to ksm_functional_tests
This adds a test to the ksm_functional_test that verifies that the
prctl system call to enable / disable KSM works.
5) add KSM fork test to ksm_functional_test
Add fork test to verify that the MMF_VM_MERGE_ANY flag is inherited
by the child process.
Link: https://lkml.kernel.org/r/20230418051342.1919757-4-shr@devkernel.io
Signed-off-by: Stefan Roesch <shr@devkernel.io>
Acked-by: David Hildenbrand <david@redhat.com>
Cc: Bagas Sanjaya <bagasdotme@gmail.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Rik van Riel <riel@surriel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Add a test suite (with 10 more sub-tests) to cover RO pinning against
fork() over uffd-wp. It covers both:
(1) Early CoW test in fork() when page pinned,
(2) page unshare due to RO longterm pin.
They are:
Testing wp-fork-pin on anon... done
Testing wp-fork-pin on shmem... done
Testing wp-fork-pin on shmem-private... done
Testing wp-fork-pin on hugetlb... done
Testing wp-fork-pin on hugetlb-private... done
Testing wp-fork-pin-with-event on anon... done
Testing wp-fork-pin-with-event on shmem... done
Testing wp-fork-pin-with-event on shmem-private... done
Testing wp-fork-pin-with-event on hugetlb... done
Testing wp-fork-pin-with-event on hugetlb-private... done
CONFIG_GUP_TEST needed or they'll be skipped.
Testing wp-fork-pin on anon... skipped [reason: Possibly CONFIG_GUP_TEST missing or unprivileged]
Note that the major test goal is on private memory, but no hurt to also run
all of them over shared because shared memory should work the same.
Link: https://lkml.kernel.org/r/20230417195317.898696-7-peterx@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Axel Rasmussen <axelrasmussen@google.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Mika Penttilä <mpenttil@redhat.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Nadav Amit <nadav.amit@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
The macro and facility can be reused in other tests too. Make it general.
Link: https://lkml.kernel.org/r/20230417195317.898696-6-peterx@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Axel Rasmussen <axelrasmussen@google.com>
Cc: Mika Penttilä <mpenttil@redhat.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Nadav Amit <nadav.amit@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Extend it to all types of mem, meanwhile add one parallel test when
EVENT_FORK is enabled, where uffd-wp bits should be persisted rather than
dropped.
Since at it, rename the test to "wp-fork" to better show what it means.
Making the new test called "wp-fork-with-event".
Before:
Testing pagemap on anon... done
After:
Testing wp-fork on anon... done
Testing wp-fork on shmem... done
Testing wp-fork on shmem-private... done
Testing wp-fork on hugetlb... done
Testing wp-fork on hugetlb-private... done
Testing wp-fork-with-event on anon... done
Testing wp-fork-with-event on shmem... done
Testing wp-fork-with-event on shmem-private... done
Testing wp-fork-with-event on hugetlb... done
Testing wp-fork-with-event on hugetlb-private... done
Link: https://lkml.kernel.org/r/20230417195317.898696-5-peterx@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Axel Rasmussen <axelrasmussen@google.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Mika Penttilä <mpenttil@redhat.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Nadav Amit <nadav.amit@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Namely:
"-f": add a wildcard filter for tests to run
"-l": list tests rather than running any
"-h": help msg
Link: https://lkml.kernel.org/r/20230417195317.898696-4-peterx@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Axel Rasmussen <axelrasmussen@google.com>
Cc: Mika Penttilä <mpenttil@redhat.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Nadav Amit <nadav.amit@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
The va_high_addr_switch selftest is used to test mmap across 128TB
boundary. It divides the selftest cases into two main categories on the
basis of size. One set is used to create mappings that are multiples of
PAGE_SIZE while the other creates mappings that are multiples of
HUGETLB_SIZE.
In order to run the hugetlb testcases the binary must be appended with
"--run-hugetlb" but the file that used to run the test only invokes the
binary, thereby completely skipping the hugetlb testcases. Hence, the
required statement has been added.
Link: https://lkml.kernel.org/r/20230323105243.2807166-6-chaitanyas.prakash@arm.com
Signed-off-by: Chaitanya S Prakash <chaitanyas.prakash@arm.com>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Arm64 has a default hugepage size of 512MB when CONFIG_ARM64_64K_PAGES=y
is enabled. While testing on arm64 platforms having up to 4PB of virtual
address space, a minimum of 6 hugepages were required for all test cases
to pass. Support for this requirement has been added.
Link: https://lkml.kernel.org/r/20230323105243.2807166-5-chaitanyas.prakash@arm.com
Signed-off-by: Chaitanya S Prakash <chaitanyas.prakash@arm.com>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
The in code comments for the selftest were made on the basis of 128TB
switch, an architecture feature specific to PowerPc and x86 platforms.
Keeping in mind the support added for arm64 platforms which implements a
256TB switch, a more generic explanation has been provided.
Link: https://lkml.kernel.org/r/20230323105243.2807166-4-chaitanyas.prakash@arm.com
Signed-off-by: Chaitanya S Prakash <chaitanyas.prakash@arm.com>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
As the initial selftest only took into consideration PowperPC and x86
architectures, on adding support for arm64, a platform independent naming
convention is chosen.
Link: https://lkml.kernel.org/r/20230323105243.2807166-3-chaitanyas.prakash@arm.com
Signed-off-by: Chaitanya S Prakash <chaitanyas.prakash@arm.com>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Patch series "selftests/mm: Implement support for arm64 on va".
The va_128TBswitch selftest is designed and implemented for PowerPC and
x86 architectures which support a 128TB switch, up to 256TB of virtual
address space and hugepage sizes of 16MB and 2MB respectively. Arm64
platforms on the other hand support a 256Tb switch, up to 4PB of virtual
address space and a default hugepage size of 512MB when 64k pagesize is
enabled.
These architectural differences require introducing support for arm64
platforms, after which a more generic naming convention is suggested. The
in code comments are amended to provide a more platform independent
explanation of the working of the code and nr_hugepages are configured as
required. Finally, the file running the testcase is modified in order to
prevent skipping of hugetlb testcases of va_high_addr_switch.
This patch (of 5):
Arm64 platforms have the ability to support 64kb pagesize, 512MB default
hugepage size and up to 4PB of virtual address space. The address switch
occurs at 256TB as opposed to 128TB. Hence, the necessary support has
been added.
Link: https://lkml.kernel.org/r/20230323105243.2807166-1-chaitanyas.prakash@arm.com
Link: https://lkml.kernel.org/r/20230323105243.2807166-2-chaitanyas.prakash@arm.com
Signed-off-by: Chaitanya S Prakash <chaitanyas.prakash@arm.com>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
This new test tests against the returned ioctls from UFFDIO_REGISTER,
where put into uffdio_register.ioctls.
This also tests the expected failure cases of UFFDIO_REGISTER, aka:
- Register with empty mode should fail with -EINVAL
- Register minor without page cache (anon) should fail with -EINVAL
Link: https://lkml.kernel.org/r/20230412164548.329376-1-peterx@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
Cc: Axel Rasmussen <axelrasmussen@google.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Dmitry Safonov <0x7f454c46@gmail.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Mike Rapoport (IBM) <rppt@kernel.org>
Cc: Zach O'Keefe <zokeefe@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
The userfaultfd stress test never tested private shmem, which I think was
overlooked long due. Add it so it matches with uffd unit test and it'll
cover all memory supported with the three memory types.
Meanwhile, rename the memory types a bit. Considering shared mem is the
major use case for both shmem / hugetlbfs, changing from:
anon, hugetlb, hugetlb_shared, shmem
To (with shmem-private added):
anon, hugetlb, hugetlb-private, shmem, shmem-private
Add the shmem-private to run_vmtests.sh too.
Link: https://lkml.kernel.org/r/20230412164546.329355-1-peterx@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
Cc: Axel Rasmussen <axelrasmussen@google.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Dmitry Safonov <0x7f454c46@gmail.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Mike Rapoport (IBM) <rppt@kernel.org>
Cc: Zach O'Keefe <zokeefe@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
With the new uffd unit test covering the /dev/userfaultfd path and syscall
path of uffd initializations, we can safely drop the devnode test in the
old stress test.
One thing is to avoid duplication of running the stress test twice which is
an overkill to only test the /dev/ interface in run_vmtests.sh.
The other benefit is now all uffd tests (that uses userfaultfd_open) can
run automatically as long as any type of interface is enabled (either
syscall or dev), so it's more likely to succeed rather than fail due to
unprivilege.
With this patch lands, we can drop all the "mem_type:XXX" handlings too.
Link: https://lkml.kernel.org/r/20230412164525.329176-1-peterx@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
Cc: Axel Rasmussen <axelrasmussen@google.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Dmitry Safonov <0x7f454c46@gmail.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Mike Rapoport (IBM) <rppt@kernel.org>
Cc: Zach O'Keefe <zokeefe@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Allow skip a unit test properly due to no privilege (e.g. sigbus and
events tests).
[colin.i.king@gmail.com: fix spelling mistake "priviledge" -> "privilege"]
Link: https://lkml.kernel.org/r/20230414081506.1678998-1-colin.i.king@gmail.com
Link: https://lkml.kernel.org/r/20230412164520.329163-1-peterx@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Cc: Axel Rasmussen <axelrasmussen@google.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Dmitry Safonov <0x7f454c46@gmail.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Mike Rapoport (IBM) <rppt@kernel.org>
Cc: Zach O'Keefe <zokeefe@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Userfaultfd minor+wp mode was very recently added. The test will fail on
the old kernels at ioctl(UFFDIO_CONTINUE) which is misterious.
Unfortunately there's no feature bit to detect for this support.
Add a hack to leverage WP_UNPOPULATED to detect whether that feature
existed, since WP_UNPOPULATED was merged right after minor+wp.
Link: https://lkml.kernel.org/r/20230412164517.329152-1-peterx@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
Cc: Axel Rasmussen <axelrasmussen@google.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Dmitry Safonov <0x7f454c46@gmail.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Mike Rapoport (IBM) <rppt@kernel.org>
Cc: Zach O'Keefe <zokeefe@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Simplifies it a bit along the way, e.g., drop the never used offset field
(which was always the 1st page so offset=0).
Introduce uffd_register_with_ioctls() out of uffd_register() to detect
uffdio_register.ioctls got returned. Check that automatically when testing
UFFDIO_ZEROPAGE on different types of memory (and kernel).
Link: https://lkml.kernel.org/r/20230412164404.328815-1-peterx@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
Cc: Axel Rasmussen <axelrasmussen@google.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Dmitry Safonov <0x7f454c46@gmail.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Mike Rapoport (IBM) <rppt@kernel.org>
Cc: Zach O'Keefe <zokeefe@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Move the two tests into the unit test, and convert it into 20 standalone
tests:
- events test on all 5 mem types, with wp on/off
- signal test on all 5 mem types, with wp on/off
Testing sigbus on anon... done
Testing sigbus on shmem... done
Testing sigbus on shmem-private... done
Testing sigbus on hugetlb... done
Testing sigbus on hugetlb-private... done
Testing sigbus-wp on anon... done
Testing sigbus-wp on shmem... done
Testing sigbus-wp on shmem-private... done
Testing sigbus-wp on hugetlb... done
Testing sigbus-wp on hugetlb-private... done
Testing events on anon... done
Testing events on shmem... done
Testing events on shmem-private... done
Testing events on hugetlb... done
Testing events on hugetlb-private... done
Testing events-wp on anon... done
Testing events-wp on shmem... done
Testing events-wp on shmem-private... done
Testing events-wp on hugetlb... done
Testing events-wp on hugetlb-private... done
It'll also remove a lot of global references along the way,
e.g. test_uffdio_wp will be replaced with the wp value passed over.
Link: https://lkml.kernel.org/r/20230412164400.328798-1-peterx@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
Cc: Axel Rasmussen <axelrasmussen@google.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Dmitry Safonov <0x7f454c46@gmail.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Mike Rapoport (IBM) <rppt@kernel.org>
Cc: Zach O'Keefe <zokeefe@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
This moves the minor test to the new unit test.
Rewrite the content check with char* opeartions to avoid fiddling with
my_bcmp().
Drop global vars test_uffdio_minor and test_collapse, just assume test them
always in common code for now.
OTOH make this single test into five tests:
- minor test on [shmem, hugetlb] with wp=false
- minor test on [shmem, hugetlb] with wp=true
- minor test + collapse on shmem only
One thing to mention that we used to test COLLAPSE+WP but that doesn't
sound right at all. It's possible it's silently broken but unnoticed
because COLLAPSE is not part of the default test suite.
Make the MADV_COLLAPSE test fail-able (by skip it when failing), because
it's not guaranteed to success anyway.
Drop a bunch of useless code after the move, because the unit test always
use aligned num of pages and has nothing to do with n_cpus.
Link: https://lkml.kernel.org/r/20230412164357.328779-1-peterx@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
Cc: Zach O'Keefe <zokeefe@google.com>
Cc: Axel Rasmussen <axelrasmussen@google.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Dmitry Safonov <0x7f454c46@gmail.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Mike Rapoport (IBM) <rppt@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Move it over and make it split into two tests, one for pagemap and one for
the new WP_UNPOPULATED (to be a separate one).
The thp pagemap test wasn't really working (with MADV_HUGEPAGE). Let's
just drop it (since it never really worked anyway..) and leave that for
later.
Link: https://lkml.kernel.org/r/20230412164352.328733-1-peterx@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Mike Rapoport (IBM) <rppt@kernel.org>
Cc: Axel Rasmussen <axelrasmussen@google.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Dmitry Safonov <0x7f454c46@gmail.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Zach O'Keefe <zokeefe@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Add a framework to be prepared to move unit tests from uffd-stress.c into
uffd-unit-tests.c. The goal is to allow detection of uffd features for
each test, and also loop over specified types of memory that a test
support.
Link: https://lkml.kernel.org/r/20230412164348.328710-1-peterx@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
Cc: Axel Rasmussen <axelrasmussen@google.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Dmitry Safonov <0x7f454c46@gmail.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Mike Rapoport (IBM) <rppt@kernel.org>
Cc: Zach O'Keefe <zokeefe@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Mostly to detect hugetlb allocation errors and skip hugetlb tests when
pages are not allocated.
Link: https://lkml.kernel.org/r/20230412164345.328659-1-peterx@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
Cc: Axel Rasmussen <axelrasmussen@google.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Dmitry Safonov <0x7f454c46@gmail.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Mike Rapoport (IBM) <rppt@kernel.org>
Cc: Zach O'Keefe <zokeefe@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Make the handler optionally apply WP bit when resolving page faults for
either missing or minor page faults. This moves towards removing global
test_uffdio_wp outside of the common code.
Link: https://lkml.kernel.org/r/20230412164341.328618-1-peterx@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
Cc: Axel Rasmussen <axelrasmussen@google.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Dmitry Safonov <0x7f454c46@gmail.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Mike Rapoport (IBM) <rppt@kernel.org>
Cc: Zach O'Keefe <zokeefe@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Prepare for adding more fields into the struct.
Link: https://lkml.kernel.org/r/20230412164337.328607-1-peterx@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
Suggested-by: Mike Rapoport (IBM) <rppt@kernel.org>
Cc: Axel Rasmussen <axelrasmussen@google.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Dmitry Safonov <0x7f454c46@gmail.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Zach O'Keefe <zokeefe@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
hpage_size was wrongly used. Sometimes it means hugetlb default size,
sometimes it was used as thp size.
Remove the global variable and use the right one at each place.
Link: https://lkml.kernel.org/r/20230412164333.328596-1-peterx@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Mike Rapoport (IBM) <rppt@kernel.org>
Cc: Axel Rasmussen <axelrasmussen@google.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Dmitry Safonov <0x7f454c46@gmail.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Zach O'Keefe <zokeefe@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Drop it by creating the memfd dynamically in the tests.
Link: https://lkml.kernel.org/r/20230412164331.328584-1-peterx@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Mike Rapoport (IBM) <rppt@kernel.org>
Cc: Axel Rasmussen <axelrasmussen@google.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Dmitry Safonov <0x7f454c46@gmail.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Zach O'Keefe <zokeefe@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Add one simple test for UFFDIO_API. With that, I also added a bunch of
small but handy helpers along the way.
Link: https://lkml.kernel.org/r/20230412164257.328375-1-peterx@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
Cc: Axel Rasmussen <axelrasmussen@google.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Dmitry Safonov <0x7f454c46@gmail.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Mike Rapoport (IBM) <rppt@kernel.org>
Cc: Zach O'Keefe <zokeefe@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Provide two helpers to open an uffd handle. Drop the error checks around
SKIPs because it's inside an errexit() anyway, which IMHO doesn't really
help much if the test will not continue.
Link: https://lkml.kernel.org/r/20230412164254.328335-1-peterx@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Mike Rapoport (IBM) <rppt@kernel.org>
Reviewed-by: Axel Rasmussen <axelrasmussen@google.com>
Cc: Dmitry Safonov <0x7f454c46@gmail.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Zach O'Keefe <zokeefe@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Add two helpers to register/unregister to an uffd. Use them to drop
duplicate codes.
This patch also drops assert_expected_ioctls_present() and
get_expected_ioctls(). Reasons:
- It'll need a lot of effort to pass test_type==HUGETLB into it from
the upper, so it's the simplest way to get rid of another global var
- The ioctls returned in UFFDIO_REGISTER is hardly useful at all,
because any app can already detect kernel support on any ioctl via its
corresponding UFFD_FEATURE_*. The check here is for sanity mostly but
it's probably destined no user app will even use it.
- It's not friendly to one future goal of uffd to run on old
kernels, the problem is get_expected_ioctls() compiles against
UFFD_API_RANGE_IOCTLS, which is a value that can change depending on
where the test is compiled, rather than reflecting what the kernel
underneath has. It means it'll report false negatives on old kernels
so it's against our will.
So let's make our lives easier.
[peterx@redhat.com; tools/testing/selftests/mm/hugepage-mremap.c: add headers]
Link: https://lkml.kernel.org/r/ZDxrvZh/cw357D8P@x1n
Link: https://lkml.kernel.org/r/20230412164247.328293-1-peterx@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Axel Rasmussen <axelrasmussen@google.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Dmitry Safonov <0x7f454c46@gmail.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Mike Rapoport (IBM) <rppt@kernel.org>
Cc: Zach O'Keefe <zokeefe@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
In many ways it's weird and unwanted to keep all the tests in the same
userfaultfd.c at least when still in the current way.
For example, it doesn't make much sense to run the stress test for each
method we can create an userfaultfd handle (either via syscall or /dev/
node). It's a waste of time running this twice for the whole stress as
the stress paths are the same, only the open path is different.
It's also just weird to need to manually specify different types of memory
to run all unit tests for the userfaultfd interface. We should be able to
just run a single program and that should go through all functional uffd
tests without running the stress test at all. The stress test was more
for torturing and finding race conditions. We don't want to wait for
stress to finish just to regress test a functional test.
When we start to pile up more things on top of the same file and same
functions, things start to go a bit chaos and the code is just harder to
maintain too with tons of global variables.
This patch creates a new test uffd-unit-tests to keep userfaultfd unit
tests in the future, currently empty.
Meanwhile rename the old userfaultfd.c test to uffd-stress.c.
Link: https://lkml.kernel.org/r/20230412164244.328270-1-peterx@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Mike Rapoport (IBM) <rppt@kernel.org>
Reviewed-by: Axel Rasmussen <axelrasmussen@google.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Dmitry Safonov <0x7f454c46@gmail.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Zach O'Keefe <zokeefe@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Move common utility functions into uffd-common.[ch] files from the
original userfaultfd.c. This prepares for a split of userfaultfd.c into
two tests: one to only cover the old but powerful stress test, the other
one covers all the functional tests.
This movement is kind of a brute-force effort for now, with light
touch-ups but nothing should really change. There's chances to optimize
more, but let's leave that for later.
Link: https://lkml.kernel.org/r/20230412164241.328259-1-peterx@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Mike Rapoport (IBM) <rppt@kernel.org>
Reviewed-by: Axel Rasmussen <axelrasmussen@google.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Dmitry Safonov <0x7f454c46@gmail.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Zach O'Keefe <zokeefe@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
The idea was trying to flip this var in the alarm handler from time to
time to test -EEXIST of UFFDIO_ZEROPAGE, but firstly it's only used in the
zeropage test so probably only used once, meanwhile we passed
"retry==false" so it'll never got tested anyway.
Drop both sides so we always test UFFDIO_ZEROPAGE retries if has_zeropage
is set (!hugetlb).
One more thing to do is doing UFFDIO_REGISTER for the alias buffer too,
because otherwise the test won't even pass! We were just lucky that this
test never really got ran at all.
Link: https://lkml.kernel.org/r/20230412164238.328238-1-peterx@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Mike Rapoport (IBM) <rppt@kernel.org>
Cc: Axel Rasmussen <axelrasmussen@google.com>
Cc: Dmitry Safonov <0x7f454c46@gmail.com>
Cc: Zach O'Keefe <zokeefe@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Make the check as simple as "test_type == TEST_HUGETLB" because that's the
only mem that doesn't support ZEROPAGE.
Link: https://lkml.kernel.org/r/20230412164234.328168-1-peterx@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Axel Rasmussen <axelrasmussen@google.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Dmitry Safonov <0x7f454c46@gmail.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Mike Rapoport (IBM) <rppt@kernel.org>
Cc: Zach O'Keefe <zokeefe@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
We've got the macros in uffd-stress.c, move it over and use it in
vm_util.h.
Link: https://lkml.kernel.org/r/20230412164227.328145-1-peterx@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Axel Rasmussen <axelrasmussen@google.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Mike Rapoport (IBM) <rppt@kernel.org>
Cc: Dmitry Safonov <0x7f454c46@gmail.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Zach O'Keefe <zokeefe@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
There're already 3 same definitions of the three functions. Move it into
vm_util.[ch].
Link: https://lkml.kernel.org/r/20230412164223.328134-1-peterx@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Axel Rasmussen <axelrasmussen@google.com>
Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Mike Rapoport (IBM) <rppt@kernel.org>
Cc: Dmitry Safonov <0x7f454c46@gmail.com>
Cc: Zach O'Keefe <zokeefe@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
We do have plenty of files that want to link against vm_util.c. Just make
it simple by linking it always.
Link: https://lkml.kernel.org/r/20230412164220.328123-1-peterx@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Mike Rapoport (IBM) <rppt@kernel.org>
Cc: Axel Rasmussen <axelrasmussen@google.com>
Cc: Dmitry Safonov <0x7f454c46@gmail.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Zach O'Keefe <zokeefe@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
TEST_GEN_PROGS and TEST_GEN_FILES are used randomly in the mm/Makefile to
specify programs that need to build. Logically all these binaries should
all fall into TEST_GEN_PROGS.
Replace those TEST_GEN_FILES with TEST_GEN_PROGS, so that we can reference
all the tests easily later.
[peterx@redhat.com: tools/testing/selftests/mm/Makefile: don't wipe out TEST_GEN_PROGS]
Link: https://lkml.kernel.org/r/ZDxrvZh/cw357D8P@x1n
Link: https://lkml.kernel.org/r/20230412164218.328104-1-peterx@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Mike Rapoport (IBM) <rppt@kernel.org>
Cc: Axel Rasmussen <axelrasmussen@google.com>
Cc: Dmitry Safonov <0x7f454c46@gmail.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Zach O'Keefe <zokeefe@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
There're two util headers under mm/ kselftest. Merge one with another.
It turns out util.h is the easy one to move.
When merging, drop PAGE_SIZE / PAGE_SHIFT because they're unnecessary
wrappers to page_size() / page_shift(), meanwhile rename them to psize()
and pshift() so as to not conflict with some existing definitions in some
test files that includes vm_util.h.
Link: https://lkml.kernel.org/r/20230412164120.327731-1-peterx@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Axel Rasmussen <axelrasmussen@google.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Mike Rapoport (IBM) <rppt@kernel.org>
Cc: Dmitry Safonov <0x7f454c46@gmail.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Zach O'Keefe <zokeefe@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Dump a summary after running whatever test specified. Useful for human
runners to identify any kind of failures (besides exit code).
Link: https://lkml.kernel.org/r/20230412164117.327720-1-peterx@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Axel Rasmussen <axelrasmussen@google.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Mike Rapoport (IBM) <rppt@kernel.org>
Cc: Dmitry Safonov <0x7f454c46@gmail.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Zach O'Keefe <zokeefe@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Patch series "selftests/mm: Split / Refactor userfault test", v2.
This patchset splits userfaultfd.c into two tests:
- uffd-stress: the "vanilla", old and powerful stress test
- uffd-unit-tests: all the unit tests will be moved here
This is on my todo list for a long time but I never did it for real. The
uffd test is growing into a small and cute monster. I start to notice it's
going harder to maintain such a test and make it useful.
A few issues I found when looking at userfaultfd test:
- We have a bunch of unit tests in userfaultfd.c, but they always need to
be run only after a stress type. No way to not do it.
- We can only run an unit test for one memory type only, if we want to
do a quick smoke test to check regressions, there's no good way. The
best to come currently is "bash ./run_vmtests.sh -t userfaultfd" thanks
to the most recent changes to run_vmtests.sh on tagging. Still, that
needs to run the stress tests always and hard to see what's wrong.
- It's hard to add a new unit test to userfaultfd.c, we don't really know
what's happening, not until we mostly read the whole file.
- We did a bunch of useless tests, e.g. we run twice the whole suite of
stress test just to verify both syscall and /dev/userfaultfd. They're
all using userfaultfd_new() to create the handle, everything should
really be the same underneath. One simple unit test should cover that!
- We have tens of global variables in one file but shared with all the
tests. Some of them are not suitable to be a global var from
maintainance pov. It enforces every unit test to consider how these
vars affects the stress test and vice versa, but that's logically not
necessary.
- Userfaultfd test is not friendly to old kernels. Mostly it only works
on the latest kernel tree. It's preferrable to be run on all kernels
and properly report what's missing.
I'll stop here, I feel like I can still list some..
This patchset should resolve all issues above, and actually we can do even
more on top. I stopped doing that until I found I already got 29 patches
and 2000+ LOC changes. That's already a patchset terrible enough so we
should move in small steps.
After the whole set applied, "./run_vmtests.sh -t userfaultfd" looks like
this:
===8<===
vm.nr_hugepages = 1024
-------------------------
running ./uffd-unit-tests
-------------------------
Testing UFFDIO_API (with syscall)... done
Testing UFFDIO_API (with /dev/userfaultfd)... done
Testing register-ioctls on anon... done
Testing register-ioctls on shmem... done
Testing register-ioctls on shmem-private... done
Testing register-ioctls on hugetlb... done
Testing register-ioctls on hugetlb-private... done
Testing zeropage on anon... done
Testing zeropage on shmem... done
Testing zeropage on shmem-private... done
Testing zeropage on hugetlb... done
Testing zeropage on hugetlb-private... done
Testing pagemap on anon... done
Testing wp-unpopulated on anon... done
Testing minor on shmem... done
Testing minor on hugetlb... done
Testing minor-wp on shmem... done
Testing minor-wp on hugetlb... done
Testing minor-collapse on shmem... done
Testing sigbus on anon... done
Testing sigbus on shmem... done
Testing sigbus on shmem-private... done
Testing sigbus on hugetlb... done
Testing sigbus on hugetlb-private... done
Testing sigbus-wp on anon... done
Testing sigbus-wp on shmem... done
Testing sigbus-wp on shmem-private... done
Testing sigbus-wp on hugetlb... done
Testing sigbus-wp on hugetlb-private... done
Testing events on anon... done
Testing events on shmem... done
Testing events on shmem-private... done
Testing events on hugetlb... done
Testing events on hugetlb-private... done
Testing events-wp on anon... done
Testing events-wp on shmem... done
Testing events-wp on shmem-private... done
Testing events-wp on hugetlb... done
Testing events-wp on hugetlb-private... done
Userfaults unit tests: pass=39, skip=0, fail=0 (total=39)
[PASS]
--------------------------------
running ./uffd-stress anon 20 16
--------------------------------
nr_pages: 5120, nr_pages_per_cpu: 640
bounces: 15, mode: rnd racing ver poll, userfaults: 345 missing (26+48+61+102+30+12+59+7) 1596 wp (120+139+317+346+215+67+306+86)
[...]
[PASS]
------------------------------------
running ./uffd-stress hugetlb 128 32
------------------------------------
nr_pages: 64, nr_pages_per_cpu: 8
bounces: 31, mode: rnd racing ver poll, userfaults: 29 missing (6+6+6+5+4+2+0+0) 104 wp (20+19+22+18+7+12+5+1)
[...]
[PASS]
--------------------------------------------
running ./uffd-stress hugetlb-private 128 32
--------------------------------------------
nr_pages: 64, nr_pages_per_cpu: 8
bounces: 31, mode: rnd racing ver poll, userfaults: 33 missing (12+9+7+0+5+0+0+0) 111 wp (24+25+14+14+11+17+5+1)
[...]
[PASS]
---------------------------------
running ./uffd-stress shmem 20 16
---------------------------------
nr_pages: 5120, nr_pages_per_cpu: 640
bounces: 15, mode: rnd racing ver poll, userfaults: 247 missing (15+17+34+60+81+37+3+0) 2038 wp (180+114+276+400+381+318+165+204)
[...]
[PASS]
-----------------------------------------
running ./uffd-stress shmem-private 20 16
-----------------------------------------
nr_pages: 5120, nr_pages_per_cpu: 640
bounces: 15, mode: rnd racing ver poll, userfaults: 235 missing (52+29+55+56+13+9+16+5) 2849 wp (218+406+461+531+328+284+430+191)
[...]
[PASS]
SUMMARY: PASS=6 SKIP=0 FAIL=0
===8<===
The output may be different if we miss some features (e.g., hugetlb not
allocated, old kernel, less privilege of uffd handle), but they should show
up with good reasons. E.g., I tried to run the unit test on my Fedora
kernel and it gives me:
===8<===
UFFDIO_API (with syscall)... failed [reason: UFFDIO_API should fail with wrong api but didn't]
UFFDIO_API (with /dev/userfaultfd)... skipped [reason: cannot open userfaultfd handle]
zeropage on anon... done
zeropage on shmem... done
zeropage on shmem-private... done
zeropage-hugetlb on hugetlb... done
zeropage-hugetlb on hugetlb-private... done
pagemap on anon... pagemap on anon... pagemap on anon... done
wp-unpopulated on anon... skipped [reason: feature missing]
minor on shmem... done
minor on hugetlb... done
minor-wp on shmem... skipped [reason: feature missing]
minor-wp on hugetlb... skipped [reason: feature missing]
minor-collapse on shmem... done
sigbus on anon... skipped [reason: possible lack of priviledge]
sigbus on shmem... skipped [reason: possible lack of priviledge]
sigbus on shmem-private... skipped [reason: possible lack of priviledge]
sigbus on hugetlb... skipped [reason: possible lack of priviledge]
sigbus on hugetlb-private... skipped [reason: possible lack of priviledge]
sigbus-wp on anon... skipped [reason: possible lack of priviledge]
sigbus-wp on shmem... skipped [reason: possible lack of priviledge]
sigbus-wp on shmem-private... skipped [reason: possible lack of priviledge]
sigbus-wp on hugetlb... skipped [reason: possible lack of priviledge]
sigbus-wp on hugetlb-private... skipped [reason: possible lack of priviledge]
events on anon... skipped [reason: possible lack of priviledge]
events on shmem... skipped [reason: possible lack of priviledge]
events on shmem-private... skipped [reason: possible lack of priviledge]
events on hugetlb... skipped [reason: possible lack of priviledge]
events on hugetlb-private... skipped [reason: possible lack of priviledge]
events-wp on anon... skipped [reason: possible lack of priviledge]
events-wp on shmem... skipped [reason: possible lack of priviledge]
events-wp on shmem-private... skipped [reason: possible lack of priviledge]
events-wp on hugetlb... skipped [reason: possible lack of priviledge]
events-wp on hugetlb-private... skipped [reason: possible lack of priviledge]
Userfaults unit tests: pass=9, skip=24, fail=1 (total=34)
===8<===
Patch layout:
- Revert "userfaultfd: don't fail on unrecognized features"
Something I found when I got the UFFDIO_API test below. Axel, I still
propose to revert it as a whole, but feel free to continue the discussion
from the original patch thread.
- selftests/mm: Update .gitignore with two missing tests
- selftests/mm: Dump a summary in run_vmtests.sh
- selftests/mm: Merge util.h into vm_util.h
- selftests/mm: Use TEST_GEN_PROGS where proper
- selftests/mm: Link vm_util.c always
- selftests/mm: Merge default_huge_page_size() into one
- selftests/mm: Use PM_* macros in vm_utils.h
- selftests/mm: Reuse pagemap_get_entry() in vm_util.h
- selftests/mm: Test UFFDIO_ZEROPAGE only when !hugetlb
- selftests/mm: Drop test_uffdio_zeropage_eexist
Until here, all cleanups here and there. I wanted to keep going, but I
found that maybe it'll take a few more days to split the test. Hence I
did a split starting from the next one, so we have a working thing first.
- selftests/mm: Create uffd-common.[ch]
- selftests/mm: Split uffd tests into uffd-stress and uffd-unit-tests
This did the major brute force split of common codes into
uffd-common.[ch]. That'll be the so far common base for stress and unit
tests. Then a new unit test is created.
- selftests/mm: uffd_[un]register()
- selftests/mm: uffd_open_{dev|sys}()
- selftests/mm: UFFDIO_API test
This patch hides here to start writting the 1st unit test with
UFFDIO_API, also detection of userfaultfd privileges.
- selftests/mm: Drop global mem_fd in uffd tests
- selftests/mm: Drop global hpage_size in uffd tests
- selftests/mm: Rename uffd_stats to uffd_args
- selftests/mm: Let uffd_handle_page_fault() takes wp parameter
- selftests/mm: Allow allocate_area() to fail properly
Some further cleanup that I noticed otherwise hard to move the tests.
- selftests/mm: Add framework for uffd-unit-test
The major patch provides the framework for most of the rest unit tests.
- selftests/mm: Move uffd pagemap test to unit test
- selftests/mm: Move uffd minor test to unit test
- selftests/mm: Move uffd sig/events tests into uffd unit tests
- selftests/mm: Move zeropage test into uffd unit tests
Move unit tests and suite them into the new file.
- selftests/mm: Workaround no way to detect uffd-minor + wp
- selftests/mm: Allow uffd test to skip properly with no privilege
- selftests/mm: Drop sys/dev test in uffd-stress test
- selftests/mm: Add shmem-private test to uffd-stress
A bunch of changes to do better on error reportings, and add
shmem-private to the stress test which was long missing.
- selftests/mm: Add uffdio register ioctls test
One more patch to test uffdio_register.ioctls.
This patch (of 30):
Update .gitignore with two missing tests.
Link: https://lkml.kernel.org/r/20230412163922.327282-1-peterx@redhat.com
Link: https://lkml.kernel.org/r/20230412164114.327709-1-peterx@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Mike Rapoport (IBM) <rppt@kernel.org>
Cc: Axel Rasmussen <axelrasmussen@google.com>
Cc: Dmitry Safonov <0x7f454c46@gmail.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Zach O'Keefe <zokeefe@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Let's add some tests that trigger (pte|pmd)_mkdirty on VMAs without write
permissions. If an architecture implementation is wrong, we might
accidentally set the PTE/PMD writable and allow for write access in a VMA
without write permissions.
The tests include reproducers for the two issues recently discovered
and worked-around in core-MM for now:
(1) commit 624a2c94f5 ("Partly revert "mm/thp: carry over dirty
bit when thp splits on pmd"")
(2) commit 96a9c287e2 ("mm/migrate: fix wrongly apply write bit
after mkdirty on sparc64")
In addition, some other tests that reveal further issues.
All tests pass under x86_64:
./mkdirty
# [INFO] detected THP size: 2048 KiB
TAP version 13
1..6
# [INFO] PTRACE write access
ok 1 SIGSEGV generated, page not modified
# [INFO] PTRACE write access to THP
ok 2 SIGSEGV generated, page not modified
# [INFO] Page migration
ok 3 SIGSEGV generated, page not modified
# [INFO] Page migration of THP
ok 4 SIGSEGV generated, page not modified
# [INFO] PTE-mapping a THP
ok 5 SIGSEGV generated, page not modified
# [INFO] UFFDIO_COPY
ok 6 SIGSEGV generated, page not modified
# Totals: pass:6 fail:0 xfail:0 xpass:0 skip:0 error:0
But some fail on sparc64:
./mkdirty
# [INFO] detected THP size: 8192 KiB
TAP version 13
1..6
# [INFO] PTRACE write access
not ok 1 SIGSEGV generated, page not modified
# [INFO] PTRACE write access to THP
not ok 2 SIGSEGV generated, page not modified
# [INFO] Page migration
ok 3 SIGSEGV generated, page not modified
# [INFO] Page migration of THP
ok 4 SIGSEGV generated, page not modified
# [INFO] PTE-mapping a THP
ok 5 SIGSEGV generated, page not modified
# [INFO] UFFDIO_COPY
not ok 6 SIGSEGV generated, page not modified
Bail out! 3 out of 6 tests failed
# Totals: pass:3 fail:3 xfail:0 xpass:0 skip:0 error:0
Reverting both above commits makes all tests fail on sparc64:
./mkdirty
# [INFO] detected THP size: 8192 KiB
TAP version 13
1..6
# [INFO] PTRACE write access
not ok 1 SIGSEGV generated, page not modified
# [INFO] PTRACE write access to THP
not ok 2 SIGSEGV generated, page not modified
# [INFO] Page migration
not ok 3 SIGSEGV generated, page not modified
# [INFO] Page migration of THP
not ok 4 SIGSEGV generated, page not modified
# [INFO] PTE-mapping a THP
not ok 5 SIGSEGV generated, page not modified
# [INFO] UFFDIO_COPY
not ok 6 SIGSEGV generated, page not modified
Bail out! 6 out of 6 tests failed
# Totals: pass:0 fail:6 xfail:0 xpass:0 skip:0 error:0
The tests are useful to detect other problematic archs, to verify new
arch fixes, and to stop such issues from reappearing in the future.
For now, we don't add any hugetlb tests.
Link: https://lkml.kernel.org/r/20230411142512.438404-3-david@redhat.com
Signed-off-by: David Hildenbrand <david@redhat.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Hugh Dickins <hughd@google.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Sam Ravnborg <sam@ravnborg.org>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Yu Zhao <yuzhao@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Patch series "mm: (pte|pmd)_mkdirty() should not unconditionally allow for
write access".
This is the follow-up on [1], adding selftests (testing for known issues
we added workarounds for and other issues that haven't been fixed yet),
fixing sparc64, reverting the workarounds, and perform one cleanup.
The patch from [1] was modified slightly (updated/extended patch
description, dropped one unnecessary NOP instruction from the ASM in
__pte_mkhwwrite()).
Retested on x86_64 and sparc64 (sun4u in QEMU).
I scanned most architectures to make sure their (pte|pmd)_mkdirty()
handling is correct. To be sure, we can run the selftests and find out if
other architectures are still affectes (loongarch was fixed recently as
well).
Based on master for now. I don't expect surprises regarding mm-tress, but
I can rebase if there are any problems.
This patch (of 6):
The COW selftest can deal with THP not being configured. So move error
handling of read_pmd_pagesize() into the callers such that we can reuse it
in the COW selftest.
Link: https://lkml.kernel.org/r/20230411142512.438404-1-david@redhat.com
Link: https://lkml.kernel.org/r/20221212130213.136267-1-david@redhat.com [1]
Link: https://lkml.kernel.org/r/20230411142512.438404-2-david@redhat.com
Signed-off-by: David Hildenbrand <david@redhat.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Hugh Dickins <hughd@google.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Sam Ravnborg <sam@ravnborg.org>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Yu Zhao <yuzhao@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
The kernel's default behaviour is to obstruct the allocation of high
virtual address as it handles memory overcommit in a heuristic manner.
Setting the parameter as OVERCOMMIT_ALWAYS, ensures kernel isn't
susceptible to the availability of a platform's physical memory when
denying a memory allocation request.
Link: https://lkml.kernel.org/r/20230323060121.1175830-4-chaitanyas.prakash@arm.com
Signed-off-by: Chaitanya S Prakash <chaitanyas.prakash@arm.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Although there is a provision for 52 bit VA on arm64 platform, it remains
unutilised and higher addresses are not allocated. In order to
accommodate 4PB [2^52] virtual address space where supported,
NR_CHUNKS_HIGH is changed accordingly.
Array holding addresses is changed from static allocation to dynamic
allocation to accommodate its voluminous nature which otherwise might
overflow the stack.
Link: https://lkml.kernel.org/r/20230323060121.1175830-3-chaitanyas.prakash@arm.com
Signed-off-by: Chaitanya S Prakash <chaitanyas.prakash@arm.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Patch series "selftests: Fix virtual address range for arm64", v2.
When the virtual address range selftest is run on arm64 and x86 platforms,
it is observed that both the low and high VA range iterations are skipped
when the MAP_CHUNK_SIZE is set to 16GB. The MAP_CHUNK_SIZE is changed to
1GB to resolve this issue, following which support for arm64 platform is
added by changing the NR_CHUNKS_HIGH for aarch64 to accommodate up to 4PB
of virtual address space allocation requests. Dynamic memory allocation
of array holding addresses is introduced to prevent overflow of the stack.
Finally, the overcommit_policy is set as OVERCOMMIT_ALWAYS to prevent the
kernel from denying a memory allocation request based on a platform's
physical memory availability.
This patch (of 3):
mmap() fails to allocate 16GB virtual space chunk, skipping both low and
high VA range iterations. Hence, reduce MAP_CHUNK_SIZE to 1GB and update
relevant macros as required.
Link: https://lkml.kernel.org/r/20230323060121.1175830-1-chaitanyas.prakash@arm.com
Link: https://lkml.kernel.org/r/20230323060121.1175830-2-chaitanyas.prakash@arm.com
Signed-off-by: Chaitanya S Prakash <chaitanyas.prakash@arm.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
UFFDIO_COPY already has UFFDIO_COPY_MODE_WP, so when installing a new PTE
to resolve a missing fault, one can install a write-protected one. This
is useful when using UFFDIO_REGISTER_MODE_{MISSING,WP} in combination.
This was motivated by testing HugeTLB HGM [1], and in particular its
interaction with userfaultfd features. Existing userfaultfd code supports
using WP and MINOR modes together (i.e. you can register an area with
both enabled), but without this CONTINUE flag the combination is in
practice unusable.
So, add an analogous UFFDIO_CONTINUE_MODE_WP, which does the same thing as
UFFDIO_COPY_MODE_WP, but for *minor* faults.
Update the selftest to do some very basic exercising of the new flag.
Update Documentation/ to describe how these flags are used (neither the
COPY nor the new CONTINUE versions of this mode flag were described there
before).
[1]: https://patchwork.kernel.org/project/linux-mm/cover/20230218002819.1486479-1-jthoughton@google.com/
Link: https://lkml.kernel.org/r/20230314221250.682452-5-axelrasmussen@google.com
Signed-off-by: Axel Rasmussen <axelrasmussen@google.com>
Acked-by: Peter Xu <peterx@redhat.com>
Acked-by: Mike Rapoport (IBM) <rppt@kernel.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Hugh Dickins <hughd@google.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Muchun Song <muchun.song@linux.dev>
Cc: Nadav Amit <namit@vmware.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Enable it by default on the stress test, and add some smoke tests for the
pte markers on anonymous.
Link: https://lkml.kernel.org/r/20230309223711.823547-3-peterx@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Axel Rasmussen <axelrasmussen@google.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Mike Rapoport <rppt@linux.vnet.ibm.com>
Cc: Muhammad Usama Anjum <usama.anjum@collabora.com>
Cc: Nadav Amit <nadav.amit@gmail.com>
Cc: Paul Gofman <pgofman@codeweavers.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
The canonical location for the tracefs filesystem is at /sys/kernel/tracing.
But, from Documentation/trace/ftrace.rst:
Before 4.1, all ftrace tracing control files were within the debugfs
file system, which is typically located at /sys/kernel/debug/tracing.
For backward compatibility, when mounting the debugfs file system,
the tracefs file system will be automatically mounted at:
/sys/kernel/debug/tracing
A few spots in tools/testing/selftests still refer to this older debugfs
path, so let's update them to avoid confusion.
Link: https://lkml.kernel.org/r/20230313211746.1541525-1-zwisler@kernel.org
Cc: "Tobin C. Harding" <me@tobin.cc>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Tycho Andersen <tycho@tycho.pizza>
Reviewed-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Reviewed-by: Mukesh Ojha <quic_mojha@quicinc.com>
Signed-off-by: Ross Zwisler <zwisler@google.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Fix two inputs to check_anon_huge() and one if condition, so the tests
work as expected.
Link: https://lkml.kernel.org/r/20230306160907.16804-1-zi.yan@sent.com
Fixes: c07c343cda ("selftests/vm: dedup THP helpers")
Signed-off-by: Zi Yan <ziy@nvidia.com>
Reviewed-by: Zach O'Keefe <zokeefe@google.com>
Tested-by: Zach O'Keefe <zokeefe@google.com>
Acked-by: David Hildenbrand <david@redhat.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
A previous patch removed most of the sh5 (sh64) support from the
kernel tree. Now remove the last stragglers.
Fixes: 37744feebc ("sh: remove sh5 support")
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Geert Uytterhoeven <geert+renesas@glider.be>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Rich Felker <dalias@libc.org>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
Cc: linux-sh@vger.kernel.org
Acked-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
Reviewed-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
Link: https://lore.kernel.org/r/20230306040037.20350-6-rdunlap@infradead.org
Signed-off-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
F_SEAL_EXEC") which permits the setting of the memfd execute bit at
memfd creation time, with the option of sealing the state of the X bit.
- Peter Xu adds a patch series ("mm/hugetlb: Make huge_pte_offset()
thread-safe for pmd unshare") which addresses a rare race condition
related to PMD unsharing.
- Several folioification patch serieses from Matthew Wilcox, Vishal
Moola, Sidhartha Kumar and Lorenzo Stoakes
- Johannes Weiner has a series ("mm: push down lock_page_memcg()") which
does perform some memcg maintenance and cleanup work.
- SeongJae Park has added DAMOS filtering to DAMON, with the series
"mm/damon/core: implement damos filter". These filters provide users
with finer-grained control over DAMOS's actions. SeongJae has also done
some DAMON cleanup work.
- Kairui Song adds a series ("Clean up and fixes for swap").
- Vernon Yang contributed the series "Clean up and refinement for maple
tree".
- Yu Zhao has contributed the "mm: multi-gen LRU: memcg LRU" series. It
adds to MGLRU an LRU of memcgs, to improve the scalability of global
reclaim.
- David Hildenbrand has added some userfaultfd cleanup work in the
series "mm: uffd-wp + change_protection() cleanups".
- Christoph Hellwig has removed the generic_writepages() library
function in the series "remove generic_writepages".
- Baolin Wang has performed some maintenance on the compaction code in
his series "Some small improvements for compaction".
- Sidhartha Kumar is doing some maintenance work on struct page in his
series "Get rid of tail page fields".
- David Hildenbrand contributed some cleanup, bugfixing and
generalization of pte management and of pte debugging in his series "mm:
support __HAVE_ARCH_PTE_SWP_EXCLUSIVE on all architectures with swap
PTEs".
- Mel Gorman and Neil Brown have removed the __GFP_ATOMIC allocation
flag in the series "Discard __GFP_ATOMIC".
- Sergey Senozhatsky has improved zsmalloc's memory utilization with his
series "zsmalloc: make zspage chain size configurable".
- Joey Gouly has added prctl() support for prohibiting the creation of
writeable+executable mappings. The previous BPF-based approach had
shortcomings. See "mm: In-kernel support for memory-deny-write-execute
(MDWE)".
- Waiman Long did some kmemleak cleanup and bugfixing in the series
"mm/kmemleak: Simplify kmemleak_cond_resched() & fix UAF".
- T.J. Alumbaugh has contributed some MGLRU cleanup work in his series
"mm: multi-gen LRU: improve".
- Jiaqi Yan has provided some enhancements to our memory error
statistics reporting, mainly by presenting the statistics on a per-node
basis. See the series "Introduce per NUMA node memory error
statistics".
- Mel Gorman has a second and hopefully final shot at fixing a CPU-hog
regression in compaction via his series "Fix excessive CPU usage during
compaction".
- Christoph Hellwig does some vmalloc maintenance work in the series
"cleanup vfree and vunmap".
- Christoph Hellwig has removed block_device_operations.rw_page() in ths
series "remove ->rw_page".
- We get some maple_tree improvements and cleanups in Liam Howlett's
series "VMA tree type safety and remove __vma_adjust()".
- Suren Baghdasaryan has done some work on the maintainability of our
vm_flags handling in the series "introduce vm_flags modifier functions".
- Some pagemap cleanup and generalization work in Mike Rapoport's series
"mm, arch: add generic implementation of pfn_valid() for FLATMEM" and
"fixups for generic implementation of pfn_valid()"
- Baoquan He has done some work to make /proc/vmallocinfo and
/proc/kcore better represent the real state of things in his series
"mm/vmalloc.c: allow vread() to read out vm_map_ram areas".
- Jason Gunthorpe rationalized the GUP system's interface to the rest of
the kernel in the series "Simplify the external interface for GUP".
- SeongJae Park wishes to migrate people from DAMON's debugfs interface
over to its sysfs interface. To support this, we'll temporarily be
printing warnings when people use the debugfs interface. See the series
"mm/damon: deprecate DAMON debugfs interface".
- Andrey Konovalov provided the accurately named "lib/stackdepot: fixes
and clean-ups" series.
- Huang Ying has provided a dramatic reduction in migration's TLB flush
IPI rates with the series "migrate_pages(): batch TLB flushing".
- Arnd Bergmann has some objtool fixups in "objtool warning fixes".
-----BEGIN PGP SIGNATURE-----
iHUEABYIAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCY/PoPQAKCRDdBJ7gKXxA
jlvpAPsFECUBBl20qSue2zCYWnHC7Yk4q9ytTkPB/MMDrFEN9wD/SNKEm2UoK6/K
DmxHkn0LAitGgJRS/W9w81yrgig9tAQ=
=MlGs
-----END PGP SIGNATURE-----
Merge tag 'mm-stable-2023-02-20-13-37' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull MM updates from Andrew Morton:
- Daniel Verkamp has contributed a memfd series ("mm/memfd: add
F_SEAL_EXEC") which permits the setting of the memfd execute bit at
memfd creation time, with the option of sealing the state of the X
bit.
- Peter Xu adds a patch series ("mm/hugetlb: Make huge_pte_offset()
thread-safe for pmd unshare") which addresses a rare race condition
related to PMD unsharing.
- Several folioification patch serieses from Matthew Wilcox, Vishal
Moola, Sidhartha Kumar and Lorenzo Stoakes
- Johannes Weiner has a series ("mm: push down lock_page_memcg()")
which does perform some memcg maintenance and cleanup work.
- SeongJae Park has added DAMOS filtering to DAMON, with the series
"mm/damon/core: implement damos filter".
These filters provide users with finer-grained control over DAMOS's
actions. SeongJae has also done some DAMON cleanup work.
- Kairui Song adds a series ("Clean up and fixes for swap").
- Vernon Yang contributed the series "Clean up and refinement for maple
tree".
- Yu Zhao has contributed the "mm: multi-gen LRU: memcg LRU" series. It
adds to MGLRU an LRU of memcgs, to improve the scalability of global
reclaim.
- David Hildenbrand has added some userfaultfd cleanup work in the
series "mm: uffd-wp + change_protection() cleanups".
- Christoph Hellwig has removed the generic_writepages() library
function in the series "remove generic_writepages".
- Baolin Wang has performed some maintenance on the compaction code in
his series "Some small improvements for compaction".
- Sidhartha Kumar is doing some maintenance work on struct page in his
series "Get rid of tail page fields".
- David Hildenbrand contributed some cleanup, bugfixing and
generalization of pte management and of pte debugging in his series
"mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE on all architectures with
swap PTEs".
- Mel Gorman and Neil Brown have removed the __GFP_ATOMIC allocation
flag in the series "Discard __GFP_ATOMIC".
- Sergey Senozhatsky has improved zsmalloc's memory utilization with
his series "zsmalloc: make zspage chain size configurable".
- Joey Gouly has added prctl() support for prohibiting the creation of
writeable+executable mappings.
The previous BPF-based approach had shortcomings. See "mm: In-kernel
support for memory-deny-write-execute (MDWE)".
- Waiman Long did some kmemleak cleanup and bugfixing in the series
"mm/kmemleak: Simplify kmemleak_cond_resched() & fix UAF".
- T.J. Alumbaugh has contributed some MGLRU cleanup work in his series
"mm: multi-gen LRU: improve".
- Jiaqi Yan has provided some enhancements to our memory error
statistics reporting, mainly by presenting the statistics on a
per-node basis. See the series "Introduce per NUMA node memory error
statistics".
- Mel Gorman has a second and hopefully final shot at fixing a CPU-hog
regression in compaction via his series "Fix excessive CPU usage
during compaction".
- Christoph Hellwig does some vmalloc maintenance work in the series
"cleanup vfree and vunmap".
- Christoph Hellwig has removed block_device_operations.rw_page() in
ths series "remove ->rw_page".
- We get some maple_tree improvements and cleanups in Liam Howlett's
series "VMA tree type safety and remove __vma_adjust()".
- Suren Baghdasaryan has done some work on the maintainability of our
vm_flags handling in the series "introduce vm_flags modifier
functions".
- Some pagemap cleanup and generalization work in Mike Rapoport's
series "mm, arch: add generic implementation of pfn_valid() for
FLATMEM" and "fixups for generic implementation of pfn_valid()"
- Baoquan He has done some work to make /proc/vmallocinfo and
/proc/kcore better represent the real state of things in his series
"mm/vmalloc.c: allow vread() to read out vm_map_ram areas".
- Jason Gunthorpe rationalized the GUP system's interface to the rest
of the kernel in the series "Simplify the external interface for
GUP".
- SeongJae Park wishes to migrate people from DAMON's debugfs interface
over to its sysfs interface. To support this, we'll temporarily be
printing warnings when people use the debugfs interface. See the
series "mm/damon: deprecate DAMON debugfs interface".
- Andrey Konovalov provided the accurately named "lib/stackdepot: fixes
and clean-ups" series.
- Huang Ying has provided a dramatic reduction in migration's TLB flush
IPI rates with the series "migrate_pages(): batch TLB flushing".
- Arnd Bergmann has some objtool fixups in "objtool warning fixes".
* tag 'mm-stable-2023-02-20-13-37' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (505 commits)
include/linux/migrate.h: remove unneeded externs
mm/memory_hotplug: cleanup return value handing in do_migrate_range()
mm/uffd: fix comment in handling pte markers
mm: change to return bool for isolate_movable_page()
mm: hugetlb: change to return bool for isolate_hugetlb()
mm: change to return bool for isolate_lru_page()
mm: change to return bool for folio_isolate_lru()
objtool: add UACCESS exceptions for __tsan_volatile_read/write
kmsan: disable ftrace in kmsan core code
kasan: mark addr_has_metadata __always_inline
mm: memcontrol: rename memcg_kmem_enabled()
sh: initialize max_mapnr
m68k/nommu: add missing definition of ARCH_PFN_OFFSET
mm: percpu: fix incorrect size in pcpu_obj_full_size()
maple_tree: reduce stack usage with gcc-9 and earlier
mm: page_alloc: call panic() when memoryless node allocation fails
mm: multi-gen LRU: avoid futile retries
migrate_pages: move THP/hugetlb migration support check to simplify code
migrate_pages: batch flushing TLB
migrate_pages: share more code between _unmap and _move
...
Selftests vm builds break when doing cross-compilation. The Makefile
MACHINE variable incorrectly picks upp the host machine architecture.
If the CROSS_COMPILE variable is set, dig out the target host architecture
from CROSS_COMPILE, instead of calling uname.
Link: https://lkml.kernel.org/r/20230109114251.3349638-1-bjorn@kernel.org
Signed-off-by: Björn Töpel <bjorn@rivosinc.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
If MADV_PAGEOUT is not defined (e.g., on AlmaLinux 8), compilation will
fail. Let's fix that like khugepaged.c does by conditionally defining
MADV_PAGEOUT.
Link: https://lkml.kernel.org/r/20230109171255.488749-1-david@redhat.com
Fixes: 69c66add56 ("selftests/vm: anon_cow: test COW handling of anonymous memory")
Signed-off-by: David Hildenbrand <david@redhat.com>
Reported-by: Mirsad Goran Todorovac <mirsad.todorovac@alu.unizg.hr>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Add a test to assert that we can mremap() and expand a mapping starting
from an offset within an existing mapping. We unmap the last page in a 3
page mapping to ensure that the remap should always succeed, before
remapping from the 2nd page.
This is additionally a regression test for the issue solved in "mm,
mremap: fix mremap() expanding vma with addr inside vma" and confirmed to
fail prior to the change and pass after it.
Finally, this patch updates the existing mremap expand merge test to check
error conditions and reduce code duplication between the two tests.
[lstoakes@gmail.com: increment num_expand_tests so test doesn't complain about unexpected tests being run]
Link: https://lkml.kernel.org/r/8ff3ba3cadc0b6c1b2688ae5c851bf73aa062d57.1673701836.git.lstoakes@gmail.com
Link: https://lkml.kernel.org/r/02b117a8ffd52acc01dc66c2fb39754f08d92c0e.1672675824.git.lstoakes@gmail.com
Signed-off-by: Lorenzo Stoakes <lstoakes@gmail.com>
Acked-by: David Hildenbrand <david@redhat.com>
Cc: Jakub Matěna <matenajakub@gmail.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Rename selftets/vm to selftests/mm for being more consistent with the
code, documentation, and tools directories, and won't be confused with
virtual machines.
[sj@kernel.org: convert missing vm->mm changes]
Link: https://lkml.kernel.org/r/20230107230643.252273-1-sj@kernel.org
Link: https://lkml.kernel.org/r/20230103180754.129637-5-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>