Most tests are currently not giving any proper output for the user
to see how much sub-tests have already been run, or whether new
sub-tests are part of a binary or not. So it would be good to
support TAP output in the KVM selftests. There is already a nice
framework for this in the kselftest_harness.h header which we can
use. But since we also need a vcpu in most KVM selftests, it also
makes sense to introduce our own wrapper around this which takes
care of creating a VM with one vcpu, so we don't have to repeat
this boilerplate in each and every test. Thus let's introduce
a KVM_ONE_VCPU_TEST() macro here which takes care of this.
Suggested-by: Sean Christopherson <seanjc@google.com>
Link: https://lore.kernel.org/all/Y2v+B3xxYKJSM%2FfH@google.com/
Signed-off-by: Thomas Huth <thuth@redhat.com>
Link: https://lore.kernel.org/r/20240208204844.119326-5-thuth@redhat.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Extract the code to set a vCPU's entry point out of vm_arch_vcpu_add() and
into a new API, vcpu_arch_set_entry_point(). Providing a separate API
will allow creating a KVM selftests hardness that can handle tests that
use different entry points for sub-tests, whereas *requiring* the entry
point to be specified at vCPU creation makes it difficult to create a
generic harness, e.g. the boilerplate setup/teardown can't easily create
and destroy the VM and vCPUs.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Link: https://lore.kernel.org/r/20240208204844.119326-4-thuth@redhat.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
The regs structure just accidentally contains the right values
from the previous test in the spot where we want to change rbx.
It's cleaner if we properly initialize the structure here before
using it.
Suggested-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Link: https://lore.kernel.org/r/20240208204844.119326-3-thuth@redhat.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
In the spots where we are expecting a successful run, we should
use vcpu_run() instead of _vcpu_run() to make sure that the run
did not fail.
Suggested-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Link: https://lore.kernel.org/r/20240208204844.119326-2-thuth@redhat.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Since only 64bit KVM selftests were supported on all architectures,
add the CONFIG_64BIT definition in kvm/Makefile to ensure only 64bit
definitions were available in the corresponding included files.
Suggested-by: Andrew Jones <ajones@ventanamicro.com>
Signed-off-by: Haibo Xu <haibo1.xu@intel.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Signed-off-by: Anup Patel <anup@brainfault.org>
Split the arch-neutral test code out of aarch64/arch_timer.c
and put them into a common arch_timer.c. This is a preparation
to share timer test codes in riscv.
Suggested-by: Andrew Jones <ajones@ventanamicro.com>
Signed-off-by: Haibo Xu <haibo1.xu@intel.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Signed-off-by: Anup Patel <anup@brainfault.org>
There are intermittent failures occurred when stressing the
arch-timer test in a Qemu VM:
Guest assert failed, vcpu 0; stage; 4; iter: 3
==== Test Assertion Failure ====
aarch64/arch_timer.c:196: config_iter + 1 == irq_iter
pid=4048 tid=4049 errno=4 - Interrupted system call
1 0x000000000040253b: test_vcpu_run at arch_timer.c:248
2 0x0000ffffb60dd5c7: ?? ??:0
3 0x0000ffffb6145d1b: ?? ??:0
0x3 != 0x2 (config_iter + 1 != irq_iter)e
Further test and debug show that the timeout for an interrupt
to arrive do have random high fluctuation, espectially when
testing in an virtual environment.
To alleviate this issue, just expose the timeout value as user
configurable and print some hint message to increase the value
when hitting the failure..
Signed-off-by: Haibo Xu <haibo1.xu@intel.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Anup Patel <anup@brainfault.org>
Change signed type to unsigned in test_args struct which
only make sense for unsigned value.
Suggested-by: Andrew Jones <ajones@ventanamicro.com>
Signed-off-by: Haibo Xu <haibo1.xu@intel.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Anup Patel <anup@brainfault.org>
The introduction of $(SPLIT_TESTS) also introduced a warning when
building selftests on architectures that include get-reg-lists:
make: Entering directory '/root/kvm/tools/testing/selftests/kvm'
Makefile:272: warning: overriding recipe for target '/root/kvm/tools/testing/selftests/kvm/get-reg-list'
Makefile:267: warning: ignoring old recipe for target '/root/kvm/tools/testing/selftests/kvm/get-reg-list'
make: Leaving directory '/root/kvm/tools/testing/selftests/kvm'
In addition, the rule for $(SPLIT_TESTS_TARGETS) includes _all_
the $(SPLIT_TESTS_OBJS), which only works because there is just one.
So fix both by adjusting the rules:
- remove $(SPLIT_TESTS_TARGETS) from the $(TEST_GEN_PROGS) rules,
and rename it to $(SPLIT_TEST_GEN_PROGS)
- fix $(SPLIT_TESTS_OBJS) so that it plays well with $(OUTPUT),
rename it to $(SPLIT_TEST_GEN_OBJ), and list the object file
explicitly in the $(SPLIT_TEST_GEN_PROGS) link rule
Fixes: 17da79e009 ("KVM: arm64: selftests: Split get-reg-list test code", 2023-08-09)
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Tested-by: Andrew Jones <ajones@ventanamicro.com>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Anup Patel <anup@brainfault.org>
In Makefiles, $(error ), $(warning ), and $(info ) expand to the empty
string, as explained in the GNU Make manual [1]:
"The result of the expansion of this function is the empty string."
Therefore, they are no-op except for logging purposes.
$(shell ...) expands to the output of the command. It expands to the
empty string when the command does not print anything to stdout.
Hence, $(shell mkdir ...) is no-op except for creating the directory.
Remove meaningless assignments.
[1]: https://www.gnu.org/software/make/manual/make.html#Make-Control-Functions
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Reviewed-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20240221134201.2656908-1-masahiroy@kernel.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: linux-kbuild@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: linux-perf-users@vger.kernel.org
If an integer's type has x bits, shifting the integer left by x or more
is undefined behavior.
This can happen in the rotate function when attempting to do a rotation
of the whole value by 0.
Fixes: 0dd714bfd2 ("KVM: s390: selftest: memop: Add cmpxchg tests")
Signed-off-by: Nina Schoetterl-Glausch <nsg@linux.ibm.com>
Link: https://lore.kernel.org/r/20240111094805.363047-1-nsg@linux.ibm.com
Acked-by: Janosch Frank <frankja@linux.ibm.com>
Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
Message-Id: <20240111094805.363047-1-nsg@linux.ibm.com>
Extend set_memory_region_test's invalid flags subtest to verify that
GUEST_MEMFD is incompatible with READONLY. GUEST_MEMFD doesn't currently
support writes from userspace and KVM doesn't support emulated MMIO on
private accesses, and so KVM is supposed to reject the GUEST_MEMFD+READONLY
in order to avoid configuration that KVM can't support.
Link: https://lore.kernel.org/r/20240222190612.2942589-6-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Actually create a GUEST_MEMFD instance and pass it to KVM when doing
negative tests for KVM_SET_USER_MEMORY_REGION2 + KVM_MEM_GUEST_MEMFD.
Without a valid GUEST_MEMFD file descriptor, KVM_SET_USER_MEMORY_REGION2
will always fail with -EINVAL, resulting in false passes for any and all
tests of illegal combinations of KVM_MEM_GUEST_MEMFD and other flags.
Fixes: 5d74316466 ("KVM: selftests: Add a memory region subtest to validate invalid flags")
Link: https://lore.kernel.org/r/20240222190612.2942589-5-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
There is a selftest that checks for an (expected) error when an
invalid AR is specified, but not one that exercises the AR path.
Add a simple test that mirrors the vanilla write/read test while
providing an AR. An AR that contains zero will direct the CPU to
use the primary address space normally used anyway. AR[1] is
selected for this test because the host AR[1] is usually non-zero,
and KVM needs to correctly swap those values.
Reviewed-by: Nina Schoetterl-Glausch <nsg@linux.ibm.com>
Acked-by: Christian Borntraeger <borntraeger@linux.ibm.com>
Signed-off-by: Eric Farman <farman@linux.ibm.com>
Link: https://lore.kernel.org/r/20240220211211.3102609-3-farman@linux.ibm.com
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
If the relevant capability (KVM_XEN_HVM_CONFIG_SHARED_INFO_HVA) is present
then re-map vcpu_info using the HVA part way through the tests to make sure
then there is no functional change.
Signed-off-by: Paul Durrant <pdurrant@amazon.com>
Reviewed-by: David Woodhouse <dwmw@amazon.co.uk>
Link: https://lore.kernel.org/r/20240215152916.1158-16-paul@xen.org
Signed-off-by: Sean Christopherson <seanjc@google.com>
Using the HVA of the shared_info page is more efficient, so if the
capability (KVM_XEN_HVM_CONFIG_SHARED_INFO_HVA) is present use that method
to do the mapping.
NOTE: Have the juggle_shinfo_state() thread map and unmap using both
GFN and HVA, to make sure the older mechanism is not broken.
Signed-off-by: Paul Durrant <pdurrant@amazon.com>
Reviewed-by: David Woodhouse <dwmw@amazon.co.uk>
Link: https://lore.kernel.org/r/20240215152916.1158-15-paul@xen.org
Signed-off-by: Sean Christopherson <seanjc@google.com>
Although the fixed counter 3 and its exclusive pseudo slots event are
not supported by KVM yet, the architectural slots event is supported by
KVM and can be programmed on any GP counter. Thus add validation for this
architectural slots event.
Top-down slots event "counts the total number of available slots for an
unhalted logical processor, and increments by machine-width of the
narrowest pipeline as employed by the Top-down Microarchitecture
Analysis method."
As for the slot, it's an abstract concept which indicates how many
uops (decoded from instructions) can be processed simultaneously
(per cycle) on HW. In Top-down Microarchitecture Analysis (TMA) method,
the processor is divided into two parts, frond-end and back-end. Assume
there is a processor with classic 5-stage pipeline, fetch, decode,
execute, memory access and register writeback. The former 2 stages
(fetch/decode) are classified to frond-end and the latter 3 stages are
classified to back-end.
In modern Intel processors, a complicated instruction would be decoded
into several uops (micro-operations) and so these uops can be processed
simultaneously and then improve the performance. Thus, assume a
processor can decode and dispatch 4 uops in front-end and execute 4 uops
in back-end simultaneously (per-cycle), so the machine-width of this
processor is 4 and this processor has 4 topdown slots per-cycle.
If a slot is spare and can be used to process a new upcoming uop, then
the slot is available, but if a uop occupies a slot for several cycles
and can't be retired (maybe blocked by memory access), then this slot is
stall and unavailable.
Considering the testing instruction sequence can't be macro-fused on x86
platforms, the measured slots count should not be less than
NUM_INSNS_RETIRED. Thus assert the slots count against NUM_INSNS_RETIRED.
pmu_counters_test passed with this patch on Intel Sapphire Rapids.
About the more information about TMA method, please refer the below link.
https://www.intel.com/content/www/us/en/docs/vtune-profiler/cookbook/2023-0/top-down-microarchitecture-analysis-method.html
Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Link: https://lore.kernel.org/r/20240218043003.2424683-1-dapeng1.mi@linux.intel.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
- Remove redundant newlines from error messages.
- Delete an unused variable in the AMX test (which causes build failures when
compiling with -Werror).
- Fail instead of skipping tests if open(), e.g. of /dev/kvm, fails with an
error code other than ENOENT (a Hyper-V selftest bug resulted in an EMFILE,
and the test eventually got skipped).
- Fix TSC related bugs in several Hyper-V selftests.
- Fix a bug in the dirty ring logging test where a sem_post() could be left
pending across multiple runs, resulting in incorrect synchronization between
the main thread and the vCPU worker thread.
- Relax the dirty log split test's assertions on 4KiB mappings to fix false
positives due to the number of mappings for memslot 0 (used for code and
data that is NOT being dirty logged) changing, e.g. due to NUMA balancing.
- Have KVM's gtod_is_based_on_tsc() return "bool" instead of an "int" (the
function generates boolean values, and all callers treat the return value as
a bool).
-----BEGIN PGP SIGNATURE-----
iQJGBAABCgAwFiEEMHr+pfEFOIzK+KY1YJEiAU0MEvkFAmXKupQSHHNlYW5qY0Bn
b29nbGUuY29tAAoJEGCRIgFNDBL5DiQP/RNSgLrE9+/3oyqo9zpbhio2dKqz4dIk
8Ga1ZE4R89dyMB9jGKtWn3rEkyma3TsB+neVpG9ohHV6j25JJ0vNAkxQu3Gt+gkl
uM1lh/IfXPnAKyuy6dW9tpgZYE1v2/KfdWjeEzzxfPjzY/LX3yFiiCKEnUmfjjzZ
sSz91nV4KYS4b4xLWTIcBgNJuyLJuL05htTLmCu7t8DKOBHwHxXjSn8qqG8OvAjs
FOhf0zgGJKBFdKOw2Y8XeDdKO0RTEyEPHaFILcLEsuhoVIbY5OUmLe32pAFzzMbG
hPawUZ5CzC++e339gUgGkRNY80iSnGcYVcZa+ohxOsNBdOWko9z/eGWZUV7qkYDK
dkPHMoDnSzUCE2eSYbEB1eR/KOfziJCWMS9SAIJbJxIGb1HYajikwAEZ6FNp3R+u
MyCuNlV9TfsGgt4Dx8RctMeH2ROpORRu7h3WPFUBgG2/jOzPk/OR6U8hSzvmhTvL
MykZ8IaLmUIYoK/nCY2iwy50lQRxtZ/htqWn3sidCBGY0DXdNlMhvd3Vk9jtUvY5
Fgof0b564eYfk/qO3cMIDd2WFaDejP28JVSn0CNm6z9i54ubCKkSBEb4kTYXXnVK
YBHvbZ21Vjg52trudvK5UPt599sxxNBNiSV32ckLFKHS4ZVGSFSBSbsAWiQF157i
CbYntmtJhM+D
=infW
-----END PGP SIGNATURE-----
Merge tag 'kvm-x86-selftests-6.8-rcN' of https://github.com/kvm-x86/linux into HEAD
KVM selftests fixes/cleanups (and one KVM x86 cleanup) for 6.8:
- Remove redundant newlines from error messages.
- Delete an unused variable in the AMX test (which causes build failures when
compiling with -Werror).
- Fail instead of skipping tests if open(), e.g. of /dev/kvm, fails with an
error code other than ENOENT (a Hyper-V selftest bug resulted in an EMFILE,
and the test eventually got skipped).
- Fix TSC related bugs in several Hyper-V selftests.
- Fix a bug in the dirty ring logging test where a sem_post() could be left
pending across multiple runs, resulting in incorrect synchronization between
the main thread and the vCPU worker thread.
- Relax the dirty log split test's assertions on 4KiB mappings to fix false
positives due to the number of mappings for memslot 0 (used for code and
data that is NOT being dirty logged) changing, e.g. due to NUMA balancing.
- Have KVM's gtod_is_based_on_tsc() return "bool" instead of an "int" (the
function generates boolean values, and all callers treat the return value as
a bool).
Fix a pile of -Wformat warnings in the KVM ARM selftests code, almost all
of which are benign "long" versus "long long" issues (selftests are 64-bit
only, and the guest printf code treats "ll" the same as "l"). The code
itself isn't problematic, but the warnings make it impossible to build ARM
selftests with -Werror, which does detect real issues from time to time.
Opportunistically have GUEST_ASSERT_BITMAP_REG() interpret set_expected,
which is a bool, as an unsigned decimal value, i.e. have it print '0' or
'1' instead of '0x0' or '0x1'.
Signed-off-by: Sean Christopherson <seanjc@google.com>
Tested-by: Zenghui Yu <yuzenghui@huawei.com>
Link: https://lore.kernel.org/r/20240202234603.366925-1-seanjc@google.com
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
Drop dirty_log_page_splitting_test's assertion that the number of 4KiB
pages remains the same across dirty logging being enabled and disabled, as
the test doesn't guarantee that mappings outside of the memslots being
dirty logged are stable, e.g. KVM's mappings for code and pages in
memslot0 can be zapped by things like NUMA balancing.
To preserve the spirit of the check, assert that (a) the number of 4KiB
pages after splitting is _at least_ the number of 4KiB pages across all
memslots under test, and (b) the number of hugepages before splitting adds
up to the number of pages across all memslots under test. (b) is a little
tenuous as it relies on memslot0 being incompatible with transparent
hugepages, but that holds true for now as selftests explicitly madvise()
MADV_NOHUGEPAGE for memslot0 (__vm_create() unconditionally specifies the
backing type as VM_MEM_SRC_ANONYMOUS).
Reported-by: Yi Lai <yi1.lai@intel.com>
Reported-by: Tao Su <tao1.su@linux.intel.com>
Reviewed-by: Tao Su <tao1.su@linux.intel.com>
Link: https://lore.kernel.org/r/20240131222728.4100079-1-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
When finishing the final iteration of dirty_log_test testcase, set
host_quit _before_ the final "continue" so that the vCPU worker doesn't
run an extra iteration, and delete the hack-a-fix of an extra "continue"
from the dirty ring testcase. This fixes a bug where the extra post to
sem_vcpu_cont may not be consumed, which results in failures in subsequent
runs of the testcases. The bug likely was missed during development as
x86 supports only a single "guest mode", i.e. there aren't any subsequent
testcases after the dirty ring test, because for_each_guest_mode() only
runs a single iteration.
For the regular dirty log testcases, letting the vCPU run one extra
iteration is a non-issue as the vCPU worker waits on sem_vcpu_cont if and
only if the worker is explicitly told to stop (vcpu_sync_stop_requested).
But for the dirty ring test, which needs to periodically stop the vCPU to
reap the dirty ring, letting the vCPU resume the guest _after_ the last
iteration means the vCPU will get stuck without an extra "continue".
However, blindly firing off an post to sem_vcpu_cont isn't guaranteed to
be consumed, e.g. if the vCPU worker sees host_quit==true before resuming
the guest. This results in a dangling sem_vcpu_cont, which leads to
subsequent iterations getting out of sync, as the vCPU worker will
continue on before the main task is ready for it to resume the guest,
leading to a variety of asserts, e.g.
==== Test Assertion Failure ====
dirty_log_test.c:384: dirty_ring_vcpu_ring_full
pid=14854 tid=14854 errno=22 - Invalid argument
1 0x00000000004033eb: dirty_ring_collect_dirty_pages at dirty_log_test.c:384
2 0x0000000000402d27: log_mode_collect_dirty_pages at dirty_log_test.c:505
3 (inlined by) run_test at dirty_log_test.c:802
4 0x0000000000403dc7: for_each_guest_mode at guest_modes.c:100
5 0x0000000000401dff: main at dirty_log_test.c:941 (discriminator 3)
6 0x0000ffff9be173c7: ?? ??:0
7 0x0000ffff9be1749f: ?? ??:0
8 0x000000000040206f: _start at ??:?
Didn't continue vcpu even without ring full
Alternatively, the test could simply reset the semaphores before each
testcase, but papering over hacks with more hacks usually ends in tears.
Reported-by: Shaoqin Huang <shahuang@redhat.com>
Fixes: 84292e5659 ("KVM: selftests: Add dirty ring buffer test")
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Shaoqin Huang <shahuang@redhat.com>
Link: https://lore.kernel.org/r/20240202231831.354848-1-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
KVM sets up Hyper-V TSC page clocksource for its guests when system
clocksource is 'based on TSC' (see gtod_is_based_on_tsc()), running
hyperv_clock with any other clocksource leads to imminent failure.
Add the missing requirement to make the test skip gracefully.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Link: https://lore.kernel.org/r/20240109141121.1619463-5-vkuznets@redhat.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
KVM's 'gtod_is_based_on_tsc()' recognizes two clocksources: 'tsc' and
'hyperv_clocksource_tsc_page' and enables kvmclock in 'masterclock'
mode when either is in use. Transform 'sys_clocksource_is_tsc()' into
'sys_clocksource_is_based_on_tsc()' to support the later. This affects
two tests: kvm_clock_test and vmx_nested_tsc_scaling_test, both seem
to work well when system clocksource is 'hyperv_clocksource_tsc_page'.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Link: https://lore.kernel.org/r/20240109141121.1619463-4-vkuznets@redhat.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Despite its name, system_has_stable_tsc() just checks that system
clocksource is 'tsc'; this can now be done with generic
sys_clocksource_is_tsc().
No functional change intended.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Link: https://lore.kernel.org/r/20240109141121.1619463-3-vkuznets@redhat.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Several existing x86 selftests need to check that the underlying system
clocksource is TSC or based on TSC but every test implements its own
check. As a first step towards unification, extract check_clocksource()
from kvm_clock_test and split it into two functions: arch-neutral
'sys_get_cur_clocksource()' and x86-specific 'sys_clocksource_is_tsc()'.
Fix a couple of pre-existing issues in kvm_clock_test: memory leakage in
check_clocksource() and using TEST_ASSERT() instead of TEST_REQUIRE().
The change also makes the test fail when system clocksource can't be read
from sysfs.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Link: https://lore.kernel.org/r/20240109141121.1619463-2-vkuznets@redhat.com
[sean: eliminate if-elif pattern just to set a bool true]
Signed-off-by: Sean Christopherson <seanjc@google.com>
Extend the read/write PMU counters subtest to verify that RDPMC also reads
back the written value. Opportunsitically verify that attempting to use
the "fast" mode of RDPMC fails, as the "fast" flag is only supported by
non-architectural PMUs, which KVM doesn't virtualize.
Tested-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Link: https://lore.kernel.org/r/20240109230250.424295-30-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Add helpers for safe and safe-with-forced-emulations versions of RDMSR,
RDPMC, and XGETBV. Use macro shenanigans to eliminate the rather large
amount of boilerplate needed to get values in and out of registers.
Tested-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Link: https://lore.kernel.org/r/20240109230250.424295-29-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Add KVM_ASM_SAFE_FEP() to allow forcing emulation on an instruction that
might fault. Note, KVM skips RIP past the FEP prefix before injecting an
exception, i.e. the fixup needs to be on the instruction itself. Do not
check for FEP support, that is firmly the responsibility of whatever code
wants to use KVM_ASM_SAFE_FEP().
Sadly, chaining variadic arguments that contain commas doesn't work, thus
the unfortunate amount of copy+paste.
Tested-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Link: https://lore.kernel.org/r/20240109230250.424295-28-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Extend the PMC counters test to use forced emulation to verify that KVM
emulates counter events for instructions retired and branches retired.
Force emulation for only a subset of the measured code to test that KVM
does the right thing when mixing perf events with emulated events.
Reviewed-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Tested-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Link: https://lore.kernel.org/r/20240109230250.424295-27-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Move the KVM_FEP definition, a.k.a. the KVM force emulation prefix, into
processor.h so that it can be used for other tests besides the MSR filter
test.
Tested-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Link: https://lore.kernel.org/r/20240109230250.424295-26-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Add a helper to detect KVM support for forced emulation by querying the
module param, and use the helper to detect support for the MSR filtering
test instead of throwing a noodle/NOP at KVM to see if it sticks.
Cc: Aaron Lewis <aaronlewis@google.com>
Tested-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Link: https://lore.kernel.org/r/20240109230250.424295-25-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Add helpers to read integer module params, which is painfully non-trivial
because the pain of dealing with strings in C is exacerbated by the kernel
inserting a newline.
Don't bother differentiating between int, uint, short, etc. They all fit
in an int, and KVM (thankfully) doesn't have any integer params larger
than an int.
Tested-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Link: https://lore.kernel.org/r/20240109230250.424295-24-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Add a helper to probe KVM's "enable_pmu" param, open coding strings in
multiple places is just asking for false negatives and/or runtime errors
due to typos.
Reviewed-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Tested-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Link: https://lore.kernel.org/r/20240109230250.424295-23-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Expand the PMU counters test to verify that LLC references and misses have
non-zero counts when the code being executed while the LLC event(s) is
active is evicted via CFLUSH{,OPT}. Note, CLFLUSH{,OPT} requires a fence
of some kind to ensure the cache lines are flushed before execution
continues. Use MFENCE for simplicity (performance is not a concern).
Suggested-by: Jim Mattson <jmattson@google.com>
Reviewed-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Tested-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Link: https://lore.kernel.org/r/20240109230250.424295-22-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Extend the fixed counters test to verify that supported counters can
actually be enabled in the control MSRs, that unsupported counters cannot,
and that enabled counters actually count.
Co-developed-by: Like Xu <likexu@tencent.com>
Signed-off-by: Like Xu <likexu@tencent.com>
Signed-off-by: Jinrong Liang <cloudliang@tencent.com>
[sean: fold into the rd/wr access test, massage changelog]
Reviewed-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Tested-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Link: https://lore.kernel.org/r/20240109230250.424295-21-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Extend the PMU counters test to verify KVM emulation of fixed counters in
addition to general purpose counters. Fixed counters add an extra wrinkle
in the form of an extra supported bitmask. Thus quoth the SDM:
fixed-function performance counter 'i' is supported if ECX[i] || (EDX[4:0] > i)
Test that KVM handles a counter being available through either method.
Reviewed-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Co-developed-by: Like Xu <likexu@tencent.com>
Signed-off-by: Like Xu <likexu@tencent.com>
Signed-off-by: Jinrong Liang <cloudliang@tencent.com>
Co-developed-by: Sean Christopherson <seanjc@google.com>
Tested-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Link: https://lore.kernel.org/r/20240109230250.424295-20-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Add a test to verify that KVM correctly emulates MSR-based accesses to
general purpose counters based on guest CPUID, e.g. that accesses to
non-existent counters #GP and accesses to existent counters succeed.
Note, for compatibility reasons, KVM does not emulate #GP when
MSR_P6_PERFCTR[0|1] is not present (writes should be dropped).
Co-developed-by: Like Xu <likexu@tencent.com>
Signed-off-by: Like Xu <likexu@tencent.com>
Signed-off-by: Jinrong Liang <cloudliang@tencent.com>
Co-developed-by: Sean Christopherson <seanjc@google.com>
Tested-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Link: https://lore.kernel.org/r/20240109230250.424295-19-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Extend the PMU counters test to validate architectural events using fixed
counters. The core logic is largely the same, the biggest difference
being that if a fixed counter exists, its associated event is available
(the SDM doesn't explicitly state this to be true, but it's KVM's ABI and
letting software program a fixed counter that doesn't actually count would
be quite bizarre).
Note, fixed counters rely on PERF_GLOBAL_CTRL.
Reviewed-by: Jim Mattson <jmattson@google.com>
Reviewed-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Co-developed-by: Like Xu <likexu@tencent.com>
Signed-off-by: Like Xu <likexu@tencent.com>
Signed-off-by: Jinrong Liang <cloudliang@tencent.com>
Co-developed-by: Sean Christopherson <seanjc@google.com>
Tested-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Link: https://lore.kernel.org/r/20240109230250.424295-18-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Add test cases to verify that Intel's Architectural PMU events work as
expected when they are available according to guest CPUID. Iterate over a
range of sane PMU versions, with and without full-width writes enabled,
and over interesting combinations of lengths/masks for the bit vector that
enumerates unavailable events.
Test up to vPMU version 5, i.e. the current architectural max. KVM only
officially supports up to version 2, but the behavior of the counters is
backwards compatible, i.e. KVM shouldn't do something completely different
for a higher, architecturally-defined vPMU version. Verify KVM behavior
against the effective vPMU version, e.g. advertising vPMU 5 when KVM only
supports vPMU 2 shouldn't magically unlock vPMU 5 features.
According to Intel SDM, the number of architectural events is reported
through CPUID.0AH:EAX[31:24] and the architectural event x is supported
if EBX[x]=0 && EAX[31:24]>x.
Handcode the entirety of the measured section so that the test can
precisely assert on the number of instructions and branches retired.
Co-developed-by: Like Xu <likexu@tencent.com>
Signed-off-by: Like Xu <likexu@tencent.com>
Signed-off-by: Jinrong Liang <cloudliang@tencent.com>
Co-developed-by: Sean Christopherson <seanjc@google.com>
Tested-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Link: https://lore.kernel.org/r/20240109230250.424295-17-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Add a PMU library for x86 selftests to help eliminate open-coded event
encodings, and to reduce the amount of copy+paste between PMU selftests.
Use the new common macro definitions in the existing PMU event filter test.
Cc: Aaron Lewis <aaronlewis@google.com>
Suggested-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Jinrong Liang <cloudliang@tencent.com>
Co-developed-by: Sean Christopherson <seanjc@google.com>
Tested-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Link: https://lore.kernel.org/r/20240109230250.424295-16-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Extend the kvm_x86_pmu_feature framework to allow querying for fixed
counters via {kvm,this}_pmu_has(). Like architectural events, checking
for a fixed counter annoyingly requires checking multiple CPUID fields, as
a fixed counter exists if:
FxCtr[i]_is_supported := ECX[i] || (EDX[4:0] > i);
Note, KVM currently doesn't actually support exposing fixed counters via
the bitmask, but that will hopefully change sooner than later, and Intel's
SDM explicitly "recommends" checking both the number of counters and the
mask.
Rename the intermedate "anti_feature" field to simply 'f' since the fixed
counter bitmask (thankfully) doesn't have reversed polarity like the
architectural events bitmask.
Note, ideally the helpers would use BUILD_BUG_ON() to assert on the
incoming register, but the expected usage in PMU tests can't guarantee the
inputs are compile-time constants.
Opportunistically define macros for all of the known architectural events
and fixed counters.
Tested-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Link: https://lore.kernel.org/r/20240109230250.424295-15-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Drop the "name" parameter from KVM_X86_PMU_FEATURE(), it's unused and
the name is redundant with the macro, i.e. it's truly useless.
Reviewed-by: Jim Mattson <jmattson@google.com>
Tested-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Link: https://lore.kernel.org/r/20240109230250.424295-14-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Add vcpu_set_cpuid_property() helper function for setting properties, and
use it instead of open coding an equivalent for MAX_PHY_ADDR. Future vPMU
testcases will also need to stuff various CPUID properties.
Reviewed-by: Jim Mattson <jmattson@google.com>
Signed-off-by: Jinrong Liang <cloudliang@tencent.com>
Co-developed-by: Sean Christopherson <seanjc@google.com>
Tested-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Link: https://lore.kernel.org/r/20240109230250.424295-13-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
open_path_or_exit() is used for '/dev/kvm', '/dev/sev', and
'/sys/module/%s/parameters/%s' and skipping test when the entry is missing
is completely reasonable. Other errors, however, may indicate a real issue
which is easy to miss. E.g. when 'hyperv_features' test was entering an
infinite loop the output was:
./hyperv_features
Testing access to Hyper-V specific MSRs
1..0 # SKIP - /dev/kvm not available (errno: 24)
and this can easily get overlooked.
Keep ENOENT case 'special' for skipping tests and fail when open() results
in any other errno.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Link: https://lore.kernel.org/r/20240129085847.2674082-2-vkuznets@redhat.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
When X86_FEATURE_INVTSC is missing, guest_test_msrs_access() was supposed
to skip testing dependent Hyper-V invariant TSC feature. Unfortunately,
'continue' does not lead to that as stage is not incremented. Moreover,
'vm' allocated with vm_create_with_one_vcpu() is not freed and the test
runs out of available file descriptors very quickly.
Fixes: bd827bd775 ("KVM: selftests: Test Hyper-V invariant TSC control")
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Link: https://lore.kernel.org/r/20240129085847.2674082-1-vkuznets@redhat.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Delete the AMX's tests "stage" counter, as the counter is no longer used,
which makes clang unhappy:
x86_64/amx_test.c:224:6: error: variable 'stage' set but not used
int stage, ret;
^
1 error generated.
Note, "stage" was never really used, it just happened to be dumped out by
a (failed) assertion on run->exit_reason, i.e. the AMX test has no concept
of stages, the code was likely copy+pasted from a different test.
Fixes: c96f57b080 ("KVM: selftests: Make vCPU exit reason test assertion common")
Reviewed-by: Jim Mattson <jmattson@google.com>
Link: https://lore.kernel.org/r/20240109220302.399296-1-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
TEST_* functions append their own newline. Remove newlines from
TEST_* callsites to avoid extra newlines in output.
Signed-off-by: Andrew Jones <ajones@ventanamicro.com>
Link: https://lore.kernel.org/r/20231206170241.82801-12-ajones@ventanamicro.com
[sean: keep the newline in the "tsc\n" strncmp()]
Signed-off-by: Sean Christopherson <seanjc@google.com>
TEST_* functions append their own newline. Remove newlines from
TEST_* callsites to avoid extra newlines in output.
Signed-off-by: Andrew Jones <ajones@ventanamicro.com>
Acked-by: Janosch Frank <frankja@linux.ibm.com>
Acked-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
Link: https://lore.kernel.org/r/20231206170241.82801-11-ajones@ventanamicro.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
TEST_* functions append their own newline. Remove newlines from
TEST_* callsites to avoid extra newlines in output.
Signed-off-by: Andrew Jones <ajones@ventanamicro.com>
Acked-by: Anup Patel <anup@brainfault.org>
Link: https://lore.kernel.org/r/20231206170241.82801-10-ajones@ventanamicro.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
TEST_* functions append their own newline. Remove newlines from
TEST_* callsites to avoid extra newlines in output.
Signed-off-by: Andrew Jones <ajones@ventanamicro.com>
Acked-by: Zenghui Yu <yuzenghui@huawei.com>
Link: https://lore.kernel.org/r/20231206170241.82801-9-ajones@ventanamicro.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
TEST_* functions append their own newline. Remove newlines from
TEST_* callsites to avoid extra newlines in output.
Signed-off-by: Andrew Jones <ajones@ventanamicro.com>
Link: https://lore.kernel.org/r/20231206170241.82801-8-ajones@ventanamicro.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Rework the NX hugepage test's skip message regarding the magic token to
provide all of the necessary magic, and to very explicitly recommended
using the wrapper shell script.
Opportunistically remove an overzealous newline; splitting the
recommendation message across two lines of ~45 characters makes it much
harder to read than running out a single line to 98 characters.
Link: https://lore.kernel.org/r/20231129224042.530798-1-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
There are some feature fields with nonzero minimum valid value. Make
sure get_safe_value() won't return invalid field values for them.
Also fix a bug that wrongly uses the feature bits type as the feature
bits sign causing all fields as signed in the get_safe_value() and
get_invalid_value().
Fixes: 54a9ea7352 ("KVM: arm64: selftests: Test for setting ID register from usersapce")
Reported-by: Zenghui Yu <yuzenghui@huawei.com>
Reported-by: Itaru Kitayama <itaru.kitayama@linux.dev>
Tested-by: Itaru Kitayama <itaru.kitayama@fujitsu.com>
Signed-off-by: Jing Zhang <jingzhangos@google.com>
Link: https://lore.kernel.org/r/20240115220210.3966064-2-jingzhangos@google.com
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
The KVM RISC-V allows Zfa extension for Guest/VM so let us
add this extension to get-reg-list test.
Signed-off-by: Anup Patel <apatel@ventanamicro.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Signed-off-by: Anup Patel <anup@brainfault.org>
The KVM RISC-V allows Zvfh[min] extensions for Guest/VM so let us
add these extensions to get-reg-list test.
Signed-off-by: Anup Patel <apatel@ventanamicro.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Signed-off-by: Anup Patel <anup@brainfault.org>
The KVM RISC-V allows Zihintntl extension for Guest/VM so let us
add this extension to get-reg-list test.
Signed-off-by: Anup Patel <apatel@ventanamicro.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Signed-off-by: Anup Patel <anup@brainfault.org>
The KVM RISC-V allows Zfh[min] extensions for Guest/VM so let us
add these extensions to get-reg-list test.
Signed-off-by: Anup Patel <apatel@ventanamicro.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Signed-off-by: Anup Patel <anup@brainfault.org>
The KVM RISC-V allows vector crypto extensions for Guest/VM so let us
add these extensions to get-reg-list test. This includes extensions
Zvbb, Zvbc, Zvkb, Zvkg, Zvkned, Zvknha, Zvknhb, Zvksed, Zvksh, and Zvkt.
Signed-off-by: Anup Patel <apatel@ventanamicro.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Signed-off-by: Anup Patel <anup@brainfault.org>
The KVM RISC-V allows scaler crypto extensions for Guest/VM so let us
add these extensions to get-reg-list test. This includes extensions
Zbkb, Zbkc, Zbkx, Zknd, Zkne, Zknh, Zkr, Zksed, Zksh, and Zkt.
Signed-off-by: Anup Patel <apatel@ventanamicro.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Signed-off-by: Anup Patel <anup@brainfault.org>
The KVM RISC-V allows Zbc extension for Guest/VM so let us add
this extension to get-reg-list test.
Signed-off-by: Anup Patel <apatel@ventanamicro.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Signed-off-by: Anup Patel <anup@brainfault.org>
- Use memdup_array_user() to harden against overflow.
- Unconditionally advertise KVM_CAP_DEVICE_CTRL for all architectures.
- Clean up Kconfigs that all KVM architectures were selecting
- New functionality around "guest_memfd", a new userspace API that
creates an anonymous file and returns a file descriptor that refers
to it. guest_memfd files are bound to their owning virtual machine,
cannot be mapped, read, or written by userspace, and cannot be resized.
guest_memfd files do however support PUNCH_HOLE, which can be used to
switch a memory area between guest_memfd and regular anonymous memory.
- New ioctl KVM_SET_MEMORY_ATTRIBUTES allowing userspace to specify
per-page attributes for a given page of guest memory; right now the
only attribute is whether the guest expects to access memory via
guest_memfd or not, which in Confidential SVMs backed by SEV-SNP,
TDX or ARM64 pKVM is checked by firmware or hypervisor that guarantees
confidentiality (AMD PSP, Intel TDX module, or EL2 in the case of pKVM).
x86:
- Support for "software-protected VMs" that can use the new guest_memfd
and page attributes infrastructure. This is mostly useful for testing,
since there is no pKVM-like infrastructure to provide a meaningfully
reduced TCB.
- Fix a relatively benign off-by-one error when splitting huge pages during
CLEAR_DIRTY_LOG.
- Fix a bug where KVM could incorrectly test-and-clear dirty bits in non-leaf
TDP MMU SPTEs if a racing thread replaces a huge SPTE with a non-huge SPTE.
- Use more generic lockdep assertions in paths that don't actually care
about whether the caller is a reader or a writer.
- let Xen guests opt out of having PV clock reported as "based on a stable TSC",
because some of them don't expect the "TSC stable" bit (added to the pvclock
ABI by KVM, but never set by Xen) to be set.
- Revert a bogus, made-up nested SVM consistency check for TLB_CONTROL.
- Advertise flush-by-ASID support for nSVM unconditionally, as KVM always
flushes on nested transitions, i.e. always satisfies flush requests. This
allows running bleeding edge versions of VMware Workstation on top of KVM.
- Sanity check that the CPU supports flush-by-ASID when enabling SEV support.
- On AMD machines with vNMI, always rely on hardware instead of intercepting
IRET in some cases to detect unmasking of NMIs
- Support for virtualizing Linear Address Masking (LAM)
- Fix a variety of vPMU bugs where KVM fail to stop/reset counters and other state
prior to refreshing the vPMU model.
- Fix a double-overflow PMU bug by tracking emulated counter events using a
dedicated field instead of snapshotting the "previous" counter. If the
hardware PMC count triggers overflow that is recognized in the same VM-Exit
that KVM manually bumps an event count, KVM would pend PMIs for both the
hardware-triggered overflow and for KVM-triggered overflow.
- Turn off KVM_WERROR by default for all configs so that it's not
inadvertantly enabled by non-KVM developers, which can be problematic for
subsystems that require no regressions for W=1 builds.
- Advertise all of the host-supported CPUID bits that enumerate IA32_SPEC_CTRL
"features".
- Don't force a masterclock update when a vCPU synchronizes to the current TSC
generation, as updating the masterclock can cause kvmclock's time to "jump"
unexpectedly, e.g. when userspace hotplugs a pre-created vCPU.
- Use RIP-relative address to read kvm_rebooting in the VM-Enter fault paths,
partly as a super minor optimization, but mostly to make KVM play nice with
position independent executable builds.
- Guard KVM-on-HyperV's range-based TLB flush hooks with an #ifdef on
CONFIG_HYPERV as a minor optimization, and to self-document the code.
- Add CONFIG_KVM_HYPERV to allow disabling KVM support for HyperV "emulation"
at build time.
ARM64:
- LPA2 support, adding 52bit IPA/PA capability for 4kB and 16kB
base granule sizes. Branch shared with the arm64 tree.
- Large Fine-Grained Trap rework, bringing some sanity to the
feature, although there is more to come. This comes with
a prefix branch shared with the arm64 tree.
- Some additional Nested Virtualization groundwork, mostly
introducing the NV2 VNCR support and retargetting the NV
support to that version of the architecture.
- A small set of vgic fixes and associated cleanups.
Loongarch:
- Optimization for memslot hugepage checking
- Cleanup and fix some HW/SW timer issues
- Add LSX/LASX (128bit/256bit SIMD) support
RISC-V:
- KVM_GET_REG_LIST improvement for vector registers
- Generate ISA extension reg_list using macros in get-reg-list selftest
- Support for reporting steal time along with selftest
s390:
- Bugfixes
Selftests:
- Fix an annoying goof where the NX hugepage test prints out garbage
instead of the magic token needed to run the test.
- Fix build errors when a header is delete/moved due to a missing flag
in the Makefile.
- Detect if KVM bugged/killed a selftest's VM and print out a helpful
message instead of complaining that a random ioctl() failed.
- Annotate the guest printf/assert helpers with __printf(), and fix the
various bugs that were lurking due to lack of said annotation.
There are two non-KVM patches buried in the middle of guest_memfd support:
fs: Rename anon_inode_getfile_secure() and anon_inode_getfd_secure()
mm: Add AS_UNMOVABLE to mark mapping as completely unmovable
The first is small and mostly suggested-by Christian Brauner; the second
a bit less so but it was written by an mm person (Vlastimil Babka).
-----BEGIN PGP SIGNATURE-----
iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmWcMWkUHHBib256aW5p
QHJlZGhhdC5jb20ACgkQv/vSX3jHroO15gf/WLmmg3SET6Uzw9iEq2xo28831ZA+
6kpILfIDGKozV5safDmMvcInlc/PTnqOFrsKyyN4kDZ+rIJiafJdg/loE0kPXBML
wdR+2ix5kYI1FucCDaGTahskBDz8Lb/xTpwGg9BFLYFNmuUeHc74o6GoNvr1uliE
4kLZL2K6w0cSMPybUD+HqGaET80ZqPwecv+s1JL+Ia0kYZJONJifoHnvOUJ7DpEi
rgudVdgzt3EPjG0y1z6MjvDBXTCOLDjXajErlYuZD3Ej8N8s59Dh2TxOiDNTLdP4
a4zjRvDmgyr6H6sz+upvwc7f4M4p+DBvf+TkWF54mbeObHUYliStqURIoA==
=66Ws
-----END PGP SIGNATURE-----
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull kvm updates from Paolo Bonzini:
"Generic:
- Use memdup_array_user() to harden against overflow.
- Unconditionally advertise KVM_CAP_DEVICE_CTRL for all
architectures.
- Clean up Kconfigs that all KVM architectures were selecting
- New functionality around "guest_memfd", a new userspace API that
creates an anonymous file and returns a file descriptor that refers
to it. guest_memfd files are bound to their owning virtual machine,
cannot be mapped, read, or written by userspace, and cannot be
resized. guest_memfd files do however support PUNCH_HOLE, which can
be used to switch a memory area between guest_memfd and regular
anonymous memory.
- New ioctl KVM_SET_MEMORY_ATTRIBUTES allowing userspace to specify
per-page attributes for a given page of guest memory; right now the
only attribute is whether the guest expects to access memory via
guest_memfd or not, which in Confidential SVMs backed by SEV-SNP,
TDX or ARM64 pKVM is checked by firmware or hypervisor that
guarantees confidentiality (AMD PSP, Intel TDX module, or EL2 in
the case of pKVM).
x86:
- Support for "software-protected VMs" that can use the new
guest_memfd and page attributes infrastructure. This is mostly
useful for testing, since there is no pKVM-like infrastructure to
provide a meaningfully reduced TCB.
- Fix a relatively benign off-by-one error when splitting huge pages
during CLEAR_DIRTY_LOG.
- Fix a bug where KVM could incorrectly test-and-clear dirty bits in
non-leaf TDP MMU SPTEs if a racing thread replaces a huge SPTE with
a non-huge SPTE.
- Use more generic lockdep assertions in paths that don't actually
care about whether the caller is a reader or a writer.
- let Xen guests opt out of having PV clock reported as "based on a
stable TSC", because some of them don't expect the "TSC stable" bit
(added to the pvclock ABI by KVM, but never set by Xen) to be set.
- Revert a bogus, made-up nested SVM consistency check for
TLB_CONTROL.
- Advertise flush-by-ASID support for nSVM unconditionally, as KVM
always flushes on nested transitions, i.e. always satisfies flush
requests. This allows running bleeding edge versions of VMware
Workstation on top of KVM.
- Sanity check that the CPU supports flush-by-ASID when enabling SEV
support.
- On AMD machines with vNMI, always rely on hardware instead of
intercepting IRET in some cases to detect unmasking of NMIs
- Support for virtualizing Linear Address Masking (LAM)
- Fix a variety of vPMU bugs where KVM fail to stop/reset counters
and other state prior to refreshing the vPMU model.
- Fix a double-overflow PMU bug by tracking emulated counter events
using a dedicated field instead of snapshotting the "previous"
counter. If the hardware PMC count triggers overflow that is
recognized in the same VM-Exit that KVM manually bumps an event
count, KVM would pend PMIs for both the hardware-triggered overflow
and for KVM-triggered overflow.
- Turn off KVM_WERROR by default for all configs so that it's not
inadvertantly enabled by non-KVM developers, which can be
problematic for subsystems that require no regressions for W=1
builds.
- Advertise all of the host-supported CPUID bits that enumerate
IA32_SPEC_CTRL "features".
- Don't force a masterclock update when a vCPU synchronizes to the
current TSC generation, as updating the masterclock can cause
kvmclock's time to "jump" unexpectedly, e.g. when userspace
hotplugs a pre-created vCPU.
- Use RIP-relative address to read kvm_rebooting in the VM-Enter
fault paths, partly as a super minor optimization, but mostly to
make KVM play nice with position independent executable builds.
- Guard KVM-on-HyperV's range-based TLB flush hooks with an #ifdef on
CONFIG_HYPERV as a minor optimization, and to self-document the
code.
- Add CONFIG_KVM_HYPERV to allow disabling KVM support for HyperV
"emulation" at build time.
ARM64:
- LPA2 support, adding 52bit IPA/PA capability for 4kB and 16kB base
granule sizes. Branch shared with the arm64 tree.
- Large Fine-Grained Trap rework, bringing some sanity to the
feature, although there is more to come. This comes with a prefix
branch shared with the arm64 tree.
- Some additional Nested Virtualization groundwork, mostly
introducing the NV2 VNCR support and retargetting the NV support to
that version of the architecture.
- A small set of vgic fixes and associated cleanups.
Loongarch:
- Optimization for memslot hugepage checking
- Cleanup and fix some HW/SW timer issues
- Add LSX/LASX (128bit/256bit SIMD) support
RISC-V:
- KVM_GET_REG_LIST improvement for vector registers
- Generate ISA extension reg_list using macros in get-reg-list
selftest
- Support for reporting steal time along with selftest
s390:
- Bugfixes
Selftests:
- Fix an annoying goof where the NX hugepage test prints out garbage
instead of the magic token needed to run the test.
- Fix build errors when a header is delete/moved due to a missing
flag in the Makefile.
- Detect if KVM bugged/killed a selftest's VM and print out a helpful
message instead of complaining that a random ioctl() failed.
- Annotate the guest printf/assert helpers with __printf(), and fix
the various bugs that were lurking due to lack of said annotation"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (185 commits)
x86/kvm: Do not try to disable kvmclock if it was not enabled
KVM: x86: add missing "depends on KVM"
KVM: fix direction of dependency on MMU notifiers
KVM: introduce CONFIG_KVM_COMMON
KVM: arm64: Add missing memory barriers when switching to pKVM's hyp pgd
KVM: arm64: vgic-its: Avoid potential UAF in LPI translation cache
RISC-V: KVM: selftests: Add get-reg-list test for STA registers
RISC-V: KVM: selftests: Add steal_time test support
RISC-V: KVM: selftests: Add guest_sbi_probe_extension
RISC-V: KVM: selftests: Move sbi_ecall to processor.c
RISC-V: KVM: Implement SBI STA extension
RISC-V: KVM: Add support for SBI STA registers
RISC-V: KVM: Add support for SBI extension registers
RISC-V: KVM: Add SBI STA info to vcpu_arch
RISC-V: KVM: Add steal-update vcpu request
RISC-V: KVM: Add SBI STA extension skeleton
RISC-V: paravirt: Implement steal-time support
RISC-V: Add SBI STA extension definitions
RISC-V: paravirt: Add skeleton for pv-time support
RISC-V: KVM: Fix indentation in kvm_riscv_vcpu_set_reg_csr()
...
* for-next/cpufeature
- Remove ARM64_HAS_NO_HW_PREFETCH copy_page() optimisation for ye olde
Thunder-X machines.
- Avoid mapping KPTI trampoline when it is not required.
- Make CPU capability API more robust during early initialisation.
* for-next/early-idreg-overrides
- Remove dependencies on core kernel helpers from the early
command-line parsing logic in preparation for moving this code
before the kernel is mapped.
* for-next/fpsimd
- Restore kernel-mode fpsimd context lazily, allowing us to run fpsimd
code sequences in the kernel with pre-emption enabled.
* for-next/kbuild
- Install 'vmlinuz.efi' when CONFIG_EFI_ZBOOT=y.
- Makefile cleanups.
* for-next/lpa2-prep
- Preparatory work for enabling the 'LPA2' extension, which will
introduce 52-bit virtual and physical addressing even with 4KiB
pages (including for KVM guests).
* for-next/misc
- Remove dead code and fix a typo.
* for-next/mm
- Pass NUMA node information for IRQ stack allocations.
* for-next/perf
- Add perf support for the Synopsys DesignWare PCIe PMU.
- Add support for event counting thresholds (FEAT_PMUv3_TH) introduced
in Armv8.8.
- Add support for i.MX8DXL SoCs to the IMX DDR PMU driver.
- Minor PMU driver fixes and optimisations.
* for-next/rip-vpipt
- Remove what support we had for the obsolete VPIPT I-cache policy.
* for-next/selftests
- Improvements to the SVE and SME selftests.
* for-next/stacktrace
- Refactor kernel unwind logic so that it can used by BPF unwinding
and, eventually, reliable backtracing.
* for-next/sysregs
- Update a bunch of register definitions based on the latest XML drop
from Arm.
-----BEGIN PGP SIGNATURE-----
iQFEBAABCgAuFiEEPxTL6PPUbjXGY88ct6xw3ITBYzQFAmWWvKYQHHdpbGxAa2Vy
bmVsLm9yZwAKCRC3rHDchMFjNIiTB/9agZBkEhZjP2sNDGyE4UFwawweWHkt2r8h
WyvdwP91Z/AIsYSsGYu36J0l4pOnMKp/i6t+rt031SK4j+Q8hJYhSfDt3RvVbc0/
Pz9D18V6cLrfq+Yxycqq9ufVdjs+m+CQ5WeLaRGmNIyEzJ/Jv/qrAN+2r603EeLP
nq08qMZhDIQd2ZzbigCnGaNrTsVSafFfBFv1GsgDvnMZAjs1G6457A6zu+NatNUc
+TMSG+3EawutHZZ2noXl0Ra7VOfIbVZFiUssxRPenKQByHHHR+QB2c/O1blri+dm
XLMutvqO2/WvYGIfXO5koqZqvpVeR3zXxPwmGi5hQBsmOjtXzKd+
=U4mo
-----END PGP SIGNATURE-----
Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 updates from Will Deacon:
"CPU features:
- Remove ARM64_HAS_NO_HW_PREFETCH copy_page() optimisation for ye
olde Thunder-X machines
- Avoid mapping KPTI trampoline when it is not required
- Make CPU capability API more robust during early initialisation
Early idreg overrides:
- Remove dependencies on core kernel helpers from the early
command-line parsing logic in preparation for moving this code
before the kernel is mapped
FPsimd:
- Restore kernel-mode fpsimd context lazily, allowing us to run
fpsimd code sequences in the kernel with pre-emption enabled
KBuild:
- Install 'vmlinuz.efi' when CONFIG_EFI_ZBOOT=y
- Makefile cleanups
LPA2 prep:
- Preparatory work for enabling the 'LPA2' extension, which will
introduce 52-bit virtual and physical addressing even with 4KiB
pages (including for KVM guests).
Misc:
- Remove dead code and fix a typo
MM:
- Pass NUMA node information for IRQ stack allocations
Perf:
- Add perf support for the Synopsys DesignWare PCIe PMU
- Add support for event counting thresholds (FEAT_PMUv3_TH)
introduced in Armv8.8
- Add support for i.MX8DXL SoCs to the IMX DDR PMU driver.
- Minor PMU driver fixes and optimisations
RIP VPIPT:
- Remove what support we had for the obsolete VPIPT I-cache policy
Selftests:
- Improvements to the SVE and SME selftests
Stacktrace:
- Refactor kernel unwind logic so that it can used by BPF unwinding
and, eventually, reliable backtracing
Sysregs:
- Update a bunch of register definitions based on the latest XML drop
from Arm"
* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (87 commits)
kselftest/arm64: Don't probe the current VL for unsupported vector types
efi/libstub: zboot: do not use $(shell ...) in cmd_copy_and_pad
arm64: properly install vmlinuz.efi
arm64/sysreg: Add missing system instruction definitions for FGT
arm64/sysreg: Add missing system register definitions for FGT
arm64/sysreg: Add missing ExtTrcBuff field definition to ID_AA64DFR0_EL1
arm64/sysreg: Add missing Pauth_LR field definitions to ID_AA64ISAR1_EL1
arm64: memory: remove duplicated include
arm: perf: Fix ARCH=arm build with GCC
arm64: Align boot cpucap handling with system cpucap handling
arm64: Cleanup system cpucap handling
MAINTAINERS: add maintainers for DesignWare PCIe PMU driver
drivers/perf: add DesignWare PCIe PMU driver
PCI: Move pci_clear_and_set_dword() helper to PCI header
PCI: Add Alibaba Vendor ID to linux/pci_ids.h
docs: perf: Add description for Synopsys DesignWare PCIe PMU driver
arm64: irq: set the correct node for shadow call stack
Revert "perf/arm_dmc620: Remove duplicate format attribute #defines"
arm64: fpsimd: Implement lazy restore for kernel mode FPSIMD
arm64: fpsimd: Preserve/restore kernel mode NEON at context switch
...
- Guard KVM-on-HyperV's range-based TLB flush hooks with an #ifdef on
CONFIG_HYPERV as a minor optimization, and to self-document the code.
- Add CONFIG_KVM_HYPERV to allow disabling KVM support for HyperV "emulation"
at build time.
-----BEGIN PGP SIGNATURE-----
iQJGBAABCgAwFiEEMHr+pfEFOIzK+KY1YJEiAU0MEvkFAmWW8gYSHHNlYW5qY0Bn
b29nbGUuY29tAAoJEGCRIgFNDBL5sGUP/iadHMz7Up1X29IDGtq58LRORNVXp2Ln
2dqoj8IKZeSr+mPMw2GvZyuiLqVPMs4Et21WJfCO7HgKd/NPMDORwRndhJYweFRY
yk+5NJLvXYuo8UR3b2QYy8XUghEqP+j5eYyon6UdCiPACcBGTpgoj4pU7SLM7l4T
EOge42ya5YxD/1oWr5vyifNrOJCPNTBYcC0as5//+RdnmQYqYZ26Z73b0B8Pdct4
XMWwgoKlmLTmei0YntXtGaDGimCvTYP8EPM4tOWgiBSWMhQXWbAh/0biDfd3eZVO
Hoe4HvstdjUNbpO3h3Zo78Ob7ehk4kx/6r0nlQnz5JxzGnuDjYCDIVUlYn0mw5Yi
nu4ztr8M3VRksDbpmAjSO9XFEKIYxlYQfzZ1UuTy8ehdBYTDl/3lPAbh2ApUYE72
Tt2PXmFGz2j1sjG38Gh94s48Za5OxHoVlfq8iGhU4v7UjuxnMNHfExOWd66SwZgx
5tZkr4rj/pWt21wr7jaVqFGzuftIC5G4ZEBhh7JcW89oamFrykgQUu5z4dhBMO75
G7DAVh9eSH2SKkmJH1ClXriveazTK7fqMx8sZzzRnusMz09qH7SIdjSzmp7H5utw
pWBfatft0n0FTI1r+hxGueiJt7dFlrIz0Q4hHyBN4saoVH121bZioc0pq1ob6MIk
Y2Ou4xJBt14F
=bjfs
-----END PGP SIGNATURE-----
Merge tag 'kvm-x86-hyperv-6.8' of https://github.com/kvm-x86/linux into HEAD
KVM x86 Hyper-V changes for 6.8:
- Guard KVM-on-HyperV's range-based TLB flush hooks with an #ifdef on
CONFIG_HYPERV as a minor optimization, and to self-document the code.
- Add CONFIG_KVM_HYPERV to allow disabling KVM support for HyperV "emulation"
at build time.
- LPA2 support, adding 52bit IPA/PA capability for 4kB and 16kB
base granule sizes. Branch shared with the arm64 tree.
- Large Fine-Grained Trap rework, bringing some sanity to the
feature, although there is more to come. This comes with
a prefix branch shared with the arm64 tree.
- Some additional Nested Virtualization groundwork, mostly
introducing the NV2 VNCR support and retargetting the NV
support to that version of the architecture.
- A small set of vgic fixes and associated cleanups.
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEn9UcU+C1Yxj9lZw9I9DQutE9ekMFAmWX4wUACgkQI9DQutE9
ekM0DxAAvOJtM+m8ahv2tCSHZpwowkuKBBc7JWI75l4befHEOSvYMZwQwejrequa
lPwLgx9t0sGjba+tRGv1JZMtnUBjV4V/lcrhX95AYTF5dfg7vbuTxUh/YFu1CaQ/
MkuKVJ74PUWqpvDYSzwW8Jjqu6RskjW0HqVPMbFkmUWWc8cgExc8XD9M+nu0SrNT
g5261KD53CUeyNaR0/+zkaHouq2Skeqw/u2d5OLdnY23hINMZ0qR1jYHj935suYy
YrMTiMje1h/fs7YXWra4LmMcsg0V+3LZVQJXwRARrZdk2xkW5w+eLPIYjVqcA7aT
VwhrtzjEzD56trrSZClOpj7MSVfQ8OjV7BgvSUpgLT5+kjVrFLIEMIOakiTOCoIJ
weweRawTyomUoIsT1EkRmRYQkPH3Z552tcrztD/slYvqrtCB4JcHKF0O7BT88ZfM
t2hRhlT+32KR9cOciLfFMzlZI1uKQYF8Z+CvvBA5TJ9Hv8JsIwF2E/NjYUy2ilca
iDzF5KdZ/OLQzjwWVWDq9OlvepB2rLGQKNnw67jd1BSzd9Jj3eVuaI/9xRBrLDYR
cBOMoIaZMy7Va+pop1zoFEhC7IbTglVHzsj2ch+4F1NB/1+Dd0zBQKbDUPqp5TR/
OOuonTTVk9yH6RgpUULKlbRZ4oU70UoOBFBxCqnvng0cw1KBbbA=
=Q6c+
-----END PGP SIGNATURE-----
Merge tag 'kvmarm-6.8' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD
KVM/arm64 updates for Linux 6.8
- LPA2 support, adding 52bit IPA/PA capability for 4kB and 16kB
base granule sizes. Branch shared with the arm64 tree.
- Large Fine-Grained Trap rework, bringing some sanity to the
feature, although there is more to come. This comes with
a prefix branch shared with the arm64 tree.
- Some additional Nested Virtualization groundwork, mostly
introducing the NV2 VNCR support and retargetting the NV
support to that version of the architecture.
- A small set of vgic fixes and associated cleanups.
- KVM_GET_REG_LIST improvement for vector registers
- Generate ISA extension reg_list using macros in get-reg-list selftest
- Steal time account support along with selftest
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEZdn75s5e6LHDQ+f/rUjsVaLHLAcFAmWQ+cgACgkQrUjsVaLH
LAckBA//R4X9L5ugfPdDunp3ntjZXmNtBS5pM2jD+UvaoFn2kOA1o5kOD5mXluuh
0imNjVuzlrX7XoAATQ4BoeoXg0whDbnv/8TE13KqSl1PfNziH2p5YD2DuHXPST3B
V2VHrGACZ4wN074Whztl0oYI72obGnmpcsiglifkeCvRPerghHuDu40eUaWvCmgD
DPwT+bjBgxkDZ4IheyytUrPql6reALh1Qo1vfj0FsJAgj+MAqQeD8n6rixSPnOdE
9XXa4jdu7ycDp675VX/DVEWsNBQGPrsRK/uCiMksO36td+wLCKpAkvX95mE0w/L8
qFJ+dN1c+1ZkjooHdVLfq2MjxaIRwmIowk7SeJbpvGIf/zG3r7eany7eXJT0+NjO
22j5FY2z1NqcSG6Fazx76Qp2vVBVbxHShP9h7d6VTZYS7XENjmV6IWHpTSuSF8+n
puj8Nf5C7WuqbySirSgQndDuKawn9myqfXXEoAuSiZ+kVyYEl8QnXm2gAIcxRDHX
x+NDPMv0DpMBRO9qa/tXeqgNue/XOTJwgbmXzAlCNff3U7hPIHJ/5aZiJ/Re5TeE
DxiU9AmIsNN2Bh0csS/wQbdScIqkOdOiDYEwT1DXOJWpmhiyCW7vR8ltaIuMJ4vP
DtlfuUlSe4aml957nAiqqyjQAY/7gqmpoaGwu+lmrOX1K7fdtF0=
=FeiG
-----END PGP SIGNATURE-----
Merge tag 'kvm-riscv-6.8-1' of https://github.com/kvm-riscv/linux into HEAD
KVM/riscv changes for 6.8 part #1
- KVM_GET_REG_LIST improvement for vector registers
- Generate ISA extension reg_list using macros in get-reg-list selftest
- Steal time account support along with selftest
Add SBI STA and its two registers to the get-reg-list test.
Reviewed-by: Anup Patel <anup@brainfault.org>
Reviewed-by: Atish Patra <atishp@rivosinc.com>
Signed-off-by: Andrew Jones <ajones@ventanamicro.com>
Signed-off-by: Anup Patel <anup@brainfault.org>
With the introduction of steal-time accounting support for
RISC-V KVM we can add RISC-V support to the steal_time test.
Reviewed-by: Anup Patel <anup@brainfault.org>
Reviewed-by: Atish Patra <atishp@rivosinc.com>
Signed-off-by: Andrew Jones <ajones@ventanamicro.com>
Signed-off-by: Anup Patel <anup@brainfault.org>
Add guest_sbi_probe_extension(), allowing guest code to probe for
SBI extensions. As guest_sbi_probe_extension() needs
SBI_ERR_NOT_SUPPORTED, take the opportunity to bring in all SBI
error codes. We don't bring in all current extension IDs or base
extension function IDs though, even though we need one of each,
because we'd prefer to bring those in as necessary.
Reviewed-by: Anup Patel <anup@brainfault.org>
Reviewed-by: Atish Patra <atishp@rivosinc.com>
Signed-off-by: Andrew Jones <ajones@ventanamicro.com>
Signed-off-by: Anup Patel <anup@brainfault.org>
sbi_ecall() isn't ucall specific and its prototype is already in
processor.h. Move its implementation to processor.c.
Reviewed-by: Anup Patel <anup@brainfault.org>
Signed-off-by: Andrew Jones <ajones@ventanamicro.com>
Signed-off-by: Anup Patel <anup@brainfault.org>
SBI extension registers may not be present and indeed when
running on a platform without sscofpmf the PMU SBI extension
is not. Move the SBI extension registers from the base set of
registers to the filter list. Individual configs should test
for any that may or may not be present separately. Since
the PMU extension may disappear and the DBCN extension is only
present in later kernels, separate them from the rest into
their own configs. The rest are lumped together into the same
config.
Signed-off-by: Andrew Jones <ajones@ventanamicro.com>
Reviewed-by: Anup Patel <anup@brainfault.org>
Signed-off-by: Anup Patel <anup@brainfault.org>
Always use register subtypes in the get-reg-list test when registers
have them. The only registers neglecting to do so were ISA extension
registers. While we don't really need to use KVM_REG_RISCV_ISA_SINGLE
(since it's zero), the main purpose is to avoid confusion and to
self-document the tests. Also add print support for the multi
registers like SBI extensions have, even though they're only used for
debugging.
Signed-off-by: Andrew Jones <ajones@ventanamicro.com>
Reviewed-by: Haibo Xu <haibo1.xu@intel.com>
Reviewed-by: Anup Patel <anup@brainfault.org>
Signed-off-by: Anup Patel <anup@brainfault.org>
While adding RISCV_SBI_EXT_REG(), acknowledge that some registers
have subtypes and extend __kvm_reg_id() to take a subtype field.
Then, update all macros to set the new field appropriately. The
general CSR macro gets renamed to include "GENERAL", but the other
macros, like the new RISCV_SBI_EXT_REG, just use the SINGLE subtype.
Signed-off-by: Andrew Jones <ajones@ventanamicro.com>
Reviewed-by: Anup Patel <anup@brainfault.org>
Signed-off-by: Anup Patel <anup@brainfault.org>
These registers are no longer getting added to get-reg-list.
We keep sbi_ext_multi_id_to_str() for printing, even though
we don't expect it to normally be used, because it may be
useful for debug.
Signed-off-by: Andrew Jones <ajones@ventanamicro.com>
Reviewed-by: Anup Patel <anup@brainfault.org>
Signed-off-by: Anup Patel <anup@brainfault.org>
Various ISA extension reg_list have common pattern so let us generate
these using macros.
We define two macros for the above purpose:
1) KVM_ISA_EXT_SIMPLE_CONFIG - Macro to generate reg_list for
ISA extension without any additional ONE_REG registers
2) KVM_ISA_EXT_SUBLIST_CONFIG - Macro to generate reg_list for
ISA extension with additional ONE_REG registers
This patch also adds the missing config for svnapot.
Signed-off-by: Anup Patel <apatel@ventanamicro.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Signed-off-by: Anup Patel <anup@brainfault.org>
- Ensure a vCPU's redistributor is unregistered from the MMIO bus
if vCPU creation fails
- Fix building KVM selftests for arm64 from the top-level Makefile
-----BEGIN PGP SIGNATURE-----
iHUEABYIAB0WIQSNXHjWXuzMZutrKNKivnWIJHzdFgUCZYCYmAAKCRCivnWIJHzd
FhU+AQDqIOIg3VMV+VjxhrG5aiHccq9o1mczO4LL9FQUO9AdYwD/SbTP4puBlfai
gOFQDuvJFogTwKmYPDO2jycp1ekTuQ0=
=RhfO
-----END PGP SIGNATURE-----
Merge tag 'kvmarm-fixes-6.7-2' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into kvm-master
KVM/arm64 fixes for 6.7, part #2
- Ensure a vCPU's redistributor is unregistered from the MMIO bus
if vCPU creation fails
- Fix building KVM selftests for arm64 from the top-level Makefile
When we dynamically generate a name for a configuration in get-reg-list
we use strcat() to append to a buffer allocated using malloc() but we
never initialise that buffer. Since malloc() offers no guarantees
regarding the contents of the memory it returns this can lead to us
corrupting, and likely overflowing, the buffer:
vregs: PASS
vregs+pmu: PASS
sve: PASS
sve+pmu: PASS
vregs+pauth_address+pauth_generic: PASS
X?vr+gspauth_addre+spauth_generi+pmu: PASS
The bug is that strcat() should have been strcpy(), and that replacement
would be enough to fix it, but there are other things in the function
that leave something to be desired. In particular, an (incorrectly)
empty config would cause an out of bounds access to c->name[-1].
Since the strcpy() call relies on c->name[0..len-1] being initialized,
enforce that invariant throughout the function.
Fixes: 2f9ace5d45 ("KVM: arm64: selftests: get-reg-list: Introduce vcpu configs")
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Co-developed-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Mark Brown <broonie@kernel.org>
Message-Id: <20231211-kvm-get-reg-list-str-init-v3-1-6554c71c77b1@kernel.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
print_reg() will print everything it knows when it encounters
a register ID it's unfamiliar with in the default cases of its
decoding switches. Fix several issues with these (until now,
never tested) paths; missing newlines in printfs, missing
complement operator in mask, and missing return in order to
avoid continuing to decode.
Fixes: 62d0c458f8 ("KVM: riscv: selftests: get-reg-list print_reg should never fail")
Signed-off-by: Andrew Jones <ajones@ventanamicro.com>
Reviewed-by: Haibo Xu <haibo1.xu@intel.com>
Signed-off-by: Anup Patel <anup@brainfault.org>
Building the KVM selftests from the main selftests Makefile (as opposed
to the kvm subdirectory) doesn't work as OUTPUT is set, forcing the
generated header to spill into the selftests directory. Additionally,
relative paths do not work when building outside of the srctree, as the
canonical selftests path is replaced with 'kselftest' in the output.
Work around both of these issues by explicitly overriding OUTPUT on the
submake cmdline. Move the whole fragment below the point lib.mk gets
included such that $(abs_objdir) is available.
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Tested-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20231212070431.145544-2-oliver.upton@linux.dev
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
- Fix an annoying goof where the NX hugepage test prints out garbage
instead of the magic token needed to run the text.
- Fix build errors when a header is delete/moved due to a missing flag
in the Makefile.
- Detect if KVM bugged/killed a selftest's VM and print out a helpful
message instead of complaining that a random ioctl() failed.
- Annotate the guest printf/assert helpers with __printf(), and fix the
various bugs that were lurking due to lack of said annotation.
-----BEGIN PGP SIGNATURE-----
iQJGBAABCgAwFiEEMHr+pfEFOIzK+KY1YJEiAU0MEvkFAmVye5USHHNlYW5qY0Bn
b29nbGUuY29tAAoJEGCRIgFNDBL5VSsQAJ0gkFUtNBMrjMgZviaHUF5dkhx2GuJz
/stRoVYaQgWUPENlc9ki8q9t+KpkQRl5KlDZ9RBea06C11wpwMK2/h6J9GJKsUvp
DahxSttaSwrDpgIOlpfWqeQBtAPaj8yGcG3OApqcYDvze84oahNgCjlKnuZUnqpM
itaReAyd5vnB4LHY06EGyiTqmzkVweWO9B1sszazKZnXf3nMyy5sDLQcbVzuL8X3
ZZLmtxkFHHuXXk6xj9V9QW3HumEpHssJsF8dAX6V7mLpj+2DFWDk1U6MOe1gwi9F
XmBrSblrD2TZjAs77YbxF13gmYIKb7U1x/VXWlivkzRhtA8n7Gvq9sWUH5KJAPR5
OkM/pY2GT2jbvc+FFfxfXA6eRswQrll9Me6BrlOtFnHgXd2/DavZoZPbUx6/bTx/
Go9HJPExT/Ei7mztt2kqwC2uT8cn0VYM7kt7XKvEoDyncNO8SkhVVnZKHxcFZdlF
52fa9qS5kBuQ1V5PlvGfuBmStm/GNV9Ui/KSGExAARQsHQlmMQ3ZHLDGPeGdbwlZ
nAUbhZAljKO2dQbcX1q2tHTToC6QDehgOgLTnQtAlpSyBEK43jXAWywQilegJG+5
vYK9vEJor2n52/aHh84uQiAP7cuvUNZyBqWDIqH22CAn8LrBKs6nzppv0wFta6aA
WIGGFFDKC8Yw
=Ur6p
-----END PGP SIGNATURE-----
Merge tag 'kvm-x86-selftests-6.7-rcN' of https://github.com/kvm-x86/linux into HEAD
KVM selftests fixes for 6.8 merge window:
- Fix an annoying goof where the NX hugepage test prints out garbage
instead of the magic token needed to run the text.
- Fix build errors when a header is delete/moved due to a missing flag
in the Makefile.
- Detect if KVM bugged/killed a selftest's VM and print out a helpful
message instead of complaining that a random ioctl() failed.
- Annotate the guest printf/assert helpers with __printf(), and fix the
various bugs that were lurking due to lack of said annotation.
A small subset of these was included in 6.7-rc as well.
KVM/Arm supports readonly memslots; fix the calculation of
supported_flags in set_memory_region_test.c, otherwise the
test fails.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Using -MD without -MP causes build failures when a header file is deleted
or moved. With -MP, the compiler will emit phony targets for the header
files it lists as dependencies, and the Makefiles won't refuse to attempt
to rebuild a C unit which no longer includes the deleted header.
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Link: https://lore.kernel.org/r/9fc8b5395321abbfcaf5d78477a9a7cd350b08e4.camel@infradead.org
Signed-off-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Pass MAGIC_TOKEN to __TEST_REQUIRE() when printing the help message about
needing to pass a magic value to manually run the NX hugepages test,
otherwise the help message will contain garbage.
In file included from x86_64/nx_huge_pages_test.c:15:
x86_64/nx_huge_pages_test.c: In function ‘main’:
include/test_util.h:40:32: error: format ‘%d’ expects a matching ‘int’ argument [-Werror=format=]
40 | ksft_exit_skip("- " fmt "\n", ##__VA_ARGS__); \
| ^~~~
x86_64/nx_huge_pages_test.c:259:9: note: in expansion of macro ‘__TEST_REQUIRE’
259 | __TEST_REQUIRE(token == MAGIC_TOKEN,
| ^~~~~~~~~~~~~~
Signed-off-by: angquan yu <angquan21@gmail.com>
Link: https://lore.kernel.org/r/20231128221105.63093-1-angquan21@gmail.com
[sean: rewrite shortlog+changelog]
Signed-off-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The "vmxon_pa == vmcs12_pa == -1ull" test happens to work by accident: as
Enlightened VMCS is always supported, set_default_vmx_state() adds
'KVM_STATE_NESTED_EVMCS' to 'flags' and the following branch of
vmx_set_nested_state() is executed:
if ((kvm_state->flags & KVM_STATE_NESTED_EVMCS) &&
(!guest_can_use(vcpu, X86_FEATURE_VMX) ||
!vmx->nested.enlightened_vmcs_enabled))
return -EINVAL;
as 'enlightened_vmcs_enabled' is false. In fact, "vmxon_pa == vmcs12_pa ==
-1ull" is a valid state when not tainted by wrong flags so the test should
aim for this branch:
if (kvm_state->hdr.vmx.vmxon_pa == INVALID_GPA)
return 0;
Test all this properly:
- Without KVM_STATE_NESTED_EVMCS in the flags, the expected return value is
'0'.
- With KVM_STATE_NESTED_EVMCS flag (when supported) set, the expected
return value is '-EINVAL' prior to enabling eVMCS and '0' after.
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
Tested-by: Jeremi Piotrowski <jpiotrowski@linux.microsoft.com>
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Link: https://lore.kernel.org/r/20231205103630.1391318-11-vkuznets@redhat.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
In preparation for conditional Hyper-V emulation enablement in KVM, make
Hyper-V specific tests skip gracefully instead of failing when KVM support
for emulating Hyper-V is not there.
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
Tested-by: Jeremi Piotrowski <jpiotrowski@linux.microsoft.com>
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Link: https://lore.kernel.org/r/20231205103630.1391318-10-vkuznets@redhat.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Annotate guest printf helpers with __printf() so that the compiler will
warn about incorrect formatting at compile time (see git log for how easy
it is to screw up with the formatting).
Suggested-by: Maxim Levitsky <mlevitsk@redhat.com>
Link: https://lore.kernel.org/r/20231129224916.532431-5-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Swap the ordering of parameters to guest asserts related to {RD,WR}MSR
success/failure in the Hyper-V features test. As is, the output will
be mangled and broken due to passing an integer as a string and vice
versa.
Opportunistically fix a benign %u vs. %lu issue as well.
Link: https://lore.kernel.org/r/20231129224916.532431-4-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Convert %llx to %lx as appropriate in guest asserts. The guest printf
implementation treats them the same as KVM selftests are 64-bit only, but
strictly adhering to the correct format will allow annotating the
underlying helpers with __printf() without introducing new warnings in the
build.
Link: https://lore.kernel.org/r/20231129224916.532431-3-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Print out the test and vector as intended when a guest assert fails an
assertion regarding MONITOR/MWAIT faulting. Unfortunately, the guest
printf support doesn't detect such issues at compile-time, so the bug
manifests as a confusing error message, e.g. in the most confusing case,
the test complains that it got vector "0" instead of expected vector "0".
Fixes: 0f52e4aaa6 ("KVM: selftests: Convert the MONITOR/MWAIT test to use printf guest asserts")
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
Link: https://lore.kernel.org/r/20231107182159.404770-1-seanjc@google.com
Link: https://lore.kernel.org/r/20231129224916.532431-2-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Remove x86's mmio_warning_test, as it is unnecessarily complex (there's no
reason to fork, spawn threads, initialize srand(), etc..), unnecessarily
restrictive (triggering triple fault is not unique to Intel CPUs without
unrestricted guest), and provides no meaningful coverage beyond what
basic fuzzing can achieve (running a vCPU with garbage is fuzzing's bread
and butter).
That the test has *all* of the above flaws is not coincidental, as the
code was copy+pasted almost verbatim from the syzkaller reproducer that
originally found the KVM bug (which has long since been fixed).
Cc: Michal Luczaj <mhal@rbox.co>
Link: https://groups.google.com/g/syzkaller/c/lHfau8E3SOE
Link: https://lore.kernel.org/r/20230815220030.560372-1-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Add yet another macro to the VM/vCPU ioctl() framework to detect when an
ioctl() failed because KVM killed/bugged the VM, i.e. when there was
nothing wrong with the ioctl() itself. If KVM kills a VM, e.g. by way of
a failed KVM_BUG_ON(), all subsequent VM and vCPU ioctl()s will fail with
-EIO, which can be quite misleading and ultimately waste user/developer
time.
Use KVM_CHECK_EXTENSION on KVM_CAP_USER_MEMORY to detect if the VM is
dead and/or bug, as KVM doesn't provide a dedicated ioctl(). Using a
heuristic is obviously less than ideal, but practically speaking the logic
is bulletproof barring a KVM change, and any such change would arguably
break userspace, e.g. if KVM returns something other than -EIO.
Without the detection, tearing down a bugged VM yields a cryptic failure
when deleting memslots:
==== Test Assertion Failure ====
lib/kvm_util.c:689: !ret
pid=45131 tid=45131 errno=5 - Input/output error
1 0x00000000004036c3: __vm_mem_region_delete at kvm_util.c:689
2 0x00000000004042f0: kvm_vm_free at kvm_util.c:724 (discriminator 12)
3 0x0000000000402929: race_sync_regs at sync_regs_test.c:193
4 0x0000000000401cab: main at sync_regs_test.c:334 (discriminator 6)
5 0x0000000000416f13: __libc_start_call_main at libc-start.o:?
6 0x000000000041855f: __libc_start_main_impl at ??:?
7 0x0000000000401d40: _start at ??:?
KVM_SET_USER_MEMORY_REGION failed, rc: -1 errno: 5 (Input/output error)
Which morphs into a more pointed error message with the detection:
==== Test Assertion Failure ====
lib/kvm_util.c:689: false
pid=80347 tid=80347 errno=5 - Input/output error
1 0x00000000004039ab: __vm_mem_region_delete at kvm_util.c:689 (discriminator 5)
2 0x0000000000404660: kvm_vm_free at kvm_util.c:724 (discriminator 12)
3 0x0000000000402ac9: race_sync_regs at sync_regs_test.c:193
4 0x0000000000401cb7: main at sync_regs_test.c:334 (discriminator 6)
5 0x0000000000418263: __libc_start_call_main at libc-start.o:?
6 0x00000000004198af: __libc_start_main_impl at ??:?
7 0x0000000000401d90: _start at ??:?
KVM killed/bugged the VM, check the kernel log for clues
Suggested-by: Michal Luczaj <mhal@rbox.co>
Cc: Oliver Upton <oliver.upton@linux.dev>
Cc: Colton Lewis <coltonlewis@google.com>
Link: https://lore.kernel.org/r/20231108010953.560824-3-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Drop _kvm_ioctl(), _vm_ioctl(), and _vcpu_ioctl(), as they are no longer
used by anything other than the no-underscores variants (and may have
never been used directly). The single-underscore variants were never
intended to be a "feature", they were a stopgap of sorts to ease the
conversion to pretty printing ioctl() names when reporting errors.
Opportunistically add a comment explaining when to use __KVM_IOCTL_ERROR()
versus KVM_IOCTL_ERROR(). The single-underscore macros were subtly
ensuring that the name of the ioctl() was printed on error, i.e. it's all
too easy to overlook the fact that using __KVM_IOCTL_ERROR() is
intentional.
Link: https://lore.kernel.org/r/20231108010953.560824-2-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Using -MD without -MP causes build failures when a header file is deleted
or moved. With -MP, the compiler will emit phony targets for the header
files it lists as dependencies, and the Makefiles won't refuse to attempt
to rebuild a C unit which no longer includes the deleted header.
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Link: https://lore.kernel.org/r/9fc8b5395321abbfcaf5d78477a9a7cd350b08e4.camel@infradead.org
Signed-off-by: Sean Christopherson <seanjc@google.com>
Pass MAGIC_TOKEN to __TEST_REQUIRE() when printing the help message about
needing to pass a magic value to manually run the NX hugepages test,
otherwise the help message will contain garbage.
In file included from x86_64/nx_huge_pages_test.c:15:
x86_64/nx_huge_pages_test.c: In function ‘main’:
include/test_util.h:40:32: error: format ‘%d’ expects a matching ‘int’ argument [-Werror=format=]
40 | ksft_exit_skip("- " fmt "\n", ##__VA_ARGS__); \
| ^~~~
x86_64/nx_huge_pages_test.c:259:9: note: in expansion of macro ‘__TEST_REQUIRE’
259 | __TEST_REQUIRE(token == MAGIC_TOKEN,
| ^~~~~~~~~~~~~~
Signed-off-by: angquan yu <angquan21@gmail.com>
Link: https://lore.kernel.org/r/20231128221105.63093-1-angquan21@gmail.com
[sean: rewrite shortlog+changelog]
Signed-off-by: Sean Christopherson <seanjc@google.com>
Add support for VM_MODE_P52V48_4K and VM_MODE_P52V48_16K guest modes by
using the FEAT_LPA2 pte format for stage1, when FEAT_LPA2 is available.
Reviewed-by: Oliver Upton <oliver.upton@linux.dev>
Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20231127111737.1897081-13-ryan.roberts@arm.com
We are about to add 52 bit PA guest modes for 4K and 16K pages when the
system supports LPA2. In preparation beef up the logic that parses mmfr0
to also tell us what the maximum supported PA size is for each page
size. Max PA size = 0 implies the page size is not supported at all.
Reviewed-by: Oliver Upton <oliver.upton@linux.dev>
Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20231127111737.1897081-12-ryan.roberts@arm.com
Currently the sysreg-defs are written out to the source tree
unconditionally, ignoring the specified output directory. Correct the
build rule to emit the header to the output directory. Opportunistically
reorganize the rules to avoid interleaving with the set of beauty make
rules.
Reported-by: Ian Rogers <irogers@google.com>
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
Link: https://lore.kernel.org/r/20231121192956.919380-3-oliver.upton@linux.dev
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
MEM_REGION_SLOT and MEM_REGION_GPA are not really needed in
test_invalid_memory_region_flags; the VM never runs and there are no
other slots, so it is okay to use slot 0 and place it at address
zero. This fixes compilation on architectures that do not
define them.
Fixes: 5d74316466 ("KVM: selftests: Add a memory region subtest to validate invalid flags")
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Introduce several new KVM uAPIs to ultimately create a guest-first memory
subsystem within KVM, a.k.a. guest_memfd. Guest-first memory allows KVM
to provide features, enhancements, and optimizations that are kludgly
or outright impossible to implement in a generic memory subsystem.
The core KVM ioctl() for guest_memfd is KVM_CREATE_GUEST_MEMFD, which
similar to the generic memfd_create(), creates an anonymous file and
returns a file descriptor that refers to it. Again like "regular"
memfd files, guest_memfd files live in RAM, have volatile storage,
and are automatically released when the last reference is dropped.
The key differences between memfd files (and every other memory subystem)
is that guest_memfd files are bound to their owning virtual machine,
cannot be mapped, read, or written by userspace, and cannot be resized.
guest_memfd files do however support PUNCH_HOLE, which can be used to
convert a guest memory area between the shared and guest-private states.
A second KVM ioctl(), KVM_SET_MEMORY_ATTRIBUTES, allows userspace to
specify attributes for a given page of guest memory. In the long term,
it will likely be extended to allow userspace to specify per-gfn RWX
protections, including allowing memory to be writable in the guest
without it also being writable in host userspace.
The immediate and driving use case for guest_memfd are Confidential
(CoCo) VMs, specifically AMD's SEV-SNP, Intel's TDX, and KVM's own pKVM.
For such use cases, being able to map memory into KVM guests without
requiring said memory to be mapped into the host is a hard requirement.
While SEV+ and TDX prevent untrusted software from reading guest private
data by encrypting guest memory, pKVM provides confidentiality and
integrity *without* relying on memory encryption. In addition, with
SEV-SNP and especially TDX, accessing guest private memory can be fatal
to the host, i.e. KVM must be prevent host userspace from accessing
guest memory irrespective of hardware behavior.
Long term, guest_memfd may be useful for use cases beyond CoCo VMs,
for example hardening userspace against unintentional accesses to guest
memory. As mentioned earlier, KVM's ABI uses userspace VMA protections to
define the allow guest protection (with an exception granted to mapping
guest memory executable), and similarly KVM currently requires the guest
mapping size to be a strict subset of the host userspace mapping size.
Decoupling the mappings sizes would allow userspace to precisely map
only what is needed and with the required permissions, without impacting
guest performance.
A guest-first memory subsystem also provides clearer line of sight to
things like a dedicated memory pool (for slice-of-hardware VMs) and
elimination of "struct page" (for offload setups where userspace _never_
needs to DMA from or into guest memory).
guest_memfd is the result of 3+ years of development and exploration;
taking on memory management responsibilities in KVM was not the first,
second, or even third choice for supporting CoCo VMs. But after many
failed attempts to avoid KVM-specific backing memory, and looking at
where things ended up, it is quite clear that of all approaches tried,
guest_memfd is the simplest, most robust, and most extensible, and the
right thing to do for KVM and the kernel at-large.
The "development cycle" for this version is going to be very short;
ideally, next week I will merge it as is in kvm/next, taking this through
the KVM tree for 6.8 immediately after the end of the merge window.
The series is still based on 6.6 (plus KVM changes for 6.7) so it
will require a small fixup for changes to get_file_rcu() introduced in
6.7 by commit 0ede61d858 ("file: convert to SLAB_TYPESAFE_BY_RCU").
The fixup will be done as part of the merge commit, and most of the text
above will become the commit message for the merge.
Pending post-merge work includes:
- hugepage support
- looking into using the restrictedmem framework for guest memory
- introducing a testing mechanism to poison memory, possibly using
the same memory attributes introduced here
- SNP and TDX support
There are two non-KVM patches buried in the middle of this series:
fs: Rename anon_inode_getfile_secure() and anon_inode_getfd_secure()
mm: Add AS_UNMOVABLE to mark mapping as completely unmovable
The first is small and mostly suggested-by Christian Brauner; the second
a bit less so but it was written by an mm person (Vlastimil Babka).
Add a subtest to set_memory_region_test to verify that KVM rejects invalid
flags and combinations with -EINVAL. KVM might or might not fail with
EINVAL anyways, but we can at least try.
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20231031002049.3915752-1-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
"Testing private access when memslot gets deleted" tests the behavior
of KVM when a private memslot gets deleted while the VM is using the
private memslot. When KVM looks up the deleted (slot = NULL) memslot,
KVM should exit to userspace with KVM_EXIT_MEMORY_FAULT.
In the second test, upon a private access to non-private memslot, KVM
should also exit to userspace with KVM_EXIT_MEMORY_FAULT.
Intentionally don't take a requirement on KVM_CAP_GUEST_MEMFD,
KVM_CAP_MEMORY_FAULT_INFO, KVM_MEMORY_ATTRIBUTE_PRIVATE, etc., as it's a
KVM bug to advertise KVM_X86_SW_PROTECTED_VM without its prerequisites.
Signed-off-by: Ackerley Tng <ackerleytng@google.com>
[sean: call out the similarities with set_memory_region_test]
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20231027182217.3615211-36-seanjc@google.com>
Reviewed-by: Fuad Tabba <tabba@google.com>
Tested-by: Fuad Tabba <tabba@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Add a selftest to verify the basic functionality of guest_memfd():
+ file descriptor created with the guest_memfd() ioctl does not allow
read/write/mmap operations
+ file size and block size as returned from fstat are as expected
+ fallocate on the fd checks that offset/length on
fallocate(FALLOC_FL_PUNCH_HOLE) should be page aligned
+ invalid inputs (misaligned size, invalid flags) are rejected
+ file size and inode are unique (the innocuous-sounding
anon_inode_getfile() backs all files with a single inode...)
Signed-off-by: Chao Peng <chao.p.peng@linux.intel.com>
Co-developed-by: Ackerley Tng <ackerleytng@google.com>
Signed-off-by: Ackerley Tng <ackerleytng@google.com>
Co-developed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Co-developed-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20231027182217.3615211-35-seanjc@google.com>
Reviewed-by: Fuad Tabba <tabba@google.com>
Tested-by: Fuad Tabba <tabba@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Expand set_memory_region_test to exercise various positive and negative
testcases for private memory.
- Non-guest_memfd() file descriptor for private memory
- guest_memfd() from different VM
- Overlapping bindings
- Unaligned bindings
Signed-off-by: Chao Peng <chao.p.peng@linux.intel.com>
Co-developed-by: Ackerley Tng <ackerleytng@google.com>
Signed-off-by: Ackerley Tng <ackerleytng@google.com>
[sean: trim the testcases to remove duplicate coverage]
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20231027182217.3615211-34-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Add helpers to invoke KVM_SET_USER_MEMORY_REGION2 directly so that tests
can validate of features that are unique to "version 2" of "set user
memory region", e.g. do negative testing on gmem_fd and gmem_offset.
Provide a raw version as well as an assert-success version to reduce
the amount of boilerplate code need for basic usage.
Signed-off-by: Chao Peng <chao.p.peng@linux.intel.com>
Signed-off-by: Ackerley Tng <ackerleytng@google.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20231027182217.3615211-33-seanjc@google.com>
Reviewed-by: Fuad Tabba <tabba@google.com>
Tested-by: Fuad Tabba <tabba@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Add a selftest to exercise implicit/explicit conversion functionality
within KVM and verify:
- Shared memory is visible to host userspace
- Private memory is not visible to host userspace
- Host userspace and guest can communicate over shared memory
- Data in shared backing is preserved across conversions (test's
host userspace doesn't free the data)
- Private memory is bound to the lifetime of the VM
Ideally, KVM's selftests infrastructure would be reworked to allow backing
a single region of guest memory with multiple memslots for _all_ backing
types and shapes, i.e. ideally the code for using a single backing fd
across multiple memslots would work for "regular" memory as well. But
sadly, support for KVM_CREATE_GUEST_MEMFD has languished for far too long,
and overhauling selftests' memslots infrastructure would likely open a can
of worms, i.e. delay things even further.
In addition to the more obvious tests, verify that PUNCH_HOLE actually
frees memory. Directly verifying that KVM frees memory is impractical, if
it's even possible, so instead indirectly verify memory is freed by
asserting that the guest reads zeroes after a PUNCH_HOLE. E.g. if KVM
zaps SPTEs but doesn't actually punch a hole in the inode, the subsequent
read will still see the previous value. And obviously punching a hole
shouldn't cause explosions.
Let the user specify the number of memslots in the private mem conversion
test, i.e. don't require the number of memslots to be '1' or "nr_vcpus".
Creating more memslots than vCPUs is particularly interesting, e.g. it can
result in a single KVM_SET_MEMORY_ATTRIBUTES spanning multiple memslots.
To keep the math reasonable, align each vCPU's chunk to at least 2MiB (the
size is 2MiB+4KiB), and require the total size to be cleanly divisible by
the number of memslots. The goal is to be able to validate that KVM plays
nice with multiple memslots, being able to create a truly arbitrary number
of memslots doesn't add meaningful value, i.e. isn't worth the cost.
Intentionally don't take a requirement on KVM_CAP_GUEST_MEMFD,
KVM_CAP_MEMORY_FAULT_INFO, KVM_MEMORY_ATTRIBUTE_PRIVATE, etc., as it's a
KVM bug to advertise KVM_X86_SW_PROTECTED_VM without its prerequisites.
Signed-off-by: Vishal Annapurve <vannapurve@google.com>
Co-developed-by: Ackerley Tng <ackerleytng@google.com>
Signed-off-by: Ackerley Tng <ackerleytng@google.com>
Co-developed-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20231027182217.3615211-32-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Add GUEST_SYNC[1-6]() so that tests can pass the maximum amount of
information supported via ucall(), without needing to resort to shared
memory.
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20231027182217.3615211-31-seanjc@google.com>
Reviewed-by: Fuad Tabba <tabba@google.com>
Tested-by: Fuad Tabba <tabba@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Add a "vm_shape" structure to encapsulate the selftests-defined "mode",
along with the KVM-defined "type" for use when creating a new VM. "mode"
tracks physical and virtual address properties, as well as the preferred
backing memory type, while "type" corresponds to the VM type.
Taking the VM type will allow adding tests for KVM_CREATE_GUEST_MEMFD
without needing an entirely separate set of helpers. At this time,
guest_memfd is effectively usable only by confidential VM types in the
form of guest private memory, and it's expected that x86 will double down
and require unique VM types for TDX and SNP guests.
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20231027182217.3615211-30-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Add helpers for x86 guests to invoke the KVM_HC_MAP_GPA_RANGE hypercall,
which KVM will forward to userspace and thus can be used by tests to
coordinate private<=>shared conversions between host userspace code and
guest code.
Signed-off-by: Vishal Annapurve <vannapurve@google.com>
[sean: drop shared/private helpers (let tests specify flags)]
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20231027182217.3615211-29-seanjc@google.com>
Reviewed-by: Fuad Tabba <tabba@google.com>
Tested-by: Fuad Tabba <tabba@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Add helpers to convert memory between private and shared via KVM's
memory attributes, as well as helpers to free/allocate guest_memfd memory
via fallocate(). Userspace, i.e. tests, is NOT required to do fallocate()
when converting memory, as the attributes are the single source of truth.
Provide allocate() helpers so that tests can mimic a userspace that frees
private memory on conversion, e.g. to prioritize memory usage over
performance.
Signed-off-by: Vishal Annapurve <vannapurve@google.com>
Co-developed-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20231027182217.3615211-28-seanjc@google.com>
Reviewed-by: Fuad Tabba <tabba@google.com>
Tested-by: Fuad Tabba <tabba@google.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Add support for creating "private" memslots via KVM_CREATE_GUEST_MEMFD and
KVM_SET_USER_MEMORY_REGION2. Make vm_userspace_mem_region_add() a wrapper
to its effective replacement, vm_mem_add(), so that private memslots are
fully opt-in, i.e. don't require update all tests that add memory regions.
Pivot on the KVM_MEM_PRIVATE flag instead of the validity of the "gmem"
file descriptor so that simple tests can let vm_mem_add() do the heavy
lifting of creating the guest memfd, but also allow the caller to pass in
an explicit fd+offset so that fancier tests can do things like back
multiple memslots with a single file. If the caller passes in a fd, dup()
the fd so that (a) __vm_mem_region_delete() can close the fd associated
with the memory region without needing yet another flag, and (b) so that
the caller can safely close its copy of the fd without having to first
destroy memslots.
Co-developed-by: Ackerley Tng <ackerleytng@google.com>
Signed-off-by: Ackerley Tng <ackerleytng@google.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20231027182217.3615211-27-seanjc@google.com>
Reviewed-by: Fuad Tabba <tabba@google.com>
Tested-by: Fuad Tabba <tabba@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Use KVM_SET_USER_MEMORY_REGION2 throughout KVM's selftests library so that
support for guest private memory can be added without needing an entirely
separate set of helpers.
Note, this obviously makes selftests backwards-incompatible with older KVM
versions from this point forward.
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20231027182217.3615211-26-seanjc@google.com>
Reviewed-by: Fuad Tabba <tabba@google.com>
Tested-by: Fuad Tabba <tabba@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Drop kvm_userspace_memory_region_find(), it's unused and a terrible API
(probably why it's unused). If anything outside of kvm_util.c needs to
get at the memslot, userspace_mem_region_find() can be exposed to give
others full access to all memory region/slot information.
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20231027182217.3615211-25-seanjc@google.com>
Reviewed-by: Fuad Tabba <tabba@google.com>
Tested-by: Fuad Tabba <tabba@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This function does the same but makes it clearer why one would use
the "____"-prefixed version of vm_create().
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* Generalized infrastructure for 'writable' ID registers, effectively
allowing userspace to opt-out of certain vCPU features for its guest
* Optimization for vSGI injection, opportunistically compressing MPIDR
to vCPU mapping into a table
* Improvements to KVM's PMU emulation, allowing userspace to select
the number of PMCs available to a VM
* Guest support for memory operation instructions (FEAT_MOPS)
* Cleanups to handling feature flags in KVM_ARM_VCPU_INIT, squashing
bugs and getting rid of useless code
* Changes to the way the SMCCC filter is constructed, avoiding wasted
memory allocations when not in use
* Load the stage-2 MMU context at vcpu_load() for VHE systems, reducing
the overhead of errata mitigations
* Miscellaneous kernel and selftest fixes
LoongArch:
* New architecture. The hardware uses the same model as x86, s390
and RISC-V, where guest/host mode is orthogonal to supervisor/user
mode. The virtualization extensions are very similar to MIPS,
therefore the code also has some similarities but it's been cleaned
up to avoid some of the historical bogosities that are found in
arch/mips. The kernel emulates MMU, timer and CSR accesses, while
interrupt controllers are only emulated in userspace, at least for
now.
RISC-V:
* Support for the Smstateen and Zicond extensions
* Support for virtualizing senvcfg
* Support for virtualized SBI debug console (DBCN)
S390:
* Nested page table management can be monitored through tracepoints
and statistics
x86:
* Fix incorrect handling of VMX posted interrupt descriptor in KVM_SET_LAPIC,
which could result in a dropped timer IRQ
* Avoid WARN on systems with Intel IPI virtualization
* Add CONFIG_KVM_MAX_NR_VCPUS, to allow supporting up to 4096 vCPUs without
forcing more common use cases to eat the extra memory overhead.
* Add virtualization support for AMD SRSO mitigation (IBPB_BRTYPE and
SBPB, aka Selective Branch Predictor Barrier).
* Fix a bug where restoring a vCPU snapshot that was taken within 1 second of
creating the original vCPU would cause KVM to try to synchronize the vCPU's
TSC and thus clobber the correct TSC being set by userspace.
* Compute guest wall clock using a single TSC read to avoid generating an
inaccurate time, e.g. if the vCPU is preempted between multiple TSC reads.
* "Virtualize" HWCR.TscFreqSel to make Linux guests happy, which complain
about a "Firmware Bug" if the bit isn't set for select F/M/S combos.
Likewise "virtualize" (ignore) MSR_AMD64_TW_CFG to appease Windows Server
2022.
* Don't apply side effects to Hyper-V's synthetic timer on writes from
userspace to fix an issue where the auto-enable behavior can trigger
spurious interrupts, i.e. do auto-enabling only for guest writes.
* Remove an unnecessary kick of all vCPUs when synchronizing the dirty log
without PML enabled.
* Advertise "support" for non-serializing FS/GS base MSR writes as appropriate.
* Harden the fast page fault path to guard against encountering an invalid
root when walking SPTEs.
* Omit "struct kvm_vcpu_xen" entirely when CONFIG_KVM_XEN=n.
* Use the fast path directly from the timer callback when delivering Xen
timer events, instead of waiting for the next iteration of the run loop.
This was not done so far because previously proposed code had races,
but now care is taken to stop the hrtimer at critical points such as
restarting the timer or saving the timer information for userspace.
* Follow the lead of upstream Xen and ignore the VCPU_SSHOTTMR_future flag.
* Optimize injection of PMU interrupts that are simultaneous with NMIs.
* Usual handful of fixes for typos and other warts.
x86 - MTRR/PAT fixes and optimizations:
* Clean up code that deals with honoring guest MTRRs when the VM has
non-coherent DMA and host MTRRs are ignored, i.e. EPT is enabled.
* Zap EPT entries when non-coherent DMA assignment stops/start to prevent
using stale entries with the wrong memtype.
* Don't ignore guest PAT for CR0.CD=1 && KVM_X86_QUIRK_CD_NW_CLEARED=y.
This was done as a workaround for virtual machine BIOSes that did not
bother to clear CR0.CD (because ancient KVM/QEMU did not bother to
set it, in turn), and there's zero reason to extend the quirk to
also ignore guest PAT.
x86 - SEV fixes:
* Report KVM_EXIT_SHUTDOWN instead of EINVAL if KVM intercepts SHUTDOWN while
running an SEV-ES guest.
* Clean up the recognition of emulation failures on SEV guests, when KVM would
like to "skip" the instruction but it had already been partially emulated.
This makes it possible to drop a hack that second guessed the (insufficient)
information provided by the emulator, and just do the right thing.
Documentation:
* Various updates and fixes, mostly for x86
* MTRR and PAT fixes and optimizations:
-----BEGIN PGP SIGNATURE-----
iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmVBZc0UHHBib256aW5p
QHJlZGhhdC5jb20ACgkQv/vSX3jHroP1LQf+NgsmZ1lkGQlKdSdijoQ856w+k0or
l2SV1wUwiEdFPSGK+RTUlHV5Y1ni1dn/CqCVIJZKEI3ZtZ1m9/4HKIRXvbMwFHIH
hx+E4Lnf8YUjsGjKTLd531UKcpphztZavQ6pXLEwazkSkDEra+JIKtooI8uU+9/p
bd/eF1V+13a8CHQf1iNztFJVxqBJbVlnPx4cZDRQQvewskIDGnVDtwbrwCUKGtzD
eNSzhY7si6O2kdQNkuA8xPhg29dYX9XLaCK2K1l8xOUm8WipLdtF86GAKJ5BVuOL
6ek/2QCYjZ7a+coAZNfgSEUi8JmFHEqCo7cnKmWzPJp+2zyXsdudqAhT1g==
=UIxm
-----END PGP SIGNATURE-----
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull kvm updates from Paolo Bonzini:
"ARM:
- Generalized infrastructure for 'writable' ID registers, effectively
allowing userspace to opt-out of certain vCPU features for its
guest
- Optimization for vSGI injection, opportunistically compressing
MPIDR to vCPU mapping into a table
- Improvements to KVM's PMU emulation, allowing userspace to select
the number of PMCs available to a VM
- Guest support for memory operation instructions (FEAT_MOPS)
- Cleanups to handling feature flags in KVM_ARM_VCPU_INIT, squashing
bugs and getting rid of useless code
- Changes to the way the SMCCC filter is constructed, avoiding wasted
memory allocations when not in use
- Load the stage-2 MMU context at vcpu_load() for VHE systems,
reducing the overhead of errata mitigations
- Miscellaneous kernel and selftest fixes
LoongArch:
- New architecture for kvm.
The hardware uses the same model as x86, s390 and RISC-V, where
guest/host mode is orthogonal to supervisor/user mode. The
virtualization extensions are very similar to MIPS, therefore the
code also has some similarities but it's been cleaned up to avoid
some of the historical bogosities that are found in arch/mips. The
kernel emulates MMU, timer and CSR accesses, while interrupt
controllers are only emulated in userspace, at least for now.
RISC-V:
- Support for the Smstateen and Zicond extensions
- Support for virtualizing senvcfg
- Support for virtualized SBI debug console (DBCN)
S390:
- Nested page table management can be monitored through tracepoints
and statistics
x86:
- Fix incorrect handling of VMX posted interrupt descriptor in
KVM_SET_LAPIC, which could result in a dropped timer IRQ
- Avoid WARN on systems with Intel IPI virtualization
- Add CONFIG_KVM_MAX_NR_VCPUS, to allow supporting up to 4096 vCPUs
without forcing more common use cases to eat the extra memory
overhead.
- Add virtualization support for AMD SRSO mitigation (IBPB_BRTYPE and
SBPB, aka Selective Branch Predictor Barrier).
- Fix a bug where restoring a vCPU snapshot that was taken within 1
second of creating the original vCPU would cause KVM to try to
synchronize the vCPU's TSC and thus clobber the correct TSC being
set by userspace.
- Compute guest wall clock using a single TSC read to avoid
generating an inaccurate time, e.g. if the vCPU is preempted
between multiple TSC reads.
- "Virtualize" HWCR.TscFreqSel to make Linux guests happy, which
complain about a "Firmware Bug" if the bit isn't set for select
F/M/S combos. Likewise "virtualize" (ignore) MSR_AMD64_TW_CFG to
appease Windows Server 2022.
- Don't apply side effects to Hyper-V's synthetic timer on writes
from userspace to fix an issue where the auto-enable behavior can
trigger spurious interrupts, i.e. do auto-enabling only for guest
writes.
- Remove an unnecessary kick of all vCPUs when synchronizing the
dirty log without PML enabled.
- Advertise "support" for non-serializing FS/GS base MSR writes as
appropriate.
- Harden the fast page fault path to guard against encountering an
invalid root when walking SPTEs.
- Omit "struct kvm_vcpu_xen" entirely when CONFIG_KVM_XEN=n.
- Use the fast path directly from the timer callback when delivering
Xen timer events, instead of waiting for the next iteration of the
run loop. This was not done so far because previously proposed code
had races, but now care is taken to stop the hrtimer at critical
points such as restarting the timer or saving the timer information
for userspace.
- Follow the lead of upstream Xen and ignore the VCPU_SSHOTTMR_future
flag.
- Optimize injection of PMU interrupts that are simultaneous with
NMIs.
- Usual handful of fixes for typos and other warts.
x86 - MTRR/PAT fixes and optimizations:
- Clean up code that deals with honoring guest MTRRs when the VM has
non-coherent DMA and host MTRRs are ignored, i.e. EPT is enabled.
- Zap EPT entries when non-coherent DMA assignment stops/start to
prevent using stale entries with the wrong memtype.
- Don't ignore guest PAT for CR0.CD=1 && KVM_X86_QUIRK_CD_NW_CLEARED=y
This was done as a workaround for virtual machine BIOSes that did
not bother to clear CR0.CD (because ancient KVM/QEMU did not bother
to set it, in turn), and there's zero reason to extend the quirk to
also ignore guest PAT.
x86 - SEV fixes:
- Report KVM_EXIT_SHUTDOWN instead of EINVAL if KVM intercepts
SHUTDOWN while running an SEV-ES guest.
- Clean up the recognition of emulation failures on SEV guests, when
KVM would like to "skip" the instruction but it had already been
partially emulated. This makes it possible to drop a hack that
second guessed the (insufficient) information provided by the
emulator, and just do the right thing.
Documentation:
- Various updates and fixes, mostly for x86
- MTRR and PAT fixes and optimizations"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (164 commits)
KVM: selftests: Avoid using forced target for generating arm64 headers
tools headers arm64: Fix references to top srcdir in Makefile
KVM: arm64: Add tracepoint for MMIO accesses where ISV==0
KVM: arm64: selftest: Perform ISB before reading PAR_EL1
KVM: arm64: selftest: Add the missing .guest_prepare()
KVM: arm64: Always invalidate TLB for stage-2 permission faults
KVM: x86: Service NMI requests after PMI requests in VM-Enter path
KVM: arm64: Handle AArch32 SPSR_{irq,abt,und,fiq} as RAZ/WI
KVM: arm64: Do not let a L1 hypervisor access the *32_EL2 sysregs
KVM: arm64: Refine _EL2 system register list that require trap reinjection
arm64: Add missing _EL2 encodings
arm64: Add missing _EL12 encodings
KVM: selftests: aarch64: vPMU test for validating user accesses
KVM: selftests: aarch64: vPMU register test for unimplemented counters
KVM: selftests: aarch64: vPMU register test for implemented counters
KVM: selftests: aarch64: Introduce vpmu_counter_access test
tools: Import arm_pmuv3.h
KVM: arm64: PMU: Allow userspace to limit PMCR_EL0.N for the guest
KVM: arm64: Sanitize PM{C,I}NTEN{SET,CLR}, PMOVS{SET,CLR} before first run
KVM: arm64: Add {get,set}_user for PM{C,I}NTEN{SET,CLR}, PMOVS{SET,CLR}
...
This kselftest update for Linux 6.7-rc1 consists of:
-- kbuild kselftest-merge target fixes
-- fixes to several tests
-- resctrl test fixes and enhancements
-- ksft_perror() helper and reporting improvements
-- printf attribute to kselftest prints to improve reporting
-- documentation and clang build warning fixes
Bulk of the patches are for resctrl fixes and enhancements.
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEPZKym/RZuOCGeA/kCwJExA0NQxwFAmVCoHMACgkQCwJExA0N
QxwzrA//ehiiLdV2lyghzPpDTVY8jKlB1xIpg3s0r0M3m/j6nAdnOgOe2gkapT7T
gFGL0r7xL9crqFdymwDANLSvNWOeghqB1oIok9Ruw5Rl3FcLnkh920bE6tPsddJg
9+/KqtZvL0Sr43l9OSgX2Uzqyw60wRQwpO0431hmgnKjblk8Rh4GZ7fUCLLNf4Ia
yOq1s2/cdmEwRc96lDaBWZaOTusejwh/xy8tgAjozHipLsmsexbyyHVWJWkVhMOD
ZklCtrq4lckRz+Vky6akvjoL6Mjl//7pg323e2fUcDCQxQvqwnCo2VqqyOVBnN2A
6XHQ6yXwh0xzCKRFgAiFhWlsKOz3wEIDrdp4dmhDkg4lw4gGJcwNke1UyX5zXYKM
1a6R1vbQS9qQOsWf34AYKZBHruFNtUt0FJYgI43SuH+fGc0D5cU91Rz+s9QIPCwj
8tcr5RWin8BOziDz05lxSKWRHD+3oc5qmsmGYBJhilwtvY2wNbRZNDZjiO28kiIy
3kUWXeCtHmZE1KHK1H5v6bMC8SqUU7ukvV5WebqGpxzJ2eFPbeXcek9/AWSWOFni
7thUg6MG3e4c/zRk8JYbmqXS/GeTkdmc3+VMXApLhTB8uSOWsnVMfJS9Zc2A1tGg
n6NRBJFQO8t9Wm1l9XvlnC9HA/8lO/3uih+SzKn/u8KvoN96HPM=
=JZb+
-----END PGP SIGNATURE-----
Merge tag 'linux_kselftest-next-6.7-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest
Pull kselftest updates from Shuah Khan:
- kbuild kselftest-merge target fixes
- fixes to several tests
- resctrl test fixes and enhancements
- ksft_perror() helper and reporting improvements
- printf attribute to kselftest prints to improve reporting
- documentation and clang build warning fixes
The bulk of the patches are for resctrl fixes and enhancements.
* tag 'linux_kselftest-next-6.7-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest: (51 commits)
selftests/resctrl: Fix MBM test failure when MBA unavailable
selftests/clone3: Report descriptive test names
selftests:modify the incorrect print format
selftests/efivarfs: create-read: fix a resource leak
selftests/ftrace: Add riscv support for kprobe arg tests
selftests/ftrace: add loongarch support for kprobe args char tests
selftests/amd-pstate: Added option to provide perf binary path
selftests/amd-pstate: Fix broken paths to run workloads in amd-pstate-ut
selftests/resctrl: Move run_benchmark() to a more fitting file
selftests/resctrl: Fix schemata write error check
selftests/resctrl: Reduce failures due to outliers in MBA/MBM tests
selftests/resctrl: Fix feature checks
selftests/resctrl: Refactor feature check to use resource and feature name
selftests/resctrl: Move _GNU_SOURCE define into Makefile
selftests/resctrl: Remove duplicate feature check from CMT test
selftests/resctrl: Extend signal handler coverage to unmount on receiving signal
selftests/resctrl: Fix uninitialized .sa_flags
selftests/resctrl: Cleanup benchmark argument parsing
selftests/resctrl: Remove ben_count variable
selftests/resctrl: Make benchmark command const and build it with pointers
...
- Generalized infrastructure for 'writable' ID registers, effectively
allowing userspace to opt-out of certain vCPU features for its guest
- Optimization for vSGI injection, opportunistically compressing MPIDR
to vCPU mapping into a table
- Improvements to KVM's PMU emulation, allowing userspace to select
the number of PMCs available to a VM
- Guest support for memory operation instructions (FEAT_MOPS)
- Cleanups to handling feature flags in KVM_ARM_VCPU_INIT, squashing
bugs and getting rid of useless code
- Changes to the way the SMCCC filter is constructed, avoiding wasted
memory allocations when not in use
- Load the stage-2 MMU context at vcpu_load() for VHE systems, reducing
the overhead of errata mitigations
- Miscellaneous kernel and selftest fixes
-----BEGIN PGP SIGNATURE-----
iHUEABYIAB0WIQSNXHjWXuzMZutrKNKivnWIJHzdFgUCZUFJRgAKCRCivnWIJHzd
FtgYAP9cMsc1Mhlw3jNQnTc6q0cbTulD/SoEDPUat1dXMqjs+gEAnskwQTrTX834
fgGQeCAyp7Gmar+KeP64H0xm8kPSpAw=
=R4M7
-----END PGP SIGNATURE-----
Merge tag 'kvmarm-6.7' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD
KVM/arm64 updates for 6.7
- Generalized infrastructure for 'writable' ID registers, effectively
allowing userspace to opt-out of certain vCPU features for its guest
- Optimization for vSGI injection, opportunistically compressing MPIDR
to vCPU mapping into a table
- Improvements to KVM's PMU emulation, allowing userspace to select
the number of PMCs available to a VM
- Guest support for memory operation instructions (FEAT_MOPS)
- Cleanups to handling feature flags in KVM_ARM_VCPU_INIT, squashing
bugs and getting rid of useless code
- Changes to the way the SMCCC filter is constructed, avoiding wasted
memory allocations when not in use
- Load the stage-2 MMU context at vcpu_load() for VHE systems, reducing
the overhead of errata mitigations
- Miscellaneous kernel and selftest fixes
- Add CONFIG_KVM_MAX_NR_VCPUS to allow supporting up to 4096 vCPUs without
forcing more common use cases to eat the extra memory overhead.
- Add IBPB and SBPB virtualization support.
- Fix a bug where restoring a vCPU snapshot that was taken within 1 second of
creating the original vCPU would cause KVM to try to synchronize the vCPU's
TSC and thus clobber the correct TSC being set by userspace.
- Compute guest wall clock using a single TSC read to avoid generating an
inaccurate time, e.g. if the vCPU is preempted between multiple TSC reads.
- "Virtualize" HWCR.TscFreqSel to make Linux guests happy, which complain
about a "Firmware Bug" if the bit isn't set for select F/M/S combos.
- Don't apply side effects to Hyper-V's synthetic timer on writes from
userspace to fix an issue where the auto-enable behavior can trigger
spurious interrupts, i.e. do auto-enabling only for guest writes.
- Remove an unnecessary kick of all vCPUs when synchronizing the dirty log
without PML enabled.
- Advertise "support" for non-serializing FS/GS base MSR writes as appropriate.
- Use octal notation for file permissions through KVM x86.
- Fix a handful of typo fixes and warts.
-----BEGIN PGP SIGNATURE-----
iQJGBAABCgAwFiEEMHr+pfEFOIzK+KY1YJEiAU0MEvkFAmU8EugSHHNlYW5qY0Bn
b29nbGUuY29tAAoJEGCRIgFNDBL5xS0P+gPTDO81CUZO70LrO2W4E7toRBf/F9x1
/v5D/76p9hG32Z6+BJs/xxDxJFagw75MtoR5oKivtXiip3TxbfOyDOlaQkIRo85E
/d95il/LRidL3Mv3TXRj1lykXnxSSz9tigAGEZti1Y9Fn9fXEIwurJH7dU5cBI1E
fin5bsDaTNRjG4jjTiEUbnKPRTlD/S7CQJn4CaYvZhMv/eJkYDLyBBVy4VLoLzvD
ctL6VJQLGPVxbxr9mEmulaqMrSuDIQQLkRVQJAViKyerBInTEc5d/GPCHuE8O3zi
0r/QSJbMS9titWLz07NhJ1UH4VJNyaEhRlyJPSFhBW4h6dzUb3EXdUe0Hwa+JH/S
H2cVqsANItTCIhvDtuEGIRDahu0eD+63h90InJ0gEVL1kSJS+UWZHB71PkUEQgAV
2OsuT1D26fuxrv+0b9ioBZURycqKw++zGsrwyVhe77eBgqBJ12tbL4TAD+QNjaQ5
HZTCe6YV83gZoOMeVkoTGSf96s9lGORgxsaAIXmFuLB9RVCVXhVh0ph2HZsnV8Hw
ZXEXpBEFo7GUhb0NIvsk2W73QL87A3fLv15yITWc8KuC7/dXP9z6KpSKjFySS69X
uWD1MVx6shhvbg97UzoJlXc3/z0aVzmdZJudE5d0gcFvAjIItqp6ICPOoKxfj8pT
tqRZu3kVHd61
=sfp8
-----END PGP SIGNATURE-----
Merge tag 'kvm-x86-misc-6.7' of https://github.com/kvm-x86/linux into HEAD
KVM x86 misc changes for 6.7:
- Add CONFIG_KVM_MAX_NR_VCPUS to allow supporting up to 4096 vCPUs without
forcing more common use cases to eat the extra memory overhead.
- Add IBPB and SBPB virtualization support.
- Fix a bug where restoring a vCPU snapshot that was taken within 1 second of
creating the original vCPU would cause KVM to try to synchronize the vCPU's
TSC and thus clobber the correct TSC being set by userspace.
- Compute guest wall clock using a single TSC read to avoid generating an
inaccurate time, e.g. if the vCPU is preempted between multiple TSC reads.
- "Virtualize" HWCR.TscFreqSel to make Linux guests happy, which complain
about a "Firmware Bug" if the bit isn't set for select F/M/S combos.
- Don't apply side effects to Hyper-V's synthetic timer on writes from
userspace to fix an issue where the auto-enable behavior can trigger
spurious interrupts, i.e. do auto-enabling only for guest writes.
- Remove an unnecessary kick of all vCPUs when synchronizing the dirty log
without PML enabled.
- Advertise "support" for non-serializing FS/GS base MSR writes as appropriate.
- Use octal notation for file permissions through KVM x86.
- Fix a handful of typo fixes and warts.
- Smstateen and Zicond support for Guest/VM
- Virtualized senvcfg CSR for Guest/VM
- Added Smstateen registers to the get-reg-list selftests
- Added Zicond to the get-reg-list selftests
- Virtualized SBI debug console (DBCN) for Guest/VM
- Added SBI debug console (DBCN) to the get-reg-list selftests
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEZdn75s5e6LHDQ+f/rUjsVaLHLAcFAmU3dKgACgkQrUjsVaLH
LAeh6g//RVlm3rLLwnWes3PwnBZDtkXYVVLy8oCEPjjjZGMJ+piwV8kcq4QpNear
e1nrOJTnndkjPG9TdJIxFW0qGnPLe4D9Ag6NoFpJXwxZRy/Glr8iWbt3pOcrv6vu
gk/9RP5gQpJpcP7D2scR+go6ryEzncCNY5GLtXoyFM9drQk+IROSrMWrWI5CXBJi
c0qwQWsSWb3KTBTZ7Gsm0Y7E9WIi5u6bBxcLJvblUKszS+PpXO9IA+8qOGZS4Cts
ocy5qqu+NNPECZjSa+kZti8ZSbdjNw9ORyB4OHlBUt55Uidp72pPFdTjGvSvs/jq
6wF70i1qjZOtjVUNdlkF93EN1xu4f3Hus/6/Q1pcyq+k9vo7eeJQQRWR4bJZ9SJR
ZkhYfOgqMUzMnrf2M20en9DygRDFstL2XW5mGDPmk9XLSNv1KwpKf96Xl2E20uPb
21EVQSiybmpl80WkWs0txwk62qQYzYUx+Io8XKXl8MrqNN2HCnzUUy/fjy/kd7iM
Oe1a8kNcV/eawpURBNVCcpld9qM1OPCmseqrr9tpvdR4SmAo1mAi/d7/I61umwpE
a36fqmwIZ0ppMs86BOcMRcTaee6EjPeldXnRCTjCLqo6YvsdHRrwpNBHRMbUOKpy
0688Xf+vjmu81IXhfaRTcs5dvKe1NI4puRtG5DUt7n1csSUyNWI=
=TFKQ
-----END PGP SIGNATURE-----
Merge tag 'kvm-riscv-6.7-1' of https://github.com/kvm-riscv/linux into HEAD
KVM/riscv changes for 6.7
- Smstateen and Zicond support for Guest/VM
- Virtualized senvcfg CSR for Guest/VM
- Added Smstateen registers to the get-reg-list selftests
- Added Zicond to the get-reg-list selftests
- Virtualized SBI debug console (DBCN) for Guest/VM
- Added SBI debug console (DBCN) to the get-reg-list selftests
* kvm-arm64/pmu_pmcr_n:
: User-defined PMC limit, courtesy Raghavendra Rao Ananta
:
: Certain VMMs may want to reserve some PMCs for host use while running a
: KVM guest. This was a bit difficult before, as KVM advertised all
: supported counters to the guest. Userspace can now limit the number of
: advertised PMCs by writing to PMCR_EL0.N, as KVM's sysreg and PMU
: emulation enforce the specified limit for handling guest accesses.
KVM: selftests: aarch64: vPMU test for validating user accesses
KVM: selftests: aarch64: vPMU register test for unimplemented counters
KVM: selftests: aarch64: vPMU register test for implemented counters
KVM: selftests: aarch64: Introduce vpmu_counter_access test
tools: Import arm_pmuv3.h
KVM: arm64: PMU: Allow userspace to limit PMCR_EL0.N for the guest
KVM: arm64: Sanitize PM{C,I}NTEN{SET,CLR}, PMOVS{SET,CLR} before first run
KVM: arm64: Add {get,set}_user for PM{C,I}NTEN{SET,CLR}, PMOVS{SET,CLR}
KVM: arm64: PMU: Set PMCR_EL0.N for vCPU based on the associated PMU
KVM: arm64: PMU: Add a helper to read a vCPU's PMCR_EL0
KVM: arm64: Select default PMU in KVM_ARM_VCPU_INIT handler
KVM: arm64: PMU: Introduce helpers to set the guest's PMU
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
* kvm-arm64/writable-id-regs:
: Writable ID registers, courtesy of Jing Zhang
:
: This series significantly expands the architectural feature set that
: userspace can manipulate via the ID registers. A new ioctl is defined
: that makes the mutable fields in the ID registers discoverable to
: userspace.
KVM: selftests: Avoid using forced target for generating arm64 headers
tools headers arm64: Fix references to top srcdir in Makefile
KVM: arm64: selftests: Test for setting ID register from usersapce
tools headers arm64: Update sysreg.h with kernel sources
KVM: selftests: Generate sysreg-defs.h and add to include path
perf build: Generate arm64's sysreg-defs.h and add to include path
tools: arm64: Add a Makefile for generating sysreg-defs.h
KVM: arm64: Document vCPU feature selection UAPIs
KVM: arm64: Allow userspace to change ID_AA64ZFR0_EL1
KVM: arm64: Allow userspace to change ID_AA64PFR0_EL1
KVM: arm64: Allow userspace to change ID_AA64MMFR{0-2}_EL1
KVM: arm64: Allow userspace to change ID_AA64ISAR{0-2}_EL1
KVM: arm64: Bump up the default KVM sanitised debug version to v8p8
KVM: arm64: Reject attempts to set invalid debug arch version
KVM: arm64: Advertise selected DebugVer in DBGDIDR.Version
KVM: arm64: Use guest ID register values for the sake of emulation
KVM: arm64: Document KVM_ARM_GET_REG_WRITABLE_MASKS
KVM: arm64: Allow userspace to get the writable masks for feature ID registers
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
The 'prepare' target that generates the arm64 sysreg headers had no
prerequisites, so it wound up forcing a rebuild of all KVM selftests
each invocation. Add a rule for the generated headers and just have
dependents use that for a prerequisite.
Reported-by: Nina Schoetterl-Glausch <nsg@linux.ibm.com>
Fixes: 9697d84cc3 ("KVM: selftests: Generate sysreg-defs.h and add to include path")
Tested-by: Nina Schoetterl-Glausch <nsg@linux.ibm.com>
Link: https://lore.kernel.org/r/20231027005439.3142015-3-oliver.upton@linux.dev
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
* kvm-arm64/misc:
: Miscellaneous updates
:
: - Put an upper bound on the number of I-cache invalidations by
: cacheline to avoid soft lockups
:
: - Get rid of bogus refererence count transfer for THP mappings
:
: - Do a local TLB invalidation on permission fault race
:
: - Fixes for page_fault_test KVM selftest
:
: - Add a tracepoint for detecting MMIO instructions unsupported by KVM
KVM: arm64: Add tracepoint for MMIO accesses where ISV==0
KVM: arm64: selftest: Perform ISB before reading PAR_EL1
KVM: arm64: selftest: Add the missing .guest_prepare()
KVM: arm64: Always invalidate TLB for stage-2 permission faults
KVM: arm64: Do not transfer page refcount for THP adjustment
KVM: arm64: Avoid soft lockups due to I-cache maintenance
arm64: tlbflush: Rename MAX_TLBI_OPS
KVM: arm64: Don't use kerneldoc comment for arm64_check_features()
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
It looks like a mistake to issue ISB *after* reading PAR_EL1, we should
instead perform it between the AT instruction and the reads of PAR_EL1.
As according to DDI0487J.a IJTYVP,
"When an address translation instruction is executed, explicit
synchronization is required to guarantee the result is visible to
subsequent direct reads of PAR_EL1."
Otherwise all guest_at testcases fail on my box with
==== Test Assertion Failure ====
aarch64/page_fault_test.c:142: par & 1 == 0
pid=1355864 tid=1355864 errno=4 - Interrupted system call
1 0x0000000000402853: vcpu_run_loop at page_fault_test.c:681
2 0x0000000000402cdb: run_test at page_fault_test.c:730
3 0x0000000000403897: for_each_guest_mode at guest_modes.c:100
4 0x00000000004019f3: for_each_test_and_guest_mode at page_fault_test.c:1105
5 (inlined by) main at page_fault_test.c:1131
6 0x0000ffffb153c03b: ?? ??:0
7 0x0000ffffb153c113: ?? ??:0
8 0x0000000000401aaf: _start at ??:?
0x1 != 0x0 (par & 1 != 0)
Signed-off-by: Zenghui Yu <yuzenghui@huawei.com>
Acked-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20231007124043.626-2-yuzenghui@huawei.com
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
Running page_fault_test on a Cortex A72 fails with
Test: ro_memslot_no_syndrome_guest_cas
Testing guest mode: PA-bits:40, VA-bits:48, 4K pages
Testing memory backing src type: anonymous
==== Test Assertion Failure ====
aarch64/page_fault_test.c:117: guest_check_lse()
pid=1944087 tid=1944087 errno=4 - Interrupted system call
1 0x00000000004028b3: vcpu_run_loop at page_fault_test.c:682
2 0x0000000000402d93: run_test at page_fault_test.c:731
3 0x0000000000403957: for_each_guest_mode at guest_modes.c:100
4 0x00000000004019f3: for_each_test_and_guest_mode at page_fault_test.c:1108
5 (inlined by) main at page_fault_test.c:1134
6 0x0000ffff868e503b: ?? ??:0
7 0x0000ffff868e5113: ?? ??:0
8 0x0000000000401aaf: _start at ??:?
guest_check_lse()
because we don't have a guest_prepare stage to check the presence of
FEAT_LSE and skip the related guest_cas testing, and we end-up failing in
GUEST_ASSERT(guest_check_lse()).
Add the missing .guest_prepare() where it's indeed required.
Signed-off-by: Zenghui Yu <yuzenghui@huawei.com>
Acked-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20231007124043.626-1-yuzenghui@huawei.com
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
Add a vPMU test scenario to validate the userspace accesses for
the registers PM{C,I}NTEN{SET,CLR} and PMOVS{SET,CLR} to ensure
that KVM honors the architectural definitions of these registers
for a given PMCR.N.
Signed-off-by: Raghavendra Rao Ananta <rananta@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20231020214053.2144305-13-rananta@google.com
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
Add a new test case to the vpmu_counter_access test to check
if PMU registers or their bits for unimplemented counters are not
accessible or are RAZ, as expected.
Signed-off-by: Reiji Watanabe <reijiw@google.com>
Signed-off-by: Raghavendra Rao Ananta <rananta@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20231020214053.2144305-12-rananta@google.com
[Oliver: fix issues relating to exception return address]
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
Add a new test case to the vpmu_counter_access test to check if PMU
registers or their bits for implemented counters on the vCPU are
readable/writable as expected, and can be programmed to count events.
Signed-off-by: Reiji Watanabe <reijiw@google.com>
Signed-off-by: Raghavendra Rao Ananta <rananta@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20231020214053.2144305-11-rananta@google.com
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
Introduce vpmu_counter_access test for arm64 platforms.
The test configures PMUv3 for a vCPU, sets PMCR_EL0.N for the vCPU,
and check if the guest can consistently see the same number of the
PMU event counters (PMCR_EL0.N) that userspace sets.
This test case is done with each of the PMCR_EL0.N values from
0 to 31 (With the PMCR_EL0.N values greater than the host value,
the test expects KVM_SET_ONE_REG for the PMCR_EL0 to fail).
Signed-off-by: Reiji Watanabe <reijiw@google.com>
Signed-off-by: Raghavendra Rao Ananta <rananta@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20231020214053.2144305-10-rananta@google.com
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
We have a new SBI debug console (DBCN) extension supported by in-kernel
KVM so let us add this extension to get-reg-list test.
Signed-off-by: Anup Patel <apatel@ventanamicro.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Signed-off-by: Anup Patel <anup@brainfault.org>
Add tests to verify setting ID registers from userspace is handled
correctly by KVM. Also add a test case to use ioctl
KVM_ARM_GET_REG_WRITABLE_MASKS to get writable masks.
Signed-off-by: Jing Zhang <jingzhangos@google.com>
Acked-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20231011195740.3349631-6-oliver.upton@linux.dev
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
The users of sysreg.h (perf, KVM selftests) are now generating the
necessary sysreg-defs.h; sync sysreg.h with the kernel sources and
fix the KVM selftests that use macros which suffered a rename.
Signed-off-by: Jing Zhang <jingzhangos@google.com>
Acked-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20231011195740.3349631-5-oliver.upton@linux.dev
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
Start generating sysreg-defs.h for arm64 builds in anticipation of
updating sysreg.h to a version that depends on it.
Reviewed-by: Mark Brown <broonie@kernel.org>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Tested-by: Eric Auger <eric.auger@redhat.com>
Acked-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20231011195740.3349631-4-oliver.upton@linux.dev
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
- Play nice with %llx when formatting guest printf and assert statements.
- Clean up stale test metadata.
- Zero-initialize structures in memslot perf test to workaround a suspected
"may be used uninitialized" false positives from GCC.
-----BEGIN PGP SIGNATURE-----
iQJGBAABCgAwFiEEMHr+pfEFOIzK+KY1YJEiAU0MEvkFAmUp1RISHHNlYW5qY0Bn
b29nbGUuY29tAAoJEGCRIgFNDBL5I+wP/1hpxu+wC29SxT6PxLcMFCuIy+XV+jSV
mnFRKhu4Np4E2Th8jFs94qnVN/gdPSNF+kbY//4kqfenf6V6VpNj4rUER2xZY+HV
0Ben08AzuoDdzn9UikG59GJeyXuZZDqqOdjM2xgefXDqyRYvFhwQleVCULzoGBeB
iwUD7g2MOJfFjqUhJAV8HDQTzhMgDIc9J+d+oO6J8FRXd64jQ4CBpaM1FDyjmpFp
RiKMRPkOuVkTbiScCGGwrMdpOXpT1tY5DcIP0N7O6EafyohItQ21hZWth5/1DmIS
wL1rMTad7uRcdQUdTDLUTegcFRQUjtUo5NS5l92MNFWGlW+J+C5GNVR8znMQCGq1
ZgovBR72FpFHaup2P2G4mxE9A0M7kiyhLqJFyaM6dvQQTyYSw0gPxKEjUmdljru1
Gf4SVZPdTAe543xpFJy3ANB4lWsc5uYzWxbIuNNSNl1p6l2em+wvvDroMJujVYU5
GnKxEtuwEnAUN094a8MMvWR6Q1m1Bfi5brFaPYmUveoZjGb7z8LxebQ4Tg3L08Ga
rH2Ri/Dt5mF/Pr5hje7eWDOYyDopPIcUWzRpIVA0/ORScqMXOV/oqEVufA+wwlSj
k0vcRkieRa2cxsKV+F00BSxKXEtdH51k3TEZA7dKFIKDuQk9q46+g5JXr4DF29dl
eXVDD03hHAFa
=LJ3q
-----END PGP SIGNATURE-----
Merge tag 'kvm-x86-selftests-6.6-fixes' of https://github.com/kvm-x86/linux into HEAD
KVM selftests fixes for 6.6:
- Play nice with %llx when formatting guest printf and assert statements.
- Clean up stale test metadata.
- Zero-initialize structures in memslot perf test to workaround a suspected
"may be used uninitialized" false positives from GCC.
The __printf() macro is used in many tools in the linux kernel to
validate the format specifiers in functions that use printf. The kvm
selftest uses it without putting it in a macro definition while it
also imports the kselftests.h header where the macro attribute is
defined.
Use __printf() from kselftests.h instead of the full attribute.
Signed-off-by: Maciej Wieczor-Retman <maciej.wieczor-retman@intel.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Extend x86's state to forcefully load *all* host-supported xfeatures by
modifying xstate_bv in the saved state. Stuffing xstate_bv ensures that
the selftest is verifying KVM's full ABI regardless of whether or not the
guest code is successful in getting various xfeatures out of their INIT
state, e.g. see the disaster that is/was MPX.
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20230928001956.924301-6-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Expand x86's state test to load XSAVE state into a "dummy" vCPU prior to
KVM_SET_CPUID2, and again with an empty guest CPUID model. Except for
off-by-default features, i.e. AMX, KVM's ABI for KVM_SET_XSAVE is that
userspace is allowed to load xfeatures so long as they are supported by
the host. This is a regression test for a combination of KVM bugs where
the state saved by KVM_GET_XSAVE{2} could not be loaded via KVM_SET_XSAVE
if the saved xstate_bv would load guest-unsupported xfeatures.
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20230928001956.924301-5-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Modify support XSAVE state in the "state test's" guest code so that saving
and loading state via KVM_{G,S}ET_XSAVE actually does something useful,
i.e. so that xstate_bv in XSAVE state isn't empty.
Punt on BNDCSR for now, it's easier to just stuff that xfeature from the
host side.
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20230928001956.924301-4-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
When outputting the "new" register list we want to print all of the
new registers, decoding as much as possible of each of them. Also, we
don't want to assert while listing registers with '--list'. We output
"/* UNKNOWN */" after each new register (which we were already doing
for some), which should be enough.
Signed-off-by: Andrew Jones <ajones@ventanamicro.com>
Reviewed-by: Haibo Xu <haibo1.xu@intel.com>
Signed-off-by: Anup Patel <anup@brainfault.org>
We have a new conditional operations related ISA extensions so let us
add these extensions to get-reg-list test.
Signed-off-by: Anup Patel <apatel@ventanamicro.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Signed-off-by: Anup Patel <anup@brainfault.org>
We have a new smstateen registers as separate sub-type of CSR ONE_REG
interface so let us add these registers to get-reg-list test.
Signed-off-by: Anup Patel <apatel@ventanamicro.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Signed-off-by: Anup Patel <anup@brainfault.org>
We have a new senvcfg register in the general CSR ONE_REG interface
so let us add it to get-reg-list test.
Signed-off-by: Anup Patel <apatel@ventanamicro.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Signed-off-by: Anup Patel <anup@brainfault.org>
Add a couple macros to use when filling arrays in order to ensure
the elements are placed in the right order, regardless of the
order we prefer to read them. And immediately apply the new
macro to resorting the ISA extension lists alphabetically.
Signed-off-by: Andrew Jones <ajones@ventanamicro.com>
Reviewed-by: Haibo Xu <haibo1.xu@intel.com>
Signed-off-by: Anup Patel <anup@brainfault.org>
Verify the following behavior holds true for writes and reads of HWCR from
host userspace:
* Attempts to set bits 3, 6, or 8 are ignored
* Bits 18 and 24 are the only bits that can be set
* Any bit that can be set can also be cleared
Signed-off-by: Jim Mattson <jmattson@google.com>
Link: https://lore.kernel.org/r/20230929230246.1954854-4-jmattson@google.com
Co-developed-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Zero-initialize the entire test_result structure used by memslot_perf_test
instead of zeroing only the fields used to guard the pr_info() calls.
gcc 13.2.0 is a bit overzealous and incorrectly thinks that rbestslottime's
slot_runtime may be used uninitialized.
In file included from memslot_perf_test.c:25:
memslot_perf_test.c: In function ‘main’:
include/test_util.h:31:22: error: ‘rbestslottime.slot_runtime.tv_nsec’ may be used uninitialized [-Werror=maybe-uninitialized]
31 | #define pr_info(...) printf(__VA_ARGS__)
| ^~~~~~~~~~~~~~~~~~~
memslot_perf_test.c:1127:17: note: in expansion of macro ‘pr_info’
1127 | pr_info("Best slot setup time for the whole test area was %ld.%.9lds\n",
| ^~~~~~~
memslot_perf_test.c:1092:28: note: ‘rbestslottime.slot_runtime.tv_nsec’ was declared here
1092 | struct test_result rbestslottime;
| ^~~~~~~~~~~~~
include/test_util.h:31:22: error: ‘rbestslottime.slot_runtime.tv_sec’ may be used uninitialized [-Werror=maybe-uninitialized]
31 | #define pr_info(...) printf(__VA_ARGS__)
| ^~~~~~~~~~~~~~~~~~~
memslot_perf_test.c:1127:17: note: in expansion of macro ‘pr_info’
1127 | pr_info("Best slot setup time for the whole test area was %ld.%.9lds\n",
| ^~~~~~~
memslot_perf_test.c:1092:28: note: ‘rbestslottime.slot_runtime.tv_sec’ was declared here
1092 | struct test_result rbestslottime;
| ^~~~~~~~~~~~~
That can't actually happen, at least not without the "result" structure in
test_loop() also being used uninitialized, which gcc doesn't complain
about, as writes to rbestslottime are all-or-nothing, i.e. slottimens can't
be non-zero without slot_runtime being written.
if (!data->mem_size &&
(!rbestslottime->slottimens ||
result.slottimens < rbestslottime->slottimens))
*rbestslottime = result;
Zero-initialize the structures to make gcc happy even though this is
likely a compiler bug. The cost to do so is negligible, both in terms of
code and runtime overhead. The only downside is that the compiler won't
warn about legitimate usage of "uninitialized" data, e.g. the test could
end up consuming zeros instead of useful data. However, given that the
test is quite mature and unlikely to see substantial changes, the odds of
introducing such bugs are relatively low, whereas being able to compile
KVM selftests with -Werror detects issues on a regular basis.
Reviewed-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Link: https://lore.kernel.org/r/20231005002954.2887098-1-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Delete inaccurate descriptions and obsolete metadata for test cases.
It adds zero value, and has a non-zero chance of becoming stale and
misleading in the future. No functional changes intended.
Suggested-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Like Xu <likexu@tencent.com>
Link: https://lore.kernel.org/r/20230914094803.94661-1-likexu@tencent.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Treat %ll* formats the same as %l* formats when processing printfs from
the guest so that using e.g. %llx instead of %lx generates the expected
output. Ideally, unexpected formats would generate compile-time warnings
or errors, but it's not at all obvious how to actually accomplish that.
Alternatively, guest_vsnprintf() could assert on an unexpected format,
but since the vast majority of printfs are for failed guest asserts,
getting *something* printed is better than nothing.
E.g. before
==== Test Assertion Failure ====
x86_64/private_mem_conversions_test.c:265: mem[i] == 0
pid=4286 tid=4290 errno=4 - Interrupted system call
1 0x0000000000401c74: __test_mem_conversions at private_mem_conversions_test.c:336
2 0x00007f3aae6076da: ?? ??:0
3 0x00007f3aae32161e: ?? ??:0
Expected 0x0 at offset 0 (gpa 0x%lx), got 0x0
and after
==== Test Assertion Failure ====
x86_64/private_mem_conversions_test.c:265: mem[i] == 0
pid=5664 tid=5668 errno=4 - Interrupted system call
1 0x0000000000401c74: __test_mem_conversions at private_mem_conversions_test.c:336
2 0x00007fbe180076da: ?? ??:0
3 0x00007fbe17d2161e: ?? ??:0
Expected 0x0 at offset 0 (gpa 0x100000000), got 0xcc
Fixes: e511938249 ("KVM: selftests: Add guest_snprintf() to KVM selftests")
Cc: Aaron Lewis <aaronlewis@google.com>
Link: https://lore.kernel.org/r/20230921171641.3641776-1-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
- Fix KVM_GET_REG_LIST API for ISA_EXT registers
- Fix reading ISA_EXT register of a missing extension
- Fix ISA_EXT register handling in get-reg-list test
- Fix filtering of AIA registers in get-reg-list test
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEZdn75s5e6LHDQ+f/rUjsVaLHLAcFAmUMDssACgkQrUjsVaLH
LAckSg/+IZ5DPvPs81rUpL3i1Z5SrK4jXWL2zyvMIksBEYmFD2NPNvinVZ4Sxv6u
IzWNKJcAp4nA/+qPGPLXCURDe1W6PCDvO4SShjYm2UkPtNIfiskmFr3MunXZysgm
I7USJgj9ev+46yfOnwlYrwpZ8sQk7Z6nLTI/6Osk4Q7Sm0Vjoobh6krub7LNjeKQ
y6p+vxrXj+Owc5H9bgl0wAi6GOmOJKAM+cZU5DygQxjOgiUgNbOzsVgbLDTvExNq
gISUU4PoAO7/U1NKEaaopbe7C0KNQHTnegedtXsDzg7WTBah77/MNBt4snbfiR27
6rODklZlG/kAGIHdVtYC+zf8AfPqvGTIT8SLGmzQlyVlHujFBGn0L41NmMzW+EeA
7UpfUk8vPiiGhefBE+Ml3yqiReogo+aRhL1mZoI39rPusd7DMnwx97KpBlAcYuTI
PTgqycIMRmq2dSCHya+nrOVpwwV3Qx4G8Alpq1jOa7XDMeGMj4h521NQHjWckIK2
IJ2a0RtzB10+Z91nLV+amdAno326AnxJC7dR26O6uqVSPJy/nHE2GAc49gFKeWq6
QmzgzY1sU2Y02/TM8miyKSl3i+bpZSIPzXCKlOm1TowBKO+sfJzn/yMon9sVaVhb
4Wjgg3vgE74y9FVsL4JXR/PfrZH5Aq77J1R+/pMtsNTtVYrt1Sk=
=ytFs
-----END PGP SIGNATURE-----
Merge tag 'kvm-riscv-fixes-6.6-1' of https://github.com/kvm-riscv/linux into HEAD
KVM/riscv fixes for 6.6, take #1
- Fix KVM_GET_REG_LIST API for ISA_EXT registers
- Fix reading ISA_EXT register of a missing extension
- Fix ISA_EXT register handling in get-reg-list test
- Fix filtering of AIA registers in get-reg-list test
Currently the AIA ONE_REG registers are reported by get-reg-list
as new registers for various vcpu_reg_list configs whenever Ssaia
is available on the host because Ssaia extension can only be
disabled by Smstateen extension which is not always available.
To tackle this, we should filter-out AIA ONE_REG registers only
when Ssaia can't be disabled for a VCPU.
Fixes: 477069398e ("KVM: riscv: selftests: Add get-reg-list test")
Signed-off-by: Anup Patel <apatel@ventanamicro.com>
Reviewed-by: Atish Patra <atishp@rivosinc.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Signed-off-by: Anup Patel <anup@brainfault.org>
Same set of ISA_EXT registers are not present on all host because
ISA_EXT registers are visible to the KVM user space based on the
ISA extensions available on the host. Also, disabling an ISA
extension using corresponding ISA_EXT register does not affect
the visibility of the ISA_EXT register itself.
Based on the above, we should filter-out all ISA_EXT registers.
Fixes: 477069398e ("KVM: riscv: selftests: Add get-reg-list test")
Signed-off-by: Anup Patel <apatel@ventanamicro.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Signed-off-by: Anup Patel <anup@brainfault.org>
Assert that vasprintf() succeeds as the "returned" string is undefined
on failure. Checking the result also eliminates the only warning with
default options in KVM selftests, i.e. is the only thing getting in the
way of compile with -Werror.
lib/test_util.c: In function ‘strdup_printf’:
lib/test_util.c:390:9: error: ignoring return value of ‘vasprintf’
declared with attribute ‘warn_unused_result’ [-Werror=unused-result]
390 | vasprintf(&str, fmt, ap);
| ^~~~~~~~~~~~~~~~~~~~~~~~
Don't bother capturing the return value, allegedly vasprintf() can only
fail due to a memory allocation failure.
Fixes: dfaf20af76 ("KVM: arm64: selftests: Replace str_with_index with strdup_printf")
Cc: Andrew Jones <ajones@ventanamicro.com>
Cc: Haibo Xu <haibo1.xu@intel.com>
Cc: Anup Patel <anup@brainfault.org>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Tested-by: Andrew Jones <ajones@ventanamicro.com>
Message-Id: <20230914010636.1391735-1-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Allows a PV guest to use crypto cards. Card access is governed by
the firmware and once a crypto queue is "bound" to a PV VM every
other entity (PV or not) looses access until it is not bound
anymore. Enablement is done via flags when creating the PV VM.
- Guest debug fixes (Ilya)
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEEwGNS88vfc9+v45Yq41TmuOI4ufgFAmTsgQ8ACgkQ41TmuOI4
ufgCtRAAvSx+XmHhTL4D6QJsEm2Ahgz+9VKxlD91r+gbuw1h9rcZJcSGnZ41nxW1
rl/cEL4sGyEP8SjlKm9LB18mJ7LoJJaCIpzckWqmbGVvkdIXd45VxvppiSWCSq3X
TaFtfmXLi0iFznVHMHAR53if2t/exNXHHEwjAGm1byVUXy4xgqLaaXrdYBSRGVPQ
pmHoIJTlZUux/eOSrXEzsGPuza+dIQBilvZZRs1SZJmlh0rz39XX29GZTHHv6ant
8dkf5Q2Lkvs+jI6+6i4YCLQFzXixcLgaBjRRvRnE8aCP8/DbSjR+S+Qu4mMfWFtp
2oO2X7rwB/vu8FM06TxRrif/03crxaYtFdWbmGJUhhwp9DS7WO27sk61Z/yWHQFX
cviKEEvn3DvtBrrBtKrbEa04depRuwpQfwkbtnFDkGbDgxeekswMO81xV9T8VNxF
teyUyS9Fev4XuAjmBS2F1dHv9i/Sl2uB/Uh14GvTkyBOQzrcRw8dONR0ppVi9OVO
k0pjj9JmKpE+F39IuDYK0H+G82X67YQLk3yZfAF0zfVxV6ZrpEPtnsPe2rYPD5bW
zdMfzOiTeBVt+JTy9Dqkt2NWjWfjt+k7ws00q4ijDlcNYLO3cgIIYUPsim53Ccue
AG6iYyu4/o8hOSi2LkjEaPeC+wrCVfqKUOjzVa8FcZfy0wFKVdc=
=E2KD
-----END PGP SIGNATURE-----
Merge tag 'kvm-s390-next-6.6-1' of https://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into HEAD
- PV crypto passthrough enablement (Tony, Steffen, Viktor, Janosch)
Allows a PV guest to use crypto cards. Card access is governed by
the firmware and once a crypto queue is "bound" to a PV VM every
other entity (PV or not) looses access until it is not bound
anymore. Enablement is done via flags when creating the PV VM.
- Guest debug fixes (Ilya)
- Add testcases to x86's sync_regs_test for detecting KVM TOCTOU bugs
- Add support for printf() in guest code and covert all guest asserts to use
printf-based reporting
- Clean up the PMU event filter test and add new testcases
- Include x86 selftests in the KVM x86 MAINTAINERS entry
-----BEGIN PGP SIGNATURE-----
iQJGBAABCgAwFiEEMHr+pfEFOIzK+KY1YJEiAU0MEvkFAmTueu4SHHNlYW5qY0Bn
b29nbGUuY29tAAoJEGCRIgFNDBL5wvIQAK8jWhb1Y4CzrJmcZyYYIR6apgtXl4vB
KbhFIFHi5ZeZXlpXA2o/FW8Q9LNmcRLtxoapb09t/eyb0+ODllDPt/aSG7p6Y4p9
rNb1g6Hj77LTaG5gMy7/lbk9ERzf61+MKUuucU7WzjlY8oyd+lm+y2cx2O3+S/89
C5cp2CGnqK2NMbUnzYN8izMrdvtwDvgQvm3H7Ah8yrGXJkcemVggXibuh+2coTfo
p2RKrY+A4Syw/edNe0GVZYoSVJdwPEif8o0gAz5PwC2LTjpf9Iobt89KEx08BkVw
ms0MFbwLS66MoSYIVoZkBdy/Tri5aCKxHGqu7taEWhogjbzrPvktA6PNYihO4zGa
OSjA/oyAPvFJ4cLuBlrVh/xPWVoGX/6Sx3dBP5TI3zyR0FAqZkoAPDivWhflOpTt
q3aoHr6THGRzqHOCYuX7nwzhqBFSSHUF1zy/P7rThSzieSzUiJiANUwBjTeB9Wsr
5Cn+KQ8XOZw1LVcoeI9y97xcHh9HeP3seO+MFie8OH9QK4nUqgqEbF8sp7WF0rB6
6rZ1lht9a2Qx4xdtqSMBkQdgnnaiCZ7jBtEFMK6kSQ67zvorlCwkOue3TrtorJ4H
1XI/DGAzltEfCLMAq+4FkHkkEr84S3gRjaLlI9aHWlVrSk1wxM87R16jgVfJp74R
gTNAzCys2KwM
=dHTQ
-----END PGP SIGNATURE-----
Merge tag 'kvm-x86-selftests-6.6' of https://github.com/kvm-x86/linux into HEAD
KVM: x86: Selftests changes for 6.6:
- Add testcases to x86's sync_regs_test for detecting KVM TOCTOU bugs
- Add support for printf() in guest code and covert all guest asserts to use
printf-based reporting
- Clean up the PMU event filter test and add new testcases
- Include x86 selftests in the KVM x86 MAINTAINERS entry
Test different variations of single-stepping into interrupts:
- SVC and PGM interrupts;
- Interrupts generated by ISKE;
- Interrupts generated by instructions emulated by KVM;
- Interrupts generated by instructions emulated by userspace.
Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Message-ID: <20230725143857.228626-7-iii@linux.ibm.com>
Signed-off-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
[frankja@de.igm.com: s/ASSERT_EQ/TEST_ASSERT_EQ/ because function was
renamed in the selftest printf series]
Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
Explicitly set the exception vector to #UD when potentially injecting an
exception in sync_regs_test's subtests that try to detect TOCTOU bugs
in KVM's handling of exceptions injected by userspace. A side effect of
the original KVM bug was that KVM would clear the vector, but relying on
KVM to clear the vector (i.e. make it #DE) makes it less likely that the
test would ever find *new* KVM bugs, e.g. because only the first iteration
would run with a legal vector to start.
Explicitly inject #UD for race_events_inj_pen() as well, e.g. so that it
doesn't inherit the illegal 255 vector from race_events_exc(), which
currently runs first.
Link: https://lore.kernel.org/r/20230817233430.1416463-3-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Reload known good vCPU state if the vCPU triple faults in any of the
race_sync_regs() subtests, e.g. if KVM successfully injects an exception
(the vCPU isn't configured to handle exceptions). On Intel, the VMCS
is preserved even after shutdown, but AMD's APM states that the VMCB is
undefined after a shutdown and so KVM synthesizes an INIT to sanitize
vCPU/VMCB state, e.g. to guard against running with a garbage VMCB.
The synthetic INIT results in the vCPU never exiting to userspace, as it
gets put into Real Mode at the reset vector, which is full of zeros (as is
GPA 0 and beyond), and so executes ADD for a very, very long time.
Fixes: 60c4063b47 ("KVM: selftests: Extend x86's sync_regs_test to check for event vector races")
Cc: Michal Luczaj <mhal@rbox.co>
Link: https://lore.kernel.org/r/20230817233430.1416463-2-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Add a test to ensure that setting both generic and fixed performance
event filters does not affect the consistency of the fixed event filter
behavior in KVM.
Signed-off-by: Jinrong Liang <cloudliang@tencent.com>
Link: https://lore.kernel.org/r/20230810090945.16053-7-cloudliang@tencent.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Add tests to cover that pmu event_filter works as expected when it's
applied to fixed performance counters, even if there is none fixed
counter exists (e.g. Intel guest pmu version=1 or AMD guest).
Signed-off-by: Jinrong Liang <cloudliang@tencent.com>
Reviewed-by: Isaku Yamahata <isaku.yamahata@intel.com>
Link: https://lore.kernel.org/r/20230810090945.16053-6-cloudliang@tencent.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Add test cases to verify the handling of unsupported input values for the
PMU event filter. The tests cover unsupported "action" values, unsupported
"flags" values, and unsupported "nevents" values. All these cases should
return an error, as they are currently not supported by the filter.
Furthermore, the tests also cover the case where setting non-existent
fixed counters in the fixed bitmap does not fail.
Signed-off-by: Jinrong Liang <cloudliang@tencent.com>
Reviewed-by: Isaku Yamahata <isaku.yamahata@intel.com>
Link: https://lore.kernel.org/r/20230810090945.16053-5-cloudliang@tencent.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Add custom "__kvm_pmu_event_filter" structure to improve pmu event
filter settings. Simplifies event filter setup by organizing event
filter parameters in a cleaner, more organized way.
Alternatively, selftests could use a struct overlay ala vcpu_set_msr()
to avoid dynamically allocating the array:
struct {
struct kvm_msrs header;
struct kvm_msr_entry entry;
} buffer = {};
memset(&buffer, 0, sizeof(buffer));
buffer.header.nmsrs = 1;
buffer.entry.index = msr_index;
buffer.entry.data = msr_value;
but the extra layer added by the nested structs is counterproductive
to writing efficient, clean code.
Suggested-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Jinrong Liang <cloudliang@tencent.com>
Link: https://lore.kernel.org/r/20230810090945.16053-4-cloudliang@tencent.com
[sean: massage changelog to explain alternative]
Signed-off-by: Sean Christopherson <seanjc@google.com>
None of the callers consume remove_event(), and it incorrectly implies
that the incoming filter isn't modified. Drop the return.
Suggested-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Jinrong Liang <cloudliang@tencent.com>
Link: https://lore.kernel.org/r/20230810090945.16053-3-cloudliang@tencent.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Add x86 properties for Intel PMU so that tests don't have to manually
retrieve the correct CPUID leaf+register, and so that the resulting code
is self-documenting.
Suggested-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Jinrong Liang <cloudliang@tencent.com>
Link: https://lore.kernel.org/r/20230810090945.16053-2-cloudliang@tencent.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
get-reg-list test is used to check for KVM registers regressions
during VM migration which happens when destination host kernel
missing registers that the source host kernel has. The blessed
list registers was created by running on v6.5-rc3
Signed-off-by: Haibo Xu <haibo1.xu@intel.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Signed-off-by: Anup Patel <anup@brainfault.org>
Add new skips_set members to vcpu_reg_sublist so as to skip
set operation on some registers.
Suggested-by: Andrew Jones <ajones@ventanamicro.com>
Signed-off-by: Haibo Xu <haibo1.xu@intel.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Signed-off-by: Anup Patel <anup@brainfault.org>
Only do the get/set tests on present and blessed registers
since we don't know the capabilities of any new ones.
Suggested-by: Andrew Jones <ajones@ventanamicro.com>
Signed-off-by: Haibo Xu <haibo1.xu@intel.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Signed-off-by: Anup Patel <anup@brainfault.org>
No functional changes. Just move the finalize_vcpu call back to
run_test and do weak function trick to prepare for the opration
in riscv.
Suggested-by: Andrew Jones <ajones@ventanamicro.com>
Signed-off-by: Haibo Xu <haibo1.xu@intel.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Signed-off-by: Anup Patel <anup@brainfault.org>
No functional changes. Just move the reject_set check logic to a
function so we can check for a specific errno. This is a preparation
for support reject_set in riscv.
Suggested-by: Andrew Jones <ajones@ventanamicro.com>
Signed-off-by: Haibo Xu <haibo1.xu@intel.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Signed-off-by: Anup Patel <anup@brainfault.org>
Add some unfortunate #ifdeffery to ensure the common get-reg-list.c
can be compiled and run with other architectures. The next
architecture to support get-reg-list should now only need to provide
$(ARCH_DIR)/get-reg-list.c where arch-specific print_reg() and
vcpu_configs[] get defined.
Signed-off-by: Andrew Jones <ajones@ventanamicro.com>
Signed-off-by: Haibo Xu <haibo1.xu@intel.com>
Signed-off-by: Anup Patel <anup@brainfault.org>
Split the arch-neutral test code out of aarch64/get-reg-list.c into
get-reg-list.c. To do this we invent a new make variable
$(SPLIT_TESTS) which expects common parts to be in the KVM selftests
root and the counterparts to have the same name, but be in
$(ARCH_DIR).
There's still some work to be done to de-aarch64 the common
get-reg-list.c, but we leave that to the next patch to avoid
modifying too much code while moving it.
Signed-off-by: Andrew Jones <ajones@ventanamicro.com>
Signed-off-by: Haibo Xu <haibo1.xu@intel.com>
Signed-off-by: Anup Patel <anup@brainfault.org>
core_reg_fixup() complicates sharing the get-reg-list test with
other architectures. Rather than work at keeping it, with plenty
of #ifdeffery, just delete it, as it's unlikely to test a kernel
based on anything older than v5.2 with the get-reg-list test,
which is a test meant to check for regressions in new kernels.
(And, an older version of the test can still be used for older
kernels if necessary.)
Signed-off-by: Andrew Jones <ajones@ventanamicro.com>
Signed-off-by: Haibo Xu <haibo1.xu@intel.com>
Signed-off-by: Anup Patel <anup@brainfault.org>
Rename vcpu_config to vcpu_reg_list to be more specific and add
it to kvm_util.h. While it may not get used outside get-reg-list
tests, exporting it doesn't hurt, as long as it has a unique enough
name. This is a step in the direction of sharing most of the get-
reg-list test code between architectures.
Signed-off-by: Andrew Jones <ajones@ventanamicro.com>
Signed-off-by: Haibo Xu <haibo1.xu@intel.com>
Signed-off-by: Anup Patel <anup@brainfault.org>
print_reg() and its helpers only use the vcpu_config pointer for
config_name(). So just pass the config name in instead, which is used
as a prefix in asserts. print_reg() can now be compiled independently
of config_name().
Signed-off-by: Andrew Jones <ajones@ventanamicro.com>
Signed-off-by: Haibo Xu <haibo1.xu@intel.com>
Signed-off-by: Anup Patel <anup@brainfault.org>
The check doesn't prove much anyway, as the reg lists could be
messed up too. Just drop the check to simplify making print_reg
more independent.
Signed-off-by: Andrew Jones <ajones@ventanamicro.com>
Signed-off-by: Haibo Xu <haibo1.xu@intel.com>
Signed-off-by: Anup Patel <anup@brainfault.org>
The original author of aarch64/get-reg-list.c (me) was wearing
tunnel vision goggles when implementing str_with_index(). There's
no reason to have such a special case string function. Instead,
take inspiration from glib and implement strdup_printf. The
implementation builds on vasprintf() which requires _GNU_SOURCE,
but we require _GNU_SOURCE in most files already.
Signed-off-by: Andrew Jones <ajones@ventanamicro.com>
Signed-off-by: Haibo Xu <haibo1.xu@intel.com>
Signed-off-by: Anup Patel <anup@brainfault.org>
Use GUEST_FAIL() in ARM's arch timer helpers now that printf-based
guest asserts are the default (and only) style of guest asserts, and
say goodbye to the GUEST_ASSERT_1() alias.
Link: https://lore.kernel.org/r/20230729003643.1053367-35-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Use the newfanged printf-based guest assert framework to spit out the
guest RIP when an unhandled exception is detected, which makes debugging
such failures *much* easier.
Link: https://lore.kernel.org/r/20230729003643.1053367-34-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Drop the param-based guest assert macros and enable the printf versions
for all selftests. Note! This change can affect tests even if they
don't use directly use guest asserts! E.g. via library code, or due to
the compiler making different optimization decisions.
Link: https://lore.kernel.org/r/20230729003643.1053367-33-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Convert x86's VMX PMU capabilities test to use printf-based guest asserts.
Opportunstically add a helper to do the WRMSR+assert so as to reduce the
amount of copy+paste needed to spit out debug information.
Link: https://lore.kernel.org/r/20230729003643.1053367-31-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Convert x86's nested SVM software interrupt injection test to use printf-
based guest asserts. Opportunistically use GUEST_ASSERT() and
GUEST_FAIL() in a few locations to spit out more debug information.
Link: https://lore.kernel.org/r/20230729003643.1053367-28-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Convert x86's nested exceptions test to printf-based guest asserts, and
use REPORT_GUEST_ASSERT() instead of TEST_FAIL() so that output is
formatted correctly.
Link: https://lore.kernel.org/r/20230729003643.1053367-26-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Convert x86's MONITOR/MWAIT test to use printf-based guest asserts. Add a
macro to handle reporting failures to reduce the amount of copy+paste
needed for MONITOR vs. MWAIT.
Link: https://lore.kernel.org/r/20230729003643.1053367-25-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Convert x86's Hyper-V feature test to use print-based guest asserts.
Opportunistically use the EQ and NE variants in a few places to capture
additional information.
Link: https://lore.kernel.org/r/20230729003643.1053367-23-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Convert x86's CPUID test to use printf-based GUEST_ASSERT_EQ() so that
the test prints out debug information. Note, the test previously used
REPORT_GUEST_ASSERT_2(), but that was pointless because none of the
guest-side code passed any parameters to the assert.
Link: https://lore.kernel.org/r/20230729003643.1053367-21-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Convert the steal_time test to use printf-based GUEST_ASERT.
Opportunistically use GUEST_ASSERT_EQ() and GUEST_ASSERT_NE() so that the
test spits out debug information on failure.
Link: https://lore.kernel.org/r/20230729003643.1053367-20-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Use printf-based guest assert reporting in ARM's vGIC IRQ test. Note,
this is not as innocuous as it looks! The printf-based version of
GUEST_ASSERT_EQ() ensures the expressions are evaluated only once, whereas
the old version did not!
Link: https://lore.kernel.org/r/20230729003643.1053367-15-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Convert ARM's debug exceptions test to use printf-based GUEST_ASSERT().
Opportunistically Use GUEST_ASSERT_EQ() in guest_code_ss() so that the
expected vs. actual values get printed out.
Link: https://lore.kernel.org/r/20230729003643.1053367-12-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Convert ARM's aarch_timer test to use printf-based GUEST_ASSERT().
To maintain existing functionality, manually print the host information,
e.g. stage and iteration, to stderr prior to reporting the guest assert.
Link: https://lore.kernel.org/r/20230729003643.1053367-11-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Add a test to exercise the various features in KVM selftest's local
snprintf() and compare them to LIBC's snprintf() to ensure they behave
the same.
This is not an exhaustive test. KVM's local snprintf() does not
implement all the features LIBC does, e.g. KVM's local snprintf() does
not support floats or doubles, so testing for those features were
excluded.
Testing was added for the features that are expected to work to
support a minimal version of printf() in the guest.
Signed-off-by: Aaron Lewis <aaronlewis@google.com>
[sean: use UCALL_EXIT_REASON, enable for all architectures]
Link: https://lore.kernel.org/r/20230731203026.1192091-4-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Define the expected architecture specific exit reason for a successful
ucall so that common tests can assert that a ucall occurred without the
test needing to implement arch specific code.
Suggested-by: Andrew Jones <ajones@ventanamicro.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Link: https://lore.kernel.org/r/20230731203026.1192091-3-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Add an architecture specific ucall.h and inline the simple arch hooks,
e.g. the init hook for everything except ARM, and the actual "do ucall"
hook for everything except x86 (which should be simple, but temporarily
isn't due to carrying a workaround).
Having a per-arch ucall header will allow adding a #define for the
expected KVM exit reason for a ucall that is colocated (for everything
except x86) with the ucall itself.
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Link: https://lore.kernel.org/r/20230731203026.1192091-2-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Add printf-based GUEST_ASSERT macros and accompanying host-side support to
provide an assert-specific versions of GUEST_PRINTF(). To make it easier
to parse assert messages, for humans and bots alike, preserve/use the same
layout as host asserts, e.g. in the example below, the reported expression,
file, line number, and message are from the guest assertion, not the host
reporting of the assertion.
The call stack still captures the host reporting, but capturing the guest
stack is a less pressing concern, i.e. can be done in the future, and an
optimal solution would capture *both* the host and guest stacks, i.e.
capturing the host stack isn't an outright bug.
Running soft int test
==== Test Assertion Failure ====
x86_64/svm_nested_soft_inject_test.c:39: regs->rip != (unsigned long)l2_guest_code_int
pid=214104 tid=214104 errno=4 - Interrupted system call
1 0x0000000000401b35: run_test at svm_nested_soft_inject_test.c:191
2 0x00000000004017d2: main at svm_nested_soft_inject_test.c:212
3 0x0000000000415b03: __libc_start_call_main at libc-start.o:?
4 0x000000000041714f: __libc_start_main_impl at ??:?
5 0x0000000000401660: _start at ??:?
Expected IRQ at RIP 0x401e50, received IRQ at 0x401e50
Don't bother sharing code between ucall_assert() and ucall_fmt(), as
forwarding the variable arguments would either require using macros or
building a va_list, i.e. would make the code less readable and/or require
just as much copy+paste code anyways.
Gate the new macros with a flag so that tests can more or less be switched
over one-by-one. The slow conversion won't be perfect, e.g. library code
won't pick up the flag, but the only asserts in library code are of the
vanilla GUEST_ASSERT() variety, i.e. don't print out variables.
Add a temporary alias to GUEST_ASSERT_1() to fudge around ARM's
arch_timer.h header using GUEST_ASSERT_1(), thus thwarting any attempt to
convert tests one-by-one.
Link: https://lore.kernel.org/r/20230729003643.1053367-9-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Add more flexibility to guest debugging and testing by adding
GUEST_PRINTF() and GUEST_ASSERT_FMT() to the ucall framework.
Add a sized buffer to the ucall structure to hold the formatted string,
i.e. to allow the guest to easily resolve the string, and thus avoid the
ugly pattern of the host side having to make assumptions about the desired
format, as well as having to pass around a large number of parameters.
The buffer size was chosen to accommodate most use cases, and based on
similar usage. E.g. printf() uses the same size buffer in
arch/x86/boot/printf.c. And 1KiB ought to be enough for anybody.
Signed-off-by: Aaron Lewis <aaronlewis@google.com>
[sean: massage changelog, wrap macro param in ()]
Link: https://lore.kernel.org/r/20230729003643.1053367-8-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Add additional pages to the guest to account for the number of pages
the ucall headers need. The only reason things worked before is the
ucall headers are fairly small. If they were ever to increase in
size the guest could run out of memory.
This is done in preparation for adding string formatting options to
the guest through the ucall framework which increases the size of
the ucall headers.
Fixes: 426729b2cf ("KVM: selftests: Add ucall pool based implementation")
Signed-off-by: Aaron Lewis <aaronlewis@google.com>
Link: https://lore.kernel.org/r/20230729003643.1053367-7-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Add a local version of guest_snprintf() for use in the guest.
Having a local copy allows the guest access to string formatting
options without dependencies on LIBC. LIBC is problematic because
it heavily relies on both AVX-512 instructions and a TLS, neither of
which are guaranteed to be set up in the guest.
The file guest_sprintf.c was lifted from arch/x86/boot/printf.c and
adapted to work in the guest, including the addition of buffer length.
I.e. s/sprintf/snprintf/
The functions where prefixed with "guest_" to allow guests to
explicitly call them.
A string formatted by this function is expected to succeed or die. If
something goes wrong during the formatting process a GUEST_ASSERT()
will be thrown.
Signed-off-by: Aaron Lewis <aaronlewis@google.com>
Link: https://lore.kernel.org/all/mtdi6smhur5rqffvpu7qux7mptonw223y2653x2nwzvgm72nlo@zyc4w3kwl3rg
[sean: add a link to the discussion of other options]
Link: https://lore.kernel.org/r/20230729003643.1053367-6-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Add strnlen() to the string overrides to allow it to be called in the
guest.
The implementation for strnlen() was taken from the kernel's generic
version, lib/string.c.
This will be needed when printf() is introduced.
Signed-off-by: Aaron Lewis <aaronlewis@google.com>
Link: https://lore.kernel.org/r/20230729003643.1053367-5-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Preserve or clobber all GPRs (except RIP and RSP, as they're saved and
restored via the VMCS) when performing a ucall on x86 to fudge around a
horrific long-standing bug in selftests' nested VMX support where L2's
GPRs are not preserved across a nested VM-Exit. I.e. if a test triggers a
nested VM-Exit to L1 in response to a ucall, e.g. GUEST_SYNC(), then L2's
GPR state can be corrupted.
The issues manifests as an unexpected #GP in clear_bit() when running the
hyperv_evmcs test due to RBX being used to track the ucall object, and RBX
being clobbered by the nested VM-Exit. The problematic hyperv_evmcs
testcase is where L0 (test's host userspace) injects an NMI in response to
GUEST_SYNC(8) from L2, but the bug could "randomly" manifest in any test
that induces a nested VM-Exit from L0. The bug hasn't caused failures in
the past due to sheer dumb luck.
The obvious fix is to rework the nVMX helpers to save/restore L2 GPRs
across VM-Exit and VM-Enter, but that is a much bigger task and carries
its own risks, e.g. nSVM does save/restore GPRs, but not in a thread-safe
manner, and there is a _lot_ of cleanup that can be done to unify code
for doing VM-Enter on nVMX, nSVM, and eVMCS.
Link: https://lore.kernel.org/r/20230729003643.1053367-4-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Clean up TEST_ASSERT_EQ() so that the (mostly) raw code is captured in the
main assert message, not the helper macro's code. E.g. make this:
x86_64/tsc_msrs_test.c:106: __a == __b
pid=40470 tid=40470 errno=0 - Success
1 0x000000000040170e: main at tsc_msrs_test.c:106
2 0x0000000000416f23: __libc_start_call_main at libc-start.o:?
3 0x000000000041856f: __libc_start_main_impl at ??:?
4 0x0000000000401ef0: _start at ??:?
TEST_ASSERT_EQ(rounded_host_rdmsr(MSR_IA32_TSC), val + 1) failed.
rounded_host_rdmsr(MSR_IA32_TSC) is 0
val + 1 is 0x1
look like this:
x86_64/tsc_msrs_test.c:106: rounded_host_rdmsr(MSR_IA32_TSC) == val + 1
pid=5737 tid=5737 errno=0 - Success
1 0x0000000000401714: main at tsc_msrs_test.c:106
2 0x0000000000415c23: __libc_start_call_main at libc-start.o:?
3 0x000000000041726f: __libc_start_main_impl at ??:?
4 0x0000000000401e60: _start at ??:?
0 != 0x1 (rounded_host_rdmsr(MSR_IA32_TSC) != val + 1)
Opportunstically clean up the formatting of the entire macro.
Link: https://lore.kernel.org/r/20230729003643.1053367-3-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
There is already an ASSERT_EQ macro in the file
tools/testing/selftests/kselftest_harness.h, so currently KVM selftests
can't include test_util.h from the KVM selftests together with that file.
Rename the macro in the KVM selftests to TEST_ASSERT_EQ to avoid the
problem - it is also more similar to the other macros in test_util.h that
way.
Suggested-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Link: https://lore.kernel.org/r/20230712075910.22480-2-thuth@redhat.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Don't nullify "nodep" to NULL one line before it's set to "tmp".
Signed-off-by: Minjie Du <duminjie@vivo.com>
Link: https://lore.kernel.org/r/20230704122148.11573-1-duminjie@vivo.com
[sean: massage shortlog+changelog]
Signed-off-by: Sean Christopherson <seanjc@google.com>
With test case kvm_page_table_test, start time is acquired with
time type CLOCK_MONOTONIC_RAW, however end time in timespec_elapsed()
is acquired with time type CLOCK_MONOTONIC. This can cause inaccurate
elapsed time calculation due to mixing timebases, e.g. LoongArch in
particular will see weirdness.
Modify kvm_page_table_test to use unified time type CLOCK_MONOTONIC for
start time.
Signed-off-by: Bibo Mao <maobibo@loongson.cn>
Reviewed-by: Oliver Upton <oliver.upton@linux.dev>
Link: https://lore.kernel.org/r/20230731022405.854884-1-maobibo@loongson.cn
[sean: massage changelog]
Signed-off-by: Sean Christopherson <seanjc@google.com>
Attempt to set the to-be-queued exception to be both pending and injected
_after_ KVM_CAP_SYNC_REGS's kvm_vcpu_ioctl_x86_set_vcpu_events() squashes
the pending exception (if there's also an injected exception). Buggy KVM
versions will eventually yell loudly about having impossible state when
processing queued excpetions, e.g.
WARNING: CPU: 0 PID: 1115 at arch/x86/kvm/x86.c:10095 kvm_check_and_inject_events+0x220/0x500 [kvm]
arch/x86/kvm/x86.c:kvm_check_and_inject_events():
WARN_ON_ONCE(vcpu->arch.exception.injected &&
vcpu->arch.exception.pending);
Signed-off-by: Michal Luczaj <mhal@rbox.co>
Link: https://lore.kernel.org/r/20230728001606.2275586-3-mhal@rbox.co
[sean: split to separate patch, massage changelog and comment]
Signed-off-by: Sean Christopherson <seanjc@google.com>
Attempt to modify the to-be-injected exception vector to an illegal value
_after_ the sanity checks performed by KVM_CAP_SYNC_REGS's
arch/x86/kvm/x86.c:kvm_vcpu_ioctl_x86_set_vcpu_events(). Buggy KVM
versions will eventually yells loudly about attempting to inject a bogus
vector, e.g.
WARNING: CPU: 0 PID: 1107 at arch/x86/kvm/x86.c:547 kvm_check_and_inject_events+0x4a0/0x500 [kvm]
arch/x86/kvm/x86.c:exception_type():
WARN_ON(vector > 31 || vector == NMI_VECTOR)
Signed-off-by: Michal Luczaj <mhal@rbox.co>
Link: https://lore.kernel.org/r/20230728001606.2275586-3-mhal@rbox.co
[sean: split to separate patch]
Signed-off-by: Sean Christopherson <seanjc@google.com>
Attempt to modify vcpu->run->s.regs _after_ the sanity checks performed by
KVM_CAP_SYNC_REGS's arch/x86/kvm/x86.c:sync_regs(). This can lead to some
nonsensical vCPU states accompanied by kernel splats, e.g. disabling PAE
while long mode is enabled makes KVM all kinds of confused:
WARNING: CPU: 0 PID: 1142 at arch/x86/kvm/mmu/paging_tmpl.h:358 paging32_walk_addr_generic+0x431/0x8f0 [kvm]
arch/x86/kvm/mmu/paging_tmpl.h:
KVM_BUG_ON(is_long_mode(vcpu) && !is_pae(vcpu), vcpu->kvm)
Signed-off-by: Michal Luczaj <mhal@rbox.co>
Link: https://lore.kernel.org/r/20230728001606.2275586-3-mhal@rbox.co
[sean: see link]
Signed-off-by: Sean Christopherson <seanjc@google.com>
Add coverage to x86's set_sregs_test to verify KVM rejects vendor-agnostic
illegal CR0 values, i.e. CR0 values whose legality doesn't depend on the
current VMX mode. KVM historically has neglected to reject bad CR0s from
userspace, i.e. would happily accept a completely bogus CR0 via
KVM_SET_SREGS{2}.
Punt VMX specific subtests to future work, as they would require quite a
bit more effort, and KVM gets coverage for CR0 checks in general through
other means, e.g. KVM-Unit-Tests.
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20230613203037.1968489-4-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Verify that VM and vCPU binary stats files are usable even after userspace
has put its last direct reference to the VM. This is a regression test
for a UAF bug where KVM didn't gift the stats files a reference to the VM.
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20230711230131.648752-8-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Expand the binary stats test to verify that a stats fd can be dup()'d
and read, to (very) roughly simulate userspace passing around the file.
Adding the dup() test is primarily an intermediate step towards verifying
that userspace can read VM/vCPU stats before _and_ after userspace closes
its copy of the VM fd; the dup() test itself is only mildly interesting.
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20230711230131.648752-7-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Verify that KVM doesn't artificially limit KVM_GET_STATS_FD to a single
file per VM/vCPU. There's no known use case for getting multiple stats
fds, but it should work, and more importantly creating multiple files will
make it easier to test that KVM correct manages VM refcounts for stats
files.
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20230711230131.648752-6-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Explicitly free the all-encompassing vcpus array in the binary stats test
so that the test is consistent with respect to freeing all dynamically
allocated resources (versus letting them be freed on exit).
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20230711230131.648752-5-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Move the stats fd cleanup code into stats_test() and drop the
superfluous vm_stats_test() and vcpu_stats_test() helpers in order to
decouple creation of the stats file from consuming/testing the file
(deduping code is a bonus). This will make it easier to test various
edge cases related to stats, e.g. that userspace can dup() a stats fd,
that userspace can have multiple stats files for a singleVM/vCPU, etc.
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20230711230131.648752-4-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Use pread() with an explicit offset when reading the header and the header
name for a binary stats fd so that the common helper and the binary stats
test don't subtly rely on the file effectively being untouched, e.g. to
allow multiple reads of the header, name, etc.
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20230711230131.648752-3-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* Eager page splitting optimization for dirty logging, optionally
allowing for a VM to avoid the cost of hugepage splitting in the stage-2
fault path.
* Arm FF-A proxy for pKVM, allowing a pKVM host to safely interact with
services that live in the Secure world. pKVM intervenes on FF-A calls
to guarantee the host doesn't misuse memory donated to the hyp or a
pKVM guest.
* Support for running the split hypervisor with VHE enabled, known as
'hVHE' mode. This is extremely useful for testing the split
hypervisor on VHE-only systems, and paves the way for new use cases
that depend on having two TTBRs available at EL2.
* Generalized framework for configurable ID registers from userspace.
KVM/arm64 currently prevents arbitrary CPU feature set configuration
from userspace, but the intent is to relax this limitation and allow
userspace to select a feature set consistent with the CPU.
* Enable the use of Branch Target Identification (FEAT_BTI) in the
hypervisor.
* Use a separate set of pointer authentication keys for the hypervisor
when running in protected mode, as the host is untrusted at runtime.
* Ensure timer IRQs are consistently released in the init failure
paths.
* Avoid trapping CTR_EL0 on systems with Enhanced Virtualization Traps
(FEAT_EVT), as it is a register commonly read from userspace.
* Erratum workaround for the upcoming AmpereOne part, which has broken
hardware A/D state management.
RISC-V:
* Redirect AMO load/store misaligned traps to KVM guest
* Trap-n-emulate AIA in-kernel irqchip for KVM guest
* Svnapot support for KVM Guest
s390:
* New uvdevice secret API
* CMM selftest and fixes
* fix racy access to target CPU for diag 9c
x86:
* Fix missing/incorrect #GP checks on ENCLS
* Use standard mmu_notifier hooks for handling APIC access page
* Drop now unnecessary TR/TSS load after VM-Exit on AMD
* Print more descriptive information about the status of SEV and SEV-ES during
module load
* Add a test for splitting and reconstituting hugepages during and after
dirty logging
* Add support for CPU pinning in demand paging test
* Add support for AMD PerfMonV2, with a variety of cleanups and minor fixes
included along the way
* Add a "nx_huge_pages=never" option to effectively avoid creating NX hugepage
recovery threads (because nx_huge_pages=off can be toggled at runtime)
* Move handling of PAT out of MTRR code and dedup SVM+VMX code
* Fix output of PIC poll command emulation when there's an interrupt
* Add a maintainer's handbook to document KVM x86 processes, preferred coding
style, testing expectations, etc.
* Misc cleanups, fixes and comments
Generic:
* Miscellaneous bugfixes and cleanups
Selftests:
* Generate dependency files so that partial rebuilds work as expected
-----BEGIN PGP SIGNATURE-----
iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmSgHrIUHHBib256aW5p
QHJlZGhhdC5jb20ACgkQv/vSX3jHroORcAf+KkBlXwQMf+Q0Hy6Mfe0OtkKmh0Ae
6HJ6dsuMfOHhWv5kgukh+qvuGUGzHq+gpVKmZg2yP3h3cLHOLUAYMCDm+rjXyjsk
F4DbnJLfxq43Pe9PHRKFxxSecRcRYCNox0GD5UYL4PLKcH0FyfQrV+HVBK+GI8L3
FDzUcyJkR12Lcj1qf++7fsbzfOshL0AJPmidQCoc6wkLJpUEr/nYUqlI1Kx3YNuQ
LKmxFHS4l4/O/px3GKNDrLWDbrVlwciGIa3GZLS52PZdW3mAqT+cqcPcYK6SW71P
m1vE80VbNELX5q3YSRoOXtedoZ3Pk97LEmz/xQAsJ/jri0Z5Syk0Ok0m/Q==
=AMXp
-----END PGP SIGNATURE-----
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull kvm updates from Paolo Bonzini:
"ARM64:
- Eager page splitting optimization for dirty logging, optionally
allowing for a VM to avoid the cost of hugepage splitting in the
stage-2 fault path.
- Arm FF-A proxy for pKVM, allowing a pKVM host to safely interact
with services that live in the Secure world. pKVM intervenes on
FF-A calls to guarantee the host doesn't misuse memory donated to
the hyp or a pKVM guest.
- Support for running the split hypervisor with VHE enabled, known as
'hVHE' mode. This is extremely useful for testing the split
hypervisor on VHE-only systems, and paves the way for new use cases
that depend on having two TTBRs available at EL2.
- Generalized framework for configurable ID registers from userspace.
KVM/arm64 currently prevents arbitrary CPU feature set
configuration from userspace, but the intent is to relax this
limitation and allow userspace to select a feature set consistent
with the CPU.
- Enable the use of Branch Target Identification (FEAT_BTI) in the
hypervisor.
- Use a separate set of pointer authentication keys for the
hypervisor when running in protected mode, as the host is untrusted
at runtime.
- Ensure timer IRQs are consistently released in the init failure
paths.
- Avoid trapping CTR_EL0 on systems with Enhanced Virtualization
Traps (FEAT_EVT), as it is a register commonly read from userspace.
- Erratum workaround for the upcoming AmpereOne part, which has
broken hardware A/D state management.
RISC-V:
- Redirect AMO load/store misaligned traps to KVM guest
- Trap-n-emulate AIA in-kernel irqchip for KVM guest
- Svnapot support for KVM Guest
s390:
- New uvdevice secret API
- CMM selftest and fixes
- fix racy access to target CPU for diag 9c
x86:
- Fix missing/incorrect #GP checks on ENCLS
- Use standard mmu_notifier hooks for handling APIC access page
- Drop now unnecessary TR/TSS load after VM-Exit on AMD
- Print more descriptive information about the status of SEV and
SEV-ES during module load
- Add a test for splitting and reconstituting hugepages during and
after dirty logging
- Add support for CPU pinning in demand paging test
- Add support for AMD PerfMonV2, with a variety of cleanups and minor
fixes included along the way
- Add a "nx_huge_pages=never" option to effectively avoid creating NX
hugepage recovery threads (because nx_huge_pages=off can be toggled
at runtime)
- Move handling of PAT out of MTRR code and dedup SVM+VMX code
- Fix output of PIC poll command emulation when there's an interrupt
- Add a maintainer's handbook to document KVM x86 processes,
preferred coding style, testing expectations, etc.
- Misc cleanups, fixes and comments
Generic:
- Miscellaneous bugfixes and cleanups
Selftests:
- Generate dependency files so that partial rebuilds work as
expected"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (153 commits)
Documentation/process: Add a maintainer handbook for KVM x86
Documentation/process: Add a label for the tip tree handbook's coding style
KVM: arm64: Fix misuse of KVM_ARM_VCPU_POWER_OFF bit index
RISC-V: KVM: Remove unneeded semicolon
RISC-V: KVM: Allow Svnapot extension for Guest/VM
riscv: kvm: define vcpu_sbi_ext_pmu in header
RISC-V: KVM: Expose IMSIC registers as attributes of AIA irqchip
RISC-V: KVM: Add in-kernel virtualization of AIA IMSIC
RISC-V: KVM: Expose APLIC registers as attributes of AIA irqchip
RISC-V: KVM: Add in-kernel emulation of AIA APLIC
RISC-V: KVM: Implement device interface for AIA irqchip
RISC-V: KVM: Skeletal in-kernel AIA irqchip support
RISC-V: KVM: Set kvm_riscv_aia_nr_hgei to zero
RISC-V: KVM: Add APLIC related defines
RISC-V: KVM: Add IMSIC related defines
RISC-V: KVM: Implement guest external interrupt line management
KVM: x86: Remove PRIx* definitions as they are solely for user space
s390/uv: Update query for secret-UVCs
s390/uv: replace scnprintf with sysfs_emit
s390/uvdevice: Add 'Lock Secret Store' UVC
...
- Add a test for splitting and reconstituting hugepages during and after
dirty logging
- Add support for CPU pinning in demand paging test
- Generate dependency files so that partial rebuilds work as expected
- Misc cleanups and fixes
-----BEGIN PGP SIGNATURE-----
iQJGBAABCgAwFiEEMHr+pfEFOIzK+KY1YJEiAU0MEvkFAmSaKE8SHHNlYW5qY0Bn
b29nbGUuY29tAAoJEGCRIgFNDBL5xYsP/0D614Kig677ZtMCvrxmExHvsKLOCQ5i
hDOg9PgYFnqTBTwiK9d+QDm7xX+p9M6gHpqZD1pH8VbxQn+ybH6yP3LhIMgewJEd
Aa9rKspX7Kljmy4J5JkzQmAO2gRlrAaGMDxDZPpmPmHCt/mn/Z7nCPYDMVjjBJxe
p+OI4Pa6N1TYrixT7mGEcgvpjTGFXIzmJdNr+OxlGVwWp7ab/vvlqpiJbRLrRfLB
FHDIhI5ahxLhIsz0dSp97C7vrowJT17YdFPd5JAEod9HtXEYmdQR/yxOTexzmU5L
Zd+mu+c4px3cz+YjJO3LYk5CYVXnYXM4YKQUHxmS4jiXQ6X7irBZDLcNMc1Zflh8
QRjHvnxzmFWY4/UXVSPwRYfw+ZEgW5WOrytMp4e+i2W7ElAcHbdjJK0QHmATO19v
pz0uDjFadFG0PFWjE70HnpB93gU/okWjrKNQ0ARdgkTJFMfFHEFuwoBUphDUB6OZ
c1Nab8XTB11jjJM8coTJf+hwaoD3IWWzIyBXdqCOhue1kCzxDiSaFkHfDbN2rgZS
xUjJVljmaktGTysUYrK38jzUTDCdJWLFEyq4fxxwVCvGkF96pCNZepM6wtMwIzNq
ku7VYiysBQuqMGIOs82v5X0Vs7zXqM4K/kU/Z8MD8eXt5tVuibpqEoCTYvw22V1b
qDViIde6kQey
=Wwws
-----END PGP SIGNATURE-----
Merge tag 'kvm-x86-selftests-6.5' of https://github.com/kvm-x86/linux into HEAD
KVM selftests changes for 6.5:
- Add a test for splitting and reconstituting hugepages during and after
dirty logging
- Add support for CPU pinning in demand paging test
- Generate dependency files so that partial rebuilds work as expected
- Misc cleanups and fixes
- Move handling of PAT out of MTRR code and dedup SVM+VMX code
- Fix output of PIC poll command emulation when there's an interrupt
- Fix a longstanding bug in the reporting of the number of entries returned by
KVM_GET_CPUID2
- Add a maintainer's handbook to document KVM x86 processes, preferred coding
style, testing expectations, etc.
- Misc cleanups
-----BEGIN PGP SIGNATURE-----
iQJGBAABCgAwFiEEMHr+pfEFOIzK+KY1YJEiAU0MEvkFAmSaGMMSHHNlYW5qY0Bn
b29nbGUuY29tAAoJEGCRIgFNDBL5iDIP/0PwY3J5odTEUTnAyuDFPimd5PBt9k/O
B414wdpSKVgzq+0An4qM9mKRnklVIh2p8QqQTvDhcBUg3xb6CX9xZ4ery7hp/T5O
tr5bAXs2AYX6jpxvsopt+w+E9j6fvkJhcJCRU9im3QbrqwUE+ecyU5OHvmv2n/GO
syVZJbPOYuoLPKDjlSMrScE6fWEl9UOvHc5BK/vafTeyisMG3vv1BSmJj6GuiNNk
TS1RRIg//cOZghQyDfdXt0azTmakNZyNn35xnoX9x8SRmdRykyUjQeHmeqWxPDso
kiGO+CGancfS57S6ZtCkJjqEWZ1o/zKdOxr8MMf/3nJhv4kY7/5XtlVoACv5soW9
bZEmNiXIaSbvKNMwAlLJxHFbLa1sMdSCb345CIuMdt5QiWJ53ZiTyIAJX6+eL+Zf
8nkeekgPf5VUs6Zt0RdRPyvo+W7Vp9BtI87yDXm1nQKpbys2pt6CD3YB/oF4QViG
a5cyGoFuqRQbS3nmbshIlR7EanTuxbhLZKrNrFnolZ5e624h3Cnk2hVsfTznVGiX
vNHWM80phk1CWB9McErrZVkGfjlyVyBL13CBB2XF7Dl6PfF6/N22a9bOuTJD3tvk
PlNx4hvZm3esvvyGpjfbSajTKYE8O7rxiE1KrF0BpZ5IUl5WSiTr6XCy/yI/mIeM
hay2IWhPOF2z
=D0BH
-----END PGP SIGNATURE-----
Merge tag 'kvm-x86-misc-6.5' of https://github.com/kvm-x86/linux into HEAD
KVM x86 changes for 6.5:
* Move handling of PAT out of MTRR code and dedup SVM+VMX code
* Fix output of PIC poll command emulation when there's an interrupt
* Add a maintainer's handbook to document KVM x86 processes, preferred coding
style, testing expectations, etc.
* Misc cleanups
- Support for the Armv8.9 Permission Indirection Extensions. While this
feature doesn't add new functionality, it enables future support for
Guarded Control Stacks (GCS) and Permission Overlays.
- User-space support for the Armv8.8 memcpy/memset instructions.
- arm64 perf: support the HiSilicon SoC uncore PMU, Arm CMN sysfs
identifier, support for the NXP i.MX9 SoC DDRC PMU, fixes and
cleanups.
- Removal of superfluous ISBs on context switch (following retrospective
architecture tightening).
- Decode the ISS2 register during faults for additional information to
help with debugging.
- KPTI clean-up/simplification of the trampoline exit code.
- Addressing several -Wmissing-prototype warnings.
- Kselftest improvements for signal handling and ptrace.
- Fix TPIDR2_EL0 restoring on sigreturn
- Clean-up, robustness improvements of the module allocation code.
- More sysreg conversions to the automatic register/bitfields
generation.
- CPU capabilities handling cleanup.
- Arm documentation updates: ACPI, ptdump.
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEE5RElWfyWxS+3PLO2a9axLQDIXvEFAmSZyXwACgkQa9axLQDI
XvEM3BAAkMzHGTDhNVNGLSO07PVmdzTiuoNFlfX7bktdIb+El76VhGXhHeEywTje
wAq9JIYBf/Src2HbgZLwuly8Fn2vCrhyp++bRJW82o9SiBnx91+0mH7zLf+XHiQ4
FHKZxvaE6PaDc9o8WXr+IeucPRb5W2HgH37mktxh7ShMLsxorwS94V1oL29A2mV9
t4XQY7/tdmrDKMKMuQnIr1DurNXBhJ1OKvDnSN/Zzm96JOU/QQ32N2wEE7Y0aHOh
bBzClksx2mguQqV515mySGFe5yy9NqaAfx2hTAciq+1rwbiCSjqQQmEswoUH8WLX
JNLylxADWT2qXThFe8W6uyFzEshSAoI1yKxlCGuOsQpu4sFJtR8oh8dDj5669g4Y
j0jR87r9rWm0iyYI5I+XDMxFVyuh2eFInvjtynRbj+mtS3f/SkO8fXG6Uya+I76C
UGLlBUKnLr/zHuIGN0LE/V4dYTqsi9EtHoc2Am2xCZsS9jqkxKJG8C93Zsm4GlJC
OcUtBSjW0rYJq+tLk0yhR6hbh59QbiRh05KnZsPpOKi8purlKSL9ZNPRi7TndLdm
HjHUY+vQwNIpPIb6pyK4aYZuTdGEQIsQykQ8CULiIGlHi7kc4g9029ouLc5bBAeU
mU8D62I2ztzPoYljYWNtO7K6g/Dq8c4lpsaMAJ+1Wp2iq2xBJjo=
=rNBK
-----END PGP SIGNATURE-----
Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 updates from Catalin Marinas:
"Notable features are user-space support for the memcpy/memset
instructions and the permission indirection extension.
- Support for the Armv8.9 Permission Indirection Extensions. While
this feature doesn't add new functionality, it enables future
support for Guarded Control Stacks (GCS) and Permission Overlays
- User-space support for the Armv8.8 memcpy/memset instructions
- arm64 perf: support the HiSilicon SoC uncore PMU, Arm CMN sysfs
identifier, support for the NXP i.MX9 SoC DDRC PMU, fixes and
cleanups
- Removal of superfluous ISBs on context switch (following
retrospective architecture tightening)
- Decode the ISS2 register during faults for additional information
to help with debugging
- KPTI clean-up/simplification of the trampoline exit code
- Addressing several -Wmissing-prototype warnings
- Kselftest improvements for signal handling and ptrace
- Fix TPIDR2_EL0 restoring on sigreturn
- Clean-up, robustness improvements of the module allocation code
- More sysreg conversions to the automatic register/bitfields
generation
- CPU capabilities handling cleanup
- Arm documentation updates: ACPI, ptdump"
* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (124 commits)
kselftest/arm64: Add a test case for TPIDR2 restore
arm64/signal: Restore TPIDR2 register rather than memory state
arm64: alternatives: make clean_dcache_range_nopatch() noinstr-safe
Documentation/arm64: Add ptdump documentation
arm64: hibernate: remove WARN_ON in save_processor_state
kselftest/arm64: Log signal code and address for unexpected signals
docs: perf: Fix warning from 'make htmldocs' in hisi-pmu.rst
arm64/fpsimd: Exit streaming mode when flushing tasks
docs: perf: Add new description for HiSilicon UC PMU
drivers/perf: hisi: Add support for HiSilicon UC PMU driver
drivers/perf: hisi: Add support for HiSilicon H60PA and PAv3 PMU driver
perf: arm_cspmu: Add missing MODULE_DEVICE_TABLE
perf/arm-cmn: Add sysfs identifier
perf/arm-cmn: Revamp model detection
perf/arm_dmc620: Add cpumask
arm64: mm: fix VA-range sanity check
arm64/mm: remove now-superfluous ISBs from TTBR writes
Documentation/arm64: Update ACPI tables from BBR
Documentation/arm64: Update references in arm-acpi
Documentation/arm64: Update ARM and arch reference
...
Add a selftest for CMMA migration on s390.
The tests cover:
- interaction of dirty tracking and migration mode, see my recent patch
"KVM: s390: disable migration mode when dirty tracking is disabled" [1],
- several invalid calls of KVM_S390_GET_CMMA_BITS, for example: invalid
flags, CMMA support off, with/without peeking
- ensure KVM_S390_GET_CMMA_BITS initally reports all pages as dirty,
- ensure KVM_S390_GET_CMMA_BITS properly skips over holes in memslots, but
also non-dirty pages
Note that without the patch at [1] and the small fix in this series, the
selftests will fail.
[1] https://lore.kernel.org/all/20230127140532.230651-2-nrb@linux.ibm.com/
Signed-off-by: Nico Boehr <nrb@linux.ibm.com>
Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
Message-Id: <20230324145424.293889-3-nrb@linux.ibm.com>
Signed-off-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
[frankja@linux.ibm.com: squashed
20230606150510.671301-1-nrb@linux.ibm.com / "KVM: s390: selftests:
CMMA: don't run if CMMA not supported"]
Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
Add "-MD" in CFLAGS to generate dependency files. Currently, each
time a header file is updated in KVM selftest, we will have to run
"make clean && make" to rebuild the whole test suite. By adding new
compiling flags and dependent rules in Makefile, we do not need to
make clean && make each time a header file is updated.
Signed-off-by: Yu Zhang <yu.c.zhang@linux.intel.com>
Link: https://lore.kernel.org/r/20230601080338.212942-1-yu.c.zhang@linux.intel.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Mimic the dirty log test and allow the user to pin demand paging test
tasks to physical CPUs.
Put the help message into a general helper as suggested by Sean.
Signed-off-by: Peter Xu <peterx@redhat.com>
[sean: rebase, tweak arg ordering, add "print" to helper, print program name]
Link: https://lore.kernel.org/r/20230607001226.1398889-1-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
This stops the test complaining about missing registers, when running
on an older kernel that does not support newer features.
Signed-off-by: Joey Gouly <joey.gouly@arm.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Oliver Upton <oliver.upton@linux.dev>
Cc: Mark Brown <broonie@kernel.org>
Cc: Shuah Khan <shuah@kernel.org>
Link: https://lore.kernel.org/r/20230606145859.697944-20-joey.gouly@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Keep switching between LAPIC_MODE_X2APIC and LAPIC_MODE_DISABLED during
APIC map construction to hunt for TOCTOU bugs in KVM. KVM's optimized map
recalc makes multiple passes over the list of vCPUs, and the calculations
ignore vCPU's whose APIC is hardware-disabled, i.e. there's a window where
toggling LAPIC_MODE_DISABLED is quite interesting.
Signed-off-by: Michal Luczaj <mhal@rbox.co>
Co-developed-by: Sean Christopherson <seanjc@google.com>
Link: https://lore.kernel.org/r/20230602233250.1014316-4-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Verify that KVM reports the actual number of CPUID entries on success, but
doesn't touch the userspace struct on failure (which for better or worse,
is KVM's ABI).
Link: https://lore.kernel.org/r/20230526210340.2799158-3-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Add a test for page splitting during dirty logging and for hugepage
recovery after dirty logging.
Page splitting represents non-trivial behavior, which is complicated
by MANUAL_PROTECT mode, which causes pages to be split on the first
clear, instead of when dirty logging is enabled.
Add a test which makes assertions about page counts to help define the
expected behavior of page splitting and to provide needed coverage of the
behavior. This also helps ensure that a failure in eager page splitting
is not covered up by splitting in the vCPU path.
Tested by running the test on an Intel Haswell machine w/wo
MANUAL_PROTECT.
Reviewed-by: Vipin Sharma <vipinsh@google.com>
Signed-off-by: Ben Gardon <bgardon@google.com>
Link: https://lore.kernel.org/r/20230131181820.179033-3-bgardon@google.com
[sean: let the user run without hugetlb, as suggested by Paolo]
Signed-off-by: Sean Christopherson <seanjc@google.com>
Move some helper functions from dirty_log_perf_test.c to the memstress
library so that they can be used in a future commit which tests page
splitting during dirty logging.
Reviewed-by: Vipin Sharma <vipinsh@google.com>
Signed-off-by: Ben Gardon <bgardon@google.com>
Link: https://lore.kernel.org/r/20230131181820.179033-2-bgardon@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Access the same memory addresses on each iteration of the memstress
guest code. This ensures that the state of KVM's page tables
is the same after every iteration, including the pages that host the
guest page tables for args and vcpu_args.
This difference is visible when running the proposed
dirty_log_page_splitting_test[*] on AMD, or on Intel with pml=0 and
eptad=0. The tests fail due to different semantics of dirty bits for
page-table pages on AMD (and eptad=0) and Intel. Both AMD and Intel with
eptad=0 treat page-table accesses as writes, therefore more pages are
dropped before the repopulation phase when dirty logging is disabled.
The "missing" page had been included in the population phase because it
hosts the page tables for vcpu_args, but repopulation does not need it."
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Vipin Sharma <vipinsh@google.com>
Link: https://lore.kernel.org/r/20230412200913.1570873-1-pbonzini@redhat.com
[sean: add additional details in changelog]
Signed-off-by: Sean Christopherson <seanjc@google.com>
There's one PER_VCPU_DEBUG in per-vcpu uffd threads but it's never hit.
Trigger that when quit in normal ways (kick pollfd[1]), meanwhile fix the
number of nanosec calculation.
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: James Houghton <jthoughton@google.com>
Link: https://lore.kernel.org/r/20230427201112.2164776-3-peterx@redhat.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
This fixes two things:
- Unbreaks MISSING mode test on anonymous memory type
- Prefault alias mem before uffd thread creations, otherwise the uffd
thread timing will be inaccurate when guest mem size is large, because
it'll take prefault time into total time.
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: James Houghton <jthoughton@google.com>
Link: https://lore.kernel.org/r/20230427201112.2164776-2-peterx@redhat.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Refactor the nested TSC scaling test's check on a stable system TSC to
use TEST_REQUIRE() to do the heavy lifting when the system doesn't have
a stable TSC. Using a helper+TEST_REQUIRE() eliminates the need for
gotos and a custom message.
Cc: Hao Ge <gehao@kylinos.cn>
Cc: Vipin Sharma <vipinsh@google.com>
Link: https://lore.kernel.org/r/20230406001724.706668-1-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
- Don't advertisze XTILE_CFG in KVM_GET_SUPPORTED_CPUID if XTILE_DATA is
not being reported due to userspace not opting in via prctl()
- Overhaul the AMX selftests to improve coverage and cleanup the test
- Misc cleanups
-----BEGIN PGP SIGNATURE-----
iQJGBAABCgAwFiEEMHr+pfEFOIzK+KY1YJEiAU0MEvkFAmRGt50SHHNlYW5qY0Bn
b29nbGUuY29tAAoJEGCRIgFNDBL5MskP/2PhSrdgHxCwfpqpdVe/q5OWwFuhn3wG
f5QKMpEBg4wJFeIE3eGJEaDlg776nWtWDNgUmqdjoZ8vyyadkPX9CV2Y2Hq0M7Tw
d0gKPjQrz2BavyDYoPNfs4pfshs4EvDTswBkhdAt8KTZhGZosJOywQIp61V3ePqr
1rDP6C4+CmwTRAK0f7egslyJ2pZXiUcvhITvzx8XhIAQh6nEK4gUZ/l3hLmg38kD
Af23kiLnP8lHUUx4BQtRAnTw0SZXJ8DcKtoFkzEH8mdj4g6EqXpxy48zuyZcqWVi
4XIFr+WECPsV5gdqWN9rMDqIG2ib+2heKDmcdUptcVuvr1ktv0reQybmgVck4CKX
fTAdu86/LBaQmIHwNHaNFPwdUby4QQZ8ajafPC62oc+B6N1lQg8bbCwnvO6KGlGl
FaQTnzaZq7ft4tfQRXOMu1AbLZLK7dIqJHHhxR3MkBkd4MAcZ1bVKkvlJLqsOKNw
TEsreXErY7AsegZK73Rn4IN/CJGBof5bZ2NIchmiN+0UfMsd9zGn66Als6oRNh4E
tRUhFONPIEmydy9UB50qe6b98ElB6R++opZbvkVW2hy8lMy3iJrCvUbOs1nx3wbn
cxvIuTfw/dAFf70S03/zudf7lYHs2wKV1rrIAebyTd4NnvWdVB8OaSHgZswMgVjb
UzzQfnQ+u9so
=BY10
-----END PGP SIGNATURE-----
Merge tag 'kvm-x86-selftests-6.4' of https://github.com/kvm-x86/linux into HEAD
KVM selftests, and an AMX/XCR0 bugfix, for 6.4:
- Don't advertise XTILE_CFG in KVM_GET_SUPPORTED_CPUID if XTILE_DATA is
not being reported due to userspace not opting in via prctl()
- Overhaul the AMX selftests to improve coverage and cleanup the test
- Misc cleanups
- Disallow virtualizing legacy LBRs if architectural LBRs are available,
the two are mutually exclusive in hardware
- Disallow writes to immutable feature MSRs (notably PERF_CAPABILITIES)
after KVM_RUN, and overhaul the vmx_pmu_caps selftest to better
validate PERF_CAPABILITIES
- Apply PMU filters to emulated events and add test coverage to the
pmu_event_filter selftest
- Misc cleanups and fixes
-----BEGIN PGP SIGNATURE-----
iQJGBAABCgAwFiEEMHr+pfEFOIzK+KY1YJEiAU0MEvkFAmRGtd4SHHNlYW5qY0Bn
b29nbGUuY29tAAoJEGCRIgFNDBL5Z9kP/i3WZ40hevvQvB/5cEpxxmxYDwCYnnjM
hiQgK5jT4SrMTmVjLgkNdI2PogQoS4CX+GC7lcA9bvse84hjuPvgOflb2B+p2UQi
Ytbr9g/tfKNIpnKIk9mcPcSObN9vm2Kgt7n28rtPrHWj89eQzgc66eijqdpKBLxA
c3crVR8krwYAQK0tmzHq1+H6hB369YbHAHyTTRRI/bNWnqKblnvUbt0NL2aBusa9
rNMaOdRtinLpy2dmuX/b3japRB8QTnlf7zpPIF4cBEhbYXy5woClZpf1D2fCA6Er
XFbEoYawMVd9UeJYbW4z5yErLT83eYoGp4U0eFXWp6fvh8nZlgCGvBKE9g4mmqwj
aSLaTR5eVN2qlw6jXVeg3unCo8Eyl36AwYwve2L6sFmBvZvNV5iz2eQ7rrOe4oE3
dnTUaLQ8I2SVg04MbYmCq5W+frTL/I7kqNpbccL1Z3R5WO4y5gz63mug6NfLIvhR
t45TAIaifxBfcXQsBZM3v2KUK/xQrD3AbJmFKh54L2CKqiGaNWsMLX+6NZ7LZWgf
8rEqsVkkQDgF7z8eXai4TR26nYfSX6g9gDqtOH73L87aJ7PJk5cRoDWQ1sWs1e/l
4HA/L0Bo/3pnKAa0ZWxJOixmzqY49gNQf3dj8gt3jk3y2ijbAivshiSpPBmIxn0u
QLeOf/LGvipl
=m18F
-----END PGP SIGNATURE-----
Merge tag 'kvm-x86-pmu-6.4' of https://github.com/kvm-x86/linux into HEAD
KVM x86 PMU changes for 6.4:
- Disallow virtualizing legacy LBRs if architectural LBRs are available,
the two are mutually exclusive in hardware
- Disallow writes to immutable feature MSRs (notably PERF_CAPABILITIES)
after KVM_RUN, and overhaul the vmx_pmu_caps selftest to better
validate PERF_CAPABILITIES
- Apply PMU filters to emulated events and add test coverage to the
pmu_event_filter selftest
- Misc cleanups and fixes
* kvm-arm64/smccc-filtering:
: .
: SMCCC call filtering and forwarding to userspace, courtesy of
: Oliver Upton. From the cover letter:
:
: "The Arm SMCCC is rather prescriptive in regards to the allocation of
: SMCCC function ID ranges. Many of the hypercall ranges have an
: associated specification from Arm (FF-A, PSCI, SDEI, etc.) with some
: room for vendor-specific implementations.
:
: The ever-expanding SMCCC surface leaves a lot of work within KVM for
: providing new features. Furthermore, KVM implements its own
: vendor-specific ABI, with little room for other implementations (like
: Hyper-V, for example). Rather than cramming it all into the kernel we
: should provide a way for userspace to handle hypercalls."
: .
KVM: selftests: Fix spelling mistake "KVM_HYPERCAL_EXIT_SMC" -> "KVM_HYPERCALL_EXIT_SMC"
KVM: arm64: Test that SMC64 arch calls are reserved
KVM: arm64: Prevent userspace from handling SMC64 arch range
KVM: arm64: Expose SMC/HVC width to userspace
KVM: selftests: Add test for SMCCC filter
KVM: selftests: Add a helper for SMCCC calls with SMC instruction
KVM: arm64: Let errors from SMCCC emulation to reach userspace
KVM: arm64: Return NOT_SUPPORTED to guest for unknown PSCI version
KVM: arm64: Introduce support for userspace SMCCC filtering
KVM: arm64: Add support for KVM_EXIT_HYPERCALL
KVM: arm64: Use a maple tree to represent the SMCCC filter
KVM: arm64: Refactor hvc filtering to support different actions
KVM: arm64: Start handling SMCs from EL1
KVM: arm64: Rename SMC/HVC call handler to reflect reality
KVM: arm64: Add vm fd device attribute accessors
KVM: arm64: Add a helper to check if a VM has ran once
KVM: x86: Redefine 'longmode' as a flag for KVM_EXIT_HYPERCALL
Signed-off-by: Marc Zyngier <maz@kernel.org>