linux-yocto/arch/x86/kvm
Sean Christopherson e0d9a7cf37 KVM: SVM: Reject SEV{-ES} intra host migration if vCPU creation is in-flight
commit ecf371f8b0 upstream.

Reject migration of SEV{-ES} state if either the source or destination VM
is actively creating a vCPU, i.e. if kvm_vm_ioctl_create_vcpu() is in the
section between incrementing created_vcpus and online_vcpus.  The bulk of
vCPU creation runs _outside_ of kvm->lock to allow creating multiple vCPUs
in parallel, and so sev_info.es_active can get toggled from false=>true in
the destination VM after (or during) svm_vcpu_create(), resulting in an
SEV{-ES} VM effectively having a non-SEV{-ES} vCPU.

The issue manifests most visibly as a crash when trying to free a vCPU's
NULL VMSA page in an SEV-ES VM, but any number of things can go wrong.

  BUG: unable to handle page fault for address: ffffebde00000000
  #PF: supervisor read access in kernel mode
  #PF: error_code(0x0000) - not-present page
  PGD 0 P4D 0
  Oops: Oops: 0000 [#1] SMP KASAN NOPTI
  CPU: 227 UID: 0 PID: 64063 Comm: syz.5.60023 Tainted: G     U     O        6.15.0-smp-DEV #2 NONE
  Tainted: [U]=USER, [O]=OOT_MODULE
  Hardware name: Google, Inc. Arcadia_IT_80/Arcadia_IT_80, BIOS 12.52.0-0 10/28/2024
  RIP: 0010:constant_test_bit arch/x86/include/asm/bitops.h:206 [inline]
  RIP: 0010:arch_test_bit arch/x86/include/asm/bitops.h:238 [inline]
  RIP: 0010:_test_bit include/asm-generic/bitops/instrumented-non-atomic.h:142 [inline]
  RIP: 0010:PageHead include/linux/page-flags.h:866 [inline]
  RIP: 0010:___free_pages+0x3e/0x120 mm/page_alloc.c:5067
  Code: <49> f7 06 40 00 00 00 75 05 45 31 ff eb 0c 66 90 4c 89 f0 4c 39 f0
  RSP: 0018:ffff8984551978d0 EFLAGS: 00010246
  RAX: 0000777f80000001 RBX: 0000000000000000 RCX: ffffffff918aeb98
  RDX: 0000000000000000 RSI: 0000000000000008 RDI: ffffebde00000000
  RBP: 0000000000000000 R08: ffffebde00000007 R09: 1ffffd7bc0000000
  R10: dffffc0000000000 R11: fffff97bc0000001 R12: dffffc0000000000
  R13: ffff8983e19751a8 R14: ffffebde00000000 R15: 1ffffd7bc0000000
  FS:  0000000000000000(0000) GS:ffff89ee661d3000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: ffffebde00000000 CR3: 000000793ceaa000 CR4: 0000000000350ef0
  DR0: 0000000000000000 DR1: 0000000000000b5f DR2: 0000000000000000
  DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
  Call Trace:
   <TASK>
   sev_free_vcpu+0x413/0x630 arch/x86/kvm/svm/sev.c:3169
   svm_vcpu_free+0x13a/0x2a0 arch/x86/kvm/svm/svm.c:1515
   kvm_arch_vcpu_destroy+0x6a/0x1d0 arch/x86/kvm/x86.c:12396
   kvm_vcpu_destroy virt/kvm/kvm_main.c:470 [inline]
   kvm_destroy_vcpus+0xd1/0x300 virt/kvm/kvm_main.c:490
   kvm_arch_destroy_vm+0x636/0x820 arch/x86/kvm/x86.c:12895
   kvm_put_kvm+0xb8e/0xfb0 virt/kvm/kvm_main.c:1310
   kvm_vm_release+0x48/0x60 virt/kvm/kvm_main.c:1369
   __fput+0x3e4/0x9e0 fs/file_table.c:465
   task_work_run+0x1a9/0x220 kernel/task_work.c:227
   exit_task_work include/linux/task_work.h:40 [inline]
   do_exit+0x7f0/0x25b0 kernel/exit.c:953
   do_group_exit+0x203/0x2d0 kernel/exit.c:1102
   get_signal+0x1357/0x1480 kernel/signal.c:3034
   arch_do_signal_or_restart+0x40/0x690 arch/x86/kernel/signal.c:337
   exit_to_user_mode_loop kernel/entry/common.c:111 [inline]
   exit_to_user_mode_prepare include/linux/entry-common.h:329 [inline]
   __syscall_exit_to_user_mode_work kernel/entry/common.c:207 [inline]
   syscall_exit_to_user_mode+0x67/0xb0 kernel/entry/common.c:218
   do_syscall_64+0x7c/0x150 arch/x86/entry/syscall_64.c:100
   entry_SYSCALL_64_after_hwframe+0x76/0x7e
  RIP: 0033:0x7f87a898e969
   </TASK>
  Modules linked in: gq(O)
  gsmi: Log Shutdown Reason 0x03
  CR2: ffffebde00000000
  ---[ end trace 0000000000000000 ]---

Deliberately don't check for a NULL VMSA when freeing the vCPU, as crashing
the host is likely desirable due to the VMSA being consumed by hardware.
E.g. if KVM manages to allow VMRUN on the vCPU, hardware may read/write a
bogus VMSA page.  Accessing PFN 0 is "fine"-ish now that it's sequestered
away thanks to L1TF, but panicking in this scenario is preferable to
potentially running with corrupted state.

Reported-by: Alexander Potapenko <glider@google.com>
Tested-by: Alexander Potapenko <glider@google.com>
Fixes: 0b020f5af0 ("KVM: SEV: Add support for SEV-ES intra host migration")
Fixes: b56639318b ("KVM: SEV: Add support for SEV intra host migration")
Cc: stable@vger.kernel.org
Cc: James Houghton <jthoughton@google.com>
Cc: Peter Gonda <pgonda@google.com>
Reviewed-by: Liam Merwick <liam.merwick@oracle.com>
Tested-by: Liam Merwick <liam.merwick@oracle.com>
Reviewed-by: James Houghton <jthoughton@google.com>
Link: https://lore.kernel.org/r/20250602224459.41505-2-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-07-17 18:32:07 +02:00
..
mmu KVM: nSVM: Enter guest mode before initializing nested NPT MMU 2025-02-21 13:50:01 +01:00
svm KVM: SVM: Reject SEV{-ES} intra host migration if vCPU creation is in-flight 2025-07-17 18:32:07 +02:00
vmx x86/bugs: Rename MDS machinery to something more generic 2025-07-10 15:59:53 +02:00
.gitignore KVM: x86: use a separate asm-offsets.c file 2022-11-09 12:10:17 -05:00
cpuid.c KVM: SVM: Advertise TSA CPUID bits to guests 2025-07-10 15:59:54 +02:00
cpuid.h KVM: x86: Cache CPUID.0xD XSTATE offsets+sizes during module init 2024-12-27 13:52:55 +01:00
debugfs.c KVM: x86: Reduce refcount if single_open() fails in kvm_mmu_rmaps_stat_open() 2022-10-27 04:41:54 -04:00
emulate.c KVM: x86: smm: number of GPRs in the SMRAM image depends on the image format 2022-10-28 06:10:30 -04:00
fpu.h
hyperv.c KVM: x86: Reject Hyper-V's SEND_IPI hypercalls if local APIC isn't in-kernel 2025-02-21 13:50:01 +01:00
hyperv.h KVM: x86: Report error when setting CPUID if Hyper-V allocation fails 2022-09-26 12:02:39 -04:00
i8254.c KVM: x86: PIT: Preserve state of speaker port data bit 2022-06-08 13:06:20 -04:00
i8254.h KVM: x86: PIT: Preserve state of speaker port data bit 2022-06-08 13:06:20 -04:00
i8259.c KVM: x86/i8259: Remove a dead store of irq in a conditional block 2022-04-02 05:41:19 -04:00
ioapic.c KVM: x86/ioapic: Remove unused "addr" and "length" of ioapic_read_indirect() 2022-02-10 13:47:13 -05:00
ioapic.h x86/kvm: remove unused ack_notifier callbacks 2021-11-18 07:05:57 -05:00
irq_comm.c KVM: x86/xen: Make kvm_xen_set_evtchn() reusable from other places 2022-04-02 05:41:14 -04:00
irq.c KVM: x86/xen: handle PV timers oneshot mode 2022-04-02 05:41:16 -04:00
irq.h x86/kvm: remove unused ack_notifier callbacks 2021-11-18 07:05:57 -05:00
Kconfig KVM: x86: Select CONFIG_HAVE_KVM_DIRTY_RING_ACQ_REL 2022-09-29 10:23:08 +01:00
kvm_cache_regs.h KVM: VMX: Make CR0.WP a guest owned bit 2023-05-17 11:53:29 +02:00
kvm_emulate.h KVM: x86: Bug the VM if the emulator accesses a non-existent GPR 2022-06-10 10:01:33 -04:00
kvm_onhyperv.c KVM: x86: Uninline and export hv_track_root_tdp() 2022-02-10 13:47:19 -05:00
kvm_onhyperv.h KVM: SVM: Flush Hyper-V TLB when required 2023-04-20 12:35:12 +02:00
kvm-asm-offsets.c KVM: SVM: move MSR_IA32_SPEC_CTRL save/restore to assembly 2022-11-09 12:25:53 -05:00
lapic.c KVM: x86: Unconditionally set irr_pending when updating APICv state 2024-11-22 15:37:31 +01:00
lapic.h KVM: x86: Rename kvm_apic_has_events() to make it INIT/SIPI specific 2022-09-26 12:37:18 -04:00
Makefile KVM: SVM: replace regs argument of __svm_vcpu_run() with vcpu_svm 2022-11-09 12:16:34 -05:00
mmu.h KVM: x86/mmu: Refresh CR0.WP prior to checking for emulated permission faults 2023-05-17 11:53:30 +02:00
mtrr.c
pmu.c KVM: x86: Make use of kvm_read_cr*_bits() when testing bits 2023-05-17 11:53:29 +02:00
pmu.h KVM: x86/pmu: Truncate counter value to allowed width on write 2023-11-02 09:35:21 +01:00
reverse_cpuid.h KVM: SVM: Advertise TSA CPUID bits to guests 2025-07-10 15:59:54 +02:00
trace.h KVM: SVM: Use unsigned integers when dealing with ASIDs 2024-04-10 16:28:30 +02:00
tss.h
x86.c x86/its: Enumerate Indirect Target Selection (ITS) bug 2025-05-18 08:21:26 +02:00
x86.h KVM: x86: Track supported PERF_CAPABILITIES in kvm_caps 2023-05-17 11:53:27 +02:00
xen.c KVM: x86/xen: Allow 'out of range' event channel ports in IRQ routing table. 2025-07-17 18:32:07 +02:00
xen.h KVM: x86/xen: inject vCPU upcall vector when local APIC is enabled 2024-04-03 15:19:30 +02:00