linux-yocto/arch/arm64/include/asm
Mark Rutland e500dff1e4 arm64/fpsimd: Do not discard modified SVE state
[ Upstream commit 398edaa12f9cf2be7902f306fc023c20e3ebd3e4 ]

Historically SVE state was discarded deterministically early in the
syscall entry path, before ptrace is notified of syscall entry. This
permitted ptrace to modify SVE state before and after the "real" syscall
logic was executed, with the modified state being retained.

This behaviour was changed by commit:

  8c845e2731 ("arm64/sve: Leave SVE enabled on syscall if we don't context switch")

That commit was intended to speed up workloads that used SVE by
opportunistically leaving SVE enabled when returning from a syscall.
The syscall entry logic was modified to truncate the SVE state without
disabling userspace access to SVE, and fpsimd_save_user_state() was
modified to discard userspace SVE state whenever
in_syscall(current_pt_regs()) is true, i.e. when
current_pt_regs()->syscallno != NO_SYSCALL.

Leaving SVE enabled opportunistically resulted in a couple of changes to
userspace-visible behaviour which weren't described at the time, but are
logical consequences of that choice:

* Signal handlers can observe the type of saved state in the signal's
  sve_context record. When the kernel only tracks FPSIMD state, the 'vq'
  field is 0 and there is no space allocated for register contents. When
  the kernel tracks SVE state, the 'vq' field is non-zero and the
  register contents are saved into the record.

  As a result of the above commit, 'vq' (and the presence of SVE
  register state) is non-deterministically zero or non-zero for a period
  of time after a syscall. The effective register state is still
  deterministic.

  Hopefully no-one relies on this being deterministic. In general,
  handlers for asynchronous events cannot expect a deterministic state.

* Similarly to signal handlers, ptrace requests can observe the type of
  saved state in the NT_ARM_SVE and NT_ARM_SSVE regsets, as this is
  exposed in the header flags. As a result of the above commit, this is
  now in a non-deterministic state after a syscall. The effective
  register state is still deterministic.

  Hopefully no-one relies on this being deterministic. In general,
  debuggers would have to handle this changing at arbitrary points
  during program flow.

Discarding the SVE state within fpsimd_save_user_state() resulted in
other changes to userspace-visible behaviour which are not desirable:

* A ptrace tracer can modify (or create) a tracee's SVE state at syscall
  entry or syscall exit. As a result of the above commit, the tracee's
  SVE state can be discarded non-deterministically after modification,
  rather than being retained as it previously was.

  Note that for co-operative tracer/tracee pairs, the tracer may
  (re)initialise the tracee's state arbitrarily after the tracee sends
  itself an initial SIGSTOP via a syscall, so this affects realistic
  design patterns.

* The current_pt_regs()->syscallno field can be modified via ptrace, and
  can be altered even when the tracee is not really in a syscall,
  causing non-deterministic discarding to occur in situations where this
  was not previously possible.
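
The dependence on syscallno described above can be illustrated with a
small userspace model. This is only a sketch, not the kernel code: the
model_* names and the SAVED_* values are hypothetical, and only
NO_SYSCALL and the in_syscall() condition are taken from the
description above.

```c
#include <stdbool.h>

/* Hypothetical, heavily simplified stand-ins for the kernel structures. */
#define NO_SYSCALL (-1)

struct model_regs {
	int syscallno;		/* NO_SYSCALL when not in a syscall */
};

enum model_saved_fmt { SAVED_FPSIMD, SAVED_SVE };

/* Mirrors the condition quoted above: syscallno != NO_SYSCALL. */
static bool model_in_syscall(const struct model_regs *regs)
{
	return regs->syscallno != NO_SYSCALL;
}

/*
 * Model of the old heuristic: when state is saved, the SVE portion is
 * discarded whenever the task merely appears to be in a syscall.
 */
static enum model_saved_fmt model_old_save(const struct model_regs *regs,
					   bool task_has_sve)
{
	if (task_has_sve && !model_in_syscall(regs))
		return SAVED_SVE;
	return SAVED_FPSIMD;
}
```

In this model, a tracer that writes any value other than NO_SYSCALL to
syscallno flips the result of model_old_save() from SAVED_SVE to
SAVED_FPSIMD, even though the tracee's register contents were never
truncated.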

Further, using current_pt_regs()->syscallno in this way is unsound:

* There are data races between readers and writers of the
  current_pt_regs()->syscallno field.

  The current_pt_regs()->syscallno field is written in interruptible
  task context using plain C accesses, and is read in irq/softirq
  context using plain C accesses. These accesses are subject to data
  races, with the usual concerns with tearing, etc.

* Writes to current_pt_regs()->syscallno are subject to compiler
  reordering.

  As current_pt_regs()->syscallno is written with plain C accesses,
  the compiler is free to move those writes arbitrarily relative to
  anything which doesn't access the same memory location.

  In theory this could break signal return, where prior to restoring the
  SVE state, restore_sigframe() calls forget_syscall(). If the write
  were moved after the restore of some SVE state, that state could be
  discarded unexpectedly.

  In practice that reordering cannot happen in the absence of LTO (as
  cross compilation-unit function calls prevent this reordering), and
  that reordering appears to be unlikely in the presence of LTO.

Additionally, since commit:

  f130ac0ae4 ("arm64: syscall: unmask DAIF earlier for SVCs")

... DAIF is unmasked before el0_svc_common() sets regs->syscallno to the
real syscall number. Consequently state may be saved in SVE format prior
to this point.

Considering all of the above, current_pt_regs()->syscallno should not be
used to infer whether the SVE state can be discarded. Luckily we can
instead use cpu_fp_state::to_save to track when it is safe to discard
the SVE state:

* At syscall entry, after the live SVE register state is truncated, set
  cpu_fp_state::to_save to FP_STATE_FPSIMD to indicate that only the
  FPSIMD portion is live and needs to be saved.

* At syscall exit, once the task's state is guaranteed to be live, set
  cpu_fp_state::to_save to FP_STATE_CURRENT to indicate that TIF_SVE
  must be considered to determine which state needs to be saved.

* Whenever state is modified, it must be saved+flushed prior to
  manipulation. The state will be truncated if necessary when it is
  saved, and reloading the state will set cpu_fp_state::to_save to
  FP_STATE_CURRENT, preventing subsequent discarding.

This permits SVE state to be discarded *only* when it is known to have
been truncated (and the non-FPSIMD portions must be zero), and ensures
that SVE state is retained after it is explicitly modified.
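
The three rules above can be read as a small state machine. The
following is a minimal userspace sketch of that reading, not the kernel
implementation: FP_STATE_CURRENT and FP_STATE_FPSIMD are the names used
above, while struct model_task, the model_* functions and the SAVED_*
values are hypothetical, and TIF_SVE is reduced to a plain boolean.

```c
#include <stdbool.h>

enum model_to_save { FP_STATE_CURRENT, FP_STATE_FPSIMD };
enum model_saved_fmt { SAVED_FPSIMD, SAVED_SVE };

/* Hypothetical per-task model; 'tif_sve' stands in for TIF_SVE. */
struct model_task {
	bool tif_sve;
	enum model_to_save to_save;
};

/* Syscall entry: live SVE state was truncated, only FPSIMD is live. */
static void model_syscall_entry(struct model_task *t)
{
	t->to_save = FP_STATE_FPSIMD;
}

/* Syscall exit: fall back to consulting TIF_SVE when saving. */
static void model_syscall_exit(struct model_task *t)
{
	t->to_save = FP_STATE_CURRENT;
}

/* Decide which format a save would use. */
static enum model_saved_fmt model_save(const struct model_task *t)
{
	if (t->to_save == FP_STATE_FPSIMD)
		return SAVED_FPSIMD;	/* known-truncated: safe to discard */
	return t->tif_sve ? SAVED_SVE : SAVED_FPSIMD;
}

/*
 * A tracer writing SVE state: save+flush first, modify the saved state,
 * then model the reload, which resets to_save to FP_STATE_CURRENT.
 */
static enum model_saved_fmt model_ptrace_write_sve(struct model_task *t)
{
	(void)model_save(t);		/* save + flush (may truncate) */
	t->tif_sve = true;		/* tracer installs SVE state */
	t->to_save = FP_STATE_CURRENT;	/* reload prevents later discard */
	return model_save(t);
}
```

In this model, a save between model_syscall_entry() and
model_syscall_exit() yields SAVED_FPSIMD, but once
model_ptrace_write_sve() has run, later saves yield SAVED_SVE whatever
value syscallno holds.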

For backporting, note that this fix depends on the following commits:

* b2482807fb ("arm64/sme: Optimise SME exit on syscall entry")
* f130ac0ae4 ("arm64: syscall: unmask DAIF earlier for SVCs")
* 929fa99b1215 ("arm64/fpsimd: signal: Always save+flush state early")

Fixes: 8c845e2731 ("arm64/sve: Leave SVE enabled on syscall if we don't context switch")
Fixes: f130ac0ae4 ("arm64: syscall: unmask DAIF earlier for SVCs")
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Mark Brown <broonie@kernel.org>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20250508132644.1395904-2-mark.rutland@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-06-19 15:28:08 +02:00
stacktrace arm64: stacktrace: track hyp stacks in unwinder's address space 2022-09-09 12:30:08 +01:00
vdso arm64: vdso: use SYS_CNTVCTSS_EL0 for gettimeofday 2022-09-09 12:27:25 +01:00
xen arm/xen: Introduce xen_setup_dma_ops() 2022-06-06 08:54:33 +02:00
acenv.h
acpi.h arm64: acpi: Harden get_cpu_for_acpi_id() against missing CPU entry 2024-09-12 11:11:42 +02:00
alternative-macros.h work around gcc bugs with 'asm goto' with outputs 2024-02-23 09:24:47 +01:00
alternative.h Merge branches 'for-next/kpti', 'for-next/missing-proto-warn', 'for-next/iss2-decode', 'for-next/kselftest', 'for-next/misc', 'for-next/feat_mops', 'for-next/module-alloc', 'for-next/sysreg', 'for-next/cpucap', 'for-next/acpi', 'for-next/kdump', 'for-next/acpi-doc', 'for-next/doc' and 'for-next/tpidr2-fix', remote-tracking branch 'arm64/for-next/perf' into for-next/core 2023-06-23 18:32:20 +01:00
apple_m1_pmu.h drivers/perf: Add Apple icestorm/firestorm CPU PMU driver 2022-03-08 13:32:48 +00:00
arch_gicv3.h arm64: add ARM64_HAS_GIC_PRIO_RELAXED_SYNC cpucap 2023-01-31 16:06:17 +00:00
arch_timer.h arm64/arch_timer: Provide noinstr sched_clock_read() functions 2023-06-05 21:11:05 +02:00
archrandom.h arm64: kaslr: add kaslr_early_init() declaration 2023-05-25 17:44:02 +01:00
arm_dsu_pmu.h
arm_pmuv3.h arm64/arm: arm_pmuv3: perf: Don't truncate 64-bit registers 2023-11-20 11:59:38 +01:00
arm-cci.h
asm_pointer_auth.h arm64/sysreg: Add _EL1 into ID_AA64ISAR2_EL1 definition names 2022-07-05 11:45:46 +01:00
asm-bug.h arm64: asm-bug: Add .align 2 to the end of __BUG_ENTRY 2024-06-12 11:12:49 +02:00
asm-extable.h arm64: extable: cleanup redundant extable type EX_TYPE_FIXUP 2022-06-28 12:11:47 +01:00
asm-offsets.h
asm-prototypes.h
asm-uaccess.h arm64/mm: remove now-superfluous ISBs from TTBR writes 2023-06-15 17:47:54 +01:00
assembler.h Merge branch 'for-next/trivial' into for-next/core 2022-12-06 11:33:29 +00:00
atomic_ll_sc.h arch: Remove cmpxchg_double 2023-06-05 09:36:39 +02:00
atomic_lse.h arch: Remove cmpxchg_double 2023-06-05 09:36:39 +02:00
atomic.h locking/atomic: make atomic*_{cmp,}xchg optional 2023-06-05 09:57:14 +02:00
barrier.h arm64: barrier: Restore spec_bar() macro 2024-08-14 13:58:48 +02:00
bitops.h include: move find.h from asm_generic to linux 2022-01-15 08:47:31 -08:00
bitrev.h
boot.h
brk-imm.h arm64: Support Clang UBSAN trap codes for better reporting 2023-02-08 15:26:58 -08:00
bug.h
cache.h arm64: allow kmalloc() caches aligned to the smaller cache_line_size() 2023-06-19 16:19:22 -07:00
cacheflush.h arm64: implement the new page table range API 2023-08-24 16:20:20 -07:00
checksum.h
clocksource.h
cmpxchg.h arch: Remove cmpxchg_double 2023-06-05 09:36:39 +02:00
compat.h arm64: avoid prototype warnings for syscalls 2023-05-25 17:44:01 +01:00
compiler.h arm64: move PAC masks to <asm/pointer_auth.h> 2023-04-13 12:27:11 +01:00
cpu_ops.h arm64: cpuidle: remove generic cpuidle support 2022-06-23 14:19:33 +01:00
cpu.h arm64: cpufeature: add system register ID_AA64MMFR3 2023-06-06 16:52:40 +01:00
cpufeature.h arm64: cpufeature: Fix CLRBHB and BC detection 2023-09-18 10:45:11 +01:00
cpuidle.h arm64: cpuidle: remove generic cpuidle support 2022-06-23 14:19:33 +01:00
cputype.h arm64: Add support for HIP09 Spectre-BHB mitigation 2025-06-04 14:41:54 +02:00
current.h
daifflags.h
dcc.h
debug-monitors.h arm64: kgdb: Set PSTATE.SS to 1 to re-enable single-step 2023-04-14 13:39:47 +01:00
device.h
dmi.h
efi.h Merge patch series "riscv: Introduce KASLR" 2023-09-08 11:25:13 -07:00
el2_setup.h KVM: arm64: Disable SME traps for (h)VHE at setup 2023-07-26 17:08:29 +00:00
elf.h arm64: exec: Adjust affinity for compat tasks with mismatched 32-bit EL0 2021-08-20 12:33:06 +02:00
esr.h arm64/fpsimd: Avoid RES0 bits in the SME trap handler 2025-06-19 15:28:06 +02:00
exception.h Merge branches 'for-next/kpti', 'for-next/missing-proto-warn', 'for-next/iss2-decode', 'for-next/kselftest', 'for-next/misc', 'for-next/feat_mops', 'for-next/module-alloc', 'for-next/sysreg', 'for-next/cpucap', 'for-next/acpi', 'for-next/kdump', 'for-next/acpi-doc', 'for-next/doc' and 'for-next/tpidr2-fix', remote-tracking branch 'arm64/for-next/perf' into for-next/core 2023-06-23 18:32:20 +01:00
exec.h
extable.h arm64/bpf: Remove 128MB limit for BPF JIT programs 2021-11-08 22:16:26 +01:00
fb.h arch/arm64: Implement <asm/fb.h> with generic helpers 2023-04-20 10:04:27 +02:00
fixmap.h arm64: mm: always map fixmap at page granularity 2023-04-11 18:55:28 +01:00
fpsimd.h arm64/fpsimd: Do not discard modified SVE state 2025-06-19 15:28:08 +02:00
fpsimdmacros.h arm64: sme: Use STR P to clear FFR context field in streaming SVE mode 2023-06-29 11:29:31 +01:00
ftrace.h tracing: arm64: Avoid missing-prototype warnings 2023-07-12 12:06:04 -04:00
futex.h arm64: extable: add a dedicated uaccess handler 2021-10-21 10:45:22 +01:00
gpr-num.h arm64: gpr-num: support W registers 2021-10-21 10:45:22 +01:00
hardirq.h
hugetlb.h mm: hugetlb: Add huge page size param to huge_ptep_get_and_clear() 2025-03-13 12:58:38 +01:00
hw_breakpoint.h arm64: move cpu_suspend_set_dbg_restorer() prototype to header 2023-05-25 17:44:01 +01:00
hwcap.h arm64: add HWCAP for FEAT_HBC (hinted conditional branches) 2023-08-04 17:32:13 +01:00
hyp_image.h
hyperv-tlfs.h PCI: hv: Add arm64 Hyper-V vPCI support 2022-01-12 08:24:29 -06:00
hypervisor.h
image.h arm64: Fix dangling references to Documentation/arm64 2023-06-21 08:53:31 -06:00
insn-def.h arm64: move AARCH64_BREAK_FAULT into insn-def.h 2022-02-22 21:25:48 +00:00
insn.h arm64: insn: Add support for encoding DSB 2025-05-18 08:24:10 +02:00
io.h arm64 : mm: add wrapper function ioremap_prot() 2023-08-18 10:12:36 -07:00
irq_work.h arch: consolidate arch_irq_work_raise prototypes 2024-02-05 20:14:17 +00:00
irq.h
irqflags.h arm64: alternatives: use cpucap naming 2023-06-07 17:57:47 +01:00
jump_label.h arm64: jump_label: Ensure patched jump_labels are visible to all CPUs 2024-08-11 12:47:24 +02:00
kasan.h
Kbuild arm64/sysreg: Enable automatic generation of system register definitions 2022-05-04 15:30:28 +01:00
kernel-pgtable.h arm64: fix build warning for ARM64_MEMSTART_SHIFT 2023-08-04 17:19:44 +01:00
kexec.h arm64: kdump : take off the protection on crashkernel memory region 2023-04-11 19:24:46 +01:00
kfence.h mm,kfence: decouple kfence from page granularity mapping judgement 2023-03-27 16:15:20 +01:00
kgdb.h
kprobes.h kprobes: treewide: Make it harder to refer kretprobe_trampoline directly 2021-09-30 21:24:06 -04:00
kvm_arm.h KVM: arm64: Add nPIR{E0}_EL1 to HFG traps 2023-10-12 16:38:50 +01:00
kvm_asm.h KVM/arm64 updates for Linux 6.6 2023-08-31 13:18:53 -04:00
kvm_emulate.h KVM: arm64: Fix resetting SME trap values on reset for (h)VHE 2023-07-26 17:08:30 +00:00
kvm_host.h KVM: arm64: Eagerly switch ZCR_EL{1,2} 2025-03-28 21:59:56 +01:00
kvm_hyp.h KVM: arm64: Eagerly switch ZCR_EL{1,2} 2025-03-28 21:59:56 +01:00
kvm_mmu.h KVM: arm64: Remove size-order align in the nVHE hyp private VA range 2023-08-26 12:00:54 +01:00
kvm_mte.h KVM: arm64: Save/restore MTE registers 2021-06-22 14:08:05 +01:00
kvm_nested.h KVM: arm64: nv: Add trap forwarding infrastructure 2023-08-17 10:00:27 +01:00
kvm_pgtable.h KVM: arm64: Define kvm_tlb_flush_vmid_range() 2023-08-17 09:40:35 +01:00
kvm_pkvm.h KVM: arm64: pkvm: Add support for fragmented FF-A descriptors 2023-06-01 21:34:51 +00:00
kvm_ptrauth.h
kvm_ras.h KVM: arm64: Treat ESR_EL2 as a 64-bit register 2022-04-29 19:26:27 +01:00
kvm_types.h
linkage.h arm64: Extend support for CONFIG_FUNCTION_ALIGNMENT 2023-01-24 11:49:43 +00:00
lse.h arm64: alternatives: use cpucap naming 2023-06-07 17:57:47 +01:00
memory.h asm-generic updates for 6.5 2023-07-06 10:06:04 -07:00
mman.h arm64: mte: Do not allow PROT_MTE on MAP_HUGETLB user mappings 2025-02-27 04:10:42 -08:00
mmu_context.h Merge branches 'for-next/kpti', 'for-next/missing-proto-warn', 'for-next/iss2-decode', 'for-next/kselftest', 'for-next/misc', 'for-next/feat_mops', 'for-next/module-alloc', 'for-next/sysreg', 'for-next/cpucap', 'for-next/acpi', 'for-next/kdump', 'for-next/acpi-doc', 'for-next/doc' and 'for-next/tpidr2-fix', remote-tracking branch 'arm64/for-next/perf' into for-next/core 2023-06-23 18:32:20 +01:00
mmu.h arm64: Remove unsued extern declaration init_mem_pgprot() 2023-07-27 11:18:02 +01:00
mmzone.h
module.h arm64: module: mandate MODULE_PLTS 2023-06-06 17:39:05 +01:00
module.lds.h arm64: module: mandate MODULE_PLTS 2023-06-06 17:39:05 +01:00
mshyperv.h arm64: hyperv: Add Hyper-V hypercall and register access utilities 2021-08-04 16:54:36 +00:00
mte-def.h arm64: mte: Define the number of bytes for storing the tags in a page 2022-02-15 22:53:29 +00:00
mte-kasan.h arm64: mte: rename TCO routines 2023-04-05 19:42:43 -07:00
mte.h arm64: mte: simplify swap tag restoration logic 2023-08-18 10:12:02 -07:00
neon-intrinsics.h
neon.h
numa.h
page-def.h
page.h mm: add vma_alloc_zeroed_movable_folio() 2023-02-02 22:33:18 -08:00
paravirt_api_clock.h sched/headers: Add initial new headers as identity mappings 2022-02-23 10:58:28 +01:00
paravirt.h
patching.h arm64: patching: Add aarch64_insn_write_literal_u64() 2023-01-24 11:49:43 +00:00
pci.h asm-generic: Add new pci.h and use it 2022-07-22 17:34:57 -05:00
percpu.h arch: Remove cmpxchg_double 2023-06-05 09:36:39 +02:00
perf_event.h arm64: perf: Move PMUv3 driver to drivers/perf 2023-03-27 14:01:18 +01:00
pgalloc.h arm64: mm: Fix VM_BUG_ON(mm != &init_mm) for trans_pgd 2021-11-16 10:12:57 +00:00
pgtable-hwdef.h arm64: add encodings of PIRx_ELx registers 2023-06-06 16:52:41 +01:00
pgtable-prot.h arm64: add encodings of PIRx_ELx registers 2023-06-06 16:52:41 +01:00
pgtable-types.h
pgtable.h arm64/mm: Check PUD_TYPE_TABLE in pud_bad() 2025-06-04 14:42:00 +02:00
pointer_auth.h arm64: move PAC masks to <asm/pointer_auth.h> 2023-04-13 12:27:11 +01:00
preempt.h arm64: Support PREEMPT_DYNAMIC 2022-02-19 11:11:09 +01:00
probes.h
proc-fns.h
processor.h locking: remove spin_lock_prefetch 2023-08-12 09:18:47 -07:00
ptdump.h ARM: 9255/1: efi/dump UEFI runtime page tables for ARM 2022-11-07 14:19:01 +00:00
ptrace.h arm64: rename ARM64_HAS_IRQ_PRIO_MASKING to ARM64_HAS_GIC_PRIO_MASKING 2023-01-31 16:06:17 +00:00
pvclock-abi.h
rwonce.h arm64: Do not include __READ_ONCE() block in assembly files 2022-03-09 21:56:50 +00:00
scs.h arm64: add scs_patch_vmlinux prototype 2023-05-25 17:44:01 +01:00
sdei.h arm64: sdei: abort running SDEI handlers during crash 2023-08-04 17:35:33 +01:00
seccomp.h
sections.h arm64: entry: Allow the trampoline text to occupy multiple pages 2022-02-15 17:40:28 +00:00
semihost.h serial: earlycon-arm-semihost: Move smh_putc() variants in respective arch's semihost.h 2023-01-19 14:58:19 +01:00
set_memory.h set_memory: allow querying whether set_direct_map_*() is actually enabled 2021-07-08 11:48:20 -07:00
setup.h arm64: mm: Fix "rodata=on" when CONFIG_RODATA_FULL_DEFAULT_ENABLED=y 2023-12-03 07:33:05 +01:00
shmparam.h
signal.h
signal32.h
simd.h arm64: replace in_irq() with in_hardirq() 2021-08-20 19:49:38 +01:00
smp_plat.h arm64: Add missing header <asm/smp.h> in two files 2021-07-12 13:37:34 +01:00
smp.h arm64: smp: Switch to hotplug core state synchronization 2023-05-15 13:44:57 +02:00
sparsemem.h mm, treewide: redefine MAX_ORDER sanely 2023-04-05 19:42:46 -07:00
spectre.h arm64: bpf: Add BHB mitigation to the epilogue for cBPF programs 2025-05-18 08:24:10 +02:00
spinlock_types.h locking: Allow to include asm/spinlock_types.h from linux/spinlock_types_raw.h 2021-12-07 15:14:12 +01:00
spinlock.h
stack_pointer.h
stackprotector.h stackprotector: actually use get_random_canary() 2022-11-18 02:18:10 +01:00
stacktrace.h arm64: efi: Account for the EFI runtime stack in stack unwinder 2023-01-16 15:27:31 +01:00
stage2_pgtable.h KVM: arm64: Limit stage2_apply_range() batch size to largest block 2022-10-09 02:33:49 +01:00
stat.h
string.h Revert "arm64: Mitigate MTE issues with str{n}cmp()" 2022-03-07 21:57:02 +00:00
suspend.h
sync_bitops.h
syscall_wrapper.h posix-timers: Get rid of [COMPAT_]SYS_NI() uses 2024-01-20 11:51:46 +01:00
syscall.h tracing: arm64: Avoid missing-prototype warnings 2023-07-12 12:06:04 -04:00
sysreg.h ARM: 2023-09-07 13:52:20 -07:00
system_misc.h arm64: die(): pass 'err' as long 2022-09-16 12:17:03 +01:00
thread_info.h thread_info: move function declarations to linux/thread_info.h 2023-06-09 17:44:16 -07:00
timex.h
tlb.h arm64: convert various functions to use ptdescs 2023-08-21 13:37:55 -07:00
tlbbatch.h arm64: support batched/deferred tlb shootdown during page reclamation/migration 2023-08-18 10:12:37 -07:00
tlbflush.h Fix mmu notifiers for range-based invalidates 2025-04-25 10:45:55 +02:00
topology.h arm64, topology: enable use of init_cpu_capacity_cppc() 2022-03-10 20:21:58 +01:00
trans_pgd.h arm64: trans_pgd: remove trans_pgd_map_page() 2021-10-01 13:31:01 +01:00
traps.h arm64: move early_brk64 prototype to header 2023-05-25 17:44:03 +01:00
uaccess.h arm64/mm: remove now-superfluous ISBs from TTBR writes 2023-06-15 17:47:54 +01:00
unistd.h arch: Register fchmodat2, usually as syscall 452 2023-07-27 12:25:35 +02:00
unistd32.h syscalls: fix compat_sys_io_pgetevents_time64 usage 2024-07-05 09:34:04 +02:00
uprobes.h arm64: probes: Fix uprobes for big-endian kernels 2024-10-22 15:46:20 +02:00
vdso.h arm64: alternative: patch alternatives in the vDSO 2022-09-09 12:27:25 +01:00
vectors.h arm64: fix clang warning about TRAMP_VALIAS 2022-03-18 13:48:28 +00:00
vermagic.h
virt.h KVM: arm64: Handle kvm_arm_init failure correctly in finalize_pkvm 2023-07-11 19:30:14 +00:00
vmalloc.h kasan, vmalloc, arm64: mark vmalloc mappings as pgprot_tagged 2022-03-24 19:06:47 -07:00
vmap_stack.h kasan, arm64: reset pointer tags of vmapped stacks 2022-03-24 19:06:47 -07:00
word-at-a-time.h arm64: mte: rename TCO routines 2023-04-05 19:42:43 -07:00
xor.h lib/xor: make xor prototypes more friendly to compiler vectorization 2022-02-11 20:39:39 +11:00