History log of /linux/arch/arm64/kernel/fpsimd.c (Results 1 – 25 of 1711)
Revision (<<< Hide revision tags) (Show revision tags >>>) Date Author Comments
# ab93e0dd 06-Aug-2025 Dmitry Torokhov <dmitry.torokhov@gmail.com>

Merge branch 'next' into for-linus

Prepare input updates for 6.17 merge window.


# a7bee4e7 04-Aug-2025 Dmitry Torokhov <dmitry.torokhov@gmail.com>

Merge tag 'ib-mfd-gpio-input-pwm-v6.17' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/mfd into next

Merge an immutable branch between MFD, GPIO, Input and PWM to resolve
conflicts for the mer

Merge tag 'ib-mfd-gpio-input-pwm-v6.17' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/mfd into next

Merge an immutable branch between MFD, GPIO, Input and PWM to resolve
conflicts for the merge window pull request.

show more ...


Revision tags: v6.16, v6.16-rc7, v6.16-rc6, v6.16-rc5, v6.16-rc4
# 74f1af95 29-Jun-2025 Rob Clark <robin.clark@oss.qualcomm.com>

Merge remote-tracking branch 'drm/drm-next' into msm-next

Back-merge drm-next to (indirectly) get arm-smmu updates for making
stall-on-fault more reliable.

Signed-off-by: Rob Clark <robin.clark@oss

Merge remote-tracking branch 'drm/drm-next' into msm-next

Back-merge drm-next to (indirectly) get arm-smmu updates for making
stall-on-fault more reliable.

Signed-off-by: Rob Clark <robin.clark@oss.qualcomm.com>

show more ...


Revision tags: v6.16-rc3, v6.16-rc2
# c598d5eb 11-Jun-2025 Thomas Zimmermann <tzimmermann@suse.de>

Merge drm/drm-next into drm-misc-next

Backmerging to forward to v6.16-rc1

Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>


# 86e2d052 09-Jun-2025 Thomas Hellström <thomas.hellstrom@linux.intel.com>

Merge drm/drm-next into drm-xe-next

Backmerging to bring in 6.16

Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>


# 34c55367 09-Jun-2025 Jani Nikula <jani.nikula@intel.com>

Merge drm/drm-next into drm-intel-next

Sync to v6.16-rc1, among other things to get the fixed size GENMASK_U*()
and BIT_U*() macros.

Signed-off-by: Jani Nikula <jani.nikula@intel.com>


Revision tags: v6.16-rc1
# 47cf96fb 28-May-2025 Linus Torvalds <torvalds@linux-foundation.org>

Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull arm64 updates from Will Deacon:
"The headline feature is the re-enablement of support for Arm's
Scalab

Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull arm64 updates from Will Deacon:
"The headline feature is the re-enablement of support for Arm's
Scalable Matrix Extension (SME) thanks to a bumper crop of fixes
from Mark Rutland.

If matrices aren't your thing, then Ryan's page-table optimisation
work is much more interesting.

Summary:

ACPI, EFI and PSCI:

- Decouple Arm's "Software Delegated Exception Interface" (SDEI)
support from the ACPI GHES code so that it can be used by platforms
booted with device-tree

- Remove unnecessary per-CPU tracking of the FPSIMD state across EFI
runtime calls

- Fix a node refcount imbalance in the PSCI device-tree code

CPU Features:

- Ensure register sanitisation is applied to fields in ID_AA64MMFR4

- Expose AIDR_EL1 to userspace via sysfs, primarily so that KVM
guests can reliably query the underlying CPU types from the VMM

- Re-enabling of SME support (CONFIG_ARM64_SME) as a result of fixes
to our context-switching, signal handling and ptrace code

Entry code:

- Hook up TIF_NEED_RESCHED_LAZY so that CONFIG_PREEMPT_LAZY can be
selected

Memory management:

- Prevent BSS exports from being used by the early PI code

- Propagate level and stride information to the low-level TLB
invalidation routines when operating on hugetlb entries

- Use the page-table contiguous hint for vmap() mappings with
VM_ALLOW_HUGE_VMAP where possible

- Optimise vmalloc()/vmap() page-table updates to use "lazy MMU mode"
and hook this up on arm64 so that the trailing DSB (used to publish
the updates to the hardware walker) can be deferred until the end
of the mapping operation

- Extend mmap() randomisation for 52-bit virtual addresses (on par
with 48-bit addressing) and remove limited support for
randomisation of the linear map

Perf and PMUs:

- Add support for probing the CMN-S3 driver using ACPI

- Minor driver fixes to the CMN, Arm-NI and amlogic PMU drivers

Selftests:

- Fix FPSIMD and SME tests to align with the freshly re-enabled SME
support

- Fix default setting of the OUTPUT variable so that tests are
installed in the right location

vDSO:

- Replace raw counter access from inline assembly code with a call to
the the __arch_counter_get_cntvct() helper function

Miscellaneous:

- Add some missing header inclusions to the CCA headers

- Rework rendering of /proc/cpuinfo to follow the x86-approach and
avoid repeated buffer expansion (the user-visible format remains
identical)

- Remove redundant selection of CONFIG_CRC32

- Extend early error message when failing to map the device-tree
blob"

* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (83 commits)
arm64: cputype: Add cputype definition for HIP12
arm64: el2_setup.h: Make __init_el2_fgt labels consistent, again
perf/arm-cmn: Add CMN S3 ACPI binding
arm64/boot: Disallow BSS exports to startup code
arm64/boot: Move global CPU override variables out of BSS
arm64/boot: Move init_pgdir[] and init_idmap_pgdir[] into __pi_ namespace
perf/arm-cmn: Initialise cmn->cpu earlier
kselftest/arm64: Set default OUTPUT path when undefined
arm64: Update comment regarding values in __boot_cpu_mode
arm64: mm: Drop redundant check in pmd_trans_huge()
arm64/mm: Re-organise setting up FEAT_S1PIE registers PIRE0_EL1 and PIR_EL1
arm64/mm: Permit lazy_mmu_mode to be nested
arm64/mm: Disable barrier batching in interrupt contexts
arm64/cpuinfo: only show one cpu's info in c_show()
arm64/mm: Batch barriers when updating kernel mappings
mm/vmalloc: Enter lazy mmu mode while manipulating vmalloc ptes
arm64/mm: Support huge pte-mapped pages in vmap
mm/vmalloc: Gracefully unmap huge ptes
mm/vmalloc: Warn on improper use of vunmap_range()
arm64/mm: Hoist barriers out of set_ptes_anysz() loop
...

show more ...


# bbfd5594 28-May-2025 Joonas Lahtinen <joonas.lahtinen@linux.intel.com>

Merge drm/drm-next into drm-intel-gt-next

Need to pull in a67221b5eb8d ("drm/i915/dp: Return min bpc supported by source instead of 0")
in order to fix build breakage on GCC 9.4.0 (from Ubuntu 20.04

Merge drm/drm-next into drm-intel-gt-next

Need to pull in a67221b5eb8d ("drm/i915/dp: Return min bpc supported by source instead of 0")
in order to fix build breakage on GCC 9.4.0 (from Ubuntu 20.04).

Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>

show more ...


# 53a08704 27-May-2025 Will Deacon <will@kernel.org>

Merge branch 'for-next/sme-fixes' into for-next/core

* for-next/sme-fixes: (35 commits)
arm64/fpsimd: Allow CONFIG_ARM64_SME to be selected
arm64/fpsimd: ptrace: Gracefully handle errors
arm64

Merge branch 'for-next/sme-fixes' into for-next/core

* for-next/sme-fixes: (35 commits)
arm64/fpsimd: Allow CONFIG_ARM64_SME to be selected
arm64/fpsimd: ptrace: Gracefully handle errors
arm64/fpsimd: ptrace: Mandate SVE payload for streaming-mode state
arm64/fpsimd: ptrace: Do not present register data for inactive mode
arm64/fpsimd: ptrace: Save task state before generating SVE header
arm64/fpsimd: ptrace/prctl: Ensure VL changes leave task in a valid state
arm64/fpsimd: ptrace/prctl: Ensure VL changes do not resurrect stale data
arm64/fpsimd: Make clone() compatible with ZA lazy saving
arm64/fpsimd: Clear PSTATE.SM during clone()
arm64/fpsimd: Consistently preserve FPSIMD state during clone()
arm64/fpsimd: Remove redundant task->mm check
arm64/fpsimd: signal: Use SMSTOP behaviour in setup_return()
arm64/fpsimd: Add task_smstop_sm()
arm64/fpsimd: Factor out {sve,sme}_state_size() helpers
arm64/fpsimd: Clarify sve_sync_*() functions
arm64/fpsimd: ptrace: Consistently handle partial writes to NT_ARM_(S)SVE
arm64/fpsimd: signal: Consistently read FPSIMD context
arm64/fpsimd: signal: Mandate SVE payload for streaming-mode state
arm64/fpsimd: signal: Clear PSTATE.SM when restoring FPSIMD frame only
arm64/fpsimd: Do not discard modified SVE state
...

show more ...


# b5b6910f 27-May-2025 Will Deacon <will@kernel.org>

Merge branch 'for-next/efi' into for-next/core

* for-next/efi:
arm64/fpsimd: Avoid unnecessary per-CPU buffers for EFI runtime calls


Revision tags: v6.15, v6.15-rc7, v6.15-rc6
# b87c8c4a 08-May-2025 Mark Rutland <mark.rutland@arm.com>

arm64/fpsimd: ptrace/prctl: Ensure VL changes leave task in a valid state

Currently, vec_set_vector_length() can manipulate a task into an invalid
state as a result of a prctl/ptrace syscall which c

arm64/fpsimd: ptrace/prctl: Ensure VL changes leave task in a valid state

Currently, vec_set_vector_length() can manipulate a task into an invalid
state as a result of a prctl/ptrace syscall which changes the SVE/SME
vector length, resulting in several problems:

(1) When changing the SVE vector length, if the task initially has
PSTATE.ZA==1, and sve_alloc() fails to allocate memory, the task
will be left with PSTATE.ZA==1 and sve_state==NULL. This is not a
legitimate state, and could result in a subsequent null pointer
dereference.

(2) When changing the SVE vector length, if the task initially has
PSTATE.SM==1, the task will be left with PSTATE.SM==1 and
fp_type==FP_STATE_FPSIMD. Streaming mode state always needs to be
saved in SVE format, so this is not a legitimate state.

Attempting to restore this state may cause a task to erroneously
inherit stale streaming mode predicate registers and FFR contents,
behaving non-deterministically and potentially leaving information
from another task.

While in this state, reads of the NT_ARM_SSVE regset will indicate
that the registers are not stored in SVE format. For the NT_ARM_SSVE
regset specifically, debuggers interpret this as meaning that
PSTATE.SM==0.

(3) When changing the SME vector length, if the task initially has
PSTATE.SM==1, the lower 128 bits of task's streaming mode vector
state will be migrated to non-streaming mode, rather than these bits
being zeroed as is usually the case for changes to PSTATE.SM.

To fix the first issue, we can eagerly allocate the new sve_state and
sme_state before modifying the task. This makes it possible to handle
memory allocation failure without modifying the task state at all, and
removes the need to clear TIF_SVE and TIF_SME.

To fix the second issue, we either need to clear PSTATE.SM or not change
the saved fp_type. Given we're going to eagerly allocate sve_state and
sme_state, the simplest option is to preserve PSTATE.SM and the saves
fp_type, and consistently truncate the SVE state. This ensures that the
task always stays in a valid state, and by virtue of not exiting
streaming mode, this also sidesteps the third issue.

I believe these changes should not be problematic for realistic usage:

* When the SVE/SME vector length is changed via prctl(), syscall entry
will have cleared PSTATE.SM. Unless the task's state has been
manipulated via ptrace after entry, the task will have PSTATE.SM==0.

* When the SVE/SME vector length is changed via a write to the
NT_ARM_SVE or NT_ARM_SSVE regsets, PSTATE.SM will be forced
immediately after the length change, and new vector state will be
copied from userspace.

* When the SME vector length is changed via a write to the NT_ARM_ZA
regset, the (S)SVE state is clobbered today, so anyone who cares about
the specific state would need to install this after writing to the
NT_ARM_ZA regset.

As we need to free the old SVE state while TIF_SVE may still be set, we
cannot use sve_free(), and using kfree() directly makes it clear that
the free pairs with the subsequent assignment. As this leaves sve_free()
unused, I've removed the existing sve_free() and renamed __sve_free() to
mirror sme_free().

Fixes: 8bd7f91c03d8 ("arm64/sme: Implement traps and syscall handling for SME")
Fixes: baa8515281b3 ("arm64/fpsimd: Track the saved FPSIMD state type separately to TIF_SVE")
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: David Spickett <david.spickett@arm.com>
Cc: Luis Machado <luis.machado@arm.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Mark Brown <broonie@kernel.org>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20250508132644.1395904-16-mark.rutland@arm.com
Signed-off-by: Will Deacon <will@kernel.org>

show more ...


# 49ce4841 08-May-2025 Mark Rutland <mark.rutland@arm.com>

arm64/fpsimd: ptrace/prctl: Ensure VL changes do not resurrect stale data

The SVE/SME vector lengths can be changed via prctl/ptrace syscalls.
Changes to the SVE/SME vector lengths are documented as

arm64/fpsimd: ptrace/prctl: Ensure VL changes do not resurrect stale data

The SVE/SME vector lengths can be changed via prctl/ptrace syscalls.
Changes to the SVE/SME vector lengths are documented as preserving the
lower 128 bits of the Z registers (i.e. the bits shared with the FPSIMD
V registers). To ensure this, vec_set_vector_length() explicitly copies
register values from a task's saved SVE state to its saved FPSIMD state
when dropping the task to FPSIMD-only.

The logic for this was not updated when when FPSIMD/SVE state tracking
was changed across commits:

baa8515281b3 ("arm64/fpsimd: Track the saved FPSIMD state type separately to TIF_SVE")
a0136be443d5 (arm64/fpsimd: Load FP state based on recorded data type")
bbc6172eefdb ("arm64/fpsimd: SME no longer requires SVE register state")
8c845e273104 ("arm64/sve: Leave SVE enabled on syscall if we don't context switch")

Since the last commit above, a task's FPSIMD/SVE state may be stored in
FPSIMD format while TIF_SVE is set, and the stored SVE state is stale.
When vec_set_vector_length() encounters this case, it will erroneously
clobber the live FPSIMD state with stale SVE state by using
sve_to_fpsimd().

Fix this by using fpsimd_sync_from_effective_state() instead.

Related issues with streaming mode state will be addressed in subsequent
patches.

Fixes: 8c845e273104 ("arm64/sve: Leave SVE enabled on syscall if we don't context switch")
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: David Spickett <david.spickett@arm.com>
Cc: Luis Machado <luis.machado@arm.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Mark Brown <broonie@kernel.org>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20250508132644.1395904-15-mark.rutland@arm.com
Signed-off-by: Will Deacon <will@kernel.org>

show more ...


# 6ef1d778 08-May-2025 Mark Rutland <mark.rutland@arm.com>

arm64/fpsimd: Add task_smstop_sm()

In a few places we want to transition a task from streaming mode to
non-streaming mode, e.g. signal delivery where we historically tried to
use an SMSTOP SM instru

arm64/fpsimd: Add task_smstop_sm()

In a few places we want to transition a task from streaming mode to
non-streaming mode, e.g. signal delivery where we historically tried to
use an SMSTOP SM instruction.

Add a new helper to manipulate a task's state in the same way as an
SMSTOP SM instruction. I have not added a corresponding helper to
simulate the effects of SMSTART SM. Only ptrace transitions a task into
streaming mode, and ptrace has distinct semantics for such transitions.

Per ARM DDI 0487 L.a, section B1.4.6:

| RRSWFQ
| When the Effective value of PSTATE.SM is changed by any method from 0
| to 1, an entry to Streaming SVE mode is performed, and all implemented
| bits of Streaming SVE register state are set to zero.

| RKFRQZ
| When the Effective value of PSTATE.SM is changed by any method from 1
| to 0, an exit from Streaming SVE mode is performed, and in the
| newly-entered mode, all implemented bits of the SVE scalable vector
| registers, SVE predicate registers, and FFR, are set to zero.

Per ARM DDI 0487 L.a, section C5.2.9:

| On entry to or exit from Streaming SVE mode, FPMR is set to 0

Per ARM DDI 0487 L.a, section C5.2.10:

| On entry to or exit from Streaming SVE mode, FPSR.{IOC, DZC, OFC, UFC,
| IXC, IDC, QC} are set to 1 and the remaining bits are set to 0.

This means bits 0, 1, 2, 3, 4, 7, and 27 respectively, i.e. 0x0800009f

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Mark Brown <broonie@kernel.org>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20250508132644.1395904-9-mark.rutland@arm.com
Signed-off-by: Will Deacon <will@kernel.org>

show more ...


# 8738288a 08-May-2025 Mark Rutland <mark.rutland@arm.com>

arm64/fpsimd: Factor out {sve,sme}_state_size() helpers

In subsequent patches we'll need to determine the SVE/SME state size for
a given SVE VL and SME VL regardless of whether a task is currently
c

arm64/fpsimd: Factor out {sve,sme}_state_size() helpers

In subsequent patches we'll need to determine the SVE/SME state size for
a given SVE VL and SME VL regardless of whether a task is currently
configured with those VLs. Split the sizing logic out of
sve_state_size() and sme_state_size() so that we don't need to open-code
this logic elsewhere.

At the same time, apply minor cleanups:

* Move sve_state_size() into fpsimd.h, matching the placement of
sme_state_size().

* Remove the feature checks from sve_state_size(). We only call
sve_state_size() when at least one of SVE and SME are supported, and
when either of the two is not supported, the task's corresponding
SVE/SME vector length will be zero.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Mark Brown <broonie@kernel.org>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20250508132644.1395904-8-mark.rutland@arm.com
Signed-off-by: Will Deacon <will@kernel.org>

show more ...


# b255be42 08-May-2025 Mark Rutland <mark.rutland@arm.com>

arm64/fpsimd: Clarify sve_sync_*() functions

The sve_sync_{to,from}_fpsimd*() functions are intended to
extract/insert the currently effective FPSIMD state of a task regardless
of whether the task's

arm64/fpsimd: Clarify sve_sync_*() functions

The sve_sync_{to,from}_fpsimd*() functions are intended to
extract/insert the currently effective FPSIMD state of a task regardless
of whether the task's state is saved in FPSIMD format or SVE format.
Historically they were only used by ptrace, but sve_sync_to_fpsimd() is
now used more widely, and sve_sync_from_fpsimd_zeropad() may be used
more widely in future.

When FPSIMD/SVE state tracking was changed across commits:

baa8515281b3 ("arm64/fpsimd: Track the saved FPSIMD state type separately to TIF_SVE")
a0136be443d5 (arm64/fpsimd: Load FP state based on recorded data type")
bbc6172eefdb ("arm64/fpsimd: SME no longer requires SVE register state")
8c845e273104 ("arm64/sve: Leave SVE enabled on syscall if we don't context switch")

... sve_sync_to_fpsimd() was updated to consider task->thread.fp_type
rather than the task's TIF_SVE and PSTATE.SM, but (apparently due to an
oversight) sve_sync_from_fpsimd_zeropad() was left as-is, leaving the
two inconsistent.

Due to this, sve_sync_from_fpsimd_zeropad() may copy state from
task->thread.uw.fpsimd_state into task->thread.sve_state when
task->thread.fp_type == FP_STATE_FPSIMD. This is redundant (but benign)
as task->thread.uw.fpsimd_state is the effective state that will be
restored, and task->thread.sve_state will not be consumed. For
consistency, and to avoid the redundant work, it better for
sve_sync_from_fpsimd_zeropad() to consider task->thread.fp_type alone,
matching sve_sync_to_fpsimd().

The naming of both functions is somehat unfortunate, as it is unclear
when and why they copy state. It would be better to describe them in
terms of the effective state.

Considering all of the above, clean this up:

* Adjust sve_sync_from_fpsimd_zeropad() to consider
task->thread.fp_type.

* Update comments to clarify the intended semantics/usage. I've removed
the description that task->thread.sve_state must have been allocated,
as this is only necessary when task->thread.fp_type == FP_STATE_SVE,
which itself implies that task->thread.sve_state must have been
allocated.

* Rename the functions to more clearly indicate when/why they copy
state:

- sve_sync_to_fpsimd() => fpsimd_sync_from_effective_state()

- sve_sync_from_fpsimd_zeropad => fpsimd_sync_to_effective_state_zeropad()

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Mark Brown <broonie@kernel.org>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20250508132644.1395904-7-mark.rutland@arm.com
Signed-off-by: Will Deacon <will@kernel.org>

show more ...


# 316283f2 08-May-2025 Mark Rutland <mark.rutland@arm.com>

arm64/fpsimd: ptrace: Consistently handle partial writes to NT_ARM_(S)SVE

Partial writes to the NT_ARM_SVE and NT_ARM_SSVE regsets using an
payload are handled inconsistently and non-deterministical

arm64/fpsimd: ptrace: Consistently handle partial writes to NT_ARM_(S)SVE

Partial writes to the NT_ARM_SVE and NT_ARM_SSVE regsets using an
payload are handled inconsistently and non-deterministically. A comment
within sve_set_common() indicates that we intended that a partial write
would preserve any effective FPSIMD/SVE state which was not overwritten,
but this has never worked consistently, and during syscalls the FPSIMD
vector state may be non-deterministically preserved and may be
erroneously migrated between streaming and non-streaming SVE modes.

The simplest fix is to handle a partial write by consistently zeroing
the remaining state. As detailed below I do not believe this will
adversely affect any real usage.

Neither GDB nor LLDB attempt partial writes to these regsets, and the
documentation (in Documentation/arch/arm64/sve.rst) has always indicated
that state preservation was not guaranteed, as is says:

| The effect of writing a partial, incomplete payload is unspecified.

When the logic was originally introduced in commit:

43d4da2c45b2 ("arm64/sve: ptrace and ELF coredump support")

... there were two potential behaviours, depending on TIF_SVE:

* When TIF_SVE was clear, all SVE state would be zeroed, excluding the
low 128 bits of vectors shared with FPSIMD, FPSR, and FPCR.

* When TIF_SVE was set, all SVE state would be zeroed, including the
low 128 bits of vectors shared with FPSIMD, but excluding FPSR and
FPCR.

Note that as writing to NT_ARM_SVE would set TIF_SVE, partial writes to
NT_ARM_SVE would not be idempotent, and if a first write preserved the
low 128 bits, a subsequent (potentially identical) partial write would
discard the low 128 bits.

When support for the NT_ARM_SSVE regset was added in commit:

e12310a0d30f ("arm64/sme: Implement ptrace support for streaming mode SVE registers")

... the above behaviour was retained for writes to the NT_ARM_SVE
regset, though writes to the NT_ARM_SSVE would always zero the SVE
registers and would not inherit FPSIMD register state. This happened as
fpsimd_sync_to_sve() only copied the FPSIMD regs when TIF_SVE was clear
and PSTATE.SM==0.

Subsequently, when FPSIMD/SVE state tracking was changed across commits:

baa8515281b3 ("arm64/fpsimd: Track the saved FPSIMD state type separately to TIF_SVE")
a0136be443d5 (arm64/fpsimd: Load FP state based on recorded data type")
bbc6172eefdb ("arm64/fpsimd: SME no longer requires SVE register state")
8c845e273104 ("arm64/sve: Leave SVE enabled on syscall if we don't context switch")

... there was no corresponding update to the ptrace code, nor to
fpsimd_sync_to_sve(), which stil considers TIF_SVE and PSTATE.SM rather
than the saved fp_type. The saved state can be in the FPSIMD format
regardless of whether TIF_SVE is set or clear, and the saved type can
change non-deterministically during syscalls. Consequently a subsequent
partial write to the NT_ARM_SVE or NT_ARM_SSVE regsets may
non-deterministically preserve the FPSIMD state, and may migrate this
state between streaming and non-streaming modes.

Clean this up by never attempting to preserve ANY state when writing an
SVE payload to the NT_ARM_SVE/NT_ARM_SSVE regsets, zeroing all relevant
state including FPSR and FPCR. This simplifies the code, makes the
behaviour deterministic, and avoids migrating state between streaming
and non-streaming modes. As above, I do not believe this should
adversely affect existing userspace applications.

At the same time, remove fpsimd_sync_to_sve(). It is no longer used,
doesn't do what its documentation implies, and gets in the way of other
cleanups and fixes.

Fixes: 43d4da2c45b2 ("arm64/sve: ptrace and ELF coredump support")
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: David Spickett <david.spickett@arm.com>
Cc: Luis Machado <luis.machado@arm.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Mark Brown <broonie@kernel.org>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20250508132644.1395904-6-mark.rutland@arm.com
Signed-off-by: Will Deacon <will@kernel.org>

show more ...


# 398edaa1 08-May-2025 Mark Rutland <mark.rutland@arm.com>

arm64/fpsimd: Do not discard modified SVE state

Historically SVE state was discarded deterministically early in the
syscall entry path, before ptrace is notified of syscall entry. This
permitted ptr

arm64/fpsimd: Do not discard modified SVE state

Historically SVE state was discarded deterministically early in the
syscall entry path, before ptrace is notified of syscall entry. This
permitted ptrace to modify SVE state before and after the "real" syscall
logic was executed, with the modified state being retained.

This behaviour was changed by commit:

8c845e2731041f0f ("arm64/sve: Leave SVE enabled on syscall if we don't context switch")

That commit was intended to speed up workloads that used SVE by
opportunistically leaving SVE enabled when returning from a syscall.
The syscall entry logic was modified to truncate the SVE state without
disabling userspace access to SVE, and fpsimd_save_user_state() was
modified to discard userspace SVE state whenever
in_syscall(current_pt_regs()) is true, i.e. when
current_pt_regs()->syscallno != NO_SYSCALL.

Leaving SVE enabled opportunistically resulted in a couple of changes to
userspace visible behaviour which weren't described at the time, but are
logical consequences of opportunistically leaving SVE enabled:

* Signal handlers can observe the type of saved state in the signal's
sve_context record. When the kernel only tracks FPSIMD state, the 'vq'
field is 0 and there is no space allocated for register contents. When
the kernel tracks SVE state, the 'vq' field is non-zero and the
register contents are saved into the record.

As a result of the above commit, 'vq' (and the presence of SVE
register state) is non-deterministically zero or non-zero for a period
of time after a syscall. The effective register state is still
deterministic.

Hopefully no-one relies on this being deterministic. In general,
handlers for asynchronous events cannot expect a deterministic state.

* Similarly to signal handlers, ptrace requests can observe the type of
saved state in the NT_ARM_SVE and NT_ARM_SSVE regsets, as this is
exposed in the header flags. As a result of the above commit, this is
now in a non-deterministic state after a syscall. The effective
register state is still deterministic.

Hopefully no-one relies on this being deterministic. In general,
debuggers would have to handle this changing at arbitrary points
during program flow.

Discarding the SVE state within fpsimd_save_user_state() resulted in
other changes to userspace visible behaviour which are not desirable:

* A ptrace tracer can modify (or create) a tracee's SVE state at syscall
entry or syscall exit. As a result of the above commit, the tracee's
SVE state can be discarded non-deterministically after modification,
rather than being retained as it previously was.

Note that for co-operative tracer/tracee pairs, the tracer may
(re)initialise the tracee's state arbitrarily after the tracee sends
itself an initial SIGSTOP via a syscall, so this affects realistic
design patterns.

* The current_pt_regs()->syscallno field can be modified via ptrace, and
can be altered even when the tracee is not really in a syscall,
causing non-deterministic discarding to occur in situations where this
was not previously possible.

Further, using current_pt_regs()->syscallno in this way is unsound:

* There are data races between readers and writers of the
current_pt_regs()->syscallno field.

The current_pt_regs()->syscallno field is written in interruptible
task context using plain C accesses, and is read in irq/softirq
context using plain C accesses. These accesses are subject to data
races, with the usual concerns with tearing, etc.

* Writes to current_pt_regs()->syscallno are subject to compiler
reordering.

As current_pt_regs()->syscallno is written with plain C accesses,
the compiler is free to move those writes arbitrarily relative to
anything which doesn't access the same memory location.

In theory this could break signal return, where prior to restoring the
SVE state, restore_sigframe() calls forget_syscall(). If the write
were hoisted after restore of some SVE state, that state could be
discarded unexpectedly.

In practice that reordering cannot happen in the absence of LTO (as
cross compilation-unit function calls happen prevent this reordering),
and that reordering appears to be unlikely in the presence of LTO.

Additionally, since commit:

f130ac0ae4412dbe ("arm64: syscall: unmask DAIF earlier for SVCs")

... DAIF is unmasked before el0_svc_common() sets regs->syscallno to the
real syscall number. Consequently state may be saved in SVE format prior
to this point.

Considering all of the above, current_pt_regs()->syscallno should not be
used to infer whether the SVE state can be discarded. Luckily we can
instead use cpu_fp_state::to_save to track when it is safe to discard
the SVE state:

* At syscall entry, after the live SVE register state is truncated, set
cpu_fp_state::to_save to FP_STATE_FPSIMD to indicate that only the
FPSIMD portion is live and needs to be saved.

* At syscall exit, once the task's state is guaranteed to be live, set
cpu_fp_state::to_save to FP_STATE_CURRENT to indicate that TIF_SVE
must be considered to determine which state needs to be saved.

* Whenever state is modified, it must be saved+flushed prior to
manipulation. The state will be truncated if necessary when it is
saved, and reloading the state will set fp_state::to_save to
FP_STATE_CURRENT, preventing subsequent discarding.

This permits SVE state to be discarded *only* when it is known to have
been truncated (and the non-FPSIMD portions must be zero), and ensures
that SVE state is retained after it is explicitly modified.

For backporting, note that this fix depends on the following commits:

* b2482807fbd4 ("arm64/sme: Optimise SME exit on syscall entry")
* f130ac0ae441 ("arm64: syscall: unmask DAIF earlier for SVCs")
* 929fa99b1215 ("arm64/fpsimd: signal: Always save+flush state early")

Fixes: 8c845e273104 ("arm64/sve: Leave SVE enabled on syscall if we don't context switch")
Fixes: f130ac0ae441 ("arm64: syscall: unmask DAIF earlier for SVCs")
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Mark Brown <broonie@kernel.org>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20250508132644.1395904-2-mark.rutland@arm.com
Signed-off-by: Will Deacon <will@kernel.org>

show more ...


Revision tags: v6.15-rc5
# f699c666 30-Apr-2025 Mark Rutland <mark.rutland@arm.com>

arm64/fpsimd: Avoid warning when sve_to_fpsimd() is unused

Historically fpsimd_to_sve() and sve_to_fpsimd() were (conditionally)
called by functions which were defined regardless of CONFIG_ARM64_SVE

arm64/fpsimd: Avoid warning when sve_to_fpsimd() is unused

Historically fpsimd_to_sve() and sve_to_fpsimd() were (conditionally)
called by functions which were defined regardless of CONFIG_ARM64_SVE.
Hence it was necessary that both fpsimd_to_sve() and sve_to_fpsimd()
were always defined and not guarded by ifdeffery.

As a result of the removal of fpsimd_signal_preserve_current_state() in
commit:

929fa99b1215966f ("arm64/fpsimd: signal: Always save+flush state early")

... sve_to_fpsimd() has no callers when CONFIG_ARM64_SVE=n, resulting in
a build-time warnign that it is unused:

| arch/arm64/kernel/fpsimd.c:676:13: warning: unused function 'sve_to_fpsimd' [-Wunused-function]
| 676 | static void sve_to_fpsimd(struct task_struct *task)
| | ^~~~~~~~~~~~~
| 1 warning generated.

In contrast, fpsimd_to_sve() still has callers which are defined when
CONFIG_ARM64_SVE=n, and it would be awkward to hide this behind
ifdeffery and/or to use stub functions.

For now, suppress the warning by marking both fpsimd_to_sve() and
sve_to_fpsimd() as 'static inline', as we usually do for stub functions.
The compiler will no longer warn if either function is unused.

Aside from suppressing the warning, there should be no functional change
as a result of this patch.

Link: https://lore.kernel.org/linux-arm-kernel/20250429194600.GA26883@willie-the-truck/
Reported-by: Will Deacon <will@kernel.org>
Fixes: 929fa99b1215 ("arm64/fpsimd: signal: Always save+flush state early")
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Mark Brown <broonie@kernel.org>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20250430173240.4023627-1-mark.rutland@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>

show more ...


Revision tags: v6.15-rc4, v6.15-rc3, v6.15-rc2, v6.15-rc1, v6.14
# e04796c8 18-Mar-2025 Ard Biesheuvel <ardb@kernel.org>

arm64/fpsimd: Avoid unnecessary per-CPU buffers for EFI runtime calls

The EFI specification has some elaborate rules about which runtime
services may be called while another runtime service call is

arm64/fpsimd: Avoid unnecessary per-CPU buffers for EFI runtime calls

The EFI specification has some elaborate rules about which runtime
services may be called while another runtime service call is already in
progress. In Linux, however, for simplicity, all EFI runtime service
invocations are serialized via the efi_runtime_lock semaphore.

This implies that calls to the helper pair arch_efi_call_virt_setup()
and arch_efi_call_virt_teardown() are serialized too, and are guaranteed
not to nest. Furthermore, the arm64 arch code has its own spinlock to
serialize use of the EFI runtime stack, of which only a single instance
exists.

This all means that the FP/SIMD and SVE state preserve/restore logic in
__efi_fpsimd_begin() and __efi_fpsimd_end() are also serialized, and
only a single instance of the associated per-CPU variables can ever be
in use at the same time. There is therefore no need at all for per-CPU
variables here, and they can all be replaced with singleton instances.
This saves a non-trivial amount of memory on systems with many CPUs.

To be more robust against potential future changes in the core EFI code
that may invalidate the reasoning above, move the invocations of
__efi_fpsimd_begin() and __efi_fpsimd_end() into the critical section
covered by the efi_rt_lock spinlock.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Reviewed-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20250318132421.3155799-2-ardb+git@google.com
Signed-off-by: Will Deacon <will@kernel.org>

show more ...


# 929fa99b 09-Apr-2025 Mark Rutland <mark.rutland@arm.com>

arm64/fpsimd: signal: Always save+flush state early

There are several issues with the way the native signal handling code
manipulates FPSIMD/SVE/SME state, described in detail below. These
issues la

arm64/fpsimd: signal: Always save+flush state early

There are several issues with the way the native signal handling code
manipulates FPSIMD/SVE/SME state, described in detail below. These
issues largely result from races with preemption and inconsistent
handling of live state vs saved state.

Known issues with native FPSIMD/SVE/SME state management include:

* On systems with FPMR, the code to save/restore the FPMR accesses the
register while it is not owned by the current task. Consequently, this
may corrupt the FPMR of the current task and/or may corrupt the FPMR
of an unrelated task. The FPMR save/restore has been broken since it
was introduced in commit:

8c46def44409fc91 ("arm64/signal: Add FPMR signal handling")

* On systems with SME, setup_return() modifies both the live register
state and the saved state register state regardless of whether the
task's state is live, and without holding the cpu fpsimd context.
Consequently:

- This may corrupt the state an unrelated task which has PSTATE.SM set
and/or PSTATE.ZA set.

- The task may enter the signal handler in streaming mode, and or with
ZA storage enabled unexpectedly.

- The task may enter the signal handler in non-streaming SVE mode with
stale SVE register state, which may have been inherited from
streaming SVE mode unexpectedly. Where the streaming and
non-streaming vector lengths differ, this may be packed into
registers arbitrarily.

This logic has been broken since it was introduced in commit:

40a8e87bb32855b3 ("arm64/sme: Disable ZA and streaming mode when handling signals")

Further incorrect manipulation of state was added in commits:

ea64baacbc36a0d5 ("arm64/signal: Flush FPSIMD register state when disabling streaming mode")
baa8515281b30861 ("arm64/fpsimd: Track the saved FPSIMD state type separately to TIF_SVE")

* Several restoration functions use fpsimd_flush_task_state() to discard
the live FPSIMD/SVE/SME while the in-memory copy is stale.

When a subset of the FPSIMD/SVE/SME state is restored, the remainder
may be non-deterministically reset to a stale snapshot from some
arbitrary point in the past.

This non-deterministic discarding was introduced in commit:

8cd969d28fd2848d ("arm64/sve: Signal handling support")

As of that commit, when TIF_SVE was initially clear, failure to
restore the SVE signal frame could reset the FPSIMD registers to a
stale snapshot.

The pattern of discarding unsaved state was subsequently copied into
restoration functions for some new state in commits:

39782210eb7e8763 ("arm64/sme: Implement ZA signal handling")
ee072cf708048c0d ("arm64/sme: Implement signal handling for ZT")

* On systems with SME/SME2, the entire FPSIMD/SVE/SME state may be
loaded onto the CPU redundantly. Either restore_fpsimd_context() or
restore_sve_fpsimd_context() will load the entire FPSIMD/SVE/SME state
via fpsimd_update_current_state() before restore_za_context() and
restore_zt_context() each discard the state via
fpsimd_flush_task_state().

This is purely redundant work, and not a functional bug.

To fix these issues, rework the native signal handling code to always
save+flush the current task's FPSIMD/SVE/SME state before manipulating
that state. This avoids races with preemption and ensures that state is
manipulated consistently regardless of whether it happened to be live
prior to manipulation. This largely involes:

* Using fpsimd_save_and_flush_current_state() to save+flush the state
for both signal delivery and signal return, before the state is
manipulated in any way.

* Removing fpsimd_signal_preserve_current_state() and updating
preserve_fpsimd_context() to explicitly ensure that the FPSIMD state
is up-to-date, as preserve_fpsimd_context() is the only consumer of
the FPSIMD state during signal delivery.

* Modifying fpsimd_update_current_state() to not reload the FPSIMD state
onto the CPU. Ideally we'd remove fpsimd_update_current_state()
entirely, but I've left that for subsequent patches as there are a
number of of other problems with the FPSIMD<->SVE conversion helpers
that should be addressed at the same time. For now I've removed the
misleading comment.

For setup_return(), we need to decide (for ABI reasons) whether signal
delivery should have all the side-effects of an SMSTOP. For now I've
left a TODO comment, as there are other questions in this area that I'll
address with subsequent patches.

Fixes: 8c46def44409 ("arm64/signal: Add FPMR signal handling")
Fixes: 40a8e87bb328 ("arm64/sme: Disable ZA and streaming mode when handling signals")
Fixes: ea64baacbc36 ("arm64/signal: Flush FPSIMD register state when disabling streaming mode")
Fixes: baa8515281b3 ("arm64/fpsimd: Track the saved FPSIMD state type separately to TIF_SVE")
Fixes: 8cd969d28fd2 ("arm64/sve: Signal handling support")
Fixes: 39782210eb7e ("arm64/sme: Implement ZA signal handling")
Fixes: ee072cf70804 ("arm64/sme: Implement signal handling for ZT")
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Mark Brown <broonie@kernel.org>
Cc: Will Deacon <will@kernel.org>
Reviewed-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20250409164010.3480271-13-mark.rutland@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>

show more ...


# d3a18158 09-Apr-2025 Mark Rutland <mark.rutland@arm.com>

arm64/fpsimd: Add fpsimd_save_and_flush_current_state()

When the current task's FPSIMD/SVE/SME state may be live on *any* CPU in
the system, special care must be taken when manipulating that state,

arm64/fpsimd: Add fpsimd_save_and_flush_current_state()

When the current task's FPSIMD/SVE/SME state may be live on *any* CPU in
the system, special care must be taken when manipulating that state, as
this manipulation can race with preemption and/or asynchronous usage of
FPSIMD/SVE/SME (e.g. kernel-mode NEON in softirq handlers).

Even when manipulation is is protected with get_cpu_fpsimd_context() and
get_cpu_fpsimd_context(), the logic necessary when the state is live on
the current CPU can be wildly different from the logic necessary when
the state is not live on the current CPU. A number of historical and
extant issues result from failing to handle these cases consistetntly
and/or correctly.

To make it easier to get such manipulation correct, add a new
fpsimd_save_and_flush_current_state() helper function, which ensures
that the current task's state has been saved to memory and any stale
state on any CPU has been "flushed" such that is not live on any CPU in
the system. This will allow code to safely manipulate the saved state
without risk of races.

Subsequent patches will use the new function.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Mark Brown <broonie@kernel.org>
Cc: Will Deacon <will@kernel.org>
Reviewed-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20250409164010.3480271-11-mark.rutland@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>

show more ...


# c94f2f32 09-Apr-2025 Mark Rutland <mark.rutland@arm.com>

arm64/fpsimd: Fix merging of FPSIMD state during signal return

For backwards compatibility reasons, when a signal return occurs which
restores SVE state, the effective lower 128 bits of each of the

arm64/fpsimd: Fix merging of FPSIMD state during signal return

For backwards compatibility reasons, when a signal return occurs which
restores SVE state, the effective lower 128 bits of each of the SVE
vector registers are restored from the corresponding FPSIMD vector
register in the FPSIMD signal frame, overriding the values in the SVE
signal frame. This is intended to be the case regardless of streaming
mode.

To make this happen, restore_sve_fpsimd_context() uses
fpsimd_update_current_state() to merge the lower 128 bits from the
FPSIMD signal frame into the SVE register state. Unfortunately,
fpsimd_update_current_state() performs this merging dependent upon
TIF_SVE, which is not always correct for streaming SVE register state:

* When restoring non-streaming SVE register state there is no observable
problem, as the signal return code configures TIF_SVE and the saved
fp_type to match before calling fpsimd_update_current_state(), which
observes either:

- TIF_SVE set AND fp_type == FP_STATE_SVE
- TIF_SVE clear AND fp_type == FP_STATE_FPSIMD

* On systems which have SME but not SVE, TIF_SVE cannot be set. Thus the
merging will never happen for the streaming SVE register state.

* On systems which have SVE and SME, TIF_SVE can be set and cleared
independently of PSTATE.SM. Thus the merging may or may not happen for
streaming SVE register state.

As TIF_SVE can be cleared non-deterministically during syscalls
(including at the start of sigreturn()), the merging may occur
non-deterministically from the perspective of userspace.

This logic has been broken since its introduction in commit:

85ed24dad2904f7c ("arm64/sme: Implement streaming SVE signal handling")

... at which point both fpsimd_signal_preserve_current_state() and
fpsimd_update_current_state() only checked TIF SVE. When PSTATE.SM==1
and TIF_SVE was clear, signal delivery would place stale FPSIMD state
into the FPSIMD signal frame, and signal return would not merge this
into the restored register state.

Subsequently, signal delivery was fixed as part of commit:

61da7c8e2a602f66 ("arm64/signal: Don't assume that TIF_SVE means we saved SVE state")

... but signal restore was not given a corresponding fix, and when
TIF_SVE was clear, signal restore would still fail to merge the FPSIMD
state into the restored SVE register state. The 'Fixes' tag did not
indicate that this had been broken since its introduction.

Fix this by merging the FPSIMD state dependent upon the saved fp_type,
matching what we (currently) do during signal delivery.

As described above, when backporting this commit, it will also be
necessary to backport commit:

61da7c8e2a602f66 ("arm64/signal: Don't assume that TIF_SVE means we saved SVE state")

... and prior to commit:

baa8515281b30861 ("arm64/fpsimd: Track the saved FPSIMD state type separately to TIF_SVE")

... it will be necessary for fpsimd_signal_preserve_current_state() and
fpsimd_update_current_state() to consider both TIF_SVE and
thread_sm_enabled(&current->thread), in place of the saved fp_type.

Fixes: 85ed24dad290 ("arm64/sme: Implement streaming SVE signal handling")
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Mark Brown <broonie@kernel.org>
Cc: Will Deacon <will@kernel.org>
Reviewed-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20250409164010.3480271-10-mark.rutland@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>

show more ...


# a90878f2 09-Apr-2025 Mark Rutland <mark.rutland@arm.com>

arm64/fpsimd: Reset FPMR upon exec()

An exec() is expected to reset all FPSIMD/SVE/SME state, and barring
special handling of the vector lengths, the state is expected to reset
to zero. This reset i

arm64/fpsimd: Reset FPMR upon exec()

An exec() is expected to reset all FPSIMD/SVE/SME state, and barring
special handling of the vector lengths, the state is expected to reset
to zero. This reset is handled in fpsimd_flush_thread(), which the core
exec() code calls via flush_thread().

When support was added for FPMR, no logic was added to
fpsimd_flush_thread() to reset the FPMR value, and thus it is
erroneously inherited across an exec().

Add the missing reset of FPMR.

Fixes: 203f2b95a882 ("arm64/fpsimd: Support FEAT_FPMR")
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Mark Brown <broonie@kernel.org>
Cc: Will Deacon <will@kernel.org>
Reviewed-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20250409164010.3480271-9-mark.rutland@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>

show more ...


# 01098d89 09-Apr-2025 Mark Rutland <mark.rutland@arm.com>

arm64/fpsimd: Avoid clobbering kernel FPSIMD state with SMSTOP

On system with SME, a thread's kernel FPSIMD state may be erroneously
clobbered during a context switch immediately after that state is

arm64/fpsimd: Avoid clobbering kernel FPSIMD state with SMSTOP

On system with SME, a thread's kernel FPSIMD state may be erroneously
clobbered during a context switch immediately after that state is
restored. Systems without SME are unaffected.

If the CPU happens to be in streaming SVE mode before a context switch
to a thread with kernel FPSIMD state, fpsimd_thread_switch() will
restore the kernel FPSIMD state using fpsimd_load_kernel_state() while
the CPU is still in streaming SVE mode. When fpsimd_thread_switch()
subsequently calls fpsimd_flush_cpu_state(), this will execute an
SMSTOP, causing an exit from streaming SVE mode. The exit from
streaming SVE mode will cause the hardware to reset a number of
FPSIMD/SVE/SME registers, clobbering the FPSIMD state.

Fix this by calling fpsimd_flush_cpu_state() before restoring the kernel
FPSIMD state.

Fixes: e92bee9f861b ("arm64/fpsimd: Avoid erroneous elide of user state reload")
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Ard Biesheuvel <ardb@kernel.org>
Cc: Mark Brown <broonie@kernel.org>
Cc: Will Deacon <will@kernel.org>
Reviewed-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20250409164010.3480271-8-mark.rutland@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>

show more ...


# e5fa85fc 09-Apr-2025 Mark Brown <broonie@kernel.org>

arm64/fpsimd: Don't corrupt FPMR when streaming mode changes

When the effective value of PSTATE.SM is changed from 0 to 1 or from 1
to 0 by any method, an entry or exit to/from streaming SVE mode is

arm64/fpsimd: Don't corrupt FPMR when streaming mode changes

When the effective value of PSTATE.SM is changed from 0 to 1 or from 1
to 0 by any method, an entry or exit to/from streaming SVE mode is
performed, and hardware automatically resets a number of registers. As
of ARM DDI 0487 L.a, this means:

* All implemented bits of the SVE vector registers are set to zero.

* All implemented bits of the SVE predicate registers are set to zero.

* All implemented bits of FFR are set to zero, if FFR is implemented in
the new mode.

* FPSR is set to 0x0000_0000_0800_009f.

* FPMR is set to 0, if FPMR is implemented.

Currently task_fpsimd_load() restores FPMR before restoring SVCR (which
is an accessor for PSTATE.{SM,ZA}), and so the restored value of FPMR
may be clobbered if the restored value of PSTATE.SM happens to differ
from the initial value of PSTATE.SM.

Fix this by moving the restore of FPMR later.

Note: this was originally posted as [1].

Fixes: 203f2b95a882 ("arm64/fpsimd: Support FEAT_FPMR")
Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/linux-arm-kernel/20241204-arm64-sme-reenable-v2-2-bae87728251d@kernel.org/
[ Rutland: rewrite commit message ]
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Link: https://lore.kernel.org/r/20250409164010.3480271-7-mark.rutland@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>

show more ...


12345678910>>...69