History log of /src/lib/libc/aarch64/string/strlen.S (Results 1 – 6 of 6)
Revision Date Author Comments
# 521c1fe0 13-Jan-2025 Robert Clausecker <fuz@FreeBSD.org>

libc/aarch64: fix strlen() when flush-to-zero is set

Our SIMD-enhanced strlen() implementation for AArch64 uses
a floating-point comparison to compare a bit mask to zero.
This works fine under normal circumstances, but fails if
the FZ (flush-to-zero) flag is set in FPCR (the floating-point
control register) as then the CPU no longer distinguishes
denormals from zero.

This was not caught during testing; this flag is rarely set
and programs that do so rarely perform string manipulation.

Avoid this problem by using an integer comparison instead.
The performance impact seems to be small (about 0.5%) on
the Windows 2023 Dev Kit, but seems to be more significant
(up to around 19%) on the RPi 5.

Reviewed by: getz
Fixes: 3863fec1ce2dc6033f094a085118605ea89db9e2
Differential Revision: https://reviews.freebsd.org/D48442
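
The failure mode described in the commit message above can be modelled in portable C. The following is a minimal sketch and not the code from strlen.S: the helper names are hypothetical, the 64-bit mask merely stands in for the bit mask the message mentions, and flush-to-zero is simulated in software rather than by setting FZ in FPCR. It shows that a nonzero mask whose bit pattern happens to be a denormal double compares equal to zero once denormals are flushed, while an integer comparison is unaffected.

```c
#include <math.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: reinterpret a 64-bit mask as an IEEE 754 double. */
static double mask_as_double(uint64_t mask)
{
	double d;

	memcpy(&d, &mask, sizeof(d));
	return (d);
}

/* Software model of flush-to-zero: subnormal inputs are treated as zero. */
static double flush_to_zero(double d)
{
	return (fpclassify(d) == FP_SUBNORMAL ? 0.0 : d);
}

/*
 * Floating-point test of the mask against zero.  With fz set, the
 * input is flushed first (assumption: this models the FZ behaviour
 * the commit message describes).
 */
static bool fp_mask_is_zero(uint64_t mask, bool fz)
{
	double d = mask_as_double(mask);

	if (fz)
		d = flush_to_zero(d);
	return (d == 0.0);
}

/* Integer test of the mask against zero; independent of any FP mode. */
static bool int_mask_is_zero(uint64_t mask)
{
	return (mask == 0);
}
```

In this model, a mask of 0x1 has a zero exponent field and a nonzero mantissa, so it is a denormal: fp_mask_is_zero(0x1, true) reports "zero" even though a bit is set, whereas int_mask_is_zero(0x1) does not.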


# 3863fec1 26-Aug-2024 Getz Mikalsen <getz@FreeBSD.org>

lib/libc/aarch64/string: add strlen SIMD implementation

Adds a SIMD-enhanced strlen for AArch64. It takes inspiration from
the amd64 implementation, but I struggled to get the performance I
had hoped for on cores like the Graviton3 when compared to the
existing implementation from Arm Optimized Routines.

See the DR for benchmark results.

Tested by: fuz (exprun)
Reviewed by: fuz, emaste
Sponsored by: Google LLC (GSoC 2024)
PR: 281175
Differential Revision: https://reviews.freebsd.org/D45623
