| #
d142ab35
|
| 14-Apr-2026 |
Linus Torvalds <torvalds@linux-foundation.org> |
Merge tag 'crc-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux
Pull CRC updates from Eric Biggers:
- Several improvements related to crc_kunit, to align with the standar
Merge tag 'crc-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux
Pull CRC updates from Eric Biggers:
- Several improvements related to crc_kunit, to align with the standard KUnit conventions and make it easier for developers and CI systems to run this test suite
- Add an arm64-optimized implementation of CRC64-NVME
- Remove unused code for big endian arm64
* tag 'crc-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux: lib/crc: arm64: Simplify intrinsics implementation lib/crc: arm64: Use existing macros for kernel-mode FPU cflags lib/crc: arm64: Drop unnecessary chunking logic from crc64 lib/crc: arm64: Assume a little-endian kernel lib/crc: arm64: add NEON accelerated CRC64-NVMe implementation lib/crc: arm64: Drop check for CONFIG_KERNEL_MODE_NEON crypto: crc32c - Remove another outdated comment crypto: crc32c - Remove more outdated usage information kunit: configs: Enable all CRC tests in all_tests.config lib/crc: tests: Add a .kunitconfig file lib/crc: tests: Add CRC_ENABLE_ALL_FOR_KUNIT lib/crc: tests: Make crc_kunit test only the enabled CRC variants
show more ...
|
| #
e0718ed6
|
| 30-Mar-2026 |
Ard Biesheuvel <ardb@kernel.org> |
lib/crc: arm64: Drop unnecessary chunking logic from crc64
On arm64, kernel mode NEON executes with preemption enabled, so there is no need to chunk the input by hand.
Signed-off-by: Ard Biesheuvel
lib/crc: arm64: Drop unnecessary chunking logic from crc64
On arm64, kernel mode NEON executes with preemption enabled, so there is no need to chunk the input by hand.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20260330144630.33026-8-ardb@kernel.org Signed-off-by: Eric Biggers <ebiggers@kernel.org>
show more ...
|
| #
63432fd6
|
| 29-Mar-2026 |
Demian Shulhan <demyansh@gmail.com> |
lib/crc: arm64: add NEON accelerated CRC64-NVMe implementation
Implement an optimized CRC64 (NVMe) algorithm for ARM64 using NEON Polynomial Multiply Long (PMULL) instructions. The generic shift-and
lib/crc: arm64: add NEON accelerated CRC64-NVMe implementation
Implement an optimized CRC64 (NVMe) algorithm for ARM64 using NEON Polynomial Multiply Long (PMULL) instructions. The generic shift-and-XOR software implementation is slow, which creates a bottleneck in NVMe and other storage subsystems.
The acceleration is implemented using C intrinsics (<arm_neon.h>) rather than raw assembly for better readability and maintainability.
Key highlights of this implementation: - Uses 4KB chunking inside scoped_ksimd() to avoid preemption latency spikes on large buffers. - Pre-calculates and loads fold constants via vld1q_u64() to minimize register spilling. - Benchmarks show the break-even point against the generic implementation is around 128 bytes. The PMULL path is enabled only for len >= 128.
Performance results (kunit crc_benchmark on Cortex-A72): - Generic (len=4096): ~268 MB/s - PMULL (len=4096): ~1556 MB/s (nearly 6x improvement)
Signed-off-by: Demian Shulhan <demyansh@gmail.com> Link: https://lore.kernel.org/r/20260329074338.1053550-1-demyansh@gmail.com Signed-off-by: Eric Biggers <ebiggers@kernel.org>
show more ...
|