#
ab93e0dd |
| 06-Aug-2025 |
Dmitry Torokhov <dmitry.torokhov@gmail.com> |
Merge branch 'next' into for-linus
Prepare input updates for 6.17 merge window.
|
#
a7bee4e7 |
| 04-Aug-2025 |
Dmitry Torokhov <dmitry.torokhov@gmail.com> |
Merge tag 'ib-mfd-gpio-input-pwm-v6.17' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/mfd into next
Merge an immutable branch between MFD, GPIO, Input and PWM to resolve conflicts for the merge window pull request.
|
#
0a91336e |
| 02-Aug-2025 |
Huacai Chen <chenhuacai@loongson.cn> |
Merge tag 'bpf-next-6.17' into loongarch-next
LoongArch architecture changes for 6.17 include many BPF features, such as trampoline support, so merge 'bpf-next-6.17' to create a base on which BPF works well.
|
#
a6923c06 |
| 02-Aug-2025 |
Linus Torvalds <torvalds@linux-foundation.org> |
Merge tag 'bpf-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf
Pull bpf fixes from Alexei Starovoitov:
- Fix kCFI failures in JITed BPF code on arm64 (Sami Tolvanen, Puranjay Mohan, Mark Rutland, Maxwell Bland)
- Disallow tail calls between BPF programs that use different cgroup local storage maps to prevent out-of-bounds access (Daniel Borkmann)
- Fix unaligned access in flow_dissector and netfilter BPF programs (Paul Chaignon)
- Avoid possible use of uninitialized mod_len in libbpf (Achill Gilgenast)
* tag 'bpf-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf:
  selftests/bpf: Test for unaligned flow_dissector ctx access
  bpf: Improve ctx access verifier error message
  bpf: Check netfilter ctx accesses are aligned
  bpf: Check flow_dissector ctx accesses are aligned
  arm64/cfi,bpf: Support kCFI + BPF on arm64
  cfi: Move BPF CFI types and helpers to generic code
  cfi: add C CFI type macro
  libbpf: Avoid possible use of uninitialized mod_len
  bpf: Fix oob access in cgroup local storage
  bpf: Move cgroup iterator helpers to bpf.h
  bpf: Move bpf map owner out of common struct
  bpf: Add cookie object to bpf maps
|
#
abad3d0b |
| 30-Jul-2025 |
Daniel Borkmann <daniel@iogearbox.net> |
bpf: Fix oob access in cgroup local storage
Lonial reported that an out-of-bounds access in cgroup local storage can be crafted via tail calls. Consider two programs, each utilizing a cgroup local storage with a different value size, where one program does a tail call into the other. The verifier will validate each of the individual programs just fine. However, in the runtime context the bpf_cg_run_ctx holds a bpf_prog_array_item which contains the BPF program as well as any cgroup local storage flavor the program uses. Helpers such as bpf_get_local_storage() pick this up from the runtime context:
    ctx = container_of(current->bpf_ctx, struct bpf_cg_run_ctx, run_ctx);
    storage = ctx->prog_item->cgroup_storage[stype];

    if (stype == BPF_CGROUP_STORAGE_SHARED)
        ptr = &READ_ONCE(storage->buf)->data[0];
    else
        ptr = this_cpu_ptr(storage->percpu_buf);
For the second program which was called from the originally attached one, this means bpf_get_local_storage() will pick up the former program's map, not its own. With mismatching sizes, this can result in an unintended out-of-bounds access.
To fix this issue, we need to extend bpf_map_owner with an array of storage_cookie[] so that we either (i) match on the exact maps from the original program when the second program uses bpf_get_local_storage(), or (ii) allow the tail call combination when the second program does not use any of the cgroup local storage maps.
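As a rough sketch of that rule in C (illustrative only; storage_cookie[] and the map cookie follow the commit text, the helper itself is hypothetical):

    /* Hypothetical helper: the tail call is compatible if, for every
     * cgroup storage flavor, the candidate program either uses no map
     * (case ii) or uses a map whose cookie matches the one recorded
     * from the original program (case i).
     */
    static bool cgroup_storage_cookies_ok(const struct bpf_map_owner *owner,
                                          const struct bpf_prog_aux *aux)
    {
        u32 i;

        for (i = 0; i < MAX_BPF_CGROUP_STORAGE_TYPE; i++) {
            if (!aux->cgroup_storage[i])
                continue;       /* (ii) no storage used: allowed */
            if (owner->storage_cookie[i] != aux->cgroup_storage[i]->cookie)
                return false;   /* (i) must be the exact same map */
        }
        return true;
    }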
Fixes: 7d9c3427894f ("bpf: Make cgroup storages shared between programs on the same cgroup")
Reported-by: Lonial Con <kongln9170@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/r/20250730234733.530041-4-daniel@iogearbox.net
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
#
fd1c98f0 |
| 30-Jul-2025 |
Daniel Borkmann <daniel@iogearbox.net> |
bpf: Move bpf map owner out of common struct
Given this is only relevant for BPF tail call maps, keeping it in the common struct adds up space and penalizes other map types. We also need to extend it with further objects to track / compare to. Therefore, let's move it out into a separate structure and dynamically allocate it only for BPF tail call maps.
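Schematically, the change looks like this (a sketch with abbreviated fields; the real structs carry more detail):

    /* Owner-tracking fields move out of struct bpf_map into a separate
     * struct that is allocated on demand, only for tail call maps.
     */
    struct bpf_map_owner {
        enum bpf_prog_type type;
        bool jited;
        bool xdp_has_frags;
        const struct btf_type *attach_func_proto;
    };

    struct bpf_map {
        /* ... fields common to all map types ... */
        spinlock_t owner_lock;
        struct bpf_map_owner *owner;    /* NULL until first tail call check */
    };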
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/r/20250730234733.530041-2-daniel@iogearbox.net
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
#
d9104cec |
| 30-Jul-2025 |
Linus Torvalds <torvalds@linux-foundation.org> |
Merge tag 'bpf-next-6.17' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Pull bpf updates from Alexei Starovoitov:
- Remove usermode driver (UMD) framework (Thomas Weißschuh)
- Introduce Strongly Connected Component (SCC) in the verifier to detect loops and refine register liveness (Eduard Zingerman)
- Allow 'void *' cast using bpf_rdonly_cast() and corresponding '__arg_untrusted' for global function parameters (Eduard Zingerman)
- Improve precision for BPF_ADD and BPF_SUB operations in the verifier (Harishankar Vishwanathan)
- Teach the verifier that constant pointer to a map cannot be NULL (Ihor Solodrai)
- Introduce BPF streams for error reporting of various conditions detected by BPF runtime (Kumar Kartikeya Dwivedi)
- Teach the verifier to insert runtime speculation barrier (lfence on x86) to mitigate speculative execution instead of rejecting the programs (Luis Gerhorst)
- Various improvements for 'veristat' (Mykyta Yatsenko)
- For CONFIG_DEBUG_KERNEL config warn on internal verifier errors to improve bug detection by syzbot (Paul Chaignon)
- Support BPF private stack on arm64 (Puranjay Mohan)
- Introduce bpf_cgroup_read_xattr() kfunc to read xattr of cgroup's node (Song Liu)
- Introduce kfuncs for read-only string operations (Viktor Malik)
- Implement show_fdinfo() for bpf_links (Tao Chen)
- Reduce verifier's stack consumption (Yonghong Song)
- Implement mprog API for cgroup-bpf programs (Yonghong Song)
* tag 'bpf-next-6.17' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (192 commits)
  selftests/bpf: Migrate fexit_noreturns case into tracing_failure test suite
  selftests/bpf: Add selftest for attaching tracing programs to functions in deny list
  bpf: Add log for attaching tracing programs to functions in deny list
  bpf: Show precise rejected function when attaching fexit/fmod_ret to __noreturn functions
  bpf: Fix various typos in verifier.c comments
  bpf: Add third round of bounds deduction
  selftests/bpf: Test invariants on JSLT crossing sign
  selftests/bpf: Test cross-sign 64bits range refinement
  selftests/bpf: Update reg_bound range refinement logic
  bpf: Improve bounds when s64 crosses sign boundary
  bpf: Simplify bounds refinement from s32
  selftests/bpf: Enable private stack tests for arm64
  bpf, arm64: JIT support for private stack
  bpf: Move bpf_jit_get_prog_name() to core.c
  bpf, arm64: Fix fp initialization for exception boundary
  umd: Remove usermode driver framework
  bpf/preload: Don't select USERMODE_DRIVER
  selftests/bpf: Fix test dynptr/test_dynptr_memset_xdp_chunks failure
  selftests/bpf: Fix test dynptr/test_dynptr_copy_xdp failure
  selftests/bpf: Increase xdp data size for arm64 64K page size
  ...
|
#
13150742 |
| 29-Jul-2025 |
Linus Torvalds <torvalds@linux-foundation.org> |
Merge tag 'libcrypto-updates-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux
Pull crypto library updates from Eric Biggers: "This is the main crypto library pull request for 6.17. The main focus this cycle is on reorganizing the SHA-1 and SHA-2 code, providing high-quality library APIs for SHA-1 and SHA-2 including HMAC support, and establishing conventions for lib/crypto/ going forward:
- Migrate the SHA-1 and SHA-512 code (and also SHA-384 which shares most of the SHA-512 code) into lib/crypto/. This includes both the generic and architecture-optimized code. Greatly simplify how the architecture-optimized code is integrated. Add an easy-to-use library API for each SHA variant, including HMAC support. Finally, reimplement the crypto_shash support on top of the library API.
- Apply the same reorganization to the SHA-256 code (and also SHA-224 which shares most of the SHA-256 code). This is a somewhat smaller change, due to my earlier work on SHA-256. But this brings in all the same additional improvements that I made for SHA-1 and SHA-512.
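For illustration, the resulting call pattern is a plain function call instead of a crypto_shash transform (a sketch; sha256() is the one-shot helper, and the HMAC spelling is assumed from the commit subjects listed below):

    #include <crypto/sha2.h>

    static void digest_example(const u8 *data, size_t data_len,
                               const u8 *key, size_t key_len)
    {
        u8 digest[SHA256_DIGEST_SIZE];
        u8 mac[SHA256_DIGEST_SIZE];

        /* One-shot hash, no tfm allocation or error handling needed. */
        sha256(data, data_len, digest);

        /* One-shot HMAC with a raw key (assumed name, per the
         * hmac_sha*_usingrawkey additions in this pull).
         */
        hmac_sha256_usingrawkey(key, key_len, data, data_len, mac);
    }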
There are also some smaller changes:
- Move the architecture-optimized ChaCha, Poly1305, and BLAKE2s code from arch/$(SRCARCH)/lib/crypto/ to lib/crypto/$(SRCARCH)/. For these algorithms it's just a move, not a full reorganization yet.
- Fix the MIPS chacha-core.S to build with the clang assembler.
- Fix the Poly1305 functions to work in all contexts.
- Fix a performance regression in the x86_64 Poly1305 code.
- Clean up the x86_64 SHA-NI optimized SHA-1 assembly code.
Note that since the new organization of the SHA code is much simpler, the diffstat of this pull request is negative, despite the addition of new fully-documented library APIs for multiple SHA and HMAC-SHA variants.
These APIs will allow further simplifications across the kernel as users start using them instead of the old-school crypto API. (I've already written a lot of such conversion patches, removing over 1000 more lines of code. But most of those will target 6.18 or later)"
* tag 'libcrypto-updates-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux: (67 commits)
  lib/crypto: arm64/sha512-ce: Drop compatibility macros for older binutils
  lib/crypto: x86/sha1-ni: Convert to use rounds macros
  lib/crypto: x86/sha1-ni: Minor optimizations and cleanup
  crypto: sha1 - Remove sha1_base.h
  lib/crypto: x86/sha1: Migrate optimized code into library
  lib/crypto: sparc/sha1: Migrate optimized code into library
  lib/crypto: s390/sha1: Migrate optimized code into library
  lib/crypto: powerpc/sha1: Migrate optimized code into library
  lib/crypto: mips/sha1: Migrate optimized code into library
  lib/crypto: arm64/sha1: Migrate optimized code into library
  lib/crypto: arm/sha1: Migrate optimized code into library
  crypto: sha1 - Use same state format as legacy drivers
  crypto: sha1 - Wrap library and add HMAC support
  lib/crypto: sha1: Add HMAC support
  lib/crypto: sha1: Add SHA-1 library functions
  lib/crypto: sha1: Rename sha1_init() to sha1_init_raw()
  crypto: x86/sha1 - Rename conflicting symbol
  lib/crypto: sha2: Add hmac_sha*_init_usingrawkey()
  lib/crypto: arm/poly1305: Remove unneeded empty weak function
  lib/crypto: x86/poly1305: Fix performance regression on short messages
  ...
|
Revision tags: v6.16 |
|
#
3ba58312 |
| 24-Jul-2025 |
Puranjay Mohan <puranjay@kernel.org> |
bpf: Move bpf_jit_get_prog_name() to core.c
bpf_jit_get_prog_name() will be used by all JITs when enabling support for private stack. This function is currently implemented in the x86 JIT.
Move the function to core.c so that other JITs can easily use it in their implementation of private stack.
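The function itself is small; roughly (sketched from the description, with the ksym fallback mirroring how BPF programs are named):

    /* Return the program's ksym name when it has been registered,
     * otherwise fall back to the short prog name.
     */
    const char *bpf_jit_get_prog_name(struct bpf_prog *prog)
    {
        if (prog->aux->ksym.prog)
            return prog->aux->ksym.name;
        return prog->aux->name;
    }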
Signed-off-by: Puranjay Mohan <puranjay@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Yonghong Song <yonghong.song@linux.dev>
Link: https://lore.kernel.org/bpf/20250724120257.7299-2-puranjay@kernel.org
|
Revision tags: v6.16-rc7, v6.16-rc6 |
|
#
9503ca2c |
| 12-Jul-2025 |
Eric Biggers <ebiggers@kernel.org> |
lib/crypto: sha1: Rename sha1_init() to sha1_init_raw()
Rename the existing sha1_init() to sha1_init_raw(), since it conflicts with the upcoming library function. It will later be removed, but the rename keeps the kernel building through the introduction of the library.
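The renamed function is just the raw initialization of the five SHA-1 chaining variables (a sketch; the values are the standard FIPS 180 H0..H4 constants):

    static inline void sha1_init_raw(__u32 *buf)
    {
        buf[0] = 0x67452301;
        buf[1] = 0xefcdab89;
        buf[2] = 0x98badcfe;
        buf[3] = 0x10325476;
        buf[4] = 0xc3d2e1f0;
    }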
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250712232329.818226-3-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
|
#
0074250c |
| 07-Jul-2025 |
Alexei Starovoitov <ast@kernel.org> |
Merge branch 'bpf-streams-fixes'
Kumar Kartikeya Dwivedi says:
====================
BPF Streams - Fixes

This set contains some fixes for recently reported issues for BPF streams. Please check individual patches for details.
====================
Link: https://patch.msgid.link/20250705053035.3020320-1-memxor@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
Revision tags: v6.16-rc5 |
|
#
116c8f47 |
| 05-Jul-2025 |
Kumar Kartikeya Dwivedi <memxor@gmail.com> |
bpf: Fix bounds for bpf_prog_get_file_line linfo loop
We may overrun the bounds because linfo and jited_linfo are already advanced to prog->aux->linfo_idx, hence we must only iterate the remaining elements until we reach prog->aux->nr_linfo. Adjust the nr_linfo calculation to fix this. Reported in [0].
[0]: https://lore.kernel.org/bpf/f3527af3b0620ce36e299e97e7532d2555018de2.camel@gmail.com
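In code form the adjustment is roughly (a sketch; names follow the commit text):

    /* linfo and jited_linfo were already advanced by linfo_idx, so only
     * the remaining entries may be scanned, not the full nr_linfo.
     */
    linfo       = prog->aux->linfo + prog->aux->linfo_idx;
    jited_linfo = prog->aux->jited_linfo + prog->aux->linfo_idx;
    nr_linfo    = prog->aux->nr_linfo - prog->aux->linfo_idx;   /* the fix */

    for (i = 0; i < nr_linfo && pc >= jited_linfo[i]; i++)
        ;   /* i - 1 is the closest preceding line info entry */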
Reported-by: Eduard Zingerman <eddyz87@gmail.com>
Fixes: 0e521efaf363 ("bpf: Add function to extract program source info")
Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/r/20250705053035.3020320-2-memxor@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
#
71b4a995 |
| 04-Jul-2025 |
Alexei Starovoitov <ast@kernel.org> |
Merge branch 'bpf-standard-streams'
Kumar Kartikeya Dwivedi says:
====================
BPF Standard Streams

This set introduces a standard output interface with two streams, namely stdout and stderr, for BPF programs. The idea is that these streams will be written to by BPF programs and the kernel, and serve as standard interfaces for informing user space of any BPF runtime violations. Users can also utilize them for printing normal messages for debugging, as is the case with the bpf_printk() and trace_pipe interface.
BPF programs and the kernel can use these streams to output messages. User space can dump these messages using bpftool.
The stream interface itself is implemented using a lockless list, so that we can queue messages from any context. Every printk statement into the stream leads to memory allocation. Allocation itself relies on try_alloc_pages() to construct a bespoke bump allocator to carve out elements. If this fails, we finally give up and drop the message.
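A minimal sketch of the element layout and the enqueue step (names are illustrative):

    /* One allocated record per printed message; a single lockless push
     * makes enqueueing safe from any context, including NMI.
     */
    struct bpf_stream_elem {
        struct llist_node node;
        int total_len;      /* length of the formatted text */
        char str[];         /* message bytes */
    };

    llist_add(&elem->node, &stream->log);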
See commit logs for more details.
Two scenarios are covered: - Deadlocks and timeouts in rqspinlock. - Timeouts for may_goto.
In each we provide the stack trace and source information for the offending BPF programs. Both the C source line and the file and line numbers are printed. The output format is as follows:
ERROR: AA or ABBA deadlock detected for bpf_res_spin_lock
Attempted lock   = 0xff11000108f3a5e0
Total held locks = 1
Held lock[ 0] = 0xff11000108f3a5e0
CPU: 48 UID: 0 PID: 786 Comm: test_progs
Call trace:
  bpf_stream_stage_dump_stack+0xb0/0xd0
  bpf_prog_report_rqspinlock_violation+0x10b/0x130
  bpf_res_spin_lock+0x8c/0xa0
  bpf_prog_3699ea119d1f6ed8_foo+0xe5/0x140
    if (!bpf_res_spin_lock(&v2->lock)) @ stream_bpftool.c:62
  bpf_prog_9b324ec4a1b2a5c0_stream_bpftool_dump_prog_stream+0x7e/0x2d0
    foo(stream); @ stream_bpftool.c:93
  bpf_prog_test_run_syscall+0x102/0x240
  __sys_bpf+0xd68/0x2bf0
  __x64_sys_bpf+0x1e/0x30
  do_syscall_64+0x68/0x140
  entry_SYSCALL_64_after_hwframe+0x76/0x7e

ERROR: Timeout detected for may_goto instruction
CPU: 48 UID: 0 PID: 786 Comm: test_progs
Call trace:
  bpf_stream_stage_dump_stack+0xb0/0xd0
  bpf_prog_report_may_goto_violation+0x6a/0x90
  bpf_check_timed_may_goto+0x4d/0xa0
  arch_bpf_timed_may_goto+0x21/0x40
  bpf_prog_3699ea119d1f6ed8_foo+0x12f/0x140
    while (can_loop) @ stream_bpftool.c:71
  bpf_prog_9b324ec4a1b2a5c0_stream_bpftool_dump_prog_stream+0x7e/0x2d0
    foo(stream); @ stream_bpftool.c:93
  bpf_prog_test_run_syscall+0x102/0x240
  __sys_bpf+0xd68/0x2bf0
  __x64_sys_bpf+0x1e/0x30
  do_syscall_64+0x68/0x140
  entry_SYSCALL_64_after_hwframe+0x76/0x7e
Changelog:
----------
v4 -> v5
v4: https://lore.kernel.org/bpf/20250702031737.407548-1-memxor@gmail.com

* Add acks from Emil.
* Address various nits.
* Add extra failure tests.
* Make deadlock test a little more robust to catch problems.

v3 -> v4
v3: https://lore.kernel.org/bpf/20250624031252.2966759-1-memxor@gmail.com

* Switch to alloc_pages_nolock(), avoid incorrect memcg accounting. (Alexei)
  * We will figure out proper accounting later.
* Drop error limit logic, restrict stream capacity to 100,000 bytes. (Alexei)
* Remove extra invocation of is_bpf_text_address(). (Jiri)
* Avoid emitting NULL byte into the stream text, adjust regex in selftests. (Alexei)
* Add comment around rcu_read_lock() for bpf_prog_ksym_find. (Alexei)
* Tighten stream capacity check selftest.
* Add acks from Andrii.

v2 -> v3
v2: https://lore.kernel.org/bpf/20250524011849.681425-1-memxor@gmail.com

* Fix bug when handling single element stream stage. (Eduard)
* Move to mutex for protection of stream read and copy_to_user(). (Alexei)
* Split bprintf refactor into its own patch. (Alexei)
* Move kfunc definition to common_btf_ids to avoid initcall proliferation. (Alexei)
* Return line number by reference in bpf_prog_get_file_line. (Alexei)
* Remove NULL checks for BTF name pointer. (Alexei)
* Add WARN_ON_ONCE(!rcu_read_lock_held()) in bpf_prog_ksym_find. (Eduard)
* Remove hardcoded stream stage from macros. (Alexei, Eduard)
* Move refactoring hunks to their own patch. (Alexei)
* Add empty opts parameter for future extensibility to libbpf API. (Andrii, Eduard)
* Add BPF_STREAM_{STDOUT,STDERR} to UAPI. (Andrii)
* Add code to match on backtrace output. (Eduard)
* Fix misc nits.
* Add acks.

v1 -> v2
v1: https://lore.kernel.org/bpf/20250507171720.1958296-1-memxor@gmail.com

* Drop arena page fault prints, will be done as follow up. (Alexei)
* Defer Andrii's request to reuse code and Alan's suggestion of error counts to follow up.
* Drop bpf_dynptr_from_mem_slice patch.
* Drop some acks due to heavy reworking.
* Fix KASAN splat in bpf_prog_get_file_line. (Eduard)
* Collapse bpf_prog_ksym_find and is_bpf_text_address into single call. (Eduard)
* Add missing RCU read lock in bpf_prog_ksym_find.
* Fix incorrect error handling in dump_stack_cb.
* Simplify libbpf macro. (Eduard, Andrii)
* Introduce bpf_prog_stream_read() libbpf API. (Eduard, Alexei, Andrii)
* Drop BPF prog from the bpftool, use libbpf API.
* Rework selftests.

RFC v1 -> v1
RFC v1: https://lore.kernel.org/bpf/20250414161443.1146103-1-memxor@gmail.com

* Rebase on bpf-next/master.
* Change output in dump_stack to also print source line. (Alexei)
* Simplify API to single pop() operation. (Eduard, Alexei)
* Add kdoc for bpf_dynptr_from_mem_slice.
* Fix -EINVAL returned from prog_dump_stream. (Eduard)
* Split dump_stack() patch into multiple commits.
* Add macro wrapping stream staging API.
* Change bpftool command from dump to tracelog. (Quentin)
* Add bpftool documentation and bash completion. (Quentin)
* Change license of bpftool to Dual BSD/GPL.
* Simplify memory allocator. (Alexei)
  * No overflow into second page.
  * Remove bpf_mem_alloc() fallback.
* Symlink bpftool BPF program and exercise as selftest. (Eduard)
* Verify output after dumping from ringbuf. (Eduard)
* More failure cases to check API invariants.
* Remove patches for dynptr lifetime fixes (split into separate set).
* Limit maximum error messages, and add stream capacity. (Eduard)
====================
Link: https://patch.msgid.link/20250703204818.925464-1-memxor@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
#
e8d01330 |
| 03-Jul-2025 |
Kumar Kartikeya Dwivedi <memxor@gmail.com> |
bpf: Report may_goto timeout to BPF stderr
Begin reporting may_goto timeouts to BPF program's stderr stream.
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Reviewed-by: Emil Tsalapatis <emil@etsalapatis.com>
Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Link: https://lore.kernel.org/r/20250703204818.925464-8-memxor@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
#
f0c53fd4 |
| 03-Jul-2025 |
Kumar Kartikeya Dwivedi <memxor@gmail.com> |
bpf: Add function to find program from stack trace
In preparation for figuring out the closest program that led to the current point in the kernel, implement a function that scans the stack trace and finds the closest BPF program while walking down it.
Special care needs to be taken to skip over kernel and BPF subprog frames. We basically scan until we find a BPF main prog frame. The assumption is that if a program calls into us transitively, we'll hit it along the way. If not, we end up returning NULL.
Contextually the function will be used in places where we know the program may have called into us.
Due to reliance on arch_bpf_stack_walk(), this function only works on x86 with CONFIG_UNWINDER_ORC, arm64, and s390. Remove the warning from arch_bpf_stack_walk as well since we call it outside bpf_throw() context.
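Conceptually the walk is a consume callback handed to arch_bpf_stack_walk() (a sketch; the callback signature matches its consume_fn, the helper name here is made up):

    static bool find_main_prog_cb(void *cookie, u64 ip, u64 sp, u64 bp)
    {
        struct bpf_prog **progp = cookie;
        struct bpf_prog *prog = bpf_prog_ksym_find(ip);

        /* Skip kernel frames and BPF subprog frames. */
        if (!prog || bpf_is_subprog(prog))
            return true;    /* keep walking down the stack */

        *progp = prog;
        return false;       /* found the main prog, stop */
    }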
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Reviewed-by: Emil Tsalapatis <emil@etsalapatis.com>
Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Link: https://lore.kernel.org/r/20250703204818.925464-6-memxor@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
#
d0903268 |
| 03-Jul-2025 |
Kumar Kartikeya Dwivedi <memxor@gmail.com> |
bpf: Ensure RCU lock is held around bpf_prog_ksym_find
Add a warning to ensure RCU lock is held around tree lookup, and then fix one of the invocations in bpf_stack_walker. The program has an active stack frame and won't disappear. Use the opportunity to remove unneeded invocation of is_bpf_text_address.
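The guard amounts to something like this (sketched; the lookup internals are elided):

    struct bpf_prog *bpf_prog_ksym_find(unsigned long addr)
    {
        struct bpf_ksym *ksym;

        WARN_ON_ONCE(!rcu_read_lock_held());    /* ksym tree is RCU-protected */
        ksym = bpf_ksym_find(addr);
        return ksym ? container_of(ksym, struct bpf_prog_aux, ksym)->prog : NULL;
    }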
Fixes: f18b03fabaa9 ("bpf: Implement BPF exceptions")
Reviewed-by: Emil Tsalapatis <emil@etsalapatis.com>
Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Link: https://lore.kernel.org/r/20250703204818.925464-5-memxor@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
#
0e521efa |
| 03-Jul-2025 |
Kumar Kartikeya Dwivedi <memxor@gmail.com> |
bpf: Add function to extract program source info
Prepare a function for use in future patches that can extract the file info, line info, and the source line number for a given BPF program, provided its program counter.
Only the basename of the file path is provided, given it can be excessively long in some cases.
This will be used in later patches to print source info to the BPF stream.
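The basename trimming can be sketched as follows (illustrative; btf_name_by_offset() and BPF_LINE_INFO_LINE_NUM() are the existing accessors):

    /* Keep only the file's basename, since full paths can be very long. */
    const char *file = btf_name_by_offset(btf, linfo->file_name_off);
    const char *slash = strrchr(file, '/');

    if (slash)
        file = slash + 1;
    *filep = file;
    *linep = BPF_LINE_INFO_LINE_NUM(linfo->line_col);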
Reviewed-by: Emil Tsalapatis <emil@etsalapatis.com>
Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Link: https://lore.kernel.org/r/20250703204818.925464-4-memxor@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
#
5ab154f1 |
| 03-Jul-2025 |
Kumar Kartikeya Dwivedi <memxor@gmail.com> |
bpf: Introduce BPF standard streams
Add support for a stream API to the kernel and expose related kfuncs to BPF programs. Two streams are exposed, BPF_STDOUT and BPF_STDERR. These can be used for printing messages that can be consumed from user space; it is thus similar in spirit to the existing trace_pipe interface.
The kernel will use the BPF_STDERR stream to notify the program of any errors encountered at runtime. BPF programs themselves may use both streams for writing debug messages. BPF library-like code may use BPF_STDERR to print warnings or errors on misuse at runtime.
The implementation of a stream is as follows. Every time a message is emitted from the kernel (directly, or through a BPF program), a record is allocated by bump allocating from a per-cpu region backed by a page obtained using alloc_pages_nolock(). This ensures that we can allocate memory from any context. The eventual plan is to discard this scheme in favor of Alexei's kmalloc_nolock() [0].
This record is then locklessly inserted into a list (llist_add()) so that the printing side doesn't require holding any locks, and works in any context. Each stream has a maximum capacity of 4MB of text, and each printed message is accounted against this limit.
Messages from a program are emitted using the bpf_stream_vprintk kfunc, which takes a stream_id argument in addition to working otherwise similar to bpf_trace_vprintk.
The bprintf buffer helpers are extracted out to be reused for printing the string into them before copying it into the stream, so that we can (with the defined max limit) format a string and know its true length before performing allocations of the stream element.
For consuming elements from a stream, we expose a bpf(2) syscall command named BPF_PROG_STREAM_READ_BY_FD, which allows reading data from the stream of a given prog_fd into a user space buffer. The main logic is implemented in bpf_stream_read(). The log messages are queued in bpf_stream::log by the bpf_stream_vprintk kfunc, and then pulled and ordered correctly in the stream backlog.
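From user space the consumption loop then looks roughly like this (using the bpf_prog_stream_read() libbpf wrapper named in the series cover letter; option handling is elided):

    char buf[4096];
    int n;

    /* Drain the program's stderr stream into our own stderr. */
    while ((n = bpf_prog_stream_read(prog_fd, BPF_STREAM_STDERR,
                                     buf, sizeof(buf), NULL)) > 0)
        fwrite(buf, 1, n, stderr);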
For this purpose, we hold a lock around bpf_stream_backlog_peek(), as llist_del_first() (if we maintained a second lockless list for the backlog) wouldn't be safe from multiple threads anyway. Then, if we fail to find something in the backlog log, we splice out everything from the lockless log, and place it in the backlog log, and then return the head of the backlog. Once the full length of the element is consumed, we will pop it and free it.
The lockless list bpf_stream::log is a LIFO stack. Elements obtained using a llist_del_all() operation are in LIFO order, thus would break the chronological ordering if printed directly. Hence, this batch of messages is first reversed. Then, it is stashed into a separate list in the stream, i.e. the backlog_log. The head of this list is the actual message that should always be returned to the caller. All of this is done in bpf_stream_backlog_fill().
From the kernel side, writing into the stream is a bit more involved than a typical printk. First, the kernel typically prints a collection of messages into the stream, and parallel writers may suffer from interleaving of messages. To ensure each group of messages is visible atomically, we take advantage of the lockless list used for pushing in messages.
To enable this, we add a bpf_stream_stage() macro, and require kernel users to use bpf_stream_printk statements for the passed expression to write into the stream. Underneath the macro, we have a message staging API, where a bpf_stream_stage object on the stack accumulates the messages being printed into a local llist_head, and then a commit operation splices the whole batch into the stream's lockless log list.
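Kernel-side usage then reads roughly as follows (a sketch; macro and helper names are taken from this description, exact arguments may differ):

    struct bpf_stream_stage ss;

    /* Everything printed inside the expression is committed to the
     * stream as one contiguous batch, so writers cannot interleave.
     */
    bpf_stream_stage(ss, prog, BPF_STDERR, ({
        bpf_stream_printk(ss, "ERROR: AA or ABBA deadlock detected\n");
        bpf_stream_printk(ss, "Attempted lock = 0x%lx\n", (long)lock);
    }));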
This is especially pertinent for rqspinlock deadlock messages printed to program streams. After this change, we see each deadlock invocation as a non-interleaving contiguous message without any confusion on the reader's part, improving their user experience in debugging the fault.
While programs cannot benefit from this staged stream writing API, they could just as well hold an rqspinlock around their print statements to serialize messages, hence this is kept kernel-internal for now.
Overall, this infrastructure provides NMI-safe any context printing of messages to two dedicated streams.
Later patches will add support for printing splats in case of BPF arena page faults, rqspinlock deadlocks, and cond_break timeouts, and integration of this facility into bpftool for dumping messages to user space.
[0]: https://lore.kernel.org/bpf/20250501032718.65476-1-alexei.starovoitov@gmail.com
Reviewed-by: Eduard Zingerman <eddyz87@gmail.com>
Reviewed-by: Emil Tsalapatis <emil@etsalapatis.com>
Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Link: https://lore.kernel.org/r/20250703204818.925464-3-memxor@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
Revision tags: v6.16-rc4 |
|
#
74f1af95 |
| 29-Jun-2025 |
Rob Clark <robin.clark@oss.qualcomm.com> |
Merge remote-tracking branch 'drm/drm-next' into msm-next
Back-merge drm-next to (indirectly) get arm-smmu updates for making stall-on-fault more reliable.
Signed-off-by: Rob Clark <robin.clark@oss.qualcomm.com>
|
Revision tags: v6.16-rc3, v6.16-rc2 |
|
#
c598d5eb |
| 11-Jun-2025 |
Thomas Zimmermann <tzimmermann@suse.de> |
Merge drm/drm-next into drm-misc-next
Backmerging to forward to v6.16-rc1
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
|
#
5fcf896e |
| 10-Jun-2025 |
Alexei Starovoitov <ast@kernel.org> |
Merge branch 'bpf-mitigate-spectre-v1-using-barriers'
Luis Gerhorst says:
====================
This improves the expressiveness of unprivileged BPF by inserting speculation barriers instead of rejecting the programs.
The approach was previously presented at LPC'24 [1] and RAID'24 [2].
To mitigate the Spectre v1 (PHT) vulnerability, the kernel rejects potentially-dangerous unprivileged BPF programs as of commit 9183671af6db ("bpf: Fix leakage under speculation on mispredicted branches"). In [2], we have analyzed 364 object files from open source projects (Linux Samples and Selftests, BCC, Loxilb, Cilium, libbpf Examples, Parca, and Prevail) and found that this affects 31% to 54% of programs.
To resolve this in the majority of cases this patchset adds a fall-back for mitigating Spectre v1 using speculation barriers. The kernel still optimistically attempts to verify all speculative paths but uses speculation barriers against v1 when unsafe behavior is detected. This allows for more programs to be accepted without disabling the BPF Spectre mitigations (e.g., by setting cpu_mitigations_off()).
For this, it relies on the fact that speculation barriers generally prevent all later instructions from executing if the speculation was not correct (not only loads). See patch 7 ("bpf: Fall back to nospec for Spectre v1") for a detailed description and references to the relevant vendor documentation (AMD and Intel x86-64, ARM64, and PowerPC).
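In verifier terms, the fall-back can be pictured like this (purely illustrative pseudo-logic, not the actual diff):

    /* On a speculative path, an unsafe access no longer rejects the
     * program; the instruction is marked so the JIT emits a barrier
     * (e.g. lfence on x86-64) before it, squashing the misspeculation
     * and everything that would follow it.
     */
    if (state->speculative && error_recoverable_with_nospec(err)) {
        cur_aux(env)->nospec = true;
        err = 0;    /* continue verifying the architectural path */
    }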
In [1] we have measured the overhead of this approach relative to having mitigations off and including the upstream Spectre v4 mitigations. For event tracing and stack-sampling profilers, we found that mitigations increase BPF program execution time by 0% to 62%. For the Loxilb network load balancer, we have measured a 14% slowdown in SCTP performance but no significant slowdown for TCP. This overhead only applies to programs that were previously rejected.
I reran the expressiveness-evaluation with v6.14 and made sure the main results still match those from [1] and [2] (which used v6.5).
Main design decisions are:
* Do not use separate bytecode insns for v1 and v4 barriers (inspired by Daniel Borkmann's question at LPC). This simplifies the verifier significantly and has the only downside that performance on PowerPC is not as high as it could be.
* Allow archs to still disable v1/v4 mitigations separately by setting bpf_jit_bypass_spec_v1/v4(). This has the benefit that archs can benefit from improved BPF expressiveness / performance if they are not vulnerable (e.g., ARM64 for v4 in the kernel).
* Do not remove the empty BPF_NOSPEC implementation for backends for which it is unknown whether they are vulnerable to Spectre v1.
[1] https://lpc.events/event/18/contributions/1954/ ("Mitigating Spectre-PHT using Speculation Barriers in Linux eBPF") [2] https://arxiv.org/pdf/2405.00078 ("VeriFence: Lightweight and Precise Spectre Defenses for Untrusted Linux Kernel Extensions")
Changes:
* v3 -> v4:
  - Remove insn parameter from do_check_insn() and extract process_bpf_exit_full as a function as requested by Eduard
  - Investigate apparent sanitize_check_bounds() bug reported by Kartikeya (it appears to not be a bug, only confusing code), sent separate patch to document it and add an assert
  - Remove already-merged commit 1 ("selftests/bpf: Fix caps for __xlated/jited_unpriv")
  - Drop former commit 10 ("bpf: Allow nospec-protected var-offset stack access") as it did not include a test and there are other places where var-off is rejected. Also, none of the tested real-world programs used var-off in the paper. Therefore keep the old behavior for now and potentially prepare a patch that converts all cases later if required.
  - Add link to AMD lfence and PowerPC speculation barrier (ori 31,31,0) documentation
  - Move detailed barrier documentation to commit 7 ("bpf: Fall back to nospec for Spectre v1")
  - Link to v3: https://lore.kernel.org/all/20250501073603.1402960-1-luis.gerhorst@fau.de/

* v2 -> v3:
  - Fix https://lore.kernel.org/oe-kbuild-all/202504212030.IF1SLhz6-lkp@intel.com/ and similar by moving the bpf_jit_bypass_spec_v1/v4() prototypes out of the #ifdef CONFIG_BPF_SYSCALL. Decided not to move them to filter.h (where similar bpf_jit_*() prototypes live) as they would still have to be duplicated in bpf.h to be usable to bpf_bypass_spec_v1/v4() (unless including filter.h in bpf.h is an option).
  - Fix https://lore.kernel.org/oe-kbuild-all/202504220035.SoGveGpj-lkp@intel.com/ by moving the variable declarations out of the switch-case.
  - Build touched C files with W=2 and bpf config on x86 to check that there are no other warnings introduced.
  - Found 3 more checkpatch warnings that can be fixed without degrading readability.
  - Rebase to bpf-next 2025-05-01
  - Link to v2: https://lore.kernel.org/bpf/20250421091802.3234859-1-luis.gerhorst@fau.de/

* v1 -> v2:
  - Drop former commits 9 ("bpf: Return PTR_ERR from push_stack()") and 11 ("bpf: Fall back to nospec for spec path verification") as suggested by Alexei. This series therefore no longer changes push_stack() to return PTR_ERR.
  - Add detailed explanation of how lfence works internally and how it affects the algorithm.
  - Add tests checking that nospec instructions are inserted in expected locations using __xlated_unpriv as suggested by Eduard (also, include a fix for __xlated_unpriv)
  - Add a test for the mitigations from the description of commit 9183671af6db ("bpf: Fix leakage under speculation on mispredicted branches")
  - Remove unused variables from do_check[_insn]() as suggested by Eduard.
  - Remove INSN_IDX_MODIFIED to improve readability as suggested by Eduard. This also causes the nospec_result-check to run (and fail) for jumping-ops. Add a warning to assert that this check must never succeed in that case.
  - Add details on the safety of patch 10 ("bpf: Allow nospec-protected var-offset stack access") based on the feedback on v1.
  - Rebase to bpf-next-250420
  - Link to v1: https://lore.kernel.org/all/20250313172127.1098195-1-luis.gerhorst@fau.de/

* RFC -> v1:
  - rebase to bpf-next-250313
  - tests: mark expected successes/new errors
  - add bpf_jit_bypass_spec_v1/v4() to avoid #ifdef in bpf_bypass_spec_v1/v4()
  - ensure that nospec with v1-support is implemented for archs for which GCC supports speculation barriers, except for MIPS
  - arm64: emit speculation barrier
  - powerpc: change nospec to include v1 barrier
  - discuss potential security (archs that do not impl. BPF nospec) and performance (only PowerPC) regressions
  - Link to RFC: https://lore.kernel.org/bpf/20250224203619.594724-1-luis.gerhorst@fau.de/
====================
Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Link: https://patch.msgid.link/20250603205800.334980-1-luis.gerhorst@fau.de
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
Revision tags: v6.16-rc1 |
|
#
dff883d9 |
| 03-Jun-2025 |
Luis Gerhorst <luis.gerhorst@fau.de> |
bpf, arm64, powerpc: Change nospec to include v1 barrier
This changes the semantics of BPF_NOSPEC (previously a v4-only barrier) to always emit a speculation barrier that works against both Spectre v1 AND v4. If mitigation is not needed on an architecture, the backend should set bpf_jit_bypass_spec_v4/v1().
As of now, this commit only has the user-visible implication that unpriv BPF's performance on PowerPC is reduced. This is the case because we have to emit additional v1 barrier instructions for BPF_NOSPEC now.
This commit is required for a future commit to allow us to rely on BPF_NOSPEC for Spectre v1 mitigation. As of this commit, the feature that nospec acts as a v1 barrier is unused.
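On arm64, for instance, the lowering tested further below amounts to something like this (a sketch; the exact capability check and emit macros in the JIT may differ):

    case BPF_ST | BPF_NOSPEC:
        /* Prefer the dedicated speculation barrier when the CPU has
         * it, otherwise fall back to the 'dsb nsh; isb' sequence.
         */
        if (cpus_have_final_cap(ARM64_HAS_SB)) {
            emit(A64_SB, ctx);
        } else {
            emit(A64_DSB_NSH, ctx);
            emit(A64_ISB, ctx);
        }
        break;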
Commit f5e81d111750 ("bpf: Introduce BPF nospec instruction for mitigating Spectre v4") noted that mitigation instructions for v1 and v4 might be different on some archs. While this would potentially offer improved performance on PowerPC, it was dismissed after the following considerations:
* Only having one barrier simplifies the verifier and allows us to easily rely on v4-induced barriers for reducing the complexity of v1-induced speculative path verification.
* For the architectures that implemented BPF_NOSPEC, only PowerPC has distinct instructions for v1 and v4. Even there, some insns may be shared between the barriers for v1 and v4 (e.g., 'ori 31,31,0' and 'sync'). If this is still found to impact performance in an unacceptable way, BPF_NOSPEC can be split into BPF_NOSPEC_V1 and BPF_NOSPEC_V4 later. As an optimization, we can already skip v1/v4 insns from being emitted for PowerPC with this setup if bypass_spec_v1/v4 is set.
Vulnerability-status for BPF_NOSPEC-based Spectre mitigations (v4 as of this commit, v1 in the future) is therefore:
* x86 (32-bit and 64-bit), ARM64, and PowerPC (64-bit): Mitigated - This patch implements BPF_NOSPEC for these architectures. The previous v4-only version was supported since commit f5e81d111750 ("bpf: Introduce BPF nospec instruction for mitigating Spectre v4") and commit b7540d625094 ("powerpc/bpf: Emit stf barrier instruction sequences for BPF_NOSPEC").
* LoongArch: Not Vulnerable - Commit a6f6a95f2580 ("LoongArch, bpf: Fix jit to skip speculation barrier opcode") is the only other past commit related to BPF_NOSPEC and indicates that the insn is not required there.
* MIPS: Vulnerable (if unprivileged BPF is enabled) - Commit a6f6a95f2580 ("LoongArch, bpf: Fix jit to skip speculation barrier opcode") indicates that it is not vulnerable, but this contradicts the kernel and Debian documentation. Therefore, I assume that there exist vulnerable MIPS CPUs (but maybe not from Loongson?). In the future, BPF_NOSPEC could be implemented for MIPS based on the GCC speculation_barrier [1]. For now, we rely on unprivileged BPF being disabled by default.
* Other: Unknown - To the best of my knowledge there is no definitive information available that indicates that any other arch is vulnerable. They are therefore left untouched (BPF_NOSPEC is not implemented, but bypass_spec_v1/v4 is also not set).
I did the following testing to ensure the insn encoding is correct:
* ARM64:
  * 'dsb nsh; isb' was successfully tested with the BPF CI in [2]
  * 'sb' locally using QEMU v7.2.15 -cpu max (emitted sb insn is executed for example with './test_progs -t verifier_array_access')

* PowerPC: The following configs were tested locally with ppc64le QEMU v8.2 '-machine pseries -cpu POWER9':
  * STF_BARRIER_EIEIO + CONFIG_PPC_BOOK3S_64
  * STF_BARRIER_SYNC_ORI (forced on) + CONFIG_PPC_BOOK3S_64
  * STF_BARRIER_FALLBACK (forced on) + CONFIG_PPC_BOOK3S_64
  * CONFIG_PPC_E500 (forced on) + STF_BARRIER_EIEIO
  * CONFIG_PPC_E500 (forced on) + STF_BARRIER_SYNC_ORI (forced on)
  * CONFIG_PPC_E500 (forced on) + STF_BARRIER_FALLBACK (forced on)
  * CONFIG_PPC_E500 (forced on) + STF_BARRIER_NONE (forced on)
  Most of those combinations should not occur in practice, but I was not able to get a PPC e6500 rootfs (for testing PPC_E500 without forcing it on). In any case, this should ensure that there are no unexpected conflicts between the insns when combined like this. Individual v1/v4 barriers were already emitted elsewhere.
Hari's ack is for the PowerPC changes only.
[1] https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=29b74545531f6afbee9fc38c267524326dbfbedf ("MIPS: Add speculation_barrier support") [2] https://github.com/kernel-patches/bpf/pull/8576
Signed-off-by: Luis Gerhorst <luis.gerhorst@fau.de>
Acked-by: Hari Bathini <hbathini@linux.ibm.com>
Cc: Henriette Herzog <henriette.herzog@rub.de>
Cc: Maximilian Ott <ott@cs.fau.de>
Cc: Milan Stephan <milan.stephan@fau.de>
Link: https://lore.kernel.org/r/20250603211703.337860-1-luis.gerhorst@fau.de
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
#
03c68a0f |
| 03-Jun-2025 |
Luis Gerhorst <luis.gerhorst@fau.de> |
bpf, arm64, powerpc: Add bpf_jit_bypass_spec_v1/v4()
JITs can set bpf_jit_bypass_spec_v1/v4() if they want the verifier to skip analysis/patching for the respective vulnerability. For v4, this will reduce the number of barriers the verifier inserts. For v1, it allows more programs to be accepted.
The primary motivation for this is to not regress unpriv BPF's performance on ARM64 in a future commit where BPF_NOSPEC is also used against Spectre v1.
This has the user-visible change that v1-induced rejections on non-vulnerable PowerPC CPUs are avoided.
For now, this does not change the semantics of BPF_NOSPEC. It is still a v4-only barrier and must not be implemented if bypass_spec_v4 is always true for the arch. Changing it to a v1 AND v4-barrier is done in a future commit.
As an alternative to bypass_spec_v1/v4, one could introduce NOSPEC_V1 AND NOSPEC_V4 instructions and allow backends to skip their lowering as suggested by commit f5e81d111750 ("bpf: Introduce BPF nospec instruction for mitigating Spectre v4"). Adding bpf_jit_bypass_spec_v1/v4() was found to be preferable for the following reason:
* bypass_spec_v1/v4 benefits non-vulnerable CPUs: Always performing the same analysis (not taking into account whether the current CPU is vulnerable), needlessly restricts users of CPUs that are not vulnerable. The only use case for this would be portability-testing, but this can later be added easily when needed by allowing users to force bypass_spec_v1/v4 to false.
* Portability is still acceptable: Directly disabling the analysis instead of skipping the lowering of BPF_NOSPEC(_V1/V4) might allow programs on non-vulnerable CPUs to be accepted while the program will be rejected on vulnerable CPUs. With the fallback to speculation barriers for Spectre v1 implemented in a future commit, this will only affect programs that do variable stack-accesses or are very complex.
For PowerPC, the SEC_FTR checking in bpf_jit_bypass_spec_v4() is based on the check that was previously located in the BPF_NOSPEC case.
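The hooks' shape is a weak default plus per-arch overrides (sketched; the PowerPC condition condenses the SEC_FTR check mentioned above, and file locations are assumed):

    /* kernel/bpf/core.c: default is no bypass, full analysis/patching. */
    bool __weak bpf_jit_bypass_spec_v1(void)
    {
        return false;
    }

    /* arch/powerpc/net/ (illustrative): skip v4 patching when no STF
     * barrier is required on this CPU.
     */
    bool bpf_jit_bypass_spec_v4(void)
    {
        return !(security_ftr_enabled(SEC_FTR_FAVOUR_SECURITY) &&
                 security_ftr_enabled(SEC_FTR_STF_BARRIER));
    }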
For LoongArch, it would likely be safe to set both bpf_jit_bypass_spec_v1() and _v4() according to commit a6f6a95f2580 ("LoongArch, bpf: Fix jit to skip speculation barrier opcode"). This is omitted here as I am unable to do any testing for LoongArch.
Hari's ack concerns the PowerPC part only.
Signed-off-by: Luis Gerhorst <luis.gerhorst@fau.de>
Acked-by: Hari Bathini <hbathini@linux.ibm.com>
Cc: Henriette Herzog <henriette.herzog@rub.de>
Cc: Maximilian Ott <ott@cs.fau.de>
Cc: Milan Stephan <milan.stephan@fau.de>
Link: https://lore.kernel.org/r/20250603211318.337474-1-luis.gerhorst@fau.de
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
#
86e2d052 |
| 09-Jun-2025 |
Thomas Hellström <thomas.hellstrom@linux.intel.com> |
Merge drm/drm-next into drm-xe-next
Backmerging to bring in 6.16
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
|
#
34c55367 |
| 09-Jun-2025 |
Jani Nikula <jani.nikula@intel.com> |
Merge drm/drm-next into drm-intel-next
Sync to v6.16-rc1, among other things to get the fixed size GENMASK_U*() and BIT_U*() macros.
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
|