#
0b119045 |
| 26-Feb-2025 |
Dmitry Torokhov <dmitry.torokhov@gmail.com> |
Merge tag 'v6.14-rc4' into next
Sync up with the mainline.
|
Revision tags: v6.14-rc4, v6.14-rc3 |
|
#
7f89ec6c |
| 15-Feb-2025 |
Jakub Kicinski <kuba@kernel.org> |
Merge branch 'bnxt_en-add-npar-1-2-and-tph-support'
Michael Chan says:
==================== bnxt_en: Add NPAR 1.2 and TPH support
The first patch adds NPAR 1.2 support. Patches 2 to 11 add TPH (T
Merge branch 'bnxt_en-add-npar-1-2-and-tph-support'
Michael Chan says:
==================== bnxt_en: Add NPAR 1.2 and TPH support
The first patch adds NPAR 1.2 support. Patches 2 to 11 add TPH (TLP Processing Hints) support. These TPH driver patches are new revisions originally posted as part of the TPH PCI patch series. Additional driver refactoring has been done so that we can free and allocate RX completion ring and the TX rings if the channel is a combined channel. We also add napi_disable() and napi_enable() during queue_stop() and queue_start() respectively, and reset for error handling in queue_start().
v4: https://lore.kernel.org/20250208202916.1391614-1-michael.chan@broadcom.com v3: https://lore.kernel.org/20250204004609.1107078-1-michael.chan@broadcom.com v2: https://lore.kernel.org/20250116192343.34535-1-michael.chan@broadcom.com v1: https://lore.kernel.org/20250113063927.4017173-1-michael.chan@broadcom.com
Discussion about adding napi_disable()/napi_enable():
https://lore.kernel.org/netdev/5336d624-8d8b-40a6-b732-b020e4a119a2@davidwei.uk/#t
Previous driver series fixing rtnl_lock and empty release function:
https://lore.kernel.org/netdev/20241115200412.1340286-1-wei.huang2@amd.com/
v5 of the PCI series using netdev_rx_queue_restart():
https://lore.kernel.org/netdev/20240916205103.3882081-5-wei.huang2@amd.com/
v1 of the PCI series using open/close:
https://lore.kernel.org/netdev/20240509162741.1937586-9-wei.huang2@amd.com/ ====================
Link: https://patch.msgid.link/20250213011240.1640031-1-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
show more ...
|
#
c214410c |
| 13-Feb-2025 |
Manoj Panicker <manoj.panicker2@amd.com> |
bnxt_en: Add TPH support in BNXT driver
Add TPH support to the Broadcom BNXT device driver. This allows the driver to utilize TPH functions for retrieving and configuring Steering Tags when changing
bnxt_en: Add TPH support in BNXT driver
Add TPH support to the Broadcom BNXT device driver. This allows the driver to utilize TPH functions for retrieving and configuring Steering Tags when changing interrupt affinity. With compatible NIC firmware, network traffic will be tagged correctly with Steering Tags, resulting in significant memory bandwidth savings and other advantages as demonstrated by real network benchmarks on TPH-capable platforms.
Co-developed-by: Somnath Kotur <somnath.kotur@broadcom.com> Signed-off-by: Somnath Kotur <somnath.kotur@broadcom.com> Co-developed-by: Wei Huang <wei.huang2@amd.com> Signed-off-by: Wei Huang <wei.huang2@amd.com> Signed-off-by: Manoj Panicker <manoj.panicker2@amd.com> Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com> Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Link: https://patch.msgid.link/20250213011240.1640031-12-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
show more ...
|
#
6b6bf60f |
| 13-Feb-2025 |
Somnath Kotur <somnath.kotur@broadcom.com> |
bnxt_en: Reallocate RX completion ring for TPH support
In order to program the correct Steering Tag during an IRQ affinity change, we need to free/re-allocate the RX completion ring during queue_res
bnxt_en: Reallocate RX completion ring for TPH support
In order to program the correct Steering Tag during an IRQ affinity change, we need to free/re-allocate the RX completion ring during queue_restart. If TPH is enabled, call FW to free the Rx completion ring and clear the ring entries in queue_stop(). Re-allocate it in queue_start() if TPH is enabled. Note that TPH mode is not enabled in this patch and will be enabled later in the patch series.
While modifying bnxt_queue_start(), remove the unnecessary zeroing of rxr->rx_next_cons. It gets overwritten by the clone in bnxt_queue_start(). Remove the rx_reset counter increment since restart is not reset. Add comment to clarify that the ring allocations in queue_start should never fail.
Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Signed-off-by: Somnath Kotur <somnath.kotur@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Link: https://patch.msgid.link/20250213011240.1640031-9-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
show more ...
|
#
ebdf7fe4 |
| 13-Feb-2025 |
Michael Chan <michael.chan@broadcom.com> |
bnxt_en: Set NPAR 1.2 support when registering with firmware
NPAR (Network interface card partitioning)[1] 1.2 adds a transparent VLAN tag for all packets between the NIC and the switch. Because of
bnxt_en: Set NPAR 1.2 support when registering with firmware
NPAR (Network interface card partitioning)[1] 1.2 adds a transparent VLAN tag for all packets between the NIC and the switch. Because of that, RX VLAN acceleration cannot be supported for any additional host configured VLANs. The driver has to acknowledge that it can support no RX VLAN acceleration and set the NPAR 1.2 supported flag when registering with the FW. Otherwise, the FW call will fail and the driver will abort on these NPAR 1.2 NICs with this error:
bnxt_en 0000:26:00.0 (unnamed net_device) (uninitialized): hwrm req_type 0x1d seq id 0xb error 0x2
[1] https://techdocs.broadcom.com/us/en/storage-and-ethernet-connectivity/ethernet-nic-controllers/bcm957xxx/adapters/introduction/features/network-partitioning-npar.html
Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com> Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Link: https://patch.msgid.link/20250213011240.1640031-2-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
show more ...
|
Revision tags: v6.14-rc2 |
|
#
93c7dd1b |
| 06-Feb-2025 |
Maxime Ripard <mripard@kernel.org> |
Merge drm/drm-next into drm-misc-next
Bring rc1 to start the new release dev.
Signed-off-by: Maxime Ripard <mripard@kernel.org>
|
#
2c1ed907 |
| 06-Feb-2025 |
Maxime Ripard <mripard@kernel.org> |
Merge remote-tracking branch 'drm-misc/drm-misc-next-fixes' into drm-misc-fixes
Merge the few remaining patches stuck into drm-misc-next-fixes.
Signed-off-by: Maxime Ripard <mripard@kernel.org>
|
#
9e676a02 |
| 05-Feb-2025 |
Namhyung Kim <namhyung@kernel.org> |
Merge tag 'v6.14-rc1' into perf-tools-next
To get the various fixes in the current master.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
#
ea9f8f2b |
| 05-Feb-2025 |
Jani Nikula <jani.nikula@intel.com> |
Merge drm/drm-next into drm-intel-next
Sync with v6.14-rc1.
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
|
#
c771600c |
| 05-Feb-2025 |
Tvrtko Ursulin <tursulin@ursulin.net> |
Merge drm/drm-next into drm-intel-gt-next
We need 4ba4f1afb6a9 ("perf: Generic hotplug support for a PMU with a scope") in order to land a i915 PMU simplification and a fix. That landed in 6.12 and
Merge drm/drm-next into drm-intel-gt-next
We need 4ba4f1afb6a9 ("perf: Generic hotplug support for a PMU with a scope") in order to land a i915 PMU simplification and a fix. That landed in 6.12 and we are stuck at 6.9 so lets bump things forward.
Signed-off-by: Tvrtko Ursulin <tursulin@ursulin.net>
show more ...
|
Revision tags: v6.14-rc1 |
|
#
220ed690 |
| 30-Jan-2025 |
Lucas De Marchi <lucas.demarchi@intel.com> |
Merge drm/drm-next into drm-xe-next
Backmerge drm-next to get the common APIs and refactors as well as getting the display changes from i915 in xe so the probe order can be improved.
Signed-off-by:
Merge drm/drm-next into drm-xe-next
Backmerge drm-next to get the common APIs and refactors as well as getting the display changes from i915 in xe so the probe order can be improved.
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
show more ...
|
#
0ddeb4fe |
| 24-Jan-2025 |
Miquel Raynal <miquel.raynal@bootlin.com> |
Merge tag 'nand/for-6.14' into mtd/next
* Raw NAND changes
A new controller driver, from Nuvoton, has been merged.
Bastien Curutchet has contributed a series improving the Davinci controller drive
Merge tag 'nand/for-6.14' into mtd/next
* Raw NAND changes
A new controller driver, from Nuvoton, has been merged.
Bastien Curutchet has contributed a series improving the Davinci controller driver, both on the organization of the code, but also on the performance side. The binding has also been converted to yaml, received a new OOB layout and now supports on-die ECC engines.
The Qualcomm controller driver has been deeply cleaned to extract some parts of the code into a shared file with the Qualcomm SPI memory controller.
Aside from these main changes, the Cadence binding has been converted to yaml, the brcmnand controller driver has received a small fix, otherwise some more minor changes have also made their way in.
* SPI NAND changes
The SPI NAND subsystem has seen a great improvement, with the advent of DTR operations (DDR operations, which may be extended to the address cycles). The first vendor driver to benefit from these improvements is the Winbond driver.
A new manufacturer driver is added SkyHigh, with a new constraint for the core, it is impossible to disable the on-die ECC engine.
A Foresee device is also now supported.
show more ...
|
#
37ba6c7f |
| 23-Jan-2025 |
Maarten Lankhorst <dev@lankhorst.se> |
Merge remote-tracking branch 'drm/drm-next' into drm-misc-next-fixes
A regression was caused by commit e4b5ccd392b9 ("drm/v3d: Ensure job pointer is set to NULL after job completion"), but this comm
Merge remote-tracking branch 'drm/drm-next' into drm-misc-next-fixes
A regression was caused by commit e4b5ccd392b9 ("drm/v3d: Ensure job pointer is set to NULL after job completion"), but this commit is not yet in next-fixes, fast-forward it.
Try #2, first one didn't have v6.13 in it.
Signed-off-by: Maarten Lankhorst <dev@lankhorst.se>
show more ...
|
#
07c5b277 |
| 23-Jan-2025 |
Simona Vetter <simona.vetter@ffwll.ch> |
Merge v6.13 into drm-next
A regression was caused by commit e4b5ccd392b9 ("drm/v3d: Ensure job pointer is set to NULL after job completion"), but this commit is not yet in next-fixes, fast-forward i
Merge v6.13 into drm-next
A regression was caused by commit e4b5ccd392b9 ("drm/v3d: Ensure job pointer is set to NULL after job completion"), but this commit is not yet in next-fixes, fast-forward it.
Note that this recreates Linus merge in 96c84703f1cf ("Merge tag 'drm-next-2025-01-17' of https://gitlab.freedesktop.org/drm/kernel") because I didn't want to backmerge a random point in the merge window.
Signed-off-by: Simona Vetter <simona.vetter@ffwll.ch>
show more ...
|
#
0ad9617c |
| 22-Jan-2025 |
Linus Torvalds <torvalds@linux-foundation.org> |
Merge tag 'net-next-6.14' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next
Pull networking updates from Paolo Abeni: "This is slightly smaller than usual, with the most interesting
Merge tag 'net-next-6.14' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next
Pull networking updates from Paolo Abeni: "This is slightly smaller than usual, with the most interesting work being still around RTNL scope reduction.
Core:
- More core refactoring to reduce the RTNL lock contention, including preparatory work for the per-network namespace RTNL lock, replacing RTNL lock with a per device-one to protect NAPI-related net device data and moving synchronize_net() calls outside such lock.
- Extend drop reasons usage, adding net scheduler, AF_UNIX, bridge and more specific TCP coverage.
- Reduce network namespace tear-down time by removing per-subsystems synchronize_net() in tipc and sched.
- Add flow label selector support for fib rules, allowing traffic redirection based on such header field.
Netfilter:
- Do not remove netdev basechain when last device is gone, allowing netdev basechains without devices.
- Revisit the flowtable teardown strategy, dealing better with fin, reset and re-open events.
- Scale-up IP-vs connection dumping by avoiding linear search on each restart.
Protocols:
- A significant XDP socket refactor, consolidating and optimizing several helpers into the core
- Better scaling of ICMP rate-limiting, by removing false-sharing in inet peers handling.
- Introduces netlink notifications for multicast IPv4 and IPv6 address changes.
- Add ipsec support for IP-TFS/AggFrag encapsulation, allowing aggregation and fragmentation of the inner IP.
- Add sysctl to configure TIME-WAIT reuse delay for TCP sockets, to avoid local port exhaustion issues when the average connection lifetime is very short.
- Support updating keys (re-keying) for connections using kernel TLS (for TLS 1.3 only).
- Support ipv4-mapped ipv6 address clients in smc-r v2.
- Add support for jumbo data packet transmission in RxRPC sockets, gluing multiple data packets in a single UDP packet.
- Support RxRPC RACK-TLP to manage packet loss and retransmission in conjunction with the congestion control algorithm.
Driver API:
- Introduce a unified and structured interface for reporting PHY statistics, exposing consistent data across different H/W via ethtool.
- Make timestamping selectable, allow the user to select the desired hwtstamp provider (PHY or MAC) administratively.
- Add support for configuring a header-data-split threshold (HDS) value via ethtool, to deal with partial or buggy H/W implementation.
- Consolidate DSA drivers Energy Efficiency Ethernet support.
- Add EEE management to phylink, making use of the phylib implementation.
- Add phylib support for in-band capabilities negotiation.
- Simplify how phylib-enabled mac drivers expose the supported interfaces.
Tests and tooling:
- Make the YNL tool package-friendly to make it easier to deploy it separately from the kernel.
- Increase TCP selftest coverage importing several packetdrill test-cases.
- Regenerate the ethtool uapi header from the YNL spec, to ease maintenance and future development.
- Add YNL support for decoding the link types used in net self-tests, allowing a single build to run both net and drivers/net.
Drivers:
- Ethernet high-speed NICs: - nVidia/Mellanox (mlx5): - add cross E-Switch QoS support - add SW Steering support for ConnectX-8 - implement support for HW-Managed Flow Steering, improving the rule deletion/insertion rate - support for multi-host LAG - Intel (ixgbe, ice, igb): - ice: add support for devlink health events - ixgbe: add initial support for E610 chipset variant - igb: add support for AF_XDP zero-copy - Meta: - add support for basic RSS config - allow changing the number of channels - add hardware monitoring support - Broadcom (bnxt): - implement TCP data split and HDS threshold ethtool support, enabling Device Memory TCP. - Marvell Octeon: - implement egress ipsec offload support for the cn10k family - Hisilicon (HIBMC): - implement unicast MAC filtering
- Ethernet NICs embedded and virtual: - Convert UDP tunnel drivers to NETDEV_PCPU_STAT_DSTATS, avoiding contented atomic operations for drop counters - Freescale: - quicc: phylink conversion - enetc: support Tx and Rx checksum offload and improve TSO performances - MediaTek: - airoha: introduce support for ETS and HTB Qdisc offload - Microchip: - lan78XX USB: preparation work for phylink conversion - Synopsys (stmmac): - support DWMAC IP on NXP Automotive SoCs S32G2xx/S32G3xx/S32R45 - refactor EEE support to leverage the new driver API - optimize DMA and cache access to increase raw RX performances by 40% - TI: - icssg-prueth: add multicast filtering support for VLAN interface - netkit: - add ability to configure head/tailroom - VXLAN: - accepts packets with user-defined reserved bit
- Ethernet switches: - Microchip: - lan969x: add RGMII support - lan969x: improve TX and RX performance using the FDMA engine - nVidia/Mellanox: - move Tx header handling to PCI driver, to ease XDP support
- Ethernet PHYs: - Texas Instruments DP83822: - add support for GPIO2 clock output - Realtek: - 8169: add support for RTL8125D rev.b - rtl822x: add hwmon support for the temperature sensor - Microchip: - add support for RDS PTP hardware - consolidate periodic output signal generation
- CAN: - several DT-bindings to DT schema conversions - tcan4x5x: - add HW standby support - support nWKRQ voltage selection - kvaser: - allowing Bus Error Reporting runtime configuration
- WiFi: - the on-going Multi-Link Operation (MLO) effort continues, affecting both the stack and in drivers - mac80211/cfg80211: - Emergency Preparedness Communication Services (EPCS) station mode support - support for adding and removing station links for MLO - add support for WiFi 7/EHT mesh over 320 MHz channels - report Tx power info for each link - RealTek (rtw88): - enable USB Rx aggregation and USB 3 to improve performance - LED support - RealTek (rtw89): - refactor power save to support Multi-Link Operations - add support for RTL8922AE-VS variant - MediaTek (mt76): - single wiphy multiband support (preparation for MLO) - p2p device support - add TP-Link TXE50UH USB adapter support - Qualcomm (ath10k): - support for the QCA6698AQ IP core - Qualcomm (ath12k): - enable MLO for QCN9274
- Bluetooth: - Allow sysfs to trigger hdev reset, to allow recovering devices not responsive from user-space - MediaTek: add support for MT7922, MT7925, MT7921e devices - Realtek: add support for RTL8851BE devices - Qualcomm: add support for WCN785x devices - ISO: allow BIG re-sync"
* tag 'net-next-6.14' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1386 commits) net/rose: prevent integer overflows in rose_setsockopt() net: phylink: fix regression when binding a PHY net: ethernet: ti: am65-cpsw: streamline TX queue creation and cleanup net: ethernet: ti: am65-cpsw: streamline RX queue creation and cleanup net: ethernet: ti: am65-cpsw: ensure proper channel cleanup in error path ipv6: Convert inet6_rtm_deladdr() to per-netns RTNL. ipv6: Convert inet6_rtm_newaddr() to per-netns RTNL. ipv6: Move lifetime validation to inet6_rtm_newaddr(). ipv6: Set cfg.ifa_flags before device lookup in inet6_rtm_newaddr(). ipv6: Pass dev to inet6_addr_add(). ipv6: Convert inet6_ioctl() to per-netns RTNL. ipv6: Hold rtnl_net_lock() in addrconf_init() and addrconf_cleanup(). ipv6: Hold rtnl_net_lock() in addrconf_dad_work(). ipv6: Hold rtnl_net_lock() in addrconf_verify_work(). ipv6: Convert net.ipv6.conf.${DEV}.XXX sysctl to per-netns RTNL. ipv6: Add __in6_dev_get_rtnl_net(). net: stmmac: Drop redundant skb_mark_for_recycle() for SKB frags net: mii: Fix the Speed display when the network cable is not connected sysctl net: Remove macro checks for CONFIG_SYSCTL eth: bnxt: update header sizing defaults ...
show more ...
|
#
25768de5 |
| 21-Jan-2025 |
Dmitry Torokhov <dmitry.torokhov@gmail.com> |
Merge branch 'next' into for-linus
Prepare input updates for 6.14 merge window.
|
Revision tags: v6.13 |
|
#
2ee738e9 |
| 16-Jan-2025 |
Jakub Kicinski <kuba@kernel.org> |
Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Cross-merge networking fixes after downstream PR (net-6.13-rc8).
Conflicts:
drivers/net/ethernet/realtek/r8169_main.c 1f691a1fc4be
Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Cross-merge networking fixes after downstream PR (net-6.13-rc8).
Conflicts:
drivers/net/ethernet/realtek/r8169_main.c 1f691a1fc4be ("r8169: remove redundant hwmon support") 152d00a91396 ("r8169: simplify setting hwmon attribute visibility") https://lore.kernel.org/20250115122152.760b4e8d@canb.auug.org.au
Adjacent changes:
drivers/net/ethernet/broadcom/bnxt/bnxt.c 152f4da05aee ("bnxt_en: add support for rx-copybreak ethtool command") f0aa6a37a3db ("eth: bnxt: always recalculate features after XDP clearing, fix null-deref")
drivers/net/ethernet/intel/ice/ice_type.h 50327223a8bb ("ice: add lock to protect low latency interface") dc26548d729e ("ice: Fix quad registers read on E825")
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
show more ...
|
#
ce69b401 |
| 16-Jan-2025 |
Linus Torvalds <torvalds@linux-foundation.org> |
Merge tag 'net-6.13-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Paolo Abeni: "Notably this includes fixes for a few regressions spotted very recent
Merge tag 'net-6.13-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Paolo Abeni: "Notably this includes fixes for a few regressions spotted very recently. No known outstanding ones.
Current release - regressions:
- core: avoid CFI problems with sock priv helpers
- xsk: bring back busy polling support
- netpoll: ensure skb_pool list is always initialized
Current release - new code bugs:
- core: make page_pool_ref_netmem work with net iovs
- ipv4: route: fix drop reason being overridden in ip_route_input_slow
- udp: make rehash4 independent in udp_lib_rehash()
Previous releases - regressions:
- bpf: fix bpf_sk_select_reuseport() memory leak
- openvswitch: fix lockup on tx to unregistering netdev with carrier
- mptcp: be sure to send ack when mptcp-level window re-opens
- eth: - bnxt: always recalculate features after XDP clearing, fix null-deref - mlx5: fix sub-function add port error handling - fec: handle page_pool_dev_alloc_pages error
Previous releases - always broken:
- vsock: some fixes due to transport de-assignment
- eth: - ice: fix E825 initialization - mlx5e: fix inversion dependency warning while enabling IPsec tunnel - gtp: destroy device along with udp socket's netns dismantle. - xilinx: axienet: Fix IRQ coalescing packet count overflow"
* tag 'net-6.13-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (44 commits) netdev: avoid CFI problems with sock priv helpers net/mlx5e: Always start IPsec sequence number from 1 net/mlx5e: Rely on reqid in IPsec tunnel mode net/mlx5e: Fix inversion dependency warning while enabling IPsec tunnel net/mlx5: Clear port select structure when fail to create net/mlx5: SF, Fix add port error handling net/mlx5: Fix a lockdep warning as part of the write combining test net/mlx5: Fix RDMA TX steering prio net: make page_pool_ref_netmem work with net iovs net: ethernet: xgbe: re-add aneg to supported features in PHY quirks net: pcs: xpcs: actively unset DW_VR_MII_DIG_CTRL1_2G5_EN for 1G SGMII net: pcs: xpcs: fix DW_VR_MII_DIG_CTRL1_2G5_EN bit being set for 1G SGMII w/o inband selftests: net: Adapt ethtool mq tests to fix in qdisc graft net: fec: handle page_pool_dev_alloc_pages error net: netpoll: ensure skb_pool list is always initialized net: xilinx: axienet: Fix IRQ coalescing packet count overflow nfp: bpf: prevent integer overflow in nfp_bpf_event_output() selftests: mptcp: avoid spurious errors on disconnect mptcp: fix spurious wake-up on under memory pressure mptcp: be sure to send ack when mptcp-level window re-opens ...
show more ...
|
#
bf38a2f7 |
| 15-Jan-2025 |
Jakub Kicinski <kuba@kernel.org> |
Merge branch 'bnxt_en-implement-tcp-data-split-and-thresh-option'
Taehee Yoo says:
==================== bnxt_en: implement tcp-data-split and thresh option
This series implements hds-thresh ethtoo
Merge branch 'bnxt_en-implement-tcp-data-split-and-thresh-option'
Taehee Yoo says:
==================== bnxt_en: implement tcp-data-split and thresh option
This series implements hds-thresh ethtool command. This series also implements backend of tcp-data-split and hds-thresh ethtool command for bnxt_en driver. These ethtool commands are mandatory options for device memory TCP.
NICs that use the bnxt_en driver support tcp-data-split feature named HDS(header-data-split). But there is no implementation for the HDS to enable by ethtool. Only getting the current HDS status is implemented and the HDS is just automatically enabled only when either LRO, HW-GRO, or JUMBO is enabled. The hds_threshold follows the rx-copybreak value but it wasn't changeable.
Currently, bnxt_en driver enables tcp-data-split by default but not always work. There is hds_threshold value, which indicates that a packet size is larger than this value, a packet will be split into header and data. hds_threshold value has been 256, which is a default value of rx-copybreak value too. The rx-copybreak value hasn't been allowed to change so the hds_threshold too.
This patchset decouples hds_threshold and rx-copybreak first. and make tcp-data-split, rx-copybreak, and hds-thresh configurable independently.
But the default configuration is the same. The default value of rx-copybreak is 256 and default hds-thresh is also 256.
The behavior of rx-copybreak will probably be changed in almost all drivers. If HDS is not enabled, rx-copybreak copies both header and payload from a page. But if HDS is enabled, rx-copybreak copies only header from the first page. Due to this change, it may need to disable(set to 0) rx-copybreak when the HDS is required.
There are several related options. TPA(HW-GRO, LRO), JUMBO, jumbo_thresh(firmware command), and Aggregation Ring.
The aggregation ring is fundamental to these all features. When gro/lro/jumbo packets are received, NIC receives the first packet from the normal ring. follow packets come from the aggregation ring.
These features are working regardless of HDS. If HDS is enabled, the first packet contains the header only, and the following packets contain only payload. So, HW-GRO/LRO is working regardless of HDS.
There is another threshold value, which is jumbo_thresh. This is very similar to hds_thresh, but jumbo thresh doesn't split header and data. It just split the first and following data based on length. When NIC receives 1500 sized packet, and jumbo_thresh is 256(default, but follows rx-copybreak), the first data is 256 and the following packet size is 1500-256.
Before this patch, at least if one of GRO, LRO, and JUMBO flags is enabled, the Aggregation ring will be enabled. If the Aggregation ring is enabled, both hds_threshold and jumbo_thresh are set to the default value of rx-copybreak.
So, GRO, LRO, JUMBO frames, they larger than 256 bytes, they will be split into header and data if the protocol is TCP or UDP. for the other protocol, jumbo_thresh works instead of hds_thresh.
This means that tcp-data-split relies on the GRO, LRO, and JUMBO flags. But by this patch, tcp-data-split no longer relies on these flags. If the tcp-data-split is enabled, the Aggregation ring will be enabled. Also, hds_threshold no longer follows rx-copybreak value, it will be set to the hds-thresh value by user-space, but the default value is still 256.
If the protocol is TCP or UDP and the HDS is disabled and Aggregation ring is enabled, a packet will be split into several pieces due to jumbo_thresh.
When single buffer XDP is attached, tcp-data-split is automatically disabled.
LRO, GRO, and JUMBO are tested with BCM57414, BCM57504 and the firmware version is 230.0.157.0. I couldn't find any specification about minimum and maximum value of hds_threshold, but from my test result, it was about 0 ~ 1023. It means, over 1023 sized packets will be split into header and data if tcp-data-split is enabled regardless of hds_treshold value. When hds_threshold is 1500 and received packet size is 1400, HDS should not be activated, but it is activated. The maximum value of hds-thresh value is 256 because it has been working. It was decided very conservatively.
I checked out the tcp-data-split(HDS) works independently of GRO, LRO, JUMBO. Also, I checked out tcp-data-split should be disabled automatically when XDP is attached and disallowed to enable it again while XDP is attached. I tested ranged values from min to max for hds-thresh and rx-copybreak, and it works. hds-thresh from 0 to 256, and rx-copybreak 0 to 256. When testing this patchset, I checked skb->data, skb->data_len, and nr_frags values.
By this patchset, bnxt_en driver supports a force enable tcp-data-split, but it doesn't support for disable tcp-data-split. When tcp-data-split is explicitly enabled, HDS works always. When tcp-data-split is unknown, it depends on the current configuration of LRO/GRO/JUMBO.
1/10 patch adds a new hds_config member in the ethtool_netdev_state. It indicates that what tcp-data-split value is really updated from userspace. So the driver can distinguish a passed tcp-data-split value is came from user or driver itself.
2/10 patch adds hds-thresh command in the ethtool. This threshold value indicates if a received packet size is larger than this threshold, the packet's header and payload will be split. Example: # ethtool -G <interface name> hds-thresh <value> This option can not be used when tcp-data-split is disabled or not supported. # ethtool -G enp14s0f0np0 tcp-data-split on hds-thresh 256 # ethtool -g enp14s0f0np0 Ring parameters for enp14s0f0np0: Pre-set maximums: ... Current hardware settings: ... TCP data split: on HDS thresh: 256
3/10, 4/10 add condition checks for devmem and ethtool. If tcp-data-split is disabled or threshold value is not zero, setup of devmem will be failed. Also, tcp-data-split and hds-thresh will not be changed while devmem is running.
5/10 add condition checks for netdev core. It disallows setup single buffer XDP program when tcp-data-split is enabled.
6/10 patch implements .{set, get}_tunable() in the bnxt_en. The bnxt_en driver has been supporting the rx-copybreak feature but is not configurable, Only the default rx-copybreak value has been working. So, it changes the bnxt_en driver to be able to configure the rx-copybreak value.
7/10 patch adds an implementation of tcp-data-split ethtool command. The HDS relies on the Aggregation ring, which is automatically enabled when either LRO, GRO, or large mtu is configured. So, if the Aggregation ring is enabled, HDS is automatically enabled by it.
8/10 patch adds the implementation of hds-thresh logic in the bnxt_en driver. The default value is 256, which used to be the default rx-copybreak value.
9/10 add HDS feature implementation for netdevsim. HDS feature is not common so far. Only a few NICs support this feature. There is no way to test HDS core-API unless we have proper hw NIC. In order to test HDS core-API without hw NIC, netdevsim can be used. It implements HDS control and data plane for netdevsim.
10/10 add selftest for HDS(tcp-data-split and HDS-thresh). The tcp-data-split tests are the same with `ethtool -G tcp-data-split <on | auto>` HDS-thresh tests are same with `ethtool -G eth0 hds-thresh <0 - MAX>`
This series is tested with BCM57504 and netdevsim. ====================
Link: https://patch.msgid.link/20250114142852.3364986-1-ap420073@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
show more ...
|
#
6b43673a |
| 14-Jan-2025 |
Taehee Yoo <ap420073@gmail.com> |
bnxt_en: add support for hds-thresh ethtool command
The bnxt_en driver has configured the hds_threshold value automatically when TPA is enabled based on the rx-copybreak default value. Now the hds-t
bnxt_en: add support for hds-thresh ethtool command
The bnxt_en driver has configured the hds_threshold value automatically when TPA is enabled based on the rx-copybreak default value. Now the hds-thresh ethtool command is added, so it adds an implementation of hds-thresh option.
Configuration of the hds-thresh is applied only when the tcp-data-split is enabled. The default value of hds-thresh is 256, which is the default value of rx-copybreak, which used to be the hds_thresh value.
The maximum hds-thresh is 1023.
# Example: # ethtool -G enp14s0f0np0 tcp-data-split on hds-thresh 256 # ethtool -g enp14s0f0np0 Ring parameters for enp14s0f0np0: Pre-set maximums: ... HDS thresh: 1023 Current hardware settings: ... TCP data split: on HDS thresh: 256
Tested-by: Stanislav Fomichev <sdf@fomichev.me> Tested-by: Andy Gospodarek <gospo@broadcom.com> Signed-off-by: Taehee Yoo <ap420073@gmail.com> Reviewed-by: Michael Chan <michael.chan@broadcom.com> Link: https://patch.msgid.link/20250114142852.3364986-9-ap420073@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
show more ...
|
#
87c8f849 |
| 14-Jan-2025 |
Taehee Yoo <ap420073@gmail.com> |
bnxt_en: add support for tcp-data-split ethtool command
NICs that uses bnxt_en driver supports tcp-data-split feature by the name of HDS(header-data-split). But there is no implementation for the HD
bnxt_en: add support for tcp-data-split ethtool command
NICs that uses bnxt_en driver supports tcp-data-split feature by the name of HDS(header-data-split). But there is no implementation for the HDS to enable by ethtool. Only getting the current HDS status is implemented and The HDS is just automatically enabled only when either LRO, HW-GRO, or JUMBO is enabled. The hds_threshold follows rx-copybreak value. and it was unchangeable.
This implements `ethtool -G <interface name> tcp-data-split <value>` command option. The value can be <on> and <auto>. The value is <auto> and one of LRO/GRO/JUMBO is enabled, HDS is automatically enabled and all LRO/GRO/JUMBO are disabled, HDS is automatically disabled.
HDS feature relies on the aggregation ring. So, if HDS is enabled, the bnxt_en driver initializes the aggregation ring. This is the reason why BNXT_FLAG_AGG_RINGS contains HDS condition.
Acked-by: Jakub Kicinski <kuba@kernel.org> Tested-by: Stanislav Fomichev <sdf@fomichev.me> Tested-by: Andy Gospodarek <gospo@broadcom.com> Signed-off-by: Taehee Yoo <ap420073@gmail.com> Reviewed-by: Michael Chan <michael.chan@broadcom.com> Link: https://patch.msgid.link/20250114142852.3364986-8-ap420073@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
show more ...
|
#
152f4da0 |
| 14-Jan-2025 |
Taehee Yoo <ap420073@gmail.com> |
bnxt_en: add support for rx-copybreak ethtool command
The bnxt_en driver supports rx-copybreak, but it couldn't be set by userspace. Only the default value(256) has worked. This patch makes the bnxt
bnxt_en: add support for rx-copybreak ethtool command
The bnxt_en driver supports rx-copybreak, but it couldn't be set by userspace. Only the default value(256) has worked. This patch makes the bnxt_en driver support following command. `ethtool --set-tunable <devname> rx-copybreak <value> ` and `ethtool --get-tunable <devname> rx-copybreak`.
By this patch, hds_threshol is set to the rx-copybreak value. But it will be set by `ethtool -G eth0 hds-thresh N` in the next patch.
Reviewed-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Brett Creeley <brett.creeley@amd.com> Tested-by: Stanislav Fomichev <sdf@fomichev.me> Tested-by: Andy Gospodarek <gospo@broadcom.com> Signed-off-by: Taehee Yoo <ap420073@gmail.com> Reviewed-by: Michael Chan <michael.chan@broadcom.com> Link: https://patch.msgid.link/20250114142852.3364986-7-ap420073@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
show more ...
|
#
fc4378b2 |
| 15-Jan-2025 |
Miquel Raynal <miquel.raynal@bootlin.com> |
Merge tag 'spi-mem-dtr-2' into nand/next
spi: Support DTR in spi-mem
Changes to support DTR with spi-mem.
|
#
12080e85 |
| 14-Jan-2025 |
Maxime Ripard <mripard@kernel.org> |
Merge drm/drm-next into drm-misc-next-fixes
drm-next has the dmem cgroup patches we need to merge fixes for.
Signed-off-by: Maxime Ripard <mripard@kernel.org>
|
Revision tags: v6.13-rc7 |
|
#
f0aa6a37 |
| 09-Jan-2025 |
Jakub Kicinski <kuba@kernel.org> |
eth: bnxt: always recalculate features after XDP clearing, fix null-deref
Recalculate features when XDP is detached.
Before: # ip li set dev eth0 xdp obj xdp_dummy.bpf.o sec xdp # ip li set dev
eth: bnxt: always recalculate features after XDP clearing, fix null-deref
Recalculate features when XDP is detached.
Before: # ip li set dev eth0 xdp obj xdp_dummy.bpf.o sec xdp # ip li set dev eth0 xdp off # ethtool -k eth0 | grep gro rx-gro-hw: off [requested on]
After: # ip li set dev eth0 xdp obj xdp_dummy.bpf.o sec xdp # ip li set dev eth0 xdp off # ethtool -k eth0 | grep gro rx-gro-hw: on
The fact that HW-GRO doesn't get re-enabled automatically is just a minor annoyance. The real issue is that the features will randomly come back during another reconfiguration which just happens to invoke netdev_update_features(). The driver doesn't handle reconfiguring two things at a time very robustly.
Starting with commit 98ba1d931f61 ("bnxt_en: Fix RSS logic in __bnxt_reserve_rings()") we only reconfigure the RSS hash table if the "effective" number of Rx rings has changed. If HW-GRO is enabled "effective" number of rings is 2x what user sees. So if we are in the bad state, with HW-GRO re-enablement "pending" after XDP off, and we lower the rings by / 2 - the HW-GRO rings doing 2x and the ethtool -L doing / 2 may cancel each other out, and the:
if (old_rx_rings != bp->hw_resc.resv_rx_rings &&
condition in __bnxt_reserve_rings() will be false. The RSS map won't get updated, and we'll crash with:
BUG: kernel NULL pointer dereference, address: 0000000000000168 RIP: 0010:__bnxt_hwrm_vnic_set_rss+0x13a/0x1a0 bnxt_hwrm_vnic_rss_cfg_p5+0x47/0x180 __bnxt_setup_vnic_p5+0x58/0x110 bnxt_init_nic+0xb72/0xf50 __bnxt_open_nic+0x40d/0xab0 bnxt_open_nic+0x2b/0x60 ethtool_set_channels+0x18c/0x1d0
As we try to access a freed ring.
The issue is present since XDP support was added, really, but prior to commit 98ba1d931f61 ("bnxt_en: Fix RSS logic in __bnxt_reserve_rings()") it wasn't causing major issues.
Fixes: 1054aee82321 ("bnxt_en: Use NETIF_F_GRO_HW.") Fixes: 98ba1d931f61 ("bnxt_en: Fix RSS logic in __bnxt_reserve_rings()") Reviewed-by: Michael Chan <michael.chan@broadcom.com> Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com> Link: https://patch.msgid.link/20250109043057.2888953-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
show more ...
|