Commit Graph

18125 Commits

Author SHA1 Message Date
Jakub Kicinski
463ec95a16 ipsec-2025-01-27

Merge tag 'ipsec-2025-01-27' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec

Steffen Klassert says:

====================
pull request (net): ipsec 2025-01-27

1) Fix incrementing the upper 32 bit sequence numbers for GSO skbs.
   From Jianbo Liu.

2) Fix an out-of-bounds read on xfrm state lookup.
   From Florian Westphal.

3) Fix secpath handling on packet offload mode.
   From Alexandre Cassen.

4) Fix the usage of skb->sk in the xfrm layer.

5) Don't disable preemption while looking up cache state
   to fix PREEMPT_RT.
   From Sebastian Siewior.

* tag 'ipsec-2025-01-27' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec:
  xfrm: Don't disable preemption while looking up cache state.
  xfrm: Fix the usage of skb->sk
  xfrm: delete intermediate secpath entry in packet offload mode
  xfrm: state: fix out-of-bounds read during lookup
  xfrm: replay: Fix the update of replay_esn->oseq_hi for GSO
====================

Link: https://patch.msgid.link/20250127060757.3946314-1-steffen.klassert@secunet.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-27 15:15:12 -08:00
Jakub Kicinski
67e4bb2ced net: page_pool: don't try to stash the napi id
Page pool tried to cache the NAPI ID in page pool info to avoid
having a dependency on the life cycle of the NAPI instance.
Since the commit under Fixes, the NAPI ID is not populated until
napi_enable(), and there's a good chance that the page pool is
created before NAPI gets enabled.

Protect the NAPI pointer with the existing page pool mutex;
the reading path already holds it. napi_id itself needs
READ_ONCE(): it's protected by netdev_lock(), which we are
not holding in page pool.

Before this patch napi IDs were missing for mlx5:

 # ./cli.py --spec netlink/specs/netdev.yaml --dump page-pool-get

 [{'id': 144, 'ifindex': 2, 'inflight': 3072, 'inflight-mem': 12582912},
  {'id': 143, 'ifindex': 2, 'inflight': 5568, 'inflight-mem': 22806528},
  {'id': 142, 'ifindex': 2, 'inflight': 5120, 'inflight-mem': 20971520},
  {'id': 141, 'ifindex': 2, 'inflight': 4992, 'inflight-mem': 20447232},
  ...

After:

 [{'id': 144, 'ifindex': 2, 'inflight': 3072, 'inflight-mem': 12582912,
   'napi-id': 565},
  {'id': 143, 'ifindex': 2, 'inflight': 4224, 'inflight-mem': 17301504,
   'napi-id': 525},
  {'id': 142, 'ifindex': 2, 'inflight': 4288, 'inflight-mem': 17563648,
   'napi-id': 524},
  ...

Fixes: 86e25f40aa ("net: napi: Add napi_config")
Reviewed-by: Mina Almasry <almasrymina@google.com>
Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com>
Link: https://patch.msgid.link/20250123231620.1086401-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-27 14:37:41 -08:00
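
The before/after dumps in the commit message above can be compared mechanically. The following is a minimal sketch, not part of the patch: it copies the entries shown in the message (which are truncated there with "...") and lists the page pools whose entry lacks a 'napi-id' key. The helper name `pools_missing_napi_id` is made up for illustration.

```python
# Entries copied from the commit message; the original output is truncated,
# so these lists are only the visible part of each dump.
before = [
    {'id': 144, 'ifindex': 2, 'inflight': 3072, 'inflight-mem': 12582912},
    {'id': 143, 'ifindex': 2, 'inflight': 5568, 'inflight-mem': 22806528},
    {'id': 142, 'ifindex': 2, 'inflight': 5120, 'inflight-mem': 20971520},
    {'id': 141, 'ifindex': 2, 'inflight': 4992, 'inflight-mem': 20447232},
]
after = [
    {'id': 144, 'ifindex': 2, 'inflight': 3072, 'inflight-mem': 12582912,
     'napi-id': 565},
    {'id': 143, 'ifindex': 2, 'inflight': 4224, 'inflight-mem': 17301504,
     'napi-id': 525},
    {'id': 142, 'ifindex': 2, 'inflight': 4288, 'inflight-mem': 17563648,
     'napi-id': 524},
]


def pools_missing_napi_id(pools):
    """Return the ids of page pools whose dump entry has no 'napi-id' key."""
    return [pool['id'] for pool in pools if 'napi-id' not in pool]


missing_before = pools_missing_napi_id(before)
missing_after = pools_missing_napi_id(after)
print("missing before:", missing_before)
print("missing after:", missing_after)
```

The same helper could be pointed at the parsed output of the `page-pool-get` dump on a live system, assuming the output has been loaded into a list of dicts first.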
Linus Torvalds
0ad9617c78 Networking changes for 6.14.
 Signed-off-by: Paolo Abeni <pabeni@redhat.com>

Merge tag 'net-next-6.14' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next

Pull networking updates from Paolo Abeni:
 "This is slightly smaller than usual, with the most interesting work
  being still around RTNL scope reduction.

  Core:

   - More core refactoring to reduce the RTNL lock contention, including
     preparatory work for the per-network namespace RTNL lock, replacing
     RTNL lock with a per device-one to protect NAPI-related net device
     data and moving synchronize_net() calls outside such lock.

   - Extend drop reasons usage, adding net scheduler, AF_UNIX, bridge
     and more specific TCP coverage.

   - Reduce network namespace tear-down time by removing per-subsystems
     synchronize_net() in tipc and sched.

   - Add flow label selector support for fib rules, allowing traffic
     redirection based on such header field.

  Netfilter:

   - Do not remove netdev basechain when last device is gone, allowing
     netdev basechains without devices.

   - Revisit the flowtable teardown strategy, dealing better with fin,
     reset and re-open events.

   - Scale-up IP-vs connection dumping by avoiding linear search on each
     restart.

  Protocols:

   - A significant XDP socket refactor, consolidating and optimizing
     several helpers into the core.

   - Better scaling of ICMP rate-limiting, by removing false-sharing in
     inet peers handling.

   - Introduce netlink notifications for multicast IPv4 and IPv6
     address changes.

   - Add ipsec support for IP-TFS/AggFrag encapsulation, allowing
     aggregation and fragmentation of the inner IP.

   - Add sysctl to configure TIME-WAIT reuse delay for TCP sockets, to
     avoid local port exhaustion issues when the average connection
     lifetime is very short.

   - Support updating keys (re-keying) for connections using kernel TLS
     (for TLS 1.3 only).

   - Support ipv4-mapped ipv6 address clients in smc-r v2.

   - Add support for jumbo data packet transmission in RxRPC sockets,
     gluing multiple data packets in a single UDP packet.

   - Support RxRPC RACK-TLP to manage packet loss and retransmission in
     conjunction with the congestion control algorithm.

  Driver API:

   - Introduce a unified and structured interface for reporting PHY
     statistics, exposing consistent data across different H/W via
     ethtool.

   - Make timestamping selectable, allow the user to select the desired
     hwtstamp provider (PHY or MAC) administratively.

   - Add support for configuring a header-data-split threshold (HDS)
     value via ethtool, to deal with partial or buggy H/W
     implementation.

   - Consolidate DSA drivers Energy Efficiency Ethernet support.

   - Add EEE management to phylink, making use of the phylib
     implementation.

   - Add phylib support for in-band capabilities negotiation.

   - Simplify how phylib-enabled mac drivers expose the supported
     interfaces.

  Tests and tooling:

   - Make the YNL tool package-friendly to make it easier to deploy it
     separately from the kernel.

   - Increase TCP selftest coverage importing several packetdrill
     test-cases.

   - Regenerate the ethtool uapi header from the YNL spec, to ease
     maintenance and future development.

   - Add YNL support for decoding the link types used in net self-tests,
     allowing a single build to run both net and drivers/net.

  Drivers:

   - Ethernet high-speed NICs:
      - nVidia/Mellanox (mlx5):
         - add cross E-Switch QoS support
         - add SW Steering support for ConnectX-8
         - implement support for HW-Managed Flow Steering, improving the
           rule deletion/insertion rate
         - support for multi-host LAG
      - Intel (ixgbe, ice, igb):
         - ice: add support for devlink health events
         - ixgbe: add initial support for E610 chipset variant
         - igb: add support for AF_XDP zero-copy
      - Meta:
         - add support for basic RSS config
         - allow changing the number of channels
         - add hardware monitoring support
      - Broadcom (bnxt):
         - implement TCP data split and HDS threshold ethtool support,
           enabling Device Memory TCP.
      - Marvell Octeon:
         - implement egress ipsec offload support for the cn10k family
      - Hisilicon (HIBMC):
         - implement unicast MAC filtering

   - Ethernet NICs embedded and virtual:
      - Convert UDP tunnel drivers to NETDEV_PCPU_STAT_DSTATS, avoiding
        contended atomic operations for drop counters
      - Freescale:
         - quicc: phylink conversion
         - enetc: support Tx and Rx checksum offload and improve TSO
           performance
      - MediaTek:
         - airoha: introduce support for ETS and HTB Qdisc offload
      - Microchip:
         - lan78XX USB: preparation work for phylink conversion
      - Synopsys (stmmac):
         - support DWMAC IP on NXP Automotive SoCs S32G2xx/S32G3xx/S32R45
         - refactor EEE support to leverage the new driver API
         - optimize DMA and cache access to increase raw RX performance
           by 40%
      - TI:
         - icssg-prueth: add multicast filtering support for VLAN
           interface
      - netkit:
         - add ability to configure head/tailroom
      - VXLAN:
         - accept packets with user-defined reserved bit

   - Ethernet switches:
      - Microchip:
         - lan969x: add RGMII support
         - lan969x: improve TX and RX performance using the FDMA engine
      - nVidia/Mellanox:
         - move Tx header handling to PCI driver, to ease XDP support

   - Ethernet PHYs:
      - Texas Instruments DP83822:
         - add support for GPIO2 clock output
      - Realtek:
         - 8169: add support for RTL8125D rev.b
         - rtl822x: add hwmon support for the temperature sensor
      - Microchip:
         - add support for RDS PTP hardware
         - consolidate periodic output signal generation

   - CAN:
      - several DT-bindings to DT schema conversions
      - tcan4x5x:
         - add HW standby support
         - support nWKRQ voltage selection
      - kvaser:
         - allow runtime configuration of Bus Error Reporting

   - WiFi:
      - the on-going Multi-Link Operation (MLO) effort continues,
        affecting both the stack and the drivers
      - mac80211/cfg80211:
         - Emergency Preparedness Communication Services (EPCS) station
           mode support
         - support for adding and removing station links for MLO
         - add support for WiFi 7/EHT mesh over 320 MHz channels
         - report Tx power info for each link
      - RealTek (rtw88):
         - enable USB Rx aggregation and USB 3 to improve performance
         - LED support
      - RealTek (rtw89):
         - refactor power save to support Multi-Link Operations
         - add support for RTL8922AE-VS variant
      - MediaTek (mt76):
         - single wiphy multiband support (preparation for MLO)
         - p2p device support
         - add TP-Link TXE50UH USB adapter support
      - Qualcomm (ath10k):
         - support for the QCA6698AQ IP core
      - Qualcomm (ath12k):
         - enable MLO for QCN9274

   - Bluetooth:
      - Allow sysfs to trigger hdev reset, so that unresponsive devices
        can be recovered from user-space
      - MediaTek: add support for MT7922, MT7925, MT7921e devices
      - Realtek: add support for RTL8851BE devices
      - Qualcomm: add support for WCN785x devices
      - ISO: allow BIG re-sync"

* tag 'net-next-6.14' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1386 commits)
  net/rose: prevent integer overflows in rose_setsockopt()
  net: phylink: fix regression when binding a PHY
  net: ethernet: ti: am65-cpsw: streamline TX queue creation and cleanup
  net: ethernet: ti: am65-cpsw: streamline RX queue creation and cleanup
  net: ethernet: ti: am65-cpsw: ensure proper channel cleanup in error path
  ipv6: Convert inet6_rtm_deladdr() to per-netns RTNL.
  ipv6: Convert inet6_rtm_newaddr() to per-netns RTNL.
  ipv6: Move lifetime validation to inet6_rtm_newaddr().
  ipv6: Set cfg.ifa_flags before device lookup in inet6_rtm_newaddr().
  ipv6: Pass dev to inet6_addr_add().
  ipv6: Convert inet6_ioctl() to per-netns RTNL.
  ipv6: Hold rtnl_net_lock() in addrconf_init() and addrconf_cleanup().
  ipv6: Hold rtnl_net_lock() in addrconf_dad_work().
  ipv6: Hold rtnl_net_lock() in addrconf_verify_work().
  ipv6: Convert net.ipv6.conf.${DEV}.XXX sysctl to per-netns RTNL.
  ipv6: Add __in6_dev_get_rtnl_net().
  net: stmmac: Drop redundant skb_mark_for_recycle() for SKB frags
  net: mii: Fix the Speed display when the network cable is not connected
  sysctl net: Remove macro checks for CONFIG_SYSCTL
  eth: bnxt: update header sizing defaults
  ...
2025-01-22 08:28:57 -08:00
Linus Torvalds
f96a974170 lsm/stable-6.14 PR 20250121

Merge tag 'lsm-pr-20250121' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/lsm

Pull lsm updates from Paul Moore:

 - Improved handling of LSM "secctx" strings through lsm_context struct

   The LSM secctx string interface is from an older time when only one
   LSM was supported, migrate over to the lsm_context struct to better
   support the different LSMs we now have and make it easier to support
   new LSMs in the future.

   These changes explain the Rust, VFS, and networking changes in the
   diffstat.

 - Only build lsm_audit.c if CONFIG_SECURITY and CONFIG_AUDIT are
   enabled

   Small tweak to be a bit smarter about when we build the LSM's common
   audit helpers.

 - Check for absurdly large policies from userspace in SafeSetID

   SafeSetID policy rules are fairly small, basically just "UID:UID",
   so it is easy to impose a limit of KMALLOC_MAX_SIZE on policy writes,
   which helps quiet a number of syzbot-related issues. While work is being
   done to address the syzbot issues through other mechanisms, this is a
   trivial and relatively safe fix that we can do now.

 - Various minor improvements and cleanups

   A collection of improvements to the kernel selftests, constification
   of some function parameters, removing redundant assignments, and
   local variable renames to improve readability.

* tag 'lsm-pr-20250121' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/lsm:
  lockdown: initialize local array before use to quiet static analysis
  safesetid: check size of policy writes
  net: corrections for security_secid_to_secctx returns
  lsm: rename variable to avoid shadowing
  lsm: constify function parameters
  security: remove redundant assignment to return variable
  lsm: Only build lsm_audit.c if CONFIG_SECURITY and CONFIG_AUDIT are set
  selftests: refactor the lsm `flags_overset_lsm_set_self_attr` test
  binder: initialize lsm_context structure
  rust: replace lsm context+len with lsm_context
  lsm: secctx provider check on release
  lsm: lsm_context in security_dentry_init_security
  lsm: use lsm_context in security_inode_getsecctx
  lsm: replace context+len with lsm_context
  lsm: ensure the correct LSM context releaser
2025-01-21 20:03:04 -08:00
Kuniyuki Iwashima
f7a6082b5e ipv6: Add __in6_dev_get_rtnl_net().
We will convert rtnl_lock() to rtnl_net_lock(), and we want to
convert __in6_dev_get() too.

__in6_dev_get() uses rcu_dereference_rtnl(), but as written in its
comment, rtnl_dereference() or rcu_dereference() is preferable.

Let's add __in6_dev_get_rtnl_net() that uses rtnl_net_dereference().

We can add the RCU version helper later if needed.

Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Link: https://patch.msgid.link/20250115080608.28127-2-kuniyu@amazon.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-20 12:16:04 -08:00
Jakub Kicinski
4fd001f5f3 netfilter pull request 25-01-19

Merge tag 'nf-next-25-01-19' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf-next

Pablo Neira Ayuso says:

====================
Netfilter updates for net-next

The following batch contains Netfilter updates for net-next:

1) Unbreak set size settings for the rbtree set backend. Intervals in
   rbtree are represented as two elements; this detail is leaked
   to userspace, leading to bogus ENOSPC from the control plane.

2) Remove dead code in br_netfilter's br_nf_pre_routing_finish()
   due to an error condition that never matches when looking up the route,
   from Antoine Tenart.

3) Simplify check for device already in use in flowtable,
   from Phil Sutter.

4) Three patches to restore interface name field in struct nft_hook
   and use it, this is to prepare for wildcard interface support.
   From Phil Sutter.

5) Do not remove netdev basechain when last device is gone, this is
   for consistency with the flowtable behaviour. This allows for netdev
   basechains without devices. Another patch to simplify netdev event
   notifier after this update. Also from Phil.

6) Two patches to add missing spinlock when flowtable updates TCP
   state flags, from Florian Westphal.

7) Simplify __nf_ct_refresh_acct() by removing skbuff parameter,
   also from Florian.

8) Flowtable gc now extends ct timeout for offloaded flow. This
   is to address a possible race that leads to handing over flow
   to classic path with long ct timeouts.

9) Tear down flow if cached rt_mtu is stale, before this patch,
   packet is handed over to classic path but flow entry still remained
   in place.

10) Revisit the flowtable teardown strategy, which was originally
    designed to release flowtable hardware entries early. Add a new
    CLOSING flag that still allows hardware to release entries when
    fin/rst is seen, but keeps the flow entry in place when the
    TCP connection is closed. Release flow after timeout or when a new
    syn packet is seen for TCP reopen scenario.

* tag 'nf-next-25-01-19' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf-next:
  netfilter: flowtable: add CLOSING state
  netfilter: flowtable: teardown flow if cached mtu is stale
  netfilter: conntrack: rework offload nf_conn timeout extension logic
  netfilter: conntrack: remove skb argument from nf_ct_refresh
  netfilter: nft_flow_offload: update tcp state flags under lock
  netfilter: nft_flow_offload: clear tcp MAXACK flag before moving to slowpath
  netfilter: nf_tables: Simplify chain netdev notifier
  netfilter: nf_tables: Tolerate chains with no remaining hooks
  netfilter: nf_tables: Compare netdev hooks based on stored name
  netfilter: nf_tables: Use stored ifname in netdev hook dumps
  netfilter: nf_tables: Store user-defined hook ifname
  netfilter: nf_tables: Flowtable hook's pf value never varies
  netfilter: br_netfilter: remove unused conditional and dead code
  netfilter: nf_tables: fix set size with rbtree backend
====================

Link: https://patch.msgid.link/20250119172051.8261-1-pablo@netfilter.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-20 11:59:25 -08:00
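
Item 10 of the pull request above describes a teardown strategy that reads like a small state machine. The following is a toy model of that description only, not the kernel implementation: the class, state, and event names are invented for illustration, and the real flowtable code tracks this with flags on the flow entry.

```python
from enum import Enum, auto


class FlowState(Enum):
    ACTIVE = auto()    # flow is offloaded and forwarding
    CLOSING = auto()   # fin/rst seen: hardware entry released, flow entry kept
    RELEASED = auto()  # flow entry removed


class Flow:
    """Hypothetical model of one flowtable entry."""

    def __init__(self):
        self.state = FlowState.ACTIVE
        self.hw_offloaded = True

    def on_event(self, event):
        """Apply one event ('fin', 'rst', 'syn' or 'timeout') and return the state."""
        if self.state is FlowState.RELEASED:
            return self.state
        if event in ("fin", "rst"):
            # Hardware may release its entry early, but the flow entry stays.
            self.hw_offloaded = False
            self.state = FlowState.CLOSING
        elif event == "timeout":
            self.hw_offloaded = False
            self.state = FlowState.RELEASED
        elif event == "syn" and self.state is FlowState.CLOSING:
            # A new syn on a closing flow is the TCP reopen scenario.
            self.hw_offloaded = False
            self.state = FlowState.RELEASED
        return self.state


flow = Flow()
state_after_fin = flow.on_event("fin")
state_after_syn = flow.on_event("syn")
print(state_after_fin.name, state_after_syn.name)
```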
Jakub Kicinski
3c836451ca net: move HDS config from ethtool state
Separate the HDS config from the ethtool state struct.
The HDS config contains just simple parameters, not state.
Having it as a separate struct will make it easier to clone / copy
and also long term potentially make it per-queue.

Reviewed-by: Michael Chan <michael.chan@broadcom.com>
Link: https://patch.msgid.link/20250119020518.1962249-2-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-20 11:44:57 -08:00
Kuniyuki Iwashima
b3e365bbf4 af_unix: Set drop reason in unix_dgram_disconnected().
unix_dgram_disconnected() is called from two places:

  1. when a connect()ed socket dis-connect()s or re-connect()s to
     another socket

  2. when sendmsg() fails because the peer socket that the client
     has connect()ed to has been close()d

Then, the client's recv queue is purged to remove all messages from
the old peer socket.

Let's define a new drop reason for that case.

  # echo 1 > /sys/kernel/tracing/events/skb/kfree_skb/enable

  # python3
  >>> from socket import *
  >>>
  >>> # s1 has a message from s2
  >>> s1, s2 = socketpair(AF_UNIX, SOCK_DGRAM)
  >>> s2.send(b'hello world')
  >>>
  >>> # re-connect() drops the message from s2
  >>> s3 = socket(AF_UNIX, SOCK_DGRAM)
  >>> s3.bind('')
  >>> s1.connect(s3.getsockname())

  # cat /sys/kernel/tracing/trace_pipe
     python3-250 ... kfree_skb: ... location=skb_queue_purge_reason+0xdc/0x110 reason: UNIX_DISCONNECT

Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Link: https://patch.msgid.link/20250116053441.5758-8-kuniyu@amazon.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-20 11:27:41 -08:00
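
The REPL session in the commit message above can also be written as a script that checks the user-visible effect without tracing. This is a sketch under assumptions: it relies on Linux autobind for `bind('')`, and the function name is made up. It returns True if the queued datagram is gone after the re-connect, False if it is still readable, and None if the platform does not support the sequence.

```python
import socket


def reconnect_drops_queued_datagram():
    """Re-connect a datagram socket and report whether its queued message vanished."""
    if not hasattr(socket, "AF_UNIX"):
        return None
    s1, s2 = socket.socketpair(socket.AF_UNIX, socket.SOCK_DGRAM)
    s3 = socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM)
    try:
        # s1 now has a message from s2 in its receive queue.
        s2.send(b'hello world')
        # Autobind s3 (Linux-specific) and point s1 at it instead of s2.
        s3.bind('')
        s1.connect(s3.getsockname())
        s1.setblocking(False)
        try:
            s1.recv(64)
        except BlockingIOError:
            return True   # nothing left to read: the old message was purged
        return False
    except OSError:
        return None       # e.g. autobind not available on this platform
    finally:
        for sock in (s1, s2, s3):
            sock.close()


result = reconnect_drops_queued_datagram()
print("queued datagram dropped:", result)
```

With the kfree_skb tracepoint enabled as in the commit message, running the script on a kernel that contains this patch should log the drop with the new reason.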
Kuniyuki Iwashima
533643b091 af_unix: Set drop reason in manage_oob().
AF_UNIX SOCK_STREAM socket supports MSG_OOB.

When OOB data is sent to a socket, recv() will break at that point.

If the next recv() does not have MSG_OOB, the normal data following
the OOB data is returned.

Then, the OOB skb is dropped.

Let's define a new drop reason for that case in manage_oob().

  # echo 1 > /sys/kernel/tracing/events/skb/kfree_skb/enable

  # python3
  >>> from socket import *
  >>> s1, s2 = socketpair(AF_UNIX)
  >>> s1.send(b'a', MSG_OOB)
  >>> s1.send(b'b')
  >>> s2.recv(2)
  b'b'

  # cat /sys/kernel/tracing/trace_pipe
  ...
     python3-223 ... kfree_skb: ... location=unix_stream_read_generic+0x59e/0xc20 reason: UNIX_SKIP_OOB

Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Link: https://patch.msgid.link/20250116053441.5758-6-kuniyu@amazon.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-20 11:27:41 -08:00
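
The OOB sequence from the commit message above, as a standalone script. This is only a sketch: it assumes a kernel built with MSG_OOB support for AF_UNIX stream sockets, and the function name is hypothetical. It returns whatever the non-OOB `recv()` yields, or None if the platform rejects the sequence.

```python
import socket


def recv_after_oob():
    """Send one OOB byte followed by one normal byte, then read without MSG_OOB."""
    if not hasattr(socket, "AF_UNIX"):
        return None
    s1, s2 = socket.socketpair(socket.AF_UNIX)  # SOCK_STREAM by default
    try:
        s2.settimeout(1.0)             # never hang if nothing arrives
        s1.send(b'a', socket.MSG_OOB)  # OOB data
        s1.send(b'b')                  # normal data following the OOB byte
        return s2.recv(2)
    except OSError:
        return None                    # MSG_OOB unsupported, or recv timed out
    finally:
        s1.close()
        s2.close()


data = recv_after_oob()
print("recv() returned:", data)
```

The commit message reports `b'b'` for this read, which is the case where the OOB skb is skipped and dropped.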
Kuniyuki Iwashima
c32f0bd7d4 af_unix: Set drop reason in unix_release_sock().
unix_release_sock() is called when the last refcnt of struct file
is released.

Let's define a new drop reason SKB_DROP_REASON_SOCKET_CLOSE and
set it for kfree_skb() in unix_release_sock().

  # echo 1 > /sys/kernel/tracing/events/skb/kfree_skb/enable

  # python3
  >>> from socket import *
  >>> s1, s2 = socketpair(AF_UNIX)
  >>> s1.send(b'hello world')
  >>> s2.close()

  # cat /sys/kernel/tracing/trace_pipe
  ...
     python3-280 ... kfree_skb: ... protocol=0 location=unix_release_sock+0x260/0x420 reason: SOCKET_CLOSE

To be precise, unix_release_sock() is also called for a new child
socket in unix_stream_connect() when something fails, but the new
sk does not have skb in the recv queue then and no event is logged.

Note that only tcp_inbound_ao_hash() uses a similar drop reason,
SKB_DROP_REASON_TCP_CLOSE, and this can be generalised later.

Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Link: https://patch.msgid.link/20250116053441.5758-3-kuniyu@amazon.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-20 11:27:40 -08:00
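
The three af_unix commits above each show a `kfree_skb` line from `trace_pipe`. A small parser can pull out the location and the drop reason from such lines. This is a sketch: the sample lines are copied from the commit messages, where parts are elided with "...", so the regular expression only relies on the `location=` and `reason:` fields that are visible there.

```python
import re
from collections import Counter

# Matches the visible tail of a kfree_skb trace line.
TRACE_RE = re.compile(
    r"kfree_skb: .*?location=(?P<location>\S+) reason: (?P<reason>\w+)"
)

SAMPLE_LINES = [
    "python3-250 ... kfree_skb: ... location=skb_queue_purge_reason+0xdc/0x110 reason: UNIX_DISCONNECT",
    "python3-223 ... kfree_skb: ... location=unix_stream_read_generic+0x59e/0xc20 reason: UNIX_SKIP_OOB",
    "python3-280 ... kfree_skb: ... protocol=0 location=unix_release_sock+0x260/0x420 reason: SOCKET_CLOSE",
]


def parse_drop(line):
    """Return (location, reason) for a kfree_skb trace line, or None if it does not match."""
    match = TRACE_RE.search(line)
    if match is None:
        return None
    return match.group("location"), match.group("reason")


drops = [parse_drop(line) for line in SAMPLE_LINES]
reason_counts = Counter(reason for _, reason in drops)
for location, reason in drops:
    print(f"{reason:16} {location}")
```

Counting by reason, as `reason_counts` does, is one way to summarize a longer capture.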
Kuniyuki Iwashima
454d402481 net: dropreason: Gather SOCKET_ drop reasons.
The following patch adds a new drop reason starting with
the SOCKET_ prefix.

Let's gather the existing SOCKET_ reasons.

Note that the order is not part of uAPI.

Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Link: https://patch.msgid.link/20250116053441.5758-2-kuniyu@amazon.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-20 11:27:40 -08:00
Ales Nezbeda
457bb7970a net: macsec: Add endianness annotations in salt struct
This change resolves warnings produced by the sparse tool: currently
there is a mismatch between the plain generic types in the salt struct
and the endian-annotated types used in the macsec driver code.
Endian-annotated types should be used here.

Sparse output:
warning: restricted ssci_t degrades to integer
warning: incorrect type in assignment (different base types)
    expected restricted ssci_t [usertype] ssci
    got unsigned int
warning: restricted __be64 degrades to integer
warning: incorrect type in assignment (different base types)
    expected restricted __be64 [usertype] pn
    got unsigned long long

Signed-off-by: Ales Nezbeda <anezbeda@redhat.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Reviewed-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2025-01-20 12:20:42 +00:00
Xin Long
a12c76a033 net: sched: refine software bypass handling in tc_run
This patch addresses issues with filter counting in block (tcf_block),
particularly for software bypass scenarios, by introducing a more
accurate mechanism using useswcnt.

Previously, filtercnt and skipswcnt were introduced by:

  Commit 2081fd3445 ("net: sched: cls_api: add filter counter") and
  Commit f631ef39d8 ("net: sched: cls_api: add skip_sw counter")

  filtercnt tracked all tp (tcf_proto) objects added to a block, and
  skipswcnt counted tp objects with the skipsw attribute set.

The problem is: a single tp can contain multiple filters, some with skipsw
and others without. The current implementation fails in the following case:

  When the first filter in a tp has skipsw, both skipswcnt and filtercnt
  are incremented, then adding a second filter without skipsw to the same
  tp does not modify these counters because tp->counted is already set.

  This results in bypass software behavior based solely on skipswcnt
  equaling filtercnt, even when the block includes filters without
  skipsw. Consequently, filters without skipsw are inadvertently bypassed.

To address this, the patch introduces useswcnt in block to explicitly count
tp objects containing at least one filter without skipsw. Key changes
include:

  Whenever a filter without skipsw is added, its tp is marked with usesw
  and counted in useswcnt. tc_run() now uses useswcnt to determine software
  bypass, eliminating reliance on filtercnt and skipswcnt.

  This refined approach prevents software bypass for blocks containing
  mixed filters, ensuring correct behavior in tc_run().
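
The counting problem and the fix can be illustrated with a toy model. This
is only a sketch, assuming a simplified view of the bookkeeping described
above: the Tp and Block classes, add_filter(), old_bypass() and
new_bypass() are hypothetical stand-ins and not the kernel code, and
filter deletion is not modelled.

```python
from dataclasses import dataclass, field


@dataclass
class Tp:
    """Toy stand-in for a tcf_proto: its filters and per-tp flags."""
    filters: list = field(default_factory=list)  # True means skip_sw is set
    counted: bool = False
    usesw: bool = False


@dataclass
class Block:
    """Toy stand-in for a tcf_block holding the old and new counters."""
    filtercnt: int = 0
    skipswcnt: int = 0
    useswcnt: int = 0


def add_filter(block, tp, skip_sw):
    tp.filters.append(skip_sw)
    # Old scheme: counters are only touched for the first filter of a tp.
    if not tp.counted:
        tp.counted = True
        block.filtercnt += 1
        if skip_sw:
            block.skipswcnt += 1
    # New scheme: count a tp once it holds any filter without skip_sw.
    if not skip_sw and not tp.usesw:
        tp.usesw = True
        block.useswcnt += 1


def old_bypass(block):
    """Old decision: bypass software when every counted tp had skip_sw."""
    return block.filtercnt > 0 and block.skipswcnt == block.filtercnt


def new_bypass(block):
    """New decision: bypass software only if no tp needs software."""
    return block.useswcnt == 0


block, tp = Block(), Tp()
add_filter(block, tp, skip_sw=True)   # first filter in the tp: skip_sw
add_filter(block, tp, skip_sw=False)  # second filter, same tp: software
print("old bypass:", old_bypass(block))
print("new bypass:", new_bypass(block))
```

With the mixed tp above, the old decision still bypasses software while
the new one does not, which is the case described in the failure scenario.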

Additionally, as atomic operations on useswcnt ensure thread safety and
tp->lock guards access to tp->usesw and tp->counted, the broader lock
down_write(&block->cb_lock) is no longer required in tc_new_tfilter(),
and this resolves a performance regression caused by the filter counting
mechanism during parallel filter insertions.

  The improvement can be demonstrated using the following script:

  # cat insert_tc_rules.sh

    tc qdisc add dev ens1f0np0 ingress
    for i in $(seq 16); do
        taskset -c $i tc -b rules_$i.txt &
    done
    wait

  Each of the rules_$i.txt files above contains 100000 tc filter rules
  for an mlx5 driver NIC, ens1f0np0.

  Without this patch:

  # time sh insert_tc_rules.sh

    real    0m50.780s
    user    0m23.556s
    sys     4m13.032s

  With this patch:

  # time sh insert_tc_rules.sh

    real    0m17.718s
    user    0m7.807s
    sys     3m45.050s

Fixes: 047f340b36 ("net: sched: make skip_sw actually skip software")
Reported-by: Shuang Li <shuali@redhat.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Reviewed-by: Asbjørn Sloth Tønnesen <ast@fiberby.net>
Tested-by: Asbjørn Sloth Tønnesen <ast@fiberby.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2025-01-20 09:21:27 +00:00
Pablo Neira Ayuso
fdbaf51633 netfilter: flowtable: add CLOSING state
A tcp rst/fin packet triggers an immediate teardown of the flow, which
results in sending flows back to the classic forwarding path.

This behaviour was introduced by:

  da5984e510 ("netfilter: nf_flow_table: add support for sending flows back to the slow path")
  b6f27d322a ("netfilter: nf_flow_table: tear down TCP flows if RST or FIN was seen")

whose goal is to expedite removal of flow entries from the hardware
table. Before these patches, the flow was released after the flow entry
timed out.

However, this approach leads to packet races when restoring the
conntrack state as well as late flow re-offload situations when the TCP
connection is ending.

This patch adds a new CLOSING state that is entered when tcp rst/fin
packet is seen. This allows for an early removal of the flow entry from
the hardware table. But the flow entry still remains in software, so tcp
packets to shut down the flow are not sent back to slow path.

If syn packet is seen from this new CLOSING state, then this flow enters
teardown state, ct state is set to TCP_CONNTRACK_CLOSE state and packet
is sent to slow path, so this TCP reopen scenario can be handled by
conntrack. TCP_CONNTRACK_CLOSE provides a small timeout that aims at
quickly releasing this stale entry from the conntrack table.

Moreover, skip hardware re-offload from flowtable software packet if the
flow is in CLOSING state.
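
The transitions described above can be sketched as a small state machine.
This is an illustrative model only: the Flow class, the OFFLOADED state
name and the "fast"/"slow" return values are hypothetical, and cases the
text does not cover (such as a syn seen outside CLOSING) are left as
ordinary fast-path packets.

```python
from enum import Enum, auto


class FlowState(Enum):
    OFFLOADED = auto()  # hypothetical name for a normally offloaded flow
    CLOSING = auto()
    TEARDOWN = auto()


class Flow:
    def __init__(self):
        self.state = FlowState.OFFLOADED
        self.in_hw = True               # entry present in the hardware table
        self.ct_state = "ESTABLISHED"   # simplified conntrack state

    def handle_tcp(self, syn=False, fin=False, rst=False):
        """Return "fast" if the flowtable keeps the packet, else "slow"."""
        if self.state is FlowState.TEARDOWN:
            return "slow"
        if self.state is FlowState.CLOSING and syn:
            # TCP reopen: hand the connection back to conntrack.
            self.state = FlowState.TEARDOWN
            self.ct_state = "TCP_CONNTRACK_CLOSE"
            return "slow"
        if fin or rst:
            # Drop the hardware entry early, keep the software entry.
            self.state = FlowState.CLOSING
            self.in_hw = False
        return "fast"

    def may_reoffload_to_hw(self):
        """Software packets must not re-offload a closing flow."""
        return self.state is FlowState.OFFLOADED


flow = Flow()
print(flow.handle_tcp(fin=True), flow.state.name, flow.in_hw)
print(flow.handle_tcp(), flow.may_reoffload_to_hw())
print(flow.handle_tcp(syn=True), flow.state.name, flow.ct_state)
```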

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2025-01-19 16:41:56 +01:00
Florian Westphal
03428ca5ce netfilter: conntrack: rework offload nf_conn timeout extension logic
Offload nf_conn entries may not see traffic for a very long time.

To prevent incorrect 'ct is stale' checks during nf_conntrack table
lookup, the gc worker extends the timeout of nf_conn entries marked for
offload to a large value.

The existing logic suffers from a few problems.

Garbage collection runs without locks, it's unlikely but possible
that @ct is removed right after the 'offload' bit test.

In that case, the timeout of a new/reallocated nf_conn entry will
be increased.

Prevent this by obtaining a reference count on the ct object and
re-checking the confirmed and offload bits.

If those are not set, the ct is being removed, skip the timeout
extension in this case.

Parallel teardown is also problematic:
 cpu1                                cpu2
 gc_worker
                                     calls flow_offload_teardown()
 tests OFFLOAD bit, set
                                     clear OFFLOAD bit
                                     ct->timeout is repaired (e.g. set to timeout[UDP_CT_REPLIED])
 nf_ct_offload_timeout() called
 expire value is fetched
 <INTERRUPT>
-> NF_CT_DAY timeout for flow that isn't offloaded
(and might not see any further packets).

Use cmpxchg: if ct->timeout was repaired after the 2nd 'offload bit' test
passed, then ct->timeout will only be updated if ct->timeout was not
altered in between.
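
The compare-and-exchange idea can be sketched in user space as follows.
This is a simplified model under stated assumptions: the Conn class, the
repair parameter (which simulates the other CPU repairing the timeout in
between) and the lock used to emulate an atomic cmpxchg are all
hypothetical, and NF_CT_DAY is represented as plain seconds.

```python
import threading

NF_CT_DAY = 86400  # stand-in value in seconds for the large offload timeout


class Conn:
    """Toy stand-in for an nf_conn with only a timeout field."""

    def __init__(self, timeout):
        self.timeout = timeout
        self._lock = threading.Lock()  # emulates the atomicity of cmpxchg

    def cmpxchg_timeout(self, expected, new):
        """Store new only if timeout still equals expected; return old."""
        with self._lock:
            seen = self.timeout
            if seen == expected:
                self.timeout = new
            return seen


def extend_offload_timeout(ct, repair=None):
    """Try to extend the timeout; return True if the update was applied."""
    expire = ct.timeout          # value fetched by the gc worker
    if repair is not None:       # simulated teardown repairing ct->timeout
        ct.timeout = repair
    return ct.cmpxchg_timeout(expire, NF_CT_DAY) == expire


ct = Conn(timeout=30)
print(extend_offload_timeout(ct), ct.timeout)              # no race

ct = Conn(timeout=30)
print(extend_offload_timeout(ct, repair=120), ct.timeout)  # raced with repair
```

In the raced case the repaired value survives, so a flow that is no
longer offloaded does not end up with the day-long timeout.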

As we already have a gc worker for flowtable entries, ct->timeout repair
can be handled from the flowtable gc worker.

This avoids having flowtable specific logic in the conntrack core
and avoids checking entries that were never offloaded.

This allows removing the nf_ct_offload_timeout helper.
It's safe to use in the add case, but not on teardown.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2025-01-19 16:41:55 +01:00
Florian Westphal
31768596b1 netfilter: conntrack: remove skb argument from nf_ct_refresh
It's not used (and could be NULL), so remove it.
This allows using nf_ct_refresh in places where we don't have
an skb without having to double-check that skb == NULL would be safe.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2025-01-19 16:41:55 +01:00
Phil Sutter
fc0133428e netfilter: nf_tables: Tolerate chains with no remaining hooks
Do not drop a netdev-family chain if the last interface it is registered
for vanishes. Users dumping and storing the ruleset upon shutdown to
restore it upon next boot may otherwise lose the chain and all contained
rules. They will still lose the list of devices; a later patch will fix
that. For now, this aligns the event handler's behaviour with that for
flowtables.
The controversial situation at netns exit should be no problem here:
event handler will unregister the hooks, core nftables cleanup code will
drop the chain itself.

Signed-off-by: Phil Sutter <phil@nwl.cc>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2025-01-19 16:41:54 +01:00
Phil Sutter
b7c2d793c2 netfilter: nf_tables: Store user-defined hook ifname
Prepare for hooks with NULL ops.dev pointer (due to non-existent device)
and store the interface name and length as specified by the user upon
creation. No functional change intended.

Signed-off-by: Phil Sutter <phil@nwl.cc>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2025-01-19 16:41:53 +01:00
Pablo Neira Ayuso
8d738c1869 netfilter: nf_tables: fix set size with rbtree backend
The existing rbtree implementation uses singleton elements to represent
ranges, however, userspace provides a set size according to the number
of ranges in the set.

Adjust the set size provided by userspace to the number of singleton
elements in the kernel by multiplying the number of ranges by two.

Check if the no-match all-zero element is already in the set, in which
case release one slot in the set size.

Fixes: 0ed6389c48 ("netfilter: nf_tables: rename set implementations")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2025-01-19 16:41:41 +01:00
Jakub Kicinski
1a280c54fd bluetooth-next pull request for net-next:
 -----BEGIN PGP SIGNATURE-----
 
 iQJNBAABCAA3FiEE7E6oRXp8w05ovYr/9JCA4xAyCykFAmeIIfsZHGx1aXoudm9u
 LmRlbnR6QGludGVsLmNvbQAKCRD0kIDjEDILKfeQD/9PPfKysA655jB+EOnMJ1Qt
 43uibOX3aaqswjtynwR3fNOB5rnibev8uHWJlKh4CXj2X6v0hmJkz+sYSgRzLUZP
 4Lap/AdY0jqTFlaIzmKGc6fTp/cxUYd/guUEMk8tlOLlTPJ7WFIAEq7Tu2LXe8VS
 QGWXdzgsXMoQ36PmcfnM2TTsz+5U674nCvGF5IFSJABnnv/WJUxmbyDnZYAm66R5
 bfmOJ7qzWgq+exYz9g8V5Frwvmp92AlndHcuHIvAvvTryfNgOeP91JAWgczGWhfR
 2qwqVAn9Tkgu2GNkCPbXUnU+wPYBUc7APjLtS0kZDm/sYugZr7Q0Omqt/9YuyhsD
 7pvYOj1bNrpTjHHqNvD5qY3O2882wmm85DIyE1mfMDEknc5n26L68YQKrXdrSeqS
 M/F6FGdUxXapcdtu8+lNMgv8ByvVdPo0I8cBhJSNyNQ7P2D1q9+MFTBnAIcDD0aB
 q9vgMIzNCrZAHai3YzjI8BxEofFn12WO1LyKJAgvknfFDS2CBrZFt39QoqPDDvoe
 wACjcf1A55FFvTUF91KClWSEBOP/XubZmc7SO/e8TSlll9maOROVooRWMq/dAKeR
 Yau1TBpzKmRnj+UxFpIrkP7yRlUADNUWQ5GnsE+q26IkyShZ5vgir/GS9pgE6kQr
 5f/Qw6rlvLVvqAnPOAqFPg==
 =21qg
 -----END PGP SIGNATURE-----

Merge tag 'for-net-next-2025-01-15' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-next

Luiz Augusto von Dentz says:

====================
bluetooth-next pull request for net-next:

 - btusb: Add new VID/PID 13d3/3610 for MT7922
 - btusb: Add new VID/PID 13d3/3628 for MT7925
 - btusb: Add MT7921e device 13d3:3576
 - btusb: Add RTL8851BE device 13d3:3600
 - btusb: Add ID 0x2c7c:0x0130 for Qualcomm WCN785x
 - btusb: add sysfs attribute to control USB alt setting
 - qca: Expand firmware-name property
 - qca: Fix poor RF performance for WCN6855
 - L2CAP: handle NULL sock pointer in l2cap_sock_alloc
 - Allow reset via sysfs
 - ISO: Allow BIG re-sync
 - dt-bindings: Utilize PMU abstraction for WCN6750
 - MGMT: Mark LL Privacy as stable

* tag 'for-net-next-2025-01-15' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-next: (23 commits)
  Bluetooth: MGMT: Fix slab-use-after-free Read in mgmt_remove_adv_monitor_sync
  Bluetooth: qca: Fix poor RF performance for WCN6855
  Bluetooth: Allow reset via sysfs
  Bluetooth: Get rid of cmd_timeout and use the reset callback
  Bluetooth: Remove the cmd timeout count in btusb
  Bluetooth: Use str_enable_disable-like helpers
  Bluetooth: btmtk: Remove resetting mt7921 before downloading the fw
  Bluetooth: L2CAP: handle NULL sock pointer in l2cap_sock_alloc
  Bluetooth: btusb: Add RTL8851BE device 13d3:3600
  dt-bindings: bluetooth: Utilize PMU abstraction for WCN6750
  Bluetooth: btusb: Add MT7921e device 13d3:3576
  Bluetooth: btrtl: check for NULL in btrtl_setup_realtek()
  Bluetooth: btbcm: Fix NULL deref in btbcm_get_board_name()
  Bluetooth: qca: Expand firmware-name to load specific rampatch
  Bluetooth: qca: Update firmware-name to support board specific nvm
  dt-bindings: net: bluetooth: qca: Expand firmware-name property
  Bluetooth: btusb: Add new VID/PID 13d3/3628 for MT7925
  Bluetooth: btusb: Add new VID/PID 13d3/3610 for MT7922
  Bluetooth: btusb: add sysfs attribute to control USB alt setting
  Bluetooth: btusb: Add ID 0x2c7c:0x0130 for Qualcomm WCN785x
  ...
====================

Link: https://patch.msgid.link/20250117213203.3921910-1-luiz.dentz@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-18 17:51:58 -08:00
Jakub Kicinski
66cc61a25c wireless-next patches for v6.14
 -----BEGIN PGP SIGNATURE-----
 
 iQFFBAABCgAvFiEEiBjanGPFTz4PRfLobhckVSbrbZsFAmeKvq8RHGt2YWxvQGtl
 cm5lbC5vcmcACgkQbhckVSbrbZtJ3wf/bMRrED1k1xtC117yxBthXOZj7Ae0MG3s
 cqIDBZ2MSmi596jfmc/FRFTN6Rie7+WQg8MGhpWpJt4WyvCd+IqmluLwuJGg3oHz
 KdDQHN766cPb50mykTBY0m+Uh9eqyhaoozzK+KbrJZNLUt/Ri3DOdPInWP/ODKuT
 uo9xv25zmOMQuUJlhOlFpOW68y6isuMsnPpgTvbtjndw3nPe+tVIfGPdg3xeQnq1
 SG05wmOeJ8obmO1VsnmHPN20F5PfS5t2asu12UafR7KTiJOdP2QXKGNTy+ag7fQ/
 9Sq6SzuPpbHHJJ93f0aJOkTUKdQVmk2xfCYNOyPNu1ewLqiVlvnkew==
 =Vu+D
 -----END PGP SIGNATURE-----

Merge tag 'wireless-next-2025-01-17' of git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next

Kalle Valo says:

====================
wireless-next patches for v6.14

Most likely the last "new features" pull request for v6.14 and this is
a bigger one. Multi-Link Operation (MLO) work continues both in stack
and in drivers. A few new devices are supported, plus the usual fixes
all over.

Major changes:

cfg80211
 * Emergency Preparedness Communication Services (EPCS) station mode support

mac80211
 * an option to filter a sta from being flushed
 * some support for RX Operating Mode Indication (OMI) power saving
 * support for adding and removing station links for MLO

iwlwifi
 * new device ids
 * rework firmware error handling and restart

rtw88
 * RTL8812A: RFE type 2 support
 * LED support

rtw89
 * variant info to support RTL8922AE-VS

mt76
 * mt7996: single wiphy multiband support (preparation for MLO)
 * mt7996: support for more variants
 * mt792x: P2P_DEVICE support
 * mt7921u: TP-Link TXE50UH support

ath12k
 * enable MLO for QCN9274 (although it seems to be broken with dual
   band devices)
 * MLO radar detection support
 * debugfs: transmit buffer OFDMA, AST entry and puncture stats

* tag 'wireless-next-2025-01-17' of git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next: (322 commits)
  wifi: brcmfmac: fix NULL pointer dereference in brcmf_txfinalize()
  wifi: rtw88: add RTW88_LEDS depends on LEDS_CLASS to Kconfig
  wifi: wilc1000: unregister wiphy only after netdev registration
  wifi: cfg80211: adjust allocation of colocated AP data
  wifi: mac80211: fix memory leak in ieee80211_mgd_assoc_ml_reconf()
  wifi: ath12k: fix key cache handling
  wifi: ath12k: Fix uninitialized variable access in ath12k_mac_allocate() function
  wifi: ath12k: Remove ath12k_get_num_hw() helper function
  wifi: ath12k: Refactor the ath12k_hw get helper function argument
  wifi: ath12k: Refactor ath12k_hw set helper function argument
  wifi: mt76: mt7996: add implicit beamforming support for mt7992
  wifi: mt76: mt7996: fix beacon command during disabling
  wifi: mt76: mt7996: fix ldpc setting
  wifi: mt76: mt7996: fix definition of tx descriptor
  wifi: mt76: connac: adjust phy capabilities based on band constraints
  wifi: mt76: mt7996: fix incorrect indexing of MIB FW event
  wifi: mt76: mt7996: fix HE Phy capability
  wifi: mt76: mt7996: fix the capability of reception of EHT MU PPDU
  wifi: mt76: mt7996: add max mpdu len capability
  wifi: mt76: mt7921: avoid undesired changes of the preset regulatory domain
  ...
====================

Link: https://patch.msgid.link/20250117203529.72D45C4CEDD@smtp.kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-18 17:46:54 -08:00
Vladimir Oltean
4b0a3ffa79 net: dsa: implement get_ts_stats ethtool operation for user ports
Integrate with the standard infrastructure for reporting hardware packet
timestamping statistics.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Link: https://patch.msgid.link/20250116104628.123555-3-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-17 20:01:09 -08:00
Jakub Kicinski
ba0209bd18 Merge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue
Tony Nguyen says:

====================
ice: support FW Recovery Mode

Konrad Knitter says:

Enable update of card in FW Recovery Mode

* '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue:
  ice: support FW Recovery Mode
  devlink: add devl guard
  pldmfw: enable selected component update
====================

Link: https://patch.msgid.link/20250116212059.1254349-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-17 19:45:35 -08:00
Konrad Knitter
0502bd2e06 devlink: add devl guard
Add devl guard for scoped_guard().

Example usage:

scoped_guard(devl, priv_to_devlink(pf)) {
	err = init_devlink(pf);
	if (err)
		return err;
}

Co-developed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Signed-off-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Signed-off-by: Konrad Knitter <konrad.knitter@intel.com>
Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-01-16 13:04:58 -08:00
Jakub Kicinski
2ee738e90e Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Cross-merge networking fixes after downstream PR (net-6.13-rc8).

Conflicts:

drivers/net/ethernet/realtek/r8169_main.c
  1f691a1fc4 ("r8169: remove redundant hwmon support")
  152d00a913 ("r8169: simplify setting hwmon attribute visibility")
https://lore.kernel.org/20250115122152.760b4e8d@canb.auug.org.au

Adjacent changes:

drivers/net/ethernet/broadcom/bnxt/bnxt.c
  152f4da05a ("bnxt_en: add support for rx-copybreak ethtool command")
  f0aa6a37a3 ("eth: bnxt: always recalculate features after XDP clearing, fix null-deref")

drivers/net/ethernet/intel/ice/ice_type.h
  50327223a8 ("ice: add lock to protect low latency interface")
  dc26548d72 ("ice: Fix quad registers read on E825")

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-16 10:34:59 -08:00
Linus Torvalds
ce69b40190 Current release - regressions:
 Signed-off-by: Paolo Abeni <pabeni@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 
 iQJGBAABCAAwFiEEg1AjqC77wbdLX2LbKSR5jcyPE6QFAmeJGOISHHBhYmVuaUBy
 ZWRoYXQuY29tAAoJECkkeY3MjxOkfLwP/1XaeyEtSifDiF+bj7f3M6gd8RC2wkNq
 8DvHadl+uPx1RWv0F2UH9fsVz17A3Gg3oF2Agl4tMP5p9F0e489pNjm2QXOl1zac
 hpJdV0VdNJHKEfWhKODRfLap6fNtPoEQP5r3scbFYuzkdMw6sYZujdQUmFNghPYe
 Y6GKZIrQ96vYLpSTrLCAQt/2EEt608b3ESFFhqTkvB8voB2cODNxxBoTJ5K+jMa0
 +fVW46siGKc8HSaUJCWS5YkAW/Tu3AXJmYgKGQg9PaErVclwImsQFXggIki80P7W
 747Gkuc3kZm3Mt91d6kK1s5Sxr/FAaaJlOOE2iHpZld6cN+Y6niJ+knFdkaX5rCE
 T/aLq8cdegwSdct6CIJ7YZp3v1AVv21erWf7OpbY9KGTWPV9d2yzh3fYin87tAzs
 YYo0H1OqqbxpnKThgGREpu+LqEkCbMzsKmwn/5wTAZZl28ySZWZin2ukzTMRqla0
 Y8JJvBYvcHn/ekb4gJNaDhJF7ZBuLjrXlG1SXAyO+GS4TwToqrK/luPRf0tkbI/Z
 QVNBNCukdRTy/IeZQJsc1gtE1tlQmRXNlTbAILPIkWWdxgjdpd/wBbP8/qG9184l
 Ut4gu7AVF+LLH5nhgRVHAcfrO3i/kbRFC3ErQw06YLyqLInyss8GPULP7tWeFfE3
 iM/DsTHjjr04
 =EtnO
 -----END PGP SIGNATURE-----

Merge tag 'net-6.13-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

Pull networking fixes from Paolo Abeni:
 "Notably this includes fixes for a few regressions spotted very
  recently. No known outstanding ones.

  Current release - regressions:

   - core: avoid CFI problems with sock priv helpers

   - xsk: bring back busy polling support

   - netpoll: ensure skb_pool list is always initialized

  Current release - new code bugs:

   - core: make page_pool_ref_netmem work with net iovs

   - ipv4: route: fix drop reason being overridden in
     ip_route_input_slow

   - udp: make rehash4 independent in udp_lib_rehash()

  Previous releases - regressions:

   - bpf: fix bpf_sk_select_reuseport() memory leak

   - openvswitch: fix lockup on tx to unregistering netdev with carrier

   - mptcp: be sure to send ack when mptcp-level window re-opens

   - eth:
      - bnxt: always recalculate features after XDP clearing, fix
        null-deref
      - mlx5: fix sub-function add port error handling
      - fec: handle page_pool_dev_alloc_pages error

  Previous releases - always broken:

   - vsock: some fixes due to transport de-assignment

   - eth:
      - ice: fix E825 initialization
      - mlx5e: fix inversion dependency warning while enabling IPsec
        tunnel
      - gtp: destroy device along with udp socket's netns dismantle.
      - xilinx: axienet: Fix IRQ coalescing packet count overflow"

* tag 'net-6.13-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (44 commits)
  netdev: avoid CFI problems with sock priv helpers
  net/mlx5e: Always start IPsec sequence number from 1
  net/mlx5e: Rely on reqid in IPsec tunnel mode
  net/mlx5e: Fix inversion dependency warning while enabling IPsec tunnel
  net/mlx5: Clear port select structure when fail to create
  net/mlx5: SF, Fix add port error handling
  net/mlx5: Fix a lockdep warning as part of the write combining test
  net/mlx5: Fix RDMA TX steering prio
  net: make page_pool_ref_netmem work with net iovs
  net: ethernet: xgbe: re-add aneg to supported features in PHY quirks
  net: pcs: xpcs: actively unset DW_VR_MII_DIG_CTRL1_2G5_EN for 1G SGMII
  net: pcs: xpcs: fix DW_VR_MII_DIG_CTRL1_2G5_EN bit being set for 1G SGMII w/o inband
  selftests: net: Adapt ethtool mq tests to fix in qdisc graft
  net: fec: handle page_pool_dev_alloc_pages error
  net: netpoll: ensure skb_pool list is always initialized
  net: xilinx: axienet: Fix IRQ coalescing packet count overflow
  nfp: bpf: prevent integer overflow in nfp_bpf_event_output()
  selftests: mptcp: avoid spurious errors on disconnect
  mptcp: fix spurious wake-up on under memory pressure
  mptcp: be sure to send ack when mptcp-level window re-opens
  ...
2025-01-16 09:09:44 -08:00
Eric Dumazet
0734d7c3d9 net: expedite synchronize_net() for cleanup_net()
cleanup_net() is the single thread responsible
for netns dismantles, and a serious bottleneck.

Before we can get per-netns RTNL, make sure
all synchronize_net() calls from this thread
use synchronize_rcu_expedited().

v3: deal with CONFIG_NET_NS=n

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Jesse Brandeburg <jbrandeburg@cloudflare.com>
Link: https://patch.msgid.link/20250114205531.967841-2-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-15 19:17:03 -08:00
Pavel Begunkov
cbc16bceea net: make page_pool_ref_netmem work with net iovs
page_pool_ref_netmem() should work with either netmem representation, but
currently it casts to a page with netmem_to_page(), which will fail with
net iovs. Use netmem_get_pp_ref_count_ref() instead.

Fixes: 8ab79ed50c ("page_pool: devmem support")
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: David Wei <dw@davidwei.uk>
Link: https://lore.kernel.org/20250108220644.3528845-2-dw@davidwei.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-15 18:44:30 -08:00
Hsin-chen Chuang
f07d478090 Bluetooth: Get rid of cmd_timeout and use the reset callback
The hdev->reset callback is currently never used, while hdev->cmd_timeout
is what actually performs the reset. This patch changes the call path from
  hdev->cmd_timeout -> vendor_cmd_timeout -> btusb_reset -> hdev->reset
to
  hdev->reset -> vendor_reset -> btusb_reset
which makes the path clear when we export hdev->reset for wider usage,
e.g. allowing reset from sysfs.

This patch doesn't introduce any behavior change.

Signed-off-by: Hsin-chen Chuang <chharry@chromium.org>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
2025-01-15 10:36:42 -05:00
Dr. David Alan Gilbert
b05ce88960 Bluetooth: hci: Remove deadcode
hci_bdaddr_list_del_with_flags() was added in 2020's
commit 8baaa4038e ("Bluetooth: Add bdaddr_list_with_flags for classic
whitelist")
but has remained unused.

hci_remove_ext_adv_instance() was added in 2020's
commit eca0ae4aea ("Bluetooth: Add initial implementation of BIS
connections")
but has remained unused.

Remove them.

Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
2025-01-15 10:32:20 -05:00
Luiz Augusto von Dentz
e209e5ccc5 Bluetooth: MGMT: Mark LL Privacy as stable
This marks LL Privacy as stable by removing its experimental UUID and
moving its functionality to a Device Flag (HCI_CONN_FLAG_ADDRESS_RESOLUTION),
which can be set by MGMT Device Set Flags so userspace retains control of
the feature.

Link: https://github.com/bluez/bluez/issues/1028
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
2025-01-15 10:32:00 -05:00
Eric Dumazet
d16b344790 tcp: add LINUX_MIB_PAWS_OLD_ACK SNMP counter
Prior patch in the series added TCP_RFC7323_PAWS_ACK drop reason.

This patch adds the corresponding SNMP counter, for folks
using nstat instead of tracing for TCP diagnostics.

nstat -az | grep PAWSOldAck
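
The same counter can also be read programmatically. The sketch below
assumes the counter is exported in /proc/net/netstat under the TcpExt
prefix with the name PAWSOldAck (matching the nstat name above); the
helper names are hypothetical and the sample numbers are made up for
illustration.

```python
def parse_netstat(text):
    """Parse /proc/net/netstat-style text into {prefix: {name: value}}."""
    stats = {}
    lines = [ln for ln in text.splitlines() if ln.strip()]
    # The file consists of pairs: a header line with names, then values.
    for header, values in zip(lines[0::2], lines[1::2]):
        prefix, names = header.split(":", 1)
        vprefix, nums = values.split(":", 1)
        if prefix != vprefix:
            continue  # not a header/value pair; skip defensively
        stats[prefix] = dict(zip(names.split(), map(int, nums.split())))
    return stats


def read_paws_old_ack(path="/proc/net/netstat"):
    """Return the counter value, or None if this kernel does not have it."""
    with open(path) as f:
        return parse_netstat(f.read()).get("TcpExt", {}).get("PAWSOldAck")


# Made-up sample in the two-line header/value layout of the real file.
sample = "TcpExt: PAWSEstab PAWSOldAck\nTcpExt: 3 7\n"
print(parse_netstat(sample)["TcpExt"]["PAWSOldAck"])
```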

Suggested-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Neal Cardwell <ncardwell@google.com>
Reviewed-by: Jason Xing <kerneljasonxing@gmail.com>
Tested-by: Neal Cardwell <ncardwell@google.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Link: https://patch.msgid.link/20250113135558.3180360-4-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-14 13:28:13 -08:00
Eric Dumazet
124c4c32e9 tcp: add TCP_RFC7323_PAWS_ACK drop reason
XPS can cause reorders because of the relaxed OOO
conditions for pure ACK packets.

For hosts not using RFS, what can happen is that ACK
packets are sent on behalf of the cpu processing NIC
interrupts, selecting TX queue A for ACK packet P1.

Then a subsequent sendmsg() can run on another cpu.
TX queue selection uses the socket hash and can choose
another queue B for packets P2 (with payload).

If queue A is more congested than queue B,
the ACK packet P1 could be sent on the wire after
P2.

A linux receiver when processing P1 (after P2) currently increments
LINUX_MIB_PAWSESTABREJECTED (TcpExtPAWSEstab)
and uses the TCP_RFC7323_PAWS drop reason.
It might also send a DUPACK if not rate limited.

In order to better understand this pattern, this
patch adds a new drop_reason: TCP_RFC7323_PAWS_ACK.

For old ACKs like these, we no longer increment
LINUX_MIB_PAWSESTABREJECTED and no longer send a DUPACK,
keeping credit for other more interesting DUPACKs.

perf record -e skb:kfree_skb -a
perf script
...
         swapper       0 [148] 27475.438637: skb:kfree_skb: ... location=tcp_validate_incoming+0x4f0 reason: TCP_RFC7323_PAWS_ACK
         swapper       0 [208] 27475.438706: skb:kfree_skb: ... location=tcp_validate_incoming+0x4f0 reason: TCP_RFC7323_PAWS_ACK
         swapper       0 [208] 27475.438908: skb:kfree_skb: ... location=tcp_validate_incoming+0x4f0 reason: TCP_RFC7323_PAWS_ACK
         swapper       0 [148] 27475.439010: skb:kfree_skb: ... location=tcp_validate_incoming+0x4f0 reason: TCP_RFC7323_PAWS_ACK
         swapper       0 [148] 27475.439214: skb:kfree_skb: ... location=tcp_validate_incoming+0x4f0 reason: TCP_RFC7323_PAWS_ACK
         swapper       0 [208] 27475.439286: skb:kfree_skb: ... location=tcp_validate_incoming+0x4f0 reason: TCP_RFC7323_PAWS_ACK
...

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Neal Cardwell <ncardwell@google.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Jason Xing <kerneljasonxing@gmail.com>
Link: https://patch.msgid.link/20250113135558.3180360-3-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-14 13:28:13 -08:00
Paolo Abeni
624d7a8a9d netfilter pull request 25-01-11
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEjF9xRqF1emXiQiqU1w0aZmrPKyEFAmeC97YACgkQ1w0aZmrP
 KyEUfg//e+n/YKyHn3MRlHeKf9HnPdAdzrHqNrq8t8332x/4nFGBeMWPbEuM0H7B
 kM4eZUp8XjS5JF4ze8mZV6HJ7c2a0JHMkrs1/I9uZzBkvZayFl2ueL0cCQoizK0G
 opOLinsw/RUe6H/ulGEq2K7rtBjIAvWi5d2/i+oMERkr3ADOq0d4cWlGJra0a6Lb
 RyCPDJoQ65kNBHCChxYVhhpC8LlMCSDTuZPwcl58qGRRqiTIyVme05K9yCQrcJAO
 91trgnsTHghJ1xBcTAvewcSTDylkcL7qkiFuCYcvUmPmVBrFKn1g/va99rrsvVI6
 Fz2pbIPMZa/6Gdpx/mBzDaPv0XAU+cqLuIJ/t5fdRiCviHzEYZaGXh3+ZWLyhR8d
 FLXXC3V1IKnsBqqN3TC7Rx81h9c6jt3+KPKNMFOpoYtn3V9shHtjpAHCLW7UDSU6
 zPp8FhzwDS7BV91Oyp2uKWtmUr0GQFRpr9F/iSGp8Ix0l8kL+AD5t3M3LA/QDjnq
 GSpAsakrZMXgfFfAZxecvT8cVzWE0KydKrAawsFTn11s++rVCQlBLayZjHD6ZfuM
 IlP69GZcPgEvxT5lZIcg74pzCQpbR8jV6KME3XV8Kvu8FLNqv2l+hy1bB+x3YAFQ
 8BOalM7KPDNqNdK91DVkfEl0DxabmJcRj7R0C1L85R5EN4xYATA=
 =6jpN
 -----END PGP SIGNATURE-----

Merge tag 'nf-next-25-01-11' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf-next

Pablo Neira Ayuso says:

====================
Netfilter/IPVS updates for net-next

The following patchset contains a small batch of Netfilter/IPVS updates
for net-next:

1) Remove unused genmask parameter in nf_tables_addchain()

2) Speed up reads from /proc/net/ip_vs_conn, from Florian Westphal.

3) Skip empty buckets in hashlimit to avoid atomic operations that result
   in false positive reports by syzbot with lockdep enabled, patch from
   Eric Dumazet.

4) Add conntrack event timestamps available via ctnetlink,
   from Florian Westphal.

netfilter pull request 25-01-11

* tag 'nf-next-25-01-11' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf-next:
  netfilter: conntrack: add conntrack event timestamp
  netfilter: xt_hashlimit: htable_selective_cleanup() optimization
  ipvs: speed up reads from ip_vs_conn proc file
  netfilter: nf_tables: remove the genmask parameter
====================

Link: https://patch.msgid.link/20250111230800.67349-1-pablo@netfilter.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-01-14 12:08:24 +01:00
Benjamin Berg
6bd9a087c8 wifi: mac80211: set key link ID to the deflink one
When in non-MLO mode, the key ID was set to -1 even for keys that are
not pairwise. Change the link ID to be the link ID of the deflink in
this case so that drivers do not need to special-case this.

Signed-off-by: Benjamin Berg <benjamin.berg@intel.com>
Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com>
Link: https://patch.msgid.link/20250102161730.0c066f084677.I4a5c288465e75119edb6a0df90dddf6f30d14a02@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2025-01-13 15:34:09 +01:00
Ilan Peer
904c277342 wifi: cfg80211: Add support for controlling EPCS
Add support for configuring Emergency Preparedness Communication
Services (EPCS) for station mode.

Signed-off-by: Ilan Peer <ilan.peer@intel.com>
Reviewed-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com>
Link: https://patch.msgid.link/20250102161730.ea54ac94445c.I11d750188bc0871e13e86146a3b5cc048d853e69@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2025-01-13 15:34:09 +01:00
Ilan Peer
65c1c04179 wifi: cfg80211: Add support for dynamic addition/removal of links
Add support for requesting dynamic addition/removal of links to the
current MLO association.

Signed-off-by: Ilan Peer <ilan.peer@intel.com>
Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com>
Link: https://patch.msgid.link/20250102161730.cef23352f2a2.I79c849974c494cb1cbf9e1b22a5d2d37395ff5ac@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2025-01-13 15:34:08 +01:00
Benjamin Berg
f6d2e5abf1 wifi: nl80211: permit userspace to pass supported selectors
Currently the SAE_H2E selector already exists, which needs to be
implemented by the SME. As new such selectors might be added in the
future, add a feature to permit userspace to report a selector as
supported.

If not given, the kernel should assume that userspace does support
SAE_H2E.

Signed-off-by: Benjamin Berg <benjamin.berg@intel.com>
Reviewed-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com>
Link: https://patch.msgid.link/20250101070249.fe67b871cc39.Ieb98390328927e998e612345a58b6dbc00b0e3a2@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2025-01-13 15:26:45 +01:00
Miri Korenblit
dfd5b5b5b7 wifi: mac80211: clarify key idx documententaion
ieee80211_key_conf::keyidx is in range 0-7, and not 0-3. Make this clear
in the documentation.

Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com>
Reviewed-by: Johannes Berg <johannes.berg@intel.com>
Link: https://patch.msgid.link/20250101070249.4e414710fba7.Ib739c40dd5aa6ed148c3151220eb38d8a9e238de@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2025-01-13 15:26:43 +01:00
Johannes Berg
da7f40c05c wifi: mac80211: add some support for RX OMI power saving
In order to save power, it can be desirable to change the
RX operating mode using OMI to reduce the bandwidth. As the
handshake must be done in the HTC+ field, it cannot be done
by mac80211 directly, so expose functions to the driver to
request and finalize the necessary updates.

Note that RX OMI really only changes what the peer (AP) will
transmit to us, but in order to use it to actually save some
power (by reducing the listen bandwidth) we also update rate
scaling and then the channel context's mindef accordingly.

The updates are split into two in order to sequence them
correctly: when reducing bandwidth, first reduce the rate
scaling and thus TX, then send OMI, then reduce the listen
bandwidth (chandef); when increasing bandwidth, this is the
other way around. This also requires tracking in different
variables which part is applicable already.
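
The ordering described above can be sketched as a small Python model. This is an illustration only, not mac80211 code; the function name `omi_update_steps` and the step labels are hypothetical.

```python
# Step labels are hypothetical; they only mirror the wording of the commit text.
RATE_SCALING = "update rate scaling (TX)"
SEND_OMI = "send OMI"
LISTEN_BW = "update listen bandwidth (chandef)"


def omi_update_steps(old_bw_mhz, new_bw_mhz):
    """Return the ordered steps for an RX bandwidth change.

    Reducing: rate scaling first, then OMI, then the listen bandwidth.
    Increasing: the same steps in the opposite order.
    """
    reduce_order = [RATE_SCALING, SEND_OMI, LISTEN_BW]
    if new_bw_mhz < old_bw_mhz:
        return reduce_order
    if new_bw_mhz > old_bw_mhz:
        return list(reversed(reduce_order))
    return []  # no bandwidth change, nothing to sequence


print(omi_update_steps(160, 80))
print(omi_update_steps(80, 160))
```

In both directions the OMI exchange sits in the middle, so the peer is never asked to transmit at a width that the local side is not yet, or no longer, listening on.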

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com>
Link: https://patch.msgid.link/20250101070249.2c1a1934bd73.I4e90fd503504e37f9eac5bdae62e3f07e7071275@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2025-01-13 15:26:43 +01:00
Miri Korenblit
687a7c8a72 wifi: mac80211: change disassoc sequence a bit
Currently, the sequence goes like this (among others):
1. flush all stations (including the AP ones) -> this will tell the
   drivers to remove the stations
2. notify the driver the vif is not associated.

Which means that in between 1 and 2, the state is that the vif is
associated, but there is no AP station, which makes no sense, and may be
problematic for some drivers (for example iwlwifi).

Change the sequence to:
1. flush the TDLS stations
2. move the AP station to IEEE80211_STA_NONE
3. notify the driver about the vif being unassociated
4. flush the AP station

In order to not break other drivers, add a vif flag to indicate whether
the driver wants the new sequence or not. If the flag is not set, then
things will be done in the old sequence.
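
The difference between the two sequences can be modelled in a few lines of Python. This is a sketch under stated assumptions, not the mac80211 implementation: the step tuples, the `wants_new_sequence` parameter, and the helper names are hypothetical stand-ins for the vif flag and the real calls.

```python
# Each step is (description, effect). Effects are a modelling device only.
OLD_SEQUENCE = [
    ("flush all stations (including the AP ones)", "remove_ap"),
    ("notify the driver the vif is not associated", "notify"),
]
NEW_SEQUENCE = [
    ("flush the TDLS stations", None),
    ("move the AP station to IEEE80211_STA_NONE", None),
    ("notify the driver about the vif being unassociated", "notify"),
    ("flush the AP station", "remove_ap"),
]


def disassoc_sequence(wants_new_sequence):
    """Pick the sequence based on a (hypothetical) per-vif opt-in flag."""
    return NEW_SEQUENCE if wants_new_sequence else OLD_SEQUENCE


def has_inconsistent_window(sequence):
    """True if the vif is ever 'associated' while the AP station is gone."""
    associated, ap_present = True, True
    for _description, effect in sequence:
        if effect == "remove_ap":
            ap_present = False
        elif effect == "notify":
            associated = False
        if associated and not ap_present:
            return True
    return False


print(has_inconsistent_window(disassoc_sequence(False)))
print(has_inconsistent_window(disassoc_sequence(True)))
```

The old order exposes the "associated but no AP station" state between its two steps; the new order notifies the driver before the AP station is flushed, so that state never occurs.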

Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com>
Reviewed-by: Johannes Berg <johannes.berg@intel.com>
Link: https://patch.msgid.link/20241224192322.996ad1be6cb3.I7815d33415aa1d65c0120b54be7a15a45388f807@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2025-01-13 13:53:04 +01:00
Stanislav Fomichev
5ef44b3cb4 xsk: Bring back busy polling support
Commit 86e25f40aa ("net: napi: Add napi_config") moved napi->napi_id
assignment to a later point in time (napi_hash_add_with_id). This breaks
__xdp_rxq_info_reg which copies napi_id at an earlier time and now
stores 0 napi_id. It also makes sk_mark_napi_id_once_xdp and
__sk_mark_napi_id_once useless because they now work against 0 napi_id.
Since sk_busy_loop requires valid napi_id to busy-poll on, there is no way
to busy-poll AF_XDP sockets anymore.

Bring back the ability to busy-poll on XSK by resolving socket's napi_id
at bind time. This relies on the relatively recent netif_queue_set_napi,
but at this point most popular drivers have presumably been converted.
This also removes per-tx/rx cycles which used to check and/or set
the napi_id value.
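
The timing problem can be illustrated with a toy Python model. It is not kernel code: the classes `Napi` and `RxQueue`, the helper `napi_id_at_bind`, and the id value are hypothetical, and only the order of events follows the description above.

```python
class Napi:
    """Toy NAPI instance: the id stays 0 until it is hashed."""

    def __init__(self):
        self.napi_id = 0  # 0 stands for "no valid id yet"

    def hash_add(self, napi_id):
        self.napi_id = napi_id  # the id is assigned late


class RxQueue:
    """Toy RX queue that copies the id once, at registration time."""

    def __init__(self, napi):
        self.napi = napi
        self.cached_napi_id = napi.napi_id  # early copy, taken too soon


def napi_id_at_bind(queue):
    """Resolve the id when the socket binds, after the NAPI has been hashed."""
    return queue.napi.napi_id


napi = Napi()
queue = RxQueue(napi)   # registration runs first and caches 0
napi.hash_add(8193)     # arbitrary example id, assigned afterwards

print(queue.cached_napi_id, napi_id_at_bind(queue))  # prints "0 8193"
```

The early copy is stuck at 0, which is why busy polling had nothing valid to poll on; looking the id up at bind time sees the value assigned later.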

Confirmed by running a busy-polling AF_XDP socket
(github.com/fomichev/xskrtt) on mlx5 and looking at BusyPollRxPackets
from /proc/net/netstat.

Fixes: 86e25f40aa ("net: napi: Add napi_config")
Signed-off-by: Stanislav Fomichev <sdf@fomichev.me>
Acked-by: Magnus Karlsson <magnus.karlsson@intel.com>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Link: https://patch.msgid.link/20250109003436.2829560-1-sdf@fomichev.me
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-10 18:07:56 -08:00
Linus Torvalds
7110f24f9e vfs-6.13-rc7.fixes.2
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCZ4EhtAAKCRCRxhvAZXjc
 orToAQCIKKS7fk9j8CUSAdRG5mMy7Q++8OEVA+gyyMWuXnBPYwD/ehy+1xBVjCcI
 FBzLadaJSuygjZVCzhVXsE0oRf4A2wg=
 =waDA
 -----END PGP SIGNATURE-----

Merge tag 'vfs-6.13-rc7.fixes.2' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs

Pull vfs fixes from Christian Brauner:
 "afs:

   - Fix the maximum cell name length

   - Fix merge preference rule failure condition

  fuse:

   - Fix fuse_get_user_pages() so it doesn't risk misleading the caller
     to think pages have been allocated when they actually haven't

   - Fix direct-io folio offset and length calculation

  netfs:

   - Fix async direct-io handling

   - Fix read-retry for filesystems that don't provide a
     ->prepare_read() method

  vfs:

   - Prevent truncating 64-bit offsets to 32 bits in iomap

   - Fix memory barrier interactions when polling

   - Remove MNT_ONRB to fix concurrent modification of @mnt->mnt_flags
     leading to MNT_ONRB not being raised and invalid access to a list
     member"

* tag 'vfs-6.13-rc7.fixes.2' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
  poll: kill poll_does_not_wait()
  sock_poll_wait: kill the no longer necessary barrier after poll_wait()
  io_uring_poll: kill the no longer necessary barrier after poll_wait()
  poll_wait: kill the obsolete wait_address check
  poll_wait: add mb() to fix theoretical race between waitqueue_active() and .poll()
  afs: Fix merge preference rule failure condition
  netfs: Fix read-retry for fs with no ->prepare_read()
  netfs: Fix kernel async DIO
  fs: kill MNT_ONRB
  iomap: avoid avoid truncating 64-bit offset to 32 bits
  afs: Fix the maximum cell name length
  fuse: Set *nbytesp=0 in fuse_get_user_pages on allocation failure
  fuse: fix direct io folio offset and length calculation
2025-01-10 09:11:11 -08:00
Christian Brauner
1623bc27a8
Merge branch 'vfs-6.14.poll' into vfs.fixes
Bring in the fixes for __pollwait() and waitqueue_active() interactions.

Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-01-10 12:01:21 +01:00
Oleg Nesterov
b2849867b3
sock_poll_wait: kill the no longer necessary barrier after poll_wait()
Now that poll_wait() provides a full barrier we can remove smp_mb() from
sock_poll_wait().

Also, the poll_does_not_wait() check before poll_wait() just adds
unnecessary confusion; kill it. poll_wait() does the same "p && p->_qproc"
check.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Link: https://lore.kernel.org/r/20250107162736.GA18944@redhat.com
Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-01-10 11:59:00 +01:00
David S. Miller
7b24f164cf ipsec-next-2025-01-09
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEH7ZpcWbFyOOp6OJbrB3Eaf9PW7cFAmd/mNkACgkQrB3Eaf9P
 W7cQww/9Hnv7+wosuBxFW2o2ptgZThiwj/Ovz0oiWGWvctpN7VlpwmrDOsXQ+XMn
 xF6JBFEjnJanYoDBb78D0dMffJcMcUKZopTUU/ZMitSNr8aIHYiuB4SWPG1tqxl4
 Ete7Mr2m3tS96YePQNnAaRZzEuGsx3BQb28VLTWl9So81MByD2OK4fsAbYz22Gg8
 7A6tDHn1mUd9b2VG+LeeBZaDDFG8C0O2x4E/8Z3DX3z1N8y3LABPwZ38jcgTviKO
 1ZldGrJT+PownBydu23bWDassKE2TuVvGH9e/SOPeQj8DJ4Lmd0bafMZTY6xwfNT
 RJCwhlzZUpYRXFzvcf3+U3egsqEWEemV7/LzAapdT0V9OqfLWMUh3b1jMA4KcblZ
 qmlm/MhZyXutDoeuASwtM4jgM3wGwovOofrKKsb13hD9VLBs5jFZmFSw5TlbmwE3
 sjv7V4pFwNyfJnwQtyMmfuuHiy8w+fzqAA2GCg8mF3OosHABH/FOvEBP8xg1Vqu1
 iKlLkByfyaCFD+GxTPzqSDvSB8nDzeZBgBM/ILGuwH0OWr+0gxBzl1sxrhAkw9hC
 Gf+4J3wg7EknTxfrJk4LyqfyS50GvUIzpLSkHYxDtQRd4zv75bxRzMvoZvuapTtN
 GGpWYftKO8n2kgLORQ0dTtFee3c4w/No+KXWsFdtRVNpyk3N0Yw=
 =U79q
 -----END PGP SIGNATURE-----

Merge tag 'ipsec-next-2025-01-09' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec-next

Steffen Klassert says:

====================
ipsec-next-2025-01-09

1) Implement the AGGFRAG protocol and basic IP-TFS (RFC9347) functionality.
   From Christian Hopps.

2) Support ESN context update to hardware for TX.
   From Jianbo Liu.

Please pull or let me know if there are problems.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2025-01-10 09:15:17 +00:00
Jakub Kicinski
14ea4cd1b1 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Cross-merge networking fixes after downstream PR (net-6.13-rc7).

Conflicts:
  a42d71e322 ("net_sched: sch_cake: Add drop reasons")
  737d4d91d3 ("sched: sch_cake: add bounds checks to host bulk flow fairness counts")

Adjacent changes:

drivers/net/ethernet/meta/fbnic/fbnic.h
  3a856ab347 ("eth: fbnic: add IRQ reuse support")
  95978931d5 ("eth: fbnic: Revert "eth: fbnic: Add hardware monitoring support via HWMON interface"")

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-09 16:11:47 -08:00
Florian Westphal
601731fc7c netfilter: conntrack: add conntrack event timestamp
Nadia Pinaeva writes:
  I am working on a tool that allows collecting network performance
  metrics by using conntrack events.
  Start time of a conntrack entry is used to evaluate seen_reply
  latency, therefore the sooner it is timestamped, the better the
  precision is.
  In particular, when using this tool to compare the performance of the
  same feature implemented using iptables/nftables/OVS it is crucial
  to have the entry timestamped earlier to see any difference.

At this time, conntrack events can only get timestamped at recv time in
userspace, so there can be some delay between the event being generated
and the userspace process consuming the message.

There is sys/net/netfilter/nf_conntrack_timestamp, which adds a
64bit timestamp (ns resolution) that records start and stop times,
but it's not suited for this either: start time is the 'hashtable insertion
time', not 'conntrack allocation time'.

There is concern that moving the start-time moment to conntrack
allocation will add overhead in case of flooding, where conntrack
entries are allocated and released right away without getting inserted
into the hashtable.

Also, even if this was changed it would not work with events other than
new (start time) and destroy (stop time).

Pablo suggested adding a new CTA_TIMESTAMP_EVENT; this patch adds that feature.
The timestamp is recorded in case both events are requested and the
sys/net/netfilter/nf_conntrack_timestamp toggle is enabled.

Reported-by: Nadia Pinaeva <n.m.pinaeva@gmail.com>
Suggested-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2025-01-09 14:42:16 +01:00
Yuyang Huang
33d97a07b3 netlink: add IPv6 anycast join/leave notifications
This change introduces a mechanism for notifying userspace
applications about changes to IPv6 anycast addresses via netlink. It
includes:

* Addition and deletion of IPv6 anycast addresses are reported using
  RTM_NEWANYCAST and RTM_DELANYCAST.
* A new netlink group (RTNLGRP_IPV6_ACADDR) for subscribing to these
  notifications.

This enables user space applications (e.g. ip monitor) to efficiently
track anycast addresses through netlink messages, improving metrics
collection and system monitoring. It also unlocks the potential for
advanced anycast management in user space, such as hardware offload
control and fine grained network control.
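
A user space listener for such notifications could look roughly like the Python sketch below. It is a minimal illustration, not taken from the patch: the helper names are hypothetical, and the numeric values of RTNLGRP_IPV6_ACADDR, RTM_NEWANYCAST and RTM_DELANYCAST are deliberately not hard-coded here and should be taken from the kernel's linux/rtnetlink.h.

```python
import socket
import struct

# struct nlmsghdr: u32 len, u16 type, u16 flags, u32 seq, u32 pid (16 bytes).
NLMSG_HDR = struct.Struct("=IHHII")
SOL_NETLINK = 270            # socket level for netlink options on Linux
NETLINK_ADD_MEMBERSHIP = 1   # join a netlink multicast group by number


def open_rtnl_listener(group):
    """Open an rtnetlink socket and join one multicast group (Linux only).

    `group` is the group number, e.g. the value of RTNLGRP_IPV6_ACADDR
    from linux/rtnetlink.h (not hard-coded in this sketch).
    """
    sock = socket.socket(socket.AF_NETLINK, socket.SOCK_RAW, socket.NETLINK_ROUTE)
    sock.bind((0, 0))  # let the kernel pick the port id; no legacy group mask
    sock.setsockopt(SOL_NETLINK, NETLINK_ADD_MEMBERSHIP, group)
    return sock


def iter_nlmsgs(buf):
    """Yield (message type, payload bytes) for each message in a buffer."""
    offset = 0
    while offset + NLMSG_HDR.size <= len(buf):
        length, msg_type, _flags, _seq, _pid = NLMSG_HDR.unpack_from(buf, offset)
        if length < NLMSG_HDR.size or offset + length > len(buf):
            break  # truncated or malformed message
        yield msg_type, buf[offset + NLMSG_HDR.size:offset + length]
        offset += (length + 3) & ~3  # messages are 4-byte aligned
```

A caller would pass the group number to `open_rtnl_listener()`, read datagrams with `sock.recv()`, and compare each message type from `iter_nlmsgs()` against the RTM_NEWANYCAST and RTM_DELANYCAST values from the header.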

Cc: Maciej Żenczykowski <maze@google.com>
Cc: Lorenzo Colitti <lorenzo@google.com>
Signed-off-by: Yuyang Huang <yuyanghuang@google.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://patch.msgid.link/20250107114355.1766086-1-yuyanghuang@google.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-01-09 12:54:45 +01:00
Russell King (Oracle)
2fa8b4383d net: dsa: remove get_mac_eee() method
The get_mac_eee() is no longer called by the core DSA code, nor are
there any implementations of this method. Remove it.

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://patch.msgid.link/E1tUllU-007UzL-KV@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-07 18:06:18 -08:00