linux-yocto/net
Florian Westphal d74b49bb6b netfilter: nf_tables: restart set lookup on base_seq change
[ Upstream commit b2f742c846cab9afc5953a5d8f17b54922dcc723 ]

The hash, hash_fast, rhash and bitwise sets may indicate no result even
though a matching element exists during a short time window while another
cpu is finalizing the transaction.

This happens when the hash lookup/bitwise lookup function has picked up
the old genbit, right before it was toggled by nf_tables_commit(), but
then the committing cpu managed to unlink the matching old element from the
hash table:

cpu0					cpu1
  has added new elements to clone
  has marked elements as being
  inactive in new generation
					perform lookup in the set
  enters commit phase:
					A) observes old genbit
   increments base_seq
I) increments the genbit
II) removes old element from the set
					B) finds matching element
					C) returns no match: found
					element is not valid in old
					generation

					Next lookup observes new genbit and
					finds matching e2.

Consider a packet P matching elements e1 and e2.

cpu0 processes the following transaction:
1. remove e1
2. add e2, which has the same key as e1.

P matches both e1 and e2.  Therefore, cpu1 should always find a match
for P. Due to the above race, this is not the case:

cpu1 observed the old genbit.  e2 will not be considered once it is found.
The element e1 is not found anymore if cpu0 managed to unlink it from the
hlist before cpu1 found it during list traversal.

The situation only occurs for a brief time period: lookups happening
after I) observe the new genbit and return e2.
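
The interleaving above can be replayed in a single thread. The following is
a minimal user-space sketch, not kernel code: struct elem, set_lookup() and
replay_race() are hypothetical names, the set is a plain array instead of a
hash table, and the generation handling is reduced to a two-bit mask.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified element; not the kernel's set element layout. */
struct elem {
	int key;
	unsigned int inactive_mask;	/* bit g set: not valid in generation g */
	bool linked;			/* false once unlinked from the set */
};

/* Walk the set with a generation bit that the caller sampled earlier. */
static const struct elem *set_lookup(const struct elem *set, size_t n,
				     int key, unsigned int genbit)
{
	assert(genbit < 2);	/* only two generations: current and next */

	for (size_t i = 0; i < n; i++) {
		if (!set[i].linked || set[i].key != key)
			continue;
		if (set[i].inactive_mask & (1u << genbit))
			continue;	/* found, but not valid in this generation */
		return &set[i];
	}
	return NULL;
}

/*
 * Replay the interleaving in a single thread.  Returns true if the lookup
 * that sampled the old genbit misses while a later lookup finds e2.
 */
static bool replay_race(void)
{
	struct elem set[2] = {
		{ .key = 42, .inactive_mask = 1u << 1, .linked = true },	/* e1 */
		{ .key = 42, .inactive_mask = 1u << 0, .linked = true },	/* e2 */
	};
	unsigned int genbit = 0;
	unsigned int sampled;
	bool missed, later_hit;

	sampled = genbit;		/* A)  cpu1 observes the old genbit */
	genbit ^= 1;			/* I)  cpu0 toggles the genbit */
	set[0].linked = false;		/* II) cpu0 unlinks e1 */

	/* B), C) cpu1 walks the set: e1 is gone, e2 is invalid in the old generation */
	missed = set_lookup(set, 2, 42, sampled) == NULL;

	/* The next lookup samples the new genbit and finds e2. */
	later_hit = set_lookup(set, 2, 42, genbit) == &set[1];

	return missed && later_hit;
}
```

replay_race() returns true in this model: the lookup with the stale genbit
misses although a valid element for the key exists at every point in time.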

This problem exists in all set types except nft_set_pipapo, so fix it once
in nft_lookup rather than in each set's ops individually.

Sample the base sequence counter, which gets incremented right before the
genbit is changed.

Then, if no match is found, retry the lookup if the base sequence was
altered in between.

If the base sequence hasn't changed:
 - No update took place: no-match result is expected.
   This is the common case, or:
 - nf_tables_commit() hasn't progressed to genbit update yet.
   Old elements were still visible and no-match result is expected, or:
 - nf_tables_commit() updated the genbit:
   We picked up the new base_seq, so the lookup function also picked
   up the new genbit, no-match result is expected.

If the old genbit was observed, then nft_lookup also picked up the old
base_seq: nft_lookup_should_retry() returns true and relookup is performed
in the new generation.
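
The retry scheme can be sketched in user space as follows. This is an
illustration under stated assumptions, not the patch itself: struct table,
commit(), lookup_once(), lookup_retry() and the race_hook callback are
hypothetical names, C11 atomics with default ordering stand in for the
kernel's primitives, and the hook only exists to force the commit to land
between sampling the genbit and walking the set.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified model; not the kernel data structures. */
struct elem {
	int key;
	unsigned int inactive_mask;	/* bit g set: not valid in generation g */
	bool linked;			/* false once unlinked from the set */
};

struct table {
	atomic_uint base_seq;		/* bumped right before the genbit toggles */
	atomic_uint genbit;
	struct elem elems[2];
};

/* Hook that simulates a commit racing with one lookup attempt. */
typedef void (*race_hook)(struct table *t);

static void table_init(struct table *t)
{
	atomic_init(&t->base_seq, 0);
	atomic_init(&t->genbit, 0);
	/* e1 is valid in the old generation only, e2 in the new one only */
	t->elems[0] = (struct elem){ .key = 42, .inactive_mask = 1u << 1, .linked = true };
	t->elems[1] = (struct elem){ .key = 42, .inactive_mask = 1u << 0, .linked = true };
}

static void commit(struct table *t)
{
	atomic_fetch_add(&t->base_seq, 1);	/* increments base_seq */
	atomic_fetch_xor(&t->genbit, 1);	/* I)  toggles the genbit */
	t->elems[0].linked = false;		/* II) unlinks e1 */
}

static const struct elem *lookup_once(struct table *t, int key, race_hook hook)
{
	unsigned int genbit = atomic_load(&t->genbit);

	assert(genbit < 2);
	if (hook)
		hook(t);	/* the commit lands after the genbit was sampled */

	for (size_t i = 0; i < 2; i++) {
		const struct elem *e = &t->elems[i];

		if (e->linked && e->key == key &&
		    !(e->inactive_mask & (1u << genbit)))
			return e;
	}
	return NULL;
}

/* Retry a no-match result if base_seq moved while the lookup was running. */
static const struct elem *lookup_retry(struct table *t, int key, race_hook hook)
{
	const struct elem *e;
	unsigned int seq;

	do {
		seq = atomic_load(&t->base_seq);
		e = lookup_once(t, key, hook);
		hook = NULL;	/* the simulated commit happens only once */
	} while (!e && atomic_load(&t->base_seq) != seq);

	return e;
}

static bool plain_lookup_misses(void)
{
	struct table t;

	table_init(&t);
	return lookup_once(&t, 42, commit) == NULL;
}

static bool retry_lookup_hits(void)
{
	struct table t;

	table_init(&t);
	return lookup_retry(&t, 42, commit) == &t.elems[1];
}
```

In this model a single lookup attempt misses, while the retrying variant
notices the changed base_seq, repeats the lookup with the new genbit and
returns e2. A match is never retried, so the common path costs two extra
loads of the sequence counter.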

This problem was introduced when the unconditional synchronize_rcu() call
that followed the current/next generation bit toggle was removed.

Thanks to Pablo Neira Ayuso for reviewing an earlier version of this
patchset, for suggesting re-use of existing base_seq and placement of
the restart loop in nft_set_do_lookup().

Fixes: 0cbc06b3fa ("netfilter: nf_tables: remove synchronize_rcu in commit phase")
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-09-19 16:35:50 +02:00
..
6lowpan
9p 9p/trans_fd: mark concurrent read and writes to p9_conn->err 2025-05-02 07:59:20 +02:00
802
8021q net: vlan: fix VLAN 0 refcount imbalance of toggling filtering during runtime 2025-07-24 08:56:34 +02:00
appletalk net: appletalk: Fix use-after-free in AARP proxy probe 2025-08-01 09:48:41 +01:00
atm net: atm: fix memory leak in atm_register_sysfs when device_register fail 2025-09-09 18:58:13 +02:00
ax25 ax25: properly unshare skbs in ax25_kiss_rcv() 2025-09-09 18:58:13 +02:00
batman-adv batman-adv: fix OOB read/write in network-coding decode 2025-09-09 18:58:18 +02:00
bluetooth Bluetooth: Fix use-after-free in l2cap_sock_cleanup_listen() 2025-09-09 18:58:07 +02:00
bpf
bridge net: bridge: Bounce invalid boolopts 2025-09-19 16:35:48 +02:00
caif caif: reduce stack size, again 2025-08-15 12:13:40 +02:00
can can: j1939: j1939_local_ecu_get(): undo increment when j1939_local_ecu_get() fails 2025-09-19 16:35:49 +02:00
ceph libceph: fix invalid accesses to ceph_connection_v1_info 2025-09-19 16:35:47 +02:00
core net_sched: gen_estimator: fix est_timer() vs CONFIG_PREEMPT_RT=y 2025-09-09 18:58:07 +02:00
dcb
dccp
devlink devlink: fix xa_alloc_cyclic() error handling 2025-03-28 22:03:27 +01:00
dns_resolver
dsa net: dsa: provide implementation of .support_eee() 2025-09-09 18:58:19 +02:00
ethernet
ethtool ethtool: cmis_cdb: use correct rpl size in ethtool_cmis_module_poll() 2025-04-25 10:47:43 +02:00
handshake
hsr net, hsr: reject HSR frame if skb can't hold tag 2025-08-28 16:31:02 +02:00
ieee802154
ife
ipv4 tunnels: reset the GSO metadata before reusing the skb 2025-09-19 16:35:48 +02:00
ipv6 net/tcp: Fix socket memory leak in TCP-AO failure handling for IPv6 2025-09-09 18:58:10 +02:00
iucv
kcm net: kcm: Fix race condition in kcm_unattach() 2025-08-20 18:30:18 +02:00
key
l2tp l2tp: do not use sock_hold() in pppol2tp_session_get_sock() 2025-09-04 15:31:51 +02:00
l3mdev
lapb
llc llc: fix data loss when reading from a socket in llc_ui_recvmsg() 2025-05-29 11:03:20 +02:00
mac80211 wifi: mac80211: check basic rates validity in sta_link_apply_parameters 2025-08-20 18:30:56 +02:00
mac802154
mctp mctp: return -ENOPROTOOPT for unknown getsockopt options 2025-09-09 18:58:13 +02:00
mpls mpls: Use rcu_dereference_rtnl() in mpls_route_input_rcu(). 2025-06-27 11:11:43 +01:00
mptcp mptcp: sockopt: make sync_socket_options propagate SOCK_KEEPOPEN 2025-09-19 16:35:45 +02:00
ncsi net: ncsi: Fix buffer overflow in fetching version id 2025-08-20 18:30:38 +02:00
netfilter netfilter: nf_tables: restart set lookup on base_seq change 2025-09-19 16:35:50 +02:00
netlabel calipso: unlock rcu before returning -EAFNOSUPPORT 2025-06-19 15:32:37 +02:00
netlink genetlink: fix genl_bind() invoking bind() after -EPERM 2025-09-19 16:35:48 +02:00
netrom
nfc NFC: nci: uart: Set tty->disc_data only in success path 2025-06-27 11:11:21 +01:00
nsh
openvswitch net: openvswitch: Fix the dead loop of MPLS parse 2025-06-19 15:31:55 +02:00
packet net/packet: fix a race in packet_set_ring() and packet_notifier() 2025-08-15 12:14:09 +02:00
phonet phonet/pep: Move call to pn_skb_get_dst_sockaddr() earlier in pep_sock_accept() 2025-07-24 08:56:24 +02:00
psample
qrtr
rds net: better track kernel sockets lifetime 2025-08-20 18:30:56 +02:00
rfkill
rose net: rose: fix a typo in rose_clear_routes() 2025-09-04 15:31:55 +02:00
rxrpc rxrpc: Fix transmission of an abort in response to an abort 2025-07-24 08:56:35 +02:00
sched net/sched: Remove unnecessary WARNING condition for empty child qdisc in htb_activate 2025-08-28 16:31:15 +02:00
sctp sctp: initialize more fields in sctp_v6_from_sk() 2025-09-04 15:31:51 +02:00
smc net/smc: Remove validation of reserved bits in CLC Decline message 2025-09-09 18:58:13 +02:00
strparser
sunrpc Revert "SUNRPC: Don't allow waiting for exiting tasks" 2025-09-19 16:35:45 +02:00
switchdev net: switchdev: Convert blocking notification chain to a raw one 2025-03-22 12:54:12 -07:00
tipc tipc: Fix use-after-free in tipc_conn_close(). 2025-07-17 18:37:05 +02:00
tls tls: fix handling of zero-length records on the rx_list 2025-08-28 16:31:11 +02:00
unix af_unix: Don't set -ECONNRESET for consumed OOB skb. 2025-07-06 11:01:40 +02:00
vmw_vsock vsock/virtio: Validate length in packet header before skb_put() 2025-08-28 16:30:59 +02:00
wireless wifi: cfg80211: sme: cap SSID length in __cfg80211_connect_result() 2025-09-09 18:58:12 +02:00
x25
xdp xsk: Fix race condition in AF_XDP generic RX path 2025-05-09 09:50:38 +02:00
xfrm xfrm: Duplicate SPI Handling 2025-08-20 18:30:34 +02:00
compat.c
devres.c
Kconfig
Kconfig.debug
Makefile
socket.c
sysctl_net.c