linux-yocto/net/netfilter
Florian Westphal 09d6074995 netfilter: nf_tables: avoid chain re-validation if possible
[ Upstream commit 8e1a1bc4f5a42747c08130b8242ebebd1210b32f ]

Hamza Mahfooz reports cpu soft lock-ups in
nft_chain_validate():

 watchdog: BUG: soft lockup - CPU#1 stuck for 27s! [iptables-nft-re:37547]
[..]
 RIP: 0010:nft_chain_validate+0xcb/0x110 [nf_tables]
[..]
  nft_immediate_validate+0x36/0x50 [nf_tables]
  nft_chain_validate+0xc9/0x110 [nf_tables]
  nft_immediate_validate+0x36/0x50 [nf_tables]
  nft_chain_validate+0xc9/0x110 [nf_tables]
  nft_immediate_validate+0x36/0x50 [nf_tables]
  nft_chain_validate+0xc9/0x110 [nf_tables]
  nft_immediate_validate+0x36/0x50 [nf_tables]
  nft_chain_validate+0xc9/0x110 [nf_tables]
  nft_immediate_validate+0x36/0x50 [nf_tables]
  nft_chain_validate+0xc9/0x110 [nf_tables]
  nft_immediate_validate+0x36/0x50 [nf_tables]
  nft_chain_validate+0xc9/0x110 [nf_tables]
  nft_table_validate+0x6b/0xb0 [nf_tables]
  nf_tables_validate+0x8b/0xa0 [nf_tables]
  nf_tables_commit+0x1df/0x1eb0 [nf_tables]
[..]

Currently nf_tables will traverse the entire table (chain graph), starting
from the entry points (base chains), exploring all possible paths
(chain jumps).  But there are cases where we could avoid revalidation.

Consider:
1  input -> j2 -> j3
2  input -> j2 -> j3
3  input -> j1 -> j2 -> j3

Then the second rule does not need to revalidate j2, and, by extension j3,
because this was already checked during validation of the first rule.
We need to validate it only for rule 3.

This is needed because chain loop detection also ensures we do not exceed
the jump stack: Just because we know that j2 is cycle free, its last jump
might now exceed the allowed stack size.  We also need to update all
reachable chains with the new largest observed call depth.

Care has to be taken to revalidate even if the chain depth won't be an
issue: chain validation also ensures that expressions are not called from
invalid base chains.  For example, the masquerade expression can only be
called from NAT postrouting base chains.

Therefore we also need to keep record of the base chain context (type,
hooknum) and revalidate if the chain becomes reachable from a different
hook location.

Reported-by: Hamza Mahfooz <hamzamahfooz@linux.microsoft.com>
Closes: https://lore.kernel.org/netfilter-devel/20251118221735.GA5477@linuxonhyperv3.guj3yctzbm1etfxqx2vob5hsef.xx.internal.cloudapp.net/
Tested-by: Hamza Mahfooz <hamzamahfooz@linux.microsoft.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2026-01-17 16:35:32 +01:00
..
ipset netfilter: ipset: Remove unused htable_bits in macro ahash_region 2025-09-11 15:40:55 +02:00
ipvs ipvs: fix ipv4 null-ptr-deref in route error path 2026-01-02 12:56:47 +01:00
core.c netfilter: nf_dup{4, 6}: Move duplication check to task_struct 2025-05-23 13:57:12 +02:00
Kconfig netfilter: Exclude LEGACY TABLES on PREEMPT_RT. 2025-07-25 18:38:50 +02:00
Makefile netfilter: conntrack: remove DCCP protocol support 2025-07-03 13:51:39 +02:00
nf_bpf_link.c bpf: Check netfilter ctx accesses are aligned 2025-08-01 09:22:44 -07:00
nf_conncount.c netfilter: nf_conncount: update last_gc only when GC has been performed 2026-01-17 16:35:22 +01:00
nf_conntrack_acct.c
nf_conntrack_amanda.c netfilter: conntrack: remove skb argument from nf_ct_refresh 2025-01-19 16:41:55 +01:00
nf_conntrack_bpf.c net: netfilter: Make ct zone opts configurable for bpf ct helpers 2024-05-22 15:00:56 -07:00
nf_conntrack_broadcast.c netfilter: conntrack: remove skb argument from nf_ct_refresh 2025-01-19 16:41:55 +01:00
nf_conntrack_core.c netfilter: conntrack: Remove unused net in nf_conntrack_double_lock() 2025-07-25 18:38:41 +02:00
nf_conntrack_ecache.c net: replace use of system_wq with system_percpu_wq 2025-09-22 17:40:30 -07:00
nf_conntrack_expect.c treewide, timers: Rename from_timer() to timer_container_of() 2025-06-08 09:07:37 +02:00
nf_conntrack_extend.c netfilter: conntrack: fix extension size table 2023-09-13 21:57:50 +02:00
nf_conntrack_ftp.c
nf_conntrack_h323_asn1.c netfilter: nf_conntrack_h323: Add protection for bmp length out of range 2024-03-07 03:10:35 +01:00
nf_conntrack_h323_main.c netfilter: conntrack: remove skb argument from nf_ct_refresh 2025-01-19 16:41:55 +01:00
nf_conntrack_h323_types.c
nf_conntrack_helper.c netfilter: conntrack: helper: Replace -EEXIST by -EBUSY 2025-08-27 11:53:38 +02:00
nf_conntrack_irc.c
nf_conntrack_labels.c netfilter: conntrack: switch connlabels to atomic_t 2023-10-24 13:16:30 +02:00
nf_conntrack_netbios_ns.c
nf_conntrack_netlink.c netfilter: ctnetlink: remove refcounting in dying list dumping 2025-08-20 13:52:36 +02:00
nf_conntrack_ovs.c netfilter: use nf_ip6_check_hbh_len in nf_ct_skb_network_trim 2023-03-08 14:25:41 +01:00
nf_conntrack_pptp.c
nf_conntrack_proto_generic.c
nf_conntrack_proto_gre.c netfilter: conntrack: gre: don't set assured flag for clash entries 2023-07-05 14:42:15 +02:00
nf_conntrack_proto_icmp.c
nf_conntrack_proto_icmpv6.c netfilter: conntrack: fix ct-state for ICMPv6 Multicast Router Discovery 2024-05-06 11:13:56 +02:00
nf_conntrack_proto_sctp.c netfilter: conntrack: cleanup timeout definitions 2025-01-12 20:21:01 -08:00
nf_conntrack_proto_tcp.c move asm/unaligned.h to linux/unaligned.h 2024-10-02 17:23:23 -04:00
nf_conntrack_proto_udp.c netfilter: conntrack: udp: fix seen-reply test 2023-02-01 12:18:51 +01:00
nf_conntrack_proto.c netfilter: conntrack: remove DCCP protocol support 2025-07-03 13:51:39 +02:00
nf_conntrack_sane.c
nf_conntrack_seqadj.c
nf_conntrack_sip.c netfilter: conntrack: remove skb argument from nf_ct_refresh 2025-01-19 16:41:55 +01:00
nf_conntrack_snmp.c
nf_conntrack_standalone.c netfilter: nf_conntrack: do not skip entries in /proc/net/nf_conntrack 2025-09-24 11:50:28 +02:00
nf_conntrack_tftp.c
nf_conntrack_timeout.c
nf_conntrack_timestamp.c
nf_dup_netdev.c netfilter: nf_dup_netdev: Move the recursion counter struct netdev_xmit 2025-05-23 13:57:12 +02:00
nf_flow_table_bpf.c netfilter: Add bpf_xdp_flow_lookup kfunc 2024-07-01 17:03:01 +02:00
nf_flow_table_core.c netfilter: conntrack: fix erronous removal of offload bit 2025-04-17 11:14:22 +02:00
nf_flow_table_inet.c net: netfilter: move nf flowtable bpf initialization in nf_flow_table_module_init() 2024-09-12 15:41:03 +02:00
nf_flow_table_ip.c Revert "netfilter: flowtable: teardown flow if cached mtu is stale" 2025-02-12 10:35:20 +01:00
nf_flow_table_offload.c net: hold netdev instance lock during nft ndo_setup_tc 2025-03-06 12:59:43 -08:00
nf_flow_table_procfs.c
nf_flow_table_xdp.c netfilter: nf_tables: Add flowtable map for xdp offload 2024-07-01 17:01:53 +02:00
nf_hooks_lwtunnel.c sysctl: treewide: constify the ctl_table argument of proc_handlers 2024-07-24 20:59:29 +02:00
nf_internals.h netfilter: move the sysctl nf_hooks_lwtunnel into the netfilter core 2024-06-19 18:41:59 +02:00
nf_log_syslog.c tcp: extend TCP flags to allow AE bit/ACE field 2025-03-17 13:49:46 +00:00
nf_log.c netfilter: load nf_log_syslog on enabling nf_conntrack_log_invalid 2025-07-25 18:35:41 +02:00
nf_nat_amanda.c
nf_nat_bpf.c bpf: treewide: Annotate BPF kfuncs in BTF 2024-01-31 20:40:56 -08:00
nf_nat_core.c netfilter: nf_nat: remove bogus direction check 2026-01-02 12:56:47 +01:00
nf_nat_ftp.c
nf_nat_helper.c treewide: use get_random_u32_below() instead of deprecated function 2022-11-18 02:15:15 +01:00
nf_nat_irc.c
nf_nat_masquerade.c
nf_nat_ovs.c netfilter: nf_nat: fix action not being set for all ct states 2024-01-03 11:17:17 +01:00
nf_nat_proto.c netfilter: conntrack: remove DCCP protocol support 2025-07-03 13:51:39 +02:00
nf_nat_redirect.c netfilter: nat: fix ipv6 nat redirect with mapped and scoped addresses 2023-11-08 16:40:30 +01:00
nf_nat_sip.c
nf_nat_tftp.c
nf_queue.c netfilter: move nf_reinject into nfnetlink_queue modules 2024-02-21 12:03:22 +01:00
nf_sockopt.c
nf_synproxy_core.c move asm/unaligned.h to linux/unaligned.h 2024-10-02 17:23:23 -04:00
nf_tables_api.c netfilter: nf_tables: avoid chain re-validation if possible 2026-01-17 16:35:32 +01:00
nf_tables_core.c netfilter: nf_tables: Only use nf_skip_indirect_calls() when MITIGATION_RETPOLINE 2025-03-23 10:53:47 +01:00
nf_tables_offload.c netfilter: nf_tables: Have a list of nf_hook_ops in nft_hook 2025-05-23 13:57:13 +02:00
nf_tables_trace.c netfilter: nf_tables: hide clash bit from userspace 2025-07-14 15:22:35 +02:00
nfnetlink_acct.c
nfnetlink_cthelper.c
nfnetlink_cttimeout.c netfilter: conntrack: remove DCCP protocol support 2025-07-03 13:51:39 +02:00
nfnetlink_hook.c netfilter: nfnetlink_hook: Dump flowtable info 2025-07-25 18:40:01 +02:00
nfnetlink_log.c treewide, timers: Rename from_timer() to timer_container_of() 2025-06-08 09:07:37 +02:00
nfnetlink_osf.c netfilter: add missing module descriptions 2023-11-08 13:52:32 +01:00
nfnetlink_queue.c netfilter: nfnetlink_queue: Initialize ctx to avoid memory allocation error 2025-03-23 10:20:33 +01:00
nfnetlink.c netfilter: nfnetlink: reset nlh pointer during batch replay 2025-09-24 11:50:28 +02:00
nft_bitwise.c netfilter: bitwise: add support for doing AND, OR and XOR directly 2024-11-15 12:07:04 +01:00
nft_byteorder.c move asm/unaligned.h to linux/unaligned.h 2024-10-02 17:23:23 -04:00
nft_chain_filter.c Revert "netfilter: nf_tables: Add notifications for hook changes" 2025-07-14 15:22:47 +02:00
nft_chain_nat.c netfilter: add missing module descriptions 2023-11-08 13:52:32 +01:00
nft_chain_route.c
nft_cmp.c netfilter: nf_tables: pass context structure to nft_parse_register_load 2024-08-20 12:37:24 +02:00
nft_compat.c netfilter: nf_tables: make destruction work queue pernet 2025-03-06 13:35:54 +01:00
nft_connlimit.c netfilter: nft_connlimit: update the count if add was skipped 2025-12-18 14:03:26 +01:00
nft_counter.c netfilter: nft_counter: Use u64_stats_t for statistic. 2024-09-03 10:47:16 +02:00
nft_ct_fast.c netfilter: nf_tables: fix ct untracked match breakage 2023-05-03 13:49:08 +02:00
nft_ct.c netfilter: nft_ct: add seqadj extension for natted connections 2025-10-29 14:47:59 +01:00
nft_dup_netdev.c netfilter: nf_tables: pass context structure to nft_parse_register_load 2024-08-20 12:37:24 +02:00
nft_dynset.c netfilter: nft_set: remove indirection from update API call 2025-07-25 18:40:23 +02:00
nft_exthdr.c netfilter: conntrack: remove DCCP protocol support 2025-07-03 13:51:39 +02:00
nft_fib_inet.c
nft_fib_netdev.c
nft_fib.c netfilter: nf_tables: drop unused 3rd argument from validate callback ops 2024-09-03 10:47:17 +02:00
nft_flow_offload.c netfilter: flowtable: check for maximum number of encapsulations in bridge vlan 2025-12-18 14:03:26 +01:00
nft_fwd_netdev.c netfilter: nf_tables: drop unused 3rd argument from validate callback ops 2024-09-03 10:47:17 +02:00
nft_hash.c netfilter: nf_tables: pass context structure to nft_parse_register_load 2024-08-20 12:37:24 +02:00
nft_immediate.c netfilter: nf_tables: drop unused 3rd argument from validate callback ops 2024-09-03 10:47:17 +02:00
nft_inner.c netfilter: nft_inner: Use nested-BH locking for nft_pcpu_tun_ctx 2025-05-23 13:57:12 +02:00
nft_last.c netfilter: nf_tables: allow clone callbacks to sleep 2024-05-10 11:13:45 +02:00
nft_limit.c netfilter: nf_tables: allow clone callbacks to sleep 2024-05-10 11:13:45 +02:00
nft_log.c netfilter: nf_tables: missing objects with no memcg accounting 2024-09-26 13:03:02 +02:00
nft_lookup.c netfilter: nf_tables: restart set lookup on base_seq change 2025-09-10 20:30:37 +02:00
nft_masq.c netfilter: nf_tables: drop unused 3rd argument from validate callback ops 2024-09-03 10:47:17 +02:00
nft_meta.c netfilter: nf_tables: missing objects with no memcg accounting 2024-09-26 13:03:02 +02:00
nft_nat.c netfilter: nf_tables: drop unused 3rd argument from validate callback ops 2024-09-03 10:47:17 +02:00
nft_numgen.c netfilter: nf_tables: missing objects with no memcg accounting 2024-09-26 13:03:02 +02:00
nft_objref.c netfilter: nft_objref: validate objref and objrefmap expressions 2025-10-08 13:17:25 +02:00
nft_osf.c netfilter: nf_tables: drop unused 3rd argument from validate callback ops 2024-09-03 10:47:17 +02:00
nft_payload.c netfilter: nft_payload: extend offset to 65535 bytes 2025-09-02 15:28:18 +02:00
nft_queue.c netfilter: nf_tables: drop unused 3rd argument from validate callback ops 2024-09-03 10:47:17 +02:00
nft_quota.c netfilter: nft_quota: match correctly when the quota just depleted 2025-05-05 13:15:09 +02:00
nft_range.c netfilter: nf_tables: pass context structure to nft_parse_register_load 2024-08-20 12:37:24 +02:00
nft_redir.c netfilter: nf_tables: drop unused 3rd argument from validate callback ops 2024-09-03 10:47:17 +02:00
nft_reject_inet.c netfilter: nf_tables: drop unused 3rd argument from validate callback ops 2024-09-03 10:47:17 +02:00
nft_reject_netdev.c netfilter: nf_tables: drop unused 3rd argument from validate callback ops 2024-09-03 10:47:17 +02:00
nft_reject.c netfilter: nf_tables: drop unused 3rd argument from validate callback ops 2024-09-03 10:47:17 +02:00
nft_rt.c netfilter: nf_tables: drop unused 3rd argument from validate callback ops 2024-09-03 10:47:17 +02:00
nft_set_bitmap.c netfilter: nft_set_bitmap: fix lockdep splat due to missing annotation 2025-09-10 20:28:24 +02:00
nft_set_hash.c netfilter: nf_tables: allow iter callbacks to sleep 2025-09-02 15:28:17 +02:00
nft_set_pipapo_avx2.c netfilter: nft_set_pipapo_avx2: fix skip of expired entries 2025-09-24 11:50:28 +02:00
nft_set_pipapo_avx2.h netfilter: nft_set_pipapo: use avx2 algorithm for insertions too 2025-08-20 13:52:37 +02:00
nft_set_pipapo.c netfilter: nft_set_pipapo: fix range overlap detection 2026-01-17 16:35:21 +01:00
nft_set_pipapo.h netfilter: nft_set_pipapo: Use nested-BH locking for nft_pipapo_scratch 2025-08-20 13:52:37 +02:00
nft_set_rbtree.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2025-09-11 17:40:13 -07:00
nft_socket.c netfilter: nft_socket: remove WARN_ON_ONCE with huge level value 2025-08-07 13:19:26 +02:00
nft_synproxy.c netfilter: nft_synproxy: avoid possible data-race on update operation 2026-01-17 16:35:21 +01:00
nft_tproxy.c netfilter: nf_tables: drop unused 3rd argument from validate callback ops 2024-09-03 10:47:17 +02:00
nft_tunnel.c netfilter: nft_tunnel: fix geneve_opt dump 2025-05-23 13:57:12 +02:00
nft_xfrm.c xfrm: add generic iptfs defines and functionality 2024-12-05 10:01:28 +01:00
utils.c netfilter: move nf_reinject into nfnetlink_queue modules 2024-02-21 12:03:22 +01:00
x_tables.c netfilter: Exclude LEGACY TABLES on PREEMPT_RT. 2025-07-25 18:38:50 +02:00
xt_addrtype.c netfilter: xtables: avoid NFPROTO_UNSPEC where needed 2024-10-09 23:20:46 +02:00
xt_AUDIT.c
xt_bpf.c
xt_cgroup.c net: cgroup: Guard users of sock_cgroup_classid() 2025-04-24 16:04:02 +02:00
xt_CHECKSUM.c netfilter: xtables: avoid NFPROTO_UNSPEC where needed 2024-10-09 23:20:46 +02:00
xt_CLASSIFY.c netfilter: xtables: avoid NFPROTO_UNSPEC where needed 2024-10-09 23:20:46 +02:00
xt_cluster.c netfilter: xtables: avoid NFPROTO_UNSPEC where needed 2024-10-09 23:20:46 +02:00
xt_comment.c
xt_connbytes.c netfilter: xtables: avoid NFPROTO_UNSPEC where needed 2024-10-09 23:20:46 +02:00
xt_connlabel.c
xt_connlimit.c netfilter: nf_conncount: rework API to use sk_buff directly 2025-12-18 14:03:26 +01:00
xt_connmark.c netfilter: xtables: avoid NFPROTO_UNSPEC where needed 2024-10-09 23:20:46 +02:00
xt_CONNSECMARK.c netfilter: xtables: avoid NFPROTO_UNSPEC where needed 2024-10-09 23:20:46 +02:00
xt_conntrack.c
xt_cpu.c
xt_CT.c netfilter: xtables: avoid NFPROTO_UNSPEC where needed 2024-10-09 23:20:46 +02:00
xt_dccp.c
xt_devgroup.c
xt_dscp.c
xt_DSCP.c
xt_ecn.c
xt_esp.c
xt_hashlimit.c netfilter: xt_hashlimit: replace vmalloc calls with kvmalloc 2025-03-12 16:37:48 +01:00
xt_helper.c
xt_hl.c
xt_HL.c
xt_HMARK.c
xt_IDLETIMER.c treewide, timers: Rename from_timer() to timer_container_of() 2025-06-08 09:07:37 +02:00
xt_ipcomp.c
xt_iprange.c
xt_ipvs.c
xt_l2tp.c
xt_LED.c treewide, timers: Rename from_timer() to timer_container_of() 2025-06-08 09:07:37 +02:00
xt_length.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf 2023-02-22 21:25:23 -08:00
xt_limit.c
xt_LOG.c
xt_mac.c
xt_mark.c netfilter: xtables: support arpt_mark and ipv6 optstrip for iptables-nft only builds 2025-05-22 17:16:02 +02:00
xt_MASQUERADE.c
xt_multiport.c
xt_nat.c
xt_NETMAP.c
xt_nfacct.c netfilter: xt_nfacct: don't assume acct name is null-terminated 2025-07-25 18:40:43 +02:00
xt_NFLOG.c netfilter: xtables: fix typo causing some targets not to load on IPv6 2024-10-21 11:31:26 +02:00
xt_NFQUEUE.c
xt_osf.c netfilter: nfnetlink_osf: fix module autoload 2023-06-20 22:43:42 +02:00
xt_owner.c netfilter: xt_owner: Fix for unsafe access of sk->sk_socket 2023-12-06 17:52:15 +01:00
xt_physdev.c netfilter: propagate net to nf_bridge_get_physindev 2024-01-17 12:02:48 +01:00
xt_pkttype.c
xt_policy.c
xt_quota.c
xt_rateest.c
xt_RATEEST.c netfilter: xtables: avoid NFPROTO_UNSPEC where needed 2024-10-09 23:20:46 +02:00
xt_realm.c
xt_recent.c netfilter: xt_recent: Lift restrictions on max hitcount value 2024-06-28 17:57:50 +02:00
xt_REDIRECT.c netfilter: nft_redir: use struct nf_nat_range2 throughout and deduplicate eval call-backs 2023-03-22 21:48:59 +01:00
xt_repldata.h netfilter: xtables: Use strscpy() instead of strscpy_pad() 2025-03-23 10:53:47 +01:00
xt_sctp.c netfilter: xt_sctp: validate the flag_info count 2023-08-30 17:34:01 +02:00
xt_SECMARK.c netfilter: xtables: avoid NFPROTO_UNSPEC where needed 2024-10-09 23:20:46 +02:00
xt_set.c
xt_socket.c net: annotate data-races around sk->sk_mark 2023-07-29 18:13:41 +01:00
xt_state.c
xt_statistic.c
xt_string.c
xt_tcpmss.c
xt_TCPMSS.c
xt_TCPOPTSTRIP.c netfilter: xtables: support arpt_mark and ipv6 optstrip for iptables-nft only builds 2025-05-22 17:16:02 +02:00
xt_tcpudp.c xtables: move icmp/icmpv6 logic to xt_tcpudp 2023-03-22 21:48:59 +01:00
xt_TEE.c
xt_time.c
xt_TPROXY.c
xt_TRACE.c netfilter: xtables: fix typo causing some targets not to load on IPv6 2024-10-21 11:31:26 +02:00
xt_u32.c netfilter: xt_u32: validate user space input 2023-08-30 17:34:01 +02:00