linux-yocto/net/sched
William Liu 7ff2d83ecf net/sched: Return NULL when htb_lookup_leaf encounters an empty rbtree
[ Upstream commit 0e1d5d9b5c ]

htb_lookup_leaf has a BUG_ON that can trigger with the following:

tc qdisc del dev lo root
tc qdisc add dev lo root handle 1: htb default 1
tc class add dev lo parent 1: classid 1:1 htb rate 64bit
tc qdisc add dev lo parent 1:1 handle 2: netem
tc qdisc add dev lo parent 2:1 handle 3: blackhole
ping -I lo -c1 -W0.001 127.0.0.1

The root cause is the following:

1. htb_dequeue calls htb_dequeue_tree which calls the dequeue handler on
   the selected leaf qdisc
2. netem_dequeue calls enqueue on the child qdisc
3. blackhole_enqueue drops the packet and returns a value that is not
   just NET_XMIT_SUCCESS
4. Because of this, netem_dequeue calls qdisc_tree_reduce_backlog, and
   since qlen is now 0, it calls htb_qlen_notify -> htb_deactivate ->
   htb_deactivate_prios -> htb_remove_class_from_row -> htb_safe_rb_erase
5. As this is the only class in the selected hprio rbtree,
   __rb_change_child in __rb_erase_augmented sets the rb_root pointer to
   NULL
6. Because blackhole_dequeue returns NULL, netem_dequeue returns NULL,
   which causes htb_dequeue_tree to call htb_lookup_leaf with the same
   hprio rbtree, and fail the BUG_ON
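
The sequence above can be modeled in userspace with a minimal sketch. Every
name here (toy_class, toy_row, toy_qlen_notify, toy_leaf_dequeue,
toy_second_lookup_sees_empty) is a hypothetical stand-in and not a kernel
symbol; the row's root pointer only approximates the hprio rbtree root.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins for illustration only; these are not the kernel's types. */
struct toy_class {
	int qlen;               /* packets the leaf believes it holds */
};

struct toy_row {
	struct toy_class *root; /* models the rbtree root; NULL means empty */
};

/* Models htb_qlen_notify -> htb_deactivate: unlink the only class. */
static void toy_qlen_notify(struct toy_row *row)
{
	row->root = NULL;
}

/*
 * Models netem_dequeue with a blackhole child: the packet is dropped,
 * qlen falls to zero, the parent is notified, and no packet is returned.
 */
static void *toy_leaf_dequeue(struct toy_row *row, struct toy_class *cl)
{
	cl->qlen = 0;
	toy_qlen_notify(row);
	return NULL;
}

/* Returns 1 if a retry of the lookup on the same row sees an empty tree. */
static int toy_second_lookup_sees_empty(void)
{
	struct toy_class cl = { .qlen = 1 };
	struct toy_row row = { .root = &cl };
	void *skb = toy_leaf_dequeue(&row, row.root);

	/* skb == NULL is what makes the caller look up the same row again. */
	return skb == NULL && row.root == NULL;
}
```

In this model the tree is emptied inside the leaf's dequeue call, so the
caller's retry finds no root node, which is the state the BUG_ON rejects.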

The function graph for this scenario is shown here:
 0)               |  htb_enqueue() {
 0) + 13.635 us   |    netem_enqueue();
 0)   4.719 us    |    htb_activate_prios();
 0) # 2249.199 us |  }
 0)               |  htb_dequeue() {
 0)   2.355 us    |    htb_lookup_leaf();
 0)               |    netem_dequeue() {
 0) + 11.061 us   |      blackhole_enqueue();
 0)               |      qdisc_tree_reduce_backlog() {
 0)               |        qdisc_lookup_rcu() {
 0)   1.873 us    |          qdisc_match_from_root();
 0)   6.292 us    |        }
 0)   1.894 us    |        htb_search();
 0)               |        htb_qlen_notify() {
 0)   2.655 us    |          htb_deactivate_prios();
 0)   6.933 us    |        }
 0) + 25.227 us   |      }
 0)   1.983 us    |      blackhole_dequeue();
 0) + 86.553 us   |    }
 0) # 2932.761 us |    qdisc_warn_nonwc();
 0)               |    htb_lookup_leaf() {
 0)               |      BUG_ON();
 ------------------------------------------

The full original bug report can be seen here [1].

We can fix this by returning NULL from htb_lookup_leaf instead of
hitting the BUG_ON, because htb_dequeue_tree already returns NULL
when htb_lookup_leaf returns NULL.
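
A minimal userspace sketch of that idea follows. It is not the upstream
diff: sketch_leaf, sketch_row, sketch_lookup_leaf and sketch_dequeue_tree
are hypothetical names, and the toy dequeue returns the leaf itself where
the kernel function would return a packet.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins for illustration only; these are not the kernel's types. */
struct sketch_leaf {
	int id;
};

struct sketch_row {
	struct sketch_leaf *root; /* models the rbtree root; NULL means empty */
};

/*
 * An empty tree is reported to the caller as NULL rather than being
 * treated as a fatal condition.
 */
static struct sketch_leaf *sketch_lookup_leaf(struct sketch_row *row)
{
	if (row->root == NULL)
		return NULL;
	return row->root;
}

/* The caller already copes with NULL, so it simply propagates it. */
static struct sketch_leaf *sketch_dequeue_tree(struct sketch_row *row)
{
	struct sketch_leaf *cl = sketch_lookup_leaf(row);

	if (cl == NULL)
		return NULL; /* nothing to send from this row */
	return cl;
}
```

With an empty row, sketch_dequeue_tree returns NULL and the caller moves
on; with a populated row it returns the leaf as before.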

[1] https://lore.kernel.org/netdev/pF5XOOIim0IuEfhI-SOxTgRvNoDwuux7UHKnE_Y5-zVd4wmGvNk2ceHjKb8ORnzw0cGwfmVu42g9dL7XyJLf1NEzaztboTWcm0Ogxuojoeo=@willsroot.io/

Fixes: 512bb43eb5 ("pkt_sched: sch_htb: Optimize WARN_ONs in htb_dequeue_tree() etc.")
Signed-off-by: William Liu <will@willsroot.io>
Signed-off-by: Savino Dicanosa <savy@syst3mfailure.io>
Link: https://patch.msgid.link/20250717022816.221364-1-will@willsroot.io
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-07-24 08:53:19 +02:00
act_api.c net: use unrcu_pointer() helper 2024-12-09 10:32:10 +01:00
act_bpf.c net: Rename mono_delivery_time to tstamp_type for scalabilty 2025-05-09 09:43:57 +02:00
act_connmark.c
act_csum.c
act_ct.c sched: act_ct: take care of padding in struct zones_ht_key 2024-08-11 12:47:18 +02:00
act_ctinfo.c
act_gact.c
act_gate.c
act_ife.c
act_ipt.c
act_meta_mark.c
act_meta_skbprio.c
act_meta_skbtcindex.c
act_mirred.c
act_mpls.c
act_nat.c
act_pedit.c
act_police.c
act_sample.c
act_simple.c
act_skbedit.c
act_skbmod.c
act_tunnel_key.c net: fix geneve_opt length integer overflow 2025-04-10 14:37:40 +02:00
act_vlan.c
cls_api.c tc: Ensure we have enough buffer space when sending filter netlink notifications 2025-04-25 10:45:06 +02:00
cls_basic.c
cls_bpf.c net: Rename mono_delivery_time to tstamp_type for scalabilty 2025-05-09 09:43:57 +02:00
cls_cgroup.c
cls_flow.c net_sched: cls_flow: validate TCA_FLOW_RSHIFT attribute 2025-01-17 13:36:12 +01:00
cls_flower.c net: fix geneve_opt length integer overflow 2025-04-10 14:37:40 +02:00
cls_fw.c
cls_matchall.c
cls_route.c
cls_u32.c net: sched: cls_u32: Fix u32's systematic failure to free IDR entries for hnodes. 2024-11-22 15:38:32 +01:00
em_canid.c
em_cmp.c
em_ipset.c
em_ipt.c
em_meta.c
em_nbyte.c
em_text.c
em_u32.c
ematch.c
Kconfig
Makefile
sch_api.c net/sched: Abort __tc_modify_qdisc if parent class does not exist 2025-07-17 18:35:11 +02:00
sch_blackhole.c
sch_cake.c sched: sch_cake: add bounds checks to host bulk flow fairness counts 2025-01-17 13:36:15 +01:00
sch_cbs.c net/sched: cbs: Fix integer overflow in cbs_set_port_rate() 2024-12-14 20:00:03 +01:00
sch_choke.c net: sched: fix ordering of qlen adjustment 2024-12-27 13:58:41 +01:00
sch_codel.c net_sched: Flush gso_skb list too during ->change() 2025-05-22 14:12:15 +02:00
sch_drr.c sch_drr: make drr_qlen_notify() idempotent 2025-05-09 09:44:04 +02:00
sch_etf.c
sch_ets.c net_sched: ets: fix a race in ets_qdisc_change() 2025-06-19 15:28:43 +02:00
sch_fifo.c pfifo_tail_enqueue: Drop new packet when sch->limit == 0 2025-03-13 12:58:40 +01:00
sch_fq_codel.c net_sched: Flush gso_skb list too during ->change() 2025-05-22 14:12:15 +02:00
sch_fq_pie.c net_sched: Flush gso_skb list too during ->change() 2025-05-22 14:12:15 +02:00
sch_fq.c net_sched: Flush gso_skb list too during ->change() 2025-05-22 14:12:15 +02:00
sch_frag.c
sch_generic.c net: fix races in netdev_tx_sent_queue()/dev_watchdog() 2024-11-01 01:58:29 +01:00
sch_gred.c sched: address a potential NULL pointer dereference in the GRED scheduler. 2025-03-22 12:50:38 -07:00
sch_hfsc.c net_sched: hfsc: Address reentrant enqueue adding class to eltree twice 2025-06-04 14:42:24 +02:00
sch_hhf.c net_sched: Flush gso_skb list too during ->change() 2025-05-22 14:12:15 +02:00
sch_htb.c net/sched: Return NULL when htb_lookup_leaf encounters an empty rbtree 2025-07-24 08:53:19 +02:00
sch_ingress.c bpf: Fix too early release of tcx_entry 2024-07-18 13:21:12 +02:00
sch_mq.c
sch_mqprio_lib.c
sch_mqprio_lib.h
sch_mqprio.c
sch_multiq.c net: sched: sch_multiq: fix possible OOB write in multiq_tune() 2024-06-21 14:38:16 +02:00
sch_netem.c netem: Update sch->q.qlen before qdisc_tree_reduce_backlog() 2025-02-17 09:40:14 +01:00
sch_pie.c net_sched: Flush gso_skb list too during ->change() 2025-05-22 14:12:15 +02:00
sch_plug.c
sch_prio.c net_sched: prio: fix a race in prio_tune() 2025-06-19 15:28:43 +02:00
sch_qfq.c net/sched: sch_qfq: Fix race condition on qfq_aggregate 2025-07-24 08:53:17 +02:00
sch_red.c net_sched: red: fix a race in __red_change() 2025-06-19 15:28:43 +02:00
sch_sfb.c
sch_sfq.c net_sched: sch_sfq: reject invalid perturb period 2025-06-27 11:08:59 +01:00
sch_skbprio.c net_sched: skbprio: Remove overly strict queue assertions 2025-04-10 14:37:39 +02:00
sch_taprio.c net/sched: fix use-after-free in taprio_dev_notifier 2025-06-27 11:08:48 +01:00
sch_tbf.c net_sched: tbf: fix a race in tbf_change() 2025-06-19 15:28:43 +02:00
sch_teql.c