linux-yocto/kernel/time
Frederic Weisbecker d4b3a4c2aa timers/migration: Fix imbalanced NUMA trees
[ Upstream commit 5eb579dfd46b4949117ecb0f1ba2f12d3dc9a6f2 ]

When a CPU from a new node boots, the old root may end up connected to
the new root even if their nodes mismatch, as depicted in the following
scenario:

1) CPU 0 boots and creates the first group for node 0.

   [GRP0:0]
    node 0
      |
    CPU 0

2) CPU 1 from node 1 boots and creates a new top that corresponds to
   node 1, but it also connects the old root from node 0 to the new root
   from node 1 by mistake.

             [GRP1:0]
              node 1
            /        \
           /          \
   [GRP0:0]             [GRP0:1]
    node 0               node 1
      |                    |
    CPU 0                CPU 1

3) This eventually leads to an imbalanced tree where some node 0 CPUs
   migrate node 1 timers (and vice versa) way before reaching the
   crossnode groups, resulting in more frequent remote memory accesses
   than expected.

                      [GRP2:0]
                      NUMA_NO_NODE
                     /             \
             [GRP1:0]              [GRP1:1]
              node 1               node 0
            /        \                |
           /          \             [...]
   [GRP0:0]             [GRP0:1]
    node 0               node 1
      |                    |
    CPU 0...              CPU 1...

A balanced tree should only contain groups having children that belong
to the same node:

                      [GRP2:0]
                      NUMA_NO_NODE
                     /             \
             [GRP1:0]              [GRP1:1]
              node 0               node 1
            /        \             /      \
           /          \           /        \
   [GRP0:0]          [...]      [...]    [GRP0:1]
    node 0                                node 1
      |                                     |
    CPU 0...                              CPU 1...
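
The balance rule above can be sketched as a small recursive check. This
is an illustrative model only: `struct group`, `NO_NODE` and
`tree_is_balanced()` are hypothetical names invented for this example,
not the data structures used in timer_migration.c, and each group is
limited to two children for brevity.

```c
#include <stdbool.h>
#include <stddef.h>

#define NO_NODE (-1)	/* stand-in for NUMA_NO_NODE in this sketch */

/* Hypothetical, simplified group; not the kernel's own structure. */
struct group {
	int node;		/* NUMA node, or NO_NODE at crossnode level */
	struct group *child[2];	/* NULL when a slot is unused */
};

/*
 * Return true if every node-bound group only has children that belong
 * to the same node. Crossnode groups (NO_NODE) may mix nodes.
 */
static bool tree_is_balanced(const struct group *g)
{
	if (!g)
		return true;

	for (size_t i = 0; i < 2; i++) {
		const struct group *c = g->child[i];

		if (!c)
			continue;
		if (g->node != NO_NODE && c->node != g->node)
			return false;
		if (!tree_is_balanced(c))
			return false;
	}
	return true;
}
```

Under these assumptions, the tree from step 2 (a node 1 group with a
node 0 child) fails the check, while the balanced layout passes it.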

In order to fix this, the hierarchy must be unfolded up to the crossnode
level as soon as a node mismatch is detected. For example the stage 2
above should lead to this layout:

                      [GRP2:0]
                      NUMA_NO_NODE
                     /             \
             [GRP1:0]              [GRP1:1]
              node 0               node 1
              /                         \
             /                           \
        [GRP0:0]                        [GRP0:1]
        node 0                           node 1
          |                                |
       CPU 0                             CPU 1

This means that not only GRP1:0 must be created but also GRP1:1 and
GRP2:0 in order to prepare a balanced tree for the next CPUs to boot.
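
As a rough illustration of the unfolding described above, the following
sketch builds a per-node chain of groups up to the crossnode level each
time a CPU boots. It is a toy model under several assumptions: three
levels with the crossnode level at 2, no limit on children per group,
and hypothetical names (`struct group`, `get_group()`, `cpu_boot()`)
that do not come from the kernel sources. It always builds the full
chain rather than reacting to a detected mismatch, which is the simplest
way to reach the same layout.

```c
#include <stddef.h>

#define NO_NODE		(-1)	/* stand-in for NUMA_NO_NODE */
#define LEVELS		3	/* assumed hierarchy depth */
#define CROSSNODE_LVL	2	/* assumed first level shared by all nodes */
#define MAX_GROUPS	16

/* Hypothetical, simplified group; not the kernel's own structure. */
struct group {
	int level;
	int node;
	struct group *parent;
};

static struct group pool[MAX_GROUPS];
static size_t nr_groups;

/* Find the group for (level, node), creating it if it does not exist. */
static struct group *get_group(int level, int node)
{
	struct group *g;

	if (level >= CROSSNODE_LVL)
		node = NO_NODE;	/* crossnode groups are not tied to a node */

	for (size_t i = 0; i < nr_groups; i++)
		if (pool[i].level == level && pool[i].node == node)
			return &pool[i];

	if (nr_groups == MAX_GROUPS)
		return NULL;

	g = &pool[nr_groups++];
	g->level = level;
	g->node = node;
	g->parent = NULL;
	return g;
}

/*
 * Connect a booting CPU of the given node. Groups below the crossnode
 * level are only ever linked to a parent of the same node; different
 * nodes meet at the crossnode level. Returns the CPU's level 0 group.
 */
static struct group *cpu_boot(int node)
{
	struct group *leaf = NULL, *child = NULL;

	for (int lvl = 0; lvl < LEVELS; lvl++) {
		struct group *g = get_group(lvl, node);

		if (!g)
			return NULL;
		if (child && !child->parent)
			child->parent = g;
		if (!leaf)
			leaf = g;
		child = g;
	}
	return leaf;
}
```

Calling `cpu_boot(0)` and then `cpu_boot(1)` in this model yields five
groups: one level 0 and one level 1 group per node, joined by a single
crossnode root, matching the layout drawn above.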

Fixes: 7ee9887703 ("timers: Implement the hierarchical pull model")
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://patch.msgid.link/20251024132536.39841-4-frederic@kernel.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-12-18 14:02:47 +01:00
..
alarmtimer.c time: Fix spelling mistakes in comments 2025-09-21 10:02:02 +02:00
clockevents.c tick: Do not set device to detached state in tick_shutdown() 2025-09-09 13:39:00 +02:00
clocksource-wdtest.c
clocksource.c time: Fix spelling mistakes in comments 2025-09-21 10:02:02 +02:00
hrtimer.c Updates for the time(rs) core subsystem: 2025-09-30 16:09:27 -07:00
itimer.c timers/itimer: Avoid direct access to hrtimer clockbase 2025-09-09 12:27:17 +02:00
jiffies.c
Kconfig
Makefile time: Build generic update_vsyscall() only with generic time vDSO 2025-09-04 11:23:50 +02:00
namespace.c namespace-6.18-rc1 2025-09-29 11:20:29 -07:00
ntp_internal.h
ntp.c
posix-clock.c
posix-cpu-timers.c
posix-stubs.c
posix-timers.c posix-timers: Plug potential memory leak in do_timer_create() 2025-11-14 16:58:31 +01:00
posix-timers.h
sched_clock.c time/sched_clock: Export symbol for sched_clock register function 2025-09-23 10:52:31 +02:00
sleep_timeout.c
test_udelay.c
tick-broadcast-hrtimer.c
tick-broadcast.c
tick-common.c tick: Do not set device to detached state in tick_shutdown() 2025-09-09 13:39:00 +02:00
tick-internal.h tick: Do not set device to detached state in tick_shutdown() 2025-09-09 13:39:00 +02:00
tick-legacy.c
tick-oneshot.c
tick-sched.c tick/sched: Fix bogus condition in report_idle_softirq() 2025-11-19 19:30:45 +01:00
tick-sched.h
time_test.c
time.c time: export timespec64_add_safe() symbol 2025-09-03 16:51:08 -07:00
timeconst.bc
timeconv.c
timecounter.c
timekeeping_debug.c
timekeeping_internal.h
timekeeping.c timekeeping: Fix error code in tk_aux_sysfs_init() 2025-11-25 17:52:24 +01:00
timekeeping.h
timer_list.c hrtimer: Remove hrtimer_clock_base::get_time 2025-09-09 12:27:18 +02:00
timer_migration.c timers/migration: Fix imbalanced NUMA trees 2025-12-18 14:02:47 +01:00
timer_migration.h
timer.c timers: Fix NULL function pointer race in timer_shutdown_sync() 2025-11-22 22:55:26 +01:00
vsyscall.c vdso/vsyscall: Avoid slow division loop in auxiliary clock update 2025-09-03 11:55:11 +02:00