linux-imx/kernel
Steven Rostedt 7a5f01828e tracing/osnoise: Use a cpumask to know what threads are kthreads
commit 177e1cc2f4 upstream.

The start_kthread() and stop_thread() code was not always called with the
interface_lock held. This means that the kthread variable could be
unexpectedly changed causing the kthread_stop() to be called on it when it
should not have been, leading to:

 while true; do
   rtla timerlat top -u -q & PID=$!;
   sleep 5;
   kill -INT $PID;
   sleep 0.001;
   kill -TERM $PID;
   wait $PID;
  done

Causing the following OOPS:

 Oops: general protection fault, probably for non-canonical address 0xdffffc0000000002: 0000 [#1] PREEMPT SMP KASAN PTI
 KASAN: null-ptr-deref in range [0x0000000000000010-0x0000000000000017]
 CPU: 5 UID: 0 PID: 885 Comm: timerlatu/5 Not tainted 6.11.0-rc4-test-00002-gbc754cc76d1b-dirty #125 a533010b71dab205ad2f507188ce8c82203b0254
 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
 RIP: 0010:hrtimer_active+0x58/0x300
 Code: 48 c1 ee 03 41 54 48 01 d1 48 01 d6 55 53 48 83 ec 20 80 39 00 0f 85 30 02 00 00 49 8b 6f 30 4c 8d 75 10 4c 89 f0 48 c1 e8 03 <0f> b6 3c 10 4c 89 f0 83 e0 07 83 c0 03 40 38 f8 7c 09 40 84 ff 0f
 RSP: 0018:ffff88811d97f940 EFLAGS: 00010202
 RAX: 0000000000000002 RBX: ffff88823c6b5b28 RCX: ffffed10478d6b6b
 RDX: dffffc0000000000 RSI: ffffed10478d6b6c RDI: ffff88823c6b5b28
 RBP: 0000000000000000 R08: ffff88823c6b5b58 R09: ffff88823c6b5b60
 R10: ffff88811d97f957 R11: 0000000000000010 R12: 00000000000a801d
 R13: ffff88810d8b35d8 R14: 0000000000000010 R15: ffff88823c6b5b28
 FS:  0000000000000000(0000) GS:ffff88823c680000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 CR2: 0000561858ad7258 CR3: 000000007729e001 CR4: 0000000000170ef0
 Call Trace:
  <TASK>
  ? die_addr+0x40/0xa0
  ? exc_general_protection+0x154/0x230
  ? asm_exc_general_protection+0x26/0x30
  ? hrtimer_active+0x58/0x300
  ? __pfx_mutex_lock+0x10/0x10
  ? __pfx_locks_remove_file+0x10/0x10
  hrtimer_cancel+0x15/0x40
  timerlat_fd_release+0x8e/0x1f0
  ? security_file_release+0x43/0x80
  __fput+0x372/0xb10
  task_work_run+0x11e/0x1f0
  ? _raw_spin_lock+0x85/0xe0
  ? __pfx_task_work_run+0x10/0x10
  ? poison_slab_object+0x109/0x170
  ? do_exit+0x7a0/0x24b0
  do_exit+0x7bd/0x24b0
  ? __pfx_migrate_enable+0x10/0x10
  ? __pfx_do_exit+0x10/0x10
  ? __pfx_read_tsc+0x10/0x10
  ? ktime_get+0x64/0x140
  ? _raw_spin_lock_irq+0x86/0xe0
  do_group_exit+0xb0/0x220
  get_signal+0x17ba/0x1b50
  ? vfs_read+0x179/0xa40
  ? timerlat_fd_read+0x30b/0x9d0
  ? __pfx_get_signal+0x10/0x10
  ? __pfx_timerlat_fd_read+0x10/0x10
  arch_do_signal_or_restart+0x8c/0x570
  ? __pfx_arch_do_signal_or_restart+0x10/0x10
  ? vfs_read+0x179/0xa40
  ? ksys_read+0xfe/0x1d0
  ? __pfx_ksys_read+0x10/0x10
  syscall_exit_to_user_mode+0xbc/0x130
  do_syscall_64+0x74/0x110
  ? __pfx___rseq_handle_notify_resume+0x10/0x10
  ? __pfx_ksys_read+0x10/0x10
  ? fpregs_restore_userregs+0xdb/0x1e0
  ? fpregs_restore_userregs+0xdb/0x1e0
  ? syscall_exit_to_user_mode+0x116/0x130
  ? do_syscall_64+0x74/0x110
  ? do_syscall_64+0x74/0x110
  ? do_syscall_64+0x74/0x110
  entry_SYSCALL_64_after_hwframe+0x71/0x79
 RIP: 0033:0x7ff0070eca9c
 Code: Unable to access opcode bytes at 0x7ff0070eca72.
 RSP: 002b:00007ff006dff8c0 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
 RAX: 0000000000000000 RBX: 0000000000000005 RCX: 00007ff0070eca9c
 RDX: 0000000000000400 RSI: 00007ff006dff9a0 RDI: 0000000000000003
 RBP: 00007ff006dffde0 R08: 0000000000000000 R09: 00007ff000000ba0
 R10: 00007ff007004b08 R11: 0000000000000246 R12: 0000000000000003
 R13: 00007ff006dff9a0 R14: 0000000000000007 R15: 0000000000000008
  </TASK>
 Modules linked in: snd_hda_intel snd_intel_dspcfg snd_intel_sdw_acpi snd_hda_codec snd_hwdep snd_hda_core
 ---[ end trace 0000000000000000 ]---

This is because it would mistakenly call kthread_stop() on a user space
thread making it "exit" before it actually exits.

Since kthreads are created based on global behavior, use a cpumask to know
when kthreads are running and that they need to be shutdown before
proceeding to do new work.

Link: https://lore.kernel.org/all/20240820130001.124768-1-tglozar@redhat.com/

This was debugged by using the persistent ring buffer:

Link: https://lore.kernel.org/all/20240823013902.135036960@goodmis.org/

Note, locking was originally used to fix this, but that proved to cause too
many deadlocks to work around:

  https://lore.kernel.org/linux-trace-kernel/20240823102816.5e55753b@gandalf.local.home/

Cc: stable@vger.kernel.org
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: "Luis Claudio R. Goncalves" <lgoncalv@redhat.com>
Link: https://lore.kernel.org/20240904103428.08efdf4c@gandalf.local.home
Fixes: e88ed227f6 ("tracing/timerlat: Add user-space interface")
Reported-by: Tomas Glozar <tglozar@redhat.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-09-12 11:11:27 +02:00
..
bpf bpf: Fix a kernel verifier crash in stacksafe() 2024-08-29 17:33:58 +02:00
cgroup cgroup: Avoid extra dereference in css_populate_dir() 2024-08-29 17:33:24 +02:00
configs Kbuild updates for v6.6 2023-09-05 11:01:47 -07:00
debug kdb: Use the passed prompt in kdb_position_cursor() 2024-08-03 08:54:34 +02:00
dma dma-debug: avoid deadlock between dma debug vs printk and netconsole 2024-09-08 07:54:32 +02:00
entry entry: Respect changes to system call number by trace_sys_enter() 2024-04-03 15:28:50 +02:00
events bpf, events: Use prog to emit ksymbol event for main program 2024-08-03 08:54:36 +02:00
futex futex: Don't include process MM in futex key on no-MMU 2023-11-20 11:58:53 +01:00
gcov gcov: add support for GCC 14 2024-06-27 13:49:13 +02:00
irq genirq/cpuhotplug: Retry with cpu_online_mask when migration fails 2024-08-19 06:04:24 +02:00
kcsan
livepatch livepatch: Fix missing newline character in klp_resolve_symbols() 2023-11-20 11:59:25 +01:00
locking rtmutex: Drop rt_mutex::wait_lock before scheduling 2024-09-12 11:11:25 +02:00
module module: make waiting for a concurrent module loader interruptible 2024-08-14 13:58:53 +02:00
power PM: s2idle: Make sure CPUs will wakeup directly on resume 2024-04-17 11:19:26 +02:00
printk printk: For @suppress_panic_printk check for other CPU in panic 2024-04-13 13:07:29 +02:00
rcu rcu/nocb: Remove buggy bypass lock contention mitigation 2024-09-08 07:54:44 +02:00
sched sched/topology: Handle NUMA_NO_NODE in sched_numa_find_nth_cpu() 2024-08-29 17:33:24 +02:00
time hrtimer: Prevent queuing of hrtimer without a function callback 2024-08-29 17:33:41 +02:00
trace tracing/osnoise: Use a cpumask to know what threads are kthreads 2024-09-12 11:11:27 +02:00
.gitignore
acct.c audit/stable-6.6 PR 20230829 2023-08-30 08:17:35 -07:00
async.c async: Introduce async_schedule_dev_nocall() 2024-01-31 16:18:49 -08:00
audit_fsnotify.c
audit_tree.c
audit_watch.c audit: don't WARN_ON_ONCE(!current->mm) in audit_exe_compare() 2023-11-28 17:19:56 +00:00
audit.c audit: Send netlink ACK before setting connection in auditd_set 2024-02-05 20:14:14 +00:00
audit.h
auditfilter.c ima: Avoid blocking in RCU read-side critical section 2024-07-11 12:49:18 +02:00
auditsc.c audit,io_uring: io_uring openat triggers audit reference count underflow 2023-10-13 18:34:46 +02:00
backtracetest.c
bounds.c bounds: Use the right number of bits for power-of-two CONFIG_NR_CPUS 2024-05-02 16:32:50 +02:00
capability.c lsm: constify the 'target' parameter in security_capget() 2023-08-08 16:48:47 -04:00
cfi.c
compat.c
configs.c
context_tracking.c
cpu_pm.c
cpu.c cpu/SMT: Enable SMT only if a core is online 2024-08-29 17:33:30 +02:00
crash_core.c mm: turn folio_test_hugetlb into a PageType 2024-05-02 16:32:47 +02:00
crash_dump.c
cred.c cred: get rid of CONFIG_DEBUG_CREDENTIALS 2023-12-20 17:01:51 +01:00
delayacct.c
dma.c
exec_domain.c
exit.c mm: optimize the redundant loop of mm_update_owner_next() 2024-07-11 12:49:15 +02:00
extable.c
fail_function.c
fork.c Revert "fork: defer linking file vma until vma is fully initialized" 2024-06-21 14:38:47 +02:00
freezer.c
gen_kheaders.sh kheaders: explicitly define file modes for archived headers 2024-06-21 14:38:40 +02:00
groups.c
hung_task.c
iomem.c kernel/iomem.c: remove __weak ioremap_cache helper 2023-08-21 13:37:28 -07:00
irq_work.c
jump_label.c jump_label: Fix the fix, brown paper bags galore 2024-08-14 13:58:38 +02:00
kallsyms_internal.h
kallsyms_selftest.c Modules changes for v6.6-rc1 2023-08-29 17:32:32 -07:00
kallsyms_selftest.h
kallsyms.c kallsyms: Change func signature for cleanup_symbol_name() 2023-08-25 15:00:36 -07:00
kcmp.c
Kconfig.freezer
Kconfig.hz
Kconfig.kexec kexec: select CRYPTO from KEXEC_FILE instead of depending on it 2024-01-05 15:19:41 +01:00
Kconfig.locks
Kconfig.preempt
kcov.c kcov: properly check for softirq context 2024-08-14 13:58:57 +02:00
kexec_core.c kexec: do syscore_shutdown() in kernel_kexec 2024-01-31 16:18:56 -08:00
kexec_elf.c
kexec_file.c kexec_file: fix elfcorehdr digest exclusion when CONFIG_CRASH_HOTPLUG=y 2024-09-12 11:11:27 +02:00
kexec_internal.h
kexec.c kernel: kexec: copy user-array safely 2023-11-28 17:19:40 +00:00
kheaders.c
kprobes.c kprobes: Fix to check symbol prefixes correctly 2024-08-14 13:58:51 +02:00
ksyms_common.c
ksysfs.c crash: hotplug support for kexec_load() 2023-08-24 16:25:14 -07:00
kthread.c kthread: add kthread_stop_put 2024-06-12 11:12:52 +02:00
latencytop.c
Makefile kernel/numa.c: Move logging out of numa.h 2024-06-12 11:11:50 +02:00
module_signature.c
notifier.c
nsproxy.c nsproxy: Convert nsproxy.count to refcount_t 2023-08-21 11:29:12 -07:00
numa.c kernel/numa.c: Move logging out of numa.h 2024-06-12 11:11:50 +02:00
padata.c padata: Fix possible divide-by-0 panic in padata_mt_helper() 2024-08-14 13:58:59 +02:00
panic.c panic: Flush kernel log buffer at the end 2024-04-13 13:07:29 +02:00
params.c kernel: params: Remove unnecessary ‘0’ values from err 2023-07-10 12:47:01 -07:00
pid_namespace.c zap_pid_ns_processes: clear TIF_NOTIFY_SIGNAL along with TIF_SIGPENDING 2024-06-21 14:38:50 +02:00
pid_sysctl.h memfd: replace ratcheting feature from vm.memfd_noexec with hierarchy 2023-08-21 13:37:59 -07:00
pid.c pidfd: prevent a kernel-doc warning 2023-09-19 13:21:33 -07:00
profile.c profiling: remove profile=sleep support 2024-08-14 13:58:47 +02:00
ptrace.c
range.c
reboot.c kernel/reboot: emergency_restart: Set correct system_state 2023-11-28 17:20:04 +00:00
regset.c
relay.c kernel: relay: remove unnecessary NULL values from relay_open_buf 2023-08-18 10:18:55 -07:00
resource_kunit.c
resource.c x86/kaslr: Expose and use the end of the physical memory address space 2024-09-12 11:11:25 +02:00
rseq.c
scftorture.c
scs.c
seccomp.c seccomp: Add missing kerndoc notations 2023-08-17 12:32:15 -07:00
signal.c kernel: rerun task_work while freezing in get_signal() 2024-08-03 08:54:13 +02:00
smp.c smp,csd: Throw an error if a CSD lock is stuck for too long 2023-11-28 17:19:36 +00:00
smpboot.c kthread: add kthread_stop_put 2024-06-12 11:12:52 +02:00
smpboot.h
softirq.c softirq: Fix suspicious RCU usage in __do_softirq() 2024-06-12 11:11:27 +02:00
stackleak.c
stacktrace.c
static_call_inline.c
static_call.c
stop_machine.c
sys_ni.c syscalls: fix compat_sys_io_pgetevents_time64 usage 2024-07-05 09:34:04 +02:00
sys.c prctl: generalize PR_SET_MDWE support check to be per-arch 2024-04-03 15:28:54 +02:00
sysctl-test.c
sysctl.c
task_work.c task_work: Introduce task_work_cancel() again 2024-08-03 08:54:16 +02:00
taskstats.c
torture.c rcutorture: Fix stuttering races and other issues 2023-11-28 17:20:08 +00:00
tracepoint.c
tsacct.c taskstats: version 12 with thread group and exe info 2022-04-29 14:38:03 -07:00
ucount.c sysctl: Add size to register_sysctl 2023-08-15 15:26:17 -07:00
uid16.c
uid16.h
umh.c
up.c A set of locking related fixes and updates: 2021-05-09 13:07:03 -07:00
user_namespace.c
user-return-notifier.c
user.c
usermode_driver.c
utsname_sysctl.c
utsname.c
vhost_task.c vhost_task: Handle SIGKILL by flushing work and exiting 2024-07-11 12:49:10 +02:00
watch_queue.c kernel: watch_queue: copy user-array safely 2023-11-28 17:19:40 +00:00
watchdog_buddy.c
watchdog_perf.c watchdog/perf: properly initialize the turbo mode timestamp and rearm counter 2024-08-03 08:54:29 +02:00
watchdog.c watchdog: move softlockup_panic back to early_param 2023-11-28 17:19:57 +00:00
workqueue_internal.h workqueue: Drop the special locking rule for worker->flags and worker_pool->flags 2023-08-07 15:57:22 -10:00
workqueue.c workqueue: Fix selection of wake_cpu in kick_pool() 2024-05-17 12:02:31 +02:00