linux-yocto/fs/erofs
Junli Liu cc2ec79a6c erofs: fix atomic context detection when !CONFIG_DEBUG_LOCK_ALLOC
[ Upstream commit c99fab6e80b76422741d34aafc2f930a482afbdd ]

Since EROFS handles decompression in non-atomic contexts due to
uncontrollable decompression latencies and vmap() usage, it tries
to detect atomic contexts and only kicks off a kworker on demand
in order to reduce unnecessary scheduling overhead.

However, the current approach is insufficient and can lead to
sleeping function calls in invalid contexts, causing kernel
warnings and potential system instability. See the stacktrace [1]
and previous discussion [2].

The current implementation only checks rcu_read_lock_any_held(),
which behaves inconsistently across different kernel configurations:

- When CONFIG_DEBUG_LOCK_ALLOC is enabled: correctly detects
  RCU critical sections by checking rcu_lock_map
- When CONFIG_DEBUG_LOCK_ALLOC is disabled: compiles to
  "!preemptible()", which only checks preempt_count and misses
  RCU critical sections

This patch introduces z_erofs_in_atomic() to provide comprehensive
atomic context detection:

1. Check RCU preemption depth when CONFIG_PREEMPTION is enabled,
   as RCU critical sections may not affect preempt_count but still
   require atomic handling

2. Always use async processing when CONFIG_PREEMPT_COUNT is disabled,
   as preemption state cannot be reliably determined

3. Fall back to standard preemptible() check for remaining cases

The function replaces the previous complex condition check and ensures
that z_erofs always uses (kthread_)work in atomic contexts to minimize
scheduling overhead and prevent sleeping in invalid contexts.

[1] Problem stacktrace
[ 61.266692] BUG: sleeping function called from invalid context at kernel/locking/rtmutex_api.c:510
[ 61.266702] in_atomic(): 0, irqs_disabled(): 0, non_block: 0, pid: 107, name: irq/54-ufshcd
[ 61.266704] preempt_count: 0, expected: 0
[ 61.266705] RCU nest depth: 2, expected: 0
[ 61.266710] CPU: 0 UID: 0 PID: 107 Comm: irq/54-ufshcd Tainted: G W O 6.12.17 #1
[ 61.266714] Tainted: [W]=WARN, [O]=OOT_MODULE
[ 61.266715] Hardware name: schumacher (DT)
[ 61.266717] Call trace:
[ 61.266718] dump_backtrace+0x9c/0x100
[ 61.266727] show_stack+0x20/0x38
[ 61.266728] dump_stack_lvl+0x78/0x90
[ 61.266734] dump_stack+0x18/0x28
[ 61.266736] __might_resched+0x11c/0x180
[ 61.266743] __might_sleep+0x64/0xc8
[ 61.266745] mutex_lock+0x2c/0xc0
[ 61.266748] z_erofs_decompress_queue+0xe8/0x978
[ 61.266753] z_erofs_decompress_kickoff+0xa8/0x190
[ 61.266756] z_erofs_endio+0x168/0x288
[ 61.266758] bio_endio+0x160/0x218
[ 61.266762] blk_update_request+0x244/0x458
[ 61.266766] scsi_end_request+0x38/0x278
[ 61.266770] scsi_io_completion+0x4c/0x600
[ 61.266772] scsi_finish_command+0xc8/0xe8
[ 61.266775] scsi_complete+0x88/0x148
[ 61.266777] blk_mq_complete_request+0x3c/0x58
[ 61.266780] scsi_done_internal+0xcc/0x158
[ 61.266782] scsi_done+0x1c/0x30
[ 61.266783] ufshcd_compl_one_cqe+0x12c/0x438
[ 61.266786] __ufshcd_transfer_req_compl+0x2c/0x78
[ 61.266788] ufshcd_poll+0xf4/0x210
[ 61.266789] ufshcd_transfer_req_compl+0x50/0x88
[ 61.266791] ufshcd_intr+0x21c/0x7c8
[ 61.266792] irq_forced_thread_fn+0x44/0xd8
[ 61.266796] irq_thread+0x1a4/0x358
[ 61.266799] kthread+0x12c/0x138
[ 61.266802] ret_from_fork+0x10/0x20

[2] https://lore.kernel.org/r/58b661d0-0ebb-4b45-a10d-c5927fb791cd@paulmck-laptop

Signed-off-by: Junli Liu <liujunli@lixiang.com>
Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20250805011957.911186-1-liujunli@lixiang.com
[ Gao Xiang: Use the original trace in v1. ]
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-09-04 15:31:44 +02:00
..
compress.h
data.c erofs: fix to add missing tracepoint in erofs_readahead() 2025-07-17 18:37:18 +02:00
decompressor_deflate.c
decompressor_lzma.c
decompressor_zstd.c
decompressor.c erofs: address D-cache aliasing 2025-07-17 18:37:15 +02:00
dir.c
erofs_fs.h erofs: restrict pcluster size limitations 2024-09-12 23:00:09 +08:00
fileio.c erofs: refine readahead tracepoint 2025-07-17 18:37:18 +02:00
fscache.c erofs: reference struct erofs_device_info for erofs_map_dev 2024-12-27 14:02:00 +01:00
inode.c erofs: reject inodes with negative i_size 2024-09-12 23:00:09 +08:00
internal.h erofs: fix large fragment handling 2025-08-01 09:48:45 +01:00
Kconfig erofs: mark experimental fscache backend deprecated 2024-09-10 15:27:11 +08:00
Makefile erofs: support unencoded inodes for fileio 2024-09-10 15:26:36 +08:00
namei.c
super.c erofs: avoid using multiple devices with different type 2025-06-19 15:31:29 +02:00
sysfs.c erofs: clean up erofs_register_sysfs() 2024-09-10 00:46:34 +08:00
xattr.c add a string-to-qstr constructor 2025-07-10 16:05:08 +02:00
xattr.h
zdata.c erofs: fix atomic context detection when !CONFIG_DEBUG_LOCK_ALLOC 2025-09-04 15:31:44 +02:00
zmap.c erofs: fix large fragment handling 2025-08-01 09:48:45 +01:00
zutil.c erofs: fix rare pcluster memory leak after unmounting 2025-07-17 18:37:23 +02:00