linux-yocto/fs/btrfs
Boris Burkov 36679fab54 btrfs: check folio mapping after unlock in relocate_one_folio()
commit 3e74859ee35edc33a022c3f3971df066ea0ca6b9 upstream.

When we call btrfs_read_folio() to bring a folio uptodate, we unlock the
folio. The result of that is that a different thread can modify the
mapping (like remove it with invalidate) before we call folio_lock().
This results in an invalid page and we need to try again.

In particular, if we are relocating concurrently with aborting a
transaction, this can result in a crash like the following:

  BUG: kernel NULL pointer dereference, address: 0000000000000000
  PGD 0 P4D 0
  Oops: 0000 [#1] SMP
  CPU: 76 PID: 1411631 Comm: kworker/u322:5
  Workqueue: events_unbound btrfs_reclaim_bgs_work
  RIP: 0010:set_page_extent_mapped+0x20/0xb0
  RSP: 0018:ffffc900516a7be8 EFLAGS: 00010246
  RAX: ffffea009e851d08 RBX: ffffea009e0b1880 RCX: 0000000000000000
  RDX: 0000000000000000 RSI: ffffc900516a7b90 RDI: ffffea009e0b1880
  RBP: 0000000003573000 R08: 0000000000000001 R09: ffff88c07fd2f3f0
  R10: 0000000000000000 R11: 0000194754b575be R12: 0000000003572000
  R13: 0000000003572fff R14: 0000000000100cca R15: 0000000005582fff
  FS:  0000000000000000(0000) GS:ffff88c07fd00000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 0000000000000000 CR3: 000000407d00f002 CR4: 00000000007706f0
  DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
  DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
  PKRU: 55555554
  Call Trace:
  <TASK>
  ? __die+0x78/0xc0
  ? page_fault_oops+0x2a8/0x3a0
  ? __switch_to+0x133/0x530
  ? wq_worker_running+0xa/0x40
  ? exc_page_fault+0x63/0x130
  ? asm_exc_page_fault+0x22/0x30
  ? set_page_extent_mapped+0x20/0xb0
  relocate_file_extent_cluster+0x1a7/0x940
  relocate_data_extent+0xaf/0x120
  relocate_block_group+0x20f/0x480
  btrfs_relocate_block_group+0x152/0x320
  btrfs_relocate_chunk+0x3d/0x120
  btrfs_reclaim_bgs_work+0x2ae/0x4e0
  process_scheduled_works+0x184/0x370
  worker_thread+0xc6/0x3e0
  ? blk_add_timer+0xb0/0xb0
  kthread+0xae/0xe0
  ? flush_tlb_kernel_range+0x90/0x90
  ret_from_fork+0x2f/0x40
  ? flush_tlb_kernel_range+0x90/0x90
  ret_from_fork_asm+0x11/0x20
  </TASK>

This occurs because cleanup_one_transaction() calls
destroy_delalloc_inodes() which calls invalidate_inode_pages2() which
takes the folio_lock before setting mapping to NULL. We fail to check
this, and subsequently call set_page_extent_mapped(), which assumes that
mapping != NULL (in fact it asserts that in debug mode).
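
The retry described above can be sketched in userspace as follows. This is
only an illustrative model, not the kernel patch: every mock_* name, the
retry budget and the single-threaded "invalidation" are assumptions made
for the example. It shows the shape of the check: after the read has
dropped the lock and the lock has been re-taken, compare the folio's
mapping against the inode's mapping and start over if they differ.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins for the kernel objects; not kernel API. */
struct mock_mapping { int id; };

struct mock_folio {
	struct mock_mapping *mapping; /* NULL once the folio is invalidated */
	bool uptodate;
	bool locked;
};

#define MOCK_MAX_LOOKUPS 4 /* assumed retry budget, for the model only */

struct mock_env {
	struct mock_mapping *inode_mapping;
	int invalidations_left; /* how many reads lose the race */
	int lookups;            /* how many times a folio was looked up */
	struct mock_folio folios[MOCK_MAX_LOOKUPS];
};

/* Look up a fresh folio for the inode and return it locked. */
static struct mock_folio *mock_get_locked_folio(struct mock_env *env)
{
	struct mock_folio *folio;

	if (env->lookups >= MOCK_MAX_LOOKUPS)
		return NULL;
	folio = &env->folios[env->lookups++];
	folio->mapping = env->inode_mapping;
	folio->uptodate = false;
	folio->locked = true;
	return folio;
}

/* Reading unlocks the folio; a simulated invalidate may strike here. */
static void mock_read_folio(struct mock_env *env, struct mock_folio *folio)
{
	folio->locked = false;
	folio->uptodate = true;
	if (env->invalidations_left > 0) {
		env->invalidations_left--;
		folio->mapping = NULL;
	}
}

/* Models the callee that assumes (and asserts) a non-NULL mapping. */
static void mock_set_extent_mapped(struct mock_folio *folio)
{
	assert(folio->locked);
	assert(folio->mapping != NULL);
}

/* Returns 0 on success, -1 if the retry budget is exhausted. */
int mock_relocate_one_folio(struct mock_env *env)
{
	struct mock_folio *folio;

again:
	folio = mock_get_locked_folio(env);
	if (!folio)
		return -1;

	if (!folio->uptodate) {
		mock_read_folio(env, folio);
		folio->locked = true; /* re-take the lock after the read */
		if (folio->mapping != env->inode_mapping) {
			/* Lost the race: drop this folio and start over. */
			folio->locked = false;
			goto again;
		}
	}

	mock_set_extent_mapped(folio);
	folio->locked = false;
	return 0;
}
```

In this model, an environment with invalidations_left set to 1 loses the
race once and succeeds on the second lookup. Without the mapping
comparison, the first pass would reach mock_set_extent_mapped() with a
NULL mapping and trip its assertion.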

Note that the "fixes" patch here is not the one that introduced the
race (the very first iteration of this code from 2009) but a more recent
change that made this particular crash happen in practice.

Fixes: e7f1326cc2 ("btrfs: set page extent mapped after read_folio in relocate_one_page")
CC: stable@vger.kernel.org # 6.1+
Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: Boris Burkov <boris@bur.io>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Zhaoyang Li <lizy04@hust.edu.cn>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-06-04 14:40:22 +02:00
tests btrfs: tests: allocate dummy fs_info and root in test_find_delalloc() 2024-08-29 17:30:37 +02:00
acl.c
async-thread.c
async-thread.h btrfs: remove unused typedefs get_extent_t and btrfs_work_func_t 2022-07-25 17:45:36 +02:00
backref.c btrfs: fix information leak in btrfs_ioctl_logical_to_ino() 2024-05-02 16:29:28 +02:00
backref.h btrfs: ignore fiemap path cache if we have multiple leaves for a data extent 2022-10-11 14:48:07 +02:00
block-group.c btrfs: get zone unusable bytes while holding lock at btrfs_reclaim_bgs_work() 2025-06-04 14:40:04 +02:00
block-group.h btrfs: add and use helper to check if block group is used 2024-02-23 09:12:28 +01:00
block-rsv.c btrfs: calculate the right space for delayed refs when updating global reserve 2024-09-30 16:23:55 +02:00
block-rsv.h btrfs: calculate the right space for delayed refs when updating global reserve 2024-09-30 16:23:55 +02:00
btrfs_inode.h btrfs: use a runtime flag to indicate an inode is a free space inode 2022-09-26 12:28:07 +02:00
check-integrity.c fs/btrfs: Use the enum req_op and blk_opf_t types 2022-07-14 12:14:32 -06:00
check-integrity.h
compression.c btrfs: fix extent map use-after-free when adding pages to compressed bio 2024-09-04 13:25:00 +02:00
compression.h for-5.20-tag 2022-08-03 14:54:52 -07:00
ctree.c btrfs: fix use-after-free when COWing tree bock and tracing is enabled 2025-01-09 13:30:03 +01:00
ctree.h btrfs: rename and export __btrfs_cow_block() 2025-01-09 13:30:03 +01:00
delalloc-space.c btrfs: don't reserve space for checksums when writing to nocow files 2024-02-23 09:12:29 +01:00
delalloc-space.h btrfs: add the ability to use NO_FLUSH for data reservations 2022-09-29 17:08:28 +02:00
delayed-inode.c btrfs: change BUG_ON to assertion when checking for delayed_node root 2024-08-29 17:30:37 +02:00
delayed-inode.h btrfs: fix infinite directory reads 2024-01-31 16:17:05 -08:00
delayed-ref.c btrfs: reinitialize delayed ref list after deleting it from the list 2024-11-14 13:15:17 +01:00
delayed-ref.h btrfs: calculate the right space for delayed refs when updating global reserve 2024-09-30 16:23:55 +02:00
dev-replace.c btrfs: dev-replace: properly validate device names 2024-03-06 14:45:10 +00:00
dev-replace.h btrfs: add struct declarations in dev-replace.h 2022-09-26 12:28:07 +02:00
dir-item.c btrfs: fix passing 0 to ERR_PTR in btrfs_search_dir_index_item() 2024-11-01 01:56:06 +01:00
discard.c btrfs: make btrfs_discard_workfn() block_group ref explicit 2025-06-04 14:40:04 +02:00
discard.h
disk-io.c btrfs: fix non-empty delayed iputs list on unmount due to async workers 2025-06-04 14:40:04 +02:00
disk-io.h btrfs: fix double free of anonymous device after snapshot creation failure 2024-03-06 14:45:10 +00:00
export.c btrfs: export: handle invalid inode or root reference in btrfs_get_parent() 2024-04-13 13:05:01 +02:00
export.h btrfs: fix type of parameter generation in btrfs_get_dentry 2022-10-24 15:28:58 +02:00
extent_io.c btrfs: avoid linker error in btrfs_find_create_tree_block() 2025-06-04 14:40:04 +02:00
extent_io.h btrfs: move extent io tree unrelated prototypes to their appropriate header 2022-09-26 12:28:04 +02:00
extent_map.c btrfs: fix wrong block_start calculation for btrfs_drop_extent_map_range() 2024-06-21 14:35:38 +02:00
extent_map.h btrfs: get the next extent map during fiemap/lseek more efficiently 2023-04-26 14:28:38 +02:00
extent-io-tree.c btrfs: fix off-by-one in delalloc search during lseek 2023-01-12 12:01:56 +01:00
extent-io-tree.h btrfs: stop tracking failed reads in the I/O tree 2022-09-26 12:28:05 +02:00
extent-tree.c btrfs: don't BUG_ON() when 0 reference count at btrfs_lookup_extent_info() 2025-05-22 14:10:09 +02:00
file-item.c btrfs: mark the len field in struct btrfs_ordered_sum as unsigned 2024-01-10 17:10:35 +01:00
file.c btrfs: avoid page_lockend underflow in btrfs_punch_hole_lock_range() 2025-05-02 07:46:54 +02:00
free-space-cache.c btrfs: zoned: properly take lock to read/update block group's zoned variables 2024-08-29 17:30:15 +02:00
free-space-cache.h btrfs: remove use btrfs_remove_free_space_cache instead of variant 2022-09-26 12:27:58 +02:00
free-space-tree.c btrfs: convert btrfs_block_group::needs_free_space to runtime flag 2023-08-23 17:52:28 +02:00
free-space-tree.h btrfs: make clear_cache mount option to rebuild FST without disabling it 2023-05-17 11:53:42 +02:00
inode-item.c btrfs: use struct fscrypt_str instead of struct qstr 2023-10-10 22:00:36 +02:00
inode-item.h btrfs: use struct fscrypt_str instead of struct qstr 2023-10-10 22:00:36 +02:00
inode.c btrfs: fix the length of reserved qgroup to free 2025-04-25 10:44:03 +02:00
ioctl.c btrfs: qgroup: fix qgroup prealloc rsv leak in subvolume operations 2024-12-14 19:53:56 +01:00
Kconfig
locking.c btrfs: add block-group tree to lockdep classes 2023-07-19 16:22:13 +02:00
locking.h btrfs: implement a nowait option for tree searches 2022-09-26 12:46:42 +02:00
lzo.c btrfs: replace kmap() with kmap_local_page() in lzo.c 2022-07-25 17:45:33 +02:00
Makefile btrfs: move extent state init and alloc functions to their own file 2022-09-26 12:28:03 +02:00
misc.h btrfs: convert the io_failure_tree to a plain rb_tree 2022-09-26 12:28:02 +02:00
ordered-data.c btrfs: fix qgroup_free_reserved_data int overflow 2024-01-10 17:10:35 +01:00
ordered-data.h btrfs: mark the len field in struct btrfs_ordered_sum as unsigned 2024-01-10 17:10:35 +01:00
orphan.c
print-tree.c btrfs: avoid using fixed char array size for tree names 2024-08-14 13:52:59 +02:00
print-tree.h
props.c btrfs: remove the unnecessary result variables 2022-09-26 12:28:00 +02:00
props.h
qgroup.c btrfs: run delayed iputs when flushing delalloc 2024-09-04 13:24:55 +02:00
qgroup.h btrfs: fix qgroup_free_reserved_data int overflow 2024-01-10 17:10:35 +01:00
raid56.c btrfs: raid56: avoid double freeing for rbio if full_stripe_write() failed 2022-10-24 15:26:56 +02:00
raid56.h btrfs: properly abstract the parity raid bio handling 2022-09-26 12:27:59 +02:00
rcu-string.h btrfs: replace strncpy() with strscpy() 2023-01-12 12:01:55 +01:00
ref-verify.c btrfs: ref-verify: fix use-after-free after invalid ref action 2024-12-14 19:54:10 +01:00
ref-verify.h
reflink.c btrfs: replace sb::s_blocksize by fs_info::sectorsize 2024-08-29 17:30:42 +02:00
reflink.h
relocation.c btrfs: check folio mapping after unlock in relocate_one_folio() 2025-06-04 14:40:22 +02:00
root-tree.c btrfs: qgroup: fix qgroup prealloc rsv leak in subvolume operations 2024-12-14 19:53:56 +01:00
scrub.c btrfs: scrub: initialize ret in scrub_simple_mirror() to fix compilation warning 2024-07-11 12:47:10 +02:00
send.c btrfs: send: return -ENAMETOOLONG when attempting a path that is too long 2025-06-04 14:40:05 +02:00
send.h btrfs: send: allow protocol version 3 with CONFIG_BTRFS_DEBUG 2022-10-11 14:46:55 +02:00
space-info.c btrfs: zoned: fix zone_unusable accounting on making block group read-write again 2024-08-11 12:36:00 +02:00
space-info.h btrfs: zoned: fix zone_unusable accounting on making block group read-write again 2024-08-11 12:36:00 +02:00
struct-funcs.c btrfs: remove redundant check in up check_setget_bounds 2022-07-25 17:45:33 +02:00
subpage.c btrfs: convert process_page_range() to use filemap_get_folios_contig() 2022-09-11 20:26:03 -07:00
subpage.h
super.c btrfs: correctly escape subvol in btrfs_show_options() 2025-04-25 10:43:53 +02:00
sysfs.c btrfs: sysfs: fix direct super block member reads 2025-01-02 10:30:55 +01:00
sysfs.h
transaction.c btrfs: fix use-after-free when attempting to join an aborted transaction 2025-02-21 13:49:29 +01:00
transaction.h btrfs: fix race between direct IO write and fsync when using same fd 2024-09-12 11:10:29 +02:00
tree-checker.c btrfs: tree-checker: reject inline extent items with 0 ref count 2024-12-27 13:52:59 +01:00
tree-checker.h
tree-defrag.c btrfs: move the auto defrag code to defrag.c 2023-02-22 12:59:40 +01:00
tree-log.c btrfs: fix uninitialized pointer free on read_alloc_one_name() error 2024-10-22 15:56:39 +02:00
tree-log.h btrfs: use struct fscrypt_str instead of struct qstr 2023-10-10 22:00:36 +02:00
tree-mod-log.c
tree-mod-log.h
ulist.c
ulist.h
uuid-tree.c
verity.c btrfs: send: add support for fs-verity 2022-09-26 12:27:55 +02:00
volumes.c btrfs: do not clear read-only when adding sprout device 2024-12-14 19:54:37 +01:00
volumes.h btrfs: add a helper to read the superblock metadata_uuid 2023-09-23 11:11:08 +02:00
xattr.c btrfs: check if root is readonly while setting security xattr 2022-08-22 18:06:30 +02:00
xattr.h
zlib.c btrfs: zlib: zero-initialize zlib workspace 2023-02-14 19:11:40 +01:00
zoned.c btrfs: zoned: fix zone finishing with missing devices 2025-04-25 10:44:02 +02:00
zoned.h btrfs: zoned: clone zoned device info when cloning a device 2022-11-07 14:35:21 +01:00
zstd.c btrfs: zstd: replace kmap() with kmap_local_page() 2022-07-25 17:45:40 +02:00