linux-yocto/fs
Qu Wenruo fcb1f77b8e btrfs: do not allow relocation of partially dropped subvolumes
commit 4289b494ac553e74e86fed1c66b2bf9530bc1082 upstream.

[BUG]
There is an internal report that balance triggered transaction abort,
with the following call trace:

  item 85 key (594509824 169 0) itemoff 12599 itemsize 33
          extent refs 1 gen 197740 flags 2
          ref#0: tree block backref root 7
  item 86 key (594558976 169 0) itemoff 12566 itemsize 33
          extent refs 1 gen 197522 flags 2
          ref#0: tree block backref root 7
 ...
 BTRFS error (device loop0): extent item not found for insert, bytenr 594526208 num_bytes 16384 parent 449921024 root_objectid 934 owner 1 offset 0
 BTRFS error (device loop0): failed to run delayed ref for logical 594526208 num_bytes 16384 type 182 action 1 ref_mod 1: -117
 ------------[ cut here ]------------
 BTRFS: Transaction aborted (error -117)
 WARNING: CPU: 1 PID: 6963 at ../fs/btrfs/extent-tree.c:2168 btrfs_run_delayed_refs+0xfa/0x110 [btrfs]

And btrfs check doesn't report anything wrong related to the extent
tree.

[CAUSE]
The cause is a little complex, firstly the extent tree indeed doesn't
have the backref for 594526208.

The extent tree only have the following two backrefs around that bytenr
on-disk:

        item 65 key (594509824 METADATA_ITEM 0) itemoff 13880 itemsize 33
                refs 1 gen 197740 flags TREE_BLOCK
                tree block skinny level 0
                (176 0x7) tree block backref root CSUM_TREE
        item 66 key (594558976 METADATA_ITEM 0) itemoff 13847 itemsize 33
                refs 1 gen 197522 flags TREE_BLOCK
                tree block skinny level 0
                (176 0x7) tree block backref root CSUM_TREE

But the such missing backref item is not an corruption on disk, as the
offending delayed ref belongs to subvolume 934, and that subvolume is
being dropped:

        item 0 key (934 ROOT_ITEM 198229) itemoff 15844 itemsize 439
                generation 198229 root_dirid 256 bytenr 10741039104 byte_limit 0 bytes_used 345571328
                last_snapshot 198229 flags 0x1000000000001(RDONLY) refs 0
                drop_progress key (206324 EXTENT_DATA 2711650304) drop_level 2
                level 2 generation_v2 198229

And that offending tree block 594526208 is inside the dropped range of
that subvolume.  That explains why there is no backref item for that
bytenr and why btrfs check is not reporting anything wrong.

But this also shows another problem, as btrfs will do all the orphan
subvolume cleanup at a read-write mount.

So half-dropped subvolume should not exist after an RW mount, and
balance itself is also exclusive to subvolume cleanup, meaning we
shouldn't hit a subvolume half-dropped during relocation.

The root cause is, there is no orphan item for this subvolume.
In fact there are 5 subvolumes from around 2021 that have the same
problem.

It looks like the original report has some older kernels running, and
caused those zombie subvolumes.

Thankfully upstream commit 8d488a8c7b ("btrfs: fix subvolume/snapshot
deletion not triggered on mount") has long fixed the bug.

[ENHANCEMENT]
For repairing such old fs, btrfs-progs will be enhanced.

Considering how delayed the problem will show up (at run delayed ref
time) and at that time we have to abort transaction already, it is too
late.

Instead here we reject any half-dropped subvolume for reloc tree at the
earliest time, preventing confusion and extra time wasted on debugging
similar bugs.

CC: stable@vger.kernel.org # 5.15+
Reviewed-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-08-28 16:26:04 +02:00
..
9p fs/9p: fix uninitialized values during inode evict 2024-11-22 15:37:34 +01:00
adfs
affs affs: don't write overlarge OFS data block size fields 2025-04-10 14:33:38 +02:00
afs afs: Fix the server_list to unuse a displaced server rather than putting it 2025-03-07 16:56:43 +01:00
autofs autofs: fix memory leak of waitqueues in autofs_catatonic_mode 2023-09-23 11:10:59 +02:00
befs
bfs
btrfs btrfs: do not allow relocation of partially dropped subvolumes 2025-08-28 16:26:04 +02:00
cachefiles cachefiles: Fix the incorrect return value in __cachefiles_write() 2025-07-24 08:51:51 +02:00
ceph ceph: fix possible integer overflow in ceph_zero_objects() 2025-07-06 10:57:56 +02:00
coda
configfs configfs: Do not override creating attribute file failure in populate_attrs() 2025-06-27 11:07:25 +01:00
cramfs
crypto fs: Create a generic is_dot_dotdot() utility 2024-10-17 15:21:17 +02:00
debugfs debugfs: fix automount d_fsdata usage 2024-01-20 11:50:04 +01:00
devpts
dlm dlm: make tcp still work in multi-link env 2025-06-04 14:40:05 +02:00
ecryptfs fs: Create a generic is_dot_dotdot() utility 2024-10-17 15:21:17 +02:00
efivarfs efivarfs: Fix error on non-existent file 2024-12-27 13:52:55 +01:00
efs
erofs erofs: address D-cache aliasing 2025-08-15 12:04:51 +02:00
exfat exfat: fix the infinite loop in exfat_find_last_cluster() 2025-04-10 14:33:37 +02:00
exportfs exportfs: use pr_debug for unreachable debug statements 2024-03-06 14:45:15 +00:00
ext2 ext2: Handle fiemap on empty files to prevent EINVAL 2025-08-28 16:25:51 +02:00
ext4 ext4: fix largest free orders lists corruption on mb_optimize_scan switch 2025-08-28 16:26:03 +02:00
f2fs f2fs: fix to calculate dirty data during has_not_enough_free_secs() 2025-08-15 12:05:07 +02:00
fat fat: fix uninitialized variable 2024-10-22 15:56:43 +02:00
freevxfs
fscache netfs/fscache: Add a memory barrier for FSCACHE_VOLUME_CREATING 2024-12-14 19:53:15 +01:00
fuse fuse: Return EPERM rather than ENOSYS from link() 2025-06-04 14:40:02 +02:00
gfs2 gfs2: move msleep to sleepable context 2025-06-27 11:07:25 +01:00
hfs hfs: fix not erasing deleted b-tree node issue 2025-08-28 16:25:51 +02:00
hfsplus hfsplus: don't use BUG_ON() in hfsplus_create_attributes_file() 2025-08-28 16:25:51 +02:00
hostfs
hpfs
hugetlbfs mm/hugetlb: add hugetlb_folio_subpool() helpers 2024-05-17 11:55:51 +02:00
iomap iomap: avoid avoid truncating 64-bit offset to 32 bits 2025-01-23 17:17:12 +01:00
isofs isofs: Verify inode mode when loading from disk 2025-07-24 08:51:49 +02:00
jbd2 jbd2: fix data-race and null-ptr-deref in jbd2_journal_dirty_metadata() 2025-06-27 11:07:26 +01:00
jffs2 jffs2: check jffs2_prealloc_raw_node_refs() result in few other places 2025-06-27 11:07:36 +01:00
jfs jfs: upper bound check of tree index in dbAllocAG 2025-08-28 16:26:00 +02:00
kernfs kernfs: Relax constraint in draining guard 2025-06-27 11:07:11 +01:00
lockd nfsd: stop setting ->pg_stats for unused stats 2024-08-19 06:00:04 +02:00
minix
netfs netfs: Only call folio_start_fscache() one time for each folio 2023-10-06 14:56:32 +02:00
nfs pNFS: Fix uninited ptr deref in block/scsi layout 2025-08-28 16:26:03 +02:00
nfs_common
nfsd NFSD: detect mismatch of file handle and delegation stateid in OPEN op 2025-08-28 16:25:48 +02:00
nilfs2 nilfs2: reject invalid file types when reading inodes 2025-08-15 12:04:49 +02:00
nls fs/nls: make load_nls() take a const parameter 2023-09-13 09:42:22 +02:00
notify fsnotify: fix sending inotify event with unexpected filename 2024-12-14 19:53:59 +01:00
ntfs
ntfs3 fs/ntfs3: correctly create symlink for relative path 2025-08-28 16:25:51 +02:00
ocfs2 ocfs2: fix possible memory leak in ocfs2_finish_quota_recovery 2025-06-27 11:07:13 +01:00
omfs fs: omfs: Use flexible-array member in struct omfs_extent 2025-07-06 10:58:03 +02:00
openpromfs openpromfs: finish conversion to the new mount API 2024-06-12 11:03:03 +02:00
orangefs fs/orangefs: use snprintf() instead of sprintf() 2025-08-28 16:25:59 +02:00
overlayfs ovl: Check for NULL d_inode() in ovl_dentry_upper() 2025-07-06 10:57:56 +02:00
proc proc: use the same treatment to check proc_lseek as ones for proc_read_iter et.al 2025-08-15 12:05:03 +02:00
pstore pstore/blk: trivial typo fixes 2025-02-21 13:48:53 +01:00
qnx4
qnx6
quota quota: flush quota_release_work upon quota writeback 2024-12-14 19:54:10 +01:00
ramfs shmem: use ramfs_kill_sb() for kill_sb method of ramfs-based tmpfs 2023-07-19 16:22:11 +02:00
reiserfs reiserfs: fix uninit-value in comp_keys 2024-08-29 17:30:20 +02:00
romfs
smb cifs: Fix calling CIFSFindFirst() for root path without msearch 2025-08-28 16:25:59 +02:00
squashfs Squashfs: check return result of sb_min_blocksize 2025-06-27 11:07:13 +01:00
sysfs fs: sysfs: Fix reference leak in sysfs_break_active_protection() 2024-04-27 17:07:16 +02:00
sysv sysv: don't call sb_bread() with pointers_lock held 2024-04-13 13:05:05 +02:00
tracefs tracefs: Add missing lockdown check to tracefs_create_dir() 2023-09-23 11:11:12 +02:00
ubifs ubifs: skip dumping tnc tree when zroot is null 2025-02-21 13:49:21 +01:00
udf udf: Verify partition map count 2025-08-28 16:25:51 +02:00
ufs
unicode Revert "unicode: Don't special case ignorable code points" 2024-12-14 19:54:50 +01:00
vboxsf vboxsf: fix building with GCC 15 2025-03-28 21:58:51 +01:00
verity fsverity: skip PKCS#7 parser when keyring is empty 2023-09-13 09:43:03 +02:00
xfs xfs: reset rootdir extent size hint after growfsrt 2025-06-27 11:07:20 +01:00
zonefs zonefs: Improve error handling 2024-02-23 09:12:45 +01:00
aio.c fs/aio: Check IOCB_AIO_RW before the struct aio_kiocb conversion 2024-04-03 15:19:42 +02:00
anon_inodes.c fs: export anon_inode_make_secure_inode() and fix secretmem LSM bypass 2025-07-17 18:32:10 +02:00
attr.c attr: block mode changes of symlinks 2023-09-23 11:11:10 +02:00
bad_inode.c
binfmt_elf_fdpic.c binfmt: Fix whitespace issues 2025-05-22 14:09:58 +02:00
binfmt_elf_test.c
binfmt_elf.c binfmt_elf: Move brk for static PIE even if ASLR disabled 2025-05-22 14:09:59 +02:00
binfmt_flat.c binfmt_flat: Fix integer overflow bug on 32 bit systems 2025-02-21 13:49:39 +01:00
binfmt_misc.c binfmt_misc: cleanup on filesystem umount 2024-08-29 17:30:30 +02:00
binfmt_script.c
buffer.c
char_dev.c
compat_binfmt_elf.c
coredump.c coredump: hand a pidfd to the usermode coredump helper 2025-06-04 14:40:25 +02:00
d_path.c
dax.c fsdax: dax_unshare_iter needs to copy entire blocks 2024-11-08 16:26:42 +01:00
dcache.c fs: better handle deep ancestor chains in is_subdir() 2024-07-25 09:49:18 +02:00
direct-io.c
drop_caches.c
eventfd.c eventfd: prevent underflow for eventfd semaphores 2023-09-13 09:42:27 +02:00
eventpoll.c eventpoll: Fix semi-unbounded recursion 2025-08-28 16:25:49 +02:00
exec.c binfmt: Fix whitespace issues 2025-05-22 14:09:58 +02:00
fcntl.c fs: Fix file_set_fowner LSM hook inconsistencies 2024-10-17 15:21:23 +02:00
fhandle.c do_sys_name_to_handle(): use kzalloc() to fix kernel-infoleak 2024-03-26 18:20:27 -04:00
file_table.c fs: fix proc_handler for sysctl_nr_open 2025-02-21 13:48:53 +01:00
file.c fs: Prevent file descriptor table allocations exceeding INT_MAX 2025-08-28 16:25:49 +02:00
filesystems.c fs/filesystems: Fix potential unsigned integer underflow in fs_name() 2025-06-27 11:07:23 +01:00
fs_context.c vfs, security: Fix automount superblock LSM init problem, preventing NFS sb sharing 2023-09-13 09:42:28 +02:00
fs_parser.c
fs_pin.c
fs_struct.c
fs_types.c
fs-writeback.c writeback, cgroup: switch inodes with dirty timestamps to release dying cgwbs 2023-11-20 11:51:50 +01:00
fsopen.c
init.c
inode.c fs/inode: Prevent dump_mapping() accessing invalid dentry.d_name.name 2024-12-14 19:53:13 +01:00
internal.h nfs: use vfs setgid helper 2023-08-30 16:11:10 +02:00
ioctl.c lsm: new security_file_ioctl_compat() hook 2024-01-31 16:17:00 -08:00
Kconfig nfs: add missing selections of CONFIG_CRC32 2025-04-25 10:43:52 +02:00
Kconfig.binfmt
kernel_read_file.c
libfs.c better lockdep annotations for simple_recursive_removal() 2025-08-28 16:25:51 +02:00
locks.c filelock: Fix fcntl/close race recovery compat path 2024-07-27 11:32:19 +02:00
Makefile smb: move client and server files to common directory fs/smb 2023-06-28 11:12:40 +02:00
mbcache.c
mount.h
mpage.c
namei.c fuse: don't truncate cached, mutated symlink 2025-03-28 21:58:53 +01:00
namespace.c clone_private_mnt(): make sure that caller has CAP_SYS_ADMIN in the right userns 2025-07-24 08:51:54 +02:00
no-block.c
nsfs.c
open.c openat2: explicitly return -E2BIG for (usize > PAGE_SIZE) 2024-11-01 01:56:06 +01:00
pipe.c fs/pipe: Fix lockdep false-positive in watchqueue pipe_write() 2024-04-10 16:28:30 +02:00
pnode.c
pnode.h
posix_acl.c
proc_namespace.c
read_write.c
readdir.c
remap_range.c
select.c hrtimer: Use and report correct timerslack values for realtime tasks 2025-03-28 21:58:48 +01:00
seq_file.c
signalfd.c
splice.c mm: merge folio_has_private()/filemap_release_folio() call pairs 2024-01-10 17:10:31 +01:00
stack.c
stat.c
statfs.c statfs: enforce statfs[64] structure initialization 2023-05-24 17:32:51 +01:00
super.c fs: don't allow non-init s_user_ns for filesystems without FS_USERNS_MOUNT 2024-08-03 08:49:53 +02:00
sync.c
sysctls.c
timerfd.c
userfaultfd.c mm/uffd: fix vma operation where start addr cuts part of vma 2025-06-27 11:07:04 +01:00
utimes.c
xattr.c