Go to file
Lorenzo Stoakes 6aa18653e1 mm: drop the assumption that VM_SHARED always implies writable
[ Upstream commit e8e17ee90e ]

Patch series "permit write-sealed memfd read-only shared mappings", v4.

The man page for fcntl() describing memfd file seals states the following
about F_SEAL_WRITE:-

    Furthermore, trying to create new shared, writable memory-mappings via
    mmap(2) will also fail with EPERM.

With emphasis on 'writable'.  In turns out in fact that currently the
kernel simply disallows all new shared memory mappings for a memfd with
F_SEAL_WRITE applied, rendering this documentation inaccurate.

This matters because users are therefore unable to obtain a shared mapping
to a memfd after write sealing altogether, which limits their usefulness.
This was reported in the discussion thread [1] originating from a bug
report [2].

This is a product of both using the struct address_space->i_mmap_writable
atomic counter to determine whether writing may be permitted, and the
kernel adjusting this counter when any VM_SHARED mapping is performed and
more generally implicitly assuming VM_SHARED implies writable.

It seems sensible that we should only update this mapping if VM_MAYWRITE
is specified, i.e.  whether it is possible that this mapping could at any
point be written to.

If we do so then all we need to do to permit write seals to function as
documented is to clear VM_MAYWRITE when mapping read-only.  It turns out
this functionality already exists for F_SEAL_FUTURE_WRITE - we can
therefore simply adapt this logic to do the same for F_SEAL_WRITE.

We then hit a chicken and egg situation in mmap_region() where the check
for VM_MAYWRITE occurs before we are able to clear this flag.  To work
around this, perform this check after we invoke call_mmap(), with careful
consideration of error paths.

Thanks to Andy Lutomirski for the suggestion!

[1]:https://lore.kernel.org/all/20230324133646.16101dfa666f253c4715d965@linux-foundation.org/
[2]:https://bugzilla.kernel.org/show_bug.cgi?id=217238

This patch (of 3):

There is a general assumption that VMAs with the VM_SHARED flag set are
writable.  If the VM_MAYWRITE flag is not set, then this is simply not the
case.

Update those checks which affect the struct address_space->i_mmap_writable
field to explicitly test for this by introducing
[vma_]is_shared_maywrite() helper functions.

This remains entirely conservative, as the lack of VM_MAYWRITE guarantees
that the VMA cannot be written to.

Link: https://lkml.kernel.org/r/cover.1697116581.git.lstoakes@gmail.com
Link: https://lkml.kernel.org/r/d978aefefa83ec42d18dfa964ad180dbcde34795.1697116581.git.lstoakes@gmail.com
Signed-off-by: Lorenzo Stoakes <lstoakes@gmail.com>
Suggested-by: Andy Lutomirski <luto@kernel.org>
Reviewed-by: Jan Kara <jack@suse.cz>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Hugh Dickins <hughd@google.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Muchun Song <muchun.song@linux.dev>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: stable@vger.kernel.org
[isaacmanjarres: resolved merge conflicts due to
due to refactoring that happened in upstream commit
5de195060b ("mm: resolve faulty mmap_region() error path behaviour")]
Signed-off-by: Isaac J. Manjarres <isaacmanjarres@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-08-28 16:26:12 +02:00
arch KVM: x86: Take irqfds.lock when adding/deleting IRQ bypass producer 2025-08-28 16:26:12 +02:00
block block: reject invalid operation in submit_bio_noacct 2025-08-28 16:26:10 +02:00
certs sign-file,extract-cert: use pkcs11 provider for OPENSSL MAJOR >= 3 2025-04-25 10:44:04 +02:00
crypto crypto: xts - Only add ecb if it is not already there 2025-06-27 11:07:06 +01:00
Documentation dt-bindings: display: sprd,sharkl3-dsi-host: Fix missing clocks constraints 2025-08-28 16:26:06 +02:00
drivers crypto: qat - fix ring to service map for QAT GEN4 2025-08-28 16:26:12 +02:00
fs btrfs: populate otime when logging an inode item 2025-08-28 16:26:12 +02:00
include mm: drop the assumption that VM_SHARED always implies writable 2025-08-28 16:26:12 +02:00
init sched/isolation: Make CONFIG_CPU_ISOLATION depend on CONFIG_SMP 2025-05-02 07:47:04 +02:00
io_uring io_uring/poll: fix POLLERR handling 2025-07-24 08:51:48 +02:00
ipc ipc: fix to protect IPCS lookups using RCU 2025-06-27 11:07:30 +01:00
kernel mm: drop the assumption that VM_SHARED always implies writable 2025-08-28 16:26:12 +02:00
lib maple_tree: fix mt_destroy_walk() on root leaf node 2025-07-17 18:32:09 +02:00
LICENSES
mm mm: drop the assumption that VM_SHARED always implies writable 2025-08-28 16:26:12 +02:00
net mptcp: reset fallback status gracefully at disconnect() time 2025-08-28 16:26:12 +02:00
rust rust: module: place cleanup_module() in .exit.text section 2025-07-06 10:57:54 +02:00
samples samples: mei: Fix building on musl libc 2025-08-15 12:04:55 +02:00
scripts kconfig: lxdialog: fix 'space' to (de)select options 2025-08-28 16:26:02 +02:00
security apparmor: use the condition in AA_BUG_FMT even with debug disabled 2025-08-28 16:26:01 +02:00
sound ASoC: fsl_sai: replace regmap_write with regmap_update_bits 2025-08-28 16:26:03 +02:00
tools tools/nolibc: fix spelling of FD_SETBITMASK in FD_* macros 2025-08-28 16:26:04 +02:00
usr kbuild: hdrcheck: fix cross build with clang 2025-06-27 11:07:25 +01:00
virt KVM: Fix a data race on last_boosted_vcpu in kvm_vcpu_on_spin() 2024-06-27 13:46:21 +02:00
.clang-format
.cocciconfig
.get_maintainer.ignore
.gitattributes
.gitignore Remove *.orig pattern from .gitignore 2024-10-17 15:21:15 +02:00
.mailmap
.rustfmt.toml
COPYING
CREDITS
Kbuild
Kconfig
MAINTAINERS sign-file,extract-cert: move common SSL helper functions to a header 2025-04-25 10:44:04 +02:00
Makefile Linux 6.1.148 2025-08-15 12:05:13 +02:00
README

Linux kernel

There are several guides for kernel developers and users. These guides can be rendered in a number of formats, like HTML and PDF. Please read Documentation/admin-guide/README.rst first.

In order to build the documentation, use make htmldocs or make pdfdocs. The formatted documentation can also be read online at:

https://www.kernel.org/doc/html/latest/

There are various text files in the Documentation/ subdirectory, several of them using the Restructured Text markup notation.

Please read the Documentation/process/changes.rst file, as it contains the requirements for building and running the kernel, and information about the problems which may result by upgrading your kernel.