linux-yocto/Documentation/userspace-api
Rodrigo Campos 31a5fc473d seccomp: Support atomic "addfd + send reply"
[ Upstream commit 0ae71c7720 ]

Alban Crequy reported a race condition that userspace faces when using
seccomp notify to add some fds and make the syscall return them[1].

The problem is that the process handling the syscalls (the agent) on
behalf of another userspace process (the target) currently needs two
separate ioctl() calls: SECCOMP_IOCTL_NOTIF_ADDFD to allocate the fd and
SECCOMP_IOCTL_NOTIF_SEND to return that value. The agent can therefore
complete the first ioctl() and add a file descriptor, only for the
target to be interrupted (EINTR) before the agent issues the second
ioctl() call.
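
The two-step sequence described above can be sketched roughly as follows.
This is an illustration only, not code from the patch: the helper name
addfd_then_send() is hypothetical, and the #ifndef fallback merely mirrors
the upstream uapi definitions for build hosts whose headers predate the
ADDFD ioctl.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <sys/ioctl.h>
#include <linux/types.h>
#include <linux/seccomp.h>

/* Fallback for older headers; mirrors include/uapi/linux/seccomp.h. */
#ifndef SECCOMP_IOCTL_NOTIF_ADDFD
struct seccomp_notif_addfd {
	__u64 id;
	__u32 flags;
	__u32 srcfd;
	__u32 newfd;
	__u32 newfd_flags;
};
#define SECCOMP_IOCTL_NOTIF_ADDFD _IOW('!', 3, struct seccomp_notif_addfd)
#endif

/*
 * Hypothetical helper (not part of the patch): non-atomic "add fd, then
 * reply". Returns the fd number installed in the target, or -errno.
 */
int addfd_then_send(int notify_fd, __u64 id, int srcfd)
{
	struct seccomp_notif_addfd addfd;
	struct seccomp_notif_resp resp;
	int remote_fd;

	assert(srcfd >= 0); /* caller must pass a valid local descriptor */

	memset(&addfd, 0, sizeof(addfd));
	addfd.id = id;
	addfd.srcfd = srcfd;

	/* Step 1: install srcfd in the target; returns the target's fd number. */
	remote_fd = ioctl(notify_fd, SECCOMP_IOCTL_NOTIF_ADDFD, &addfd);
	if (remote_fd < 0)
		return -errno;

	/* Race window: the target's syscall can be interrupted right here. */

	memset(&resp, 0, sizeof(resp));
	resp.id = id;
	resp.val = remote_fd;

	/*
	 * Step 2: if the target was interrupted in the meantime, this call
	 * fails and the fd added in step 1 is left behind in the target.
	 */
	if (ioctl(notify_fd, SECCOMP_IOCTL_NOTIF_SEND, &resp) < 0)
		return -errno;

	return remote_fd;
}
```

In an agent, notify_fd would be the seccomp listener fd and id the value
read via SECCOMP_IOCTL_NOTIF_RECV; both are assumed to be obtained
elsewhere.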

This patch adds a flag to the ADDFD ioctl() so it adds the fd and
returns that value atomically to the target program, as suggested by
Kees Cook[2]. This is done by simply allowing
seccomp_do_user_notification() to add the fd and return it when the flag
is set. The target therefore wakes up from the wait in
seccomp_do_user_notification() either to interrupt the syscall or to add
the fd and return it.

This "allocate an fd and return" functionality is useful for syscalls
whose only result is a file descriptor passed back as the return value,
like accept(2). Syscalls that hand back a file descriptor some other way
(or return more than one fd), like socketpair(), pipe(), or recvmsg()
with SCM_RIGHTS, will not work with this flag.

This effectively combines SECCOMP_IOCTL_NOTIF_ADDFD and
SECCOMP_IOCTL_NOTIF_SEND into an atomic operation. Neither the
notification's return value nor its error can be set by the user. Upon
successful invocation of the SECCOMP_IOCTL_NOTIF_ADDFD ioctl with the
SECCOMP_ADDFD_FLAG_SEND flag, the notifying process's errno will be 0,
and the return value will be the file descriptor number that was
installed.
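
A minimal sketch of the atomic variant, under the same caveats: the
helper name addfd_and_send() is hypothetical, and the #ifndef fallbacks
only mirror the upstream uapi header for older build hosts. On a kernel
without this patch the flag is expected to be rejected with EINVAL.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <sys/ioctl.h>
#include <linux/types.h>
#include <linux/seccomp.h>

/* Fallbacks for older headers; mirror include/uapi/linux/seccomp.h. */
#ifndef SECCOMP_IOCTL_NOTIF_ADDFD
struct seccomp_notif_addfd {
	__u64 id;
	__u32 flags;
	__u32 srcfd;
	__u32 newfd;
	__u32 newfd_flags;
};
#define SECCOMP_IOCTL_NOTIF_ADDFD _IOW('!', 3, struct seccomp_notif_addfd)
#endif
#ifndef SECCOMP_ADDFD_FLAG_SEND
#define SECCOMP_ADDFD_FLAG_SEND (1UL << 1)
#endif

/*
 * Hypothetical helper (not part of the patch): install srcfd in the
 * target and complete the notification in a single ioctl(). Returns the
 * fd number installed in the target, or -errno.
 */
int addfd_and_send(int notify_fd, __u64 id, int srcfd)
{
	struct seccomp_notif_addfd addfd;
	int remote_fd;

	assert(srcfd >= 0); /* caller must pass a valid local descriptor */

	memset(&addfd, 0, sizeof(addfd));
	addfd.id = id;
	addfd.srcfd = srcfd;
	addfd.flags = SECCOMP_ADDFD_FLAG_SEND;

	/*
	 * No separate SECCOMP_IOCTL_NOTIF_SEND follows: on success the
	 * target's syscall returns the new fd number, with errno 0.
	 */
	remote_fd = ioctl(notify_fd, SECCOMP_IOCTL_NOTIF_ADDFD, &addfd);
	if (remote_fd < 0)
		return -errno;

	return remote_fd;
}
```

Because the reply is sent by the same call, the agent has no chance to
pick a different return value or error for the target; that matches the
restriction stated above.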

[1]: https://lore.kernel.org/lkml/CADZs7q4sw71iNHmV8EOOXhUKJMORPzF7thraxZYddTZsxta-KQ@mail.gmail.com/
[2]: https://lore.kernel.org/lkml/202012011322.26DCBC64F2@keescook/

Signed-off-by: Rodrigo Campos <rodrigo@kinvolk.io>
Signed-off-by: Sargun Dhillon <sargun@sargun.me>
Acked-by: Tycho Andersen <tycho@tycho.pizza>
Acked-by: Christian Brauner <christian.brauner@ubuntu.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20210517193908.3113-4-sargun@sargun.me
Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-07-14 17:06:29 +02:00
..
accelerators                docs: ocxl.rst: add it to the uAPI book  2019-07-15 11:03:02 -03:00
ebpf                        docs/bpf: Add bpf() syscall command reference  2021-03-04 18:39:46 -08:00
ioctl                       It's been a relatively busy cycle in docsland, though more than usually  2021-04-26 13:22:43 -07:00
media                       media: hevc: Fix dependent slice segment flags  2021-07-14 17:06:22 +02:00
index.rst                   Add Landlock, a new LSM from Mickaël Salaün <mic@linux.microsoft.com>  2021-05-01 18:50:44 -07:00
iommu.rst                   docs: IOMMU user API  2020-10-01 14:52:46 +02:00
landlock.rst                landlock: Add user and kernel documentation  2021-04-22 12:22:11 -07:00
no_new_privs.rst            doc: ReSTify no_new_privs.txt  2017-05-18 10:30:09 -06:00
seccomp_filter.rst          seccomp: Support atomic "addfd + send reply"  2021-07-14 17:06:29 +02:00
spec_ctrl.rst               Documentation: Add section about CPU vulnerabilities for Spectre  2019-06-26 11:42:41 -06:00
sysfs-platform_profile.rst  Documentation: Add documentation for new platform_profile sysfs attribute  2020-12-30 18:28:57 +01:00
unshare.rst                 doc-rst: fix inline emphasis in unshare.rst  2017-05-18 10:23:10 -06:00