If a memory copy function fails to copy the whole buffer,
a positive integar with the remaining bytes is returned.
In binder_translate_fd_array() this can result in an fd being
skipped due to the failed copy, but the loop continues
processing fds since the early return condition expects a
negative integer on error.
Fix by returning "ret > 0 ? -EINVAL : ret" to handle this case.
Fixes: bb4a2e48d5 ("binder: return errors from buffer copy functions")
Suggested-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Christian Brauner <christian.brauner@ubuntu.com>
Signed-off-by: Todd Kjos <tkjos@google.com>
Link: https://lore.kernel.org/r/20211130185152.437403-2-tkjos@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
binder.c file comment produce warning with checkpatch as below:
WARNING: Possible repeated word: 'for'
Remove the repeated word from the comment avoid this warning.
Signed-off-by: Ajith P V <ajithpv.linux@gmail.com>
Link: https://lore.kernel.org/r/20211125122218.6767-1-ajithpv.linux@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This is a partial revert of commit
29bc22ac5e ("binder: use euid from cred instead of using task").
Setting sender_euid using proc->cred caused some Android system test
regressions that need further investigation. It is a partial
reversion because subsequent patches rely on proc->cred.
Fixes: 29bc22ac5e ("binder: use euid from cred instead of using task")
Cc: stable@vger.kernel.org # 4.4+
Acked-by: Christian Brauner <christian.brauner@ubuntu.com>
Signed-off-by: Todd Kjos <tkjos@google.com>
Change-Id: I9b1769a3510fed250bb21859ef8beebabe034c66
Link: https://lore.kernel.org/r/20211112180720.2858135-1-tkjos@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Here is the big set of char and misc and other tiny driver subsystem
updates for 5.16-rc1.
Loads of things in here, all of which have been in linux-next for a
while with no reported problems (except for one called out below.)
Included are:
- habanana labs driver updates, including dma_buf usage,
reviewed and acked by the dma_buf maintainers
- iio driver update (going through this tree not staging as they
really do not belong going through that tree anymore)
- counter driver updates
- hwmon driver updates that the counter drivers needed, acked by
the hwmon maintainer
- xillybus driver updates
- binder driver updates
- extcon driver updates
- dma_buf module namespaces added (will cause a build error in
arm64 for allmodconfig, but that change is on its way through
the drm tree)
- lkdtm driver updates
- pvpanic driver updates
- phy driver updates
- virt acrn and nitr_enclaves driver updates
- smaller char and misc driver updates
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCYYPX2A8cZ3JlZ0Brcm9h
aC5jb20ACgkQMUfUDdst+ymUUgCbB4EKysgLuXYdjUalZDx+vvZO4k0AniS14O4k
F+2dVSZ5WX6wumUzCaA6
=bXQM
-----END PGP SIGNATURE-----
Merge tag 'char-misc-5.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc
Pull char/misc driver updates from Greg KH:
"Here is the big set of char and misc and other tiny driver subsystem
updates for 5.16-rc1.
Loads of things in here, all of which have been in linux-next for a
while with no reported problems (except for one called out below.)
Included are:
- habanana labs driver updates, including dma_buf usage, reviewed and
acked by the dma_buf maintainers
- iio driver update (going through this tree not staging as they
really do not belong going through that tree anymore)
- counter driver updates
- hwmon driver updates that the counter drivers needed, acked by the
hwmon maintainer
- xillybus driver updates
- binder driver updates
- extcon driver updates
- dma_buf module namespaces added (will cause a build error in arm64
for allmodconfig, but that change is on its way through the drm
tree)
- lkdtm driver updates
- pvpanic driver updates
- phy driver updates
- virt acrn and nitr_enclaves driver updates
- smaller char and misc driver updates"
* tag 'char-misc-5.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (386 commits)
comedi: dt9812: fix DMA buffers on stack
comedi: ni_usb6501: fix NULL-deref in command paths
arm64: errata: Enable TRBE workaround for write to out-of-range address
arm64: errata: Enable workaround for TRBE overwrite in FILL mode
coresight: trbe: Work around write to out of range
coresight: trbe: Make sure we have enough space
coresight: trbe: Add a helper to determine the minimum buffer size
coresight: trbe: Workaround TRBE errata overwrite in FILL mode
coresight: trbe: Add infrastructure for Errata handling
coresight: trbe: Allow driver to choose a different alignment
coresight: trbe: Decouple buffer base from the hardware base
coresight: trbe: Add a helper to pad a given buffer area
coresight: trbe: Add a helper to calculate the trace generated
coresight: trbe: Defer the probe on offline CPUs
coresight: trbe: Fix incorrect access of the sink specific data
coresight: etm4x: Add ETM PID for Kryo-5XX
coresight: trbe: Prohibit trace before disabling TRBE
coresight: trbe: End the AUX handle on truncation
coresight: trbe: Do not truncate buffer on IRQ
coresight: trbe: Fix handling of spurious interrupts
...
-----BEGIN PGP SIGNATURE-----
iQJIBAABCAAyFiEES0KozwfymdVUl37v6iDy2pc3iXMFAmGANbAUHHBhdWxAcGF1
bC1tb29yZS5jb20ACgkQ6iDy2pc3iXNaMBAAg+9gZr0F7xiafu8JFZqZfx/AQdJ2
G2cn3le+/tXGZmF8m/+82lOaR6LeQLatgSDJNSkXWkKr0nRwseQJDbtRfvYJdn0t
Ax05/Fmz6OGxQ2wgRYgaFiSrKpE5p3NhDtiLFVdkCJaQNe/8DZOc7NhBl6EjZf3x
ubhl2hUiJ4AmiXGwcYhr4uKgP4nhW8OM1/OkskVi+bBMmLA8KTY9kslmIDP5E3BW
29W4qhqeLNQupY5dGMEMVcyxY9ZUWpO39q4uOaQVZrUGE7xABkj/jhnxT5gFTSlI
pu8VhsYXm9KuRVveIsv0L5SZfadwoM9YAl7ki1wD3W5rHqOAte3rBTm6VmNlQwfU
MqxP65Jiyxudxet5Be3/dCRH/+MDQuwBxivgmZXbeVxor2SeznVb0GDaEUC5FSHu
CJIgWtQzsPJMxgAEGXN4F3QGP0htTTJni56GUPOsrf4TIBW02TT+oLTLFRIokQQL
INNOfwVSRXElnCsvxsHR4oB+JZ9pJyBaAmeupcQ6jmcKiWlbLj4s+W0U0pM5h91v
hmMpz7KMxrX6gVL4gB2Jj4aN3r5YRbq26NBu6D+wdwwBTeTTocaHSpAqkv4buClf
uNk3cG8Hkp8TTg9cM8jYgpxMyzKH/AI/Uw3VhEa1xCiq2Ck3DgfnZvnvcRRaZevU
FPgmwgqePJXGi60=
=sb8J
-----END PGP SIGNATURE-----
Merge tag 'selinux-pr-20211101' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux
Pull selinux updates from Paul Moore:
- Add LSM/SELinux/Smack controls and auditing for io-uring.
As usual, the individual commit descriptions have more detail, but we
were basically missing two things which we're adding here:
+ establishment of a proper audit context so that auditing of
io-uring ops works similarly to how it does for syscalls (with
some io-uring additions because io-uring ops are *not* syscalls)
+ additional LSM hooks to enable access control points for some of
the more unusual io-uring features, e.g. credential overrides.
The additional audit callouts and LSM hooks were done in conjunction
with the io-uring folks, based on conversations and RFC patches
earlier in the year.
- Fixup the binder credential handling so that the proper credentials
are used in the LSM hooks; the commit description and the code
comment which is removed in these patches are helpful to understand
the background and why this is the proper fix.
- Enable SELinux genfscon policy support for securityfs, allowing
improved SELinux filesystem labeling for other subsystems which make
use of securityfs, e.g. IMA.
* tag 'selinux-pr-20211101' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux:
security: Return xattr name from security_dentry_init_security()
selinux: fix a sock regression in selinux_ip_postroute_compat()
binder: use cred instead of task for getsecid
binder: use cred instead of task for selinux checks
binder: use euid from cred instead of using task
LSM: Avoid warnings about potentially unused hook variables
selinux: fix all of the W=1 build warnings
selinux: make better use of the nf_hook_state passed to the NF hooks
selinux: fix race condition when computing ocontext SIDs
selinux: remove unneeded ipv6 hook wrappers
selinux: remove the SELinux lockdown implementation
selinux: enable genfscon labeling for securityfs
Smack: Brutalist io_uring support
selinux: add support for the io_uring access controls
lsm,io_uring: add LSM hooks to io_uring
io_uring: convert io_uring to the secure anon inode interface
fs: add anon_inode_getfile_secure() similar to anon_inode_getfd_secure()
audit: add filtering for io_uring records
audit,io_uring,io-wq: add some basic audit support to io_uring
audit: prepare audit_context for use in calling contexts beyond syscalls
When freeing txn buffers, binder_transaction_buffer_release()
attempts to detect whether the current context is the target by
comparing current->group_leader to proc->tsk. This is an unreliable
test. Instead explicitly pass an 'is_failure' boolean.
Detecting the sender was being used as a way to tell if the
transaction failed to be sent. When cleaning up after
failing to send a transaction, there is no need to close
the fds associated with a BINDER_TYPE_FDA object. Now
'is_failure' can be used to accurately detect this case.
Fixes: 44d8047f1d ("binder: use standard functions to allocate fds")
Cc: stable <stable@vger.kernel.org>
Acked-by: Christian Brauner <christian.brauner@ubuntu.com>
Signed-off-by: Todd Kjos <tkjos@google.com>
Link: https://lore.kernel.org/r/20211015233811.3532235-1-tkjos@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Use the 'struct cred' saved at binder_open() to lookup
the security ID via security_cred_getsecid(). This
ensures that the security context that opened binder
is the one used to generate the secctx.
Cc: stable@vger.kernel.org # 5.4+
Fixes: ec74136ded ("binder: create node flag to request sender's security context")
Signed-off-by: Todd Kjos <tkjos@google.com>
Suggested-by: Stephen Smalley <stephen.smalley.work@gmail.com>
Reported-by: kernel test robot <lkp@intel.com>
Acked-by: Casey Schaufler <casey@schaufler-ca.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
Since binder was integrated with selinux, it has passed
'struct task_struct' associated with the binder_proc
to represent the source and target of transactions.
The conversion of task to SID was then done in the hook
implementations. It turns out that there are race conditions
which can result in an incorrect security context being used.
Fix by using the 'struct cred' saved during binder_open and pass
it to the selinux subsystem.
Cc: stable@vger.kernel.org # 5.14 (need backport for earlier stables)
Fixes: 79af73079d ("Add security hooks to binder and implement the hooks for SELinux.")
Suggested-by: Jann Horn <jannh@google.com>
Signed-off-by: Todd Kjos <tkjos@google.com>
Acked-by: Casey Schaufler <casey@schaufler-ca.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
Save the 'struct cred' associated with a binder process
at initial open to avoid potential race conditions
when converting to an euid.
Set a transaction's sender_euid from the 'struct cred'
saved at binder_open() instead of looking up the euid
from the binder proc's 'struct task'. This ensures
the euid is associated with the security context that
of the task that opened binder.
Cc: stable@vger.kernel.org # 4.4+
Fixes: 457b9a6f09 ("Staging: android: add binder driver")
Signed-off-by: Todd Kjos <tkjos@google.com>
Suggested-by: Stephen Smalley <stephen.smalley.work@gmail.com>
Suggested-by: Jann Horn <jannh@google.com>
Acked-by: Casey Schaufler <casey@schaufler-ca.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
During BC_FREE_BUFFER processing, the BINDER_TYPE_FDA object
cleanup may close 1 or more fds. The close operations are
completed using the task work mechanism -- which means the thread
needs to return to userspace or the file object may never be
dereferenced -- which can lead to hung processes.
Force the binder thread back to userspace if an fd is closed during
BC_FREE_BUFFER handling.
Fixes: 80cd795630 ("binder: fix use-after-free due to ksys_close() during fdget()")
Cc: stable <stable@vger.kernel.org>
Reviewed-by: Martijn Coenen <maco@android.com>
Acked-by: Christian Brauner <christian.brauner@ubuntu.com>
Signed-off-by: Todd Kjos <tkjos@google.com>
Link: https://lore.kernel.org/r/20210830195146.587206-1-tkjos@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Currently cgroup freezer is used to freeze the application threads, and
BINDER_FREEZE is used to freeze the corresponding binder interface.
There's already a mechanism in ioctl(BINDER_FREEZE) to wait for any
existing transactions to drain out before actually freezing the binder
interface.
But freezing an app requires 2 steps, freezing the binder interface with
ioctl(BINDER_FREEZE) and then freezing the application main threads with
cgroupfs. This is not an atomic operation. The following race issue
might happen.
1) Binder interface is frozen by ioctl(BINDER_FREEZE);
2) Main thread A initiates a new sync binder transaction to process B;
3) Main thread A is frozen by "echo 1 > cgroup.freeze";
4) The response from process B reaches the frozen thread, which will
unexpectedly fail.
This patch provides a mechanism to check if there's any new pending
transaction happening between ioctl(BINDER_FREEZE) and freezing the
main thread. If there's any, the main thread freezing operation can
be rolled back to finish the pending transaction.
Furthermore, the response might reach the binder driver before the
rollback actually happens. That will still cause failed transaction.
As the other process doesn't wait for another response of the response,
the response transaction failure can be fixed by treating the response
transaction like an oneway/async one, allowing it to reach the frozen
thread. And it will be consumed when the thread gets unfrozen later.
NOTE: This patch reuses the existing definition of struct
binder_frozen_status_info but expands the bit assignments of __u32
member sync_recv.
To ensure backward compatibility, bit 0 of sync_recv still indicates
there's an outstanding sync binder transaction. This patch adds new
information to bit 1 of sync_recv, indicating the binder transaction
happens exactly when there's a race.
If an existing userspace app runs on a new kernel, a sync binder call
will set bit 0 of sync_recv so ioctl(BINDER_GET_FROZEN_INFO) still
return the expected value (true). The app just doesn't check bit 1
intentionally so it doesn't have the ability to tell if there's a race.
This behavior is aligned with what happens on an old kernel which
doesn't set bit 1 at all.
A new userspace app can 1) check bit 0 to know if there's a sync binder
transaction happened when being frozen - same as before; and 2) check
bit 1 to know if that sync binder transaction happened exactly when
there's a race - a new information for rollback decision.
the same time, confirmed the pending transactions succeeded.
Fixes: 432ff1e916 ("binder: BINDER_FREEZE ioctl")
Acked-by: Todd Kjos <tkjos@google.com>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Li Li <dualli@google.com>
Test: stress test with apps being frozen and initiating binder calls at
Link: https://lore.kernel.org/r/20210910164210.2282716-2-dualli@chromium.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
In the case of a failed transaction, only the thread and process id are
logged. Add the handle info for the reference to the target node in user
error log to aid debugging.
Acked-by: Todd Kjos <tkjos@google.com>
Signed-off-by: Ramji Jiyani <ramjiyani@google.com>
Link: https://lore.kernel.org/r/20210802220446.1938347-1-ramjiyani@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
All the other ioctl paths return EFAULT in case the
copy_from_user/copy_to_user call fails, make oneway spam detection
follow the same paradigm.
Fixes: a7dc1e6f99 ("binder: tell userspace to dump current backtrace when detected oneway spamming")
Acked-by: Todd Kjos <tkjos@google.com>
Acked-by: Christian Brauner <christian.brauner@ubuntu.com>
Signed-off-by: Luca Stefani <luca.stefani.ge1@gmail.com>
Link: https://lore.kernel.org/r/20210506193726.45118-1-luca.stefani.ge1@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
iQJIBAABCAAyFiEES0KozwfymdVUl37v6iDy2pc3iXMFAmCHM2sUHHBhdWxAcGF1
bC1tb29yZS5jb20ACgkQ6iDy2pc3iXNfCg/9GmoCyCh+ZRj5RGQ6M+yJas1+yyJQ
uEfTNde54yfATUTaaWYnZG59yqzM3I2uaV11U7tqg8ajiFPxJKqbs5R9jl3lnSjH
0Dg22nXPSCOTKcU0x/DeLoKRr+M9jO1K/nQ8NEZvYX4nC/OgtCvJqb/oEQZIKAk5
2a7OEmNNQyFGd274p9dELaDHxN9UIaJ2PzQFXtq7ROHgBXQO4ONb2ajOf6mDSFQb
vP/CDHwaH+pcE28w44oRy0/YBkO1SrdqoFQchg5yFagM5tQRLGkXK4OFSs5KHi5Q
YMtmaOzMPIv1e5eaC1HuuMJYA4pPb30T9hFHP7tmBVZfmZaFaDeUs+BhMm98WTiS
o0iTP7tfs36/poOR1Q0/sB06uvF9hUAAX1ZuE95YySifbXU9hsUc9b0uQSwCdg9P
/J9rcdHLTpWqjw9n02mezWmAvo5U8ZvbDs+0xPIwI+3RTUP5t6mp+Hd5Tc7bPTq1
0rpWXx+FQoSytFap5qiUSiwBp+HF6HQnNIXB0Muf6wctChoTjvo7TwoxH//z4kEm
+SddhOCNkB7VC/X7hOxhl0F/rdHuXvb1AFIWjpTLJH2CR1PvMtF+sGey+uPT6hKZ
/gvhmQGjFdph99eGlfVbCNvx1pM61O25IscaYD1T2wGImw+z7dX4WkG3WoOdDSkR
bRjrBkcHh0gLhWk=
=HTEy
-----END PGP SIGNATURE-----
Merge tag 'selinux-pr-20210426' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux
Pull selinux updates from Paul Moore:
- Add support for measuring the SELinux state and policy capabilities
using IMA.
- A handful of SELinux/NFS patches to compare the SELinux state of one
mount with a set of mount options. Olga goes into more detail in the
patch descriptions, but this is important as it allows more
flexibility when using NFS and SELinux context mounts.
- Properly differentiate between the subjective and objective LSM
credentials; including support for the SELinux and Smack. My clumsy
attempt at a proper fix for AppArmor didn't quite pass muster so John
is working on a proper AppArmor patch, in the meantime this set of
patches shouldn't change the behavior of AppArmor in any way. This
change explains the bulk of the diffstat beyond security/.
- Fix a problem where we were not properly terminating the permission
list for two SELinux object classes.
* tag 'selinux-pr-20210426' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux:
selinux: add proper NULL termination to the secclass_map permissions
smack: differentiate between subjective and objective task credentials
selinux: clarify task subjective and objective credentials
lsm: separate security_task_getsecid() into subjective and objective variants
nfs: account for selinux security context when deciding to share superblock
nfs: remove unneeded null check in nfs_fill_super()
lsm,selinux: add new hook to compare new mount to an existing mount
selinux: fix misspellings using codespell tool
selinux: fix misspellings using codespell tool
selinux: measure state and policy capabilities
selinux: Allow context mounts for unpriviliged overlayfs
When async binder buffer got exhausted, some normal oneway transactions
will also be discarded and may cause system or application failures. By
that time, the binder debug information we dump may not be relevant to
the root cause. And this issue is difficult to debug if without the
backtrace of the thread sending spam.
This change will send BR_ONEWAY_SPAM_SUSPECT to userspace when oneway
spamming is detected, request to dump current backtrace. Oneway spamming
will be reported only once when exceeding the threshold (target process
dips below 80% of its oneway space, and current process is responsible for
either more than 50 transactions, or more than 50% of the oneway space).
And the detection will restart when the async buffer has returned to a
healthy state.
Acked-by: Todd Kjos <tkjos@google.com>
Signed-off-by: Hang Lu <hangl@codeaurora.org>
Link: https://lore.kernel.org/r/1617961246-4502-3-git-send-email-hangl@codeaurora.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Add BR_FROZEN_REPLY in binder_return_strings to support stat function.
Fixes: ae28c1be1e ("binder: BINDER_GET_FROZEN_INFO ioctl")
Acked-by: Todd Kjos <tkjos@google.com>
Signed-off-by: Hang Lu <hangl@codeaurora.org>
Link: https://lore.kernel.org/r/1617961246-4502-2-git-send-email-hangl@codeaurora.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
User space needs to know if binder transactions occurred to frozen
processes. Introduce a new BINDER_GET_FROZEN ioctl and keep track of
transactions occurring to frozen proceses.
Signed-off-by: Marco Ballesio <balejs@google.com>
Signed-off-by: Li Li <dualli@google.com>
Acked-by: Todd Kjos <tkjos@google.com>
Link: https://lore.kernel.org/r/20210316011630.1121213-4-dualli@chromium.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
when interrupted by a signal, binder_wait_for_work currently returns
-ERESTARTSYS. This error code isn't propagated to user space, but a way
to handle interruption due to signals must be provided to code using
this API.
Replace this instance of -ERESTARTSYS with -EINTR, which is propagated
to user space.
binder_wait_for_work
Signed-off-by: Marco Ballesio <balejs@google.com>
Signed-off-by: Li Li <dualli@google.com>
Test: built, booted, interrupted a worker thread within
Acked-by: Todd Kjos <tkjos@google.com>
Link: https://lore.kernel.org/r/20210316011630.1121213-3-dualli@chromium.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Frozen tasks can't process binder transactions, so a way is required to
inform transmitting ends of communication failures due to the frozen
state of their receiving counterparts. Additionally, races are possible
between transitions to frozen state and binder transactions enqueued to
a specific process.
Implement BINDER_FREEZE ioctl for user space to inform the binder driver
about the intention to freeze or unfreeze a process. When the ioctl is
called, block the caller until any pending binder transactions toward
the target process are flushed. Return an error to transactions to
processes marked as frozen.
Co-developed-by: Todd Kjos <tkjos@google.com>
Acked-by: Todd Kjos <tkjos@google.com>
Signed-off-by: Marco Ballesio <balejs@google.com>
Signed-off-by: Todd Kjos <tkjos@google.com>
Signed-off-by: Li Li <dualli@google.com>
Link: https://lore.kernel.org/r/20210316011630.1121213-2-dualli@chromium.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Of the three LSMs that implement the security_task_getsecid() LSM
hook, all three LSMs provide the task's objective security
credentials. This turns out to be unfortunate as most of the hook's
callers seem to expect the task's subjective credentials, although
a small handful of callers do correctly expect the objective
credentials.
This patch is the first step towards fixing the problem: it splits
the existing security_task_getsecid() hook into two variants, one
for the subjective creds, one for the objective creds.
void security_task_getsecid_subj(struct task_struct *p,
u32 *secid);
void security_task_getsecid_obj(struct task_struct *p,
u32 *secid);
While this patch does fix all of the callers to use the correct
variant, in order to keep this patch focused on the callers and to
ease review, the LSMs continue to use the same implementation for
both hooks. The net effect is that this patch should not change
the behavior of the kernel in any way, it will be up to the latter
LSM specific patches in this series to change the hook
implementations and return the correct credentials.
Acked-by: Mimi Zohar <zohar@linux.ibm.com> (IMA)
Acked-by: Casey Schaufler <casey@schaufler-ca.com>
Reviewed-by: Richard Guy Briggs <rgb@redhat.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
Pull execve updates from Eric Biederman:
"This set of changes ultimately fixes the interaction of posix file
lock and exec. Fundamentally most of the change is just moving where
unshare_files is called during exec, and tweaking the users of
files_struct so that the count of files_struct is not unnecessarily
played with.
Along the way fcheck and related helpers were renamed to more
accurately reflect what they do.
There were also many other small changes that fell out, as this is the
first time in a long time much of this code has been touched.
Benchmarks haven't turned up any practical issues but Al Viro has
observed a possibility for a lot of pounding on task_lock. So I have
some changes in progress to convert put_files_struct to always rcu
free files_struct. That wasn't ready for the merge window so that will
have to wait until next time"
* 'exec-for-v5.11' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: (27 commits)
exec: Move io_uring_task_cancel after the point of no return
coredump: Document coredump code exclusively used by cell spufs
file: Remove get_files_struct
file: Rename __close_fd_get_file close_fd_get_file
file: Replace ksys_close with close_fd
file: Rename __close_fd to close_fd and remove the files parameter
file: Merge __alloc_fd into alloc_fd
file: In f_dupfd read RLIMIT_NOFILE once.
file: Merge __fd_install into fd_install
proc/fd: In fdinfo seq_show don't use get_files_struct
bpf/task_iter: In task_file_seq_get_next use task_lookup_next_fd_rcu
proc/fd: In proc_readfd_common use task_lookup_next_fd_rcu
file: Implement task_lookup_next_fd_rcu
kcmp: In get_file_raw_ptr use task_lookup_fd_rcu
proc/fd: In tid_fd_mode use task_lookup_fd_rcu
file: Implement task_lookup_fd_rcu
file: Rename fcheck lookup_fd_rcu
file: Replace fcheck_files with files_lookup_fd_rcu
file: Factor files_lookup_fd_locked out of fcheck_files
file: Rename __fcheck_files to files_lookup_fd_raw
...
The function close_fd_get_file is explicitly a variant of
__close_fd[1]. Now that __close_fd has been renamed close_fd, rename
close_fd_get_file to be consistent with close_fd.
When __alloc_fd, __close_fd and __fd_install were introduced the
double underscore indicated that the function took a struct
files_struct parameter. The function __close_fd_get_file never has so
the naming has always been inconsistent. This just cleans things up
so there are not any lingering mentions or references __close_fd left
in the code.
[1] 80cd795630 ("binder: fix use-after-free due to ksys_close() during fdget()")
Link: https://lkml.kernel.org/r/20201120231441.29911-23-ebiederm@xmission.com
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Add a per-transaction flag to indicate that the buffer
must be cleared when the transaction is complete to
prevent copies of sensitive data from being preserved
in memory.
Signed-off-by: Todd Kjos <tkjos@google.com>
Link: https://lore.kernel.org/r/20201120233743.3617529-1-tkjos@google.com
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Since the original trace_binder_transaction_received cannot
precisely present the real finished time of transaction, adding a
trace_binder_txn_latency_free at the point of free transaction
may be more close to it.
Signed-off-by: Frankie.Chang <Frankie.Chang@mediatek.com>
Acked-by: Todd Kjos <tkjos@google.com>
Link: https://lore.kernel.org/r/1605063764-12930-3-git-send-email-Frankie.Chang@mediatek.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Moving all structs to header file makes module more
extendable, and makes all these structs to be defined
in the same file.
Signed-off-by: Frankie.Chang <Frankie.Chang@mediatek.com>
Acked-by: Todd Kjos <tkjos@google.com>
Link: https://lore.kernel.org/r/1605063764-12930-2-git-send-email-Frankie.Chang@mediatek.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Depending on the context, the error return value
here (extra_buffers_size < added_size) should be
negative.
Acked-by: Martijn Coenen <maco@android.com>
Acked-by: Christian Brauner <christian.brauner@ubuntu.com>
Signed-off-by: Zhang Qilong <zhangqilong3@huawei.com>
Link: https://lore.kernel.org/r/20201026110314.135481-1-zhangqilong3@huawei.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
A previous commit changed the notification mode from true/false to an
int, allowing notify-no, notify-yes, or signal-notify. This was
backwards compatible in the sense that any existing true/false user
would translate to either 0 (on notification sent) or 1, the latter
which mapped to TWA_RESUME. TWA_SIGNAL was assigned a value of 2.
Clean this up properly, and define a proper enum for the notification
mode. Now we have:
- TWA_NONE. This is 0, same as before the original change, meaning no
notification requested.
- TWA_RESUME. This is 1, same as before the original change, meaning
that we use TIF_NOTIFY_RESUME.
- TWA_SIGNAL. This uses TIF_SIGPENDING/JOBCTL_TASK_WORK for the
notification.
Clean up all the callers, switching their 0/1/false/true to using the
appropriate TWA_* mode for notifications.
Fixes: e91b481623 ("task_work: teach task_work_add() to do signal_wake_up()")
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
When releasing a thread todo list when tearing down
a binder_proc, the following race was possible which
could result in a use-after-free:
1. Thread 1: enter binder_release_work from binder_thread_release
2. Thread 2: binder_update_ref_for_handle() -> binder_dec_node_ilocked()
3. Thread 2: dec nodeA --> 0 (will free node)
4. Thread 1: ACQ inner_proc_lock
5. Thread 2: block on inner_proc_lock
6. Thread 1: dequeue work (BINDER_WORK_NODE, part of nodeA)
7. Thread 1: REL inner_proc_lock
8. Thread 2: ACQ inner_proc_lock
9. Thread 2: todo list cleanup, but work was already dequeued
10. Thread 2: free node
11. Thread 2: REL inner_proc_lock
12. Thread 1: deref w->type (UAF)
The problem was that for a BINDER_WORK_NODE, the binder_work element
must not be accessed after releasing the inner_proc_lock while
processing the todo list elements since another thread might be
handling a deref on the node containing the binder_work element
leading to the node being freed.
Signed-off-by: Todd Kjos <tkjos@google.com>
Link: https://lore.kernel.org/r/20201009232455.4054810-1-tkjos@google.com
Cc: <stable@vger.kernel.org> # 4.14, 4.19, 5.4, 5.8
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The most common cause of the binder transaction buffer filling up is a
client rapidly firing oneway transactions into a process, before it has
a chance to handle them. Yet the root cause of this is often hard to
debug, because either the system or the app will stop, and by that time
binder debug information we dump in bugreports is no longer relevant.
This change warns as soon as a process dips below 80% of its oneway
space (less than 100kB available in the configuration), when any one
process is responsible for either more than 50 transactions, or more
than 50% of the oneway space.
Signed-off-by: Martijn Coenen <maco@android.com>
Acked-by: Todd Kjos <tkjos@google.com>
Link: https://lore.kernel.org/r/20200821122544.1277051-1-maco@android.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
While binder transactions with the same binder_proc as sender and recipient
are forbidden, transactions with the same task_struct as sender and
recipient are possible (even though currently there is a weird check in
binder_transaction() that rejects them in the target==0 case).
Therefore, task_struct identities can't be used to distinguish whether
the caller is running in the context of the sender or the recipient.
Since I see no easy way to make this WARN_ON() useful and correct, let's
just remove it.
Fixes: 44d8047f1d ("binder: use standard functions to allocate fds")
Reported-by: syzbot+e113a0b970b7b3f394ba@syzkaller.appspotmail.com
Acked-by: Christian Brauner <christian.brauner@ubuntu.com>
Acked-by: Todd Kjos <tkjos@google.com>
Signed-off-by: Jann Horn <jannh@google.com>
Link: https://lore.kernel.org/r/20200806165359.2381483-1-jannh@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Binder is designed such that a binder_proc never has references to
itself. If this rule is violated, memory corruption can occur when a
process sends a transaction to itself; see e.g.
<https://syzkaller.appspot.com/bug?extid=09e05aba06723a94d43d>.
There is a remaining edgecase through which such a transaction-to-self
can still occur from the context of a task with BINDER_SET_CONTEXT_MGR
access:
- task A opens /dev/binder twice, creating binder_proc instances P1
and P2
- P1 becomes context manager
- P2 calls ACQUIRE on the magic handle 0, allocating index 0 in its
handle table
- P1 dies (by closing the /dev/binder fd and waiting a bit)
- P2 becomes context manager
- P2 calls ACQUIRE on the magic handle 0, allocating index 1 in its
handle table
[this triggers a warning: "binder: 1974:1974 tried to acquire
reference to desc 0, got 1 instead"]
- task B opens /dev/binder once, creating binder_proc instance P3
- P3 calls P2 (via magic handle 0) with (void*)1 as argument (two-way
transaction)
- P2 receives the handle and uses it to call P3 (two-way transaction)
- P3 calls P2 (via magic handle 0) (two-way transaction)
- P2 calls P2 (via handle 1) (two-way transaction)
And then, if P2 does *NOT* accept the incoming transaction work, but
instead closes the binder fd, we get a crash.
Solve it by preventing the context manager from using ACQUIRE on ref 0.
There shouldn't be any legitimate reason for the context manager to do
that.
Additionally, print a warning if someone manages to find another way to
trigger a transaction-to-self bug in the future.
Cc: stable@vger.kernel.org
Fixes: 457b9a6f09 ("Staging: android: add binder driver")
Acked-by: Todd Kjos <tkjos@google.com>
Signed-off-by: Jann Horn <jannh@google.com>
Reviewed-by: Martijn Coenen <maco@android.com>
Link: https://lore.kernel.org/r/20200727120424.1627555-1-jannh@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The binder driver makes the assumption proc->context pointer is invariant after
initialization (as documented in the kerneldoc header for struct proc).
However, in commit f0fe2c0f05 ("binder: prevent UAF for binderfs devices II")
proc->context is set to NULL during binder_deferred_release().
Another proc was in the middle of setting up a transaction to the dying
process and crashed on a NULL pointer deref on "context" which is a local
set to &proc->context:
new_ref->data.desc = (node == context->binder_context_mgr_node) ? 0 : 1;
Here's the stack:
[ 5237.855435] Call trace:
[ 5237.855441] binder_get_ref_for_node_olocked+0x100/0x2ec
[ 5237.855446] binder_inc_ref_for_node+0x140/0x280
[ 5237.855451] binder_translate_binder+0x1d0/0x388
[ 5237.855456] binder_transaction+0x2228/0x3730
[ 5237.855461] binder_thread_write+0x640/0x25bc
[ 5237.855466] binder_ioctl_write_read+0xb0/0x464
[ 5237.855471] binder_ioctl+0x30c/0x96c
[ 5237.855477] do_vfs_ioctl+0x3e0/0x700
[ 5237.855482] __arm64_sys_ioctl+0x78/0xa4
[ 5237.855488] el0_svc_common+0xb4/0x194
[ 5237.855493] el0_svc_handler+0x74/0x98
[ 5237.855497] el0_svc+0x8/0xc
The fix is to move the kfree of the binder_device to binder_free_proc()
so the binder_device is freed when we know there are no references
remaining on the binder_proc.
Fixes: f0fe2c0f05 ("binder: prevent UAF for binderfs devices II")
Acked-by: Christian Brauner <christian.brauner@ubuntu.com>
Signed-off-by: Todd Kjos <tkjos@google.com>
Cc: stable <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20200622200715.114382-1-tkjos@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This is a necessary follow up to the first fix I proposed and we merged
in 2669b8b0c7 ("binder: prevent UAF for binderfs devices"). I have been
overly optimistic that the simple fix I proposed would work. But alas,
ihold() + iput() won't work since the inodes won't survive the
destruction of the superblock.
So all we get with my prior fix is a different race with a tinier
race-window but it doesn't solve the issue. Fwiw, the problem lies with
generic_shutdown_super(). It even has this cozy Al-style comment:
if (!list_empty(&sb->s_inodes)) {
printk("VFS: Busy inodes after unmount of %s. "
"Self-destruct in 5 seconds. Have a nice day...\n",
sb->s_id);
}
On binder_release(), binder_defer_work(proc, BINDER_DEFERRED_RELEASE) is
called which punts the actual cleanup operation to a workqueue. At some
point, binder_deferred_func() will be called which will end up calling
binder_deferred_release() which will retrieve and cleanup the
binder_context attach to this struct binder_proc.
If we trace back where this binder_context is attached to binder_proc we
see that it is set in binder_open() and is taken from the struct
binder_device it is associated with. This obviously assumes that the
struct binder_device that context is attached to is _never_ freed. While
that might be true for devtmpfs binder devices it is most certainly
wrong for binderfs binder devices.
So, assume binder_open() is called on a binderfs binder devices. We now
stash away the struct binder_context associated with that struct
binder_devices:
proc->context = &binder_dev->context;
/* binderfs stashes devices in i_private */
if (is_binderfs_device(nodp)) {
binder_dev = nodp->i_private;
info = nodp->i_sb->s_fs_info;
binder_binderfs_dir_entry_proc = info->proc_log_dir;
} else {
.
.
.
proc->context = &binder_dev->context;
Now let's assume that the binderfs instance for that binder devices is
shutdown via umount() and/or the mount namespace associated with it goes
away. As long as there is still an fd open for that binderfs binder
device things are fine. But let's assume we now close the last fd for
that binderfs binder device. Now binder_release() is called and punts to
the workqueue. Assume that the workqueue has quite a bit of stuff to do
and doesn't get to cleaning up the struct binder_proc and the associated
struct binder_context with it for that binderfs binder device right
away. In the meantime, the VFS is killing the super block and is
ultimately calling sb->evict_inode() which means it will call
binderfs_evict_inode() which does:
static void binderfs_evict_inode(struct inode *inode)
{
struct binder_device *device = inode->i_private;
struct binderfs_info *info = BINDERFS_I(inode);
clear_inode(inode);
if (!S_ISCHR(inode->i_mode) || !device)
return;
mutex_lock(&binderfs_minors_mutex);
--info->device_count;
ida_free(&binderfs_minors, device->miscdev.minor);
mutex_unlock(&binderfs_minors_mutex);
kfree(device->context.name);
kfree(device);
}
thereby freeing the struct binder_device including struct
binder_context.
Now the workqueue finally has time to get around to cleaning up struct
binder_proc and is now trying to access the associate struct
binder_context. Since it's already freed it will OOPs.
Fix this by introducing a refounct on binder devices.
This is an alternative fix to 51d8a7eca6 ("binder: prevent UAF read in
print_binder_transaction_log_entry()").
Fixes: 3ad20fe393 ("binder: implement binderfs")
Fixes: 2669b8b0c7 ("binder: prevent UAF for binderfs devices")
Fixes: 03e2e07e38 ("binder: Make transaction_log available in binderfs")
Related : 51d8a7eca6 ("binder: prevent UAF read in print_binder_transaction_log_entry()")
Cc: stable@vger.kernel.org
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
Acked-by: Todd Kjos <tkjos@google.com>
Link: https://lore.kernel.org/r/20200303164340.670054-1-christian.brauner@ubuntu.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
On binder_release(), binder_defer_work(proc, BINDER_DEFERRED_RELEASE) is
called which punts the actual cleanup operation to a workqueue. At some
point, binder_deferred_func() will be called which will end up calling
binder_deferred_release() which will retrieve and cleanup the
binder_context attach to this struct binder_proc.
If we trace back where this binder_context is attached to binder_proc we
see that it is set in binder_open() and is taken from the struct
binder_device it is associated with. This obviously assumes that the
struct binder_device that context is attached to is _never_ freed. While
that might be true for devtmpfs binder devices it is most certainly
wrong for binderfs binder devices.
So, assume binder_open() is called on a binderfs binder devices. We now
stash away the struct binder_context associated with that struct
binder_devices:
proc->context = &binder_dev->context;
/* binderfs stashes devices in i_private */
if (is_binderfs_device(nodp)) {
binder_dev = nodp->i_private;
info = nodp->i_sb->s_fs_info;
binder_binderfs_dir_entry_proc = info->proc_log_dir;
} else {
.
.
.
proc->context = &binder_dev->context;
Now let's assume that the binderfs instance for that binder devices is
shutdown via umount() and/or the mount namespace associated with it goes
away. As long as there is still an fd open for that binderfs binder
device things are fine. But let's assume we now close the last fd for
that binderfs binder device. Now binder_release() is called and punts to
the workqueue. Assume that the workqueue has quite a bit of stuff to do
and doesn't get to cleaning up the struct binder_proc and the associated
struct binder_context with it for that binderfs binder device right
away. In the meantime, the VFS is killing the super block and is
ultimately calling sb->evict_inode() which means it will call
binderfs_evict_inode() which does:
static void binderfs_evict_inode(struct inode *inode)
{
struct binder_device *device = inode->i_private;
struct binderfs_info *info = BINDERFS_I(inode);
clear_inode(inode);
if (!S_ISCHR(inode->i_mode) || !device)
return;
mutex_lock(&binderfs_minors_mutex);
--info->device_count;
ida_free(&binderfs_minors, device->miscdev.minor);
mutex_unlock(&binderfs_minors_mutex);
kfree(device->context.name);
kfree(device);
}
thereby freeing the struct binder_device including struct
binder_context.
Now the workqueue finally has time to get around to cleaning up struct
binder_proc and is now trying to access the associate struct
binder_context. Since it's already freed it will OOPs.
Fix this by holding an additional reference to the inode that is only
released once the workqueue is done cleaning up struct binder_proc. This
is an easy alternative to introducing separate refcounting on struct
binder_device which we can always do later if it becomes necessary.
This is an alternative fix to 51d8a7eca6 ("binder: prevent UAF read in
print_binder_transaction_log_entry()").
Fixes: 3ad20fe393 ("binder: implement binderfs")
Fixes: 03e2e07e38 ("binder: Make transaction_log available in binderfs")
Related : 51d8a7eca6 ("binder: prevent UAF read in print_binder_transaction_log_entry()")
Cc: stable@vger.kernel.org
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
Acked-by: Todd Kjos <tkjos@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAl4yEegQHGF4Ym9lQGtl
cm5lbC5kawAKCRD301j7KXHgpn5ZD/4/WlXs2cUDgg1C65bzZFO4qvevm+VkXmsk
GbyrnFstRekvSH01/ZQxlyDVKS8Wux0XIJ6OArCh1047LvL1bEE5dvOW5iIiwa/r
grjQuwFAzIPsE2fgcAO17BKIUzq2Z96+hwDzH7dw0i32yBuLvNmY/1SxcCHKfPut
uzGyp7t3/2dIHbpWILRndMYe0O9j9ubmOMvKyKTwy723yDEafsUoqu2mlpigzTq4
2i+DbYBIAd8qmLqG/m3e+vOt9xodJ2Q0hlO+v6DcP2SKXU64Hb/N98HadR//aWP9
41DBXqs+dvDBcu3Jxb80PFUTiOQZECJivkns5cNcjuSXmNkOuQhDQR5K372AHmR9
m6e6FSBxwej8HselAZCI6yu9uBKd0i+MM4FnFs/O73QGYx2ayXsEXp/Jad9xiYgW
pC5XJTSqJQhPE0AYYEOzHPPcBLBcpvXHkvmGKdjkNb8OLhhgh2S/YG0DNC+8ABXr
j1uIe/n3kJEEmOanUyiitGyLmDq+mXd7aCVKJL/J0KiGD8Gkc1avAZ1ZrTQgjujY
FqqBFawO8gv3g0L4WMI8JI+HJGMnA488obet6UKm9+l/Z/urEpXzDAKf/W/vnx2B
LD0FSA0bCh1tyO6JU+avFwHlwShtV7/rx/OhrmCK7CCYKtZCA2IEctxyr8U+PBIv
DtwIMTYTsA==
=ZZUI
-----END PGP SIGNATURE-----
Merge tag 'for-5.6/io_uring-vfs-2020-01-29' of git://git.kernel.dk/linux-block
Pull io_uring updates from Jens Axboe:
- Support for various new opcodes (fallocate, openat, close, statx,
fadvise, madvise, openat2, non-vectored read/write, send/recv, and
epoll_ctl)
- Faster ring quiesce for fileset updates
- Optimizations for overflow condition checking
- Support for max-sized clamping
- Support for probing what opcodes are supported
- Support for io-wq backend sharing between "sibling" rings
- Support for registering personalities
- Lots of little fixes and improvements
* tag 'for-5.6/io_uring-vfs-2020-01-29' of git://git.kernel.dk/linux-block: (64 commits)
io_uring: add support for epoll_ctl(2)
eventpoll: support non-blocking do_epoll_ctl() calls
eventpoll: abstract out epoll_ctl() handler
io_uring: fix linked command file table usage
io_uring: support using a registered personality for commands
io_uring: allow registering credentials
io_uring: add io-wq workqueue sharing
io-wq: allow grabbing existing io-wq
io_uring/io-wq: don't use static creds/mm assignments
io-wq: make the io_wq ref counted
io_uring: fix refcounting with batched allocations at OOM
io_uring: add comment for drain_next
io_uring: don't attempt to copy iovec for READ/WRITE
io_uring: honor IOSQE_ASYNC for linked reqs
io_uring: prep req when do IOSQE_ASYNC
io_uring: use labeled array init in io_op_defs
io_uring: optimise sqe-to-req flags translation
io_uring: remove REQ_F_IO_DRAINED
io_uring: file switch work needs to get flushed on exit
io_uring: hide uring_fd in ctx
...
Since commit 43e23b6c0b ("debugfs: log errors when something goes wrong")
debugfs logs attempts to create existing files.
However binder attempts to create multiple debugfs files with
the same name when a single PID has multiple contexts, this leads
to log spamming during an Android boot (17 such messages during
boot on my system).
Fix this by checking if we already know the PID and only create
the debugfs entry for the first context per PID.
Do the same thing for binderfs for symmetry.
Signed-off-by: Martin Fuzzey <martin.fuzzey@flowbird.group>
Acked-by: Todd Kjos <tkjos@google.com>
Fixes: 43e23b6c0b ("debugfs: log errors when something goes wrong")
Cc: stable <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/1578671054-5982-1-git-send-email-martin.fuzzey@flowbird.group
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Just one caller of this, and just use filp_close() there manually.
This is important to allow async close/removal of the fd.
Signed-off-by: Jens Axboe <axboe@kernel.dk>
For BINDER_TYPE_PTR and BINDER_TYPE_FDA transactions, the
num_valid local was calculated incorrectly causing the
range check in binder_validate_ptr() to miss out-of-bounds
offsets.
Fixes: bde4a19fc0 ("binder: use userspace pointer as base of buffer space")
Signed-off-by: Todd Kjos <tkjos@google.com>
Cc: stable <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20191213202531.55010-1-tkjos@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
As part of the cleanup of some remaining y2038 issues, I came to
fs/compat_ioctl.c, which still has a couple of commands that need support
for time64_t.
In completely unrelated work, I spent time on cleaning up parts of this
file in the past, moving things out into drivers instead.
After Al Viro reviewed an earlier version of this series and did a lot
more of that cleanup, I decided to try to completely eliminate the rest
of it and move it all into drivers.
This series incorporates some of Al's work and many patches of my own,
but in the end stops short of actually removing the last part, which is
the scsi ioctl handlers. I have patches for those as well, but they need
more testing or possibly a rewrite.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2
iQIcBAABCAAGBQJdsHCdAAoJEJpsee/mABjZtYkP/1JGl3jFv3Iq/5BCdPkaePP1
RtMJRNfURgK3GeuHUui330PvVjI/pLWXU/VXMK2MPTASpJLzYz3uCaZrpVWEMpDZ
+ImzGmgJkITlW1uWU3zOcQhOxTyb1hCZ0Ci+2xn9QAmyOL7prXoXCXDWv3h6iyiF
lwG+nW+HNtyx41YG+9bRfKNoG0ZJ+nkJ70BV6u0acQHXWn7Xuupa9YUmBL87hxAL
6dlJfLTJg6q8QSv/Q6LxslfWk2Ti8OOJZOwtFM5R8Bgl0iUcvshiRCKfv/3t9jXD
dJNvF1uq8z+gracWK49Qsfq5dnZ2ZxHFUo9u0NjbCrxNvWH/sdvhbaUBuJI75seH
VIznCkdxFhrqitJJ8KmxANxG08u+9zSKjSlxG2SmlA4qFx/AoStoHwQXcogJscNb
YIXYKmWBvwPzYu09QFAXdHFPmZvp/3HhMWU6o92lvDhsDwzkSGt3XKhCJea4DCaT
m+oCcoACqSWhMwdbJOEFofSub4bY43s5iaYuKes+c8O261/Dwg6v/pgIVez9mxXm
TBnvCsotq5m8wbwzv99eFqGeJH8zpDHrXxEtRR5KQqMqjLq/OQVaEzmpHZTEuK7n
e/V/PAKo2/V63g4k6GApQXDxnjwT+m0aWToWoeEzPYXS6KmtWC91r4bWtslu3rdl
bN65armTm7bFFR32Avnu
=lgCl
-----END PGP SIGNATURE-----
Merge tag 'compat-ioctl-5.5' of git://git.kernel.org:/pub/scm/linux/kernel/git/arnd/playground
Pull removal of most of fs/compat_ioctl.c from Arnd Bergmann:
"As part of the cleanup of some remaining y2038 issues, I came to
fs/compat_ioctl.c, which still has a couple of commands that need
support for time64_t.
In completely unrelated work, I spent time on cleaning up parts of
this file in the past, moving things out into drivers instead.
After Al Viro reviewed an earlier version of this series and did a lot
more of that cleanup, I decided to try to completely eliminate the
rest of it and move it all into drivers.
This series incorporates some of Al's work and many patches of my own,
but in the end stops short of actually removing the last part, which
is the scsi ioctl handlers. I have patches for those as well, but they
need more testing or possibly a rewrite"
* tag 'compat-ioctl-5.5' of git://git.kernel.org:/pub/scm/linux/kernel/git/arnd/playground: (42 commits)
scsi: sd: enable compat ioctls for sed-opal
pktcdvd: add compat_ioctl handler
compat_ioctl: move SG_GET_REQUEST_TABLE handling
compat_ioctl: ppp: move simple commands into ppp_generic.c
compat_ioctl: handle PPPIOCGIDLE for 64-bit time_t
compat_ioctl: move PPPIOCSCOMPRESS to ppp_generic
compat_ioctl: unify copy-in of ppp filters
tty: handle compat PPP ioctls
compat_ioctl: move SIOCOUTQ out of compat_ioctl.c
compat_ioctl: handle SIOCOUTQNSD
af_unix: add compat_ioctl support
compat_ioctl: reimplement SG_IO handling
compat_ioctl: move WDIOC handling into wdt drivers
fs: compat_ioctl: move FITRIM emulation into file systems
gfs2: add compat_ioctl support
compat_ioctl: remove unused convert_in_user macro
compat_ioctl: remove last RAID handling code
compat_ioctl: remove /dev/raw ioctl translation
compat_ioctl: remove PCI ioctl translation
compat_ioctl: remove joystick ioctl translation
...
We want the binder fix in here as well for testing and to work on top
of.
Also handles a merge issue in binder.c to help linux-next out
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The .ioctl and .compat_ioctl file operations have the same prototype so
they can both point to the same function, which works great almost all
the time when all the commands are compatible.
One exception is the s390 architecture, where a compat pointer is only
31 bit wide, and converting it into a 64-bit pointer requires calling
compat_ptr(). Most drivers here will never run in s390, but since we now
have a generic helper for it, it's easy enough to use it consistently.
I double-checked all these drivers to ensure that all ioctl arguments
are used as pointers or are ignored, but are not interpreted as integer
values.
Acked-by: Jason Gunthorpe <jgg@mellanox.com>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Acked-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Acked-by: David Sterba <dsterba@suse.com>
Acked-by: Darren Hart (VMware) <dvhart@infradead.org>
Acked-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Acked-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Acked-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
SZ_1K has been defined in include/linux/sizes.h since v3.6. Get rid of the
duplicate definition.
Signed-off-by: Jann Horn <jannh@google.com>
Acked-by: Christian Brauner <christian.brauner@ubuntu.com>
Link: https://lore.kernel.org/r/20191016150119.154756-2-jannh@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
binder_mmap() tries to prevent the creation of overly big binder mappings
by silently truncating the size of the VMA to 4MiB. However, this violates
the API contract of mmap(). If userspace attempts to create a large binder
VMA, and later attempts to unmap that VMA, it will call munmap() on a range
beyond the end of the VMA, which may have been allocated to another VMA in
the meantime. This can lead to userspace memory corruption.
The following sequence of calls leads to a segfault without this commit:
int main(void) {
int binder_fd = open("/dev/binder", O_RDWR);
if (binder_fd == -1) err(1, "open binder");
void *binder_mapping = mmap(NULL, 0x800000UL, PROT_READ, MAP_SHARED,
binder_fd, 0);
if (binder_mapping == MAP_FAILED) err(1, "mmap binder");
void *data_mapping = mmap(NULL, 0x400000UL, PROT_READ|PROT_WRITE,
MAP_PRIVATE|MAP_ANONYMOUS, -1, 0);
if (data_mapping == MAP_FAILED) err(1, "mmap data");
munmap(binder_mapping, 0x800000UL);
*(char*)data_mapping = 1;
return 0;
}
Cc: stable@vger.kernel.org
Signed-off-by: Jann Horn <jannh@google.com>
Acked-by: Todd Kjos <tkjos@google.com>
Acked-by: Christian Brauner <christian.brauner@ubuntu.com>
Link: https://lore.kernel.org/r/20191016150119.154756-1-jannh@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When a binder transaction is initiated on a binder device coming from a
binderfs instance, a pointer to the name of the binder device is stashed
in the binder_transaction_log_entry's context_name member. Later on it
is used to print the name in print_binder_transaction_log_entry(). By
the time print_binder_transaction_log_entry() accesses context_name
binderfs_evict_inode() might have already freed the associated memory
thereby causing a UAF. Do the simple thing and prevent this by copying
the name of the binder device instead of stashing a pointer to it.
Reported-by: Jann Horn <jannh@google.com>
Fixes: 03e2e07e38 ("binder: Make transaction_log available in binderfs")
Link: https://lore.kernel.org/r/CAG48ez14Q0-F8LqsvcNbyR2o6gPW8SHXsm4u5jmD9MpsteM2Tw@mail.gmail.com
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
Reviewed-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Acked-by: Todd Kjos <tkjos@google.com>
Reviewed-by: Hridya Valsaraju <hridya@google.com>
Link: https://lore.kernel.org/r/20191008130159.10161-1-christian.brauner@ubuntu.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Currently /sys/kernel/debug/binder/proc contains
the debug data for every binder_proc instance.
This patch makes this information also available
in a binderfs instance mounted with a mount option
"stats=global" in addition to debugfs. The patch does
not affect the presence of the file in debugfs.
If a binderfs instance is mounted at path /dev/binderfs,
this file would be present at /dev/binderfs/binder_logs/proc.
This change provides an alternate way to access this file when debugfs
is not mounted.
Acked-by: Christian Brauner <christian.brauner@ubuntu.com>
Signed-off-by: Hridya Valsaraju <hridya@google.com>
Link: https://lore.kernel.org/r/20190903161655.107408-5-hridya@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Currently, the binder transaction log files 'transaction_log'
and 'failed_transaction_log' live in debugfs at the following locations:
/sys/kernel/debug/binder/failed_transaction_log
/sys/kernel/debug/binder/transaction_log
This patch makes these files also available in a binderfs instance
mounted with the mount option "stats=global".
It does not affect the presence of these files in debugfs.
If a binderfs instance is mounted at path /dev/binderfs, the location of
these files will be as follows:
/dev/binderfs/binder_logs/failed_transaction_log
/dev/binderfs/binder_logs/transaction_log
This change provides an alternate option to access these files when
debugfs is not mounted.
Acked-by: Christian Brauner <christian.brauner@ubuntu.com>
Signed-off-by: Hridya Valsaraju <hridya@google.com>
Link: https://lore.kernel.org/r/20190903161655.107408-4-hridya@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The following binder stat files currently live in debugfs.
/sys/kernel/debug/binder/state
/sys/kernel/debug/binder/stats
/sys/kernel/debug/binder/transactions
This patch makes these files available in a binderfs instance
mounted with the mount option 'stats=global'. For example, if a binderfs
instance is mounted at path /dev/binderfs, the above files will be
available at the following locations:
/dev/binderfs/binder_logs/state
/dev/binderfs/binder_logs/stats
/dev/binderfs/binder_logs/transactions
This provides a way to access them even when debugfs is not mounted.
Acked-by: Christian Brauner <christian.brauner@ubuntu.com>
Signed-off-by: Hridya Valsaraju <hridya@google.com>
Acked-by: Christian Brauner <christian.brauner@ubuntu.com>
Link: https://lore.kernel.org/r/20190903161655.107408-3-hridya@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Currently, since each binderfs instance needs its own
private binder devices, every time a binderfs instance is
mounted, all the default binder devices need to be created
via the BINDER_CTL_ADD IOCTL. This patch aims to
add a solution to automatically create the default binder
devices for each binderfs instance that gets mounted.
To achieve this goal, when CONFIG_ANDROID_BINDERFS is set,
the default binder devices specified by CONFIG_ANDROID_BINDER_DEVICES
are created in each binderfs instance instead of global devices
being created by the binder driver.
Co-developed-by: Christian Brauner <christian.brauner@ubuntu.com>
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
Signed-off-by: Hridya Valsaraju <hridya@google.com>
Reviewed-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Link: https://lore.kernel.org/r/20190808222727.132744-2-hridya@google.com
Link: https://lore.kernel.org/r/20190904110704.8606-2-christian.brauner@ubuntu.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Currently, a transaction to context manager from its own process
is prevented by checking if its binder_proc struct is the same as
that of the sender. However, this would not catch cases where the
process opens the binder device again and uses the new fd to send
a transaction to the context manager.
Reported-by: syzbot+8b3c354d33c4ac78bfad@syzkaller.appspotmail.com
Signed-off-by: Hridya Valsaraju <hridya@google.com>
Acked-by: Todd Kjos <tkjos@google.com>
Cc: stable <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20190715191804.112933-1-hridya@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
In case the target node requests a security context, the
extra_buffers_size is increased with the size of the security context.
But, that size is not available for use by regular scatter-gather
buffers; make sure the ending of that buffer is marked correctly.
Acked-by: Todd Kjos <tkjos@google.com>
Fixes: ec74136ded ("binder: create node flag to request sender's security context")
Signed-off-by: Martijn Coenen <maco@android.com>
Cc: stable@vger.kernel.org # 5.1+
Link: https://lore.kernel.org/r/20190709110923.220736-1-maco@android.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The buffer copy functions assumed the caller would ensure
correct alignment and that the memory to be copied was
completely within the binder buffer. There have been
a few cases discovered by syzkallar where a malformed
transaction created by a user could violated the
assumptions and resulted in a BUG_ON.
The fix is to remove the BUG_ON and always return the
error to be handled appropriately by the caller.
Acked-by: Martijn Coenen <maco@android.com>
Reported-by: syzbot+3ae18325f96190606754@syzkaller.appspotmail.com
Fixes: bde4a19fc0 ("binder: use userspace pointer as base of buffer space")
Suggested-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Todd Kjos <tkjos@google.com>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
syzkallar found a 32-byte memory leak in a rarely executed error
case. The transaction complete work item was not freed if put_user()
failed when writing the BR_TRANSACTION_COMPLETE to the user command
buffer. Fixed by freeing it before put_user() is called.
Reported-by: syzbot+182ce46596c3f2e1eb24@syzkaller.appspotmail.com
Signed-off-by: Todd Kjos <tkjos@google.com>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
There is a race between the binder driver cleaning
up a completed transaction via binder_free_transaction()
and a user calling binder_ioctl(BC_FREE_BUFFER) to
release a buffer. It doesn't matter which is first but
they need to be protected against running concurrently
which can result in a UAF.
Signed-off-by: Todd Kjos <tkjos@google.com>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Based on 1 normalized pattern(s):
this software is licensed under the terms of the gnu general public
license version 2 as published by the free software foundation and
may be copied distributed and modified under those terms this
program is distributed in the hope that it will be useful but
without any warranty without even the implied warranty of
merchantability or fitness for a particular purpose see the gnu
general public license for more details
extracted by the scancode license scanner the SPDX license identifier
GPL-2.0-only
has been chosen to replace the boilerplate/reference in 285 file(s).
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Alexios Zavras <alexios.zavras@intel.com>
Reviewed-by: Allison Randal <allison@lohutok.net>
Cc: linux-spdx@vger.kernel.org
Link: https://lkml.kernel.org/r/20190529141900.642774971@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When allocating space in the target buffer for the security context,
make sure the extra_buffers_size doesn't overflow. This can only
happen if the given size is invalid, but an overflow can turn it
into a valid size. Fail the transaction if an overflow is detected.
Signed-off-by: Todd Kjos <tkjos@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The selinux-testsuite found an issue resulting in a BUG_ON()
where a conditional relied on a size_t going negative when
checking the validity of a buffer offset.
Fixes: 7a67a39320 ("binder: add function to copy binder object from buffer")
Reported-by: Paul Moore <paul@paul-moore.com>
Tested-by: Paul Moore <paul@paul-moore.com>
Signed-off-by: Todd Kjos <tkjos@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Now that alloc->buffer points to the userspace vm_area
rename buffer->data to buffer->user_data and rename
local pointers that hold user addresses. Also use the
"__user" tag to annotate all user pointers so sparse
can flag cases where user pointer vaues are copied to
kernel pointers. Refactor code to use offsets instead
of user pointers.
Signed-off-by: Todd Kjos <tkjos@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Remove user_buffer_offset since there is no kernel
buffer pointer anymore.
Signed-off-by: Todd Kjos <tkjos@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Refactor the functions to validate and fixup struct
binder_buffer pointer objects to avoid using vm_area
pointers. Instead copy to/from kernel space using
binder_alloc_copy_to_buffer() and
binder_alloc_copy_from_buffer(). The following
functions were refactored:
refactor binder_validate_ptr()
binder_validate_fixup()
binder_fixup_parent()
Signed-off-by: Todd Kjos <tkjos@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When creating or tearing down a transaction, the binder driver
examines objects in the buffer and takes appropriate action.
To do this without needing to dereference pointers into the
buffer, the local copies of the objects are needed. This patch
introduces a function to validate and copy binder objects
from the buffer to a local structure.
Signed-off-by: Todd Kjos <tkjos@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Avoid vm_area when copying to or from binder buffers.
Instead, new copy functions are added that copy from
kernel space to binder buffer space. These use
kmap_atomic() and kunmap_atomic() to create temporary
mappings and then memcpy() is used to copy within
that page.
Also, kmap_atomic() / kunmap_atomic() use the appropriate
cache flushing to support VIVT cache architectures.
Allow binder to build if CPU_CACHE_VIVT is defined.
Several uses of the new functions are added here. More
to follow in subsequent patches.
Signed-off-by: Todd Kjos <tkjos@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The binder driver uses a vm_area to map the per-process
binder buffer space. For 32-bit android devices, this is
now taking too much vmalloc space. This patch removes
the use of vm_area when copying the transaction data
from the sender to the buffer space. Instead of using
copy_from_user() for multi-page copies, it now uses
binder_alloc_copy_user_to_buffer() which uses kmap()
and kunmap() to map each page, and uses copy_from_user()
for copying to that page.
Signed-off-by: Todd Kjos <tkjos@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
binderfs should not have a separate device_initcall(). When a kernel is
compiled with CONFIG_ANDROID_BINDERFS register the filesystem alongside
CONFIG_ANDROID_IPC. This use-case is especially sensible when users specify
CONFIG_ANDROID_IPC=y, CONFIG_ANDROID_BINDERFS=y and
ANDROID_BINDER_DEVICES="".
When CONFIG_ANDROID_BINDERFS=n then this always succeeds so there's no
regression potential for legacy workloads.
Signed-off-by: Christian Brauner <christian@brauner.io>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To allow servers to verify client identity, allow a node
flag to be set that causes the sender's security context
to be delivered with the transaction. The BR_TRANSACTION
command is extended in BR_TRANSACTION_SEC_CTX to
contain a pointer to the security context string.
Signed-off-by: Todd Kjos <tkjos@google.com>
Reviewed-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
As discussed at Linux Plumbers Conference 2018 in Vancouver [1] this is the
implementation of binderfs.
/* Abstract */
binderfs is a backwards-compatible filesystem for Android's binder ipc
mechanism. Each ipc namespace will mount a new binderfs instance. Mounting
binderfs multiple times at different locations in the same ipc namespace
will not cause a new super block to be allocated and hence it will be the
same filesystem instance.
Each new binderfs mount will have its own set of binder devices only
visible in the ipc namespace it has been mounted in. All devices in a new
binderfs mount will follow the scheme binder%d and numbering will always
start at 0.
/* Backwards compatibility */
Devices requested in the Kconfig via CONFIG_ANDROID_BINDER_DEVICES for the
initial ipc namespace will work as before. They will be registered via
misc_register() and appear in the devtmpfs mount. Specifically, the
standard devices binder, hwbinder, and vndbinder will all appear in their
standard locations in /dev. Mounting or unmounting the binderfs mount in
the initial ipc namespace will have no effect on these devices, i.e. they
will neither show up in the binderfs mount nor will they disappear when the
binderfs mount is gone.
/* binder-control */
Each new binderfs instance comes with a binder-control device. No other
devices will be present at first. The binder-control device can be used to
dynamically allocate binder devices. All requests operate on the binderfs
mount the binder-control device resides in.
Assuming a new instance of binderfs has been mounted at /dev/binderfs
via mount -t binderfs binderfs /dev/binderfs. Then a request to create a
new binder device can be made as illustrated in [2].
Binderfs devices can simply be removed via unlink().
/* Implementation details */
- dynamic major number allocation:
When binderfs is registered as a new filesystem it will dynamically
allocate a new major number. The allocated major number will be returned
in struct binderfs_device when a new binder device is allocated.
- global minor number tracking:
Minor are tracked in a global idr struct that is capped at
BINDERFS_MAX_MINOR. The minor number tracker is protected by a global
mutex. This is the only point of contention between binderfs mounts.
- struct binderfs_info:
Each binderfs super block has its own struct binderfs_info that tracks
specific details about a binderfs instance:
- ipc namespace
- dentry of the binder-control device
- root uid and root gid of the user namespace the binderfs instance
was mounted in
- mountable by user namespace root:
binderfs can be mounted by user namespace root in a non-initial user
namespace. The devices will be owned by user namespace root.
- binderfs binder devices without misc infrastructure:
New binder devices associated with a binderfs mount do not use the
full misc_register() infrastructure.
The misc_register() infrastructure can only create new devices in the
host's devtmpfs mount. binderfs does however only make devices appear
under its own mountpoint and thus allocates new character device nodes
from the inode of the root dentry of the super block. This will have
the side-effect that binderfs specific device nodes do not appear in
sysfs. This behavior is similar to devpts allocated pts devices and
has no effect on the functionality of the ipc mechanism itself.
[1]: https://goo.gl/JL2tfX
[2]: program to allocate a new binderfs binder device:
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/ioctl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>
#include <linux/android/binder_ctl.h>
int main(int argc, char *argv[])
{
int fd, ret, saved_errno;
size_t len;
struct binderfs_device device = { 0 };
if (argc < 2)
exit(EXIT_FAILURE);
len = strlen(argv[1]);
if (len > BINDERFS_MAX_NAME)
exit(EXIT_FAILURE);
memcpy(device.name, argv[1], len);
fd = open("/dev/binderfs/binder-control", O_RDONLY | O_CLOEXEC);
if (fd < 0) {
printf("%s - Failed to open binder-control device\n",
strerror(errno));
exit(EXIT_FAILURE);
}
ret = ioctl(fd, BINDER_CTL_ADD, &device);
saved_errno = errno;
close(fd);
errno = saved_errno;
if (ret < 0) {
printf("%s - Failed to allocate new binder device\n",
strerror(errno));
exit(EXIT_FAILURE);
}
printf("Allocated new binder device with major %d, minor %d, and "
"name %s\n", device.major, device.minor,
device.name);
exit(EXIT_SUCCESS);
}
Cc: Martijn Coenen <maco@android.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
Acked-by: Todd Kjos <tkjos@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
44d8047f1d ("binder: use standard functions to allocate fds")
exposed a pre-existing issue in the binder driver.
fdget() is used in ksys_ioctl() as a performance optimization.
One of the rules associated with fdget() is that ksys_close() must
not be called between the fdget() and the fdput(). There is a case
where this requirement is not met in the binder driver which results
in the reference count dropping to 0 when the device is still in
use. This can result in use-after-free or other issues.
If userpace has passed a file-descriptor for the binder driver using
a BINDER_TYPE_FDA object, then kys_close() is called on it when
handling a binder_ioctl(BC_FREE_BUFFER) command. This violates
the assumptions for using fdget().
The problem is fixed by deferring the close using task_work_add(). A
new variant of __close_fd() was created that returns a struct file
with a reference. The fput() is deferred instead of using ksys_close().
Fixes: 44d8047f1d ("binder: use standard functions to allocate fds")
Suggested-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Todd Kjos <tkjos@google.com>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When dumping out binder transactions via a debug node,
the output is too verbose if a process has many nodes.
Change the output for transaction dumps to only display
nodes with pending async transactions.
Signed-off-by: Todd Kjos <tkjos@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
We already have the DEFINE_SHOW_ATTRIBUTE.There is no need to define
such a macro,so remove BINDER_DEBUG_ENTRY.
Signed-off-by: Yangtao Li <tiny.windzz@gmail.com>
Acked-by: Todd Kjos <tkjos@android.com>
Reviewed-by: Joey Pabalinas <joeypabalinas@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Add __acquire()/__release() annnotations to fix warnings
in sparse context checking
There is one case where the warning was due to a lack of
a "default:" case in a switch statement where a lock was
being released in each of the cases, so the default
case was added.
Signed-off-by: Todd Kjos <tkjos@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Malicious code can attempt to free buffers using the BC_FREE_BUFFER
ioctl to binder. There are protections against a user freeing a buffer
while in use by the kernel, however there was a window where
BC_FREE_BUFFER could be used to free a recently allocated buffer that
was not completely initialized. This resulted in a use-after-free
detected by KASAN with a malicious test program.
This window is closed by setting the buffer's allow_user_free attribute
to 0 when the buffer is allocated or when the user has previously freed
it instead of waiting for the caller to set it. The problem was that
when the struct buffer was recycled, allow_user_free was stale and set
to 1 allowing a free to go through.
Signed-off-by: Todd Kjos <tkjos@google.com>
Acked-by: Arve Hjønnevåg <arve@android.com>
Cc: stable <stable@vger.kernel.org> # 4.14
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Fixes the following sparse warning:
drivers/android/binder.c:3312:1: warning:
symbol 'binder_free_buf' was not declared. Should it be static?
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This allows the context manager to retrieve information about nodes
that it holds a reference to, such as the current number of
references to those nodes.
Such information can for example be used to determine whether the
servicemanager is the only process holding a reference to a node.
This information can then be passed on to the process holding the
node, which can in turn decide whether it wants to shut down to
reduce resource usage.
Signed-off-by: Martijn Coenen <maco@android.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Binder uses internal fs interfaces to allocate and install fds:
__alloc_fd
__fd_install
__close_fd
get_files_struct
put_files_struct
These were used to support the passing of fds between processes
as part of a transaction. The actual allocation and installation
of the fds in the target process was handled by the sending
process so the standard functions, alloc_fd() and fd_install()
which assume task==current couldn't be used.
This patch refactors this mechanism so that the fds are
allocated and installed by the target process allowing the
standard functions to be used.
The sender now creates a list of fd fixups that contains the
struct *file and the address to fixup with the new fd once
it is allocated. This list is processed by the target process
when the transaction is dequeued.
A new error case is introduced by this change. If an async
transaction with file descriptors cannot allocate new
fds in the target (probably due to out of file descriptors),
the transaction is discarded with a log message. In the old
implementation this would have been detected in the sender
context and failed prior to sending.
Signed-off-by: Todd Kjos <tkjos@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When a process dies, failed reply is sent to the sender of any transaction
queued on a dead thread's todo list. The sender asserts that the
received failed reply corresponds to the head of the transaction stack.
This assert can fail if the dead thread is allowed to send outgoing
transactions when there is already a transaction on its todo list,
because this new transaction can end up on the transaction stack of the
original sender. The following steps illustrate how this assertion can
fail.
1. Thread1 sends txn19 to Thread2
(T1->transaction_stack=txn19, T2->todo+=txn19)
2. Without processing todo list, Thread2 sends txn20 to Thread1
(T1->todo+=txn20, T2->transaction_stack=txn20)
3. T1 processes txn20 on its todo list
(T1->transaction_stack=txn20->txn19, T1->todo=<empty>)
4. T2 dies, T2->todo cleanup attempts to send failed reply for txn19, but
T1->transaction_stack points to txn20 -- assertion failes
Step 2. is the incorrect behavior. When there is a transaction on a
thread's todo list, this thread should not be able to send any outgoing
synchronous transactions. Only the head of the todo list needs to be
checked because only threads that are waiting for proc work can directly
receive work from another thread, and no work is allowed to be queued
on such a thread without waking up the thread. This patch also enforces
that a thread is not waiting for proc work when a work is directly
enqueued to its todo list.
Acked-by: Arve Hjønnevåg <arve@android.com>
Signed-off-by: Sherry Yang <sherryy@android.com>
Reviewed-by: Martijn Coenen <maco@android.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
If asm/cacheflush.h is included first, the following build warnings are
seen with sparc32 builds.
In file included from arch/sparc/include/asm/cacheflush.h:11:0,
from drivers/android/binder.c:54:
arch/sparc/include/asm/cacheflush_32.h:40:37: warning:
'struct page' declared inside parameter list will not be visible
outside of this definition or declaration
Moving the asm/ include after linux/ includes solves the problem.
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Use new return type vm_fault_t for fault handler in
struct vm_operations_struct. For now, this is just
documenting that the function returns a VM_FAULT
value rather than an errno. Once all instances are
converted, vm_fault_t will become a distinct type.
Reference id -> 1c8f422059 ("mm: change return type
to vm_fault_t")
Signed-off-by: Souptick Joarder <jrdr.linux@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
binder_update_page_range needs down_write of mmap_sem because
vm_insert_page need to change vma->vm_flags to VM_MIXEDMAP unless
it is set. However, when I profile binder working, it seems
every binder buffers should be mapped in advance by binder_mmap.
It means we could set VM_MIXEDMAP in binder_mmap time which is
already hold a mmap_sem as down_write so binder_update_page_range
doesn't need to hold a mmap_sem as down_write.
Please use proper API down_read. It would help mmap_sem contention
problem as well as fixing down_write abuse.
Ganesh Mahendran tested app launching and binder throughput test
and he said he couldn't find any problem and I did binder latency
test per Greg KH request(Thanks Martijn to teach me how I can do)
I cannot find any problem, too.
Cc: Ganesh Mahendran <opensource.ganesh@gmail.com>
Cc: Joe Perches <joe@perches.com>
Cc: Arve Hjønnevåg <arve@android.com>
Cc: Todd Kjos <tkjos@google.com>
Reviewed-by: Martijn Coenen <maco@android.com>
Signed-off-by: Minchan Kim <minchan@kernel.org>
Reviewed-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When to execute binder_stat_br the e->cmd has been modifying as BR_OK
instead of the original return error cmd, in fact we want to know the
original return error, such as BR_DEAD_REPLY or BR_FAILED_REPLY, etc.
instead of always BR_OK, in order to avoid the value of the e->cmd is
always BR_OK, so we need assign the value of the e->cmd to cmd before
e->cmd = BR_OK.
Signed-off-by: songjinshi <songjinshi@xiaomi.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
New devices launching with Android P need to use the 64-bit
binder interface, even on 32-bit SoCs [0].
This change removes the Kconfig option to select the 32-bit
binder interface. We don't think this will affect existing
userspace for the following reasons:
1) The latest Android common tree is 4.14, so we don't
believe any Android devices are on kernels >4.14.
2) Android devices launch on an LTS release and stick with
it, so we wouldn't expect devices running on <= 4.14 now
to upgrade to 4.17 or later. But even if they did, they'd
rebuild the world (kernel + userspace) anyway.
3) Other userspaces like 'anbox' are already using the
64-bit interface.
Note that this change doesn't remove the 32-bit UAPI
itself; the reason for that is that Android userspace
always uses the latest UAPI headers from upstream, and
userspace retains 32-bit support for devices that are
upgrading. This will be removed as well in 2-3 years,
at which point we can remove the code from the UAPI
as well.
Finally, this change introduces build errors on archs where
64-bit get_user/put_user is not supported, so make binder
unavailable on m68k (which wouldn't want it anyway).
[0]: https://android-review.googlesource.com/c/platform/build/+/595193
Signed-off-by: Martijn Coenen <maco@android.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
It doesn't make any difference to runtime but I've switched these two
checks to make my static checker happy.
The problem is that "buffer->data_size" is user controlled and if it's
less than "sizeo(*hdr)" then that means "offset" can be more than
"buffer->data_size". It's just cleaner to check it in the other order.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Martijn Coenen <maco@android.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This can't happen with normal nodes (because you can't get a ref
to a node you own), but it could happen with the context manager;
to make the behavior consistent with regular nodes, reject
transactions into the context manager by the process owning it.
Reported-by: syzbot+09e05aba06723a94d43d@syzkaller.appspotmail.com
Signed-off-by: Martijn Coenen <maco@android.com>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To prevent races with ep_remove_waitqueue() removing the
waitqueue at the same time.
Reported-by: syzbot+a2a3c4909716e271487e@syzkaller.appspotmail.com
Signed-off-by: Martijn Coenen <maco@android.com>
Cc: stable <stable@vger.kernel.org> # 4.14+
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The format specifier "%p" can leak kernel addresses. Use
"%pK" instead. There were 4 remaining cases in binder.c.
Signed-off-by: Todd Kjos <tkjos@google.com>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
binder_send_failed_reply() is called when a synchronous
transaction fails. It reports an error to the thread that
is waiting for the completion. Given that the transaction
is synchronous, there should never be more than 1 error
response to that thread -- this was being asserted with
a WARN().
However, when exercising the driver with syzbot tests, cases
were observed where multiple "synchronous" requests were
sent without waiting for responses, so it is possible that
multiple errors would be reported to the thread. This testing
was conducted with panic_on_warn set which forced the crash.
This is easily reproduced by sending back-to-back
"synchronous" transactions without checking for any
response (eg, set read_size to 0):
bwr.write_buffer = (uintptr_t)&bc1;
bwr.write_size = sizeof(bc1);
bwr.read_buffer = (uintptr_t)&br;
bwr.read_size = 0;
ioctl(fd, BINDER_WRITE_READ, &bwr);
sleep(1);
bwr2.write_buffer = (uintptr_t)&bc2;
bwr2.write_size = sizeof(bc2);
bwr2.read_buffer = (uintptr_t)&br;
bwr2.read_size = 0;
ioctl(fd, BINDER_WRITE_READ, &bwr2);
sleep(1);
The first transaction is sent to the servicemanager and the reply
fails because no VMA is set up by this client. After
binder_send_failed_reply() is called, the BINDER_WORK_RETURN_ERROR
is sitting on the thread's todo list since the read_size was 0 and
the client is not waiting for a response.
The 2nd transaction is sent and the BINDER_WORK_RETURN_ERROR has not
been consumed, so the thread's reply_error.cmd is still set (normally
cleared when the BINDER_WORK_RETURN_ERROR is handled). Therefore
when the servicemanager attempts to reply to the 2nd failed
transaction, the error is already set and it triggers this warning.
This is a user error since it is not waiting for the synchronous
transaction to complete. If it ever does check, it will see an
error.
Changed the WARN() to a pr_warn().
Signed-off-by: Todd Kjos <tkjos@android.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
If the kzalloc() in binder_get_thread() fails, binder_poll()
dereferences the resulting NULL pointer.
Fix it by returning POLLERR if the memory allocation failed.
This bug was found by syzkaller using fault injection.
Reported-by: syzbot <syzkaller@googlegroups.com>
Fixes: 457b9a6f09 ("Staging: android: add binder driver")
Cc: stable@vger.kernel.org
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This is the mindless scripted replacement of kernel use of POLL*
variables as described by Al, done by this script:
for V in IN OUT PRI ERR RDNORM RDBAND WRNORM WRBAND HUP RDHUP NVAL MSG; do
L=`git grep -l -w POLL$V | grep -v '^t' | grep -v /um/ | grep -v '^sa' | grep -v '/poll.h$'|grep -v '^D'`
for f in $L; do sed -i "-es/^\([^\"]*\)\(\<POLL$V\>\)/\\1E\\2/" $f; done
done
with de-mangling cleanups yet to come.
NOTE! On almost all architectures, the EPOLL* constants have the same
values as the POLL* constants do. But they keyword here is "almost".
For various bad reasons they aren't the same, and epoll() doesn't
actually work quite correctly in some cases due to this on Sparc et al.
The next patch from Al will sort out the final differences, and we
should be all done.
Scripted-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Here is the big pull request for char/misc drivers for 4.16-rc1.
There's a lot of stuff in here. Three new driver subsystems were added
for various types of hardware busses:
- siox
- slimbus
- soundwire
as well as a new vboxguest subsystem for the VirtualBox hypervisor
drivers.
There's also big updates from the FPGA subsystem, lots of Android binder
fixes, the usual handful of hyper-v updates, and lots of other smaller
driver updates.
All of these have been in linux-next for a long time, with no reported
issues.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCWnLuZw8cZ3JlZ0Brcm9h
aC5jb20ACgkQMUfUDdst+ynS4QCcCrPmwfD5PJwaF+q2dPfyKaflkQMAn0x6Wd+u
Gw3Z2scgjETUpwJ9ilnL
=xcQ0
-----END PGP SIGNATURE-----
Merge tag 'char-misc-4.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc
Pull char/misc driver updates from Greg KH:
"Here is the big pull request for char/misc drivers for 4.16-rc1.
There's a lot of stuff in here. Three new driver subsystems were added
for various types of hardware busses:
- siox
- slimbus
- soundwire
as well as a new vboxguest subsystem for the VirtualBox hypervisor
drivers.
There's also big updates from the FPGA subsystem, lots of Android
binder fixes, the usual handful of hyper-v updates, and lots of other
smaller driver updates.
All of these have been in linux-next for a long time, with no reported
issues"
* tag 'char-misc-4.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (155 commits)
char: lp: use true or false for boolean values
android: binder: use VM_ALLOC to get vm area
android: binder: Use true and false for boolean values
lkdtm: fix handle_irq_event symbol for INT_HW_IRQ_EN
EISA: Delete error message for a failed memory allocation in eisa_probe()
EISA: Whitespace cleanup
misc: remove AVR32 dependencies
virt: vbox: Add error mapping for VERR_INVALID_NAME and VERR_NO_MORE_FILES
soundwire: Fix a signedness bug
uio_hv_generic: fix new type mismatch warnings
uio_hv_generic: fix type mismatch warnings
auxdisplay: img-ascii-lcd: add missing MODULE_DESCRIPTION/AUTHOR/LICENSE
uio_hv_generic: add rescind support
uio_hv_generic: check that host supports monitor page
uio_hv_generic: create send and receive buffers
uio: document uio_hv_generic regions
doc: fix documentation about uio_hv_generic
vmbus: add monitor_id and subchannel_id to sysfs per channel
vmbus: fix ABI documentation
uio_hv_generic: use ISR callback method
...
Pull poll annotations from Al Viro:
"This introduces a __bitwise type for POLL### bitmap, and propagates
the annotations through the tree. Most of that stuff is as simple as
'make ->poll() instances return __poll_t and do the same to local
variables used to hold the future return value'.
Some of the obvious brainos found in process are fixed (e.g. POLLIN
misspelled as POLL_IN). At that point the amount of sparse warnings is
low and most of them are for genuine bugs - e.g. ->poll() instance
deciding to return -EINVAL instead of a bitmap. I hadn't touched those
in this series - it's large enough as it is.
Another problem it has caught was eventpoll() ABI mess; select.c and
eventpoll.c assumed that corresponding POLL### and EPOLL### were
equal. That's true for some, but not all of them - EPOLL### are
arch-independent, but POLL### are not.
The last commit in this series separates userland POLL### values from
the (now arch-independent) kernel-side ones, converting between them
in the few places where they are copied to/from userland. AFAICS, this
is the least disruptive fix preserving poll(2) ABI and making epoll()
work on all architectures.
As it is, it's simply broken on sparc - try to give it EPOLLWRNORM and
it will trigger only on what would've triggered EPOLLWRBAND on other
architectures. EPOLLWRBAND and EPOLLRDHUP, OTOH, are never triggered
at all on sparc. With this patch they should work consistently on all
architectures"
* 'misc.poll' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (37 commits)
make kernel-side POLL... arch-independent
eventpoll: no need to mask the result of epi_item_poll() again
eventpoll: constify struct epoll_event pointers
debugging printk in sg_poll() uses %x to print POLL... bitmap
annotate poll(2) guts
9p: untangle ->poll() mess
->si_band gets POLL... bitmap stored into a user-visible long field
ring_buffer_poll_wait() return value used as return value of ->poll()
the rest of drivers/*: annotate ->poll() instances
media: annotate ->poll() instances
fs: annotate ->poll() instances
ipc, kernel, mm: annotate ->poll() instances
net: annotate ->poll() instances
apparmor: annotate ->poll() instances
tomoyo: annotate ->poll() instances
sound: annotate ->poll() instances
acpi: annotate ->poll() instances
crypto: annotate ->poll() instances
block: annotate ->poll() instances
x86: annotate ->poll() instances
...