mirror of
git://git.yoctoproject.org/linux-yocto.git
synced 2025-08-22 00:42:01 +02:00
v6.6.100
355 Commits
Author | SHA1 | Message | Date | |
---|---|---|---|---|
![]() |
20fb292ab5 |
cgroup: Fix compilation issue due to cgroup_mutex not being exported
[ Upstream commit
|
||
![]() |
a00e607102 |
cgroup: fix race between fork and cgroup.kill
commit |
||
![]() |
9e67b05419 |
cgroup/bpf: only cgroup v2 can be attached by bpf programs
[ Upstream commit |
||
![]() |
92031d6601 |
Revert "cgroup: Fix memory leak caused by missing cgroup_bpf_offline"
[ Upstream commit |
||
![]() |
fb384669cb |
cgroup: Fix potential overflow issue when checking max_depth
[ Upstream commit |
||
![]() |
84a6b76b28 |
cgroup: Protect css->cgroup write under css_set_lock
[ Upstream commit
|
||
![]() |
a2225b7af5 |
cgroup: Avoid extra dereference in css_populate_dir()
[ Upstream commit
|
||
![]() |
dd9542ae7c |
cgroup: Make operations on the cgroup root_list RCU safe
commit
|
||
![]() |
aea95c68b7 |
kernfs: Convert kernfs_path_from_node_locked() from strlcpy() to strscpy()
[ Upstream commit |
||
![]() |
04d2fea2b7 |
sched: psi: fix unprivileged polling against cgroups
commit |
||
![]() |
76be05d4fd |
cgroup: fix build when CGROUP_SCHED is not enabled
Sudip Mukherjee reports that the mips sb1250_swarm_defconfig build fails
with the current kernel. It isn't actually MIPS-specific, it's just
that that defconfig does not have CGROUP_SCHED enabled like most configs
do, and as such shows this error:
kernel/cgroup/cgroup.c: In function 'cgroup_local_stat_show':
kernel/cgroup/cgroup.c:3699:15: error: implicit declaration of function 'cgroup_tryget_css'; did you mean 'cgroup_tryget'? [-Werror=implicit-function-declaration]
3699 | css = cgroup_tryget_css(cgrp, ss);
| ^~~~~~~~~~~~~~~~~
| cgroup_tryget
kernel/cgroup/cgroup.c:3699:13: warning: assignment to 'struct cgroup_subsys_state *' from 'int' makes pointer from integer without a cast [-Wint-conversion]
3699 | css = cgroup_tryget_css(cgrp, ss);
| ^
because cgroup_tryget_css() only exists when CGROUP_SCHED is enabled,
and the cgroup_local_stat_show() function should similarly be guarded by
that config option.
Move things around a bit to fix this all.
Fixes:
|
||
![]() |
7716f383a5 |
cgroup: Changes for v6.6
* Per-cpu cpu usage stats are now tracked. This currently isn't printed out
in the cgroupfs interface and can only be accessed through e.g. BPF.
Should decide on a not-too-ugly way to show per-cpu stats in cgroupfs.
* cpuset received some cleanups and prepatory patches for the pending
cpus.exclusive patchset which will allow cpuset partitions to be created
below non-partition parents, which should ease the management of partition
cpusets.
* A lot of code and documentation cleanup patches.
* tools/testing/selftests/cgroup/test_cpuset.c is added. This causes trivial
conflicts in .gitignore and Makefile under the directory against
|
||
![]() |
78d44b824e |
cgroup: Avoid -Wstringop-overflow warnings
Change the notation from pointer-to-array to pointer-to-pointer. With this, we avoid the compiler complaining about trying to access a region of size zero as an argument during function calls. This is a workaround to prevent the compiler complaining about accessing an array of size zero when evaluating the arguments of a couple of function calls. See below: kernel/cgroup/cgroup.c: In function 'find_css_set': kernel/cgroup/cgroup.c:1206:16: warning: 'find_existing_css_set' accessing 4 bytes in a region of size 0 [-Wstringop-overflow=] 1206 | cset = find_existing_css_set(old_cset, cgrp, template); | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ kernel/cgroup/cgroup.c:1206:16: note: referencing argument 3 of type 'struct cgroup_subsys_state *[0]' kernel/cgroup/cgroup.c:1071:24: note: in a call to function 'find_existing_css_set' 1071 | static struct css_set *find_existing_css_set(struct css_set *old_cset, | ^~~~~~~~~~~~~~~~~~~~~ With the change to pointer-to-pointer, the functions are not prevented from being executed, and they will do what they have to do when CGROUP_SUBSYS_COUNT == 0. Address the following -Wstringop-overflow warnings seen when built with ARM architecture and aspeed_g4_defconfig configuration (notice that under this configuration CGROUP_SUBSYS_COUNT == 0): kernel/cgroup/cgroup.c:1208:16: warning: 'find_existing_css_set' accessing 4 bytes in a region of size 0 [-Wstringop-overflow=] kernel/cgroup/cgroup.c:1258:15: warning: 'css_set_hash' accessing 4 bytes in a region of size 0 [-Wstringop-overflow=] kernel/cgroup/cgroup.c:6089:18: warning: 'css_set_hash' accessing 4 bytes in a region of size 0 [-Wstringop-overflow=] kernel/cgroup/cgroup.c:6153:18: warning: 'css_set_hash' accessing 4 bytes in a region of size 0 [-Wstringop-overflow=] This results in no differences in binary output. Link: https://github.com/KSPP/linux/issues/316 Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org> Reviewed-by: Kees Cook <keescook@chromium.org> Signed-off-by: Tejun Heo <tj@kernel.org> |
||
![]() |
7f828eacc4 |
cgroup: fix obsolete function name in cgroup_destroy_locked()
Since commit
|
||
![]() |
a2c15fece4 |
cgroup: fix obsolete function name above css_free_rwork_fn()
Since commit
|
||
![]() |
55a5956a55 |
cgroup: clean up printk()
Convert the only printk() to use pr_*() helper. No functional change. Signed-off-by: Kamalesh Babulal <kamalesh.babulal@oracle.com> Signed-off-by: Tejun Heo <tj@kernel.org> |
||
![]() |
a3fdeeb3f1 |
cgroup: fix obsolete comment above cgroup_create()
Since commit
|
||
![]() |
752182b24b |
Linux 6.5-rc2
-----BEGIN PGP SIGNATURE----- iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAmS0at0eHHRvcnZhbGRz QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiG6m8H/RZCd2DWeM94CgK5 DIxNLxu90PrEcrOnqeHFJtQoSiUQTHeseh9E4BH0JdPDxtlU89VwYRAevseWiVkp JyyJPB40UR4i2DO0P1+oWBBsGEG+bo8lZ1M+uxU5k6lgC0fAi96/O48mwwmI0Mtm P6BkWd3IkSXc7l4AGIrKa5P+pYEnm0Z6YAZILfdMVBcLXZWv1OAwEEkZNdUuhE3d 5DxlQ8MLzlQIbe+LasiwdL9r606acFaJG9Hewl9x5DBlsObZ3d2rX4vEt1NEVh89 shV29xm2AjCpLh4kstJFdTDCkDw/4H7TcFB/NMxXyzEXp3Bx8YXq+mjVd2mpq1FI hHtCsOA= =KiaU -----END PGP SIGNATURE----- Merge tag 'v6.5-rc2' into sched/core, to pick up fixes Sync with upstream fixes before applying EEVDF. Signed-off-by: Ingo Molnar <mingo@kernel.org> |
||
![]() |
c25ff4b911 |
cgroup: remove cgrp->kn check in css_populate_dir()
cgroup_create() creates cgrp and assigns the kernfs_node to cgrp->kn, then cgroup_mkdir() populates base and csses cft file by calling css_populate_dir() and cgroup_apply_control_enable() with a valid cgrp->kn. Check for NULL cgrp->kn, will always be false, remove it. Signed-off-by: Kamalesh Babulal <kamalesh.babulal@oracle.com> Signed-off-by: Tejun Heo <tj@kernel.org> |
||
![]() |
6f71780e7f |
cgroup: fix obsolete function name
cgroup_taskset_migrate() has been renamed to cgroup_migrate_execute() since
commit
|
||
![]() |
fcbb485d9f |
cgroup: use cached local variable parent in for loop
Use local variable parent to initialize iter tcgrp in for loop so the size of cgroup.o can be reduced by 64 bytes. No functional change intended. Signed-off-by: Miaohe Lin <linmiaohe@huawei.com> Signed-off-by: Tejun Heo <tj@kernel.org> |
||
![]() |
677ea015f2 |
sched: add throttled time stat for throttled children
We currently export the total throttled time for cgroups that are given a bandwidth limit. This patch extends this accounting to also account the total time that each children cgroup has been throttled. This is useful to understand the degree to which children have been affected by the throttling control. Children which are not runnable during the entire throttled period, for example, will not show any self-throttling time during this period. Expose this in a new interface, 'cpu.stat.local', which is similar to how non-hierarchical events are accounted in 'memory.events.local'. Signed-off-by: Josh Don <joshdon@google.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Tejun Heo <tj@kernel.org> Link: https://lore.kernel.org/r/20230620183247.737942-2-joshdon@google.com |
||
![]() |
d1d4ff5d11 |
cgroup: put cgroup_tryget_css() inside CONFIG_CGROUP_SCHED
Put cgroup_tryget_css() inside CONFIG_CGROUP_SCHED to fix the warning of 'cgroup_tryget_css' defined but not used [-Wunused-function] when CONFIG_CGROUP_SCHED is disabled. Signed-off-by: Miaohe Lin <linmiaohe@huawei.com> Reviewed-by: Kamalesh Babulal <kamalesh.babulal@oracle.com> Signed-off-by: Tejun Heo <tj@kernel.org> |
||
![]() |
868f87b375 |
cgroup: fix obsolete comment above for_each_css()
cgroup_tree_mutex is removed since commit
|
||
![]() |
1299eb2b0a |
cgroup: minor cleanup for cgroup_extra_stat_show()
Make it under CONFIG_CGROUP_SCHED to rid of __maybe_unused annotation. And further fetch cgrp inside cgroup_extra_stat_show() directly to rid of __maybe_unused annotation of cgrp. No functional change intended. Signed-off-by: Miaohe Lin <linmiaohe@huawei.com> Signed-off-by: Tejun Heo <tj@kernel.org> |
||
![]() |
2246ca53d7 |
cgroup: remove unneeded return value of cgroup_rm_cftypes_locked()
The return value of cgroup_rm_cftypes_locked() is always 0. So remove it to simplify the code. No functional change intended. Signed-off-by: Miaohe Lin <linmiaohe@huawei.com> Reviewed-by: Michal Koutný <mkoutny@suse.com> Signed-off-by: Tejun Heo <tj@kernel.org> |
||
![]() |
aff037078e |
sched/psi: use kernfs polling functions for PSI trigger polling
Destroying psi trigger in cgroup_file_release causes UAF issues when a cgroup is removed from under a polling process. This is happening because cgroup removal causes a call to cgroup_file_release while the actual file is still alive. Destroying the trigger at this point would also destroy its waitqueue head and if there is still a polling process on that file accessing the waitqueue, it will step on the freed pointer: do_select vfs_poll do_rmdir cgroup_rmdir kernfs_drain_open_files cgroup_file_release cgroup_pressure_release psi_trigger_destroy wake_up_pollfree(&t->event_wait) // vfs_poll is unblocked synchronize_rcu kfree(t) poll_freewait -> UAF access to the trigger's waitqueue head Patch [1] fixed this issue for epoll() case using wake_up_pollfree(), however the same issue exists for synchronous poll() case. The root cause of this issue is that the lifecycles of the psi trigger's waitqueue and of the file associated with the trigger are different. Fix this by using kernfs_generic_poll function when polling on cgroup-specific psi triggers. It internally uses kernfs_open_node->poll waitqueue head with its lifecycle tied to the file's lifecycle. This also renders the fix in [1] obsolete, so revert it. [1] commit |
||
![]() |
6e2332e0ab |
cgroup: Changes for v6.5
* Whenever cpuset needs to rebuild sched_domain, it walked all tasks looking for DEADLINE tasks as they need to be accounted on the new domain. Walking all tasks can be expensive and there may not be any DEADLINE tasks at all. Task iteration is now omitted if there are no DEADLINE tasks. * Fixes DEADLINE bandwidth misaccounting after task migration failures. * When no controller is enabled, -Wstringop-overflow warning is triggered. The fix patch added an early exit which is too eager and got reverted for now. Will fix later. * Everything else are minor cleanups. -----BEGIN PGP SIGNATURE----- iIQEABYIACwWIQTfIjM1kS57o3GsC/uxYfJx3gVYGQUCZJoRHw4cdGpAa2VybmVs Lm9yZwAKCRCxYfJx3gVYGZatAQCKTv8pb5HEgochph4n26laSdVZs6ce3Y+s7V1T rum+3QD/TyJFmCkZSMscolZGFuafpg41sjPbmc4SexeuAMYCMgY= =nioD -----END PGP SIGNATURE----- Merge tag 'cgroup-for-6.5' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup Pull cgroup updates from Tejun Heo: - Whenever cpuset needs to rebuild sched_domain, it walked all tasks looking for DEADLINE tasks as they need to be accounted on the new domain. Walking all tasks can be expensive and there may not be any DEADLINE tasks at all. Task iteration is now omitted if there are no DEADLINE tasks - Fixes DEADLINE bandwidth misaccounting after task migration failures - When no controller is enabled, -Wstringop-overflow warning is triggered. The fix patch added an early exit which is too eager and got reverted for now. Will fix later - Everything else is minor cleanups * tag 'cgroup-for-6.5' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup: Revert "cgroup: Avoid -Wstringop-overflow warnings" cgroup/misc: Expose misc.current on cgroup v2 root cgroup: Avoid -Wstringop-overflow warnings cgroup: remove obsolete comment on cgroup_on_dfl() cgroup: remove unused task_cgroup_path() cgroup/cpuset: remove unneeded header files cgroup: make cgroup_is_threaded() and cgroup_is_thread_root() static rdmacg: fix kernel-doc warnings in rdmacg cgroup: Replace the css_set call with cgroup_get cgroup: remove unused macro for_each_e_css() cgroup: Update out-of-date comment in cgroup_migrate() cgroup: Replace all non-returning strlcpy with strscpy cgroup/cpuset: remove unneeded header files cgroup/cpuset: Free DL BW in case can_attach() fails sched/deadline: Create DL BW alloc, free & check overflow interface cgroup/cpuset: Iterate only if DEADLINE tasks are present sched/cpuset: Keep track of SCHED_DEADLINE task in cpusets sched/cpuset: Bring back cpuset_mutex cgroup/cpuset: Rename functions dealing with DEADLINE accounting |
||
![]() |
ed3b7923a8 |
Scheduler changes for v6.5:
- Scheduler SMP load-balancer improvements: - Avoid unnecessary migrations within SMT domains on hybrid systems. Problem: On hybrid CPU systems, (processors with a mixture of higher-frequency SMT cores and lower-frequency non-SMT cores), under the old code lower-priority CPUs pulled tasks from the higher-priority cores if more than one SMT sibling was busy - resulting in many unnecessary task migrations. Solution: The new code improves the load balancer to recognize SMT cores with more than one busy sibling and allows lower-priority CPUs to pull tasks, which avoids superfluous migrations and lets lower-priority cores inspect all SMT siblings for the busiest queue. - Implement the 'runnable boosting' feature in the EAS balancer: consider CPU contention in frequency, EAS max util & load-balance busiest CPU selection. This improves CPU utilization for certain workloads, while leaves other key workloads unchanged. - Scheduler infrastructure improvements: - Rewrite the scheduler topology setup code by consolidating it into the build_sched_topology() helper function and building it dynamically on the fly. - Resolve the local_clock() vs. noinstr complications by rewriting the code: provide separate sched_clock_noinstr() and local_clock_noinstr() functions to be used in instrumentation code, and make sure it is all instrumentation-safe. - Fixes: - Fix a kthread_park() race with wait_woken() - Fix misc wait_task_inactive() bugs unearthed by the -rt merge: - Fix UP PREEMPT bug by unifying the SMP and UP implementations. - Fix task_struct::saved_state handling. - Fix various rq clock update bugs, unearthed by turning on the rq clock debugging code. - Fix the PSI WINDOW_MIN_US trigger limit, which was easy to trigger by creating enough cgroups, by removing the warnign and restricting window size triggers to PSI file write-permission or CAP_SYS_RESOURCE. - Propagate SMT flags in the topology when removing degenerate domain - Fix grub_reclaim() calculation bug in the deadline scheduler code - Avoid resetting the min update period when it is unnecessary, in psi_trigger_destroy(). - Don't balance a task to its current running CPU in load_balance(), which was possible on certain NUMA topologies with overlapping groups. - Fix the sched-debug printing of rq->nr_uninterruptible - Cleanups: - Address various -Wmissing-prototype warnings, as a preparation to (maybe) enable this warning in the future. - Remove unused code - Mark more functions __init - Fix shadow-variable warnings Signed-off-by: Ingo Molnar <mingo@kernel.org> -----BEGIN PGP SIGNATURE----- iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAmSatWQRHG1pbmdvQGtl cm5lbC5vcmcACgkQEnMQ0APhK1j62xAAuGOx1LcDfRGC6WGQzp1zOdlsVQtnDvlS qL58zYSHgizprpVQ3j87SBaG4CHCdvd2Bo36yW0lNZS4nd203qdq7fkrMb3hPP/w egUQUzMegf5fF6BWldKeMjuHSt+twFQz/ZAKK8iSbAir6CHNAqbNst1oL0i/+Tyk o33hBs1hT5tnbFb1NSVZkX4k+qT3LzTW4K2QgjjGtkScr6yHh2BdEVefyigWOjdo 9s02d00ll9a2r+F5txlN7Dnw6TN7rmTXGMOJU5bZvBE90/anNiAorMXHJdEKCyUR u9+JtBdJWiCplGa/tSRcxT16ZW1VdtTnd9q66TDhXREd2UNDFqBEyg5Wl77K4Tlf vKFajmj/to+cTbuv6m6TVR+zyXpdEpdL6F04P44U3qiJvDobBqeDNKHHIqpmbHXl AXUXcPWTVAzXX1Ce5M+BeAgTBQ1T7C5tELILrTNQHJvO1s9VVBRFZ/l65Ps4vu7T wIZ781IFuopk0zWqHovNvgKrJ7oFmOQQZFttQEe8n6nafkjI7u+IZ8FayiGaUMRr 4GawFGUCEdYh8z9qyslGKe8Q/Rphfk6hxMFRYUJpDmubQ0PkMeDjDGq77jDGl1PF VqwSDEyOaBJs7Gqf/mem00JtzBmXhkhm1SEjggHMI2IQbr/eeBXoLQOn3CDapO/N PiDbtX760ic= =EWQA -----END PGP SIGNATURE----- Merge tag 'sched-core-2023-06-27' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler updates from Ingo Molnar: "Scheduler SMP load-balancer improvements: - Avoid unnecessary migrations within SMT domains on hybrid systems. Problem: On hybrid CPU systems, (processors with a mixture of higher-frequency SMT cores and lower-frequency non-SMT cores), under the old code lower-priority CPUs pulled tasks from the higher-priority cores if more than one SMT sibling was busy - resulting in many unnecessary task migrations. Solution: The new code improves the load balancer to recognize SMT cores with more than one busy sibling and allows lower-priority CPUs to pull tasks, which avoids superfluous migrations and lets lower-priority cores inspect all SMT siblings for the busiest queue. - Implement the 'runnable boosting' feature in the EAS balancer: consider CPU contention in frequency, EAS max util & load-balance busiest CPU selection. This improves CPU utilization for certain workloads, while leaves other key workloads unchanged. Scheduler infrastructure improvements: - Rewrite the scheduler topology setup code by consolidating it into the build_sched_topology() helper function and building it dynamically on the fly. - Resolve the local_clock() vs. noinstr complications by rewriting the code: provide separate sched_clock_noinstr() and local_clock_noinstr() functions to be used in instrumentation code, and make sure it is all instrumentation-safe. Fixes: - Fix a kthread_park() race with wait_woken() - Fix misc wait_task_inactive() bugs unearthed by the -rt merge: - Fix UP PREEMPT bug by unifying the SMP and UP implementations - Fix task_struct::saved_state handling - Fix various rq clock update bugs, unearthed by turning on the rq clock debugging code. - Fix the PSI WINDOW_MIN_US trigger limit, which was easy to trigger by creating enough cgroups, by removing the warnign and restricting window size triggers to PSI file write-permission or CAP_SYS_RESOURCE. - Propagate SMT flags in the topology when removing degenerate domain - Fix grub_reclaim() calculation bug in the deadline scheduler code - Avoid resetting the min update period when it is unnecessary, in psi_trigger_destroy(). - Don't balance a task to its current running CPU in load_balance(), which was possible on certain NUMA topologies with overlapping groups. - Fix the sched-debug printing of rq->nr_uninterruptible Cleanups: - Address various -Wmissing-prototype warnings, as a preparation to (maybe) enable this warning in the future. - Remove unused code - Mark more functions __init - Fix shadow-variable warnings" * tag 'sched-core-2023-06-27' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (50 commits) sched/core: Avoid multiple calling update_rq_clock() in __cfsb_csd_unthrottle() sched/core: Avoid double calling update_rq_clock() in __balance_push_cpu_stop() sched/core: Fixed missing rq clock update before calling set_rq_offline() sched/deadline: Update GRUB description in the documentation sched/deadline: Fix bandwidth reclaim equation in GRUB sched/wait: Fix a kthread_park race with wait_woken() sched/topology: Mark set_sched_topology() __init sched/fair: Rename variable cpu_util eff_util arm64/arch_timer: Fix MMIO byteswap sched/fair, cpufreq: Introduce 'runnable boosting' sched/fair: Refactor CPU utilization functions cpuidle: Use local_clock_noinstr() sched/clock: Provide local_clock_noinstr() x86/tsc: Provide sched_clock_noinstr() clocksource: hyper-v: Provide noinstr sched_clock() clocksource: hyper-v: Adjust hv_read_tsc_page_tsc() to avoid special casing U64_MAX x86/vdso: Fix gettimeofday masking math64: Always inline u128 version of mul_u64_u64_shr() s390/time: Provide sched_clock_noinstr() loongarch: Provide noinstr sched_clock_read() ... |
||
![]() |
81621430c8 |
Revert "cgroup: Avoid -Wstringop-overflow warnings"
This reverts commit
|
||
![]() |
36de5f303c |
cgroup: Avoid -Wstringop-overflow warnings
Address the following -Wstringop-overflow warnings seen when
built with ARM architecture and aspeed_g4_defconfig configuration
(notice that under this configuration CGROUP_SUBSYS_COUNT == 0):
kernel/cgroup/cgroup.c:1208:16: warning: 'find_existing_css_set' accessing 4 bytes in a region of size 0 [-Wstringop-overflow=]
kernel/cgroup/cgroup.c:1258:15: warning: 'css_set_hash' accessing 4 bytes in a region of size 0 [-Wstringop-overflow=]
kernel/cgroup/cgroup.c:6089:18: warning: 'css_set_hash' accessing 4 bytes in a region of size 0 [-Wstringop-overflow=]
kernel/cgroup/cgroup.c:6153:18: warning: 'css_set_hash' accessing 4 bytes in a region of size 0 [-Wstringop-overflow=]
These changes are based on commit
|
||
![]() |
a04de42460 |
cgroup: remove obsolete comment on cgroup_on_dfl()
The debug feature is supported since commit
|
||
![]() |
6f363f5aa8 |
cgroup: Do not corrupt task iteration when rebinding subsystem
We found a refcount UAF bug as follows:
refcount_t: addition on 0; use-after-free.
WARNING: CPU: 1 PID: 342 at lib/refcount.c:25 refcount_warn_saturate+0xa0/0x148
Workqueue: events cpuset_hotplug_workfn
Call trace:
refcount_warn_saturate+0xa0/0x148
__refcount_add.constprop.0+0x5c/0x80
css_task_iter_advance_css_set+0xd8/0x210
css_task_iter_advance+0xa8/0x120
css_task_iter_next+0x94/0x158
update_tasks_root_domain+0x58/0x98
rebuild_root_domains+0xa0/0x1b0
rebuild_sched_domains_locked+0x144/0x188
cpuset_hotplug_workfn+0x138/0x5a0
process_one_work+0x1e8/0x448
worker_thread+0x228/0x3e0
kthread+0xe0/0xf0
ret_from_fork+0x10/0x20
then a kernel panic will be triggered as below:
Unable to handle kernel paging request at virtual address 00000000c0000010
Call trace:
cgroup_apply_control_disable+0xa4/0x16c
rebind_subsystems+0x224/0x590
cgroup_destroy_root+0x64/0x2e0
css_free_rwork_fn+0x198/0x2a0
process_one_work+0x1d4/0x4bc
worker_thread+0x158/0x410
kthread+0x108/0x13c
ret_from_fork+0x10/0x18
The race that cause this bug can be shown as below:
(hotplug cpu) | (umount cpuset)
mutex_lock(&cpuset_mutex) | mutex_lock(&cgroup_mutex)
cpuset_hotplug_workfn |
rebuild_root_domains | rebind_subsystems
update_tasks_root_domain | spin_lock_irq(&css_set_lock)
css_task_iter_start | list_move_tail(&cset->e_cset_node[ss->id]
while(css_task_iter_next) | &dcgrp->e_csets[ss->id]);
css_task_iter_end | spin_unlock_irq(&css_set_lock)
mutex_unlock(&cpuset_mutex) | mutex_unlock(&cgroup_mutex)
Inside css_task_iter_start/next/end, css_set_lock is hold and then
released, so when iterating task(left side), the css_set may be moved to
another list(right side), then it->cset_head points to the old list head
and it->cset_pos->next points to the head node of new list, which can't
be used as struct css_set.
To fix this issue, switch from all css_sets to only scgrp's css_sets to
patch in-flight iterators to preserve correct iteration, and then
update it->cset_head as well.
Reported-by: Gaosheng Cui <cuigaosheng1@huawei.com>
Link: https://www.spinics.net/lists/cgroups/msg37935.html
Suggested-by: Michal Koutný <mkoutny@suse.com>
Link: https://lore.kernel.org/all/20230526114139.70274-1-xiujianfeng@huaweicloud.com/
Signed-off-by: Xiu Jianfeng <xiujianfeng@huawei.com>
Fixes:
|
||
![]() |
d16b3af466 |
cgroup: remove unused task_cgroup_path()
task_cgroup_path() is not used anymore. So remove it. Signed-off-by: Miaohe Lin <linmiaohe@huawei.com> Signed-off-by: Tejun Heo <tj@kernel.org> |
||
![]() |
0dad9b072b |
cgroup: make cgroup_is_threaded() and cgroup_is_thread_root() static
They're only called inside cgroup.c. Make them static. Signed-off-by: Miaohe Lin <linmiaohe@huawei.com> Signed-off-by: Tejun Heo <tj@kernel.org> |
||
![]() |
7bf11e90a3 |
cgroup: Replace the css_set call with cgroup_get
We will release the refcnt of cgroup via cgroup_put, for example in the cgroup_lock_and_drain_offline function, so replace css_get with the cgroup_get function for better readability. Signed-off-by: Gaosheng Cui <cuigaosheng1@huawei.com> Signed-off-by: Tejun Heo <tj@kernel.org> |
||
![]() |
a49a11dc64 |
cgroup: remove unused macro for_each_e_css()
for_each_e_css() is unused now. Remove it. Signed-off-by: Miaohe Lin <linmiaohe@huawei.com> Reviewed-by: Yosry Ahmed <yosryahmed@google.com> Reviewed-by: Michal Koutný <mkoutny@suse.com> Signed-off-by: Tejun Heo <tj@kernel.org> |
||
![]() |
659db0789c |
cgroup: Update out-of-date comment in cgroup_migrate()
Commit
|
||
![]() |
2bd1103392 |
cgroup: always put cset in cgroup_css_set_put_fork
A successful call to cgroup_css_set_fork() will always have taken
a ref on kargs->cset (regardless of CLONE_INTO_CGROUP), so always
do a corresponding put in cgroup_css_set_put_fork().
Without this, a cset and its contained css structures will be
leaked for some fork failures. The following script reproduces
the leak for a fork failure due to exceeding pids.max in the
pids controller. A similar thing can happen if we jump to the
bad_fork_cancel_cgroup label in copy_process().
[ -z "$1" ] && echo "Usage $0 pids-root" && exit 1
PID_ROOT=$1
CGROUP=$PID_ROOT/foo
[ -e $CGROUP ] && rmdir -f $CGROUP
mkdir $CGROUP
echo 5 > $CGROUP/pids.max
echo $$ > $CGROUP/cgroup.procs
fork_bomb()
{
set -e
for i in $(seq 10); do
/bin/sleep 3600 &
done
}
(fork_bomb) &
wait
echo $$ > $PID_ROOT/cgroup.procs
kill $(cat $CGROUP/cgroup.procs)
rmdir $CGROUP
Fixes:
|
||
![]() |
6c24849f55 |
sched/cpuset: Keep track of SCHED_DEADLINE task in cpusets
Qais reported that iterating over all tasks when rebuilding root domains for finding out which ones are DEADLINE and need their bandwidth correctly restored on such root domains can be a costly operation (10+ ms delays on suspend-resume). To fix the problem keep track of the number of DEADLINE tasks belonging to each cpuset and then use this information (followup patch) to only perform the above iteration if DEADLINE tasks are actually present in the cpuset for which a corresponding root domain is being rebuilt. Reported-by: Qais Yousef <qyousef@layalina.io> Link: https://lore.kernel.org/lkml/20230206221428.2125324-1-qyousef@layalina.io/ Signed-off-by: Juri Lelli <juri.lelli@redhat.com> Reviewed-by: Waiman Long <longman@redhat.com> Signed-off-by: Tejun Heo <tj@kernel.org> |
||
![]() |
519fabc7aa |
psi: remove 500ms min window size limitation for triggers
Current 500ms min window size for psi triggers limits polling interval to 50ms to prevent polling threads from using too much cpu bandwidth by polling too frequently. However the number of cgroups with triggers is unlimited, so this protection can be defeated by creating multiple cgroups with psi triggers (triggers in each cgroup are served by a single "psimon" kernel thread). Instead of limiting min polling period, which also limits the latency of psi events, it's better to limit psi trigger creation to authorized users only, like we do for system-wide psi triggers (/proc/pressure/* files can be written only by processes with CAP_SYS_RESOURCE capability). This also makes access rules for cgroup psi files consistent with system-wide ones. Add a CAP_SYS_RESOURCE capability check for cgroup psi file writers and remove the psi window min size limitation. Suggested-by: Sudarshan Rajagopalan <quic_sudaraja@quicinc.com> Signed-off-by: Suren Baghdasaryan <surenb@google.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Michal Hocko <mhocko@suse.com> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Link: https://lore.kernel.org/all/cover.1676067791.git.quic_sudaraja@quicinc.com/ |
||
![]() |
86e98ed15b |
cgroup changes for v6.4-rc1
* cpuset changes including the fix for an incorrect interaction with CPU hotplug and an optimization. * Other doc and cosmetic changes. -----BEGIN PGP SIGNATURE----- iIQEABYIACwWIQTfIjM1kS57o3GsC/uxYfJx3gVYGQUCZErfng4cdGpAa2VybmVs Lm9yZwAKCRCxYfJx3gVYGVVtAQCDycK4VSgc4nsFPG1vh1Oy1A723ciEUwAbKmV/ F1n7xwEA68FiDvE29LpMJJuYP9HnX0A5zRMyNnb52kN9jmgcEQI= =ALol -----END PGP SIGNATURE----- Merge tag 'cgroup-for-6.4' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup Pull cgroup updates from Tejun Heo: - cpuset changes including the fix for an incorrect interaction with CPU hotplug and an optimization - Other doc and cosmetic changes * tag 'cgroup-for-6.4' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup: docs: cgroup-v1/cpusets: update libcgroup project link cgroup/cpuset: Minor updates to test_cpuset_prs.sh cgroup/cpuset: Include offline CPUs when tasks' cpumasks in top_cpuset are updated cgroup/cpuset: Skip task update if hotplug doesn't affect current cpuset cpuset: Clean up cpuset_node_allowed cgroup: bpf: use cgroup_lock()/cgroup_unlock() wrappers |
||
![]() |
586b222d74 |
Scheduler changes for v6.4:
- Allow unprivileged PSI poll()ing - Fix performance regression introduced by mm_cid - Improve livepatch stalls by adding livepatch task switching to cond_resched(), this resolves livepatching busy-loop stalls with certain CPU-bound kthreads. - Improve sched_move_task() performance on autogroup configs. - On core-scheduling CPUs, avoid selecting throttled tasks to run - Misc cleanups, fixes and improvements. Signed-off-by: Ingo Molnar <mingo@kernel.org> -----BEGIN PGP SIGNATURE----- iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAmRK39cRHG1pbmdvQGtl cm5lbC5vcmcACgkQEnMQ0APhK1hXPhAAk2WqOV2cW4BjSCHjWWE05IfTb0HMn8si mFGBAnr1GIkJRvICAusAwDU3FcmP5mWyXA+LK110d3x4fKJP15vCD5ru5lHnBfX7 fSD+Ml8uM4Xlp8iUoQspilbQwmWkQSwhudbDs3Nj7XGUzJCvNgm1sM3xPRDlqSJ5 6zumfVOPTfzSGcZY3a8sMuJnCepZHLRR6NkLzo/DuI1NMy2Jw1dK43dh77AO1mBF M53PF2IQgm6Wu/67p2k5eDq4c0AKL4PyIb4dRTGOPyljWMf41n28jwMv1tjlvu+Y uT0JD8MJSrFiylyT41x7Asr7orAGXj3cPhShK5R0vrutx/SbqBiaaE1MO9U3aC3B 7xVXEORHWD6KIDqTvzmWGrMBkIdyWB6CLk6EJKr3MqM9hUtP2ift7bkAgIad9h+4 G9DdVePGoCyh/TQtJ9EPIULAYeu9mmDZe8rTQ8C5MCSg//05/CTMgBbb0NiFWhnd 0JQl1B0nNUA87whVUxK8Hfu4DLh7m9jrzgQr9Ww8/FwQ6tQHBOKWgDdbv45ckkaG cJIQt/+vLilddazc8u8E+BGaD5w2uIYF0uL7kvG6Q5oARX06AZ5dj1m06vhZe/Ym laOVZEpJsbQnxviY6jwj1n+CSB9aK7feiQfDePBPbpJGGUHyZoKrnLN6wmW2se+H VCHtdgsEl5I= =Hgci -----END PGP SIGNATURE----- Merge tag 'sched-core-2023-04-27' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler updates from Ingo Molnar: - Allow unprivileged PSI poll()ing - Fix performance regression introduced by mm_cid - Improve livepatch stalls by adding livepatch task switching to cond_resched(). This resolves livepatching busy-loop stalls with certain CPU-bound kthreads - Improve sched_move_task() performance on autogroup configs - On core-scheduling CPUs, avoid selecting throttled tasks to run - Misc cleanups, fixes and improvements * tag 'sched-core-2023-04-27' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: sched/clock: Fix local_clock() before sched_clock_init() sched/rt: Fix bad task migration for rt tasks sched: Fix performance regression introduced by mm_cid sched/core: Make sched_dynamic_mutex static sched/psi: Allow unprivileged polling of N*2s period sched/psi: Extract update_triggers side effect sched/psi: Rename existing poll members in preparation sched/psi: Rearrange polling code in preparation sched/fair: Fix inaccurate tally of ttwu_move_affine vhost: Fix livepatch timeouts in vhost_worker() livepatch,sched: Add livepatch task switching to cond_resched() livepatch: Skip task_call_func() for current task livepatch: Convert stack entries array to percpu sched: Interleave cfs bandwidth timers for improved single thread performance at low utilization sched/core: Reduce cost of sched_move_task when config autogroup sched/core: Avoid selecting the task that is throttled to run when core-sched enable sched/topology: Make sched_energy_mutex,update static |
||
![]() |
6e98b09da9 |
Networking changes for 6.4.
Core ---- - Introduce a config option to tweak MAX_SKB_FRAGS. Increasing the default value allows for better BIG TCP performances. - Reduce compound page head access for zero-copy data transfers. - RPS/RFS improvements, avoiding unneeded NET_RX_SOFTIRQ when possible. - Threaded NAPI improvements, adding defer skb free support and unneeded softirq avoidance. - Address dst_entry reference count scalability issues, via false sharing avoidance and optimize refcount tracking. - Add lockless accesses annotation to sk_err[_soft]. - Optimize again the skb struct layout. - Extends the skb drop reasons to make it usable by multiple subsystems. - Better const qualifier awareness for socket casts. BPF --- - Add skb and XDP typed dynptrs which allow BPF programs for more ergonomic and less brittle iteration through data and variable-sized accesses. - Add a new BPF netfilter program type and minimal support to hook BPF programs to netfilter hooks such as prerouting or forward. - Add more precise memory usage reporting for all BPF map types. - Adds support for using {FOU,GUE} encap with an ipip device operating in collect_md mode and add a set of BPF kfuncs for controlling encap params. - Allow BPF programs to detect at load time whether a particular kfunc exists or not, and also add support for this in light skeleton. - Bigger batch of BPF verifier improvements to prepare for upcoming BPF open-coded iterators allowing for less restrictive looping capabilities. - Rework RCU enforcement in the verifier, add kptr_rcu and enforce BPF programs to NULL-check before passing such pointers into kfunc. - Add support for kptrs in percpu hashmaps, percpu LRU hashmaps and in local storage maps. - Enable RCU semantics for task BPF kptrs and allow referenced kptr tasks to be stored in BPF maps. - Add support for refcounted local kptrs to the verifier for allowing shared ownership, useful for adding a node to both the BPF list and rbtree. - Add BPF verifier support for ST instructions in convert_ctx_access() which will help new -mcpu=v4 clang flag to start emitting them. - Add ARM32 USDT support to libbpf. - Improve bpftool's visual program dump which produces the control flow graph in a DOT format by adding C source inline annotations. Protocols --------- - IPv4: Allow adding to IPv4 address a 'protocol' tag. Such value indicates the provenance of the IP address. - IPv6: optimize route lookup, dropping unneeded R/W lock acquisition. - Add the handshake upcall mechanism, allowing the user-space to implement generic TLS handshake on kernel's behalf. - Bridge: support per-{Port, VLAN} neighbor suppression, increasing resilience to nodes failures. - SCTP: add support for Fair Capacity and Weighted Fair Queueing schedulers. - MPTCP: delay first subflow allocation up to its first usage. This will allow for later better LSM interaction. - xfrm: Remove inner/outer modes from input/output path. These are not needed anymore. - WiFi: - reduced neighbor report (RNR) handling for AP mode - HW timestamping support - support for randomized auth/deauth TA for PASN privacy - per-link debugfs for multi-link - TC offload support for mac80211 drivers - mac80211 mesh fast-xmit and fast-rx support - enable Wi-Fi 7 (EHT) mesh support Netfilter --------- - Add nf_tables 'brouting' support, to force a packet to be routed instead of being bridged. - Update bridge netfilter and ovs conntrack helpers to handle IPv6 Jumbo packets properly, i.e. fetch the packet length from hop-by-hop extension header. This is needed for BIT TCP support. - The iptables 32bit compat interface isn't compiled in by default anymore. - Move ip(6)tables builtin icmp matches to the udptcp one. This has the advantage that icmp/icmpv6 match doesn't load the iptables/ip6tables modules anymore when iptables-nft is used. - Extended netlink error report for netdevice in flowtables and netdev/chains. Allow for incrementally add/delete devices to netdev basechain. Allow to create netdev chain without device. Driver API ---------- - Remove redundant Device Control Error Reporting Enable, as PCI core has already error reporting enabled at enumeration time. - Move Multicast DB netlink handlers to core, allowing devices other then bridge to use them. - Allow the page_pool to directly recycle the pages from safely localized NAPI. - Implement lockless TX queue stop/wake combo macros, allowing for further code de-duplication and sanitization. - Add YNL support for user headers and struct attrs. - Add partial YNL specification for devlink. - Add partial YNL specification for ethtool. - Add tc-mqprio and tc-taprio support for preemptible traffic classes. - Add tx push buf len param to ethtool, specifies the maximum number of bytes of a transmitted packet a driver can push directly to the underlying device. - Add basic LED support for switch/phy. - Add NAPI documentation, stop relaying on external links. - Convert dsa_master_ioctl() to netdev notifier. This is a preparatory work to make the hardware timestamping layer selectable by user space. - Add transceiver support and improve the error messages for CAN-FD controllers. New hardware / drivers ---------------------- - Ethernet: - AMD/Pensando core device support - MediaTek MT7981 SoC - MediaTek MT7988 SoC - Broadcom BCM53134 embedded switch - Texas Instruments CPSW9G ethernet switch - Qualcomm EMAC3 DWMAC ethernet - StarFive JH7110 SoC - NXP CBTX ethernet PHY - WiFi: - Apple M1 Pro/Max devices - RealTek rtl8710bu/rtl8188gu - RealTek rtl8822bs, rtl8822cs and rtl8821cs SDIO chipset - Bluetooth: - Realtek RTL8821CS, RTL8851B, RTL8852BS - Mediatek MT7663, MT7922 - NXP w8997 - Actions Semi ATS2851 - QTI WCN6855 - Marvell 88W8997 - Can: - STMicroelectronics bxcan stm32f429 Drivers ------- - Ethernet NICs: - Intel (1G, icg): - add tracking and reporting of QBV config errors. - add support for configuring max SDU for each Tx queue. - Intel (100G, ice): - refactor mailbox overflow detection to support Scalable IOV - GNSS interface optimization - Intel (i40e): - support XDP multi-buffer - nVidia/Mellanox: - add the support for linux bridge multicast offload - enable TC offload for egress and engress MACVLAN over bond - add support for VxLAN GBP encap/decap flows offload - extend packet offload to fully support libreswan - support tunnel mode in mlx5 IPsec packet offload - extend XDP multi-buffer support - support MACsec VLAN offload - add support for dynamic msix vectors allocation - drop RX page_cache and fully use page_pool - implement thermal zone to report NIC temperature - Netronome/Corigine: - add support for multi-zone conntrack offload - Solarflare/Xilinx: - support offloading TC VLAN push/pop actions to the MAE - support TC decap rules - support unicast PTP - Other NICs: - Broadcom (bnxt): enforce software based freq adjustments only on shared PHC NIC - RealTek (r8169): refactor to addess ASPM issues during NAPI poll. - Micrel (lan8841): add support for PTP_PF_PEROUT - Cadence (macb): enable PTP unicast - Engleder (tsnep): add XDP socket zero-copy support - virtio-net: implement exact header length guest feature - veth: add page_pool support for page recycling - vxlan: add MDB data path support - gve: add XDP support for GQI-QPL format - geneve: accept every ethertype - macvlan: allow some packets to bypass broadcast queue - mana: add support for jumbo frame - Ethernet high-speed switches: - Microchip (sparx5): Add support for TC flower templates. - Ethernet embedded switches: - Broadcom (b54): - configure 6318 and 63268 RGMII ports - Marvell (mv88e6xxx): - faster C45 bus scan - Microchip: - lan966x: - add support for IS1 VCAP - better TX/RX from/to CPU performances - ksz9477: add ETS Qdisc support - ksz8: enhance static MAC table operations and error handling - sama7g5: add PTP capability - NXP (ocelot): - add support for external ports - add support for preemptible traffic classes - Texas Instruments: - add CPSWxG SGMII support for J7200 and J721E - Intel WiFi (iwlwifi): - preparation for Wi-Fi 7 EHT and multi-link support - EHT (Wi-Fi 7) sniffer support - hardware timestamping support for some devices/firwmares - TX beacon protection on newer hardware - Qualcomm 802.11ax WiFi (ath11k): - MU-MIMO parameters support - ack signal support for management packets - RealTek WiFi (rtw88): - SDIO bus support - better support for some SDIO devices (e.g. MAC address from efuse) - RealTek WiFi (rtw89): - HW scan support for 8852b - better support for 6 GHz scanning - support for various newer firmware APIs - framework firmware backwards compatibility - MediaTek WiFi (mt76): - P2P support - mesh A-MSDU support - EHT (Wi-Fi 7) support - coredump support Signed-off-by: Paolo Abeni <pabeni@redhat.com> -----BEGIN PGP SIGNATURE----- iQJGBAABCAAwFiEEg1AjqC77wbdLX2LbKSR5jcyPE6QFAmRI/mUSHHBhYmVuaUBy ZWRoYXQuY29tAAoJECkkeY3MjxOkgO0QAJGxpuN67YgYV0BIM+/atWKEEexJYG7B 9MMpU4jMO3EW/pUS5t7VRsBLUybLYVPmqCZoHodObDfnu59jiPOegb6SikJv/ZwJ Zw62PVk5MvDnQjlu4e6kDcGwkplteN08TlgI+a49BUTedpdFitrxHAYGW8f2fRO6 cK2XSld+ZucMoym5vRwf8yWS1BwdxnslPMxDJ+/8ZbWBZv44qAnG2vMB/kIx7ObC Vel/4m6MzTwVsLYBsRvcwMVbNNlZ9GuhztlTzEbfGA4ZhTadIAMgb5VTWXB84Ws7 Aic5wTdli+q+x6/2cxhbyeoVuB9HHObYmLBAciGg4GNljP5rnQBY3X3+KVZ/x9TI HQB7CmhxmAZVrO9pLARFV+ECrMTH2/dy3NyrZ7uYQ3WPOXJi8hJZjOTO/eeEGL7C eTjdz0dZBWIBK2gON/6s4nExXVQUTEF2ZsPi52jTTClKjfe5pz/ddeFQIWaY1DTm pInEiWPAvd28JyiFmhFNHsuIBCjX/Zqe2JuMfMBeBibDAC09o/OGdKJYUI15AiRf F46Pdb7use/puqfrYW44kSAfaPYoBiE+hj1RdeQfen35xD9HVE4vdnLNeuhRlFF9 aQfyIRHYQofkumRDr5f8JEY66cl9NiKQ4IVW1xxQfYDNdC6wQqREPG1md7rJVMrJ vP7ugFnttneg =ITVa -----END PGP SIGNATURE----- Merge tag 'net-next-6.4' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next Pull networking updates from Paolo Abeni: "Core: - Introduce a config option to tweak MAX_SKB_FRAGS. Increasing the default value allows for better BIG TCP performances - Reduce compound page head access for zero-copy data transfers - RPS/RFS improvements, avoiding unneeded NET_RX_SOFTIRQ when possible - Threaded NAPI improvements, adding defer skb free support and unneeded softirq avoidance - Address dst_entry reference count scalability issues, via false sharing avoidance and optimize refcount tracking - Add lockless accesses annotation to sk_err[_soft] - Optimize again the skb struct layout - Extends the skb drop reasons to make it usable by multiple subsystems - Better const qualifier awareness for socket casts BPF: - Add skb and XDP typed dynptrs which allow BPF programs for more ergonomic and less brittle iteration through data and variable-sized accesses - Add a new BPF netfilter program type and minimal support to hook BPF programs to netfilter hooks such as prerouting or forward - Add more precise memory usage reporting for all BPF map types - Adds support for using {FOU,GUE} encap with an ipip device operating in collect_md mode and add a set of BPF kfuncs for controlling encap params - Allow BPF programs to detect at load time whether a particular kfunc exists or not, and also add support for this in light skeleton - Bigger batch of BPF verifier improvements to prepare for upcoming BPF open-coded iterators allowing for less restrictive looping capabilities - Rework RCU enforcement in the verifier, add kptr_rcu and enforce BPF programs to NULL-check before passing such pointers into kfunc - Add support for kptrs in percpu hashmaps, percpu LRU hashmaps and in local storage maps - Enable RCU semantics for task BPF kptrs and allow referenced kptr tasks to be stored in BPF maps - Add support for refcounted local kptrs to the verifier for allowing shared ownership, useful for adding a node to both the BPF list and rbtree - Add BPF verifier support for ST instructions in convert_ctx_access() which will help new -mcpu=v4 clang flag to start emitting them - Add ARM32 USDT support to libbpf - Improve bpftool's visual program dump which produces the control flow graph in a DOT format by adding C source inline annotations Protocols: - IPv4: Allow adding to IPv4 address a 'protocol' tag. Such value indicates the provenance of the IP address - IPv6: optimize route lookup, dropping unneeded R/W lock acquisition - Add the handshake upcall mechanism, allowing the user-space to implement generic TLS handshake on kernel's behalf - Bridge: support per-{Port, VLAN} neighbor suppression, increasing resilience to nodes failures - SCTP: add support for Fair Capacity and Weighted Fair Queueing schedulers - MPTCP: delay first subflow allocation up to its first usage. This will allow for later better LSM interaction - xfrm: Remove inner/outer modes from input/output path. These are not needed anymore - WiFi: - reduced neighbor report (RNR) handling for AP mode - HW timestamping support - support for randomized auth/deauth TA for PASN privacy - per-link debugfs for multi-link - TC offload support for mac80211 drivers - mac80211 mesh fast-xmit and fast-rx support - enable Wi-Fi 7 (EHT) mesh support Netfilter: - Add nf_tables 'brouting' support, to force a packet to be routed instead of being bridged - Update bridge netfilter and ovs conntrack helpers to handle IPv6 Jumbo packets properly, i.e. fetch the packet length from hop-by-hop extension header. This is needed for BIT TCP support - The iptables 32bit compat interface isn't compiled in by default anymore - Move ip(6)tables builtin icmp matches to the udptcp one. This has the advantage that icmp/icmpv6 match doesn't load the iptables/ip6tables modules anymore when iptables-nft is used - Extended netlink error report for netdevice in flowtables and netdev/chains. Allow for incrementally add/delete devices to netdev basechain. Allow to create netdev chain without device Driver API: - Remove redundant Device Control Error Reporting Enable, as PCI core has already error reporting enabled at enumeration time - Move Multicast DB netlink handlers to core, allowing devices other then bridge to use them - Allow the page_pool to directly recycle the pages from safely localized NAPI - Implement lockless TX queue stop/wake combo macros, allowing for further code de-duplication and sanitization - Add YNL support for user headers and struct attrs - Add partial YNL specification for devlink - Add partial YNL specification for ethtool - Add tc-mqprio and tc-taprio support for preemptible traffic classes - Add tx push buf len param to ethtool, specifies the maximum number of bytes of a transmitted packet a driver can push directly to the underlying device - Add basic LED support for switch/phy - Add NAPI documentation, stop relaying on external links - Convert dsa_master_ioctl() to netdev notifier. This is a preparatory work to make the hardware timestamping layer selectable by user space - Add transceiver support and improve the error messages for CAN-FD controllers New hardware / drivers: - Ethernet: - AMD/Pensando core device support - MediaTek MT7981 SoC - MediaTek MT7988 SoC - Broadcom BCM53134 embedded switch - Texas Instruments CPSW9G ethernet switch - Qualcomm EMAC3 DWMAC ethernet - StarFive JH7110 SoC - NXP CBTX ethernet PHY - WiFi: - Apple M1 Pro/Max devices - RealTek rtl8710bu/rtl8188gu - RealTek rtl8822bs, rtl8822cs and rtl8821cs SDIO chipset - Bluetooth: - Realtek RTL8821CS, RTL8851B, RTL8852BS - Mediatek MT7663, MT7922 - NXP w8997 - Actions Semi ATS2851 - QTI WCN6855 - Marvell 88W8997 - Can: - STMicroelectronics bxcan stm32f429 Drivers: - Ethernet NICs: - Intel (1G, icg): - add tracking and reporting of QBV config errors - add support for configuring max SDU for each Tx queue - Intel (100G, ice): - refactor mailbox overflow detection to support Scalable IOV - GNSS interface optimization - Intel (i40e): - support XDP multi-buffer - nVidia/Mellanox: - add the support for linux bridge multicast offload - enable TC offload for egress and engress MACVLAN over bond - add support for VxLAN GBP encap/decap flows offload - extend packet offload to fully support libreswan - support tunnel mode in mlx5 IPsec packet offload - extend XDP multi-buffer support - support MACsec VLAN offload - add support for dynamic msix vectors allocation - drop RX page_cache and fully use page_pool - implement thermal zone to report NIC temperature - Netronome/Corigine: - add support for multi-zone conntrack offload - Solarflare/Xilinx: - support offloading TC VLAN push/pop actions to the MAE - support TC decap rules - support unicast PTP - Other NICs: - Broadcom (bnxt): enforce software based freq adjustments only on shared PHC NIC - RealTek (r8169): refactor to addess ASPM issues during NAPI poll - Micrel (lan8841): add support for PTP_PF_PEROUT - Cadence (macb): enable PTP unicast - Engleder (tsnep): add XDP socket zero-copy support - virtio-net: implement exact header length guest feature - veth: add page_pool support for page recycling - vxlan: add MDB data path support - gve: add XDP support for GQI-QPL format - geneve: accept every ethertype - macvlan: allow some packets to bypass broadcast queue - mana: add support for jumbo frame - Ethernet high-speed switches: - Microchip (sparx5): Add support for TC flower templates - Ethernet embedded switches: - Broadcom (b54): - configure 6318 and 63268 RGMII ports - Marvell (mv88e6xxx): - faster C45 bus scan - Microchip: - lan966x: - add support for IS1 VCAP - better TX/RX from/to CPU performances - ksz9477: add ETS Qdisc support - ksz8: enhance static MAC table operations and error handling - sama7g5: add PTP capability - NXP (ocelot): - add support for external ports - add support for preemptible traffic classes - Texas Instruments: - add CPSWxG SGMII support for J7200 and J721E - Intel WiFi (iwlwifi): - preparation for Wi-Fi 7 EHT and multi-link support - EHT (Wi-Fi 7) sniffer support - hardware timestamping support for some devices/firwmares - TX beacon protection on newer hardware - Qualcomm 802.11ax WiFi (ath11k): - MU-MIMO parameters support - ack signal support for management packets - RealTek WiFi (rtw88): - SDIO bus support - better support for some SDIO devices (e.g. MAC address from efuse) - RealTek WiFi (rtw89): - HW scan support for 8852b - better support for 6 GHz scanning - support for various newer firmware APIs - framework firmware backwards compatibility - MediaTek WiFi (mt76): - P2P support - mesh A-MSDU support - EHT (Wi-Fi 7) support - coredump support" * tag 'net-next-6.4' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (2078 commits) net: phy: hide the PHYLIB_LEDS knob net: phy: marvell-88x2222: remove unnecessary (void*) conversions tcp/udp: Fix memleaks of sk and zerocopy skbs with TX timestamp. net: amd: Fix link leak when verifying config failed net: phy: marvell: Fix inconsistent indenting in led_blink_set lan966x: Don't use xdp_frame when action is XDP_TX tsnep: Add XDP socket zero-copy TX support tsnep: Add XDP socket zero-copy RX support tsnep: Move skb receive action to separate function tsnep: Add functions for queue enable/disable tsnep: Rework TX/RX queue initialization tsnep: Replace modulo operation with mask net: phy: dp83867: Add led_brightness_set support net: phy: Fix reading LED reg property drivers: nfc: nfcsim: remove return value check of `dev_dir` net: phy: dp83867: Remove unnecessary (void*) conversions net: ethtool: coalesce: try to make user settings stick twice net: mana: Check if netdev/napi_alloc_frag returns single page net: mana: Rename mana_refill_rxoob and remove some empty lines net: veth: add page_pool stats ... |
||
![]() |
2f31fa029d |
cgroup_get_from_fd(): switch to fdget_raw()
Reviewed-by: Christian Brauner <brauner@kernel.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> |
||
![]() |
d82caa2735 |
sched/psi: Allow unprivileged polling of N*2s period
PSI offers 2 mechanisms to get information about a specific resource pressure. One is reading from /proc/pressure/<resource>, which gives average pressures aggregated every 2s. The other is creating a pollable fd for a specific resource and cgroup. The trigger creation requires CAP_SYS_RESOURCE, and gives the possibility to pick specific time window and threshold, spawing an RT thread to aggregate the data. Systemd would like to provide containers the option to monitor pressure on their own cgroup and sub-cgroups. For example, if systemd launches a container that itself then launches services, the container should have the ability to poll() for pressure in individual services. But neither the container nor the services are privileged. This patch implements a mechanism to allow unprivileged users to create pressure triggers. The difference with privileged triggers creation is that unprivileged ones must have a time window that's a multiple of 2s. This is so that we can avoid unrestricted spawning of rt threads, and use instead the same aggregation mechanism done for the averages, which runs independently of any triggers. Suggested-by: Johannes Weiner <hannes@cmpxchg.org> Signed-off-by: Domenico Cerasuolo <cerasuolodomenico@gmail.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Link: https://lore.kernel.org/r/20230330105418.77061-5-cerasuolodomenico@gmail.com |
||
![]() |
4cdb91b0de |
cgroup: bpf: use cgroup_lock()/cgroup_unlock() wrappers
Replace mutex_[un]lock() with cgroup_[un]lock() wrappers to stay consistent across cgroup core and other subsystem code, while operating on the cgroup_mutex. Signed-off-by: Kamalesh Babulal <kamalesh.babulal@oracle.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Reviewed-by: Christian Brauner <brauner@kernel.org> Signed-off-by: Tejun Heo <tj@kernel.org> |
||
![]() |
b8a2e3f93d |
cgroup: Make current_cgns_cgroup_dfl() safe to call after exit_task_namespace()
The commit |
||
![]() |
4609e1f18e
|
fs: port ->permission() to pass mnt_idmap
Convert to struct mnt_idmap.
Last cycle we merged the necessary infrastructure in
|
||
![]() |
7e68dd7d07 |
Networking changes for 6.2.
Core ---- - Allow live renaming when an interface is up - Add retpoline wrappers for tc, improving considerably the performances of complex queue discipline configurations. - Add inet drop monitor support. - A few GRO performance improvements. - Add infrastructure for atomic dev stats, addressing long standing data races. - De-duplicate common code between OVS and conntrack offloading infrastructure. - A bunch of UBSAN_BOUNDS/FORTIFY_SOURCE improvements. - Netfilter: introduce packet parser for tunneled packets - Replace IPVS timer-based estimators with kthreads to scale up the workload with the number of available CPUs. - Add the helper support for connection-tracking OVS offload. BPF --- - Support for user defined BPF objects: the use case is to allocate own objects, build own object hierarchies and use the building blocks to build own data structures flexibly, for example, linked lists in BPF. - Make cgroup local storage available to non-cgroup attached BPF programs. - Avoid unnecessary deadlock detection and failures wrt BPF task storage helpers. - A relevant bunch of BPF verifier fixes and improvements. - Veristat tool improvements to support custom filtering, sorting, and replay of results. - Add LLVM disassembler as default library for dumping JITed code. - Lots of new BPF documentation for various BPF maps. - Add bpf_rcu_read_{,un}lock() support for sleepable programs. - Add RCU grace period chaining to BPF to wait for the completion of access from both sleepable and non-sleepable BPF programs. - Add support storing struct task_struct objects as kptrs in maps. - Improve helper UAPI by explicitly defining BPF_FUNC_xxx integer values. - Add libbpf *_opts API-variants for bpf_*_get_fd_by_id() functions. Protocols --------- - TCP: implement Protective Load Balancing across switch links. - TCP: allow dynamically disabling TCP-MD5 static key, reverting back to fast[er]-path. - UDP: Introduce optional per-netns hash lookup table. - IPv6: simplify and cleanup sockets disposal. - Netlink: support different type policies for each generic netlink operation. - MPTCP: add MSG_FASTOPEN and FastOpen listener side support. - MPTCP: add netlink notification support for listener sockets events. - SCTP: add VRF support, allowing sctp sockets binding to VRF devices. - Add bridging MAC Authentication Bypass (MAB) support. - Extensions for Ethernet VPN bridging implementation to better support multicast scenarios. - More work for Wi-Fi 7 support, comprising conversion of all the existing drivers to internal TX queue usage. - IPSec: introduce a new offload type (packet offload) allowing complete header processing and crypto offloading. - IPSec: extended ack support for more descriptive XFRM error reporting. - RXRPC: increase SACK table size and move processing into a per-local endpoint kernel thread, reducing considerably the required locking. - IEEE 802154: synchronous send frame and extended filtering support, initial support for scanning available 15.4 networks. - Tun: bump the link speed from 10Mbps to 10Gbps. - Tun/VirtioNet: implement UDP segmentation offload support. Driver API ---------- - PHY/SFP: improve power level switching between standard level 1 and the higher power levels. - New API for netdev <-> devlink_port linkage. - PTP: convert existing drivers to new frequency adjustment implementation. - DSA: add support for rx offloading. - Autoload DSA tagging driver when dynamically changing protocol. - Add new PCP and APPTRUST attributes to Data Center Bridging. - Add configuration support for 800Gbps link speed. - Add devlink port function attribute to enable/disable RoCE and migratable. - Extend devlink-rate to support strict prioriry and weighted fair queuing. - Add devlink support to directly reading from region memory. - New device tree helper to fetch MAC address from nvmem. - New big TCP helper to simplify temporary header stripping. New hardware / drivers ---------------------- - Ethernet: - Marvel Octeon CNF95N and CN10KB Ethernet Switches. - Marvel Prestera AC5X Ethernet Switch. - WangXun 10 Gigabit NIC. - Motorcomm yt8521 Gigabit Ethernet. - Microchip ksz9563 Gigabit Ethernet Switch. - Microsoft Azure Network Adapter. - Linux Automation 10Base-T1L adapter. - PHY: - Aquantia AQR112 and AQR412. - Motorcomm YT8531S. - PTP: - Orolia ART-CARD. - WiFi: - MediaTek Wi-Fi 7 (802.11be) devices. - RealTek rtw8821cu, rtw8822bu, rtw8822cu and rtw8723du USB devices. - Bluetooth: - Broadcom BCM4377/4378/4387 Bluetooth chipsets. - Realtek RTL8852BE and RTL8723DS. - Cypress.CYW4373A0 WiFi + Bluetooth combo device. Drivers ------- - CAN: - gs_usb: bus error reporting support. - kvaser_usb: listen only and bus error reporting support. - Ethernet NICs: - Intel (100G): - extend action skbedit to RX queue mapping. - implement devlink-rate support. - support direct read from memory. - nVidia/Mellanox (mlx5): - SW steering improvements, increasing rules update rate. - Support for enhanced events compression. - extend H/W offload packet manipulation capabilities. - implement IPSec packet offload mode. - nVidia/Mellanox (mlx4): - better big TCP support. - Netronome Ethernet NICs (nfp): - IPsec offload support. - add support for multicast filter. - Broadcom: - RSS and PTP support improvements. - AMD/SolarFlare: - netlink extened ack improvements. - add basic flower matches to offload, and related stats. - Virtual NICs: - ibmvnic: introduce affinity hint support. - small / embedded: - FreeScale fec: add initial XDP support. - Marvel mv643xx_eth: support MII/GMII/RGMII modes for Kirkwood. - TI am65-cpsw: add suspend/resume support. - Mediatek MT7986: add RX wireless wthernet dispatch support. - Realtek 8169: enable GRO software interrupt coalescing per default. - Ethernet high-speed switches: - Microchip (sparx5): - add support for Sparx5 TC/flower H/W offload via VCAP. - Mellanox mlxsw: - add 802.1X and MAC Authentication Bypass offload support. - add ip6gre support. - Embedded Ethernet switches: - Mediatek (mtk_eth_soc): - improve PCS implementation, add DSA untag support. - enable flow offload support. - Renesas: - add rswitch R-Car Gen4 gPTP support. - Microchip (lan966x): - add full XDP support. - add TC H/W offload via VCAP. - enable PTP on bridge interfaces. - Microchip (ksz8): - add MTU support for KSZ8 series. - Qualcomm 802.11ax WiFi (ath11k): - support configuring channel dwell time during scan. - MediaTek WiFi (mt76): - enable Wireless Ethernet Dispatch (WED) offload support. - add ack signal support. - enable coredump support. - remain_on_channel support. - Intel WiFi (iwlwifi): - enable Wi-Fi 7 Extremely High Throughput (EHT) PHY capabilities. - 320 MHz channels support. - RealTek WiFi (rtw89): - new dynamic header firmware format support. - wake-over-WLAN support. Signed-off-by: Paolo Abeni <pabeni@redhat.com> -----BEGIN PGP SIGNATURE----- iQJGBAABCAAwFiEEg1AjqC77wbdLX2LbKSR5jcyPE6QFAmOYXUcSHHBhYmVuaUBy ZWRoYXQuY29tAAoJECkkeY3MjxOk8zQP/R7BZtbJMTPiWkRnSoKHnAyupDVwrz5U ktukLkwPsCyJuEbAjgxrxf4EEEQ9uq2FFlxNSYuKiiQMqIpFxV6KED7LCUygn4Tc kxtkp0Q+5XiqisWlQmtfExf2OjuuPqcjV9tWCDBI6GebKUbfNwY/eI44RcMu4BSv DzIlW5GkX/kZAPqnnuqaLsN3FudDTJHGEAD7NbA++7wJ076RWYSLXlFv0Z+SCSPS H8/PEG0/ZK/65rIWMAFRClJ9BNIDwGVgp0GrsIvs1gqbRUOlA1hl1rDM21TqtNFf 5QPQT7sIfTcCE/nerxKJD5JE3JyP+XRlRn96PaRw3rt4MgI6I/EOj/HOKQ5tMCNc oPiqb7N70+hkLZyr42qX+vN9eDPjp2koEQm7EO2Zs+/534/zWDs24Zfk/Aa1ps0I Fa82oGjAgkBhGe/FZ6i5cYoLcyxqRqZV1Ws9XQMl72qRC7/BwvNbIW6beLpCRyeM yYIU+0e9dEm+wHQEdh2niJuVtR63hy8tvmPx56lyh+6u0+pondkwbfSiC5aD3kAC ikKsN5DyEsdXyiBAlytCEBxnaOjQy4RAz+3YXSiS0eBNacXp03UUrNGx4Pzpu/D0 QLFJhBnMFFCgy5to8/DvKnrTPgZdSURwqbIUcZdvU21f1HLR8tUTpaQnYffc/Whm V8gnt1EL+0cc =CbJC -----END PGP SIGNATURE----- Merge tag 'net-next-6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next Pull networking updates from Paolo Abeni: "Core: - Allow live renaming when an interface is up - Add retpoline wrappers for tc, improving considerably the performances of complex queue discipline configurations - Add inet drop monitor support - A few GRO performance improvements - Add infrastructure for atomic dev stats, addressing long standing data races - De-duplicate common code between OVS and conntrack offloading infrastructure - A bunch of UBSAN_BOUNDS/FORTIFY_SOURCE improvements - Netfilter: introduce packet parser for tunneled packets - Replace IPVS timer-based estimators with kthreads to scale up the workload with the number of available CPUs - Add the helper support for connection-tracking OVS offload BPF: - Support for user defined BPF objects: the use case is to allocate own objects, build own object hierarchies and use the building blocks to build own data structures flexibly, for example, linked lists in BPF - Make cgroup local storage available to non-cgroup attached BPF programs - Avoid unnecessary deadlock detection and failures wrt BPF task storage helpers - A relevant bunch of BPF verifier fixes and improvements - Veristat tool improvements to support custom filtering, sorting, and replay of results - Add LLVM disassembler as default library for dumping JITed code - Lots of new BPF documentation for various BPF maps - Add bpf_rcu_read_{,un}lock() support for sleepable programs - Add RCU grace period chaining to BPF to wait for the completion of access from both sleepable and non-sleepable BPF programs - Add support storing struct task_struct objects as kptrs in maps - Improve helper UAPI by explicitly defining BPF_FUNC_xxx integer values - Add libbpf *_opts API-variants for bpf_*_get_fd_by_id() functions Protocols: - TCP: implement Protective Load Balancing across switch links - TCP: allow dynamically disabling TCP-MD5 static key, reverting back to fast[er]-path - UDP: Introduce optional per-netns hash lookup table - IPv6: simplify and cleanup sockets disposal - Netlink: support different type policies for each generic netlink operation - MPTCP: add MSG_FASTOPEN and FastOpen listener side support - MPTCP: add netlink notification support for listener sockets events - SCTP: add VRF support, allowing sctp sockets binding to VRF devices - Add bridging MAC Authentication Bypass (MAB) support - Extensions for Ethernet VPN bridging implementation to better support multicast scenarios - More work for Wi-Fi 7 support, comprising conversion of all the existing drivers to internal TX queue usage - IPSec: introduce a new offload type (packet offload) allowing complete header processing and crypto offloading - IPSec: extended ack support for more descriptive XFRM error reporting - RXRPC: increase SACK table size and move processing into a per-local endpoint kernel thread, reducing considerably the required locking - IEEE 802154: synchronous send frame and extended filtering support, initial support for scanning available 15.4 networks - Tun: bump the link speed from 10Mbps to 10Gbps - Tun/VirtioNet: implement UDP segmentation offload support Driver API: - PHY/SFP: improve power level switching between standard level 1 and the higher power levels - New API for netdev <-> devlink_port linkage - PTP: convert existing drivers to new frequency adjustment implementation - DSA: add support for rx offloading - Autoload DSA tagging driver when dynamically changing protocol - Add new PCP and APPTRUST attributes to Data Center Bridging - Add configuration support for 800Gbps link speed - Add devlink port function attribute to enable/disable RoCE and migratable - Extend devlink-rate to support strict prioriry and weighted fair queuing - Add devlink support to directly reading from region memory - New device tree helper to fetch MAC address from nvmem - New big TCP helper to simplify temporary header stripping New hardware / drivers: - Ethernet: - Marvel Octeon CNF95N and CN10KB Ethernet Switches - Marvel Prestera AC5X Ethernet Switch - WangXun 10 Gigabit NIC - Motorcomm yt8521 Gigabit Ethernet - Microchip ksz9563 Gigabit Ethernet Switch - Microsoft Azure Network Adapter - Linux Automation 10Base-T1L adapter - PHY: - Aquantia AQR112 and AQR412 - Motorcomm YT8531S - PTP: - Orolia ART-CARD - WiFi: - MediaTek Wi-Fi 7 (802.11be) devices - RealTek rtw8821cu, rtw8822bu, rtw8822cu and rtw8723du USB devices - Bluetooth: - Broadcom BCM4377/4378/4387 Bluetooth chipsets - Realtek RTL8852BE and RTL8723DS - Cypress.CYW4373A0 WiFi + Bluetooth combo device Drivers: - CAN: - gs_usb: bus error reporting support - kvaser_usb: listen only and bus error reporting support - Ethernet NICs: - Intel (100G): - extend action skbedit to RX queue mapping - implement devlink-rate support - support direct read from memory - nVidia/Mellanox (mlx5): - SW steering improvements, increasing rules update rate - Support for enhanced events compression - extend H/W offload packet manipulation capabilities - implement IPSec packet offload mode - nVidia/Mellanox (mlx4): - better big TCP support - Netronome Ethernet NICs (nfp): - IPsec offload support - add support for multicast filter - Broadcom: - RSS and PTP support improvements - AMD/SolarFlare: - netlink extened ack improvements - add basic flower matches to offload, and related stats - Virtual NICs: - ibmvnic: introduce affinity hint support - small / embedded: - FreeScale fec: add initial XDP support - Marvel mv643xx_eth: support MII/GMII/RGMII modes for Kirkwood - TI am65-cpsw: add suspend/resume support - Mediatek MT7986: add RX wireless wthernet dispatch support - Realtek 8169: enable GRO software interrupt coalescing per default - Ethernet high-speed switches: - Microchip (sparx5): - add support for Sparx5 TC/flower H/W offload via VCAP - Mellanox mlxsw: - add 802.1X and MAC Authentication Bypass offload support - add ip6gre support - Embedded Ethernet switches: - Mediatek (mtk_eth_soc): - improve PCS implementation, add DSA untag support - enable flow offload support - Renesas: - add rswitch R-Car Gen4 gPTP support - Microchip (lan966x): - add full XDP support - add TC H/W offload via VCAP - enable PTP on bridge interfaces - Microchip (ksz8): - add MTU support for KSZ8 series - Qualcomm 802.11ax WiFi (ath11k): - support configuring channel dwell time during scan - MediaTek WiFi (mt76): - enable Wireless Ethernet Dispatch (WED) offload support - add ack signal support - enable coredump support - remain_on_channel support - Intel WiFi (iwlwifi): - enable Wi-Fi 7 Extremely High Throughput (EHT) PHY capabilities - 320 MHz channels support - RealTek WiFi (rtw89): - new dynamic header firmware format support - wake-over-WLAN support" * tag 'net-next-6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (2002 commits) ipvs: fix type warning in do_div() on 32 bit net: lan966x: Remove a useless test in lan966x_ptp_add_trap() net: ipa: add IPA v4.7 support dt-bindings: net: qcom,ipa: Add SM6350 compatible bnxt: Use generic HBH removal helper in tx path IPv6/GRO: generic helper to remove temporary HBH/jumbo header in driver selftests: forwarding: Add bridge MDB test selftests: forwarding: Rename bridge_mdb test bridge: mcast: Support replacement of MDB port group entries bridge: mcast: Allow user space to specify MDB entry routing protocol bridge: mcast: Allow user space to add (*, G) with a source list and filter mode bridge: mcast: Add support for (*, G) with a source list and filter mode bridge: mcast: Avoid arming group timer when (S, G) corresponds to a source bridge: mcast: Add a flag for user installed source entries bridge: mcast: Expose __br_multicast_del_group_src() bridge: mcast: Expose br_multicast_new_group_src() bridge: mcast: Add a centralized error path bridge: mcast: Place netlink policy before validation functions bridge: mcast: Split (*, G) and (S, G) addition into different functions bridge: mcast: Do not derive entry type from its filter mode ... |