Commit Graph

138 Commits

Author SHA1 Message Date
Waiman Long
1bf33285d9 cgroup/cpuset: Minor updates to test_cpuset_prs.sh
This patch makes the following minor updates to the cpuset partition
testing script test_cpuset_prs.sh.

 - Remove online_cpus function call as it will be called anyway on exit
   in cleanup.
 - Make the enabling of sched/verbose debugfs flag conditional on the
   "-v" verbose option and set DELAY_FACTOR to 2 in this case as cpuset
   partition operations are likely to be slowed down by enabling that.

Signed-off-by: Waiman Long <longman@redhat.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2023-03-29 08:42:25 -10:00
Ivan Orlov
c83f320e55 selftests: cgroup: Add 'malloc' failures checks in test_memcontrol
There are several 'malloc' calls in test_memcontrol, which can be
unsuccessful. This patch will add 'malloc' failures checking to
give more details about test's fail reasons and avoid possible
undefined behavior during the future null dereference (like the
one in alloc_anon_50M_check_swap function).

Signed-off-by: Ivan Orlov <ivan.orlov0322@gmail.com>
Reviewed-by: Muchun Song <songmuchun@bytedance.com>
Acked-by: Shakeel Butt <shakeelb@google.com>
Acked-by: Roman Gushchin <roman.gushchin@linux.dev>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2023-03-28 10:25:25 -06:00
Waiman Long
e5ae880384 cgroup/cpuset: Fix wrong check in update_parent_subparts_cpumask()
It was found that the check to see if a partition could use up all
the cpus from the parent cpuset in update_parent_subparts_cpumask()
was incorrect. As a result, it is possible to leave parent with no
effective cpu left even if there are tasks in the parent cpuset. This
can lead to system panic as reported in [1].

Fix this probem by updating the check to fail the enabling the partition
if parent's effective_cpus is a subset of the child's cpus_allowed.

Also record the error code when an error happens in update_prstate()
and add a test case where parent partition and child have the same cpu
list and parent has task. Enabling partition in the child will fail in
this case.

[1] https://www.spinics.net/lists/cgroups/msg36254.html

Fixes: f0af1bfc27 ("cgroup/cpuset: Relax constraints to partition & cpus changes")
Cc: stable@vger.kernel.org # v6.1
Reported-by: Srinivas Pandruvada <srinivas.pandruvada@intel.com>
Signed-off-by: Waiman Long <longman@redhat.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2023-01-31 12:14:02 -10:00
Linus Torvalds
e2ca6ba6ba MM patches for 6.2-rc1.
- More userfaultfs work from Peter Xu.
 
 - Several convert-to-folios series from Sidhartha Kumar and Huang Ying.
 
 - Some filemap cleanups from Vishal Moola.
 
 - David Hildenbrand added the ability to selftest anon memory COW handling.
 
 - Some cpuset simplifications from Liu Shixin.
 
 - Addition of vmalloc tracing support by Uladzislau Rezki.
 
 - Some pagecache folioifications and simplifications from Matthew Wilcox.
 
 - A pagemap cleanup from Kefeng Wang: we have VM_ACCESS_FLAGS, so use it.
 
 - Miguel Ojeda contributed some cleanups for our use of the
   __no_sanitize_thread__ gcc keyword.  This series shold have been in the
   non-MM tree, my bad.
 
 - Naoya Horiguchi improved the interaction between memory poisoning and
   memory section removal for huge pages.
 
 - DAMON cleanups and tuneups from SeongJae Park
 
 - Tony Luck fixed the handling of COW faults against poisoned pages.
 
 - Peter Xu utilized the PTE marker code for handling swapin errors.
 
 - Hugh Dickins reworked compound page mapcount handling, simplifying it
   and making it more efficient.
 
 - Removal of the autonuma savedwrite infrastructure from Nadav Amit and
   David Hildenbrand.
 
 - zram support for multiple compression streams from Sergey Senozhatsky.
 
 - David Hildenbrand reworked the GUP code's R/O long-term pinning so
   that drivers no longer need to use the FOLL_FORCE workaround which
   didn't work very well anyway.
 
 - Mel Gorman altered the page allocator so that local IRQs can remnain
   enabled during per-cpu page allocations.
 
 - Vishal Moola removed the try_to_release_page() wrapper.
 
 - Stefan Roesch added some per-BDI sysfs tunables which are used to
   prevent network block devices from dirtying excessive amounts of
   pagecache.
 
 - David Hildenbrand did some cleanup and repair work on KSM COW
   breaking.
 
 - Nhat Pham and Johannes Weiner have implemented writeback in zswap's
   zsmalloc backend.
 
 - Brian Foster has fixed a longstanding corner-case oddity in
   file[map]_write_and_wait_range().
 
 - sparse-vmemmap changes for MIPS, LoongArch and NIOS2 from Feiyang
   Chen.
 
 - Shiyang Ruan has done some work on fsdax, to make its reflink mode
   work better under xfstests.  Better, but still not perfect.
 
 - Christoph Hellwig has removed the .writepage() method from several
   filesystems.  They only need .writepages().
 
 - Yosry Ahmed wrote a series which fixes the memcg reclaim target
   beancounting.
 
 - David Hildenbrand has fixed some of our MM selftests for 32-bit
   machines.
 
 - Many singleton patches, as usual.
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCY5j6ZwAKCRDdBJ7gKXxA
 jkDYAP9qNeVqp9iuHjZNTqzMXkfmJPsw2kmy2P+VdzYVuQRcJgEAgoV9d7oMq4ml
 CodAgiA51qwzId3GRytIo/tfWZSezgA=
 =d19R
 -----END PGP SIGNATURE-----

Merge tag 'mm-stable-2022-12-13' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull MM updates from Andrew Morton:

 - More userfaultfs work from Peter Xu

 - Several convert-to-folios series from Sidhartha Kumar and Huang Ying

 - Some filemap cleanups from Vishal Moola

 - David Hildenbrand added the ability to selftest anon memory COW
   handling

 - Some cpuset simplifications from Liu Shixin

 - Addition of vmalloc tracing support by Uladzislau Rezki

 - Some pagecache folioifications and simplifications from Matthew
   Wilcox

 - A pagemap cleanup from Kefeng Wang: we have VM_ACCESS_FLAGS, so use
   it

 - Miguel Ojeda contributed some cleanups for our use of the
   __no_sanitize_thread__ gcc keyword.

   This series should have been in the non-MM tree, my bad

 - Naoya Horiguchi improved the interaction between memory poisoning and
   memory section removal for huge pages

 - DAMON cleanups and tuneups from SeongJae Park

 - Tony Luck fixed the handling of COW faults against poisoned pages

 - Peter Xu utilized the PTE marker code for handling swapin errors

 - Hugh Dickins reworked compound page mapcount handling, simplifying it
   and making it more efficient

 - Removal of the autonuma savedwrite infrastructure from Nadav Amit and
   David Hildenbrand

 - zram support for multiple compression streams from Sergey Senozhatsky

 - David Hildenbrand reworked the GUP code's R/O long-term pinning so
   that drivers no longer need to use the FOLL_FORCE workaround which
   didn't work very well anyway

 - Mel Gorman altered the page allocator so that local IRQs can remnain
   enabled during per-cpu page allocations

 - Vishal Moola removed the try_to_release_page() wrapper

 - Stefan Roesch added some per-BDI sysfs tunables which are used to
   prevent network block devices from dirtying excessive amounts of
   pagecache

 - David Hildenbrand did some cleanup and repair work on KSM COW
   breaking

 - Nhat Pham and Johannes Weiner have implemented writeback in zswap's
   zsmalloc backend

 - Brian Foster has fixed a longstanding corner-case oddity in
   file[map]_write_and_wait_range()

 - sparse-vmemmap changes for MIPS, LoongArch and NIOS2 from Feiyang
   Chen

 - Shiyang Ruan has done some work on fsdax, to make its reflink mode
   work better under xfstests. Better, but still not perfect

 - Christoph Hellwig has removed the .writepage() method from several
   filesystems. They only need .writepages()

 - Yosry Ahmed wrote a series which fixes the memcg reclaim target
   beancounting

 - David Hildenbrand has fixed some of our MM selftests for 32-bit
   machines

 - Many singleton patches, as usual

* tag 'mm-stable-2022-12-13' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (313 commits)
  mm/hugetlb: set head flag before setting compound_order in __prep_compound_gigantic_folio
  mm: mmu_gather: allow more than one batch of delayed rmaps
  mm: fix typo in struct pglist_data code comment
  kmsan: fix memcpy tests
  mm: add cond_resched() in swapin_walk_pmd_entry()
  mm: do not show fs mm pc for VM_LOCKONFAULT pages
  selftests/vm: ksm_functional_tests: fixes for 32bit
  selftests/vm: cow: fix compile warning on 32bit
  selftests/vm: madv_populate: fix missing MADV_POPULATE_(READ|WRITE) definitions
  mm/gup_test: fix PIN_LONGTERM_TEST_READ with highmem
  mm,thp,rmap: fix races between updates of subpages_mapcount
  mm: memcg: fix swapcached stat accounting
  mm: add nodes= arg to memory.reclaim
  mm: disable top-tier fallback to reclaim on proactive reclaim
  selftests: cgroup: make sure reclaim target memcg is unprotected
  selftests: cgroup: refactor proactive reclaim code to reclaim_until()
  mm: memcg: fix stale protection of reclaim target memcg
  mm/mmap: properly unaccount memory on mas_preallocate() failure
  omfs: remove ->writepage
  jfs: remove ->writepage
  ...
2022-12-13 19:29:45 -08:00
Linus Torvalds
8702f2c611 Non-MM patches for 6.2-rc1.
- A ptrace API cleanup series from Sergey Shtylyov
 
 - Fixes and cleanups for kexec from ye xingchen
 
 - nilfs2 updates from Ryusuke Konishi
 
 - squashfs feature work from Xiaoming Ni: permit configuration of the
   filesystem's compression concurrency from the mount command line.
 
 - A series from Akinobu Mita which addresses bound checking errors when
   writing to debugfs files.
 
 - A series from Yang Yingliang to address rapido memory leaks
 
 - A series from Zheng Yejian to address possible overflow errors in
   encode_comp_t().
 
 - And a whole shower of singleton patches all over the place.
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCY5efRgAKCRDdBJ7gKXxA
 jgvdAP0al6oFDtaSsshIdNhrzcMwfjt6PfVxxHdLmNhF1hX2dwD/SVluS1bPSP7y
 0sZp7Ustu3YTb8aFkMl96Y9m9mY1Nwg=
 =ga5B
 -----END PGP SIGNATURE-----

Merge tag 'mm-nonmm-stable-2022-12-12' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull non-MM updates from Andrew Morton:

 - A ptrace API cleanup series from Sergey Shtylyov

 - Fixes and cleanups for kexec from ye xingchen

 - nilfs2 updates from Ryusuke Konishi

 - squashfs feature work from Xiaoming Ni: permit configuration of the
   filesystem's compression concurrency from the mount command line

 - A series from Akinobu Mita which addresses bound checking errors when
   writing to debugfs files

 - A series from Yang Yingliang to address rapidio memory leaks

 - A series from Zheng Yejian to address possible overflow errors in
   encode_comp_t()

 - And a whole shower of singleton patches all over the place

* tag 'mm-nonmm-stable-2022-12-12' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (79 commits)
  ipc: fix memory leak in init_mqueue_fs()
  hfsplus: fix bug causing custom uid and gid being unable to be assigned with mount
  rapidio: devices: fix missing put_device in mport_cdev_open
  kcov: fix spelling typos in comments
  hfs: Fix OOB Write in hfs_asc2mac
  hfs: fix OOB Read in __hfs_brec_find
  relay: fix type mismatch when allocating memory in relay_create_buf()
  ocfs2: always read both high and low parts of dinode link count
  io-mapping: move some code within the include guarded section
  kernel: kcsan: kcsan_test: build without structleak plugin
  mailmap: update email for Iskren Chernev
  eventfd: change int to __u64 in eventfd_signal() ifndef CONFIG_EVENTFD
  rapidio: fix possible UAF when kfifo_alloc() fails
  relay: use strscpy() is more robust and safer
  cpumask: limit visibility of FORCE_NR_CPUS
  acct: fix potential integer overflow in encode_comp_t()
  acct: fix accuracy loss for input value of encode_comp_t()
  linux/init.h: include <linux/build_bug.h> and <linux/stringify.h>
  rapidio: rio: fix possible name leak in rio_register_mport()
  rapidio: fix possible name leaks when rio_add_device() fails
  ...
2022-12-12 17:28:58 -08:00
Linus Torvalds
a312a8cc3c cgroup changes for v6.2-rc1
Nothing too interesting.
 
 * Add CONFIG_DEBUG_GROUP_REF which makes cgroup refcnt operations kprobable.
 
 * A couple cpuset optimizations.
 
 * Other misc changes including doc and test updates.
 -----BEGIN PGP SIGNATURE-----
 
 iIQEABYIACwWIQTfIjM1kS57o3GsC/uxYfJx3gVYGQUCY5bHvg4cdGpAa2VybmVs
 Lm9yZwAKCRCxYfJx3gVYGcYrAQCfrlzrbWw6gTQ7fmr0Avxjy5FxLjsdzEGPcmGY
 ByEMhgD/VdUf3zI/Khr91Gsi5JXQxQf7a5caD369xupRWUWjqA8=
 =Nf+E
 -----END PGP SIGNATURE-----

Merge tag 'cgroup-for-6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup

Pull cgroup updates from Tejun Heo:
 "Nothing too interesting:

   - Add CONFIG_DEBUG_GROUP_REF which makes cgroup refcnt operations
     kprobable

   - A couple cpuset optimizations

   - Other misc changes including doc and test updates"

* tag 'cgroup-for-6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
  cgroup: remove rcu_read_lock()/rcu_read_unlock() in critical section of spin_lock_irq()
  cgroup/cpuset: Improve cpuset_css_alloc() description
  kselftest/cgroup: Add cleanup() to test_cpuset_prs.sh
  cgroup/cpuset: Optimize cpuset_attach() on v2
  cgroup/cpuset: Skip spread flags update on v2
  kselftest/cgroup: Fix gathering number of CPUs
  cgroup: cgroup refcnt functions should be exported when CONFIG_DEBUG_CGROUP_REF
  cgroup: Implement DEBUG_CGROUP_REF
2022-12-12 15:48:36 -08:00
Yosry Ahmed
1c74697776 selftests: cgroup: make sure reclaim target memcg is unprotected
Make sure that we ignore protection of a memcg that is the target of memcg
reclaim.

Link: https://lkml.kernel.org/r/20221202031512.1365483-4-yosryahmed@google.com
Signed-off-by: Yosry Ahmed <yosryahmed@google.com>
Reviewed-by: Roman Gushchin <roman.gushchin@linux.dev>
Cc: Chris Down <chris@chrisdown.name>
Cc: David Rientjes <rientjes@google.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Muchun Song <songmuchun@bytedance.com>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Vasily Averin <vasily.averin@linux.dev>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Yu Zhao <yuzhao@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-12-11 18:12:19 -08:00
Yosry Ahmed
e5d64edac6 selftests: cgroup: refactor proactive reclaim code to reclaim_until()
Refactor the code that drives writing to memory.reclaim (retrying, error
handling, etc) from test_memcg_reclaim() to a helper called
reclaim_until(), which proactively reclaims from a memcg until its usage
reaches a certain value.

While we are at it, refactor and simplify the reclaim loop.

This will be used in a following patch in another test.

Link: https://lkml.kernel.org/r/20221202031512.1365483-3-yosryahmed@google.com
Signed-off-by: Yosry Ahmed <yosryahmed@google.com>
Suggested-by: Roman Gushchin <roman.gushchin@linux.dev>
Reviewed-by: Roman Gushchin <roman.gushchin@linux.dev>
Cc: Chris Down <chris@chrisdown.name>
Cc: David Rientjes <rientjes@google.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Muchun Song <songmuchun@bytedance.com>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Vasily Averin <vasily.averin@linux.dev>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Yu Zhao <yuzhao@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-12-11 18:12:19 -08:00
Michal Hocko
de16d6e4a9 kselftests: cgroup: update kmem test precision tolerance
1813e51eec ("memcg: increase MEMCG_CHARGE_BATCH to 64") has changed
the batch size while this test case has been left behind. This has led
to a test failure reported by test bot:
not ok 2 selftests: cgroup: test_kmem # exit=1

Update the tolerance for the pcp charges to reflect the
MEMCG_CHARGE_BATCH change to fix this.

[akpm@linux-foundation.org: update comments, per Roman]
Link: https://lkml.kernel.org/r/Y4m8Unt6FhWKC6IH@dhcp22.suse.cz
Fixes: 1813e51eec ("memcg: increase MEMCG_CHARGE_BATCH to 64")
Signed-off-by: Michal Hocko <mhocko@suse.com>
Reported-by: kernel test robot <yujie.liu@intel.com>
  Link: https://lore.kernel.org/oe-lkp/202212010958.c1053bd3-yujie.liu@intel.com
Acked-by: Shakeel Butt <shakeelb@google.com>
Acked-by: Roman Gushchin <roman.gushchin@linux.dev>
Tested-by: Yujie Liu <yujie.liu@intel.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Feng Tang <feng.tang@intel.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: "Michal Koutný" <mkoutny@suse.com>
Cc: Muchun Song <songmuchun@bytedance.com>
Cc: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-12-09 18:41:16 -08:00
Kamalesh Babulal
1e85591db1 kselftest/cgroup: Add cleanup() to test_cpuset_prs.sh
Install a cleanup function using the trap command for signals EXIT,
SIGINT, SIGQUIT and SIGABRT.  The cleanup function will perform:
1. Online the CPUs that were made offline during the test.
2. Removing the cgroups created.
3. Restoring the original /sys/kernel/debug/sched/verbose value,
   currently it's left turned on, irrespective of the original
   configuration value.

the test performs steps 1 and 2, on the successful runs, but not during
all of the failed runs.  With the cleanup(), the system will perform all
three steps during failed/passed test runs.

Signed-off-by: Kamalesh Babulal <kamalesh.babulal@oracle.com>
Acked-by: Waiman Long <longman@redhat.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2022-11-22 10:25:37 -10:00
YueHaibing
333d073dee selftests: cgroup: fix unsigned comparison with less than zero
'size' is unsigned, it never less than zero.

Link: https://lkml.kernel.org/r/20221105110611.28920-1-yuehaibing@huawei.com
Fixes: 6c26df84e1 ("selftests: cgroup: return -errno from cg_read()/cg_write() on failure")
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Reviewed-by: Yosry Ahmed <yosryahmed@google.com>
Acked-by: Roman Gushchin <roman.gushchin@linux.dev>
Reviewed-by: Kamalesh Babulal <kamalesh.babulal@oracle.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Tejun Heo <tj@kernel.org>
Cc: zefan li <lizefan.x@bytedance.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-11-18 13:55:09 -08:00
Breno Leitao
35b7fa4ed1 kselftest/cgroup: Fix gathering number of CPUs
test_cpuset_prs.sh is failing with the following error:

	test_cpuset_prs.sh: line 29: [[: 8
	57%: syntax error in expression (error token is "57%")

This is happening because `lscpu | grep "^CPU(s)"` returns two lines in
some systems (such as Debian unstable):

	# lscpu | grep "^CPU(s)"
	CPU(s):                          8
	CPU(s) scaling MHz:              55%

This is a simple fix that discard the second line.

Signed-off-by: Breno Leitao <leitao@debian.org>
Reviewed-by: Kamalesh Babulal <kamalesh.babulal@oracle.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2022-11-14 11:37:56 -10:00
Linus Torvalds
27bc50fc90 - Yu Zhao's Multi-Gen LRU patches are here. They've been under test in
linux-next for a couple of months without, to my knowledge, any negative
   reports (or any positive ones, come to that).
 
 - Also the Maple Tree from Liam R.  Howlett.  An overlapping range-based
   tree for vmas.  It it apparently slight more efficient in its own right,
   but is mainly targeted at enabling work to reduce mmap_lock contention.
 
   Liam has identified a number of other tree users in the kernel which
   could be beneficially onverted to mapletrees.
 
   Yu Zhao has identified a hard-to-hit but "easy to fix" lockdep splat
   (https://lkml.kernel.org/r/CAOUHufZabH85CeUN-MEMgL8gJGzJEWUrkiM58JkTbBhh-jew0Q@mail.gmail.com).
   This has yet to be addressed due to Liam's unfortunately timed
   vacation.  He is now back and we'll get this fixed up.
 
 - Dmitry Vyukov introduces KMSAN: the Kernel Memory Sanitizer.  It uses
   clang-generated instrumentation to detect used-unintialized bugs down to
   the single bit level.
 
   KMSAN keeps finding bugs.  New ones, as well as the legacy ones.
 
 - Yang Shi adds a userspace mechanism (madvise) to induce a collapse of
   memory into THPs.
 
 - Zach O'Keefe has expanded Yang Shi's madvise(MADV_COLLAPSE) to support
   file/shmem-backed pages.
 
 - userfaultfd updates from Axel Rasmussen
 
 - zsmalloc cleanups from Alexey Romanov
 
 - cleanups from Miaohe Lin: vmscan, hugetlb_cgroup, hugetlb and memory-failure
 
 - Huang Ying adds enhancements to NUMA balancing memory tiering mode's
   page promotion, with a new way of detecting hot pages.
 
 - memcg updates from Shakeel Butt: charging optimizations and reduced
   memory consumption.
 
 - memcg cleanups from Kairui Song.
 
 - memcg fixes and cleanups from Johannes Weiner.
 
 - Vishal Moola provides more folio conversions
 
 - Zhang Yi removed ll_rw_block() :(
 
 - migration enhancements from Peter Xu
 
 - migration error-path bugfixes from Huang Ying
 
 - Aneesh Kumar added ability for a device driver to alter the memory
   tiering promotion paths.  For optimizations by PMEM drivers, DRM
   drivers, etc.
 
 - vma merging improvements from Jakub Matěn.
 
 - NUMA hinting cleanups from David Hildenbrand.
 
 - xu xin added aditional userspace visibility into KSM merging activity.
 
 - THP & KSM code consolidation from Qi Zheng.
 
 - more folio work from Matthew Wilcox.
 
 - KASAN updates from Andrey Konovalov.
 
 - DAMON cleanups from Kaixu Xia.
 
 - DAMON work from SeongJae Park: fixes, cleanups.
 
 - hugetlb sysfs cleanups from Muchun Song.
 
 - Mike Kravetz fixes locking issues in hugetlbfs and in hugetlb core.
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCY0HaPgAKCRDdBJ7gKXxA
 joPjAQDZ5LlRCMWZ1oxLP2NOTp6nm63q9PWcGnmY50FjD/dNlwEAnx7OejCLWGWf
 bbTuk6U2+TKgJa4X7+pbbejeoqnt5QU=
 =xfWx
 -----END PGP SIGNATURE-----

Merge tag 'mm-stable-2022-10-08' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull MM updates from Andrew Morton:

 - Yu Zhao's Multi-Gen LRU patches are here. They've been under test in
   linux-next for a couple of months without, to my knowledge, any
   negative reports (or any positive ones, come to that).

 - Also the Maple Tree from Liam Howlett. An overlapping range-based
   tree for vmas. It it apparently slightly more efficient in its own
   right, but is mainly targeted at enabling work to reduce mmap_lock
   contention.

   Liam has identified a number of other tree users in the kernel which
   could be beneficially onverted to mapletrees.

   Yu Zhao has identified a hard-to-hit but "easy to fix" lockdep splat
   at [1]. This has yet to be addressed due to Liam's unfortunately
   timed vacation. He is now back and we'll get this fixed up.

 - Dmitry Vyukov introduces KMSAN: the Kernel Memory Sanitizer. It uses
   clang-generated instrumentation to detect used-unintialized bugs down
   to the single bit level.

   KMSAN keeps finding bugs. New ones, as well as the legacy ones.

 - Yang Shi adds a userspace mechanism (madvise) to induce a collapse of
   memory into THPs.

 - Zach O'Keefe has expanded Yang Shi's madvise(MADV_COLLAPSE) to
   support file/shmem-backed pages.

 - userfaultfd updates from Axel Rasmussen

 - zsmalloc cleanups from Alexey Romanov

 - cleanups from Miaohe Lin: vmscan, hugetlb_cgroup, hugetlb and
   memory-failure

 - Huang Ying adds enhancements to NUMA balancing memory tiering mode's
   page promotion, with a new way of detecting hot pages.

 - memcg updates from Shakeel Butt: charging optimizations and reduced
   memory consumption.

 - memcg cleanups from Kairui Song.

 - memcg fixes and cleanups from Johannes Weiner.

 - Vishal Moola provides more folio conversions

 - Zhang Yi removed ll_rw_block() :(

 - migration enhancements from Peter Xu

 - migration error-path bugfixes from Huang Ying

 - Aneesh Kumar added ability for a device driver to alter the memory
   tiering promotion paths. For optimizations by PMEM drivers, DRM
   drivers, etc.

 - vma merging improvements from Jakub Matěn.

 - NUMA hinting cleanups from David Hildenbrand.

 - xu xin added aditional userspace visibility into KSM merging
   activity.

 - THP & KSM code consolidation from Qi Zheng.

 - more folio work from Matthew Wilcox.

 - KASAN updates from Andrey Konovalov.

 - DAMON cleanups from Kaixu Xia.

 - DAMON work from SeongJae Park: fixes, cleanups.

 - hugetlb sysfs cleanups from Muchun Song.

 - Mike Kravetz fixes locking issues in hugetlbfs and in hugetlb core.

Link: https://lkml.kernel.org/r/CAOUHufZabH85CeUN-MEMgL8gJGzJEWUrkiM58JkTbBhh-jew0Q@mail.gmail.com [1]

* tag 'mm-stable-2022-10-08' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (555 commits)
  hugetlb: allocate vma lock for all sharable vmas
  hugetlb: take hugetlb vma_lock when clearing vma_lock->vma pointer
  hugetlb: fix vma lock handling during split vma and range unmapping
  mglru: mm/vmscan.c: fix imprecise comments
  mm/mglru: don't sync disk for each aging cycle
  mm: memcontrol: drop dead CONFIG_MEMCG_SWAP config symbol
  mm: memcontrol: use do_memsw_account() in a few more places
  mm: memcontrol: deprecate swapaccounting=0 mode
  mm: memcontrol: don't allocate cgroup swap arrays when memcg is disabled
  mm/secretmem: remove reduntant return value
  mm/hugetlb: add available_huge_pages() func
  mm: remove unused inline functions from include/linux/mm_inline.h
  selftests/vm: add selftest for MADV_COLLAPSE of uffd-minor memory
  selftests/vm: add file/shmem MADV_COLLAPSE selftest for cleared pmd
  selftests/vm: add thp collapse shmem testing
  selftests/vm: add thp collapse file and tmpfs testing
  selftests/vm: modularize thp collapse memory operations
  selftests/vm: dedup THP helpers
  mm/khugepaged: add tracepoint to hpage_collapse_scan_file()
  mm/madvise: add file and shmem support to MADV_COLLAPSE
  ...
2022-10-10 17:53:04 -07:00
Johannes Weiner
e55b9f9686 mm: memcontrol: drop dead CONFIG_MEMCG_SWAP config symbol
Since 2d1c498072 ("mm: memcontrol: make swap tracking an integral part
of memory control"), CONFIG_MEMCG_SWAP hasn't been a user-visible config
option anymore, it just means CONFIG_MEMCG && CONFIG_SWAP.

Update the sites accordingly and drop the symbol.

[ While touching the docs, remove two references to CONFIG_MEMCG_KMEM,
  which hasn't been a user-visible symbol for over half a decade. ]

Link: https://lkml.kernel.org/r/20220926135704.400818-5-hannes@cmpxchg.org
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Shakeel Butt <shakeelb@google.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Roman Gushchin <roman.gushchin@linux.dev>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-10-03 14:03:36 -07:00
Waiman Long
a8c52eba88 kselftest/cgroup: Add cpuset v2 partition root state test
Add a test script test_cpuset_prs.sh with a helper program wait_inotify
for exercising the cpuset v2 partition root state code.

Signed-off-by: Waiman Long <longman@redhat.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2022-09-04 10:47:28 -10:00
Linus Torvalds
8291eaafed Two followon fixes for the post-5.19 series "Use pageblock_order for cma
and alloc_contig_range alignment", from Zi Yan.
 
 A series of z3fold cleanups and fixes from Miaohe Lin.
 
 Some memcg selftests work from Michal Koutný <mkoutny@suse.com>
 
 Some swap fixes and cleanups from Miaohe Lin.
 
 Several individual minor fixups.
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCYpEE7QAKCRDdBJ7gKXxA
 jlamAP9WmjNdx+5Pz5OkkaSjBO7y7vBrBTcQ9e5pz8bUWRoQhwEA+WtsssLmq9aI
 7DBDmBKYCMTbzOQTqaMRHkB+JWZo+Ao=
 =L3f1
 -----END PGP SIGNATURE-----

Merge tag 'mm-stable-2022-05-27' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull more MM updates from Andrew Morton:

 - Two follow-on fixes for the post-5.19 series "Use pageblock_order for
   cma and alloc_contig_range alignment", from Zi Yan.

 - A series of z3fold cleanups and fixes from Miaohe Lin.

 - Some memcg selftests work from Michal Koutný <mkoutny@suse.com>

 - Some swap fixes and cleanups from Miaohe Lin

 - Several individual minor fixups

* tag 'mm-stable-2022-05-27' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (25 commits)
  mm/shmem.c: suppress shift warning
  mm: Kconfig: reorganize misplaced mm options
  mm: kasan: fix input of vmalloc_to_page()
  mm: fix is_pinnable_page against a cma page
  mm: filter out swapin error entry in shmem mapping
  mm/shmem: fix infinite loop when swap in shmem error at swapoff time
  mm/madvise: free hwpoison and swapin error entry in madvise_free_pte_range
  mm/swapfile: fix lost swap bits in unuse_pte()
  mm/swapfile: unuse_pte can map random data if swap read fails
  selftests: memcg: factor out common parts of memory.{low,min} tests
  selftests: memcg: remove protection from top level memcg
  selftests: memcg: adjust expected reclaim values of protected cgroups
  selftests: memcg: expect no low events in unprotected sibling
  selftests: memcg: fix compilation
  mm/z3fold: fix z3fold_page_migrate races with z3fold_map
  mm/z3fold: fix z3fold_reclaim_page races with z3fold_free
  mm/z3fold: always clear PAGE_CLAIMED under z3fold page lock
  mm/z3fold: put z3fold page back into unbuddied list when reclaim or migration fails
  revert "mm/z3fold.c: allow __GFP_HIGHMEM in z3fold_alloc"
  mm/z3fold: throw warning on failure of trylock_page in z3fold_alloc
  ...
2022-05-27 11:40:49 -07:00
Michal Koutný
f079a020ba selftests: memcg: factor out common parts of memory.{low,min} tests
The memory protection test setup and runtime is almost equal for
memory.low and memory.min cases.

It makes modification of the common parts prone to mistakes, since the
protections are similar not only in setup but also in principle, factor
the common part out.

Past exceptions between the tests:
- missing memory.min is fine (kept),
- test_memcg_low protected orphaned pagecache (adapted like
  test_memcg_min and we keep the processes of protected memory running).

The evaluation in two tests is different (OOM of allocator vs low events
of protégés), this is kept different.

Link: https://lkml.kernel.org/r/20220518161859.21565-6-mkoutny@suse.com
Signed-off-by: Michal Koutný <mkoutny@suse.com>
Acked-by: Roman Gushchin <roman.gushchin@linux.dev>
CC: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: Richard Palethorpe <rpalethorpe@suse.de>
Cc: David Vernet <void@manifault.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-05-27 09:33:45 -07:00
Michal Koutný
6a35919005 selftests: memcg: remove protection from top level memcg
The reclaim is triggered by memory limit in a subtree, therefore the
testcase does not need configured protection against external reclaim.

Also, correct respective comments.

Link: https://lkml.kernel.org/r/20220518161859.21565-5-mkoutny@suse.com
Signed-off-by: Michal Koutný <mkoutny@suse.com>
Acked-by: Roman Gushchin <roman.gushchin@linux.dev>
Cc: David Vernet <void@manifault.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Richard Palethorpe <rpalethorpe@suse.de>
Cc: Shakeel Butt <shakeelb@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-05-27 09:33:45 -07:00
Michal Koutný
f10b6e9a8e selftests: memcg: adjust expected reclaim values of protected cgroups
The numbers are not easy to derive in a closed form (certainly mere
protections ratios do not apply), therefore use a simulation to obtain
expected numbers.

Link: https://lkml.kernel.org/r/20220518161859.21565-4-mkoutny@suse.com
Signed-off-by: Michal Koutný <mkoutny@suse.com>
Acked-by: Roman Gushchin <roman.gushchin@linux.dev>
Cc: David Vernet <void@manifault.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Richard Palethorpe <rpalethorpe@suse.de>
Cc: Shakeel Butt <shakeelb@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-05-27 09:33:45 -07:00
Michal Koutný
1d09069f53 selftests: memcg: expect no low events in unprotected sibling
This is effectively a revert of commit cdc69458a5 ("cgroup: account for
memory_recursiveprot in test_memcg_low()").  The case test_memcg_low will
fail with memory_recursiveprot until resolved in reclaim code.

However, this patch preserves the existing helpers and variables for later
uses.

Link: https://lkml.kernel.org/r/20220518161859.21565-3-mkoutny@suse.com
Signed-off-by: Michal Koutný <mkoutny@suse.com>
Reviewed-by: David Vernet <void@manifault.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Richard Palethorpe <rpalethorpe@suse.de>
Cc: Roman Gushchin <roman.gushchin@linux.dev>
Cc: Shakeel Butt <shakeelb@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-05-27 09:33:45 -07:00
Michal Koutný
ff3b72a5d6 selftests: memcg: fix compilation
Patch series "memcontrol selftests fixups", v2.

Flushing the patches to make memcontrol selftests check the events
behavior we had consensus about (test_memcg_low fails).

(test_memcg_reclaim, test_memcg_swap_max fail for me now but it's present
even before the refactoring.)

The two bigger changes are:
- adjustment of the protected values to make tests succeed with the given
  tolerance,
- both test_memcg_low and test_memcg_min check protection of memory in
  populated cgroups (actually as per Documentation/admin-guide/cgroup-v2.rst
  memory.min should not apply to empty cgroups, which is not the case
  currently. Therefore I unified tests with the populated case in order to to
  bring more broken tests).


This patch (of 5):

This fixes mis-applied changes from commit 72b1e03aa7 ("cgroup: account
for memory_localevents in test_memcg_oom_group_leaf_events()").

Link: https://lkml.kernel.org/r/20220518161859.21565-1-mkoutny@suse.com
Link: https://lkml.kernel.org/r/20220518161859.21565-2-mkoutny@suse.com
Signed-off-by: Michal Koutný <mkoutny@suse.com>
Reviewed-by: David Vernet <void@manifault.com>
Acked-by: Roman Gushchin <roman.gushchin@linux.dev>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Richard Palethorpe <rpalethorpe@suse.de>
Cc: Shakeel Butt <shakeelb@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-05-27 09:33:44 -07:00
Linus Torvalds
98931dd95f Yang Shi has improved the behaviour of khugepaged collapsing of readonly
file-backed transparent hugepages.
 
 Johannes Weiner has arranged for zswap memory use to be tracked and
 managed on a per-cgroup basis.
 
 Munchun Song adds a /proc knob ("hugetlb_optimize_vmemmap") for runtime
 enablement of the recent huge page vmemmap optimization feature.
 
 Baolin Wang contributes a series to fix some issues around hugetlb
 pagetable invalidation.
 
 Zhenwei Pi has fixed some interactions between hwpoisoned pages and
 virtualization.
 
 Tong Tiangen has enabled the use of the presently x86-only
 page_table_check debugging feature on arm64 and riscv.
 
 David Vernet has done some fixup work on the memcg selftests.
 
 Peter Xu has taught userfaultfd to handle write protection faults against
 shmem- and hugetlbfs-backed files.
 
 More DAMON development from SeongJae Park - adding online tuning of the
 feature and support for monitoring of fixed virtual address ranges.  Also
 easier discovery of which monitoring operations are available.
 
 Nadav Amit has done some optimization of TLB flushing during mprotect().
 
 Neil Brown continues to labor away at improving our swap-over-NFS support.
 
 David Hildenbrand has some fixes to anon page COWing versus
 get_user_pages().
 
 Peng Liu fixed some errors in the core hugetlb code.
 
 Joao Martins has reduced the amount of memory consumed by device-dax's
 compound devmaps.
 
 Some cleanups of the arch-specific pagemap code from Anshuman Khandual.
 
 Muchun Song has found and fixed some errors in the TLB flushing of
 transparent hugepages.
 
 Roman Gushchin has done more work on the memcg selftests.
 
 And, of course, many smaller fixes and cleanups.  Notably, the customary
 million cleanup serieses from Miaohe Lin.
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCYo52xQAKCRDdBJ7gKXxA
 jtJFAQD238KoeI9z5SkPMaeBRYSRQmNll85mxs25KapcEgWgGQD9FAb7DJkqsIVk
 PzE+d9hEfirUGdL6cujatwJ6ejYR8Q8=
 =nFe6
 -----END PGP SIGNATURE-----

Merge tag 'mm-stable-2022-05-25' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull MM updates from Andrew Morton:
 "Almost all of MM here. A few things are still getting finished off,
  reviewed, etc.

   - Yang Shi has improved the behaviour of khugepaged collapsing of
     readonly file-backed transparent hugepages.

   - Johannes Weiner has arranged for zswap memory use to be tracked and
     managed on a per-cgroup basis.

   - Munchun Song adds a /proc knob ("hugetlb_optimize_vmemmap") for
     runtime enablement of the recent huge page vmemmap optimization
     feature.

   - Baolin Wang contributes a series to fix some issues around hugetlb
     pagetable invalidation.

   - Zhenwei Pi has fixed some interactions between hwpoisoned pages and
     virtualization.

   - Tong Tiangen has enabled the use of the presently x86-only
     page_table_check debugging feature on arm64 and riscv.

   - David Vernet has done some fixup work on the memcg selftests.

   - Peter Xu has taught userfaultfd to handle write protection faults
     against shmem- and hugetlbfs-backed files.

   - More DAMON development from SeongJae Park - adding online tuning of
     the feature and support for monitoring of fixed virtual address
     ranges. Also easier discovery of which monitoring operations are
     available.

   - Nadav Amit has done some optimization of TLB flushing during
     mprotect().

   - Neil Brown continues to labor away at improving our swap-over-NFS
     support.

   - David Hildenbrand has some fixes to anon page COWing versus
     get_user_pages().

   - Peng Liu fixed some errors in the core hugetlb code.

   - Joao Martins has reduced the amount of memory consumed by
     device-dax's compound devmaps.

   - Some cleanups of the arch-specific pagemap code from Anshuman
     Khandual.

   - Muchun Song has found and fixed some errors in the TLB flushing of
     transparent hugepages.

   - Roman Gushchin has done more work on the memcg selftests.

  ... and, of course, many smaller fixes and cleanups. Notably, the
  customary million cleanup serieses from Miaohe Lin"

* tag 'mm-stable-2022-05-25' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (381 commits)
  mm: kfence: use PAGE_ALIGNED helper
  selftests: vm: add the "settings" file with timeout variable
  selftests: vm: add "test_hmm.sh" to TEST_FILES
  selftests: vm: check numa_available() before operating "merge_across_nodes" in ksm_tests
  selftests: vm: add migration to the .gitignore
  selftests/vm/pkeys: fix typo in comment
  ksm: fix typo in comment
  selftests: vm: add process_mrelease tests
  Revert "mm/vmscan: never demote for memcg reclaim"
  mm/kfence: print disabling or re-enabling message
  include/trace/events/percpu.h: cleanup for "percpu: improve percpu_alloc_percpu event trace"
  include/trace/events/mmflags.h: cleanup for "tracing: incorrect gfp_t conversion"
  mm: fix a potential infinite loop in start_isolate_page_range()
  MAINTAINERS: add Muchun as co-maintainer for HugeTLB
  zram: fix Kconfig dependency warning
  mm/shmem: fix shmem folio swapoff hang
  cgroup: fix an error handling path in alloc_pagecache_max_30M()
  mm: damon: use HPAGE_PMD_SIZE
  tracing: incorrect isolate_mote_t cast in mm_vmscan_lru_isolate
  nodemask.h: fix compilation error with GCC12
  ...
2022-05-26 12:32:41 -07:00
Christophe JAILLET
7fb6378701 cgroup: fix an error handling path in alloc_pagecache_max_30M()
If the first goto is taken, 'fd' is not opened yet (and is un-initialized).
So a direct return is safer.

Link: https://lkml.kernel.org/r/628312312eb40e0e39463a2c06415fde5295c716.1653229120.git.christophe.jaillet@wanadoo.fr
Fixes: c1a31a2f7a ("cgroup: fix racy check in alloc_pagecache_max_30M() helper function")
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Dan Carpenter <dan.carpenter@oracle.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Roman Gushchin <roman.gushchin@linux.dev>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: Muchun Song <songmuchun@bytedance.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Zefan Li <lizefan.x@bytedance.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: David Vernet <void@manifault.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-05-25 10:47:46 -07:00
Waiman Long
213adc63df kseltest/cgroup: Make test_stress.sh work if run interactively
Commit 54de76c012 ("kselftest/cgroup: fix test_stress.sh to use OUTPUT
dir") changes the test_core command path from . to $OUTPUT. However,
variable OUTPUT may not be defined if the command is run interactively.
Fix that by using ${OUTPUT:-.} to cover both cases.

Signed-off-by: Waiman Long <longman@redhat.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2022-05-13 09:33:21 -10:00
David Vernet
c1a31a2f7a cgroup: fix racy check in alloc_pagecache_max_30M() helper function
alloc_pagecache_max_30M() in the cgroup memcg tests performs a 50MB
pagecache allocation, which it expects to be capped at 30MB due to the
calling process having a memory.high setting of 30MB.  After the
allocation, the function contains a check that verifies that MB(29) <
memory.current <= MB(30).  This check can actually fail
non-deterministically.

The testcases that use this function are test_memcg_high() and
test_memcg_max(), which set memory.min and memory.max to 30MB respectively
for the cgroup under test.  The allocation can slightly exceed this number
in both cases, and for memory.max, the process performing the allocation
will not have the OOM killer invoked as it's performing a pagecache
allocation.  This patchset therefore updates the above check to instead
use the verify_close() helper function.

Link: https://lkml.kernel.org/r/20220423155619.3669555-6-void@manifault.com
Signed-off-by: David Vernet <void@manifault.com>
Acked-by: Roman Gushchin <roman.gushchin@linux.dev>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: Tejun Heo <tj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-05-13 07:20:12 -07:00
David Vernet
830316807e cgroup: remove racy check in test_memcg_sock()
test_memcg_sock() in the cgroup memcg tests, verifies expected memory
accounting for sockets.  The test forks a process which functions as a TCP
server, and sends large buffers back and forth between itself (as the TCP
client) and the forked TCP server.  While doing so, it verifies that
memory.current and memory.stat.sock look correct.

There is currently a check in tcp_client() which asserts memory.current >=
memory.stat.sock.  This check is racy, as between memory.current and
memory.stat.sock being queried, a packet could come in which causes
mem_cgroup_charge_skmem() to be invoked.  This could cause
memory.stat.sock to exceed memory.current.  Reversing the order of
querying doesn't address the problem either, as memory may be reclaimed
between the two calls.  Instead, this patch just removes that assertion
altogether, and instead relies on the values_close() check that follows to
validate the expected accounting.

Link: https://lkml.kernel.org/r/20220423155619.3669555-5-void@manifault.com
Signed-off-by: David Vernet <void@manifault.com>
Acked-by: Roman Gushchin <roman.gushchin@linux.dev>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: Tejun Heo <tj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-05-13 07:20:12 -07:00
David Vernet
72b1e03aa7 cgroup: account for memory_localevents in test_memcg_oom_group_leaf_events()
The test_memcg_oom_group_leaf_events() testcase in the cgroup memcg tests
validates that processes in a group that perform allocations exceeding
memory.oom.group are killed.  It also validates that the
memory.events.oom_kill events are properly propagated in this case.

Commit 06e11c907ea4 ("kselftests: memcg: update the oom group leaf events
test") fixed test_memcg_oom_group_leaf_events() to account for the fact
that the memory.events.oom_kill events in a child cgroup is propagated up
to its parent.  This behavior can actually be configured by the
memory_localevents mount option, so this patch updates the testcase to
properly account for the possible presence of this mount option.

Link: https://lkml.kernel.org/r/20220423155619.3669555-4-void@manifault.com
Signed-off-by: David Vernet <void@manifault.com>
Acked-by: Roman Gushchin <roman.gushchin@linux.dev>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: Tejun Heo <tj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-05-13 07:20:12 -07:00
David Vernet
cdc69458a5 cgroup: account for memory_recursiveprot in test_memcg_low()
The test_memcg_low() testcase in test_memcontrol.c verifies the expected
behavior of groups using the memory.low knob.  Part of the testcase
verifies that a group with memory.low that experiences reclaim due to
memory pressure elsewhere in the system, observes memory.events.low events
as a result of that reclaim.

In commit 8a931f8013 ("mm: memcontrol: recursive memory.low
protection"), the memory controller was updated to propagate memory.low
and memory.min protection from a parent group to its children via a
configurable memory_recursiveprot mount option.  This unfortunately broke
the memcg tests, which asserts that a sibling that experienced reclaim but
had a memory.low value of 0, would not observe any memory.low events. 
This patch updates test_memcg_low() to account for the new behavior
introduced by memory_recursiveprot.

So as to make the test resilient to multiple configurations, the patch
also adds a new proc_mount_contains() helper that checks for a string in
/proc/mounts, and is used to toggle behavior based on whether the default
memory_recursiveprot was present.

Link: https://lkml.kernel.org/r/20220423155619.3669555-3-void@manifault.com
Signed-off-by: David Vernet <void@manifault.com>
Acked-by: Roman Gushchin <roman.gushchin@linux.dev>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: Tejun Heo <tj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-05-13 07:20:12 -07:00
David Vernet
f0cdaa5687 cgroups: refactor children cgroups in memcg tests
Patch series "Fix bugs in memcontroller cgroup tests", v2.

tools/testing/selftests/cgroup/test_memcontrol.c contains a set of
testcases which validate expected behavior of the cgroup memory
controller.  Roman Gushchin recently sent out a patchset that fixed a few
issues in the test.  This patchset continues that effort by fixing a few
more issues that were causing non-deterministic failures in the suite. 
With this patchset, I'm unable to reproduce any more errors after running
the tests in a continuous loop for many iterations.  Before, I was able to
reproduce at least one of the errors fixed in this patchset with just one
or two runs.


This patch (of 5):

In test_memcg_min() and test_memcg_low(), there is an array of four
sibling cgroups.  All but one of these sibling groups does a 50MB
allocation, and the group that does no allocation is the third of four in
the array.  This is not a problem per se, but makes it a bit tricky to do
some assertions in test_memcg_low(), as we want to make assertions on the
siblings based on whether or not they performed allocations.  Having a
static index before which all groups have performed an allocation makes
this cleaner.

This patch therefore reorders the sibling groups so that the group that
performs no allocations is the last in the array.  A follow-on patch will
leverage this to fix a bug in the test that incorrectly asserts that a
sibling group that had performed an allocation, but only had protection
from its parent, will not observe any memory.events.low events during
reclaim.

Link: https://lkml.kernel.org/r/20220423155619.3669555-1-void@manifault.com
Link: https://lkml.kernel.org/r/20220423155619.3669555-2-void@manifault.com
Signed-off-by: David Vernet <void@manifault.com>
Acked-by: Roman Gushchin <roman.gushchin@linux.dev>
Cc: Tejun Heo <tj@kernel.org>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Shakeel Butt <shakeelb@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-05-13 07:20:12 -07:00
Phil Auld
54de76c012 kselftest/cgroup: fix test_stress.sh to use OUTPUT dir
Running cgroup kselftest with O= fails to run the with_stress test due
to hardcoded ./test_core. Find test_core binary using the OUTPUT directory.

Fixes: 1a99fcc035 ("selftests: cgroup: Run test_core under interfering stress")
Signed-off-by: Phil Auld <pauld@redhat.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2022-05-12 07:11:01 -10:00
Yosry Ahmed
eae3cb2e87 selftests: cgroup: add a selftest for memory.reclaim
Add a new test for memory.reclaim that verifies that the interface
correctly reclaims memory as intended, from both anon and file pages.

Link: https://lkml.kernel.org/r/20220425190040.2475377-5-yosryahmed@google.com
Signed-off-by: Yosry Ahmed <yosryahmed@google.com>
Acked-by: Roman Gushchin <roman.gushchin@linux.dev>
Acked-by: David Rientjes <rientjes@google.com>
Cc: Chen Wandun <chenwandun@huawei.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Greg Thelen <gthelen@google.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Michal Hocko <mhocko@suse.com>
Cc: "Michal Koutn" <mkoutny@suse.com>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Tejun Heo <tj@kernel.org>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Vaibhav Jain <vaibhav@linux.ibm.com>
Cc: Wei Xu <weixugc@google.com>
Cc: Yu Zhao <yuzhao@google.com>
Cc: Zefan Li <lizefan.x@bytedance.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-04-29 14:37:00 -07:00
Yosry Ahmed
a3622a53e6 selftests: cgroup: fix alloc_anon_noexit() instantly freeing memory
Currently, alloc_anon_noexit() calls alloc_anon() which instantly frees
the allocated memory. alloc_anon_noexit() is usually used with
cg_run_nowait() to run a process in the background that allocates
memory. It makes sense for the background process to keep the memory
allocated and not instantly free it (otherwise there is no point of
running it in the background).

Link: https://lkml.kernel.org/r/20220425190040.2475377-4-yosryahmed@google.com
Signed-off-by: Yosry Ahmed <yosryahmed@google.com>
Acked-by: Roman Gushchin <roman.gushchin@linux.dev>
Acked-by: Shakeel Butt <shakeelb@google.com>
Acked-by: David Rientjes <rientjes@google.com>
Cc: Chen Wandun <chenwandun@huawei.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Greg Thelen <gthelen@google.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Michal Hocko <mhocko@suse.com>
Cc: "Michal Koutn" <mkoutny@suse.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Tejun Heo <tj@kernel.org>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Vaibhav Jain <vaibhav@linux.ibm.com>
Cc: Wei Xu <weixugc@google.com>
Cc: Yu Zhao <yuzhao@google.com>
Cc: Zefan Li <lizefan.x@bytedance.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-04-29 14:36:59 -07:00
Yosry Ahmed
6c26df84e1 selftests: cgroup: return -errno from cg_read()/cg_write() on failure
Currently, cg_read()/cg_write() returns 0 on success and -1 on failure.
Modify them to return the -errno on failure.

Link: https://lkml.kernel.org/r/20220425190040.2475377-3-yosryahmed@google.com
Signed-off-by: Yosry Ahmed <yosryahmed@google.com>
Acked-by: Shakeel Butt <shakeelb@google.com>
Acked-by: David Rientjes <rientjes@google.com>
Acked-by: Roman Gushchin <roman.gushchin@linux.dev>
Cc: Chen Wandun <chenwandun@huawei.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Greg Thelen <gthelen@google.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Michal Hocko <mhocko@suse.com>
Cc: "Michal Koutn" <mkoutny@suse.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Tejun Heo <tj@kernel.org>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Vaibhav Jain <vaibhav@linux.ibm.com>
Cc: Wei Xu <weixugc@google.com>
Cc: Yu Zhao <yuzhao@google.com>
Cc: Zefan Li <lizefan.x@bytedance.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-04-29 14:36:59 -07:00
Roman Gushchin
be74553f25 kselftests: memcg: speed up the memory.high test
After commit 0e4b01df86 ("mm, memcg: throttle allocators when failing
reclaim over memory.high") allocating memory over memory.high became very
time consuming.  But it's exactly what the memory.high test from cgroup
kselftests is doing: it tries to allocate 100M with 30M memory.high value.
It takes forever to complete.

In order to keep it passing (or failing) in a reasonable amount of time
let's try to allocate only a little over 30M: 31M to be precise.

With this change test_memcontrol finishes in a reasonable amount of
time:
  $ time ./test_memcontrol
  ok 1 test_memcg_subtree_control
  ok 2 test_memcg_current
  ok 3 test_memcg_min
  ok 4 test_memcg_low
  ok 5 test_memcg_high
  ok 6 test_memcg_max
  ok 7 test_memcg_oom_events
  ok 8 test_memcg_swap_max
  ok 9 test_memcg_sock
  ok 10 test_memcg_oom_group_leaf_events
  ok 11 test_memcg_oom_group_parent_events
  ok 12 test_memcg_oom_group_score_events

  real	0m2.273s
  user	0m0.064s
  sys	0m0.739s

Link: https://lkml.kernel.org/r/20220415000133.3955987-3-roman.gushchin@linux.dev
Signed-off-by: Roman Gushchin <roman.gushchin@linux.dev>
Reviewed-by: David Vernet <void@manifault.com>
Cc: Chris Down <chris@chrisdown.name>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Zefan Li <lizefan.x@bytedance.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-04-28 23:15:59 -07:00
Roman Gushchin
c85bcc912f kselftests: memcg: update the oom group leaf events test
Patch series "mm: memcg kselftests fixes".


This patch (of 4):

Commit 9852ae3fe5 ("mm, memcg: consider subtrees in memory.events") made
memory.events recursive: all events are propagated upwards by the tree. 
It was a change in semantics.

It broke the oom group leaf events test: it assumes that after an OOM the
oom_kill counter is zero on parent's level.

Let's adjust the test: it should have similar expectations for the child
and parent levels.

The test passes after this fix.

Link: https://lkml.kernel.org/r/20220415000133.3955987-2-roman.gushchin@linux.dev
Link: https://lkml.kernel.org/r/20220415000133.3955987-1-roman.gushchin@linux.dev
Signed-off-by: Roman Gushchin <roman.gushchin@linux.dev>
Reviewed-by: David Vernet <void@manifault.com>
Cc: Chris Down <chris@chrisdown.name>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Zefan Li <lizefan.x@bytedance.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-04-28 23:15:59 -07:00
David Vernet
5c26993c31 cgroup: Add config file to cgroup selftest suite
Most of the test suites in tools/testing/selftests contain a config file
that specifies which kernel config options need to be present in order for
the test suite to be able to run and perform meaningful validation. There
is no config file for the tools/testing/selftests/cgroup test suite, so
this patch adds one.

Signed-off-by: David Vernet <void@manifault.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2022-04-25 07:27:31 -10:00
David Vernet
a79906570f cgroup: Add test_cpucg_max_nested() testcase
The cgroup cpu controller selftests have a test_cpucg_max() testcase
that validates the behavior of the cpu.max knob. Let's also add a
testcase that verifies that the behavior works correctly when set on a
nested cgroup.

Signed-off-by: David Vernet <void@manifault.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2022-04-25 07:27:31 -10:00
David Vernet
889ab8113e cgroup: Add test_cpucg_max() testcase
The cgroup cpu controller test suite has a number of testcases that
validate the expected behavior of the cpu.weight knob, but none for
cpu.max. This testcase fixes that by adding a testcase for cpu.max as well.

Signed-off-by: David Vernet <void@manifault.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2022-04-25 07:27:31 -10:00
David Vernet
89ca0efa84 cgroup: Add test_cpucg_nested_weight_underprovisioned() testcase
The cgroup cpu controller test suite currently contains a testcase called
test_cpucg_nested_weight_underprovisioned() which verifies the expected
behavior of cpu.weight when applied to nested cgroups. That first testcase
validated the expected behavior when the processes in the leaf cgroups
overcommitted the system. This patch adds a complementary
test_cpucg_nested_weight_underprovisioned() testcase which validates
behavior when those leaf cgroups undercommit the system.

Signed-off-by: David Vernet <void@manifault.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2022-04-25 07:27:31 -10:00
David Vernet
b76ee4f576 cgroup: Adding test_cpucg_nested_weight_overprovisioned() testcase
The cgroup cpu controller tests in
tools/testing/selftests/cgroup/test_cpu.c have some testcases that validate
the expected behavior of setting cpu.weight on cgroups, and then hogging
CPUs. What is still missing from the suite is a testcase that validates
nested cgroups. This patch adds test_cpucg_nested_weight_overprovisioned(),
which validates that a parent's cpu.weight will override its children if
they overcommit a host, and properly protect any sibling groups of that
parent.

Signed-off-by: David Vernet <void@manifault.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2022-04-25 07:27:31 -10:00
David Vernet
4ab93063c8 cgroup: Add test_cpucg_weight_underprovisioned() testcase
test_cpu.c includes testcases that validate the cgroup cpu controller.
This patch adds a new testcase called test_cpucg_weight_underprovisioned()
that verifies that processes with different cpu.weight that are all running
on an underprovisioned system, still get roughly the same amount of cpu
time.

Because test_cpucg_weight_underprovisioned() is very similar to
test_cpucg_weight_overprovisioned(), this patch also pulls the common logic
into a separate helper function that is invoked from both testcases, and
which uses function pointers to invoke the unique portions of the
testcases.

Signed-off-by: David Vernet <void@manifault.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2022-04-22 08:39:32 -10:00
David Vernet
6376b22cd0 cgroup: Add test_cpucg_weight_overprovisioned() testcase
test_cpu.c includes testcases that validate the cgroup cpu controller.
This patch adds a new testcase called test_cpucg_weight_overprovisioned()
that verifies the expected behavior of creating multiple processes with
different cpu.weight, on a system that is overprovisioned.

So as to avoid code duplication, this patch also updates cpu_hog_func_param
to take a new hog_clock_type enum which informs how time is counted in
hog_cpus_timed() (either process time or wall clock time).

Signed-off-by: David Vernet <void@manifault.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2022-04-22 08:39:32 -10:00
David Vernet
3c879a1bb8 cgroup: Add test_cpucg_stats() testcase to cgroup cpu selftests
test_cpu.c includes testcases that validate the cgroup cpu controller.
This patch adds a new testcase called test_cpucg_stats() that verifies the
expected behavior of the cpu.stat interface. In doing so, we define a
new hog_cpus_timed() function which takes a cpu_hog_func_param struct
that configures how many CPUs it uses, and how long it runs. Future
patches will also spawn threads that hog CPUs, so this function will
eventually serve those use-cases as well.

Signed-off-by: David Vernet <void@manifault.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2022-04-22 08:39:32 -10:00
David Vernet
820a4f88ee cgroup: Add new test_cpu.c test suite in cgroup selftests
The cgroup selftests suite currently contains tests that validate various
aspects of cgroup, such as validating the expected behavior for memory
controllers, the expected behavior of cgroup.procs, etc. There are no tests
that validate the expected behavior of the cgroup cpu controller.

This patch therefore adds a new test_cpu.c file that will contain cpu
controller testcases. The file currently only contains a single testcase
that validates creating nested cgroups with cgroup.subtree_control
including cpu. Future patches will add more sophisticated testcases that
validate functional aspects of the cpu controller.

Signed-off-by: David Vernet <void@manifault.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2022-04-22 08:39:32 -10:00
Shakeel Butt
6323ec54b4 selftests: memcg: test high limit for single entry allocation
Test the enforcement of memory.high limit for large amount of memory
allocation within a single kernel entry.  There are valid use-cases
where the application can trigger large amount of memory allocation
within a single syscall e.g.  mlock() or mmap(MAP_POPULATE).

Make sure memory.high limit enforcement works for such use-cases.

Link: https://lkml.kernel.org/r/20220211064917.2028469-4-shakeelb@google.com
Signed-off-by: Shakeel Butt <shakeelb@google.com>
Reviewed-by: Roman Gushchin <guro@fb.com>
Cc: Roman Gushchin <roman.gushchin@linux.dev>
Cc: Chris Down <chris@chrisdown.name>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-03-22 15:57:02 -07:00
Linus Torvalds
4369b3cec2 linux-kselftest-next-5.17-rc1
This Kselftest update for Linux 5.17-rc1 consists of fixes to build
 errors, false negatives, and several code cleanups, including the
 ARRAY_SIZE cleanup that removes 25+ duplicates ARRAY_SIZE defines
 from individual tests.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEPZKym/RZuOCGeA/kCwJExA0NQxwFAmHYu/cACgkQCwJExA0N
 Qxw6Qg//atuqFrmKzmlzces5pHmM88+RpkKDjA2e0Ltr8to8IWMg8gzr3o2qQ1aw
 /zKTCxWtd7PHQiJXmG9LV5z4UefT3rfV7TQfRTlH6XLe+T9wdBVREMUp4HaPeYq7
 Q+u9NIe570ylBUVYznh40SsgSV9YJtAztxDrU39YidmuYEwDCbIwmIFRHMgClRLF
 KgtE1tc0iuFE5Ru/kQwd+MBZBHnA/Zf1S8YiM5AIncRx/9f4EgoMAWcbUHQYFRjS
 W+mlyLR3/UjeB0LUCLiwoXjDvUoga3RA1A6Nf9/7CC7pqRAXHdGFMxCLOJfy1AGh
 n4MNwxQA4yJDXUCS9zhQLob0n75NKJDKbLJ9PlP42IaJXK8KbRM8UcXRRob+ywjT
 xL476F5D49TO2YRS00vMyBor06MUa0cT3HLQKDdv6IdKEbUhpiHUFL9h9nxISnAN
 rFC/y4vrYfpJrR7BF/dK3zs98d9u+lEDxbsKxB7dZ/gvzMIDr2QU0/RiAhurdzz6
 7kswA2ZTBfJkJ2o9+8W9E4jJIORV51b8ep+nmv/aiFfCjLwCrPF0/J63l5Pd/Bgr
 xDXlydtGVmVSGiM5Vq5CmuFlDJzPD+ZDZLbpxpRmLZPsDjAJS5Zo1wIdtbbZ1hI5
 vbvPf9jadsR9WiBjohC09u6J34GwTsWcTW9aLZBUyzjmo07hYV0=
 =r9GV
 -----END PGP SIGNATURE-----

Merge tag 'linux-kselftest-next-5.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest

Pull Kselftest update from Shuah Khan:
 "Fixes to build errors, false negatives, and several code cleanups,
  including the ARRAY_SIZE cleanup that removes 25+ duplicates
  ARRAY_SIZE defines from individual tests"

* tag 'linux-kselftest-next-5.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
  selftests/vm: remove ARRAY_SIZE define from individual tests
  selftests/timens: remove ARRAY_SIZE define from individual tests
  selftests/sparc64: remove ARRAY_SIZE define from adi-test
  selftests/seccomp: remove ARRAY_SIZE define from seccomp_benchmark
  selftests/rseq: remove ARRAY_SIZE define from individual tests
  selftests/net: remove ARRAY_SIZE define from individual tests
  selftests/landlock: remove ARRAY_SIZE define from common.h
  selftests/ir: remove ARRAY_SIZE define from ir_loopback.c
  selftests/core: remove ARRAY_SIZE define from close_range_test.c
  selftests/cgroup: remove ARRAY_SIZE define from cgroup_util.h
  selftests/arm64: remove ARRAY_SIZE define from vec-syscfg.c
  tools: fix ARRAY_SIZE defines in tools and selftests hdrs
  selftests: cgroup: build error multiple outpt files
  selftests/move_mount_set_group remove unneeded conversion to bool
  selftests/mount: remove unneeded conversion to bool
  selftests: harness: avoid false negatives if test has no ASSERTs
  selftests/ftrace: make kprobe profile testcase description unique
  selftests: clone3: clone3: add case CLONE3_ARGS_NO_TEST
  selftests: timers: Remove unneeded semicolon
  kselftests: timers:Remove unneeded semicolon
2022-01-10 12:08:12 -08:00
Tejun Heo
bf35a7879f selftests: cgroup: Test open-time cgroup namespace usage for migration checks
When a task is writing to an fd opened by a different task, the perm check
should use the cgroup namespace of the latter task. Add a test for it.

Tested-by: Michal Koutný <mkoutny@suse.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2022-01-06 11:02:29 -10:00
Tejun Heo
613e040e4d selftests: cgroup: Test open-time credential usage for migration checks
When a task is writing to an fd opened by a different task, the perm check
should use the credentials of the latter task. Add a test for it.

Tested-by: Michal Koutný <mkoutny@suse.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2022-01-06 11:02:29 -10:00
Tejun Heo
b09c2baa56 selftests: cgroup: Make cg_create() use 0755 for permission instead of 0644
0644 is an odd perm to create a cgroup which is a directory. Use the regular
0755 instead. This is necessary for euid switching test case.

Reviewed-by: Michal Koutný <mkoutny@suse.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2022-01-06 11:02:29 -10:00
Shuah Khan
72a571d1e2 selftests/cgroup: remove ARRAY_SIZE define from cgroup_util.h
ARRAY_SIZE is defined in several selftests. Remove definitions from
individual test files and include header file for the define instead.
ARRAY_SIZE define is added in a separate patch to prepare for this
change.

Remove ARRAY_SIZE from cgroup_util.h and pickup the one defined in
kselftest.h.

Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2021-12-10 17:50:31 -07:00
Anders Roxell
a738a4ce84 selftests: cgroup: build error multiple outpt files
When building selftests/cgroup: with clang the following error are seen:

clang -Wall -pthread    test_memcontrol.c cgroup_util.c ../clone3/clone3_selftests.h  -o .../builds/current/kselftest/cgroup/test_memcontrol
clang: error: cannot specify -o when generating multiple output files
make[3]: *** [../lib.mk:146: .../builds/current/kselftest/cgroup/test_memcontrol] Error 1

Rework to add the header files to LOCAL_HDRS before including ../lib.mk,
since the dependency is evaluated in '$(OUTPUT)/%:%.c $(LOCAL_HDRS)' in
file lib.mk.

Suggested-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Anders Roxell <anders.roxell@linaro.org>
Acked-by: Christian Brauner <christian.brauner@ubuntu.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2021-12-10 16:46:01 -07:00
Christian Brauner
85e3b86ed0 tests/cgroup: test cgroup.kill
Test that the new cgroup.kill feature works as intended.

Link: https://lore.kernel.org/r/20210503143922.3093755-5-brauner@kernel.org
Cc: Shakeel Butt <shakeelb@google.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: cgroups@vger.kernel.org
Acked-by: Roman Gushchin <guro@fb.com>
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2021-05-10 10:41:11 -04:00
Christian Brauner
8075e4f6c9 tests/cgroup: move cg_wait_for(), cg_prepare_for_wait()
as they will be used by the tests for cgroup killing.

Link: https://lore.kernel.org/r/20210503143922.3093755-4-brauner@kernel.org
Cc: Tejun Heo <tj@kernel.org>
Cc: cgroups@vger.kernel.org
Reviewed-by: Shakeel Butt <shakeelb@google.com>
Acked-by: Roman Gushchin <guro@fb.com>
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2021-05-10 10:41:11 -04:00
Christian Brauner
0de3103fa2 tests/cgroup: use cgroup.kill in cg_killall()
If cgroup.kill file is supported make use of it.

Link: https://lore.kernel.org/r/20210503143922.3093755-3-brauner@kernel.org
Cc: Tejun Heo <tj@kernel.org>
Cc: cgroups@vger.kernel.org
Reviewed-by: Shakeel Butt <shakeelb@google.com>
Acked-by: Roman Gushchin <guro@fb.com>
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2021-05-10 10:41:10 -04:00
Johannes Weiner
4bbcc5a41c kselftests: cgroup: update kmem test for new vmstat implementation
With memcg having switched to rstat, memory.stat output is precise.
Update the cgroup selftest to reflect the expectations and error
tolerances of the new implementation.

Also add newly tracked types of memory to the memory.stat side of the
equation, since they're included in memory.current and could throw false
positives.

Link: https://lkml.kernel.org/r/20210209163304.77088-9-hannes@cmpxchg.org
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Reviewed-by: Shakeel Butt <shakeelb@google.com>
Reviewed-by: Michal Koutný <mkoutny@suse.com>
Acked-by: Roman Gushchin <guro@fb.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Tejun Heo <tj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-04-30 11:20:38 -07:00
Sachin Sant
c2e46f6b3e selftests/cgroup: Fix build on older distros
On older distros struct clone_args does not have a cgroup member,
leading to build errors:

 cgroup_util.c: In function 'clone_into_cgroup':
 cgroup_util.c:343:4: error: 'struct clone_args' has no member named 'cgroup'
 cgroup_util.c:346:33: error: invalid application of 'sizeof' to incomplete
  type 'struct clone_args'

But the selftests already have a locally defined version of the
structure which is up to date, called __clone_args.

So use __clone_args which fixes the error.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Sachin Sant <sachinp@linux.vnet.ibm.com>>
Acked-by: Christian Brauner <christian.brauner@ubuntu.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2020-11-10 15:13:25 -07:00
Roman Gushchin
90631e1dea kselftests: cgroup: add perpcu memory accounting test
Add a simple test to check the percpu memory accounting.  The test creates
a cgroup tree with 1000 child cgroups and checks values of memory.current
and memory.stat::percpu.

Signed-off-by: Roman Gushchin <guro@fb.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Christoph Lameter <cl@linux.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Dennis Zhou <dennis@kernel.org>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Tobin C. Harding <tobin@kernel.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Waiman Long <longman@redhat.com>
Cc: Michal Koutný <mkoutny@suse.com>
Cc: Bixuan Cui <cuibixuan@huawei.com>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Link: http://lkml.kernel.org/r/20200608230819.832349-6-guro@fb.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-08-12 10:57:55 -07:00
Roman Gushchin
933dc80ec2 kselftests: cgroup: add kernel memory accounting tests
Add some tests to cover the kernel memory accounting functionality.  These
are covering some issues (and changes) we had recently.

1) A test which allocates a lot of negative dentries, checks memcg slab
   statistics, creates memory pressure by setting memory.max to some low
   value and checks that some number of slabs was reclaimed.

2) A test which covers side effects of memcg destruction: it creates
   and destroys a large number of sub-cgroups, each containing a
   multi-threaded workload which allocates and releases some kernel
   memory.  Then it checks that the charge ans memory.stats do add up on
   the parent level.

3) A test which reads /proc/kpagecgroup and implicitly checks that it
   doesn't crash the system.

4) A test which spawns a large number of threads and checks that the
   kernel stacks accounting works as expected.

5) A test which checks that living charged slab objects are not
   preventing the memory cgroup from being released after being deleted by
   a user.

Signed-off-by: Roman Gushchin <guro@fb.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Christoph Lameter <cl@linux.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Link: http://lkml.kernel.org/r/20200623174037.3951353-19-guro@fb.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-08-07 11:33:25 -07:00
Gaurav Singh
d830020656 tools/testing/selftests/cgroup/cgroup_util.c: cg_read_strcmp: fix null pointer dereference
Haven't reproduced this issue. This PR is does a minor code cleanup.

Signed-off-by: Gaurav Singh <gaurav1086@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Tejun Heo <tj@kernel.org>
Cc: Michal Koutn <mkoutny@suse.com>
Cc: Roman Gushchin <guro@fb.com>
Cc: Christian Brauner <christian.brauner@ubuntu.com>
Cc: Chris Down <chris@chrisdown.name>
Link: http://lkml.kernel.org/r/20200726013808.22242-1-gaurav1086@gmail.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-08-07 11:33:21 -07:00
Linus Torvalds
ff2ae607c6 SPDX patches for 5.7-rc1.
Here are 3 SPDX patches for 5.7-rc1.
 
 One fixes up the SPDX tag for a single driver, while the other two go
 through the tree and add SPDX tags for all of the .gitignore files as
 needed.
 
 Nothing too complex, but you will get a merge conflict with your current
 tree, that should be trivial to handle (one file modified by two things,
 one file deleted.)
 
 All 3 of these have been in linux-next for a while, with no reported
 issues other than the merge conflict.
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 -----BEGIN PGP SIGNATURE-----
 
 iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCXodg5A8cZ3JlZ0Brcm9h
 aC5jb20ACgkQMUfUDdst+ykySQCgy9YDrkz7nWq6v3Gohl6+lW/L+rMAnRM4uTZm
 m5AuCzO3Azt9KBi7NL+L
 =2Lm5
 -----END PGP SIGNATURE-----

Merge tag 'spdx-5.7-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/spdx

Pull SPDX updates from Greg KH:
 "Here are three SPDX patches for 5.7-rc1.

  One fixes up the SPDX tag for a single driver, while the other two go
  through the tree and add SPDX tags for all of the .gitignore files as
  needed.

  Nothing too complex, but you will get a merge conflict with your
  current tree, that should be trivial to handle (one file modified by
  two things, one file deleted.)

  All three of these have been in linux-next for a while, with no
  reported issues other than the merge conflict"

* tag 'spdx-5.7-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/spdx:
  ASoC: MT6660: make spdxcheck.py happy
  .gitignore: add SPDX License Identifier
  .gitignore: remove too obvious comments
2020-04-03 13:12:26 -07:00
Masahiro Yamada
d198b34f38 .gitignore: add SPDX License Identifier
Add SPDX License Identifier to all .gitignore files.

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-03-25 11:50:48 +01:00
Christian Brauner
9bd5910d7f selftests/cgroup: add tests for cloning into cgroups
Expand the cgroup test-suite to include tests for CLONE_INTO_CGROUP.
This adds the following tests:
- CLONE_INTO_CGROUP manages to clone a process directly into a correctly
  delegated cgroup
- CLONE_INTO_CGROUP fails to clone a process into a cgroup that has been
  removed after we've opened an fd to it
- CLONE_INTO_CGROUP fails to clone a process into an invalid domain
  cgroup
- CLONE_INTO_CGROUP adheres to the no internal process constraint
- CLONE_INTO_CGROUP works with the freezer feature

Cc: Tejun Heo <tj@kernel.org>
Cc: Shuah Khan <shuah@kernel.org>
Cc: cgroups@vger.kernel.org
Cc: linux-kselftest@vger.kernel.org
Acked-by: Roman Gushchin <guro@fb.com>
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2020-02-12 17:57:51 -05:00
Suren Baghdasaryan
04189382c0 kselftest/cgroup: add cgroup destruction test
Add new test to verify that a cgroup with dead processes can be destroyed.
The test spawns a child process which allocates and touches 100MB of RAM
to ensure prolonged exit. Subsequently it kills the child, waits until
the cgroup containing the child is empty and destroys the cgroup.

Signed-off-by: Suren Baghdasaryan <surenb@google.com>
[mkoutny@suse.com: Fix typo in test_cgcore_destroy comment]
Acked-by: Michal Koutný <mkoutny@suse.com>
Signed-off-by: Michal Koutný <mkoutny@suse.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2020-02-12 17:12:04 -05:00
Dan Carpenter
192c197cbc selftests: Uninitialized variable in test_cgcore_proc_migration()
The "c_threads" variable is used in the error handling code before it
has been initialized

Fixes: 11318989c3 ("selftests: cgroup: Add task migration tests")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Michal Koutný <mkoutny@suse.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2020-01-09 16:21:23 -07:00
Hewenliang
d671fa6393 kselftests: cgroup: Avoid the reuse of fd after it is deallocated
It is necessary to set fd to -1 when inotify_add_watch() fails in
cg_prepare_for_wait. Otherwise the fd which has been closed in
cg_prepare_for_wait may be misused in other functions such as
cg_enter_and_wait_for_frozen and cg_freeze_wait.

Fixes: 5313bfe425 ("selftests: cgroup: add freezer controller self-tests")
Signed-off-by: Hewenliang <hewenliang4@huawei.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2019-11-12 07:33:59 -08:00
Michal Koutný
1a99fcc035 selftests: cgroup: Run test_core under interfering stress
test_core tests various cgroup creation/removal and task migration
paths. Run the tests repeatedly with interfering noise (for lockdep
checks). Currently, forking noise and subsystem enabled/disabled
switching are the implemented noises.

Signed-off-by: Michal Koutný <mkoutny@suse.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2019-10-07 07:11:54 -07:00
Michal Koutný
11318989c3 selftests: cgroup: Add task migration tests
Add two new tests that verify that thread and threadgroup migrations
work as expected.

Signed-off-by: Michal Koutný <mkoutny@suse.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2019-10-07 07:11:54 -07:00
Michal Koutný
58c9f75b86 selftests: cgroup: Simplify task self migration
Simplify task migration by being oblivious about its PID during
migration. This allows to easily migrate individual threads as well.
This change brings no functional change and prepares grounds for thread
granularity migrating tests.

Signed-off-by: Michal Koutný <mkoutny@suse.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2019-10-07 07:11:54 -07:00
Roman Gushchin
44e9d308a5 kselftests: cgroup: add freezer mkdir test
Add a new cgroup freezer selftest, which checks that if a cgroup is
frozen, their new child cgroups will properly inherit the frozen
state.

It creates a parent cgroup, freezes it, creates a child cgroup
and populates it with a dummy process. Then it checks that both
parent and child cgroup are frozen.

Signed-off-by: Roman Gushchin <guro@fb.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Tejun Heo <tj@kernel.org>
2019-09-12 14:04:40 -07:00
Chris Down
b59b1baab7 cgroup: kselftest: relax fs_spec checks
On my laptop most memcg kselftests were being skipped because it claimed
cgroup v2 hierarchy wasn't mounted, but this isn't correct.  Instead, it
seems current systemd HEAD mounts it with the name "cgroup2" instead of
"cgroup":

    % grep cgroup /proc/mounts
    cgroup2 /sys/fs/cgroup cgroup2 rw,nosuid,nodev,noexec,relatime,nsdelegate 0 0

I can't think of a reason to need to check fs_spec explicitly
since it's arbitrary, so we can just rely on fs_vfstype.

After these changes, `make TARGETS=cgroup kselftest` actually runs the
cgroup v2 tests in more cases.

Link: http://lkml.kernel.org/r/20190723210737.GA487@chrisdown.name
Signed-off-by: Chris Down <chris@chrisdown.name>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Tejun Heo <tj@kernel.org>
Cc: Roman Gushchin <guro@fb.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-08-03 07:02:01 -07:00
YueHaibing
f4fb8a97dc kselftests: cgroup: remove duplicated include from test_freezer.c
Remove duplicated include.

Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2019-07-02 13:29:43 -06:00
Alex Shi
f97f3f8839 kselftest/cgroup: fix incorrect test_core skip
The test_core will skip the
test_cgcore_no_internal_process_constraint_on_threads test case if the
'cpu' controller missing in root's subtree_control. In fact we need to
set the 'cpu' in subtree_control, to make the testing meaningful.

./test_core
...
ok 4 # skip test_cgcore_no_internal_process_constraint_on_threads
...

Signed-off-by: Alex Shi <alex.shi@linux.alibaba.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Tejun Heo <tj@kernel.org>
Cc: Roman Gushchin <guro@fb.com>
Cc: Claudio Zumbo <claudioz@fb.com>
Cc: Claudio <claudiozumbo@gmail.com>
Cc: linux-kselftest@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Reviewed-by: Roman Gushchin <guro@fb.com>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2019-05-30 08:19:17 -06:00
Alex Shi
00e38a5d75 kselftest/cgroup: fix unexpected testing failure on test_core
The cgroup testing relys on the root cgroup's subtree_control setting,
If the 'memory' controller isn't set, some test cases will be failed
as following:

$sudo  ./test_core
not ok 1 test_cgcore_internal_process_constraint
ok 2 test_cgcore_top_down_constraint_enable
not ok 3 test_cgcore_top_down_constraint_disable
...

To correct this unexpected failure, this patch write the 'memory' to
subtree_control of root to get a right result.

Signed-off-by: Alex Shi <alex.shi@linux.alibaba.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Tejun Heo <tj@kernel.org>
Cc: Roman Gushchin <guro@fb.com>
Cc: Claudio Zumbo <claudioz@fb.com>
Cc: Claudio <claudiozumbo@gmail.com>
Cc: linux-kselftest@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Reviewed-by: Roman Gushchin <guro@fb.com>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2019-05-30 08:19:17 -06:00
Alex Shi
f6131f2805 kselftest/cgroup: fix unexpected testing failure on test_memcontrol
The cgroup testing relies on the root cgroup's subtree_control setting,
If the 'memory' controller isn't set, all test cases will be failed
as following:

$ sudo ./test_memcontrol
not ok 1 test_memcg_subtree_control
not ok 2 test_memcg_current
ok 3 # skip test_memcg_min
not ok 4 test_memcg_low
not ok 5 test_memcg_high
not ok 6 test_memcg_max
not ok 7 test_memcg_oom_events
ok 8 # skip test_memcg_swap_max
not ok 9 test_memcg_sock
not ok 10 test_memcg_oom_group_leaf_events
not ok 11 test_memcg_oom_group_parent_events
not ok 12 test_memcg_oom_group_score_events

To correct this unexpected failure, this patch write the 'memory' to
subtree_control of root to get a right result.

Signed-off-by: Alex Shi <alex.shi@linux.alibaba.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Roman Gushchin <guro@fb.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Mike Rapoport <rppt@linux.vnet.ibm.com>
Cc: Jay Kamat <jgkamat@fb.com>
Cc: linux-kselftest@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Reviewed-by: Roman Gushchin <guro@fb.com>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2019-05-30 08:19:17 -06:00
Linus Torvalds
abde77eb5c Merge branch 'for-5.2' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup
Pull cgroup updates from Tejun Heo:
 "This includes Roman's cgroup2 freezer implementation.

  It's a separate machanism from cgroup1 freezer. Instead of blocking
  user tasks in arbitrary uninterruptible sleeps, the new implementation
  extends jobctl stop - frozen tasks are trapped in jobctl stop until
  thawed and can be killed and ptraced. Lots of thanks to Oleg for
  sheperding the effort.

  Other than that, there are a few trivial changes"

* 'for-5.2' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
  cgroup: never call do_group_exit() with task->frozen bit set
  kernel: cgroup: fix misuse of %x
  cgroup: get rid of cgroup_freezer_frozen_exit()
  cgroup: prevent spurious transition into non-frozen state
  cgroup: Remove unused cgrp variable
  cgroup: document cgroup v2 freezer interface
  cgroup: add tracing points for cgroup v2 freezer
  cgroup: make TRACE_CGROUP_PATH irq-safe
  kselftests: cgroup: add freezer controller self-tests
  kselftests: cgroup: don't fail on cg_kill_all() error in cg_destroy()
  cgroup: cgroup v2 freezer
  cgroup: protect cgroup->nr_(dying_)descendants by css_set_lock
  cgroup: implement __cgroup_task_count() helper
  cgroup: rename freezer.c into legacy_freezer.c
  cgroup: remove extra cgroup_migrate_finish() call
2019-05-09 13:52:12 -07:00
Roman Gushchin
5313bfe425 kselftests: cgroup: add freezer controller self-tests
This patch implements 9 tests for the freezer controller for
cgroup v2:
1) a simple test, which aims to freeze and unfreeze a cgroup with 100
processes
2) a more complicated tree test, which creates a hierarchy of cgroups,
puts some processes in some cgroups, and tries to freeze and unfreeze
different parts of the subtree
3) a forkbomb test: the test aims to freeze a forkbomb running in a
cgroup, kill all tasks in the cgroup and remove the cgroup without
the unfreezing.
4) rmdir test: the test creates two nested cgroups, freezes the parent
one, checks that the child can be successfully removed, and a new
child can be created
5) migration tests: the test checks migration of a task between
frozen cgroups: from a frozen to a running, from a running to a
frozen, and from a frozen to a frozen.
6) ptrace test: the test checks that it's possible to attach to
a process in a frozen cgroup, get some information and detach, and
the cgroup will remain frozen.
7) stopped test: the test checks that it's possible to freeze a cgroup
with a stopped task
8) ptraced test: the test checks that it's possible to freeze a cgroup
with a ptraced task
9) vfork test: the test checks that it's possible to freeze a cgroup
with a parent process waiting for the child process in vfork()

Expected output:
  $ ./test_freezer
  ok 1 test_cgfreezer_simple
  ok 2 test_cgfreezer_tree
  ok 3 test_cgfreezer_forkbomb
  ok 4 test_cgrreezer_rmdir
  ok 5 test_cgfreezer_migrate
  ok 6 test_cgfreezer_ptrace
  ok 7 test_cgfreezer_stopped
  ok 8 test_cgfreezer_ptraced
  ok 9 test_cgfreezer_vfork

Signed-off-by: Roman Gushchin <guro@fb.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Shuah Khan <shuah@kernel.org>
Cc: kernel-team@fb.com
Cc: linux-kselftest@vger.kernel.org
2019-04-19 11:26:49 -07:00
Roman Gushchin
ff9fb7cb51 kselftests: cgroup: don't fail on cg_kill_all() error in cg_destroy()
If the cgroup destruction races with an exit() of a belonging
process(es), cg_kill_all() may fail. It's not a good reason to make
cg_destroy() fail and leave the cgroup in place, potentially causing
next test runs to fail.

Signed-off-by: Roman Gushchin <guro@fb.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Shuah Khan <shuah@kernel.org>
Cc: kernel-team@fb.com
Cc: linux-kselftest@vger.kernel.org
2019-04-19 11:26:49 -07:00
Roman Gushchin
e14d314c7a selftests: cgroup: fix cleanup path in test_memcg_subtree_control()
Dan reported, that cleanup path in test_memcg_subtree_control()
triggers a static checker warning:
  ./tools/testing/selftests/cgroup/test_memcontrol.c:76 \
  test_memcg_subtree_control()
  error: uninitialized symbol 'child2'.

Fix this by initializing child2 and parent2 variables and
split the cleanup path into few stages.

Signed-off-by: Roman Gushchin <guro@fb.com>
Fixes: 84092dbcf9 ("selftests: cgroup: add memory controller self-tests")
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Cc: Dan Carpenter <dan.carpenter@oracle.com>
Cc: Shuah Khan (Samsung OSG) <shuah@kernel.org>
Cc: Mike Rapoport <rppt@linux.vnet.ibm.com>
Signed-off-by: Shuah Khan <shuah@kernel.org>
2019-04-08 16:44:22 -06:00
Jay Kamat
a987785dcd Add tests for memory.oom.group
Add tests for memory.oom.group for the following cases:
- Killing all processes in a leaf cgroup, but leaving the
  parent untouched
- Killing all processes in a parent and leaf cgroup
- Keeping processes marked by OOM_SCORE_ADJ_MIN alive when considered
  for being killed by the group oom killer.

Signed-off-by: Jay Kamat <jgkamat@fb.com>
Acked-by: Roman Gushchin <guro@fb.com>
Signed-off-by: Shuah Khan (Samsung OSG) <shuah@kernel.org>
2018-09-07 16:36:01 -06:00
Jay Kamat
48c2bb0b9c Fix cg_read_strcmp()
Fix a couple issues with cg_read_strcmp(), to improve correctness of
cgroup tests
- Fix cg_read_strcmp() always returning 0 for empty "needle" strings.
Previously, this function read to a size = 1 buffer when comparing
against empty strings, which would lead to cg_read_strcmp() comparing
two empty strings.
- Fix a memory leak in cg_read_strcmp()

Fixes: 84092dbcf9 ("selftests: cgroup: add memory controller self-tests")

Signed-off-by: Jay Kamat <jgkamat@fb.com>
Acked-by: Roman Gushchin <guro@fb.com>
Signed-off-by: Shuah Khan (Samsung OSG) <shuah@kernel.org>
2018-09-07 16:35:51 -06:00
Lei Yang
7035c56899 cgroup: kselftests: add test_core to .gitignore
Update .gitignore file.

Signed-off-by: Lei Yang <Lei.Yang@windriver.com>
Signed-off-by: Shuah Khan (Samsung OSG) <shuah@kernel.org>
2018-09-06 12:24:47 -06:00
Anders Roxell
a0dde8be2a selftests: cgroup: add gitignore file
Add the executable 'test_memcontrol' to a .gitignore file.

Signed-off-by: Anders Roxell <anders.roxell@linaro.org>
Signed-off-by: Shuah Khan (Samsung OSG) <shuah@kernel.org>
2018-08-09 11:23:32 -06:00
Claudio
d863cb03fc Add cgroup core selftests
This commit adds tests for some of the core functionalities
of cgroups v2.

The commit adds tests for some core principles of croup V2 API:

- test_cgcore_internal_process_constraint

  Tests internal process constraint.
  You can't add a pid to a domain parent if a controller is enabled.

- test_cgcore_top_down_constraint_enable

   Tests that you can't enable a controller on a child if it's not enabled
   on the parent.

- test_cgcore_top_down_constraint_disable

  Tests that you can't disable a controller on a parent if it's
  enabled in a child.

- test_cgcore_no_internal_process_constraint_on_threads

  Tests that there's no internal process constrain on threaded cgroups.
  You can add threads/processes on a parent with a controller enabled.

- test_cgcore_parent_becomes_threaded

  Tests that when a child becomes threaded the parent type becomes
  domain threaded.

- test_cgcore_invalid_domain

  In a situation like:

  A (domain threaded) - B (threaded) - C (domain)

  it tests that C can't be used until it is turned into a threaded cgroup.
  The "cgroup.type" file will report "domain (invalid)" in these cases.
  Operations which fail due to invalid topology use EOPNOTSUPP as the errno.

- test_cgcore_populated

  In a situation like:

  A(0) - B(0) - C(1)
         \ D(0)

  It tests that A, B and C's "populated" fields would be 1 while D's 0.
  It tests that after the one process in C is moved to root, A,B and C's
  "populated" fields would flip to "0" and file modified events will
  be generated on the "cgroup.events" files of both cgroups.

Signed-off-by: Claudio Zumbo <claudioz@fb.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Roman Gushchin <guro@fb.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: kernel-team@fb.com
Acked-by: Tejun Heo <tj@kernel.org>
Reviewed-by: Roman Gushchin <guro@fb.com>
Signed-off-by: Shuah Khan (Samsung OSG) <shuah@kernel.org>
2018-08-09 09:12:45 -06:00
Dan Carpenter
53c3daf8cf kselftest/cgroup: fix a signedness bug
"len" needs to be signed for the error handling to work.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Shuah Khan (Samsung OSG) <shuah@kernel.org>
2018-06-07 13:44:30 -06:00
Mike Rapoport
5f8f019380 selftests: cgroup/memcontrol: add basic test for socket accounting
The test verifies that when there is active TCP connection, the
memory.stat.sock and memory.current values are close.

Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Acked-by: Roman Gushchin <guro@fb.com>
Signed-off-by: Shuah Khan (Samsung OSG) <shuah@kernel.org>
2018-05-30 21:28:33 -06:00
Roman Gushchin
adb31be442 kselftest/cgroup: fix variable dereferenced before check warning
cg_name(const char *root, const char *name) is always called with
non-empty root and name arguments, so there is no sense in checking
it in the function body (after using in strlen()).

Signed-off-by: Roman Gushchin <guro@fb.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Shuah Khan (Samsung OSG) <shuah@kernel.org>
2018-05-30 15:29:07 -06:00
Mike Rapoport
478b27844e selftests: cgroup/memcontrol: add basic test for swap controls
The new test verifies that memory.swap.max and memory.swap.current behave
as expected for simple allocation scenarios

Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Acked-by: Tejun Heo <tj@kernel.org>
Acked-by: Roman Gushchin <guro@fb.com>
Signed-off-by: Shuah Khan (Samsung OSG) <shuah@kernel.org>
2018-05-30 15:29:07 -06:00
Roman Gushchin
84092dbcf9 selftests: cgroup: add memory controller self-tests
Cgroups are used for controlling the physical resource distribution
(memory, CPU, io, etc) and often are used as basic building blocks
for large distributed computing systems. Even small differences
in the actual behavior may lead to significant incidents.

The codebase is under the active development, which will unlikely
stop at any time soon. Also it's scattered over different kernel
subsystems, which makes regressions more probable.

Given that, the lack of any tests is crying.

This patch implements some basic tests for the memory controller,
as well as a minimal required framework. It doesn't pretend for a
very good coverage, but pretends to be a starting point.

Hopefully, any following significant changes will include corresponding
tests.

Tests for CPU and io controllers, as well as cgroup core
are next in the todo list.

Signed-off-by: Roman Gushchin <guro@fb.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@linux.vnet.ibm.com>
Cc: kernel-team@fb.com
Cc: linux-kselftest@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Shuah Khan (Samsung OSG) <shuah@kernel.org>
2018-05-30 15:29:06 -06:00