mirror of
git://git.yoctoproject.org/linux-yocto.git
synced 2025-07-05 13:25:20 +02:00

Merge tag 'mm-stable-2024-09-20-02-31' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull MM updates from Andrew Morton:
 "Along with the usual shower of singleton patches, notable patch series
  in this pull request are:

  - "Align kvrealloc() with krealloc()" from Danilo Krummrich. Adds
    consistency to the APIs and behaviour of these two core allocation
    functions. This also simplifies/enables Rustification.

  - "Some cleanups for shmem" from Baolin Wang. No functional changes -
    more code reuse, better function naming, logic simplifications.

  - "mm: some small page fault cleanups" from Josef Bacik. No functional
    changes - code cleanups only.

  - "Various memory tiering fixes" from Zi Yan. A small fix and a little
    cleanup.

  - "mm/swap: remove boilerplate" from Yu Zhao. Code cleanups and
    simplifications and .text shrinkage.

  - "Kernel stack usage histogram" from Pasha Tatashin and Shakeel Butt.
    This is a feature: it adds new fields to /proc/vmstat such as

      $ grep kstack /proc/vmstat
      kstack_1k 3
      kstack_2k 188
      kstack_4k 11391
      kstack_8k 243
      kstack_16k 0

    which tells us that 11391 processes used 4k of stack while none at
    all used 16k. Useful for some system tuning things, but particularly
    useful for "the dynamic kernel stack project".

  - "kmemleak: support for percpu memory leak detect" from Pavel
    Tikhomirov. Teaches kmemleak to detect leakage of percpu memory.

  - "mm: memcg: page counters optimizations" from Roman Gushchin. "3
    independent small optimizations of page counters".

  - "mm: split PTE/PMD PT table Kconfig cleanups+clarifications" from
    David Hildenbrand. Improves PTE/PMD splitlock detection, makes
    powerpc/8xx work correctly by design rather than by accident.

  - "mm: remove arch_make_page_accessible()" from David Hildenbrand.
    Some folio conversions which make arch_make_page_accessible()
    unneeded.

  - "mm, memcg: cg2 memory{.swap,}.peak write handlers" from David
    Finkel. Cleans up and fixes our handling of the resetting of the
    cgroup/process peak-memory-use detector.

  - "Make core VMA operations internal and testable" from Lorenzo
    Stoakes. Rationalization and encapsulation of the VMA manipulation
    APIs, with a view to better enable testing of the VMA functions,
    even from a userspace-only harness.

  - "mm: zswap: fixes for global shrinker" from Takero Funaki. Fix
    issues in the zswap global shrinker, resulting in improved
    performance.

  - "mm: print the promo watermark in zoneinfo" from Kaiyang Zhao. Fill
    in some missing info in /proc/zoneinfo.

  - "mm: replace follow_page() by folio_walk" from David Hildenbrand.
    Code cleanups and rationalizations (conversion to folio_walk())
    resulting in the removal of follow_page().

  - "improving dynamic zswap shrinker protection scheme" from Nhat Pham.
    Some tuning to improve zswap's dynamic shrinker. Significant
    reductions in swapin and improvements in performance are shown.

  - "mm: Fix several issues with unaccepted memory" from Kirill
    Shutemov. Improvements to the new unaccepted memory feature.

  - "mm/mprotect: Fix dax puds" from Peter Xu. Implements mprotect on
    DAX PUDs. This was missing, although nobody seems to have noticed
    yet.

  - "Introduce a store type enum for the Maple tree" from Sidhartha
    Kumar. Cleanups and modest performance improvements for the maple
    tree library code.

  - "memcg: further decouple v1 code from v2" from Shakeel Butt. Move
    more cgroup v1 remnants away from the v2 memcg code.

  - "memcg: initiate deprecation of v1 features" from Shakeel Butt. Adds
    various warnings telling users that memcg v1 features are
    deprecated.

  - "mm: swap: mTHP swap allocator base on swap cluster order" from
    Chris Li. Greatly improves the success rate of the mTHP swap
    allocation.

  - "mm: introduce numa_memblks" from Mike Rapoport. Moves various
    disparate per-arch implementations of numa_memblk code into generic
    code.

  - "mm: batch free swaps for zap_pte_range()" from Barry Song. Greatly
    improves the performance of munmap() of swap-filled ptes.

  - "support large folio swap-out and swap-in for shmem" from Baolin
    Wang. With this series we no longer split shmem large folios into
    single-page folios when swapping out shmem.

  - "mm/hugetlb: alloc/free gigantic folios" from Yu Zhao. Nice
    performance improvements and code reductions for gigantic folios.

  - "support shmem mTHP collapse" from Baolin Wang. Adds support for
    khugepaged's collapsing of shmem mTHP folios.

  - "mm: Optimize mseal checks" from Pedro Falcato. Fixes an mprotect()
    performance regression due to the addition of mseal().

  - "Increase the number of bits available in page_type" from Matthew
    Wilcox. Increases the number of bits available in page_type!

  - "Simplify the page flags a little" from Matthew Wilcox. Many legacy
    page flags are now folio flags, so the page-based flags and their
    accessors/mutators can be removed.

  - "mm: store zero pages to be swapped out in a bitmap" from Usama
    Arif. An optimization which permits us to avoid writing/reading
    zero-filled zswap pages to backing store.

  - "Avoid MAP_FIXED gap exposure" from Liam Howlett. Fixes a race
    window which occurs when a MAP_FIXED operation is occurring during
    an unrelated vma tree walk.

  - "mm: remove vma_merge()" from Lorenzo Stoakes. Major rotorooting of
    the vma_merge() functionality, making it cleaner, more testable and
    better tested.

  - "misc fixups for DAMON {self,kunit} tests" from SeongJae Park. Minor
    fixups of DAMON selftests and kunit tests.

  - "mm: memory_hotplug: improve do_migrate_range()" from Kefeng Wang.
    Code cleanups and folio conversions.

  - "Shmem mTHP controls and stats improvements" from Ryan Roberts.
    Cleanups for shmem controls and stats.

  - "mm: count the number of anonymous THPs per size" from Barry Song.
    Expose additional anon THP stats to userspace for improved tuning.

  - "mm: finish isolate/putback_lru_page()" from Kefeng Wang: more folio
    conversions and removal of now-unused page-based APIs.

  - "replace per-quota region priorities histogram buffer with
    per-context one" from SeongJae Park. DAMON histogram
    rationalization.

  - "Docs/damon: update GitHub repo URLs and maintainer-profile" from
    SeongJae Park. DAMON documentation updates.

  - "mm/vdpa: correct misuse of non-direct-reclaim __GFP_NOFAIL and
    improve related doc and warn" from Jason Wang: fixes usage of page
    allocator __GFP_NOFAIL and GFP_ATOMIC flags.

  - "mm: split underused THPs" from Yu Zhao. Improve THP=always policy.
    This was overprovisioning THPs in sparsely accessed memory areas.

  - "zram: introduce custom comp backends API" from Sergey Senozhatsky.
    Add support for zram run-time compression algorithm tuning.

  - "mm: Care about shadow stack guard gap when getting an unmapped
    area" from Mark Brown. Fix up the various arch_get_unmapped_area()
    implementations to better respect guard areas.

  - "Improve mem_cgroup_iter()" from Kinsey Ho. Improve the reliability
    of mem_cgroup_iter() and various code cleanups.

  - "mm: Support huge pfnmaps" from Peter Xu. Extends the usage of huge
    pfnmap support.

  - "resource: Fix region_intersects() vs add_memory_driver_managed()"
    from Huang Ying. Fix a bug in region_intersects() for systems with
    CXL memory.

  - "mm: hwpoison: two more poison recovery" from Kefeng Wang. Teaches a
    couple more code paths to correctly recover from the encountering of
    poisoned memory.

  - "mm: enable large folios swap-in support" from Barry Song. Support
    the swapin of mTHP memory into appropriately-sized folios, rather
    than into single-page folios"

* tag 'mm-stable-2024-09-20-02-31' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (416 commits)
  zram: free secondary algorithms names
  uprobes: turn xol_area->pages[2] into xol_area->page
  uprobes: introduce the global struct vm_special_mapping xol_mapping
  Revert "uprobes: use vm_special_mapping close() functionality"
  mm: support large folios swap-in for sync io devices
  mm: add nr argument in mem_cgroup_swapin_uncharge_swap() helper to support large folios
  mm: fix swap_read_folio_zeromap() for large folios with partial zeromap
  mm/debug_vm_pgtable: Use pxdp_get() for accessing page table entries
  set_memory: add __must_check to generic stubs
  mm/vma: return the exact errno in vms_gather_munmap_vmas()
  memcg: cleanup with !CONFIG_MEMCG_V1
  mm/show_mem.c: report alloc tags in human readable units
  mm: support poison recovery from copy_present_page()
  mm: support poison recovery from do_cow_fault()
  resource, kunit: add test case for region_intersects()
  resource: make alloc_free_mem_region() works for iomem_resource
  mm: z3fold: deprecate CONFIG_Z3FOLD
  vfio/pci: implement huge_fault support
  mm/arm64: support large pfn mappings
  mm/x86: support large pfn mappings
  ...
397 lines
13 KiB
C
/* SPDX-License-Identifier: GPL-2.0 */
/*
 * include/linux/writeback.h
 */
#ifndef WRITEBACK_H
#define WRITEBACK_H

#include <linux/sched.h>
#include <linux/workqueue.h>
#include <linux/fs.h>
#include <linux/flex_proportions.h>
#include <linux/backing-dev-defs.h>
#include <linux/blk_types.h>
#include <linux/pagevec.h>

struct bio;

DECLARE_PER_CPU(int, dirty_throttle_leaks);

/*
 * The global dirty threshold is normally equal to the global dirty limit,
 * except when the system suddenly allocates a lot of anonymous memory and
 * knocks down the global dirty threshold quickly, in which case the global
 * dirty limit will follow down slowly to prevent livelocking all dirtier tasks.
 */
#define DIRTY_SCOPE		8

struct backing_dev_info;

/*
 * fs/fs-writeback.c
 */
enum writeback_sync_modes {
	WB_SYNC_NONE,	/* Don't wait on anything */
	WB_SYNC_ALL,	/* Wait on every mapping */
};

/*
 * A control structure which tells the writeback code what to do. These are
 * always on the stack, and hence need no locking. They are always initialised
 * in a manner such that unspecified fields are set to zero.
 */
struct writeback_control {
	/* public fields that can be set and/or consumed by the caller: */
	long nr_to_write;		/* Write this many pages, and decrement
					   this for each page written */
	long pages_skipped;		/* Pages which were not written */

	/*
	 * For a_ops->writepages(): if start or end are non-zero then this is
	 * a hint that the filesystem need only write out the pages inside that
	 * byterange. The byte at `end' is included in the writeout request.
	 */
	loff_t range_start;
	loff_t range_end;

	enum writeback_sync_modes sync_mode;

	unsigned for_kupdate:1;		/* A kupdate writeback */
	unsigned for_background:1;	/* A background writeback */
	unsigned tagged_writepages:1;	/* tag-and-write to avoid livelock */
	unsigned for_reclaim:1;		/* Invoked from the page allocator */
	unsigned range_cyclic:1;	/* range_start is cyclic */
	unsigned for_sync:1;		/* sync(2) WB_SYNC_ALL writeback */
	unsigned unpinned_netfs_wb:1;	/* Cleared I_PINNING_NETFS_WB */

	/*
	 * When writeback IOs are bounced through async layers, only the
	 * initial synchronous phase should be accounted towards inode
	 * cgroup ownership arbitration to avoid confusion. Later stages
	 * can set the following flag to disable the accounting.
	 */
	unsigned no_cgroup_owner:1;

	/*
	 * To enable batching of swap writes to non-block-device backends,
	 * "swap_plug" can be set to point to a 'struct swap_iocb *'. When
	 * all swap writes have been submitted, if the swap_iocb is not NULL,
	 * swap_write_unplug() should be called.
	 */
	struct swap_iocb **swap_plug;

	/* Target list for splitting a large folio */
	struct list_head *list;

	/* internal fields used by the ->writepages implementation: */
	struct folio_batch fbatch;
	pgoff_t index;
	int saved_err;

#ifdef CONFIG_CGROUP_WRITEBACK
	struct bdi_writeback *wb;	/* wb this writeback is issued under */
	struct inode *inode;		/* inode being written out */

	/* foreign inode detection, see wbc_detach_inode() */
	int wb_id;			/* current wb id */
	int wb_lcand_id;		/* last foreign candidate wb id */
	int wb_tcand_id;		/* this foreign candidate wb id */
	size_t wb_bytes;		/* bytes written by current wb */
	size_t wb_lcand_bytes;		/* bytes written by last candidate */
	size_t wb_tcand_bytes;		/* bytes written by this candidate */
#endif
};

static inline blk_opf_t wbc_to_write_flags(struct writeback_control *wbc)
{
	blk_opf_t flags = 0;

	if (wbc->sync_mode == WB_SYNC_ALL)
		flags |= REQ_SYNC;
	else if (wbc->for_kupdate || wbc->for_background)
		flags |= REQ_BACKGROUND;

	return flags;
}

#ifdef CONFIG_CGROUP_WRITEBACK
#define wbc_blkcg_css(wbc) \
	((wbc)->wb ? (wbc)->wb->blkcg_css : blkcg_root_css)
#else
#define wbc_blkcg_css(wbc)		(blkcg_root_css)
#endif /* CONFIG_CGROUP_WRITEBACK */

/*
 * A wb_domain represents a domain that wb's (bdi_writeback's) belong to
 * and are measured against each other in. There always is one global
 * domain, global_wb_domain, that every wb in the system is a member of.
 * This allows measuring the relative bandwidth of each wb to distribute
 * dirtyable memory accordingly.
 */
struct wb_domain {
	spinlock_t lock;

	/*
	 * Scale the writeback cache size proportional to the relative
	 * writeout speed.
	 *
	 * We do this by keeping a floating proportion between BDIs, based
	 * on page writeback completions [end_page_writeback()]. Those
	 * devices that write out pages fastest will get the larger share,
	 * while the slower will get a smaller share.
	 *
	 * We use page writeout completions because we are interested in
	 * getting rid of dirty pages. Having them written out is the
	 * primary goal.
	 *
	 * We introduce a concept of time, a period over which we measure
	 * these events, because demand can/will vary over time. The length
	 * of this period itself is measured in page writeback completions.
	 */
	struct fprop_global completions;
	struct timer_list period_timer;	/* timer for aging of completions */
	unsigned long period_time;

	/*
	 * The dirtyable memory and dirty threshold could be suddenly
	 * knocked down by a large amount (eg. on the startup of KVM in a
	 * swapless system). This may throw the system into deep dirty
	 * exceeded state and throttle heavy/light dirtiers alike. To
	 * retain good responsiveness, maintain global_dirty_limit for
	 * tracking slowly down to the knocked down dirty threshold.
	 *
	 * Both fields are protected by ->lock.
	 */
	unsigned long dirty_limit_tstamp;
	unsigned long dirty_limit;
};

/**
 * wb_domain_size_changed - memory available to a wb_domain has changed
 * @dom: wb_domain of interest
 *
 * This function should be called when the amount of memory available to
 * @dom has changed. It resets @dom's dirty limit parameters to prevent
 * the past values which don't match the current configuration from skewing
 * dirty throttling. Without this, when memory size of a wb_domain is
 * greatly reduced, the dirty throttling logic may allow too many pages to
 * be dirtied leading to consecutive unnecessary OOMs and may get stuck in
 * that situation.
 */
static inline void wb_domain_size_changed(struct wb_domain *dom)
{
	spin_lock(&dom->lock);
	dom->dirty_limit_tstamp = jiffies;
	dom->dirty_limit = 0;
	spin_unlock(&dom->lock);
}

/*
 * fs/fs-writeback.c
 */
struct bdi_writeback;
void writeback_inodes_sb(struct super_block *, enum wb_reason reason);
void writeback_inodes_sb_nr(struct super_block *, unsigned long nr,
			    enum wb_reason reason);
void try_to_writeback_inodes_sb(struct super_block *sb, enum wb_reason reason);
void sync_inodes_sb(struct super_block *);
void wakeup_flusher_threads(enum wb_reason reason);
void wakeup_flusher_threads_bdi(struct backing_dev_info *bdi,
				enum wb_reason reason);
void inode_wait_for_writeback(struct inode *inode);
void inode_io_list_del(struct inode *inode);

/* writeback.h requires fs.h; it, too, is not included from here. */
static inline void wait_on_inode(struct inode *inode)
{
	wait_var_event(inode_state_wait_address(inode, __I_NEW),
		       !(READ_ONCE(inode->i_state) & I_NEW));
}

#ifdef CONFIG_CGROUP_WRITEBACK

#include <linux/cgroup.h>
#include <linux/bio.h>

void __inode_attach_wb(struct inode *inode, struct folio *folio);
void wbc_attach_and_unlock_inode(struct writeback_control *wbc,
				 struct inode *inode)
	__releases(&inode->i_lock);
void wbc_detach_inode(struct writeback_control *wbc);
void wbc_account_cgroup_owner(struct writeback_control *wbc, struct page *page,
			      size_t bytes);
int cgroup_writeback_by_id(u64 bdi_id, int memcg_id,
			   enum wb_reason reason, struct wb_completion *done);
void cgroup_writeback_umount(struct super_block *sb);
bool cleanup_offline_cgwb(struct bdi_writeback *wb);

/**
 * inode_attach_wb - associate an inode with its wb
 * @inode: inode of interest
 * @folio: folio being dirtied (may be NULL)
 *
 * If @inode doesn't have its wb, associate it with the wb matching the
 * memcg of @folio or, if @folio is NULL, %current. May be called w/ or w/o
 * @inode->i_lock.
 */
static inline void inode_attach_wb(struct inode *inode, struct folio *folio)
{
	if (!inode->i_wb)
		__inode_attach_wb(inode, folio);
}

/**
 * inode_detach_wb - disassociate an inode from its wb
 * @inode: inode of interest
 *
 * @inode is being freed. Detach from its wb.
 */
static inline void inode_detach_wb(struct inode *inode)
{
	if (inode->i_wb) {
		WARN_ON_ONCE(!(inode->i_state & I_CLEAR));
		wb_put(inode->i_wb);
		inode->i_wb = NULL;
	}
}

/**
 * wbc_attach_fdatawrite_inode - associate wbc and inode for fdatawrite
 * @wbc: writeback_control of interest
 * @inode: target inode
 *
 * This function is to be used by __filemap_fdatawrite_range(), which is an
 * alternative entry point into writeback code, and first ensures @inode is
 * associated with a bdi_writeback and attaches it to @wbc.
 */
static inline void wbc_attach_fdatawrite_inode(struct writeback_control *wbc,
					       struct inode *inode)
{
	spin_lock(&inode->i_lock);
	inode_attach_wb(inode, NULL);
	wbc_attach_and_unlock_inode(wbc, inode);
}

/**
 * wbc_init_bio - writeback specific initialization of bio
 * @wbc: writeback_control for the writeback in progress
 * @bio: bio to be initialized
 *
 * @bio is a part of the writeback in progress controlled by @wbc. Perform
 * writeback specific initialization. This is used to apply the cgroup
 * writeback context. Must be called after the bio has been associated with
 * a device.
 */
static inline void wbc_init_bio(struct writeback_control *wbc, struct bio *bio)
{
	/*
	 * pageout() path doesn't attach @wbc to the inode being written
	 * out. This is intentional as we don't want the function to block
	 * behind a slow cgroup. Ultimately, we want pageout() to kick off
	 * regular writeback instead of writing things out itself.
	 */
	if (wbc->wb)
		bio_associate_blkg_from_css(bio, wbc->wb->blkcg_css);
}

#else	/* CONFIG_CGROUP_WRITEBACK */

static inline void inode_attach_wb(struct inode *inode, struct folio *folio)
{
}

static inline void inode_detach_wb(struct inode *inode)
{
}

static inline void wbc_attach_and_unlock_inode(struct writeback_control *wbc,
					       struct inode *inode)
	__releases(&inode->i_lock)
{
	spin_unlock(&inode->i_lock);
}

static inline void wbc_attach_fdatawrite_inode(struct writeback_control *wbc,
					       struct inode *inode)
{
}

static inline void wbc_detach_inode(struct writeback_control *wbc)
{
}

static inline void wbc_init_bio(struct writeback_control *wbc, struct bio *bio)
{
}

static inline void wbc_account_cgroup_owner(struct writeback_control *wbc,
					    struct page *page, size_t bytes)
{
}

static inline void cgroup_writeback_umount(struct super_block *sb)
{
}

#endif	/* CONFIG_CGROUP_WRITEBACK */

/*
 * mm/page-writeback.c
 */
void laptop_io_completion(struct backing_dev_info *info);
void laptop_sync_completion(void);
void laptop_mode_timer_fn(struct timer_list *t);
bool node_dirty_ok(struct pglist_data *pgdat);
int wb_domain_init(struct wb_domain *dom, gfp_t gfp);
#ifdef CONFIG_CGROUP_WRITEBACK
void wb_domain_exit(struct wb_domain *dom);
#endif

extern struct wb_domain global_wb_domain;

/* These are exported to sysctl. */
extern unsigned int dirty_writeback_interval;
extern unsigned int dirty_expire_interval;
extern unsigned int dirtytime_expire_interval;
extern int laptop_mode;

int dirtytime_interval_handler(const struct ctl_table *table, int write,
			       void *buffer, size_t *lenp, loff_t *ppos);

void global_dirty_limits(unsigned long *pbackground, unsigned long *pdirty);
unsigned long wb_calc_thresh(struct bdi_writeback *wb, unsigned long thresh);
unsigned long cgwb_calc_thresh(struct bdi_writeback *wb);

void wb_update_bandwidth(struct bdi_writeback *wb);

/* Invoke balance dirty pages in async mode. */
#define BDP_ASYNC 0x0001

void balance_dirty_pages_ratelimited(struct address_space *mapping);
int balance_dirty_pages_ratelimited_flags(struct address_space *mapping,
					  unsigned int flags);

bool wb_over_bg_thresh(struct bdi_writeback *wb);

struct folio *writeback_iter(struct address_space *mapping,
		struct writeback_control *wbc, struct folio *folio, int *error);

typedef int (*writepage_t)(struct folio *folio, struct writeback_control *wbc,
			   void *data);

int write_cache_pages(struct address_space *mapping,
		      struct writeback_control *wbc, writepage_t writepage,
		      void *data);
int do_writepages(struct address_space *mapping, struct writeback_control *wbc);
void writeback_set_ratelimit(void);
void tag_pages_for_writeback(struct address_space *mapping,
			     pgoff_t start, pgoff_t end);

bool filemap_dirty_folio(struct address_space *mapping, struct folio *folio);
bool folio_redirty_for_writepage(struct writeback_control *, struct folio *);
bool redirty_page_for_writepage(struct writeback_control *, struct page *);

void sb_mark_inode_writeback(struct inode *inode);
void sb_clear_inode_writeback(struct inode *inode);

#endif	/* WRITEBACK_H */
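
The flag selection done by wbc_to_write_flags() in the header above can be tried outside the kernel with a small userspace model. This is only a sketch under stated assumptions: the `model_` and `MODEL_` names, the reduced struct, and the numeric flag values are hypothetical stand-ins invented for illustration, not the kernel's definitions of REQ_SYNC, REQ_BACKGROUND, or struct writeback_control.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical stand-ins for the kernel's REQ_SYNC and REQ_BACKGROUND
 * bits. The numeric values are made up for this model.
 */
enum { MODEL_REQ_SYNC = 1, MODEL_REQ_BACKGROUND = 2 };

/* Mirrors the two values of enum writeback_sync_modes. */
enum model_sync_mode { MODEL_WB_SYNC_NONE, MODEL_WB_SYNC_ALL };

/* Only the writeback_control fields that the flag selection reads. */
struct model_wbc {
	enum model_sync_mode sync_mode;
	bool for_kupdate;
	bool for_background;
};

/*
 * Same decision order as wbc_to_write_flags(): data-integrity writeback
 * (WB_SYNC_ALL) is marked synchronous; otherwise periodic (kupdate) or
 * background writeback is marked as background I/O; anything else gets
 * no extra flag.
 */
unsigned int model_write_flags(const struct model_wbc *wbc)
{
	unsigned int flags = 0;

	assert(wbc != NULL);

	if (wbc->sync_mode == MODEL_WB_SYNC_ALL)
		flags |= MODEL_REQ_SYNC;
	else if (wbc->for_kupdate || wbc->for_background)
		flags |= MODEL_REQ_BACKGROUND;

	return flags;
}
```

Because the checks are chained with `else if`, a control block that requests WB_SYNC_ALL and also has for_background set yields only the sync flag in this model, which matches the branch order of the inline function in the header.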