提交 f629d1c9 编写于 作者: M Michael Rubin 提交者: Linus Torvalds

mm: add account_page_writeback()

To help developers and applications gain visibility into writeback
behaviour this patch adds two counters to /proc/vmstat.

  # grep nr_dirtied /proc/vmstat
  nr_dirtied 3747
  # grep nr_written /proc/vmstat
  nr_written 3618

These entries allow user apps to understand writeback behaviour over time
and learn how it is impacting their performance.  Currently there is no
way to inspect dirty and writeback speed over time.  It's not possible for
nr_dirty/nr_writeback.

These entries are necessary to give visibility into writeback behaviour.
We have /proc/diskstats which lets us understand the io in the block
layer.  We have blktrace for more in depth understanding.  We have
e2fsprogs and debugsfs to give insight into the file systems behaviour,
but we don't offer our users the ability understand what writeback is
doing.  There is no way to know how active it is over the whole system, if
it's falling behind or to quantify it's efforts.  With these values
exported users can easily see how much data applications are sending
through writeback and also at what rates writeback is processing this
data.  Comparing the rates of change between the two allow developers to
see when writeback is not able to keep up with incoming traffic and the
rate of dirty memory being sent to the IO back end.  This allows folks to
understand their io workloads and track kernel issues.  Non kernel
engineers at Google often use these counters to solve puzzling performance
problems.

Patch #4 adds a pernode vmstat file with nr_dirtied and nr_written

Patch #5 add writeback thresholds to /proc/vmstat

Currently these values are in debugfs. But they should be promoted to
/proc since they are useful for developers who are writing databases
and file servers and are not debugging the kernel.

The output is as below:

 # grep threshold /proc/vmstat
 nr_pages_dirty_threshold 409111
 nr_pages_dirty_background_threshold 818223

This patch:

This allows code outside of the mm core to safely manipulate page
writeback state and not worry about the other accounting.  Not using these
routines means that some code will lose track of the accounting and we get
bugs.

Modify nilfs2 to use interface.
Signed-off-by: NMichael Rubin <mrubin@google.com>
Reviewed-by: NKOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Reviewed-by: NWu Fengguang <fengguang.wu@intel.com>
Cc: KONISHI Ryusuke <konishi.ryusuke@lab.ntt.co.jp>
Cc: Jiro SEKIBA <jir@unicus.jp>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
上级 0def08e3
...@@ -1609,7 +1609,7 @@ nilfs_copy_replace_page_buffers(struct page *page, struct list_head *out) ...@@ -1609,7 +1609,7 @@ nilfs_copy_replace_page_buffers(struct page *page, struct list_head *out)
kunmap_atomic(kaddr, KM_USER0); kunmap_atomic(kaddr, KM_USER0);
if (!TestSetPageWriteback(clone_page)) if (!TestSetPageWriteback(clone_page))
inc_zone_page_state(clone_page, NR_WRITEBACK); account_page_writeback(clone_page);
unlock_page(clone_page); unlock_page(clone_page);
return 0; return 0;
......
...@@ -868,6 +868,7 @@ int __set_page_dirty_no_writeback(struct page *page); ...@@ -868,6 +868,7 @@ int __set_page_dirty_no_writeback(struct page *page);
int redirty_page_for_writepage(struct writeback_control *wbc, int redirty_page_for_writepage(struct writeback_control *wbc,
struct page *page); struct page *page);
void account_page_dirtied(struct page *page, struct address_space *mapping); void account_page_dirtied(struct page *page, struct address_space *mapping);
void account_page_writeback(struct page *page);
int set_page_dirty(struct page *page); int set_page_dirty(struct page *page);
int set_page_dirty_lock(struct page *page); int set_page_dirty_lock(struct page *page);
int clear_page_dirty_for_io(struct page *page); int clear_page_dirty_for_io(struct page *page);
......
...@@ -1128,6 +1128,17 @@ void account_page_dirtied(struct page *page, struct address_space *mapping) ...@@ -1128,6 +1128,17 @@ void account_page_dirtied(struct page *page, struct address_space *mapping)
} }
EXPORT_SYMBOL(account_page_dirtied); EXPORT_SYMBOL(account_page_dirtied);
/*
* Helper function for set_page_writeback family.
* NOTE: Unlike account_page_dirtied this does not rely on being atomic
* wrt interrupts.
*/
void account_page_writeback(struct page *page)
{
inc_zone_page_state(page, NR_WRITEBACK);
}
EXPORT_SYMBOL(account_page_writeback);
/* /*
* For address_spaces which do not use buffers. Just tag the page as dirty in * For address_spaces which do not use buffers. Just tag the page as dirty in
* its radix tree. * its radix tree.
...@@ -1366,7 +1377,7 @@ int test_set_page_writeback(struct page *page) ...@@ -1366,7 +1377,7 @@ int test_set_page_writeback(struct page *page)
ret = TestSetPageWriteback(page); ret = TestSetPageWriteback(page);
} }
if (!ret) if (!ret)
inc_zone_page_state(page, NR_WRITEBACK); account_page_writeback(page);
return ret; return ret;
} }
......
Markdown is supported
0% .
You are about to add 0 people to the discussion. Proceed with caution.
先完成此消息的编辑!
想要评论请 注册