writeback: implement foreign cgroup inode bdi_writeback switching
As concurrent write sharing of an inode is expected to be very rare and memcg only tracks page ownership on first-use basis severely confining the usefulness of such sharing, cgroup writeback tracks ownership per-inode. While the support for concurrent write sharing of an inode is deemed unnecessary, an inode being written to by different cgroups at different points in time is a lot more common, and, more importantly, charging only by first-use can too readily lead to grossly incorrect behaviors (single foreign page can lead to gigabytes of writeback to be incorrectly attributed). To resolve this issue, cgroup writeback detects the majority dirtier of an inode and transfers the ownership to it. The previous patches implemented the foreign condition detection mechanism and laid the groundwork. This patch implements the actual switching. With the previously implemented [unlocked_]inode_to_wb_and_list_lock() and wb stat transaction, grabbing wb->list_lock, inode->i_lock and mapping->tree_lock gives us full exclusion against all wb operations on the target inode. inode_switch_wb_work_fn() grabs all the locks and transfers the inode atomically along with its RECLAIMABLE and WRITEBACK stats. Signed-off-by: NTejun Heo <tj@kernel.org> Cc: Jens Axboe <axboe@kernel.dk> Cc: Jan Kara <jack@suse.cz> Cc: Wu Fengguang <fengguang.wu@intel.com> Cc: Greg Thelen <gthelen@google.com> Signed-off-by: NJens Axboe <axboe@fb.com>
Showing
想要评论请 注册 或 登录