    writeback: implement foreign cgroup inode detection · 2a814908
    Committed by Tejun Heo
    Concurrent write sharing of an inode is expected to be very rare, and
    memcg tracks page ownership only on a first-use basis, which severely
    limits the usefulness of such sharing.  For these reasons, cgroup
    writeback tracks ownership per-inode.  While support for concurrent
    write sharing of an inode is deemed unnecessary, an inode being
    written to by different cgroups at different points in time is a lot
    more common and, more importantly, charging only by first use can too
    readily lead to grossly incorrect behavior (a single foreign page can
    cause gigabytes of writeback to be incorrectly attributed).
    
    To resolve this issue, cgroup writeback detects the majority dirtier
    of an inode and will transfer the ownership to it.  To avoid
    unnecessary oscillation, the detection mechanism keeps track of
    history and issues the switch verdict only if the foreign usage
    pattern is stable over a certain amount of time and/or number of
    writeback attempts.
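
    The idea can be illustrated with a simplified user-space sketch.  It
    assumes a Boyer-Moore style majority vote within each writeback round
    and a 16-bit history with one bit per round; the type names, function
    names and the threshold of 8 are hypothetical and are not taken from
    the patch, which also weighs rounds by elapsed time.

```c
#include <stddef.h>
#include <stdint.h>

#define FRN_HIST_THRESHOLD 8    /* hypothetical: foreign rounds needed */

/* Hypothetical per-inode state for this sketch; not the kernel layout. */
struct frn_state {
    int owner;          /* cgroup id currently charged for the inode */
    uint16_t history;   /* one bit per round, 1 = foreign majority */
};

/* Per-round majority vote state. */
struct round_vote {
    int cand;           /* current majority candidate */
    size_t cand_bytes;  /* vote counter, in bytes */
};

/* Boyer-Moore majority vote, weighted by the number of bytes written. */
static void vote_account(struct round_vote *v, int id, size_t bytes)
{
    if (v->cand_bytes == 0)
        v->cand = id;
    if (id == v->cand)
        v->cand_bytes += bytes;
    else
        v->cand_bytes -= bytes < v->cand_bytes ? bytes : v->cand_bytes;
}

static int popcount16(uint16_t x)
{
    int n = 0;

    for (; x; x &= (uint16_t)(x - 1))
        n++;
    return n;
}

/*
 * Record the outcome of one round in the history and return 1 only when
 * more than FRN_HIST_THRESHOLD of the last 16 rounds had a foreign
 * majority candidate.
 */
static int round_finish(struct frn_state *s, const struct round_vote *v)
{
    int foreign = v->cand_bytes > 0 && v->cand != s->owner;

    s->history = (uint16_t)((s->history << 1) | (foreign ? 1 : 0));
    return popcount16(s->history) > FRN_HIST_THRESHOLD;
}
```

    In this sketch a single foreign round sets only one history bit, so
    an occasional foreign writer never triggers a switch; the verdict
    flips only after the foreign pattern has persisted for more than half
    of the recorded rounds.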
    
    The detection mechanism has fairly low space and computation overhead.
    It adds 8 bytes to struct inode (one int and two u16's) and a minimal
    amount of calculation per IO.  When there is a clear majority dirtier,
    the mechanism usually converges to the correct answer within several
    seconds of IO time.  Even when there isn't, it can reach an acceptable
    answer fairly quickly under most circumstances.
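
    The 8-byte figure follows from the field sizes on platforms where int
    is 4 bytes.  The struct below is a stand-alone illustration with
    hypothetical field names, not the actual struct inode members.

```c
#include <stdint.h>

/* Illustrative stand-in for the added fields; names are hypothetical. */
struct frn_fields {
    int      winner;    /* 4 bytes, assuming a 32-bit int */
    uint16_t avg_time;  /* 2 bytes */
    uint16_t history;   /* 2 bytes */
};

/* 4 + 2 + 2 = 8, with no padding needed under the assumption above. */
_Static_assert(sizeof(struct frn_fields) == 8,
               "expected one int plus two u16 to occupy 8 bytes");
```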
    
    Please see wbc_detach_inode() for more details.
    
    This patch only implements detection.  Following patches will
    implement actual switching.
    
    v2: wbc_account_io() now checks whether the wbc is associated with a
        wb before dereferencing it.  The wbc can lack a wb when pageout()
        is writing pages directly without going through the usual
        writeback path.  As the pageout() path is single-threaded, we
        don't want it to be blocked behind a slow cgroup and ultimately
        want it to delegate actual writing to the usual writeback path.
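
    The guard described in v2 can be sketched with mock types.  The
    mock_* names below are hypothetical stand-ins, not the kernel's
    structures or the actual wbc_account_io() body.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the writeback structures. */
struct mock_wb {
    int memcg_id;
};

struct mock_wbc {
    struct mock_wb *wb;  /* NULL when the wbc is not attached to a wb */
    size_t accounted;    /* bytes accounted so far */
};

/* Returns 1 if the IO was accounted, 0 if it was skipped. */
static int mock_account_io(struct mock_wbc *wbc, size_t bytes)
{
    /* Direct writers may pass a wbc with no wb; bail out before use. */
    if (!wbc->wb)
        return 0;

    wbc->accounted += bytes;
    return 1;
}
```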
    Signed-off-by: Tejun Heo <tj@kernel.org>
    Cc: Jens Axboe <axboe@kernel.dk>
    Cc: Jan Kara <jack@suse.cz>
    Cc: Wu Fengguang <fengguang.wu@intel.com>
    Cc: Greg Thelen <gthelen@google.com>
    Signed-off-by: Jens Axboe <axboe@fb.com>