1. 02 6月, 2015 5 次提交
    • T
      writeback: implement WB_has_dirty_io wb_state flag · d6c10f1f
      Tejun Heo 提交于
      Currently, wb_has_dirty_io() determines whether a wb (bdi_writeback)
      has any dirty inode by testing all three IO lists on each invocation
      without actively keeping track.  For cgroup writeback support, a
      single bdi will host multiple wb's each of which will host dirty
      inodes separately and we'll need to make bdi_has_dirty_io(), which
      currently only represents the root wb, aggregate has_dirty_io from all
      member wb's, which requires tracking transitions in has_dirty_io state
      on each wb.
      
      This patch introduces inode_wb_list_{move|del}_locked() to consolidate
      IO list operations leaving queue_io() the only other function which
      directly manipulates IO lists (via move_expired_inodes()).  All three
      functions are updated to call wb_io_lists_[de]populated() which keep
      track of whether the wb has dirty inodes or not and record it using
      the new WB_has_dirty_io flag.  inode_wb_list_moved_locked()'s return
      value indicates whether the wb had no dirty inodes before.
      
      mark_inode_dirty() is restructured so that the return value of
      inode_wb_list_move_locked() can be used for deciding whether to wake
      up the wb.
      
      While at it, change {bdi|wb}_has_dirty_io()'s return values to bool.
      These functions were returning 0 and 1 before.  Also, add a comment
      explaining the synchronization of wb_state flags.
      
      v2: Updated to accommodate b_dirty_time.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Jan Kara <jack@suse.cz>
      Signed-off-by: NJens Axboe <axboe@fb.com>
      d6c10f1f
    • T
      writeback: make congestion functions per bdi_writeback · ec8a6f26
      Tejun Heo 提交于
      Currently, all congestion functions take bdi (backing_dev_info) and
      always operate on the root wb (bdi->wb) and the congestion state from
      the block layer is propagated only for the root blkcg.  This patch
      introduces {set|clear}_wb_congested() and wb_congested() which take a
      bdi_writeback_congested and bdi_writeback respectively.  The bdi
      counteparts are now wrappers invoking the wb based functions on
      @bdi->wb.
      
      While converting clear_bdi_congested() to clear_wb_congested(), the
      local variable declaration order between @wqh and @bit is swapped for
      cosmetic reason.
      
      This patch just adds the new wb based functions.  The following
      patches will apply them.
      
      v2: Updated for bdi_writeback_congested.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Reviewed-by: NJan Kara <jack@suse.cz>
      Cc: Jens Axboe <axboe@kernel.dk>
      Signed-off-by: NJens Axboe <axboe@fb.com>
      ec8a6f26
    • T
      writeback: make backing_dev_info host cgroup-specific bdi_writebacks · 52ebea74
      Tejun Heo 提交于
      For the planned cgroup writeback support, on each bdi
      (backing_dev_info), each memcg will be served by a separate wb
      (bdi_writeback).  This patch updates bdi so that a bdi can host
      multiple wbs (bdi_writebacks).
      
      On the default hierarchy, blkcg implicitly enables memcg.  This allows
      using memcg's page ownership for attributing writeback IOs, and every
      memcg - blkcg combination can be served by its own wb by assigning a
      dedicated wb to each memcg.  This means that there may be multiple
      wb's of a bdi mapped to the same blkcg.  As congested state is per
      blkcg - bdi combination, those wb's should share the same congested
      state.  This is achieved by tracking congested state via
      bdi_writeback_congested structs which are keyed by blkcg.
      
      bdi->wb remains unchanged and will keep serving the root cgroup.
      cgwb's (cgroup wb's) for non-root cgroups are created on-demand or
      looked up while dirtying an inode according to the memcg of the page
      being dirtied or current task.  Each cgwb is indexed on bdi->cgwb_tree
      by its memcg id.  Once an inode is associated with its wb, it can be
      retrieved using inode_to_wb().
      
      Currently, none of the filesystems has FS_CGROUP_WRITEBACK and all
      pages will keep being associated with bdi->wb.
      
      v3: inode_attach_wb() in account_page_dirtied() moved inside
          mapping_cap_account_dirty() block where it's known to be !NULL.
          Also, an unnecessary NULL check before kfree() removed.  Both
          detected by the kbuild bot.
      
      v2: Updated so that wb association is per inode and wb is per memcg
          rather than blkcg.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Cc: kbuild test robot <fengguang.wu@intel.com>
      Cc: Dan Carpenter <dan.carpenter@oracle.com>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Jan Kara <jack@suse.cz>
      Signed-off-by: NJens Axboe <axboe@fb.com>
      52ebea74
    • T
      bdi: separate out congested state into a separate struct · 4aa9c692
      Tejun Heo 提交于
      Currently, a wb's (bdi_writeback) congestion state is carried in its
      ->state field; however, cgroup writeback support will require multiple
      wb's sharing the same congestion state.  This patch separates out
      congestion state into its own struct - struct bdi_writeback_congested.
      A new field wb field, wb_congested, points to its associated congested
      struct.  The default wb, bdi->wb, always points to bdi->wb_congested.
      
      While this patch adds a layer of indirection, it doesn't introduce any
      behavior changes.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Signed-off-by: NJens Axboe <axboe@fb.com>
      4aa9c692
    • T
      writeback: separate out include/linux/backing-dev-defs.h · 66114cad
      Tejun Heo 提交于
      With the planned cgroup writeback support, backing-dev related
      declarations will be more widely used across block and cgroup;
      unfortunately, including backing-dev.h from include/linux/blkdev.h
      makes cyclic include dependency quite likely.
      
      This patch separates out backing-dev-defs.h which only has the
      essential definitions and updates blkdev.h to include it.  c files
      which need access to more backing-dev details now include
      backing-dev.h directly.  This takes backing-dev.h off the common
      include dependency chain making it a lot easier to use it across block
      and cgroup.
      
      v2: fs/fat build failure fixed.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Reviewed-by: NJan Kara <jack@suse.cz>
      Cc: Jens Axboe <axboe@kernel.dk>
      Signed-off-by: NJens Axboe <axboe@fb.com>
      66114cad