1. 29 June 2020, 3 commits
  2. 01 May 2020, 1 commit
    • blk-iocost: switch to fixed non-auto-decaying use_delay · 54c52e10
      Authored by Tejun Heo
      The use_delay mechanism was introduced by blk-iolatency to hold memory
      allocators accountable for the reclaim and other shared IOs they cause. The
      duration of the delay is dynamically balanced between iolatency increasing the
      value on each target miss and the value auto-decaying as time passes and
      threads get delayed on it.
      
      While this works well for iolatency, iocost's control model isn't compatible
      with it. There are no repeated "violation" events which can be balanced
      against auto-decaying. iocost instead knows how much a given cgroup is over
      budget and wants to prevent that cgroup from issuing IOs while over budget.
      Until now, iocost has been adding the cost of force-issued IOs. However, this
      doesn't reflect the amount which is already over budget and is simply not
      enough to counter the auto-decaying, allowing an anon-memory-leaking,
      low-priority cgroup to go over its allotted share of IOs.
      
      As auto-decaying doesn't make much sense for iocost, this patch introduces a
      different mode of operation for use_delay - when blkcg_set_delay() is used
      instead of blkcg_add/use_delay(), the delay duration is not auto-decayed until it
      is explicitly cleared with blkcg_clear_delay(). iocost is updated to keep the
      delay duration synchronized to the budget overage amount.
      
      With this change, iocost can effectively police cgroups which generate a
      significant amount of force-issued IOs.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: Josef Bacik <josef@toxicpanda.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      54c52e10
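The difference between the two delay modes described above can be sketched in a small userspace model. The function names echo the kernel API (blkcg_add_delay(), blkcg_set_delay(), blkcg_clear_delay()), but the struct and the halving decay step are simplified illustrations under my own assumptions, not the kernel implementation.

```c
#include <stdint.h>
#include <stdbool.h>

/* Hypothetical model of a blkcg's pending use_delay state. */
struct blkcg_model {
    uint64_t delay_nsec;   /* pending delay applied to the cgroup's threads */
    bool     explicit_set; /* true when set via the non-decaying mode */
};

/* iolatency style: accumulate delay on each target miss; subject to decay. */
static void model_add_delay(struct blkcg_model *b, uint64_t nsec)
{
    b->delay_nsec += nsec;
    b->explicit_set = false;
}

/* iocost style: pin the delay to the current budget overage; no decay. */
static void model_set_delay(struct blkcg_model *b, uint64_t nsec)
{
    b->delay_nsec = nsec;
    b->explicit_set = true;
}

/* Explicitly drop the delay once the cgroup is back within budget. */
static void model_clear_delay(struct blkcg_model *b)
{
    b->delay_nsec = 0;
    b->explicit_set = false;
}

/* Called as time passes: halve the delay per tick (an illustrative decay
 * rate), but only in the auto-decaying add/use mode. */
static void model_decay_tick(struct blkcg_model *b)
{
    if (!b->explicit_set)
        b->delay_nsec /= 2;
}
```

In this model, a delay installed with model_set_delay() survives any number of decay ticks untouched, which is the property iocost needs to keep the delay synchronized with the overage amount.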
  3. 29 April 2020, 1 commit
  4. 02 April 2020, 2 commits
    • blkcg: don't offline parent blkcg first · 4308a434
      Authored by Tejun Heo
      blkcg->cgwb_refcnt is used to delay blkcg offlining so that blkgs
      don't get offlined while there are active cgwbs on them.  However, it
      ends up making offlining unordered, sometimes causing parents to be
      offlined before children.
      
      Let's fix this by making child blkcgs pin the parents' online states.
      
      Note that pin/unpin names are chosen over get/put intentionally
      because css uses get/put online for something different.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      4308a434
    • blkcg: rename blkcg->cgwb_refcnt to ->online_pin and always use it · d866dbf6
      Authored by Tejun Heo
      blkcg->cgwb_refcnt is used to delay blkcg offlining so that blkgs
      don't get offlined while there are active cgwbs on them.  However, it
      ends up making offlining unordered, sometimes causing parents to be
      offlined before children.
      
      To fix it, we want child blkcgs to pin the parents' online states
      turning the refcnt into a more generic online pinning mechanism.
      
      In preparation:
      
      * blkcg->cgwb_refcnt -> blkcg->online_pin
      * blkcg_cgwb_get/put() -> blkcg_pin/unpin_online()
      * Take them out of CONFIG_CGROUP_WRITEBACK
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      d866dbf6
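The online-pinning idea in these two commits can be sketched as a refcount walked up the hierarchy: each child holds a pin on its parent, so a parent cannot go offline until every child has. The struct and plain-int refcount below are simplified stand-ins of my own, not the kernel's blkcg definition, and the names carry a _sk suffix to mark them as such.

```c
#include <stddef.h>

/* Hypothetical miniature of a blkcg with an online pin count. */
struct blkcg_sk {
    struct blkcg_sk *parent;
    int online_pin;   /* the kernel uses proper refcount semantics */
};

/* Take an extra pin, e.g. when a child comes online under this blkcg. */
static void blkcg_pin_online_sk(struct blkcg_sk *blkcg)
{
    blkcg->online_pin++;
}

/* Drop a pin; when the last pin goes away this level goes offline,
 * which in turn releases the pin it held on its parent, and so on up
 * the tree. This is what keeps offlining ordered child-before-parent. */
static void blkcg_unpin_online_sk(struct blkcg_sk *blkcg)
{
    while (blkcg) {
        if (--blkcg->online_pin > 0)
            break;
        /* last pin gone: "offline" this level, then unpin the parent */
        blkcg = blkcg->parent;
    }
}
```

With this structure, a parent asked to go offline while a child still exists simply drops to one remaining pin and stays online until the child releases it.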
  5. 13 December 2019, 1 commit
  6. 18 November 2019, 1 commit
  7. 08 November 2019, 3 commits
    • blk-cgroup: separate out blkg_rwstat under CONFIG_BLK_CGROUP_RWSTAT · 1d156646
      Authored by Tejun Heo
      blkg_rwstat is now only used by bfq-iosched and blk-throtl when on
      cgroup1.  Let's move it into its own files and gate it behind a config
      option.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      1d156646
    • blk-cgroup: reimplement basic IO stats using cgroup rstat · f7331648
      Authored by Tejun Heo
      blk-cgroup has been using blkg_rwstat to track basic IO stats.
      Unfortunately, reading recursive stats scales badly as it involves
      walking all descendants.  On systems with a huge number of cgroups
      (dead or alive), this can lead to substantial CPU cost when reading IO
      stats.
      
      This patch reimplements basic IO stats using cgroup rstat which uses
      more memory but makes recursive stat reading O(# descendants which
      have been active since last reading) instead of O(# descendants).
      
      * blk-cgroup core no longer uses sync/async stats.  Introduce new stat
        enums - BLKG_IOSTAT_{READ|WRITE|DISCARD}.
      
      * Add blkg_iostat[_set] which encapsulates byte and io stats, last
        values for propagation delta calculation and u64_stats_sync for
        correctness on 32bit archs.
      
      * Update the new percpu stat counters directly and implement
        blkcg_rstat_flush() to implement propagation.
      
      * blkg_print_stat() can now bring the stats up to date by calling
        cgroup_rstat_flush() and print them instead of directly summing up
        all descendants.
      
      * It now allocates 96 bytes per cpu.  It used to be 40 bytes.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: Dan Schatzberg <dschatzberg@fb.com>
      Cc: Daniel Xu <dlxu@fb.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      f7331648
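The delta-propagation scheme behind the rstat rework above can be modeled compactly: each node keeps raw counters, a "last" snapshot from the previous flush, and a recursive sum, so a flush only pushes the delta since the last flush up to the parent. The enum names echo the patch's BLKG_IOSTAT_{READ|WRITE|DISCARD}, but the struct layout is my own userspace sketch (it omits the per-cpu split and u64_stats_sync the kernel needs for 32-bit correctness).

```c
#include <stdint.h>

enum { IOSTAT_READ, IOSTAT_WRITE, IOSTAT_DISCARD, IOSTAT_NR };

/* Hypothetical per-cgroup stat node with rstat-style bookkeeping. */
struct iostat_node {
    struct iostat_node *parent;
    uint64_t cur[IOSTAT_NR];   /* updated directly on each IO */
    uint64_t last[IOSTAT_NR];  /* snapshot taken at the previous flush */
    uint64_t sum[IOSTAT_NR];   /* recursive total, maintained by flushing */
};

/* Hot path: just bump the local counter; no tree walk. */
static void iostat_account(struct iostat_node *n, int op, uint64_t ios)
{
    n->cur[op] += ios;
}

/* Flush: propagate only the delta since the previous flush up one
 * level. Flushing bottom-up over the nodes active since the last read
 * is what makes reads O(active descendants) rather than O(descendants). */
static void iostat_flush_one(struct iostat_node *n)
{
    for (int i = 0; i < IOSTAT_NR; i++) {
        uint64_t delta = n->cur[i] - n->last[i];
        n->last[i] = n->cur[i];
        n->sum[i] += delta;
        if (n->parent)
            n->parent->cur[i] += delta;
    }
}
```

A read at the root then only needs flushes along the paths of cgroups that actually did IO since the previous read, instead of summing every descendant each time.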
    • blk-cgroup: remove now unused blkg_print_stat_{bytes|ios}_recursive() · 8a80d5d6
      Authored by Tejun Heo
      These don't have users anymore.  Remove them.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      8a80d5d6
  8. 29 August 2019, 2 commits
  9. 05 August 2019, 1 commit
  10. 17 July 2019, 1 commit
  11. 10 July 2019, 1 commit
    • blkcg: implement REQ_CGROUP_PUNT · d3f77dfd
      Authored by Tejun Heo
      When a shared kthread needs to issue a bio for a cgroup, doing so
      synchronously can lead to priority inversions as the kthread can be
      trapped waiting for that cgroup.  This patch implements
      REQ_CGROUP_PUNT flag which makes submit_bio() punt the actual issuing
      to a dedicated per-blkcg work item to avoid such priority inversions.
      
      This will be used to fix priority inversions in btrfs compression and
      should be generally useful as we grow filesystem support for
      comprehensive IO control.
      
      Cc: Chris Mason <clm@fb.com>
      Reviewed-by: Josef Bacik <josef@toxicpanda.com>
      Reviewed-by: Jan Kara <jack@suse.cz>
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      d3f77dfd
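The punting mechanism described above can be sketched as a per-blkcg queue drained by a dedicated work function: the shared kthread only enqueues the bio and never blocks on the cgroup's IO limits. The structs and the singly linked list below are illustrative stand-ins of my own, not the kernel's bio/work_struct machinery.

```c
#include <stddef.h>

/* Hypothetical bio placeholder; 'issued' stands in for actual submission. */
struct punt_bio {
    struct punt_bio *next;
    int issued;
};

/* Hypothetical per-blkcg punt queue drained by a dedicated work item. */
struct punt_blkcg {
    struct punt_bio *queue;
};

/* Shared-kthread side: with REQ_CGROUP_PUNT semantics, submission only
 * queues the bio, so the kthread cannot be trapped waiting on a
 * throttled cgroup (the priority inversion the patch avoids). */
static void punt_submit(struct punt_blkcg *blkcg, struct punt_bio *bio)
{
    bio->next = blkcg->queue;
    blkcg->queue = bio;
}

/* Per-blkcg work item: drain the queue and actually issue the bios,
 * charged (and possibly throttled) in the cgroup's own context.
 * Returns the number of bios issued. */
static int punt_work_fn(struct punt_blkcg *blkcg)
{
    int n = 0;
    while (blkcg->queue) {
        struct punt_bio *bio = blkcg->queue;
        blkcg->queue = bio->next;
        bio->issued = 1;
        n++;
    }
    return n;
}
```

The key property is visible in the split: any delay imposed by IO control lands on the work item running on the cgroup's behalf, never on the shared submitter.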
  12. 21 June 2019, 4 commits
  13. 21 December 2018, 1 commit
  14. 13 December 2018, 1 commit
  15. 08 December 2018, 10 commits
  16. 16 November 2018, 2 commits
  17. 08 November 2018, 1 commit
  18. 02 November 2018, 1 commit
  19. 21 October 2018, 1 commit
  20. 22 September 2018, 2 commits