  1. 24 Sep 2020 (2 commits)
  2. 10 Sep 2020 (1 commit)
  3. 02 Sep 2020 (5 commits)
  4. 01 Aug 2020 (1 commit)
  5. 18 Jul 2020 (1 commit)
    • blk-cgroup: show global disk stats in root cgroup io.stat · ef45fe47
      Committed by Boris Burkov
      In order to improve consistency and usability in cgroup stat accounting,
      we would like to support the root cgroup's io.stat.
      
      Since the root cgroup has processes doing I/O even when the system has
      no explicitly created cgroups, we need to be careful to avoid overhead
      in that case. For that reason the rstat algorithms do not handle the
      root cgroup, so simply enabling the file would not give correct
      statistics.
      
      To get around this, we simulate flushing the iostat struct by filling it
      out directly from global disk stats. The result is a root cgroup io.stat
      file consistent with both /proc/diskstats and io.stat.
      
      Note that in order to collect the disk stats we need to iterate over
      devices. To facilitate that, the linkage of disk_type was changed to
      external so that blk-cgroup.c can use it to iterate over disks (a
      minimal iteration sketch follows this entry).
      Suggested-by: Tejun Heo <tj@kernel.org>
      Signed-off-by: Boris Burkov <boris@bur.io>
      Acked-by: Tejun Heo <tj@kernel.org>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
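      As referenced above, here is a minimal sketch of how such an iteration
      over all disks could look. The helper name is hypothetical, the stat
      fields summed are only an example, and the v5.8-era part_stat API
      (global counters in disk->part0) is assumed; this is not the upstream
      blk-cgroup code.

      #include <linux/genhd.h>
      #include <linux/part_stat.h>

      /* Sketch: walk every disk via the now externally visible disk_type
       * and sum its global counters, the way a root-cgroup io.stat fill
       * could be simulated without involving rstat. */
      static void root_iostat_fill_sketch(u64 *rbytes, u64 *wbytes)
      {
              struct class_dev_iter iter;
              struct device *dev;

              *rbytes = 0;
              *wbytes = 0;
              class_dev_iter_init(&iter, &block_class, NULL, &disk_type);
              while ((dev = class_dev_iter_next(&iter))) {
                      struct gendisk *disk = dev_to_disk(dev);

                      /* v5.8-era API: per-disk stats live in disk->part0 */
                      *rbytes += (u64)part_stat_read(&disk->part0,
                                              sectors[STAT_READ]) << 9;
                      *wbytes += (u64)part_stat_read(&disk->part0,
                                              sectors[STAT_WRITE]) << 9;
              }
              class_dev_iter_exit(&iter);
      }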
  6. 09 Jul 2020 (1 commit)
  7. 24 Jun 2020 (3 commits)
    • block: revert back to synchronous request_queue removal · e8c7d14a
      Committed by Luis Chamberlain
      Commit dc9edc44 ("block: Fix a blk_exit_rl() regression"), merged in
      v4.12, moved the work behind blk_release_queue() into a workqueue after
      a splat surfaced indicating that some work done by blk_release_queue()
      could sleep in blk_exit_rl(). This splat was possible when a driver
      called blk_put_queue() or blk_cleanup_queue() (which calls
      blk_put_queue() as its final call) from an atomic context.
      
      blk_put_queue() decrements the refcount of the request_queue kobject,
      and upon reaching 0 blk_release_queue() is called. Although
      blk_exit_rl() was removed by commit db6d9952 ("block: remove
      request_list code") in v5.0, we reserve the right to be able to sleep
      within the blk_release_queue() context.
      
      The last reference to the request_queue must not be dropped from atomic
      context. *When* the last reference to the request_queue is dropped
      varies, so take the opportunity to document when that is expected to
      happen and the context of the related calls as best as possible, so
      that future issues can be avoided and, hopefully, the synchronous
      request_queue removal sticks.
      
      We revert to synchronous request_queue removal because asynchronous
      removal created a regression in the userspace interaction expected by
      several drivers. For example, the loopback driver is removed via ioctls
      from userspace, and if the ioctl returns successfully the device is
      expected to be gone. Likewise, if one races to add another device, the
      new one may not be added because the old one is still being removed.
      This was the expected behavior before, and it now fails because the
      device is still present and busy. Moving to asynchronous request_queue
      removal could have broken many scripts that relied on the removal
      having completed when no error was returned. Document this expectation
      as well so that userspace does not regress again.
      
      Using asynchronous request_queue removal has, however, helped us find
      other bugs. In the future we can test what would break with that
      arrangement by enabling CONFIG_DEBUG_KOBJECT_RELEASE.
      
      While at it, update the docs with the context expectations for the
      request_queue / gendisk refcount decrement, and make these expectations
      explicit by using might_sleep() (a sketch of that annotation follows
      this entry).
      
      Fixes: dc9edc44 ("block: Fix a blk_exit_rl() regression")
      Suggested-by: Nicolai Stange <nstange@suse.de>
      Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Reviewed-by: Bart Van Assche <bvanassche@acm.org>
      Cc: Bart Van Assche <bvanassche@acm.org>
      Cc: Omar Sandoval <osandov@fb.com>
      Cc: Hannes Reinecke <hare@suse.com>
      Cc: Nicolai Stange <nstange@suse.de>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Michal Hocko <mhocko@kernel.org>
      Cc: yu kuai <yukuai3@huawei.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
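      A hedged sketch of the decrement-side expectation described above. The
      helper below is hypothetical and simplified, not the upstream diff; it
      only illustrates the shape of the annotation.

      #include <linux/blkdev.h>
      #include <linux/kernel.h>
      #include <linux/kobject.h>

      /* Sketch: dropping a request_queue reference. With synchronous
       * removal, the final kobject_put() runs the release callback (and its
       * possibly sleeping teardown) right here, so the caller must be in a
       * context that can sleep. might_sleep() makes that rule visible and,
       * with CONFIG_DEBUG_ATOMIC_SLEEP, enforced at runtime. */
      static void put_queue_sketch(struct request_queue *q)
      {
              might_sleep();
              kobject_put(&q->kobj);  /* may run the queue release synchronously */
      }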
    • block: clarify context for refcount increment helpers · 763b5892
      Committed by Luis Chamberlain
      Clarify the context under which the helpers that increment the refcount
      of the gendisk and request_queue can be called. We make this explicit
      with might_sleep() in the places where we may sleep (an illustrative
      example follows this entry).

      We don't address the decrement context yet, as that needs some extra
      work and fixes; it will be addressed in the next patch.
      Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Reviewed-by: Bart Van Assche <bvanassche@acm.org>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
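      An illustrative example (not the upstream patch) of what annotating an
      increment helper with might_sleep() buys: the annotation documents that
      callers must be able to sleep, and CONFIG_DEBUG_ATOMIC_SLEEP turns
      violations into a runtime warning. The helper and lock names below are
      made up.

      #include <linux/blkdev.h>
      #include <linux/kobject.h>
      #include <linux/spinlock.h>

      /* Hypothetical increment helper, annotated for calling context.
       * kobject_get() itself never sleeps; the annotation records the rule
       * that callers are expected to be in process context. */
      static void get_queue_sketch(struct request_queue *q)
      {
              might_sleep();          /* callers must be able to sleep */
              kobject_get(&q->kobj);
      }

      static DEFINE_SPINLOCK(example_lock);

      static void caller_sketch(struct request_queue *q)
      {
              get_queue_sketch(q);    /* fine: process context, may sleep */

              spin_lock(&example_lock);
              /*
               * Calling get_queue_sketch(q) here would trigger a "sleeping
               * function called from invalid context" warning when
               * CONFIG_DEBUG_ATOMIC_SLEEP is enabled.
               */
              spin_unlock(&example_lock);
      }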
    • block: add docs for gendisk / request_queue refcount helpers · b5bd357c
      Committed by Luis Chamberlain
      This adds documentation for the gendisk / request_queue refcount
      helpers (an illustrative kernel-doc shape follows this entry).
      Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Reviewed-by: Bart Van Assche <bvanassche@acm.org>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
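      For reference, kernel-doc comments for such helpers typically take the
      shape below. The wording is illustrative, not a quote of the patch.

      /**
       * blk_get_queue - increment a request_queue reference count
       * @q: the request_queue to take a reference on
       *
       * Takes a reference on the request_queue kobject so the queue cannot
       * be freed while the caller is using it; drop it again with
       * blk_put_queue().
       *
       * Return: %true if a reference was taken, %false if the queue is
       * dying.
       */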
  8. 27 May 2020 (2 commits)
  9. 19 May 2020 (2 commits)
  10. 13 May 2020 (3 commits)
  11. 10 May 2020 (1 commit)
  12. 21 Apr 2020 (3 commits)
  13. 27 Mar 2020 (1 commit)
  14. 25 Mar 2020 (7 commits)
  15. 24 Mar 2020 (3 commits)
  16. 19 Mar 2020 (1 commit)
  17. 12 Mar 2020 (1 commit)
  18. 22 Nov 2019 (1 commit)
    • block: add iostat counters for flush requests · b6866318
      Committed by Konstantin Khlebnikov
      Requests that trigger flushing of the volatile writeback cache to disk
      (barriers) have a significant effect on overall performance.

      The block layer has a sophisticated engine for combining several flush
      requests into one, but there are no statistics for the actual flushes
      executed by the disk. Requests that trigger flushes are usually
      barriers, i.e. zero-size writes.
      
      This patch adds two iostat counters to /sys/class/block/$dev/stat and
      /proc/diskstats: the count of completed flush requests and their total
      time (a small reader sketch follows this entry).
      Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
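      A small userspace sketch (not part of the patch) showing where the two
      new counters land: with this change the stat file exposes 17 fields,
      the last two being completed flush requests and flush time in
      milliseconds. The device name "sda" is only an example.

      #include <stdio.h>

      int main(void)
      {
              unsigned long long f[17] = { 0 };
              FILE *fp = fopen("/sys/class/block/sda/stat", "r");
              int n = 0;

              if (!fp)
                      return 1;
              /* read up to 17 whitespace-separated counters */
              while (n < 17 && fscanf(fp, "%llu", &f[n]) == 1)
                      n++;
              fclose(fp);

              if (n >= 17)    /* fields 16 and 17 are the flush counters */
                      printf("flushes completed: %llu, flush time: %llu ms\n",
                             f[15], f[16]);
              else
                      printf("only %d stat fields; no flush counters here\n", n);
              return 0;
      }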
  19. 06 Sep 2019 (1 commit)
    • block: Delay default elevator initialization · 737eb78e
      Committed by Damien Le Moal
      When elevator_init_mq() is called from blk_mq_init_allocated_queue(),
      the only information known about the device is the number of hardware
      queues, since the block device scan by the device driver is not yet
      complete for most drivers. The device type and the elevator's required
      features are not yet set, preventing correct selection of the default
      elevator most suitable for the device.
      
      This currently affects all multi-queue zoned block devices which default
      to the "none" elevator instead of the required "mq-deadline" elevator.
      These drives currently include host-managed SMR disks connected to a
      smartpqi HBA and null_blk block devices with zoned mode enabled.
      Upcoming NVMe Zoned Namespace devices will also be affected.
      
      Fix this by adding a boolean elevator_init argument to
      blk_mq_init_allocated_queue() to control the execution of
      elevator_init_mq(). Two cases exist:
      1) elevator_init = false is used for calls to
         blk_mq_init_allocated_queue() within blk_mq_init_queue(). In this
         case, a call to elevator_init_mq() is added to __device_add_disk(),
         delaying the initialization of the queue elevator until after the
         device driver has finished probing the device. This effectively
         gives elevator_init_mq() access to more information about the
         device.
      2) elevator_init = true preserves the current behavior of initializing
         the elevator directly from blk_mq_init_allocated_queue(). This case
         is used for the special request-based DM devices, where the device
         gendisk is created before the queue initialization and the device
         information (e.g. queue limits) is already known when the queue
         initialization is executed.
      
      Additionally, to make sure that the elevator initialization is never
      done while requests are in flight (there should be none when the device
      driver calls device_add_disk()), freeze and quiesce the device request
      queue before calling blk_mq_init_sched() in elevator_init_mq() (a
      sketch of this pattern follows this entry).
      Reviewed-by: Ming Lei <ming.lei@redhat.com>
      Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
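      A hedged sketch of the freeze/quiesce pattern referenced above. The
      blk-mq function names are the real APIs, but the helper itself is
      simplified and is not the upstream elevator_init_mq() diff.

      #include <linux/blk-mq.h>
      #include <linux/blkdev.h>
      #include <linux/elevator.h>

      /* Sketch: switch in the default scheduler only while no requests can
       * be in flight, by freezing and quiescing the queue around it. */
      static void elevator_switch_sketch(struct request_queue *q,
                                         struct elevator_type *e)
      {
              blk_mq_freeze_queue(q);   /* drain in-flight, block new requests */
              blk_mq_quiesce_queue(q);  /* ensure no dispatch is still running */

              /* blk_mq_init_sched() is declared in the block layer's private
               * blk-mq-sched.h; error handling is elided in this sketch. */
              blk_mq_init_sched(q, e);

              blk_mq_unquiesce_queue(q);
              blk_mq_unfreeze_queue(q);
      }

      The point of the pairing is that freezing waits out outstanding
      requests and blocks new submissions, while quiescing ensures no
      dispatch path is still executing, so the scheduler hooks can be swapped
      without racing against I/O.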