1. 13 Nov, 2021 1 commit
  2. 10 Nov, 2021 1 commit
  3. 03 Nov, 2021 1 commit
  4. 27 Oct, 2021 1 commit
  5. 20 Oct, 2021 2 commits
  6. 19 Oct, 2021 2 commits
  7. 18 Oct, 2021 7 commits
  8. 25 Jun, 2021 1 commit
    • blk: Fix lock inversion between ioc lock and bfqd lock · fd2ef39c
      Committed by Jan Kara
      Lockdep complains about lock inversion between ioc->lock and bfqd->lock:
      
      bfqd -> ioc:
       put_io_context+0x33/0x90 -> ioc->lock grabbed
       blk_mq_free_request+0x51/0x140
       blk_put_request+0xe/0x10
       blk_attempt_req_merge+0x1d/0x30
       elv_attempt_insert_merge+0x56/0xa0
       blk_mq_sched_try_insert_merge+0x4b/0x60
       bfq_insert_requests+0x9e/0x18c0 -> bfqd->lock grabbed
       blk_mq_sched_insert_requests+0xd6/0x2b0
       blk_mq_flush_plug_list+0x154/0x280
       blk_finish_plug+0x40/0x60
       ext4_writepages+0x696/0x1320
       do_writepages+0x1c/0x80
       __filemap_fdatawrite_range+0xd7/0x120
       sync_file_range+0xac/0xf0
      
      ioc->bfqd:
       bfq_exit_icq+0xa3/0xe0 -> bfqd->lock grabbed
       put_io_context_active+0x78/0xb0 -> ioc->lock grabbed
       exit_io_context+0x48/0x50
       do_exit+0x7e9/0xdd0
       do_group_exit+0x54/0xc0
      
      To avoid this inversion, we change blk_mq_sched_try_insert_merge() so
      that it does not free the merged request but leaves that up to the
      caller, as blk_mq_sched_try_merge() does. In bfq_insert_requests() we
      make sure to free all the merged requests after dropping bfqd->lock.
      
      Fixes: aee69d78 ("block, bfq: introduce the BFQ-v0 I/O scheduler as an extra scheduler")
      Reviewed-by: Ming Lei <ming.lei@redhat.com>
      Acked-by: Paolo Valente <paolo.valente@linaro.org>
      Signed-off-by: Jan Kara <jack@suse.cz>
      Link: https://lore.kernel.org/r/20210623093634.27879-3-jack@suse.cz
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      fd2ef39c
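
The deferred-free pattern described in this commit can be sketched in user-space C. This is only an illustrative model, not the kernel code: `struct request`, `struct sched_data`, `free_request()`, `try_insert_merge()` and `insert_requests()` are hypothetical stand-ins, the lock is modelled as a plain flag, and the real bfq_insert_requests() handles locking differently in detail.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; none of these are the kernel's definitions. */
struct request {
    struct request *next;   /* link on the deferred-free list */
    bool freed;
};

struct sched_data {
    bool lock_held;         /* models bfqd->lock being held */
};

/* Models freeing a request, which may need ioc->lock in the kernel.
 * It must therefore never run while the scheduler lock is held. */
static void free_request(struct sched_data *sd, struct request *rq)
{
    assert(!sd->lock_held);
    rq->freed = true;
}

/* Models the changed merge helper: on a successful merge the request is
 * handed back to the caller on free_list instead of being freed here. */
static bool try_insert_merge(struct request *rq, bool mergeable,
                             struct request **free_list)
{
    if (!mergeable)
        return false;
    rq->next = *free_list;
    *free_list = rq;
    return true;
}

/* Models the caller: merge under the lock, free only after unlocking.
 * Returns the number of requests that were not merged. */
static int insert_requests(struct sched_data *sd, struct request *rqs,
                           const bool *mergeable, size_t n)
{
    struct request *free_list = NULL;
    int queued = 0;

    sd->lock_held = true;               /* take the scheduler lock */
    for (size_t i = 0; i < n; i++) {
        if (!try_insert_merge(&rqs[i], mergeable[i], &free_list))
            queued++;
    }
    sd->lock_held = false;              /* drop the scheduler lock */

    while (free_list) {                 /* now freeing is safe */
        struct request *rq = free_list;
        free_list = rq->next;
        free_request(sd, rq);
    }
    return queued;
}
```

Collecting merged requests on a local list keeps the lock order one-directional: the path that needs the second lock only runs once the first has been released.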
  9. 04 Jun, 2021 1 commit
    • block: Do not pull requests from the scheduler when we cannot dispatch them · 61347154
      Committed by Jan Kara
      Provided the device driver does not implement dispatch budget accounting
      (which only SCSI does), the loop in __blk_mq_do_dispatch_sched() pulls
      requests from the IO scheduler as long as it is willing to give out any.
      That defeats scheduling heuristics inside the scheduler by creating
      the false impression that the device can take more IO when in fact it
      cannot.
      
      For example, with the BFQ IO scheduler on top of a virtio-blk device,
      setting the blkio cgroup weight has barely any impact on the observed
      throughput of async IO because __blk_mq_do_dispatch_sched() always
      sucks out all the IO queued in BFQ. BFQ first submits IO from higher
      weight cgroups, but when that is all dispatched, it gives out IO of
      lower weight cgroups as well. We then have to wait for all this IO to
      be dispatched to the disk (which means a lot of it actually has to
      complete) before the IO scheduler is queried again for dispatching
      more requests. This completely destroys any service differentiation.
      
      So grab a request tag for a request pulled out of the IO scheduler
      already in __blk_mq_do_dispatch_sched(), and do not pull any more
      requests if we cannot get one, because we are unlikely to be able to
      dispatch them. That way only a single request is going to wait in the
      dispatch list for a tag to become free.
      Reviewed-by: Ming Lei <ming.lei@redhat.com>
      Signed-off-by: Jan Kara <jack@suse.cz>
      Link: https://lore.kernel.org/r/20210603104721.6309-1-jack@suse.cz
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      61347154
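
The difference between the old and new pulling behaviour can be illustrated with a small counting model. This is a sketch under stated assumptions: `struct sim`, `pull_all()` and `pull_with_tags()` are hypothetical names, and tags, queues and requests are reduced to integer counters rather than the real blk-mq structures.

```c
#include <assert.h>

/* Hypothetical counting model; not the blk-mq data structures. */
struct sim {
    int sched_queued;   /* requests still inside the IO scheduler */
    int free_tags;      /* driver tags currently available */
    int dispatch_len;   /* requests moved onto the dispatch list */
};

/* Old behaviour: drain the scheduler for as long as it hands out work. */
static int pull_all(struct sim *s)
{
    int pulled = 0;

    while (s->sched_queued > 0) {
        s->sched_queued--;
        s->dispatch_len++;
        pulled++;
    }
    return pulled;
}

/* New behaviour: try to grab a tag for each pulled request and stop
 * pulling at the first request that cannot get one. */
static int pull_with_tags(struct sim *s)
{
    int pulled = 0;

    assert(s->free_tags >= 0);
    while (s->sched_queued > 0) {
        s->sched_queued--;
        s->dispatch_len++;
        pulled++;
        if (s->free_tags == 0)
            break;          /* this one request waits for a tag */
        s->free_tags--;
    }
    return pulled;
}
```

With ten queued requests and three free tags, the tag-aware loop moves four requests (three tagged, one waiting) and leaves six inside the scheduler, where weight-based ordering can still apply to them.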
  10. 24 May, 2021 1 commit
  11. 05 Mar, 2021 1 commit
  12. 25 Jan, 2021 1 commit
  13. 13 Dec, 2020 1 commit
  14. 02 Dec, 2020 1 commit
  15. 04 Sep, 2020 5 commits
  16. 02 Jul, 2020 1 commit
  17. 01 Jul, 2020 1 commit
  18. 30 Jun, 2020 3 commits
  19. 29 Jun, 2020 1 commit
  20. 07 Jun, 2020 1 commit
  21. 30 May, 2020 1 commit
  22. 27 Feb, 2020 1 commit
  23. 25 Feb, 2020 1 commit
    • blk-mq: insert passthrough request into hctx->dispatch directly · 01e99aec
      Committed by Ming Lei
      For some reason, a device may be in a state in which it cannot handle
      FS requests, so STS_RESOURCE is always returned and the FS request
      is added to hctx->dispatch. However, a passthrough request may
      be required at that time to fix the problem. If the passthrough
      request is added to the scheduler queue, blk-mq has no chance to
      dispatch it, given that we prioritize requests in hctx->dispatch.
      The FS IO request may then never complete, causing an IO hang.
      
      So the passthrough request has to be added to hctx->dispatch directly
      to fix the IO hang.
      
      Fix this issue by inserting the passthrough request into hctx->dispatch
      directly, together with adding the FS request to the tail of
      hctx->dispatch in blk_mq_dispatch_rq_list(). Actually, we add the FS
      request to the tail of hctx->dispatch by default, see
      blk_mq_request_bypass_insert().
      
      This makes it consistent with the original legacy IO request
      path, in which a passthrough request is always added to q->queue_head.
      
      Cc: Dongli Zhang <dongli.zhang@oracle.com>
      Cc: Christoph Hellwig <hch@infradead.org>
      Cc: Ewan D. Milne <emilne@redhat.com>
      Signed-off-by: Ming Lei <ming.lei@redhat.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      01e99aec
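
The routing rule from this commit can be sketched as a tiny model. It only captures two points stated in the message: passthrough requests bypass the scheduler queue, and the scheduler is not consulted while hctx->dispatch is non-empty. `struct hctx_model`, `insert()` and `scheduler_is_consulted()` are hypothetical names, and list ordering and driver return codes are deliberately left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of one hardware queue; lengths only, no lists. */
struct hctx_model {
    int dispatch_len;   /* models the length of hctx->dispatch */
    int sched_len;      /* models the scheduler queue length */
};

/* Passthrough requests go straight to the dispatch list; FS requests
 * go through the scheduler queue as usual. */
static void insert(struct hctx_model *h, bool passthrough)
{
    assert(h != NULL);
    if (passthrough)
        h->dispatch_len++;
    else
        h->sched_len++;
}

/* The dispatch list has priority: the scheduler queue is only looked
 * at once the dispatch list is empty. */
static bool scheduler_is_consulted(const struct hctx_model *h)
{
    return h->dispatch_len == 0;
}
```

In this model, a stuck FS request sitting in the dispatch list keeps `scheduler_is_consulted()` false, so a passthrough request parked in the scheduler queue would never be reached. Routing it to the dispatch list avoids that.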
  24. 07 Oct, 2019 1 commit
  25. 11 Jul, 2019 1 commit
    • block: Disable write plugging for zoned block devices · b49773e7
      Committed by Damien Le Moal
      Simultaneously writing to a sequential zone of a zoned block device
      from multiple contexts requires mutual exclusion for BIO issuing to
      ensure that writes happen sequentially. However, even for a
      well-behaved user correctly implementing such synchronization, BIO
      plugging may interfere and result in BIOs from the different contexts
      being reordered if plugging is done outside of the mutual exclusion
      section, e.g. when the plug was started by a function higher in the
      call chain than the function issuing BIOs.
      
               Context A                     Context B
      
         | blk_start_plug()
         | ...
         | seq_write_zone()
           | mutex_lock(zone)
           | bio-0->bi_iter.bi_sector = zone->wp
           | zone->wp += bio_sectors(bio-0)
           | submit_bio(bio-0)
           | bio-1->bi_iter.bi_sector = zone->wp
           | zone->wp += bio_sectors(bio-1)
           | submit_bio(bio-1)
           | mutex_unlock(zone)
           | return
         | -----------------------> | seq_write_zone()
                                      | mutex_lock(zone)
                                      | bio-2->bi_iter.bi_sector = zone->wp
                                      | zone->wp += bio_sectors(bio-2)
                                      | submit_bio(bio-2)
                                      | mutex_unlock(zone)
         | <------------------------- |
         | blk_finish_plug()
      
      In the above example, despite the mutex synchronization ensuring the
      correct BIO issuing order 0, 1, 2, context A BIOs 0 and 1 end up being
      issued after BIO 2 of context B, when the plug is released with
      blk_finish_plug().
      
      While this problem can be addressed using the blk_flush_plug_list()
      function (in the above example, the call must be inserted before the
      zone mutex lock is released), a simple generic solution in the block
      layer avoids this additional code in all zoned block device user code.
      The simple generic solution implemented with this patch is to introduce
      the internal helper function blk_mq_plug() to access the current
      context plug on BIO submission. This helper returns the current plug
      only if the target device is not a zoned block device or if the BIO to
      be plugged is not a write operation. Otherwise, the caller context plug
      is ignored and NULL is returned, so that writes to zoned block devices
      are never plugged.
      Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      b49773e7
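
The decision made by the helper described above can be sketched in user-space C. This models only the rule stated in the commit message. `struct plug`, `struct dev`, `struct bio_model` and `pick_plug()` are hypothetical stand-ins, not the kernel's blk_mq_plug() or its types, and the current context plug is passed in explicitly as an assumption of the sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel objects involved. */
struct plug { int pending; };           /* models the context's plug */
struct dev { bool zoned; };             /* models the target device */
struct bio_model { bool is_write; };    /* models the BIO being issued */

/* Return the caller's plug unless the BIO is a write to a zoned
 * device, in which case plugging is skipped by returning NULL. */
static struct plug *pick_plug(const struct dev *d, const struct bio_model *b,
                              struct plug *current_plug)
{
    assert(d != NULL && b != NULL);
    if (!d->zoned || !b->is_write)
        return current_plug;
    return NULL;
}
```

Because only the combination "zoned device and write" returns NULL, reads to zoned devices and all IO to regular devices keep the batching benefit of plugging.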
  26. 03 Jul, 2019 1 commit