1. 18 Mar 2020, 1 commit
    • alinux: blk: add iohang check function · 80d6ee24
      Submitted by Xiaoguang Wang
      Background:
        We do not have a dependable block layer interface to determine whether
      a block device has I/O requests that have gone uncompleted for a long
      time. The existing 'in_flight' interface counts the number of I/O
      requests that have been issued to the device driver but have not yet
      completed; it does not include I/O requests that are in the queue but
      not yet issued to the device driver, so it will not count I/O requests
      that are stuck in the block layer.
        Also, with a steady stream of I/O requests issued to the device driver,
      'in_flight' may always be non-zero, yet that still does not tell you
      whether any single I/O request has been outstanding for too long.
      
      Solution:
        To find I/O requests that have not been completed for too long, add
      three new interfaces:
        /sys/block/vdb/queue/hang_threshold
      If an I/O request's running time exceeds this value, count that I/O as
      hung.
      
        /sys/block/vdb/hang
      Show the hang counters for read and write I/O requests.
      
        /sys/kernel/debug/block/vdb/rq_hang
      Show detailed info for all hung I/O requests, like below:
        ffff97db96301200 {.op=WRITE, .cmd_flags=SYNC, .rq_flags=STARTED|
      ELVPRIV|IO_STAT|STATS, .state=in_flight, .tag=30, .internal_tag=169,
      .start_time_ns=140634088407, .io_start_time_ns=140634102958,
      .current_time=146497371953, .bio = ffff97db91e8e000,
      .bio_pages = { ffffd096a0602540 }, .bio = ffff97db91e8ec00,
      .bio_pages = { ffffd096a070eec0 }, .bio = ffff97db91e8f600,
      .bio_pages = { ffffd096a0424cc0 }, .bio = ffff97db91e8f300,
      .bio_pages = { ffffd096a0600a80 }}
      
      With the above info, we can easily see this request's latency
      distribution; see the next patch for the usage of bio_pages.
      
      Note: /sys/kernel/debug/block/vdb/rq_hang only exists for blk-mq device
      drivers and requires CONFIG_BLK_DEBUG_FS to be enabled.
      Signed-off-by: Xiaoguang Wang <xiaoguang.wang@linux.alibaba.com>
      Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com>
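
      A minimal userspace sketch of how these interfaces could be exercised.
      The device name vdb and the three paths come from the commit message
      above; the threshold value written below and its unit are assumptions,
      not documented behaviour.

      /*
       * Sketch: raise the per-queue hang threshold, then read the per-device
       * read/write hang counters.
       */
      #include <stdio.h>
      #include <stdlib.h>

      static int write_sysfs(const char *path, const char *val)
      {
              FILE *f = fopen(path, "w");

              if (!f)
                      return -1;
              fputs(val, f);
              fclose(f);
              return 0;
      }

      int main(void)
      {
              char buf[256];
              FILE *f;

              /* Count an I/O as hung once it runs longer than this (assumed unit). */
              if (write_sysfs("/sys/block/vdb/queue/hang_threshold", "5000"))
                      perror("hang_threshold");

              /* Read/write hang counters for the whole device. */
              f = fopen("/sys/block/vdb/hang", "r");
              if (!f) {
                      perror("hang");
                      return EXIT_FAILURE;
              }
              while (fgets(buf, sizeof(buf), f))
                      fputs(buf, stdout);
              fclose(f);

              /* Per-request detail (blk-mq only, needs CONFIG_BLK_DEBUG_FS):
               * /sys/kernel/debug/block/vdb/rq_hang */
              return 0;
      }
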
  2. 04 Aug 2019, 1 commit
  3. 09 Jul 2018, 3 commits
  4. 21 Jun 2018, 1 commit
  5. 29 May 2018, 1 commit
    • blk-mq: Remove generation sequence · 12f5b931
      Submitted by Keith Busch
      This patch simplifies the timeout handling by relying on the request
      reference counting to ensure the iterator is operating on an inflight
      and truly timed out request. Since the reference counting prevents the
      tag from being reallocated, the block layer no longer needs to prevent
      drivers from completing their requests while the timeout handler is
      operating on them: a driver completing a request is allowed to proceed to
      the next state without additional synchronization with the block layer.
      
      This also removes any need for generation sequence numbers, since a
      request cannot be reallocated as a new sequence while timeout handling
      is operating on it.
      
      To enable this, a refcount is added to struct request so that request
      users can be sure they're operating on the same request without it
      changing while they're processing it. The request's tag won't be
      released for reuse until both the timeout handler and the completion
      are done with it.
      Signed-off-by: Keith Busch <keith.busch@intel.com>
      [hch: slight cleanups, added back submission side hctx lock, use cmpxchg
       for completions]
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
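
      A userspace sketch of the lifetime rule described in this commit, using
      C11 atomics in place of the kernel's refcount and cmpxchg helpers. The
      struct layout, state names, and the two reference holders (completion
      and timeout) are illustrative, not the kernel code.

      #include <stdatomic.h>
      #include <stdio.h>

      enum rq_state { RQ_IDLE, RQ_IN_FLIGHT, RQ_COMPLETE };

      struct request {
              _Atomic int state;      /* completion and timeout race on this */
              atomic_int refcount;    /* one reference per user of the request */
              int tag;
      };

      /* Only one of completion/timeout wins the IN_FLIGHT -> COMPLETE switch. */
      static int try_complete(struct request *rq)
      {
              int expected = RQ_IN_FLIGHT;

              return atomic_compare_exchange_strong(&rq->state, &expected,
                                                    RQ_COMPLETE);
      }

      static void put_request(struct request *rq)
      {
              /* The last reference to drop releases the tag for reuse. */
              if (atomic_fetch_sub(&rq->refcount, 1) == 1)
                      printf("tag %d released for reuse\n", rq->tag);
      }

      int main(void)
      {
              struct request rq = { RQ_IN_FLIGHT, 2, 30 };

              if (try_complete(&rq))
                      printf("completion path wins\n");
              if (!try_complete(&rq))
                      printf("timeout handler sees an already completed request\n");

              put_request(&rq);       /* completion path done */
              put_request(&rq);       /* timeout handler done; tag now reusable */
              return 0;
      }
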
  6. 10 Apr 2018, 1 commit
  7. 18 Mar 2018, 1 commit
  8. 01 Mar 2018, 2 commits
  9. 25 Jan 2018, 1 commit
    • blk-mq-debugfs: don't allow write on attributes with seq_operations set · 6b136a24
      Submitted by Eryu Guan
      Attributes that only implement .seq_ops are read-only; any write to
      them should be rejected. But currently the kernel crashes when writing
      to such debugfs entries, e.g.
      
      chmod +w /sys/kernel/debug/block/<dev>/requeue_list
      echo 0 > /sys/kernel/debug/block/<dev>/requeue_list
      chmod -w /sys/kernel/debug/block/<dev>/requeue_list
      
      Fix it by returning -EPERM in blk_mq_debugfs_write() when writing to
      such attributes.
      
      Cc: Ming Lei <ming.lei@redhat.com>
      Signed-off-by: Eryu Guan <eguan@redhat.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
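
      A userspace sketch of the shape of this fix, not the kernel's actual
      blk_mq_debugfs code: a write dispatcher that rejects attributes which
      only provide a read-side (seq_ops-like) interface. Types and names are
      illustrative.

      #include <errno.h>
      #include <stdio.h>

      struct debugfs_attr {
              const char *name;
              int (*show)(void);              /* read side, seq_ops-like */
              int (*write)(const char *buf);  /* optional write hook */
      };

      static int attr_write(const struct debugfs_attr *attr, const char *buf)
      {
              /* No write hook means the attribute is read-only: reject. */
              if (!attr->write)
                      return -EPERM;
              return attr->write(buf);
      }

      static int show_requeue_list(void) { return 0; }

      int main(void)
      {
              struct debugfs_attr requeue_list = {
                      "requeue_list", show_requeue_list, NULL
              };
              int ret = attr_write(&requeue_list, "0");

              printf("write to %s: %s\n", requeue_list.name,
                     ret == -EPERM ? "rejected with -EPERM" : "accepted");
              return 0;
      }
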
  10. 13 Jan 2018, 1 commit
    • blk-mq: add missing RQF_STARTED to debugfs · 85ba3eff
      Submitted by Jens Axboe
      Looking at debug output, we see:
      
      ./000000009ddfa913/requeue_list:000000009646711c {.op=READ, .state=idle,
      gen=0x118, abort_gen=0x0, .cmd_flags=,
      .rq_flags=SORTED|1|SOFTBARRIER|IO_STAT, complete=0, .tag=-1,
      .internal_tag=217}
      
      Note the '1' between SORTED and SOFTBARRIER - that's because no name
      was defined for RQF_STARTED. Fixed that.
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
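
      A userspace sketch of why the bare '1' appears: flag decoding prints
      the bit number whenever the name table has a gap, so adding the missing
      name restores readable output. The flag positions below are
      illustrative, not the kernel's RQF_* layout.

      #include <stdio.h>

      static const char *const rqf_name[] = {
              [0] = "SORTED",
              [1] = NULL,             /* imagine RQF_STARTED missing its name */
              [2] = "SOFTBARRIER",
              [3] = "IO_STAT",
      };

      static void print_flags(unsigned long flags)
      {
              const char *sep = "";

              for (unsigned int bit = 0;
                   bit < sizeof(rqf_name) / sizeof(rqf_name[0]); bit++) {
                      if (!(flags & (1UL << bit)))
                              continue;
                      if (rqf_name[bit])
                              printf("%s%s", sep, rqf_name[bit]);
                      else
                              printf("%s%u", sep, bit);  /* unnamed bit -> number */
                      sep = "|";
              }
              putchar('\n');
      }

      int main(void)
      {
              print_flags(0xfUL);     /* prints SORTED|1|SOFTBARRIER|IO_STAT */
              return 0;
      }
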
  11. 11 Jan 2018, 3 commits
  12. 10 Jan 2018, 1 commit
    • blk-mq: remove REQ_ATOM_STARTED · 5a61c363
      Submitted by Tejun Heo
      After the recent updates to use generation-number and state-based
      synchronization, we can easily replace REQ_ATOM_STARTED usages by
      adding an extra state that distinguishes requests which are completed
      but not yet freed.
      
      Add MQ_RQ_COMPLETE and replace REQ_ATOM_STARTED usages with
      blk_mq_rq_state() tests.  REQ_ATOM_STARTED no longer has any users
      left and is removed.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
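
      A userspace sketch of the idea, with no claim about the kernel's actual
      struct request layout: once a request carries an explicit state that
      includes a 'complete' value, the old REQ_ATOM_STARTED bit reduces to a
      plain state test.

      #include <stdio.h>

      enum mq_rq_state { MQ_RQ_IDLE, MQ_RQ_IN_FLIGHT, MQ_RQ_COMPLETE };

      struct request {
              enum mq_rq_state state;
      };

      static enum mq_rq_state blk_mq_rq_state(const struct request *rq)
      {
              return rq->state;
      }

      int main(void)
      {
              struct request rq = { MQ_RQ_IN_FLIGHT };

              /* Former REQ_ATOM_STARTED test: started == no longer idle. */
              if (blk_mq_rq_state(&rq) != MQ_RQ_IDLE)
                      printf("request has started\n");

              rq.state = MQ_RQ_COMPLETE;
              if (blk_mq_rq_state(&rq) == MQ_RQ_COMPLETE)
                      printf("completed but not yet freed\n");
              return 0;
      }
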
  13. 11 Nov 2017, 2 commits
  14. 06 Oct 2017, 1 commit
  15. 04 Oct 2017, 1 commit
  16. 25 Aug 2017, 1 commit
  17. 18 Aug 2017, 1 commit
  18. 10 Aug 2017, 1 commit
  19. 28 Jun 2017, 1 commit
  20. 02 Jun 2017, 4 commits
  21. 04 May 2017, 11 commits