1. 23 3月, 2017 5 次提交
  2. 22 3月, 2017 3 次提交
    • O
      blk-stat: convert to callback-based statistics reporting · 34dbad5d
      Omar Sandoval 提交于
      Currently, statistics are gathered in ~0.13s windows, and users grab the
      statistics whenever they need them. This is not ideal for both in-tree
      users:
      
      1. Writeback throttling wants its own dynamically sized window of
         statistics. Since the blk-stats statistics are reset after every
         window and the wbt windows don't line up with the blk-stats windows,
         wbt doesn't see every I/O.
      2. Polling currently grabs the statistics on every I/O. Again, depending
         on how the window lines up, we may miss some I/Os. It's also
         unnecessary overhead to get the statistics on every I/O; the hybrid
         polling heuristic would be just as happy with the statistics from the
         previous full window.
      
      This reworks the blk-stats infrastructure to be callback-based: users
      register a callback that they want called at a given time with all of
      the statistics from the window during which the callback was active.
      Users can dynamically bucketize the statistics. wbt and polling both
      currently use read vs. write, but polling can be extended to further
      subdivide based on request size.
      
      The callbacks are kept on an RCU list, and each callback has percpu
      stats buffers. There will only be a few users, so the overhead on the
      I/O completion side is low. The stats flushing is also simplified
      considerably: since the timer function is responsible for clearing the
      statistics, we don't have to worry about stale statistics.
      
      wbt is a trivial conversion. After the conversion, the windowing problem
      mentioned above is fixed.
      
      For polling, we register an extra callback that caches the previous
      window's statistics in the struct request_queue for the hybrid polling
      heuristic to use.
      
      Since we no longer have a single stats buffer for the request queue,
      this also removes the sysfs and debugfs stats entries. To replace those,
      we add a debugfs entry for the poll statistics.
      Signed-off-by: NOmar Sandoval <osandov@fb.com>
      Signed-off-by: NJens Axboe <axboe@fb.com>
      34dbad5d
    • O
      blk-stat: use READ and WRITE instead of BLK_STAT_{READ,WRITE} · fa2e39cb
      Omar Sandoval 提交于
      The stats buckets will become generic soon, so make the existing users
      use the common READ and WRITE definitions instead of one internal to
      blk-stat.
      Signed-off-by: NOmar Sandoval <osandov@fb.com>
      Signed-off-by: NJens Axboe <axboe@fb.com>
      fa2e39cb
    • O
      block: remove extra calls to wbt_exit() · 0315b159
      Omar Sandoval 提交于
      We always call wbt_exit() from blk_release_queue(), so these are
      unnecessary.
      Signed-off-by: NOmar Sandoval <osandov@fb.com>
      Signed-off-by: NJens Axboe <axboe@fb.com>
      0315b159
  3. 15 3月, 2017 1 次提交
  4. 09 3月, 2017 4 次提交
  5. 03 3月, 2017 1 次提交
    • J
      blk-mq: ensure that bd->last is always set correctly · 113285b4
      Jens Axboe 提交于
      When drivers are called with a request in blk-mq, blk-mq flags the
      state such that the driver knows if this is the last request in
      this call chain or not. The driver can then use that information
      to defer kicking off IO until bd->last is true. However, with blk-mq
      and scheduling, we need to allocate a driver tag for a request before
      it can be issued. If we fail to allocate such a tag, we could end up
      in the situation where the last request issued did not have
      bd->last == true set. This can then cause a driver hang.
      
      This fixes a hang with virtio-blk, which uses bd->last as a hint
      on whether to kick the queue or not.
      Reported-by: NChris Mason <clm@fb.com>
      Tested-by: NChris Mason <clm@fb.com>
      Reviewed-by: NOmar Sandoval <osandov@fb.com>
      Signed-off-by: NJens Axboe <axboe@fb.com>
      113285b4
  6. 02 3月, 2017 8 次提交
  7. 24 2月, 2017 1 次提交
    • O
      blk-mq: use sbq wait queues instead of restart for driver tags · da55f2cc
      Omar Sandoval 提交于
      Commit 50e1dab8 ("blk-mq-sched: fix starvation for multiple hardware
      queues and shared tags") fixed one starvation issue for shared tags.
      However, we can still get into a situation where we fail to allocate a
      tag because all tags are allocated but we don't have any pending
      requests on any hardware queue.
      
      One solution for this would be to restart all queues that share a tag
      map, but that really sucks. Ideally, we could just block and wait for a
      tag, but that isn't always possible from blk_mq_dispatch_rq_list().
      
      However, we can still use the struct sbitmap_queue wait queues with a
      custom callback instead of blocking. This has a few benefits:
      
      1. It avoids iterating over all hardware queues when completing an I/O,
         which the current restart code has to do.
      2. It benefits from the existing rolling wakeup code.
      3. It avoids punting to another thread just to have it block.
      Signed-off-by: NOmar Sandoval <osandov@fb.com>
      Signed-off-by: NJens Axboe <axboe@fb.com>
      da55f2cc
  8. 18 2月, 2017 2 次提交
  9. 11 2月, 2017 1 次提交
  10. 09 2月, 2017 2 次提交
  11. 03 2月, 2017 1 次提交
  12. 01 2月, 2017 1 次提交
    • J
      blk-mq: don't fail allocating driver tag for stopped hw queue · 12d70958
      Jens Axboe 提交于
      We rely on blk_mq_get_driver_tag() not failing if 'wait' is true,
      but it currently fails in that case if the queue happens to be
      stopped at the time of the call.
      
      We don't need to check for stopped here, it's just assigning
      the tag. If the queue is stopped, we'll handle it when
      attempting to run the queue.
      
      This fixes a stall/crash on flush intensive workloads, where
      we proceed to process a flush that doesn't have a valid tag
      assigned.
      Signed-off-by: NJens Axboe <axboe@fb.com>
      12d70958
  13. 28 1月, 2017 3 次提交
  14. 27 1月, 2017 7 次提交