1. 09 Jun, 2014 1 commit
  2. 21 Mar, 2014 1 commit
  3. 22 Feb, 2014 1 commit
  4. 08 Feb, 2014 1 commit
  5. 01 Jan, 2014 1 commit
  6. 25 Oct, 2013 2 commits
    • blk-mq: new multi-queue block IO queueing mechanism · 320ae51f
      Jens Axboe committed
      Linux currently has two models for block devices:
      
      - The classic request_fn based approach, where drivers use struct
        request units for IO. The block layer provides various helper
        functionalities to let drivers share code, things like tag
        management, timeout handling, queueing, etc.
      
      - The "stacked" approach, where a driver squeezes in between the
        block layer and IO submitter. Since this bypasses the IO stack,
        drivers generally have to manage everything themselves.
      
      With drivers being written for new high IOPS devices, the classic
      request_fn based driver doesn't work well enough. The design dates
      back to when both SMP and high IOPS were rare. It has problems with
      scaling to bigger machines, and runs into scaling issues even on
      smaller machines when you have IOPS in the hundreds of thousands
      per device.
      
      The stacked approach is then most often selected as the model
      for the driver. But this means that everybody has to re-invent
      everything, and along with that we get all the problems again
      that the shared approach solved.
      
      This commit introduces blk-mq, block multi queue support. The
      design is centered around per-cpu queues for queueing IO, which
      then funnel down into x number of hardware submission queues.
      We might have a 1:1 mapping between the two, or it might be
      an N:M mapping. That all depends on what the hardware supports.
      
      blk-mq provides various helper functions, which include:
      
      - Scalable support for request tagging. Most devices need to
        be able to uniquely identify a request both in the driver and
        to the hardware. The tagging uses per-cpu caches for freed
        tags, to enable cache hot reuse.
      
      - Timeout handling without tracking requests on a per-device
        basis. Basically, the driver should be able to get a notification
        if a request happens to fail.
      
      - Optional support for non 1:1 mappings between issue and
        submission queues. blk-mq can redirect IO completions to the
        desired location.
      
      - Support for per-request payloads. Drivers almost always need
        to associate a request structure with some driver private
        command structure. Drivers can tell blk-mq this at init time,
        and then any request handed to the driver will have the
        required size of memory associated with it.
      
      - Support for merging of IO, and plugging. The stacked model
        gets neither of these. Even for high IOPS devices, merging
        sequential IO reduces per-command overhead and thus
        increases bandwidth.
      
      For now, this is provided as a potential 3rd queueing model, with
      the hope being that, as it matures, it can replace both the classic
      and stacked model. That would get us back to having just 1 real
      model for block devices, leaving the stacked approach to dm/md
      devices (as it was originally intended).
      
      Contributions in this patch from the following people:
      
      Shaohua Li <shli@fusionio.com>
      Alexander Gordeev <agordeev@redhat.com>
      Christoph Hellwig <hch@infradead.org>
      Mike Christie <michaelc@cs.wisc.edu>
      Matias Bjorling <m@bjorling.me>
      Jeff Moyer <jmoyer@redhat.com>
      Acked-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
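The funnel from per-cpu queues down to hardware submission queues can be pictured with a small user-space sketch. This is not blk-mq code: the helper name `map_cpu_to_hw_queue` and the modulo policy are assumptions made only for illustration, and the mapping the kernel actually uses depends on the hardware and the driver.

```c
#include <assert.h>

/*
 * Hypothetical helper (not a kernel function): choose a hardware
 * submission queue for the per-cpu queue of a given CPU.
 *
 * Assumption: a plain modulo policy. With as many hardware queues as
 * CPUs this gives a 1:1 mapping; with fewer hardware queues several
 * CPUs share one queue (an N:M mapping).
 */
unsigned int map_cpu_to_hw_queue(unsigned int cpu, unsigned int nr_hw_queues)
{
    if (nr_hw_queues == 0)
        return 0; /* degenerate input: behave as if there were one queue */
    return cpu % nr_hw_queues;
}
```

With 8 CPUs and 8 hardware queues every CPU gets its own queue; with 4 hardware queues, CPUs 1 and 5 both land on queue 1.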
    • block: remove request ref_count · 71fe07d0
      Christoph Hellwig committed
      This reference count has been around since before git history, but the only
      place where it's used is in blk_execute_rq, and there it is entirely useless
      as it is incremented before submitting the request and decremented in the
      end_io handler before waking up the submitter thread.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
  7. 18 Sep, 2013 1 commit
  8. 15 Feb, 2013 1 commit
    • block: account iowait time when waiting for completion of IO request · 5577022f
      Vladimir Davydov committed
      Using wait_for_completion() to wait for an IO request to be executed
      results in wrong iowait time accounting. For example, a system whose
      only task is doing write() and fdatasync() on a block device can be
      reported as idle instead of iowaiting, as it should be, because
      blkdev_issue_flush() calls wait_for_completion(), which in turn calls
      schedule(), which does not increment the iowait proc counter and thus
      does not turn on iowait time accounting.
      
      The patch makes block layer use wait_for_completion_io() instead of
      wait_for_completion() where appropriate to account iowait time
      correctly.
      Signed-off-by: Vladimir Davydov <vdavydov@parallels.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
  9. 08 Feb, 2013 1 commit
  10. 06 Dec, 2012 2 commits
    • block: Avoid that request_fn is invoked on a dead queue · c246e80d
      Bart Van Assche committed
      A block driver may start cleaning up resources needed by its
      request_fn as soon as blk_cleanup_queue() has finished, so request_fn
      must not be invoked after draining has finished. This is important
      when blk_run_queue() is invoked without any requests in progress.
      As an example, if blk_drain_queue() and scsi_run_queue() run in
      parallel, blk_drain_queue() may have finished all requests after
      scsi_run_queue() has taken a SCSI device off the starved list but
      before that last function has had a chance to run the queue.
      Signed-off-by: Bart Van Assche <bvanassche@acm.org>
      Cc: James Bottomley <JBottomley@Parallels.com>
      Cc: Mike Christie <michaelc@cs.wisc.edu>
      Cc: Chanho Min <chanho.min@lge.com>
      Acked-by: Tejun Heo <tj@kernel.org>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
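The rule this commit enforces, never calling request_fn once the queue is dead, can be sketched in user-space C. The `struct queue` layout and the names `run_queue_if_alive` and `count_run` are hypothetical stand-ins rather than the kernel's definitions, and the sketch deliberately ignores the locking that a real implementation needs to make the check race-free.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a request queue. */
struct queue {
    bool dead;                            /* set once draining has finished */
    int run_count;                        /* how often request_fn was invoked */
    void (*request_fn)(struct queue *q);  /* driver callback */
};

/* Example driver callback: only records that it ran. */
void count_run(struct queue *q)
{
    q->run_count++;
}

/*
 * Invoke request_fn only while the queue is still alive.
 * Returns true if request_fn was called, false if the call was skipped.
 */
bool run_queue_if_alive(struct queue *q)
{
    if (q->dead || q->request_fn == NULL)
        return false;   /* driver resources may already be gone */
    q->request_fn(q);
    return true;
}
```

Once `dead` is set, a late caller such as the scsi_run_queue() path described above gets `false` back instead of entering driver code whose resources may already have been released.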
    • block: Rename queue dead flag · 3f3299d5
      Bart Van Assche committed
      QUEUE_FLAG_DEAD is used to indicate that queuing new requests must
      stop. After this flag has been set queue draining starts. However,
      during the queue draining phase it is still safe to invoke the
      queue's request_fn, so QUEUE_FLAG_DYING is a better name for this
      flag.
      
      This patch has been generated by running the following command
      over the kernel source tree:
      
      git grep -lEw 'blk_queue_dead|QUEUE_FLAG_DEAD' |
          xargs sed -i.tmp -e 's/blk_queue_dead/blk_queue_dying/g'      \
              -e 's/QUEUE_FLAG_DEAD/QUEUE_FLAG_DYING/g';                \
      sed -i.tmp -e "s/QUEUE_FLAG_DYING$(printf \\t)*5/QUEUE_FLAG_DYING$(printf \\t)5/g" \
          include/linux/blkdev.h;                                       \
      sed -i.tmp -e 's/ DEAD/ DYING/g' -e 's/dead queue/a dying queue/' \
          -e 's/Dead queue/A dying queue/' block/blk-core.c
      Signed-off-by: Bart Van Assche <bvanassche@acm.org>
      Acked-by: Tejun Heo <tj@kernel.org>
      Cc: James Bottomley <JBottomley@Parallels.com>
      Cc: Mike Christie <michaelc@cs.wisc.edu>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Chanho Min <chanho.min@lge.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
  11. 23 Nov, 2012 1 commit
    • block: Don't access request after it might be freed · 893d290f
      Roland Dreier committed
      After we've done __elv_add_request() and __blk_run_queue() in
      blk_execute_rq_nowait(), the request might finish and be freed
      immediately.  Therefore checking if the type is REQ_TYPE_PM_RESUME
      isn't safe afterwards, because if it isn't, rq might be gone.
      Instead, check beforehand and stash the result in a temporary.
      
      This fixes crashes in blk_execute_rq_nowait() I get occasionally when
      running with lots of memory debugging options enabled -- I think this
      race is usually harmless because the window for rq to be reallocated
      is so small.
      Signed-off-by: Roland Dreier <roland@purestorage.com>
      Cc: stable@kernel.org
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
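The "check beforehand and stash the result in a temporary" pattern can be shown in a few lines of user-space C. The types and names here (`enum req_type`, `submit_and_complete`, `execute_nowait`) are hypothetical stand-ins for the kernel code, and freeing the request directly inside the submit helper is an assumption that models the worst case, where the request completes immediately.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the kernel's request types. */
enum req_type { REQ_NORMAL, REQ_PM_RESUME };

struct request {
    enum req_type type;
};

/* Models submission where the request finishes and is freed at once. */
static void submit_and_complete(struct request *rq)
{
    free(rq);
}

/*
 * Submit rq and report whether it was a PM-resume request.
 * The type is read BEFORE submission, because rq may no longer
 * exist once submit_and_complete() returns.
 */
bool execute_nowait(struct request *rq)
{
    bool is_pm_resume = (rq->type == REQ_PM_RESUME); /* stash first */

    submit_and_complete(rq);  /* rq must not be dereferenced after this */

    return is_pm_resume;      /* uses only the stashed copy */
}
```

Reading `rq->type` after the submit call would be a use-after-free in this model, which is the kind of access that memory debugging options tend to expose.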
  12. 20 Jul, 2012 1 commit
  13. 14 Dec, 2011 2 commits
  14. 22 Jul, 2011 1 commit
    • [SCSI] fix crash in scsi_dispatch_cmd() · bfe159a5
      James Bottomley committed
      USB surprise removal of sr is triggering an oops in
      scsi_dispatch_cmd().  What seems to be happening is that USB is
      hanging on to a queue reference until the last close of the upper
      device, so the crash is caused by surprise remove of a mounted CD
      followed by attempted unmount.
      
      The problem is that USB doesn't issue its final commands as part of
      the SCSI teardown path, but on last close when the block queue is long
      gone.  The long term fix is probably to make sr do the teardown in the
      same way as sd (so remove all the lower bits on ejection, but keep the
      upper disk alive until last close of user space).  However, the
      current oops can be simply fixed by not allowing any commands to be
      sent to a dead queue.
      
      Cc: stable@kernel.org
      Signed-off-by: James Bottomley <JBottomley@Parallels.com>
  15. 06 May, 2011 1 commit
  16. 18 Apr, 2011 1 commit
  17. 10 Mar, 2011 1 commit
  18. 24 Sep, 2010 1 commit
    • block: Prevent hang_check firing during long I/O · 4b197769
      Mark Lord committed
      During long I/O operations, the hang_check timer may fire and
      trigger stack dumps that unnecessarily alarm the user.
      
      Eg.  hdparm --security-erase NULL /dev/sdb  ## can take *hours* to complete
      
      So, if hang_check is armed, we should wake up periodically
      to prevent it from triggering.  This patch uses a wake-up interval
      equal to half the hang_check timer period, which keeps overhead low enough.
      Signed-off-by: Mark Lord <mlord@pobox.com>
      Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
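The arithmetic behind the periodic wake-up can be sketched as a small user-space simulation. The function names `wakeup_interval` and `count_wakeups` are hypothetical, times are whole seconds for simplicity, and treating a zero interval as "hang_check not armed" is an assumption of this sketch rather than a statement about the kernel code.

```c
#include <assert.h>

/* Wake-up interval: half of the hang_check timer period (in seconds). */
unsigned long wakeup_interval(unsigned long hang_check_secs)
{
    return hang_check_secs / 2;
}

/*
 * Simulate waiting for an I/O that takes io_secs to complete.
 * Returns how many timed wake-ups happen before completion.
 * Each individual sleep lasts at most half the hang_check period,
 * so the waiter never sleeps long enough for the timer to fire.
 */
unsigned long count_wakeups(unsigned long io_secs, unsigned long hang_check_secs)
{
    unsigned long interval = wakeup_interval(hang_check_secs);
    unsigned long waited = 0;
    unsigned long wakeups = 0;

    if (interval == 0)
        return 0;   /* assumption: not armed, one untimed wait suffices */

    /* Keep sleeping in interval-sized steps until the I/O is done. */
    while (waited + interval < io_secs) {
        waited += interval;
        wakeups++;
    }
    return wakeups;
}
```

For a one-hour operation with a 120-second hang_check period, the waiter wakes 59 times, once per minute, which is negligible overhead compared with the duration of the I/O.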
  19. 08 Aug, 2010 1 commit
  20. 28 Apr, 2009 1 commit
    • block: don't set REQ_NOMERGE unnecessarily · e4025f6c
      Tejun Heo committed
      RQ_NOMERGE_FLAGS already clearly defines which REQ flags aren't
      mergeable.  There is no reason to specify it superfluously.  It only
      adds to confusion.  Don't set REQ_NOMERGE for barriers and requests
      with a specific queueing directive.  REQ_NOMERGE is now exclusively used
      by the merging code.
      
      [ Impact: cleanup ]
      Signed-off-by: Tejun Heo <tj@kernel.org>
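The idea that a single mask already decides mergeability, so no extra flag has to be set on barrier-like requests, can be sketched as follows. The flag names and bit values are hypothetical and do not match the kernel's REQ_* definitions; only the masking technique is the point.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flag bits, NOT the kernel's actual REQ_* values. */
enum {
    REQ_F_NOMERGE = 1 << 0,  /* set only by the merging code itself */
    REQ_F_BARRIER = 1 << 1,  /* barrier requests are never merged */
    REQ_F_STARTED = 1 << 2,  /* already handed to the driver */
    REQ_F_SYNC    = 1 << 3   /* unrelated to merging */
};

/* One mask lists every flag that rules out merging. */
#define NOMERGE_MASK (REQ_F_NOMERGE | REQ_F_BARRIER | REQ_F_STARTED)

/* A request is mergeable if none of the masked flags is set. */
bool rq_mergeable(unsigned int flags)
{
    return (flags & NOMERGE_MASK) == 0;
}
```

Because the barrier bit is part of the mask, a barrier request is rejected for merging without anyone also setting the no-merge bit on it.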
  21. 09 Oct, 2008 1 commit
  22. 16 Jul, 2008 2 commits
  23. 01 Feb, 2008 1 commit
  24. 30 Jan, 2008 1 commit