1. 12 4月, 2011 4 次提交
  2. 11 4月, 2011 1 次提交
    • N
      block: splice plug list to local context · 109b8129
      NeilBrown 提交于
      If the request_fn ends up blocking, we could be re-entering
      the plug flush. Since the list is protected by explicitly
      not allowing schedule events, this isn't a terribly good idea.
      
      Additionally, it can cause us to recurse. As request_fn called by
      __blk_run_queue is allowed to 'schedule()' (after dropping the queue
      lock of course), it is possible to get a recursive call:
      
       schedule -> blk_flush_plug -> __blk_finish_plug -> flush_plug_list
            -> __blk_run_queue -> request_fn -> schedule
      
      We must make sure that the second schedule does not call into
      blk_flush_plug again.  So instead of leaving the list of requests on
      blk_plug->list, move them to a separate list leaving blk_plug->list
      empty.
      Signed-off-by: NJens Axboe <jaxboe@fusionio.com>
      109b8129
  3. 06 4月, 2011 6 次提交
  4. 31 3月, 2011 1 次提交
  5. 26 3月, 2011 2 次提交
  6. 23 3月, 2011 5 次提交
  7. 21 3月, 2011 1 次提交
    • J
      block: attempt to merge with existing requests on plug flush · 5e84ea3a
      Jens Axboe 提交于
      One of the disadvantages of on-stack plugging is that we potentially
      lose out on merging since all pending IO isn't always visible to
      everybody. When we flush the on-stack plugs, right now we don't do
      any checks to see if potential merge candidates could be utilized.
      
      Correct this by adding a new insert variant, ELEVATOR_INSERT_SORT_MERGE.
      It works just ELEVATOR_INSERT_SORT, but first checks whether we can
      merge with an existing request before doing the insertion (if we fail
      merging).
      
      This fixes a regression with multiple processes issuing IO that
      can be merged.
      
      Thanks to Shaohua Li <shaohua.li@intel.com> for testing and fixing
      an accounting bug.
      Signed-off-by: NJens Axboe <jaxboe@fusionio.com>
      5e84ea3a
  8. 17 3月, 2011 1 次提交
  9. 12 3月, 2011 2 次提交
  10. 11 3月, 2011 1 次提交
    • L
      block: fix mis-synchronisation in blkdev_issue_zeroout() · 0aeea189
      Lukas Czerner 提交于
      BZ29402
      https://bugzilla.kernel.org/show_bug.cgi?id=29402
      
      We can hit serious mis-synchronization in bio completion path of
      blkdev_issue_zeroout() leading to a panic.
      
      The problem is that when we are going to wait_for_completion() in
      blkdev_issue_zeroout() we check if the bb.done equals issued (number of
      submitted bios). If it does, we can skip the wait_for_completition()
      and just out of the function since there is nothing to wait for.
      However, there is a ordering problem because bio_batch_end_io() is
      calling atomic_inc(&bb->done) before complete(), hence it might seem to
      blkdev_issue_zeroout() that all bios has been completed and exit. At
      this point when bio_batch_end_io() is going to call complete(bb->wait),
      bb and wait does not longer exist since it was allocated on stack in
      blkdev_issue_zeroout() ==> panic!
      
      (thread 1)                      (thread 2)
      bio_batch_end_io()              blkdev_issue_zeroout()
        if(bb) {                      ...
          if (bb->end_io)             ...
            bb->end_io(bio, err);     ...
          atomic_inc(&bb->done);      ...
          ...                         while (issued != atomic_read(&bb.done))
          ...                         (let issued == bb.done)
          ...                         (do the rest of the function)
          ...                         return ret;
          complete(bb->wait);
          ^^^^^^^^
          panic
      
      We can fix this easily by simplifying bio_batch and completion counting.
      
      Also remove bio_end_io_t *end_io since it is not used.
      Signed-off-by: NLukas Czerner <lczerner@redhat.com>
      Reported-by: NEric Whitney <eric.whitney@hp.com>
      Tested-by: NEric Whitney <eric.whitney@hp.com>
      Reviewed-by: NJeff Moyer <jmoyer@redhat.com>
      CC: Dmitry Monakhov <dmonakhov@openvz.org>
      Signed-off-by: NJens Axboe <jaxboe@fusionio.com>
      0aeea189
  11. 10 3月, 2011 6 次提交
    • V
      blk-throttle: Use blk_plug in throttle dispatch · 69d60eb9
      Vivek Goyal 提交于
      Use plug in throttle dispatch also as we are dispatching a bunch of
      bios in throttle context and some of them might merge.
      Signed-off-by: NVivek Goyal <vgoyal@redhat.com>
      Signed-off-by: NJens Axboe <jaxboe@fusionio.com>
      69d60eb9
    • J
      block: kill off REQ_UNPLUG · 721a9602
      Jens Axboe 提交于
      With the plugging now being explicitly controlled by the
      submitter, callers need not pass down unplugging hints
      to the block layer. If they want to unplug, it's because they
      manually plugged on their own - in which case, they should just
      unplug at will.
      Signed-off-by: NJens Axboe <jaxboe@fusionio.com>
      721a9602
    • J
      block: remove per-queue plugging · 7eaceacc
      Jens Axboe 提交于
      Code has been converted over to the new explicit on-stack plugging,
      and delay users have been converted to use the new API for that.
      So lets kill off the old plugging along with aops->sync_page().
      Signed-off-by: NJens Axboe <jaxboe@fusionio.com>
      7eaceacc
    • J
      block: initial patch for on-stack per-task plugging · 73c10101
      Jens Axboe 提交于
      This patch adds support for creating a queuing context outside
      of the queue itself. This enables us to batch up pieces of IO
      before grabbing the block device queue lock and submitting them to
      the IO scheduler.
      
      The context is created on the stack of the process and assigned in
      the task structure, so that we can auto-unplug it if we hit a schedule
      event.
      
      The current queue plugging happens implicitly if IO is submitted to
      an empty device, yet callers have to remember to unplug that IO when
      they are going to wait for it. This is an ugly API and has caused bugs
      in the past. Additionally, it requires hacks in the vm (->sync_page()
      callback) to handle that logic. By switching to an explicit plugging
      scheme we make the API a lot nicer and can get rid of the ->sync_page()
      hack in the vm.
      Signed-off-by: NJens Axboe <jaxboe@fusionio.com>
      73c10101
    • J
      block: add API for delaying work/request_fn a little bit · 3cca6dc1
      Jens Axboe 提交于
      Currently we use plugging for that, but as plugging is going away,
      we need an alternative mechanism.
      Signed-off-by: NJens Axboe <jaxboe@fusionio.com>
      3cca6dc1
    • T
      block: Don't implicitly trigger event check on disk_unblock_events() · facc31dd
      Tejun Heo 提交于
      Currently, disk_unblock_events() implicitly kick event check if the
      block count reaches zero.  This behavior is not described in the
      comment and hinders with future changes.  Make the unblocker
      explicitly check events by calling disk_check_events() as necessary.
      
      This patch doesn't cause any behavior difference.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Kay Sievers <kay.sievers@vrfy.org>
      facc31dd
  12. 09 3月, 2011 1 次提交
  13. 08 3月, 2011 2 次提交
  14. 07 3月, 2011 3 次提交
  15. 03 3月, 2011 3 次提交
    • V
      block: Move blk_throtl_exit() call to blk_cleanup_queue() · da527770
      Vivek Goyal 提交于
      Move blk_throtl_exit() in blk_cleanup_queue() as blk_throtl_exit() is
      written in such a way that it needs queue lock. In blk_release_queue()
      there is no gurantee that ->queue_lock is still around.
      
      Initially blk_throtl_exit() was in blk_cleanup_queue() but Ingo reported
      one problem.
      
        https://lkml.org/lkml/2010/10/23/86
      
        And a quick fix moved blk_throtl_exit() to blk_release_queue().
      
              commit 7ad58c02
              Author: Jens Axboe <jaxboe@fusionio.com>
              Date:   Sat Oct 23 20:40:26 2010 +0200
      
              block: fix use-after-free bug in blk throttle code
      
      This patch reverts above change and does not try to shutdown the
      throtl work in blk_sync_queue(). By avoiding call to
      throtl_shutdown_timer_wq() from blk_sync_queue(), we should also avoid
      the problem reported by Ingo.
      
      blk_sync_queue() seems to be used only by md driver and it seems to be
      using it to make sure q->unplug_fn is not called as md registers its
      own unplug functions and it is about to free up the data structures
      used by unplug_fn(). Block throttle does not call back into unplug_fn()
      or into md. So there is no need to cancel blk throttle work.
      
      In fact I think cancelling block throttle work is bad because it might
      happen that some bios are throttled and scheduled to be dispatched later
      with the help of pending work and if work is cancelled, these bios might
      never be dispatched.
      
      Block layer also uses blk_sync_queue() during blk_cleanup_queue() and
      blk_release_queue() time. That should be safe as we are also calling
      blk_throtl_exit() which should make sure all the throttling related
      data structures are cleaned up.
      Signed-off-by: NVivek Goyal <vgoyal@redhat.com>
      Signed-off-by: NJens Axboe <jaxboe@fusionio.com>
      da527770
    • V
      block: Initialize ->queue_lock to internal lock at queue allocation time · c94a96ac
      Vivek Goyal 提交于
      There does not seem to be a clear convention whether q->queue_lock is
      initialized or not when blk_cleanup_queue() is called. In the past it
      was not necessary but now blk_throtl_exit() takes up queue lock by
      default and needs queue lock to be available.
      
      In fact elevator_exit() code also has similar requirement just that it
      is less stringent in the sense that elevator_exit() is called only if
      elevator is initialized.
      
      Two problems have been noticed because of ambiguity about spin lock
      status.
      
            - If a driver calls blk_alloc_queue() and then soon calls
              blk_cleanup_queue() almost immediately, (because some other
      	driver structure allocation failed or some other error happened)
      	then blk_throtl_exit() will run into issues as queue lock is not
      	initialized. Loop driver ran into this issue recently and I
      	noticed error paths in md driver too. Similar error paths should
      	exist in other drivers too.
      
            - If some driver provided external spin lock and zapped the lock
              before blk_cleanup_queue(), then it can lead to issues.
      
      So this patch initializes the default queue lock at queue allocation time.
      
      block throttling code is one of the users of queue lock and it is
      initialized at the queue allocation time, so it makes sense to
      initialize ->queue_lock also to internal lock. A driver can overide that
      lock later. This will take care of the issue where a driver does not have
      to worry about initializing the queue lock to default before calling
      blk_cleanup_queue()
      Signed-off-by: NVivek Goyal <vgoyal@redhat.com>
      Signed-off-by: NJens Axboe <jaxboe@fusionio.com>
      c94a96ac
    • L
      block/genhd: Change some numerals into macros · 53f22956
      Liu Yuan 提交于
      Rename the numerals in the diskstats_show() into the macros.
      
      Cc: Jens Axboe <axboe@kernel.dk>
      Signed-off-by: NLiu Yuan <tailai.ly@taobao.com>
      Signed-off-by: NJens Axboe <jaxboe@fusionio.com>
      53f22956
  16. 02 3月, 2011 1 次提交