1. 11 May 2018, 1 commit
    • block, bfq: postpone rq preparation to insert or merge · 18e5a57d
      Paolo Valente committed
      When invoked for an I/O request rq, the prepare_request hook of bfq
      increments reference counters in the destination bfq_queue for rq. In
      this respect, after this hook has been invoked, rq may still be
      transformed into a request with no icq attached, i.e., for bfq, a
      request not associated with any bfq_queue. No further hook is invoked
      to signal this transformation to bfq (in general, to the destination
      elevator for rq). This leads bfq into an inconsistent state, because
      bfq has no chance to correctly lower these counters back. This
      inconsistency may in its turn cause incorrect scheduling and hangs. It
      certainly causes memory leaks, by making it impossible for bfq to free
      the involved bfq_queue.
      
      On the bright side, no such transformation can happen any longer once rq
      has been inserted into bfq, or merged with another, already inserted,
      request. Exploiting this fact, this commit addresses the above issue
      by delaying the preparation of an I/O request to when the request is
      inserted or merged.
      
      This change also gives a performance bonus: a lock-contention point
      gets removed. To prepare a request, bfq needs to hold its scheduler
      lock. After postponing request preparation to insertion or merging, no
      lock needs to be grabbed any longer in the prepare_request hook, while
      the lock already taken to perform insertion or merging is used to
      prepare the request as well.
      Tested-by: Oleksandr Natalenko <oleksandr@natalenko.name>
      Tested-by: Bart Van Assche <bart.vanassche@wdc.com>
      Signed-off-by: Paolo Valente <paolo.valente@linaro.org>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
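      A rough sketch of the shape of this change, for orientation only: the
      signatures follow the bfq/blk-mq style of that era, but the bodies and
      the bfqd lookup are illustrative assumptions, not the literal diff.

       /* Illustrative sketch only -- not the literal upstream code. */

       /* Before: the prepare_request hook took the scheduler lock and bumped
        * reference counts, even though rq could still lose its icq later,
        * leaving no hook in which to drop those references again. */
       static void bfq_prepare_request(struct request *rq, struct bio *bio)
       {
               struct bfq_data *bfqd = rq->q->elevator->elevator_data;

               spin_lock_irq(&bfqd->lock);            /* extra contention point */
               /* ... look up the bfq_queue for rq and take references ... */
               spin_unlock_irq(&bfqd->lock);
       }

       /* After: the real preparation is deferred to insertion (or merging),
        * when no further transformation of rq is possible and the scheduler
        * lock is held anyway, so no extra lock round-trip is needed. */
       static void bfq_insert_request(struct blk_mq_hw_ctx *hctx,
                                      struct request *rq, bool at_head)
       {
               struct bfq_data *bfqd = hctx->queue->elevator->elevator_data;

               spin_lock_irq(&bfqd->lock);
               /* ... prepare rq here (what prepare_request used to do),
                * then insert or merge it, all under the one lock ... */
               spin_unlock_irq(&bfqd->lock);
       }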
  2. 09 May 2018, 11 commits
  3. 08 May 2018, 3 commits
    • block: Shorten interrupt disabled regions · 50864670
      Thomas Gleixner committed
      Commit 9c40cef2 ("sched: Move blk_schedule_flush_plug() out of
      __schedule()") moved the blk_schedule_flush_plug() call out of the
      interrupt/preempt disabled region in the scheduler. This allows replacing
      local_irq_save/restore(flags) with local_irq_disable/enable() in
      blk_flush_plug_list().
      
      But it makes more sense to disable interrupts explicitly when the request
      queue is locked and reenable them when the request queue is unlocked. This
      shortens the interrupt disabled section, which is important when the plug
      list contains requests for more than one queue. The comment claiming that
      interrupts must be disabled around the loop is misleading, as the called
      functions can reenable interrupts unconditionally anyway, and it obfuscates
      the scope badly:
      
       local_irq_save(flags);
         spin_lock(q->queue_lock);
         ...
         queue_unplugged(q...);
           scsi_request_fn();
             spin_unlock_irq(q->queue_lock);
      
      -------------------^^^ ????
      
             spin_lock_irq(q->queue_lock);
           spin_unlock(q->queue_lock);
       local_irq_restore(flags);
      
      Aside from that, the detached interrupt disabling is a constant pain for
      PREEMPT_RT as it requires patching and special casing when RT is enabled
      while with the spin_*_irq() variants this happens automatically.
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Link: http://lkml.kernel.org/r/20110622174919.025446432@linutronix.de
      Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
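      In other words, the fix moves from one long irq-off region spanning the
      whole flush loop to per-queue spin_*_irq() sections. A simplified sketch
      of the two shapes (not the literal diff):

       /* Before: one detached irq-off region around the whole loop */
       local_irq_save(flags);
       while (!list_empty(&list)) {
               ...
               spin_lock(q->queue_lock);
               queue_unplugged(q, depth, from_schedule);   /* drops the lock */
               ...
       }
       local_irq_restore(flags);

       /* After: interrupts are off only while a queue_lock is actually held */
       while (!list_empty(&list)) {
               ...
               spin_lock_irq(q->queue_lock);
               queue_unplugged(q, depth, from_schedule);   /* spin_unlock_irq() */
               ...
       }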
    • block: Remove redundant WARN_ON() · 656cb6d0
      Anna-Maria Gleixner committed
      Commit 2fff8a92 ("block: Check locking assumptions at runtime") added a
      lockdep_assert_held(q->queue_lock) which makes the WARN_ON() redundant
      because lockdep will detect and warn about context violations.
      
      The unconditional WARN_ON() does not provide real additional value, so it
      can be removed.
      Signed-off-by: Anna-Maria Gleixner <anna-maria@linutronix.de>
      Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
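      The general pattern is of this shape (a sketch only; the exact function
      and WARN_ON() condition in the removed hunk are placeholders here):

       lockdep_assert_held(q->queue_lock);       /* added by commit 2fff8a92 */
       WARN_ON(!spin_is_locked(q->queue_lock));  /* redundant: lockdep already
                                                    warns on a locking
                                                    violation, so this goes */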
    • block: don't disable interrupts during kmap_atomic() · f3a1075e
      Sebastian Andrzej Siewior committed
      bounce_copy_vec() disables interrupts around kmap_atomic(). This is a
      leftover from the old kmap_atomic() implementation which relied on fixed
      mapping slots, so the caller had to make sure that the same slot could not
      be reused from an interrupting context.
      
      kmap_atomic() was changed to dynamic slots long ago, and commit 1ec9c5dd
      ("include/linux/highmem.h: remove the second argument of k[un]map_atomic()")
      removed the slot assignments, but the callers were not checked for
      now-redundant interrupt disabling.
      
      Remove the conditional interrupt disable.
      Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
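      The resulting shape of bounce_copy_vec() is roughly the following (a
      sketch, assuming the function still just copies a single bio_vec):

       static void bounce_copy_vec(struct bio_vec *to, unsigned char *vfrom)
       {
               unsigned char *vto;

               /* previously wrapped in local_irq_save()/local_irq_restore(),
                * a leftover from fixed-slot kmap_atomic(); no longer needed */
               vto = kmap_atomic(to->bv_page);
               memcpy(vto + to->bv_offset, vfrom, to->bv_len);
               kunmap_atomic(vto);
       }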
  4. 26 April 2018, 2 commits
  5. 25 April 2018, 2 commits
  6. 19 April 2018, 2 commits
  7. 18 April 2018, 2 commits
  8. 17 April 2018, 1 commit
    • blk-mq: start request gstate with gen 1 · f4560231
      Jianchao Wang committed
      rq->gstate and rq->aborted_gstate are both zero before rqs are
      allocated. If we have a small timeout, then when the timer fires there
      could be rqs that were never allocated, and there could also be rqs
      that have been allocated but not yet initialized and started. At that
      moment, rq->gstate and rq->aborted_gstate are both 0, so
      blk_mq_terminate_expired will identify the rq as timed out and invoke
      .timeout early.
      
      For scsi, this causes scsi_times_out to be invoked before the
      scsi_cmnd is initialized; scsi_cmnd->device is still NULL at that
      point, and we crash.
      
      Cc: Bart Van Assche <bart.vanassche@wdc.com>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Ming Lei <ming.lei@redhat.com>
      Cc: Martin Steigerwald <Martin@Lichtvoll.de>
      Cc: stable@vger.kernel.org
      Signed-off-by: Jianchao Wang <jianchao.w.wang@oracle.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
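      The fix boils down to starting rq->gstate one generation ahead at
      request-initialization time, so a never-started request can no longer
      compare equal to aborted_gstate (still 0). Roughly (a sketch; the exact
      location and the MQ_RQ_GEN_INC constant follow the blk-mq gstate code
      of that kernel version):

       /* in blk-mq request initialization */
       /* Start gstate at generation 1 instead of 0; otherwise it equals
        * aborted_gstate and blk_mq_terminate_expired() would treat a
        * request that was never started as already timed out. */
       WRITE_ONCE(rq->gstate, MQ_RQ_GEN_INC);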
  9. 15 April 2018, 1 commit
    • block: do not use interruptible wait anywhere · 1dc3039b
      Alan Jenkins committed
      When blk_queue_enter() waits for a queue to unfreeze, or unset the
      PREEMPT_ONLY flag, do not allow it to be interrupted by a signal.
      
      The PREEMPT_ONLY flag was introduced later in commit 3a0a5299
      ("block, scsi: Make SCSI quiesce and resume work reliably").  Note the SCSI
      device is resumed asynchronously, i.e. after un-freezing userspace tasks.
      
      So that commit exposed the bug as a regression in v4.15.  A mysterious
      SIGBUS (or -EIO) sometimes happened during the time the device was being
      resumed.  Most frequently, there was no kernel log message, and we saw Xorg
      or Xwayland killed by SIGBUS.[1]
      
      [1] E.g. https://bugzilla.redhat.com/show_bug.cgi?id=1553979
      
      Without this fix, I get an IO error in this test:
      
      # dd if=/dev/sda of=/dev/null iflag=direct & \
        while killall -SIGUSR1 dd; do sleep 0.1; done & \
        echo mem > /sys/power/state ; \
        sleep 5; killall dd  # stop after 5 seconds
      
      The interruptible wait was added to blk_queue_enter in
      commit 3ef28e83 ("block: generic request_queue reference counting").
      Before then, the interruptible wait was only in blk-mq, but I don't think
      it could ever have been correct.
      Reviewed-by: Bart Van Assche <bart.vanassche@wdc.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: Alan Jenkins <alan.christopher.jenkins@gmail.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
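      In blk_queue_enter() the change amounts to replacing the interruptible
      wait with a plain wait_event(), so a pending signal (such as the SIGUSR1
      in the test above) can no longer abort the wait and surface as an I/O
      error. Approximately (a sketch of the wait condition of that kernel
      version):

       /* Before: a signal aborts the wait and the error leaks to the caller */
       ret = wait_event_interruptible(q->mq_freeze_wq,
                       (atomic_read(&q->mq_freeze_depth) == 0 &&
                        (preempt || !blk_queue_preempt_only(q))) ||
                       blk_queue_dying(q));
       if (ret)
               return ret;

       /* After: wait uninterruptibly until the queue is usable or dying */
       wait_event(q->mq_freeze_wq,
                  (atomic_read(&q->mq_freeze_depth) == 0 &&
                   (preempt || !blk_queue_preempt_only(q))) ||
                  blk_queue_dying(q));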
  10. 11 April 2018, 2 commits
  11. 10 April 2018, 9 commits
  12. 03 April 2018, 1 commit
  13. 28 March 2018, 1 commit
  14. 27 March 2018, 1 commit
  15. 26 March 2018, 1 commit