1. 15 Apr, 2018 1 commit
    • block: do not use interruptible wait anywhere · 1dc3039b
      Alan Jenkins authored
      When blk_queue_enter() waits for a queue to unfreeze, or for the
      PREEMPT_ONLY flag to be unset, do not allow it to be interrupted by a signal.
      
      The PREEMPT_ONLY flag was introduced later in commit 3a0a5299
      ("block, scsi: Make SCSI quiesce and resume work reliably").  Note the SCSI
      device is resumed asynchronously, i.e. after un-freezing userspace tasks.
      
      So that commit exposed the bug as a regression in v4.15.  A mysterious
      SIGBUS (or -EIO) sometimes happened during the time the device was being
      resumed.  Most frequently, there was no kernel log message, and we saw Xorg
      or Xwayland killed by SIGBUS.[1]
      
      [1] E.g. https://bugzilla.redhat.com/show_bug.cgi?id=1553979
      
      Without this fix, I get an IO error in this test:
      
      # dd if=/dev/sda of=/dev/null iflag=direct & \
        while killall -SIGUSR1 dd; do sleep 0.1; done & \
        echo mem > /sys/power/state ; \
        sleep 5; killall dd  # stop after 5 seconds
      
      The interruptible wait was added to blk_queue_enter in
      commit 3ef28e83 ("block: generic request_queue reference counting").
      Before then, the interruptible wait was only in blk-mq, but I don't think
      it could ever have been correct.
      Reviewed-by: Bart Van Assche <bart.vanassche@wdc.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: Alan Jenkins <alan.christopher.jenkins@gmail.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
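The effect described in this commit message can be illustrated with a small userspace toy model. This is only a sketch, not kernel code: `toy_queue`, `toy_queue_enter()`, `toy_enter_result()` and the tick-based "signal" are hypothetical names invented for illustration, and `-EINTR` stands in for whatever error the real interrupted wait would propagate. It shows why an interruptible wait can surface an error to the caller even though the device would have become usable a moment later.

```c
#include <errno.h>

/* Hypothetical toy model of the wait in the commit message; not kernel code. */
enum toy_wait_mode { TOY_WAIT_INTERRUPTIBLE, TOY_WAIT_UNINTERRUPTIBLE };

struct toy_queue {
    int frozen_ticks;   /* ticks left until the queue unfreezes */
};

/*
 * Wait for the queue to unfreeze. A signal "arrives" at signal_tick
 * (pass -1 for no signal). Returns 0 once the queue has been entered,
 * or -EINTR if an interruptible wait was cut short by the signal.
 */
int toy_queue_enter(struct toy_queue *q, enum toy_wait_mode mode, int signal_tick)
{
    for (int tick = 0; q->frozen_ticks > 0; tick++) {
        if (mode == TOY_WAIT_INTERRUPTIBLE && tick == signal_tick)
            return -EINTR;   /* caller sees an error although the device is fine */
        q->frozen_ticks--;   /* one tick closer to unfreezing */
    }
    return 0;
}

/* Helper for the examples: freeze for `ticks`, deliver the signal at tick 1. */
int toy_enter_result(enum toy_wait_mode mode, int ticks)
{
    struct toy_queue q = { .frozen_ticks = ticks };
    return toy_queue_enter(&q, mode, 1);
}
```

In this model, `toy_enter_result(TOY_WAIT_INTERRUPTIBLE, 3)` fails with `-EINTR`, while the same call with `TOY_WAIT_UNINTERRUPTIBLE` simply waits out the freeze and returns 0, which mirrors the behaviour the fix is after.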
  2. 11 Apr, 2018 2 commits
  3. 10 Apr, 2018 9 commits
  4. 03 Apr, 2018 1 commit
  5. 28 Mar, 2018 1 commit
  6. 27 Mar, 2018 1 commit
  7. 26 Mar, 2018 1 commit
  8. 22 Mar, 2018 1 commit
  9. 20 Mar, 2018 1 commit
    • block: Change a rcu_read_{lock,unlock}_sched() pair into rcu_read_{lock,unlock}() · 818e0fa2
      Bart Van Assche authored
      scsi_device_quiesce() uses synchronize_rcu() to guarantee that the
      effect of blk_set_preempt_only() will be visible for percpu_ref_tryget()
      calls that occur after the queue unfreeze by using the approach
      explained in https://lwn.net/Articles/573497/. The rcu read lock and
      unlock calls in blk_queue_enter() form a pair with the synchronize_rcu()
      call in scsi_device_quiesce(). scsi_device_quiesce() and
      blk_queue_enter() must both use the same flavor: either regular RCU or RCU-sched.
      Since neither the RCU-protected code in blk_queue_enter() nor
      blk_queue_usage_counter_release() sleeps, regular RCU protection
      is sufficient. Note: scsi_device_quiesce() does not have to be
      modified since it already uses synchronize_rcu().
      Reported-by: Tejun Heo <tj@kernel.org>
      Fixes: 3a0a5299 ("block, scsi: Make SCSI quiesce and resume work reliably")
      Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com>
      Acked-by: Tejun Heo <tj@kernel.org>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Hannes Reinecke <hare@suse.com>
      Cc: Ming Lei <ming.lei@redhat.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Johannes Thumshirn <jthumshirn@suse.de>
      Cc: Oleksandr Natalenko <oleksandr@natalenko.name>
      Cc: Martin Steigerwald <martin@lichtvoll.de>
      Cc: stable@vger.kernel.org # v4.15
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
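The pairing requirement in this commit message can be pictured with a deliberately simplified toy model. It is a sketch under the assumption that each RCU flavor tracks its own set of readers, so a grace period of one flavor waits only for readers of that same flavor. All names (`toy_rcu_state`, `toy_read_lock()`, `toy_updater_waits_for_reader()` and so on) are hypothetical and none of this is the kernel's RCU implementation.

```c
#include <stdbool.h>

/* Hypothetical toy model of two RCU flavors; not the kernel implementation. */
enum toy_rcu_flavor { TOY_RCU, TOY_RCU_SCHED, TOY_NR_FLAVORS };

struct toy_rcu_state {
    int active_readers[TOY_NR_FLAVORS];   /* readers currently inside a section */
};

void toy_read_lock(struct toy_rcu_state *s, enum toy_rcu_flavor f)
{
    s->active_readers[f]++;
}

void toy_read_unlock(struct toy_rcu_state *s, enum toy_rcu_flavor f)
{
    s->active_readers[f]--;
}

/* Assumption of this model: a grace period of flavor f waits only for flavor-f readers. */
bool toy_grace_period_can_end(const struct toy_rcu_state *s, enum toy_rcu_flavor f)
{
    return s->active_readers[f] == 0;
}

/*
 * Returns true if an updater waiting for a grace period of flavor `updater`
 * is held off by a reader that entered its section with flavor `reader`.
 */
bool toy_updater_waits_for_reader(enum toy_rcu_flavor reader, enum toy_rcu_flavor updater)
{
    struct toy_rcu_state s = { { 0 } };
    bool waits;

    toy_read_lock(&s, reader);
    waits = !toy_grace_period_can_end(&s, updater);
    toy_read_unlock(&s, reader);
    return waits;
}
```

With matching flavors the updater waits for the reader; with a reader in the sched flavor and an updater using the regular flavor, the model's grace period can end while the reader is still inside its section, which is the mismatch the commit removes.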
  10. 18 Mar, 2018 2 commits
  11. 17 Mar, 2018 2 commits
    • blk-throttle: fix race between blkcg_bio_issue_check() and cgroup_rmdir() · 4c699480
      Joseph Qi authored
      We've triggered a WARNING in blk_throtl_bio() when throttling writeback
      io, which complains that blkg->refcnt is already 0 when calling blkg_get(),
      and then the kernel crashes with an invalid page request.
      After investigating this issue, we've found it is caused by a race
      between blkcg_bio_issue_check() and cgroup_rmdir(), which is described
      below:
      
      writeback kworker               cgroup_rmdir
                                        cgroup_destroy_locked
                                          kill_css
                                            css_killed_ref_fn
                                              css_killed_work_fn
                                                offline_css
                                                  blkcg_css_offline
        blkcg_bio_issue_check
          rcu_read_lock
          blkg_lookup
                                                    spin_trylock(q->queue_lock)
                                                    blkg_destroy
                                                    spin_unlock(q->queue_lock)
          blk_throtl_bio
          spin_lock_irq(q->queue_lock)
          ...
          spin_unlock_irq(q->queue_lock)
        rcu_read_unlock
      
      Since RCU only prevents the blkg from being released while it is in use,
      blkg->refcnt can drop to 0 during blkg_destroy(), which schedules the
      blkg release.
      A subsequent blkg_get() in blk_throtl_bio() then triggers the WARNING,
      and the corresponding blkg_put() schedules the blkg release again,
      which results in a double free.
      This race was introduced by commit ae118896 ("blkcg: consolidate blkg
      creation in blkcg_bio_issue_check()"). Before that commit, the code
      looked up the blkg first and then tried to look it up or create it again
      under queue_lock. Since reviving that logic would be a bit drastic, fix
      the race by only offlining the pd during blkcg_css_offline() and moving
      the rest of the destruction (especially blkg_put()) into
      blkcg_css_free(), which should be the right way, as discussed.
      
      Fixes: ae118896 ("blkcg: consolidate blkg creation in blkcg_bio_issue_check()")
      Reported-by: Jiufei Xue <jiufei.xue@linux.alibaba.com>
      Signed-off-by: Joseph Qi <joseph.qi@linux.alibaba.com>
      Acked-by: Tejun Heo <tj@kernel.org>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
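The refcount sequence behind this race can be replayed in a few lines of single-threaded C. This is a sketch only: `toy_blkg`, `toy_blkg_get()`, `toy_blkg_put()` and the two ordering helpers are hypothetical stand-ins, the `warned` flag merely represents the WARNING from the report, and the model assumes a get on a zero count still increments it.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a refcounted blkg; not the kernel structure. */
struct toy_blkg {
    int refcnt;     /* current reference count */
    int releases;   /* how many times a release was scheduled */
    bool warned;    /* set when a get hits a count that is already 0 */
};

void toy_blkg_get(struct toy_blkg *b)
{
    if (b->refcnt <= 0)
        b->warned = true;   /* stands in for the WARNING in the report */
    b->refcnt++;
}

void toy_blkg_put(struct toy_blkg *b)
{
    if (--b->refcnt == 0)
        b->releases++;      /* dropping to 0 schedules a release */
}

/* Racy ordering: the destroy path drops the last reference before the get. */
struct toy_blkg toy_racy_order(void)
{
    struct toy_blkg b = { .refcnt = 1 };

    toy_blkg_put(&b);   /* destroy path: count 1 -> 0, first release */
    toy_blkg_get(&b);   /* throttle path: get on a dead object, warns */
    toy_blkg_put(&b);   /* matching put: count 1 -> 0, second release */
    return b;
}

/* Fixed ordering: the final put is deferred until the users are done. */
struct toy_blkg toy_deferred_order(void)
{
    struct toy_blkg b = { .refcnt = 1 };

    toy_blkg_get(&b);   /* throttle path takes its reference first */
    toy_blkg_put(&b);   /* ...and drops it again */
    toy_blkg_put(&b);   /* deferred final put: exactly one release */
    return b;
}
```

In the racy ordering the model records a warning and two releases (the double free); in the deferred ordering it records no warning and a single release, which is the outcome the fix aims for by moving the final put to a later stage.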
    • block: sed-opal: fix u64 short atom length · 5f990d31
      Jonas Rabenstein authored
      The length must be given in bytes, not in 4-bit tuples.
      Reviewed-by: Scott Bauer <scott.bauer@intel.com>
      Signed-off-by: Jonas Rabenstein <jonas.rabenstein@studium.uni-erlangen.de>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
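The difference between the two units is easy to see with a small calculation. The sketch below assumes the length is derived from the position of the most significant set bit of the value; the helper names (`toy_fls64()`, `toy_len_bytes()`, `toy_len_nibbles()`) are hypothetical and are not taken from the sed-opal source.

```c
#include <stdint.h>

/* Position of the most significant set bit, counting from 1; 0 for x == 0. */
int toy_fls64(uint64_t x)
{
    int n = 0;

    while (x) {
        n++;
        x >>= 1;
    }
    return n;
}

/* Length in bytes: significant bits rounded up to whole 8-bit units. */
int toy_len_bytes(uint64_t value)
{
    return (toy_fls64(value) + 7) / 8;
}

/* Length in 4-bit tuples: the unit the commit message says is wrong. */
int toy_len_nibbles(uint64_t value)
{
    return (toy_fls64(value) + 3) / 4;
}
```

For the value 0x1234, the byte count is 2 while the 4-bit-tuple count is 4, so a length field expressed in the wrong unit would claim twice as much payload as is actually present.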
  12. 16 Mar, 2018 1 commit
  13. 14 Mar, 2018 3 commits
  14. 09 Mar, 2018 6 commits
  15. 07 Mar, 2018 1 commit
  16. 01 Mar, 2018 7 commits