1. 26 Nov, 2018 2 commits
  2. 20 Nov, 2018 4 commits
  3. 16 Nov, 2018 5 commits
  4. 14 Nov, 2018 1 commit
    • SCSI: fix queue cleanup race before queue initialization is done · 8dc765d4
      Ming Lei committed
      c2856ae2 ("blk-mq: quiesce queue before freeing queue") has
      already fixed this race. However, the implied synchronize_rcu()
      in blk_mq_quiesce_queue() can slow down LUN probing a lot, which
      caused a performance regression.
      
      Then 1311326c ("blk-mq: avoid to synchronize rcu inside blk_cleanup_queue()")
      tried to avoid the unnecessary synchronize_rcu() by quiescing the
      queue only when queue initialization is done, because it is common
      to probe many nonexistent LUNs.
      
      However, it turns out that it isn't safe to quiesce the queue only
      when queue initialization is done. When a SCSI command completes,
      the submitter of the command can be woken up immediately and the
      SCSI device may then be removed, while the queue run in
      scsi_end_request() is still in progress, so a kernel panic can result.
      
      In the Red Hat QE lab, there are several reports of this kind of
      kernel panic being triggered during kernel boot.
      
      This patch tries to address the issue by grabbing a queue usage
      counter while freeing a request and during the queue run that follows.
      
      Fixes: 1311326c ("blk-mq: avoid to synchronize rcu inside blk_cleanup_queue()")
      Cc: Andrew Jones <drjones@redhat.com>
      Cc: Bart Van Assche <bart.vanassche@wdc.com>
      Cc: linux-scsi@vger.kernel.org
      Cc: Martin K. Petersen <martin.petersen@oracle.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: James E.J. Bottomley <jejb@linux.vnet.ibm.com>
      Cc: stable <stable@vger.kernel.org>
      Cc: jianchao.wang <jianchao.w.wang@oracle.com>
      Signed-off-by: Ming Lei <ming.lei@redhat.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
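The idea described in this commit message, holding a usage counter across "complete the request, then run the queue" so that cleanup cannot finish in between, can be illustrated with a small user-space model. This is a sketch only, not kernel code: the `Queue` class, `end_request()`, and `demo()` are hypothetical names invented for illustration, and a `threading.Condition` stands in for whatever counter the kernel actually uses.

```python
import threading


class Queue:
    """Toy request queue guarded by a usage counter (user-space model only)."""

    def __init__(self):
        self._cond = threading.Condition()
        self._usage = 0
        self.freed = False
        self.events = []

    def get(self):
        # Take one reference on the queue.
        with self._cond:
            self._usage += 1

    def put(self):
        # Drop one reference and wake anyone waiting in cleanup().
        with self._cond:
            self._usage -= 1
            self._cond.notify_all()

    def cleanup(self):
        # Block until every holder of the usage counter has dropped it.
        with self._cond:
            while self._usage > 0:
                self._cond.wait()
            self.freed = True
        self.events.append("freed")

    def run(self):
        assert not self.freed, "queue used after it was freed"
        self.events.append("run")


def end_request(queue, wake_submitter):
    # Hold the usage counter across both the wake-up and the queue run.
    queue.get()
    try:
        wake_submitter()  # the submitter may start removing the device now
        queue.run()       # still safe: cleanup() cannot complete yet
    finally:
        queue.put()


def demo():
    queue = Queue()
    remover = threading.Thread(target=queue.cleanup)
    # Waking the submitter immediately triggers device removal in this model.
    end_request(queue, remover.start)
    remover.join()
    return queue.events
```

In this model `demo()` always records the queue run before the free, because `cleanup()` cannot proceed until `end_request()` drops its reference.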
  5. 10 Nov, 2018 1 commit
  6. 08 Nov, 2018 9 commits
  7. 02 Nov, 2018 1 commit
  8. 31 Oct, 2018 1 commit
  9. 26 Oct, 2018 1 commit
    • block: add a report_zones method · e76239a3
      Christoph Hellwig committed
      Dispatching a report zones command through the request queue is a major
      pain due to the command reply payload rewriting necessary. Given that
      blkdev_report_zones() is executing everything synchronously, implement
      report zones as a block device file operation instead, allowing major
      simplification of the code in many places.
      
      Since sd, null-blk, dm-linear and dm-flakey are the only block device
      drivers that can expose zoned block devices, these drivers are
      modified to provide the device side implementation of the
      report_zones() block device file operation.
      
      For device mappers, a new report_zones() target type operation is
      defined so that calls to blkdev_report_zones() from the upper block
      layer can be propagated down to the underlying devices of the dm targets.
      Implementation for this new operation is added to the dm-linear and
      dm-flakey targets.
      Reviewed-by: Hannes Reinecke <hare@suse.com>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      [Damien]
      * Changed method block_device argument to gendisk
      * Various bug fixes and improvements
      * Added support for null_blk, dm-linear and dm-flakey.
      Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
      Reviewed-by: Mike Snitzer <snitzer@redhat.com>
      Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
  10. 21 Oct, 2018 1 commit
  11. 15 Oct, 2018 1 commit
  12. 14 Oct, 2018 1 commit
  13. 27 Sep, 2018 5 commits
  14. 22 Sep, 2018 1 commit
    • block: use nanosecond resolution for iostat · b57e99b4
      Omar Sandoval committed
      Klaus Kusche reported that the I/O busy time in /proc/diskstats was not
      updating properly on 4.18. This is because we started using ktime to
      track elapsed time, and we convert nanoseconds to jiffies when we update
      the partition counter. However, this gets rounded down, so any I/Os that
      take less than a jiffy are not accounted for. Previously in this case,
      the value of jiffies would sometimes increment while we were doing I/O,
      so at least some I/Os were accounted for.
      
      Let's convert the stats to use nanoseconds internally. We still report
      milliseconds as before, now more accurately than ever. The value is
      still truncated to 32 bits for backwards compatibility.
      
      Fixes: 522a7775 ("block: consolidate struct request timestamp fields")
      Cc: stable@vger.kernel.org
      Reported-by: Klaus Kusche <klaus.kusche@computerix.info>
      Signed-off-by: Omar Sandoval <osandov@fb.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
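The rounding problem described in this commit message can be reproduced with plain integer arithmetic. The sketch below is a user-space illustration, not the kernel implementation: the tick rate `HZ = 250` and both function names are assumptions chosen for the example.

```python
NSEC_PER_SEC = 1_000_000_000
NSEC_PER_MSEC = 1_000_000
HZ = 250                             # assumed tick rate; real systems vary
NSEC_PER_JIFFY = NSEC_PER_SEC // HZ  # 4,000,000 ns per jiffy at HZ=250


def busy_ms_via_jiffies(durations_ns):
    """Round each I/O down to whole jiffies before accumulating (lossy)."""
    total_jiffies = sum(d // NSEC_PER_JIFFY for d in durations_ns)
    return total_jiffies * 1000 // HZ


def busy_ms_via_nanoseconds(durations_ns):
    """Accumulate nanoseconds, convert once, and truncate to 32 bits."""
    total_ns = sum(durations_ns)
    return (total_ns // NSEC_PER_MSEC) & 0xFFFFFFFF


# 10,000 I/Os of 100 microseconds each: one full second of busy time.
ios = [100_000] * 10_000
print(busy_ms_via_jiffies(ios))      # every I/O is shorter than a jiffy
print(busy_ms_via_nanoseconds(ios))
```

Each 100 microsecond I/O rounds down to zero jiffies, so the first function reports no busy time at all, while accumulating nanoseconds reports the full 1000 ms.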
  15. 06 Sep, 2018 1 commit
  16. 18 Aug, 2018 1 commit
  17. 15 Aug, 2018 1 commit
  18. 09 Aug, 2018 1 commit
  19. 05 Aug, 2018 1 commit
    • Partially revert "block: fail op_is_write() requests to read-only partitions" · a32e236e
      Linus Torvalds committed
      It turns out that commit 721c7fc7 ("block: fail op_is_write()
      requests to read-only partitions"), while obviously correct, causes
      problems for some older lvm2 installations.
      
      The reason is that lvm snapshotting will continue to write to the
      snapshot COW volume, even after the volume has been marked read-only.
      End result: snapshot failure.
      
      This has actually been fixed in newer versions of the lvm2 tool, but the
      old tools still exist, and the breakage was reported both in the kernel
      bugzilla and in the Debian bugzilla:
      
        https://bugzilla.kernel.org/show_bug.cgi?id=200439
        https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=900442
      
      The lvm2 fix is here
      
        https://sourceware.org/git/?p=lvm2.git;a=commit;h=a6fdb9d9d70f51c49ad11a87ab4243344e6701a3
      
      but until everybody has updated to recent versions, we'll have to weaken
      the "never write to read-only partitions" check.  It now allows the
      write to happen, but causes a warning, something like this:
      
        generic_make_request: Trying to write to read-only block-device dm-3 (partno X)
        Modules linked in: nf_tables xt_cgroup xt_owner kvm_intel iwlmvm kvm irqbypass iwlwifi
        CPU: 1 PID: 77 Comm: kworker/1:1 Not tainted 4.17.9-gentoo #3
        Hardware name: LENOVO 20B6A019RT/20B6A019RT, BIOS GJET91WW (2.41 ) 09/21/2016
        Workqueue: ksnaphd do_metadata
        RIP: 0010:generic_make_request_checks+0x4ac/0x600
        ...
        Call Trace:
         generic_make_request+0x64/0x400
         submit_bio+0x6c/0x140
         dispatch_io+0x287/0x430
         sync_io+0xc3/0x120
         dm_io+0x1f8/0x220
         do_metadata+0x1d/0x30
         process_one_work+0x1b9/0x3e0
         worker_thread+0x2b/0x3c0
         kthread+0x113/0x130
         ret_from_fork+0x35/0x40
      
      Note that this is a "revert" in behavior only.  I'm leaving alone the
      actual code cleanups in commit 721c7fc7, but letting the previously
      uncaught request go through with a warning instead of stopping it.
      
      Fixes: 721c7fc7 ("block: fail op_is_write() requests to read-only partitions")
      Reported-and-tested-by: WGH <wgh@torlan.ru>
      Acked-by: Mike Snitzer <snitzer@redhat.com>
      Cc: Sagi Grimberg <sagi@grimberg.me>
      Cc: Ilya Dryomov <idryomov@gmail.com>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Zdenek Kabelac <zkabelac@redhat.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
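The behavioral change described in this commit message, warning about a write to a read-only partition instead of failing it, can be modeled in a few lines. This is a hedged user-space sketch: `check_write()` and its `strict` flag are hypothetical names used only to contrast the two behaviors, and the warning text is shortened from the log excerpt in the message.

```python
import warnings


def check_write(op_is_write, read_only, device, strict=False):
    """Return True if the request may proceed.

    strict=True models the stricter check (the write is failed);
    strict=False models the weakened check (warn, then allow the write).
    """
    if op_is_write and read_only:
        if strict:
            return False
        warnings.warn(f"Trying to write to read-only block-device {device}")
    return True


# An old tool that keeps writing after the volume was marked read-only.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    allowed = check_write(True, True, "dm-3")

print(allowed, len(caught))
```

With the weakened check the write is allowed and exactly one warning is recorded; with `strict=True` the same request is rejected.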
  20. 03 Aug, 2018 1 commit