1. 28 Oct, 2021 1 commit
    • virtio-blk: Use blk_validate_block_size() to validate block size · 57a13a5b
      Committed by Xie Yongji
      The block layer can't support a block size larger than
      page size yet. And a block size that's too small or
      not a power of two won't work either. If a misconfigured
      device presents an invalid block size in configuration space,
      the kernel crashes with a trace like the one below:
      
      [  506.154324] BUG: kernel NULL pointer dereference, address: 0000000000000008
      [  506.160416] RIP: 0010:create_empty_buffers+0x24/0x100
      [  506.174302] Call Trace:
      [  506.174651]  create_page_buffers+0x4d/0x60
      [  506.175207]  block_read_full_page+0x50/0x380
      [  506.175798]  ? __mod_lruvec_page_state+0x60/0xa0
      [  506.176412]  ? __add_to_page_cache_locked+0x1b2/0x390
      [  506.177085]  ? blkdev_direct_IO+0x4a0/0x4a0
      [  506.177644]  ? scan_shadow_nodes+0x30/0x30
      [  506.178206]  ? lru_cache_add+0x42/0x60
      [  506.178716]  do_read_cache_page+0x695/0x740
      [  506.179278]  ? read_part_sector+0xe0/0xe0
      [  506.179821]  read_part_sector+0x36/0xe0
      [  506.180337]  adfspart_check_ICS+0x32/0x320
      [  506.180890]  ? snprintf+0x45/0x70
      [  506.181350]  ? read_part_sector+0xe0/0xe0
      [  506.181906]  bdev_disk_changed+0x229/0x5c0
      [  506.182483]  blkdev_get_whole+0x6d/0x90
      [  506.183013]  blkdev_get_by_dev+0x122/0x2d0
      [  506.183562]  device_add_disk+0x39e/0x3c0
      [  506.184472]  virtblk_probe+0x3f8/0x79b [virtio_blk]
      [  506.185461]  virtio_dev_probe+0x15e/0x1d0 [virtio]
      
      So let's use a block layer helper to validate the block size.
      Signed-off-by: Xie Yongji <xieyongji@bytedance.com>
      Acked-by: Michael S. Tsirkin <mst@redhat.com>
      Link: https://lore.kernel.org/r/20211026144015.188-5-xieyongji@bytedance.com
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      57a13a5b
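The rule described in this commit message (no smaller than 512 bytes, no larger than the page size, and a power of two) can be sketched in userspace as follows. This is an illustrative model, not the kernel helper itself: the names `sketch_validate_block_size`, `sketch_is_power_of_2` and `SKETCH_PAGE_SIZE` are hypothetical, and the 4096-byte page size is an assumption.

```c
#include <errno.h>
#include <stdbool.h>

/* Assumption: 4 KiB pages. The kernel uses the architecture's PAGE_SIZE. */
#define SKETCH_PAGE_SIZE 4096UL

/* True when exactly one bit is set in n. */
static bool sketch_is_power_of_2(unsigned long n)
{
	return n != 0 && (n & (n - 1)) == 0;
}

/*
 * Hypothetical userspace model of the check: returns 0 when the block
 * size is usable and -EINVAL when it is too small, too large, or not a
 * power of two.
 */
static int sketch_validate_block_size(unsigned long bsize)
{
	if (bsize < 512 || bsize > SKETCH_PAGE_SIZE ||
	    !sketch_is_power_of_2(bsize))
		return -EINVAL;
	return 0;
}
```

Under these assumptions, 512 and 4096 are accepted, while 0, 256, 1000 and 8192 are rejected. A driver would run such a check on the value read from configuration space before registering the disk.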
  2. 13 Oct, 2021 1 commit
    • Revert "virtio-blk: Add validation for block size in config space" · ff631988
      Committed by Michael S. Tsirkin
      It turns out that access to config space before completing the feature
      negotiation is broken for big endian guests at least with QEMU hosts up
      to 6.1 inclusive.  This affects any device that accesses config space in
      the validate callback: at the moment that is virtio-net with
      VIRTIO_NET_F_MTU but since 82e89ea0 ("virtio-blk: Add validation for
      block size in config space") that also started affecting virtio-blk with
      VIRTIO_BLK_F_BLK_SIZE. Further, unlike VIRTIO_NET_F_MTU which is off by
      default on QEMU, VIRTIO_BLK_F_BLK_SIZE is on by default, which resulted
      in lots of people not being able to boot VMs on BE.
      
      The spec is very clear that what we are doing is legal so QEMU needs to
      be fixed, but given it's been broken for so many years and no one
      noticed, we need to give QEMU a bit more time before applying this.
      
      Further, this patch is incomplete (does not check blk size is a power
      of two) and it duplicates the logic from nbd.
      
      Revert for now, and we'll reapply a cleaner logic in the next release.
      
      Cc: stable@vger.kernel.org
      Fixes: 82e89ea0 ("virtio-blk: Add validation for block size in config space")
      Cc: Xie Yongji <xieyongji@bytedance.com>
      Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
      ff631988
  3. 06 Sep, 2021 1 commit
  4. 24 Aug, 2021 1 commit
  5. 17 Aug, 2021 1 commit
  6. 11 Aug, 2021 1 commit
  7. 03 Jul, 2021 3 commits
  8. 12 Jun, 2021 1 commit
  9. 23 Feb, 2021 1 commit
  10. 25 Jan, 2021 1 commit
    • block: remove unnecessary argument from blk_execute_rq · 684da762
      Committed by Guoqing Jiang
      We can remove 'q' from blk_execute_rq as well after the previous change
      in blk_execute_rq_nowait.
      
      And more importantly it never really was needed to start with given
      that we can trivially derive it from struct request.
      
      Cc: linux-scsi@vger.kernel.org
      Cc: virtualization@lists.linux-foundation.org
      Cc: linux-ide@vger.kernel.org
      Cc: linux-mmc@vger.kernel.org
      Cc: linux-nvme@lists.infradead.org
      Cc: linux-nfs@vger.kernel.org
      Acked-by: Ulf Hansson <ulf.hansson@linaro.org> # for mmc
      Signed-off-by: Guoqing Jiang <guoqing.jiang@cloud.ionos.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      684da762
  11. 16 Nov, 2020 2 commits
  12. 02 Sep, 2020 2 commits
  13. 17 Aug, 2020 1 commit
    • block: virtio_blk: fix handling single range discard request · af822aa6
      Committed by Ming Lei
      1f23816b ("virtio_blk: add discard and write zeroes support") starts
      to support multi-range discard for virtio-blk. However, the virtio-blk
      disk may report max discard segment as 1, at least that is exactly what
      qemu is doing.
      
      So far, the block layer switches to normal request merging if the max
      discard segment limit is 1, so multiple bios can be merged into a single
      segment. This may cause memory corruption in
      virtblk_setup_discard_write_zeroes().
      
      Fix the issue by handling the single max discard segment case in a
      straightforward way.
      
      Fixes: 1f23816b ("virtio_blk: add discard and write zeroes support")
      Signed-off-by: Ming Lei <ming.lei@redhat.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Cc: Changpeng Liu <changpeng.liu@intel.com>
      Cc: Daniel Verkamp <dverkamp@chromium.org>
      Cc: Michael S. Tsirkin <mst@redhat.com>
      Cc: Stefan Hajnoczi <stefanha@redhat.com>
      Cc: Stefano Garzarella <sgarzare@redhat.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      af822aa6
  14. 01 Jul, 2020 1 commit
  15. 24 Jun, 2020 1 commit
  16. 02 May, 2020 1 commit
    • virtio-blk: handle block_device_operations callbacks after hot unplug · 90b5feb8
      Committed by Stefan Hajnoczi
      A userspace process holding a file descriptor to a virtio_blk device can
      still invoke block_device_operations after hot unplug.  This leads to a
      use-after-free accessing vblk->vdev in virtblk_getgeo() when
      ioctl(HDIO_GETGEO) is invoked:
      
        BUG: unable to handle kernel NULL pointer dereference at 0000000000000090
        IP: [<ffffffffc00e5450>] virtio_check_driver_offered_feature+0x10/0x90 [virtio]
        PGD 800000003a92f067 PUD 3a930067 PMD 0
        Oops: 0000 [#1] SMP
        CPU: 0 PID: 1310 Comm: hdio-getgeo Tainted: G           OE  ------------   3.10.0-1062.el7.x86_64 #1
        Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
        task: ffff9be5fbfb8000 ti: ffff9be5fa890000 task.ti: ffff9be5fa890000
        RIP: 0010:[<ffffffffc00e5450>]  [<ffffffffc00e5450>] virtio_check_driver_offered_feature+0x10/0x90 [virtio]
        RSP: 0018:ffff9be5fa893dc8  EFLAGS: 00010246
        RAX: ffff9be5fc3f3400 RBX: ffff9be5fa893e30 RCX: 0000000000000000
        RDX: 0000000000000000 RSI: 0000000000000004 RDI: ffff9be5fbc10b40
        RBP: ffff9be5fa893dc8 R08: 0000000000000301 R09: 0000000000000301
        R10: 0000000000000000 R11: 0000000000000000 R12: ffff9be5fdc24680
        R13: ffff9be5fbc10b40 R14: ffff9be5fbc10480 R15: 0000000000000000
        FS:  00007f1bfb968740(0000) GS:ffff9be5ffc00000(0000) knlGS:0000000000000000
        CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
        CR2: 0000000000000090 CR3: 000000003a894000 CR4: 0000000000360ff0
        DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
        DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
        Call Trace:
         [<ffffffffc016ac37>] virtblk_getgeo+0x47/0x110 [virtio_blk]
         [<ffffffff8d3f200d>] ? handle_mm_fault+0x39d/0x9b0
         [<ffffffff8d561265>] blkdev_ioctl+0x1f5/0xa20
         [<ffffffff8d488771>] block_ioctl+0x41/0x50
         [<ffffffff8d45d9e0>] do_vfs_ioctl+0x3a0/0x5a0
         [<ffffffff8d45dc81>] SyS_ioctl+0xa1/0xc0
      
      A related problem is that virtblk_remove() leaks the vd_index_ida index
      when something still holds a reference to vblk->disk during hot unplug.
      This causes virtio-blk device names to be lost (vda, vdb, etc).
      
      Fix these issues by protecting vblk->vdev with a mutex and reference
      counting vblk so the vd_index_ida index can be removed in all cases.
      
      Fixes: 48e4043d ("virtio: add virtio disk geometry feature")
      Reported-by: Lance Digby <ldigby@redhat.com>
      Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
      Link: https://lore.kernel.org/r/20200430140442.171016-1-stefanha@redhat.com
      Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
      Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
      90b5feb8
  17. 17 Apr, 2020 1 commit
  18. 19 Mar, 2020 1 commit
  19. 08 Mar, 2020 2 commits
  20. 06 Feb, 2020 1 commit
  21. 03 Jan, 2020 1 commit
  22. 21 May, 2019 1 commit
  23. 10 Apr, 2019 1 commit
    • virtio-blk: limit number of hw queues by nr_cpu_ids · bf348f9b
      Committed by Dongli Zhang
      When tag_set->nr_maps is 1, the block layer limits the number of hw queues
      by nr_cpu_ids. No matter how many hw queues are used by virtio-blk, as it
      has (tag_set->nr_maps == 1), it can use at most nr_cpu_ids hw queues.
      
      In addition, specifically for pci scenario, when the 'num-queues' specified
      by qemu is more than maxcpus, virtio-blk would not be able to allocate more
      than maxcpus vectors in order to have a vector for each queue. As a result,
      it falls back into MSI-X with one vector for config and one shared for
      queues.
      
      For the reasons above, this patch limits the number of hw queues used
      by virtio-blk to nr_cpu_ids.
      Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
      Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      bf348f9b
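The limit described in this commit message amounts to taking the smaller of the queue count requested by the device and the number of possible CPUs. A minimal sketch, assuming a hypothetical helper name `sketch_clamp_hw_queues` and plain parameters in place of the driver's own state:

```c
/*
 * Hypothetical model of the clamp: the driver never uses more hardware
 * queues than there are possible CPUs, regardless of how many queues the
 * device (for example QEMU's 'num-queues') advertises.
 */
static unsigned int sketch_clamp_hw_queues(unsigned int requested_vqs,
					   unsigned int nr_cpu_ids)
{
	return requested_vqs < nr_cpu_ids ? requested_vqs : nr_cpu_ids;
}
```

With 4 possible CPUs, a device advertising 8 queues would end up with 4, while a device advertising 2 queues keeps its 2.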
  24. 08 Apr, 2019 1 commit
  25. 07 Mar, 2019 1 commit
  26. 20 Dec, 2018 1 commit
  27. 30 Nov, 2018 1 commit
  28. 08 Nov, 2018 1 commit
  29. 28 Sep, 2018 2 commits
  30. 25 May, 2018 1 commit
  31. 14 May, 2018 1 commit
  32. 01 Feb, 2018 1 commit
    • virtio_blk: print capacity at probe time · daf2a501
      Committed by Stefan Hajnoczi
      Print the capacity of the block device when the driver is probed.  Many
      users expect this since SCSI disks (sd) do it.  Moreover, kernel dmesg
      output is the primary source of troubleshooting information so it's
      helpful to include the disk size there.
      
      The capacity is already printed by virtio_blk when a resize event
      occurs.  Extract the code and reuse it from virtblk_probe().
      
      This patch also adds the block device name to the message so it can be
      correlated with a specific device:
      
        virtio_blk virtio0: [vda] 20971520 512-byte logical blocks (10.7 GB/10.0 GiB)
      
      Cc: Rodrigo A B Freire <rfreire@redhat.com>
      Cc: Michael S. Tsirkin <mst@redhat.com>
      Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
      Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
      daf2a501
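The numbers in the sample message from this commit can be reproduced with a small userspace sketch. It is only an illustration of the arithmetic: the function name `sketch_format_capacity` is hypothetical, the units are fixed to GB and GiB rather than chosen automatically, and the "virtio_blk virtio0:" device prefix is left out.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical formatter: writes the block count, the logical block size,
 * and the total size in decimal (GB) and binary (GiB) units into buf.
 * Returns what snprintf() returns.
 */
static int sketch_format_capacity(char *buf, size_t len, const char *disk,
				  uint64_t nblocks, unsigned int block_size)
{
	uint64_t bytes = nblocks * block_size;
	double gb = (double)bytes / 1e9;                        /* 10^9 bytes */
	double gib = (double)bytes / (1024.0 * 1024.0 * 1024.0); /* 2^30 bytes */

	return snprintf(buf, len,
			"[%s] %" PRIu64 " %u-byte logical blocks (%.1f GB/%.1f GiB)",
			disk, nblocks, block_size, gb, gib);
}
```

For 20971520 blocks of 512 bytes the total is 10737418240 bytes, which is 10.7 GB and exactly 10.0 GiB, matching the figures in the message above.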
  33. 31 Jan, 2018 1 commit
    • blk-mq: introduce BLK_STS_DEV_RESOURCE · 86ff7c2a
      Committed by Ming Lei
      This status is returned from the driver to the block layer if a
      device-related resource is unavailable, but the driver can guarantee
      that IO dispatch will be triggered in the future when the resource
      becomes available.
      
      Convert some drivers to return BLK_STS_DEV_RESOURCE.  Also, if driver
      returns BLK_STS_RESOURCE and SCHED_RESTART is set, rerun queue after
      a delay (BLK_MQ_DELAY_QUEUE) to avoid IO stalls.  BLK_MQ_DELAY_QUEUE is
      3 ms because both scsi-mq and nvmefc are using that magic value.
      
      If a driver can make sure there is in-flight IO, it is safe to return
      BLK_STS_DEV_RESOURCE because:
      
      1) If all in-flight IOs complete before examining SCHED_RESTART in
      blk_mq_dispatch_rq_list(), SCHED_RESTART must be cleared, so queue
      is run immediately in this case by blk_mq_dispatch_rq_list();
      
      2) if there is any in-flight IO after/when examining SCHED_RESTART
      in blk_mq_dispatch_rq_list():
      - if SCHED_RESTART isn't set, queue is run immediately as handled in 1)
      - otherwise, this request will be dispatched after any in-flight IO is
        completed via blk_mq_sched_restart()
      
      3) if SCHED_RESTART is set concurrently in context because of
      BLK_STS_RESOURCE, blk_mq_delay_run_hw_queue() will cover the above two
      cases and make sure IO hang can be avoided.
      
      One invariant is that queue will be rerun if SCHED_RESTART is set.
      Suggested-by: Jens Axboe <axboe@kernel.dk>
      Tested-by: Laurence Oberman <loberman@redhat.com>
      Signed-off-by: Ming Lei <ming.lei@redhat.com>
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      86ff7c2a
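The rerun rules listed in this commit message can be summarised as a small decision function. This is a simplified model of the description only, not the blk-mq code: the enum and function names are hypothetical, and other conditions the real dispatch path considers are ignored.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the two "resource unavailable" statuses. */
enum sketch_status {
	SKETCH_STS_RESOURCE,     /* models BLK_STS_RESOURCE */
	SKETCH_STS_DEV_RESOURCE  /* models BLK_STS_DEV_RESOURCE */
};

/* What the block layer does next in this simplified model. */
enum sketch_action {
	SKETCH_RUN_NOW,             /* rerun the queue immediately */
	SKETCH_RUN_DELAYED,         /* rerun after BLK_MQ_DELAY_QUEUE (3 ms) */
	SKETCH_WAIT_FOR_COMPLETION  /* rely on in-flight IO completing */
};

static enum sketch_action
sketch_after_dispatch_failure(enum sketch_status sts, bool sched_restart_set)
{
	/* SCHED_RESTART not set: the queue is rerun immediately. */
	if (!sched_restart_set)
		return SKETCH_RUN_NOW;

	/* SCHED_RESTART set and generic resource shortage: delayed rerun. */
	if (sts == SKETCH_STS_RESOURCE)
		return SKETCH_RUN_DELAYED;

	/* Device resource shortage: a completion restarts the queue. */
	return SKETCH_WAIT_FOR_COMPLETION;
}
```

The last branch is why a driver should return BLK_STS_DEV_RESOURCE only when it can be sure that IO is in flight: in this model nothing else will rerun the queue.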
  34. 27 Oct, 2017 1 commit