1. 11 Jan 2017, 1 commit
    • virtio_blk: avoid DMA to stack for the sense buffer · a14d749f
      Authored by Christoph Hellwig
      Most users of BLOCK_PC requests allocate the sense buffer on the stack,
      so to avoid DMA to the stack copy the sense data into a field in the
      heap-allocated virtblk_req structure.  Without that, any attempt at SCSI
      passthrough I/O, including the SG_IO ioctl from userspace, will crash
      the kernel.  Note that this includes running tools like hdparm even when
      the host does not have SCSI passthrough enabled.  (A sketch of the
      approach follows this entry.)
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Cc: stable@vger.kernel.org # v4.9+
      Signed-off-by: Jens Axboe <axboe@fb.com>
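      A minimal sketch of the idea above, assuming a simplified virtblk_req
      layout (the real structure carries more fields; names here are
      illustrative): the sense buffer the device DMAs into lives in the
      heap-allocated per-request structure, and its contents are copied to
      the caller's (possibly stack-allocated) buffer on completion.

          #define VIRTBLK_SENSE_SIZE 96   /* matches SCSI_SENSE_BUFFERSIZE */

          struct virtblk_req_sketch {
                  /* ... virtio request header, status byte, scatterlists ... */
                  unsigned char sense[VIRTBLK_SENSE_SIZE];  /* heap-resident DMA target */
          };

          /* On completion, hand the DMA'd sense data back to the BLOCK_PC
           * request's sense pointer instead of letting the device write to
           * the stack directly. */
          static void virtblk_copy_sense_sketch(struct virtblk_req_sketch *vbr,
                                                unsigned char *req_sense)
          {
                  memcpy(req_sense, vbr->sense, VIRTBLK_SENSE_SIZE);
          }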
  2. 31 Oct 2016, 2 commits
  3. 15 Sep 2016, 1 commit
  4. 09 Aug 2016, 1 commit
  5. 21 Jul 2016, 2 commits
  6. 28 Jun 2016, 1 commit
    • block: convert to device_add_disk() · 0d52c756
      Authored by Dan Williams
      For block drivers that specify a parent device, convert them to use
      device_add_disk().
      
      This conversion was done with the following semantic patch:
      
          @@
          struct gendisk *disk;
          expression E;
          @@
      
          - disk->driverfs_dev = E;
          ...
          - add_disk(disk);
          + device_add_disk(E, disk);
      
          @@
          struct gendisk *disk;
          expression E1, E2;
          @@
      
          - disk->driverfs_dev = E1;
          ...
          E2 = disk;
          ...
          - add_disk(E2);
          + device_add_disk(E1, E2);
      
      ...plus some manual fixups for a few missed conversions.  (An
      illustrative before/after appears after this entry.)
      
      Cc: Jens Axboe <axboe@fb.com>
      Cc: Keith Busch <keith.busch@intel.com>
      Cc: Michael S. Tsirkin <mst@redhat.com>
      Cc: David Woodhouse <dwmw2@infradead.org>
      Cc: David S. Miller <davem@davemloft.net>
      Cc: James Bottomley <James.Bottomley@hansenpartnership.com>
      Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
      Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Cc: Martin K. Petersen <martin.petersen@oracle.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
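      An illustrative before/after of the conversion for a virtio_blk-style
      probe path (variable names are assumptions, not the exact driver code);
      this is the first rule of the semantic patch above, applied by hand.

          /* Before: parent device attached via driverfs_dev, then add_disk() */
          vblk->disk->driverfs_dev = &vdev->dev;
          add_disk(vblk->disk);

          /* After: the parent is passed directly, and device_add_disk()
           * performs the registration in one step. */
          device_add_disk(&vdev->dev, vblk->disk);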
  7. 08 Jun 2016, 1 commit
  8. 13 Apr 2016, 1 commit
  9. 02 Mar 2016, 1 commit
  10. 01 Oct 2015, 1 commit
  11. 08 Sep 2015, 2 commits
  12. 06 May 2015, 1 commit
  13. 11 Apr 2015, 1 commit
    • sd, mmc, virtio_blk, string_helpers: fix block size units · b9f28d86
      Authored by James Bottomley
      The current string_get_size() overflows when the device size goes over
      2^64 bytes because the string helper routine computes the suffix from
      the size in bytes.  However, the entirety of Linux thinks in terms of
      blocks, not bytes, so this will artificially induce an overflow on very
      large devices.  Fix this by making the function string_get_size() take
      blocks and the block size instead of bytes.  This should allow us to
      keep working until the current SCSI standard overflows.
      
      Also fix virtio_blk and mmc (both of which were also artificially
      multiplying by the block size to pass a byte size to string_get_size()).
      
      The mathematics of this is pretty simple: we're taking a product of
      size in blocks (S) and block size (B) and trying to re-express this in
      exponential form: S*B = R*N^E (where the base N is either 1000 or
      1024) and R < N.  Mathematically, S = RS*N^ES and B = RB*N^EB, so if
      RS*RB < N it's easy to see that S*B = RS*RB*N^(ES+EB).  However, if
      RS*RB > N, this can be re-expressed as RS*RB = R*N (where R =
      RS*RB/N < N), so the whole expression becomes R*N^(ES+EB+1).  (A
      sketch of the resulting caller change follows this entry.)
      
      [jejb: fix incorrect 32 bit do_div spotted by kbuild test robot <fengguang.wu@intel.com>]
      Acked-by: Ulf Hansson <ulf.hansson@linaro.org>
      Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: James Bottomley <JBottomley@Odin.com>
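      A sketch of the caller change referred to above (virtio_blk-style; the
      variable names are assumptions, and 'capacity' is a count of 512-byte
      sectors):

          char cap_str[10];

          /* Before: the caller pre-multiplied into bytes, which can overflow
           * a 64-bit value for very large devices. */
          string_get_size(capacity << 9, STRING_UNITS_2,
                          cap_str, sizeof(cap_str));

          /* After: pass block count and block size separately; the helper
           * does the exponent arithmetic without ever forming the full
           * byte count. */
          string_get_size(capacity, 512, STRING_UNITS_2,
                          cap_str, sizeof(cap_str));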
  14. 21 Jan 2015, 2 commits
  15. 03 Jan 2015, 1 commit
  16. 09 Dec 2014, 4 commits
  17. 11 Nov 2014, 1 commit
  18. 30 Oct 2014, 1 commit
    • blk-mq: add a 'list' parameter to ->queue_rq() · 74c45052
      Authored by Jens Axboe
      Since we have the notion of a 'last' request in a chain, we can use
      this to have the hardware optimize the issuing of requests. Add
      a list_head parameter to queue_rq that the driver can use to
      temporarily store hw commands for issue when 'last' is true. If we
      are doing a chain of requests, pass in a NULL list for the first
      request to force issue of that immediately, then batch the remainder
      for deferred issue until the last request has been sent.
      
      Instead of adding yet another argument to the hot ->queue_rq path,
      encapsulate the passed arguments in a blk_mq_queue_data structure.
      This is passed as a constant, and has been tested as faster than
      passing 4 (or even 3) args through ->queue_rq.  Update drivers for
      the new ->queue_rq() prototype.  There are no functional changes in
      this patch for drivers: if they don't use the passed-in list, they
      will just queue requests individually, as before.  (A sketch of the
      new hook appears after this entry.)
      Signed-off-by: Jens Axboe <axboe@fb.com>
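      A sketch of the wrapper and the driver-side pattern described above.
      The struct layout reflects this era of blk-mq and changed in later
      kernels; my_stage_cmd()/my_issue_staged() are hypothetical driver
      helpers.

          struct blk_mq_queue_data {
                  struct request *rq;
                  struct list_head *list; /* staging list; NULL forces immediate issue */
                  bool last;              /* true for the final request of a chain */
          };

          static int my_queue_rq(struct blk_mq_hw_ctx *hctx,
                                 const struct blk_mq_queue_data *bd)
          {
                  /* Stage the command; only kick the hardware on the last
                   * request of the chain so submissions can be batched. */
                  my_stage_cmd(hctx->driver_data, bd->rq);
                  if (bd->last)
                          my_issue_staged(hctx->driver_data);
                  return BLK_MQ_RQ_QUEUE_OK;
          }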
  19. 15 Oct 2014, 4 commits
  20. 23 Sep 2014, 3 commits
  21. 02 Jul 2014, 1 commit
  22. 30 May 2014, 1 commit
    • block: virtio_blk: don't hold spin lock during world switch · e8edca6f
      Authored by Ming Lei
      Firstly, it isn't necessary to hold vblk->vq_lock while notifying
      the hypervisor about queued I/O.
      
      Secondly, virtqueue_notify() causes a world switch, which may take a
      long time on some hypervisors (such as qemu-arm), so it isn't good to
      hold the lock and block other vCPUs.  (See the sketch after this
      entry for the resulting lock/notify ordering.)
      
      On an arm64 quad-core VM (qemu-kvm), the patch increases I/O
      performance considerably with VIRTIO_RING_F_EVENT_IDX enabled:
      	- without the patch: 14K IOPS
      	- with the patch: 34K IOPS
      
      fio script:
      	[global]
      	direct=1
      	bsrange=4k-4k
      	timeout=10
      	numjobs=4
      	ioengine=libaio
      	iodepth=64
      
      	filename=/dev/vdc
      	group_reporting=1
      
      	[f1]
      	rw=randread
      
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: "Michael S. Tsirkin" <mst@redhat.com>
      Cc: virtualization@lists.linux-foundation.org
      Signed-off-by: Ming Lei <ming.lei@canonical.com>
      Acked-by: Rusty Russell <rusty@rustcorp.com.au>
      Cc: stable@kernel.org # 3.13+
      Signed-off-by: Jens Axboe <axboe@fb.com>
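      A sketch of the lock/notify ordering described above, assuming a
      simplified vblk with a vq_lock and a single virtqueue: the kick is
      prepared while the lock protects the virtqueue, but the notification
      (the world switch) happens only after the lock is dropped.

          static void virtblk_add_and_notify_sketch(struct virtio_blk_sketch *vblk)
          {
                  unsigned long flags;
                  bool notify;

                  spin_lock_irqsave(&vblk->vq_lock, flags);
                  /* ... virtqueue_add_sgs() for the request ... */
                  notify = virtqueue_kick_prepare(vblk->vq);
                  spin_unlock_irqrestore(&vblk->vq_lock, flags);

                  if (notify)
                          virtqueue_notify(vblk->vq); /* may trap to the hypervisor */
          }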
  23. 29 May 2014, 1 commit
  24. 27 May 2014, 1 commit
    • virtio_blk: fix race between start and stop queue · aa0818c6
      Authored by Ming Lei
      When there aren't enough vring descriptors to add a request to the
      vq, the blk-mq queue is put into the stopped state until some of the
      pending descriptors are completed and freed.
      
      Unfortunately, the vq's interrupt may arrive just before blk-mq's
      BLK_MQ_S_STOPPED flag is set, so the queue is still treated as
      stopped even though lots of descriptors have been completed and
      freed in the interrupt handler.  The worst case is that all pending
      descriptors are freed in the interrupt handler, and the queue is
      kept stopped forever.
      
      This patch fixes the problem by starting/stopping blk-mq while
      holding vq_lock (see the sketch after this entry).
      
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Signed-off-by: Ming Lei <tom.leiming@gmail.com>
      Cc: stable@kernel.org
      Signed-off-by: Jens Axboe <axboe@fb.com>
      
      Conflicts:
      	drivers/block/virtio_blk.c
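      A sketch of the fix described above (it applies equally to the
      duplicate entry for commit 0c29e93e below; names are simplified and
      assumed): both the stop on the submission path and the restart in the
      completion handler now happen under vq_lock, so the interrupt can no
      longer observe a half-stopped queue.

          /* Submission path: out of descriptors, stop the hw queue under vq_lock. */
          spin_lock_irqsave(&vblk->vq_lock, flags);
          err = virtqueue_add_sgs(vblk->vq, sgs, num_out, num_in, vbr, GFP_ATOMIC);
          if (err) {
                  blk_mq_stop_hw_queue(hctx);     /* still under vq_lock */
                  virtqueue_kick(vblk->vq);
          }
          spin_unlock_irqrestore(&vblk->vq_lock, flags);

          /* Completion path (interrupt handler): restart under the same lock. */
          spin_lock_irqsave(&vblk->vq_lock, flags);
          /* ... reap completed requests from the virtqueue ... */
          blk_mq_start_stopped_hw_queues(vblk->disk->queue, true);
          spin_unlock_irqrestore(&vblk->vq_lock, flags);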
  25. 16 May 2014, 1 commit
    • virtio_blk: fix race between start and stop queue · 0c29e93e
      Authored by Ming Lei
      When there aren't enough vring descriptors to add a request to the
      vq, the blk-mq queue is put into the stopped state until some of the
      pending descriptors are completed and freed.
      
      Unfortunately, the vq's interrupt may arrive just before blk-mq's
      BLK_MQ_S_STOPPED flag is set, so the queue is still treated as
      stopped even though lots of descriptors have been completed and
      freed in the interrupt handler.  The worst case is that all pending
      descriptors are freed in the interrupt handler, and the queue is
      kept stopped forever.
      
      This patch fixes the problem by starting/stopping blk-mq while
      holding vq_lock.
      
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Signed-off-by: Ming Lei <tom.leiming@gmail.com>
      Signed-off-by: Jens Axboe <axboe@fb.com>
  26. 17 Apr 2014, 1 commit
  27. 16 Apr 2014, 2 commits
    • blk-mq: split out tag initialization, support shared tags · 24d2f903
      Authored by Christoph Hellwig
      Add a new blk_mq_tag_set structure that gets set up before we initialize
      the queue.  A single blk_mq_tag_set structure can be shared by multiple
      queues (a setup sketch follows this entry).
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      
      Modular export of blk_mq_{alloc,free}_tagset added by me.
      Signed-off-by: Jens Axboe <axboe@fb.com>
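      A setup sketch for the new flow, assuming a driver-defined my_mq_ops
      and a per-request struct my_cmd (field values are illustrative, not a
      definitive driver implementation):

          static struct blk_mq_tag_set tag_set;

          static int setup_queue_sketch(void *driver_data)
          {
                  struct request_queue *q;
                  int err;

                  memset(&tag_set, 0, sizeof(tag_set));
                  tag_set.ops = &my_mq_ops;           /* driver's blk_mq_ops (assumed) */
                  tag_set.nr_hw_queues = 1;
                  tag_set.queue_depth = 64;
                  tag_set.numa_node = NUMA_NO_NODE;
                  tag_set.cmd_size = sizeof(struct my_cmd); /* per-request payload */
                  tag_set.driver_data = driver_data;

                  err = blk_mq_alloc_tag_set(&tag_set);
                  if (err)
                          return err;

                  q = blk_mq_init_queue(&tag_set);    /* queue shares the tag set */
                  if (IS_ERR(q)) {
                          blk_mq_free_tag_set(&tag_set);
                          return PTR_ERR(q);
                  }
                  return 0;
          }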
    • blk-mq: add ->init_request and ->exit_request methods · e9b267d9
      Authored by Christoph Hellwig
      The current blk_mq_init_commands/blk_mq_free_commands interface has
      two problems:
      
       1) Because only the constructor is passed to blk_mq_init_commands there
          is no easy way to clean up when a command initialization failed.  The
          current code simply leaks the allocations done in the constructor.
      
       2) There is no good place to call blk_mq_free_commands: before
          blk_cleanup_queue there is no guarantee that all outstanding
          commands have completed, so we can't free them yet.  After
          blk_cleanup_queue the queue has usually been freed.  This can be
          worked around by grabbing an unconditional reference before calling
          blk_cleanup_queue and dropping it after blk_mq_free_commands is
          done, although that's not exactly pretty and driver writers are
          guaranteed to get it wrong sooner or later.
      
      Both issues are easily fixed by making the request constructor and
      destructor normal blk_mq_ops methods (a sketch of the two hooks
      follows this entry).
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Jens Axboe <axboe@fb.com>
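      A sketch of the two hooks as blk_mq_ops methods, using the signatures
      of this era (later kernels changed them); alloc_dma_buffer() and
      free_dma_buffer() are hypothetical per-request setup helpers, and
      my_cmd is an assumed payload living in the cmd_size area.

          static int my_init_request(void *data, struct request *rq,
                                     unsigned int hctx_idx, unsigned int request_idx,
                                     unsigned int numa_node)
          {
                  struct my_cmd *cmd = blk_mq_rq_to_pdu(rq);

                  /* A failed constructor is now cleanly unwound by blk-mq. */
                  cmd->dma_buf = alloc_dma_buffer(data, numa_node); /* hypothetical */
                  return cmd->dma_buf ? 0 : -ENOMEM;
          }

          static void my_exit_request(void *data, struct request *rq,
                                      unsigned int hctx_idx, unsigned int request_idx)
          {
                  struct my_cmd *cmd = blk_mq_rq_to_pdu(rq);

                  free_dma_buffer(data, cmd->dma_buf); /* hypothetical */
          }

          static struct blk_mq_ops my_mq_ops = {
                  .queue_rq     = my_queue_rq,     /* from the earlier sketch */
                  .init_request = my_init_request,
                  .exit_request = my_exit_request,
          };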