1. 09 Dec, 2014 4 commits
  2. 15 Oct, 2014 4 commits
  3. 23 Sep, 2014 3 commits
  4. 02 Jul, 2014 1 commit
  5. 30 May, 2014 1 commit
    • block: virtio_blk: don't hold spin lock during world switch · e8edca6f
      Committed by Ming Lei
      First, it isn't necessary to hold vblk->vq_lock
      when notifying the hypervisor about queued I/O.
      
      Second, virtqueue_notify() causes a world switch, which
      may take a long time on some hypervisors (such as qemu-arm),
      so it isn't good to hold the lock and block other vCPUs.
      
      On an arm64 quad-core VM (qemu-kvm), the patch increases I/O
      performance a lot with VIRTIO_RING_F_EVENT_IDX enabled:
      	- without the patch: 14K IOPS
      	- with the patch: 34K IOPS
      
      fio script:
      	[global]
      	direct=1
      	bsrange=4k-4k
      	timeout=10
      	numjobs=4
      	ioengine=libaio
      	iodepth=64
      
      	filename=/dev/vdc
      	group_reporting=1
      
      	[f1]
      	rw=randread
      
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: "Michael S. Tsirkin" <mst@redhat.com>
      Cc: virtualization@lists.linux-foundation.org
      Signed-off-by: Ming Lei <ming.lei@canonical.com>
      Acked-by: Rusty Russell <rusty@rustcorp.com.au>
      Cc: stable@kernel.org # 3.13+
      Signed-off-by: Jens Axboe <axboe@fb.com>
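A minimal Python sketch of the locking pattern this commit describes, not the driver code itself: the decision to notify is made under the lock, and the expensive notification runs only after the lock is dropped. `ToyVirtqueue`, `queue_request`, and `lock_held_during_notify` are hypothetical names used for illustration; the kernel helpers that play these roles are `virtqueue_kick_prepare()` and `virtqueue_notify()`.

```python
import threading


class ToyVirtqueue:
    """Toy stand-in for a virtqueue (hypothetical, not the kernel API)."""

    def __init__(self):
        self.lock = threading.Lock()        # plays the role of vblk->vq_lock
        self.pending = []                   # queued but not yet announced
        self.lock_held_during_notify = []   # lock state seen by each notify

    def add_request(self, req):
        # Caller must hold self.lock.
        self.pending.append(req)

    def kick_prepare(self):
        # Caller must hold self.lock. Decides whether a notify is needed.
        need_notify = len(self.pending) > 0
        self.pending.clear()
        return need_notify

    def notify(self):
        # Models the slow "world switch"; record whether the lock is held.
        self.lock_held_during_notify.append(self.lock.locked())


def queue_request(vq, req):
    with vq.lock:
        vq.add_request(req)
        need_notify = vq.kick_prepare()
    # The lock is released here, so other vCPUs can queue I/O
    # while the hypervisor is being notified.
    if need_notify:
        vq.notify()
```

Calling `queue_request(vq, "r1")` on a fresh `ToyVirtqueue` records `False` in `lock_held_during_notify`, meaning the notification ran without the lock.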
  6. 29 May, 2014 1 commit
  7. 27 May, 2014 1 commit
    • virtio_blk: fix race between start and stop queue · aa0818c6
      Committed by Ming Lei
      When there aren't enough vring descriptors to add to the vq,
      blk-mq is put into the stopped state until some of the pending
      descriptors are completed and freed.
      
      Unfortunately, the vq's interrupt may arrive just before
      blk-mq's BLK_MQ_S_STOPPED flag is set, so blk-mq will
      stay stopped even though many descriptors were
      completed and freed in the interrupt handler. In the worst
      case, all pending descriptors are freed in the
      interrupt handler and the queue stays stopped forever.
      
      This patch fixes the problem by starting and stopping blk-mq
      while holding vq_lock.
      
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Signed-off-by: Ming Lei <tom.leiming@gmail.com>
      Cc: stable@kernel.org
      Signed-off-by: Jens Axboe <axboe@fb.com>
      
      Conflicts:
      	drivers/block/virtio_blk.c
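The race described in this commit can be modelled as a deterministic interleaving. The sketch below is a simplified Python model, not the driver code: `ToyQueue`, `submit_unlocked`, and `submit_locked` are hypothetical names, and the interrupt is represented by an ordinary function call placed at the point where it would run.

```python
class ToyQueue:
    """Hypothetical model of a blk-mq queue backed by a vring."""

    def __init__(self, ring_size):
        self.ring_size = ring_size
        self.free = 0          # ring is full: no free descriptors
        self.stopped = False   # plays the role of BLK_MQ_S_STOPPED


def completion_interrupt(q):
    # Frees every pending descriptor and restarts the queue.
    q.free = q.ring_size
    q.stopped = False


def submit_unlocked(q):
    # No lock: the interrupt can slip in between the "ring full"
    # check and the point where the stopped flag is set.
    if q.free == 0:
        completion_interrupt(q)   # interrupt arrives here
        q.stopped = True          # set too late; nothing restarts the queue
    return q.stopped


def submit_locked(q):
    # With vq_lock held, check-and-stop cannot be interleaved with the
    # handler, so the interrupt is handled only after the flag is set.
    if q.free == 0:
        q.stopped = True
    completion_interrupt(q)       # handler runs once the lock is released
    return q.stopped
```

In the unlocked ordering the queue ends up stopped with a completely free ring, which is the "stopped forever" case; in the locked ordering the handler sees the flag and clears it.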
  8. 16 May, 2014 1 commit
    • virtio_blk: fix race between start and stop queue · 0c29e93e
      Committed by Ming Lei
      When there aren't enough vring descriptors to add to the vq,
      blk-mq is put into the stopped state until some of the pending
      descriptors are completed and freed.
      
      Unfortunately, the vq's interrupt may arrive just before
      blk-mq's BLK_MQ_S_STOPPED flag is set, so blk-mq will
      stay stopped even though many descriptors were
      completed and freed in the interrupt handler. In the worst
      case, all pending descriptors are freed in the
      interrupt handler and the queue stays stopped forever.
      
      This patch fixes the problem by starting and stopping blk-mq
      while holding vq_lock.
      
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Signed-off-by: Ming Lei <tom.leiming@gmail.com>
      Signed-off-by: Jens Axboe <axboe@fb.com>
  9. 17 Apr, 2014 1 commit
  10. 16 Apr, 2014 3 commits
    • blk-mq: split out tag initialization, support shared tags · 24d2f903
      Committed by Christoph Hellwig
      Add a new blk_mq_tag_set structure that gets set up before we initialize
      the queue.  A single blk_mq_tag_set structure can be shared by multiple
      queues.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      
      Modular export of blk_mq_{alloc,free}_tagset added by me.
      Signed-off-by: Jens Axboe <axboe@fb.com>
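To illustrate the idea of a tag set that exists before any queue and can be shared by several of them, here is a small Python sketch. It is only a conceptual model: `ToyTagSet` and `ToyQueue` are hypothetical names and do not reflect the fields or functions of the real `blk_mq_tag_set`.

```python
class ToyTagSet:
    """Hypothetical pool of request tags, created before any queue."""

    def __init__(self, queue_depth):
        self.free_tags = list(range(queue_depth))

    def get_tag(self):
        # Returns a free tag, or None when the shared pool is exhausted.
        return self.free_tags.pop() if self.free_tags else None

    def put_tag(self, tag):
        self.free_tags.append(tag)


class ToyQueue:
    """Hypothetical request queue that borrows tags from a tag set."""

    def __init__(self, name, tag_set):
        self.name = name
        self.tag_set = tag_set   # may be the same object for several queues

    def start_request(self):
        return self.tag_set.get_tag()

    def end_request(self, tag):
        self.tag_set.put_tag(tag)
```

When two `ToyQueue` objects are built on one `ToyTagSet` of depth 2, they compete for the same two tags, which is the sharing the commit message refers to.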
    • blk-mq: add ->init_request and ->exit_request methods · e9b267d9
      Committed by Christoph Hellwig
      The current blk_mq_init_commands/blk_mq_free_commands interface has
      two problems:
      
       1) Because only the constructor is passed to blk_mq_init_commands there
          is no easy way to clean up when a command initialization failed.  The
          current code simply leaks the allocations done in the constructor.
      
       2) There is no good place to call blk_mq_free_commands: before
          blk_cleanup_queue there is no guarantee that all outstanding
          commands have completed, so we can't free them yet.  After
          blk_cleanup_queue the queue has usually been freed.  This can be
          worked around by grabbing an unconditional reference before calling
          blk_cleanup_queue and dropping it after blk_mq_free_commands is
          done, although that's not exactly pretty and driver writers are
          guaranteed to get it wrong sooner or later.
      
      Both issues are easily fixed by making the request constructor and
      destructor normal blk_mq_ops methods.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Jens Axboe <axboe@fb.com>
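The first problem above (no way to clean up after a failed constructor) goes away once the constructor and destructor are a matched pair. The Python sketch below shows that pairing in the abstract; `init_all_requests` and the callback signatures are hypothetical and are not the blk-mq API.

```python
def init_all_requests(requests, init_request, exit_request):
    """Initialise every request; undo the completed ones on failure.

    init_request(rq) may raise to signal failure.
    exit_request(rq) releases whatever init_request(rq) set up.
    """
    initialised = []
    try:
        for rq in requests:
            init_request(rq)
            initialised.append(rq)
    except Exception:
        # Unwind in reverse order so nothing allocated so far is leaked.
        for rq in reversed(initialised):
            exit_request(rq)
        raise
    return initialised
```

Because the framework owns both callbacks, it can also call the destructor at the right point during teardown, which addresses the second problem without the driver taking an extra queue reference.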
    • blk-mq: do not initialize req->special · 9d74e257
      Committed by Christoph Hellwig
      Drivers can reach their private data easily using the blk_mq_rq_to_pdu
      helper and don't need req->special.  By not initializing it code can
      be simplified nicely, and we also shave off a few more instructions from
      the I/O path.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Jens Axboe <axboe@fb.com>
  11. 24 Mar, 2014 1 commit
    • virtio-blk: base queue-depth on virtqueue ringsize or module param · fc4324b4
      Committed by Rusty Russell
      Venkatesh spake thus:
      
        virtio-blk set the default queue depth to 64 requests, which was
        insufficient for high-IOPS devices. Instead set the blk-queue depth to
        the device's virtqueue depth divided by two (each I/O requires at least
        two VQ entries).
      
      But behold, Ted added a module parameter:
      
        Also allow the queue depth to be something which can be set at module
        load time or via a kernel boot-time parameter, for
        testing/benchmarking purposes.
      
      And I rewrote it substantially, mainly to take
      VIRTIO_RING_F_INDIRECT_DESC into account.
      
      As QEMU sets the vq size for PCI to 128, Venkatesh's patch wouldn't
      have made a change.  This version does (since QEMU also offers
      VIRTIO_RING_F_INDIRECT_DESC).
      Inspired-by: "Theodore Ts'o" <tytso@mit.edu>
      Based-on-the-true-story-of: Venkatesh Srinivas <venkateshs@google.com>
      Cc: "Michael S. Tsirkin" <mst@redhat.com>
      Cc: virtio-dev@lists.oasis-open.org
      Cc: virtualization@lists.linux-foundation.org
      Cc: Frank Swiderski <fes@google.com>
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
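The sizing rule in this message can be worked through with a short Python sketch. It is an illustration of the arithmetic as described above, not the driver code; `default_queue_depth` and its parameter names are hypothetical.

```python
def default_queue_depth(ring_size, has_indirect_desc, module_param=0):
    """Pick a blk queue depth from the virtqueue ring size.

    module_param: a non-zero value overrides the automatic choice,
    mirroring the load-time / boot-time parameter mentioned above.
    """
    if module_param:
        return module_param
    depth = ring_size
    if not has_indirect_desc:
        # Without indirect descriptors each I/O needs at least
        # two ring entries, so only half the ring is usable.
        depth //= 2
    return depth
```

With QEMU's PCI ring size of 128, halving gives 64, which equals the old default and explains why the original proposal would have changed nothing. With VIRTIO_RING_F_INDIRECT_DESC offered, the sketch yields 128.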
  12. 15 Mar, 2014 1 commit
    • blk-mq: allow blk_mq_init_commands() to return failure · 95363efd
      Committed by Jens Axboe
      If drivers do dynamic allocation in the hardware command init
      path, then we need to be able to handle and return failures.
      
      And if they do allocations or mappings in the init command path,
      then we need a cleanup function to free up that space at exit
      time. So add blk_mq_free_commands() as the cleanup function.
      
      This is required for the mtip32xx driver conversion to blk-mq.
      Signed-off-by: Jens Axboe <axboe@fb.com>
  13. 13 Mar, 2014 1 commit
  14. 11 Feb, 2014 1 commit
  15. 20 Nov, 2013 1 commit
  16. 14 Nov, 2013 1 commit
  17. 29 Oct, 2013 1 commit
  18. 17 Oct, 2013 1 commit
  19. 23 Sep, 2013 1 commit
  20. 20 May, 2013 1 commit
  21. 20 Mar, 2013 4 commits
  22. 12 Mar, 2013 1 commit
    • virtio-blk: emit udev event when device is resized · 9d9598b8
      Committed by Milos Vyletel
      When a virtio-blk device is resized from the host (using block_resize from QEMU), emit
      a KOBJ_CHANGE uevent to notify the guest about the change. This allows users to have
      custom udev rules that take action when such an event occurs. As a
      proof of concept I've created simple udev rules that automatically resize the
      filesystem on a virtio-blk device.
      
      ACTION=="change", KERNEL=="vd*", \
              ENV{RESIZE}=="1", \
              ENV{ID_FS_TYPE}=="ext[3-4]", \
              RUN+="/sbin/resize2fs /dev/%k"
      ACTION=="change", KERNEL=="vd*", \
              ENV{RESIZE}=="1", \
              ENV{ID_FS_TYPE}=="LVM2_member", \
              RUN+="/sbin/pvresize /dev/%k"
      Signed-off-by: Milos Vyletel <milos.vyletel@sde.cz>
      Tested-by: Asias He <asias@redhat.com>
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (minor simplification)
  23. 04 Jan, 2013 1 commit
    • Drivers: block: remove __dev* attributes. · 8d85fce7
      Committed by Greg Kroah-Hartman
      CONFIG_HOTPLUG is going away as an option.  As a result, the __dev*
      markings need to be removed.
      
      This change removes the use of __devinit, __devexit_p, __devinitdata,
      __devinitconst, and __devexit from these drivers.
      
      Based on patches originally written by Bill Pemberton, but redone by me
      in order to handle some of the coding style issues better, by hand.
      
      Cc: Bill Pemberton <wfp5p@virginia.edu>
      Cc: Mike Miller <mike.miller@hp.com>
      Cc: Chirag Kantharia <chirag.kantharia@hp.com>
      Cc: Geoff Levand <geoff@infradead.org>
      Cc: Jim Paris <jim@jtan.com>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: "Michael S. Tsirkin" <mst@redhat.com>
      Cc: Grant Likely <grant.likely@secretlab.ca>
      Cc: Matthew Wilcox <matthew.r.wilcox@intel.com>
      Cc: Keith Busch <keith.busch@intel.com>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: NeilBrown <neilb@suse.de>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Tao Guo <Tao.Guo@emc.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
  24. 02 Jan, 2013 1 commit
    • virtio-blk: Don't free ida when disk is in use · f4953fe6
      Committed by Alexander Graf
      When a file system is mounted on a virtio-blk disk and we then remove
      the disk and reattach it, the reattached disk gets the same disk name and
      ids as the hot-removed one.
      
      This leads to very nasty effects - mostly rendering the newly attached
      device completely unusable.
      
      Trying what happens when I do the same thing with a USB device, I saw
      that the sd node simply doesn't get free'd when a device gets forcefully
      removed.
      
      Imitate the same behavior for vd devices. This way broken vd devices
      simply are never free'd and newly attached ones keep working just fine.
      Signed-off-by: Alexander Graf <agraf@suse.de>
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
      Cc: stable@kernel.org
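The behaviour described here, keeping an index reserved while the removed disk is still referenced, can be sketched in Python as follows. This is a conceptual model under stated assumptions: `ToyIndexAllocator`, `remove_disk`, and the explicit `refcount` argument are hypothetical, and the `vd` naming is simplified to single letters.

```python
class ToyIndexAllocator:
    """Hypothetical lowest-free-index allocator for disk numbering."""

    def __init__(self):
        self.in_use = set()

    def alloc(self):
        index = 0
        while index in self.in_use:
            index += 1
        self.in_use.add(index)
        return index

    def free(self, index):
        self.in_use.discard(index)


def disk_name(index):
    # Simplified naming: 0 -> vda, 1 -> vdb (valid for index < 26 only).
    return "vd" + chr(ord("a") + index)


def remove_disk(allocator, index, refcount):
    # Release the index only if the caller holds the last reference.
    # A mounted file system keeps an extra reference, so the index
    # (and therefore the name) stays reserved and is never reused.
    if refcount == 1:
        allocator.free(index)
```

Removing a mounted `vda` (reference count above 1) and attaching a new disk then yields `vdb` instead of a second `vda`.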
  25. 28 Sep, 2012 3 commits
    • virtio-blk: Disable callback in virtblk_done() · bb811108
      Committed by Asias He
      This reduces unnecessary interrupts that the host could send to the guest while
      the guest is in the process of irq handling.
      
      If one vcpu is handling the irq, while another interrupt comes, in
      handle_edge_irq(), the guest will mask the interrupt via mask_msi_irq()
      which is a very heavy operation that goes all the way down to host.
      
       Here are some performance numbers on qemu:
      
       Before:
       -------------------------------------
         seq-read  : io=0 B, bw=269730KB/s, iops=67432 , runt= 62200msec
         seq-write : io=0 B, bw=339716KB/s, iops=84929 , runt= 49386msec
         rand-read : io=0 B, bw=270435KB/s, iops=67608 , runt= 62038msec
         rand-write: io=0 B, bw=354436KB/s, iops=88608 , runt= 47335msec
           clat (usec): min=101 , max=138052 , avg=14822.09, stdev=11771.01
           clat (usec): min=96 , max=81543 , avg=11798.94, stdev=7735.60
           clat (usec): min=128 , max=140043 , avg=14835.85, stdev=11765.33
           clat (usec): min=109 , max=147207 , avg=11337.09, stdev=5990.35
         cpu          : usr=15.93%, sys=60.37%, ctx=7764972, majf=0, minf=54
         cpu          : usr=32.73%, sys=120.49%, ctx=7372945, majf=0, minf=1
         cpu          : usr=18.84%, sys=58.18%, ctx=7775420, majf=0, minf=1
         cpu          : usr=24.20%, sys=59.85%, ctx=8307886, majf=0, minf=0
         vdb: ios=8389107/8368136, merge=0/0, ticks=19457874/14616506,
       in_queue=34206098, util=99.68%
        43: interrupt in total: 887320
       fio --exec_prerun="echo 3 > /proc/sys/vm/drop_caches" --group_reporting
       --ioscheduler=noop --thread --bs=4k --size=512MB --direct=1 --numjobs=16
       --ioengine=libaio --iodepth=64 --loops=3 --ramp_time=0
       --filename=/dev/vdb --name=seq-read --stonewall --rw=read
       --name=seq-write --stonewall --rw=write --name=rnd-read --stonewall
       --rw=randread --name=rnd-write --stonewall --rw=randwrite
      
       After:
       -------------------------------------
         seq-read  : io=0 B, bw=309503KB/s, iops=77375 , runt= 54207msec
         seq-write : io=0 B, bw=448205KB/s, iops=112051 , runt= 37432msec
         rand-read : io=0 B, bw=311254KB/s, iops=77813 , runt= 53902msec
         rand-write: io=0 B, bw=377152KB/s, iops=94287 , runt= 44484msec
           clat (usec): min=81 , max=90588 , avg=12946.06, stdev=9085.94
           clat (usec): min=57 , max=72264 , avg=8967.97, stdev=5951.04
           clat (usec): min=29 , max=101046 , avg=12889.95, stdev=9067.91
           clat (usec): min=52 , max=106152 , avg=10660.56, stdev=4778.19
         cpu          : usr=15.05%, sys=57.92%, ctx=77109411, majf=0, minf=54
         cpu          : usr=26.78%, sys=101.40%, ctx=7387891, majf=0, minf=2
         cpu          : usr=19.03%, sys=58.17%, ctx=7681976, majf=0, minf=8
         cpu          : usr=24.65%, sys=58.34%, ctx=8442632, majf=0, minf=4
         vdb: ios=8389086/8361888, merge=0/0, ticks=17243780/12742010,
       in_queue=30078377, util=99.59%
        43: interrupt in total: 1259639
       fio --exec_prerun="echo 3 > /proc/sys/vm/drop_caches" --group_reporting
       --ioscheduler=noop --thread --bs=4k --size=512MB --direct=1 --numjobs=16
       --ioengine=libaio --iodepth=64 --loops=3 --ramp_time=0
       --filename=/dev/vdb --name=seq-read --stonewall --rw=read
       --name=seq-write --stonewall --rw=write --name=rnd-read --stonewall
       --rw=randread --name=rnd-write --stonewall --rw=randwrite
      Signed-off-by: Asias He <asias@redhat.com>
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
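A common way to structure a completion handler so that callbacks stay disabled while it drains the queue is sketched below in Python. This is a toy model, not the driver code: `ToyVq` and `drain_completions` are hypothetical, and completions that arrive during draining are simulated by the `late` list. The kernel helpers with the corresponding roles are `virtqueue_disable_cb()`, `virtqueue_get_buf()`, and `virtqueue_enable_cb()`.

```python
class ToyVq:
    """Hypothetical virtqueue model with a used list and a callback flag."""

    def __init__(self, completions, late=()):
        self.used = list(completions)
        self.late = list(late)      # completions that arrive while draining
        self.cb_enabled = True

    def disable_cb(self):
        self.cb_enabled = False

    def get_buf(self):
        return self.used.pop(0) if self.used else None

    def enable_cb(self):
        # Re-enable callbacks; return False if more buffers showed up
        # in the meantime, so the caller has to drain again.
        self.cb_enabled = True
        if self.late:
            self.used.extend(self.late)
            self.late.clear()
            return False
        return True


def drain_completions(vq):
    handled = []
    while True:
        vq.disable_cb()             # no further interrupts while draining
        buf = vq.get_buf()
        while buf is not None:
            handled.append(buf)
            buf = vq.get_buf()
        if vq.enable_cb():          # nothing new arrived: safe to stop
            break
    return handled
```

The outer loop closes the window between the last `get_buf()` and re-enabling callbacks, so a completion that arrives in that window is still handled.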
    • virtio-blk: fix NULL checking in virtblk_alloc_req() · f22cf8eb
      Committed by Dan Carpenter
      Smatch complains about the inconsistent NULL checking here.  Fix it to
      return NULL on failure.
      Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (fixed accidental deletion)
    • virtio-blk: Add REQ_FLUSH and REQ_FUA support to bio path · c85a1f91
      Committed by Asias He
      We need to support both REQ_FLUSH and REQ_FUA for bio based path since
      it does not get the sequencing of REQ_FUA into REQ_FLUSH that request
      based drivers can request.
      
      REQ_FLUSH is emulated by:
      A) If the bio has no data to write:
      1. Send VIRTIO_BLK_T_FLUSH to device,
      2. In the flush I/O completion handler, finish the bio
      
      B) If the bio has data to write:
      1. Send VIRTIO_BLK_T_FLUSH to device
      2. In the flush I/O completion handler, send the actual write data to device
      3. In the write I/O completion handler, finish the bio
      
      REQ_FUA is emulated by:
      1. Send the actual write data to device
      2. In the write I/O completion handler, send VIRTIO_BLK_T_FLUSH to device
      3. In the flush I/O completion handler, finish the bio
      
      Changes in v7:
      - Using vbr->flags to trace request type
      - Dropped unnecessary struct virtio_blk *vblk parameter
      - Reuse struct virtblk_req in bio done function
      
      Changes in v6:
      - Reworked REQ_FLUSH and REQ_FUA emulation order
      
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Shaohua Li <shli@kernel.org>
      Cc: "Michael S. Tsirkin" <mst@redhat.com>
      Cc: kvm@vger.kernel.org
      Cc: linux-kernel@vger.kernel.org
      Cc: virtualization@lists.linux-foundation.org
      Signed-off-by: Asias He <asias@redhat.com>
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
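The REQ_FLUSH and REQ_FUA orderings listed in this message can be expressed as a small Python function that returns the sequence of operations for one bio. It is a sketch of the ordering only: `emulate_bio` and the `"WRITE"` / `"finish bio"` labels are hypothetical. The case where both flags are set is not spelled out in the message, so composing the two sequences is an assumption here.

```python
def emulate_bio(has_data, flush=False, fua=False):
    """Return the ordered steps used to emulate REQ_FLUSH / REQ_FUA."""
    steps = []
    if flush:
        # REQ_FLUSH: the flush goes to the device first.
        steps.append("VIRTIO_BLK_T_FLUSH")
    if has_data:
        steps.append("WRITE")
        if fua:
            # REQ_FUA: a flush follows the write it applies to.
            steps.append("VIRTIO_BLK_T_FLUSH")
    steps.append("finish bio")
    return steps
```

For example, `emulate_bio(True, flush=True)` gives flush, write, finish, and `emulate_bio(True, fua=True)` gives write, flush, finish, matching cases B and REQ_FUA above.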