  1. 20 September 2012 (2 commits)
  2. 02 August 2012 (2 commits)
    • block: split discard into aligned requests · c6e66634
      Paolo Bonzini committed
      When a disk has large discard_granularity and small max_discard_sectors,
      discards are not split with optimal alignment.  In the limit case of
      discard_granularity == max_discard_sectors, no request could be aligned
      correctly, so in fact you might end up with no discarded logical blocks
      at all.
      
      Another example that helps show the condition addressed by the patch is
      discard_granularity == 64, max_discard_sectors == 128.  A request
      submitted for 256 sectors, 2..257, will be split in two: 2..129 and
      130..257.  However, only 2 of the 3 aligned blocks are fully included
      in the requests; 128..191 may be left intact and not discarded.  With
      this patch, the first request is truncated to ensure good alignment of
      what's left, and the split becomes 2..127, 128..255, 256..257.  The
      patch also takes discard_alignment into account.
      
      At most one extra request will be introduced, because the first request
      is reduced by at most granularity-1 sectors, and granularity must be
      less than max_discard_sectors.  Subsequent requests cover
      round_down(max_discard_sectors, granularity) sectors, as in the
      current code.
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      Acked-by: Vivek Goyal <vgoyal@redhat.com>
      Tested-by: Mike Snitzer <snitzer@redhat.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
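
      A minimal, self-contained sketch of the splitting rule described
      above; the function names and the user-space setting are
      illustrative, not the kernel's actual code (which lives in
      blkdev_issue_discard()):

      #include <stdio.h>

      /* Round n down to a multiple of g (assumes g > 0). */
      static unsigned round_down(unsigned n, unsigned g)
      {
              return n - (n % g);
      }

      /*
       * Size of the next split request for a discard at 'sector' with
       * 'nr_sects' remaining, chosen so that whatever remains afterwards
       * starts on a (granularity, alignment) boundary.
       */
      static unsigned next_chunk(unsigned sector, unsigned nr_sects,
                                 unsigned max_discard, unsigned granularity,
                                 unsigned alignment)
      {
              unsigned max_aligned = round_down(max_discard, granularity);
              unsigned end = sector + max_aligned;
              unsigned misaligned = (end - alignment) % granularity;
              unsigned n = max_aligned - misaligned;

              return n < nr_sects ? n : nr_sects;
      }

      int main(void)
      {
              /* The example from the commit message: granularity 64,
               * max_discard_sectors 128, request covering sectors 2..257. */
              unsigned sector = 2, remaining = 256;

              while (remaining) {
                      unsigned n = next_chunk(sector, remaining, 128, 64, 0);
                      printf("%u..%u\n", sector, sector + n - 1);
                      sector += n;
                      remaining -= n;
              }
              return 0;       /* prints 2..127, 128..255, 256..257 */
      }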
    • block: reorganize rounding of max_discard_sectors · f6ff53d3
      Paolo Bonzini committed
      Mostly a preparation for the next patch.
      
      In principle this fixes an infinite loop if max_discard_sectors < granularity,
      but that really shouldn't happen.
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      Acked-by: Vivek Goyal <vgoyal@redhat.com>
      Tested-by: Mike Snitzer <snitzer@redhat.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
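
      A short, self-contained illustration of that hazard (the values are
      hypothetical):

      #include <stdio.h>

      int main(void)
      {
              unsigned granularity = 128, max_discard_sectors = 64;
              /* Rounding the limit down to the granularity yields zero... */
              unsigned step = max_discard_sectors -
                              (max_discard_sectors % granularity);
              /* ...so a splitting loop that advances by 'step' would never
               * make progress; the reorganized rounding guards against it. */
              printf("step = %u\n", step);    /* prints: step = 0 */
              return 0;
      }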
  3. 24 July 2011 (1 commit)
  4. 07 July 2011 (1 commit)
    • block: eliminate potential for infinite loop in blkdev_issue_discard · 0f799603
      Mike Snitzer committed
      Due to the recently identified overflow in read_capacity_16(), it was
      possible for max_discard_sectors to be zero while discards were still
      enabled on the associated device's queue.
      
      Eliminate the possibility for blkdev_issue_discard to infinitely loop.
      
      Interestingly, this issue wasn't identified until a device whose
      discard_granularity was 0 due to the read_capacity_16 overflow was
      consumed by blk_stack_limits() to construct the limits for a
      higher-level DM multipath device.  The multipath device's resulting
      limits never had the discard limits stacked, because blk_stack_limits()
      only does so when the bottom device's discard_granularity != 0.  This
      resulted in the multipath device's limits.max_discard_sectors being 0.
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
      Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
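
      A kernel-style sketch of the kind of guard that closes this hole
      (illustrative, not necessarily the exact patch):

      max_discard_sectors = min(q->limits.max_discard_sectors, UINT_MAX >> 9);
      if (unlikely(!max_discard_sectors)) {
              /* A zero limit would make the splitting loop spin forever
               * without ever issuing a bio, so fail the request early. */
              return -EOPNOTSUPP;
      }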
  5. 07 May 2011 (3 commits)
    • blkdev: Do not return -EOPNOTSUPP if discard is supported · 8af1954d
      Lukas Czerner committed
      Currently we return -EOPNOTSUPP from blkdev_issue_discard() if any of
      the bios fails because the underlying device does not support discard
      requests.  However, if the device is, for example, a dm device composed
      of devices of which some support discard and some do not, it is fine
      for some bios to fail with EOPNOTSUPP; that does not mean discard is
      unsupported altogether.

      This commit removes the check for bios that failed with EOPNOTSUPP and
      changes blkdev_issue_discard() to return "operation not supported" if
      and only if the device as a whole does not actually support it, not
      merely part of the device as some bios might indicate.

      This change also fixes a problem with the BLKDISCARD ioctl(), which now
      works correctly on such dm devices.
      Signed-off-by: Lukas Czerner <lczerner@redhat.com>
      CC: Jens Axboe <jaxboe@fusionio.com>
      CC: Jeff Moyer <jmoyer@redhat.com>
      Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
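
      A sketch of the resulting error policy (blk_queue_discard() is the
      real queue-capability test; the other helpers named here are
      illustrative, not real kernel functions):

      if (!blk_queue_discard(q))
              return -EOPNOTSUPP;     /* the queue really cannot discard */

      submit_discard_bios_and_wait(); /* illustrative helper */

      /* A bio failing with -EOPNOTSUPP on one leg of a dm device no
       * longer turns the whole operation into "not supported". */
      return all_bios_completed_ok ? 0 : -EIO;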
    • blkdev: Simple cleanup in blkdev_issue_zeroout() · 5baebe5c
      Lukas Czerner committed
      In blkdev_issue_zeroout() we are submitting regular WRITE bios, so we
      do not need to check specifically for -EOPNOTSUPP on error.  There is
      also no need for the submit: label, because the only way out of the
      while loop is with an error, and in that case we really want to exit
      rather than try again.  Also remove the (sz == 0) check, since sz can
      never be zero at that point.
      Signed-off-by: Lukas Czerner <lczerner@redhat.com>
      Reviewed-by: Jeff Moyer <jmoyer@redhat.com>
      CC: Dmitry Monakhov <dmonakhov@openvz.org>
      CC: Jens Axboe <jaxboe@fusionio.com>
      Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
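
      A kernel-style sketch of the simplified loop shape (illustrative;
      bio_alloc() and BIO_MAX_PAGES are real era APIs, the rest is elided):

      while (nr_sects != 0) {
              bio = bio_alloc(gfp_mask,
                              min(nr_sects, (sector_t)BIO_MAX_PAGES));
              if (!bio) {
                      ret = -ENOMEM;
                      break;          /* exit rather than retry */
              }
              /* ... fill the bio with zero pages, submit_bio(WRITE, bio),
               * and advance sector/nr_sects ... */
      }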
    • blkdev: Submit discard bio in batches in blkdev_issue_discard() · 5dba3089
      Lukas Czerner committed
      Currently we wait for every submitted REQ_DISCARD bio separately, but
      this can have the unwanted consequence of repeatedly flushing the
      queue, so instead submit bios in batches and wait for the entire
      batch, narrowing the window in which other I/O can slip in.

      Use bio_batch_end_io() and struct bio_batch for that purpose, the same
      mechanism used by blkdev_issue_zeroout().  Also change
      bio_batch_end_io() so that we always clear BIO_UPTODATE on error, and
      remove the check for bb, since we are the only user of this function
      and always set it.

      Remove bio_get()/bio_put() from blkdev_issue_discard(), since
      bio_alloc() and bio_batch_end_io() already do the same thing, so it is
      no longer needed.
      
      I have done simple dd testing with surprising results.  The script I
      used is:
      
      for i in $(seq 10); do
              echo $i
              dd if=/dev/sdb1 of=/dev/sdc1 bs=4k &
              sleep 5
      done
      /usr/bin/time -f %e ./blkdiscard /dev/sdc1
      
      Running time of BLKDISCARD on the whole device (seconds):
      with patch              without patch
      0.95                    15.58

      So in this artificial test, the kernel with the patch applied is
      approximately 16x faster at discarding the device.
      Signed-off-by: Lukas Czerner <lczerner@redhat.com>
      CC: Dmitry Monakhov <dmonakhov@openvz.org>
      CC: Jens Axboe <jaxboe@fusionio.com>
      CC: Jeff Moyer <jmoyer@redhat.com>
      Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
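
      A kernel-style sketch of the batching pattern (the struct mirrors
      the bio_batch already used by blkdev_issue_zeroout(); details are
      illustrative): all bios in a batch share one completion, and only
      the final one to finish wakes the waiter.

      struct bio_batch {
              atomic_t                done;   /* in-flight bios + 1 */
              unsigned long           flags;
              struct completion       *wait;
      };

      static void bio_batch_end_io(struct bio *bio, int err)
      {
              struct bio_batch *bb = bio->bi_private;

              if (err)
                      clear_bit(BIO_UPTODATE, &bb->flags); /* record errors */
              if (atomic_dec_and_test(&bb->done))
                      complete(bb->wait);     /* last bio wakes the submitter */
              bio_put(bio);                   /* drop bio_alloc()'s reference */
      }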
  6. 12 March 2011 (1 commit)
  7. 11 March 2011 (1 commit)
    • block: fix mis-synchronisation in blkdev_issue_zeroout() · 0aeea189
      Lukas Czerner committed
      BZ29402
      https://bugzilla.kernel.org/show_bug.cgi?id=29402
      
      We can hit a serious mis-synchronization in the bio completion path of
      blkdev_issue_zeroout(), leading to a panic.

      The problem is that before calling wait_for_completion() in
      blkdev_issue_zeroout() we check whether bb.done equals issued (the
      number of submitted bios).  If it does, we can skip the
      wait_for_completion() and just exit the function, since there is
      nothing to wait for.  However, there is an ordering problem, because
      bio_batch_end_io() calls atomic_inc(&bb->done) before complete(), so
      it may appear to blkdev_issue_zeroout() that all bios have completed,
      and it exits.  At the point when bio_batch_end_io() goes on to call
      complete(bb->wait), bb and wait no longer exist, since they were
      allocated on the stack in blkdev_issue_zeroout() ==> panic!
      
      (thread 1)                      (thread 2)
      bio_batch_end_io()              blkdev_issue_zeroout()
        if(bb) {                      ...
          if (bb->end_io)             ...
            bb->end_io(bio, err);     ...
          atomic_inc(&bb->done);      ...
          ...                         while (issued != atomic_read(&bb.done))
          ...                         (let issued == bb.done)
          ...                         (do the rest of the function)
          ...                         return ret;
          complete(bb->wait);
          ^^^^^^^^
          panic
      
      We can fix this easily by simplifying bio_batch and completion counting.
      
      Also remove bio_end_io_t *end_io since it is not used.
      Signed-off-by: Lukas Czerner <lczerner@redhat.com>
      Reported-by: Eric Whitney <eric.whitney@hp.com>
      Tested-by: Eric Whitney <eric.whitney@hp.com>
      Reviewed-by: Jeff Moyer <jmoyer@redhat.com>
      CC: Dmitry Monakhov <dmonakhov@openvz.org>
      Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
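
      A kernel-style sketch of the fixed counting scheme (illustrative):
      the counter starts at 1 for the submitter's own reference, so the
      count cannot reach zero, and complete() cannot fire, while bios are
      still being issued.  Exactly one party, whoever drops the last
      reference, signals the completion; the submitter waits only if it
      was not last.

      atomic_set(&bb.done, 1);                /* submitter's reference */
      while (nr_sects != 0) {
              /* ... allocate and fill a bio ... */
              atomic_inc(&bb.done);           /* one reference per bio */
              submit_bio(WRITE, bio);
      }

      /* Drop our reference; wait only if bios are still in flight. */
      if (!atomic_dec_and_test(&bb.done))
              wait_for_completion(&wait);

      The completion side then uses the matching
      atomic_dec_and_test()/complete() pair, as in the bio_batch_end_io()
      sketch shown earlier, so complete() can no longer run after the
      on-stack bb has gone out of scope.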
  8. 02 March 2011 (1 commit)
  9. 17 September 2010 (1 commit)
    • block: remove BLKDEV_IFL_WAIT · dd3932ed
      Christoph Hellwig committed
      All the blkdev_issue_* helpers can only sanely be used by synchronous
      callers.  To issue cache flushes or barriers asynchronously, the caller
      needs to set up a bio by itself, with a completion callback to move the
      asynchronous state machine ahead.  So drop the BLKDEV_IFL_WAIT flag
      that is always specified when calling blkdev_issue_*, and also remove
      the now unused flags argument to blkdev_issue_flush and
      blkdev_issue_zeroout.  For blkdev_issue_discard we need to keep it for
      the secure discard flag, which gains a more descriptive name and loses
      the bitops vs flag confusion.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
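
      A sketch of the resulting call shapes (the argument lists follow the
      post-patch API as described above; treat them as illustrative):

      /* Synchronous by definition; no flags argument any more: */
      blkdev_issue_flush(bdev, GFP_KERNEL, NULL);
      blkdev_issue_zeroout(bdev, sector, nr_sects, GFP_KERNEL);

      /* Discard keeps a flags word for the secure-discard case: */
      blkdev_issue_discard(bdev, sector, nr_sects, GFP_KERNEL,
                           BLKDEV_DISCARD_SECURE);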
  10. 10 September 2010 (1 commit)
  11. 12 August 2010 (1 commit)
  12. 09 August 2010 (1 commit)
  13. 08 August 2010 (2 commits)
  14. 29 April 2010 (3 commits)