1. 27 Sep 2012 (2 commits)
    • dm: retain table limits when swapping to new table with no devices · 3ae70656
      Committed by Mike Snitzer
      Add a safety net that will reuse the DM device's existing limits in the
      event that the DM device has a temporary table that doesn't have any
      component devices.  This reduces the chance that requests which do not
      respect the hardware limits will reach the device.
      
      DM recalculates queue limits based only on devices which currently exist
      in the table.  This creates a problem in the event all devices are
      temporarily removed, such as when all paths are lost in multipath.  DM
      will reset the limits to the maximum permissible, which can then allow
      requests to be built that exceed the limits of the paths once they are
      restored.  Such a request will fail the blk_rq_check_limits() test when
      sent to a path with lower limits, and will be retried without end by
      multipath.  This became a much bigger issue after v3.6 commit fe86cdce
      ("block: do not artificially constrain max_sectors for stacking
      drivers").
      Reported-by: David Jeffery <djeffery@redhat.com>
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
      Signed-off-by: Alasdair G Kergon <agk@redhat.com>
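      A minimal sketch of the idea (helper and call-site names are assumptions,
      not the literal patch): when the incoming table maps no component devices,
      keep the limits already live on the DM device's queue rather than
      resetting them to the unbounded stacking defaults.

          /* Sketch only: reuse the live queue limits when the new table has
           * no component devices to derive limits from. */
          static void dm_sketch_calculate_limits(struct mapped_device *md,
                                                 struct dm_table *t,
                                                 struct queue_limits *limits)
          {
                  if (dm_table_has_no_data_devices(t)) {     /* assumed helper */
                          *limits = md->queue->limits;       /* retain current limits */
                          return;
                  }

                  blk_set_stacking_limits(limits);           /* start from defaults */
                  /* ...then stack in the limits of each device in the table... */
          }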
    • dm: handle requests beyond end of device instead of using BUG_ON · ba1cbad9
      Committed by Mike Snitzer
      The BUG_ON for access beyond the end of the device that was introduced
      in dm_request_fn via commit 29e4013d ("dm: implement
      REQ_FLUSH/FUA support for request-based dm") was an overly
      drastic (but simple) response to that situation.
      
      I have received a report that this BUG_ON was hit and now think
      it would be better to use dm_kill_unmapped_request() to fail the clone
      and original request with -EIO.
      
      map_request() will assign the valid target returned by
      dm_table_find_target() to tio->ti.  But when the target
      isn't valid, tio->ti is never assigned (because map_request() isn't
      called); so add a check for tio->ti != NULL to dm_done().
      Reported-by: Mike Christie <michaelc@cs.wisc.edu>
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
      Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com>
      Cc: stable@vger.kernel.org # v2.6.37+
      Signed-off-by: Alasdair G Kergon <agk@redhat.com>
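      A minimal sketch of the resulting behaviour in dm_request_fn() and
      dm_done() (shape assumed from the description above, not the literal
      diff):

          /* dm_request_fn(): fail a request that maps to no valid target
           * with -EIO instead of hitting a BUG_ON */
          ti = dm_table_find_target(map, blk_rq_pos(rq));
          if (!dm_target_is_valid(ti)) {
                  DMERR_LIMIT("request beyond end of device");
                  blk_start_request(rq);
                  dm_kill_unmapped_request(rq, -EIO);  /* fails clone + original */
                  continue;
          }

          /* dm_done(): tio->ti can be NULL when map_request() was never called */
          if (tio->ti && tio->ti->type->rq_end_io)
                  r = tio->ti->type->rq_end_io(tio->ti, clone, error, &tio->info);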
  2. 27 Jul 2012 (2 commits)
  3. 29 Mar 2012 (1 commit)
  4. 04 Jan 2012 (1 commit)
  5. 01 Nov 2011 (4 commits)
  6. 12 Sep 2011 (3 commits)
  7. 02 Aug 2011 (4 commits)
    • dm table: set flush capability based on underlying devices · ed8b752b
      Committed by Mike Snitzer
      DM has always advertised both REQ_FLUSH and REQ_FUA flush capabilities
      regardless of whether or not a given DM device's underlying devices
      also advertised a need for them.
      
      Block's flush-merge changes from 2.6.39 have proven to be more costly
      for DM devices.  Performance regressions have been reported even when
      DM's underlying devices do not advertise that they have a write cache.
      
      Fix the performance regressions by configuring a DM device's flushing
      capabilities based on the capabilities of the underlying devices.
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
      Signed-off-by: Alasdair G Kergon <agk@redhat.com>
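      A minimal sketch of how the flush capability can be derived from the
      table (the iterate_devices callback pattern is standard dm-table usage;
      helper names and exact placement are assumptions):

          /* true if this device advertises the requested flush flag(s) */
          static int device_flush_capable(struct dm_target *ti, struct dm_dev *dev,
                                          sector_t start, sector_t len, void *data)
          {
                  unsigned flush = (unsigned long) data;
                  struct request_queue *q = bdev_get_queue(dev->bdev);

                  return q && (q->flush_flags & flush);
          }

          static bool dm_table_supports_flush(struct dm_table *t, unsigned flush)
          {
                  unsigned i = 0;

                  while (i < dm_table_get_num_targets(t)) {
                          struct dm_target *ti = dm_table_get_target(t, i++);

                          if (ti->num_flush_requests && ti->type->iterate_devices &&
                              ti->type->iterate_devices(ti, device_flush_capable,
                                                        (void *) (unsigned long) flush))
                                  return true;
                  }
                  return false;
          }

          /* at queue-restriction time: only advertise what the members support */
          unsigned flush = 0;
          if (dm_table_supports_flush(t, REQ_FLUSH)) {
                  flush |= REQ_FLUSH;
                  if (dm_table_supports_flush(t, REQ_FUA))
                          flush |= REQ_FUA;
          }
          blk_queue_flush(q, flush);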
    • dm: ignore merge_bvec for snapshots when safe · d5b9dd04
      Committed by Mikulas Patocka
      Add a new flag DMF_MERGE_IS_OPTIONAL to struct mapped_device to indicate
      whether the device can accept bios larger than the size its merge
      function returns.  When set, use this to send large bios to snapshots
      which can split them if necessary.  Snapshot I/O may be significantly
      fragmented and this approach seems to improve performance.
      
      Before the patch, dm_set_device_limits restricted bio size to page size
      if the underlying device had a merge function and the target didn't
      provide a merge function.  After the patch, dm_set_device_limits
      restricts bio size to page size if the underlying device has a merge
      function, doesn't have DMF_MERGE_IS_OPTIONAL flag and the target doesn't
      provide a merge function.
      
      The snapshot target can't provide a merge function because when the merge
      function is called, it is impossible to determine where the bio will be
      remapped.  Previously this led us to impose a 4k limit, which we can
      now remove if the snapshot store is located on a device without a merge
      function.  Together with another patch for optimizing full chunk writes,
      it improves performance from 29MB/s to 40MB/s when writing to the
      filesystem on snapshot store.
      
      If the snapshot store is placed on a non-dm device with a merge function
      (such as md-raid), device mapper still limits all bios to page size.
      Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
      Signed-off-by: Alasdair G Kergon <agk@redhat.com>
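      A minimal sketch of the adjusted restriction described above (the exact
      placement in the real patch differs; the flag lives on the mapped_device
      and this only illustrates the decision):

          /* Sketch: clamp bios to PAGE_SIZE only when a larger bio could
           * actually violate the underlying device's merge_bvec_fn. */
          if (q->merge_bvec_fn &&                            /* device restricts merging  */
              !ti->type->merge &&                            /* target has no merge fn    */
              !test_bit(DMF_MERGE_IS_OPTIONAL, &md->flags))  /* device can't take big bios */
                  blk_limits_max_hw_sectors(limits, (unsigned int) (PAGE_SIZE >> 9));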
    • dm table: fix discard support · 936688d7
      Committed by Mike Snitzer
      Remove 'discards_supported' from the dm_table structure.  The same
      information can be easily discovered from the table's target(s) in
      dm_table_supports_discards().
      
      Before this fix dm_table_supports_discards() would skip checking the
      individual targets' 'discards_supported' flag if any one target in the
      table didn't set num_discard_requests > 0.  Now the per-target
      'discards_supported' flag is effective at ensuring the final DM device
      advertises discard support.  But, to be clear, targets that don't
      support discards (!num_discard_requests) will not receive discard
      requests.
      
      Also DMWARN if a target sets 'discards_supported' override but forgets
      to set 'num_discard_requests'.
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
      Signed-off-by: Alasdair G Kergon <agk@redhat.com>
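      A minimal sketch matching the description above (the per-device callback
      name is illustrative):

          /* A table supports discards if at least one target both accepts
           * discard requests and either vouches for its devices or sits on
           * devices that advertise discard support themselves. */
          bool dm_table_supports_discards(struct dm_table *t)
          {
                  unsigned i = 0;

                  while (i < dm_table_get_num_targets(t)) {
                          struct dm_target *ti = dm_table_get_target(t, i++);

                          if (!ti->num_discard_requests)
                                  continue;               /* never sees discards */

                          if (ti->discards_supported)
                                  return true;            /* per-target override */

                          if (ti->type->iterate_devices &&
                              ti->type->iterate_devices(ti, device_discard_capable,
                                                        NULL))
                                  return true;
                  }

                  return false;
          }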
    • dm: fix idr leak on module removal · d15b774c
      Committed by Alasdair G Kergon
      Destroy _minor_idr when unloading the core dm module.  (Found by kmemleak.)
      
      Cc: stable@kernel.org
      Signed-off-by: Alasdair G Kergon <agk@redhat.com>
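      A minimal sketch of the fix (placement in the module exit path assumed):

          static void local_exit(void)
          {
                  /* ...existing teardown of workqueues, kmem caches, etc... */
                  idr_destroy(&_minor_idr);   /* free the minor-number idr on unload */
          }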
  8. 22 Mar 2011 (1 commit)
  9. 17 Mar 2011 (1 commit)
  10. 10 Mar 2011 (1 commit)
  11. 14 Jan 2011 (5 commits)
  12. 07 Jan 2011 (1 commit)
  13. 16 Nov 2010 (1 commit)
  14. 05 Oct 2010 (1 commit)
    • block: autoconvert trivial BKL users to private mutex · 2a48fc0a
      Committed by Arnd Bergmann
      The block device drivers have all gained new lock_kernel
      calls from a recent pushdown, and some of the drivers
      were already using the BKL before.
      
      This turns the BKL into a set of per-driver mutexes.
      Still need to check whether this is safe to do.
      
      file=$1
      name=$2
      if grep -q lock_kernel ${file} ; then
          if grep -q 'include.*linux.mutex.h' ${file} ; then
                  sed -i '/include.*<linux\/smp_lock.h>/d' ${file}
          else
                  sed -i 's/include.*<linux\/smp_lock.h>.*$/include <linux\/mutex.h>/g' ${file}
          fi
          sed -i ${file} \
              -e "/^#include.*linux.mutex.h/,$ {
                      1,/^\(static\|int\|long\)/ {
                           /^\(static\|int\|long\)/istatic DEFINE_MUTEX(${name}_mutex);
      
      } }"  \
          -e "s/\(un\)*lock_kernel\>[ ]*()/mutex_\1lock(\&${name}_mutex)/g" \
          -e '/[      ]*cycle_kernel_lock();/d'
      else
          sed -i -e '/include.*\<smp_lock.h\>/d' ${file}  \
                      -e '/cycle_kernel_lock()/d'
      fi
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
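      A minimal illustration of the transformation the script performs in a
      driver (the foo_* names are hypothetical):

          /* before: the ioctl path relied on the big kernel lock */
          static int foo_ioctl(struct block_device *bdev, fmode_t mode,
                               unsigned cmd, unsigned long arg)
          {
                  int ret;

                  lock_kernel();
                  ret = foo_do_ioctl(bdev, cmd, arg);
                  unlock_kernel();
                  return ret;
          }

          /* after: a private, per-driver mutex replaces the BKL */
          static DEFINE_MUTEX(foo_mutex);

          static int foo_ioctl(struct block_device *bdev, fmode_t mode,
                               unsigned cmd, unsigned long arg)
          {
                  int ret;

                  mutex_lock(&foo_mutex);
                  ret = foo_do_ioctl(bdev, cmd, arg);
                  mutex_unlock(&foo_mutex);
                  return ret;
          }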
  15. 10 Sep 2010 (6 commits)
    • dm: convey that all flushes are processed as empty · b372d360
      Committed by Mike Snitzer
      Rename __clone_and_map_flush to __clone_and_map_empty_flush for added
      clarity.
      
      Simplify logic associated with REQ_FLUSH conditionals.
      
      Introduce a BUG_ON() and add a few more helpful comments to the code
      so that it is clear that all flushes are empty.
      
      Clean up __split_and_process_bio() so that an empty flush isn't processed
      by a 'sector_count' focused while loop.
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
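      A minimal sketch of the resulting control flow in
      __split_and_process_bio() (shape assumed from the points above):

          if (bio->bi_rw & REQ_FLUSH) {
                  ci.bio = &ci.md->flush_bio;     /* empty clone source */
                  ci.sector_count = 0;
                  error = __clone_and_map_empty_flush(&ci);
                  /* all flushes are empty; any data is re-issued afterwards */
          } else {
                  ci.bio = bio;
                  ci.sector_count = bio_sectors(bio);
                  while (ci.sector_count && !error)
                          error = __clone_and_map(&ci);
          }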
    • dm: fix locking context in queue_io() · 05447420
      Committed by Kiyoshi Ueda
      Now queue_io() is called from dec_pending(), which may be called with
      interrupts disabled, so queue_io() must not enable interrupts
      unconditionally and must save/restore the current interrupt state.
      Signed-off-by: Kiyoshi Ueda <k-ueda@ct.jp.nec.com>
      Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com>
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
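      A minimal sketch of the fixed locking (close to the shape described above,
      but an assumption rather than the literal diff):

          static void queue_io(struct mapped_device *md, struct bio *bio)
          {
                  unsigned long flags;

                  /* may be entered with interrupts already disabled, so save
                   * and restore the interrupt state instead of re-enabling */
                  spin_lock_irqsave(&md->deferred_lock, flags);
                  bio_list_add(&md->deferred, bio);
                  spin_unlock_irqrestore(&md->deferred_lock, flags);
                  queue_work(md->wq, &md->work);
          }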
    • dm: relax ordering of bio-based flush implementation · 6a8736d1
      Committed by Tejun Heo
      Unlike REQ_HARDBARRIER, REQ_FLUSH/FUA doesn't mandate any ordering
      against other bio's.  This patch relaxes ordering around flushes.
      
      * A flush bio is no longer deferred to workqueue directly.  It's
        processed like other bio's but __split_and_process_bio() uses
        md->flush_bio as the clone source.  md->flush_bio is initialized to
        empty flush during md initialization and shared for all flushes.
      
      * As a flush bio now travels through the same execution path as other
        bio's, there's no need for dedicated error handling path either.  It
        can use the same error handling path in dec_pending().  Dedicated
        error handling removed along with md->flush_error.
      
      * When dec_pending() detects that a flush has completed, it checks
        whether the original bio has data.  If so, the bio is queued to the
        deferred list w/ REQ_FLUSH cleared; otherwise, it's completed.
      
      * As flush sequencing is handled in the usual issue/completion path,
        dm_wq_work() no longer needs to handle flushes differently.  Now its
        only responsibility is re-issuing deferred bio's the same way as
        _dm_request() would.  REQ_FLUSH handling logic including
        process_flush() is dropped.
      
      * There's no reason for queue_io() and dm_wq_work() to write-lock
        dm->io_lock.  queue_io() now only uses md->deferred_lock and
        dm_wq_work() read-locks dm->io_lock.
      
      * bio's no longer need to be queued on the deferred list while a flush
        is in progress, making DMF_QUEUE_IO_TO_THREAD unnecessary.  Drop it.
      
      This avoids stalling the device during flushes and simplifies the
      implementation.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
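      A minimal sketch of the flush-completion step in dec_pending() described
      above (shape assumed):

          if ((bio->bi_rw & REQ_FLUSH) && bio->bi_size) {
                  /* preflush done for a flush with data: re-queue the bio
                   * with REQ_FLUSH cleared so the data part is issued */
                  bio->bi_rw &= ~REQ_FLUSH;
                  queue_io(md, bio);
          } else {
                  /* empty flush, or plain bio: complete it */
                  bio_endio(bio, io_error);
          }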
    • dm: implement REQ_FLUSH/FUA support for request-based dm · 29e4013d
      Committed by Tejun Heo
      This patch converts request-based dm to support the new REQ_FLUSH/FUA.
      
      The original request-based flush implementation depended on
      request_queue blocking other requests while a barrier sequence is in
      progress, which is no longer true for the new REQ_FLUSH/FUA.
      
      In general, request-based dm doesn't have infrastructure for cloning
      one source request to multiple targets, but the original flush
      implementation had a special mostly independent path which can issue
      flushes to multiple targets and sequence them.  However, the
      capability isn't currently in use and adds a lot of complexity.
      Moreover, it's unlikely to be useful in its current form as it
      doesn't make sense to be able to send out flushes to multiple targets
      when write requests can't be.
      
      This patch rips out the special flush code path and handles
      REQ_FLUSH/FUA requests the same way as other requests.  The only
      special treatment is that REQ_FLUSH requests use the block address 0
      when finding the target, which is enough for now.
      
      * added BUG_ON(!dm_target_is_valid(ti)) in dm_request_fn() as
        suggested by Mike Snitzer
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Acked-by: Mike Snitzer <snitzer@redhat.com>
      Tested-by: Kiyoshi Ueda <k-ueda@ct.jp.nec.com>
      Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
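      A minimal sketch of the target lookup in dm_request_fn() described above
      (shape assumed; the BUG_ON is the one later relaxed by ba1cbad9 above):

          sector_t pos = 0;

          if (!(rq->cmd_flags & REQ_FLUSH))
                  pos = blk_rq_pos(rq);           /* flushes map via address 0 */

          ti = dm_table_find_target(map, pos);
          BUG_ON(!dm_target_is_valid(ti));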
    • dm: implement REQ_FLUSH/FUA support for bio-based dm · d87f4c14
      Committed by Tejun Heo
      This patch converts bio-based dm to support REQ_FLUSH/FUA instead of
      now deprecated REQ_HARDBARRIER.
      
      * -EOPNOTSUPP handling logic dropped.
      
      * Preflush is handled as before but postflush is dropped and replaced
        with passing down REQ_FUA to member request_queues.  This replaces
        one array wide cache flush w/ member specific FUA writes.
      
      * __split_and_process_bio() now calls __clone_and_map_flush() directly
        for flushes and guarantees all FLUSH bio's going to targets are zero
        length.
      
      * It's now guaranteed that all FLUSH bio's which are passed onto dm
        targets are zero length.  bio_empty_barrier() tests are replaced
        with REQ_FLUSH tests.
      
      * Empty WRITE_BARRIERs are replaced with WRITE_FLUSHes.
      
      * Dropped unlikely() around REQ_FLUSH tests.  Flushes are not unlikely
        enough to be marked with unlikely().
      
      * Block layer now filters out REQ_FLUSH/FUA bio's if the request_queue
        doesn't support cache flushing.  Advertise REQ_FLUSH | REQ_FUA
        capability.
      
      * Request based dm isn't converted yet.  dm_init_request_based_queue()
        resets flush support to 0 for now.  To avoid disturbing request
        based dm code, dm->flush_error is added for bio based dm while
        requested based dm continues to use dm->barrier_error.
      
      Lightly tested linear, stripe, raid1, snap and crypt targets.  Please
      proceed with caution as I'm not familiar with the code base.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: dm-devel@redhat.com
      Cc: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
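      A minimal sketch of the capability advertisement and the REQ_FLUSH test
      mentioned above (queue-setup location and the helper name are assumed):

          /* bio-based dm advertises flush/FUA; the block layer filters these
           * out for request_queues that don't need cache flushing */
          blk_queue_flush(md->queue, REQ_FLUSH | REQ_FUA);

          /* bio_empty_barrier() tests become REQ_FLUSH tests: */
          if (bio->bi_rw & REQ_FLUSH)
                  return process_empty_flush(md, bio);     /* hypothetical helper */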
    • block: deprecate barrier and replace blk_queue_ordered() with blk_queue_flush() · 4913efe4
      Committed by Tejun Heo
      Barrier is deemed too heavy and will soon be replaced by FLUSH/FUA
      requests.  Deprecate barrier.  All REQ_HARDBARRIERs are failed with
      -EOPNOTSUPP and blk_queue_ordered() is replaced with simpler
      blk_queue_flush().
      
      blk_queue_flush() takes combinations of REQ_FLUSH and REQ_FUA.  If a
      device has write cache and can flush it, it should set REQ_FLUSH.  If
      the device can handle FUA writes, it should also set REQ_FUA.
      
      All blk_queue_ordered() users are converted.
      
      * ORDERED_DRAIN is mapped to 0 which is the default value.
      * ORDERED_DRAIN_FLUSH is mapped to REQ_FLUSH.
      * ORDERED_DRAIN_FLUSH_FUA is mapped to REQ_FLUSH | REQ_FUA.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Acked-by: Boaz Harrosh <bharrosh@panasas.com>
      Cc: Christoph Hellwig <hch@infradead.org>
      Cc: Nick Piggin <npiggin@kernel.dk>
      Cc: Michael S. Tsirkin <mst@redhat.com>
      Cc: Jeremy Fitzhardinge <jeremy@xensource.com>
      Cc: Chris Wright <chrisw@sous-sol.org>
      Cc: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
      Cc: Geert Uytterhoeven <Geert.Uytterhoeven@sonycom.com>
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Alasdair G Kergon <agk@redhat.com>
      Cc: Pierre Ossman <drzeus@drzeus.cx>
      Cc: Stefan Weinhuber <wein@de.ibm.com>
      Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
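      A minimal example of the conversion in a driver's queue setup (driver
      context assumed; the mapping follows the list above):

          /* old: blk_queue_ordered(q, ...) with an ORDERED_DRAIN* mode
           *   ORDERED_DRAIN           -> blk_queue_flush(q, 0)   (the default)
           *   ORDERED_DRAIN_FLUSH     -> blk_queue_flush(q, REQ_FLUSH)
           *   ORDERED_DRAIN_FLUSH_FUA -> blk_queue_flush(q, REQ_FLUSH | REQ_FUA)
           */
          blk_queue_flush(q, REQ_FLUSH | REQ_FUA);   /* flushable write cache + FUA */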
  16. 12 Aug 2010 (6 commits)