1. 05 Dec, 2020 2 commits
    • dm: fix double RCU unlock in dm_dax_zero_page_range() error path · f05c4403
      Committed by Mike Snitzer
      Remove redundant dm_put_live_table() in dm_dax_zero_page_range() error
      path to fix sparse warning:
      drivers/md/dm.c:1208:9: warning: context imbalance in 'dm_dax_zero_page_range' - unexpected unlock
      
      Fixes: cdf6cdcd ("dm,dax: Add dax zero_page_range operation")
      Cc: stable@vger.kernel.org
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
    • dm: fix IO splitting · 3ee16db3
      Committed by Mike Snitzer
      Commit 882ec4e6 ("dm table: stack 'chunk_sectors' limit to account
      for target-specific splitting") caused a couple regressions:
      1) Using lcm_not_zero() when stacking chunk_sectors was a bug because
         chunk_sectors must reflect the most limited of all devices in the
         IO stack.
      2) DM targets that set max_io_len but that do _not_ provide an
         .iterate_devices method no longer had their IO split properly.
      
      And commit 5091cdec ("dm: change max_io_len() to use
      blk_max_size_offset()") also caused a regression where DM no longer
      supported varied (per target) IO splitting. The implication being the
      potential for severely reduced performance for IO stacks that use a DM
      target like dm-cache to hide performance limitations of a slower
      device (e.g. one that requires 4K IO splitting).
      
      Coming full circle: Fix all these issues by discontinuing stacking
      chunk_sectors up using ti->max_io_len in dm_calculate_queue_limits(),
      add optional chunk_sectors override argument to blk_max_size_offset()
      and update DM's max_io_len() to pass ti->max_io_len to its
      blk_max_size_offset() call.
      
      Passing in an optional chunk_sectors override to blk_max_size_offset()
      allows for code reuse of block's centralized calculation for max IO
      size based on provided offset and split boundary.
      
      Fixes: 882ec4e6 ("dm table: stack 'chunk_sectors' limit to account for target-specific splitting")
      Fixes: 5091cdec ("dm: change max_io_len() to use blk_max_size_offset()")
      Cc: stable@vger.kernel.org
      Reported-by: John Dorminy <jdorminy@redhat.com>
      Reported-by: Bruce Johnston <bjohnsto@redhat.com>
      Reported-by: Kirill Tkhai <ktkhai@virtuozzo.com>
      Reviewed-by: John Dorminy <jdorminy@redhat.com>
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
      Reviewed-by: Jens Axboe <axboe@kernel.dk>
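The "max IO size based on provided offset and split boundary" calculation described in this commit can be illustrated with a small user-space sketch. This is not the kernel's `blk_max_size_offset()`; the struct `limits_sketch` and the function `max_size_at_offset()` are hypothetical names chosen for illustration, and the only behavior assumed is what the message states: an optional override (such as a per-target `max_io_len`) takes precedence over the queue-wide boundary.

```c
#include <stdint.h>

/* Hypothetical stand-in for the queue limits that matter here. */
struct limits_sketch {
    uint32_t max_sectors;   /* upper bound on any single IO */
    uint32_t chunk_sectors; /* queue-wide split boundary, 0 if none */
};

/*
 * Largest IO (in sectors) that may start at 'offset' without crossing a
 * split boundary. A non-zero 'chunk_override' replaces the queue-wide
 * boundary, which models passing a per-target limit into the helper.
 */
static inline uint32_t max_size_at_offset(const struct limits_sketch *lim,
                                          uint64_t offset,
                                          uint32_t chunk_override)
{
    uint32_t chunk = chunk_override ? chunk_override : lim->chunk_sectors;
    uint32_t remaining;

    /* No boundary at all: only the overall size limit applies. */
    if (chunk == 0)
        return lim->max_sectors;

    /* Sectors left before the next boundary after 'offset'. */
    remaining = chunk - (uint32_t)(offset % chunk);
    return remaining < lim->max_sectors ? remaining : lim->max_sectors;
}
```

With an 8-sector override, an IO starting at sector 5 is capped at 3 sectors, while the same IO with no override and no queue-wide boundary is capped only by `max_sectors`.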
  2. 02 Dec, 2020 1 commit
  3. 08 Oct, 2020 2 commits
  4. 06 Oct, 2020 1 commit
  5. 02 Oct, 2020 2 commits
  6. 01 Oct, 2020 1 commit
    • dm: fix missing imposition of queue_limits from dm_wq_work() thread · 0c2915b8
      Committed by Mike Snitzer
      If a DM device was suspended when bios were issued to it, those bios
      would be deferred using queue_io(). Once the DM device was resumed
      dm_process_bio() could be called by dm_wq_work() for original bio that
      still needs splitting. dm_process_bio()'s check for current->bio_list
      (meaning call chain is within ->submit_bio) as a prerequisite for
      calling blk_queue_split() for "abnormal IO" would result in
      dm_process_bio() never imposing corresponding queue_limits
      (e.g. discard_granularity, discard_max_bytes, etc).
      
      Fix this by always having dm_wq_work() resubmit deferred bios using
      submit_bio_noacct().
      
      Side-effect is blk_queue_split() is always called for "abnormal IO" from
      ->submit_bio, be it from application thread or dm_wq_work() workqueue,
      so proper bio splitting and depth-first bio submission is performed.
      For sake of clarity, remove current->bio_list check before call to
      blk_queue_split().
      
      Also, remove dm_wq_work()'s use of dm_{get,put}_live_table() -- no
      longer needed since IO will be reissued in terms of ->submit_bio.
      And rename bio variable from 'c' to 'bio'.
      
      Fixes: cf9c3786 ("dm: fix comment in dm_process_bio()")
      Reported-by: Jeffle Xu <jefflexu@linux.alibaba.com>
      Reviewed-by: Mikulas Patocka <mpatocka@redhat.com>
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
  7. 30 Sep, 2020 7 commits
  8. 25 Sep, 2020 1 commit
  9. 22 Sep, 2020 2 commits
    • dm: fix comment in dm_process_bio() · cf9c3786
      Committed by Mike Snitzer
      Refer to the correct function (->submit_bio instead of ->queue_bio).
      Also, add details about why using blk_queue_split() isn't needed for
      dm_wq_work()'s call to dm_process_bio().
      
      Fixes: c62b37d9 ("block: move ->make_request_fn to struct block_device_operations")
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
    • dm: fix bio splitting and its bio completion order for regular IO · ee1dfad5
      Committed by Mike Snitzer
      dm_queue_split() is removed because __split_and_process_bio() _must_
      handle splitting bios to ensure proper bio submission and completion
      ordering as a bio is split.
      
      Otherwise, multiple recursive calls to ->submit_bio will cause multiple
      split bios to be allocated from the same ->bio_split mempool at the same
      time. This would result in deadlock in low memory conditions because no
      progress could be made (only one bio is available in ->bio_split
      mempool).
      
      This fix has been verified to still fix the loss of performance, due
      to excess splitting, that commit 120c9257 provided.
      
      Fixes: 120c9257 ("Revert "dm: always call blk_queue_split() in dm_process_bio()"")
      Cc: stable@vger.kernel.org # 5.0+, requires custom backport due to 5.9 changes
      Reported-by: Ming Lei <ming.lei@redhat.com>
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
  10. 20 Sep, 2020 1 commit
    • dm/dax: Fix table reference counts · 02186d88
      Committed by Dan Williams
      A recent fix to the dm_dax_supported() flow uncovered a latent bug. When
      dm_get_live_table() fails it is still required to drop the
      srcu_read_lock(). Without this change the lvm2 test-suite triggers this
      warning:
      
          # lvm2-testsuite --only pvmove-abort-all.sh
      
          WARNING: lock held when returning to user space!
          5.9.0-rc5+ #251 Tainted: G           OE
          ------------------------------------------------
          lvm/1318 is leaving the kernel with locks still held!
          1 lock held by lvm/1318:
           #0: ffff9372abb5a340 (&md->io_barrier){....}-{0:0}, at: dm_get_live_table+0x5/0xb0 [dm_mod]
      
      ...and later on this hang signature:
      
          INFO: task lvm:1344 blocked for more than 122 seconds.
                Tainted: G           OE     5.9.0-rc5+ #251
          "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
          task:lvm             state:D stack:    0 pid: 1344 ppid:     1 flags:0x00004000
          Call Trace:
           __schedule+0x45f/0xa80
           ? finish_task_switch+0x249/0x2c0
           ? wait_for_completion+0x86/0x110
           schedule+0x5f/0xd0
           schedule_timeout+0x212/0x2a0
           ? __schedule+0x467/0xa80
           ? wait_for_completion+0x86/0x110
           wait_for_completion+0xb0/0x110
           __synchronize_srcu+0xd1/0x160
           ? __bpf_trace_rcu_utilization+0x10/0x10
           __dm_suspend+0x6d/0x210 [dm_mod]
           dm_suspend+0xf6/0x140 [dm_mod]
      
      Fixes: 7bf7eac8 ("dax: Arrange for dax_supported check to span multiple devices")
      Cc: <stable@vger.kernel.org>
      Cc: Jan Kara <jack@suse.cz>
      Cc: Alasdair Kergon <agk@redhat.com>
      Cc: Mike Snitzer <snitzer@redhat.com>
      Reported-by: Adrian Huang <ahuang12@lenovo.com>
      Reviewed-by: Ira Weiny <ira.weiny@intel.com>
      Tested-by: Adrian Huang <ahuang12@lenovo.com>
      Link: https://lore.kernel.org/r/160045867590.25663.7548541079217827340.stgit@dwillia2-desk3.amr.corp.intel.com
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
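The rule stated in this commit (the read lock taken by `dm_get_live_table()` must be dropped even when no table is returned) can be sketched in user space as follows. The types and functions here (`dev_sketch`, `get_live_table()`, `put_live_table()`, `query_table()`) are hypothetical models, not the kernel implementation; a simple counter stands in for the SRCU read-side lock.

```c
#include <stddef.h>

/* Hypothetical model of a device whose live table may be absent. */
struct dev_sketch {
    int read_lock_depth;   /* stands in for the SRCU read-side nesting */
    const int *live_table; /* NULL when no table is loaded */
};

/* Always takes the read lock, even when it returns NULL. */
static inline const int *get_live_table(struct dev_sketch *d)
{
    d->read_lock_depth++;
    return d->live_table;
}

/* Drops the read lock taken by get_live_table(). */
static inline void put_live_table(struct dev_sketch *d)
{
    d->read_lock_depth--;
}

/*
 * Single exit path: the put runs exactly once whether the lookup
 * succeeded or failed, so the lock is neither leaked nor dropped twice.
 */
static inline int query_table(struct dev_sketch *d)
{
    int ret = -1;
    const int *t = get_live_table(d);

    if (!t)
        goto out;
    ret = *t;
out:
    put_live_table(d);
    return ret;
}
```

Routing both the success and failure cases through one `out:` label is one way to keep the get/put pair balanced; it also avoids the opposite mistake of a redundant put on the error path.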
  11. 02 Sep, 2020 1 commit
  12. 24 Aug, 2020 1 commit
  13. 05 Aug, 2020 1 commit
  14. 24 Jul, 2020 1 commit
    • dm integrity: fix integrity recalculation that is improperly skipped · 5df96f2b
      Committed by Mikulas Patocka
      Commit adc0daad ("dm: report suspended
      device during destroy") broke integrity recalculation.
      
      The problem is dm_suspended() returns true not only during suspend,
      but also during resume. So this race condition could occur:
      1. dm_integrity_resume calls queue_work(ic->recalc_wq, &ic->recalc_work)
      2. integrity_recalc (&ic->recalc_work) preempts the current thread
      3. integrity_recalc calls if (unlikely(dm_suspended(ic->ti))) goto unlock_ret;
      4. integrity_recalc exits and no recalculating is done.
      
      To fix this race condition, add a function dm_post_suspending that is
      only true during the postsuspend phase and use it instead of
      dm_suspended().
      
      Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
      Fixes: adc0daad ("dm: report suspended device during destroy")
      Cc: stable@vger.kernel.org # v4.18+
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
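The race in steps 1 to 4 above can be modelled with two flag bits. This is a simplified sketch under stated assumptions: the flag names, `md_sketch`, and the helper functions are hypothetical, and the worker is run synchronously inside `resume_sketch()` to stand in for the preemption described in step 2. The only property taken from the commit message is that the "suspended" state is still visible while resume is in progress, whereas the "post-suspending" state is set only during the postsuspend phase.

```c
#include <stdbool.h>

/* Hypothetical flag bits modelling the two states discussed above. */
enum { F_SUSPENDED = 1 << 0, F_POST_SUSPENDING = 1 << 1 };

struct md_sketch {
    unsigned flags;
    int recalc_runs; /* how many times recalculation actually ran */
};

static inline bool is_suspended(const struct md_sketch *md)
{
    return (md->flags & F_SUSPENDED) != 0;
}

static inline bool is_post_suspending(const struct md_sketch *md)
{
    return (md->flags & F_POST_SUSPENDING) != 0;
}

/* Worker: skips recalculation when the guard reports the device is stopping. */
static inline void recalc_worker(struct md_sketch *md,
                                 bool (*guard)(const struct md_sketch *))
{
    if (guard(md))
        return;
    md->recalc_runs++;
}

/*
 * Resume: the worker is queued (here: run immediately, modelling
 * preemption) before F_SUSPENDED is cleared.
 */
static inline void resume_sketch(struct md_sketch *md,
                                 bool (*guard)(const struct md_sketch *))
{
    recalc_worker(md, guard);
    md->flags &= ~(unsigned)F_SUSPENDED;
}
```

With `is_suspended` as the guard, the worker sees the flag still set during resume and does nothing; with `is_post_suspending`, the same sequence lets the recalculation run.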
  15. 09 Jul, 2020 3 commits
  16. 08 Jul, 2020 2 commits
    • dm: use bio_uninit instead of bio_disassociate_blkg · 382761dc
      Committed by Christoph Hellwig
      bio_uninit is the proper API to clean up a BIO that has been allocated
      on stack or inside a structure that doesn't come from the BIO allocator.
      Switch dm to use that instead of bio_disassociate_blkg, which really is
      an implementation detail.  Note that the bio_uninit calls are also moved
      to the two callers of __send_empty_flush, so that they better pair with
      the bio_init calls used to initialize them.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
    • dm: do not use waitqueue for request-based DM · 85067747
      Committed by Ming Lei
      Given request-based DM now uses blk-mq's blk_mq_queue_inflight() to
      determine if outstanding IO has completed (and DM has no control over
      the blk-mq state machine used to track outstanding IO) it is unsafe to
      wake up the waiter (dm_wait_for_completion) before blk-mq has cleared a
      request's state bits (e.g. MQ_RQ_IN_FLIGHT or MQ_RQ_COMPLETE).  As
      such dm_wait_for_completion() could be left to wait indefinitely if no
      other requests complete.
      
      Fix this by eliminating request-based DM's use of waitqueue to wait
      for blk-mq requests to complete in dm_wait_for_completion.
      Signed-off-by: Ming Lei <ming.lei@redhat.com>
      Depends-on: 3c94d83c ("blk-mq: change blk_mq_queue_busy() to blk_mq_queue_inflight()")
      Cc: stable@vger.kernel.org
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
  17. 02 Jul, 2020 1 commit
    • dm: remove unused variable · b53ac8b8
      Committed by Jens Axboe
      Since merging the commit identified in Fixes below, we trigger this
      compile time warning:
      
      drivers/md/dm.c: In function ‘__map_bio’:
      drivers/md/dm.c:1296:24: warning: unused variable ‘md’ [-Wunused-variable]
       1296 |  struct mapped_device *md = io->md;
             |                        ^~
      
      Remove the 'md' variable.
      
      Fixes: 5a6c35f9 ("block: remove direct_make_request")
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
  18. 01 Jul, 2020 5 commits
  19. 29 Jun, 2020 1 commit
  20. 20 Jun, 2020 1 commit
  21. 27 May, 2020 1 commit
  22. 21 May, 2020 1 commit
    • dm: use DMDEBUG macros now that they use pr_debug variants · ac75b09f
      Committed by Mike Snitzer
      Now that DMDEBUG uses pr_debug and DMDEBUG_LIMIT uses
      pr_debug_ratelimited, clean up DM's 2 direct pr_debug callers to use
      them to get the benefit of consistent DM_FMT formatting of debugging
      messages.
      
      While doing so, dm-mpath.c:dm_report_EIO() was switched over to using
      DMDEBUG_LIMIT due to the potential for error handling floods in the IO
      completion path.
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
  23. 19 May, 2020 1 commit
    • blk-mq: allow blk_mq_make_request to consume the q_usage_counter reference · ac7c5675
      Committed by Christoph Hellwig
      blk_mq_make_request currently needs to grab a q_usage_counter
      reference when allocating a request.  This is because the block layer
      grabs one before calling blk_mq_make_request, but also releases it as
      soon as blk_mq_make_request returns.  Remove the blk_queue_exit call
      after blk_mq_make_request returns, and instead let it consume the
      reference.  This works perfectly fine for the block layer caller, just
      device mapper needs an extra reference as the old problem still
      persists there.  Open code blk_queue_enter_live in device mapper,
      as there should be no other callers and this allows better documenting
      why we do a non-try get.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>