1. 10 Apr 2019, 1 commit
    • block: Mark expected switch fall-throughs · e16fb3a8
      Authored by Gustavo A. R. Silva
      In preparation to enabling -Wimplicit-fallthrough, mark switch cases
      where we are expecting to fall through.
      
      This patch fixes the following warnings:
      
      drivers/block/drbd/drbd_int.h:1774:6: warning: this statement may fall through [-Wimplicit-fallthrough=]
      drivers/block/drbd/drbd_int.h:1774:6: warning: this statement may fall through [-Wimplicit-fallthrough=]
      drivers/block/drbd/drbd_int.h:1774:6: warning: this statement may fall through [-Wimplicit-fallthrough=]
      drivers/block/drbd/drbd_int.h:1774:6: warning: this statement may fall through [-Wimplicit-fallthrough=]
      drivers/block/drbd/drbd_int.h:1774:6: warning: this statement may fall through [-Wimplicit-fallthrough=]
      drivers/block/drbd/drbd_receiver.c:3093:6: warning: this statement may fall through [-Wimplicit-fallthrough=]
      drivers/block/drbd/drbd_receiver.c:3120:6: warning: this statement may fall through [-Wimplicit-fallthrough=]
      drivers/block/drbd/drbd_req.c:856:6: warning: this statement may fall through [-Wimplicit-fallthrough=]
      
      Warning level 3 was used: -Wimplicit-fallthrough=3
      
      This patch is part of the ongoing efforts to enable
      -Wimplicit-fallthrough.
      Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
      Acked-by: Roland Kammerer <roland.kammerer@linbit.com>
      Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
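      A minimal sketch of what "marking" a fall-through means; the switch and
      its cases are invented, only the marker convention is the point.  At the
      time of this commit the marker is a comment that GCC's
      -Wimplicit-fallthrough=3 recognizes; newer kernels use the fallthrough
      pseudo-keyword instead.

      static int classify_request(int cmd, unsigned int *flags)
      {
              switch (cmd) {
              case 1:
                      *flags |= 0x1;
                      /* fall through */
              case 2:
                      *flags |= 0x2;
                      break;
              default:
                      return -1;
              }
              return 0;
      }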
  2. 21 Dec 2018, 2 commits
    • drbd: introduce P_ZEROES (REQ_OP_WRITE_ZEROES on the "wire") · f31e583a
      Authored by Lars Ellenberg
      And also re-enable partial-zero-out + discard aligned.
      
      With the introduction of REQ_OP_WRITE_ZEROES,
      we started to use that for both WRITE_ZEROES and DISCARDS,
      hoping that WRITE_ZEROES would "do what we want":
      UNMAP if possible, zero out the rest.
      
      The example scenario is some LVM "thin" backend.
      
      While an un-allocated block on dm-thin reads as zeroes, on a dm-thin
      with "skip_block_zeroing=true", after a partial block write allocated
      that block, that same block may well map "undefined old garbage" from
      the backends on LBAs that have not yet been written to.
      
      If we cannot distinguish between zero-out and discard on the receiving
      side, then to keep "undefined old garbage" from popping up randomly at
      later times on supposedly zero-initialized blocks, we would need to map
      all discards to zero-out on the receiving side.  But that would potentially do a full
      alloc on thinly provisioned backends, even when the expectation was to
      unmap/trim/discard/de-allocate.
      
      We need to distinguish on the protocol level, whether we need to guarantee
      zeroes (and thus use zero-out, potentially doing the mentioned full-alloc),
      or if we want to put the emphasis on discard, and only do a "best effort
      zeroing" (by "discarding" blocks aligned to discard-granularity, and zeroing
      only potential unaligned head and tail clippings to at least *try* to
      avoid "false positives" in an online-verify later), hoping that someone
      set skip_block_zeroing=false.
      
      For some discussion regarding this on dm-devel, see also
      https://www.mail-archive.com/dm-devel%40redhat.com/msg07965.html
      https://www.redhat.com/archives/dm-devel/2018-January/msg00271.html
      
      For backward compatibility, P_TRIM means zero-out, unless the
      DRBD_FF_WZEROES feature flag is agreed upon during handshake.
      
      To have upper layers even try to submit WRITE ZEROES requests,
      we need to announce "efficient zeroout" independently.
      
      We need to fixup max_write_zeroes_sectors after blk_queue_stack_limits():
      if we can handle "zeroes" efficiently on the protocol,
      we want to do that, even if our backend does not announce
      max_write_zeroes_sectors itself.
      Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
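      A minimal sketch of the protocol-level decision described above; P_TRIM,
      P_ZEROES and DRBD_FF_WZEROES come from the commit, while the helper, its
      arguments and the flag's numeric value are assumptions for illustration.

      enum wire_packet { WP_TRIM, WP_ZEROES };        /* stand-ins for P_TRIM / P_ZEROES */

      #define DRBD_FF_WZEROES 16      /* feature-flag bit; value assumed for illustration */

      static enum wire_packet pick_zeroing_packet(unsigned int agreed_features,
                                                  int need_zeroes)
      {
              /* Old peers: P_TRIM is the only option and, for backward
               * compatibility, means zero-out. */
              if (!(agreed_features & DRBD_FF_WZEROES))
                      return WP_TRIM;

              /* New peers: state the intent explicitly on the wire. */
              return need_zeroes ? WP_ZEROES : WP_TRIM;
      }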
    • drbd: centralize printk reporting of new size into drbd_set_my_capacity() · d5412e8d
      Authored by Lars Ellenberg
      Previously, some implicit resizes that happened during the handshake
      were not reported as prominently as explicit resizes.
      Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
  3. 04 Oct 2018, 1 commit
  4. 07 Sep 2018, 1 commit
  5. 09 Jul 2018, 1 commit
  6. 31 May 2018, 1 commit
  7. 16 May 2018, 1 commit
  8. 07 Nov 2017, 1 commit
    • drbd: Convert timers to use timer_setup() · 2bccef39
      Authored by Kees Cook
      In preparation for unconditionally passing the struct timer_list pointer to
      all timer callbacks, switch to using the new timer_setup() and from_timer()
      to pass the timer pointer explicitly.
      
      Cc: Philipp Reisner <philipp.reisner@linbit.com>
      Cc: Lars Ellenberg <lars.ellenberg@linbit.com>
      Cc: drbd-dev@lists.linbit.com
      Signed-off-by: Kees Cook <keescook@chromium.org>
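      For reference, a self-contained sketch of the timer_setup()/from_timer()
      pattern this commit converts DRBD to; the structure and callback below
      are invented.

      #include <linux/timer.h>

      struct my_device {
              struct timer_list timeout;
              int pending;
      };

      static void my_timeout_fn(struct timer_list *t)
      {
              /* Recover the enclosing structure from the timer_list pointer. */
              struct my_device *dev = from_timer(dev, t, timeout);

              dev->pending = 0;
      }

      static void my_device_init(struct my_device *dev)
      {
              dev->pending = 0;
              /* Replaces setup_timer(&dev->timeout, fn, (unsigned long)dev). */
              timer_setup(&dev->timeout, my_timeout_fn, 0);
      }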
  9. 30 Aug 2017, 3 commits
  10. 24 Aug 2017, 1 commit
    • block: replace bi_bdev with a gendisk pointer and partitions index · 74d46992
      Authored by Christoph Hellwig
      This way we don't need a block_device structure to submit I/O.  The
      block_device has different lifetime rules from the gendisk and
      request_queue and is usually only available when the block device node
      is open.  Other callers need to explicitly create one (e.g. the lightnvm
      passthrough code, or the new nvme multipathing code).
      
      For the actual I/O path all that we need is the gendisk, which exists
      once per block device.  But given that the block layer also does
      partition remapping we additionally need a partition index, which is
      used for said remapping in generic_make_request.
      
      Note that all the block drivers generally want request_queue or
      sometimes the gendisk, so this removes a layer of indirection all
      over the stack.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
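      A short sketch of what the change means for a driver that points a bio
      at its backing device; bio_set_dev() is the helper added around this
      series (shown from memory), and the wrapper function is invented.

      #include <linux/bio.h>
      #include <linux/blkdev.h>

      static void point_bio_at_backend(struct bio *bio, struct block_device *backend)
      {
              /* Before: bio->bi_bdev = backend;
               * After:  the bio records the gendisk plus a partition index,
               *         which generic_make_request() uses for partition
               *         remapping. */
              bio_set_dev(bio, backend);
      }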
  11. 19 Jun 2017, 1 commit
    • drbd: use bio_clone_fast() instead of bio_clone() · 8cb0defb
      Authored by NeilBrown
      drbd does not modify the bi_io_vec of the cloned bio,
      so there is no need to clone that part.  So bio_clone_fast()
      is the better choice.
      For bio_clone_fast() we need to specify a bio_set.
      We could use fs_bio_set, which bio_clone() uses, or
      drbd_md_io_bio_set, which drbd uses for metadata, but it is
      generally best to avoid sharing bio_sets unless you can
      be certain that there are no interdependencies.
      
      So create a new bio_set, drbd_io_bio_set, and use bio_clone_fast().
      
      Also remove a "XXX cannot fail ???" comment because it definitely
      cannot fail - bio_clone_fast() doesn't fail if the GFP flags allow for
      sleeping.
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: NeilBrown <neilb@suse.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
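      A sketch of the dedicated-bio_set pattern; the signatures follow the
      later bioset_init() API rather than the bioset_create() call used at the
      time, and the pool size and flags are illustrative.

      #include <linux/bio.h>

      static struct bio_set clone_bio_set;    /* private pool, like drbd_io_bio_set */

      static int my_clone_pool_init(void)
      {
              /* 128 reserved bios, no front padding, no bvec pool needed:
               * bio_clone_fast() shares the original bio's bi_io_vec. */
              return bioset_init(&clone_bio_set, 128, 0, 0);
      }

      static struct bio *clone_for_submit(struct bio *bio_src)
      {
              /* With a sleeping GFP mask the mempool guarantees this cannot fail. */
              return bio_clone_fast(bio_src, GFP_NOIO, &clone_bio_set);
      }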
  12. 09 Jun 2017, 1 commit
  13. 09 Apr 2017, 1 commit
  14. 02 Mar 2017, 1 commit
  15. 03 Aug 2016, 1 commit
  16. 14 Jun 2016, 9 commits
    • drbd: al_write_transaction: skip re-scanning of bitmap page pointer array · 27ea1d87
      Authored by Lars Ellenberg
      For larger devices, the array of bitmap page pointers can grow very
      large (8000 pointers per TB of storage).
      
      For each activity log transaction, we need to flush the associated
      bitmap pages to stable storage. Currently, we just "mark" the respective
      pages while setting up the transaction, then tell the bitmap code to
      write out all marked pages, but skip unchanged pages.
      
      But one such transaction affects only a small number of bitmap pages,
      so there is no need to scan the full array of several (ten) thousand
      page pointers to find the few marked ones.
      
      Instead, remember the index numbers of the few affected pages,
      and later only re-check those to skip duplicates and unchanged ones.
      Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
      Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
      Signed-off-by: Jens Axboe <axboe@fb.com>
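      A sketch of the bookkeeping described above; the names and the
      per-transaction bound are invented (64 is assumed to be at or above
      AL_UPDATES_PER_TRANSACTION).

      #define MAX_HOT_PAGES 64        /* assumed per-transaction page bound */

      struct al_tx_ctx {
              unsigned int n_hot;
              unsigned int hot_page_idx[MAX_HOT_PAGES];
      };

      /* Remember a bitmap page touched by this transaction instead of marking
       * it and later re-scanning the full page-pointer array. */
      static void remember_hot_page(struct al_tx_ctx *ctx, unsigned int page_idx)
      {
              unsigned int i;

              for (i = 0; i < ctx->n_hot; i++)
                      if (ctx->hot_page_idx[i] == page_idx)
                              return;         /* duplicate, already queued */
              if (ctx->n_hot < MAX_HOT_PAGES)
                      ctx->hot_page_idx[ctx->n_hot++] = page_idx;
      }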
    • drbd: code cleanups without semantic changes · 7e5fec31
      Authored by Fabian Frederick
      This contains various cosmetic fixes ranging from simple typos to
      const-ifying, and using booleans properly.
      
      Original commit messages from Fabian's patch set:
      drbd: debugfs: constify drbd_version_fops
      drbd: use seq_put instead of seq_print where possible
      drbd: include linux/uaccess.h instead of asm/uaccess.h
      drbd: use const char * const for drbd strings
      drbd: kerneldoc warning fix in w_e_end_data_req()
      drbd: use unsigned for one bit fields
      drbd: use bool for peer is_ states
      drbd: fix typo
      drbd: use | for bitmask combination
      drbd: use true/false for bool
      drbd: fix drbd_bm_init() comments
      drbd: introduce peer state union
      drbd: fix maybe_pull_ahead() locking comments
      drbd: use bool for growing
      drbd: remove redundant declarations
      drbd: replace if/BUG by BUG_ON
      Signed-off-by: Fabian Frederick <fabf@skynet.be>
      Signed-off-by: Roland Kammerer <roland.kammerer@linbit.com>
      Signed-off-by: Jens Axboe <axboe@fb.com>
    • drbd: introduce WRITE_SAME support · 9104d31a
      Authored by Lars Ellenberg
      We will support WRITE_SAME, if
       * all peers support WRITE_SAME (both in kernel and DRBD version),
       * all peer devices support WRITE_SAME
       * logical_block_size is identical on all peers.
      
      We may at some point introduce a fallback on the receiving side
      for devices/kernels that do not support WRITE_SAME,
      by open-coding a submit loop. But not yet.
      Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
      Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
      Signed-off-by: Jens Axboe <axboe@fb.com>
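      The three conditions above, restated as a sketch; the structure and
      field names are invented.

      #include <stdbool.h>

      struct peer_caps {
              bool kernel_and_drbd_support_ws;        /* peer kernel + DRBD version */
              bool device_supports_ws;                /* peer backing device */
              unsigned int logical_block_size;
      };

      static bool may_enable_write_same(const struct peer_caps *peers, int n_peers,
                                        unsigned int local_logical_block_size)
      {
              int i;

              for (i = 0; i < n_peers; i++) {
                      if (!peers[i].kernel_and_drbd_support_ws ||
                          !peers[i].device_supports_ws ||
                          peers[i].logical_block_size != local_logical_block_size)
                              return false;
              }
              return true;
      }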
    • drbd: introduce unfence-peer handler · 26a96110
      Authored by Lars Ellenberg
      When resync is finished, we already call the "after-resync-target"
      handler (on the former sync target, obviously), once per volume.
      
      Paired with the before-resync-target handler, you can create snapshots
      before the resync causes the volumes to become inconsistent, and discard
      those snapshots again once they are no longer needed.
      
      It was also overloaded to be paired with the "fence-peer" handler,
      to "unfence" once the volumes are up-to-date and known good.
      
      This has some disadvantages, though: we call "fence-peer" for the whole
      connection (once for the group of volumes), but would call unfence as
      side-effect of after-resync-target once for each volume.
      
      Also, we fence on a (current, or about to become) Primary,
      which will later become the sync-source.
      
      Calling unfence only as a side effect of the after-resync-target
      handler opens a race window, between a new fence on the Primary
      (SyncTarget) and the unfence on the SyncTarget, which is difficult to
      close without some kind of "cluster wide lock" in those handlers.
      
      We would not need those handlers if we could still communicate,
      which makes trying to acquire a cluster-wide lock from those handlers
      seem like a very bad idea.
      
      This introduces the "unfence-peer" handler, which will be called
      per connection (once for the group of volumes), just like the fence
      handler, only once all volumes are back in sync, and on the SyncSource.
      
      Which is expected to be the node that previously called "fence", the
      node that is currently allowed to be Primary, and thus the only node
      that could trigger a new "fence" that could race with this unfence.
      
      That means we do not need any cluster-wide synchronization here;
      serializing two scripts running on the same node is trivial.
      Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
      Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
      Signed-off-by: Jens Axboe <axboe@fb.com>
    • drbd: finish resync on sync source only by notification from sync target · 5052fee2
      Authored by Lars Ellenberg
      If the replication link breaks exactly during "resync finished" detection,
      finishing too early on the sync source could again lead to UUIDs rotated
      too fast, and potentially a spurious full resync on the next handshake.
      
      Always wait for explicit resync finished state change notification from
      the sync target.
      Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
      Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
      Signed-off-by: Jens Axboe <axboe@fb.com>
    • drbd: allow larger max_discard_sectors · 505675f9
      Authored by Lars Ellenberg
      Make sure we have at least 67 (> AL_UPDATES_PER_TRANSACTION)
      al-extents available, and allow up to half of that to be
      discarded in one bio.
      Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
      Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
      Signed-off-by: Jens Axboe <axboe@fb.com>
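      A back-of-the-envelope version of the limit above, assuming the usual
      4 MiB DRBD activity-log extent (AL_EXTENT_SHIFT == 22); the macro names
      are illustrative, not the driver's.

      #define AL_EXTENT_SHIFT         22      /* 4 MiB per AL extent (assumed) */
      #define MIN_AL_EXTENTS          67      /* > AL_UPDATES_PER_TRANSACTION */

      /* Up to half of the guaranteed AL extents may be covered by one discard
       * bio: 33 extents * 4 MiB = 132 MiB, i.e. 270336 sectors of 512 bytes. */
      #define MAX_DISCARD_BYTES       ((unsigned long)(MIN_AL_EXTENTS / 2) << AL_EXTENT_SHIFT)
      #define MAX_DISCARD_SECTORS     (MAX_DISCARD_BYTES >> 9)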
    • drbd: zero-out partial unaligned discards on local backend · 7435e901
      Authored by Lars Ellenberg
      For consistency, also zero-out partial unaligned chunks of discard
      requests on the local backend.
      Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
      Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
      Signed-off-by: Jens Axboe <axboe@fb.com>
    • drbd: when receiving P_TRIM, zero-out partial unaligned chunks · dd4f699d
      Authored by Lars Ellenberg
      We can avoid spurious data divergence caused by partially-ignored
      discards on certain backends with discard_zeroes_data=0, if we
      translate partial unaligned discard requests into explicit zero-out.
      
      The relevant use case is LVM/DM thin.
      
      If on different nodes, DRBD is backed by devices with differing
      discard characteristics, discards may lead to data divergence
      (old data or garbage left over on one backend, zeroes due to
      unmapped areas on the other backend). Online verify would now
      potentially report tons of spurious differences.
      
      While probably harmless for most use cases (fstrim on a file system),
      DRBD cannot have that: it would violate our promise to upper layers
      that our data instances on the nodes are identical.
      
      To be correct and play safe (make sure data is identical on both copies),
      we would have to disable discard support, if our local backend (on a
      Primary) does not support "discard_zeroes_data=true".
      
      We'd also have to translate discards to explicit zero-out on the
      receiving (typically: Secondary) side, unless the receiving side
      supports "discard_zeroes_data=true".
      
      Both of which would allocate those blocks instead of unmapping them,
      in contrast with expectations.
      
      LVM/DM thin does set discard_zeroes_data=0,
      because it silently ignores discards to partial chunks.
      
      We can work around this by checking the alignment first.
      For discards that are unaligned (with respect to alignment and granularity)
      or too small, we zero out the initial and/or trailing unaligned partial
      chunks, but discard all the aligned full chunks.
      
      At least for LVM/DM thin, the result is effectively "discard_zeroes_data=1".
      
      Arguably it should behave this way internally, by default,
      and we'll try to make that happen.
      
      But our workaround is still valid for already deployed setups,
      and for other devices that may behave this way.
      
      Setting discard-zeroes-if-aligned=yes will allow DRBD to use
      discards, and to announce discard_zeroes_data=true, even on
      backends that announce discard_zeroes_data=false.
      
      Setting discard-zeroes-if-aligned=no will cause DRBD to always
      fall-back to zero-out on the receiving side, and to not even
      announce discard capabilities on the Primary, if the respective
      backend announces discard_zeroes_data=false.
      
      We used to ignore the discard_zeroes_data setting completely.
      To avoid breaking established and expected behaviour, and suddenly
      causing fstrim on thin-provisioned LVs to run out of space instead
      of freeing up space, the default value is "yes".
      Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
      Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
      Signed-off-by: Jens Axboe <axboe@fb.com>
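      A minimal sketch of the head/middle/tail split described above, in
      sector units with invented names; it ignores the device's
      discard_alignment offset for brevity.

      typedef unsigned long long sector_nr;

      struct discard_plan {
              sector_nr zero_head_start, zero_head_len;       /* unaligned head: zero out */
              sector_nr discard_start, discard_len;           /* aligned middle: discard  */
              sector_nr zero_tail_start, zero_tail_len;       /* unaligned tail: zero out */
      };

      static void plan_discard(sector_nr start, sector_nr len, sector_nr granularity,
                               struct discard_plan *p)
      {
              sector_nr end = start + len;
              sector_nr aligned_start = (start + granularity - 1) / granularity * granularity;
              sector_nr aligned_end = end / granularity * granularity;

              *p = (struct discard_plan){ 0 };
              if (aligned_end <= aligned_start) {
                      /* Too small or entirely unaligned: zero out the whole range. */
                      p->zero_head_start = start;
                      p->zero_head_len = len;
                      return;
              }
              p->zero_head_start = start;
              p->zero_head_len = aligned_start - start;
              p->discard_start = aligned_start;
              p->discard_len = aligned_end - aligned_start;
              p->zero_tail_start = aligned_end;
              p->zero_tail_len = end - aligned_end;
      }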
    • drbd: Implement handling of thinly provisioned storage on resync target nodes · 700ca8c0
      Authored by Philipp Reisner
      If during resync we read only zeroes for a range of sectors, assume
      that these sectors can be discarded on the sync target node.
      Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
      Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
      Signed-off-by: Jens Axboe <axboe@fb.com>
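      A sketch of the all-zero test implied above (not the DRBD helper): if a
      block read during resync is all zeroes, it can be turned into a discard
      on the sync target instead of being transmitted as data.

      #include <stdbool.h>
      #include <stddef.h>
      #include <string.h>

      static bool range_reads_as_zero(const void *buf, size_t len)
      {
              const unsigned char *p = buf;

              /* All bytes are zero iff the first byte is zero and every byte
               * equals the byte before it. */
              return len == 0 || (p[0] == 0 && memcmp(p, p + 1, len - 1) == 0);
      }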
  17. 10 Jun 2016, 1 commit
  18. 08 Jun 2016, 1 commit
  19. 05 Apr 2016, 1 commit
  20. 27 Jan 2016, 1 commit
  21. 23 Jan 2016, 1 commit
  22. 26 Nov 2015, 8 commits