1. 23 5月, 2018 1 次提交
    • D
      dax: Introduce a ->copy_to_iter dax operation · b3a9a0c3
      Dan Williams 提交于
      Similar to the ->copy_from_iter() operation, a platform may want to
      deploy an architecture or device specific routine for handling reads
      from a dax_device like /dev/pmemX. On x86 this routine will point to a
      machine check safe version of copy_to_iter(). For now, add the plumbing
      to device-mapper and the dax core.
      
      Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
      Cc: Mike Snitzer <snitzer@redhat.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Signed-off-by: NDan Williams <dan.j.williams@intel.com>
      b3a9a0c3
  2. 05 4月, 2018 1 次提交
  3. 04 4月, 2018 2 次提交
  4. 18 3月, 2018 1 次提交
    • B
      block: Move SECTOR_SIZE and SECTOR_SHIFT definitions into <linux/blkdev.h> · 233bde21
      Bart Van Assche 提交于
      It happens often while I'm preparing a patch for a block driver that
      I'm wondering: is a definition of SECTOR_SIZE and/or SECTOR_SHIFT
      available for this driver? Do I have to introduce definitions of these
      constants before I can use these constants? To avoid this confusion,
      move the existing definitions of SECTOR_SIZE and SECTOR_SHIFT into the
      <linux/blkdev.h> header file such that these become available for all
      block drivers. Make the SECTOR_SIZE definition in the uapi msdos_fs.h
      header file conditional to avoid that including that header file after
      <linux/blkdev.h> causes the compiler to complain about a SECTOR_SIZE
      redefinition.
      
      Note: the SECTOR_SIZE / SECTOR_SHIFT / SECTOR_BITS definitions have
      not been removed from uapi header files nor from NAND drivers in
      which these constants are used for another purpose than converting
      block layer offsets and sizes into a number of sectors.
      
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Mike Snitzer <snitzer@redhat.com>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: Minchan Kim <minchan@kernel.org>
      Cc: Nitin Gupta <ngupta@vflare.org>
      Reviewed-by: NSergey Senozhatsky <sergey.senozhatsky@gmail.com>
      Reviewed-by: NChristoph Hellwig <hch@lst.de>
      Reviewed-by: NJohannes Thumshirn <jthumshirn@suse.de>
      Reviewed-by: NMartin K. Petersen <martin.petersen@oracle.com>
      Signed-off-by: NBart Van Assche <bart.vanassche@wdc.com>
      Signed-off-by: NJens Axboe <axboe@kernel.dk>
      233bde21
  5. 30 1月, 2018 1 次提交
  6. 17 1月, 2018 1 次提交
  7. 20 12月, 2017 1 次提交
    • M
      dm: introduce DM_TYPE_NVME_BIO_BASED · 22c11858
      Mike Snitzer 提交于
      If dm_table_determine_type() establishes DM_TYPE_NVME_BIO_BASED then
      all devices in the DM table do not support partial completions.  Also,
      the table has a single immutable target that doesn't require DM core to
      split bios.
      
      This will enable adding NVMe optimizations to bio-based DM.
      Signed-off-by: NMike Snitzer <snitzer@redhat.com>
      22c11858
  8. 17 12月, 2017 1 次提交
    • M
      dm: improve performance by moving dm_io structure to per-bio-data · 64f52b0e
      Mike Snitzer 提交于
      Eliminates need for a separate mempool to allocate 'struct dm_io'
      objects from.  As such, it saves an extra mempool allocation for each
      original bio that DM core is issued.
      
      This complicates the per-bio-data accessor functions by needing to
      conditonally add extra padding to get to a target's per-bio-data.  But
      in the end this provides a decent performance improvement for all
      bio-based DM devices.
      
      On an NVMe-loop based testbed to a ramdisk (~3100 MB/s): bio-based
      DM linear performance improved by 2% (went from 2665 to 2777 MB/s).
      Signed-off-by: NMike Snitzer <snitzer@redhat.com>
      64f52b0e
  9. 14 12月, 2017 1 次提交
  10. 11 9月, 2017 1 次提交
    • M
      dax: remove the pmem_dax_ops->flush abstraction · c3ca015f
      Mikulas Patocka 提交于
      Commit abebfbe2 ("dm: add ->flush() dax operation support") is
      buggy. A DM device may be composed of multiple underlying devices and
      all of them need to be flushed. That commit just routes the flush
      request to the first device and ignores the other devices.
      
      It could be fixed by adding more complex logic to the device mapper. But
      there is only one implementation of the method pmem_dax_ops->flush - that
      is pmem_dax_flush() - and it calls arch_wb_cache_pmem(). Consequently, we
      don't need the pmem_dax_ops->flush abstraction at all, we can call
      arch_wb_cache_pmem() directly from dax_flush() because dax_dev->ops->flush
      can't ever reach anything different from arch_wb_cache_pmem().
      
      It should be also pointed out that for some uses of persistent memory it
      is needed to flush only a very small amount of data (such as 1 cacheline),
      and it would be overkill if we go through that device mapper machinery for
      a single flushed cache line.
      
      Fix this by removing the pmem_dax_ops->flush abstraction and call
      arch_wb_cache_pmem() directly from dax_flush(). Also, remove the device
      mapper code that forwards the flushes.
      
      Fixes: abebfbe2 ("dm: add ->flush() dax operation support")
      Cc: stable@vger.kernel.org
      Signed-off-by: NMikulas Patocka <mpatocka@redhat.com>
      Reviewed-by: NDan Williams <dan.j.williams@intel.com>
      Signed-off-by: NMike Snitzer <snitzer@redhat.com>
      c3ca015f
  11. 28 8月, 2017 2 次提交
  12. 19 6月, 2017 3 次提交
    • D
      dm: introduce dm_remap_zone_report() · 10999307
      Damien Le Moal 提交于
      A target driver support zoned block devices and exposing it as such may
      receive REQ_OP_ZONE_REPORT request for the user to determine the mapped
      device zone configuration. To process properly such request, the target
      driver may need to remap the zone descriptors provided in the report
      reply. The helper function dm_remap_zone_report() does this generically
      using only the target start offset and length and the start offset
      within the target device.
      
      dm_remap_zone_report() will remap the start sector of all zones
      reported. If the report includes sequential zones, the write pointer
      position of these zones will also be remapped.
      Signed-off-by: NDamien Le Moal <damien.lemoal@wdc.com>
      Reviewed-by: NHannes Reinecke <hare@suse.com>
      Reviewed-by: NBart Van Assche <bart.vanassche@sandisk.com>
      Signed-off-by: NMike Snitzer <snitzer@redhat.com>
      10999307
    • D
      dm table: add zoned block devices validation · dd88d313
      Damien Le Moal 提交于
      1) Introduce DM_TARGET_ZONED_HM feature flag:
      
      The target drivers currently available will not operate correctly if a
      table target maps onto a host-managed zoned block device.
      
      To avoid problems, introduce the new feature flag DM_TARGET_ZONED_HM to
      allow a target to explicitly state that it supports host-managed zoned
      block devices.  This feature is checked for all targets in a table if
      any of the table's block devices are host-managed.
      
      Note that as host-aware zoned block devices are backward compatible with
      regular block devices, they can be used by any of the current target
      types.  This new feature is thus restricted to host-managed zoned block
      devices.
      
      2) Check device area zone alignment:
      
      If a target maps to a zoned block device, check that the device area is
      aligned on zone boundaries to avoid problems with REQ_OP_ZONE_RESET
      operations (resetting a partially mapped sequential zone would not be
      possible).  This also facilitates the processing of zone report with
      REQ_OP_ZONE_REPORT bios.
      
      3) Check block devices zone model compatibility
      
      When setting the DM device's queue limits, several possibilities exists
      for zoned block devices:
      1) The DM target driver may want to expose a different zone model
      (e.g. host-managed device emulation or regular block device on top of
      host-managed zoned block devices)
      2) Expose the underlying zone model of the devices as-is
      
      To allow both cases, the underlying block device zone model must be set
      in the target limits in dm_set_device_limits() and the compatibility of
      all devices checked similarly to the logical block size alignment.  For
      this last check, introduce validate_hardware_zoned_model() to check that
      all targets of a table have the same zone model and that the zone size
      of the target devices are equal.
      Signed-off-by: NDamien Le Moal <damien.lemoal@wdc.com>
      Reviewed-by: NHannes Reinecke <hare@suse.com>
      Reviewed-by: NBart Van Assche <bart.vanassche@sandisk.com>
      [Mike Snitzer refactored Damien's original work to simplify the code]
      Signed-off-by: NMike Snitzer <snitzer@redhat.com>
      dd88d313
    • J
      dm: convert DM printk macros to pr_<level> macros · d2c3c8dc
      Joe Perches 提交于
      Using pr_<level> is the more common logging style.
      
      Standardize style and use new macro DM_FMT.
      Use no_printk in DMDEBUG macros when CONFIG_DM_DEBUG is not #defined.
      Signed-off-by: NJoe Perches <joe@perches.com>
      Signed-off-by: NMike Snitzer <snitzer@redhat.com>
      d2c3c8dc
  13. 16 6月, 2017 1 次提交
  14. 10 6月, 2017 1 次提交
  15. 09 6月, 2017 3 次提交
    • C
      block: switch bios to blk_status_t · 4e4cbee9
      Christoph Hellwig 提交于
      Replace bi_error with a new bi_status to allow for a clear conversion.
      Note that device mapper overloaded bi_error with a private value, which
      we'll have to keep arround at least for now and thus propagate to a
      proper blk_status_t value.
      Signed-off-by: NChristoph Hellwig <hch@lst.de>
      Signed-off-by: NJens Axboe <axboe@fb.com>
      4e4cbee9
    • C
      block: introduce new block status code type · 2a842aca
      Christoph Hellwig 提交于
      Currently we use nornal Linux errno values in the block layer, and while
      we accept any error a few have overloaded magic meanings.  This patch
      instead introduces a new  blk_status_t value that holds block layer specific
      status codes and explicitly explains their meaning.  Helpers to convert from
      and to the previous special meanings are provided for now, but I suspect
      we want to get rid of them in the long run - those drivers that have a
      errno input (e.g. networking) usually get errnos that don't know about
      the special block layer overloads, and similarly returning them to userspace
      will usually return somethings that strictly speaking isn't correct
      for file system operations, but that's left as an exercise for later.
      
      For now the set of errors is a very limited set that closely corresponds
      to the previous overloaded errno values, but there is some low hanging
      fruite to improve it.
      
      blk_status_t (ab)uses the sparse __bitwise annotations to allow for sparse
      typechecking, so that we can easily catch places passing the wrong values.
      Signed-off-by: NChristoph Hellwig <hch@lst.de>
      Signed-off-by: NJens Axboe <axboe@fb.com>
      2a842aca
    • C
      dm: change ->end_io calling convention · 1be56909
      Christoph Hellwig 提交于
      Turn the error paramter into a pointer so that target drivers can change
      the value, and make sure only DM_ENDIO_* values are returned from the
      methods.
      Signed-off-by: NChristoph Hellwig <hch@lst.de>
      Signed-off-by: NMike Snitzer <snitzer@redhat.com>
      Signed-off-by: NJens Axboe <axboe@fb.com>
      1be56909
  16. 02 5月, 2017 2 次提交
  17. 28 4月, 2017 1 次提交
  18. 26 4月, 2017 1 次提交
  19. 25 4月, 2017 1 次提交
    • M
      dm: mark targets that pass integrity data · e2460f2a
      Mikulas Patocka 提交于
      A dm-crypt on dm-integrity device incorrectly advertises an integrity
      profile on the DM crypt device.  It can be seen in the files
      "/sys/block/dm-*/integrity/*" that both dm-integrity and dm-crypt target
      advertise the integrity profile.  That is incorrect, only the
      dm-integrity target should advertise the integrity profile.
      
      A general problem in DM is that if we have a DM device that depends on
      another device with an integrity profile, the upper device will always
      advertise the integrity profile, even when the target driver doesn't
      support handling integrity data.
      
      Most targets don't support integrity data, so we provide a whitelist of
      targets that support it (linear, delay and striped).  The targets that
      support passing integrity data to the lower device are marked with the
      flag DM_TARGET_PASSES_INTEGRITY.  The DM core will now advertise
      integrity data on a DM device only if all the targets support the
      integrity data.
      Signed-off-by: NMikulas Patocka <mpatocka@redhat.com>
      Signed-off-by: NMike Snitzer <snitzer@redhat.com>
      e2460f2a
  20. 21 4月, 2017 1 次提交
    • D
      dm: add dax_device and dax_operations support · f26c5719
      Dan Williams 提交于
      Allocate a dax_device to represent the capacity of a device-mapper
      instance. Provide a ->direct_access() method via the new dax_operations
      indirection that mirrors the functionality of the current direct_access
      support via block_device_operations.  Once fs/dax.c has been converted
      to use dax_operations the old dm_blk_direct_access() will be removed.
      
      A new helper dm_dax_get_live_target() is introduced to separate some of
      the dm-specifics from the direct_access implementation.
      
      This enabling is only for the top-level dm representation to upper
      layers. Converting target direct_access implementations is deferred to a
      separate patch.
      
      Cc: Toshi Kani <toshi.kani@hpe.com>
      Reviewed-by: NMike Snitzer <snitzer@redhat.com>
      Signed-off-by: NDan Williams <dan.j.williams@intel.com>
      f26c5719
  21. 09 4月, 2017 2 次提交
  22. 08 3月, 2017 1 次提交
  23. 28 1月, 2017 1 次提交
  24. 15 9月, 2016 1 次提交
  25. 21 7月, 2016 1 次提交
    • T
      dm: add infrastructure for DAX support · 545ed20e
      Toshi Kani 提交于
      Change mapped device to implement direct_access function,
      dm_blk_direct_access(), which calls a target direct_access function.
      'struct target_type' is extended to have target direct_access interface.
      This function limits direct accessible size to the dm_target's limit
      with max_io_len().
      
      Add dm_table_supports_dax() to iterate all targets and associated block
      devices to check for DAX support.  To add DAX support to a DM target the
      target must only implement the direct_access function.
      
      Add a new dm type, DM_TYPE_DAX_BIO_BASED, which indicates that mapped
      device supports DAX and is bio based.  This new type is used to assure
      that all target devices have DAX support and remain that way after
      QUEUE_FLAG_DAX is set in mapped device.
      
      At initial table load, QUEUE_FLAG_DAX is set to mapped device when setting
      DM_TYPE_DAX_BIO_BASED to the type.  Any subsequent table load to the
      mapped device must have the same type, or else it fails per the check in
      table_load().
      Signed-off-by: NToshi Kani <toshi.kani@hpe.com>
      Signed-off-by: NMike Snitzer <snitzer@redhat.com>
      545ed20e
  26. 11 6月, 2016 1 次提交
    • M
      dm mpath: add optional "queue_mode" feature · e83068a5
      Mike Snitzer 提交于
      Allow a user to specify an optional feature 'queue_mode <mode>' where
      <mode> may be "bio", "rq" or "mq" -- which corresponds to bio-based,
      request_fn rq-based, and blk-mq rq-based respectively.
      
      If the queue_mode feature isn't specified the default for the
      "multipath" target is still "rq" but if dm_mod.use_blk_mq is set to Y
      it'll default to mode "mq".
      
      This new queue_mode feature introduces the ability for each multipath
      device to have its own queue_mode (whereas before this feature all
      multipath devices effectively had to have the same queue_mode).
      
      This commit also goes a long way to eliminate the awkward (ab)use of
      DM_TYPE_*, the associated filter_md_type() and other relatively fragile
      and difficult to maintain code.
      Signed-off-by: NMike Snitzer <snitzer@redhat.com>
      e83068a5
  27. 11 3月, 2016 1 次提交
    • D
      dm snapshot: disallow the COW and origin devices from being identical · 4df2bf46
      DingXiang 提交于
      Otherwise loading a "snapshot" table using the same device for the
      origin and COW devices, e.g.:
      
      echo "0 20971520 snapshot 253:3 253:3 P 8" | dmsetup create snap
      
      will trigger:
      
      BUG: unable to handle kernel NULL pointer dereference at 0000000000000098
      [ 1958.979934] IP: [<ffffffffa040efba>] dm_exception_store_set_chunk_size+0x7a/0x110 [dm_snapshot]
      [ 1958.989655] PGD 0
      [ 1958.991903] Oops: 0000 [#1] SMP
      ...
      [ 1959.059647] CPU: 9 PID: 3556 Comm: dmsetup Tainted: G          IO    4.5.0-rc5.snitm+ #150
      ...
      [ 1959.083517] task: ffff8800b9660c80 ti: ffff88032a954000 task.ti: ffff88032a954000
      [ 1959.091865] RIP: 0010:[<ffffffffa040efba>]  [<ffffffffa040efba>] dm_exception_store_set_chunk_size+0x7a/0x110 [dm_snapshot]
      [ 1959.104295] RSP: 0018:ffff88032a957b30  EFLAGS: 00010246
      [ 1959.110219] RAX: 0000000000000000 RBX: 0000000000000008 RCX: 0000000000000001
      [ 1959.118180] RDX: 0000000000000000 RSI: 0000000000000008 RDI: ffff880329334a00
      [ 1959.126141] RBP: ffff88032a957b50 R08: 0000000000000000 R09: 0000000000000001
      [ 1959.134102] R10: 000000000000000a R11: f000000000000000 R12: ffff880330884d80
      [ 1959.142061] R13: 0000000000000008 R14: ffffc90001c13088 R15: ffff880330884d80
      [ 1959.150021] FS:  00007f8926ba3840(0000) GS:ffff880333440000(0000) knlGS:0000000000000000
      [ 1959.159047] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [ 1959.165456] CR2: 0000000000000098 CR3: 000000032f48b000 CR4: 00000000000006e0
      [ 1959.173415] Stack:
      [ 1959.175656]  ffffc90001c13040 ffff880329334a00 ffff880330884ed0 ffff88032a957bdc
      [ 1959.183946]  ffff88032a957bb8 ffffffffa040f225 ffff880329334a30 ffff880300000000
      [ 1959.192233]  ffffffffa04133e0 ffff880329334b30 0000000830884d58 00000000569c58cf
      [ 1959.200521] Call Trace:
      [ 1959.203248]  [<ffffffffa040f225>] dm_exception_store_create+0x1d5/0x240 [dm_snapshot]
      [ 1959.211986]  [<ffffffffa040d310>] snapshot_ctr+0x140/0x630 [dm_snapshot]
      [ 1959.219469]  [<ffffffffa0005c44>] ? dm_split_args+0x64/0x150 [dm_mod]
      [ 1959.226656]  [<ffffffffa0005ea7>] dm_table_add_target+0x177/0x440 [dm_mod]
      [ 1959.234328]  [<ffffffffa0009203>] table_load+0x143/0x370 [dm_mod]
      [ 1959.241129]  [<ffffffffa00090c0>] ? retrieve_status+0x1b0/0x1b0 [dm_mod]
      [ 1959.248607]  [<ffffffffa0009e35>] ctl_ioctl+0x255/0x4d0 [dm_mod]
      [ 1959.255307]  [<ffffffff813304e2>] ? memzero_explicit+0x12/0x20
      [ 1959.261816]  [<ffffffffa000a0c3>] dm_ctl_ioctl+0x13/0x20 [dm_mod]
      [ 1959.268615]  [<ffffffff81215eb6>] do_vfs_ioctl+0xa6/0x5c0
      [ 1959.274637]  [<ffffffff81120d2f>] ? __audit_syscall_entry+0xaf/0x100
      [ 1959.281726]  [<ffffffff81003176>] ? do_audit_syscall_entry+0x66/0x70
      [ 1959.288814]  [<ffffffff81216449>] SyS_ioctl+0x79/0x90
      [ 1959.294450]  [<ffffffff8167e4ae>] entry_SYSCALL_64_fastpath+0x12/0x71
      ...
      [ 1959.323277] RIP  [<ffffffffa040efba>] dm_exception_store_set_chunk_size+0x7a/0x110 [dm_snapshot]
      [ 1959.333090]  RSP <ffff88032a957b30>
      [ 1959.336978] CR2: 0000000000000098
      [ 1959.344121] ---[ end trace b049991ccad1169e ]---
      
      Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1195899
      Cc: stable@vger.kernel.org
      Signed-off-by: NDing Xiang <dingxiang@huawei.com>
      Signed-off-by: NMike Snitzer <snitzer@redhat.com>
      4df2bf46
  28. 23 2月, 2016 2 次提交
  29. 01 11月, 2015 1 次提交
    • C
      dm: refactor ioctl handling · e56f81e0
      Christoph Hellwig 提交于
      This moves the call to blkdev_ioctl and the argument checking to DM core
      code, and only leaves a callout to find the block device to operate on
      in the targets.  This simplifies the code and allows us to pass through
      ioctl-like command using other methods in the next patch.
      
      Also split out a helper around calling the prepare_ioctl method that
      will be reused for persistent reservation handling.
      Signed-off-by: NChristoph Hellwig <hch@lst.de>
      Signed-off-by: NMike Snitzer <snitzer@redhat.com>
      e56f81e0
  30. 14 8月, 2015 1 次提交
    • K
      block: kill merge_bvec_fn() completely · 8ae12666
      Kent Overstreet 提交于
      As generic_make_request() is now able to handle arbitrarily sized bios,
      it's no longer necessary for each individual block driver to define its
      own ->merge_bvec_fn() callback. Remove every invocation completely.
      
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Lars Ellenberg <drbd-dev@lists.linbit.com>
      Cc: drbd-user@lists.linbit.com
      Cc: Jiri Kosina <jkosina@suse.cz>
      Cc: Yehuda Sadeh <yehuda@inktank.com>
      Cc: Sage Weil <sage@inktank.com>
      Cc: Alex Elder <elder@kernel.org>
      Cc: ceph-devel@vger.kernel.org
      Cc: Alasdair Kergon <agk@redhat.com>
      Cc: Mike Snitzer <snitzer@redhat.com>
      Cc: dm-devel@redhat.com
      Cc: Neil Brown <neilb@suse.de>
      Cc: linux-raid@vger.kernel.org
      Cc: Christoph Hellwig <hch@infradead.org>
      Cc: "Martin K. Petersen" <martin.petersen@oracle.com>
      Acked-by: NeilBrown <neilb@suse.de> (for the 'md' bits)
      Acked-by: NMike Snitzer <snitzer@redhat.com>
      Signed-off-by: NKent Overstreet <kent.overstreet@gmail.com>
      [dpark: also remove ->merge_bvec_fn() in dm-thin as well as
       dm-era-target, and resolve merge conflicts]
      Signed-off-by: NDongsu Park <dpark@posteo.net>
      Signed-off-by: NMing Lin <ming.l@ssi.samsung.com>
      Signed-off-by: NJens Axboe <axboe@fb.com>
      8ae12666
  31. 01 4月, 2015 1 次提交