1. 11 November 2017, 15 commits
    • dm raid: fix panic when attempting to force a raid to sync · 23397844
      Heinz Mauelshagen authored
      Requesting a sync on an active raid device via a table reload
      (see 'sync' parameter in Documentation/device-mapper/dm-raid.txt)
      skips the super_load() call that defines the superblock size
      (rdev->sb_size) -- resulting in an oops if/when super_sync()->memset()
      is called.
      
      Fix by moving the initialization of the superblock start and size
      out of super_load() to the caller (analyse_superblocks).
      Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com>
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
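
      The shape of the fix, sketched below: analyse_superblocks() establishes the
      superblock offset and size for each metadata device itself, so a 'sync'
      table reload that never reaches super_load() still leaves rdev->sb_size
      valid before super_sync() memsets the buffer. This is a simplified
      illustration, not the upstream diff; the loop body and the exact size
      calculation in dm-raid.c differ.

        /* Sketch: superblock geometry is set by the caller, not by super_load(). */
        static int analyse_superblocks(struct dm_target *ti, struct raid_set *rs)
        {
                struct md_rdev *rdev;

                rdev_for_each(rdev, &rs->md) {
                        if (!rdev->meta_bdev)
                                continue;

                        /* Previously initialized inside super_load(); now always runs. */
                        rdev->sb_start = 0;
                        rdev->sb_size  = bdev_logical_block_size(rdev->meta_bdev);

                        /* ... existing super_load()/validation logic follows ... */
                }
                return 0;
        }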
    • dm integrity: allow unaligned bv_offset · 95b1369a
      Mikulas Patocka authored
      When slub_debug is enabled kmalloc returns unaligned memory. XFS uses
      this unaligned memory for its buffers (if an unaligned buffer crosses a
      page, XFS frees it and allocates a full page instead - see the function
      xfs_buf_allocate_memory).
      
      dm-integrity checks if bv_offset is aligned on page size and this check
      fails with slub_debug and XFS.
      
      Fix this bug by removing the bv_offset check, leaving only the check for
      bv_len.
      
      Fixes: 7eada909 ("dm: add integrity target")
      Cc: stable@vger.kernel.org # v4.12+
      Reported-by: Bruno Prémont <bonbons@sysophe.eu>
      Reviewed-by: Milan Broz <gmazyland@gmail.com>
      Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
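
      In other words, the per-segment sanity check keeps only its length half.
      A before/after sketch, purely illustrative: in the driver the alignment
      unit and the error path are dm-integrity specific.

        struct bio_vec bv = bio_iovec(bio);

        /* Before: also rejected buffers merely offset within a page, which
         * slub_debug + XFS legitimately produce. */
        if (unlikely((bv.bv_offset | bv.bv_len) & (PAGE_SIZE - 1)))
                return DM_MAPIO_KILL;

        /* After: only the length must be aligned. */
        if (unlikely(bv.bv_len & (PAGE_SIZE - 1)))
                return DM_MAPIO_KILL;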
    • dm crypt: allow unaligned bv_offset · 0440d5c0
      Mikulas Patocka authored
      When slub_debug is enabled kmalloc returns unaligned memory. XFS uses
      this unaligned memory for its buffers (if an unaligned buffer crosses a
      page, XFS frees it and allocates a full page instead - see the function
      xfs_buf_allocate_memory).
      
      dm-crypt checks if bv_offset is aligned on page size and these checks
      fail with slub_debug and XFS.
      
      Fix this bug by removing the bv_offset checks. Switch to checking if
      bv_len is aligned instead of bv_offset (this check should be sufficient
      to prevent overruns if a bio with too small bv_len is received).
      
      Fixes: 8f0009a2 ("dm crypt: optionally support larger encryption sector size")
      Cc: stable@vger.kernel.org # v4.12+
      Reported-by: Bruno Prémont <bonbons@sysophe.eu>
      Tested-by: Bruno Prémont <bonbons@sysophe.eu>
      Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
      Reviewed-by: Milan Broz <gmazyland@gmail.com>
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
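
      The dm-crypt variant is the same idea with the configured encryption
      sector size as the alignment unit; a short bv_len is what could actually
      overrun, so only that is still checked. Illustrative sketch (the variable
      names are assumptions, not the driver's exact code):

        unsigned int sector_size = cc->sector_size;   /* e.g. 4096 for 4K sectors */

        /* bv_offset may now be unaligned (slub_debug); only bv_len matters. */
        if (unlikely(bv_in.bv_len & (sector_size - 1)))
                return -EIO;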
    • dm: small cleanup in dm_get_md() · 49de5769
      Mike Snitzer authored
      Makes dm_get_md() and dm_get_from_kobject() have similar code.
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
    • dm: fix race between dm_get_from_kobject() and __dm_destroy() · b9a41d21
      Hou Tao authored
      The following BUG_ON was hit when testing repeat creation and removal of
      DM devices:
      
          kernel BUG at drivers/md/dm.c:2919!
          CPU: 7 PID: 750 Comm: systemd-udevd Not tainted 4.1.44
          Call Trace:
           [<ffffffff81649e8b>] dm_get_from_kobject+0x34/0x3a
           [<ffffffff81650ef1>] dm_attr_show+0x2b/0x5e
           [<ffffffff817b46d1>] ? mutex_lock+0x26/0x44
           [<ffffffff811df7f5>] sysfs_kf_seq_show+0x83/0xcf
           [<ffffffff811de257>] kernfs_seq_show+0x23/0x25
           [<ffffffff81199118>] seq_read+0x16f/0x325
           [<ffffffff811de994>] kernfs_fop_read+0x3a/0x13f
           [<ffffffff8117b625>] __vfs_read+0x26/0x9d
           [<ffffffff8130eb59>] ? security_file_permission+0x3c/0x44
           [<ffffffff8117bdb8>] ? rw_verify_area+0x83/0xd9
           [<ffffffff8117be9d>] vfs_read+0x8f/0xcf
           [<ffffffff81193e34>] ? __fdget_pos+0x12/0x41
           [<ffffffff8117c686>] SyS_read+0x4b/0x76
           [<ffffffff817b606e>] system_call_fastpath+0x12/0x71
      
      The bug can be easily triggered if an extra delay (e.g. 10ms) is added
      between the test of DMF_FREEING & DMF_DELETING and dm_get() in
      dm_get_from_kobject().
      
      To fix it, we need to ensure the test of DMF_FREEING & DMF_DELETING and
      dm_get() are done in an atomic way, so _minor_lock is used.
      
      The other callers of dm_get() have also been checked to be OK: some
      callers invoke dm_get() under _minor_lock, some callers invoke it under
      _hash_lock, and dm_start_request() invokes it after increasing
      md->open_count.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: Hou Tao <houtao1@huawei.com>
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
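
      The fixed function makes the flag test and the reference bump a single
      critical section, mirroring what __dm_destroy() holds when it sets the
      flags. A simplified sketch of the resulting dm_get_from_kobject():

        struct mapped_device *dm_get_from_kobject(struct kobject *kobj)
        {
                struct mapped_device *md;

                md = container_of(kobj, struct mapped_device, kobj_holder.kobj);

                spin_lock(&_minor_lock);
                if (test_bit(DMF_FREEING, &md->flags) ||
                    test_bit(DMF_DELETING, &md->flags)) {
                        md = NULL;
                        goto out;
                }
                dm_get(md);     /* safe: the flags cannot change under _minor_lock */
        out:
                spin_unlock(&_minor_lock);
                return md;
        }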
    • dm: allocate struct mapped_device with kvzalloc · 856eb091
      Mikulas Patocka authored
      The structure srcu_struct can be very big; its size is proportional to
      CONFIG_NR_CPUS. The Fedora kernel sets CONFIG_NR_CPUS to 8192, so the
      io_barrier field in struct mapped_device occupies 84kB in the debugging
      kernel and 50kB in the non-debugging kernel. The large size may cause
      kzalloc_node() to fail.
      
      To avoid the allocation failure, use kvzalloc_node(), which falls back
      to vmalloc if a large contiguous chunk of memory is not available. This
      patch also moves the field
      io_barrier to the last position of struct mapped_device - the reason is
      that on many processor architectures, short memory offsets result in
      smaller code than long memory offsets - on x86-64 it reduces code size by
      320 bytes.
      
      Note to stable kernel maintainers - the kernels 4.11 and older don't have
      the function kvzalloc_node, you can use the function vzalloc_node instead.
      Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
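
      The allocation pattern itself is small: kvzalloc_node() tries kmalloc
      first and transparently falls back to vmalloc when a large physically
      contiguous chunk is unavailable, and the matching free must be kvfree().
      A minimal sketch; the surrounding alloc_dev()/free_dev() code, including
      the numa_node_id variable, is elided:

        struct mapped_device *md;

        /* sizeof(*md) can reach tens of kilobytes when CONFIG_NR_CPUS is huge,
         * because the embedded srcu_struct scales with it. */
        md = kvzalloc_node(sizeof(*md), GFP_KERNEL, numa_node_id);
        if (!md)
                return NULL;

        /* ... on the teardown path ... */
        kvfree(md);     /* correct for both kmalloc- and vmalloc-backed memory */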
    • dm zoned: ignore last smaller runt zone · 114e0259
      Damien Le Moal authored
      The SCSI layer allows ZBC drives to have a smaller last runt zone. For
      such a device, specifying the entire capacity for a dm-zoned target
      table entry fails because the specified capacity is not aligned on a
      device zone size indicated in the request queue structure of the
      device.
      
      Fix this problem by ignoring the last runt zone in the entry length
      when setting up the dm-zoned target (ctr method) and when iterating table
      entries of the target (iterate_devices method). This allows dm-zoned
      users to still easily setup a target using the entire device capacity
      (as mandated by dm-zoned) or the aligned capacity excluding the last
      runt zone.
      
      While at it, replace direct references to the device queue chunk_sectors
      limit with calls to the accessor blk_queue_zone_sectors().
      Reported-by: Peter Desnoyers <pjd@ccs.neu.edu>
      Cc: stable@vger.kernel.org
      Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
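
      Conceptually, both the ctr and iterate_devices paths now accept either the
      full capacity or the capacity rounded down to a whole number of zones. A
      rough sketch of the rounding (zone sizes are a power of two here, so a
      mask works; the names are illustrative, not dm-zoned's exact code):

        sector_t zone_sectors = blk_queue_zone_sectors(bdev_get_queue(dev->bdev));
        sector_t aligned_cap  = dev->capacity & ~(zone_sectors - 1); /* drop runt */

        if (ti->len != dev->capacity && ti->len != aligned_cap) {
                ti->error = "Invalid device size";
                return -EINVAL;
        }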
    • dm space map metadata: use ARRAY_SIZE · fbc61291
      Jérémy Lefaure authored
      Using the ARRAY_SIZE macro improves the readability of the code.
      
      Found with Coccinelle with the following semantic patch:
      @r depends on (org || report)@
      type T;
      T[] E;
      position p;
      @@
      (
       (sizeof(E)@p /sizeof(*E))
      |
       (sizeof(E)@p /sizeof(E[...]))
      |
       (sizeof(E)@p /sizeof(T))
      )
      Signed-off-by: Jérémy Lefaure <jeremy.lefaure@lse.epita.fr>
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
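
      For reference, ARRAY_SIZE (include/linux/kernel.h) is the usual sizeof
      ratio plus a build-time check that its argument really is an array. The
      transformation the semantic patch performs looks like this (the array and
      helper names below are made up for illustration):

        static const unsigned int chunk_sizes[] = { 8, 16, 32, 64 };

        /* before */
        for (i = 0; i < sizeof(chunk_sizes) / sizeof(*chunk_sizes); i++)
                use_size(chunk_sizes[i]);

        /* after */
        for (i = 0; i < ARRAY_SIZE(chunk_sizes); i++)
                use_size(chunk_sizes[i]);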
    • dm log writes: add support for DAX · 98d82f48
      Ross Zwisler authored
      Now that we have the ability to log filesystem writes using a flat buffer, add
      support for DAX.
      
      The motivation for this support is the need for an xfstest that can test
      the new MAP_SYNC DAX flag.  By logging the filesystem activity with
      dm-log-writes we can show that the MAP_SYNC page faults are writing out
      their metadata as they happen, instead of requiring an explicit
      msync/fsync.
      
      Unfortunately we can't easily track data that has been written via
      mmap() now that the dax_flush() abstraction was removed by commit
      c3ca015f ("dax: remove the pmem_dax_ops->flush abstraction").
      Otherwise we could just treat each flush as a big write, and store the
      data that is being synced to media.  It may be worthwhile to add the
      dax_flush() entry point back, just as a notifier so we can do this
      logging.
      Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
    • dm log writes: add support for inline data buffers · e5a20660
      Ross Zwisler authored
      Currently dm-log-writes supports writing filesystem data via BIOs, and
      writing internal metadata from a flat buffer via write_metadata().
      
      For DAX writes, though, we won't have a BIO, but will instead have an
      iterator that we'll want to use to fill a flat data buffer.
      
      So, create write_inline_data() which allows us to write filesystem data
      using a flat buffer as a source, and wire it up in log_one_block().
      Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
    • dm cache: simplify get_per_bio_data() by removing data_size argument · 693b960e
      Mike Snitzer authored
      There is only one per_bio_data size now that writethrough-specific data
      was removed from the per_bio_data structure.
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
    • dm cache: remove all obsolete writethrough-specific code · 9958f1d9
      Mike Snitzer authored
      Now that the writethrough code is much simpler there is no need to track
      so much state or cascade bio submission (as was done, via
      writethrough_endio(), to issue origin then cache IO in series).
      
      As such, the obsolete writethrough list and workqueue are also removed.
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
    • dm cache: submit writethrough writes in parallel to origin and cache · 2df3bae9
      Mike Snitzer authored
      Discontinue issuing writethrough write IO in series to the origin and
      then cache.
      
      Use bio_clone_fast() to create a new origin clone bio that will be
      mapped to the origin device and then bio_chain() it to the bio that gets
      remapped to the cache device.  The origin clone bio does _not_ have a
      copy of the per_bio_data -- as such check_if_tick_bio_needed() will not
      be called.
      
      The cache bio (parent bio) will not complete until the origin bio has
      completed -- this fulfills bio_clone_fast()'s requirements as well as
      the requirement to not complete the original IO until the write IO has
      completed to both the origin and cache device.
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
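
      The chaining described above looks roughly like the sketch below. It is
      simplified: the bio_set owned by the cache, the remap helpers, and the
      per-bio-data handling are assumptions here, and sector remapping details
      are omitted.

        static void remap_to_origin_and_cache(struct cache *cache, struct bio *bio,
                                              dm_cblock_t cblock)
        {
                /* Clone the write for the origin; the clone carries no per_bio_data. */
                struct bio *origin_bio = bio_clone_fast(bio, GFP_NOIO, cache->bs);

                /* Parent 'bio' will not complete until origin_bio has completed. */
                bio_chain(origin_bio, bio);

                remap_to_origin(cache, origin_bio);     /* assumed remap helper */
                submit_bio(origin_bio);

                remap_to_cache(cache, bio, cblock);     /* caller then issues 'bio' */
        }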
    • dm cache: pass cache structure to mode functions · 8e3c3827
      Mike Snitzer authored
      No functional changes; just a bit cleaner than passing the cache_features
      structure.
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
    • dm cache: fix race condition in the writeback mode overwrite_bio optimisation · d1260e2a
      Joe Thornber authored
      When a DM cache in writeback mode moves data between the slow and fast
      device it can often avoid a copy if either:
      
      i) the triggering bio covers the whole block (no point copying if we're
         about to overwrite it), or
      ii) the migration is a promotion and the origin block is currently discarded
      
      Prior to this fix there was a race with case (ii).  The discard status
      was checked with a shared lock held (rather than exclusive).  This meant
      another bio could run in parallel and write data to the origin, removing
      the discard state.  After the promotion the parallel write would have
      been lost.
      
      With this fix the discard status is re-checked once the exclusive lock
      has been acquired. If the block is no longer discarded it falls back to
      the slower full copy path.
      
      Fixes: b29d4986 ("dm cache: significant rework to leverage dm-bio-prison-v2")
      Cc: stable@vger.kernel.org # v4.12+
      Signed-off-by: Joe Thornber <ejt@redhat.com>
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
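
      The fix follows the familiar optimistic-check and re-check pattern: the
      cheap discard test done under the shared lock is only a hint, and it is
      repeated once the exclusive bio-prison lock is held. Schematic sketch
      (helper names other than is_discarded_oblock() are illustrative):

        /* Hint gathered under the shared lock. */
        bool fast_path = is_discarded_oblock(cache, oblock);

        /* ... after the exclusive cell lock has been acquired ... */
        if (fast_path && !is_discarded_oblock(cache, oblock)) {
                /* Lost the race: a write hit the origin block meanwhile,
                 * so fall back to the full copy instead of overwriting. */
                return issue_full_copy(mg);
        }
        return issue_overwrite(mg, bio);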
  2. 25 October 2017, 3 commits
  3. 22 October 2017, 7 commits
  4. 21 October 2017, 10 commits
  5. 20 October 2017, 5 commits