1. 19 Jun, 2017 2 commits
    • dm ioctl: add a new DM_DEV_ARM_POLL ioctl · fc1841e1
      Mikulas Patocka committed
      This ioctl records the current global event number in the dm_file
      structure, so that the next select or poll call waits until new events
      have arrived since this ioctl.
      
      The DM_DEV_ARM_POLL ioctl has the same effect as closing and reopening
      the handle.
      
      Using the DM_DEV_ARM_POLL ioctl is optional - if the userspace is OK
      with closing and reopening the /dev/mapper/control handle after select
      or poll, there is no need to re-arm via ioctl.
      
      Usage:
      1. open the /dev/mapper/control device
      2. send the DM_DEV_ARM_POLL ioctl
      3. scan the event numbers of all devices we are interested in and process
         them
      4. call select, poll or epoll on the handle (it waits until some new event
         happens since the DM_DEV_ARM_POLL ioctl)
      5. go to step 2
      Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
      Signed-off-by: Andy Grover <agrover@redhat.com>
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
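The usage steps of the DM_DEV_ARM_POLL commit can be sketched in userspace C roughly as follows. This is a minimal sketch, not the author's code: the helper names `dm_arm_poll`, `dm_wait_event` and `dm_event_loop` are hypothetical, the `scan` callback stands in for step 3, and the version fields are simply taken from the installed `<linux/dm-ioctl.h>`, which assumes the headers are not newer than the running kernel.

```c
#include <fcntl.h>
#include <poll.h>
#include <string.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/dm-ioctl.h>

/* Step 2: re-arm polling on an open control handle.
 * Returns 0 on success, -1 on error (errno set by ioctl). */
static int dm_arm_poll(int fd)
{
    struct dm_ioctl io;

    memset(&io, 0, sizeof(io));
    /* Assumption: header version is acceptable to the running kernel. */
    io.version[0] = DM_VERSION_MAJOR;
    io.version[1] = DM_VERSION_MINOR;
    io.version[2] = DM_VERSION_PATCHLEVEL;
    io.data_size = sizeof(io);   /* fixed-size part only, no payload */
    io.data_start = sizeof(io);
    return ioctl(fd, DM_DEV_ARM_POLL, &io);
}

/* Step 4: wait until the handle becomes readable.
 * Returns 1 if an event is pending, 0 on timeout, -1 on error. */
static int dm_wait_event(int fd, int timeout_ms)
{
    struct pollfd pfd = { .fd = fd, .events = POLLIN };
    int n = poll(&pfd, 1, timeout_ms);

    if (n < 0)
        return -1;
    return (n > 0 && (pfd.revents & POLLIN)) ? 1 : 0;
}

/* Steps 1-5 for a bounded number of rounds (hypothetical driver). */
static int dm_event_loop(const char *control_path, void (*scan)(void), int rounds)
{
    int fd = open(control_path, O_RDWR);      /* step 1 */

    if (fd < 0)
        return -1;
    for (int i = 0; i < rounds; i++) {
        if (dm_arm_poll(fd) < 0) {            /* step 2 */
            close(fd);
            return -1;
        }
        if (scan)
            scan();                           /* step 3 */
        if (dm_wait_event(fd, -1) < 0) {      /* step 4 */
            close(fd);
            return -1;
        }
    }                                         /* step 5: loop */
    close(fd);
    return 0;
}
```

A caller would pass "/dev/mapper/control" as `control_path`. Arming before the scan means an event that arrives during the scan still wakes the following poll.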
    • dm: add basic support for using the select or poll function · 93e6442c
      Mikulas Patocka committed
      Add the ability to poll on the /dev/mapper/control device.  The select
      or poll function waits until any event happens on any dm device since
      opening the /dev/mapper/control device.  When select or poll returns the
      device as readable, we must close and reopen the device to wait for new
      dm events.
      
      Usage:
      1. open the /dev/mapper/control device
      2. scan the event numbers of all devices we are interested in and process
         them
      3. call select, poll or epoll on the handle (it waits until some new event
         happens since opening the device)
      4. close the /dev/mapper/control handle
      5. go to step 1
      
      The next commit allows re-arming the poll without closing and
      reopening the device.
      Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
      Signed-off-by: Andy Grover <agrover@redhat.com>
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
  2. 23 May, 2017 1 commit
  3. 09 May, 2017 1 commit
  4. 28 Apr, 2017 2 commits
  5. 25 Apr, 2017 2 commits
  6. 02 Mar, 2017 1 commit
  7. 25 Dec, 2016 1 commit
  8. 09 Dec, 2016 1 commit
  9. 21 Jul, 2016 1 commit
  10. 01 Jul, 2016 1 commit
  11. 11 Jun, 2016 1 commit
  12. 06 May, 2016 1 commit
  13. 23 Feb, 2016 2 commits
  14. 06 Aug, 2015 1 commit
  15. 30 Apr, 2015 1 commit
    • dm: only initialize the request_queue once · 3e6180f0
      Christoph Hellwig committed
      Commit bfebd1cd ("dm: add full blk-mq support to request-based DM")
      didn't properly account for the need to short-circuit re-initializing
      DM's blk-mq request_queue if it was already initialized.
      
      Otherwise, reloading a blk-mq request-based DM table (either manually
      or via multipathd) resulted in errors, see:
       https://www.redhat.com/archives/dm-devel/2015-April/msg00132.html
      
      Fix is to only initialize the request_queue on the initial table load
      (when the mapped_device type is assigned).
      
      This is better than having dm_init_request_based_blk_mq_queue() return
      early if the queue was already initialized because it elevates the
      constraint to a more meaningful location in DM core.  As such the
      pre-existing early return in dm_init_request_based_queue() can now be
      removed.
      
      Fixes: bfebd1cd ("dm: add full blk-mq support to request-based DM")
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
  16. 10 Feb, 2015 1 commit
  17. 20 Nov, 2014 1 commit
    • dm: enhance internal suspend and resume interface · ffcc3936
      Mike Snitzer committed
      Rename dm_internal_{suspend,resume} to dm_internal_{suspend,resume}_fast
      -- dm-stats will continue using these methods to avoid all the extra
      suspend/resume logic that is not needed in order to quickly flush IO.
      
      Introduce dm_internal_suspend_noflush() variant that actually calls the
      mapped_device's target callbacks -- otherwise target-specific hooks are
      avoided (e.g. dm-thin's thin_presuspend and thin_postsuspend).  Common
      code between dm_internal_{suspend_noflush,resume} and
      dm_{suspend,resume} was factored out as __dm_{suspend,resume}.
      
      Update dm_internal_{suspend_noflush,resume} to always take and release
      the mapped_device's suspend_lock.  Also update dm_{suspend,resume} to be
      aware of potential for DM_INTERNAL_SUSPEND_FLAG to be set and respond
      accordingly by interruptibly waiting for the DM_INTERNAL_SUSPEND_FLAG to
      be cleared.  Add lockdep annotation to dm_suspend() and dm_resume().
      
      The existing DM_SUSPEND_FLAG remains unchanged.
      DM_INTERNAL_SUSPEND_FLAG is set by dm_internal_suspend_noflush() and
      cleared by dm_internal_resume().
      
      Both DM_SUSPEND_FLAG and DM_INTERNAL_SUSPEND_FLAG may be set if a device
      was already suspended when dm_internal_suspend_noflush() was called --
      this can be thought of as a "nested suspend".  A "nested suspend" can
      occur with legacy userspace dm-thin code that might suspend all active
      thin volumes before suspending the pool for resize.
      
      But otherwise, in the normal dm-thin-pool suspend case moving forward:
      the thin-pool will have DM_SUSPEND_FLAG set and all active thins from
      that thin-pool will have DM_INTERNAL_SUSPEND_FLAG set.
      
      Also add DM_INTERNAL_SUSPEND_FLAG to status report.  This new
      DM_INTERNAL_SUSPEND_FLAG state is being reported to assist with
      debugging (e.g. 'dmsetup info' will report an internally suspended
      device accordingly).
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
      Acked-by: Joe Thornber <ejt@redhat.com>
  18. 06 Oct, 2014 1 commit
    • dm: allow active and inactive tables to share dm_devs · 86f1152b
      Benjamin Marzinski committed
      Until this change, when loading a new DM table, DM core would re-open
      all of the devices in the DM table.  Now, DM core will avoid redundant
      device opens (and closes when destroying the old table) if the old
      table already has a device open using the same mode.  This is achieved
      by managing reference counts on the table_devices that DM core now
      stores in the mapped_device structure (rather than in the dm_table
      structure).  So a mapped_device's active and inactive dm_tables' dm_dev
      lists now just point to the dm_devs stored in the mapped_device's
      table_devices list.
      
      This improvement in DM core's device reference counting has the
      side-effect of fixing a long-standing limitation of the multipath
      target: a DM multipath table couldn't include any paths that were unusable
      (failed).  For example: if all paths have failed and you add a new,
      working, path to the table; you can't use it since the table load would
      fail due to it still containing failed paths.  Now a re-load of a
      multipath table can include failed devices and when those devices become
      active again they can be used instantly.
      
      The device list code in dm.c isn't a straight copy/paste from the code in
      dm-table.c, but it's very close (aside from some variable renames).  One
      subtle difference is that find_table_device for the table_devices list
      will only match devices with the same name and mode.  This is because we
      don't want to upgrade a device's mode in the active table when an
      inactive table is loaded.
      
      Access to the mapped_device structure's table_devices list requires a
      mutex (table_devices_lock), so that tables cannot be created and
      destroyed concurrently.
      Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
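The reference counting described in the dm_devs sharing commit can be illustrated with a small userspace model. This is a sketch under stated assumptions, not the kernel code: the struct layout and the `get_table_device`/`put_table_device` helpers are simplified and hypothetical, devices are identified by a name string, the actual device open/close is reduced to a comment, and the mutex is omitted. Only the rule from the commit message is modelled: a lookup matches on both name and mode, so a different mode gets its own entry instead of upgrading an existing one.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

struct table_device {
    struct table_device *next;
    char name[32];
    unsigned mode;
    int count;              /* number of tables referencing this device */
};

/* Match only when both name and mode are identical. */
static struct table_device *find_table_device(struct table_device *head,
                                              const char *name, unsigned mode)
{
    for (; head; head = head->next)
        if (head->mode == mode && strcmp(head->name, name) == 0)
            return head;
    return NULL;
}

/* Take a reference, creating the entry on first use. */
static struct table_device *get_table_device(struct table_device **head,
                                             const char *name, unsigned mode)
{
    struct table_device *td = find_table_device(*head, name, mode);

    if (!td) {
        td = calloc(1, sizeof(*td));
        if (!td)
            return NULL;
        snprintf(td->name, sizeof(td->name), "%s", name);
        td->mode = mode;
        td->next = *head;
        *head = td;
        /* The real code would open the underlying device here. */
    }
    td->count++;
    return td;
}

/* Drop a reference; unlink and free the entry when the last one goes. */
static void put_table_device(struct table_device **head, struct table_device *td)
{
    if (--td->count > 0)
        return;
    for (struct table_device **p = head; *p; p = &(*p)->next) {
        if (*p == td) {
            *p = td->next;
            break;
        }
    }
    /* The real code would close the underlying device here. */
    free(td);
}
```

Loading an inactive table that names the same device with the same mode only bumps `count`, which is why the redundant open (and the failure it could cause for a failed path) is avoided.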
  19. 10 Nov, 2013 1 commit
    • dm: allow remove to be deferred · 2c140a24
      Mikulas Patocka committed
      This patch allows the removal of an open device to be deferred until
      it is closed.  (Previously such a removal attempt would fail.)
      
      The deferred remove functionality is enabled by setting the flag
      DM_DEFERRED_REMOVE in the ioctl structure on DM_DEV_REMOVE or
      DM_REMOVE_ALL ioctl.
      
      On return from DM_DEV_REMOVE, the flag DM_DEFERRED_REMOVE indicates if
      the device was removed immediately or flagged to be removed on close -
      if the flag is clear, the device was removed.
      
      On return from DM_DEV_STATUS and other ioctls, the flag
      DM_DEFERRED_REMOVE is set if the device is scheduled to be removed on
      closure.
      
      A device that is scheduled to be deleted can be revived using the
      message "@cancel_deferred_remove". This message clears the
      DMF_DEFERRED_REMOVE flag so that the device won't be deleted on close.
      Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
      Signed-off-by: Alasdair G Kergon <agk@redhat.com>
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
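The deferred-remove semantics above can be modelled as a tiny state machine. This is only an illustrative sketch: `struct dev` and the `dev_*` functions are hypothetical stand-ins, not kernel or libdevmapper APIs, and the boolean `allow_deferred` plays the role of the DM_DEFERRED_REMOVE flag passed in by userspace.

```c
#include <errno.h>
#include <stdbool.h>

struct dev {
    int open_count;
    bool deferred_remove;   /* set when removal is scheduled for last close */
    bool removed;
};

/* Remove request. On return, *was_deferred mirrors the flag reported back
 * to userspace: false means the device was removed immediately. */
static int dev_remove(struct dev *d, bool allow_deferred, bool *was_deferred)
{
    *was_deferred = false;
    if (d->open_count > 0) {
        if (!allow_deferred)
            return -EBUSY;          /* previous behaviour: removal fails */
        d->deferred_remove = true;
        *was_deferred = true;
        return 0;
    }
    d->removed = true;
    return 0;
}

/* Last close carries out a scheduled removal. */
static void dev_close(struct dev *d)
{
    if (d->open_count > 0 && --d->open_count == 0 && d->deferred_remove)
        d->removed = true;
}

/* Models the "@cancel_deferred_remove" message. */
static void dev_cancel_deferred_remove(struct dev *d)
{
    d->deferred_remove = false;
}
```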
  20. 06 Sep, 2013 4 commits
  21. 11 Jul, 2013 3 commits
  22. 02 Mar, 2013 4 commits
    • dm ioctl: allow message to return data · a2606241
      Mikulas Patocka committed
      This patch introduces enhanced message support that allows the
      device-mapper core to recognise messages that are common to all devices,
      and for messages to return data to userspace.
      
      Core messages are processed by the function "message_for_md".  If the
      device mapper doesn't support the message, it is passed to the target
      driver.
      
      If the message returns data, the kernel sets the flag
      DM_MESSAGE_OUT_FLAG.
      Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
      Signed-off-by: Alasdair G Kergon <agk@redhat.com>
    • dm ioctl: optimize functions without variable params · 02cde50b
      Mikulas Patocka committed
      Device-mapper ioctls receive and send data in a buffer supplied
      by userspace.  The buffer has two parts.  The first part contains
      a 'struct dm_ioctl' and has a fixed size.  The second part depends
      on the ioctl and has a variable size.
      
      This patch recognises the specific ioctls that do not use the variable
      part of the buffer and skips allocating memory for it.
      
      In particular, when a device is suspended and a resume ioctl is sent,
      this now avoids memory allocation completely.
      
      The variable "struct dm_ioctl tmp" is moved from the function
      copy_params to its caller ctl_ioctl and renamed to param_kernel.
      It is used directly when the ioctl function doesn't need any arguments.
      Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
      Signed-off-by: Alasdair G Kergon <agk@redhat.com>
    • dm ioctl: introduce ioctl_flags · e2914cc2
      Mikulas Patocka committed
      This patch introduces flags for each ioctl function.
      
      So far, one flag is defined, IOCTL_FLAGS_NO_PARAMS.  It is set if the
      function processing the ioctl doesn't take or produce any parameters in
      the section of the data buffer that has a variable size.
      Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
      Signed-off-by: Alasdair G Kergon <agk@redhat.com>
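The two-part buffer and the IOCTL_FLAGS_NO_PARAMS shortcut from the two commits above can be sketched in userspace C as follows. This is a simplified model under assumptions: `struct hdr` stands in for the fixed-size `struct dm_ioctl`, the command enum and flag table are hypothetical, and `memcpy`/`malloc` stand in for copying from userspace and kernel allocation.

```c
#include <stdlib.h>
#include <string.h>

#define IOCTL_FLAGS_NO_PARAMS 1u

/* Stand-in for the fixed-size first part of the buffer. */
struct hdr {
    unsigned data_size;     /* total size: fixed part + variable part */
    unsigned flags;
};

/* Hypothetical command set with per-command flags. */
enum { CMD_RESUME, CMD_TABLE_LOAD, CMD_COUNT };

static const unsigned ioctl_flags[CMD_COUNT] = {
    [CMD_RESUME]     = IOCTL_FLAGS_NO_PARAMS,
    [CMD_TABLE_LOAD] = 0,
};

/* Copy the parameters. For commands that never touch the variable part,
 * reuse the caller-provided param_kernel and allocate nothing.
 * *allocated tells the caller whether the result must be freed. */
static struct hdr *copy_params(unsigned cmd, const void *user,
                               struct hdr *param_kernel, int *allocated)
{
    *allocated = 0;
    if (cmd >= CMD_COUNT)
        return NULL;

    memcpy(param_kernel, user, sizeof(*param_kernel));
    if (param_kernel->data_size < sizeof(*param_kernel))
        return NULL;

    if (ioctl_flags[cmd] & IOCTL_FLAGS_NO_PARAMS) {
        param_kernel->data_size = sizeof(*param_kernel);
        return param_kernel;            /* no allocation at all */
    }

    struct hdr *full = malloc(param_kernel->data_size);
    if (!full)
        return NULL;
    memcpy(full, user, param_kernel->data_size);
    *allocated = 1;
    return full;
}
```

Keeping the fixed-size header in the caller's frame is what lets a resume of a suspended device proceed without any allocation.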
    • dm: fix truncated status strings · fd7c092e
      Mikulas Patocka committed
      Avoid returning a truncated table or status string instead of setting
      the DM_BUFFER_FULL_FLAG when the last target of a table fills the
      buffer.
      
      When processing a table or status request, the function retrieve_status
      calls ti->type->status. If ti->type->status returns non-zero,
      retrieve_status assumes that the buffer overflowed and sets
      DM_BUFFER_FULL_FLAG.
      
      However, targets don't return non-zero values from their status method
      on overflow. Most targets always return zero.
      
      If a buffer overflow happens in a target that is not the last in the
      table, it gets noticed during the next iteration of the loop in
      retrieve_status; but if a buffer overflow happens in the last target, it
      goes unnoticed and erroneously truncated data is returned.
      
      In the current code, the targets behave in the following way:
      * dm-crypt returns -ENOMEM if there is not enough space to store the
        key, but it returns 0 on all other overflows.
      * dm-thin returns errors from the status method if a disk error happened.
        This is incorrect because retrieve_status doesn't check the error
        code, it assumes that all non-zero values mean buffer overflow.
      * all the other targets always return 0.
      
      This patch changes the ti->type->status function to return void (because
      most targets don't use the return code). Overflow is detected in
      retrieve_status: if the status method fills up the remaining space
      completely, it is assumed that buffer overflow happened.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
      Signed-off-by: Alasdair G Kergon <agk@redhat.com>
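The overflow rule from the truncated-status fix (a status method that fills the remaining space completely is treated as an overflow) can be sketched like this. It is a userspace model with hypothetical names: `status_fn`, the two example status functions and `collect_status` are illustrative, and `snprintf` stands in for the truncating output helper a target would use.

```c
#include <stdio.h>
#include <string.h>

/* A status method returns void and writes a NUL-terminated string,
 * truncated to maxlen bytes including the terminator. */
typedef void (*status_fn)(char *result, size_t maxlen);

static void status_short(char *result, size_t maxlen)
{
    snprintf(result, maxlen, "ok");
}

static void status_long(char *result, size_t maxlen)
{
    snprintf(result, maxlen, "a rather long status line");
}

/* Concatenate the status strings of all targets into buf.
 * Returns the number of bytes of complete entries; *full is set to 1
 * when any target, including the last one, used up all remaining space. */
static size_t collect_status(const status_fn *targets, size_t n,
                             char *buf, size_t len, int *full)
{
    char *out = buf;

    *full = 0;
    for (size_t i = 0; i < n; i++) {
        size_t remaining = len - (size_t)(out - buf);

        if (remaining == 0) {
            *full = 1;
            break;
        }
        targets[i](out, remaining);

        size_t used = strlen(out) + 1;
        if (used == remaining) {    /* filled completely: assume overflow */
            *full = 1;
            break;
        }
        out += used;
    }
    return (size_t)(out - buf);
}
```

The check is deliberately conservative: an output that fits exactly is also reported as full, and the caller is expected to retry with a larger buffer.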
  23. 22 Dec, 2012 3 commits
    • dm ioctl: use kmalloc if possible · 9c5091f2
      Mikulas Patocka committed
      If the parameter buffer is small enough, try to allocate it with kmalloc()
      rather than vmalloc().
      
      vmalloc is noticeably slower than kmalloc because it has to manipulate
      page tables.
      
      In my tests, on PA-RISC this patch speeds up activation 13 times.
      On Opteron this patch speeds up activation by 5%.
      
      This patch introduces a new function free_params() to free the
      parameters and this uses new flags that record whether or not vmalloc()
      was used and whether or not the input buffer must be wiped after use.
      Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
      Signed-off-by: Alasdair G Kergon <agk@redhat.com>
    • dm ioctl: remove PF_MEMALLOC · 5023e5cf
      Mikulas Patocka committed
      When allocating memory for the userspace ioctl data, set some
      appropriate GFP flags directly instead of using PF_MEMALLOC.
      Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
      Signed-off-by: Alasdair G Kergon <agk@redhat.com>
    • dm ioctl: prevent unsafe change to dm_ioctl data_size · e910d7eb
      Alasdair G Kergon committed
      Abort dm ioctl processing if userspace changes the data_size parameter
      after we validated it but before we finished copying the data buffer
      from userspace.
      
      The dm ioctl parameters are processed in the following sequence:
       1. ctl_ioctl() calls copy_params();
       2. copy_params() makes a first copy of the fixed-sized portion of the
          userspace parameters into the local variable "tmp";
       3. copy_params() then validates tmp.data_size and allocates a new
          structure big enough to hold the complete data and copies the whole
          userspace buffer there;
       4. ctl_ioctl() reads userspace data the second time and copies the whole
          buffer into the pointer "param";
       5. ctl_ioctl() reads param->data_size without any validation and stores it
          in the variable "input_param_size";
       6. "input_param_size" is further used as the authoritative size of the
          kernel buffer.
      
      The problem is that userspace code could change the contents of user
      memory between steps 2 and 4.  In particular, the data_size parameter
      can be changed to an invalid value after the kernel has validated it.
      This lets userspace force the kernel to access invalid kernel memory.
      
      The fix is to ensure that the size has not changed at step 4.
      
      This patch shouldn't have a security impact because CAP_SYS_ADMIN is
      required to run this code, but it should be fixed anyway.
      Reported-by: Mikulas Patocka <mpatocka@redhat.com>
      Signed-off-by: Alasdair G Kergon <agk@redhat.com>
      Cc: stable@kernel.org
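The double-fetch problem and its fix can be demonstrated with a userspace simulation. This is a sketch under assumptions rather than the kernel code: `struct fake_hdr` stands in for the fixed part of `struct dm_ioctl`, `memcpy` stands in for copying from userspace, and the `race_hook` callback (a hypothetical device for this example) simulates another thread modifying user memory between the two fetches.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

struct fake_hdr {
    uint32_t data_size;
    uint32_t flags;
};

typedef void (*race_hook)(void *user_buf);

/* Simulated attacker: enlarges data_size after it has been validated. */
static void simulate_racing_writer(void *user_buf)
{
    ((struct fake_hdr *)user_buf)->data_size = 4096;
}

/* Returns a heap copy of the parameters, or NULL if validation fails or
 * data_size changed between the first and the second fetch. */
static void *copy_params_checked(void *user_buf, size_t max_size, race_hook between)
{
    struct fake_hdr tmp;

    memcpy(&tmp, user_buf, sizeof(tmp));            /* first fetch */
    if (tmp.data_size < sizeof(tmp) || tmp.data_size > max_size)
        return NULL;                                /* validation */

    void *param = malloc(tmp.data_size);
    if (!param)
        return NULL;

    if (between)
        between(user_buf);                          /* window for the race */

    memcpy(param, user_buf, tmp.data_size);         /* second fetch */

    /* The fix: the size seen in the second copy must equal the validated one. */
    if (((struct fake_hdr *)param)->data_size != tmp.data_size) {
        free(param);
        return NULL;
    }
    return param;
}
```

Without the final comparison, later code would trust the unvalidated `data_size` stored in the copied buffer as the size of the kernel allocation.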
  24. 27 Jul, 2012 1 commit
    • dm thin: commit before gathering status · 1f4e0ff0
      Alasdair G Kergon committed
      Commit outstanding metadata before returning the status for a dm thin
      pool so that the numbers reported are as up-to-date as possible.
      
      The commit is not performed if the device is suspended or if
      the DM_NOFLUSH_FLAG is supplied by userspace and passed to the target
      through a new 'status_flags' parameter in the target's dm_status_fn.
      
      The userspace dmsetup tool will support the --noflush flag with the
      'dmsetup status' and 'dmsetup wait' commands from version 1.02.76
      onwards.
      Tested-by: Mike Snitzer <snitzer@redhat.com>
      Signed-off-by: Alasdair G Kergon <agk@redhat.com>
  25. 29 Mar, 2012 1 commit
    • dm: reject trailing characters in sscanf input · 31998ef1
      Mikulas Patocka committed
      Device mapper uses sscanf to convert arguments to numbers. The problem is that
      the way we use it ignores additional unmatched characters in the scanned string.
      
      For example, `if (sscanf(string, "%d", &number) == 1)' will match a number,
      but it will also match a number with garbage appended, like "123abc".
      
      As a result, device mapper accepts garbage after some numbers. For example
      the command `dmsetup create vg1-new --table "0 16384 linear 254:1bla 34816bla"'
      will pass without an error.
      
      This patch fixes all sscanf uses in device mapper. It appends "%c" with
      a pointer to a dummy character variable to every sscanf statement.
      
      The construct `if (sscanf(string, "%d%c", &number, &dummy) == 1)' succeeds
      only if string is a null-terminated number (optionally preceded by some
      whitespace characters). If there is some character appended after the number,
      sscanf matches "%c", writes the character to the dummy variable and returns 2.
      We check the return value for 1 and consequently reject numbers with some
      garbage appended.
      Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
      Acked-by: Mike Snitzer <snitzer@redhat.com>
      Signed-off-by: Alasdair G Kergon <agk@redhat.com>
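The "%d%c" construct described in the sscanf commit can be tried in plain userspace C. The two helper names below are hypothetical; the sketch only contrasts the lax pattern with the strict one and is not taken from the patch itself.

```c
#include <stdio.h>

/* Lax parse: accepts "123abc" because unmatched trailing input is ignored. */
static int parse_int_lax(const char *s, int *out)
{
    return sscanf(s, "%d", out) == 1;
}

/* Strict parse: "%c" captures any character following the number, so a
 * return value of exactly 1 means the string ended right after it. */
static int parse_int_strict(const char *s, int *out)
{
    char dummy;

    return sscanf(s, "%d%c", out, &dummy) == 1;
}
```

With these helpers, "123" and " 42" are accepted by both, while "123abc" is accepted only by the lax variant. Trailing whitespace such as "123 " is also rejected by the strict variant, since "%c" matches the space.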
  26. 08 Mar, 2012 1 commit