1. 28 Mar, 2020 1 commit
    • block: simplify queue allocation · 3d745ea5
      Committed by Christoph Hellwig
      Current make_request based drivers use either blk_alloc_queue_node or
      blk_alloc_queue to allocate a queue, and then set up the make_request_fn
      function pointer and a few parameters using the blk_queue_make_request
      helper.  Simplify this by passing the make_request pointer to
      blk_alloc_queue, and while at it merge the _node variant into the main
      helper by always passing a node_id, and remove the superfluous gfp_mask
      parameter.  A lower-level __blk_alloc_queue is kept for the blk-mq case.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
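The shape of this change can be modelled in userspace. The sketch below is not kernel code: `model_queue`, `model_alloc_queue`, `model_echo_request` and `model_submit` are hypothetical stand-ins. It only shows how passing the handler and a node id at allocation time removes the separate setup step.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical userspace stand-ins; none of these are kernel symbols. */
struct model_queue;
typedef int (*model_make_request_fn)(struct model_queue *q, int sector);

struct model_queue {
    model_make_request_fn make_request; /* installed once, at allocation */
    int node_id;                        /* node hint, always passed in */
};

/* One call both allocates the queue and installs the handler. */
struct model_queue *model_alloc_queue(model_make_request_fn fn, int node_id)
{
    struct model_queue *q;

    if (!fn)
        return NULL; /* never hand out a queue without a handler */
    q = calloc(1, sizeof(*q));
    if (!q)
        return NULL;
    q->make_request = fn;
    q->node_id = node_id;
    return q;
}

/* Example handler: returns the sector so a caller can see that it ran. */
int model_echo_request(struct model_queue *q, int sector)
{
    (void)q;
    return sector;
}

/* Dispatch a request through the handler stored in the queue. */
int model_submit(struct model_queue *q, int sector)
{
    assert(q && q->make_request);
    return q->make_request(q, sector);
}
```

A caller would write `q = model_alloc_queue(model_echo_request, -1)` and could submit immediately, with no window in which the queue exists without a handler.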
  2. 25 Mar, 2020 1 commit
  3. 19 Mar, 2020 1 commit
  4. 28 Feb, 2020 1 commit
  5. 21 Feb, 2020 1 commit
    • nvme-multipath: Fix memory leak with ana_log_buf · 3b783090
      Committed by Logan Gunthorpe
      kmemleak reports a memory leak with the ana_log_buf allocated by
      nvme_mpath_init():
      
      unreferenced object 0xffff888120e94000 (size 8208):
        comm "nvme", pid 6884, jiffies 4295020435 (age 78786.312s)
          hex dump (first 32 bytes):
            00 00 00 00 00 00 00 00 01 00 00 00 00 00 00 00  ................
            01 00 00 00 01 00 00 00 00 00 00 00 00 00 00 00  ................
          backtrace:
            [<00000000e2360188>] kmalloc_order+0x97/0xc0
            [<0000000079b18dd4>] kmalloc_order_trace+0x24/0x100
            [<00000000f50c0406>] __kmalloc+0x24c/0x2d0
            [<00000000f31a10b9>] nvme_mpath_init+0x23c/0x2b0
            [<000000005802589e>] nvme_init_identify+0x75f/0x1600
            [<0000000058ef911b>] nvme_loop_configure_admin_queue+0x26d/0x280
            [<00000000673774b9>] nvme_loop_create_ctrl+0x2a7/0x710
            [<00000000f1c7a233>] nvmf_dev_write+0xc66/0x10b9
            [<000000004199f8d0>] __vfs_write+0x50/0xa0
            [<0000000065466fef>] vfs_write+0xf3/0x280
            [<00000000b0db9a8b>] ksys_write+0xc6/0x160
            [<0000000082156b91>] __x64_sys_write+0x43/0x50
            [<00000000c34fbb6d>] do_syscall_64+0x77/0x2f0
            [<00000000bbc574c9>] entry_SYSCALL_64_after_hwframe+0x49/0xbe
      
      nvme_mpath_init() is called by nvme_init_identify() which is called in
      multiple places (nvme_reset_work(), nvme_passthru_end(), etc). This
      means nvme_mpath_init() may be called multiple times before
      nvme_mpath_uninit() (which is only called on nvme_free_ctrl()).
      
      When nvme_mpath_init() is called multiple times, it overwrites the
      ana_log_buf pointer with a new allocation, thus leaking the previous
      allocation.
      
      To fix this, free ana_log_buf before allocating a new one.
      
      Fixes: 0d0b660f ("nvme: add ANA support")
      Cc: <stable@vger.kernel.org>
      Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
      Signed-off-by: Keith Busch <kbusch@kernel.org>
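The free-before-allocate pattern described in this commit can be sketched in userspace as follows. This is a minimal model under stated assumptions, not the driver code: `model_ctrl`, `model_mpath_init`, `model_mpath_uninit` and the allocation counter are hypothetical, and `malloc`/`free` stand in for the kernel allocator.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical userspace model; names only mirror the commit text. */
struct model_ctrl {
    void *ana_log_buf;
    size_t ana_log_size;
};

static int live_allocations; /* buffers allocated and not yet freed */

static void *counted_alloc(size_t size)
{
    void *p = malloc(size);

    if (p)
        live_allocations++;
    return p;
}

static void counted_free(void *p)
{
    if (p) {
        live_allocations--;
        free(p);
    }
}

/* May run many times per controller lifetime, like a re-identify. */
int model_mpath_init(struct model_ctrl *ctrl, size_t size)
{
    assert(ctrl != NULL);
    counted_free(ctrl->ana_log_buf); /* the fix: drop the old buffer first */
    ctrl->ana_log_buf = counted_alloc(size);
    if (!ctrl->ana_log_buf) {
        ctrl->ana_log_size = 0;
        return -1;
    }
    ctrl->ana_log_size = size;
    return 0;
}

/* Runs once, when the controller is finally released. */
void model_mpath_uninit(struct model_ctrl *ctrl)
{
    counted_free(ctrl->ana_log_buf);
    ctrl->ana_log_buf = NULL;
    ctrl->ana_log_size = 0;
}

int model_live_allocations(void)
{
    return live_allocations;
}
```

Without the `counted_free()` call at the top of `model_mpath_init()`, a second init would overwrite the pointer and the counter would stay at one after uninit, which is the leak kmemleak reported.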
  6. 20 Feb, 2020 1 commit
  7. 19 Feb, 2020 2 commits
  8. 15 Feb, 2020 4 commits
  9. 05 Feb, 2020 3 commits
    • nvmet: update AEN list and array at one place · 0f5be6a4
      Committed by Daniel Wagner
      All async events are enqueued via nvmet_add_async_event(), which
      updates the ctrl->async_event_cmds[] array and additionally adds a
      struct nvmet_async_event to the ctrl->async_events list.
      
      Under normal operation, nvmet_async_event_work() updates
      ctrl->async_event_cmds[] again and removes the corresponding struct
      nvmet_async_event from the list. However, nvmet_sq_destroy() may be
      called, which calls nvmet_async_events_free(), which only updates the
      ctrl->async_event_cmds[] array.
      
      Add new functions nvmet_async_events_process() and
      nvmet_async_events_free() to process async events and update both the
      array and the list.
      
      When we destroy the submission queue, after clearing the AENs present
      on the ctrl->async_events list, we also loop over
      ctrl->async_event_cmds[] for any requests posted by the host for which
      we don't have an AEN in the ctrl->async_events list, by calling
      nvmet_async_events_process() and nvmet_async_events_free().
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Daniel Wagner <dwagner@suse.de>
      [chaitanya.kulkarni@wdc.com
       * Loop over and clear out outstanding requests
       * Update changelog
      ]
      Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
      Signed-off-by: Keith Busch <kbusch@kernel.org>
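The interplay between the request array and the event list can be sketched with a loose userspace model. Everything here is an assumption made for illustration: the `model_` names, the fixed-size arrays, and the choice to complete leftover requests with a caller-supplied status. The point is only that both containers are updated in the same place, and that teardown drains requests that never got an event.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_MAX_CMDS   4
#define MODEL_MAX_EVENTS 8

/* Hypothetical stand-in for a host request waiting for an async event. */
struct model_req {
    int completed;
    int status;
    int result;
};

struct model_ctrl {
    struct model_req *async_event_cmds[MODEL_MAX_CMDS]; /* posted by host */
    int nr_async_event_cmds;
    int async_events[MODEL_MAX_EVENTS];                 /* queued events, FIFO */
    int nr_async_events;
};

static void model_complete(struct model_req *req, int status, int result)
{
    assert(!req->completed); /* each request completes exactly once */
    req->completed = 1;
    req->status = status;
    req->result = result;
}

/* Pair queued events with outstanding requests; updates array and list. */
void model_async_events_process(struct model_ctrl *ctrl, int status)
{
    while (ctrl->nr_async_event_cmds > 0 && ctrl->nr_async_events > 0) {
        struct model_req *req =
            ctrl->async_event_cmds[--ctrl->nr_async_event_cmds];
        int result = ctrl->async_events[0];
        int i;

        /* Pop the oldest event from the front of the FIFO. */
        for (i = 1; i < ctrl->nr_async_events; i++)
            ctrl->async_events[i - 1] = ctrl->async_events[i];
        ctrl->nr_async_events--;
        model_complete(req, status, result);
    }
}

/* Teardown: complete every request that has no event to pair with. */
void model_async_events_free(struct model_ctrl *ctrl, int status)
{
    while (ctrl->nr_async_event_cmds > 0) {
        struct model_req *req =
            ctrl->async_event_cmds[--ctrl->nr_async_event_cmds];

        model_complete(req, status, 0);
    }
}
```

On queue destruction the model calls `model_async_events_process()` first and `model_async_events_free()` second, so no request is left referenced by the array once the queue is gone.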
    • nvmet: Fix controller use after free · 1a3f540d
      Committed by Israel Rukshin
      After nvmet_install_queue() sets sq->ctrl, a call to nvmet_sq_destroy()
      drops the controller refcount. If nvmet_install_queue() fails,
      nvmet_ctrl_put() is called twice (in nvmet_sq_destroy() and in
      nvmet_execute_io_connect()/nvmet_execute_admin_connect()) instead of once
      for the queue, which leads to a use-after-free of the controller. Fix this
      by setting sq->ctrl to NULL when nvmet_install_queue() fails.
      
      The bug leads to the following Call Trace:
      
      [65857.994862] refcount_t: underflow; use-after-free.
      [65858.108304] Workqueue: events nvmet_rdma_release_queue_work [nvmet_rdma]
      [65858.115557] RIP: 0010:refcount_warn_saturate+0xe5/0xf0
      [65858.208141] Call Trace:
      [65858.211203]  nvmet_sq_destroy+0xe1/0xf0 [nvmet]
      [65858.216383]  nvmet_rdma_release_queue_work+0x37/0xf0 [nvmet_rdma]
      [65858.223117]  process_one_work+0x167/0x370
      [65858.227776]  worker_thread+0x49/0x3e0
      [65858.232089]  kthread+0xf5/0x130
      [65858.235895]  ? max_active_store+0x80/0x80
      [65858.240504]  ? kthread_bind+0x10/0x10
      [65858.244832]  ret_from_fork+0x1f/0x30
      [65858.249074] ---[ end trace f82d59250b54beb7 ]---
      
      Fixes: bb1cc747 ("nvmet: implement valid sqhd values in completions")
      Fixes: 1672ddb8 ("nvmet: Add install_queue callout")
      Signed-off-by: Israel Rukshin <israelr@mellanox.com>
      Reviewed-by: Max Gurtovoy <maxg@mellanox.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Keith Busch <kbusch@kernel.org>
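The double put can be illustrated with a small userspace model of the reference counting. This is a sketch under assumptions, not the target code: the `model_` types and functions are hypothetical, a plain `int` stands in for the kernel refcount, and the `should_fail` flag is an invented way to force the error path.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace model of the controller and submission queue. */
struct model_ctrl {
    int refcount;
};

struct model_sq {
    struct model_ctrl *ctrl;
};

void model_ctrl_put(struct model_ctrl *ctrl)
{
    assert(ctrl->refcount > 0); /* underflow here is the use-after-free case */
    ctrl->refcount--;
}

/* Returns 0 on success; on failure the queue must not keep the controller. */
int model_install_queue(struct model_ctrl *ctrl, struct model_sq *sq,
                        int should_fail)
{
    sq->ctrl = ctrl;
    if (should_fail) {
        sq->ctrl = NULL; /* the fix: the failed queue owns no reference */
        return -1;
    }
    return 0;
}

/* Queue teardown drops the reference only if the queue still holds one. */
void model_sq_destroy(struct model_sq *sq)
{
    if (sq->ctrl) {
        model_ctrl_put(sq->ctrl);
        sq->ctrl = NULL;
    }
}

/* Connect path: takes one reference for the queue, drops it on failure. */
int model_execute_connect(struct model_ctrl *ctrl, struct model_sq *sq,
                          int should_fail)
{
    ctrl->refcount++;
    if (model_install_queue(ctrl, sq, should_fail)) {
        model_ctrl_put(ctrl); /* the single put for the failed install */
        return -1;
    }
    return 0;
}
```

If `model_install_queue()` left `sq->ctrl` set on failure, the later `model_sq_destroy()` would perform a second put for the same reference and trip the underflow assertion, which corresponds to the `refcount_t: underflow` warning in the trace.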
    • nvmet: Fix error print message at nvmet_install_queue function · 0b87a2b7
      Committed by Israel Rukshin
      Place the arguments in the correct order.
      
      Fixes: 1672ddb8 ("nvmet: Add install_queue callout")
      Signed-off-by: Israel Rukshin <israelr@mellanox.com>
      Reviewed-by: Max Gurtovoy <maxg@mellanox.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Keith Busch <kbusch@kernel.org>
  10. 04 Feb, 2020 3 commits
  11. 01 Feb, 2020 1 commit
  12. 10 Jan, 2020 2 commits
  13. 07 Jan, 2020 1 commit
    • block: Allow t10-pi to be modular · a754bd5f
      Committed by Herbert Xu
      Currently t10-pi can only be built into the block layer, which, via
      crc-t10dif, pulls in a whole chunk of the Crypto API.  In fact, all
      users of t10-pi work as modules, and there is no reason for it to
      always be built in.
      
      This patch adds a new hidden option for t10-pi that is selected
      automatically based on BLK_DEV_INTEGRITY and whether the users
      of t10-pi are built-in or not.
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
  14. 07 Dec, 2019 3 commits
  15. 03 Dec, 2019 2 commits
  16. 27 Nov, 2019 8 commits
  17. 22 Nov, 2019 2 commits
    • nvme: hwmon: add quirk to avoid changing temperature threshold · 6c6aa2f2
      Committed by Akinobu Mita
      This adds a new quirk NVME_QUIRK_NO_TEMP_THRESH_CHANGE to avoid changing
      the value of the temperature threshold feature for specific devices that
      show undesirable behavior.
      
      Guenter reported:
      
      "On my Intel NVME drive (SSDPEKKW512G7), writing any minimum limit on the
      Composite temperature sensor results in a temperature warning, and that
      warning is sticky until I reset the controller.
      
      It doesn't seem to matter which temperature I write; writing -273000 has
      the same result."
      
      The Intel NVMe has the latest firmware version installed, so this isn't
      a problem that was ever fixed.
      Reported-by: Guenter Roeck <linux@roeck-us.net>
      Cc: Keith Busch <kbusch@kernel.org>
      Cc: Jens Axboe <axboe@fb.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Sagi Grimberg <sagi@grimberg.me>
      Cc: Jean Delvare <jdelvare@suse.com>
      Reviewed-by: Guenter Roeck <linux@roeck-us.net>
      Tested-by: Guenter Roeck <linux@roeck-us.net>
      Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
      Signed-off-by: Keith Busch <kbusch@kernel.org>
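A quirk of this kind is typically a bit in a per-device flag word that gates one code path. The sketch below is a hypothetical userspace model: the bit position, the `model_ctrl` layout, and the choice to return `-EOPNOTSUPP` are all assumptions, since the commit message does not say how the driver reacts when the quirk is set.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical bit position; the real value is defined by the driver. */
#define MODEL_QUIRK_NO_TEMP_THRESH_CHANGE (1UL << 0)

struct model_ctrl {
    unsigned long quirks;   /* per-device quirk bits */
    long min_millicelsius;  /* last under temperature threshold written */
};

/*
 * Refuse to change the threshold on quirked devices, so the device never
 * sees the write that would trigger the sticky temperature warning.
 */
int model_set_temp_min(struct model_ctrl *ctrl, long millicelsius)
{
    assert(ctrl != NULL);
    if (ctrl->quirks & MODEL_QUIRK_NO_TEMP_THRESH_CHANGE)
        return -EOPNOTSUPP; /* assumption: reject rather than silently ignore */
    ctrl->min_millicelsius = millicelsius;
    return 0;
}
```

A device entry would set the quirk bit once at probe time, after which every threshold write for that device takes the early return.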
    • nvme: hwmon: provide temperature min and max values for each sensor · 52deba0f
      Committed by Akinobu Mita
      According to the NVMe specification, the over temperature threshold and
      under temperature threshold features shall be implemented for Composite
      Temperature if a non-zero WCTEMP field value is reported in the Identify
      Controller data structure.  The features are also implemented for all
      implemented temperature sensors (i.e., all Temperature Sensor fields that
      report a non-zero value).
      
      This provides the over temperature threshold and under temperature
      threshold for each sensor as temperature min and max values of hwmon
      sysfs attributes.
      
      The WCTEMP is already provided as a temperature max value for Composite
      Temperature, but this change remains compatible, because the default
      value of the over temperature threshold for Composite Temperature is
      the WCTEMP.
      
      Now the alarm attribute for Composite Temperature indicates that one of
      the temperatures is outside of a temperature threshold, because there is
      only a single bit in the Critical Warning field that indicates a
      temperature is outside of a threshold.
      
      Example output from the "sensors" command:
      
      nvme-pci-0100
      Adapter: PCI adapter
      Composite:    +33.9°C  (low  = -273.1°C, high = +69.8°C)
                             (crit = +79.8°C)
      Sensor 1:     +34.9°C  (low  = -273.1°C, high = +65261.8°C)
      Sensor 2:     +31.9°C  (low  = -273.1°C, high = +65261.8°C)
      Sensor 5:     +47.9°C  (low  = -273.1°C, high = +65261.8°C)
      
      This also adds helper macros for kelvin from/to milli Celsius conversion,
      and replaces the repeated code in hwmon.c.
      
      Cc: Keith Busch <kbusch@kernel.org>
      Cc: Jens Axboe <axboe@fb.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Sagi Grimberg <sagi@grimberg.me>
      Cc: Jean Delvare <jdelvare@suse.com>
      Reviewed-by: Guenter Roeck <linux@roeck-us.net>
      Tested-by: Guenter Roeck <linux@roeck-us.net>
      Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
      Signed-off-by: Keith Busch <kbusch@kernel.org>
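The kelvin to milli-Celsius conversion mentioned in this commit can be sketched as two small functions. The function names are hypothetical and the commit's actual macros are not shown in the message; the arithmetic assumes the usual 273.15 K offset and rounding to the nearest whole kelvin on the way back.

```c
#include <assert.h>

/* 0 degrees Celsius expressed in millikelvin (273.15 K). */
#define MODEL_ZERO_CELSIUS_MILLIKELVIN 273150L

/* Whole kelvin, as stored by the device, to milli-Celsius for hwmon. */
long model_kelvin_to_millicelsius(long kelvin)
{
    assert(kelvin >= 0); /* kelvin values cannot be negative */
    return kelvin * 1000L - MODEL_ZERO_CELSIUS_MILLIKELVIN;
}

/* Milli-Celsius from userspace to whole kelvin, rounded to nearest. */
long model_millicelsius_to_kelvin(long millicelsius)
{
    long millikelvin = millicelsius + MODEL_ZERO_CELSIUS_MILLIKELVIN;

    if (millikelvin >= 0)
        return (millikelvin + 500) / 1000;
    return -((-millikelvin + 500) / 1000);
}
```

Under this arithmetic, 0 K maps to -273150 milli-Celsius and 65535 K maps to 65261850 milli-Celsius, which is consistent with the -273.1°C and +65261.8°C limits in the example output if those sensors hold the minimum and maximum 16-bit threshold values.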
  18. 13 Nov, 2019 1 commit
  19. 12 Nov, 2019 1 commit
    • nvme: Add hardware monitoring support · 400b6a7b
      Committed by Guenter Roeck
      NVMe devices report temperature information in the controller information
      (for limits) and in the SMART log. Currently, the only means to retrieve
      this information is the nvme command line interface, which requires
      super-user privileges.
      
      At the same time, it would be desirable to be able to use NVMe temperature
      information for thermal control.
      
      This patch adds support to read NVMe temperatures from the kernel using the
      hwmon API and adds temperature zones for NVMe drives. The thermal subsystem
      can use this information to set thermal policies, and userspace can access
      it using libsensors and/or the "sensors" command.
      
      Example output from the "sensors" command:
      
      nvme0-pci-0100
      Adapter: PCI adapter
      Composite:    +39.0°C  (high = +85.0°C, crit = +85.0°C)
      Sensor 1:     +39.0°C
      Sensor 2:     +41.0°C
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Guenter Roeck <linux@roeck-us.net>
      Signed-off-by: Keith Busch <kbusch@kernel.org>
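Reading the composite temperature out of a raw SMART log buffer can be sketched as below. This is a userspace illustration under assumptions: the helper names are hypothetical, the field offset (bytes 1 and 2, little endian, in kelvin) reflects my reading of the NVMe SMART / Health Information log layout, and the whole-degree offset of 273 is chosen so that 312 K maps to the +39.0°C shown in the example output.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed offset of Composite Temperature within the SMART log page. */
#define MODEL_SMART_TEMP_OFFSET 1

/* Decode a 16-bit little-endian value regardless of host byte order. */
static uint16_t model_get_le16(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

/* Return the composite temperature in milli-Celsius, as hwmon expects. */
long model_composite_millicelsius(const uint8_t *smart_log)
{
    long kelvin;

    assert(smart_log != NULL);
    kelvin = model_get_le16(smart_log + MODEL_SMART_TEMP_OFFSET);
    return (kelvin - 273) * 1000L; /* assumption: whole-degree offset */
}
```

With a buffer whose bytes 1 and 2 are `0x38 0x01` (312 K), the function returns 39000, the value a hwmon `temp1_input` style attribute would expose for 39.0°C.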
  20. 05 Nov, 2019 1 commit
    • nvme-multipath: fix crash in nvme_mpath_clear_ctrl_paths · 763303a8
      Committed by Anton Eidelman
      nvme_mpath_clear_ctrl_paths() iterates through
      the ctrl->namespaces list while holding ctrl->scan_lock.
      This does not seem to be the correct way of protecting
      from concurrent list modification.
      
      Specifically, nvme_scan_work() sorts ctrl->namespaces
      AFTER unlocking scan_lock.
      
      This may result in the following (rare) crash in ctrl disconnect
      during scan_work:
      
          BUG: kernel NULL pointer dereference, address: 0000000000000050
          Oops: 0000 [#1] SMP PTI
          CPU: 0 PID: 3995 Comm: nvme 5.3.5-050305-generic
          RIP: 0010:nvme_mpath_clear_current_path+0xe/0x90 [nvme_core]
          ...
          Call Trace:
           nvme_mpath_clear_ctrl_paths+0x3c/0x70 [nvme_core]
           nvme_remove_namespaces+0x35/0xe0 [nvme_core]
           nvme_do_delete_ctrl+0x47/0x90 [nvme_core]
           nvme_sysfs_delete+0x49/0x60 [nvme_core]
           dev_attr_store+0x17/0x30
           sysfs_kf_write+0x3e/0x50
           kernfs_fop_write+0x11e/0x1a0
           __vfs_write+0x1b/0x40
           vfs_write+0xb9/0x1a0
           ksys_write+0x67/0xe0
           __x64_sys_write+0x1a/0x20
           do_syscall_64+0x5a/0x130
           entry_SYSCALL_64_after_hwframe+0x44/0xa9
          RIP: 0033:0x7f8d02bfb154
      
      Fix:
      After taking scan_lock in nvme_mpath_clear_ctrl_paths(), also take
      down_read(&ctrl->namespaces_rwsem) to make the list traversal safe.
      This will not cause deadlocks because scan_lock is never taken
      while holding namespaces_rwsem.
      Moreover, scan work takes namespaces_rwsem in the same order.
      
      Alternative: sort ctrl->namespaces in nvme_scan_work()
      while still holding the scan_lock.
      This would leave nvme_mpath_clear_ctrl_paths() without correct protection
      against ctrl->namespaces modification by anyone other than scan_work.
      Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Anton Eidelman <anton@lightbitslabs.com>
      Signed-off-by: Keith Busch <kbusch@kernel.org>