1. 08 Dec, 2021 1 commit
    • nvme: fix use after free when disconnecting a reconnecting ctrl · 8b77fa6f
      Committed by Ruozhu Li
      A crash happens when trying to disconnect a reconnecting ctrl:
      
       1) The network was cut off just after the connection was established,
          and the scan work hung waiting for some I/Os to complete.  Those I/Os
          were retried because we return BLK_STS_RESOURCE to the block layer
          while reconnecting.
       2) After a while, I tried to disconnect this connection.  This
          procedure also hung because it tried to obtain ctrl->scan_lock.
          Note that by this point the controller state had already been
          switched to NVME_CTRL_DELETING.
       3) In nvme_check_ready(), we always return true when ctrl->state is
          NVME_CTRL_DELETING, so those retrying I/Os were issued to the bottom
          device, which was already freed.
      
      To fix this, when ctrl->state is NVME_CTRL_DELETING, issue cmd to bottom
      device only when queue state is live.  If not, return host path error to
      the block layer.
      Signed-off-by: Ruozhu Li <liruozhu@huawei.com>
      Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
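The dispatch rule described in this commit message can be sketched as a small user-space model. This is only an illustration under stated assumptions, not the kernel code: the names `model_ctrl`, `model_dispatch`, and the `CTRL_*` and `DISPATCH_*` enumerators are hypothetical stand-ins, and only the three states mentioned above are modeled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the controller states named in the commit. */
enum ctrl_state { CTRL_LIVE, CTRL_CONNECTING, CTRL_DELETING };

/* Possible outcomes for a queued command in this model. */
enum dispatch { DISPATCH_ISSUE, DISPATCH_RETRY, DISPATCH_HOST_PATH_ERROR };

struct model_ctrl {
	enum ctrl_state state;
};

/*
 * Decide what to do with a command, following the rule in the commit
 * message: while deleting, issue only if the queue is live, otherwise
 * fail with a host path error instead of touching the transport.
 */
enum dispatch model_dispatch(const struct model_ctrl *ctrl, bool queue_live)
{
	assert(ctrl != NULL);

	switch (ctrl->state) {
	case CTRL_LIVE:
		return DISPATCH_ISSUE;
	case CTRL_DELETING:
		return queue_live ? DISPATCH_ISSUE : DISPATCH_HOST_PATH_ERROR;
	case CTRL_CONNECTING:
	default:
		/* Modeled after the BLK_STS_RESOURCE retry described above. */
		return DISPATCH_RETRY;
	}
}
```

In this model, a deleting controller with a dead queue yields `DISPATCH_HOST_PATH_ERROR`, which corresponds to the case that previously reached the freed device.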
  2. 21 Oct, 2021 1 commit
  3. 20 Oct, 2021 2 commits
  4. 19 Oct, 2021 1 commit
  5. 28 Sep, 2021 1 commit
  6. 06 Sep, 2021 2 commits
  7. 17 Aug, 2021 2 commits
  8. 16 Aug, 2021 1 commit
  9. 15 Aug, 2021 1 commit
  10. 21 Jul, 2021 1 commit
  11. 01 Jul, 2021 2 commits
  12. 17 Jun, 2021 1 commit
  13. 03 Jun, 2021 2 commits
  14. 12 May, 2021 1 commit
  15. 04 May, 2021 2 commits
  16. 22 Apr, 2021 2 commits
  17. 15 Apr, 2021 5 commits
  18. 06 Apr, 2021 1 commit
    • nvme: implement non-mdts command limits · 5befc7c2
      Committed by Keith Busch
      Commands that access LBA contents without a data transfer between the
      host and the controller historically have not had a spec-defined upper
      limit. The driver set the queue constraints for such commands to the
      max data transfer size just to be safe, but this artificial constraint
      frequently limits devices below their capabilities.
      
      TP4040, ratified by the NVMe Workgroup, defines how a controller may
      advertise its non-MDTS limits. Use these if provided and default to
      the current constraints if not. Since the Dataset Management command
      limits are defined in logical blocks, but without a namespace to tell us
      the logical block size, the code defaults to the safe 512-byte size.
      Signed-off-by: Keith Busch <kbusch@kernel.org>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
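The fallback policy in this commit message ("use these if provided and default to the current constraints if not") can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the struct and function names are hypothetical, a value of 0 is taken to mean "not advertised", and the inputs are assumed to be already decoded into plain counts rather than raw Identify fields.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SECTOR_SIZE      512u  /* block-layer sector size in bytes */
#define ASSUMED_LBA_SIZE 512u  /* safe default when no namespace is known */

/* Hypothetical, already-decoded controller report; 0 means "not advertised". */
struct model_report {
	uint32_t mdts_sectors;      /* MDTS-derived cap, in 512-byte sectors */
	uint32_t dsm_limit_blocks;  /* Dataset Management limit, in logical blocks */
	uint32_t wz_limit_sectors;  /* Write Zeroes limit, in 512-byte sectors */
};

struct model_limits {
	uint32_t max_discard_sectors;
	uint32_t max_write_zeroes_sectors;
};

/* Prefer the advertised non-MDTS limits; otherwise keep the MDTS-based cap. */
struct model_limits model_non_mdts_limits(const struct model_report *r)
{
	struct model_limits lim;

	assert(r != NULL);

	/* Logical blocks are converted using the assumed 512-byte block size. */
	lim.max_discard_sectors = r->dsm_limit_blocks
		? r->dsm_limit_blocks * (ASSUMED_LBA_SIZE / SECTOR_SIZE)
		: r->mdts_sectors;

	lim.max_write_zeroes_sectors = r->wz_limit_sectors
		? r->wz_limit_sectors
		: r->mdts_sectors;

	return lim;
}
```

With no limits advertised, both results fall back to `mdts_sectors`; once a controller reports larger values, those are used instead, which is the capability gain the commit message describes.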
  19. 03 Apr, 2021 3 commits
  20. 10 Feb, 2021 3 commits
  21. 02 Feb, 2021 1 commit
  22. 06 Jan, 2021 2 commits
  23. 02 Dec, 2020 2 commits
    • nvme: export zoned namespaces without Zone Append support read-only · 2f4c9ba2
      Committed by Javier González
      Allow ZNS NVMe SSDs to present a read-only namespace when append is not
      supported, instead of rejecting the namespace directly.
      
      This allows (i) the namespace to be used in read-only mode, which is not
      a problem as the append command only affects the write path, and (ii) to
      use standard management tools such as nvme-cli to choose a different
      format or firmware slot that is compatible with the Linux zoned block
      device.
      Signed-off-by: Javier González <javier.gonz@samsung.com>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
    • nvme-fabrics: reject I/O to offline device · 8c4dfea9
      Committed by Victor Gladkov
      Commands get stuck while Host NVMe-oF controller is in reconnect state.
      The controller enters into reconnect state when it loses connection with
      the target.  It tries to reconnect every 10 seconds (default) until
      a successful reconnect or until the reconnect timeout is reached.
      The default reconnect timeout is 10 minutes.
      
      Applications are expecting commands to complete with success or error
      within a certain timeout (30 seconds by default).  The NVMe host is
      enforcing that timeout while it is connected, but during reconnect the
      timeout is not enforced and commands may get stuck for a long period or
      even forever.
      
      To fix this long delay due to the default timeout, introduce a new
      "fast_io_fail_tmo" session parameter.  The timeout is measured in seconds
      from the controller reconnect and any command beyond that timeout is
      rejected.  The new parameter value may be passed during 'connect'.
      The default value of -1 means no timeout (similar to current behavior).
      Signed-off-by: Victor Gladkov <victor.gladkov@kioxia.com>
      Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
      Reviewed-by: Hannes Reinecke <hare@suse.de>
      Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
      Reviewed-by: Chao Leng <lengchao@huawei.com>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
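The timeout semantics described in this commit message can be sketched as a simple time comparison. This is a simplified model under assumptions, not the driver's mechanism: the function name `model_failfast_expired` and the macro `FAST_IO_FAIL_OFF` are hypothetical, and elapsed time is passed in as a plain number of seconds since the reconnect began.

```c
#include <assert.h>
#include <stdbool.h>

/* The commit message gives -1 as the default, meaning "no timeout". */
#define FAST_IO_FAIL_OFF (-1)

/*
 * Return true when commands should be rejected instead of being held
 * while the controller keeps trying to reconnect.
 *
 * fast_io_fail_tmo: timeout in seconds, or FAST_IO_FAIL_OFF
 * elapsed_s:        seconds since the controller started reconnecting
 */
bool model_failfast_expired(int fast_io_fail_tmo, long elapsed_s)
{
	assert(elapsed_s >= 0);

	if (fast_io_fail_tmo == FAST_IO_FAIL_OFF)
		return false;  /* same as the behavior before this commit */

	/* Simplification: treat reaching the timeout as expiry. */
	return elapsed_s >= fast_io_fail_tmo;
}
```

For example, with a 30-second timeout a command seen 5 seconds into the reconnect would still be held, while one seen after 31 seconds would be rejected; with the default of -1 nothing is ever rejected on this basis.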