1. 29 December 2017, 1 commit
    • nvme-rdma: fix concurrent reset and reconnect · d5bf4b7f
      Committed by Sagi Grimberg
      The ctrl state machine now allows a transition from RESETTING to
      RECONNECTING.  In nvme-rdma, when we receive an rdma cm DISCONNECTED
      event, we trigger nvme_rdma_error_recovery.  This also happens when we
      execute a controller reset, issue a cm disconnect request and receive
      the cm disconnect reply; as a result, the reset work and the error
      recovery work can run concurrently.
      
      Until now, the state machine prevented the error recovery work from
      running as a result of a controller reset (RESETTING -> RECONNECTING
      was not allowed).
      
      To fix this, we adopt the FC state machine approach: we always
      transition from LIVE to RESETTING and only then to RECONNECTING.  We do
      this for both the error recovery work and the controller reset work:
      
       1. transition to RESETTING
       2. teardown the controller association
       3. transition to RECONNECTING
      
      This restores the protection against the reset work and the error
      recovery work running concurrently.
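
      A minimal sketch of the resulting ordering, written in the style of the
      nvme-rdma error recovery work (the err_work field and the teardown and
      reconnect helper names here are illustrative, not necessarily the exact
      ones in the patch):
      --
      static void nvme_rdma_error_recovery_work(struct work_struct *work)
      {
              struct nvme_rdma_ctrl *ctrl = container_of(work,
                              struct nvme_rdma_ctrl, err_work);

              /* 1. LIVE -> RESETTING; if the reset work already claimed the
               *    controller, this transition fails and we simply bail out,
               *    so the two works can no longer tear down concurrently.
               */
              if (!nvme_change_ctrl_state(&ctrl->ctrl, NVME_CTRL_RESETTING))
                      return;

              /* 2. teardown the controller association (illustrative helper) */
              nvme_rdma_teardown_ctrl(ctrl);

              /* 3. RESETTING -> RECONNECTING, then schedule the reconnects */
              if (!nvme_change_ctrl_state(&ctrl->ctrl, NVME_CTRL_RECONNECTING))
                      return;

              nvme_rdma_reconnect_or_remove(ctrl);
      }
      --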
      
      Fixes: 3cec7f9d ("nvme: allow controller RESETTING to RECONNECTING transition")
      Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
  2. 29 November 2017, 1 commit
    • nvme-rdma: fix memory leak during queue allocation · eb1bd249
      Committed by Max Gurtovoy
      In case the nvme_rdma_wait_for_cm timeout expires before we get an
      established or rejected event from rdma_cm (after rdma_connect
      succeeded), we end up leaking the ib transport resources for the
      dedicated queue.  This scenario can easily be reproduced by running a
      traffic test during port toggling.
      Also, to protect against parallel ib queue destruction, which may be
      invoked from different contexts, introduce a new flag that indicates
      transport readiness.  While we're here, also protect against receiving
      rdma_cm events during ib queue destruction.
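
      A minimal sketch of the readiness-flag idea (the flag, field and helper
      names are illustrative): the destroy path frees the IB resources only
      if the queue was marked ready, and clears the mark atomically so
      repeated calls from different contexts become no-ops:
      --
      /* set once the IB resources (QP, CQ) were successfully allocated */
      #define NVME_RDMA_Q_TR_READY    1

      static void nvme_rdma_destroy_queue_ib(struct nvme_rdma_queue *queue)
      {
              /* atomically claim the teardown; later callers see the bit clear */
              if (!test_and_clear_bit(NVME_RDMA_Q_TR_READY, &queue->flags))
                      return;

              rdma_destroy_qp(queue->cm_id);
              ib_free_cq(queue->ib_cq);
      }
      --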
      Signed-off-by: Max Gurtovoy <maxg@mellanox.com>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
  3. 26 November 2017, 5 commits
  4. 20 November 2017, 1 commit
  5. 11 November 2017, 3 commits
  6. 01 November 2017, 4 commits
  7. 27 October 2017, 2 commits
    • nvme-rdma: add support for duplicate_connect option · 36e835f2
      Committed by James Smart
      Adds support for the duplicate_connect option.  Unless the option is
      set to true, the connect path checks whether there is an existing
      controller with the same target address (traddr), target port
      (trsvcid) and, if specified, host address (host_traddr), and fails the
      connection request if such a controller exists.
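
      A minimal sketch of the matching logic (the helper name is
      illustrative): the connect path only rejects the request, with
      -EALREADY, when an existing controller matches and duplicate_connect
      was left at its default of false:
      --
      /* does an existing controller use the same target/host address? */
      static bool nvme_rdma_same_target(struct nvmf_ctrl_options *opts,
                      struct nvme_rdma_ctrl *ctrl)
      {
              if (strcmp(opts->traddr, ctrl->ctrl.opts->traddr))
                      return false;

              /* trsvcid must either match or be absent on both sides */
              if ((opts->mask & NVMF_OPT_TRSVCID) !=
                  (ctrl->ctrl.opts->mask & NVMF_OPT_TRSVCID))
                      return false;
              if ((opts->mask & NVMF_OPT_TRSVCID) &&
                  strcmp(opts->trsvcid, ctrl->ctrl.opts->trsvcid))
                      return false;

              /* host_traddr only participates in the match when specified */
              if (opts->mask & NVMF_OPT_HOST_TRADDR) {
                      if (!(ctrl->ctrl.opts->mask & NVMF_OPT_HOST_TRADDR) ||
                          strcmp(opts->host_traddr, ctrl->ctrl.opts->host_traddr))
                              return false;
              }

              return true;
      }
      --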
      Signed-off-by: James Smart <james.smart@broadcom.com>
      Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
    • nvme: switch controller refcounting to use struct device · d22524a4
      Committed by Christoph Hellwig
      Instead of allocating a separate struct device for the character device
      handle, embed it into struct nvme_ctrl and use it for the main
      controller refcounting.  This removes the double refcounting and gets
      us an automatic reference for the character device operations.  We keep
      ctrl->device as a pointer for now to avoid changing printks all over,
      but in the future we could look into message printing helpers that take
      a controller structure, similar to what other subsystems do.
      
      Note that the delete_ctrl operation now always already holds a
      reference when it is entered (either through sysfs due to this change,
      or because every open file on the /dev/nvme-fabrics node holds a
      reference), so we don't need the unless_zero variant there.
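
      A minimal sketch of what the refcounting boils down to after this
      change, assuming the embedded device is reachable via ctrl->device:
      --
      static inline void nvme_get_ctrl(struct nvme_ctrl *ctrl)
      {
              get_device(ctrl->device);
      }

      static inline void nvme_put_ctrl(struct nvme_ctrl *ctrl)
      {
              put_device(ctrl->device);
      }
      --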
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Reviewed-by: Hannes Reinecke <hare@suse.com>
  8. 23 October 2017, 3 commits
    • e62a538d
    • nvme-rdma: align nvme_rdma_device structure · f87c89ad
      Committed by Max Gurtovoy
      Signed-off-by: Max Gurtovoy <maxg@mellanox.com>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
    • nvme-rdma: fix possible hang when issuing commands during ctrl removal · 7db81446
      Committed by Sagi Grimberg
      nvme_rdma_queue_is_ready() fails requests in case a queue is not
      LIVE.  If the controller is in the RECONNECTING state, we might stay
      in that state for a long time (until we successfully reconnect), so
      we are better off failing the request fast.  Otherwise, we fail the
      request with BLK_STS_RESOURCE to have the block layer try again
      soon.
      
      In case we are removing the controller while the admin queue is not
      LIVE, we return BLK_STS_RESOURCE for the request, but this happens
      before we call blk_mq_start_request(), so the request timeout never
      expires and the queue will never get back to LIVE (because we are
      removing the controller).  This causes the removal operation to block
      indefinitely [1].
      
      Thus, if we are removing (state DELETING), and the queue is
      not LIVE, we need to fail the request permanently as there is
      no chance for it to ever complete successfully.
      
      [1]
      --
      sysrq: SysRq : Show Blocked State
        task                        PC stack   pid father
      kworker/u66:2   D    0   440      2 0x80000000
      Workqueue: nvme-wq nvme_rdma_del_ctrl_work [nvme_rdma]
      Call Trace:
       __schedule+0x3e9/0xb00
       schedule+0x40/0x90
       schedule_timeout+0x221/0x580
       io_schedule_timeout+0x1e/0x50
       wait_for_completion_io_timeout+0x118/0x180
       blk_execute_rq+0x86/0xc0
       __nvme_submit_sync_cmd+0x89/0xf0
       nvmf_reg_write32+0x4b/0x90 [nvme_fabrics]
       nvme_shutdown_ctrl+0x41/0xe0
       nvme_rdma_shutdown_ctrl+0xca/0xd0 [nvme_rdma]
       nvme_rdma_remove_ctrl+0x2b/0x40 [nvme_rdma]
       nvme_rdma_del_ctrl_work+0x25/0x30 [nvme_rdma]
       process_one_work+0x1fd/0x630
       worker_thread+0x1db/0x3b0
       kthread+0x11e/0x150
       ret_from_fork+0x27/0x40
      01              D    0  2868   2862 0x00000000
      Call Trace:
       __schedule+0x3e9/0xb00
       schedule+0x40/0x90
       schedule_timeout+0x260/0x580
       wait_for_completion+0x108/0x170
       flush_work+0x1e0/0x270
       nvme_rdma_del_ctrl+0x5a/0x80 [nvme_rdma]
       nvme_sysfs_delete+0x2a/0x40
       dev_attr_store+0x18/0x30
       sysfs_kf_write+0x45/0x60
       kernfs_fop_write+0x124/0x1c0
       __vfs_write+0x28/0x150
       vfs_write+0xc7/0x1b0
       SyS_write+0x49/0xa0
       entry_SYSCALL_64_fastpath+0x18/0xad
      --
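
      A minimal sketch of the resulting decision in the queue-ready check
      (simplified: the real code also lets fabrics connect commands through
      on a non-LIVE queue):
      --
      static blk_status_t nvme_rdma_queue_is_ready(struct nvme_rdma_queue *queue,
                      struct request *rq)
      {
              if (test_bit(NVME_RDMA_Q_LIVE, &queue->flags))
                      return BLK_STS_OK;

              switch (queue->ctrl->ctrl.state) {
              case NVME_CTRL_RECONNECTING:    /* may last a long time */
              case NVME_CTRL_DELETING:        /* will never accept commands again */
                      nvme_req(rq)->status = NVME_SC_ABORT_REQ;
                      return BLK_STS_IOERR;   /* fail fast / permanently */
              default:
                      return BLK_STS_RESOURCE; /* block layer retries soon */
              }
      }
      --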
      Reported-by: Bart Van Assche <bart.vanassche@wdc.com>
      Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
  9. 19 October 2017, 12 commits
  10. 26 September 2017, 2 commits
  11. 30 August 2017, 1 commit
  12. 29 August 2017, 5 commits