1. 26 March 2020 (5 commits)
  2. 05 March 2020 (1 commit)
  3. 15 February 2020 (2 commits)
    • nvme: prevent warning triggered by nvme_stop_keep_alive · 97b2512a
      Authored by Nigel Kirkland
      Delayed keep-alive work is queued on the system workqueue and may be cancelled
      via nvme_stop_keep_alive() from nvme_reset_wq, nvme_fc_wq or nvme_wq.
      
      check_flush_dependency() detects mismatched attributes between the workqueue
      context used to cancel the keep-alive work and the system workqueue.
      Specifically, the system workqueue does not have the WQ_MEM_RECLAIM flag,
      whereas the contexts used to cancel the keep-alive work do have it.
      
      Example warning:
      
        workqueue: WQ_MEM_RECLAIM nvme-reset-wq:nvme_fc_reset_ctrl_work [nvme_fc]
      	is flushing !WQ_MEM_RECLAIM events:nvme_keep_alive_work [nvme_core]
      
      To avoid the flag mismatch, the delayed keep-alive work is now queued on nvme_wq.
      
      However, this creates a secondary concern: work and a request to cancel that
      work may end up on the same workqueue, namely err_work in the rdma and tcp
      transports, which needs to flush/cancel the keep-alive work that now lives on
      nvme_wq.
      
      After reviewing the transports, err_work can be moved to nvme_reset_wq. That
      also aligns it better with the transition into RESETTING and with the related
      reset work already performed on nvme_reset_wq.
      
      Change nvme-rdma and nvme-tcp to run err_work on nvme_reset_wq (a sketch of
      the resulting queuing follows this entry).
      Signed-off-by: Nigel Kirkland <nigel.kirkland@broadcom.com>
      Signed-off-by: James Smart <jsmart2021@gmail.com>
      Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Keith Busch <kbusch@kernel.org>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      97b2512a
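      To make the described change concrete, here is a minimal sketch of the
      queuing involved, assuming the ka_work/err_work fields and the
      nvme_wq/nvme_reset_wq workqueues as they are named in the nvme drivers;
      this is an illustration of the approach, not a verbatim quote of the patch:

        /* Keep-alive work is queued on nvme_wq instead of the system workqueue. */
        static void nvme_queue_keep_alive_work(struct nvme_ctrl *ctrl)
        {
            /* previously: schedule_delayed_work(&ctrl->ka_work, ctrl->kato * HZ); */
            queue_delayed_work(nvme_wq, &ctrl->ka_work, ctrl->kato * HZ);
        }

        /*
         * Transport error recovery is queued on nvme_reset_wq, which has
         * WQ_MEM_RECLAIM set, so it may safely flush/cancel the keep-alive
         * work that now runs on nvme_wq.
         */
        static void nvme_tcp_error_recovery(struct nvme_ctrl *ctrl)
        {
            if (!nvme_change_ctrl_state(ctrl, NVME_CTRL_RESETTING))
                return;
            /* previously: queue_work(nvme_wq, &to_tcp_ctrl(ctrl)->err_work); */
            queue_work(nvme_reset_wq, &to_tcp_ctrl(ctrl)->err_work);
        }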
    • nvme/tcp: fix bug on double requeue when send fails · 2d570a7c
      Authored by Anton Eidelman
      When nvme_tcp_io_work() fails to send to the socket because the connection
      was closed or reset, error recovery work is triggered from the
      nvme_tcp_state_change() socket callback. This cancels all active requests
      in the tagset, which requeues them.
      
      The failed request, however, was also ended and thus requeued individually,
      unless the send returned -EPIPE. Another return code that should be treated
      the same way is -ECONNRESET.
      
      The double requeue triggered BUG_ON(blk_queued_rq(rq)) in
      blk_mq_requeue_request(), hit either by the individual requeue of the failed
      request or by the bulk requeue from
      blk_mq_tagset_busy_iter(..., nvme_cancel_request, ...). A sketch of the
      fixed send-error path follows this entry.
      Signed-off-by: Anton Eidelman <anton@lightbitslabs.com>
      Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
      Signed-off-by: Keith Busch <kbusch@kernel.org>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      2d570a7c
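      A minimal sketch of the send-error handling described above, assuming the
      nvme_tcp_fail_request()/nvme_tcp_done_send_req() helpers of the tcp
      transport; illustrative only, not a verbatim quote of the patch:

        /*
         * Send-error path in nvme_tcp_io_work(): if the peer closed or reset
         * the connection (-EPIPE or -ECONNRESET), do not fail (and thereby
         * requeue) the request here; error recovery will cancel and requeue
         * all active requests exactly once.
         */
        if (unlikely(result < 0)) {
            dev_err(queue->ctrl->ctrl.device,
                "failed to send request %d\n", result);
            if (result != -EPIPE && result != -ECONNRESET)
                nvme_tcp_fail_request(queue->request);
            nvme_tcp_done_send_req(queue);
            return;
        }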
  4. 05 November 2019 (1 commit)
  5. 29 October 2019 (1 commit)
  6. 15 October 2019 (1 commit)
  7. 14 October 2019 (2 commits)
  8. 26 September 2019 (1 commit)
  9. 12 September 2019 (2 commits)
  10. 30 August 2019 (8 commits)
  11. 05 August 2019 (1 commit)
  12. 10 July 2019 (1 commit)
  13. 31 May 2019 (1 commit)
    • nvme-tcp: fix queue mapping when queue count is limited · 64861993
      Authored by Sagi Grimberg
      When the controller supports fewer queues than requested, make sure that
      the queue mapping does the right thing instead of assuming all requested
      queues are available. This fixes a crash in that case.
      
      The rules are:
      1. If no write queues are requested, we assign the available queues
         to the default queue map. The default and read queue maps share the
         existing queues.
      2. If write queues are requested:
        - first make sure that the read queue map gets the requested
          nr_io_queues count
        - then grant the default queue map the minimum of the requested
          nr_write_queues and the remaining queues. If there are no queues
          left to dedicate to the default queue map, fall back to (1) and
          share all the queues in the existing queue map.
      
      Also, provide a log indication of how the different queue maps were
      constructed. A sketch of the mapping logic follows this entry.
      Reported-by: Harris, James R <james.r.harris@intel.com>
      Tested-by: Jim Harris <james.r.harris@intel.com>
      Cc: <stable@vger.kernel.org> # v5.0+
      Suggested-by: Roy Shterman <roys@lightbitslabs.com>
      Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
      64861993
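      A minimal sketch of the mapping rules above, assuming an io_queues[] count
      array indexed by HCTX_TYPE_* as used by the tcp transport; illustrative
      only, not a verbatim quote of the patch:

        /*
         * Distribute the nr_io_queues actually granted by the controller
         * according to the rules above: dedicated read/default maps only
         * when enough queues are available, otherwise share them.
         */
        static void nvme_tcp_set_io_queues(struct nvme_ctrl *nctrl,
                unsigned int nr_io_queues)
        {
            struct nvme_tcp_ctrl *ctrl = to_tcp_ctrl(nctrl);
            struct nvmf_ctrl_options *opts = nctrl->opts;

            if (opts->nr_write_queues && opts->nr_io_queues < nr_io_queues) {
                /* Rule 2: the read map gets nr_io_queues first, then the
                 * default map gets what is left, capped at nr_write_queues. */
                ctrl->io_queues[HCTX_TYPE_READ] = opts->nr_io_queues;
                nr_io_queues -= ctrl->io_queues[HCTX_TYPE_READ];
                ctrl->io_queues[HCTX_TYPE_DEFAULT] =
                    min(opts->nr_write_queues, nr_io_queues);
            } else {
                /* Rule 1 (or fallback): no dedicated write queues, so the
                 * default and read maps share the available queues. */
                ctrl->io_queues[HCTX_TYPE_DEFAULT] =
                    min(opts->nr_io_queues, nr_io_queues);
            }
        }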
  14. 01 May 2019 (1 commit)
  15. 25 April 2019 (2 commits)
  16. 29 March 2019 (1 commit)
  17. 14 March 2019 (1 commit)
  18. 04 February 2019 (1 commit)
  19. 24 January 2019 (1 commit)
    • nvme-tcp: fix timeout handler · 39d57757
      Authored by Sagi Grimberg
      Currently, we have several problems with the timeout handler:
      1. If we time out during the controller establishment flow, we hang because
      error recovery is not executed (and it shouldn't be, because the create_ctrl
      flow needs to fail and clean up on its own).
      2. We might also hang if we get a disconnect on a queue while the controller
      is already deleting. This racy flow can cause the controller disable/shutdown
      admin command to hang.
      
      We cannot complete a timed-out request from the timeout handler without
      mutual exclusion from the teardown flow (e.g. nvme_rdma_error_recovery_work).
      So we serialize it in the timeout handler and tear down the io and admin
      queues to guarantee that no one races with us in completing the request.
      A sketch of the resulting timeout handler follows this entry.
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      39d57757
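      A minimal sketch of the serialized timeout handling described above,
      assuming the nvme_tcp_teardown_io_queues()/nvme_tcp_teardown_admin_queue()
      helpers of the tcp transport; illustrative only, not a verbatim quote of
      the patch:

        /*
         * If the controller is not LIVE (connect flow or teardown in progress),
         * tear down the queues right here so the timed-out request is completed
         * without racing the error recovery work; otherwise kick error recovery
         * and rearm the timer.
         */
        static enum blk_eh_timer_return
        nvme_tcp_timeout(struct request *rq, bool reserved)
        {
            struct nvme_tcp_request *req = blk_mq_rq_to_pdu(rq);
            struct nvme_tcp_ctrl *ctrl = req->queue->ctrl;

            if (ctrl->ctrl.state != NVME_CTRL_LIVE) {
                nvme_tcp_teardown_io_queues(&ctrl->ctrl, false);
                nvme_tcp_teardown_admin_queue(&ctrl->ctrl, false);
                return BLK_EH_DONE;    /* request completed by the teardown */
            }

            nvme_tcp_error_recovery(&ctrl->ctrl);
            return BLK_EH_RESET_TIMER;
        }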
  20. 10 January 2019 (2 commits)
  21. 19 December 2018 (3 commits)
  22. 13 December 2018 (1 commit)