1. 27 Oct 2017, 9 commits
  2. 23 Oct 2017, 2 commits
  3. 20 Oct 2017, 4 commits
    • nvme-fc: correct io timeout behavior · 134aedc9
      Committed by James Smart
The transport io timeout behavior wasn't quite correct. It ignored
      the fact that the io error handler is supposed to be synchronous, so
      it possibly allowed the blk request to be restarted while the
      associated io was still aborting. Reserved commands, those used for
      association create, never timed out, so they hung forever.
      
      To correct:
If an io times out while a remoteport is not connected, just
      restart the io timer. The lack of connectivity will simultaneously
      be resetting the controller, so the reset path will abort and terminate
      the io.
      
If an io times out while it was marked for transport abort, just
      reset the io timer. The abort process is underway and will complete
      the io.
      
      Otherwise, if an io times out, abort the io. If the abort was
      unsuccessful (unlikely) give up and return not handled.
      
      If the abort was successful, as the abort process is underway it will
      terminate the io, so rather than synchronously waiting, just restart
      the io timer.
Signed-off-by: James Smart <james.smart@broadcom.com>
      Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
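      The decision tree in the message above maps onto the blk-mq timeout
      hook. A minimal C sketch, assuming hypothetical helpers
      port_is_connected(), io_marked_for_abort() and transport_abort_io()
      in place of the driver's actual symbols (the blk_eh_* return codes
      are the ones from this kernel era):

          static enum blk_eh_timer_return
          fc_io_timeout(struct request *rq, bool reserved)
          {
                  /* Remoteport not connected: a controller reset is in
                   * flight and will abort and terminate the io; rearm. */
                  if (!port_is_connected(rq))
                          return BLK_EH_RESET_TIMER;

                  /* Already marked for transport abort: the abort process
                   * is underway and will complete the io; rearm. */
                  if (io_marked_for_abort(rq))
                          return BLK_EH_RESET_TIMER;

                  /* Otherwise abort the io; if that (unlikely) fails,
                   * give up and report not handled. */
                  if (transport_abort_io(rq))
                          return BLK_EH_NOT_HANDLED;

                  /* The abort will terminate the io asynchronously, so
                   * rearm rather than wait synchronously. */
                  return BLK_EH_RESET_TIMER;
          }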
    • nvme-fc: correct io termination handling · 0a02e39f
      Committed by James Smart
The io completion handling for i/o's that are failing due to
      a transport error or association termination had issues, causing
      io failures (DNR was set, so retries didn't kick in) or long stalls.
      
      Change the io completion handler for the following items:
      
      When an io has been completed due to a transport abort (based on an
      exchange error) or when marked as aborted as part of an association
      termination (FCOP_FLAGS_TERMIO), set the NVME completion status to
      NVME_SC_ABORTED. By default, do not set DNR on the status so that a
      retry can be attempted after association recreate.
      
      In cases where an io is failed (non-successful nvme status including
      aborted), if the controller is being deleted (blk_queue_dying) or
      the io was part of the ios used for association creation (ctrl state
      is NEW or RECONNECTING), then additionally set the DNR bit so the io
      will not be retried. If the failed io was part of association creation,
the failure will tear down the partially completed association and
      typically restart a new reconnect attempt (another create association
      later).
      
      Rearranged code flow to remove a largely unneeded local variable.
Signed-off-by: James Smart <james.smart@broadcom.com>
      Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
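      A condensed sketch of the completion policy described above; field
      and helper names are assumptions (the status name NVME_SC_ABORTED is
      taken from the message), not the driver's exact symbols:

          /* Transport abort or association-termination abort: complete
           * with ABORTED and, by default, no DNR, so the io can be
           * retried after the association is recreated. */
          if (transport_aborted_io || (op->flags & FCOP_FLAGS_TERMIO))
                  status = NVME_SC_ABORTED;

          /* Failed io while deleting the controller, or an io that was
           * part of association creation: add DNR so it is not retried. */
          if (status != 0 &&
              (blk_queue_dying(rq->q) ||
               ctrl->state == NVME_CTRL_NEW ||
               ctrl->state == NVME_CTRL_RECONNECTING))
                  status |= NVME_SC_DNR;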
    • nvme-pci: add SGL support · a7a7cbe3
      Committed by Chaitanya Kulkarni
This adds SGL support to the NVMe PCIe driver, based on an earlier patch
      from Rajiv Shanmugam Madeswaran <smrajiv15 at gmail.com>. This patch
      refactors the original code and adds a new module parameter,
      sgl_threshold, to determine whether to use SGLs or PRPs for IOs.
      
The usage of SGLs is controlled by the sgl_threshold module parameter,
      which enables SGLs conditionally when the average request segment
      size (avg_seg_size) is greater than or equal to sgl_threshold. In the
      original patch, the decision to use SGLs depended only on the IO
      size; the new approach considers not only the IO size but also the
      number of physical segments present in the IO.
      
      We calculate avg_seg_size from the request payload bytes and the
      number of physical segments present in the request.
      
For example:
      
      1. blk_rq_nr_phys_segments = 2, blk_rq_payload_bytes = 8k:
         avg_seg_size = 4K; use SGL if avg_seg_size >= sgl_threshold.
      
      2. blk_rq_nr_phys_segments = 2, blk_rq_payload_bytes = 64k:
         avg_seg_size = 32K; use SGL if avg_seg_size >= sgl_threshold.
      
      3. blk_rq_nr_phys_segments = 16, blk_rq_payload_bytes = 64k:
         avg_seg_size = 4K; use SGL if avg_seg_size >= sgl_threshold.
Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
      Reviewed-by: Keith Busch <keith.busch@intel.com>
      Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
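      The avg_seg_size computation reduces to an integer division of the
      payload by the physical segment count. A minimal sketch of the
      decision (the real driver gates on more conditions than shown here):

          static bool use_sgl_for_io(struct request *req)
          {
                  unsigned int avg_seg_size;

                  /* average bytes per physical segment, rounded up */
                  avg_seg_size = DIV_ROUND_UP(blk_rq_payload_bytes(req),
                                              blk_rq_nr_phys_segments(req));

                  /* e.g. 64k payload in 16 segments -> 4K average */
                  return avg_seg_size >= sgl_threshold;
          }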
    • nvme: use ida_simple_{get,remove} for the controller instance · 9843f685
      Committed by Christoph Hellwig
Switch to the ida_simple_* helpers instead of open-coding them.
Signed-off-by: Christoph Hellwig <hch@lst.de>
      Reviewed-by: Keith Busch <keith.busch@intel.com>
      Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
      Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
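      The ida_simple_* pattern the commit switches to looks roughly like
      this (a sketch; using nvme_instance_ida as the controller-instance
      allocator is an assumption):

          static DEFINE_IDA(nvme_instance_ida);

          /* allocate the lowest free controller instance number */
          ret = ida_simple_get(&nvme_instance_ida, 0, 0, GFP_KERNEL);
          if (ret < 0)
                  return ret;
          ctrl->instance = ret;

          /* ... and release it when the controller goes away */
          ida_simple_remove(&nvme_instance_ida, ctrl->instance);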
4. 19 Oct 2017, 12 commits
  5. 16 Oct 2017, 1 commit
  6. 05 Oct 2017, 1 commit
  7. 04 Oct 2017, 4 commits
  8. 26 Sep 2017, 3 commits
  9. 25 Sep 2017, 4 commits
    • nvme-fabrics: Allow 0 as KATO value · 8edd11c9
      Committed by Guilherme G. Piccoli
Currently, the driver code allows the user to set 0 as the KATO
      (Keep Alive TimeOut) value, but this setting is not respected.
      This patch enforces the expected behavior.
Signed-off-by: Guilherme G. Piccoli <gpiccoli@linux.vnet.ibm.com>
      Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
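      A minimal sketch of the intended semantics, modeled on the nvme
      core's keep-alive start path (the function name is illustrative),
      where kato == 0 means keep-alive is disabled rather than defaulted:

          static void start_keep_alive(struct nvme_ctrl *ctrl)
          {
                  /* user explicitly chose 0: no keep-alives at all */
                  if (ctrl->kato == 0)
                          return;

                  /* otherwise arm the periodic keep-alive as usual */
                  schedule_delayed_work(&ctrl->ka_work, ctrl->kato * HZ);
          }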
    • nvme: allow timed-out ios to retry · 0951338d
      Committed by James Smart
Currently, nvme_req_needs_retry() applies several checks to decide
      whether a retry is allowed. One of those is whether the current time
      has exceeded the start time of the io plus the timeout length. This
      check means that if an io times out, a retry is never allowed for
      it, so applications see the io failure.
      
Remove this check and allow the io to time out, as it does on other
      protocols, so retries can be made.
      
On the FC transport, a frame can be lost for an individual io, and
      there may be no other errors that escalate for the
      connection/association. The io will time out, which causes the
      transport to escalate into creating a new association, but the io
      that timed out, due to this retry logic, has already failed back to
      the application and things are hosed.
Signed-off-by: James Smart <james.smart@broadcom.com>
      Reviewed-by: Keith Busch <keith.busch@intel.com>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
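      The removed check sits in nvme_req_needs_retry(). A sketch of the
      resulting logic, with field names as in the core code of this era
      (approximate, shown for illustration):

          static inline bool nvme_req_needs_retry(struct request *req)
          {
                  if (blk_noretry_request(req))
                          return false;
                  if (nvme_req(req)->status & NVME_SC_DNR)
                          return false;
                  /* removed: elapsed-time check that compared jiffies
                   * against the io start time plus the timeout, which is
                   * what forbade retries of timed-out ios */
                  if (nvme_req(req)->retries >= nvme_max_retries)
                          return false;
                  return true;
          }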
    • nvme: stop aer posting if controller state not live · cd48282c
      Committed by James Smart
      If an nvme async_event command completes, in most cases, a new
      async event is posted. However, if the controller enters a
      resetting or reconnecting state, there is nothing to block the
      scheduled work element from posting the async event again. Nor are
      there calls from the transport to stop async events when an
      association dies.
      
In the case of FC, where the association is torn down, the aer must
      be aborted on the FC link and completed through the normal job
      completion path. Thus the terminated async event ends up being
      rescheduled even though the controller isn't in a valid state for
      the aer, and the reposting gets the transport into a partially
      torn-down data structure.
      
It's possible to hit this scenario on rdma, although it is much less
      likely: it requires an aer to complete right as the association is
      terminated, and association teardown reclaims the blk requests via
      nvme_cancel_request(), so it is immediate, not a link-related action
      as on FC.
      
      Fix by putting controller state checks in both the async event
      completion routine where it schedules the async event and in the
      async event work routine before it calls into the transport. It's
effectively a "stop_async_events()" behavior. The transport, when
      it creates a new association with the subsystem, will transition
      the state back to live and restart the async event posting.
Signed-off-by: James Smart <james.smart@broadcom.com>
      [hch: remove taking a lock over reading the controller state]
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
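      A sketch of the "stop_async_events()" effect as a state gate in the
      async event work routine (approximate; the real routine also manages
      an event budget under a lock):

          static void nvme_async_event_work(struct work_struct *work)
          {
                  struct nvme_ctrl *ctrl =
                          container_of(work, struct nvme_ctrl,
                                       async_event_work);

                  /* Only post while the controller is LIVE; reset and
                   * reconnect states stop posting, and a new association
                   * transitions back to LIVE and restarts it. */
                  if (ctrl->state != NVME_CTRL_LIVE)
                          return;

                  ctrl->ops->submit_async_event(ctrl, 0 /* aer_idx */);
          }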
    • nvme-pci: Print invalid SGL only once · d0877473
      Committed by Keith Busch
      The WARN_ONCE macro returns true if the condition is true, not if the
      warn was raised, so we're printing the scatter list every time it's
      invalid. This is excessive and makes debugging harder, so this patch
      prints it just once.
Signed-off-by: Keith Busch <keith.busch@intel.com>
      Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
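      The pitfall in miniature: WARN_ONCE() evaluates to its condition, so
      a body guarded by it runs every time, not just when the one-time
      warning fires. A sketch of the broken pattern and a once-only
      variant (print_sgl() is a hypothetical helper):

          /* broken: the dump runs on every invalid SGL */
          if (WARN_ONCE(1, "Invalid SGL for payload:%d nents:%d\n",
                        payload, nents))
                  print_sgl(sg, nents);

          /* once-only: guard the dump with its own static flag */
          {
                  static bool dumped;

                  WARN_ONCE(1, "Invalid SGL for payload:%d nents:%d\n",
                            payload, nents);
                  if (!dumped) {
                          dumped = true;
                          print_sgl(sg, nents);
                  }
          }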