1. 25 Sep, 2017 2 commits
    • nvme: allow timed-out ios to retry · 0951338d
      James Smart committed
      Currently nvme_req_needs_retry() applies several checks to see if
      a retry is allowed. One of those is whether the current time has exceeded
      the start time of the io plus the timeout length. This check means that
      an io that times out is never allowed a retry, so applications see the
      io failure.
      
      Remove this check so that the io can time out, as it does on other
      protocols, and then be retried.
      
      On the FC transport, a frame can be lost for an individual io, and there
      may be no other errors that escalate for the connection/association.
      The io will time out, which causes the transport to escalate into creating
      a new association, but the io that timed out, due to this retry logic, has
      already failed back to the application and things are hosed.
      Signed-off-by: James Smart <james.smart@broadcom.com>
      Reviewed-by: Keith Busch <keith.busch@intel.com>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
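The effect of dropping the elapsed-time check can be sketched in user-space C. This is only an illustration of the idea in the commit message, not the kernel code: `struct io_req`, its fields, `MAX_RETRIES`, and both function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the per-request state the checks look at. */
struct io_req {
    bool no_retry;          /* submitter asked for fail-fast behaviour */
    bool do_not_retry;      /* completion status forbids a retry */
    unsigned retries;       /* attempts already made */
    unsigned long start;    /* tick at which the io was started */
    unsigned long timeout;  /* allowed duration, in ticks */
};

enum { MAX_RETRIES = 5 };   /* illustrative limit, not a kernel value */

/* Before: an io that has outlived its timeout is never retried. */
bool needs_retry_old(const struct io_req *r, unsigned long now)
{
    if (r->no_retry || r->do_not_retry)
        return false;
    if (now - r->start >= r->timeout)
        return false;
    return r->retries < MAX_RETRIES;
}

/* After: the elapsed-time check is gone, so a timed-out io can retry. */
bool needs_retry_new(const struct io_req *r)
{
    if (r->no_retry || r->do_not_retry)
        return false;
    return r->retries < MAX_RETRIES;
}
```

With an io started at tick 100 and a 30-tick timeout, the old predicate refuses a retry at tick 140 while the new one still allows it, which is the behaviour change the commit describes.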
    • nvme: stop aer posting if controller state not live · cd48282c
      James Smart committed
      If an nvme async_event command completes, in most cases, a new
      async event is posted. However, if the controller enters a
      resetting or reconnecting state, there is nothing to block the
      scheduled work element from posting the async event again. Nor are
      there calls from the transport to stop async events when an
      association dies.
      
      In the case of FC, where the association is torn down, the aer must
      be aborted on the FC link and completes through the normal job
      completion path. Thus the terminated async event ends up being
      rescheduled even though the controller isn't in a valid state for
      the aer, and the reposting gets the transport into a partially torn
      down data structure.
      
      It's possible to hit the scenario on rdma, although it is much less
      likely: it requires an aer completing right as the association is
      terminated, and the association teardown reclaims the blk requests via
      nvme_cancel_request(), so it's immediate, not a link-related action
      like on FC.
      
      Fix by putting controller state checks in both the async event
      completion routine where it schedules the async event and in the
      async event work routine before it calls into the transport. It's
      effectively a "stop_async_events()" behavior.  The transport, when
      it creates a new association with the subsystem, will transition
      the state back to live and is already restarting the async event
      posting.
      Signed-off-by: James Smart <james.smart@broadcom.com>
      [hch: remove taking a lock over reading the controller state]
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
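The two state checks described in this commit can be modelled roughly as follows. This is a simplified user-space sketch, not the driver code: `struct ctrl`, the `CTRL_*` states, and both function names are hypothetical, and a boolean flag stands in for the scheduled work element.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical controller states; only "live" may post async events. */
enum ctrl_state { CTRL_LIVE, CTRL_RESETTING, CTRL_RECONNECTING };

struct ctrl {
    enum ctrl_state state;
    bool work_scheduled;    /* stands in for the queued work element */
    unsigned submitted;     /* async event commands handed to the transport */
};

/* Completion routine: re-arm the async event only while live. */
void async_event_complete(struct ctrl *c)
{
    if (c->state == CTRL_LIVE)
        c->work_scheduled = true;
}

/* Work routine: re-check the state before calling into the transport. */
void async_event_work(struct ctrl *c)
{
    if (!c->work_scheduled)
        return;
    c->work_scheduled = false;
    if (c->state != CTRL_LIVE)
        return;             /* association is going away: do not post */
    c->submitted++;
}
```

The second check matters because the state can change between the completion scheduling the work and the work actually running.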
  2. 12 Sep, 2017 2 commits
  3. 30 Aug, 2017 3 commits
  4. 29 Aug, 2017 10 commits
  5. 24 Aug, 2017 1 commit
    • block: replace bi_bdev with a gendisk pointer and partitions index · 74d46992
      Christoph Hellwig committed
      This way we don't need a block_device structure to submit I/O.  The
      block_device has different lifetime rules from the gendisk and
      request_queue and is usually only available when the block device node
      is open.  Other callers need to explicitly create one (e.g. the lightnvm
      passthrough code, or the new nvme multipathing code).
      
      For the actual I/O path all that we need is the gendisk, which exists
      once per block device.  But given that the block layer also does
      partition remapping we additionally need a partition index, which is
      used for said remapping in generic_make_request.
      
      Note that all the block drivers generally want request_queue or
      sometimes the gendisk, so this removes a layer of indirection all
      over the stack.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
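The idea of addressing I/O by a disk pointer plus a partition index, with the partition remapped to a whole-disk offset before dispatch, can be illustrated with a small user-space model. All types and names here (`struct part`, `struct disk`, `struct io`, `remap_to_whole_disk`) are hypothetical and much simpler than the kernel's structures.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical partition entry: index 0 describes the whole disk. */
struct part {
    unsigned long long start_sect;
    unsigned long long nr_sects;
};

struct disk {
    const struct part *parts;
    size_t nr_parts;
};

/* An I/O names its target by disk pointer and partition index only. */
struct io {
    const struct disk *disk;
    unsigned partno;
    unsigned long long sector;  /* relative to the partition */
};

/* Turn a partition-relative I/O into a whole-disk I/O.
 * Returns 0 on success, -1 if the partition or sector is out of range. */
int remap_to_whole_disk(struct io *io)
{
    if (io->partno == 0)
        return 0;               /* already whole-disk */
    if (io->partno >= io->disk->nr_parts)
        return -1;
    const struct part *p = &io->disk->parts[io->partno];
    if (io->sector >= p->nr_sects)
        return -1;
    io->sector += p->start_sect;
    io->partno = 0;
    return 0;
}
```

After remapping, a driver in this model only ever sees the disk and an absolute sector, which mirrors the commit's point that the I/O path needs no separate block-device object.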
  6. 11 Aug, 2017 2 commits
  7. 10 Aug, 2017 1 commit
  8. 26 Jul, 2017 1 commit
    • nvme: validate admin queue before unquiesce · 7dd1ab16
      Scott Bauer committed
      With a misbehaving controller it's possible we'll never
      enter the live state and create an admin queue. When we
      fail out of reset work it's possible we failed out early
      enough without setting up the admin queue. We tear down
      queues after a failed reset, but needed to do some more
      sanitization.
      
      Fixes 443bd90f: "nvme: host: unquiesce queue in nvme_kill_queues()"
      
      [  189.650995] nvme nvme1: pci function 0000:0b:00.0
      [  317.680055] nvme nvme0: Device not ready; aborting reset
      [  317.680183] nvme nvme0: Removing after probe failure status: -19
      [  317.681258] kasan: GPF could be caused by NULL-ptr deref or user memory access
      [  317.681397] general protection fault: 0000 [#1] SMP KASAN
      [  317.682984] CPU: 3 PID: 477 Comm: kworker/3:2 Not tainted 4.13.0-rc1+ #5
      [  317.683112] Hardware name: Gigabyte Technology Co., Ltd. Z170X-UD5/Z170X-UD5-CF, BIOS F5 03/07/2016
      [  317.683284] Workqueue: events nvme_remove_dead_ctrl_work [nvme]
      [  317.683398] task: ffff8803b0990000 task.stack: ffff8803c2ef0000
      [  317.683516] RIP: 0010:blk_mq_unquiesce_queue+0x2b/0xa0
      [  317.683614] RSP: 0018:ffff8803c2ef7d40 EFLAGS: 00010282
      [  317.683716] RAX: dffffc0000000000 RBX: 0000000000000000 RCX: 1ffff1006fbdcde3
      [  317.683847] RDX: 0000000000000038 RSI: 1ffff1006f5a9245 RDI: 0000000000000000
      [  317.683978] RBP: ffff8803c2ef7d58 R08: 1ffff1007bcdc974 R09: 0000000000000000
      [  317.684108] R10: 1ffff1007bcdc975 R11: 0000000000000000 R12: 00000000000001c0
      [  317.684239] R13: ffff88037ad49228 R14: ffff88037ad492d0 R15: ffff88037ad492e0
      [  317.684371] FS:  0000000000000000(0000) GS:ffff8803de6c0000(0000) knlGS:0000000000000000
      [  317.684519] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [  317.684627] CR2: 0000002d1860c000 CR3: 000000045b40d000 CR4: 00000000003406e0
      [  317.684758] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      [  317.684888] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      [  317.685018] Call Trace:
      [  317.685084]  nvme_kill_queues+0x4d/0x170 [nvme_core]
      [  317.685185]  nvme_remove_dead_ctrl_work+0x3a/0x90 [nvme]
      [  317.685289]  process_one_work+0x771/0x1170
      [  317.685372]  worker_thread+0xde/0x11e0
      [  317.685452]  ? pci_mmcfg_check_reserved+0x110/0x110
      [  317.685550]  kthread+0x2d3/0x3d0
      [  317.685617]  ? process_one_work+0x1170/0x1170
      [  317.685704]  ? kthread_create_on_node+0xc0/0xc0
      [  317.685785]  ret_from_fork+0x25/0x30
      [  317.685798] Code: 0f 1f 44 00 00 55 48 b8 00 00 00 00 00 fc ff df 48 89 e5 41 54 4c 8d a7 c0 01 00 00 53 48 89 fb 4c 89 e2 48 c1 ea 03 48 83 ec 08 <80> 3c 02 00 75 50 48 8b bb c0 01 00 00 e8 33 8a f9 00 0f ba b3
      [  317.685872] RIP: blk_mq_unquiesce_queue+0x2b/0xa0 RSP: ffff8803c2ef7d40
      [  317.685908] ---[ end trace a3f8704150b1e8b4 ]---
      Signed-off-by: Scott Bauer <scott.bauer@intel.com>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
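The shape of the guard this commit describes, checking that the admin queue exists before unquiescing it, might look like the following user-space sketch. The types and function names are hypothetical stand-ins, not the kernel's definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for a request queue and a controller. */
struct queue {
    bool quiesced;
};

struct ctrl {
    struct queue *admin_q;  /* NULL if reset failed before queue setup */
};

void unquiesce(struct queue *q)
{
    q->quiesced = false;    /* would fault if q were NULL */
}

/* Teardown path: only touch the admin queue if it was ever created. */
void kill_queues(struct ctrl *c)
{
    if (c->admin_q)
        unquiesce(c->admin_q);
}
```

Calling `kill_queues()` on a controller whose `admin_q` is `NULL` is then a no-op rather than a dereference of an invalid pointer, which is the failure shown in the trace above.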
  9. 25 Jul, 2017 1 commit
  10. 20 Jul, 2017 1 commit
  11. 06 Jul, 2017 2 commits
  12. 28 Jun, 2017 5 commits
  13. 19 Jun, 2017 2 commits
  14. 16 Jun, 2017 1 commit
  15. 15 Jun, 2017 6 commits