Commits · 9002c4e5ff006c62de09fe2b6966403bdf96afa1 · openeuler / Kernel

19 Dec, 2018 · 1 commit

nvme-fabrics: allow nvmf_connect_io_queue to poll · 26c68227

Committed by Sagi Grimberg on Dec 14, 2018

Preparation for polling support for fabrics. Polling support
means that our completion queues are not generating any interrupts
which means we need to poll for the nvmf io queue connect as well.

Reviewed-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Christoph Hellwig <hch@lst.de>

26c68227

08 Dec, 2018 · 1 commit

nvme: add a numa_node field to struct nvme_ctrl · 103e515e

Committed by Hannes Reinecke on Nov 16, 2018

Instead of directly poking into the struct device add a new numa_node
field to struct nvme_ctrl.  This allows fabrics drivers where ctrl->dev
is a virtual device to support NUMA affinity as well.

Also expose the field as a sysfs attribute, and populate it for the
RDMA and FC transports.
Signed-off-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

103e515e

28 Nov, 2018 · 1 commit

nvme-fc: initialize nvme_req(rq)->ctrl after calling __nvme_fc_init_request() · dfa74422

Committed by Ewan D. Milne on Nov 26, 2018

__nvme_fc_init_request() invokes memset() on the nvme_fcp_op_w_sgl structure, which
NULLed-out the nvme_req(req)->ctrl field previously set by nvme_fc_init_request().
This apparently was not referenced until commit faf4a44fff ("nvme: support traffic
based keep-alive") which now results in a crash in nvme_complete_rq():

[ 8386.897130] RIP: 0010:panic+0x220/0x26c
[ 8386.901406] Code: 83 3d 6f ee 72 01 00 74 05 e8 e8 54 02 00 48 c7 c6 40 fd 5b b4 48 c7 c7 d8 8d c6 b3 31e
[ 8386.922359] RSP: 0018:ffff99650019fc40 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff13
[ 8386.930804] RAX: 0000000000000046 RBX: 0000000000000000 RCX: 0000000000000006
[ 8386.938764] RDX: 0000000000000000 RSI: 0000000000000082 RDI: ffff8e325f8168b0
[ 8386.946725] RBP: ffff99650019fcb0 R08: 0000000000000000 R09: 00000000000004f8
[ 8386.954687] R10: 0000000000000000 R11: ffff99650019f9b8 R12: ffffffffb3c55f3c
[ 8386.962648] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000001
[ 8386.970613]  oops_end+0xd1/0xe0
[ 8386.974116]  no_context+0x1b2/0x3c0
[ 8386.978006]  do_page_fault+0x32/0x140
[ 8386.982090]  page_fault+0x1e/0x30
[ 8386.985786] RIP: 0010:nvme_complete_rq+0x65/0x1d0 [nvme_core]
[ 8386.992195] Code: 41 bc 03 00 00 00 74 16 0f 86 c3 00 00 00 66 3d 83 00 41 bc 06 00 00 00 0f 85 e7 00 000
[ 8387.013147] RSP: 0018:ffff99650019fe18 EFLAGS: 00010246
[ 8387.018973] RAX: 0000000000000000 RBX: ffff8e322ae51280 RCX: 0000000000000001
[ 8387.026935] RDX: 0000000000000400 RSI: 0000000000000001 RDI: ffff8e322ae51280
[ 8387.034897] RBP: ffff8e322ae51280 R08: 0000000000000000 R09: ffffffffb2f0b890
[ 8387.042859] R10: 0000000000000001 R11: 0000000000000000 R12: 0000000000000000
[ 8387.050821] R13: 0000000000000100 R14: 0000000000000004 R15: ffff8e2b0446d990
[ 8387.058782]  ? swiotlb_unmap_page+0x40/0x40
[ 8387.063448]  nvme_fc_complete_rq+0x2d/0x70 [nvme_fc]
[ 8387.068986]  blk_done_softirq+0xa1/0xd0
[ 8387.073264]  __do_softirq+0xd6/0x2a9
[ 8387.077251]  run_ksoftirqd+0x26/0x40
[ 8387.081238]  smpboot_thread_fn+0x10e/0x160
[ 8387.085807]  kthread+0xf8/0x130
[ 8387.089309]  ? sort_range+0x20/0x20
[ 8387.093198]  ? kthread_stop+0x110/0x110
[ 8387.097475]  ret_from_fork+0x35/0x40
[ 8387.101462] ---[ end trace 7106b0adf5e422f8 ]---

Fixes: faf4a44fff ("nvme: support traffic based keep-alive")
Signed-off-by: Ewan D. Milne <emilne@redhat.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>

dfa74422
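
The ordering problem described in this commit can be reproduced in a few lines of userspace C. This is only a sketch: `demo_ctrl`, `demo_op`, and the `demo_init_*` functions are hypothetical stand-ins, and the layout is assumed rather than taken from the driver. It shows a pointer stored before a whole-structure memset() being zeroed, and the same pointer surviving when it is stored afterwards.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins; not the kernel's nvme_ctrl or nvme_fcp_op_w_sgl. */
struct demo_ctrl {
	int instance;
};

struct demo_op {
	struct demo_ctrl *ctrl;	/* plays the role of nvme_req(rq)->ctrl */
	int state;
};

/* Inner init: wipes the whole structure, then sets its own fields. */
static void demo_init_inner(struct demo_op *op)
{
	memset(op, 0, sizeof(*op));
	op->state = 1;
}

/* Buggy order: the pointer is stored first and then zeroed by the memset(). */
void demo_init_buggy(struct demo_op *op, struct demo_ctrl *ctrl)
{
	assert(op != NULL);
	op->ctrl = ctrl;
	demo_init_inner(op);
}

/* Fixed order: store the pointer only after the memset() has run. */
void demo_init_fixed(struct demo_op *op, struct demo_ctrl *ctrl)
{
	assert(op != NULL);
	demo_init_inner(op);
	op->ctrl = ctrl;
}
```

After `demo_init_buggy()` the `ctrl` field is NULL, so any later dereference (as nvme_complete_rq() does in the trace above) faults; after `demo_init_fixed()` it still points at the controller.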

20 Nov, 2018 · 1 commit

nvme-fc: remove ->poll implementation · 92f806d6

Committed by Jens Axboe on Nov 19, 2018

It's specifically looking for a given request, which we will not be
supporting going forward. Also kill the qla2xxx poll implementation
as that's the only user of the nvme-fc poll, and the now unused
->poll_queue() hook.
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

92f806d6

15 Nov, 2018 · 1 commit

nvme-fc: resolve io failures during connect · 4cff280a

Committed by James Smart on Nov 14, 2018

If an io error occurs on an io issued while connecting, recovery
of the io falls flat as the state checking ends up nooping the error
handler.

Create an err_work work item that is scheduled upon an io error while
connecting. The work thread terminates all io on all queues and marks
the queues as not connected.  The termination of the io will return
back to the caller, which will then back out of the connection attempt
and will reschedule, if possible, the connection attempt.

The changes:
- in case there are several commands hitting the error handler, a
  state flag is kept so that the error work is only scheduled once,
  on the first error. The subsequent errors can be ignored.
- The calling sequence to stop keep alive and terminate the queues
  and their io is lifted from the reset routine. Made a small
  service routine used by both reset and err_work.
- During debugging, found that the teardown path can reference
  an uninitialized pointer, resulting in a NULL pointer oops.
  The aen_ops weren't initialized yet. Add validation on their
  initialization before calling the teardown routine.
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>

4cff280a
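
The "schedule the error work only once" rule from this commit can be sketched with a C11 atomic flag. The names below (`demo_ctrl`, `demo_error_recovery`, `demo_err_work_done`) are hypothetical, and the counter merely stands in for queueing a real work item; the driver's own flag and work-queue calls are not reproduced here.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical controller: one "error work pending" flag plus a counter
 * that stands in for actually queueing the work item. */
struct demo_ctrl {
	atomic_flag err_work_active;
	int err_work_scheduled;
};

void demo_ctrl_init(struct demo_ctrl *c)
{
	assert(c != NULL);
	atomic_flag_clear(&c->err_work_active);
	c->err_work_scheduled = 0;
}

/* Called for every io error. Only the first caller schedules the work;
 * later callers see the flag already set and return without doing anything. */
bool demo_error_recovery(struct demo_ctrl *c)
{
	assert(c != NULL);
	if (atomic_flag_test_and_set(&c->err_work_active))
		return false;		/* already scheduled, ignore this error */
	c->err_work_scheduled++;	/* stand-in for scheduling err_work */
	return true;
}

/* Called when the error work has finished, re-arming the flag. */
void demo_err_work_done(struct demo_ctrl *c)
{
	assert(c != NULL);
	atomic_flag_clear(&c->err_work_active);
}
```

Using an atomic test-and-set means several commands can hit the error handler concurrently without a lock, and exactly one of them wins the right to schedule the work.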

09 Nov, 2018 · 1 commit

blk-mq-tag: change busy_iter_fn to return whether to continue or not · 7baa8572

Committed by Jens Axboe on Nov 08, 2018

We have this functionality in sbitmap, but we don't export it in
blk-mq for users of the tags busy iteration. This can be useful
for stopping the iteration, if the caller doesn't need to find
more requests.
Reviewed-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

7baa8572
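
A callback that returns whether iteration should continue can be illustrated as follows. This is a userspace sketch with hypothetical names (`demo_iter_fn`, `demo_busy_iter`); it walks a plain array of busy flags rather than a real tag set, and is not the blk-mq signature.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical callback type: return true to keep iterating, false to stop. */
typedef bool (*demo_iter_fn)(int tag, void *data);

/* Visit every busy tag until the callback asks to stop.
 * Returns the number of busy tags the callback was invoked on. */
size_t demo_busy_iter(const bool *busy, size_t ntags, demo_iter_fn fn, void *data)
{
	size_t visited = 0;

	assert(busy != NULL && fn != NULL);
	for (size_t tag = 0; tag < ntags; tag++) {
		if (!busy[tag])
			continue;
		visited++;
		if (!fn((int)tag, data))
			break;		/* caller has what it needs */
	}
	return visited;
}

/* Example callback: record the first busy tag and stop. */
bool demo_find_first(int tag, void *data)
{
	*(int *)data = tag;
	return false;
}

/* Example callback: count every busy tag and never stop early. */
bool demo_count_all(int tag, void *data)
{
	(void)tag;
	(*(int *)data)++;
	return true;
}
```

With a void callback, a caller that only needs one matching request still pays for a full walk; the boolean return lets it bail out as soon as it has found that request.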

02 Nov, 2018 · 1 commit

nvme-fc: fix request private initialization · d19b8bc8

Committed by James Smart on Oct 27, 2018

The patch made to avoid Coverity reporting of out of bounds access
on aen_op moved the assignment of a pointer, leaving it null when it
was subsequently used to calculate a private pointer. Thus the private
pointer was bad.

Move/correct the private pointer initialization to be in sync with the
patch.

Fixes: 0d2bdf9f ("nvme-fc: rework the request initialization code")
Signed-off-by: James Smart <jsmart2021@gmail.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

d19b8bc8

17 Oct, 2018 · 3 commits

nvme-fc: rework the request initialization code · 0d2bdf9f

Committed by Bart Van Assche on Oct 08, 2018

Instead of setting and then clearing the first_sgl pointer for AEN requests,
leave that pointer zero. This patch does not change how requests are
initialized but avoids that Coverity reports the following complaint for
nvme_fc_init_aen_ops():

CID 1418400 (#1 of 1): Out-of-bounds access (OVERRUN)
4. overrun-buffer-val: Overrunning buffer pointed to by aen_op of 312 bytes by passing it to a function which accesses it at byte offset 312.
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>

0d2bdf9f

nvme-fc: introduce struct nvme_fcp_op_w_sgl · d3d0bc78

Committed by Bart Van Assche on Oct 08, 2018

This patch does not change any functionality but makes the intent of the
code more clear.
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>

d3d0bc78

nvme-fc: fix kernel-doc headers · 76c910c7

Committed by Bart Van Assche on Oct 08, 2018

This patch avoids that the kernel-doc tool complains about several
multiple function headers when building with W=1.
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>

76c910c7

02 Oct, 2018 · 2 commits

nvme_fc: add 'nvme_discovery' sysfs attribute to fc transport device · 97faec53

Committed by James Smart on Sep 13, 2018

The fc transport device should allow for a rediscovery, as userspace
might have lost the events. Example is udev events not handled during
system startup.

This patch adds a sysfs entry 'nvme_discovery' on the fc class to
have it replay all udev discovery events for all local port/remote
port address pairs.
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>

97faec53

nvme-fc: fix for a minor typos · d4e4230c

Committed by Milan P. Gandhi on Aug 10, 2018

Signed-off-by: Milan P. Gandhi <mgandhi@redhat.com>
Reviewed-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>

d4e4230c

24 Jul, 2018 · 1 commit

nvme: if_ready checks to fail io to deleting controller · 6cdefc6e

Committed by James Smart on Jul 20, 2018

The revised if_ready checks skipped over the case of returning error when
the controller is being deleted.  Instead it was returning BUSY, which
caused the ios to retry, which caused the ns delete to hang waiting for
the ios to drain.

Stack trace of hang looks like:
 kworker/u64:2   D    0    74      2 0x80000000
 Workqueue: nvme-delete-wq nvme_delete_ctrl_work [nvme_core]
 Call Trace:
  ? __schedule+0x26d/0x820
  schedule+0x32/0x80
  blk_mq_freeze_queue_wait+0x36/0x80
  ? remove_wait_queue+0x60/0x60
  blk_cleanup_queue+0x72/0x160
  nvme_ns_remove+0x106/0x140 [nvme_core]
  nvme_remove_namespaces+0x7e/0xa0 [nvme_core]
  nvme_delete_ctrl_work+0x4d/0x80 [nvme_core]
  process_one_work+0x160/0x350
  worker_thread+0x1c3/0x3d0
  kthread+0xf5/0x130
  ? process_one_work+0x350/0x350
  ? kthread_bind+0x10/0x10
  ret_from_fork+0x1f/0x30

Extend nvmf_fail_nonready_command() to supply the controller pointer so
that the controller state can be looked at. Fail any io to a controller
that is deleting.

Fixes: 3bc32bb1 ("nvme-fabrics: refactor queue ready check")
Fixes: 35897b92 ("nvme-fabrics: fix and refine state checks in __nvmf_check_ready")
Signed-off-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Tested-by: Ewan D. Milne <emilne@redhat.com>
Reviewed-by: Ewan D. Milne <emilne@redhat.com>

6cdefc6e
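
The decision this commit describes, failing io to a deleting controller instead of returning BUSY, can be sketched as a small pure function. The enums and the `demo_fail_nonready()` name are hypothetical; the failfast and multipath conditions are borrowed from the description in bb06ec31 further down this log and are an assumption about how the pieces fit together, not a copy of nvmf_fail_nonready_command().

```c
#include <stdbool.h>

/* Hypothetical controller states and return codes for illustration only. */
enum demo_ctrl_state {
	DEMO_CTRL_LIVE,
	DEMO_CTRL_RESETTING,
	DEMO_CTRL_CONNECTING,
	DEMO_CTRL_DELETING
};

enum demo_status {
	DEMO_STS_BUSY,	/* block layer will retry the io later */
	DEMO_STS_IOERR	/* io is failed back to the submitter */
};

/* Decide what to do with a command that arrived on a queue that is not ready.
 * state:     current controller state
 * failfast:  the request is marked as not to be retried
 * multipath: the request can be retried on another path by the submitter */
enum demo_status demo_fail_nonready(enum demo_ctrl_state state,
				    bool failfast, bool multipath)
{
	/* A deleting controller never becomes ready again, so retrying would
	 * only keep the queue from draining. */
	if (state != DEMO_CTRL_DELETING && !failfast && !multipath)
		return DEMO_STS_BUSY;
	return DEMO_STS_IOERR;
}
```

Passing the controller state into the helper is what makes the DELETING case visible; without it the helper can only answer BUSY, which produces the hang shown in the stack trace above.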

23 Jul, 2018 · 1 commit

nvme: cache struct nvme_ctrl reference to struct nvme_request · 59e29ce6

Committed by Sagi Grimberg on Jun 29, 2018

We will need to reference the controller in the setup and completion
time for tracing and future traffic based keep alive support.
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Christoph Hellwig <hch@lst.de>

59e29ce6

21 Jun, 2018 · 1 commit

nvme-fc: release io queues to allow fast fail · 02d62a8b

Committed by James Smart on Jun 20, 2018

Rather than leaving io queues quiesced after tearing down an association,
restart them. This allows ios to be replayed, with fastfail ios terminating
and non-fastfail getting into loops of retry.

This follows rdma's lead.
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Sagi Grimberg <sagi@grimber.me>
Signed-off-by: Christoph Hellwig <hch@lst.de>

02d62a8b

15 Jun, 2018 · 1 commit

nvme-fabrics: refactor queue ready check · 3bc32bb1

Committed by Christoph Hellwig on Jun 11, 2018

Move the is_connected check to the fibre channel transport, as it has no
meaning for other transports.  To facilitate this split out a new
nvmf_fail_nonready_command helper that is called by the transport when
it is asked to handle a command on a queue that is not ready.

Also avoid a function call for the queue live fast path by inlining
the check.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: James Smart <james.smart@broadcom.com>

3bc32bb1

14 Jun, 2018 · 3 commits

nvme-fc: fix nulling of queue data on reconnect · 3e493c00

Committed by James Smart on Jun 13, 2018

The reconnect path is calling the init routines to clear a queue
structure. But the queue structure has state that perhaps needs
to persist as long as the controller is live.

Remove the nvme_fc_init_queue() calls on reconnect.
The nvme_fc_free_queue() calls will clear state bits and reset
any relevant queue state for a new connection.
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>

3e493c00

nvme-fc: remove reinit_request routine · 587331f7

Committed by James Smart on Jun 13, 2018

The reinit_request routine is not necessary. Remove support for the
op callback.

As all that nvme_reinit_tagset() does is iterate and call the
reinit routine, it too has no purpose. Remove the call.
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>

587331f7

nvme-fc: change controllers first connect to use reconnect path · 4c984154

Committed by James Smart on Jun 13, 2018

Current code follows the framework that has been in the transports
from the beginning where initial link-side controller connect occurs
as part of "creating the controller". Thus that first connect fully
talks to the controller and obtains values that can then be used
for blk-mq setup, etc. It also means that everything about the
controller is fully known before the "create controller" call returns.

This has several weaknesses:
- The initial create_ctrl call made by the cli will block for a long
  time as wire transactions are performed synchronously. This delay
  becomes longer if errors occur or connectivity is lost and retries
  need to be performed.
- Code wise, it means there is a separate connect path for initial
  controller connect vs the (same) steps used in the reconnect path.
- And as there's separate paths, it means there's separate error
  handling and retry logic. It also plays havoc with the NEW state
  (should transition out of it after successful initial connect) vs
  the RESETTING and CONNECTING (reconnect) states that want to be
  transitioned to on error.
- As there's separate paths, to recover from errors and disruptions,
  it requires separate recovery/retry paths as well and can severely
  convolute the controller state.

This patch reworks the fc transport to use the same connect paths
for the initial connection as it uses for reconnect. This makes a
single path for error recovery and handling.

This patch:
- Removes the driving of the initial connect and replaces it with
  a state transition to CONNECTING and initiating the reconnect
  thread. A dummy state transition of RESETTING had to be traversed
  as a direct transition of NEW->CONNECTING is not allowed. Given
  that the controller is "new", the RESETTING transition is a simple
  no-op. Once in the reconnecting thread, the normal behaviors of
  ctrl_loss_tmo (max_retries * connect_delay) and dev_loss_tmo will
  apply before the controller is torn down.
- Only if the state transitions couldn't be traversed and the
  reconnect thread not scheduled, will the controller be torn down
  while in create_ctrl.
- The prior code used the controller state of NEW to indicate
  whether request queues had been initialized or not. For the admin
  queue, the request queue is always created, so there's no need to
  check a state. For IO queues, change to tracking whether a successful
  io request queue create has occurred (e.g. 1st successful connect).
- The initial controller id is initialized to the dynamic controller
  id used in the initial connect message. It will be overwritten by
  the real controller id once the controller is connected on the wire.
Signed-off-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>

4c984154

31 May, 2018 · 1 commit

blk-mq: only iterate over inflight requests in blk_mq_tagset_busy_iter · d250bf4e

Committed by Christoph Hellwig on May 30, 2018

We already check for started commands in all callbacks, but we should
also protect against already completed commands.  Do this by taking
the checks to common code.
Acked-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

d250bf4e

25 May, 2018 · 1 commit

nvme-fc: remove setting DNR on exception conditions · 90fcaf5d

Committed by James Smart on May 11, 2018

Current code will set DNR if the controller is deleting or there is
an error during controller init. None of this is necessary.

Remove the code that sets DNR.
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>

90fcaf5d

27 Apr, 2018 · 1 commit

nvme: fc: provide a descriptive error · 4fb135ad

Committed by Johannes Thumshirn on Apr 19, 2018

Provide a descriptive error in case an lport to rport association
isn't found when creating the FC-NVME controller.

Currently it's very hard to debug the reason for a failed connect
attempt without a look at the source.
Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de>
Reviewed-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Keith Busch <keith.busch@intel.com>

4fb135ad

12 Apr, 2018 · 1 commit

nvme: expand nvmf_check_if_ready checks · bb06ec31

Committed by James Smart on Apr 12, 2018

The nvmf_check_if_ready() checks that were added are very simplistic.
As such, the routine allows a lot of cases to fail ios during windows
of reset or re-connection. In cases where there are not multi-path
options present, the error goes back to the caller - the filesystem
or application. Not good.

The common routine was rewritten and calling syntax slightly expanded
so that per-transport is_ready routines don't need to be present.
The transports now call the routine directly. The routine is now a
fabrics routine rather than an inline function.

The routine now looks at controller state to decide the action to
take. Some states mandate io failure. Others define the condition where
a command can be accepted. When the decision is unclear, a generic
queue-or-reject check is made to look for failfast or multipath ios and
only fails the io if it is so marked. Otherwise, the io will be queued
and wait for the controller state to resolve.

Admin commands issued via ioctl share a live admin queue with commands
from the transport for controller init. The ioctls could be intermixed
with the initialization commands. It's possible for the ioctl cmd to
be issued prior to the controller being enabled. To block this, the
ioctl admin commands need to be distinguished from admin commands used
for controller init. Added a USERCMD nvme_req(req)->rq_flags bit to
reflect this division and set it on ioctls requests. As the
nvmf_check_if_ready() routine is called prior to nvme_setup_cmd(),
ensure that commands allocated by the ioctl path (actually anything
in core.c) preps the nvme_req(req) before starting the io. This will
preserve the USERCMD flag during execution and/or retry.
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.e>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

bb06ec31

26 Mar, 2018 · 5 commits

nvme_fc: on remoteport reuse, set new nport_id and role. · 0cdd5fca

Committed by James Smart on Mar 05, 2018

When reattaching to a removed remoteport that has not yet been
fully deleted as it's waiting for reconnect timeouts, be sure to
re-set the port's nport id and role.
Signed-off-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

0cdd5fca

nvme_fc: fix abort race on teardown with lld reject · b12740d3

Committed by James Smart on Feb 28, 2018

Another abort race: An io request is started, becomes active,
and is attempted to be started with the lldd. At the same time
the controller is stopped/torn down and an iterator is run to
abort the ios. As the io is active, it is added to the outstanding
aborted io count.  However on the original io request thread, the
driver ends up rejecting the io due to the condition that induced
the controller teardown. The driver reject path didn't check whether
it was in the outstanding io count. This left the count outstanding
stopping controller teardown.

Correct by, in the driver reject case, setting the state to
inactive and checking whether it was in the outstanding io count.
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

b12740d3

nvme_fc: io timeout should defer abort to ctrl reset · 041018c6

Committed by James Smart on Mar 12, 2018

The current nvme_fc code, when an io times out, will abort the io
on the fc link, then call the error recovery routine to reset the
controller. It is during the reset of the controller that the
transport will wait for all ios to be aborted before sending a
Disconnect LS to the target.

However, the reset routine only waits for the io which it generates
the abort for to complete. Any io that was aborted just prior to the
reset isn't in its list to wait for. Thus the Disconnect is getting
sent before the aborts have completed.

Correct by removing the abort in the timeout handler. The reset will
generate the abort. At that point the timeout handler can be simplified
to request the reset (via the error handler) and restart the timeout
timer.

Also fixes a small typo in a comment in the reset handler.
Signed-off-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

041018c6

nvme_fc: fix ctrl create failures racing with workq items · cf25809b

Committed by James Smart on Mar 13, 2018

If there are errors during initial controller create, the transport
will teardown the partially initialized controller struct and free
the ctlr memory.  Trouble is - most of those errors can occur due
to asynchronous events happening such as io timeouts and subsystem
connectivity failures. Those failures invoke async workq items to
reset the controller and attempt reconnect.  Those may be in progress
as the main thread frees the ctrl memory, resulting in NULL ptr oops.

Prevent this from happening by having the main ctrl failure thread
changing state to DELETING followed by synchronously cancelling any
pending queued work item. The change of state will prevent the
scheduling of resets or reconnect events.
Signed-off-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

cf25809b

nvme: centralize ctrl removal prints · 77d0612d

Committed by Max Gurtovoy on Mar 11, 2018

nvme_delete_ctrl can be called from various contexts in parallel,
and cause duplicated information prints, even though the specific
context doesn't perform the actual removal. Instead, print the
information when the actual removal occurs.
Signed-off-by: Max Gurtovoy <maxg@mellanox.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

77d0612d

09 Mar, 2018 · 1 commit

nvme_fc: rework sqsize handling · d157e534

Committed by James Smart on Mar 07, 2018

Corrected four outstanding issues in the transport around sqsize.

1: Create Connection LS is sending the 1's-based sqsize, should be
sending the 0's-based value.

2: allocation of hw queue is using the 0's-base size. It should be
using the 1's-based value.

3: normalization of ctrl.sqsize by MQES is using MQES+1 (1's-based
value). It should be MQES (0's-based value).

4: Missing clause to ensure queue_count not larger than ctrl->sqsize.

Corrected by:
Clean up routines that pass queue size around. The queue size value is
the actual count (1's-based) value and determined from ctrl->sqsize + 1.

Routines that send 0's-based value adapt from queue size.

Set ctrl->sqsize properly for MQES.

Added clause to ensure queue_count not larger than ctrl->sqsize + 1.
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Keith Busch <keith.busch@intel.com>

d157e534
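
The 0's-based versus 1's-based bookkeeping in this commit is easy to get wrong, so here is a small arithmetic sketch. The helper names are hypothetical and the functions only restate the rules from the commit message: sqsize and MQES are 0's-based, the actual queue size is sqsize + 1, and a requested count is capped at sqsize + 1.

```c
/* All helpers are hypothetical illustrations of the rules in the message. */

/* Actual number of entries (1's-based) for a 0's-based sqsize. */
unsigned int demo_queue_size(unsigned int sqsize)
{
	return sqsize + 1;
}

/* Normalize a 0's-based sqsize against the 0's-based MQES value:
 * compare against MQES itself, not MQES + 1. */
unsigned int demo_normalize_sqsize(unsigned int sqsize, unsigned int mqes)
{
	return sqsize > mqes ? mqes : sqsize;
}

/* Cap a requested 1's-based count so that it is not larger than sqsize + 1. */
unsigned int demo_clamp_requested(unsigned int requested, unsigned int sqsize)
{
	unsigned int limit = demo_queue_size(sqsize);

	return requested > limit ? limit : requested;
}
```

For example, a controller reporting MQES = 63 supports 64 entries: a requested sqsize of 127 normalizes to 63, the hardware queue is allocated with 64 entries, and the value placed on the wire stays 63.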

11 Feb, 2018 · 2 commits

nvme_fc: cleanup io completion · c3aedd22

Committed by James Smart on Feb 06, 2018

There was some old code that dealt with complete_rq being called
prior to the lldd returning the io completion. This is garbage code.
The complete_rq routine was being called after eh_timeouts were
called and it was due to eh_timeouts not being handled properly.
The timeouts were fixed in prior patches so that in general, a
timeout will initiate an abort and the reset timer restarted as
the abort operation will take care of completing things. Given the
reset timer restarted, the erroneous complete_rq calls were eliminated.

So remove the work that was synchronizing complete_rq with io
completion.
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>

c3aedd22

nvme_fc: correct abort race condition on resets · 3efd6e8e

Committed by James Smart on Feb 06, 2018

During reset handling, there is live io completing while the reset
is taking place. The reset path attempts to abort all outstanding io,
counting the number of ios that were reset. It then waits for those
ios to be reclaimed from the lldd before continuing.

The transport's logic on io state and flag setting was poor, allowing
ios to complete simultaneous to the abort request. The completed ios
were counted, but as the completion had already occurred, the
completion never reduced the count. As the count never zeros, the
reset/delete never completes.

Tighten it up by unconditionally changing the op state to completed
when the io done handler is called.  The reset/abort path now changes
the op state to aborted, but the abort only continues if the op
state was live previously. If complete, the abort is backed out.
Thus proper counting of io aborts and their completions is working
again.

Also removed the TERMIO state on the op as it's redundant with the
op's aborted state.
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>

3efd6e8e
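
The state handling described in this commit (completion always wins, abort only proceeds if the op was live, otherwise the abort is backed out) can be sketched with a C11 atomic exchange. The enum values and function names are hypothetical, not the driver's FCPOP states.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical op states for illustration. */
enum demo_op_state {
	DEMO_OP_IDLE,
	DEMO_OP_ACTIVE,
	DEMO_OP_ABORTED,
	DEMO_OP_COMPLETE
};

struct demo_op {
	_Atomic int state;
};

/* io done handler: unconditionally mark the op complete. */
void demo_op_done(struct demo_op *op)
{
	assert(op != NULL);
	atomic_exchange(&op->state, DEMO_OP_COMPLETE);
}

/* Reset/abort path: claim the op by swapping in ABORTED. If the op was not
 * live, restore the previous state and report that no abort was issued, so
 * the caller does not count it as outstanding. */
bool demo_op_abort(struct demo_op *op)
{
	int old;

	assert(op != NULL);
	old = atomic_exchange(&op->state, DEMO_OP_ABORTED);
	if (old != DEMO_OP_ACTIVE) {
		atomic_store(&op->state, old);	/* back the abort out */
		return false;
	}
	return true;
}
```

The return value is what keeps the outstanding-abort count honest: only ops for which `demo_op_abort()` returns true are counted, and each of those is guaranteed a later completion that decrements the count.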

09 Feb, 2018 · 1 commit

nvme: rename NVME_CTRL_RECONNECTING state to NVME_CTRL_CONNECTING · ad6a0a52

Committed by Max Gurtovoy on Jan 31, 2018

In pci transport, this state is used to mark the initialization
process. This should be also used in other transports as well.
Signed-off-by: Max Gurtovoy <maxg@mellanox.com>
Reviewed-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>

ad6a0a52

31 Jan, 2018 · 1 commit

blk-mq: introduce BLK_STS_DEV_RESOURCE · 86ff7c2a

Committed by Ming Lei on Jan 30, 2018

This status is returned from driver to block layer if device related
resource is unavailable, but driver can guarantee that IO dispatch
will be triggered in future when the resource is available.

Convert some drivers to return BLK_STS_DEV_RESOURCE.  Also, if driver
returns BLK_STS_RESOURCE and SCHED_RESTART is set, rerun queue after
a delay (BLK_MQ_DELAY_QUEUE) to avoid IO stalls.  BLK_MQ_DELAY_QUEUE is
3 ms because both scsi-mq and nvmefc are using that magic value.

If a driver can make sure there is in-flight IO, it is safe to return
BLK_STS_DEV_RESOURCE because:

1) If all in-flight IOs complete before examining SCHED_RESTART in
blk_mq_dispatch_rq_list(), SCHED_RESTART must be cleared, so queue
is run immediately in this case by blk_mq_dispatch_rq_list();

2) if there is any in-flight IO after/when examining SCHED_RESTART
in blk_mq_dispatch_rq_list():
- if SCHED_RESTART isn't set, queue is run immediately as handled in 1)
- otherwise, this request will be dispatched after any in-flight IO is
  completed via blk_mq_sched_restart()

3) if SCHED_RESTART is set concurrently in context because of
BLK_STS_RESOURCE, blk_mq_delay_run_hw_queue() will cover the above two
cases and make sure IO hang can be avoided.

One invariant is that queue will be rerun if SCHED_RESTART is set.
Suggested-by: Jens Axboe <axboe@kernel.dk>
Tested-by: Laurence Oberman <loberman@redhat.com>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

86ff7c2a
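
The three outcomes described in this commit can be summarized as a decision function. This is a simplified sketch with hypothetical names; it ignores everything else blk_mq_dispatch_rq_list() does and only encodes the rules stated in the message: no SCHED_RESTART means run now, SCHED_RESTART plus BLK_STS_RESOURCE means rerun after the 3 ms delay, and SCHED_RESTART plus BLK_STS_DEV_RESOURCE means wait for an in-flight completion to restart the queue.

```c
#include <stdbool.h>

/* Hypothetical mirror of the two "out of resource" statuses. */
enum demo_sts {
	DEMO_STS_RESOURCE,	/* no guarantee a completion will rerun the queue */
	DEMO_STS_DEV_RESOURCE	/* driver guarantees in-flight io will rerun it */
};

enum demo_action {
	DEMO_RUN_NOW,		/* run the hardware queue immediately */
	DEMO_RUN_DELAYED,	/* rerun after DEMO_DELAY_MS */
	DEMO_WAIT_RESTART	/* rely on completion-driven restart */
};

#define DEMO_DELAY_MS 3	/* the 3 ms value named in the commit message */

/* What to do after a driver reported it could not take a request. */
enum demo_action demo_after_dispatch_failure(enum demo_sts sts, bool sched_restart)
{
	if (!sched_restart)
		return DEMO_RUN_NOW;
	if (sts == DEMO_STS_RESOURCE)
		return DEMO_RUN_DELAYED;
	return DEMO_WAIT_RESTART;
}
```

The delayed rerun is the safety net for the generic status: the block layer cannot tell whether anything is in flight, so it schedules its own retry rather than risk an io stall.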

18 Jan, 2018 · 2 commits

nvme-fc: correct hang in nvme_ns_remove() · 0fd997d3

Committed by James Smart on Jan 11, 2018

When connectivity is lost to a device, the association is terminated
and the blk-mq queues are quiesced/stopped. When connectivity is
re-established, they are resumed.

If connectivity is lost for a sufficient amount of time that the
controller is then deleted, the delete path starts tearing down queues,
and eventually calling nvme_ns_remove(). It appears that pending
commands may cause blk_cleanup_queue() to never complete and the
teardown stalls.

Correct by starting the ns queues after transitioning to a DELETING
state, allowing pending commands to be flushed with io failures. Thus
the delete path is clear when reached.
Signed-off-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>

0fd997d3

nvme-fc: fix rogue admin cmds stalling teardown · d625d05e

Committed by James Smart on Jan 11, 2018

When connectivity is lost to a device, the association is terminated
and the blk-mq queues are quiesced/stopped. When connectivity is
re-established, they are resumed.

If an admin command is received while connectivity is lost, the ioctl
queues the command on the admin_q and the command stalls (the thread
issuing the ioctl hangs/waits). If the connectivity is lost long
enough such that the controller is then deleted, the delete code
makes its calls to initiate the delete, which then expects the core
layer to call the transport when all references are removed and the
controller can be freed. Unfortunately, nothing in this path dequeued
the admin command, so a reference sits outstanding and things stop,
hanging the delete indefinitely.

Correct by unquiescing the admin queue in the delete association. This
means any admin command (which should only be from an ioctl) issued
after connectivity is lost will detect the controller is in a
reconnecting state and will (fast) fail the command. Thus, a pending
reference can no longer be created. Once connectivity is re-established,
a new ioctl/admin command would see proper device state and function again.
Signed-off-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>

d625d05e

08 Jan, 2018 · 1 commit

nvme-fabrics: protect against module unload during create_ctrl · 0de5cd36

Committed by Roy Shterman on Dec 25, 2017

NVMe transport driver module unload may (and usually does) trigger
iteration over the active controllers and delete them all (sometimes
under a mutex).  However, a controller can be created concurrently with
module unload which can lead to leakage of resources (most important char
device node leakage) in case the controller creation occurred after the
unload delete and drain sequence.  To protect against this, we take a
module reference to guarantee that the nvme transport driver is not
unloaded while creating a controller.
Signed-off-by: Roy Shterman <roys@lightbitslabs.com>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Max Gurtovoy <maxg@mellanox.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>

0de5cd36

15 Dec, 2017 · 1 commit

nvme-fc: remove double put reference if admin connect fails · 4596e752

Committed by James Smart on Nov 29, 2017

There are two put references in the failure case of initial
create_association. The first put actually frees the controller, thus the
second put references freed memory.

Remove the unnecessary 2nd put.
Signed-off-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>

4596e752

25 Nov, 2017 · 1 commit

nvme-fc: don't use bit masks for set/test_bit() numbers · 26c0a26d

Committed by Jens Axboe on Nov 24, 2017

So far harmless, but it's confusing and a bug waiting to happen if the
shifts grow larger than 4.
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

26c0a26d
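
The confusion this commit removes, passing a bit mask where a bit number is expected, can be demonstrated with userspace look-alikes of set_bit()/test_bit(). The `demo_` helpers and flag names are hypothetical and operate on a single unsigned long rather than the kernel's bitmap API.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Bit NUMBERS: what set_bit()-style helpers expect. */
enum {
	DEMO_Q_CONNECTED = 0,
	DEMO_Q_LIVE = 1
};

/* Bit MASK for the same flag: value 2, which is not a valid substitute. */
#define DEMO_Q_LIVE_MASK (1u << 1)

/* Set bit number nr in *word. */
void demo_set_bit(unsigned int nr, unsigned long *word)
{
	assert(word != NULL);
	assert(nr < sizeof(unsigned long) * CHAR_BIT);
	*word |= 1UL << nr;
}

/* Test bit number nr in *word. */
bool demo_test_bit(unsigned int nr, const unsigned long *word)
{
	assert(word != NULL);
	assert(nr < sizeof(unsigned long) * CHAR_BIT);
	return (*word >> nr) & 1UL;
}
```

Passing `DEMO_Q_LIVE_MASK` to `demo_set_bit()` sets bit 2 instead of bit 1. With small values the wrong bit is at least set and tested consistently, which is why the bug stays hidden; once a mask such as `1 << 6` is used as a bit number, it points past the end of a 64-bit word.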

20 Nov, 2017 · 1 commit

nvme-fc: check if queue is ready in queue_rq · 9e0ed16a

Committed by Sagi Grimberg on Oct 24, 2017

In case the queue is not LIVE (fully functional and connected at the nvmf
level), we cannot allow any commands other than connect to pass through.

Add a new queue state flag NVME_FC_Q_LIVE which is set after nvmf connect
and cleared in queue teardown.
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>

9e0ed16a

11 Nov, 2017 · 1 commit

nvme: remove handling of multiple AEN requests · ad22c355

Committed by Keith Busch on Nov 07, 2017

The driver can handle tracking only one AEN request, so this patch
removes handling for multiple ones.
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

ad22c355

openeuler / Kernel · synced successfully more than 1 year ago
