Commits · 71c691fd06cc2625966620c93ce21bdcce32ed95 · openanolis / cloud-kernel

01 Nov, 2017 1 commit

nvme-fc: avoid workqueue flush stalls · 71c691fd

Committed by James Smart on Sep 28, 2017

There's no need to wait for the full nvme_wq, which is now shared,
to flush. Flush only the delete_work item.
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Christoph Hellwig <hch@lst.de>

71c691fd

27 Oct, 2017 10 commits

nvme-fc: remove NVME_FC_MAX_SEGMENTS · ecad0d2c

Committed by James Smart on Oct 23, 2017

The define is an arbitrary limit to the io size on the initiator,
capping the io to 1MB-4KB.

Remove the define from the transport. I/O size will solely be limited
by the LLDD sg limits.
Signed-off-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>

ecad0d2c

nvme-fc: add support for duplicate_connect option · 56d5f4f1

Committed by James Smart on Oct 20, 2017

Adds support for the duplicate_connect option. When set to false,
checks whether there's an existing controller via the same host port
and target port for the same host (hostnqn, hostid) to the same
subsystem. Fails the connection request if there is an existing
controller.
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Christoph Hellwig <hch@lst.de>

56d5f4f1

nvme-rdma: add support for duplicate_connect option · 36e835f2

Committed by James Smart on Oct 20, 2017

Adds support for the duplicate_connect option. When set to false,
checks whether there's an existing controller via the same target
address (traddr), target port (trsvcid), and if specified, host
address (host_traddr). Fails the connection request if there is
an existing controller.
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Christoph Hellwig <hch@lst.de>

36e835f2

nvme: add helper to compare options to controller · 991231dc

Committed by James Smart on Oct 20, 2017

Adds a helper function that compares the host and subsystem
specified in a connect options list vs a controller.
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Christoph Hellwig <hch@lst.de>

991231dc

nvme: add duplicate_connect option · 3b338762

Committed by James Smart on Oct 20, 2017

Add the "duplicate_connect" boolean option (presence means true).
Default is false.

When false, the transport should validate whether a new controller request
is targeted for the same host transport addressing and target transport
addressing as an existing controller. If so, the new controller request
should be rejected.

When true, the caller is explicitly requesting a duplicate controller
connection to be made and the new request should be attempted.
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Christoph Hellwig <hch@lst.de>

3b338762
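
The accept/reject rule described for duplicate_connect can be sketched in user-space C. This is only an illustration under stated assumptions: struct ctrl_addr, same_addressing() and check_new_connect() are hypothetical names, the addressing is reduced to two strings, and the -1 return value stands in for whatever error code a transport actually uses.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified view of a controller's transport addressing. */
struct ctrl_addr {
	const char *host_traddr;	/* host transport address */
	const char *traddr;		/* target transport address */
};

/* True when both host and target addressing match. */
static bool same_addressing(const struct ctrl_addr *a,
			    const struct ctrl_addr *b)
{
	return strcmp(a->host_traddr, b->host_traddr) == 0 &&
	       strcmp(a->traddr, b->traddr) == 0;
}

/*
 * Returns 0 if the new request should be attempted, -1 if it should be
 * rejected. The existing-controller scan is skipped entirely when the
 * caller asked for a duplicate connection.
 */
static int check_new_connect(const struct ctrl_addr *existing, size_t n,
			     const struct ctrl_addr *req,
			     bool duplicate_connect)
{
	size_t i;

	if (duplicate_connect)
		return 0;

	for (i = 0; i < n; i++)
		if (same_addressing(&existing[i], req))
			return -1;

	return 0;
}
```

The point of the design is that the default (option absent, so false) is the safe one: a second connect over identical addressing is refused unless it is explicitly requested.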

nvme: check for a live controller in nvme_dev_open · 999ada28

Committed by Christoph Hellwig on Oct 18, 2017

This is a much more sensible check than just the admin queue.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>

999ada28

nvme: get rid of nvme_ctrl_list · a6a5149b

Committed by Christoph Hellwig on Oct 18, 2017

Use the core chrdev code to set up the link between the character device
and the nvme controller.  This allows us to get rid of the global list
of all controllers, and also ensures that we have both a reference to
the controller and the transport module before the open method of the
character device is called.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Reviewed-by: Hannes Reinecke <hare@suse.com>

a6a5149b

nvme: switch controller refcounting to use struct device · d22524a4

Committed by Christoph Hellwig on Oct 18, 2017

Instead of allocating a separate struct device for the character device
handle, embed it into struct nvme_ctrl and use it for the main controller
refcounting.  This removes double refcounting and gets us an automatic
reference for the character device operations.  We keep ctrl->device as a
pointer for now to avoid changing printks all over, but in the future we
could look into message printing helpers that take a controller structure
similar to what other subsystems do.

Note the delete_ctrl operation always already has a reference (either
through sysfs due to this change, or because every open file on the
/dev/nvme-fabrics node has a reference) when it is entered now, so we
don't need to do the unless_zero variant there.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.com>

d22524a4

nvme: simplify nvme_open · c6424a90

Committed by Christoph Hellwig on Oct 18, 2017

Now that we are protected against lookup vs free races for the namespace
by using kref_get_unless_zero we don't need the hack of NULLing out the
disk private data during removal.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>

c6424a90

nvme: use kref_get_unless_zero in nvme_find_get_ns · 2dd41228

Committed by Christoph Hellwig on Oct 18, 2017

For kref_get_unless_zero to protect against lookup vs free races we need
to use it in all places where we aren't guaranteed to already hold a
reference.  There is no such guarantee in nvme_find_get_ns, so switch to
kref_get_unless_zero in this function.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>

2dd41228
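
The "get unless zero" idea behind the two refcounting commits above can be illustrated with C11 atomics. This is a user-space sketch of the general technique, not the kernel's kref implementation; struct ref, ref_get_unless_zero() and ref_put() are hypothetical names.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a reference-counted object header. */
struct ref {
	atomic_uint count;
};

/*
 * Take a reference only if the count has not already dropped to zero.
 * A lookup that may race with the final put must use this form, because
 * an object whose count reached zero is already on its way to being freed.
 */
static bool ref_get_unless_zero(struct ref *r)
{
	unsigned int old = atomic_load(&r->count);

	while (old != 0) {
		/* On failure, old is refreshed with the current value. */
		if (atomic_compare_exchange_weak(&r->count, &old, old + 1))
			return true;
	}
	return false;
}

/* Drop a reference; returns true when this was the last one. */
static bool ref_put(struct ref *r)
{
	return atomic_fetch_sub(&r->count, 1) == 1;
}
```

A plain increment would "resurrect" an object whose count had already hit zero; the compare-and-swap loop refuses to do that, which is why a lookup path without a guaranteed reference needs it.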

23 Oct, 2017 2 commits

nvme-rdma: Add debug message when reaches timeout · e62a538d

Committed by Nitzan Carmi on Oct 22, 2017

Signed-off-by: Nitzan Carmi <nitzanc@mellanox.com>
Reviewed-by: Max Gurtovoy <maxg@mellanox.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>

e62a538d

nvme-rdma: align nvme_rdma_device structure · f87c89ad

Committed by Max Gurtovoy on Oct 23, 2017

Signed-off-by: Max Gurtovoy <maxg@mellanox.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>

f87c89ad

20 Oct, 2017 5 commits

nvme-fc: correct io timeout behavior · 134aedc9

Committed by James Smart on Oct 19, 2017

The transport io timeout behavior wasn't quite correct. It ignored
that the io error handler is supposed to be synchronous, so it possibly
allowed the blk request to be restarted while the associated io was
still aborting. Timeouts on reserved commands, those used for
association create, were never timing out, so they hung forever.

To correct:
If an io times out while a remoteport is not connected, just
restart the io timer. The lack of connectivity will simultaneously
be resetting the controller, so the reset path will abort and terminate
the io.

If an io times out while it was marked for transport abort, just
reset the io timer. The abort process is underway and will complete
the io.

Otherwise, if an io times out, abort the io. If the abort was
unsuccessful (unlikely) give up and return not handled.

If the abort was successful, as the abort process is underway it will
terminate the io, so rather than synchronously waiting, just restart
the io timer.
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Christoph Hellwig <hch@lst.de>

134aedc9
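
The four timeout cases listed in the 134aedc9 message reduce to a small decision function. The sketch below is a user-space model with hypothetical names (enum timeout_action, struct io_state, handle_io_timeout(), and the two abort stubs); it shows only the branching, not the driver's locking or the real block-layer return codes.

```c
#include <stdbool.h>

/* What the handler reports back (illustrative names). */
enum timeout_action {
	TIMEOUT_RESET_TIMER,	/* restart the io timer and keep waiting */
	TIMEOUT_NOT_HANDLED,	/* give up and report the timeout as not handled */
};

/* Hypothetical snapshot of the state the handler inspects. */
struct io_state {
	bool port_connected;	/* is the remoteport currently connected? */
	bool abort_in_progress;	/* already marked for transport abort? */
};

/* Stand-ins for the LLDD abort call: one succeeds, one fails. */
static bool abort_succeeds(struct io_state *io) { (void)io; return true; }
static bool abort_fails(struct io_state *io) { (void)io; return false; }

static enum timeout_action handle_io_timeout(struct io_state *io,
					     bool (*abort_io)(struct io_state *))
{
	/* Not connected: the controller reset path will terminate the io. */
	if (!io->port_connected)
		return TIMEOUT_RESET_TIMER;

	/* Abort already underway: it will complete the io. */
	if (io->abort_in_progress)
		return TIMEOUT_RESET_TIMER;

	/* Otherwise try to abort; an unsuccessful abort is not handled. */
	if (!abort_io(io))
		return TIMEOUT_NOT_HANDLED;

	/* Abort started: do not wait synchronously, just rearm the timer. */
	io->abort_in_progress = true;
	return TIMEOUT_RESET_TIMER;
}
```

Note that three of the four paths rearm the timer; only a failed abort gives up.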

nvme-fc: correct io termination handling · 0a02e39f

Committed by James Smart on Oct 19, 2017

The io completion handling for ios that are failing due to
a transport error or association termination had issues, causing
io failures (DNR set so retries didn't kick in) or long stalls.

Change the io completion handler for the following items:

When an io has been completed due to a transport abort (based on an
exchange error) or when marked as aborted as part of an association
termination (FCOP_FLAGS_TERMIO), set the NVME completion status to
NVME_SC_ABORTED. By default, do not set DNR on the status so that a
retry can be attempted after association recreate.

In cases where an io is failed (non-successful nvme status including
aborted), if the controller is being deleted (blk_queue_dying) or
the io was part of the ios used for association creation (ctrl state
is NEW or RECONNECTING), then additionally set the DNR bit so the io
will not be retried. If the failed io was part of association creation,
the failure will tear down the partially completed association and
typically restart a new reconnect attempt (another create association
later).

Rearranged code flow to remove a largely unneeded local variable.
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Christoph Hellwig <hch@lst.de>

0a02e39f
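
The two status rules in the 0a02e39f message (force an aborted status, then conditionally add DNR) can be modeled as a pure function. This is a hedged sketch: completion_status() and enum ctrl_state are hypothetical, and the SC_* constants are illustrative placeholders rather than the definitions from the kernel's NVMe headers.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative values only; not taken from the kernel headers. */
#define SC_SUCCESS	0x0000u
#define SC_ABORTED	0x0007u
#define SC_DNR		0x4000u	/* "do not retry" bit */

enum ctrl_state { CTRL_NEW, CTRL_LIVE, CTRL_RECONNECTING, CTRL_DELETING };

/*
 * status:          NVMe status the io completed with
 * transport_abort: io completed due to a transport abort
 * termio:          io was marked aborted for association termination
 * queue_dying:     the controller is being deleted
 */
static uint16_t completion_status(uint16_t status, bool transport_abort,
				  bool termio, bool queue_dying,
				  enum ctrl_state state)
{
	/* Rule 1: aborted ios report "aborted", without DNR by default. */
	if (transport_abort || termio)
		status = SC_ABORTED;

	/*
	 * Rule 2: a failed io must not be retried if the controller is
	 * going away or the io belonged to association creation.
	 */
	if (status != SC_SUCCESS &&
	    (queue_dying || state == CTRL_NEW || state == CTRL_RECONNECTING))
		status |= SC_DNR;

	return status;
}
```

Leaving DNR clear in the common abort case is what lets the io be retried once the association has been recreated.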

nvme-pci: add SGL support · a7a7cbe3

Committed by Chaitanya Kulkarni on Oct 16, 2017

This adds SGL support for the NVMe PCIe driver, based on an earlier patch
from Rajiv Shanmugam Madeswaran <smrajiv15 at gmail.com>. This patch
refactors the original code and adds a new module parameter, sgl_threshold,
to determine whether to use SGL or PRP for IOs.

The usage of SGLs is controlled by the sgl_threshold module parameter,
which allows SGLs to be used conditionally when the average request
segment size (avg_seg_size) is greater than or equal to sgl_threshold.
In the original patch, the decision to use SGLs depended only on the
IO size; with the new approach we consider not only the IO size but also
the number of physical segments present in the IO.

We calculate avg_seg_size based on request payload bytes and number
of physical segments present in the request.

For example:

1. blk_rq_nr_phys_segments = 2, blk_rq_payload_bytes = 8K:
   avg_seg_size = 4K; use SGL if avg_seg_size >= sgl_threshold.

2. blk_rq_nr_phys_segments = 2, blk_rq_payload_bytes = 64K:
   avg_seg_size = 32K; use SGL if avg_seg_size >= sgl_threshold.

3. blk_rq_nr_phys_segments = 16, blk_rq_payload_bytes = 64K:
   avg_seg_size = 4K; use SGL if avg_seg_size >= sgl_threshold.
Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Reviewed-by: Keith Busch <keith.busch@intel.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Christoph Hellwig <hch@lst.de>

a7a7cbe3
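
The avg_seg_size rule from the a7a7cbe3 message can be tried out in plain user-space C. This is a minimal illustration, not the driver code: use_sgl() and div_round_up() are hypothetical names, the two size arguments stand in for blk_rq_payload_bytes() and blk_rq_nr_phys_segments(), and both the round-up division and the treatment of a zero threshold as "never use SGLs" are assumptions of this sketch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Integer division rounding up (assumed rounding mode for this sketch). */
static uint32_t div_round_up(uint32_t n, uint32_t d)
{
	return (n + d - 1) / d;
}

/*
 * Decide between SGL and PRP for one request.
 * payload_bytes: total payload of the request in bytes
 * nr_segs:       number of physical segments in the request
 * sgl_threshold: minimum average segment size, in bytes, that selects SGLs
 */
static bool use_sgl(uint32_t payload_bytes, uint32_t nr_segs,
		    uint32_t sgl_threshold)
{
	uint32_t avg_seg_size;

	/* Guard against division by zero; zero threshold disables SGLs here. */
	if (nr_segs == 0 || sgl_threshold == 0)
		return false;

	avg_seg_size = div_round_up(payload_bytes, nr_segs);
	return avg_seg_size >= sgl_threshold;
}
```

With a hypothetical threshold of 32K, only the second example from the message (2 segments, 64K payload) would select SGLs; the first and third average 4K per segment and would stay on PRPs, even though the third has the same total IO size as the second.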

nvme: use ida_simple_{get,remove} for the controller instance · 9843f685

Committed by Christoph Hellwig on Oct 18, 2017

Switch to the ida_simple_* helpers instead of open-coding them.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Keith Busch <keith.busch@intel.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>

9843f685

nvmet: Change max_nsid in subsystem due to ns_disable if needed · ba2dec35

Committed by Roy Shterman on Oct 18, 2017

When we disable a namespace whose nsid equals the subsystem's
max_nsid, we need to search for the next largest nsid in the
subsystem. If the subsystem has no more namespaces, we set max_nsid
to 0; otherwise we take the nsid of the last namespace in the
namespaces list, because the list is kept sorted on insertion.
Reviewed-by: Max Gurtovoy <maxg@mellanox.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Roy Shterman <roys@lightbitslabs.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
[hch: slight refactor]
Signed-off-by: Christoph Hellwig <hch@lst.de>

ba2dec35
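
The max_nsid update described in ba2dec35 is easy to see on a sorted array. The sketch below models the namespaces list as a small fixed array kept in ascending order; struct subsys and ns_disable() are hypothetical names, and the real target code works on a linked list rather than an array.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_NS 16

/* Hypothetical subsystem: nsids[] is kept sorted in ascending order. */
struct subsys {
	uint32_t nsids[MAX_NS];
	size_t count;
	uint32_t max_nsid;
};

/* Remove nsid from the subsystem and fix up max_nsid if needed. */
static void ns_disable(struct subsys *s, uint32_t nsid)
{
	size_t i;

	/* Find the namespace; nothing to do if it is not present. */
	for (i = 0; i < s->count; i++)
		if (s->nsids[i] == nsid)
			break;
	if (i == s->count)
		return;

	/* Close the gap so the array stays sorted. */
	for (; i + 1 < s->count; i++)
		s->nsids[i] = s->nsids[i + 1];
	s->count--;

	/*
	 * Only removing the current maximum can change max_nsid. Because
	 * the list is sorted, the new maximum is simply the last entry,
	 * or 0 when no namespaces remain.
	 */
	if (nsid == s->max_nsid)
		s->max_nsid = s->count ? s->nsids[s->count - 1] : 0;
}
```

Keeping the list sorted on insertion is what makes the fix-up a constant-time lookup of the tail instead of a full scan.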

19 Oct, 2017 15 commits

nvme-loop: Add BLK_MQ_F_NO_SCHED flag to admin tag set · 86f36b9c

Committed by Israel Rukshin on Oct 18, 2017

This flag is useful for admin queues that aren't used for normal IO.
Signed-off-by: Israel Rukshin <israelr@mellanox.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Max Gurtovoy <maxg@mellanox.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>

86f36b9c

nvme-fc: Add BLK_MQ_F_NO_SCHED flag to admin tag set · 5a22e2bf

Committed by Israel Rukshin on Oct 18, 2017

Since commit b86dd815
"block: get rid of blk-mq default scheduler choice Kconfig entries",
when setting nr_hw_queues to 1 the admin tag set uses mq-deadline scheduler.
This flag is useful for admin queues that aren't used for normal IO.
Signed-off-by: Israel Rukshin <israelr@mellanox.com>
Reviewed-by: Max Gurtovoy <maxg@mellanox.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>

5a22e2bf

nvme-rdma: Add BLK_MQ_F_NO_SCHED flag to admin tag set · 94f29d4f

Committed by Israel Rukshin on Oct 18, 2017

Since commit b86dd815
"block: get rid of blk-mq default scheduler choice Kconfig entries",
when setting nr_hw_queues to 1 the admin tag set uses mq-deadline scheduler.
This flag is useful for admin queues that aren't used for normal IO.
Signed-off-by: Israel Rukshin <israelr@mellanox.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Max Gurtovoy <maxg@mellanox.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>

94f29d4f

nvme-pci: fix typos in comments · 16772ae6

Committed by Minwoo Im on Oct 18, 2017

Fixed comment typos in adapter_alloc_cq() and adapter_alloc_sq().
'the the' duplications are replaced with 'that the'.
Signed-off-by: Minwoo Im <dn3108@gmail.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>

16772ae6

nvme-rdma: stop controller reset if the controller is deleting · 0ad0bfa2