- 24 Jun 2019, 1 commit

Committed by Israel Rukshin
This is a preparation for adding a new signature API to the rw-API.

Signed-off-by: Israel Rukshin <israelr@mellanox.com>
Reviewed-by: Max Gurtovoy <maxg@mellanox.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
-
- 21 Jun 2019, 1 commit

Committed by Ming Lei
sg_alloc_table_chained() currently allows the caller to provide one preallocated SGL and returns if the requested number isn't bigger than the size of that SGL. This is used to inline an SGL for an I/O request.

However, the scatterlist code only allows the size of the first preallocated SGL to be SG_CHUNK_SIZE (128). This means a substantial amount of memory (4KB) is claimed for the SGL of each I/O request. If the I/O is small, it would be prudent to allocate a smaller SGL.

Introduce an extra parameter to sg_alloc_table_chained() and sg_free_table_chained() for specifying the size of the preallocated SGL. Both __sg_free_table() and __sg_alloc_table() assume that each SGL has the same size except for the last one. Change the code to allow both functions to accept a variable size for the first preallocated SGL.

[mkp: attempted to clarify commit desc]

Cc: Christoph Hellwig <hch@lst.de>
Cc: Bart Van Assche <bvanassche@acm.org>
Cc: Ewan D. Milne <emilne@redhat.com>
Cc: Hannes Reinecke <hare@suse.com>
Cc: Sagi Grimberg <sagi@grimberg.me>
Cc: Chuck Lever <chuck.lever@oracle.com>
Cc: netdev@vger.kernel.org
Cc: linux-nvme@lists.infradead.org
Suggested-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
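For context, a minimal sketch of a caller using the extended API, assuming the new SGL-size parameter is the trailing argument of both functions; `MY_INLINE_SG_CNT` and the surrounding structure are illustrative, not the driver's actual names:

```c
#include <linux/scatterlist.h>

#define MY_INLINE_SG_CNT 2	/* hypothetical small preallocated SGL */

struct my_iod {
	struct sg_table sgt;
	struct scatterlist first_sgl[MY_INLINE_SG_CNT];
};

static int my_map_data(struct my_iod *iod, int nents)
{
	/* The preallocated first_sgl is used when nents fits in it;
	 * otherwise chained allocation takes over. */
	if (sg_alloc_table_chained(&iod->sgt, nents, iod->first_sgl,
				   MY_INLINE_SG_CNT))
		return -ENOMEM;
	return 0;
}

static void my_unmap_data(struct my_iod *iod)
{
	/* Free with the same preallocated size passed at alloc time. */
	sg_free_table_chained(&iod->sgt, MY_INLINE_SG_CNT);
}
```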
-
- 07 Jun 2019, 1 commit

Committed by Max Gurtovoy
Commit 87fd1253 ("nvme-rdma: remove redundant reference between ib_device and tagset") caused a kernel panic when disconnecting from an inaccessible controller (disconnect during re-connection):

    nvme nvme0: Removing ctrl: NQN "testnqn1"
    nvme_rdma: nvme_rdma_exit_request: hctx 0 queue_idx 1
    BUG: unable to handle kernel paging request at 0000000080000228
    PGD 0 P4D 0
    Oops: 0000 [#1] SMP PTI
    ...
    Call Trace:
     blk_mq_exit_hctx+0x5c/0xf0
     blk_mq_exit_queue+0xd4/0x100
     blk_cleanup_queue+0x9a/0xc0
     nvme_rdma_destroy_io_queues+0x52/0x60 [nvme_rdma]
     nvme_rdma_shutdown_ctrl+0x3e/0x80 [nvme_rdma]
     nvme_do_delete_ctrl+0x53/0x80 [nvme_core]
     nvme_sysfs_delete+0x45/0x60 [nvme_core]
     kernfs_fop_write+0x105/0x180
     vfs_write+0xad/0x1a0
     ksys_write+0x5a/0xd0
     do_syscall_64+0x55/0x110
     entry_SYSCALL_64_after_hwframe+0x44/0xa9
    RIP: 0033:0x7fa215417154

The reason for this crash is accessing an already freed ib_device while performing dma_unmap in exit_request commands. The root cause is that during re-connection all the queues are destroyed and re-created (and the ib_device, which is reference counted by the queues, is freed as well), but the tagset stays alive and all the DMA mappings (performed in init_request) are kept in the request context.

The original commit fixed a different bug that was introduced during bonding (aka NIC teaming) tests, where some scenarios change the underlying ib_device and cause memory leakage and a possible segmentation fault. This commit is a complementary commit that also replaces the wrong DMA mappings that were saved in the request context, making the request SQE DMA mappings dynamic with the command lifetime (i.e. mapped in .queue_rq and unmapped in .complete). It also fixes the above crash of accessing a freed ib_device during destruction of the tagset.

Fixes: 87fd1253 ("nvme-rdma: remove redundant reference between ib_device and tagset")
Reported-by: Jim Harris <james.r.harris@intel.com>
Suggested-by: Sagi Grimberg <sagi@grimberg.me>
Tested-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Max Gurtovoy <maxg@mellanox.com>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
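A sketch of the dynamic mapping scheme described above, with the mapping tied to the command lifetime; the structure and helper names are illustrative, not the exact upstream code:

```c
#include <rdma/ib_verbs.h>
#include <linux/nvme.h>

struct my_sqe {
	void *data;	/* command buffer */
	u64 dma;	/* its DMA address */
};

/* Map the command SQE at submission (.queue_rq) time, when the queue's
 * ib_device is guaranteed to be alive. */
static int my_map_sqe(struct ib_device *ibdev, struct my_sqe *sqe)
{
	sqe->dma = ib_dma_map_single(ibdev, sqe->data,
				     sizeof(struct nvme_command),
				     DMA_TO_DEVICE);
	if (ib_dma_mapping_error(ibdev, sqe->dma))
		return -ENOMEM;
	return 0;
}

/* Unmap at .complete time, so no mapping can outlive the ib_device
 * across a reconnect. */
static void my_unmap_sqe(struct ib_device *ibdev, struct my_sqe *sqe)
{
	ib_dma_unmap_single(ibdev, sqe->dma, sizeof(struct nvme_command),
			    DMA_TO_DEVICE);
}
```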
-
- 31 May 2019, 1 commit

Committed by Sagi Grimberg
When the controller supports fewer queues than requested, we should make sure that queue mapping does the right thing and not assume that all queues are available. This fixes a crash when the controller supports fewer queues than requested.

The rules are:

1. If no write/poll queues are requested, we assign the available queues to the default queue map. The default and read queue maps share the existing queues.
2. If write queues are requested:
   - First make sure that the read queue map gets the requested nr_io_queues count.
   - Then grant the default queue map the minimum between the requested nr_write_queues and the remaining queues. If there are no available queues to dedicate to the default queue map, fall back to (1) and share all the queues in the existing queue map.
3. If poll queues are requested:
   - Map the remaining queues to the poll queue map.

Also, provide a log indication of how we constructed the different queue maps. A sketch of this assignment logic is shown below.

Reported-by: Harris, James R <james.r.harris@intel.com>
Reviewed-by: Max Gurtovoy <maxg@mellanox.com>
Tested-by: Jim Harris <james.r.harris@intel.com>
Cc: <stable@vger.kernel.org> # v5.0+
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
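A sketch of these rules, assuming the granted queue counts are tracked in an `io_queues[]` array indexed by queue-map type (simplified from the description above; `struct nvmf_ctrl_options` is from the driver's fabrics.h):

```c
#include <linux/blk-mq.h>

static void my_set_io_queues(struct nvmf_ctrl_options *opts,
			     unsigned int nr_io_queues,
			     unsigned int io_queues[HCTX_MAX_TYPES])
{
	if (opts->nr_write_queues && opts->nr_io_queues < nr_io_queues) {
		/* Rule 2: separate read/write queues; hand out dedicated
		 * default (write) queues only after reads are satisfied. */
		io_queues[HCTX_TYPE_READ] = opts->nr_io_queues;
		nr_io_queues -= io_queues[HCTX_TYPE_READ];
		io_queues[HCTX_TYPE_DEFAULT] =
			min(opts->nr_write_queues, nr_io_queues);
		nr_io_queues -= io_queues[HCTX_TYPE_DEFAULT];
	} else {
		/* Rule 1: shared read/write queues; either no write queues
		 * were requested or there aren't enough to dedicate. */
		io_queues[HCTX_TYPE_DEFAULT] =
			min(opts->nr_io_queues, nr_io_queues);
		nr_io_queues -= io_queues[HCTX_TYPE_DEFAULT];
	}

	/* Rule 3: dedicate poll queues only if queues remain. */
	if (opts->nr_poll_queues && nr_io_queues)
		io_queues[HCTX_TYPE_POLL] =
			min(opts->nr_poll_queues, nr_io_queues);
}
```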
-
- 13 May 2019, 1 commit

Committed by Max Gurtovoy
In the past, before commit f41725bb ("nvme-rdma: Use mr pool"), we needed a reference on the ib_device for as long as the tagset was alive, as the MRs in the request structures needed a valid ib_device. Now we allocate/deallocate an MR pool per QP and consume MRs on demand.

Also remove the nvme_rdma_free_tagset function and use blk_mq_free_tag_set instead, as it is no longer needed.

This commit also fixes a memory leak and a possible segmentation fault. When configuring the system with NIC teaming (aka bonding), we use one network interface to create an HA connection to the target side. In case one connection breaks down, the nvme-rdma driver will get a notification from the rdma-cm layer that the underlying address was changed and will start the error recovery process. During this process, we'll reconnect to the target via the second interface in the bond without destroying the tagset. This causes a leak of the initial rdma device (ndev) and a miscount in the reference count of the newly created rdma device (new ndev). In the final destruction (or in another error flow), we'll get a warning dump from ib_dealloc_pd that we still have inflight MRs related to that pd. This happens because of the miscount of the rdma device's reference count, causing access violations on its elements (some queues are not destroyed yet).

Signed-off-by: Max Gurtovoy <maxg@mellanox.com>
Signed-off-by: Israel Rukshin <israelr@mellanox.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
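A sketch of the per-QP MR pool pattern the message refers to, using the RDMA core's mr_pool helpers; pool size and wrapper names are illustrative:

```c
#include <rdma/ib_verbs.h>
#include <rdma/mr_pool.h>

/* Fill the QP's pool with one fast-registration MR per outstanding
 * command, instead of pinning MRs (and the ib_device) in the tagset. */
static int my_init_mr_pool(struct ib_qp *qp, int queue_depth, u32 max_sge)
{
	return ib_mr_pool_init(qp, &qp->rdma_mrs, queue_depth,
			       IB_MR_TYPE_MEM_REG, max_sge);
}

/* Take an MR only while a request actually needs one... */
static struct ib_mr *my_get_mr(struct ib_qp *qp)
{
	return ib_mr_pool_get(qp, &qp->rdma_mrs);
}

/* ...and return it when the request completes. */
static void my_put_mr(struct ib_qp *qp, struct ib_mr *mr)
{
	ib_mr_pool_put(qp, mr);
}
```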
-
- 25 Apr 2019, 1 commit

Committed by Sagi Grimberg
If we time out during the admin startup sequence we might not yet have an I/O tagset allocated, which causes the teardown sequence to crash. Make nvme_rdma_teardown_io_queues safe by not iterating inflight tags if the tagset wasn't allocated.

Fixes: 4c174e63 ("nvme-rdma: fix timeout handler")
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Christoph Hellwig <hch@lst.de>
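A sketch of the guard, assuming a simplified teardown helper; the surrounding calls stand in for the real teardown sequence:

```c
#include <linux/blk-mq.h>

/* struct nvme_ctrl and nvme_cancel_request come from the driver's
 * internal headers; the helper's shape is simplified here. */
static void my_teardown_io_queues(struct nvme_ctrl *ctrl)
{
	nvme_stop_queues(ctrl);

	/* An admin connect timeout can fire before the I/O tagset is
	 * allocated; only iterate inflight tags when it exists. */
	if (ctrl->tagset)
		blk_mq_tagset_busy_iter(ctrl->tagset,
					nvme_cancel_request, ctrl);
}
```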
-
- 21 Feb 2019, 1 commit

Committed by Chaitanya Kulkarni
Use blk_rq_nr_phys_segments() instead of blk_rq_payload_bytes() to check if a command contains data to be mapped. This fixes the case where a struct request contains LBAs but no payload, such as for Write Zeroes support.

Fixes: 6e02318e ("nvme: add support for the Write Zeroes command")
Reported-by: Ming Lei <tom.leiming@gmail.com>
Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Tested-by: Ming Lei <tom.leiming@gmail.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
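A sketch of the corrected check; the mapping body is elided and the helper name is illustrative:

```c
#include <linux/blk-mq.h>

static blk_status_t my_map_data(struct request *rq)
{
	/* Write Zeroes carries LBAs but no data payload, so payload
	 * bytes would be misleading; segments tell the truth. */
	if (!blk_rq_nr_phys_segments(rq))
		return BLK_STS_OK;	/* nothing to DMA-map */

	/* ...map the request's scatterlist here... */
	return BLK_STS_OK;
}
```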
-
- 20 Feb 2019, 1 commit

Committed by Christoph Hellwig
Update the license to use an SPDX-License-Identifier instead of verbose license text.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
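For illustration, the conversion replaces a multi-line GPL boilerplate comment at the top of the file with a single machine-readable tag; the header shown here is a placeholder, not the file's actual text:

```c
// SPDX-License-Identifier: GPL-2.0
/*
 * NVMe over Fabrics RDMA host code.
 * Copyright (c) <year> <copyright holder>
 */
```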
-
- 04 Feb 2019, 1 commit

Committed by Sagi Grimberg
It is now used just to flush the error recovery and reconnect work items in the RDMA and TCP transports, which can simply be moved to the corresponding teardown routines.

Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Christoph Hellwig <hch@lst.de>
-
- 24 Jan 2019, 2 commits

Committed by Sagi Grimberg
If the device supports fewer queues than provided (if the device has fewer completion vectors), we might hit a bug, due to the fact that we ignore that in nvme_rdma_map_queues (we override the map's nr_queues with user opts). Instead, keep track of how many default/read/poll queues we actually allocated (rather than what the user asked for) and use that to assign our queue mappings.

Fixes: b65bb777 ("nvme-rdma: support separate queue maps for read and write")
Reported-by: Saleem, Shiraz <shiraz.saleem@intel.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
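A sketch of a map_queues implementation driven by the allocated counts rather than the user options; the `io_queues[]` bookkeeping mirrors the driver's convention but is simplified (it assumes dedicated read queues were granted):

```c
#include <linux/blk-mq.h>

struct my_ctrl {
	unsigned int io_queues[HCTX_MAX_TYPES]; /* what was really granted */
};

static int my_map_queues(struct blk_mq_tag_set *set)
{
	struct my_ctrl *ctrl = set->driver_data;

	/* Use allocated counts, not opts->nr_*_queues. */
	set->map[HCTX_TYPE_DEFAULT].nr_queues =
		ctrl->io_queues[HCTX_TYPE_DEFAULT];
	set->map[HCTX_TYPE_DEFAULT].queue_offset = 0;

	set->map[HCTX_TYPE_READ].nr_queues =
		ctrl->io_queues[HCTX_TYPE_READ];
	set->map[HCTX_TYPE_READ].queue_offset =
		ctrl->io_queues[HCTX_TYPE_DEFAULT];

	blk_mq_map_queues(&set->map[HCTX_TYPE_DEFAULT]);
	blk_mq_map_queues(&set->map[HCTX_TYPE_READ]);
	return 0;
}
```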
-
Committed by Sagi Grimberg
Currently, we have several problems with the timeout handler:

1. If we time out on the controller establishment flow, we will hang because we don't execute the error recovery (and we shouldn't, because the create_ctrl flow needs to fail and clean up on its own).
2. We might also hang if we get a disconnect on a queue while the controller is already deleting. This racy flow can cause the controller disable/shutdown admin command to hang.

We cannot complete a timed-out request from the timeout handler without mutual exclusion from the teardown flow (e.g. nvme_rdma_error_recovery_work). So we serialize it in the timeout handler and tear down the I/O and admin queues to guarantee that no one races with us in completing the request. A sketch of the resulting handler is shown below.

Reported-by: Jaesoo Lee <jalee@purestorage.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
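A sketch of the serialized handler under the assumptions above; `my_ctrl_from_rq()` and the teardown helpers are hypothetical stand-ins for the driver's own routines:

```c
#include <linux/blk-mq.h>

struct my_ctrl {
	enum nvme_ctrl_state state;	/* from the nvme core headers */
};

static struct my_ctrl *my_ctrl_from_rq(struct request *rq);	/* hypothetical */
static void my_teardown_io_queues(struct my_ctrl *ctrl);	/* hypothetical */
static void my_teardown_admin_queue(struct my_ctrl *ctrl);	/* hypothetical */
static void my_error_recovery(struct my_ctrl *ctrl);		/* hypothetical */

static enum blk_eh_timer_return my_timeout(struct request *rq, bool reserved)
{
	struct my_ctrl *ctrl = my_ctrl_from_rq(rq);

	if (ctrl->state != NVME_CTRL_LIVE) {
		/* Establishment or deletion flow: tear down queues here,
		 * completing the timed-out request without racing the
		 * error-recovery work. */
		my_teardown_io_queues(ctrl);
		my_teardown_admin_queue(ctrl);
		return BLK_EH_DONE;
	}

	/* Live controller: kick error recovery and extend the timer. */
	my_error_recovery(ctrl);
	return BLK_EH_RESET_TIMER;
}
```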
-
- 19 Dec 2018, 2 commits

Committed by Sagi Grimberg
When passed nr_poll_queues, set up additional queues with the CQ polling context IB_POLL_DIRECT (no interrupts) and make sure to set QUEUE_FLAG_POLL on the connect_q. In addition, add the third queue mapping for polling queues. An nvmf connect on such a queue is polled for like all other requests, so make nvmf_connect_io_queue poll for polling queues.

Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Christoph Hellwig <hch@lst.de>
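A sketch of allocating a CQ in direct polling mode for queues in the poll map; the arguments are simplified:

```c
#include <rdma/ib_verbs.h>

static struct ib_cq *my_create_poll_cq(struct ib_device *ibdev,
				       void *queue, int cq_size)
{
	/* IB_POLL_DIRECT CQs never raise completion interrupts; all
	 * completions must be reaped via mq_ops->poll(). */
	return ib_alloc_cq(ibdev, queue, cq_size,
			   0 /* comp_vector */, IB_POLL_DIRECT);
}
```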
-
Committed by Sagi Grimberg
Preparation for polling support for fabrics. Polling support means that our completion queues are not generating any interrupts, which means we need to poll for the nvmf I/O queue connect as well.

Reviewed by Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Christoph Hellwig <hch@lst.de>
-
- 13 Dec 2018, 2 commits

Committed by Sagi Grimberg
Allow NVMF_OPT_NR_WRITE_QUEUES to describe additional write queues. In addition, implement .map_queues, which applies two queue maps for the read and write queue sets. Note that with the separate queue maps, HCTX_TYPE_READ will always use nr_io_queues and HCTX_TYPE_DEFAULT will use nr_write_queues.

Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Christoph Hellwig <hch@lst.de>
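A sketch of the original two-map layout (writes on the first hardware queues, reads on the ones that follow); note that the 24 Jan 2019 entry above later reworks this to use allocated rather than requested counts:

```c
#include <linux/blk-mq.h>

static int my_map_queues(struct blk_mq_tag_set *set)
{
	struct nvmf_ctrl_options *opts = set->driver_data; /* simplified */

	/* Writes land on the first nr_write_queues hw queues... */
	set->map[HCTX_TYPE_DEFAULT].nr_queues = opts->nr_write_queues;
	set->map[HCTX_TYPE_DEFAULT].queue_offset = 0;

	/* ...reads on the nr_io_queues that follow them. */
	set->map[HCTX_TYPE_READ].nr_queues = opts->nr_io_queues;
	set->map[HCTX_TYPE_READ].queue_offset = opts->nr_write_queues;

	blk_mq_map_queues(&set->map[HCTX_TYPE_DEFAULT]);
	blk_mq_map_queues(&set->map[HCTX_TYPE_READ]);
	return 0;
}
```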
-
Committed by Sagi Grimberg
Will be used by nvme-rdma for queue map separation support.

Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Christoph Hellwig <hch@lst.de>
-
- 08 Dec 2018, 1 commit

Committed by Hannes Reinecke
Instead of directly poking into the struct device, add a new numa_node field to struct nvme_ctrl. This allows fabrics drivers, where ctrl->dev is a virtual device, to support NUMA affinity as well. Also expose the field as a sysfs attribute, and populate it for the RDMA and FC transports.

Signed-off-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
- 05 Dec 2018, 1 commit

Committed by Christoph Hellwig
The code was always a bit of a hack that digs far too much into RDMA core internals. Let's kick it out and reimplement proper dedicated poll queues as needed.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
- 01 Dec 2018, 1 commit

Committed by Prabhath Sajeepa
Some error paths in the configuration of the admin queue free the data buffer associated with the async request SQE without resetting the data buffer pointer to NULL. This buffer is then freed again if the controller is shut down or reset.

Signed-off-by: Prabhath Sajeepa <psajeepa@purestorage.com>
Reviewed-by: Roland Dreier <roland@purestorage.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
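A sketch of the fix: NULL the pointer as soon as the buffer is freed, so a later shutdown/reset path cannot double-free it; the qe structure follows the driver's shape but is simplified:

```c
#include <linux/slab.h>
#include <linux/nvme.h>
#include <rdma/ib_verbs.h>

struct my_qe {
	void *data;
	u64 dma;
};

static void my_free_async_qe(struct ib_device *ibdev, struct my_qe *qe)
{
	ib_dma_unmap_single(ibdev, qe->dma, sizeof(struct nvme_command),
			    DMA_TO_DEVICE);
	kfree(qe->data);
	qe->data = NULL;	/* the fix: mark it freed */
}
```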
-
- 26 Nov 2018, 2 commits

Committed by Jens Axboe
We always pass in -1 now and none of the callers use the tag value; remove the parameter.

Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Committed by Jens Axboe
If we want to support async I/O polling, then we have to allow finding completions that aren't just for the one we are looking for. Always pass in -1 to the mq_ops->poll() helper, and have that return how many events were found in this poll loop.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
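Under the new contract, an RDMA transport's poll hook can simply drain its CQ and report the count; a minimal sketch, assuming the hctx's driver_data points at a queue holding an `ib_cq`:

```c
#include <linux/blk-mq.h>
#include <rdma/ib_verbs.h>

struct my_queue {
	struct ib_cq *ib_cq;
};

static int my_poll(struct blk_mq_hw_ctx *hctx)
{
	struct my_queue *queue = hctx->driver_data;

	/* Budget -1: process completions until the CQ is empty and
	 * return how many were found. */
	return ib_process_cq_direct(queue->ib_cq, -1);
}
```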
-
- 19 Oct 2018, 2 commits

Committed by Sagi Grimberg
IP transports will most likely use the same controller-options matching when detecting a duplicate connect. Move it to fabrics.

Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Christoph Hellwig <hch@lst.de>
-
Committed by Sagi Grimberg
If not passed, we set the default trsvcid. We can then rely on having a trsvcid and can simplify the controller matching logic.

Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Christoph Hellwig <hch@lst.de>
-
- 17 Oct 2018, 1 commit

Committed by Bart Van Assche
Check whether queue->cm_error holds a value before reading it. This patch addresses Coverity ID 1373774: unchecked return value.

Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Christoph Hellwig <hch@lst.de>
-
- 25 Jul 2018, 1 commit

Committed by Bart Van Assche
Instead of declaring and passing a dummy 'bad_wr' pointer, pass NULL as the third argument to ib_post_(send|recv|srq_recv)().

Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Max Gurtovoy <maxg@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
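A before/after sketch of the calling convention; callers that never inspect the failed WR can drop the dummy out-parameter:

```c
#include <rdma/ib_verbs.h>

static int my_post_send(struct ib_qp *qp, struct ib_send_wr *wr)
{
	/* Before:
	 *	struct ib_send_wr *bad_wr;
	 *	return ib_post_send(qp, wr, &bad_wr);
	 * After: */
	return ib_post_send(qp, wr, NULL);
}
```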
-
- 24 Jul 2018, 5 commits

Committed by Sagi Grimberg
We follow the same queue teardown sequence in delete, reset and error recovery. Centralize the logic. This patch does not change any functionality.

Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Christoph Hellwig <hch@lst.de>
-
Committed by Sagi Grimberg
Centralize the controller setup sequence into a single routine that correctly cleans up after failures, instead of having multiple appearances in several flows (create, reset, reconnect). One thing we also gain here are the sanity/boundary checks when connecting back to a dynamic controller.

Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Christoph Hellwig <hch@lst.de>
-
Committed by Sagi Grimberg
If the controller is going away, we need to unquiesce the I/O queues so that all pending requests can fail gracefully before moving forward with controller deletion. Do that before we destroy the I/O queues so blk_cleanup_queue won't block on the freeze.

Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Christoph Hellwig <hch@lst.de>
-
Committed by Gustavo A. R. Silva
In preparation for enabling -Wimplicit-fallthrough, mark switch cases where we are expecting to fall through.

Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
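The annotation is a comment placed exactly where one case body drops into the next; a small illustrative example (the events and helper are arbitrary):

```c
#include <linux/printk.h>
#include <rdma/rdma_cm.h>

static void my_handle_cm_event(int event, void (*recover)(void))
{
	switch (event) {
	case RDMA_CM_EVENT_ADDR_CHANGE:
		pr_info("address changed\n");
		/* fall through */
	case RDMA_CM_EVENT_DISCONNECTED:
	case RDMA_CM_EVENT_TIMEWAIT_EXIT:
		recover();	/* shared recovery path */
		break;
	default:
		break;
	}
}
```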
-
Committed by James Smart
The revised if_ready checks skipped over the case of returning an error when the controller is being deleted. Instead it was returning BUSY, which caused the I/Os to retry, which caused the ns delete to hang waiting for the I/Os to drain. The stack trace of the hang looks like:

    kworker/u64:2 D 0 74 2 0x80000000
    Workqueue: nvme-delete-wq nvme_delete_ctrl_work [nvme_core]
    Call Trace:
     ? __schedule+0x26d/0x820
     schedule+0x32/0x80
     blk_mq_freeze_queue_wait+0x36/0x80
     ? remove_wait_queue+0x60/0x60
     blk_cleanup_queue+0x72/0x160
     nvme_ns_remove+0x106/0x140 [nvme_core]
     nvme_remove_namespaces+0x7e/0xa0 [nvme_core]
     nvme_delete_ctrl_work+0x4d/0x80 [nvme_core]
     process_one_work+0x160/0x350
     worker_thread+0x1c3/0x3d0
     kthread+0xf5/0x130
     ? process_one_work+0x350/0x350
     ? kthread_bind+0x10/0x10
     ret_from_fork+0x1f/0x30

Extend nvmf_fail_nonready_command() to take the controller pointer so that the controller state can be looked at. Fail any I/O to a controller that is deleting.

Fixes: 3bc32bb1 ("nvme-fabrics: refactor queue ready check")
Fixes: 35897b92 ("nvme-fabrics: fix and refine state checks in __nvmf_check_ready")
Signed-off-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Tested-by: Ewan D. Milne <emilne@redhat.com>
Reviewed-by: Ewan D. Milne <emilne@redhat.com>
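With the controller pointer available, the helper can distinguish a deleting controller (fail fast) from a transient reconnect (requeue); a simplified sketch of that decision:

```c
#include <linux/blk-mq.h>

/* struct nvme_ctrl comes from the driver's internal headers;
 * the retry conditions are simplified here. */
static blk_status_t my_fail_nonready_command(struct nvme_ctrl *ctrl,
					     struct request *rq)
{
	/* Transient states: tell blk-mq to requeue and retry later. */
	if (ctrl->state != NVME_CTRL_DELETING &&
	    ctrl->state != NVME_CTRL_DEAD)
		return BLK_STS_RESOURCE;

	/* Deleting/dead: fail the I/O so namespace removal can drain. */
	return BLK_STS_IOERR;
}
```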
-
- 23 Jul 2018, 2 commits

Committed by Sagi Grimberg
We will need to reference the controller at setup and completion time for tracing and future traffic-based keep-alive support.

Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Christoph Hellwig <hch@lst.de>
-
Committed by Steve Wise
Allow up to 4 segments of inline data for NVMF WRITE operations. This reduces latency for small WRITEs by removing the need for the target to issue a READ WR for IB, or a REG_MR + READ WR chain for iWARP.

Also cap the number of inline segments used based on the limitations of the device.

Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Max Gurtovoy <maxg@mellanox.com>
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
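A sketch of the device cap: the number of inline segments is bounded by how many send SGEs the device leaves free beyond the one used for the command itself; the constant and helper names are illustrative:

```c
#include <rdma/ib_verbs.h>

#define MY_MAX_INLINE_SEGMENTS 4

static u32 my_max_inline_segments(struct ib_device *ibdev)
{
	/* One send SGE is always consumed by the NVMe command SQE. */
	u32 avail = ibdev->attrs.max_send_sge - 1;

	return min_t(u32, MY_MAX_INLINE_SEGMENTS, avail);
}
```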
-
- 28 Jun 2018, 1 commit

Committed by Sagi Grimberg
If a reconnect/reset failed at a point where the controller async event buffer was freed, we might end up freeing it again, as we call nvme_rdma_destroy_admin_queue again in the remove path. Given that the sequence is guaranteed to serialize by .ctrl_stop, we simply set ctrl->async_event_sqe.data to NULL and don't free it in future visits.

Reported-by: Max Gurtovoy <maxg@mellanox.com>
Tested-by: Max Gurtovoy <maxg@mellanox.com>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Christoph Hellwig <hch@lst.de>
-
- 20 Jun 2018, 4 commits

Committed by Sagi Grimberg
That is a user argument, and theoretically controller limits can change over time (across reconnects/resets). Instead, use the sqsize controller attribute to check queue-depth boundaries and use it for the tagset allocation.

Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Christoph Hellwig <hch@lst.de>
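A small sketch of the tagset sizing, assuming the usual 0-based sqsize convention:

```c
#include <linux/blk-mq.h>

static void my_size_tagset(struct blk_mq_tag_set *set, u16 sqsize)
{
	/* sqsize is 0-based, so the usable queue depth is sqsize + 1;
	 * derive it from the controller, not from opts->queue_size. */
	set->queue_depth = sqsize + 1;
}
```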
-
Committed by Israel Rukshin
The race is between completing the request at error recovery work and RDMA completions. If we cancel the request before getting the good RDMA completion, we get a NULL deref of the request MR at nvme_rdma_process_nvme_rsp(). When canceling the request we return its MR to the MR pool (set the MR to NULL) and also unmap its data.

Canceling the requests while the RDMA queues are active is not safe: while they are active, we can get good RDMA completions that use the MR pointer, which may be NULL. Completing the request too soon may also lead to performing DMA to/from user buffers which might have already been unmapped.

The commit fixes the race by draining the QP before starting the abort-commands mechanism.

Signed-off-by: Israel Rukshin <israelr@mellanox.com>
Reviewed-by: Max Gurtovoy <maxg@mellanox.com>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Christoph Hellwig <hch@lst.de>
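A sketch of the ordering the fix establishes; the teardown details are simplified:

```c
#include <linux/blk-mq.h>
#include <rdma/ib_verbs.h>

/* nvme_cancel_request comes from the nvme core headers. */
static void my_teardown_queue(struct ib_qp *qp,
			      struct blk_mq_tag_set *tagset, void *ctrl)
{
	/* After ib_drain_qp() returns, no RDMA completion can still be
	 * in flight for this QP... */
	ib_drain_qp(qp);

	/* ...so canceling requests can no longer race with completions
	 * touching a NULLed MR or already-unmapped buffers. */
	blk_mq_tagset_busy_iter(tagset, nvme_cancel_request, ctrl);
}
```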
-
Committed by Sagi Grimberg
If nvme_rdma_configure_admin_queue fails before we allocated the async event buffer, we will falsely free it, because nvme_rdma_free_queue is freeing it. Fix it by allocating the buffer right after nvme_rdma_alloc_queue and freeing it right before nvme_rdma_free_queue, to maintain an orderly reverse cleanup sequence.

Reported-by: Israel Rukshin <israelr@mellanox.com>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Max Gurtovoy <maxg@mellanox.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
-
Committed by Sagi Grimberg
Failures after nvme_init_ctrl will defer resource cleanups to .free_ctrl when the reference is released; hence we should not free the controller queues for these failures. Fix that by moving controller queue allocation before controller initialization, correctly freeing the queues for failures before initialization, and skipping them for failures after initialization.

Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Christoph Hellwig <hch@lst.de>
-
- 15 Jun 2018, 1 commit

Committed by Christoph Hellwig
Move the is_connected check to the Fibre Channel transport, as it has no meaning for other transports. To facilitate this, split out a new nvmf_fail_nonready_command helper that is called by the transport when it is asked to handle a command on a queue that is not ready. Also avoid a function call on the queue-live fast path by inlining the check.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: James Smart <james.smart@broadcom.com>
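A sketch of the inlined fast path (the 24 Jul 2018 entry above later extends the slow-path helper with a controller argument); `my_fail_nonready_command` stands in for the out-of-line helper:

```c
#include <linux/blk-mq.h>

static blk_status_t my_fail_nonready_command(struct request *rq); /* out of line */

static inline blk_status_t my_check_ready(enum nvme_ctrl_state state,
					  struct request *rq)
{
	/* Fast path: a live controller costs only this inline check. */
	if (likely(state == NVME_CTRL_LIVE))
		return BLK_STS_OK;

	/* Slow path: queue not ready; decide between requeue and fail. */
	return my_fail_nonready_command(rq);
}
```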
-
- 11 Jun 2018, 1 commit

Committed by Max Gurtovoy
After DMA-mapping the SG list, we map it to an NVMe SGL descriptor. In case of failure during this last mapping, we never DMA-unmap the SG list.

Signed-off-by: Max Gurtovoy <maxg@mellanox.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
-
- 09 Jun 2018, 1 commit

Committed by Steve Wise
The code was checking bit 20 instead of bit 2. Also fixed the log entry.

Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
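For context, a sketch of the shape of such a capability test; the field, device, and message here are illustrative, not the driver's actual check:

```c
#include <linux/device.h>

static int my_check_cap_bit(u32 caps, struct device *dev)
{
	/* Bit 2 is (1 << 2) == 0x4; bit 20 would be (1 << 20). */
	if (!(caps & (1 << 2))) {
		dev_err(dev, "required capability (bit 2) not supported\n");
		return -EINVAL;
	}
	return 0;
}
```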
-
- 29 May 2018, 1 commit

Committed by Christoph Hellwig
NVMe always completes the request before returning from ->timeout, either by polling for it or by disabling the controller. Return BLK_EH_DONE so that the block layer doesn't even try to complete it again.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
-