- 06 7月, 2017 4 次提交
-
-
由 Sagi Grimberg 提交于
When our RDMA queue-pair is torn down with high load of I/O traffic, we have no way of knowing if the memory region was actually registered by the reg_mr work request as it completion flushes with error (hw might have done it or not). So in order to not deal with all this uncertanty, we simply recycle the MR in reinit_request. Reviewed-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NSagi Grimberg <sagi@grimberg.me>
-
由 Sagi Grimberg 提交于
Usually before we teardown the controller we want to: 1. complete/cancel any ctrl inflight works 2. remove ctrl namespaces (only for removal though, resets shouldn't remove any namespaces). but we do not want to destroy the controller device as we might use it for logging during the teardown stage. This patch adds nvme_start_ctrl() which queues inflight controller works (aen, ns scan, queue start and keep-alive if kato is set) and nvme_stop_ctrl() which cancels the works namespace removal is left to the callers to handle. Move nvme_uninit_ctrl after we are done with the controller device. Reviewed-by: NKeith Busch <keith.busch@intel.com> Reviewed-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NSagi Grimberg <sagi@grimberg.me>
-
由 Sagi Grimberg 提交于
unlike blk_mq_stop_hw_queues and blk_mq_start_stopped_hw_queues quiescing/unquiescing respects the submission path rcu grace. Also make sure to kick the requeue list when appropriate. Reviewed-by: NMing Lei <ming.lei@redhat.com> Signed-off-by: NSagi Grimberg <sagi@grimberg.me>
-
由 Marta Rybczynska 提交于
This patch improves the way the RDMA IB signalling is done by using atomic operations for the signalling variable. This avoids race conditions on sig_count. The signalling interval changes slightly and is now the largest power of two not larger than queue depth / 2. ilog() usage idea by Bart Van Assche. Signed-off-by: NMarta Rybczynska <marta.rybczynska@kalray.eu> Reviewed-by: NSagi Grimberg <sagi@grimberg.me> Signed-off-by: NChristoph Hellwig <hch@lst.de> Cc: stable@vger.kernel.org
-
- 04 7月, 2017 1 次提交
-
-
由 Sagi Grimberg 提交于
We might have more/less queues once we reconnect/reset. For example due to cpu going online/offline or controller constraints. Reviewed-by: NJohannes Thumshirn <jthumshirn@suse.de> Reviewed-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NSagi Grimberg <sagi@grimberg.me>
-
- 02 7月, 2017 2 次提交
-
-
由 Sagi Grimberg 提交于
All transports use either a private cache of controller cap or an on-stack copy, move it to the generic struct nvme_ctrl. In the future it will also be maintained by the core. Reviewed-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NJohannes Thumshirn <jthumshirn@suse.de> Reviewed-by: NMax Gurtovoy <maxg@mellanox.com> Signed-off-by: NSagi Grimberg <sagi@grimberg.me>
-
由 Sagi Grimberg 提交于
All all transports use the queue_count in exactly the same, so move it to the generic struct nvme_ctrl. In the future it will also be maintained by the core. Reviewed-by: NChristoph Hellwig <hch@lst.de> Reviewed-By: NJames Smart <james.smart@broadcom.com> Reviewed-by: NJohannes Thumshirn <jthumshirn@suse.de> Reviewed-by: NMax Gurtovoy <maxg@mellanox.com> Signed-off-by: NSagi Grimberg <sagi@grimberg.me>
-
- 28 6月, 2017 2 次提交
-
-
由 Christoph Hellwig 提交于
NVMe 1.2.1 or later requires controllers to provide a subsystem NQN in the Identify controller data structures. Use this NQN for the subsysnqn sysfs attribute by storing it in the nvme_ctrl structure after verifying it. For older controllers we generate a "fake" NQN per non-normative text in the NVMe 1.3 spec. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NKeith Busch <keith.busch@intel.com> Reviewed-by: NJohannes Thumshirn <jthumshirn@suse.de> Signed-off-by: NSagi Grimberg <sagi@grimberg.me> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Sagi Grimberg 提交于
No need to differentiate fabrics from pci/loop, also lower it to 32 as we don't really need 256 inflight admin commands. Signed-off-by: NSagi Grimberg <sagi@grimberg.me> Reviewed-by: NMartin K. Petersen <martin.petersen@oracle.com> Reviewed-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NMax Gurtovoy <maxg@mellanox.com> Signed-off-by: NKeith Busch <keith.busch@intel.com> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
- 15 6月, 2017 11 次提交
-
-
由 Christoph Hellwig 提交于
This moves the nvme_reset function from the PCIe driver to common code, renaming it to nvme_reset_ctrl in the process. Additionally a new helper nvme_reset_ctrl_sync is added for the case where we want to wait for the reset. To facilitate that the reset_work work structure is move to the common nvme_ctrl structure and the ->reset_ctrl method is removed. For now the drivers initialize the reset_work with their own callback, but longer term we should move to callouts for specific parts of the reset process and move even more code to the core. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NSagi Grimberg <sagi@grimberg.me>
-
由 Christoph Hellwig 提交于
Now that we get the tagset passed we can have a single implementation for the I/O and admin queues. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NMax Gurtovoy <maxg@mellanox.com> Reviewed-by: NSagi Grimberg <sagi@grimberg.me> Signed-off-by: NChristoph Hellwig <hch@lst.de>
-
由 Dan Carpenter 提交于
We accidentally return ERR_PTR(0) which is NULL. The caller isn't explicitly checking for that but I couldn't immediately spot whether this would lead to a NULL dereference. Anyway, we can fix add an error code easily enough. Signed-off-by: NDan Carpenter <dan.carpenter@oracle.com> Signed-off-by: NChristoph Hellwig <hch@lst.de>
-
由 Sagi Grimberg 提交于
It is not a user option but rather a variable controller attribute. Signed-off-by: NSagi Grimberg <sagi@grimberg.me> Signed-off-by: NChristoph Hellwig <hch@lst.de>
-
由 Sagi Grimberg 提交于
Instead of each transport using it's own workqueue, export a single nvme-core workqueue and use that instead. In the future, this will help us moving towards some unification if controller setup/teardown flows. Signed-off-by: NSagi Grimberg <sagi@grimberg.me> Signed-off-by: NChristoph Hellwig <hch@lst.de>
-
由 Sagi Grimberg 提交于
We only care about if the queue is LIVE for request submission, so no need for CONNECTED. Signed-off-by: NSagi Grimberg <sagi@grimberg.me> Signed-off-by: NChristoph Hellwig <hch@lst.de>
-
由 Sagi Grimberg 提交于
Instead of introducing a flag for if the queue is allocated, simply free the rdma resources when we get the error. We allocate the queue rdma resources when we have an address resolution, their we allocate (or take a reference on) our device so we should free it when we have error after the address resolution namely: 1. route resolution error 2. connect reject 3. connect error 4. peer unreachable error Signed-off-by: NSagi Grimberg <sagi@grimberg.me> Signed-off-by: NChristoph Hellwig <hch@lst.de>
-
由 Sagi Grimberg 提交于
We put the reference on the device in the destroy routine so we should lookup and take the reference in the create routine. Signed-off-by: NSagi Grimberg <sagi@grimberg.me> Signed-off-by: NChristoph Hellwig <hch@lst.de>
-
由 Sagi Grimberg 提交于
We don't need it as the core polling context will take are of rearming the completion queue. Signed-off-by: NSagi Grimberg <sagi@grimberg.me> Signed-off-by: NChristoph Hellwig <hch@lst.de>
-
由 Sagi Grimberg 提交于
bitops accept bit numbers. Reported-by: NVijay Immanuel <vijayi@attalasystems.com> Signed-off-by: NSagi Grimberg <sagi@grimberg.me> Signed-off-by: NChristoph Hellwig <hch@lst.de>
-
由 Sagi Grimberg 提交于
Signed-off-by: NSagi Grimberg <sagi@grimberg.me> Signed-off-by: NChristoph Hellwig <hch@lst.de>
-
- 13 6月, 2017 1 次提交
-
-
由 Christoph Hellwig 提交于
The merge of 4.12-rc5 into the for-4.13/block tree didn't handle the queue ready case correctly. Fix this by propagating blk_status_t into nvme_rdma_queue_is_ready. Signed-off-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NJens Axboe <axboe@fb.com>
-
- 09 6月, 2017 1 次提交
-
-
由 Christoph Hellwig 提交于
Use the same values for use for request completion errors as the return value from ->queue_rq. BLK_STS_RESOURCE is special cased to cause a requeue, and all the others are completed as-is. Signed-off-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NJens Axboe <axboe@fb.com>
-
- 07 6月, 2017 1 次提交
-
-
由 Sagi Grimberg 提交于
When we encounter an transport/controller errors, error recovery kicks in which performs: 1. stops io/admin queues 2. moves transport queues out of LIVE state 3. fast fail pending io 4. schedule periodic reconnects. But we also need to fast fail incoming IO taht enters after we already scheduled. Given that our queue is not LIVE anymore, simply restart the request queues to fail in .queue_rq Reported-by: NAlex Turin <alex@vastdata.com> Reported-by: Nshahar.salzman <shahar.salzman@gmail.com> Signed-off-by: NSagi Grimberg <sagi@grimberg.me> Signed-off-by: NChristoph Hellwig <hch@lst.de> Cc: stable@vger.kernel.org
-
- 26 5月, 2017 1 次提交
-
-
由 Christoph Hellwig 提交于
So that we can have more flags for transport-specific behavior. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NKeith Busch <keith.busch@intel.com>
-
- 23 5月, 2017 1 次提交
-
-
由 Marta Rybczynska 提交于
In the case of small NVMe-oF queue size (<32) we may enter a deadlock caused by the fact that the IB completions aren't sent waiting for 32 and the send queue will fill up. The error is seen as (using mlx5): [ 2048.693355] mlx5_0:mlx5_ib_post_send:3765:(pid 7273): [ 2048.693360] nvme nvme1: nvme_rdma_post_send failed with error code -12 This patch changes the way the signaling is done so that it depends on the queue depth now. The magic define has been removed completely. Cc: stable@vger.kernel.org Signed-off-by: NMarta Rybczynska <marta.rybczynska@kalray.eu> Signed-off-by: NSamuel Jones <sjones@kalray.eu> Acked-by: NSagi Grimberg <sagi@grimberg.me> Signed-off-by: NChristoph Hellwig <hch@lst.de>
-
- 02 5月, 2017 1 次提交
-
-
由 Christoph Hellwig 提交于
Remove the request_idx parameter, which can't be used safely now that we support I/O schedulers with blk-mq. Except for a superflous check in mtip32xx it was unused anyway. Also pass the tag_set instead of just the driver data - this allows drivers to avoid some code duplication in a follow on cleanup. Signed-off-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NJens Axboe <axboe@fb.com>
-
- 21 4月, 2017 1 次提交
-
-
由 Christoph Hellwig 提交于
We want our own clearly defined error field for NVMe passthrough commands, and the request errors field is going away in its current form. Just store the status and result field in the nvme_request field from hardirq completion context (using a new helper) and then generate a Linux errno for the block layer only when we actually need it. Because we can't overload the status value with a negative error code for cancelled command we now have a flags filed in struct nvme_request that contains a bit for this condition. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NJohannes Thumshirn <jthumshirn@suse.de> Signed-off-by: NJens Axboe <axboe@fb.com>
-
- 10 4月, 2017 1 次提交
-
-
由 Sagi Grimberg 提交于
both our sqsize and the controller MQES cap are a 0 based value, so making it 1 based is wrong. Reported-by: NTrapp, Darren <Darren.Trapp@cavium.com> Reported-by: NDaniel Verkamp <daniel.verkamp@intel.com> Signed-off-by: NSagi Grimberg <sagi@grimberg.me> Reviewed-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NJens Axboe <axboe@fb.com>
-
- 04 4月, 2017 8 次提交
-
-
由 Christoph Hellwig 提交于
This avoids duplicating the logic four times, and it also allows to keep some helpers static in core.c or just opencode them. Note that this loses printing the aborted status on completions in the PCI driver as that uses a data structure not available any more. Signed-off-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NSagi Grimberg <sagi@grimberg.me> Signed-off-by: NJens Axboe <axboe@fb.com>
-
由 Sagi Grimberg 提交于
This way our max retry limit holds as well. Signed-off-by: NSagi Grimberg <sagi@grimberg.me> Reviewed-by: NMax Gurtovoy <maxg@mellanox.com> Reviewed-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NJens Axboe <axboe@fb.com>
-
由 Sagi Grimberg 提交于
Before scheduling a reconnect attempt, check nr_reconnects against max_reconnects, if not exhausted (or max_reconnects is not -1), schedule a reconnect attempts, otherwise schedule ctrl removal. Reviewed-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NSagi Grimberg <sagi@grimberg.me> Signed-off-by: NJens Axboe <axboe@fb.com>
-
由 Sagi Grimberg 提交于
we already have it in opts. Reviewed-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NSagi Grimberg <sagi@grimberg.me> Signed-off-by: NJens Axboe <axboe@fb.com>
-
由 Sagi Grimberg 提交于
If nvmf_register_transport happened to fail (it can't, but theoretically) we leak memory. Signed-off-by: NSagi Grimberg <sagi@grimberg.me> Signed-off-by: NJens Axboe <axboe@fb.com>
-
由 Sagi Grimberg 提交于
Both the destination and the host addresses are now parsed using inet_pton_with_scope helper. We also get ipv6 (with address scopes support). Reviewed-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NSagi Grimberg <sagi@grimberg.me> Signed-off-by: NJens Axboe <axboe@fb.com>
-
由 Sagi Grimberg 提交于
The target might be occupied with multiple hosts so lets give it some more grace before failing the connection establishment. Signed-off-by: NSagi Grimberg <sagi@grimberg.me> Signed-off-by: NJens Axboe <axboe@fb.com>
-
由 Sagi Grimberg 提交于
If a cpu unplug event has occured, we need to take the minimum of the provided nr_io_queues and the number of online cpus, otherwise we won't be able to connect them as blk-mq mapping won't dispatch to those queues. Reviewed-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NSagi Grimberg <sagi@grimberg.me> Signed-off-by: NJens Axboe <axboe@fb.com>
-
- 31 3月, 2017 1 次提交
-
-
由 Eric Biggers 提交于
Constify all instances of blk_mq_ops, as they are never modified. Signed-off-by: NEric Biggers <ebiggers@google.com> Signed-off-by: NJens Axboe <axboe@fb.com>
-
- 22 3月, 2017 1 次提交
-
-
由 Sagi Grimberg 提交于
If a cpu unplug event has occured, we need to take the minimum of the provided nr_io_queues and the number of online cpus, otherwise we won't be able to connect them as blk-mq mapping won't dispatch to those queues. Reviewed-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NSagi Grimberg <sagi@grimberg.me>
-
- 28 2月, 2017 1 次提交
-
-
由 Masahiro Yamada 提交于
Fix typos and add the following to the scripts/spelling.txt: embeded||embedded Link: http://lkml.kernel.org/r/1481573103-11329-12-git-send-email-yamada.masahiro@socionext.comSigned-off-by: NMasahiro Yamada <yamada.masahiro@socionext.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 23 2月, 2017 1 次提交
-
-
由 Max Gurtovoy 提交于
This will enable the user to control the specific interface for connection establishment in case the host has more than 1 interface under the same subnet. E.g: Host interfaces configured as: - ib0 1.1.1.1/16 - ib1 1.1.1.2/16 Target interfaces configured as: - ib0 1.1.1.3/16 (listener interface) - ib1 1.1.1.4/16 the following connect command will go through host iface ib0 (default): nvme connect -t rdma -n testsubsystem -a 1.1.1.3 -s 1023 but the following command will go through host iface ib1: nvme connect -t rdma -n testsubsystem -a 1.1.1.3 -s 1023 -w 1.1.1.2 Signed-off-by: NMax Gurtovoy <maxg@mellanox.com> Reviewed-by: NParav Pandit <parav@mellanox.com> Reviewed-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NSagi Grimberg <sagi@grimberg.me> Signed-off-by: NJens Axboe <axboe@fb.com>
-