- 27 10月, 2017 9 次提交
-
-
由 James Smart 提交于
Adds support for the duplicate_connect option. When set to true, checks whether there's an existing controller via the same host port and target port for the same host (hostnqn, hostid) to the same subsystem. Fails the connection request if an existing controller. Signed-off-by: NJames Smart <james.smart@broadcom.com> Reviewed-by: NJohannes Thumshirn <jthumshirn@suse.de> Signed-off-by: NChristoph Hellwig <hch@lst.de>
-
由 James Smart 提交于
Adds support for the duplicate_connect option. When set to true, checks whether there's an existing controller via the same target address (traddr), target port (trsvcid), and if specified, host address (host_traddr). Fails the connection request if there is an existing controller. Signed-off-by: NJames Smart <james.smart@broadcom.com> Reviewed-by: NJohannes Thumshirn <jthumshirn@suse.de> Signed-off-by: NChristoph Hellwig <hch@lst.de>
-
由 James Smart 提交于
Adds a helper function that compares the host and subsytem specified in a connect options list vs a controller. Signed-off-by: NJames Smart <james.smart@broadcom.com> Reviewed-by: NJohannes Thumshirn <jthumshirn@suse.de> Signed-off-by: NChristoph Hellwig <hch@lst.de>
-
由 James Smart 提交于
Add the "duplicate_connect" boolean option (presence means true). Default is false. When false, the transport should validate whether a new controller request is targeted for the same host transport addressing and target transport addressing as an existing controller. If so, the new controller request should be rejected. When true, the callee is explicitly requesting a duplicate controller connection to be made and the new request should be attempted. Signed-off-by: NJames Smart <james.smart@broadcom.com> Reviewed-by: NJohannes Thumshirn <jthumshirn@suse.de> Signed-off-by: NChristoph Hellwig <hch@lst.de>
-
由 Christoph Hellwig 提交于
This is a much more sensible check than just the admin queue. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NSagi Grimberg <sagi@rimbeg.me> Reviewed-by: NHannes Reinecke <hare@suse.com> Reviewed-by: NJohannes Thumshirn <jthumshirn@suse.de>
-
由 Christoph Hellwig 提交于
Use the core chrdev code to set up the link between the character device and the nvme controller. This allows us to get rid of the global list of all controllers, and also ensures that we have both a reference to the controller and the transport module before the open method of the character device is called. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NSagi Grimberg <sgi@grimberg.me> Reviewed-by: NJohannes Thumshirn <jthumshirn@suse.de> Reviewed-by: NHannes Reinecke <hare@suse.com>
-
由 Christoph Hellwig 提交于
Instead of allocating a separate struct device for the character device handle embedd it into struct nvme_ctrl and use it for the main controller refcounting. This removes double refcounting and gets us an automatic reference for the character device operations. We keep ctrl->device as a pointer for now to avoid chaning printks all over, but in the future we could look into message printing helpers that take a controller structure similar to what other subsystems do. Note the delete_ctrl operation always already has a reference (either through sysfs due this change, or because every open file on the /dev/nvme-fabrics node has a refernece) when it is entered now, so we don't need to do the unless_zero variant there. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NHannes Reinecke <hare@suse.com>
-
由 Christoph Hellwig 提交于
Now that we are protected against lookup vs free races for the namespace by using kref_get_unless_zero we don't need the hack of NULLing out the disk private data during removal. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NSagi Grimberg <sagi@grimberg.me> Reviewed-by: NHannes Reinecke <hare@suse.com> Reviewed-by: NJohannes Thumshirn <jthumshirn@suse.de>
-
由 Christoph Hellwig 提交于
For kref_get_unless_zero to protect against lookup vs free races we need to use it in all places where we aren't guaranteed to already hold a reference. There is no such guarantee in nvme_find_get_ns, so switch to kref_get_unless_zero in this function. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NSagi Grimberg <sagi@grimberg.me> Reviewed-by: NHannes Reinecke <hare@suse.com> Reviewed-by: NJohannes Thumshirn <jthumshirn@suse.de>
-
- 23 10月, 2017 2 次提交
-
-
由 Nitzan Carmi 提交于
Signed-off-by: NNitzan Carmi <nitzanc@mellanox.com> Reviewed-by: NMax Gurtovoy <maxg@mellanox.com> Signed-off-by: NChristoph Hellwig <hch@lst.de>
-
由 Max Gurtovoy 提交于
Signed-off-by: NMax Gurtovoy <maxg@mellanox.com> Signed-off-by: NChristoph Hellwig <hch@lst.de>
-
- 20 10月, 2017 4 次提交
-
-
由 James Smart 提交于
The transport io timeout behavior wasn't quite correct. It ignored that the io error handler is supposed to be synchronous so it possibly allowed the blk request to be restarted while the io associated was still aborting. Timeouts on reserved commands, those used for association create, were never timing out thus they hung out forever. To correct: If an io is times out while a remoteport is not connected, just restart the io timer. The lack of connectivity will simultaneously be resetting the controller, so the reset path will abort and terminate the io. If an io is times out while it was marked for transport abort, just reset the io timer. The abort process is underway and will complete the io. Otherwise, if an io times out, abort the io. If the abort was unsuccessful (unlikely) give up and return not handled. If the abort was successful, as the abort process is underway it will terminate the io, so rather than synchronously waiting, just restart the io timer. Signed-off-by: NJames Smart <james.smart@broadcom.com> Reviewed-by: NJohannes Thumshirn <jthumshirn@suse.de> Signed-off-by: NChristoph Hellwig <hch@lst.de>
-
由 James Smart 提交于
The io completion handling for i/o's that are failing due to to a transport error or association termination had issues, causing io failures (DNR set so retries didn't kick in) or long stalls. Change the io completion handler for the following items: When an io has been completed due to a transport abort (based on an exchange error) or when marked as aborted as part of an association termination (FCOP_FLAGS_TERMIO), set the NVME completion status to NVME_SC_ABORTED. By default, do not set DNR on the status so that a retry can be attempted after association recreate. In cases where an io is failed (non-successful nvme status including aborted), if the controller is being deleted (blk_queue_dying) or the io was part of the ios used for association creation (ctrl state is NEW or RECONNECTING), then additionally set the DNR bit so the io will not be retried. If the failed io was part of association creation, the failure will tear down the partially completioned association and typically restart a new reconnect attempt (another create association later). Rearranged code flow to remove a largely unneeded local variable. Signed-off-by: NJames Smart <james.smart@broadcom.com> Reviewed-by: NJohannes Thumshirn <jthumshirn@suse.de> Signed-off-by: NChristoph Hellwig <hch@lst.de>
-
由 Chaitanya Kulkarni 提交于
This adds SGL support for NVMe PCIe driver, based on an earlier patch from Rajiv Shanmugam Madeswaran <smrajiv15 at gmail.com>. This patch refactors the original code and adds new module parameter sgl_threshold to determine whether to use SGL or PRP for IOs. The usage of SGLs is controlled by the sgl_threshold module parameter, which allows to conditionally use SGLs if average request segment size (avg_seg_size) is greater than sgl_threshold. In the original patch, the decision of using SGLs was dependent only on the IO size, with the new approach we consider not only IO size but also the number of physical segments present in the IO. We calculate avg_seg_size based on request payload bytes and number of physical segments present in the request. For e.g.:- 1. blk_rq_nr_phys_segments = 2 blk_rq_payload_bytes = 8k avg_seg_size = 4K use sgl if avg_seg_size >= sgl_threshold. 2. blk_rq_nr_phys_segments = 2 blk_rq_payload_bytes = 64k avg_seg_size = 32K use sgl if avg_seg_size >= sgl_threshold. 3. blk_rq_nr_phys_segments = 16 blk_rq_payload_bytes = 64k avg_seg_size = 4K use sgl if avg_seg_size >= sgl_threshold. Signed-off-by: NChaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Reviewed-by: NKeith Busch <keith.busch@intel.com> Reviewed-by: NSagi Grimberg <sagi@grimberg.me> Signed-off-by: NChristoph Hellwig <hch@lst.de>
-
由 Christoph Hellwig 提交于
Switch to the ida_simple_* helpers instead of opencoding them. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NKeith Busch <keith.busch@intel.com> Reviewed-by: NSagi Grimberg <sagi@grimberg.me> Reviewed-by: NJohannes Thumshirn <jthumshirn@suse.de>
-
- 19 10月, 2017 12 次提交
-
-
由 Israel Rukshin 提交于
Since commit b86dd815 "block: get rid of blk-mq default scheduler choice Kconfig entries", when setting nr_hw_queues to 1 the admin tag set uses mq-deadline scheduler. This flag is useful for admin queues that aren't used for normal IO. Signed-off-by: NIsrael Rukshin <israelr@mellanox.com> Reviewed-by: NMax Gurtovoy <maxg@mellanox.com> Reviewed-by: NSagi Grimberg <sagi@grimberg.me> Reviewed-by: NJames Smart <james.smart@broadcom.com> Signed-off-by: NChristoph Hellwig <hch@lst.de>
-
由 Israel Rukshin 提交于
Since commit b86dd815 "block: get rid of blk-mq default scheduler choice Kconfig entries", when setting nr_hw_queues to 1 the admin tag set uses mq-deadline scheduler. This flag is useful for admin queues that aren't used for normal IO. Signed-off-by: NIsrael Rukshin <israelr@mellanox.com> Reviewed-by: NSagi Grimberg <sagi@grimberg.me> Reviewed-by: NMax Gurtovoy <maxg@mellanox.com> Signed-off-by: NChristoph Hellwig <hch@lst.de>
-
由 Minwoo Im 提交于
fixed comment typos in adapter_alloc_cq() and adapter_alloc_sq(). 'the the' duplications are replaced with 'that the'. Signed-off-by: NMinwoo Im <dn3108@gmail.com> Signed-off-by: NChristoph Hellwig <hch@lst.de>
-
由 Sagi Grimberg 提交于
If the controller is deleting (in case the user decided to delete it), we have no point to continue reset sequence. Signed-off-by: NSagi Grimberg <sagi@grimberg.me> Reviewed-by: NJohannes Thumshirn <jthumshirn@suse.de> Signed-off-by: NChristoph Hellwig <hch@lst.de>
-
由 Sagi Grimberg 提交于
Instead of marking we are deleting, mark we are allocated and check that instead. This makes the logic symmetrical to connected mark check. Signed-off-by: NSagi Grimberg <sagi@grimberg.me> Signed-off-by: NChristoph Hellwig <hch@lst.de>
-
由 Sagi Grimberg 提交于
No chance for the local invalidate to succeed if the queue-pair is in error state. Most likely the target will do a remote invalidation of our mr so not a big loss on the test_bit. Signed-off-by: NSagi Grimberg <sagi@grimberg.me> Signed-off-by: NChristoph Hellwig <hch@lst.de>
-
由 Sagi Grimberg 提交于
Relying on the queue state while tearing down on every reconnect attempt is not a good design. We should do it once in err_work and simply try to establish the queues for each reconnect attempt. Signed-off-by: NSagi Grimberg <sagi@grimberg.me> Signed-off-by: NChristoph Hellwig <hch@lst.de>
-
由 Sagi Grimberg 提交于
Warn if req->mr is NULL as it should never happen. Reviewed-by: NJohannes Thumshirn <jthumshirn@suse.de> Signed-off-by: NSagi Grimberg <sagi@grimberg.me> Signed-off-by: NChristoph Hellwig <hch@lst.de>
-
由 Sagi Grimberg 提交于
No need for the extra line for trivial assignments. Signed-off-by: NSagi Grimberg <sagi@grimberg.me> Reviewed-by: NMax Gurtovoy <maxg@mellanox.com> Reviewed-by: NJohannes Thumshirn <jthumshirn@suse.de> Signed-off-by: NChristoph Hellwig <hch@lst.de>
-
由 Sagi Grimberg 提交于
Not necessarily address resolution failed. Signed-off-by: NSagi Grimberg <sagi@grimberg.me> Reviewed-by: NMax Gurtovoy <maxg@mellanox.com> Reviewed-by: NJohannes Thumshirn <jthumshirn@suse.de> Signed-off-by: NChristoph Hellwig <hch@lst.de>
-
由 Sagi Grimberg 提交于
Instead of flagging admin/io. Signed-off-by: NSagi Grimberg <sagi@grimberg.me> Reviewed-by: NMax Gurtovoy <maxg@mellanox.com> Reviewed-by: NJohannes Thumshirn <jthumshirn@suse.de> Signed-off-by: NChristoph Hellwig <hch@lst.de>
-
由 Sagi Grimberg 提交于
Move blk_mq_reinit_tagset from blk-mq to nvme core as the only user of it. Current transports that use it (rdma, fc) simply implement .reinit_request op. This patch does not change any functionality. Reviewed-by: NJens Axboe <axboe@kernel.dk> Reviewed-by: NMax Gurtovoy <maxg@mellanox.com> Reviewed-by: NJohannes Thumshirn <jthumshirn@suse.de> Signed-off-by: NSagi Grimberg <sagi@grimberg.me> Signed-off-by: NChristoph Hellwig <hch@lst.de>
-
- 16 10月, 2017 1 次提交
-
-
由 Christoph Hellwig 提交于
We can just use our normal ioctl handler for the compat case and remove the boilerplate code for it. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NKeith Busch <keith.busch@intel.com> Reviewed-by: NSagi Grimberg <sagi@grimberg.me> Reviewed-by: NMax Gurtovoy <maxg@mellanox.com>
-
- 05 10月, 2017 1 次提交
-
-
由 James Smart 提交于
move nvme_fc_rport_get/put and rport free to higher in the file to avoid adding prototypes to resolve references in upcoming code additions Signed-off-by: NJames Smart <james.smart@broadcom.com> Signed-off-by: NChristoph Hellwig <hch@lst.de>
-
- 04 10月, 2017 4 次提交
-
-
由 James Smart 提交于
Added a new fc class and a device node for udev events under it. I expect the fc class will eventually be the location where the FC SCSI and FC NVME merge in the future. Therefore names are kept somewhat generic. Signed-off-by: NJames Smart <james.smart@broadcom.com> Reviewed-by: NJohannes Thumshirn <jthumshirn@suse.de> Signed-off-by: NChristoph Hellwig <hch@lst.de>
-
由 James Smart 提交于
To support auto-connecting to FC-NVME devices upon their dynamic appearance, add a uevent that can kick off connection scripts. uevent is posted against the fc_udev device. patch set tested with the following rule to kick an nvme-cli connect-all for the FC initiator and FC target ports. This is just an example for testing and not intended for real life use. ACTION=="change", SUBSYSTEM=="fc", ENV{FC_EVENT}=="nvmediscovery", \ ENV{NVMEFC_HOST_TRADDR}=="*", ENV{NVMEFC_TRADDR}=="*", \ RUN+="/bin/sh -c '/usr/local/sbin/nvme connect-all --transport=fc --host-traddr=$env{NVMEFC_HOST_TRADDR} --traddr=$env{NVMEFC_TRADDR} >> /tmp/nvme_fc.log'" I will post proposed udev/systemd scripts for possible kernel support. Signed-off-by: NJames Smart <james.smart@broadcom.com> Signed-off-by: NChristoph Hellwig <hch@lst.de>
-
由 Sagi Grimberg 提交于
Help userspace to make sure transport module is loaded. Signed-off-by: NSagi Grimberg <sagi@grimberg.me> Reviewed-by: NJohannes Thumshirn <jthumshirn@suse.de> Signed-off-by: NChristoph Hellwig <hch@lst.de>
-
由 Marc Olson 提交于
The underlying blk_mq_tag_set, and request timeout parameters support an unsigned int. Extend the size of the nvme module parameters for io and admin commands to match. Signed-off-by: NMarc Olson <marcolso@amazon.com> Signed-off-by: NChristoph Hellwig <hch@lst.de>
-
- 26 9月, 2017 3 次提交
-
-
由 Sagi Grimberg 提交于
By calling nvme_stop_ctrl on a already failed controller will wait for the scan work to complete (only by identify timeout expiration which is 60 seconds). This is unnecessary when we already know that the controller has failed. Reported-by: NYi Zhang <yizhan@redhat.com> Signed-off-by: NSagi Grimberg <sagi@grimberg.me> Signed-off-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Sagi Grimberg 提交于
If we failed to transition to state LIVE after a successful reconnect, then controller deletion already started. In this case there is no point moving forward with reconnect. Reviewed-by: NJohannes Thumshirn <jthumshirn@suse.de> Signed-off-by: NSagi Grimberg <sagi@grimberg.me> Signed-off-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Sagi Grimberg 提交于
async_event_work might race as it is executed from two different workqueues at the moment. Reviewed-by: NJohannes Thumshirn <jthumshirn@suse.de> Signed-off-by: NSagi Grimberg <sagi@grimberg.me> Signed-off-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
- 25 9月, 2017 4 次提交
-
-
由 Guilherme G. Piccoli 提交于
Currently, driver code allows user to set 0 as KATO (Keep Alive TimeOut), but this is not being respected. This patch enforces the expected behavior. Signed-off-by: NGuilherme G. Piccoli <gpiccoli@linux.vnet.ibm.com> Reviewed-by: NSagi Grimberg <sagi@grimberg.me> Signed-off-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
由 James Smart 提交于
Currently the nvme_req_needs_retry() applies several checks to see if a retry is allowed. On of those is whether the current time has exceeded the start time of the io plus the timeout length. This check, if an io times out, means there is never a retry allowed for the io. Which means applications see the io failure. Remove this check and allow the io to timeout, like it does on other protocols, and retries to be made. On the FC transport, a frame can be lost for an individual io, and there may be no other errors that escalate for the connection/association. The io will timeout, which causes the transport to escalate into creating a new association, but the io that timed out, due to this retry logic, has already failed back to the application and things are hosed. Signed-off-by: NJames Smart <james.smart@broadcom.com> Reviewed-by: NKeith Busch <keith.busch@intel.com> Signed-off-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
由 James Smart 提交于
If an nvme async_event command completes, in most cases, a new async event is posted. However, if the controller enters a resetting or reconnecting state, there is nothing to block the scheduled work element from posting the async event again. Nor are there calls from the transport to stop async events when an association dies. In the case of FC, where the association is torn down, the aer must be aborted on the FC link and completes through the normal job completion path. Thus the terminated async event ends up being rescheduled even though the controller isn't in a valid state for the aer, and the reposting gets the transport into a partially torn down data structure. It's possible to hit the scenario on rdma, although much less likely due to an aer completing right as the association is terminated and as the association teardown reclaims the blk requests via nvme_cancel_request() so its immediate, not a link-related action like on FC. Fix by putting controller state checks in both the async event completion routine where it schedules the async event and in the async event work routine before it calls into the transport. It's effectively a "stop_async_events()" behavior. The transport, when it creates a new association with the subsystem will transition the state back to live and is already restarting the async event posting. Signed-off-by: NJames Smart <james.smart@broadcom.com> [hch: remove taking a lock over reading the controller state] Signed-off-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Keith Busch 提交于
The WARN_ONCE macro returns true if the condition is true, not if the warn was raised, so we're printing the scatter list every time it's invalid. This is excessive and makes debugging harder, so this patch prints it just once. Signed-off-by: NKeith Busch <keith.busch@intel.com> Reviewed-by: NMartin K. Petersen <martin.petersen@oracle.com> Signed-off-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-