- 02 2月, 2021 7 次提交
-
-
由 Sagi Grimberg 提交于
Signed-off-by: NSagi Grimberg <sagi@grimberg.me> Signed-off-by: NChristoph Hellwig <hch@lst.de>
-
由 Sagi Grimberg 提交于
We might set the iov_iter direction wrong, which is harmless for this use-case, but get it right. Also this makes the code slightly cleaner. Signed-off-by: NSagi Grimberg <sagi@grimberg.me> Signed-off-by: NChristoph Hellwig <hch@lst.de>
-
由 Minwoo Im 提交于
The controller can request a delay retrying a failed command by setting the Command Retry Delay (CRD) field in the Completion Queue Entry. Currentlty this features is only applied to commands on the I/O queue, but not to commands on the admin queue. Retreive the nvme_ctrl from the request so that no namespace is required and apply the feature to all commands. Signed-off-by: NMinwoo Im <minwoo.im.dev@gmail.com> Signed-off-by: NChristoph Hellwig <hch@lst.de>
-
由 Rikard Falkeborn 提交于
The only usage of these is to put their addresses in arrays of pointers to const attribute_groups. Make them const to allow the compiler to put them in read-only memory. Signed-off-by: NRikard Falkeborn <rikard.falkeborn@gmail.com> Signed-off-by: NChristoph Hellwig <hch@lst.de>
-
由 Leonid Ravich 提交于
searching assoc_list protected by rcu_read_lock if list not changed inline. and according to the rcu list rules. queue array embedded into nvmet_fc_tgt_assoc protected by rcu_read_lock according to rcu dereference/assign rules. queue and assoc object freed after grace period by call_rcu. tgtport lock taken for changing assoc_list. Reviewed-by: NEldad Zinger <Eldad.Zinger@dell.com> Reviewed-by: NElad Grupi <Elad.Grupi@dell.com> Reviewed-by: NJames Smart <james.smart@broadcom.com> Signed-off-by: NLeonid Ravich <Leonid.Ravich@emc.com> Signed-off-by: NChristoph Hellwig <hch@lst.de>
-
由 Israel Rukshin 提交于
Remove extra tab. Signed-off-by: NIsrael Rukshin <israelr@nvidia.com> Reviewed-by: NMax Gurtovoy <mgurtovoy@nvidia.com> Signed-off-by: NChristoph Hellwig <hch@lst.de>
-
由 Israel Rukshin 提交于
Remove code duplication. Signed-off-by: NIsrael Rukshin <israelr@nvidia.com> Reviewed-by: NMax Gurtovoy <mgurtovoy@nvidia.com> Signed-off-by: NChristoph Hellwig <hch@lst.de>
-
- 26 1月, 2021 1 次提交
-
-
由 Christoph Hellwig 提交于
Always use the bio_set_dev helper to assign ->bi_bdev to make sure other state related to the device is uptodate. Signed-off-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
- 25 1月, 2021 4 次提交
-
-
由 Guoqing Jiang 提交于
We can remove 'q' from blk_execute_rq as well after the previous change in blk_execute_rq_nowait. And more importantly it never really was needed to start with given that we can trivial derive it from struct request. Cc: linux-scsi@vger.kernel.org Cc: virtualization@lists.linux-foundation.org Cc: linux-ide@vger.kernel.org Cc: linux-mmc@vger.kernel.org Cc: linux-nvme@lists.infradead.org Cc: linux-nfs@vger.kernel.org Acked-by: Ulf Hansson <ulf.hansson@linaro.org> # for mmc Signed-off-by: NGuoqing Jiang <guoqing.jiang@cloud.ionos.com> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Guoqing Jiang 提交于
The 'q' is not used since commit a1ce35fa ("block: remove dead elevator code"), also update the comment of the function. And more importantly it never really was needed to start with given that we can trivial derive it from struct request. Cc: target-devel@vger.kernel.org Cc: linux-scsi@vger.kernel.org Cc: virtualization@lists.linux-foundation.org Cc: linux-ide@vger.kernel.org Cc: linux-mmc@vger.kernel.org Cc: linux-nvme@lists.infradead.org Cc: linux-nfs@vger.kernel.org Signed-off-by: NGuoqing Jiang <guoqing.jiang@cloud.ionos.com> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Christoph Hellwig 提交于
Replace the gendisk pointer in struct bio with a pointer to the newly improved struct block device. From that the gendisk can be trivially accessed with an extra indirection, but it also allows to directly look up all information related to partition remapping. Signed-off-by: NChristoph Hellwig <hch@lst.de> Acked-by: NTejun Heo <tj@kernel.org> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Christoph Hellwig 提交于
Unconditionally call set_disk_ro now that it only updates the hardware state. This allows to properly set up the Linux devices read-only when the controller turns a previously writable namespace read-only. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NKeith Busch <kbusch@kernel.org> Reviewed-by: NMartin K. Petersen <martin.petersen@oracle.com> Reviewed-by: NHannes Reinecke <hare@suse.de> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
- 21 1月, 2021 2 次提交
-
-
由 Christoph Hellwig 提交于
Properly unwind step by step using refactored helpers from nvme_unmap_data to avoid a potential double dma_unmap on a mapping failure. Fixes: 7fe07d14 ("nvme-pci: merge nvme_free_iod into nvme_unmap_data") Reported-by: NMarc Orr <marcorr@google.com> Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NKeith Busch <kbusch@kernel.org> Reviewed-by: NMarc Orr <marcorr@google.com>
-
由 Christoph Hellwig 提交于
Split out three helpers from nvme_unmap_data that will allow finer grained unwinding from nvme_map_data. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NKeith Busch <kbusch@kernel.org> Reviewed-by: NMarc Orr <marcorr@google.com>
-
- 19 1月, 2021 5 次提交
-
-
由 Chaitanya Kulkarni 提交于
The function nvmet_execute_identify_ns() doesn't set the status if call to nvmet_find_namespace() fails. In that case we set the status of the request to the value return by the nvmet_copy_sgl(). Set the status to NVME_SC_INVALID_NS and adjust the code such that request will have the right status on nvmet_find_namespace() failure. Without this patch :- NVME Identify Namespace 3: nsze : 0 ncap : 0 nuse : 0 nsfeat : 0 nlbaf : 0 flbas : 0 mc : 0 dpc : 0 dps : 0 nmic : 0 rescap : 0 fpi : 0 dlfeat : 0 nawun : 0 nawupf : 0 nacwu : 0 nabsn : 0 nabo : 0 nabspf : 0 noiob : 0 nvmcap : 0 mssrl : 0 mcl : 0 msrc : 0 nsattr : 0 nvmsetid: 0 anagrpid: 0 endgid : 0 nguid : 00000000000000000000000000000000 eui64 : 0000000000000000 lbaf 0 : ms:0 lbads:0 rp:0 (in use) With this patch-series :- feb3b88b501e (HEAD -> nvme-5.11) nvmet: remove extra variable in identify ns 6302aa67210a nvmet: remove extra variable in id-desclist ed57951da453 nvmet: remove extra variable in smart log nsid be384b8c24dc nvmet: set right status on error in id-ns handler NVMe status: INVALID_NS: The namespace or the format of that namespace is invalid(0xb) Signed-off-by: NChaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Reviewed-by: NSagi Grimberg <sagi@grimberg.me> Signed-off-by: NChristoph Hellwig <hch@lst.de>
-
由 Klaus Jensen 提交于
Since NVMe v1.4 the Controller Memory Buffer must be explicitly enabled by the host. Signed-off-by: NKlaus Jensen <k.jensen@samsung.com> [hch: avoid a local variable and add a comment] Signed-off-by: NChristoph Hellwig <hch@lst.de>
-
由 Chao Leng 提交于
Each name space has a request queue, if complete request long time, multi request queues may have time out requests at the same time, nvme_tcp_timeout will execute concurrently. Multi requests in different request queues may be queued in the same tcp queue, multi nvme_tcp_timeout may call nvme_tcp_stop_queue at the same time. The first nvme_tcp_stop_queue will clear NVME_TCP_Q_LIVE and continue stopping the tcp queue(cancel io_work), but the others check NVME_TCP_Q_LIVE is already cleared, and then directly complete the requests, complete request before the io work is completely canceled may lead to a use-after-free condition. Add a multex lock to serialize nvme_tcp_stop_queue. Signed-off-by: NChao Leng <lengchao@huawei.com> Signed-off-by: NChristoph Hellwig <hch@lst.de>
-
由 Chao Leng 提交于
A crash happens when inject completing request long time(nearly 30s). Each name space has a request queue, when inject completing request long time, multi request queues may have time out requests at the same time, nvme_rdma_timeout will execute concurrently. Multi requests in different request queues may be queued in the same rdma queue, multi nvme_rdma_timeout may call nvme_rdma_stop_queue at the same time. The first nvme_rdma_timeout will clear NVME_RDMA_Q_LIVE and continue stopping the rdma queue(drain qp), but the others check NVME_RDMA_Q_LIVE is already cleared, and then directly complete the requests, complete request before the qp is fully drained may lead to a use-after-free condition. Add a multex lock to serialize nvme_rdma_stop_queue. Signed-off-by: NChao Leng <lengchao@huawei.com> Tested-by: NIsrael Rukshin <israelr@nvidia.com> Reviewed-by: NIsrael Rukshin <israelr@nvidia.com> Signed-off-by: NChristoph Hellwig <hch@lst.de>
-
由 Revanth Rajashekar 提交于
According to NVMe spec v1.4, section 8.3.1, the PRINFO bit and the metadata size play a vital role in deteriming the host buffer size. If PRIFNO bit is set and MS==8, the host doesn't add the metadata buffer, instead the controller adds it. Signed-off-by: NRevanth Rajashekar <revanth.rajashekar@intel.com> Signed-off-by: NChristoph Hellwig <hch@lst.de>
-
- 15 1月, 2021 4 次提交
-
-
由 Sagi Grimberg 提交于
Discovery controllers usually don't support smart log page command. So when we connect to the discovery controller we see this warning: nvme nvme0: Failed to read smart log (error 24577) nvme nvme0: new ctrl: NQN "nqn.2014-08.org.nvmexpress.discovery", addr 192.168.123.1:8009 nvme nvme0: Removing ctrl: NQN "nqn.2014-08.org.nvmexpress.discovery" Introduce a new helper to understand if the controller is a discovery controller and use this helper to skip nvme_init_hwmon (also use it in other places that we check if the controller is a discovery controller). Fixes: 400b6a7b ("nvme: Add hardware monitoring support") Signed-off-by: NSagi Grimberg <sagi@grimberg.me> Signed-off-by: NChristoph Hellwig <hch@lst.de>
-
由 Sagi Grimberg 提交于
When a bio merges, we can get a request that spans multiple bios, and the overall request payload size is the sum of all bios. When we calculate how much we need to send from the existing bio (and bvec), we did not take into account the iov_iter byte count cap. Since multipage bvecs support, bvecs can split in the middle which means that when we account for the last bvec send we should also take the iov_iter byte count cap as it might be lower than the last bvec size. Reported-by: NHao Wang <pkuwangh@gmail.com> Fixes: 3f2304f8 ("nvme-tcp: add NVMe over TCP host driver") Tested-by: NHao Wang <pkuwangh@gmail.com> Signed-off-by: NSagi Grimberg <sagi@grimberg.me> Signed-off-by: NChristoph Hellwig <hch@lst.de>
-
由 Sagi Grimberg 提交于
We shouldn't call smp_processor_id() in a preemptible context, but this is advisory at best, so instead call __smp_processor_id(). Fixes: db5ad6b7 ("nvme-tcp: try to send request in queue_rq context") Reported-by: NOr Gerlitz <gerlitz.or@gmail.com> Reported-by: NYi Zhang <yi.zhang@redhat.com> Signed-off-by: NSagi Grimberg <sagi@grimberg.me> Signed-off-by: NChristoph Hellwig <hch@lst.de>
-
由 Israel Rukshin 提交于
When setting port traddr to INADDR_ANY, the listening cm_id->device is NULL. The associate IB device is known only when a connect request event arrives, so checking T10-PI device capability should be done at this stage. Fixes: b09160c3 ("nvmet-rdma: add metadata/T10-PI support") Signed-off-by: NIsrael Rukshin <israelr@nvidia.com> Reviewed-by: NSagi Grimberg <sagi@grimberg.me> Reviewed-by: NMax Gurtovoy <mgurtovoy@nvidia.com> Signed-off-by: NChristoph Hellwig <hch@lst.de>
-
- 06 1月, 2021 8 次提交
-
-
由 Max Gurtovoy 提交于
The only used argument in this function is the "req". Signed-off-by: NMax Gurtovoy <mgurtovoy@nvidia.com> Reviewed-by: NMinwoo Im <minwoo.im.dev@gmail.com> Signed-off-by: NChristoph Hellwig <hch@lst.de>
-
由 Israel Rukshin 提交于
When a queue is in NVMET_RDMA_Q_CONNECTING state, it may has some requests at rsp_wait_list. In case a disconnect occurs at this state, no one will empty this list and will return the requests to free_rsps list. Normally nvmet_rdma_queue_established() free those requests after moving the queue to NVMET_RDMA_Q_LIVE state, but in this case __nvmet_rdma_queue_disconnect() is called before. The crash happens at nvmet_rdma_free_rsps() when calling list_del(&rsp->free_list), because the request exists only at the wait list. To fix the issue, simply clear rsp_wait_list when destroying the queue. Signed-off-by: NIsrael Rukshin <israelr@nvidia.com> Reviewed-by: NMax Gurtovoy <mgurtovoy@nvidia.com> Signed-off-by: NChristoph Hellwig <hch@lst.de>
-
由 Minwoo Im 提交于
There are no callers for nvme_reset_ctrl_sync() and nvme_alloc_request_qid() so that we keep the symbols exported. Unexport those functions, mark them static and update the header file respectively. Signed-off-by: NMinwoo Im <minwoo.im.dev@gmail.com> Signed-off-by: NChristoph Hellwig <hch@lst.de>
-
While handling the completion queue, keep a local copy of the command id from the DMA-accessible completion entry. This silences a time-of-check to time-of-use (TOCTOU) warning from KF/x[1], with respect to a Thunderclap[2] vulnerability analysis. The double-read impact appears benign. There may be a theoretical window for @command_id to be used as an adversary-controlled array-index-value for mounting a speculative execution attack, but that mitigation is saved for a potential follow-on. A man-in-the-middle attack on the data payload is out of scope for this analysis and is hopefully mitigated by filesystem integrity mechanisms. [1] https://github.com/intel/kernel-fuzzer-for-xen-project [2] http://thunderclap.io/thunderclap-paper-ndss2019.pdfSigned-off-by: NLalithambika Krishna Kumar <lalithambika.krishnakumar@intel.com> Signed-off-by: NChristoph Hellwig <hch@lst.de>
-
由 Sagi Grimberg 提交于
We may send a request (with or without its data) from two paths: 1. From our I/O context nvme_tcp_io_work which is triggered from: - queue_rq - r2t reception - socket data_ready and write_space callbacks 2. Directly from queue_rq if the send_list is empty (because we want to save the context switch associated with scheduling our io_work). However, given that now we have the send_mutex, we may run into a race condition where none of these contexts will send the pending payload to the controller. Both io_work send path and queue_rq send path opportunistically attempt to acquire the send_mutex however queue_rq only attempts to send a single request, and if io_work context fails to acquire the send_mutex it will complete without rescheduling itself. The race can trigger with the following sequence: 1. queue_rq sends request (no incapsule data) and blocks 2. RX path receives r2t - prepares data PDU to send, adds h2cdata PDU to the send_list and schedules io_work 3. io_work triggers and cannot acquire the send_mutex - because of (1), ends without self rescheduling 4. queue_rq completes the send, and completes ==> no context will send the h2cdata - timeout. Fix this by having queue_rq sending as much as it can from the send_list such that if it still has any left, its because the socket buffer is full and the socket write_space callback will trigger, thus guaranteeing that a context will be scheduled to send the h2cdata PDU. Fixes: db5ad6b7 ("nvme-tcp: try to send request in queue_rq context") Reported-by: NPotnuri Bharat Teja <bharat@chelsio.com> Reported-by: NSamuel Jones <sjones@kalrayinc.com> Signed-off-by: NSagi Grimberg <sagi@grimberg.me> Tested-by: NPotnuri Bharat Teja <bharat@chelsio.com> Signed-off-by: NChristoph Hellwig <hch@lst.de>
-
由 Gopal Tiwari 提交于
A system with more than one of these SSDs will only have one usable. Hence the kernel fails to detect nvme devices due to duplicate cntlids. [ 6.274554] nvme nvme1: Duplicate cntlid 33 with nvme0, rejecting [ 6.274566] nvme nvme1: Removing after probe failure status: -22 Adding the NVME_QUIRK_IGNORE_DEV_SUBNQN quirk to resolves the issue. Signed-off-by: NGopal Tiwari <gtiwari@redhat.com> Signed-off-by: NChristoph Hellwig <hch@lst.de>
-
由 James Smart 提交于
Kernel robot had the following warnings: >> fcloop.c:1506:6: warning: %x in format string (no. 1) requires >> 'unsigned int *' but the argument type is 'signed int *'. >> [invalidScanfArgType_int] >> if (sscanf(buf, "%x:%d:%d", &opcode, &starting, &amount) != 3) >> ^ Resolve by changing opcode from and int to an unsigned int and >> fcloop.c:1632:32: warning: Uninitialized variable: lport [uninitvar] >> ret = __wait_localport_unreg(lport); >> ^ >> fcloop.c:1615:28: warning: Uninitialized variable: nport [uninitvar] >> ret = __remoteport_unreg(nport, rport); >> ^ These aren't actual issues as the values are assigned prior to use. It appears the tool doesn't understand list_first_entry_or_null(). Regardless, quiet the tool by initializing the pointers to NULL at declaration. Signed-off-by: NJames Smart <james.smart@broadcom.com> Signed-off-by: NChristoph Hellwig <hch@lst.de>
-
由 James Smart 提交于
Recent patches changed calling sequences. nvme_fc_abort_outstanding_ios used to be called from a timeout or work context. Now it is being called in an io completion context, which can be an interrupt handler. Unfortunately, the abort outstanding ios routine attempts to stop nvme queues and nested routines that may try to sleep, which is in conflict with the interrupt handler. Correct replacing the direct call with a work element scheduling, and the abort outstanding ios routine will be called in the work element. Fixes: 95ced8a2 ("nvme-fc: eliminate terminate_io use by nvme_fc_error_recovery") Signed-off-by: NJames Smart <james.smart@broadcom.com> Reported-by: NDaniel Wagner <dwagner@suse.de> Tested-by: NDaniel Wagner <dwagner@suse.de> Signed-off-by: NChristoph Hellwig <hch@lst.de>
-
- 08 12月, 2020 1 次提交
-
-
由 Ming Lei 提交于
Set nvme-loop's lock class via blk_mq_hctx_set_fq_lock_class for avoiding lockdep possible recursive locking, then we can remove the dynamically allocated lock class for each flush queue, finally we can avoid horrible SCSI probe delay. This way may not address situation in which one nvme-loop is backed on another nvme-loop. However, in reality, people seldom uses this way for test. Even though someone played in this way, it is just one recursive locking false positive, no real deadlock issue. Tested-by: NKashyap Desai <kashyap.desai@broadcom.com> Reported-by: NQian Cai <cai@redhat.com> Reviewed-by: NChristoph Hellwig <hch@lst.de> Cc: Sumit Saxena <sumit.saxena@broadcom.com> Cc: John Garry <john.garry@huawei.com> Cc: Kashyap Desai <kashyap.desai@broadcom.com> Cc: Bart Van Assche <bvanassche@acm.org> Cc: Hannes Reinecke <hare@suse.de> Signed-off-by: NMing Lei <ming.lei@redhat.com> Reviewed-by: NHannes Reinecke <hare@suse.de> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
- 05 12月, 2020 1 次提交
-
-
由 Christoph Hellwig 提交于
The request_queue can trivially be derived from the bio. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NDamien Le Moal <damien.lemoal@wdc.com> Reviewed-by: NHannes Reinecke <hare@suse.de> Reviewed-by: NChaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Acked-by: NTejun Heo <tj@kernel.org> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
- 02 12月, 2020 7 次提交
-
-
由 Christoph Hellwig 提交于
Use struct block_device to lookup partitions on a disk. This removes all usage of struct hd_struct from the I/O path. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NJan Kara <jack@suse.cz> Reviewed-by: NHannes Reinecke <hare@suse.de> Acked-by: Coly Li <colyli@suse.de> [bcache] Acked-by: Chao Yu <yuchao0@huawei.com> [f2fs] Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Javier González 提交于
Allow ZNS NVMe SSDs to present a read-only namespace when append is not supported, instead of rejecting the namespace directly. This allows (i) the namespace to be used in read-only mode, which is not a problem as the append command only affects the write path, and (ii) to use standard management tools such as nvme-cli to choose a different format or firmware slot that is compatible with the Linux zoned block device. Signed-off-by: NJavier González <javier.gonz@samsung.com> Signed-off-by: NChristoph Hellwig <hch@lst.de>
-
由 Javier González 提交于
Remane block device operations in preparation to add char device file operations. Signed-off-by: NJavier González <javier.gonz@samsung.com> Reviewed-by: NMinwoo Im <minwoo.im.dev@gmail.com> Signed-off-by: NChristoph Hellwig <hch@lst.de>
-
由 Javier González 提交于
Rename controller base dev_t char device in preparation for adding a namespace char device. Signed-off-by: NJavier González <javier.gonz@samsung.com> Reviewed-by: NMinwoo Im <minwoo.im.dev@gmail.com> Signed-off-by: NChristoph Hellwig <hch@lst.de>
-
由 Javier González 提交于
Cleanup unnecessary ret values that are not checked or used in nvme_alloc_ns(). Signed-off-by: NJavier González <javier.gonz@samsung.com> Reviewed-by: NMinwoo Im <minwoo.im.dev@gmail.com> Signed-off-by: NChristoph Hellwig <hch@lst.de>
-
由 Minwoo Im 提交于
During the scan_work, an Identify command is issued to figure out which namespaces are active. If this command fails, the nvme driver falls back to scanning namespaces sequentially. In this situation, we don't see any warnings and don't even know whether list-ns command has been failed or not easiliy. Printa warning when the Identify command executin fail: [ 1.108399] nvme nvme0: Identify NS List failed (status=0x400b) [ 1.109583] nvme0n1: detected capacity change from 0 to 1048576 [ 1.112186] nvme nvme0: Identify Descriptors failed (nsid=2, status=0x4002) [ 1.113929] nvme nvme0: Identify Descriptors failed (nsid=3, status=0x4002) [ 1.116537] nvme nvme0: Identify Descriptors failed (nsid=4, status=0x4002) ... Signed-off-by: NMinwoo Im <minwoo.im.dev@gmail.com> Signed-off-by: NChristoph Hellwig <hch@lst.de>
-
由 Minwoo Im 提交于
Add the namespace ID to the error message when the Identify command used to retrieve the Namespace Identification Descriptor list fails. This avoids rather useless and duplicative messages like the following: [ 1.321031] nvme nvme0: Identify Descriptors failed (16386) [ 1.321948] nvme nvme0: Identify Descriptors failed (16386) [ 1.322872] nvme nvme0: Identify Descriptors failed (16386) [ 1.323775] nvme nvme0: Identify Descriptors failed (16386) [ 1.324687] nvme nvme0: Identify Descriptors failed (16386) ... Also, print the nvme status code in hexadecimal rather than decimal format rather for better readability. Signed-off-by: NMinwoo Im <minwoo.im.dev@gmail.com> Signed-off-by: NChristoph Hellwig <hch@lst.de>
-