- 02 9月, 2020 31 次提交
-
-
由 Baolin Wang 提交于
fix #29375191 For NVMe discard request, it will use special_vec to describe the size of the request, thus it will get an incorrect request size with blk_rq_bytes() when handling the NVMe discard request. Thus we should use blk_rq_payload_bytes() to calculate the data transfer size which can fix this issue. Fixes: 220741e8c12d ("alios: nvme-pci: Improve mapping single segment requests using SGLs") Signed-off-by: NBaolin Wang <baolin.wang@linux.alibaba.com> Acked-by: NXiaoguang Wang <xiaoguang.wang@linux.alibaba.com> Reviewed-by: NJiufei Xue <jiufei.xue@linux.alibaba.com>
-
由 Baolin Wang 提交于
fix #29327388 Just use bio->bi_vcnt directly to validate if only one bvec in a bio for PRP mode, which can remove warnings for dm device. No functional changes. Fixes: c8b92b847512 ("alios: nvme-pci: Improve mapping single segment requests using PRP") Signed-off-by: NBaolin Wang <baolin.wang@linux.alibaba.com> Reviewed-by: NJoseph Qi <joseph.qi@linux.alibaba.com>
-
由 Xiaoguang Wang 提交于
fix #29535320 In __nvme_poll(), nvme_complete_cqes() should also been protected by nvmeq->cq_lock. Fixes: 0d326c85dba5 ("nvme: provide optimized poll function for separate poll queues") Signed-off-by: NXiaoguang Wang <xiaoguang.wang@linux.alibaba.com> Reviewed-by: NJoseph Qi <joseph.qi@linux.alibaba.com>
-
由 Baolin Wang 提交于
fix #29327388 Now the blk-mq did not support multi-page bvec, which means each bvec can only contain one page. Though the physical segment is 1 in one bio, the bio still can contains multiple bvecs which are physically contiguous, so we can not use one bvec length to map the request, instead we should use the full length of the request to mapping the request, when the physical segment is 1 in a request. In future if we support multi-page bvecs, this patch can be dropped. Signed-off-by: NBaolin Wang <baolin.wang@linux.alibaba.com> Reviewed-by: NJoseph Qi <joseph.qi@linux.alibaba.com>
-
由 Baolin Wang 提交于
fix #29327388 Now the blk-mq did not support multi-page bvec, which means each bvec can only contain one page. For simple PRP, it just support one bvec in the bio though the physical segment is only 1, otherwise it can not map the whole request. In future if we support multi-page bvecs, this patch can be dropped. Signed-off-by: NBaolin Wang <baolin.wang@linux.alibaba.com> Reviewed-by: NJoseph Qi <joseph.qi@linux.alibaba.com>
-
由 Klaus Birkelund Jensen 提交于
fix #29327388 commit 049bf37262c61c99f45438910711b55054b24838 upstream The shortcut for single segment SGL requests did not set the PSDT field to mark the request as using SGLs. Fixes: 297910571f08 ("nvme-pci: optimize mapping single segment requests using SGLs") Signed-off-by: NKlaus Birkelund Jensen <klaus.jensen@cnexlabs.com> Signed-off-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NBaolin Wang <baolin.wang@linux.alibaba.com> Reviewed-by: NJoseph Qi <joseph.qi@linux.alibaba.com>
-
由 Kevin Hao 提交于
fix #29327388 commit a4f40484e7f1dff56bb9f286cc59ffa36e0259eb upstream In the current code, the nvme is using a fixed 4k PRP entry size, but if the kernel use a page size which is more than 4k, we should consider the situation that the bv_offset may be larger than the dev->ctrl.page_size. Otherwise we may miss setting the prp2 and then cause the command can't be executed correctly. Fixes: dff824b2aadb ("nvme-pci: optimize mapping of small single segment requests") Cc: stable@vger.kernel.org Reviewed-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NKevin Hao <haokexin@gmail.com> Signed-off-by: NKeith Busch <kbusch@kernel.org> Signed-off-by: NBaolin Wang <baolin.wang@linux.alibaba.com> Reviewed-by: NJoseph Qi <joseph.qi@linux.alibaba.com>
-
由 Christoph Hellwig 提交于
fix #29327388 commit 70479b71bc80ae6f63c8d6644cc76dff99f79686 upstream Remove two pointless local variables, remove ret assignment that is never used, move the use_sgl initialization closer to where it is used. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NKeith Busch <keith.busch@intel.com> Reviewed-by: NSagi Grimberg <sagi@grimberg.me> Reviewed-by: NChaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Signed-off-by: NBaolin Wang <baolin.wang@linux.alibaba.com> Reviewed-by: NJoseph Qi <joseph.qi@linux.alibaba.com>
-
由 Christoph Hellwig 提交于
fix #29327388 commit 297910571f08f1d7e398793df6e606ebb375a3f1 upstream If the controller supports SGLs we can take another short cut for single segment request, given that we can always map those without another indirection structure, and thus don't need to create a scatterlist structure. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NKeith Busch <keith.busch@intel.com> Reviewed-by: NSagi Grimberg <sagi@grimberg.me> Reviewed-by: NChaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Signed-off-by: NBaolin Wang <baolin.wang@linux.alibaba.com> Reviewed-by: NJoseph Qi <joseph.qi@linux.alibaba.com>
-
由 Christoph Hellwig 提交于
fix #29327388 commit dff824b2aadb7808f50ceb0927acaec5ad750ce7 upstream If a request is single segment and fits into one or two PRP entries we do not have to create a scatterlist for it, but can just map the bio_vec directly. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NKeith Busch <keith.busch@intel.com> Reviewed-by: NSagi Grimberg <sagi@grimberg.me> Reviewed-by: NChaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Signed-off-by: NBaolin Wang <baolin.wang@linux.alibaba.com> Reviewed-by: NJoseph Qi <joseph.qi@linux.alibaba.com>
-
由 Christoph Hellwig 提交于
fix #29327388 commit d43f1ccfad053dbefba1d15443cdc36ca60958f0 upstream We'll have a better way to optimize for small I/O that doesn't require it soon, so remove the existing inline_sg case to make that optimization easier to implement. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NKeith Busch <keith.busch@intel.com> Reviewed-by: NSagi Grimberg <sagi@grimberg.me> Reviewed-by: NChaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Signed-off-by: NBaolin Wang <baolin.wang@linux.alibaba.com> Reviewed-by: NJoseph Qi <joseph.qi@linux.alibaba.com>
-
由 Christoph Hellwig 提交于
fix #29327388 commit 4aedb705437f6f98b45f45c394e6803ca67abd33 upstream This prepares for some bigger changes to the data mapping helpers. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NKeith Busch <keith.busch@intel.com> Reviewed-by: NSagi Grimberg <sagi@grimberg.me> Reviewed-by: NChaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Signed-off-by: NBaolin Wang <baolin.wang@linux.alibaba.com> Reviewed-by: NJoseph Qi <joseph.qi@linux.alibaba.com>
-
由 Christoph Hellwig 提交于
fix #29327388 commit 783b94bd9250478154904fa782d2cfc46336cdf6 upstream We always have exactly one segment, so we can simply call dma_map_bvec. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NKeith Busch <keith.busch@intel.com> Reviewed-by: NSagi Grimberg <sagi@grimberg.me> Reviewed-by: NChaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Signed-off-by: NBaolin Wang <baolin.wang@linux.alibaba.com> Reviewed-by: NJoseph Qi <joseph.qi@linux.alibaba.com>
-
由 Christoph Hellwig 提交于
fix #29327388 commit b15c592de37ed9d71499a3b8a750d1b235fcba3d upstream This mirrors how nvme_map_pci is called and will allow simplifying some checks in nvme_unmap_pci later on. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NKeith Busch <keith.busch@intel.com> Reviewed-by: NSagi Grimberg <sagi@grimberg.me> Reviewed-by: NChaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Signed-off-by: NBaolin Wang <baolin.wang@linux.alibaba.com> Reviewed-by: NJoseph Qi <joseph.qi@linux.alibaba.com>
-
由 Christoph Hellwig 提交于
fix #29327388 commit 7fe07d14f71fabef642a478626248a9121e95b7b upstream This means we now have a function that undoes everything nvme_map_data does and we can simplify the error handling a bit. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NKeith Busch <keith.busch@intel.com> Reviewed-by: NSagi Grimberg <sagi@grimberg.me> Reviewed-by: NChaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Signed-off-by: NBaolin Wang <baolin.wang@linux.alibaba.com> Reviewed-by: NJoseph Qi <joseph.qi@linux.alibaba.com>
-
由 Christoph Hellwig 提交于
fix #29327388 commit 915f04c93db4e3a7388c8ad8ddfc28830e4cbce3 upstream Cleaning up the command setup isn't related to unmapping data, and disentangling them will simplify error handling a bit down the road. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NKeith Busch <keith.busch@intel.com> Reviewed-by: NSagi Grimberg <sagi@grimberg.me> Reviewed-by: NChaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Signed-off-by: NBaolin Wang <baolin.wang@linux.alibaba.com> Reviewed-by: NJoseph Qi <joseph.qi@linux.alibaba.com>
-
由 Christoph Hellwig 提交于
fix #29327388 commit 9b048119a153590b934ef49aae309b723587f527 upstream nvme_init_iod should really be split into two parts: initialize a few general iod fields, which can easily be done at the beginning of nvme_queue_rq, and allocating the scatterlist if needed, which logically belongs into nvme_map_data with the code making use of it. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NKeith Busch <keith.busch@intel.com> Reviewed-by: NSagi Grimberg <sagi@grimberg.me> Reviewed-by: NChaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Signed-off-by: NBaolin Wang <baolin.wang@linux.alibaba.com> Reviewed-by: NJoseph Qi <joseph.qi@linux.alibaba.com>
-
由 Keith Busch 提交于
fix #29327388 commit 39f8e36401142d73e33a954ac4bdf844fb5de9ae upstream Signed-off-by: NKeith Busch <keith.busch@intel.com> Reviewed-by: NSagi Grimberg <sagi@grimberg.me> Signed-off-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NBaolin Wang <baolin.wang@linux.alibaba.com> Reviewed-by: NJoseph Qi <joseph.qi@linux.alibaba.com>
-
由 Bijan Mottahedeh 提交于
to #28991349 commit 9515743bfb39c61aaf3d4f3219a645c8d1fe9a0e upstream Completions need to consumed in the same order the controller submitted them, otherwise future completion entries may overwrite ones we haven't handled yet. Hold the nvme queue's poll lock while completing new CQEs to prevent another thread from freeing command tags for reuse out-of-order. Fixes: dabcefab45d3 ("nvme: provide optimized poll function for separate poll queues") Signed-off-by: NBijan Mottahedeh <bijan.mottahedeh@oracle.com> Reviewed-by: NSagi Grimberg <sagi@grimberg.me> Reviewed-by: NJens Axboe <axboe@kernel.dk> Signed-off-by: NKeith Busch <kbusch@kernel.org> Signed-off-by: NXiaoguang Wang <xiaoguang.wang@linux.alibaba.com> Reviewed-by: NJoseph Qi <joseph.qi@linux.alibaba.com>
-
由 Christoph Hellwig 提交于
to #28991349 commit e5edd5f298fafda28284bafb8371e6f0b7681035 upstream Now that the block layer checks if a queue map has any queues inside it there is no more reason to duplicate the maps for the non-default types. Reviewed-by: NMing Lei <ming.lei@redhat.com> Signed-off-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NJens Axboe <axboe@kernel.dk> Signed-off-by: NXiaoguang Wang <xiaoguang.wang@linux.alibaba.com> Reviewed-by: NJoseph Qi <joseph.qi@linux.alibaba.com>
-
由 Ming Lei 提交于
to #28991349 commit c45b1fa2433c65e44bdf48f513cb37289f3116b9 upstream When -ENOSPC is returned from pci_alloc_irq_vectors_affinity(), we still try to allocate multiple irq vectors again, so irq queues covers the admin queue actually. But we don't consider that, then number of the allocated irq vector may be same with sum of io_queues[HCTX_TYPE_DEFAULT] and io_queues[HCTX_TYPE_READ], this way is obviously wrong, and finally breaks nvme_pci_map_queues(), and warning from pci_irq_get_affinity() is triggered. IRQ queues should cover admin queues, this patch makes this point explicitely in nvme_calc_io_queues(). We got severl boot failure internal report on aarch64, so please consider to fix it in v4.20. Fixes: 6451fe73fa0f ("nvme: fix irq vs io_queue calculations") Signed-off-by: NMing Lei <ming.lei@redhat.com> Reviewed-by: NKeith Busch <keith.busch@intel.com> Tested-by: Nfin4478 <fin4478@hotmail.com> Signed-off-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NJens Axboe <axboe@kernel.dk> Signed-off-by: NXiaoguang Wang <xiaoguang.wang@linux.alibaba.com> Reviewed-by: NJoseph Qi <joseph.qi@linux.alibaba.com>
-
由 Jens Axboe 提交于
to #28991349 commit 6451fe73fa0f542a49bfacd7205b88a597897f58 upstream Guenter reported an boot hang issue on HPPA after we default to 0 poll queues. We have two issues in the queue count calculations: 1) We don't separate the poll queues from the read/write queues. This is important, since the former doesn't need interrupts. 2) The adjust logic is broken. Adjust the poll queue count before doing nvme_calc_io_queues(). The poll queue count is only limited by the IO queue count we were able to get from the controller, not failures in the IRQ allocation loop. This leaves nvme_calc_io_queues() just adjusting the read/write queue map. Reported-by: NReported-by: Guenter Roeck <linux@roeck-us.net> Reviewed-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NSagi Grimberg <sagi@grimberg.me> Signed-off-by: NJens Axboe <axboe@kernel.dk> Signed-off-by: NXiaoguang Wang <xiaoguang.wang@linux.alibaba.com> Reviewed-by: NJoseph Qi <joseph.qi@linux.alibaba.com>
-
由 Christoph Hellwig 提交于
to #28991349 commit e20ba6e1da029136ded295f33076483d65ddf50a upstream Having another indirect all in the fast path doesn't really help in our post-spectre world. Also having too many queue type is just going to create confusion, so I'd rather manage them centrally. Note that the queue type naming and ordering changes a bit - the first index now is the default queue for everything not explicitly marked, the optional ones are read and poll queues. Reviewed-by: NSagi Grimberg <sagi@grimberg.me> Signed-off-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NJens Axboe <axboe@kernel.dk> Signed-off-by: NXiaoguang Wang <xiaoguang.wang@linux.alibaba.com> Reviewed-by: NJoseph Qi <joseph.qi@linux.alibaba.com>
-
由 Jens Axboe 提交于
to #28991349 commit dabcefab45d36ecb5a22f16577bb0f298876a22d upstream If we have separate poll queues, we know that they aren't using interrupts. Hence we don't need to disable interrupts around finding completions. Provide a separate set of blk_mq_ops for such devices. Signed-off-by: NJens Axboe <axboe@kernel.dk> Signed-off-by: NXiaoguang Wang <xiaoguang.wang@linux.alibaba.com> Reviewed-by: NJoseph Qi <joseph.qi@linux.alibaba.com>
-
由 Jens Axboe 提交于
to #28991349 commit a4668d9ba4be1ca9f4a39798ba3419fdfef0750d upstream We need a better way of configuring this, and given that polling is (still) a bit niche, let's default to using 0 poll queues. That way we'll have the same read/write/poll behavior as 4.20, and users that want to test/use polling are required to do manual configuration of the number of poll queues. Reviewed-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NJens Axboe <axboe@kernel.dk> Signed-off-by: NXiaoguang Wang <xiaoguang.wang@linux.alibaba.com> Reviewed-by: NJoseph Qi <joseph.qi@linux.alibaba.com>
-
由 Jens Axboe 提交于
to #28991349 commit db29eb059cdc571f9d75cec4a41b9884b3b8286a upstream At least on SPARC, if MSI/MSI-X isn't supported, we get EINVAL if we ask for more than one vector. This isn't covered by our ENOSPC check. If we get EINVAL, decrease our ask to just one vector, instead of bailing out in error. Fixes: 3b6592f70ad7 ("nvme: utilize two queue maps, one for reads and one for writes") Reported-by: NGuenter Roeck <linux@roeck-us.net> Tested-by: NGuenter Roeck <linux@roeck-us.net> Signed-off-by: NJens Axboe <axboe@kernel.dk> Signed-off-by: NXiaoguang Wang <xiaoguang.wang@linux.alibaba.com> Reviewed-by: NJoseph Qi <joseph.qi@linux.alibaba.com>
-
由 Jens Axboe 提交于
to #28991349 commit 30e066286e232772cad72c87008a958e23e40a33 upstream NVMe always asks for io_queues + 1 worth of IRQ vectors, which means that even when we scale all the way down, we still ask for 2 vectors and get -ENOSPC in return if the system can't support more than 1. Getting just 1 vector is fine, it just means that we'll have 1 IO queue and 1 admin queue, with a shared vector between them. Check for this case and don't add our + 1 if it happens. Fixes: 3b6592f70ad7 ("nvme: utilize two queue maps, one for reads and one for writes") Reported-by: NGuenter Roeck <linux@roeck-us.net> Signed-off-by: NJens Axboe <axboe@kernel.dk> Signed-off-by: NXiaoguang Wang <xiaoguang.wang@linux.alibaba.com> Reviewed-by: NJoseph Qi <joseph.qi@linux.alibaba.com>
-
由 Jens Axboe 提交于
to #28991349 commit 4b04cc6a8f86c4842314def22332de1f15de8523 upstream Adds support for defining a variable number of poll queues, currently configurable with the 'poll_queues' module parameter. Defaults to a single poll queue. And now we finally have poll support without triggering interrupts! Reviewed-by: NHannes Reinecke <hare@suse.com> Reviewed-by: NKeith Busch <keith.busch@intel.com> Signed-off-by: NJens Axboe <axboe@kernel.dk> Signed-off-by: NXiaoguang Wang <xiaoguang.wang@linux.alibaba.com> Reviewed-by: NJoseph Qi <joseph.qi@linux.alibaba.com>
-
由 Bart Van Assche 提交于
to #28991349 commit e895fedf12dc0663a925b54eb0961fc927208097 upstream This patch avoids that the compiler complains about 'ret' being set but not being used when building with W=1. Fixes: 3b6592f70ad7 ("nvme: utilize two queue maps, one for reads and one for writes") # v5.0-rc1 Signed-off-by: NBart Van Assche <bvanassche@acm.org> Reviewed-by: NSagi Grimberg <sagi@grimberg.me> Reviewed-by: NChaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Signed-off-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NXiaoguang Wang <xiaoguang.wang@linux.alibaba.com> Reviewed-by: NJoseph Qi <joseph.qi@linux.alibaba.com>
-
由 Jens Axboe 提交于
to #28991349 commit 3b6592f70ad7b4c24dd3eb2ac9bbe3353d02c992 upstream NVMe does round-robin between queues by default, which means that sharing a queue map for both reads and writes can be problematic in terms of read servicing. It's much easier to flood the queue with writes and reduce the read servicing. Implement two queue maps, one for reads and one for writes. The write queue count is configurable through the 'write_queues' parameter. By default, we retain the previous behavior of having a single queue set, shared between reads and writes. Setting 'write_queues' to a non-zero value will create two queue sets, one for reads and one for writes, the latter using the configurable number of queues (hardware queue counts permitting). Reviewed-by: NHannes Reinecke <hare@suse.com> Reviewed-by: NKeith Busch <keith.busch@intel.com> Signed-off-by: NJens Axboe <axboe@kernel.dk> Signed-off-by: NXiaoguang Wang <xiaoguang.wang@linux.alibaba.com> Reviewed-by: NJoseph Qi <joseph.qi@linux.alibaba.com>
-
由 Jens Axboe 提交于
to #28991349 commit ed76e329d74a4b15ac0f5fd3adbd52ec0178a134 upstream This is in preparation for allowing multiple sets of maps per queue, if so desired. Reviewed-by: NHannes Reinecke <hare@suse.com> Reviewed-by: NBart Van Assche <bvanassche@acm.org> Reviewed-by: NKeith Busch <keith.busch@intel.com> Signed-off-by: NJens Axboe <axboe@kernel.dk> Signed-off-by: NXiaoguang Wang <xiaoguang.wang@linux.alibaba.com> Reviewed-by: NJoseph Qi <joseph.qi@linux.alibaba.com>
-
- 29 6月, 2020 1 次提交
-
-
由 Jens Axboe 提交于
fix #28871358 commit 04f3eafda6e05adc56afed4d3ae6e24aaa429058 upstream Split the command submission and the SQ doorbell ring, and add the doorbell ring as our ->commit_rqs() hook. This allows a list of requests to be issued, with nvme only writing the SQ update when it's necessary. This is more efficient if we have lists of requests to issue, particularly on virtualized hardware, where writing the SQ doorbell is more expensive than on real hardware. For those cases, performance increases of 2-3x have been observed. The use case for this is plugged IO, where blk-mq flushes a batch of requests at the time. Reviewed-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NJens Axboe <axboe@kernel.dk> Signed-off-by: NBaolin Wang <baolin.wang@linux.alibaba.com> Reviewed-by: NJoseph Qi <joseph.qi@linux.alibaba.com>
-
- 15 6月, 2020 1 次提交
-
-
由 Ard Biesheuvel 提交于
task #28557808 [ Upstream commit 3a8ecc935efabdad106b5e06d07b150c394b4465 ] Commit 7fd8930f "nvme: add a common helper to read Identify Controller data" has re-introduced an issue that we have attempted to work around in the past, in commit a310acd7 ("NVMe: use split lo_hi_{read,write}q"). The problem is that some PCIe NVMe controllers do not implement 64-bit outbound accesses correctly, which is why the commit above switched to using lo_hi_[read|write]q for all 64-bit BAR accesses occuring in the code. In the mean time, the NVMe subsystem has been refactored, and now calls into the PCIe support layer for NVMe via a .reg_read64() method, which fails to use lo_hi_readq(), and thus reintroduces the problem that the workaround above aimed to address. Given that, at the moment, .reg_read64() is only used to read the capability register [which is known to tolerate split reads], let's switch .reg_read64() to lo_hi_readq() as well. This fixes a boot issue on some ARM boxes with NVMe behind a Synopsys DesignWare PCIe host controller. Fixes: 7fd8930f ("nvme: add a common helper to read Identify Controller data") Signed-off-by: NArd Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: NSagi Grimberg <sagi@grimberg.me> Signed-off-by: NSasha Levin <sashal@kernel.org> Signed-off-by: NJeffle Xu <jefflexu@linux.alibaba.com> Acked-by: NJoseph Qi <joseph.qi@linux.alibaba.com>
-
- 17 1月, 2020 1 次提交
-
-
由 Jens Axboe 提交于
commit 1052b8ac5282daf35df331edcbdb645839d17e6a upstream. If we want to support async IO polling, then we have to allow finding completions that aren't just for the one we are looking for. Always pass in -1 to the mq_ops->poll() helper, and have that return how many events were found in this poll loop. Signed-off-by: NJens Axboe <axboe@kernel.dk> Signed-off-by: NJoseph Qi <joseph.qi@linux.alibaba.com> Reviewed-by: NXiaoguang Wang <xiaoguang.wang@linux.alibaba.com> Acked-by: NCaspar Zhang <caspar@linux.alibaba.com>
-
- 15 1月, 2020 1 次提交
-
-
由 Wenwei Tao 提交于
We found huge performance lost on below particular Intel's disk drive when discard zeroout functionality is enabled on it. The issue was found when we have ext4 filesystem mounted on the disk drive and started regular FIO testing. With it disabled, we don't observe performance lost any more. 81:00.0 Non-Volatile memory controller: Intel Corporation \ PCIe Data Center SSD (rev 01) This imposes to disable the discard zero-out functionality on above disk drive in order to regain the high performance that NVMe disk driver supposes to provide. Differential Revision: https://aone.alibaba-inc.com/code/D377540Signed-off-by: NWenwei Tao <wenwei.tao@linux.alibaba.com> Reviewed-by: NGavin Shan <shan.gavin@linux.alibaba.com> Signed-off-by: NJoseph Qi <joseph.qi@linux.alibaba.com> Reviewed-by: NXiaoguang Wang <xiaoguang.wang@linux.alibaba.com> Acked-by: NCaspar Zhang <caspar@linux.alibaba.com>
-
- 01 12月, 2019 2 次提交
-
-
由 Keith Busch 提交于
[ Upstream commit 9fe5c59ff6a1e5e26a39b75489a1420e7eaaf0b1 ] The nvme pci driver had been adding its CMB resource to the P2P DMA subsystem everytime on on a controller reset. This results in the following warning: ------------[ cut here ]------------ nvme 0000:00:03.0: Conflicting mapping in same section WARNING: CPU: 7 PID: 81 at kernel/memremap.c:155 devm_memremap_pages+0xa6/0x380 ... Call Trace: pci_p2pdma_add_resource+0x153/0x370 nvme_reset_work+0x28c/0x17b1 [nvme] ? add_timer+0x107/0x1e0 ? dequeue_entity+0x81/0x660 ? dequeue_entity+0x3b0/0x660 ? pick_next_task_fair+0xaf/0x610 ? __switch_to+0xbc/0x410 process_one_work+0x1cf/0x350 worker_thread+0x215/0x3d0 ? process_one_work+0x350/0x350 kthread+0x107/0x120 ? kthread_park+0x80/0x80 ret_from_fork+0x1f/0x30 ---[ end trace f7ea76ac6ee72727 ]--- nvme nvme0: failed to register the CMB This patch fixes this by registering the CMB with P2P only once. Signed-off-by: NKeith Busch <keith.busch@intel.com> Reviewed-by: NLogan Gunthorpe <logang@deltatee.com> Signed-off-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NJens Axboe <axboe@kernel.dk> Signed-off-by: NSasha Levin <sashal@kernel.org>
-
由 Keith Busch 提交于
[ Upstream commit cb4bfda62afa25b4eee3d635d33fccdd9485dd7c ] A removal waits for the reset_work to complete. If a surprise removal occurs around the same time as an error triggered controller reset, and reset work happened to dispatch a command to the removed controller, the command won't be recovered since the timeout work doesn't do anything during error recovery. We wouldn't want to wait for timeout handling anyway, so this patch fixes this by disabling the controller and killing admin queues prior to syncing with the reset_work. Signed-off-by: NKeith Busch <keith.busch@intel.com> Reviewed-by: NSagi Grimberg <sagi@grimberg.me> Signed-off-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NSasha Levin <sashal@kernel.org>
-
- 06 9月, 2019 1 次提交
-
-
由 Keith Busch 提交于
[ Upstream commit bd46a90634302bfe791e93ad5496f98f165f7ae0 ] Ensure the controller is not in the NEW state when nvme_probe() exits. This will always allow a subsequent nvme_remove() to set the state to DELETING, fixing a potential race between the initial asynchronous probe and device removal. Reported-by: NLi Zhong <lizhongfs@gmail.com> Reviewed-by: NSagi Grimberg <sagi@grimberg.me> Signed-off-by: NKeith Busch <kbusch@kernel.org> Signed-off-by: NSagi Grimberg <sagi@grimberg.me> Signed-off-by: NSasha Levin <sashal@kernel.org>
-
- 26 7月, 2019 2 次提交
-
-
由 Chaitanya Kulkarni 提交于
[ Upstream commit e71afda49335620e3d9adf56015676db33a3bd86 ] This patch removes the confusing assignment of the variable result at the time of declaration and sets the value in error cases next to the places where the actual error is happening. Here we also set the result value to -ENODEV when we fail at the final ctrl state transition in nvme_reset_work(). Without this assignment result will hold 0 from nvme_setup_io_queue() and on failure 0 will be passed to he nvme_remove_dead_ctrl() from final state transition. Signed-off-by: NChaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Signed-off-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NSasha Levin <sashal@kernel.org>
-
由 Minwoo Im 提交于
[ Upstream commit cee6c269b016ba89c62e34d6bccb103ee2c7de4f ] If the state change to NVME_CTRL_CONNECTING fails, the dmesg is going to be like: [ 293.689160] nvme nvme0: failed to mark controller CONNECTING [ 293.689160] nvme nvme0: Removing after probe failure status: 0 Even it prints the first line to indicate the situation, the second line is not proper because the status is 0 which means normally success of the previous operation. This patch makes it indicate the proper error value when it fails. [ 25.932367] nvme nvme0: failed to mark controller CONNECTING [ 25.932369] nvme nvme0: Removing after probe failure status: -16 This situation is able to be easily reproduced by: root@target:~# rmmod nvme && modprobe nvme && rmmod nvme Signed-off-by: NMinwoo Im <minwoo.im.dev@gmail.com> Reviewed-by: NChaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Signed-off-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NSasha Levin <sashal@kernel.org>
-