- 05 4月, 2019 5 次提交
-
-
由 Christoph Hellwig 提交于
Cleaning up the command setup isn't related to unmapping data, and disentangling them will simplify error handling a bit down the road. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NKeith Busch <keith.busch@intel.com> Reviewed-by: NSagi Grimberg <sagi@grimberg.me> Reviewed-by: NChaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
-
由 Christoph Hellwig 提交于
nvme_init_iod should really be split into two parts: initialize a few general iod fields, which can easily be done at the beginning of nvme_queue_rq, and allocating the scatterlist if needed, which logically belongs into nvme_map_data with the code making use of it. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NKeith Busch <keith.busch@intel.com> Reviewed-by: NSagi Grimberg <sagi@grimberg.me> Reviewed-by: NChaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
-
由 Keith Busch 提交于
Signed-off-by: NKeith Busch <keith.busch@intel.com> Reviewed-by: NSagi Grimberg <sagi@grimberg.me> Signed-off-by: NChristoph Hellwig <hch@lst.de>
-
由 Keith Busch 提交于
We don't need to save the dma device as it's not used in the hot path and hasn't in a long time. Shrink the struct nvme_queue removing this unnecessary member. Signed-off-by: NKeith Busch <keith.busch@intel.com> Reviewed-by: NSagi Grimberg <sagi@grimberg.me> Signed-off-by: NChristoph Hellwig <hch@lst.de>
-
由 Keith Busch 提交于
A negative value for the cq_vector used to mean the queue is either disabled or a polled queue. However, we have a queue enabled flag, so the cq_vector had been serving double duty. Don't overload the meaning of cq_vector. Use a flag specific to the polled queues instead. Signed-off-by: NKeith Busch <keith.busch@intel.com> Reviewed-by: NSagi Grimberg <sagi@grimberg.me> Signed-off-by: NChristoph Hellwig <hch@lst.de>
-
- 14 3月, 2019 1 次提交
-
-
由 Christoph Hellwig 提交于
Qemu started out with a broken implementation of Write Zeroes written by yours truly. Disable Write Zeroes on qemu for now, eventually we need to go back and make all the qemu quirks version specific, but that is left for another time. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NKeith Busch <keith.busch@intel.com> Tested-by: NMing Lei <ming.lei@redhat.com> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
- 20 2月, 2019 2 次提交
-
-
由 Christoph Hellwig 提交于
Update license to use SPDX-License-Identifier instead of verbose license text. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NSagi Grimberg <sagi@grimberg.me>
-
由 Bart Van Assche 提交于
This patch avoids that the compiler complains about 'ret' being set but not being used when building with W=1. Fixes: 3b6592f7 ("nvme: utilize two queue maps, one for reads and one for writes") # v5.0-rc1 Signed-off-by: NBart Van Assche <bvanassche@acm.org> Reviewed-by: NSagi Grimberg <sagi@grimberg.me> Reviewed-by: NChaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Signed-off-by: NChristoph Hellwig <hch@lst.de>
-
- 18 2月, 2019 2 次提交
-
-
由 Ming Lei 提交于
The NVME PCI driver contains a tedious mechanism for interrupt allocation, which is necessary to adjust the number and size of interrupt sets to the maximum available number of interrupts which depends on the underlying PCI capabilities and the available CPU resources. It works around the former short comings of the PCI and core interrupt allocation mechanims in combination with interrupt sets. The PCI interrupt allocation function allows to provide a maximum and a minimum number of interrupts to be allocated and tries to allocate as many as possible. This worked without driver interaction as long as there was only a single set of interrupts to handle. With the addition of support for multiple interrupt sets in the generic affinity spreading logic, which is invoked from the PCI interrupt allocation, the adaptive loop in the PCI interrupt allocation did not work for multiple interrupt sets. The reason is that depending on the total number of interrupts which the PCI allocation adaptive loop tries to allocate in each step, the number and the size of the interrupt sets need to be adapted as well. Due to the way the interrupt sets support was implemented there was no way for the PCI interrupt allocation code or the core affinity spreading mechanism to invoke a driver specific function for adapting the interrupt sets configuration. As a consequence the driver had to implement another adaptive loop around the PCI interrupt allocation function and calling that with maximum and minimum interrupts set to the same value. This ensured that the allocation either succeeded or immediately failed without any attempt to adjust the number of interrupts in the PCI code. The core code now allows drivers to provide a callback to recalculate the number and the size of interrupt sets during PCI interrupt allocation, which in turn allows the PCI interrupt allocation function to be called in the same way as with a single set of interrupts. The PCI code handles the adaptive loop and the interrupt affinity spreading mechanism invokes the driver callback to adapt the interrupt set configuration to the current loop value. This replaces the adaptive loop in the driver completely. Implement the NVME specific callback which adjusts the interrupt sets configuration and remove the adaptive allocation loop. [ tglx: Simplify the callback further and restore the dropped adjustment of number of sets ] Signed-off-by: NMing Lei <ming.lei@redhat.com> Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Acked-by: NMarc Zyngier <marc.zyngier@arm.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Bjorn Helgaas <helgaas@kernel.org> Cc: Jens Axboe <axboe@kernel.dk> Cc: linux-block@vger.kernel.org Cc: Sagi Grimberg <sagi@grimberg.me> Cc: linux-nvme@lists.infradead.org Cc: linux-pci@vger.kernel.org Cc: Keith Busch <keith.busch@intel.com> Cc: Sumit Saxena <sumit.saxena@broadcom.com> Cc: Kashyap Desai <kashyap.desai@broadcom.com> Cc: Shivasharan Srikanteshwara <shivasharan.srikanteshwara@broadcom.com> Link: https://lkml.kernel.org/r/20190216172228.602546658@linutronix.de
-
由 Ming Lei 提交于
The interrupt affinity spreading mechanism supports to spread out affinities for one or more interrupt sets. A interrupt set contains one or more interrupts. Each set is mapped to a specific functionality of a device, e.g. general I/O queues and read I/O queus of multiqueue block devices. The number of interrupts per set is defined by the driver. It depends on the total number of available interrupts for the device, which is determined by the PCI capabilites and the availability of underlying CPU resources, and the number of queues which the device provides and the driver wants to instantiate. The driver passes initial configuration for the interrupt allocation via a pointer to struct irq_affinity. Right now the allocation mechanism is complex as it requires to have a loop in the driver to determine the maximum number of interrupts which are provided by the PCI capabilities and the underlying CPU resources. This loop would have to be replicated in every driver which wants to utilize this mechanism. That's unwanted code duplication and error prone. In order to move this into generic facilities it is required to have a mechanism, which allows the recalculation of the interrupt sets and their size, in the core code. As the core code does not have any knowledge about the underlying device, a driver specific callback will be added to struct affinity_desc, which will be invoked by the core code. The callback will get the number of available interupts as an argument, so the driver can calculate the corresponding number and size of interrupt sets. To support this, two modifications for the handling of struct irq_affinity are required: 1) The (optional) interrupt sets size information is contained in a separate array of integers and struct irq_affinity contains a pointer to it. This is cumbersome and as the maximum number of interrupt sets is small, there is no reason to have separate storage. Moving the size array into struct affinity_desc avoids indirections and makes the code simpler. 2) At the moment the struct irq_affinity pointer which is handed in from the driver and passed through to several core functions is marked 'const'. With the upcoming callback to recalculate the number and size of interrupt sets, it's necessary to remove the 'const' qualifier. Otherwise the callback would not be able to update the data. Implement #1 and store the interrupt sets size in 'struct irq_affinity'. No functional change. [ tglx: Fixed the memcpy() size so it won't copy beyond the size of the source. Fixed the kernel doc comments for struct irq_affinity and de-'This patch'-ed the changelog ] Signed-off-by: NMing Lei <ming.lei@redhat.com> Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Acked-by: NMarc Zyngier <marc.zyngier@arm.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Bjorn Helgaas <helgaas@kernel.org> Cc: Jens Axboe <axboe@kernel.dk> Cc: linux-block@vger.kernel.org Cc: Sagi Grimberg <sagi@grimberg.me> Cc: linux-nvme@lists.infradead.org Cc: linux-pci@vger.kernel.org Cc: Keith Busch <keith.busch@intel.com> Cc: Sumit Saxena <sumit.saxena@broadcom.com> Cc: Kashyap Desai <kashyap.desai@broadcom.com> Cc: Shivasharan Srikanteshwara <shivasharan.srikanteshwara@broadcom.com> Link: https://lkml.kernel.org/r/20190216172228.423723127@linutronix.de
-
- 12 2月, 2019 1 次提交
-
-
由 Keith Busch 提交于
The reset work holds a mutex to prevent races with removal modifying the same resources, but was unlocking only on success. Unlock on failure too. Fixes: 5c959d73 ("nvme-pci: fix rapid add remove sequence") Signed-off-by: NKeith Busch <keith.busch@intel.com> Signed-off-by: NChristoph Hellwig <hch@lst.de>
-
- 06 2月, 2019 1 次提交
-
-
由 Keith Busch 提交于
A surprise removal may fail to tear down request queues if it is racing with the initial asynchronous probe. If that happens, the remove path won't see the queue resources to tear down, and the controller reset path may create a new request queue on a removed device, but will not be able to make forward progress, deadlocking the pci removal. Protect setting up non-blocking resources from a shutdown by holding the same mutex, and transition to the CONNECTING state after these resources are initialized so the probe path may see the dead controller state before dispatching new IO. Link: https://bugzilla.kernel.org/show_bug.cgi?id=202081Reported-by: NAlex Gagniuc <Alex_Gagniuc@Dellteam.com> Signed-off-by: NKeith Busch <keith.busch@intel.com> Tested-by: NAlex Gagniuc <mr.nuke.me@gmail.com> Signed-off-by: NChristoph Hellwig <hch@lst.de>
-
- 17 1月, 2019 1 次提交
-
-
由 Ming Lei 提交于
When -ENOSPC is returned from pci_alloc_irq_vectors_affinity(), we still try to allocate multiple irq vectors again, so irq queues covers the admin queue actually. But we don't consider that, then number of the allocated irq vector may be same with sum of io_queues[HCTX_TYPE_DEFAULT] and io_queues[HCTX_TYPE_READ], this way is obviously wrong, and finally breaks nvme_pci_map_queues(), and warning from pci_irq_get_affinity() is triggered. IRQ queues should cover admin queues, this patch makes this point explicitely in nvme_calc_io_queues(). We got severl boot failure internal report on aarch64, so please consider to fix it in v4.20. Fixes: 6451fe73 ("nvme: fix irq vs io_queue calculations") Signed-off-by: NMing Lei <ming.lei@redhat.com> Reviewed-by: NKeith Busch <keith.busch@intel.com> Tested-by: Nfin4478 <fin4478@hotmail.com> Signed-off-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
- 10 1月, 2019 5 次提交
-
-
由 James Dingwall 提交于
If a device provides an NQN it is expected to be globally unique. Unfortunately some firmware revisions for Intel 760p/Pro 7600p devices did not satisfy this requirement. In these circumstances if a system has >1 affected device then only one device is enabled. If this quirk is enabled then the device supplied subnqn is ignored and we fallback to generating one as if the field was empty. In this case we also suppress the version check so we don't print a warning when the quirk is enabled. Reviewed-by: NKeith Busch <keith.busch@intel.com> Signed-off-by: NJames Dingwall <james@dingwall.me.uk> Signed-off-by: NChristoph Hellwig <hch@lst.de>
-
由 Hongbo Yao 提交于
There is an out of bounds array access in nvme_cqe_peding(). When enable irq_thread for nvme interrupt, there is racing between the nvmeq->cq_head updating and reading. nvmeq->cq_head is updated in nvme_update_cq_head(), if nvmeq->cq_head equals nvmeq->q_depth and before its value set to zero, nvme_cqe_pending() uses its value as an array index, the index will be out of bounds. Signed-off-by: NHongbo Yao <yaohongbo@huawei.com> [hch: slight coding style update] Signed-off-by: NChristoph Hellwig <hch@lst.de>
-
由 Keith Busch 提交于
If the driver is unable to create a subset of IO queues for any reason, the read/write and polled queue sets will not match the actual allocated hardware contexts. This leaves gaps in the CPU affinity mappings and causes the following kernel panic after blk_mq_map_queue_type() returns a NULL hctx. BUG: unable to handle kernel NULL pointer dereference at 0000000000000198 #PF error: [normal kernel read fault] PGD 0 P4D 0 Oops: 0000 [#1] SMP CPU: 64 PID: 1171 Comm: kworker/u259:1 Not tainted 4.20.0+ #241 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-2.fc27 04/01/2014 Workqueue: nvme-wq nvme_scan_work [nvme_core] RIP: 0010:blk_mq_init_allocated_queue+0x2d9/0x440 RSP: 0018:ffffb1bf0abc3cd0 EFLAGS: 00010286 RAX: 000000000000001f RBX: ffff8ea744cf0718 RCX: 0000000000000000 RDX: 0000000000000002 RSI: 000000000000007c RDI: ffffffff9109a820 RBP: ffff8ea7565f7008 R08: 000000000000001f R09: 000000000000003f R10: ffffb1bf0abc3c00 R11: 0000000000000000 R12: 000000000001d008 R13: ffff8ea7565f7008 R14: 000000000000003f R15: 0000000000000001 FS: 0000000000000000(0000) GS:ffff8ea757200000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000198 CR3: 0000000013058000 CR4: 00000000000006e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: blk_mq_init_queue+0x35/0x60 nvme_validate_ns+0xc6/0x7c0 [nvme_core] ? nvme_identify_ctrl.isra.56+0x7e/0xc0 [nvme_core] nvme_scan_work+0xc8/0x340 [nvme_core] ? __wake_up_common+0x6d/0x120 ? try_to_wake_up+0x55/0x410 process_one_work+0x1e9/0x3d0 worker_thread+0x2d/0x3d0 ? process_one_work+0x3d0/0x3d0 kthread+0x111/0x130 ? kthread_park+0x90/0x90 ret_from_fork+0x1f/0x30 Modules linked in: nvme nvme_core serio_raw CR2: 0000000000000198 Fix by re-running the interrupt vector setup from scratch using a reduced count that may be successful until the created queues matches the irq affinity plus polling queue sets. Signed-off-by: NKeith Busch <keith.busch@intel.com> Reviewed-by: NSagi Grimberg <sagi@grimberg.me> Reviewed-by: NMing Lei <ming.lei@redhat.com> Signed-off-by: NChristoph Hellwig <hch@lst.de>
-
由 Liviu Dudau 提交于
When using HMB the PCIe host driver allocates host_mem_desc_bufs using dma_alloc_attrs() but frees them using dma_free_coherent(). Use the correct dma_free_attrs() function to free the buffers. Signed-off-by: NLiviu Dudau <liviu@dudau.co.uk> Signed-off-by: NChristoph Hellwig <hch@lst.de>
-
由 Jianchao Wang 提交于
We only set the nr_maps to 3 if poll queues are supported. Signed-off-by: NJianchao Wang <jianchao.w.wang@oracle.com> Signed-off-by: NChristoph Hellwig <hch@lst.de>
-
- 08 1月, 2019 1 次提交
-
-
由 Luis Chamberlain 提交于
We already need to zero out memory for dma_alloc_coherent(), as such using dma_zalloc_coherent() is superflous. Phase it out. This change was generated with the following Coccinelle SmPL patch: @ replace_dma_zalloc_coherent @ expression dev, size, data, handle, flags; @@ -dma_zalloc_coherent(dev, size, handle, flags) +dma_alloc_coherent(dev, size, handle, flags) Suggested-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NLuis Chamberlain <mcgrof@kernel.org> [hch: re-ran the script on the latest tree] Signed-off-by: NChristoph Hellwig <hch@lst.de>
-
- 19 12月, 2018 3 次提交
-
-
由 yupeng 提交于
Export the disk name, queue id, sq_head, sq_tail to a trace event in completion handling. Usage example: cd /sys/kernel/debug/tracing/events/nvme/nvme_sq echo 'disk=="nvme1n1"' > filter echo 1 > enable cat /sys/kernel/debug/tracing/trace_pipe Signed-off-by: Nyupeng <yupeng0921@gmail.com> Reviewed-by: NSagi Grimberg <sagi@grimberg.me> Reviewed-by: NKeith Busch <keith.busch@intel.com> [hch: slight formatting tweaks, use standard nvme tracepoint conventions] Signed-off-by: NChristoph Hellwig <hch@lst.de> wip
-
由 Christoph Hellwig 提交于
By duplicating the nvme_process_cq in both branches we keep the sparse lock context checking happy, so do it. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NSagi Grimberg <sagi@grimberg.me>
-
由 Christoph Hellwig 提交于
The block layer now enables polling support on a queue if nr_maps includes the poll map, so we should only set that if we actually support poll queues. Fixes: 6544d229 ("block: enable polling by default if a poll map is initalized") Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NJens Axboe <axboe@kernel.dk> Reviewed-by: NSagi Grimberg <sagi@grimberg.me>
-
- 17 12月, 2018 1 次提交
-
-
由 Christoph Hellwig 提交于
Now that the block layer checks if a queue map has any queues inside it there is no more reason to duplicate the maps for the non-default types. Reviewed-by: NMing Lei <ming.lei@redhat.com> Signed-off-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
- 11 12月, 2018 1 次提交
-
-
由 Jens Axboe 提交于
Guenter reported an boot hang issue on HPPA after we default to 0 poll queues. We have two issues in the queue count calculations: 1) We don't separate the poll queues from the read/write queues. This is important, since the former doesn't need interrupts. 2) The adjust logic is broken. Adjust the poll queue count before doing nvme_calc_io_queues(). The poll queue count is only limited by the IO queue count we were able to get from the controller, not failures in the IRQ allocation loop. This leaves nvme_calc_io_queues() just adjusting the read/write queue map. Reported-by: NReported-by: Guenter Roeck <linux@roeck-us.net> Reviewed-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NSagi Grimberg <sagi@grimberg.me> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
- 05 12月, 2018 9 次提交
-
-
由 Christoph Hellwig 提交于
This avoids having to have differnet mq_ops for different setups with or without poll queues. Reviewed-by: NSagi Grimberg <sagi@grimberg.me> Signed-off-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Christoph Hellwig 提交于
Now that we can't poll regular, interrupt driven I/O queues there is almost nothing that can race with an interrupt. The only possible other contexts polling a CQ are the error handler and queue shutdown, and both are so far off in the slow path that we can simply use the big hammer of disabling interrupts. With that we can stop taking the cq_lock for normal queues. Reviewed-by: NSagi Grimberg <sagi@grimberg.me> Reviewed-by: NKeith Busch <keith.busch@intel.com> Signed-off-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Christoph Hellwig 提交于
This is the last place outside of nvme_irq that handles CQEs from interrupt context, and thus is in the way of removing the cq_lock for normal queues, and avoiding lockdep warnings on the poll queues, for which we already take it without IRQ disabling. Reviewed-by: NKeith Busch <keith.busch@intel.com> Reviewed-by: NSagi Grimberg <sagi@grimberg.me> Signed-off-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Christoph Hellwig 提交于
Pass the opcode for the delete SQ/CQ command as an argument instead of the somewhat confusing pass loop. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NKeith Busch <keith.busch@intel.com> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Christoph Hellwig 提交于
We have three places that can poll for I/O completions on a normal interrupt-enabled queue. All of them are in slow path code, so consolidate them to a single helper that uses spin_lock_irqsave and removes the fast path cqe_pending check. Reviewed-by: NSagi Grimberg <sagi@grimberg.me> Signed-off-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Christoph Hellwig 提交于
This will allow us to simplify both the regular NVMe interrupt handler and the upcoming aio poll code. In addition to that the separate queues are generally a good idea for performance reasons. Reviewed-by: NKeith Busch <keith.busch@intel.com> Reviewed-by: NSagi Grimberg <sagi@grimberg.me> Signed-off-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Christoph Hellwig 提交于
Use a bit flag to mark if the SQ was allocated from the CMB, and clean up the surrounding code a bit. Reviewed-by: NSagi Grimberg <sagi@grimberg.me> Reviewed-by: NKeith Busch <keith.busch@intel.com> Signed-off-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Christoph Hellwig 提交于
This gets rid of all the messing with cq_vector and the ->polled field by using an atomic bitop to mark the queue enabled or not. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NKeith Busch <keith.busch@intel.com> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Christoph Hellwig 提交于
Having another indirect all in the fast path doesn't really help in our post-spectre world. Also having too many queue type is just going to create confusion, so I'd rather manage them centrally. Note that the queue type naming and ordering changes a bit - the first index now is the default queue for everything not explicitly marked, the optional ones are read and poll queues. Reviewed-by: NSagi Grimberg <sagi@grimberg.me> Signed-off-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
- 30 11月, 2018 1 次提交
-
-
由 Jens Axboe 提交于
Split the command submission and the SQ doorbell ring, and add the doorbell ring as our ->commit_rqs() hook. This allows a list of requests to be issued, with nvme only writing the SQ update when it's necessary. This is more efficient if we have lists of requests to issue, particularly on virtualized hardware, where writing the SQ doorbell is more expensive than on real hardware. For those cases, performance increases of 2-3x have been observed. The use case for this is plugged IO, where blk-mq flushes a batch of requests at the time. Reviewed-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
- 26 11月, 2018 2 次提交
-
-
由 Jens Axboe 提交于
We always pass in -1 now and none of the callers use the tag value, remove the parameter. Reviewed-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Jens Axboe 提交于
If we want to support async IO polling, then we have to allow finding completions that aren't just for the one we are looking for. Always pass in -1 to the mq_ops->poll() helper, and have that return how many events were found in this poll loop. Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
- 19 11月, 2018 1 次提交
-
-
由 Jens Axboe 提交于
We need a better way of configuring this, and given that polling is (still) a bit niche, let's default to using 0 poll queues. That way we'll have the same read/write/poll behavior as 4.20, and users that want to test/use polling are required to do manual configuration of the number of poll queues. Reviewed-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
- 16 11月, 2018 2 次提交
-
-
由 Jens Axboe 提交于
If we have separate poll queues, we know that they aren't using interrupts. Hence we don't need to disable interrupts around finding completions. Provide a separate set of blk_mq_ops for such devices. Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Jens Axboe 提交于
At least on SPARC, if MSI/MSI-X isn't supported, we get EINVAL if we ask for more than one vector. This isn't covered by our ENOSPC check. If we get EINVAL, decrease our ask to just one vector, instead of bailing out in error. Fixes: 3b6592f7 ("nvme: utilize two queue maps, one for reads and one for writes") Reported-by: NGuenter Roeck <linux@roeck-us.net> Tested-by: NGuenter Roeck <linux@roeck-us.net> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
- 15 11月, 2018 1 次提交
-
-
由 Jens Axboe 提交于
NVMe always asks for io_queues + 1 worth of IRQ vectors, which means that even when we scale all the way down, we still ask for 2 vectors and get -ENOSPC in return if the system can't support more than 1. Getting just 1 vector is fine, it just means that we'll have 1 IO queue and 1 admin queue, with a shared vector between them. Check for this case and don't add our + 1 if it happens. Fixes: 3b6592f7 ("nvme: utilize two queue maps, one for reads and one for writes") Reported-by: NGuenter Roeck <linux@roeck-us.net> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-