- 30 Aug, 2017 1 commit
-
-
Committed by Christoph Hellwig
The NVMe 1.3 specification says in section 5.21.1.13: "After a successful completion of a Set Features enabling the host memory buffer, the host shall not write to the associated host memory region, buffer size, or descriptor list until the host memory buffer has been disabled." While this doesn't state that the descriptor list must remain accessible to the device, it certainly implies it must remain readable by the device. So switch to a DMA coherent allocation for the descriptor list just to be safe - it's not like the cost for it matters compared to the actual memory buffers.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Keith Busch <keith.busch@intel.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Fixes: 87ad72a5 ("nvme-pci: implement host memory buffer support")
-
- 18 Aug, 2017 1 commit
-
-
Committed by Keith Busch
Fixes: 920d13a8 ("nvme-pci: factor out the cqe reading mechanics from __nvme_process_cq")
Reported-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
-
- 10 Aug, 2017 1 commit
-
-
Committed by Max Gurtovoy
Currently we create the sysfs entry even if we fail to map the CMB. In that case, the unmapping will not remove the created sysfs file. There is no good reason to create a sysfs entry for a non-working CMB and show its characteristics.
Fixes: f63572df ("nvme: unmap CMB and remove sysfs file in reset path")
Signed-off-by: Max Gurtovoy <maxg@mellanox.com>
Reviewed-by: Stephen Bates <sbates@raithlin.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
-
- 26 Jul, 2017 1 commit
-
-
Committed by Christoph Hellwig
It's possible the preferred HMB size may not be a multiple of the chunk_size. This patch moves len to function scope and uses that in the for loop increment so the last iteration doesn't cause the total size to exceed the allocated HMB size. Based on an earlier patch from Keith Busch.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Keith Busch <keith.busch@intel.com>
Fixes: 87ad72a5 ("nvme-pci: implement host memory buffer support")
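The loop change described above can be illustrated with a small user-space sketch. This is not the driver code: `struct hmb_desc` and `hmb_split` are hypothetical names, and the sketch only models how advancing by `len` (rather than by `chunk_size`) keeps the total from overshooting the preferred size.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical descriptor: only the length matters for this sketch. */
struct hmb_desc {
    uint64_t len;
};

/*
 * Fill descs[] with chunk lengths that cover exactly `preferred` bytes.
 * Returns the number of descriptors used, or 0 if `preferred` is 0 or
 * max_descs is too small.
 */
static size_t hmb_split(uint64_t preferred, uint64_t chunk_size,
                        struct hmb_desc *descs, size_t max_descs)
{
    uint64_t size, len;   /* len lives at function scope ... */
    size_t i = 0;

    assert(chunk_size > 0);

    /* ... so the loop increment can advance by the actual chunk length. */
    for (size = 0; size < preferred; size += len) {
        uint64_t left = preferred - size;

        len = left < chunk_size ? left : chunk_size;
        if (i == max_descs)
            return 0;
        descs[i++].len = len;
    }
    return i;
}
```

With a preferred size of 10 and a chunk size of 4, the sketch produces chunks of 4, 4, and 2 bytes, so the last iteration is trimmed instead of pushing the total to 12.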
-
- 20 Jul, 2017 3 commits
-
-
Committed by Christophe JAILLET
Release resources in the correct order so as not to miss a 'put_device()' if 'nvme_dev_map()' fails.
Fixes: b00a726a ("NVMe: Don't unmap controller registers on reset")
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Reviewed-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Committed by Keith Busch
This patch replaces the invalid nvme SGL kernel panic with a warning, and returns an appropriate error. The warning will occur only on the first occurrence, and SGL details will be printed to help debug how the request was allowed to form.
Signed-off-by: Keith Busch <keith.busch@intel.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Committed by David Wayne Fugate
Adds a fourth Intel controller which has the "stripe" quirk.
Signed-off-by: David Wayne Fugate <david.fugate@intel.com>
Acked-by: Keith Busch <keith.busch@intel.com>
Acked-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
- 10 Jul, 2017 2 commits
-
-
Committed by weiping zhang
Adjust io queue depth more easily, and make sure io queue depth >= 2.
Signed-off-by: weiping zhang <zhangweiping@didichuxing.com>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
-
Committed by Dan Carpenter
"i" should be signed or it could cause a forever loop on the cleanup path. "size" can be used uninitialized.
Fixes: 87ad72a5 ("nvme-pci: implement host memory buffer support")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
-
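The signed-index fix in Dan Carpenter's commit above comes down to a common unwind idiom. The following is a minimal user-space sketch, not the driver code: `setup_bufs`, `allocated`, and `NR_BUFS` are hypothetical names used only to show why the loop counter has to be signed.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_BUFS 4

/* Hypothetical stand-ins for the allocated chunks. */
static bool allocated[NR_BUFS];

/*
 * Try to set up NR_BUFS buffers. fail_at simulates an allocation failure
 * at that index (pass -1 for "never fail").
 * Returns 0 on success, -1 after unwinding on failure.
 */
static int setup_bufs(int fail_at)
{
    int i;   /* must be signed: the unwind loop below tests i >= 0 */

    assert(fail_at < NR_BUFS);

    for (i = 0; i < NR_BUFS; i++) {
        if (i == fail_at)
            goto out_free;
        allocated[i] = true;
    }
    return 0;

out_free:
    /* With an unsigned i, `--i >= 0` would always be true. */
    while (--i >= 0)
        allocated[i] = false;
    return -1;
}

/* Count how many buffers are currently marked as allocated. */
static int count_allocated(void)
{
    int n = 0;

    for (int i = 0; i < NR_BUFS; i++)
        n += allocated[i];
    return n;
}
```

If `i` were declared `unsigned`, decrementing past zero would wrap to a large value, the condition would never become false, and the loop would index far outside the array.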
- 06 Jul, 2017 2 commits
-
-
Committed by Sagi Grimberg
Usually before we tear down the controller we want to:
1. complete/cancel any ctrl inflight works
2. remove ctrl namespaces (only for removal though, resets shouldn't remove any namespaces).
But we do not want to destroy the controller device, as we might use it for logging during the teardown stage. This patch adds nvme_start_ctrl(), which queues inflight controller works (aen, ns scan, queue start and keep-alive if kato is set), and nvme_stop_ctrl(), which cancels the works. Namespace removal is left to the callers to handle. Move nvme_uninit_ctrl after we are done with the controller device.
Reviewed-by: Keith Busch <keith.busch@intel.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
-
Committed by Sagi Grimberg
Unlike blk_mq_stop_hw_queues and blk_mq_start_stopped_hw_queues, quiescing/unquiescing respects the submission path RCU grace period.
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
-
- 03 Jul, 2017 1 commit
-
-
Committed by Christoph Hellwig
The pci_error_handlers->reset_notify() method had a flag to indicate whether to prepare for or clean up after a reset. The prepare and done cases have no shared functionality whatsoever, so split them into separate methods.
[bhelgaas: changelog, update locking comments]
Link: http://lkml.kernel.org/r/20170601111039.8913-3-hch@lst.de
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
-
- 02 Jul, 2017 4 commits
-
-
Committed by Sagi Grimberg
We are going to need the name for the core routine...
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Reviewed-by: Max Gurtovoy <maxg@mellanox.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
-
Committed by Sagi Grimberg
All transports use either a private cache of the controller cap or an on-stack copy, so move it to the generic struct nvme_ctrl. In the future it will also be maintained by the core.
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Reviewed-by: Max Gurtovoy <maxg@mellanox.com>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
-
Committed by Sagi Grimberg
All transports use the queue_count in exactly the same way, so move it to the generic struct nvme_ctrl. In the future it will also be maintained by the core.
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Reviewed-by: Max Gurtovoy <maxg@mellanox.com>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
-
Committed by Martin K. Petersen
PM1725 controllers have a couple of quirks that need to be handled in the driver:
- I/O queue depth must be limited to 64 entries on controllers that do not report MQES.
- The host interface registers go offline briefly while resetting the chip. Thus a delay is needed before checking whether the controller is ready.
Note that the admin queue depth is also limited to 64 on older versions of this board. Since our NVME_AQ_DEPTH is now 32 that is no longer an issue.
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
-
- 29 Jun, 2017 1 commit
-
-
Committed by Christoph Hellwig
Unlike most drivers that simply pass the maximum possible vectors to pci_alloc_irq_vectors, NVMe needs to configure the device before allocating the vectors, so it needs a manual update for the new scheme of using all present CPUs.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jens Axboe <axboe@kernel.dk>
Cc: Keith Busch <keith.busch@intel.com>
Cc: linux-block@vger.kernel.org
Cc: linux-nvme@lists.infradead.org
Link: http://lkml.kernel.org/r/20170626102058.10200-4-hch@lst.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
-
- 28 Jun, 2017 6 commits
-
-
Committed by Sagi Grimberg
No need to differentiate fabrics from pci/loop; also lower it to 32 as we don't really need 256 inflight admin commands.
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Max Gurtovoy <maxg@mellanox.com>
Signed-off-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Committed by Sagi Grimberg
Given that the code is simple enough, it seems better than passing a tag by reference for each call site; also we can now get rid of __nvme_process_cq.
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Committed by Sagi Grimberg
Also, maintain a consumed counter to rely on for doorbell and cqe_seen update instead of directly relying on the cq head and phase.
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
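To make the "consumed counter" idea concrete, here is a rough user-space model of a completion queue with a phase bit. It is a sketch under stated assumptions, not the driver's implementation: `struct cq`, `struct cqe`, `cqe_pending`, `process_cq`, and the plain `doorbell` field (standing in for an MMIO register) are all hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define Q_DEPTH 4

/* Simplified completion entry: bit 0 of status is the phase tag. */
struct cqe {
    uint16_t command_id;
    uint16_t status;
};

struct cq {
    struct cqe ring[Q_DEPTH];
    uint16_t head;       /* next entry to look at */
    uint8_t phase;       /* phase value that marks a new entry; starts at 1 */
    uint32_t doorbell;   /* stand-in for the MMIO head doorbell */
    int db_writes;       /* how many times the doorbell was rung */
};

/* An entry is new when its phase tag matches the expected phase. */
static int cqe_pending(const struct cq *q)
{
    return (q->ring[q->head].status & 1) == q->phase;
}

/*
 * Drain all pending entries. The doorbell is rung once, and only if
 * the local `consumed` counter shows that something was processed.
 */
static int process_cq(struct cq *q)
{
    int consumed = 0;

    assert(q);

    while (cqe_pending(q)) {
        /* A real driver would complete the matching request here. */
        if (++q->head == Q_DEPTH) {
            q->head = 0;
            q->phase = !q->phase;   /* expected phase flips on wrap */
        }
        consumed++;
    }

    if (consumed) {
        q->doorbell = q->head;
        q->db_writes++;
    }
    return consumed;
}
```

The caller can use the return value both as the "anything seen?" indicator and as the trigger for the doorbell write, without comparing the head and phase before and after the loop.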
-
Committed by Sagi Grimberg
Makes the code slightly more readable.
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Committed by Sagi Grimberg
Nice abstraction of the actual mechanics of how to do it. Note the change that we call it after we assign nvmeq->cq_head to avoid passing it.
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Committed by Keith Busch
The controller state is set to resetting prior to disabling the controller, so this patch accounts for that state when deciding if it needs to freeze the queues. Without this, an 'nvme reset /dev/nvme0' blocks forever because the queues were never frozen.
Fixes: 82b057ca ("nvme-pci: fix multiple ctrl removal scheduling")
Signed-off-by: Keith Busch <keith.busch@intel.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
- 15 Jun, 2017 9 commits
-
-
Committed by Christoph Hellwig
This moves the nvme_reset function from the PCIe driver to common code, renaming it to nvme_reset_ctrl in the process. Additionally a new helper nvme_reset_ctrl_sync is added for the case where we want to wait for the reset. To facilitate that, the reset_work work structure is moved to the common nvme_ctrl structure and the ->reset_ctrl method is removed. For now the drivers initialize the reset_work with their own callback, but longer term we should move to callouts for specific parts of the reset process and move even more code to the core.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
-
Committed by Christoph Hellwig
Now that we get the tagset passed we can have a single implementation for the I/O and admin queues.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Max Gurtovoy <maxg@mellanox.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Christoph Hellwig <hch@lst.de>
-
Committed by Christoph Hellwig
It only applies to read/write commands, and this way non-PCIe drivers get the check as well instead of having to duplicate it when adding metadata support.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Keith Busch <keith.busch@intel.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Christoph Hellwig <hch@lst.de>
-
Committed by Johannes Thumshirn
Use NVME_IDENTIFY_DATA_SIZE define instead of hard coding the magic 4096 value.
Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de>
Reviewed-by: Max Gurtovoy <maxg@mellanox.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Hannes Reinecke <hare@suse.com>
[hch: converted three more users]
Signed-off-by: Christoph Hellwig <hch@lst.de>
-
Committed by Sagi Grimberg
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Max Gurtovoy <maxg@mellanox.com>
-
Committed by Keith Busch
The controller status polling was added to preemptively reset a failed controller. This early detection would allow commands that would normally time out a chance for a retry, or find broken links when the platform didn't support hotplug. This once-per-second MMIO read, however, created more problems than it solved. It often raced with PCIe Hotplug events, which required complicated syncing between work queues, frequently triggered PCIe Completion Timeout errors that also led to fatal machine checks, and unnecessarily disrupted low power modes by running on idle controllers. This patch removes the watchdog timer, and instead checks controller health only on an IO timeout, when we have a reason to believe something is wrong. If the controller has failed, the driver will disable it immediately and request scheduling a reset.
Suggested-by: Andy Lutomirski <luto@amacapital.net>
Signed-off-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
-
Committed by Xu Yu
The existing driver initially maps 8192 bytes of BAR0, which is intended to cover the doorbells of the admin SQ and CQ. However, if a large stride, e.g. 10, is used, the doorbell of the admin CQ will fall outside those 8192 bytes. Consequently, a page fault will be raised when the admin CQ doorbell is accessed in nvme_configure_admin_queue(). This patch fixes this issue by remapping BAR0 before accessing the admin CQ doorbell if the initial mapping is not enough.
Signed-off-by: Xu Yu <yu.a.xu@intel.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Christoph Hellwig <hch@lst.de>
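The arithmetic behind the failure above can be checked with a short sketch. It relies on the NVMe layout in which doorbell registers start at offset 0x1000 and are spaced by 4 << CAP.DSTRD bytes; the helper names (`db_stride_bytes`, `cq_db_offset`, `db_is_mapped`) and the `INITIAL_MAP` constant are made up for this illustration and are not the driver's own.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NVME_REG_DBS 0x1000u   /* first doorbell register offset in BAR0 */
#define INITIAL_MAP  8192u     /* initial mapping size quoted in the commit */

/* Doorbell stride in bytes: 4 << CAP.DSTRD. */
static uint64_t db_stride_bytes(unsigned dstrd)
{
    assert(dstrd <= 15);   /* DSTRD is a 4-bit field */
    return (uint64_t)4 << dstrd;
}

/* Offset of the completion queue head doorbell for queue `qid`. */
static uint64_t cq_db_offset(unsigned qid, unsigned dstrd)
{
    return NVME_REG_DBS + (2ull * qid + 1) * db_stride_bytes(dstrd);
}

/* Does a 4-byte doorbell at `offset` fit inside a mapping of map_size bytes? */
static bool db_is_mapped(uint64_t offset, uint64_t map_size)
{
    return offset + 4 <= map_size;
}
```

With a stride value of 0 the admin CQ doorbell sits at 0x1004, well inside the mapping. With a stride value of 10 the spacing becomes 4096 bytes, so the admin CQ doorbell lands at 0x1000 + 0x1000 = 0x2000 = 8192, which is the first byte past an 8192-byte mapping.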
-
Committed by Sagi Grimberg
Instead of each transport using its own workqueue, export a single nvme-core workqueue and use that instead. In the future, this will help us move towards some unification of controller setup/teardown flows.
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Christoph Hellwig <hch@lst.de>
-
Committed by Christoph Hellwig
If a controller supports the host memory buffer we try to provide it with the requested size up to an upper cap set as a module parameter. We try to use as few descriptors as possible, eventually working our way down.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Keith Busch <keith.busch@intel.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Max Gurtovoy <maxg@mellanox.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
-
- 09 Jun, 2017 2 commits
-
-
Committed by Christoph Hellwig
Use the same values for request completion errors as for the return value from ->queue_rq. BLK_STS_RESOURCE is special cased to cause a requeue, and all the others are completed as-is.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@fb.com>
-
Committed by Christoph Hellwig
Currently we use normal Linux errno values in the block layer, and while we accept any error a few have overloaded magic meanings. This patch instead introduces a new blk_status_t value that holds block layer specific status codes and explicitly explains their meaning. Helpers to convert from and to the previous special meanings are provided for now, but I suspect we want to get rid of them in the long run - those drivers that have an errno input (e.g. networking) usually get errnos that don't know about the special block layer overloads, and similarly returning them to userspace will usually return something that strictly speaking isn't correct for file system operations, but that's left as an exercise for later. For now the set of errors is a very limited set that closely corresponds to the previous overloaded errno values, but there is some low hanging fruit to improve it. blk_status_t (ab)uses the sparse __bitwise annotations to allow for sparse typechecking, so that we can easily catch places passing the wrong values.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@fb.com>
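The conversion helpers mentioned above can be pictured with a small user-space model. Everything here is illustrative: `status_t`, the `STS_*` names, their numeric values, and the table contents are assumptions made for the sketch and do not reproduce the kernel's definitions. In the kernel the type would additionally carry the sparse `__bitwise` annotation, which plain C ignores.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for blk_status_t (no sparse annotation here). */
typedef uint8_t status_t;

#define STS_OK        ((status_t)0)
#define STS_NOTSUPP   ((status_t)1)
#define STS_NOSPC     ((status_t)2)
#define STS_RESOURCE  ((status_t)3)
#define STS_IOERR     ((status_t)4)

/* Illustrative mapping between status codes and negative errno values. */
static const struct {
    status_t sts;
    int err;
} sts_map[] = {
    { STS_OK,       0 },
    { STS_NOTSUPP,  -EOPNOTSUPP },
    { STS_NOSPC,    -ENOSPC },
    { STS_RESOURCE, -ENOMEM },
    { STS_IOERR,    -EIO },
};

#define STS_MAP_LEN (sizeof(sts_map) / sizeof(sts_map[0]))

/* Unknown errno values collapse to the generic I/O error status. */
static status_t errno_to_status(int err)
{
    assert(err <= 0);   /* this sketch expects 0 or a negative errno */

    for (size_t i = 0; i < STS_MAP_LEN; i++)
        if (sts_map[i].err == err)
            return sts_map[i].sts;
    return STS_IOERR;
}

/* Unknown status values collapse to -EIO. */
static int status_to_errno(status_t sts)
{
    for (size_t i = 0; i < STS_MAP_LEN; i++)
        if (sts_map[i].sts == sts)
            return sts_map[i].err;
    return -EIO;
}
```

The point of the dedicated type is that a status code and an errno can no longer be mixed up silently; any crossing between the two goes through an explicit helper like the ones sketched here.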
-
- 07 Jun, 2017 1 commit
-
-
Committed by Rakesh Pandit
Commit c5f6ce97 tries to address multiple resets but fails, as work_busy doesn't involve any synchronization and can fail. This is easily reproducible, as can be seen from the WARNING below, which is triggered by the line:
WARN_ON(dev->ctrl.state == NVME_CTRL_RESETTING)
Allowing multiple resets can also result in multiple controller removals if different conditions inside nvme_reset_work fail, which might deadlock on device_release_driver.
[ 480.327007] WARNING: CPU: 3 PID: 150 at drivers/nvme/host/pci.c:1900 nvme_reset_work+0x36c/0xec0
[ 480.327008] Modules linked in: rfcomm fuse nf_conntrack_netbios_ns nf_conntrack_broadcast...
[ 480.327044] btusb videobuf2_core ghash_clmulni_intel snd_hwdep cfg80211 acer_wmi hci_uart..
[ 480.327065] CPU: 3 PID: 150 Comm: kworker/u16:2 Not tainted 4.12.0-rc1+ #13
[ 480.327065] Hardware name: Acer Predator G9-591/Mustang_SLS, BIOS V1.10 03/03/2016
[ 480.327066] Workqueue: nvme nvme_reset_work
[ 480.327067] task: ffff880498ad8000 task.stack: ffffc90002218000
[ 480.327068] RIP: 0010:nvme_reset_work+0x36c/0xec0
[ 480.327069] RSP: 0018:ffffc9000221bdb8 EFLAGS: 00010246
[ 480.327070] RAX: 0000000000460000 RBX: ffff880498a98128 RCX: dead000000000200
[ 480.327070] RDX: 0000000000000001 RSI: ffff8804b1028020 RDI: ffff880498a98128
[ 480.327071] RBP: ffffc9000221be50 R08: 0000000000000000 R09: 0000000000000000
[ 480.327071] R10: ffffc90001963ce8 R11: 000000000000020d R12: ffff880498a98000
[ 480.327072] R13: ffff880498a53500 R14: ffff880498a98130 R15: ffff880498a98128
[ 480.327072] FS: 0000000000000000(0000) GS:ffff8804c1cc0000(0000) knlGS:0000000000000000
[ 480.327073] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 480.327074] CR2: 00007ffcf3c37f78 CR3: 0000000001e09000 CR4: 00000000003406e0
[ 480.327074] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 480.327075] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 480.327075] Call Trace:
[ 480.327079] ? __switch_to+0x227/0x400
[ 480.327081] process_one_work+0x18c/0x3a0
[ 480.327082] worker_thread+0x4e/0x3b0
[ 480.327084] kthread+0x109/0x140
[ 480.327085] ? process_one_work+0x3a0/0x3a0
[ 480.327087] ? kthread_park+0x60/0x60
[ 480.327102] ret_from_fork+0x2c/0x40
[ 480.327103] Code: e8 5a dc ff ff 85 c0 41 89 c1 0f.....
This patch addresses the problem by using the state of the controller to decide whether a reset should be queued or not, as the state change is synchronized using the controller spinlock. Also, cancel_work_sync is used to make sure remove cancels the reset_work and waits for it to finish. This patch also changes the return value from -ENODEV to the more appropriate -EBUSY if nvme_reset fails to change state.
Fixes: c5f6ce97 ("nvme: don't schedule multiple resets")
Signed-off-by: Rakesh Pandit <rakesh@tuxera.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Christoph Hellwig <hch@lst.de>
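The idea of gating the reset on a locked state transition can be modelled in user space roughly as follows. This is a simplified sketch, not the kernel code: the three-state machine, `struct ctrl`, `ctrl_init`, `change_state`, and `reset_ctrl` are hypothetical, the transition rules are reduced to the minimum needed for the example, and a C11 `atomic_flag` stands in for the controller spinlock.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stdbool.h>

enum ctrl_state { CTRL_LIVE, CTRL_RESETTING, CTRL_DELETING };

struct ctrl {
    atomic_flag lock;        /* stand-in for the controller spinlock */
    enum ctrl_state state;
    int resets_queued;       /* counts how often reset work was queued */
};

static void ctrl_init(struct ctrl *c)
{
    atomic_flag_clear(&c->lock);
    c->state = CTRL_LIVE;
    c->resets_queued = 0;
}

/* Attempt a state transition under the lock; returns true if it happened. */
static bool change_state(struct ctrl *c, enum ctrl_state new_state)
{
    bool changed = false;

    assert(c);

    while (atomic_flag_test_and_set(&c->lock))
        ;   /* spin until the lock is free */

    switch (new_state) {
    case CTRL_RESETTING:
        changed = (c->state == CTRL_LIVE);
        break;
    case CTRL_LIVE:
        changed = (c->state == CTRL_RESETTING);
        break;
    case CTRL_DELETING:
        changed = (c->state != CTRL_DELETING);
        break;
    }
    if (changed)
        c->state = new_state;

    atomic_flag_clear(&c->lock);
    return changed;
}

/* Queue a reset only if this caller won the transition to RESETTING. */
static int reset_ctrl(struct ctrl *c)
{
    if (!change_state(c, CTRL_RESETTING))
        return -EBUSY;
    c->resets_queued++;   /* a real driver would queue reset_work here */
    return 0;
}
```

Because the check and the update happen under one lock, two concurrent callers cannot both observe the live state, so at most one of them queues the work and the other gets -EBUSY.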
-
- 26 May, 2017 3 commits
-
-
Committed by Andy Lutomirski
They have known firmware bugs. A fix is apparently in the works -- once fixed firmware is available, someone from Intel (Hi, Keith!) can adjust the quirk accordingly.
Cc: stable@vger.kernel.org # v4.11
Cc: Kai-Heng Feng <kai.heng.feng@canonical.com>
Cc: Mario Limonciello <mario_limonciello@dell.com>
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Christoph Hellwig <hch@lst.de>
-
Committed by Christoph Hellwig
Currently only the PCIe driver supports metadata, so we should not claim integrity support for the other drivers. This prevents nasty crashes with targets that advertise metadata support on fabrics. Also use the opportunity to factor out some code into a separate helper that isn't even compiled if CONFIG_BLK_DEV_INTEGRITY is disabled.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Keith Busch <keith.busch@intel.com>
-
Committed by Christoph Hellwig
This is what most of the code already does and gives much more useful prefixes than the device embedded in the pci_dev.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Keith Busch <keith.busch@intel.com>
-
- 21 May, 2017 1 commit
-
-
Committed by Jon Derrick
The CMB doesn't get unmapped until removal, while it gets remapped on every reset. Add the unmapping and sysfs file removal to the reset path in nvme_pci_disable to match the mapping path in nvme_pci_enable.
Fixes: 202021c1 ("nvme : Add sysfs entry for NVMe CMBs when appropriate")
Signed-off-by: Jon Derrick <jonathan.derrick@intel.com>
Acked-by: Keith Busch <keith.busch@intel.com>
Reviewed-by: Stephen Bates <sbates@raithlin.com>
Cc: <stable@vger.kernel.org> # 4.9+
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@fb.com>
-
- 02 May, 2017 1 commit
-
-
Committed by Christoph Hellwig
Remove the request_idx parameter, which can't be used safely now that we support I/O schedulers with blk-mq. Except for a superfluous check in mtip32xx it was unused anyway. Also pass the tag_set instead of just the driver data - this allows drivers to avoid some code duplication in a follow-on cleanup.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@fb.com>
-