- 25 6月, 2019 1 次提交
-
-
由 Jaesoo Lee 提交于
[ Upstream commit c8e8c77b3bdbade6e26e8e76595f141ede12b692 ] The Number of Namespaces (nn) field in the identify controller data structure is defined as u32 and the maximum allowed value in NVMe specification is 0xFFFFFFFEUL. This change fixes the possible overflow of the DIV_ROUND_UP() operation used in nvme_scan_ns_list() by casting the nn to u64. Signed-off-by: NJaesoo Lee <jalee@purestorage.com> Reviewed-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NSagi Grimberg <sagi@grimberg.me> Signed-off-by: NSasha Levin <sashal@kernel.org>
-
- 19 6月, 2019 5 次提交
-
-
由 Yufen Yu 提交于
[ Upstream commit 510a405d945bc985abc513fafe45890cac34fafa ] Unconditionally hide device pm latency tolerance when uninitializing the controller to ensure all qos resources are released so that we're not leaking this memory. This is safe to call if none were allocated in the first place, or were previously freed. Fixes: c5552fde("nvme: Enable autonomous power state transitions") Suggested-by: NKeith Busch <keith.busch@intel.com> Tested-by: NDavid Milburn <dmilburn@redhat.com> Signed-off-by: NYufen Yu <yuyufen@huawei.com> [changelog] Signed-off-by: NKeith Busch <keith.busch@intel.com> Signed-off-by: NSasha Levin <sashal@kernel.org>
-
由 Christoph Hellwig 提交于
[ Upstream commit 5fb4aac756acacf260b9ebd88747251effa3a2f2 ] Holding the SRCU critical section protecting the namespace list can cause deadlocks when using the per-namespace admin passthrough ioctl to delete as namespace. Release it earlier when performing per-controller ioctls to avoid that. Reported-by: NKenneth Heitke <kenneth.heitke@intel.com> Reviewed-by: NChaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Reviewed-by: NKeith Busch <keith.busch@intel.com> Signed-off-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NSasha Levin <sashal@kernel.org>
-
由 Christoph Hellwig 提交于
[ Upstream commit 90ec611adcf20b96d0c2b7166497d53e4301a57f ] Merge the two functions to make future changes a little easier. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NKeith Busch <keith.busch@intel.com> Reviewed-by: NChaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Signed-off-by: NSasha Levin <sashal@kernel.org>
-
由 Christoph Hellwig 提交于
[ Upstream commit 3f98bcc58cd5f1e4668db289dcab771874cc0920 ] We already have a proper stub if lightnvm is not enabled, so don't bother with the ifdef. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NKeith Busch <keith.busch@intel.com> Reviewed-by: NChaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Signed-off-by: NSasha Levin <sashal@kernel.org>
-
由 Christoph Hellwig 提交于
[ Upstream commit 100c815cbd56480b3e31518475b04719c363614a ] If we can't get a namespace don't leak the SRCU lock. nvme_ioctl was working around this, but nvme_pr_command wasn't handling this properly. Just do what callers would usually expect. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NKeith Busch <keith.busch@intel.com> Reviewed-by: NChaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Signed-off-by: NSasha Levin <sashal@kernel.org>
-
- 31 5月, 2019 1 次提交
-
-
由 Sagi Grimberg 提交于
[ Upstream commit 01fa017484ad98fccdeaab32db0077c574b6bd6f ] If our target exposed a namespace with a block size that is greater than PAGE_SIZE, set 0 capacity on the namespace as we do not support it. This issue encountered when the nvmet namespace was backed by a tempfile. Signed-off-by: NSagi Grimberg <sagi@grimberg.me> Reviewed-by: NKeith Busch <keith.busch@intel.com> Signed-off-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NSasha Levin <sashal@kernel.org>
-
- 14 3月, 2019 1 次提交
-
-
由 Keith Busch 提交于
[ Upstream commit e7ad43c3eda6a1690c4c3c341f95dc1c6898da83 ] If a controller supports the NS Change Notification, the namespace scan_work is automatically triggered after attaching a new namespace. Occasionally the namespace scan_work may append the new namespace to the list before the admin command effects handling is completed. The effects handling unfreezes namespaces, but if it unfreezes the newly attached namespace, its request_queue freeze depth will be off and we'll hit the warning in blk_mq_unfreeze_queue(). On the next namespace add, we will fail to freeze that queue due to the previous bad accounting and deadlock waiting for frozen. Fix that by preventing scan work from altering the namespace list while command effects handling needs to pair freeze with unfreeze. Reported-by: NWen Xiong <wenxiong@us.ibm.com> Tested-by: NWen Xiong <wenxiong@us.ibm.com> Signed-off-by: NKeith Busch <keith.busch@intel.com> Reviewed-by: NChaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Signed-off-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NSasha Levin <sashal@kernel.org>
-
- 20 2月, 2019 1 次提交
-
-
由 Keith Busch 提交于
[ Upstream commit 3da584f57133e51aeb84aaefae5e3d69531a1e4f ] We need to preserve the leading zeros in the vid and ssvid when generating a unique NQN. Truncating these may lead to naming collisions. Signed-off-by: NKeith Busch <keith.busch@intel.com> Signed-off-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NSasha Levin <sashal@kernel.org>
-
- 21 12月, 2018 1 次提交
-
-
由 James Smart 提交于
[ Upstream commit 86880d646122240596d6719b642fee3213239994 ] Delete operations are seeing NULL pointer references in call_timer_fn. Tracking these back, the timer appears to be the keep alive timer. nvme_keep_alive_work() which is tied to the timer that is cancelled by nvme_stop_keep_alive(), simply starts the keep alive io but doesn't wait for it's completion. So nvme_stop_keep_alive() only stops a timer when it's pending. When a keep alive is in flight, there is no timer running and the nvme_stop_keep_alive() will have no affect on the keep alive io. Thus, if the io completes successfully, the keep alive timer will be rescheduled. In the failure case, delete is called, the controller state is changed, the nvme_stop_keep_alive() is called while the io is outstanding, and the delete path continues on. The keep alive happens to successfully complete before the delete paths mark it as aborted as part of the queue termination, so the timer is restarted. The delete paths then tear down the controller, and later on the timer code fires and the timer entry is now corrupt. Fix by validating the controller state before rescheduling the keep alive. Testing with the fix has confirmed the condition above was hit. Signed-off-by: NJames Smart <jsmart2021@gmail.com> Reviewed-by: NSagi Grimberg <sagi@grimberg.me> Signed-off-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NSasha Levin <sashal@kernel.org>
-
- 17 12月, 2018 1 次提交
-
-
由 Sagi Grimberg 提交于
[ Upstream commit f6c8e432cb0479255322c5d0335b9f1699a0270c ] nvme_stop_ctrl can be called also for reset flow and there is no need to flush the scan_work as namespaces are not being removed. This can cause deadlock in rdma, fc and loop drivers since nvme_stop_ctrl barriers before controller teardown (and specifically I/O cancellation of the scan_work itself) takes place, but the scan_work will be blocked anyways so there is no need to flush it. Instead, move scan_work flush to nvme_remove_namespaces() where it really needs to flush. Reported-by: NMing Lei <ming.lei@redhat.com> Signed-off-by: NSagi Grimberg <sagi@grimberg.me> Reviewed-by: NKeith Busch <keith.busch@intel.com> Reviewed by: James Smart <jsmart2021@gmail.com> Tested-by: NEwan D. Milne <emilne@redhat.com> Signed-off-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NSasha Levin <sashal@kernel.org>
-
- 27 11月, 2018 1 次提交
-
-
由 Sagi Grimberg 提交于
[ Upstream commit 8f676b8508c250bbe255096522fdefb73f1ea0b9 ] Whenever we update ns_head info, we need to make sure it is still compatible with all underlying backing devices because although nvme multipath doesn't have any explicit use of these limits, other devices can still be stacked on top of it which may rely on the underlying limits. Start with unlimited stacking limits, and every info update iterate over siblings and adjust queue limits. Signed-off-by: NSagi Grimberg <sagi@grimberg.me> Signed-off-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NJens Axboe <axboe@kernel.dk> Signed-off-by: NSasha Levin <sashal@kernel.org>
-
- 08 10月, 2018 1 次提交
-
-
由 Keith Busch 提交于
The code had been clearing a namespace being deleted as the current path while that namespace was still in the path siblings list. It is possible a new IO could set that namespace back to the current path since it appeared to be an eligable path to select, which may result in a use-after-free error. This patch ensures a namespace being removed is not eligable to be reset as a current path prior to clearing it as the current path. Signed-off-by: NKeith Busch <keith.busch@intel.com> Reviewed-by: NSagi Grimberg <sagi@grimberg.me> Signed-off-by: NChristoph Hellwig <hch@lst.de>
-
- 08 8月, 2018 1 次提交
-
-
由 Chaitanya Kulkarni 提交于
NVMe 1.3 TP 4005 introduces new filed (NSATTR). This field indicates whether given namespace is write protected or not. This patch sets the gendisk associated with the namespace to read only based on the identify namespace nsattr field. Signed-off-by: NChaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Signed-off-by: NChristoph Hellwig <hch@lst.de>
-
- 30 7月, 2018 2 次提交
-
-
由 Max Gurtovoy 提交于
Also moved the logic of the remapping to the nvme core driver instead of implementing it in the nvme pci driver. This way all the other nvme transport drivers will benefit from it (in case they'll implement metadata support). Suggested-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NMartin K. Petersen <martin.petersen@oracle.com> Acked-by: NKeith Busch <keith.busch@intel.com> Signed-off-by: NMax Gurtovoy <maxg@mellanox.com> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Max Gurtovoy 提交于
Currently this function is implemented in the scsi layer, but it's actual place should be the block layer since T10-PI is a general data integrity feature that is used in the nvme protocol as well. Suggested-by: NChristoph Hellwig <hch@lst.de> Cc: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: NMax Gurtovoy <maxg@mellanox.com> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
- 28 7月, 2018 3 次提交
-
-
由 Christoph Hellwig 提交于
Add support for Asynchronous Namespace Access as specified in NVMe 1.3 TP 4004. With ANA each namespace attached to a controller belongs to an ANA group that describes the characteristics of accessing the namespaces through this controller. In the optimized and non-optimized states namespaces can be accessed regularly, although in a multi-pathing environment we should always prefer to access a namespace through a controller where an optimized relationship exists. Namespaces in Inaccessible, Permanent-Loss or Change state for a given controller should not be accessed. The states are updated through reading the ANA log page, which is read once during controller initialization, whenever the ANA change notice AEN is received, or when one of the ANA specific status codes that signal a state change is received on a command. The ANA state is kept in the nvme_ns structure, which makes the checks in the fast path very simple. Updating the ANA state when reading the log page is also very simple, the only downside is that finding the initial ANA state when scanning for namespaces is a bit cumbersome. The gendisk for a ns_head is only registered once a live path for it exists. Without that the kernel would hang during partition scanning. Includes fixes and improvements from Hannes Reinecke. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NKeith Busch <keith.busch@intel.com> Reviewed-by: NMartin K. Petersen <martin.petersen@oracle.com> Reviewed-by: NJohannes Thumshirn <jthumshirn@suse.de>
-
由 Christoph Hellwig 提交于
Now that we just call out to blk_path_error there isn't really any good reason to not merge it into the only caller. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NKeith Busch <keith.busch@intel.com> Reviewed-by: NSagi Grimberg <sagi@grimberg.me> Reviewed-by: NMartin K. Petersen <martin.petersen@oracle.com> Reviewed-by: NHannes Reinecke <hare@suse.com> Reviewed-by: NJohannes Thumshirn <jthumshirn@suse.de>
-
由 Christoph Hellwig 提交于
Merge nvme_get_log and nvme_get_log_ext into a single helper, which takes a plain nsid instead of the nvme_ns pointer. Also add support for the log specific field while we're at it. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NKeith Busch <keith.busch@intel.com> Reviewed-by: NSagi Grimberg <sagi@grimberg.me> Reviewed-by: NMartin K. Petersen <martin.petersen@oracle.com> Reviewed-by: NHannes Reinecke <hare@suse.com> Reviewed-by: NJohannes Thumshirn <jthumshirn@suse.de>
-
- 23 7月, 2018 2 次提交
-
-
由 Keith Busch 提交于
We can not match a command to its completion based on the command id alone. We need the submitting queue identifier to pair with the completion, so this patch adds that to the trace buffer. This patch is also collapsing the admin and IO submission traces into a single one so we don't need to duplicate this and creating unnecessary code branches: we know if the command is an admin vs IO based on the qid. And since we're here, the patch fixes code formatting in the area. Signed-off-by: NKeith Busch <keith.busch@intel.com> Reviewed-by: NSagi Grimberg <sagi@grimberg.me> Reviewed-by: NJohannes Thumshirn <jthumshirn@suse.de> [hch: move the qid helper to nvme.h and made it an inline function] Signed-off-by: NChristoph Hellwig <hch@lst.de>
-
由 James Smart 提交于
Currently, the code initializes the keep alive work item whenever nvme_start_keep_alive() is called. However, this routine is called several times while reconnecting, etc. Although it's hoped that keep alive is always disabled and not scheduled when start is called, re-initing if it were scheduled or completing can have very bad side effects. There's no need for re-initialization. Move the keep_alive work item and cmd struct initialization to controller init. Signed-off-by: NJames Smart <james.smart@broadcom.com> Signed-off-by: NChristoph Hellwig <hch@lst.de>
-
- 20 7月, 2018 1 次提交
-
-
由 Roland Dreier 提交于
The old code in nvme_user_cmd() passed the userspace virtual address from nvme_passthru_cmd.metadata as the length of the metadata buffer as well as the address to nvme_submit_user_cmd(). Fixes: 63263d60 ("nvme: Use metadata for passthrough commands") Signed-off-by: NRoland Dreier <roland@purestorage.com> Reviewed-by: NKeith Busch <keith.busch@intel.com> Signed-off-by: NChristoph Hellwig <hch@lst.de>
-
- 17 7月, 2018 2 次提交
-
-
由 Weiping Zhang 提交于
Avoid excuting set_feature command if there is no supported bit in Optional Asynchronous Events Supported (OAES). Fixes: c0561f82 ("nvme: submit AEN event configuration on startup") Reviewed-by: NSagi Grimberg <sagi@grimberg.me> Signed-off-by: NWeiping Zhang <zhangweiping@didichuxing.com> Signed-off-by: NChristoph Hellwig <hch@lst.de>
-
由 Scott Bauer 提交于
If the controller supports effects and goes down during the passthru admin command we will deadlock during namespace revalidation. [ 363.488275] INFO: task kworker/u16:5:231 blocked for more than 120 seconds. [ 363.488290] Not tainted 4.17.0+ #2 [ 363.488296] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 363.488303] kworker/u16:5 D 0 231 2 0x80000000 [ 363.488331] Workqueue: nvme-reset-wq nvme_reset_work [nvme] [ 363.488338] Call Trace: [ 363.488385] schedule+0x75/0x190 [ 363.488396] rwsem_down_read_failed+0x1c3/0x2f0 [ 363.488481] call_rwsem_down_read_failed+0x14/0x30 [ 363.488504] down_read+0x1d/0x80 [ 363.488523] nvme_stop_queues+0x1e/0xa0 [nvme_core] [ 363.488536] nvme_dev_disable+0xae4/0x1620 [nvme] [ 363.488614] nvme_reset_work+0xd1e/0x49d9 [nvme] [ 363.488911] process_one_work+0x81a/0x1400 [ 363.488934] worker_thread+0x87/0xe80 [ 363.488955] kthread+0x2db/0x390 [ 363.488977] ret_from_fork+0x35/0x40 Fixes: 84fef62d ("nvme: check admin passthru command effects") Signed-off-by: NScott Bauer <scott.bauer@intel.com> Reviewed-by: NKeith Busch <keith.busch@linux.intel.com> Signed-off-by: NChristoph Hellwig <hch@lst.de>
-
- 22 6月, 2018 1 次提交
-
-
由 Jens Axboe 提交于
nvme requires an sg table allocation for each request. If the request is large, then the allocation can become quite large. For instance, with our default software settings of 1280KB IO size, we'll need 10248 bytes of sg table. That turns into a 2nd order allocation, which we can't always guarantee. If we fail the allocation, blk-mq will retry it later. But there's no guarantee that we'll EVER be able to allocate that much contigious memory. Limit the IO size such that we never need more than a single page of memory. That's a lot faster and more reliable. Then back that allocation with a mempool, so that we know we'll always be able to succeed the allocation at some point. Signed-off-by: NJens Axboe <axboe@kernel.dk> Acked-by: NKeith Busch <keith.busch@intel.com> Signed-off-by: NChristoph Hellwig <hch@lst.de>
-
- 14 6月, 2018 1 次提交
-
-
由 Christoph Hellwig 提交于
Unused now that all transports stopped using it. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NJens Axboe <axboe@kernel.dk>
-
- 13 6月, 2018 1 次提交
-
-
由 Christoph Hellwig 提交于
Don't optimize our namespace rescan based on the changed namespace list log page as userspace might have changed the content through reading it. Suggested-by: NKeith Busch <keith.busch@linux.intel.com> Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NKeith Busch <keith.busch@linux.intel.com> Reviewed-by: NHannes Reinecke <hare@suse.com>
-
- 11 6月, 2018 1 次提交
-
-
由 Israel Rukshin 提交于
When using nvme-pci driver the nvmf_ctrl_options is NULL. There is no need to check for discovery_nqn flag at non-fabrics controller. Fixes: 181303d0 ("nvme-fabrics: allow duplicate connections to the discovery controller") Signed-off-by: NIsrael Rukshin <israelr@mellanox.com> Reviewed-by: NMax Gurtovoy <maxg@mellanox.com> Signed-off-by: NChristoph Hellwig <hch@lst.de>
-
- 09 6月, 2018 1 次提交
-
-
由 Dan Carpenter 提交于
The problem here is that set_bit() and test_bit() take a bit number so we should be passing 0 but instead we're passing (1 << 0) which leads to a double shift. It doesn't cause a runtime bug in the current code because it's done consistently and we only set that one bit. I decided to just re-use NVME_AER_NOTICE_NS_CHANGED instead of introducing a new define for this. Signed-off-by: NDan Carpenter <dan.carpenter@oracle.com> Reviewed-by: NSagi Grimberg <sagi@grimberg.me> Signed-off-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
- 01 6月, 2018 4 次提交
-
-
由 Christoph Hellwig 提交于
Per section 5.2 we need to issue the corresponding log page to clear an AEN, so for a namespace data changed AEN we need to read the changed namespace list log. And once we read that log anyway we might as well use it to optimize the rescan. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NSagi Grimberg <sagi@grimberg.me>
-
由 Christoph Hellwig 提交于
And move it toward the top of the file to avoid a forward declaration. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NSagi Grimberg <sagi@grimberg.me> Reviewed-by: NJohannes Thumshirn <jthumshirn@suse.de>
-
由 Hannes Reinecke 提交于
We should register for AEN events; some law-abiding targets might not be sending us AENs otherwise. Signed-off-by: NHannes Reinecke <hare@suse.com> [hch: slight cleanups] Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NJohannes Thumshirn <jthumshirn@suse.de> Reviewed-by: NSagi Grimberg <sagi@grimberg.me>
-
由 Christoph Hellwig 提交于
Stop including the event type in the definitions for the notice type. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NSagi Grimberg <sagi@grimberg.me> Reviewed-by: NJohannes Thumshirn <jthumshirn@suse.de>
-
- 31 5月, 2018 1 次提交
-
-
由 Christoph Hellwig 提交于
We already check for started commands in all callbacks, but we should also protect against already completed commands. Do this by taking the checks to common code. Acked-by: NJosef Bacik <josef@toxicpanda.com> Signed-off-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
- 30 5月, 2018 1 次提交
-
-
由 Max Gurtovoy 提交于
This value depands on the metadata support value, so reorder the initialization to fit. Fixes: b5be3b39 ("nvme: always unregister the integrity profile in __nvme_revalidate_disk") Signed-off-by: NMax Gurtovoy <maxg@mellanox.com> Signed-off-by: NChristoph Hellwig <hch@lst.de> Cc: stable@vger.kernel.org
-
- 25 5月, 2018 2 次提交
-
-
由 Hannes Reinecke 提交于
If nvme_get_effects_log() failed the 'id' buffer from the previous nvme_identify_ctrl() call will never be freed. Signed-off-by: NHannes Reinecke <hare@suse.com> Reviewed-by: NJohannes Thumshirn <jthumshirn@suse.de> Signed-off-by: NChristoph Hellwig <hch@lst.de>
-
由 Hannes Reinecke 提交于
The whole point of the discovery controller is that it can accept multiple connections. Additionally the cmic field is not even defined for the discovery controller identify page. Signed-off-by: NHannes Reinecke <hare@suse.com> Reviewed-by: NJames Smart <james.smart@broadcom.com> Signed-off-by: NChristoph Hellwig <hch@lst.de>
-
- 23 5月, 2018 1 次提交
-
-
由 Ivan Bornyakov 提交于
Ternary operator have lower precedence then bitwise or, so 'cdw10' was calculated wrong. Signed-off-by: NIvan Bornyakov <brnkv.i1@gmail.com> Reviewed-by: NMax Gurtovoy <maxg@mellanox.com> Signed-off-by: NKeith Busch <keith.busch@intel.com>
-
- 19 5月, 2018 1 次提交
-
-
由 Christoph Hellwig 提交于
We'll need that in the PCIe driver soon as we'll read it straight off the CQ. Signed-off-by: NChristoph Hellwig <hch@lst.de>
-
- 16 5月, 2018 1 次提交
-
-
由 Nitzan Carmi 提交于
The nvme_delete_ctrl() function queues a work item on a MEM_RECLAIM queue (nvme_delete_wq), which eventually calls cleanup_srcu_struct(), which in turn flushes a delayed work from an !MEM_RECLAIM queue. This is unsafe as we might trigger deadlocks under severe memory pressure. Since we don't ever invoke call_srcu(), it is safe to use the shiny new _quiesced() version of srcu cleanup, thus avoiding that flush dependency. This commit makes that change. Signed-off-by: NNitzan Carmi <nitzanc@mellanox.com> Signed-off-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by: NMax Gurtovoy <maxg@mellanox.com> Tested-by: NNicholas Piggin <npiggin@gmail.com>
-