1. 29 July 2020, 1 commit
  2. 16 July 2020, 1 commit
  3. 02 July 2020, 1 commit
  4. 25 June 2020, 2 commits
    • nvme: fix possible deadlock when I/O is blocked · 3b4b1972
      Committed by Sagi Grimberg
      Revert fab7772b ("nvme-multipath: revalidate nvme_ns_head gendisk
      in nvme_validate_ns")
      
      When adding a new namespace to the head disk (via nvme_mpath_set_live)
      we will see a partition scan, which triggers I/O on the mpath device node.
      This process is usually triggered from scan_work, which holds the scan_lock.
      If that I/O blocks (for example after an ANA change that leaves paths
      present but none currently accessible), this can deadlock on the head
      disk's bd_mutex: the partition scan I/O takes it, and head disk revalidation
      also takes it to check for a resize (also triggered from scan_work on a
      different path). See trace [1].
      
      The mpath disk revalidation was originally added to detect online disk
      size change, but this is no longer needed since commit cb224c3a
      ("nvme: Convert to use set_capacity_revalidate_and_notify") which already
      updates resize info without unnecessarily revalidating the disk (the
      mpath disk doesn't even implement .revalidate_disk fop).
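
      As a rough illustration of the blocking pattern (a standalone userspace
      model with stand-in names, not kernel code), one task blocks on "I/O"
      that never completes while holding a lock, and a second task then waits
      forever for that same lock; this mirrors the two hung-task reports in
      trace [1]:

      #include <pthread.h>
      #include <semaphore.h>
      #include <stdio.h>

      static pthread_mutex_t bd_mutex = PTHREAD_MUTEX_INITIALIZER;
      static sem_t io_completion;          /* never posted: I/O with no usable path */

      static void *partition_scan(void *arg)
      {
          (void)arg;
          pthread_mutex_lock(&bd_mutex);   /* partition scan takes the lock */
          puts("scan: blocked in I/O while holding bd_mutex");
          sem_wait(&io_completion);        /* the I/O never completes: stuck */
          pthread_mutex_unlock(&bd_mutex);
          return NULL;
      }

      static void *head_revalidate(void *arg)
      {
          (void)arg;
          pthread_mutex_lock(&bd_mutex);   /* revalidation wants the same lock */
          puts("revalidate: unreachable while the scan is stuck");
          pthread_mutex_unlock(&bd_mutex);
          return NULL;
      }

      int main(void)
      {
          pthread_t a, b;

          sem_init(&io_completion, 0, 0);
          pthread_create(&a, NULL, partition_scan, NULL);
          pthread_create(&b, NULL, head_revalidate, NULL);
          pthread_join(a, NULL);           /* hangs, like the reports below */
          pthread_join(b, NULL);
          return 0;
      }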
      
      [1]:
      --
      kernel: INFO: task kworker/u65:9:494 blocked for more than 241 seconds.
      kernel:       Tainted: G           OE     5.3.5-050305-generic #201910071830
      kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
      kernel: kworker/u65:9   D    0   494      2 0x80004000
      kernel: Workqueue: nvme-wq nvme_scan_work [nvme_core]
      kernel: Call Trace:
      kernel:  __schedule+0x2b9/0x6c0
      kernel:  schedule+0x42/0xb0
      kernel:  schedule_preempt_disabled+0xe/0x10
      kernel:  __mutex_lock.isra.0+0x182/0x4f0
      kernel:  __mutex_lock_slowpath+0x13/0x20
      kernel:  mutex_lock+0x2e/0x40
      kernel:  revalidate_disk+0x63/0xa0
      kernel:  __nvme_revalidate_disk+0xfe/0x110 [nvme_core]
      kernel:  nvme_revalidate_disk+0xa4/0x160 [nvme_core]
      kernel:  ? evict+0x14c/0x1b0
      kernel:  revalidate_disk+0x2b/0xa0
      kernel:  nvme_validate_ns+0x49/0x940 [nvme_core]
      kernel:  ? blk_mq_free_request+0xd2/0x100
      kernel:  ? __nvme_submit_sync_cmd+0xbe/0x1e0 [nvme_core]
      kernel:  nvme_scan_work+0x24f/0x380 [nvme_core]
      kernel:  process_one_work+0x1db/0x380
      kernel:  worker_thread+0x249/0x400
      kernel:  kthread+0x104/0x140
      kernel:  ? process_one_work+0x380/0x380
      kernel:  ? kthread_park+0x80/0x80
      kernel:  ret_from_fork+0x1f/0x40
      ...
      kernel: INFO: task kworker/u65:1:2630 blocked for more than 241 seconds.
      kernel:       Tainted: G           OE     5.3.5-050305-generic #201910071830
      kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
      kernel: kworker/u65:1   D    0  2630      2 0x80004000
      kernel: Workqueue: nvme-wq nvme_scan_work [nvme_core]
      kernel: Call Trace:
      kernel:  __schedule+0x2b9/0x6c0
      kernel:  schedule+0x42/0xb0
      kernel:  io_schedule+0x16/0x40
      kernel:  do_read_cache_page+0x438/0x830
      kernel:  ? __switch_to_asm+0x34/0x70
      kernel:  ? file_fdatawait_range+0x30/0x30
      kernel:  read_cache_page+0x12/0x20
      kernel:  read_dev_sector+0x27/0xc0
      kernel:  read_lba+0xc1/0x220
      kernel:  ? kmem_cache_alloc_trace+0x19c/0x230
      kernel:  efi_partition+0x1e6/0x708
      kernel:  ? vsnprintf+0x39e/0x4e0
      kernel:  ? snprintf+0x49/0x60
      kernel:  check_partition+0x154/0x244
      kernel:  rescan_partitions+0xae/0x280
      kernel:  __blkdev_get+0x40f/0x560
      kernel:  blkdev_get+0x3d/0x140
      kernel:  __device_add_disk+0x388/0x480
      kernel:  device_add_disk+0x13/0x20
      kernel:  nvme_mpath_set_live+0x119/0x140 [nvme_core]
      kernel:  nvme_update_ns_ana_state+0x5c/0x60 [nvme_core]
      kernel:  nvme_set_ns_ana_state+0x1e/0x30 [nvme_core]
      kernel:  nvme_parse_ana_log+0xa1/0x180 [nvme_core]
      kernel:  ? nvme_update_ns_ana_state+0x60/0x60 [nvme_core]
      kernel:  nvme_mpath_add_disk+0x47/0x90 [nvme_core]
      kernel:  nvme_validate_ns+0x396/0x940 [nvme_core]
      kernel:  ? blk_mq_free_request+0xd2/0x100
      kernel:  nvme_scan_work+0x24f/0x380 [nvme_core]
      kernel:  process_one_work+0x1db/0x380
      kernel:  worker_thread+0x249/0x400
      kernel:  kthread+0x104/0x140
      kernel:  ? process_one_work+0x380/0x380
      kernel:  ? kthread_park+0x80/0x80
      kernel:  ret_from_fork+0x1f/0x40
      --
      
      Fixes: fab7772b ("nvme-multipath: revalidate nvme_ns_head gendisk
      in nvme_validate_ns")
      Signed-off-by: Anton Eidelman <anton@lightbitslabs.com>
      Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
    • nvme: set initial value for controller's numa node · 4fea243e
      Committed by Max Gurtovoy
      Initialize the node to the NUMA_NO_NODE value. Transports that are aware of
      NUMA node affinity can override it (e.g. the RDMA transport sets the affinity
      according to the RDMA HCA).
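
      As a minimal sketch of the idea (illustrative structure and function
      names only, not the actual kernel diff), the core picks a neutral
      default and a NUMA-aware transport may later overwrite it:

      #include <stdio.h>

      #define NUMA_NO_NODE (-1)            /* same convention as the kernel */

      struct controller {
          int numa_node;                   /* only the field of interest here */
      };

      /* Core init: start with no NUMA preference instead of a stale or
       * accidentally-zero node. */
      static void controller_init(struct controller *ctrl)
      {
          ctrl->numa_node = NUMA_NO_NODE;
      }

      /* A NUMA-aware transport (e.g. RDMA) can override the default with
       * the node closest to its device; dev_node stands in for that lookup. */
      static void transport_set_node(struct controller *ctrl, int dev_node)
      {
          if (dev_node != NUMA_NO_NODE)
              ctrl->numa_node = dev_node;
      }

      int main(void)
      {
          struct controller ctrl;

          controller_init(&ctrl);
          printf("default node: %d\n", ctrl.numa_node);
          transport_set_node(&ctrl, 1);    /* RDMA-style override */
          printf("after transport override: %d\n", ctrl.numa_node);
          return 0;
      }
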
      Signed-off-by: Max Gurtovoy <maxg@mellanox.com>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
  5. 11 June 2020, 1 commit
  6. 30 May 2020, 1 commit
  7. 27 May 2020, 8 commits
  8. 10 May 2020, 20 commits
  9. 27 April 2020, 1 commit
    • nvme: prevent double free in nvme_alloc_ns() error handling · 132be623
      Committed by Niklas Cassel
      When jumping to the out_put_disk label, we will call put_disk(), which will
      trigger a call to disk_release(), which calls blk_put_queue().
      
      Later in the cleanup code, we do blk_cleanup_queue(), which will also call
      blk_put_queue().
      
      Putting the queue twice is incorrect and will generate a KASAN splat.
      
      Set the disk->queue pointer to NULL before calling put_disk(), so that the
      first call to blk_put_queue() will not free the queue.
      
      The second call to blk_put_queue() uses another pointer to the same queue,
      so this call will still free the queue.
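
      The pattern can be modeled in plain userspace C (illustrative types and
      helpers only, not the kernel code): clearing the back-pointer before the
      first put keeps the single reference from being dropped twice:

      #include <stdio.h>
      #include <stdlib.h>

      struct queue {
          int refcount;
      };

      struct disk {
          struct queue *queue;           /* back-pointer, like disk->queue */
      };

      static void queue_put(struct queue *q)
      {
          if (q && --q->refcount == 0) {
              puts("queue freed");
              free(q);
          }
      }

      /* Models disk_release(): drops the queue reference reachable through
       * the disk, then frees the disk itself. */
      static void disk_put(struct disk *d)
      {
          queue_put(d->queue);
          free(d);
      }

      int main(void)
      {
          struct queue *q = calloc(1, sizeof(*q));
          struct disk *d = calloc(1, sizeof(*d));

          q->refcount = 1;
          d->queue = q;

          /* The fix: without this line, disk_put() and the later explicit
           * queue_put(q) would both drop the one reference (double free). */
          d->queue = NULL;

          disk_put(d);                   /* no longer touches the queue */
          queue_put(q);                  /* cleanup frees the queue exactly once */
          return 0;
      }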
      
      Fixes: 85136c01 ("lightnvm: simplify geometry enumeration")
      Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
  10. 02 April 2020, 1 commit
  11. 31 March 2020, 1 commit
    • nvme: fix compat address handling in several ioctls · c95b708d
      Committed by Nick Bowler
      On a 32-bit kernel, the upper bits of userspace addresses passed via
      various ioctls are silently ignored by the nvme driver.
      
      However on a 64-bit kernel running a compat task, these upper bits are
      not ignored and are in fact required to be zero for the ioctls to work.
      
      Unfortunately, this difference matters.  32-bit smartctl submits the
      NVME_IOCTL_ADMIN_CMD ioctl with garbage in these upper bits because the
      pointer value it puts into the nvme_passthru_cmd structure appears to be
      sign-extended.  This works fine on 32-bit kernels but fails on a 64-bit
      one because (at least on my setup) the addresses smartctl uses are
      consistently above 2G.  For example:
      
        # smartctl -x /dev/nvme0n1
        smartctl 7.1 2019-12-30 r5022 [x86_64-linux-5.5.11] (local build)
        Copyright (C) 2002-19, Bruce Allen, Christian Franke, www.smartmontools.org
      
        Read NVMe Identify Controller failed: NVME_IOCTL_ADMIN_CMD: Bad address
      
      Since changing 32-bit kernels to actually check all of the submitted
      address bits now would break existing userspace, this patch fixes the
      compat problem by explicitly zeroing the upper bits in the compat case.
      This enables 32-bit smartctl to work on a 64-bit kernel.
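
      A small standalone sketch of the effect (the helper name here is
      hypothetical, not the actual kernel change): a buffer above 2G gets
      sign-extended when a 32-bit caller stores its pointer into a 64-bit
      field, and masking off the upper 32 bits for compat tasks recovers the
      intended address:

      #include <stdint.h>
      #include <stdio.h>

      /* Hypothetical helper: for a compat (32-bit) task only the low 32
       * bits of the stored address are meaningful, so clear the rest. */
      static uint64_t fixup_compat_addr(uint64_t addr, int in_compat_task)
      {
          return in_compat_task ? (addr & 0xffffffffULL) : addr;
      }

      int main(void)
      {
          /* A 32-bit caller whose buffer lives above 2G; a sign-extending
           * store into the 64-bit command field leaves garbage up top. */
          uint32_t user_addr = 0x9abcd000u;
          uint64_t stored = (uint64_t)(int64_t)(int32_t)user_addr;

          printf("stored by 32-bit caller: 0x%016llx\n",
                 (unsigned long long)stored);
          printf("after compat fixup:      0x%016llx\n",
                 (unsigned long long)fixup_compat_addr(stored, 1));
          return 0;
      }
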
      Signed-off-by: Nick Bowler <nbowler@draconx.ca>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
  12. 26 March 2020, 2 commits