Commits · 0fa0f99fc84e41057cbdd2efbfe91c6b2f47dd9d · openeuler / Kernel

02 Feb, 2022 1 commit

nvme: fix a possible use-after-free in controller reset during load · 0fa0f99f

Authored by Sagi Grimberg on Feb 01, 2022

Unlike .queue_rq, in .submit_async_event drivers may not check the ctrl
readiness for AER submission. This may lead to a use-after-free
condition that was observed with nvme-tcp.

The race condition may happen in the following scenario:
1. driver executes its reset_ctrl_work
2. -> nvme_stop_ctrl - flushes ctrl async_event_work
3. ctrl sends AEN which is received by the host, which in turn
   schedules AEN handling
4. teardown admin queue (which releases the queue socket)
5. AEN processed, submits another AER, calling the driver to submit
6. driver attempts to send the cmd
==> use-after-free

In order to fix that, add ctrl state check to validate the ctrl
is actually able to accept the AER submission.

This addresses the above race in controller resets because the driver
during teardown should:
1. change ctrl state to RESETTING
2. flush async_event_work (as well as other async work elements)

So after steps 1 and 2, any other AER command will find the
ctrl state to be RESETTING and bail out without submitting the AER.

Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
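
The state check described in this commit message can be illustrated with a small userspace sketch. It is not the driver code: the enum, the struct and the function names below are hypothetical stand-ins, and the real controller has more states. The only point is that the AER work bails out unless the controller is live, so a work item that runs after the state has changed to RESETTING never reaches the transport.

```c
#include <stdbool.h>

/* Hypothetical, simplified controller states (assumption, not the driver's enum). */
enum ctrl_state { CTRL_LIVE, CTRL_RESETTING, CTRL_DELETING };

struct ctrl {
    enum ctrl_state state;
    int aer_submitted;      /* counts AERs handed to the transport */
};

/* Stand-in for the transport's .submit_async_event callback. */
void submit_async_event(struct ctrl *c)
{
    c->aer_submitted++;
}

/*
 * Stand-in for the async event work item: check the controller state
 * first and skip the submission if the controller is not live.
 * Returns true if an AER was actually submitted.
 */
bool async_event_work(struct ctrl *c)
{
    if (c->state != CTRL_LIVE)
        return false;       /* e.g. RESETTING: admin queue may be torn down */
    submit_async_event(c);
    return true;
}
```

With this shape, the ordering in the message (change state to RESETTING, then flush the work) is what closes the race: any work item scheduled afterwards sees a non-live state and returns early.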

0fa0f99f

23 Dec, 2021 3 commits

nvme: add 'iopolicy' module parameter · e3d34794

Authored by Hannes Reinecke on Dec 20, 2021

While the 'iopolicy' sysfs attribute can be set at runtime, most
storage arrays prefer to use the 'round-robin' iopolicy by default.
We can use udev rules to set this, but that is getting rather unwieldy
for rebranded arrays as we would have to update the udev rules
anytime a new array shows up, leading to the same mess we currently
have in multipathd for configuring the RDAC arrays.

Hence this patch adds a module parameter 'iopolicy' to allow the
admin to switch the default, and to do away with the need for a
udev rule here.

Signed-off-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Daniel Wagner <dwagner@suse.de>
Signed-off-by: Christoph Hellwig <hch@lst.de>

e3d34794

nvme: drop unused variable ctrl in nvme_setup_cmd · 3a605e32

Authored by Geliang Tang on Dec 22, 2021

The variable 'ctrl' became useless since the code using it was dropped
from nvme_setup_cmd() in the commit 292ddf67bbd5 ("nvme: increment
request genctr on completion"). Fix it to get rid of this compilation
warning in the nvme-5.17 branch:

 drivers/nvme/host/core.c: In function ‘nvme_setup_cmd’:
 drivers/nvme/host/core.c:993:20: warning: unused variable ‘ctrl’ [-Wunused-variable]
   struct nvme_ctrl *ctrl = nvme_req(req)->ctrl;
                     ^~~~

Fixes: 292ddf67bbd5 ("nvme: increment request genctr on completion")
Signed-off-by: Geliang Tang <geliang.tang@suse.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>

3a605e32

nvme: increment request genctr on completion · e4fdb2b1

Authored by Keith Busch on Dec 13, 2021

The nvme request generation counter is intended to catch duplicate
completions. Incrementing the counter on submission means duplicates can
only be caught if the request tag is reallocated and dispatched prior to
the driver observing the corrupted CQE. Incrementing on completion
removes this window, making it possible to detect duplicate completions
in consecutive entries.

Signed-off-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Christoph Hellwig <hch@lst.de>

e4fdb2b1

08 Dec, 2021 1 commit

nvme: fix use after free when disconnecting a reconnecting ctrl · 8b77fa6f

Authored by Ruozhu Li on Nov 04, 2021

A crash happens when trying to disconnect a reconnecting ctrl:

 1) The network was cut off just after the connection was established,
    and scan work hung there waiting for some I/Os to complete.  Those I/Os
    were retried because we return BLK_STS_RESOURCE to the block layer while
    reconnecting.
 2) After a while, I tried to disconnect this connection.  This
    procedure also hangs because it tried to obtain ctrl->scan_lock.
    It should be noted that now we have switched the controller state
    to NVME_CTRL_DELETING.
 3) In nvme_check_ready(), we always return true when ctrl->state is
    NVME_CTRL_DELETING, so those retrying I/Os were issued to the bottom
    device which was already freed.

To fix this, when ctrl->state is NVME_CTRL_DELETING, issue cmd to bottom
device only when queue state is live.  If not, return host path error to
the block layer.

Signed-off-by: Ruozhu Li <liruozhu@huawei.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Christoph Hellwig <hch@lst.de>
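
The decision described in the fix can be sketched as a small pure function. This is a deliberately simplified model with hypothetical names; the real readiness check looks at more states and at the kind of command being issued. It only shows the rule stated above: a deleting controller gets the command only if the queue is live, and otherwise the request fails with a host path error instead of being retried.

```c
#include <stdbool.h>

/* Hypothetical, reduced set of states and outcomes (assumptions). */
enum ctrl_state { CTRL_LIVE, CTRL_CONNECTING, CTRL_DELETING };
enum submit_result { SUBMIT_OK, SUBMIT_RETRY, SUBMIT_HOST_PATH_ERROR };

/*
 * Decide what to do with a request, given the controller state and
 * whether the queue it targets is live.
 */
enum submit_result classify_request(enum ctrl_state state, bool queue_live)
{
    if (queue_live)
        return SUBMIT_OK;               /* queue usable: issue to the device */
    if (state == CTRL_DELETING)
        return SUBMIT_HOST_PATH_ERROR;  /* never retry into a freed device */
    return SUBMIT_RETRY;                /* e.g. reconnecting: requeue later */
}
```

The important row is the deleting controller with a dead queue: before the fix that combination was treated as ready, which is how the retried I/O reached freed memory.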

8b77fa6f

06 Dec, 2021 2 commits

nvme: disable namespace access for unsupported metadata · d39ad2a4

Authored by Keith Busch on Nov 30, 2021

The only fabrics target that supports metadata handling through the
separate integrity buffer is RDMA. It is currently usable only if the
size is 8B per block and formatted for protection information. If an
rdma target were to export a namespace with a different format (ex:
4k+64B), the driver will not be able to submit valid read/write commands
for that namespace.

Suppress setting the metadata feature in the namespace so that the
gendisk capacity will be set to 0. This will prevent read/write access
through the block stack, but will continue to allow ioctl passthrough
commands.

Cc: Max Gurtovoy <mgurtovoy@nvidia.com>
Cc: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Christoph Hellwig <hch@lst.de>

d39ad2a4

nvme: show subsys nqn for duplicate cntlids · 16cc33b2

Authored by Keith Busch on Nov 29, 2021

The driver assigned nvme handle isn't persistent across reboots, so it is
not enough information to match up where the collisions are occurring.
Add the subsys nqn string to the output so that it can more easily be
identified later.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=215099
Signed-off-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Christoph Hellwig <hch@lst.de>

16cc33b2

29 Nov, 2021 1 commit

block: remove the gendisk argument to blk_execute_rq · b84ba30b

Authored by Christoph Hellwig on Nov 26, 2021

Remove the gendisk argument to blk_execute_rq and blk_execute_rq_nowait
given that it is unused now. Also convert the boolean at_head parameter
to actually use the bool type while touching the prototype.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Link: https://lore.kernel.org/r/20211126121802.2090656-5-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>

b84ba30b

24 Nov, 2021 2 commits

nvme: fix write zeroes pi · 00b33cf3

Authored by Klaus Jensen on Nov 10, 2021

Write Zeroes sets PRACT when block integrity is enabled (as it should),
but neglects to also set the reftag which is expected by reads. This
causes protection errors on reads.

Fix this by setting the reftag for type 1 and 2 (for type 3, reads will
not check the reftag).

Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>

00b33cf3

nvme-pci: add NO APST quirk for Kioxia device · 5a6254d5

Authored by Enzo Matsumiya on Nov 05, 2021

This particular Kioxia device times out and aborts I/O during any load,
but it's more easily observable with discards (fstrim).

The device gets into a state in which it is also not possible to use
"nvme set-feature" to disable APST.
Booting with nvme_core.default_ps_max_latency_us=0 solves the issue.

We had a dozen or so of these devices behaving this same way in
customer environments.

Signed-off-by: Enzo Matsumiya <ematsumiya@suse.de>
Signed-off-by: Christoph Hellwig <hch@lst.de>

5a6254d5

09 Nov, 2021 1 commit

nvme: wait until quiesce is done · 26af1cd0

Authored by Ming Lei on Nov 09, 2021

NVMe uses one atomic flag to check if quiesce is needed. If quiesce has
already been started, the helper returns immediately. This is wrong, since
we have to wait until quiesce is done.

Fixes: e70feb8b ("blk-mq: support concurrent queue quiesce/unquiesce")
Reviewed-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Link: https://lore.kernel.org/r/20211109071144.181581-5-ming.lei@redhat.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>

26af1cd0

21 Oct, 2021 3 commits

nvme: display correct subsystem NQN · e5ea42fa

Authored by Hannes Reinecke on Sep 22, 2021

With discovery controllers supporting unique subsystem NQNs the
actual subsystem NQN might be different from that one passed in
via the connect args. So add a helper to display the resulting
subsystem NQN.

Signed-off-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>

e5ea42fa

nvme: Add connect option 'discovery' · 20e8b689

Authored by Hannes Reinecke on Sep 22, 2021

Add a connect option 'discovery' to specify that the connection
should be made to a discovery controller, not a normal I/O controller.
With discovery controllers supporting unique subsystem NQNs we
cannot easily distinguish by the subsystem NQN if this should be
a discovery connection, but we need this information to blank out
options not supported by discovery controllers.

Signed-off-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>

20e8b689

nvme: expose subsystem type in sysfs attribute 'subsystype' · 954ae166

Authored by Hannes Reinecke on Sep 22, 2021

With unique discovery controller NQNs we cannot distinguish the
subsystem type by the NQN alone, but need to check the subsystem
type, too.
So expose the subsystem type in a new sysfs attribute 'subsystype'.

Signed-off-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>

954ae166

20 Oct, 2021 6 commits

nvme: paring quiesce/unquiesce · 9e6a6b12

Authored by Ming Lei on Oct 14, 2021

The current blk_mq_quiesce_queue() and blk_mq_unquiesce_queue() always
stop and start the queue unconditionally. And there can be concurrent
quiesce/unquiesce coming from different unrelated code paths, so
unquiesce may come unexpectedly and start queue too early.

Prepare for supporting concurrent quiesce/unquiesce from multiple
contexts, so that we can address the above issue.

NVMe has a very complicated quiesce/unquiesce use pattern, so add one atomic
bit to make sure that blk-mq quiesce/unquiesce is always called in
pairs.

Signed-off-by: Ming Lei <ming.lei@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20211014081710.1871747-5-ming.lei@redhat.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
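
A minimal userspace sketch of the "one atomic bit" pairing idea, using C11 atomics. All names here are hypothetical, and the two `blk_*` functions are counters standing in for the real blk-mq calls. The first caller to set the bit performs the quiesce, and only a caller that actually clears the bit performs the unquiesce, so the two underlying calls stay balanced however often the wrappers are invoked.

```c
#include <stdatomic.h>
#include <stdbool.h>

struct queue {
    atomic_bool stopped;    /* the single pairing bit */
    int quiesce_depth;      /* stand-in for the block layer's quiesce state */
};

/* Stand-ins for the block layer calls (assumptions, counters only). */
void blk_quiesce(struct queue *q)   { q->quiesce_depth++; }
void blk_unquiesce(struct queue *q) { q->quiesce_depth--; }

/* Quiesce only if the bit was previously clear. */
void stop_queue(struct queue *q)
{
    if (!atomic_exchange(&q->stopped, true))
        blk_quiesce(q);
}

/* Unquiesce only if the bit was previously set. */
void start_queue(struct queue *q)
{
    if (atomic_exchange(&q->stopped, false))
        blk_unquiesce(q);
}
```

Note that in this sketch a second `stop_queue()` caller returns immediately without waiting for the first caller's quiesce to finish, which is exactly the gap that the later "nvme: wait until quiesce is done" commit in this log addresses.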

9e6a6b12

nvme: prepare for pairing quiescing and unquiescing · ebc9b952

Authored by Ming Lei on Oct 14, 2021

Add two helpers so that we can prepare for pairing quiescing and
unquiescing, which will be done in the next patch.

Signed-off-by: Ming Lei <ming.lei@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20211014081710.1871747-4-ming.lei@redhat.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>

ebc9b952

nvme: apply nvme API to quiesce/unquiesce admin queue · 6ca1d902

Authored by Ming Lei on Oct 14, 2021

Apply the two newly added APIs to quiesce/unquiesce the admin queue.

Signed-off-by: Ming Lei <ming.lei@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20211014081710.1871747-3-ming.lei@redhat.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>

6ca1d902

nvme: add APIs for stopping/starting admin queue · a277654b

Authored by Ming Lei on Oct 14, 2021

Add two APIs for stopping and starting the admin queue.

Signed-off-by: Ming Lei <ming.lei@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20211014081710.1871747-2-ming.lei@redhat.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>

a277654b

nvme: don't memset() the normal read/write command · a9a7e30f

Authored by Jens Axboe on Oct 18, 2021

This memset in the fast path costs a lot of cycles on my setup. Here's a
top-of-profile of doing ~6.7M IOPS:

+    5.90%  io_uring  [nvme]            [k] nvme_queue_rq
+    5.32%  io_uring  [nvme_core]       [k] nvme_setup_cmd
+    5.17%  io_uring  [kernel.vmlinux]  [k] io_submit_sqes
+    4.97%  io_uring  [kernel.vmlinux]  [k] blkdev_direct_IO

and a perf diff with this patch:

     0.92%     +4.40%  [nvme_core]       [k] nvme_setup_cmd

reducing it from 5.3% to only 0.9%. This takes it from the second-biggest
cycle consumer to something that's mostly irrelevant.

Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
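
The general technique of replacing a blanket memset with explicit field writes can be sketched as follows. The struct is a hypothetical 32-byte layout chosen to have no padding; it is not the NVMe command structure, and the function names are made up for illustration. Both setup functions leave the command in the same state, but the second one writes each field once, including those that must be zero.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical padding-free command layout (assumption, for illustration). */
struct rw_cmd {
    uint8_t  opcode;
    uint8_t  flags;
    uint16_t command_id;
    uint32_t nsid;
    uint64_t rsvd2;
    uint64_t slba;
    uint16_t length;
    uint16_t control;
    uint32_t dsmgmt;
};

/* Clear the whole command first, then fill in the fields that are used. */
void setup_rw_memset(struct rw_cmd *c, uint8_t op, uint32_t nsid,
                     uint64_t slba, uint16_t len)
{
    memset(c, 0, sizeof(*c));
    c->opcode = op;
    c->nsid = nsid;
    c->slba = slba;
    c->length = len;
}

/* Write every field exactly once; unused fields are zeroed explicitly. */
void setup_rw_fields(struct rw_cmd *c, uint8_t op, uint32_t nsid,
                     uint64_t slba, uint16_t len)
{
    c->opcode = op;
    c->flags = 0;
    c->command_id = 0;
    c->nsid = nsid;
    c->rsvd2 = 0;
    c->slba = slba;
    c->length = len;
    c->control = 0;
    c->dsmgmt = 0;
}
```

The trade-off is maintenance: the field-by-field version must be updated whenever the structure gains a member, otherwise stale data from a previous use of the buffer could leak into the command.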

a9a7e30f

nvme: move command clear into the various setup helpers · 9c3d2929

Authored by Jens Axboe on Oct 18, 2021

We don't have to worry about doing extra memsets by moving it outside
the protection of RQF_DONTPREP, as nvme doesn't do partial completions.

This is in preparation for making the read/write fast path not do a full
memset of the command.

Reviewed-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

9c3d2929

19 Oct, 2021 1 commit

nvme: add support for batched completion of polled IO · c234a653

Authored by Jens Axboe on Oct 08, 2021

Take advantage of struct io_comp_batch, if passed in to the nvme poll
handler. If it's set, rather than complete each request individually
inline, store them in the io_comp_batch list. We only do so for requests
that will complete successfully, anything else will be completed inline as
before.

Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
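
The batching rule in this message can be modelled with a short sketch. The types and functions below are hypothetical and far simpler than struct io_comp_batch; they only show the control flow: if a batch was passed in and the request succeeded, it is queued on the batch, and in every other case it is completed inline as before.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical request and batch types (assumptions, not kernel structs). */
struct request {
    int status;             /* 0 means success */
    bool completed;
    struct request *next;
};

struct comp_batch {
    struct request *head;
};

/* Stand-in for completing one request individually. */
void complete_one(struct request *rq)
{
    rq->completed = true;
}

/* Called from the poll handler for each finished request. */
void poll_complete(struct request *rq, struct comp_batch *batch)
{
    if (batch && rq->status == 0) {
        rq->next = batch->head;     /* defer: push onto the batch list */
        batch->head = rq;
        return;
    }
    complete_one(rq);               /* no batch, or an error: inline */
}

/* Complete everything collected in the batch; returns how many. */
int flush_batch(struct comp_batch *batch)
{
    int n = 0;

    while (batch->head) {
        struct request *rq = batch->head;

        batch->head = rq->next;
        complete_one(rq);
        n++;
    }
    return n;
}
```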

c234a653

18 Oct, 2021 2 commits

block: rename REQ_HIPRI to REQ_POLLED · 6ce913fe

Authored by Christoph Hellwig on Oct 12, 2021

Unlike the RWF_HIPRI userspace ABI which is intentionally kept vague,
the bio flag is specific to the polling implementation, so rename and
document it properly.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Tested-by: Mark Wunderlich <mark.wunderlich@intel.com>
Link: https://lore.kernel.org/r/20211012111226.760968-12-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>

6ce913fe

block: move integrity handling out of <linux/blkdev.h> · fe45e630

Authored by Christoph Hellwig on Sep 20, 2021

Split the integrity/metadata handling definitions out into a new header.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Link: https://lore.kernel.org/r/20210920123328.1399408-17-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>

fe45e630

14 Oct, 2021 1 commit

nvme: fix per-namespace chardev deletion · be5eb933

Authored by Adam Manzanares on Oct 13, 2021

Decrease the reference count of the char device during char device deletion
in order to fix a memory leak.  Add a release callback for the device
associated with the chardev and move ida_simple_remove into the release
function.

Fixes: 2637baed ("nvme: introduce generic per-namespace chardev")
Reported-by: Yi Zhang <yi.zhang@redhat.com>
Suggested-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Adam Manzanares <a.manzanares@samsung.com>
Reviewed-by: Javier González <javier@javigon.com>
Tested-by: Yi Zhang <yi.zhang@redhat.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>

be5eb933

28 Sep, 2021 1 commit

nvme: add command id quirk for apple controllers · a2941f6a

Authored by Keith Busch on Sep 27, 2021

Some apple controllers use the command id as an index to implementation
specific data structures and will fail if the value is out of bounds.
The nvme driver's recently introduced command sequence number breaks
this controller.

Provide a quirk so these spec incompliant controllers can function as
before. The driver will not have the ability to detect bad completions
when this quirk is used, but we weren't previously checking this anyway.

The quirk bit was selected so that it can readily apply to stable.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=214509
Cc: Sven Peter <sven@svenpeter.dev>
Reported-by: Orlando Chamberlain <redecorating@protonmail.com>
Reported-by: Aditya Garg <gargaditya08@live.com>
Signed-off-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Tested-by: Sven Peter <sven@svenpeter.dev>
Link: https://lore.kernel.org/r/20210927154306.387437-1-kbusch@kernel.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>

a2941f6a

21 Sep, 2021 1 commit

nvme: keep ctrl->namespaces ordered · 298ba0e3

Authored by Christoph Hellwig on Sep 14, 2021

Various places in the nvme code rely on ctrl->namespaces being
ordered.  Ensure that the namespace is inserted into the list at the
right position from the start instead of sorting it after the fact.

Fixes: 540c801c ("NVMe: Implement namespace list scanning")
Reported-by: NAnton Eidelman <anton.eidelman@gmail.com>
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Reviewed-by: NKeith Busch <kbusch@kernel.org>
Reviewed-by: NSagi Grimberg <sagi@grimberg.me>
Reviewed-by: NChaitanya Kulkarni <kch@nvidia.com>
Reviewed-by: NDamien Le Moal <damien.lemoal@wdc.com>
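
Inserting at the right position, rather than appending and sorting later, can be sketched with a plain singly linked list ordered by namespace ID. The struct and function are hypothetical and use a much simpler list than the kernel's; the point is only that the list is ordered after every insertion, so readers never observe an unsorted intermediate state.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical namespace node (assumption, not the driver's struct). */
struct ns {
    uint32_t nsid;
    struct ns *next;
};

/*
 * Walk to the first entry whose nsid is not smaller than the new one
 * and link the new namespace in front of it.
 */
void ns_insert_sorted(struct ns **head, struct ns *ns)
{
    struct ns **pos = head;

    while (*pos && (*pos)->nsid < ns->nsid)
        pos = &(*pos)->next;
    ns->next = *pos;
    *pos = ns;
}
```

Inserting namespaces 3, 1 and 2 in that order yields the list 1, 2, 3 without any separate sort step.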

298ba0e3

15 Sep, 2021 1 commit

nvme: remove the call to nvme_update_disk_info in nvme_ns_remove · 9da4c727

Authored by Christoph Hellwig on Sep 14, 2021

There is no need to explicitly unregister the integrity profile when
deleting the gendisk.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Link: https://lore.kernel.org/r/20210914070657.87677-4-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>

9da4c727

13 Sep, 2021 1 commit

nvme: avoid race in shutdown namespace removal · 9edceaf4

Authored by Daniel Wagner on Sep 02, 2021

When we remove the siblings entry, we update ns->head->list, hence we
can't separate the removal and test for being empty. They have to be
in the same critical section to avoid a race.

To avoid reintroducing the refcounting imbalance, add a list empty
check to nvme_find_ns_head.

Fixes: 5396fdac ("nvme: fix refcounting imbalance when all paths are down")
Signed-off-by: Daniel Wagner <dwagner@suse.de>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Tested-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Christoph Hellwig <hch@lst.de>

9edceaf4

06 Sep, 2021 5 commits

nvme: add error handling support for add_disk() · ab3994f6

Authored by Luis Chamberlain on Aug 30, 2021

We never checked for errors on add_disk() as this function
returned void. Now that this is fixed, use the shiny new
error handling.

Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Christoph Hellwig <hch@lst.de>

ab3994f6

nvme: only call synchronize_srcu when clearing current path · 041bd1a1

Authored by Daniel Wagner on Sep 01, 2021

The function nvme_mpath_clear_current_path returns true if the current
path has changed. In this case we have to wait for all concurrent
submissions to finish. But if we didn't change the current path, there
is no point in waiting for another RCU period to finish.

Signed-off-by: Daniel Wagner <dwagner@suse.de>
Signed-off-by: Christoph Hellwig <hch@lst.de>

041bd1a1

nvme: update keep alive interval when kato is modified · b58da2d2

Authored by Tatsuya Sasaki on Sep 01, 2021

Currently the connection between host and NVMe-oF target gets
disconnected by keep-alive timeout when a user connects to a target
with a relatively large kato value and then sets the smaller kato
with a set features command (e.g. connects with 60 seconds kato value
and then sets 10 seconds kato value).

The cause is that keep alive command interval on the host, which is
defined as unsigned int kato in nvme_ctrl structure, does not follow
the kato value changes.

This patch updates the keep alive interval in the following steps when
the kato is modified by a set features command: stop the keep alive
work queue, set the kato as the new timer value, then restart the queue.

Signed-off-by: Tatsuya Sasaki <tatsuya6.sasaki@kioxia.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>

b58da2d2

nvme: move nvme_multi_css into nvme.h · 43dc9878

Authored by Adam Manzanares on Aug 26, 2021

Preparatory patch in order to reuse nvme_multi_css in the nvme target
code.

Signed-off-by: Adam Manzanares <a.manzanares@samsung.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Christoph Hellwig <hch@lst.de>

43dc9878

nvme-multipath: revalidate paths during rescan · e7d65803

Authored by Hannes Reinecke on Aug 24, 2021

When triggering a rescan due to a namespace resize we will be
receiving AENs on every controller, triggering a rescan of all
attached namespaces. If multipath is active only the current path and
the ns_head disk will be updated, the other paths will still refer to
the old size until AENs for the remaining controllers are received.

If I/O comes in before that it might be routed to one of the old
paths, triggering an I/O failure with 'access beyond end of device'.
With this patch the old paths are skipped from multipath path
selection until the controller serving these paths has been rescanned.

Signed-off-by: Hannes Reinecke <hare@suse.de>
[dwagner: - introduce NVME_NS_READY flag instead of NVME_NS_INVALIDATE
          - use 'revalidate' instead of 'invalidate' which
            follows the zoned device code path.
          - clear NVME_NS_READY before clearing current_path]
Signed-off-by: Daniel Wagner <dwagner@suse.de>
Signed-off-by: Christoph Hellwig <hch@lst.de>

e7d65803

24 Aug, 2021 1 commit

nvme: use blk_mq_alloc_disk · 5f432cce

Authored by Christoph Hellwig on Aug 16, 2021

Switch to use the blk_mq_alloc_disk helper for allocating the
request_queue and gendisk.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Link: https://lore.kernel.org/r/20210816131910.615153-2-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>

5f432cce

17 Aug, 2021 1 commit

nvme: use bvec_virt · 3973e15f

Authored by Christoph Hellwig on Aug 04, 2021

Use bvec_virt instead of open coding it.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Link: https://lore.kernel.org/r/20210804095634.460779-16-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>

3973e15f

16 Aug, 2021 1 commit

nvme: code command_id with a genctr for use-after-free validation · e7006de6

Authored by Sagi Grimberg on Jun 16, 2021

We cannot detect a (perhaps buggy) controller that is sending us
a completion for a request that was already completed (for example
sending a completion twice). This phenomenon was seen in the wild
a few times.

So to protect against this, we use the upper 4 bits of the nvme sqe
command_id as a 4-bit generation counter and verify that it matches
the existing request generation, which is incremented on every execution.

The 16-bit command_id structure now is constructed by:
| xxxx | xxxxxxxxxxxx |
  gen    request tag

This means that we are giving up some possible queue depth as 12 bits
allow for a maximum queue depth of 4095 instead of 65536, however we
never create such long queues anyways so no real harm done.

Suggested-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Acked-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Daniel Wagner <dwagner@suse.de>
Tested-by: Daniel Wagner <dwagner@suse.de>
Signed-off-by: Christoph Hellwig <hch@lst.de>
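
The 16-bit layout drawn in this message (4 generation bits above a 12-bit request tag) can be sketched with a few helpers. The function and macro names are hypothetical and chosen for illustration; only the bit layout comes from the commit message.

```c
#include <stdbool.h>
#include <stdint.h>

#define GENCTR_MASK 0xfu    /* 4-bit generation counter */
#define TAG_MASK    0xfffu  /* 12-bit request tag */

/* Pack generation counter and tag into one 16-bit command_id. */
uint16_t make_cid(uint8_t genctr, uint16_t tag)
{
    return (uint16_t)(((genctr & GENCTR_MASK) << 12) | (tag & TAG_MASK));
}

/* Extract the generation bits from a command_id. */
uint8_t cid_genctr(uint16_t cid)
{
    return (uint8_t)(cid >> 12);
}

/* Extract the request tag from a command_id. */
uint16_t cid_tag(uint16_t cid)
{
    return (uint16_t)(cid & TAG_MASK);
}

/*
 * A completion is accepted only if its generation bits match the
 * request's current generation; a stale or duplicate completion
 * carries an older generation and is rejected.
 */
bool completion_is_current(uint16_t cid, uint8_t req_genctr)
{
    return cid_genctr(cid) == (req_genctr & GENCTR_MASK);
}
```

For example, generation 3 and tag 0x123 pack into 0x3123. Once the request's generation has moved on to 4, a second completion that still carries 0x3123 no longer matches and can be flagged instead of completing the request again.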

e7006de6

15 Aug, 2021 1 commit

remove the lightnvm subsystem · 9ea9b9c4

Authored by Christoph Hellwig on Aug 12, 2021

Lightnvm supports the OCSSD 1.x and 2.0 specs which were early attempts
to produce Open Channel SSDs and never made it into the NVMe spec
proper. They have since been superseded by NVMe enhancements such
as ZNS support. Remove the support per the deprecation schedule.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20210812132308.38486-1-hch@lst.de
Reviewed-by: Matias Bjørling <mb@lightnvm.io>
Reviewed-by: Javier González <javier@javigon.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

9ea9b9c4

13 Aug, 2021 2 commits

block: remove GENHD_FL_UP · 50b4aecf

Authored by Christoph Hellwig on Aug 09, 2021

Just check inode_unhashed on the whole device bdev inode instead,
and provide a helper to check for that information.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20210809064028.1198327-9-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>

50b4aecf

nvme: remove the GENHD_FL_UP check in nvme_ns_remove · 5eba2005

Authored by Christoph Hellwig on Aug 09, 2021

Early probe failure never reaches nvme_ns_remove, so GENHD_FL_UP must
be set at this point. Remove the check.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20210809064028.1198327-4-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>

5eba2005

10 Aug, 2021 1 commit

block: pass a gendisk to blk_queue_update_readahead · 471aa704

Authored by Christoph Hellwig on Aug 09, 2021

.. and rename the function to disk_update_readahead.  This is in
preparation for moving the BDI from the request_queue to the gendisk.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Link: https://lore.kernel.org/r/20210809141744.1203023-3-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>

471aa704
