Commits · 41d07df7de841bfbc32725ce21d933ad358f2844 · openeuler / Kernel

23 Jun, 2022 2 commits

nvme: move the Samsung X5 quirk entry to the core quirks · e6487833

Authored by Christoph Hellwig on Jun 17, 2022

This device shares the PCI ID with the Samsung 970 Evo Plus that
does not need or want the quirks.  Move the quirk entry to the
core table based on the model number instead.

Fixes: bc360b0b ("nvme-pci: add quirks for Samsung X5 SSDs")
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Pankaj Raghav <p.raghav@samsung.com>

e6487833
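
The effect of keying a quirk on the model number rather than the PCI ID can be sketched as below. This is a small Python model, not the kernel's C quirk tables; the vendor/device IDs, model strings and flag names are placeholders invented for the example.

```python
from typing import NamedTuple

# Hypothetical quirk flags, for illustration only.
QUIRK_A = 1 << 0
QUIRK_B = 1 << 1


class Controller(NamedTuple):
    vendor_id: int
    device_id: int
    model: str


# Table keyed by PCI ID: every drive sharing the ID picks up the quirks.
PCI_QUIRKS = {(0x1234, 0xABCD): QUIRK_A | QUIRK_B}

# Table keyed by vendor ID and model number: only the matching model is quirked.
CORE_QUIRKS = {(0x1234, "Portable Drive X"): QUIRK_A | QUIRK_B}


def quirks_by_pci_id(ctrl: Controller) -> int:
    """Look up quirks using only the PCI vendor/device pair."""
    return PCI_QUIRKS.get((ctrl.vendor_id, ctrl.device_id), 0)


def quirks_by_model(ctrl: Controller) -> int:
    """Look up quirks using the vendor ID and the reported model number."""
    return CORE_QUIRKS.get((ctrl.vendor_id, ctrl.model.strip()), 0)
```

With two controllers that share a PCI ID but report different model strings, the first lookup quirks both of them, while the second quirks only the intended model.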

nvme: add a bogus subsystem NQN quirk for Micron MTFDKBA2T0TFH · 41f38043

Authored by Leo Savernik on Jun 22, 2022

The Micron MTFDKBA2T0TFH device reports the same subsystem NQN for
all devices.  Add a quirk to ignore it.

Signed-off-by: Leo Savernik <l.savernik@aon.at>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Christoph Hellwig <hch@lst.de>

41f38043

14 Jun, 2022 8 commits

nvme-pci: disable write zeros support on UMIC and Samsung SSDs · 43047e08

Authored by rasheed.hsueh on Jun 10, 2022

Like commit 5611ec2b ("nvme-pci: prevent SK hynix PC400 from using
Write Zeroes command"), UMIS and Samsung have the same issue:
[ 6305.633887] blk_update_request: operation not supported error,
dev nvme0n1, sector 340812032 op 0x9:(WRITE_ZEROES) flags 0x0
phys_seg 0 prio class 0

So also disable the Write Zeroes command on UMIS and Samsung.

Signed-off-by: rasheed.hsueh <rasheed.hsueh@lcfc.corp-partner.google.com>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>

43047e08

nvme-pci: avoid the deepest sleep state on ZHITAI TiPro7000 SSDs · 6b961bce

Authored by Ning Wang on Jun 05, 2022

When ZHITAI TiPro7000 SSDs enter the deepest power state (ps4),
they have the same APST sleep problem as the Kingston A2000:
occasionally the system crashes and displays the same dmesg info:

https://bugzilla.kernel.org/show_bug.cgi?id=195039#c65

As the Arch Linux wiki suggests, (enlat + exlat) < 25000 is fine,
and my testing has shown no system crashes since.
Therefore disabling the deepest power state will fix the APST sleep issue.

https://wiki.archlinux.org/title/Solid_state_drive/NVMe

This is the APST data from 'nvme id-ctrl /dev/nvme1':

NVME Identify Controller:
vid       : 0x1e49
ssvid     : 0x1e49
sn        : [...]
mn        : ZHITAI TiPro7000 1TB
fr        : ZTA32F3Y
[...]
ps    0 : mp:3.50W operational enlat:5 exlat:5 rrt:0 rrl:0
          rwt:0 rwl:0 idle_power:- active_power:-
ps    1 : mp:3.30W operational enlat:50 exlat:100 rrt:1 rrl:1
          rwt:1 rwl:1 idle_power:- active_power:-
ps    2 : mp:2.80W operational enlat:50 exlat:200 rrt:2 rrl:2
          rwt:2 rwl:2 idle_power:- active_power:-
ps    3 : mp:0.1500W non-operational enlat:500 exlat:5000 rrt:3 rrl:3
          rwt:3 rwl:3 idle_power:- active_power:-
ps    4 : mp:0.0200W non-operational enlat:2000 exlat:60000 rrt:4 rrl:4
          rwt:4 rwl:4 idle_power:- active_power:-

Signed-off-by: Ning Wang <ningwang35@outlook.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>

6b961bce
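
The (enlat + exlat) < 25000 rule of thumb quoted in the commit above can be checked against the listed power states with a few lines of Python. This is only a sketch of the arithmetic: the latency values are copied from the 'nvme id-ctrl' listing, the threshold is the one quoted from the Arch Linux wiki, and the function names are made up for the example.

```python
# (enlat, exlat) per power state, copied from the id-ctrl listing (microseconds).
POWER_STATES = {
    0: (5, 5),
    1: (50, 100),
    2: (50, 200),
    3: (500, 5000),
    4: (2000, 60000),
}

# Threshold quoted in the commit message.
LATENCY_LIMIT = 25000


def total_latency(ps: int) -> int:
    """Entry latency plus exit latency for one power state."""
    enlat, exlat = POWER_STATES[ps]
    return enlat + exlat


def states_over_limit(limit: int = LATENCY_LIMIT) -> list:
    """Power states whose combined latency is not below the limit."""
    return [ps for ps in sorted(POWER_STATES) if total_latency(ps) >= limit]
```

Only ps 4 (2000 + 60000 = 62000) exceeds the threshold, while ps 3 (500 + 5000 = 5500) stays well below it, which matches the decision to avoid only the deepest state.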

nvme-pci: sk hynix p31 has bogus namespace ids · c4f01a77

Authored by Keith Busch on Jun 13, 2022

Add the quirk.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=216049
Signed-off-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Christoph Hellwig <hch@lst.de>

c4f01a77

nvme-pci: smi has bogus namespace ids · c98a8793

Authored by Keith Busch on Jun 13, 2022

Add the quirk.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=216096
Signed-off-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Christoph Hellwig <hch@lst.de>

c98a8793

nvme-pci: phison e12 has bogus namespace ids · 2cf7a77e

Authored by Keith Busch on Jun 13, 2022

Add the quirk.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=216049
Signed-off-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Christoph Hellwig <hch@lst.de>

2cf7a77e

nvme-pci: add NVME_QUIRK_BOGUS_NID for ADATA XPG GAMMIX S50 · 3765fad5

Authored by Stefan Reiter on Jun 06, 2022

ADATA XPG GAMMIX S50 drives report bogus eui64 values that appear to
be the same across drives in one system. Quirk them out so they are
not marked as "non globally unique" duplicates.

Signed-off-by: Stefan Reiter <stefan@pimaker.at>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>

3765fad5
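
The kind of duplicate that the global uniqueness check trips over can be illustrated with a short sketch. This is a Python model of the idea only, not the kernel's check; the function name, device names and identifier values used with it are hypothetical.

```python
from collections import defaultdict


def find_duplicate_ids(namespaces):
    """Group device names by reported identifier and keep the shared ones.

    `namespaces` is an iterable of (device_name, identifier) pairs.
    Returns {identifier: [device names]} for identifiers seen more than once.
    """
    seen = defaultdict(list)
    for name, ident in namespaces:
        seen[ident].append(name)
    return {ident: names for ident, names in seen.items() if len(names) > 1}
```

Two drives reporting the same eui64 end up in the same bucket, which is the situation the quirk suppresses by ignoring the identifier for the affected model.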

nvme-pci: add trouble shooting steps for timeouts · 4641a8e6

Authored by Keith Busch on Jun 06, 2022

Many users have encountered IO timeouts with a CSTS value of 0xffffffff,
which indicates a failure to read the register. While there are various
potential causes for this observation, faulty NVMe APST has been the
culprit quite frequently. Add the recommended troubleshooting steps in
the error output when this condition occurs.

Signed-off-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>

4641a8e6

nvme: add bug report info for global duplicate id · 2f0dad17

Authored by Keith Busch on Jun 07, 2022

The recent global id check is finding poorly implemented devices in the
wild. Include relevant device information in the output to help quicken
an appropriate quirk patch.

Signed-off-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Christoph Hellwig <hch@lst.de>

2f0dad17

31 May, 2022 1 commit

nvme-pci: disable namespace identifiers for the MAXIO MAP1001 · 70ce3455

Authored by Christoph Hellwig on May 27, 2022

The MAXIO MAP1001 controllers report completely bogus Namespace
identifiers that even change after suspend cycles.  Disable using
the Identifiers entirely.

Reported-by: Arman Hajishafieha <arman.hajishafieha@hotmail.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Tested-by: Arman Hajishafieha <arman.hajishafieha@hotmail.com>

70ce3455

28 May, 2022 1 commit

blk-mq: remove the done argument to blk_execute_rq_nowait · e2e53086

Authored by Christoph Hellwig on May 24, 2022

Let the caller set it together with the end_io_data instead of passing
a pointless argument. Note that the target code did in fact already
set it and then just overrode it again by calling blk_execute_rq_nowait.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Kanchan Joshi <joshi.k@samsung.com>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Link: https://lore.kernel.org/r/20220524121530.943123-4-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>

e2e53086

16 May, 2022 3 commits

nvme-pci: harden drive presence detect in nvme_dev_disable() · b98235d3

Authored by Stefan Roese on May 06, 2022

On our ZynqMP system we observe that an NVMe drive that resets itself
while doing a firmware update causes a kernel crash like this:

[ 67.720772] pcieport 0000:02:02.0: pciehp: Slot(2): Link Down
[ 67.720783] pcieport 0000:02:02.0: pciehp: Slot(2): Card not present
[ 67.720795] nvme 0000:04:00.0: PME# disabled
[ 67.720849] Internal error: synchronous external abort: 96000010 [#1] PREEMPT SMP
[ 67.720853] nwl-pcie fd0e0000.pcie: Slave error

Analysis: When nvme_dev_disable() is called because of this PCIe hotplug
event, pci_is_enabled() is still true. Accessing the NVMe drive,
which is currently unavailable because it is rebooting, causes this
"synchronous external abort" on this ARM64 platform.

This patch adds the pci_device_is_present() check as well, which returns
false in this "Card not present" hot-plug case. With this change, the
NVMe driver does not try to access the NVMe registers any more and the
FW update finishes without any problems.

Signed-off-by: Stefan Roese <sr@denx.de>
Signed-off-by: Christoph Hellwig <hch@lst.de>

b98235d3

nvme-pci: fix a NULL pointer dereference in nvme_alloc_admin_tags · da427611

Authored by Smith, Kyle Miller (Nimble Kernel) on Apr 22, 2022

In nvme_alloc_admin_tags, the admin_q can be set to an error (typically
-ENOMEM) if the blk_mq_init_queue call fails to set up the queue, which
is checked immediately after the call. However, when we return the error
up the stack to nvme_reset_work, the error path takes us to
nvme_remove_dead_ctrl()
  nvme_dev_disable()
   nvme_suspend_queue(&dev->queues[0]).

Here, we only check that the admin_q is non-NULL, rather than not
an error or NULL, and begin quiescing a queue that never existed, leading
to a bad / NULL pointer dereference.

Signed-off-by: Kyle Smith <kyles@hpe.com>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Christoph Hellwig <hch@lst.de>

da427611

nvme: mark internal passthru request RQF_QUIET · 128126a7

Authored by Chaitanya Kulkarni on Apr 19, 2022

Most of the internal passthru commands use the __nvme_submit_sync_cmd()
interface. There are a few places where we open code the request submission:

1. nvme_keep_alive_work(struct work_struct *work)
2. nvme_timeout(struct request *req, bool reserved)
3. nvme_delete_queue(struct nvme_queue *nvmeq, u8 opcode)

Mark the internal passthru request quiet so that we can skip the verbose
error message from nvme_log_error() in the nvme_end_req() completion path;
this will be consistent with what we have in __nvme_submit_sync_cmd().

Signed-off-by: Chaitanya Kulkarni <kch@nvidia.com>
Reviewed-by: Alan Adamson <alan.adamson@oracle.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>

128126a7

15 Apr, 2022 2 commits

nvme-pci: disable namespace identifiers for Qemu controllers · 66dd346b

Authored by Christoph Hellwig on Apr 12, 2022

Qemu unconditionally reports a UUID, which depending on the qemu version
is either all-null (which is incorrect but harmless) or contains a single
bit set for all controllers.  In addition it can also optionally report
an eui64 which needs to be manually set.  Disable namespace identifiers
for Qemu controllers entirely even if in some cases they could be set
correctly through manual intervention.

Reported-by: Luis Chamberlain <mcgrof@kernel.org>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>

66dd346b

nvme-pci: disable namespace identifiers for the MAXIO MAP1002/1202 · a98a945b

Authored by Christoph Hellwig on Apr 11, 2022

The MAXIO MAP1002/1202 controllers report completely bogus Namespace
identifiers that even change after suspend cycles.  Disable using
the Identifiers entirely.

Reported-by: 金韬 <me@kingtous.cn>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Tested-by: 金韬 <me@kingtous.cn>

a98a945b

23 Mar, 2022 2 commits

nvme-pci: add quirks for Samsung X5 SSDs · bc360b0b

Authored by Monish Kumar R on Mar 16, 2022

Add quirks to not fail the initialization and to have quick resume
latency after cold/warm reboot.

Signed-off-by: Monish Kumar R <monish.kumar.r@intel.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>

bc360b0b

nvme-pci: expose use_threaded_interrupts read-only in sysfs · 2e21e445

Authored by Xin Hao on Mar 22, 2022

Allow reading /sys/module/nvme/parameters/use_threaded_interrupts to see
if the use_threaded_interrupts module parameter is in use.

Signed-off-by: Xin Hao <xhao@linux.alibaba.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>

2e21e445
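
Reading the exposed parameter from user space could look like the following sketch. The sysfs path is the one named in the commit message; the helper name and the choice to return None when the file is missing or unreadable are assumptions made for this example.

```python
from pathlib import Path

# Path taken from the commit message above.
PARAM_PATH = Path("/sys/module/nvme/parameters/use_threaded_interrupts")


def read_module_param(path=PARAM_PATH):
    """Return the parameter's text value, or None if it cannot be read.

    The file may be absent, for example when the nvme module is not loaded
    or the running kernel does not expose the parameter.
    """
    try:
        return Path(path).read_text().strip()
    except OSError:
        return None
```

Calling `read_module_param()` with no arguments reads the real sysfs file; passing another path makes the helper easy to exercise against an ordinary file.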

16 Mar, 2022 1 commit

nvme: remove nvme_alloc_request and nvme_alloc_request_qid · e559398f

Authored by Christoph Hellwig on Mar 15, 2022

Just open code the allocation + initialization in the callers.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>

e559398f

04 Mar, 2022 1 commit

mm: don't include <linux/memremap.h> in <linux/mm.h> · dc90f084

Authored by Christoph Hellwig on Feb 16, 2022

Move the check for the actual pgmap types that need the free at refcount
one behavior into the out of line helper, and thus avoid the need to
pull memremap.h into mm.h.

Link: https://lkml.kernel.org/r/20220210072828.2930359-7-hch@lst.de
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Logan Gunthorpe <logang@deltatee.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Acked-by: Felix Kuehling <Felix.Kuehling@amd.com>
Tested-by: "Sierra Guiza, Alejandro (Alex)" <alex.sierra@amd.com>

Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Alistair Popple <apopple@nvidia.com>
Cc: Ben Skeggs <bskeggs@redhat.com>
Cc: Chaitanya Kulkarni <kch@nvidia.com>
Cc: Karol Herbst <kherbst@redhat.com>
Cc: Lyude Paul <lyude@redhat.com>
Cc: Miaohe Lin <linmiaohe@huawei.com>
Cc: Muchun Song <songmuchun@bytedance.com>
Cc: "Pan, Xinhui" <Xinhui.Pan@amd.com>
Cc: Ralph Campbell <rcampbell@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>

dc90f084

27 Jan, 2022 1 commit

nvme-pci: add the IGNORE_DEV_SUBNQN quirk for Intel P4500/P4600 SSDs · 25e58af4

Authored by Wu Zheng on Jun 21, 2021

The Intel P4500/P4600 SSDs do not report a subsystem NQN despite claiming
compliance to a standards version where reporting one is required.

Add the IGNORE_DEV_SUBNQN quirk to not fail the initialization of a
second such SSD in a system.

Signed-off-by: Zheng Wu <wu.zheng@intel.com>
Signed-off-by: Ye Jinhe <jinhe.ye@intel.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Christoph Hellwig <hch@lst.de>

25e58af4

06 Jan, 2022 1 commit

nvme-pci: fix queue_rqs list splitting · 6bfec799

Authored by Keith Busch on Jan 05, 2022

If command prep fails, current handling will orphan subsequent requests
in the list. Consider a simple example:

  rqlist = [ 1 -> 2 ]

When prep for request '1' fails, it will be appended to the
'requeue_list', leaving request '2' disconnected from the original
rqlist and no longer tracked. Meanwhile, rqlist is still pointing to the
failed request '1' and will attempt to submit the unprepped command.

Fix this by updating the rqlist accordingly using the request list
helper functions.

Fixes: d62cbcf6 ("nvme: add support for mq_ops->queue_rqs()")
Signed-off-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20220105170518.3181469-5-kbusch@kernel.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>

6bfec799
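
The intended behaviour of the fix described above, where a failed prep moves only that request to the requeue list and every other request stays tracked, can be modelled in a few lines. This sketch uses plain Python lists instead of the kernel's linked request list helpers, and the function and parameter names are invented for illustration.

```python
def split_on_prep_failure(rqlist, prep):
    """Partition requests by whether command prep succeeds.

    `prep` is a callable returning True on success. Requests that fail go to
    the requeue list; all remaining requests stay on the list to be submitted,
    so none of them is orphaned.
    """
    submit_list, requeue_list = [], []
    for rq in rqlist:
        if prep(rq):
            submit_list.append(rq)
        else:
            requeue_list.append(rq)
    return submit_list, requeue_list
```

For the example in the commit message, rqlist = [ 1 -> 2 ] with prep failing for request '1', request '1' lands on the requeue list and request '2' remains on the list to be submitted.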

17 Dec, 2021 3 commits

nvme: add support for mq_ops->queue_rqs() · d62cbcf6

Authored by Jens Axboe on Nov 18, 2021

This enables the block layer to send us a full plug list of requests
that need submitting. The block layer guarantees that they all belong
to the same queue, but we do have to check the hardware queue mapping
for each request.

If errors are encountered, leave them in the passed in list. Then the
block layer will handle them individually.

This is good for about a 4% improvement in peak performance, taking us
from 9.6M to 10M IOPS/core.

Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

d62cbcf6

nvme: separate command prep and issue · 62451a2b

Authored by Jens Axboe on Oct 29, 2021

Add an nvme_prep_rq() helper to set up a command, and nvme_queue_rq() is
adapted to use this helper.

Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

62451a2b

nvme: split command copy into a helper · 3233b94c

Authored by Jens Axboe on Oct 29, 2021

We'll need it for batched submit as well. Since we now have a copy
helper, get rid of the nvme_submit_cmd() wrapper.

Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Max Gurtovoy <mgurtovoy@nvidia.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

3233b94c

29 Nov, 2021 1 commit

block: remove the gendisk argument to blk_execute_rq · b84ba30b

Authored by Christoph Hellwig on Nov 26, 2021

Remove the gendisk argument to blk_execute_rq and blk_execute_rq_nowait
given that it is unused now. Also convert the boolean at_head parameter
to actually use the bool type while touching the prototype.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Link: https://lore.kernel.org/r/20211126121802.2090656-5-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>

b84ba30b

21 Oct, 2021 1 commit

nvme-pci: clear shadow doorbell memory on resets · 58847f12

Authored by Keith Busch on Oct 14, 2021

The host memory doorbell and event buffers need to be initialized on
each reset so the driver doesn't observe stale values from the previous
instantiation.

Signed-off-by: Keith Busch <kbusch@kernel.org>
Tested-by: John Levon <john.levon@nutanix.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>

58847f12

20 Oct, 2021 1 commit

nvme: apply nvme API to quiesce/unquiesce admin queue · 6ca1d902

Authored by Ming Lei on Oct 14, 2021

Apply the two added APIs to quiesce/unquiesce the admin queue.

Signed-off-by: Ming Lei <ming.lei@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20211014081710.1871747-3-ming.lei@redhat.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>

6ca1d902

19 Oct, 2021 3 commits

nvme: wire up completion batching for the IRQ path · 4f502245

Authored by Jens Axboe on Oct 18, 2021

Trivial to do now, just need our own io_comp_batch on the stack and pass
that in to the usual command completion handling.

I pondered making this dependent on how many entries we had to process,
but even for a single entry there's no discernable difference in
performance or latency. Running a sync workload over io_uring:

t/io_uring -b512 -d1 -s1 -c1 -p0 -F1 -B1 -n2 /dev/nvme1n1 /dev/nvme2n1

yields the below performance before the patch:

IOPS=254820, BW=124MiB/s, IOS/call=1/1, inflight=(1 1)
IOPS=251174, BW=122MiB/s, IOS/call=1/1, inflight=(1 1)
IOPS=250806, BW=122MiB/s, IOS/call=1/1, inflight=(1 1)

and the following after:

IOPS=255972, BW=124MiB/s, IOS/call=1/1, inflight=(1 1)
IOPS=251920, BW=123MiB/s, IOS/call=1/1, inflight=(1 1)
IOPS=251794, BW=122MiB/s, IOS/call=1/1, inflight=(1 1)

which definitely isn't slower, about the same if you factor in a bit of
variance. For peak performance workloads, benchmarking shows a 2%
improvement.

Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

4f502245
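
The "about the same" reading of the two sets of samples above can be checked by averaging them. The sketch below uses only the six IOPS figures quoted in the commit message; the helper names are made up, and three samples per side are far too few to say more than that the difference is small.

```python
# IOPS samples copied from the commit message.
BEFORE = [254820, 251174, 250806]
AFTER = [255972, 251920, 251794]


def mean(values):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


def percent_change(old, new):
    """Relative change of the mean of `new` versus the mean of `old`, in percent."""
    return (mean(new) - mean(old)) / mean(old) * 100
```

`percent_change(BEFORE, AFTER)` comes out well under 1%, consistent with the statement that the sync workload is not slower.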

nvme: add support for batched completion of polled IO · c234a653

Authored by Jens Axboe on Oct 08, 2021

Take advantage of struct io_comp_batch, if passed in to the nvme poll
handler. If it's set, rather than complete each request individually
inline, store them in the io_comp_batch list. We only do so for requests
that will complete successfully; anything else will be completed inline as
before.

Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

c234a653
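
The batching rule described in the commit above can be sketched as follows. This is a Python model of the control flow only, not the driver code: the function name, the (request, succeeded) tuple shape and the `complete_inline` callback are assumptions made for the example.

```python
def poll_completions(completions, batch=None, complete_inline=print):
    """Dispatch polled completions.

    `completions` yields (request, succeeded) pairs. When a batch list is
    supplied, successful requests are stored in it for later batched
    completion. Failed requests, and all requests when no batch is given,
    are completed inline right away.
    """
    for rq, succeeded in completions:
        if batch is not None and succeeded:
            batch.append(rq)
        else:
            complete_inline(rq)
```

The caller then completes everything collected in `batch` in one go, which is where the saving over per-request completion is expected to come from.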

block: add a struct io_comp_batch argument to fops->iopoll() · 5a72e899

Authored by Jens Axboe on Oct 12, 2021

struct io_comp_batch contains a list head and a completion handler, which
will allow batches of IO to be completed more efficiently.

For now, no functional changes in this patch, we just define the
io_comp_batch structure and add the argument to the file_operations iopoll
handler.

Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

5a72e899

18 Oct, 2021 1 commit

block: move integrity handling out of <linux/blkdev.h> · fe45e630

Authored by Christoph Hellwig on Sep 20, 2021

Split the integrity/metadata handling definitions out into a new header.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Link: https://lore.kernel.org/r/20210920123328.1399408-17-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>

fe45e630

07 Oct, 2021 1 commit

nvme-pci: Fix abort command id · 85f74acf

Authored by Keith Busch on Oct 06, 2021

The request tag is no longer the only component of the command id.

Fixes: e7006de6 ("nvme: code command_id with a genctr for use-after-free validation")
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Keith Busch <kbusch@kernel.org>

85f74acf

28 Sep, 2021 1 commit

nvme: add command id quirk for apple controllers · a2941f6a

Authored by Keith Busch on Sep 27, 2021

Some Apple controllers use the command id as an index to implementation
specific data structures and will fail if the value is out of bounds.
The nvme driver's recently introduced command sequence number breaks
these controllers.

Provide a quirk so these spec-noncompliant controllers can function as
before. The driver will not have the ability to detect bad completions
when this quirk is used, but we weren't previously checking this anyway.

The quirk bit was selected so that it can readily apply to stable.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=214509
Cc: Sven Peter <sven@svenpeter.dev>
Reported-by: Orlando Chamberlain <redecorating@protonmail.com>
Reported-by: Aditya Garg <gargaditya08@live.com>
Signed-off-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Tested-by: Sven Peter <sven@svenpeter.dev>
Link: https://lore.kernel.org/r/20210927154306.387437-1-kbusch@kernel.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>

a2941f6a

16 Aug, 2021 5 commits

nvme: allow user toggling hmb usage · a5df5e79

Authored by Keith Busch on Jul 27, 2021

The NVMe host memory buffer may consume a non-negligible amount of
memory. Controllers are required to function without the host memory
buffer enabled, but with possibly degraded performance. Export a sysfs
property to toggle this feature on a per-device granularity so users may
choose to reclaim memory at the expense of storage performance.

Signed-off-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Christoph Hellwig <hch@lst.de>

a5df5e79

nvme-pci: disable hmb on idle suspend · e5ad96f3

Authored by Keith Busch on Jul 27, 2021

An idle suspend may or may not disable host memory access from devices
placed in low power mode. Either way, it should always be safe to
disable the host memory buffer prior to entering the low power mode, and
this should also always be faster than a full device shutdown.

Signed-off-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Christoph Hellwig <hch@lst.de>

e5ad96f3

nvme-pci: cmb sysfs: one file, one value · 1751e97a

Authored by Keith Busch on Jul 16, 2021

An attribute should only be exporting one value as recommended in
Documentation/filesystems/sysfs.rst. Implement CMB attributes this way.
The old attribute will remain for backward compatibility.

Signed-off-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Christoph Hellwig <hch@lst.de>

1751e97a

nvme-pci: use attribute group for cmb sysfs · 0521905e

Authored by Keith Busch on Jul 14, 2021

Appending sysfs files to the controller kobject is a bit clunky and
becomes a maintenance problem as more attributes are added. The
attribute group infrastructure handles this better, so use that.

Signed-off-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Christoph Hellwig <hch@lst.de>

0521905e

nvme: code command_id with a genctr for use-after-free validation · e7006de6

Authored by Sagi Grimberg on Jun 16, 2021

We cannot detect a (perhaps buggy) controller that is sending us
a completion for a request that was already completed (for example
sending a completion twice). This phenomenon was seen in the wild
a few times.

So to protect against this, we use the upper 4 msbits of the nvme sqe
command_id as a 4-bit generation counter and verify it matches
the existing request generation that is incrementing on every execution.

The 16-bit command_id structure now is constructed by:
| xxxx | xxxxxxxxxxxx |
  gen    request tag

This means that we are giving up some possible queue depth as 12 bits
allow for a maximum queue depth of 4095 instead of 65536. However, we
never create such long queues anyways so no real harm done.

Suggested-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Acked-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Daniel Wagner <dwagner@suse.de>
Tested-by: Daniel Wagner <dwagner@suse.de>
Signed-off-by: Christoph Hellwig <hch@lst.de>

e7006de6
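
The 16-bit layout described in the commit above (4-bit generation counter in the upper bits, 12-bit request tag in the lower bits) can be expressed as a short encode/decode sketch. This is a Python model for illustration, not the kernel's C helpers; the function and constant names are invented here.

```python
GEN_BITS = 4
TAG_BITS = 12
GEN_MASK = (1 << GEN_BITS) - 1  # 0xf
TAG_MASK = (1 << TAG_BITS) - 1  # 0xfff


def encode_command_id(genctr: int, tag: int) -> int:
    """Pack the generation counter (wrapped to 4 bits) and the tag into 16 bits."""
    if not 0 <= tag <= TAG_MASK:
        raise ValueError("tag does not fit in 12 bits")
    return ((genctr & GEN_MASK) << TAG_BITS) | tag


def decode_command_id(command_id: int):
    """Split a 16-bit command_id back into (generation, tag)."""
    return (command_id >> TAG_BITS) & GEN_MASK, command_id & TAG_MASK


def completion_is_valid(command_id: int, current_genctr: int) -> bool:
    """A completion is accepted only if its generation matches the request's."""
    gen, _tag = decode_command_id(command_id)
    return gen == (current_genctr & GEN_MASK)
```

A completion that arrives for the same tag after the request has been reused carries a stale generation value and fails the check, which is how a double completion is detected.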

openeuler / Kernel synced successfully nearly 2 years ago