1. 21 June 2019 (17 commits)
  2. 07 June 2019 (2 commits)
    • nvme-rdma: use dynamic dma mapping per command · 62f99b62
      Committed by Max Gurtovoy
      Commit 87fd1253 ("nvme-rdma: remove redundant reference between
      ib_device and tagset") caused a kernel panic when disconnecting from an
      inaccessible controller (disconnect during re-connection).
      
      --
      nvme nvme0: Removing ctrl: NQN "testnqn1"
      nvme_rdma: nvme_rdma_exit_request: hctx 0 queue_idx 1
      BUG: unable to handle kernel paging request at 0000000080000228
      PGD 0 P4D 0
      Oops: 0000 [#1] SMP PTI
      ...
      Call Trace:
       blk_mq_exit_hctx+0x5c/0xf0
       blk_mq_exit_queue+0xd4/0x100
       blk_cleanup_queue+0x9a/0xc0
       nvme_rdma_destroy_io_queues+0x52/0x60 [nvme_rdma]
       nvme_rdma_shutdown_ctrl+0x3e/0x80 [nvme_rdma]
       nvme_do_delete_ctrl+0x53/0x80 [nvme_core]
       nvme_sysfs_delete+0x45/0x60 [nvme_core]
       kernfs_fop_write+0x105/0x180
       vfs_write+0xad/0x1a0
       ksys_write+0x5a/0xd0
       do_syscall_64+0x55/0x110
       entry_SYSCALL_64_after_hwframe+0x44/0xa9
      RIP: 0033:0x7fa215417154
      --
      
      The crash is caused by accessing an already-freed ib_device when
      performing dma_unmap in exit_request. The root cause is that
      during re-connection all the queues are destroyed and re-created
      (and the ib_device, which is reference counted by the queues, is
      freed along with them), but the tagset stays alive and the DMA
      mappings performed in init_request are kept in the request
      context. The original commit fixed a different bug, found during
      bonding (aka NIC teaming) tests, in which some scenarios change
      the underlying ib_device and cause memory leakage and a possible
      segmentation fault. This commit complements it: it removes the
      stale DMA mappings saved in the request context and makes the
      request sqe DMA mappings dynamic with the command lifetime (i.e.
      mapped in .queue_rq and unmapped in .complete). That also fixes
      the above crash of accessing a freed ib_device while the tagset
      is destroyed.
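
      A minimal sketch of the per-command mapping, assuming simplified
      request fields (sqe.data, sqe.dma) and queue plumbing; this is an
      illustration of the approach, not the exact driver code:

        /* .queue_rq: map the sqe against the queue's current ib_device */
        req->sqe.dma = ib_dma_map_single(queue->device->dev, req->sqe.data,
                                         sizeof(struct nvme_command),
                                         DMA_TO_DEVICE);
        if (unlikely(ib_dma_mapping_error(queue->device->dev, req->sqe.dma)))
                return BLK_STS_RESOURCE;

        /* ... post the send and run the command ... */

        /* .complete: unmap while the queue still holds its ib_device
         * reference, so the device cannot have been freed yet */
        ib_dma_unmap_single(queue->device->dev, req->sqe.dma,
                            sizeof(struct nvme_command), DMA_TO_DEVICE);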
      
      Fixes: 87fd1253 ("nvme-rdma: remove redundant reference between ib_device and tagset")
      Reported-by: Jim Harris <james.r.harris@intel.com>
      Suggested-by: Sagi Grimberg <sagi@grimberg.me>
      Tested-by: Jim Harris <james.r.harris@intel.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Max Gurtovoy <maxg@mellanox.com>
      Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
    • nvme: Fix u32 overflow in the number of namespace list calculation · c8e8c77b
      Committed by Jaesoo Lee
      The Number of Namespaces (nn) field in the identify controller
      data structure is defined as u32, and the maximum value allowed
      by the NVMe specification is 0xFFFFFFFEUL. This change fixes a
      possible overflow in the DIV_ROUND_UP() operation used in
      nvme_scan_ns_list() by casting nn to u64.
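
      A standalone demonstration of the wraparound, assuming the 1024
      namespace IDs returned per Identify Namespace List page; the
      DIV_ROUND_UP_U64 helper is a local stand-in for the widened
      kernel arithmetic:

        #include <stdio.h>
        #include <stdint.h>

        #define DIV_ROUND_UP(n, d)      (((n) + (d) - 1) / (d))
        #define DIV_ROUND_UP_U64(n, d)  ((((uint64_t)(n)) + (d) - 1) / (d))

        int main(void)
        {
                uint32_t nn = 0xFFFFFFFEUL; /* max Number of Namespaces */

                /* nn + 1023 wraps in 32 bits: prints 0, not 4194304 */
                printf("overflowed: %u\n", DIV_ROUND_UP(nn, 1024U));
                /* widening before the addition gives the right count */
                printf("correct:    %llu\n",
                       (unsigned long long)DIV_ROUND_UP_U64(nn, 1024));
                return 0;
        }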
      Signed-off-by: Jaesoo Lee <jalee@purestorage.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
  3. 06 June 2019 (1 commit)
  4. 31 May 2019 (2 commits)
    • nvme-tcp: fix queue mapping when queue count is limited · 64861993
      Committed by Sagi Grimberg
      When the controller supports fewer queues than requested, queue
      mapping must do the right thing instead of assuming that all
      requested queues are available. This fixes a crash in that
      situation.
      
      The rules are:
      1. if no write queues are requested, we assign the available queues
         to the default queue map. The default and read queue maps share the
         existing queues.
      2. if write queues are requested:
        - first make sure that the read queue map gets the requested
          nr_io_queues count
        - then grant the default queue map the minimum of the requested
          nr_write_queues and the remaining queues. If no queues remain
          to dedicate to the default queue map, fall back to (1) and
          share all the queues in the existing queue map.
      
      Also, log an indication of how the different queue maps were
      constructed; the split is sketched below.
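
      A sketch of the split under these rules, with simplified names
      (granted is the queue count the controller actually provided;
      the helper itself is illustrative, not the driver function):

        /* Distribute the granted queues per rules (1) and (2). */
        static void split_io_queues(unsigned int granted,
                                    unsigned int nr_io,
                                    unsigned int nr_write,
                                    unsigned int *def_q,
                                    unsigned int *read_q)
        {
                if (nr_write && nr_io < granted) {
                        /* rule 2: satisfy the read map first ... */
                        *read_q = nr_io;
                        granted -= nr_io;
                        /* ... then dedicate the remainder to defaults */
                        *def_q = nr_write < granted ? nr_write : granted;
                } else {
                        /* rule 1 / rule-2 fallback: share everything */
                        *def_q = nr_io < granted ? nr_io : granted;
                        *read_q = 0;
                }
        }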
      Reported-by: Harris, James R <james.r.harris@intel.com>
      Tested-by: Jim Harris <james.r.harris@intel.com>
      Cc: <stable@vger.kernel.org> # v5.0+
      Suggested-by: Roy Shterman <roys@lightbitslabs.com>
      Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
    • nvme-rdma: fix queue mapping when queue count is limited · 5651cd3c
      Committed by Sagi Grimberg
      When the controller supports fewer queues than requested, queue
      mapping must do the right thing instead of assuming that all
      requested queues are available. This fixes a crash in that
      situation.
      
      The rules are:
      1. if no write/poll queues are requested, we assign the available queues
         to the default queue map. The default and read queue maps share the
         existing queues.
      2. if write queues are requested:
        - first make sure that the read queue map gets the requested
          nr_io_queues count
        - then grant the default queue map the minimum of the requested
          nr_write_queues and the remaining queues. If no queues remain
          to dedicate to the default queue map, fall back to (1) and
          share all the queues in the existing queue map.
      3. if poll queues are requested:
        - map the remaining queues to the poll queue map.
      
      Also, log an indication of how the different queue maps were
      constructed; the rule (3) step is sketched below.
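
      Relative to the TCP variant above, only rule (3) is new; under
      the same simplified, illustrative names, a sketch of that step:

        /* rule 3: whatever is still unassigned after the default/read
         * split feeds the poll queue map */
        static void split_poll_queues(unsigned int granted_left,
                                      unsigned int nr_poll,
                                      unsigned int *poll_q)
        {
                *poll_q = nr_poll < granted_left ? nr_poll : granted_left;
        }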
      Reported-by: Harris, James R <james.r.harris@intel.com>
      Reviewed-by: Max Gurtovoy <maxg@mellanox.com>
      Tested-by: Jim Harris <james.r.harris@intel.com>
      Cc: <stable@vger.kernel.org> # v5.0+
      Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
  5. 23 May 2019 (1 commit)
    • nvme-pci: use blk-mq mapping for unmanaged irqs · cb9e0e50
      Committed by Keith Busch
      If a device provides only a single IRQ vector, the IO queue will
      share that vector with the admin queue. Such a vector is
      unmanaged, so it does not have a valid PCI IRQ affinity. Avoid
      trying to extract a managed affinity in this case and let blk-mq
      set up the cpu:queue mapping instead (a sketch of the fallback
      follows the trace below). Otherwise we'd hit the following
      warning when the device is using MSI:
      
       WARNING: CPU: 4 PID: 7 at drivers/pci/msi.c:1272 pci_irq_get_affinity+0x66/0x80
       Modules linked in: nvme nvme_core serio_raw
       CPU: 4 PID: 7 Comm: kworker/u16:0 Tainted: G        W         5.2.0-rc1+ #494
       Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.org 04/01/2014
       Workqueue: nvme-reset-wq nvme_reset_work [nvme]
       RIP: 0010:pci_irq_get_affinity+0x66/0x80
       Code: 0b 31 c0 c3 83 e2 10 48 c7 c0 b0 83 35 91 74 2a 48 8b 87 d8 03 00 00 48 85 c0 74 0e 48 8b 50 30 48 85 d2 74 05 39 70 14 77 05 <0f> 0b 31 c0 c3 48 63 f6 48 8d 04 76 48 8d 04 c2 f3 c3 48 8b 40 30
       RSP: 0000:ffffb5abc01d3cc8 EFLAGS: 00010246
       RAX: ffff9536786a39c0 RBX: 0000000000000000 RCX: 0000000000000080
       RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff9536781ed000
       RBP: ffff95367346a008 R08: ffff95367d43f080 R09: ffff953678c07800
       R10: ffff953678164800 R11: 0000000000000000 R12: 0000000000000000
       R13: ffff9536781ed000 R14: 00000000ffffffff R15: ffff95367346a008
       FS:  0000000000000000(0000) GS:ffff95367d400000(0000) knlGS:0000000000000000
       CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
       CR2: 00007fdf814a3ff0 CR3: 000000001a20f000 CR4: 00000000000006e0
       DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
       DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
       Call Trace:
        blk_mq_pci_map_queues+0x37/0xd0
        nvme_pci_map_queues+0x80/0xb0 [nvme]
        blk_mq_alloc_tag_set+0x133/0x2f0
        nvme_reset_work+0x105d/0x1590 [nvme]
        process_one_work+0x291/0x530
        worker_thread+0x218/0x3d0
        ? process_one_work+0x530/0x530
        kthread+0x111/0x130
        ? kthread_park+0x90/0x90
        ret_from_fork+0x1f/0x30
       ---[ end trace 74587339d93c83c0 ]---
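
      A sketch of the fallback, assuming a simplified .map_queues and a
      num_vecs field counting the allocated vectors (the real driver
      also handles poll queues and per-map queue offsets):

        /* With multiple vectors, the IO queues own managed, affinity-
         * carrying IRQs, so build the map from PCI affinity, offset by
         * 1 to skip the admin vector. With one shared, unmanaged
         * vector there is no affinity to read, so fall back to blk-mq's
         * default cpu:queue spreading. */
        static int map_queues_sketch(struct blk_mq_tag_set *set)
        {
                struct nvme_dev *dev = set->driver_data;
                struct blk_mq_queue_map *map =
                                &set->map[HCTX_TYPE_DEFAULT];

                if (dev->num_vecs > 1)
                        return blk_mq_pci_map_queues(map,
                                        to_pci_dev(dev->dev), 1);
                return blk_mq_map_queues(map);
        }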
      
      Fixes: 22b55601 ("nvme-pci: Separate IO and admin queue IRQ vectors")
      Reported-by: Iván Chavero <ichavero@chavero.com.mx>
      Reviewed-by: Ming Lei <ming.lei@redhat.com>
      Signed-off-by: Keith Busch <keith.busch@intel.com>
  6. 21 May 2019 (2 commits)
  7. 18 May 2019 (10 commits)
  8. 14 May 2019 (4 commits)
  9. 13 May 2019 (1 commit)