1. 07 December 2018 (1 commit)
    • nvme: validate controller state before rescheduling keep alive · 86880d64
      Committed by James Smart
      Delete operations are seeing NULL pointer references in call_timer_fn.
      Tracking these back, the timer appears to be the keep alive timer.
      
      nvme_keep_alive_work(), which is tied to the timer cancelled by
      nvme_stop_keep_alive(), simply starts the keep alive io but doesn't
      wait for its completion. So nvme_stop_keep_alive() only stops a timer
      when it's pending. When a keep alive is in flight, no timer is
      running, so nvme_stop_keep_alive() has no effect on the keep alive
      io. Thus, if the io completes successfully, the keep alive timer
      will be rescheduled. In the failure case, delete is called, the
      controller state is changed, nvme_stop_keep_alive() is called while
      the io is outstanding, and the delete path continues on. The keep
      alive happens to complete successfully before the delete path marks it
      as aborted as part of queue termination, so the timer is restarted.
      The delete path then tears down the controller, and later the timer
      code fires on the now-corrupt timer entry.
      
      Fix by validating the controller state before rescheduling the keep
      alive. Testing with the fix has confirmed the condition above was hit.
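      The guard can be modeled in a few lines. This is a minimal userspace
      sketch, not the driver source: the state names loosely mirror the
      kernel's enum nvme_ctrl_state, and the struct and helper are
      illustrative stand-ins.

      ```c
      #include <assert.h>
      #include <stdbool.h>

      /* Illustrative model of the fix: the keep-alive completion handler
       * re-arms its timer only while the controller is still live. */
      enum ctrl_state { CTRL_LIVE, CTRL_DELETING, CTRL_DEAD };

      struct ctrl {
          enum ctrl_state state;
          bool ka_timer_armed;
      };

      /* Completion path of a keep-alive I/O. */
      static void keep_alive_end_io(struct ctrl *c, bool io_succeeded)
      {
          /* The fix: never restart the timer on a controller that is being
           * deleted, or the timer fires after teardown on freed memory. */
          if (io_succeeded && c->state == CTRL_LIVE)
              c->ka_timer_armed = true;   /* models schedule_delayed_work() */
          else
              c->ka_timer_armed = false;
      }
      ```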
      Signed-off-by: James Smart <jsmart2021@gmail.com>
      Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
  2. 01 December 2018 (3 commits)
  3. 28 November 2018 (2 commits)
    • nvme-pci: fix surprise removal · 751a0cc0
      Committed by Igor Konopko
      When a PCIe NVMe device is not present, nvme_dev_remove_admin() calls
      blk_cleanup_queue() on the admin queue, which frees the hctx for that
      queue.  Moments later, on the same path nvme_kill_queues() calls
      blk_mq_unquiesce_queue() on admin queue and tries to access hctx of it,
      which leads to following OOPS:
      
      Oops: 0000 [#1] SMP PTI
      RIP: 0010:sbitmap_any_bit_set+0xb/0x40
      Call Trace:
       blk_mq_run_hw_queue+0xd5/0x150
       blk_mq_run_hw_queues+0x3a/0x50
       nvme_kill_queues+0x26/0x50
       nvme_remove_namespaces+0xb2/0xc0
       nvme_remove+0x60/0x140
       pci_device_remove+0x3b/0xb0
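      The essence of the fix is a guard on the unquiesce path. This is an
      illustrative userspace model, not the driver code; the structs and the
      return value are stand-ins for the sketch.

      ```c
      #include <assert.h>
      #include <stdbool.h>
      #include <stddef.h>

      /* Model: the kill-queues path must tolerate an admin queue that has
       * already been cleaned up by nvme_dev_remove_admin(). */
      struct queue { bool dying; };
      struct ctrl  { struct queue *admin_q; };

      /* Returns true if the unquiesce was actually performed. */
      static bool kill_queues_unquiesce_admin(struct ctrl *c)
      {
          /* Guard against the surprise-removal ordering: after
           * blk_cleanup_queue() the admin queue's hctx is gone, so
           * touching it here would reproduce the OOPS above. */
          if (c->admin_q && !c->admin_q->dying) {
              /* models blk_mq_unquiesce_queue(c->admin_q) */
              return true;
          }
          return false;
      }
      ```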
      
      Fixes: cb4bfda6 ("nvme-pci: fix hot removal during error handling")
      Signed-off-by: Igor Konopko <igor.j.konopko@intel.com>
      Reviewed-by: Keith Busch <keith.busch@intel.com>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
    • nvme-fc: initialize nvme_req(rq)->ctrl after calling __nvme_fc_init_request() · dfa74422
      Committed by Ewan D. Milne
      __nvme_fc_init_request() invokes memset() on the nvme_fcp_op_w_sgl structure, which
      NULLed out the nvme_req(rq)->ctrl field previously set by nvme_fc_init_request().
      This apparently was not referenced until commit faf4a44fff ("nvme: support traffic
      based keep-alive"), which now results in a crash in nvme_complete_rq():
      
      [ 8386.897130] RIP: 0010:panic+0x220/0x26c
      [ 8386.901406] Code: 83 3d 6f ee 72 01 00 74 05 e8 e8 54 02 00 48 c7 c6 40 fd 5b b4 48 c7 c7 d8 8d c6 b3 31e
      [ 8386.922359] RSP: 0018:ffff99650019fc40 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff13
      [ 8386.930804] RAX: 0000000000000046 RBX: 0000000000000000 RCX: 0000000000000006
      [ 8386.938764] RDX: 0000000000000000 RSI: 0000000000000082 RDI: ffff8e325f8168b0
      [ 8386.946725] RBP: ffff99650019fcb0 R08: 0000000000000000 R09: 00000000000004f8
      [ 8386.954687] R10: 0000000000000000 R11: ffff99650019f9b8 R12: ffffffffb3c55f3c
      [ 8386.962648] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000001
      [ 8386.970613]  oops_end+0xd1/0xe0
      [ 8386.974116]  no_context+0x1b2/0x3c0
      [ 8386.978006]  do_page_fault+0x32/0x140
      [ 8386.982090]  page_fault+0x1e/0x30
      [ 8386.985786] RIP: 0010:nvme_complete_rq+0x65/0x1d0 [nvme_core]
      [ 8386.992195] Code: 41 bc 03 00 00 00 74 16 0f 86 c3 00 00 00 66 3d 83 00 41 bc 06 00 00 00 0f 85 e7 00 000
      [ 8387.013147] RSP: 0018:ffff99650019fe18 EFLAGS: 00010246
      [ 8387.018973] RAX: 0000000000000000 RBX: ffff8e322ae51280 RCX: 0000000000000001
      [ 8387.026935] RDX: 0000000000000400 RSI: 0000000000000001 RDI: ffff8e322ae51280
      [ 8387.034897] RBP: ffff8e322ae51280 R08: 0000000000000000 R09: ffffffffb2f0b890
      [ 8387.042859] R10: 0000000000000001 R11: 0000000000000000 R12: 0000000000000000
      [ 8387.050821] R13: 0000000000000100 R14: 0000000000000004 R15: ffff8e2b0446d990
      [ 8387.058782]  ? swiotlb_unmap_page+0x40/0x40
      [ 8387.063448]  nvme_fc_complete_rq+0x2d/0x70 [nvme_fc]
      [ 8387.068986]  blk_done_softirq+0xa1/0xd0
      [ 8387.073264]  __do_softirq+0xd6/0x2a9
      [ 8387.077251]  run_ksoftirqd+0x26/0x40
      [ 8387.081238]  smpboot_thread_fn+0x10e/0x160
      [ 8387.085807]  kthread+0xf8/0x130
      [ 8387.089309]  ? sort_range+0x20/0x20
      [ 8387.093198]  ? kthread_stop+0x110/0x110
      [ 8387.097475]  ret_from_fork+0x35/0x40
      [ 8387.101462] ---[ end trace 7106b0adf5e422f8 ]---
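      The ordering bug can be shown in miniature. This is an illustrative
      userspace model; the structure name and fields are simplified stand-ins
      for the real nvme_fcp_op_w_sgl layout.

      ```c
      #include <assert.h>
      #include <string.h>

      /* Model: __nvme_fc_init_request() memsets the operation structure
       * that embeds the request private data, so any field written before
       * that memset is wiped. */
      struct req_priv { void *ctrl; };

      static void init_request_buggy(struct req_priv *p, void *ctrl)
      {
          p->ctrl = ctrl;             /* set first ... */
          memset(p, 0, sizeof(*p));   /* ... then wiped by the memset */
      }

      static void init_request_fixed(struct req_priv *p, void *ctrl)
      {
          memset(p, 0, sizeof(*p));   /* models __nvme_fc_init_request() */
          p->ctrl = ctrl;             /* the fix: assign after the memset */
      }
      ```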
      
      Fixes: faf4a44fff ("nvme: support traffic based keep-alive")
      Signed-off-by: Ewan D. Milne <emilne@redhat.com>
      Reviewed-by: Hannes Reinecke <hare@suse.com>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
  4. 27 November 2018 (1 commit)
  5. 15 November 2018 (1 commit)
    • nvme-fc: resolve io failures during connect · 4cff280a
      Committed by James Smart
      If an io error occurs on an io issued while connecting, recovery
      of the io falls flat as the state checking ends up nooping the error
      handler.
      
      Create an err_work work item that is scheduled upon an io error while
      connecting. The work thread terminates all io on all queues and marks
      the queues as not connected.  The termination of the io will return
      back to the caller, which will then back out of the connection attempt
      and, if possible, reschedule it.
      
      The changes:
      - in case there are several commands hitting the error handler, a
        state flag is kept so that the error work is only scheduled once,
        on the first error. The subsequent errors can be ignored.
      - The calling sequence to stop keep alive and terminate the queues
        and their io is lifted from the reset routine into a small
        service routine used by both reset and err_work.
      - During debugging, found that the teardown path can reference
        an uninitialized pointer, resulting in a NULL pointer oops.
        The aen_ops weren't initialized yet. Add validation on their
        initialization before calling the teardown routine.
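      The once-only scheduling in the first bullet can be sketched as a flag
      check. This is an illustrative userspace model; in the kernel this
      would be an atomic test-and-set on a controller state flag, and the
      names here are made up for the sketch.

      ```c
      #include <assert.h>
      #include <stdbool.h>

      /* Model: only the first I/O error during connect schedules err_work;
       * subsequent errors are ignored until the work runs. */
      struct ctrl {
          bool err_work_pending;
          int  err_work_scheduled;   /* counts actual schedules */
      };

      static void io_error(struct ctrl *c)
      {
          if (c->err_work_pending)
              return;                  /* later errors are ignored */
          c->err_work_pending = true;
          c->err_work_scheduled++;     /* models schedule_work(&err_work) */
      }
      ```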
      Signed-off-by: James Smart <jsmart2021@gmail.com>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
  6. 09 November 2018 (1 commit)
  7. 02 November 2018 (2 commits)
    • nvme-pci: fix conflicting p2p resource adds · 9fe5c59f
      Committed by Keith Busch
      The nvme pci driver had been adding its CMB resource to the P2P DMA
      subsystem every time on a controller reset. This results in the
      following warning:
      
          ------------[ cut here ]------------
          nvme 0000:00:03.0: Conflicting mapping in same section
          WARNING: CPU: 7 PID: 81 at kernel/memremap.c:155 devm_memremap_pages+0xa6/0x380
          ...
          Call Trace:
           pci_p2pdma_add_resource+0x153/0x370
           nvme_reset_work+0x28c/0x17b1 [nvme]
           ? add_timer+0x107/0x1e0
           ? dequeue_entity+0x81/0x660
           ? dequeue_entity+0x3b0/0x660
           ? pick_next_task_fair+0xaf/0x610
           ? __switch_to+0xbc/0x410
           process_one_work+0x1cf/0x350
           worker_thread+0x215/0x3d0
           ? process_one_work+0x350/0x350
           kthread+0x107/0x120
           ? kthread_park+0x80/0x80
           ret_from_fork+0x1f/0x30
          ---[ end trace f7ea76ac6ee72727 ]---
          nvme nvme0: failed to register the CMB
      
      This patch fixes this by registering the CMB with P2P only once.
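      The register-once behavior amounts to remembering the first
      registration. This is an illustrative userspace model, not the driver
      source; the struct and field names are stand-ins.

      ```c
      #include <assert.h>
      #include <stdbool.h>

      /* Model: remember that the CMB has been added to the P2P DMA
       * subsystem so repeated controller resets do not re-register it. */
      struct nvme_dev {
          bool cmb_p2p_added;
          int  p2p_registrations;   /* counts actual registrations */
      };

      static void reset_work(struct nvme_dev *d)
      {
          /* The fix: register the CMB resource only on the first reset. */
          if (!d->cmb_p2p_added) {
              d->p2p_registrations++;  /* models pci_p2pdma_add_resource() */
              d->cmb_p2p_added = true;
          }
      }
      ```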
      Signed-off-by: Keith Busch <keith.busch@intel.com>
      Reviewed-by: Logan Gunthorpe <logang@deltatee.com>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • nvme-fc: fix request private initialization · d19b8bc8
      Committed by James Smart
      The patch made to avoid a Coverity report of out-of-bounds access
      on aen_op moved the assignment of a pointer, leaving it NULL when it
      was subsequently used to calculate a private pointer. Thus the private
      pointer was bad.
      
      Move/correct the private pointer initialization to be in sync with the
      patch.
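      The dependency can be shown in miniature. This is an illustrative
      sketch only; the structure names are simplified stand-ins, and the
      layout is an assumption made for the example, not the real
      nvme_fcp_op_w_sgl definition.

      ```c
      #include <assert.h>
      #include <stddef.h>

      /* Model: the request private area sits behind the operation
       * structure, so it must be derived from the op pointer only after
       * that pointer has been assigned. */
      struct fcp_op   { int opaque; };
      struct op_w_sgl { struct fcp_op op; char priv[32]; };

      static void *init_priv(struct op_w_sgl *wrapper)
      {
          struct fcp_op *op = &wrapper->op;  /* assign the base pointer first */
          return op + 1;                     /* then derive the private area */
      }
      ```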
      
      Fixes: 0d2bdf9f ("nvme-fc: rework the request initialization code")
      Signed-off-by: James Smart <jsmart2021@gmail.com>
      Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
  8. 19 October 2018 (2 commits)
  9. 18 October 2018 (3 commits)
  10. 17 October 2018 (10 commits)
  11. 09 October 2018 (3 commits)
    • lightnvm: do no update csecs and sos on 1.2 · 6fd05cad
      Committed by Javier González
      1.2 devices expose their data and metadata sizes through the separate
      identify command. Make sure that the NVMe LBA format does not override
      these values.
      Signed-off-by: Javier González <javier@cnexlabs.com>
      Signed-off-by: Matias Bjørling <mb@lightnvm.io>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • lightnvm: use internal allocation for chunk log page · 090ee26f
      Committed by Javier González
      The lightnvm subsystem provides helpers to retrieve chunk metadata,
      where the target needs to provide a buffer to store the metadata. An
      implicit assumption is that this buffer is contiguous and can be used to
      retrieve the data from the device. If the device exposes too many
      chunks, then kmalloc might fail, thus failing instance creation.
      
      This patch removes this assumption by implementing an internal buffer in
      the lightnvm subsystem to retrieve chunk metadata. Targets can then
      use virtual memory allocations. Since this is a target API change, adapt
      pblk accordingly.
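      The approach can be modeled as a bounce-buffer copy loop. This is an
      illustrative userspace sketch: the core fetches metadata in small,
      contiguous (kmalloc-able) pieces and copies into the target's buffer,
      which may then be a virtual (vmalloc) allocation of any size. The
      sizes and function names here are made up for the example.

      ```c
      #include <assert.h>
      #include <stddef.h>
      #include <string.h>

      enum { BOUNCE_SZ = 8 };   /* small contiguous buffer size (made up) */

      /* Stand-in for a per-piece device read into a contiguous buffer. */
      static void dev_read_meta(unsigned char *dst, size_t off, size_t len)
      {
          for (size_t i = 0; i < len; i++)
              dst[i] = (unsigned char)(off + i);
      }

      /* Copy all chunk metadata into the target's buffer, piece by piece,
       * so the target buffer never needs to be physically contiguous. */
      static void get_chunk_meta(unsigned char *target_buf, size_t total)
      {
          unsigned char bounce[BOUNCE_SZ];

          for (size_t off = 0; off < total; off += BOUNCE_SZ) {
              size_t len = total - off < BOUNCE_SZ ? total - off : BOUNCE_SZ;
              dev_read_meta(bounce, off, len);
              memcpy(target_buf + off, bounce, len);
          }
      }
      ```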
      Signed-off-by: Javier González <javier@cnexlabs.com>
      Reviewed-by: Hans Holmberg <hans.holmberg@cnexlabs.com>
      Signed-off-by: Matias Bjørling <mb@lightnvm.io>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • lightnvm: move bad block and chunk state logic to core · aff3fb18
      Committed by Matias Bjørling
      pblk implements two data paths for recovering line state, one for 1.2
      and another for 2.0. Instead of having pblk implement these, combine
      them in the core to reduce complexity and make them available to other
      targets.
      
      The new interface will adhere to the 2.0 chunk definition,
      including managing open chunks with an active write pointer. To provide
      this interface, a 1.2 device recovers the state of the chunks by
      manually detecting whether a chunk is free/open/closed/offline and, if
      open, scanning the flash pages sequentially to find the next writable
      page. This process takes on average ~10 seconds on a device with 64 dies,
      1024 blocks and 60us read access time. The process can be parallelized
      but is left out for maintenance simplicity, as the 1.2 specification is
      deprecated. For 2.0 devices, the logic is maintained internally in the
      drive and retrieved through the 2.0 interface.
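      The write-pointer recovery for an open chunk can be sketched as a
      sequential scan. This is an illustrative model, not the kernel code;
      the page-written array is a stand-in for reading flash pages.

      ```c
      #include <assert.h>
      #include <stdbool.h>

      /* Model of the 1.2 recovery: scan pages sequentially and stop at the
       * first unwritten page; that index becomes the write pointer. */
      static int recover_write_pointer(const bool *page_written, int npages)
      {
          int wp = 0;
          while (wp < npages && page_written[wp])
              wp++;
          return wp;   /* next writable page (== npages if chunk is full) */
      }
      ```

      The per-chunk scans are independent, so the commit notes they could be
      parallelized across dies; that is left out to keep the deprecated 1.2
      path simple to maintain.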
      Signed-off-by: Matias Bjørling <mb@lightnvm.io>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
  12. 08 October 2018 (1 commit)
  13. 03 October 2018 (1 commit)
  14. 02 October 2018 (6 commits)
  15. 28 September 2018 (2 commits)
  16. 26 September 2018 (1 commit)