1. 18 Oct, 2018 · 3 commits
  2. 17 Oct, 2018 · 2 commits
  3. 03 Oct, 2018 · 1 commit
  4. 28 Aug, 2018 · 1 commit
    • nvme-pci: add a memory barrier to nvme_dbbuf_update_and_check_event · f1ed3df2
      Authored by Michal Wnukowski
      On many architectures, loads may be reordered with older stores to
      different locations.  In the nvme driver the following two operations
      could be reordered:
      
       - Write shadow doorbell (dbbuf_db) into memory.
       - Read EventIdx (dbbuf_ei) from memory.
      
      This can result in a potential race condition between the driver and the
      VM host processing requests (if the given virtual NVMe controller
      supports the shadow doorbell).  If that occurs, the NVMe controller may
      decide to wait for an MMIO doorbell from the guest operating system,
      while the guest driver may decide not to issue an MMIO doorbell on any
      subsequent commands.
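
      The fix adds a full memory barrier between the shadow-doorbell write and
      the EventIdx read.  Below is a minimal sketch of the update helper after
      the change (a simplified illustration, not the exact upstream code):

      static bool nvme_dbbuf_update_and_check_event(u16 value, u32 *dbbuf_db,
                                                    volatile u32 *dbbuf_ei)
      {
              if (dbbuf_db) {
                      u16 old_value = *dbbuf_db;

                      /* Write the shadow doorbell (dbbuf_db) into memory. */
                      *dbbuf_db = value;

                      /*
                       * Make the doorbell write visible before reading the
                       * EventIdx; otherwise a stale EventIdx can make the
                       * driver skip the MMIO doorbell the controller waits for.
                       */
                      mb();

                      /* Read EventIdx (dbbuf_ei) from memory. */
                      if (!nvme_dbbuf_need_event(*dbbuf_ei, value, old_value))
                              return false;
              }
              return true;
      }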
      
      This issue is a purely timing-dependent one, so there is no easy way to
      reproduce it.  Currently the easiest known approach is to run "Oracle IO
      Numbers" (orion), which is shipped with Oracle DB:
      
      orion -run advanced -num_large 0 -size_small 8 -type rand -simulate \
      	concat -write 40 -duration 120 -matrix row -testname nvme_test
      
      where nvme_test is a .lun file that contains a list of NVMe block
      devices to run the test against.  Limiting the number of vCPUs assigned
      to a given VM instance seems to increase the chances of this bug
      occurring.  On a test environment with a VM that had 4 NVMe drives and
      1 vCPU assigned, the virtual NVMe controller hang could be observed
      within 10-20 minutes.  That corresponds to about 400-500k IO operations
      processed (or about 100GB of IO reads/writes).
      
      The Orion tool was used for validation and set to run in a loop for 36
      hours (the equivalent of pushing 550M IO operations).  No issues were
      observed, which suggests that the patch fixes the issue.
      
      Fixes: f9f38e33 ("nvme: improve performance for virtual NVMe devices")
      Signed-off-by: Michal Wnukowski <wnukowski@google.com>
      Reviewed-by: Keith Busch <keith.busch@intel.com>
      Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
      [hch: updated changelog and comment a bit]
      Signed-off-by: Christoph Hellwig <hch@lst.de>
  5. 30 Jul, 2018 · 1 commit
  6. 23 Jul, 2018 · 1 commit
  7. 12 Jul, 2018 · 1 commit
    • nvme-pci: fix memory leak on probe failure · b6e44b4c
      Authored by Keith Busch
      The nvme driver-specific structures need to be initialized prior to
      enabling the generic controller, so that we can unwind on failure
      without using the reference counting callbacks and keep 'probe' and
      'remove' symmetric.
      
      The newly added iod_mempool is the only resource that was being
      allocated out of order, and a failure there would leak the generic
      controller memory. This patch just moves that allocation above the
      controller initialization.
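
      A heavily abridged sketch of the resulting ordering in nvme_probe()
      (error handling and most of the setup are elided, and the labels here
      are illustrative rather than the exact upstream ones):

      static int nvme_probe(struct pci_dev *pdev, const struct pci_device_id *id)
      {
              struct nvme_dev *dev;
              size_t alloc_size;
              int node, result = -ENOMEM;

              node = dev_to_node(&pdev->dev);
              dev = kzalloc_node(sizeof(*dev), GFP_KERNEL, node);
              if (!dev)
                      return -ENOMEM;

              /* ... queue array, PCI mapping and DMA pool setup elided ... */

              /*
               * Allocate the driver-private iod_mempool before nvme_init_ctrl()
               * so that a failure here unwinds through the plain error labels
               * below instead of the generic ctrl refcount/release path,
               * keeping probe and remove symmetric.
               */
              alloc_size = nvme_pci_iod_alloc_size(dev, NVME_MAX_KB_SZ,
                                                   NVME_MAX_SEGS, true);
              dev->iod_mempool = mempool_create_node(1, mempool_kmalloc,
                                                     mempool_kfree,
                                                     (void *)alloc_size,
                                                     GFP_KERNEL, node);
              if (!dev->iod_mempool)
                      goto release_pools;

              result = nvme_init_ctrl(&dev->ctrl, &pdev->dev,
                                      &nvme_pci_ctrl_ops, id->driver_data);
              if (result)
                      goto release_mempool;

              /* ... remainder of probe ... */
              return 0;

      release_mempool:
              mempool_destroy(dev->iod_mempool);
      release_pools:
              /* ... free queues, DMA pools and dev itself ... */
              return result;
      }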
      
      Fixes: 943e942e ("nvme-pci: limit max IO size and segments to avoid high order allocations")
      Reported-by: Weiping Zhang <zwp10758@gmail.com>
      Signed-off-by: Keith Busch <keith.busch@intel.com>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
  8. 22 Jun, 2018 · 1 commit
    • nvme-pci: limit max IO size and segments to avoid high order allocations · 943e942e
      Authored by Jens Axboe
      nvme requires an sg table allocation for each request.  If the request
      is large, the allocation can become quite large.  For instance, with
      our default software settings of a 1280KB IO size, we need 10248 bytes
      of sg table.  That turns into a 2nd-order allocation, which we can't
      always guarantee.  If we fail the allocation, blk-mq will retry it
      later, but there's no guarantee that we'll EVER be able to allocate
      that much contiguous memory.
      
      Limit the IO size such that we never need more than a single page of
      memory.  That's a lot faster and more reliable.  Then back that
      allocation with a mempool, so that we know the allocation will always
      succeed at some point.
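
      For reference, the numbers work out roughly as follows, assuming 4KB
      pages and 32-byte scatterlist entries on 64-bit.  The two limits are the
      ones the patch introduces; the wrapper function shown here is
      hypothetical (upstream assigns the fields directly during controller
      setup):

      /*
       * Before the fix, a 1280KB request could need
       *   1280KB / 4KB = 320 segments * 32 bytes  = 10240 bytes
       *   + a PRP-list page pointer               =     8 bytes
       *                                           = 10248 bytes,
       * and kmalloc(10248) rounds up to a 16KB (order-2) contiguous block.
       * With the caps below, the worst case (127 * 32 bytes plus a few
       * PRP-list pointers) stays under 4096 bytes, i.e. a single page.
       */
      #define NVME_MAX_KB_SZ  4096
      #define NVME_MAX_SEGS   127

      static void nvme_set_io_limits(struct nvme_dev *dev)
      {
              /* Advertise the caps so the block layer never builds a larger request. */
              dev->ctrl.max_hw_sectors = NVME_MAX_KB_SZ << 1;  /* KB -> 512B sectors */
              dev->ctrl.max_segments = NVME_MAX_SEGS;
      }
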
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      Acked-by: Keith Busch <keith.busch@intel.com>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
  9. 21 Jun, 2018 · 1 commit
  10. 09 Jun, 2018 · 6 commits
  11. 30 May, 2018 · 2 commits
  12. 29 May, 2018 · 1 commit
  13. 25 May, 2018 · 2 commits
  14. 21 May, 2018 · 1 commit
    • nvme-pci: fix race between poll and IRQ completions · 68fa9dbe
      Authored by Jens Axboe
      If polling completions are racing with the IRQ triggered by a
      completion, the IRQ handler will find no work and return IRQ_NONE.
      This can trigger complaints about spurious interrupts:
      
      [  560.169153] irq 630: nobody cared (try booting with the "irqpoll" option)
      [  560.175988] CPU: 40 PID: 0 Comm: swapper/40 Not tainted 4.17.0-rc2+ #65
      [  560.175990] Hardware name: Intel Corporation S2600STB/S2600STB, BIOS SE5C620.86B.00.01.0010.010920180151 01/09/2018
      [  560.175991] Call Trace:
      [  560.175994]  <IRQ>
      [  560.176005]  dump_stack+0x5c/0x7b
      [  560.176010]  __report_bad_irq+0x30/0xc0
      [  560.176013]  note_interrupt+0x235/0x280
      [  560.176020]  handle_irq_event_percpu+0x51/0x70
      [  560.176023]  handle_irq_event+0x27/0x50
      [  560.176026]  handle_edge_irq+0x6d/0x180
      [  560.176031]  handle_irq+0xa5/0x110
      [  560.176036]  do_IRQ+0x41/0xc0
      [  560.176042]  common_interrupt+0xf/0xf
      [  560.176043]  </IRQ>
      [  560.176050] RIP: 0010:cpuidle_enter_state+0x9b/0x2b0
      [  560.176052] RSP: 0018:ffffa0ed4659fe98 EFLAGS: 00000246 ORIG_RAX: ffffffffffffffdd
      [  560.176055] RAX: ffff9527beb20a80 RBX: 000000826caee491 RCX: 000000000000001f
      [  560.176056] RDX: 000000826caee491 RSI: 00000000335206ee RDI: 0000000000000000
      [  560.176057] RBP: 0000000000000001 R08: 00000000ffffffff R09: 0000000000000008
      [  560.176059] R10: ffffa0ed4659fe78 R11: 0000000000000001 R12: ffff9527beb29358
      [  560.176060] R13: ffffffffa235d4b8 R14: 0000000000000000 R15: 000000826caed593
      [  560.176065]  ? cpuidle_enter_state+0x8b/0x2b0
      [  560.176071]  do_idle+0x1f4/0x260
      [  560.176075]  cpu_startup_entry+0x6f/0x80
      [  560.176080]  start_secondary+0x184/0x1d0
      [  560.176085]  secondary_startup_64+0xa5/0xb0
      [  560.176088] handlers:
      [  560.178387] [<00000000efb612be>] nvme_irq [nvme]
      [  560.183019] Disabling IRQ #630
      
      A previous commit removed ->cqe_seen, which handled this case, but we
      need to handle it a bit differently now that completions run outside
      the queue lock.  Return IRQ_HANDLED from the IRQ handler if the
      completion ring head has moved since we last saw it.
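
      A simplified sketch of the interrupt handler after the change (field and
      helper names follow the driver of that period, trimmed for illustration):

      static irqreturn_t nvme_irq(int irq, void *data)
      {
              struct nvme_queue *nvmeq = data;
              irqreturn_t ret = IRQ_NONE;
              u16 start, end;

              spin_lock(&nvmeq->cq_lock);
              /*
               * If the CQ head moved since the last interrupt, a polling
               * context consumed entries that belonged to this IRQ; claim
               * the interrupt so it is not reported as spurious.
               */
              if (nvmeq->cq_head != nvmeq->last_cq_head)
                      ret = IRQ_HANDLED;
              nvme_process_cq(nvmeq, &start, &end, -1);
              nvmeq->last_cq_head = nvmeq->cq_head;
              spin_unlock(&nvmeq->cq_lock);

              if (start != end) {
                      nvme_complete_cqes(nvmeq, start, end);
                      return IRQ_HANDLED;
              }
              return ret;
      }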
      
      Fixes: 5cb525c8 ("nvme-pci: handle completions outside of the queue lock")
      Reported-by: Keith Busch <keith.busch@intel.com>
      Reviewed-by: Keith Busch <keith.busch@intel.com>
      Tested-by: Keith Busch <keith.busch@intel.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
  15. 19 May, 2018 · 6 commits
  16. 12 May, 2018 · 2 commits
  17. 07 May, 2018 · 1 commit
  18. 02 May, 2018 · 1 commit
  19. 27 Apr, 2018 · 2 commits
  20. 25 Apr, 2018 · 1 commit
  21. 12 Apr, 2018 · 3 commits