1. 07 Sep 2016, 1 commit
  2. 11 Aug 2016, 1 commit
    • nvme: Suspend all queues before deletion · c21377f8
      Gabriel Krisman Bertazi authored
      When nvme_delete_queue fails in the first pass of the
      nvme_disable_io_queues() loop, we return early, failing to suspend all
      of the IO queues.  Later, on the nvme_pci_disable path, this causes us
      to disable MSI without actually having freed all the IRQs, which
      triggers the BUG_ON in free_msi_irqs(), as shown below.
      
      This patch refactors nvme_disable_io_queues to suspend all queues before
      it starts submitting delete queue commands.  This way, we ensure that we
      have at least returned every IRQ before continuing with the removal
      path.
      
      [  487.529200] kernel BUG at ../drivers/pci/msi.c:368!
      cpu 0x46: Vector: 700 (Program Check) at [c0000078c5b83650]
          pc: c000000000627a50: free_msi_irqs+0x90/0x200
          lr: c000000000627a40: free_msi_irqs+0x80/0x200
          sp: c0000078c5b838d0
         msr: 9000000100029033
        current = 0xc0000078c5b40000
        paca    = 0xc000000002bd7600   softe: 0        irq_happened: 0x01
          pid   = 1376, comm = kworker/70:1H
      kernel BUG at ../drivers/pci/msi.c:368!
      Linux version 4.7.0.mainline+ (root@iod76) (gcc version 5.3.1 20160413
      (Ubuntu/IBM 5.3.1-14ubuntu2.1) ) #104 SMP Fri Jul 29 09:20:17 CDT 2016
      enter ? for help
      [c0000078c5b83920] d0000000363b0cd8 nvme_dev_disable+0x208/0x4f0 [nvme]
      [c0000078c5b83a10] d0000000363b12a4 nvme_timeout+0xe4/0x250 [nvme]
      [c0000078c5b83ad0] c0000000005690e4 blk_mq_rq_timed_out+0x64/0x110
      [c0000078c5b83b40] c00000000056c930 bt_for_each+0x160/0x170
      [c0000078c5b83bb0] c00000000056d928 blk_mq_queue_tag_busy_iter+0x78/0x110
      [c0000078c5b83c00] c0000000005675d8 blk_mq_timeout_work+0xd8/0x1b0
      [c0000078c5b83c50] c0000000000e8cf0 process_one_work+0x1e0/0x590
      [c0000078c5b83ce0] c0000000000e9148 worker_thread+0xa8/0x660
      [c0000078c5b83d80] c0000000000f2090 kthread+0x110/0x130
      [c0000078c5b83e30] c0000000000095f0 ret_from_kernel_thread+0x5c/0x6c
      Signed-off-by: Gabriel Krisman Bertazi <krisman@linux.vnet.ibm.com>
      Cc: Brian King <brking@linux.vnet.ibm.com>
      Cc: Keith Busch <keith.busch@intel.com>
      Cc: linux-nvme@lists.infradead.org
      Signed-off-by: Jens Axboe <axboe@fb.com>
      c21377f8
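The two-pass structure this commit describes can be sketched as a small user-space model (a toy, not the actual kernel code; the struct and function names here are illustrative): suspend every queue first, then submit the delete commands, so an early delete failure can no longer leave IRQs unfreed.

```c
#include <stdbool.h>

/* Toy model of the fix: suspend all I/O queues before submitting any
 * delete-queue command, so every IRQ is returned even if a delete fails. */
struct toy_queue { bool suspended; bool deleted; };

static int toy_delete_queue(struct toy_queue *q)
{
    q->deleted = true;
    return 0; /* a failure here would still leave all queues suspended */
}

static void toy_disable_io_queues(struct toy_queue *queues, int n)
{
    /* pass 1: suspend everything before touching the delete path */
    for (int i = 0; i < n; i++)
        queues[i].suspended = true;

    /* pass 2: submit delete commands; bail out on the first failure */
    for (int i = 0; i < n; i++)
        if (toy_delete_queue(&queues[i]))
            return;
}
```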
  3. 21 Jul 2016, 1 commit
  4. 14 Jul 2016, 1 commit
  5. 13 Jul 2016, 1 commit
    • nvme: Limit command retries · f80ec966
      Keith Busch authored
      Many controller implementations return errors for commands that will
      not succeed, but without the DNR bit set. The driver previously retried
      these commands an unlimited number of times until the command timeout
      was exceeded, which takes an unnecessarily long time.
      
      This patch limits the number of retries a command can have, defaulting
      to 5 but user tunable at load or runtime.
      
      The struct request's 'retries' field is used to track the number of
      retries attempted. This is in contrast with scsi's use of this field,
      which indicates how many retries are allowed.
      Signed-off-by: Keith Busch <keith.busch@intel.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Jens Axboe <axboe@fb.com>
      f80ec966
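The retry accounting this message describes (counting attempts made, unlike SCSI, where the field is the number allowed) can be modeled in a few lines of plain C; the names and the default of 5 mirror the description, but this is a sketch, not the kernel's code:

```c
/* Toy model of the retry accounting: 'retries' counts attempts made.
 * nvme_max_retries models the tunable module parameter (default 5). */
static unsigned int nvme_max_retries = 5;

struct toy_request { unsigned int retries; };

/* returns 1 if the command should be retried, 0 if it must complete */
static int toy_should_retry(struct toy_request *req)
{
    if (req->retries >= nvme_max_retries)
        return 0; /* retry budget exhausted: complete with the error */
    req->retries++;
    return 1;
}
```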
  6. 12 Jul 2016, 1 commit
    • nvme/quirk: Add a delay before checking for adapter readiness · 54adc010
      Guilherme G. Piccoli authored
      When disabling the controller, the specification says the
      NVME_REG_CC register should be written, after which the driver must
      wait for the adapter to become ready, which is checked by reading
      another register bit (NVME_CSTS_RDY). This check has a timeout; if
      the timeout is reached, the driver gives up and removes the adapter
      from the system.
      
      After a firmware activation procedure, the PCI_DEVICE(0x1c58, 0x0003)
      (HGST adapter) ends up being removed if we issue a reset_controller,
      because the driver keeps verifying NVME_REG_CSTS until the timeout is
      reached. This patch adds the necessary quirk for this adapter by
      introducing a delay before nvme_wait_ready(), so the reset procedure
      can complete. This quirk is needed because merely increasing the
      timeout is not enough for this adapter: the driver must wait before
      it starts reading the NVME_REG_CSTS register on this specific device.
      Signed-off-by: Guilherme G. Piccoli <gpiccoli@linux.vnet.ibm.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Jens Axboe <axboe@fb.com>
      54adc010
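The quirk mechanism described here is easy to model: a per-device flag that, when set, inserts a delay before the readiness poll begins. The flag name and delay value below are illustrative, not the kernel's actual symbols:

```c
/* Toy model of a per-device quirk: delay before polling CSTS.RDY.
 * Flag name and delay value are illustrative. */
#define TOY_QUIRK_DELAY_BEFORE_CHK_RDY (1u << 0)
#define TOY_DELAY_MS 2000ul

struct toy_ctrl { unsigned long quirks; unsigned long delayed_ms; };

static void toy_msleep(struct toy_ctrl *c, unsigned long ms)
{
    c->delayed_ms += ms; /* stand-in for msleep(); records the wait */
}

/* Honor the quirk before the readiness check starts. */
static void toy_wait_ready_prepare(struct toy_ctrl *c)
{
    if (c->quirks & TOY_QUIRK_DELAY_BEFORE_CHK_RDY)
        toy_msleep(c, TOY_DELAY_MS);
}
```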
  7. 06 Jul 2016, 3 commits
  8. 22 Jun 2016, 1 commit
  9. 12 Jun 2016, 1 commit
  10. 10 Jun 2016, 1 commit
  11. 08 Jun 2016, 2 commits
  12. 18 May 2016, 6 commits
  13. 04 May 2016, 1 commit
    • NVMe: Fix reset/remove race · 87c32077
      Keith Busch authored
      This fixes a scenario where a device is present and being reset when a
      request to unbind the driver occurs.
      
      A previous patch series addressing a device failure removal scenario
      flushed reset_work after controller disable to unblock reset_work waiting
      on a completion that wouldn't occur. This isn't safe as-is. The broken
      scenario can potentially be induced with:
      
        modprobe nvme && modprobe -r nvme
      
      To fix, the reset work is flushed immediately after setting the controller
      removing flag, and any subsequent reset will not proceed with controller
      initialization if the flag is set.
      
      The controller status must be polled while active, so the watchdog timer
      is also left active until the controller is disabled, to clean up
      requests that may be stuck during namespace removal.
      
      [Fixes: ff23a2a1]
      Signed-off-by: Keith Busch <keith.busch@intel.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
      Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
      Signed-off-by: Jens Axboe <axboe@fb.com>
      87c32077
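The ordering the fix relies on (set the removing flag, flush the reset work so it runs to completion now, and have reset work bail out when the flag is set) can be sketched as a toy model; the flag and function names are illustrative:

```c
#include <stdbool.h>

/* Toy model: reset work refuses to re-initialize the controller once
 * the 'removing' flag is set. Names are illustrative. */
struct toy_ctrl { bool removing; bool initialized; };

static void toy_reset_work(struct toy_ctrl *c)
{
    if (c->removing)
        return; /* unbind in progress: skip controller initialization */
    c->initialized = true;
}

static void toy_remove(struct toy_ctrl *c)
{
    c->removing = true;
    /* models flush_work(): the pending reset runs to completion here,
     * and sees the flag, so it does not initialize the controller */
    toy_reset_work(c);
}
```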
  14. 02 May 2016, 7 commits
  15. 15 Apr 2016, 1 commit
    • NVMe: Always use MSI/MSI-x interrupts · a5229050
      Keith Busch authored
      Multiple users have reported device initialization failures due to the
      driver not receiving legacy PCI interrupts. This is not unique to any
      particular controller, and has been observed on multiple platforms.
      
      There have been no issues reported or observed with message signaled
      interrupts, so this patch attempts to use MSI-x during initialization,
      falling back to MSI. If that fails, legacy becomes the default.
      
      The setup_io_queues error handling had to change as a result: the admin
      queue's msix_entry used to be initialized to the legacy IRQ. The case
      where nr_io_queues is 0 would fail request_irq when setting up the admin
      queue's interrupt since re-enabling MSI-x fails with 0 vectors, leaving
      the admin queue's msix_entry invalid. Instead, return success immediately.
      Reported-by: Tim Muhlemmer <muhlemmer@gmail.com>
      Reported-by: Jon Derrick <jonathan.derrick@intel.com>
      Signed-off-by: Keith Busch <keith.busch@intel.com>
      Signed-off-by: Jens Axboe <axboe@fb.com>
      a5229050
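The fallback chain the commit describes (MSI-X, then MSI, then legacy INTx only as a last resort) reduces to a simple priority selection; this is a toy model with illustrative names, not the probe code itself:

```c
/* Toy model of the probe-time interrupt selection: prefer MSI-X,
 * fall back to MSI, and settle for legacy INTx only if both fail. */
enum toy_irq_mode { TOY_MSIX, TOY_MSI, TOY_INTX };

static enum toy_irq_mode toy_pick_irq(int msix_ok, int msi_ok)
{
    if (msix_ok)
        return TOY_MSIX;
    if (msi_ok)
        return TOY_MSI;
    return TOY_INTX; /* legacy becomes the default only as last resort */
}
```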
  16. 13 Apr 2016, 9 commits
    • nvme: Avoid reset work on watchdog timer function during error recovery · c875a709
      Guilherme G. Piccoli authored
      This patch adds a check to the nvme_watchdog_timer() function to avoid
      calling reset_work() when an error recovery process is ongoing on the
      controller. The check is made by looking at the pci_channel_offline()
      result.
      
      If we don't check for this in nvme_watchdog_timer(), the error recovery
      mechanism can't recover properly, because reset_work() won't be able to
      do its job (since we're in the middle of an error) and the controller
      is removed from the system before the error recovery mechanism can
      perform a slot reset (which would allow the adapter to recover).
      
      This patch also splits the large condition expression in
      nvme_watchdog_timer() by introducing an auxiliary function, making the
      code more readable.
      Reviewed-by: Keith Busch <keith.busch@intel.com>
      Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
      Signed-off-by: Guilherme G. Piccoli <gpiccoli@linux.vnet.ibm.com>
      Signed-off-by: Jens Axboe <axboe@fb.com>
      c875a709
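The shape of that auxiliary predicate can be sketched as a user-space toy: pci_channel_offline() is modeled as a plain flag, and the helper declines to request a reset while error recovery owns the device. All names here are illustrative:

```c
#include <stdbool.h>

/* Toy model of the split condition: a helper decides whether the
 * watchdog should kick off reset work. channel_offline stands in for
 * pci_channel_offline(); csts_failed for a bad controller status. */
struct toy_dev { bool csts_failed; bool channel_offline; };

static bool toy_should_reset(const struct toy_dev *d)
{
    /* don't fight platform error recovery (AER/EEH): skip the reset
     * while the channel is offline so a slot reset can happen */
    if (d->channel_offline)
        return false;
    return d->csts_failed;
}
```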
    • NVMe: silence warning about unused 'dev' · 7e197930
      Jens Axboe authored
      Depending on options, we might not be using dev in nvme_cancel_io():
      
      drivers/nvme/host/pci.c: In function ‘nvme_cancel_io’:
      drivers/nvme/host/pci.c:970:19: warning: unused variable ‘dev’ [-Wunused-variable]
        struct nvme_dev *dev = data;
                         ^
      
      So get rid of it, and just cast for the dev_dbg_ratelimited() call.
      
      Fixes: 82b4552b ("nvme: Use blk-mq helper for IO termination")
      Signed-off-by: Jens Axboe <axboe@fb.com>
      7e197930
    • nvme: Use blk-mq helper for IO termination · 82b4552b
      Sagi Grimberg authored
      blk-mq offers a tagset iterator, so let's use that
      instead of nvme_clear_queues.
      
      Note: nvme_queue_cancel_ios was renamed to nvme_cancel_io, since
      there is no longer a concept of a queue in this function (the
      per-queue print was also dropped).
      Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Acked-by: Keith Busch <keith.busch@intel.com>
      Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
      Signed-off-by: Jens Axboe <axboe@fb.com>
      82b4552b
    • NVMe: Skip async events for degraded controllers · 21f033f7
      Keith Busch authored
      If the controller is degraded, the driver should stay out of the way so
      the user can recover the drive. This patch skips driver initiated async
      event requests when the drive is in this state.
      Signed-off-by: Keith Busch <keith.busch@intel.com>
      Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Jens Axboe <axboe@fb.com>
      21f033f7
    • nvme: add helper nvme_setup_cmd() · 8093f7ca
      Ming Lin authored
      This moves the nvme_setup_{flush,discard,rw} calls into a common
      nvme_setup_cmd() helper, so we can eventually hide all the command
      setup in the core module and not need to update the fabrics
      drivers for any specific command type.
      Signed-off-by: Ming Lin <ming.l@ssi.samsung.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Jens Axboe <axboe@fb.com>
      8093f7ca
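The consolidation pattern (one entry point dispatching to flush/discard/rw setup) can be shown with a toy dispatcher; the opcode names and returned strings are illustrative stand-ins for the real command-setup routines:

```c
/* Toy model of the nvme_setup_cmd() consolidation: one helper
 * dispatches to the per-type setup path. */
enum toy_op { TOY_FLUSH, TOY_DISCARD, TOY_READ, TOY_WRITE };

static const char *toy_setup_cmd(enum toy_op op)
{
    switch (op) {
    case TOY_FLUSH:
        return "flush";   /* models nvme_setup_flush() */
    case TOY_DISCARD:
        return "dsm";     /* models nvme_setup_discard() */
    default:
        return "rw";      /* reads and writes share one setup path */
    }
}
```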
    • nvme: rewrite discard support · 03b5929e
      Ming Lin authored
      This rewrites nvme_setup_discard() with blk_add_request_payload().
      It allocates only the necessary amount (16 bytes) for the payload.
      Signed-off-by: Ming Lin <ming.l@ssi.samsung.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Jens Axboe <axboe@fb.com>
      03b5929e
    • nvme: add helper nvme_map_len() · 58b45602
      Ming Lin authored
      The helper returns the number of bytes that need to be mapped
      using PRPs/SGL entries.
      Signed-off-by: Ming Lin <ming.l@ssi.samsung.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Jens Axboe <axboe@fb.com>
      58b45602
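Combined with the discard rewrite above (a 16-byte DSM range payload), the helper's behavior can be modeled as: discard requests map only the small payload, everything else maps the full data length. This is a hedged sketch with illustrative names, not the kernel helper itself:

```c
/* Toy model of nvme_map_len(): discard maps only the DSM range
 * payload (16 bytes per the discard rewrite); other requests map the
 * full request data length. */
#define TOY_DSM_RANGE_BYTES 16u

struct toy_req { int is_discard; unsigned int data_bytes; };

static unsigned int toy_map_len(const struct toy_req *r)
{
    if (r->is_discard)
        return TOY_DSM_RANGE_BYTES; /* only the payload needs PRP/SGL */
    return r->data_bytes;           /* models blk_rq_bytes() */
}
```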
    • nvme: add missing lock nesting notation · 2e39e0f6
      Ming Lin authored
      When unloading the driver, nvme_disable_io_queues() calls
      nvme_delete_queue(), which sends an nvme_admin_delete_cq command to the
      admin submission queue. So when the command completes, the lock acquired
      by nvme_irq() actually belongs to the admin queue, while the lock that
      nvme_del_cq_end() is trying to acquire belongs to an I/O queue. It will
      therefore not deadlock.
      
      This patch adds lock nesting notation to fix the following report.
      
      [  109.840952] =============================================
      [  109.846379] [ INFO: possible recursive locking detected ]
      [  109.851806] 4.5.0+ #180 Tainted: G            E
      [  109.856533] ---------------------------------------------
      [  109.861958] swapper/0/0 is trying to acquire lock:
      [  109.866771]  (&(&nvmeq->q_lock)->rlock){-.....}, at: [<ffffffffc0820bc6>] nvme_del_cq_end+0x26/0x70 [nvme]
      [  109.876535]
      [  109.876535] but task is already holding lock:
      [  109.882398]  (&(&nvmeq->q_lock)->rlock){-.....}, at: [<ffffffffc0820c2b>] nvme_irq+0x1b/0x50 [nvme]
      [  109.891547]
      [  109.891547] other info that might help us debug this:
      [  109.898107]  Possible unsafe locking scenario:
      [  109.898107]
      [  109.904056]        CPU0
      [  109.906515]        ----
      [  109.908974]   lock(&(&nvmeq->q_lock)->rlock);
      [  109.913381]   lock(&(&nvmeq->q_lock)->rlock);
      [  109.917787]
      [  109.917787]  *** DEADLOCK ***
      [  109.917787]
      [  109.923738]  May be due to missing lock nesting notation
      [  109.923738]
      [  109.930558] 1 lock held by swapper/0/0:
      [  109.934413]  #0:  (&(&nvmeq->q_lock)->rlock){-.....}, at: [<ffffffffc0820c2b>] nvme_irq+0x1b/0x50 [nvme]
      [  109.944010]
      [  109.944010] stack backtrace:
      [  109.948389] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G            E   4.5.0+ #180
      [  109.955734] Hardware name: Dell Inc. OptiPlex 7010/0YXT71, BIOS A15 08/12/2013
      [  109.962989]  0000000000000000 ffff88011e203c38 ffffffff81383d9c ffffffff81c13540
      [  109.970478]  ffffffff826711d0 ffff88011e203ce8 ffffffff810bb429 0000000000000046
      [  109.977964]  0000000000000046 0000000000000000 0000000000b2e597 ffffffff81f4cb00
      [  109.985453] Call Trace:
      [  109.987911]  <IRQ>  [<ffffffff81383d9c>] dump_stack+0x85/0xc9
      [  109.993711]  [<ffffffff810bb429>] __lock_acquire+0x19b9/0x1c60
      [  109.999575]  [<ffffffff810b6d1d>] ? trace_hardirqs_off+0xd/0x10
      [  110.005524]  [<ffffffff810b386d>] ? complete+0x3d/0x50
      [  110.010688]  [<ffffffff810bb760>] lock_acquire+0x90/0xf0
      [  110.016029]  [<ffffffffc0820bc6>] ? nvme_del_cq_end+0x26/0x70 [nvme]
      [  110.022418]  [<ffffffff81772afb>] _raw_spin_lock_irqsave+0x4b/0x60
      [  110.028632]  [<ffffffffc0820bc6>] ? nvme_del_cq_end+0x26/0x70 [nvme]
      [  110.035019]  [<ffffffffc0820bc6>] nvme_del_cq_end+0x26/0x70 [nvme]
      [  110.041232]  [<ffffffff8135b485>] blk_mq_end_request+0x35/0x60
      [  110.047095]  [<ffffffffc0821ad8>] nvme_complete_rq+0x68/0x190 [nvme]
      [  110.053481]  [<ffffffff8135b53f>] __blk_mq_complete_request+0x8f/0x130
      [  110.060043]  [<ffffffff8135b611>] blk_mq_complete_request+0x31/0x40
      [  110.066343]  [<ffffffffc08209e3>] __nvme_process_cq+0x83/0x240 [nvme]
      [  110.072818]  [<ffffffffc0820c35>] nvme_irq+0x25/0x50 [nvme]
      [  110.078419]  [<ffffffff810cdb66>] handle_irq_event_percpu+0x36/0x110
      [  110.084804]  [<ffffffff810cdc77>] handle_irq_event+0x37/0x60
      [  110.090491]  [<ffffffff810d0ea3>] handle_edge_irq+0x93/0x150
      [  110.096180]  [<ffffffff81012306>] handle_irq+0xa6/0x130
      [  110.101431]  [<ffffffff81011abe>] do_IRQ+0x5e/0x120
      [  110.106333]  [<ffffffff8177384c>] common_interrupt+0x8c/0x8c
      Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
      Signed-off-by: Ming Lin <ming.l@ssi.samsung.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
      Signed-off-by: Jens Axboe <axboe@fb.com>
      2e39e0f6
    • NVMe: Always use MSI/MSI-x interrupts · 788e15ab
      Keith Busch authored
      Multiple users have reported device initialization failures due to the
      driver not receiving legacy PCI interrupts. This is not unique to any
      particular controller, and has been observed on multiple platforms.
      
      There have been no issues reported or observed with message signaled
      interrupts, so this patch attempts to use MSI-x during initialization,
      falling back to MSI. If that fails, legacy becomes the default.
      
      The setup_io_queues error handling had to change as a result: the admin
      queue's msix_entry used to be initialized to the legacy IRQ. The case
      where nr_io_queues is 0 would fail request_irq when setting up the admin
      queue's interrupt since re-enabling MSI-x fails with 0 vectors, leaving
      the admin queue's msix_entry invalid. Instead, return success immediately.
      Reported-by: Tim Muhlemmer <muhlemmer@gmail.com>
      Reported-by: Jon Derrick <jonathan.derrick@intel.com>
      Signed-off-by: Keith Busch <keith.busch@intel.com>
      Signed-off-by: Jens Axboe <axboe@fb.com>
      788e15ab
  17. 12 Apr 2016, 1 commit
    • NVMe: Fix reset/remove race · 9bf2b972
      Keith Busch authored
      This fixes a scenario where a device is present and being reset when a
      request to unbind the driver occurs.
      
      A previous patch series addressing a device failure removal scenario
      flushed reset_work after controller disable to unblock reset_work waiting
      on a completion that wouldn't occur. This isn't safe as-is. The broken
      scenario can potentially be induced with:
      
        modprobe nvme && modprobe -r nvme
      
      To fix, the reset work is flushed immediately after setting the controller
      removing flag, and any subsequent reset will not proceed with controller
      initialization if the flag is set.
      
      The controller status must be polled while active, so the watchdog timer
      is also left active until the controller is disabled, to clean up
      requests that may be stuck during namespace removal.
      
      [Fixes: ff23a2a1]
      Signed-off-by: Keith Busch <keith.busch@intel.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
      Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
      Signed-off-by: Jens Axboe <axboe@fb.com>
      9bf2b972
  18. 23 Mar 2016, 1 commit