提交 · 2b053aca76b48e681be57b34ca3a8c2c10b275c5 · openeuler / raspberrypi-kernel

03 11月, 2016 2 次提交

blk-mq: Add a kick_requeue_list argument to blk_mq_requeue_request() · 2b053aca

由 Bart Van Assche 提交于 10月 28, 2016

Most blk_mq_requeue_request() and blk_mq_add_to_requeue_list() calls
are followed by kicking the requeue list. Hence add an argument to
these two functions that allows to kick the requeue list. This was
proposed by Christoph Hellwig.
Signed-off-by: NBart Van Assche <bart.vanassche@sandisk.com>
Reviewed-by: NJohannes Thumshirn <jthumshirn@suse.de>
Reviewed-by: NChristoph Hellwig <hch@lst.de>
Cc: Hannes Reinecke <hare@suse.com>
Reviewed-by: NSagi Grimberg <sagi@grimberg.me>
Signed-off-by: NJens Axboe <axboe@fb.com>

2b053aca

blk-mq: Remove blk_mq_cancel_requeue_work() · 9b7dd572

由 Bart Van Assche 提交于 10月 28, 2016

Since blk_mq_requeue_work() no longer restarts stopped queues
canceling requeue work is no longer needed to prevent that a
stopped queue would be restarted. Hence remove this function.
Signed-off-by: NBart Van Assche <bart.vanassche@sandisk.com>
Cc: Mike Snitzer <snitzer@redhat.com>
Cc: Keith Busch <keith.busch@intel.com>
Cc: Hannes Reinecke <hare@suse.com>
Cc: Johannes Thumshirn <jthumshirn@suse.de>
Reviewed-by: NSagi Grimberg <sagi@grimberg.me>
Reviewed-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NJens Axboe <axboe@fb.com>

9b7dd572

28 10月, 2016 1 次提交

block: split out request-only flags into a new namespace · e8064021

由 Christoph Hellwig 提交于 10月 20, 2016

A lot of the REQ_* flags are only used on struct requests, and only of
use to the block layer and a few drivers that dig into struct request
internals.

This patch adds a new req_flags_t rq_flags field to struct request for
them, and thus dramatically shrinks the number of common requests.  It
also removes the unfortunate situation where we have to fit the fields
from the same enum into 32 bits for struct bio and 64 bits for
struct request.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Reviewed-by: NShaun Tancheff <shaun.tancheff@seagate.com>
Signed-off-by: NJens Axboe <axboe@fb.com>

e8064021

12 10月, 2016 1 次提交

nvme: use the DMA_ATTR_NO_WARN attribute · 2b6b535d

由 Mauricio Faria de Oliveira 提交于 10月 11, 2016

Use the DMA_ATTR_NO_WARN attribute for the dma_map_sg() call of the nvme
driver that returns BLK_MQ_RQ_QUEUE_BUSY (not for BLK_MQ_RQ_QUEUE_ERROR).

Link: http://lkml.kernel.org/r/1470092390-25451-4-git-send-email-mauricfo@linux.vnet.ibm.comSigned-off-by: NMauricio Faria de Oliveira <mauricfo@linux.vnet.ibm.com>
Reviewed-by: NGabriel Krisman Bertazi <krisman@linux.vnet.ibm.com>
Cc: Keith Busch <keith.busch@intel.com>
Cc: Jens Axboe <axboe@fb.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Krzysztof Kozlowski <k.kozlowski@samsung.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

2b6b535d

25 9月, 2016 2 次提交

nvme: Pass pointers, not dma addresses, to nvme_get/set_features() · 1a6fe74d

由 Andy Lutomirski 提交于 9月 16, 2016

Any user I can imagine that needs a buffer at all will want to pass
a pointer directly.  There are no currently callers that use
buffers, so this change is painless, and it will make it much easier
to start using features that use buffers (e.g. APST).
Signed-off-by: NAndy Lutomirski <luto@kernel.org>
Reviewed-by: NChristoph Hellwig <hch@lst.de>
Acked-by: NJay Freyensee <james_p_freyensee@linux.intel.com>
Tested-by: NJay Freyensee <james_p_freyensee@linux.intel.com>
Signed-off-by: NJens Axboe <axboe@fb.com>

1a6fe74d

nvme/scsi: Remove power management support · 26501db8

由 Andy Lutomirski 提交于 9月 16, 2016

As far as I can tell, there is basically nothing correct about this
code.  It misinterprets npss (off-by-one).  It hardcodes a bunch of
power states, which is nonsense, because they're all just indices
into a table that software needs to parse.  It completely ignores
the distinction between operational and non-operational states.
And, until 4.8, if all of the above magically succeeded, it would
dereference a NULL pointer and OOPS.

Since this code appears to be useless, just delete it.
Signed-off-by: NAndy Lutomirski <luto@kernel.org>
Reviewed-by: NChristoph Hellwig <hch@lst.de>
Acked-by: NJay Freyensee <james_p_freyensee@linux.intel.com>
Tested-by: NJay Freyensee <james_p_freyensee@linux.intel.com>
Signed-off-by: NJens Axboe <axboe@fb.com>

26501db8

24 9月, 2016 5 次提交

nvme-fabrics: Add host_traddr options field to host infrastructure · 478bcb93

由 James Smart 提交于 8月 02, 2016

Add the host_traddr field to allow specification of the host-port
connection info for the transport. Will be used by FC transport.
Signed-off-by: NJames Smart <james.smart@broadcom.com>
Acked-by: NJohannes Thumshirn <jth@kernel.org>
Reviewed-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NSagi Grimberg <sagi@grimberg.me>

478bcb93

nvme-fabrics: revise host transport option descriptions · 4a9f05c5

由 James Smart 提交于 8月 02, 2016

Revise some of the comments so not so ethernet-network centric
Signed-off-by: NJames Smart <james.smart@broadcom.com>
Acked-by: NJohannes Thumshirn <jth@kernel.org>
Reviewed-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NSagi Grimberg <sagi@grimberg.me>

4a9f05c5

nvme-fabrics: rework nvmf_get_address() for variable options · 0fe51ff2

由 James Smart 提交于 8月 02, 2016

Revise nvmf_get_address() string to account for not all options being
present.
Signed-off-by: NJames Smart <james.smart@broadcom.com>
Acked-by: NJohannes Thumshirn <jth@kernel.org>
Reviewed-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NSagi Grimberg <sagi@grimberg.me>

0fe51ff2

nvme-rdma: use IB_PD_UNSAFE_GLOBAL_RKEY · 11975e01

由 Christoph Hellwig 提交于 9月 05, 2016

Signed-off-by: NChristoph Hellwig <hch@lst.de>
Reviewed-by: NSagi Grimberg <sagi@grimberg.me>
Reviewed-by: NJason Gunthorpe <jgunthorpe@obsidianresearch.com>
Reviewed-by: NSteve Wise <swise@opengridcomputing.com>
Signed-off-by: NDoug Ledford <dledford@redhat.com>

11975e01

IB/core: add support to create a unsafe global rkey to ib_create_pd · ed082d36

由 Christoph Hellwig 提交于 9月 05, 2016

Instead of exposing ib_get_dma_mr to ULPs and letting them use it more or
less unchecked, this moves the capability of creating a global rkey into
the RDMA core, where it can be easily audited.  It also prints a warning
everytime this feature is used as well.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Reviewed-by: NSagi Grimberg <sagi@grimberg.me>
Reviewed-by: NJason Gunthorpe <jgunthorpe@obsidianresearch.com>
Reviewed-by: NSteve Wise <swise@opengridcomputing.com>
Signed-off-by: NDoug Ledford <dledford@redhat.com>

ed082d36

23 9月, 2016 1 次提交

nvme-rdma: only clear queue flags after successful connect · 3b4ac786

由 Sagi Grimberg 提交于 9月 22, 2016

Otherwise, nvme_rdma_stop_and_clear_queue() will incorrectly
try to stop/free rdma qps/cm_ids that are already freed.

Fixes: e89ca58f ("nvme-rdma: add DELETING queue flag")
Reported-by: NSteve Wise <swise@opengridcomputing.com>
Tested-by: NSteve Wise <swise@opengridcomputing.com>
Signed-off-by: NSagi Grimberg <sagi@grimberg.me>
Reviewed-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NJens Axboe <axboe@fb.com>

3b4ac786

21 9月, 2016 3 次提交

lightnvm: expose device geometry through sysfs · 40267efd

由 Simon A. F. Lund 提交于 9月 16, 2016

For a host to access an Open-Channel SSD, it has to know its geometry,
so that it writes and reads at the appropriate device bounds.

Currently, the geometry information is kept within the kernel, and not
exported to user-space for consumption. This patch exposes the
configuration through sysfs and enables user-space libraries, such as
liblightnvm, to use the sysfs implementation to get the geometry of an
Open-Channel SSD.

The sysfs entries are stored within the device hierarchy, and can be
found using the "lightnvm" device type.

An example configuration looks like this:

/sys/class/nvme/
└── nvme0n1
   ├── capabilities: 3
   ├── device_mode: 1
   ├── erase_max: 1000000
   ├── erase_typ: 1000000
   ├── flash_media_type: 0
   ├── media_capabilities: 0x00000001
   ├── media_type: 0
   ├── multiplane: 0x00010101
   ├── num_blocks: 1022
   ├── num_channels: 1
   ├── num_luns: 4
   ├── num_pages: 64
   ├── num_planes: 1
   ├── page_size: 4096
   ├── prog_max: 100000
   ├── prog_typ: 100000
   ├── read_max: 10000
   ├── read_typ: 10000
   ├── sector_oob_size: 0
   ├── sector_size: 4096
   ├── media_manager: gennvm
   ├── ppa_format: 0x380830082808001010102008
   ├── vendor_opcode: 0
   ├── max_phys_secs: 64
   └── version: 1
Signed-off-by: NSimon A. F. Lund <slund@cnexlabs.com>
Signed-off-by: NMatias Bjørling <m@bjorling.me>
Signed-off-by: NJens Axboe <axboe@fb.com>

40267efd

lightnvm: control life of nvm_dev in driver · b0b4e09c

由 Matias Bjørling 提交于 9月 16, 2016

LightNVM compatible device drivers does not have a method to expose
LightNVM specific sysfs entries.

To enable LightNVM sysfs entries to be exposed, lightnvm device
drivers require a struct device to attach it to. To allow both the
actual device driver and lightnvm sysfs entries to coexist, the device
driver tracks the lifetime of the nvm_dev structure.

This patch refactors NVMe and null_blk to handle the lifetime of struct
nvm_dev, which eliminates the need for struct gendisk when a lightnvm
compatible device is provided.
Signed-off-by: NMatias Bjørling <m@bjorling.me>
Signed-off-by: NJens Axboe <axboe@fb.com>

b0b4e09c

nvme: refactor namespaces to support non-gendisk devices · ac81bfa9

由 Matias Bjørling 提交于 9月 16, 2016

With LightNVM enabled namespaces, the gendisk structure is not exposed
to the user. This prevents LightNVM users from accessing the NVMe device
driver specific sysfs entries, and LightNVM namespace geometry.

Refactor the revalidation process, so that a namespace, instead of a
gendisk, is revalidated. This later allows patches to wire up the
sysfs entries up to a non-gendisk namespace.
Signed-off-by: NMatias Bjørling <m@bjorling.me>
Signed-off-by: NJens Axboe <axboe@fb.com>

ac81bfa9

15 9月, 2016 3 次提交

nvme: remove the post_scan callout · b5af7f2f

由 Christoph Hellwig 提交于 9月 14, 2016

No need now that we don't have to reverse engineer the irq affinity.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Reviewed-by: NKeith Busch <keith.busch@intel.com>
Signed-off-by: NJens Axboe <axboe@fb.com>

b5af7f2f

nvme: switch to use pci_alloc_irq_vectors · dca51e78

由 Christoph Hellwig 提交于 9月 14, 2016

Use the new helper to automatically select the right interrupt type, as
well as to use the automatic interupt affinity assignment.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Reviewed-by: NKeith Busch <keith.busch@intel.com>
Signed-off-by: NJens Axboe <axboe@fb.com>

dca51e78

blk-mq: remove ->map_queue · 7d7e0f90

由 Christoph Hellwig 提交于 9月 14, 2016

All drivers use the default, so provide an inline version of it.  If we
ever need other queue mapping we can add an optional method back,
although supporting will also require major changes to the queue setup
code.

This provides better code generation, and better debugability as well.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Reviewed-by: NKeith Busch <keith.busch@intel.com>
Signed-off-by: NJens Axboe <axboe@fb.com>

7d7e0f90

13 9月, 2016 4 次提交

nvme-rdma: add back dependency on CONFIG_BLOCK · 2cfe199c

由 Arnd Bergmann 提交于 9月 06, 2016

A recent change removed the dependency on BLK_DEV_NVME, which implies
the dependency on PCI and BLOCK. We don't need CONFIG_PCI, but without
CONFIG_BLOCK we get tons of build errors, e.g.

In file included from drivers/nvme/host/core.c:16:0:
linux/blk-mq.h:182:33: error: 'struct gendisk' declared inside parameter list will not be visible outside of this definition or declaration [-Werror]
drivers/nvme/host/core.c: In function 'nvme_setup_rw':
drivers/nvme/host/core.c:295:21: error: implicit declaration of function 'rq_data_dir' [-Werror=implicit-function-declaration]
drivers/nvme/host/nvme.h: In function 'nvme_map_len':
drivers/nvme/host/nvme.h:217:6: error: implicit declaration of function 'req_op' [-Werror=implicit-function-declaration]
drivers/nvme/host/scsi.c: In function 'nvme_trans_bdev_limits_page':
drivers/nvme/host/scsi.c:768:85: error: implicit declaration of function 'queue_max_hw_sectors' [-Werror=implicit-function-declaration]

This adds back the specific CONFIG_BLOCK dependency to avoid broken
configurations.
Signed-off-by: NArnd Bergmann <arnd@arndb.de>
Fixes: aa719874 ("nvme: fabrics drivers don't need the nvme-pci driver")
Signed-off-by: NSagi Grimberg <sagi@grimberg.me>

2cfe199c

nvme-rdma: fix null pointer dereference on req->mr · 1bda18de

由 Colin Ian King 提交于 9月 05, 2016

If there is an error on req->mr, req->mr is set to null, however
the following statement sets req->mr->need_inval causing a null
pointer dereference.  Fix this by bailing out to label 'out' to
immediately return and hence skip over the offending null pointer
dereference.

Fixes: f5b7b559 ("nvme-rdma: Get rid of duplicate variable")
Signed-off-by: NColin Ian King <colin.king@canonical.com>
Reviewed-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NSagi Grimberg <sagi@grimberg.me>

1bda18de

nvme-rdma: use ib_client API to detect device removal · e87a911f

由 Steve Wise 提交于 9月 02, 2016

Change nvme-rdma to use the IB Client API to detect device removal.
This has the wonderful benefit of being able to blow away all the
ib/rdma_cm resources for the device being removed.  No craziness about
not destroying the cm_id handling the event.  No deadlocks due to broken
iw_cm/rdma_cm/iwarp dependencies.  And no need to have a bound cm_id
around during controller recovery/reconnect to catch device removal
events.

We don't use the device_add aspect of the ib_client service since we only
want to create resources for an IB device if we have a target utilizing
that device.
Reviewed-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NSteve Wise <swise@opengridcomputing.com>
Signed-off-by: NSagi Grimberg <sagi@grimberg.me>

e87a911f

nvme-rdma: add DELETING queue flag · e89ca58f

由 Sagi Grimberg 提交于 9月 02, 2016

When we get a surprise disconnect from the target we queue a periodic
reconnect (which is the sane thing to do...).

We only move the queues out of CONNECTED when we retry to reconnect (after
10 seconds in the default case) but we stop the blk queues immediately
so we are not bothered with traffic from now on. If delete() is kicking
off in this period the queues are still in CONNECTED state.

Part of the delete sequence is trying to issue ctrl shutdown if the
admin queue is CONNECTED (which it is!). This request is issued but
stuck in blk-mq waiting for the queues to start again. This might be
the one preventing us from forward progress...

The patch separates the queue flags to CONNECTED and DELETING. Now we
will move out of CONNECTED as soon as error recovery kicks in (before
stopping the queues) and DELETING is on when we start the queue deletion.
Signed-off-by: NSagi Grimberg <sagi@grimberg.me>

e89ca58f

12 9月, 2016 1 次提交

nvme: make NVME_RDMA depend on BLOCK · bd0b841f

由 Linus Torvalds 提交于 9月 11, 2016

Commit aa719874 ("nvme: fabrics drivers don't need the nvme-pci
driver") removed the dependency on BLK_DEV_NVME, but the cdoe does
depend on the block layer (which used to be an implicit dependency
through BLK_DEV_NVME).

Otherwise you get various errors from the kbuild test robot random
config testing when that happens to hit a configuration with BLOCK
device support disabled.

Cc: Christoph Hellwig <hch@lst.de>
Cc: Jay Freyensee <james_p_freyensee@linux.intel.com>
Cc: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

bd0b841f

09 9月, 2016 1 次提交
- W
  nvme/quirk: Add a delay before checking device ready for memblaze device · 015282c9
  由 Wenbo Wang 提交于 9月 08, 2016
```
Signed-off-by: NWenbo Wang <wenbo.wang@memblaze.com>
Signed-off-by: NJens Axboe <axboe@fb.com>
```
  015282c9
07 9月, 2016 1 次提交

nvme: Don't suspend admin queue that wasn't created · 82469c59

由 Gabriel Krisman Bertazi 提交于 9月 06, 2016

This fixes a regression in my previous commit c21377f8 ("nvme:
Suspend all queues before deletion"), which provoked an Oops in the
removal path when removing a device that became IO incapable very early
at probe (i.e. after a failed EEH recovery).

Turns out, if the error occurred very early at the probe path, before
even configuring the admin queue, we might try to suspend the
uninitialized admin queue, accessing bad memory.

Fixes: c21377f8 ("nvme: Suspend all queues before deletion")
Signed-off-by: NGabriel Krisman Bertazi <krisman@linux.vnet.ibm.com>
Reviewed-by: NJay Freyensee <james_p_freyensee@linux.intel.com>
Signed-off-by: NJens Axboe <axboe@fb.com>

82469c59

04 9月, 2016 2 次提交

nvme-rdma: destroy nvme queue rdma resources on connect failure · f361e5a0

由 Steve Wise 提交于 9月 02, 2016

After address resolution, the nvme_rdma_queue rdma resources are
allocated.  If rdma route resolution or the connect fails, or the
controller reconnect times out and gives up, then the rdma resources
need to be freed.  Otherwise, rdma resources are leaked.
Reviewed-by: NChristoph Hellwig <hch@lst.de>
Reviewed-by: NSagi Grimberg <sagi@grimbrg.me>
Signed-off-by: NSteve Wise <swise@opengridcomputing.com>
Signed-off-by: NSagi Grimberg <sagi@grimberg.me>

f361e5a0

nvme_rdma: keep a ref on the ctrl during delete/flush · cdbecc8d

由 Steve Wise 提交于 9月 01, 2016

Reviewed-by: NChristoph Hellwig <hch@lst.de>
Reviewed-by: NSagi Grimberg <sagi@grimbrg.me>
Signed-off-by: NSteve Wise <swise@opengridcomputing.com>
Signed-off-by: NSagi Grimberg <sagi@grimberg.me>

cdbecc8d

28 8月, 2016 2 次提交

nvme-rdma: Get rid of redundant defines · 4d8c6a79

由 Sagi Grimberg 提交于 8月 26, 2016

Signed-off-by: NSagi Grimberg <sagi@grimberg.me>
Reviewed-by: NChristoph Hellwig <hch@lst.de>

4d8c6a79

nvme-rdma: Get rid of duplicate variable · f5b7b559

由 Sagi Grimberg 提交于 8月 24, 2016

We already have need_inval in ib_mr, lets use
that instead.
Signed-off-by: NSagi Grimberg <sagi@grimberg.me>
Reviewed-by: NChristoph Hellwig <hch@lst.de>

f5b7b559

24 8月, 2016 1 次提交

nvme: Fix nvme_get/set_features() with a NULL result pointer · 9b47f77a

由 Andy Lutomirski 提交于 8月 24, 2016

nvme_set_features() callers seem to expect that passing NULL as the
result pointer is acceptable.  Teach nvme_set_features() not to try to
write to the NULL address.

For symmetry, make the same change to nvme_get_features(), despite the
fact that all current callers pass a valid result pointer.

I assume that this bug hasn't been reported in practice because
the callers that pass NULL are all in the SCSI translation layer
and no one uses the relevant operations.

Cc: stable@vger.kernel.org
Signed-off-by: NAndy Lutomirski <luto@kernel.org>
Reviewed-by: NSagi Grimberg <sagi@grimberg.me>
Signed-off-by: NJens Axboe <axboe@fb.com>

9b47f77a

19 8月, 2016 3 次提交

nvme: fabrics drivers don't need the nvme-pci driver · aa719874

由 Christoph Hellwig 提交于 8月 18, 2016

So select the NVME_CORE symbol instead of depending on BLK_DEV_NVME.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Reviewed-by: NJay Freyensee <james_p_freyensee@linux.intel.com>
Signed-off-by: NSagi Grimberg <sagi@grimberg.me>

aa719874

nvme-fabrics: get a reference when reusing a nvme_host structure · 98096d8a

由 Christoph Hellwig 提交于 8月 18, 2016

Without this we'll get a use after free after connecting two controller
using the same hostnqn and then disconnecting one of them.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Reviewed-by: NJay Freyensee <james_p_freyensee@linux.intel.com>
Signed-off-by: NSagi Grimberg <sagi@grimberg.me>

98096d8a

nvme-fabrics: change NQN UUID to big-endian format · 7a665d2f

由 Daniel Verkamp 提交于 6月 28, 2016

NVM Express 1.2.1 section 7.9, NVMe Qualified Names, specifies that the
UUID format of NQN uses a UUID based on RFC 4122.

RFC 4122 specifies that the UUID is encoded in big-endian byte order.

Switch the NVMe over Fabrics host ID field from little-endian UUID to
big-endian UUID to match the specification.
Signed-off-by: NDaniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: NJay Freyensee <james_p_freyensee@linux.intel.com>
Signed-off-by: NSagi Grimberg <sagi@grimberg.me>

7a665d2f

18 8月, 2016 2 次提交

nvme-rdma: fix sqsize/hsqsize per spec · c5af8654

由 Jay Freyensee 提交于 8月 17, 2016

Per NVMe-over-Fabrics 1.0 spec, sqsize is represented as
a 0-based value.

Also per spec, the RDMA binding values shall be set
to sqsize, which makes hsqsize 0-based values.

Thus, the sqsize during NVMf connect() is now:

[root@fedora23-fabrics-host1 for-48]# dmesg
[  318.720645] nvme_fabrics: nvmf_connect_admin_queue(): sqsize for
admin queue: 31
[  318.720884] nvme nvme0: creating 16 I/O queues.
[  318.810114] nvme_fabrics: nvmf_connect_io_queue(): sqsize for i/o
queue: 127

Finally, current interpretation implies hrqsize is 1's based
so set it appropriately.
Reported-by: NDaniel Verkamp <daniel.verkamp@intel.com>
Signed-off-by: NJay Freyensee <james_p_freyensee@linux.intel.com>
Signed-off-by: NSagi Grimberg <sagi@grimberg.me>

c5af8654

fabrics: define admin sqsize min default, per spec · f994d9dc

由 Jay Freyensee 提交于 8月 17, 2016

Upon admin queue connect(), the rdma qp was being
set based on NVMF_AQ_DEPTH.  However, the fabrics layer was
using the sqsize field value set for I/O queues for the admin
queue, which threw the nvme layer and rdma layer off-whack:

root@fedora23-fabrics-host1 nvmf]# dmesg
[ 3507.798642] nvme_fabrics: nvmf_connect_admin_queue():admin sqsize
being sent is: 128
[ 3507.798858] nvme nvme0: creating 16 I/O queues.
[ 3507.896407] nvme nvme0: new ctrl: NQN "nullside-nqn", addr
192.168.1.3:4420

Thus, to have a different admin queue value, we use
NVMF_AQ_DEPTH for connect() and RDMA private data
as the minimum depth specified in the NVMe-over-Fabrics 1.0 spec
(and in that RDMA private data we treat hrqsize as 1's-based
value, per current understanding of the fabrics spec).
Reported-by: NDaniel Verkamp <daniel.verkamp@intel.com>
Signed-off-by: NJay Freyensee <james_p_freyensee@linux.intel.com>
Reviewed-by: NDaniel Verkamp <daniel.verkamp@intel.com>
Signed-off-by: NSagi Grimberg <sagi@grimberg.me>

f994d9dc

16 8月, 2016 1 次提交

nvme-rdma: initialize ret to zero to avoid returning garbage · 39bbee4e

由 Colin Ian King 提交于 8月 16, 2016

ret is not initialized so it contains garbage.  Ensure garbage
is not returned by initializing rc to 0.
Signed-off-by: NColin Ian King <colin.king@canonical.com>
Signed-off-by: NSagi Grimberg <sagi@grimberg.me>

39bbee4e

15 8月, 2016 1 次提交

nvme: Prevent controller state invalid transition · f6b6a28e

由 Gabriel Krisman Bertazi 提交于 7月 29, 2016

Acquiring the nvme_ctrl lock before reading ctrl->state in
nvme_change_ctrl_state() should prevent a theoretical invalid state
transition, in the event of two threads racing inside that function.

I haven't been able to observe this happening with the current code, and
the current state machine seems to be simple enough to not be
affected by these invalid transitions, but future modifications could
make it more likely to happen.
Signed-off-by: NGabriel Krisman Bertazi <krisman@linux.vnet.ibm.com>
Reviewed-by: NSagi Grimberg <sag@grimberg.me>
Reviewed-by: NSteve Wise <swise@opengridcomputing.com>
Signed-off-by: NJens Axboe <axboe@fb.com>

f6b6a28e

11 8月, 2016 1 次提交

nvme: Suspend all queues before deletion · c21377f8

由 Gabriel Krisman Bertazi 提交于 8月 11, 2016

When nvme_delete_queue fails in the first pass of the
nvme_disable_io_queues() loop, we return early, failing to suspend all
of the IO queues.  Later, on the nvme_pci_disable path, this causes us
to disable MSI without actually having freed all the IRQs, which
triggers the BUG_ON in free_msi_irqs(), as show below.

This patch refactors nvme_disable_io_queues to suspend all queues before
start submitting delete queue commands.  This way, we ensure that we
have at least returned every IRQ before continuing with the removal
path.

[  487.529200] kernel BUG at ../drivers/pci/msi.c:368!
cpu 0x46: Vector: 700 (Program Check) at [c0000078c5b83650]
    pc: c000000000627a50: free_msi_irqs+0x90/0x200
    lr: c000000000627a40: free_msi_irqs+0x80/0x200
    sp: c0000078c5b838d0
   msr: 9000000100029033
  current = 0xc0000078c5b40000
  paca    = 0xc000000002bd7600   softe: 0        irq_happened: 0x01
    pid   = 1376, comm = kworker/70:1H
kernel BUG at ../drivers/pci/msi.c:368!
Linux version 4.7.0.mainline+ (root@iod76) (gcc version 5.3.1 20160413
(Ubuntu/IBM 5.3.1-14ubuntu2.1) ) #104 SMP Fri Jul 29 09:20:17 CDT 2016
enter ? for help
[c0000078c5b83920] d0000000363b0cd8 nvme_dev_disable+0x208/0x4f0 [nvme]
[c0000078c5b83a10] d0000000363b12a4 nvme_timeout+0xe4/0x250 [nvme]
[c0000078c5b83ad0] c0000000005690e4 blk_mq_rq_timed_out+0x64/0x110
[c0000078c5b83b40] c00000000056c930 bt_for_each+0x160/0x170
[c0000078c5b83bb0] c00000000056d928 blk_mq_queue_tag_busy_iter+0x78/0x110
[c0000078c5b83c00] c0000000005675d8 blk_mq_timeout_work+0xd8/0x1b0
[c0000078c5b83c50] c0000000000e8cf0 process_one_work+0x1e0/0x590
[c0000078c5b83ce0] c0000000000e9148 worker_thread+0xa8/0x660
[c0000078c5b83d80] c0000000000f2090 kthread+0x110/0x130
[c0000078c5b83e30] c0000000000095f0 ret_from_kernel_thread+0x5c/0x6c
Signed-off-by: NGabriel Krisman Bertazi <krisman@linux.vnet.ibm.com>
Cc: Brian King <brking@linux.vnet.ibm.com>
Cc: Keith Busch <keith.busch@intel.com>
Cc: linux-nvme@lists.infradead.org
Signed-off-by: NJens Axboe <axboe@fb.com>

c21377f8

04 8月, 2016 2 次提交

nvme-rdma: Remove unused includes · e3266378

由 Sagi Grimberg 提交于 8月 04, 2016

Signed-off-by: NSagi Grimberg <sagi@grimberg.me>
Reviewed-by: NSteve Wise <swise@opengridcomputing.com>

e3266378

nvme-rdma: start async event handler after reconnecting to a controller · 3ef1b4b2

由 Sagi Grimberg 提交于 8月 04, 2016

When we reset or reconnect to a controller, we are cancelling the
async event handler so we can safely re-establish resources, but we
need to remember to start it again when we successfully reconnect.
Signed-off-by: NSagi Grimberg <sagi@grimberg.me>
Reviewed-by: NChristoph Hellwig <hch@lst.de>

3ef1b4b2