Commits · 641a9ed60f3620936921a58fb21d9f3aa891f3a4 · openeuler / raspberrypi-kernel

19 Jun, 2017 1 commit

blk-mq: use the introduced blk_mq_unquiesce_queue() · f660174e

Authored by Ming Lei on Jun 06, 2017

blk_mq_unquiesce_queue() is used for unquiescing the
queue explicitly, so replace blk_mq_start_stopped_hw_queues()
with it.

For the scsi part, this patch takes Bart's suggestion to
switch to block quiesce/unquiesce API completely.

Cc: linux-nvme@lists.infradead.org
Cc: linux-scsi@vger.kernel.org
Cc: dm-devel@redhat.com
Reviewed-by: Bart Van Assche <Bart.VanAssche@sandisk.com>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

f660174e

16 Jun, 2017 1 commit

nvme: implement NS Optimal IO Boundary from 1.3 Spec · 6b8190d6

Authored by Scott Bauer on Jun 15, 2017

The NVMe 1.3 spec introduces Namespace Optimal IO Boundaries (NOIOB),
which standardizes the stripe mechanism we currently have quirks for.
This patch implements the necessary logic to handle this new feature.

Signed-off-by: Scott Bauer <scott.bauer@intel.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>

6b8190d6
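
A minimal sketch of what an optimal I/O boundary means for a request, in the spirit of 6b8190d6: a request should not straddle a multiple of the boundary, so the usable length from a given start sector is capped at the distance to the next boundary. The helper names and the sector-based units are assumptions for illustration, not the driver's code.

```c
#include <stdint.h>

/*
 * Hypothetical helper: number of sectors that fit between `sector` and
 * the next optimal I/O boundary. `boundary_sectors` must be non-zero.
 */
static uint32_t sectors_to_boundary(uint64_t sector, uint32_t boundary_sectors)
{
	return boundary_sectors - (uint32_t)(sector % boundary_sectors);
}

/* Hypothetical helper: clamp a request so it does not cross a boundary. */
static uint32_t clamp_to_boundary(uint64_t sector, uint32_t nr_sectors,
				  uint32_t boundary_sectors)
{
	uint32_t room = sectors_to_boundary(sector, boundary_sectors);

	return nr_sectors < room ? nr_sectors : room;
}
```

With a 256-sector boundary, a 16-sector request starting at sector 250 would be clamped to 6 sectors, and the remainder would be issued as a second request.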

15 Jun, 2017 29 commits

nvme: don't hard code size of struct t10_pi_tuple · 8fa61121

Authored by Sagi Grimberg on Jun 15, 2017

Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Christoph Hellwig <hch@lst.de>

8fa61121

nvme: no need to wait for the reset when keepalive fails · 39bdc590

Authored by Christoph Hellwig on Jun 12, 2017

We don't need to wait for the reset from the delayed work item that
is kicked off when we don't get a keepalive.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reported-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>

39bdc590

nvme: move reset workqueue handling to common code · d86c4d8e

Authored by Christoph Hellwig on Jun 15, 2017

This moves the nvme_reset function from the PCIe driver to common code,
renaming it to nvme_reset_ctrl in the process. Additionally a new
helper nvme_reset_ctrl_sync is added for the case where we want to
wait for the reset. To facilitate that the reset_work work structure is
moved to the common nvme_ctrl structure and the ->reset_ctrl method is
removed. For now the drivers initialize the reset_work with their own
callback, but longer term we should move to callouts for specific
parts of the reset process and move even more code to the core.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>

d86c4d8e

nvme-pci: merge init_request methods · 0350815a

Authored by Christoph Hellwig on Jun 13, 2017

Now that we get the tagset passed we can have a single implementation for
the I/O and admin queues.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Max Gurtovoy <maxg@mellanox.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Christoph Hellwig <hch@lst.de>

0350815a

nvme-fc: merge init_request methods · 76f983cb

Authored by Christoph Hellwig on Jun 13, 2017

Now that we get the tagset passed we can have a single implementation for
the I/O and admin queues.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Max Gurtovoy <maxg@mellanox.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Christoph Hellwig <hch@lst.de>

76f983cb

nvme-rdma: merge init_request and exit_request methods · 385475ee

Authored by Christoph Hellwig on Jun 13, 2017

Now that we get the tagset passed we can have a single implementation for
the I/O and admin queues.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Max Gurtovoy <maxg@mellanox.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Christoph Hellwig <hch@lst.de>

385475ee

nvme: move protection information check into nvme_setup_rw · ebe6d874

Authored by Christoph Hellwig on Jun 12, 2017

It only applies to read/write commands, and this way non-PCIe drivers
get the check as well instead of having to duplicate it when adding
metadata support.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Keith Busch <keith.busch@intel.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Christoph Hellwig <hch@lst.de>

ebe6d874

nvme: mark shutdown_timeout static · b3b1b0b0

Authored by Christoph Hellwig on Jun 12, 2017

And open code the SHUTDOWN_TIMEOUT macro.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Christoph Hellwig <hch@lst.de>

b3b1b0b0

nvme-rdma: fix error code in nvme_rdma_create_ctrl() · bb472baa

Authored by Dan Carpenter on Jun 14, 2017

We accidentally return ERR_PTR(0) which is NULL.  The caller isn't
explicitly checking for that but I couldn't immediately spot whether
this would lead to a NULL dereference.  Anyway, we can add an
error code easily enough.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>

bb472baa
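
To illustrate why ERR_PTR(0) is a problem in bb472baa, here is a userspace imitation of the kernel's error-pointer helpers. The kernel's real definitions live in include/linux/err.h; the versions below are a simplified sketch that only mirrors their behaviour.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* Simplified userspace imitation of the kernel's ERR_PTR(). */
static inline void *ERR_PTR(long error)
{
	return (void *)(intptr_t)error;
}

/* Simplified userspace imitation of IS_ERR(): true only for -MAX_ERRNO..-1. */
static inline int IS_ERR(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}
```

Because error 0 encodes to NULL, a caller that checks only IS_ERR() treats the result as a valid pointer, which is why the create path needs a real negative error code.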

nvmf: keep track of nvmet connect error status · 97ddc36e

Authored by Guan Junxiong on Jun 13, 2017

To let the host know what happens during connection establishment,
adjust the behavior of nvmf_log_connect_error to make more
connect-specific error codes human-readable.

Signed-off-by: Guan Junxiong <guanjunxiong@huawei.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Christoph Hellwig <hch@lst.de>

97ddc36e

nvme: use ctrl->device consistently for logging · f0425db0

Authored by Johannes Thumshirn on Jun 09, 2017

Change the few leftover users of ctrl->dev over to using ctrl->device
for logging purposes, so we consistently use the same device.

Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Christoph Hellwig <hch@lst.de>

f0425db0

nvme: provide UUID value to userspace · d934f984

Authored by Johannes Thumshirn on Jun 07, 2017

Now that we have a way for getting the UUID from a target, provide it
to userspace as well.

Unfortunately there is already a sysfs attribute called UUID which is
a misnomer as it holds the NGUID value. So instead of creating yet
another wrong name, create a new 'nguid' sysfs attribute for the
NGUID. For the UUID attribute add a check whether the namespace has a
UUID assigned to it and return this or return the NGUID to maintain
backwards compatibility. This should give userspace a chance to catch
up.

Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: Sagi Grimberg <sagi@rimberg.me>
Signed-off-by: Christoph Hellwig <hch@lst.de>

d934f984
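
The backwards-compatible behaviour described in d934f984 can be sketched as follows: the existing 'uuid' attribute reports the UUID when one is assigned and otherwise falls back to the NGUID. The structure and function names are hypothetical, and "assigned" is assumed here to mean "not all zeroes".

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical container for the two 16-byte identifiers. */
struct sketch_ns_ids {
	uint8_t uuid[16];
	uint8_t nguid[16];
};

/* Returns 1 when every byte of the identifier is zero (assumed "unassigned"). */
static int id_is_unset(const uint8_t *id, size_t len)
{
	size_t i;

	for (i = 0; i < len; i++)
		if (id[i])
			return 0;
	return 1;
}

/* Value a 'uuid' attribute would report: the UUID if set, else the NGUID. */
static const uint8_t *uuid_attr_value(const struct sketch_ns_ids *ids)
{
	return id_is_unset(ids->uuid, sizeof(ids->uuid)) ? ids->nguid : ids->uuid;
}
```

A separate 'nguid' attribute would simply report `ids->nguid` unconditionally, which is what lets userspace migrate away from the misnamed attribute.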

nvme: get list of namespace descriptors · 3b22ba26

Authored by Johannes Thumshirn on Jun 07, 2017

If a target identifies itself as NVMe 1.3 compliant, try to get the
list of Namespace Identification Descriptors and populate the UUID,
NGUID and EUI64 fields in the NVMe namespace structure with these
values.

Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>

3b22ba26

nvme: rename uuid to nguid in nvme_ns · 90985b84

Authored by Johannes Thumshirn on Jun 07, 2017

The uuid field in the nvme_ns structure represents the nguid field
from the identify namespace command. And as NVMe 1.3 introduced a
UUID in the NVMe Namespace Identification Descriptor this will
collide.

So rename the uuid to nguid to prevent any further
confusion. Unfortunately we export the nguid to sysfs in the uuid
sysfs attribute, but this can't be changed anymore without possibly
breaking existing userspace.

Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Christoph Hellwig <hch@lst.de>

90985b84

nvmet: use NVME_IDENTIFY_DATA_SIZE · 0add5e8e

Authored by Johannes Thumshirn on Jun 07, 2017

Use NVME_IDENTIFY_DATA_SIZE define instead of hard coding the magic
4096 value.

Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de>
Reviewed-by: Max Gurtovoy <maxg@mellanox.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Hannes Reinecke <hare@suse.com>
[hch: converted three more users]
Signed-off-by: Christoph Hellwig <hch@lst.de>

0add5e8e

nvme-pci: remove redundant includes · d19d4c8e

Authored by Sagi Grimberg on Jun 05, 2017

Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Max Gurtovoy <maxg@mellanox.com>

d19d4c8e

nvme-pci: Remove watchdog timer · b2a0eb1a

Authored by Keith Busch on Jun 07, 2017

The controller status polling was added to preemptively reset a failed
controller. This early detection would allow commands that would normally
timeout a chance for a retry, or find broken links when the platform
didn't support hotplug.

This once-per-second MMIO read, however, created more problems than
it solved. It often raced with PCIe Hotplug events, which required
complicated syncing between work queues; it frequently triggered PCIe
Completion Timeout errors that also led to fatal machine checks; and it
unnecessarily disrupted low power modes by running on idle controllers.

This patch removes the watchdog timer, and instead checks controller
health only on an IO timeout when we have a reason to believe something
is wrong. If the controller is failed, the driver will disable immediately
and request scheduling a reset.

Suggested-by: Andy Lutomirski <luto@amacapital.net>
Signed-off-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>

b2a0eb1a

nvme-pci: remap BAR0 to cover admin CQ doorbell for large stride · 97f6ef64

Authored by Xu Yu on May 24, 2017

The existing driver initially maps 8192 bytes of BAR0, which is
intended to cover the doorbells of the admin SQ and CQ. However, if a
large stride, e.g. 10, is used, the admin CQ doorbell falls outside
those 8192 bytes. Consequently, a page fault is raised
when the admin CQ doorbell is accessed in nvme_configure_admin_queue().

This patch fixes this issue by remapping BAR0 before accessing
admin CQ doorbell if the initial mapping is not enough.

Signed-off-by: Xu Yu <yu.a.xu@intel.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Christoph Hellwig <hch@lst.de>

97f6ef64
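
The arithmetic behind 97f6ef64 can be worked through with the doorbell layout from the NVMe specification: doorbells start at offset 0x1000, and consecutive doorbells are spaced `4 << CAP.DSTRD` bytes apart. The function names below are made up for illustration, and the check is a sketch rather than the driver's remapping logic.

```c
#include <stdint.h>

#define SKETCH_REG_DBS 0x1000u /* doorbell registers start at this BAR0 offset */

/* Offset of the tail doorbell of submission queue `qid`. */
static uint64_t sq_doorbell_offset(uint32_t qid, uint32_t dstrd)
{
	return SKETCH_REG_DBS + (uint64_t)(2 * qid) * (4u << dstrd);
}

/* Offset of the head doorbell of completion queue `qid`. */
static uint64_t cq_doorbell_offset(uint32_t qid, uint32_t dstrd)
{
	return SKETCH_REG_DBS + (uint64_t)(2 * qid + 1) * (4u << dstrd);
}

/* Hypothetical check: does a mapping of `mapped` bytes cover the 4-byte admin CQ doorbell? */
static int admin_cq_doorbell_mapped(uint64_t mapped, uint32_t dstrd)
{
	return cq_doorbell_offset(0, dstrd) + 4 <= mapped;
}
```

With a stride of 10 the spacing is 4096 bytes, so the admin CQ doorbell sits at offset 8192, which is the first byte past an 8192-byte mapping.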

nvme: move nr_reconnects to nvme_ctrl · fdf9dfa8

Authored by Sagi Grimberg on May 04, 2017

It is not a user option but rather a variable controller
attribute.

Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Christoph Hellwig <hch@lst.de>

fdf9dfa8

nvme: queue ns scanning and async request from nvme_wq · c669ccdc

Authored by Sagi Grimberg on May 04, 2017

To suppress the warning triggered by nvme_uninit_ctrl:
kernel: [ 50.350439] nvme nvme0: rescanning
kernel: [ 50.363351] ------------[ cut here]------------
kernel: [ 50.363396] WARNING: CPU: 1 PID: 37 at kernel/workqueue.c:2423 check_flush_dependency+0x11f/0x130
kernel: [ 50.363409] workqueue: WQ_MEM_RECLAIM
nvme-wq:nvme_del_ctrl_work [nvme_core] is flushing !WQ_MEM_RECLAIM events:nvme_scan_work [nvme_core]

This was triggered with nvme-loop, but can happen with rdma/pci as well afaict.

Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Christoph Hellwig <hch@lst.de>

c669ccdc

nvme: Move transports to use nvme-core workqueue · 9a6327d2

Authored by Sagi Grimberg on Jun 07, 2017

Instead of each transport using its own workqueue, export
a single nvme-core workqueue and use that instead.

In the future, this will help us move towards some unification
of controller setup/teardown flows.

Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Christoph Hellwig <hch@lst.de>

9a6327d2

nvme: Don't allow to reset a reconnecting controller · c58bd1bf

Authored by Sagi Grimberg on May 04, 2017

The reset operation is guaranteed to fail for all scenarios
but the esoteric case where in the last reconnect attempt
concurrent with the reset we happen to successfully reconnect.

We just deny initiating a reset if we are reconnecting.

Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Christoph Hellwig <hch@lst.de>

c58bd1bf

nvme-rdma: Get rid of CONNECTED state · b282a88d

Authored by Sagi Grimberg on May 04, 2017

We only care about whether the queue is LIVE for request submission,
so there is no need for CONNECTED.

Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Christoph Hellwig <hch@lst.de>

b282a88d

nvme-rdma: rework rdma connection establishment error path · abf87d5e

Authored by Sagi Grimberg on May 04, 2017

Instead of introducing a flag for whether the queue is allocated,
simply free the rdma resources when we get the error.

We allocate the queue rdma resources when we have an address
resolution; there we allocate (or take a reference on) our device,
so we should free it when we get an error after the address resolution,
namely:
1. route resolution error
2. connect reject
3. connect error
4. peer unreachable error

Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Christoph Hellwig <hch@lst.de>

abf87d5e

nvme-rdma: make nvme_rdma_[create|destroy]_queue_ib symmetrical · ca6e95bb

Authored by Sagi Grimberg on May 04, 2017

We put the reference on the device in the destroy routine
so we should look up and take the reference in the create
routine.

Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Christoph Hellwig <hch@lst.de>

ca6e95bb

nvme-rdma: Don't rearm the CQ when polling directly · c8295d11

Authored by Sagi Grimberg on May 04, 2017

We don't need it as the core polling context will take
care of rearming the completion queue.

Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Christoph Hellwig <hch@lst.de>

c8295d11

nvme-rdma: Make queue flags bit numbers and not shifts · dc5bc6a9

Authored by Sagi Grimberg on May 04, 2017

bitops accept bit numbers.

Reported-by: Vijay Immanuel <vijayi@attalasystems.com>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Christoph Hellwig <hch@lst.de>

dc5bc6a9
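
The one-line rationale of dc5bc6a9 can be illustrated in userspace. Kernel bitops such as set_bit() and test_bit() take a bit number, so a flag declared as a shifted mask ends up addressing a different bit than its name suggests. The helpers below are non-atomic stand-ins, and the flag names are hypothetical.

```c
/* Non-atomic userspace stand-ins for the kernel's set_bit()/test_bit(). */
static void sketch_set_bit(unsigned int nr, unsigned long *word)
{
	*word |= 1UL << nr;
}

static int sketch_test_bit(unsigned int nr, const unsigned long *word)
{
	return (int)((*word >> nr) & 1UL);
}

/* Flags declared as bit numbers, the form bitops expect. */
enum { SKETCH_Q_DELETING = 1, SKETCH_Q_LIVE = 2 };

/* The same flags declared as shifted masks, the form the commit moves away from. */
enum { SKETCH_Q_DELETING_MASK = 1 << 1, SKETCH_Q_LIVE_MASK = 1 << 2 };

/* Returns the flag word after setting bit `nr` in an empty word. */
static unsigned long word_after_set(unsigned int nr)
{
	unsigned long word = 0;

	sketch_set_bit(nr, &word);
	return word;
}
```

Passing `SKETCH_Q_LIVE` sets bit 2, while passing `SKETCH_Q_LIVE_MASK` (value 4) sets bit 4. Code that uses the mask form consistently still works, but the bits used no longer match the declared values.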

nvme-rdma: get rid of unused ctrl lock · 3dee63c7

Authored by Sagi Grimberg on May 04, 2017

Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Christoph Hellwig <hch@lst.de>

3dee63c7

nvme-pci: implement host memory buffer support · 87ad72a5

Authored by Christoph Hellwig on May 12, 2017

If a controller supports the host memory buffer we try to provide
it with the requested size up to an upper cap set as a module
parameter.  We try to give as few descriptors as possible, eventually
working our way down.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Keith Busch <keith.busch@intel.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Max Gurtovoy <maxg@mellanox.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>

87ad72a5

13 Jun, 2017 2 commits

nvme: save hmpre and hmmin in struct nvme_ctrl · fe6d53c9

Authored by Christoph Hellwig on May 12, 2017

We'll need the later for the HMB support.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Keith Busch <keith.busch@intel.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>

fe6d53c9

nvme-rdma: fix merge error · a104c9f2

Authored by Christoph Hellwig on Jun 12, 2017

The merge of 4.12-rc5 into the for-4.13/block tree didn't handle the queue
ready case correctly.  Fix this by propagating blk_status_t into
nvme_rdma_queue_is_ready.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@fb.com>

a104c9f2

09 Jun, 2017 3 commits

blk-mq: switch ->queue_rq return value to blk_status_t · fc17b653

Authored by Christoph Hellwig on Jun 03, 2017

Use the same values that are used for request completion errors as the
return value from ->queue_rq.  BLK_STS_RESOURCE is special cased to cause
a requeue, and all the others are completed as-is.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@fb.com>

fc17b653
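
The dispatch rule described in fc17b653 can be sketched as a small decision function: a "resource" status leads to a requeue, success hands the request to the driver, and any other status completes the request with that error. All type and function names below are hypothetical stand-ins, not the blk-mq API.

```c
/* Hypothetical status values standing in for blk_status_t codes. */
enum sketch_status { SKETCH_STS_OK, SKETCH_STS_RESOURCE, SKETCH_STS_IOERR };

/* What the dispatcher does with a request after calling the driver. */
enum sketch_action { ACTION_NONE, ACTION_REQUEUE, ACTION_COMPLETE_WITH_ERROR };

/* Hypothetical driver callback, modelled loosely on ->queue_rq. */
typedef enum sketch_status (*sketch_queue_rq_fn)(void *rq);

static enum sketch_action dispatch_one(sketch_queue_rq_fn queue_rq, void *rq)
{
	switch (queue_rq(rq)) {
	case SKETCH_STS_OK:
		return ACTION_NONE;		/* the driver now owns the request */
	case SKETCH_STS_RESOURCE:
		return ACTION_REQUEUE;		/* temporary shortage: try again later */
	default:
		return ACTION_COMPLETE_WITH_ERROR; /* pass the status through as-is */
	}
}

/* Example drivers used to exercise the three outcomes. */
static enum sketch_status drv_ok(void *rq) { (void)rq; return SKETCH_STS_OK; }
static enum sketch_status drv_busy(void *rq) { (void)rq; return SKETCH_STS_RESOURCE; }
static enum sketch_status drv_fail(void *rq) { (void)rq; return SKETCH_STS_IOERR; }
```

Returning the completion status directly means the dispatcher does not have to translate between a driver-specific return code and the status used to end the request.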

block: introduce new block status code type · 2a842aca

Authored by Christoph Hellwig on Jun 03, 2017

Currently we use normal Linux errno values in the block layer, and while
we accept any error a few have overloaded magic meanings. This patch
instead introduces a new blk_status_t value that holds block layer specific
status codes and explicitly explains their meaning. Helpers to convert from
and to the previous special meanings are provided for now, but I suspect
we want to get rid of them in the long run - those drivers that have an
errno input (e.g. networking) usually get errnos that don't know about
the special block layer overloads, and similarly returning them to userspace
will usually return something that strictly speaking isn't correct
for file system operations, but that's left as an exercise for later.

For now the set of errors is a very limited set that closely corresponds
to the previous overloaded errno values, but there is some low hanging
fruit to improve it.

blk_status_t (ab)uses the sparse __bitwise annotations to allow for sparse
typechecking, so that we can easily catch places passing the wrong values.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@fb.com>

2a842aca
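
A rough userspace sketch of the idea in 2a842aca: a dedicated status type with explicit conversions to and from errno values. The kernel relies on sparse's __bitwise annotation for type checking; this sketch swaps in a struct wrapper, which gets a similar effect from an ordinary C compiler. The type, the status names, and the mapping table are illustrative assumptions, not the kernel's definitions.

```c
#include <errno.h>

/* Hypothetical status type; the struct wrapper prevents mixing it with int. */
typedef struct { unsigned char code; } sketch_status_t;

enum { STS_OK = 0, STS_NOSPC, STS_TIMEOUT, STS_RESOURCE, STS_IOERR };

/* Illustrative mapping from status codes to negative errno values. */
static const int status_errno[] = {
	[STS_OK]       = 0,
	[STS_NOSPC]    = -ENOSPC,
	[STS_TIMEOUT]  = -ETIMEDOUT,
	[STS_RESOURCE] = -ENOMEM,
	[STS_IOERR]    = -EIO,
};

#define STATUS_COUNT (sizeof(status_errno) / sizeof(status_errno[0]))

static int status_to_errno(sketch_status_t s)
{
	return s.code < STATUS_COUNT ? status_errno[s.code] : -EIO;
}

/* Unknown errno values collapse to the generic I/O error status. */
static sketch_status_t errno_to_status(int err)
{
	unsigned int i;

	for (i = 0; i < STATUS_COUNT; i++)
		if (status_errno[i] == err)
			return (sketch_status_t){ (unsigned char)i };
	return (sketch_status_t){ STS_IOERR };
}
```

The lossy direction is errno to status: an errno without a dedicated status, such as -EINVAL here, becomes a generic I/O error, which mirrors the commit's point that errnos from other subsystems do not know about block layer meanings.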

nvme-lightnvm: use blk_execute_rq in nvme_nvm_submit_user_cmd · 40174154

Authored by Christoph Hellwig on Jun 03, 2017

Instead of reinventing it poorly.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Bart Van Assche <Bart.VanAssche@sandisk.com>
Reviewed-by: Javier González <javier@cnexlabs.com>
Signed-off-by: Jens Axboe <axboe@fb.com>

40174154

07 Jun, 2017 4 commits

nvme: relax APST default max latency to 100ms · 9947d6a0

Authored by Kai-Heng Feng on Jun 07, 2017

Christoph Hellwig suggests we should make APST work out of the box.
Hence relax the default max latency to make devices able to enter
the deepest power state by default.

Here are id-ctrl excerpts from two high latency NVMes:

vid     : 0x14a4
ssvid   : 0x1b4b
mn      : CX2-GB1024-Q11 NVMe LITEON 1024GB
ps    3 : mp:0.1000W non-operational enlat:5000 exlat:5000 rrt:3 rrl:3
          rwt:3 rwl:3 idle_power:- active_power:-
ps    4 : mp:0.0100W non-operational enlat:50000 exlat:100000 rrt:4 rrl:4
          rwt:4 rwl:4 idle_power:- active_power:-

vid     : 0x15b7
ssvid   : 0x1b4b
mn      : A400 NVMe SanDisk 512GB
ps    3 : mp:0.0500W non-operational enlat:51000 exlat:10000 rrt:0 rrl:0
          rwt:0 rwl:0 idle_power:- active_power:-
ps    4 : mp:0.0055W non-operational enlat:1000000 exlat:100000 rrt:0 rrl:0
          rwt:0 rwl:0 idle_power:- active_power:-

Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>

9947d6a0

nvme: only consider exit latency when choosing useful non-op power states · da87591b

Authored by Kai-Heng Feng on Jun 07, 2017

When an NVMe is in non-op states, the latency is exlat.
The latency will be enlat + exlat only when the NVMe tries to transit
from operational state right after it begins to transit to
non-operational state, which should be a rare case.

Therefore, as Andy Lutomirski suggests, use exlat only when deciding power
states to transit to.

Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>

da87591b
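
The effect of da87591b, combined with the 100 ms default from 9947d6a0, can be sketched with the LITEON power states quoted in that commit. The selection function is a simplified assumption (pick the deepest non-operational state whose latency fits the limit) and is not the driver's APST table construction; the `count_entry` flag exists only to compare the two criteria.

```c
#include <stdint.h>

struct sketch_ps {
	uint32_t enlat_us;	/* entry latency in microseconds */
	uint32_t exlat_us;	/* exit latency in microseconds */
	int non_operational;
};

/* ps 3 and ps 4 of the LITEON CX2-GB1024-Q11 excerpt quoted in 9947d6a0. */
static const struct sketch_ps liteon_states[2] = {
	{ 5000, 5000, 1 },
	{ 50000, 100000, 1 },
};

/*
 * Returns the index of the deepest non-operational state that fits
 * `max_latency_us`, or -1 if none does. When `count_entry` is non-zero the
 * older enlat + exlat criterion is used instead of exlat alone.
 */
static int deepest_usable_state(const struct sketch_ps *ps, int n,
				uint64_t max_latency_us, int count_entry)
{
	for (int i = n - 1; i >= 0; i--) {
		uint64_t latency = ps[i].exlat_us;

		if (count_entry)
			latency += ps[i].enlat_us;
		if (ps[i].non_operational && latency <= max_latency_us)
			return i;
	}
	return -1;
}
```

With a 100000 us limit, the exlat-only criterion admits the deepest state (index 1, ps 4), whereas counting enlat + exlat (150000 us) would stop at index 0 (ps 3).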

nvme-fc: fix missing put reference on controller create failure · 24b7f059

Authored by James Smart on Jun 05, 2017

The failure case of a create controller request called
nvme_uninit_ctrl() but didn't do a put to allow the nvme
controller to be deleted.

Signed-off-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>

24b7f059

nvme-fc: on lldd/transport io error, terminate association · f874d5d0

Authored by James Smart on Jun 01, 2017

Per FC-NVME, when lldd or transport detects an i/o error, the
connection must be terminated, which in turn requires the association
to be terminated.  Currently the transport simply creates a nvme
completion status of transport error and returns the io. The FC-NVME
spec makes the mandate as initiator and host, depending on the error,
can get out of sync on outstanding io counts (sqhd/sqtail).

Implement the association teardown on lldd or transport detected
errors.

Signed-off-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>

f874d5d0