1. 29 August 2017, 2 commits
  2. 24 August 2017, 1 commit
    • block: replace bi_bdev with a gendisk pointer and partitions index · 74d46992
      Authored by Christoph Hellwig
      This way we don't need a block_device structure to submit I/O.  The
      block_device has different lifetime rules from the gendisk and
      request_queue and is usually only available when the block device node
      is open.  Other callers need to explicitly create one (e.g. the lightnvm
      passthrough code, or the new nvme multipathing code).
      
      For the actual I/O path all that we need is the gendisk, which exists
      once per block device.  But given that the block layer also does
      partition remapping we additionally need a partition index, which is
      used for said remapping in generic_make_request.
      
      Note that the block drivers generally want the request_queue, or
      sometimes the gendisk, so this removes a layer of indirection all
      over the stack.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      74d46992
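
      To make the shape of this change concrete, here is a minimal sketch of the
      new bio fields and the helper callers use instead of assigning bi_bdev; it
      is simplified from the upstream change, and the surrounding struct members
      are elided.

      struct bio {
              struct gendisk  *bi_disk;       /* the gendisk, one per block device */
              u8              bi_partno;      /* partition index, used for remapping
                                                 in generic_make_request */
              /* ... remaining bio fields unchanged ... */
      };

      /* Callers that used to assign bio->bi_bdev go through a helper that
       * captures both the gendisk and the partition number. */
      #define bio_set_dev(bio, bdev)                          \
      do {                                                    \
              (bio)->bi_disk   = (bdev)->bd_disk;             \
              (bio)->bi_partno = (bdev)->bd_partno;           \
      } while (0)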
  3. 11 August 2017, 2 commits
  4. 10 August 2017, 1 commit
  5. 26 July 2017, 1 commit
    • nvme: validate admin queue before unquiesce · 7dd1ab16
      Authored by Scott Bauer
      With a misbehaving controller it's possible we never enter the live
      state and never create an admin queue. When reset work fails, it may
      have failed early enough that the admin queue was never set up. We
      tear down queues after a failed reset, but some more sanitization
      was needed.
      
      Fixes: 443bd90f ("nvme: host: unquiesce queue in nvme_kill_queues()")
      
      [  189.650995] nvme nvme1: pci function 0000:0b:00.0
      [  317.680055] nvme nvme0: Device not ready; aborting reset
      [  317.680183] nvme nvme0: Removing after probe failure status: -19
      [  317.681258] kasan: GPF could be caused by NULL-ptr deref or user memory access
      [  317.681397] general protection fault: 0000 [#1] SMP KASAN
      [  317.682984] CPU: 3 PID: 477 Comm: kworker/3:2 Not tainted 4.13.0-rc1+ #5
      [  317.683112] Hardware name: Gigabyte Technology Co., Ltd. Z170X-UD5/Z170X-UD5-CF, BIOS F5 03/07/2016
      [  317.683284] Workqueue: events nvme_remove_dead_ctrl_work [nvme]
      [  317.683398] task: ffff8803b0990000 task.stack: ffff8803c2ef0000
      [  317.683516] RIP: 0010:blk_mq_unquiesce_queue+0x2b/0xa0
      [  317.683614] RSP: 0018:ffff8803c2ef7d40 EFLAGS: 00010282
      [  317.683716] RAX: dffffc0000000000 RBX: 0000000000000000 RCX: 1ffff1006fbdcde3
      [  317.683847] RDX: 0000000000000038 RSI: 1ffff1006f5a9245 RDI: 0000000000000000
      [  317.683978] RBP: ffff8803c2ef7d58 R08: 1ffff1007bcdc974 R09: 0000000000000000
      [  317.684108] R10: 1ffff1007bcdc975 R11: 0000000000000000 R12: 00000000000001c0
      [  317.684239] R13: ffff88037ad49228 R14: ffff88037ad492d0 R15: ffff88037ad492e0
      [  317.684371] FS:  0000000000000000(0000) GS:ffff8803de6c0000(0000) knlGS:0000000000000000
      [  317.684519] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [  317.684627] CR2: 0000002d1860c000 CR3: 000000045b40d000 CR4: 00000000003406e0
      [  317.684758] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      [  317.684888] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      [  317.685018] Call Trace:
      [  317.685084]  nvme_kill_queues+0x4d/0x170 [nvme_core]
      [  317.685185]  nvme_remove_dead_ctrl_work+0x3a/0x90 [nvme]
      [  317.685289]  process_one_work+0x771/0x1170
      [  317.685372]  worker_thread+0xde/0x11e0
      [  317.685452]  ? pci_mmcfg_check_reserved+0x110/0x110
      [  317.685550]  kthread+0x2d3/0x3d0
      [  317.685617]  ? process_one_work+0x1170/0x1170
      [  317.685704]  ? kthread_create_on_node+0xc0/0xc0
      [  317.685785]  ret_from_fork+0x25/0x30
      [  317.685798] Code: 0f 1f 44 00 00 55 48 b8 00 00 00 00 00 fc ff df 48 89 e5 41 54 4c 8d a7 c0 01 00 00 53 48 89 fb 4c 89 e2 48 c1 ea 03 48 83 ec 08 <80> 3c 02 00 75 50 48 8b bb c0 01 00 00 e8 33 8a f9 00 0f ba b3
      [  317.685872] RIP: blk_mq_unquiesce_queue+0x2b/0xa0 RSP: ffff8803c2ef7d40
      [  317.685908] ---[ end trace a3f8704150b1e8b4 ]---
      Signed-off-by: Scott Bauer <scott.bauer@intel.com>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      7dd1ab16
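
      A minimal sketch of the kind of guard this fix adds in nvme_kill_queues():
      only unquiesce the admin queue if a failed reset actually got far enough to
      create it. The surrounding namespace handling is elided and the locking
      detail is an approximation of the code at that time.

      void nvme_kill_queues(struct nvme_ctrl *ctrl)
      {
              struct nvme_ns *ns;

              mutex_lock(&ctrl->namespaces_mutex);

              /* A failed reset may never have created the admin queue, so
               * check before unquiescing it (this is the NULL pointer the
               * trace above dereferenced). */
              if (ctrl->admin_q)
                      blk_mq_unquiesce_queue(ctrl->admin_q);

              /* ... mark namespaces dead and restart their queues ... */

              mutex_unlock(&ctrl->namespaces_mutex);
      }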
  6. 25 July 2017, 1 commit
  7. 20 July 2017, 1 commit
  8. 06 July 2017, 2 commits
  9. 28 June 2017, 5 commits
  10. 19 June 2017, 2 commits
  11. 16 June 2017, 1 commit
  12. 15 June 2017, 12 commits
  13. 13 June 2017, 1 commit
  14. 09 June 2017, 2 commits
    • blk-mq: switch ->queue_rq return value to blk_status_t · fc17b653
      Authored by Christoph Hellwig
      Use the same values for request completion errors as for the return
      value from ->queue_rq.  BLK_STS_RESOURCE is special-cased to cause a
      requeue, and all the others are completed as-is.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Jens Axboe <axboe@fb.com>
      fc17b653
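
      A hedged sketch of what a driver's ->queue_rq looks like under the new
      convention; the example_* helpers are hypothetical placeholders, while the
      blk_status_t return values and the BLK_STS_RESOURCE requeue behaviour are
      what the commit describes.

      static blk_status_t example_queue_rq(struct blk_mq_hw_ctx *hctx,
                                           const struct blk_mq_queue_data *bd)
      {
              struct request *rq = bd->rq;

              /* Out of driver resources: the block layer requeues the request. */
              if (!example_can_queue(hctx))
                      return BLK_STS_RESOURCE;

              blk_mq_start_request(rq);

              if (example_submit(rq))
                      return BLK_STS_IOERR;   /* completed as an error, no requeue */

              return BLK_STS_OK;
      }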
    • block: introduce new block status code type · 2a842aca
      Authored by Christoph Hellwig
      Currently we use normal Linux errno values in the block layer, and while
      we accept any error a few have overloaded magic meanings.  This patch
      instead introduces a new blk_status_t value that holds block layer specific
      status codes and explicitly explains their meaning.  Helpers to convert from
      and to the previous special meanings are provided for now, but I suspect
      we want to get rid of them in the long run - those drivers that have an
      errno input (e.g. networking) usually get errnos that don't know about
      the special block layer overloads, and similarly returning them to userspace
      will usually return something that strictly speaking isn't correct
      for file system operations, but that's left as an exercise for later.
      
      For now the set of errors is a very limited set that closely corresponds
      to the previous overloaded errno values, but there is some low-hanging
      fruit to improve it.
      
      blk_status_t (ab)uses the sparse __bitwise annotations to allow for sparse
      typechecking, so that we can easily catch places passing the wrong values.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Jens Axboe <axboe@fb.com>
      2a842aca
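
      A sketch of the new type and the conversion helpers mentioned above; the
      typedef and helper names follow the commit, while the particular status
      codes and numeric values shown here are illustrative.

      typedef u8 __bitwise blk_status_t;

      #define BLK_STS_OK              0
      #define BLK_STS_NOTSUPP         ((__force blk_status_t)1)
      #define BLK_STS_MEDIUM          ((__force blk_status_t)7)
      #define BLK_STS_RESOURCE        ((__force blk_status_t)9)
      #define BLK_STS_IOERR           ((__force blk_status_t)10)

      /* Helpers for boundaries that still deal in errnos (to be phased out). */
      int blk_status_to_errno(blk_status_t status);
      blk_status_t errno_to_blk_status(int errno);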
  15. 07 June 2017, 3 commits
    • nvme: relax APST default max latency to 100ms · 9947d6a0
      Authored by Kai-Heng Feng
      Christoph Hellwig suggests we should make APST work out of the box.
      Hence relax the default max latency so that such devices are able to
      enter the deepest power state by default.
      
      Here are id-ctrl excerpts from two high latency NVMes:
      
      vid     : 0x14a4
      ssvid   : 0x1b4b
      mn      : CX2-GB1024-Q11 NVMe LITEON 1024GB
      ps    3 : mp:0.1000W non-operational enlat:5000 exlat:5000 rrt:3 rrl:3
                rwt:3 rwl:3 idle_power:- active_power:-
      ps    4 : mp:0.0100W non-operational enlat:50000 exlat:100000 rrt:4 rrl:4
                rwt:4 rwl:4 idle_power:- active_power:-
      
      vid     : 0x15b7
      ssvid   : 0x1b4b
      mn      : A400 NVMe SanDisk 512GB
      ps    3 : mp:0.0500W non-operational enlat:51000 exlat:10000 rrt:0 rrl:0
                rwt:0 rwl:0 idle_power:- active_power:-
      ps    4 : mp:0.0055W non-operational enlat:1000000 exlat:100000 rrt:0 rrl:0
                rwt:0 rwl:0 idle_power:- active_power:-
      Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      9947d6a0
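
      The change itself amounts to raising a default latency budget; a sketch of
      the module parameter involved, assuming the nvme core driver's
      default_ps_max_latency_us parameter (100000 us corresponds to the 100ms in
      the subject):

      /* drivers/nvme/host/core.c (sketch) */
      static unsigned long default_ps_max_latency_us = 100000;
      module_param(default_ps_max_latency_us, ulong, 0644);
      MODULE_PARM_DESC(default_ps_max_latency_us,
                       "max power saving latency for new devices; use PM QOS to change per device");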
    • nvme: only consider exit latency when choosing useful non-op power states · da87591b
      Authored by Kai-Heng Feng
      When an NVMe device is in a non-operational state, the latency is exlat.
      The latency is enlat + exlat only when the device has to leave a
      non-operational state right after it began transitioning into it,
      which should be a rare case.

      Therefore, as Andy Lutomirski suggests, use exlat only when deciding
      which power states to transition to.
      Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      da87591b
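
      A heavily condensed sketch of the state-selection loop after this change:
      eligibility is decided on exit latency alone, while entry + exit latency is
      still used for the transition timeout. Field and flag names follow the NVMe
      identify-controller power state descriptors; the rest of
      nvme_configure_apst() is elided.

      int state;

      for (state = ctrl->npss; state >= 0; state--) {
              u64 exit_latency_us, total_latency_us;

              if (!(ctrl->psd[state].flags & NVME_PS_FLAGS_NON_OP_STATE))
                      continue;

              /* Decide eligibility on exit latency alone ... */
              exit_latency_us = (u64)le32_to_cpu(ctrl->psd[state].exit_lat);
              if (exit_latency_us > ctrl->ps_max_latency_us)
                      continue;

              /* ... but keep entry + exit latency for the transition timeout. */
              total_latency_us = exit_latency_us +
                      le32_to_cpu(ctrl->psd[state].entry_lat);

              /* fill in the APST table entry for this state ... */
      }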
    • nvme: fix hang in remove path · 82654b6b
      Authored by Ming Lei
      We need to start the admin queue too in nvme_kill_queues()
      to avoid a hang in the remove path [1].

      This patch is very similar to 806f026f ("nvme: use
      blk_mq_start_hw_queues() in nvme_kill_queues()").
      
      [1] hang stack trace
      [<ffffffff813c9716>] blk_execute_rq+0x56/0x80
      [<ffffffff815cb6e9>] __nvme_submit_sync_cmd+0x89/0xf0
      [<ffffffff815ce7be>] nvme_set_features+0x5e/0x90
      [<ffffffff815ce9f6>] nvme_configure_apst+0x166/0x200
      [<ffffffff815cef45>] nvme_set_latency_tolerance+0x35/0x50
      [<ffffffff8157bd11>] apply_constraint+0xb1/0xc0
      [<ffffffff8157cbb4>] dev_pm_qos_constraints_destroy+0xf4/0x1f0
      [<ffffffff8157b44a>] dpm_sysfs_remove+0x2a/0x60
      [<ffffffff8156d951>] device_del+0x101/0x320
      [<ffffffff8156db8a>] device_unregister+0x1a/0x60
      [<ffffffff8156dc4c>] device_destroy+0x3c/0x50
      [<ffffffff815cd295>] nvme_uninit_ctrl+0x45/0xa0
      [<ffffffff815d4858>] nvme_remove+0x78/0x110
      [<ffffffff81452b69>] pci_device_remove+0x39/0xb0
      [<ffffffff81572935>] device_release_driver_internal+0x155/0x210
      [<ffffffff81572a02>] device_release_driver+0x12/0x20
      [<ffffffff815d36fb>] nvme_remove_dead_ctrl_work+0x6b/0x70
      [<ffffffff810bf3bc>] process_one_work+0x18c/0x3a0
      [<ffffffff810bf61e>] worker_thread+0x4e/0x3b0
      [<ffffffff810c5ac9>] kthread+0x109/0x140
      [<ffffffff8185800c>] ret_from_fork+0x2c/0x40
      [<ffffffffffffffff>] 0xffffffffffffffff
      
      Fixes: c5552fde ("nvme: Enable autonomous power state transitions")
      Reported-by: Rakesh Pandit <rakesh@tuxera.com>
      Tested-by: Rakesh Pandit <rakesh@tuxera.com>
      Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
      Signed-off-by: Ming Lei <ming.lei@redhat.com>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      82654b6b
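
      The fix itself is a small hunk in nvme_kill_queues(); a sketch of the added
      call, following the pattern of 806f026f referenced above (the namespace
      loop and the rest of the function are elided):

      /* Forcibly start the admin queue too, so sync admin commands issued
       * from the remove path (such as the APST configuration in the trace
       * above) cannot get stuck on a stopped queue. */
      blk_mq_start_hw_queues(ctrl->admin_q);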
  16. 26 May 2017, 2 commits
  17. 23 May 2017, 1 commit
    • nvme: avoid to use blk_mq_abort_requeue_list() · 986f75c8
      Authored by Ming Lei
      NVMe may add a request to the requeue list without kicking off the
      requeue if the hw queues are stopped. blk_mq_abort_requeue_list()
      is then called in both nvme_kill_queues() and nvme_ns_remove() to
      deal with this issue.

      Unfortunately blk_mq_abort_requeue_list() is a race maker: for
      example, a request may be requeued while the list is being aborted.
      So this patch instead calls blk_mq_kick_requeue_list() in
      nvme_kill_queues() to handle the issue, just like nvme_start_queues()
      does. Now any requests left on the requeue list while the queues are
      stopped will be handled by blk_mq_kick_requeue_list() when the queues
      are restarted, either in nvme_start_queues() or in nvme_kill_queues().
      
      Cc: stable@vger.kernel.org
      Reported-by: Zhang Yi <yizhan@redhat.com>
      Reviewed-by: Keith Busch <keith.busch@intel.com>
      Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
      Signed-off-by: Ming Lei <ming.lei@redhat.com>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      986f75c8
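
      A condensed sketch of the namespace loop in nvme_kill_queues() after this
      change: the racy blk_mq_abort_requeue_list() call is gone, and the requeue
      list is simply kicked once the hw queues have been force-started, mirroring
      what nvme_start_queues() does. Other per-namespace handling is elided.

      list_for_each_entry(ns, &ctrl->namespaces, list) {
              blk_set_queue_dying(ns->queue);

              /* Start the queues instead of aborting the requeue list: any
               * requeued requests are dispatched (and fail on the dying
               * queue) once the queues run again. */
              blk_mq_start_hw_queues(ns->queue);
              blk_mq_kick_requeue_list(ns->queue);
      }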