提交 · 037cebb85b94027a52be69d72068e6f6d0dca3a3 · openanolis / cloud-kernel

19 6月, 2017 6 次提交

C
blk-mq: streamline blk_mq_get_request · 037cebb8
由 Christoph Hellwig 提交于 6月 16, 2017
```
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NJens Axboe <axboe@kernel.dk>
```
037cebb8

blk-mq: simplify blk_mq_free_request · 6af54051

由 Christoph Hellwig 提交于 6月 16, 2017

Merge three functions only tail-called by blk_mq_free_request into
blk_mq_free_request.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NJens Axboe <axboe@kernel.dk>

6af54051

blk-mq-sched: unify request finished methods · 7b9e9361

由 Christoph Hellwig 提交于 6月 16, 2017

No need to have two different callouts of bfq vs kyber.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NJens Axboe <axboe@kernel.dk>

7b9e9361

blk-mq: remove blk_mq_sched_{get,put}_rq_priv · ea511e3c

由 Christoph Hellwig 提交于 6月 16, 2017

Having these as separate helpers in a header really does not help
readability, or my chances to refactor this code sanely.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NJens Axboe <axboe@kernel.dk>

ea511e3c

blk-mq: move blk_mq_sched_{get,put}_request to blk-mq.c · d2c0d383

由 Christoph Hellwig 提交于 6月 16, 2017

Having them out of line in blk-mq-sched.c just makes the code flow
unnecessarily complicated.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NJens Axboe <axboe@kernel.dk>

d2c0d383

C
blk-mq: mark blk_mq_rq_ctx_init static · 6e15cf2a
由 Christoph Hellwig 提交于 6月 16, 2017
```
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NJens Axboe <axboe@kernel.dk>
```
6e15cf2a

18 6月, 2017 1 次提交

loop: Add PF_LESS_THROTTLE to block/loop device thread. · b2ee7d46

由 NeilBrown 提交于 6月 16, 2017

When a filesystem is mounted from a loop device, writes are
throttled by balance_dirty_pages() twice: once when writing
to the filesystem and once when the loop_handle_cmd() writes
to the backing file.  This double-throttling can trigger
positive feedback loops that create significant delays.  The
throttling at the lower level is seen by the upper level as
a slow device, so it throttles extra hard.

The PF_LESS_THROTTLE flag was created to handle exactly this
circumstance, though with an NFS filesystem mounted from a
local NFS server.  It reduces the throttling on the lower
layer so that it can proceed largely unthrottled.

To demonstrate this, create a filesystem on a loop device
and write (e.g. with dd) several large files which combine
to consume significantly more than the limit set by
/proc/sys/vm/dirty_ratio or dirty_bytes.  Measure the total
time taken.

When I do this directly on a device (no loop device) the
total time for several runs (mkfs, mount, write 200 files,
umount) is fairly stable: 28-35 seconds.
When I do this over a loop device the times are much worse
and less stable.  52-460 seconds.  Half below 100seconds,
half above.
When I apply this patch, the times become stable again,
though not as fast as the no-loop-back case: 53-72 seconds.

There may be room for further improvement as the total overhead still
seems too high, but this is a big improvement.
Reviewed-by: NChristoph Hellwig <hch@lst.de>
Reviewed-by: NMing Lei <tom.leiming@gmail.com>
Suggested-by: NMichal Hocko <mhocko@suse.com>
Acked-by: NMichal Hocko <mhocko@suse.com>
Signed-off-by: NNeilBrown <neilb@suse.com>
Signed-off-by: NJens Axboe <axboe@fb.com>

b2ee7d46

17 6月, 2017 1 次提交

Merge branch 'nvme-4.13' of git://git.infradead.org/nvme into for-4.13/block · c27b2d63

由 Jens Axboe 提交于 6月 16, 2017

Pull NVMe changes for 4.13 from Christoph:

Highlights:

 - UUID identifier support from Johannes
 - Lots of cleanups from Sagi
 - Host Memory Buffer support from me

And lots of cleanups and smaller fixes of course.

Note that the UUID identifier changes are based on top of the uuid tree.
I am the maintainer of that tree and will send it to Linus as soon as
4.12 is released as various other trees depend on it as well (and the
diffstat includes those changes unfortunately)

c27b2d63

16 6月, 2017 3 次提交

block: swim3: make of_device_ids const. · cc3f2e9f

由 Arvind Yadav 提交于 6月 16, 2017

of_device_ids are not supposed to change at runtime. All functions
working with of_device_ids provided by <linux/of.h> work with const
of_device_ids. So mark the non-const structs as const.

File size before:
   text	   data	    bss	    dec	    hex	filename
   8908	   1096	    624	  10628	   2984	drivers/block/swim3.o

File size after constify swim3_match:
   text	   data	    bss	    dec	    hex	filename
   9708	    296	    624	  10628	   2984	drivers/block/swim3.o
Signed-off-by: NArvind Yadav <arvind.yadav.cs@gmail.com>
Reviewed-by: NJohannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: NJens Axboe <axboe@fb.com>

cc3f2e9f

block: Dedicated error code fixups · a462b950

由 Bart Van Assche 提交于 6月 13, 2017

This patch fixes two sparse warnings introduced by the "dedicated
error codes for the block layer V3" patch series. These changes
have not been tested.
Signed-off-by: NBart Van Assche <bart.vanassche@sandisk.com>
Reviewed-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NJens Axboe <axboe@fb.com>

a462b950

nvme: implement NS Optimal IO Boundary from 1.3 Spec · 6b8190d6

由 Scott Bauer 提交于 6月 15, 2017

The NVMe 1.3 spec introduces Namespace Optimal IO Boundaries (NOIOB),
which standardizes the stripe mechanism we currently have quirks for.
This patch implements the necessary logic to handle this new feature.
Signed-off-by: NScott Bauer <scott.bauer@intel.com>
Signed-off-by: NChristoph Hellwig <hch@lst.de>

6b8190d6

15 6月, 2017 29 次提交

S
nvme: don't hard code size of struct t10_pi_tuple · 8fa61121
由 Sagi Grimberg 提交于 6月 15, 2017
```
Signed-off-by: NSagi Grimberg <sagi@grimberg.me>
Signed-off-by: NChristoph Hellwig <hch@lst.de>
```
8fa61121

nvme: no need to wait for the reset when keepalive fails · 39bdc590

由 Christoph Hellwig 提交于 6月 12, 2017

We don't need to wait for the reset from the delayed work item that
is kicked off when we don't get a keepalive.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Reported-by: NSagi Grimberg <sagi@grimberg.me>
Reviewed-by: NSagi Grimberg <sagi@grimberg.me>

39bdc590

nvme: move reset workqueue handling to common code · d86c4d8e

由 Christoph Hellwig 提交于 6月 15, 2017

This moves the nvme_reset function from the PCIe driver to common code,
renaming it to nvme_reset_ctrl in the process. Additionally a new
helper nvme_reset_ctrl_sync is added for the case where we want to
wait for the reset. To facilitate that the reset_work work structure is
move to the common nvme_ctrl structure and the ->reset_ctrl method is
removed. For now the drivers initialize the reset_work with their own
callback, but longer term we should move to callouts for specific
parts of the reset process and move even more code to the core.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Reviewed-by: NSagi Grimberg <sagi@grimberg.me>

d86c4d8e

nvme-pci: merge init_request methods · 0350815a

由 Christoph Hellwig 提交于 6月 13, 2017

Now that we get the tagset passed we can have a single implementation for
the I/O and admin queues.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Reviewed-by: NMax Gurtovoy <maxg@mellanox.com>
Reviewed-by: NSagi Grimberg <sagi@grimberg.me>
Signed-off-by: NChristoph Hellwig <hch@lst.de>

0350815a

nvme-loop: merge init_request methods · 62b83b18

由 Christoph Hellwig 提交于 6月 13, 2017

Now that we get the tagset passed we can have a single implementation for
the I/O and admin queues.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Reviewed-by: NMax Gurtovoy <maxg@mellanox.com>
Reviewed-by: NSagi Grimberg <sagi@grimberg.me>
Signed-off-by: NChristoph Hellwig <hch@lst.de>

62b83b18

nvme-fc: merge init_request methods · 76f983cb

由 Christoph Hellwig 提交于 6月 13, 2017

Now that we get the tagset passed we can have a single implementation for
the I/O and admin queues.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Reviewed-by: NMax Gurtovoy <maxg@mellanox.com>
Reviewed-by: NSagi Grimberg <sagi@grimberg.me>
Signed-off-by: NChristoph Hellwig <hch@lst.de>

76f983cb

nvme-rdma: merge init_request and exit_request methods · 385475ee

由 Christoph Hellwig 提交于 6月 13, 2017

Now that we get the tagset passed we can have a single implementation for
the I/O and admin queues.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Reviewed-by: NMax Gurtovoy <maxg@mellanox.com>
Reviewed-by: NSagi Grimberg <sagi@grimberg.me>
Signed-off-by: NChristoph Hellwig <hch@lst.de>

385475ee

nvme: move protection information check into nvme_setup_rw · ebe6d874

由 Christoph Hellwig 提交于 6月 12, 2017

It only applies to read/write commands, and this way non-PCIe drivers
get the check as well instead of having to duplicate it when adding
metadata support.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Reviewed-by: NKeith Busch <keith.busch@intel.com>
Reviewed-by: NSagi Grimberg <sagi@grimberg.me>
Signed-off-by: NChristoph Hellwig <hch@lst.de>

ebe6d874

nvme: mark shutdown_timeout static · b3b1b0b0

由 Christoph Hellwig 提交于 6月 12, 2017

And open code the SHUTDOWN_TIMEOUT macro.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Reviewed-by: NSagi Grimberg <sagi@grimberg.me>
Reviewed-by: NJohannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: NChristoph Hellwig <hch@lst.de>

b3b1b0b0

nvme-rdma: fix error code in nvme_rdma_create_ctrl() · bb472baa

由 Dan Carpenter 提交于 6月 14, 2017

We accidentally return ERR_PTR(0) which is NULL.  The caller isn't
explicitly checking for that but I couldn't immediately spot whether
this would lead to a NULL dereference.  Anyway, we can fix add an
error code easily enough.
Signed-off-by: NDan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: NChristoph Hellwig <hch@lst.de>

bb472baa

nvmf: keep track of nvmet connect error status · 97ddc36e

由 Guan Junxiong 提交于 6月 13, 2017

To let the host know what happends to the connection establishment,
adjust the behavior of nvmf_log_connect_error to make more connect
specifig error codes human-readble.
Signed-off-by: NGuan Junxiong <guanjunxiong@huawei.com>
Reviewed-by: NSagi Grimberg <sagi@grimberg.me>
Signed-off-by: NChristoph Hellwig <hch@lst.de>

97ddc36e

nvme: add fields into identify controller data structure · 435e8090

由 Guan Junxiong 提交于 6月 13, 2017

Add the new to NVMe 1.3 fields EDSTT, DSTO, FWUG, HCTMA, MNTMT, MXTMT,
and SANICAP into the idenfity controller data structure.
Signed-off-by: NGuan Junxiong <guanjunxiong@huawei.com>
Reviewed-by: NSagi Grimberg <sagi@grimberg.me>
Signed-off-by: NChristoph Hellwig <hch@lst.de>

435e8090

nvmet-fc: Remove a set-but-not-used variable · 1b633277

由 Bart Van Assche 提交于 6月 08, 2017

This was detected by building the nvmet-fc driver with W=1.
Signed-off-by: NBart Van Assche <bart.vanassche@sandisk.com>
Cc: James Smart <james.smart@broadcom.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: NChristoph Hellwig <hch@lst.de>

1b633277

nvme: use ctrl->device consistently for logging · f0425db0

由 Johannes Thumshirn 提交于 6月 09, 2017

Change the few left over users of ctrl->dev over to using ctrl->device
for logging purposes, so we consistently use the same device.
Signed-off-by: NJohannes Thumshirn <jthumshirn@suse.de>
Reviewed-by: NSagi Grimberg <sagi@grimberg.me>
Signed-off-by: NChristoph Hellwig <hch@lst.de>

f0425db0

nvmet: allow overriding the NVMe VS via configfs · c61d788b

由 Johannes Thumshirn 提交于 6月 07, 2017

Allow overriding the announced NVMe Version of a via configfs.

This is particularly helpful when debugging new features for the host
or target side without bumping the hard coded version (as the target
might not be fully compliant to the announced version yet).
Signed-off-by: NJohannes Thumshirn <jthumshirn@suse.de>
Reviewed-by: NHannes Reinecke <hare@suse.com>
Reviewed-by: NGuan Junxiong <guanjunxiong@huawei.com>
Signed-off-by: NChristoph Hellwig <hch@lst.de>

c61d788b

nvmet: add uuid field to nvme_ns and populate via configfs · 430c7bef

由 Johannes Thumshirn 提交于 6月 07, 2017

Add the UUID field from the NVMe Namespace Identification Descriptor
to the nvmet_ns structure and allow it's population via configfs.
Signed-off-by: NJohannes Thumshirn <jthumshirn@suse.de>
Reviewed-by: NMax Gurtovoy <maxg@mellanox.com>
Reviewed-by: NSagi Grimberg <sagi@grimberg.me>
Reviewed-by: NHannes Reinecke <hare@suse.com>
Signed-off-by: NChristoph Hellwig <hch@lst.de>

430c7bef

nvmet: implement namespace identify descriptor list · 637dc0f3

由 Johannes Thumshirn 提交于 6月 07, 2017

A NVMe Identify NS command with a CNS value of '3' is expecting a list
of Namespace Identification Descriptor structures to be returned to
the host for the namespace requested in the namespace identify
command.

This Namespace Identification Descriptor structure consists of the
type of the namespace identifier, the length of the identifier and the
actual identifier.

Valid types are NGUID and UUID which we have saved in our nvme_ns
structure if they have been configured via configfs. If no value has
been assigened to one of these we return an "invalid opcode" back to
the host to maintain backward compatibiliy with older implementations
without Namespace Identify Descriptor list support.

Also as the Namespace Identify Descriptor list is the only mandatory
feature change between 1.2.1 and 1.3 we can bump the advertised
version as well.
Signed-off-by: NJohannes Thumshirn <jthumshirn@suse.de>
Reviewed-by: NHannes Reinecke <hare@suse.com>
Reviewed-by: NMax Gurtovoy <maxg@mellanox.com>
Reviewed-by: NSagi Grimberg <sagi@grimberg.me>
Signed-off-by: NChristoph Hellwig <hch@lst.de>

637dc0f3

nvme: provide UUID value to userspace · d934f984

由 Johannes Thumshirn 提交于 6月 07, 2017

Now that we have a way for getting the UUID from a target, provide it
to userspace as well.

Unfortunately there is already a sysfs attribute called UUID which is
a misnomer as it holds the NGUID value. So instead of creating yet
another wrong name, create a new 'nguid' sysfs attribute for the
NGUID. For the UUID attribute add a check wheter the namespace has a
UUID assigned to it and return this or return the NGUID to maintain
backwards compatibility. This should give userspace a chance to catch
up.
Signed-off-by: NJohannes Thumshirn <jthumshirn@suse.de>
Reviewed-by: NChristoph Hellwig <hch@lst.de>
Reviewed-by: NHannes Reinecke <hare@suse.com>
Reviewed-by: NSagi Grimberg <sagi@rimberg.me>
Signed-off-by: NChristoph Hellwig <hch@lst.de>

d934f984

nvme: get list of namespace descriptors · 3b22ba26

由 Johannes Thumshirn 提交于 6月 07, 2017

If a target identifies itself as NVMe 1.3 compliant, try to get the
list of Namespace Identification Descriptors and populate the UUID,
NGUID and EUI64 fileds in the NVMe namespace structure with these
values.
Signed-off-by: NJohannes Thumshirn <jthumshirn@suse.de>
Reviewed-by: NHannes Reinecke <hare@suse.com>
Signed-off-by: NChristoph Hellwig <hch@lst.de>

3b22ba26

nvme: rename uuid to nguid in nvme_ns · 90985b84

由 Johannes Thumshirn 提交于 6月 07, 2017

The uuid field in the nvme_ns structure represents the nguid field
from the identify namespace command. And as NVMe 1.3 introduced an
UUID in the NVMe Namespace Identification Descriptor this will
collide.

So rename the uuid to nguid to prevent any further
confusion. Unfortunately we export the nguid to sysfs in the uuid
sysfs attribute, but this can't be changed anymore without possibly
breaking existing userspace.
Signed-off-by: NJohannes Thumshirn <jthumshirn@suse.de>
Reviewed-by: NChristoph Hellwig <hch@lst.de>
Reviewed-by: NHannes Reinecke <hare@suse.com>
Reviewed-by: NSagi Grimberg <sagi@grimberg.me>
Signed-off-by: NChristoph Hellwig <hch@lst.de>

90985b84

nvme: introduce NVMe Namespace Identification Descriptor structures · af8b86e9

由 Johannes Thumshirn 提交于 6月 07, 2017

Signed-off-by: NJohannes Thumshirn <jthumshirn@suse.de>
Reviewed-by: NMax Gurtovoy <maxg@mellanox.com>
Reviewed-by: NSagi Grimberg <sagi@grimberg.me>
Reviewed-by: NHannes Reinecke <hare@suse.com>
Signed-off-by: NChristoph Hellwig <hch@lst.de>

af8b86e9

nvmet: use NVME_IDENTIFY_DATA_SIZE · 0add5e8e

由 Johannes Thumshirn 提交于 6月 07, 2017

Use NVME_IDENTIFY_DATA_SIZE define instead of hard coding the magic
4096 value.
Signed-off-by: NJohannes Thumshirn <jthumshirn@suse.de>
Reviewed-by: NMax Gurtovoy <maxg@mellanox.com>
Reviewed-by: NSagi Grimberg <sagi@grimberg.me>
Reviewed-by: NHannes Reinecke <hare@suse.com>
[hch: converted three more users]
Signed-off-by: NChristoph Hellwig <hch@lst.de>

0add5e8e

scatterlist: add sg_zero_buffer() helper · 0945e569

由 Johannes Thumshirn 提交于 6月 07, 2017

The sg_zero_buffer() helper is used to zero fill an area in a SG
list.
Signed-off-by: NJohannes Thumshirn <jthumshirn@suse.de>
Reviewed-by: NSagi Grimberg <sagi@grimberg.me>
[hch: renamed to sg_zero_buffer]
Signed-off-by: NChristoph Hellwig <hch@lst.de>

0945e569

nvme-pci: remove redundant includes · d19d4c8e

由 Sagi Grimberg 提交于 6月 05, 2017

Signed-off-by: NSagi Grimberg <sagi@grimberg.me>
Reviewed-by: NMax Gurtovoy <maxg@mellanox.com>

d19d4c8e

nvme-pci: Remove watchdog timer · b2a0eb1a

由 Keith Busch 提交于 6月 07, 2017

The controller status polling was added to preemptively reset a failed
controller. This early detection would allow commands that would normally
timeout a chance for a retry, or find broken links when the platform
didn't support hotplug.

This once-per-second MMIO read, however, created more problems than
it solves. This often races with PCIe Hotplug events that required
complicated syncing between work queues, frequently triggered PCIe
Completion Timeout errors that also lead to fatal machine checks, and
unnecessarily disrupts low power modes by running on idle controllers.

This patch removes the watchdog timer, and instead checks controller
health only on an IO timeout when we have a reason to believe something
is wrong. If the controller is failed, the driver will disable immediately
and request scheduling a reset.
Suggested-by: NAndy Lutomirski <luto@amacapital.net>
Signed-off-by: NKeith Busch <keith.busch@intel.com>
Signed-off-by: NChristoph Hellwig <hch@lst.de>

b2a0eb1a

nvme-pci: remap BAR0 to cover admin CQ doorbell for large stride · 97f6ef64

由 Xu Yu 提交于 5月 24, 2017

The existing driver initially maps 8192 bytes of BAR0 which is
intended to cover doorbells of admin SQ and CQ. However, if a
large stride, e.g. 10, is used, the doorbell of admin CQ will
be out of 8192 bytes. Consequently, a page fault will be raised
when the admin CQ doorbell is accessed in nvme_configure_admin_queue().

This patch fixes this issue by remapping BAR0 before accessing
admin CQ doorbell if the initial mapping is not enough.
Signed-off-by: NXu Yu <yu.a.xu@intel.com>
Reviewed-by: NSagi Grimberg <sagi@grimberg.me>
Signed-off-by: NChristoph Hellwig <hch@lst.de>

97f6ef64

nvme: move nr_reconnects to nvme_ctrl · fdf9dfa8

由 Sagi Grimberg 提交于 5月 04, 2017

It is not a user option but rather a variable controller
attribute.
Signed-off-by: NSagi Grimberg <sagi@grimberg.me>
Signed-off-by: NChristoph Hellwig <hch@lst.de>

fdf9dfa8

nvme: queue ns scanning and async request from nvme_wq · c669ccdc

由 Sagi Grimberg 提交于 5月 04, 2017

To suppress the warning triggered by nvme_uninit_ctrl:
kernel: [ 50.350439] nvme nvme0: rescanning
kernel: [ 50.363351] ------------[ cut here]------------
kernel: [ 50.363396] WARNING: CPU: 1 PID: 37 at kernel/workqueue.c:2423 check_flush_dependency+0x11f/0x130
kernel: [ 50.363409] workqueue: WQ_MEM_RECLAIM
nvme-wq:nvme_del_ctrl_work [nvme_core] is flushing !WQ_MEM_RECLAIM events:nvme_scan_work [nvme_core]

This was triggered with nvme-loop, but can happen with rdma/pci as well afaict.
Signed-off-by: NSagi Grimberg <sagi@grimberg.me>
Signed-off-by: NChristoph Hellwig <hch@lst.de>

c669ccdc

nvme: Move transports to use nvme-core workqueue · 9a6327d2

由 Sagi Grimberg 提交于 6月 07, 2017

Instead of each transport using it's own workqueue, export
a single nvme-core workqueue and use that instead.

In the future, this will help us moving towards some unification
if controller setup/teardown flows.
Signed-off-by: NSagi Grimberg <sagi@grimberg.me>
Signed-off-by: NChristoph Hellwig <hch@lst.de>

9a6327d2

openanolis / cloud-kernel 大约 1 年 前同步成功

openanolis / cloud-kernel
大约 1 年前同步成功