- 01 September 2017, 5 commits
-
-
Committed by Paolo Valente
In addition to containing some typos and stale sentences, the file bfq-iosched.txt still mentioned a set of sysfs parameters that have been removed from this version of bfq. This commit fixes all these issues. Signed-off-by: Paolo Valente <paolo.valente@linaro.org> Reviewed-by: Jeremy Hickman <jeremywh7@gmail.com> Reviewed-by: Laurentiu Nicola <lnicola@dend.ro> Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Committed by Omar Sandoval
The comments here are really outdated, and blk-mq made flushing much simpler, so just fold the two cases into the callers. Reviewed-by: Ming Lei <ming.lei@redhat.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Omar Sandoval <osandov@fb.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Committed by Omar Sandoval
This is a different approach from the first attempt in f2c6df7d ("loop: support 4k physical blocksize"). Rather than extending LOOP_{GET,SET}_STATUS, add a separate ioctl just for setting the block size. Reviewed-by: Ming Lei <ming.lei@redhat.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Omar Sandoval <osandov@fb.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
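As an illustration of how such a dedicated block-size ioctl is typically driven from userspace, here is a minimal sketch. The ioctl name LOOP_SET_BLOCK_SIZE and the device path are assumptions not taken from this log; check <linux/loop.h> on a kernel that carries the patch.

    /* Sketch: set a 4k block size on an already attached loop device.
     * LOOP_SET_BLOCK_SIZE is assumed to be the ioctl added by this patch;
     * verify the name in <linux/loop.h> before relying on it. */
    #include <fcntl.h>
    #include <stdio.h>
    #include <sys/ioctl.h>
    #include <unistd.h>
    #include <linux/loop.h>

    int main(void)
    {
        int fd = open("/dev/loop0", O_RDWR);  /* assumes loop0 is already bound */

        if (fd < 0) {
            perror("open /dev/loop0");
            return 1;
        }
        /* The argument is the requested block size in bytes, passed by value. */
        if (ioctl(fd, LOOP_SET_BLOCK_SIZE, 4096UL) < 0)
            perror("LOOP_SET_BLOCK_SIZE");
        close(fd);
        return 0;
    }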
-
Committed by Omar Sandoval
The physical block size is "the lowest possible sector size that the hardware can operate on without reverting to read-modify-write operations" (from the comment on blk_queue_physical_block_size()). Since loop does buffered I/O on the backing file by default, the RMW unit is a page. This isn't the case for direct I/O mode, but let's keep it simple. Reviewed-by: Ming Lei <ming.lei@redhat.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Omar Sandoval <osandov@fb.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Committed by Omar Sandoval
This is only used for setting the soft block size on the struct block_device once and then never used again. Reviewed-by: Ming Lei <ming.lei@redhat.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Omar Sandoval <osandov@fb.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
- 31 August 2017, 3 commits
-
-
Committed by Paolo Valente
If the function bfq_update_next_in_service is invoked as a consequence of the activation or requeueing of an entity, say E, then it doesn't invoke bfq_lookup_next_entity to get the next-in-service entity. Instead, it follows a shorter path: if E happens to be eligible (see the commit "bfq-sq-mq: make lookup_next_entity push up vtime on expirations" for details on eligibility) and to have a lower virtual finish time than the current candidate next-in-service entity, then E directly becomes the next-in-service entity.

Unfortunately, there is a corner case in which this shorter path makes bfq_update_next_in_service choose a non-eligible entity: it occurs if both E and the current next-in-service entity happen to be non-eligible when bfq_update_next_in_service is invoked. In this case, E is not set as next-in-service, and, since bfq_lookup_next_entity is not invoked, the state of the parent entity is not updated so as to end up with an eligible entity as the proper next-in-service entity.

In this respect, next-in-service is actually allowed to be non-eligible while some queue is in service: since no system-virtual-time push-up can be performed in that case (see again the commit "bfq-sq-mq: make lookup_next_entity push up vtime on expirations" for details), next-in-service is chosen, speculatively, as a function of the possible value that the system virtual time may get after a push-up. But the correctness of the schedule breaks if next-in-service is still a non-eligible entity when it is time to set the next entity in service. Unfortunately, this may happen in the above corner case.

This commit fixes the problem by making bfq_update_next_in_service invoke bfq_lookup_next_entity not only if the above shorter path cannot be taken, but also if the shorter path is taken yet fails to yield an eligible next-in-service entity. Signed-off-by: Paolo Valente <paolo.valente@linaro.org> Tested-by: Lee Tibbert <lee.tibbert@gmail.com> Tested-by: Oleksandr Natalenko <oleksandr@natalenko.name> Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Committed by Paolo Valente
If the function bfq_update_next_in_service is invoked as a consequence of the activation or requeueing of an entity, say E, and finds out that E belongs to a higher-priority class than that of the current next-in-service entity, then it sets next_in_service directly to E. But this may lead to anomalous schedules, because E may happen not to be eligible for service, since its virtual start time is higher than the system virtual time for its service tree. This commit addresses this issue by simply removing this direct switch. Signed-off-by: Paolo Valente <paolo.valente@linaro.org> Tested-by: Lee Tibbert <lee.tibbert@gmail.com> Tested-by: Oleksandr Natalenko <oleksandr@natalenko.name> Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Committed by Paolo Valente
To provide a very smooth service, bfq starts to serve a bfq_queue only if the queue is 'eligible', i.e., if the same queue would have started to be served in the ideal, perfectly fair system that bfq simulates internally. This is obtained by associating each queue with a virtual start time, and by computing a special system virtual time quantity: a queue is eligible only if the system virtual time has reached the virtual start time of the queue. Finally, bfq guarantees that, when a new queue must be set in service, there is always at least one eligible entity for each active parent entity in the scheduler.

To provide this guarantee, the function __bfq_lookup_next_entity pushes up, for each parent entity on which it is invoked, the system virtual time to the minimum among the virtual start times of the entities in the active tree for the parent entity (more precisely, the push-up occurs if the system virtual time happens to be lower than all such virtual start times).

There is however a circumstance in which __bfq_lookup_next_entity cannot push up the system virtual time for a parent entity, even if the system virtual time is lower than the virtual start times of all the child entities in the active tree: it happens if one of the child entities is in service. In that case there is already an eligible entity, the in-service one, even if it may not be present in the active tree (because in-service entities may be removed from the active tree).

Unfortunately, in the last re-design of the hierarchical-scheduling engine, the reset of the pointer to the in-service entity for a given parent entity (a reset to be done as a consequence of the expiration of the in-service entity) always happens after the function __bfq_lookup_next_entity has been invoked. This causes the function to think that there is still an entity in service for the parent entity, and therefore that the system virtual time cannot be pushed up, even if such a no-more-in-service entity has actually already been properly reinserted into the active tree (or into some other tree if no longer active). Yet the system virtual time *had* to be pushed up, to be ready to correctly choose the next queue to serve. Because of the lack of this push-up, bfq may wrongly set in service a queue that had been speculatively pre-computed as the possible next-in-service queue, but that would no longer be the one to serve after the expiration and the reinsertion into the active trees of the previously in-service entities.

This commit addresses this issue by making __bfq_lookup_next_entity properly push up the system virtual time if an expiration is occurring. Signed-off-by: Paolo Valente <paolo.valente@linaro.org> Tested-by: Lee Tibbert <lee.tibbert@gmail.com> Tested-by: Oleksandr Natalenko <oleksandr@natalenko.name> Signed-off-by: Jens Axboe <axboe@kernel.dk>
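A toy model of the eligibility rule and of the virtual-time push-up described above may help; the names and data layout below are purely illustrative and are not bfq's actual structures.

    /* Toy model, not bfq code: an entity is eligible once the system
     * virtual time of its service tree has reached its virtual start
     * time, and the push-up raises that system virtual time to the
     * smallest start time in the active tree, but only when nothing is
     * in service (an in-service entity is already an eligible one, even
     * if it has been removed from the active tree). */
    #include <stdbool.h>
    #include <stddef.h>

    struct toy_entity {
        unsigned long long start;      /* virtual start time */
        unsigned long long finish;     /* virtual finish time */
    };

    struct toy_service_tree {
        unsigned long long vtime;      /* system virtual time of this tree */
        struct toy_entity *active;     /* entities in the active tree */
        size_t nr_active;
        struct toy_entity *in_service; /* NULL once the expired entity is reset */
    };

    static bool toy_entity_is_eligible(const struct toy_entity *e,
                                       const struct toy_service_tree *st)
    {
        return e->start <= st->vtime;
    }

    static void toy_maybe_push_up_vtime(struct toy_service_tree *st)
    {
        unsigned long long min_start;
        size_t i;

        if (st->in_service || st->nr_active == 0)
            return;                    /* an eligible entity already exists */

        min_start = st->active[0].start;
        for (i = 1; i < st->nr_active; i++)
            if (st->active[i].start < min_start)
                min_start = st->active[i].start;

        if (st->vtime < min_start)
            st->vtime = min_start;     /* push up: at least one entity becomes eligible */
    }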
-
- 29 August 2017, 32 commits
-
-
Committed by Jens Axboe (from git://git.infradead.org/nvme)
Pull NVMe changes from Christoph: "Below is the current set of NVMe updates for Linux 4.14, now against your postmerge branch, and with three more patches. The biggest bit comes from Sagi and refactors the RDMA driver to prepare for more code sharing in the setup and teardown path. But we have various features and bug fixes from a lot of people as well."
-
Committed by Christoph Hellwig
Instead validate that these identifiers do not change, as that is prohibited by the specification. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Keith Busch <keith.busch@intel.com>
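The kind of check being described could look roughly like the following stand-alone sketch; the struct and function names are illustrative, not the driver's.

    /* Illustrative sketch, not the nvme driver code: capture the namespace
     * identifiers at probe time and, on revalidation, compare them against
     * freshly read values; a mismatch must fail the revalidation because
     * the specification forbids these identifiers from changing. */
    #include <stdbool.h>
    #include <stdint.h>
    #include <string.h>

    struct toy_ns_ids {
        uint8_t eui64[8];
        uint8_t nguid[16];
        uint8_t uuid[16];
    };

    static bool toy_ns_ids_unchanged(const struct toy_ns_ids *probed,
                                     const struct toy_ns_ids *current)
    {
        return memcmp(probed->eui64, current->eui64, sizeof(probed->eui64)) == 0 &&
               memcmp(probed->nguid, current->nguid, sizeof(probed->nguid)) == 0 &&
               memcmp(probed->uuid,  current->uuid,  sizeof(probed->uuid))  == 0;
    }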
-
Committed by Christoph Hellwig
The function is used in two places, and the shared code for those will diverge later in this series. Instead, factor out a new helper to get the ids for a namespace, simplify the calling conventions for nvme_identify_ns, and just open code the sequence. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Keith Busch <keith.busch@intel.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
-
Committed by Christoph Hellwig
And move the flags for the flags field near that field while touching this area. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Keith Busch <keith.busch@intel.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
-
Committed by Christoph Hellwig
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Keith Busch <keith.busch@intel.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
-
Committed by Christoph Hellwig
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Keith Busch <keith.busch@intel.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
-
Committed by Martin K. Petersen
If an NVMe controller reports RTD3 Entry Latency larger than shutdown_timeout, up to a maximum of 60 seconds, use that value to set the shutdown timer. Otherwise fall back to the module parameter, which defaults to 5 seconds. Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> [hch: removed do_div, made transition time local scope] Signed-off-by: Christoph Hellwig <hch@lst.de>
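A stand-alone sketch of that policy, assuming RTD3E is the microsecond latency reported in the Identify Controller data; the function and parameter names below are illustrative, not the driver's.

    /* Sketch only: pick the shutdown timeout as described above. The RTD3
     * Entry Latency (rtd3e) is reported by the controller in microseconds;
     * module_default_s stands in for the module parameter (default 5 s). */
    #include <stdint.h>

    static unsigned int effective_shutdown_timeout(uint32_t rtd3e_us,
                                                   unsigned int module_default_s)
    {
        unsigned int transition_s = rtd3e_us / 1000000;   /* us -> s */

        if (transition_s <= module_default_s)
            return module_default_s;   /* fall back to the module parameter */
        if (transition_s > 60)
            return 60;                 /* cap at the 60 second maximum */
        return transition_s;
    }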
-
Committed by Jan H. Schönherr
The value of iod->first_dma ends up as prp2 in NVMe commands. In case there is not enough data to cross a page boundary, iod->first_dma is never initialized and contains random data. Comply with the NVMe specification and fill in 0 in that case. Signed-off-by: Jan H. Schönherr <jschoenh@amazon.de> Signed-off-by: Christoph Hellwig <hch@lst.de>
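The rule being enforced can be sketched in isolation as follows; the helper below is illustrative and not the driver's PRP setup code.

    /* Illustrative sketch: compute what PRP2 should hold for a transfer
     * starting at first_dma with the given in-page offset and length. If
     * the data does not reach a second page, PRP2 is unused and must be 0,
     * never stale memory contents. (A real transfer spanning more than two
     * pages would use a PRP list instead; that case is omitted here.) */
    #include <stdint.h>

    #define TOY_PAGE_SIZE 4096u

    static uint64_t toy_prp2(uint64_t first_dma, uint32_t offset, uint32_t length)
    {
        uint32_t first_chunk = TOY_PAGE_SIZE - offset;  /* bytes left in the first page */

        if (length <= first_chunk)
            return 0;                        /* no page boundary crossed: PRP2 = 0 */
        return first_dma + first_chunk;      /* start of the second page */
    }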
-
Committed by Max Gurtovoy
This patch slightly improves performance (mainly for small block sizes). Signed-off-by: Max Gurtovoy <maxg@mellanox.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
-
Committed by Martin Wilck
This changes the earlier patch "nvmet: don't report 0-bytes in serial number" to use the memcpy_and_pad() helper introduced in a previous patch. Signed-off-by: Martin Wilck <mwilck@suse.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de>
-
Committed by Martin Wilck
This helper function is useful for the nvme subsystem, and maybe others. Note: the warnings reported by the kbuild test robot for this patch are actually generated by the use of CONFIG_PROFILE_ALL_BRANCHES together with __FORTIFY_INLINE. Signed-off-by: Martin Wilck <mwilck@suse.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de>
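For reference, the semantics of such a helper can be sketched in plain C as below; this is an illustrative re-implementation, not the kernel's inline. The serial-number patch above replaces 0-byte padding with it, presumably using a space as the pad byte.

    /* Stand-alone sketch of the helper's behaviour, not the kernel code:
     * copy up to dest_len bytes from src into dest, then fill whatever
     * space remains in dest with the pad byte. */
    #include <stddef.h>
    #include <string.h>

    static void memcpy_and_pad_sketch(void *dest, size_t dest_len,
                                      const void *src, size_t count, int pad)
    {
        if (dest_len > count) {
            memcpy(dest, src, count);
            memset((char *)dest + count, pad, dest_len - count);
        } else {
            memcpy(dest, src, dest_len);
        }
    }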
-
Committed by James Smart
The existing nvmet_fc sg list handling has 2 faults: a) the request between LLDD and transport has too large of an sg list (256 elements), which is normally 256k (64 elements); b) sglist handling doesn't optimize on the fact that each element is a page. This patch removes the static sg list in the request and uses the dynamic list already present in the nvmet_fc transport. It also simplifies the handling of the sg list on multiple sequences to take advantage of the per-page divisions. Signed-off-by: James Smart <james.smart@broadcom.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
-
Committed by James Smart
If the LLDD resets or detaches from an fc port, the LLDD will deregister all remoteports seen by the fc port and deregister the localport associated with the fc port. The teardown of the localport structure will be held off due to reference counting until all the remoteports are removed (and they are held off until all controllers/associations are terminated).

Currently, if the fc port is reinitialized/reattached and registered again as a localport, it is treated as an independent entity from the prior localport, and all prior remoteports and controllers cannot be revived. They are created as new and separate entities.

This patch changes the localport registration to look at the known localports that are waiting to be torn down. If they are the same port based on wwn's, the localport is transitioned out of the teardown state. This allows the remote ports and controller connections to be reestablished and resumed as long as the localport can also be reregistered within the timeout windows. The patch adds a new routine nvme_fc_attach_to_unreg_lport() with the functionality and moves the lport get/put routines to avoid forward references. Signed-off-by: James Smart <james.smart@broadcom.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
-
Committed by Max Gurtovoy
Signed-off-by: Max Gurtovoy <maxg@mellanox.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
-
Committed by Max Gurtovoy
Signed-off-by: Max Gurtovoy <maxg@mellanox.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de>
-
Committed by Sagi Grimberg
Use ctrl->device and lose the func name. Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de>
-
Committed by Guan Junxiong
This helps users to quickly spot the reason why a connection fails if the hostid is not compliant with the uuid format. Signed-off-by: Guan Junxiong <guanjunxiong@huawei.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
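The shape of such a format check, as a stand-alone sketch with illustrative names (the real option parser has its own validation and error reporting):

    /* Illustrative sketch: accept only the canonical 8-4-4-4-12 UUID text
     * form for the hostid option and report a clear error otherwise, so a
     * rejected connection is easy to diagnose. */
    #include <ctype.h>
    #include <stdbool.h>
    #include <stdio.h>
    #include <string.h>

    static bool hostid_is_uuid_format(const char *s)
    {
        size_t i;

        if (strlen(s) != 36)
            return false;
        for (i = 0; i < 36; i++) {
            if (i == 8 || i == 13 || i == 18 || i == 23) {
                if (s[i] != '-')
                    return false;
            } else if (!isxdigit((unsigned char)s[i])) {
                return false;
            }
        }
        return true;
    }

    /* Usage sketch:
     *     if (!hostid_is_uuid_format(opt))
     *         fprintf(stderr, "invalid hostid %s: expected UUID format\n", opt);
     */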
-
Committed by Sagi Grimberg
To make nvme_rdma_configure_admin_queue generic in preparation for moving it to common code. Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de>
-
Committed by Sagi Grimberg
No need to queue an extra work item to indirect controller removal; just call the ctrl remove routine. Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de>
-
Committed by Sagi Grimberg
This should pair with nvme_rdma_stop_queue. While this is not a complete inverse, it still pairs up pretty well because in fabrics we don't have a disconnect capsule (yet), but we simply tear down the transport association. Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de>
-
Committed by Sagi Grimberg
Give it a name symmetric to nvme_rdma_free_queue. Also pass in the ctrl sqsize+1 and not the opts queue_size. And suppress a superfluous failure message. Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de>
-
Committed by Sagi Grimberg
If we move the queues from LIVE state, we might as well stop them (drain for rdma). Do it after we stop the request queues to prevent a stray request sneaking into .queue_rq after we stop the queue. Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de>
-
Committed by Sagi Grimberg
Make the handling symmetrical with the admin queue. Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de>
-
Committed by Sagi Grimberg
No need to open-code it. Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de>
-
Committed by Sagi Grimberg
We're not supposed to do that. Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de>
-
Committed by Sagi Grimberg
Mimic the pci driver, as a controller disable might be more lightweight than a shutdown. Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de>
-
Committed by Sagi Grimberg
We always pair tagset allocation with an rdma device reference, and the two paths share some code, so centralize this with an argument indicating whether it is an admin or I/O tagset. Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de>
-
Committed by Sagi Grimberg
Will be used when we centralize control flows. Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de>
-
Committed by Sagi Grimberg
We will call it from other places, so avoid having to forward declare it. Also move it next to nvme_rdma_destroy_admin_queue. Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de>
-
Committed by Johannes Thumshirn
NVME_RDMA_MAX_SEGMENT_SIZE is not used anywhere, zap it. Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de> Reviewed-by: Max Gurtovoy <maxg@mellanox.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
-
Committed by Johannes Thumshirn
ALL_OPTS isn't used anywhere, remove it. Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Christoph Hellwig <hch@lst.de>
-
Committed by Guan Junxiong
The nvmf target shall return NVME_SC_CONNECT_INVALID_HOST instead of the general code INVALID_PARAM when the given host nqn is not allowed to connect. Refer to section 2.2.1 of the NVMe over Fabrics spec. Signed-off-by: Guan Junxiong <guanjunxiong@huawei.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de>
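Conceptually, the change amounts to returning the Connect-specific "invalid host" status from the host authorization check rather than the generic parameter error. The sketch below uses placeholder names and values, not the spec's actual status codes.

    /* Conceptual sketch only: the enum values are placeholders, not the
     * NVMe over Fabrics status codes. When the host NQN is not in the
     * subsystem's allowed-hosts list, fail the Connect command with the
     * dedicated "invalid host" status rather than a generic parameter
     * error, so the initiator can tell why it was rejected. */
    #include <stddef.h>
    #include <string.h>

    enum toy_connect_status {
        TOY_SC_SUCCESS = 0,
        TOY_SC_CONNECT_INVALID_PARAM,   /* generic: a connect parameter is bad */
        TOY_SC_CONNECT_INVALID_HOST,    /* specific: this host may not connect */
    };

    static enum toy_connect_status
    toy_authorize_host(const char *hostnqn, const char *const *allowed, size_t nr)
    {
        size_t i;

        for (i = 0; i < nr; i++)
            if (strcmp(hostnqn, allowed[i]) == 0)
                return TOY_SC_SUCCESS;
        return TOY_SC_CONNECT_INVALID_HOST;
    }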
-