Commits · 48b8d20895f8a489e1527e9bdc5e372808542fa3 · openanolis / cloud-kernel

01 Jun, 2018 33 commits

lightnvm: pblk: garbage collect lines with failed writes · 48b8d208

Committed by Hans Holmberg on Jun 01, 2018

Write failures should not happen under normal circumstances,
so in order to bring the chunk back into a known state as soon
as possible, evacuate all the valid data out of the line and let the
fw judge if the block can be written to in the next reset cycle.

Do this by introducing a new gc list for lines with failed writes,
and ensure that the rate limiter allocates a small portion of
the write bandwidth to get the job done.

The lba list is saved in memory for use during gc as we
cannot guarantee that the emeta data is readable if a write
error occurred.
Signed-off-by: Hans Holmberg <hans.holmberg@cnexlabs.com>
Reviewed-by: Javier González <javier@cnexlabs.com>
Signed-off-by: Matias Bjørling <mb@lightnvm.io>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

lightnvm: pblk: rework write error recovery path · 6a3abf5b

Committed by Hans Holmberg on Jun 01, 2018

The write error recovery path is incomplete, so rework
the write error recovery handling to do resubmits directly
from the write buffer.

When a write error occurs, the remaining sectors in the chunk are
mapped out and invalidated and the request inserted in a resubmit list.

The writer thread checks if there are any requests to resubmit,
scans and invalidates any lbas that have been overwritten by later
writes and resubmits the failed entries.
Signed-off-by: Hans Holmberg <hans.holmberg@cnexlabs.com>
Reviewed-by: Javier González <javier@cnexlabs.com>
Signed-off-by: Matias Bjørling <mb@lightnvm.io>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
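
The resubmit step described above (invalidate failed entries whose lbas were overwritten by later writes, then resubmit the rest) can be pictured with a small user-space sketch. This is not pblk code: `struct failed_entry`, `prepare_resubmit()` and the `latest_seq` table are hypothetical names, and a per-entry sequence number is assumed as the way to tell which write is newer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of one write-buffer entry that failed on the device. */
struct failed_entry {
    uint64_t lba;   /* logical block address of the failed sector */
    uint64_t seq;   /* assumed: order in which the write entered the buffer */
    bool valid;     /* cleared once a later write has superseded the data */
};

/*
 * latest_seq[lba] is assumed to hold the sequence number of the newest
 * write seen for that lba (every e[i].lba must index into the table).
 * Marks superseded entries invalid and returns how many entries still
 * have to be resubmitted.
 */
size_t prepare_resubmit(struct failed_entry *e, size_t n,
                        const uint64_t *latest_seq)
{
    size_t pending = 0;

    assert(e != NULL && latest_seq != NULL);
    for (size_t i = 0; i < n; i++) {
        if (latest_seq[e[i].lba] > e[i].seq) {
            e[i].valid = false;     /* overwritten by a later write */
        } else {
            e[i].valid = true;      /* still the newest copy: resubmit */
            pending++;
        }
    }
    return pending;
}
```

Skipping superseded entries avoids writing stale data a second time; only entries that still hold the newest copy of an lba go back to the device.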

lightnvm: pblk: remove dead function · 72b6cdbb

Committed by Javier González on Jun 01, 2018

Remove dead function for manual sync. I/O
Signed-off-by: Javier González <javier@cnexlabs.com>
Signed-off-by: Matias Bjørling <mb@lightnvm.io>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

lightnvm: pass flag on graceful teardown to targets · a7c9e910

Committed by Javier González on Jun 01, 2018

If the namespace is unregistered before the LightNVM target is removed
(e.g., on hot unplug) it is too late for the target to store any metadata
on the device - any attempt to write to the device will fail. In this
case, pass on a "graceful teardown" flag to the target to let it know
when this happens.

In the case of pblk, we pad the open line (close all open chunks) to
improve data retention. In the event of an ungraceful shutdown, avoid
this part and just clean up.
Signed-off-by: Javier González <javier@cnexlabs.com>
Signed-off-by: Matias Bjørling <mb@lightnvm.io>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

lightnvm: pblk: check for chunk size before allocating it · 6f9c9607

Committed by Javier González on Jun 01, 2018

Do the check for the chunk state after making sure that the chunk type
is supported.

Fixes: 32ef9412 ("lightnvm: pblk: implement get log report chunk")
Signed-off-by: Javier González <javier@cnexlabs.com>
Signed-off-by: Matias Bjørling <mb@lightnvm.io>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

lightnvm: pblk: remove unnecessary argument · 8e55c07b

Committed by Javier González on Jun 01, 2018

Remove unnecessary argument on pblk_line_free()
Signed-off-by: Javier González <javier@cnexlabs.com>
Signed-off-by: Matias Bjørling <mb@lightnvm.io>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

lightnvm: pblk: remove unnecessary indirection · e13f421b

Committed by Javier González on Jun 01, 2018

Call nvm_submit_io directly and remove an unnecessary indirection on the
read path.
Signed-off-by: Javier González <javier@cnexlabs.com>
Signed-off-by: Matias Bjørling <mb@lightnvm.io>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

lightnvm: pblk: return NVM_ error on failed submission · b6730dd4

Committed by Javier González on Jun 01, 2018

Return a meaningful error when the sanity vector I/O check fails.
Signed-off-by: Javier González <javier@cnexlabs.com>
Signed-off-by: Matias Bjørling <mb@lightnvm.io>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

lightnvm: pblk: warn in case of corrupted write buffer · e37d0798

Committed by Javier González on Jun 01, 2018

When cleaning up buffer entries as we wrap up, their state should be
"completed". If any of the entries is in "submitted" state, it means
that something bad has happened. Trigger a warning immediately instead of
waiting for the state flag to eventually be updated, thus hiding the
issue.
Signed-off-by: Javier González <javier@cnexlabs.com>
Signed-off-by: Matias Bjørling <mb@lightnvm.io>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

lightnvm: pblk: improve error msg on corrupted LBAs · 03a34b2d

Committed by Javier González on Jun 01, 2018

In the event of a mismatch between the read LBA and the metadata pointer
reported by the device, improve the error message to be able to detect
the offending physical address (PPA) mapped to the corrupted LBA.
Signed-off-by: Javier González <javier@cnexlabs.com>
Signed-off-by: Matias Bjørling <mb@lightnvm.io>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

lightnvm: pblk: check read lba on gc path · 310df582

Committed by Javier González on Jun 01, 2018

Check that the lba stored in the LBA metadata is correct in the GC path
too. This requires a new helper function to check random reads in the
vector read.
Signed-off-by: Javier González <javier@cnexlabs.com>
Signed-off-by: Matias Bjørling <mb@lightnvm.io>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

lightnvm: pblk: recheck for bad lines at runtime · 1d8b33e0

Committed by Javier González on Jun 01, 2018

Bad blocks can grow at runtime. Check that the number of valid blocks in
a line is within the sanity threshold before allocating the line for
new writes.
Signed-off-by: Javier González <javier@cnexlabs.com>
Signed-off-by: Matias Bjørling <mb@lightnvm.io>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
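
As a rough illustration of the runtime check described above, the sketch below refuses to allocate a line once too few good blocks remain. It is only a model: `struct line_info`, `line_is_usable()` and the `min_good_blks` threshold are made-up names, not the fields or helpers pblk actually uses.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-line counters; the names are illustrative, not pblk's. */
struct line_info {
    unsigned int blk_in_line;   /* blocks (chunks) that make up the line */
    unsigned int nr_bad_blks;   /* blocks currently marked bad */
};

/* A line may be allocated for new writes only if enough good blocks remain. */
bool line_is_usable(const struct line_info *l, unsigned int min_good_blks)
{
    assert(l != NULL);
    if (l->nr_bad_blks > l->blk_in_line)
        return false;           /* inconsistent counters: treat as unusable */
    return l->blk_in_line - l->nr_bad_blks >= min_good_blks;
}
```

Because the bad-block count can grow after initialization, the check has to run each time a line is about to be allocated, not only once at setup.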

lightnvm: pblk: fail gracefully on line alloc. failure · 2deeefc0

Committed by Javier González on Jun 01, 2018

In the event of a line failing to allocate, fail gracefully and stop the
pipeline to avoid more writes failing in the same place.
Signed-off-by: Javier González <javier@cnexlabs.com>
Signed-off-by: Matias Bjørling <mb@lightnvm.io>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

Merge branch 'nvme-4.18' of git://git.infradead.org/nvme into for-4.18/block · 84e92c13

Committed by Jens Axboe on Jun 01, 2018

Pull NVMe changes from Christoph:

"Below is another set of NVMe updates for 4.18.  Besides the usual bug
 fixes this includes more feature completeness in terms of AEN and log
 page handling on the target."

* 'nvme-4.18' of git://git.infradead.org/nvme:
  nvme: use the changed namespaces list log to clear ns data changed AENs
  nvme: mark nvme_queue_scan static
  nvme: submit AEN event configuration on startup
  nvmet: mask pending AENs
  nvmet: add AEN configuration support
  nvmet: implement the changed namespaces log
  nvmet: split log page implementation
  nvmet: add a new nvmet_zero_sgl helper
  nvme.h: add AEN configuration symbols
  nvme.h: add the changed namespace list log
  nvme.h: untangle AEN notice definitions
  nvmet: fix error return code in nvmet_file_ns_enable()
  nvmet: fix a typo in nvmet_file_ns_enable()
  nvme-fabrics: allow internal passthrough command on deleting controllers
  nvme-loop: add support for multiple ports
  nvme-pci: simplify __nvme_submit_cmd
  nvme-pci: Rate limit the nvme timeout warnings
  nvme: allow duplicate controller if prior controller being deleted

block: split the blk-mq case from elevator_init · 131d08e1

Committed by Christoph Hellwig on May 31, 2018

There is almost no shared logic, which leads to a very confusing code
flow.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>
Tested-by: Damien Le Moal <damien.lemoal@wdc.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

block: move sysfs_lock into elevator_init · acddf3b3

Committed by Christoph Hellwig on May 31, 2018

Both callers take it just around the function call, so move it in.
Also remove the now pointless blk_mq_sched_init wrapper.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>
Tested-by: Damien Le Moal <damien.lemoal@wdc.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

block: remove the always unused name argument to elevator_init · ddb72532

Committed by Christoph Hellwig on May 31, 2018

Reported-by: Damien Le Moal <Damien.LeMoal@wdc.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>
Tested-by: Damien Le Moal <damien.lemoal@wdc.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

block: unexport elevator_init/exit · a8a275c9

Committed by Christoph Hellwig on May 31, 2018

These are only used by the block core.  Also move the declarations to
block/blk.h.
Reported-by: Damien Le Moal <Damien.LeMoal@wdc.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>
Tested-by: Damien Le Moal <damien.lemoal@wdc.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

block: move initialization of elevator-related fields to blk_alloc_queue_node · cbf62af3

Committed by Christoph Hellwig on May 31, 2018

No point in doing this in elevator_init.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reported-by: Damien Le Moal <Damien.LeMoal@wdc.com>
Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>
Tested-by: Damien Le Moal <damien.lemoal@wdc.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

nvme: use the changed namespaces list log to clear ns data changed AENs · 30d90964

Committed by Christoph Hellwig on May 25, 2018

Per section 5.2 we need to issue the corresponding log page to clear an
AEN, so for a namespace data changed AEN we need to read the changed
namespace list log.  And once we read that log anyway we might as well
use it to optimize the rescan.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>

nvme: mark nvme_queue_scan static · 50e8d8ee

Committed by Christoph Hellwig on May 25, 2018

And move it toward the top of the file to avoid a forward declaration.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>

nvme: submit AEN event configuration on startup · c0561f82

Committed by Hannes Reinecke on May 22, 2018

We should register for AEN events; some law-abiding targets might
not be sending us AENs otherwise.
Signed-off-by: Hannes Reinecke <hare@suse.com>
[hch: slight cleanups]
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>

nvmet: mask pending AENs · 55fdd6b6

Committed by Christoph Hellwig on May 30, 2018

Per section 5.2 of the NVMe 1.3 spec:

  "When the controller posts a completion queue entry for an outstanding
  Asynchronous Event Request command and thus reports an asynchronous
  event, subsequent events of that event type are automatically masked by
  the controller until the host clears that event. An event is cleared by
  reading the log page associated with that event using the Get Log Page
  command (see section 5.14)."
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
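
The masking rule quoted from the spec can be modelled with one bit per event type. The sketch below is an illustration only, not the nvmet implementation: `struct aen_ctrl`, the two event bits and both helpers are hypothetical names chosen for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical event-type bits; the real encoding lives in the NVMe spec. */
enum {
    AEN_NS_CHANGED    = 1 << 0,
    AEN_FW_ACTIVATION = 1 << 1,
};

/* Hypothetical controller state: one "masked" bit per event type. */
struct aen_ctrl {
    uint32_t masked;
};

/*
 * Returns true if an event of this type may be reported to the host.
 * Reporting it masks further events of the same type.
 */
bool aen_try_post(struct aen_ctrl *c, uint32_t type_bit)
{
    assert(c != NULL);
    if (c->masked & type_bit)
        return false;           /* still masked: host has not read the log */
    c->masked |= type_bit;
    return true;
}

/* Called when the host reads the log page associated with the event type. */
void aen_log_page_read(struct aen_ctrl *c, uint32_t type_bit)
{
    assert(c != NULL);
    c->masked &= ~type_bit;     /* reading the log clears the event */
}
```

Events of one type do not mask events of another type, which is why the state is kept per type and not as a single flag.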

nvmet: add AEN configuration support · c86b8f7b

Committed by Christoph Hellwig on May 30, 2018

AEN configuration via the 'Get Features' and 'Set Features' admin
command is mandatory, so we should be implementing handling for it.
Signed-off-by: Hannes Reinecke <hare@suse.com>
[hch: use WRITE_ONCE, check for invalid values]
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>

nvmet: implement the changed namespaces log · c16734ea

Committed by Christoph Hellwig on May 25, 2018

Just keep a per-controller buffer of changed namespaces and copy it out
in the get log page implementation.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
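
A per-controller buffer of changed namespaces might look roughly like the sketch below. It is a user-space model under stated assumptions, not the nvmet code: the type and function names are hypothetical, the 1024-entry size and the 0xffffffff overflow marker follow the changed namespace list log as defined in NVMe 1.3, and resetting the buffer on read is assumed to be what clears the log.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_CHANGED_NS   1024u        /* entries in the log page (NVMe 1.3) */
#define NS_LIST_OVERFLOW 0xffffffffu  /* first entry when too many ns changed */

/* Hypothetical per-controller state. */
struct ns_log_ctrl {
    uint32_t changed_ns[MAX_CHANGED_NS];
    uint32_t nr_changed;              /* UINT32_MAX marks an overflowed list */
};

/* Record that namespace nsid changed. */
void ns_log_add(struct ns_log_ctrl *c, uint32_t nsid)
{
    assert(c != NULL);
    if (c->nr_changed == UINT32_MAX)
        return;                       /* already overflowed */
    for (uint32_t i = 0; i < c->nr_changed; i++)
        if (c->changed_ns[i] == nsid)
            return;                   /* already recorded */
    if (c->nr_changed < MAX_CHANGED_NS) {
        c->changed_ns[c->nr_changed++] = nsid;
        return;
    }
    /* Too many changes: first entry is the marker, the rest is zero. */
    memset(c->changed_ns, 0, sizeof(c->changed_ns));
    c->changed_ns[0] = NS_LIST_OVERFLOW;
    c->nr_changed = UINT32_MAX;
}

/* Copy the list out zero-padded and reset it, as a Get Log Page might. */
void ns_log_read(struct ns_log_ctrl *c, uint32_t out[MAX_CHANGED_NS])
{
    assert(c != NULL && out != NULL);
    uint32_t n = (c->nr_changed == UINT32_MAX) ? 1 : c->nr_changed;

    memset(out, 0, MAX_CHANGED_NS * sizeof(out[0]));
    memcpy(out, c->changed_ns, n * sizeof(out[0]));
    c->nr_changed = 0;
}
```

A host that reads the overflow marker cannot tell which namespaces changed and would have to fall back to a full rescan.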

nvmet: split log page implementation · 8ab0805f

Committed by Christoph Hellwig on May 22, 2018

Remove the common code to allocate a buffer and copy it into the SGL.
Instead the two no-op implementations just zero the SGL directly, and
the smart log allocates a buffer on its own.  This prepares for the
more elaborate ANA log page.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>

nvmet: add a new nvmet_zero_sgl helper · c7759fff

Committed by Christoph Hellwig on May 22, 2018

Zeroes the SGL in the payload.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>

nvme.h: add AEN configuration symbols · aafd3afe

Committed by Hannes Reinecke on May 25, 2018

Signed-off-by: Hannes Reinecke <hare@suse.com>
[hch: split from a larger patch]
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>

nvme.h: add the changed namespace list log · b3984e06

Committed by Christoph Hellwig on May 25, 2018

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>

nvme.h: untangle AEN notice definitions · 868c2392

Committed by Christoph Hellwig on May 22, 2018

Stop including the event type in the definitions for the notice type.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>

nvmet: fix error return code in nvmet_file_ns_enable() · 1367bc82

Committed by Wei Yongjun on May 31, 2018

Fix to return error code -ENOMEM from the memory alloc fail error
handling case instead of 0, as done elsewhere in this function.

Fixes: d5eff33e ("nvmet: add simple file backed ns support")
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Christoph Hellwig <hch@lst.de>
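
The class of bug fixed here is easy to reproduce outside the kernel: a function jumps to its error label after a failed allocation while the return variable still holds 0. The sketch below shows the corrected pattern in user space. `struct ns_model`, `ns_enable()` and the injectable allocator are hypothetical stand-ins, not the real nvmet_file_ns_enable().

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/* Allocator signature, injectable so that the failure path can be exercised. */
typedef void *(*alloc_fn)(size_t);

/* Hypothetical namespace state with two allocations made on enable. */
struct ns_model {
    void *cache;
    void *pool;
};

/* Test helper: an allocator that always fails. */
void *always_fail(size_t size)
{
    (void)size;
    return NULL;
}

int ns_enable(struct ns_model *ns, alloc_fn alloc)
{
    int ret = 0;

    assert(ns != NULL && alloc != NULL);
    ns->cache = alloc(64);
    if (!ns->cache) {
        ret = -ENOMEM;          /* without this line the caller would see 0 */
        goto err;
    }
    ns->pool = alloc(64);
    if (!ns->pool) {
        ret = -ENOMEM;
        goto err_free_cache;
    }
    return 0;

err_free_cache:
    free(ns->cache);
    ns->cache = NULL;
err:
    return ret;
}
```

Setting `ret` right next to the failing call keeps the error code correct even if the cleanup labels are later reordered.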

nvmet: fix a typo in nvmet_file_ns_enable() · 81cf54e0

Committed by Wei Yongjun on May 31, 2018

Fix a typo in nvmet_file_ns_enable().

Fixes: d5eff33e ("nvmet: add simple file backed ns support")
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Christoph Hellwig <hch@lst.de>

nvme-fabrics: allow internal passthrough command on deleting controllers · cc456b65

Committed by Christoph Hellwig on May 25, 2018

Without this we can't cleanly shut down.

Based on analysis and an earlier patch from Hannes Reinecke.

Fixes: bb06ec31 ("nvme: expand nvmf_check_if_ready checks")
Reported-by: Hannes Reinecke <hare@suse.de>
Tested-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: James Smart <james.smart@broadcom.com>

31 May, 2018 7 commits

block, bfq: prevent soft_rt_next_start from being stuck at infinity · f6c3ca0e

Committed by Davide Sapienza on May 31, 2018

BFQ can deem a bfq_queue as soft real-time only if the queue
- periodically becomes completely idle, i.e., empty and with
  no still-outstanding I/O request;
- after becoming idle, gets new I/O only after a special reference
  time soft_rt_next_start.

In this respect, after commit "block, bfq: consider also past I/O in
soft real-time detection", the value of soft_rt_next_start can never
decrease. This causes a problem with the following special updating
case for soft_rt_next_start: to prevent queues that are not completely
idle from being wrongly detected as soft real-time (when they become
non-empty again), soft_rt_next_start is temporarily set to infinity
for empty queues with still outstanding I/O requests. But, if such an
update is actually performed, then, because of the above commit,
soft_rt_next_start will be stuck at infinity forever, and the queue
will have no more chance to be considered soft real-time.

On slow systems, this problem does cause actual soft real-time
applications to be occasionally not detected as such.

This commit addresses this issue by eliminating the pushing of
soft_rt_next_start to infinity, and by changing the way non-empty
queues are prevented from being wrongly detected as soft
real-time. Simply, a queue that becomes non-empty again can now be
detected as soft real-time only if it has no outstanding I/O request.
Signed-off-by: Davide Sapienza <sapienza.dav@gmail.com>
Signed-off-by: Paolo Valente <paolo.valente@linaro.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
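
The detection rule after this change can be condensed into a two-part predicate: no outstanding I/O, and the new I/O arrives no earlier than soft_rt_next_start. The sketch below is a simplified model with hypothetical names (`struct queue_model`, `may_be_soft_rt()`). It uses plain 64-bit timestamps and ignores the wraparound-safe time comparison and the further conditions the real scheduler applies.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified view of a queue for soft real-time detection. */
struct queue_model {
    unsigned int outstanding;       /* requests dispatched but not completed */
    uint64_t soft_rt_next_start;    /* earliest time new I/O may arrive */
};

/*
 * A queue that becomes non-empty at time 'now' may be treated as soft
 * real-time only if it is completely idle and the reference time has passed.
 */
bool may_be_soft_rt(const struct queue_model *q, uint64_t now)
{
    assert(q != NULL);
    return q->outstanding == 0 && now >= q->soft_rt_next_start;
}
```

Checking the outstanding count directly means soft_rt_next_start never has to be pushed to infinity, so it cannot get stuck there.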

block, bfq: increase weight-raising duration for interactive apps · d450542e

Committed by Davide Sapienza on May 31, 2018

The maximum possible duration of the weight-raising period for
interactive applications is limited to 13 seconds, as this is the time
needed to load the largest application that we considered when tuning
weight raising. Unfortunately, in such an evaluation, we did not
consider the case of very slow virtual machines.

For example, on a QEMU/KVM virtual machine
- running on a slow PC;
- with a virtual disk stacked on a slow low-end 5400rpm HDD;
- serving a heavy I/O workload, such as the sequential reading of
  several files;
mplayer takes 23 seconds to start, if constantly weight-raised.

To address this issue, this commit conservatively sets the upper limit
for weight-raising duration to 25 seconds.
Signed-off-by: Davide Sapienza <sapienza.dav@gmail.com>
Signed-off-by: Paolo Valente <paolo.valente@linaro.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

block, bfq: remove slow-system class · e24f1c24

Committed by Paolo Valente on May 31, 2018

BFQ computes the duration of weight raising for interactive
applications automatically, using some reference parameters. In
particular, BFQ uses the best durations (see comments in the code for
how these durations have been assessed) for two classes of systems:
slow and fast ones. Examples of slow systems are old phones or systems
using micro HDDs. Fast systems are all the remaining ones. Using these
parameters, BFQ computes the actual duration of the weight raising,
for the system at hand, as a function of the relative speed of the
system w.r.t. the speed of a reference system, belonging to the same
class of systems as the system at hand.

This slow vs fast differentiation proved to be useful in the past, but
happens to have little meaning with current hardware. Even worse, it
does cause problems in virtual systems, where the speed of the system
can vary frequently, and so widely as to confuse the class-detection
mechanism and, as we have verified experimentally, cause BFQ to
compute nonsensical weight-raising durations.

This commit addresses this issue by removing the slow class and the
class-detection mechanism.
Signed-off-by: Paolo Valente <paolo.valente@linaro.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

block, bfq: add description of weight-raising heuristics · 4029eef1

Committed by Paolo Valente on May 31, 2018

A description of how weight raising works is missing in BFQ
sources. In addition, the code for handling weight raising is
scattered across a few functions. This makes it rather hard to
understand the mechanism and its rationale. This commit adds such a
description at the beginning of the main source file.
Signed-off-by: Paolo Valente <paolo.valente@linaro.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

block, bfq: remove the removal of 'next' rq in bfq_requests_merged · ac857e0d

Committed by Filippo Muzzini on May 31, 2018

Since bfq_finish_request() is always called on the request 'next',
after bfq_requests_merged() is finished, and bfq_finish_request()
removes 'next' from its bfq_queue if needed, it isn't necessary to do
such a removal in advance in bfq_requests_merged().

This commit removes such a useless 'next' removal.
Signed-off-by: Filippo Muzzini <filippo.muzzini@outlook.it>
Signed-off-by: Paolo Valente <paolo.valente@linaro.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

block, bfq: remove wrong check in bfq_requests_merged · 8abfa4d6

Committed by Paolo Valente on May 31, 2018

The request rq passed to the function bfq_requests_merged is always in
a bfq_queue, so the check !RB_EMPTY_NODE(&rq->rb_node) at the
beginning of bfq_requests_merged always succeeds, and the control
flow systematically skips to the end of the function.  This implies
that the body of the function is never executed, i.e., the
repositioning of rq is never performed.

Conversely, a check is missing in the body of the function:
'next' must be removed only if it is inside a bfq_queue.

This commit removes the wrong check on rq, and adds the missing check
on 'next'. In addition, this commit adds comments on
bfq_requests_merged.
Signed-off-by: Filippo Muzzini <filippo.muzzini@outlook.it>
Signed-off-by: Paolo Valente <paolo.valente@linaro.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

block, bfq: remove wrong lock in bfq_requests_merged · a12bffeb

Committed by Filippo Muzzini on May 31, 2018

In bfq_requests_merged(), there is a deadlock because the lock on
bfqq->bfqd->lock is held by the calling function, but the code of
this function tries to grab the lock again.

This deadlock is currently hidden by another bug (fixed by the next commit
for this source file), which causes the body of bfq_requests_merged()
never to be executed.

This commit removes the deadlock by removing the lock/unlock pair.
Signed-off-by: Filippo Muzzini <filippo.muzzini@outlook.it>
Signed-off-by: Paolo Valente <paolo.valente@linaro.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
