提交 · ece0728037b15f4d31198f12b359104bcb5db4c8 · openeuler / raspberrypi-kernel

16 5月, 2017 3 次提交

dm rq: add a missing break to map_request · ece07280

由 Christoph Hellwig 提交于 5月 15, 2017

We don't want to bug when receiving a DM_MAPIO_KILL value..

Fixes: 412445ac ("dm: introduce a new DM_MAPIO_KILL return value")
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NMike Snitzer <snitzer@redhat.com>

ece07280

dm space map disk: fix some book keeping in the disk space map · 0377a07c

由 Joe Thornber 提交于 5月 15, 2017

When decrementing the reference count for a block, the free count wasn't
being updated if the reference count went to zero.

Cc: stable@vger.kernel.org
Signed-off-by: NJoe Thornber <ejt@redhat.com>
Signed-off-by: NMike Snitzer <snitzer@redhat.com>

0377a07c

dm thin metadata: call precommit before saving the roots · 91bcdb92

由 Joe Thornber 提交于 5月 15, 2017

These calls were the wrong way round in __write_initial_superblock.

Cc: stable@vger.kernel.org
Signed-off-by: NJoe Thornber <ejt@redhat.com>
Signed-off-by: NMike Snitzer <snitzer@redhat.com>

91bcdb92

15 5月, 2017 8 次提交

dm cache policy smq: don't do any writebacks unless IDLE · 2e633095

由 Joe Thornber 提交于 5月 11, 2017

If there are no clean blocks to be demoted the writeback will be
triggered at that point.  Preemptively writing back can hurt high IO
load scenarios.
Signed-off-by: NJoe Thornber <ejt@redhat.com>
Signed-off-by: NMike Snitzer <snitzer@redhat.com>

2e633095

dm cache: simplify the IDLE vs BUSY state calculation · 49b7f768

由 Joe Thornber 提交于 5月 11, 2017

Drop the MODERATE state since it wasn't buying us much.

Also, in check_migrations(), prepare for the next commit ("dm cache
policy smq: don't do any writebacks unless IDLE") by deferring to the
policy to make the final decision on whether writebacks can be
serviced.
Signed-off-by: NJoe Thornber <ejt@redhat.com>
Signed-off-by: NMike Snitzer <snitzer@redhat.com>

49b7f768

dm cache: track all IO to the cache rather than just the origin device's IO · 701e03e4

由 Joe Thornber 提交于 5月 11, 2017

IO tracking used to throttle writebacks when the origin device is busy.

Even if all the IO is going to the fast device, writebacks can
significantly degrade performance.  So track all IO to gauge whether the
cache is busy or not.

Otherwise, synthetic IO tests (e.g. fio) that might send all IO to the
fast device wouldn't cause writebacks to get throttled.
Signed-off-by: NJoe Thornber <ejt@redhat.com>
Signed-off-by: NMike Snitzer <snitzer@redhat.com>

701e03e4

dm cache policy smq: stop preemptively demoting blocks · 6cf4cc8f

由 Joe Thornber 提交于 5月 11, 2017

It causes a lot of churn if the working set's size is close to the fast
device's size.
Signed-off-by: NJoe Thornber <ejt@redhat.com>
Signed-off-by: NMike Snitzer <snitzer@redhat.com>

6cf4cc8f

dm cache policy smq: put newly promoted entries at the top of the multiqueue · 4d44ec5a

由 Joe Thornber 提交于 5月 11, 2017

This stops entries bouncing in and out of the cache quickly.
Signed-off-by: NJoe Thornber <ejt@redhat.com>
Signed-off-by: NMike Snitzer <snitzer@redhat.com>

4d44ec5a

dm cache policy smq: be more aggressive about triggering a writeback · 78c45607

由 Joe Thornber 提交于 5月 11, 2017

If there are no clean entries to demote we really want to writeback
immediately.
Signed-off-by: NJoe Thornber <ejt@redhat.com>
Signed-off-by: NMike Snitzer <snitzer@redhat.com>

78c45607

dm cache policy smq: only demote entries in bottom half of the clean multiqueue · a8cd1eba

由 Joe Thornber 提交于 5月 11, 2017

Heavy IO load may mean there are very few clean blocks in the cache, and
we risk demoting entries that get hit a lot.
Signed-off-by: NJoe Thornber <ejt@redhat.com>
Signed-off-by: NMike Snitzer <snitzer@redhat.com>

a8cd1eba

dm cache: fix incorrect 'idle_time' reset in IO tracker · 072792dc

由 Joe Thornber 提交于 5月 11, 2017

Some bios have no payload (eg, a FLUSH), don't reset the idle_time when
these come in.
Signed-off-by: NJoe Thornber <ejt@redhat.com>
Signed-off-by: NMike Snitzer <snitzer@redhat.com>

072792dc

09 5月, 2017 5 次提交

mm, vmalloc: use __GFP_HIGHMEM implicitly · 19809c2d

由 Michal Hocko 提交于 5月 08, 2017

__vmalloc* allows users to provide gfp flags for the underlying
allocation.  This API is quite popular

  $ git grep "=[[:space:]]__vmalloc\|return[[:space:]]*__vmalloc" | wc -l
  77

The only problem is that many people are not aware that they really want
to give __GFP_HIGHMEM along with other flags because there is really no
reason to consume precious lowmemory on CONFIG_HIGHMEM systems for pages
which are mapped to the kernel vmalloc space.  About half of users don't
use this flag, though.  This signals that we make the API unnecessarily
too complex.

This patch simply uses __GFP_HIGHMEM implicitly when allocating pages to
be mapped to the vmalloc space.  Current users which add __GFP_HIGHMEM
are simplified and drop the flag.

Link: http://lkml.kernel.org/r/20170307141020.29107-1-mhocko@kernel.orgSigned-off-by: NMichal Hocko <mhocko@suse.com>
Reviewed-by: NMatthew Wilcox <mawilcox@microsoft.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: David Rientjes <rientjes@google.com>
Cc: Cristopher Lameter <cl@linux.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

19809c2d

drivers/md/bcache/super.c: use kvmalloc · bc4e54f6

由 Michal Hocko 提交于 5月 08, 2017

bcache_device_init uses kmalloc for small requests and vmalloc for those
which are larger than 64 pages. This alone is a strange criterion.
Moreover kmalloc can fallback to vmalloc on the failure. Let's simply
use kvmalloc instead as it knows how to handle the fallback properly

Link: http://lkml.kernel.org/r/20170306103327.2766-5-mhocko@kernel.orgSigned-off-by: NMichal Hocko <mhocko@suse.com>
Acked-by: NVlastimil Babka <vbabka@suse.cz>
Cc: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

bc4e54f6

drivers/md/dm-ioctl.c: use kvmalloc rather than opencoded variant · d224e938

由 Michal Hocko 提交于 5月 08, 2017

copy_params uses kmalloc with vmalloc fallback.  We already have a
helper for that - kvmalloc.  This caller requires GFP_NOIO semantic so
it hasn't been converted with many others by previous patches.  All we
need to achieve this semantic is to use the scope
memalloc_noio_{save,restore} around kvmalloc.

Link: http://lkml.kernel.org/r/20170306103327.2766-4-mhocko@kernel.orgSigned-off-by: NMichal Hocko <mhocko@suse.com>
Acked-by: NVlastimil Babka <vbabka@suse.cz>
Cc: Mikulas Patocka <mpatocka@redhat.com>
Cc: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

d224e938

treewide: use kv[mz]alloc* rather than opencoded variants · 752ade68

由 Michal Hocko 提交于 5月 08, 2017

There are many code paths opencoding kvmalloc.  Let's use the helper
instead.  The main difference to kvmalloc is that those users are
usually not considering all the aspects of the memory allocator.  E.g.
allocation requests <= 32kB (with 4kB pages) are basically never failing
and invoke OOM killer to satisfy the allocation.  This sounds too
disruptive for something that has a reasonable fallback - the vmalloc.
On the other hand those requests might fallback to vmalloc even when the
memory allocator would succeed after several more reclaim/compaction
attempts previously.  There is no guarantee something like that happens
though.

This patch converts many of those places to kv[mz]alloc* helpers because
they are more conservative.

Link: http://lkml.kernel.org/r/20170306103327.2766-2-mhocko@kernel.orgSigned-off-by: NMichal Hocko <mhocko@suse.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> # Xen bits
Acked-by: NKees Cook <keescook@chromium.org>
Acked-by: NVlastimil Babka <vbabka@suse.cz>
Acked-by: Andreas Dilger <andreas.dilger@intel.com> # Lustre
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com> # KVM/s390
Acked-by: Dan Williams <dan.j.williams@intel.com> # nvdim
Acked-by: David Sterba <dsterba@suse.com> # btrfs
Acked-by: Ilya Dryomov <idryomov@gmail.com> # Ceph
Acked-by: Tariq Toukan <tariqt@mellanox.com> # mlx4
Acked-by: Leon Romanovsky <leonro@mellanox.com> # mlx5
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Anton Vorontsov <anton@enomsg.org>
Cc: Colin Cross <ccross@android.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: Ben Skeggs <bskeggs@redhat.com>
Cc: Kent Overstreet <kent.overstreet@gmail.com>
Cc: Santosh Raspatur <santosh@chelsio.com>
Cc: Hariprasad S <hariprasad@chelsio.com>
Cc: Yishai Hadas <yishaih@mellanox.com>
Cc: Oleg Drokin <oleg.drokin@intel.com>
Cc: "Yan, Zheng" <zyan@redhat.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: David Miller <davem@davemloft.net>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

752ade68

mm: introduce kv[mz]alloc helpers · a7c3e901

由 Michal Hocko 提交于 5月 08, 2017

Patch series "kvmalloc", v5.

There are many open coded kmalloc with vmalloc fallback instances in the
tree.  Most of them are not careful enough or simply do not care about
the underlying semantic of the kmalloc/page allocator which means that
a) some vmalloc fallbacks are basically unreachable because the kmalloc
part will keep retrying until it succeeds b) the page allocator can
invoke a really disruptive steps like the OOM killer to move forward
which doesn't sound appropriate when we consider that the vmalloc
fallback is available.

As it can be seen implementing kvmalloc requires quite an intimate
knowledge if the page allocator and the memory reclaim internals which
strongly suggests that a helper should be implemented in the memory
subsystem proper.

Most callers, I could find, have been converted to use the helper
instead.  This is patch 6.  There are some more relying on __GFP_REPEAT
in the networking stack which I have converted as well and Eric Dumazet
was not opposed [2] to convert them as well.

[1] http://lkml.kernel.org/r/20170130094940.13546-1-mhocko@kernel.org
[2] http://lkml.kernel.org/r/1485273626.16328.301.camel@edumazet-glaptop3.roam.corp.google.com

This patch (of 9):

Using kmalloc with the vmalloc fallback for larger allocations is a
common pattern in the kernel code.  Yet we do not have any common helper
for that and so users have invented their own helpers.  Some of them are
really creative when doing so.  Let's just add kv[mz]alloc and make sure
it is implemented properly.  This implementation makes sure to not make
a large memory pressure for > PAGE_SZE requests (__GFP_NORETRY) and also
to not warn about allocation failures.  This also rules out the OOM
killer as the vmalloc is a more approapriate fallback than a disruptive
user visible action.

This patch also changes some existing users and removes helpers which
are specific for them.  In some cases this is not possible (e.g.
ext4_kvmalloc, libcfs_kvzalloc) because those seems to be broken and
require GFP_NO{FS,IO} context which is not vmalloc compatible in general
(note that the page table allocation is GFP_KERNEL).  Those need to be
fixed separately.

While we are at it, document that __vmalloc{_node} about unsupported gfp
mask because there seems to be a lot of confusion out there.
kvmalloc_node will warn about GFP_KERNEL incompatible (which are not
superset) flags to catch new abusers.  Existing ones would have to die
slowly.

[sfr@canb.auug.org.au: f2fs fixup]
  Link: http://lkml.kernel.org/r/20170320163735.332e64b7@canb.auug.org.au
Link: http://lkml.kernel.org/r/20170306103032.2540-2-mhocko@kernel.orgSigned-off-by: NMichal Hocko <mhocko@suse.com>
Signed-off-by: NStephen Rothwell <sfr@canb.auug.org.au>
Reviewed-by: Andreas Dilger <adilger@dilger.ca>	[ext4 part]
Acked-by: NVlastimil Babka <vbabka@suse.cz>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: David Miller <davem@davemloft.net>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

a7c3e901

06 5月, 2017 1 次提交

dm cache metadata: fail operations if fail_io mode has been established · 10add84e

由 Mike Snitzer 提交于 5月 05, 2017

Otherwise it is possible to trigger crashes due to the metadata being
inaccessible yet these methods don't safely account for that possibility
without these checks.

Cc: stable@vger.kernel.org
Reported-by: NMikulas Patocka <mpatocka@redhat.com>
Signed-off-by: NMike Snitzer <snitzer@redhat.com>

10add84e

04 5月, 2017 3 次提交

M
dm integrity: improve the Kconfig help text for DM_INTEGRITY · 7ab84db6
由 Mike Snitzer 提交于 5月 04, 2017
```
Signed-off-by: NMike Snitzer <snitzer@redhat.com>
Signed-off-by: NMilan Broz <gmazyland@gmail.com>
```
7ab84db6

dm cache policy smq: cleanup free_target_met() and clean_target_met() · 97dfb203

由 Mike Snitzer 提交于 5月 04, 2017

Depending on the passed @idle arg, there may be no need to calculate
'nr_free' or 'nr_clean' respectively in free_target_met() and
clean_target_met().
Signed-off-by: NMike Snitzer <snitzer@redhat.com>

97dfb203

dm cache policy smq: allow demotions to happen even during continuous IO · ce1d64e8

由 Joe Thornber 提交于 5月 03, 2017

dm-cache's smq policy tries hard to do it's work during the idle periods
when there is no IO. But if there are no idle periods (eg, a long fio
run) we still need to allow some demotions and promotions to occur.

To achieve this, pass @idle=true to queue_promotion()'s
free_target_met() call so that free_target_met() doesn't short-circuit
the possibility of demotion simply because it isn't an idle period.

Fixes: b29d4986 ("dm cache: significant rework to leverage dm-bio-prison-v2")
Reported-by: NJohn Harrigan <jharriga@redhat.com>
Signed-off-by: NJoe Thornber <ejt@redhat.com>
Signed-off-by: NMike Snitzer <snitzer@redhat.com>

ce1d64e8

02 5月, 2017 7 次提交

blk-mq: update ->init_request and ->exit_request prototypes · d6296d39

由 Christoph Hellwig 提交于 5月 01, 2017

Remove the request_idx parameter, which can't be used safely now that we
support I/O schedulers with blk-mq.  Except for a superflous check in
mtip32xx it was unused anyway.

Also pass the tag_set instead of just the driver data - this allows drivers
to avoid some code duplication in a follow on cleanup.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NJens Axboe <axboe@fb.com>

d6296d39

dm: introduce a new DM_MAPIO_KILL return value · 412445ac

由 Christoph Hellwig 提交于 4月 26, 2017

This untangles the DM_MAPIO_* values returned from ->clone_and_map_rq
from the error codes used by the block layer.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NMike Snitzer <snitzer@redhat.com>

412445ac

dm rq: change ->rq_end_io calling conventions · 7ed8578a

由 Christoph Hellwig 提交于 4月 26, 2017

Instead of returning either a DM_ENDIO_* constant or an error code, add
a new DM_ENDIO_DONE value that means keep errno as is.  This allows us
to easily keep the existing error code in case where we can't push back,
and it also preparares for the new block level status codes with strict
type checking.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NMike Snitzer <snitzer@redhat.com>

7ed8578a

dm mpath: merge do_end_io into multipath_end_io · b79f10ee

由 Christoph Hellwig 提交于 4月 26, 2017

This simplifies the I/O completion path a bit.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NMike Snitzer <snitzer@redhat.com>

b79f10ee

md/raid10: skip spare disk as 'first' disk · b506335e

由 Shaohua Li 提交于 5月 01, 2017

Commit 6f287ca6(md/raid10: reset the 'first' at the end of loop) ignores
a case in reshape, the first rdev could be a spare disk, which shouldn't
be accounted as the first disk since it doesn't include the offset info.

Fix: 6f287ca6(md/raid10: reset the 'first' at the end of loop)
Cc: Guoqing Jiang <gqjiang@suse.com>
Cc: NeilBrown <neilb@suse.com>
Signed-off-by: NShaohua Li <shli@fb.com>

b506335e

dm bufio: check new buffer allocation watermark every 30 seconds · 390020ad

由 Mikulas Patocka 提交于 4月 30, 2017

dm-bufio checks a watermark when it allocates a new buffer in
__bufio_new().  However, it doesn't check the watermark when the user
changes /sys/module/dm_bufio/parameters/max_cache_size_bytes.

This may result in a problem - if the watermark is high enough so that
all possible buffers are allocated and if the user lowers the value of
"max_cache_size_bytes", the watermark will never be checked against the
new value because no new buffer would be allocated.

To fix this, change __evict_old_buffers() so that it checks the
watermark.  __evict_old_buffers() is called every 30 seconds, so if the
user reduces "max_cache_size_bytes", dm-bufio will react to this change
within 30 seconds and decrease memory consumption.

Depends-on: 1b0fb5a5 ("dm bufio: avoid a possible ABBA deadlock")
Cc: stable@vger.kernel.org
Signed-off-by: NMikulas Patocka <mpatocka@redhat.com>
Signed-off-by: NMike Snitzer <snitzer@redhat.com>

390020ad

dm bufio: avoid a possible ABBA deadlock · 1b0fb5a5

由 Mikulas Patocka 提交于 4月 30, 2017

__get_memory_limit() tests if dm_bufio_cache_size changed and calls
__cache_size_refresh() if it did.  It takes dm_bufio_clients_lock while
it already holds the client lock.  However, lock ordering is violated
because in cleanup_old_buffers() dm_bufio_clients_lock is taken before
the client lock.

This results in a possible deadlock and lockdep engine warning.

Fix this deadlock by changing mutex_lock() to mutex_trylock().  If the
lock can't be taken, it will be re-checked next time when a new buffer
is allocated.

Also add "unlikely" to the if condition, so that the optimizer assumes
that the condition is false.

Cc: stable@vger.kernel.org
Signed-off-by: NMikulas Patocka <mpatocka@redhat.com>
Signed-off-by: NMike Snitzer <snitzer@redhat.com>

1b0fb5a5

28 4月, 2017 13 次提交

dm mpath: make it easier to detect unintended I/O request flushes · 86331f39

由 Bart Van Assche 提交于 4月 27, 2017

I/O errors triggered by multipathd incorrectly not enabling the no-flush
flag for DM_DEVICE_SUSPEND or DM_DEVICE_RESUME are hard to debug.  Add
more logging to make it easier to debug this.
Signed-off-by: NBart Van Assche <bart.vanassche@sandisk.com>
Reviewed-by: NHannes Reinecke <hare@suse.com>
Signed-off-by: NMike Snitzer <snitzer@redhat.com>

86331f39

dm mpath: cleanup QUEUE_IF_NO_PATH bit manipulation by introducing assign_bit() · 9a8ac3ae

由 Bart Van Assche 提交于 4月 27, 2017

No functional change but makes the code easier to read.
Signed-off-by: NBart Van Assche <bart.vanassche@sandisk.com>
Reviewed-by: NHannes Reinecke <hare@suse.com>
Signed-off-by: NMike Snitzer <snitzer@redhat.com>

9a8ac3ae

dm mpath: micro-optimize the hot path relative to MPATHF_QUEUE_IF_NO_PATH · ca5beb76

由 Bart Van Assche 提交于 4月 27, 2017

Instead of checking MPATHF_QUEUE_IF_NO_PATH,
MPATHF_SAVED_QUEUE_IF_NO_PATH and the no_flush flag to decide whether
or not to push back a request (or bio) if there are no paths available,
only clear MPATHF_QUEUE_IF_NO_PATH in queue_if_no_path() if no_flush has
not been set. The result is that only a single bit has to be tested in
the hot path to decide whether or not a request must be pushed back and
also that m->lock does not have to be taken in the hot path.
Signed-off-by: NBart Van Assche <bart.vanassche@sandisk.com>
Reviewed-by: NHannes Reinecke <hare@suse.com>
Signed-off-by: NMike Snitzer <snitzer@redhat.com>

ca5beb76

dm: introduce enum dm_queue_mode to cleanup related code · 7e0d574f

由 Bart Van Assche 提交于 4月 27, 2017

Introduce an enumeration type for the queue mode.  This patch does
not change any functionality but makes the DM code easier to read.
Signed-off-by: NBart Van Assche <bart.vanassche@sandisk.com>
Reviewed-by: NHannes Reinecke <hare@suse.com>
Signed-off-by: NMike Snitzer <snitzer@redhat.com>

7e0d574f

dm mpath: verify __pg_init_all_paths locking assumptions at runtime · b194679f

由 Bart Van Assche 提交于 4月 27, 2017

Verify at runtime that __pg_init_all_paths() is called with
multipath.lock held if lockdep is enabled.
Signed-off-by: NBart Van Assche <bart.vanassche@sandisk.com>
Reviewed-by: NHannes Reinecke <hare@suse.com>
Signed-off-by: NMike Snitzer <snitzer@redhat.com>

b194679f

dm: verify suspend_locking assumptions at runtime · 1ea0654e

由 Bart Van Assche 提交于 4月 27, 2017

Ensure that the assumptions about the caller holding suspend_lock
are checked at runtime if lockdep is enabled.
Signed-off-by: NBart Van Assche <bart.vanassche@sandisk.com>
Reviewed-by: NHannes Reinecke <hare@suse.com>
Signed-off-by: NMike Snitzer <snitzer@redhat.com>

1ea0654e

dm block manager: remove an unused argument from dm_block_manager_create() · 73cbca6a

由 Bart Van Assche 提交于 4月 27, 2017

The 'cache_size' argument of dm_block_manager_create() has never been
used.  Remove it along with the definitions of the constants passed as
the 'cache_size' argument.
Signed-off-by: NBart Van Assche <bart.vanassche@sandisk.com>
Reviewed-by: NHannes Reinecke <hare@suse.com>
Signed-off-by: NMike Snitzer <snitzer@redhat.com>

73cbca6a

dm rq: check blk_mq_register_dev() return value in dm_mq_init_request_queue() · 23a60124

由 Bart Van Assche 提交于 4月 27, 2017

Otherwise the request-based DM blk-mq request_queue will be put into
service without being properly exported via sysfs.

Cc: stable@vger.kernel.org
Signed-off-by: NBart Van Assche <bart.vanassche@sandisk.com>
Reviewed-by: NHannes Reinecke <hare@suse.com>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: NMike Snitzer <snitzer@redhat.com>

23a60124

dm mpath: delay requeuing while path initialization is in progress · c1d7ecf7

由 Bart Van Assche 提交于 4月 27, 2017

Requeuing a request immediately while path initialization is ongoing
causes high CPU usage, something that is undesired.  Hence delay
requeuing while path initialization is in progress.
Signed-off-by: NBart Van Assche <bart.vanassche@sandisk.com>
Reviewed-by: NHannes Reinecke <hare@suse.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: <stable@vger.kernel.org>
Signed-off-by: NMike Snitzer <snitzer@redhat.com>

c1d7ecf7

dm mpath: avoid that path removal can trigger an infinite loop · 7083abbb

由 Bart Van Assche 提交于 4月 27, 2017

If blk_get_request() fails, check whether the failure is due to a path
being removed.  If that is the case, fail the path by triggering a call
to fail_path().  This avoids that the following scenario can be
encountered while removing paths:
* CPU usage of a kworker thread jumps to 100%.
* Removing the DM device becomes impossible.

Delay requeueing if blk_get_request() returns -EBUSY or -EWOULDBLOCK,
and the queue is not dying, because in these cases immediate requeuing
is inappropriate.
Signed-off-by: NBart Van Assche <bart.vanassche@sandisk.com>
Cc: Hannes Reinecke <hare@suse.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: <stable@vger.kernel.org>
Signed-off-by: NMike Snitzer <snitzer@redhat.com>

7083abbb

md/raid1: Use a new variable to count flighting sync requests · 43ac9b84

由 Xiao Ni 提交于 4月 27, 2017

In new barrier codes, raise_barrier waits if conf->nr_pending[idx] is not zero.
After all the conditions are true, the resync request can go on be handled. But
it adds conf->nr_pending[idx] again. The next resync request hit the same bucket
idx need to wait the resync request which is submitted before. The performance
of resync/recovery is degraded.
So we should use a new variable to count sync requests which are in flight.

I did a simple test:
1. Without the patch, create a raid1 with two disks. The resync speed:
Device: rrqm/s wrqm/s r/s w/s rMB/s wMB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
sdb 0.00 0.00 166.00 0.00 10.38 0.00 128.00 0.03 0.20 0.20 0.00 0.19 3.20
sdc 0.00 0.00 0.00 166.00 0.00 10.38 128.00 0.96 5.77 0.00 5.77 5.75 95.50
2. With the patch, the result is:
sdb 2214.00 0.00 766.00 0.00 185.69 0.00 496.46 2.80 3.66 3.66 0.00 1.03 79.10
sdc 0.00 2205.00 0.00 769.00 0.00 186.44 496.52 5.25 6.84 0.00 6.84 1.30 100.10
Suggested-by: NShaohua Li <shli@kernel.org>
Signed-off-by: NXiao Ni <xni@redhat.com>
Acked-by: NColy Li <colyli@suse.de>
Signed-off-by: NShaohua Li <shli@fb.com>

43ac9b84

dm mpath: split and rename activate_path() to prepare for its expanded use · 89bfce76

由 Bart Van Assche 提交于 4月 27, 2017

activate_path() is renamed to activate_path_work() which now calls
activate_or_offline_path().  activate_or_offline_path() will be used
by the next commit.
Signed-off-by: NBart Van Assche <bart.vanassche@sandisk.com>
Cc: Hannes Reinecke <hare@suse.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: <stable@vger.kernel.org>
Signed-off-by: NMike Snitzer <snitzer@redhat.com>

89bfce76

dm ioctl: prevent stack leak in dm ioctl call · 4617f564

由 Adrian Salido 提交于 4月 27, 2017

When calling a dm ioctl that doesn't process any data
(IOCTL_FLAGS_NO_PARAMS), the contents of the data field in struct
dm_ioctl are left initialized.  Current code is incorrectly extending
the size of data copied back to user, causing the contents of kernel
stack to be leaked to user.  Fix by only copying contents before data
and allow the functions processing the ioctl to override.

Cc: stable@vger.kernel.org
Signed-off-by: NAdrian Salido <salidoa@google.com>
Reviewed-by: NAlasdair G Kergon <agk@redhat.com>
Signed-off-by: NMike Snitzer <snitzer@redhat.com>

4617f564