提交 · c533f249a166142df4294ec38fa5dcd1903f0400 · openanolis / cloud-kernel

15 9月, 2016 6 次提交

dm rq: simplify dm_old_stop_queue() · c533f249

由 Bart Van Assche 提交于 8月 31, 2016

This patch does not change any functionality.
Signed-off-by: NBart Van Assche <bart.vanassche@sandisk.com>
Signed-off-by: NMike Snitzer <snitzer@redhat.com>

c533f249

dm mpath: check if path's request_queue is dying in activate_path() · f10e06b7

由 Mike Snitzer 提交于 9月 01, 2016

If pg_init_retries is set and a request is queued against a multipath
device with all underlying block device request_queues in the "dying"
state then an infinite loop is triggered because activate_path() never
succeeds and hence never calls pg_init_done().

This change avoids that device removal triggers an infinite loop by
failing the activate_path() which causes the "dying" path to be failed.
Reported-by: NBart Van Assche <bart.vanassche@sandisk.com>
Signed-off-by: NMike Snitzer <snitzer@redhat.com>
Cc: stable@vger.kernel.org

f10e06b7

dm rq: take request_queue lock while clearing QUEUE_FLAG_STOPPED · 9dbeaeab

由 Mike Snitzer 提交于 9月 01, 2016

Every call of queue_flag_clear_unlocked() after block device
initialization has finished is wrong if blk_cleanup_queue() can be
called concurrently.  Convert queue_flag_clear_unlocked() into
queue_flag_clear() and protect it by the block layer queue lock.

Also, factor out dm_mq_start_queue().
Reported-by: NBart Van Assche <bart.vanassche@sandisk.com>
Signed-off-by: NMike Snitzer <snitzer@redhat.com>
Cc: stable@vger.kernel.org

9dbeaeab

dm rq: factor out dm_mq_stop_queue() · 2397a15a

由 Bart Van Assche 提交于 8月 31, 2016

Also, check that the blk-mq request_queue isn't already stopped.
Signed-off-by: NBart Van Assche <bart.vanassche@sandisk.com>
Signed-off-by: NMike Snitzer <snitzer@redhat.com>

2397a15a

dm: mark request_queue dead before destroying the DM device · 3b785fbc

由 Bart Van Assche 提交于 8月 31, 2016

This avoids that new requests are queued while __dm_destroy() is in
progress.
Signed-off-by: NBart Van Assche <bart.vanassche@sandisk.com>
Signed-off-by: NMike Snitzer <snitzer@redhat.com>
Cc: stable@vger.kernel.org

3b785fbc

dm: return correct error code in dm_resume()'s retry loop · 8dc23658

由 Minfei Huang 提交于 9月 06, 2016

dm_resume() will return success (0) rather than -EINVAL if
!dm_suspended_md() upon retry within dm_resume().

Reset the error code at the start of dm_resume()'s retry loop.
Also, remove a useless assignment at the end of dm_resume().

Fixes: ffcc3936 ("dm: enhance internal suspend and resume interface")
Cc: stable@vger.kernel.org # 3.19+
Signed-off-by: NMinfei Huang <mnghuan@gmail.com>
Signed-off-by: NMike Snitzer <snitzer@redhat.com>

8dc23658

14 9月, 2016 1 次提交

block, dm-crypt, btrfs: Introduce bio_flags() · 4382e33a

由 Bart Van Assche 提交于 9月 14, 2016

Introduce the bio_flags() macro. Ensure that the second argument of
bio_set_op_attrs() only contains flags and no operation. This patch
does not change any functionality.
Signed-off-by: NBart Van Assche <bart.vanassche@sandisk.com>
Cc: Mike Christie <mchristi@redhat.com>
Cc: Chris Mason <clm@fb.com> (maintainer:BTRFS FILE SYSTEM)
Cc: Josef Bacik <jbacik@fb.com> (maintainer:BTRFS FILE SYSTEM)
Cc: Mike Snitzer <snitzer@redhat.com>
Cc: Hannes Reinecke <hare@suse.de>
Cc: Damien Le Moal <damien.lemoal@hgst.com>
Reviewed-by: NJohannes Thumshirn <jthumshirn@suse.de>
Reviewed-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NJens Axboe <axboe@fb.com>

4382e33a

25 8月, 2016 2 次提交

dm log: fix unitialized bio operation flags · 9c5a559d

由 Heinz Mauelshagen 提交于 8月 23, 2016

Commit e6047149 ("dm: use bio op accessors") switched DM over to
using bio_set_op_attrs() but didn't take care to initialize
lc->io_req.bi_op_flags in dm-log.c:rw_header().  This caused
rw_header()'s call to dm_io() to make bio->bi_op_flags be uninitialized
in dm-io.c:do_region(), which ultimately resulted in a SCSI BUG() in
sd_init_command().

Also, adjust rw_header() and its callers to use REQ_OP_{READ|WRITE}.

Fixes: e6047149 ("dm: use bio op accessors")
Signed-off-by: NHeinz Mauelshagen <heinzm@redhat.com>
Reviewed-by: NShaun Tancheff <shaun.tancheff@seagate.com>
Signed-off-by: NMike Snitzer <snitzer@redhat.com>

9c5a559d

dm flakey: fix reads to be issued if drop_writes configured · 299f6230

由 Mike Snitzer 提交于 8月 24, 2016

v4.8-rc3 commit 99f3c90d ("dm flakey: error READ bios during the
down_interval") overlooked the 'drop_writes' feature, which is meant to
allow reads to be issued rather than errored, during the down_interval.

Fixes: 99f3c90d ("dm flakey: error READ bios during the down_interval")
Reported-by: NQu Wenruo <quwenruo@cn.fujitsu.com>
Signed-off-by: NMike Snitzer <snitzer@redhat.com>
Cc: stable@vger.kernel.org

299f6230

19 8月, 2016 3 次提交

bcache: pr_err: more meaningful error message when nr_stripes is invalid · 90706094

由 Eric Wheeler 提交于 8月 18, 2016

The original error was thought to be corruption, but was actually caused by:
	make-bcache --data-offset N
where N was in bytes and should have been in sectors.  While userspace
tools should be updated to check --data-offset beyond end of volume,
hopefully this will help others that might not have noticed the units.
Signed-off-by: NEric Wheeler <bcache@linux.ewheeler.net>
Cc: Kent Overstreet <kent.overstreet@gmail.com>

90706094

bcache: RESERVE_PRIO is too small by one when prio_buckets() is a power of two. · acc9cf8c

由 Kent Overstreet 提交于 8月 17, 2016

This patch fixes a cachedev registration-time allocation deadlock.
This can deadlock on boot if your initrd auto-registeres bcache devices:

Allocator thread:
[  720.727614] INFO: task bcache_allocato:3833 blocked for more than 120 seconds.
[  720.732361]  [<ffffffff816eeac7>] schedule+0x37/0x90
[  720.732963]  [<ffffffffa05192b8>] bch_bucket_alloc+0x188/0x360 [bcache]
[  720.733538]  [<ffffffff810e6950>] ? prepare_to_wait_event+0xf0/0xf0
[  720.734137]  [<ffffffffa05302bd>] bch_prio_write+0x19d/0x340 [bcache]
[  720.734715]  [<ffffffffa05190bf>] bch_allocator_thread+0x3ff/0x470 [bcache]
[  720.735311]  [<ffffffff816ee41c>] ? __schedule+0x2dc/0x950
[  720.735884]  [<ffffffffa0518cc0>] ? invalidate_buckets+0x980/0x980 [bcache]

Registration thread:
[  720.710403] INFO: task bash:3531 blocked for more than 120 seconds.
[  720.715226]  [<ffffffff816eeac7>] schedule+0x37/0x90
[  720.715805]  [<ffffffffa05235cd>] __bch_btree_map_nodes+0x12d/0x150 [bcache]
[  720.716409]  [<ffffffffa0522d30>] ? bch_btree_insert_check_key+0x1c0/0x1c0 [bcache]
[  720.717008]  [<ffffffffa05236e4>] bch_btree_insert+0xf4/0x170 [bcache]
[  720.717586]  [<ffffffff810e6950>] ? prepare_to_wait_event+0xf0/0xf0
[  720.718191]  [<ffffffffa0527d9a>] bch_journal_replay+0x14a/0x290 [bcache]
[  720.718766]  [<ffffffff810cc90d>] ? ttwu_do_activate.constprop.94+0x5d/0x70
[  720.719369]  [<ffffffff810cf684>] ? try_to_wake_up+0x1d4/0x350
[  720.719968]  [<ffffffffa05317d0>] run_cache_set+0x580/0x8e0 [bcache]
[  720.720553]  [<ffffffffa053302e>] register_bcache+0xe2e/0x13b0 [bcache]
[  720.721153]  [<ffffffff81354cef>] kobj_attr_store+0xf/0x20
[  720.721730]  [<ffffffff812a2dad>] sysfs_kf_write+0x3d/0x50
[  720.722327]  [<ffffffff812a225a>] kernfs_fop_write+0x12a/0x180
[  720.722904]  [<ffffffff81225177>] __vfs_write+0x37/0x110
[  720.723503]  [<ffffffff81228048>] ? __sb_start_write+0x58/0x110
[  720.724100]  [<ffffffff812cedb3>] ? security_file_permission+0x23/0xa0
[  720.724675]  [<ffffffff812258a9>] vfs_write+0xa9/0x1b0
[  720.725275]  [<ffffffff8102479c>] ? do_audit_syscall_entry+0x6c/0x70
[  720.725849]  [<ffffffff81226755>] SyS_write+0x55/0xd0
[  720.726451]  [<ffffffff8106a390>] ? do_page_fault+0x30/0x80
[  720.727045]  [<ffffffff816f2cae>] system_call_fastpath+0x12/0x71

The fifo code in upstream bcache can't use the last element in the buffer,
which was the cause of the bug: if you asked for a power of two size,
it'd give you a fifo that could hold one less than what you asked for
rather than allocating a buffer twice as big.
Signed-off-by: NKent Overstreet <kent.overstreet@gmail.com>
Tested-by: NEric Wheeler <bcache@linux.ewheeler.net>
Cc: stable@vger.kernel.org

acc9cf8c

bcache: register_bcache(): call blkdev_put() when cache_alloc() fails · d9dc1702

由 Eric Wheeler 提交于 6月 17, 2016

register_cache() is supposed to return an error string on error so that
register_bcache() will will blkdev_put and cleanup other user counters,
but it does not set 'char *err' when cache_alloc() fails (eg, due to
memory pressure) and thus register_bcache() performs no cleanup.

register_bcache() <----------\  <- no jump to err_close, no blkdev_put()
   |                         |
   +->register_cache()       |  <- fails to set char *err
         |                   |
         +->cache_alloc() ---/  <- returns error

This patch sets `char *err` for this failure case so that register_cache()
will cause register_bcache() to correctly jump to err_close and do
cleanup.  This was tested under OOM conditions that triggered the bug.
Signed-off-by: NEric Wheeler <bcache@linux.ewheeler.net>
Cc: Kent Overstreet <kent.overstreet@gmail.com>
Cc: stable@vger.kernel.org

d9dc1702

17 8月, 2016 4 次提交

dm raid: support raid0 with missing metadata devices · 9e7d9367

由 Heinz Mauelshagen 提交于 8月 17, 2016

The raid0 MD personality does not start a raid0 array with any of its
data devices missing.

dm-raid was removing data/metadata device pairs unconditionally if it
failed to read a superblock off the respective metadata device of such
pair, resulting in failure to start arrays with the raid0 personality.

Avoid removing any data/metadata device pairs in case of raid0
(e.g. lvm2 segment type 'raid0_meta') thus allowing MD to start the
array.

Also, avoid region size validation for raid0.
Signed-off-by: NHeinz Mauelshagen <heinzm@redhat.com>
Signed-off-by: NMike Snitzer <snitzer@redhat.com>

9e7d9367

dm raid: enhance attempt_restore_of_faulty_devices() to support more devices · a3c06a38

由 Heinz Mauelshagen 提交于 8月 09, 2016

attempt_restore_of_faulty_devices() is limited to 64 when it should support
the new maximum of 253 when identifying any failed devices. It clears any
revivable devices via an MD personality hot remove and add cylce to allow
for their recovery.

Address by using existing functions to retrieve and update all failed
devices' bitfield members in the dm raid superblocks on all RAID devices
and check for any devices to clear in it.

Whilst on it, don't call attempt_restore_of_faulty_devices() for any MD
personality not providing disk hot add/remove methods (i.e. raid0 now),
because such personalities don't support reviving of failed disks.
Signed-off-by: NHeinz Mauelshagen <heinzm@redhat.com>
Signed-off-by: NMike Snitzer <snitzer@redhat.com>

a3c06a38

dm raid: fix restoring of failed devices regression · 31e10a41

由 Heinz Mauelshagen 提交于 8月 10, 2016

'lvchange --refresh RaidLV' causes a mapped device suspend/resume cycle
aiming at device restore and resync after transient device failures.  This
failed because flag RT_FLAG_RS_RESUMED was always cleared in the suspend path,
thus the device restore wasn't performed in the resume path.

Solve by removing RT_FLAG_RS_RESUMED from the suspend path and resume
unconditionally.  Also, remove superfluous comment from raid_resume().
Signed-off-by: NHeinz Mauelshagen <heinzm@redhat.com>
Signed-off-by: NMike Snitzer <snitzer@redhat.com>

31e10a41

dm raid: fix frozen recovery regression · a4423287

由 Heinz Mauelshagen 提交于 8月 09, 2016

On LVM2 conversions via lvconvert(8), the target keeps mapped devices in
frozen state when requesting RAID devices be resynchronized.  This
applies to e.g. adding legs to a raid1 device or taking over from raid0
to raid4 when the rebuild flag's set on the new raid1 legs or the added
dedicated parity stripe.

Also, fix frozen recovery for reshaping as well.
Signed-off-by: NHeinz Mauelshagen <heinzm@redhat.com>
Signed-off-by: NMike Snitzer <snitzer@redhat.com>

a4423287

15 8月, 2016 2 次提交

dm crypt: increase mempool reserve to better support swapping · 0a83df6c

由 Mikulas Patocka 提交于 7月 15, 2016

Increase mempool size from 16 to 64 entries.  This increase improves
swap on dm-crypt performance.

When swapping to dm-crypt, all available memory is temporarily exhausted
and dm-crypt can only use the mempool reserve.
Signed-off-by: NMikulas Patocka <mpatocka@redhat.com>
Signed-off-by: NMike Snitzer <snitzer@redhat.com>

0a83df6c

dm round robin: do not use this_cpu_ptr() without having preemption disabled · 802934b2

由 Mike Snitzer 提交于 8月 05, 2016

Use local_irq_save() to disable preemption before calling
this_cpu_ptr().
Reported-by: NBenjamin Block <bblock@linux.vnet.ibm.com>
Fixes: b0b477c7 ("dm round robin: use percpu 'repeat_count' and 'current_path'")
Cc: stable@vger.kernel.org # 4.6+
Suggested-by: NJens Axboe <axboe@kernel.dk>
Signed-off-by: NMike Snitzer <snitzer@redhat.com>

802934b2

08 8月, 2016 1 次提交

block: rename bio bi_rw to bi_opf · 1eff9d32

由 Jens Axboe 提交于 8月 05, 2016

Since commit 63a4cc24, bio->bi_rw contains flags in the lower
portion and the op code in the higher portions. This means that
old code that relies on manually setting bi_rw is most likely
going to be broken. Instead of letting that brokeness linger,
rename the member, to force old and out-of-tree code to break
at compile time instead of at runtime.

No intended functional changes in this commit.
Signed-off-by: NJens Axboe <axboe@fb.com>

1eff9d32

04 8月, 2016 2 次提交

dm raid: fix use of wrong status char during resynchronization · 2a034ec1

由 Heinz Mauelshagen 提交于 8月 03, 2016

During a resynchronization, device status char 'a' is output on the raid
status line for every device of a RAID set.  It changes from 'a' to 'A'
(unless device failure) when the resynchronization completes.

Interrupting and restarting a resynchronization, by reloading the DM
table, erroneously lead to status char 'A'.

Fix this by avoiding setting the MD_RECOVERY_REQUESTED flag in
raid_preresume().
Signed-off-by: NHeinz Mauelshagen <heinzm@redhat.com>
Signed-off-by: NMike Snitzer <snitzer@redhat.com>

2a034ec1

dm raid: constructor fails on non-zero incompat_features · b2a4872a

由 Heinz Mauelshagen 提交于 8月 03, 2016

When lvm2 userspace requests a RaidLV repair, it sets the rebuild
constructor flag on the new replacement DataLVs but does not clear the
respective MetaLVs. Hence the superblock that is loaded from such new
MetaLVs may have a non-zero incompat_features member and the constructor
will fail with false-positive on incompat_features.

Solve by initializing the incompat_features member properly.
Signed-off-by: NHeinz Mauelshagen <heinzm@redhat.com>
Signed-off-by: NMike Snitzer <snitzer@redhat.com>

b2a4872a

03 8月, 2016 5 次提交

dm raid: fix processing of max_recovery_rate constructor flag · f15f64d6

由 Heinz Mauelshagen 提交于 7月 27, 2016

__CTR_FLAG_MIN_RECOVERY_RATE was used instead of __CTR_FLAG_MAX_RECOVERY_RATE
thus causing max_recovery_rate to be rejected in case min_recovery_rate
was already set.
Signed-off-by: NHeinz Mauelshagen <heinzm@redhat.com>
Signed-off-by: NMike Snitzer <snitzer@redhat.com>

f15f64d6

dm: set DMF_SUSPENDED* _before_ clearing DMF_NOFLUSH_SUSPENDING · eaf9a736

由 Mike Snitzer 提交于 8月 02, 2016

Otherwise, there is potential for both DMF_SUSPENDED* and
DMF_NOFLUSH_SUSPENDING to not be set during dm_suspend() -- which is
definitely _not_ a valid state.

This fix, in conjuction with "dm rq: fix the starting and stopping of
blk-mq queues", addresses the potential for request-based DM multipath's
__multipath_map() to see !dm_noflush_suspending() during suspend.
Reported-by: NBart Van Assche <bart.vanassche@sandisk.com>
Signed-off-by: NMike Snitzer <snitzer@redhat.com>
Cc: stable@vger.kernel.org

eaf9a736

dm rq: fix the starting and stopping of blk-mq queues · 7d9595d8

由 Mike Snitzer 提交于 8月 02, 2016

Improve dm_stop_queue() to cancel any requeue_work.  Also, have
dm_start_queue() and dm_stop_queue() clear/set the QUEUE_FLAG_STOPPED
for the blk-mq request_queue.

On suspend dm_stop_queue() handles stopping the blk-mq request_queue
BUT: even though the hw_queues are marked BLK_MQ_S_STOPPED at that point
there is still a race that is allowing block/blk-mq.c to call ->queue_rq
against a hctx that it really shouldn't.  Add a check to
dm_mq_queue_rq() that guards against this rarity (albeit _not_
race-free).
Signed-off-by: NMike Snitzer <snitzer@redhat.com>
Cc: stable@vger.kernel.org # must patch dm.c on < 4.8 kernels

7d9595d8

dm mpath: add locking to multipath_resume and must_push_back · 1814f2e3

由 Mike Snitzer 提交于 7月 25, 2016

Multiple flags were being tested without locking.  Protect against
non-atomic bit changes in m->flags by holding m->lock (while testing or
setting the queue_if_no_path related flags).
Signed-off-by: NMike Snitzer <snitzer@redhat.com>

1814f2e3

dm flakey: error READ bios during the down_interval · 99f3c90d

由 Mike Snitzer 提交于 7月 29, 2016

When the corrupt_bio_byte feature was introduced it caused READ bios to
no longer be errored with -EIO during the down_interval.  This had to do
with the complexity of needing to submit READs if the corrupt_bio_byte
feature was used.

Fix it so READ bios are properly errored with -EIO; doing so early in
flakey_map() as long as there isn't a match for the corrupt_bio_byte
feature.

Fixes: a3998799 ("dm flakey: add corrupt_bio_byte feature")
Reported-by: NAkira Hayakawa <ruby.wktk@gmail.com>
Signed-off-by: NMike Snitzer <snitzer@redhat.com>
Cc: stable@vger.kernel.org

99f3c90d

29 7月, 2016 1 次提交

MD: fix null pointer deference · 5d881783

由 Shaohua Li 提交于 7月 28, 2016

The md device might not have personality (for example, ddf raid array). The
issue is introduced by 8430e7e0(md: disconnect device from personality
before trying to remove it)
Reported-by: Nkernel test robot <xiaolong.ye@intel.com>
Signed-off-by: NShaohua Li <shli@fb.com>

5d881783

21 7月, 2016 10 次提交

dm: allow bio-based table to be upgraded to bio-based with DAX support · b5ab4a9b

由 Toshi Kani 提交于 6月 28, 2016

Allow table type DM_TYPE_BIO_BASED to extend with DM_TYPE_DAX_BIO_BASED
since DM_TYPE_DAX_BIO_BASED supports bio-based requests.

This is needed to allow a snapshot of an LV with DAX support to be
removed.  One of the intermediate table reloads that lvm2 does switches
from DM_TYPE_BIO_BASED to DM_TYPE_DAX_BIO_BASED.  No known reason to
disallow this so...
Signed-off-by: NToshi Kani <toshi.kani@hpe.com>
Signed-off-by: NMike Snitzer <snitzer@redhat.com>

b5ab4a9b

dm snap: add fake origin_direct_access · f6e629bd

由 Toshi Kani 提交于 6月 28, 2016

dax-capable mapped-device is marked as DM_TYPE_DAX_BIO_BASED,
which supports both dax and bio-based operations.  dm-snap
needs to work with dax-capable device when bio-based operation
is used.

Add fake origin_direct_access() to origin device so that its
origin device is also marked as DM_TYPE_DAX_BIO_BASED for
dax-capable device.  This allows to extend target's DM table.
dm-snap works normally when bio-based operation is used.

dm-snap does not support dax operation, and mount with dax
option to a target device or snapshot device fails.
Signed-off-by: NToshi Kani <toshi.kani@hpe.com>
Cc: Mike Snitzer <snitzer@redhat.com>
Cc: Alasdair Kergon <agk@redhat.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: NMike Snitzer <snitzer@redhat.com>

f6e629bd

dm stripe: add DAX support · beec25b4

由 Toshi Kani 提交于 6月 24, 2016

Change dm-stripe to implement direct_access function,
stripe_direct_access(), which maps bdev and sector and
calls direct_access function of its physical target device.
Signed-off-by: NToshi Kani <toshi.kani@hpe.com>
Signed-off-by: NMike Snitzer <snitzer@redhat.com>

beec25b4

dm error: add DAX support · f8df1fdf

由 Mike Snitzer 提交于 6月 24, 2016

Allow the error target to replace an existing DAX-enabled target.
Signed-off-by: NMike Snitzer <snitzer@redhat.com>

f8df1fdf

dm linear: add DAX support · 84b22f83

由 Toshi Kani 提交于 6月 22, 2016

Change dm-linear to implement direct_access function,
linear_direct_access(), which maps sector and calls direct_access
function of its physical target device.
Signed-off-by: NToshi Kani <toshi.kani@hpe.com>
Signed-off-by: NMike Snitzer <snitzer@redhat.com>

84b22f83

dm: add infrastructure for DAX support · 545ed20e

由 Toshi Kani 提交于 6月 22, 2016

Change mapped device to implement direct_access function,
dm_blk_direct_access(), which calls a target direct_access function.
'struct target_type' is extended to have target direct_access interface.
This function limits direct accessible size to the dm_target's limit
with max_io_len().

Add dm_table_supports_dax() to iterate all targets and associated block
devices to check for DAX support.  To add DAX support to a DM target the
target must only implement the direct_access function.

Add a new dm type, DM_TYPE_DAX_BIO_BASED, which indicates that mapped
device supports DAX and is bio based.  This new type is used to assure
that all target devices have DAX support and remain that way after
QUEUE_FLAG_DAX is set in mapped device.

At initial table load, QUEUE_FLAG_DAX is set to mapped device when setting
DM_TYPE_DAX_BIO_BASED to the type.  Any subsequent table load to the
mapped device must have the same type, or else it fails per the check in
table_load().
Signed-off-by: NToshi Kani <toshi.kani@hpe.com>
Signed-off-by: NMike Snitzer <snitzer@redhat.com>

545ed20e

block: simplify and cleanup bvec pool handling · ed996a52

由 Christoph Hellwig 提交于 7月 19, 2016

Instead of a flag and an index just make sure an index of 0 means
no need to free the bvec array.  Also move the constants related
to the bvec pools together and use a consistent naming scheme for
them.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Reviewed-by: NJohannes Thumshirn <jthumshirn@suse.de>
Reviewed-by: NMike Christie <mchristi@redhat.com>
Signed-off-by: NJens Axboe <axboe@fb.com>

ed996a52

block: get rid of bio_rw and READA · 70246286

由 Christoph Hellwig 提交于 7月 19, 2016

These two are confusing leftover of the old world order, combining
values of the REQ_OP_ and REQ_ namespaces.  For callers that don't
special case we mostly just replace bi_rw with bio_data_dir or
op_is_write, except for the few cases where a switch over the REQ_OP_
values makes more sense.  Any check for READA is replaced with an
explicit check for REQ_RAHEAD.  Also remove the READA alias for
REQ_RAHEAD.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Reviewed-by: NJohannes Thumshirn <jthumshirn@suse.de>
Reviewed-by: NMike Christie <mchristi@redhat.com>
Signed-off-by: NJens Axboe <axboe@fb.com>

70246286

dm thin: fix a race condition between discarding and provisioning a block · 2a0fbffb

由 Joe Thornber 提交于 7月 01, 2016

The discard passdown was being issued after the block was unmapped,
which meant the block could be reprovisioned whilst the passdown discard
was still in flight.

We can only identify unshared blocks (safe to do a passdown a discard
to) once they're unmapped and their ref count hits zero. Block ref
counts are now used to guard against concurrent allocation of these
blocks that are being discarded. So now we unmap the block, issue
passdown discards, and the immediately increment ref counts for regions
that have been discarded via passed down (this is safe because
allocation occurs within the same thread). We then decrement ref counts
once the passdown discard IO is complete -- signaling these blocks may
now be allocated.

This fixes the potential for corruption that was reported here:
https://www.redhat.com/archives/dm-devel/2016-June/msg00311.htmlReported-by: NDennis Yang <dennisyang@qnap.com>
Signed-off-by: NJoe Thornber <ejt@redhat.com>
Signed-off-by: NMike Snitzer <snitzer@redhat.com>

2a0fbffb

dm btree: fix a bug in dm_btree_find_next_single() · e7e0f730

由 Joe Thornber 提交于 7月 01, 2016

dm_btree_find_next_single() can short-circuit the search for a block
with a return of -ENODATA if all entries are higher than the search key
passed to lower_bound().

This hasn't been a problem because of the way the btree has been used by
DM thinp.  But it must be fixed now in preparation for fixing the race
in DM thinp's handling of simultaneous block discard vs allocation.
Otherwise, once that fix is in place, some of the blocks in a discard
would not be unmapped as expected.
Signed-off-by: NJoe Thornber <ejt@redhat.com>
Signed-off-by: NMike Snitzer <snitzer@redhat.com>

e7e0f730

20 7月, 2016 3 次提交

raid10: improve random reads performance · 0e5313e2

由 Tomasz Majchrzak 提交于 6月 24, 2016

RAID10 random read performance is lower than expected due to excessive spinlock
utilisation which is required mostly for rebuild/resync. Simplify allow_barrier
as it's in IO path and encounters a lot of unnecessary congestion.

As lower_barrier just takes a lock in order to decrement a counter, convert
counter (nr_pending) into atomic variable and remove the spin lock. There is
also a congestion for wake_up (it uses lock internally) so call it only when
it's really needed. As wake_up is not called constantly anymore, ensure process
waiting to raise a barrier is notified when there are no more waiting IOs.
Signed-off-by: NTomasz Majchrzak <tomasz.majchrzak@intel.com>
Signed-off-by: NShaohua Li <shli@fb.com>

0e5313e2

md: add missing sysfs_notify on array_state update · 573275b5

由 Tomasz Majchrzak 提交于 6月 30, 2016

Changeset 6791875e has added early return from a function so there is no
sysfs notification for 'active' and 'clean' state change.
Signed-off-by: NTomasz Majchrzak <tomasz.majchrzak@intel.com>
Signed-off-by: NShaohua Li <shli@fb.com>

573275b5

Fix kernel module refcount handling · 4cb9da7d

由 Alexey Obitotskiy 提交于 6月 23, 2016

md loads raidX modules and increments module refcount each time level
has changed but does not decrement it. You are unable to unload raid0
module after reshape because raid0 reshape changes level to raid4
and back to raid0.
Signed-off-by: NAleksey Obitotskiy <aleksey.obitotskiy@intel.com>
Signed-off-by: NShaohua Li <shli@fb.com>

4cb9da7d

openanolis / cloud-kernel 1 年多 前同步成功

openanolis / cloud-kernel
1 年多前同步成功