Commits · 5d8817833c7609c24da9a92f71c53caa9c1424eb · openeuler / Kernel

29 Jul, 2016 · 1 commit

MD: fix null pointer dereference · 5d881783

Authored by Shaohua Li on Jul 28, 2016

The md device might not have a personality (for example, a ddf raid array). The
issue was introduced by 8430e7e0 ("md: disconnect device from personality
before trying to remove it").

Reported-by: kernel test robot <xiaolong.ye@intel.com>
Signed-off-by: Shaohua Li <shli@fb.com>

20 Jul, 2016 · 4 commits

raid10: improve random reads performance · 0e5313e2

Authored by Tomasz Majchrzak on Jun 24, 2016

RAID10 random read performance is lower than expected due to excessive spinlock
utilisation, which is required mostly for rebuild/resync. Simplify allow_barrier
as it's in the IO path and encounters a lot of unnecessary congestion.

As lower_barrier just takes a lock in order to decrement a counter, convert
the counter (nr_pending) into an atomic variable and remove the spin lock. There
is also contention on wake_up (it uses a lock internally), so call it only when
it's really needed. As wake_up is not called constantly anymore, ensure a process
waiting to raise a barrier is notified when there are no more waiting IOs.

Signed-off-by: Tomasz Majchrzak <tomasz.majchrzak@intel.com>
Signed-off-by: Shaohua Li <shli@fb.com>
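A rough userspace sketch of the idea described in 0e5313e2, using C11 atomics in place of the kernel's atomic_t. The names `struct barrier_state`, `io_start` and `io_done` are hypothetical and do not come from the raid10 code; the sketch only shows the pattern of replacing a lock-protected counter with an atomic one and waking a waiter only when it is actually needed.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the barrier-related fields of a raid10 conf. */
struct barrier_state {
	atomic_int nr_pending;	/* in-flight regular IOs, updated without a lock */
	atomic_int nr_waiting;	/* tasks waiting to raise a barrier */
};

/* Account for a new IO: one atomic increment instead of lock/unlock. */
static void io_start(struct barrier_state *s)
{
	atomic_fetch_add(&s->nr_pending, 1);
}

/*
 * Account for a finished IO. Returns true only when the caller should
 * issue a wakeup: this was the last pending IO and someone is waiting.
 */
static bool io_done(struct barrier_state *s)
{
	int before = atomic_fetch_sub(&s->nr_pending, 1);

	assert(before > 0);	/* every io_done() must pair with an io_start() */
	return before == 1 && atomic_load(&s->nr_waiting) > 0;
}
```

In this model the caller would invoke its wakeup primitive only when `io_done()` returns true, which is the "call it only when it's really needed" part of the commit message.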

md: add missing sysfs_notify on array_state update · 573275b5

Authored by Tomasz Majchrzak on Jun 30, 2016

Changeset 6791875e added an early return from a function, so there is no
sysfs notification for the 'active' and 'clean' state changes.

Signed-off-by: Tomasz Majchrzak <tomasz.majchrzak@intel.com>
Signed-off-by: Shaohua Li <shli@fb.com>

Fix kernel module refcount handling · 4cb9da7d

Authored by Alexey Obitotskiy on Jun 23, 2016

md loads raidX modules and increments the module refcount each time the level
changes, but does not decrement it. You are unable to unload the raid0
module after a reshape because a raid0 reshape changes the level to raid4
and back to raid0.

Signed-off-by: Aleksey Obitotskiy <aleksey.obitotskiy@intel.com>
Signed-off-by: Shaohua Li <shli@fb.com>
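The refcount imbalance described in 4cb9da7d can be modelled with a small sketch. Everything here (`struct pers_module`, `struct array`, `change_level`) is a hypothetical illustration rather than the md code: each level change takes a reference on the new module and must drop the reference on the old one.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of a personality module with a reference count. */
struct pers_module {
	const char *name;
	int refcount;
};

static void module_get(struct pers_module *m)
{
	m->refcount++;
}

static void module_put(struct pers_module *m)
{
	assert(m->refcount > 0);	/* a put without a matching get is a bug */
	m->refcount--;
}

/* Hypothetical array that points at its current personality. */
struct array {
	struct pers_module *pers;
};

/* Switch levels: take a reference on the new module, drop the old one. */
static void change_level(struct array *a, struct pers_module *newp)
{
	struct pers_module *oldp = a->pers;

	module_get(newp);
	a->pers = newp;
	if (oldp)
		module_put(oldp);	/* the decrement the commit message says was missing */
}
```

With the `module_put()` in place, a raid0 to raid4 to raid0 round trip leaves raid0 with a single reference and raid4 with none, so both could be unloaded once the array stops.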

md: use seconds granularity for error logging · 0e3ef49e

Authored by Arnd Bergmann on Jun 17, 2016

The md code stores the exact time of the last error in the
last_read_error variable using a timespec structure. It only
ever uses the seconds portion of that though, so we can
use a scalar for it.

There won't be an overflow in 2038 here, because it already
used monotonic time and 32-bit is enough for that, but I've
decided to use time64_t for consistency in the conversion.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Shaohua Li <shli@fb.com>

14 Jun, 2016 · 19 commits

md: reduce the number of synchronize_rcu() calls when multiple devices fail. · d787be40

Authored by NeilBrown on Jun 02, 2016

Every time a device is removed with ->hot_remove_disk() a synchronize_rcu() call is made,
which can cause a delay of several milliseconds in some cases.
If lots of devices fail at once - as could happen with a large RAID10 where one set
of devices is removed all at once - these delays can add up to be very inconvenient.

As failure is not reversible we can check for that first, setting a
separate flag if it is found, and then call synchronize_rcu() once for
all the flagged devices.  Then the ->hot_remove_disk() function can skip the
synchronize_rcu() step if the flag is set.

Fix build error (Shaohua)

Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: Shaohua Li <shli@fb.com>
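The batching described in d787be40 can be sketched as a two-pass loop. This is a userspace model under stated assumptions: `wait_for_grace_period()` merely counts calls and stands in for synchronize_rcu(), and `struct dev`, `hot_remove` and `remove_failed` are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical device record. */
struct dev {
	bool faulty;
	bool remove_synchronized;	/* a grace period already covers this device */
	bool removed;
};

static int grace_periods;	/* number of expensive waits performed */

/* Stand-in for synchronize_rcu(); here it only counts invocations. */
static void wait_for_grace_period(void)
{
	grace_periods++;
}

/* Per-device removal: skip the wait if the batch pass already did it. */
static void hot_remove(struct dev *d)
{
	assert(!d->removed);
	if (!d->remove_synchronized)
		wait_for_grace_period();
	d->remove_synchronized = false;
	d->removed = true;
}

/* Flag every failed device, wait once for all of them, then remove them. */
static void remove_failed(struct dev *devs, size_t n)
{
	bool any = false;

	for (size_t i = 0; i < n; i++) {
		if (devs[i].faulty && !devs[i].removed) {
			devs[i].remove_synchronized = true;
			any = true;
		}
	}
	if (any)
		wait_for_grace_period();
	for (size_t i = 0; i < n; i++)
		if (devs[i].faulty && !devs[i].removed)
			hot_remove(&devs[i]);
}
```

In this model three failed devices cost one wait instead of three, while a device removed outside the batch path still pays for its own wait.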

md: be extra careful not to take a reference to a Faulty device. · f5b67ae8

Authored by NeilBrown on Jun 02, 2016

It is important that we never increment rdev->nr_pending on a Faulty
device as ->hot_remove_disk() assumes that once the Faulty flag is visible
no code will take a new reference.

Some places take a new reference after only checking In_sync.  This should
be safe as the two are changed together.  However to make the code more
obviously safe, add checks for 'Faulty' as well.

Note: the actual rule is:
  Never increment nr_pending if Faulty is set and Blocked is clear,
  never clear Faulty, and never set Blocked without holding a reference
  through nr_pending.

Fix build error (Shaohua)

Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: Shaohua Li <shli@fb.com>

md/multipath: add rcu protection to rdev access in multipath_status. · 40cf2123

Authored by NeilBrown on Jun 02, 2016

Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: Shaohua Li <shli@fb.com>

md/raid5: add rcu protection to rdev accesses in raid5_status. · 5fd13351

Authored by NeilBrown on Jun 02, 2016

Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: Shaohua Li <shli@fb.com>

md/raid5: add rcu protection to rdev accesses in want_replace · 3f232d6a

Authored by NeilBrown on Jun 02, 2016

Being in the middle of resync is no longer protection against failed
rdevs disappearing.  So add rcu protection.

Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: Shaohua Li <shli@fb.com>

md/raid5: add rcu protection to rdev accesses in handle_failed_sync. · e50d3992

Authored by NeilBrown on Jun 02, 2016

The rdev could be freed while handle_failed_sync is running, so
rcu protection is needed.

Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: Shaohua Li <shli@fb.com>

md/raid1: add rcu protection to rdev in fix_read_error · 707a6a42

Authored by NeilBrown on Jun 02, 2016

Since remove_and_add_spares() was added to hot_remove_disk() it has
been possible for an rdev to be hot-removed while fix_read_error()
was running, so we need to be more careful, and take a reference to
the rdev while performing IO.

Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: Shaohua Li <shli@fb.com>

md/raid1: small code cleanup in end_sync_write · 854abd75

Authored by NeilBrown on Jun 02, 2016

'mirror' is only used to find 'rdev', several times.
So just find 'rdev' once, and use it instead.

Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: Shaohua Li <shli@fb.com>

md/raid1: small cleanup in raid1_end_read/write_request · e5872d58

Authored by NeilBrown on Jun 02, 2016

Both functions use conf->mirrors[mirror].rdev several times, so
improve readability by storing this in a local variable.

Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: Shaohua Li <shli@fb.com>

md/raid10: simplify print_conf a little. · 4056ca51

Authored by NeilBrown on Jun 02, 2016

'tmp' is only ever used to extract 'tmp->rdev', so just use 'rdev' directly.

Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: Shaohua Li <shli@fb.com>

md/raid10: minor code improvement in fix_read_error() · d683c8e0

Authored by NeilBrown on Jun 02, 2016

rdev already holds conf->mirrors[d].rdev, so no need to load it again.

Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: Shaohua Li <shli@fb.com>

md/raid10: add rcu protection to rdev access during reshape. · d094d686

Authored by NeilBrown on Jun 02, 2016

mirrors[].rdev can become NULL at any point unless:
  - a counted reference is held,
  - ->reconfig_mutex is held, or
  - rcu_read_lock() is held

Reshape isn't always suitably careful, as in the past rdev couldn't be
removed during reshape.  It can now, so add protection.

Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: Shaohua Li <shli@fb.com>

md/raid10: add rcu protection to rdev access in raid10_sync_request. · f90145f3

Authored by NeilBrown on Jun 02, 2016

mirrors[].rdev can become NULL at any point unless:
  - a counted reference is held,
  - ->reconfig_mutex is held, or
  - rcu_read_lock() is held

Previously they could not become NULL during a resync/recovery/reshape either.
However when remove_and_add_spares() was added to hot_remove_disk(), that
changed.

So raid10_sync_request didn't previously need to protect rdev access,
but now it does.

Fix missed check (Shaohua)

Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: Shaohua Li <shli@fb.com>

md/raid10: add rcu protection in raid10_status. · d44b0a92

Authored by NeilBrown on Jun 02, 2016

mirrors[].rdev can become NULL at any point unless:
  - a counted reference is held,
  - ->reconfig_mutex is held, or
  - rcu_read_lock() is held

raid10_status holds none of these.  So add rcu_read_lock()
protection.

Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: Shaohua Li <shli@fb.com>

md/raid10: fix refcount imbalance when resyncing an array with a replacement device. · 83f1261f

Authored by NeilBrown on Jun 02, 2016

If you have a raid10 with a replacement device that is resyncing -
e.g. after a crash before the replacement was complete - the write to
the replacement will increment nr_pending on the wrong device, which
will lead to strangeness.

Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: Shaohua Li <shli@fb.com>

md/raid1, raid10: don't recheck "Faulty" flag in read-balance. · 414e6b9a

Authored by NeilBrown on Jun 02, 2016

Re-checking the faulty flag here brings no value.
The comment about "risk" refers to the risk that the device could
be in the process of being removed by ->hot_remove_disk().
However, provided that the ->nr_pending count is incremented inside
an rcu_read_locked() region, there is no risk of that happening.

This is because the rdev pointer (in the personalities array) is set
to NULL before synchronize_rcu(), and ->nr_pending is tested
afterwards.  If the rcu_read_locked region happens before the
synchronize_rcu(), the test will see that nr_pending has been incremented.
If it happens afterwards, the rdev pointer will be NULL so there is nothing
to increment.

Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: Shaohua Li <shli@fb.com>

md: disconnect device from personality before trying to remove it. · 8430e7e0

Authored by NeilBrown on Jun 02, 2016

When the HOT_REMOVE_DISK ioctl is used to remove a device, we
call remove_and_add_spares() which will remove it from the personality
if possible.  This improves the chances that the removal will succeed.

When writing "remove" to dev-XX/state, we don't.  So that can fail more easily.

So add the remove_and_add_spares() into "remove" handling.

Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: Shaohua Li <shli@fb.com>

raid1/raid10: slow down resync if there is non-resync activity pending · 7ac50447

Authored by Tomasz Majchrzak on Jun 13, 2016

A performance drop of mkfs has been observed on RAID10 during resync
since commit 09314799 ("md: remove 'go_faster' option from
->sync_request()"). Resync sends so many IOs that it slows down non-resync
IOs significantly (by a factor of a few). Add a short delay to a resync. The
previous long sleep (1s) has proven unnecessary; even a very short delay
restores performance.

The change is also applied to raid1. The problem has not been observed on
raid1, however it shares barrier code with raid10 so it might be an
issue for some setups too.

Suggested-by: NeilBrown <neilb@suse.com>
Link: http://lkml.kernel.org/r/20160609134555.GA9104@proton.igk.intel.com
Signed-off-by: Tomasz Majchrzak <tomasz.majchrzak@intel.com>
Signed-off-by: Shaohua Li <shli@fb.com>

MD: Update superblock when err == 0 in size_store · 4ba1e788

Authored by Xiao Ni on Jun 12, 2016

This is a simple check before updating the superblock. The superblock
should be updated when update_size returns 0.

Signed-off-by: Xiao Ni <xni@redhat.com>
Signed-off-by: Shaohua Li <shli@fb.com>

10 Jun, 2016 · 1 commit

md: use a mutex to protect a global list · 5b1f5bc3

Authored by Cong Wang on Jun 08, 2016

We saw a list corruption in the list all_detected_devices:

 WARNING: CPU: 16 PID: 226 at lib/list_debug.c:29 __list_add+0x3c/0xa9()
 list_add corruption. next->prev should be prev (ffff880859d58320), but was ffff880859ce74c0. (next=ffffffff81abfdb0).
 Modules linked in: ahci libahci libata sd_mod scsi_mod
 CPU: 16 PID: 226 Comm: kworker/u241:4 Not tainted 4.1.20 #1
 Hardware name: Dell Inc. PowerEdge C6220/04GD66, BIOS 2.2.3 11/07/2013
 Workqueue: events_unbound async_run_entry_fn
  0000000000000000 ffff880859a5baf8 ffffffff81502872 ffff880859a5bb48
  0000000000000009 ffff880859a5bb38 ffffffff810692a5 ffff880859ee8828
  ffffffff812ad02c ffff880859d58320 ffffffff81abfdb0 ffff880859eb90c0
 Call Trace:
  [<ffffffff81502872>] dump_stack+0x4d/0x63
  [<ffffffff810692a5>] warn_slowpath_common+0xa1/0xbb
  [<ffffffff812ad02c>] ? __list_add+0x3c/0xa9
  [<ffffffff81069305>] warn_slowpath_fmt+0x46/0x48
  [<ffffffff812ad02c>] __list_add+0x3c/0xa9
  [<ffffffff81406f28>] md_autodetect_dev+0x41/0x62
  [<ffffffff81285862>] rescan_partitions+0x25f/0x29d
  [<ffffffff81506372>] ? mutex_lock+0x13/0x31
  [<ffffffff811a090f>] __blkdev_get+0x1aa/0x3cd
  [<ffffffff811a0b91>] blkdev_get+0x5f/0x294
  [<ffffffff81377ceb>] ? put_device+0x17/0x19
  [<ffffffff8128227c>] ? disk_put_part+0x12/0x14
  [<ffffffff812836f3>] add_disk+0x29d/0x407
  [<ffffffff81384345>] ? __pm_runtime_use_autosuspend+0x5c/0x64
  [<ffffffffa004a724>] sd_probe_async+0x115/0x1af [sd_mod]
  [<ffffffff81083177>] async_run_entry_fn+0x72/0x12c
  [<ffffffff8107c44c>] process_one_work+0x198/0x2ce
  [<ffffffff8107cac7>] worker_thread+0x1dd/0x2bb
  [<ffffffff8107c8ea>] ? cancel_delayed_work_sync+0x15/0x15
  [<ffffffff8107c8ea>] ? cancel_delayed_work_sync+0x15/0x15
  [<ffffffff81080d9c>] kthread+0xae/0xb6
  [<ffffffff81080000>] ? param_array_set+0x40/0xfa
  [<ffffffff81080cee>] ? __kthread_parkme+0x61/0x61
  [<ffffffff81508152>] ret_from_fork+0x42/0x70
  [<ffffffff81080cee>] ? __kthread_parkme+0x61/0x61

I suspect it is because there is no lock protecting this
global list: autostart_arrays() is called in the ioctl() path,
where no lock is held.

Cc: Shaohua Li <shli@kernel.org>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: Shaohua Li <shli@fb.com>
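A minimal userspace sketch of the fix described in 5b1f5bc3, assuming a pthread mutex in place of a kernel mutex and a hand-rolled singly linked list in place of the kernel's list_head. The names `detect_dev`, `autostart` and `struct detected_dev` are hypothetical; the point is only that every access to the shared list happens under the same lock.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical node type for a detected device. */
struct detected_dev {
	int devnum;
	struct detected_dev *next;
};

static struct detected_dev *all_detected;	/* the shared global list */
static pthread_mutex_t detected_lock = PTHREAD_MUTEX_INITIALIZER;

/* Add an entry. Callers may run concurrently, so link it in under the lock. */
static int detect_dev(int devnum)
{
	struct detected_dev *node = malloc(sizeof(*node));

	if (!node)
		return -1;
	node->devnum = devnum;
	pthread_mutex_lock(&detected_lock);
	node->next = all_detected;
	all_detected = node;
	pthread_mutex_unlock(&detected_lock);
	return 0;
}

/* Drain the list under the same lock; returns how many entries were consumed. */
static int autostart(void)
{
	int n = 0;

	pthread_mutex_lock(&detected_lock);
	while (all_detected) {
		struct detected_dev *node = all_detected;

		all_detected = node->next;
		free(node);
		n++;
	}
	assert(all_detected == NULL);
	pthread_mutex_unlock(&detected_lock);
	return n;
}
```

Without the lock, two concurrent calls to the add path could both read the same list head and one insertion would be lost or the links corrupted, which is the kind of damage the list_debug warning above reports.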

04 Jun, 2016 · 2 commits

md: simplify the code with md_kick_rdev_from_array · db767672

Authored by Guoqing Jiang on Jun 02, 2016

Signed-off-by: Guoqing Jiang <gqjiang@suse.com>
Signed-off-by: Shaohua Li <shli@fb.com>

md-cluster: fix deadlock issue when adding a disk to a recovering array · bb8bf15b

Authored by Guoqing Jiang on Jun 02, 2016

Adding a disk to an array which is performing recovery
is a little complicated: we need to both reap the
sync thread and perform the add-disk operation, which
caused a deadlock as follows.

linux44:~ # ps aux|grep md|grep D
root      1822  0.0  0.0      0     0 ?        D    16:50   0:00 [md127_resync]
root      1848  0.0  0.0  19860   952 pts/0    D+   16:50   0:00 mdadm --manage /dev/md127 --re-add /dev/vdb
linux44:~ # cat /proc/1848/stack
[<ffffffff8107afde>] kthread_stop+0x6e/0x120
[<ffffffffa051ddb0>] md_unregister_thread+0x40/0x80 [md_mod]
[<ffffffffa0526e45>] md_reap_sync_thread+0x15/0x150 [md_mod]
[<ffffffffa05271e0>] action_store+0x260/0x270 [md_mod]
[<ffffffffa05206b4>] md_attr_store+0xb4/0x100 [md_mod]
[<ffffffff81214a7e>] sysfs_write_file+0xbe/0x140
[<ffffffff811a6b98>] vfs_write+0xb8/0x1e0
[<ffffffff811a75b8>] SyS_write+0x48/0xa0
[<ffffffff8152a5c9>] system_call_fastpath+0x16/0x1b
[<00007f068ea1ed30>] 0x7f068ea1ed30
linux44:~ # cat /proc/1822/stack
[<ffffffffa05251a6>] md_do_sync+0x846/0xf40 [md_mod]
[<ffffffffa052402d>] md_thread+0x16d/0x180 [md_mod]
[<ffffffff8107ad94>] kthread+0xb4/0xc0
[<ffffffff8152a518>] ret_from_fork+0x58/0x90

        Task1848                                Task1822
md_attr_store (holds reconfig_mutex via mddev_lock())
        action_store
        md_reap_sync_thread
        md_unregister_thread
        kthread_stop                            md_wakeup_thread(mddev->thread);
                                                wait_event(mddev->sb_wait, !test_bit(MD_CHANGE_PENDING))

md_check_recovery is triggered by waking up mddev->thread,
but it can't clear the MD_CHANGE_PENDING flag since it can't
get the lock, which is already held by md_attr_store.

To solve the deadlock problem, we move "->resync_finish()"
from md_do_sync to md_reap_sync_thread (after md_update_sb).
MD_HELD_RESYNC_LOCK is also introduced, since it is possible
that a node can't get the resync lock in md_do_sync.

Then we do not need to wait for MD_CHANGE_PENDING to be cleared,
since metadata should be updated after md_update_sb,
so just call resync_finish if MD_HELD_RESYNC_LOCK is set.

We also unified the code after the skip label, since setting PENDING
for the non-clustered case should be harmless.

Reviewed-by: NeilBrown <neilb@suse.com>
Signed-off-by: Guoqing Jiang <gqjiang@suse.com>
Signed-off-by: Shaohua Li <shli@fb.com>

26 May, 2016 · 1 commit

right meaning of PARITY_ENABLE_RMW and PARITY_PREFER_RMW · 41257580

Authored by Song Liu on May 23, 2016

In the current handle_stripe_dirtying, the code prefers rmw with
PARITY_ENABLE_RMW, while it prefers rcw with PARITY_PREFER_RMW.

This patch reverses this behavior.

Signed-off-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Shaohua Li <shli@fb.com>
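The intended meaning after 41257580 can be illustrated with a simplified decision function. This is an assumption-laden sketch, not the logic of handle_stripe_dirtying: it models the "preference" purely as a tie-break when read-modify-write (rmw) and reconstruct-write (rcw) would need the same number of reads, and `choose_strategy`, `enum strategy` and the read counts are hypothetical. The enum values only reuse the names mentioned in the commit message.

```c
#include <assert.h>

/* Names taken from the commit message; the numeric values are assumed. */
enum rmw_level {
	PARITY_DISABLE_RMW = 0,
	PARITY_ENABLE_RMW,
	PARITY_PREFER_RMW,
};

/* Hypothetical result type for this sketch. */
enum strategy {
	USE_RCW,
	USE_RMW,
};

/*
 * Pick a write strategy from the number of reads each one would need.
 * Assumption: the setting only matters as a tie-break when costs are equal.
 */
static enum strategy choose_strategy(enum rmw_level level,
				     int rmw_reads, int rcw_reads)
{
	assert(rmw_reads >= 0 && rcw_reads >= 0);

	if (level == PARITY_DISABLE_RMW)
		return USE_RCW;
	if (rmw_reads < rcw_reads)
		return USE_RMW;
	if (rmw_reads > rcw_reads)
		return USE_RCW;
	/* Equal cost: PREFER_RMW leans to rmw, ENABLE_RMW leans to rcw. */
	return level == PARITY_PREFER_RMW ? USE_RMW : USE_RCW;
}
```

Under this model PARITY_PREFER_RMW is the setting that favours rmw, which matches what the name suggests and what the commit title calls the "right meaning".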

13 May, 2016 · 4 commits

dm thin: unroll issue_discard() to create longer discard bio chains · 202bae52

Authored by Joe Thornber on May 04, 2016

There is little benefit to doing this but it does structure DM thinp's
code to more cleanly use the __blkdev_issue_discard() interface --
particularly in passdown_double_checking_shared_status().

Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>

dm thin: use __blkdev_issue_discard for async discard support · 3dba53a9

Authored by Mike Snitzer on May 02, 2016

With commit 38f25255 ("block: add __blkdev_issue_discard") DM thinp
no longer needs to carry its own async discard method.

Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Acked-by: Joe Thornber <ejt@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>

dm thin: remove __bio_inc_remaining() and switch to using bio_inc_remaining() · 13e4f8a6

Authored by Mike Snitzer on May 04, 2016

DM thinp's use of bio_inc_remaining() is critical to ensure the original
parent discard bio isn't completed before sub-discards have. DM thinp
needs this due to the extra quiescing that occurs, via multiple DM thinp
mappings, while processing large discards. As such DM thinp must build
the async discard bio chain after some delay -- so bio_inc_remaining()
is used to enable DM thinp to take a reference on the original parent
discard bio for each mapping. This allows the immediate use of
bio_endio() on that discard bio; but with the understanding that the
actual completion won't occur until each of the sub-discards'
per-mapping references are dropped.

Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Acked-by: Joe Thornber <ejt@redhat.com>
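The reference-counting behaviour that 13e4f8a6 relies on can be sketched in userspace. This is a model, not the block layer: `struct parent_req` and its helpers are hypothetical, and a C11 atomic counter stands in for the bio's remaining count. Each mapping takes one extra reference, and the parent only really completes when the last reference is dropped.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical parent request; `remaining` plays the role of the remaining count. */
struct parent_req {
	atomic_int remaining;	/* starts at 1 for the parent's own completion */
	bool completed;
};

static void parent_init(struct parent_req *p)
{
	atomic_init(&p->remaining, 1);
	p->completed = false;
}

/* Take one extra reference, e.g. one per mapping that will issue sub-discards. */
static void parent_inc_remaining(struct parent_req *p)
{
	atomic_fetch_add(&p->remaining, 1);
}

/* Drop one reference; the parent really completes only when the last one goes. */
static void parent_endio(struct parent_req *p)
{
	int before = atomic_fetch_sub(&p->remaining, 1);

	assert(before > 0);	/* more endio calls than references is a bug */
	if (before == 1)
		p->completed = true;
}
```

In this model the owner can call `parent_endio()` immediately after taking the per-mapping references; the completion is deferred until every mapping has dropped its own reference.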

dm raid: make sure no feature flags are set in metadata · 4c9971ca

Authored by Heinz Mauelshagen on Apr 29, 2016

Given we don't yet support any feature flags in the dm-raid ondisk
metadata (see: 'features' member of 'struct dm_raid_superblock'),
add a check to ensure no flags are actually set. If any features are
set, reject the activation of the RAID mapping.

This is to prevent possible data corruption in case of a kernel
downgrade when there'll potentially be feature flags set by a future
dm-raid target.

Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
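A minimal sketch of the kind of check 4c9971ca describes. The struct below is a hypothetical, trimmed-down stand-in for the on-disk superblock (only the `features` member is taken from the commit text, and endianness conversion is ignored); `SUPPORTED_FEATURES` and `validate_features` are illustrative names.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified superblock; the real one has many more fields. */
struct raid_superblock {
	uint32_t magic;
	uint32_t features;
};

/* Feature flags this (hypothetical) kernel understands: none yet. */
#define SUPPORTED_FEATURES 0u

/* Refuse activation if the metadata carries any flag we do not understand. */
static int validate_features(const struct raid_superblock *sb)
{
	assert(sb != NULL);
	if (sb->features & ~SUPPORTED_FEATURES)
		return -EINVAL;
	return 0;
}
```

Rejecting unknown bits rather than ignoring them is what protects an older kernel from silently mishandling metadata written by a newer one.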

10 May, 2016 · 6 commits

md-cluster: check the return value of process_recvd_msg · 1fa9a1ad

Authored by Guoqing Jiang on May 03, 2016

We don't need to run the full path of recv_daemon
if process_recvd_msg doesn't return 0.

Reviewed-by: NeilBrown <neilb@suse.com>
Signed-off-by: Guoqing Jiang <gqjiang@suse.com>
Signed-off-by: Shaohua Li <shli@fb.com>

md-cluster: gather resync infos and enable recv_thread after bitmap is ready · 51e453ae

Authored by Guoqing Jiang on May 04, 2016

The in-memory bitmap is not ready when a node joins the cluster,
so it doesn't make sense to call gather_all_resync_info()
so early; we need to call it after the node's
bitmap is set up. Also, recv_thread could be woken up after
the node joins the cluster, but that could cause a problem if the node
receives a RESYNCING message without a personality, since
mddev->pers->quiesce is called in process_suspend_info.

This commit introduces a new cluster interface, load_bitmaps,
to fix the above problems. load_bitmaps is called in bitmap_load,
where the bitmap and personality are ready, and it
does the following tasks:

1. call gather_all_resync_info to load all the nodes'
   bitmap info.
2. set the MD_CLUSTER_ALREADY_IN_CLUSTER bit so that recv_thread
   can be woken up, and wake up recv_thread if there is a
   pending recv event.

Then ack_bast only wakes up recv_thread after the IN_CLUSTER
bit is set; otherwise MD_CLUSTER_PENDING_RESYNC_EVENT is
set.

Reviewed-by: NeilBrown <neilb@suse.com>
Signed-off-by: Guoqing Jiang <gqjiang@suse.com>
Signed-off-by: Shaohua Li <shli@fb.com>

md: set MD_CHANGE_PENDING in an atomic region · 85ad1d13

Authored by Guoqing Jiang on May 03, 2016

Some code waits for a metadata update by:

1. flagging that it is needed (MD_CHANGE_DEVS or MD_CHANGE_CLEAN)
2. setting MD_CHANGE_PENDING and waking the management thread
3. waiting for MD_CHANGE_PENDING to be cleared

If the first two are done without locking, the code in md_update_sb()
which checks if it needs to repeat might test if an update is needed
before step 1, then clear MD_CHANGE_PENDING after step 2, resulting
in the wait returning early.

So make sure all places that set MD_CHANGE_PENDING do so atomically;
bit_clear_unless (suggested by Neil) is introduced for the purpose.

Cc: Martin Kepplinger <martink@posteo.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Sasha Levin <sasha.levin@oracle.com>
Cc: <linux-kernel@vger.kernel.org>
Reviewed-by: NeilBrown <neilb@suse.com>
Signed-off-by: Guoqing Jiang <gqjiang@suse.com>
Signed-off-by: Shaohua Li <shli@fb.com>
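The three steps above, and the "clear unless" operation the commit introduces, can be sketched with C11 atomics. This is a userspace approximation under stated assumptions: `request_update` and `clear_bits_unless` are hypothetical helpers written for this illustration, and a compare-exchange loop on an `atomic_ulong` stands in for the kernel's cmpxchg-based macro.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/*
 * Steps 1 and 2 in a single atomic operation: set the "what changed" bits
 * and the pending bit together, so no observer can see one without the other.
 */
static void request_update(atomic_ulong *flags, unsigned long what,
			   unsigned long pending)
{
	assert(what != 0 && pending != 0);
	atomic_fetch_or(flags, what | pending);
}

/*
 * Atomically clear the bits in `clear` unless any bit in `test` is set.
 * Returns true if the bits were cleared, false if `test` blocked it.
 */
static bool clear_bits_unless(atomic_ulong *flags, unsigned long clear,
			      unsigned long test)
{
	unsigned long old = atomic_load(flags);

	do {
		if (old & test)
			return false;	/* new work arrived: leave pending set */
	} while (!atomic_compare_exchange_weak(flags, &old, old & ~clear));
	return true;
}
```

A failed compare-exchange reloads `old`, so the test against `test` is repeated on the fresh value; that is what closes the window in which the updater could clear the pending bit after a new request had been flagged.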

md: raid5: add prerequisite to run underneath dm-raid · fe67d19a

Authored by Heinz Mauelshagen on May 03, 2016

In case md runs underneath the dm-raid target, the mddev does not have
a request queue or gendisk, thus avoid accesses to them.

This patch adds a missing conditional to the raid5 personality.

Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com>
Signed-off-by: Shaohua Li <shli@fb.com>

md: raid10: add prerequisite to run underneath dm-raid · 859644f0

Authored by Heinz Mauelshagen on May 03, 2016

In case md runs underneath the dm-raid target, the mddev does not have
a request queue or gendisk, thus avoid accesses to them.

This patch adds two missing conditionals to the raid10 personality.

Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com>
Signed-off-by: Shaohua Li <shli@fb.com>

md: md.c: fix oops in mddev_suspend for raid0 · 092398dc

Authored by Heinz Mauelshagen on May 03, 2016

Introduced by upstream commit 70d9798b.

The raid0 personality does not create mddev->thread, as opposed to
other personalities, so its unconditional access in
mddev_suspend() causes an oops.

The patch checks for mddev->thread in order to keep the
intention of the aforementioned commit.

Fixes: 70d9798b ("MD: warn for potential deadlock")
Cc: stable@vger.kernel.org (4.5+)
Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com>
Signed-off-by: Shaohua Li <shli@fb.com>

06 May, 2016 · 2 commits

dm ioctl: drop use of __GFP_REPEAT in copy_params()'s __vmalloc() call · 72f6d8d8

Authored by Michal Hocko on Apr 28, 2016

copy_params()'s use of __GFP_REPEAT for the __vmalloc() call doesn't make much
sense because vmalloc doesn't rely on costly high order allocations.

Signed-off-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>

dm mpath: eliminate use of spinlock in IO fast-paths · 2da1610a

Authored by Mike Snitzer on Mar 17, 2016

The primary motivation of this commit is to improve the scalability of
DM multipath on large NUMA systems where m->lock spinlock contention has
been proven to be a serious bottleneck on really fast storage.

The ability to atomically read a pointer, using lockless_dereference(),
is leveraged in this commit.  But all pointer writes are still protected
by the m->lock spinlock (which is fine since these all now occur in the
slow-path).

The following functions no longer require the m->lock spinlock in their
fast-path: multipath_busy(), __multipath_map(), and do_end_io()

And choose_pgpath() is modified to _not_ update m->current_pgpath unless
it also switches the path-group.  This is done to avoid needing to take
the m->lock every time __multipath_map() calls choose_pgpath().
But m->current_pgpath will be reset if it is failed via fail_path().

Suggested-by: Jeff Moyer <jmoyer@redhat.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Tested-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
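The read-side pattern described in 2da1610a, lockless pointer reads on the fast path with writers still serialised by a lock, can be approximated in userspace. This sketch makes several assumptions: a pthread mutex stands in for the m->lock spinlock, C11 acquire/release atomics stand in for lockless_dereference() (acquire is stronger than strictly needed), and the struct layouts and helper names are hypothetical.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical path descriptor. */
struct pgpath {
	int id;
};

/* Hypothetical, heavily trimmed multipath state. */
struct multipath {
	pthread_mutex_t lock;			/* serialises writers only */
	_Atomic(struct pgpath *) current_pgpath;
};

static void multipath_init(struct multipath *m)
{
	assert(m != NULL);
	pthread_mutex_init(&m->lock, NULL);
	atomic_init(&m->current_pgpath, NULL);
}

/* Fast path: read the current path without taking the lock. */
static struct pgpath *current_path(struct multipath *m)
{
	return atomic_load_explicit(&m->current_pgpath, memory_order_acquire);
}

/* Slow path: publish a new path (or NULL after a failure) under the lock. */
static void set_path(struct multipath *m, struct pgpath *p)
{
	pthread_mutex_lock(&m->lock);
	atomic_store_explicit(&m->current_pgpath, p, memory_order_release);
	pthread_mutex_unlock(&m->lock);
}
```

Readers must tolerate a NULL or slightly stale pointer, which is why the commit message notes that the pointer is reset when a path fails and that updates are kept off the fast path.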

openeuler / Kernel synced successfully more than 1 year ago
