openanolis / cloud-kernel

24 Mar, 2013 · 1 commit

bcache: A block layer cache · cafe5635

Committed by Kent Overstreet on Mar 23, 2013

Does writethrough and writeback caching, handles unclean shutdown, and
has a bunch of other nifty features motivated by real world usage.

See the wiki at http://bcache.evilpiepirate.org for more.
Signed-off-by: Kent Overstreet <koverstreet@google.com>

02 Mar, 2013 · 30 commits

dm cache: add cleaner policy · 8735a813

Committed by Heinz Mauelshagen on Mar 01, 2013

A simple cache policy that writes back all data to the origin.

This is used to decommission a dm cache by emptying it.
Signed-off-by: Heinz Mauelshagen <mauelshagen@redhat.com>
Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>

dm cache: add mq policy · f2836352

Committed by Joe Thornber on Mar 01, 2013

A cache policy that uses a multiqueue ordered by recent hit
count to select which blocks should be promoted and demoted.
This is meant to be a general purpose policy.  It prioritises
reads over writes.
Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>

dm: add cache target · c6b4fcba

Committed by Joe Thornber on Mar 01, 2013

Add a target that allows a fast device such as an SSD to be used as a
cache for a slower device such as a disk.

A plug-in architecture was chosen so that the decisions about which data
to migrate and when are delegated to interchangeable tunable policy
modules.  The first general purpose module we have developed, called
"mq" (multiqueue), follows in the next patch.  Other modules are
under development.
Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Heinz Mauelshagen <mauelshagen@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>

dm persistent data: add bitset · 7a87edfe

Committed by Joe Thornber on Mar 01, 2013

Add a persistent bitset as a wrapper around dm-array.
Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>

dm persistent data: add transactional array · 6513c29f

Committed by Joe Thornber on Mar 01, 2013

Add a transactional array.
Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>

dm thin: remove cells from stack · 025b9685

Committed by Joe Thornber on Mar 01, 2013

This patch takes advantage of the new bio-prison interface where the
memory is now passed in rather than using a mempool in bio-prison.
This allows the map function to avoid performing potentially-blocking
allocations that could lead to deadlocks: We want to avoid the cell
allocation that is done in bio_detain.

(The potential for mempool deadlocks still remains in other functions
that use bio_detain.)
Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>

dm bio prison: pass cell memory in · 6beca5eb

Committed by Joe Thornber on Mar 01, 2013

Change the dm_bio_prison interface so that instead of allocating memory
internally, dm_bio_detain is supplied with a pre-allocated cell each
time it is called.

This enables a subsequent patch to move the allocation of the struct
dm_bio_prison_cell outside the thin target's mapping function so it can
no longer block there.
Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>

dm persistent data: add btree_walk · 4e7f1f90

Committed by Joe Thornber on Mar 01, 2013

Add dm_btree_walk to iterate through the contents of a btree.
This will be used by the dm cache target.
Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>

dm: add target num_write_bios fn · b0d8ed4d

Committed by Alasdair G Kergon on Mar 01, 2013

Add a num_write_bios function to struct target.

If an instance of a target sets this, it will be queried before the
target's mapping function is called on a write bio, and the response
controls the number of copies of the write bio that the target will
receive.

This provides a convenient way for a target to send the same data to
more than one device.  The new cache target uses this in writethrough
mode, to send the data both to the cache and the backing device.
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
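
The callback contract described in this commit can be modelled in plain C. The sketch below is a hypothetical userspace model, not the device-mapper code: `struct target_model`, `copies_for_write` and `cache_num_write_bios` are names invented for illustration, and the real callback's signature may well differ.

```c
#include <assert.h>
#include <stddef.h>

struct target_model;

/* Optional per-target hook: how many copies of a write bio are wanted. */
typedef unsigned (*num_write_bios_fn)(struct target_model *t);

struct target_model {
	num_write_bios_fn num_write_bios; /* NULL means "one copy" (assumed default) */
	int writethrough;                 /* illustrative mode flag */
};

/* A cache-like target asks for two copies in writethrough mode:
 * one for the cache device and one for the backing device. */
unsigned cache_num_write_bios(struct target_model *t)
{
	return t->writethrough ? 2 : 1;
}

/* What the core would do before calling the mapping function on a write. */
unsigned copies_for_write(struct target_model *t)
{
	assert(t != NULL);
	if (t->num_write_bios)
		return t->num_write_bios(t);
	return 1;
}
```

A target that leaves the hook unset keeps the previous behaviour of receiving a single bio per write, so existing targets would be unaffected under this model.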

dm kcopyd: introduce configurable throttling · df5d2e90

Committed by Mikulas Patocka on Mar 01, 2013

This patch allows the administrator to reduce the rate at which kcopyd
issues I/O.

Each module that uses kcopyd acquires a throttle parameter that can be
set in /sys/module/*/parameters.

We maintain a history of kcopyd usage by each module in the variables
io_period and total_period in struct dm_kcopyd_throttle. The actual
kcopyd activity is calculated as a percentage of time equal to
"(100 * io_period / total_period)".  This is compared with the user-defined
throttle percentage threshold and if it is exceeded, we sleep.
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
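
The percentage check in the message above can be written out as a small userspace model. This is a sketch under stated assumptions rather than the kernel implementation: `struct throttle_model` and `throttle_should_sleep` are hypothetical names, time is counted in arbitrary units, and treating a setting of 100 as "never throttle" is an assumption.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the per-module accounting described in the commit. */
struct throttle_model {
	unsigned throttle;     /* user-defined limit, in percent */
	unsigned io_period;    /* time during which kcopyd I/O was active */
	unsigned total_period; /* total time observed */
};

/* Returns true when measured activity exceeds the configured percentage,
 * i.e. when the caller should sleep before issuing more I/O. */
bool throttle_should_sleep(const struct throttle_model *t)
{
	assert(t->io_period <= t->total_period);
	if (t->throttle >= 100)   /* assumption: 100 disables throttling */
		return false;
	if (t->total_period == 0) /* no history yet, nothing to compare */
		return false;
	unsigned long long percent = 100ULL * t->io_period / t->total_period;
	return percent > t->throttle;
}
```

With a limit of 50 and I/O active for 80 of the last 100 time units, the model reports that the caller should sleep; at 40 of 100 it does not.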

dm ioctl: allow message to return data · a2606241

Committed by Mikulas Patocka on Mar 01, 2013

This patch introduces enhanced message support that allows the
device-mapper core to recognise messages that are common to all devices,
and for messages to return data to userspace.

Core messages are processed by the function "message_for_md".  If the
device mapper doesn't support the message, it is passed to the target
driver.

If the message returns data, the kernel sets the flag
DM_MESSAGE_OUT_FLAG.
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>

dm ioctl: optimize functions without variable params · 02cde50b

Committed by Mikulas Patocka on Mar 01, 2013

Device-mapper ioctls receive and send data in a buffer supplied
by userspace.  The buffer has two parts.  The first part contains
a 'struct dm_ioctl' and has a fixed size.  The second part depends
on the ioctl and has a variable size.

This patch recognises the specific ioctls that do not use the variable
part of the buffer and skips allocating memory for it.

In particular, when a device is suspended and a resume ioctl is sent,
this now avoids memory allocation completely.

The variable "struct dm_ioctl tmp" is moved from the function
copy_params to its caller ctl_ioctl and renamed to param_kernel.
It is used directly when the ioctl function doesn't need any arguments.
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>

dm ioctl: introduce ioctl_flags · e2914cc2

Committed by Mikulas Patocka on Mar 01, 2013

This patch introduces flags for each ioctl function.

So far, one flag is defined, IOCTL_FLAGS_NO_PARAMS.  It is set if the
function processing the ioctl doesn't take or produce any parameters in
the section of the data buffer that has a variable size.
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
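
A per-ioctl flags table of the kind this commit describes might look like the sketch below. Only the flag name `IOCTL_FLAGS_NO_PARAMS` comes from the commit message; its value, the command numbers, the table contents and both helper functions are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define IOCTL_FLAGS_NO_PARAMS 1u /* name from the commit; value assumed */

/* Hypothetical command numbers, not the real DM_*_CMD values. */
enum { CMD_VERSION, CMD_REMOVE_ALL, CMD_DEV_SUSPEND, CMD_TABLE_LOAD, CMD_COUNT };

/* Which commands ignore the variable-size part of the buffer (illustrative). */
static const unsigned ioctl_flags_table[CMD_COUNT] = {
	[CMD_VERSION]     = 0,
	[CMD_REMOVE_ALL]  = IOCTL_FLAGS_NO_PARAMS,
	[CMD_DEV_SUSPEND] = IOCTL_FLAGS_NO_PARAMS,
	[CMD_TABLE_LOAD]  = 0,
};

unsigned lookup_ioctl_flags(int cmd)
{
	assert(cmd >= 0 && cmd < CMD_COUNT);
	return ioctl_flags_table[cmd];
}

/* Bytes that must be allocated for the variable part of the buffer.
 * Commands flagged NO_PARAMS need none, so the allocation can be skipped. */
size_t variable_part_size(int cmd, size_t total_size, size_t fixed_size)
{
	if (lookup_ioctl_flags(cmd) & IOCTL_FLAGS_NO_PARAMS)
		return 0;
	return total_size > fixed_size ? total_size - fixed_size : 0;
}
```

Keeping the flag next to the command in one table means a caller can decide whether to allocate before dispatching to the handler.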

dm: merge io_pool and tio_pool · 5f015204

Committed by Jun'ichi Nomura on Mar 01, 2013

This patch merges io_pool and tio_pool into io_pool and cleans up
related functions.

Though device-mapper used to have 2 pools of objects for each dm device,
the use of bioset front_pad for per-bio data has shrunk the number of
pools to 1 for both bio-based and request-based device types.
(See c0820cf5 "dm: introduce per_bio_data" and
 94818742 "dm: Use bioset's front_pad for dm_rq_clone_bio_info")

So dm no longer has to maintain 2 different pointers.

No functional changes.
Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>

dm: remove unused _rq_bio_info_cache · 23e5083b

Committed by Jun'ichi Nomura on Mar 01, 2013

Remove _rq_bio_info_cache, which is no longer used.
No functional changes.
Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>

dm: fix limits initialization when there are no data devices · 87eb5b21

Committed by Mike Christie on Mar 01, 2013

dm_calculate_queue_limits will first reset the provided limits to
defaults using blk_set_stacking_limits, thereby defeating the purpose of
retaining the original live table's limits -- as was intended via commit
3ae70656 ("dm: retain table limits when
swapping to new table with no devices").

Fix this improper limits initialization (in the no data devices case) by
avoiding the call to dm_calculate_queue_limits.

[patch header revised by Mike Snitzer]
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Cc: stable@vger.kernel.org # v3.6+
Signed-off-by: Alasdair G Kergon <agk@redhat.com>

dm snapshot: add missing module aliases · 23cb2109

Committed by Mikulas Patocka on Mar 01, 2013

Add module aliases so that autoloading works correctly if the user
tries to activate "snapshot-origin" or "snapshot-merge" targets.

Reference: https://bugzilla.redhat.com/889973
Reported-by: Chao Yang <chyang@redhat.com>
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Alasdair G Kergon <agk@redhat.com>

dm persistent data: set some btree fn parms const · 018cede9

Committed by Mike Snitzer on Mar 01, 2013

Mark some constant parameters constant in some dm-btree functions.
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>

dm: refactor bio cloning · e4c93811

Committed by Alasdair G Kergon on Mar 01, 2013

Refactor part of the bio splitting and cloning code to try to make it
easier to understand.
Signed-off-by: Alasdair G Kergon <agk@redhat.com>

dm: rename bio cloning functions · 14fe594d

Committed by Alasdair G Kergon on Mar 01, 2013

Rename functions involved in splitting and cloning bios.

The sequence of functions is now:
  (1) __split_and_process* - entry point that selects the processing strategy
  (2) __send* - prepare the details for each bio needed and loop through them
  (3) __clone_and_map* - creates a clone and maps it
Signed-off-by: Alasdair G Kergon <agk@redhat.com>

dm: rename request variables to bios · 55a62eef

Committed by Alasdair G Kergon on Mar 01, 2013

Use 'bio' in the name of variables and functions that deal with
bios rather than 'request' to avoid confusion with the normal
block layer use of 'request'.

No functional changes.
Signed-off-by: Alasdair G Kergon <agk@redhat.com>

dm: clean up clone_bio · bd2a49b8

Committed by Alasdair G Kergon on Mar 01, 2013

Remove the no-longer-used struct bio_set argument from clone_bio and split_bvec.
Use tio->ti in __map_bio() instead of passing in ti.
Factor out some code for setting up cloned bios.
Take target_request_nr as a parameter to alloc_tio().
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Cc: Joe Thornber <ejt@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>

dm persistent data: remove CONFIG_EXPERIMENTAL · 88ae4c52

Committed by Kees Cook on Mar 01, 2013

The CONFIG_EXPERIMENTAL config item has not carried much meaning for a
while now and is almost always enabled by default. As agreed during the
Linux kernel summit, remove it from any "depends on" lines in Kconfigs.
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>

dm: remove CONFIG_EXPERIMENTAL · d57916a0

Committed by Alasdair G Kergon on Mar 01, 2013

Remove EXPERIMENTAL from all existing device-mapper targets.
Signed-off-by: Alasdair G Kergon <agk@redhat.com>

dm thin: use block_size_is_power_of_two · 58f77a21

Committed by Mike Snitzer on Mar 01, 2013

Use block_size_is_power_of_two() rather than checking
sectors_per_block_shift directly.  Also introduce local pool variable in
get_bio_block() to eliminate redundant tc->pool dereferences.

No functional change.
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
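
The helper-based check can be illustrated with a simplified userspace model. The names `block_size_is_power_of_two`, `get_bio_block` and `sectors_per_block_shift` appear in the commit message; the struct layout, the function signatures and the convention that a negative shift marks a non-power-of-two block size are assumptions of this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Simplified stand-in for the pool; not the dm-thin structure. */
struct pool_model {
	uint32_t sectors_per_block;
	int sectors_per_block_shift; /* assumed: negative if not a power of two */
};

/* Hides the shift encoding behind a named predicate. */
bool block_size_is_power_of_two(const struct pool_model *pool)
{
	return pool->sectors_per_block_shift >= 0;
}

/* Map a sector to its block: shift when possible, divide otherwise. */
uint64_t get_bio_block(const struct pool_model *pool, uint64_t sector)
{
	assert(pool->sectors_per_block > 0);
	if (block_size_is_power_of_two(pool))
		return sector >> pool->sectors_per_block_shift;
	return sector / pool->sectors_per_block;
}
```

Callers then read as a question about the block size rather than a comparison against a magic shift value.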

dm bufio: use WRITE_FLUSH instead of REQ_FLUSH · 3daec3b4

Committed by Mikulas Patocka on Mar 01, 2013

Use WRITE_FLUSH instead of REQ_FLUSH for submitted requests to make it
consistent with the rest of the kernel. There is no functional change.
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>

dm table: remove superfluous variable reset · d2ce70a1

Committed by Wang Sheng-Hui on Mar 01, 2013

If allocation fails, the local variable *t is not used any more after kfree,
so there is no need to reset it to NULL. Remove the unnecessary NULL assignment.
Signed-off-by: Wang Sheng-Hui <shhuiw@gmail.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>

dm thin: support a non power of 2 discard_granularity · f13945d7

Committed by Mike Snitzer on Mar 01, 2013

Support a non-power-of-2 discard granularity in dm-thin, now that the block
layer supports this (via 8dd2cb7e "block:
discard granularity might not be power of 2" and
59771079 "blk: avoid divide-by-zero with zero
discard granularity").
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>

dm: fix truncated status strings · fd7c092e

Committed by Mikulas Patocka on Mar 01, 2013

Avoid returning a truncated table or status string instead of setting
the DM_BUFFER_FULL_FLAG when the last target of a table fills the
buffer.

When processing a table or status request, the function retrieve_status
calls ti->type->status. If ti->type->status returns non-zero,
retrieve_status assumes that the buffer overflowed and sets
DM_BUFFER_FULL_FLAG.

However, targets don't return non-zero values from their status method
on overflow. Most targets always return zero.

If a buffer overflow happens in a target that is not the last in the
table, it gets noticed during the next iteration of the loop in
retrieve_status; but if a buffer overflow happens in the last target, it
goes unnoticed and erroneously truncated data is returned.

In the current code, the targets behave in the following way:
* dm-crypt returns -ENOMEM if there is not enough space to store the
  key, but it returns 0 on all other overflows.
* dm-thin returns errors from the status method if a disk error happened.
  This is incorrect because retrieve_status doesn't check the error
  code, it assumes that all non-zero values mean buffer overflow.
* all the other targets always return 0.

This patch changes the ti->type->status function to return void (because
most targets don't use the return code). Overflow is detected in
retrieve_status: if the status method fills up the remaining space
completely, it is assumed that buffer overflow happened.

Cc: stable@vger.kernel.org
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
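
The overflow rule in the last paragraph of this message, "if the status method fills up the remaining space completely, assume overflow", can be sketched in userspace C. Everything named here (`status_fn`, `emit_status`, `example_status`) is hypothetical; the sketch only assumes that a status callback writes a NUL-terminated string and truncates when it runs out of room.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical status callback: returns nothing, truncates if needed. */
typedef void (*status_fn)(char *result, size_t maxlen);

/* Run the callback and detect overflow from the output alone.
 * Returns false (caller would set a "buffer full" flag) when the string
 * used every remaining byte; otherwise stores the bytes consumed,
 * including the terminating NUL, in *used. */
bool emit_status(status_fn fn, char *buf, size_t remaining, size_t *used)
{
	assert(remaining > 0);
	buf[0] = '\0';
	fn(buf, remaining);
	size_t len = strlen(buf);
	if (len == remaining - 1)
		return false; /* filled completely: assume truncation */
	*used = len + 1;
	return true;
}

/* Example callback producing a 19-character status line. */
void example_status(char *result, size_t maxlen)
{
	snprintf(result, maxlen, "0 1024 linear 8:1 0");
}
```

The check is deliberately conservative: output that fits exactly is also reported as an overflow, which costs at most a retry with a larger buffer but never returns truncated data silently.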

dm: do not replace bioset for request based dm · 16245bdc

Committed by Jun'ichi Nomura on Mar 01, 2013

This patch fixes a regression introduced in v3.8, which causes oops
like this when dm-multipath is used:

general protection fault: 0000 [#1] SMP
RIP: 0010:[<ffffffff810fe754>]  [<ffffffff810fe754>] mempool_free+0x24/0xb0
Call Trace:
  <IRQ>
  [<ffffffff81187417>] bio_put+0x97/0xc0
  [<ffffffffa02247a5>] end_clone_bio+0x35/0x90 [dm_mod]
  [<ffffffff81185efd>] bio_endio+0x1d/0x30
  [<ffffffff811f03a3>] req_bio_endio.isra.51+0xa3/0xe0
  [<ffffffff811f2f68>] blk_update_request+0x118/0x520
  [<ffffffff811f3397>] blk_update_bidi_request+0x27/0xa0
  [<ffffffff811f343c>] blk_end_bidi_request+0x2c/0x80
  [<ffffffff811f34d0>] blk_end_request+0x10/0x20
  [<ffffffffa000b32b>] scsi_io_completion+0xfb/0x6c0 [scsi_mod]
  [<ffffffffa000107d>] scsi_finish_command+0xbd/0x120 [scsi_mod]
  [<ffffffffa000b12f>] scsi_softirq_done+0x13f/0x160 [scsi_mod]
  [<ffffffff811f9fd0>] blk_done_softirq+0x80/0xa0
  [<ffffffff81044551>] __do_softirq+0xf1/0x250
  [<ffffffff8142ee8c>] call_softirq+0x1c/0x30
  [<ffffffff8100420d>] do_softirq+0x8d/0xc0
  [<ffffffff81044885>] irq_exit+0xd5/0xe0
  [<ffffffff8142f3e3>] do_IRQ+0x63/0xe0
  [<ffffffff814257af>] common_interrupt+0x6f/0x6f
  <EOI>
  [<ffffffffa021737c>] srp_queuecommand+0x8c/0xcb0 [ib_srp]
  [<ffffffffa0002f18>] scsi_dispatch_cmd+0x148/0x310 [scsi_mod]
  [<ffffffffa000a38e>] scsi_request_fn+0x31e/0x520 [scsi_mod]
  [<ffffffff811f1e57>] __blk_run_queue+0x37/0x50
  [<ffffffff811f1f69>] blk_delay_work+0x29/0x40
  [<ffffffff81059003>] process_one_work+0x1c3/0x5c0
  [<ffffffff8105b22e>] worker_thread+0x15e/0x440
  [<ffffffff8106164b>] kthread+0xdb/0xe0
  [<ffffffff8142db9c>] ret_from_fork+0x7c/0xb0

The regression was introduced by the change
c0820cf5 "dm: introduce per_bio_data", where dm started to replace
bioset during table replacement.
For bio-based dm, it is good because clone bios do not exist during the
table replacement.
For request-based dm, however, (not-yet-mapped) clone bios may stay in
request queue and survive during the table replacement.
So freeing the old bioset could cause the oops in bio_put().

Since the size of front_pad may change only with bio-based dm,
it is not necessary to replace bioset for request-based dm.
Reported-by: Bart Van Assche <bvanassche@acm.org>
Tested-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com>
Acked-by: Mikulas Patocka <mpatocka@redhat.com>
Acked-by: Mike Snitzer <snitzer@redhat.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>

28 Feb, 2013 · 5 commits

hlist: drop the node parameter from iterators · b67bfe0d

Committed by Sasha Levin on Feb 27, 2013

I'm not sure why, but the hlist for-each-entry iterators were conceived
differently from the list ones, which take three arguments:

        list_for_each_entry(pos, head, member)

The hlist ones were greedy and wanted an extra parameter:

        hlist_for_each_entry(tpos, pos, head, member)

Why did they need an extra pos parameter? I'm not quite sure. Not only
do they not really need it, it also prevents the iterator from looking
exactly like the list iterator, which is unfortunate.

Besides the semantic patch, there was some manual work required:

 - Fix up the actual hlist iterators in linux/list.h
 - Fix up the declaration of other iterators based on the hlist ones.
 - A very small number of places were using the 'node' parameter; these
 were modified to use 'obj->member' instead.
 - Coccinelle didn't handle the hlist_for_each_entry_safe iterator
 properly, so those had to be fixed up manually.

The semantic patch which is mostly the work of Peter Senna Tschudin is here:

@@
iterator name hlist_for_each_entry, hlist_for_each_entry_continue, hlist_for_each_entry_from, hlist_for_each_entry_rcu, hlist_for_each_entry_rcu_bh, hlist_for_each_entry_continue_rcu_bh, for_each_busy_worker, ax25_uid_for_each, ax25_for_each, inet_bind_bucket_for_each, sctp_for_each_hentry, sk_for_each, sk_for_each_rcu, sk_for_each_from, sk_for_each_safe, sk_for_each_bound, hlist_for_each_entry_safe, hlist_for_each_entry_continue_rcu, nr_neigh_for_each, nr_neigh_for_each_safe, nr_node_for_each, nr_node_for_each_safe, for_each_gfn_indirect_valid_sp, for_each_gfn_sp, for_each_host;

type T;
expression a,c,d,e;
identifier b;
statement S;
@@

-T b;
    <+... when != b
(
hlist_for_each_entry(a,
- b,
c, d) S
|
hlist_for_each_entry_continue(a,
- b,
c) S
|
hlist_for_each_entry_from(a,
- b,
c) S
|
hlist_for_each_entry_rcu(a,
- b,
c, d) S
|
hlist_for_each_entry_rcu_bh(a,
- b,
c, d) S
|
hlist_for_each_entry_continue_rcu_bh(a,
- b,
c) S
|
for_each_busy_worker(a, c,
- b,
d) S
|
ax25_uid_for_each(a,
- b,
c) S
|
ax25_for_each(a,
- b,
c) S
|
inet_bind_bucket_for_each(a,
- b,
c) S
|
sctp_for_each_hentry(a,
- b,
c) S
|
sk_for_each(a,
- b,
c) S
|
sk_for_each_rcu(a,
- b,
c) S
|
sk_for_each_from
-(a, b)
+(a)
S
+ sk_for_each_from(a) S
|
sk_for_each_safe(a,
- b,
c, d) S
|
sk_for_each_bound(a,
- b,
c) S
|
hlist_for_each_entry_safe(a,
- b,
c, d, e) S
|
hlist_for_each_entry_continue_rcu(a,
- b,
c) S
|
nr_neigh_for_each(a,
- b,
c) S
|
nr_neigh_for_each_safe(a,
- b,
c, d) S
|
nr_node_for_each(a,
- b,
c) S
|
nr_node_for_each_safe(a,
- b,
c, d) S
|
- for_each_gfn_sp(a, c, d, b) S
+ for_each_gfn_sp(a, c, d) S
|
- for_each_gfn_indirect_valid_sp(a, c, d, b) S
+ for_each_gfn_indirect_valid_sp(a, c, d) S
|
for_each_host(a,
- b,
c) S
|
for_each_host_safe(a,
- b,
c, d) S
|
for_each_mesh_entry(a,
- b,
c, d) S
)
    ...+>

[akpm@linux-foundation.org: drop bogus change from net/ipv4/raw.c]
[akpm@linux-foundation.org: drop bogus hunk from net/ipv6/raw.c]
[akpm@linux-foundation.org: checkpatch fixes]
[akpm@linux-foundation.org: fix warnings]
[akpm@linux-foudnation.org: redo intrusive kvm changes]
Tested-by: Peter Senna Tschudin <peter.senna@gmail.com>
Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
Cc: Wu Fengguang <fengguang.wu@intel.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
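
To show why the extra node cursor is unnecessary, here is a simplified userspace imitation of a three-argument `hlist_for_each_entry`. It is a sketch, not a copy of linux/list.h: the node type is reduced to a single `next` pointer, `entry_or_null`, `struct item` and `sum_items` are invented for the example, and `__typeof__` is a GCC/Clang extension.

```c
#include <assert.h>
#include <stddef.h>

/* Reduced node type: the kernel's version also carries a back pointer. */
struct hlist_node { struct hlist_node *next; };
struct hlist_head { struct hlist_node *first; };

#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Map a node pointer to its containing entry, passing NULL through. */
#define entry_or_null(ptr, type, member) \
	((ptr) ? container_of(ptr, type, member) : NULL)

/* The entry pointer itself is the cursor; the next node is reached
 * through pos->member, so no separate node variable is needed. */
#define hlist_for_each_entry(pos, head, member)                              \
	for (pos = entry_or_null((head)->first, __typeof__(*(pos)), member);    \
	     pos;                                                               \
	     pos = entry_or_null((pos)->member.next, __typeof__(*(pos)), member))

struct item {
	int value;
	struct hlist_node link;
};

int sum_items(struct hlist_head *head)
{
	struct item *it;
	int sum = 0;

	assert(head != NULL);
	hlist_for_each_entry(it, head, link)
		sum += it->value;
	return sum;
}
```

The call site now has the same shape as `list_for_each_entry(pos, head, member)`, which is the consistency the commit is after.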

dm: convert to idr_alloc() · c9d76be6

Committed by Tejun Heo on Feb 27, 2013

Convert to the much saner new idr interface.
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Alasdair Kergon <agk@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

dm: don't use idr_remove_all() · adaedbd9

Committed by Tejun Heo on Feb 27, 2013

idr_destroy() can destroy idr by itself and idr_remove_all() is being
deprecated.  Drop its usage.
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Alasdair Kergon <agk@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

md: expedite metadata update when switching read-auto -> active · f3378b48

Committed by NeilBrown on Feb 28, 2013

If something has failed while the array was read-auto,
then when we switch to 'active' we need to update the metadata.
This will happen anyway but it is good to expedite it, and
also to ensure any failed device has been released by the
underlying device before we try to action the ioctl which
caused us to switch to 'active' mode.
Reported-by: Joe Lawrence <Joe.Lawrence@stratus.com>
Signed-off-by: NeilBrown <neilb@suse.de>

md: remove CONFIG_MULTICORE_RAID456 · 51acbcec

Committed by NeilBrown on Feb 28, 2013

This doesn't seem to actually help and we have an alternate
multi-threading approach waiting in the wings, so just get
rid of this config option and associated code.

As a bonus, we remove one use of CONFIG_EXPERIMENTAL.

Cc: Dan Williams <djbw@fb.com>
Cc: Kees Cook <keescook@chromium.org>
Signed-off-by: NeilBrown <neilb@suse.de>

26 Feb, 2013 · 4 commits

md/raid1,raid10: fix deadlock with freeze_array() · ee0b0244

Committed by NeilBrown on Feb 25, 2013

When raid1/raid10 needs to fix a read error, it first drains
all pending requests by calling freeze_array().
This calls flush_pending_writes() if it needs to sleep,
but some writes may be pending in a per-process plug rather
than in the per-array request queue.

When raid1{,0}_unplug() moves the request from the per-process
plug to the per-array request queue (from which
flush_pending_writes() can flush them), it needs to wake up
freeze_array(), or freeze_array() will never flush them and so
it will block forever.

So add the required wake_up() calls.

This bug was introduced by commit
   f54a9d0e
for raid1 and a similar commit for RAID10, and so has been present
since linux-3.6.  As the bug causes a deadlock I believe this fix is
suitable for -stable.

Cc: stable@vger.kernel.org (3.6.y 3.7.y 3.8.y)
Reported-by: Tregaron Bayly <tbayly@bluehost.com>
Tested-by: Tregaron Bayly <tbayly@bluehost.com>
Signed-off-by: NeilBrown <neilb@suse.de>

md/raid0: improve error message when converting RAID4-with-spares to RAID0 · f96c9f30

Committed by NeilBrown on Feb 21, 2013

Mentioning "bad disk number -1" exposes irrelevant internal detail.
Just say they are inactive and must be removed.
Signed-off-by: NeilBrown <neilb@suse.de>

md: raid0: fix error return from create_stripe_zones. · 58ebb34c

Committed by NeilBrown on Feb 21, 2013

Create_stripe_zones returns an error slightly differently to
raid0_run and to raid0_takeover_*.

The error returned used by the second was wrong and an error would
result in mddev->private being set to NULL and sooner or later a
crash.

So never return NULL; return ERR_PTR(err) from create_stripe_zones
instead.

This bug has been present since 2.6.35 so the fix is suitable
for any kernel since then.

Cc: stable@vger.kernel.org
Signed-off-by: NeilBrown <neilb@suse.de>
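
The error-pointer convention this fix relies on can be demonstrated in userspace. The sketch below is a generic illustration, not the raid0 code: `err_ptr`, `ptr_err` and `is_err` are lowercase stand-ins modelled on the kernel's ERR_PTR helpers, and `struct conf` and `create_conf` are invented for the example.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

#define MAX_ERRNO 4095

/* Encode a negative errno in the top, never-valid range of addresses. */
void *err_ptr(long error) { return (void *)(intptr_t)error; }

int is_err(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

long ptr_err(const void *ptr)
{
	assert(is_err(ptr));
	return (long)(intptr_t)ptr;
}

struct conf { int nr_zones; };

/* Hypothetical constructor: returns a valid object or an encoded errno,
 * never NULL, so callers need exactly one kind of check. */
struct conf *create_conf(int nr_zones)
{
	if (nr_zones <= 0)
		return err_ptr(-EINVAL);
	struct conf *c = malloc(sizeof(*c));
	if (!c)
		return err_ptr(-ENOMEM);
	c->nr_zones = nr_zones;
	return c;
}
```

Mixing the two conventions is what causes trouble: a caller that tests with `is_err()` treats NULL as a valid pointer, so a NULL return slips through and is dereferenced later.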

md: fix two bugs when attempting to resize RAID0 array. · a6468539

Committed by NeilBrown on Feb 21, 2013

You cannot resize a RAID0 array (in terms of making the devices
bigger), but the code doesn't entirely stop you.
So:

 disable setting of the available size on each device for
 RAID0 and Linear devices.  This must not change as doing so
 can change the effective layout of data.

 Make sure that the size that raid0_size() reports is accurate,
 by rounding device sizes to chunk sizes.  As the device sizes
 cannot change now, this isn't so important, but it is best to be
 safe.

Without this change:
  mdadm --grow /dev/md0 -z max
  mdadm --grow /dev/md0 -Z max
  then read to the end of the array

can cause a BUG in a RAID0 array.

These bugs have been present ever since it became possible
to resize any device, which is a long time.  So the fix is
suitable for any -stable kernel.

Cc: stable@vger.kernel.org
Signed-off-by: NeilBrown <neilb@suse.de>
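
The size calculation described in the second point, rounding each device down to a whole number of chunks before summing, can be sketched as follows. `raid0_size_model` is a hypothetical userspace function and sizes are in sectors; the real raid0_size() may compute this differently.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Sum the usable size of all member devices, counting only whole chunks.
 * Any tail smaller than one chunk cannot hold striped data and is dropped. */
uint64_t raid0_size_model(const uint64_t *dev_sectors, size_t ndevs,
			  uint32_t chunk_sectors)
{
	assert(chunk_sectors > 0);
	uint64_t total = 0;

	for (size_t i = 0; i < ndevs; i++)
		total += dev_sectors[i] - dev_sectors[i] % chunk_sectors;
	return total;
}
```

For example, with 128-sector chunks, devices of 1000 and 1030 sectors contribute 896 and 1024 sectors respectively, for a reported size of 1920.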
