Commits · 4a02b34e0cf1d0d0dd3737702841da4bf615a50a · openeuler / Kernel

11 Dec, 2013 5 commits

dm thin: switch to read-only mode if metadata space is exhausted · 4a02b34e

Committed by Mike Snitzer on Dec 03, 2013

Switch the thin pool to read-only mode in alloc_data_block() if
dm_pool_alloc_data_block() fails because the pool's metadata space is
exhausted.

Differentiate between data and metadata space in messages about no
free space available.

This issue was noticed with the device-mapper-test-suite using:
dmtest run --suite thin-provisioning -n /exhausting_metadata_space_causes_fail_mode/

The quantity of errors logged in this case must be reduced.

before patch:

device-mapper: thin: 253:4: reached low water mark for metadata device: sending event.
device-mapper: space map metadata: unable to allocate new metadata block
device-mapper: space map common: dm_tm_shadow_block() failed
device-mapper: space map metadata: unable to allocate new metadata block
device-mapper: space map common: dm_tm_shadow_block() failed
device-mapper: space map metadata: unable to allocate new metadata block
device-mapper: space map common: dm_tm_shadow_block() failed
device-mapper: space map metadata: unable to allocate new metadata block
device-mapper: space map common: dm_tm_shadow_block() failed
device-mapper: space map metadata: unable to allocate new metadata block
device-mapper: space map common: dm_tm_shadow_block() failed
<snip ... these repeat for a _very_ long while ... >
device-mapper: space map metadata: unable to allocate new metadata block
device-mapper: thin: 253:4: commit failed: error = -28
device-mapper: thin: 253:4: switching pool to read-only mode

after patch:

device-mapper: thin: 253:4: reached low water mark for metadata device: sending event.
device-mapper: space map metadata: unable to allocate new metadata block
device-mapper: thin: 253:4: no free metadata space available.
device-mapper: thin: 253:4: switching pool to read-only mode
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Acked-by: Joe Thornber <ejt@redhat.com>
Cc: stable@vger.kernel.org
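The behaviour described in this commit can be illustrated with a toy model. This is a minimal Python sketch, not the kernel code: the `Pool` class, its fields, the rule that each allocation consumes one metadata block, and the error returned by a read-only pool are all assumptions made for illustration. Only the idea comes from the commit message: when allocation fails because metadata space is exhausted, say so once (distinguishing it from data space) and switch the pool to read-only mode.

```python
import errno


class Pool:
    """Toy stand-in for a thin pool; every name here is hypothetical."""

    def __init__(self, free_data_blocks, free_metadata_blocks):
        self.free_data_blocks = free_data_blocks
        self.free_metadata_blocks = free_metadata_blocks
        self.read_only = False
        self.log = []

    def _set_read_only(self):
        # Log the mode switch only once.
        if not self.read_only:
            self.read_only = True
            self.log.append("switching pool to read-only mode")

    def alloc_data_block(self):
        """Return 0 on success or a negative errno, mimicking kernel style."""
        if self.read_only:
            # Assumption: a read-only pool refuses further allocation.
            return -errno.EINVAL
        if self.free_data_blocks == 0:
            self.log.append("no free data space available.")
            return -errno.ENOSPC
        if self.free_metadata_blocks == 0:
            # Metadata exhaustion will not resolve itself: report it and stop.
            self.log.append("no free metadata space available.")
            self._set_read_only()
            return -errno.ENOSPC
        # Assumption: each allocation consumes one metadata block.
        self.free_data_blocks -= 1
        self.free_metadata_blocks -= 1
        return 0
```

With one metadata block left, the first allocation succeeds, the second logs two lines and flips the pool to read-only, and later attempts add nothing further to the log.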

dm thin: switch to read only mode if a mapping insert fails · fafc7a81

Committed by Joe Thornber on Dec 02, 2013

Switch the thin pool to read-only mode when dm_thin_insert_block() fails
since there is little reason to expect the cause of the failure to be
resolved without further action by user space.

This issue was noticed with the device-mapper-test-suite using:
dmtest run --suite thin-provisioning -n /exhausting_metadata_space_causes_fail_mode/

The quantity of errors logged in this case must be reduced.

before patch:

device-mapper: thin: dm_thin_insert_block() failed
device-mapper: space map metadata: unable to allocate new metadata block
device-mapper: thin: dm_thin_insert_block() failed
device-mapper: space map metadata: unable to allocate new metadata block
device-mapper: thin: dm_thin_insert_block() failed
device-mapper: space map metadata: unable to allocate new metadata block
device-mapper: thin: dm_thin_insert_block() failed
device-mapper: space map metadata: unable to allocate new metadata block
device-mapper: thin: dm_thin_insert_block() failed
device-mapper: space map metadata: unable to allocate new metadata block
device-mapper: thin: dm_thin_insert_block() failed
device-mapper: space map metadata: unable to allocate new metadata block
device-mapper: thin: dm_thin_insert_block() failed
device-mapper: space map metadata: unable to allocate new metadata block
device-mapper: thin: dm_thin_insert_block() failed
device-mapper: space map metadata: unable to allocate new metadata block
device-mapper: thin: dm_thin_insert_block() failed
device-mapper: space map metadata: unable to allocate new metadata block
device-mapper: thin: dm_thin_insert_block() failed
device-mapper: space map metadata: unable to allocate new metadata block
device-mapper: space map metadata: unable to allocate new metadata block
device-mapper: space map metadata: unable to allocate new metadata block
device-mapper: space map metadata: unable to allocate new metadata block
device-mapper: space map metadata: unable to allocate new metadata block
device-mapper: space map metadata: unable to allocate new metadata block
<snip ... these repeat for a long while ... >
device-mapper: space map metadata: unable to allocate new metadata block
device-mapper: space map common: dm_tm_shadow_block() failed
device-mapper: thin: 253:4: no free metadata space available.
device-mapper: thin: 253:4: switching pool to read-only mode

after patch:

device-mapper: space map metadata: unable to allocate new metadata block
device-mapper: thin: 253:4: dm_thin_insert_block() failed: error = -28
device-mapper: thin: 253:4: switching pool to read-only mode
Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Cc: stable@vger.kernel.org

dm space map metadata: return on failure in sm_metadata_new_block · f62b6b8f

Committed by Mike Snitzer on Dec 02, 2013

Commit 2fc48021 ("dm persistent
metadata: add space map threshold callback") introduced a regression
to the metadata block allocation path that resulted in errors being
ignored.  This regression was uncovered by running the following
device-mapper-test-suite test:
dmtest run --suite thin-provisioning -n /exhausting_metadata_space_causes_fail_mode/

The ignored error codes in sm_metadata_new_block() could crash the
kernel through use of either the dm-thin or dm-cache targets, e.g.:

device-mapper: thin: 253:4: reached low water mark for metadata device: sending event.
device-mapper: space map metadata: unable to allocate new metadata block
general protection fault: 0000 [#1] SMP
...
Workqueue: dm-thin do_worker [dm_thin_pool]
task: ffff880035ce2ab0 ti: ffff88021a054000 task.ti: ffff88021a054000
RIP: 0010:[<ffffffffa0331385>]  [<ffffffffa0331385>] metadata_ll_load_ie+0x15/0x30 [dm_persistent_data]
RSP: 0018:ffff88021a055a68  EFLAGS: 00010202
RAX: 003fc8243d212ba0 RBX: ffff88021a780070 RCX: ffff88021a055a78
RDX: ffff88021a055a78 RSI: 0040402222a92a80 RDI: ffff88021a780070
RBP: ffff88021a055a68 R08: ffff88021a055ba4 R09: 0000000000000010
R10: 0000000000000000 R11: 00000002a02e1000 R12: ffff88021a055ad4
R13: 0000000000000598 R14: ffffffffa0338470 R15: ffff88021a055ba4
FS:  0000000000000000(0000) GS:ffff88033fca0000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 00007f467c0291b8 CR3: 0000000001a0b000 CR4: 00000000000007e0
Stack:
 ffff88021a055ab8 ffffffffa0332020 ffff88021a055b30 0000000000000001
 ffff88021a055b30 0000000000000000 ffff88021a055b18 0000000000000000
 ffff88021a055ba4 ffff88021a055b98 ffff88021a055ae8 ffffffffa033304c
Call Trace:
 [<ffffffffa0332020>] sm_ll_lookup_bitmap+0x40/0xa0 [dm_persistent_data]
 [<ffffffffa033304c>] sm_metadata_count_is_more_than_one+0x8c/0xc0 [dm_persistent_data]
 [<ffffffffa0333825>] dm_tm_shadow_block+0x65/0x110 [dm_persistent_data]
 [<ffffffffa0331b00>] sm_ll_mutate+0x80/0x300 [dm_persistent_data]
 [<ffffffffa0330e60>] ? set_ref_count+0x10/0x10 [dm_persistent_data]
 [<ffffffffa0331dba>] sm_ll_inc+0x1a/0x20 [dm_persistent_data]
 [<ffffffffa0332270>] sm_disk_new_block+0x60/0x80 [dm_persistent_data]
 [<ffffffff81520036>] ? down_write+0x16/0x40
 [<ffffffffa001e5c4>] dm_pool_alloc_data_block+0x54/0x80 [dm_thin_pool]
 [<ffffffffa001b23c>] alloc_data_block+0x9c/0x130 [dm_thin_pool]
 [<ffffffffa001c27e>] provision_block+0x4e/0x180 [dm_thin_pool]
 [<ffffffffa001fe9a>] ? dm_thin_find_block+0x6a/0x110 [dm_thin_pool]
 [<ffffffffa001c57a>] process_bio+0x1ca/0x1f0 [dm_thin_pool]
 [<ffffffff8111e2ed>] ? mempool_free+0x8d/0xa0
 [<ffffffffa001d755>] process_deferred_bios+0xc5/0x230 [dm_thin_pool]
 [<ffffffffa001d911>] do_worker+0x51/0x60 [dm_thin_pool]
 [<ffffffff81067872>] process_one_work+0x182/0x3b0
 [<ffffffff81068c90>] worker_thread+0x120/0x3a0
 [<ffffffff81068b70>] ? manage_workers+0x160/0x160
 [<ffffffff8106eb2e>] kthread+0xce/0xe0
 [<ffffffff8106ea60>] ? kthread_freezable_should_stop+0x70/0x70
 [<ffffffff8152af6c>] ret_from_fork+0x7c/0xb0
 [<ffffffff8106ea60>] ? kthread_freezable_should_stop+0x70/0x70
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Acked-by: Joe Thornber <ejt@redhat.com>
Cc: stable@vger.kernel.org # v3.10+

dm table: fail dm_table_create on dm_round_up overflow · 5b2d0657

Committed by Mikulas Patocka on Nov 22, 2013

The dm_round_up function may overflow to zero.  In this case,
dm_table_create() must fail rather than go on to allocate an empty array
with alloc_targets().

This fixes a possible memory corruption that could be caused by passing
too large a number in "param->target_count".
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Cc: stable@vger.kernel.org
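The overflow is easy to reproduce with wrapping arithmetic. The following Python sketch assumes a 32-bit unsigned count and a node size of 8 keys; both are illustrative assumptions, and `round_up_u32` and `checked_target_count` are hypothetical names, not kernel functions.

```python
U32_MASK = 0xFFFFFFFF  # assumption: the target count is a 32-bit unsigned integer


def round_up_u32(n, sz):
    """Round n up to a multiple of sz using wrapping 32-bit arithmetic."""
    wrapped_sum = (n + sz - 1) & U32_MASK  # this addition is where the wrap happens
    return ((wrapped_sum // sz) * sz) & U32_MASK


def checked_target_count(requested, keys_per_node=8):
    """Return the rounded count, or None when rounding produced zero."""
    rounded = round_up_u32(requested, keys_per_node)
    if rounded == 0:
        # The caller should fail here rather than allocate an empty array.
        return None
    return rounded
```

For example, `round_up_u32(0xFFFFFFFF, 8)` wraps to 0, so `checked_target_count(0xFFFFFFFF)` reports failure, while an ordinary count such as 9 is rounded up to 16.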

dm snapshot: avoid snapshot space leak on crash · 230c83af

Committed by Mikulas Patocka on Nov 29, 2013

There is a possible leak of snapshot space in case of crash.

The reason for space leaking is that chunks in the snapshot device are
allocated sequentially, but they are finished (and stored in the metadata)
out of order, depending on the order in which copying finished.

For example, suppose that the metadata contains the following records:
SUPERBLOCK
METADATA (blocks 0 ... 250)
DATA 0
DATA 1
DATA 2
...
DATA 250

Now suppose that you allocate 10 new data blocks 251-260. Suppose that
copying of these blocks finishes out of order (block 260 finishes first
and block 251 finishes last). Now, the snapshot device looks like
this:
SUPERBLOCK
METADATA (blocks 0 ... 250, 260, 259, 258, 257, 256)
DATA 0
DATA 1
DATA 2
...
DATA 250
DATA 251
DATA 252
DATA 253
DATA 254
DATA 255
METADATA (blocks 255, 254, 253, 252, 251)
DATA 256
DATA 257
DATA 258
DATA 259
DATA 260

Now, if the machine crashes after writing the first metadata block but
before writing the second metadata block, the space for areas DATA 251-255
is leaked: it contains no valid data and it will never be used in the
future.

This patch makes dm-snapshot complete exceptions in the same order they
were allocated, thus fixing this bug.

Note: when backporting this patch to the stable kernel, change the version
field in the following way:
* if version in the stable kernel is {1, 11, 1}, change it to {1, 12, 0}
* if version in the stable kernel is {1, 10, 0} or {1, 10, 1}, change it
  to {1, 10, 2}
Userspace reads the version to determine if the bug was fixed, so the
version change is needed.
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Cc: stable@vger.kernel.org
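The idea of completing exceptions in allocation order can be sketched as a small reorder buffer. This is a Python illustration under stated assumptions, not the dm-snapshot implementation: the class `InOrderCommitter` and its sequence numbers are hypothetical. A chunk whose copy has finished is held back until every earlier-allocated chunk has finished too.

```python
class InOrderCommitter:
    """Commit finished chunks strictly in the order they were allocated."""

    def __init__(self):
        self.next_to_allocate = 0  # sequence number handed to the next chunk
        self.next_to_commit = 0    # oldest chunk not yet written to metadata
        self.finished = set()      # copies done, but waiting for older chunks
        self.committed = []        # order in which chunks reached the metadata

    def allocate(self):
        seq = self.next_to_allocate
        self.next_to_allocate += 1
        return seq

    def copy_finished(self, seq):
        self.finished.add(seq)
        # Drain every chunk that is now contiguous with the committed prefix.
        while self.next_to_commit in self.finished:
            self.finished.remove(self.next_to_commit)
            self.committed.append(self.next_to_commit)
            self.next_to_commit += 1
```

Because the committed list is always a contiguous prefix of the allocated chunks, a crash at any point cannot leave an unrecorded hole followed by recorded chunks, which is the leak scenario in the example.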

19 Nov, 2013 1 commit

dm delay: fix a possible deadlock due to shared workqueue · 718822c1

Committed by Mikulas Patocka on Nov 15, 2013

The dm-delay target uses a shared workqueue for multiple instances.  This
can cause a deadlock if two or more dm-delay targets are stacked on top
of each other.

This patch changes dm-delay to use a per-instance workqueue.

Cc: stable@vger.kernel.org # 2.6.22+
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>

13 Nov, 2013 1 commit

dm cache: resolve small nits and improve Documentation · 7b6b2bc9

Committed by Mike Snitzer on Nov 12, 2013

Document passthrough mode, cache shrinking, and cache invalidation.
Also, use strcasecmp() and hlist_unhashed().
Reported-by: Alasdair G Kergon <agk@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>

12 Nov, 2013 6 commits

dm cache: add cache block invalidation support · 65790ff9

Committed by Joe Thornber on Nov 08, 2013

Cache block invalidation is removing an entry from the cache without
writing it back.  Cache blocks can be invalidated via the
'invalidate_cblocks' message, which takes an arbitrary number of cblock
ranges:
   invalidate_cblocks [<cblock>|<cblock begin>-<cblock end>]*

E.g.
   dmsetup message my_cache 0 invalidate_cblocks 2345 3456-4567 5678-6789
Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
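The message grammar above can be parsed in a few lines. This Python sketch only illustrates the argument syntax; `parse_cblock_ranges` is a hypothetical helper, and treating a single cblock N as the half-open range [N, N + 1) with `<cblock end>` exclusive is an assumption that the commit message does not spell out.

```python
def parse_cblock_ranges(args):
    """Parse '<cblock>' or '<begin>-<end>' arguments into (begin, end) pairs."""
    ranges = []
    for arg in args:
        begin_text, sep, end_text = arg.partition("-")
        begin = int(begin_text)
        # Assumption: a lone cblock N means the one-block range [N, N + 1).
        end = int(end_text) if sep else begin + 1
        if begin >= end:
            raise ValueError(f"invalid cblock range: {arg!r}")
        ranges.append((begin, end))
    return ranges
```

Applied to the arguments of the example message, `parse_cblock_ranges("2345 3456-4567 5678-6789".split())` yields one single-block range followed by two explicit ranges.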

dm cache: add remove_cblock method to policy interface · 532906aa

Committed by Joe Thornber on Nov 08, 2013

Implement policy_remove_cblock() and add remove_cblock method to the mq
policy.  These methods will be used by the following cache block
invalidation patch which adds the 'invalidate_cblocks' message to the
cache core.

Also, update some comments in dm-cache-policy.h
Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>

dm cache policy mq: reduce memory requirements · 633618e3

Committed by Joe Thornber on Nov 09, 2013

Rather than storing the cblock in each cache entry, we allocate all
entries in an array and infer the cblock from the entry position.

Saves 4 bytes of memory per cache block.  In addition, this gives us an
easy way of looking up cache entries by cblock.

We no longer need to keep an explicit bitset to track which cblocks
have been allocated.  And no searching is needed to find free cblocks.
Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
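A rough Python analogue of this layout is shown below. It is a sketch under assumptions, not the mq policy code: `EntryPool` is a hypothetical class, and the Python list of free indices stands in for whatever free-entry list the kernel keeps. The point it illustrates is that an entry's position in the preallocated array is its cblock, so no per-entry cblock field and no allocation bitset are needed.

```python
class EntryPool:
    """Cache entries live in one preallocated list; the index is the cblock."""

    def __init__(self, nr_cblocks):
        # Note that the entries do not store their own cblock.
        self.entries = [{"oblock": None} for _ in range(nr_cblocks)]
        # Free positions are kept on a stack, so no search is needed.
        self.free = list(range(nr_cblocks - 1, -1, -1))

    def alloc(self, oblock):
        """Claim a free entry for oblock and return its cblock, or None if full."""
        if not self.free:
            return None
        cblock = self.free.pop()
        self.entries[cblock]["oblock"] = oblock
        return cblock

    def lookup(self, cblock):
        """Look up an entry by cblock with a plain index operation."""
        return self.entries[cblock]["oblock"]

    def release(self, cblock):
        self.entries[cblock]["oblock"] = None
        self.free.append(cblock)
```

Lookup by cblock becomes a single index operation, which is the "easy way of looking up cache entries by cblock" that the message mentions.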

dm cache metadata: check the metadata version when reading the superblock · 53d49819

Committed by Joe Thornber on Oct 16, 2013

Check the version to verify that the on-disk metadata is supported.
Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>

dm cache: add passthrough mode · 2ee57d58

Committed by Joe Thornber on Oct 24, 2013

"Passthrough" is a dm-cache operating mode (like writethrough or
writeback) which is intended to be used when the cache contents are not
known to be coherent with the origin device.  It behaves as follows:

* All reads are served from the origin device (all reads miss the cache)
* All writes are forwarded to the origin device; additionally, write
  hits cause cache block invalidates

This mode decouples cache coherency checks from cache device creation,
largely to avoid having to perform coherency checks while booting.  Boot
scripts can create cache devices in passthrough mode and put them into
service (mount cached filesystems, for example) without having to worry
about coherency.  Coherency that exists is maintained, although the
cache will gradually cool as writes take place.

Later, applications can perform coherency checks, the nature of which
will depend on the type of the underlying storage.  If coherency can be
verified, the cache device can be transitioned to writethrough or
writeback mode while still warm; otherwise, the cache contents can be
discarded prior to transitioning to the desired operating mode.
Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com>
Signed-off-by: Morgan Mears <Morgan.Mears@netapp.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
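The two passthrough rules can be modelled with a pair of dictionaries. This is a minimal Python sketch with hypothetical names (`PassthroughCache`, `origin`, `cached`); it shows only the read and write rules listed in the message, not the real dm-cache data path.

```python
class PassthroughCache:
    """Toy model of passthrough mode: the origin device is always authoritative."""

    def __init__(self, origin, cached):
        self.origin = origin        # block number -> data on the origin device
        self.cached = dict(cached)  # block number -> possibly stale cached data

    def read(self, block):
        # Every read is served by the origin, even if the block is cached.
        return self.origin.get(block)

    def write(self, block, data):
        # Every write goes to the origin; a write hit also invalidates
        # the corresponding cache block.
        self.origin[block] = data
        self.cached.pop(block, None)
```

Blocks that are never written stay in the cache, which matches the remark that existing coherency is maintained while the cache gradually cools.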

dm cache: cache shrinking support · f494a9c6

Committed by Joe Thornber on Oct 31, 2013

Allow a cache to shrink if the blocks being removed from the cache are
not dirty.
Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>

10 Nov, 2013 21 commits

dm cache: promotion optimisation for writes · c9d28d5d

Committed by Joe Thornber on Oct 31, 2013

If a write triggers promotion and covers a whole cache block, we can
avoid a copy.

Introduce dm_{hook,unhook}_bio to simplify saving and restoring bio
fields (bi_private is now used by overwrite).  Switch writethrough
support over to using these helpers too.
Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>

dm cache: be much more aggressive about promoting writes to discarded blocks · c86c3070

Committed by Joe Thornber on Oct 24, 2013

Previously these promotions only got priority if there were unused cache
blocks.  Now we give them priority if there are any clean blocks in the
cache.

The fio_soak_test in the device-mapper-test-suite now gives uniform
performance across subvolumes (~16 seconds).
Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>

dm cache policy mq: implement writeback_work() and mq_{set,clear}_dirty() · 01911c19

Committed by Joe Thornber on Oct 24, 2013

There are now two multiqueues for in-cache blocks: a clean one and a
dirty one.

writeback_work comes from the dirty one.  Demotions come from the clean
one.

There are two benefits:
- Performance improvement, since demoting a clean block is a no-op.
- The cache cleans itself when I/O load is light.
Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>

dm cache: optimize commit_if_needed · ffcbcb67

Committed by Heinz Mauelshagen on Oct 14, 2013

Check the commit_requested flag _before_ calling
dm_cache_changed_this_transaction(), so that the call is skipped when it
would be superfluous.

Also, be sure to set last_commit_jiffies _after_ dm_cache_commit()
completes.
Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com>
Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>

dm space map disk: optimise sm_disk_dec_block · 40c57f47

Committed by Joe Thornber on Aug 09, 2013

Don't waste time spotting blocks that have been allocated and then freed
in the same transaction.

The extra lookup is expensive, and I don't think it really gives us much.
Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>

MAINTAINERS: add reference to device-mapper's linux-dm.git tree · 41d35d25

Committed by Mike Snitzer on Nov 04, 2013

Signed-off-by: Mike Snitzer <snitzer@redhat.com>

dm: fix Kconfig menu indentation · 5442851e

Committed by Mikulas Patocka on Nov 08, 2013

The option DM_LOG_USERSPACE is a sub-option of DM_MIRROR, so place it
right after DM_MIRROR.  Doing so fixes various other Device mapper
targets/features to be properly nested under "Device mapper support".
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>

dm: allow remove to be deferred · 2c140a24

Committed by Mikulas Patocka on Nov 01, 2013

This patch allows the removal of an open device to be deferred until
it is closed.  (Previously such a removal attempt would fail.)

The deferred remove functionality is enabled by setting the flag
DM_DEFERRED_REMOVE in the ioctl structure on DM_DEV_REMOVE or
DM_REMOVE_ALL ioctl.

On return from DM_DEV_REMOVE, the flag DM_DEFERRED_REMOVE indicates if
the device was removed immediately or flagged to be removed on close -
if the flag is clear, the device was removed.

On return from DM_DEV_STATUS and other ioctls, the flag
DM_DEFERRED_REMOVE is set if the device is scheduled to be removed on
closure.

A device that is scheduled to be deleted can be revived using the
message "@cancel_deferred_remove". This message clears the
DMF_DEFERRED_REMOVE flag so that the device won't be deleted on close.
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
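The lifecycle described in this commit can be sketched as a small state machine. The Python model below is illustrative only: the `Device` class and its method names are hypothetical, and raising `EBUSY` for a non-deferred removal of an open device is an assumption standing in for "such a removal attempt would fail".

```python
import errno


class Device:
    """Toy model of a mapped device with deferred removal."""

    def __init__(self):
        self.open_count = 0
        self.deferred_remove = False  # plays the role of the deferred-remove flag
        self.removed = False

    def open(self):
        self.open_count += 1

    def close(self):
        self.open_count -= 1
        # The last close carries out a removal that was deferred earlier.
        if self.open_count == 0 and self.deferred_remove:
            self.removed = True

    def remove(self, deferred=False):
        """Return True if removed immediately, False if removal was deferred."""
        if self.open_count == 0:
            self.removed = True
            return True
        if not deferred:
            # Assumption: removing an open device without the flag fails as busy.
            raise OSError(errno.EBUSY, "device is open")
        self.deferred_remove = True
        return False

    def cancel_deferred_remove(self):
        # Mirrors the "@cancel_deferred_remove" message: clear the flag.
        self.deferred_remove = False
```

The return value of `remove()` plays the role of the flag reported back from the removal ioctl: `True` means the device is already gone, `False` means it will go away on the last close unless the deferral is cancelled.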

dm table: print error on preresume failure · 7833b08e

Committed by Mike Snitzer on Oct 24, 2013

If preresume fails it is worth logging an error given that a device is
left suspended due to the failure.

This change was motivated by local preresume error logging that was
added to the cache target ("preresume failed").  It makes sense to elevate
this to the DM core, which provides target-agnostic context for where the
target-specific error occurred relative to the DM core's callouts.
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Joe Thornber <ejt@redhat.com>

dm crypt: add TCW IV mode for old CBC TCRYPT containers · ed04d981

Committed by Milan Broz on Oct 28, 2013

dm-crypt can already activate TCRYPT (TrueCrypt compatible) containers
in LRW or XTS block encryption mode.

TCRYPT containers prior to version 4.1 use CBC mode with some additional
tweaks; this patch adds support for these containers.

This new mode is implemented using a special IV generator named TCW
(TrueCrypt IV with whitening).  TCW IV only supports containers that are
encrypted with one cipher (tested with AES, Twofish, Serpent, CAST5 and
TripleDES).

While this mode is legacy and is known to be vulnerable to some
watermarking attacks (e.g. revealing of hidden disk existence) it can
still be useful to activate old containers without using 3rd party
software or for independent forensic analysis of such containers.

(Both the userspace and kernel code are independent implementations
based on the format documentation and completely avoid use of the
original source code.)

The TCW IV generator uses two additional keys: Kw (whitening seed, size
is always 16 bytes - TCW_WHITENING_SIZE) and Kiv (IV seed, size is
always the IV size of the selected cipher).  These keys are concatenated
at the end of the main encryption key provided in the mapping table.

While whitening is completely independent of the IV, it is implemented
inside the IV generator for simplicity.

The whitening value is always 16 bytes long and is calculated per sector
from provided Kw as initial seed, xored with sector number and mixed
with CRC32 algorithm.  Resulting value is xored with ciphertext sector
content.

IV is calculated from the provided Kiv as initial IV seed and xored with
sector number.

The detailed calculation can be found in the TrueCrypt documentation for
versions < 4.1 and will also be described on the dm-crypt site, see:
http://code.google.com/p/cryptsetup/wiki/DMCrypt

Experimental support for activating these containers is already
present in the git devel branch of cryptsetup.
Signed-off-by: Milan Broz <gmazyland@gmail.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>

dm crypt: properly handle extra key string in initialization · da31a078

Committed by Milan Broz on Oct 28, 2013

Some encryption modes use extra keys (e.g. loopAES has IV seed) which
are not used in block cipher initialization but are part of key string
in table constructor.

This patch adds an additional field which describes the length of the
extra key(s) and subtracts it before setting the real encryption key.

The key_size always includes the size, in bytes, of the key provided
in mapping table.

The key_parts describes how many parts (usually keys) are contained in
the whole key buffer.  And key_extra_size contains size in bytes of
additional keys part (this number of bytes must be subtracted because it
is processed by the IV generator).

| K1 | K2 | .... | K64 |      Kiv       |
|----------- key_size ----------------- |
|                      |-key_extra_size-|
|     [64 keys]        |  [1 key]       | => key_parts = 65

Example where key string contains main key K, whitening key
Kw and IV seed Kiv:

|     K       |   Kiv   |       Kw      |
|--------------- key_size --------------|
|             |-----key_extra_size------|
|  [1 key]    | [1 key] |     [1 key]   | => key_parts = 3

Because key_extra_size is calculated during IV mode setting, key
initialization is moved after this step.

For now, this change has no effect on supported modes (thanks to ilog2
rounding) but it is required by the following patch.

Also, fix a sparse warning in crypt_iv_lmk_one().
Signed-off-by: Milan Broz <gmazyland@gmail.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
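The key layout in the second diagram (K | Kiv | Kw) can be illustrated by slicing a byte string. This Python sketch is not the dm-crypt code: `split_key` and `split_tcw_extra` are hypothetical helpers, and the default 16-byte whitening size is taken from the TCW description as an assumption about how the extra part is divided.

```python
def split_key(key, key_extra_size):
    """Split the table key into (cipher key, extra part for the IV generator)."""
    if key_extra_size > len(key):
        raise ValueError("key_extra_size exceeds key_size")
    cipher_len = len(key) - key_extra_size  # key_size minus key_extra_size
    return key[:cipher_len], key[cipher_len:]


def split_tcw_extra(extra, iv_size, whitening_size=16):
    """Split the extra part into (Kiv, Kw), following the K | Kiv | Kw layout."""
    if len(extra) != iv_size + whitening_size:
        raise ValueError("unexpected extra key length")
    return extra[:iv_size], extra[iv_size:]
```

For a 64-byte key string with a 16-byte IV size, `split_key(key, 32)` leaves 32 bytes for the cipher, and `split_tcw_extra` then yields a 16-byte Kiv followed by a 16-byte Kw.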

dm cache: log error message if dm_kcopyd_copy() fails · 2c2263c9

Committed by Heinz Mauelshagen on Oct 14, 2013

A migration failure should be logged (albeit limited).
Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com>
Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>

dm cache: use cell_defer() boolean argument consistently · 80f659f3

Committed by Heinz Mauelshagen on Oct 14, 2013

Fix a few cell_defer() calls that weren't passing a bool.
Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com>
Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>

dm cache: return -EINVAL if the user specifies unknown cache policy · 4cb3e1db

Committed by Mikulas Patocka on Oct 01, 2013

Return -EINVAL when the specified cache policy is unknown rather than
returning -ENOMEM.
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>

dm cache metadata: return bool from __superblock_all_zeroes · dd8b0c20

Committed by Joe Thornber on Oct 24, 2013

Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>

dm cache policy mq: a few small fixes · 0184b44e

Committed by Joe Thornber on Oct 24, 2013

Rename takeout_queue to concat_queue.

Fix a harmless bug in the mq policy's pop() function.  Currently pop()
always succeeds; with upcoming changes this won't be the case.

Fix typo in comment above pre_cache_to_cache prototype.
Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>

dm cache policy: remove return from void policy_remove_mapping · 3351937e

Committed by Joe Thornber on Oct 24, 2013

No need to return from a void function.
Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>

dm cache: improve efficiency of quiescing flag management · 238f8363

Committed by Joe Thornber on Oct 30, 2013

Make the quiescing flag an atomic_t and stop protecting it with a spin
lock.
Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>

dm cache: fix a race condition between queuing new migrations and quiescing for a shutdown · 66cb1910

Committed by Joe Thornber on Oct 30, 2013

The code that was trying to do this was inadequate.  The postsuspend
method (in ioctl context), needs to wait for the worker thread to
acknowledge the request to quiesce.  Otherwise the migration count may
drop to zero temporarily before the worker thread realises we're
quiescing.  In this case the target will be taken down, but the worker
thread may have issued a new migration, which will cause an oops when
it completes.
Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Cc: stable@vger.kernel.org # 3.9+

dm cache: io destined for the cache device can now serve as tick bios · f8e5f01a

Committed by Joe Thornber on Oct 21, 2013

Previously only origin bios could trigger ticks, which meant if all
the io was destined for the cache no ticks were generated.  If no ticks
are generated then multiple hits, and movements in general, are
attributed to the same tick.

This is only a stopgap fix; we need a better solution.
Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>

dm cache policy mq: protect residency method with existing mutex · 99ba2ae4

Committed by Joe Thornber on Oct 21, 2013

It is safe to use a mutex in mq_residency() at this point since it is
only called from ioctl context.  But future-proof mq_residency() by
using might_sleep() to catch new contexts that cannot sleep.
Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>

06 Nov, 2013 2 commits

dm array: fix bug in growing array · 9c1d4de5

Committed by Joe Thornber on Oct 30, 2013

Entries would be lost if the old tail block was partially filled.
Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Cc: stable@vger.kernel.org # 3.9+

dm mpath: requeue I/O during pg_init · b63349a7

Committed by Hannes Reinecke on Oct 01, 2013

When pg_init is running no I/O can be submitted to the underlying
devices, as the path priority etc might change.  When using queue_io for
this, requests will be piling up within multipath as the block I/O
scheduler just sees a _very fast_ device.  All of this queued I/O has to
be resubmitted from within multipathing once pg_init is done.

This approach has the problem that it's virtually impossible to
abort I/O when pg_init is running, and we're adding heavy load
to the devices after pg_init since all of the queued I/O needs to be
resubmitted _before_ any requests can be pulled off of the request queue
and normal operation continues.

This patch will requeue the I/O that triggers the pg_init call, and
return 'busy' when pg_init is in progress.  With these changes the block
I/O scheduler will stop submitting I/O during pg_init, resulting in a
quicker path switch and less I/O pressure (and memory consumption) after
pg_init.
Signed-off-by: Hannes Reinecke <hare@suse.de>
[patch header edited for clarity and typos by Mike Snitzer]
Signed-off-by: Mike Snitzer <snitzer@redhat.com>

01 Nov, 2013 2 commits

dm mpath: fix race condition between multipath_dtr and pg_init_done · 954a73d5

Committed by Shiva Krishna Merla on Oct 30, 2013

Whenever multipath_dtr() is happening we must prevent queueing any
further path activation work.  Implement this by adding a new
'pg_init_disabled' flag to the multipath structure that denotes future
path activation work should be skipped if it is set.  By disabling
pg_init and then re-enabling in flush_multipath_work() we also avoid the
potential for pg_init to be initiated while suspending an mpath device.

Without this patch a race condition exists that may result in a kernel
panic:

1) If after pg_init_done() decrements pg_init_in_progress to 0, a call
   to wait_for_pg_init_completion() assumes there are no more pending path
   management commands.
2) If pg_init_required is set by pg_init_done(), due to retryable
   mode_select errors, then process_queued_ios() will again queue the
   path activation work.
3) If free_multipath() completes before activate_path() work is called a
   NULL pointer dereference like the following can be seen when
   accessing members of the recently destructed multipath:

BUG: unable to handle kernel NULL pointer dereference at 0000000000000090
RIP: 0010:[<ffffffffa003db1b>]  [<ffffffffa003db1b>] activate_path+0x1b/0x30 [dm_multipath]
[<ffffffff81090ac0>] worker_thread+0x170/0x2a0
[<ffffffff81096c80>] ? autoremove_wake_function+0x0/0x40

[switch to disabling pg_init in flush_multipath_work & header edits by Mike Snitzer]
Signed-off-by: Shiva Krishna Merla <shivakrishna.merla@netapp.com>
Reviewed-by: Krishnasamy Somasundaram <somasundaram.krishnasamy@netapp.com>
Tested-by: Speagle Andy <Andy.Speagle@netapp.com>
Acked-by: Junichi Nomura <j-nomura@ce.jp.nec.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Cc: stable@vger.kernel.org

dm: allocate buffer for messages with small number of arguments using GFP_NOIO · f36afb39

Committed by Mikulas Patocka on Oct 31, 2013

dm-mpath and dm-thin must process messages even if some device is
suspended, so we allocate argv buffer with GFP_NOIO. These messages have
a small fixed number of arguments.

On the other hand, dm-switch needs to process bulk data using messages
so excessive use of GFP_NOIO could cause trouble.

The patch also lowers the default number of arguments from 64 to 8, so
that there is a smaller load on GFP_NOIO allocations.
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Cc: stable@vger.kernel.org
Acked-by: Alasdair G Kergon <agk@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>

14 Oct, 2013 2 commits

Linux 3.12-rc5 · 61e6cfa8

Committed by Linus Torvalds on Oct 13, 2013

Merge git://www.linux-watchdog.org/linux-watchdog · 73cac03d

Committed by Linus Torvalds on Oct 13, 2013

Pull watchdog fixes from Wim Van Sebroeck:
 "This will fix a deadlock on the ts72xx_wdt driver, fix bitmasks in the
  kempld_wdt driver and fix a section mismatch in the sunxi_wdt driver"

* git://www.linux-watchdog.org/linux-watchdog:
  watchdog: sunxi: Fix section mismatch
  watchdog: kempld_wdt: Fix bit mask definition
  watchdog: ts72xx_wdt: locking bug in ioctl

