1. 04 Feb 2022, 12 commits
  2. 02 Feb 2022, 9 commits
  3. 29 Jan 2022, 2 commits
  4. 07 Jan 2022, 9 commits
    • md: use default_groups in kobj_type · 1745e857
      Greg Kroah-Hartman committed
      There are currently two ways to create a set of sysfs files for a
      kobj_type: through the default_attrs field and through the
      default_groups field.  Move the md rdev sysfs code over to the
      default_groups field, which has been the preferred way since commit
      aa30f47c ("kobject: Add support for default attribute groups to
      kobj_type"), so that the obsolete default_attrs field can soon be
      removed.
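
      For context, the preferred pattern looks roughly like this (a minimal
      sketch rather than the actual md diff; the identifier names are
      illustrative, and ATTRIBUTE_GROUPS() generates the *_groups array from
      the matching *_attrs list):

        /* Sketch: expose a kobj_type's attributes through default_groups. */
        static struct attribute *rdev_default_attrs[] = {
                &rdev_state.attr,               /* existing rdev sysfs entries */
                NULL,
        };
        ATTRIBUTE_GROUPS(rdev_default);         /* emits rdev_default_groups[] */

        static struct kobj_type rdev_ktype = {
                .release        = rdev_free,
                .sysfs_ops      = &rdev_sysfs_ops,
                .default_groups = rdev_default_groups,  /* was .default_attrs */
        };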
      
      Cc: Song Liu <song@kernel.org>
      Cc: linux-raid@vger.kernel.org
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: Song Liu <song@kernel.org>
    • md: Move alloc/free acct bioset in to personality · 0c031fd3
      Xiao Ni committed
      The bioset used for I/O accounting (acct) is only needed for raid0 and
      raid5, so md_run() only allocates it for those levels. However, this
      does not cover personality takeover, which can leave the bioset
      uninitialized. For example, the following repro steps:
      
        mdadm -CR /dev/md0 -l1 -n2 /dev/loop0 /dev/loop1
        mdadm --wait /dev/md0
        mkfs.xfs /dev/md0
        mdadm /dev/md0 --grow -l5
        mount /dev/md0 /mnt
      
      cause a panic like:
      
      [  225.933939] BUG: kernel NULL pointer dereference, address: 0000000000000000
      [  225.934903] #PF: supervisor instruction fetch in kernel mode
      [  225.935639] #PF: error_code(0x0010) - not-present page
      [  225.936361] PGD 0 P4D 0
      [  225.936677] Oops: 0010 [#1] PREEMPT SMP DEBUG_PAGEALLOC KASAN PTI
      [  225.937525] CPU: 27 PID: 1133 Comm: mount Not tainted 5.16.0-rc3+ #706
      [  225.938416] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-2.module_el8.4.0+547+a85d02ba 04/01/2014
      [  225.939922] RIP: 0010:0x0
      [  225.940289] Code: Unable to access opcode bytes at RIP 0xffffffffffffffd6.
      [  225.941196] RSP: 0018:ffff88815897eff0 EFLAGS: 00010246
      [  225.941897] RAX: 0000000000000000 RBX: 0000000000092800 RCX: ffffffff81370a39
      [  225.942813] RDX: dffffc0000000000 RSI: 0000000000000000 RDI: 0000000000092800
      [  225.943772] RBP: 1ffff1102b12fe04 R08: fffffbfff0b43c01 R09: fffffbfff0b43c01
      [  225.944807] R10: ffffffff85a1e007 R11: fffffbfff0b43c00 R12: ffff88810eaaaf58
      [  225.945757] R13: 0000000000000000 R14: ffff88810eaaafb8 R15: ffff88815897f040
      [  225.946709] FS:  00007ff3f2505080(0000) GS:ffff888fb5e00000(0000) knlGS:0000000000000000
      [  225.947814] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [  225.948556] CR2: ffffffffffffffd6 CR3: 000000015aa5a006 CR4: 0000000000370ee0
      [  225.949537] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      [  225.950455] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      [  225.951414] Call Trace:
      [  225.951787]  <TASK>
      [  225.952120]  mempool_alloc+0xe5/0x250
      [  225.952625]  ? mempool_resize+0x370/0x370
      [  225.953187]  ? rcu_read_lock_sched_held+0xa1/0xd0
      [  225.953862]  ? rcu_read_lock_bh_held+0xb0/0xb0
      [  225.954464]  ? sched_clock_cpu+0x15/0x120
      [  225.955019]  ? find_held_lock+0xac/0xd0
      [  225.955564]  bio_alloc_bioset+0x1ed/0x2a0
      [  225.956080]  ? lock_downgrade+0x3a0/0x3a0
      [  225.956644]  ? bvec_alloc+0xc0/0xc0
      [  225.957135]  bio_clone_fast+0x19/0x80
      [  225.957651]  raid5_make_request+0x1370/0x1b70
      [  225.958286]  ? sched_clock_cpu+0x15/0x120
      [  225.958797]  ? __lock_acquire+0x8b2/0x3510
      [  225.959339]  ? raid5_get_active_stripe+0xce0/0xce0
      [  225.959986]  ? lock_is_held_type+0xd8/0x130
      [  225.960528]  ? rcu_read_lock_sched_held+0xa1/0xd0
      [  225.961135]  ? rcu_read_lock_bh_held+0xb0/0xb0
      [  225.961703]  ? sched_clock_cpu+0x15/0x120
      [  225.962232]  ? lock_release+0x27a/0x6c0
      [  225.962746]  ? do_wait_intr_irq+0x130/0x130
      [  225.963302]  ? lock_downgrade+0x3a0/0x3a0
      [  225.963815]  ? lock_release+0x6c0/0x6c0
      [  225.964348]  md_handle_request+0x342/0x530
      [  225.964888]  ? set_in_sync+0x170/0x170
      [  225.965397]  ? blk_queue_split+0x133/0x150
      [  225.965988]  ? __blk_queue_split+0x8b0/0x8b0
      [  225.966524]  ? submit_bio_checks+0x3b2/0x9d0
      [  225.967069]  md_submit_bio+0x127/0x1c0
      [...]
      
      Fix this by moving the alloc/free of the acct bioset into pers->run and
      pers->free.

      While we are at it, properly handle the md_integrity_register() error in
      raid0_run().
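
      A rough sketch of the resulting shape (the acct_bioset_init() and
      acct_bioset_exit() helper names are assumed here for illustration only):

        /* Sketch: the personality, not md_run(), owns the accounting bioset. */
        static int raid0_run(struct mddev *mddev)
        {
                int ret;

                ret = acct_bioset_init(mddev);  /* assumed helper around bioset_init() */
                if (ret)
                        return ret;

                /* ... existing raid0 setup ... */

                ret = md_integrity_register(mddev);  /* now checked, not ignored */
                if (ret)
                        acct_bioset_exit(mddev);
                return ret;
        }

        static void raid0_free(struct mddev *mddev, void *priv)
        {
                /* ... existing teardown ... */
                acct_bioset_exit(mddev);
        }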
      
      Fixes: daee2024 ("md: check level before create and exit io_acct_set")
      Cc: stable@vger.kernel.org
      Acked-by: Guoqing Jiang <guoqing.jiang@linux.dev>
      Signed-off-by: Xiao Ni <xni@redhat.com>
      Signed-off-by: Song Liu <song@kernel.org>
    • md: fix spelling of "its" · dd3dc5f4
      Randy Dunlap committed
      Use the possessive "its" instead of the contraction "it's"
      in printed messages.
      Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
      Cc: Song Liu <song@kernel.org>
      Cc: linux-raid@vger.kernel.org
      Signed-off-by: Song Liu <song@kernel.org>
    • md: raid456 add nowait support · bf2c411b
      Vishal Verma committed
      Return EAGAIN when the raid456 driver would otherwise block waiting for
      a reshape.
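
      The general shape of such a check, as a sketch (not the exact raid5.c
      hunk; bio_wouldblock_error() completes the bio with BLK_STS_AGAIN):

        /* Sketch: fail fast instead of sleeping when REQ_NOWAIT is set. */
        if (bio->bi_opf & REQ_NOWAIT) {
                bio_wouldblock_error(bio);      /* ends the bio with BLK_STS_AGAIN */
                return true;                    /* bio handled, nothing queued */
        }
        /* otherwise fall through to the existing wait for the reshape */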
      Reviewed-by: Jens Axboe <axboe@kernel.dk>
      Signed-off-by: Vishal Verma <vverma@digitalocean.com>
      Signed-off-by: Song Liu <song@kernel.org>
    • md: raid10 add nowait support · c9aa889b
      Vishal Verma committed
      This adds nowait support to the RAID10 driver, very similar to the
      raid1 driver changes. It makes the RAID10 driver return EAGAIN in
      situations where it could otherwise wait, for example:
      
        - Waiting for the barrier,
        - Reshape operation,
        - Discard operation.
      
      wait_barrier() and regular_request_wait() are modified to return bool so
      that barrier waits can report an error. They return true if the wait
      completed or no wait was required, and false if a wait was required but
      was skipped in order to honor nowait.
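
      In outline, the new contract looks like this (a sketch with an
      illustrative barrier_is_raised() helper, not the exact raid10.c code):

        /* Sketch: wait_barrier() now reports whether the caller may proceed. */
        static bool wait_barrier(struct r10conf *conf, bool nowait)
        {
                if (barrier_is_raised(conf)) {  /* illustrative condition */
                        if (nowait)
                                return false;   /* would block: caller returns EAGAIN */
                        wait_event(conf->wait_barrier, !barrier_is_raised(conf));
                }
                /* ... pending-I/O bookkeeping elided ... */
                return true;
        }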
      Reviewed-by: Jens Axboe <axboe@kernel.dk>
      Signed-off-by: Vishal Verma <vverma@digitalocean.com>
      Signed-off-by: Song Liu <song@kernel.org>
    • md: raid1 add nowait support · 5aa70503
      Vishal Verma committed
      This adds nowait support to the RAID1 driver. It makes the RAID1 driver
      return EAGAIN in situations where it could otherwise wait, for example:
        - Waiting for the barrier,
      
      wait_barrier() is modified to return bool so that barrier waits can
      report an error. It returns true if the wait completed or no wait was
      required, and false if a wait was required but was skipped in order to
      honor nowait.
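
      On the caller side the pattern is then roughly as follows (a sketch;
      compare the raid10 wait_barrier() sketch above):

        /* Sketch: a write path bails out when the barrier wait was skipped. */
        if (!wait_barrier(conf, bio->bi_opf & REQ_NOWAIT)) {
                bio_wouldblock_error(bio);      /* ends the bio with BLK_STS_AGAIN */
                return true;
        }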
      Reviewed-by: Jens Axboe <axboe@kernel.dk>
      Signed-off-by: Vishal Verma <vverma@digitalocean.com>
      Signed-off-by: Song Liu <song@kernel.org>
    • md: add support for REQ_NOWAIT · f51d46d0
      Vishal Verma committed
      Commit 021a2446 ("block: add QUEUE_FLAG_NOWAIT") added support for
      checking whether a given bdev supports handling of REQ_NOWAIT. Commit
      6abc4946 ("dm: add support for REQ_NOWAIT and enable it for linear
      target") then added REQ_NOWAIT support for dm. This patch uses a similar
      approach to incorporate REQ_NOWAIT for md-based bios.
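
      For a bio-based driver this largely amounts to advertising the
      capability on the queue and failing fast in any path that would
      otherwise sleep (a sketch; the exact placement inside md is not shown,
      and would_block is an illustrative placeholder):

        /* Sketch: advertise nowait support when the md queue is set up ... */
        blk_queue_flag_set(QUEUE_FLAG_NOWAIT, mddev->queue);

        /* ... and honor it wherever the code would otherwise block: */
        if ((bio->bi_opf & REQ_NOWAIT) && would_block) {
                bio_wouldblock_error(bio);
                return;
        }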
      
      This patch was tested using the t/io_uring tool shipped with fio. An
      NVMe drive was partitioned into two partitions and a simple RAID 0
      array /dev/md0 was created.
      
      md0 : active raid0 nvme4n1p1[1] nvme4n1p2[0]
            937423872 blocks super 1.2 512k chunks
      
      Before patch:
      
      $ ./t/io_uring /dev/md0 -p 0 -a 0 -d 1 -r 100
      
      While the above runs, in another terminal:
      
      $ ps -eL | grep $(pidof io_uring)
      
        38396   38396 pts/2    00:00:00 io_uring
        38396   38397 pts/2    00:00:15 io_uring
        38396   38398 pts/2    00:00:13 iou-wrk-38397
      
      We can see that an io worker thread (iou-wrk-38397) was created;
      io_uring creates one when it sees that the underlying device (/dev/md0
      in this case) doesn't support nowait.
      
      After patch:
      
      $ ./t/io_uring /dev/md0 -p 0 -a 0 -d 1 -r 100
      
      While the above runs, in another terminal:
      
      $ ps -eL | grep $(pidof io_uring)
      
        38341   38341 pts/2    00:10:22 io_uring
        38341   38342 pts/2    00:10:37 io_uring
      
      After this patch, no io worker thread is created, which indicates that
      io_uring saw that the underlying device does support nowait. This is
      exactly the behaviour observed on a dm device, which also supports
      nowait.

      For the raid personalities other than raid0, the code paths involving
      their make_request functions would also need to be taught to handle
      REQ_NOWAIT correctly.
      Reviewed-by: Jens Axboe <axboe@kernel.dk>
      Signed-off-by: Vishal Verma <vverma@digitalocean.com>
      Signed-off-by: Song Liu <song@kernel.org>
    • md: drop queue limitation for RAID1 and RAID10 · a92ce0fe
      Mariusz Tkaczyk committed
      As suggested by Neil Brown [1], this limitation appears to be obsolete.

      With plugging in use, writes are processed behind the raid thread and
      conf->pending_count is not increased; the limitation only takes effect
      when the caller doesn't use plugs.

      It can be avoided, and with plugging it usually is. There are no reports
      of the queue growing to an enormous size, so remove the queue limitation
      for non-plugged IOs too.
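
      For reference, the kind of throttle being removed looks roughly like
      this (an illustrative sketch of the non-plugged write path, not the
      exact hunk):

        /* Sketch: previously, a non-plugged write could be held back here. */
        if (conf->pending_count >= max_queued_requests) {
                md_wakeup_thread(mddev->thread);
                wait_event(conf->wait_barrier,
                           conf->pending_count < max_queued_requests);
        }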
      
      [1] https://lore.kernel.org/linux-raid/162496301481.7211.18031090130574610495@noble.neil.brown.name
      Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
      Signed-off-by: Song Liu <song@kernel.org>
    • md/raid5: play nice with PREEMPT_RT · 770b1d21
      Davidlohr Bueso committed
      raid_run_ops() relies on implicitly disabled preemption for its percpu
      ops, although what it really needs is CPU locality. This breaks RT
      semantics, as the code can take regular (and thus sleeping) spinlocks,
      such as stripe_lock.

      Add a local_lock such that non-RT behaviour does not change and
      continues to map to preempt_disable/enable, while RT becomes happy
      because the region will use a per-CPU spinlock and thus be preemptible
      while still guaranteeing CPU locality.
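
      The local_lock pattern itself, in generic form (illustrative structure
      and field names, not the actual raid5_percpu layout):

        #include <linux/local_lock.h>
        #include <linux/percpu.h>

        struct scratch {
                local_lock_t lock;
                int buf;
        };
        static DEFINE_PER_CPU(struct scratch, scratch) = {
                .lock = INIT_LOCAL_LOCK(lock),
        };

        static void use_scratch(void)
        {
                /* !RT: behaves like preempt_disable()/preempt_enable().
                 * RT:  takes a per-CPU spinlock, so the section stays
                 *      preemptible while still guaranteeing CPU locality. */
                local_lock(&scratch.lock);
                this_cpu_add(scratch.buf, 1);
                local_unlock(&scratch.lock);
        }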
      Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
      Signed-off-by: Song Liu <songliubraving@fb.com>
  5. 06 Jan 2022, 2 commits
  6. 05 Jan 2022, 5 commits
  7. 04 Jan 2022, 1 commit
    • md/raid1: fix missing bitmap update w/o WriteMostly devices · 46669e86
      Song Liu committed
      Commit [1] causes missing bitmap updates when there aren't any
      WriteMostly devices.

      Detailed steps to reproduce, from Norbert (which somehow didn't make it
      to lore):
      
         # setup md10 (raid1) with two drives (1 GByte sparse files)
         dd if=/dev/zero of=disk1 bs=1024k seek=1024 count=0
         dd if=/dev/zero of=disk2 bs=1024k seek=1024 count=0
      
         losetup /dev/loop11 disk1
         losetup /dev/loop12 disk2
      
         mdadm --create /dev/md10 --level=1 --raid-devices=2 /dev/loop11 /dev/loop12
      
         # add bitmap (aka write-intent log)
         mdadm /dev/md10 --grow --bitmap=internal
      
         echo check > /sys/block/md10/md/sync_action
      
         root:# cat /sys/block/md10/md/mismatch_cnt
         0
         root:#
      
         # remove member drive disk2 (loop12)
         mdadm /dev/md10 -f loop12 ; mdadm /dev/md10 -r loop12
      
         # modify degraded md device
         dd if=/dev/urandom of=/dev/md10 bs=512 count=1
      
         # no blocks recorded as out of sync on the remaining member disk1/loop11
         root:# mdadm -X /dev/loop11 | grep Bitmap
                   Bitmap : 16 bits (chunks), 0 dirty (0.0%)
         root:#
      
         # re-add disk2, nothing synced because of empty bitmap
         mdadm /dev/md10 --re-add /dev/loop12
      
         # check integrity again
         echo check > /sys/block/md10/md/sync_action
      
         # disk1 and disk2 are no longer in sync, reads return different data
         root:# cat /sys/block/md10/md/mismatch_cnt
         128
         root:#
      
         # clean up
         mdadm -S /dev/md10
         losetup -d /dev/loop11
         losetup -d /dev/loop12
         rm disk1 disk2
      
      Fix this by moving the WriteMostly check to the if condition for
      alloc_behind_master_bio().
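
      In condensed form, the fixed flow looks roughly like this (a simplified
      sketch of the first_clone block in raid1_write_request(), not the
      literal hunk; write_behind is the flag introduced by [1] that records
      whether any WriteMostly device is present):

        /* Sketch: only the write-behind allocation depends on WriteMostly;
         * the bitmap write-intent update happens for every first clone. */
        if (first_clone) {
                if (bitmap && write_behind &&
                    atomic_read(&bitmap->behind_writes) <
                            mddev->bitmap_info.max_write_behind &&
                    !waitqueue_active(&bitmap->behind_wait))
                        alloc_behind_master_bio(r1_bio, bio);

                md_bitmap_startwrite(bitmap, r1_bio->sector, r1_bio->sectors,
                                     test_bit(R1BIO_BehindIO, &r1_bio->state));
                first_clone = 0;
        }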
      
      [1] commit fd3b6975 ("md/raid1: only allocate write behind bio for WriteMostly device")
      Fixes: fd3b6975 ("md/raid1: only allocate write behind bio for WriteMostly device")
      Cc: stable@vger.kernel.org # v5.12+
      Cc: Guoqing Jiang <guoqing.jiang@linux.dev>
      Cc: Jens Axboe <axboe@kernel.dk>
      Reported-by: Norbert Warmuth <nwarmuth@t-online.de>
      Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Song Liu <song@kernel.org>