Commits · 948462294577a3870c407c16d89bb2314f0b0cfb · openeuler / Kernel

10 Dec, 2020 · 7 commits

btrfs: keep sb cache_generation consistent with space_cache · 94846229

Authored by Boris Burkov on Nov 18, 2020

When mounting, btrfs uses the cache_generation in the super block to
determine if space cache v1 is in use. However, by mounting with
nospace_cache or space_cache=v2, it is possible to disable space cache
v1, which does not result in un-setting cache_generation back to 0.

In order to base some logic, like mount option printing in /proc/mounts,
on the current state of the space cache rather than just the values of
the mount option, keep the value of cache_generation consistent with the
status of space cache v1.

We ensure that cache_generation > 0 iff the file system is using
space_cache v1. This requires committing a transaction on any mount
which changes whether we are using v1. (v1->nospace_cache, v1->v2,
nospace_cache->v1, v2->v1).

Since the mechanism for writing out the cache generation is transaction
commit, but we want some finer grained control over when we un-set it,
we can't just rely on the SPACE_CACHE mount option, so we introduce an
fs_info flag that mount can use when it wants to unset the generation.

Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Boris Burkov <boris@bur.io>
Signed-off-by: David Sterba <dsterba@suse.com>
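
Illustration (not part of the patch): a minimal userspace sketch of the invariant "cache_generation > 0 iff space cache v1 is in use", and of why only mounts that change the v1 status need a transaction commit. The struct, the enum, and the function names are hypothetical stand-ins, not kernel identifiers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the super block field discussed above. */
struct sb_model {
	uint64_t cache_generation;	/* > 0 iff space cache v1 is in use */
};

enum cache_mode { CACHE_NONE, CACHE_V1, CACHE_V2 };

/* Does the super block currently claim that v1 is in use? */
static bool sb_uses_v1(const struct sb_model *sb)
{
	return sb->cache_generation > 0;
}

/*
 * True when mounting with `requested` changes whether v1 is used,
 * i.e. when a commit is needed to restore the invariant.
 */
static bool mount_needs_commit(const struct sb_model *sb,
			       enum cache_mode requested)
{
	return sb_uses_v1(sb) != (requested == CACHE_V1);
}

/* Model of the commit: record the generation for v1, zero it otherwise. */
static void commit_cache_generation(struct sb_model *sb,
				    enum cache_mode mode, uint64_t transid)
{
	assert(transid > 0);
	sb->cache_generation = (mode == CACHE_V1) ? transid : 0;
}
```

In this model a v2 -> nospace_cache mount needs no commit, since neither side uses v1; that matches the four transitions listed in the message.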

btrfs: clear oneshot options on mount and remount · 8cd29088

Authored by Boris Burkov on Nov 18, 2020

Some options only apply during mount time and are cleared at the end
of mount. For now, the example is USEBACKUPROOT, but CLEAR_CACHE also
fits the bill, and this is a preparation patch for also clearing that
option.

One subtlety is that the current code only resets USEBACKUPROOT on rw
mounts, but the option is meaningfully "consumed" by a ro mount, so it
feels appropriate to clear in that case as well. A subsequent read-write
remount would not go through open_ctree, which is the only place that
checks the option, so the change should be benign.

Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Boris Burkov <boris@bur.io>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
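
Illustration (not part of the patch): a small sketch of clearing one-shot options from a mount option bitmask once mount or remount has consumed them. The bit values and the function name are assumptions made for this example; only USEBACKUPROOT is treated as one-shot here, as in the message, with CLEAR_CACHE shown as the candidate a later patch would add.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative bit positions only; these are not the kernel's values. */
#define OPT_USEBACKUPROOT	(1u << 0)	/* one-shot: consumed by mount */
#define OPT_CLEAR_CACHE		(1u << 1)	/* candidate one-shot, not cleared yet */
#define OPT_NODATACOW		(1u << 2)	/* persistent option, for contrast */

/* Options that are dropped once mount or remount has consumed them. */
#define ONESHOT_OPTS		(OPT_USEBACKUPROOT)

/* Return the option mask that should survive the end of mount/remount. */
static uint32_t clear_oneshot_options(uint32_t mount_opt)
{
	uint32_t kept = mount_opt & ~ONESHOT_OPTS;

	/* Post-condition: no one-shot bit is left set. */
	assert((kept & ONESHOT_OPTS) == 0);
	return kept;
}
```

Keeping the one-shot set in a single mask means a follow-up change only has to extend ONESHOT_OPTS.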

btrfs: lift read-write mount setup from mount and remount · 44c0ca21

Authored by Boris Burkov on Nov 18, 2020

Mounting rw and remounting from ro to rw naturally share invariants and
functionality which result in a correctly set up rw filesystem. Luckily,
there is even a strong unity in the code which implements them. In
mount's open_ctree, these operations mostly happen after an early return
for ro file systems, and in remount, they happen in a section devoted to
remounting ro->rw, after some remount specific validation passes.

However, there are unfortunately a few differences. There are small
deviations in the order of some of the operations, remount does not
start orphan cleanup in root_tree or fs_tree, remount does not create
the free space tree, and remount does not handle "one-shot" mount
options like clear_cache and uuid tree rescan.

Since we want to add building the free space tree to remount, and also
to start the same orphan cleanup process on a filesystem mounted as ro
then remounted rw, we would benefit from unifying the logic between the
two code paths.

This patch only lifts the existing common functionality, and leaves a
natural path for fixing the discrepancies.

Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Boris Burkov <boris@bur.io>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>

btrfs: remove inode number cache feature · 5297199a

Authored by Nikolay Borisov on Nov 26, 2020

It's been deprecated since commit b547a88e ("btrfs: start
deprecation of mount option inode_cache") which enumerates the reasons.

A filesystem that uses the feature (mount -o inode_cache) tracks the
inode numbers in bitmaps; that data stays on the filesystem after this
patch. The size is roughly 5MiB for 1M inodes [1], which is considered
small enough to be left there. Removal of the change can be implemented
in btrfs-progs if needed.

[1] https://lore.kernel.org/linux-btrfs/20201127145836.GZ6430@twin.jikos.cz/

Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
[ update changelog ]
Signed-off-by: David Sterba <dsterba@suse.com>

btrfs: disallow space_cache in ZONED mode · 5d1ab66c

Authored by Naohiro Aota on Nov 10, 2020

As updates to the space cache v1 are in-place, the space cache cannot be
located over sequential zones and there are no guarantees that the device
will have enough conventional zones to store this cache. Resolve this
problem by completely disabling space cache v1. This does not
introduce any problems with sequential block groups: all the free space
is located after the allocation pointer and no free space before the
pointer.  There is no need to have such cache.

Note: we can technically use free-space-tree (space cache v2) on ZONED
mode. But, since ZONED mode now always allocates extents in a block
group sequentially regardless of underlying device zone type, it's no
use to enable and maintain the tree.

For the same reason, NODATACOW is also disabled.

In summary, ZONED will disable:

| Disabled features | Reason                                              |
|-------------------+-----------------------------------------------------|
| RAID/DUP          | Cannot handle two zone append writes to different   |
|                   | zones                                               |
|-------------------+-----------------------------------------------------|
| space_cache (v1)  | In-place updating                                   |
| NODATACOW         | In-place updating                                   |
|-------------------+-----------------------------------------------------|
| fallocate         | Reserved extent will be a write hole                |
|-------------------+-----------------------------------------------------|
| MIXED_BG          | Allocated metadata region will be write holes for   |
|                   | data writes                                         |

Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>

btrfs: check and enable ZONED mode · b70f5097

Authored by Naohiro Aota on Nov 10, 2020

Introduce function btrfs_check_zoned_mode() to check if ZONED flag is
enabled on the file system and if the file system consists of zoned
devices with equal zone size.

Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
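
Illustration (not part of the patch): a rough sketch of the "zoned devices with equal zone size" part of the check. It is not the body of btrfs_check_zoned_mode(); the struct and function name are hypothetical, a zone size of 0 is used here to mean "not zoned", and how the ZONED flag itself is compared against the result is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-device summary; zone_size == 0 stands for "not zoned". */
struct dev_model {
	uint64_t zone_size;	/* bytes */
};

/* True when every device is zoned and all report the same zone size. */
static bool zoned_devices_consistent(const struct dev_model *devs, size_t n)
{
	assert(devs != NULL || n == 0);

	if (n == 0)
		return false;

	for (size_t i = 0; i < n; i++) {
		if (devs[i].zone_size == 0 ||
		    devs[i].zone_size != devs[0].zone_size)
			return false;
	}
	return true;
}
```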

btrfs: get zone information of zoned block devices · 5b316468

Authored by Naohiro Aota on Nov 10, 2020

If a zoned block device is found, get its zone information (number of
zones and zone size). To avoid costly run-time zone report
commands to test the device zone type during block allocation, attach
the seq_zones bitmap to the device structure to indicate if a zone is
sequential or accepts random writes. Also attach the empty_zones
bitmap to indicate if a zone is empty or not.

This patch also introduces the helper functions btrfs_dev_is_sequential()
to test if the zone storing a block is a sequential write required zone
and btrfs_dev_is_empty_zone() to test if the zone is an empty zone.

Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Reviewed-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
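
Illustration (not part of the patch): a userspace sketch of how per-zone bitmaps can answer "is this position in a sequential zone?" without issuing a zone report. The struct layout, the power-of-two zone size, and the unprefixed helper names are assumptions for this example, not the kernel's definitions.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stdint.h>

#define BITS_PER_WORD	(sizeof(unsigned long) * CHAR_BIT)

/* Hypothetical model of the zone information attached to a device. */
struct zone_info_model {
	unsigned int zone_size_shift;	/* log2(zone size); power of two assumed */
	uint32_t nr_zones;
	const unsigned long *seq_zones;	/* bit set: sequential write required */
	const unsigned long *empty_zones;	/* bit set: zone is empty */
};

static bool test_zone_bit(const unsigned long *map, uint32_t zno)
{
	return (map[zno / BITS_PER_WORD] >> (zno % BITS_PER_WORD)) & 1UL;
}

/* Map a byte position on the device to its zone number. */
static uint32_t pos_to_zone(const struct zone_info_model *zi, uint64_t pos)
{
	uint32_t zno = (uint32_t)(pos >> zi->zone_size_shift);

	assert(zno < zi->nr_zones);
	return zno;
}

static bool dev_is_sequential(const struct zone_info_model *zi, uint64_t pos)
{
	return test_zone_bit(zi->seq_zones, pos_to_zone(zi, pos));
}

static bool dev_is_empty_zone(const struct zone_info_model *zi, uint64_t pos)
{
	return test_zone_bit(zi->empty_zones, pos_to_zone(zi, pos));
}
```

Each lookup is a shift plus a bit test, which is the saving over a run-time zone report that the message describes.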

08 Dec, 2020 · 10 commits

btrfs: remove stub device info from messages when we have no fs_info · a0f6d924

Authored by David Sterba on Nov 13, 2020

With a NULL fs_info the helpers will print something like

	BTRFS error (device <unknown>): ...

This can happen in contexts where fs_info is not available at all or
it's potentially unsafe due to object lifetime. The <unknown> stub does
not add much information and, together with the prefix, makes the message
unnecessarily long.

Remove it for the NULL fs_info case.

	BTRFS error: ...

Callers can add the device information to the message itself if needed.

Reviewed-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.com>
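
Illustration (not part of the patch): a sketch of the resulting message format, using snprintf in userspace. The struct and function are hypothetical; the kernel's print helpers take different arguments.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-in: only the device id string matters here. */
struct fs_info_model {
	const char *dev_id;
};

/*
 * Format an error line. With a NULL fs_info the "(device ...)" part is
 * dropped entirely instead of printing a "<unknown>" stub.
 */
static int format_error(char *buf, size_t len,
			const struct fs_info_model *fs_info, const char *msg)
{
	assert(buf != NULL && msg != NULL);

	if (fs_info)
		return snprintf(buf, len, "BTRFS error (device %s): %s",
				fs_info->dev_id, msg);
	return snprintf(buf, len, "BTRFS error: %s", msg);
}
```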

btrfs: locking: rip out path->leave_spinning · b9729ce0

Authored by Josef Bacik on Aug 20, 2020

We no longer distinguish between blocking and spinning, so rip out all
this code.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>

btrfs: replace s_blocksize_bits with fs_info::sectorsize_bits · 265fdfa6

Authored by David Sterba on Jul 01, 2020

The value of super_block::s_blocksize_bits is the same as
fs_info::sectorsize_bits, but we don't need to do the extra dereferences
in many functions and storing the bits as u32 (in fs_info) generates
shorter assembly.

Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
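
Illustration (not part of the patch): a sketch of caching the sector size as a shift count in a filesystem-wide structure, so byte-to-sector conversions read one u32 and shift. The struct and function names are hypothetical, and the sector size is assumed to be a power of two.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of the two fs_info fields involved. */
struct fs_info_model {
	uint32_t sectorsize;		/* bytes, power of two */
	uint32_t sectorsize_bits;	/* log2(sectorsize), computed once */
};

/* Set the sector size and cache its log2 alongside it. */
static void fs_info_set_sectorsize(struct fs_info_model *fs,
				   uint32_t sectorsize)
{
	assert(sectorsize != 0 && (sectorsize & (sectorsize - 1)) == 0);

	fs->sectorsize = sectorsize;
	fs->sectorsize_bits = 0;
	while ((1u << fs->sectorsize_bits) < sectorsize)
		fs->sectorsize_bits++;
}

/* Convert a byte count to whole sectors with a single shift. */
static uint64_t bytes_to_sectors(const struct fs_info_model *fs,
				 uint64_t bytes)
{
	return bytes >> fs->sectorsize_bits;
}
```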

btrfs: generate lockdep keyset names at compile time · ab1405aa

Authored by David Sterba on Sep 29, 2020

The names in btrfs_lockdep_keysets are generated from a simple pattern
using snprintf but we can generate them directly with some macro magic
and remove the helpers.

Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
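
Illustration (not part of the patch): a sketch of building patterned names at compile time with string-literal concatenation and the preprocessor's stringify operator, instead of filling them in with snprintf at run time. The macro names, the "btrfs-<stem>-0<level>" pattern, and the four-level limit are assumptions for this example.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_LEVEL_MODEL	4	/* kept small for the example */

/* #level stringifies the level; adjacent literals merge at compile time. */
#define LEVEL_NAME(stem, level)	[level] = "btrfs-" stem "-0" #level

#define KEYSET_NAMES(stem) {				\
	LEVEL_NAME(stem, 0), LEVEL_NAME(stem, 1),	\
	LEVEL_NAME(stem, 2), LEVEL_NAME(stem, 3) }

struct keyset_model {
	const char *names[MAX_LEVEL_MODEL];
};

/* The whole table is constant data; no init helper runs at startup. */
static const struct keyset_model keysets[] = {
	{ .names = KEYSET_NAMES("root") },
	{ .names = KEYSET_NAMES("extent") },
};

/* Look up the generated name for a given keyset and level. */
static const char *keyset_name(size_t set, unsigned int level)
{
	assert(set < sizeof(keysets) / sizeof(keysets[0]));
	assert(level < MAX_LEVEL_MODEL);
	return keysets[set].names[level];
}
```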

btrfs: introduce mount option rescue=all · 9037d3cb