Commits · 312c89fbca06896cb25a0daf4fa5f44c29bbb1b1 · openanolis / cloud-kernel

22 Jan, 2018 40 commits

btrfs: cleanup btrfs_mount() using btrfs_mount_root() · 312c89fb

Committed by Misono, Tomohiro on Dec 14, 2017

Clean up btrfs_mount() by using btrfs_mount_root(). This avoids calling
btrfs_mount() twice in the mount path.

The old btrfs_mount() does the following:
0. The VFS layer calls vfs_kern_mount() with the registered file_system_type
   (for btrfs, btrfs_fs_type). btrfs_mount() is called on the way.
1. btrfs_parse_early_options() parses the "subvolid=" mount option and sets
   subvol_objectid to its value. Otherwise, subvol_objectid keeps its
   initial value of 0.
2. Check whether subvol_objectid is 5. Assume this time the id is not 5;
   then btrfs_mount() returns by calling mount_subvol().
3. In mount_subvol(), the original mount options are modified to contain
   "subvolid=0" in setup_root_args(). Then vfs_kern_mount() is called with
   btrfs_fs_type and the new options.
4. btrfs_mount() is called again.
5. btrfs_parse_early_options() parses "subvolid=0" and sets subvol_objectid
   to 5 (instead of 0).
6. Check whether subvol_objectid is 5. This time the id is 5 and
   mount_subvol() is not called. btrfs_mount() finishes mounting a root.
7. (In mount_subvol()) using the return value of vfs_kern_mount(), it
   calls mount_subtree().
8. Return the subvolume's dentry.

Reusing the same file_system_type (and btrfs_mount()) for vfs_kern_mount()
is the cause of the complication.

Instead, the new btrfs_mount() does the following:
1. Parse the subvol id related options for later use in mount_subvol().
2. Mount the device's root by calling vfs_kern_mount() with
   btrfs_root_fs_type, which is not registered with the VFS by
   register_filesystem(). As a result, btrfs_mount_root() is called.
3. Return by calling mount_subvol().

The code of step 2 is moved from the first part of mount_subvol().

The device holder changes from btrfs_fs_type to btrfs_root_fs_type, and
the new holder has to be used in all contexts. Otherwise we would get
wrong results, because mount and dev scan would not check the same
thing. (This was found independently and the fix is folded into this
patch.)
Signed-off-by: Tomohiro Misono <misono.tomohiro@jp.fujitsu.com>
Reviewed-by: David Sterba <dsterba@suse.com>
[ fold the btrfs_control_ioctl fixup, extend the comment ]
Signed-off-by: David Sterba <dsterba@suse.com>
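
The new three-step flow described in this commit can be modelled in userspace roughly as follows. This is only a sketch, not the kernel code: every model_* name is hypothetical, model_kern_mount() stands in for vfs_kern_mount(), and the final step merely records the subvolume id where the real code would call mount_subvol().

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for struct file_system_type: a name plus a mount hook. */
struct model_fs_type {
	const char *name;
	int (*mount)(const char *opts);
};

static int outer_mount_calls;  /* calls into the registered type's hook */
static int root_mount_calls;   /* calls into the internal root type's hook */
static unsigned long long mounted_subvolid;

/* Stand-in for vfs_kern_mount(): dispatch to the type's mount hook. */
static int model_kern_mount(const struct model_fs_type *type, const char *opts)
{
	return type->mount(opts);
}

/* Stand-in for btrfs_mount_root(): mounts the device root, no subvolume logic. */
static int model_mount_root(const char *opts)
{
	(void)opts;
	root_mount_calls++;
	return 0;
}

/* Never registered; only used internally, like btrfs_root_fs_type. */
static const struct model_fs_type model_root_fs_type = {
	.name = "btrfs",
	.mount = model_mount_root,
};

/* Stand-in for the new btrfs_mount(). */
static int model_mount(const char *opts)
{
	unsigned long long subvolid = 0;
	const char *p = opts ? strstr(opts, "subvolid=") : NULL;
	int ret;

	outer_mount_calls++;

	/* Step 1: parse the subvolume option for later use. */
	if (p)
		sscanf(p, "subvolid=%llu", &subvolid);

	/* Step 2: mount the device root through the internal type. */
	ret = model_kern_mount(&model_root_fs_type, opts);
	if (ret)
		return ret;

	/* Step 3: placeholder for mount_subvol(); only records the target. */
	mounted_subvolid = subvolid;
	return 0;
}
```

Because the inner mount goes through a separate type, the outer hook runs exactly once per mount request in this model, which is the property the commit is after.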

btrfs: add btrfs_mount_root() and new file_system_type · 72fa39f5

Committed by Misono, Tomohiro on Dec 14, 2017

Add btrfs_mount_root() and a new file_system_type in preparation for the
cleanup of btrfs_mount(). The code path is not changed yet.

btrfs_mount_root() is almost the same as the current btrfs_mount(), but
doesn't have the subvolume-related part.
Signed-off-by: Tomohiro Misono <misono.tomohiro@jp.fujitsu.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>

btrfs: unify extent_page_data type passed as void · aab6e9ed

Committed by David Sterba on Nov 30, 2017

Functions called from extent_write_cache_pages used void* as generic
callback data, but all of them convert it to extent_page_data, or use it
directly.
Signed-off-by: David Sterba <dsterba@suse.com>

btrfs: sink writepage parameter to extent_write_cache_pages · 935db853

Committed by David Sterba on Jun 23, 2017

The function extent_write_cache_pages is modelled after
write_cache_pages, which is a generic interface, and the writepage
parameter makes sense there. In btrfs we know exactly which callback
we're going to use, so we can pass it directly.
Signed-off-by: David Sterba <dsterba@suse.com>

btrfs: sink flush_fn to extent_write_cache_pages · 25b860e0

Committed by David Sterba on Jun 23, 2017

All callers pass the same value, flush_write_bio.
Signed-off-by: David Sterba <dsterba@suse.com>

btrfs: merge two flush_write_bio helpers · e2932ee0

Committed by David Sterba on Jun 23, 2017

flush_epd_write_bio is the same as flush_write_bio, so there is no point
in having two such functions. Merge them into flush_write_bio. The
'noinline' attribute is removed as it does not have any meaning.
Signed-off-by: David Sterba <dsterba@suse.com>

btrfs: Rename bin_search -> btrfs_bin_search · a74b35ec

Committed by Nikolay Borisov on Dec 08, 2017

Currently there are two functions doing binary search on btrfs nodes:
bin_search and btrfs_bin_search, the latter being a simple wrapper for
the former. So eliminate the wrapper and just rename bin_search to
btrfs_bin_search. No functional changes.
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>

btrfs: sink extent_write_full_page tree argument · 0a9b0e53

Committed by Nikolay Borisov on Dec 08, 2017

The tree argument passed to extent_write_full_page is referenced from
the page being passed to the same function. Since we already have
enough information to get the reference, remove the function parameter.
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>

btrfs: sink extent_write_locked_range tree parameter · 5e3ee236

Committed by Nikolay Borisov on Dec 08, 2017

This function is called only from submit_compressed_extents and the
io tree being passed is always that of the inode. But we are also
passing the inode, so just move getting the io tree pointer into
extent_write_locked_range to simplify the signature.
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>

btrfs: Remove pair of bio_get/put in btrfs_schedule_bio · 3e798068

Committed by Nikolay Borisov on Dec 11, 2017

This code was added in 492bb6de ("Btrfs: Hold a reference on bios
during submit_bio, add some extra bio checks"). However, holding a
reference on a bio is necessary only if it's going to be referenced
after submit_bio returns and the bio is completed. In this particular
instance that is not the case, so there is no need to hold an extra
reference since we return directly.
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: Liu Bo <bo.li.liu@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.com>

btrfs: Fix out of bounds access in btrfs_search_slot · 9ea2c7c9

Committed by Nikolay Borisov on Dec 12, 2017

When modifying a tree whose root is at BTRFS_MAX_LEVEL - 1, the level
variable is going to be 7 (this is the max height of the tree). On the
other hand, btrfs_cow_block is always called with "level + 1" as an index
into the nodes and slots arrays. This leads to an out of bounds access.
Admittedly this will be benign, since an OOB access of the nodes array
will likely read the 0th element from the slots array, which in this case
is going to be 0 (since we start CoW at the top of the tree). The OOB
access into the slots array in turn will read the 0th and 1st values of
the locks array, which would both be 0 at the time. However, this benign
behavior relies on the fact that the path being passed hasn't been
initialised. If it has already been used to query a btree, then it could
potentially have populated the nodes/slots arrays.

Fix it by explicitly checking if we are at level 7 (the maximum allowed
index in nodes/slots arrays) and explicitly calling the CoW routine with
NULL for the parent's node/slot.
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Fixes-coverity-id: 711515
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
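
The bounds check described above might look like the following in isolation. This is a simplified userspace sketch under assumptions: struct model_path and model_parent_of() are hypothetical, and MODEL_MAX_LEVEL mirrors the value of 8 implied by the commit message (valid indexes 0 to 7). It is not the actual btrfs_search_slot() change.

```c
#include <stddef.h>

#define MODEL_MAX_LEVEL 8 /* assumed to mirror BTRFS_MAX_LEVEL: indexes 0..7 */

/* Hypothetical, reduced version of a path: one node and slot per level. */
struct model_path {
	void *nodes[MODEL_MAX_LEVEL];
	int slots[MODEL_MAX_LEVEL];
};

/*
 * Fetch the parent node and slot for a block at `level` without ever
 * indexing past the end of the arrays. At the top level there is no
 * parent, so NULL and slot 0 are returned instead of reading level + 1.
 */
static void model_parent_of(const struct model_path *p, int level,
			    void **parent, int *parent_slot)
{
	if (level + 1 < MODEL_MAX_LEVEL) {
		*parent = p->nodes[level + 1];
		*parent_slot = p->slots[level + 1];
	} else {
		*parent = NULL;
		*parent_slot = 0;
	}
}
```

The point of the explicit branch is that correctness no longer depends on what happens to sit in memory after the arrays.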

btrfs: remove duplicate includes · 87c46ec7

Committed by Pravin Shedge on Dec 06, 2017

These duplicate includes were found with scripts/checkincludes.pl but
were removed manually to avoid removing false positives.
Signed-off-by: Pravin Shedge <pravin.shedge4linux@gmail.com>
Signed-off-by: David Sterba <dsterba@suse.com>

btrfs: Handle btrfs_set_extent_delalloc failure in fixup worker · f3038ee3

Committed by Nikolay Borisov on Dec 05, 2017

This function was introduced by 247e743c ("Btrfs: Use async helpers
to deal with pages that have been improperly dirtied") and it didn't do
any error handling then. This function might very well fail in an ENOMEM
situation, yet the failure is not handled, which could lead to
inconsistent state. So let's handle the failure by setting the mapping
error bit.

Cc: stable@vger.kernel.org
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>

btrfs: put btrfs_ioctl_vol_args_v2 related defines together · ad8bc4d0

Committed by Anand Jain on Dec 06, 2017

Just a spatial rearrangement of the code, no functional change.
Signed-off-by: Anand Jain <anand.jain@oracle.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>

btrfs: show options: use helper to convert compression type string · 0f628c63

Committed by David Sterba on Oct 31, 2017

Use the helper. If the COMPRESS option is set, the result is always
defined and not empty.
Signed-off-by: David Sterba <dsterba@suse.com>

btrfs: prop: use common helper for type to string conversion · 802a5c69

Committed by David Sterba on Oct 31, 2017

Use the helper for conversion, keep the semantics.
Signed-off-by: David Sterba <dsterba@suse.com>

btrfs: SETFLAGS ioctl: use helper for compression type conversion · 93370509

Committed by David Sterba on Oct 31, 2017

Signed-off-by: David Sterba <dsterba@suse.com>

btrfs: compression: add helper for type to string conversion · e128f9c3

Committed by David Sterba on Oct 31, 2017

There are several places open-coding this conversion. Add a helper now
that we have 3 compression algorithms.
Signed-off-by: David Sterba <dsterba@suse.com>
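
A type-to-string helper of the kind this commit describes could be sketched as below. The enum and function names are hypothetical placeholders, not the identifiers used in the patch, and returning NULL for "no compression" is an assumption made for the illustration.

```c
#include <stddef.h>

/* Hypothetical compression type ids; the real values live in the btrfs headers. */
enum model_compress_type {
	MODEL_COMPRESS_NONE,
	MODEL_COMPRESS_ZLIB,
	MODEL_COMPRESS_LZO,
	MODEL_COMPRESS_ZSTD,
};

/*
 * Map a compression type to its name in one place, so callers stop
 * open-coding the same if/else chains. Returns NULL when there is no
 * algorithm name to report (an assumption of this sketch).
 */
static const char *model_compress_type2str(enum model_compress_type type)
{
	switch (type) {
	case MODEL_COMPRESS_ZLIB:
		return "zlib";
	case MODEL_COMPRESS_LZO:
		return "lzo";
	case MODEL_COMPRESS_ZSTD:
		return "zstd";
	case MODEL_COMPRESS_NONE:
		break;
	}
	return NULL;
}
```

Listing every enumerator in the switch without a default label lets the compiler warn when a fourth algorithm is added but the helper is not updated.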

btrfs: remove redundant check in btrfs_get_extent_fiemap · bf8d32b9

Committed by Nikolay Borisov on Dec 01, 2017

Before returning hole_em in btrfs_get_extent_fiemap we check whether it
is non-NULL. However, by the time this NULL check is reached we already
know hole_em is not NULL, because it points to the em we found and it
has already been dereferenced.
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>

btrfs: Remove unused variable in btrfs_get_extent · 5c9a702e

Committed by Nikolay Borisov on Dec 01, 2017

trans was statically assigned to NULL and this never changed over the
course of btrfs_get_extent. So remove any code which checks whether
trans != NULL and just hardcode the fact that trans is always NULL.

Resolves-coverity-id: 112806
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>

btrfs: tree-checker: use %zu format string for size_t · 7cfad652

Committed by Arnd Bergmann on Dec 06, 2017

The return value of sizeof() is of type size_t, so we must print it
using the %z format modifier rather than %l to avoid this warning
on some architectures:

fs/btrfs/tree-checker.c: In function 'check_dir_item':
fs/btrfs/tree-checker.c:273:50: error: format '%lu' expects argument of type 'long unsigned int', but argument 5 has type 'u32' {aka 'unsigned int'} [-Werror=format=]

Fixes: 005887f2e3e0 ("btrfs: tree-checker: Add checker for dir item")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
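
For illustration, the portable way to print a sizeof() result in plain C looks like this. The struct and function below are hypothetical and exist only for the example; they are unrelated to the tree-checker code.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical 30-byte item, used only to have something to take sizeof() of. */
struct model_dir_item {
	unsigned char raw[30];
};

/*
 * Format the item size into buf. sizeof() yields size_t, whose width
 * differs between 32-bit and 64-bit targets, so %zu is the conversion
 * that matches on every architecture (%lu only matches where size_t
 * happens to be unsigned long).
 */
static int model_format_size(char *buf, size_t buflen)
{
	return snprintf(buf, buflen, "item size %zu",
			sizeof(struct model_dir_item));
}
```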

Btrfs: use struct completion in scrub_submit_raid56_bio_wait · b4ff5ad7

Committed by Liu Bo on Nov 30, 2017

Switch to using struct completion directly and remove 'struct
scrub_bio_ret' along with the code using it.

This struct is used to get the return value from the bio, but the caller
can read the return value from the bio directly. The caller holds a
reference on the bio, so it won't go away underneath us, and the struct
can be removed safely.
Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>

Btrfs: remove unused variable wait in lock_stripe_add · c9f540fa

Committed by Liu Bo on Dec 04, 2017

The defined wait is not used anywhere.
Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>

Btrfs: compress_file_range() change page dirty status once · e9679de3

Committed by Timofey Titovets on Oct 24, 2017

We need to call extent_range_clear_dirty_for_io() on the compression
range to prevent applications from changing page content while the
pages are being compressed.

extent_range_clear_dirty_for_io() runs on each loop iteration, and
"(end - start)" can be much (up to 1024 times) bigger than the
compression range (BTRFS_MAX_UNCOMPRESSED).

The start pointer is advanced each time we manage to compress part of
the range. The end pointer does not change so we could redirty the
remaining parts repeatedly.

Fix that behaviour by calling extent_range_clear_dirty_for_io()
only once, the first time it happens.

This is the safest but probably not the best behaviour. Previous
iterations of the patch tried to redirty only the range that we were not
able to compress. This has been refused by David for safety reasons: the
writeout callchain is complex and there could be some path that relies
on redirtying the entire unwritten range.
Signed-off-by: Timofey Titovets <nefelim4ag@gmail.com>
Reviewed-by: David Sterba <dsterba@suse.com>
[ enhance changelog, the history and safety concerns, add comment ]
Signed-off-by: David Sterba <dsterba@suse.com>

Btrfs: compression heuristic: replace heap sort with radix sort · 440c840c

Committed by Timofey Titovets on Dec 04, 2017

The slowest part of the heuristic for now is the kernel heap sort().
It can take up to 55% of the runtime sorting bucket items.

As sorting will always be called on most data sets to get the correct
byte_core_set_size, the only way to speed up the heuristic is to
speed up the sort on the bucket.

Add a general radix_sort function.
Radix sort requires 2 buffers, one the full size of the input array
and one to store counters (jump addresses).

That increases usage per heuristic workspace by 1KiB:
8KiB + 1KiB -> 8KiB + 2KiB

This is an LSD radix sort. I use 4 bits as the base for calculating,
to make the counters array acceptably small (16 elements * 8 bytes).

This radix sort implementation has several points to adjust.
I added them to make radix sort generally usable in the kernel,
like heap sort, if needed.

Performance tested in a userspace copy of the heuristic code,
throughput:
    - average <-> random data: ~3500 MiB/s - heap  sort
    - average <-> random data: ~6000 MiB/s - radix sort
Signed-off-by: Timofey Titovets <nefelim4ag@gmail.com>
[ coding style fixes ]
Signed-off-by: David Sterba <dsterba@suse.com>
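
To make the technique concrete, here is a minimal userspace sketch of an LSD radix sort with a 4-bit base, using the two buffers the message mentions (a full-size scratch array and a 16-entry counter array). It sorts plain 32-bit keys in ascending order; the function name and the key type are assumptions for the example, and this is not the radix_sort() added by the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define RADIX_BITS 4
#define RADIX_SIZE (1u << RADIX_BITS) /* 16 counters per pass */
#define RADIX_MASK (RADIX_SIZE - 1)

/*
 * Sort a[0..n) ascending. tmp must have room for n elements.
 * Each pass distributes the keys by one 4-bit digit, least significant
 * digit first. The distribution is stable, which is what makes the
 * digit-by-digit approach produce a fully sorted result.
 */
static void model_radix_sort(uint32_t *a, uint32_t *tmp, size_t n)
{
	uint32_t *src = a;
	uint32_t *dst = tmp;

	assert(a != NULL && tmp != NULL);

	for (unsigned int shift = 0; shift < 32; shift += RADIX_BITS) {
		size_t count[RADIX_SIZE] = { 0 };
		size_t pos = 0;
		uint32_t *swap;

		/* Histogram of the current digit. */
		for (size_t i = 0; i < n; i++)
			count[(src[i] >> shift) & RADIX_MASK]++;

		/* Turn counts into start offsets (the "jump addresses"). */
		for (size_t d = 0; d < RADIX_SIZE; d++) {
			size_t c = count[d];

			count[d] = pos;
			pos += c;
		}

		/* Stable scatter into the other buffer. */
		for (size_t i = 0; i < n; i++)
			dst[count[(src[i] >> shift) & RADIX_MASK]++] = src[i];

		swap = src;
		src = dst;
		dst = swap;
	}
	/* 32 / 4 = 8 passes, an even number, so the result ends up in a. */
}
```

With 4-bit digits the counter array stays tiny, at the cost of eight passes over 32-bit keys; a wider digit would trade fewer passes for a larger counter array.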

btrfs: cleanup device states define BTRFS_DEV_STATE_FLUSH_SENT · 1c3063b6

Committed by Anand Jain on Dec 04, 2017

Currently device state is being managed by each individual int
variable such as struct btrfs_device::flush_bio_sent.
Instead of that declare btrfs_device::dev_state
BTRFS_DEV_STATE_FLUSH_SENT and use the bit operations.
Signed-off-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.com>

btrfs: cleanup device states define BTRFS_DEV_STATE_REPLACE_TGT · 401e29c1

Committed by Anand Jain on Dec 04, 2017

Currently device state is being managed by each individual int
variable such as struct btrfs_device::is_tgtdev_for_dev_replace.
Instead of that declare btrfs_device::dev_state
BTRFS_DEV_STATE_REPLACE_TGT and use the bit operations.
Signed-off-by: Anand Jain <anand.jain@oracle.com>
[ whitespace adjustments ]
Signed-off-by: David Sterba <dsterba@suse.com>

btrfs: cleanup device states define BTRFS_DEV_STATE_MISSING · e6e674bd

Committed by Anand Jain on Dec 04, 2017

Currently device state is being managed by each individual int
variable such as struct btrfs_device::missing. Instead of that
declare btrfs_device::dev_state BTRFS_DEV_STATE_MISSING and use
the bit operations.
Signed-off-by: Anand Jain <anand.jain@oracle.com>
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
[ whitespace adjustments ]
Signed-off-by: David Sterba <dsterba@suse.com>

btrfs: cleanup device states define BTRFS_DEV_STATE_IN_FS_METADATA · e12c9621

Committed by Anand Jain on Dec 04, 2017

Currently device state is being managed by each individual int
variable such as struct btrfs_device::in_fs_metadata. Instead of
that declare device state BTRFS_DEV_STATE_IN_FS_METADATA and use
the bit operations.
Signed-off-by: Anand Jain <anand.jain@oracle.com>
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
[ whitespace adjustments ]
Signed-off-by: David Sterba <dsterba@suse.com>

btrfs: cleanup device states define BTRFS_DEV_STATE_WRITEABLE · ebbede42

Committed by Anand Jain on Dec 04, 2017

Currently device state is being managed by each individual int
variable such as struct btrfs_device::writeable. Instead of that
declare device state BTRFS_DEV_STATE_WRITEABLE and use the
bit operations.
Signed-off-by: Anand Jain <anand.jain@oracle.com>
[ whitespace adjustments ]
Signed-off-by: David Sterba <dsterba@suse.com>
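
The pattern shared by this series of device state commits, replacing one int per flag with bits in a single dev_state word, can be sketched in userspace as follows. All model_* names are hypothetical and the bit numbers are assumptions for the illustration. The helpers here are plain, non-atomic operations, unlike the kernel's set_bit(), clear_bit() and test_bit().

```c
#include <stdbool.h>

/* Assumed bit numbers, one per former int member. */
enum {
	MODEL_DEV_STATE_WRITEABLE = 0,
	MODEL_DEV_STATE_IN_FS_METADATA = 1,
	MODEL_DEV_STATE_MISSING = 2,
	MODEL_DEV_STATE_REPLACE_TGT = 3,
	MODEL_DEV_STATE_FLUSH_SENT = 4,
};

/* Hypothetical reduced device: all boolean state packed into one word. */
struct model_device {
	unsigned long dev_state;
};

/* Non-atomic stand-ins for the kernel bit operations. */
static void model_set_bit(int nr, unsigned long *addr)
{
	*addr |= 1UL << nr;
}

static void model_clear_bit(int nr, unsigned long *addr)
{
	*addr &= ~(1UL << nr);
}

static bool model_test_bit(int nr, const unsigned long *addr)
{
	return (*addr >> nr) & 1UL;
}
```

A former assignment such as "device->missing = 1" then becomes a set-bit call on the state word, and a test of the int becomes a test-bit call, so five ints collapse into one unsigned long.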

btrfs: add helper for device path or missing · 3c958bd2

Committed by Anand Jain on Nov 28, 2017

This patch creates a helper function to get either the rcu device path
or missing.
Signed-off-by: Anand Jain <anand.jain@oracle.com>
[ rename to btrfs_dev_name, switch to if/else ]
Signed-off-by: David Sterba <dsterba@suse.com>

btrfs: drop btrfs_device::can_discard to query directly · 38b5f68e

Committed by Anand Jain on Nov 29, 2017

We can query the bdev directly when needed at btrfs_discard_extent(),
so drop btrfs_device::can_discard.
Signed-off-by: Anand Jain <anand.jain@oracle.com>
Suggested-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>

btrfs: make function update_share_count static · ccc8dc75

Committed by Colin Ian King on Nov 30, 2017

The function update_share_count is local to the source and does
not need to be in global scope, so make it static.

Cleans up sparse warning:
fs/btrfs/backref.c:219:6: warning: symbol 'update_share_count' was not
declared. Should it be static?
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>

btrfs: Remove redundant FLAG_VACANCY · 4a2d25cd

Committed by Nikolay Borisov on Nov 23, 2017

Commit 9036c102 ("Btrfs: update hole handling v2") added the
FLAG_VACANCY to denote holes. However, there was already a consistent
way of flagging extents that represent a hole: ->block_start =
EXTENT_MAP_HOLE. The only place where this flag is checked is the
fiemap code, where the block_start value is also checked, and every
other place in the filesystem detects holes by using the block_start
value. So remove the extra flag. This survived a full xfstests run.
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>

btrfs: extent-tree: Make btrfs_inode_rsv_refill function static · 3f2dd7a0

Committed by Qu Wenruo on Nov 17, 2017

This function is no longer used outside of extent-tree.c.
Make it static.
Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>

btrfs: move some zstd work data from stack to workspace · 431e9822

Committed by David Sterba on Nov 15, 2017

* ZSTD_inBuffer in_buf
* ZSTD_outBuffer out_buf

are used in all functions to pass the compression parameters and the
local variables consume some space. We can move them to the workspace
and reduce the stack consumption:

zstd.c:zstd_decompress                        -24 (136 -> 112)
zstd.c:zstd_decompress_bio                    -24 (144 -> 120)
zstd.c:zstd_compress_pages                    -24 (264 -> 240)
Signed-off-by: David Sterba <dsterba@suse.com>
Reviewed-by: Nick Terrell <terrelln@fb.com>
Signed-off-by: David Sterba <dsterba@suse.com>

btrfs: reorder btrfs_transaction members for better packing · 5302e089

Committed by David Sterba on Nov 08, 2017

There are now 20 bytes of holes; we can reduce that to 4 with minor
changes. Moving 'aborted' to the status and flags is also more logical,
and similarly for num_dirty_bgs. The size goes from 432 to 416.
Reviewed-by: Liu Bo <bo.li.liu@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.com>
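
How member order creates and removes holes can be seen with two toy structs. These are hypothetical and unrelated to the real btrfs_transaction layout; exact sizes depend on the target ABI, so the example compares the two layouts against each other without assuming specific byte counts.

```c
#include <stddef.h>
#include <stdint.h>

/* Small members interleaved with 8-byte ones: padding after a and after c. */
struct model_loose {
	uint8_t a;
	uint64_t b;
	uint8_t c;
	uint64_t d;
};

/* Same members, large ones first: the small ones share one tail slot. */
struct model_tight {
	uint64_t b;
	uint64_t d;
	uint8_t a;
	uint8_t c;
};
```

In the tight layout the two one-byte members sit next to each other (offsetof c is offsetof a plus 1), so only one run of tail padding remains instead of two interior holes. On kernel structures the same information is usually read from the pahole tool's output.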

btrfs: use narrower type for btrfs_transaction::num_dirty_bgs · 165c8b02

Committed by David Sterba on Nov 08, 2017

A u64 is overkill here; we could not possibly create that many
blockgroups in one transaction.
Signed-off-by: David Sterba <dsterba@suse.com>

btrfs: reorder btrfs_trans_handle members for better packing · 1ca4bb63

Committed by David Sterba on Nov 08, 2017

Recent updates to the structure left some holes. Reorder the types so
the packing is tight. The size goes from 112 to 104 on 64bit.
Signed-off-by: David Sterba <dsterba@suse.com>

btrfs: switch to refcount_t type for btrfs_trans_handle::use_count · b50fff81

Committed by David Sterba on Nov 08, 2017

The use_count is a reference counter, so we can use the refcount_t type,
though we don't use the atomicity. This is not performance-critical
code and we could catch the underflows. The type is changed from long,
but the number of references will fit an int.
Signed-off-by: David Sterba <dsterba@suse.com>

openanolis / cloud-kernel · synced successfully more than 1 year ago
