Commits · 94f450712ac9cb4e165b5115e5eb0ab10055a64b · openanolis / cloud-kernel

22 Jan, 2018 · 40 commits

Btrfs: use cached state when dirtying pages during buffered write · 94f45071

Committed by Filipe Manana on Oct 31, 2017

During a buffered IO write, we can have an extent state that we got when
we locked the range (if the range starts at an offset lower than eof), so
always pass it to btrfs_dirty_pages() so that setting the delalloc bit
in the range does not need to do a full search in the inode's io tree,
saving time and reducing the amount of time we hold the io tree's lock.

Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>

94f45071

Btrfs: add support for fallocate's zero range operation · f27451f2

Committed by Filipe Manana on Oct 25, 2017

This implements support for the zero range operation of fallocate. For now
at least it's as simple as possible while reusing most of the existing
fallocate and hole punching infrastructure.

Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>

f27451f2
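
From user space, the zero range operation is requested through fallocate(2) with the FALLOC_FL_ZERO_RANGE mode flag. The following is a minimal sketch of a caller, not part of the commit: the helper name `zero_file_range` and the write-zeros fallback for filesystems without zero range support are assumptions made for illustration.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* exposes fallocate() in <fcntl.h> on glibc */
#endif
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <linux/falloc.h> /* FALLOC_FL_ZERO_RANGE */
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: zero the byte range [offset, offset + len) of fd.
 * Tries FALLOC_FL_ZERO_RANGE first; if the filesystem reports that the
 * operation is unsupported, falls back to writing zeros explicitly.
 * Returns 0 on success, -1 on error (errno is set by the failing call).
 */
int zero_file_range(int fd, off_t offset, off_t len)
{
	char zeros[4096];

	assert(offset >= 0 && len > 0);

	if (fallocate(fd, FALLOC_FL_ZERO_RANGE, offset, len) == 0)
		return 0;
	if (errno != EOPNOTSUPP && errno != ENOSYS)
		return -1;

	/* Fallback path (an assumption of this sketch): plain zero writes. */
	memset(zeros, 0, sizeof(zeros));
	while (len > 0) {
		size_t chunk = len < (off_t)sizeof(zeros) ? (size_t)len : sizeof(zeros);
		ssize_t n = pwrite(fd, zeros, chunk, offset);

		if (n <= 0)
			return -1;
		offset += n;
		len -= n;
	}
	return 0;
}
```

Calling `zero_file_range(fd, 1024, 4096)` on a file opened read-write should leave bytes 1024 through 5119 reading back as zero while the surrounding data stays untouched.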

Btrfs: do not merge rbios if their fail stripe index are not identical · cc54ff62

Committed by Liu Bo on Dec 11, 2017

Since the fail stripe index in an rbio is used to decide which
reconstruction algorithm will be run, we cannot merge rbios if
their fail stripe indexes are different; otherwise, one of the two
reconstructions would fail.

Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>

cc54ff62
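
The merge rule described above can be sketched with a toy structure. This is an illustrative model only: `struct toy_rbio`, its field names and `toy_rbio_can_merge` are hypothetical stand-ins, not the kernel's actual definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a raid bio (rbio). */
struct toy_rbio {
	int operation; /* kind of work: rebuild, write, scrub, ... */
	int faila;     /* index of the first failed stripe, -1 if none */
	int failb;     /* index of the second failed stripe, -1 if none */
};

/*
 * Two rbios may only be merged when they perform the same operation and
 * agree on which stripes failed; otherwise a single reconstruction pass
 * could not be correct for both of them.
 */
bool toy_rbio_can_merge(const struct toy_rbio *last, const struct toy_rbio *cur)
{
	assert(last != NULL && cur != NULL);

	if (last->operation != cur->operation)
		return false;
	if (last->faila != cur->faila || last->failb != cur->failb)
		return false;
	return true;
}
```

With this model, two rebuild requests that both report stripe 1 as failed are mergeable, while a request reporting stripe 1 and another reporting stripe 2 are kept apart.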

Btrfs: remove redundant check in rbio_can_merge · db34be19

Committed by Liu Bo on Dec 04, 2017

Given the above
'
if (last->operation != cur->operation)
	return 0;
',
it's guaranteed that the two operations are the same.

Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>

db34be19

btrfs: minor style cleanups in btrfs_scan_one_device · 05a5c55d

Committed by Anand Jain on Dec 15, 2017

Assign ret = -EINVAL where it is actually required.
Remove { } around single line if else code.

Signed-off-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.com>

05a5c55d

btrfs: simplify mutex unlocking code in btrfs_commit_transaction · c1f32b7c

Committed by Anand Jain on Dec 20, 2017

No functional change, just rearrange the mutex_unlock calls.

Signed-off-by: Anand Jain <anand.jain@oracle.com>
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
[ edit subject ]
Signed-off-by: David Sterba <dsterba@suse.com>

c1f32b7c

btrfs: rename btrfs_device::scrub_device to scrub_ctx · cadbc0a0

Committed by Anand Jain on Jan 03, 2018

btrfs_device::scrub_device is not a device which is being scrubbed,
but it holds the scrub context, so rename to reflect the same. No
functional changes here.

Signed-off-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.com>

cadbc0a0

btrfs: collapse btrfs_handle_error() into __btrfs_handle_fs_error() · 922ea899

Committed by Anand Jain on Jan 04, 2018

There is no consumer of btrfs_handle_error() other than
__btrfs_handle_fs_error(); furthermore, this function is quite small.
Merge it into its parent.

Signed-off-by: Anand Jain <anand.jain@oracle.com>
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
[ reformat comment ]
Signed-off-by: David Sterba <dsterba@suse.com>

922ea899

btrfs: remove check for BTRFS_FS_STATE_ERROR which we just set · 61ecda68

Committed by Anand Jain on Jan 04, 2018

__btrfs_handle_fs_error() sets BTRFS_FS_STATE_ERROR and then calls
btrfs_handle_error(), so there is no need to check whether
BTRFS_FS_STATE_ERROR is set in btrfs_handle_error(). There is no other
user of btrfs_handle_error() either.

Signed-off-by: Anand Jain <anand.jain@oracle.com>
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>

61ecda68

Btrfs: make raid6 rebuild retry more · 8810f751

Committed by Liu Bo on Jan 02, 2018

There is a scenario in which the rebuild process can fail to return
good content. Suppose that all disks can be read without problems but
the content that was read out doesn't match its checksum. Currently,
for raid6, btrfs retries at most twice:

- the 1st retry is to rebuild with all other stripes, it'll eventually
  be a raid5 xor rebuild,
- if the 1st fails, the 2nd retry will deliberately fail parity p so
  that it will do raid6 style rebuild,

however, the chances are that another non-parity stripe content also
has something corrupted, so that the above retries are not able to
return correct content, and users will think of this as data loss.
More seriously, if the loss happens on some important internal btree
roots, it could refuse to mount.

This extends btrfs to do more retries and each retry fails only one
stripe.  Since raid6 can tolerate 2 disk failures, if there is one
more failure besides the failure on which we're recovering, this can
always work.

The worst case is to retry as many times as the number of raid6 disks,
but given the fact that such a scenario is really rare in practice,
it's still acceptable.

Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.com>

8810f751
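
The extended retry strategy can be modelled with a toy simulation. This is a sketch under stated assumptions, not the kernel code: `struct toy_array`, `toy_try_rebuild` and `toy_rebuild_with_retries` are hypothetical names, and a "rebuild" is reduced to asking whether the single silently corrupt stripe was excluded from the reconstruction.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of a raid6 stripe set with at most one bad stripe. */
struct toy_array {
	int num_stripes;      /* data stripes plus P and Q */
	int silently_corrupt; /* stripe with bad content, -1 if none */
};

/*
 * A rebuild of `target` yields good data only if the corrupt stripe is not
 * used as input, i.e. it is the target itself or the extra failed stripe.
 */
static bool toy_try_rebuild(const struct toy_array *a, int target, int extra_fail)
{
	return a->silently_corrupt < 0 ||
	       a->silently_corrupt == target ||
	       a->silently_corrupt == extra_fail;
}

/*
 * First try a plain rebuild from all other stripes, then retry while
 * deliberately failing one additional stripe per attempt.
 * Returns the number of attempts used, or -1 if every attempt failed.
 */
int toy_rebuild_with_retries(const struct toy_array *a, int target)
{
	int attempts = 1;

	assert(target >= 0 && target < a->num_stripes);

	if (toy_try_rebuild(a, target, -1))
		return attempts;

	for (int s = 0; s < a->num_stripes; s++) {
		if (s == target)
			continue;
		attempts++;
		if (toy_try_rebuild(a, target, s))
			return attempts;
	}
	return -1;
}
```

In this model the worst case uses as many attempts as there are stripes, which mirrors the bound stated in the commit message.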

Btrfs: fix scrub to repair raid6 corruption · 762221f0

Committed by Liu Bo on Jan 02, 2018

The raid6 corruption is that,
suppose that all disks can be read without problems and if the content
that was read out doesn't match its checksum, currently for raid6
btrfs at most retries twice,

- the 1st retry is to rebuild with all other stripes, it'll eventually
  be a raid5 xor rebuild,
- if the 1st fails, the 2nd retry will deliberately fail parity p so
  that it will do raid6 style rebuild,

however, the chances are that another non-parity stripe content also
has something corrupted, so that the above retries are not able to
return correct content.

We've fixed normal reads to rebuild raid6 correctly with more retries
in the patch "Btrfs: make raid6 rebuild retry more"[1]; this fixes
scrub to do exactly the same rebuild process.

[1]: https://patchwork.kernel.org/patch/10091755/

Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.com>

762221f0

btrfs: factor btrfs_check_rw_degradable() to check given device · 6528b99d

Committed by Anand Jain on Dec 18, 2017

Update btrfs_check_rw_degradable() to check against the given device if
it's lost.

We can use this function to know whether the volume is going to be in
degraded mode or in a failed state when the given device fails, which is
needed when we are handling the device failed state.

This is a preparatory patch and does not affect the flow as such.

Signed-off-by: Anand Jain <anand.jain@oracle.com>
Reviewed-by: Qu Wenruo <wqu@suse.com>
[ enhance comment ]
Signed-off-by: David Sterba <dsterba@suse.com>

6528b99d

btrfs: sink unlock_extent parameter gfp_flags · e43bbe5e

Committed by David Sterba on Dec 12, 2017

All callers pass either GFP_NOFS or GFP_KERNEL now, so we can sink the
parameter to the function. Though we lose some of the slightly better
semantics of GFP_KERNEL in some places, it's worth cleaning up the
callchains.

Signed-off-by: David Sterba <dsterba@suse.com>

e43bbe5e

btrfs: add separate helper for unlock_extent_cached with GFP_ATOMIC · d810a4be

Committed by David Sterba on Dec 07, 2017

There's only one instance where we pass a different gfp mask to
unlock_extent_cached. Add a separate helper for that and then we can
drop the gfp parameter from unlock_extent_cached.

Signed-off-by: David Sterba <dsterba@suse.com>

d810a4be

btrfs: drop unused parameters from mount_subvol · 5bedc48a

Committed by David Sterba on Jan 02, 2018

Recent patches reworking the mount path left some unused parameters. We
pass a vfsmount to mount_subvol, the flags and data (i.e. mount options)
have been already applied and we will not need them.

Signed-off-by: David Sterba <dsterba@suse.com>

5bedc48a

btrfs: cleanup unnecessary string dup in btrfs_parse_options() · e215772c

Committed by Misono, Tomohiro on Dec 14, 2017

Long ago, commit edf24abe ("btrfs: sanity mount option parsing and
early mount code") split btrfs_parse_options() into two parts
(btrfs_parse_early_options() and btrfs_parse_options()). As a result,
btrfs_parse_options() no longer gets called twice and is the last one to
parse the mount option string. Therefore there is no need to dup it.

Signed-off-by: Tomohiro Misono <misono.tomohiro@jp.fujitsu.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>

e215772c

Btrfs: remove unused wait in btrfs_stripe_hash · 203e02d9

Committed by Liu Bo on Dec 22, 2017

In fact nobody is waiting on @wait's waitqueue, it can be safely
removed.

Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>

203e02d9

btrfs: Remove redundant pair of bio_get/set in __btrfs_submit_dio_bio · 36f7894f

Committed by Nikolay Borisov on Dec 13, 2017

The bio is not referenced after it has been submitted and the endio is
going to consume the sole reference on successful submission. On error,
the callers of __btrfs_submit_dio_bio do invoke bio_put so we don't
leak it either.

Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>

36f7894f

btrfs: Remove redundant bio_get/bio_set pair from submit_one_bio · ffc9c8dd

Committed by Nikolay Borisov on Dec 13, 2017

The bio is never referenced after it has been submitted so there is no
point in getting an extra reference.

Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>

ffc9c8dd

btrfs: Remove redundant bio_get/set from submit_dio_repair_bio · ea057f6d

Committed by Nikolay Borisov on Dec 13, 2017

The bio that is passed is the newly created repair bio which already
has a reference count of 1, which is going to be consumed by the
endio routine on successful submission. On error the handler also
calls bio_put.

Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>

ea057f6d

btrfs: Remove redundant bio_get/set calls in compressed read/write paths · 32506af5

Committed by Nikolay Borisov on Dec 13, 2017

bio_get/set is necessary only if the bio is going to be referenced
following submission. In the code paths where such calls are made
we don't really need them since the bio is referenced only if
btrfs_map_bio returns an error. And this function can return an error
prior to submission only. So referencing the bio is safe. Furthermore
we do call bio_endio which will consume the last reference. So let's
remove the redundant calls.

Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>

32506af5
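
The reference-counting argument behind this group of bio cleanups can be illustrated with a toy model. Everything here is hypothetical (`struct toy_bio`, `toy_map_bio`, `toy_submit` and the error value); it only mirrors the reasoning that the submitter touches the bio solely on the pre-submission error path, where it still owns the single reference.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a bio with a plain reference counter. */
struct toy_bio {
	int refs;       /* starts at 1 when the bio is created */
	bool freed;     /* set once the last reference is dropped */
	bool submitted; /* set once the bio was handed to the lower layer */
};

void toy_bio_init(struct toy_bio *b)
{
	b->refs = 1;
	b->freed = false;
	b->submitted = false;
}

void toy_bio_put(struct toy_bio *b)
{
	assert(b != NULL && b->refs > 0);
	if (--b->refs == 0)
		b->freed = true;
}

/* Completion handler: consumes the sole reference. */
void toy_bio_endio(struct toy_bio *b)
{
	toy_bio_put(b);
}

/* Mapping can only fail before the bio is submitted (as in the message). */
int toy_map_bio(struct toy_bio *b, bool fail_early)
{
	if (fail_early)
		return -5; /* arbitrary error code for the sketch */
	b->submitted = true;
	return 0;
}

int toy_submit(struct toy_bio *b, bool fail_early)
{
	int ret = toy_map_bio(b, fail_early);

	/* Error means "not submitted": we still own the bio, so ending it
	 * here is safe without having taken an extra reference. */
	if (ret)
		toy_bio_endio(b);
	/* On success the bio must not be touched again by the submitter. */
	return ret;
}
```

Since neither path touches the bio after a successful submission, an extra get/put pair around the submit call would add nothing in this model.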

btrfs: Improve btrfs_search_slot description · 4271ecea

Committed by Nikolay Borisov on Dec 13, 2017

Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>

4271ecea

btrfs: heuristic: call get4bits directly · 36243c91

Committed by David Sterba on Dec 12, 2017

As it's a single instance and local to the file, we don't need to pass
it as an argument.

Reviewed-by: Timofey Titovets <nefelim4ag@gmail.com>
Signed-off-by: David Sterba <dsterba@suse.com>

36243c91

btrfs: heuristic: open code copy_call callback of radix sort · 7add17be

Committed by David Sterba on Dec 12, 2017

The callback is trivial and we don't need the abstraction for our
purposes. Let's open code it.

Reviewed-by: Timofey Titovets <nefelim4ag@gmail.com>
Signed-off-by: David Sterba <dsterba@suse.com>

7add17be

btrfs: heuristic: open code get_num callback of radix sort · 23ae8c63

Committed by David Sterba on Dec 12, 2017

The callback is trivial and we don't need the abstraction for our
purposes. Let's open code it and also make the array types explicit.

Reviewed-by: Timofey Titovets <nefelim4ag@gmail.com>
Signed-off-by: David Sterba <dsterba@suse.com>

23ae8c63

btrfs: remove unused arg from parse_subvol_options() · 78f6beac

Committed by Misono, Tomohiro on Jan 17, 2018

Remove the unused arg 'holder' from parse_subvol_options(), which was
left behind by commit b99beb110e2d ("btrfs: split
parse_early_options() in two").

Signed-off-by: Tomohiro Misono <misono.tomohiro@jp.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>

78f6beac

btrfs: remove unused setup_root_args() · 83085935

Committed by Misono, Tomohiro on Dec 14, 2017

Since setup_root_args() is not used anymore, just remove it.

Signed-off-by: Tomohiro Misono <misono.tomohiro@jp.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>

83085935

btrfs: split parse_early_options() in two · d7407606

Committed by Misono, Tomohiro on Dec 14, 2017

Now parse_early_options() is used by both btrfs_mount() and
btrfs_mount_root(). However, the former only needs the subvol related
part and the latter needs the rest.

Therefore extract the subvol related parts from parse_early_options() and
move them to a new parse function (parse_subvol_options()).

Signed-off-by: Tomohiro Misono <misono.tomohiro@jp.fujitsu.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>

d7407606

btrfs: cleanup btrfs_mount() using btrfs_mount_root() · 312c89fb

Committed by Misono, Tomohiro on Dec 14, 2017

Cleanup btrfs_mount() by using btrfs_mount_root(). This avoids getting
btrfs_mount() called twice in mount path.

Old btrfs_mount() will do:
0. VFS layer calls vfs_kern_mount() with registered file_system_type
   (for btrfs, btrfs_fs_type). btrfs_mount() is called on the way.
1. btrfs_parse_early_options() parses "subvolid=" mount option and set the
   value to subvol_objectid. Otherwise, subvol_objectid has the initial
   value of 0
2. check subvol_objectid is 5 or not. Assume this time id is not 5, then
   btrfs_mount() returns by calling mount_subvol()
3. In mount_subvol(), original mount options are modified to contain
   "subvolid=0" in setup_root_args(). Then, vfs_kern_mount() is called with
   btrfs_fs_type and new options
4. btrfs_mount() is called again
5. btrfs_parse_early_options() parses "subvolid=0" and set 5 (instead of 0)
   to subvol_objectid
6. check subvol_objectid is 5 or not. This time id is 5 and mount_subvol()
   is not called. btrfs_mount() finishes mounting a root
7. (in mount_subvol()) using the return value of vfs_kern_mount(), it
   calls mount_subtree()
8. return subvolume's dentry

Reusing the same file_system_type (and btrfs_mount()) for vfs_kern_mount()
is the cause of complication.

Instead, new btrfs_mount() will do:
1. parse subvol id related options for later use in mount_subvol()
2. mount device's root by calling vfs_kern_mount() with
   btrfs_root_fs_type, which is not registered to VFS by
   register_filesystem(). As a result, btrfs_mount_root() is called
3. return by calling mount_subvol()

The code of 2. is moved from the first part of mount_subvol().

The semantics of device holder changes from btrfs_fs_type to
btrfs_root_fs_type and has to be used in all contexts. Otherwise we'd
get wrong results when mount and dev scan would not check the same
thing. (this has been found independently and the fix is folded into this
patch)

Signed-off-by: Tomohiro Misono <misono.tomohiro@jp.fujitsu.com>
Reviewed-by: David Sterba <dsterba@suse.com>
[ fold the btrfs_control_ioctl fixup, extend the comment ]
Signed-off-by: David Sterba <dsterba@suse.com>

312c89fb

btrfs: add btrfs_mount_root() and new file_system_type · 72fa39f5

Committed by Misono, Tomohiro on Dec 14, 2017

Add btrfs_mount_root() and a new file_system_type in preparation for the
cleanup of btrfs_mount(). The code path is not changed yet.

btrfs_mount_root() is almost the same as the current btrfs_mount(), but
doesn't have the subvolume related part.

Signed-off-by: Tomohiro Misono <misono.tomohiro@jp.fujitsu.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>

72fa39f5

btrfs: unify extent_page_data type passed as void · aab6e9ed

Committed by David Sterba on Nov 30, 2017

Functions called from extent_write_cache_pages used void* as generic
callback data, but all of them convert it to extent_page_data, or use it
directly.

Signed-off-by: David Sterba <dsterba@suse.com>

aab6e9ed

btrfs: sink writepage parameter to extent_write_cache_pages · 935db853

Committed by David Sterba on Jun 23, 2017

The function extent_write_cache_pages is modelled after
write_cache_pages which is a generic interface and the writepage
parameter makes sense there. In btrfs we know exactly which callback
we're going to use, so we can pass it directly.

Signed-off-by: David Sterba <dsterba@suse.com>

935db853

btrfs: sink flush_fn to extent_write_cache_pages · 25b860e0

Committed by David Sterba on Jun 23, 2017

All callers pass the same value flush_write_bio.

Signed-off-by: David Sterba <dsterba@suse.com>

25b860e0

btrfs: merge two flush_write_bio helpers · e2932ee0

Committed by David Sterba on Jun 23, 2017

flush_epd_write_bio is the same as flush_write_bio, so there is no point
in having two such functions. Merge them into flush_write_bio. The
'noinline' attribute is removed as it does not have any meaning.

Signed-off-by: David Sterba <dsterba@suse.com>

e2932ee0

btrfs: Rename bin_search -> btrfs_bin_search · a74b35ec

Committed by Nikolay Borisov on Dec 08, 2017

Currently there are 2 functions doing binary search on btrfs nodes:
bin_search and btrfs_bin_search, the latter being a simple wrapper for
the former. So eliminate the wrapper and just rename bin_search to
btrfs_bin_search. No functional changes.

Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>

a74b35ec
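
For readers unfamiliar with what a binary search over the keys of a tree node has to deliver, here is a generic sketch. It is not the btrfs implementation: `toy_bin_search`, the plain integer keys and the return convention (0 when found, 1 when not, with `*slot` holding the match or the insertion position) are assumptions chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical binary search over a sorted key array.
 * Returns 0 and sets *slot to the matching index if key is present.
 * Returns 1 and sets *slot to the index where key would be inserted
 * if it is absent.
 */
int toy_bin_search(const unsigned long long *keys, int nritems,
		   unsigned long long key, int *slot)
{
	int low = 0;
	int high = nritems;

	assert(keys != NULL && slot != NULL && nritems >= 0);

	while (low < high) {
		int mid = low + (high - low) / 2;

		if (keys[mid] < key) {
			low = mid + 1;
		} else if (keys[mid] > key) {
			high = mid;
		} else {
			*slot = mid;
			return 0;
		}
	}
	*slot = low;
	return 1;
}
```

Returning the insertion position on a miss lets a caller descend into the right child or insert a new item without a second search.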

btrfs: sink extent_write_full_page tree argument · 0a9b0e53

Committed by Nikolay Borisov on Dec 08, 2017

The tree argument passed to extent_write_full_page is referenced from
the page being passed to the same function. Since we already have
enough information to get the reference, remove the function parameter.

Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>

0a9b0e53

btrfs: sink extent_write_locked_range tree parameter · 5e3ee236

Committed by Nikolay Borisov on Dec 08, 2017

This function is called only from submit_compressed_extents and the
io tree being passed is always that of the inode. But we are also
passing the inode, so just move getting the io tree pointer in
extent_write_locked_range to simplify the signature.

Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>

5e3ee236

btrfs: Remove pair of bio_get/put in btrfs_schedule_bio · 3e798068

Committed by Nikolay Borisov on Dec 11, 2017

This code was added in 492bb6de ("Btrfs: Hold a reference on bios
during submit_bio, add some extra bio checks"). However, holding a
reference on a bio is necessary only if it's going to be referenced
after the submit_bio returns and the bio is completed. In this
particular instance this is not the case so there is no need to hold
an extra reference since we directly return.

Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: Liu Bo <bo.li.liu@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.com>

3e798068

btrfs: Fix out of bounds access in btrfs_search_slot · 9ea2c7c9

Committed by Nikolay Borisov on Dec 12, 2017

When modifying a tree where the root is at BTRFS_MAX_LEVEL - 1 then
the level variable is going to be 7 (this is the max height of the
tree). On the other hand btrfs_cow_block is always called with
"level + 1" as an index into the nodes and slots arrays. This leads to
an out of bounds access. Admittedly this will be benign since an OOB
access of the nodes array will likely read the 0th element from the
slots array, which in this case is going to be 0 (since we start CoW at
the top of the tree). The OOB access into the slots array in turn will
read the 0th and 1st values of the locks array, which would both be 0
at the time. However, this benign behavior relies on the fact that the
path being passed hasn't been initialised, if it has already been used to
query a btree then it could potentially have populated the nodes/slots arrays.

Fix it by explicitly checking if we are at level 7 (the maximum allowed
index in nodes/slots arrays) and explicitly call the CoW routine with
NULL for parent's node/slot.

Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Fixes-coverity-id: 711515
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>

9ea2c7c9
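
The shape of the fix can be sketched as follows. This is an illustrative model rather than the kernel code: `TOY_MAX_LEVEL`, `struct toy_path` and `toy_cow_parent` are hypothetical names, with the array size set to 8 so that 7 is the highest valid index, as in the description above.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_LEVEL 8 /* valid indexes are 0..7 */

/* Hypothetical, reduced search path: one node and slot per tree level. */
struct toy_path {
	void *nodes[TOY_MAX_LEVEL];
	int slots[TOY_MAX_LEVEL];
};

/*
 * Select the parent node and slot to hand to the CoW routine for a block
 * at `level`. At the top level there is no parent, so NULL/0 is used
 * instead of reading nodes[level + 1], which would be out of bounds.
 */
void toy_cow_parent(const struct toy_path *p, int level,
		    void **parent, int *parent_slot)
{
	assert(level >= 0 && level < TOY_MAX_LEVEL);

	if (level == TOY_MAX_LEVEL - 1) {
		*parent = NULL;
		*parent_slot = 0;
	} else {
		*parent = p->nodes[level + 1];
		*parent_slot = p->slots[level + 1];
	}
}
```

The explicit top-level branch makes the result independent of whatever a previously used path happens to contain beyond the arrays.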

btrfs: remove duplicate includes · 87c46ec7

Committed by Pravin Shedge on Dec 06, 2017

These duplicate includes have been found with scripts/checkincludes.pl but
they have been removed manually to avoid removing false positives.

Signed-off-by: Pravin Shedge <pravin.shedge4linux@gmail.com>
Signed-off-by: David Sterba <dsterba@suse.com>

87c46ec7

openanolis / cloud-kernel · last synced successfully more than 1 year ago
