Commits · deac642d7e0fd83efd3372c4093fe60ac7436db6 · openeuler / raspberrypi-kernel

22 Jan, 2018 36 commits

btrfs: sink get_extent parameter to extent_write_full_page · deac642d
Committed by David Sterba on Jun 23, 2017
```
There's only one caller.
Signed-off-by: David Sterba <dsterba@suse.com>
```
btrfs: sink get_extent parameter to extent_write_locked_range · 916b9298
Committed by David Sterba on Jun 23, 2017
```
There's only one caller.
Signed-off-by: David Sterba <dsterba@suse.com>
```
btrfs: sink get_extent parameter to extent_writepages · 43317599
Committed by David Sterba on Jun 23, 2017
```
There's only one caller.
Signed-off-by: David Sterba <dsterba@suse.com>
```

btrfs: Cleanup existing name_len checks · bae15d95

Committed by Qu Wenruo on Nov 08, 2017

Since tree-checker has verified leaf when reading from disk, we don't
need the existing verify_dir_item() or btrfs_is_name_len_valid() checks.
Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>

btrfs: tree-checker: Add checker for dir item · ad7b0368

Committed by Qu Wenruo on Nov 08, 2017

Add checker for dir item, for key types DIR_ITEM, DIR_INDEX and
XATTR_ITEM.

This checker does comprehensive checks for:

1) dir_item header and its data size
   Against item boundary and maximum name/xattr length.
   This part is mostly the same as old verify_dir_item().

2) dir_type
   Against maximum file types, and against key type.
   Since XATTR key should only have FT_XATTR dir item, and normal dir
   item type should not have XATTR key.

   The check between key->type and dir_type is newly introduced by this
   patch.

3) name hash
   For XATTR and DIR_ITEM key, key->offset is name hash (crc32c).
   Check the hash of the name against the key to ensure it's correct.

   The name hash check is only found in btrfs-progs before this patch.
Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: Su Yue <suy.fnst@cn.fujitsu.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>

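For illustration, the key type versus dir_type rule from check 2) can be sketched in plain C. This is a minimal sketch and not the kernel's tree-checker code: `check_dir_type` and the enum names are hypothetical, and the numeric values are assumed to mirror the btrfs headers. The size and name hash checks are left out.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-ins for the on-disk constants; the values are assumptions. */
enum { XATTR_ITEM_KEY = 24, DIR_ITEM_KEY = 84, DIR_INDEX_KEY = 96 };
enum { FT_XATTR = 8, FT_MAX = 9 };

/*
 * Hypothetical helper: return 0 when dir_type is acceptable for key_type,
 * -1 otherwise.
 */
int check_dir_type(uint8_t key_type, uint8_t dir_type)
{
	/* The caller is expected to pass only directory-like keys. */
	assert(key_type == XATTR_ITEM_KEY || key_type == DIR_ITEM_KEY ||
	       key_type == DIR_INDEX_KEY);

	/* Check against the maximum file type. */
	if (dir_type >= FT_MAX)
		return -1;
	/* An XATTR key may only hold an FT_XATTR dir item ... */
	if (key_type == XATTR_ITEM_KEY && dir_type != FT_XATTR)
		return -1;
	/* ... and a normal dir key must not hold one. */
	if (key_type != XATTR_ITEM_KEY && dir_type == FT_XATTR)
		return -1;
	return 0;
}
```

A caller would run such a check once per dir item while walking a leaf and reject the whole leaf on the first failure.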

btrfs: use GFP_KERNEL in btrfs_alloc_inode · 712e36c5

Committed by David Sterba on Oct 31, 2017

This callback is called directly from VFS, no locks are held at the
allocation time.
Signed-off-by: David Sterba <dsterba@suse.com>

btrfs: sink gfp parameter to clear_extent_uptodate · f08dc36f
Committed by David Sterba on Oct 31, 2017
```
There's only one callsite with GFP_NOFS.
Signed-off-by: David Sterba <dsterba@suse.com>
```

btrfs: sink gfp parameter to clear_extent_bit · ae0f1625

Committed by David Sterba on Oct 31, 2017

All callers use GFP_NOFS, we don't have to pass it as an argument. The
built-in tests pass GFP_KERNEL, but they run only at module load time
and NOFS works there as well.
Signed-off-by: David Sterba <dsterba@suse.com>

btrfs: prepare to drop gfp mask parameter from clear_extent_bit · 66b0c887

Committed by David Sterba on Oct 31, 2017

Use __clear_extent_bit directly in case we want to pass unknown
gfp flags. Otherwise all clear_extent_bit callers use GFP_NOFS, so we
can sink them to the function and reduce argument count, at the cost
that __clear_extent_bit has to be exported.
Signed-off-by: David Sterba <dsterba@suse.com>

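The general shape of "sinking" a parameter can be sketched as follows. This is an illustrative sketch under assumed names, not the extent_io code: `clear_bits`, `clear_bits_with_mask`, `enum alloc_mask` and `last_mask` are all hypothetical, and the mask values merely stand in for gfp flags.

```c
#include <assert.h>

/* Hypothetical allocation-context flags standing in for gfp_t values. */
enum alloc_mask { MASK_NOFS = 1, MASK_KERNEL = 2 };

/* Records the mask of the most recent call so the sketch is observable. */
enum alloc_mask last_mask;

/* Full variant: callers with unusual needs pass the mask explicitly. */
int clear_bits_with_mask(unsigned long start, unsigned long end,
			 enum alloc_mask mask)
{
	assert(start <= end);
	last_mask = mask;
	return 0;
}

/* Common variant: the mask every regular caller used is fixed here. */
int clear_bits(unsigned long start, unsigned long end)
{
	return clear_bits_with_mask(start, end, MASK_NOFS);
}
```

The cost mentioned in the message shows up here as well: the full variant has to be visible to the few callers that still need it.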

btrfs: use non-RCU list traversal in write_all_supers callees · 1538e6c5

Committed by David Sterba on Jun 16, 2017

We take the fs_devices::device_list_mutex mutex in write_all_supers
which will prevent any add/del changes to the device list. Therefore we
don't need to use the RCU variant list_for_each_entry_rcu in any of the
called functions.
Signed-off-by: David Sterba <dsterba@suse.com>

btrfs: switch to RCU for device traversal in btrfs_ioctl_fs_info · d03262c7

Committed by David Sterba on Jun 16, 2017

We don't need to use the mutex as we do not modify the devices nor the
list itself and just read information about device counts.
Move copying fsid out of the protected section, not applicable to RCU
same as the rest of the retrieved information.
Signed-off-by: David Sterba <dsterba@suse.com>

btrfs: switch to RCU for device traversal in btrfs_ioctl_dev_info · c5593ca3

Committed by David Sterba on Jun 16, 2017

We don't need to use the mutex as we do not modify the devices nor the
list itself and just read some information:

does not change during device lifetime:
- devid
- uuid
- name (i.e. the path)

may change in parallel to the ioctl call, but can lead only to reporting
inaccuracy:
- bytes_used
- total_bytes
Signed-off-by: David Sterba <dsterba@suse.com>

btrfs: simplify btrfs_close_bdev · 08ffcae8
Committed by David Sterba on Jun 19, 2017
```
Split the conditions a bit.
Signed-off-by: David Sterba <dsterba@suse.com>
```

btrfs: document device locking · 9c6b1c4d

Committed by David Sterba on Jun 16, 2017

Overview of the main locks protecting various device-related structures.
Signed-off-by: David Sterba <dsterba@suse.com>

btrfs: simplify exit paths in btrfs_init_new_device · 5c4cf6c9
Committed by David Sterba on Oct 30, 2017
```
Reviewed-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.com>
```

btrfs: use free_device where opencoded · 55de4803

Committed by David Sterba on Oct 30, 2017

Reviewed-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.com>

btrfs: introduce free_device helper · 48dae9cf

Committed by David Sterba on Oct 30, 2017

A helper to free a device and all its dynamically allocated members,
like the rcu_string name or flush_bio. This is going to replace all
open coded places.
Signed-off-by: David Sterba <dsterba@suse.com>

btrfs: rename device free rcu helper to free_device_rcu · f06c5965

Committed by David Sterba on Jun 06, 2017

Make it clear that it is an RCU helper, we want to use the name
free_device for a wrapper freeing all device members.
Signed-off-by: David Sterba <dsterba@suse.com>

Btrfs: document rules about bio async submit · 4c274bc6

Committed by Liu Bo on Nov 01, 2017

These rules have been hidden in several if-else branches and are not
straightforward to follow. For example, the dio submit hook's nocsum case
had a bug, i.e. doing async submit instead of sync submit, which has
been fixed recently.

This documents the rules for reference.
Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.com>

btrfs: Reduce scope of delayed_rsv->lock in may_commit_trans · 057aac3e

Committed by Nikolay Borisov on Nov 07, 2017

After commit 996478ca ("btrfs: change how we decide to commit
transactions during flushing") there is no need to hold the delayed_rsv
during the percpu_counter_compare call since we get the byte's snapshot
earlier. So hold the lock only while reading delayed_rsv.
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>

Btrfs: add __init macro to btrfs init functions · f5c29bd9

Committed by Liu Bo on Nov 02, 2017

Adding the __init macro gives the kernel a hint that this function is only
used during the initialization phase and its memory resources can be freed
up afterwards.
Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>

btrfs: rename btrfs_add_device to btrfs_add_dev_item · c74a0b02

Committed by Anand Jain on Nov 06, 2017

Function btrfs_add_device() is adding the device item so rename to
reflect that in the function. Similarly we have btrfs_rm_dev_item().
Signed-off-by: Anand Jain <anand.jain@oracle.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>

btrfs: Don't generate UUID for non-fs tree · 33d85fda

Committed by Qu Wenruo on Oct 31, 2017

btrfs_create_tree() will unconditionally generate UUID for any root.
So for quota tree and data reloc tree created by kernel, they will have
unique UUIDs.

However UUID in root item is only referred by UUID tree, which only
records UUID for fs trees.  This makes unique UUIDs for quota/data reloc
tree meaningless.

Leave the UUID as zero for non-fs tree, making btrfs-debug-tree output
less confusing.
Reported-by: Misono Tomohiro <misono.tomohiro@jp.fujitsu.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>

btrfs: move volume_mutex into the btrfs_rm_device() · 2c997384

Committed by Anand Jain on Nov 06, 2017

A cleanup patch with no functional change: we hold volume_mutex before
calling btrfs_rm_device, so move it into the function itself.
Signed-off-by: Anand Jain <anand.jain@oracle.com>
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>

btrfs: Use locked_end rather than open coding it · 96b09dde

Committed by Nikolay Borisov on Nov 01, 2017

Right before we go into this loop locked_end is set to alloc_end - 1 and
is being used in nearby functions, no need to have exceptions. This just
makes the code consistent, no functional changes.
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>

btrfs: Move loop termination condition in while() · 6b7d6e93

Committed by Nikolay Borisov on Nov 01, 2017

Fallocating a file in btrfs goes through several stages. The one before
actually inserting the fallocated extents is to create a qgroup
reservation, covering the desired range. To this end there is a loop in
btrfs_fallocate which checks to see if there are holes in the fallocated
range or !PREALLOC extents past EOF and if so create qgroup reservations
for them. Unfortunately, the main condition of the loop is buried right
at the end of its body rather than in the actual while statement which
makes it non-obvious. Fix this by moving the condition in the while
statement where it belongs. No functional changes.
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>

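As a rough illustration of this kind of refactoring (not the actual btrfs_fallocate code), the two loop shapes can be compared side by side. The function names and the step-counting body are hypothetical stand-ins for the per-extent work.

```c
#include <assert.h>

/* Before: the exit condition is hidden at the bottom of the body. */
int count_steps_before(unsigned long cur, unsigned long end,
		       unsigned long step)
{
	int n = 0;

	assert(step > 0 && cur < end);
	while (1) {
		n++;		/* stand-in for the per-extent work */
		cur += step;
		if (cur >= end)
			break;
	}
	return n;
}

/* After: the same condition sits in the while statement itself. */
int count_steps_after(unsigned long cur, unsigned long end,
		      unsigned long step)
{
	int n = 0;

	assert(step > 0 && cur < end);
	while (cur < end) {
		n++;
		cur += step;
	}
	return n;
}
```

The two forms agree as long as the loop is entered with a non-empty range, which the sketch asserts. With an empty range the first form would still run its body once.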

Btrfs: remove rcu_barrier in btrfs_close_devices · 47dba171

Committed by Liu Bo on Oct 10, 2017

It was introduced because btrfs used to do blkdev_put in a deferred
work, now that btrfs has blkdev_put in place, this rcu_barrier can be
removed.

modprobe -r btrfs will do btrfs_cleanup_fs_uuids(), where it cleans up
every %fs_devices on the list, but when we do btrfs_close_devices(), we
have replaced the devices on the list with dummy ones which only have
the same name and uuid, so modprobe -r btrfs will free those instead of
what we were using, this change won't cause a problem for it.
Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
Reviewed-by: Anand Jain <anand.jain@oracle.com>
Reviewed-by: David Sterba <dsterba@suse.com>
[ copied 2nd paragraph from mailinglist discussion ]
Signed-off-by: David Sterba <dsterba@suse.com>

btrfs: Move checks from btrfs_wq_run_delayed_node to btrfs_balance_delayed_items · 8577787f

Committed by Nikolay Borisov on Oct 23, 2017

btrfs_balance_delayed_items is the sole caller of
btrfs_wq_run_delayed_node and already includes one of the checks whether
the delayed inodes should be run. On the other hand
btrfs_wq_run_delayed_node duplicates that check and performs an
additional one for wq congestion.

Let's remove the duplicate check and move the congestion one in
btrfs_balance_delayed_items, leaving btrfs_wq_run_delayed_node to only
care about setting up the wq run. No functional changes.
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>

btrfs: Make btrfs_async_run_delayed_root use a loop rather than multiple labels · 617c54a8

Committed by Nikolay Borisov on Oct 23, 2017

Currently btrfs_async_run_delayed_root's implementation uses 3 goto
labels to mimic the functionality of a simple do {} while loop. Refactor
the function to use a do {} while construct, making intention clear and
code easier to follow. No functional changes.
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>

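The transformation can be pictured with a much smaller function than the real one. The sketch below is an assumption-laden simplification: `drain_with_goto` and `drain_with_loop` are hypothetical, use fewer labels than the three mentioned above, and only model "process items until none are left or a budget is reached".

```c
#include <assert.h>

/* Before: labels and gotos emulate a loop. */
int drain_with_goto(int pending, int budget)
{
	int done = 0;

	assert(pending >= 0 && budget > 0);
again:
	if (pending == 0)
		goto out;
	pending--;		/* stand-in for running one delayed node */
	done++;
	if (done < budget)
		goto again;
out:
	return done;
}

/* After: the same control flow written as a do {} while loop. */
int drain_with_loop(int pending, int budget)
{
	int done = 0;

	assert(pending >= 0 && budget > 0);
	do {
		if (pending == 0)
			break;
		pending--;
		done++;
	} while (done < budget);
	return done;
}
```

Both versions stop early when nothing is pending and otherwise stop once the budget is reached, so the rewrite changes readability only.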

btrfs: Remove redundant mirror_num arg · d3fac6ba

Committed by Nikolay Borisov on Oct 24, 2017

The following callpath is always invoked with mirror_num set to 0, so
let's remove it as an argument and directly pass 0 to __do_readpage. No
functional change.

extent_readpages
  __extent_readpages
    __do_contiguous_readpages
      __do_readpage
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>

btrfs: Remove unused function · ac244ef1

Committed by Nikolay Borisov on Oct 20, 2017

Its sole callsite was removed in a previous patch so just nuke it for good.
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>

btrfs: Remove redundant memory barrier in dev stats · 4660c49f

Committed by Nikolay Borisov on Oct 20, 2017

As per the atomic_t.txt documentation:
 - RMW operations that have a return value are fully ordered;

atomic_xchg is one such operation, so it already includes everything it
needs w.r.t. memory ordering. Remove the redundant barrier and add a
comment to be more explicit about that.
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>

btrfs: Fix memory barriers usage with device stats counters · 9deae968

Committed by Nikolay Borisov on Oct 24, 2017

Commit addc3fa7 ("Btrfs: Fix the problem that the dirty flag of dev
stats is cleared") reworked the way device stats changes are tracked. A
new atomic dev_stats_ccnt counter was introduced which is incremented
every time any of the device stats counters are changed. This serves as
a flag whether there are any pending stats changes. However, this patch
only partially implemented the correct memory barriers necessary:

- It only ordered the stores to the counters but not the reads e.g.
  btrfs_run_dev_stats
- It completely omitted any comments documenting the intended design and
  how the memory barriers pair with each other

This patch provides the necessary comments as well as adds a missing
smp_rmb in btrfs_run_dev_stats. Furthermore since dev_stats_ccnt is only
a snapshot at best there was no point in reading the counter twice -
once in btrfs_dev_stats_dirty and then again when assigning stats_cnt.
Just collapse both reads into 1.

Fixes: addc3fa7 ("Btrfs: Fix the problem that the dirty flag of dev stats is cleared")
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>

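The pairing described above can be approximated outside the kernel with C11 atomics. This is only an analogy and not the btrfs code: the release and acquire fences stand in for the kernel's barriers, and `struct dev_stats`, `stat_inc` and `stats_flush` are hypothetical names.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical, simplified device with a single stats value. */
struct dev_stats {
	atomic_int write_errs;	/* one of the stats counters */
	atomic_int ccnt;	/* number of not yet flushed changes */
};

/* Writer: publish the stats update before bumping the change counter. */
void stat_inc(struct dev_stats *d)
{
	assert(d != NULL);
	atomic_fetch_add_explicit(&d->write_errs, 1, memory_order_relaxed);
	/* Pairs with the acquire fence in stats_flush(). */
	atomic_thread_fence(memory_order_release);
	atomic_fetch_add_explicit(&d->ccnt, 1, memory_order_relaxed);
}

/*
 * Reader: take a single snapshot of the change counter, then read the
 * stats. Returns the value to write out, or -1 if nothing changed.
 */
int stats_flush(struct dev_stats *d)
{
	int cnt, val;

	assert(d != NULL);
	cnt = atomic_load_explicit(&d->ccnt, memory_order_relaxed);
	if (cnt == 0)
		return -1;
	/* Keep the stats read below from moving before the counter read. */
	atomic_thread_fence(memory_order_acquire);
	val = atomic_load_explicit(&d->write_errs, memory_order_relaxed);
	/* Subtract only what was observed; newer changes stay pending. */
	atomic_fetch_sub_explicit(&d->ccnt, cnt, memory_order_relaxed);
	return val;
}
```

Reading the counter once and reusing that snapshot mirrors the "collapse both reads into 1" point: a second read could observe a different value and would add nothing.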

btrfs: clean up btrfs_dev_stat_inc usage · 1cb34c8e

Committed by Anand Jain on Oct 21, 2017

btrfs_end_bio() calls btrfs_dev_stat_inc() and then
btrfs_dev_stat_print_on_error() separately; use
btrfs_dev_stat_inc_and_print() directly instead.

As of now there isn't any bio in btrfs that is both a non-empty write
and has the REQ_PREFLUSH flag set. So in practice the condition

   if (bio->bi_opf & REQ_PREFLUSH)

is never true in btrfs_end_bio(), and so calling
btrfs_dev_stat_inc_and_print() separately, once for write and once for
flush, won't produce any redundant error log.

This consolidation will help to add device critical error handling to
btrfs_dev_stat_inc_and_print(), which can be renamed as needed.
Signed-off-by: Anand Jain <anand.jain@oracle.com>
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>

Btrfs: free btrfs_device in place · 9f5316c1

Committed by Liu Bo on Oct 23, 2017

It's pointless to defer it to a kthread helper as we're not under a
special context.

For reference, commit 1f78160c ("Btrfs: using rcu lock in the reader
side of devices list") introduced RCU freeing for device structures.

Originally the blkdev_put was called from free_device and rcu_barrier had
to be called. This is no longer required, bdev and our device structures
are now freed separately.
Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
Reviewed-by: Anand Jain <anand.jain@oracle.com>
Reviewed-by: David Sterba <dsterba@suse.com>
[ enhance changelog ]
Signed-off-by: David Sterba <dsterba@suse.com>

Btrfs: remove redundant btrfs_balance_delayed_items · 1805f2ca

Committed by Liu Bo on Oct 20, 2017

In functions like btrfs_create(), we run both
btrfs_balance_delayed_items() and btrfs_btree_balance_dirty() after
the operation, but btrfs_btree_balance_dirty() is surely going to run
btrfs_balance_delayed_items().

This keeps only btrfs_btree_balance_dirty().
Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
Reviewed-by: Lu Fengqi <lufq.fnst@cn.fujitsu.com>
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>

03 Jan, 2018 2 commits

btrfs: fix refcount_t usage when deleting btrfs_delayed_nodes · ec35e48b

Committed by Chris Mason on Dec 15, 2017

refcounts have a generic implementation and an asm optimized one.  The
generic version has extra debugging to make sure that once a refcount
goes to zero, refcount_inc won't increase it.

The btrfs delayed inode code wasn't expecting this, and we're tripping
over the warnings when the generic refcounts are used.  We ended up with
this race:

Process A                                         Process B
                                                  btrfs_get_delayed_node()
                                                  spin_lock(root->inode_lock)
                                                  radix_tree_lookup()
__btrfs_release_delayed_node()
refcount_dec_and_test(&delayed_node->refs)
our refcount is now zero
                                                  refcount_add(2) <---
                                                  warning here, refcount
                                                  unchanged

spin_lock(root->inode_lock)
radix_tree_delete()

With the generic refcounts, we actually warn again when process B above
tries to release his refcount because refcount_add() turned into a
no-op.

We saw this in production on older kernels without the asm optimized
refcounts.

The fix used here is to use refcount_inc_not_zero() to detect when the
object is in the middle of being freed and return NULL.  This is almost
always the right answer anyway, since we usually end up pitching the
delayed_node if it didn't have fresh data in it.

This also changes __btrfs_release_delayed_node() to remove the extra
check for zero refcounts before radix tree deletion.
btrfs_get_delayed_node() was the only path that was allowing refcounts
to go from zero to one.

Fixes: 6de5f18e ("btrfs: fix refcount_t usage when deleting btrfs_delayed_node")
CC: <stable@vger.kernel.org> # 4.12+
Signed-off-by: Chris Mason <clm@fb.com>
Reviewed-by: Liu Bo <bo.li.liu@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.com>

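The "increment unless already zero" idea behind the fix can be sketched with C11 atomics. This is an illustration under assumed names, not the kernel's refcount_t implementation: `struct node`, `ref_inc_not_zero` and `get_node` are hypothetical, and locking and the radix tree are left out.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a delayed node with a reference count. */
struct node {
	atomic_int refs;
};

/* Take a reference only if the object is not already being freed. */
bool ref_inc_not_zero(struct node *n)
{
	int old;

	assert(n != NULL);
	old = atomic_load(&n->refs);
	while (old != 0) {
		/* On failure, old is reloaded with the current value. */
		if (atomic_compare_exchange_weak(&n->refs, &old, old + 1))
			return true;
	}
	return false;
}

/* Lookup path: hand out the node, or NULL if it is on its way out. */
struct node *get_node(struct node *found)
{
	if (found != NULL && ref_inc_not_zero(found))
		return found;
	return NULL;
}
```

A caller that gets NULL back treats the node as absent, which matches the observation above that this is almost always the right answer.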

btrfs: Fix flush bio leak · beed9263

Committed by Nikolay Borisov on Dec 13, 2017

Commit e0ae9994 ("btrfs: preallocate device flush bio") reworked
the way the flush bio is allocated and used. Concretely it allocates
the bio in __alloc_device and then re-uses it multiple times with a
very simple endio routine that just calls complete() without consuming
a reference. Allocated bios by default come with a ref count of 1,
which is then consumed by the endio routine (or not, in which case they
should be bio_put by the caller). The way the implementation works now
is that the flush bio has a refcount of 2 and we only ever bio_put it
once, leaving it to hang indefinitely. Fix this by removing the extra
bio_get in __alloc_device.

Fixes: e0ae9994 ("btrfs: preallocate device flush bio")
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: Liu Bo <bo.li.liu@oracle.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>

07 Dec, 2017 2 commits

btrfs: Fix possible off-by-one in btrfs_search_path_in_tree · c8bcbfbd

Committed by Nikolay Borisov on Dec 01, 2017

The name char array passed to btrfs_search_path_in_tree is of size
BTRFS_INO_LOOKUP_PATH_MAX (4080). So the actual accessible char indexes
are in the range of [0, 4079]. Currently the code uses the define but this
represents an off-by-one.

Implications:

Size of btrfs_ioctl_ino_lookup_args is 4096, so the new byte will be
written to extra space, not some padding that could be provided by the
allocator.

btrfs-progs store the arguments on stack, but kernel does own copy of
the ioctl buffer and the off-by-one overwrite does not affect userspace,
but the ending 0 might be lost.

Kernel ioctl buffer is allocated dynamically so we're overwriting
somebody else's memory, and the ioctl is privileged if args.objectid is
not 256. Which is in most cases, but resolving a subvolume stored in
another directory will trigger that path.

Before this patch the buffer was one byte larger, but then the -1 was
not added.

Fixes: ac8e9819 ("Btrfs: add search and inode lookup ioctls")
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
[ added implications ]
Signed-off-by: David Sterba <dsterba@suse.com>

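The indexing rule can be shown with a small stand-alone example. It is a sketch only: `build_path` and `LOOKUP_PATH_MAX` are hypothetical names (the latter reusing the 4080 figure quoted above), and the backwards, leaf-first fill is an assumption about how such a path buffer is typically built.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define LOOKUP_PATH_MAX 4080	/* size of the name buffer */

/*
 * Hypothetical helper: build a "dir/leaf/" style path backwards from the
 * end of the buffer, given the components leaf first. Returns the start
 * of the string inside name, or NULL if it does not fit.
 */
char *build_path(char *name, const char *const *parts, size_t nparts)
{
	char *ptr;
	size_t i;

	assert(name != NULL);
	/* Last valid index is LOOKUP_PATH_MAX - 1, not LOOKUP_PATH_MAX. */
	ptr = &name[LOOKUP_PATH_MAX - 1];
	*ptr = '\0';
	for (i = 0; i < nparts; i++) {
		size_t len = strlen(parts[i]);

		/* Need room for the component plus its '/' separator. */
		if ((size_t)(ptr - name) < len + 1)
			return NULL;
		*--ptr = '/';
		ptr -= len;
		memcpy(ptr, parts[i], len);
	}
	return ptr;
}
```

Starting at `name[LOOKUP_PATH_MAX]` instead would place the terminating byte one past the end of the array, which is the off-by-one the message describes.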

Btrfs: disable FUA if mounted with nobarrier · 1b9e619c

Committed by Omar Sandoval on Dec 05, 2017

I was seeing disk flushes still happening when I mounted a Btrfs
filesystem with nobarrier for testing. This is because we use FUA to
write out the first super block, and on devices without FUA support, the
block layer translates FUA to a flush. Even on devices supporting true
FUA, using FUA when we asked for no barriers is surprising.

Fixes: 387125fc ("Btrfs: fix barrier flushes")
Signed-off-by: Omar Sandoval <osandov@fb.com>
Reviewed-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>

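The resulting flag selection might look roughly like the following. This is a sketch with made-up names: `super_write_flags` and the `FLAG_*` bits are hypothetical, and the base flags are an assumption; only the "FUA for the first super block, unless nobarrier" rule comes from the message above.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical bits standing in for the block layer's request flags. */
enum { FLAG_SYNC = 1 << 0, FLAG_META = 1 << 1, FLAG_FUA = 1 << 2 };

/*
 * Pick write flags for super block copy number copy_idx. Only the first
 * copy is a FUA candidate, and only when barriers are enabled.
 */
unsigned int super_write_flags(int copy_idx, bool nobarrier)
{
	unsigned int flags = FLAG_SYNC | FLAG_META;

	assert(copy_idx >= 0);
	if (copy_idx == 0 && !nobarrier)
		flags |= FLAG_FUA;
	return flags;
}
```

With `nobarrier` set, no copy carries the FUA bit, so a device without native FUA support has nothing to translate into a flush.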