提交 · 093258e6ebaf178bb25da514f0d1f744968cc900 · openeuler / Kernel

29 5月, 2018 16 次提交

btrfs: replace waitqueue_actvie with cond_wake_up · 093258e6

由 David Sterba 提交于 2月 26, 2018

Use the wrappers and reduce the amount of low-level details about the
waitqueue management.
Reviewed-by: NNikolay Borisov <nborisov@suse.com>
Signed-off-by: NDavid Sterba <dsterba@suse.com>

093258e6

btrfs: Remove fs_info argument from add_to_free_space_tree · e7355e50

由 Nikolay Borisov 提交于 5月 10, 2018

This function takes a transaction handle which already contains a
reference to the fs_info. So use it and remove the extra function
argument.
Signed-off-by: NNikolay Borisov <nborisov@suse.com>
Reviewed-by: NDavid Sterba <dsterba@suse.com>
Signed-off-by: NDavid Sterba <dsterba@suse.com>

e7355e50

btrfs: Remove fs_info argument from remove_from_free_space_tree · 25a356d3

由 Nikolay Borisov 提交于 5月 10, 2018

This function alreay takes a transaction handle which holds a reference
to the fs_info. Use that and remove the extra argument.
Signed-off-by: NNikolay Borisov <nborisov@suse.com>
Reviewed-by: NDavid Sterba <dsterba@suse.com>
Signed-off-by: NDavid Sterba <dsterba@suse.com>

25a356d3

btrfs: Remove fs_info parameter from remove_block_group_free_space · f3f72779

由 Nikolay Borisov 提交于 5月 10, 2018

This function always takes a trans handle which contains a reference to
the fs_info. Use that and remove the extra argument.
Signed-off-by: NNikolay Borisov <nborisov@suse.com>
Reviewed-by: NDavid Sterba <dsterba@suse.com>
Signed-off-by: NDavid Sterba <dsterba@suse.com>

f3f72779

btrfs: Remove fs_info argument from add_new_free_space · 4457c1c7

由 Nikolay Borisov 提交于 5月 10, 2018

This function also takes a btrfs_block_group_cache which contains a
referene to the fs_info. So use that and remove the extra argument.
Signed-off-by: NNikolay Borisov <nborisov@suse.com>
Reviewed-by: NDavid Sterba <dsterba@suse.com>
Signed-off-by: NDavid Sterba <dsterba@suse.com>

4457c1c7

btrfs: Remove fs_info argument from add_block_group_free_space · e4e0711c

由 Nikolay Borisov 提交于 5月 10, 2018

We also pass in a transaction handle which has a reference to the
fs_info. Just remove the extraneous argument.
Signed-off-by: NNikolay Borisov <nborisov@suse.com>
Reviewed-by: NDavid Sterba <dsterba@suse.com>
Signed-off-by: NDavid Sterba <dsterba@suse.com>

e4e0711c

btrfs: Remove devid parameter from btrfs_rmap_block · 63a9c7b9

由 Nikolay Borisov 提交于 5月 04, 2018

This function is used in only one place and devid argument is always
passed 0. So just remove it, similarly to how it was removed in the
userspace code.
Signed-off-by: NNikolay Borisov <nborisov@suse.com>
Reviewed-by: NDavid Sterba <dsterba@suse.com>
Signed-off-by: NDavid Sterba <dsterba@suse.com>

63a9c7b9

btrfs: Remove delayed_iput parameter of btrfs_start_delalloc_roots · 82b3e53b

由 Nikolay Borisov 提交于 4月 23, 2018

This parameter was introduced alongside the function in
eb73c1b7 ("Btrfs: introduce per-subvolume delalloc inode list") to
avoid deadlocks since this function was used in the transaction commit
path. However, commit 8d875f95 ("btrfs: disable strict file flushes
for renames and truncates") removed that usage, rendering the parameter
obsolete.
Signed-off-by: NNikolay Borisov <nborisov@suse.com>
Reviewed-by: NQu Wenruo <wqu@suse.com>
Signed-off-by: NDavid Sterba <dsterba@suse.com>

82b3e53b

btrfs: trace: Add trace points for unused block groups · 4ed0a7a3

由 Qu Wenruo 提交于 4月 26, 2018

This patch will add the following trace events:
1) btrfs_remove_block_group
   For btrfs_remove_block_group() function.
   Triggered when a block group is really removed.

2) btrfs_add_unused_block_group
   Triggered which block group is added to unused_bgs list.

3) btrfs_skip_unused_block_group
   Triggered which unused block group is not deleted.

These trace events is pretty handy to debug case related to block group
auto remove.
Signed-off-by: NQu Wenruo <wqu@suse.com>
Reviewed-by: NDavid Sterba <dsterba@suse.com>
Signed-off-by: NDavid Sterba <dsterba@suse.com>

4ed0a7a3

btrfs: trace: Remove unnecessary fs_info parameter for btrfs__reserve_extent event class · 3dca5c94

由 Qu Wenruo 提交于 4月 26, 2018

fs_info can be extracted from btrfs_block_group_cache, and all
btrfs_block_group_cache is created by btrfs_create_block_group_cache()
with fs_info initialized, no need to worry about NULL pointer
dereference.
Signed-off-by: NQu Wenruo <wqu@suse.com>
Reviewed-by: NNikolay Borisov <nborisov@suse.com>
Signed-off-by: NDavid Sterba <dsterba@suse.com>

3dca5c94

btrfs: move btrfs_raid_group values to btrfs_raid_attr table · 41a6e891

由 Anand Jain 提交于 4月 25, 2018

Add a new member struct btrfs_raid_attr::bg_flag so that
btrfs_raid_array can maintain the bit map flag of the raid type, and
so we can drop btrfs_raid_group.
Signed-off-by: NAnand Jain <anand.jain@oracle.com>
Reviewed-by: NNikolay Borisov <nborisov@suse.com>
Signed-off-by: NDavid Sterba <dsterba@suse.com>

41a6e891

btrfs: move btrfs_raid_type_names values to btrfs_raid_attr table · ed23467b

由 Anand Jain 提交于 4月 25, 2018

Add a new member struct btrfs_raid_attr::raid_name so that
btrfs_raid_array can maintain the name of the raid type, and so we can
drop btrfs_raid_type_names.
Signed-off-by: NAnand Jain <anand.jain@oracle.com>
Reviewed-by: NQu Wenruo <wqu@suse.com>
Reviewed-by: NNikolay Borisov <nborisov@suse.com>
Signed-off-by: NDavid Sterba <dsterba@suse.com>

ed23467b

btrfs: kill btrfs_fs_info::volume_mutex · dccdb07b

由 David Sterba 提交于 3月 21, 2018

Mutual exclusion of device add/rm and balance was done by the volume
mutex up to version 3.7. The commit 5ac00add ("Btrfs: disallow
mutually exclusive admin operations from user mode") added a bit that
essentially tracked the same information.

The status bit has an advantage over a mutex that it can be set without
restrictions of function context, so it started to be used in the
mount-time resuming of balance or device replace.

But we don't really need to track the same information in two ways.

1) After the previous cleanups, the main ioctl handlers for
   add/del/resize copy the EXCL_OP bit next to the volume mutex, here
   it's clearly safe.

2) Resuming balance during mount or after rw remount will set only the
   EXCL_OP bit and the volume_mutex is held in the kernel thread that
   calls btrfs_balance.

3) Resuming device replace during mount or after rw remount is done
   after balance and is excluded by the EXCL_OP bit. It does not take
   the volume_mutex at all and completely relies on the EXCL_OP bit.

4) The resuming of balance and dev-replace cannot hapen at the same time
   as the ioctls cannot be started in parallel. Nevertheless, a crafted
   image could trigger that and a warning is printed.

5) Balance is normally excluded by EXCL_OP and also uses own mutex to
   protect against concurrent access to its status data. There's some
   trickery to maintain the right lock nesting in case we need to
   reexamine the status in btrfs_ioctl_balance. The volume_mutex is
   removed and the unlock/lock sequence is left in place as we might
   expect other waiters to proceed.

6) Similar to 5, the unlock/lock sequence is kept in
   btrfs_cancel_balance to allow waiters to continue.
Reviewed-by: NAnand Jain <anand.jain@oracle.com>
Signed-off-by: NDavid Sterba <dsterba@suse.com>

dccdb07b

btrfs: Drop fs_info parameter from btrfs_merge_delayed_refs · be97f133

由 Nikolay Borisov 提交于 4月 19, 2018

It's provided by the transaction handle.
Signed-off-by: NNikolay Borisov <nborisov@suse.com>
Signed-off-by: NDavid Sterba <dsterba@suse.com>

be97f133

btrfs: Consolidate error checking for btrfs_alloc_chunk · 57f1642e

由 Nikolay Borisov 提交于 4月 11, 2018

The second if is really a subcase of ret being less than 0. So
introduce a generic if (ret < 0) check, and inside have another if
which explicitly handles the -ENOSPC and any other errors. No
functional changes.
Signed-off-by: NNikolay Borisov <nborisov@suse.com>
Reviewed-by: NDavid Sterba <dsterba@suse.com>
Signed-off-by: NDavid Sterba <dsterba@suse.com>

57f1642e

btrfs: Fix lock release order · 1e7a1421

由 Nikolay Borisov 提交于 4月 11, 2018

Locks should generally be released in the oppposite order they are
acquired. Generally lock acquisiton ordering is used to ensure
deadlocks don't happen. However, as becomes more complicated it's
best to also maintain proper unlock order so as to avoid possible dead
locks. This was found by code inspection and doesn't necessarily lead
to a deadlock scenario.
Signed-off-by: NNikolay Borisov <nborisov@suse.com>
Reviewed-by: NDavid Sterba <dsterba@suse.com>
Signed-off-by: NDavid Sterba <dsterba@suse.com>

1e7a1421

28 5月, 2018 2 次提交

btrfs: Drop delayed_refs argument from btrfs_check_delayed_seq · 41d0bd3b

由 Nikolay Borisov 提交于 4月 04, 2018

It's used to print its pointer in a debug statement but doesn't really
bring any useful information to the error message.
Signed-off-by: NNikolay Borisov <nborisov@suse.com>
Reviewed-by: NDavid Sterba <dsterba@suse.com>
Signed-off-by: NDavid Sterba <dsterba@suse.com>

41d0bd3b

btrfs: Replace owner argument in add_pinned_bytes with a boolean · 29d2b84c

由 Nikolay Borisov 提交于 3月 30, 2018

add_pinned_bytes really cares whether the bytes being pinned are either
data or metadata. To that effect it checks whether the 'owner' argument
is less than BTRFS_FIRST_FREE_OBJECTID (256). This works because
owner can really have 2 types of values:

 a) For metadata extents it holds the level at which the parent is in
    the btree. This amounts to owner having the values 0-7

 b) In case of modifying data extents, owner is the inode number
    to which those extents belongs.

Let's make this more explicit byt converting the owner parameter to a
boolean value and either pass it directly when we know the type of
extents we are working with (i.e. in btrfs_free_tree_block). In cases
when the parent function can be called on both metadata/data extents
perform the check in the caller. This hopefully makes the interface
of add_pinned_bytes more intuitive.
Signed-off-by: NNikolay Borisov <nborisov@suse.com>
Reviewed-by: NDavid Sterba <dsterba@suse.com>
Signed-off-by: NDavid Sterba <dsterba@suse.com>

29d2b84c

02 5月, 2018 1 次提交

btrfs: Take trans lock before access running trans in check_delayed_ref · 998ac6d2

由 ethanwu 提交于 4月 29, 2018

In preivous patch:
Btrfs: kill trans in run_delalloc_nocow and btrfs_cross_ref_exist
We avoid starting btrfs transaction and get this information from
fs_info->running_transaction directly.

When accessing running_transaction in check_delayed_ref, there's a
chance that current transaction will be freed by commit transaction
after the NULL pointer check of running_transaction is passed.

After looking all the other places using fs_info->running_transaction,
they are either protected by trans_lock or holding the transactions.

Fix this by using trans_lock and increasing the use_count.

Fixes: e4c3b2dc ("Btrfs: kill trans in run_delalloc_nocow and btrfs_cross_ref_exist")
CC: stable@vger.kernel.org # 4.14+
Signed-off-by: Nethanwu <ethanwu@synology.com>
Signed-off-by: NDavid Sterba <dsterba@suse.com>

998ac6d2

21 4月, 2018 1 次提交

btrfs: Fix race condition between delayed refs and blockgroup removal · 5e388e95

由 Nikolay Borisov 提交于 4月 18, 2018

When the delayed refs for a head are all run, eventually
cleanup_ref_head is called which (in case of deletion) obtains a
reference for the relevant btrfs_space_info struct by querying the bg
for the range. This is problematic because when the last extent of a
bg is deleted a race window emerges between removal of that bg and the
subsequent invocation of cleanup_ref_head. This can result in cache being null
and either a null pointer dereference or assertion failure.

	task: ffff8d04d31ed080 task.stack: ffff9e5dc10cc000
	RIP: 0010:assfail.constprop.78+0x18/0x1a [btrfs]
	RSP: 0018:ffff9e5dc10cfbe8 EFLAGS: 00010292
	RAX: 0000000000000044 RBX: 0000000000000000 RCX: 0000000000000000
	RDX: ffff8d04ffc1f868 RSI: ffff8d04ffc178c8 RDI: ffff8d04ffc178c8
	RBP: ffff8d04d29e5ea0 R08: 00000000000001f0 R09: 0000000000000001
	R10: ffff9e5dc0507d58 R11: 0000000000000001 R12: ffff8d04d29e5ea0
	R13: ffff8d04d29e5f08 R14: ffff8d04efe29b40 R15: ffff8d04efe203e0
	FS:  00007fbf58ead500(0000) GS:ffff8d04ffc00000(0000) knlGS:0000000000000000
	CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
	CR2: 00007fe6c6975648 CR3: 0000000013b2a000 CR4: 00000000000006f0
	DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
	DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
	Call Trace:
	 __btrfs_run_delayed_refs+0x10e7/0x12c0 [btrfs]
	 btrfs_run_delayed_refs+0x68/0x250 [btrfs]
	 btrfs_should_end_transaction+0x42/0x60 [btrfs]
	 btrfs_truncate_inode_items+0xaac/0xfc0 [btrfs]
	 btrfs_evict_inode+0x4c6/0x5c0 [btrfs]
	 evict+0xc6/0x190
	 do_unlinkat+0x19c/0x300
	 do_syscall_64+0x74/0x140
	 entry_SYSCALL_64_after_hwframe+0x3d/0xa2
	RIP: 0033:0x7fbf589c57a7

To fix this, introduce a new flag "is_system" to head_ref structs,
which is populated at insertion time. This allows to decouple the
querying for the spaceinfo from querying the possibly deleted bg.

Fixes: d7eae340 ("Btrfs: rework delayed ref total_bytes_pinned accounting")
CC: stable@vger.kernel.org # 4.14+
Suggested-by: NOmar Sandoval <osandov@osandov.com>
Signed-off-by: NNikolay Borisov <nborisov@suse.com>
Reviewed-by: NOmar Sandoval <osandov@fb.com>
Signed-off-by: NDavid Sterba <dsterba@suse.com>

5e388e95

18 4月, 2018 1 次提交

btrfs: qgroup: Use independent and accurate per inode qgroup rsv · ff6bc37e

由 Qu Wenruo 提交于 12月 21, 2017

Unlike reservation calculation used in inode rsv for metadata, qgroup
doesn't really need to care about things like csum size or extent usage
for the whole tree COW.

Qgroups care more about net change of the extent usage.
That's to say, if we're going to insert one file extent, it will mostly
find its place in COWed tree block, leaving no change in extent usage.
Or causing a leaf split, resulting in one new net extent and increasing
qgroup number by nodesize.
Or in an even more rare case, increase the tree level, increasing qgroup
number by 2 * nodesize.

So here instead of using the complicated calculation for extent
allocator, which cares more about accuracy and no error, qgroup doesn't
need that over-estimated reservation.

This patch will maintain 2 new members in btrfs_block_rsv structure for
qgroup, using much smaller calculation for qgroup rsv, reducing false
EDQUOT.
Signed-off-by: NDavid Sterba <dsterba@suse.com>
Signed-off-by: NQu Wenruo <wqu@suse.com>

ff6bc37e

12 4月, 2018 1 次提交

btrfs: replace GPL boilerplate by SPDX -- sources · c1d7c514

由 David Sterba 提交于 4月 03, 2018

Remove GPL boilerplate text (long, short, one-line) and keep the rest,
ie. personal, company or original source copyright statements. Add the
SPDX header.
Signed-off-by: NDavid Sterba <dsterba@suse.com>

c1d7c514

06 4月, 2018 1 次提交

btrfs: Fix possible softlock on single core machines · 1e1c50a9

由 Nikolay Borisov 提交于 4月 05, 2018

do_chunk_alloc implements a loop checking whether there is a pending
chunk allocation and if so causes the caller do loop. Generally this
loop is executed only once, however testing with btrfs/072 on a single
core vm machines uncovered an extreme case where the system could loop
indefinitely. This is due to a missing cond_resched when loop which
doesn't give a chance to the previous chunk allocator finish its job.

The fix is to simply add the missing cond_resched.

Fixes: 6d74119f ("Btrfs: avoid taking the chunk_mutex in do_chunk_alloc")
Signed-off-by: NNikolay Borisov <nborisov@suse.com>
Reviewed-by: NDavid Sterba <dsterba@suse.com>
Signed-off-by: NDavid Sterba <dsterba@suse.com>

1e1c50a9

31 3月, 2018 13 次提交

btrfs: use lockdep_assert_held for mutexes · a32bf9a3

由 David Sterba 提交于 3月 16, 2018

Using lockdep_assert_held is preferred, replace mutex_is_locked.
Signed-off-by: NDavid Sterba <dsterba@suse.com>

a32bf9a3

btrfs: Validate child tree block's level and first key · 581c1760

由 Qu Wenruo 提交于 3月 29, 2018

We have several reports about node pointer points to incorrect child
tree blocks, which could have even wrong owner and level but still with
valid generation and checksum.

Although btrfs check could handle it and print error message like:
leaf parent key incorrect 60670574592

Kernel doesn't have enough check on this type of corruption correctly.
At least add such check to read_tree_block() and btrfs_read_buffer(),
where we need two new parameters @level and @first_key to verify the
child tree block.

The new @level check is mandatory and all call sites are already
modified to extract expected level from its call chain.

While @first_key is optional, the following call sites are skipping such
check:
1) Root node/leaf
   As ROOT_ITEM doesn't contain the first key, skip @first_key check.
2) Direct backref
   Only parent bytenr and level is known and we need to resolve the key
   all by ourselves, skip @first_key check.

Another note of this verification is, it needs extra info from nodeptr
or ROOT_ITEM, so it can't fit into current tree-checker framework, which
is limited to node/leaf boundary.
Signed-off-by: NQu Wenruo <wqu@suse.com>
Signed-off-by: NDavid Sterba <dsterba@suse.com>

581c1760

btrfs: qgroup: Use separate meta reservation type for delalloc · 43b18595

由 Qu Wenruo 提交于 12月 12, 2017

Before this patch, btrfs qgroup is mixing per-transcation meta rsv with
preallocated meta rsv, making it quite easy to underflow qgroup meta
reservation.

Since we have the new qgroup meta rsv types, apply it to delalloc
reservation.

Now for delalloc, most of its reserved space will use META_PREALLOC qgroup
rsv type.

And for callers reducing outstanding extent like btrfs_finish_ordered_io(),
they will convert corresponding META_PREALLOC reservation to
META_PERTRANS.

This is mainly due to the fact that current qgroup numbers will only be
updated in btrfs_commit_transaction(), that's to say if we don't keep
such placeholder reservation, we can exceed qgroup limitation.

And for callers freeing outstanding extent in error handler, we will
just free META_PREALLOC bytes.

This behavior makes callers of btrfs_qgroup_release_meta() or
btrfs_qgroup_convert_meta() to be aware of which type they are.
So in this patch, btrfs_delalloc_release_metadata() and its callers get
an extra parameter to info qgroup to do correct meta convert/release.

The good news is, even we use the wrong type (convert or free), it won't
cause obvious bug, as prealloc type is always in good shape, and the
type only affects how per-trans meta is increased or not.

So the worst case will be at most metadata limitation can be sometimes
exceeded (no convert at all) or metadata limitation is reached too soon
(no free at all).
Signed-off-by: NQu Wenruo <wqu@suse.com>
Signed-off-by: NDavid Sterba <dsterba@suse.com>

43b18595

btrfs: qgroup: Split meta rsv type into meta_prealloc and meta_pertrans · 733e03a0

由 Qu Wenruo 提交于 12月 12, 2017

Btrfs uses 2 different methods to reseve metadata qgroup space.

1) Reserve at btrfs_start_transaction() time
   This is quite straightforward, caller will use the trans handler
   allocated to modify b-trees.

   In this case, reserved metadata should be kept until qgroup numbers
   are updated.

2) Reserve by using block_rsv first, and later btrfs_join_transaction()
   This is more complicated, caller will reserve space using block_rsv
   first, and then later call btrfs_join_transaction() to get a trans
   handle.

   In this case, before we modify trees, the reserved space can be
   modified on demand, and after btrfs_join_transaction(), such reserved
   space should also be kept until qgroup numbers are updated.

Since these two types behave differently, split the original "META"
reservation type into 2 sub-types:

  META_PERTRANS:
    For above case 1)

  META_PREALLOC:
    For reservations that happened before btrfs_join_transaction() of
    case 2)

NOTE: This patch will only convert existing qgroup meta reservation
callers according to its situation, not ensuring all callers are at
correct timing.
Such fix will be added in later patches.
Signed-off-by: NQu Wenruo <wqu@suse.com>
[ update comments ]
Signed-off-by: NDavid Sterba <dsterba@suse.com>

733e03a0

btrfs: defer adding raid type kobject until after chunk relocation · 75cb379d

由 Jeff Mahoney 提交于 3月 20, 2018

Any time the first block group of a new type is created, we add a new
kobject to sysfs to hold the attributes for that type.  Kobject-internal
allocations always use GFP_KERNEL, making them prone to fs-reclaim races.
While it appears as if this can occur any time a block group is created,
the only times the first block group of a new type can be created in
memory is at mount and when we create the first new block group during
raid conversion.

This patch adds a new list to track pending kobject additions and then
handles them after we do chunk relocation.  Between relocating the
target chunk (or forcing allocation of a new chunk in the case of data)
and removing the old chunk, we're in a safe place for fs-reclaim to
occur.  We're holding the volume mutex, which is already held across
page faults, and the delete_unused_bgs_mutex, which will only stall
the cleaner thread.
Signed-off-by: NJeff Mahoney <jeffm@suse.com>
Reviewed-by: NDavid Sterba <dsterba@suse.com>
Signed-off-by: NDavid Sterba <dsterba@suse.com>

75cb379d

btrfs: remove dead create_space_info calls · dc2d3005

由 Jeff Mahoney 提交于 3月 20, 2018

Since commit 2be12ef7 (btrfs: Separate space_info create/update), we've
separated out the creation and updating of the space info structures.
That commit was a straightforward refactoring of the two parts of
update_space_info, but we can go a step further.  Since commits
c59021f8 (Btrfs: fix OOPS of empty filesystem after balance) and
b742bb82 (Btrfs: Link block groups of different raid types), we know
that the space_info structures will be created at mount and there will
only ever be, at most, three of them.

This patch cleans out the create_space_info calls after __find_space_info
returns NULL since __find_space_info *can't* return NULL.

The initial cause for reviewing this was the kobject_add calls from
create_space_info occuring in sites where fs-reclaim wasn't allowed.  Now
we are certain they occur only early in the mount process and are safe.
Signed-off-by: NJeff Mahoney <jeffm@suse.com>
Reviewed-by: NNikolay Borisov <nborisov@suse.com>
Reviewed-by: NDavid Sterba <dsterba@suse.com>
Signed-off-by: NDavid Sterba <dsterba@suse.com>

dc2d3005

btrfs: Drop fs_info parameter from __btrfs_run_delayed_refs · 0a1e458a

由 Nikolay Borisov 提交于 3月 15, 2018

It's provided by transaction handle.
Signed-off-by: NNikolay Borisov <nborisov@suse.com>
Signed-off-by: NDavid Sterba <dsterba@suse.com>

0a1e458a

btrfs: Drop fs_info parameter from btrfs_finish_extent_commit · 5ead2dd0

由 Nikolay Borisov 提交于 3月 15, 2018

It's provided by the transaction handle.
Signed-off-by: NNikolay Borisov <nborisov@suse.com>
Signed-off-by: NDavid Sterba <dsterba@suse.com>

5ead2dd0

btrfs: drop fs_info parameter from btrfs_run_delayed_refs · c79a70b1

由 Nikolay Borisov 提交于 3月 15, 2018

It's provided by the transaction handle.
Signed-off-by: NNikolay Borisov <nborisov@suse.com>
Signed-off-by: NDavid Sterba <dsterba@suse.com>

c79a70b1

btrfs: Remove unused flush var in shrink_delalloc · 39d7d09d

由 Nikolay Borisov 提交于 3月 15, 2018

Added by 08e007d2 ("Btrfs: improve the noflush reservation") and
made redundant by 17024ad0 ("Btrfs: fix early ENOSPC due to
delalloc").
Signed-off-by: NNikolay Borisov <nborisov@suse.com>
Signed-off-by: NDavid Sterba <dsterba@suse.com>

39d7d09d

btrfs: Remove unused extent_root var from caching_thread · 101d2dc0

由 Nikolay Borisov 提交于 3月 15, 2018

Added by b4570aa9 ("btrfs: fix compiling with CONFIG_BTRFS_DEBUG
enabled.") and obsoleted by 2ff7e61e ("btrfs: take an fs_info
directly when the root is not used otherwise").
Signed-off-by: NNikolay Borisov <nborisov@suse.com>
Signed-off-by: NDavid Sterba <dsterba@suse.com>

101d2dc0

btrfs: Document parameters of btrfs_reserve_extent · 6f47c706

由 Nikolay Borisov 提交于 3月 13, 2018

This function is the entry to the extent allocator and as such has
quite a number of parameters. Some of those have subtle effects on the
allocation algorithm. Document the parameters.
Signed-off-by: NNikolay Borisov <nborisov@suse.com>
Reviewed-by: NAnand Jain <anand.jain@oracle.com>
Signed-off-by: NDavid Sterba <dsterba@suse.com>

6f47c706

btrfs: Remove btrfs_fs_info::open_ioctl_trans · 92e2f7e3

由 Nikolay Borisov 提交于 2月 05, 2018

Since userspace transaction have been removed we no longer have use
for this field so delete it.
Signed-off-by: NNikolay Borisov <nborisov@suse.com>
Reviewed-by: NDavid Sterba <dsterba@suse.com>
Signed-off-by: NDavid Sterba <dsterba@suse.com>

92e2f7e3

26 3月, 2018 4 次提交

btrfs: Remove custom crc32c init code · 9678c543

由 Nikolay Borisov 提交于 1月 08, 2018

The custom crc32 init code was introduced in
14a958e6 ("Btrfs: fix btrfs boot when compiled as built-in") to
enable using btrfs as a built-in. However, later as pointed out by
60efa5eb ("Btrfs: use late_initcall instead of module_init") this
wasn't enough and finally btrfs was switched to late_initcall which
comes after the generic crc32c implementation is initiliased. The
latter commit superseeded the former. Now that we don't have to
maintain our own code let's just remove it and switch to using the
generic implementation.

Despite touching a lot of files the patch is really simple. Here is the gist of
the changes:

1. Select LIBCRC32C rather than the low-level modules.
2. s/btrfs_crc32c/crc32c/g
3. replace hash.h with linux/crc32c.h
4. Move the btrfs namehash funcs to ctree.h and change the tree accordingly.

I've tested this with btrfs being both a module and a built-in and xfstest
doesn't complain.

Does seem to fix the longstanding problem of not automatically selectiong
the crc32c module when btrfs is used. Possibly there is a workaround in
dracut.

The modinfo confirms that now all the module dependencies are there:

before:
depends:        zstd_compress,zstd_decompress,raid6_pq,xor,zlib_deflate

after:
depends:        libcrc32c,zstd_compress,zstd_decompress,raid6_pq,xor,zlib_deflate
Signed-off-by: NNikolay Borisov <nborisov@suse.com>
Reviewed-by: NDavid Sterba <dsterba@suse.com>
[ add more info to changelog from mails ]
Signed-off-by: NDavid Sterba <dsterba@suse.com>

9678c543

btrfs: Refactor __get_raid_index() to btrfs_bg_flags_to_raid_index() · 3e72ee88

由 Qu Wenruo 提交于 1月 30, 2018

Function __get_raid_index() is used to convert block group flags into
raid index, which can be used to get various info directly from
btrfs_raid_array[].

Refactor this function a little:

1) Rename to btrfs_bg_flags_to_raid_index()
   Double underline prefix is normally for internal functions, while the
   function is used by both extent-tree and volumes.

   Although the name is a little longer, but it should explain its usage
   quite well.

2) Move it to volumes.h and make it static inline
   Just several if-else branches, really no need to define it as a normal
   function.

   This also makes later code re-use between kernel and btrfs-progs
   easier.

3) Remove function get_block_group_index()
   Really no need to do such a simple thing as an exported function.
Signed-off-by: NQu Wenruo <wqu@suse.com>
Reviewed-by: NAnand Jain <anand.jain@oracle.com>
Reviewed-by: NNikolay Borisov <nborisov@suse.com>
Signed-off-by: NDavid Sterba <dsterba@suse.com>

3e72ee88

btrfs: Streamline btrfs_delalloc_reserve_metadata initial operations · da07d4ab

由 Nikolay Borisov 提交于 1月 12, 2018

The behavior of btrfs_delalloc_reserve_metadata depends on whether
the inode we are allocating for is the freespace inode or not. As it
stands if we are the free node we set 'flush' and 'delalloc_lock'
variable to certain values. Subsequently we check the values of those
vars and act accordingly. Instead, simplify things by having 1 if
which checks whether we are the freespace inode or not and do any
specific operation in either branches of that if. This makes the code
a bit easier to understand, as an added bonus it also shrinks the
compiled size:

add/remove: 0/0 grow/shrink: 0/1 up/down: 0/-17 (-17)
Function                                     old     new   delta
btrfs_delalloc_reserve_metadata             1876    1859     -17
Total: Before=85966, After=85949, chg -0.02%

No functional changes.
Signed-off-by: NNikolay Borisov <nborisov@suse.com>
Reviewed-by: NEdmund Nadolski <enadolski@suse.com>
Signed-off-by: NDavid Sterba <dsterba@suse.com>

da07d4ab

btrfs: Add enospc_debug printing in metadata_reserve_bytes · 9a3daff3

由 Nikolay Borisov 提交于 12月 15, 2017

Currently when enospc_debug mount option is turned on we do not print
any debug info in case metadata reservation failures happen. Fix this
by adding the necessary hook in reserve_metadata_bytes.
Signed-off-by: NNikolay Borisov <nborisov@suse.com>
Reviewed-by: NQu Wenruo <wqu@suse.com>
Signed-off-by: NDavid Sterba <dsterba@suse.com>

9a3daff3

openeuler / Kernel 1 年多 前同步成功

openeuler / Kernel
1 年多前同步成功