Commits · fc36ed7e0b13955ba66fc56dc5067e67ac105150 · openeuler / raspberrypi-kernel

07 May, 2013 · 40 commits

Btrfs: separate sequence numbers for delayed ref tracking and tree mod log · fc36ed7e

Authored by Jan Schmidt on Apr 24, 2013

Sequence numbers for delayed refs have been introduced in the first version
of the qgroup patch set. To solve the problem of find_all_roots on a busy
file system, the tree mod log was introduced. The sequence numbers for that
were simply shared between those two users.

However, at one point in qgroup's quota accounting, there's a statement
accessing the previous sequence number, that's still just doing (seq - 1)
just as it would have to in the very first version.

To satisfy that requirement, this patch makes the sequence number counter 64
bit and splits it into a major part (used for qgroup sequence number
counting) and a minor part (incremented for each tree modification in the
log). This enables us to go exactly one major step backwards, as required
for qgroups, while still incrementing the sequence counter for tree mod log
insertions to keep track of their order. Keeping them in a single variable
means there's no need to change all the code dealing with comparisons of two
sequence numbers.

The sequence number is reset to 0 on commit (not new in this patch), which
ensures we won't overflow the two 32 bit counters.

Without this fix, the qgroup tracking can occasionally go wrong and WARN_ONs
from the tree mod log code may happen.
Signed-off-by: Jan Schmidt <list.btrfs@jan-o-sch.net>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
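The major/minor split described in this commit message can be sketched in plain C. This is only an illustration under stated assumptions, not the patch's code: the helper names and the exact layout (upper 32 bits major, lower 32 bits minor) are hypothetical.

```c
#include <stdint.h>

/* Hypothetical layout: upper 32 bits = major (qgroup), lower 32 bits = minor (tree mod log). */
#define SEQ_MINOR_MASK 0xffffffffULL

static inline uint64_t seq_major(uint64_t seq) { return seq >> 32; }
static inline uint64_t seq_minor(uint64_t seq) { return seq & SEQ_MINOR_MASK; }

/* A tree mod log insertion bumps only the minor part. */
static inline uint64_t seq_inc_minor(uint64_t seq) { return seq + 1; }

/* A new qgroup step clears the minor part and bumps the major part. */
static inline uint64_t seq_inc_major(uint64_t seq)
{
	return (seq & ~SEQ_MINOR_MASK) + (1ULL << 32);
}

/* Go back exactly one major step; caller must ensure seq_major(seq) >= 1. */
static inline uint64_t seq_prev_major(uint64_t seq)
{
	return (seq & ~SEQ_MINOR_MASK) - (1ULL << 32);
}
```

Because both parts live in one 64-bit value, an ordinary `<` or `>` comparison still orders two sequence numbers correctly, which is the property the message relies on.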

btrfs: move leak debug code to functions · 6d49ba1b

Authored by Eric Sandeen on Apr 22, 2013

Clean up the leak debugging in extent_io.c by moving
the debug code into functions.  This also removes the
list_heads used for debugging from the extent_buffer
and extent_state structures when debug is not enabled.

Since we need a global debug config to do that last
part, implement CONFIG_BTRFS_DEBUG to accommodate.

Thanks to Dave Sterba for the Kconfig bit.
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>

Btrfs: return free space in cow error path · ace68bac

Authored by Liu Bo on Apr 22, 2013

Replace some BUG_ONs with proper error handling and return the allocated
space to the free space cache for later use.

We don't have to worry about extent maps since they'd be freed in the
releasepage path.
Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>

Btrfs: set UUID in root_item for created trees · 6463fe58

Authored by Stefan Behrens on Apr 19, 2013

It is a rare exception that a new tree is created, like the qgroups
tree. So far these new trees have an all-zero UUID in their root
items. All trees that mkfs.btrfs has created get a UUID during the
first mount when btrfs_read_root_item() rewrites the root_item to
the v2 structure style. These UUIDs have not been used so far, but
since it is better to have this uniform for all trees, this
commit adds some lines that generate and write a UUID for newly
created trees.
Signed-off-by: Stefan Behrens <sbehrens@giantdisaster.de>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>

Btrfs: delete unused parameter to btrfs_read_root_item() · 5fbf83c1

Authored by Stefan Behrens on Apr 19, 2013

Signed-off-by: Stefan Behrens <sbehrens@giantdisaster.de>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>

Btrfs: fix error handling in btrfs_ioctl_send() · ecc7ada7

Authored by Tsutomu Itoh on Apr 19, 2013

fget() returns NULL on error, so we should check the result for NULL.
Signed-off-by: Tsutomu Itoh <t-itoh@jp.fujitsu.com>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
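A minimal userspace sketch of that check follows. `lookup_file()` is a hypothetical stand-in for fget(), the open-file table is invented for the example, and returning -EBADF is an assumption about a sensible error code rather than something stated in the message.

```c
#include <errno.h>
#include <stddef.h>

struct file { int fd; };	/* simplified stand-in for the kernel's struct file */

static struct file open_files[] = { { 3 }, { 4 } };

/* Hypothetical stand-in for fget(): returns NULL when fd is not open. */
static struct file *lookup_file(int fd)
{
	for (size_t i = 0; i < sizeof(open_files) / sizeof(open_files[0]); i++)
		if (open_files[i].fd == fd)
			return &open_files[i];
	return NULL;
}

static int send_to_fd(int fd)
{
	struct file *filp = lookup_file(fd);

	if (!filp)		/* failure is signalled by NULL, so test for NULL */
		return -EBADF;	/* assumed error code */
	return 0;
}
```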

Btrfs: remove unused variable in __process_changed_new_xattr() · ba1eeaac

Authored by Tsutomu Itoh on Apr 18, 2013

Variable 'p' is not used any more. So, remove it.
Signed-off-by: Tsutomu Itoh <t-itoh@jp.fujitsu.com>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>

Btrfs: various abort cleanups · 54067ae9

Authored by Josef Bacik on Apr 25, 2013

I have a broken file system that when it aborts leaves all sorts of accounting
things wrong and gives you lots of WARN_ON()'s other than the abort. This is
because we're not cleaning up various parts of the file system when we abort.
The first chunks are specific to mount failures: we weren't cleaning up the
block group cached inodes and we weren't cleaning up any transactions that had
been aborted, which leaves a bunch of things lying around.

The second half is related to the cleanup parts. First, we don't need
to release space for the dirty pages from the trans_block_rsv; that's all
handled by the trans handles, so this is just plain wrong. The other thing is we
need to pin down extents that were set ->must_insert_reserved for delayed refs.
This isn't so much for the pinning but more for cleaning up the
cache->reserved counter since we are no longer going to use those reserved
bytes. With this patch I no longer see a bunch of WARN_ON()'s when I try to
mount this broken file system, just the initial one from the abort. Thanks,
Signed-off-by: Josef Bacik <jbacik@fusionio.com>

Btrfs: cleanup destroy_marked_extents · fd8b2b61

Authored by Josef Bacik on Apr 24, 2013

We can just look up the extent_buffers for the range and free stuff that way.
This makes the cleanup a bit cleaner and we can make sure to evict the
extent_buffers pretty quickly by marking them as stale. Thanks,
Signed-off-by: Josef Bacik <jbacik@fusionio.com>

Btrfs: check return value of commit when recovering log · abefa55a

Authored by Josef Bacik on Apr 24, 2013

We need to check the return value of the commit in case something goes wrong,
otherwise we could end up going down the line and doing more stuff (like orphan
cleanup) before we notice we should have errored out. We need to do this before
we free up the log_tree_root since the caller will handle all of that. Thanks,
Signed-off-by: Josef Bacik <jbacik@fusionio.com>

Btrfs: don't panic if we're trying to drop too many refs · 32b02538

Authored by Josef Bacik on Apr 24, 2013

This is just obnoxious.  Just print a message, abort the transaction, and return
an error.  Thanks,
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
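The pattern of reporting and returning an error, rather than crashing, might look like the following userspace sketch. `drop_refs()` is a hypothetical helper, and -EINVAL is an assumed error code; the transaction abort itself is not modelled here.

```c
#include <errno.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: drop `to_drop` references from a counter. */
static int drop_refs(uint64_t *refs, uint64_t to_drop)
{
	if (to_drop > *refs) {
		/* Report and fail instead of crashing; -EINVAL is an assumed error code. */
		fprintf(stderr, "trying to drop %llu refs but only %llu exist\n",
			(unsigned long long)to_drop, (unsigned long long)*refs);
		return -EINVAL;
	}
	*refs -= to_drop;
	return 0;
}
```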

Btrfs: cleanup fs roots if we fail to mount · 171f6537

Authored by Josef Bacik on Apr 24, 2013

We can run the tree logging recovery or the orphan cleanup on mount, so we'll
end up looking up a random fs tree in the meantime. So we need to clean this up
so we don't leave extent buffers hanging around on the cache. With this patch
we no longer leak extent buffers on failure to mount. Thanks,
Signed-off-by: Josef Bacik <jbacik@fusionio.com>

Btrfs: fix extent logging with O_DIRECT into prealloc · eb384b55

Authored by Josef Bacik on Apr 24, 2013

This is the same as the fix from commit

Btrfs: fix bad extent logging

but for O_DIRECT.  I missed this when I fixed the problem originally: we were
still using the em for the orig_start and orig_block_len, which would be the
merged extent.  We need to use the actual extent from the on-disk file extent
item, which we have to look up to make sure it's ok to nocow anyway, so just pass
in some pointers to hold this info.  Thanks,

Cc: stable@vger.kernel.org
Signed-off-by: Josef Bacik <jbacik@fusionio.com>

Btrfs: fix all callers of read_tree_block · 416bc658

Authored by Josef Bacik on Apr 23, 2013

We kept leaking extent buffers when mounting a broken file system and it turns
out it's because not everybody uses read_tree_block properly. You need to check
and make sure the extent_buffer is uptodate before you use it. This patch fixes
everybody who calls read_tree_block directly to make sure they check that it is
uptodate and free it and return an error if it is not. With this we no longer
leak EB's when things go horribly wrong. Thanks,
Signed-off-by: Josef Bacik <jbacik@fusionio.com>

Btrfs: only exclude supers in the range of our block group · 51bf5f0b

Authored by Josef Bacik on Apr 23, 2013

If we fail to load block groups halfway through we can leave extent_state's on
the excluded tree. This is because we just lookup the supers and add them to
the excluded tree regardless of which block group we are looking at currently.
This is a problem because we remove the excluded extents for the range of the
block group only, so if we never load a block group covering one of the excluded
extents we won't ever free it. This fixes the problem by only adding an excluded
extent if it falls in the block group range we care about. With this patch
we're no longer leaking space when we fail to read all of the block groups.
Thanks,
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
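The range test this message describes can be sketched as an interval clip. The struct and function names below are hypothetical, and the sketch only shows the overlap arithmetic, not the kernel's superblock stripe lookup.

```c
#include <stdint.h>

struct range { uint64_t start; uint64_t len; };

/*
 * Clip `ext` to the block group `bg`. Returns 1 and fills `out` when they
 * overlap, 0 when the extent lies entirely outside the block group.
 */
static int clip_to_block_group(struct range bg, struct range ext, struct range *out)
{
	uint64_t bg_end = bg.start + bg.len;
	uint64_t ext_end = ext.start + ext.len;
	uint64_t start = ext.start > bg.start ? ext.start : bg.start;
	uint64_t end = ext_end < bg_end ? ext_end : bg_end;

	if (start >= end)
		return 0;	/* nothing to exclude for this block group */
	out->start = start;
	out->len = end - start;
	return 1;
}
```

Only extents for which this returns 1 would be added to the excluded tree, so everything added for a block group is also covered when that block group's range is cleaned up.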

Btrfs: add tree block level sanity check · 1c24c3ce

Authored by Josef Bacik on Apr 23, 2013

With a user's corrupted fs I was getting weird behavior and panics and it turns
out it was because one of his tree blocks had a bogus header level. So add this
to the sanity checks in the endio handler for tree blocks. Thanks,
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
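A level check of this kind could be sketched as below. The limit of 8 is an assumption meant to mirror the kernel's BTRFS_MAX_LEVEL, and both the function name and the -EIO return value are hypothetical.

```c
#include <errno.h>
#include <stdint.h>

#define MAX_LEVEL_SKETCH 8	/* assumption: mirrors BTRFS_MAX_LEVEL */

/* Reject a tree block whose header claims an impossible level. */
static int check_tree_block_level(uint8_t level)
{
	if (level >= MAX_LEVEL_SKETCH)
		return -EIO;	/* assumed error code for a corrupted block */
	return 0;
}
```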

Btrfs: don't try and free ebs twice in log replay · 5ec8dca7

Authored by Josef Bacik on Apr 23, 2013

This work is done by btrfs_free_path() anyway so there's no need for this
duplicate work.  Thanks,
Signed-off-by: Josef Bacik <jbacik@fusionio.com>

Btrfs: don't BUG_ON() in btrfs_num_copies · fb7669b5

Authored by Josef Bacik on Apr 23, 2013

A user sent me a btrfs-image that was panicking because of some corruption. This
is because we pass in a bogus value to btrfs_num_copies, and it panics. Instead
just return 1. We only call btrfs_num_copies to see if there are other copies
to try and read for things, so if we just return 1 it will make the callers exit
out with an appropriate error value. Thanks,
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
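The "return 1 for a bogus address" idea can be illustrated with a toy mapping table. Everything here (the table, the struct, the function name) is hypothetical; the real function looks up a chunk mapping rather than a static array.

```c
#include <stddef.h>
#include <stdint.h>

struct chunk_map { uint64_t start; uint64_t len; int copies; };

/* Toy mapping table: one single-copy chunk followed by one mirrored chunk. */
static const struct chunk_map maps[] = {
	{ 0,       1 << 20, 1 },
	{ 1 << 20, 1 << 20, 2 },
};

static int num_copies_sketch(uint64_t logical)
{
	for (size_t i = 0; i < sizeof(maps) / sizeof(maps[0]); i++)
		if (logical >= maps[i].start && logical - maps[i].start < maps[i].len)
			return maps[i].copies;
	/* Bogus address: report a single copy so callers stop looking for mirrors. */
	return 1;
}
```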

Btrfs: don't call readahead hook until we have read the entire eb · 79fb65a1

Authored by Josef Bacik on Apr 20, 2013

Martin Steigerwald reported a BUG_ON() where we were given a bogus bytenr to
map. Turns out he is using > PAGESIZE leafsizes. The readahead stuff is called
every time we do a completion, but we may not have finished reading in all the
pages, so the bytenr we read off the node could be completely bogus. Fix this
by only calling the readahead hook once all pages have been read in. Thanks,
Reported-by: Martin Steigerwald <Martin@lichtvoll.de>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
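The "call the hook only once every page has arrived" rule can be sketched with a simple completion counter. The struct and function are hypothetical and single-threaded; real completion handling would need an atomic counter.

```c
/* Hypothetical extent buffer that spans several pages. */
struct eb_sketch {
	int pages_total;	/* pages that make up the buffer */
	int pages_done;		/* pages whose read has completed */
	int hook_calls;		/* how often the readahead hook ran */
};

/* Called once per completed page read. */
static void page_read_done(struct eb_sketch *eb)
{
	eb->pages_done++;
	if (eb->pages_done < eb->pages_total)
		return;		/* buffer still incomplete: its contents may be bogus */
	eb->hook_calls++;	/* stands in for calling the readahead hook */
}
```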

Btrfs: deal with bad mappings in btrfs_map_block · 9bb91873

Authored by Josef Bacik on Apr 19, 2013

Martin Steigerwald reported a BUG_ON() in btrfs_map_block where we didn't find
a chunk for a particular block we were trying to map. This happened because the
block was bogus. We shouldn't be BUG_ON()'ing in this case, just print a
message and return an error. This came from reada_add_block and it appears to
deal with an error fine so we should be good there. Thanks,
Reported-by: Martin Steigerwald <Martin@lichtvoll.de>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>

Btrfs: use REQ_META for all metadata IO · d4c7ca86

Authored by Josef Bacik on Apr 19, 2013

We need to tag metadata io with REQ_META to avoid priority inversion when using
io throttling cgroups.  Thanks,
Signed-off-by: Josef Bacik <jbacik@fusionio.com>

Btrfs: fix possible infinite loop in slow caching · 0a3896d0

Authored by Josef Bacik on Apr 19, 2013

So I noticed there is an infinite loop in the slow caching code. We return 1
when we hit the end of the tree, so we could end up caching the last block group
the slow way and suddenly we're looping forever because we just keep
re-searching and trying again. Fix this by only doing btrfs_next_leaf() if we
don't need_resched(). Thanks,
Signed-off-by: Josef Bacik <jbacik@fusionio.com>

Btrfs: fix lockdep warning · 62dbd717

Authored by Josef Bacik on Apr 17, 2013

The locking order for stuff is

__sb_start_write
ordered_mutex

but with sync() we don't do __sb_start_write for some strange reason, which
means that our iput in wait_ordered_extents could start a transaction which does
the __sb_start_write while we're holding the ordered_mutex.  Fix this by using
delayed iput in sync.  Thanks,
Reported-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>

Btrfs: add all ioctl checks before user change for quota operations · 534e6623

Authored by Wang Shilong on Apr 17, 2013

Since all the quota configurations are loaded in memory, we can do the
ioctl checks before operating on the disk. It is safe to do so
because qgroup_ioctl_lock is held outside.

Even without these extra checks, applying the user's quota changes
works correctly. For example:

if we want to add a qgroup that already exists, we will do:
	->add_qgroup_item()
		->add_qgroup_rb()

add_qgroup_item() will return -EEXIST to us; however, the qgroups are all
in memory, so why not check them in memory first?
Signed-off-by: Wang Shilong <wangsl-fnst@cn.fujitsu.com>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>

Btrfs: fix missing check about ulist_add() in qgroup.c · 3c97185c

Authored by Wang Shilong on Apr 17, 2013

ulist_add() may return -ENOMEM; add the missing check of its
return value.
Signed-off-by: Wang Shilong <wangsl-fnst@cn.fujitsu.com>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
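Propagating such a failure, instead of ignoring the return value, might look like the sketch below. `small_list_add()` is a hypothetical stand-in for ulist_add() that fails when a fixed capacity is exhausted, which mimics an allocation failure without needing a real allocator.

```c
#include <errno.h>
#include <stdint.h>

#define LIST_CAP 2

struct small_list { uint64_t vals[LIST_CAP]; int n; };

/* Hypothetical stand-in for ulist_add(): -ENOMEM when no room is left. */
static int small_list_add(struct small_list *l, uint64_t v)
{
	if (l->n == LIST_CAP)
		return -ENOMEM;
	l->vals[l->n++] = v;
	return 1;
}

/* Add every id, stopping at the first failure and passing the error up. */
static int collect_ids(struct small_list *l, const uint64_t *ids, int count)
{
	for (int i = 0; i < count; i++) {
		int ret = small_list_add(l, ids[i]);

		if (ret < 0)
			return ret;	/* the check that was missing */
	}
	return 0;
}
```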

Btrfs: clear received_uuid field for new writable snapshots · 70023da2

Authored by Stefan Behrens on Apr 17, 2013

For created snapshots, the full root_item is copied from the source
root and afterwards selectively modified. The current code forgets
to clear the field received_uuid. The only problem is that it is
confusing when you look at it with 'btrfs subv list', since for
writable snapshots, the contents of the snapshot can be completely
unrelated to the previously received snapshot.
The receiver ignores such snapshots anyway because it also checks
the field stransid in the root_item and that value used to be reset
to zero for all created snapshots.

This commit changes two things:
- clear the received_uuid field for new writable snapshots.
- don't clear the send/receive related information like the stransid
  for read-only snapshots (which makes them useable as a parent for
  the automatic selection of parents in the receive code).
Signed-off-by: Stefan Behrens <sbehrens@giantdisaster.de>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
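The two changes listed above can be sketched as a copy-then-selectively-clear step. The struct is a hypothetical, heavily reduced stand-in for the root item, keeping only the two fields the message names.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, reduced root item: only the fields discussed in the message. */
struct root_item_sketch {
	uint8_t received_uuid[16];
	uint64_t stransid;
};

/* Copy the source root item, then clear send/receive state for writable snapshots only. */
static struct root_item_sketch make_snapshot_item(const struct root_item_sketch *src,
						  bool readonly)
{
	struct root_item_sketch item = *src;

	if (!readonly) {
		memset(item.received_uuid, 0, sizeof(item.received_uuid));
		item.stransid = 0;
	}
	return item;
}
```

A read-only snapshot keeps its received_uuid and stransid, so it stays usable as a parent candidate on the receive side.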

Btrfs: don't force pages under writeback to finish when aborting · b8d7f3ac

Authored by Josef Bacik on Apr 17, 2013

Dave reported a BUG_ON() that happened in end_page_writeback() after an abort.
This happened because we unconditionally call end_page_writeback() in the endio
case, which is right. However when we abort the transaction we will call
end_page_writeback() on any writeback pages we find, which is wrong. We need to
lock the page and wait for writeback to complete if the page is under writeback.
There is nothing unsafe about this since we are discarding the transaction
anyway. Thanks,
Reported-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>

Btrfs: remove unused variable in the iterate_extent_inodes() · ccf7f29d

Authored by Wang Shilong on Apr 16, 2013

Signed-off-by: Wang Shilong <wangsl-fnst@cn.fujitsu.com>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>

Btrfs: return error when we specify wrong start to defrag · 0abd5b17

Authored by Liu Bo on Apr 16, 2013

We need a sanity check for a wrong start offset when we defrag a file; otherwise,
even with a start that's larger than the file size, we can end up changing
not only the inode's force compress flag but also the FS's incompat flags.
Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
Reviewed-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
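The point of the check is ordering: validate the start offset before touching any state. A small sketch under assumed names (the struct, the function, and the -EINVAL return are all hypothetical here):

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical inode state: just the size and the force-compress flag. */
struct inode_sketch { uint64_t size; int force_compress; };

static int defrag_sketch(struct inode_sketch *ino, uint64_t start, int compress)
{
	/* Validate first, so a bad request cannot change any flags. */
	if (start >= ino->size)
		return -EINVAL;	/* assumed error code */
	if (compress)
		ino->force_compress = 1;
	return 0;
}
```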

Btrfs: fix reada debug code compilation · 3c59ccd3

Authored by Vincent on Apr 16, 2013

This fixes the following errors:

  fs/btrfs/reada.c: In function ‘btrfs_reada_wait’:
  fs/btrfs/reada.c:958:42: error: invalid operands to binary < (have ‘atomic_t’ and ‘int’)
  fs/btrfs/reada.c:961:41: error: invalid operands to binary < (have ‘atomic_t’ and ‘int’)
Signed-off-by: Vincent Stehlé <vincent.stehle@laposte.net>
Cc: Chris Mason <chris.mason@fusionio.com>
Cc: linux-btrfs@vger.kernel.org
Reviewed-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
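The compiler errors above arise because an `atomic_t` is a struct and cannot be compared with an int directly; its value has to be read through an accessor first. The sketch below uses simplified userspace stand-ins for `atomic_t` and `atomic_read()` (the kernel versions are more involved), and `below_limit()` is a hypothetical name.

```c
/* Simplified userspace stand-ins; the kernel's definitions differ. */
typedef struct { int counter; } atomic_t;

static inline int atomic_read(const atomic_t *v)
{
	return v->counter;
}

/* Compare the value read from the counter, not the struct itself. */
static int below_limit(const atomic_t *in_flight, int limit)
{
	return atomic_read(in_flight) < limit;
}
```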

Btrfs: cleanup of function where btrfs_extend_item() is called · fd279fae

Authored by Tsutomu Itoh on Apr 16, 2013

Argument 'trans' became unnecessary in setup_inline_extent_backref(),
which calls btrfs_extend_item().
Signed-off-by: Tsutomu Itoh <t-itoh@jp.fujitsu.com>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>

Btrfs: remove unused argument of btrfs_extend_item() · 4b90c680

Authored by Tsutomu Itoh on Apr 16, 2013

Argument 'trans' is not used in btrfs_extend_item().
Signed-off-by: Tsutomu Itoh <t-itoh@jp.fujitsu.com>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>

Btrfs: cleanup of function where fixup_low_keys() is called · afe5fea7

Authored by Tsutomu Itoh on Apr 16, 2013

Delete argument 'trans' from the functions that call fixup_low_keys()
where it is no longer needed.
Signed-off-by: Tsutomu Itoh <t-itoh@jp.fujitsu.com>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>

Btrfs: remove unused argument of fixup_low_keys() · d6a0a126

Authored by Tsutomu Itoh on Apr 16, 2013

Argument 'trans' is not used in fixup_low_keys(). So, remove it.
Signed-off-by: Tsutomu Itoh <t-itoh@jp.fujitsu.com>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>

Btrfs: fix confusing edquot happening case · b4fcd6be

Authored by Wang Shilong on Apr 15, 2013

Steps to reproduce:
	mkfs.btrfs <disk>
	mount <disk> <mnt>
	dd if=/dev/zero of=/<mnt>/data bs=1M count=10
	sync
	btrfs quota enable <mnt>
	btrfs qgroup create 0/5 <mnt>
	btrfs qgroup limit 5M 0/5 <mnt>
	rm -f /<mnt>/data
	sync
	btrfs qgroup show <mnt>
	dd if=/dev/zero of=data bs=1M count=1

From the user's perspective, the qgroup's referenced or exclusive count
is negative, but the user cannot continue to write data! A workaround
is to cast u64 to s64 when doing qgroup reservation.
Signed-off-by: Wang Shilong <wangsl-fnst@cn.fujitsu.com>
Reviewed-by: Arne Jansen <sensille@gmx.net>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>

Btrfs: do not continue if out of memory happens · e36902d4

Authored by Wang Shilong on Apr 15, 2013

If we run out of memory, we should return -ENOMEM directly to the caller
rather than continue the work.
Signed-off-by: Wang Shilong <wangsl-fnst@cn.fujitsu.com>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>

btrfs: fix minor typo in comment · 9c931c5a

Authored by Nathaniel Yazdani on Apr 15, 2013

In the comment describing the sync_writers field of the btrfs_inode
struct, "fsyncing" was misspelled "fsycing."
Signed-off-by: Nathaniel Yazdani <n1ght.4nd.d4y@gmail.com>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>

Btrfs: cleanup to remove reduplicate code in transaction.c · 98ad43be

Authored by Wang Shilong on Apr 14, 2013

Signed-off-by: Wang Shilong <wangsl-fnst@cn.fujitsu.com>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>

Btrfs: fix unlock after free on rewinded tree blocks · 47fb091f

Authored by Jan Schmidt on Apr 13, 2013

When tree_mod_log_rewind decides to make a copy of the current tree buffer
for its modifications, it subsequently freed the buffer before unlocking it.
Obviously, those operations are required in reverse order.
Signed-off-by: Jan Schmidt <list.btrfs@jan-o-sch.net>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>

Btrfs: fix accessing the root pointer in tree mod log functions · 30b0463a

Authored by Jan Schmidt on Apr 13, 2013

The tree mod log functions were accessing root->node->... directly, without
use of btrfs_root_node() or explicit rcu locking. This could lead to an
extent buffer reference being leaked and another reference being freed too
early when preemption was enabled.
Signed-off-by: Jan Schmidt <list.btrfs@jan-o-sch.net>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>