Commits · 27c9d772e50731dfd682b4ea9459eccec2071c49 · xiphi1978 / linux

04 Feb, 2016 1 commit

Btrfs: remove no longer used function extent_read_full_page_nolock() · 7f042a83

Authored by Filipe Manana on Jan 27, 2016

Not needed after the previous patch named
"Btrfs: fix page reading in extent_same ioctl leading to csum errors".
Signed-off-by: Filipe Manana <fdmanana@suse.com>

7f042a83

18 Dec, 2015 2 commits

Btrfs: add extent buffer bitmap sanity tests · 0f331229

Authored by Omar Sandoval on Sep 29, 2015

Sanity test the extent buffer bitmap operations (test, set, and clear)
against the equivalent standard kernel operations.
Signed-off-by: Omar Sandoval <osandov@fb.com>
Signed-off-by: Chris Mason <clm@fb.com>

0f331229

Btrfs: add extent buffer bitmap operations · 3e1e8bb7

Authored by Omar Sandoval on Sep 29, 2015

These are going to be used for the free space tree bitmap items.
Signed-off-by: Omar Sandoval <osandov@fb.com>
Signed-off-by: Chris Mason <clm@fb.com>

3e1e8bb7
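
As a rough userspace illustration of the approach described in the two bitmap commits above (checking a range-based bitmap helper against plain bit-at-a-time operations), here is a minimal sketch. It does not use the kernel's extent buffer API: `range_update`, `ref_set_bit`, `ref_clear_bit`, `ref_test_bit`, `check_range` and the 64-byte map size are all hypothetical names and values chosen for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BITS_PER_BYTE 8
#define MAP_BYTES 64 /* arbitrary bitmap size for the sketch */

/* Reference operations: touch one bit at a time. */
static void ref_set_bit(uint8_t *map, size_t nr)
{
	map[nr / BITS_PER_BYTE] |= (uint8_t)(1u << (nr % BITS_PER_BYTE));
}

static void ref_clear_bit(uint8_t *map, size_t nr)
{
	map[nr / BITS_PER_BYTE] &= (uint8_t)~(1u << (nr % BITS_PER_BYTE));
}

static int ref_test_bit(const uint8_t *map, size_t nr)
{
	return (map[nr / BITS_PER_BYTE] >> (nr % BITS_PER_BYTE)) & 1;
}

/* Code under test: set or clear [start, start + len) using byte masks. */
static void range_update(uint8_t *map, size_t start, size_t len, int set)
{
	while (len > 0) {
		size_t first = start % BITS_PER_BYTE;
		size_t n = BITS_PER_BYTE - first; /* bits left in this byte */
		uint8_t mask;

		if (n > len)
			n = len;
		assert(n >= 1 && n <= BITS_PER_BYTE);
		mask = (uint8_t)(((1u << n) - 1u) << first);
		if (set)
			map[start / BITS_PER_BYTE] |= mask;
		else
			map[start / BITS_PER_BYTE] &= (uint8_t)~mask;
		start += n;
		len -= n;
	}
}

/* Returns 1 when the range helper agrees with the reference, else 0. */
static int check_range(size_t start, size_t len)
{
	uint8_t a[MAP_BYTES], b[MAP_BYTES];
	size_t nbits = MAP_BYTES * BITS_PER_BYTE;
	size_t i;

	if (start > nbits || len > nbits - start)
		return 0;

	/* Set: compare against setting each bit individually. */
	memset(a, 0, sizeof(a));
	memset(b, 0, sizeof(b));
	range_update(a, start, len, 1);
	for (i = start; i < start + len; i++)
		ref_set_bit(b, i);
	if (memcmp(a, b, sizeof(a)) != 0)
		return 0;

	/* Test: a bit must be set exactly when it lies inside the range. */
	for (i = 0; i < nbits; i++)
		if (ref_test_bit(a, i) != (i >= start && i < start + len))
			return 0;

	/* Clear: start from all ones and compare again. */
	memset(a, 0xff, sizeof(a));
	memset(b, 0xff, sizeof(b));
	range_update(a, start, len, 0);
	for (i = start; i < start + len; i++)
		ref_clear_bit(b, i);
	return memcmp(a, b, sizeof(a)) == 0;
}
```

Calling `check_range(3, 77)` exercises a range that starts and ends in the middle of a byte, which is where mask-based helpers most often go wrong.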

07 Dec, 2015 6 commits

btrfs: make extent_range_redirty_for_io return void · f6311572

Authored by David Sterba on Dec 03, 2015

Does not return any errors, nor anything from the callgraph. There's a
BUG_ON but it's a sanity check and not an error condition we could
recover from.
Signed-off-by: David Sterba <dsterba@suse.com>

f6311572

btrfs: make extent_range_clear_dirty_for_io return void · bd1fa4f0

Authored by David Sterba on Dec 03, 2015

Does not return any errors, nor anything from the callgraph. There's a
BUG_ON but it's a sanity check and not an error condition we could
recover from.
Signed-off-by: David Sterba <dsterba@suse.com>

bd1fa4f0

btrfs: make end_extent_writepage return void · b5227c07

Authored by David Sterba on Dec 03, 2015

Does not return any errors, nor anything from the callgraph.  The branch
in end_bio_extent_writepage has been skipped since
5fd02043 ("Btrfs: finish ordered extents in their own thread").
Signed-off-by: David Sterba <dsterba@suse.com>

b5227c07

btrfs: make extent_clear_unlock_delalloc return void · a9d93e17

Authored by David Sterba on Dec 03, 2015

Does not return any errors, nor anything from the callgraph.
Signed-off-by: David Sterba <dsterba@suse.com>

a9d93e17

btrfs: make clear_extent_buffer_uptodate return void · 69ba3927

Authored by David Sterba on Dec 03, 2015

Does not return any errors, nor anything from the callgraph.
Signed-off-by: David Sterba <dsterba@suse.com>

69ba3927

btrfs: make set_extent_buffer_uptodate return void · 09c25a8c

Authored by David Sterba on Dec 03, 2015

Does not return any errors, nor anything from the callgraph.
Signed-off-by: David Sterba <dsterba@suse.com>

09c25a8c

03 Dec, 2015 4 commits

btrfs: make lock_extent static inline · cd716d8f

Authored by David Sterba on Dec 03, 2015

One call less reduces stack usage, code slightly reduced as well.
Signed-off-by: David Sterba <dsterba@suse.com>

cd716d8f

btrfs: drop unused parameter from lock_extent_bits · ff13db41

Authored by David Sterba on Dec 03, 2015

We've always passed 0. Stack usage will slightly decrease.
Signed-off-by: David Sterba <dsterba@suse.com>

ff13db41

btrfs: make clear_extent_bit helpers static inline · e83b1d91

Authored by David Sterba on Dec 03, 2015

The functions just wrap the clear_extent_bit API and generate function
calls. This increases stack consumption and may negatively affect
performance due to icache misses. We can simply make the helpers static
inline and keep the type checking and API untouched. The code slightly
increases:

   text	   data	    bss	    dec	    hex	filename
 938667	  43670	  23144	1005481	  f57a9	fs/btrfs/btrfs.ko.before
 939651	  43670	  23144	1006465	  f5b81	fs/btrfs/btrfs.ko.after
Signed-off-by: David Sterba <dsterba@suse.com>

e83b1d91

btrfs: make set_extent_bit helpers static inline · c6317955

Authored by David Sterba on Dec 03, 2015

The functions just wrap the set_extent_bit API and generate function
calls. This increases stack consumption and may negatively affect
performance due to icache misses. We can simply make the helpers static
inline and keep the type checking and API untouched. The code slightly
increases:

   text	   data	    bss	    dec	    hex	filename
 938427	  43670	  23144	1005241	  f56b9	fs/btrfs/btrfs.ko.before
 938667	  43670	  23144	1005481	  f57a9	fs/btrfs/btrfs.ko
Signed-off-by: David Sterba <dsterba@suse.com>

c6317955
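
The pattern described in the two helper commits above, thin wrappers turned into `static inline` functions that keep their typed signatures, can be sketched as follows. This is an illustration only and not the btrfs code: `sketch_tree`, `sketch_clear_bits`, `sketch_clear_dirty`, `sketch_clear_uptodate` and the `SKETCH_*` flags are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_DIRTY    (1u << 0)
#define SKETCH_UPTODATE (1u << 1)

struct sketch_tree {
	uint32_t bits;
};

/* General entry point: clears any combination of bits. */
static int sketch_clear_bits(struct sketch_tree *tree, uint32_t bits)
{
	assert(tree != NULL);
	tree->bits &= ~bits;
	return 0;
}

/*
 * Thin wrappers. Declared static inline (typically in a header), they
 * keep full type checking, and the compiler is free to expand them at
 * the call site instead of emitting a separate function call.
 */
static inline int sketch_clear_dirty(struct sketch_tree *tree)
{
	return sketch_clear_bits(tree, SKETCH_DIRTY);
}

static inline int sketch_clear_uptodate(struct sketch_tree *tree)
{
	return sketch_clear_bits(tree, SKETCH_UPTODATE);
}
```

Whether inlining shrinks or grows the binary depends on how many call sites there are and how much argument setup each one needs, which is why the commits quote `size` output instead of assuming a direction.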

22 Oct, 2015 4 commits

btrfs: qgroup: Introduce btrfs_qgroup_reserve_data function · 52472553

Authored by Qu Wenruo on Oct 12, 2015

Introduce a new function, btrfs_qgroup_reserve_data(), which uses the
io_tree to make qgroup reservations accurate and avoid leaking reserved space.
Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
Signed-off-by: Chris Mason <clm@fb.com>

52472553

btrfs: extent_io: Introduce new function clear_record_extent_bits() · fefdc557

Authored by Qu Wenruo on Oct 12, 2015

Introduce new function clear_record_extent_bits(), which will clear bits
for given range and record the details about which ranges are cleared
and how many bytes in total it changes.

This provides the basis for later qgroup reserve code.
Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
Signed-off-by: Chris Mason <clm@fb.com>

fefdc557

btrfs: extent_io: Introduce new function set_record_extent_bits · d38ed27f

Authored by Qu Wenruo on Oct 12, 2015

Introduce new function set_record_extent_bits(), which will not only set
given bits, but also record how many bytes are changed, and detailed
range info.

This is quite important for the later qgroup reserve framework.
The number of bytes will be used to do the qgroup reserve, and the detailed
range info will be used to clean up in the EQUOT case.
Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
Signed-off-by: Chris Mason <clm@fb.com>

d38ed27f

btrfs: extent_io: Introduce needed structure for recoding set/clear bits · ac467772

Authored by Qu Wenruo on Oct 12, 2015

Add a new structure, extent_change_set, to record how many bytes are
changed in one set/clear_extent_bits() operation, along with details of
the changed ranges.

This provides the needed facilities for the later qgroup reserve framework.
Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
Signed-off-by: Chris Mason <clm@fb.com>

ac467772
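
To make the idea of a change set concrete, here is a minimal userspace sketch of a structure that accumulates a byte count plus the list of changed ranges. It is an assumption-laden illustration, not the kernel definition: the fixed-size array, `MAX_RANGES`, `change_set_sketch`, `byte_range` and `change_set_record` are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_RANGES 16 /* arbitrary capacity for the sketch */

struct byte_range {
	uint64_t start;
	uint64_t end; /* inclusive */
};

struct change_set_sketch {
	uint64_t bytes_changed;               /* total bytes recorded so far */
	size_t nr_ranges;                     /* number of valid entries below */
	struct byte_range ranges[MAX_RANGES]; /* which ranges were changed */
};

/* Record one changed range. Returns 0 on success, -1 if invalid or full. */
static int change_set_record(struct change_set_sketch *cs,
			     uint64_t start, uint64_t end)
{
	assert(cs != NULL);
	if (end < start || cs->nr_ranges == MAX_RANGES)
		return -1;
	cs->ranges[cs->nr_ranges].start = start;
	cs->ranges[cs->nr_ranges].end = end;
	cs->nr_ranges++;
	cs->bytes_changed += end - start + 1;
	return 0;
}
```

A caller in this sketch would use `bytes_changed` to size a reservation and walk `ranges` to undo the changes if the reservation is refused, which mirrors the two uses named in the commit messages above.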

17 Feb, 2015 1 commit

btrfs: constify structs with op functions or static definitions · e8c9f186

Authored by David Sterba on Jan 02, 2015

There are some op tables that can be easily made const, similarly the
sysfs feature and raid tables. This is motivated by PaX CONSTIFY plugin.
Signed-off-by: David Sterba <dsterba@suse.cz>

e8c9f186

22 Jan, 2015 1 commit

btrfs: switch extent_state state to unsigned · 9ee49a04

Authored by David Sterba on Jan 14, 2015

Currently there's a 4B hole in the structure between refs and state and there
are only 16 bits used so we can make it unsigned. This will get a better
packing and may save some stack space for local variables.

The size of extent_state gets reduced by 8B and there are usually a lot
of slab objects.

struct extent_state {
	u64                        start;                /*     0     8 */
	u64                        end;                  /*     8     8 */
	struct rb_node             rb_node;              /*    16    24 */
	wait_queue_head_t          wq;                   /*    40    24 */
	/* --- cacheline 1 boundary (64 bytes) --- */
	atomic_t                   refs;                 /*    64     4 */

	/* XXX 4 bytes hole, try to pack */

	long unsigned int          state;                /*    72     8 */
	u64                        private;              /*    80     8 */

	/* size: 88, cachelines: 2, members: 7 */
	/* sum members: 84, holes: 1, sum holes: 4 */
	/* last cacheline: 24 bytes */
};
Signed-off-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Chris Mason <clm@fb.com>

9ee49a04
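
The effect of the padding hole shown in the layout dump above can be reproduced in userspace. The sketch below is an approximation under stated assumptions: it targets a 64-bit ABI where 64-bit fields are 8-byte aligned, uses opaque 24-byte stand-ins for `struct rb_node` and `wait_queue_head_t`, uses `uint64_t` in place of `unsigned long`, and all type and function names (`state_before`, `state_after`, `bytes_saved`) are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Opaque 24-byte stand-ins for struct rb_node and wait_queue_head_t. */
struct fake_rb_node { uint64_t pad[3]; };
struct fake_wq { uint64_t pad[3]; };

struct state_before {
	uint64_t start;
	uint64_t end;
	struct fake_rb_node rb_node;
	struct fake_wq wq;
	int32_t refs;          /* 4 bytes, like atomic_t */
	/* 4-byte hole: the next field needs 8-byte alignment */
	uint64_t state;        /* stands in for unsigned long on x86_64 */
	uint64_t private_data; /* "private" in the dump above */
};

struct state_after {
	uint64_t start;
	uint64_t end;
	struct fake_rb_node rb_node;
	struct fake_wq wq;
	int32_t refs;
	uint32_t state;        /* 4 bytes: occupies the former hole */
	uint64_t private_data;
};

/* Bytes saved per object by narrowing the state field. */
static size_t bytes_saved(void)
{
	return sizeof(struct state_before) - sizeof(struct state_after);
}
```

Under those assumptions the narrower field saves 8 bytes per object even though the field itself only shrinks by 4, because the 4-byte hole disappears as well.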

13 Dec, 2014 2 commits

btrfs: sink parameter len to alloc_extent_buffer · ce3e6984

Authored by David Sterba on Jun 15, 2014

Because we're using the globally known nodesize. Do the same for the sanity
test function variant.
Signed-off-by: David Sterba <dsterba@suse.cz>

ce3e6984

btrfs: unify extent buffer allocation api · 3f556f78

Authored by David Sterba on Jun 15, 2014

Make the extent buffer allocation interface consistent.  Cloned eb will
set a valid fs_info.  For dummy eb, we can drop the length parameter and
set it from fs_info.

The built-in sanity checks may pass a NULL fs_info that's queried for
nodesize, but we know it's 4096.
Signed-off-by: David Sterba <dsterba@suse.cz>

3f556f78

21 Nov, 2014 1 commit

Btrfs: set page and mapping error on compressed write failure · 704de49d

Authored by Filipe Manana on Oct 06, 2014

If we fail in submit_compressed_extents() before calling btrfs_submit_compressed_write(),
we start and end the writeback for the pages (clear their dirty flag, unlock them, etc)
but we don't tag the pages, nor the inode's mapping, with an error. This makes it
impossible for a caller of filemap_fdatawait_range() (fsync or transaction commit,
for example) to know that there was an error.

Note that the return value of submit_compressed_extents() is useless, as that function
is executed by a workqueue task and not directly by the fill_delalloc callback. This
means the writepage/s callbacks of the inode's address space operations don't get that
return value.
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Chris Mason <clm@fb.com>

704de49d

08 Oct, 2014 1 commit

Btrfs: fix compiles when CONFIG_BTRFS_FS_RUN_SANITY_TESTS is off · 0d4cf4e6

Authored by Chris Mason on Oct 07, 2014

Commit fccb84c9 added some helpers to clean up our sanity tests,
but it looks like both Dave and I always compile with the tests enabled.

This fixes things to work when they are turned off too.
Signed-off-by: Chris Mason <clm@fb.com>

0d4cf4e6

04 Oct, 2014 1 commit

Btrfs: be aware of btree inode write errors to avoid fs corruption · 656f30db

Authored by Filipe Manana on Sep 26, 2014

While we have a transaction ongoing, the VM might decide at any time
to call btree_inode->i_mapping->a_ops->writepages(), which will start
writeback of dirty pages belonging to btree nodes/leafs. This call
might return an error or the writeback might finish with an error
before we attempt to commit the running transaction. If this happens,
we might have no way of knowing that such error happened when we are
committing the transaction - because the pages might no longer be
marked dirty nor tagged for writeback (if a subsequent modification
to the extent buffer didn't happen before the transaction commit) which
makes filemap_fdata[write|wait]_range unable to find such pages (even
if they're marked with SetPageError).
So if this happens we must abort the transaction, otherwise we commit
a super block with btree roots that point to btree nodes/leafs whose
content on disk is invalid - either garbage or the content of some
node/leaf from a past generation that got cowed or deleted and is no
longer valid (for this latter case we end up getting error messages like
"parent transid verify failed on 10826481664 wanted 25748 found 29562"
when reading btree nodes/leafs from disk).

Note that setting and checking AS_EIO/AS_ENOSPC in the btree inode's
i_mapping would not be enough because we need to distinguish between
log tree extents (not fatal) vs non-log tree extents (fatal) and
because the next call to filemap_fdatawait_range() will catch and clear
such errors in the mapping - and that call might be from a log sync and
not from a transaction commit, which means we would not know about the
error at transaction commit time. Also, checking for the eb flag
EXTENT_BUFFER_IOERR at transaction commit time isn't done and would
not be completely reliable, as the eb might be removed from memory and
read back when trying to get it, which clears that flag right before
reading the eb's pages from disk, making us not know about the previous
write error.

Using the 3 new flags for the btree inode also gives us what AS_EIO/AS_ENOSPC
would provide in the following case: writepages() returns success after
starting writeback for all dirty pages, and by the time
filemap_fdatawait_range() is called, the writeback for all those pages has
already finished with errors. Because we were not using AS_EIO/AS_ENOSPC,
filemap_fdatawait_range() would return success, as it could not know
that writeback errors happened (the pages were no longer tagged for
writeback).
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Chris Mason <clm@fb.com>

656f30db

02 Oct, 2014 2 commits

btrfs: kill extent_buffer_page helper · fb85fc9a

Authored by David Sterba on Jul 31, 2014

It used to be more complex but now it's just a simple array access.
Signed-off-by: David Sterba <dsterba@suse.cz>

fb85fc9a

btrfs: remove unused extent state bits · 01d5bc37

Authored by David Sterba on Jul 30, 2014

The last users are long gone.
Signed-off-by: David Sterba <dsterba@suse.cz>

01d5bc37

18 Sep, 2014 7 commits

Btrfs: cleanup the read failure record after write or when the inode is freeing · f612496b

Authored by Miao Xie on Sep 12, 2014

After the data is written successfully, we should clean up the read failure record
in that range because
- If data COW is enabled for the file, the range that the failure record pointed to is
  mapped to a new place, so the record is invalid.
- If data COW is disabled for the file and there is no error during writing,
  the corrupted data is corrected, so the failure record can be removed. And if
  some errors happen on the mirrors, we needn't worry about it either, because the
  failure record will be recreated if we read the same place again.

Sometimes we may fail to correct the data, so the failure records are left
in the tree. We need to free them when we free the inode, or memory will leak.
Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
Signed-off-by: Chris Mason <clm@fb.com>

f612496b

Btrfs: implement repair function when direct read fails · 8b110e39

Authored by Miao Xie on Sep 12, 2014

This patch implements the data repair function for when a direct read fails.

The detail of the implementation is:
- When we find the data is not right, we try to read the data from the other
  mirror.
- When the io on the mirror ends, we will insert the endio work into the
  dedicated btrfs workqueue, not common read endio workqueue, because the
  original endio work is still blocked in the btrfs endio workqueue, if we
  insert the endio work of the io on the mirror into that workqueue, deadlock
  would happen.
- After we get right data, we write it back to the corrupted mirror.
- And if the data on the new mirror is still corrupted, we will try next
  mirror until we read right data or all the mirrors are traversed.
- After the above work, we set the uptodate flag according to the result.
Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
Signed-off-by: Chris Mason <clm@fb.com>

8b110e39
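
The mirror-walking step from the commit above (try the next mirror until one returns good data or all mirrors have been traversed) can be sketched in isolation. This is a simplified illustration, not the btrfs implementation: it leaves out the workqueue handling and the write-back to the corrupted mirror, numbers mirrors from 0 for convenience, and `read_mirror_fn`, `find_good_mirror` and `read_from_table` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callback: returns 0 when the copy on `mirror` verifies. */
typedef int (*read_mirror_fn)(int mirror, void *ctx);

/*
 * Starting after the mirror that failed, try every other mirror once.
 * Returns the first mirror that yields good data, or -1 if none does.
 */
static int find_good_mirror(int failed_mirror, int num_mirrors,
			    read_mirror_fn read_fn, void *ctx)
{
	int i;

	assert(num_mirrors > 0);
	assert(failed_mirror >= 0 && failed_mirror < num_mirrors);
	for (i = 1; i < num_mirrors; i++) {
		int mirror = (failed_mirror + i) % num_mirrors;

		if (read_fn(mirror, ctx) == 0)
			return mirror;
	}
	return -1;
}

/* Example callback: ctx points to an array of flags, nonzero meaning good. */
static int read_from_table(int mirror, void *ctx)
{
	const int *good = ctx;

	return good[mirror] ? 0 : -1;
}
```

In this sketch a return value of -1 corresponds to the case where the uptodate flag would be left unset, and any other value identifies the copy that could be written back over the corrupted one.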

Btrfs: modify clean_io_failure and make it suit direct io · 1203b681

Authored by Miao Xie on Sep 12, 2014

We could not use clean_io_failure in the direct IO path because it got the
filesystem information from the page structure, but the page in the direct
IO bio didn't have the filesystem information in its structure. So we need to
modify it and pass all the information it needs as parameters.
Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
Signed-off-by: Chris Mason <clm@fb.com>

1203b681

Btrfs: modify repair_io_failure and make it suit direct io · ffdd2018

Authored by Miao Xie on Sep 12, 2014

The original code of repair_io_failure was used only for buffered reads,
because it got some filesystem data from the page structure, which is safe for
pages in the page cache. But when we do a direct read, the pages in the bio
are not in the page cache, so there is no filesystem data in the page
structure. In order to implement direct read data repair, we need to modify
repair_io_failure and pass all the filesystem data it needs as function
parameters.
Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
Signed-off-by: Chris Mason <clm@fb.com>

ffdd2018

Btrfs: split bio_readpage_error into several functions · 2fe6303e

Authored by Miao Xie on Sep 12, 2014

The data repair function of direct read will be implemented later, and some code
in bio_readpage_error will be reused, so split bio_readpage_error into
several functions which will be used in direct read repair later.
Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
Signed-off-by: Chris Mason <clm@fb.com>

2fe6303e

Btrfs: shrink further sizeof(struct extent_buffer) · 2a39e598

Authored by Filipe Manana on Aug 14, 2014

The map_start and map_len fields aren't used anywhere, so just remove
them. On an x86_64 system, this reduced sizeof(struct extent_buffer)
from 296 bytes to 280 bytes, and therefore 14 extent_buffer structs can
now fit into a page instead of 13.
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Chris Mason <clm@fb.com>

2a39e598

Btrfs: reduce size of struct extent_state · 27a3507d

Authored by Filipe Manana on Jul 06, 2014

The tree field of struct extent_state was only used to figure out if
an extent state was connected to an inode's io tree or not. For this
we can just use the rb_node field itself.

On an x86_64 system with this change the sizeof(struct extent_state) is
reduced from 96 bytes down to 88 bytes, meaning that with a page size
of 4096 bytes we can now store 46 extent states per page instead of 42.
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Chris Mason <clm@fb.com>

27a3507d

20 Jun, 2014 1 commit

Btrfs: remove unused wait queue in struct extent_buffer · 46fefe41

Authored by Filipe Manana on Jun 16, 2014

The lock_wq wait queue is not used anywhere, therefore just remove it.
On an x86_64 system, this reduced sizeof(struct extent_buffer) from 320
bytes down to 296 bytes, which means a 4 KiB page can now be used for
13 extent buffers instead of 12.
Signed-off-by: Filipe David Borba Manana <fdmanana@gmail.com>
Signed-off-by: Chris Mason <clm@fb.com>

46fefe41
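
The per-page counts quoted in the three size-reduction commits above follow from integer division of the page size by the object size. The helper below is a back-of-the-envelope sketch with a hypothetical name (`objects_per_page`); it ignores any slab allocator metadata or alignment padding, so real slab capacities may differ.

```c
#include <assert.h>
#include <stddef.h>

/* How many whole objects of object_size bytes fit in one page. */
static size_t objects_per_page(size_t page_size, size_t object_size)
{
	assert(object_size > 0);
	return page_size / object_size;
}
```

With a 4096-byte page this gives 42 and 46 for the 96-byte and 88-byte `extent_state`, and 12, 13 and 14 for the 320-byte, 296-byte and 280-byte `extent_buffer`, matching the figures in the commit messages.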

13 Jun, 2014 1 commit

btrfs: new function read_extent_buffer_to_user · 550ac1d8

Authored by Gerhard Heift on Jan 30, 2014

This new function reads the content of an extent directly to user memory.
Signed-off-by: Gerhard Heift <Gerhard@Heift.Name>
Signed-off-by: Chris Mason <clm@fb.com>
Acked-by: David Sterba <dsterba@suse.cz>

550ac1d8

10 Jun, 2014 1 commit

Btrfs: add sanity tests for new qgroup accounting code · faa2dbf0

Authored by Josef Bacik on May 07, 2014

This exercises the various parts of the new qgroup accounting code. We do some
basic stuff and do some things with the shared refs to make sure all that code
works. I had to add a bunch of infrastructure because I needed to be able to
insert items into a fake tree without having to do all the hard work myself,
hopefully this will be useful in the future. Thanks,
Signed-off-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: Chris Mason <clm@fb.com>

faa2dbf0

07 Apr, 2014 1 commit

Btrfs: don't clear uptodate if the eb is under IO · a26e8c9f

Authored by Josef Bacik on Mar 28, 2014

So I have an awful exercise script that will run snapshot, balance and
send/receive in parallel. This sometimes would crash spectacularly and when it
came back up the fs would be completely hosed. Turns out this is because of a
bad interaction of balance and send/receive. Send will hold onto its entire
path for the whole send, but its blocks could get relocated out from underneath
it, and because it doesn't hold tree locks there's nothing to keep this from
happening. So it will go to read in a slot with an old transid, and we could
have re-allocated this block for something else and it could have a completely
different transid. But because we think it is invalid we clear uptodate and
re-read in the block. If we do this before we actually write out the new block
we could write back stale data to the fs, and boom we're screwed.

Now we definitely need to fix this disconnect between send and balance, but we
really really need to not allow ourselves to accidentally read in stale data over
new data. So make sure we check if the extent buffer is not under io before
clearing uptodate, this will kick back EIO to the caller instead of reading in
stale data and keep us from corrupting the fs. Thanks,
Signed-off-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: Chris Mason <clm@fb.com>

a26e8c9f

29 Jan, 2014 2 commits

Btrfs: move the extent buffer radix tree into the fs_info · f28491e0

Authored by Josef Bacik on Dec 16, 2013

I need to create a fake tree to test qgroups and I don't want to have to set up a
fake btree_inode. The fact is we only use the radix tree for the fs_info, so
everybody else who allocates an extent_io_tree is just wasting the space anyway.
This patch moves the radix tree and its lock into btrfs_fs_info so there is less
stuff I have to fake to do qgroup sanity tests. Thanks,
Signed-off-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: Chris Mason <clm@fb.com>

f28491e0

Btrfs: use a bit to track if we're in the radix tree · 34b41ace

Authored by Josef Bacik on Dec 13, 2013

For creating a dummy in-memory btree I need to be able to use the radix tree to
keep track of the buffers like normal extent buffers. With dummy buffers we
skip the radix tree step, and we still want to do that for the tree mod log
dummy buffers but for my test buffers we need to be able to remove them from the
radix tree like normal. This will give me a way to do that. Thanks,
Signed-off-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: Chris Mason <clm@fb.com>

34b41ace

12 Nov, 2013 1 commit

Btrfs: Simplify the logic in alloc_extent_buffer() for existing extent buffer case · 452c75c3

Authored by Chandra Seetharaman on Oct 07, 2013

alloc_extent_buffer() uses radix_tree_lookup() when radix_tree_insert()
fails with EEXIST. That part of the code is very similar to the code in
find_extent_buffer(). This patch replaces radix_tree_lookup() and
surrounding code in alloc_extent_buffer() with find_extent_buffer().

Note that radix_tree_lookup() does not need to be protected by
tree->buffer_lock. It is protected by eb->refs.

While at it, this patch
  - replaces the other usage of radix_tree_lookup() in alloc_extent_buffer()
    with find_extent_buffer() to reduce redundancy.
  - removes the unused argument 'len' to find_extent_buffer().
Signed-off-by: Chandra Seetharaman <sekharan@us.ibm.com>
Reviewed-by: Zach Brown <zab@redhat.com>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>

452c75c3