Commits · b0175117b9376a69978bbe80af26fb95dddbd53e · openeuler / raspberrypi-kernel

25 Jan, 2013 · 1 commit

Btrfs: fix panic when recovering tree log · b0175117

Authored by Josef Bacik on Dec 18, 2012

A user reported a BUG_ON(ret) that occurred during tree log replay.  Ret was
-EAGAIN, so what I think happened is that we removed an extent that covered
a bitmap entry and an extent entry.  We remove the part from the bitmap and
return -EAGAIN and then search for the next piece we want to remove, which
happens to be an entire extent entry, so we just free the sucker and return.
The problem is ret is still set to -EAGAIN so we trip the BUG_ON().  The
user used btrfs-zero-log so I'm not 100% sure this is what happened so I've
added a WARN_ON() to catch the other possibility.  Thanks,
Reported-by: Jan Steffens <jan.steffens@gmail.com>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>

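The failure mode described in this commit message can be illustrated with a minimal Python sketch. It is a simplified model, not the kernel code: the function names, the layout of one bitmap followed by one extent entry, and the reset_ret switch are all assumptions made for illustration. It shows how a status of -EAGAIN left over from the bitmap pass survives into the pass that frees a whole extent entry unless it is cleared before each retry.

```python
import errno


def remove_from_bitmap(bitmap_end, offset, length):
    """Remove the part of [offset, offset + length) that lies in the bitmap.

    Returns (ret, next_offset, remaining). ret is -EAGAIN when part of the
    range extends past the bitmap and the caller has to search again.
    """
    covered = min(length, bitmap_end - offset)
    remaining = length - covered
    ret = -errno.EAGAIN if remaining else 0
    return ret, offset + covered, remaining


def remove_free_space(bitmap_end, extent, offset, length, reset_ret):
    """Hypothetical model: a bitmap covering [0, bitmap_end), then one extent."""
    ret = 0
    while True:
        if reset_ret:
            ret = 0  # clear any stale status before each pass
        if offset < bitmap_end:
            ret, offset, length = remove_from_bitmap(bitmap_end, offset, length)
            if ret == -errno.EAGAIN:
                continue  # look up the next piece of the range
            return ret
        # The rest of the range is exactly one extent entry: free it and return.
        assert (offset, length) == extent
        return ret


# The range [192, 320) straddles the bitmap [0, 256) and the extent (256, 64).
stale = remove_free_space(256, (256, 64), 192, 128, reset_ret=False)
fixed = remove_free_space(256, (256, 64), 192, 128, reset_ret=True)
```

In this model, stale keeps the negative status from the first pass even though the removal succeeded, while fixed is 0.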

17 Dec, 2012 · 2 commits

Btrfs: use ctl->unit for free space calculation instead of block_group->sectorsize · 96009762

Authored by Wang Sheng-Hui on Nov 30, 2012

We should use ctl->unit for free space calculation instead of block_group->sectorsize,
even though for free space use_bitmap or free space cluster we currently only have
sectorsize assigned to ctl->unit. This also keeps the code style consistent.
Signed-off-by: Wang Sheng-Hui <shhuiw@gmail.com>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>


Btrfs: do not warn_on io_ctl->cur in io_ctl_map_page · 07140125

Authored by Wang Sheng-Hui on Nov 23, 2012

io_ctl_map_page is called by many functions in free-space-cache.
In most scenarios, the ->cur is not null, e.g. io_ctl_add_entry.
I think we'd better remove the warn_on here.
Signed-off-by: Wang Sheng-Hui <shhuiw@gmail.com>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>


12 Dec, 2012 · 1 commit

Btrfs: fix unnecessary while loop when search the free space, cache · de6c4115

Authored by Miao Xie on Oct 18, 2012

When we find a bitmap free space entry, we may check whether the previous
extent entry covers the offset. But if we find that this entry is also a
bitmap entry, we continue to check the entry before it in a while loop.
This is unnecessary because an extent entry in front of a bitmap entry
cannot cover the offset of the entry after that bitmap entry.
Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
Reviewed-by: Liu Bo <bo.li.liu@oracle.com>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>


09 Oct, 2012 · 1 commit

Btrfs: cache extent state when writing out dirty metadata pages · e6138876

Authored by Josef Bacik on Sep 27, 2012

Every time we write out dirty pages we search for an offset in the tree,
convert the bits in the state, and then when we wait we search for the
offset again and clear the bits. So for every dirty range in the io tree we
are doing 4 rb searches, which is suboptimal. With this patch we are only
doing 2 searches for every cycle (modulo weird things happening). Thanks,
Signed-off-by: Josef Bacik <jbacik@fusionio.com>


04 Oct, 2012 · 1 commit

Btrfs: using for_each_set_bit_from to simplify the code · ebb3dad4

Authored by Wei Yongjun on Sep 13, 2012

Using for_each_set_bit_from() to simplify the code.

spatch with a semantic match was used to find this.
(http://coccinelle.lip6.fr/)
Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>

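To make the simplification concrete, here is a small Python model of the two loop shapes. It is only a sketch: the kernel macro operates on arrays of unsigned long, whereas this version assumes the bitmap is a plain Python integer, and the helper signatures are chosen for illustration.

```python
def find_next_bit(bitmap, size, offset):
    """Return the index of the first set bit in [offset, size), or size."""
    for bit in range(offset, size):
        if (bitmap >> bit) & 1:
            return bit
    return size


def open_coded(bitmap, start, size):
    """The pattern being replaced: repeated find_next_bit calls."""
    found = []
    i = find_next_bit(bitmap, size, start)
    while i < size:
        found.append(i)
        i = find_next_bit(bitmap, size, i + 1)
    return found


def for_each_set_bit_from(bitmap, start, size):
    """Iterator form: yield every set bit in [start, size), lowest first."""
    bit = find_next_bit(bitmap, size, start)
    while bit < size:
        yield bit
        bit = find_next_bit(bitmap, size, bit + 1)


bits = 0b101101  # bits 0, 2, 3 and 5 are set
```

Both forms visit the same bits; the iterator form keeps the bookkeeping out of the caller.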

24 Jul, 2012 · 1 commit

Btrfs: do not count in readonly bytes · f6175efa

Authored by Liu Bo on Jul 06, 2012

If a block group is read-only, do not count its entries when we dump space info.
Signed-off-by: Liu Bo <liubo2009@cn.fujitsu.com>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>


03 Jul, 2012 · 1 commit

Btrfs: fix tree log remove space corner case · bdb7d303

Authored by Josef Bacik on Jun 27, 2012

The tree log stuff can have allocated space that we end up having split
across a bitmap and a real extent. The free space code does not deal with
this, it assumes that if it finds an extent or bitmap entry that the entire
range must fall within the entry it finds. This isn't necessarily the case,
so rework the remove function so it can handle this case properly. This
fixed two panics the user hit, first in the case where the space was
initially in a bitmap and then in an extent entry, and then the reverse
case. Thanks,
Reported-and-tested-by: Shaun Reich <sreich@kde.org>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>


30 May, 2012 · 3 commits

Btrfs: merge contigous regions when loading free space cache · cd023e7b

Authored by Josef Bacik on May 14, 2012

When we write out the free space cache we will write out everything that is
in our in memory tree, and then we will just walk the pinned extents tree
and write anything we see there. The problem with this is that during
normal operations the pinned extents will be merged back into the free space
tree normally, and then we can allocate space from the merged areas and
commit them to the tree log. If we crash and replay the tree log we will
crash again because the tree log will try to free up space from what looks
like 2 separate but contiguous entries, since one entry is from the original
free space cache and the other was a pinned extent that was merged back. To
fix this we just need to walk the free space tree after we load it and merge
contiguous entries back together. This will keep the tree log stuff from
breaking and it will make the allocator behave more nicely. Thanks,
Signed-off-by: Josef Bacik <josef@redhat.com>

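The merge step this commit message describes can be sketched in a few lines of Python. This is an illustrative model only: it assumes free space entries are plain (offset, size) tuples in a list, whereas the kernel keeps them in an rbtree, and the function name is hypothetical.

```python
def merge_contiguous(entries):
    """Merge free space entries that touch end-to-start.

    entries is a list of (offset, size) tuples; the result is sorted by
    offset and contains no two entries where one ends where the next begins.
    """
    merged = []
    for offset, size in sorted(entries):
        if merged and merged[-1][0] + merged[-1][1] == offset:
            prev_offset, prev_size = merged[-1]
            merged[-1] = (prev_offset, prev_size + size)
        else:
            merged.append((offset, size))
    return merged


# One entry loaded from the cache and one formerly pinned extent next to it.
loaded = [(4096, 4096), (0, 4096), (16384, 4096)]
result = merge_contiguous(loaded)
```

After merging, a removal that covers both of the original neighbours finds a single entry spanning the whole range.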

Btrfs: finish ordered extents in their own thread · 5fd02043

Authored by Josef Bacik on May 02, 2012

We noticed that the ordered extent completion doesn't really rely on having
a page and that it could be done independently of ending the writeback on a
page. This patch makes us not do the threaded endio stuff for normal
buffered writes and direct writes so we can end page writeback as soon as
possible (in irq context) and only start threads to do the ordered work when
it is actually done. Compression needs to be reworked some to take
advantage of this as well, but atm it has to do a find_get_page in its endio
handler so it must be done in its own thread. This makes direct writes
quite a bit faster. Thanks,
Signed-off-by: Josef Bacik <josef@redhat.com>


btrfs: trivial endianness annotations · 528c0327

Authored by Al Viro on Apr 13, 2012

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

13 Apr, 2012 · 1 commit

Btrfs: use commit root when loading free space cache · d53ba474

Authored by Josef Bacik on Apr 12, 2012

A user reported that booting his box up with btrfs root on 3.4 was way
slower than on 3.3 because I removed the ideal caching code. It turns out
that we don't load the free space cache if we're in a commit for deadlock
reasons, but since we're reading the cache and it hasn't changed yet we are
safe reading the inode and free space item from the commit root, so do that
and remove all of the deadlock checks so we don't unnecessarily skip loading
the free space cache. The user reported this fixed the slowness. Thanks,
Tested-by: Calvin Walton <calvin.walton@kepstin.ca>
Signed-off-by: Josef Bacik <josef@redhat.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>


22 Mar, 2012 · 2 commits

btrfs: replace many BUG_ONs with proper error handling · 79787eaa

Authored by Jeff Mahoney on Mar 12, 2012

btrfs currently handles most errors with BUG_ON. This patch is a
work-in-progress but aims to handle most errors other than internal logic
errors and ENOMEM more gracefully.

This iteration prevents most crashes but can run into lockups with
the page lock on occasion when the timing "works out."
Signed-off-by: Jeff Mahoney <jeffm@suse.com>


btrfs: drop gfp_t from lock_extent · d0082371

Authored by Jeff Mahoney on Mar 01, 2012

lock_extent and unlock_extent are always called with GFP_NOFS, drop the
argument and use GFP_NOFS consistently.
Signed-off-by: Jeff Mahoney <jeffm@suse.com>


15 Feb, 2012 · 1 commit

Btrfs: fix memory leak in load_free_space_cache() · a7e221e9

Authored by Tsutomu Itoh on Feb 14, 2012

load_free_space_cache() has forgotten to free path.
Signed-off-by: Tsutomu Itoh <t-itoh@jp.fujitsu.com>


10 Feb, 2012 · 1 commit

btrfs: Fix typo in free-space-cache.c · 934e7d44

Authored by Masanari Iida on Feb 07, 2012

Correct spelling "cace" to "cache" in
fs/btrfs/free-space-cache.c
Signed-off-by: Masanari Iida <standby24x7@gmail.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>


27 Jan, 2012 · 3 commits

Btrfs: advance window_start if we're using a bitmap · 9b230628

Authored by Josef Bacik on Jan 26, 2012

If we span a long area in a bitmap we could end up taking a lot of time
searching to the next free area if we're searching from the original
window_start, so advance window_start in order to make sure we don't do any
superficial searching.  Thanks,
Signed-off-by: Josef Bacik <josef@redhat.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>


Btrfs: use cluster->window_start when allocating from a cluster bitmap · 0b4a9d24

Authored by Josef Bacik on Jan 26, 2012

We specifically set window_start in the cluster struct to indicate where the
cluster starts in a bitmap, but we've been using min_start to indicate where
we're searching from. This is usually the start of the blockgroup, so
essentially means we're constantly searching from the start of any bitmap we
find, which completely negates all the trouble we go to in order to setup a
cluster. So start using window_start to make sure we actually use the area we
found. Thanks,
Signed-off-by: Josef Bacik <josef@redhat.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

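The cost of searching from the wrong starting point can be shown with a toy model. This sketch assumes a bitmap represented as a list of booleans and a hypothetical search_bitmap helper that also reports how many bits it examined; none of this is the kernel's own interface.

```python
def search_bitmap(bits, search_start, want):
    """Find the first run of `want` free bits at or after search_start.

    Returns (index_of_run, bits_examined); the index is None if no run of
    the requested length exists.
    """
    run = 0
    examined = 0
    for i in range(search_start, len(bits)):
        examined += 1
        run = run + 1 if bits[i] else 0
        if run == want:
            return i - want + 1, examined
    return None, examined


# 1000 allocated bits followed by 8 free bits; the cluster window starts at 1000.
bits = [False] * 1000 + [True] * 8
min_start = 0        # roughly "start of the block group"
window_start = 1000  # where the cluster was set up

from_min_start = search_bitmap(bits, min_start, 4)
from_window_start = search_bitmap(bits, window_start, 4)
```

Both searches land on the same run, but starting from window_start examines 4 bits instead of 1004 in this example.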

Btrfs: make sure a bitmap has enough bytes · 357b9784

Authored by Josef Bacik on Jan 26, 2012

We have only been checking for min_bytes available in bitmap entries, but we
won't successfully setup a bitmap cluster unless it has at least bytes in the
bitmap, so in the common case min_bytes is 4k and we want something like 2MB, so
if there are a bunch of bitmap entries with less than 2mb's in them, we'll
search all them anyway, which is suboptimal. Fix this check. Thanks,
Signed-off-by: Josef Bacik <josef@redhat.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>


17 Jan, 2012 · 1 commit

Btrfs: add allocator tracepoints · 3f7de037

Authored by Josef Bacik on Nov 10, 2011

I used these tracepoints when figuring out what the cluster stuff was doing, so
add them to mainline in case we need to profile this stuff again. Thanks,
Signed-off-by: Josef Bacik <josef@redhat.com>


11 Jan, 2012 · 4 commits

Btrfs: rewrite btrfs_trim_block_group() · 7fe1e641

Authored by Li Zefan on Dec 29, 2011

There are various bugs in block group trimming:

- It may trim from offset smaller than user-specified offset.
- It may trim beyond user-specified range.
- It may leak free space for extents smaller than specified minlen.
- It may truncate the last trimmed extent thus leak free space.
- With mixed extents+bitmaps, some extents may not be trimmed.
- With mixed extents+bitmaps, some bitmaps may not be trimmed (even
none will be trimmed). Even for those trimmed, not all the free space
in the bitmaps will be trimmed.

I rewrite btrfs_trim_block_group() and break it into two functions.
One is to trim extents only, and the other is to trim bitmaps only.

Before patching:

	# fstrim -v /mnt/
	/mnt/: 1496465408 bytes were trimmed

After patching:

	# fstrim -v /mnt/
	/mnt/: 2193768448 bytes were trimmed

And this matches the total free space:

	# btrfs fi df /mnt
	Data: total=3.58GB, used=1.79GB
	System, DUP: total=8.00MB, used=4.00KB
	System: total=4.00MB, used=0.00
	Metadata, DUP: total=205.12MB, used=97.14MB
	Metadata: total=8.00MB, used=0.00
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>

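The first few bugs in the list above are range-handling problems, and the intended behaviour can be sketched as a clamp. This is a simplified Python model, not the rewritten kernel function: it assumes free extents are (offset, size) tuples, the user range is the half-open interval [start, end), and the helper name is hypothetical.

```python
def trim_ranges(extents, start, end, minlen):
    """Return the pieces of each free extent that should be trimmed.

    Each extent is clamped to [start, end) so nothing before the requested
    offset or beyond the requested range is trimmed; pieces shorter than
    minlen are skipped and simply stay in the free space.
    """
    to_trim = []
    for offset, size in extents:
        lo = max(offset, start)
        hi = min(offset + size, end)
        if hi > lo and hi - lo >= minlen:
            to_trim.append((lo, hi - lo))
    return to_trim


free_extents = [(0, 100), (150, 100), (300, 10)]
trimmed = trim_ranges(free_extents, start=50, end=200, minlen=20)
```

In this example the first extent is cut at the front, the second at the back, and the third lies outside the range entirely.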

Btrfs: check the return value of io_ctl_init() · 706efc66

Authored by Li Zefan on Jan 09, 2012

It can return -ENOMEM.
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>

Btrfs: avoid possible NULL deref in io_ctl_drop_pages() · a1ee5a45

Authored by Li Zefan on Jan 09, 2012

If we run into some failure path in io_ctl_prepare_pages(),
io_ctl->pages[] array may have some NULL pointers.
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>


Btrfs: add pinned extents to on-disk free space cache correctly · db804f23

Authored by Li Zefan on Jan 10, 2012

I got this while running xfstests:

[24256.836098] block group 317849600 has an wrong amount of free space
[24256.836100] btrfs: failed to load free space cache for block group 317849600

We should clamp the extent returned by find_first_extent_bit(),
so the start of the extent won't be smaller than the start of the
block group.
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>


08 Jan, 2012 · 1 commit

Btrfs: revamp clustered allocation logic · 1bb91902

Authored by Alexandre Oliva on Oct 14, 2011

Parameterize clusters on minimum total size, minimum chunk size and
minimum contiguous size for at least one chunk, without limits on
cluster, window or gap sizes. Don't tolerate any fragmentation for
SSD_SPREAD; accept it for metadata, but try to keep data dense.
Signed-off-by: Alexandre Oliva <oliva@lsd.ic.unicamp.br>
Signed-off-by: Chris Mason <chris.mason@oracle.com>


15 Dec, 2011 · 1 commit

btrfs: free-space-cache.c: remove extra semicolon. · cb54f257

Authored by Justin P. Mattock on Nov 21, 2011

The patch below removes an extra semicolon.
Signed-off-by: Justin P. Mattock <justinmattock@gmail.com>
CC: Chris Mason <chris.mason@oracle.com>
CC: linux-btrfs@vger.kernel.org
Signed-off-by: Jiri Kosina <jkosina@suse.cz>


01 Dec, 2011 · 2 commits

Btrfs: reset cluster's max_size when creating bitmap · b78d09bc

Authored by Alexandre Oliva on Nov 30, 2011

The field that indicates the size of the largest contiguous chunk of
free space in the cluster is not initialized when setting up bitmaps,
it's only increased when we find a larger contiguous chunk.  We end up
retaining a larger value than appropriate for highly-fragmented
clusters, which may cause pointless searches for large contiguous
groups, and even cause clusters that do not meet the density
requirements to be set up.
Signed-off-by: Alexandre Oliva <oliva@lsd.ic.unicamp.br>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

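A short Python sketch shows why an uninitialized, only-ever-increased maximum is misleading. The model is an assumption made for illustration: the bitmap is a list of booleans and largest_free_run is a hypothetical helper, not a kernel function.

```python
def largest_free_run(bits, max_size):
    """Scan a bitmap and return the largest run of free bits recorded.

    max_size is the value the field holds before the scan. It is only ever
    raised, which mirrors the behaviour the commit message describes.
    """
    run = 0
    for free in bits:
        run = run + 1 if free else 0
        max_size = max(max_size, run)
    return max_size


fragmented = [True, False, True, True, False, True]
stale = largest_free_run(fragmented, max_size=8)  # value left from earlier use
reset = largest_free_run(fragmented, max_size=0)  # field reset before the scan
```

With the stale starting value, the result claims a contiguous run of 8 that this bitmap does not contain; after a reset the scan reports the real maximum of 2.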

Btrfs: initialize new bitmaps' list · f2d0f676

Authored by Alexandre Oliva on Nov 28, 2011

We're failing to create clusters with bitmaps because
setup_cluster_no_bitmap checks that the list is empty before inserting
the bitmap entry in the list for setup_cluster_bitmap, but the list
field is only initialized when it is restored from the on-disk free
space cache, or when it is written out to disk.

Besides a potential race condition due to the multiple use of the list
field, filesystem performance severely degrades over time: as we use
up all non-bitmap free extents, the try-to-set-up-cluster dance is
done at every metadata block allocation. For every block group, we
fail to set up a cluster, and after failing on them all up to twice,
we fall back to the much slower unclustered allocation.

To make matters worse, before the unclustered allocation, we try to
create new block groups until we reach the 1% threshold, which
introduces additional bitmaps and thus block groups that we'll iterate
over at each metadata block request.


22 Nov, 2011 · 1 commit

Btrfs: remove free-space-cache.c WARN during log replay · 24a70313

Authored by Chris Mason on Nov 21, 2011

The log replay code only partially loads block groups, since
the block group caching code is able to detect and deal with
extents the logging code has pinned down.

While the logging code is pinning down block groups, there is
a bogus WARN_ON we're hitting if the code wasn't able to find
an extent in the cache.  This commit removes the warning because
it can happen any time there isn't a valid free space cache
for that block group.
Signed-off-by: Chris Mason <chris.mason@oracle.com>


20 Nov, 2011 · 3 commits

Btrfs: clear pages dirty for io and set them extent mapped · f7d61dcd

Authored by Josef Bacik on Nov 15, 2011

When doing the io_ctl helpers to clean up the free space cache stuff I stopped
using our normal prepare_pages stuff, which means I of course forgot to do
things like set the pages extent mapped, which will cause us all sorts of
wonderful problems.  Thanks,
Signed-off-by: Josef Bacik <josef@redhat.com>


Btrfs: avoid unnecessary bitmap search for cluster setup · 52621cb6

Authored by Li Zefan on Nov 20, 2011

setup_cluster_no_bitmap() searches all the extents and bitmaps starting
from offset. Therefore if it returns -ENOSPC, all the bitmaps starting
from offset are in the bitmaps list, so it's sufficient to search from
this list in setup_cluster_bitmap().
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>


Btrfs: fix to search one more bitmap for cluster setup · 0f0fbf1d

Authored by Li Zefan on Nov 20, 2011

Suppose there are two bitmaps [0, 256], [256, 512] and one extent
[100, 120] in the free space cache, and we want to setup a cluster
with offset=100, bytes=50.

In this case, there will be only one bitmap [256, 512] in the temporary
bitmaps list, and then setup_cluster_bitmap() won't search bitmap [0, 256].

The cause is, the list is constructed in setup_cluster_no_bitmap(),
and only bitmaps with bitmap_entry->offset >= offset will be added
into the list, and the very bitmap that covers offset has
bitmap_entry->offset <= offset.
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

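The example from this commit message can be replayed with a small Python model. It is a sketch under stated assumptions: each bitmap is identified by its start offset, every bitmap spans a fixed 256 units as in the example, and both helper functions (including this simplified offset_to_bitmap signature) are written for illustration rather than copied from the kernel.

```python
BITMAP_SPAN = 256  # units covered by one bitmap in the commit's example


def offset_to_bitmap(offset):
    """Start offset of the bitmap that would cover `offset` (model only)."""
    return (offset // BITMAP_SPAN) * BITMAP_SPAN


def bitmaps_to_search(bitmap_starts, offset, include_covering):
    """Return the bitmap start offsets a cluster setup would look at."""
    # Old behaviour: only bitmaps that start at or after the requested offset.
    found = [start for start in sorted(bitmap_starts) if start >= offset]
    if include_covering:
        # Also take the bitmap whose range contains the requested offset.
        covering = offset_to_bitmap(offset)
        if covering in bitmap_starts and covering not in found:
            found.insert(0, covering)
    return found


bitmaps = [0, 256]  # the two bitmaps [0, 256] and [256, 512]
before = bitmaps_to_search(bitmaps, offset=100, include_covering=False)
after = bitmaps_to_search(bitmaps, offset=100, include_covering=True)
```

With offset=100, the old selection contains only the bitmap starting at 256; including the covering bitmap adds the one starting at 0.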

11 Nov, 2011 · 1 commit

Btrfs: only map pages if we know we need them when reading the space cache · 2f120c05

Authored by Josef Bacik on Nov 10, 2011

People have been running into a warning when loading space cache because the
page is already mapped when trying to read in a bitmap. The way we read in
entries and pages is kind of convoluted, so fix it so that io_ctl_read_entry
maps the entries if it needs to, and if it hits the end of the page it simply
unmaps the page. That way we can unconditionally unmap the io_ctl before
reading in the bitmap and we should stop hitting these warnings. Thanks,
Signed-off-by: Josef Bacik <josef@redhat.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>


06 Nov, 2011 · 2 commits

Btrfs: use the global reserve when truncating the free space cache inode · c8174313

Authored by Josef Bacik on Nov 02, 2011

We no longer use the orphan block rsv for holding the reservation for truncating
the inode, so instead use the global block rsv and check to make sure it has
enough space for us to truncate the space. Thanks,
Signed-off-by: Josef Bacik <josef@redhat.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>


Btrfs: make sure btrfs_remove_free_space doesn't leak EAGAIN · 1eae31e9

Authored by Chris Mason on Oct 14, 2011

btrfs_remove_free_space needs to make sure to set ret back to a
valid return value after setting it to EAGAIN, otherwise we return
it to the callers.
Signed-off-by: Chris Mason <chris.mason@oracle.com>


20 Oct, 2011 · 5 commits

Btrfs: don't flush the cache inode before writing it · 016fc6a6

Authored by Josef Bacik on Oct 19, 2011

I noticed we had a little bit of latency when writing out the space cache
inodes. It's because we flush it before we write anything in case we have dirty
pages already there. This doesn't matter though since we're just going to
overwrite the space, and there really shouldn't be any dirty pages anyway. This
makes some of my tests run a little bit faster. Thanks,
Signed-off-by: Josef Bacik <josef@redhat.com>


Btrfs: seperate out btrfs_block_rsv_check out into 2 different functions · 36ba022a

Authored by Josef Bacik on Oct 18, 2011

Currently btrfs_block_rsv_check does 2 things, it will either refill a block
reserve like in the truncate or refill case, or it will check to see if there is
enough space in the global reserve and possibly refill it. However because of
overcommit we could be well overcommitting ourselves just to try and refill the
global reserve, when really we should just be committing the transaction. So
break this out into btrfs_block_rsv_refill and btrfs_block_rsv_check. Refill
will try to reserve more metadata if it can and btrfs_block_rsv_check will not,
it will only tell you if the factor of the total space is still reserved.
Thanks,
Signed-off-by: Josef Bacik <josef@redhat.com>


Btrfs: inline checksums into the disk free space cache · 5b0e95bf

Authored by Josef Bacik on Oct 06, 2011

Yeah yeah I know this is how we used to do it and then I changed it, but damnit
I'm changing it back. The fact is that writing out checksums will modify
metadata, which could cause us to dirty a block group we've already written out,
so we have to truncate it and all of its checksums and re-write it which will
write new checksums which could dirty a block group that has already been
written and you see where I'm going with this? This can cause unmount or really
anything that depends on a transaction to commit to take its sweet damned time
to happen. So go back to the way it was, only this time we're specifically
setting NODATACOW because we can't go through the COW pathway anyway and we're
doing our own built-in cow'ing by truncating the free space cache. The other
new thing is once we truncate the old cache and preallocate the new space, we
don't need to do that song and dance at all for the rest of the transaction, we
can just overwrite the existing space with the new cache if the block group
changes for whatever reason, and the NODATACOW will let us do this fine. So
keep track of which transaction we last cleared our cache in and if we cleared
it in this transaction just say we're all setup and carry on. This survives
xfstests and stress.sh.

The inode cache will continue to use the normal csum infrastructure since it
only gets written once and there will be no more modifications to the fs tree in
a transaction commit.
Signed-off-by: Josef Bacik <josef@redhat.com>


Btrfs: check the return value of filemap_write_and_wait in the space cache · 549b4fdb

Authored by Josef Bacik on Oct 05, 2011

We need to check the return value of filemap_write_and_wait in the space cache
writeout code.  Also don't set the inode's generation until we're sure nothing
else is going to fail.  Thanks,
Signed-off-by: Josef Bacik <josef@redhat.com>


Btrfs: add a io_ctl struct and helpers for dealing with the space cache · a67509c3

Authored by Josef Bacik on Oct 05, 2011

In writing and reading the space cache we have one big loop that keeps track of
which page we are on and then a bunch of sizeable loops underneath this big loop
to try and read/write out properly. Especially in the write case this makes
things hugely complicated and hard to follow, and makes our error checking and
recovery equally as complex. So add a io_ctl struct with a bunch of helpers to
keep track of the pages we have, where we are, if we have enough space etc.
This unifies how we deal with the pages we're writing and keeps all the messy
tracking internal. This allows us to kill the big loops in both the read and
write case and makes reviewing and changing the write and read paths much
simpler. I've run xfstests and stress.sh on this code and it survives. Thanks,
Signed-off-by: Josef Bacik <josef@redhat.com>

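The idea of keeping page tracking inside one helper object can be sketched in Python. This is a loose model and not the kernel's io_ctl: the class name, its fields, the fixed entry size, and the boolean return value are all assumptions made for illustration. The point is only that callers add entries one at a time while the object decides when to move to the next page and when it has run out of space.

```python
class IoCtl:
    """Hypothetical cursor over a fixed set of equally sized pages."""

    def __init__(self, num_pages, page_size, entry_size):
        self.pages = [bytearray(page_size) for _ in range(num_pages)]
        self.page_size = page_size
        self.entry_size = entry_size
        self.index = 0  # page currently being filled
        self.pos = 0    # byte offset inside that page

    def has_space(self):
        return self.index < len(self.pages)

    def add_entry(self, entry):
        """Copy one entry into the current page; return False when full."""
        if len(entry) != self.entry_size:
            raise ValueError("entry has the wrong size")
        if not self.has_space():
            return False
        page = self.pages[self.index]
        page[self.pos:self.pos + self.entry_size] = entry
        self.pos += self.entry_size
        # Move to the next page once another entry would no longer fit.
        if self.pos + self.entry_size > self.page_size:
            self.index += 1
            self.pos = 0
        return True


ctl = IoCtl(num_pages=2, page_size=32, entry_size=16)
results = [ctl.add_entry(bytes([n]) * 16) for n in range(1, 6)]
```

Two 32-byte pages hold four 16-byte entries, so the fifth add_entry call reports that no space is left, and the caller never touches page indices directly.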