Commits · 78a6184a3ff9041280ee56273c01e5679a831b39 · openeuler / raspberrypi-kernel

20 Feb, 2013 1 commit

Btrfs: use slabs for delayed reference allocation · 78a6184a

Authored by Miao Xie on Nov 21, 2012

The delayed reference allocation is in the fast path of the IO, so use slabs
to improve the speed of the allocation.

Besides that, it lets us check for leaked objects when the module is removed.
Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
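
As a rough userspace analogy (not the kernel API), a fixed-size object cache with a live-object counter shows the two benefits named above: cheap allocation from a free list, and leak detection at teardown. Every name here (`obj_cache`, `cache_alloc`, and so on) is hypothetical; the patch itself presumably relies on the kernel's slab interfaces such as `kmem_cache_create()` and `kmem_cache_alloc()`.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical fixed-size object cache; a userspace stand-in for a slab. */
struct obj_cache {
    size_t obj_size;   /* size of each object, at least sizeof(void *) */
    void *free_list;   /* singly linked list of recycled objects */
    long live;         /* objects handed out and not yet freed */
};

static void cache_init(struct obj_cache *c, size_t obj_size)
{
    c->obj_size = obj_size < sizeof(void *) ? sizeof(void *) : obj_size;
    c->free_list = NULL;
    c->live = 0;
}

/* Fast path: pop a recycled object; fall back to malloc() when empty. */
static void *cache_alloc(struct obj_cache *c)
{
    void *obj = c->free_list;

    if (obj)
        c->free_list = *(void **)obj;
    else
        obj = malloc(c->obj_size);
    if (obj)
        c->live++;
    return obj;
}

/* Return an object to the cache by pushing it on the free list. */
static void cache_free(struct obj_cache *c, void *obj)
{
    *(void **)obj = c->free_list;
    c->free_list = obj;
    c->live--;
}

/* Tear down the cache; returns the number of leaked (never freed) objects. */
static long cache_destroy(struct obj_cache *c)
{
    while (c->free_list) {
        void *next = *(void **)c->free_list;

        free(c->free_list);
        c->free_list = next;
    }
    return c->live;
}
```

A non-zero return from `cache_destroy()` plays the role of the leak report that a slab cache can give at module unload.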

16 Feb, 2013 1 commit

btrfs: access superblock via pagecache in scan_one_device · 6f60cbd3

Authored by David Sterba on Feb 15, 2013

btrfs_scan_one_device is calling set_blocksize() which can race
with a concurrent process making dirty page cache pages.  It can end up
dropping dirty page cache pages on the floor, which isn't very nice when
someone is just running btrfs dev scan to find filesystems on the
box.

Now that udev is registering btrfs devices as it discovers them, we can
actually end up racing with our own mkfs program too.  When this
happens, we drop some of the important blocks written by mkfs.

This commit changes scan_one_device to read the super out of the page
cache instead of trying to use bread.  This way we don't have to care
about the blocksize of the device.

This also drops the invalidate_bdev() call.  It wasn't very polite to
invalidate during the scan either.  mkfs is putting the super into the
page cache, there's no reason to invalidate at this point.
Signed-off-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>

15 Feb, 2013 1 commit

Btrfs: fix crash in log replay with qgroups enabled · 2a745b14

Authored by Arne Jansen on Feb 13, 2013

When replaying a log tree with qgroups enabled, tree_mod_log_rewind does a
sanity-check of the number of items against the maximum possible number.
It calculates that number with the nodesize of fs_root. Unfortunately
fs_root is not yet set at this stage. So instead use the nodesize from
tree_root, which is already initialized.
Signed-off-by: Arne Jansen <sensille@gmx.net>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>

07 Feb, 2013 1 commit

Btrfs: move d_instantiate outside the transaction during mksubvol · 1a65e24b

Authored by Chris Mason on Feb 06, 2013

Dave Sterba triggered a lockdep complaint about lock ordering
between the sb_internal lock and the cleaner semaphore.

btrfs_lookup_dentry() checks for orphans if we're looking up
the inode for a subvolume, and subvolume creation is triggering
the lookup with a transaction running.

This commit moves the d_instantiate after the transaction closes.
Signed-off-by: Chris Mason <chris.mason@fusionio.com>

06 Feb, 2013 8 commits

Btrfs: fix EDQUOT handling in btrfs_delalloc_reserve_metadata · eb6b88d9

Authored by Jan Schmidt on Jan 27, 2013

When btrfs_qgroup_reserve returned a failure, we were missing a counter
operation for BTRFS_I(inode)->outstanding_extents++, leading to warning
messages about outstanding extents and space_info->bytes_may_use != 0.
Additionally, the error handling code didn't take into account that we
dropped the inode lock which might require more cleanup.

Luckily, all the cleanup code we need is already there and can be shared
with reserve_metadata_bytes, which is exactly what this patch does.
Reported-by: Lev Vainblat <lev@zadarastorage.com>
Signed-off-by: Jan Schmidt <list.btrfs@jan-o-sch.net>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>

Merge git://git.kernel.org/pub/scm/linux/kernel/git/josef/btrfs-next.git for-chris into for-linus · 24f8ebe9

Authored by Chris Mason on Feb 05, 2013

Btrfs: fix possible stale data exposure · 59fe4f41

Authored by Josef Bacik on Jan 30, 2013

We specifically do not update the disk i_size if there are ordered extents
outstanding for any area between the current disk_i_size and our ordered
extent so that we do not expose stale data. The problem is the check we
have only checks if the ordered extent starts at or after the current
disk_i_size, which doesn't take into account an ordered extent that starts
before the current disk_i_size and ends past the disk_i_size. Fix this by
checking if the extent ends past the disk_i_size. Thanks,
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
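
The difference between the two checks can be illustrated with a simplified model. This is only a sketch: `struct ordered_extent` and both helper names are hypothetical stand-ins, not the kernel's structures, and the real code walks the ordered extent tree rather than testing a single extent.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified ordered extent covering [file_offset, file_offset + len). */
struct ordered_extent {
    uint64_t file_offset;
    uint64_t len;
};

/* Old check: only an extent starting at or after disk_i_size blocks the update. */
static bool blocks_update_old(const struct ordered_extent *oe, uint64_t disk_i_size)
{
    return oe->file_offset >= disk_i_size;
}

/* Fixed check: any extent that ends past disk_i_size blocks the update. */
static bool blocks_update_new(const struct ordered_extent *oe, uint64_t disk_i_size)
{
    return oe->file_offset + oe->len > disk_i_size;
}
```

An extent that starts at 4096 with length 8192 straddles a `disk_i_size` of 8192: the old predicate misses it, the new one catches it.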

Btrfs: fix missing i_size update · 5d1f4020

Authored by Josef Bacik on Jan 30, 2013

If we have an ordered extent before the ordered extent we are currently
completing that is after the current disk_i_size we will put our i_size
update into that ordered extent so that we do not expose stale data. The
problem is that if our disk i_size is updated past the previous ordered
extent we won't update the i_size with the pending i_size update. So check
the pending i_size update and if it's above the current disk i_size we need
to go ahead and try to update. Thanks,
Signed-off-by: Josef Bacik <jbacik@fusionio.com>

Btrfs: fix race between snapshot deletion and getting inode · 6f1c3605

Authored by Liu Bo on Jan 29, 2013

While running the snapshot test script created by Mitch and David,
the race between autodefrag and snapshot deletion can corrupt the
dead_roots list, so that we can crash in btrfs_clean_old_snapshots().

Besides autodefrag, scrub does the same thing, i.e. it reads the
root first and then gets the inode.

Here is the story (taking autodefrag as an example):
(1) When we delete a snapshot or subvolume, it will set its root's
refs to zero and do an iput() on its own inode, and if this inode happens
to be the only active in-memory one in the root's inode rbtree, it will add
itself to the global dead_roots list for later cleanup.

(2) After (1), the autodefrag thread may read another inode for defrag,
and that inode is in the deleted snapshot/subvolume, but all of this happens
without checking whether the root is still valid (refs > 0). The end
result is that the deleted snapshot/subvolume's root is added to the global
dead_roots list AGAIN.

Fortunately, we already have an SRCU lock to avoid the race, i.e. subvol_srcu.

So all we need to do is take the lock to protect 'read root and get inode',
since we synchronize to wait for the RCU grace period before adding something
to the global dead_roots list.
Reported-by: Mitch Harder <mitch.harder@sabayonlinux.org>
Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>

Btrfs: fix missing release of the space/qgroup reservation in start_transaction() · 843fcf35

Authored by Miao Xie on Jan 28, 2013

When we fail to start a transaction, we need to release the reserved free space
and qgroup space. Fix it.
Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
Reviewed-by: Jan Schmidt <list.btrfs@jan-o-sch.net>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>

Btrfs: fix wrong sync_writers decrement in btrfs_file_aio_write() · 0a3404dc

Authored by Miao Xie on Jan 28, 2013

If the checks at the beginning of btrfs_file_aio_write() fail, we needn't
decrease ->sync_writers, because we have not increased it. Fix it.
Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
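
The pairing rule can be sketched with a toy write path. This is a hedged illustration only: `struct fake_inode` and `fake_file_write()` are hypothetical, and the real function's control flow is considerably more involved.

```c
#include <errno.h>

/* Hypothetical stand-in for the inode's sync_writers counter. */
struct fake_inode {
    int sync_writers;
};

/*
 * Simplified write path. early_error simulates a failed check at the
 * beginning; sync simulates a synchronous write that bumps the counter.
 */
static int fake_file_write(struct fake_inode *inode, int early_error, int sync)
{
    int ret = 0;

    if (early_error) {
        ret = -EINVAL;
        goto out;                /* counter untouched: skip the decrement */
    }

    if (sync)
        inode->sync_writers++;

    /* ... the actual write would happen here ... */

    if (sync)
        inode->sync_writers--;   /* paired with the increment above */
out:
    return ret;
}
```

The early-exit label sits after the decrement, so a failed up-front check can never drive the counter negative.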

Btrfs: do not merge logged extents if we've removed them from the tree · 222c81dc

Authored by Josef Bacik on Jan 28, 2013

You can run into this problem where if somebody is fsyncing and writing out
the existing extents you will have removed the extent map from the em tree,
but it's still valid for the current fsync so we go ahead and write it. The
problem is we unconditionally try to merge it back into the em tree, but if
we've removed it from the em tree that will cause use after free problems.
Fix this to only merge if we are still a part of the tree. Thanks,
Signed-off-by: Josef Bacik <jbacik@fusionio.com>

02 Feb, 2013 1 commit

btrfs: don't try to notify udev about missing devices · 3c911608

Authored by Eric Sandeen on Jan 31, 2013

If we remove a missing device, bdev is null, and if we
send that off to btrfs_kobject_uevent we'll panic.
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>

25 Jan, 2013 8 commits

Btrfs: fix repeated delalloc work allocation · 1eafa6c7

Authored by Miao Xie on Jan 22, 2013

btrfs_start_delalloc_inodes() locks the delalloc_inodes list, fetches the
first inode, unlocks the list, triggers btrfs_alloc_delalloc_work/
btrfs_queue_worker for this inode, and then it locks the list, checks the
head of the list again. But because we don't delete the first inode that it
deals with before, it will fetch the same inode. As a result, this function
allocates a huge amount of btrfs_delalloc_work structures, and OOM happens.

Fix this problem by splicing the delalloc list.
Reported-by: Alex Lyakas <alex.btrfs@zadarastorage.com>
Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
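
A minimal sketch of the splice idea, assuming a plain singly linked list instead of the kernel's `list_head`: the whole shared list is detached in one step and the private copy is walked, so no entry can be fetched twice. `struct node` and `start_work_spliced()` are hypothetical names, and locking is left out.

```c
#include <stddef.h>

/* Hypothetical node standing in for an inode on the delalloc list. */
struct node {
    struct node *next;
    int queued;   /* how many work items were allocated for this node */
};

/*
 * Detach the whole shared list in one step (the "splice"), then walk the
 * private copy. Each node is visited exactly once, so the number of work
 * items is bounded by the list length. In real code the detach step would
 * happen under the list lock.
 */
static int start_work_spliced(struct node **shared_head)
{
    struct node *local = *shared_head;   /* splice: take every node */
    int works = 0;

    *shared_head = NULL;                 /* shared list is now empty */

    while (local) {
        struct node *cur = local;

        local = cur->next;
        cur->next = NULL;                /* node is detached in this sketch */
        cur->queued++;                   /* stands in for allocating one work item */
        works++;
    }
    return works;
}
```

What happens to processed entries afterwards (for example, whether they return to the shared list) is not stated in the commit message, so the sketch simply leaves them detached.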

Btrfs: fix wrong max device number for single profile · c9f01bfe

Authored by Miao Xie on Jan 16, 2013

The max device number of single profile is 1, not 0 (0 means 'as many as
possible'). Fix it.

Cc: Liu Bo <bo.li.liu@oracle.com>
Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
Reviewed-by: Liu Bo <bo.li.liu@oracle.com>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>

Btrfs: fix missed transaction->aborted check · 2cba30f1

Authored by Miao Xie on Jan 15, 2013

First, though the current transaction->aborted check can stop the commit early
and avoid unnecessary operations, it is too early: some transaction handles
have not ended yet, and those handles may set transaction->aborted after the check.

Second, when we commit the transaction, we will wake up some worker threads to
flush the space cache and inode cache. Those threads also allocate some transaction
handles and may set transaction->aborted if some serious error happens.

So we need more checks for ->aborted when committing the transaction. Fix it.
Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>

Btrfs: Add ACCESS_ONCE() to transaction->abort accesses · 8d25a086

Authored by Miao Xie on Jan 15, 2013

We may access and update transaction->aborted on different CPUs without a
lock, so we need the ACCESS_ONCE() wrapper to prevent the compiler from creating
unsolicited accesses and to make sure we get the right value.
Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
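
For readers unfamiliar with the macro, here is a userspace sketch of how such accesses look. The macro body mirrors the classic kernel definition (written with `__typeof__`, a GNU C extension); `struct fake_transaction` and the two helpers are hypothetical and only illustrate the pattern.

```c
/* Classic shape of ACCESS_ONCE(): force one volatile access to the object. */
#define ACCESS_ONCE(x) (*(volatile __typeof__(x) *)&(x))

/* Hypothetical stand-in for the transaction structure. */
struct fake_transaction {
    int aborted;   /* may be written by another CPU without a lock */
};

/* Read the flag with a single load the compiler may not repeat or cache. */
static int transaction_aborted(struct fake_transaction *t)
{
    return ACCESS_ONCE(t->aborted);
}

/* Publish the error code with a single store. */
static void transaction_abort(struct fake_transaction *t, int err)
{
    ACCESS_ONCE(t->aborted) = err;
}
```

Note that this only constrains the compiler; it adds no memory barrier and no locking.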

Btrfs: put csums on the right ordered extent · e58dd74b

Authored by Josef Bacik on Jan 22, 2013

I noticed a WARN_ON going off when adding csums because we were going over
the amount of csum bytes that should have been allowed for an ordered
extent. This is a leftover from when we used to hold the csums privately
for direct io, but now we use the normal ordered sum stuff so we need to
make sure and check if we've moved on to another extent so that the csums
are added to the right extent. Without this we could end up with csums for
bytenrs that don't have extents to cover them yet. Thanks,
Signed-off-by: Josef Bacik <jbacik@fusionio.com>

Btrfs: use right range to find checksum for compressed extents · 192000dd

Authored by Liu Bo on Jan 06, 2013

For compressed extents, the checksum range is determined by the disk length,
which differs from the RAM length, so we need to use the disk length
instead to find the right checksum.
Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>

Btrfs: fix panic when recovering tree log · b0175117

Authored by Josef Bacik on Dec 18, 2012

A user reported a BUG_ON(ret) that occurred during tree log replay.  Ret was
-EAGAIN, so what I think happened is that we removed an extent that covered
a bitmap entry and an extent entry.  We remove the part from the bitmap and
return -EAGAIN and then search for the next piece we want to remove, which
happens to be an entire extent entry, so we just free the sucker and return.
The problem is ret is still set to -EAGAIN so we trip the BUG_ON().  The
user used btrfs-zero-log so I'm not 100% sure this is what happened so I've
added a WARN_ON() to catch the other possibility.  Thanks,
Reported-by: Jan Steffens <jan.steffens@gmail.com>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>

Btrfs: do not allow logged extents to be merged or removed · 201a9038

Authored by Josef Bacik on Jan 24, 2013

We drop the extent map tree lock while we're logging extents, so somebody
could come in and merge another extent into this one and screw up our
logging, or they could even remove us from the list which would keep us from
logging the extent or freeing our ref on it, so we need to make sure to not
clear LOGGING until after the extent is logged, and then we can merge it to
adjacent extents. Thanks,
Signed-off-by: Josef Bacik <jbacik@fusionio.com>

22 Jan, 2013 5 commits

Btrfs: fix a regression in balance usage filter · a105bb88

Authored by Ilya Dryomov on Jan 21, 2013

Commit 3fed40cc ("Btrfs: cleanup duplicated division functions"), which
was merged into 3.8-rc1, has introduced a regression by removing logic
that was guarding us against bad user input.  Bring it back.
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>

Merge branch 'mutex-ops@next-for-chris' of git://github.com/idryomov/btrfs-unstable into linus · 83bfccb5

Authored by Chris Mason on Jan 21, 2013

Merge branch 'for-chris' of... · daf2c089

Authored by Chris Mason on Jan 21, 2013

Merge branch 'for-chris' of git://git.kernel.org/pub/scm/linux/kernel/git/josef/btrfs-next into linus

Btrfs: prevent qgroup destroy when there are still relations · 2cf68703

Authored by Arne Jansen on Jan 17, 2013

Currently you can just destroy a qgroup even though it is in use by other qgroups
or has qgroups assigned to it. This patch prevents destruction of qgroups unless
they are completely unused. Otherwise destroy will return EBUSY.
Reported-by: Eric Hopper <hopper@omnifarious.org>
Signed-off-by: Arne Jansen <sensille@gmx.net>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
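
The rule described above amounts to a guard like the following. This is a sketch under stated assumptions: `struct fake_qgroup` reduces the relation lists to two counters, and `fake_qgroup_destroy()` is a hypothetical name, not the kernel function.

```c
#include <errno.h>

/* Hypothetical qgroup with its relations reduced to two counters. */
struct fake_qgroup {
    int nr_parents;   /* qgroups this one is assigned to */
    int nr_members;   /* qgroups assigned to this one */
    int destroyed;
};

/* Refuse to destroy a qgroup that still takes part in any relation. */
static int fake_qgroup_destroy(struct fake_qgroup *qg)
{
    if (qg->nr_parents || qg->nr_members)
        return -EBUSY;
    qg->destroyed = 1;
    return 0;
}
```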

Btrfs: ignore orphan qgroup relations · ff24858c

Authored by Arne Jansen on Jan 17, 2013

If a qgroup that still has assignments is deleted by the user, the corresponding
relations are left in the tree. This leads to an unmountable filesystem.
With this patch, those relations are simply ignored.
Reported-by: Eric Hopper <hopper@omnifarious.org>
Signed-off-by: Arne Jansen <sensille@gmx.net>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>

20 Jan, 2013 5 commits

Btrfs: reorder locks and sanity checks in btrfs_ioctl_defrag · 25122d15

Authored by Ilya Dryomov on Jan 20, 2013

The operation-specific check (whether the subvol is readonly or not) should go
after the mutual exclusiveness check.
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>

Btrfs: fix unlock order in btrfs_ioctl_rm_dev · 4ac20c70

Authored by Ilya Dryomov on Jan 20, 2013

Fix unlock order in btrfs_ioctl_rm_dev().
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>

Btrfs: fix unlock order in btrfs_ioctl_resize · 18f39c41

Authored by Ilya Dryomov on Jan 20, 2013

Fix unlock order in btrfs_ioctl_resize().
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>

Btrfs: fix "mutually exclusive op is running" error code · 2c0c9da0

Authored by Ilya Dryomov on Jan 20, 2013

The error code that is returned in response to starting a mutually
exclusive operation when there is one already running got silently
changed from EINVAL to EINPROGRESS by 5ac00add. Returning EINPROGRESS
to, say, add_dev, when rm_dev is running is misleading. Furthermore,
the operation itself may want to use EINPROGRESS for other purposes.
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
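
A non-blocking "only one such operation at a time" guard can be sketched with C11 atomics. This is an assumption-laden illustration, not the kernel code: `exclusive_op_running`, `exclusive_op_begin()` and `exclusive_op_end()` are hypothetical names, and the sketch assumes the commit goes back to returning EINVAL when the flag is already taken.

```c
#include <errno.h>
#include <stdatomic.h>

/* Hypothetical flag: non-zero while add_dev, rm_dev, resize, ... is running. */
static atomic_int exclusive_op_running;

/* Try to claim the flag; fail immediately instead of blocking. */
static int exclusive_op_begin(void)
{
    /* atomic_exchange() returns the previous value: 1 means it is taken. */
    if (atomic_exchange(&exclusive_op_running, 1))
        return -EINVAL;   /* not -EINPROGRESS, which would be misleading */
    return 0;
}

/* Release the flag once the operation finishes. */
static void exclusive_op_end(void)
{
    atomic_store(&exclusive_op_running, 0);
}
```

Keeping EINPROGRESS out of this path leaves that code free for an operation's own use, which is the second concern the message raises.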

Btrfs: bring back balance pause/resume logic · ed0fb78f

Authored by Ilya Dryomov on Jan 20, 2013

Balance pause/resume logic got broken by 5ac00add (went into 3.8-rc1
as part of the dev-replace merge). The offending commit took a stab at making
mutually exclusive volume operations (add_dev, rm_dev, resize, balance,
replace_dev) not block behind volume_mutex if another such operation is
in progress and instead return an error right away. The balancing front-end
relied on the blocking behaviour, so the fix is ugly, but short of a
complete rework, it's the best we can do.
Reported-by: Liu Bo <bo.li.liu@oracle.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>

15 Jan, 2013 9 commits

btrfs: update timestamps on truncate() · 3972f260

Authored by Eric Sandeen on Jan 12, 2013

truncate() and ftruncate() differ in the VFS; truncate()
doesn't set (ATTR_CTIME | ATTR_MTIME), and it's up to the
fs to do the timestamp updates if the size changes.
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Josef Bacik <josef@toxicpanda.com>

btrfs: fix btrfs_cont_expand() freeing IS_ERR em · f2767956

Authored by Zach Brown on Jan 08, 2013

btrfs_cont_expand() tries to free an IS_ERR em as it gets an error from
btrfs_get_extent() and breaks out of its loop.

An instance of -EEXIST was reported in the wild:

  https://bugzilla.redhat.com/show_bug.cgi?id=874407

I have no idea if that -EEXIST is surprising, or not.  Regardless, this
error handling should be cleaned up to handle other reasonable errors
(ENOMEM, EIO; whatever).

This seemed to be the only buggy freeing of the relatively rare IS_ERR
em so I opted to fix the caller rather than teach free_extent_map() to
use IS_ERR_OR_NULL().
Signed-off-by: Zach Brown <zab@redhat.com>
Reviewed-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
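
The caller-side approach can be sketched in userspace. The three helpers re-create the kernel's error-pointer convention for illustration; `struct extent_map_stub`, `put_extent_map()` and `expand_step()` are hypothetical, and the exact shape of the real fix is an assumption.

```c
#include <errno.h>
#include <stdlib.h>

#define MAX_ERRNO 4095

/* Userspace re-creations of the kernel's error-pointer helpers. */
static inline void *ERR_PTR(long error)
{
    return (void *)error;
}

static inline long PTR_ERR(const void *ptr)
{
    return (long)ptr;
}

static inline int IS_ERR(const void *ptr)
{
    return (unsigned long)ptr >= (unsigned long)-MAX_ERRNO;
}

struct extent_map_stub {
    int refs;
};

/* Hypothetical release helper that tolerates NULL but not error pointers. */
static int put_extent_map(struct extent_map_stub *em)
{
    if (!em)
        return 0;
    free(em);
    return 1;
}

/* Caller-side fix: turn an error pointer into NULL before the common cleanup. */
static long expand_step(struct extent_map_stub *em, int *freed)
{
    long err = 0;

    if (IS_ERR(em)) {
        err = PTR_ERR(em);
        em = NULL;
    }
    *freed = put_extent_map(em);
    return err;
}
```

Handling the error in the caller keeps the release helper's contract unchanged, which matches the trade-off the message describes.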

Btrfs: fix a bug when llseek for delalloc bytes behind prealloc extents · f9e4fb53

Authored by Liu Bo on Jan 07, 2013

xfstests case 285 complains.

This is because btrfs did not try to find unwritten delalloc
bytes (only dirty pages, not yet written back) behind prealloc
extents, so it ends up finding nothing when we use SEEK_DATA.
Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>

Btrfs: fix off-by-one in lseek · 1214b53f

Authored by Liu Bo on Jan 07, 2013

Lock end is inclusive.
Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
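
The message is terse, so a tiny worked example of the inclusive-end convention may help. The helper names are hypothetical and the exact expression changed by the patch is not shown in this log.

```c
#include <stdint.h>

/* Last byte covered by a range of len bytes starting at start (len > 0). */
static uint64_t inclusive_end(uint64_t start, uint64_t len)
{
    return start + len - 1;
}

/* Number of bytes in the inclusive range [start, end]. */
static uint64_t inclusive_len(uint64_t start, uint64_t end)
{
    return end - start + 1;
}
```

Locking `[start, start + len]` instead would cover one byte too many, which is the classic off-by-one with inclusive ends.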

Btrfs: reset path lock state to zero · 3268a246

Authored by Liu Bo on Dec 28, 2012

We forgot to reset the path lock state to zero after we unlock the path block,
and this can trigger the ASSERT checker in the tree unlock API.
Reported-by: Slava Barinov <rayslava@gmail.com>
Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>

Btrfs: let allocation start from the right raid type · ac5c9300

Authored by Liu Bo on Dec 27, 2012

This avoids empty looping.

Say we have only one disk, so the metadata raid type defaults to DUP;
then we do not need to start from index=0 (RAID10) and go through two empty
loops to reach index=2 (DUP).
Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
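
The saving can be illustrated with a small model of the per-type loop. Only index 0 (RAID10) and index 2 (DUP) come from the message above; the remaining index order, the `FLAG_*` bits and both function names are assumptions made for this sketch.

```c
#include <stdint.h>

/* RAID10 = 0 and DUP = 2 are from the commit message; the rest is assumed. */
enum raid_index {
    IDX_RAID10,
    IDX_RAID1,
    IDX_DUP,
    IDX_RAID0,
    IDX_SINGLE,
    NR_RAID_TYPES
};

/* Hypothetical profile bits used by this sketch only. */
#define FLAG_RAID10 (1u << 0)
#define FLAG_RAID1  (1u << 1)
#define FLAG_DUP    (1u << 2)
#define FLAG_RAID0  (1u << 3)

/* Map allocation flags to the index of the matching per-type list. */
static int start_index(uint32_t flags)
{
    if (flags & FLAG_RAID10)
        return IDX_RAID10;
    if (flags & FLAG_RAID1)
        return IDX_RAID1;
    if (flags & FLAG_DUP)
        return IDX_DUP;
    if (flags & FLAG_RAID0)
        return IDX_RAID0;
    return IDX_SINGLE;
}

/* Count loop iterations needed to reach the list matching flags. */
static int iterations_until_hit(uint32_t flags, int first)
{
    int want = start_index(flags);
    int n = 0;

    for (int i = first; i < NR_RAID_TYPES; i++) {
        n++;
        if (i == want)
            break;
    }
    return n;
}
```

Starting at index 0 for a DUP allocation takes three iterations, two of them empty; starting at `start_index(FLAG_DUP)` takes one.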

Btrfs: add orphan before truncating pagecache · f3fe820c

Authored by Josef Bacik on Jan 07, 2013

Running xfstests 83 in a loop would sometimes fail the fsck. This happens
because if we invalidate a page that already has an ordered extent setup for
it we will complete the ordered extent ourselves, assuming that the truncate
will clean everything up. The problem with this is there is plenty of time
for the truncate to fail after we've done this work. So to fix this we need
to add the orphan item first to make sure the cleanup gets done properly,
and then we can truncate the pagecache and all that stuff and be safe. This
fixes the btrfsck failures I was seeing while running 83 in a loop. Thanks,
Signed-off-by: Josef Bacik <jbacik@fusionio.com>

Btrfs: set flushing if we're limited flushing · 72bcd99d

Authored by Josef Bacik on Dec 18, 2012

We still need to say we're flushing if we're doing limited flushing, to keep
somebody from coming in and stealing our reservation.  Thanks,
Signed-off-by: Josef Bacik <jbacik@fusionio.com>

Btrfs: fix missing write access release in btrfs_ioctl_resize() · 97547676

Authored by Miao Xie on Dec 21, 2012

We forget to give up the write access after we find that some device operation
is going on. Fix it.
Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>