Commits · f48b90756bd834dda852ff514f2690d3175b1f44 · openeuler / raspberrypi-kernel

29 Jan, 2010 6 commits

Btrfs: do not mark the chunk as readonly if in degraded mode · f48b9075

Authored by Josef Bacik on Jan 27, 2010

If a RAID setup has chunks that span multiple disks, and one of those
disks has failed, btrfs_chunk_readonly will return 1 since one of the
disks in that chunk's stripes is dead and therefore not writeable. So
instead if we are in degraded mode, return 0 so we can go ahead and
allocate stuff. Without this patch all of the block groups in a RAID1
setup will end up read-only, which will mean we can't add new disks to
the array since we won't be able to make allocations.
Signed-off-by: Josef Bacik <josef@redhat.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>
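The decision described in this commit can be sketched as a small standalone model. This is not the kernel code: `struct stripe_dev`, `chunk_readonly` and the `degraded` parameter are hypothetical stand-ins for the real device structures and the degraded mount state.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for the device behind one stripe. */
struct stripe_dev {
	bool writeable;
};

/*
 * Returns 1 if the chunk should be treated as read-only, 0 otherwise.
 * `degraded` models the filesystem being mounted in degraded mode.
 */
static int chunk_readonly(const struct stripe_dev *stripes, size_t n,
			  bool degraded)
{
	size_t i;

	/* In degraded mode a dead disk is expected, so keep allocating. */
	if (degraded)
		return 0;

	for (i = 0; i < n; i++) {
		if (!stripes[i].writeable)
			return 1;
	}
	return 0;
}
```

With one dead stripe, the model reports the chunk as read-only unless `degraded` is set, which is the behavior the message describes.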

Btrfs: run orphan cleanup on default fs root · e3acc2a6

Authored by Josef Bacik on Jan 26, 2010

This patch reverts commit 6c090a11, since it introduces a problem where
we can run orphan cleanup on a volume that can have orphan entries
re-added.  Yan Zheng pointed out that, instead of my original fix, we can
just revert it and then run the orphan cleanup in open_ctree after we
look up the fs_root.  I have tested this with all the tests that gave me
problems and this patch fixes both problems.  Thanks,
Signed-off-by: Josef Bacik <josef@redhat.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

Btrfs: fix a memory leak in btrfs_init_acl · f858153c

Authored by Yang Hongyang on Jan 26, 2010

In btrfs_init_acl(), the cloned acl is not released.
Signed-off-by: Yang Hongyang <yanghy@cn.fujitsu.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

Btrfs: Use correct values when updating inode i_size on fallocate · d1ea6a61

Authored by Aneesh Kumar K.V on Jan 20, 2010

commit f2bc9dd07e3424c4ec5f3949961fe053d47bc825
Author: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Date:   Wed Jan 20 12:57:53 2010 +0530

    Btrfs: Use correct values when updating inode i_size on fallocate

    Even though we allocate more, we should be updating inode i_size
    as per the arguments passed
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>
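The distinction between what is allocated and what i_size should become can be shown with a little arithmetic. The sketch below is an illustration only: `struct falloc_plan`, `plan_fallocate` and the power-of-two `blocksize` parameter are assumptions made for the example, and flags that keep the file size unchanged are ignored.

```c
#include <stdint.h>

/* Hypothetical result type: the block-aligned range and the new i_size. */
struct falloc_plan {
	uint64_t alloc_start; /* offset rounded down to a block boundary */
	uint64_t alloc_end;   /* offset + len rounded up to a block boundary */
	uint64_t new_i_size;  /* derived from the caller's arguments only */
};

/* blocksize is assumed to be a power of two. */
static struct falloc_plan plan_fallocate(uint64_t offset, uint64_t len,
					 uint64_t blocksize,
					 uint64_t cur_i_size)
{
	struct falloc_plan p;
	uint64_t mask = blocksize - 1;
	uint64_t requested_end = offset + len;

	/* The allocation covers whole blocks, so it may exceed the request. */
	p.alloc_start = offset & ~mask;
	p.alloc_end = (requested_end + mask) & ~mask;

	/* i_size follows the requested end, not the rounded-up one. */
	p.new_i_size = requested_end > cur_i_size ? requested_end : cur_i_size;
	return p;
}
```

For offset 100, length 5000 and 4096-byte blocks, the allocation spans [0, 8192) while the size becomes 5100.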

Btrfs: remove tree_search() in extent_map.c · b8d9bfeb

Authored by Miao Xie on Dec 15, 2009

This patch removes tree_search() in extent_map.c because it is not called by
anything.
Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

Btrfs: Add mount -o compress-force · a555f810

Authored by Chris Mason on Jan 28, 2010

The default btrfs mount -o compress mode will quickly back off
compressing a file if it notices that compression does not reduce the
size of the data being written.  This can save considerable CPU because
all future writes to the file go through uncompressed.

But some files are both very large and have mixed data stored in
them.  In that case, we want to add the ability to always try
compressing data before writing it.

This commit adds mount -o compress-force.  A later commit will add
a new inode flag that does the same thing.
Signed-off-by: Chris Mason <chris.mason@oracle.com>
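A minimal model of the back-off behavior and of how a force option bypasses it might look as follows. The names `struct file_state`, `should_try_compress` and `note_compress_result` are hypothetical, and the per-file `nocompress` flag is an assumption about how the back-off is remembered.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-file state: set once compression stopped paying off. */
struct file_state {
	bool nocompress;
};

/* With force set, always try; otherwise honor the back-off flag. */
static bool should_try_compress(const struct file_state *f, bool force)
{
	return force || !f->nocompress;
}

/* Record the outcome of one compression attempt. */
static void note_compress_result(struct file_state *f, size_t in_len,
				 size_t out_len, bool force)
{
	/* Back off only in the default mode, and only if nothing was saved. */
	if (!force && out_len >= in_len)
		f->nocompress = true;
}
```

After one attempt that saves nothing, the default mode stops trying for that file, while the forced mode keeps trying on every write.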

18 Jan, 2010 7 commits

Btrfs: fix possible panic on unmount · 11dfe35a

Authored by Josef Bacik on Nov 13, 2009

We can race with the unmount of an fs and the stopping of a kthread where we
will free the block group before we're done using it. The reason for this is
that we do not hold a reference on the block group while it's caching, since
the allocator drops its reference once it exits or moves on to the next block
group. This patch fixes the problem by taking a reference to the block group
before we start caching and dropping it when we're done to make sure all
accesses to the block group are safe. Thanks,
Signed-off-by: Josef Bacik <josef@redhat.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>
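The reference-holding pattern described here can be sketched in a single-threaded model. All names (`struct block_group_model`, `bg_get`, `bg_put`, `start_caching`, `finish_caching`) are hypothetical, and a plain `int` stands in for what would need to be an atomic counter in concurrent code.

```c
/* Hypothetical model of a refcounted block group. */
struct block_group_model {
	int refs;  /* plain int here; real concurrent code needs atomics */
	int freed; /* set when the last reference is dropped */
};

static void bg_get(struct block_group_model *bg)
{
	bg->refs++;
}

static void bg_put(struct block_group_model *bg)
{
	if (--bg->refs == 0)
		bg->freed = 1;
}

/* Caching pins the block group for as long as it runs. */
static void start_caching(struct block_group_model *bg)
{
	bg_get(bg);
}

static void finish_caching(struct block_group_model *bg)
{
	bg_put(bg);
}
```

If the allocator drops its own reference while caching is still in progress, the block group stays alive until `finish_caching` runs.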

Btrfs: deal with NULL acl sent to btrfs_set_acl · a9cc71a6

Authored by Chris Mason on Jan 17, 2010

It is legal for btrfs_set_acl to be sent a NULL acl.  This
makes sure we don't dereference it.  A similar patch was sent by
Johannes Hirte <johannes.hirte@fem.tu-ilmenau.de>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

Btrfs: fix regression in orphan cleanup · 6c090a11

Authored by Josef Bacik on Jan 15, 2010

Currently orphan cleanup only ever gets triggered if we cross subvolumes during
a lookup, which means that if we just mount a plain jane fs that has orphans in
it, they will never get cleaned up.  This results in panics like this one

http://www.kerneloops.org/oops.php?number=1109085

where adding an orphan entry results in -EEXIST being returned and we panic.  In
order to fix this, we check on lookup whether our root has had the orphan
cleanup done, and if not go ahead and do it.  This is easily reproducible by
running this testcase

#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>
#include <stdio.h>

int main(int argc, char **argv)
{
	char data[4096];
	char newdata[4096];
	int fd1, fd2;

	memset(data, 'a', 4096);
	memset(newdata, 'b', 4096);

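	/* Loop until creat() fails; the power is pulled while this is running. */
	while (1) {
		int i;

		/* Write 512 x 4096 bytes of 'a' to file1 and fsync it. */
		fd1 = creat("file1", 0666);
		if (fd1 < 0)
			break;

		for (i = 0; i < 512; i++)
			write(fd1, data, 4096);

		fsync(fd1);
		close(fd1);

		/* Extend file2 to the same size, then fill it with 'b' (no fsync). */
		fd2 = creat("file2", 0666);
		if (fd2 < 0)
			break;

		ftruncate(fd2, 4096 * 512);

		for (i = 0; i < 512; i++)
			write(fd2, newdata, 4096);
		close(fd2);

		/* Rename file2 over file1, then unlink the result. */
		i = rename("file2", "file1");
		unlink("file1");
	}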

	return 0;
}

and then pulling the power on the box, and then trying to run that test again
when the box comes back up.  I've tested this locally and it fixes the problem.
Thanks to Tomas Carnecky for helping me track this down initially.
Signed-off-by: Josef Bacik <josef@redhat.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

Btrfs: Fix race in btrfs_mark_extent_written · 6c7d54ac

Authored by Yan, Zheng on Jan 15, 2010

Fix a bug reported by Johannes Hirte. The bug occurs because
btrfs_del_items is called after btrfs_duplicate_item, and
btrfs_del_items triggers a tree balance. The fix is to check for
that case and call btrfs_search_slot when needed.
Signed-off-by: Yan Zheng <zheng.yan@oracle.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

Btrfs, fix memory leaks in error paths · 2423fdfb

Authored by Jiri Slaby on Jan 06, 2010

Stanse found 2 memory leaks in relocate_block_group and
__btrfs_map_block. cluster and multi are not freed/assigned on all
paths. Fix that.
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Cc: linux-btrfs@vger.kernel.org
Signed-off-by: Chris Mason <chris.mason@oracle.com>

Btrfs: align offsets for btrfs_ordered_update_i_size · a038fab0

Authored by Yan, Zheng on Dec 28, 2009

Some callers of btrfs_ordered_update_i_size can now pass in
a NULL for the ordered extent to update against.  This makes
sure we properly align the offset they pass in when deciding
how much to bump the on disk i_size.
Signed-off-by: Chris Mason <chris.mason@oracle.com>

btrfs: fix missing last-entry in readdir(3) · 406266ab

Authored by Jan Engelhardt on Dec 09, 2009

parent 49313cdac7b34c9f7ecbb1780cfc648b1c082cd7 (v2.6.32-1-g49313cd)
commit ff48c08e1c05c67e8348ab6f8a24de8034e0e34d
Author: Jan Engelhardt <jengelh@medozas.de>
Date:   Wed Dec 9 22:57:36 2009 +0100

Btrfs: fix missing last-entry in readdir(3)

When one does a 32-bit readdir(3), the last entry of a directory is
missing. This is however not due to passing a large value to filldir,
but it seems to have to do with glibc doing telldir or something
quirky. In any case, this patch fixes it in practice.
Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

18 Dec, 2009 15 commits

Btrfs: make sure fallocate properly starts a transaction · 3a1abec9

Authored by Chris Mason on Dec 17, 2009

The recent patch to make fallocate enospc friendly would send
down a NULL trans handle to the allocator.  This moves the
transaction start to properly fix things.
Signed-off-by: Chris Mason <chris.mason@oracle.com>

Btrfs: make metadata chunks smaller · 83d3c969

Authored by Josef Bacik on Dec 07, 2009

This patch makes us a bit less zealous about making sure we have enough free
metadata space by paring down the size of new metadata chunks to 256MB instead
of 1GB.  Also, we used to try to allocate metadata chunks when allocating data,
but that sort of thing is done elsewhere now so we can just remove it.  With my
-ENOSPC test I used to have 3GB reserved for metadata out of 75GB, now I have
1.7GB.  Thanks,
Signed-off-by: Josef Bacik <josef@redhat.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

Btrfs: Show discard option in /proc/mounts · 20a5239a

Authored by Matthew Wilcox on Dec 14, 2009

Christoph's patch e244a0ae doesn't display
the discard option in /proc/mounts, leading to some confusion for me.
Here's the missing bit.
Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

Btrfs: deny sys_link across subvolumes. · 4a8be425

Authored by TARUISI Hiroaki on Nov 12, 2009

I rebased Christian Parpart's patch to deny hard links across
subvolumes. The original patch also modifies btrfs_rename, but
I excluded that part because we can move across subvolumes now and
it causes no problems.
-----------------

Hard links across subvolumes should not be allowed in Btrfs.
btrfs_link checks that the root of the 'to' directory is the same as
the root of the 'from' file. If they differ, btrfs_link returns -EPERM.
Signed-off-by: TARUISI Hiroaki <taruishi.hiroak@jp.fujitsu.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>
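The root comparison described above can be sketched as follows. `struct fs_root_model`, `struct inode_model` and `link_allowed` are hypothetical names for this illustration; only the -EPERM result comes from the commit message.

```c
#include <errno.h>

/* Hypothetical stand-ins: each inode remembers the subvolume root it lives in. */
struct fs_root_model {
	int id;
};

struct inode_model {
	const struct fs_root_model *root;
};

/* Returns 0 if the link may proceed, -EPERM if it would cross subvolumes. */
static int link_allowed(const struct inode_model *to_dir,
			const struct inode_model *from_file)
{
	if (to_dir->root != from_file->root)
		return -EPERM;
	return 0;
}
```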

Btrfs: fail mount on bad mount options · a7a3f7ca

Authored by Sage Weil on Nov 07, 2009

We shouldn't silently ignore unrecognized options.
Signed-off-by: Sage Weil <sage@newdream.net>
Signed-off-by: Chris Mason <chris.mason@oracle.com>
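As an illustration of rejecting unknown options rather than skipping them, here is a small standalone parser for a comma-separated option string. The option table, the function name `parse_options_model` and the choice of -EINVAL as the error value are assumptions made for this sketch, not taken from the patch.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical option table, limited to names mentioned on this page. */
static const char *const known_opts[] = {
	"compress", "compress-force", "degraded", "discard",
};

/* Returns 0 if every option is recognized, -EINVAL otherwise (assumed). */
static int parse_options_model(const char *options)
{
	const char *p = options;

	while (*p) {
		size_t len = strcspn(p, ","); /* length of the current token */
		size_t i;
		int found = 0;

		for (i = 0; i < sizeof(known_opts) / sizeof(known_opts[0]); i++) {
			if (strlen(known_opts[i]) == len &&
			    strncmp(p, known_opts[i], len) == 0) {
				found = 1;
				break;
			}
		}
		/* Empty tokens are skipped; anything else must be known. */
		if (len > 0 && !found)
			return -EINVAL;

		p += len;
		if (*p == ',')
			p++;
	}
	return 0;
}
```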

Btrfs: don't add extent 0 to the free space cache v2 · 06b2331f

Authored by Yan, Zheng on Nov 26, 2009

If block group 0 is completely free, btrfs_read_block_groups will
add extent [0, BTRFS_SUPER_INFO_OFFSET) to the free space cache.
Signed-off-by: Yan Zheng <zheng.yan@oracle.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>
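One way to keep the range [0, BTRFS_SUPER_INFO_OFFSET) out of a free-space cache is to clip free ranges before they are added. The sketch below is a guess at the idea, not the patch itself: `clip_free_range` and `struct free_range` are hypothetical, and the 64 KiB constant is an assumed value for the superblock offset.

```c
#include <stdint.h>

/* Assumed value for illustration: superblock offset of 64 KiB. */
#define SUPER_INFO_OFFSET_MODEL ((uint64_t)64 * 1024)

struct free_range {
	uint64_t start;
	uint64_t len;
};

/* Drop any part of [start, start + len) that lies below the offset. */
static struct free_range clip_free_range(uint64_t start, uint64_t len)
{
	struct free_range r = { start, len };

	if (start < SUPER_INFO_OFFSET_MODEL) {
		uint64_t skip = SUPER_INFO_OFFSET_MODEL - start;

		if (skip > len)
			skip = len; /* the whole range is below the offset */
		r.start = start + skip;
		r.len = len - skip;
	}
	return r;
}
```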

Btrfs: Fix per root used space accounting · 86b9f2ec

Authored by Yan, Zheng on Nov 12, 2009

The bytes_used field in root item was originally planned to
trace the amount of used data and tree blocks. But it never
worked right since we can't trace freeing of data accurately.
This patch changes it to only trace the amount of tree blocks.
Signed-off-by: Yan Zheng <zheng.yan@oracle.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

Btrfs: Fix btrfs_drop_extent_cache for skip pinned case · 55ef6899

Authored by Yan, Zheng on Nov 12, 2009

The check for the skip-pinned case is wrong; it may break out of the
while loop too soon.
Signed-off-by: Yan Zheng <zheng.yan@oracle.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

Btrfs: Add delayed iput · 24bbcf04

Authored by Yan, Zheng on Nov 12, 2009

iput() can trigger new transactions if we are dropping the
final reference, so calling it in btrfs_commit_transaction
may end up deadlocking. This patch adds delayed iput to avoid
the issue.
Signed-off-by: Yan Zheng <zheng.yan@oracle.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>
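The deferral idea can be modeled with a small queue: instead of dropping a reference at an unsafe point, the inode is queued and the drop happens later. Everything here (`struct delayed_iputs`, `add_delayed_iput_model`, `run_delayed_iputs_model`, the fixed capacity) is a hypothetical sketch rather than the kernel implementation.

```c
#include <stddef.h>

#define MAX_DELAYED 16 /* arbitrary capacity for this sketch */

struct inode_model {
	int refs;
};

struct delayed_iputs {
	struct inode_model *pending[MAX_DELAYED];
	size_t count;
};

/* Queue the reference drop instead of doing it now. Returns -1 if full. */
static int add_delayed_iput_model(struct delayed_iputs *q,
				  struct inode_model *inode)
{
	if (q->count == MAX_DELAYED)
		return -1;
	q->pending[q->count++] = inode;
	return 0;
}

/* Called from a safe context: perform all deferred drops. */
static size_t run_delayed_iputs_model(struct delayed_iputs *q)
{
	size_t i, n = q->count;

	for (i = 0; i < n; i++)
		q->pending[i]->refs--;
	q->count = 0;
	return n;
}
```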

Btrfs: Pass transaction handle to security and ACL initialization functions · f34f57a3

Authored by Yan, Zheng on Nov 12, 2009

Pass transaction handle down to security and ACL initialization
functions, so we can avoid starting nested transactions
Signed-off-by: Yan Zheng <zheng.yan@oracle.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

Btrfs: Make truncate(2) more ENOSPC friendly · 8082510e

Authored by Yan, Zheng on Nov 12, 2009

Truncating and deleting regular files are unbounded operations,
so it's not good to do them in a single transaction. This
patch makes btrfs_truncate and btrfs_delete_inode start a
new transaction after all items in a tree leaf are deleted.
Signed-off-by: Yan Zheng <zheng.yan@oracle.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>
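The batching described above bounds the work done per transaction. A toy model of the loop, with a hypothetical `truncate_transactions` function that only counts how many transactions would be started, could look like this:

```c
/*
 * Toy model: delete `total_items` items, starting a fresh transaction for
 * each leaf's worth of items. Returns the number of transactions used.
 */
static unsigned int truncate_transactions(unsigned int total_items,
					  unsigned int items_per_leaf)
{
	unsigned int transactions = 0;

	if (items_per_leaf == 0)
		return 0; /* guard against a loop that never makes progress */

	while (total_items > 0) {
		unsigned int batch = total_items < items_per_leaf ?
				     total_items : items_per_leaf;

		transactions++;        /* start a new transaction */
		total_items -= batch;  /* delete one leaf's worth of items */
		/* the transaction would end here, releasing its reservation */
	}
	return transactions;
}
```

Ten items at four per leaf take three transactions, each bounded in size, instead of one transaction that grows with the file.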

Btrfs: Make fallocate(2) more ENOSPC friendly · 5a303d5d

Authored by Yan, Zheng on Nov 12, 2009

fallocate(2) may allocate a large number of file extents, so it's not
good to do it in a single transaction. This patch makes fallocate(2)
start a new transaction for each file extent it allocates.
Signed-off-by: Yan Zheng <zheng.yan@oracle.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

Btrfs: Avoid orphan inodes cleanup during committing transaction · 2e4bfab9

Authored by Yan, Zheng on Nov 12, 2009

btrfs_lookup_dentry may trigger orphan cleanup, so it's not good
to call it while committing a transaction.
Signed-off-by: Yan Zheng <zheng.yan@oracle.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

Btrfs: Avoid orphan inodes cleanup while replaying log · c71bf099

Authored by Yan, Zheng on Nov 12, 2009

We do log replay in a single transaction, so it's not good to do unbounded
operations. This patch moves orphan inode cleanup to after the log has been
replayed. It also avoids doing other unbounded operations, such as truncating
a file, while replaying the log. These unbounded operations are postponed to
the orphan inode cleanup stage.
Signed-off-by: Yan Zheng <zheng.yan@oracle.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

Btrfs: Fix disk_i_size update corner case · c2167754

Authored by Yan, Zheng on Nov 12, 2009

There are some cases where file extents are inserted without involving
an ordered struct. In these cases, we update disk_i_size directly,
without checking for pending ordered extents and the DELALLOC bit. This
patch extends btrfs_ordered_update_i_size() to handle these cases.
Signed-off-by: Yan Zheng <zheng.yan@oracle.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

16 Dec, 2009 3 commits

Btrfs: Rewrite btrfs_drop_extents · 920bbbfb

Authored by Yan, Zheng on Nov 12, 2009

Rewrite btrfs_drop_extents by using btrfs_duplicate_item, so we can
avoid calling lock_extent within transaction.
Signed-off-by: Yan Zheng <zheng.yan@oracle.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

Btrfs: Add btrfs_duplicate_item · ad48fd75

Authored by Yan, Zheng on Nov 12, 2009

btrfs_duplicate_item duplicates an item with a new key, guaranteeing
that the source item and the new item are in the same tree leaf and
contiguous. It allows us to split a file extent in place, without
using lock_extent to prevent the bookend extent race.
Signed-off-by: Yan Zheng <zheng.yan@oracle.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

Btrfs: Avoid superfluous tree-log writeout · 8cef4e16

Authored by Yan, Zheng on Nov 12, 2009

We allow two log transactions at a time, but use the same flag
to mark dirty tree-log btree blocks. So we may flush dirty
blocks belonging to the newer log transaction when committing a
log transaction. This patch fixes the issue by using two
flags to mark dirty tree-log btree blocks.
Signed-off-by: Yan Zheng <zheng.yan@oracle.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>
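A small model shows why two marks are enough when at most two log transactions are in flight. The names (`LOG_MARK_A`, `LOG_MARK_B`, `mark_for`, `dirty_block`, `commit_log_model`) and the choice of selecting the mark by transaction-id parity are assumptions made for this sketch.

```c
#include <stddef.h>

/* Two hypothetical dirty marks, one per in-flight log transaction. */
enum {
	LOG_MARK_A = 1 << 0,
	LOG_MARK_B = 1 << 1,
};

struct log_block {
	unsigned int marks;
};

/* Assumed scheme: alternate marks by the parity of the transaction id. */
static unsigned int mark_for(unsigned long transid)
{
	return (transid % 2) ? LOG_MARK_B : LOG_MARK_A;
}

static void dirty_block(struct log_block *b, unsigned long transid)
{
	b->marks |= mark_for(transid);
}

/* Flush only blocks dirtied by `transid`; returns how many were flushed. */
static size_t commit_log_model(struct log_block *blocks, size_t n,
			       unsigned long transid)
{
	unsigned int mark = mark_for(transid);
	size_t i, flushed = 0;

	for (i = 0; i < n; i++) {
		if (blocks[i].marks & mark) {
			blocks[i].marks &= ~mark;
			flushed++;
		}
	}
	return flushed;
}
```

Committing transaction 4 leaves blocks dirtied by transaction 5 untouched, which is the superfluous writeout the commit avoids.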

12 Nov, 2009 9 commits

Btrfs: fix panic when trying to destroy a newly allocated · a6dbd429

Authored by Josef Bacik on Nov 11, 2009

There is a problem where iget5_locked will look for an inode, not find it, and
then subsequently try to allocate it. Another CPU will have raced in and
allocated the inode instead, so when iget5_locked gets the inode spin lock again
and does a search, it finds the new inode. So it goes ahead and calls
destroy_inode on the inode it just allocated. The problem is we don't set
BTRFS_I(inode)->root until the new inode is completely initialized. This patch
makes us set root to NULL when alloc'ing a new inode, so when we get to
btrfs_destroy_inode and we see that root is NULL we can just free up the memory
and continue on. This fixes the panic

http://www.kerneloops.org/submitresult.php?number=812690

Thanks,
Signed-off-by: Josef Bacik <josef@redhat.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

Btrfs: allow more metadata chunk preallocation · 33b25808

Authored by Chris Mason on Nov 11, 2009

On an FS where not all of the space has been allocated into chunks yet,
the enospc code can return ENOSPC just because the existing metadata chunks
are full.

We get around this by allowing more metadata chunks to be allocated up
to a certain limit, and finding the right limit is a little fuzzy.  The
problem is the reservations for delalloc would preallocate way too much
of the FS as metadata.  We need to start saying no and just force some
IO to happen.

But we also need to let a reasonable amount of the FS become metadata.
This bumps the hard limit up; later releases will have a better system.
Signed-off-by: Chris Mason <chris.mason@oracle.com>

Btrfs: fallback on uncompressed io if compressed io fails · f5a84ee3

Authored by Josef Bacik on Nov 10, 2009

Currently compressed IO does not deal with not having its entire extent able to
be allocated. So if we have enough free space to allocate for the extent, but
it's not contiguous, it will fail spectacularly. This patch fixes this by
falling back on uncompressed IO, which lets us spread the delalloc extent across
multiple extents. I tested this by making us randomly think the reservation had
failed, to make it fall back on the uncompressed IO path, and it seemed to work
fine. Thanks,
Signed-off-by: Josef Bacik <josef@redhat.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

Btrfs: find ideal block group for caching · ccf0e725

Authored by Josef Bacik on Nov 10, 2009

This patch changes a few things. Hopefully the comments are helpful, but
I'll try to be verbose here as well.

Problem:

My Fedora box was taking 1 minute and 21 seconds to boot with btrfs as root.
Part of this problem was we pick the first block group we can find and start
caching it, even if it may not have enough free space. The other problem is
we only search for cached block groups the first time around, and we won't
find any cached block groups because this is a newly mounted fs, so we end up
caching several block groups during bootup, which with a lot of fragmentation
takes around 30-45 seconds to complete, which bogs down the system.

Solution:

1) Don't cache block groups willy-nilly at first. Instead try and figure out
which block group has the most free, and therefore will take the least amount
of time to cache.

2) Don't be so picky about cached block groups. The other problem is once
we've filled up a cluster, if the block group isn't finished caching the next
time we try and do the allocation we'll completely ignore the cluster and
start searching from the beginning of the space, which makes us cache more
block groups, which slows us down even more. So instead of skipping block
groups that are not finished caching when we have a hint, only skip the block
group if it hasn't started caching yet.

There is one other tweak in here. Before if we allocated a chunk and still
couldn't find new space, we'd end up switching the space info to force another
chunk allocation. This could make us end up with way too many chunks, so keep
track of this particular case.

With this patch and my previous cluster fixes my fedora box now boots in 43
seconds, and according to the bootchart is not held up by our block group
caching at all.
Signed-off-by: Josef Bacik <josef@redhat.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>
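Point 1 of the solution, choosing the block group with the most free space, can be sketched as a simple scan. `struct block_group_info` and `pick_ideal_block_group` are hypothetical names, and the model assumes `used` never exceeds `size`.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical summary of a block group; assumes used <= size. */
struct block_group_info {
	uint64_t size;
	uint64_t used;
};

/* Returns the index of the block group with the most free space, or -1. */
static int pick_ideal_block_group(const struct block_group_info *bgs, size_t n)
{
	size_t i;
	int best = -1;
	uint64_t best_free = 0;

	for (i = 0; i < n; i++) {
		uint64_t free_bytes = bgs[i].size - bgs[i].used;

		/* Keep the first group seen, then any with strictly more free. */
		if (best < 0 || free_bytes > best_free) {
			best = (int)i;
			best_free = free_bytes;
		}
	}
	return best;
}
```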

Btrfs: avoid null deref in unpin_extent_cache() · 4eb3991c

Authored by Dan Carpenter on Nov 10, 2009

I reordered the checks to avoid dereferencing "em" if it was null.

Found by smatch static checker.
Signed-off-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>
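The general pattern behind this kind of fix is to test the pointer before any field access. The sketch below uses a hypothetical `struct extent_map_model` and `em_starts_at` helper; it illustrates the ordering of checks and is not the code from unpin_extent_cache().

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, reduced stand-in for an extent map entry. */
struct extent_map_model {
	uint64_t start;
};

/*
 * The NULL test comes first; && short-circuits, so em->start is never
 * read when em is NULL.
 */
static int em_starts_at(const struct extent_map_model *em, uint64_t start)
{
	return em != NULL && em->start == start;
}
```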

Btrfs: skip btrfs_release_path in btrfs_update_root and btrfs_del_root · df66916e

Authored by Li Dongyang on Nov 06, 2009

We don't need to call btrfs_release_path because btrfs_free_path will do
that for us.
Signed-off-by: Li Dongyang <Jerry87905@gmail.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

Btrfs: fix some metadata enospc issues · 5df6a9f6

Authored by Josef Bacik on Nov 10, 2009

We weren't reserving metadata space for rename, rmdir and unlink, which could
cause problems.
Signed-off-by: Josef Bacik <josef@redhat.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

Btrfs: fix how we set max_size for free space clusters · 01dea1ef

Authored by Josef Bacik on Nov 10, 2009

This patch fixes a problem where max_size can be set to 0 even though we
filled the cluster properly. We set max_size to 0 if we restart the cluster
window, but if the new start entry is big enough to be our new cluster then we
could return with a max_size set to 0, which will mean the next time we try to
allocate from this cluster it will fail. So set max_extent to the entry's
size. Tested this on my box and now we actually allocate from the cluster
after we fill it. Thanks,
Signed-off-by: Josef Bacik <josef@redhat.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

Btrfs: cleanup transaction starting and fix journal_info usage · 249ac1e5

Authored by Josef Bacik on Nov 10, 2009

We use journal_info to tell if we're in a nested transaction to make sure we
don't commit the transaction within a nested transaction. We use another
method to see if there are any outstanding ioctl trans handles, so if we're
starting one do not set current->journal_info, since it will screw with other
filesystems. This patch also cleans up the starting stuff so there aren't any
magic numbers.
Signed-off-by: Josef Bacik <josef@redhat.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>