Commits · 1e701a3292e25a6c4939cad9f24951dc6b6ad853 · openeuler / raspberrypi-kernel

15 Mar, 2010 9 commits

Btrfs: add new defrag-range ioctl. · 1e701a32

Committed by Chris Mason on Mar 11, 2010

The btrfs defrag ioctl was limited to doing the entire file.  This
commit adds a new interface that can defrag a specific range inside
the file.

It can also force compression on the file, allowing you to selectively
compress individual files after they were created, even when mount -o
compress isn't turned on.
Signed-off-by: Chris Mason <chris.mason@oracle.com>

Btrfs: be more selective in the defrag ioctl · 940100a4

Committed by Chris Mason on Mar 10, 2010

The btrfs defrag ioctl had some bugs around delalloc accounting, and it
wasn't properly skipping pages that were not in the mapping.

It wasn't properly clearing the page checked flag, which could make the
writeback code ignore the page forever while pinning it as dirty.

This commit fixes those problems and makes defrag a little smarter. It
skips holes and it doesn't waste time defragging large extents. If a
tiny extent comes before a very large extent, it will defrag both of
them to make sure the tiny extent ends up next to something big.
Signed-off-by: Chris Mason <chris.mason@oracle.com>
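
The selection rule described in this commit can be sketched in C roughly as follows. This is an illustration only, not the kernel code: `struct extent_sketch`, `should_defrag_sketch()` and the caller-supplied threshold are hypothetical names, and letting a hole reset the "previous extent was small" state is an assumption.

```c
#include <stdint.h>

/* Hypothetical, simplified view of one file extent (not a btrfs type). */
struct extent_sketch {
	int is_hole;    /* nonzero if the range has no backing extent */
	uint64_t len;   /* extent length in bytes */
};

/*
 * Decide whether an extent should be rewritten by defrag.
 * *prev_small remembers whether the previous extent was below the
 * threshold, so a large extent that directly follows a tiny one is
 * rewritten together with it.
 */
static int should_defrag_sketch(const struct extent_sketch *em,
				uint64_t thresh, int *prev_small)
{
	int defrag;

	if (em->is_hole) {
		*prev_small = 0;   /* assumption: a hole breaks the run */
		return 0;
	}
	if (em->len < thresh) {
		*prev_small = 1;
		return 1;
	}
	/* Large extent: only worth rewriting right after a small one. */
	defrag = *prev_small;
	*prev_small = 0;
	return defrag;
}
```

A caller would walk the extents of the file in order, starting with `prev_small` set to 0, and rewrite only the extents for which the function returns nonzero.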

Btrfs: run the backing dev more often in the submit_bio helper · 51684082

Committed by Chris Mason on Mar 10, 2010

The submit_bio helper thread can decide to loop back around to
service more bios.  This commit forces it to unplug first, which helps
reduce the latency seen by submitters.
Signed-off-by: Chris Mason <chris.mason@oracle.com>

Btrfs: make subvolid=0 mount the original default root · 4849f01d

Committed by Josef Bacik on Dec 14, 2009

Since there's not a good way to make sure the user sees the original default root
tree id, and not to mention it's 5 so is way different than any other volume,
just make subvolid=0 mount the original default root. This makes it a bit easier
for users to handle in the long run. Thanks,
Signed-off-by: Josef Bacik <josef@redhat.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

Btrfs: add ioctl and incompat flag to set the default mount subvol · 6ef5ed0d

Committed by Josef Bacik on Dec 11, 2009

This patch needs to go along with my previous patch. This lets us set the
default dir item's location to whatever root we want to use as our default
mounting subvol. With this we don't have to use mount -o subvol=<tree id>
anymore to mount a different subvol, we can just set the new one and it will
just magically work. I've done some moderate testing with this, mostly just
switching the default mount around, mounting subvols and the default mount at
the same time and such, everything seems to work. Thanks,

Older kernels would generally be able to still mount the filesystem with the
default subvolume set, but it would result in a different volume being mounted,
which could be an even more unpleasant surprise for users. So if you set your
default subvolume, you can't go back to older kernels. Thanks,
Signed-off-by: Josef Bacik <josef@redhat.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

Btrfs: change how we mount subvolumes · 73f73415

Committed by Josef Bacik on Dec 04, 2009

This work is in preparation for being able to set a different root as the
default mounting root.

There is currently a problem with how we mount subvolumes.  We cannot currently
mount a subvolume of a subvolume, you can only mount subvolumes/snapshots of the
default subvolume.  So say you take a snapshot of the default subvolume and call
it snap1, and then take a snapshot of snap1 and call it snap2, so now you have

/
/snap1
/snap1/snap2

as your available volumes.  Currently you can only mount / and /snap1,
you cannot mount /snap1/snap2.  To fix this problem instead of passing
subvolid=<name> you must pass in subvolid=<treeid>, where <treeid> is
the tree id that gets spit out via the subvolume listing you get from
the subvolume listing patches (btrfs filesystem list).  This allows us
to mount /, /snap1 and /snap1/snap2 as the root volume.

In addition to the above, we also now read the default dir item in the
tree root to get the root key that it points to.  For now this just
points at what has always been the default subvolume, but later on I plan
to change it to point at whatever root you want to be the new default
root, so you can just set the default mount and not have to mount with
-o subvolid=<treeid>.  I tested this out with the above scenario and it
worked perfectly.  Thanks,

mount -o subvol operates inside the selected subvolid.  For example:

mount -o subvol=snap1,subvolid=256 /dev/xxx /mnt

/mnt will have the snap1 directory for the subvolume with id
256.

mount -o subvol=snap /dev/xxx /mnt

/mnt will be the snap directory of whatever the default subvolume
is.
Signed-off-by: Josef Bacik <josef@redhat.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

Btrfs: make set/get functions for the super compat_ro flags use compat_ro · 12534832

Committed by Josef Bacik on Dec 17, 2009

Our set/get functions for compat_ro_flags actually look at compat_flags. This
will mess up any attempt to use compat flags. The fix is obvious. Thanks,
Signed-off-by: Josef Bacik <josef@redhat.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

Btrfs: add search and inode lookup ioctls · ac8e9819

Committed by Chris Mason on Feb 28, 2010

The search ioctl is a generic tool for doing btree searches from
userland applications.  The first user of the search ioctl is a
subvolume listing feature, but we'll also use it to find new
files in a subvolume.

The search ioctl allows you to specify min and max keys to search for,
along with min and max transid.  It returns the items along with a
header that includes the item key.
Signed-off-by: Chris Mason <chris.mason@oracle.com>
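
The min/max filtering described above can be sketched as a small C predicate. Btrfs keys are ordered by objectid, then type, then offset; everything else here is an assumption made for illustration: `struct key_sketch`, `struct search_sketch` and `item_matches_sketch()` are hypothetical names rather than the ioctl's real argument layout, and the transid is simply passed in by the caller.

```c
#include <stdint.h>

/* Hypothetical stand-in for a btrfs key: (objectid, type, offset). */
struct key_sketch {
	uint64_t objectid;
	uint8_t type;
	uint64_t offset;
};

/* Hypothetical search bounds: a key range plus a transid range. */
struct search_sketch {
	struct key_sketch min_key, max_key;
	uint64_t min_transid, max_transid;
};

/* Compare two keys in btrfs order: objectid, then type, then offset. */
static int key_cmp_sketch(const struct key_sketch *a,
			  const struct key_sketch *b)
{
	if (a->objectid != b->objectid)
		return a->objectid < b->objectid ? -1 : 1;
	if (a->type != b->type)
		return a->type < b->type ? -1 : 1;
	if (a->offset != b->offset)
		return a->offset < b->offset ? -1 : 1;
	return 0;
}

/* Return 1 if the item falls inside both the key and transid ranges. */
static int item_matches_sketch(const struct search_sketch *s,
			       const struct key_sketch *key,
			       uint64_t transid)
{
	if (key_cmp_sketch(key, &s->min_key) < 0 ||
	    key_cmp_sketch(key, &s->max_key) > 0)
		return 0;
	return transid >= s->min_transid && transid <= s->max_transid;
}
```

Raising `min_transid` is what would make a search of this shape useful for finding only items changed since a given transaction, which fits the "find new files in a subvolume" use mentioned in the commit.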

Btrfs: add a function to lookup a directory path by following backrefs · 98d377a0

Committed by TARUISI Hiroaki on Nov 18, 2009

This will be used by the inode lookup ioctl.
Signed-off-by: TARUISI Hiroaki <taruishi.hiroak@jp.fujitsu.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

09 Mar, 2010 2 commits

Btrfs: kfree correct pointer during mount option parsing · da495ecc

Committed by Josef Bacik on Feb 25, 2010

We kstrdup the options string, but then strsep screws with the pointer,
so when we kfree() it, we're not giving it the right pointer.
Tested-by: Andy Lutomirski <luto@mit.edu>
Signed-off-by: Chris Mason <chris.mason@oracle.com>
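
The same pitfall exists in userspace with strdup(), strsep() and free(). The sketch below is a userspace analogue, not the btrfs code; `count_options_sketch()` is a hypothetical helper. It keeps the original allocation in a separate variable so that the pointer strsep() advances is never the one that gets freed.

```c
#define _DEFAULT_SOURCE   /* strsep() and strdup() on glibc */
#include <stdlib.h>
#include <string.h>

/*
 * Count the non-empty, comma-separated options in a string.
 * strsep() moves `cursor` forward (and finally sets it to NULL), so
 * the pointer returned by strdup() is saved in `orig` and only that
 * pointer is passed to free().
 */
static int count_options_sketch(const char *options)
{
	char *orig, *cursor, *tok;
	int n = 0;

	orig = strdup(options);
	if (!orig)
		return -1;

	cursor = orig;
	while ((tok = strsep(&cursor, ",")) != NULL) {
		if (*tok)
			n++;
	}

	/* free(cursor) here would free NULL and leak the buffer. */
	free(orig);
	return n;
}
```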

Btrfs: use RB_ROOT to initialize rb_trees instead of setting rb_node to NULL · 6bef4d31

Committed by Eric Paris on Feb 23, 2010

btrfs initializes rb trees in quite a number of places by setting rb_node =
NULL.  The problem with this is that 17d9ddc7 in the
linux-next tree adds a new field to that struct which needs to be NULL for
the new rbtree library code to work properly.  This patch uses RB_ROOT as
the initializer so all of the relevant fields will be NULL'd.  Without the
patch I get a panic.
Signed-off-by: Eric Paris <eparis@redhat.com>
Acked-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>
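
Why a whole-struct initializer is more robust than assigning one member can be shown with a small stand-in. The types and the `extra` member below are hypothetical and only mimic the shape of the kernel's rbtree root; `RB_ROOT_SKETCH` is an illustrative macro, not the kernel's definition.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the kernel's rbtree types. */
struct rb_node_sketch {
	struct rb_node_sketch *left, *right;
};

struct rb_root_sketch {
	struct rb_node_sketch *rb_node;
	void *extra;   /* hypothetical member added after callers were written */
};

/*
 * Whole-struct initializer: members that are not named are zeroed,
 * so members added later are covered without touching any caller.
 */
#define RB_ROOT_SKETCH ((struct rb_root_sketch){ .rb_node = NULL })

static void init_root_sketch(struct rb_root_sketch *root)
{
	/* Writing only root->rb_node = NULL would leave `extra` as garbage. */
	*root = RB_ROOT_SKETCH;
}
```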

13 Feb, 2010 1 commit

Btrfs: btrfs_mark_extent_written uses the wrong slot · 3f6fae95

Committed by Shaohua Li on Feb 11, 2010

My test does the following: fallocate a big file and write to it. The file is
512M, but after the file write is done btrfs-debug-tree shows:
item 6 key (257 EXTENT_DATA 0) itemoff 3516 itemsize 53
                extent data disk byte 1103101952 nr 536870912
                extent data offset 0 nr 399634432 ram 536870912
                extent compression 0
Looks like a regression introduced by
6c7d54ac, where we set the wrong slot.
Signed-off-by: Shaohua Li <shaohua.li@intel.com>
Acked-by: Yan Zheng <zheng.yan@oracle.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

05 Feb, 2010 6 commits

Btrfs: apply updated fallocate i_size fix · 23b5c509

Committed by Aneesh Kumar K.V on Feb 04, 2010

This version of the i_size fix for fallocate makes sure we only update
the i_size when the current fallocate is really operating outside of
i_size.
Signed-off-by: Chris Mason <chris.mason@oracle.com>
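
The rule stated in this commit amounts to a small piece of arithmetic. The sketch below is a hedged illustration, not the btrfs implementation: `fallocate_isize_sketch()` is a hypothetical helper, it assumes the end of the request is `offset + len`, and it ignores the keep-size mode of fallocate.

```c
#include <stdint.h>

/*
 * Return the i_size a file should have after an fallocate request.
 * The size only grows when the requested range ends beyond the
 * current i_size; a request that lies inside the file leaves it alone.
 */
static uint64_t fallocate_isize_sketch(uint64_t cur_isize,
				       uint64_t offset, uint64_t len)
{
	uint64_t requested_end = offset + len;

	return requested_end > cur_isize ? requested_end : cur_isize;
}
```

For example, with a current size of 8192 bytes, a request for offset 8192 and length 100 yields 8292, while a request for offset 0 and length 4096 leaves the size at 8192.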

Btrfs: do not try and lookup the file extent when finishing ordered io · efd049fb

Committed by Josef Bacik on Feb 02, 2010

When running the following fio job

[torrent]
filename=torrent-test
rw=randwrite
size=4g
filesize=4g
bs=4k
ioengine=sync

you would see long stalls where no work was being done.  That is because we were
doing all this extra work to read in the file extent outside of the transaction,
however in the random io case this ends up hurting us because the file extents
are not there to begin with.  So axe this logic, since we end up reading in the
file extent when we go to update it anyway.  This took the fio job from 11 MB/s
with several ~10 second stalls to 24 MB/s with a couple of 1-2 second stalls.
Signed-off-by: Josef Bacik <josef@redhat.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

Btrfs: Fix oopsen when dropping empty tree. · 7a7965f8

Committed by Yan, Zheng on Feb 01, 2010

When dropping an empty tree, walk_down_tree() skips checking
extent information for the tree root. This triggers a
BUG_ON in walk_up_proc().
Signed-off-by: Yan Zheng <zheng.yan@oracle.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

Btrfs: remove BUG_ON() due to mounting bad filesystem · d7ce5843

Committed by Miao Xie on Feb 02, 2010

Mounting a bad filesystem caused a BUG_ON(). The following steps
reproduce it:
 # mkfs.btrfs /dev/sda2
 # mount /dev/sda2 /mnt
 # mkfs.btrfs /dev/sda1 /dev/sda2
 (the program says that /dev/sda2 was mounted, and then exits. )
 # umount /mnt
 # mount /dev/sda1 /mnt

At the third step, mkfs.btrfs exited partway through making the filesystem, so
the initialization of the filesystem didn't finish. The filesystem was therefore
bad, and it caused BUG_ON() when mounting it. But BUG_ON() should be triggered by
wrong code, not by a user's operation, so I think it is a bug in btrfs.

This patch fixes it.
Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

Btrfs: make error return negative in btrfs_sync_file() · 014e4ac4

Committed by Roel Kluin on Jan 29, 2010

It appears the error return should be negative.
Signed-off-by: Roel Kluin <roel.kluin@gmail.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

Btrfs: fix race between allocate and release extent buffer. · f044ba78

Committed by Yan, Zheng on Feb 04, 2010

Increase extent buffer's reference count while holding the lock.
Otherwise it can race with try_release_extent_buffer.
Signed-off-by: Yan Zheng <zheng.yan@oracle.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

29 Jan, 2010 8 commits

Btrfs: check total number of devices when removing missing · 035fe03a

Committed by Josef Bacik on Jan 27, 2010

If you have a disk failure in RAID1 and then add a new disk to the
array, and then try to remove the missing volume, it will fail.  The
reason is the sanity check only looks at the total number of rw devices,
which is just 2 because we have 2 good disks and 1 bad one.  Instead
check the total number of devices in the array to make sure we can
actually remove the device.  Tested this with a failed disk setup and
with this test we can now run

btrfs-vol -r missing /mount/point

and it works fine.
Signed-off-by: Josef Bacik <josef@redhat.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

Btrfs: check return value of open_bdev_exclusive properly · 7f59203a

Committed by Josef Bacik on Jan 27, 2010

Hit this problem while testing RAID1 failure stuff.  open_bdev_exclusive
returns ERR_PTR(), not NULL.  So check the return value properly.  This
is important because if you accidentally specify a device that doesn't
exist when trying to add a new device to an array, you will panic the box
dereferencing bdev.
Signed-off-by: Josef Bacik <josef@redhat.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>
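
The difference between a NULL check and an error-pointer check can be shown with a userspace imitation of the kernel's ERR_PTR convention. This is a sketch under stated assumptions: `err_ptr()`, `ptr_err()` and `is_err()` are local lookalikes written for this example, and `open_device_sketch()` and `add_device_sketch()` are hypothetical functions, not btrfs or block-layer code.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ERRNO_SKETCH 4095

/* Encode a negative errno value in a pointer, as ERR_PTR() does. */
static inline void *err_ptr(long error)
{
	return (void *)(intptr_t)error;
}

/* Recover the errno value from an error pointer. */
static inline long ptr_err(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

/* Error pointers occupy the top MAX_ERRNO_SKETCH addresses. */
static inline int is_err(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO_SKETCH;
}

/* Hypothetical opener: fails with an error pointer, never with NULL. */
static void *open_device_sketch(const char *path)
{
	static int dummy_device;

	if (path == NULL || path[0] == '\0')
		return err_ptr(-ENOENT);
	return &dummy_device;
}

static int add_device_sketch(const char *path)
{
	void *bdev = open_device_sketch(path);

	/* A `!bdev` test would miss the failure and the caller would
	 * go on to dereference the error pointer. */
	if (is_err(bdev))
		return (int)ptr_err(bdev);
	return 0;
}
```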

Btrfs: do not mark the chunk as readonly if in degraded mode · f48b9075

Committed by Josef Bacik on Jan 27, 2010

If a RAID setup has chunks that span multiple disks, and one of those
disks has failed, btrfs_chunk_readonly will return 1 since one of the
disks in that chunk's stripes is dead and therefore not writeable. So
instead if we are in degraded mode, return 0 so we can go ahead and
allocate stuff. Without this patch all of the block groups in a RAID1
setup will end up read-only, which will mean we can't add new disks to
the array since we won't be able to make allocations.
Signed-off-by: Josef Bacik <josef@redhat.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

Btrfs: run orphan cleanup on default fs root · e3acc2a6

Committed by Josef Bacik on Jan 26, 2010

This patch reverts commit

6c090a11

Since it introduces this problem where we can run orphan cleanup on a
volume that can have orphan entries re-added.  Instead of my original
fix, Yan Zheng pointed out that we can just revert my original fix and
then run the orphan cleanup in open_ctree after we look up the fs_root.
I have tested this with all the tests that gave me problems and this
patch fixes both problems.  Thanks,
Signed-off-by: Josef Bacik <josef@redhat.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

Btrfs: fix a memory leak in btrfs_init_acl · f858153c

Committed by Yang Hongyang on Jan 26, 2010

In btrfs_init_acl() the cloned acl is not released.
Signed-off-by: Yang Hongyang <yanghy@cn.fujitsu.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

Btrfs: Use correct values when updating inode i_size on fallocate · d1ea6a61

Committed by Aneesh Kumar K.V on Jan 20, 2010

commit f2bc9dd07e3424c4ec5f3949961fe053d47bc825
Author: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Date:   Wed Jan 20 12:57:53 2010 +0530

    Btrfs: Use correct values when updating inode i_size on fallocate

    Even though we allocate more, we should be updating inode i_size
    as per the arguments passed
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

Btrfs: remove tree_search() in extent_map.c · b8d9bfeb

Committed by Miao Xie on Dec 15, 2009

This patch removes tree_search() in extent_map.c because it is not called by
anything.
Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

Btrfs: Add mount -o compress-force · a555f810

Committed by Chris Mason on Jan 28, 2010

The default btrfs mount -o compress mode will quickly back off
compressing a file if it notices that compression does not reduce the
size of the data being written.  This can save considerable CPU because
all future writes to the file go through uncompressed.

But some files are both very large and have mixed data stored in
them.  In that case, we want to add the ability to always try
compressing data before writing it.

This commit adds mount -o compress-force.  A later commit will add
a new inode flag that does the same thing.
Signed-off-by: Chris Mason <chris.mason@oracle.com>

18 Jan, 2010 7 commits

Btrfs: fix possible panic on unmount · 11dfe35a

Committed by Josef Bacik on Nov 13, 2009

We can race with the unmount of an fs and the stopping of a kthread where we
will free the block group before we're done using it. The reason for this is
because we do not hold a reference on the block group while it's caching, since
the allocator drops its reference once it exits or moves on to the next block
group. This patch fixes the problem by taking a reference to the block group
before we start caching and dropping it when we're done to make sure all
accesses to the block group are safe. Thanks,
Signed-off-by: Josef Bacik <josef@redhat.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

Btrfs: deal with NULL acl sent to btrfs_set_acl · a9cc71a6

Committed by Chris Mason on Jan 17, 2010

It is legal for btrfs_set_acl to be sent a NULL acl.  This
makes sure we don't dereference it.  A similar patch was sent by
Johannes Hirte <johannes.hirte@fem.tu-ilmenau.de>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

Btrfs: fix regression in orphan cleanup · 6c090a11

Committed by Josef Bacik on Jan 15, 2010

Currently orphan cleanup only ever gets triggered if we cross subvolumes during
a lookup, which means that if we just mount a plain jane fs that has orphans in
it, they will never get cleaned up.  This results in panics like these

http://www.kerneloops.org/oops.php?number=1109085

where adding an orphan entry results in -EEXIST being returned and we panic.  In
order to fix this, we check to see on lookup if our root has had the orphan
cleanup done, and if not go ahead and do it.  This is easily reproducible by
running this testcase

#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>
#include <stdio.h>

int main(int argc, char **argv)
{
	char data[4096];
	char newdata[4096];
	int fd1, fd2;

	memset(data, 'a', 4096);
	memset(newdata, 'b', 4096);

	while (1) {
		int i;

		fd1 = creat("file1", 0666);
		if (fd1 < 0)
			break;

		for (i = 0; i < 512; i++)
			write(fd1, data, 4096);

		fsync(fd1);
		close(fd1);

		fd2 = creat("file2", 0666);
		if (fd2 < 0)
			break;

		ftruncate(fd2, 4096 * 512);

		for (i = 0; i < 512; i++)
			write(fd2, newdata, 4096);
		close(fd2);

		i = rename("file2", "file1");
		unlink("file1");
	}

	return 0;
}

and then pulling the power on the box, and then trying to run that test again
when the box comes back up.  I've tested this locally and it fixes the problem.
Thanks to Tomas Carnecky for helping me track this down initially.
Signed-off-by: Josef Bacik <josef@redhat.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

Btrfs: Fix race in btrfs_mark_extent_written · 6c7d54ac

Committed by Yan, Zheng on Jan 15, 2010

Fix a bug reported by Johannes Hirte. The cause of the bug is that
btrfs_del_items is called after btrfs_duplicate_item, and
btrfs_del_items triggers a tree balance. The fix is to check for that
case and call btrfs_search_slot when needed.
Signed-off-by: Yan Zheng <zheng.yan@oracle.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

Btrfs, fix memory leaks in error paths · 2423fdfb

Committed by Jiri Slaby on Jan 06, 2010

Stanse found 2 memory leaks in relocate_block_group and
__btrfs_map_block. cluster and multi are not freed/assigned on all
paths. Fix that.
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Cc: linux-btrfs@vger.kernel.org
Signed-off-by: Chris Mason <chris.mason@oracle.com>

Btrfs: align offsets for btrfs_ordered_update_i_size · a038fab0

Committed by Yan, Zheng on Dec 28, 2009

Some callers of btrfs_ordered_update_i_size can now pass in
a NULL for the ordered extent to update against.  This makes
sure we properly align the offset they pass in when deciding
how much to bump the on disk i_size.
Signed-off-by: Chris Mason <chris.mason@oracle.com>

btrfs: fix missing last-entry in readdir(3) · 406266ab

Committed by Jan Engelhardt on Dec 09, 2009

parent 49313cdac7b34c9f7ecbb1780cfc648b1c082cd7 (v2.6.32-1-g49313cd)
commit ff48c08e1c05c67e8348ab6f8a24de8034e0e34d
Author: Jan Engelhardt <jengelh@medozas.de>
Date:   Wed Dec 9 22:57:36 2009 +0100

Btrfs: fix missing last-entry in readdir(3)

When one does a 32-bit readdir(3), the last entry of a directory is
missing. This is however not due to passing a large value to filldir,
but it seems to have to do with glibc doing telldir or something
quirky. In any case, this patch fixes it in practice.
Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

18 Dec, 2009 7 commits

Btrfs: make sure fallocate properly starts a transaction · 3a1abec9

Committed by Chris Mason on Dec 17, 2009

The recent patch to make fallocate enospc friendly would send
down a NULL trans handle to the allocator.  This moves the
transaction start to properly fix things.
Signed-off-by: Chris Mason <chris.mason@oracle.com>

Btrfs: make metadata chunks smaller · 83d3c969

Committed by Josef Bacik on Dec 07, 2009

This patch makes us a bit less zealous about making sure we have enough free
metadata space by paring down the size of new metadata chunks to 256MB instead
of 1GB.  Also, we used to try to allocate metadata chunks when allocating data,
but that sort of thing is done elsewhere now so we can just remove it.  With my
-ENOSPC test I used to have 3GB reserved for metadata out of 75GB, now I have
1.7GB.  Thanks,
Signed-off-by: Josef Bacik <josef@redhat.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

Btrfs: Show discard option in /proc/mounts · 20a5239a

Committed by Matthew Wilcox on Dec 14, 2009

Christoph's patch e244a0ae doesn't display
the discard option in /proc/mounts, leading to some confusion for me.
Here's the missing bit.
Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

Btrfs: deny sys_link across subvolumes. · 4a8be425

Committed by TARUISI Hiroaki on Nov 12, 2009

I rebased Christian Parpart's patch to deny hard links across
subvolumes. The original patch also modifies btrfs_rename, but
I excluded that part because we can move across subvolumes now and
it causes no problem.
-----------------

Hard links across subvolumes should not be allowed in Btrfs.
btrfs_link checks that the root of the 'to' directory is the same as the
root of the 'from' file. If they differ, btrfs_link returns -EPERM.
Signed-off-by: TARUISI Hiroaki <taruishi.hiroak@jp.fujitsu.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>
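
The check described in this commit message reduces to comparing the roots of the two inodes. The sketch below only illustrates that comparison: `struct inode_sketch` and `link_check_sketch()` are hypothetical, and a subvolume root is represented by a plain numeric id rather than a kernel structure.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical inode carrying only the id of the root it lives in. */
struct inode_sketch {
	uint64_t root_id;
};

/*
 * Refuse a hard link whose source file and target directory belong
 * to different roots; -EPERM is the error named in the commit message.
 */
static int link_check_sketch(const struct inode_sketch *from_file,
			     const struct inode_sketch *to_dir)
{
	if (from_file->root_id != to_dir->root_id)
		return -EPERM;
	return 0;
}
```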

Btrfs: fail mount on bad mount options · a7a3f7ca

Committed by Sage Weil on Nov 07, 2009

We shouldn't silently ignore unrecognized options.
Signed-off-by: Sage Weil <sage@newdream.net>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

Btrfs: don't add extent 0 to the free space cache v2 · 06b2331f

Committed by Yan, Zheng on Nov 26, 2009

If block group 0 is completely free, btrfs_read_block_groups will
add extent [0, BTRFS_SUPER_INFO_OFFSET) to the free space cache.
Signed-off-by: Yan Zheng <zheng.yan@oracle.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

Btrfs: Fix per root used space accounting · 86b9f2ec

Committed by Yan, Zheng on Nov 12, 2009

The bytes_used field in root item was originally planned to
trace the amount of used data and tree blocks. But it never
worked right since we can't trace freeing of data accurately.
This patch changes it to only trace the amount of tree blocks.
Signed-off-by: NYan Zheng <zheng.yan@oracle.com>
Signed-off-by: NChris Mason <chris.mason@oracle.com>

86b9f2ec