Commits · 2a29edc6b60a5248ccab588e7ba7dad38cef0235 · xiphi1978 / linux

29 Jan, 2011 3 commits

btrfs: fix several uncheck memory allocations · 2a29edc6

Authored by liubo on Jan 26, 2011

To make btrfs more stable, add several missing memory allocation
checks, and return a proper errno when no memory is available.

We've checked that some of those -ENOMEM errors will be returned to
userspace, some will be caught by BUG_ON() in the upper callers,
and none will be ignored silently.
Signed-off-by: Liu Bo <liubo2009@cn.fujitsu.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

2a29edc6

btrfs: fix uncheck memory allocation in btrfs_submit_compressed_read · 6b82ce8d

Authored by liubo on Jan 26, 2011

btrfs_submit_compressed_read() lacks memory allocation checks and the
corresponding error path.

After this fix, in the "no memory" case the errno is returned
to userland step by step, telling users that the operation cannot go on.
Signed-off-by: Liu Bo <liubo2009@cn.fujitsu.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

6b82ce8d
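The two allocation-check commits above describe a common pattern: check every allocation, unwind what was already allocated, and hand -ENOMEM back to the caller. The following is a minimal userspace sketch of that pattern, not the kernel code. The names `compressed_read`, `compressed_read_setup` and `compressed_read_teardown` are hypothetical, and `malloc`/`calloc` stand in for the kernel allocators.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-in for the state a compressed read needs. */
struct compressed_read {
    void **pages;
    size_t nr_pages;
};

/* Allocate all pages; on any failure undo the partial work and
 * report -ENOMEM instead of continuing with a NULL pointer. */
static int compressed_read_setup(struct compressed_read *cr,
                                 size_t nr_pages, size_t page_size)
{
    size_t i;

    cr->nr_pages = 0;
    cr->pages = calloc(nr_pages, sizeof(*cr->pages));
    if (!cr->pages)
        return -ENOMEM;

    for (i = 0; i < nr_pages; i++) {
        cr->pages[i] = malloc(page_size);
        if (!cr->pages[i])
            goto fail;
    }
    cr->nr_pages = nr_pages;
    return 0;

fail:
    /* Free only the pages that were successfully allocated. */
    while (i > 0)
        free(cr->pages[--i]);
    free(cr->pages);
    cr->pages = NULL;
    return -ENOMEM;
}

static void compressed_read_teardown(struct compressed_read *cr)
{
    for (size_t i = 0; i < cr->nr_pages; i++)
        free(cr->pages[i]);
    free(cr->pages);
    cr->pages = NULL;
    cr->nr_pages = 0;
}
```

A caller would propagate the negative return value upward unchanged, which is what lets the error reach userspace "step by step".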

Merge branch 'bug-fixes' of git://repo.or.cz/linux-btrfs-devel into btrfs-38 · eab49bec
Authored by Chris Mason on Jan 28, 2011

eab49bec

27 Jan, 2011 12 commits

Btrfs: Fix file clone when source offset is not 0 · 4d728ec7

Authored by Li Zefan on Jan 26, 2011

Suppose:
- the source extent is: [0, 100]
- the src offset is 10
- the clone length is 90
- the dest offset is 0

This statement:

	new_key.offset = key.offset + destoff - off

will produce the following extent for the dest file:

	[ino, BTRFS_EXTENT_DATA_KEY, -10]

which is obviously wrong.
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>

4d728ec7
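The arithmetic in the clone example can be checked in isolation. The sketch below assumes the extent key offset is 0 (the source extent starts at 0) and uses unsigned 64-bit values, so the "-10" shows up as a wrapped-around huge offset. `naive_dest_offset` and `guarded_dest_offset` are hypothetical names, and the guard is only one plausible way to avoid the underflow, not necessarily what the patch does.

```c
#include <stdint.h>

/* The statement from the commit message, taken literally. */
static uint64_t naive_dest_offset(uint64_t key_offset, uint64_t destoff,
                                  uint64_t off)
{
    return key_offset + destoff - off;
}

/* One possible guard: when the clone starts inside the extent
 * (off > key_offset), the cloned data begins exactly at destoff. */
static uint64_t guarded_dest_offset(uint64_t key_offset, uint64_t destoff,
                                    uint64_t off)
{
    if (off <= key_offset)
        return key_offset + destoff - off;
    return destoff;
}
```

With key_offset = 0, destoff = 0 and off = 10, the naive form wraps around to 2^64 - 10, while the guarded form yields 0.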

Btrfs: Fix memory leak in writepage fixup work · b897abec

Authored by Miao Xie on Jan 26, 2011

The fixup, which is allocated when starting a page write to fix up an
extent without the ORDERED bit set, should be freed after this work
is done.
Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>

b897abec

Btrfs: Don't return acl info when mounting with noacl option · d0f69686

Authored by Miao Xie on Jan 25, 2011

Steps to reproduce:

  # mkfs.btrfs /dev/sda2
  # mount /dev/sda2 /mnt
  # touch /mnt/file0
  # setfacl -m 'u:root:x,g::x,o::x' /mnt/file0
  # umount /mnt
  # mount /dev/sda2 -o noacl /mnt
  # getfacl /mnt/file0
  ...
  user::rw-
  user:root:--x
  group::--x
  mask::--x
  other::--x

The output should be:

  user::rw-
  group::--x
  other::--x
Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>

d0f69686

Btrfs: Free correct pointer after using strsep · 3f3d0bc0

Authored by Tero Roponen on Dec 27, 2010

We must save and free the original kstrdup()'ed pointer
because strsep() modifies its first argument.
Signed-off-by: Tero Roponen <tero.roponen@gmail.com>
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>

3f3d0bc0
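The strsep() pitfall is easy to reproduce in userspace. This sketch uses `strdup()` in place of the kernel's `kstrdup()`; `count_options` is a hypothetical helper, and the option string is only an illustration. The point is that the cursor passed to `strsep()` ends up NULL, so the original pointer must be kept for `free()`.

```c
#define _DEFAULT_SOURCE
#include <stdlib.h>
#include <string.h>

/* Count non-empty comma-separated tokens without leaking the copy. */
static int count_options(const char *opts)
{
    char *orig = strdup(opts);  /* pointer that must be passed to free() */
    char *cursor = orig;        /* strsep() advances and finally NULLs this */
    char *tok;
    int n = 0;

    if (!orig)
        return -1;

    while ((tok = strsep(&cursor, ",")) != NULL) {
        if (*tok)
            n++;
    }

    /* cursor is NULL here; free(cursor) would silently leak orig. */
    free(orig);
    return n;
}
```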

Btrfs: Fix memory leak on finding existing super · bdc924bb

Authored by Ian Kent on Dec 27, 2010

We missed a memory deallocation in commit 450ba0ea.

If an existing super block is found at mount and there is no
error condition then the pre-allocated tree_root and fs_info
are not used and are not freed.
Signed-off-by: Ian Kent <raven@themaw.net>
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>

bdc924bb

Btrfs: Fix memory leak at umount · 83a4d548

Authored by Li Zefan on Dec 27, 2010

fs_info, which is allocated in open_ctree(), should be freed
in close_ctree().
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>

83a4d548

btrfs: Check mergeable free space when removing a cluster · f333adb5

Authored by Li Zefan on Nov 09, 2010

After returning extents from a cluster to the block group, some
extents in the block group may be mergeable.
Reviewed-by: Josef Bacik <josef@redhat.com>
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>

f333adb5

btrfs: Add a helper try_merge_free_space() · 120d66ee

Authored by Li Zefan on Nov 09, 2010

When adding a new extent, we first check whether we can merge
this extent with the left and/or right extent. Extract this as
a helper try_merge_free_space().

As a side effect, we fix a small bug: if the new extent
has a non-bitmap left entry but is unmergeable, we would directly
link the extent without trying to drop it into a bitmap.

This also prepares for the next patch.
Reviewed-by: Josef Bacik <josef@redhat.com>
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>

120d66ee
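The merge check described for try_merge_free_space() boils down to testing whether two extents are byte-adjacent. The sketch below illustrates that idea on plain structs. `struct free_extent` and `try_merge` are hypothetical simplifications and do not model the kernel's rbtree or bitmap entries.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct free_extent {
    uint64_t offset;
    uint64_t bytes;
};

/* Merge info with its right and/or left neighbour when they touch.
 * Either neighbour may be NULL. An absorbed neighbour is left with
 * bytes == 0 so the caller can drop it. Returns true if merged. */
static bool try_merge(struct free_extent *info,
                      struct free_extent *left,
                      struct free_extent *right)
{
    bool merged = false;

    if (right && info->offset + info->bytes == right->offset) {
        info->bytes += right->bytes;
        right->bytes = 0;
        merged = true;
    }
    if (left && left->offset + left->bytes == info->offset) {
        info->offset = left->offset;
        info->bytes += left->bytes;
        left->bytes = 0;
        merged = true;
    }
    return merged;
}
```

When `try_merge` returns false, the caller still has the choice the commit message mentions: link the extent as-is or try to fold it into a bitmap.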

btrfs: Update stats when allocating from a cluster · 5e71b5d5

Authored by Li Zefan on Nov 09, 2010

When allocating an extent entry from a cluster, we should update
the free_space and free_extents fields of the block group.
Reviewed-by: Josef Bacik <josef@redhat.com>
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>

5e71b5d5

btrfs: Free fully occupied bitmap in cluster · 70b7da30

Authored by Li Zefan on Nov 09, 2010

If there's no more free space in a bitmap, we should free it.
Reviewed-by: Josef Bacik <josef@redhat.com>
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>

70b7da30

btrfs: Add helper function free_bitmap() · edf6e2d1

Authored by Li Zefan on Nov 09, 2010

Remove some duplicated code.

This prepares for the next patch.
Reviewed-by: Josef Bacik <josef@redhat.com>
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>

edf6e2d1

btrfs: Fix threshold calculation for block groups smaller than 1GB · 8eb2d829

Authored by Li Zefan on Nov 09, 2010

If a block group is smaller than 1GB, the extent entry threshold
calculation will always set the threshold to 0.

So as free space gets fragmented, btrfs will switch to using bitmaps
to manage free space, but will then never switch back to extents
due to this bug.
Reviewed-by: Josef Bacik <josef@redhat.com>
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>

8eb2d829
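A threshold that is always 0 for groups under 1GB is what one would get from scaling a per-GB budget with integer division. That cause is an assumption here, since the commit message does not spell it out. The sketch below shows the effect and one possible floor. `BUDGET_PER_GIB`, `budget_truncating` and `budget_with_floor` are hypothetical names and values.

```c
#include <stdint.h>

#define GIB            (1024ULL * 1024 * 1024)
#define BUDGET_PER_GIB 32768ULL   /* hypothetical per-GiB budget in bytes */

/* Integer division truncates: any group below 1 GiB gets a budget of 0. */
static uint64_t budget_truncating(uint64_t group_bytes)
{
    return BUDGET_PER_GIB * (group_bytes / GIB);
}

/* One possible fix: treat small groups as if they were 1 GiB. */
static uint64_t budget_with_floor(uint64_t group_bytes)
{
    if (group_bytes < GIB)
        return BUDGET_PER_GIB;
    return BUDGET_PER_GIB * (group_bytes / GIB);
}
```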

18 Jan, 2011 1 commit

Btrfs: forced readonly mounts on errors · acce952b

Authored by liubo on Jan 06, 2011

This patch comes from the "Forced readonly mounts on errors" idea.

As we know, this is the first step in being more fault tolerant of disk
corruptions instead of just using BUG() statements.

The major content:
- add a framework for generating errors that should result in filesystems
  going readonly.
- keep FS state in the disk super block.
- make sure that all resources will be freed and released at umount time.
- make sure that after the FS is forced readonly on error, there will be no
  more disk changes before the FS is corrected. For this, we should stop
  write operations.

After this patch is applied, the conversion from BUG() to such a framework can
happen incrementally.
Signed-off-by: Liu Bo <liubo2009@cn.fujitsu.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

acce952b

17 Jan, 2011 16 commits

btrfs: Require CAP_SYS_ADMIN for filesystem rebalance · 6f88a440

Authored by Ben Hutchings on Dec 29, 2010

Filesystem rebalancing (BTRFS_IOC_BALANCE) affects the entire
filesystem and may run uninterruptibly for a long time.  This does not
seem to be something that an unprivileged user should be able to do.
Reported-by: Aron Xu <happyaron.xu@gmail.com>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

6f88a440

Btrfs: don't warn if we get ENOSPC in btrfs_block_rsv_check · f690efb1

Authored by Josef Bacik on Jan 12, 2011

If we run low on space we could get a bunch of warnings out of
btrfs_block_rsv_check, but this is mostly just called via the transaction code
to see if we need to end the transaction. It expects to see failures, so let's
not WARN and freak everybody out for no reason. Thanks,
Signed-off-by: Josef Bacik <josef@redhat.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

f690efb1

btrfs: Fix memory leak in btrfs_read_fs_root_no_radix() · 5e540f77

Authored by Tsutomu Itoh on Dec 27, 2010

In btrfs_read_fs_root_no_radix(), 'root' is not freed if
btrfs_search_slot() returns an error.
Signed-off-by: Tsutomu Itoh <t-itoh@jp.fujitsu.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

5e540f77

btrfs: check NULL or not · 91ca338d

Authored by Tsutomu Itoh on Jan 05, 2011

We should check whether these functions return NULL.
Signed-off-by: Tsutomu Itoh <t-itoh@jp.fujitsu.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

91ca338d

btrfs: Don't pass NULL ptr to func that may deref it. · ff175d57

Authored by Jesper Juhl on Dec 25, 2010

Hi,

In fs/btrfs/inode.c::fixup_tree_root_location() we have this code:

...
 		if (!path) {
 			err = -ENOMEM;
 			goto out;
 		}
...
 	out:
 		btrfs_free_path(path);
 		return err;

btrfs_free_path() passes its argument on to other functions and some of
them end up dereferencing the pointer.
In the code above that pointer is clearly NULL, so btrfs_free_path() will
eventually cause a NULL dereference.

There are many ways to cut this cake (fix the bug). The one I chose was to
make btrfs_free_path() deal gracefully with NULL pointers. If you
disagree, feel free to come up with an alternative patch.
Signed-off-by: Jesper Juhl <jj@chaosbits.net>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

ff175d57
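The approach chosen in the commit above, making the free function tolerate NULL, keeps a single `out:` label usable on every path. Here is a userspace sketch of that design under stated assumptions: `struct path`, `path_alloc`, `path_free` and `use_path` are hypothetical stand-ins, not the btrfs functions.

```c
#include <errno.h>
#include <stdlib.h>

struct path {
    int *slots;
};

static struct path *path_alloc(void)
{
    struct path *p = calloc(1, sizeof(*p));

    if (!p)
        return NULL;
    p->slots = calloc(8, sizeof(*p->slots));
    if (!p->slots) {
        free(p);
        return NULL;
    }
    return p;
}

/* Accepts NULL, like free(), so callers need no special case. */
static void path_free(struct path *p)
{
    if (!p)
        return;
    free(p->slots);
    free(p);
}

/* Same shape as the quoted kernel snippet: the out label is reached
 * with p == NULL when allocation fails. */
static int use_path(int simulate_alloc_failure)
{
    struct path *p = simulate_alloc_failure ? NULL : path_alloc();
    int err = 0;

    if (!p) {
        err = -ENOMEM;
        goto out;
    }
    /* ... work with p would go here ... */
out:
    path_free(p);
    return err;
}
```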

btrfs: mount failure return value fix · 20b45077

Authored by Dave Young on Jan 08, 2011

I happened to pass a swap partition as the root partition on the kernel
command line; the kernel then panicked and told me "Cannot open root device".
That is not correct: it is in fact a filesystem type mismatch, not 'no device'.

Eventually I found that the btrfs mount failed with -EIO; it should be -EINVAL.
The logic in init/do_mounts.c:
        for (p = fs_names; *p; p += strlen(p)+1) {
                int err = do_mount_root(name, p, flags, root_mount_data);
                switch (err) {
                        case 0:
                                goto out;
                        case -EACCES:
                                flags |= MS_RDONLY;
                                goto retry;
                        case -EINVAL:
                                continue;
                }
		print "Cannot open root device"
		panic
	}
So filesystem types listed after btrfs will have no chance to mount.

Fix this by returning -EINVAL.
Signed-off-by: Dave Young <hidave.darkstar@gmail.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

20b45077
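The effect of the return value on the probing loop can be simulated in userspace. This is a simplified sketch of the loop quoted above: it omits the -EACCES retry, and the probe functions and `try_mount_root` are hypothetical stand-ins for per-filesystem mount attempts.

```c
#include <errno.h>
#include <stddef.h>

typedef int (*probe_fn)(void);

/* A filesystem that does not recognise the device but reports -EIO. */
static int probe_mismatch_eio(void)    { return -EIO; }
/* The same mismatch reported as -EINVAL. */
static int probe_mismatch_einval(void) { return -EINVAL; }
/* A filesystem that does recognise the device. */
static int probe_match(void)           { return 0; }

/* Returns the index of the filesystem that mounted, or -1 when the
 * loop gives up (the "Cannot open root device" case). */
static int try_mount_root(const probe_fn *probes, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        int err = probes[i]();

        if (err == 0)
            return (int)i;
        if (err == -EINVAL)
            continue;       /* wrong fs type: try the next one */
        return -1;          /* any other error ends the search */
    }
    return -1;
}
```

With -EIO the filesystem in the second slot is never tried; with -EINVAL the loop moves on and the mount succeeds.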

btrfs: Mem leak in btrfs_get_acl() · 42838bb2

Authored by Jesper Juhl on Jan 06, 2011

It seems to me that we leak the memory allocated to 'value' in
btrfs_get_acl() if the call to posix_acl_from_xattr() fails.
Here's a patch that attempts to correct that problem.
Signed-off-by: Jesper Juhl <jj@chaosbits.net>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

42838bb2
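A leak of this kind happens when a buffer is freed only on the success path. The sketch below frees the buffer whether or not parsing succeeds and uses a counter to make the leak observable. `load_and_parse`, `parse_value` and the counter are hypothetical; `parse_value` merely stands in for a parser that can fail.

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>

static int live_buffers;   /* outstanding allocations, for the demo only */

static void *buf_alloc(size_t n)
{
    void *p = malloc(n);

    if (p)
        live_buffers++;
    return p;
}

static void buf_free(void *p)
{
    if (p) {
        live_buffers--;
        free(p);
    }
}

/* Hypothetical parser: rejects input that does not start with 'A'. */
static int parse_value(const char *value)
{
    return value[0] == 'A' ? 0 : -EINVAL;
}

static int load_and_parse(const char *raw)
{
    size_t len = strlen(raw) + 1;
    char *value = buf_alloc(len);
    int ret;

    if (!value)
        return -ENOMEM;
    memcpy(value, raw, len);

    ret = parse_value(value);

    /* Free on both the success and the failure path. */
    buf_free(value);
    return ret;
}
```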

btrfs: fix wrong free space information of btrfs · 6d07bcec

Authored by Miao Xie on Jan 05, 2011

When we store data with a RAID profile on a btrfs filesystem that spans two or
more disks of different sizes, df reports some free space in the filesystem
even though the user cannot actually write any more data. In other words, df
shows wrong free space information for btrfs.

 # mkfs.btrfs -d raid1 /dev/sda9 /dev/sda10
 # btrfs-show
 Label: none  uuid: a95cd49e-6e33-45b8-8741-a36153ce4b64
 	Total devices 2 FS bytes used 28.00KB
 	devid    1 size 5.01GB used 2.03GB path /dev/sda9
 	devid    2 size 10.00GB used 2.01GB path /dev/sda10
 # btrfs device scan /dev/sda9 /dev/sda10
 # mount /dev/sda9 /mnt
 # dd if=/dev/zero of=tmpfile0 bs=4K count=9999999999
   (fill the filesystem)
 # sync
 # df -TH
 Filesystem	Type	Size	Used	Avail	Use%	Mounted on
 /dev/sda9	btrfs	17G	8.6G	5.4G	62%	/mnt
 # btrfs-show
 Label: none  uuid: a95cd49e-6e33-45b8-8741-a36153ce4b64
 	Total devices 2 FS bytes used 3.99GB
 	devid    1 size 5.01GB used 5.01GB path /dev/sda9
 	devid    2 size 10.00GB used 4.99GB path /dev/sda10

This is because btrfs cannot allocate chunks when one of the paired disks has
no space left. The free space on the other disks can then never be used and
should be subtracted from the total space, but btrfs doesn't subtract it.
This is confusing to the user.

This patch fixes it by calculating the free space that can actually be used to
allocate chunks.

Implementation:
1. Get the free space of every device and align it to the stripe length.
2. Sort the devices by free space.
3. Check the free space of each device:
   3.1. If it is not zero, count the devices that have more free space than
        this device.
        If that count is beyond the min stripe number, the free space can be
        used and is added to the total free space.
        If that count is below the min stripe number, we cannot use the free
        space, and the check ends.
   3.2. If the free space is zero, check the next device and go to 3.1.

This implementation works like a fake chunk allocation.

After applying this patch, df shows correct space information:
 # df -TH
 Filesystem	Type	Size	Used	Avail	Use%	Mounted on
 /dev/sda9	btrfs	17G	8.6G	0	100%	/mnt
Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

6d07bcec
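The "fake chunk allocation" idea from the commit above can be sketched as a small simulation: sort the devices, then repeatedly pretend to allocate a chunk across `min_stripes` devices until too few devices have space left. This is a simplified illustration, not the patch's code. `usable_raw_space` is a hypothetical name, stripe alignment is left out, and the result is raw device space rather than user-visible data size.

```c
#include <stdint.h>
#include <stdlib.h>

/* qsort comparator: largest free space first. */
static int cmp_desc(const void *a, const void *b)
{
    uint64_t x = *(const uint64_t *)a;
    uint64_t y = *(const uint64_t *)b;

    return (x < y) - (x > y);
}

/* Simulate chunk allocation over free_space[] (modified in place) and
 * return the raw space that could really be allocated. */
static uint64_t usable_raw_space(uint64_t *free_space, size_t n,
                                 size_t min_stripes)
{
    uint64_t total = 0;

    if (min_stripes == 0 || n < min_stripes)
        return 0;

    qsort(free_space, n, sizeof(*free_space), cmp_desc);

    /* Walk from the smallest device upward; stop once fewer than
     * min_stripes devices remain to pair with. */
    for (size_t remaining = n; remaining >= min_stripes; remaining--) {
        size_t i = remaining - 1;
        uint64_t alloc = free_space[i];

        if (alloc == 0)
            continue;   /* nothing left here, look at the next device */

        /* Take alloc bytes from this device and from the
         * min_stripes - 1 devices ranked just above it. */
        total += alloc * min_stripes;
        for (size_t j = i + 1 - min_stripes; j <= i; j++)
            free_space[j] -= alloc;
    }
    return total;
}
```

For the scenario in the commit message, a two-device raid1 where one device is full and the other has about 5GB left, the simulation returns 0, which matches the corrected df output.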

btrfs: make the chunk allocator utilize the devices better · b2117a39

Authored by Miao Xie on Jan 05, 2011

With this patch, we change how the case where we cannot get enough free
extents of the default size is handled.

Implementation:
1. Look up a suitable free extent on each device and keep the search result.
   If no suitable free extent is found, keep the max free extent.
2. If we get enough suitable free extents of the default size, chunk allocation
   succeeds.
3. If we cannot get enough free extents, but the number of extents of the
   default size is >= min_stripes, we just change the mapping information
   (reduce the number of stripes in the extent map), and chunk allocation
   succeeds.
4. If the number of extents of the default size is < min_stripes, sort the
   devices by the size of their max free extent, in descending order.
5. Use the size of the max free extent on the (num_stripes - 1)th device as the
   stripe size to allocate the device space.

In this way, the chunk allocator can allocate chunks that are as large as
possible when device space is short, and makes full use of the devices.
Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

b2117a39
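Steps 4 and 5 of the chunk allocator change can be sketched as a sort followed by an index lookup. The sketch assumes "(num_stripes - 1)th" is a zero-based index into the sorted list, so the chosen stripe size fits on the `num_stripes` largest devices. `pick_stripe_size` is a hypothetical name.

```c
#include <stdint.h>
#include <stdlib.h>

/* qsort comparator: largest extent first. */
static int cmp_desc(const void *a, const void *b)
{
    uint64_t x = *(const uint64_t *)a;
    uint64_t y = *(const uint64_t *)b;

    return (x < y) - (x > y);
}

/* max_extent[i] is the largest free extent found on device i.
 * Sorts the array in place and returns the stripe size that every
 * one of the num_stripes best devices can provide, or 0 if there
 * are not enough devices. */
static uint64_t pick_stripe_size(uint64_t *max_extent, size_t ndev,
                                 size_t num_stripes)
{
    if (num_stripes == 0 || ndev < num_stripes)
        return 0;

    qsort(max_extent, ndev, sizeof(*max_extent), cmp_desc);
    return max_extent[num_stripes - 1];
}
```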

btrfs: restructure find_free_dev_extent() · 7bfc837d

Authored by Miao Xie on Jan 05, 2011

- make it return the start position and length of the max free space when it
  cannot find a suitable free space.
- make it more readable
Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

7bfc837d

btrfs: fix wrong calculation of stripe size · 1974a3b4

Authored by Miao Xie on Jan 05, 2011

There are two small problems:
- When we check whether the chunk size is greater than the max chunk size,
  we should take mirrors into account, but the original code didn't.
- btrfs shouldn't use the size of the residual free space as the length of a
  dup chunk when doing chunk allocation. The device space that a dup chunk
  needs is twice as large as the chunk size, so if we use the size of the
  residual free space as the length of a dup chunk, we cannot get enough
  free space. Fix it.
Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
Reviewed-by: Josef Bacik <josef@redhat.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

1974a3b4
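The dup case from the second bullet is plain arithmetic: if every byte of chunk length costs two bytes on the same device, the longest chunk that fits is half of the residual free space. A minimal sketch, with `max_chunk_len` and its `copies_on_device` parameter as hypothetical names:

```c
#include <stdint.h>

/* Longest chunk that fits when each chunk byte is stored
 * copies_on_device times on the same device (2 for dup). */
static uint64_t max_chunk_len(uint64_t residual_free_bytes,
                              unsigned int copies_on_device)
{
    if (copies_on_device == 0)
        return 0;
    return residual_free_bytes / copies_on_device;
}
```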

btrfs: try to reclaim some space when chunk allocation fails · d52a5b5f

Authored by Miao Xie on Jan 05, 2011

We cannot write data into files when there is very little space left in the filesystem.

Steps to reproduce:
 # mkfs.btrfs /dev/sda1
 # mount /dev/sda1 /mnt
 # dd if=/dev/zero of=/mnt/tmpfile0 bs=4K count=1
 # dd if=/dev/zero of=/mnt/tmpfile1 bs=4K count=99999999999999
   (fill the filesystem)
 # umount /mnt
 # mount /dev/sda1 /mnt
 # rm -f /mnt/tmpfile0
 # dd if=/dev/zero of=/mnt/tmpfile0 bs=4K count=1
   (fails with ENOSPC)

But if we do the last step again, we can write data successfully. The reason for
the problem is that btrfs didn't try to commit the current transaction and
reclaim some space when chunk allocation failed.

This patch fixes it by committing the current transaction to reclaim some
space when chunk allocation fails.
Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
Reviewed-by: Josef Bacik <josef@redhat.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

d52a5b5f
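The fix described above is a "reclaim, then retry once" pattern. The toy model below assumes that space freed by a delete only becomes allocatable after a commit, which is how the reproduction steps behave. `struct fs_sim` and all function names are hypothetical and do not correspond to btrfs internals.

```c
#include <errno.h>
#include <stdint.h>

struct fs_sim {
    uint64_t free_bytes;     /* allocatable right now */
    uint64_t pending_bytes;  /* freed, but only usable after a commit */
};

static int try_alloc(struct fs_sim *fs, uint64_t bytes)
{
    if (fs->free_bytes < bytes)
        return -ENOSPC;
    fs->free_bytes -= bytes;
    return 0;
}

/* Model of a transaction commit: pending space becomes free. */
static void commit_transaction(struct fs_sim *fs)
{
    fs->free_bytes += fs->pending_bytes;
    fs->pending_bytes = 0;
}

/* On -ENOSPC, commit to reclaim space and retry exactly once. */
static int alloc_with_reclaim(struct fs_sim *fs, uint64_t bytes)
{
    int ret = try_alloc(fs, bytes);

    if (ret != -ENOSPC)
        return ret;
    commit_transaction(fs);
    return try_alloc(fs, bytes);
}
```

Retrying only once keeps the failure case bounded: if the commit reclaims nothing, the caller still gets -ENOSPC.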

btrfs: fix wrong data space statistics · 299a08b1

Authored by Miao Xie on Jan 05, 2011

Josef has implemented mixed data/metadata chunks; we must add those chunks'
space just like data chunks.
Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
Reviewed-by: Josef Bacik <josef@redhat.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

299a08b1

fs/btrfs: Fix build of ctree · f580eb09

Authored by Stefan Schmidt on Jan 12, 2011

CC [M]  fs/btrfs/ctree.o
In file included from fs/btrfs/ctree.c:21:0:
fs/btrfs/ctree.h:1003:17: error: field ‘super_kobj’ has incomplete type
fs/btrfs/ctree.h:1074:17: error: field ‘root_kobj’ has incomplete type
make[2]: *** [fs/btrfs/ctree.o] Error 1
make[1]: *** [fs/btrfs] Error 2
make: *** [fs] Error 2

We need to include kobject.h here.
Reported-by: Jeff Garzik <jeff@garzik.org>
Fix-suggested-by: Li Zefan <lizf@cn.fujitsu.com>
Signed-off-by: Stefan Schmidt <stefan@datenfreihafen.org>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

f580eb09

Merge branch 'lzo-support' of git://repo.or.cz/linux-btrfs-devel into btrfs-38 · f892436e
Authored by Chris Mason on Jan 16, 2011

f892436e

Merge branch 'readonly-snapshots' of git://repo.or.cz/linux-btrfs-devel into btrfs-38 · 26c79f6b
Authored by Chris Mason on Jan 16, 2011

26c79f6b

05 Jan, 2011 1 commit

Btrfs: fix off by one while setting block groups readonly · 65e5341b

Authored by Chris Mason on Dec 24, 2010

When we read in block groups, we'll set non-redundant groups
readonly if we find a raid1, DUP or raid10 group.  But the
ro code has an off by one bug in the math around testing to
make sure our accounting doesn't go wrong.
Signed-off-by: Chris Mason <chris.mason@oracle.com>

65e5341b

23 Dec, 2010 3 commits

Btrfs: Add BTRFS_IOC_SUBVOL_GETFLAGS/SETFLAGS ioctls · 0caa102d

Authored by Li Zefan on Dec 20, 2010

This allows us to set a snapshot or a subvolume readonly or writable
on the fly.

Usage:

Set BTRFS_SUBVOL_RDONLY of btrfs_ioctl_vol_arg_v2->flags, and then
call ioctl(BTRFS_IOC_SUBVOL_SETFLAGS);

Changelog for v3:

- Change to pass __u64 as ioctl parameter.

Changelog for v2:

- Add _GETFLAGS ioctl.
- Check if the passed fd is the root of a subvolume.
- Change the name from _SNAP_SETFLAGS to _SUBVOL_SETFLAGS.
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>

0caa102d
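From userspace, toggling the flag could look roughly like the sketch below. It assumes a system whose kernel headers provide `<linux/btrfs.h>` with `BTRFS_IOC_SUBVOL_GETFLAGS`, `BTRFS_IOC_SUBVOL_SETFLAGS` and `BTRFS_SUBVOL_RDONLY`, and it passes a plain `__u64` as the v3 changelog describes. `subvol_set_readonly` is a hypothetical helper; `fd` is expected to be an open descriptor for the root directory of a subvolume.

```c
#include <linux/btrfs.h>
#include <sys/ioctl.h>

/* Set or clear the read-only flag of the subvolume behind fd.
 * Returns 0 on success, -1 with errno set on failure. */
static int subvol_set_readonly(int fd, int readonly)
{
    __u64 flags;

    /* Read the current flags first so other bits are preserved. */
    if (ioctl(fd, BTRFS_IOC_SUBVOL_GETFLAGS, &flags) < 0)
        return -1;

    if (readonly)
        flags |= BTRFS_SUBVOL_RDONLY;
    else
        flags &= ~BTRFS_SUBVOL_RDONLY;

    return ioctl(fd, BTRFS_IOC_SUBVOL_SETFLAGS, &flags);
}
```

A caller would typically obtain `fd` with `open()` on the subvolume path and check `errno` when the helper returns -1.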

Btrfs: Add readonly snapshots support · b83cc969

Authored by Li Zefan on Dec 20, 2010

Usage:

Set BTRFS_SUBVOL_RDONLY of btrfs_ioctl_vol_arg_v2->flags, and call
ioctl(BTRFS_IOC_SNAP_CREATE_V2).

Implementation:

- Set readonly bit of btrfs_root_item->flags.
- Add readonly checks in btrfs_permission (inode_permission),
  btrfs_setattr, btrfs_set/remove_xattr and some ioctls.

Changelog for v3:

- Eliminate btrfs_root->readonly, but check btrfs_root->root_item.flags.
- Rename BTRFS_ROOT_SNAP_RDONLY to BTRFS_ROOT_SUBVOL_RDONLY.
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>

b83cc969

Btrfs: Refactor btrfs_ioctl_snap_create() · fa0d2b9b

Authored by Li Zefan on Dec 20, 2010

Split it into two functions for two different ioctls, since they
share no common code.
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>

fa0d2b9b

22 Dec, 2010 4 commits

btrfs: Extract duplicate decompress code · 3a39c18d

Authored by Li Zefan on Nov 08, 2010

Add a common function to copy decompressed data from the working buffer
to bio pages.
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>

3a39c18d

btrfs: Allow to specify compress method when defrag · 1a419d85

Authored by Li Zefan on Oct 25, 2010

Update the defrag ioctl, so one can choose lzo or zlib when turning
on compression in a defrag operation.

Changelog:

v1 -> v2
- Add incompatibility flag.
- Fix to check invalid compress type.
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>

1a419d85

btrfs: Add lzo compression support · a6fa6fae

Authored by Li Zefan on Oct 25, 2010

Lzo is a much faster compression algorithm than gzip, so it would allow
more users to enable transparent compression, and users can
choose between compression ratio and speed for different applications.

Usage:

 # mount -t btrfs -o compress[=<zlib,lzo>] dev /mnt
or
 # mount -t btrfs -o compress-force[=<zlib,lzo>] dev /mnt

"-o compress" without an argument is still allowed for compatibility.

Compatibility:

If we mount a filesystem with lzo compression, it can no longer be
mounted by old kernels. One reason is that otherwise btrfs would directly
dump compressed data, which sits in inline extents, to the user.

Performance:

The test copied a linux source tarball (~400M) from an ext4 partition
to the btrfs partition, and then extracted it.

(time in seconds)
           lzo        zlib        nocompress
copy:      10.6       21.7        14.9
extract:   70.1       94.4        66.6

(data size in MB)
           lzo        zlib        nocompress
copy:      185.87     108.69      394.49
extract:   193.80     132.36      381.21

Changelog:

v1 -> v2:
- Select LZO_COMPRESS and LZO_DECOMPRESS in btrfs Kconfig.
- Add incompatibility flag.
- Fix error handling in compress code.
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>

a6fa6fae

btrfs: Allow to add new compression algorithm · 261507a0

Authored by Li Zefan on Dec 17, 2010

Make the code aware of the compression type, instead of always assuming
zlib compression.

Also make the zlib workspace functions common code for all
compression types.
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>

261507a0