Commits · 61c4f2c81c61f73549928dfd9f3e8f26aa36a8cf · Linux-御风守护者 / linux

26 Apr, 2011 1 commit

btrfs: add missing spin_unlock to a rare exit path · cfece4db

Committed by David Sterba on Apr 25, 2011

Signed-off-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

cfece4db

12 Apr, 2011 1 commit

Btrfs: avoid taking the trans_mutex in btrfs_end_transaction · 13c5a93e

Committed by Josef Bacik on Apr 11, 2011

I've been working on making our O_DIRECT latency not suck and I noticed we were
taking the trans_mutex in btrfs_end_transaction. To avoid that, we convert
num_writers and use_count to atomic_t's and just decrement them in
btrfs_end_transaction. Instead of deleting the transaction from the trans list
in put_transaction, we do that in btrfs_commit_transaction(), since that's the
only time it actually needs to be removed from the list. Thanks,

Signed-off-by: Josef Bacik <josef@redhat.com>

13c5a93e
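
The counter change described in 13c5a93e can be illustrated in userspace with C11 atomics. This is only a sketch under stated assumptions: `struct txn`, `txn_init()`, `txn_join()` and `txn_end()` are hypothetical stand-ins, not the kernel's transaction structure or its `atomic_t` API.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a transaction; not the kernel structure. */
struct txn {
    atomic_int num_writers; /* writers currently attached */
    atomic_int use_count;   /* references keeping the object alive */
};

void txn_init(struct txn *t)
{
    atomic_init(&t->num_writers, 0);
    atomic_init(&t->use_count, 1); /* the creator holds one reference */
}

/* Join: take a reference and register as a writer, no mutex needed. */
void txn_join(struct txn *t)
{
    atomic_fetch_add(&t->use_count, 1);
    atomic_fetch_add(&t->num_writers, 1);
}

/*
 * End: drop the writer count and the reference with plain atomic
 * decrements. Returns true when the caller dropped the last reference.
 */
bool txn_end(struct txn *t)
{
    atomic_fetch_sub(&t->num_writers, 1);
    /* atomic_fetch_sub returns the value held before the subtraction. */
    return atomic_fetch_sub(&t->use_count, 1) == 1;
}
```

The point of the design is that the end path touches only two counters, so it never has to wait on a lock that a committer may be holding.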

05 Apr, 2011 1 commit

Btrfs: Fix uninitialized root flags for subvolumes · 08fe4db1

Committed by Li Zefan on Mar 28, 2011

root_item->flags and root_item->byte_limit are not initialized when
a subvolume is created. This bug was not revealed until we added
readonly snapshot support - now you mount a btrfs filesystem and you
may find the subvolumes in it are readonly.

To work around this problem, we steal a bit from root_item->inode_item->flags,
and use it to indicate if those fields have been properly initialized.
When we read a tree root from disk, we check if the bit is set, and if
not we'll set the flag and initialize the two fields of the root item.

Reported-by: Andreas Philipp <philipp.andreas@gmail.com>
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Tested-by: Andreas Philipp <philipp.andreas@gmail.com>
cc: stable@kernel.org
Signed-off-by: Chris Mason <chris.mason@oracle.com>

08fe4db1
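
The workaround in 08fe4db1 follows a common "marker bit" pattern, which might look roughly like this. The struct layout, the bit position and every name here are hypothetical and chosen for illustration only; they are not the btrfs on-disk format.

```c
#include <stdint.h>

/* Hypothetical marker bit; the position is arbitrary in this sketch. */
#define ROOT_FIELDS_INIT (1ULL << 48)

/* Simplified, hypothetical layout. */
struct inode_item_sketch {
    uint64_t flags;
};

struct root_item_sketch {
    struct inode_item_sketch inode;
    uint64_t flags;
    uint64_t byte_limit;
};

/*
 * Called after a root item is read from disk. If the marker bit is
 * missing, the two fields may hold garbage, so reset them and set the
 * marker. If the marker is present, the fields are trusted as they are.
 */
void root_item_fixup(struct root_item_sketch *r)
{
    if (!(r->inode.flags & ROOT_FIELDS_INIT)) {
        r->inode.flags |= ROOT_FIELDS_INIT;
        r->flags = 0;
        r->byte_limit = 0;
    }
}
```

Running the fixup twice is harmless: the second call sees the marker and leaves the fields alone.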

28 Mar, 2011 6 commits

btrfs: fix possible deadlock by clearing __GFP_FS flag · 1561deda

Committed by Miao Xie on Mar 27, 2011

Using the GFP_HIGHUSER_MOVABLE flag to allocate the metadata's page may cause
deadlock.
  Task1
  open()
    ...
    btrfs_search_slot()
      ...
      btrfs_cow_block()
	...
	alloc_page()
	  wait for reclaiming
					shrink_slab()
					  ...
					  shrink_icache_memory()
					    ...
					    btrfs_evict_inode()
					      ...
					      btrfs_search_slot()

If the path is locked by task1, the deadlock happens.

The btree's page cache differs from a file's page cache: it cannot
allocate pages with the GFP_HIGHUSER_MOVABLE flag as is, so we must clear the
__GFP_FS flag from GFP_HIGHUSER_MOVABLE.

Reported-by: Itaru Kitayama <kitayama@cl.bb4u.ne.jp>
Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

1561deda
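
Clearing one flag from a combined mask, as 1561deda describes, is a plain bit operation. The sketch below uses made-up `SK_` constants with arbitrary values; the kernel's real gfp bits and helper names are different and are not reproduced here.

```c
/* Hypothetical bit values for illustration only. */
#define SK_GFP_WAIT    0x01u
#define SK_GFP_IO      0x02u
#define SK_GFP_FS      0x04u /* allocation may call back into the filesystem */
#define SK_GFP_MOVABLE 0x08u
#define SK_GFP_HIGHUSER_MOVABLE \
    (SK_GFP_WAIT | SK_GFP_IO | SK_GFP_FS | SK_GFP_MOVABLE)

/*
 * Mask to use for btree (metadata) pages: identical to the input,
 * except that reclaim is no longer allowed to re-enter the filesystem.
 */
unsigned int btree_alloc_mask(unsigned int mask)
{
    return mask & ~SK_GFP_FS;
}
```

With the FS bit cleared, an allocation made while a tree path is locked cannot trigger inode eviction that would need the same path.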

Btrfs: fix OOPS of empty filesystem after balance · c59021f8

Committed by liubo on Mar 07, 2011

btrfs will remove unused block groups after balance.
When an empty filesystem is balanced, the block group with tag "DATA" may be
dropped, and after umount and mount again, it will not find the "DATA"
space_info, which leads to an OOPS.
So we initialize the necessary space_infos (DATA, SYSTEM, METADATA) to avoid
the OOPS.

Reported-by: Daniel J Blueman <daniel.blueman@gmail.com>
Signed-off-by: Liu Bo <liubo2009@cn.fujitsu.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

c59021f8

Btrfs: adjust btrfs_discard_extent() return errors and trimmed bytes · 5378e607

Committed by Li Dongyang on Mar 24, 2011

Callers of btrfs_discard_extent() should check whether we are mounted with
-o discard, because we want fitrim to work even if the fs is not mounted with
-o discard. Also, we should use REQ_DISCARD to map the free extent to get a
full mapping. Lastly, we only return errors if
1. the error is not EOPNOTSUPP
2. no device supports discard

Signed-off-by: Li Dongyang <lidongyang@novell.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

5378e607

Btrfs: Per file/directory controls for COW and compression · 75e7cb7f

Committed by Liu Bo on Mar 22, 2011

Data compression and data cow are controlled across the entire FS by mount
options right now.  ioctls are needed to set this on a per file or per
directory basis.  This has been proposed previously, but VFS developers
wanted us to use generic ioctls rather than btrfs-specific ones.

According to Chris's comment, there should be just one true compression
method (probably LZO) stored in the super.  However, before doing that, we
should wait until that one method is stable enough to be adopted into the super.
So I list it as a long term goal, and just store it in ram today.

After applying this patch, we can use the generic "FS_IOC_SETFLAGS" ioctl to
control the datacow and compression attributes of files and directories.

NOTE:
 - The compression type is selected by these rules:
   If btrfs is mounted with a compress option (i.e. zlib or lzo), that type is
   used. Otherwise, we'll use the default compress type (zlib today).

v1->v2:
- rebase to the latest btrfs.
v2->v3:
- fix a problem, i.e. when a file is set NOCOW via mount option, then this NOCOW
  will be screwed by inheritance from parent directory.

Signed-off-by: Liu Bo <liubo2009@cn.fujitsu.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

75e7cb7f
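
From userspace, the generic flag ioctls mentioned in 75e7cb7f are used as a read-modify-write pair. A minimal sketch, assuming a Linux system with `<linux/fs.h>`; the helper name `update_inode_flags()` is hypothetical, and which flags a given kernel and filesystem honor is not stated in the commit message, so `FS_NOCOW_FL` and `FS_COMPR_FL` are only plausible candidates.

```c
#include <linux/fs.h>
#include <sys/ioctl.h>

#ifndef FS_NOCOW_FL
#define FS_NOCOW_FL 0x00800000 /* fallback for headers that predate the flag */
#endif

/*
 * Read the current inode flags of an open file or directory, set the
 * bits in `set`, clear the bits in `clear`, and write the result back.
 * Returns 0 on success and -1 on failure, with errno set by ioctl().
 */
int update_inode_flags(int fd, int set, int clear)
{
    int flags;

    if (ioctl(fd, FS_IOC_GETFLAGS, &flags) < 0)
        return -1;
    flags = (flags | set) & ~clear;
    return ioctl(fd, FS_IOC_SETFLAGS, &flags);
}
```

A caller would open the target with `open()` and then call, for example, `update_inode_flags(fd, FS_NOCOW_FL, 0)` to request no-COW or `update_inode_flags(fd, FS_COMPR_FL, 0)` to request compression. Reading the flags first keeps unrelated attributes intact.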

btrfs: properly access unaligned checksum buffer · 7e75bf3f

Committed by David Sterba on Mar 18, 2011

On Fri, Mar 18, 2011 at 11:56:53AM -0400, Chris Mason wrote:
> Thanks for fielding this one.  Does put_unaligned_le32 optimize away on
> platforms with efficient access?  It would be great if we didn't need
> the #ifdef.

(quicktest: assembly output is same for put_unaligned_le32 and direct
assignment on my x86_64)
I was originally following examples in
Documentation/unaligned-memory-access.txt. From other code it seems to me that
the define CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS is intended for larger
portions of code. Macros/wrappers for {put,get}_unaligned* are chosen via
arch/<arch>/include/asm/unaligned.h accordingly, therefore it's safe to use
put_unaligned_le32 without the ifdef.

dave

Signed-off-by: Chris Mason <chris.mason@oracle.com>

7e75bf3f
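
The kernel helpers discussed in 7e75bf3f are not available in userspace, but the idea of an alignment-safe little-endian store can be sketched with byte-wise access. The function names below are hypothetical and this is not the kernel implementation.

```c
#include <stdint.h>

/* Store a 32-bit value in little-endian order at any address. */
void store_le32_unaligned(uint32_t val, void *p)
{
    unsigned char *b = p;

    /* Byte stores never require the destination to be 4-byte aligned. */
    b[0] = (unsigned char)(val & 0xff);
    b[1] = (unsigned char)((val >> 8) & 0xff);
    b[2] = (unsigned char)((val >> 16) & 0xff);
    b[3] = (unsigned char)((val >> 24) & 0xff);
}

/* Load a 32-bit little-endian value from any address. */
uint32_t load_le32_unaligned(const void *p)
{
    const unsigned char *b = p;

    return (uint32_t)b[0] |
           ((uint32_t)b[1] << 8) |
           ((uint32_t)b[2] << 16) |
           ((uint32_t)b[3] << 24);
}
```

Compilers commonly turn such byte sequences into a single load or store on architectures that handle unaligned access efficiently, which matches the quick test reported in the message.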

Btrfs: cleanup some BUG_ON() · db5b493a

Committed by Tsutomu Itoh on Mar 23, 2011

This patch changes some BUG_ON() calls into error returns
(although most callers still use BUG_ON()).

Signed-off-by: Tsutomu Itoh <t-itoh@jp.fujitsu.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

db5b493a

18 Mar, 2011 2 commits

Btrfs: check items for correctness as we search · a826d6dc

Committed by Josef Bacik on Mar 16, 2011

Currently if we have corrupted items things will blow up in spectacular ways.
So as we read in blocks and they are leaves, check the entire leaf to make sure
all of the items are correct and point to valid parts in the leaf for the item
data they are responsible for. If the item is corrupt we will kick back EIO and
not read any of the copies since they are likely to not be correct either. This
will catch generic corruptions; it will be up to the individual callers of
btrfs_search_slot to make sure their items are right. Thanks,

Signed-off-by: Josef Bacik <josef@redhat.com>

a826d6dc

Btrfs: handle errors in btrfs_orphan_cleanup · 66b4ffd1

Committed by Josef Bacik on Jan 31, 2011

If we cannot truncate an inode for some reason we will never delete the orphan
item associated with that inode, which means that we will loop forever in
btrfs_orphan_cleanup. Instead of doing this, just return an error so we fail to
mount. It sucks, but hey it's better than hanging. Thanks,

Signed-off-by: Josef Bacik <josef@redhat.com>

66b4ffd1

10 Mar, 2011 1 commit

block: remove per-queue plugging · 7eaceacc

Committed by Jens Axboe on Mar 10, 2011

Code has been converted over to the new explicit on-stack plugging,
and delay users have been converted to use the new API for that.
So let's kill off the old plugging along with aops->sync_page().

Signed-off-by: Jens Axboe <jaxboe@fusionio.com>

7eaceacc

01 Mar, 2011 1 commit

Remove one to many n's in a word · ae0e47f0

Committed by Justin P. Mattock on Mar 01, 2011

Signed-off-by: Justin P. Mattock <justinmattock@gmail.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>

ae0e47f0

15 Feb, 2011 1 commit

Btrfs: fix page->private races · eb14ab8e

Committed by Chris Mason on Feb 10, 2011

There is a race where btrfs_releasepage can drop the
page->private contents just as alloc_extent_buffer is setting
up pages for metadata.  Because of how the Btrfs page flags work,
this results in us skipping the crc on the page during IO.

This patch solves the race by waiting until after the extent buffer
is inserted into the radix tree before it sets page private.

Signed-off-by: Chris Mason <chris.mason@oracle.com>

eb14ab8e

29 Jan, 2011 1 commit

btrfs: fix return value check of btrfs_join_transaction() · 3612b495

Committed by Tsutomu Itoh on Jan 25, 2011

This patch adds error checks for btrfs_join_transaction() and
btrfs_join_transaction_nolock(), and corrects mistaken error checks in
several places.

For a more stable Btrfs, I think that we should reduce BUG_ON().
But I think that this will take a long time.
So I propose this patch as a short-term solution.

With this patch:
 - The parts that should be corrected for a more stable Btrfs are made clear.
 - No panic is caused by a NULL pointer dereference etc. (even if
   the number of BUG_ON() calls increases temporarily)
 - The error code is returned in the places where the error can be easily
   returned.

As a long-term plan:
 - BUG_ON() is reduced by using the forced-readonly framework, etc.

Signed-off-by: Tsutomu Itoh <t-itoh@jp.fujitsu.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

3612b495

27 Jan, 2011 1 commit

Btrfs: Fix memory leak at umount · 83a4d548

Committed by Li Zefan on Dec 27, 2010

fs_info, which is allocated in open_ctree(), should be freed
in close_ctree().

Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>

83a4d548

18 Jan, 2011 1 commit

Btrfs: forced readonly mounts on errors · acce952b

Committed by liubo on Jan 06, 2011

This patch comes from the "Forced readonly mounts on errors" idea.

As we know, this is the first step in being more fault tolerant of disk
corruptions instead of just using BUG() statements.

The major content:
- add a framework for generating errors that should result in filesystems
  going readonly.
- keep FS state in disk super block.
- make sure that all resources are freed and released at umount time.
- make sure that after the FS is forced readonly on error, there will be no
  more disk changes before the FS is corrected. For this, we should stop write
  operations.

After this patch is applied, the conversion from BUG() to such a framework can
happen incrementally.

Signed-off-by: Liu Bo <liubo2009@cn.fujitsu.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

acce952b

17 Jan, 2011 3 commits

btrfs: Fix memory leak in btrfs_read_fs_root_no_radix() · 5e540f77

Committed by Tsutomu Itoh on Dec 27, 2010

In btrfs_read_fs_root_no_radix(), 'root' is not freed if
btrfs_search_slot() returns an error.

Signed-off-by: Tsutomu Itoh <t-itoh@jp.fujitsu.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

5e540f77

btrfs: check NULL or not · 91ca338d

Committed by Tsutomu Itoh on Jan 05, 2011

We should check whether these functions return NULL.

Signed-off-by: Tsutomu Itoh <t-itoh@jp.fujitsu.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

91ca338d

btrfs: mount failure return value fix · 20b45077

Committed by Dave Young on Jan 08, 2011

I happened to pass a swap partition as the root partition on the cmdline,
and the kernel panicked and told me "Cannot open root device".
That is not correct; in fact it is a fs type mismatch rather than 'no device'.

Eventually I found that btrfs mounting failed with -EIO; it should be -EINVAL.
The logic in init/do_mounts.c:
        for (p = fs_names; *p; p += strlen(p)+1) {
                int err = do_mount_root(name, p, flags, root_mount_data);
                switch (err) {
                        case 0:
                                goto out;
                        case -EACCES:
                                flags |= MS_RDONLY;
                                goto retry;
                        case -EINVAL:
                                continue;
                }
		print "Cannot open root device"
		panic
	}
So any fs type listed after btrfs has no chance to mount.

Here, fix the return value to be -EINVAL.

Signed-off-by: Dave Young <hidave.darkstar@gmail.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

20b45077
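
The effect of the return value in 20b45077 can be reproduced with a small userspace model of the probing loop. Everything here is a simplified assumption: `try_mount_root()` and the two probe functions are hypothetical, the -EACCES retry is left out, and the device is assumed to hold ext4.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical probe callback: 0 on success, negative errno on failure. */
typedef int (*probe_fn)(const char *fstype);

/*
 * Try each filesystem type in order. Returns the index of the type
 * that mounted, or a negative errno if the search stopped early.
 */
int try_mount_root(const char *const *types, size_t n, probe_fn probe)
{
    for (size_t i = 0; i < n; i++) {
        int err = probe(types[i]);

        if (err == 0)
            return (int)i;  /* mounted with types[i] */
        if (err == -EINVAL)
            continue;       /* wrong fs type, try the next one */
        return err;         /* any other error ends the search */
    }
    return -EINVAL;         /* no type matched */
}

/* Before the fix: the btrfs probe reports -EIO for a non-btrfs device. */
int probe_before_fix(const char *fstype)
{
    if (strcmp(fstype, "btrfs") == 0)
        return -EIO;
    return strcmp(fstype, "ext4") == 0 ? 0 : -EINVAL;
}

/* After the fix: -EINVAL lets the loop move on to the next type. */
int probe_after_fix(const char *fstype)
{
    return strcmp(fstype, "ext4") == 0 ? 0 : -EINVAL;
}
```

With the type list `{ "btrfs", "ext4" }`, the first probe stops the search with -EIO, while the second lets the loop reach ext4.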

22 Dec, 2010 1 commit

btrfs: Add lzo compression support · a6fa6fae

Committed by Li Zefan on Oct 25, 2010

Lzo is a much faster compression algorithm than gzip, so it would allow
more users to enable transparent compression, and users can trade off
compression ratio against speed for different applications.

Usage:

 # mount -t btrfs -o compress[=<zlib,lzo>] dev /mnt
or
 # mount -t btrfs -o compress-force[=<zlib,lzo>] dev /mnt

"-o compress" without argument is still allowed for compatibility.

Compatibility:

If we mount a filesystem with lzo compression, it will not be able to be
mounted in old kernels. One reason is that otherwise btrfs would directly
dump compressed data, which sits in an inline extent, to the user.

Performance:

The test copied a linux source tarball (~400M) from an ext4 partition
to the btrfs partition, and then extracted it.

(time in seconds)
           lzo        zlib        nocompress
copy:      10.6       21.7        14.9
extract:   70.1       94.4        66.6

(data size in MB)
           lzo        zlib        nocompress
copy:      185.87     108.69      394.49
extract:   193.80     132.36      381.21

Changelog:

v1 -> v2:
- Select LZO_COMPRESS and LZO_DECOMPRESS in btrfs Kconfig.
- Add incompatibility flag.
- Fix error handling in compress code.

Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>

a6fa6fae

14 Dec, 2010 1 commit

Btrfs: EIO when we fail to read tree roots · 68433b73

Committed by Chris Mason on Dec 13, 2010

If we just get a plain IO error when we read tree roots, the code
wasn't properly sending that error up the chain.  This allowed mounts to
continue when they should have failed, and allowed operations
on partially set up root structs.  The end result was usually oopsen
on spinlocks that hadn't been spun up correctly.

Signed-off-by: Chris Mason <chris.mason@oracle.com>

68433b73

11 Dec, 2010 1 commit

Btrfs: fix compiler warnings · 3dd1462e

Committed by Jan Beulich on Dec 07, 2010

... regarding an unused function when !MIGRATION, and regarding a
printk() format string vs argument mismatch.

Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

3dd1462e

29 Nov, 2010 1 commit

Btrfs: don't use migrate page without CONFIG_MIGRATION · 5a92bc88

Committed by Chris Mason on Nov 29, 2010

Fixes compile error

Signed-off-by: Chris Mason <chris.mason@oracle.com>

5a92bc88

28 Nov, 2010 1 commit

Btrfs: setup blank root and fs_info for mount time · 450ba0ea

Committed by Josef Bacik on Nov 19, 2010

There is a problem with how we use sget: it searches through the list of supers
attached to the fs_type looking for a super with the same fs_devices as what
we're trying to mount. This depends on sb->s_fs_info being filled, but we don't
fill that in until we get to btrfs_fill_super, so we could hit supers on the
fs_type super list that have a null s_fs_info. In order to fix that we need to
go ahead and set up a blank root with a blank fs_info to hold fs_devices. That
way our test will work out right and then we can set s_fs_info in
btrfs_set_super, and then open_ctree will simply use our pre-allocated root and
fs_info when setting everything up. Thanks,

Signed-off-by: Josef Bacik <josef@redhat.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

450ba0ea

22 Nov, 2010 1 commit

Btrfs: add migrate page for metadata inode · 784b4e29

Committed by Chris Mason on Nov 21, 2010

Migrate page will directly call the btrfs btree writepage function,
which isn't actually allowed.

Our writepage assumes that you have locked the extent_buffer and
flagged the block as written.  Without doing these steps, we can
corrupt metadata blocks.

A later commit will remove the btree writepage function since
it is really only safely used internally by btrfs.  We
use writepages for everything else.

Signed-off-by: Chris Mason <chris.mason@oracle.com>

784b4e29

30 Oct, 2010 2 commits

Btrfs: async transaction commit · bb9c12c9

Committed by Sage Weil on Oct 29, 2010

Add support for an async transaction commit that is ordered such that any
subsequent operations will join the following transaction, but does not
wait until the current commit is fully on disk. This avoids much of the
latency associated with the btrfs_commit_transaction for callers concerned
with serialization and not safety.

The wait_for_unblock flag controls whether we wait for the 'middle' portion
of commit_transaction to complete, which is necessary if the caller expects
some of the modifications contained in the commit to be available (this is
the case for subvol/snapshot creation).

Signed-off-by: Sage Weil <sage@newdream.net>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

bb9c12c9

Btrfs: cleanup warnings from gcc 4.6 (nonbugs) · 559af821

Committed by Andi Kleen on Oct 29, 2010

These are all the cases where a variable is set, but not read which are
not bugs as far as I can see, but simply leftovers.

Still needs more review.

Found by gcc 4.6's new warnings

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Cc: Chris Mason <chris.mason@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

559af821

29 Oct, 2010 2 commits

Btrfs: write out free space cache · 0cb59c99

Committed by Josef Bacik on Jul 02, 2010

This is a simple bit, just dump the free space cache out to our preallocated
inode when we're writing out dirty block groups. There are a bunch of changes
in inode.c in order to account for special cases. Mostly when we're doing the
writeout we're holding trans_mutex, so we need to use the nolock transaction
functions. Also we can't do asynchronous completions since the async thread
could be blocked on already completed IO waiting for the transaction lock. This
has been tested with xfstests and btrfs filesystem balance, as well as my ENOSPC
tests. Thanks,

Signed-off-by: Josef Bacik <josef@redhat.com>

0cb59c99

Btrfs: create special free space cache inode · 0af3d00b

Committed by Josef Bacik on Jun 21, 2010

In order to save the free space cache, we need an inode to hold the data, and we
need a special item to point at the right inode for the right block group. So
first, create a special item that will point to the right inode, and record the
number of extent entries and the number of bitmaps we will have. We
truncate and pre-allocate space every time to make sure it's up to date.

This feature will be turned on as soon as you mount with -o space_cache; however,
it is safe to boot into old kernels, which will just generate the cache the
old-fashioned way. When you boot back into a newer kernel we will notice that
the filesystem was modified but the cache was not, and automatically discard the
cache.

Signed-off-by: Josef Bacik <josef@redhat.com>

0af3d00b

10 Sep, 2010 1 commit

btrfs: replace barriers with explicit flush / FUA usage · c3b9a62c

Committed by Christoph Hellwig on Aug 18, 2010

Switch to the WRITE_FLUSH_FUA flag for log writes, remove the EOPNOTSUPP
detection for barriers and stop setting the barrier flag for discards.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Chris Mason <chris.mason@oracle.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>

c3b9a62c

08 Aug, 2010 1 commit

block: unify flags for struct bio and struct request · 7b6d91da

Committed by Christoph Hellwig on Aug 07, 2010

Remove the current bio flags and reuse the request flags for the bio, too.
This makes it easier to trace the type of I/O from the filesystem
down to the block driver. There were two flags in the bio that were
missing in the requests: BIO_RW_UNPLUG and BIO_RW_AHEAD. Also I've
renamed two request flags that had a superfluous RW in them.

Note that the flags are in bio.h despite having the REQ_ name - as
blkdev.h includes bio.h that is the only way to go for now.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>

7b6d91da

12 Jun, 2010 2 commits

Btrfs: btrfs_read_fs_root_no_name() returns ERR_PTRs · 3140c9a3

Committed by Dan Carpenter on May 29, 2010

btrfs_read_fs_root_no_name() returns ERR_PTRs on error so I added a
check for that.  It's not clear to me if it can also return NULL
pointers or not so I left the original NULL pointer check as is.

Signed-off-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

3140c9a3
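
The double check described in 3140c9a3 (error pointer first, NULL second) can be modelled in userspace. The `sk_` helpers below only imitate the idea behind the kernel's error-pointer convention and are hypothetical, as are `sk_read_root()`, `sk_open_root()` and the choice of -EINVAL for the NULL case.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define SK_MAX_ERRNO 4095

/* Encode a negative errno in a pointer value. */
static inline void *sk_err_ptr(intptr_t err) { return (void *)err; }

/* Decode the errno from an error pointer. */
static inline intptr_t sk_ptr_err(const void *p) { return (intptr_t)p; }

/* Error pointers occupy the top SK_MAX_ERRNO addresses. */
static inline int sk_is_err(const void *p)
{
    return (uintptr_t)p >= (uintptr_t)-SK_MAX_ERRNO;
}

struct sk_root { int id; };

/* Hypothetical lookup: valid pointer, NULL, or an encoded error. */
struct sk_root *sk_read_root(int id)
{
    static struct sk_root root;

    if (id < 0)
        return sk_err_ptr(-ENOENT);
    if (id == 0)
        return NULL;
    root.id = id;
    return &root;
}

/* Caller that handles both failure styles before dereferencing. */
int sk_open_root(int id)
{
    struct sk_root *root = sk_read_root(id);

    if (sk_is_err(root))
        return (int)sk_ptr_err(root); /* propagate the encoded errno */
    if (!root)
        return -EINVAL;               /* NULL check kept as a fallback */
    return root->id;
}
```

Keeping both checks costs one extra comparison and stays correct whichever convention the callee ends up using.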

Btrfs: handle kzalloc() failure in open_ctree() · 676e4c86

Committed by Dan Carpenter on May 29, 2010

Unwind and return -ENOMEM if the allocation fails here.

Signed-off-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

676e4c86

25 May, 2010 6 commits

Btrfs: use async helpers for DIO write checksumming · eaf25d93

Committed by Chris Mason on May 25, 2010

The async helper threads offload crc work onto all the
CPUs, and make streaming writes much faster.  This
changes the O_DIRECT write code to use them.  The only
small complication was that we need to pass in the
logical offset in the file for each bio, because we can't
find it in the bio's pages.

Signed-off-by: Chris Mason <chris.mason@oracle.com>

eaf25d93

Btrfs: Metadata ENOSPC handling for tree log · 4a500fd1

Committed by Yan, Zheng on May 16, 2010

Previous patches make the allocator return -ENOSPC if there is no
unreserved free metadata space. This patch updates tree log code
and various other places to propagate/handle the ENOSPC error.

Signed-off-by: Yan Zheng <zheng.yan@oracle.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

4a500fd1

Btrfs: Metadata reservation for orphan inodes · d68fc57b

Committed by Yan, Zheng on May 16, 2010

Reserve metadata space for handling orphan inodes.

Signed-off-by: Yan Zheng <zheng.yan@oracle.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

d68fc57b

Btrfs: Introduce global metadata reservation · 8929ecfa

Committed by Yan, Zheng on May 16, 2010

Reserve metadata space for the extent tree, checksum tree and root tree.

Signed-off-by: Yan Zheng <zheng.yan@oracle.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

8929ecfa

Btrfs: Integrate metadata reservation with start_transaction · a22285a6

Committed by Yan, Zheng on May 16, 2010

Besides simplifying the code, this change makes sure all metadata
reservations for normal metadata operations are released after
committing the transaction.

Changes since V1:

Add code that checks if unlink and rmdir will free space.

Add ENOSPC handling for clone ioctl.

Signed-off-by: Yan Zheng <zheng.yan@oracle.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

a22285a6

Btrfs: Introduce contexts for metadata reservation · f0486c68

Committed by Yan, Zheng on May 16, 2010

Introducing metadata reservation contexts has two major advantages.
First, it makes metadata reservation more traceable. Second, a context can
reclaim freed space and re-add it to itself after the transaction is
committed.

Besides adding the btrfs_block_rsv structure and related helper functions,
this patch contains the following changes:

Move code that decides if a freed tree block should be pinned into
btrfs_free_tree_block().

Make space accounting more accurate, mainly for handling read-only
block groups.

Signed-off-by: Chris Mason <chris.mason@oracle.com>

f0486c68
