Commits · 6df9a95e63395f595d0d1eb5d561dd6c91c40270 · xiphi1978 / linux

02 Jul, 2013 · 9 commits

Btrfs: make the chunk allocator completely tree lockless · 6df9a95e

Committed by Josef Bacik on Jun 27, 2013

When adjusting the enospc rules for relocation I ran into a deadlock because we
were relocating the only system chunk and that forced us to try and allocate a
new system chunk while holding locks in the chunk tree, which caused us to
deadlock. To fix this I've moved all of the dev extent addition and chunk
addition out to the delayed chunk completion stuff. We still keep the in-memory
stuff which makes sure everything is consistent.

One change I had to make was to search the commit root of the device tree to
find a free dev extent, and hold onto any chunk em's that we allocated in that
transaction so we do not allocate the same dev extent twice. This has the side
effect of fixing a bug with balance that has been there ever since balance
existed. Basically you can free a block group and its dev extent and then
immediately allocate that dev extent for a new block group and write stuff to
that dev extent, all within the same transaction. So if you happen to crash
during a balance you could come back to a completely broken file system. This
patch should keep this sort of thing from happening in the future since we
won't be able to allocate free'd dev extents until after the transaction
commits. This has passed all of the xfstests and my super annoying stress test
followed by a balance. Thanks,
Signed-off-by: Josef Bacik <jbacik@fusionio.com>

Btrfs: cleanup orphaned root orphan item · 68a7342c

Committed by Josef Bacik on Jun 27, 2013

I hit a weird problem where my root item had been deleted but the orphan item had
not. This isn't necessarily a problem, but it keeps the file system from being
mounted. To fix this we just need to axe the orphan item if we can't find the
fs root when we're putting them all together. With this patch I was able to
successfully mount my file system. Thanks,
Signed-off-by: Josef Bacik <jbacik@fusionio.com>

Btrfs: fix wrong mirror number tuning · a70c6172

Committed by Miao Xie on Jun 19, 2013

Reading data from the target device of a replace operation is now allowed,
so a mirror number greater than the number of stripes in a chunk is valid;
we tune it later, when we find there is no target device. Fix it.
Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>

Btrfs: cleanup redundant code in btrfs_submit_direct() · e6da5d2e

Committed by Miao Xie on Jun 19, 2013

Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>

Btrfs: remove btrfs_sector_sum structure · f51a4a18

Committed by Miao Xie on Jun 19, 2013

Using the btrfs_sector_sum structure to hold the checksum values is
unnecessary. The extents that btrfs_sector_sum points to are contiguous,
so we can find the expected checksums from btrfs_ordered_sum's bytenr and
the offset, which lets us remove btrfs_sector_sum's bytenr. After removing
bytenr only one member is left in the structure, so it makes no sense to
keep it. Just remove the structure and use a u32 array to
store the checksum value.
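
The indexing argument above can be sketched in plain userspace C. This is an illustration only, not the kernel code: `ordered_sum_sketch`, its fields, and `SECTOR_SIZE` are assumed names, and the sketch assumes one u32 checksum per 4096-byte sector of a contiguous extent.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SECTOR_SIZE 4096u /* assumed sector size for this sketch */

/* Hypothetical stand-in for an ordered-sum record covering one contiguous extent. */
struct ordered_sum_sketch {
    uint64_t bytenr;      /* disk byte offset of the first sector */
    uint32_t len;         /* length of the extent in bytes */
    const uint32_t *sums; /* one checksum per sector, in disk order */
};

/* Return the checksum that covers disk_bytenr, which must lie inside the extent. */
static uint32_t csum_for_bytenr(const struct ordered_sum_sketch *os,
                                uint64_t disk_bytenr)
{
    assert(disk_bytenr >= os->bytenr);
    assert(disk_bytenr < os->bytenr + os->len);

    /* Because the extent is contiguous, the array index follows from the offset. */
    size_t index = (size_t)((disk_bytenr - os->bytenr) / SECTOR_SIZE);
    return os->sums[index];
}
```

No per-sector bytenr needs to be stored in this layout, since the position in the array already encodes it.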

With this change we no longer use a while loop to get the checksums one by
one. We can now get several checksum values at a time, which improved
performance by ~74% on my SSD (31MB/s -> 54MB/s).

test command:
 # dd if=/dev/zero of=/mnt/btrfs/file0 bs=1M count=1024 oflag=sync
Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>

Btrfs: check if we can nocow if we don't have data space · 7ee9e440

Committed by Josef Bacik on Jun 21, 2013

We always just try and reserve data space when we write, but if we are out of
space but have prealloc'ed extents we should still successfully write. This
patch will try and see if we can write to prealloc'ed space and if we can go
ahead and allow the write to continue. With this patch we now pass xfstests
generic/274. Thanks,
Signed-off-by: Josef Bacik <jbacik@fusionio.com>

Btrfs: stop using try_to_writeback_inodes_sb_nr to flush delalloc · 925a6efb

Committed by Josef Bacik on Jun 20, 2013

try_to_writeback_inodes_sb_nr returns 1 if writeback is already underway, which
is completely fraking useless for us as we need to make sure pages are actually
written before we go and check if there are ordered extents. So replace this
with an open coding of try_to_writeback_inodes_sb_nr minus the writeback
underway check so that we are sure to actually have flushed some dirty pages out
and will have ordered extents to use. With this patch xfstests generic/273 now
passes. Thanks,
Signed-off-by: Josef Bacik <jbacik@fusionio.com>

Btrfs: use a percpu to keep track of possibly pinned bytes · b150a4f1

Committed by Josef Bacik on Jun 19, 2013

There are all of these checks in the ENOSPC code to see if committing the
transaction would free up enough space to make the allocation. This is because
early on we just committed the transaction and hoped and prayed, which resulted
in cases where it took _forever_ to get an ENOSPC when we really were out of
space. So we check space_info->bytes_pinned, except this isn't completely true
because it doesn't account for space that we may free but that is stuck in delayed refs.
So tests like xfstests 226 would fail because we wouldn't commit the transaction
to free up the data space. So instead add a percpu counter that will be a
little fuzzier, it will add bytes as soon as we try to free up the space, and
remove any space it doesn't actually free up when we get around to doing the
actual free. We then 0 out this counter every transaction period so we have a
better idea of how much space we will actually free up by committing this
transaction. With this patch we now pass xfstests 226. Thanks,
Signed-off-by: Josef Bacik <jbacik@fusionio.com>

Btrfs: check for actual acls rather than just xattrs when caching no acl · f23b5a59

Committed by Josef Bacik on Jun 19, 2013

We have an optimization that will go ahead and cache no acls on an inode if
there are no xattrs on the inode. This saves us a lookup later to check the
acls for writes or any other access. The problem is I use selinux so I always
have an xattr on inodes, so make this test a little smarter and check for the
actual acl hash on the key and if it isn't there then we still get to cache no
acl which makes everybody who uses selinux a little happier. Thanks,
Signed-off-by: Josef Bacik <jbacik@fusionio.com>

01 Jul, 2013 · 13 commits

Btrfs: move btrfs_truncate_page to btrfs_cont_expand instead of btrfs_truncate · a71754fc

Committed by Josef Bacik on Jun 17, 2013

This has plagued us forever and I'm so over working around it. When we truncate
down to a non-page aligned offset we will call btrfs_truncate_page to zero out
the end of the page and write it back to disk, this will keep us from exposing
stale data if we truncate back up from that point. The problem with this is it
requires data space to do this, and people don't really expect to get ENOSPC
from truncate() for this sort of thing. This also tends to bite the orphan
cleanup stuff too which keeps people from mounting. To get around this we can
just move this into btrfs_cont_expand() to make sure if we are truncating up
from a non-page size aligned i_size we will zero out the rest of this page so
that we don't expose stale data. This will give ENOSPC if you try to truncate()
up or if you try to write past the end of i_size, which is much more reasonable.
This fixes xfstests generic/083 failing to mount because of the orphan cleanup
failing. Thanks,
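
The "rest of this page" that gets zeroed when expanding from an unaligned i_size can be computed as below. This is a rough userspace sketch under stated assumptions: `PAGE_SIZE_SKETCH` and `tail_bytes_to_zero` are illustrative names, a 4096-byte page is assumed, and none of this is the kernel implementation.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SIZE_SKETCH 4096u /* assumed page size for this sketch */

/*
 * Number of bytes between i_size and the end of the page that contains it.
 * Returns 0 when i_size is already page aligned, so nothing needs zeroing.
 */
static uint32_t tail_bytes_to_zero(uint64_t i_size)
{
    /* The mask trick below requires a power-of-two page size. */
    assert((PAGE_SIZE_SKETCH & (PAGE_SIZE_SKETCH - 1)) == 0);

    uint32_t offset_in_page = (uint32_t)(i_size & (PAGE_SIZE_SKETCH - 1));
    return offset_in_page ? PAGE_SIZE_SKETCH - offset_in_page : 0;
}
```

For an i_size of 5000, the offset within the page is 904, so 3192 bytes up to the page boundary would be zeroed before the file grows.
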
Signed-off-by: Josef Bacik <jbacik@fusionio.com>

Btrfs: optimize reada_for_balance · 0b08851f

Committed by Josef Bacik on Jun 17, 2013

This patch does two things. First we no longer explicitly read in the blocks
we're trying to readahead. For things like balance_level we may never actually
use the blocks so this just adds unneeded latency, and balance_level and
split_node will both read in the blocks they care about explicitly so if the
blocks need to be waited on it will be done there. Secondly we no longer drop
the path if we do readahead, we just set the path blocking before we call
reada_for_balance() and then we're good to go. Hopefully this will cut down on
the number of re-searches. Thanks,
Signed-off-by: Josef Bacik <jbacik@fusionio.com>

Btrfs: optimize read_block_for_search · bdf7c00e

Committed by Josef Bacik on Jun 17, 2013

This patch does two things, first it only does one call to
btrfs_buffer_uptodate() with the gen specified instead of once with 0 and then
again with gen specified. The other thing is to call btrfs_read_buffer() on the
buffer we've found instead of dropping it and then calling read_tree_block().
This will keep us from doing yet another radix tree lookup for a buffer we've
already found. Thanks,
Signed-off-by: Josef Bacik <jbacik@fusionio.com>

Btrfs: unlock extent range on enospc in compressed submit · fdf8e2ea

Committed by Josef Bacik on Jun 14, 2013

A user reported a deadlock where the async submit thread was blocked on the
lock_extent() lock, and then everybody behind him was locked on the page lock
for the page he was holding. Looking at the code I noticed we do not unlock the
extent range when we get ENOSPC and goto retry. This is bad because we
immediately try to lock that range again to do the cow, which will cause a
deadlock. Fix this by unlocking the range. Thanks,
Signed-off-by: Josef Bacik <jbacik@fusionio.com>

Btrfs: fix the comment typo for btrfs_attach_transaction_barrier · 90b6d283

Committed by Wang Sheng-Hui on Jun 14, 2013

The comment is for btrfs_attach_transaction_barrier, not for
btrfs_attach_transaction. Fix the typo.
Signed-off-by: Wang Sheng-Hui <shhuiw@gmail.com>
Acked-by: Miao Xie <miaox@cn.fujitsu.com>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>

Btrfs: fix not being able to find skinny extents during relocate · aee68ee5

Committed by Josef Bacik on Jun 13, 2013

We unconditionally search for the EXTENT_ITEM_KEY for metadata during balance,
and then check the key that we found to see if it is actually a
METADATA_ITEM_KEY, but this doesn't work right because METADATA is a higher key
value, so if what we are looking for happens to be the first item in the leaf
the search will dump us out at the previous leaf, and we won't find our item.
So instead do what we do everywhere else, search for the skinny extent first and
if we don't find it go back and re-search for the extent item. This patch fixes
the panic I was hitting when balancing a large file system with skinny extents.
Thanks,
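
The "skinny first, then re-search" order can be sketched over a sorted key array. This is a simplified userspace illustration, not the relocation code: `key_sketch`, `find_exact`, and `find_tree_block` are hypothetical names, a flat array stands in for the extent tree, and the sketch assumes the skinny key's offset holds the tree level while the regular extent item's offset holds the block size.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define EXTENT_ITEM_KEY   168 /* value of BTRFS_EXTENT_ITEM_KEY */
#define METADATA_ITEM_KEY 169 /* value of BTRFS_METADATA_ITEM_KEY (skinny) */

struct key_sketch { uint64_t objectid; uint8_t type; uint64_t offset; };

/* Keys are ordered by objectid, then type, then offset. */
static int key_cmp(const struct key_sketch *a, const struct key_sketch *b)
{
    if (a->objectid != b->objectid) return a->objectid < b->objectid ? -1 : 1;
    if (a->type != b->type) return a->type < b->type ? -1 : 1;
    if (a->offset != b->offset) return a->offset < b->offset ? -1 : 1;
    return 0;
}

/* Binary search for an exact key match; returns its index or -1. */
static long find_exact(const struct key_sketch *keys, size_t n,
                       const struct key_sketch *want)
{
    size_t lo = 0, hi = n;
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        int c = key_cmp(&keys[mid], want);
        if (c == 0) return (long)mid;
        if (c < 0) lo = mid + 1; else hi = mid;
    }
    return -1;
}

/* Look for the skinny metadata item first, then fall back to the extent item. */
static long find_tree_block(const struct key_sketch *keys, size_t n,
                            uint64_t bytenr, uint64_t level, uint64_t blocksize)
{
    assert(keys != NULL || n == 0);
    struct key_sketch want = { bytenr, METADATA_ITEM_KEY, level };
    long slot = find_exact(keys, n, &want);
    if (slot >= 0)
        return slot;
    want.type = EXTENT_ITEM_KEY;
    want.offset = blocksize;
    return find_exact(keys, n, &want);
}
```

Searching for the exact key that is expected to exist avoids depending on where a near-miss search happens to land relative to a leaf boundary.
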
Signed-off-by: Josef Bacik <jbacik@fusionio.com>

Btrfs: cleanup backref search commit root flag stuff · da61d31a

Committed by Josef Bacik on Jun 12, 2013

Looking into this backref problem I noticed we're using a macro to do what turns
out to essentially be a NULL check to see if we need to search the commit root.
I'm killing this, let's just do what everybody else does and check if trans ==
NULL. I've also made it so we pass in the path to __resolve_indirect_refs which
will have the search_commit_root flag set properly already and that way we can
avoid allocating another path when we have a perfectly good one to use. Thanks,
Signed-off-by: Josef Bacik <jbacik@fusionio.com>

Btrfs: free csums when we're done scrubbing an extent · d88d46c6

Committed by Josef Bacik on Jun 10, 2013

A user reported scrub taking up an unreasonable amount of ram as it ran. This
is because we lookup the csums for the extent we're scrubbing but don't free it
up until after we're done with the scrub, which means we can take up a whole lot
of ram. This patch fixes this by dropping the csums once we're done with the
extent we've scrubbed. The user reported this to fix their problem. Thanks,
Reported-and-tested-by: Remco Hosman <remco@hosman.xs4all.nl>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>

Btrfs: fix transaction throttling for delayed refs · 1be41b78

Committed by Josef Bacik on Jun 12, 2013

Dave has this fs_mark script that can make btrfs abort with a sufficient amount of
RAM. This is because with more RAM we can keep more dirty metadata in cache
which in a roundabout way makes for many more pending delayed refs. What
happens is we end up not throttling the transaction enough so when we go to
commit the transaction when we've completely filled the file system we'll
abort() because we use all of the space in the global reserve and we still have
delayed refs to run. To fix this we need to make the delayed ref flushing and
the transaction throttling dependent upon the number of delayed refs that we
have instead of how much reserved space is left in the global reserve. With
this patch we not only stop aborting transactions but we also get a smoother run
speed with fs_mark and it makes us about 10% faster. Thanks,
Reported-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>

Btrfs: stop waiting on current trans if we aborted · 501407aa

Committed by Josef Bacik on Jun 10, 2013

I hit a hang when run_delayed_refs returned an error in the beginning of
btrfs_commit_transaction. If we decide we need to commit the transaction in
btrfs_end_transaction we'll set BLOCKED and start to commit, but if we get an
error this early on we'll just exit without committing. This is fine, except
that anybody else who tried to start a transaction will sit in
wait_current_trans() since we're set to BLOCKED and we never set it to something
else and woke people up. To fix this we want to check for trans->aborted
everywhere we wait for the transaction state to change, and make
btrfs_abort_transaction() wake up any waiters there may be. All the callers
will notice that the transaction has aborted and exit out properly. Thanks,
Signed-off-by: Josef Bacik <jbacik@fusionio.com>

Btrfs: wake up delayed ref flushing waiters on abort · f971fe29

Committed by Josef Bacik on Jun 10, 2013

I hit a deadlock because we aborted when flushing delayed refs but didn't wake
any of the other flushers up and so everybody was just sleeping forever.  This
should fix the problem.  Thanks,
Signed-off-by: Josef Bacik <jbacik@fusionio.com>

btrfs: fix the code comments for LZO compression workspace · 3fb40375

Committed by Jie Liu on Jun 06, 2013

Fix the code comments for lzo compression workspace.
The buf item is used to store the decompressed data
and cbuf is used to store the compressed data.
Signed-off-by: Jie Liu <jeff.liu@oracle.com>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>

Btrfs: fix broken nocow after balance · 5bc7247a

Committed by Miao Xie on Jun 06, 2013

Balance will create reloc_root for each fs root, and it's going to
record last_snapshot to filter shared blocks.  The side effect of
setting last_snapshot is to break nocow attributes of files.

Since the extents are not shared by the relocation tree after the balance,
we can recover the old last_snapshot safely if no one snapshotted the
source tree. We fix the problem this way.
Reported-by: Kyle Gates <kylegates@hotmail.com>
Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>

14 Jun, 2013 · 18 commits

Btrfs: exclude logged extents before replying when we are mixed · 8c2a1a30

Committed by Josef Bacik on Jun 06, 2013

With non-mixed block groups we replay the logs before we're allowed to do any
writes, so we get away with not pinning/removing the data extents until right
when we replay them. However with mixed block groups we allocate out of the
same pool, so we could easily allocate a metadata block that was logged in our
tree log. To deal with this we just need to notice that we have mixed block
groups and do the normal excluding/removal dance during the pin stage of the log
replay and that way we don't allocate metadata blocks from areas we have logged
data extents. With this patch we now pass xfstests generic/311 with mixed
block groups turned on. Thanks,
Signed-off-by: Josef Bacik <jbacik@fusionio.com>

Btrfs: put our inode if orphan cleanup fails · 01cd3367

Committed by Josef Bacik on Jun 03, 2013

When we cross into a different subvol when doing a lookup we will run the orphan
cleanup.  If this fails however we do not drop the ref to the inode we were
looking up before we return an error, which leads to busy inodes on umount.
Thanks,
Signed-off-by: Josef Bacik <jbacik@fusionio.com>

Btrfs: add some missing iput()'s in btrfs_orphan_cleanup · c69b26b0

Committed by Josef Bacik on Jun 03, 2013

There are some error cases where we don't do an iput() on our inode, fix this.
Thanks,
Signed-off-by: Josef Bacik <jbacik@fusionio.com>

Btrfs: do not pin while under spin lock · e78417d1

Committed by Josef Bacik on Jun 03, 2013

When testing a corrupted fs I noticed I was getting sleep while atomic errors
when the transaction aborted. This is because btrfs_pin_extent may need to
allocate memory and we are calling this under the spin lock. Fix this by moving
it out and doing the pin after dropping the spin lock but before dropping the
mutex, the same way it works when delayed refs run normally. Thanks,
Signed-off-by: Josef Bacik <jbacik@fusionio.com>

Btrfs: Cocci spatch "memdup.spatch" · a5959bc0

Committed by Thomas Meyer on Jun 01, 2013

Signed-off-by: Thomas Meyer <thomas@m3y3r.de>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>

Btrfs: Cocci spatch "ptr_ret.spatch" · 97a184fe

Committed by Thomas Meyer on Jun 01, 2013

Signed-off-by: Thomas Meyer <thomas@m3y3r.de>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>

Btrfs: fix qgroup rescan resume on mount · b382a324

Committed by Jan Schmidt on May 28, 2013

When called during mount, we cannot start the rescan worker thread until
open_ctree is done. This commit restructures the qgroup rescan internals to
enable a clean deferral of the rescan resume operation.

First of all, the struct qgroup_rescan is removed, saving us a malloc and
some initialization synchronization problems. Its only element (the worker
struct) now lives within fs_info just as the rest of the rescan code.

Then setting up a rescan worker is split into several reusable stages.
Currently we have three different rescan startup scenarios:
	(A) rescan ioctl
	(B) rescan resume by mount
	(C) rescan by quota enable

Each case needs its own combination of the four following steps:
	(1) set the progress [A, C: zero; B: state of umount]
	(2) commit the transaction [A]
	(3) set the counters [A, C: zero; B: state of umount]
	(4) start worker [A, B, C]

qgroup_rescan_init does step (1). There's no extra function added to commit
a transaction, we've got that already. qgroup_rescan_zero_tracking does
step (3). Step (4) is nothing more than a call to the generic
btrfs_queue_worker.
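
The scenario-to-step mapping listed above can be written down as a small table in C. This is only a sketch of the mapping as the message states it: the enum, the `rescan_plan` struct, and `rescan_plan_for` are hypothetical names and do not appear in the kernel.

```c
#include <assert.h>
#include <stdbool.h>

/* The three startup scenarios (A), (B), (C) from the commit message. */
enum rescan_scenario {
    RESCAN_IOCTL,        /* (A) rescan ioctl */
    RESCAN_RESUME_MOUNT, /* (B) rescan resume by mount */
    RESCAN_QUOTA_ENABLE  /* (C) rescan by quota enable */
};

/* Hypothetical plan record: how each of the four steps applies. */
struct rescan_plan {
    bool progress_from_umount; /* step 1: false means progress starts at zero */
    bool commit_transaction;   /* step 2 */
    bool counters_from_umount; /* step 3: false means counters are zeroed */
    bool start_worker;         /* step 4 */
};

static struct rescan_plan rescan_plan_for(enum rescan_scenario s)
{
    assert(s >= RESCAN_IOCTL && s <= RESCAN_QUOTA_ENABLE);

    /* Defaults: zero progress, no commit, zero counters, always start the worker. */
    struct rescan_plan p = { false, false, false, true };

    switch (s) {
    case RESCAN_IOCTL:
        p.commit_transaction = true; /* only (A) commits the transaction */
        break;
    case RESCAN_RESUME_MOUNT:
        p.progress_from_umount = true; /* (B) resumes from the saved state */
        p.counters_from_umount = true;
        break;
    case RESCAN_QUOTA_ENABLE:
        break; /* (C) uses the defaults */
    }
    return p;
}
```

Laid out this way, it is easy to see that the worker start is the only step shared by all three scenarios.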

We also get rid of a double check for the rescan progress during
btrfs_qgroup_account_ref, which is no longer required due to having step 2
from the list above.

As a side effect, this commit prepares to move the rescan start code from
btrfs_run_qgroups (which is run during commit) to a less time critical
section.
Signed-off-by: Jan Schmidt <list.btrfs@jan-o-sch.net>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>

Btrfs: avoid double free of fs_info->qgroup_ulist · eb1716af

Committed by Jan Schmidt on May 28, 2013

When btrfs_read_qgroup_config or btrfs_quota_enable return non-zero, we've
already freed the fs_info->qgroup_ulist. The final btrfs_free_qgroup_config
called from quota_disable makes another ulist_free(fs_info->qgroup_ulist)
call.

We set fs_info->qgroup_ulist to NULL on the mentioned error paths, turning
the ulist_free in btrfs_free_qgroup_config into a noop.
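
The pattern of clearing the pointer after freeing it can be sketched in userspace C. All names here (`fs_info_sketch`, `ulist_free_sketch`, `error_path`, `final_cleanup`) are hypothetical stand-ins, and the sketch assumes, as the message implies, that the free helper does nothing when handed NULL.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for the part of fs_info that owns the ulist. */
struct fs_info_sketch { int *qgroup_ulist; };

static int free_calls; /* counts real frees, for demonstration only */

/* Assumed behaviour: freeing a NULL list is a no-op. */
static void ulist_free_sketch(int *ulist)
{
    if (!ulist)
        return;
    free(ulist);
    free_calls++;
}

/* Error path: free the list and clear the pointer so nobody frees it again. */
static void error_path(struct fs_info_sketch *fi)
{
    assert(fi != NULL);
    ulist_free_sketch(fi->qgroup_ulist);
    fi->qgroup_ulist = NULL;
}

/* Final cleanup: runs unconditionally, possibly after the error path. */
static void final_cleanup(struct fs_info_sketch *fi)
{
    assert(fi != NULL);
    ulist_free_sketch(fi->qgroup_ulist);
    fi->qgroup_ulist = NULL;
}
```

Calling `error_path` and then `final_cleanup` on the same structure frees the allocation exactly once, because the second call sees NULL.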

Cc: Wang Shilong <wangsl-fnst@cn.fujitsu.com>
Signed-off-by: Jan Schmidt <list.btrfs@jan-o-sch.net>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>

Btrfs: fix memory patcher through fs_info->qgroup_ulist · 4373519d

Committed by Jan Schmidt on May 28, 2013

Commit 5b7c665e introduced fs_info->qgroup_ulist, that is allocated during
btrfs_read_qgroup_config and meant to be used later by the qgroup accounting
code. However, it is always freed before btrfs_read_qgroup_config returns,
because the commit mentioned above adds a check for (ret), where a check
for (ret < 0) would have been the right choice. This commit fixes the check.
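
The difference between the two checks can be shown with a small userspace sketch. It assumes the common convention that a negative return is an error while a positive return only signals "nothing more to read"; `finish_read_sketch` and its parameters are hypothetical and are not the function touched by the commit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/*
 * Decide whether the list survives the end of the config read.
 * ret < 0: real error; ret == 0: success; ret > 0: no more items (assumed).
 * Returns the list if the caller keeps it, or NULL if it was freed.
 */
static int *finish_read_sketch(int ret, int *ulist, bool fixed_check)
{
    assert(ulist != NULL);

    /* The buggy form treats any non-zero value as failure. */
    bool failed = fixed_check ? (ret < 0) : (ret != 0);
    if (failed) {
        free(ulist);
        return NULL;
    }
    return ulist;
}
```

With `ret == 1`, the unfixed check frees the list even though nothing went wrong, while the `ret < 0` check keeps it for later use.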

Cc: Wang Shilong <wangsl-fnst@cn.fujitsu.com>
Signed-off-by: Jan Schmidt <list.btrfs@jan-o-sch.net>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>

Btrfs: simplify unlink reservations · d52be818

Committed by Josef Bacik on May 29, 2013

Dave pointed out a problem where if you filled up a file system as much as
possible you couldn't remove any files. The whole unlink reservation thing is
convoluted because it tries to guess if it's going to add space to unlink
something or not, and has all these odd uncommented cases where it simply does
not try. So to fix this I've added a way to conditionally steal from the global
reserve if we can't make our normal reservation. If we have more than half the
space in the global reserve free we will go ahead and steal from the global
reserve. With this patch Dave's reproducer now works and I can rm all the files
on the file system. Thanks,
Reported-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>

Btrfs: merge pending IO for tree log write back · c6adc9cc

Committed by Miao Xie on May 28, 2013

Before this patch, we flushed the log tree of the fs/file tree first, and
then flushed the log root tree. This is inefficient, especially on hard
disks. This patch improves the situation by wrapping the two flushes in
the same blk_plug.

In testing, sync write performance went up ~60% (2.9MB/s -> 4.6MB/s)
on my SCSI disk with its disk buffer enabled.

Test step:
 # mkfs.btrfs -f -m single <disk>
 # mount <disk> <mnt>
 # dd if=/dev/zero of=<mnt>/file0 bs=32K count=1024 oflag=sync
Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>

Btrfs: allow file data clone within a file · a96fbc72

Committed by Liu Bo on May 26, 2013

We did not allow file data clone within the same file because of
deadlock issues.

However, we now use nested locking to avoid deadlock between the
parent directory and the child file.

So it's safe to do file clone within the same file when the two
ranges are not overlapped.
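
The non-overlap condition for a same-file clone can be expressed as a half-open interval test. This is a minimal sketch: `clone_ranges_overlap` is a hypothetical helper, both ranges are assumed to share one length, and the asserts simply rule out arithmetic wrap-around instead of handling it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Return true if [src_off, src_off + len) and [dst_off, dst_off + len)
 * share at least one byte. A same-file clone would be rejected in that case.
 */
static bool clone_ranges_overlap(uint64_t src_off, uint64_t dst_off, uint64_t len)
{
    /* This sketch does not handle offsets that wrap around 2^64. */
    assert(src_off + len >= src_off);
    assert(dst_off + len >= dst_off);

    return dst_off < src_off + len && src_off < dst_off + len;
}
```

Adjacent ranges, such as source offset 0 and destination offset 4096 with a length of 4096, do not overlap under this test and would be allowed.
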
Reviewed-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>

Btrfs: remove unused code in btrfs_del_root · b7394eb9

Committed by Liu Bo on May 26, 2013

'leaf' and 'ri' are not used.
Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>

Btrfs: kill replicate code in replay_one_buffer · 2da1c669

Committed by Liu Bo on May 26, 2013

EXTREF is treated the same as REF, so we can tidy up the code.
Signed-off-by: NLiu Bo <bo.li.liu@oracle.com>
Signed-off-by: NJosef Bacik <jbacik@fusionio.com>

2da1c669

Btrfs: update new flags for tracepoint · e112e2b4

Committed by Liu Bo on May 26, 2013

Adding new flags to keep tracepoints consistent with btrfs.
Signed-off-by: NLiu Bo <bo.li.liu@oracle.com>
Signed-off-by: NJosef Bacik <jbacik@fusionio.com>

e112e2b4

Btrfs: check if leaf's parent exists before pushing items around · 33157e05

Committed by Liu Bo on May 22, 2013

While splitting a leaf, pushing items around to hopefully get some space only
works when we have a parent, i.e. we have at least one sibling leaf.
Signed-off-by: NLiu Bo <bo.li.liu@oracle.com>
Signed-off-by: NJosef Bacik <jbacik@fusionio.com>

33157e05

Btrfs: dont do log_removal in insert_new_root · fdd99c72

Committed by Liu Bo on May 22, 2013

When splitting a leaf, the root is just the leaf, and the tree mod log does not
apply to leaves, so in this case we don't do log_removal.

When splitting a node, the old root is kept as a normal node and we have already
put records in the tree mod log for moving keys and items, so in this case we
don't do that either.

Given the above, insert_new_root can get rid of log_removal.
Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>

Btrfs: return error code in btrfs_check_trunc_cache_free_space() · 4b286cd1

Committed by Wei Yongjun on May 21, 2013

Return the error code instead of always returning 0 from
btrfs_check_trunc_cache_free_space().
Introduced by commit 7b61cd92
(Btrfs: don't use global block reservation for inode cache truncation)
Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
Reviewed-by: Miao Xie <miaox@cn.fujitsu.com>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>