Commits · ea8efc74bd0402b4d5f663d007b4e25fa29ea778 · openanolis / cloud-kernel

09 Mar, 2011 1 commit

Btrfs: make sure not to return overlapping extents to fiemap · ea8efc74

Committed by Chris Mason on Mar 08, 2011

The btrfs fiemap code was incorrectly returning duplicate or overlapping
extents in some cases.  cp was blindly trusting this result and we would
end up with a destination file that was bigger than the original because
some bytes were copied twice.

The fix here adjusts our offsets to make sure we're always moving
forward in the fiemap results.
Signed-off-by: Chris Mason <chris.mason@oracle.com>
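The idea of "always moving forward" in the fiemap results can be sketched in userspace as follows. This is only an illustration under stated assumptions: `struct extent` and `trim_to_forward_progress()` are hypothetical names, not the kernel's data structures or the code the commit actually changed.

```c
#include <stdint.h>

/* Hypothetical extent record for illustration; not the kernel's struct. */
struct extent {
    uint64_t start;
    uint64_t len;
};

/*
 * Trim `e` so that it begins no earlier than `last_end`, the end offset of
 * the previously reported extent. Returns 0 when nothing is left to report
 * (the extent is a pure duplicate), 1 otherwise.
 */
static int trim_to_forward_progress(struct extent *e, uint64_t last_end)
{
    uint64_t end = e->start + e->len;

    if (end <= last_end)
        return 0;                /* fully covered already: skip it */
    if (e->start < last_end) {   /* partial overlap: drop the covered head */
        e->len = end - last_end;
        e->start = last_end;
    }
    return 1;
}
```

A consumer such as cp that sums the reported lengths would then never count the same byte range twice.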

08 Mar, 2011 1 commit

Btrfs: deal with short returns from copy_from_user · 31339acd

Committed by Chris Mason on Mar 07, 2011

When copy_from_user is only able to copy some of the bytes we requested,
we may end up creating a partially up to date page.  To avoid garbage in
the page, we need to treat a partial copy as a zero length copy.

This makes the rest of the file_write code drop the page and
retry the whole copy instead of marking the partially up to
date page as dirty.
Signed-off-by: Chris Mason <chris.mason@oracle.com>
cc: stable@kernel.org
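The rule "treat a partial copy as a zero length copy" can be written as a tiny helper. This is a minimal sketch, assuming a hypothetical function name `accepted_copy_len()`; the real change lives inside the btrfs write path and may apply additional conditions not shown here.

```c
#include <stddef.h>

/*
 * Illustrative helper (hypothetical, not a kernel function): decide how many
 * bytes of a page copy to accept. A short copy is reported as 0 so that the
 * caller drops the page and retries the whole copy instead of dirtying a
 * partially filled page.
 */
static size_t accepted_copy_len(size_t copied, size_t requested)
{
    if (copied < requested)
        return 0;    /* partial copy: pretend nothing was copied */
    return copied;
}
```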

07 Mar, 2011 1 commit

Btrfs: fix regressions in copy_from_user handling · b1bf862e

Committed by Chris Mason on Feb 28, 2011

Commit 914ee295 fixed deadlocks in
btrfs_file_write where we would catch page faults on pages we had
locked.

But, there were a few problems:

1) The x86-32 iov_iter_copy_from_user_atomic code always fails to copy
data when the amount to copy is more than 4K and the offset to start
copying from is not page aligned.  The result was btrfs_file_write
looping forever retrying the iov_iter_copy_from_user_atomic.

We deal with this by changing btrfs_file_write to drop down to single
page copies when iov_iter_copy_from_user_atomic starts returning failure.

2) The btrfs_file_write code was leaking delalloc reservations when
iov_iter_copy_from_user_atomic returned zero.  The looping above would
result in the entire filesystem running out of delalloc reservations and
constantly trying to flush things to disk.

3) btrfs_file_write will lock down page cache pages, make sure
any writeback is finished, do the copy_from_user and then release them.
Before the loop runs we check the first and last pages in the write to
see if they are only being partially modified.  If the start or end of
the write isn't aligned, we make sure the corresponding pages are
up to date so that we don't introduce garbage into the file.

With the copy_from_user changes, we're allowing the VM to reclaim the
pages after a partial update from copy_from_user, but we're not
making sure the page cache page is up to date when we loop around to
resume the write.

We deal with this by pushing the up to date checks down into the page
prep code.  This fits better with how the rest of file_write works.
Signed-off-by: Chris Mason <chris.mason@oracle.com>
Reported-by: Mitch Harder <mitch.harder@sabayonlinux.org>
cc: stable@kernel.org

24 Feb, 2011 1 commit

Btrfs: fix fiemap bugs with delalloc · ec29ed5b

Committed by Chris Mason on Feb 23, 2011

The Btrfs fiemap code wasn't properly returning delalloc extents,
so applications that trust fiemap to decide if there are holes in the
file see holes instead of delalloc.

This reworks the btrfs fiemap code, adding a get_extent helper that
searches for delalloc ranges and also adding a helper for extent_fiemap
that skips past holes in the file.
Signed-off-by: Chris Mason <chris.mason@oracle.com>

17 Feb, 2011 6 commits

Btrfs: set FMODE_EXCL in btrfs_device->mode · fb01aa85

Committed by Ilya Dryomov on Feb 15, 2011

This fixes a bug introduced in d4d77629, where the device added online
(and therefore initialized via btrfs_init_new_device()) would be left
with the positive bdev->bd_holders after unmount.  Since d4d77629 we no
longer OR FMODE_EXCL explicitly on blkdev_put(), set it in
btrfs_device->mode.
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

Btrfs: make btrfs_rm_device() fail gracefully · 9b3517e9

Committed by Ilya Dryomov on Feb 15, 2011

If shrinking done as part of the online device removal fails add that
device back to the allocation list and increment the rw_devices counter.
This fixes two bugs:

1) we could have a perfectly good device out of alloc list for no good
reason;

2) in the btrfs consisting of two devices, failure in btrfs_rm_device()
could lead to a situation where it was impossible to remove any of the
devices because of the "unable to remove the only writeable device"
error.
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

Btrfs: Avoid accessing unmapped kernel address · ca9b688c

Committed by Li Zefan on Feb 16, 2011

When decompressing a chunk of data, we'll copy the data out to
a working buffer if the data is stored in more than one page,
otherwise we'll use the mapped page directly to avoid memory
copy.

In the latter case, we'll end up accessing the kernel address
after we've unmapped the page in a corner case.
Reported-by: Juan Francisco Cantero Hurtado <iam@juanfra.info>
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

Btrfs: Fix BTRFS_IOC_SUBVOL_SETFLAGS ioctl · b4dc2b8c

Committed by Li Zefan on Feb 16, 2011

- Check user-specified flags correctly
- Check the inode ownership
- Search root item in root tree but not fs tree
Reported-by: Dan Rosenberg <drosenberg@vsecurity.com>
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

Btrfs: allow balance to explicitly allocate chunks as it relocates · c87f08ca

Committed by Chris Mason on Feb 16, 2011

Btrfs device shrinking and balancing ends up reallocating all the blocks
in order to allow COW to move them to new destinations. It is somewhat
awkward in terms of ENOSPC because most of the enospc code is built
around the idea that some operation on a reference counted tree triggers
allocations in the non-reference counted trees.

This commit changes the balancing code to deal with enospc by trying to
allocate a new chunk. If that allocation succeeds, we go ahead and
retry whatever failed due to enospc.
Signed-off-by: Chris Mason <chris.mason@oracle.com>

Btrfs: put ENOSPC debugging under a mount option · 91435650

Committed by Chris Mason on Feb 16, 2011

ENOSPC in btrfs is getting to the point where the extra debugging isn't
required.  I've put it under mount -o enospc_debug just in case someone
is having difficult problems.
Signed-off-by: Chris Mason <chris.mason@oracle.com>

15 Feb, 2011 6 commits

Btrfs: check return value of alloc_extent_map() · c26a9203

Committed by Tsutomu Itoh on Feb 14, 2011

I add the check on the return value of alloc_extent_map() to several places.
In addition, alloc_extent_map() returns only the address or NULL.
Therefore, check by IS_ERR() is unnecessary. So, I remove IS_ERR() checking.
Signed-off-by: Tsutomu Itoh <t-itoh@jp.fujitsu.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

Btrfs - Fix memory leak in btrfs_init_new_device() · 67100f25

Committed by Ilya Dryomov on Feb 06, 2011

Memory allocated by calling kstrdup() should be freed.
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

btrfs: prevent heap corruption in btrfs_ioctl_space_info() · 51788b1b

Committed by Dan Rosenberg on Feb 14, 2011

Commit bf5fc093 refactored
btrfs_ioctl_space_info() and introduced several security issues.

space_args.space_slots is an unsigned 64-bit type controlled by a
possibly unprivileged caller.  The comparison as a signed int type
allows providing values that are treated as negative and cause the
subsequent allocation size calculation to wrap, or be truncated to 0.
By providing a size that's truncated to 0, kmalloc() will return
ZERO_SIZE_PTR.  It's also possible to provide a value smaller than the
slot count.  The subsequent loop ignores the allocation size when
copying data in, resulting in a heap overflow or write to ZERO_SIZE_PTR.

The fix changes the slot count type and comparison typecast to u64,
which prevents truncation or signedness errors, and also ensures that we
don't copy more data than we've allocated in the subsequent loop.  Note
that zero-size allocations are no longer possible since there is already
an explicit check for space_args.space_slots being 0 and truncation of
this value is no longer an issue.
Signed-off-by: Dan Rosenberg <drosenberg@vsecurity.com>
Signed-off-by: Josef Bacik <josef@redhat.com>
Reviewed-by: Josef Bacik <josef@redhat.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>
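The two parts of the fix described above (compare the caller-supplied count in 64-bit arithmetic, and never copy more entries than were allocated) can be illustrated with a userspace sketch. Everything here is an assumption made for illustration: `struct slot_info`, `clamp_slots()` and `fill_slots()` are hypothetical names and do not reproduce the ioctl's real structures or code.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical per-slot record; the real ioctl uses its own layout. */
struct slot_info {
    uint64_t flags;
    uint64_t total;
    uint64_t used;
};

/* Clamp the caller-supplied count in 64-bit arithmetic; never narrow to int. */
static uint64_t clamp_slots(uint64_t requested, uint64_t available)
{
    return requested < available ? requested : available;
}

/*
 * Copy at most `requested` of the `available` source entries into a freshly
 * allocated array. Returns the array (caller frees), or NULL when there is
 * nothing to copy or the allocation fails. *copied receives the number of
 * entries written, which never exceeds the number allocated.
 */
static struct slot_info *fill_slots(const struct slot_info *src,
                                    uint64_t available, uint64_t requested,
                                    uint64_t *copied)
{
    uint64_t n = clamp_slots(requested, available);
    struct slot_info *dst;

    *copied = 0;
    if (n == 0 || n > SIZE_MAX / sizeof(*dst))
        return NULL;             /* reject empty or overflowing sizes */

    dst = malloc((size_t)n * sizeof(*dst));
    if (!dst)
        return NULL;

    for (uint64_t i = 0; i < n; i++)   /* loop bound equals allocation size */
        dst[i] = src[i];
    *copied = n;
    return dst;
}
```

Because the loop bound and the allocation size come from the same clamped 64-bit value, a huge or oddly signed request cannot make the copy run past the buffer.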

Btrfs: Fix balance panic · 6848ad64

Committed by Yan, Zheng on Feb 14, 2011

Mark the cloned backref_node as checked in clone_backref_node()
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

Btrfs: don't release pages when we can't clear the uptodate bits · e3f24cc5

Committed by Chris Mason on Feb 14, 2011

Btrfs tracks uptodate state in an rbtree as well as in the
page bits.  This is supposed to enable us to use block sizes other than
the page size, but there are a few parts still missing before that
completely works.

But, our readpage routine trusts this additional range based tracking
of uptodateness, much in the same way the buffer head up to date bits
are trusted for the other filesystems.

The problem is that sometimes we need to allocate memory in order to
split records in the rbtree, even when we are just clearing bits.  This
can be difficult when our clearing function is called GFP_ATOMIC, which
can happen in the releasepage path.

So, what happens today looks like this:

releasepage called with GFP_ATOMIC
btrfs_releasepage calls clear_extent_bit
clear_extent_bit fails to allocate ram, leaving the up to date bit set
btrfs_releasepage returns success

The end result is the page being gone, but btrfs thinking the range is
up to date.   Later on if someone tries to read that same page, the
btrfs readpage code will return immediately thinking the page is already
up to date.

This commit fixes things to fail the releasepage when we can't clear the
extent state bits.  It covers both data pages and metadata tree blocks.
Signed-off-by: Chris Mason <chris.mason@oracle.com>

Btrfs: fix page->private races · eb14ab8e

Committed by Chris Mason on Feb 10, 2011

There is a race where btrfs_releasepage can drop the
page->private contents just as alloc_extent_buffer is setting
up pages for metadata.  Because of how the Btrfs page flags work,
this results in us skipping the crc on the page during IO.

This patch solves the race by waiting until after the extent buffer
is inserted into the radix tree before it sets page private.
Signed-off-by: Chris Mason <chris.mason@oracle.com>

08 Feb, 2011 1 commit

Btrfs: Fix page count calculation · 3a90983d

Committed by Yan, Zheng on Jan 18, 2011

take offset of start position into account when calculating page count.
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>
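Why the start offset matters for a page count can be shown with a small worked example. This is a sketch only: `SKETCH_PAGE_SIZE` and `pages_spanned()` are hypothetical names, and a 4096-byte page is assumed for the arithmetic.

```c
#include <stddef.h>

#define SKETCH_PAGE_SIZE 4096u   /* assumed page size for the example */

/*
 * Number of pages touched by a write of `len` bytes starting at file
 * position `pos`. The offset of `pos` within its page is added before
 * rounding up, so an unaligned write can span one more page than
 * len / PAGE_SIZE alone would suggest.
 */
static size_t pages_spanned(size_t pos, size_t len)
{
    size_t offset = pos % SKETCH_PAGE_SIZE;

    if (len == 0)
        return 0;
    return (offset + len + SKETCH_PAGE_SIZE - 1) / SKETCH_PAGE_SIZE;
}
```

For instance, a 2-byte write at position 4095 touches two pages, while rounding up the length alone would give one.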

06 Feb, 2011 4 commits

btrfs: Drop __exit attribute on btrfs_exit_compress · 8e4eef7a

Committed by Alexey Charkov on Feb 02, 2011

As this function is called in some error paths while not
removing the module, the __exit attribute prevents the kernel
image from linking when btrfs is compiled in statically.
Signed-off-by: Alexey Charkov <alchark@gmail.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

btrfs: cleanup error handling in btrfs_unlink_inode() · 554233a6

Committed by Tsutomu Itoh on Feb 03, 2011

When btrfs_alloc_path() fails, btrfs_free_path() need not be called.
Therefore, it changes the branch ahead.
Signed-off-by: Tsutomu Itoh <t-itoh@jp.fujitsu.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

Btrfs: exclude super blocks when we read in block groups · 3c14874a

Committed by Josef Bacik on Feb 02, 2011

This has been resulting in a BUG_ON(ret) after btrfs_reserve_extent in
btrfs_cow_file_range. The reason is we don't actually calculate the bytes_super
for a block group until we go to cache it, which means that the space_info can
hand out reservations for space that it doesn't actually have, and we can run
out of data space. This is also a problem if you are using space caching since
we don't ever calculate bytes_super for the block groups. So instead every time
we read a block group call exclude_super_stripes, which calculates the
bytes_super for the block group so it can be left out of the space_info. Then
whenever caching completes we just call free_excluded_extents so that the super
excluded extents are freed up. Also if we are unmounting and we hit any block
groups that haven't been cached we still need to call free_excluded_extents to
make sure things are cleaned up properly. Thanks,
Reported-by: Arne Jansen <sensille@gmx.net>
Signed-off-by: Josef Bacik <josef@redhat.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

Btrfs: make sure search_bitmap finds something in remove_from_bitmap · 13dbc089

Committed by Josef Bacik on Feb 03, 2011

When we're cleaning up the tree log we need to be able to remove free space from
the block group. The problem is if that free space spans bitmaps we would not
find the space since we're looking for too many bytes. So make sure the amount
of bytes we search for is limited to either the number of bytes we want, or the
number of bytes left in the bitmap. This was tested by a user who was hitting
the BUG() after search_bitmap. With this patch he can now mount his fs.
Thanks,
Signed-off-by: Josef Bacik <josef@redhat.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>
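The limit described above (search for either the bytes wanted or the bytes left in the bitmap, whichever is smaller) amounts to a min() over two quantities. The sketch below assumes a hypothetical helper `bitmap_search_bytes()` and a bitmap described by a start offset and a byte span; it is not the kernel's search_bitmap() interface.

```c
#include <stdint.h>

/*
 * How many bytes to look for inside one bitmap when removing the range
 * [offset, offset + wanted). `bitmap_start` and `bitmap_bytes` describe the
 * span covered by the bitmap. Assumes offset >= bitmap_start.
 */
static uint64_t bitmap_search_bytes(uint64_t offset, uint64_t wanted,
                                    uint64_t bitmap_start,
                                    uint64_t bitmap_bytes)
{
    uint64_t bitmap_end = bitmap_start + bitmap_bytes;
    uint64_t left = offset < bitmap_end ? bitmap_end - offset : 0;

    return wanted < left ? wanted : left;   /* min(wanted, bytes left) */
}
```

When the range spans two bitmaps, the caller would search the first bitmap for the clamped amount and continue with the remainder in the next one.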

01 Feb, 2011 5 commits

btrfs: fix return value check of btrfs_start_transaction() · 98d5dc13

Committed by Tsutomu Itoh on Jan 20, 2011

The error check of btrfs_start_transaction() is added, and the mistake
of the error check on several places is corrected.
Signed-off-by: Tsutomu Itoh <t-itoh@jp.fujitsu.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

btrfs: checking NULL or not in some functions · 5df67083

Committed by Tsutomu Itoh on Feb 01, 2011

Because NULL is returned when the memory allocation fails,
it is checked whether it is NULL.
Signed-off-by: Tsutomu Itoh <t-itoh@jp.fujitsu.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

Btrfs: avoid uninit variable warnings in ordered-data.c · c87fb6fd

Committed by Chris Mason on Jan 31, 2011

This one isn't really an uninit variable, but for pretty
obscure reasons.  Let's make it clearly correct.
Signed-off-by: Chris Mason <chris.mason@oracle.com>

Btrfs: catch errors from btrfs_sync_log · b31eabd8

Committed by Chris Mason on Jan 31, 2011

btrfs_sync_log returns -EAGAIN when we need full transaction commits
instead of small log commits, but sometimes we were dropping the return
value.

In practice, we check for this a few different ways, but this is still a
bug that can leave off full log commits when we really need them.
Signed-off-by: Chris Mason <chris.mason@oracle.com>

Btrfs: make shrink_delalloc a little friendlier · b1953bce

Committed by Josef Bacik on Jan 21, 2011

Xfstests 224 will just sit there and spin forever until eventually we give up
flushing delalloc and exit. On my box this took several hours. I could not
interrupt this process either, even though we use INTERRUPTIBLE. So do 2 things:

1) Keep us from looping over and over again without reclaiming anything
2) If we get interrupted exit the loop

I tested this and the test now exits in a reasonable amount of time, and can be
interrupted with ctrl+c. Thanks,
Signed-off-by: Josef Bacik <josef@redhat.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

29 Jan, 2011 12 commits

Btrfs: handle no memory properly in prepare_pages · 7adf5dfb

Committed by Josef Bacik on Jan 25, 2011

Instead of doing a BUG_ON(1) in prepare_pages if grab_cache_page() fails, just
loop through the pages we've already grabbed and unlock and release them, then
return -ENOMEM like we should.  Thanks,
Signed-off-by: Josef Bacik <josef@redhat.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

Btrfs: do error checking in btrfs_del_csums · ad0397a7

Committed by Josef Bacik on Jan 28, 2011

Got a report of a box panicking because we got a NULL eb in read_extent_buffer.
His fs was borked and btrfs_search_path returned EIO, but we don't check for
errors so the box panicked. Yes I know this will just make something higher up
the stack panic, but that's a problem for future Josef. Thanks,
Signed-off-by: Josef Bacik <josef@redhat.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

Btrfs: use the global block reserve if we cannot reserve space · 68a82277

Committed by Josef Bacik on Jan 24, 2011

We call use_block_rsv right before we make an allocation in order to make sure
we have enough space. Now normally people have called btrfs_start_transaction()
with the appropriate amount of space that we need, so we just use some of that
pre-reserved space and move along happily. The problem is where people use
btrfs_join_transaction(), which doesn't actually reserve any space. So we try
and reserve space here, but we cannot flush delalloc, so this forces us to
return -ENOSPC when in reality we have plenty of space. The most common symptom
is seeing a bunch of "couldn't dirty inode" messages in syslog. With
xfstests 224 we end up falling back to start_transaction and then doing all the
flush delalloc stuff which causes it to hang for a very long time.

So instead steal from the global reserve, which is what this is meant for
anyway. With this patch and the other 2 I have sent xfstests 224 now passes
successfully. Thanks,
Signed-off-by: Josef Bacik <josef@redhat.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

Btrfs: do not release more reserved bytes to the global_block_rsv than we need · e9e22899

Committed by Josef Bacik on Jan 24, 2011

When we do btrfs_block_rsv_release, if global_block_rsv is not full we will
release all the extra bytes to global_block_rsv, even if it's only a little
short of the amount of space that we need to reserve. This causes us to starve
ourselves of reservable space during the transaction which will force us to
shrink delalloc bytes and commit the transaction more often than we should. So
instead just add the amount of bytes we need to add to the global reserve so
reserved == size, and then add the rest back into the space_info for general
use. Thanks,
Signed-off-by: Josef Bacik <josef@redhat.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>
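The accounting described above (top the global reserve up only until reserved == size, and return the remainder for general use) can be sketched as below. The names `struct rsv` and `release_to_global()` are hypothetical and the sketch ignores locking; it only illustrates the arithmetic.

```c
#include <stdint.h>

/* Hypothetical reserve record: target size and bytes currently reserved. */
struct rsv {
    uint64_t size;
    uint64_t reserved;
};

/*
 * Hand `freed` bytes back. The global reserve takes only what it is short
 * of, so that reserved never exceeds size. The return value is the number
 * of bytes left over for the general pool (the space_info in the commit
 * message).
 */
static uint64_t release_to_global(struct rsv *global, uint64_t freed)
{
    uint64_t shortfall = global->size > global->reserved
                             ? global->size - global->reserved
                             : 0;
    uint64_t topup = freed < shortfall ? freed : shortfall;

    global->reserved += topup;
    return freed - topup;
}
```

With a reserve of size 100 that currently holds 90, releasing 50 bytes tops the reserve up by 10 and leaves 40 for general use.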

Btrfs: fix check_path_shared so it returns the right value · dedefd72

Committed by Josef Bacik on Jan 24, 2011

When running xfstests 224 I kept getting ENOSPC when trying to remove the files,
and this is because we were returning ret from check_path_shared while it was
uninitialized, which isn't right. Fix this to return 0 properly, and now
xfstests 224 doesn't freak out when it tries to clean itself up. Thanks,
Signed-off-by: Josef Bacik <josef@redhat.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

btrfs: check return value of btrfs_start_ioctl_transaction() properly · abd30bb0

Committed by Tsutomu Itoh on Jan 24, 2011

btrfs_start_ioctl_transaction() returns ERR_PTR(), not NULL.
So, it is necessary to use IS_ERR() to check the return value.
Signed-off-by: Tsutomu Itoh <t-itoh@jp.fujitsu.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>
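The difference between a NULL check and an IS_ERR() check can be seen in a userspace imitation of the error-pointer convention. The `sketch_*` helpers below are stand-ins written for this example (the kernel's own ERR_PTR(), PTR_ERR() and IS_ERR() live in its headers), and `start_handle()` is a hypothetical function that fails the way the commit message describes.

```c
#include <errno.h>
#include <stdint.h>

#define SKETCH_MAX_ERRNO 4095   /* errors occupy the top 4095 addresses */

/* Encode a negative errno value as a pointer. */
static void *sketch_err_ptr(long err)
{
    return (void *)(intptr_t)err;
}

/* Recover the errno value from an error pointer. */
static long sketch_ptr_err(const void *p)
{
    return (long)(intptr_t)p;
}

/* True when the pointer falls in the reserved error range. */
static int sketch_is_err(const void *p)
{
    return (uintptr_t)p >= (uintptr_t)-SKETCH_MAX_ERRNO;
}

static int dummy_handle;

/* Hypothetical API that reports failure as an error pointer, never NULL. */
static void *start_handle(int simulate_failure)
{
    if (simulate_failure)
        return sketch_err_ptr(-ENOMEM);
    return &dummy_handle;
}

/* Correct caller: tests with the IS_ERR-style check, not against NULL. */
static long use_handle(int simulate_failure)
{
    void *h = start_handle(simulate_failure);

    if (sketch_is_err(h))
        return sketch_ptr_err(h);   /* propagate the error code */
    return 0;
}
```

A caller that only tested `h == NULL` would treat the failed handle as valid, since an error pointer is never NULL.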

btrfs: fix return value check of btrfs_join_transaction() · 3612b495

Committed by Tsutomu Itoh on Jan 25, 2011

The error check of btrfs_join_transaction()/btrfs_join_transaction_nolock()
is added, and the mistake of the error check in several places is
corrected.

For more stable Btrfs, I think that we should reduce BUG_ON().
But, I think that long time is necessary for this.
So, I propose this patch as a short-term solution.

With this patch:
 - To more stable Btrfs, the part that should be corrected is clarified.
 - The panic isn't done by the NULL pointer reference etc. (even if
   BUG_ON() is increased temporarily)
 - The error code is returned in the place where the error can be easily
   returned.

As a long-term plan:
 - BUG_ON() is reduced by using the forced-readonly framework, etc.
Signed-off-by: Tsutomu Itoh <t-itoh@jp.fujitsu.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

fs/btrfs/inode.c: Add missing IS_ERR test · 34d19bad

Committed by Julia Lawall on Jan 24, 2011

After the conditional that precedes the following code, inode may be an
ERR_PTR value.  This can eg result from a memory allocation failure via the
call to btrfs_iget, and thus does not imply that root is different than
sub_root.  Thus, an IS_ERR check is added to ensure that there is no
dereference of inode in this case.

The semantic match that finds this problem is as follows:
(http://coccinelle.lip6.fr/)

// <smpl>
@r@
identifier f;
@@
f(...) { ... return ERR_PTR(...); }

@@
identifier r.f, fld;
expression x;
statement S1,S2;
@@
 x = f(...)
 ... when != IS_ERR(x)
(
 if (IS_ERR(x) ||...) S1 else S2
|
*x->fld
)
// </smpl>
Signed-off-by: Julia Lawall <julia@diku.dk>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

btrfs: fix missing break in switch phrase · 333e8105

Committed by liubo on Jan 26, 2011

There is a missing break in switch, fix it.
Signed-off-by: Liu Bo <liubo2009@cn.fujitsu.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

btrfs: fix several uncheck memory allocations · 2a29edc6

Committed by liubo on Jan 26, 2011

To make btrfs more stable, add several missing necessary memory allocation
checks, and when no memory, return proper errno.

We've checked that some of those -ENOMEM errors will be returned to
userspace, and some will be caught by BUG_ON() in the upper callers,
and none will be ignored silently.
Signed-off-by: Liu Bo <liubo2009@cn.fujitsu.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

btrfs: fix uncheck memory allocation in btrfs_submit_compressed_read · 6b82ce8d

Committed by liubo on Jan 26, 2011

btrfs_submit_compressed_read() lacks memory allocation checks and a
corresponding error path.

After this fix, if it comes to "no memory" case, errno will be returned
to userland step by step, and tell users this operation cannot go on.
Signed-off-by: Liu Bo <liubo2009@cn.fujitsu.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

Merge branch 'bug-fixes' of git://repo.or.cz/linux-btrfs-devel into btrfs-38 · eab49bec

Committed by Chris Mason on Jan 28, 2011

27 Jan, 2011 2 commits

Btrfs: Fix file clone when source offset is not 0 · 4d728ec7

Committed by Li Zefan on Jan 26, 2011

Suppose:
- the source extent is: [0, 100]
- the src offset is 10
- the clone length is 90
- the dest offset is 0

This statement:

	new_key.offset = key.offset + destoff - off

will produce such an extent for the dest file:

	[ino, BTRFS_EXTENT_DATA_KEY, -10]

which is obviously wrong.
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
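The arithmetic in the example above can be checked directly. The first function reproduces the quoted statement using signed integers so the negative result is visible (the real key offset is unsigned, where the same subtraction would wrap to a huge value). The second function is one possible correction, stated here as an assumption rather than as the change the commit made; both function names are hypothetical.

```c
#include <stdint.h>

/* The statement quoted in the commit message, in signed arithmetic. */
static int64_t dest_key_offset_buggy(int64_t key_offset, int64_t destoff,
                                     int64_t off)
{
    return key_offset + destoff - off;
}

/*
 * Assumed correction for illustration: when the clone starts inside the
 * source extent (off > key_offset), the first destination item begins at
 * destoff instead of before it.
 */
static int64_t dest_key_offset_clamped(int64_t key_offset, int64_t destoff,
                                       int64_t off)
{
    if (off <= key_offset)
        return key_offset + destoff - off;
    return destoff;
}
```

With the values from the message (extent at 0, source offset 10, destination offset 0), the quoted statement yields -10, while the clamped variant yields 0.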

Btrfs: Fix memory leak in writepage fixup work · b897abec

Committed by Miao Xie on Jan 26, 2011

fixup, which is allocated when starting page write to fix up the
extent without ORDERED bit set, should be freed after this work
is done.
Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
