Commits · fb4f6f910ca6f58564c31a680ef88940d8192713 · openeuler / raspberrypi-kernel

12 Jun, 2010 3 commits

Btrfs: handle error returns from btrfs_lookup_dir_item() · fb4f6f91

Authored by Dan Carpenter on May 29, 2010

If btrfs_lookup_dir_item() fails, we can just let the mount fail
with an error.
Signed-off-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

Btrfs: Fix BUG_ON for fs converted from extN · 3bf84a5a

Authored by Yan, Zheng on May 31, 2010

Tree blocks can live in data block groups in FS converted from extN.
So it's easy to trigger the BUG_ON.
Signed-off-by: Yan Zheng <zheng.yan@oracle.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

Btrfs: Fix null dereference in relocation.c · 046f264f

Authored by Yan, Zheng on May 31, 2010

Fix a potential null dereference in relocation.c
Signed-off-by: Yan Zheng <zheng.yan@oracle.com>
Acked-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

11 Jun, 2010 4 commits

Btrfs: fix remap_file_pages error · 058a457e

Authored by Miao Xie on May 20, 2010

When we use remap_file_pages() to remap a file, remap_file_pages() always
returns an error, because btrfs didn't set VM_CAN_NONLINEAR for the vma.
Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

Btrfs: uninitialized data in check_path_shared() · 0e4dcbef

Authored by Dan Carpenter on Jun 01, 2010

refs can be used with uninitialized data if btrfs_lookup_extent_info()
fails on the first pass through the loop.  In the original code, if that
happens then check_path_shared() probably returns 1; this patch
changes it to return 1 for safety.
Signed-off-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

Btrfs: fix fallocate regression · 83609779

Authored by Josef Bacik on Jun 07, 2010

Seems that when btrfs_fallocate was converted to use the new ENOSPC stuff we
dropped passing the mode to the function that actually does the preallocation.
This breaks anybody who wants to use FALLOC_FL_KEEP_SIZE. Thanks,
Signed-off-by: Josef Bacik <josef@redhat.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

Btrfs: fix loop device on top of btrfs · 4a001071

Authored by Miao Xie on Jun 07, 2010

We cannot use the loop device which has been connected to a file in the btrf

The steps to reproduce are as follows:
 # dd if=/dev/zero of=vdev0 bs=1M count=1024
 # losetup /dev/loop0 vdev0
 # mkfs.btrfs /dev/loop0
 ...
 failed to zero device start -5

The reason is that btrfs doesn't implement either ->write_begin or ->write
the VFS API, so we fix it by setting ->write to do_sync_write().
Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

27 May, 2010 5 commits

Btrfs: add more error checking to btrfs_dirty_inode · 9aeead73

Authored by Chris Mason on May 27, 2010

The ENOSPC code will now return ENOSPC to btrfs_start_transaction.
btrfs_dirty_inode needs to check for this and error out appropriately.
Signed-off-by: Chris Mason <chris.mason@oracle.com>

Btrfs: allow unaligned DIO · 5a5f79b5

Authored by Chris Mason on May 26, 2010

In order to support DIO that isn't aligned to the filesystem blocksize,
we fall back to buffered IO for any unaligned DIOs.
Signed-off-by: Chris Mason <chris.mason@oracle.com>
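
The fallback decision described above can be sketched in user space as a simple alignment test. This is only an illustration: the function name `dio_is_block_aligned` is hypothetical, and checking just the file offset and length is an assumption (the actual patch is not shown in this log and may check more, such as buffer addresses).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: true when both the file offset and the length are
 * multiples of blocksize, so the request could go through the direct path.
 * A caller would fall back to buffered IO when this returns false.
 * blocksize must be a non-zero power of two.
 */
static bool dio_is_block_aligned(uint64_t offset, size_t len, uint32_t blocksize)
{
    assert(blocksize != 0 && (blocksize & (blocksize - 1)) == 0);

    uint64_t mask = (uint64_t)blocksize - 1;

    /* Any low bit set in either value means the request is unaligned. */
    return ((offset | (uint64_t)len) & mask) == 0;
}
```

With a 4096-byte block size, a request at offset 0 for 4096 bytes passes the check, while one starting at offset 512 would take the buffered path.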

Btrfs: drop verbose enospc printk · 933b585f

Authored by Chris Mason on May 26, 2010

Less printk is good printk.
Signed-off-by: Chris Mason <chris.mason@oracle.com>

Btrfs: Fix block generation verification race · 5bdd3536

Authored by Yan, Zheng on May 26, 2010

After the path is released, the generation number obtained from the block
pointer is no longer valid. The race may cause disk corruption, because
verify_parent_transid() calls clear_extent_buffer_uptodate() when
generation numbers mismatch.
Signed-off-by: Yan Zheng <zheng.yan@oracle.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

Btrfs: fix preallocation and nodatacow checks in O_DIRECT · 46bfbb5c

Authored by Chris Mason on May 26, 2010

The O_DIRECT code wasn't checking for multiple references
on preallocated or nodatacow extents.  This means it
wasn't honoring snapshots properly.

The fix here is to add an explicit check for multiple references.
This also fixes the math for selecting the correct disk block,
making sure not to go past the end of the extent.
Signed-off-by: Chris Mason <chris.mason@oracle.com>

26 May, 2010 3 commits

Btrfs: avoid ENOSPC errors in btrfs_dirty_inode · 94b60442

Authored by Chris Mason on May 26, 2010

btrfs_dirty_inode tries to sneak in without much waiting or
space reservation, mostly for performance reasons.  This
usually works well but can cause problems when there are
many, many writers.

When btrfs_update_inode fails with ENOSPC, we fall back
to a slower btrfs_start_transaction call that will reserve
some space.
Signed-off-by: Chris Mason <chris.mason@oracle.com>
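
The fast-path-then-fallback pattern described above might look roughly like the following toy model. Everything here is hypothetical scaffolding (`struct fake_fs`, `update_inode_fast`, `update_inode_reserved`, `dirty_inode`); it only illustrates the control flow and is not the kernel code.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the filesystem's space accounting. */
struct fake_fs {
    bool unreserved_space_left; /* can an update succeed without reserving? */
    int slow_path_calls;        /* how often the reserving path was taken */
};

/* Fast path: no reservation, so it fails with -ENOSPC when space is tight. */
static int update_inode_fast(struct fake_fs *fs)
{
    return fs->unreserved_space_left ? 0 : -ENOSPC;
}

/* Slow path: reserves space first (always succeeds in this toy model). */
static int update_inode_reserved(struct fake_fs *fs)
{
    fs->slow_path_calls++;
    return 0;
}

/* Try the cheap update first; retry with a reservation only on -ENOSPC. */
static int dirty_inode(struct fake_fs *fs)
{
    assert(fs != NULL);

    int ret = update_inode_fast(fs);
    if (ret == -ENOSPC)
        ret = update_inode_reserved(fs);
    return ret;
}
```

The slow path runs only when the fast path reports ENOSPC, so the common case keeps its performance.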

Btrfs: move O_DIRECT space reservation to btrfs_direct_IO · 3f7c579c

Authored by Chris Mason on May 26, 2010

This moves the delalloc space reservation done for O_DIRECT
into btrfs_direct_IO.  This way we don't leak reserved space
if the generic O_DIRECT write code errors out before it
calls into btrfs_direct_IO.
Signed-off-by: Chris Mason <chris.mason@oracle.com>

Btrfs: rework O_DIRECT enospc handling · 4845e44f

Authored by Chris Mason on May 25, 2010

This changes O_DIRECT write code to mark extents as delalloc
while it is processing them.  Yan Zheng has reworked the
enospc accounting based on tracking delalloc extents and
this makes it much easier to track enospc in the O_DIRECT code.

There are a few special cases with the O_DIRECT code though:
it only sets the EXTENT_DELALLOC bits, instead of doing
EXTENT_DELALLOC | EXTENT_DIRTY | EXTENT_UPTODATE, because
we don't want to mess with clearing the dirty and uptodate
bits when things go wrong.  This is important because there
are no pages in the page cache, so any extent state structs
that we put in the tree won't get freed by releasepage.  We have
to clear them ourselves as the DIO ends.

With this commit, we reserve space in btrfs_file_aio_write,
and then as each btrfs_direct_IO call progresses it sets
EXTENT_DELALLOC on the range.

btrfs_get_blocks_direct is responsible for clearing the delalloc
at the same time it drops the extent lock.
Signed-off-by: Chris Mason <chris.mason@oracle.com>

25 May, 2010 18 commits

Btrfs: use async helpers for DIO write checksumming · eaf25d93

Authored by Chris Mason on May 25, 2010

The async helper threads offload crc work onto all the
CPUs, and make streaming writes much faster.  This
changes the O_DIRECT write code to use them.  The only
small complication was that we need to pass in the
logical offset in the file for each bio, because we can't
find it in the bio's pages.
Signed-off-by: Chris Mason <chris.mason@oracle.com>

Btrfs: don't walk around with task->state != TASK_RUNNING · ed3b3d31

Authored by Chris Mason on May 25, 2010

Yan Zheng noticed two places we were doing a lot of work
without task->state set to TASK_RUNNING.  This sets the state
properly after we get ready to sleep but decide not to.
Signed-off-by: Chris Mason <chris.mason@oracle.com>

Btrfs: do aio_write instead of write · 11c65dcc

Authored by Josef Bacik on May 23, 2010

In order for AIO to work, we need to implement aio_write. This patch converts
our btrfs_file_write to btrfs_aio_write. I've tested this with xfstests and
nothing broke, and the AIO stuff magically started working. Thanks,
Signed-off-by: Josef Bacik <josef@redhat.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

Btrfs: add basic DIO read/write support · 4b46fce2

Authored by Josef Bacik on May 23, 2010

This provides basic DIO support for reading and writing.  It does not do the
work to recover from mismatching checksums; that will come later.  A few design
changes have been made from Jim's code (sorry Jim!)

1) Use the generic direct-io code.  Jim originally re-wrote all the generic DIO
code in order to account for all of BTRFS's oddities, but thanks to that work it
seems like the best bet is to just ignore compression and such and just opt to
fall back on buffered IO.

2) Fall back on buffered IO for compressed or inline extents.  Jim's code did
its own buffering to make dio with compressed extents work.  Now we just
fall back onto normal buffered IO.

3) Use ordered extents for the writes so that all of the

lock_extent()
lookup_ordered()

type checks continue to work.

4) Do the lock_extent() lookup_ordered() loop in readpage so we don't race with
DIO writes.

I've tested this with fsx and everything works great.  This patch depends on my
dio and filemap.c patches to work.  Thanks,
Signed-off-by: Josef Bacik <josef@redhat.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

direct-io: do not merge logically non-contiguous requests · c2c6ca41

Authored by Josef Bacik on May 23, 2010

Btrfs cannot handle having logically non-contiguous requests submitted.  For
example if you have

Logical:  [0-4095][HOLE][8192-12287]
Physical: [0-4095]      [4096-8191]

Normally the DIO code would put these into the same BIO.  The problem is we
need to know exactly what offset is associated with what BIO so we can do our
checksumming and unlocking properly, so putting them in the same BIO doesn't
work.  So add another check where we submit the current BIO if the physical
blocks are not contiguous OR the logical blocks are not contiguous.
Signed-off-by: Josef Bacik <josef@redhat.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>
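
The extra check described above can be sketched as a small predicate. The struct and function names below (`bio_tail`, `must_submit_current_bio`) are hypothetical and chosen for illustration; the real direct-io code tracks this state differently.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record of where the data queued in the current BIO ends. */
struct bio_tail {
    uint64_t logical_end;  /* file offset one past the last queued byte */
    uint64_t physical_end; /* disk offset one past the last queued byte */
};

/*
 * True when the next block cannot be merged into the current BIO, i.e.
 * when it is not contiguous physically OR not contiguous logically.
 */
static bool must_submit_current_bio(const struct bio_tail *tail,
                                    uint64_t next_logical,
                                    uint64_t next_physical)
{
    assert(tail != NULL);

    return tail->physical_end != next_physical ||
           tail->logical_end != next_logical;
}
```

For the layout in the commit message, the first extent ends at logical 4096 and physical 4096. The next block sits at physical 4096 (contiguous) but logical 8192 (after the hole), so the predicate returns true and the current BIO is submitted before a new one is started.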

direct-io: add a hook for the fs to provide its own submit_bio function · facd07b0

Authored by Josef Bacik on May 23, 2010

Because BTRFS can do RAID and such, we need our own submit hook so we can set up
the bios in the correct fashion, and handle checksum errors properly.  So there
are a few changes here

1) The submit_io hook.  This is straightforward, just call this instead of
submit_bio.

2) Allow the fs to return -ENOTBLK for reads.  Usually this has only worked for
writes, since writes can fall back onto buffered IO.  But BTRFS needs the option
of falling back on buffered IO if it encounters a compressed extent, since we
need to read the entire extent in and decompress it.  So if we get -ENOTBLK back
from get_block we'll return and fall back on buffered IO just like the write
case.

I've tested these changes with fsx and everything seems to work.  Thanks,
Signed-off-by: Josef Bacik <josef@redhat.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

Btrfs: Metadata ENOSPC handling for balance · 3fd0a558

Authored by Yan, Zheng on May 16, 2010

This patch adds metadata ENOSPC handling for the balance code.
It consists of the following major changes:

1. Avoid COWing tree leaves in the tree-merging phase.

2. Handle interaction with snapshot creation.

3. Make the backref cache able to live across transactions.
Signed-off-by: Yan Zheng <zheng.yan@oracle.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

Btrfs: Pre-allocate space for data relocation · efa56464

Authored by Yan, Zheng on May 16, 2010

Pre-allocate space for data relocation. This can detect an ENOSPC
condition caused by fragmentation of free space.
Signed-off-by: Yan Zheng <zheng.yan@oracle.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

Btrfs: Metadata ENOSPC handling for tree log · 4a500fd1

Authored by Yan, Zheng on May 16, 2010

Previous patches make the allocator return -ENOSPC if there is no
unreserved free metadata space. This patch updates tree log code
and various other places to propagate/handle the ENOSPC error.
Signed-off-by: Yan Zheng <zheng.yan@oracle.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

Btrfs: Metadata reservation for orphan inodes · d68fc57b

Authored by Yan, Zheng on May 16, 2010

Reserve metadata space for handling orphan inodes.
Signed-off-by: Yan Zheng <zheng.yan@oracle.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

Btrfs: Introduce global metadata reservation · 8929ecfa

Authored by Yan, Zheng on May 16, 2010

Reserve metadata space for the extent tree, checksum tree and root tree.
Signed-off-by: Yan Zheng <zheng.yan@oracle.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

Btrfs: Update metadata reservation for delayed allocation · 0ca1f7ce

Authored by Yan, Zheng on May 16, 2010

Introduce a metadata reservation context for delayed allocation
and update various related functions.

This patch also introduces the EXTENT_FIRST_DELALLOC control bit for
set/clear_extent_bit. It tells set/clear_bit_hook whether they
are processing the first extent_state with the EXTENT_DELALLOC bit
set. This change is important if set/clear_extent_bit involves
multiple extent_state.
Signed-off-by: Yan Zheng <zheng.yan@oracle.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

Btrfs: Integrate metadata reservation with start_transaction · a22285a6

Authored by Yan, Zheng on May 16, 2010

Besides simplifying the code, this change makes sure all metadata
reservations for normal metadata operations are released after
committing the transaction.

Changes since V1:

Add code that checks whether unlink and rmdir will free space.

Add ENOSPC handling for clone ioctl.
Signed-off-by: Yan Zheng <zheng.yan@oracle.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

Btrfs: Introduce contexts for metadata reservation · f0486c68

Authored by Yan, Zheng on May 16, 2010

Introducing metadata reservation contexts has two major advantages.
First, it makes metadata reservation more traceable. Second, it can
reclaim freed space and re-add it to itself after the transaction is
committed.

Besides adding the btrfs_block_rsv structure and related helper functions,
this patch contains the following changes:

Move code that decides if a freed tree block should be pinned into
btrfs_free_tree_block().

Make space accounting more accurate, mainly for handling read only
block groups.
Signed-off-by: Chris Mason <chris.mason@oracle.com>

Btrfs: Kill init_btrfs_i() · 2ead6ae7

Authored by Yan, Zheng on May 16, 2010

All code in init_btrfs_i can be moved into btrfs_alloc_inode()
Signed-off-by: Yan Zheng <zheng.yan@oracle.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

Btrfs: Shrink delay allocated space in a synchronized · 5da9d01b

Authored by Yan, Zheng on May 16, 2010

Shrinking delayed allocation space in a synchronized manner is more
controllable than flushing all delayed allocation space in an async
thread.
Signed-off-by: Yan Zheng <zheng.yan@oracle.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

Btrfs: Kill allocate_wait in space_info · 424499db

Authored by Yan, Zheng on May 16, 2010

We already have fs_info->chunk_mutex to avoid concurrent
chunk creation.
Signed-off-by: Yan Zheng <zheng.yan@oracle.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

Btrfs: Link block groups of different raid types · b742bb82

Authored by Yan, Zheng on May 16, 2010

The size of reserved space is stored in space_info. If block groups
of different raid types are linked to separate space_info, changing
the allocation profile will corrupt reserved space accounting.
Signed-off-by: Yan Zheng <zheng.yan@oracle.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

16 May, 2010 1 commit

Btrfs: check for read permission on src file in the clone ioctl · 5dc64164

Authored by Dan Rosenberg on May 15, 2010

The existing code would have allowed you to clone a file that was
only open for writing.
Signed-off-by: Chris Mason <chris.mason@oracle.com>
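
A user-space analogue of the missing check, expressed in terms of `open(2)` access modes. The helper name `check_clone_source` and the choice of `-EINVAL` as the error value are assumptions for illustration; the kernel patch itself is not shown in this log.

```c
#include <errno.h>
#include <fcntl.h>

/*
 * Hypothetical check: a clone source must have been opened for reading.
 * Returns 0 when the access mode permits reads, -EINVAL otherwise.
 */
static int check_clone_source(int open_flags)
{
    int mode = open_flags & O_ACCMODE;

    /* O_WRONLY descriptors must not be usable as a source of data. */
    if (mode != O_RDONLY && mode != O_RDWR)
        return -EINVAL;
    return 0;
}
```

Masking with `O_ACCMODE` matters because `O_RDONLY` is typically 0, so testing a "read bit" directly on the flags would not work.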

15 May, 2010 4 commits

JFS: Free sbi memory in error path · 684bdc7f

Authored by Jan Blunck on Apr 12, 2010

I spotted the missing kfree() while removing the BKL.

[akpm@linux-foundation.org: avoid multiple returns so it doesn't happen again]
Signed-off-by: Jan Blunck <jblunck@suse.de>
Cc: Dave Kleikamp <shaggy@austin.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

fs/sysv: dereferencing ERR_PTR() · 404e7812

Authored by Dan Carpenter on Apr 21, 2010

I moved the dir_put_page() inside the if condition so we don't dereference
"page" if it's an ERR_PTR().
Signed-off-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

Fix double-free in logfs · 26562449

Authored by Al Viro on Apr 28, 2010

iput() is needed *until* we'd done successful d_alloc_root()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

Fix the regression created by "set S_DEAD on unlink()..." commit · d83c49f3

Authored by Al Viro on Apr 30, 2010

1) i_flags simply doesn't work for mount/unlink race prevention;
we may have many links to a file, and rm on one of those obviously
shouldn't prevent bind on top of another later on.  To fix it
the right way we need to mark _dentry_ as unsuitable for mounting
upon; new flag (DCACHE_CANT_MOUNT) is protected by d_flags and
i_mutex on the inode in question.  Set it (with dont_mount(dentry))
in unlink/rmdir/etc., check (with cant_mount(dentry)) in places
in namespace.c that used to check for S_DEAD.  Setting S_DEAD
is still needed in places where we used to set it (for directories
getting killed), since we rely on it for readdir/rmdir race
prevention.

2) rename()/mount() protection has another bogosity - we unhash
the target before we'd checked that it's not a mountpoint.  Fixed.

3) ancient bogosity in pivot_root() - we locked i_mutex on the
right directory, but checked S_DEAD on the different (and wrong)
one.  Noticed and fixed.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

14 May, 2010 2 commits

inotify: don't leak user struct on inotify release · b3b38d84

Authored by Pavel Emelyanov on May 12, 2010

inotify_new_group() receives a get_uid-ed user_struct and saves the
reference on group->inotify_data.user.  The problem is that free_uid() is
never called on it.

The issue seems to have been introduced by 63c882a0 (inotify: reimplement
inotify using fsnotify) after 2.6.30.
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Eric Paris <eparis@parisplace.org>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Eric Paris <eparis@redhat.com>

inotify: race use after free/double free in inotify inode marks · e0873344

Authored by Eric Paris on May 11, 2010

There is a race in the inotify add/rm watch code.  A task can find and
remove a mark which doesn't have all of its references.  This can
result in a use after free/double free situation.

Task A					Task B
------------				-----------
inotify_new_watch()
 allocate a mark (refcnt == 1)
 add it to the idr
					inotify_rm_watch()
					 inotify_remove_from_idr()
					  fsnotify_put_mark()
					      refcnt hits 0, free
 take reference because we are on idr
 [at this point it is a use after free]
 [time goes on]
 refcnt may hit 0 again, double free

The fix is to take the reference BEFORE the object can be found in the
idr.
Signed-off-by: Eric Paris <eparis@redhat.com>
Cc: <stable@kernel.org>
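
The ordering fix described above can be illustrated with a single-threaded toy model. All names here (`struct mark`, `mark_get`, `mark_put`, `publish_mark`, `remove_published_mark`, and the one-slot `published` table standing in for the idr) are hypothetical, and a `freed` flag replaces a real `free()` so the model stays safe to inspect.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical refcounted object; 'freed' stands in for an actual free(). */
struct mark {
    int refcnt;
    bool freed;
};

/* One-slot stand-in for the idr that lets other tasks find the mark. */
static struct mark *published;

static void mark_get(struct mark *m)
{
    assert(!m->freed); /* touching a freed mark would be a use after free */
    m->refcnt++;
}

static void mark_put(struct mark *m)
{
    assert(!m->freed); /* dropping a freed mark would be a double free */
    if (--m->refcnt == 0)
        m->freed = true;
}

/* Fixed ordering: take the table's reference BEFORE the mark is findable. */
static void publish_mark(struct mark *m)
{
    mark_get(m);
    published = m;
}

/* What a concurrent remover does once it finds the mark in the table. */
static void remove_published_mark(void)
{
    struct mark *m = published;

    if (m == NULL)
        return;
    published = NULL;
    mark_put(m); /* drops only the table's reference */
}
```

Because the table's reference already exists when the mark becomes visible, a remover can only drop that reference, and the creator's own reference keeps the object alive until the creator is done with it.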