Commits · 213e64da90d14537cd63f7090d6c4d1fcc75d9f8 · openeuler / raspberrypi-kernel

27 Mar, 2012 24 commits

Btrfs: fix infinite loop in btrfs_shrink_device() · 213e64da

Committed by Ilya Dryomov on Mar 27, 2012

If relocation of block group 0 fails with ENOSPC, we end up looping
infinitely because the key.offset -= 1 statement in that case brings us back to
where we started.
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>

213e64da
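The wraparound behind this kind of loop can be sketched in isolation. The following is a minimal illustration, not the kernel code: `next_offset_buggy()` and `walk_fixed()` are hypothetical names, and the loop shape is only one way to stop after offset 0 has been handled. It relies on the fact that subtracting 1 from an unsigned 64-bit zero yields the maximum value, which would restart a backwards walk from the top.

```c
#include <stdint.h>

/* Buggy step: unconditional decrement. At 0 this wraps to UINT64_MAX,
 * so a walk that was meant to end at 0 starts over from the top. */
uint64_t next_offset_buggy(uint64_t offset)
{
	return offset - 1;
}

/* Count how many offsets a backwards walk visits when the loop
 * condition is tested before the decrement takes effect. */
unsigned walk_fixed(uint64_t start)
{
	uint64_t offset = start;
	unsigned visited = 0;

	do {
		visited++;          /* "process" the item at this offset */
	} while (offset-- > 0);     /* 0 ends the loop instead of wrapping */

	return visited;
}
```

With this shape, a walk starting at offset 3 visits 3, 2, 1 and 0 and then stops.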

Btrfs: fix memory leak in resolver code · 5eb56d25

Committed by Ilya Dryomov on Mar 27, 2012

init_ipath() allocates a btrfs_data_container which is never freed.  Free
it in free_ipath() and remove the comment for init_data_container() - we
can safely free it with kfree().
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>

5eb56d25

Btrfs: allow dup for data chunks in mixed mode · e4837f8f

Committed by Ilya Dryomov on Mar 27, 2012

Generally we don't allow dup for data, but mixed chunks are special and
people seem to think this has its use cases.
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>

e4837f8f

Btrfs: validate target profiles only if we are going to use them · 6728b198

Committed by Ilya Dryomov on Mar 27, 2012

Do not run sanity checks on all target profiles unless they all will be
used.  This came up because alloc_profile_is_valid() is now more strict
than it used to be.
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>

6728b198

Btrfs: improve the logic in btrfs_can_relocate() · 4a5e98f5

Committed by Ilya Dryomov on Mar 27, 2012

Currently if we don't have enough space allocated we go ahead and loop
through devices in the hopes of finding enough space for a chunk of the
*same* type as the one we are trying to relocate.  The problem with that
is that if we are trying to restripe the chunk its target type can be
more relaxed than the current one (e.g. require fewer devices or less
space).  So, when restriping, run checks against the target profile
instead of the current one.
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>

4a5e98f5

Btrfs: add __get_block_group_index() helper · 7738a53a

Committed by Ilya Dryomov on Mar 27, 2012

Add __get_block_group_index() helper to be able to derive block group
index from an arbitrary set of flags.  Implement get_block_group_index()
in terms of it.
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>

7738a53a
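The refactoring pattern described here, a flags-based helper with the block-group variant layered on top, might look roughly like the sketch below. The flag values, the index order, `struct block_group` and both function names are assumptions for illustration and are not taken from the commit.

```c
#include <stdint.h>

/* Illustrative profile bits; the values are assumptions for this sketch. */
#define BG_RAID0  (1ULL << 3)
#define BG_RAID1  (1ULL << 4)
#define BG_DUP    (1ULL << 5)
#define BG_RAID10 (1ULL << 6)

/* Hypothetical stand-in for a block group; only the flags matter here. */
struct block_group {
	uint64_t flags;
};

/* Derive an index from a raw set of flags, with no block group needed. */
int index_from_flags(uint64_t flags)
{
	if (flags & BG_RAID10)
		return 0;
	if (flags & BG_RAID1)
		return 1;
	if (flags & BG_DUP)
		return 2;
	if (flags & BG_RAID0)
		return 3;
	return 4; /* no profile bit set: treated as "single" */
}

/* The block-group variant becomes a thin wrapper around the helper. */
int index_from_block_group(const struct block_group *bg)
{
	return index_from_flags(bg->flags);
}
```

Splitting the helper this way lets callers that only hold a set of flags, such as a restripe target, reuse the same mapping.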

Btrfs: add get_restripe_target() helper · fc67c450

Committed by Ilya Dryomov on Mar 27, 2012

Add get_restripe_target() helper and switch everybody to use it.
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>

fc67c450

Btrfs: move alloc_profile_is_valid() to volumes.c · 0c460c0d

Committed by Ilya Dryomov on Mar 27, 2012

A header file is not a good place to define functions.  This also moves a
call to alloc_profile_is_valid() down the stack and removes a redundant
check from __btrfs_alloc_chunk() - alloc_profile_is_valid() takes it
into account.
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>

0c460c0d

Btrfs: make profile_is_valid() check more strict · e8920a64

Committed by Ilya Dryomov on Mar 27, 2012

"0" is a valid value for an on-disk chunk profile, but it is not a valid
extended profile.  (We have a separate bit for single chunks in the extended
case.)

Also rename it to alloc_profile_is_valid() for clarity.
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>

e8920a64
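The distinction between the two profile formats can be sketched as follows. This is a simplified illustration under stated assumptions: the bit values, the mask names, `PROFILE_SINGLE`, and the "exactly one bit set" rule are all assumed for the sketch, and the conversion helpers are hypothetical stand-ins for the wrappers mentioned in the neighbouring commit.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative profile bits; the values are assumptions for this sketch. */
#define PROFILE_RAID0   (1ULL << 3)
#define PROFILE_RAID1   (1ULL << 4)
#define PROFILE_DUP     (1ULL << 5)
#define PROFILE_RAID10  (1ULL << 6)
/* Hypothetical "single" bit that exists only in the extended format. */
#define PROFILE_SINGLE  (1ULL << 48)

#define CHUNK_MASK    (PROFILE_RAID0 | PROFILE_RAID1 | PROFILE_DUP | PROFILE_RAID10)
#define EXTENDED_MASK (CHUNK_MASK | PROFILE_SINGLE)

/* On-disk format encodes "single" as 0; extended format uses a bit. */
uint64_t chunk_to_extended(uint64_t flags)
{
	return flags == 0 ? PROFILE_SINGLE : flags;
}

uint64_t extended_to_chunk(uint64_t flags)
{
	return flags & ~PROFILE_SINGLE;
}

bool profile_is_valid(uint64_t flags, bool extended)
{
	uint64_t mask = extended ? EXTENDED_MASK : CHUNK_MASK;

	if (flags & ~mask)
		return false;        /* unknown bits are never valid */
	if (flags == 0)
		return !extended;    /* 0 is valid only in the on-disk format */
	/* Assumption: a valid profile has exactly one bit set. */
	return (flags & (flags - 1)) == 0;
}
```

Under these assumptions, a zero profile passes the on-disk check, fails the extended check, and passes again once converted with `chunk_to_extended()`.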

Btrfs: add wrappers for working with alloc profiles · 899c81ea

Committed by Ilya Dryomov on Mar 27, 2012

Add functions to abstract the conversion between chunk and extended
allocation profile formats and switch everybody to use them.
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>

899c81ea

Btrfs: stop silently switching single chunks to raid0 on balance · e3176ca2

Committed by Ilya Dryomov on Mar 27, 2012

This has been causing a lot of confusion for quite a while now and a lot
of users were surprised by this (some of them were even stuck in an
ENOSPC situation which they couldn't easily get out of). The addition
of the restriper gives users a clear choice between raid0 and drive concat
setup so there's absolutely no excuse for us to keep doing this.
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>

e3176ca2

Btrfs: deal with read errors on extent buffers differently · ea466794

Committed by Josef Bacik on Mar 26, 2012

Since we need to read and write extent buffers in their entirety we can't use
the normal bio_readpage_error stuff since it only works on a per page basis. So
instead make it so that if we see an io error in endio we just mark the eb as
having an IO error and then in btree_read_extent_buffer_pages we will manually
try other mirrors and then overwrite the bad mirror if we find a good copy.
This works with larger than page size blocks. Thanks,
Signed-off-by: Josef Bacik <josef@redhat.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

ea466794

Btrfs: don't use threaded IO completion helpers for metadata writes · f3f266ab

Committed by Chris Mason on Mar 23, 2012

The metadata write IO completion code is now simple enough that we
don't need the threaded helpers anymore.
Signed-off-by: Chris Mason <chris.mason@oracle.com>

f3f266ab

Btrfs: adjust the write_lock_level as we unlock · f7c79f30

Committed by Chris Mason on Mar 19, 2012

btrfs_search_slot sometimes needs write locks on high levels of
the tree.  It remembers the highest level that needs a write lock
and will use that for all future searches through the tree in a given
call.

But, very often we'll just cow the top level or the level below and we
won't really need write locks on the root again after that.  This patch
changes things to adjust the write lock requirement as it unlocks
levels.
Signed-off-by: Chris Mason <chris.mason@oracle.com>

f7c79f30

Btrfs: loop waiting on writeback · a098d8e8

Committed by Chris Mason on Mar 21, 2012

lock_extent_buffer_for_io needs to loop around and make sure the
writeback bits are not set.
Signed-off-by: Chris Mason <chris.mason@oracle.com>

a098d8e8

Btrfs: add the ability to cache a pointer into the eb · cfed81a0

Committed by Chris Mason on Mar 03, 2012

This cuts down on the CPU time used by map_private_extent_buffer.
Signed-off-by: Chris Mason <chris.mason@oracle.com>

cfed81a0

Btrfs: ensure an entire eb is written at once · 0b32f4bb

Committed by Josef Bacik on Mar 13, 2012

This patch simplifies how we track our extent buffers. Previously we could exit
writepages with only having written half of an extent buffer, which meant we had
to track the state of the pages and the state of the extent buffers differently.
Now we only read in entire extent buffers and write out entire extent buffers,
this allows us to simply set bits in our bflags to indicate the state of the eb
and we no longer have to do things like track uptodate with our iotree. Thanks,
Signed-off-by: Josef Bacik <josef@redhat.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

0b32f4bb

Btrfs: introduce mark_extent_buffer_accessed · 5df4235e

Committed by Josef Bacik on Mar 15, 2012

Because an eb can have multiple pages we need to make sure that all pages within
the eb are marked as accessed, since releasepage can be called against any page
in the eb. This will keep us from possibly evicting hot ebs when we're doing
larger than pagesize ebs. Thanks,
Signed-off-by: Josef Bacik <josef@redhat.com>

5df4235e

Btrfs: introduce free_extent_buffer_stale · 3083ee2e

Committed by Josef Bacik on Mar 09, 2012

Because btrfs cow's we can end up with extent buffers that are no longer
necessary just sitting around in memory. So instead of evicting these pages, we
could end up evicting things we actually care about. Thus we have
free_extent_buffer_stale for use when we are freeing tree blocks. This will
make it so that the ref for the eb being in the radix tree is dropped as soon as
possible and then is freed when the refcount hits 0 instead of waiting to be
released by releasepage. Thanks,
Signed-off-by: Josef Bacik <josef@redhat.com>

3083ee2e

Btrfs: only use the existing eb if it's count isn't 0 · 115391d2

Committed by Josef Bacik on Mar 09, 2012

We can run into a problem where we find an eb for our existing page already on
the radix tree but it has a ref count of 0. It hasn't been removed by RCU
yet, so this can cause issues where we will use the eb after free. So do
atomic_inc_not_zero on the exists->refs and if it is zero just do
synchronize_rcu() and try again. We won't have to worry about new allocators
coming in since they will block on the page lock at this point. Thanks,
Signed-off-by: Josef Bacik <josef@redhat.com>

115391d2
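The "increment unless zero" idea can be shown outside the kernel with C11 atomics. This is only a userspace sketch of the pattern, not the kernel's atomic_inc_not_zero implementation; `ref_get_unless_zero()` is a hypothetical name, and the retry after an RCU grace period described above is left to the caller.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Take a reference only if the count is still non-zero.
 * Returns false when the object is already on its way to being freed. */
bool ref_get_unless_zero(atomic_int *refs)
{
	int old = atomic_load(refs);

	while (old != 0) {
		/* On failure, `old` is refreshed with the current value
		 * and the zero check is repeated. */
		if (atomic_compare_exchange_weak(refs, &old, old + 1))
			return true;
	}
	return false;
}
```

A caller that gets `false` must not touch the object; in the scenario above it would wait and look the entry up again.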

Btrfs: set page->private to the eb · 4f2de97a

Committed by Josef Bacik on Mar 07, 2012

We spend a lot of time looking up extent buffers from pages when we could just
store the pointer to the eb the page is associated with in page->private. This
patch does just that, and it makes things a little simpler and reduces a bit of
CPU overhead involved with doing metadata IO. Thanks,
Signed-off-by: Josef Bacik <josef@redhat.com>

4f2de97a

Btrfs: allow metadata blocks larger than the page size · 727011e0

Committed by Chris Mason on Aug 06, 2010

A few years ago the btrfs code to support blocks larger than
the page size was disabled to fix a few corner cases in the
page cache handling.  This fixes the code to properly support
large metadata blocks again.

Since current kernels will crash early and often with larger
metadata blocks, this adds an incompat bit so that older kernels
can't mount it.

This also does away with different blocksizes for nodes and leaves.
You get a single block size for all tree blocks.
Signed-off-by: Chris Mason <chris.mason@oracle.com>

727011e0

Btrfs: remove search_start and search_end from find_free_extent and callers · 81c9ad23

Committed by Josef Bacik on Jan 18, 2012

We have been passing nothing but (u64)-1 to find_free_extent for search_end in
all of the callers, so it's completely useless, and we've always been passing 0
in as search_start, so just remove them as function arguments and move
search_start into find_free_extent. Thanks,
Signed-off-by: Josef Bacik <josef@redhat.com>

81c9ad23

Btrfs: remove the ideal caching code · 285ff5af

Committed by Josef Bacik on Jan 13, 2012

This is a relic from before we had the disk space cache and it was to make
bootup times when you had btrfs as root not be so damned slow. Now that we have
the disk space cache this isn't a problem anymore and really having this code
causes unneeded fragmentation and complexity, so just remove it. Thanks,
Signed-off-by: Josef Bacik <josef@redhat.com>

285ff5af

03 Mar, 2012 2 commits

Btrfs: fix casting error in scrub reada code · a175423c

Committed by Chris Mason on Feb 28, 2012

The reada code from scrub was casting down a u64 to
an unsigned long so it could insert it into a radix tree.

What it really wanted to do was cast down the result of a shift, instead
of casting down the u64.  The bug resulted in trying to insert our
reada struct into the wrong place, which caused soft lockups and other
problems.
Signed-off-by: Chris Mason <chris.mason@oracle.com>

a175423c
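A mistake of this general shape is easy to reproduce, because a C cast binds more tightly than a shift. The sketch below is illustrative only: `uint32_t` stands in for a 32-bit unsigned long, and `SHIFT`, `index_buggy()` and `index_fixed()` are hypothetical names, not the expressions from the commit.

```c
#include <stdint.h>

#define SHIFT 12 /* illustrative shift amount, assumed for this sketch */

/* Truncates the 64-bit value first, then shifts what is left,
 * so the upper 32 bits are lost before they can contribute. */
uint32_t index_buggy(uint64_t logical)
{
	return (uint32_t)logical >> SHIFT;
}

/* Shifts the full 64-bit value, then narrows the result. */
uint32_t index_fixed(uint64_t logical)
{
	return (uint32_t)(logical >> SHIFT);
}
```

For an address above 4 GiB the two functions disagree, so an entry keyed by the buggy index would land in the wrong slot.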

btrfs: fix locking issues in find_parent_nodes() · d3b01064

Committed by Li Zefan on Mar 03, 2012

- We might unlock head->mutex while it was not locked
- We might leave the function without unlocking delayed_refs->lock
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

d3b01064

24 Feb, 2012 1 commit

Btrfs: fix compiler warnings on 32 bit systems · e77266e4

Committed by Chris Mason on Feb 24, 2012

The enospc tracing code added some interesting uses of
u64 pointer casts.
Signed-off-by: Chris Mason <chris.mason@oracle.com>

e77266e4

23 Feb, 2012 5 commits

Btrfs: increase the global block reserve estimates · 5500cdbe

Committed by Liu Bo on Feb 23, 2012

When doing IO with large amounts of data fragmentation, the global block
reserve calculations are too low.  This increases them to avoid
ENOSPC crashes.
Signed-off-by: Liu Bo <liubo2009@cn.fujitsu.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

5500cdbe

Btrfs: clear the extent uptodate bits during parent transid failures · 50653190

Committed by Chris Mason on Feb 22, 2012

If btrfs reads a block and finds a parent transid mismatch, it clears
the uptodate flags on the extent buffer, and the pages inside it.  But
we only clear the uptodate bits in the state tree if the block straddles
more than one page.

This comes from an old optimization to reduce contention on the extent
state tree.  But it is buggy because the code that retries a read from
a different copy of the block is going to find the uptodate state bits
set and skip the IO.

The end result of the bug is that we'll never actually read the good
copy (if there is one).

The fix here is to always clear the uptodate state bits, which is safe
because this code is only called when the parent transid fails.
Signed-off-by: Chris Mason <chris.mason@oracle.com>

50653190

Btrfs: add extra sanity checks on the path names in btrfs_mksubvol · 16780cab

Committed by Chris Mason on Feb 20, 2012

Signed-off-by: Chris Mason <chris.mason@oracle.com>

16780cab

Btrfs: make sure we update latest_bdev · a6b0d5c8

Committed by Chris Mason on Feb 20, 2012

When we are setting up the mount, we close all the
devices that were not actually part of the metadata we found.

But, we don't make sure that one of those devices wasn't
fs_devices->latest_bdev, which means we can do a use after free
on the one we closed.

This updates latest_bdev as it goes.
Signed-off-by: Chris Mason <chris.mason@oracle.com>

a6b0d5c8

Btrfs: improve error handling for btrfs_insert_dir_item callers · fe66a05a

Committed by Chris Mason on Feb 20, 2012

This allows us to gracefully continue if we aren't able to insert
directory items, both for normal files/dirs and snapshots.
Signed-off-by: Chris Mason <chris.mason@oracle.com>

fe66a05a

21 Feb, 2012 1 commit

Btrfs: be less strict on finding next node in clear_extent_bit · 692e5759

Committed by Liu Bo on Feb 16, 2012

In clear_extent_bit, it is enough that the next node is adjacent at the tree level.
Signed-off-by: Liu Bo <liubo2009@cn.fujitsu.com>

692e5759

17 Feb, 2012 6 commits

Btrfs: fix a bug on overcommit stuff · d9b0218f

Committed by Liu Bo on Feb 16, 2012

When overcommitting, we should check the sum of pinned space and
bytes for delayed items.
Signed-off-by: Liu Bo <liubo2009@cn.fujitsu.com>

d9b0218f

Btrfs: kick out redundant stuff in convert_extent_bit · 9d47c767

Committed by Liu Bo on Feb 16, 2012

clear_state_bit will do merge_state for us, so kick out the redundant one.
Signed-off-by: Liu Bo <liubo2009@cn.fujitsu.com>

9d47c767

Btrfs: skip states when they does not contain bits to clear · 0449314a

Committed by Liu Bo on Feb 16, 2012

Clearing a range's bits is different from setting them, since we don't
need to touch states that do not contain the bits we want to clear.
Signed-off-by: Liu Bo <liubo2009@cn.fujitsu.com>

0449314a

Btrfs: check return value of lookup_extent_mapping() correctly · 285190d9

Committed by Tsutomu Itoh on Feb 16, 2012

This patch corrects error checking of lookup_extent_mapping().
Signed-off-by: Tsutomu Itoh <t-itoh@jp.fujitsu.com>

285190d9

Btrfs: fix deadlock on page lock when doing auto-defragment · 600a45e1

Committed by Miao Xie on Feb 16, 2012

When I ran xfstests in a loop on a btrfs mounted with autodefrag, a deadlock
occurred.

Steps to reproduce:
[tty0]
 # export MOUNT_OPTIONS="-o autodefrag"
 # export TEST_DEV=<partition1>
 # export TEST_DIR=<mountpoint1>
 # export SCRATCH_DEV=<partition2>
 # export SCRATCH_MNT=<mountpoint2>
 # while [ 1 ]
 > do
 > ./check 091 127 263
 > sleep 1
 > done
[tty1]
 # while [ 1 ]
 > do
 > echo 3 > /proc/sys/vm/drop_caches
 > done

Several hours later, the test processes hang, deadlocked on a page lock.

The reason is that:
  Auto defrag task		Flush thread			Test task
				btrfs_writepages()
				  add ordered extent
				  (including page 1, 2)
				  set page 1 writeback
				  set page 2 writeback
				endio_fn()
				  end page 2 writeback
								release page 2
lock page 1
alloc and lock page 2
page 2 is not uptodate
  btrfs_readpage()
    start ordered extent()
    btrfs_writepages()
      try  to lock page 1

so deadlock happens.

Fix this bug by unlocking the page which is in writeback, and re-locking it
after the writeback ends.
Signed-off-by: Miao Xie <miax@cn.fujitsu.com>

600a45e1

Btrfs: fix return value check of extent_io_ops · 013bd4c3

Committed by Tsutomu Itoh on Feb 16, 2012

This patch adds a check on the return value of extent_io_ops.
Signed-off-by: Tsutomu Itoh <t-itoh@jp.fujitsu.com>

013bd4c3

16 Feb, 2012 1 commit

btrfs: honor umask when creating subvol root · 12fc9d09

Committed by Florian Albrechtskirchinger on Feb 10, 2012

Set the subvol root inode permissions based on the current umask.

12fc9d09