Commits · b1375d64c539c5b76794be759b62d3f178e67c32 · openeuler / Kernel

27 Jan, 2012 · 1 commit

Btrfs: fix uninit warning in backref.c · b1375d64

Committed by Jan Schmidt 13 years ago

Added initialization with the declaration of ret. It isn't set later on the
switch-default branch (which should never be taken).
Signed-off-by: Jan Schmidt <list.btrfs@jan-o-sch.net>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

b1375d64

17 Jan, 2012 · 35 commits

Btrfs: use larger system chunks · 96bdc7dc

Committed by Chris Mason 13 years ago

System chunks by default are very small.  This makes them slightly
larger and also fixes the conditional checks to make sure we don't
allocate a billion of them at once.
Signed-off-by: Chris Mason <chris.mason@oracle.com>

96bdc7dc

Btrfs: add a delalloc mutex to inodes for delalloc reservations · f248679e

Committed by Josef Bacik 13 years ago

I was using i_mutex for this, but we're getting bogus lockdep warnings by doing
that and there's no real way to get rid of those, so just stop using i_mutex to
protect delalloc metadata reservations and use a delalloc mutex instead. This
shouldn't be contended often at all, only if you are writing and mmap writing to
the file at the same time. Thanks,
Signed-off-by: Josef Bacik <josef@redhat.com>

f248679e

Btrfs: space leak tracepoints · 8c2a3ca2

Committed by Josef Bacik 13 years ago

This, in addition to a script in my btrfs-tracing tree, will help track down space
leaks when we're getting space left over in block groups on umount. Thanks,
Signed-off-by: Josef Bacik <josef@redhat.com>

8c2a3ca2

Btrfs: protect orphan block rsv with spin_lock · 90290e19

Committed by Josef Bacik 13 years ago

We've been seeing warnings coming out of the orphan commit stuff forever from
ceph. Turns out it's because we're racing with checking if the orphan block
reserve is set, because we clear it outside of the spin_lock. So leave the
normal fastpath checks where they are, but take the spin_lock and _recheck_ to
make sure we haven't had an orphan block rsv added in the meantime. Then clear
the root's orphan block rsv and release the lock. With this patch a user said
the warnings went away and they usually showed up pretty soon after he started
ceph. Thanks,
Signed-off-by: Josef Bacik <josef@redhat.com>

90290e19
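
The check, lock, recheck sequence described in the commit above can be modeled outside the kernel. This is a minimal Python sketch, not the btrfs code: `Root`, `orphan_block_rsv`, `orphan_inodes` and `release_orphan_rsv` are hypothetical names, the exact conditions checked are assumptions, and `threading.Lock` stands in for the kernel spin_lock.

```python
import threading


class Root:
    """Hypothetical stand-in for a btrfs root; not the kernel structure."""

    def __init__(self):
        self.lock = threading.Lock()      # plays the role of the spin_lock
        self.orphan_block_rsv = object()  # a reservation is attached
        self.orphan_inodes = 0            # orphans still relying on it


def release_orphan_rsv(root):
    """Detach and return the reservation only if no orphan still needs it."""
    # Unlocked fast-path checks: cheap, but possibly stale.
    if root.orphan_block_rsv is None or root.orphan_inodes > 0:
        return None
    with root.lock:
        # Recheck under the lock: state may have changed in the meantime.
        if root.orphan_block_rsv is None or root.orphan_inodes > 0:
            return None
        rsv = root.orphan_block_rsv
        root.orphan_block_rsv = None      # cleared while holding the lock
    return rsv
```

The point of the pattern is that the pointer is only cleared while the lock is held, so a concurrent writer cannot slip in between the check and the clear.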

Btrfs: add allocator tracepoints · 3f7de037

Committed by Josef Bacik 13 years ago

I used these tracepoints when figuring out what the cluster stuff was doing, so
add them to mainline in case we need to profile this stuff again. Thanks,
Signed-off-by: Josef Bacik <josef@redhat.com>

3f7de037

Btrfs: don't call btrfs_throttle in file write · 45a8090e

Committed by Josef Bacik 13 years ago

btrfs_throttle will make us wait if there is a currently committing transaction
until we can open new transactions, which is ridiculous since we don't actually
start any transactions within the file write path anyway, so all this does is
introduce big latencies if we have a sync/fsync heavy workload going on while
somebody else is trying to do work. Thanks,
Signed-off-by: Josef Bacik <josef@redhat.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

45a8090e

Btrfs: release space on error in page_mkwrite · ec39e180

Committed by Josef Bacik 13 years ago

If updating the inode gave us an ENOSPC we were just returning in page_mkwrite,
which is a problem since we make our reservation right before trying to update
the inode, so fix the out label so that we actually free our reservation.
Thanks,
Signed-off-by: Josef Bacik <josef@redhat.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

ec39e180

Btrfs: fix btrfsck error 400 when truncating a compressed · f70a9a6b

Committed by Miao Xie 13 years ago

Reproduce steps:
 # mkfs.btrfs /dev/sdb5
 # mount /dev/sdb5 -o compress=lzo /mnt
 # dd if=/dev/zero of=/mnt/tmpfile bs=128K count=1
 # sync
 # truncate -s 64K /mnt/tmpfile
 root 5 inode 257 errors 400

This is caused by a wrong if condition, the one that decides whether we should
subtract the bytes of the dropped range from the inode's i_blocks/i_bytes.
When we truncate a compressed extent, btrfs subtracts the bytes of the whole
extent, which is wrong. We should subtract the size that is actually truncated,
whether or not the extent is compressed. Fix it.
Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

f70a9a6b
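
The accounting rule stated in the commit above (subtract only what is actually truncated) can be illustrated with a small calculation. This is a sketch under simplifying assumptions, not the kernel logic: `bytes_to_subtract` is a hypothetical helper that works on logical byte ranges only and ignores on-disk compressed sizes.

```python
KIB = 1024


def bytes_to_subtract(extent_len, new_size, extent_start=0):
    """Bytes to drop from the inode's byte count when truncating to new_size.

    Only the part of the extent beyond new_size is removed, whether or not
    the extent is compressed.
    """
    extent_end = extent_start + extent_len
    if new_size >= extent_end:
        return 0  # the extent is kept entirely
    return extent_end - max(new_size, extent_start)
```

For the reproducer above (a 128K extent truncated to 64K), this gives `bytes_to_subtract(128 * KIB, 64 * KIB) == 64 * KIB`, whereas subtracting the whole extent would leave the inode accounting 64K short.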

Btrfs: do not use btrfs_end_transaction_throttle everywhere · 7ad85bb7

Committed by Josef Bacik 13 years ago

A user reported a problem where things like open with O_CREAT would take up to
30 seconds when he had nfs activity on the same mount. This is because all of
our quick metadata operations, like create, symlink etc all do
btrfs_end_transaction_throttle, which if the transaction is blocked will wait
for the commit to complete before it returns. This adds a ridiculous amount of
latency and isn't really needed. The normal btrfs_end_transaction will mark the
transaction as blocked and wake the transaction kthread up if it thinks the
transaction needs to end (this being in the running out of global reserve space
scenario), and this is all that is really needed since we've already done
everything we're going to do, we just need to return. This should help people
with the latency they were seeing when using synchronous heavy workloads.
Thanks,
Signed-off-by: Josef Bacik <josef@redhat.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

7ad85bb7

Merge branch 'integrity-check-patch-v2' of git://btrfs.giantdisaster.de/git/btrfs into integration · c126dea7

Committed by Chris Mason 13 years ago

Conflicts:
	fs/btrfs/ctree.h
	fs/btrfs/super.c
Signed-off-by: Chris Mason <chris.mason@oracle.com>

c126dea7

Merge branch 'for-chris' of git://git.jan-o-sch.net/btrfs-unstable into integration · 9785dbdf

Committed by Chris Mason 13 years ago

9785dbdf

Merge branch 'for-chris' of git://repo.or.cz/linux-btrfs-devel into integration · d756bd2d

Committed by Chris Mason 13 years ago

Conflicts:
	fs/btrfs/volumes.c
Signed-off-by: Chris Mason <chris.mason@oracle.com>

d756bd2d

Merge branch 'restriper' of git://github.com/idryomov/btrfs-unstable into integration · 27263e28

Committed by Chris Mason 13 years ago

27263e28

Merge branch 'allocation-fixes' into integration · 64e05503

Committed by Chris Mason 13 years ago

64e05503

Btrfs: add balance progress reporting · 19a39dce

Committed by Ilya Dryomov 13 years ago

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>

19a39dce

Btrfs: allow for resuming restriper after it was paused · de322263

Committed by Ilya Dryomov 13 years ago

Recognize BTRFS_BALANCE_RESUME flag passed from userspace.  We use the
same heuristics used when recovering balance after a crash to try to
start where we left off last time.
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>

de322263

Btrfs: allow for canceling restriper · a7e99c69

Committed by Ilya Dryomov 13 years ago

Implement an ioctl for canceling restriper.  Currently we wait until
relocation of the current block group is finished, in future this can be
done by triggering a commit.  Balance item is deleted and no memory
about the interrupted balance is kept.
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>

a7e99c69

Btrfs: allow for pausing restriper · 837d5b6e

Committed by Ilya Dryomov 13 years ago

Implement an ioctl for pausing restriper.  This pauses the relocation,
but balance is still considered to be "in progress": balance item is
not deleted, other volume operations cannot be started, etc.  If paused
in the middle of profile changing operation we will continue making
allocations with the target profile.

Add a hook to close_ctree() to pause restriper and free its data
structures on unmount.  (It's safe to unmount when restriper is in
"paused" state, we will resume with the same parameters on the next
mount)
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>

837d5b6e

Btrfs: add skip_balance mount option · 9555c6c1

Committed by Ilya Dryomov 13 years ago

Since restriper kthread starts involuntarily on mount and can suck cpu
and memory bandwidth add a mount option to forcefully skip it.  The
restriper in that case hangs around in paused state and can be resumed
from userspace when it's convenient.
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>

9555c6c1

Btrfs: recover balance on mount · 59641015

Committed by Ilya Dryomov 13 years ago

On mount, if balance item is found, resume balance in a separate
kernel thread.

Try to be smart to continue roughly where previous balance (or convert)
was interrupted.  For chunk types that were being converted to some
profile we turn on soft convert, in case of a simple balance we turn on
usage filter and relocate only less-than-90%-full chunks of that type.
These are just heuristics but they help quite a bit, and can be improved
in future.
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>

59641015

Btrfs: save balance parameters to disk · 0940ebf6

Committed by Ilya Dryomov 13 years ago

Introduce a new btree objectid for storing balance item.  The reason is
to be able to resume restriper after a crash with the same parameters.
Balance item has a very high objectid and goes into tree of tree roots.

The key for the new item is as follows:

	[ BTRFS_BALANCE_OBJECTID ; BTRFS_BALANCE_ITEM_KEY ; 0 ]

Older kernels simply ignore it so it's safe to mount with an older
kernel and then go back to the newer one.
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>

0940ebf6

Btrfs: soft profile changing mode (aka soft convert) · cfa4c961

Committed by Ilya Dryomov 13 years ago

When doing convert from one profile to another if soft mode is on
restriper won't touch chunks that already have the profile we are
converting to.  This is useful if e.g. half of the FS was converted
earlier.

The soft mode switch is (like every other filter) per-type.  This means
that we can convert for example meta chunks the "hard" way while
converting data chunks selectively with soft switch.
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>

cfa4c961
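
The per-type soft switch described in the commit above can be sketched as a small decision function. This is an illustrative model only: `should_convert`, `chunk_selected` and the dictionary layout are hypothetical and do not mirror the kernel's `btrfs_balance_args` fields.

```python
def should_convert(chunk_profile, target_profile, soft):
    """Return True if a chunk should be relocated during a convert."""
    if soft and chunk_profile == target_profile:
        return False  # already has the target profile; soft mode skips it
    return True


# Hypothetical per-type settings: data is converted softly, metadata "hard".
ARGS = {
    "data": {"target": "raid1", "soft": True},
    "metadata": {"target": "raid1", "soft": False},
}


def chunk_selected(chunk_type, chunk_profile, args=ARGS):
    """Apply the settings that belong to the chunk's own type."""
    a = args[chunk_type]
    return should_convert(chunk_profile, a["target"], a["soft"])
```

With these settings a RAID1 data chunk is left alone, while a RAID1 metadata chunk is still rewritten.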

Btrfs: implement online profile changing · e4d8ec0f

Committed by Ilya Dryomov 13 years ago

Profile changing is done by launching a balance with
BTRFS_BALANCE_CONVERT bits set and target fields of respective
btrfs_balance_args structs initialized.  Profile reducing code in this
case will pick restriper's target profile if it's available instead of
doing a blind reduce.  If target profile is not yet available it goes
back to a plain reduce.
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>

e4d8ec0f

Btrfs: do not reduce profile in do_chunk_alloc() · 70922617

Committed by Ilya Dryomov 13 years ago

Every caller of do_chunk_alloc() feeds it the reduced allocation
profile, so stop trying to reduce it one more time.  Instead check the
validity of the passed profile.
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>

70922617

Btrfs: virtual address space subset filter · ea67176a

Committed by Ilya Dryomov 13 years ago

Select chunks which have at least one byte located inside a given
[vstart, vend) virtual address space range.
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>

ea67176a
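
"At least one byte inside [vstart, vend)" is a half-open interval overlap test. A minimal sketch, assuming a chunk covers the logical range [chunk_offset, chunk_offset + chunk_length); the function name is hypothetical and this is not the kernel implementation.

```python
def chunk_in_vrange(chunk_offset, chunk_length, vstart, vend):
    """True if the chunk has at least one byte inside [vstart, vend)."""
    chunk_end = chunk_offset + chunk_length  # exclusive end of the chunk
    return chunk_offset < vend and chunk_end > vstart
```

Because both ranges are half-open, a chunk that starts exactly at `vend` or ends exactly at `vstart` is not selected.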

Btrfs: devid subset filter · 94e60d5a

Committed by Ilya Dryomov 13 years ago

Select chunks which have at least one byte of at least one stripe
located on a device with devid X in a given [pstart,pend) physical
address range.

This filter only works when devid filter is turned on.
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>

94e60d5a

Btrfs: devid filter · 409d404b

Committed by Ilya Dryomov 13 years ago

Relocate chunks which have at least one stripe located on a device with
devid X.
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>

409d404b
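
The devid filter and the devid subset filter from the two commits above can be modeled together, since the subset filter only makes sense for a given devid. This is a sketch with hypothetical names (`Stripe`, `devid_filter`, `drange_filter`); it assumes each stripe's per-device length is already known, which the real code has to derive from the chunk layout.

```python
from typing import NamedTuple


class Stripe(NamedTuple):
    devid: int     # device the stripe lives on
    physical: int  # physical start offset on that device
    length: int    # bytes the stripe occupies on that device (assumed known)


def devid_filter(stripes, devid):
    """True if at least one stripe is located on the given device."""
    return any(s.devid == devid for s in stripes)


def drange_filter(stripes, devid, pstart, pend):
    """True if a stripe on devid has at least one byte in [pstart, pend)."""
    return any(
        s.devid == devid and s.physical < pend and s.physical + s.length > pstart
        for s in stripes
    )
```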

Btrfs: usage filter · 5ce5b3c0

Committed by Ilya Dryomov 13 years ago

Select chunks that are less than X percent full.
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>

5ce5b3c0
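
"Less than X percent full" can be checked with integer arithmetic only, which avoids rounding at the boundary. A minimal sketch; `chunk_below_usage` and its parameters are hypothetical names, not the kernel function.

```python
def chunk_below_usage(used_bytes, chunk_length, usage_percent):
    """True if the chunk is strictly less than usage_percent percent full."""
    # used / length < percent / 100, rewritten without division
    return used_bytes * 100 < usage_percent * chunk_length
```

A chunk that is exactly 90% full is therefore not selected by a 90% usage filter, while one at 89% is.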

Btrfs: profiles filter · ed25e9b2

Committed by Ilya Dryomov 13 years ago

Select chunks based on a given profile mask.
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>

ed25e9b2

Btrfs: add basic infrastructure for selective balancing · f43ffb60

Committed by Ilya Dryomov 13 years ago

This allows having a separate set of filters for each chunk type
(data,meta,sys).  The code however is generic and switch on chunk type
is only done once.

This commit also adds a type filter: it allows balancing, for example,
meta and system chunks w/o touching data ones.
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>

f43ffb60

Btrfs: add basic restriper infrastructure · c9e9f97b

Committed by Ilya Dryomov 13 years ago

Add basic restriper infrastructure: extended balancing ioctl and all
related ioctl data structures, add data structure for tracking
restriper's state to fs_info, etc.  The semantics of the old balancing
ioctl are fully preserved.

Explicitly disallow any volume operations when balance is in progress.
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>

c9e9f97b

Btrfs: make avail_*_alloc_bits fields dynamic · 10ea00f5

Committed by Ilya Dryomov 13 years ago

Currently when new chunks are created respective avail_alloc_bits field
is updated to reflect profiles of all chunks present in the system.
However when chunks are removed profile bits are never cleared.

This patch clears profile bit of respective avail_alloc_bits field when
the last chunk with that profile is removed.  Restriper needs this to
properly operate when "downgrading".
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>

10ea00f5
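
The behavior described above (set a profile bit when a chunk appears, clear it when the last chunk with that profile goes away) can be modeled with a per-profile counter. This is an assumption-laden sketch: `AvailAllocBits` is a hypothetical class, the bit values are illustrative, and the kernel detects "last chunk" by its own means rather than with a counter like this.

```python
from collections import Counter

RAID0 = 1 << 3  # illustrative profile bits
RAID1 = 1 << 4


class AvailAllocBits:
    """Hypothetical model of one avail_*_alloc_bits field."""

    def __init__(self):
        self.bits = 0
        self._chunks = Counter()  # number of live chunks per profile bit

    def add_chunk(self, profile_bit):
        self._chunks[profile_bit] += 1
        self.bits |= profile_bit

    def remove_chunk(self, profile_bit):
        if self._chunks[profile_bit] == 0:
            raise ValueError("no chunk with this profile")
        self._chunks[profile_bit] -= 1
        if self._chunks[profile_bit] == 0:
            self.bits &= ~profile_bit  # last chunk gone: clear the bit
```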

Btrfs: add BTRFS_AVAIL_ALLOC_BIT_SINGLE bit · a46d11a8

Committed by Ilya Dryomov 13 years ago

Right now on-disk BTRFS_BLOCK_GROUP_* profile bits are used for
avail_{data,metadata,system}_alloc_bits fields, which gather info about
available allocation profiles in the FS. When chunk is created or read
from disk, its profile is OR'ed with the corresponding avail_alloc_bits
field. Since SINGLE is denoted by 0 in the on-disk format, currently
there is no way to tell when such chunks become available. Restriper
needs that information, so add a separate bit for SINGLE profile.

This bit is going to be in-memory only, it should never be written out
to disk, so it's not a disk format change. However to avoid remappings
in future, reserve corresponding on-disk bit.
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>

a46d11a8
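
Why a zero-valued SINGLE profile cannot be tracked by OR-ing, and how a dedicated in-memory bit fixes that, can be shown in a few lines. This is a sketch: the constant values follow my understanding of the btrfs headers but should be treated as illustrative, and `extended_profile`, `track_on_disk` and `track_extended` are hypothetical helpers.

```python
DATA, SYSTEM, METADATA = 1 << 0, 1 << 1, 1 << 2            # chunk types
RAID0, RAID1, DUP, RAID10 = 1 << 3, 1 << 4, 1 << 5, 1 << 6  # profiles
PROFILE_MASK = RAID0 | RAID1 | DUP | RAID10
AVAIL_ALLOC_BIT_SINGLE = 1 << 48  # in-memory only, never written to disk


def extended_profile(chunk_flags):
    """Map the on-disk profile to its in-memory form; SINGLE gets its own bit."""
    profile = chunk_flags & PROFILE_MASK
    return profile if profile else AVAIL_ALLOC_BIT_SINGLE


def track_on_disk(avail, chunk_flags):
    """OR in the raw on-disk profile: a SINGLE chunk (0) leaves no trace."""
    return avail | (chunk_flags & PROFILE_MASK)


def track_extended(avail, chunk_flags):
    """OR in the extended profile: a SINGLE chunk now sets a bit."""
    return avail | extended_profile(chunk_flags)
```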

Btrfs: introduce masks for chunk type and profile · 52ba6929

Committed by Ilya Dryomov 13 years ago

Chunk's type and profile are encoded in u64 flags field.  Introduce
masks to easily access them.  Also fix the type of BTRFS_BLOCK_GROUP_*
constants, it should be ULL.
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>

52ba6929

Btrfs: get rid of *_alloc_profile fields · 6fef8df1

Committed by Ilya Dryomov 13 years ago

{data,metadata,system}_alloc_profile fields have been unused for a long
time now.  Get rid of them.
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>

6fef8df1

11 Jan, 2012 · 4 commits

Btrfs: fix possible deadlock when opening a seed device · b367e47f

Committed by Li Zefan 13 years ago

The correct lock order is uuid_mutex -> volume_mutex -> chunk_mutex,
but when we mount a filesystem which has backing seed devices, we have
this lock chain:

    open_ctree()
        lock(chunk_mutex);
        read_chunk_tree();
            read_one_dev();
                open_seed_devices();
                    lock(uuid_mutex);

and then we hit a lockdep splat.
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>

b367e47f

Btrfs: update global block_rsv when creating a new block group · c7c144db

Committed by Li Zefan 13 years ago

A bug was triggered while using seed device:

    # mkfs.btrfs /dev/loop1
    # btrfstune -S 1 /dev/loop1
    # mount -o /dev/loop1 /mnt
    # btrfs dev add /dev/loop2 /mnt

btrfs: block rsv returned -28
------------[ cut here ]------------
WARNING: at fs/btrfs/extent-tree.c:5969 btrfs_alloc_free_block+0x166/0x396 [btrfs]()
...
Call Trace:
...
[<f7b7c31c>] btrfs_cow_block+0x101/0x147 [btrfs]
[<f7b7eaa6>] btrfs_search_slot+0x1b8/0x55f [btrfs]
[<f7b7f844>] btrfs_insert_empty_items+0x42/0x7f [btrfs]
[<f7b7f8c1>] btrfs_insert_item+0x40/0x7e [btrfs]
[<f7b8ac02>] btrfs_make_block_group+0x243/0x2aa [btrfs]
[<f7bb3f53>] __btrfs_alloc_chunk+0x672/0x70e [btrfs]
[<f7bb41ff>] init_first_rw_device+0x77/0x13c [btrfs]
[<f7bb5a62>] btrfs_init_new_device+0x664/0x9fd [btrfs]
[<f7bbb65a>] btrfs_ioctl+0x694/0xdbe [btrfs]
[<c04f55f7>] do_vfs_ioctl+0x496/0x4cc
[<c04f5660>] sys_ioctl+0x33/0x4f
[<c07b9edf>] sysenter_do_call+0x12/0x38
---[ end trace 906adac595facc7d ]---

Since the seed device is read-only, there's no usable space in the filesystem.
Afterwards we add a sprout device to it, and the kernel creates a METADATA
block group and a SYSTEM block group, which provide free space we can reserve,
but we still get a reservation failure because the global block_rsv hasn't
been updated accordingly.
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>

c7c144db

Btrfs: rewrite btrfs_trim_block_group() · 7fe1e641

Committed by Li Zefan 13 years ago

There are various bugs in block group trimming:

- It may trim from offset smaller than user-specified offset.
- It may trim beyond user-specified range.
- It may leak free space for extents smaller than specified minlen.
- It may truncate the last trimmed extent thus leak free space.
- With mixed extents+bitmaps, some extents may not be trimmed.
- With mixed extents+bitmaps, some bitmaps may not be trimmed (even
none will be trimmed). Even for those trimmed, not all the free space
in the bitmaps will be trimmed.

I rewrite btrfs_trim_block_group() and break it into two functions.
One is to trim extents only, and the other is to trim bitmaps only.

Before patching:

	# fstrim -v /mnt/
	/mnt/: 1496465408 bytes were trimmed

After patching:

	# fstrim -v /mnt/
	/mnt/: 2193768448 bytes were trimmed

And this matches the total free space:

	# btrfs fi df /mnt
	Data: total=3.58GB, used=1.79GB
	System, DUP: total=8.00MB, used=4.00KB
	System: total=4.00MB, used=0.00
	Metadata, DUP: total=205.12MB, used=97.14MB
	Metadata: total=8.00MB, used=0.00
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>

7fe1e641
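
Several of the bugs listed above come down to not clamping a free extent to the user-specified range and not honoring minlen. The sketch below shows only that clamping step; `clamp_trim` is a hypothetical helper, not the rewritten btrfs_trim_block_group(), and it assumes an extent that is skipped simply stays in the free space cache.

```python
def clamp_trim(extent_start, extent_len, range_start, range_end, minlen):
    """Return (start, length) of the part to discard, or None to skip.

    The free extent is [extent_start, extent_start + extent_len) and the
    user range is [range_start, range_end).
    """
    start = max(extent_start, range_start)             # never below the range
    end = min(extent_start + extent_len, range_end)    # never beyond the range
    if end <= start or end - start < minlen:
        return None  # nothing in range, or too small to be worth trimming
    return start, end - start
```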

Btrfs: simplfy calculation of stripe length for discard operation · ec9ef7a1

Committed by Li Zefan 13 years ago

For btrfs raid, while discarding a range of space, we'll need to know
the start offset and length to discard for each device, and it's done
in btrfs_map_block().

However the calculation is a bit complex for raid0 and raid10, so I
reimplement it based on a fact that:

        dev1          dev2           dev3    (raid0)
        -----------------------------------
        s0 s3 s6      s1 s4 s7       s2 s5

Each device has (total_stripes / nr_dev) stripes, or plus one.
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>

ec9ef7a1
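
The observation that each device holds (total_stripes / nr_dev) stripes, or one more, can be written out directly. This is a sketch of the counting argument only, assuming the first stripe of the range lands on the first device as in the diagram; `stripes_per_device` is a hypothetical name and the kernel code additionally has to handle ranges that start on another device and partial first and last stripes.

```python
def stripes_per_device(total_stripes, nr_dev):
    """Number of stripes on each device under round-robin placement."""
    base, extra = divmod(total_stripes, nr_dev)
    # The first `extra` devices receive one stripe more than the rest.
    return [base + (1 if dev < extra else 0) for dev in range(nr_dev)]
```

For the diagram above (stripes s0 to s7 over three devices), `stripes_per_device(8, 3)` gives `[3, 3, 2]`.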

openeuler / Kernel · synced successfully more than 1 year ago