Commits · 5f3ab90a72f98adbf00c50ac2d4d2b47cf4a9685 · openeuler / raspberrypi-kernel

17 Dec, 2012 35 commits

Btrfs: rename root_times_lock to root_item_lock · 5f3ab90a

Authored by Anand Jain on Dec 07, 2012

Originally root_times_lock was introduced as part of the send/receive
code; however, a newly developed patch to label the subvol reuses
the same lock, so rename it to something more meaningful.
Signed-off-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>

btrfs: Notify udev when removing device · b8b8ff59

Authored by Lukas Czerner on Dec 06, 2012

Currently udev does not know about the device being removed from the
file system. This may result in the situation where we're unable to
mount the file system by UUID or by LABEL because the by-uuid and
by-label links may still point to the device which is no longer part of
the btrfs file system and hence does not have any btrfs super block.

It can be easily reproduced by the following:

mkfs.btrfs -L bugfs /dev/loop[0-6]
mount /dev/loop0 /mnt/test
btrfs device delete /dev/loop0 /mnt/test
umount /mnt/test

mount LABEL=bugfs /mnt/test <---- this fails

then see:

ls -l /dev/disk/by-label/bugfs

which will still point to /dev/loop0.

We did not notice this before because libblkid would send the udev
event for us when it noticed that the link did not match reality;
however, it does not do that anymore and relies completely on udev
information.

Fix this by sending the KOBJ_CHANGE event to the bdev kobject after
successful device removal.

Note that this does not affect device addition, because we will open the
device prior to the addition from userspace and udev will notice that and
reread the device afterwards.
Signed-off-by: Lukas Czerner <lczerner@redhat.com>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>

Btrfs: fix wrong return value of btrfs_truncate_page() · ac6a2b36

Authored by Miao Xie on Dec 05, 2012

The ret variable may be set to 0 if we read the page successfully, but the page
might be released before we lock it again. In this case, if we fail to allocate
a new page, we will return 0, which is wrong. Fix it.
Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>

Btrfs: punch hole past the end of the file · 7426cc04

Authored by Miao Xie on Dec 05, 2012

Since we can pre-allocate space past EOF, we should be able to reclaim
that space if needed. This patch implements that by removing the EOF check.

Although the manual of the fallocate command says we can use the truncate
command to reclaim pre-allocated space past EOF, truncate changes the file
size, so we must run several commands to reclaim the space if we don't want
to change the file size. That makes it a poor choice.
Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>

Btrfs: fix the page that is beyond EOF · 0061280d

Authored by Miao Xie on Dec 05, 2012

Steps to reproduce:
 # mkfs.btrfs <disk>
 # mount <disk> <mnt>
 # dd if=/dev/zero of=<mnt>/<file> bs=512 seek=5 count=8
 # fallocate -p -o 2048 -l 16384 <mnt>/<file>
 # dd if=/dev/zero of=<mnt>/<file> bs=4096 seek=3 count=8 conv=notrunc,nocreat
 # umount <mnt>
 # dmesg
 WARNING: at fs/btrfs/inode.c:7140 btrfs_destroy_inode+0x2eb/0x330

The reason is that we passed in a range which is beyond the end of the file.
Because the end of this range was not page-aligned, we had to truncate the last
page in this range; this operation is similar to a buffered file write. In other
words, we reserved enough space and cleared the data which was in the hole range
on that page. But when we expanded the test file and wrote data into the same
page, we forgot that we had already reserved enough space for the buffered write
of that page, because in most cases there is no page beyond the end of the file.
As a result, we reserved the space twice.

In fact, we needn't truncate the page if it is beyond the end of the file; we
just release the allocated space in that range. Fix the above problem this way.
Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>

Btrfs: fix off-by-one error of the same page check in btrfs_punch_hole() · 6347b3c4

Authored by Miao Xie on Dec 05, 2012

(start + len) is the start of the adjacent extent, not the end of the current
extent, so we should not use it to check whether the hole is on the same page.
Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
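
A minimal user-space sketch of the off-by-one described in 6347b3c4, assuming 4 KiB pages and `len > 0`. The names `SKETCH_PAGE_SHIFT`, `same_page_buggy`, and `same_page_fixed` are hypothetical and do not come from the patch:

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed page size for this sketch: 1 << 12 = 4096 bytes. */
#define SKETCH_PAGE_SHIFT 12

/* Buggy variant: offset + len is the first byte AFTER the hole. */
bool same_page_buggy(uint64_t offset, uint64_t len)
{
    return (offset >> SKETCH_PAGE_SHIFT) ==
           ((offset + len) >> SKETCH_PAGE_SHIFT);
}

/* Fixed variant: offset + len - 1 is the last byte INSIDE the hole. */
bool same_page_fixed(uint64_t offset, uint64_t len)
{
    return (offset >> SKETCH_PAGE_SHIFT) ==
           ((offset + len - 1) >> SKETCH_PAGE_SHIFT);
}
```

For a hole covering bytes 0 through 4095, the buggy variant compares page 0 with page 1 and wrongly concludes that the hole spans two pages; the fixed variant compares page 0 with page 0.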

Btrfs: fix missing reserved space release in error path of delalloc reservation · 4b5829a8

Authored by Miao Xie on Dec 05, 2012

We forgot to release the reserved space in the error path of delalloc
reservation. Fix it.
Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>

Btrfs: don't auto defrag a file when doing directIO · 543eabd5

Authored by Miao Xie on Dec 05, 2012

If we run direct IO, we should not run auto defrag, because it may
introduce buffered IO vs direct IO problems and slow direct IO down.
Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>

Btrfs: parse parent 0 into correct value in tracepoint · fb57dc81

Authored by Liu Bo on Nov 30, 2012

Value 0 is not a tree id, so besides an upper limit, a lower limit is
necessary as well when parsing root types in the tracepoint.
Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
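
The two-sided bound described in fb57dc81 can be sketched in user space as follows. The table, `SKETCH_MAX_TREE_ID`, and `sketch_root_type` are hypothetical; the actual tracepoint macro is not reproduced here:

```c
#include <stddef.h>
#include <stdint.h>

/* Illustrative table of tree names, indexed by (id - 1); valid ids start at 1. */
static const char *const sketch_tree_names[] = {
    "ROOT_TREE", "EXTENT_TREE", "CHUNK_TREE",
};

#define SKETCH_MAX_TREE_ID \
    (sizeof(sketch_tree_names) / sizeof(sketch_tree_names[0]))

/* Map an id to a name; anything outside [1, max] is printed as "-". */
const char *sketch_root_type(uint64_t id)
{
    if (id >= 1 && id <= SKETCH_MAX_TREE_ID)
        return sketch_tree_names[id - 1];
    return "-";
}
```

With only an upper bound, id 0 would pass the check and index the table at position -1; the lower bound turns it into the "-" placeholder instead.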

Btrfs: use ctl->unit for free space calculation instead of block_group->sectorsize · 96009762

Authored by Wang Sheng-Hui on Nov 30, 2012

We should use ctl->unit for free space calculation instead of
block_group->sectorsize, even though for free space use_bitmap or the free
space cluster we currently only have sectorsize assigned to ctl->unit. This
also keeps the code style consistent.
Signed-off-by: Wang Sheng-Hui <shhuiw@gmail.com>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>

Btrfs: refactor error handling to drop inode in btrfs_create() · 43baa579

Authored by Filipe Brandenburger on Nov 30, 2012

Refactor it by checking whether the inode has been created and needs to be
dropped (drop_inode_on_err) and also if the err variable is set. That way the
variable doesn't need to be set on each and every error handling block.
Signed-off-by: Filipe Brandenburger <filbranden@google.com>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>

Btrfs: fix permissions of empty files not affected by umask · 2794ed01

Authored by Filipe Brandenburger on Nov 30, 2012

When a new file is created with btrfs_create(), the inode will initially be
created with permissions 0666 and later on in btrfs_init_acl() it will be
adapted to mask out the umask bits. The problem is that this change won't make
it into the btrfs_inode unless there's another change to the inode (e.g. writing
content, which changes the size, or touching the file, which changes the mtime).

This fix adds a call to btrfs_update_inode() to btrfs_create() to make sure that
the change will not get lost if the in-memory inode is flushed before other
changes are made to the file.
Signed-off-by: Filipe Brandenburger <filbranden@google.com>
Reviewed-by: Liu Bo <bo.li.liu@oracle.com>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
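
A toy model of the lost update described in 2794ed01. It assumes the stored inode item is written once at creation and only refreshed by an explicit update; `struct sketch_inode` and `sketch_create` are hypothetical and only mimic the ordering of events, not the btrfs code:

```c
/* Two copies of the mode: the in-memory inode and the stored inode item. */
struct sketch_inode {
    unsigned int mem_mode;
    unsigned int disk_mode;
};

/*
 * Create a file with mode 0666, then mask out the umask bits in memory.
 * update_after_acl models the extra write-back added by the fix.
 */
void sketch_create(struct sketch_inode *ino, unsigned int umask_bits,
                   int update_after_acl)
{
    ino->mem_mode = 0666;
    ino->disk_mode = 0666;          /* item stored at creation time */

    ino->mem_mode &= ~umask_bits;   /* later adjustment, memory only */

    if (update_after_acl)
        ino->disk_mode = ino->mem_mode;
}
```

With a umask of 022 and no write-back, the in-memory mode becomes 0644 while the stored mode stays 0666, which is the state an empty file would be left in.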

Btrfs: add fiemap's flag check · 05dadc09

Authored by Tsutomu Itoh on Nov 29, 2012

When an unsupported flag is specified, it is necessary to return an error
to the caller.
So we add a validity check of the fiemap flags.
Signed-off-by: Tsutomu Itoh <t-itoh@jp.fujitsu.com>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
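
A sketch of the kind of validity check 05dadc09 describes: reject any request that carries a flag outside the supported set. The flag names, the supported mask, and the choice of `-EINVAL` are assumptions of this sketch, not taken from the patch:

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical flag bits for this sketch. */
#define SKETCH_FIEMAP_SYNC   0x1u
#define SKETCH_FIEMAP_XATTR  0x2u

/* Hypothetical set of flags the filesystem is willing to handle. */
#define SKETCH_SUPPORTED     (SKETCH_FIEMAP_SYNC)

/* Return 0 if every requested flag is supported, a negative errno otherwise. */
int sketch_check_fiemap_flags(uint32_t requested)
{
    if (requested & ~SKETCH_SUPPORTED)
        return -EINVAL;   /* errno choice is an assumption of this sketch */
    return 0;
}
```

Masking with the complement of the supported set catches unknown bits as well as known-but-unsupported ones, so the caller learns about the problem instead of having the flag silently ignored.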

Btrfs: don't add a NULL extended attribute · 01e6deb2

Authored by Liu Bo on Nov 28, 2012

Passing a null extended attribute value means removing the attribute,
but we don't have to add a new NULL extended attribute.
Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>

Btrfs: skip adding an acl attribute if we don't have to · 755ac67f

Authored by Liu Bo on Nov 28, 2012

If the acl can be exactly represented in the traditional file
mode permission bits, we don't set another acl attribute.
Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>

Btrfs: fix off-by-one error of the reserved size of btrfs_allocate() · 0ff6fabd

Authored by Miao Xie on Nov 28, 2012

alloc_end is not the real end of the current extent; it is the start of the
next adjoining extent. So we needn't add 1 when calculating the size of the
space that is about to be reserved.
Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
Reviewed-by: Liu Bo <bo.li.liu@oracle.com>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>

Btrfs: use existing align macros in btrfs_allocate() · 797f4277

Authored by Miao Xie on Nov 28, 2012

The kernel developers have implemented some often-used align macros; we should
use them instead of the complex code.
Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
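
A user-space sketch that combines the points of 0ff6fabd and 797f4277: align the range with small helpers, then take the size as a plain difference because the end is exclusive. The kernel's own alignment helpers are modeled here by the hypothetical functions `sketch_round_down` and `sketch_align_up`, which assume a power-of-two block size:

```c
#include <stdint.h>

/* Round x down to a multiple of a (a must be a power of two). */
uint64_t sketch_round_down(uint64_t x, uint64_t a)
{
    return x & ~(a - 1);
}

/* Round x up to a multiple of a (a must be a power of two). */
uint64_t sketch_align_up(uint64_t x, uint64_t a)
{
    return (x + a - 1) & ~(a - 1);
}

/*
 * Size to reserve for [offset, offset + len). alloc_end is the start of
 * the next block, i.e. an exclusive end, so no "+ 1" is needed.
 */
uint64_t sketch_reserve_size(uint64_t offset, uint64_t len, uint64_t blocksize)
{
    uint64_t alloc_start = sketch_round_down(offset, blocksize);
    uint64_t alloc_end = sketch_align_up(offset + len, blocksize);

    return alloc_end - alloc_start;
}
```

For example, with a 4096-byte block size, a request at offset 5000 of length 3000 aligns to the range [4096, 8192) and reserves exactly 4096 bytes; adding 1 would reserve one byte more than the aligned range contains.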

Btrfs: fix a scrub regression in case of write errors · af1be4f8

Authored by Stefan Behrens on Nov 27, 2012

This regression was introduced by the device-replace patches.
Scrub immediately stops checking those disks that have write errors.
This is nothing that happens in the real world, but it is wrong
since scrub is the tool to detect and repair defects. Fix it.
Signed-off-by: Stefan Behrens <sbehrens@giantdisaster.de>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>

Btrfs: fix a build warning for an unused label · f9c83748

Authored by Stefan Behrens on Nov 27, 2012

This issue was detected by the "0-DAY kernel build testing".

fs/btrfs/volumes.c: In function 'btrfs_rm_device':
fs/btrfs/volumes.c:1505:1: warning: label 'error_close' defined but not used [-Wunused-label]
Signed-off-by: Stefan Behrens <sbehrens@giantdisaster.de>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>

Btrfs: fix race in check-integrity caused by usage of bitfield · cb3806ec

Authored by Stefan Behrens on Nov 27, 2012

The structure member mirror_num is modified concurrently with the
structure member is_iodone. This doesn't require any locking by
design, unless everything is stored in the same 32 bits of a
bit field. This was the case and xfstest 284 was able to
trigger false warnings from the checker code. This patch
separates the bits and fixes the race.
Signed-off-by: Stefan Behrens <sbehrens@giantdisaster.de>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
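
A sketch of the layout issue behind cb3806ec. Adjacent bit fields share one storage unit, so writing one of them is a read-modify-write of the whole unit; a plain member is a separate memory location. The member names come from the commit message, while the struct names, field widths, and the helper are hypothetical:

```c
#include <stddef.h>

/*
 * Racy layout: both members are packed into the same storage unit.
 * An unlocked update of mirror_num can overwrite a concurrent update
 * of is_iodone, and the other way around.
 */
struct sketch_packed {
    unsigned int is_iodone:1;
    unsigned int mirror_num:2;
};

/*
 * Separated layout: mirror_num is a plain member, so it no longer
 * shares a memory location with the is_iodone bit.
 */
struct sketch_separate {
    unsigned int is_iodone:1;
    unsigned int mirror_num;
};

/* Byte offset of mirror_num in the separated layout (non-zero). */
size_t sketch_mirror_offset(void)
{
    return offsetof(struct sketch_separate, mirror_num);
}
```

The separated struct is usually larger, which is the price for letting the two members be updated independently without a lock.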

Btrfs: fix freeze vs auto defrag · b66f00da

Authored by Miao Xie on Nov 26, 2012

If we freeze the fs, auto defragment should not run. Fix it.
Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>

Btrfs: restructure btrfs_run_defrag_inodes() · 26176e7c

Authored by Miao Xie on Nov 26, 2012

This patch restructures btrfs_run_defrag_inodes() and makes the auto
defragment code more readable.
Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>

Btrfs: fix unprotected defragable inode insertion · 8ddc4734

Authored by Miao Xie on Nov 26, 2012

We forgot to take the defrag lock when we re-add the defragable inode.
Fix it.
Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>

Btrfs: use slabs for auto defrag allocation · 9247f317

Authored by Miao Xie on Nov 26, 2012

The auto defrag allocation is in the fast path of the IO, so use slabs
to improve the speed of the allocation.

Besides that, it can check for leaked objects when the module is removed.
Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>

Btrfs: get write access for qgroup operations · 905b0dda

Authored by Miao Xie on Nov 26, 2012

We need to get write access for qgroup operations, or we will modify the R/O fs.
Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>

Btrfs: get write access for scrub · b8e95489

Authored by Miao Xie on Nov 26, 2012

We need to get write access for scrub, or we will modify the R/O fs.
Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>

Btrfs: get write access when removing a device · da24927b

Authored by Miao Xie on Nov 26, 2012

Steps to reproduce:
 # mkfs.btrfs -d single -m single <disk0> <disk1>
 # mount -o ro <disk0> <mnt0>
 # mount -o ro <disk0> <mnt1>
 # mount -o remount,rw <mnt0>
 # umount <mnt0>
 # btrfs device delete <disk1> <mnt1>

We can remove a device from a R/O filesystem. The reason is that we only check
the R/O flag of the super block object. That is not enough, because the kernel
may set the R/O flag only for the mount point. We need to invoke

	mnt_want_write_file()

to do a full check.
Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
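
A toy model of why checking only the super block flag is insufficient, following the reproduction steps in da24927b. The structs and both functions are hypothetical; the real mnt_want_write_file() does more bookkeeping than this sketch models:

```c
#include <errno.h>
#include <stdbool.h>

/* One super block can be shared by several mount points. */
struct sketch_super {
    bool read_only;
};

struct sketch_mount {
    struct sketch_super *sb;
    bool read_only;     /* per-mount R/O flag */
};

/* Insufficient check: looks at the shared super block flag only. */
int sketch_check_sb_only(const struct sketch_mount *mnt)
{
    return mnt->sb->read_only ? -EROFS : 0;
}

/* Fuller check: refuse if either the mount or the super block is R/O. */
int sketch_want_write(const struct sketch_mount *mnt)
{
    if (mnt->read_only || mnt->sb->read_only)
        return -EROFS;
    return 0;
}
```

After the remount in the reproduction steps, the shared super block is writable while the second mount point is still R/O, so the super-block-only check lets the operation through and the fuller check rejects it.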

Btrfs: get write access when doing resize fs · 198605a8

Authored by Miao Xie on Nov 26, 2012

Steps to reproduce:
 # mkfs.btrfs <partition>
 # mount -o ro <partition> <mnt0>
 # mount -o ro <partition> <mnt1>
 # mount -o remount,rw <mnt0>
 # umount <mnt0>
 # btrfs fi resize 10g <mnt1>

We resized a R/O filesystem. The reason is that we only check the R/O flag
of the super block object. That is not enough, because the kernel may set the
R/O flag only for the mount point. We need to invoke mnt_want_write_file() to
do a full check.
Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>

Btrfs: get write access when setting the default subvolume · 3c04ce01

Authored by Miao Xie on Nov 26, 2012

When we want to set the default subvolume, we must get write access, or
we will change the R/O file system.
Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>

Btrfs: fix wrong return value of btrfs_wait_for_commit() · 8cd2807f

Authored by Miao Xie on Nov 26, 2012

If the id of the existing transaction is greater than the one we specified, it
means the specified transaction was committed, so we should return 0, not
EINVAL.
Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
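
A sketch of the return-value rule stated in 8cd2807f, under the assumption that in-flight transactions are kept in ascending id order. The function, its parameters, and the convention of returning 1 for "the caller would have to wait" are hypothetical:

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/*
 * live[] holds the ids of transactions still in flight, ascending.
 * Returns 0 if transid is already committed, 1 if the caller would
 * have to wait on a live transaction, -EINVAL if transid is unknown.
 */
int sketch_wait_for_commit(uint64_t transid, uint64_t last_committed,
                           const uint64_t *live, size_t n)
{
    size_t i;

    if (transid <= last_committed)
        return 0;

    for (i = 0; i < n; i++) {
        if (live[i] == transid)
            return 1;
        /* A newer transaction exists, so ours has been committed. */
        if (live[i] > transid)
            return 0;
    }
    return -EINVAL;
}
```

The middle case is the one the commit message is about: finding a transaction with a larger id is evidence that the requested one has finished, not that the request was invalid.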

Btrfs: don't start a new transaction when starting sync · ff7c1d33

Authored by Miao Xie on Nov 26, 2012

If there is no running transaction in the fs, we needn't start a new one when
we want to start sync.
Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>

Btrfs: pass root object into btrfs_ioctl_{start, wait}_sync() · 9a8c28be

Authored by Miao Xie on Nov 26, 2012

Since we have already gotten the root in the caller, just pass it into
btrfs_ioctl_{start, wait}_sync() directly.
Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>

Btrfs: fix an while-loop of listxattr · db2254bc

Authored by Liu Bo on Nov 26, 2012

If we find an invalid xattr dir item, we'd better try the next one instead.
Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>

Btrfs: do not warn_on io_ctl->cur in io_ctl_map_page · 07140125

Authored by Wang Sheng-Hui on Nov 23, 2012

io_ctl_map_page is called by many functions in free-space-cache.
In most scenarios, the ->cur is not null, e.g. io_ctl_add_entry.
I think we'd better remove the warn_on here.
Signed-off-by: Wang Sheng-Hui <shhuiw@gmail.com>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>

Btrfs: add support for device replace ioctls · 3f6bcfbd

Authored by Stefan Behrens on Nov 06, 2012

This is the commit that allows starting the device replace
procedure.

An ioctl() interface is added that supports starting and
canceling the device replace procedure, and retrieving
the status and progress.
Signed-off-by: Stefan Behrens <sbehrens@giantdisaster.de>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>

13 Dec, 2012 5 commits

Btrfs: allow repair code to include target disk when searching mirrors · ad6d620e

Authored by Stefan Behrens on Nov 06, 2012

Make the target disk of a running device replace operation
available for reading. This is only used as a last resort for
the defect repair procedure. And it is dependent on the location
of the data block to read, because during an ongoing device
replace operation, the target drive is only partially filled
with the filesystem data.
Signed-off-by: Stefan Behrens <sbehrens@giantdisaster.de>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>

Btrfs: increase BTRFS_MAX_MIRRORS by one for dev replace · 72d7aefc

Authored by Stefan Behrens on Nov 06, 2012

This change of the define is effective in all modes; it
is required and used only in the case when a device replace
procedure is running. The reason is that during an active
device replace procedure, the target device of the copy
operation is a mirror for the filesystem data as well that
can be used to read data in order to repair read errors on
other disks.
Signed-off-by: Stefan Behrens <sbehrens@giantdisaster.de>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>

Btrfs: optionally avoid reads from device replace source drive · 30d9861f

Authored by Stefan Behrens on Nov 06, 2012

It is desirable to be able to configure the device replace
procedure to avoid reading the source drive (the one to be
copied) whenever possible. This is useful when the number of
read errors on this disk is high, because it would delay the
copy procedure a lot. Therefore there is an option to avoid
reading from the source disk unless the repair procedure
really needs to access it. The regular read req asks for
mapping the block with mirror_num == 0; in this case the
source disk is avoided whenever possible. The repair code
selects the mirror_num explicitly (mirror_num != 0); this
case is not changed by this commit.
Signed-off-by: Stefan Behrens <sbehrens@giantdisaster.de>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>

Btrfs: changes to live filesystem are also written to replacement disk · 472262f3

Authored by Stefan Behrens on Nov 06, 2012

During a running dev replace operation, all write requests to
the live filesystem are duplicated to also write to the target
drive. Therefore btrfs_map_block() is changed to duplicate
stripes that are written to the source disk of a device replace
procedure to be written to the target disk as well.
Signed-off-by: Stefan Behrens <sbehrens@giantdisaster.de>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
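
The write duplication described in 472262f3 can be pictured with a small user-space sketch. Stripes are reduced to plain device ids here, and `sketch_dup_write_stripes` with all of its parameters is hypothetical; the real btrfs_map_block() works on much richer structures:

```c
#include <stddef.h>

/*
 * stripes[0..n) lists the device id each write stripe goes to.
 * For every stripe aimed at src_dev, append one more stripe aimed
 * at tgt_dev, as long as the array (capacity cap) has room.
 * Returns the new number of stripes.
 */
size_t sketch_dup_write_stripes(int *stripes, size_t n, size_t cap,
                                int src_dev, int tgt_dev)
{
    size_t i;
    size_t out = n;

    for (i = 0; i < n; i++) {
        if (stripes[i] == src_dev && out < cap)
            stripes[out++] = tgt_dev;
    }
    return out;
}
```

Only the original `n` entries are scanned, so a duplicated stripe is never duplicated again, and writes that do not touch the source device are left as they are.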

Btrfs: introduce GET_READ_MIRRORS functionality for btrfs_map_block() · 29a8d9a0

Authored by Stefan Behrens on Nov 06, 2012

Before this commit, btrfs_map_block() was called with REQ_WRITE
in order to retrieve the list of mirrors for a disk block.
This needs to be changed for the device replace procedure since
it makes a difference whether you are asking for read mirrors
or for locations to write to.
GET_READ_MIRRORS is introduced as a new interface to call
btrfs_map_block().
In the current commit, the functionality is not yet changed,
only the interface for GET_READ_MIRRORS is introduced and all
the places that should use this new interface are adapted.

The reason that REQ_WRITE cannot be abused anymore to retrieve
a list of read mirrors is that during a running dev replace
operation all write requests to the live filesystem are
duplicated to also write to the target drive.
Keep in mind that the target disk is only partially a valid
copy of the source disk while the operation is ongoing. All
writes go to the target disk, but not all reads would return
valid data on the target disk. Therefore it is not possible
anymore to abuse a REQ_WRITE interface to find valid mirrors
for a REQ_READ.
Signed-off-by: Stefan Behrens <sbehrens@giantdisaster.de>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>