Commits · 4a54c8c165b66300830a67349fc7595e3fc442f7 · openeuler / Kernel

29 Sep, 2011 · 12 commits

btrfs: Moved repair code from inode.c to extent_io.c · 4a54c8c1

Committed by Jan Schmidt on Jul 22, 2011

The raid-retry code in inode.c can be generalized so that it works for
metadata as well. Thus, this patch moves it to extent_io.c and makes the
raid-retry code a raid-repair code.

Repair works this way: whenever a read error occurs and we have more
mirrors to try, note the failed mirror and retry another. If we find a
good one, check whether we noted a failure earlier and, if so, do not allow
the read to complete until the bad sector has been rewritten with the good
data we just fetched. Because the extent is locked while reading, no one
can change the data in between.
Signed-off-by: Jan Schmidt <list.btrfs@jan-o-sch.net>
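
The retry-and-repair flow described in this message can be sketched in userland C. This is a minimal illustration, not the code from extent_io.c: `struct mirror`, `read_with_repair`, the toy sector size and the in-memory mirrors are all assumptions made for the example, and a `readable` flag stands in for a real read error.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define SECTOR_SIZE 8   /* toy size, chosen only for the illustration */
#define NUM_MIRRORS 3

/* Hypothetical in-memory stand-in for one mirror's copy of a sector. */
struct mirror {
    bool readable;              /* false simulates a read error */
    char data[SECTOR_SIZE];
};

/*
 * Read one sector, trying each mirror in turn. If an earlier mirror failed
 * and a later one succeeds, rewrite the failed copies with the good data
 * before returning. Returns the index of the good mirror, or -1 if every
 * mirror failed.
 */
static int read_with_repair(struct mirror mirrors[NUM_MIRRORS], char *out)
{
    int failed[NUM_MIRRORS];
    int nfailed = 0;

    assert(mirrors != NULL && out != NULL);

    for (int i = 0; i < NUM_MIRRORS; i++) {
        if (!mirrors[i].readable) {
            failed[nfailed++] = i;      /* note the failure, try the next mirror */
            continue;
        }
        memcpy(out, mirrors[i].data, SECTOR_SIZE);
        /* Repair every noted failure before the read is allowed to complete. */
        for (int j = 0; j < nfailed; j++) {
            memcpy(mirrors[failed[j]].data, out, SECTOR_SIZE);
            mirrors[failed[j]].readable = true;
        }
        return i;
    }
    return -1;
}
```

In the sketch the caller owns the mirrors for the whole call, which plays the role of the extent lock mentioned above: nothing can change the data between the good read and the rewrite.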

btrfs: Put mirror_num in bi_bdev · 2774b2ca

Committed by Jan Schmidt on Jun 16, 2011

The error correction code wants to make sure that only the bad mirror is
rewritten. Thus, we need to know which mirror is the bad one. I did not
find a more appropriate field than bi_bdev. But I think using this is fine,
because it is modified by the block layer, anyway, and should not be read
after the bio returned.
Signed-off-by: Jan Schmidt <list.btrfs@jan-o-sch.net>

btrfs: Do not use bio->bi_bdev after submission · 1503140d

Committed by Jan Schmidt on Jun 16, 2011

The block layer modifies bio->bi_bdev and bio->bi_sector while working on
the bio; they do _not_ come back unmodified in the completion callback.

To call add_page, we need at least some bi_bdev set, which is why the code
was working, previously. With this patch, we use the latest_bdev from
fsinfo instead of the leftover in the bio. This gives us the possibility to
use the bi_bdev field for another purpose.
Signed-off-by: Jan Schmidt <list.btrfs@jan-o-sch.net>

btrfs: btrfs_multi_bio replaced with btrfs_bio · a1d3c478

Committed by Jan Schmidt on Aug 04, 2011

btrfs_bio is a bio abstraction able to split and not complete after the last
bio has returned (like the old btrfs_multi_bio). Additionally, btrfs_bio
tracks the mirror_num used to read data which can be used for error
correction purposes.
Signed-off-by: Jan Schmidt <list.btrfs@jan-o-sch.net>

btrfs: new ioctls to do logical->inode and inode->path resolving · d7728c96

Committed by Jan Schmidt on Jul 07, 2011

These ioctls make use of the new functions initially added for scrub. They
return all inodes belonging to a logical address (BTRFS_IOC_LOGICAL_INO) and
all paths belonging to an inode (BTRFS_IOC_INO_PATHS).
Signed-off-by: Jan Schmidt <list.btrfs@jan-o-sch.net>

btrfs scrub: add fixup code for errors on nodatasum files · 0ef8e451

Committed by Jan Schmidt on Jun 13, 2011

This removes a FIXME comment and introduces the first part of nodatasum
fixup: It gets the corresponding inode for a logical address and triggers a
regular readpage for the corrupted sector.

Once we have on-the-fly error correction our error will be automatically
corrected. The correction code is expected to clear the newly introduced
EXTENT_DAMAGED flag, making scrub report that error as "corrected" instead
of "uncorrectable" eventually.
Signed-off-by: Jan Schmidt <list.btrfs@jan-o-sch.net>

btrfs scrub: use int for mirror_num, not u64 · e12fa9cd

Committed by Jan Schmidt on Jun 17, 2011

The rest of the code uses int mirror_num, and so should scrub.
Signed-off-by: Jan Schmidt <list.btrfs@jan-o-sch.net>

btrfs: add mirror_num to extent_read_full_page · 8ddc7d9c

Committed by Jan Schmidt on Jun 13, 2011

Currently, extent_read_full_page always assumes we are trying to read mirror
0, which generally is the best we can do. To add flexibility, pass it as a
parameter. This will be needed by scrub fixup code.
Signed-off-by: Jan Schmidt <list.btrfs@jan-o-sch.net>

btrfs scrub: bugfix: mirror_num off by one · 193ea74b

Committed by Jan Schmidt on Jun 13, 2011

Fix the mirror_num determination in scrub_stripe. The rest of the scrub code
did not use mirror_num for anything important and that error went unnoticed.
The nodatasum fixup patch of this set depends on a correct mirror_num.
Signed-off-by: Jan Schmidt <list.btrfs@jan-o-sch.net>

btrfs scrub: print paths of corrupted files · 558540c1

Committed by Jan Schmidt on Jun 13, 2011

While scrubbing, we may encounter various errors. Previously, a logical
address was printed to the log only. Now, all paths belonging to that
address are resolved and printed separately. That should work for hardlinks
as well as reflinks.
Signed-off-by: Jan Schmidt <list.btrfs@jan-o-sch.net>

btrfs scrub: added unverified_errors · 13db62b7

Committed by Jan Schmidt on Jun 13, 2011

In normal operation, scrub is reading data sequentially in large portions.
In case of an i/o error, we try to find the corrupted area(s) by issuing
page sized read requests. With this commit we increment the
unverified_errors counter if all of the small size requests succeed.

Userland patches carrying such conspicuous events to the administrator should
already be around.
Signed-off-by: Jan Schmidt <list.btrfs@jan-o-sch.net>
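
The accounting rule in this message (re-read a failed range page by page, and count the failure as unverified when every small read succeeds) can be sketched as below. The names `scrub_stats_sketch`, `recheck_failed_range` and the two sample callbacks are hypothetical and exist only for this illustration; the real scrub code issues bios rather than calling a callback.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical counters, loosely modelled on what the message describes. */
struct scrub_stats_sketch {
    unsigned long read_errors;
    unsigned long unverified_errors;
};

/* Hypothetical callback: returns true if the page at index idx reads cleanly. */
typedef bool (*read_page_fn)(size_t idx, void *ctx);

/*
 * Called after a large sequential read has failed. Re-reads the range one
 * page at a time. Each failing page counts as a read error; if every page
 * succeeds, the original failure could not be located and is counted as an
 * unverified error instead.
 */
static void recheck_failed_range(size_t first_page, size_t npages,
                                 read_page_fn read_page, void *ctx,
                                 struct scrub_stats_sketch *stats)
{
    size_t bad = 0;

    assert(read_page != NULL && stats != NULL);

    for (size_t i = 0; i < npages; i++) {
        if (!read_page(first_page + i, ctx)) {
            stats->read_errors++;
            bad++;
        }
    }
    if (bad == 0)
        stats->unverified_errors++;
}

/* Sample callback: every page reads fine (a transient error went away). */
static bool page_always_ok(size_t idx, void *ctx)
{
    (void)idx;
    (void)ctx;
    return true;
}

/* Sample callback: exactly one page, whose index is passed via ctx, fails. */
static bool page_fails_at(size_t idx, void *ctx)
{
    return idx != *(const size_t *)ctx;
}
```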

btrfs: added helper functions to iterate backrefs · a542ad1b

Committed by Jan Schmidt on Jun 13, 2011

These helper functions iterate back references and call a function for each
backref. There is also a function to resolve an inode to a path in the
file system.
Signed-off-by: Jan Schmidt <list.btrfs@jan-o-sch.net>

21 Sep, 2011 · 1 commit

Btrfs: reserve sufficient space for ioctl clone · b6f3409b

Committed by Sage Weil on Sep 20, 2011

Fix a crash/BUG_ON in the clone ioctl due to insufficient reservation. We
need to reserve space for:

 - adjusting the old extent (possibly splitting it)
 - adding the new extent
 - updating the inode
Signed-off-by: Sage Weil <sage@newdream.net>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

18 Sep, 2011 · 6 commits

Btrfs: only clear the need lookup flag after the dentry is setup · a66e7cc6

Committed by Josef Bacik on Sep 18, 2011

We can race with readdir and the RCU path walking stuff. This is because we
clear the need lookup flag before actually instantiating the inode. This will
lead the RCU path walk stuff to find a dentry it thinks is valid without a
d_inode attached. So instead unhash the dentry when we first start the lookup,
and then clear the flag after we've instantiated the dentry so we're guaranteed
to either try the slow lookup, or have the d_inode set properly.
Signed-off-by: Josef Bacik <josef@redhat.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

BTRFS: Fix lseek return value for error · 48802c8a

Committed by Jeff Liu on Sep 18, 2011

The recent reworking of btrfs' lseek led to incorrect
values being returned.  This adds checks for seeking
beyond EOF in SEEK_HOLE and makes sure the error
values come back correct.

Andi Kleen also sent in similar patches.
Signed-off-by: Jie Liu <jeff.liu@oracle.com>
Reported-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

Btrfs: don't change inode flag of the dest clone file · dde820fb

Committed by Li Zefan on Sep 18, 2011

The dst file will have the same inode flags as the src file after
file clone, and I think it's unexpected.

For example, the dst file will suddenly become immutable after
getting some share of data with src file, if the src is immutable.
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

Btrfs: don't make a file partly checksummed through file clone · 0e7b824c

Committed by Li Zefan on Sep 18, 2011

To reproduce the bug:

  # mount /dev/sda7 /mnt
  # dd if=/dev/zero of=/mnt/src bs=4K count=1
  # umount /mnt

  # mount -o nodatasum /dev/sda7 /mnt
  # dd if=/dev/zero of=/mnt/dst bs=4K count=1
  # clone_range -s 4K -l 4K /mnt/src /mnt/dst

  # echo 3 > /proc/sys/vm/drop_caches
  # cat /mnt/dst
  # dmesg
  ...
  btrfs no csum found for inode 258 start 0
  btrfs csum failed ino 258 off 0 csum 2566472073 private 0

It's because part of the file is checksummed and the other part is not,
and then btrfs will complain checksum is not found when we read the file.

Disallow file clone if src and dst file have different checksum flag,
so we ensure a file is completely checksummed or unchecksummed.
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>
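
The rule this patch introduces (refuse the clone when src and dst disagree on checksumming) comes down to comparing a single flag bit. A minimal sketch follows, assuming a hypothetical `struct sketch_inode` and flag value; the error code returned is also an assumption, since the message does not name one.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag bit meaning "this file has no data checksums". */
#define SKETCH_INODE_NODATASUM (1u << 0)

/* Hypothetical, stripped-down inode carrying only the flags word. */
struct sketch_inode {
    uint32_t flags;
};

/*
 * Returns 0 if cloning between src and dst keeps every file either fully
 * checksummed or fully unchecksummed, and -EINVAL (an assumed error code)
 * if the two inodes differ in their checksum flag.
 */
static int sketch_check_clone_csum(const struct sketch_inode *src,
                                   const struct sketch_inode *dst)
{
    assert(src != NULL && dst != NULL);

    if ((src->flags ^ dst->flags) & SKETCH_INODE_NODATASUM)
        return -EINVAL;
    return 0;
}
```

In the reproducer above, src was written with checksums and dst under nodatasum, so a check of this kind would reject the clone_range call before any extent is shared.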

Btrfs: fix pages truncation in btrfs_ioctl_clone() · 71ef0786

Committed by Li Zefan on Sep 18, 2011

It's a bug in commit f81c9cdc
(Btrfs: truncate pages from clone ioctl target range)

We should pass the dest range to the truncate function, but not the
src range.

Also move the function before locking extent state.
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

btrfs: fix d_off in the first dirent · 3765fefa

Committed by Hidetoshi Seto on Sep 18, 2011

Since the d_off in the first dirent for "." (that originates from
the 4th argument "offset" of filldir() for the 2nd dirent for "..")
is wrongly assigned in btrfs_real_readdir(), telldir returns same
offset for different locations.

 | # mkfs.btrfs /dev/sdb1
 | # mount /dev/sdb1 fs0
 | # cd fs0
 | # touch file0 file1
 | # ../test
 | telldir: 0
 | readdir: d_off = 2, d_name = "."
 | telldir: 2
 | readdir: d_off = 2, d_name = ".."
 | telldir: 2
 | readdir: d_off = 3, d_name = "file0"
 | telldir: 3
 | readdir: d_off = 2147483647, d_name = "file1"
 | telldir: 2147483647

To fix this problem, pass filp->f_pos (which is loff_t) instead.

 | # ../test
 | telldir: 0
 | readdir: d_off = 1, d_name = "."
 | telldir: 1
 | readdir: d_off = 2, d_name = ".."
 | telldir: 2
 | readdir: d_off = 3, d_name = "file0"
 :

At the moment the "offset" for "." is unused because there is no
preceding dirent, however it is better to pass filp->f_pos to follow
grammatical usage.
Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>
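
The `../test` program used above is not included in the commit message. A small program along the following lines would produce output in the same shape; it is a guess at what such a test might look like, and the function name `dump_dir_offsets` is made up for this sketch. It relies on the `d_off` member of `struct dirent`, which glibc on Linux provides but POSIX does not require.

```c
#define _DEFAULT_SOURCE         /* for telldir() and d_off on glibc */
#include <assert.h>
#include <dirent.h>
#include <stdio.h>

/*
 * Walk the directory at `path`, printing the stream position reported by
 * telldir() before and after each readdir() call, together with the d_off
 * and d_name of every entry. Returns the number of entries seen, or -1 if
 * the directory cannot be opened.
 */
static int dump_dir_offsets(const char *path, FILE *out)
{
    DIR *dir;
    struct dirent *ent;
    int count = 0;

    assert(path != NULL && out != NULL);

    dir = opendir(path);
    if (dir == NULL)
        return -1;

    fprintf(out, "telldir: %ld\n", telldir(dir));
    while ((ent = readdir(dir)) != NULL) {
        fprintf(out, "readdir: d_off = %lld, d_name = \"%s\"\n",
                (long long)ent->d_off, ent->d_name);
        fprintf(out, "telldir: %ld\n", telldir(dir));
        count++;
    }
    closedir(dir);
    return count;
}
```

Calling `dump_dir_offsets(".", stdout)` from inside the mounted filesystem makes it easy to see whether two different entries report the same offset, which is the symptom shown in the first listing.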

11 Sep, 2011 · 11 commits

Btrfs: add dummy extent if dst offset exceeds file end in · d525e8ab

Committed by Li Zefan on Sep 11, 2011

You can see there's no file extent with range [0, 4096]. Check this by
btrfsck:

 # btrfsck /dev/sda7
 root 5 inode 258 errors 100
 ...
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

Btrfs: calc file extent num_bytes correctly in file clone · d72c0842

Committed by Li Zefan on Sep 11, 2011

num_bytes should be 4096 not 12288.
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

btrfs: xattr: fix attribute removal · 4815053a

Committed by David Sterba on Sep 11, 2011

An attribute is not removed by 'setfattr -x attr file' and remains
visible in attr list. This makes xfstests/062 pass again.
Signed-off-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

Btrfs: fix wrong nbytes information of the inode · a39f7521

Committed by Miao Xie on Sep 11, 2011

If we write some data into a data hole of the file (with no preallocation for this
hole), Btrfs will allocate some disk space and update nbytes of the inode, but
the other element, disk_i_size, need not be updated. In this case, we must
update the inode metadata even though disk_i_size is not changed (btrfs_ordered_update_i_size()
returns 1).

 # mkfs.btrfs /dev/sdb1
 # mount /dev/sdb1 /mnt
 # touch /mnt/a
 # truncate -s 856002 /mnt/a
 # dd if=/dev/zero of=/mnt/a bs=4K count=1 conv=nocreat,notrunc
 # umount /mnt
 # btrfsck /dev/sdb1
 root 5 inode 257 errors 400
 found 32768 bytes used err is 1
Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

Btrfs: fix the file extent gap when doing direct IO · 0c1a98c8

Committed by Miao Xie on Sep 11, 2011

When we write some data to a place that is beyond the end of the file
in direct I/O mode, a data hole will be created, and Btrfs should insert
a file extent item that points to this hole into the fs tree. Unfortunately,
Btrfs forgets to do so.

The following is a simple way to reproduce it:
 # mkfs.btrfs /dev/sdc2
 # mount /dev/sdc2 /test4
 # touch /test4/a
 # dd if=/dev/zero of=/test4/a seek=8 count=1 bs=4K oflag=direct conv=nocreat,notrunc
 # umount /test4
 # btrfsck /dev/sdc2
 root 5 inode 257 errors 100
Reported-by: Tsutomu Itoh <t-itoh@jp.fujitsu.com>
Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
Tested-by: Tsutomu Itoh <t-itoh@jp.fujitsu.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

Btrfs: fix unclosed transaction handle in btrfs_cont_expand · 5b397377

Committed by Miao Xie on Sep 11, 2011

The function btrfs_cont_expand() forgot to close the transaction handle before
jumping out of the while loop. Fix it.
Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

Btrfs: fix misuse of trans block rsv · 98c9942a

Committed by Liu Bo on Sep 11, 2011

At the beginning of create_pending_snapshot, trans->block_rsv is set
to pending->block_rsv and is used for snapshot things, however, when
it is done, we do not restore it.
Signed-off-by: Liu Bo <liubo2009@cn.fujitsu.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

Btrfs: reset to appropriate block rsv after orphan operations · 65450aa6

Committed by Liu Bo on Sep 11, 2011

While truncating free space cache, we forget to change trans->block_rsv
back to the original one, but leave it with the orphan_block_rsv, and
then, with the inode_cache option enabled, it leads to countless warnings from
btrfs_alloc_free_block and btrfs_orphan_commit_root:

WARNING: at fs/btrfs/extent-tree.c:5711 btrfs_alloc_free_block+0x180/0x350 [btrfs]()
...
WARNING: at fs/btrfs/inode.c:2193 btrfs_orphan_commit_root+0xb0/0xc0 [btrfs]()
Signed-off-by: Liu Bo <liubo2009@cn.fujitsu.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

Btrfs: skip locking if searching the commit root in csum lookup · ddf23b3f

Committed by Josef Bacik on Sep 11, 2011

It's not enough to just search the commit root, since we could be cow'ing the
very block we need to search through, which would mean that it's locked and we'll
still deadlock. So use path->skip_locking as well. Thanks,
Signed-off-by: Josef Bacik <josef@redhat.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

btrfs: fix warning in iput for bad-inode · e0b6d65b

Committed by Sergei Trofimovich on Sep 11, 2011

iput() shouldn't be called for inodes in I_NEW state.
We need to mark inode as constructed first.

WARNING: at fs/inode.c:1309 iput+0x20b/0x210()
Call Trace:
 [<ffffffff8103e7ba>] warn_slowpath_common+0x7a/0xb0
 [<ffffffff8103e805>] warn_slowpath_null+0x15/0x20
 [<ffffffff810eaf0b>] iput+0x20b/0x210
 [<ffffffff811b96fb>] btrfs_iget+0x1eb/0x4a0
 [<ffffffff811c3ad6>] btrfs_run_defrag_inodes+0x136/0x210
 [<ffffffff811ad55f>] cleaner_kthread+0x17f/0x1a0
 [<ffffffff81035b7d>] ? sub_preempt_count+0x9d/0xd0
 [<ffffffff811ad3e0>] ? transaction_kthread+0x280/0x280
 [<ffffffff8105af86>] kthread+0x96/0xa0
 [<ffffffff814336d4>] kernel_thread_helper+0x4/0x10
 [<ffffffff8105aef0>] ? kthread_worker_fn+0x190/0x190
 [<ffffffff814336d0>] ? gs_change+0xb/0xb
Signed-off-by: Sergei Trofimovich <slyfox@gentoo.org>
CC: Konstantin Khlebnikov <khlebnikov@openvz.org>
Tested-by: David Sterba <dsterba@suse.cz>
CC: Josef Bacik <josef@redhat.com>
CC: Chris Mason <chris.mason@oracle.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

Btrfs: fix an oops when deleting snapshots · 14c7cca7

Committed by Liu Bo on Sep 11, 2011

We can reproduce this oops via the following steps:

$ mkfs.btrfs /dev/sdb7
$ mount /dev/sdb7 /mnt/btrfs
$ for ((i=0; i<3; i++)); do btrfs sub snap /mnt/btrfs /mnt/btrfs/s_$i; done
$ rm -fr /mnt/btrfs/*
$ rm -fr /mnt/btrfs/*

then we'll get
------------[ cut here ]------------
kernel BUG at fs/btrfs/inode.c:2264!
[...]
Call Trace:
 [<ffffffffa05578c7>] btrfs_rmdir+0xf7/0x1b0 [btrfs]
 [<ffffffff81150b95>] vfs_rmdir+0xa5/0xf0
 [<ffffffff81153cc3>] do_rmdir+0x123/0x140
 [<ffffffff81145ac7>] ? fput+0x197/0x260
 [<ffffffff810aecff>] ? audit_syscall_entry+0x1bf/0x1f0
 [<ffffffff81153d0d>] sys_unlinkat+0x2d/0x40
 [<ffffffff8147896b>] system_call_fastpath+0x16/0x1b
RIP  [<ffffffffa054f7b9>] btrfs_orphan_add+0x179/0x1a0 [btrfs]

When it comes to btrfs_lookup_dentry, we may set a snapshot's inode->i_ino
to BTRFS_EMPTY_SUBVOL_DIR_OBJECTID instead of BTRFS_FIRST_FREE_OBJECTID,
while the snapshot's location.objectid remains unchanged.

However, btrfs_ino() does not take this into account, and returns a wrong ino,
and causes the oops.
Signed-off-by: Liu Bo <liubo2009@cn.fujitsu.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

21 Aug, 2011 · 1 commit

Btrfs: fix 64 bit divide problem · 6719db6a

Committed by Josef Bacik on Aug 20, 2011

This fixes a regression introduced by commit cdcb725c ("Btrfs: check
if there is enough space for balancing smarter").  We can't do 64-bit
divides on 32-bit architectures.

In cases where we need to divide/multiply by 2 we should just right/left
shift respectively, and in cases where there are N devices use
do_div.  Also make the counters u64 to match up with rw_devices.
Thanks,
Signed-off-by: Josef Bacik <josef@redhat.com>
Acked-and-tested-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

18 Aug, 2011 · 3 commits

Btrfs: set i_size properly when fallocating and we already · f1e490a7

Committed by Josef Bacik on Aug 18, 2011

xfstests exposed a problem with preallocate when it fallocates a range that
already has an extent. We don't set the new i_size properly because we see that
we already have an extent. This isn't right and we should update i_size if the
space already exists. With this patch we now pass xfstests 075. Thanks,
Signed-off-by: Josef Bacik <josef@redhat.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

btrfs: unlock on error in btrfs_file_llseek() · 9a4327ca

Committed by Dan Carpenter on Aug 18, 2011

There were some unlocks on error missing in a recent patch to
btrfs_file_llseek().
Signed-off-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

btrfs: btrfs_permission's RO check shouldn't apply to device nodes · cb6db4e5

Committed by Jeff Mahoney on Aug 15, 2011

This patch tightens the read-only access checks in btrfs_permission to
match the constraints in inode_permission. Currently, even though the
device node itself will be unmodified, read-write access to device nodes
is denied when the device node resides on a read-only subvolume or
is a file that has been marked read-only by the btrfs conversion utility.

With this patch applied, the check only affects regular files,
directories, and symlinks. It also restructures the code a bit so that
we don't duplicate the MAY_WRITE check for both tests.
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

17 Aug, 2011 · 6 commits

Btrfs: truncate pages from clone ioctl target range · f81c9cdc

Committed by Sage Weil on Aug 10, 2011

We need to truncate page cache pages for the clone ioctl target range or
else we'll confuse ourselves to no end.  If the old data was cached, we
used to still see it (until remount).  If the page was partially updated
we used to get a mix of old and new data.
Signed-off-by: Sage Weil <sage@newdream.net>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

Btrfs: fix uninitialized sync_pending · 0e588859

Committed by Miao Xie on Aug 05, 2011

sync_pending is uninitialized before it is used; fix it.
Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

Btrfs: fix wrong free space information · bb3ac5a4

Committed by Miao Xie on Aug 05, 2011

Btrfs subtracted the size of the allocated space twice when it allocated
space from the bitmap in the cluster. This broke the free space information
and eventually led to an oops.

This patch also fixes the bug that ctl->free_space was subtracted
without holding the lock.
Reported-by: Liu Bo <liubo2009@cn.fujitsu.com>
Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

btrfs: memory leak in btrfs_add_inode_defrag() · f4ac904c

Committed by Dan Carpenter on Aug 05, 2011

We don't use the defrag struct on this path.
Signed-off-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

Btrfs: use plain page_address() in header fields setget functions · c97c2916

Committed by Li Zefan on Aug 03, 2011

We've stopped using highmem for extent buffers.
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

Btrfs: forced readonly when btrfs_drop_snapshot() fails · cb1b69f4

Committed by Tsutomu Itoh on Aug 09, 2011

The filesystem turns read-only instead of returning the error to the
caller when an error is detected in btrfs_drop_snapshot().
Also, because the caller doesn't check the error, the function's return type is
changed to 'void'.
Signed-off-by: Tsutomu Itoh <t-itoh@jp.fujitsu.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

openeuler / Kernel · synced successfully more than 1 year ago
