Commits · c018daecead7a46a575e2a1397fea850b83396c8 · openeuler / Kernel

20 Feb, 2013 1 commit

Btrfs: use wrapper page_offset · 4eee4fa4

Authored by Miao Xie on Dec 21, 2012

Use the page_offset() wrapper to get the byte offset of a page within its
filesystem object.
Signed-off-by: Liu Bo <liubo2009@cn.fujitsu.com>
Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
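
For illustration, the wrapper can be modeled in user space as below. This is a sketch, not the kernel source: `struct page` here is a one-field stand-in, and `SKETCH_PAGE_SHIFT` is an assumed 4 KiB page size. The point it shows is that the page index is widened to a 64-bit offset type before shifting, which open-coded versions can get wrong on 32-bit builds.

```c
#include <stdint.h>

/* Assumption for this sketch: 4 KiB pages. */
#define SKETCH_PAGE_SHIFT 12

/* Stand-in for the kernel's struct page; only the field we need. */
struct page {
    unsigned long index;   /* page number within the file or object */
};

/* Byte offset of the page within its filesystem object. */
static inline int64_t page_offset(const struct page *page)
{
    /* Cast first, then shift, so the result cannot overflow a 32-bit long. */
    return (int64_t)page->index << SKETCH_PAGE_SHIFT;
}
```

Callers then write `page_offset(page)` instead of repeating the cast and shift at every site.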

13 Dec, 2012 4 commits

Btrfs: handle errors from btrfs_map_bio() everywhere · 61891923

Authored by Stefan Behrens on Nov 05, 2012

With the addition of the device replace procedure, it is possible
for btrfs_map_bio(READ) to report an error. This happens when the
requested mirror is located on the target disk and the copy
operation has not yet copied this block. The block cannot be read,
and this error state is indicated by returning EIO.

Some background: a new mirror is added while the device replace
procedure is running. btrfs_get_num_copies() returns one more, and
btrfs_map_bio(GET_READ_MIRROR) adds one more mirror if a disk
location is involved that was already handled by the device
replace copy operation. The assigned mirror num is the highest
mirror number, e.g. the value 3 in case of RAID1.

If btrfs_map_bio() is invoked with mirror_num == 0 (i.e., select
any mirror), the copy on the target drive is never selected,
because that disk should be able to perform the write requests as
quickly as possible. Parallel read requests would only slow down
the disk copy procedure.

The second case is that btrfs_map_bio() is called with
mirror_num > 0. This is done from the repair code only. In this
case, the highest mirror num is assigned to the target disk, since
it is used last. When this mirror is not available because the
copy procedure has not yet handled this area, an error is
returned. Handling of such errors is now added everywhere in the
code.
Signed-off-by: Stefan Behrens <sbehrens@giantdisaster.de>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
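
The two cases described above can be sketched as a small user-space decision function. Everything here is hypothetical and simplified: `pick_read_mirror()`, its parameters, and the choice of mirror 1 for "any mirror" are illustration only and do not reproduce the kernel's selection logic. It only shows why a caller asking for the highest mirror can now see an error.

```c
#include <errno.h>
#include <stdbool.h>

/*
 * Hypothetical model: regular mirrors are numbered 1..base_mirrors and
 * the replace target is mirror base_mirrors + 1 (3 for RAID1).
 * Returns the mirror to read from, or a negative errno.
 */
static int pick_read_mirror(int mirror_num, int base_mirrors,
                            bool target_has_block)
{
    int target_mirror = base_mirrors + 1;

    if (mirror_num < 0 || mirror_num > target_mirror)
        return -EINVAL;

    /* "Any mirror": never choose the target, it is busy with writes. */
    if (mirror_num == 0)
        return 1;

    /* Repair path asked for the target, but the block is not copied yet. */
    if (mirror_num == target_mirror && !target_has_block)
        return -EIO;

    return mirror_num;
}
```

A caller in this model has to check for a negative return before submitting the read, which is the kind of handling the commit adds.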

Btrfs: pass fs_info to btrfs_map_block() instead of mapping_tree · 3ec706c8

Authored by Stefan Behrens on Nov 05, 2012

This is required for the device replace procedure in a later step.
Two calling functions also had to be changed to have the fs_info
pointer: repair_io_failure() and scrub_setup_recheck_block().
Signed-off-by: Stefan Behrens <sbehrens@giantdisaster.de>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>

Btrfs: Pass fs_info to btrfs_num_copies() instead of mapping_tree · 5d964051

Authored by Stefan Behrens on Nov 05, 2012

This is required for the device replace procedure in a later step.
Signed-off-by: Stefan Behrens <sbehrens@giantdisaster.de>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>

fs/btrfs: use WARN · 31b1a2bd

Authored by Julia Lawall on Nov 03, 2012

Use WARN rather than printk followed by WARN_ON(1), for conciseness.

A simplified version of the semantic patch that makes this transformation
is as follows: (http://coccinelle.lip6.fr/)

// <smpl>
@@
expression list es;
@@

-printk(
+WARN(1,
  es);
-WARN_ON(1);
// </smpl>
Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr>
Reviewed-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>

26 Oct, 2012 1 commit

Btrfs: Fix wrong error handling code · 84167d19

Authored by Stefan Behrens on Oct 11, 2012

gcc says "warning: comparison of unsigned expression >= 0 is always
true" because i is an unsigned long. And gcc is right this time.
Signed-off-by: Stefan Behrens <sbehrens@giantdisaster.de>
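
The class of bug behind this warning is easy to reproduce in user space. The sketch below is hypothetical (the function and the array are not taken from the patch); it shows one way to walk an array backwards with an unsigned index without relying on `i >= 0`.

```c
#include <stddef.h>

/*
 * Clear slots [0, count) in reverse order and return how many were cleared.
 *
 * The tempting form
 *     for (i = count - 1; i >= 0; i--)
 * never terminates for an unsigned i: after i reaches 0 the decrement
 * wraps to ULONG_MAX, which is still >= 0.
 */
static unsigned long release_backwards(int *slots, unsigned long count)
{
    unsigned long released = 0;
    unsigned long i = count;

    /* Test before decrementing, so index 0 is visited and the loop ends. */
    while (i > 0) {
        i--;
        slots[i] = 0;
        released++;
    }
    return released;
}
```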

09 Oct, 2012 7 commits

Btrfs: fix page leakage · f60b1b49

Authored by Josef Bacik on Oct 05, 2012

alloc_dummy_extent_buffer() will not free the first page in the eb array if we
fail to allocate a page; fix this.  Thanks,
Reported-by: David Sterba <dave@jikos.cz>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>

Btrfs: do not warn_on when we cannot alloc a page for an extent buffer · 4804b382

Authored by Josef Bacik on Oct 05, 2012

It's just annoying and the user will have gotten a nice OOM killer message
so they are already fully aware they are screwed :).  Thanks,
Reported-by: Jérôme Poulin <jeromepoulin@gmail.com>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>

Btrfs: don't bug on enomem in readpage · edd33c99

Authored by Josef Bacik on Oct 05, 2012

Get rid of the BUG_ON(ret == -ENOMEM) in __extent_read_full_page.  Thanks,
Reported-by: Jérôme Poulin <jeromepoulin@gmail.com>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>

btrfs: move inline function code to header file · 479ed9ab

Authored by Robin Dong on Sep 29, 2012

When building btrfs from kernel code, it will report:

fs/btrfs/extent_io.h:281: warning: 'extent_buffer_page' declared inline after being called
fs/btrfs/extent_io.h:281: warning: previous declaration of 'extent_buffer_page' was here
fs/btrfs/extent_io.h:280: warning: 'num_extent_pages' declared inline after being called
fs/btrfs/extent_io.h:280: warning: previous declaration of 'num_extent_pages' was here

because the inline functions are declared after they are first called.
Signed-off-by: Robin Dong <sanbai@taobao.com>

Btrfs: remove unnecessary IS_ERR in bio_readpage_error() · 7a2d6a64

Authored by Tsutomu Itoh on Oct 01, 2012

The value of extent_map is always either a valid pointer or NULL,
so IS_ERR is unnecessary.
Signed-off-by: Tsutomu Itoh <t-itoh@jp.fujitsu.com>

Btrfs: cache extent state when writing out dirty metadata pages · e6138876

Authored by Josef Bacik on Sep 27, 2012

Every time we write out dirty pages we search for an offset in the tree,
convert the bits in the state, and then when we wait we search for the
offset again and clear the bits. So for every dirty range in the io tree we
are doing 4 rb searches, which is suboptimal. With this patch we are only
doing 2 searches for every cycle (modulo weird things happening). Thanks,
Signed-off-by: Josef Bacik <jbacik@fusionio.com>

Btrfs: do not async metadata csumming in certain situations · de0022b9

Authored by Josef Bacik on Sep 25, 2012

There are a couple of scenarios where farming metadata csumming off to an async
thread doesn't help. The first is if our processor supports crc32c, in
which case the csumming will be fast and so the overhead of the async model
is not worth the cost. The other case is for our tree log. We will be
making that stuff dirty and writing it out and waiting for it immediately.
Even with software crc32c this gives me a ~15% increase in speed with O_SYNC
workloads. Thanks,
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
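
The decision described in this message reduces to two checks. The sketch below is a user-space illustration with hypothetical names (`should_csum_async()` and both parameters are assumptions, not the kernel's identifiers); how the kernel actually detects hardware crc32c support is not shown.

```c
#include <stdbool.h>

/*
 * Decide whether metadata checksumming should be handed to an async
 * worker thread (true) or done inline by the submitter (false).
 */
static bool should_csum_async(bool cpu_has_hw_crc32c, bool is_tree_log_write)
{
    /* Tree log blocks are written and waited on immediately. */
    if (is_tree_log_write)
        return false;

    /* Hardware crc32c is fast enough that the hand-off costs more. */
    if (cpu_has_hw_crc32c)
        return false;

    return true;
}
```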

04 Oct, 2012 1 commit

Btrfs: fix race when getting the eb out of page->private · b5bae261

Authored by Josef Bacik on Sep 14, 2012

We can race when checking whether PagePrivate is set on a page and we
actually have an eb saved in the page's private pointer. We could have
easily written out this page and released it in the time that we did the
pagevec lookup and actually got around to looking at this page. So use
mapping->private_lock to ensure we get a consistent view of the
page->private pointer. This is in line with the alloc and releasepage paths
which use private_lock when manipulating page->private. Thanks,
Reported-by: David Sterba <dave@jikos.cz>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>

03 Oct, 2012 1 commit

fs: push rcu_barrier() from deactivate_locked_super() to filesystems · 8c0a8537

Authored by Kirill A. Shutemov on Sep 26, 2012

There's no reason to call rcu_barrier() on every
deactivate_locked_super().  We only need to make sure that all delayed rcu
free inodes are flushed before we destroy related cache.

Removing rcu_barrier() from deactivate_locked_super() affects some fast
paths.  E.g.  on my machine exit_group() of a last process in IPC
namespace takes 0.07538s.  rcu_barrier() takes 0.05188s of that time.
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

02 Oct, 2012 4 commits

btrfs: Kill some bi_idx references · be3940c0

Authored by Kent Overstreet on Sep 11, 2012

For immutable bio vecs, I've been auditing and removing bi_idx
references. These were harmless, but removing them will make auditing
easier.

scrub_bio_end_io_worker() was open coding a bio_reset() - but this
doesn't appear to have been needed for anything as right after it does a
bio_put(), and perusing the code it doesn't appear anything else was
holding a reference to the bio.

The other use, in end_bio_extent_readpage(), was just for a pr_debug() -
changed it to something that might be a bit more useful.
Signed-off-by: Kent Overstreet <koverstreet@google.com>
CC: Chris Mason <chris.mason@oracle.com>
CC: Stefan Behrens <sbehrens@giantdisaster.de>

btrfs: polish names of kmem caches · 837e1972

Authored by David Sterba on Sep 07, 2012

Use case:

  watch 'grep btrfs < /proc/slabinfo'

This makes it easy to watch all caches in one go.
Signed-off-by: David Sterba <dsterba@suse.cz>

Btrfs: use flag EXTENT_DEFRAG for snapshot-aware defrag · 9e8a4a8b

Authored by Liu Bo on Sep 05, 2012

We're going to use the EXTENT_DEFRAG flag to indicate which ranges
belong to defragmentation so that we can implement snapshot-aware defrag:

We set the EXTENT_DEFRAG flag when dirtying the extents that need to be
defragmented, so that later on the writeback thread can differentiate between
normal writeback and writeback started by defragmentation.
Original-Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Signed-off-by: Liu Bo <bo.li.liu@oracle.com>

Btrfs: fix btrfs send for inline items and compression · 74dd17fb

Authored by Chris Mason on Aug 07, 2012

The btrfs send code was assuming the offset of the file item into the
extent translated to bytes on disk.  If we're compressed, this isn't
true, and so it was off into extents owned by other files.

It was also improperly handling inline extents.  This solves a crash
where we may have gone past the end of the file extent item by not
testing early enough for an inline extent.  It also solves problems
where we have a hole between the end of the inline item and the start
of the full extent.
Signed-off-by: Chris Mason <chris.mason@fusionio.com>

29 Aug, 2012 1 commit

Btrfs: revert checksum error statistic which can cause a BUG() · 5ee0844d

Authored by Stefan Behrens on Aug 27, 2012

Commit 442a4f63 added btrfs device statistic counters for detected IO
and checksum errors to Linux 3.5.
The statistic part that counts checksum errors in
end_bio_extent_readpage() can cause a BUG() in a subfunction:
"kernel BUG at fs/btrfs/volumes.c:3762!"
That part is reverted with the current patch.
However, the counting of checksum errors in the scrub context remains
active, and the counting of detected IO errors (read, write or flush
errors) in all contexts remains active.

Cc: stable <stable@vger.kernel.org> # 3.5
Signed-off-by: Stefan Behrens <sbehrens@giantdisaster.de>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

24 Jul, 2012 5 commits

Btrfs: improve multi-thread buffer read · 67c9684f

Authored by Liu Bo on Jul 20, 2012

While testing with my buffer read fio jobs[1], I find that btrfs does not
perform well enough.

Here is a scenario in fio jobs:

We have 4 threads, "t1 t2 t3 t4", starting to buffer read the same file,
and all of them will race on add_to_page_cache_lru(). If one thread
successfully puts its page into the page cache, it takes responsibility
for reading that page's data.

What's more, reading a page takes some time to finish, during which
other threads can slide in and process the remaining pages:

     t1          t2          t3          t4
   add Page1
   read Page1  add Page2
     |         read Page2  add Page3
     |            |        read Page3  add Page4
     |            |           |        read Page4
-----|------------|-----------|-----------|--------
     v            v           v           v
    bio          bio         bio         bio

Now we have four bios, each of which holds only one page since we need to
maintain consecutive pages in bio.  Thus, we can end up with far more bios
than we need.

Here we're going to
a) delay the real read-page section and
b) try to put more pages into page cache.

With that said, we can make each bio hold more pages and reduce the number
of bios we need.

Here are some numbers taken from fio results:
         w/o patch                 w patch
       -------------  --------  ---------------
READ:    745MB/s        +25%       934MB/s

[1]:
[global]
group_reporting
thread
numjobs=4
bs=32k
rw=read
ioengine=sync
directory=/mnt/btrfs/

[READ]
filename=foobar
size=2000M
invalidate=1
Signed-off-by: Liu Bo <liubo2009@cn.fujitsu.com>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
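
The effect on the number of bios can be illustrated with a toy model. This is only a sketch under the assumption stated in the message, that a bio can grow only while pages are consecutive; `bios_for_batch()` is a hypothetical helper that counts how many bios one submitter needs for the pages it holds.

```c
#include <stddef.h>

/*
 * Count the bios needed for one batch of page indices, submitted in
 * order. A new bio starts whenever the next page does not directly
 * follow the previous one.
 */
static size_t bios_for_batch(const unsigned long *page_index, size_t n)
{
    size_t bios = 0;

    for (size_t i = 0; i < n; i++) {
        if (i == 0 || page_index[i] != page_index[i - 1] + 1)
            bios++;
    }
    return bios;
}
```

In this model, four threads that each submit a single page produce four bios, while one submitter that first collects pages 1 to 4 produces a single bio.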

Btrfs: lock the transition from dirty to writeback for an eb · 51561ffe

Authored by Josef Bacik on Jul 20, 2012

There is a small window where an eb can have no IO bits set on it, which
could potentially result in extent_buffer_under_io() returning false when we
want it to return true, which could result in not fun things happening. So
in order to protect this case we need to hold the refs_lock when we make
this transition to make sure we get reliable results out of
extent_buffer_under_io(). Thanks,
Signed-off-by: Josef Bacik <jbacik@fusionio.com>

Btrfs: fix potential race in extent buffer freeing · 594831c4

Authored by Josef Bacik on Jul 20, 2012

This sounds sort of impossible but it is the only thing I can think of and
at the very least it is theoretically possible so here it goes.

If we are in try_release_extent_buffer we will check that the ref count on
the extent buffer is 1 and not under IO, and then go down and clear the tree
ref. If between this check and clearing the tree ref somebody else comes in
and grabs a ref on the eb and then marks it dirty before
try_release_extent_buffer() does its tree ref clear, we can end up with a
dirty eb that will be freed while it is still dirty, which will result in a
panic. Thanks,
Signed-off-by: Josef Bacik <jbacik@fusionio.com>

Btrfs: don't return true in releasepage unless we actually freed the eb · e64860aa

Authored by Josef Bacik on Jul 20, 2012

I noticed while looking at an extent_buffer race that we will
unconditionally return 1 if we get down to release_extent_buffer after
clearing the tree ref.  However we can easily race in here and get a ref on
the eb and not actually free the eb.  So make release_extent_buffer return 1
if it freed the eb and 0 if not so we can be a little kinder to the vm.
Thanks,
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
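
The return convention can be shown with a minimal reference-count model. The struct and function below are hypothetical stand-ins (single-threaded, no locking), meant only to show the caller learning whether the object was really freed.

```c
#include <assert.h>

/* Stand-in for an extent buffer: just a ref count and a "freed" marker. */
struct eb_model {
    int refs;
    int freed;
};

/* Drop one reference. Returns 1 if this call freed the object, else 0. */
static int release_eb_model(struct eb_model *eb)
{
    assert(eb->refs > 0);

    if (--eb->refs > 0)
        return 0;       /* someone else still holds a reference */

    eb->freed = 1;      /* a real implementation would free memory here */
    return 1;
}
```

A releasepage-style caller can then report success to the VM only when the return value is 1.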

btrfs read error corrected message floods the console during recovery · d5b025d5

Authored by Anand Jain on Jul 02, 2012

Changing printk_in_rcu to printk_ratelimited_in_rcu will suffice.
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
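
Rate limiting of this kind generally allows a burst of messages per time window and drops the rest. The sketch below is a generic user-space illustration, not the kernel's implementation: the struct, the function, and the explicit `now` parameter are all assumptions made for the example.

```c
/* Hypothetical limiter state: at most `burst` messages per `interval`. */
struct ratelimit_model {
    long interval;       /* window length, in arbitrary time units */
    int  burst;          /* messages allowed per window */
    long window_start;   /* start time of the current window */
    int  printed;        /* messages already allowed in this window */
};

/* Returns 1 if a message may be printed at time `now`, 0 to drop it. */
static int ratelimit_ok(struct ratelimit_model *rl, long now)
{
    /* Start a fresh window once the previous one has expired. */
    if (now - rl->window_start >= rl->interval) {
        rl->window_start = now;
        rl->printed = 0;
    }

    if (rl->printed < rl->burst) {
        rl->printed++;
        return 1;
    }
    return 0;
}
```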

12 Jul, 2012 1 commit

Btrfs: fix typo in convert_extent_bit · 10983f2e

Authored by Liu Bo on Jul 11, 2012

It should be convert_extent_bit.
Signed-off-by: Liu Bo <liubo2009@cn.fujitsu.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>

03 Jul, 2012 1 commit

Btrfs: hold a ref on the inode during writepages · 7fd1a3f7

Authored by Josef Bacik on Jun 27, 2012

We can race with unlink and not actually be able to do our igrab in
btrfs_add_ordered_extent. This will result in all sorts of problems.
Instead of doing the complicated work to try and handle returning an error
properly from btrfs_add_ordered_extent, just hold a ref to the inode during
writepages. If we cannot grab a ref we know we're freeing this inode anyway
and can just drop the dirty pages on the floor, because screw them we're
going to invalidate them anyway. Thanks,
Signed-off-by: Josef Bacik <jbacik@fusionio.com>

15 Jun, 2012 1 commit

Btrfs: use rcu to protect device->name · 606686ee

Authored by Josef Bacik on Jun 04, 2012

Al pointed out that we can just toss out the old name on a device and add a
new one arbitrarily, so anybody who uses device->name in printk could
possibly use freed memory. Instead of adding locking around all of this he
suggested doing it with RCU, so I've introduced a struct rcu_string that
does just that and have gone through and protected all accesses to
device->name that aren't under the uuid_mutex with rcu_read_lock(). This
protects us and I will use it for dealing with removing the device that we
used to mount the file system in a later patch. Thanks,
Reviewed-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Josef Bacik <josef@redhat.com>

30 May, 2012 4 commits

Btrfs: add device counters for detected IO and checksum errors · 442a4f63

Authored by Stefan Behrens on May 25, 2012

The goal is to detect when drives start to show an increased error rate
and should be replaced soon. Therefore statistic counters are
added that count IO errors (read, write and flush). Additionally,
software-detected errors such as checksum errors and corrupted blocks are
counted.
Signed-off-by: Stefan Behrens <sbehrens@giantdisaster.de>

Btrfs: use fastpath in extent state ops as much as possible · d1ac6e41

Authored by Liu Bo on May 10, 2012

Fully utilize our extent state's new helper functions to use
fastpath as much as possible.
Signed-off-by: Liu Bo <liubo2009@cn.fujitsu.com>
Reviewed-by: Josef Bacik <josef@redhat.com>

Btrfs: finish ordered extents in their own thread · 5fd02043

Authored by Josef Bacik on May 02, 2012

We noticed that the ordered extent completion doesn't really rely on having
a page and that it could be done independently of ending the writeback on a
page. This patch makes us not do the threaded endio stuff for normal
buffered writes and direct writes so we can end page writeback as soon as
possible (in irq context) and only start threads to do the ordered work when
it is actually done. Compression needs to be reworked some to take
advantage of this as well, but atm it has to do a find_get_page in its endio
handler so it must be done in its own thread. This makes direct writes
quite a bit faster. Thanks,
Signed-off-by: Josef Bacik <josef@redhat.com>

Btrfs: fix compile warnings in extent_io.c · d7dbe9e7

Authored by Josef Bacik on Apr 23, 2012

These warnings are bogus since we will always have at least one page in an
eb, but to make the compiler happy just set ret = 0 in these two cases.
Thanks,
Signed-off-by: Josef Bacik <josef@redhat.com>

26 May, 2012 1 commit

Btrfs: dummy extent buffers for tree mod log · 815a51c7

Authored by Jan Schmidt on May 16, 2012

The tree modification log needs two ways to create dummy extent buffers:
one by allocating a fresh one (to rebuild an old root) and one by
cloning an existing one (to make private rewind modifications to it).
Signed-off-by: Jan Schmidt <list.btrfs@jan-o-sch.net>

11 May, 2012 4 commits

Btrfs: remove the useless assignment to *entry in function tree_insert of file extent_io.c · fd5e62a3

Authored by Wang Sheng-Hui on Apr 06, 2012

In tree_insert, the variable *entry is used only inside the loop and is
useless outside it. Remove the useless assignment after the loop.
Signed-off-by: Wang Sheng-Hui <shhuiw@gmail.com>

Btrfs: fix the comment for find_first_extent_bit · 477d7eaf

Authored by Wang Sheng-Hui on Apr 06, 2012

The return value of find_first_extent_bit is 1 or 0, never < 0.
If something was found, it returns 0; if nothing was found, it returns 1.
Fix the comment.
Signed-off-by: Wang Sheng-Hui <shhuiw@gmail.com>
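
Because "0 means found" is the reverse of what many readers expect, a small model of the convention may help. The sketch below is hypothetical: it searches a sorted array rather than the kernel's tree, and the struct and function names are made up for the example. Only the return convention mirrors the message.

```c
#include <stddef.h>

/* Hypothetical range record; `ranges` is assumed sorted by start. */
struct range_model {
    unsigned long long start;
    unsigned long long end;     /* inclusive */
    unsigned int bits;
};

/*
 * Find the first range ending at or after `start` that has any of
 * `bits` set. Returns 0 and fills the outputs if one is found,
 * returns 1 if nothing was found. Never returns a negative value.
 */
static int find_first_range_bit(const struct range_model *ranges, size_t n,
                                unsigned long long start, unsigned int bits,
                                unsigned long long *start_ret,
                                unsigned long long *end_ret)
{
    for (size_t i = 0; i < n; i++) {
        if (ranges[i].end >= start && (ranges[i].bits & bits)) {
            *start_ret = ranges[i].start;
            *end_ret = ranges[i].end;
            return 0;
        }
    }
    return 1;
}
```

A caller in this model therefore tests `if (!find_first_range_bit(...))` to act on a hit.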

Btrfs: fix btrfs_release_extent_buffer_page with the right usage of num_extent_pages · 39bab87b

Authored by Wang Sheng-Hui on Apr 06, 2012

num_extent_pages returns the number of pages in the specific range, not
the index of the last page in the eb range.

btrfs_release_extent_buffer_page is called with start_idx set to 0 in the
current code, so it's not a problem yet. But the logic is indeed wrong.

Fix it here.
Signed-off-by: Wang Sheng-Hui <shhuiw@gmail.com>
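
The difference between a page count and a last-page index can be made concrete with a short user-space sketch. The helper names and the 4 KiB page size are assumptions for illustration; the arithmetic is the usual "round the end up, round the start down" page count.

```c
/* Assumption for this sketch: 4 KiB pages. */
#define SKETCH_PAGE_SHIFT 12
#define SKETCH_PAGE_SIZE  (1ULL << SKETCH_PAGE_SHIFT)

/* Number of pages touched by the byte range [start, start + len). */
static unsigned long pages_in_range(unsigned long long start,
                                    unsigned long long len)
{
    return (unsigned long)(((start + len + SKETCH_PAGE_SIZE - 1)
                            >> SKETCH_PAGE_SHIFT)
                           - (start >> SKETCH_PAGE_SHIFT));
}

/* Index of the last page, relative to the first page of the range.
 * Only meaningful for len > 0. */
static unsigned long last_page_index(unsigned long long start,
                                     unsigned long long len)
{
    return pages_in_range(start, len) - 1;
}
```

For a 16 KiB buffer starting on a page boundary the count is 4 while the last index is 3, so code that treats one as the other is off by one.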

Btrfs: cleanup the comment for clear_state_bit in extent_io.c · 1b303fc0

Authored by Wang Sheng-Hui on Apr 06, 2012

No 'delete' arg is used for clear_state_bit.
Cleanup the comment.
Signed-off-by: Wang Sheng-Hui <shhuiw@gmail.com>

05 May, 2012 1 commit

Btrfs: fix page leak when allocing extent buffers · 17de39ac

Authored by Josef Bacik on May 04, 2012

If we happen to alloc an extent buffer and then alloc a page and notice that
page is already attached to an extent buffer, we will only unlock it and
free our existing eb. Any pages currently attached to that eb will be
properly freed, but we don't do the page_cache_release() on the page where
we noticed the other extent buffer which can cause us to leak pages and I
hope cause the weird issues we've been seeing in this area. Thanks,
Signed-off-by: Josef Bacik <josef@redhat.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

19 Apr, 2012 2 commits

Btrfs: always store the mirror we read the eb from · 5cf1ab56

Authored by Josef Bacik on Apr 16, 2012

A user reported a panic where we were trying to fix a bad mirror but the
mirror number we were giving was 0, which is invalid. This is because we
don't do the transid verification until after the read, so as far as the
read code is concerned the read was a success. So instead store the mirror
we read from so that if there is some failure post read we know which mirror
to try next and which mirror needs to be fixed if we find a good copy of the
block. Thanks,
Signed-off-by: Josef Bacik <josef@redhat.com>

Btrfs: avoid possible use-after-free in clear_extent_bit() · cdc6a395

Authored by Li Zefan on Mar 12, 2012

clear_extent_bit()
{
    next_node = rb_next(&state->rb_node);
    ...
    clear_state_bit(state);  <-- this may free next_node
    if (next_node) {
        state = rb_entry(next_node);
        ...
    }
}

clear_state_bit() calls merge_state(), which may free the next node
of the extent_state passed in, so clear_extent_bit() may end up
referencing freed memory.
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
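
One general way to avoid this pattern is to look up the successor only after the operation that may free it. The sketch below is a user-space illustration on a singly linked list with hypothetical names; it does not reproduce the kernel's rb-tree code or the exact fix in this commit.

```c
#include <stdlib.h>

struct node {
    int start, end, bits;
    struct node *next;
};

/* Allocate a list node; aborts on allocation failure. */
static struct node *mk(int start, int end, int bits, struct node *next)
{
    struct node *n = malloc(sizeof(*n));
    if (!n)
        abort();
    n->start = start; n->end = end; n->bits = bits; n->next = next;
    return n;
}

/*
 * Clear `bits` on n. If n then has the same bits as its adjacent
 * successor, merge the two and free the successor. The follower is
 * read only after the merge, so a freed node is never returned.
 */
static struct node *clear_and_next(struct node *n, int bits)
{
    n->bits &= ~bits;

    struct node *next = n->next;
    if (next && next->bits == n->bits && next->start == n->end + 1) {
        n->end = next->end;
        n->next = next->next;
        free(next);
    }
    return n->next;
}

/* Clear `bits` on every node; returns the number of nodes visited. */
static int clear_range(struct node *head, int bits)
{
    int visited = 0;
    struct node *n = head;

    while (n) {
        n = clear_and_next(n, bits);
        visited++;
    }
    return visited;
}
```

Caching `n->next` before calling the clearing step, as in the pseudocode in the message, would hand the loop a pointer to the node that was just merged away.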

openeuler / Kernel synced successfully about 1 year ago
