Commits · 72ac3c0d7921f943d92d1ef42a549fb52e56817d · openeuler / Kernel

30 May, 2012 · 2 commits

Btrfs: convert the inode bit field to use the actual bit operations · 72ac3c0d

Committed by Josef Bacik on May 23, 2012

Miao pointed this out while I was working on an orphan problem that messing
with a bitfield where different ranges are protected by different locks
doesn't work out right. Turns out we've been doing this forever where we
have different parts of the bit field protected by either no lock at all or
different locks which could cause all sorts of weird problems including the
issue I was hitting. So instead make a runtime_flags thing that we use the
normal bit operations on that are all atomic so we can keep having our
no/different locking for the different flags and then make force_compress
its own thing so it can be treated normally. Thanks,
Signed-off-by: Josef Bacik <josef@redhat.com>
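The idea behind this commit can be illustrated outside the kernel. What follows is a minimal userspace sketch, assuming C11 atomics stand in for the kernel's set_bit/test_bit helpers. The struct, the enum values, and every demo_* function name are hypothetical and are not taken from the btrfs source.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical flag numbers; the real btrfs flag names are not given in the commit text. */
enum demo_runtime_flag {
    DEMO_FLAG_ORPHAN_ADDED = 0,
    DEMO_FLAG_DELALLOC_RESERVED = 1,
    DEMO_FLAG_NBITS = 2,
};

struct demo_inode {
    atomic_ulong runtime_flags; /* every bit is changed with an atomic read-modify-write */
    unsigned force_compress;    /* ordinary field, covered by the caller's usual locking */
};

static void demo_set_bit(struct demo_inode *inode, unsigned nr)
{
    assert(nr < DEMO_FLAG_NBITS);
    atomic_fetch_or(&inode->runtime_flags, 1UL << nr);
}

static void demo_clear_bit(struct demo_inode *inode, unsigned nr)
{
    assert(nr < DEMO_FLAG_NBITS);
    atomic_fetch_and(&inode->runtime_flags, ~(1UL << nr));
}

static bool demo_test_bit(struct demo_inode *inode, unsigned nr)
{
    assert(nr < DEMO_FLAG_NBITS);
    return (atomic_load(&inode->runtime_flags) >> nr) & 1UL;
}

/* Sets the bit and returns its previous value in one atomic step. */
static bool demo_test_and_set_bit(struct demo_inode *inode, unsigned nr)
{
    assert(nr < DEMO_FLAG_NBITS);
    unsigned long old = atomic_fetch_or(&inode->runtime_flags, 1UL << nr);
    return (old >> nr) & 1UL;
}
```

Because each update touches only its own bit atomically, two flags that live in the same word can be guarded by different locks, or by none, without one writer overwriting the other's bit. A C bitfield gives no such guarantee, since the compiler may rewrite the whole containing word.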

Btrfs: finish ordered extents in their own thread · 5fd02043

Committed by Josef Bacik on May 02, 2012

We noticed that the ordered extent completion doesn't really rely on having
a page and that it could be done independently of ending the writeback on a
page. This patch makes us not do the threaded endio stuff for normal
buffered writes and direct writes so we can end page writeback as soon as
possible (in irq context) and only start threads to do the ordered work when
it is actually done. Compression needs to be reworked some to take
advantage of this as well, but atm it has to do a find_get_page in its endio
handler so it must be done in its own thread. This makes direct writes
quite a bit faster. Thanks,
Signed-off-by: Josef Bacik <josef@redhat.com>

06 May, 2012 · 1 commit

Btrfs: avoid sleeping in verify_parent_transid while atomic · b9fab919

Committed by Chris Mason on May 06, 2012

verify_parent_transid needs to lock the extent range to make
sure no IO is underway, and so it can safely clear the
uptodate bits if our checks fail.

But, a few callers are using it with spinlocks held.  Most
of the time, the generation numbers are going to match, and
we don't want to switch to a blocking lock just for the error
case.  This adds an atomic flag to verify_parent_transid,
and changes it to return EAGAIN if it needs to block to
properly verify things.
Signed-off-by: Chris Mason <chris.mason@oracle.com>
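The control flow described here might look roughly like the sketch below. It is an illustration only: struct demo_eb, demo_verify_parent_transid, and the return value 1 for a failed check are assumptions made for this example, and the blocking lock itself is only indicated by a comment.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for an extent buffer; not the kernel structure. */
struct demo_eb {
    uint64_t generation;
    bool uptodate;
};

/*
 * Returns 0 when the generation matches, -EAGAIN when the mismatch path
 * would have to block but the caller is in atomic context, and 1 after
 * the uptodate state has been cleared.
 */
static int demo_verify_parent_transid(struct demo_eb *eb,
                                      uint64_t parent_transid, bool atomic)
{
    assert(eb != NULL);

    /* Common case: the numbers match and no lock is needed. */
    if (eb->generation == parent_transid)
        return 0;

    /* The error path needs a sleeping lock, so an atomic caller must retry. */
    if (atomic)
        return -EAGAIN;

    /* A real implementation would take the blocking extent lock here. */
    eb->uptodate = false;
    return 1;
}
```

A caller holding a spinlock would pass atomic = true, and on -EAGAIN drop the spinlock and call again with atomic = false.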

19 April, 2012 · 2 commits

Btrfs: always store the mirror we read the eb from · 5cf1ab56

Committed by Josef Bacik on April 16, 2012

A user reported a panic where we were trying to fix a bad mirror but the
mirror number we were giving was 0, which is invalid. This is because we
don't do the transid verification until after the read, so as far as the
read code is concerned the read was a success. So instead store the mirror
we read from so that if there is some failure post read we know which mirror
to try next and which mirror needs to be fixed if we find a good copy of the
block. Thanks,
Signed-off-by: Josef Bacik <josef@redhat.com>

Btrfs: do not mount when we have a sectorsize unequal to PAGE_SIZE · 8d082fb7

Committed by Liu Bo on April 03, 2012

Our code is not ready to cope with a sectorsize that's not equal to PAGE_SIZE.
It will lead to a hang as soon as anything is written.
Signed-off-by: Liu Bo <liubo2009@cn.fujitsu.com>
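A mount-time guard of this kind could be sketched as follows. The function name and the choice of -EINVAL are assumptions for illustration; the commit text does not say which error code the mount path returns.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/*
 * Hypothetical helper: refuse the mount unless the on-disk sector size
 * equals the page size of the running kernel.
 */
static int demo_check_sectorsize(uint32_t sectorsize, uint32_t page_size)
{
    assert(page_size != 0);
    if (sectorsize != page_size)
        return -EINVAL; /* assumed error code */
    return 0;
}
```

For example, a filesystem made with 4096-byte sectors would pass on a 4K-page kernel and be rejected on a 64K-page kernel.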

30 March, 2012 · 1 commit

Btrfs: update the checks for mixed block groups with big metadata blocks · bc3f116f

Committed by Chris Mason on March 29, 2012

Dave Sterba had put in patches to look for mixed data/metadata groups
with metadata bigger than 4KB.  But these ended up in the wrong place
and it wasn't testing the feature flag correctly.

This updates the tests to make sure our sizes are matching
Signed-off-by: Chris Mason <chris.mason@oracle.com>

29 March, 2012 · 3 commits

Btrfs: flush out and clean up any block device pages during mount · 3c4bb26b

Committed by Chris Mason on March 27, 2012

Btrfs puts the filesystem metadata into its own address space, and
somehow the block device address space isn't getting onto disk properly
before a mount.  The end result is that a loop of mkfs and mounting the
filesystem will sometimes find stale or incorrect data.

This commit should fix it by sprinkling fdatawrites and invalidate_bdev
calls around.  This is a short term measure to make sure it is fixed.
The block devices really should be flushed and cleaned up higher in the
stack.
Signed-off-by: Chris Mason <chris.mason@oracle.com>

btrfs: disallow unequal data/metadata blocksize for mixed block groups · 65139ed9

Committed by David Sterba on February 17, 2012

With support for bigger metadata blocks, we must avoid mounting a
filesystem with different block size for mixed block groups, this causes
corruption (found by xfstests/083).
Signed-off-by: David Sterba <dsterba@suse.cz>

Btrfs: enhance superblock sanity checks · fcd1f065

Committed by David Sterba on March 06, 2012

Validate checksum algorithm during mount and prevent BUG_ON later in
btrfs_super_csum_size.
Signed-off-by: David Sterba <dsterba@suse.cz>
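The validation can be pictured as a bounds check done once at mount, so that the later size lookup never sees an unknown type. This is a sketch under assumptions: the table contents (a single 4-byte checksum at index 0) and all demo_* names are illustrative rather than copied from the btrfs source.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed table: index is the checksum type, value is its size in bytes. */
static const uint16_t demo_csum_sizes[] = { 4 };

/* Mount-time check: reject a superblock whose checksum type is unknown. */
static int demo_validate_csum_type(uint16_t csum_type)
{
    size_t n = sizeof(demo_csum_sizes) / sizeof(demo_csum_sizes[0]);

    if (csum_type >= n)
        return -EINVAL;
    return 0;
}

/* Later lookup: safe only because the type was validated during mount. */
static uint16_t demo_csum_size(uint16_t csum_type)
{
    assert(demo_validate_csum_type(csum_type) == 0);
    return demo_csum_sizes[csum_type];
}
```

Failing the mount with an error code is a gentler outcome than hitting an assertion deep inside the size lookup.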

27 March, 2012 · 6 commits

Btrfs: deal with read errors on extent buffers differently · ea466794

Committed by Josef Bacik on March 26, 2012

Since we need to read and write extent buffers in their entirety we can't use
the normal bio_readpage_error stuff since it only works on a per page basis. So
instead make it so that if we see an io error in endio we just mark the eb as
having an IO error and then in btree_read_extent_buffer_pages we will manually
try other mirrors and then overwrite the bad mirror if we find a good copy.
This works with larger than page size blocks. Thanks,
Signed-off-by: Josef Bacik <josef@redhat.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>
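The retry-and-repair loop described in this message can be sketched in userspace as below. Everything here is hypothetical: struct demo_block models each mirror copy as a simple good/bad flag, and "overwriting the bad mirror" is reduced to marking that copy good again.

```c
#include <assert.h>
#include <stdbool.h>

#define DEMO_MAX_MIRRORS 4

/* Hypothetical block with one copy per mirror; mirrors are numbered from 1, 0 is invalid. */
struct demo_block {
    int num_copies;
    bool copy_ok[DEMO_MAX_MIRRORS + 1]; /* index 0 is unused */
    int read_mirror;                    /* mirror the most recent read came from */
};

/*
 * Tries each mirror in turn. Returns the mirror that held a good copy, or 0
 * if every copy is bad. When a good copy is found after an earlier failure,
 * the first failed mirror is rewritten from it.
 */
static int demo_read_with_repair(struct demo_block *b)
{
    int failed_mirror = 0;

    assert(b->num_copies >= 1 && b->num_copies <= DEMO_MAX_MIRRORS);
    for (int mirror = 1; mirror <= b->num_copies; mirror++) {
        b->read_mirror = mirror; /* always remember where the read came from */
        if (b->copy_ok[mirror]) {
            if (failed_mirror)
                b->copy_ok[failed_mirror] = true; /* repair the bad copy */
            return mirror;
        }
        if (!failed_mirror)
            failed_mirror = b->read_mirror;
    }
    return 0;
}
```

Recording the mirror on every read is what lets the repair step name a valid, nonzero mirror when a later check fails.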

Btrfs: don't use threaded IO completion helpers for metadata writes · f3f266ab

Committed by Chris Mason on March 23, 2012

The metadata write IO completion code is now simple enough that we
don't need the threaded helpers anymore.
Signed-off-by: Chris Mason <chris.mason@oracle.com>

Btrfs: ensure an entire eb is written at once · 0b32f4bb

Committed by Josef Bacik on March 13, 2012

This patch simplifies how we track our extent buffers. Previously we could exit
writepages with only having written half of an extent buffer, which meant we had
to track the state of the pages and the state of the extent buffers differently.
Now we only read in entire extent buffers and write out entire extent buffers,
this allows us to simply set bits in our bflags to indicate the state of the eb
and we no longer have to do things like track uptodate with our iotree. Thanks,
Signed-off-by: Josef Bacik <josef@redhat.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

Btrfs: introduce free_extent_buffer_stale · 3083ee2e

Committed by Josef Bacik on March 09, 2012

Because btrfs cow's we can end up with extent buffers that are no longer
necessary just sitting around in memory. So instead of evicting these pages, we
could end up evicting things we actually care about. Thus we have
free_extent_buffer_stale for use when we are freeing tree blocks. This will
make it so that the ref for the eb being in the radix tree is dropped as soon as
possible and then is freed when the refcount hits 0 instead of waiting to be
released by releasepage. Thanks,
Signed-off-by: Josef Bacik <josef@redhat.com>

Btrfs: set page->private to the eb · 4f2de97a

Committed by Josef Bacik on March 07, 2012

We spend a lot of time looking up extent buffers from pages when we could just
store the pointer to the eb the page is associated with in page->private. This
patch does just that, and it makes things a little simpler and reduces a bit of
CPU overhead involved with doing metadata IO. Thanks,
Signed-off-by: Josef Bacik <josef@redhat.com>

Btrfs: allow metadata blocks larger than the page size · 727011e0

Committed by Chris Mason on August 06, 2010

A few years ago the btrfs code to support blocks larger than
the page size was disabled to fix a few corner cases in the
page cache handling.  This fixes the code to properly support
large metadata blocks again.

Since current kernels will crash early and often with larger
metadata blocks, this adds an incompat bit so that older kernels
can't mount it.

This also does away with different blocksizes for nodes and leaves.
You get a single block size for all tree blocks.
Signed-off-by: Chris Mason <chris.mason@oracle.com>

22 March, 2012 · 8 commits

btrfs: Fix busyloop in transaction_kthread() · 914b2007

Committed by Jan Kara on March 12, 2012

When a filesystem gets aborted due to an error, transaction_kthread() will
busyloop.  Fix it by going to sleep in that case as well. Maybe we should
just stop transaction_kthread() when filesystem is aborted but that would be
more complex.
Signed-off-by: Jan Kara <jack@suse.cz>

btrfs: replace many BUG_ONs with proper error handling · 79787eaa

Committed by Jeff Mahoney on March 12, 2012

 btrfs currently handles most errors with BUG_ON. This patch is a work-in-
 progress but aims to handle most errors other than internal logic
 errors and ENOMEM more gracefully.

 This iteration prevents most crashes but can run into lockups with
 the page lock on occasion when the timing "works out."
Signed-off-by: Jeff Mahoney <jeffm@suse.com>

btrfs: enhance transaction abort infrastructure · 49b25e05

Committed by Jeff Mahoney on March 01, 2012

Signed-off-by: Jeff Mahoney <jeffm@suse.com>

btrfs: drop gfp_t from lock_extent · d0082371

Committed by Jeff Mahoney on March 01, 2012

 lock_extent and unlock_extent are always called with GFP_NOFS, drop the
 argument and use GFP_NOFS consistently.
Signed-off-by: Jeff Mahoney <jeffm@suse.com>

btrfs: return void in functions without error conditions · 143bede5

Committed by Jeff Mahoney on March 01, 2012

Signed-off-by: Jeff Mahoney <jeffm@suse.com>

btrfs: ->submit_bio_hook error push-up · 355808c2

Committed by Jeff Mahoney on October 03, 2011

This pushes failures from the submit_bio_hook callbacks,
btrfs_submit_bio_hook and btree_submit_bio_hook into the callers, including
callers of submit_one_bio where it catches the failures with BUG_ON.

It also pushes up through the ->readpage_io_failed_hook to
end_bio_extent_writepage where the error is already caught with BUG_ON.
Signed-off-by: Jeff Mahoney <jeffm@suse.com>

btrfs: find_and_setup_root error push-up · 200a5c17

Committed by Jeff Mahoney on October 03, 2011

find_and_setup_root BUGs when it encounters an error from
btrfs_find_last_root, which can occur if a path can't be allocated.

This patch pushes it up to its callers where it is already handled.
Signed-off-by: Jeff Mahoney <jeffm@suse.com>

btrfs: clean_tree_block should panic on observed memory corruption and return void · d5c13f92

Committed by Jeff Mahoney on March 01, 2012

The only error condition in clean_tree_block is an accounting bug.
Returning without modifying dirty_metadata_bytes and as if the cleaning
has been performed may cause problems later so it should panic instead.

It should probably be a BUG_ON but we have btrfs_panic now.
Signed-off-by: Jeff Mahoney <jeffm@suse.com>

23 February, 2012 · 1 commit

Btrfs: make sure we update latest_bdev · a6b0d5c8

Committed by Chris Mason on February 20, 2012

When we are setting up the mount, we close all the
devices that were not actually part of the metadata we found.

But, we don't make sure that one of those devices wasn't
fs_devices->latest_bdev, which means we can do a use after free
on the one we closed.

This updates latest_bdev as it goes.
Signed-off-by: Chris Mason <chris.mason@oracle.com>

15 February, 2012 · 1 commit

btrfs: Sector Size check during Mount · 941b2ddf

Committed by Keith Mannthey on November 29, 2011

Gracefully fail when trying to mount a BTRFS file system that has a
sectorsize smaller than PAGE_SIZE.

On PPC it is possible to build a FS while using a 4k PAGE_SIZE kernel
then boot into a 64K PAGE_SIZE kernel.  Presently open_ctree fails in an
endless loop and hangs the machine in this situation.

My debugging has shown this Sector size < Page size to be a non-trivial
situation and a graceful exit from the situation would be nice for the
time being.
Signed-off-by: Keith Mannthey <kmannth@us.ibm.com>

27 January, 2012 · 1 commit

btrfs: mask out gfp flags in releasepage · 0c4e538b

Committed by David Sterba on January 26, 2012

btree_releasepage is a callback and can be passed unknown gfp flags and then
they may end up in kmem_cache_alloc called from alloc_extent_state, slab
allocator will BUG_ON when there is HIGHMEM or DMA32 flag set.

This may happen when btrfs is mounted from a loop device, which masks out
__GFP_IO flag. The check in try_release_extent_state

        if ((mask & GFP_NOFS) == GFP_NOFS)
                mask = GFP_NOFS;

will not work and passes unfiltered flags further resulting in crash at
mm/slab.c:2963

 [<000000000024ae4c>] cache_alloc_refill+0x3b4/0x5c8
 [<000000000024c810>] kmem_cache_alloc+0x204/0x294
 [<00000000001fd3c2>] mempool_alloc+0x52/0x170
 [<000003c000ced0b0>] alloc_extent_state+0x40/0xd4 [btrfs]
 [<000003c000cee5ae>] __clear_extent_bit+0x38a/0x4cc [btrfs]
 [<000003c000cee78c>] try_release_extent_state+0x9c/0xd4 [btrfs]
 [<000003c000cc4c66>] btree_releasepage+0x7e/0xd0 [btrfs]
 [<0000000000210d84>] shrink_page_list+0x6a0/0x724
 [<0000000000211394>] shrink_inactive_list+0x230/0x578
 [<0000000000211bb8>] shrink_list+0x6c/0x120
 [<0000000000211e4e>] shrink_zone+0x1e2/0x228
 [<0000000000211f24>] shrink_zones+0x90/0x254
 [<0000000000213410>] do_try_to_free_pages+0xac/0x420
 [<0000000000213ae0>] try_to_free_pages+0x13c/0x1b0
 [<0000000000204e6c>] __alloc_pages_nodemask+0x5b4/0x9a8
 [<00000000001fb04a>] grab_cache_page_write_begin+0x7e/0xe8
Signed-off-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Chris Mason <chris.mason@oracle.com>
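Why the quoted check lets bad flags through can be shown with a small worked example. The bit values below are made up for illustration and do not match any kernel version, and demo_masked is only one plausible shape for the masking named in the commit title.

```c
/* Made-up bit values for illustration only; real kernel values differ. */
#define DEMO_GFP_WAIT    0x01u
#define DEMO_GFP_IO      0x02u
#define DEMO_GFP_HIGHMEM 0x10u
#define DEMO_GFP_NOFS    (DEMO_GFP_WAIT | DEMO_GFP_IO)

/* The quoted check: it normalizes the mask only when every NOFS bit is set. */
static unsigned demo_old_filter(unsigned mask)
{
    if ((mask & DEMO_GFP_NOFS) == DEMO_GFP_NOFS)
        mask = DEMO_GFP_NOFS;
    return mask;
}

/* Hypothetical fix: always strip the bits the slab allocator rejects. */
static unsigned demo_masked(unsigned mask)
{
    return mask & ~DEMO_GFP_HIGHMEM;
}
```

With the IO bit already cleared, as the message says happens on a loop device, a mask of DEMO_GFP_WAIT | DEMO_GFP_HIGHMEM fails the equality test, so demo_old_filter returns it unchanged and the HIGHMEM bit reaches the allocator. demo_masked removes that bit regardless of which other bits are present.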

17 January, 2012 · 5 commits

Btrfs: allow for canceling restriper · a7e99c69