Commits · 40c3c40947324d9f40bf47830c92c59a9bbadf4a · openanolis / cloud-kernel

30 Oct, 2017 32 commits

btrfs: Add sanity check for EXTENT_DATA when reading out leaf · 40c3c409

Authored by Qu Wenruo on Aug 23, 2017

Add extra checks for items of EXTENT_DATA type.  This checks the
following:

0) Key offset
   All key offsets must be aligned to sectorsize.
   Inline extent must have 0 for key offset.

1) Item size
   Uncompressed inline file extent size must match item size.
   (Compressed inline file extent has no information about its on-disk size.)
   Regular/preallocated file extent size must be a fixed value.

2) Every member of regular file extent item
   Including alignment for bytenr and offset, possible value for
   compression/encryption/type.

3) Type/compression/encode must be one of the valid values.

This should be the most comprehensive and strict check in the context
of btrfs_item for EXTENT_DATA.

Signed-off-by: Qu Wenruo <quwenruo.btrfs@gmx.com>
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
[ switch to BTRFS_FILE_EXTENT_TYPES, similar to what
  BTRFS_COMPRESS_TYPES does ]
Signed-off-by: David Sterba <dsterba@suse.com>
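
As a rough illustration of checks 0) and 1) in the commit message above, the following userspace C sketch validates a simplified EXTENT_DATA item. The struct layout, every name prefixed with sketch_, and the fixed_item_size parameter are assumptions made for this example; this is not the kernel's data structure or the code added by the commit. sectorsize is assumed to be non-zero.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified view of an EXTENT_DATA item (not the kernel layout). */
enum sketch_extent_type { SKETCH_INLINE = 0, SKETCH_REG = 1, SKETCH_PREALLOC = 2 };

struct sketch_extent_item {
	uint64_t key_offset;   /* file offset stored in the item key */
	uint8_t  type;         /* one of enum sketch_extent_type */
	uint8_t  compression;  /* 0 means uncompressed in this sketch */
	uint32_t item_size;    /* size of the item as stored in the leaf */
	uint32_t header_size;  /* bytes that precede the inline payload */
	uint64_t ram_bytes;    /* uncompressed data length */
	uint64_t disk_bytenr;  /* only meaningful for regular/prealloc extents */
};

static bool sketch_aligned(uint64_t value, uint32_t sectorsize)
{
	return value % sectorsize == 0;
}

/* Returns true when the item passes the simplified checks. */
static bool sketch_extent_item_ok(const struct sketch_extent_item *fi,
				  uint32_t sectorsize, uint32_t fixed_item_size)
{
	/* Type must be one of the known values. */
	if (fi->type > SKETCH_PREALLOC)
		return false;
	/* 0) Key offset must be aligned to sectorsize. */
	if (!sketch_aligned(fi->key_offset, sectorsize))
		return false;

	if (fi->type == SKETCH_INLINE) {
		/* 0) Inline extents must have key offset 0. */
		if (fi->key_offset != 0)
			return false;
		/* 1) Uncompressed inline size must match the item size. */
		if (fi->compression == 0 &&
		    fi->item_size != fi->header_size + fi->ram_bytes)
			return false;
		return true;
	}

	/* 1) Regular/preallocated items have a fixed size. */
	if (fi->item_size != fixed_item_size)
		return false;
	/* 2) On-disk byte number must be sector aligned. */
	return sketch_aligned(fi->disk_bytenr, sectorsize);
}
```

Compressed inline extents skip the size comparison in this sketch, mirroring the note that they carry no information about their on-disk size.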

btrfs: Check if item pointer overlaps with the item itself · 7f43d4af

Authored by Qu Wenruo on Aug 23, 2017

Function check_leaf() checks if any item pointer points outside of the
leaf, but it doesn't check if the pointer overlaps with the item itself.

Normally only the last item may be the victim, but adding such check is
never a bad idea anyway.

Signed-off-by: Qu Wenruo <quwenruo.btrfs@gmx.com>
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>

btrfs: Refactor check_leaf function for later expansion · c3267bba

Authored by Qu Wenruo on Aug 23, 2017

Current check_leaf() function does a good job checking key order and
item offset/size.

However, it only checks from slot 0 to the second-to-last slot. This
works, but makes later expansion hard.

So this refactoring iterates from slot 0 to the last slot.
For key comparison, it uses a key with all 0 as initial key, so all
valid keys should be larger than that.

And for item size/offset checks, it compares current item end with
previous item offset.
For slot 0, use leaf end as a special case.

This makes later item/key offset checks and item size checks easier to
implement.

Also, make check_leaf() return -EUCLEAN rather than -EIO to indicate
an error.

Signed-off-by: Qu Wenruo <quwenruo.btrfs@gmx.com>
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
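
The loop described in this commit message can be sketched in userspace C roughly as follows. The structs and every sketch_ name are hypothetical simplifications, not the kernel's btrfs_item or check_leaf() itself; the sketch only assumes, as the message states, that item data is packed backwards from the leaf end and that slot 0 is compared against the leaf end.

```c
#include <errno.h>
#include <stdint.h>

#ifndef EUCLEAN
#define EUCLEAN 117 /* fallback for systems whose errno.h lacks it */
#endif

/* Hypothetical simplified key and item header. */
struct sketch_key { uint64_t objectid; uint8_t type; uint64_t offset; };
struct sketch_item { struct sketch_key key; uint32_t offset; uint32_t size; };

static int sketch_key_cmp(const struct sketch_key *a, const struct sketch_key *b)
{
	if (a->objectid != b->objectid)
		return a->objectid < b->objectid ? -1 : 1;
	if (a->type != b->type)
		return a->type < b->type ? -1 : 1;
	if (a->offset != b->offset)
		return a->offset < b->offset ? -1 : 1;
	return 0;
}

/* Walk every slot, including the last one, with a uniform loop body. */
static int sketch_check_leaf(const struct sketch_item *items, int nritems,
			     uint32_t leaf_data_size)
{
	struct sketch_key prev_key = { 0 };      /* all-zero initial key */
	uint32_t expected_end = leaf_data_size;  /* slot 0: leaf end */

	for (int slot = 0; slot < nritems; slot++) {
		const struct sketch_item *it = &items[slot];

		/* Keys must be strictly increasing. */
		if (sketch_key_cmp(&prev_key, &it->key) >= 0)
			return -EUCLEAN;
		/* Current item end must meet the previous item offset. */
		if ((uint64_t)it->offset + it->size != expected_end)
			return -EUCLEAN;

		prev_key = it->key;
		expected_end = it->offset;
	}
	return 0;
}
```

Because each iteration looks only at the current slot and state carried over from the previous one, per-item checks can be added inside the loop without special-casing the last slot.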

Btrfs: cleanup 'start' subtraction from try uncompressed inline extent · 6018ba0a

Authored by Timofey Titovets on Sep 15, 2017

The subtraction was added in:
  c8b97818
  "Btrfs: Add zlib compression support"
and has survived until now (since 08.10.2008).

Because 'start' is checked for zero before the branch, it is safe to
remove that subtraction.

Signed-off-by: Timofey Titovets <nefelim4ag@gmail.com>
Reviewed-by: Satoru Takeuchi <satoru.takeuchi@gmail.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>

btrfs: change how we decide to commit transactions during flushing · 996478ca

Authored by Josef Bacik on Aug 22, 2017

Nikolay reported that generic/273 was failing currently with ENOSPC.
Turns out this is because we get to the point where the outstanding
reservations are greater than the pinned space on the fs.  This is a
mistake, previously we used the current reservation amount in
may_commit_transaction, not the entire outstanding reservation amount.
Fix this to find the minimum byte size needed to make progress in
flushing, and pass that into may_commit_transaction.  From there we can
make a smarter decision on whether to commit the transaction or not.
This fixes the failure in generic/273.

From Nikolai, IOW: when we go to the final stage of deciding whether to
do trans commit, instead of passing all the reservations from all
tickets we just pass the reservation for the current ticket. Otherwise,
in case all reservations exceed pinned space, then we don't commit
transaction and fail prematurely. Before we passed num_bytes from
flush_space, where num_bytes was the sum of all pending reservations,
but now all we do is take the first ticket and commit the trans if we
can satisfy that.

Fixes: 957780eb ("Btrfs: introduce ticketed enospc infrastructure")
Cc: stable@vger.kernel.org # 4.8
Reported-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Josef Bacik <jbacik@fb.com>
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Tested-by: Nikolay Borisov <nborisov@suse.com>
[ added Nikolai's comment ]
Signed-off-by: David Sterba <dsterba@suse.com>
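
The difference between the old and new decision described in this commit message can be illustrated with a small userspace C sketch. The ticket struct and the sketch_ helpers are hypothetical; the real may_commit_transaction() takes more state into account than a single comparison against pinned space.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical reservation ticket: only the requested byte count. */
struct sketch_ticket { uint64_t bytes; };

/* Old behaviour in this sketch: sum of all pending reservations. */
static uint64_t sketch_all_reservations(const struct sketch_ticket *t, size_t n)
{
	uint64_t total = 0;

	for (size_t i = 0; i < n; i++)
		total += t[i].bytes;
	return total;
}

/* New behaviour in this sketch: only the first ticket must be satisfied. */
static uint64_t sketch_first_ticket(const struct sketch_ticket *t, size_t n)
{
	return n ? t[0].bytes : 0;
}

/* Commit only if the pinned space would cover what is needed. */
static bool sketch_may_commit(uint64_t pinned, uint64_t bytes_needed)
{
	return pinned >= bytes_needed;
}
```

With one small ticket queued ahead of two large ones, the summed amount can exceed the pinned space even though committing would let the first ticket make progress, which is the premature ENOSPC case the message describes.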

Btrfs: send, apply asynchronous page cache readahead to enhance page read · eef16ba2

Authored by Kuanling Huang on Sep 15, 2017

By analyzing the perf profile of btrfs send, we found that it spends a
large amount of CPU time in page_cache_sync_readahead. This cost can be
reduced by switching to the asynchronous variant. The overall performance
gains on HDD and SSD were 9 and 15 percent, respectively, when simply
sending a large file.

Signed-off-by: Kuanling Huang <peterh@synology.com>
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>

Btrfs: fix memory leak in raid56 · 785884fc

Authored by Liu Bo on Sep 22, 2017

The local bio_list may have pending bios when doing cleanup, which can
end up as a memory leak if they don't get freed.

Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>

btrfs: make array types static const, reduces object code size · 315d8e98

Authored by Colin Ian King on Sep 19, 2017

Don't populate the read-only array types on the stack, instead make
it static const.  Makes the object code smaller by nearly 60 bytes:

Before:
   text	   data	    bss	    dec	    hex	filename
  90536	   6552	     64	  97152	  17b80	fs/btrfs/ioctl.o

After:
   text	   data	    bss	    dec	    hex	filename
  90414	   6616	     64	  97094	  17b46	fs/btrfs/ioctl.o

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
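
In the size output above, text shrinks by 122 bytes (90536 to 90414) while data grows by 64 bytes (6552 to 6616), for a net saving of 58 bytes (97152 to 97094). The pattern behind the change can be sketched as below; the function name and the table values are placeholders for illustration, not the array from fs/btrfs/ioctl.c.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical lookup helper; the values are placeholders. */
static int sketch_type_index(uint64_t type)
{
	/*
	 * static const: the table lives in read-only data and is
	 * initialised once, instead of being rebuilt on the stack on
	 * every call as a plain local array would be.
	 */
	static const uint64_t types[] = { 1, 2, 4 };

	for (size_t i = 0; i < sizeof(types) / sizeof(types[0]); i++) {
		if (types[i] == type)
			return (int)i;
	}
	return -1;
}
```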

btrfs: return -ENOMEM on allocation failure in btrfsic · 3afb0c50

Authored by Allen Pais on Sep 20, 2017

Forward the correct return value -ENOMEM from btrfsic_dev_state_alloc()
too.

Signed-off-by: Allen Pais <allen.lkml@gmail.com>
Reviewed-by: Anand Jain <anand.jain@oracle.com>
Reviewed-by: David Sterba <dsterba@suse.com>
[ adjust changelog ]
Signed-off-by: David Sterba <dsterba@suse.com>

Btrfs: fix confusing worker helper info in stacktrace · 6939f667

Authored by Liu Bo on Sep 13, 2017

We've seen the following stack trace in the ftrace or dmesg log:

  kworker/u16:10-4244  [000] 241942.480955: function:             btrfs_put_ordered_extent
  kworker/u16:10-4244  [000] 241942.480956: kernel_stack:         <stack trace>
=> finish_ordered_fn (ffffffffa0384475)
=> btrfs_scrubparity_helper (ffffffffa03ca577)        <-----"incorrect"
=> btrfs_freespace_write_helper (ffffffffa03ca98e)    <-----"correct"
=> process_one_work (ffffffff81117b2f)
=> worker_thread (ffffffff81118c2a)
=> kthread (ffffffff81121de0)
=> ret_from_fork (ffffffff81d7087a)

btrfs_freespace_write_helper is actually calling normal_worker_helper
instead of btrfs_scrubparity_helper, so somehow the kernel has parsed the
incorrect function address while unwinding the stack;
btrfs_scrubparity_helper really shouldn't show up.

It's caused by the compiler inlining our helper function; adding a
noinline tag can fix that.

Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
Reviewed-by: David Sterba <dsterba@suse.com>
[ use noinline_for_stack ]
Signed-off-by: David Sterba <dsterba@suse.com>

Btrfs: remove bio_flags which indicates a meta block of log-tree · 18fdc679

Authored by Liu Bo on Sep 13, 2017

Since both committing transaction and writing log-tree are doing
plugging on metadata IO, we can unify to use %sync_writers to benefit
both cases, instead of checking bio_flags while writing meta blocks of
log-tree.

We can remove this bio_flags because, in order to write dirty blocks,
the log tree also uses btrfs_write_marked_extents(), inside which we
have enabled %sync_writers. Therefore, every write is done
synchronously, and so is checksumming.

Please also note that, bio_flags is applied per-context while
%sync_writers is applied per-inode, so this might incur some overhead, ie.

1) while log tree is flushing its dirty blocks via
   btrfs_write_marked_extents(), in which %sync_writers is increased
   by one.

2) in the meantime, some writeback operations may happen upon btrfs's
   metadata inode, so these writes go synchronously, too.

However, AFAICS, the overhead is not a big one, while the win is that
we unify the two places that need synchronous writes and remove a
special hack/flag.

This removes the bio_flags related stuff for writing log-tree.

Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>

Btrfs: make plug in writing meta blocks really work · 6300463b

Authored by Liu Bo on Aug 21, 2017

We have started a plug in btrfs_write_and_wait_marked_extents() but the
generated IOs actually go to the device's scheduled IO list where the work
is done in another task, thus the started plug doesn't make any
sense.

And since we wait for IOs immediately after writing meta blocks, it's
the same case as writing the log tree: doing sync submit can merge more
IOs.

Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>

btrfs: convert all mount option checking code to use btrfs_test_opt · d8953d69

Authored by Satoru Takeuchi on Sep 12, 2017

Signed-off-by: Satoru Takeuchi <satoru.takeuchi@gmail.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>

btrfs: avoid null pointer dereference on fs_info when calling btrfs_crit · 3993b112

Authored by Colin Ian King on Sep 11, 2017

There are checks on fs_info in __btrfs_panic to avoid dereferencing a
null fs_info, however, there is a call to btrfs_crit that may also
dereference a null fs_info. Fix this by adding a check to see if fs_info
is null and only print the s_id if fs_info is non-null.

Detected by CoverityScan CID#401973 ("Dereference after null check")

Fixes: efe120a0 ("Btrfs: convert printk to btrfs_ and fix BTRFS prefix")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
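
A minimal userspace sketch of the guard described in this commit message follows. The structs, the sketch_s_id() helper, and the "<unknown>" placeholder string are assumptions chosen for illustration, not the kernel's definitions or the exact code of the fix.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the super block and fs_info. */
struct sketch_sb { char s_id[32]; };
struct sketch_fs_info { struct sketch_sb *sb; };

/*
 * Only dereference fs_info when it is non-NULL; otherwise fall back
 * to a placeholder so the message can still be printed.
 */
static const char *sketch_s_id(const struct sketch_fs_info *fs_info)
{
	return fs_info ? fs_info->sb->s_id : "<unknown>";
}
```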

btrfs: Clean up dead code in root-tree · fa0d0888

Authored by Christos Gkekas on Sep 09, 2017

The value of variable 'can_recover' is never used after being set, thus
it should be removed, as it was never used since the first commit
68a7342c ("Btrfs: cleanup orphaned root orphan item").

Signed-off-by: Christos Gkekas <chris.gekas@gmail.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>

btrfs: tests: Fix a memory leak in error handling path in 'run_test()' · 9ca2e97f

Authored by Christophe JAILLET on Sep 10, 2017

If 'btrfs_alloc_path()' fails, we must free the resources already
allocated, as done in the other error handling paths in this function.

Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Reviewed-by: Qu Wenruo <quwenruo.btrfs@gmx.com>
Signed-off-by: David Sterba <dsterba@suse.com>

btrfs: Remove redundant argument of __link_block_group · c434d21c

Authored by Nikolay Borisov on Aug 21, 2017

__link_block_group is called from only 2 places and at each call site the
space_info being passed is the same as the space info assigned to the passed
cache struct. Let's remove the redundant argument and make the function
reference the space_info from the passed block_group_cache. No functional
changes.

Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: Josef Bacik <jbacik@fb.com>
Reviewed-by: David Sterba <dsterba@suse.com>
[ renamed to link_block_group ]
Signed-off-by: David Sterba <dsterba@suse.com>

btrfs: Rework error handling of add_extent_mapping in __btrfs_alloc_chunk · 1efb72a3

Authored by Nikolay Borisov on Aug 21, 2017

Currently the code executes add_extent_mapping and if it is successful
it links the new mapping, it then proceeds to unlock the extent mapping
tree and check for failure and handle them. Instead, rework the code to
only perform a single check if add_extent_mapping has failed and handle
it, otherwise the code continues in a linear fashion. No functional
changes.

Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: Josef Bacik <jbacik@fb.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>

btrfs: Remove unused parameter from check_direct_IO · 8c70c9f8

Authored by Nikolay Borisov on Aug 21, 2017

Introduced by 5a5f79b5 ("Btrfs: allow unaligned DIO") and never
used. The buffered fallback from unaligned DIO works as expected.

Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: Timofey Titovets <nefelim4ag@gmail.com>
Reviewed-by: Josef Bacik <jbacik@fb.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>

btrfs: Remove unused arguments from btrfs_changed_cb_t · ee8c494f

Authored by Nikolay Borisov on Aug 21, 2017

btrfs_changed_cb_t represents the signature of the callback being passed
to btrfs_compare_trees. Currently there is only one such callback,
namely changed_cb in send.c. This function doesn't really use the first
2 parameters, i.e. the roots. Since there are no other functions
implementing btrfs_changed_cb_t, let's remove the unused parameters
from the prototype and implementation.

Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Reviewed-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: David Sterba <dsterba@suse.com>

btrfs: Remove unused parameters from various functions · a0357511

Authored by Nikolay Borisov on Aug 21, 2017

iterate_dir_item:found_key - introduced in 31db9f7c ("Btrfs:
  introduce BTRFS_IOC_SEND for btrfs send/receive"), yet never used.

record_ref:num - ditto

This is a first pass with the low-hanging fruit. There are still quite a
few unused parameters in some functions which have to abide by a callback
interface.

Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: Josef Bacik <jbacik@fb.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>

btrfs: Remove unused variable · 8ca19950

Authored by Nikolay Borisov on Aug 21, 2017

Src was initially part of 31ff1cd2 ("Btrfs: Copy into the log tree in
big batches"), however 16e7549f ("Btrfs: incompatible format change
to remove hole extents") changed parameters passed to copy_items which
made the src variable redundant.

Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: Timofey Titovets <nefelim4ag@gmail.com>
Reviewed-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: David Sterba <dsterba@suse.com>

Btrfs: do not async submit for nodatasum inodes · 9b4a9b28

Authored by Liu Bo on Aug 18, 2017

While we submit direct writes, if the inode is flagged with nodatasum,
there's no benefit to submit asynchronously, because

a) we don't have to calculate checksum across processors,

b) and direct IO has started a plug, but async submit makes us queue
IO on each device's scheduled IO list instead of DIO's plug list, so
that IOs get far fewer merges in general.

Let's use sync submit for nodatasum inodes.

Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
Reviewed-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: David Sterba <dsterba@suse.com>

Btrfs: search parity device wisely · 9cd3a7eb

Authored by Liu Bo on Aug 03, 2017

After mapping block with BTRFS_MAP_WRITE, parities have been sorted to
the end position, so this search can start from the first parity
stripe.

Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
Reviewed-by: David Sterba <dsterba@suse.com>
[ copied changelog as a comment ]
Signed-off-by: David Sterba <dsterba@suse.com>

btrfs: copy fsid to super_block s_uuid · ee87cf5e

Authored by Anand Jain on Aug 01, 2017

We didn't copy fsid to struct super_block.s_uuid so Overlay disables
index feature with btrfs as the lower FS.

kernel: overlayfs: fs on '/lower' does not support file handles, falling back to index=off.

Fix this by publishing the fsid through struct super_block.s_uuid.

[ dsterba: I think that setting s_uuid is the last missing bit. Overlay
  needs the file handle encoding support from the lower filesystem, which
  is supported. Filling the whole filesystem id is correct, the subvolume
  id is encoded in the file handle buffer from inside btrfs_encode_fh. ]

Signed-off-by: Anand Jain <anand.jain@oracle.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>

Btrfs: fix __user casting in ioctl.c · 718dc5fa

Authored by Omar Sandoval on Aug 22, 2017

Signed-off-by: Omar Sandoval <osandov@fb.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>

Btrfs: make some volumes.c functions static · c9162bdf

Authored by Omar Sandoval on Aug 22, 2017

These aren't used outside of volumes.c.

Signed-off-by: Omar Sandoval <osandov@fb.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>

btrfs: Remove redundant forward declarations · f78541dd

Authored by Nikolay Borisov on Aug 31, 2017

Some static functions are needlessly forward declared. Let's remove those
declarations since they add no value.

Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>

Btrfs: protect conditions within root->log_mutex while waiting · 49e83f57

Authored by Liu Bo on Sep 01, 2017

Both wait_for_commit() and wait_for_writer() are checking the
condition out of the mutex lock.

This refactors code a bit to be lock safe.

Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>

Btrfs: use wait_event instead of a single function · 45bac0f3

Authored by Liu Bo on Sep 01, 2017

Since TASK_UNINTERRUPTIBLE has been used here, wait_event() can do the
same job.

Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.com>

Btrfs: move finish_wait out of the loop · 69cc7151

Authored by Liu Bo on Sep 01, 2017

If we're still going to wait after schedule(), we don't have to do
finish_wait() to remove our %wait_queue_entry since prepare_to_wait()
won't add the same %wait_queue_entry twice.

Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>

Btrfs: remove batch plug in run_scheduled_IO · 219d33b2

Authored by Liu Bo on Sep 01, 2017

Block layer has a limit on plug, ie. BLK_MAX_REQUEST_COUNT == 16, so
we don't gain benefits by batching 64 bios here.

Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>

19 Oct, 2017 1 commit

Convert fs/*/* to SB_I_VERSION · 357fdad0

Authored by Matthew Garrett on Oct 18, 2017

[AV: in addition to the fix in previous commit]

Signed-off-by: Matthew Garrett <mjg59@google.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Reviewed-by: David Howells <dhowells@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

04 Oct, 2017 2 commits

Btrfs: fix overlap of fs_info::flags values · 69ad5976

Authored by Tsutomu Itoh on Oct 04, 2017

Because the values of BTRFS_FS_EXCL_OP and BTRFS_FS_QUOTA_OVERRIDE overlap,
we should change the value.

First, BTRFS_FS_EXCL_OP was set to 14.

  commit 171938e5 ("btrfs: track exclusive filesystem operation in flags")

Next, the value of BTRFS_FS_QUOTA_OVERRIDE was set to 14.

  commit f29efe29 ("btrfs: add quota override flag to enable quota override for CAP_SYS_RESOURCE")

As a result, the value 14 overlapped, by accident.
This problem is solved by defining the value of BTRFS_FS_EXCL_OP as 16.
The flags are internal.

Fixes: f29efe29 ("btrfs: add quota override flag to enable quota override for CAP_SYS_RESOURCE")
CC: stable@vger.kernel.org # 4.13+
Signed-off-by: Tsutomu Itoh <t-itoh@jp.fujitsu.com>
Reviewed-by: David Sterba <dsterba@suse.com>
[ minimize the change, update only BTRFS_FS_EXCL_OP ]
Signed-off-by: David Sterba <dsterba@suse.com>
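
Why two flags sharing bit 14 is a problem can be shown with a small userspace C sketch. The bit numbers 14 and 16 come from the commit message; the SKETCH_ constants and the helpers are hypothetical stand-ins for the kernel's flag definitions and its set_bit()/test_bit() operations.

```c
#include <stdbool.h>

/* Bit numbers taken from the commit message; names are hypothetical. */
enum {
	SKETCH_FS_QUOTA_OVERRIDE = 14,
	SKETCH_FS_EXCL_OP_OLD    = 14, /* collides with the flag above */
	SKETCH_FS_EXCL_OP_NEW    = 16  /* value after the fix */
};

static void sketch_set_bit(int nr, unsigned long *flags)
{
	*flags |= 1UL << nr;
}

static bool sketch_test_bit(int nr, const unsigned long *flags)
{
	return (*flags >> nr) & 1UL;
}
```

With the old numbering, setting the exclusive-operation flag also makes the quota-override test succeed, because both names refer to the same bit in the flags word. After the renumbering the two tests are independent.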

btrfs: avoid overflow when sector_t is 32 bit · 2d8ce70a

Authored by Goffredo Baroncelli on Oct 03, 2017

Jean-Denis Girard noticed commit c821e7f3 "pass bytes to
btrfs_bio_alloc" (https://patchwork.kernel.org/patch/9763081/)
introduces a regression on 32 bit machines.
When CONFIG_LBDAF is _not_ defined (CONFIG_LBDAF == Support for large
(2TB+) block devices and files) sector_t is 32 bit on 32bit machines.

In the function submit_extent_page, 'sector' (which is sector_t type) is
multiplied by 512 to convert it from sectors to bytes, leading to an
overflow when the disk is bigger than 4GB (!).

I added a cast to u64 to avoid overflow.

Fixes: c821e7f3 ("btrfs: pass bytes to btrfs_bio_alloc")
CC: stable@vger.kernel.org # 4.13+
Signed-off-by: Goffredo Baroncelli <kreijack@inwind.it>
Tested-by: Jean-Denis Girard <jd.girard@sysnux.pf>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
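
The overflow described in this commit message can be reproduced in userspace with a 32-bit sector type. The sketch_sector_t typedef and both helper functions are assumptions for illustration; only the multiplication by 512 and the cast to a 64-bit type come from the message.

```c
#include <stdint.h>

/* Stand-in for a 32-bit sector_t, as on 32-bit builds without CONFIG_LBDAF. */
typedef uint32_t sketch_sector_t;

/* The multiplication is done in 32 bits and wraps before widening. */
static uint64_t sketch_offset_truncated(sketch_sector_t sector)
{
	return sector * 512u;
}

/* Widening first keeps the full byte offset. */
static uint64_t sketch_offset_widened(sketch_sector_t sector)
{
	return (uint64_t)sector * 512u;
}
```

Sector 8388608 (2^23) corresponds to byte offset 4 GiB: the truncated version wraps around to 0, while the widened version yields 4294967296.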

26 Sep, 2017 5 commits

btrfs: log csums for all modified extents · 8c6c5928

Authored by Josef Bacik on Aug 29, 2017

Amir reported a bug discovered by his cleaned up version of my
dm-log-writes xfstests where we were missing csums at certain replay
points. This is because fsx was doing an msync(), which essentially
fsync()'s a specific range of a file. We will log all modified extents,
but only search for the checksums in the range we are being asked to
sync. We cannot simply log the extents in the range we're being asked
because we are logging the inode item as it is currently, which if it
has had an i_size update before the msync means we will miss extents when
replaying. We could possibly get around this by marking the inode with
the transaction that extended the i_size to see if we have this case,
but this would be racy and we'd have to lock the whole range of the
inode to make sure we didn't have an ordered extent outside of our range
that was in the middle of completing.

Fix this simply by keeping track of the modified extents range and
logging the csums for the entire range of extents that we are logging.
This makes the xfstest pass.

Reported-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: David Sterba <dsterba@suse.com>

Btrfs: fix unexpected result when dio reading corrupted blocks · 99c4e3b9

Authored by Liu Bo on Sep 15, 2017

commit 4246a0b6 ("block: add a bi_error field to struct bio")
changed the logic of how dio read endio reports errors.

For single stripe dio read, %bio->bi_status reflects the error before
verifying checksum, and now we're updating it when data block matches
with its checksum, while in the mismatching case, %bio->bi_status is
not updated to reflect that.

When some blocks in a file have been corrupted on disk, reading such a
file ends up with

1) checksum errors are reported in kernel log
2) read(2) returns successfully with some content being 0x01.

In order to fix it, we need to report its checksum mismatch error to
the upper layer (dio layer in this case) as well.

Fixes: 4246a0b6 ("block: add a bi_error field to struct bio")
Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
Reported-by: Goffredo Baroncelli <kreijack@inwind.it>
Tested-by: Goffredo Baroncelli <kreijack@inwind.it>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>

btrfs: Report error on removing qgroup if del_qgroup_item fails · 36b96fdc

Authored by Sargun Dhillon on Sep 17, 2017

Previously, we were calling del_qgroup_item, and ignoring the return code
resulting in a potential to have divergent in-memory state without an
error. Perhaps, it makes sense to handle this error code, and put the
filesystem into a read only, or similar state.

This patch only adds reporting of the error if the error is fatal,
(any error other than qgroup not found).

Signed-off-by: Sargun Dhillon <sargun@sargun.me>
Reviewed-by: Qu Wenruo <quwenruo.btrfs@gmx.com>
Signed-off-by: David Sterba <dsterba@suse.com>

Btrfs: skip checksum when reading compressed data if some IO have failed · e6311f24

Authored by Liu Bo on Sep 20, 2017

Currently even if the underlying disk reports failure on IO,
compressed read endio still gets to verify checksum and reports it as
a checksum error.

In fact, if some IOs have failed while reading a compressed data
extent, there's no way the checksum could match. Therefore, we can
skip that in order to return the error quickly to the upper layer.

Please note that we need to do this after recording the failed mirror
index so that read-repair in the upper layer's endio can work
properly.

Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
Tested-by: Paul Jones <paul@pauljones.id.au>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>

Btrfs: fix kernel oops while reading compressed data · cf1167d5

Authored by Liu Bo on Sep 20, 2017

The kernel oops happens at

kernel BUG at fs/btrfs/extent_io.c:2104!
...
RIP: clean_io_failure+0x263/0x2a0 [btrfs]

It's showing that read-repair code is using an improper mirror index.
This is due to the fact that compression read's endio hasn't recorded
the failed mirror index in %cb->orig_bio.

With this, btrfs's read-repair can work properly on reading compressed
data.

Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
Reported-by: Paul Jones <paul@pauljones.id.au>
Tested-by: Paul Jones <paul@pauljones.id.au>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>


openanolis / cloud-kernel · synced successfully more than a year ago