Commits · eceff22a8067fa5f587d1bab0eb66503d33b7164 · openanolis / cloud-kernel

26 Mar, 2018 31 commits

btrfs: add a comment to mark the deprecated mount option · eceff22a

Authored by Anand Jain on Feb 13, 2018

The options alloc_start and subvolrootid are deprecated; mark them with a
comment in the tokens list and otherwise leave them as they are. No
functional changes.
Signed-off-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.com>


btrfs: manage commit mount option as %u · d3740608

Authored by Anand Jain on Feb 13, 2018

As the commit mount option is unsigned, manage it as %u for token
verification, instead of %d.
Signed-off-by: Anand Jain <anand.jain@oracle.com>
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>

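The idea behind the %u change can be illustrated outside the kernel. The following is a minimal userspace sketch, not the kernel's token-matching code: `parse_u32_option` is a hypothetical helper that accepts only unsigned decimal values, which is the behaviour one would expect from an unsigned option such as commit.

```c
#include <ctype.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/*
 * Hypothetical userspace helper (not the kernel's option parser):
 * parse the value part of an option such as "commit=30" as an
 * unsigned int. Signs, empty strings, trailing junk and
 * out-of-range values are rejected.
 */
static int parse_u32_option(const char *val, unsigned int *out)
{
	char *end;
	unsigned long n;

	/* strtoul() silently accepts "-1", so insist on a leading digit. */
	if (!isdigit((unsigned char)val[0]))
		return -EINVAL;

	errno = 0;
	n = strtoul(val, &end, 10);
	if (errno == ERANGE || n > UINT_MAX)
		return -ERANGE;
	if (*end != '\0')
		return -EINVAL;

	*out = (unsigned int)n;
	return 0;
}
```

With a signed pattern, a value such as "-1" would pass the syntax check and only be caught (or not) later; treating the option as unsigned from the start removes that case.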

btrfs: manage check_int_print_mask mount option as %u · 02453bde

Authored by Anand Jain on Feb 13, 2018

As the check_int_print_mask mount option is unsigned, manage it as %u for
token verification, instead of %d.
Signed-off-by: Anand Jain <anand.jain@oracle.com>
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>


btrfs: manage metadata_ratio mount option as %u · 764cb8b4

Authored by Anand Jain on Feb 13, 2018

As the metadata_ratio mount option is unsigned, manage it as %u for token
verification, instead of %d.
Signed-off-by: Anand Jain <anand.jain@oracle.com>
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>


btrfs: manage thread_pool mount option as %u · f7b885be

Authored by Anand Jain on Feb 13, 2018

The mount option thread_pool is always unsigned. Manage it that way all
around.
Signed-off-by: Anand Jain <anand.jain@oracle.com>
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>


btrfs: extent_buffer_uptodate() make it static and inline · ba020491

Authored by Anand Jain on Feb 13, 2018

extent_buffer_uptodate() is a trivial wrapper around test_bit() and
nothing else. So make it static and inline to save on code space and a
call indirection.

Before:
   text	   data	    bss	    dec	    hex	filename
1131257	  82898	  18992	1233147	 12d0fb	fs/btrfs/btrfs.ko

After:
   text	   data	    bss	    dec	    hex	filename
1131090	  82898	  18992	1232980	 12d054	fs/btrfs/btrfs.ko
Signed-off-by: Anand Jain <anand.jain@oracle.com>
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>

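The shape of such a wrapper can be sketched in userspace. Everything prefixed `demo_`/`DEMO_` below is a hypothetical stand-in (including the bit number), not the kernel definitions; the point is only that a one-line bit test placed in a header as `static inline` lets the compiler fold it into each caller.

```c
#include <limits.h>
#include <stdbool.h>

/* Assumed bit number, chosen for illustration only. */
#define DEMO_EXTENT_BUFFER_UPTODATE 0

/* Hypothetical stand-in for the kernel's extent buffer. */
struct demo_extent_buffer {
	unsigned long bflags;
};

/* Userspace stand-in for test_bit(): is bit nr set in the bitmap? */
static inline bool demo_test_bit(unsigned int nr, const unsigned long *addr)
{
	const unsigned int bits = sizeof(unsigned long) * CHAR_BIT;

	return (addr[nr / bits] >> (nr % bits)) & 1UL;
}

/* Trivial wrapper: cheap enough to inline at every call site. */
static inline bool demo_extent_buffer_uptodate(const struct demo_extent_buffer *eb)
{
	return demo_test_bit(DEMO_EXTENT_BUFFER_UPTODATE, &eb->bflags);
}
```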

btrfs: Remove fs_info argument of btrfs_write_and_wait_transaction · 70458a58

Authored by Nikolay Borisov on Feb 07, 2018

We already pass btrfs_trans_handle, which contains a reference to the
fs_info, so use that. No functional changes.
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>

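The refactoring pattern used by this commit and the similar ones that follow can be sketched as below. The struct and function names are hypothetical stand-ins, not the btrfs definitions; the sketch only shows that when a handle already carries a pointer to the filesystem state, passing that state separately is redundant.

```c
/* Hypothetical stand-ins for the btrfs structures. */
struct demo_fs_info {
	unsigned long generation;
};

struct demo_trans_handle {
	struct demo_fs_info *fs_info;
};

/* Before: fs_info is passed even though trans already references it. */
static unsigned long demo_commit_old(struct demo_trans_handle *trans,
				     struct demo_fs_info *fs_info)
{
	(void)trans;
	return fs_info->generation;
}

/* After: derive fs_info from the handle; behaviour is unchanged. */
static unsigned long demo_commit_new(struct demo_trans_handle *trans)
{
	struct demo_fs_info *fs_info = trans->fs_info;

	return fs_info->generation;
}
```

Besides shortening signatures, the second form removes the possibility of a caller passing an fs_info that does not match the handle.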

btrfs: Remove fs_info argument from btrfs_update_commit_device_bytes_used · e9b919b1

Authored by Nikolay Borisov on Feb 07, 2018

We already pass the btrfs_transaction, which references fs_info, so there
is no need to pass the latter as an argument. Also use the opportunity to
shorten transaction->trans. No functional changes.
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>


btrfs: Remove fs_info argument from create_pending_snapshots/create_pending_snapshot · 08d50ca3

Authored by Nikolay Borisov on Feb 07, 2018

We already pass the trans handle, which has a reference to fs_info, to
create_pending_snapshot so we can refer to it directly. Doing this
obviates the need to pass the fs_info to create_pending_snapshots as
well. No functional changes.
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>


btrfs: Remove fs_info argument from switch_commit_roots · 16916a88

Authored by Nikolay Borisov on Feb 07, 2018

We already have the fs_info from the passed transaction so use it
directly. No functional changes.
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>


btrfs: Remove root argument of cleanup_transaction · 97cb39bb

Authored by Nikolay Borisov on Feb 07, 2018

The passed root is used for only two things:
1. to get a reference to the fs_info, and
2. to call trace_btrfs_transaction_commit.

We can achieve 1) by simply referring to the fs_info from the passed trans
object. As far as 2) is concerned, cleanup_transaction is called from
only one place and the 'root' argument passed is the one from the trans
handle. No functional changes.
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>


btrfs: Don't pass fs_info to commit_cowonly_roots · 9386d8bc

Authored by Nikolay Borisov on Feb 07, 2018

We already pass a transaction handle which references the fs_info so
we can grab it from there. No functional changes.
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>


btrfs: Don't pass fs_info to commit_fs_roots · 7e4443d9

Authored by Nikolay Borisov on Feb 07, 2018

We already pass the transaction handle which has a reference to the
fs_info. No functional changes.
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>


btrfs: Don't pass fs_info to btrfs_run_delayed_items/_nr · e5c304e6

Authored by Nikolay Borisov on Feb 07, 2018

We already pass the transaction which has a reference to the fs_info,
so use that. No functional changes.
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>


btrfs: Don't pass fs_info to __btrfs_run_delayed_items · b84acab3

Authored by Nikolay Borisov on Feb 07, 2018

We already pass the transaction handle, which contains a reference to
the fs_info, so grab it from there. No functional changes.
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>


btrfs: Don't pass fs_info arg to btrfs_start_dirty_block_groups · 21217054

Authored by Nikolay Borisov on Feb 07, 2018

It can be referenced from the passed transaction so there is no point in
passing it as a function argument. No functional changes.
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>


btrfs: Remove fs_info argument from btrfs_create_pending_block_groups · 6c686b35

Authored by Nikolay Borisov on Feb 07, 2018

It can be referenced from the passed transaction so there is no point in
passing it as a function argument. No functional changes.
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>


btrfs: Remove fs_info argument from btrfs_trans_release_metadata · dc60c525

Authored by Nikolay Borisov on Feb 07, 2018

All current callers of this function just get a reference to the
trans->fs_info member and pass it as the second argument. Collapse this
into the function itself. No functional changes.
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>


btrfs: Open code btrfs_write_and_wait_marked_extents · c9b577c0

Authored by Nikolay Borisov on Feb 07, 2018

btrfs_write_and_wait_transaction is essentially a wrapper of
btrfs_write_and_wait_marked_extents with the addition of calling
clear_btree_io_tree. Having the code split doesn't really bring any
benefit. Open code the latter into the former and add a proper
documentation header.
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
[ reformat comment ]
Signed-off-by: David Sterba <dsterba@suse.com>


btrfs: Make btrfs_trans_release_metadata private to transaction.c · 0e34693f

Authored by Nikolay Borisov on Feb 07, 2018

This function is only ever used in __btrfs_end_transaction and
btrfs_commit_transaction so there is no need to export it via a header.
Let's move it closer to where it's used, make it static and remove it
from the header. No functional changes.
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>


btrfs: open code btrfs_init_dev_replace_tgtdev_for_resume() · 15fc1283

Authored by Anand Jain on Feb 12, 2018

btrfs_init_dev_replace_tgtdev_for_resume() initializes the replace
target device in a few simple steps, so do it in the parent function.
Moreover, there isn't any other caller, so just open code it.
Signed-off-by: Anand Jain <anand.jain@oracle.com>
Reviewed-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>


btrfs: btrfs_dev_replace_cancel() can return int · 18e67c73

Authored by Anand Jain on Feb 12, 2018

The current u64 return type of btrfs_dev_replace_cancel() was probably
chosen to match btrfs_ioctl_dev_replace_args::result. However, our
actual return value fits in an int and is typecast to u64 afterwards
anyway, so just return int.
Signed-off-by: Anand Jain <anand.jain@oracle.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>


btrfs: rename __btrfs_dev_replace_cancel() · 17d202b9

Authored by Anand Jain on Feb 12, 2018

Remove the __ prefix, which is meant for special functions.
Signed-off-by: Anand Jain <anand.jain@oracle.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>


btrfs: open code btrfs_dev_replace_cancel() · 97282031

Authored by Anand Jain on Feb 12, 2018

btrfs_dev_replace_cancel() calls __btrfs_dev_replace_cancel() for the
actual cancel, so just open code it.
Signed-off-by: Anand Jain <anand.jain@oracle.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>


btrfs: Don't hardcode the csum size in btrfs_ordered_sum_size · af89e0dc

Authored by Nikolay Borisov on Feb 07, 2018

Currently the function uses a hardcoded value for the checksum size of
a sector. This is fine, given that we currently support only a single
algorithm, whose checksum is 4 bytes == sizeof(u32). Despite not
having other algorithms, btrfs' design supports using a different
algorithm with different space requirements. To future-proof the code,
query the size of the currently used algorithm from the in-memory copy
of the super block. No functional changes.
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: Su Yue <suy.fnst@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>

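The arithmetic involved can be sketched as follows. This is an illustration under stated assumptions, not the kernel function: `demo_csum_info` and `demo_csum_bytes` are hypothetical names, and the sketch computes only the space needed for the per-sector checksums, taking the checksum width from the filesystem description instead of hardcoding sizeof(u32).

```c
#include <stddef.h>

/* Hypothetical description of the values the real code reads elsewhere. */
struct demo_csum_info {
	unsigned int sectorsize;	/* bytes per sector */
	unsigned int csum_size;		/* bytes per checksum, e.g. 4 for a 32-bit sum */
};

/* Bytes of checksum storage needed to cover `bytes` of data. */
static size_t demo_csum_bytes(const struct demo_csum_info *info, size_t bytes)
{
	/* Round up: a partially filled sector still needs a checksum. */
	size_t num_sectors = (bytes + info->sectorsize - 1) / info->sectorsize;

	return num_sectors * info->csum_size;
}
```

With 4 KiB sectors and a 4-byte checksum, 64 KiB of data needs 16 checksums, that is 64 bytes; a wider checksum changes only the `csum_size` field, not the callers.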

Btrfs: extent map selftest: add missing void parameter to btrfs_test_extent_map · 97dc231e

Authored by Colin Ian King on Jan 08, 2018

Add a missing void parameter to the function btrfs_test_extent_map; this
fixes the sparse warning:

warning: non-ANSI function declaration of function 'btrfs_test_extent_map'
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David Sterba <dsterba@suse.com>

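For context: in C before C23, an empty parameter list in a declaration leaves the parameters unspecified, whereas `(void)` declares a proper prototype taking no arguments. A minimal sketch with a hypothetical function name:

```c
/*
 * Before (the form sparse warns about):
 *
 *     static int demo_extent_map_test()
 *
 * After: an explicit prototype that takes no arguments, so a call
 * with arguments is rejected at compile time.
 */
static int demo_extent_map_test(void)
{
	return 0;
}
```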

btrfs: remove redundant check on ret and goto · 7a61f880

Authored by Colin Ian King on Jan 12, 2018

The check for a non-zero ret is redundant as the goto will jump to
the very next statement anyway.  Remove this extraneous code.

Detected by CoverityScan, CID#1463784 ("Identical code for different
branches")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David Sterba <dsterba@suse.com>

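The pattern being removed can be sketched as below; the function names are hypothetical and the code is not taken from btrfs. Both variants return the same value on every path, which is why the check and the goto add nothing.

```c
/* Hypothetical operation that may fail with a negative error code. */
static int demo_step(int fail)
{
	return fail ? -5 : 0;
}

/* Before: the goto jumps to the statement that would run next anyway. */
static int demo_before(int fail)
{
	int ret;

	ret = demo_step(fail);
	if (ret)
		goto out;
out:
	return ret;
}

/* After: same behaviour without the redundant branch. */
static int demo_after(int fail)
{
	return demo_step(fail);
}
```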

btrfs: Remove unused btrfs_start_transaction_lflush function · 7806c6eb

Authored by Nikolay Borisov on Dec 15, 2017

Commit 0e8c36a9 ("Btrfs: fix lots of orphan inodes when the space
is not enough") changed the way transaction reservation is made in
btrfs_evict_inode and as a result this function became unused. This has
been the status quo for 5 years in which time no one noticed, so I'd
say it's safe to assume it's unlikely it will ever be used again.

Historical note: there were earlier attempts to remove the function, but
the reasoning was missing and based only on some static analysis tool
reports. Another reason for rejection was that there seemed to be a
connection to BTRFS_RESERVE_FLUSH_LIMIT, which would then need to be
removed too. This was not correct, so removing the function is all we can
do.
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
[ add the note ]
Signed-off-by: David Sterba <dsterba@suse.com>


btrfs: print error if primary super block write fails · b6a535fa

Authored by Howard McLauchlan on Feb 02, 2018

Presently, failing a primary super block write but succeeding in at
least one super block write in general will appear to users as if
nothing important went wrong. However, upon unmounting and re-mounting,
the file system will be in a rolled back state. This was discovered
with a BCC program that uses bpf_override_return() to fail super block
writes.

This patch outputs an error clarifying that the primary super block
write has failed, so users can expect potentially erroneous behaviour.
It also forces wait_dev_supers() to return an error to its caller if
the primary super block write fails.
Signed-off-by: Howard McLauchlan <hmclauchlan@fb.com>
Reviewed-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>

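The policy described above can be sketched in userspace. This is a simplified model, not the body of wait_dev_supers(): `demo_wait_supers` is a hypothetical function, the write results are passed in as a boolean array with index 0 standing for the primary copy, and the message text is an assumption.

```c
#include <stdbool.h>
#include <stdio.h>

/*
 * write_ok[i] says whether super block copy i was written successfully;
 * index 0 is taken to be the primary copy. Returns 0 on success and -1
 * on failure.
 */
static int demo_wait_supers(const bool *write_ok, int max_mirrors)
{
	int errors = 0;
	bool primary_failed = false;

	for (int i = 0; i < max_mirrors; i++) {
		if (!write_ok[i]) {
			errors++;
			if (i == 0)
				primary_failed = true;
		}
	}

	/* A failed primary write is fatal even if backup copies made it. */
	if (primary_failed) {
		fprintf(stderr, "error writing primary super block\n");
		return -1;
	}

	/* Otherwise it is enough that at least one copy was written. */
	return errors < max_mirrors ? 0 : -1;
}
```

The first return is the behavioural change the message describes: previously a failed primary with a surviving backup would have looked like success to the caller.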

btrfs: Refactor parameter of BTRFS_MAX_DEVS() from root to fs_info · 062d4d1f

Authored by Qu Wenruo on Jan 30, 2018

Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: Anand Jain <anand.jain@oracle.com>
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>


Btrfs: enhance leak debug checker for extent state and extent buffer · af2679e4

Authored by Liu Bo on Jan 25, 2018

This prints out eb->bflags since it contains some useful information,
e.g. whether eb is dirty.
Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
Reviewed-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: David Sterba <dsterba@suse.com>


23 Mar, 2018 1 commit

hugetlbfs: check for pgoff value overflow · 63489f8e

Authored by Mike Kravetz on Mar 22, 2018

A vma with vm_pgoff large enough to overflow a loff_t type when
converted to a byte offset can be passed via the remap_file_pages system
call.  The hugetlbfs mmap routine uses the byte offset to calculate
reservations and file size.

A sequence such as:

  mmap(0x20a00000, 0x600000, 0, 0x66033, -1, 0);
  remap_file_pages(0x20a00000, 0x600000, 0, 0x20000000000000, 0);

will result in the following when the task exits or the file is closed:

  kernel BUG at mm/hugetlb.c:749!
  Call Trace:
    hugetlbfs_evict_inode+0x2f/0x40
    evict+0xcb/0x190
    __dentry_kill+0xcb/0x150
    __fput+0x164/0x1e0
    task_work_run+0x84/0xa0
    exit_to_usermode_loop+0x7d/0x80
    do_syscall_64+0x18b/0x190
    entry_SYSCALL_64_after_hwframe+0x3d/0xa2

The overflowed pgoff value causes hugetlbfs to try to set up a mapping
with a negative range (end < start) that leaves invalid state which
causes the BUG.

The previous overflow fix to this code was incomplete and did not take
the remap_file_pages system call into account.

[mike.kravetz@oracle.com: v3]
  Link: http://lkml.kernel.org/r/20180309002726.7248-1-mike.kravetz@oracle.com
[akpm@linux-foundation.org: include mmdebug.h]
[akpm@linux-foundation.org: fix -ve left shift count on sh]
Link: http://lkml.kernel.org/r/20180308210502.15952-1-mike.kravetz@oracle.com
Fixes: 045c7a3f ("hugetlbfs: fix offset overflow in hugetlbfs mmap")
Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
Reported-by: Nic Losby <blurbdust@gmail.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: "Kirill A . Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Yisheng Xie <xieyisheng1@huawei.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

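The overflow itself is easy to reproduce in userspace arithmetic. The sketch below is not the check added by the patch; it is a hypothetical helper that assumes a 4 KiB page (`DEMO_PAGE_SHIFT` of 12) and a 64-bit signed file offset, and tests whether a page offset plus a byte length can still be expressed as such an offset.

```c
#include <stdbool.h>
#include <stdint.h>

#define DEMO_PAGE_SHIFT 12	/* assumption: 4 KiB pages */

/*
 * Returns true if the byte range starting at page offset `pgoff` with
 * length `len` bytes cannot be represented in a signed 64-bit offset.
 */
static bool demo_range_overflows(uint64_t pgoff, uint64_t len)
{
	const uint64_t max_off = (uint64_t)INT64_MAX;

	/* The start offset in bytes must itself be representable. */
	if (pgoff > (max_off >> DEMO_PAGE_SHIFT))
		return true;

	/* And start + len must not run past the largest offset. */
	return len > max_off - (pgoff << DEMO_PAGE_SHIFT);
}
```

For the remap_file_pages() call quoted in the message, pgoff is 0x20000000000000 (2^53), so the byte offset would be 2^65 and the first test already fails.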

20 Mar, 2018 2 commits

sysfs: symlink: export sysfs_create_link_nowarn() · 2399ac42

Authored by Grygorii Strashko on Mar 16, 2018

sysfs_create_link_nowarn() is going to be used in a subsequent patch by
the phylib framework, which can be built as a module. Hence, export
sysfs_create_link_nowarn() to avoid build errors.

Cc: Florian Fainelli <f.fainelli@gmail.com>
Cc: Andrew Lunn <andrew@lunn.ch>
Fixes: a3995460 ("net: phy: Relax error checking on sysfs_create_link()")
Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>


nfsd: remove blocked locks on client teardown · 68ef3bc3

Authored by Jeff Layton on Mar 16, 2018

We had some reports of panics in nfsd4_lm_notify, and that showed a
nfs4_lockowner that had outlived its so_client.

Ensure that we walk any leftover lockowners after tearing down all of
the stateids, and remove any blocked locks that they hold.

With this change, we also don't need to walk the nbl_lru on nfsd_net
shutdown, as that will happen naturally when we tear down the clients.

Fixes: 76d348fa (nfsd: have nfsd4_lock use blocking locks for v4.1+ locks)
Reported-by: Frank Sorenson <fsorenso@redhat.com>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Cc: stable@vger.kernel.org # 4.9
Signed-off-by: J. Bruce Fields <bfields@redhat.com>


16 Mar, 2018 2 commits

Revert "btrfs: use proper endianness accessors for super_copy" · 093e037c

Authored by David Sterba on Mar 16, 2018

This reverts commit 3c181c12.

The offending patch was merged in 4.16-rc4 and was promptly applied to
stable kernels 4.14.25 and 4.15.8.

The patch causes corruption in several superblock items on big-endian
machines because of messed up endianness conversions. The damage is
manually repairable. A filesystem cannot be mounted again after it has
been unmounted once.

We do a full revert and not a fixup so stable can pick that patch ASAP.

Fixes: 3c181c12 ("btrfs: use proper endianness accessors for super_copy")
Link: https://lkml.kernel.org/r/1521139304@msgid.manchmal.in-ulm.de
CC: stable@vger.kernel.org # 4.14+
Reported-by: Christoph Biedl <linux-kernel.bfrz@manchmal.in-ulm.de>
Signed-off-by: David Sterba <dsterba@suse.com>


fs: Teach path_connected to handle nfs filesystems with multiple roots. · 95dd7758

Authored by Eric W. Biederman on Mar 14, 2018

On nfsv2 and nfsv3 the nfs server can export subsets of the same
filesystem and report the same filesystem identifier, so that the nfs
client can know they are the same filesystem.  The subsets can be from
disjoint directory trees.  The nfsv2 and nfsv3 filesystems provide no
way to find the common root of all directory trees exported from the
server with the same filesystem identifier.

The practical result is that, for nfs, s_root in struct super_block is
not necessarily the root of the filesystem.  The nfs mount code sets
s_root to the root of the first subset of the nfs filesystem that the
kernel mounts.

This affects the dcache invalidation code in generic_shutdown_super,
currently called shrink_dcache_for_umount, and that code has for years
gone through an additional list of dentries that might be dentry
trees that need to be freed to accommodate nfs.

When I wrote path_connected I did not realize nfs was so special, and
its heuristic for avoiding calling is_subdir can fail.

The practical case where this fails is when there is a move of a
directory from the subtree exposed by one nfs mount to the subtree
exposed by another nfs mount.  This move can happen either locally or
remotely.  The remote case requires that the moved directory be cached
before the move, and that after the move someone walks the path
to where the moved directory now exists and in so doing causes the
already cached directory to be moved in the dcache through the magic
of d_splice_alias.

If someone whose working directory is in the moved directory or a
subdirectory now starts resolving .. from the initial mount of nfs
(where s_root == mnt_root), then path_connected as a heuristic will
not bother with the is_subdir check.  As s_root really is not the root
of the nfs filesystem this heuristic is wrong, and the path may
actually not be connected and path_connected can fail.

The is_subdir function might be cheap enough that we can call it
unconditionally.  Verifying that will take some benchmarking and
the result may not be the same on all kernels this fix needs
to be backported to.  So I am avoiding that for now.

Filesystems with snapshots such as nilfs and btrfs do something
similar.  But as the directory trees of the snapshots are disjoint
from one another and from the main directory tree, rename won't move
things between them and this problem will not occur.

Cc: stable@vger.kernel.org
Reported-by: Al Viro <viro@ZenIV.linux.org.uk>
Fixes: 397d425d ("vfs: Test for and handle paths that are unreachable from their mnt_root")
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

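The heuristic and its failure mode can be modelled with a small parent-pointer tree. This is a userspace sketch under stated assumptions, not the kernel code: the `demo_` names are hypothetical, a root is represented by a NULL parent, and the `multiroot` parameter is an assumed way of telling the check that s_root cannot be trusted as the single root of the filesystem.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical directory entry: only the parent link matters here. */
struct demo_dentry {
	struct demo_dentry *parent;	/* NULL at the top of a tree */
};

/* Walk up from d; true if root is d itself or one of its ancestors. */
static bool demo_is_subdir(const struct demo_dentry *d,
			   const struct demo_dentry *root)
{
	for (; d; d = d->parent)
		if (d == root)
			return true;
	return false;
}

/*
 * Shortcut: if the mount root is the superblock root, every dentry is
 * assumed reachable and the walk is skipped. With several disjoint
 * roots that assumption does not hold, so the walk must be done.
 */
static bool demo_path_connected(const struct demo_dentry *dentry,
				const struct demo_dentry *mnt_root,
				const struct demo_dentry *s_root,
				bool multiroot)
{
	if (!multiroot && mnt_root == s_root)
		return true;
	return demo_is_subdir(dentry, mnt_root);
}
```

With two disjoint trees and a dentry that lives in the second one, the shortcut reports the path as connected to the first tree's root; only the explicit walk notices that it is not.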

15 Mar, 2018 4 commits

btrfs: add missing initialization in btrfs_check_shared · 18bf591b

Authored by Edmund Nadolski on Mar 14, 2018

This patch addresses an issue that causes fiemap to falsely
report a shared extent.  The test case is as follows:

xfs_io -f -d -c "pwrite -b 16k 0 64k" -c "fiemap -v" /media/scratch/file5
sync
xfs_io  -c "fiemap -v" /media/scratch/file5

which gives the resulting output:

wrote 65536/65536 bytes at offset 0
64 KiB, 4 ops; 0.0000 sec (121.359 MiB/sec and 7766.9903 ops/sec)
/media/scratch/file5:
 EXT: FILE-OFFSET      BLOCK-RANGE      TOTAL FLAGS
   0: [0..127]:        24576..24703       128 0x2001
/media/scratch/file5:
 EXT: FILE-OFFSET      BLOCK-RANGE      TOTAL FLAGS
   0: [0..127]:        24576..24703       128   0x1

This is because btrfs_check_shared calls find_parent_nodes
repeatedly in a loop, passing a share_check struct to report
the count of shared extents. But btrfs_check_shared does not
re-initialize the count value to zero for subsequent calls
from the loop, resulting in a false share count value. This
is a regression from 4.13.

With proper re-initialization the test result is as follows:

wrote 65536/65536 bytes at offset 0
64 KiB, 4 ops; 0.0000 sec (110.035 MiB/sec and 7042.2535 ops/sec)
/media/scratch/file5:
 EXT: FILE-OFFSET      BLOCK-RANGE      TOTAL FLAGS
   0: [0..127]:        24576..24703       128   0x1
/media/scratch/file5:
 EXT: FILE-OFFSET      BLOCK-RANGE      TOTAL FLAGS
   0: [0..127]:        24576..24703       128   0x1

which corrects the regression.

Fixes: 3ec4d323 ("btrfs: allow backref search checks for shared extents")
Signed-off-by: Edmund Nadolski <enadolski@suse.com>
[ add text from cover letter to changelog ]
Signed-off-by: David Sterba <dsterba@suse.com>

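The class of bug described here, a counter that accumulates across loop iterations, can be shown with a simplified model. The names below are hypothetical and the counting is far simpler than the real backref walk; `reinit` selects between the buggy and the corrected behaviour.

```c
/* Hypothetical stand-in for the struct that carries the share count. */
struct demo_share_check {
	int share_count;
};

/* Stand-in for one lookup: adds the references found at this step. */
static void demo_find_refs(struct demo_share_check *sc, int refs)
{
	sc->share_count += refs;
}

/*
 * Returns 1 if any single step sees more than one reference, else 0.
 * Without the reset, counts from earlier steps leak into later ones.
 */
static int demo_check_shared(const int *refs_per_step, int steps, int reinit)
{
	struct demo_share_check sc = { .share_count = 0 };

	for (int i = 0; i < steps; i++) {
		if (reinit)
			sc.share_count = 0;	/* the missing initialization */
		demo_find_refs(&sc, refs_per_step[i]);
		if (sc.share_count > 1)
			return 1;
	}
	return 0;
}
```

Two steps with one reference each are reported as shared when the counter is not reset, which corresponds to the spurious 0x2001 flag in the first fiemap output.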

btrfs: Fix NULL pointer exception in find_bio_stripe · 047fdea6

Authored by Dmitriy Gorokh on Feb 16, 2018

When a disk that is part of a RAID6 filesystem is detached, the
following kernel oops may happen:

[63122.680461] BTRFS error (device sdo): bdev /dev/sdo errs: wr 0, rd 0, flush 1, corrupt 0, gen 0
[63122.719584] BTRFS warning (device sdo): lost page write due to IO error on /dev/sdo
[63122.719587] BTRFS error (device sdo): bdev /dev/sdo errs: wr 1, rd 0, flush 1, corrupt 0, gen 0
[63122.803516] BTRFS warning (device sdo): lost page write due to IO error on /dev/sdo
[63122.803519] BTRFS error (device sdo): bdev /dev/sdo errs: wr 2, rd 0, flush 1, corrupt 0, gen 0
[63122.863902] BTRFS critical (device sdo): fatal error on device /dev/sdo
[63122.935338] BUG: unable to handle kernel NULL pointer dereference at 0000000000000080
[63122.946554] IP: fail_bio_stripe+0x58/0xa0 [btrfs]
[63122.958185] PGD 9ecda067 P4D 9ecda067 PUD b2b37067 PMD 0
[63122.971202] Oops: 0000 [#1] SMP
[63123.006760] CPU: 0 PID: 3979 Comm: kworker/u8:9 Tainted: G W 4.14.2-16-scst34x+ #8
[63123.007091] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
[63123.007402] Workqueue: btrfs-worker btrfs_worker_helper [btrfs]
[63123.007595] task: ffff880036ea4040 task.stack: ffffc90006384000
[63123.007796] RIP: 0010:fail_bio_stripe+0x58/0xa0 [btrfs]
[63123.007968] RSP: 0018:ffffc90006387ad8 EFLAGS: 00010287
[63123.008140] RAX: 0000000000000002 RBX: ffff88004beaa0b8 RCX: ffff8800b2bd5690
[63123.008359] RDX: 0000000000000000 RSI: ffff88007bb43500 RDI: ffff88004beaa000
[63123.008621] RBP: ffffc90006387ae8 R08: 0000000099100000 R09: ffff8800b2bd5600
[63123.008840] R10: 0000000000000004 R11: 0000000000010000 R12: ffff88007bb43500
[63123.009059] R13: 00000000fffffffb R14: ffff880036fc5180 R15: 0000000000000004
[63123.009278] FS: 0000000000000000(0000) GS:ffff8800b7000000(0000) knlGS:0000000000000000
[63123.009564] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[63123.009748] CR2: 0000000000000080 CR3: 00000000b0866000 CR4: 00000000000406f0
[63123.009969] Call Trace:
[63123.010085] raid_write_end_io+0x7e/0x80 [btrfs]
[63123.010251] bio_endio+0xa1/0x120
[63123.010378] generic_make_request+0x218/0x270
[63123.010921] submit_bio+0x66/0x130
[63123.011073] finish_rmw+0x3fc/0x5b0 [btrfs]
[63123.011245] full_stripe_write+0x96/0xc0 [btrfs]
[63123.011428] raid56_parity_write+0x117/0x170 [btrfs]
[63123.011604] btrfs_map_bio+0x2ec/0x320 [btrfs]
[63123.011759] ? ___cache_free+0x1c5/0x300
[63123.011909] __btrfs_submit_bio_done+0x26/0x50 [btrfs]
[63123.012087] run_one_async_done+0x9c/0xc0 [btrfs]
[63123.012257] normal_work_helper+0x19e/0x300 [btrfs]
[63123.012429] btrfs_worker_helper+0x12/0x20 [btrfs]
[63123.012656] process_one_work+0x14d/0x350
[63123.012888] worker_thread+0x4d/0x3a0
[63123.013026] ? _raw_spin_unlock_irqrestore+0x15/0x20
[63123.013192] kthread+0x109/0x140
[63123.013315] ? process_scheduled_works+0x40/0x40
[63123.013472] ? kthread_stop+0x110/0x110
[63123.013610] ret_from_fork+0x25/0x30
[63123.014469] RIP: fail_bio_stripe+0x58/0xa0 [btrfs] RSP: ffffc90006387ad8
[63123.014678] CR2: 0000000000000080
[63123.016590] ---[ end trace a295ea7259c17880 ]---

This is reproducible in a loop where a series of writes is followed by
a SCSI device delete command. The test may take up to a few minutes.

Fixes: 74d46992 ("block: replace bi_bdev with a gendisk pointer and partitions index")
[ no signed-off-by provided ]
Author: Dmitriy Gorokh <Dmitriy.Gorokh@wdc.com>
Reviewed-by: Liu Bo <bo.li.liu@oracle.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>


fs/aio: Use RCU accessors for kioctx_table->table[] · d0264c01

Authored by Tejun Heo on Mar 14, 2018

While converting ioctx index from a list to a table, db446a08
("aio: convert the ioctx list to table lookup v3") missed tagging
kioctx_table->table[] as an array of RCU pointers and using the
appropriate RCU accessors.  This introduces a small window in the
lookup path where init and access may race.

Mark kioctx_table->table[] with __rcu and use the appropriate RCU
accessors when using the field.
Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Jann Horn <jannh@google.com>
Fixes: db446a08 ("aio: convert the ioctx list to table lookup v3")
Cc: Benjamin LaHaise <bcrl@kvack.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: stable@vger.kernel.org # v3.12+


fs/aio: Add explicit RCU grace period when freeing kioctx · a6d7cff4

Authored by Tejun Heo on Mar 14, 2018

While fixing refcounting, e34ecee2 ("aio: Fix a trinity splat")
incorrectly removed the explicit RCU grace period before freeing kioctx.
The intention seems to have been to depend on the internal RCU grace
periods of percpu_ref; however, percpu_ref uses a different flavor of
RCU, sched-RCU.  This can lead to kioctx being freed while RCU read
protected dereferences are still in progress.

Fix it by updating free_ioctx() to go through call_rcu() explicitly.

v2: Comment added to explain double bouncing.
Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Jann Horn <jannh@google.com>
Fixes: e34ecee2 ("aio: Fix a trinity splat")
Cc: Kent Overstreet <kent.overstreet@gmail.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: stable@vger.kernel.org # v3.13+


openanolis / cloud-kernel: synced successfully almost 2 years ago
