Commits · 6e5cbea199c6e2587487c2556f49c3d349115c2c · openanolis / cloud-kernel

02 Sep, 2020 40 commits

io_uring: always allow drain/link/hardlink/async sqe flags · 6e5cbea1

Authored by Daniele Albano on Jul 18, 2020

to #29608102

commit 61710e437f2807e26a3402543bdbb7217a9c8620 upstream.

We currently filter these for timeout_remove/async_cancel/files_update,
but we should only be filtering for fixed file and buffer select. This
also causes a second read of sqe->flags, which isn't needed.

Just check req->flags for the relevant bits. This then allows these
commands to be used in links, for example, like everything else.
Signed-off-by: Daniele Albano <d.albano@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Jiufei Xue <jiufei.xue@linux.alibaba.com>
Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com>
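
The flag filtering described in the io_uring commit above can be sketched in user space. This is only an illustration under stated assumptions: the SK_* names, prep_strict() and prep_relaxed() are hypothetical stand-ins, not the kernel's identifiers, and the "strict" variant assumes the old code rejected any non-zero flags value.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Hypothetical stand-ins for the sqe flag bits (illustrative values). */
enum {
    SK_FIXED_FILE    = 1u << 0,
    SK_IO_DRAIN      = 1u << 1,
    SK_IO_LINK       = 1u << 2,
    SK_IO_HARDLINK   = 1u << 3,
    SK_ASYNC         = 1u << 4,
    SK_BUFFER_SELECT = 1u << 5,
};

/* Assumed old behaviour: any flag at all makes the prep step fail. */
int prep_strict(uint32_t flags)
{
    return flags ? -EINVAL : 0;
}

/* Behaviour described by the commit: reject only fixed file and
 * buffer select; drain/link/hardlink/async pass through. */
int prep_relaxed(uint32_t flags)
{
    return (flags & (SK_FIXED_FILE | SK_BUFFER_SELECT)) ? -EINVAL : 0;
}

/* Usage example: a linked, async request is refused by the strict
 * check but accepted by the relaxed one. */
void check_flag_filter(void)
{
    assert(prep_strict(SK_IO_LINK | SK_ASYNC) == -EINVAL);
    assert(prep_relaxed(SK_IO_LINK | SK_ASYNC) == 0);
    assert(prep_relaxed(SK_FIXED_FILE) == -EINVAL);
}
```

Masking against the two disallowed bits, rather than comparing the whole field with zero, is what lets such commands take part in links.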

io_uring: ensure double poll additions work with both request types · 8d834b0c

Authored by Jens Axboe on Jul 17, 2020

to #29608102

commit 807abcb0883439af5ead73f3308310453b97b624 upstream.

The double poll additions were centered around doing POLL_ADD on file
descriptors that use more than one waitqueue (typically one for read,
one for write) when being polled. However, it can also end up being
triggered for when we use poll triggered retry. For that case, we cannot
safely use req->io, as that could be used by the request type itself.

Add a second io_poll_iocb pointer in the structure we allocate for poll
based retry, and ensure we use the right one from the two paths.

Fixes: 18bceab101ad ("io_uring: allow POLL_ADD with double poll_wait() users")
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Jiufei Xue <jiufei.xue@linux.alibaba.com>
Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com>

ovl: initialize error in ovl_copy_xattr · 7db5692f

Authored by Yuxuan Shui on May 27, 2020

to #28557782

commit 520da69d265a91c6536c63851cbb8a53946974f0 upstream.

In ovl_copy_xattr, if all the xattrs to be copied are overlayfs private
xattrs, the copy loop will terminate without assigning anything to the
error variable, thus returning an uninitialized value.

If ovl_copy_xattr is called from ovl_clear_empty, this uninitialized error
value is put into a pointer by ERR_PTR(), causing potential invalid memory
accesses down the line.

This commit initializes error with 0. This is the correct value because
when there is no xattr to copy (all xattrs are private), ovl_copy_xattr
should succeed.

This bug was discovered with the help of INIT_STACK_ALL and clang.
Signed-off-by: Yuxuan Shui <yshuiv7@gmail.com>
Link: https://bugs.chromium.org/p/chromium/issues/detail?id=1050405
Fixes: 0956254a ("ovl: don't copy up opaqueness")
Cc: stable@vger.kernel.org # v4.8
Signed-off-by: Alexander Potapenko <glider@google.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Jeffle Xu <jefflexu@linux.alibaba.com>
Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com>
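
The pattern behind the ovl_copy_xattr fix above can be shown with a small user-space sketch. Everything here is an assumption made for illustration: PRIVATE_PREFIX, is_private_xattr(), copy_one_xattr() and copy_xattrs() are hypothetical names, and the copy step is a stub that always succeeds. The point is only that a loop which may skip every entry needs its result variable initialized before the loop.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumed prefix marking "private" attributes in this sketch. */
#define PRIVATE_PREFIX "trusted.overlay."

static int is_private_xattr(const char *name)
{
    return strncmp(name, PRIVATE_PREFIX, sizeof(PRIVATE_PREFIX) - 1) == 0;
}

/* Hypothetical stand-in for copying one attribute; always succeeds. */
static int copy_one_xattr(const char *name)
{
    (void)name;
    return 0;
}

/* Copies every non-private name; returns 0 or the first error. */
int copy_xattrs(const char *const *names, size_t count, size_t *copied)
{
    int error = 0;  /* without this, an all-private list returns garbage */
    size_t i;

    *copied = 0;
    for (i = 0; i < count; i++) {
        if (is_private_xattr(names[i]))
            continue;  /* skipped entries never assign to error */
        error = copy_one_xattr(names[i]);
        if (error)
            break;
        (*copied)++;
    }
    return error;
}

/* Usage example: an all-private list succeeds with nothing copied. */
void check_xattr_copy(void)
{
    const char *all_private[] = { "trusted.overlay.opaque",
                                  "trusted.overlay.origin" };
    const char *mixed[] = { "trusted.overlay.opaque", "user.comment" };
    size_t copied;

    assert(copy_xattrs(all_private, 2, &copied) == 0 && copied == 0);
    assert(copy_xattrs(mixed, 2, &copied) == 0 && copied == 1);
}
```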

xfs: add agf freeblocks verify in xfs_agf_verify · 028b4911

Authored by Zheng Bin on Feb 21, 2020

to #28557760

[ Upstream commit d0c7feaf87678371c2c09b3709400be416b2dc62 ]

We recently used a fuzzer (hydra) to test XFS and automatically generate
tmp.img (XFS v5 format, but with some incorrect metadata).

xfs_repair output (just one AG):
agf_freeblks 0, counted 3224 in ag 0
agf_longest 536874136, counted 3224 in ag 0
sb_fdblocks 613, counted 3228

Test as follows:
mount tmp.img tmpdir
cp file1M tmpdir
sync

In 4.19-stable, sync gets stuck. The reason is:
xfs_mountfs
  xfs_check_summary_counts
    if ((!xfs_sb_version_haslazysbcount(&mp->m_sb) ||
       XFS_LAST_UNMOUNT_WAS_CLEAN(mp)) &&
       !xfs_fs_has_sickness(mp, XFS_SICK_FS_COUNTERS))
	return 0;  --> just returns, incore sb_fdblocks is still 613
    xfs_initialize_perag_data

cp file1M tmpdir -->ok(write file to pagecache)
sync -->stuck(write pagecache to disk)
xfs_map_blocks
  xfs_iomap_write_allocate
    while (count_fsb != 0) {
      nimaps = 0;
      while (nimaps == 0) { --> endless loop
         nimaps = 1;
         xfs_bmapi_write(..., &nimaps) --> nimaps becomes 0 again
xfs_bmapi_write
  xfs_bmap_alloc
    xfs_bmap_btalloc
      xfs_alloc_vextent
        xfs_alloc_fix_freelist
          xfs_alloc_space_available -->fail(agf_freeblks is 0)

In linux-next, sync does not get stuck, because commit c2b3164320b5 ("xfs:
use the latest extent at writeback delalloc conversion time") removed
the above while loop. dmesg shows the following:
[   55.250114] XFS (loop0): page discard on page ffffea0008bc7380, inode 0x1b0c, offset 0.

Users do not know why this page is discarded. The better solutions are:
1. Like xfs_repair, make sure sb_fdblocks is equal to the counted value
(xfs_initialize_perag_data does this, but it is not called at this mount)
2. Add an AGF verifier check; if it fails, tell users to run repair

This patch uses the second solution.
Signed-off-by: Zheng Bin <zhengbin13@huawei.com>
Signed-off-by: Ren Xudong <renxudong1@huawei.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Jeffle Xu <jefflexu@linux.alibaba.com>
Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com>
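
A rough sketch of the kind of consistency check the xfs commit above calls for, applied to the xfs_repair numbers quoted in the message. The struct and function names are hypothetical, the AG length of 4096 blocks is an assumed value that does not come from the report, and the exact conditions in the upstream verifier may differ from these two.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, already byte-swapped view of three AGF counters. */
struct agf_counts {
    uint32_t length;    /* size of the AG in blocks (assumed field) */
    uint32_t freeblks;  /* total free blocks in the AG */
    uint32_t longest;   /* longest free extent in the AG */
};

/* The longest free extent cannot exceed the free block count, and the
 * free block count cannot exceed the size of the AG. */
bool agf_counts_plausible(const struct agf_counts *agf)
{
    if (agf->freeblks < agf->longest)
        return false;
    if (agf->freeblks > agf->length)
        return false;
    return true;
}

/* Usage example with the fuzzed values from the commit message. */
void check_agf_examples(void)
{
    struct agf_counts fuzzed = { .length = 4096, .freeblks = 0,
                                 .longest = 536874136u };
    struct agf_counts sane = { .length = 4096, .freeblks = 3224,
                               .longest = 3224 };

    assert(!agf_counts_plausible(&fuzzed));
    assert(agf_counts_plausible(&sane));
}
```

Failing such a check at read time surfaces the corruption as a verifier error, instead of a hang or an unexplained page discard during writeback.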

ext4: fix race between ext4_sync_parent() and rename() · 727bd990

Authored by Eric Biggers on May 06, 2020

to #28557685

commit 08adf452e628b0e2ce9a01048cfbec52353703d7 upstream.

'igrab(d_inode(dentry->d_parent))' without holding dentry->d_lock is
broken because without d_lock, d_parent can be concurrently changed due
to a rename().  Then if the old directory is immediately deleted, old
d_parent->inode can be NULL.  That causes a NULL dereference in igrab().

To fix this, use dget_parent() to safely grab a reference to the parent
dentry, which pins the inode.  This also eliminates the need to use
d_find_any_alias() other than for the initial inode, as we no longer
throw away the dentry at each step.

This is an extremely hard race to hit, but it is possible.  Adding a
udelay() in between the reads of ->d_parent and its ->d_inode makes it
reproducible on a no-journal filesystem using the following program:

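    #include <fcntl.h>
    #include <stdio.h>      /* rename() */
    #include <sys/stat.h>   /* mkdir() */
    #include <unistd.h>

    int main(void)
    {
        if (fork()) {
            /* Parent: keep recreating dir1/file with synchronous writes. */
            for (;;) {
                mkdir("dir1", 0700);
                /* O_CREAT requires an explicit mode argument. */
                int fd = open("dir1/file", O_RDWR|O_CREAT|O_SYNC, 0600);
                write(fd, "X", 1);
                close(fd);
            }
        } else {
            /* Child: keep moving the file away and deleting its old parent. */
            mkdir("dir2", 0700);
            for (;;) {
                rename("dir1/file", "dir2/file");
                rmdir("dir1");
            }
        }
    }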

Fixes: d59729f4 ("ext4: fix races in ext4_sync_parent()")
Cc: stable@vger.kernel.org
Signed-off-by: Eric Biggers <ebiggers@google.com>
Link: https://lore.kernel.org/r/20200506183140.541194-1-ebiggers@kernel.org
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Jeffle Xu <jefflexu@linux.alibaba.com>
Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com>

ext4: fix EXT_MAX_EXTENT/INDEX to check for zeroed eh_max · d9bf1840

Authored by Harshad Shirwadkar on Apr 20, 2020

to #28557685

commit c36a71b4e35ab35340facdd6964a00956b9fef0a upstream.

If eh->eh_max is 0, EXT_MAX_EXTENT/INDEX would evaluate to unsigned
(-1) resulting in illegal memory accesses. Although there is no
consistent repro, we see that generic/019 sometimes crashes because of
this bug.

Ran gce-xfstests smoke and verified that there were no regressions.
Signed-off-by: Harshad Shirwadkar <harshadshirwadkar@gmail.com>
Link: https://lore.kernel.org/r/20200421023959.20879-2-harshadshirwadkar@gmail.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Jeffle Xu <jefflexu@linux.alibaba.com>
Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com>
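
The wraparound described in the EXT_MAX_EXTENT/INDEX commit above can be reproduced with plain index arithmetic. This is a sketch under assumptions: struct header, struct slot and both helpers are hypothetical, and it uses an array index where the kernel macros do pointer arithmetic on the extent header.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct slot {
    uint32_t block;
};

/* Hypothetical header: max is the number of slots that may be used. */
struct header {
    uint16_t max;
    struct slot slots[4];
};

/* Unguarded "max - 1": wraps to the largest unsigned value when max
 * is 0, which is the out-of-range index the commit warns about. */
size_t max_index_unchecked(const struct header *h)
{
    return (size_t)h->max - 1;
}

/* Guarded variant: report "no last slot" instead of wrapping. */
struct slot *max_slot_checked(struct header *h)
{
    return h->max ? &h->slots[h->max - 1] : NULL;
}

/* Usage example: a zeroed header wraps in one helper, yields NULL in
 * the other. */
void check_max_slot_examples(void)
{
    struct header zeroed = { 0 };
    struct header full = { .max = 4 };

    assert(max_index_unchecked(&zeroed) == SIZE_MAX);
    assert(max_slot_checked(&zeroed) == NULL);
    assert(max_slot_checked(&full) == &full.slots[3]);
}
```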

ext4: disable dioread_nolock whenever delayed allocation is disabled · 8c6a9862

Authored by Eric Whitney on Mar 19, 2020

fix #29455282

commit c8980e1980ccdc2229aa2218d532ddc62e0aabe5 upstream

The patch "ext4: make dioread_nolock the default" (244adf6426ee) causes
generic/422 to fail when run in kvm-xfstests' ext3conv test case. This
applies both the dioread_nolock and nodelalloc mount options, a
combination not previously tested by kvm-xfstests. The failure occurs
because the dioread_nolock code path splits a previously fallocated
multiblock extent into a series of single block extents when overwriting
a portion of that extent. That causes allocation of an extent tree leaf
node and a reshuffling of extents. Once writeback is completed, the
individual extents are recombined into a single extent, the extent is
moved again, and the leaf node is deleted. The difference in block
utilization before and after writeback due to the leaf node triggers the
failure.

The original reason for this behavior was to avoid ENOSPC when handling
I/O completions during writeback in the dioread_nolock code paths when
delayed allocation is disabled. It may no longer be necessary, because
code was added in the past to reserve extra space to solve this problem
when delayed allocation is enabled, and this code may also apply when
delayed allocation is disabled. Until this can be verified, don't use
the dioread_nolock code paths if delayed allocation is disabled.
Signed-off-by: Eric Whitney <enwlinux@gmail.com>
Link: https://lore.kernel.org/r/20200319150028.24592-1-enwlinux@gmail.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Jeffle Xu <jefflexu@linux.alibaba.com>
Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com>

alinux: virtiofs: simplify mount options · d068535c

Authored by Liu Bo on Apr 09, 2020

task #28910367
Rather than explicitly specifying "-o
default_permissions,allow_other", virtiofs can set some default values
for them.

With this, we can simply do
"mount -t virtio_fs atest /mnt/test/ -otag=myfs-1,dax".
Signed-off-by: Liu Bo <bo.liu@linux.alibaba.com>
Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com>

alinux: virtio-fs: export fuse_request_free · e6067150

Authored by Liu Bo on Jul 25, 2020

task #28910367
virtio-fs will need to use it from outside fs/fuse/dev.c.
Make the symbol visible.
Signed-off-by: Liu Bo <bo.liu@linux.alibaba.com>
Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com>

fuse: Support RENAME_WHITEOUT flag · ebea99bf

Authored by Vivek Goyal on Feb 05, 2020

task #28910367
commit 519525fa47b5a8155f0b203e49a3a6a2319f75ae upstream

Allow fuse to pass RENAME_WHITEOUT to the fuse server.  Overlayfs on top
of virtiofs uses RENAME_WHITEOUT.

Without this patch, renaming a directory in overlayfs (dir is on lower)
fails with -EINVAL. With this patch it works.
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
(cherry picked from commit 519525fa47b5a8155f0b203e49a3a6a2319f75ae)
Signed-off-by: Liu Bo <bo.liu@linux.alibaba.com>
Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com>

virtiofs: Use completions while waiting for queue to be drained · 88fa38fa

Authored by Vivek Goyal on Oct 30, 2019

task #28910367
commit 724c15a43e2c7ac26e2d07abef99191162498fa9 upstream

While we wait for the queue to finish draining, use completions instead
of usleep_range(). This is a better way of waiting for an event.
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
(cherry picked from commit 724c15a43e2c7ac26e2d07abef99191162498fa9)
Signed-off-by: Liu Bo <bo.liu@linux.alibaba.com>
Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com>

virtiofs: Do not send forget request "struct list_head" element · 2a6ae53e

Authored by Vivek Goyal on Oct 30, 2019

task #28910367
commit 1efcf39eb627573f8d543ea396cf36b0651b1e56 upstream

We are sending the whole virtio_fs_forget struct to the other end over
the virtqueue. The other end does not need to see elements like "struct
list". That's an internal detail of the guest kernel. Fix it.
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
(cherry picked from commit 1efcf39eb627573f8d543ea396cf36b0651b1e56)
Signed-off-by: Liu Bo <bo.liu@linux.alibaba.com>
Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com>
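
The idea in the forget commit above, sending only the wire-format part of a request and keeping list linkage private, can be sketched as follows. All names (list_node, forget_wire, forget_item, pack_forget) and the two payload fields are hypothetical and chosen for illustration; they are not the layout used by virtiofs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Stand-in for an intrusive list link; meaningful only locally. */
struct list_node {
    struct list_node *next, *prev;
};

/* Hypothetical payload that the other end is meant to see. */
struct forget_wire {
    uint64_t nodeid;
    uint64_t nlookup;
};

/* Queue element: payload plus bookkeeping that must stay private. */
struct forget_item {
    struct forget_wire wire;
    struct list_node link;
};

/* Copies only the wire part into buf. Returns the number of bytes
 * written, or 0 if buf is too small. */
size_t pack_forget(const struct forget_item *item, unsigned char *buf,
                   size_t cap)
{
    if (cap < sizeof(item->wire))
        return 0;
    memcpy(buf, &item->wire, sizeof(item->wire));
    return sizeof(item->wire);
}

/* Usage example: the packed size matches the payload, not the item. */
void check_forget_examples(void)
{
    struct forget_item item = { .wire = { .nodeid = 7, .nlookup = 1 } };
    unsigned char buf[64];
    struct forget_wire out;

    assert(pack_forget(&item, buf, sizeof(buf)) == sizeof(struct forget_wire));
    memcpy(&out, buf, sizeof(out));
    assert(out.nodeid == 7 && out.nlookup == 1);
    assert(pack_forget(&item, buf, 4) == 0);
}
```

Grouping the payload in its own struct means the size passed to the transport is sizeof the payload, so pointers from the local list never leave the guest.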

virtiofs: Use a common function to send forget · a6d9f512

Authored by Vivek Goyal on Oct 30, 2019

task #28910367
commit 58ada94f95f71d4f73197ab0e9603dbba6e47fe3 upstream

Currently we are duplicating the logic to send forgets in two
places. Consolidate the code by calling one helper function.

This also uses virtqueue_add_outbuf() instead of
virtqueue_add_sgs(). The former is simpler to call.
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
(cherry picked from commit 58ada94f95f71d4f73197ab0e9603dbba6e47fe3)
Signed-off-by: Liu Bo <bo.liu@linux.alibaba.com>
Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com>

virtiofs: Fix old-style declaration · 8270fcad

Authored by YueHaibing on Nov 11, 2019

task #28910367
commit 00929447f5758c4f64c74d0a4b40a6eb3d9df0e3 upstream

The compiler expects the 'static' keyword to come first in a declaration,
and we get warnings like this with "make W=1":

fs/fuse/virtio_fs.c:687:1: warning: 'static' is not at beginning of declaration [-Wold-style-declaration]
fs/fuse/virtio_fs.c:692:1: warning: 'static' is not at beginning of declaration [-Wold-style-declaration]
fs/fuse/virtio_fs.c:1029:1: warning: 'static' is not at beginning of declaration [-Wold-style-declaration]
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
(cherry picked from commit 00929447f5758c4f64c74d0a4b40a6eb3d9df0e3)
Signed-off-by: Liu Bo <bo.liu@linux.alibaba.com>
Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com>

virtiofs: Remove set but not used variable 'fc' · 823286b7

Authored by zhengbin on Oct 23, 2019

task #28910367
commit 80da5a809d193c60d090cbdf4fe316781bc07965 upstream

Fixes gcc '-Wunused-but-set-variable' warning:

fs/fuse/virtio_fs.c: In function virtio_fs_wake_pending_and_unlock:
fs/fuse/virtio_fs.c:983:20: warning: variable fc set but not used [-Wunused-but-set-variable]

It is not used since commit 7ee1e2e631db ("virtiofs: No need to check
fpq->connected state")
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: zhengbin <zhengbin13@huawei.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Signed-off-by: Liu Bo <bo.liu@linux.alibaba.com>

virtiofs: Retry request submission from worker context · 986957da