Commits · c6b80eb89b55590b12db11103913088735205b5c · openeuler / Kernel

27 Mar, 2020 3 commits

ovl: enable xino automatically in more cases · 926e94d7

Committed by Amir Goldstein on Feb 21, 2020

So far, with xino=auto, we only enable xino if we know that all
underlying filesystems use 32bit inode numbers.

When users configure overlay with xino=auto, they already declare that
they are ready to handle 64bit inode number from overlay.

It is a very common case, that underlying filesystem uses 64bit ino,
but rarely or never uses the high inode number bits (e.g. tmpfs, xfs).
Leaving it for the users to declare high ino bits are unused with
xino=on is not a recipe for many users to enjoy the benefits of xino.

There appears to be very little reason not to enable xino when users
declare xino=auto even if we do not know how many bits underlying
filesystem uses for inode numbers.

In the worst case of xino bits overflow by real inode number, we
already fall back to the non-xino behavior - real inode number with
unique pseudo dev or to non persistent inode number and overlay st_dev
(for directories).

The only annoyance from auto enabling xino is that xino bits overflow
emits a warning to kmsg. Suppress those warnings unless users explicitly
asked for xino=on, suggesting that they expected high ino bits to be
unused by underlying filesystem.
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>

926e94d7
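
The behavior described in this commit message can be illustrated with a small userland sketch. This is not the kernel code: `xino_map()`, `struct xino_result` and the simplified bit layout (fsid placed directly in the top `xinobits` bits) are assumptions made for illustration. It shows the mapping being applied when the high bits are free, the fall back to the real inode number on overflow, and the warning being raised only when the user asked for xino=on.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical result of mapping one real inode number. */
struct xino_result {
    uint64_t ino;   /* inode number the overlay would report */
    bool mapped;    /* true if the xino mapping was applied */
    bool warn;      /* true if an overflow warning would be logged */
};

/*
 * Sketch only: fold fsid into the top xinobits bits of a 64-bit inode
 * number. If the real inode number already uses those bits, keep the
 * real number (non-xino behavior) and warn only for explicit xino=on.
 */
static struct xino_result xino_map(uint64_t real_ino, unsigned int fsid,
                                   unsigned int xinobits, bool xino_on)
{
    struct xino_result r = { real_ino, false, false };
    unsigned int shift;

    if (xinobits == 0 || xinobits >= 64)
        return r;                       /* xino disabled or invalid */

    shift = 64 - xinobits;
    if ((real_ino >> shift) == 0) {
        r.ino = real_ino | ((uint64_t)fsid << shift);
        r.mapped = true;
    } else {
        r.warn = xino_on;               /* silent with xino=auto */
    }
    return r;
}
```

With `xinobits = 2`, a real inode number of 42 on fsid 1 maps into the xino space, while a real inode number with bit 63 set is passed through unchanged and is reported quietly unless `xino_on` is true.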

ovl: avoid possible inode number collisions with xino=on · dfe51d47

Committed by Amir Goldstein on Feb 21, 2020

When xino feature is enabled and a real directory inode number overflows
the lower xino bits, we cannot map this directory inode number to a unique
and persistent inode number and we fall back to the real inode st_ino and
overlay st_dev.

The real inode st_ino with high bits may collide with a lower inode number
on overlay st_dev that was mapped using xino.

To avoid possible collision with legitimate xino values, map a non
persistent inode number to a dedicated range in the xino address space.
The dedicated range is created by adding one more bit to the number of
reserved high xino bits. We could have added just one more fsid, but that
would have had the undesired effect of changing persistent overlay inode
numbers on kernel or require more complex xino mapping code.
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>

dfe51d47
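
The dedicated range described above can be pictured as one extra reserved bit sitting between the real inode bits and the fsid bits. The sketch below is an illustration under that assumption; `map_persistent()` and `map_non_persistent()` are hypothetical names, and `xinobits` is taken to already include the extra bit.

```c
#include <stdint.h>

/*
 * Persistent mapping: the real inode number must fit below the reserved
 * bits; fsid is placed one bit above the dedicated range marker.
 * Returns 0 on success, -1 on overflow or invalid xinobits.
 */
static int map_persistent(uint64_t real_ino, unsigned int fsid,
                          unsigned int xinobits, uint64_t *out)
{
    unsigned int shift;

    if (xinobits < 2 || xinobits > 63)
        return -1;
    shift = 64 - xinobits;
    if (real_ino >> shift)
        return -1;                      /* real ino overflows xino bits */
    *out = real_ino | ((uint64_t)fsid << (shift + 1));
    return 0;
}

/*
 * Non-persistent mapping: keep the low bits of a counter value and set
 * the lowest reserved bit, which no persistent mapping ever sets.
 * Assumes 2 <= xinobits <= 63.
 */
static uint64_t map_non_persistent(uint64_t counter, unsigned int xinobits)
{
    unsigned int shift = 64 - xinobits;
    uint64_t low = counter & (~UINT64_C(0) >> xinobits);

    return low | (UINT64_C(1) << shift);
}
```

Because a persistent value never has the marker bit set, the two ranges cannot collide in this layout.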

ovl: use a private non-persistent ino pool · 4d314f78

Committed by Amir Goldstein on Feb 21, 2020

There is no reason to deplete the system's global get_next_ino() pool for
overlay non-persistent inode numbers and there is no reason at all to
allocate non-persistent inode numbers for non-directories.

For non-directories, it is much better to leave i_ino the same as real
i_ino, to be consistent with st_ino/d_ino.
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>

4d314f78
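
A minimal sketch of the policy in this commit message, assuming a per-overlay counter: directories draw a number from a private pool, and non-directories simply keep the real inode number. `struct ovl_sketch` and `pick_ino()` are hypothetical names, and the counter is not atomic here, unlike what a kernel implementation would need.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-overlay state holding the private ino pool. */
struct ovl_sketch {
    uint64_t last_ino;
};

/*
 * Directories get a fresh non-persistent number from the private pool;
 * non-directories keep the real inode number, so the in-memory number
 * stays consistent with what st_ino/d_ino report.
 */
static uint64_t pick_ino(struct ovl_sketch *ofs, uint64_t real_ino, bool is_dir)
{
    if (!is_dir)
        return real_ino;
    return ++ofs->last_ino;
}
```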

17 Mar, 2020 8 commits

ovl: strict upper fs requirements for remote upper fs · d80172c2

Committed by Amir Goldstein on Feb 20, 2020

Overlayfs works sub-optimally with upper fs that has no xattr/d_type/
RENAME_WHITEOUT support. We should basically deprecate support for those
filesystems, but so far, we only issue a warning and don't fail the mount
for the sake of backward compat.  Some features are already being disabled
with no xattr support.

For newly supported remote upper fs, we do not need to worry about backward
compatibility, so we can fail the mount if upper fs is a sub-optimal
filesystem.

This reduces the in-tree remote filesystems supported as upper to just
FUSE, for which the remote upper fs support was added.
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>

d80172c2

ovl: check if upper fs supports RENAME_WHITEOUT · cad218ab

Committed by Amir Goldstein on Feb 20, 2020

As with other required upper fs features, we only warn if support is
missing to avoid breaking existing sub-optimal setups.
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>

cad218ab

ovl: allow remote upper · bccece1e

Committed by Miklos Szeredi on Mar 17, 2020

No reason to prevent upper layer being a remote filesystem.  Do the
revalidation in that case, just as we already do for lower layers.

This lets virtiofs be used as upper layer, which appears to be a real use
case.
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>

bccece1e

ovl: decide if revalidate needed on a per-dentry basis · f4288844

Committed by Miklos Szeredi on Mar 17, 2020

Allow completely skipping ->revalidate() on a per-dentry basis, in case the
underlying layers used for a dentry do not themselves have ->revalidate().

E.g. a negative overlay dentry has no underlying layers, hence revalidate is
unnecessary.  Or if the lower layer is remote but the overlay dentry is
pure-upper, then revalidate can be skipped.

The following places need to update whether the dentry needs revalidate or
not:

 - fill-super (root dentry)
 - lookup
 - create
 - fh_to_dentry
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>

f4288844
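
The per-dentry decision can be sketched as a simple scan over the real dentries that back an overlay dentry. The names below (`struct real_dentry`, `needs_revalidate()`) are hypothetical, and a boolean stands in for "the underlying dentry has a ->revalidate() operation".

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an underlying (real) dentry. */
struct real_dentry {
    bool has_revalidate;    /* underlying fs provides ->revalidate() */
};

/*
 * An overlay dentry needs revalidation only if at least one of the real
 * dentries it is built from needs it. A negative overlay dentry has no
 * upper and no lower entries, so the answer is false.
 */
static bool needs_revalidate(const struct real_dentry *upper,
                             const struct real_dentry *lower, size_t numlower)
{
    size_t i;

    if (upper && upper->has_revalidate)
        return true;
    for (i = 0; i < numlower; i++) {
        if (lower[i].has_revalidate)
            return true;
    }
    return false;
}
```

A pure-upper dentry on a local upper fs passes `numlower = 0`, so a remote lower layer elsewhere in the mount does not force revalidation for it.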

ovl: separate detection of remote upper layer from stacked overlay · 7925dad8

Committed by Miklos Szeredi on Mar 17, 2020

Following patch will allow remote as upper layer, but not overlay stacked
on upper layer.  Separate the two concepts.

This patch doesn't change behavior.
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>

7925dad8

ovl: restructure dentry revalidation · 3bb7df92

Committed by Miklos Szeredi on Mar 17, 2020

Use a common loop for plain and weak revalidation.  This will aid doing
revalidation on upper layer.

This patch doesn't change behavior.
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>

3bb7df92

ovl: simplify i_ino initialization · 62c832ed

Committed by Amir Goldstein on Nov 19, 2019

Move i_ino initialization to ovl_inode_init() to avoid the dance of setting
i_ino in ovl_fill_inode() sometimes on the first call and sometimes on the
second call.
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>

62c832ed

ovl: factor out helper ovl_get_root() · 2effc5c2

Committed by Amir Goldstein on Nov 19, 2019

Allocates and initializes the root dentry and inode.
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>

2effc5c2

13 Mar, 2020 1 commit

ovl: fix some xino configurations · 53afcd31

Committed by Amir Goldstein on Feb 21, 2020

Fix up two bugs in the conversion to xino_mode:
1. xino=off does not always end up in disabled mode
2. xino=auto on 32bit arch should end up in disabled mode

Take a proactive approach to disabling xino on 32bit kernel:
1. Disable XINO_AUTO config during build time
2. Disable xino with a warning on mount time

As a by product, xino=on on 32bit arch also ends up in disabled mode.
We never intended to enable xino on 32bit arch and this will make the
rest of the logic simpler.

Fixes: 0f831ec8 ("ovl: simplify ovl_same_sb() helper")
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>

53afcd31

24 Jan, 2020 6 commits

ovl: implement async IO routines · 2406a307

Committed by Jiufei Xue on Nov 20, 2019

A performance regression was observed since linux v4.19 with aio test using
fio with iodepth 128 on overlayfs.  The queue depth of the device was
always 1 which is unexpected.

After investigation, it was found that commit 16914e6f ("ovl: add
ovl_read_iter()") and commit 2a92e07e ("ovl: add ovl_write_iter()")
resulted in vfs_iter_{read,write} being called on underlying filesystem,
which always results in synchronous IO.

Implement async IO for stacked reading and writing.  This resolves the
performance regression.

This is implemented by allocating a new kiocb for submitting the AIO
request on the underlying filesystem.  When the request is completed, the
new kiocb is freed and the completion callback is called on the original
iocb.
Signed-off-by: Jiufei Xue <jiufei.xue@linux.alibaba.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>

2406a307
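
The wrapping scheme in the last paragraph of this message can be sketched in plain C. Everything here is a userland stand-in: `struct iocb_sketch`, `struct aio_req`, `aio_req_prepare()`, `aio_req_done()` and `record_result()` are hypothetical names, and calling the completion function directly plays the role of the underlying filesystem finishing the request.

```c
#include <stdlib.h>

/* Hypothetical stand-in for a kiocb: just a completion callback. */
struct iocb_sketch {
    void (*ki_complete)(struct iocb_sketch *iocb, long res);
    long result;                /* filled in by record_result() */
};

/* Wrapper allocated per request; the new iocb must be the first member. */
struct aio_req {
    struct iocb_sketch iocb;    /* submitted to the underlying fs */
    struct iocb_sketch *orig;   /* original iocb from the caller */
};

/* Completion of the new iocb: free the wrapper, then notify the original. */
static void aio_req_done(struct iocb_sketch *iocb, long res)
{
    struct aio_req *req = (struct aio_req *)iocb;
    struct iocb_sketch *orig = req->orig;

    free(req);
    orig->ki_complete(orig, res);
}

/* Allocate the wrapper and return the iocb to submit; NULL on failure. */
static struct iocb_sketch *aio_req_prepare(struct iocb_sketch *orig)
{
    struct aio_req *req = malloc(sizeof(*req));

    if (!req)
        return NULL;
    req->iocb.ki_complete = aio_req_done;
    req->iocb.result = 0;
    req->orig = orig;
    return &req->iocb;
}

/* Example completion callback for the original iocb. */
static void record_result(struct iocb_sketch *iocb, long res)
{
    iocb->result = res;
}
```

The caller never sees the wrapper: it submits its own iocb, and its own completion callback fires with the result of the underlying request.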

ovl: layer is const · 13464165

Committed by Miklos Szeredi on Jan 24, 2020

The ovl_layer struct is never modified except at initialization.
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>

13464165

ovl: fix corner case of non-constant st_dev;st_ino · b7bf9908

Committed by Amir Goldstein on Jan 14, 2020

On non-samefs overlay without xino, non pure upper inodes should use a
pseudo_dev assigned to each unique lower fs, but if lower layer is on the
same fs as upper layer, it has no pseudo_dev assigned.

In this overlay layers setup:
 - two filesystems, A and B
 - upper layer is on A
 - lower layer 1 is also on A
 - lower layer 2 is on B

Non pure upper overlay inode, whose origin is in layer 1 will have the
st_dev;st_ino values of the real lower inode before copy up and the
st_dev;st_ino values of the real upper inode after copy up.

Fix this inconsistency by assigning a unique pseudo_dev also for upper fs,
that will be used as st_dev value along with the lower inode st_dev for
overlay inodes in the case above.
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>

b7bf9908

ovl: fix corner case of conflicting lower layer uuid · 1b81dddd

Committed by Amir Goldstein on Nov 16, 2019

This fixes ovl_lower_uuid_ok() to correctly detect the corner case:
 - two filesystems, A and B, both have null uuid
 - upper layer is on A
 - lower layer 1 is also on A
 - lower layer 2 is on B

In this case, bad_uuid would not have been set for B, because the check
only involved the list of lower fs.  Hence we'll try to decode a layer 2
origin on layer 1 and fail.

We check for conflicting (and null) uuid among all lower layers, including
those layers that are on the same fs as the upper layer.
Reported-by: Miklos Szeredi <mszeredi@redhat.com>
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>

1b81dddd
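
The corrected check can be illustrated as a scan over every known fs that hosts a lower layer, including the one shared with the upper layer. This is a simplified sketch: `struct fs_sketch` and `lower_uuid_ok()` are hypothetical, and details such as when the check is skipped altogether are left out.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical per-filesystem record. */
struct fs_sketch {
    unsigned char uuid[16];
    bool is_lower;      /* at least one lower layer lives on this fs */
    bool bad_uuid;      /* set when another lower fs has the same uuid */
};

/*
 * Return true if uuid does not conflict with any fs already hosting a
 * lower layer. On conflict, mark the existing entry and return false so
 * the caller can mark the new fs as well.
 */
static bool lower_uuid_ok(struct fs_sketch *fs, size_t numfs,
                          const unsigned char uuid[16])
{
    size_t i;

    for (i = 0; i < numfs; i++) {
        if (fs[i].is_lower && memcmp(fs[i].uuid, uuid, 16) == 0) {
            fs[i].bad_uuid = true;
            return false;
        }
    }
    return true;
}
```

In the corner case from the message, filesystem A (shared by the upper layer and lower layer 1) is in the scanned list with `is_lower` set, so the null uuid of B is detected as a conflict.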

ovl: generalize the lower_fs[] array · 07f1e596

Committed by Amir Goldstein on Jan 14, 2020

Rename lower_fs[] array to fs[], extend its size by one and use index fsid
(instead of fsid-1) to access the fs[] array.

Initialize fs[0] with upper fs values. fsid 0 is reserved even with lower
only overlay, so fs[0] remains null in this case.
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>

07f1e596

ovl: simplify ovl_same_sb() helper · 0f831ec8

Committed by Amir Goldstein on Nov 16, 2019

No code uses the sb returned from this helper, so make it return a boolean
and rename it to ovl_same_fs().

The xino mode is irrelevant when all layers are on same fs, so instead of
describing samefs with mode OVL_XINO_OFF, use a new xino_mode state, which
is 0 in the case of samefs, -1 in the case of xino=off and > 0 with xino
enabled.

Create a new helper ovl_same_dev(), to use instead of the common check for
(ovl_same_fs() || xinobits).
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>

0f831ec8
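
The three-state xino_mode described in this message lends itself to very small helpers. The sketch below takes the mode as a plain integer argument; the function names are illustrative rather than the kernel's signatures, which operate on a super block.

```c
#include <stdbool.h>

/* xino_mode: 0 = all layers on the same fs, -1 = xino=off, > 0 = xino bits. */

static bool same_fs(int xino_mode)
{
    return xino_mode == 0;
}

/* Replaces the common check (same_fs || xinobits). */
static bool same_dev(int xino_mode)
{
    return xino_mode >= 0;
}

/* Number of xino bits in use, or 0 when xino is off or irrelevant. */
static unsigned int xino_bits(int xino_mode)
{
    return same_dev(xino_mode) ? (unsigned int)xino_mode : 0;
}
```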

23 Jan, 2020 2 commits

ovl: generalize the lower_layers[] array · 94375f9d

Committed by Amir Goldstein on Nov 15, 2019

Rename lower_layers[] array to layers[], extend its size by one and
initialize layers[0] with upper layer values.  Lower layers are now
addressed with index 1..numlower.  layers[0] is reserved even with lower
only overlay.

[SzM: replace ofs->numlower with ofs->numlayer, the latter's value is
incremented by one]
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>

94375f9d

ovl: use pr_fmt auto generate prefix · 1bd0a3ae

Committed by lijiazi on Dec 16, 2019

Use pr_fmt to auto-generate the "overlayfs: " prefix.
Signed-off-by: lijiazi <lijiazi@xiaomi.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>

1bd0a3ae
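
For readers unfamiliar with the idiom, here is a userland sketch of how a `pr_fmt` macro prepends a prefix at compile time through string literal concatenation. `format_msg()` is a hypothetical helper built on `snprintf()`; the kernel's own pr_* macros route through printk instead, and this sketch requires at least one argument after the format string.

```c
#include <stdio.h>

/* Prefix every format string with the subsystem name. */
#define pr_fmt(fmt) "overlayfs: " fmt

/* Hypothetical logging helper: formats into a caller-provided buffer. */
#define format_msg(buf, size, fmt, ...) \
    snprintf((buf), (size), pr_fmt(fmt), __VA_ARGS__)
```

Call sites then write only the message itself, for example `format_msg(buf, sizeof(buf), "missing '%s'", "upperdir")`, and the prefix is added automatically.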

10 Dec, 2019 1 commit

ovl: fix lookup failure on multi lower squashfs · 7e63c87f

Committed by Amir Goldstein on Nov 14, 2019

In the past, overlayfs required that lower fs have non null uuid in
order to support nfs export and decode copy up origin file handles.

Commit 9df085f3 ("ovl: relax requirement for non null uuid of
lower fs") relaxed this requirement for nfs export support, as long
as uuid (even if null) is unique among all lower fs.

However, said commit unintentionally also relaxed the non null uuid
requirement for decoding copy up origin file handles, regardless of
the unique uuid requirement.

Amend this mistake by disabling decoding of copy up origin file handle
from lower fs with a conflicting uuid.

We still encode copy up origin file handles from those fs, because
file handles like those already exist in the wild and because they
might provide useful information in the future.

There is an unhandled corner case described by Miklos this way:
- two filesystems, A and B, both have null uuid
- upper layer is on A
- lower layer 1 is also on A
- lower layer 2 is on B

In this case bad_uuid won't be set for B, because the check only
involves the list of lower fs.  Hence we'll try to decode a layer 2
origin on layer 1 and fail.

We will deal with this corner case later.
Reported-by: Colin Ian King <colin.king@canonical.com>
Tested-by: Colin Ian King <colin.king@canonical.com>
Link: https://lore.kernel.org/lkml/20191106234301.283006-1-colin.king@canonical.com/
Fixes: 9df085f3 ("ovl: relax requirement for non null uuid ...")
Cc: stable@vger.kernel.org # v4.20+
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>

7e63c87f

16 Jul, 2019 1 commit

ovl: fix regression caused by overlapping layers detection · 0be0bfd2

Committed by Amir Goldstein on Jul 12, 2019

Once upon a time, commit 2cac0c00 ("ovl: get exclusive ownership on
upper/work dirs") in v4.13 added some sanity checks on overlayfs layers.
This change caused a docker regression. The root cause was mount leaks
by docker, which as far as I know, still exist.

To mitigate the regression, commit 85fdee1e ("ovl: fix regression
caused by exclusive upper/work dir protection") in v4.14 turned the
mount errors into warnings for the default index=off configuration.

Recently, commit 146d62e5 ("ovl: detect overlapping layers") in
v5.2, re-introduced exclusive upper/work dir checks regardless of
index=off configuration.

This changes the status quo and mount leak related bug reports have
started to re-surface. Restore the status quo to fix the regressions.
To clarify, index=off does NOT relax overlapping layers check for this
overlayfs mount. index=off only relaxes exclusive upper/work dir checks
with another overlayfs mount.

To cover the part of overlapping layers detection that used the
exclusive upper/work dir checks to detect overlap with self upper/work
dir, add a trap also on the work base dir.

Link: https://github.com/moby/moby/issues/34672
Link: https://lore.kernel.org/linux-fsdevel/20171006121405.GA32700@veci.piliscsaba.szeredi.hu/
Link: https://github.com/containers/libpod/issues/3540
Fixes: 146d62e5 ("ovl: detect overlapping layers")
Cc: <stable@vger.kernel.org> # v4.19+
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Tested-by: Colin Walters <walters@verbum.org>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>

0be0bfd2

19 Jun, 2019 1 commit

treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 500 · d2912cb1

Committed by Thomas Gleixner on Jun 04, 2019

Based on 2 normalized pattern(s):

  this program is free software you can redistribute it and or modify
  it under the terms of the gnu general public license version 2 as
  published by the free software foundation

  this program is free software you can redistribute it and or modify
  it under the terms of the gnu general public license version 2 as
  published by the free software foundation #

extracted by the scancode license scanner the SPDX license identifier

  GPL-2.0-only

has been chosen to replace the boilerplate/reference in 4122 file(s).
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Enrico Weigelt <info@metux.net>
Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org>
Reviewed-by: Allison Randal <allison@lohutok.net>
Cc: linux-spdx@vger.kernel.org
Link: https://lkml.kernel.org/r/20190604081206.933168790@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

d2912cb1

18 Jun, 2019 3 commits

ovl: fix typo in MODULE_PARM_DESC · 253e7483

Committed by Nicolas Schier on Jun 17, 2019

Change the first argument of the MODULE_PARM_DESC() calls so that each of
them matches the actual module parameter name.  The matching results in
changing (the 'parm' section from) the output of `modinfo overlay` from:

    parm: ovl_check_copy_up:Obsolete; does nothing
    parm: redirect_max:ushort
    parm: ovl_redirect_max:Maximum length of absolute redirect xattr value
    parm: redirect_dir:bool
    parm: ovl_redirect_dir_def:Default to on or off for the redirect_dir feature
    parm: redirect_always_follow:bool
    parm: ovl_redirect_always_follow:Follow redirects even if redirect_dir feature is turned off
    parm: index:bool
    parm: ovl_index_def:Default to on or off for the inodes index feature
    parm: nfs_export:bool
    parm: ovl_nfs_export_def:Default to on or off for the NFS export feature
    parm: xino_auto:bool
    parm: ovl_xino_auto_def:Auto enable xino feature
    parm: metacopy:bool
    parm: ovl_metacopy_def:Default to on or off for the metadata only copy up feature

into:

    parm: check_copy_up:Obsolete; does nothing
    parm: redirect_max:Maximum length of absolute redirect xattr value (ushort)
    parm: redirect_dir:Default to on or off for the redirect_dir feature (bool)
    parm: redirect_always_follow:Follow redirects even if redirect_dir feature is turned off (bool)
    parm: index:Default to on or off for the inodes index feature (bool)
    parm: nfs_export:Default to on or off for the NFS export feature (bool)
    parm: xino_auto:Auto enable xino feature (bool)
    parm: metacopy:Default to on or off for the metadata only copy up feature (bool)
Signed-off-by: Nicolas Schier <n.schier@avm.de>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>

253e7483

ovl: fix bogus -Wmaybe-unitialized warning · 1dac6f5b

Committed by Arnd Bergmann on Jun 17, 2019

gcc gets a bit confused by the logic in ovl_setup_trap() and
can't figure out whether the local 'trap' variable in the caller
was initialized or not:

fs/overlayfs/super.c: In function 'ovl_fill_super':
fs/overlayfs/super.c:1333:4: error: 'trap' may be used uninitialized in this function [-Werror=maybe-uninitialized]
    iput(trap);
    ^~~~~~~~~~
fs/overlayfs/super.c:1312:17: note: 'trap' was declared here

Reword slightly to make it easier for the compiler to understand.

Fixes: 146d62e5 ("ovl: detect overlapping layers")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>

1dac6f5b

ovl: don't fail with disconnected lower NFS · 9179c21d

Committed by Miklos Szeredi on Jun 18, 2019

NFS mounts can be disconnected from fs root.  Don't fail the overlapping
layer check because of this.

The check is not authoritative anyway, since topology can change during or
after the check.

Reported-by: Antti Antinoja <antti@fennosys.fi>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Fixes: 146d62e5 ("ovl: detect overlapping layers")

9179c21d

29 May, 2019 1 commit

ovl: detect overlapping layers · 146d62e5

Committed by Amir Goldstein on Apr 18, 2019

Overlapping overlay layers are not supported and can cause unexpected
behavior, but overlayfs does not currently check or warn about these
configurations.

User is not supposed to specify the same directory for upper and
lower dirs or for different lower layers and user is not supposed to
specify directories that are descendants of each other for overlay
layers, but that is exactly what this syzbot repro did:

    https://syzkaller.appspot.com/x/repro.syz?x=12c7a94f400000

Moving layer root directories into other layers while overlayfs
is mounted could also result in unexpected behavior.

This commit places "traps" in the overlay inode hash table.
Those traps are dummy overlay inodes that are hashed by the layers
root inodes.

On mount, the hash table trap entries are used to verify that overlay
layers are not overlapping.  While at it, we also verify that overlay
layers are not overlapping with directories "in-use" by other overlay
instances as upperdir/workdir.

On lookup, the trap entries are used to verify that overlay layers
root inodes have not been moved into other layers after mount.

Some examples:

$ ./run --ov --samefs -s
...
( mkdir -p base/upper/0/u base/upper/0/w base/lower lower upper mnt
  mount -o bind base/lower lower
  mount -o bind base/upper upper
  mount -t overlay none mnt ...
        -o lowerdir=lower,upperdir=upper/0/u,workdir=upper/0/w)

$ umount mnt
$ mount -t overlay none mnt ...
        -o lowerdir=base,upperdir=upper/0/u,workdir=upper/0/w

  [   94.434900] overlayfs: overlapping upperdir path
  mount: mount overlay on mnt failed: Too many levels of symbolic links

$ mount -t overlay none mnt ...
        -o lowerdir=upper/0/u,upperdir=upper/0/u,workdir=upper/0/w

  [  151.350132] overlayfs: conflicting lowerdir path
  mount: none is already mounted or mnt busy

$ mount -t overlay none mnt ...
        -o lowerdir=lower:lower/a,upperdir=upper/0/u,workdir=upper/0/w

  [  201.205045] overlayfs: overlapping lowerdir path
  mount: mount overlay on mnt failed: Too many levels of symbolic links

$ mount -t overlay none mnt ...
        -o lowerdir=lower,upperdir=upper/0/u,workdir=upper/0/w
$ mv base/upper/0/ base/lower/
$ find mnt/0
  mnt/0
  mnt/0/w
  find: 'mnt/0/w/work': Too many levels of symbolic links
  find: 'mnt/0/u': Too many levels of symbolic links

Reported-by: syzbot+9c69c282adc4edd2b540@syzkaller.appspotmail.com
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>

146d62e5
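
The mount-time overlap check can be pictured as walking from a layer root up through its ancestors and looking for a directory that another layer has marked with a trap. The sketch below models only that walk, under the assumption that each directory knows its parent; `struct dir_sketch` and `layer_overlaps()` are hypothetical, and the separate "in-use by another overlay instance" check is omitted.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical directory node; parent is NULL at the filesystem root. */
struct dir_sketch {
    const struct dir_sketch *parent;
    bool is_trap;       /* some layer root is hashed as a trap here */
};

/*
 * Walk the strict ancestors of a layer root. Finding a trap means this
 * layer lives inside another layer, i.e. the layers overlap. The layer's
 * own trap is deliberately not examined.
 */
static bool layer_overlaps(const struct dir_sketch *layer_root)
{
    const struct dir_sketch *d = layer_root ? layer_root->parent : NULL;

    while (d) {
        if (d->is_trap)
            return true;
        d = d->parent;
    }
    return false;
}
```

In the second mount of the examples above, the upper directory upper/0/u sits below base, which is used as lowerdir, so the walk from the upper root reaches a trap and the mount is refused.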

02 May, 2019 1 commit

overlayfs: make use of ->free_inode() · 0b269ded

Committed by Al Viro on Apr 15, 2019

synchronous parts are left in ->destroy_inode()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

0b269ded

02 Nov, 2018 1 commit

ovl: automatically enable redirect_dir on metacopy=on · d47748e5

Committed by Miklos Szeredi on Nov 01, 2018

Current behavior is to automatically disable metacopy if redirect_dir is
not enabled and proceed with the mount.

If "metacopy=on" mount option was given, then this behavior can confuse the
user: no mount failure, yet metacopy is disabled.

This patch makes metacopy=on imply redirect_dir=on.

The converse is also true: turning off full redirect with redirect_dir=
{off|follow|nofollow} will disable metacopy.

If both metacopy=on and redirect_dir={off|follow|nofollow} are specified,
then mount will fail, since there's no way to correctly resolve the
conflict.
Reported-by: Daniel Walsh <dwalsh@redhat.com>
Fixes: d5791044 ("ovl: Provide a mount option metacopy=on/off...")
Cc: <stable@vger.kernel.org> # v4.19
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>

d47748e5
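
The three rules in this message (metacopy=on implies redirect_dir=on, an explicit non-"on" redirect_dir disables metacopy, and explicitly requesting both is an error) can be sketched as one small resolution step. `struct ovl_opts` and `resolve_metacopy()` are hypothetical names, and the return value -1 stands in for a mount failure.

```c
#include <stdbool.h>

/* Hypothetical view of the relevant mount configuration. */
struct ovl_opts {
    bool metacopy;      /* effective metacopy setting (default or option) */
    bool redirect_dir;  /* true only for full redirect (redirect_dir=on) */
    bool metacopy_opt;  /* metacopy=on was given explicitly */
    bool redirect_opt;  /* redirect_dir=... was given explicitly */
};

/* Resolve the metacopy/redirect_dir dependency; -1 means "fail the mount". */
static int resolve_metacopy(struct ovl_opts *o)
{
    if (o->metacopy && !o->redirect_dir) {
        if (o->metacopy_opt && o->redirect_opt)
            return -1;                  /* conflicting explicit options */
        if (o->redirect_opt)
            o->metacopy = false;        /* explicit redirect_dir wins */
        else
            o->redirect_dir = true;     /* metacopy implies redirect_dir */
    }
    return 0;
}
```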

27 Oct, 2018 1 commit

ovl: relax requirement for non null uuid of lower fs · 9df085f3

Committed by Amir Goldstein on Sep 03, 2018

We use uuid to associate an overlay lower file handle with a lower layer,
so we can accept lower fs with null uuid as long as all lower layers with
null uuid are on the same fs.

This change allows enabling index and nfs_export features for the setup of
single lower fs of type squashfs - squashfs supports file handles, but has
a null uuid. This change also allows enabling index and nfs_export features
for nested overlayfs, where the lower overlay has nfs_export enabled.

Enabling the index feature with single lower squashfs fixes the
unionmount-testsuite test:
  ./run --ov --squashfs --verify

As a by-product, if, like the lower squashfs, upper fs also uses the
generic export_encode_fh() implementation to export 32bit inode file
handles (e.g. ext4), then the xino_auto config/module/mount option will
enable unique overlay inode numbers.
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>

9df085f3

10 Sep, 2018 1 commit

ovl: fix oopses in ovl_fill_super() failure paths · 8c25741a

Committed by Miklos Szeredi on Sep 10, 2018

ovl_free_fs() dereferences ofs->workbasedir and ofs->upper_mnt in cases when
those might not have been initialized yet.

Fix the initialization order for these fields.

Reported-by: syzbot+c75f181dc8429d2eb887@syzkaller.appspotmail.com
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Cc: <stable@vger.kernel.org> # v4.15
Fixes: 95e6d417 ("ovl: grab reference to workbasedir early")
Fixes: a9075cdb ("ovl: factor out ovl_free_fs() helper")

8c25741a

20 Jul, 2018 4 commits

ovl: Do not expose metacopy only dentry from d_real() · 2c3d7358

Committed by Vivek Goyal on May 11, 2018

Metacopy dentry/inode is internal to overlay and is never exposed outside
of it.  Exception is metacopy upper file used for fsync().  Modify d_real()
to look for dentries/inode which have data, but also allow matching upper
inode without data for the fsync case.
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Reviewed-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>

2c3d7358

ovl: Store lower data inode in ovl_inode · 2664bd08

Committed by Vivek Goyal on May 11, 2018

Right now ovl_inode stores inode pointer for lower inode.  This helps with
quickly getting lower inode given overlay inode (ovl_inode_lower()).

Now with metadata only copy-up, we can have metacopy inode in middle layer
as well and inode containing data can be different from ->lower.  I need to
be able to open the real file in ovl_open_realfile() and for that I need to
quickly find the lower data inode.

Hence store lower data inode also in ovl_inode.  Also provide a helper
ovl_inode_lowerdata() to access this field.
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Reviewed-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>

2664bd08

ovl: A new xattr OVL_XATTR_METACOPY for file on upper · 0c288874

Committed by Vivek Goyal on May 11, 2018

Now we will have the capability to have upper inodes which might be only
metadata copy up and data is still on lower inode.  So add a new xattr
OVL_XATTR_METACOPY to distinguish between two cases.

Presence of OVL_XATTR_METACOPY reflects that file has been copied up
metadata only and data will be copied up later from lower origin.  So
this xattr is set when a metadata copy takes place and cleared when data
copy takes place.

We also use a bit in ovl_inode->flags to cache OVL_UPPERDATA which reflects
whether ovl inode has data or not (as opposed to metadata only copy up).

If a file is copied up metadata only and later when same file is opened for
WRITE, then data copy up takes place.  We copy up data, remove METACOPY
xattr and then set the UPPERDATA flag in ovl_inode->flags.  While all these
operations happen with oi->lock held, read side of oi->flags can be
lockless.  That is another thread on another cpu can check if UPPERDATA
flag is set or not.

So this gives us an ordering requirement w.r.t UPPERDATA flag.  That is, if
another cpu sees UPPERDATA flag set, then it should be guaranteed that
effects of data copy up and remove xattr operations are also visible.

For example:

CPU1                                    CPU2
ovl_open()                              acquire(oi->lock)
 ovl_open_maybe_copy_up()                ovl_copy_up_data()
  ovl_open_need_copy_up()                vfs_removexattr()
   ovl_already_copied_up()
    ovl_dentry_needs_data_copy_up()      ovl_set_flag(OVL_UPPERDATA)
     ovl_test_flag(OVL_UPPERDATA)       release(oi->lock)

Say CPU2 is copying up data and in the end sets UPPERDATA flag.  But if
CPU1 perceives the effects of setting UPPERDATA flag but not the effects of
preceding operations (ex. upper that is not fully copied up), it will be a
problem.

Hence this patch introduces smp_wmb() on setting UPPERDATA flag operation
and smp_rmb() on UPPERDATA flag test operation.

Maybe some other lock or barrier already covers this, but I am not sure
what that is, or whether it is obvious enough that we will not break it in
the future.

So, to be safe, introduce barriers explicitly for the UPPERDATA flag/bit.
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Reviewed-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>

0c288874
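
The ordering requirement in this message is the classic publish/observe pattern. The sketch below uses C11 fences as a userland analogue of smp_wmb() and smp_rmb(); it is not the kernel code, and `struct inode_sketch`, `set_upperdata()` and `test_upperdata()` are hypothetical names.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical inode state: copied data plus the UPPERDATA flag. */
struct inode_sketch {
    int data_copied;            /* stands in for "data copy up finished" */
    atomic_bool upperdata;      /* stands in for the OVL_UPPERDATA bit */
};

/* Writer side: make earlier stores visible before the flag is set. */
static void set_upperdata(struct inode_sketch *oi)
{
    atomic_thread_fence(memory_order_release);  /* analogue of smp_wmb() */
    atomic_store_explicit(&oi->upperdata, true, memory_order_relaxed);
}

/* Reader side: if the flag is seen, later loads see the writer's stores. */
static bool test_upperdata(struct inode_sketch *oi)
{
    bool set = atomic_load_explicit(&oi->upperdata, memory_order_relaxed);

    if (set)
        atomic_thread_fence(memory_order_acquire);  /* analogue of smp_rmb() */
    return set;
}
```

A lockless reader that gets `true` from `test_upperdata()` may then rely on `data_copied` reflecting the completed copy up, which is the guarantee the commit message asks for.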

ovl: Provide a mount option metacopy=on/off for metadata copyup · d5791044

Committed by Vivek Goyal on May 11, 2018

By default metadata only copy up is disabled.  Provide a mount option so
that users can choose one way or other.

Also provide a kernel config and module option to enable/disable metacopy
feature.

metacopy feature requires redirect_dir=on when upper is present.
Otherwise, it requires redirect_dir=follow at least.

As of now, metacopy does not work with nfs_export=on.  So if both
metacopy=on and nfs_export=on then nfs_export is disabled.
Signed-off-by: NVivek Goyal <vgoyal@redhat.com>
Reviewed-by: NAmir Goldstein <amir73il@gmail.com>
Signed-off-by: NMiklos Szeredi <mszeredi@redhat.com>

d5791044

18 Jul, 2018 5 commits

vfs: remove open_flags from d_real() · fb16043b

Committed by Miklos Szeredi on Jul 18, 2018

Opening regular files on overlayfs is now handled via ovl_open().  Remove
the now unused "open_flags" argument from d_op->d_real() and the d_real()
helper.
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>

fb16043b

Partially revert "locks: fix file locking on overlayfs" · de2a4a50

Committed by Miklos Szeredi on Jul 18, 2018

This partially reverts commit c568d683.

Overlayfs files will now automatically get the correct locks, no need to
hack overlay support in VFS.

It is a partial revert, because it leaves the locks_inode() calls in place
and defines locks_inode() to file_inode().  We could revert those as well,
but it would be unnecessary code churn and it makes sense to document that
we are getting the inode for locking purposes.

Don't revert MS_NOREMOTELOCK yet since that has been part of the userspace
API for some time (though not in a useful way).  Will try to remove
internal flags later when the dust around the new mount API settles.
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Acked-by: Jeff Layton <jlayton@kernel.org>

de2a4a50

Revert "vfs: add flags to d_real()" · 4ab30319

Committed by Miklos Szeredi on Jul 18, 2018

This reverts commit 495e6429.

No users of the "flags" argument of d_real() remain.
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>

4ab30319

Revert "ovl: fix relatime for directories" · 88059de1

Committed by Miklos Szeredi on Jul 18, 2018

This reverts commit cd91304e.

Overlayfs no longer relies on the vfs for correct atime handling.
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>

88059de1

ovl: deal with overlay files in ovl_d_real() · e8c985ba

Committed by Miklos Szeredi on Jul 18, 2018

Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>

e8c985ba

openeuler / Kernel synced successfully more than 1 year ago