提交 · 53afcd310e867d25e394718558783c476301205c · openeuler / Kernel

13 3月, 2020 1 次提交

ovl: fix some xino configurations · 53afcd31

由 Amir Goldstein 提交于 2月 21, 2020

Fix up two bugs in the coversion to xino_mode:
1. xino=off does not always end up in disabled mode
2. xino=auto on 32bit arch should end up in disabled mode

Take a proactive approach to disabling xino on 32bit kernel:
1. Disable XINO_AUTO config during build time
2. Disable xino with a warning on mount time

As a by product, xino=on on 32bit arch also ends up in disabled mode.
We never intended to enable xino on 32bit arch and this will make the
rest of the logic simpler.

Fixes: 0f831ec8 ("ovl: simplify ovl_same_sb() helper")
Signed-off-by: NAmir Goldstein <amir73il@gmail.com>
Signed-off-by: NMiklos Szeredi <mszeredi@redhat.com>

53afcd31

24 1月, 2020 6 次提交

ovl: implement async IO routines · 2406a307

由 Jiufei Xue 提交于 11月 20, 2019

A performance regression was observed since linux v4.19 with aio test using
fio with iodepth 128 on overlayfs.  The queue depth of the device was
always 1 which is unexpected.

After investigation, it was found that commit 16914e6f ("ovl: add
ovl_read_iter()") and commit 2a92e07e ("ovl: add ovl_write_iter()")
resulted in vfs_iter_{read,write} being called on underlying filesystem,
which always results in syncronous IO.

Implement async IO for stacked reading and writing.  This resolves the
performance regresion.

This is implemented by allocating a new kiocb for submitting the AIO
request on the underlying filesystem.  When the request is completed, the
new kiocb is freed and the completion callback is called on the original
iocb.
Signed-off-by: NJiufei Xue <jiufei.xue@linux.alibaba.com>
Signed-off-by: NMiklos Szeredi <mszeredi@redhat.com>

2406a307

ovl: layer is const · 13464165

由 Miklos Szeredi 提交于 1月 24, 2020

The ovl_layer struct is never modified except at initialization.
Signed-off-by: NMiklos Szeredi <mszeredi@redhat.com>

13464165

ovl: fix corner case of non-constant st_dev;st_ino · b7bf9908

由 Amir Goldstein 提交于 1月 14, 2020

On non-samefs overlay without xino, non pure upper inodes should use a
pseudo_dev assigned to each unique lower fs, but if lower layer is on the
same fs and upper layer, it has no pseudo_dev assigned.

In this overlay layers setup:
 - two filesystems, A and B
 - upper layer is on A
 - lower layer 1 is also on A
 - lower layer 2 is on B

Non pure upper overlay inode, whose origin is in layer 1 will have the
st_dev;st_ino values of the real lower inode before copy up and the
st_dev;st_ino values of the real upper inode after copy up.

Fix this inconsitency by assigning a unique pseudo_dev also for upper fs,
that will be used as st_dev value along with the lower inode st_dev for
overlay inodes in the case above.
Signed-off-by: NAmir Goldstein <amir73il@gmail.com>
Signed-off-by: NMiklos Szeredi <mszeredi@redhat.com>

b7bf9908

ovl: fix corner case of conflicting lower layer uuid · 1b81dddd

由 Amir Goldstein 提交于 11月 16, 2019

This fixes ovl_lower_uuid_ok() to correctly detect the corner case:
 - two filesystems, A and B, both have null uuid
 - upper layer is on A
 - lower layer 1 is also on A
 - lower layer 2 is on B

In this case, bad_uuid would not have been set for B, because the check
only involved the list of lower fs.  Hence we'll try to decode a layer 2
origin on layer 1 and fail.

We check for conflicting (and null) uuid among all lower layers, including
those layers that are on the same fs as the upper layer.
Reported-by: NMiklos Szeredi <mszeredi@redhat.com>
Signed-off-by: NAmir Goldstein <amir73il@gmail.com>
Signed-off-by: NMiklos Szeredi <mszeredi@redhat.com>

1b81dddd

ovl: generalize the lower_fs[] array · 07f1e596

由 Amir Goldstein 提交于 1月 14, 2020

Rename lower_fs[] array to fs[], extend its size by one and use index fsid
(instead of fsid-1) to access the fs[] array.

Initialize fs[0] with upper fs values. fsid 0 is reserved even with lower
only overlay, so fs[0] remains null in this case.
Signed-off-by: NAmir Goldstein <amir73il@gmail.com>
Signed-off-by: NMiklos Szeredi <mszeredi@redhat.com>

07f1e596

ovl: simplify ovl_same_sb() helper · 0f831ec8

由 Amir Goldstein 提交于 11月 16, 2019

No code uses the sb returned from this helper, so make it retrun a boolean
and rename it to ovl_same_fs().

The xino mode is irrelevant when all layers are on same fs, so instead of
describing samefs with mode OVL_XINO_OFF, use a new xino_mode state, which
is 0 in the case of samefs, -1 in the case of xino=off and > 0 with xino
enabled.

Create a new helper ovl_same_dev(), to use instead of the common check for
(ovl_same_fs() || xinobits).
Signed-off-by: NAmir Goldstein <amir73il@gmail.com>
Signed-off-by: NMiklos Szeredi <mszeredi@redhat.com>

0f831ec8

23 1月, 2020 2 次提交

ovl: generalize the lower_layers[] array · 94375f9d

由 Amir Goldstein 提交于 11月 15, 2019

Rename lower_layers[] array to layers[], extend its size by one and
initialize layers[0] with upper layer values.  Lower layers are now
addressed with index 1..numlower.  layers[0] is reserved even with lower
only overlay.

[SzM: replace ofs->numlower with ofs->numlayer, the latter's value is
incremented by one]
Signed-off-by: NAmir Goldstein <amir73il@gmail.com>
Signed-off-by: NMiklos Szeredi <mszeredi@redhat.com>

94375f9d

ovl: use pr_fmt auto generate prefix · 1bd0a3ae

由 lijiazi 提交于 12月 16, 2019

Use pr_fmt auto generate "overlayfs: " prefix.
Signed-off-by: Nlijiazi <lijiazi@xiaomi.com>
Signed-off-by: NMiklos Szeredi <mszeredi@redhat.com>

1bd0a3ae

10 12月, 2019 1 次提交

ovl: fix lookup failure on multi lower squashfs · 7e63c87f

由 Amir Goldstein 提交于 11月 14, 2019

In the past, overlayfs required that lower fs have non null uuid in
order to support nfs export and decode copy up origin file handles.

Commit 9df085f3 ("ovl: relax requirement for non null uuid of
lower fs") relaxed this requirement for nfs export support, as long
as uuid (even if null) is unique among all lower fs.

However, said commit unintentionally also relaxed the non null uuid
requirement for decoding copy up origin file handles, regardless of
the unique uuid requirement.

Amend this mistake by disabling decoding of copy up origin file handle
from lower fs with a conflicting uuid.

We still encode copy up origin file handles from those fs, because
file handles like those already exist in the wild and because they
might provide useful information in the future.

There is an unhandled corner case described by Miklos this way:
- two filesystems, A and B, both have null uuid
- upper layer is on A
- lower layer 1 is also on A
- lower layer 2 is on B

In this case bad_uuid won't be set for B, because the check only
involves the list of lower fs.  Hence we'll try to decode a layer 2
origin on layer 1 and fail.

We will deal with this corner case later.
Reported-by: NColin Ian King <colin.king@canonical.com>
Tested-by: NColin Ian King <colin.king@canonical.com>
Link: https://lore.kernel.org/lkml/20191106234301.283006-1-colin.king@canonical.com/
Fixes: 9df085f3 ("ovl: relax requirement for non null uuid ...")
Cc: stable@vger.kernel.org # v4.20+
Signed-off-by: NAmir Goldstein <amir73il@gmail.com>
Signed-off-by: NMiklos Szeredi <mszeredi@redhat.com>

7e63c87f

16 7月, 2019 1 次提交

ovl: fix regression caused by overlapping layers detection · 0be0bfd2

由 Amir Goldstein 提交于 7月 12, 2019

Once upon a time, commit 2cac0c00 ("ovl: get exclusive ownership on
upper/work dirs") in v4.13 added some sanity checks on overlayfs layers.
This change caused a docker regression. The root cause was mount leaks
by docker, which as far as I know, still exist.

To mitigate the regression, commit 85fdee1e ("ovl: fix regression
caused by exclusive upper/work dir protection") in v4.14 turned the
mount errors into warnings for the default index=off configuration.

Recently, commit 146d62e5 ("ovl: detect overlapping layers") in
v5.2, re-introduced exclusive upper/work dir checks regardless of
index=off configuration.

This changes the status quo and mount leak related bug reports have
started to re-surface. Restore the status quo to fix the regressions.
To clarify, index=off does NOT relax overlapping layers check for this
ovelayfs mount. index=off only relaxes exclusive upper/work dir checks
with another overlayfs mount.

To cover the part of overlapping layers detection that used the
exclusive upper/work dir checks to detect overlap with self upper/work
dir, add a trap also on the work base dir.

Link: https://github.com/moby/moby/issues/34672
Link: https://lore.kernel.org/linux-fsdevel/20171006121405.GA32700@veci.piliscsaba.szeredi.hu/
Link: https://github.com/containers/libpod/issues/3540
Fixes: 146d62e5 ("ovl: detect overlapping layers")
Cc: <stable@vger.kernel.org> # v4.19+
Signed-off-by: NAmir Goldstein <amir73il@gmail.com>
Tested-by: NColin Walters <walters@verbum.org>
Signed-off-by: NMiklos Szeredi <mszeredi@redhat.com>

0be0bfd2

19 6月, 2019 1 次提交

treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 500 · d2912cb1

由 Thomas Gleixner 提交于 6月 04, 2019

Based on 2 normalized pattern(s):

  this program is free software you can redistribute it and or modify
  it under the terms of the gnu general public license version 2 as
  published by the free software foundation

  this program is free software you can redistribute it and or modify
  it under the terms of the gnu general public license version 2 as
  published by the free software foundation #

extracted by the scancode license scanner the SPDX license identifier

  GPL-2.0-only

has been chosen to replace the boilerplate/reference in 4122 file(s).
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
Reviewed-by: NEnrico Weigelt <info@metux.net>
Reviewed-by: NKate Stewart <kstewart@linuxfoundation.org>
Reviewed-by: NAllison Randal <allison@lohutok.net>
Cc: linux-spdx@vger.kernel.org
Link: https://lkml.kernel.org/r/20190604081206.933168790@linutronix.deSigned-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>

d2912cb1

18 6月, 2019 3 次提交

ovl: fix typo in MODULE_PARM_DESC · 253e7483

由 Nicolas Schier 提交于 6月 17, 2019

Change first argument to MODULE_PARM_DESC() calls, that each of them
matched the actual module parameter name.  The matching results in
changing (the 'parm' section from) the output of `modinfo overlay` from:

    parm: ovl_check_copy_up:Obsolete; does nothing
    parm: redirect_max:ushort
    parm: ovl_redirect_max:Maximum length of absolute redirect xattr value
    parm: redirect_dir:bool
    parm: ovl_redirect_dir_def:Default to on or off for the redirect_dir feature
    parm: redirect_always_follow:bool
    parm: ovl_redirect_always_follow:Follow redirects even if redirect_dir feature is turned off
    parm: index:bool
    parm: ovl_index_def:Default to on or off for the inodes index feature
    parm: nfs_export:bool
    parm: ovl_nfs_export_def:Default to on or off for the NFS export feature
    parm: xino_auto:bool
    parm: ovl_xino_auto_def:Auto enable xino feature
    parm: metacopy:bool
    parm: ovl_metacopy_def:Default to on or off for the metadata only copy up feature

into:

    parm: check_copy_up:Obsolete; does nothing
    parm: redirect_max:Maximum length of absolute redirect xattr value (ushort)
    parm: redirect_dir:Default to on or off for the redirect_dir feature (bool)
    parm: redirect_always_follow:Follow redirects even if redirect_dir feature is turned off (bool)
    parm: index:Default to on or off for the inodes index feature (bool)
    parm: nfs_export:Default to on or off for the NFS export feature (bool)
    parm: xino_auto:Auto enable xino feature (bool)
    parm: metacopy:Default to on or off for the metadata only copy up feature (bool)
Signed-off-by: NNicolas Schier <n.schier@avm.de>
Signed-off-by: NMiklos Szeredi <mszeredi@redhat.com>

253e7483

ovl: fix bogus -Wmaybe-unitialized warning · 1dac6f5b

由 Arnd Bergmann 提交于 6月 17, 2019

gcc gets a bit confused by the logic in ovl_setup_trap() and
can't figure out whether the local 'trap' variable in the caller
was initialized or not:

fs/overlayfs/super.c: In function 'ovl_fill_super':
fs/overlayfs/super.c:1333:4: error: 'trap' may be used uninitialized in this function [-Werror=maybe-uninitialized]
    iput(trap);
    ^~~~~~~~~~
fs/overlayfs/super.c:1312:17: note: 'trap' was declared here

Reword slightly to make it easier for the compiler to understand.

Fixes: 146d62e5 ("ovl: detect overlapping layers")
Signed-off-by: NArnd Bergmann <arnd@arndb.de>
Signed-off-by: NMiklos Szeredi <mszeredi@redhat.com>

1dac6f5b

ovl: don't fail with disconnected lower NFS · 9179c21d

由 Miklos Szeredi 提交于 6月 18, 2019

NFS mounts can be disconnected from fs root.  Don't fail the overlapping
layer check because of this.

The check is not authoritative anyway, since topology can change during or
after the check.

Reported-by: Antti Antinoja <antti@fennosys.fi> 
Signed-off-by: NMiklos Szeredi <mszeredi@redhat.com>
Fixes: 146d62e5 ("ovl: detect overlapping layers")

9179c21d

29 5月, 2019 1 次提交

ovl: detect overlapping layers · 146d62e5

由 Amir Goldstein 提交于 4月 18, 2019

Overlapping overlay layers are not supported and can cause unexpected
behavior, but overlayfs does not currently check or warn about these
configurations.

User is not supposed to specify the same directory for upper and
lower dirs or for different lower layers and user is not supposed to
specify directories that are descendants of each other for overlay
layers, but that is exactly what this zysbot repro did:

    https://syzkaller.appspot.com/x/repro.syz?x=12c7a94f400000

Moving layer root directories into other layers while overlayfs
is mounted could also result in unexpected behavior.

This commit places "traps" in the overlay inode hash table.
Those traps are dummy overlay inodes that are hashed by the layers
root inodes.

On mount, the hash table trap entries are used to verify that overlay
layers are not overlapping.  While at it, we also verify that overlay
layers are not overlapping with directories "in-use" by other overlay
instances as upperdir/workdir.

On lookup, the trap entries are used to verify that overlay layers
root inodes have not been moved into other layers after mount.

Some examples:

$ ./run --ov --samefs -s
...
( mkdir -p base/upper/0/u base/upper/0/w base/lower lower upper mnt
  mount -o bind base/lower lower
  mount -o bind base/upper upper
  mount -t overlay none mnt ...
        -o lowerdir=lower,upperdir=upper/0/u,workdir=upper/0/w)

$ umount mnt
$ mount -t overlay none mnt ...
        -o lowerdir=base,upperdir=upper/0/u,workdir=upper/0/w

  [   94.434900] overlayfs: overlapping upperdir path
  mount: mount overlay on mnt failed: Too many levels of symbolic links

$ mount -t overlay none mnt ...
        -o lowerdir=upper/0/u,upperdir=upper/0/u,workdir=upper/0/w

  [  151.350132] overlayfs: conflicting lowerdir path
  mount: none is already mounted or mnt busy

$ mount -t overlay none mnt ...
        -o lowerdir=lower:lower/a,upperdir=upper/0/u,workdir=upper/0/w

  [  201.205045] overlayfs: overlapping lowerdir path
  mount: mount overlay on mnt failed: Too many levels of symbolic links

$ mount -t overlay none mnt ...
        -o lowerdir=lower,upperdir=upper/0/u,workdir=upper/0/w
$ mv base/upper/0/ base/lower/
$ find mnt/0
  mnt/0
  mnt/0/w
  find: 'mnt/0/w/work': Too many levels of symbolic links
  find: 'mnt/0/u': Too many levels of symbolic links

Reported-by: syzbot+9c69c282adc4edd2b540@syzkaller.appspotmail.com
Signed-off-by: NAmir Goldstein <amir73il@gmail.com>
Signed-off-by: NMiklos Szeredi <mszeredi@redhat.com>

146d62e5

02 5月, 2019 1 次提交

overlayfs: make use of ->free_inode() · 0b269ded

由 Al Viro 提交于 4月 15, 2019

synchronous parts are left in ->destroy_inode()
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>

0b269ded

02 11月, 2018 1 次提交

ovl: automatically enable redirect_dir on metacopy=on · d47748e5

由 Miklos Szeredi 提交于 11月 01, 2018

Current behavior is to automatically disable metacopy if redirect_dir is
not enabled and proceed with the mount.

If "metacopy=on" mount option was given, then this behavior can confuse the
user: no mount failure, yet metacopy is disabled.

This patch makes metacopy=on imply redirect_dir=on.

The converse is also true: turning off full redirect with redirect_dir=
{off|follow|nofollow} will disable metacopy.

If both metacopy=on and redirect_dir={off|follow|nofollow} is specified,
then mount will fail, since there's no way to correctly resolve the
conflict.
Reported-by: NDaniel Walsh <dwalsh@redhat.com>
Fixes: d5791044 ("ovl: Provide a mount option metacopy=on/off...")
Cc: <stable@vger.kernel.org> # v4.19
Signed-off-by: NMiklos Szeredi <mszeredi@redhat.com>

d47748e5

27 10月, 2018 1 次提交

ovl: relax requirement for non null uuid of lower fs · 9df085f3

由 Amir Goldstein 提交于 9月 03, 2018

We use uuid to associate an overlay lower file handle with a lower layer,
so we can accept lower fs with null uuid as long as all lower layers with
null uuid are on the same fs.

This change allows enabling index and nfs_export features for the setup of
single lower fs of type squashfs - squashfs supports file handles, but has
a null uuid. This change also allows enabling index and nfs_export features
for nested overlayfs, where the lower overlay has nfs_export enabled.

Enabling the index feature with single lower squashfs fixes the
unionmount-testsuite test:
  ./run --ov --squashfs --verify

As a by-product, if, like the lower squashfs, upper fs also uses the
generic export_encode_fh() implementation to export 32bit inode file
handles (e.g. ext4), then the xino_auto config/module/mount option will
enable unique overlay inode numbers.
Signed-off-by: NAmir Goldstein <amir73il@gmail.com>
Signed-off-by: NMiklos Szeredi <mszeredi@redhat.com>

9df085f3

10 9月, 2018 1 次提交

ovl: fix oopses in ovl_fill_super() failure paths · 8c25741a

由 Miklos Szeredi 提交于 9月 10, 2018

ovl_free_fs() dereferences ofs->workbasedir and ofs->upper_mnt in cases when
those might not have been initialized yet.

Fix the initialization order for these fields.

Reported-by: syzbot+c75f181dc8429d2eb887@syzkaller.appspotmail.com
Signed-off-by: NMiklos Szeredi <mszeredi@redhat.com>
Cc:  <stable@vger.kernel.org> # v4.15
Fixes: 95e6d417 ("ovl: grab reference to workbasedir early")
Fixes: a9075cdb ("ovl: factor out ovl_free_fs() helper")

8c25741a

20 7月, 2018 4 次提交

ovl: Do not expose metacopy only dentry from d_real() · 2c3d7358

由 Vivek Goyal 提交于 5月 11, 2018

Metacopy dentry/inode is internal to overlay and is never exposed outside
of it.  Exception is metacopy upper file used for fsync().  Modify d_real()
to look for dentries/inode which have data, but also allow matching upper
inode without data for the fsync case.
Signed-off-by: NVivek Goyal <vgoyal@redhat.com>
Reviewed-by: NAmir Goldstein <amir73il@gmail.com>
Signed-off-by: NMiklos Szeredi <mszeredi@redhat.com>

2c3d7358

ovl: Store lower data inode in ovl_inode · 2664bd08

由 Vivek Goyal 提交于 5月 11, 2018

Right now ovl_inode stores inode pointer for lower inode.  This helps with
quickly getting lower inode given overlay inode (ovl_inode_lower()).

Now with metadata only copy-up, we can have metacopy inode in middle layer
as well and inode containing data can be different from ->lower.  I need to
be able to open the real file in ovl_open_realfile() and for that I need to
quickly find the lower data inode.

Hence store lower data inode also in ovl_inode.  Also provide an helper
ovl_inode_lowerdata() to access this field.
Signed-off-by: NVivek Goyal <vgoyal@redhat.com>
Reviewed-by: NAmir Goldstein <amir73il@gmail.com>
Signed-off-by: NMiklos Szeredi <mszeredi@redhat.com>

2664bd08

ovl: A new xattr OVL_XATTR_METACOPY for file on upper · 0c288874

由 Vivek Goyal 提交于 5月 11, 2018

Now we will have the capability to have upper inodes which might be only
metadata copy up and data is still on lower inode.  So add a new xattr
OVL_XATTR_METACOPY to distinguish between two cases.

Presence of OVL_XATTR_METACOPY reflects that file has been copied up
metadata only and and data will be copied up later from lower origin.  So
this xattr is set when a metadata copy takes place and cleared when data
copy takes place.

We also use a bit in ovl_inode->flags to cache OVL_UPPERDATA which reflects
whether ovl inode has data or not (as opposed to metadata only copy up).

If a file is copied up metadata only and later when same file is opened for
WRITE, then data copy up takes place.  We copy up data, remove METACOPY
xattr and then set the UPPERDATA flag in ovl_inode->flags.  While all these
operations happen with oi->lock held, read side of oi->flags can be
lockless.  That is another thread on another cpu can check if UPPERDATA
flag is set or not.

So this gives us an ordering requirement w.r.t UPPERDATA flag.  That is, if
another cpu sees UPPERDATA flag set, then it should be guaranteed that
effects of data copy up and remove xattr operations are also visible.

For example.

	CPU1				CPU2
ovl_open()				acquire(oi->lock)
 ovl_open_maybe_copy_up()                ovl_copy_up_data()
  open_open_need_copy_up()		 vfs_removexattr()
   ovl_already_copied_up()
    ovl_dentry_needs_data_copy_up()	 ovl_set_flag(OVL_UPPERDATA)
     ovl_test_flag(OVL_UPPERDATA)       release(oi->lock)

Say CPU2 is copying up data and in the end sets UPPERDATA flag.  But if
CPU1 perceives the effects of setting UPPERDATA flag but not the effects of
preceding operations (ex. upper that is not fully copied up), it will be a
problem.

Hence this patch introduces smp_wmb() on setting UPPERDATA flag operation
and smp_rmb() on UPPERDATA flag test operation.

May be some other lock or barrier is already covering it. But I am not sure
what that is and is it obvious enough that we will not break it in future.

So hence trying to be safe here and introducing barriers explicitly for
UPPERDATA flag/bit.
Signed-off-by: NVivek Goyal <vgoyal@redhat.com>
Reviewed-by: NAmir Goldstein <amir73il@gmail.com>
Signed-off-by: NMiklos Szeredi <mszeredi@redhat.com>

0c288874

ovl: Provide a mount option metacopy=on/off for metadata copyup · d5791044

由 Vivek Goyal 提交于 5月 11, 2018

By default metadata only copy up is disabled.  Provide a mount option so
that users can choose one way or other.

Also provide a kernel config and module option to enable/disable metacopy
feature.

metacopy feature requires redirect_dir=on when upper is present.
Otherwise, it requires redirect_dir=follow atleast.

As of now, metacopy does not work with nfs_export=on.  So if both
metacopy=on and nfs_export=on then nfs_export is disabled.
Signed-off-by: NVivek Goyal <vgoyal@redhat.com>
Reviewed-by: NAmir Goldstein <amir73il@gmail.com>
Signed-off-by: NMiklos Szeredi <mszeredi@redhat.com>

d5791044

18 7月, 2018 5 次提交

vfs: remove open_flags from d_real() · fb16043b

由 Miklos Szeredi 提交于 7月 18, 2018

Opening regular files on overlayfs is now handled via ovl_open().  Remove
the now unused "open_flags" argument from d_op->d_real() and the d_real()
helper.
Signed-off-by: NMiklos Szeredi <mszeredi@redhat.com>

fb16043b

Partially revert "locks: fix file locking on overlayfs" · de2a4a50

由 Miklos Szeredi 提交于 7月 18, 2018

This partially reverts commit c568d683.

Overlayfs files will now automatically get the correct locks, no need to
hack overlay support in VFS.

It is a partial revert, because it leaves the locks_inode() calls in place
and defines locks_inode() to file_inode().  We could revert those as well,
but it would be unnecessary code churn and it makes sense to document that
we are getting the inode for locking purposes.

Don't revert MS_NOREMOTELOCK yet since that has been part of the userspace
API for some time (though not in a useful way).  Will try to remove
internal flags later when the dust around the new mount API settles.
Signed-off-by: NMiklos Szeredi <mszeredi@redhat.com>
Acked-by: NJeff Layton <jlayton@kernel.org>

de2a4a50

Revert "vfs: add flags to d_real()" · 4ab30319

由 Miklos Szeredi 提交于 7月 18, 2018

This reverts commit 495e6429.

No user of "flags" argument of d_real() remain.
Signed-off-by: NMiklos Szeredi <mszeredi@redhat.com>

4ab30319

Revert "ovl: fix relatime for directories" · 88059de1

由 Miklos Szeredi 提交于 7月 18, 2018

This reverts commit cd91304e.

Overlayfs no longer relies on the vfs correct atime handling.
Signed-off-by: NMiklos Szeredi <mszeredi@redhat.com>

88059de1

M
ovl: deal with overlay files in ovl_d_real() · e8c985ba
由 Miklos Szeredi 提交于 7月 18, 2018
```
Signed-off-by: NMiklos Szeredi <mszeredi@redhat.com>
```
e8c985ba

31 5月, 2018 3 次提交

ovl: return dentry from ovl_create_real() · 95a1c815

由 Miklos Szeredi 提交于 5月 16, 2018

Al Viro suggested to simplify callers of ovl_create_real() by
returning the created dentry (or ERR_PTR) from ovl_create_real().
Suggested-by: NAl Viro <viro@zeniv.linux.org.uk>
Signed-off-by: NMiklos Szeredi <mszeredi@redhat.com>

95a1c815

ovl: struct cattr cleanups · 471ec5dc

由 Amir Goldstein 提交于 5月 16, 2018

* Rename to ovl_cattr

* Fold ovl_create_real() hardlink argument into struct ovl_cattr

* Create macro OVL_CATTR() to initialize struct ovl_cattr from mode
Signed-off-by: NAmir Goldstein <amir73il@gmail.com>
Signed-off-by: NMiklos Szeredi <mszeredi@redhat.com>

471ec5dc

ovl: strip debug argument from ovl_do_ helpers · 6cf00764

由 Amir Goldstein 提交于 5月 16, 2018

It did not prove to be useful.
Signed-off-by: NAmir Goldstein <amir73il@gmail.com>
Signed-off-by: NMiklos Szeredi <mszeredi@redhat.com>

6cf00764

12 4月, 2018 3 次提交

ovl: add support for "xino" mount and config options · 795939a9

由 Amir Goldstein 提交于 3月 29, 2018

With mount option "xino=on", mounter declares that there are enough
free high bits in underlying fs to hold the layer fsid.
If overlayfs does encounter underlying inodes using the high xino
bits reserved for layer fsid, a warning will be emitted and the original
inode number will be used.

The mount option name "xino" goes after a similar meaning mount option
of aufs, but in overlayfs case, the mapping is stateless.

An example for a use case of "xino=on" is when upper/lower is on an xfs
filesystem. xfs uses 64bit inode numbers, but it currently never uses the
upper 8bit for inode numbers exposed via stat(2) and that is not likely to
change in the future without user opting-in for a new xfs feature. The
actual number of unused upper bit is much larger and determined by the xfs
filesystem geometry (64 - agno_log - agblklog - inopblog). That means
that for all practical purpose, there are enough unused bits in xfs
inode numbers for more than OVL_MAX_STACK unique fsid's.

Another use case of "xino=on" is when upper/lower is on tmpfs. tmpfs inode
numbers are allocated sequentially since boot, so they will practially
never use the high inode number bits.

For compatibility with applications that expect 32bit inodes, the feature
can be disabled with "xino=off". The option "xino=auto" automatically
detects underlying filesystem that use 32bit inodes and enables the
feature. The Kconfig option OVERLAY_FS_XINO_AUTO and module parameter of
the same name, determine if the default mode for overlayfs mount is
"xino=auto" or "xino=off".
Signed-off-by: NAmir Goldstein <amir73il@gmail.com>
Signed-off-by: NMiklos Szeredi <mszeredi@redhat.com>

795939a9

ovl: constant st_ino for non-samefs with xino · e487d889

由 Amir Goldstein 提交于 11月 07, 2017

On 64bit systems, when overlay layers are not all on the same fs, but
all inode numbers of underlying fs are not using the high bits, use the
high bits to partition the overlay st_ino address space. The high bits
hold the fsid (upper fsid is 0). This way overlay inode numbers are unique
and all inodes use overlay st_dev. Inode numbers are also persistent
for a given layer configuration.

Currently, our only indication for available high ino bits is from a
filesystem that supports file handles and uses the default encode_fh()
operation, which encodes a 32bit inode number.
Signed-off-by: NAmir Goldstein <amir73il@gmail.com>
Signed-off-by: NMiklos Szeredi <mszeredi@redhat.com>

e487d889

ovl: allocate anon bdev per unique lower fs · 5148626b

由 Amir Goldstein 提交于 3月 28, 2018

Instead of allocating an anonymous bdev per lower layer, allocate
one anonymous bdev per every unique lower fs that is different than
upper fs.

Every unique lower fs is assigned an fsid > 0 and the number of
unique lower fs are stored in ofs->numlowerfs.

The assigned fsid is stored in the lower layer struct and will be
used also for inode number multiplexing.
Signed-off-by: NAmir Goldstein <amir73il@gmail.com>
Signed-off-by: NMiklos Szeredi <mszeredi@redhat.com>

5148626b

16 2月, 2018 1 次提交

ovl: check lower ancestry on encode of lower dir file handle · 2ca3c148

由 Amir Goldstein 提交于 1月 30, 2018

This change relaxes copy up on encode of merge dir with lower layer > 1
and handles the case of encoding a merge dir with lower layer 1, where an
ancestor is a non-indexed merge dir. In that case, decode of the lower
file handle will not have been possible if the non-indexed ancestor is
redirected before or after encode.

Before encoding a non-upper directory file handle from real layer N, we
need to check if it will be possible to reconnect an overlay dentry from
the real lower decoded dentry. This is done by following the overlay
ancestry up to a "layer N connected" ancestor and verifying that all
parents along the way are "layer N connectable". If an ancestor that is
NOT "layer N connectable" is found, we need to copy up an ancestor, which
is "layer N connectable", thus making that ancestor "layer N connected".
For example:

 layer 1: /a
 layer 2: /a/b/c

The overlay dentry /a is NOT "layer 2 connectable", because if dir /a is
copied up and renamed, upper dir /a will be indexed by lower dir /a from
layer 1. The dir /a from layer 2 will never be indexed, so the algorithm
in ovl_lookup_real_ancestor() (*) will not be able to lookup a connected
overlay dentry from the connected lower dentry /a/b/c.

To avoid this problem on decode time, we need to copy up an ancestor of
/a/b/c, which is "layer 2 connectable", on encode time. That ancestor is
/a/b. After copy up (and index) of /a/b, it will become "layer 2 connected"
and when the time comes to decode the file handle from lower dentry /a/b/c,
ovl_lookup_real_ancestor() will find the indexed ancestor /a/b and decoding
a connected overlay dentry will be accomplished.

(*) the algorithm in ovl_lookup_real_ancestor() can be improved to lookup
an entry /a in the lower layers above layer N and find the indexed dir /a
from layer 1. If that improvement is made, then the check for "layer N
connected" will need to verify there are no redirects in lower layers above
layer N. In the example above, /a will be "layer 2 connectable". However,
if layer 2 dir /a is a target of a layer 1 redirect, then /a will NOT be
"layer 2 connectable":

 layer 1: /A (redirect = /a)
 layer 2: /a/b/c
Signed-off-by: NAmir Goldstein <amir73il@gmail.com>
Signed-off-by: NMiklos Szeredi <mszeredi@redhat.com>

2ca3c148

24 1月, 2018 4 次提交

ovl: wire up NFS export operations · 8383f174

由 Amir Goldstein 提交于 10月 02, 2017

Now that NFS export operations are implemented, enable overlayfs NFS
export support if the "nfs_export" feature is enabled.
Signed-off-by: NAmir Goldstein <amir73il@gmail.com>
Signed-off-by: NMiklos Szeredi <mszeredi@redhat.com>

8383f174

ovl: store 'has_upper' and 'opaque' as bit flags · c62520a8

由 Amir Goldstein 提交于 1月 14, 2018

We need to make some room in struct ovl_entry to store information
about redirected ancestors for NFS export, so cram two booleans as
bit flags.
Signed-off-by: NAmir Goldstein <amir73il@gmail.com>
Signed-off-by: NMiklos Szeredi <mszeredi@redhat.com>

c62520a8

ovl: use directory index entries for consistency verification · ad1d615c

由 Amir Goldstein 提交于 1月 11, 2018

A directory index is a directory type entry in index dir with a
"trusted.overlay.upper" xattr containing an encoded ovl_fh of the merge
directory upper dir inode.

On lookup of non-dir files, lower file is followed by origin file handle.
On lookup of dir entries, lower dir is found by name and then compared
to origin file handle. We only trust dir index if we verified that lower
dir matches origin file handle, otherwise index may be inconsistent and
we ignore it.

If we find an indexed non-upper dir or an indexed merged dir, whose
index 'upper' xattr points to a different upper dir, that means that the
lower directory may be also referenced by another upper dir via redirect,
so we fail the lookup on inconsistency error.

To be consistent with directory index entries format, the association of
index dir to upper root dir, that was stored by older kernels in
"trusted.overlay.origin" xattr is now stored in "trusted.overlay.upper"
xattr. This also serves as an indication that overlay was mounted with a
kernel that support index directory entries. For backward compatibility,
if an 'origin' xattr exists on the index dir we also verify it on mount.

Directory index entries are going to be used for NFS export.
Signed-off-by: NAmir Goldstein <amir73il@gmail.com>
Signed-off-by: NMiklos Szeredi <mszeredi@redhat.com>

ad1d615c

ovl: add support for "nfs_export" configuration · f168f109

由 Amir Goldstein 提交于 1月 19, 2018

Introduce the "nfs_export" config, module and mount options.

The NFS export feature depends on the "index" feature and enables two
implicit overlayfs features: "index_all" and "verify_lower".
The "index_all" feature creates an index on copy up of every file and
directory. The "verify_lower" feature uses the full index to detect
overlay filesystems inconsistencies on lookup, like redirect from
multiple upper dirs to the same lower dir.

NFS export can be enabled for non-upper mount with no index. However,
because lower layer redirects cannot be verified with the index, enabling
NFS export support on an overlay with no upper layer requires turning off
redirect follow (e.g. "redirect_dir=nofollow").

The full index may incur some overhead on mount time, especially when
verifying that lower directory file handles are not stale.

NFS export support, full index and consistency verification will be
implemented by following patches.
Signed-off-by: NAmir Goldstein <amir73il@gmail.com>
Signed-off-by: NMiklos Szeredi <mszeredi@redhat.com>

f168f109

openeuler / Kernel 接近 2 年 前同步成功

openeuler / Kernel
接近 2 年前同步成功