Commits · 313684c48cc0e450ab303e1f82130ee2d0b50274 · openanolis / cloud-kernel

16 Dec, 2016 35 commits

ovl: fix return value of ovl_fill_super · 313684c4

Committed by Geliang Tang on Nov 18, 2016

If kcalloc() failed, the return value of ovl_fill_super() is -EINVAL,
not -ENOMEM. So this patch sets this value to -ENOMEM before calling
kcalloc(), and sets it back to -EINVAL after calling kcalloc().
Signed-off-by: Geliang Tang <geliangtang@gmail.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>

ovl: clean up kstat usage · 32a3d848

Committed by Al Viro on Dec 04, 2016

FWIW, there's a bit of abuse of struct kstat in overlayfs object
creation paths - for one thing, it ends up with a very small subset
of struct kstat (mode + rdev), for another it also needs link in
case of symlinks and ends up passing it separately.

IMO it would be better to introduce a separate object for that.

In principle, we might even lift that thing into general API and switch
 ->mkdir()/->mknod()/->symlink() to identical calling conventions.  Hell
knows, perhaps ->create() as well...
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>

ovl: fold ovl_copy_up_truncate() into ovl_copy_up() · 9aba6521

Committed by Amir Goldstein on Nov 12, 2016

This removes code duplication.
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>

ovl: create directories inside merged parent opaque · 97c684cc

Committed by Amir Goldstein on Nov 21, 2016

The benefit of making directories opaque on creation is that lookups can
stop short when they reach the original created directory, instead of
continuing the lookup through the entire depth of the parent directory stack.

The best case is overlay with N layers, performing lookup for first level
directory, which exists only in upper.  In that case, there will be only
one lookup instead of N.
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
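
The saving described in this commit can be illustrated with a small user-space model. This is only a sketch, not kernel code: layers are modelled as dictionaries mapping a directory name to its set of xattrs, and `count_layer_lookups` is a hypothetical helper introduced here for illustration.

```python
from typing import Dict, List, Set

# xattr name as it appears elsewhere in this commit log
OPAQUE = "trusted.overlay.opaque"


def count_layer_lookups(layers: List[Dict[str, Set[str]]], name: str) -> int:
    """Count how many layers are probed for `name`, upper layer first.

    Each layer maps a directory name to the set of xattrs set on it.
    """
    lookups = 0
    for layer in layers:
        lookups += 1
        xattrs = layer.get(name)
        if xattrs is not None and OPAQUE in xattrs:
            break  # opaque directory: nothing below it can be merged in
    return lookups


# A directory that exists only in upper, in an overlay with N = 4 layers.
plain = [{"newdir": set()}, {}, {}, {}]
opaque = [{"newdir": {OPAQUE}}, {}, {}, {}]
```

In this model `count_layer_lookups(plain, "newdir")` probes all four layers, while the opaque variant stops after the first one.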

ovl: opaque cleanup · 5cf5b477

Committed by Miklos Szeredi on Dec 16, 2016

oe->opaque is set for

 a) whiteouts
 b) directories having the "trusted.overlay.opaque" xattr

Case b can be simplified, since setting the xattr always implies setting
oe->opaque.  Also once set, the opaque flag is never cleared.

Don't need to set opaque flag for non-directories.
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>

ovl: show redirect_dir mount option · c5bef3a7

Committed by Amir Goldstein on Nov 22, 2016

Show the value of redirect_dir in /proc/mounts.
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>

ovl: allow setting max size of redirect · 3ea22a71

Committed by Miklos Szeredi on Dec 16, 2016

Add a module option to allow tuning the max size of absolute redirects.
Default is 256.

Size of relative redirects is naturally limited by the underlying
filesystem's max filename length (usually 255).
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>

ovl: allow redirect_dir to default to "on" · 688ea0e5

Committed by Miklos Szeredi on Dec 16, 2016

This patch introduces a kernel config option and a module param. Both can
be used independently to turn the default value of redirect_dir on or off.
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>

ovl: check for emptiness of redirect dir · d1595119

Committed by Amir Goldstein on Oct 26, 2016

Before introducing redirect_dir feature, the condition
!ovl_lower_positive(dentry) for a directory, implied that it is a pure
upper directory, which may be removed if empty.

Now that a directory can be redirected, it is possible that upper does not
cover any lower (i.e. !ovl_lower_positive(dentry)), but the directory is a
merge (with a redirected path) and may be non-empty.

Check for this case in ovl_remove_upper().

This change fixes the following test case from rename-pop-dir.py
of unionmount-testsuite:

    """Remove dir and rename old name"""
    d = ctx.non_empty_dir()
    d2 = ctx.no_dir()

    ctx.rmdir(d, err=ENOTEMPTY)
    ctx.rename(d, d2)
    ctx.rmdir(d, err=ENOENT)
    ctx.rmdir(d2, err=ENOTEMPTY)

./run --ov rename-pop-dir
/mnt/a/no_dir103: Expected error (Directory not empty) was not produced
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>

ovl: redirect on rename-dir · a6c60655

Committed by Miklos Szeredi on Dec 16, 2016

Current code returns EXDEV when a directory would need to be copied up to
move. We could copy up the directory tree in this case, but there's
another, simpler solution: point to old lower directory from moved upper
directory.

This is achieved with a "trusted.overlay.redirect" xattr storing the path
relative to the root of the overlay. After such attribute has been set,
the directory can be moved without further actions required.

This is a backward incompatible feature, old kernels won't be able to
correctly mount an overlay containing redirected directories.
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>

ovl: lookup redirects · 02b69b28

Committed by Miklos Szeredi on Dec 16, 2016

If a directory has the "trusted.overlay.redirect" xattr, it means that the
value of the xattr should be used to find the underlying directory on the
next lower layer.

The redirect may be relative or absolute.  Absolute redirects begin with a
slash.

A relative redirect means: instead of the current dentry's name use the
value of the redirect to find the directory in the next lower
layer. Relative redirects must not contain a slash.

An absolute redirect means: look up the directory relative to the root of
the overlay using the value of the redirect in the next lower layer.

Redirects work on lower layers as well.
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
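
The relative/absolute rules in this commit message can be restated as a short path-resolution sketch. It is a user-space illustration only: `next_lower_path` is a hypothetical name, `parent_path` is assumed to be the already-resolved path of the parent directory in the next lower layer, and the kernel's actual lookup code works on dentries rather than strings.

```python
import posixpath
from typing import Optional


def next_lower_path(parent_path: str, name: str,
                    redirect: Optional[str] = None) -> str:
    """Return the path to look up in the next lower layer.

    `redirect` models the value of the "trusted.overlay.redirect" xattr,
    or None when the directory carries no such xattr.
    """
    if redirect is None:
        # No redirect: keep using the current dentry's name.
        return posixpath.join(parent_path, name)
    if redirect.startswith("/"):
        # Absolute redirect: resolved from the root of the overlay.
        return posixpath.normpath(redirect)
    if "/" in redirect:
        # Relative redirects must not contain a slash.
        raise ValueError("invalid relative redirect: %r" % redirect)
    # Relative redirect: replaces only the last path component.
    return posixpath.join(parent_path, redirect)
```

For example, a directory now named `new` under `/a` with redirect `old` is looked up as `/a/old`, while redirect `/x/old` is looked up as `/x/old` regardless of the current parent.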

ovl: consolidate lookup for underlying layers · e28edc46

Committed by Miklos Szeredi on Dec 16, 2016

Use a common helper for lookup of upper and lower layers.  This paves the
way for looking up directory redirects.

No functional change.
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>

ovl: fix nested overlayfs mount · 48fab5d7

Committed by Amir Goldstein on Nov 16, 2016

When the upper overlayfs checks "trusted.overlay.*" xattr on the underlying
overlayfs mount, it gets -EPERM, which confuses the upper overlayfs.

Fix this by returning -EOPNOTSUPP instead of -EPERM from
ovl_own_xattr_get() and ovl_own_xattr_set().  This behavior is consistent
with the behavior of ovl_listxattr(), which filters out the private
overlayfs xattrs.

Note: nested overlays are deprecated.  But this change makes sense
regardless: these xattrs are private to the overlay and should always be
hidden.  Hence getting and setting them should indicate this.

[SzMi: Use EOPNOTSUPP instead of ENODATA and use it for both getting and
setting "trusted.overlay." xattrs.  This is a perfectly valid error code
for "we don't support this prefix", which is the case here.]
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
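
The behaviour described here (private xattrs are hidden from listing and refused with EOPNOTSUPP on access) can be modelled in a few lines. This is a toy sketch under stated assumptions: the xattr store is a plain dictionary, and `overlay_getxattr` and `overlay_listxattr` are hypothetical stand-ins, not the kernel's ovl_own_xattr_get() or ovl_listxattr().

```python
import errno
from typing import Dict, List

OVL_XATTR_PREFIX = "trusted.overlay."  # prefix quoted in the commit message


def overlay_getxattr(store: Dict[str, bytes], name: str) -> bytes:
    """Fetch an xattr, refusing overlay-private names with EOPNOTSUPP."""
    if name.startswith(OVL_XATTR_PREFIX):
        raise OSError(errno.EOPNOTSUPP, "prefix not supported", name)
    if name not in store:
        raise OSError(errno.ENODATA, "no such attribute", name)
    return store[name]


def overlay_listxattr(store: Dict[str, bytes]) -> List[str]:
    """List xattrs, filtering out the overlay-private ones."""
    return sorted(n for n in store if not n.startswith(OVL_XATTR_PREFIX))
```

With this model, listing and getting agree: a name that never shows up in the listing is also never readable, which is the consistency argument made in the message.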

ovl: check namelen · 6b2d5fe4

Committed by Miklos Szeredi on Dec 16, 2016

We already calculate f_namelen in statfs as the maximum of the name lengths
provided by the filesystems taking part in the overlay.
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
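
The "maximum of the name lengths" calculation can be approximated from user space. A minimal sketch, assuming the caller supplies the directories backing each layer; `overlay_namelen` is a hypothetical helper, not a kernel function.

```python
import os
from typing import Iterable


def overlay_namelen(layer_dirs: Iterable[str]) -> int:
    """Return the largest maximum filename length among the given layers."""
    # os.statvfs().f_namemax is the per-filesystem maximum filename length.
    return max(os.statvfs(d).f_namemax for d in layer_dirs)
```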

ovl: split super.c · bbb1e54d

Committed by Miklos Szeredi on Dec 16, 2016

fs/overlayfs/super.c is the biggest of the overlayfs source files and it
contains various utility functions as well as the rather complicated lookup
code.  Split these parts out to separate files.

Before:

 1446 fs/overlayfs/super.c

After:

  919 fs/overlayfs/super.c
  267 fs/overlayfs/namei.c
  235 fs/overlayfs/util.c
   51 fs/overlayfs/ovl_entry.h
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>

ovl: use d_is_dir() · 2b8c30e9

Committed by Miklos Szeredi on Dec 16, 2016

Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>

ovl: simplify lookup · 8ee6059c

Committed by Miklos Szeredi on Dec 16, 2016

If encountering a non-directory, then stop looking at lower layers.

In this case the oe->opaque flag is not set anymore, which doesn't matter
since existence of lower file is now checked at remove/rename time.
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>

ovl: check lower existence of rename target · 3ee23ff1

Committed by Miklos Szeredi on Dec 16, 2016

Check if something exists on the lower layer(s) under the target of rename
to decide if the directory needs to be marked "opaque".

Marking opaque is done before the rename, and on failure the marking was
undone. Also the opaque xattr was removed if the target didn't cover
anything.

This patch changes behavior so that removal of "opaque" is not done in
either of the above cases. This means that directory may have the opaque
flag even if it doesn't cover anything. However this shouldn't affect the
performance or semantics of the overlay, while simplifying the code.
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>

ovl: rename: simplify handling of lower/merged directory · 370e55ac

Committed by Miklos Szeredi on Dec 16, 2016

d_is_dir() is safe to call on a negative dentry.  Use this fact to simplify
handling of the lower or merged directories.
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>

ovl: get rid of PURE type · 38e813db

Committed by Miklos Szeredi on Dec 16, 2016

The remaining uses of __OVL_PATH_PURE can be replaced by
ovl_dentry_is_opaque().
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>

ovl: check lower existence when removing · 2aff4534

Committed by Miklos Szeredi on Dec 16, 2016

Currently ovl_lookup() checks existence of lower file even if there's a
non-directory on upper (which is always opaque). This is done so that
remove can decide whether a whiteout is needed or not.

It would be better to defer this check to unlink, since most of the time
the gathered information about opaqueness will be unused.

This adds a helper ovl_lower_positive() that checks if there's anything on
the lower layer(s).

The following patches also introduce changes to how the "opaque" attribute
is updated on directories: this attribute is added when the directory is
created or moved over a whiteout or object covering something on the lower
layer. However following changes will allow the attribute to remain on the
directory after being moved, even if the new location doesn't cover
anything. Because of this, we need to check lower layers even for opaque
directories, so that whiteout is only created when necessary.

This function will later be also used to decide about marking a directory
opaque, so deal with negative dentries as well. When dealing with
negative, it's enough to check for being a whiteout.

If the dentry is positive but not upper then it also obviously needs
whiteout/opaque.
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
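
The three cases listed in this message (negative dentry, positive but not upper, positive upper) can be written out as a decision function. This is a simplified user-space model, not the kernel helper itself: the `Entry` fields and the representation of lower layers as sets of names are assumptions made for illustration, and the sketch ignores whiteouts or opaque directories inside intermediate lower layers.

```python
from dataclasses import dataclass
from typing import List, Set


@dataclass
class Entry:
    name: str
    positive: bool             # something is visible at this name in the overlay
    has_upper: bool            # an object for it exists on the upper layer
    is_whiteout: bool = False  # the upper object is a whiteout


def lower_positive(entry: Entry, lower_layers: List[Set[str]]) -> bool:
    """Decide whether `entry` covers something on the lower layer(s)."""
    if not entry.positive:
        # Negative dentry: it is enough to check for being a whiteout.
        return entry.is_whiteout
    if not entry.has_upper:
        # Positive but not upper: it must come from a lower layer.
        return True
    # Positive upper (opaque or not): actually check the lower layers.
    return any(entry.name in layer for layer in lower_layers)
```

A remove operation modelled on top of this would create a whiteout only when `lower_positive()` returns True.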

ovl: add ovl_dentry_is_whiteout() · c412ce49

Committed by Miklos Szeredi on Dec 16, 2016

And use it instead of ovl_dentry_is_opaque() where appropriate.
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>

ovl: don't check sticky · 99f5d08e

Committed by Miklos Szeredi on Dec 16, 2016

Since commit 07a2daab ("ovl: Copy up underlying inode's ->i_mode to
overlay inode") sticky checking on overlay inode is performed by the vfs,
so checking against sticky on underlying inode is not needed.
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>

ovl: don't check rename to self · 804032fa

Committed by Miklos Szeredi on Dec 16, 2016

This is redundant, the vfs already performed this check (and was broken,
see commit 9409e22a ("vfs: rename: check backing inode being equal")).
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>

ovl: treat special files like a regular fs · ca4c8a3a

Committed by Miklos Szeredi on Dec 16, 2016

No sense in opening special files on the underlying layers, they work just
as well if opened on the overlay.

Side effect is that it's no longer possible to connect one side of a pipe
opened on overlayfs with the other side opened on the underlying layer.
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>

ovl: rename ovl_rename2() to ovl_rename() · 6c02cb59

Committed by Miklos Szeredi on Dec 16, 2016

Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>

ovl: use vfs_clone_file_range() for copy up if possible · 2ea98466

Committed by Amir Goldstein on Sep 23, 2016

When copying up within the same fs, try to use vfs_clone_file_range().
This is very efficient when lower and upper are on the same fs
with file reflink support. If vfs_clone_file_range() fails for any
reason, copy up falls back to the regular data copy code.

Tested correct behavior when lower and upper are on:
1. same ext4 (copy)
2. same xfs + reflink patches + mkfs.xfs (copy)
3. same xfs + reflink patches + mkfs.xfs -m reflink=1 (reflink)
4. different xfs + reflink patches + mkfs.xfs -m reflink=1 (copy)

For comparison, on my laptop, xfstest overlay/001 (copy up of large
sparse files) takes less than 1 second in the xfs reflink setup vs.
25 seconds on the rest of the setups.
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
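
The "try to clone, fall back to a regular copy" strategy has a close user-space analogue in the FICLONE ioctl. The sketch below is Linux-only and is not how overlayfs calls vfs_clone_file_range() internally; `copy_up_data` is a hypothetical helper, and the FICLONE constant is written out by hand from its `_IOW(0x94, 9, int)` definition in `<linux/fs.h>`.

```python
import fcntl
import shutil

FICLONE = 0x40049409  # _IOW(0x94, 9, int), see <linux/fs.h>


def copy_up_data(src_path: str, dst_path: str) -> str:
    """Copy src to dst; return "reflink" or "copy" to report which path ran."""
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        try:
            # Ask the filesystem to share extents between the two files.
            fcntl.ioctl(dst.fileno(), FICLONE, src.fileno())
            return "reflink"
        except OSError:
            # Clone failed for any reason (no reflink support, different
            # filesystems, ...): fall back to a regular data copy.
            shutil.copyfileobj(src, dst)
            return "copy"
```

On a filesystem without reflink support the helper is expected to take the "copy" branch, which mirrors setups 1, 2 and 4 in the list above.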

Revert "ovl: get_write_access() in truncate" · 31c3a706

Committed by Miklos Szeredi on Oct 12, 2016

This reverts commit 03bea604.

Commit 4d0c5ba2 ("vfs: do get_write_access() on upper layer of
overlayfs") makes the writecount checks inside overlayfs superfluous, the
file is already copied up and write access acquired on the upper inode when
ovl_setattr is called with ATTR_SIZE.
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>

ovl: update doc · 2d8f2908

Committed by Miklos Szeredi on Dec 16, 2016

The quirk for file locks and leases no longer applies.

Add missing info about renaming directory residing on lower layer.
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>

vfs: fix vfs_clone_file_range() for overlayfs files · b335e9d9

Committed by Amir Goldstein on Oct 26, 2016

With overlayfs, it is wrong to compare file_inode(inode)->i_sb
of regular files with those of non-regular files, because the
former reference the real (upper/lower) sb and the latter reference
the overlayfs sb.

Move the test for same super block after the sanity tests for
clone range of directory and non-regular file.

This change fixes xfstest generic/157, which returned EXDEV instead
of EISDIR/EINVAL in the following test cases over overlayfs:

  echo "Try to reflink a dir"
  _reflink_range $testdir1/dir1 0 $testdir1/file2 0 $blksz

  echo "Try to reflink a device"
  _reflink_range $testdir1/dev1 0 $testdir1/file2 0 $blksz
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
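
The reordering described in this commit (file-type sanity tests first, same-superblock test afterwards) can be shown as a small decision function. This is an illustrative model only: `clone_range_error` is a hypothetical name, file types are passed as `st_mode` values, and the real helper performs further checks that are omitted here.

```python
import errno
import stat


def clone_range_error(src_mode: int, dst_mode: int, same_sb: bool) -> int:
    """Return 0 or a negative errno, checking file types before superblocks."""
    # Sanity tests for directories and non-regular files come first ...
    if stat.S_ISDIR(src_mode) or stat.S_ISDIR(dst_mode):
        return -errno.EISDIR
    if not stat.S_ISREG(src_mode) or not stat.S_ISREG(dst_mode):
        return -errno.EINVAL
    # ... and only then the test for the same super block.
    if not same_sb:
        return -errno.EXDEV
    return 0
```

With this ordering, reflinking a directory or a device reports EISDIR or EINVAL even when the superblocks differ, which is what xfstest generic/157 expects.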

vfs: call vfs_clone_file_range() under freeze protection · 031a072a

Committed by Amir Goldstein on Sep 23, 2016

Move sb_start_write()/sb_end_write() out of the vfs helper and up into the
ioctl handler.
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>

vfs: allow vfs_clone_file_range() across mount points · 913b86e9

Committed by Amir Goldstein on Sep 23, 2016

FICLONE/FICLONERANGE ioctls return -EXDEV if src and dest
files are not on the same mount point.
Practically, clone only requires that src and dest files
are on the same file system.

Move the check for same mount point to ioctl handler and keep
only the check for same super block in the vfs helper.

A following patch is going to use the vfs_clone_file_range()
helper in overlayfs to copy up between lower and upper
mount points on the same file system.
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>

vfs: no mnt_want_write_file() in vfs_{copy,clone}_file_range() · 3616119d

Committed by Miklos Szeredi on Dec 16, 2016

We've checked for file_out being opened for write.  This ensures that we
already have mnt_want_write() on target.
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>

Revert "vfs: rename: check backing inode being equal" · 8d3e2936

Committed by Miklos Szeredi on Dec 16, 2016

This reverts commit 9409e22a.

Since commit 51f7e52d ("ovl: share inode for hard link") there's no
need to call d_real_inode() to check two overlay inodes for equality.
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>

Revert "af_unix: fix hard linked sockets on overlay" · beef5121

Committed by Miklos Szeredi on Dec 16, 2016

This reverts commit eb0a4a47.

Since commit 51f7e52d ("ovl: share inode for hard link") there's no
need to call d_real_inode() to check two overlay inodes for equality.

Side effect of this revert is that it's no longer possible to connect one
socket on overlayfs to one on the underlying layer (something which didn't
make sense anyway).
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>

05 Dec, 2016 1 commit

Linux 4.9-rc8 · 3e5de27e

Committed by Linus Torvalds on Dec 04, 2016

04 Dec, 2016 2 commits

Merge tag 'drm-fixes-for-v4.9-rc8' of git://people.freedesktop.org/~airlied/linux · 0cb65c83

Committed by Linus Torvalds on Dec 03, 2016

Pull drm fixes from Dave Airlie:
 "A pretty small pull request: a couple of AMD powerxpress regression
  fixes and a power management fix, a couple of i915 fixes and one hdlcd
  fix, along with one core don't oops because of incorrect API usage fix"

* tag 'drm-fixes-for-v4.9-rc8' of git://people.freedesktop.org/~airlied/linux:
  drm/i915: drop the struct_mutex when wedged or trying to reset
  drm/i915: Don't touch NULL sg on i915_gem_object_get_pages_gtt() error
  drm: Don't call drm_for_each_crtc with a non-KMS driver
  drm/radeon: fix check for port PM availability
  drm/amdgpu: fix check for port PM availability
  drm/amd/powerplay: initialize the soft_regs offset in struct smu7_hwmgr
  drm: hdlcd: Fix cleanup order

Merge tag 'drm-intel-fixes-2016-12-01' of... · ab7cd8d8

Committed by Dave Airlie on Dec 04, 2016

Merge tag 'drm-intel-fixes-2016-12-01' of git://anongit.freedesktop.org/git/drm-intel into drm-fixes

2 intel fixes.

* tag 'drm-intel-fixes-2016-12-01' of git://anongit.freedesktop.org/git/drm-intel:
  drm/i915: drop the struct_mutex when wedged or trying to reset
  drm/i915: Don't touch NULL sg on i915_gem_object_get_pages_gtt() error

03 Dec, 2016 2 commits

Merge branch 'akpm' (patches from Andrew) · 3c49de52

Committed by Linus Torvalds on Dec 02, 2016

Merge more fixes from Andrew Morton:
 "2 fixes"

* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
  mm, vmscan: add cond_resched() into shrink_node_memcg()
  mm: workingset: fix NULL ptr in count_shadow_nodes

mm, vmscan: add cond_resched() into shrink_node_memcg() · bd041733

Committed by Michal Hocko on Dec 02, 2016

Boris Zhmurov has reported RCU stalls during the kswapd reclaim:

  INFO: rcu_sched detected stalls on CPUs/tasks:
   23-...: (22 ticks this GP) idle=92f/140000000000000/0 softirq=2638404/2638404 fqs=23
   (detected by 4, t=6389 jiffies, g=786259, c=786258, q=42115)
  Task dump for CPU 23:
  kswapd1         R  running task        0   148      2 0x00000008
  Call Trace:
    shrink_node+0xd2/0x2f0
    kswapd+0x2cb/0x6a0
    mem_cgroup_shrink_node+0x160/0x160
    kthread+0xbd/0xe0
    __switch_to+0x1fa/0x5c0
    ret_from_fork+0x1f/0x40
    kthread_create_on_node+0x180/0x180

A closer code inspection has shown that we might indeed miss all the
scheduling points in the reclaim path if no pages can be isolated from
the LRU list.  This is a pathological case but other reports from Donald
Buczek have shown that we might indeed hit such a path:

        clusterd-989   [009] .... 118023.654491: mm_vmscan_direct_reclaim_end: nr_reclaimed=193
         kswapd1-86    [001] dN.. 118023.987475: mm_vmscan_lru_isolate: isolate_mode=0 classzone=0 order=0 nr_requested=32 nr_scanned=4239830 nr_taken=0 file=1
         kswapd1-86    [001] dN.. 118024.320968: mm_vmscan_lru_isolate: isolate_mode=0 classzone=0 order=0 nr_requested=32 nr_scanned=4239844 nr_taken=0 file=1
         kswapd1-86    [001] dN.. 118024.654375: mm_vmscan_lru_isolate: isolate_mode=0 classzone=0 order=0 nr_requested=32 nr_scanned=4239858 nr_taken=0 file=1
         kswapd1-86    [001] dN.. 118024.987036: mm_vmscan_lru_isolate: isolate_mode=0 classzone=0 order=0 nr_requested=32 nr_scanned=4239872 nr_taken=0 file=1
         kswapd1-86    [001] dN.. 118025.319651: mm_vmscan_lru_isolate: isolate_mode=0 classzone=0 order=0 nr_requested=32 nr_scanned=4239886 nr_taken=0 file=1
         kswapd1-86    [001] dN.. 118025.652248: mm_vmscan_lru_isolate: isolate_mode=0 classzone=0 order=0 nr_requested=32 nr_scanned=4239900 nr_taken=0 file=1
         kswapd1-86    [001] dN.. 118025.984870: mm_vmscan_lru_isolate: isolate_mode=0 classzone=0 order=0 nr_requested=32 nr_scanned=4239914 nr_taken=0 file=1
  [...]
         kswapd1-86    [001] dN.. 118084.274403: mm_vmscan_lru_isolate: isolate_mode=0 classzone=0 order=0 nr_requested=32 nr_scanned=4241133 nr_taken=0 file=1

This is a minute-long snapshot which didn't take a single page from the
LRU.  It is not entirely clear why only 1303 pages have been scanned
during that time (maybe there was heavy IRQ activity interfering).

In any case it looks like we can really hit long periods without
scheduling on non-preemptive kernels, so an explicit cond_resched() in
shrink_node_memcg(), independent of the reclaim operation, is due.

Link: http://lkml.kernel.org/r/20161202095841.16648-1-mhocko@kernel.org
Signed-off-by: Michal Hocko <mhocko@suse.com>
Reported-by: Boris Zhmurov <bb@kernelpanic.ru>
Tested-by: Boris Zhmurov <bb@kernelpanic.ru>
Reported-by: Donald Buczek <buczek@molgen.mpg.de>
Reported-by: "Christopher S. Aker" <caker@theshore.net>
Reported-by: Paul Menzel <pmenzel@molgen.mpg.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

openanolis / cloud-kernel: synced successfully almost 2 years ago