1. 19 9月, 2016 1 次提交
    • V
      ovl: during copy up, switch to mounter's creds early · 8eac98b8
      Vivek Goyal 提交于
      Now, we have the notion that copy up of a file is done with the creds
      of mounter of overlay filesystem (as opposed to task). Right now before
      we switch creds, we do some vfs_getattr() operations in the context of
      task and that itself can fail. We should do that getattr() using the
      creds of mounter instead.
      
      So this patch switches to mounter's creds early during copy up process so
      that even vfs_getattr() is done with mounter's creds.
      
      Do not call revert_creds() unless we have already called
      ovl_override_creds(). [Reported by Arnd Bergmann]
      Signed-off-by: NVivek Goyal <vgoyal@redhat.com>
      Signed-off-by: NMiklos Szeredi <mszeredi@redhat.com>
      8eac98b8
  2. 01 9月, 2016 6 次提交
  3. 08 8月, 2016 1 次提交
    • M
      ovl: don't copy up opaqueness · 0956254a
      Miklos Szeredi 提交于
      When a copy up of a directory occurs which has the opaque xattr set, the
      xattr remains in the upper directory. The immediate behavior with overlayfs
      is that the upper directory is not treated as opaque, however after a
      remount the opaque flag is used and upper directory is treated as opaque.
      This causes files created in the lower layer to be hidden when using
      multiple lower directories.
      
      Fix by not copying up the opaque flag.
      
      To reproduce:
      
       ----8<---------8<---------8<---------8<---------8<---------8<----
      mkdir -p l/d/s u v w mnt
      mount -t overlay overlay -olowerdir=l,upperdir=u,workdir=w mnt
      rm -rf mnt/d/
      mkdir -p mnt/d/n
      umount mnt
      mount -t overlay overlay -olowerdir=u:l,upperdir=v,workdir=w mnt
      touch mnt/d/foo
      umount mnt
      mount -t overlay overlay -olowerdir=u:l,upperdir=v,workdir=w mnt
      ls mnt/d
       ----8<---------8<---------8<---------8<---------8<---------8<----
       
      output should be:  "foo  n"
      Reported-by: NDerek McGowan <dmcg@drizz.net>
      Link: https://bugzilla.kernel.org/show_bug.cgi?id=151291Signed-off-by: NMiklos Szeredi <mszeredi@redhat.com>
      Cc: <stable@vger.kernel.org>
      0956254a
  4. 29 7月, 2016 13 次提交
    • V
      ovl: append MAY_READ when diluting write checks · 500cac3c
      Vivek Goyal 提交于
      Right now we remove MAY_WRITE/MAY_APPEND bits from mask if realfile is on
      lower/. This is done as files on lower will never be written and will be
      copied up. But to copy up a file, mounter should have MAY_READ permission
      otherwise copy up will fail. So set MAY_READ in mask when MAY_WRITE is
      reset.
      
      Dan Walsh noticed this when he did access(lowerfile, W_OK) and it returned
      True (context mounts) but when he tried to actually write to file, it
      failed as mounter did not have permission on lower file.
      
      [SzM] don't set MAY_READ if only MAY_APPEND is set without MAY_WRITE; this
      won't trigger a copy-up.
      Reported-by: NDan Walsh <dwalsh@redhat.com>
      Signed-off-by: NVivek Goyal <vgoyal@redhat.com>
      Signed-off-by: NMiklos Szeredi <mszeredi@redhat.com>
      500cac3c
    • V
      ovl: dilute permission checks on lower only if not special file · e29841a0
      Vivek Goyal 提交于
      Right now if file is on lower/, we remove MAY_WRITE/MAY_APPEND bits from
      mask as lower/ will never be written and file will be copied up. But this
      is not true for special files. These files are not copied up and are opened
      in place. So don't dilute the checks for these types of files.
      Reported-by: NDan Walsh <dwalsh@redhat.com>
      Signed-off-by: NVivek Goyal <vgoyal@redhat.com>
      Signed-off-by: NMiklos Szeredi <mszeredi@redhat.com>
      e29841a0
    • M
      ovl: fix POSIX ACL setting · d837a49b
      Miklos Szeredi 提交于
      Setting POSIX ACL needs special handling:
      
      1) Some permission checks are done by ->setxattr() which now uses mounter's
      creds ("ovl: do operations on underlying file system in mounter's
      context").  These permission checks need to be done with current cred as
      well.
      
      2) Setting ACL can fail for various reasons.  We do not need to copy up in
      these cases.
      
      In the mean time switch to using generic_setxattr.
      
      [Arnd Bergmann] Fix link error without POSIX ACL. posix_acl_from_xattr()
      doesn't have a 'static inline' implementation when CONFIG_FS_POSIX_ACL is
      disabled, and I could not come up with an obvious way to do it.
      
      This instead avoids the link error by defining two sets of ACL operations
      and letting the compiler drop one of the two at compile time depending
      on CONFIG_FS_POSIX_ACL. This avoids all references to the ACL code,
      also leading to smaller code.
      Signed-off-by: NMiklos Szeredi <mszeredi@redhat.com>
      d837a49b
    • M
      ovl: share inode for hard link · 51f7e52d
      Miklos Szeredi 提交于
      Inode attributes are copied up to overlay inode (uid, gid, mode, atime,
      mtime, ctime) so generic code using these fields works correcty.  If a hard
      link is created in overlayfs separate inodes are allocated for each link.
      If chmod/chown/etc. is performed on one of the links then the inode
      belonging to the other ones won't be updated.
      
      This patch attempts to fix this by sharing inodes for hard links.
      
      Use inode hash (with real inode pointer as a key) to make sure overlay
      inodes are shared for hard links on upper.  Hard links on lower are still
      split (which is not user observable until the copy-up happens, see
      Documentation/filesystems/overlayfs.txt under "Non-standard behavior").
      
      The inode is only inserted in the hash if it is non-directoy and upper.
      Signed-off-by: NMiklos Szeredi <mszeredi@redhat.com>
      51f7e52d
    • M
      ovl: store real inode pointer in ->i_private · 39b681f8
      Miklos Szeredi 提交于
      To get from overlay inode to real inode we currently use 'struct
      ovl_entry', which has lifetime connected to overlay dentry.  This is okay,
      since each overlay dentry had a new overlay inode allocated.
      
      Following patch will break that assumption, so need to leave out ovl_entry.
      This patch stores the real inode directly in i_private, with the lowest bit
      used to indicate whether the inode is upper or lower.
      
      Lifetime rules remain, using ovl_inode_real() must only be done while
      caller holds ref on overlay dentry (and hence on real dentry), or within
      RCU protected regions.
      Signed-off-by: NMiklos Szeredi <mszeredi@redhat.com>
      39b681f8
    • M
      ovl: permission: return ECHILD instead of ENOENT · a999d7e1
      Miklos Szeredi 提交于
      The error is due to RCU and is temporary.
      Signed-off-by: NMiklos Szeredi <mszeredi@redhat.com>
      a999d7e1
    • M
      ovl: update atime on upper · d719e8f2
      Miklos Szeredi 提交于
      Fix atime update logic in overlayfs.
      
      This patch adds an i_op->update_time() handler to overlayfs inodes.  This
      forwards atime updates to the upper layer only.  No atime updates are done
      on lower layers.
      
      Remove implicit atime updates to underlying files and directories with
      O_NOATIME.  Remove explicit atime update in ovl_readlink().
      
      Clear atime related mnt flags from cloned upper mount.  This means atime
      updates are controlled purely by overlayfs mount options.
      
      Reported-by: Konstantin Khlebnikov <koct9i@gmail.com> 
      Signed-off-by: NMiklos Szeredi <mszeredi@redhat.com>
      d719e8f2
    • M
      ovl: simplify permission checking · 9c630ebe
      Miklos Szeredi 提交于
      The fact that we always do permission checking on the overlay inode and
      clear MAY_WRITE for checking access to the lower inode allows cruft to be
      removed from ovl_permission().
      
      1) "default_permissions" option effectively did generic_permission() on the
      overlay inode with i_mode, i_uid and i_gid updated from underlying
      filesystem.  This is what we do by default now.  It did the update using
      vfs_getattr() but that's only needed if the underlying filesystem can
      change (which is not allowed).  We may later introduce a "paranoia_mode"
      that verifies that mode/uid/gid are not changed.
      
      2) splitting out the IS_RDONLY() check from inode_permission() also becomes
      unnecessary once we remove the MAY_WRITE from the lower inode check.
      Signed-off-by: NMiklos Szeredi <mszeredi@redhat.com>
      9c630ebe
    • V
      ovl: do not require mounter to have MAY_WRITE on lower · 754f8cb7
      Vivek Goyal 提交于
      Now we have two levels of checks in ovl_permission(). overlay inode
      is checked with the creds of task while underlying inode is checked
      with the creds of mounter.
      
      Looks like mounter does not have to have WRITE access to files on lower/.
      So remove the MAY_WRITE from access mask for checks on underlying
      lower inode.
      
      This means task should still have the MAY_WRITE permission on lower
      inode and mounter is not required to have MAY_WRITE.
      
      It also solves the problem of read only NFS mounts being used as lower.
      If __inode_permission(lower_inode, MAY_WRITE) is called on read only
      NFS, it fails. By resetting MAY_WRITE, check succeeds and case of
      read only NFS shold work with overlay without having to specify any
      special mount options (default permission).
      Signed-off-by: NVivek Goyal <vgoyal@redhat.com>
      Signed-off-by: NMiklos Szeredi <mszeredi@redhat.com>
      754f8cb7
    • V
      ovl: do operations on underlying file system in mounter's context · 1175b6b8
      Vivek Goyal 提交于
      Given we are now doing checks both on overlay inode as well underlying
      inode, we should be able to do checks and operations on underlying file
      system using mounter's context.
      
      So modify all operations to do checks/operations on underlying dentry/inode
      in the context of mounter.
      Signed-off-by: NVivek Goyal <vgoyal@redhat.com>
      Signed-off-by: NMiklos Szeredi <mszeredi@redhat.com>
      1175b6b8
    • V
      ovl: modify ovl_permission() to do checks on two inodes · c0ca3d70
      Vivek Goyal 提交于
      Right now ovl_permission() calls __inode_permission(realinode), to do
      permission checks on real inode and no checks are done on overlay inode.
      
      Modify it to do checks both on overlay inode as well as underlying inode.
      Checks on overlay inode will be done with the creds of calling task while
      checks on underlying inode will be done with the creds of mounter.
      Signed-off-by: NVivek Goyal <vgoyal@redhat.com>
      Signed-off-by: NMiklos Szeredi <mszeredi@redhat.com>
      c0ca3d70
    • V
      ovl: define ->get_acl() for overlay inodes · 39a25b2b
      Vivek Goyal 提交于
      Now we are planning to do DAC permission checks on overlay inode
      itself. And to make it work, we will need to make sure we can get acls from
      underlying inode. So define ->get_acl() for overlay inodes and this in turn
      calls into underlying filesystem to get acls, if any.
      Signed-off-by: NVivek Goyal <vgoyal@redhat.com>
      Signed-off-by: NMiklos Szeredi <mszeredi@redhat.com>
      39a25b2b
    • A
      ovl: store ovl_entry in inode->i_private for all inodes · 58ed4e70
      Andreas Gruenbacher 提交于
      Previously this was only done for directory inodes.  Doing so for all
      inodes makes for a nice cleanup in ovl_permission at zero cost.
      
      Inodes are not shared for hard links on the overlay, so this works fine.
      Signed-off-by: NMiklos Szeredi <mszeredi@redhat.com>
      58ed4e70
  5. 04 7月, 2016 2 次提交
    • V
      ovl: Copy up underlying inode's ->i_mode to overlay inode · 07a2daab
      Vivek Goyal 提交于
      Right now when a new overlay inode is created, we initialize overlay
      inode's ->i_mode from underlying inode ->i_mode but we retain only
      file type bits (S_IFMT) and discard permission bits.
      
      This patch changes it and retains permission bits too. This should allow
      overlay to do permission checks on overlay inode itself in task context.
      
      [SzM] It also fixes clearing suid/sgid bits on write.
      Signed-off-by: NVivek Goyal <vgoyal@redhat.com>
      Reported-by: NEryu Guan <eguan@redhat.com>
      Signed-off-by: NMiklos Szeredi <mszeredi@redhat.com>
      Fixes: 4bacc9c9 ("overlayfs: Make f_path always point to the overlay and f_inode to the underlay")
      Cc: <stable@vger.kernel.org>
      07a2daab
    • M
      ovl: handle ATTR_KILL* · b99c2d91
      Miklos Szeredi 提交于
      Before 4bacc9c9 ("overlayfs: Make f_path...") file->f_path pointed to
      the underlying file, hence suid/sgid removal on write worked fine.
      
      After that patch file->f_path pointed to the overlay file, and the file
      mode bits weren't copied to overlay_inode->i_mode.  So the suid/sgid
      removal simply stopped working.
      
      The fix is to copy the mode bits, but then ovl_setattr() needs to clear
      ATTR_MODE to avoid the BUG() in notify_change().  So do this first, then in
      the next patch copy the mode.
      Reported-by: NEryu Guan <eguan@redhat.com>
      Signed-off-by: NMiklos Szeredi <mszeredi@redhat.com>
      Fixes: 4bacc9c9 ("overlayfs: Make f_path always point to the overlay and f_inode to the underlay")
      Cc: <stable@vger.kernel.org>
      b99c2d91
  6. 30 6月, 2016 1 次提交
    • M
      vfs: merge .d_select_inode() into .d_real() · 2d902671
      Miklos Szeredi 提交于
      The two methods essentially do the same: find the real dentry/inode
      belonging to an overlay dentry.  The difference is in the usage:
      
      vfs_open() uses ->d_select_inode() and expects the function to perform
      copy-up if necessary based on the open flags argument.
      
      file_dentry() uses ->d_real() passing in the overlay dentry as well as the
      underlying inode.
      
      vfs_rename() uses ->d_select_inode() but passes zero flags.  ->d_real()
      with a zero inode would have worked just as well here.
      
      This patch merges the functionality of ->d_select_inode() into ->d_real()
      by adding an 'open_flags' argument to the latter.
      
      [Al Viro] Make the signature of d_real() match that of ->d_real() again.
      And constify the inode argument, while we are at it.
      Signed-off-by: NMiklos Szeredi <mszeredi@redhat.com>
      2d902671
  7. 29 6月, 2016 2 次提交
    • M
      ovl: get_write_access() in truncate · 03bea604
      Miklos Szeredi 提交于
      When truncating a file we should check write access on the underlying
      inode.  And we should do so on the lower file as well (before copy-up) for
      consistency.
      
      Original patch and test case by Aihua Zhang.
      
       - - >o >o - - test.c - - >o >o - -
      #include <stdio.h>
      #include <errno.h>
      #include <unistd.h>
      
      int main(int argc, char *argv[])
      {
      	int ret;
      
      	ret = truncate(argv[0], 4096);
      	if (ret != -1) {
      		fprintf(stderr, "truncate(argv[0]) should have failed\n");
      		return 1;
      	}
      	if (errno != ETXTBSY) {
      		perror("truncate(argv[0])");
      		return 1;
      	}
      
      	return 0;
      }
       - - >o >o - - >o >o - - >o >o - -
      Reported-by: NAihua Zhang <zhangaihua1@huawei.com>
      Signed-off-by: NMiklos Szeredi <mszeredi@redhat.com>
      Cc: <stable@vger.kernel.org>
      03bea604
    • M
      ovl: fix dentry leak for default_permissions · a4859d75
      Miklos Szeredi 提交于
      When using the 'default_permissions' mount option, ovl_permission() on
      non-directories was missing a dput(alias), resulting in "BUG Dentry still
      in use".
      Signed-off-by: NMiklos Szeredi <mszeredi@redhat.com>
      Fixes: 8d3095f4 ("ovl: default permissions")
      Cc: <stable@vger.kernel.org> # v4.5+
      a4859d75
  8. 06 6月, 2016 1 次提交
    • M
      ovl: xattr filter fix · b581755b
      Miklos Szeredi 提交于
      a) ovl_need_xattr_filter() is wrong, we can have multiple lower layers
      overlaid, all of which (except the lowest one) honouring the
      "trusted.overlay.opaque" xattr.  So need to filter everything except the
      bottom and the pure-upper layer.
      
      b) we no longer can assume that inode is attached to dentry in
      get/setxattr.
      
      This patch unconditionally filters private xattrs to fix both of the above.
      Performance impact for get/removexattrs is likely in the noise.
      
      For listxattrs it might be measurable in pathological cases, but I very
      much hope nobody cares.  If they do, we'll fix it then.
      Reported-by: NVivek Goyal <vgoyal@redhat.com>
      Signed-off-by: NMiklos Szeredi <mszeredi@redhat.com>
      Fixes: b9680917 ("security_d_instantiate(): move to the point prior to attaching dentry to inode")
      b581755b
  9. 28 5月, 2016 1 次提交
  10. 11 4月, 2016 1 次提交
  11. 04 3月, 2016 1 次提交
  12. 23 1月, 2016 1 次提交
    • A
      wrappers for ->i_mutex access · 5955102c
      Al Viro 提交于
      parallel to mutex_{lock,unlock,trylock,is_locked,lock_nested},
      inode_foo(inode) being mutex_foo(&inode->i_mutex).
      
      Please, use those for access to ->i_mutex; over the coming cycle
      ->i_mutex will become rwsem, with ->lookup() done with it held
      only shared.
      Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
      5955102c
  13. 31 12月, 2015 1 次提交
  14. 11 12月, 2015 1 次提交
  15. 09 12月, 2015 1 次提交
    • A
      replace ->follow_link() with new method that could stay in RCU mode · 6b255391
      Al Viro 提交于
      new method: ->get_link(); replacement of ->follow_link().  The differences
      are:
      	* inode and dentry are passed separately
      	* might be called both in RCU and non-RCU mode;
      the former is indicated by passing it a NULL dentry.
      	* when called that way it isn't allowed to block
      and should return ERR_PTR(-ECHILD) if it needs to be called
      in non-RCU mode.
      
      It's a flagday change - the old method is gone, all in-tree instances
      converted.  Conversion isn't hard; said that, so far very few instances
      do not immediately bail out when called in RCU mode.  That'll change
      in the next commits.
      Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
      6b255391
  16. 07 12月, 2015 2 次提交
  17. 12 10月, 2015 2 次提交
    • M
      ovl: default permissions · 8d3095f4
      Miklos Szeredi 提交于
      Add mount option "default_permissions" to alter the way permissions are
      calculated.
      
      Without this option and prior to this patch permissions were calculated by
      underlying lower or upper filesystem.
      
      With this option the permissions are calculated by overlayfs based on the
      file owner, group and mode bits.
      
      This has significance for example when a read-only exported NFS filesystem
      is used as a lower layer.  In this case the underlying NFS filesystem will
      reply with EROFS, in which case all we know is that the filesystem is
      read-only.  But that's not what we are interested in, we are interested in
      whether the access would be allowed if the filesystem wasn't read-only; the
      server doesn't tell us that, and would need updating at various levels,
      which doesn't seem practicable.
      Signed-off-by: NMiklos Szeredi <miklos@szeredi.hu>
      8d3095f4
    • M
      ovl: fix open in stacked overlay · 1c8a47df
      Miklos Szeredi 提交于
      If two overlayfs filesystems are stacked on top of each other, then we need
      recursion in ovl_d_select_inode().
      
      I guess d_backing_inode() is supposed to do that.  But currently it doesn't
      and that functionality is open coded in vfs_open().  This is now copied
      into ovl_d_select_inode() to fix this regression.
      Reported-by: NAlban Crequy <alban.crequy@gmail.com>
      Signed-off-by: NMiklos Szeredi <miklos@szeredi.hu>
      Fixes: 4bacc9c9 ("overlayfs: Make f_path always point to the overlay...")
      Cc: David Howells <dhowells@redhat.com>
      Cc: <stable@vger.kernel.org> # v4.2+
      1c8a47df
  18. 12 7月, 2015 1 次提交
  19. 19 6月, 2015 1 次提交
    • D
      overlayfs: Make f_path always point to the overlay and f_inode to the underlay · 4bacc9c9
      David Howells 提交于
      Make file->f_path always point to the overlay dentry so that the path in
      /proc/pid/fd is correct and to ensure that label-based LSMs have access to the
      overlay as well as the underlay (path-based LSMs probably don't need it).
      
      Using my union testsuite to set things up, before the patch I see:
      
      	[root@andromeda union-testsuite]# bash 5</mnt/a/foo107
      	[root@andromeda union-testsuite]# ls -l /proc/$$/fd/
      	...
      	lr-x------. 1 root root 64 Jun  5 14:38 5 -> /a/foo107
      	[root@andromeda union-testsuite]# stat /mnt/a/foo107
      	...
      	Device: 23h/35d Inode: 13381       Links: 1
      	...
      	[root@andromeda union-testsuite]# stat -L /proc/$$/fd/5
      	...
      	Device: 23h/35d Inode: 13381       Links: 1
      	...
      
      After the patch:
      
      	[root@andromeda union-testsuite]# bash 5</mnt/a/foo107
      	[root@andromeda union-testsuite]# ls -l /proc/$$/fd/
      	...
      	lr-x------. 1 root root 64 Jun  5 14:22 5 -> /mnt/a/foo107
      	[root@andromeda union-testsuite]# stat /mnt/a/foo107
      	...
      	Device: 23h/35d Inode: 40346       Links: 1
      	...
      	[root@andromeda union-testsuite]# stat -L /proc/$$/fd/5
      	...
      	Device: 23h/35d Inode: 40346       Links: 1
      	...
      
      Note the change in where /proc/$$/fd/5 points to in the ls command.  It was
      pointing to /a/foo107 (which doesn't exist) and now points to /mnt/a/foo107
      (which is correct).
      
      The inode accessed, however, is the lower layer.  The union layer is on device
      25h/37d and the upper layer on 24h/36d.
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
      4bacc9c9