1. 29 7月, 2016 12 次提交
    • M
      ovl: permission: return ECHILD instead of ENOENT · a999d7e1
      Miklos Szeredi 提交于
      The error is due to RCU and is temporary.
      Signed-off-by: NMiklos Szeredi <mszeredi@redhat.com>
      a999d7e1
    • M
      ovl: update atime on upper · d719e8f2
      Miklos Szeredi 提交于
      Fix atime update logic in overlayfs.
      
      This patch adds an i_op->update_time() handler to overlayfs inodes.  This
      forwards atime updates to the upper layer only.  No atime updates are done
      on lower layers.
      
      Remove implicit atime updates to underlying files and directories with
      O_NOATIME.  Remove explicit atime update in ovl_readlink().
      
      Clear atime related mnt flags from cloned upper mount.  This means atime
      updates are controlled purely by overlayfs mount options.
      
      Reported-by: Konstantin Khlebnikov <koct9i@gmail.com> 
      Signed-off-by: NMiklos Szeredi <mszeredi@redhat.com>
      d719e8f2
    • M
      ovl: fix sgid on directory · bb0d2b8a
      Miklos Szeredi 提交于
      When creating directory in workdir, the group/sgid inheritance from the
      parent dir was omitted completely.  Fix this by calling inode_init_owner()
      on overlay inode and using the resulting uid/gid/mode to create the file.
      
      Unfortunately the sgid bit can be stripped off due to umask, so need to
      reset the mode in this case in workdir before moving the directory in
      place.
      Reported-by: NEryu Guan <eguan@redhat.com>
      Signed-off-by: NMiklos Szeredi <mszeredi@redhat.com>
      bb0d2b8a
    • M
      ovl: simplify permission checking · 9c630ebe
      Miklos Szeredi 提交于
      The fact that we always do permission checking on the overlay inode and
      clear MAY_WRITE for checking access to the lower inode allows cruft to be
      removed from ovl_permission().
      
      1) "default_permissions" option effectively did generic_permission() on the
      overlay inode with i_mode, i_uid and i_gid updated from underlying
      filesystem.  This is what we do by default now.  It did the update using
      vfs_getattr() but that's only needed if the underlying filesystem can
      change (which is not allowed).  We may later introduce a "paranoia_mode"
      that verifies that mode/uid/gid are not changed.
      
      2) splitting out the IS_RDONLY() check from inode_permission() also becomes
      unnecessary once we remove the MAY_WRITE from the lower inode check.
      Signed-off-by: NMiklos Szeredi <mszeredi@redhat.com>
      9c630ebe
    • V
      ovl: do not require mounter to have MAY_WRITE on lower · 754f8cb7
      Vivek Goyal 提交于
      Now we have two levels of checks in ovl_permission(). overlay inode
      is checked with the creds of task while underlying inode is checked
      with the creds of mounter.
      
      Looks like mounter does not have to have WRITE access to files on lower/.
      So remove the MAY_WRITE from access mask for checks on underlying
      lower inode.
      
      This means task should still have the MAY_WRITE permission on lower
      inode and mounter is not required to have MAY_WRITE.
      
      It also solves the problem of read only NFS mounts being used as lower.
      If __inode_permission(lower_inode, MAY_WRITE) is called on read only
      NFS, it fails. By resetting MAY_WRITE, check succeeds and case of
      read only NFS shold work with overlay without having to specify any
      special mount options (default permission).
      Signed-off-by: NVivek Goyal <vgoyal@redhat.com>
      Signed-off-by: NMiklos Szeredi <mszeredi@redhat.com>
      754f8cb7
    • V
      ovl: do operations on underlying file system in mounter's context · 1175b6b8
      Vivek Goyal 提交于
      Given we are now doing checks both on overlay inode as well underlying
      inode, we should be able to do checks and operations on underlying file
      system using mounter's context.
      
      So modify all operations to do checks/operations on underlying dentry/inode
      in the context of mounter.
      Signed-off-by: NVivek Goyal <vgoyal@redhat.com>
      Signed-off-by: NMiklos Szeredi <mszeredi@redhat.com>
      1175b6b8
    • V
      ovl: modify ovl_permission() to do checks on two inodes · c0ca3d70
      Vivek Goyal 提交于
      Right now ovl_permission() calls __inode_permission(realinode), to do
      permission checks on real inode and no checks are done on overlay inode.
      
      Modify it to do checks both on overlay inode as well as underlying inode.
      Checks on overlay inode will be done with the creds of calling task while
      checks on underlying inode will be done with the creds of mounter.
      Signed-off-by: NVivek Goyal <vgoyal@redhat.com>
      Signed-off-by: NMiklos Szeredi <mszeredi@redhat.com>
      c0ca3d70
    • V
      ovl: define ->get_acl() for overlay inodes · 39a25b2b
      Vivek Goyal 提交于
      Now we are planning to do DAC permission checks on overlay inode
      itself. And to make it work, we will need to make sure we can get acls from
      underlying inode. So define ->get_acl() for overlay inodes and this in turn
      calls into underlying filesystem to get acls, if any.
      Signed-off-by: NVivek Goyal <vgoyal@redhat.com>
      Signed-off-by: NMiklos Szeredi <mszeredi@redhat.com>
      39a25b2b
    • V
      ovl: move some common code in a function · 72e48481
      Vivek Goyal 提交于
      ovl_create_upper() and ovl_create_over_whiteout() seem to be sharing some
      common code which can be moved into a separate function.  No functionality
      change.
      Signed-off-by: NVivek Goyal <vgoyal@redhat.com>
      Signed-off-by: NMiklos Szeredi <mszeredi@redhat.com>
      72e48481
    • A
      ovl: store ovl_entry in inode->i_private for all inodes · 58ed4e70
      Andreas Gruenbacher 提交于
      Previously this was only done for directory inodes.  Doing so for all
      inodes makes for a nice cleanup in ovl_permission at zero cost.
      
      Inodes are not shared for hard links on the overlay, so this works fine.
      Signed-off-by: NMiklos Szeredi <mszeredi@redhat.com>
      58ed4e70
    • M
      ovl: use generic_delete_inode · eead4f2d
      Miklos Szeredi 提交于
      No point in keeping overlay inodes around since they will never be reused.
      Signed-off-by: NMiklos Szeredi <mszeredi@redhat.com>
      eead4f2d
    • M
      ovl: check mounter creds on underlying lookup · c1b2cc1a
      Miklos Szeredi 提交于
      The hash salting changes meant that we can no longer reuse the hash in the
      overlay dentry to look up the underlying dentry.
      
      Instead of lookup_hash(), use lookup_one_len_unlocked() and swith to
      mounter's creds (like we do for all other operations later in the series).
      
      Now the lookup_hash() export introduced in 4.6 by 3c9fe8cd ("vfs: add
      lookup_hash() helper") is unused and can possibly be removed; its
      usefulness negated by the hash salting and the idea that mounter's creds
      should be used on operations on underlying filesystems.
      Signed-off-by: NMiklos Szeredi <mszeredi@redhat.com>
      Fixes: 8387ff25 ("vfs: make the string hashes salt the hash")
      c1b2cc1a
  2. 22 7月, 2016 1 次提交
  3. 04 7月, 2016 2 次提交
    • V
      ovl: Copy up underlying inode's ->i_mode to overlay inode · 07a2daab
      Vivek Goyal 提交于
      Right now when a new overlay inode is created, we initialize overlay
      inode's ->i_mode from underlying inode ->i_mode but we retain only
      file type bits (S_IFMT) and discard permission bits.
      
      This patch changes it and retains permission bits too. This should allow
      overlay to do permission checks on overlay inode itself in task context.
      
      [SzM] It also fixes clearing suid/sgid bits on write.
      Signed-off-by: NVivek Goyal <vgoyal@redhat.com>
      Reported-by: NEryu Guan <eguan@redhat.com>
      Signed-off-by: NMiklos Szeredi <mszeredi@redhat.com>
      Fixes: 4bacc9c9 ("overlayfs: Make f_path always point to the overlay and f_inode to the underlay")
      Cc: <stable@vger.kernel.org>
      07a2daab
    • M
      ovl: handle ATTR_KILL* · b99c2d91
      Miklos Szeredi 提交于
      Before 4bacc9c9 ("overlayfs: Make f_path...") file->f_path pointed to
      the underlying file, hence suid/sgid removal on write worked fine.
      
      After that patch file->f_path pointed to the overlay file, and the file
      mode bits weren't copied to overlay_inode->i_mode.  So the suid/sgid
      removal simply stopped working.
      
      The fix is to copy the mode bits, but then ovl_setattr() needs to clear
      ATTR_MODE to avoid the BUG() in notify_change().  So do this first, then in
      the next patch copy the mode.
      Reported-by: NEryu Guan <eguan@redhat.com>
      Signed-off-by: NMiklos Szeredi <mszeredi@redhat.com>
      Fixes: 4bacc9c9 ("overlayfs: Make f_path always point to the overlay and f_inode to the underlay")
      Cc: <stable@vger.kernel.org>
      b99c2d91
  4. 03 7月, 2016 1 次提交
  5. 30 6月, 2016 1 次提交
    • M
      vfs: merge .d_select_inode() into .d_real() · 2d902671
      Miklos Szeredi 提交于
      The two methods essentially do the same: find the real dentry/inode
      belonging to an overlay dentry.  The difference is in the usage:
      
      vfs_open() uses ->d_select_inode() and expects the function to perform
      copy-up if necessary based on the open flags argument.
      
      file_dentry() uses ->d_real() passing in the overlay dentry as well as the
      underlying inode.
      
      vfs_rename() uses ->d_select_inode() but passes zero flags.  ->d_real()
      with a zero inode would have worked just as well here.
      
      This patch merges the functionality of ->d_select_inode() into ->d_real()
      by adding an 'open_flags' argument to the latter.
      
      [Al Viro] Make the signature of d_real() match that of ->d_real() again.
      And constify the inode argument, while we are at it.
      Signed-off-by: NMiklos Szeredi <mszeredi@redhat.com>
      2d902671
  6. 29 6月, 2016 2 次提交
    • M
      ovl: get_write_access() in truncate · 03bea604
      Miklos Szeredi 提交于
      When truncating a file we should check write access on the underlying
      inode.  And we should do so on the lower file as well (before copy-up) for
      consistency.
      
      Original patch and test case by Aihua Zhang.
      
       - - >o >o - - test.c - - >o >o - -
      #include <stdio.h>
      #include <errno.h>
      #include <unistd.h>
      
      int main(int argc, char *argv[])
      {
      	int ret;
      
      	ret = truncate(argv[0], 4096);
      	if (ret != -1) {
      		fprintf(stderr, "truncate(argv[0]) should have failed\n");
      		return 1;
      	}
      	if (errno != ETXTBSY) {
      		perror("truncate(argv[0])");
      		return 1;
      	}
      
      	return 0;
      }
       - - >o >o - - >o >o - - >o >o - -
      Reported-by: NAihua Zhang <zhangaihua1@huawei.com>
      Signed-off-by: NMiklos Szeredi <mszeredi@redhat.com>
      Cc: <stable@vger.kernel.org>
      03bea604
    • M
      ovl: fix dentry leak for default_permissions · a4859d75
      Miklos Szeredi 提交于
      When using the 'default_permissions' mount option, ovl_permission() on
      non-directories was missing a dput(alias), resulting in "BUG Dentry still
      in use".
      Signed-off-by: NMiklos Szeredi <mszeredi@redhat.com>
      Fixes: 8d3095f4 ("ovl: default permissions")
      Cc: <stable@vger.kernel.org> # v4.5+
      a4859d75
  7. 15 6月, 2016 1 次提交
    • M
      ovl: fix uid/gid when creating over whiteout · d0e13f5b
      Miklos Szeredi 提交于
      Fix a regression when creating a file over a whiteout.  The new
      file/directory needs to use the current fsuid/fsgid, not the ones from the
      mounter's credentials.
      
      The refcounting is a bit tricky: prepare_creds() sets an original refcount,
      override_creds() gets one more, which revert_cred() drops.  So
      
        1) we need to expicitly put the mounter's credentials when overriding
           with the updated one
      
        2) we need to put the original ref to the updated creds (and this can
           safely be done before revert_creds(), since we'll still have the ref
           from override_creds()).
      Reported-by: NStephen Smalley <sds@tycho.nsa.gov>
      Fixes: 3fe6e52f ("ovl: override creds with the ones from the superblock mounter")
      Signed-off-by: NMiklos Szeredi <mszeredi@redhat.com>
      d0e13f5b
  8. 06 6月, 2016 1 次提交
    • M
      ovl: xattr filter fix · b581755b
      Miklos Szeredi 提交于
      a) ovl_need_xattr_filter() is wrong, we can have multiple lower layers
      overlaid, all of which (except the lowest one) honouring the
      "trusted.overlay.opaque" xattr.  So need to filter everything except the
      bottom and the pure-upper layer.
      
      b) we no longer can assume that inode is attached to dentry in
      get/setxattr.
      
      This patch unconditionally filters private xattrs to fix both of the above.
      Performance impact for get/removexattrs is likely in the noise.
      
      For listxattrs it might be measurable in pathological cases, but I very
      much hope nobody cares.  If they do, we'll fix it then.
      Reported-by: NVivek Goyal <vgoyal@redhat.com>
      Signed-off-by: NMiklos Szeredi <mszeredi@redhat.com>
      Fixes: b9680917 ("security_d_instantiate(): move to the point prior to attaching dentry to inode")
      b581755b
  9. 28 5月, 2016 1 次提交
  10. 27 5月, 2016 2 次提交
    • V
      ovl: Do d_type check only if work dir creation was successful · 21765194
      Vivek Goyal 提交于
      d_type check requires successful creation of workdir as iterates
      through work dir and expects work dir to be present in it. If that's
      not the case, this check will always return d_type not supported even
      if underlying filesystem might be supporting it.
      
      So don't do this check if work dir creation failed in previous step.
      Signed-off-by: NVivek Goyal <vgoyal@redhat.com>
      Signed-off-by: NMiklos Szeredi <mszeredi@redhat.com>
      21765194
    • A
      ovl: override creds with the ones from the superblock mounter · 3fe6e52f
      Antonio Murdaca 提交于
      In user namespace the whiteout creation fails with -EPERM because the
      current process isn't capable(CAP_SYS_ADMIN) when setting xattr.
      
      A simple reproducer:
      
      $ mkdir upper lower work merged lower/dir
      $ sudo mount -t overlay overlay -olowerdir=lower,upperdir=upper,workdir=work merged
      $ unshare -m -p -f -U -r bash
      
      Now as root in the user namespace:
      
      \# touch merged/dir/{1,2,3} # this will force a copy up of lower/dir
      \# rm -fR merged/*
      
      This ends up failing with -EPERM after the files in dir has been
      correctly deleted:
      
      unlinkat(4, "2", 0)                     = 0
      unlinkat(4, "1", 0)                     = 0
      unlinkat(4, "3", 0)                     = 0
      close(4)                                = 0
      unlinkat(AT_FDCWD, "merged/dir", AT_REMOVEDIR) = -1 EPERM (Operation not
      permitted)
      
      Interestingly, if you don't place files in merged/dir you can remove it,
      meaning if upper/dir does not exist, creating the char device file works
      properly in that same location.
      
      This patch uses ovl_sb_creator_cred() to get the cred struct from the
      superblock mounter and override the old cred with these new ones so that
      the whiteout creation is possible because overlay is wrong in assuming that
      the creds it will get with prepare_creds will be in the initial user
      namespace.  The old cap_raise game is removed in favor of just overriding
      the old cred struct.
      
      This patch also drops from ovl_copy_up_one() the following two lines:
      
      override_cred->fsuid = stat->uid;
      override_cred->fsgid = stat->gid;
      
      This is because the correct uid and gid are taken directly with the stat
      struct and correctly set with ovl_set_attr().
      Signed-off-by: NAntonio Murdaca <runcom@redhat.com>
      Signed-off-by: NMiklos Szeredi <mszeredi@redhat.com>
      3fe6e52f
  11. 26 5月, 2016 1 次提交
  12. 11 5月, 2016 1 次提交
    • M
      ovl: ignore permissions on underlying lookup · 38b78a5f
      Miklos Szeredi 提交于
      Generally permission checking is not necessary when overlayfs looks up a
      dentry on one of the underlying layers, since search permission on base
      directory was already checked in ovl_permission().
      
      More specifically using lookup_one_len() causes a problem when the lower
      directory lacks search permission for a specific user while the upper
      directory does have search permission.  Since lookups are cached, this
      causes inconsistency in behavior: success depends on who did the first
      lookup.
      
      So instead use lookup_hash() which doesn't do the permission check.
      Reported-by: NIgnacy Gawędzki <ignacy.gawedzki@green-communications.fr>
      Signed-off-by: NMiklos Szeredi <mszeredi@redhat.com>
      38b78a5f
  13. 03 5月, 2016 2 次提交
  14. 11 4月, 2016 1 次提交
  15. 27 3月, 2016 1 次提交
    • M
      fs: add file_dentry() · d101a125
      Miklos Szeredi 提交于
      This series fixes bugs in nfs and ext4 due to 4bacc9c9 ("overlayfs:
      Make f_path always point to the overlay and f_inode to the underlay").
      
      Regular files opened on overlayfs will result in the file being opened on
      the underlying filesystem, while f_path points to the overlayfs
      mount/dentry.
      
      This confuses filesystems which get the dentry from struct file and assume
      it's theirs.
      
      Add a new helper, file_dentry() [*], to get the filesystem's own dentry
      from the file.  This checks file->f_path.dentry->d_flags against
      DCACHE_OP_REAL, and returns file->f_path.dentry if DCACHE_OP_REAL is not
      set (this is the common, non-overlayfs case).
      
      In the uncommon case it will call into overlayfs's ->d_real() to get the
      underlying dentry, matching file_inode(file).
      
      The reason we need to check against the inode is that if the file is copied
      up while being open, d_real() would return the upper dentry, while the open
      file comes from the lower dentry.
      
      [*] If possible, it's better simply to use file_inode() instead.
      Signed-off-by: NMiklos Szeredi <mszeredi@redhat.com>
      Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
      Tested-by: NGoldwyn Rodrigues <rgoldwyn@suse.com>
      Reviewed-by: NTrond Myklebust <trond.myklebust@primarydata.com>
      Cc: <stable@vger.kernel.org> # v4.2
      Cc: David Howells <dhowells@redhat.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Daniel Axtens <dja@axtens.net>
      d101a125
  16. 22 3月, 2016 7 次提交
  17. 04 3月, 2016 3 次提交