1. 29 Jul, 2016 9 commits
    • ovl: remove duplicated include from super.c · 5f215013
      Wei Yongjun committed
      Remove duplicated include.
      Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
      Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
    • ovl: fix POSIX ACL setting · d837a49b
      Miklos Szeredi committed
      Setting POSIX ACL needs special handling:
      
      1) Some permission checks are done by ->setxattr() which now uses mounter's
      creds ("ovl: do operations on underlying file system in mounter's
      context").  These permission checks need to be done with current cred as
      well.
      
      2) Setting ACL can fail for various reasons.  We do not need to copy up in
      these cases.
      
      In the mean time switch to using generic_setxattr.
      
      [Arnd Bergmann] Fix link error without POSIX ACL. posix_acl_from_xattr()
      doesn't have a 'static inline' implementation when CONFIG_FS_POSIX_ACL is
      disabled, and I could not come up with an obvious way to do it.
      
      This instead avoids the link error by defining two sets of ACL operations
      and letting the compiler drop one of the two at compile time depending
      on CONFIG_FS_POSIX_ACL. This avoids all references to the ACL code,
      also leading to smaller code.
      Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
    • ovl: share inode for hard link · 51f7e52d
      Miklos Szeredi committed
      Inode attributes are copied up to overlay inode (uid, gid, mode, atime,
      mtime, ctime) so generic code using these fields works correctly.  If a hard
      link is created in overlayfs, separate inodes are allocated for each link.
      If chmod/chown/etc. is performed on one of the links then the inode
      belonging to the other ones won't be updated.
      
      This patch attempts to fix this by sharing inodes for hard links.
      
      Use inode hash (with real inode pointer as a key) to make sure overlay
      inodes are shared for hard links on upper.  Hard links on lower are still
      split (which is not user observable until the copy-up happens, see
      Documentation/filesystems/overlayfs.txt under "Non-standard behavior").
      
      The inode is only inserted in the hash if it is a non-directory and upper.
      Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
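      The sharing rule from this commit message can be sketched in userspace C.
      This is only an illustration under stated assumptions: every sk_* name is
      hypothetical, the table is a toy hash keyed by the real inode pointer, and
      the reference counting and locking a kernel inode hash needs are left out.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for an overlay inode. */
struct sk_ovl_inode {
    const void *real;            /* real (underlying) inode, used as hash key */
    struct sk_ovl_inode *next;   /* hash chain */
};

#define SK_BUCKETS 64
static struct sk_ovl_inode *sk_hash[SK_BUCKETS];

static size_t sk_bucket(const void *real)
{
    return (size_t)(((uintptr_t)real >> 4) % SK_BUCKETS);
}

/*
 * Return the overlay inode for a real inode.  Only upper non-directories
 * are hashed, so only they are shared between hard links; everything
 * else gets a fresh overlay inode on every call.
 */
struct sk_ovl_inode *sk_get_inode(const void *real, bool is_upper, bool is_dir)
{
    bool hashed = is_upper && !is_dir;
    struct sk_ovl_inode *inode;

    if (hashed) {
        for (inode = sk_hash[sk_bucket(real)]; inode; inode = inode->next)
            if (inode->real == real)
                return inode;    /* second link to the same real inode */
    }

    inode = calloc(1, sizeof(*inode));
    if (!inode)
        return NULL;
    inode->real = real;

    if (hashed) {
        size_t b = sk_bucket(real);

        inode->next = sk_hash[b];
        sk_hash[b] = inode;
    }
    return inode;
}
```

      Two lookups of the same upper regular file return the same object, so a
      chmod seen through one link is seen through the other.  Lower files still
      get separate objects, matching the "still split" remark in the message.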
    • ovl: store real inode pointer in ->i_private · 39b681f8
      Miklos Szeredi committed
      To get from overlay inode to real inode we currently use 'struct
      ovl_entry', which has lifetime connected to overlay dentry.  This is okay,
      since each overlay dentry had a new overlay inode allocated.
      
      The following patch will break that assumption, so we need to leave out ovl_entry.
      This patch stores the real inode directly in i_private, with the lowest bit
      used to indicate whether the inode is upper or lower.
      
      Lifetime rules remain: ovl_inode_real() must only be used while the
      caller holds a ref on the overlay dentry (and hence on the real dentry), or
      within RCU protected regions.
      Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
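      The "lowest bit" trick mentioned in this message can be illustrated with a
      small userspace sketch.  It assumes the pointed-to object is at least
      2-byte aligned so that bit 0 of its address is free; the struct and the
      function names below are hypothetical and are not the kernel helpers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the real (underlying) inode. */
struct sk_real_inode {
    unsigned long ino;
};

#define SK_UPPER_FLAG ((uintptr_t)1)

/* Pack a pointer and an "is upper" flag into one pointer-sized word. */
uintptr_t sk_pack_real(struct sk_real_inode *inode, bool is_upper)
{
    uintptr_t word = (uintptr_t)inode;

    /* The scheme only works if bit 0 of the pointer is always clear. */
    assert((word & SK_UPPER_FLAG) == 0);
    return word | (is_upper ? SK_UPPER_FLAG : 0);
}

/* Recover the pointer by masking the flag bit off again. */
struct sk_real_inode *sk_unpack_real(uintptr_t word)
{
    return (struct sk_real_inode *)(word & ~SK_UPPER_FLAG);
}

/* Report whether the packed word refers to an upper inode. */
bool sk_packed_is_upper(uintptr_t word)
{
    return (word & SK_UPPER_FLAG) != 0;
}
```

      The benefit is that one pointer-sized field carries both facts, so no
      separate per-dentry structure is needed to go from overlay inode to real
      inode.  The lifetime rules stated above still apply to the unpacked pointer.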
    • ovl: update atime on upper · d719e8f2
      Miklos Szeredi committed
      Fix atime update logic in overlayfs.
      
      This patch adds an i_op->update_time() handler to overlayfs inodes.  This
      forwards atime updates to the upper layer only.  No atime updates are done
      on lower layers.
      
      Remove implicit atime updates to underlying files and directories with
      O_NOATIME.  Remove explicit atime update in ovl_readlink().
      
      Clear atime related mnt flags from cloned upper mount.  This means atime
      updates are controlled purely by overlayfs mount options.
      
      Reported-by: Konstantin Khlebnikov <koct9i@gmail.com>
      Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
    • ovl: simplify permission checking · 9c630ebe
      Miklos Szeredi committed
      The fact that we always do permission checking on the overlay inode and
      clear MAY_WRITE for checking access to the lower inode allows cruft to be
      removed from ovl_permission().
      
      1) "default_permissions" option effectively did generic_permission() on the
      overlay inode with i_mode, i_uid and i_gid updated from underlying
      filesystem.  This is what we do by default now.  It did the update using
      vfs_getattr() but that's only needed if the underlying filesystem can
      change (which is not allowed).  We may later introduce a "paranoia_mode"
      that verifies that mode/uid/gid are not changed.
      
      2) splitting out the IS_RDONLY() check from inode_permission() also becomes
      unnecessary once we remove the MAY_WRITE from the lower inode check.
      Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
    • ovl: define ->get_acl() for overlay inodes · 39a25b2b
      Vivek Goyal committed
      Now we are planning to do DAC permission checks on overlay inode
      itself. And to make it work, we will need to make sure we can get acls from
      underlying inode. So define ->get_acl() for overlay inodes and this in turn
      calls into underlying filesystem to get acls, if any.
      Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
      Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
    • ovl: use generic_delete_inode · eead4f2d
      Miklos Szeredi committed
      No point in keeping overlay inodes around since they will never be reused.
      Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
    • ovl: check mounter creds on underlying lookup · c1b2cc1a
      Miklos Szeredi committed
      The hash salting changes meant that we can no longer reuse the hash in the
      overlay dentry to look up the underlying dentry.
      
      Instead of lookup_hash(), use lookup_one_len_unlocked() and switch to
      mounter's creds (like we do for all other operations later in the series).
      
      Now the lookup_hash() export introduced in 4.6 by 3c9fe8cd ("vfs: add
      lookup_hash() helper") is unused and can possibly be removed; its
      usefulness negated by the hash salting and the idea that mounter's creds
      should be used on operations on underlying filesystems.
      Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
      Fixes: 8387ff25 ("vfs: make the string hashes salt the hash")
  2. 03 Jul, 2016 1 commit
  3. 30 Jun, 2016 1 commit
    • vfs: merge .d_select_inode() into .d_real() · 2d902671
      Miklos Szeredi committed
      The two methods essentially do the same: find the real dentry/inode
      belonging to an overlay dentry.  The difference is in the usage:
      
      vfs_open() uses ->d_select_inode() and expects the function to perform
      copy-up if necessary based on the open flags argument.
      
      file_dentry() uses ->d_real() passing in the overlay dentry as well as the
      underlying inode.
      
      vfs_rename() uses ->d_select_inode() but passes zero flags.  ->d_real()
      with a zero inode would have worked just as well here.
      
      This patch merges the functionality of ->d_select_inode() into ->d_real()
      by adding an 'open_flags' argument to the latter.
      
      [Al Viro] Make the signature of d_real() match that of ->d_real() again.
      And constify the inode argument, while we are at it.
      Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
  4. 27 May, 2016 2 commits
    • ovl: Do d_type check only if work dir creation was successful · 21765194
      Vivek Goyal committed
      The d_type check requires successful creation of workdir, as it iterates
      through work dir and expects work dir to be present in it. If that's
      not the case, this check will always return d_type not supported even
      if the underlying filesystem might support it.
      
      So don't do this check if work dir creation failed in previous step.
      Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
      Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
    • ovl: override creds with the ones from the superblock mounter · 3fe6e52f
      Antonio Murdaca committed
      In user namespace the whiteout creation fails with -EPERM because the
      current process isn't capable(CAP_SYS_ADMIN) when setting xattr.
      
      A simple reproducer:
      
      $ mkdir upper lower work merged lower/dir
      $ sudo mount -t overlay overlay -olowerdir=lower,upperdir=upper,workdir=work merged
      $ unshare -m -p -f -U -r bash
      
      Now as root in the user namespace:
      
      # touch merged/dir/{1,2,3} # this will force a copy up of lower/dir
      # rm -fR merged/*
      
      This ends up failing with -EPERM after the files in dir have been
      correctly deleted:
      
      unlinkat(4, "2", 0)                     = 0
      unlinkat(4, "1", 0)                     = 0
      unlinkat(4, "3", 0)                     = 0
      close(4)                                = 0
      unlinkat(AT_FDCWD, "merged/dir", AT_REMOVEDIR) = -1 EPERM (Operation not permitted)
      
      Interestingly, if you don't place files in merged/dir you can remove it,
      meaning if upper/dir does not exist, creating the char device file works
      properly in that same location.
      
      This patch uses ovl_sb_creator_cred() to get the cred struct from the
      superblock mounter and override the old cred with these new ones so that
      the whiteout creation is possible because overlay is wrong in assuming that
      the creds it will get with prepare_creds will be in the initial user
      namespace.  The old cap_raise game is removed in favor of just overriding
      the old cred struct.
      
      This patch also drops from ovl_copy_up_one() the following two lines:
      
      override_cred->fsuid = stat->uid;
      override_cred->fsgid = stat->gid;
      
      This is because the correct uid and gid are taken directly with the stat
      struct and correctly set with ovl_set_attr().
      Signed-off-by: Antonio Murdaca <runcom@redhat.com>
      Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
  5. 11 May, 2016 1 commit
    • ovl: ignore permissions on underlying lookup · 38b78a5f
      Miklos Szeredi committed
      Generally permission checking is not necessary when overlayfs looks up a
      dentry on one of the underlying layers, since search permission on base
      directory was already checked in ovl_permission().
      
      More specifically using lookup_one_len() causes a problem when the lower
      directory lacks search permission for a specific user while the upper
      directory does have search permission.  Since lookups are cached, this
      causes inconsistency in behavior: success depends on who did the first
      lookup.
      
      So instead use lookup_hash() which doesn't do the permission check.
      Reported-by: Ignacy Gawędzki <ignacy.gawedzki@green-communications.fr>
      Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
  6. 03 May, 2016 1 commit
  7. 11 Apr, 2016 1 commit
  8. 27 Mar, 2016 1 commit
    • fs: add file_dentry() · d101a125
      Miklos Szeredi committed
      This series fixes bugs in nfs and ext4 due to 4bacc9c9 ("overlayfs:
      Make f_path always point to the overlay and f_inode to the underlay").
      
      Regular files opened on overlayfs will result in the file being opened on
      the underlying filesystem, while f_path points to the overlayfs
      mount/dentry.
      
      This confuses filesystems which get the dentry from struct file and assume
      it's theirs.
      
      Add a new helper, file_dentry() [*], to get the filesystem's own dentry
      from the file.  This checks file->f_path.dentry->d_flags against
      DCACHE_OP_REAL, and returns file->f_path.dentry if DCACHE_OP_REAL is not
      set (this is the common, non-overlayfs case).
      
      In the uncommon case it will call into overlayfs's ->d_real() to get the
      underlying dentry, matching file_inode(file).
      
      The reason we need to check against the inode is that if the file is copied
      up while being open, d_real() would return the upper dentry, while the open
      file comes from the lower dentry.
      
      [*] If possible, it's better simply to use file_inode() instead.
      Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
      Tested-by: Goldwyn Rodrigues <rgoldwyn@suse.com>
      Reviewed-by: Trond Myklebust <trond.myklebust@primarydata.com>
      Cc: <stable@vger.kernel.org> # v4.2
      Cc: David Howells <dhowells@redhat.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Daniel Axtens <dja@axtens.net>
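      The decision logic described for file_dentry() can be modelled in plain
      userspace C.  This is a sketch, not the kernel code: the sk_* types, the
      SK_OP_REAL flag and the two-layer layout are hypothetical simplifications
      of struct file, struct dentry and DCACHE_OP_REAL.

```c
#include <stddef.h>

#define SK_OP_REAL 0x1u   /* hypothetical analogue of DCACHE_OP_REAL */

struct sk_inode {
    int ino;
};

struct sk_dentry {
    unsigned int flags;
    struct sk_inode *inode;
    struct sk_dentry *upper;   /* only set on overlay dentries */
    struct sk_dentry *lower;   /* only set on overlay dentries */
};

struct sk_file {
    struct sk_dentry *path_dentry;  /* what f_path.dentry would point to */
    struct sk_inode *inode;         /* what file_inode() would return */
};

/* Pick the underlying dentry whose inode matches the open file. */
struct sk_dentry *sk_d_real(struct sk_dentry *d, const struct sk_inode *inode)
{
    if (d->upper && d->upper->inode == inode)
        return d->upper;
    if (d->lower && d->lower->inode == inode)
        return d->lower;
    return NULL;
}

/* Common case: no flag, the dentry already belongs to the filesystem. */
struct sk_dentry *sk_file_dentry(const struct sk_file *f)
{
    struct sk_dentry *d = f->path_dentry;

    if (!(d->flags & SK_OP_REAL))
        return d;
    return sk_d_real(d, f->inode);
}
```

      Matching on the inode is what keeps a file opened before copy-up tied to
      the lower dentry, even when an upper dentry exists by the time the helper
      is called.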
  9. 22 Mar, 2016 2 commits
  10. 04 Mar, 2016 2 commits
    • ovl: ignore lower entries when checking purity of non-directory entries · 45d11738
      Konstantin Khlebnikov committed
      After rename, the file dentry still holds a reference to the lower dentry
      from the previous location. This doesn't matter for data access because
      data comes from the upper dentry. But this stale lower dentry taints the
      dentry at the new location and turns it into non-pure upper. Such a file
      leaves a visible whiteout entry after removal, in a directory which
      shouldn't have whiteouts at all.
      
      Overlayfs already tracks pureness of file location in oe->opaque.  This
      patch just uses that for detecting actual path type.
      
      Comment from Vivek Goyal's patch:
      
      Here are the details of the problem. Do following.
      
      $ mkdir upper lower work merged upper/dir/
      $ touch lower/test
      $ sudo mount -t overlay overlay -olowerdir=lower,upperdir=upper,workdir=work merged
      $ mv merged/test merged/dir/
      $ rm merged/dir/test
      $ ls -l merged/dir/
      /usr/bin/ls: cannot access merged/dir/test: No such file or directory
      total 0
      c????????? ? ? ? ?            ? test
      
      Basic problem seems to be that once a file has been unlinked, a whiteout
      has been left behind which was not needed and hence it becomes visible.
      
      Whiteout is visible because parent dir is not of type MERGE, hence
      od->is_real is set during ovl_dir_open(). And that means ovl_iterate()
      passes on iterate handling directly to underlying fs. Underlying fs does
      not know/filter whiteouts so it becomes visible to user.
      
      Why did we leave a whiteout to begin with when we should not have?
      ovl_do_remove() checks for OVL_TYPE_PURE_UPPER() and does not leave
      whiteout if file is pure upper. In this case file is not found to be pure
      upper hence whiteout is left.
      
      So why was the file not PURE_UPPER in this case? I think because dentry is
      still carrying some leftover state which was valid before rename. For
      example, od->numlower was set to 1 as it was a lower file. After rename,
      this state is not valid anymore as there is no such file in lower.
      Signed-off-by: Konstantin Khlebnikov <koct9i@gmail.com>
      Reported-by: Viktor Stanchev <me@viktorstanchev.com>
      Suggested-by: Vivek Goyal <vgoyal@redhat.com>
      Link: https://bugzilla.kernel.org/show_bug.cgi?id=109611
      Acked-by: Vivek Goyal <vgoyal@redhat.com>
      Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
      Cc: <stable@vger.kernel.org>
    • ovl: fix working on distributed fs as lower layer · b5891cfa
      Konstantin Khlebnikov committed
      This adds missing .d_select_inode into alternative dentry_operations.
      Signed-off-by: Konstantin Khlebnikov <koct9i@gmail.com>
      Fixes: 7c03b5d4 ("ovl: allow distributed fs as lower layer")
      Fixes: 4bacc9c9 ("overlayfs: Make f_path always point to the overlay and f_inode to the underlay")
      Reviewed-by: Nikolay Borisov <kernel@kyup.com>
      Tested-by: Nikolay Borisov <kernel@kyup.com>
      Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
      Cc: <stable@vger.kernel.org> # 4.2+
  11. 23 Jan, 2016 1 commit
    • wrappers for ->i_mutex access · 5955102c
      Al Viro committed
      parallel to mutex_{lock,unlock,trylock,is_locked,lock_nested},
      inode_foo(inode) being mutex_foo(&inode->i_mutex).
      
      Please, use those for access to ->i_mutex; over the coming cycle
      ->i_mutex will become rwsem, with ->lookup() done with it held
      only shared.
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
  12. 21 Jan, 2016 1 commit
    • fs/overlayfs/super.c needs pagemap.h · e458bcd1
      Andrew Morton committed
      i386 allmodconfig:
      
        In file included from fs/overlayfs/super.c:10:0:
        fs/overlayfs/super.c: In function 'ovl_fill_super':
        include/linux/fs.h:898:36: error: 'PAGE_CACHE_SIZE' undeclared (first use in this function)
         #define MAX_LFS_FILESIZE (((loff_t)PAGE_CACHE_SIZE << (BITS_PER_LONG-1))-1)
                                            ^
        fs/overlayfs/super.c:939:19: note: in expansion of macro 'MAX_LFS_FILESIZE'
          sb->s_maxbytes = MAX_LFS_FILESIZE;
                           ^
        include/linux/fs.h:898:36: note: each undeclared identifier is reported only once for each function it appears in
         #define MAX_LFS_FILESIZE (((loff_t)PAGE_CACHE_SIZE << (BITS_PER_LONG-1))-1)
                                            ^
        fs/overlayfs/super.c:939:19: note: in expansion of macro 'MAX_LFS_FILESIZE'
          sb->s_maxbytes = MAX_LFS_FILESIZE;
                           ^
      
      Cc: Miklos Szeredi <miklos@szeredi.hu>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  13. 11 Dec, 2015 1 commit
  14. 09 Dec, 2015 1 commit
  15. 11 Nov, 2015 1 commit
  16. 12 Oct, 2015 3 commits
    • ovl: default permissions · 8d3095f4
      Miklos Szeredi committed
      Add mount option "default_permissions" to alter the way permissions are
      calculated.
      
      Without this option and prior to this patch permissions were calculated by
      underlying lower or upper filesystem.
      
      With this option the permissions are calculated by overlayfs based on the
      file owner, group and mode bits.
      
      This has significance for example when a read-only exported NFS filesystem
      is used as a lower layer.  In this case the underlying NFS filesystem will
      reply with EROFS, in which case all we know is that the filesystem is
      read-only.  But that's not what we are interested in, we are interested in
      whether the access would be allowed if the filesystem wasn't read-only; the
      server doesn't tell us that, and would need updating at various levels,
      which doesn't seem practicable.
      Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
    • ovl: free lower_mnt array in ovl_put_super · 5ffdbe8b
      Konstantin Khlebnikov committed
      This fixes memory leak after umount.
      
      Kmemleak report:
      
      unreferenced object 0xffff8800ba791010 (size 8):
        comm "mount", pid 2394, jiffies 4294996294 (age 53.920s)
        hex dump (first 8 bytes):
          20 1c 13 02 00 88 ff ff                           .......
        backtrace:
          [<ffffffff811f8cd4>] create_object+0x124/0x2c0
          [<ffffffff817a059b>] kmemleak_alloc+0x7b/0xc0
          [<ffffffff811dffe6>] __kmalloc+0x106/0x340
          [<ffffffffa0152bfc>] ovl_fill_super+0x55c/0x9b0 [overlay]
          [<ffffffff81200ac4>] mount_nodev+0x54/0xa0
          [<ffffffffa0152118>] ovl_mount+0x18/0x20 [overlay]
          [<ffffffff81201ab3>] mount_fs+0x43/0x170
          [<ffffffff81220d34>] vfs_kern_mount+0x74/0x170
          [<ffffffff812233ad>] do_mount+0x22d/0xdf0
          [<ffffffff812242cb>] SyS_mount+0x7b/0xc0
          [<ffffffff817b6bee>] entry_SYSCALL_64_fastpath+0x12/0x76
          [<ffffffffffffffff>] 0xffffffffffffffff
      Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
      Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
      Fixes: dd662667 ("ovl: add mutli-layer infrastructure")
      Cc: <stable@vger.kernel.org> # v4.0+
    • ovl: free stack of paths in ovl_fill_super · 0f95502a
      Konstantin Khlebnikov committed
      This fixes small memory leak after mount.
      
      Kmemleak report:
      
      unreferenced object 0xffff88003683fe00 (size 16):
        comm "mount", pid 2029, jiffies 4294909563 (age 33.380s)
        hex dump (first 16 bytes):
          20 27 1f bb 00 88 ff ff 40 4b 0f 36 02 88 ff ff   '......@K.6....
        backtrace:
          [<ffffffff811f8cd4>] create_object+0x124/0x2c0
          [<ffffffff817a059b>] kmemleak_alloc+0x7b/0xc0
          [<ffffffff811dffe6>] __kmalloc+0x106/0x340
          [<ffffffffa01b7a29>] ovl_fill_super+0x389/0x9a0 [overlay]
          [<ffffffff81200ac4>] mount_nodev+0x54/0xa0
          [<ffffffffa01b7118>] ovl_mount+0x18/0x20 [overlay]
          [<ffffffff81201ab3>] mount_fs+0x43/0x170
          [<ffffffff81220d34>] vfs_kern_mount+0x74/0x170
          [<ffffffff812233ad>] do_mount+0x22d/0xdf0
          [<ffffffff812242cb>] SyS_mount+0x7b/0xc0
          [<ffffffff817b6bee>] entry_SYSCALL_64_fastpath+0x12/0x76
          [<ffffffffffffffff>] 0xffffffffffffffff
      Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
      Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
      Fixes: a78d9f0d ("ovl: support multiple lower layers")
      Cc: <stable@vger.kernel.org> # v4.0+
  17. 05 Sep, 2015 1 commit
    • fs: create and use seq_show_option for escaping · a068acf2
      Kees Cook committed
      Many file systems that implement the show_options hook fail to correctly
      escape their output which could lead to unescaped characters (e.g.  new
      lines) leaking into /proc/mounts and /proc/[pid]/mountinfo files.  This
      could lead to confusion, spoofed entries (resulting in things like
      systemd issuing false d-bus "mount" notifications), and who knows what
      else.  This looks like it would only be the root user stepping on
      themselves, but it's possible weird things could happen in containers or
      in other situations with delegated mount privileges.
      
      Here's an example using overlay with setuid fusermount trusting the
      contents of /proc/mounts (via the /etc/mtab symlink).  Imagine the use
      of "sudo" is something more sneaky:
      
        $ BASE="ovl"
        $ MNT="$BASE/mnt"
        $ LOW="$BASE/lower"
        $ UP="$BASE/upper"
        $ WORK="$BASE/work/ 0 0
        none /proc fuse.pwn user_id=1000"
        $ mkdir -p "$LOW" "$UP" "$WORK"
        $ sudo mount -t overlay -o "lowerdir=$LOW,upperdir=$UP,workdir=$WORK" none /mnt
        $ cat /proc/mounts
        none /root/ovl/mnt overlay rw,relatime,lowerdir=ovl/lower,upperdir=ovl/upper,workdir=ovl/work/ 0 0
        none /proc fuse.pwn user_id=1000 0 0
        $ fusermount -u /proc
        $ cat /proc/mounts
        cat: /proc/mounts: No such file or directory
      
      This fixes the problem by adding new seq_show_option and
      seq_show_option_n helpers, and updating the vulnerable show_option
      handlers to use them as needed.  Some, like SELinux, need to be open
      coded due to unusual existing escape mechanisms.
      
      [akpm@linux-foundation.org: add lost chunk, per Kees]
      [keescook@chromium.org: seq_show_option should be using const parameters]
      Signed-off-by: Kees Cook <keescook@chromium.org>
      Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
      Acked-by: Jan Kara <jack@suse.com>
      Acked-by: Paul Moore <paul@paul-moore.com>
      Cc: J. R. Okajima <hooanon05g@gmail.com>
      Signed-off-by: Kees Cook <keescook@chromium.org>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
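      The escaping idea behind this fix can be shown with a userspace sketch that
      writes one ",name=value" option into a buffer and replaces dangerous bytes
      with backslash-octal sequences.  The function names are hypothetical and
      the escape sets (",= \t\n\\" for names, ", \t\n\\" for values) are an
      assumption about what such a helper would escape, not a quote of the
      kernel source.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/*
 * Copy src into dst, replacing every byte listed in esc with a backslash
 * and its three-digit octal code.  Returns the number of bytes written
 * (not counting the terminating NUL), or -1 if dst is too small.
 */
int sk_escape_into(char *dst, size_t size, const char *src, const char *esc)
{
    size_t used = 0;

    for (; *src; src++) {
        unsigned char c = (unsigned char)*src;

        if (strchr(esc, c)) {
            if (size - used < 5)          /* 4 bytes of escape + NUL */
                return -1;
            snprintf(dst + used, size - used, "\\%03o", (unsigned int)c);
            used += 4;
        } else {
            if (size - used < 2)          /* 1 byte + NUL */
                return -1;
            dst[used++] = (char)c;
        }
    }
    if (size - used < 1)
        return -1;
    dst[used] = '\0';
    return (int)used;
}

/* Format ",name" or ",name=value" with both parts escaped. */
int sk_format_option(char *dst, size_t size, const char *name,
                     const char *value)
{
    size_t total = 0;
    int n;

    if (size < 2)
        return -1;
    dst[total++] = ',';
    n = sk_escape_into(dst + total, size - total, name, ",= \t\n\\");
    if (n < 0)
        return -1;
    total += (size_t)n;

    if (value) {
        if (size - total < 2)
            return -1;
        dst[total++] = '=';
        n = sk_escape_into(dst + total, size - total, value, ", \t\n\\");
        if (n < 0)
            return -1;
        total += (size_t)n;
    }
    return (int)total;
}
```

      With this kind of escaping, the newline and spaces embedded in the workdir
      path from the example above stay inside a single field instead of starting
      a forged line in /proc/mounts.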
  18. 22 Jun, 2015 2 commits
    • ovl: allow distributed fs as lower layer · 7c03b5d4
      Miklos Szeredi committed
      Allow filesystems with .d_revalidate as lower layer(s), but not as upper
      layer.
      
      For local filesystems the rule was that modifications on the layers
      directly while being part of the overlay results in undefined behavior.
      
      This can easily be extended to distributed filesystems: we assume the tree
      used as lower layer is static, which means ->d_revalidate() should always
      return "1".  If that is not the case, return -ESTALE, don't try to work
      around the modification.
      Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
    • ovl: don't traverse automount points · a6f15d9a
      Miklos Szeredi committed
      NFS and other distributed filesystems may place automount points in the
      tree.  Previously overlayfs refused to mount such filesystem types (based
      on the existence of the .d_automount callback), even if the actual export
      didn't have any automount points.
      
      It cannot be determined in advance whether the filesystem has automount
      points or not.  The solution is to allow fs with .d_automount but refuse to
      traverse any automount points encountered.
      Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
  19. 19 Jun, 2015 1 commit
    • overlayfs: Make f_path always point to the overlay and f_inode to the underlay · 4bacc9c9
      David Howells committed
      Make file->f_path always point to the overlay dentry so that the path in
      /proc/pid/fd is correct and to ensure that label-based LSMs have access to the
      overlay as well as the underlay (path-based LSMs probably don't need it).
      
      Using my union testsuite to set things up, before the patch I see:
      
      	[root@andromeda union-testsuite]# bash 5</mnt/a/foo107
      	[root@andromeda union-testsuite]# ls -l /proc/$$/fd/
      	...
      	lr-x------. 1 root root 64 Jun  5 14:38 5 -> /a/foo107
      	[root@andromeda union-testsuite]# stat /mnt/a/foo107
      	...
      	Device: 23h/35d Inode: 13381       Links: 1
      	...
      	[root@andromeda union-testsuite]# stat -L /proc/$$/fd/5
      	...
      	Device: 23h/35d Inode: 13381       Links: 1
      	...
      
      After the patch:
      
      	[root@andromeda union-testsuite]# bash 5</mnt/a/foo107
      	[root@andromeda union-testsuite]# ls -l /proc/$$/fd/
      	...
      	lr-x------. 1 root root 64 Jun  5 14:22 5 -> /mnt/a/foo107
      	[root@andromeda union-testsuite]# stat /mnt/a/foo107
      	...
      	Device: 23h/35d Inode: 40346       Links: 1
      	...
      	[root@andromeda union-testsuite]# stat -L /proc/$$/fd/5
      	...
      	Device: 23h/35d Inode: 40346       Links: 1
      	...
      
      Note the change in where /proc/$$/fd/5 points to in the ls command.  It was
      pointing to /a/foo107 (which doesn't exist) and now points to /mnt/a/foo107
      (which is correct).
      
      The inode accessed, however, is the lower layer.  The union layer is on device
      25h/37d and the upper layer on 24h/36d.
      Signed-off-by: David Howells <dhowells@redhat.com>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
  20. 19 May, 2015 1 commit
    • ovl: mount read-only if workdir can't be created · cc6f67bc
      Miklos Szeredi committed
      OpenWRT folks reported that overlayfs fails to mount if upper fs is full,
      because workdir can't be created.  Workdir creation can fail for various
      other reasons too.
      
      There's no reason that the mount itself should fail, overlayfs can work
      fine without a workdir, as long as the overlay isn't modified.
      
      So mount it read-only and don't allow remounting read-write.
      
      Add a couple of WARN_ON()s for the impossible case of workdir being used
      despite being read-only.
      
      Reported-by: Bastian Bittorf <bittorf@bluebottle.com>
      Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
      Cc: <stable@vger.kernel.org> # v3.18+
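      The policy in this message (force read-only when there is no usable
      workdir, then refuse a later read-write remount) can be sketched as two
      small userspace functions.  The struct, the flag value and the function
      names are hypothetical, and the exact condition used by the kernel is an
      assumption here.

```c
#include <errno.h>
#include <stdbool.h>

#define SK_RDONLY 0x1u   /* hypothetical analogue of the read-only mount flag */

/* Hypothetical summary of what the mount code managed to set up. */
struct sk_overlay {
    bool has_upper;
    bool has_workdir;
};

/* At mount time: fall back to read-only instead of failing the mount. */
unsigned int sk_mount_flags(const struct sk_overlay *ofs, unsigned int flags)
{
    if (!ofs->has_upper || !ofs->has_workdir)
        flags |= SK_RDONLY;
    return flags;
}

/* At remount time: a read-write request cannot be honoured without both. */
int sk_remount(const struct sk_overlay *ofs, unsigned int new_flags)
{
    if (!(new_flags & SK_RDONLY) && (!ofs->has_upper || !ofs->has_workdir))
        return -EROFS;
    return 0;
}
```

      Keeping the two checks symmetric is the point of the sketch: whatever
      forces the mount read-only must also block the remount, otherwise the
      "impossible" case guarded by the WARN_ON()s becomes reachable.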
  21. 18 Mar, 2015 3 commits
    • ovl: upper fs should not be R/O · 71cbad7e
      hujianyang committed
      After importing multi-lower layer support, users could mount a r/o
      partition as the left most lowerdir instead of using it as upperdir.
      And a r/o upperdir may cause an error like
      
      	overlayfs: failed to create directory ./workdir/work
      
      during mount.
      
      This patch checks the *s_flags* of upper fs and returns an error if
      it is a r/o partition. The checking of *upper_mnt->mnt_sb->s_flags*
      can be removed now.
      
      This patch also removes
      
      	/* FIXME: workdir is not needed for a R/O mount */
      
      from ovl_fill_super() because:
      
      1) for upper fs r/o case
      Setting a r/o partition as upper is prevented, no need to care about
      workdir in this case.
      
      2) for "mount overlay -o ro" with a r/w upper fs case
      Users could remount overlayfs to r/w in this case, so workdir should
      not be omitted.
      Signed-off-by: hujianyang <hujianyang@huawei.com>
      Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
    • ovl: check lowerdir amount for non-upper mount · 6be4506e
      hujianyang committed
      The recent multi-lower layer mount support allows upperdir and workdir
      to be omitted, which means overlayfs can be mounted with only one
      lowerdir directory. This makes no sense and carries potential risk.
      
      This patch checks the total number of lower directories to prevent
      mounting overlayfs with only one directory.
      
      Also, an error message is added to indicate that the lower directories
      exceed the OVL_MAX_STACK limit.
      Signed-off-by: hujianyang <hujianyang@huawei.com>
      Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
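      The two checks described here can be sketched in userspace C.  The
      function names are hypothetical, the limit of 500 is an assumed value for
      OVL_MAX_STACK, and the sketch splits on every ':' without handling escaped
      colons.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#define SK_MAX_STACK 500   /* assumed value, standing in for OVL_MAX_STACK */

/* Count colon-separated entries in a lowerdir= string (0 if empty). */
size_t sk_count_lowerdirs(const char *lowerdir)
{
    size_t n = 1;

    if (!lowerdir || !*lowerdir)
        return 0;
    for (; *lowerdir; lowerdir++)
        if (*lowerdir == ':')
            n++;
    return n;
}

/* Reject layer setups that make no sense or exceed the stack limit. */
int sk_check_layers(const char *lowerdir, bool have_upper)
{
    size_t n = sk_count_lowerdirs(lowerdir);

    if (n == 0)
        return -EINVAL;            /* at least one lower layer is needed */
    if (n > SK_MAX_STACK)
        return -EINVAL;            /* too many lower directories */
    if (!have_upper && n == 1)
        return -EINVAL;            /* a single directory is not an overlay */
    return 0;
}
```

      A mount with an upperdir and one lowerdir passes, as does a mount without
      upperdir but with two or more lower directories; only the single-directory
      case and the over-limit case are refused.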
    • ovl: print error message for invalid mount options · bead55ef
      hujianyang committed
      Like other filesystems, overlayfs should print an error message if an
      incorrect mount option is caught.
      
      After this patch, improper option input is clearly reported.
      Reported-by: Fabian Sturm <fabian.sturm@aduu.de>
      Signed-off-by: hujianyang <hujianyang@huawei.com>
      Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
  22. 08 Jan, 2015 3 commits
    • ovl: Prevent rw remount when it should be ro mount · 3cdf6fe9
      Seunghun Lee committed
      Overlayfs should be mounted read-only when upper-fs is read-only or nonexistent.
      But now it can be remounted read-write and this can cause kernel panic.
      So we should prevent read-write remount when the above situation happens.
      Signed-off-by: Seunghun Lee <waydi1@gmail.com>
      Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
    • ovl: Fix opaque regression in ovl_lookup · a425c037
      hujianyang committed
      Current multi-layer support overlayfs has a regression in
      .lookup(). If there is a directory in upperdir and a regular
      file has same name in lowerdir in a merged directory, lower
      file is hidden and upper directory is set to opaque in former
      case. But it is changed in present code.
      
      In lowerdir lookup path, if a found inode is not directory,
      the type checking of previous inode is missing. This inode
      will be copied to the lowerstack of ovl_entry directly.
      
      That will lead to several wrong conditions, for example,
      the reading of the directory in upperdir may return an error
      like:
      
         ls: reading directory .: Not a directory
      
      This patch makes the lowerdir lookup path check the opaque
      for non-directory file too.
      Signed-off-by: hujianyang <hujianyang@huawei.com>
      Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
    • ovl: Fix kernel panic while mounting overlayfs · 2f83fd8c
      hujianyang committed
      The function ovl_fill_super() in the recent multi-layer support
      version incorrectly returns 0 on an error handling path, which
      then causes a kernel panic.
      
      This failure can be reproduced by mounting an overlayfs with
      upperdir and workdir in different mounts.
      
      Also, if the memory allocation of *lower_mnt* fails, this
      function may return zero as well.
      
      This patch fixes this problem by setting *err* to a proper error
      number before jumping to the error handling path.
      Signed-off-by: hujianyang <hujianyang@huawei.com>
      Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
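      The bug class fixed here (a goto-based error path that returns a stale 0)
      can be shown with a userspace sketch.  The function and its integer mount
      ids are hypothetical; only the pattern of assigning err before each
      possible jump is taken from the commit message.

```c
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/*
 * Hypothetical miniature of a fill_super-style function.  err is set to
 * the failure code before every check, so any jump to "out" carries a
 * non-zero value; it only becomes 0 once every step has succeeded.
 */
int sk_fill_super(int upper_mnt_id, int work_mnt_id, size_t numlower)
{
    void **lower_mnt = NULL;
    int err;

    err = -EINVAL;
    if (upper_mnt_id != work_mnt_id)      /* upperdir and workdir must share a mount */
        goto out;

    err = -ENOMEM;
    lower_mnt = calloc(numlower, sizeof(*lower_mnt));
    if (!lower_mnt)
        goto out;

    err = 0;
out:
    /* The sketch always frees; a real mount would keep the array on success. */
    free(lower_mnt);
    return err;
}
```

      Without the two assignments before the checks, err would still hold the 0
      left by an earlier successful step, and the caller would treat a
      half-initialised superblock as valid.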