1. 11 May, 2016 1 commit
    • ovl: ignore permissions on underlying lookup · 38b78a5f
      Miklos Szeredi committed
      Generally permission checking is not necessary when overlayfs looks up a
      dentry on one of the underlying layers, since search permission on base
      directory was already checked in ovl_permission().
      
      More specifically using lookup_one_len() causes a problem when the lower
      directory lacks search permission for a specific user while the upper
      directory does have search permission.  Since lookups are cached, this
      causes inconsistency in behavior: success depends on who did the first
      lookup.
      
      So instead use lookup_hash() which doesn't do the permission check.
      Reported-by: Ignacy Gawędzki <ignacy.gawedzki@green-communications.fr>
      Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
  2. 27 Mar, 2016 1 commit
    • fs: add file_dentry() · d101a125
      Miklos Szeredi committed
      This series fixes bugs in nfs and ext4 due to 4bacc9c9 ("overlayfs:
      Make f_path always point to the overlay and f_inode to the underlay").
      
      Regular files opened on overlayfs will result in the file being opened on
      the underlying filesystem, while f_path points to the overlayfs
      mount/dentry.
      
      This confuses filesystems which get the dentry from struct file and assume
      it's theirs.
      
      Add a new helper, file_dentry() [*], to get the filesystem's own dentry
      from the file.  This checks file->f_path.dentry->d_flags against
      DCACHE_OP_REAL, and returns file->f_path.dentry if DCACHE_OP_REAL is not
      set (this is the common, non-overlayfs case).
      
      In the uncommon case it will call into overlayfs's ->d_real() to get the
      underlying dentry, matching file_inode(file).
      
      The reason we need to check against the inode is that if the file is copied
      up while being open, d_real() would return the upper dentry, while the open
      file comes from the lower dentry.
      
      [*] If possible, it's better simply to use file_inode() instead.
      Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
      Tested-by: Goldwyn Rodrigues <rgoldwyn@suse.com>
      Reviewed-by: Trond Myklebust <trond.myklebust@primarydata.com>
      Cc: <stable@vger.kernel.org> # v4.2
      Cc: David Howells <dhowells@redhat.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Daniel Axtens <dja@axtens.net>
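The file_dentry() helper described in the commit above can be sketched in user space roughly as follows. This is a simplified model, not the kernel source: the structures are cut down, the DCACHE_OP_REAL value is a placeholder, and the `upper`/`lower` fields and `toy_d_real()` are hypothetical stand-ins for what overlayfs's ->d_real() would do.

```c
#include <stddef.h>

/* Simplified user-space stand-ins for the kernel structures. */
#define DCACHE_OP_REAL 0x1u /* placeholder value, not the kernel's */

struct inode { int ino; };
struct dentry;

struct dentry_operations {
    struct dentry *(*d_real)(struct dentry *, struct inode *);
};

struct dentry {
    unsigned int d_flags;
    const struct dentry_operations *d_op;
    struct inode *d_inode;
    struct dentry *lower; /* hypothetical: lower-layer dentry of an overlay dentry */
    struct dentry *upper; /* hypothetical: upper-layer dentry of an overlay dentry */
};

struct file {
    struct dentry *f_path_dentry; /* models file->f_path.dentry */
    struct inode *f_inode;        /* inode the file was actually opened on */
};

static struct inode *file_inode(const struct file *f)
{
    return f->f_inode;
}

/* Common case: return the dentry as is. Overlay case: ask ->d_real(). */
static struct dentry *file_dentry(const struct file *file)
{
    struct dentry *dentry = file->f_path_dentry;

    if (dentry->d_flags & DCACHE_OP_REAL)
        return dentry->d_op->d_real(dentry, file_inode(file));
    return dentry;
}

/* Toy ->d_real(): pick the underlying dentry whose inode matches. */
static struct dentry *toy_d_real(struct dentry *d, struct inode *inode)
{
    if (d->upper && d->upper->d_inode == inode)
        return d->upper;
    if (d->lower && d->lower->d_inode == inode)
        return d->lower;
    return d;
}
```

Matching on the inode is what keeps a file that was opened on the lower layer, and copied up afterwards, resolving to the lower dentry rather than the new upper one.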
  3. 22 Mar, 2016 2 commits
  4. 04 Mar, 2016 2 commits
    • ovl: ignore lower entries when checking purity of non-directory entries · 45d11738
      Konstantin Khlebnikov committed
      After a rename, a file dentry still holds a reference to the lower dentry
      from its previous location. This doesn't matter for data access because
      data comes from the upper dentry. But this stale lower dentry taints the
      dentry at the new location and turns it into a non-pure upper. Removing
      such a file leaves a visible whiteout entry in a directory which shouldn't
      have whiteouts at all.
      
      Overlayfs already tracks pureness of file location in oe->opaque.  This
      patch just uses that for detecting actual path type.
      
      Comment from Vivek Goyal's patch:
      
      Here are the details of the problem. Do the following.
      
      $ mkdir upper lower work merged upper/dir/
      $ touch lower/test
      $ sudo mount -t overlay overlay -olowerdir=lower,upperdir=upper,workdir=work merged
      $ mv merged/test merged/dir/
      $ rm merged/dir/test
      $ ls -l merged/dir/
      /usr/bin/ls: cannot access merged/dir/test: No such file or directory
      total 0
      c????????? ? ? ? ?            ? test
      
      Basic problem seems to be that once a file has been unlinked, a whiteout
      has been left behind which was not needed and hence it becomes visible.
      
      Whiteout is visible because the parent dir is not of type MERGE, hence
      od->is_real is set during ovl_dir_open(). And that means ovl_iterate()
      passes on iterate handling directly to the underlying fs. The underlying
      fs does not know about or filter whiteouts, so it becomes visible to the user.
      
      Why did we leave a whiteout to begin with when we should not have?
      ovl_do_remove() checks for OVL_TYPE_PURE_UPPER() and does not leave a
      whiteout if the file is pure upper. In this case the file is not found to
      be pure upper, hence a whiteout is left.
      
      So why was the file not PURE_UPPER in this case? I think because the dentry
      is still carrying some leftover state which was valid before the rename.
      For example, od->numlower was set to 1 as it was a lower file. After the
      rename, this state is not valid anymore as there is no such file in lower.
      Signed-off-by: Konstantin Khlebnikov <koct9i@gmail.com>
      Reported-by: Viktor Stanchev <me@viktorstanchev.com>
      Suggested-by: Vivek Goyal <vgoyal@redhat.com>
      Link: https://bugzilla.kernel.org/show_bug.cgi?id=109611
      Acked-by: Vivek Goyal <vgoyal@redhat.com>
      Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
      Cc: <stable@vger.kernel.org>
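The purity decision described in the commit above can be modelled in user space as below. This is only an illustration of the reasoning, not the overlayfs code: `struct toy_entry`, its fields, and both functions are hypothetical, and `location_is_pure` merely stands in for the information the commit says is already tracked through oe->opaque.

```c
#include <stdbool.h>

/* Hypothetical, cut-down view of an overlay entry. */
struct toy_entry {
    bool has_upper;        /* entry exists in the upper layer */
    bool is_dir;
    unsigned numlower;     /* lower references; may be stale after a rename */
    bool location_is_pure; /* stand-in for what oe->opaque tracks */
};

/* Before the fix (modelled): any lower reference taints the entry. */
static bool pure_upper_before(const struct toy_entry *e)
{
    return e->has_upper && e->numlower == 0;
}

/* After the fix (modelled): lower references only matter for directories. */
static bool pure_upper_after(const struct toy_entry *e)
{
    if (!e->has_upper)
        return false;
    if (e->is_dir && e->numlower > 0)
        return false; /* merged directory */
    return e->location_is_pure;
}

/* A whiteout is only needed when the removed entry is not pure upper. */
static bool leaves_whiteout(bool pure_upper)
{
    return !pure_upper;
}
```

For the renamed file in the reproducer, the stale `numlower == 1` makes the old check leave a whiteout, while the new check does not.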
    • ovl: fix working on distributed fs as lower layer · b5891cfa
      Konstantin Khlebnikov committed
      This adds missing .d_select_inode into alternative dentry_operations.
      Signed-off-by: Konstantin Khlebnikov <koct9i@gmail.com>
      Fixes: 7c03b5d4 ("ovl: allow distributed fs as lower layer")
      Fixes: 4bacc9c9 ("overlayfs: Make f_path always point to the overlay and f_inode to the underlay")
      Reviewed-by: Nikolay Borisov <kernel@kyup.com>
      Tested-by: Nikolay Borisov <kernel@kyup.com>
      Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
      Cc: <stable@vger.kernel.org> # 4.2+
  5. 23 Jan, 2016 1 commit
    • wrappers for ->i_mutex access · 5955102c
      Al Viro committed
      parallel to mutex_{lock,unlock,trylock,is_locked,lock_nested},
      inode_foo(inode) being mutex_foo(&inode->i_mutex).
      
      Please, use those for access to ->i_mutex; over the coming cycle
      ->i_mutex will become rwsem, with ->lookup() done with it held
      only shared.
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
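The wrapper pattern from the commit above, inode_foo(inode) being mutex_foo(&inode->i_mutex), can be illustrated with a user-space toy. The `struct mutex` here is a hypothetical spinning lock built on C11 `atomic_flag`, not the kernel mutex; only the shape of the wrappers is the point.

```c
#include <stdatomic.h>

/* Toy mutex: a spin lock standing in for the kernel's struct mutex. */
struct mutex { atomic_flag held; };
struct inode { struct mutex i_mutex; };

static void mutex_lock(struct mutex *m)
{
    while (atomic_flag_test_and_set(&m->held))
        ; /* spin until the flag was previously clear */
}

static void mutex_unlock(struct mutex *m)
{
    atomic_flag_clear(&m->held);
}

/* Returns 1 if the lock was acquired, 0 otherwise. */
static int mutex_trylock(struct mutex *m)
{
    return !atomic_flag_test_and_set(&m->held);
}

/* Wrappers: callers no longer touch ->i_mutex directly. */
static void inode_lock(struct inode *inode)
{
    mutex_lock(&inode->i_mutex);
}

static void inode_unlock(struct inode *inode)
{
    mutex_unlock(&inode->i_mutex);
}

static int inode_trylock(struct inode *inode)
{
    return mutex_trylock(&inode->i_mutex);
}
```

Because every caller goes through the wrappers, the lock type behind ->i_mutex can later change (the commit mentions an rwsem) by editing only the wrapper bodies.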
  6. 21 Jan, 2016 1 commit
    • fs/overlayfs/super.c needs pagemap.h · e458bcd1
      Andrew Morton committed
      i386 allmodconfig:
      
        In file included from fs/overlayfs/super.c:10:0:
        fs/overlayfs/super.c: In function 'ovl_fill_super':
        include/linux/fs.h:898:36: error: 'PAGE_CACHE_SIZE' undeclared (first use in this function)
         #define MAX_LFS_FILESIZE (((loff_t)PAGE_CACHE_SIZE << (BITS_PER_LONG-1))-1)
                                            ^
        fs/overlayfs/super.c:939:19: note: in expansion of macro 'MAX_LFS_FILESIZE'
          sb->s_maxbytes = MAX_LFS_FILESIZE;
                           ^
        include/linux/fs.h:898:36: note: each undeclared identifier is reported only once for each function it appears in
         #define MAX_LFS_FILESIZE (((loff_t)PAGE_CACHE_SIZE << (BITS_PER_LONG-1))-1)
                                            ^
        fs/overlayfs/super.c:939:19: note: in expansion of macro 'MAX_LFS_FILESIZE'
          sb->s_maxbytes = MAX_LFS_FILESIZE;
                           ^
      
      Cc: Miklos Szeredi <miklos@szeredi.hu>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
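For reference, the MAX_LFS_FILESIZE expression quoted in the build error above can be evaluated in user space. The sketch below assumes 4 KiB pages and a 32-bit long, as on a typical i386 build; the function name and parameters are made up for illustration.

```c
#include <stdint.h>

/*
 * Mirrors (((loff_t)PAGE_CACHE_SIZE << (BITS_PER_LONG-1))-1) with the two
 * constants passed in as parameters. Only meaningful while the shifted
 * value still fits in 64 bits, which holds for the 32-bit case shown here.
 */
static int64_t max_lfs_filesize(int64_t page_cache_size, int bits_per_long)
{
    return (page_cache_size << (bits_per_long - 1)) - 1;
}
```

With 4096-byte pages and 32-bit longs this gives 2^43 - 1 bytes, i.e. one byte short of 8 TiB.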
  7. 11 Dec, 2015 1 commit
  8. 09 Dec, 2015 1 commit
  9. 11 Nov, 2015 1 commit
  10. 12 Oct, 2015 3 commits
    • ovl: default permissions · 8d3095f4
      Miklos Szeredi committed
      Add mount option "default_permissions" to alter the way permissions are
      calculated.
      
      Without this option and prior to this patch permissions were calculated by
      underlying lower or upper filesystem.
      
      With this option the permissions are calculated by overlayfs based on the
      file owner, group and mode bits.
      
      This has significance for example when a read-only exported NFS filesystem
      is used as a lower layer.  In this case the underlying NFS filesystem will
      reply with EROFS, in which case all we know is that the filesystem is
      read-only.  But that's not what we are interested in, we are interested in
      whether the access would be allowed if the filesystem wasn't read-only; the
      server doesn't tell us that, and would need updating at various levels,
      which doesn't seem practicable.
      Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
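A permission check "based on the file owner, group and mode bits", as the commit above puts it, can be sketched like this. It is a deliberately minimal model: no capabilities, ACLs, or supplementary groups, and `toy_permission()` is a hypothetical name, not the function overlayfs calls.

```c
#include <errno.h>
#include <sys/types.h>

#define MAY_EXEC  1
#define MAY_WRITE 2
#define MAY_READ  4

/*
 * Pick the owner, group, or other rwx triplet from the mode and
 * check that every requested access bit is present in it.
 */
static int toy_permission(uid_t owner, gid_t group, mode_t mode,
                          uid_t uid, gid_t gid, int mask)
{
    unsigned int bits;

    if (uid == owner)
        bits = (mode >> 6) & 7;
    else if (gid == group)
        bits = (mode >> 3) & 7;
    else
        bits = mode & 7;

    return ((unsigned int)mask & ~bits & 7) == 0 ? 0 : -EACCES;
}
```

Such a check needs nothing from the underlying filesystem, which is why it sidesteps the EROFS answer from a read-only NFS export.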
    • ovl: free lower_mnt array in ovl_put_super · 5ffdbe8b
      Konstantin Khlebnikov committed
      This fixes a memory leak after umount.
      
      Kmemleak report:
      
      unreferenced object 0xffff8800ba791010 (size 8):
        comm "mount", pid 2394, jiffies 4294996294 (age 53.920s)
        hex dump (first 8 bytes):
          20 1c 13 02 00 88 ff ff                           .......
        backtrace:
          [<ffffffff811f8cd4>] create_object+0x124/0x2c0
          [<ffffffff817a059b>] kmemleak_alloc+0x7b/0xc0
          [<ffffffff811dffe6>] __kmalloc+0x106/0x340
          [<ffffffffa0152bfc>] ovl_fill_super+0x55c/0x9b0 [overlay]
          [<ffffffff81200ac4>] mount_nodev+0x54/0xa0
          [<ffffffffa0152118>] ovl_mount+0x18/0x20 [overlay]
          [<ffffffff81201ab3>] mount_fs+0x43/0x170
          [<ffffffff81220d34>] vfs_kern_mount+0x74/0x170
          [<ffffffff812233ad>] do_mount+0x22d/0xdf0
          [<ffffffff812242cb>] SyS_mount+0x7b/0xc0
          [<ffffffff817b6bee>] entry_SYSCALL_64_fastpath+0x12/0x76
          [<ffffffffffffffff>] 0xffffffffffffffff
      Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
      Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
      Fixes: dd662667 ("ovl: add mutli-layer infrastructure")
      Cc: <stable@vger.kernel.org> # v4.0+
    • ovl: free stack of paths in ovl_fill_super · 0f95502a
      Konstantin Khlebnikov committed
      This fixes a small memory leak after mount.
      
      Kmemleak report:
      
      unreferenced object 0xffff88003683fe00 (size 16):
        comm "mount", pid 2029, jiffies 4294909563 (age 33.380s)
        hex dump (first 16 bytes):
          20 27 1f bb 00 88 ff ff 40 4b 0f 36 02 88 ff ff   '......@K.6....
        backtrace:
          [<ffffffff811f8cd4>] create_object+0x124/0x2c0
          [<ffffffff817a059b>] kmemleak_alloc+0x7b/0xc0
          [<ffffffff811dffe6>] __kmalloc+0x106/0x340
          [<ffffffffa01b7a29>] ovl_fill_super+0x389/0x9a0 [overlay]
          [<ffffffff81200ac4>] mount_nodev+0x54/0xa0
          [<ffffffffa01b7118>] ovl_mount+0x18/0x20 [overlay]
          [<ffffffff81201ab3>] mount_fs+0x43/0x170
          [<ffffffff81220d34>] vfs_kern_mount+0x74/0x170
          [<ffffffff812233ad>] do_mount+0x22d/0xdf0
          [<ffffffff812242cb>] SyS_mount+0x7b/0xc0
          [<ffffffff817b6bee>] entry_SYSCALL_64_fastpath+0x12/0x76
          [<ffffffffffffffff>] 0xffffffffffffffff
      Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
      Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
      Fixes: a78d9f0d ("ovl: support multiple lower layers")
      Cc: <stable@vger.kernel.org> # v4.0+
  11. 05 Sep, 2015 1 commit
    • fs: create and use seq_show_option for escaping · a068acf2
      Kees Cook committed
      Many file systems that implement the show_options hook fail to correctly
      escape their output which could lead to unescaped characters (e.g.  new
      lines) leaking into /proc/mounts and /proc/[pid]/mountinfo files.  This
      could lead to confusion, spoofed entries (resulting in things like
      systemd issuing false d-bus "mount" notifications), and who knows what
      else.  This looks like it would only be the root user stepping on
      themselves, but it's possible weird things could happen in containers or
      in other situations with delegated mount privileges.
      
      Here's an example using overlay with setuid fusermount trusting the
      contents of /proc/mounts (via the /etc/mtab symlink).  Imagine the use
      of "sudo" is something more sneaky:
      
        $ BASE="ovl"
        $ MNT="$BASE/mnt"
        $ LOW="$BASE/lower"
        $ UP="$BASE/upper"
        $ WORK="$BASE/work/ 0 0
        none /proc fuse.pwn user_id=1000"
        $ mkdir -p "$LOW" "$UP" "$WORK"
        $ sudo mount -t overlay -o "lowerdir=$LOW,upperdir=$UP,workdir=$WORK" none /mnt
        $ cat /proc/mounts
        none /root/ovl/mnt overlay rw,relatime,lowerdir=ovl/lower,upperdir=ovl/upper,workdir=ovl/work/ 0 0
        none /proc fuse.pwn user_id=1000 0 0
        $ fusermount -u /proc
        $ cat /proc/mounts
        cat: /proc/mounts: No such file or directory
      
      This fixes the problem by adding new seq_show_option and
      seq_show_option_n helpers, and updating the vulnerable show_option
      handlers to use them as needed.  Some, like SELinux, need to be open
      coded due to unusual existing escape mechanisms.
      
      [akpm@linux-foundation.org: add lost chunk, per Kees]
      [keescook@chromium.org: seq_show_option should be using const parameters]
      Signed-off-by: Kees Cook <keescook@chromium.org>
      Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
      Acked-by: Jan Kara <jack@suse.com>
      Acked-by: Paul Moore <paul@paul-moore.com>
      Cc: J. R. Okajima <hooanon05g@gmail.com>
      Signed-off-by: Kees Cook <keescook@chromium.org>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
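The escaping idea behind seq_show_option in the commit above can be sketched in user space as follows. This is a model writing into a plain string buffer rather than a seq_file; the helper names are hypothetical, and the escaped character sets and the octal `\ooo` output format are assumptions about how the kernel helper behaves.

```c
#include <stdio.h>
#include <string.h>

/* Append one character if it fits (always keeps the buffer terminated). */
static void append_char(char *out, size_t size, char c)
{
    size_t len = strlen(out);

    if (len + 2 <= size) {
        out[len] = c;
        out[len + 1] = '\0';
    }
}

/* Append s, writing every character listed in esc as a 3-digit octal escape. */
static void append_escaped(char *out, size_t size, const char *s, const char *esc)
{
    for (; *s; s++) {
        size_t len = strlen(out);

        if (!strchr(esc, *s))
            append_char(out, size, *s);
        else if (len + 5 <= size)
            snprintf(out + len, size - len, "\\%03o", (unsigned char)*s);
    }
}

/* Emit ",name=value" with separators and whitespace escaped. */
static void show_option(char *out, size_t size, const char *name, const char *value)
{
    append_char(out, size, ',');
    append_escaped(out, size, name, ",= \t\n\\");
    if (value) {
        append_char(out, size, '=');
        append_escaped(out, size, value, ", \t\n\\");
    }
}
```

With this kind of escaping, the newline smuggled into the workdir path in the example above would come out as a literal `\012` on the same line instead of starting a forged /proc/mounts entry.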
  12. 22 Jun, 2015 2 commits
    • ovl: allow distributed fs as lower layer · 7c03b5d4
      Miklos Szeredi committed
      Allow filesystems with .d_revalidate as lower layer(s), but not as upper
      layer.
      
      For local filesystems the rule was that modifications on the layers
      directly while being part of the overlay results in undefined behavior.
      
      This can easily be extended to distributed filesystems: we assume the tree
      used as lower layer is static, which means ->d_revalidate() should always
      return "1".  If that is not the case, return -ESTALE, don't try to work
      around the modification.
      Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
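The revalidation rule stated in the commit above (a lower layer's ->d_revalidate() should return "1", anything else becomes -ESTALE) can be modelled as below. The callback type and function names are hypothetical; real dentries and lookup flags are left out.

```c
#include <errno.h>
#include <stddef.h>

/* Stand-in for a lower layer's ->d_revalidate(); NULL means "none defined". */
typedef int (*revalidate_fn)(void);

/*
 * Returns 1 if every lower layer still considers its dentry valid,
 * a negative error passed through from a layer, or -ESTALE if any
 * layer reports that the dentry changed underneath the overlay.
 */
static int toy_revalidate_lowers(const revalidate_fn *lowers, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        int ret;

        if (!lowers[i])
            continue;
        ret = lowers[i]();
        if (ret < 0)
            return ret;
        if (ret == 0)
            return -ESTALE;
    }
    return 1;
}

/* Example callbacks for a static and a modified lower tree. */
static int still_valid(void) { return 1; }
static int changed(void) { return 0; }
```

The overlay makes no attempt to recover from a changed lower tree; it only reports the condition.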
    • ovl: don't traverse automount points · a6f15d9a
      Miklos Szeredi committed
      NFS and other distributed filesystems may place automount points in the
      tree.  Previously overlayfs refused to mount such filesystem types (based
      on the existence of the .d_automount callback), even if the actual export
      didn't have any automount points.
      
      It cannot be determined in advance whether the filesystem has automount
      points or not.  The solution is to allow fs with .d_automount but refuse to
      traverse any automount points encountered.
      Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
  13. 19 Jun, 2015 1 commit
    • overlayfs: Make f_path always point to the overlay and f_inode to the underlay · 4bacc9c9
      David Howells committed
      Make file->f_path always point to the overlay dentry so that the path in
      /proc/pid/fd is correct and to ensure that label-based LSMs have access to the
      overlay as well as the underlay (path-based LSMs probably don't need it).
      
      Using my union testsuite to set things up, before the patch I see:
      
      	[root@andromeda union-testsuite]# bash 5</mnt/a/foo107
      	[root@andromeda union-testsuite]# ls -l /proc/$$/fd/
      	...
      	lr-x------. 1 root root 64 Jun  5 14:38 5 -> /a/foo107
      	[root@andromeda union-testsuite]# stat /mnt/a/foo107
      	...
      	Device: 23h/35d Inode: 13381       Links: 1
      	...
      	[root@andromeda union-testsuite]# stat -L /proc/$$/fd/5
      	...
      	Device: 23h/35d Inode: 13381       Links: 1
      	...
      
      After the patch:
      
      	[root@andromeda union-testsuite]# bash 5</mnt/a/foo107
      	[root@andromeda union-testsuite]# ls -l /proc/$$/fd/
      	...
      	lr-x------. 1 root root 64 Jun  5 14:22 5 -> /mnt/a/foo107
      	[root@andromeda union-testsuite]# stat /mnt/a/foo107
      	...
      	Device: 23h/35d Inode: 40346       Links: 1
      	...
      	[root@andromeda union-testsuite]# stat -L /proc/$$/fd/5
      	...
      	Device: 23h/35d Inode: 40346       Links: 1
      	...
      
      Note the change in where /proc/$$/fd/5 points to in the ls command.  It was
      pointing to /a/foo107 (which doesn't exist) and now points to /mnt/a/foo107
      (which is correct).
      
      The inode accessed, however, is the lower layer.  The union layer is on device
      25h/37d and the upper layer on 24h/36d.
      Signed-off-by: David Howells <dhowells@redhat.com>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
  14. 19 May, 2015 1 commit
    • ovl: mount read-only if workdir can't be created · cc6f67bc
      Miklos Szeredi committed
      OpenWRT folks reported that overlayfs fails to mount if upper fs is full,
      because workdir can't be created.  Workdir creation can fail for various
      other reasons too.
      
      There's no reason that the mount itself should fail: overlayfs can work
      fine without a workdir, as long as the overlay isn't modified.
      
      So mount it read-only and don't allow remounting read-write.
      
      Add a couple of WARN_ON()s for the impossible case of workdir being used
      despite being read-only.
      
      Reported-by: Bastian Bittorf <bittorf@bluebottle.com>
      Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
      Cc: <stable@vger.kernel.org> # v3.18+
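The remount rule from the commit above (no read-write remount when there is no workdir) combines naturally with the rule that a missing upper layer also forces read-only. A minimal sketch, with a hypothetical function name, boolean parameters instead of the real superblock state, and -EROFS assumed as the error code:

```c
#include <errno.h>
#include <stdbool.h>

/*
 * Refuse a read-write remount when the overlay has no upper layer
 * or no workdir; read-only remounts are always allowed.
 */
static int toy_remount_check(bool want_rdonly, bool has_upper, bool has_workdir)
{
    if (!want_rdonly && (!has_upper || !has_workdir))
        return -EROFS;
    return 0;
}
```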
  15. 18 Mar, 2015 3 commits
    • ovl: upper fs should not be R/O · 71cbad7e
      hujianyang committed
      Since multi-lower-layer support was introduced, users can mount a r/o
      partition as the leftmost lowerdir instead of using it as upperdir.
      A r/o upperdir may cause an error like
      
      	overlayfs: failed to create directory ./workdir/work
      
      during mount.
      
      This patch checks the *s_flags* of the upper fs and returns an error if
      it is a r/o partition. The check of *upper_mnt->mnt_sb->s_flags*
      can be removed now.
      
      This patch also removes
      
      	/* FIXME: workdir is not needed for a R/O mount */
      
      from ovl_fill_super() because:
      
      1) for upper fs r/o case
      Setting a r/o partition as upper is prevented, no need to care about
      workdir in this case.
      
      2) for "mount overlay -o ro" with a r/w upper fs case
      Users could remount overlayfs to r/w in this case, so workdir should
      not be omitted.
      Signed-off-by: hujianyang <hujianyang@huawei.com>
      Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
    • ovl: check lowerdir amount for non-upper mount · 6be4506e
      hujianyang committed
      The recent multi-lower-layer mount support allows upperdir and workdir
      to be omitted, which means overlayfs can be mounted with only a single
      lowerdir directory. Such a mount makes no sense and is potentially risky.
      
      This patch checks the total number of lower directories to prevent
      mounting overlayfs with only one directory.
      
      Also, an error message is added to indicate that the lower directories
      exceed the OVL_MAX_STACK limit.
      Signed-off-by: hujianyang <hujianyang@huawei.com>
      Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
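The two checks described in the commit above can be sketched as a small validation routine. This is an assumption-laden model: it counts layers by splitting on ':' without handling escaped colons, takes the stack limit as a parameter rather than using the kernel's OVL_MAX_STACK, treats an empty lowerdir as invalid, and returns -EINVAL for every failure; the function names are hypothetical.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Count colon-separated entries in a lowerdir= option string. */
static size_t count_lowerdirs(const char *lowerdir)
{
    size_t n;

    if (!lowerdir || !*lowerdir)
        return 0;
    n = 1;
    for (; *lowerdir; lowerdir++)
        if (*lowerdir == ':')
            n++;
    return n;
}

/*
 * Reject stacks that exceed the limit, and mounts that would
 * consist of a single directory (one lowerdir, no upperdir).
 */
static int toy_check_stack(const char *lowerdir, bool has_upper, size_t max_stack)
{
    size_t n = count_lowerdirs(lowerdir);

    if (n > max_stack)
        return -EINVAL;
    if (n == 0 || (!has_upper && n == 1))
        return -EINVAL;
    return 0;
}
```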
    • ovl: print error message for invalid mount options · bead55ef
      hujianyang committed
      Like other filesystems, overlayfs should print an error message when it
      encounters an invalid mount option.
      
      With this patch, an invalid option is reported clearly.
      Reported-by: Fabian Sturm <fabian.sturm@aduu.de>
      Signed-off-by: hujianyang <hujianyang@huawei.com>
      Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
  16. 08 Jan, 2015 3 commits
    • ovl: Prevent rw remount when it should be ro mount · 3cdf6fe9
      Seunghun Lee committed
      Overlayfs should be mounted read-only when the upper fs is read-only or nonexistent.
      But currently it can be remounted read-write, which can cause a kernel panic.
      So we should prevent a read-write remount in these situations.
      Signed-off-by: Seunghun Lee <waydi1@gmail.com>
      Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
    • ovl: Fix opaque regression in ovl_lookup · a425c037
      hujianyang committed
      The current multi-layer overlayfs has a regression in .lookup().
      If a merged directory contains a directory in upperdir and a
      regular file with the same name in lowerdir, the lower file used
      to be hidden and the upper directory treated as opaque. The
      present code no longer does this.
      
      In the lowerdir lookup path, if the inode found is not a directory,
      the type check against the previous inode is missing, and the inode
      is copied directly into the lowerstack of the ovl_entry.
      
      This leads to several incorrect states; for example,
      reading the directory in upperdir may return an error
      like:
      
         ls: reading directory .: Not a directory
      
      This patch makes the lowerdir lookup path check opaqueness
      for non-directory files too.
      Signed-off-by: hujianyang <hujianyang@huawei.com>
      Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
    • ovl: Fix kernel panic while mounting overlayfs · 2f83fd8c
      hujianyang committed
      In the recent multi-layer support version, ovl_fill_super() incorrectly
      returns 0 on an error handling path, which then causes a kernel panic.
      
      This failure can be reproduced by mounting an overlayfs with
      upperdir and workdir on different mounts.
      
      Also, if the memory allocation of *lower_mnt* fails, this
      function may return zero as well.
      
      This patch fixes the problem by setting *err* to the proper error
      number before jumping to the error handling path.
      Signed-off-by: hujianyang <hujianyang@huawei.com>
      Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
  17. 13 Dec, 2014 14 commits
  18. 20 Nov, 2014 1 commit