1. 12 Oct, 2015 5 commits
    • K
      ovl: free lower_mnt array in ovl_put_super · 5ffdbe8b
      Konstantin Khlebnikov committed
      This fixes a memory leak after umount.
      
      Kmemleak report:
      
      unreferenced object 0xffff8800ba791010 (size 8):
        comm "mount", pid 2394, jiffies 4294996294 (age 53.920s)
        hex dump (first 8 bytes):
          20 1c 13 02 00 88 ff ff                           .......
        backtrace:
          [<ffffffff811f8cd4>] create_object+0x124/0x2c0
          [<ffffffff817a059b>] kmemleak_alloc+0x7b/0xc0
          [<ffffffff811dffe6>] __kmalloc+0x106/0x340
          [<ffffffffa0152bfc>] ovl_fill_super+0x55c/0x9b0 [overlay]
          [<ffffffff81200ac4>] mount_nodev+0x54/0xa0
          [<ffffffffa0152118>] ovl_mount+0x18/0x20 [overlay]
          [<ffffffff81201ab3>] mount_fs+0x43/0x170
          [<ffffffff81220d34>] vfs_kern_mount+0x74/0x170
          [<ffffffff812233ad>] do_mount+0x22d/0xdf0
          [<ffffffff812242cb>] SyS_mount+0x7b/0xc0
          [<ffffffff817b6bee>] entry_SYSCALL_64_fastpath+0x12/0x76
          [<ffffffffffffffff>] 0xffffffffffffffff
      Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
      Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
      Fixes: dd662667 ("ovl: add mutli-layer infrastructure")
      Cc: <stable@vger.kernel.org> # v4.0+
      5ffdbe8b
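The kind of leak described above can be modelled in userspace. The following is a minimal sketch, not the kernel code: `demo_fs`, `demo_mount`, `demo_fill_super`, `demo_put_super` and the `live_allocs` counter are all hypothetical names invented for illustration. It shows a teardown routine that releases both the elements and the array that holds them.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the overlay superblock data; not kernel types. */
struct demo_mount { int id; };

struct demo_fs {
    struct demo_mount **lower_mnt; /* array allocated at "mount" time */
    size_t numlower;               /* number of valid elements */
};

static int live_allocs; /* crude leak counter used only for this demonstration */

static void *demo_alloc(size_t n)
{
    void *p = calloc(1, n);
    if (p)
        live_allocs++;
    return p;
}

static void demo_free(void *p)
{
    if (p) {
        live_allocs--;
        free(p);
    }
}

/* Teardown: release every element, then the array itself. */
static void demo_put_super(struct demo_fs *fs)
{
    size_t i;

    for (i = 0; i < fs->numlower; i++)
        demo_free(fs->lower_mnt[i]);
    demo_free(fs->lower_mnt); /* omitting this line leaks the array */
    fs->lower_mnt = NULL;
    fs->numlower = 0;
}

/* Setup: allocate the array and one element per lower layer. */
static int demo_fill_super(struct demo_fs *fs, size_t numlower)
{
    size_t i;

    fs->numlower = 0;
    if (numlower == 0)
        return -1;
    fs->lower_mnt = demo_alloc(numlower * sizeof(*fs->lower_mnt));
    if (!fs->lower_mnt)
        return -1;
    for (i = 0; i < numlower; i++) {
        fs->lower_mnt[i] = demo_alloc(sizeof(struct demo_mount));
        if (!fs->lower_mnt[i]) {
            demo_put_super(fs);
            return -1;
        }
        fs->numlower++;
    }
    return 0;
}

/* Returns the number of allocations still live after mount + umount. */
static int demo_mount_umount_cycle(size_t numlower)
{
    struct demo_fs fs = { NULL, 0 };

    live_allocs = 0;
    if (demo_fill_super(&fs, numlower) == 0)
        demo_put_super(&fs);
    return live_allocs;
}
```

Calling `demo_mount_umount_cycle(3)` should report zero live allocations; removing the marked `demo_free()` call leaves one behind, which is the shape of the kmemleak report above.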
    • K
      ovl: free stack of paths in ovl_fill_super · 0f95502a
      Konstantin Khlebnikov committed
      This fixes a small memory leak after mount.
      
      Kmemleak report:
      
      unreferenced object 0xffff88003683fe00 (size 16):
        comm "mount", pid 2029, jiffies 4294909563 (age 33.380s)
        hex dump (first 16 bytes):
          20 27 1f bb 00 88 ff ff 40 4b 0f 36 02 88 ff ff   '......@K.6....
        backtrace:
          [<ffffffff811f8cd4>] create_object+0x124/0x2c0
          [<ffffffff817a059b>] kmemleak_alloc+0x7b/0xc0
          [<ffffffff811dffe6>] __kmalloc+0x106/0x340
          [<ffffffffa01b7a29>] ovl_fill_super+0x389/0x9a0 [overlay]
          [<ffffffff81200ac4>] mount_nodev+0x54/0xa0
          [<ffffffffa01b7118>] ovl_mount+0x18/0x20 [overlay]
          [<ffffffff81201ab3>] mount_fs+0x43/0x170
          [<ffffffff81220d34>] vfs_kern_mount+0x74/0x170
          [<ffffffff812233ad>] do_mount+0x22d/0xdf0
          [<ffffffff812242cb>] SyS_mount+0x7b/0xc0
          [<ffffffff817b6bee>] entry_SYSCALL_64_fastpath+0x12/0x76
          [<ffffffffffffffff>] 0xffffffffffffffff
      Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
      Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
      Fixes: a78d9f0d ("ovl: support multiple lower layers")
      Cc: <stable@vger.kernel.org> # v4.0+
      0f95502a
    • M
      ovl: fix open in stacked overlay · 1c8a47df
      Miklos Szeredi committed
      If two overlayfs filesystems are stacked on top of each other, then we need
      recursion in ovl_d_select_inode().
      
      I guess d_backing_inode() is supposed to do that.  But currently it doesn't
      and that functionality is open coded in vfs_open().  This is now copied
      into ovl_d_select_inode() to fix this regression.
      Reported-by: Alban Crequy <alban.crequy@gmail.com>
      Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
      Fixes: 4bacc9c9 ("overlayfs: Make f_path always point to the overlay...")
      Cc: David Howells <dhowells@redhat.com>
      Cc: <stable@vger.kernel.org> # v4.2+
      1c8a47df
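Why recursion is needed for stacked overlays can be illustrated with a small userspace model. This is a sketch under stated assumptions: `demo_dentry`, its `backing` pointer and both selector functions are hypothetical and do not mirror the kernel's data structures. An overlay dentry is modelled as one that points at a dentry in the layer below, and the inode number -1 marks "not a real inode".

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model: a dentry either is real or is backed by a lower dentry. */
struct demo_dentry {
    int inode;                         /* real inode number, or -1 for an overlay dentry */
    const struct demo_dentry *backing; /* non-NULL when this dentry belongs to an overlay */
};

/* Resolves only one level: wrong when the backing dentry is itself an overlay. */
static int demo_select_one_level(const struct demo_dentry *d)
{
    return d->backing ? d->backing->inode : d->inode;
}

/* Recurses until a dentry without a backing dentry is reached. */
static int demo_select_recursive(const struct demo_dentry *d)
{
    if (d->backing)
        return demo_select_recursive(d->backing);
    return d->inode;
}

/* Builds overlay-on-overlay-on-real and resolves the top dentry. */
static int demo_stacked_lookup(int recursive)
{
    const struct demo_dentry real = { 42, NULL };
    const struct demo_dentry inner = { -1, &real };
    const struct demo_dentry outer = { -1, &inner };

    return recursive ? demo_select_recursive(&outer)
                     : demo_select_one_level(&outer);
}
```

With two stacked overlays the one-level variant stops at the inner overlay dentry, while the recursive variant reaches the real inode.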
    • D
      ovl: fix dentry reference leak · ab79efab
      David Howells committed
      In ovl_copy_up_locked(), newdentry is leaked if the function exits through
      out_cleanup as this just goes to out after calling ovl_cleanup() - which doesn't
      actually release the ref on newdentry.
      
      The out_cleanup segment should instead exit through out2 as certainly
      newdentry leaks - and possibly upper does also, though this isn't caught
      given the catch of newdentry.
      
      Without this fix, something like the following is seen:
      
      	BUG: Dentry ffff880023e9eb20{i=f861,n=#ffff880023e82d90} still in use (1) [unmount of tmpfs tmpfs]
      	BUG: Dentry ffff880023ece640{i=0,n=bigfile}  still in use (1) [unmount of tmpfs tmpfs]
      
      when unmounting the upper layer after an error occurred in copyup.
      
      An error can be induced by creating a big file in a lower layer with
      something like:
      
      	dd if=/dev/zero of=/lower/a/bigfile bs=65536 count=1 seek=$((0xf000))
      
      to create a large file (4.1G).  Overlay an upper layer that is too small
      (on tmpfs might do) and then induce a copy up by opening it writably.
      Reported-by: Ulrich Obergfell <uobergfe@redhat.com>
      Signed-off-by: David Howells <dhowells@redhat.com>
      Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
      Cc: <stable@vger.kernel.org> # v3.18+
      ab79efab
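The label structure described in this commit can be sketched in userspace as follows. Everything here is a hypothetical model: `demo_ref` stands in for a reference-counted dentry, `fail_stage` selects where the simulated failure happens, and the error codes are arbitrary choices for the demonstration. The point is only that the cleanup path must leave through the label that drops both references.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical reference-counted object standing in for a dentry. */
struct demo_ref { int count; };

/*
 * fail_stage: 0 = success, 1 = fail before any reference is taken,
 * 2 = fail after newdentry is held, 3 = fail after both are held.
 */
static int demo_copy_up(struct demo_ref *newdentry, struct demo_ref *upper,
                        int fail_stage)
{
    int err = -ENOMEM;

    if (fail_stage == 1)
        goto out;

    newdentry->count++;
    err = -EIO;
    if (fail_stage == 2)
        goto out1;

    upper->count++;
    err = -ENOSPC;
    if (fail_stage == 3)
        goto out_cleanup;

    err = 0;
out2:
    upper->count--;
out1:
    newdentry->count--;
out:
    return err;

out_cleanup:
    /* Undo the partial copy here, then exit through out2 so both refs drop. */
    goto out2;
}

/* Returns the total number of references still held after the call. */
static int demo_leaked_refs(int fail_stage)
{
    struct demo_ref newdentry = { 0 };
    struct demo_ref upper = { 0 };

    (void)demo_copy_up(&newdentry, &upper, fail_stage);
    return newdentry.count + upper.count;
}
```

If `out_cleanup` jumped to `out` instead of `out2`, `demo_leaked_refs(3)` would return 2 rather than 0.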
    • D
      ovl: use O_LARGEFILE in ovl_copy_up() · 0480334f
      David Howells committed
      Open the lower file with O_LARGEFILE in ovl_copy_up().
      
      Pass O_LARGEFILE unconditionally in ovl_copy_up_data() as it's purely for
      catching 32-bit userspace dealing with a file large enough that it'll be
      mishandled if the application isn't aware that there might be an integer
      overflow.  Inside the kernel, there shouldn't be any problems.
      Reported-by: Ulrich Obergfell <uobergfe@redhat.com>
      Signed-off-by: David Howells <dhowells@redhat.com>
      Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
      Cc: <stable@vger.kernel.org> # v3.18+
      0480334f
  2. 12 Jul, 2015 1 commit
  3. 22 Jun, 2015 3 commits
    • M
      ovl: lookup whiteouts outside iterate_dir() · cdb67279
      Miklos Szeredi committed
      jffs2 can deadlock on overlayfs readdir because it takes the same lock
      on ->iterate() as in ->lookup().
      
      Fix by moving whiteout checking outside iterate_dir().  Optimized by
      collecting potential whiteouts (DT_CHR) in a temporary list and, if it is
      non-empty, iterating through these and checking for a 0/0 chardev.
      Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
      Fixes: 49c21e1c ("ovl: check whiteout while reading directory")
      Reported-by: Roman Yeryomin <leroi.lists@gmail.com> 
      cdb67279
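The two-pass approach described in this commit can be modelled in plain C. This is a sketch, not the overlayfs implementation: `demo_entry`, `demo_collect` and `demo_check` are hypothetical names, and the "lookup" is simulated by reading device numbers stored in the entry. The first pass only records character-device candidates, and the second pass, run after iteration has finished, decides which of them are 0/0 whiteouts.

```c
#include <assert.h>
#include <stddef.h>

enum demo_type { DEMO_REG, DEMO_DIR, DEMO_CHR };

/* Hypothetical directory entry used only for this model. */
struct demo_entry {
    const char *name;
    enum demo_type type; /* what iteration reports, like d_type */
    unsigned major;      /* what a later lookup would reveal */
    unsigned minor;
    int is_whiteout;     /* filled in by the second pass */
};

/* Pass 1: runs during iteration; records candidates without any lookup. */
static size_t demo_collect(struct demo_entry *ents, size_t n,
                           struct demo_entry **cand)
{
    size_t i, ncand = 0;

    for (i = 0; i < n; i++)
        if (ents[i].type == DEMO_CHR)
            cand[ncand++] = &ents[i];
    return ncand;
}

/* Pass 2: runs after iteration; checks each candidate for device 0/0. */
static size_t demo_check(struct demo_entry **cand, size_t ncand)
{
    size_t i, found = 0;

    for (i = 0; i < ncand; i++) {
        if (cand[i]->major == 0 && cand[i]->minor == 0) {
            cand[i]->is_whiteout = 1;
            found++;
        }
    }
    return found;
}

/* Runs both passes over a small fixed directory. */
static size_t demo_count_whiteouts(void)
{
    struct demo_entry ents[] = {
        { "file",    DEMO_REG, 0, 0, 0 },
        { "deleted", DEMO_CHR, 0, 0, 0 },
        { "tty",     DEMO_CHR, 4, 1, 0 },
        { "dir",     DEMO_DIR, 0, 0, 0 },
    };
    struct demo_entry *cand[4];
    size_t ncand = demo_collect(ents, 4, cand);

    /* Skip the second pass entirely when there are no candidates. */
    return ncand ? demo_check(cand, ncand) : 0;
}
```

Because the second pass is skipped when the candidate list is empty, directories without character devices pay no extra lookup cost in this model.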
    • M
      ovl: allow distributed fs as lower layer · 7c03b5d4
      Miklos Szeredi committed
      Allow filesystems with .d_revalidate as lower layer(s), but not as upper
      layer.
      
      For local filesystems the rule was that modifying the layers
      directly while they are part of the overlay results in undefined behavior.
      
      This can easily be extended to distributed filesystems: we assume the tree
      used as lower layer is static, which means ->d_revalidate() should always
      return "1".  If that is not the case, return -ESTALE, don't try to work
      around the modification.
      Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
      7c03b5d4
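The rule stated above (a lower layer is assumed static, so anything other than "1" from revalidation becomes -ESTALE) can be sketched as a small wrapper. This is a userspace model under assumptions: `demo_revalidate_fn` and `demo_lower_revalidate` are hypothetical, and the callback signature is simplified compared with the kernel's ->d_revalidate().

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Simplified, hypothetical revalidation callback: 1 = valid, 0 = changed, <0 = error. */
typedef int (*demo_revalidate_fn)(void *dentry);

static int demo_lower_revalidate(demo_revalidate_fn reval, void *dentry)
{
    int ret;

    if (!reval)
        return 1;       /* local filesystem: nothing to revalidate */

    ret = reval(dentry);
    if (ret < 0)
        return ret;     /* pass real errors through unchanged */
    if (ret == 0)
        return -ESTALE; /* lower tree changed: do not try to work around it */
    return 1;
}

/* Two trivial callbacks for the demonstration. */
static int demo_still_valid(void *dentry) { (void)dentry; return 1; }
static int demo_changed(void *dentry) { (void)dentry; return 0; }
```

A lower filesystem whose callback reports a change therefore surfaces as a stale-handle error instead of being silently re-resolved.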
    • M
      ovl: don't traverse automount points · a6f15d9a
      Miklos Szeredi committed
      NFS and other distributed filesystems may place automount points in the
      tree.  Previously overlayfs refused to mount such filesystem types (based
      on the existence of the .d_automount callback), even if the actual export
      didn't have any automount points.
      
      It cannot be determined in advance whether the filesystem has automount
      points or not.  The solution is to allow fs with .d_automount but refuse to
      traverse any automount points encountered.
      Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
      a6f15d9a
  4. 19 Jun, 2015 2 commits
    • D
      overlayfs: Make f_path always point to the overlay and f_inode to the underlay · 4bacc9c9
      David Howells committed
      Make file->f_path always point to the overlay dentry so that the path in
      /proc/pid/fd is correct and to ensure that label-based LSMs have access to the
      overlay as well as the underlay (path-based LSMs probably don't need it).
      
      Using my union testsuite to set things up, before the patch I see:
      
      	[root@andromeda union-testsuite]# bash 5</mnt/a/foo107
      	[root@andromeda union-testsuite]# ls -l /proc/$$/fd/
      	...
      	lr-x------. 1 root root 64 Jun  5 14:38 5 -> /a/foo107
      	[root@andromeda union-testsuite]# stat /mnt/a/foo107
      	...
      	Device: 23h/35d Inode: 13381       Links: 1
      	...
      	[root@andromeda union-testsuite]# stat -L /proc/$$/fd/5
      	...
      	Device: 23h/35d Inode: 13381       Links: 1
      	...
      
      After the patch:
      
      	[root@andromeda union-testsuite]# bash 5</mnt/a/foo107
      	[root@andromeda union-testsuite]# ls -l /proc/$$/fd/
      	...
      	lr-x------. 1 root root 64 Jun  5 14:22 5 -> /mnt/a/foo107
      	[root@andromeda union-testsuite]# stat /mnt/a/foo107
      	...
      	Device: 23h/35d Inode: 40346       Links: 1
      	...
      	[root@andromeda union-testsuite]# stat -L /proc/$$/fd/5
      	...
      	Device: 23h/35d Inode: 40346       Links: 1
      	...
      
      Note the change in where /proc/$$/fd/5 points to in the ls command.  It was
      pointing to /a/foo107 (which doesn't exist) and now points to /mnt/a/foo107
      (which is correct).
      
      The inode accessed, however, is the lower layer.  The union layer is on device
      25h/37d and the upper layer on 24h/36d.
      Signed-off-by: David Howells <dhowells@redhat.com>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      4bacc9c9
    • D
      overlay: Call ovl_drop_write() earlier in ovl_dentry_open() · f25801ee
      David Howells committed
      Call ovl_drop_write() earlier in ovl_dentry_open() before we call vfs_open()
      as we've done the copy up for which we needed the freeze-write lock by that
      point.
      Signed-off-by: David Howells <dhowells@redhat.com>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      f25801ee
  5. 19 May, 2015 1 commit
    • M
      ovl: mount read-only if workdir can't be created · cc6f67bc
      Miklos Szeredi committed
      OpenWRT folks reported that overlayfs fails to mount if upper fs is full,
      because workdir can't be created.  Workdir creation can fail for various
      other reasons too.
      
      There's no reason that the mount itself should fail: overlayfs can work
      fine without a workdir, as long as the overlay isn't modified.
      
      So mount it read-only and don't allow remounting read-write.
      
      Add a couple of WARN_ON()s for the impossible case of workdir being used
      despite being read-only.
      
      Reported-by: Bastian Bittorf <bittorf@bluebottle.com> 
      Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
      Cc: <stable@vger.kernel.org> # v3.18+
      cc6f67bc
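The policy described in this commit (fall back to a read-only mount when there is no workdir, and refuse a later read-write remount) can be modelled as below. This is a hedged userspace sketch: `demo_overlay`, `DEMO_RDONLY`, `demo_mount` and `demo_remount` are hypothetical, and returning -EROFS from the remount is an assumption chosen for the model.

```c
#include <assert.h>
#include <errno.h>

#define DEMO_RDONLY 0x1u /* hypothetical flag standing in for a read-only mount flag */

struct demo_overlay {
    int has_upper;
    int has_workdir;
    unsigned flags;
};

/* Mount never fails for lack of a workdir; it degrades to read-only instead. */
static void demo_mount(struct demo_overlay *o, int has_upper, int workdir_ok,
                       unsigned flags)
{
    o->has_upper = has_upper;
    o->has_workdir = has_upper && workdir_ok;
    o->flags = flags;
    if (!o->has_workdir)
        o->flags |= DEMO_RDONLY;
}

/* A read-write remount is refused unless both upper and workdir exist. */
static int demo_remount(struct demo_overlay *o, unsigned new_flags)
{
    if (!(new_flags & DEMO_RDONLY) && (!o->has_upper || !o->has_workdir))
        return -EROFS;
    o->flags = new_flags;
    return 0;
}

/* Mounts with an upper layer, then attempts a read-write remount. */
static int demo_remount_rw_after_mount(int workdir_ok)
{
    struct demo_overlay o;

    demo_mount(&o, 1, workdir_ok, 0);
    return demo_remount(&o, 0);
}
```

In this model a failed workdir creation yields a mount that stays read-only for its whole lifetime, which matches the behaviour the commit message describes.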
  6. 14 May, 2015 1 commit
  7. 11 May, 2015 4 commits
    • A
      switch ->put_link() from dentry to inode · 5f2c4179
      Al Viro committed
      only one instance looks at that argument at all; that sole
      exception wants inode rather than dentry.
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      5f2c4179
    • A
      don't pass nameidata to ->follow_link() · 6e77137b
      Al Viro committed
      its only use is getting passed to nd_jump_link(), which can obtain
      it from current->nameidata
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      6e77137b
    • A
      new ->follow_link() and ->put_link() calling conventions · 680baacb
      Al Viro committed
      a) instead of storing the symlink body (via nd_set_link()) and returning
      an opaque pointer later passed to ->put_link(), ->follow_link() _stores_
      that opaque pointer (into void * passed by address by caller) and returns
      the symlink body.  Returning ERR_PTR() on error, NULL on jump (procfs magic
      symlinks) and pointer to symlink body for normal symlinks.  Stored pointer
      is ignored in all cases except the last one.
      
      Storing NULL for opaque pointer (or not storing it at all) means no call
      of ->put_link().
      
      b) the body used to be passed to ->put_link() implicitly (via nameidata).
      Now only the opaque pointer is.  In the cases when we used the symlink body
      to free stuff, ->follow_link() now should store it as opaque pointer in addition
      to returning it.
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      680baacb
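The calling convention in (a) and (b) can be illustrated with a userspace model. This is a sketch only: `demo_symlink`, `demo_follow_link`, `demo_put_link` and the call counter are hypothetical, and the error case is simplified to returning NULL where the kernel convention uses ERR_PTR(). The follow function returns the symlink body and stores an opaque cookie through the pointer it is given, and the caller invokes the put function only when a non-NULL cookie was stored.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

static int demo_put_calls; /* counts put_link invocations for the demonstration */

/* Hypothetical symlink: the body is either static or must be copied. */
struct demo_symlink {
    const char *target;
    int allocate; /* non-zero: follow_link returns a heap copy */
};

/* Returns the body; stores in *cookie whatever put_link must release later. */
static const char *demo_follow_link(const struct demo_symlink *l, void **cookie)
{
    char *copy;

    if (!l->allocate)
        return l->target; /* nothing to free: *cookie is left untouched */

    copy = malloc(strlen(l->target) + 1);
    if (!copy)
        return NULL;      /* simplified; the convention above uses ERR_PTR() */
    strcpy(copy, l->target);
    *cookie = copy;       /* the body doubles as the opaque pointer */
    return copy;
}

static void demo_put_link(void *cookie)
{
    demo_put_calls++;
    free(cookie);
}

/* Caller side: put_link runs only if a cookie was stored. */
static size_t demo_read_link(const struct demo_symlink *l)
{
    void *cookie = NULL;
    const char *body = demo_follow_link(l, &cookie);
    size_t len = body ? strlen(body) : 0;

    if (cookie)
        demo_put_link(cookie);
    return len;
}

/* Returns how many times put_link ran for one lookup. */
static int demo_put_calls_for(int allocate)
{
    struct demo_symlink l = { "target/path", allocate };

    demo_put_calls = 0;
    (void)demo_read_link(&l);
    return demo_put_calls;
}
```

The allocated case shows point (b): because the body is needed to free the buffer, it is stored as the cookie in addition to being returned.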
    • N
      ovl: rearrange ovl_follow_link to it doesn't need to call ->put_link · 3188b295
      NeilBrown committed
      ovl_follow_link currently calls ->put_link on an error path.
      However ->put_link is about to change in a way that it will be
      impossible to call it from ovl_follow_link.
      
      So rearrange the code to avoid the need for that error path.
      Specifically: move the kmalloc() call before the ->follow_link()
      call to the subordinate filesystem.
      Signed-off-by: NeilBrown <neilb@suse.de>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      3188b295
  8. 18 Mar, 2015 3 commits
    • H
      ovl: upper fs should not be R/O · 71cbad7e
      hujianyang committed
      After the introduction of multi-lower layer support, users can mount a r/o
      partition as the leftmost lowerdir instead of using it as upperdir.
      And a r/o upperdir may cause an error like
      
      	overlayfs: failed to create directory ./workdir/work
      
      during mount.
      
      This patch checks the *s_flags* of the upper fs and returns an error if
      it is a r/o partition. The check of *upper_mnt->mnt_sb->s_flags*
      can be removed now.
      
      This patch also removes
      
      	/* FIXME: workdir is not needed for a R/O mount */
      
      from ovl_fill_super() because:
      
      1) for upper fs r/o case
      Setting a r/o partition as upper is prevented, no need to care about
      workdir in this case.
      
      2) for "mount overlay -o ro" with a r/w upper fs case
      Users could remount overlayfs to r/w in this case, so workdir should
      not be omitted.
      Signed-off-by: hujianyang <hujianyang@huawei.com>
      Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
      71cbad7e
    • H
      ovl: check lowerdir amount for non-upper mount · 6be4506e
      hujianyang committed
      The recent multi-lower-layer mount support allows upperdir and workdir
      to be omitted, which lets overlayfs be mounted with only one
      lowerdir directory. This makes no sense and is potentially risky.
      
      This patch checks the total number of lower directories to prevent
      mounting overlayfs with only one directory.
      
      Also, an error message is added to indicate that the lower directories
      exceed the OVL_MAX_STACK limit.
      Signed-off-by: Nhujianyang <hujianyang@huawei.com>
      Signed-off-by: NMiklos Szeredi <mszeredi@suse.cz>
      6be4506e
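The checks described in this commit can be sketched as a small validation function. This is a userspace model under assumptions: `demo_check_layers` and `demo_count_lower` are hypothetical, the value 500 for `DEMO_MAX_STACK` is assumed rather than taken from the commit, and escaped colons in paths are deliberately ignored.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define DEMO_MAX_STACK 500 /* assumed value standing in for OVL_MAX_STACK */

/* Counts ':'-separated entries; escaped colons are not handled in this model. */
static size_t demo_count_lower(const char *lowerdir)
{
    size_t n = 1;

    for (; *lowerdir; lowerdir++)
        if (*lowerdir == ':')
            n++;
    return n;
}

/* Returns 0 if the layer configuration is acceptable, -EINVAL otherwise. */
static int demo_check_layers(const char *lowerdir, int has_upper)
{
    size_t n;

    if (!lowerdir || !*lowerdir)
        return -EINVAL;

    n = demo_count_lower(lowerdir);
    if (n > DEMO_MAX_STACK)
        return -EINVAL; /* too many lower directories */
    if (!has_upper && n == 1)
        return -EINVAL; /* a single directory with no upper is pointless */
    return 0;
}
```

A single lower directory is accepted only when an upper layer is present; without one, at least two lower directories are required.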
    • H
      ovl: print error message for invalid mount options · bead55ef
      hujianyang committed
      Like other filesystems, overlayfs should print an error message when it
      encounters an invalid mount option.
      
      With this patch, an invalid option is clearly reported.
      Reported-by: Fabian Sturm <fabian.sturm@aduu.de>
      Signed-off-by: hujianyang <hujianyang@huawei.com>
      Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
      bead55ef
  9. 23 Feb, 2015 1 commit
    • D
      VFS: (Scripted) Convert S_ISLNK/DIR/REG(dentry->d_inode) to d_is_*(dentry) · e36cb0b8
      David Howells committed
      Convert the following where appropriate:
      
       (1) S_ISLNK(dentry->d_inode) to d_is_symlink(dentry).
      
       (2) S_ISREG(dentry->d_inode) to d_is_reg(dentry).
      
       (3) S_ISDIR(dentry->d_inode) to d_is_dir(dentry).  This is actually more
           complicated than it appears as some calls should be converted to
           d_can_lookup() instead.  The difference is whether the directory in
           question is a real dir with a ->lookup op or whether it's a fake dir with
           a ->d_automount op.
      
      In some circumstances, we can subsume checks for dentry->d_inode not being
      NULL into this, provided the code isn't in a filesystem that expects
      d_inode to be NULL if the dirent really *is* negative (ie. if we're going to
      use d_inode() rather than d_backing_inode() to get the inode pointer).
      
      Note that the dentry type field may be set to something other than
      DCACHE_MISS_TYPE when d_inode is NULL in the case of unionmount, where the VFS
      manages the fall-through from a negative dentry to a lower layer.  In such a
      case, the dentry type of the negative union dentry is set to the same as the
      type of the lower dentry.
      
      However, if you know d_inode is not NULL at the call site, then you can use
      the d_is_xxx() functions even in a filesystem.
      
      There is one further complication: a 0,0 chardev dentry may be labelled
      DCACHE_WHITEOUT_TYPE rather than DCACHE_SPECIAL_TYPE.  Strictly, this was
      intended for special directory entry types that don't have attached inodes.
      
      The following perl+coccinelle script was used:
      
      use strict;
      
      my @callers;
      open(my $fd, 'git grep -l \'S_IS[A-Z].*->d_inode\' |') ||
          die "Can't grep for S_ISDIR and co. callers";
      @callers = <$fd>;
      close($fd);
      unless (@callers) {
          print "No matches\n";
          exit(0);
      }
      
      my @cocci = (
          '@@',
          'expression E;',
          '@@',
          '',
          '- S_ISLNK(E->d_inode->i_mode)',
          '+ d_is_symlink(E)',
          '',
          '@@',
          'expression E;',
          '@@',
          '',
          '- S_ISDIR(E->d_inode->i_mode)',
          '+ d_is_dir(E)',
          '',
          '@@',
          'expression E;',
          '@@',
          '',
          '- S_ISREG(E->d_inode->i_mode)',
          '+ d_is_reg(E)' );
      
      my $coccifile = "tmp.sp.cocci";
      open($fd, ">$coccifile") || die $coccifile;
      print($fd "$_\n") || die $coccifile foreach (@cocci);
      close($fd);
      
      foreach my $file (@callers) {
          chomp $file;
          print "Processing ", $file, "\n";
          system("spatch", "--sp-file", $coccifile, $file, "--in-place", "--no-show-diff") == 0 ||
      	die "spatch failed";
      }
      
      [AV: overlayfs parts skipped]
      Signed-off-by: David Howells <dhowells@redhat.com>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      e36cb0b8
  10. 09 Jan, 2015 1 commit
  11. 08 Jan, 2015 3 commits
    • S
      ovl: Prevent rw remount when it should be ro mount · 3cdf6fe9
      Seunghun Lee committed
      Overlayfs should be mounted read-only when upper-fs is read-only or nonexistent.
      But currently it can be remounted read-write, which can cause a kernel panic.
      So we should prevent read-write remount when the above situation happens.
      Signed-off-by: Seunghun Lee <waydi1@gmail.com>
      Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
      3cdf6fe9
    • H
      ovl: Fix opaque regression in ovl_lookup · a425c037
      hujianyang committed
      The current multi-layer overlayfs has a regression in
      .lookup(). If a merged directory contains a directory in upperdir
      and a regular file with the same name in lowerdir, the lower
      file used to be hidden and the upper directory set to opaque.
      The present code no longer behaves this way.
      
      In the lowerdir lookup path, if the inode found is not a directory,
      the type check against the previous inode is missing. This inode
      is then copied to the lowerstack of ovl_entry directly.
      
      That will lead to several wrong conditions, for example,
      the reading of the directory in upperdir may return an error
      like:
      
         ls: reading directory .: Not a directory
      
      This patch makes the lowerdir lookup path check the opaque
      flag for non-directory files too.
      Signed-off-by: hujianyang <hujianyang@huawei.com>
      Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
      a425c037
    • H
      ovl: Fix kernel panic while mounting overlayfs · 2f83fd8c
      hujianyang committed
      In the recent multi-layer support version, ovl_fill_super()
      incorrectly returns 0 on an error handling path, which then
      causes a kernel panic.
      
      This failure can be reproduced by mounting an overlayfs with
      upperdir and workdir in different mounts.
      
      Also, if the memory allocation of *lower_mnt* fails, this
      function may return zero as well.
      
      This patch fixes the problem by setting *err* to the proper error
      number before jumping to the error handling path.
      Signed-off-by: hujianyang <hujianyang@huawei.com>
      Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
      2f83fd8c
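The fix pattern named in this commit, setting the error code before each jump to the shared exit label, can be sketched in userspace. This is a model under assumptions: `demo_fill_super` and its parameters are hypothetical, the chosen error codes are illustrative, and the allocation is released on success as well so the example does not leak.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/*
 * same_mount: non-zero when the simulated upperdir and workdir share a mount.
 * numlower:   number of lower layers to allocate bookkeeping for.
 */
static int demo_fill_super(int same_mount, size_t numlower)
{
    int err;
    void **lower = NULL;

    /* Set err before the check, so the jump cannot carry a stale 0. */
    err = -EINVAL;
    if (!same_mount)
        goto out;

    err = -ENOMEM;
    lower = calloc(numlower, sizeof(*lower));
    if (!lower)
        goto out;

    err = 0;
out:
    free(lower); /* free(NULL) is a no-op; a real mount would keep this on success */
    return err;
}
```

Had `err` still been 0 when the first check failed, the caller would have treated the half-initialised state as a successful mount, which is the failure mode the commit message describes.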
  12. 13 Dec, 2014 15 commits