1. 04 1月, 2012 25 次提交
  2. 07 12月, 2011 1 次提交
    • A
      fix apparmor dereferencing potentially freed dentry, sanitize __d_path() API · 02125a82
      Al Viro 提交于
      __d_path() API is asking for trouble and in case of apparmor d_namespace_path()
      getting just that.  The root cause is that when __d_path() misses the root
      it had been told to look for, it stores the location of the most remote ancestor
      in *root.  Without grabbing references.  Sure, at the moment of call it had
      been pinned down by what we have in *path.  And if we raced with umount -l, we
      could have very well stopped at vfsmount/dentry that got freed as soon as
      prepend_path() dropped vfsmount_lock.
      
      It is safe to compare these pointers with pre-existing (and known to be still
      alive) vfsmount and dentry, as long as all we are asking is "is it the same
      address?".  Dereferencing is not safe and apparmor ended up stepping into
      that.  d_namespace_path() really wants to examine the place where we stopped,
      even if it's not connected to our namespace.  As the result, it looked
      at ->d_sb->s_magic of a dentry that might've been already freed by that point.
      All other callers had been careful enough to avoid that, but it's really
      a bad interface - it invites that kind of trouble.
      
      The fix is fairly straightforward, even though it's bigger than I'd like:
      	* prepend_path() root argument becomes const.
      	* __d_path() is never called with NULL/NULL root.  It was a kludge
      to start with.  Instead, we have an explicit function - d_absolute_root().
      Same as __d_path(), except that it doesn't get root passed and stops where
      it stops.  apparmor and tomoyo are using it.
      	* __d_path() returns NULL on path outside of root.  The main
      caller is show_mountinfo() and that's precisely what we pass root for - to
      skip those outside chroot jail.  Those who don't want that can (and do)
      use d_path().
      	* __d_path() root argument becomes const.  Everyone agrees, I hope.
      	* apparmor does *NOT* try to use __d_path() or any of its variants
      when it sees that path->mnt is an internal vfsmount.  In that case it's
      definitely not mounted anywhere and dentry_path() is exactly what we want
      there.  Handling of sysctl()-triggered weirdness is moved to that place.
      	* if apparmor is asked to do pathname relative to chroot jail
      and __d_path() tells it we it's not in that jail, the sucker just calls
      d_absolute_path() instead.  That's the other remaining caller of __d_path(),
      BTW.
              * seq_path_root() does _NOT_ return -ENAMETOOLONG (it's stupid anyway -
      the normal seq_file logics will take care of growing the buffer and redoing
      the call of ->show() just fine).  However, if it gets path not reachable
      from root, it returns SEQ_SKIP.  The only caller adjusted (i.e. stopped
      ignoring the return value as it used to do).
      Reviewed-by: NJohn Johansen <john.johansen@canonical.com>
      ACKed-by: NJohn Johansen <john.johansen@canonical.com>
      Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
      Cc: stable@vger.kernel.org
      02125a82
  3. 23 11月, 2011 1 次提交
  4. 17 11月, 2011 2 次提交
  5. 28 10月, 2011 1 次提交
  6. 27 9月, 2011 1 次提交
  7. 24 7月, 2011 1 次提交
    • T
      VFS : mount lock scalability for internal mounts · 423e0ab0
      Tim Chen 提交于
      For a number of file systems that don't have a mount point (e.g. sockfs
      and pipefs), they are not marked as long term. Therefore in
      mntput_no_expire, all locks in vfs_mount lock are taken instead of just
      local cpu's lock to aggregate reference counts when we release
      reference to file objects.  In fact, only local lock need to have been
      taken to update ref counts as these file systems are in no danger of
      going away until we are ready to unregister them.
      
      The attached patch marks file systems using kern_mount without
      mount point as long term.  The contentions of vfs_mount lock
      is now eliminated.  Before un-registering such file system,
      kern_unmount should be called to remove the long term flag and
      make the mount point ready to be freed.
      Signed-off-by: NTim Chen <tim.c.chen@linux.intel.com>
      Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
      423e0ab0
  8. 21 7月, 2011 1 次提交
  9. 26 5月, 2011 1 次提交
    • R
      fs/namespace.c: bound mount propagation fix · 7c6e984d
      Roman Borisov 提交于
      This issue was discovered by users of busybox.  And the bug is actual for
      busybox users, I don't know how it affects others.  Apparently, mount is
      called with and without MS_SILENT, and this affects mount() behaviour.
      But MS_SILENT is only supposed to affect kernel logging verbosity.
      
      The following script was run in an empty test directory:
      
      mkdir -p mount.dir mount.shared1 mount.shared2
      touch mount.dir/a mount.dir/b
      mount -vv --bind         mount.shared1 mount.shared1
      mount -vv --make-rshared mount.shared1
      mount -vv --bind         mount.shared2 mount.shared2
      mount -vv --make-rshared mount.shared2
      mount -vv --bind mount.shared2 mount.shared1
      mount -vv --bind mount.dir     mount.shared2
      ls -R mount.dir mount.shared1 mount.shared2
      umount mount.dir mount.shared1 mount.shared2 2>/dev/null
      umount mount.dir mount.shared1 mount.shared2 2>/dev/null
      umount mount.dir mount.shared1 mount.shared2 2>/dev/null
      rm -f mount.dir/a mount.dir/b mount.dir/c
      rmdir mount.dir mount.shared1 mount.shared2
      
      mount -vv was used to show the mount() call arguments and result.
      Output shows that flag argument has 0x00008000 = MS_SILENT bit:
      
      mount: mount('mount.shared1','mount.shared1','(null)',0x00009000,'(null)'):0
      mount: mount('','mount.shared1','',0x0010c000,''):0
      mount: mount('mount.shared2','mount.shared2','(null)',0x00009000,'(null)'):0
      mount: mount('','mount.shared2','',0x0010c000,''):0
      mount: mount('mount.shared2','mount.shared1','(null)',0x00009000,'(null)'):0
      mount: mount('mount.dir','mount.shared2','(null)',0x00009000,'(null)'):0
      mount.dir:
      a
      b
      
      mount.shared1:
      
      mount.shared2:
      a
      b
      
      After adding --loud option to remove MS_SILENT bit from just one mount cmd:
      
      mkdir -p mount.dir mount.shared1 mount.shared2
      touch mount.dir/a mount.dir/b
      mount -vv --bind         mount.shared1 mount.shared1 2>&1
      mount -vv --make-rshared mount.shared1               2>&1
      mount -vv --bind         mount.shared2 mount.shared2 2>&1
      mount -vv --loud --make-rshared mount.shared2               2>&1  # <-HERE
      mount -vv --bind mount.shared2 mount.shared1         2>&1
      mount -vv --bind mount.dir     mount.shared2         2>&1
      ls -R mount.dir mount.shared1 mount.shared2      2>&1
      umount mount.dir mount.shared1 mount.shared2 2>/dev/null
      umount mount.dir mount.shared1 mount.shared2 2>/dev/null
      umount mount.dir mount.shared1 mount.shared2 2>/dev/null
      rm -f mount.dir/a mount.dir/b mount.dir/c
      rmdir mount.dir mount.shared1 mount.shared2
      
      The result is different now - look closely at mount.shared1 directory listing.
      Now it does show files 'a' and 'b':
      
      mount: mount('mount.shared1','mount.shared1','(null)',0x00009000,'(null)'):0
      mount: mount('','mount.shared1','',0x0010c000,''):0
      mount: mount('mount.shared2','mount.shared2','(null)',0x00009000,'(null)'):0
      mount: mount('','mount.shared2','',0x00104000,''):0
      mount: mount('mount.shared2','mount.shared1','(null)',0x00009000,'(null)'):0
      mount: mount('mount.dir','mount.shared2','(null)',0x00009000,'(null)'):0
      
      mount.dir:
      a
      b
      
      mount.shared1:
      a
      b
      
      mount.shared2:
      a
      b
      
      The analysis shows that MS_SILENT flag which is ON by default in any
      busybox-> mount operations cames to flags_to_propagation_type function and
      causes the error return while is_power_of_2 checking because the function
      expects only one bit set.  This doesn't allow to do busybox->mount with
      any --make-[r]shared, --make-[r]private etc options.
      
      Moreover, the recently added flags_to_propagation_type() function doesn't
      allow us to do such operations as --make-[r]private --make-[r]shared etc.
      when MS_SILENT is on.  The idea or clearing the MS_SILENT flag came from
      to Denys Vlasenko.
      Signed-off-by: NRoman Borisov <ext-roman.borisov@nokia.com>
      Reported-by: NDenys Vlasenko <vda.linux@googlemail.com>
      Cc: Chuck Ebbert <cebbert@redhat.com>
      Cc: Alexander Shishkin <virtuoso@slind.org>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Christoph Hellwig <hch@lst.de>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
      7c6e984d
  10. 13 4月, 2011 1 次提交
    • L
      Revert "vfs: Export file system uuid via /proc/<pid>/mountinfo" · be85bcca
      Linus Torvalds 提交于
      This reverts commit 93f1c20b.
      
      It turns out that libmount misparses it because it adds a '-' character
      in the uuid string, which libmount then incorrectly confuses with the
      separator string (" - ") at the end of all the optional arguments.
      
      Upstream libmount (in the util-linux tree) has been fixed, but until
      that fix actually percolates up to users, we'd better not expose this
      change in the kernel.
      
      Let's revisit this later (possibly by exposing the UUID without any '-'
      characters in it, avoiding the user-space bug).
      Reported-by: NDave Jones <davej@redhat.com>
      Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Karel Zak <kzak@redhat.com>
      Cc: Ram Pai <linuxram@us.ibm.com>
      Cc: Miklos Szeredi <mszeredi@suse.cz>
      Cc: Eric Sandeen <sandeen@redhat.com>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      be85bcca
  11. 23 3月, 2011 1 次提交
  12. 18 3月, 2011 4 次提交
    • A
      change the locking order for namespace_sem · b12cea91
      Al Viro 提交于
      Have it nested inside ->i_mutex.  Instead of using follow_down()
      under namespace_sem, followed by grabbing i_mutex and checking that
      mountpoint to be is not dead, do the following:
      	grab i_mutex
      	check that it's not dead
      	grab namespace_sem
      	see if anything is mounted there
      	if not, we've won
      	otherwise
      		drop locks
      		put_path on what we had
      		replace with what's mounted
      		retry everything with new mountpoint to be
      
      New helper (lock_mount()) does that.  do_add_mount(), do_move_mount(),
      do_loopback() and pivot_root() switched to it; in case of the last
      two that eliminates a race we used to have - original code didn't
      do follow_down().
      Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
      b12cea91
    • A
      fix deadlock in pivot_root() · 27cb1572
      Al Viro 提交于
      Don't hold vfsmount_lock over the loop traversing ->mnt_parent;
      do check_mnt(new.mnt) under namespace_sem instead; combined with
      namespace_sem held over all that code it'll guarantee the stability
      of ->mnt_parent chain all the way to the root.
      
      Doing check_mnt() outside of namespace_sem in case of pivot_root()
      is wrong anyway.
      Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
      27cb1572
    • A
      vfs: split off vfsmount-related parts of vfs_kern_mount() · 9d412a43
      Al Viro 提交于
      new function: mount_fs().  Does all work done by vfs_kern_mount()
      except the allocation and filling of vfsmount; returns root dentry
      or ERR_PTR().
      
      vfs_kern_mount() switched to using it and taken to fs/namespace.c,
      along with its wrappers.
      
      alloc_vfsmnt()/free_vfsmnt() made static.
      
      functions in namespace.c slightly reordered.
      Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
      9d412a43
    • A
      kill simple_set_mnt() · 474a00ee
      Al Viro 提交于
      not needed anymore, since all users (->get_sb() instances) are gone.
      Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
      474a00ee