1. 04 1月, 2012 31 次提交
  2. 07 12月, 2011 1 次提交
    • A
      fix apparmor dereferencing potentially freed dentry, sanitize __d_path() API · 02125a82
      Al Viro 提交于
      __d_path() API is asking for trouble and in case of apparmor d_namespace_path()
      getting just that.  The root cause is that when __d_path() misses the root
      it had been told to look for, it stores the location of the most remote ancestor
      in *root.  Without grabbing references.  Sure, at the moment of call it had
      been pinned down by what we have in *path.  And if we raced with umount -l, we
      could have very well stopped at vfsmount/dentry that got freed as soon as
      prepend_path() dropped vfsmount_lock.
      
      It is safe to compare these pointers with pre-existing (and known to be still
      alive) vfsmount and dentry, as long as all we are asking is "is it the same
      address?".  Dereferencing is not safe and apparmor ended up stepping into
      that.  d_namespace_path() really wants to examine the place where we stopped,
      even if it's not connected to our namespace.  As the result, it looked
      at ->d_sb->s_magic of a dentry that might've been already freed by that point.
      All other callers had been careful enough to avoid that, but it's really
      a bad interface - it invites that kind of trouble.
      
      The fix is fairly straightforward, even though it's bigger than I'd like:
      	* prepend_path() root argument becomes const.
      	* __d_path() is never called with NULL/NULL root.  It was a kludge
      to start with.  Instead, we have an explicit function - d_absolute_root().
      Same as __d_path(), except that it doesn't get root passed and stops where
      it stops.  apparmor and tomoyo are using it.
      	* __d_path() returns NULL on path outside of root.  The main
      caller is show_mountinfo() and that's precisely what we pass root for - to
      skip those outside chroot jail.  Those who don't want that can (and do)
      use d_path().
      	* __d_path() root argument becomes const.  Everyone agrees, I hope.
      	* apparmor does *NOT* try to use __d_path() or any of its variants
      when it sees that path->mnt is an internal vfsmount.  In that case it's
      definitely not mounted anywhere and dentry_path() is exactly what we want
      there.  Handling of sysctl()-triggered weirdness is moved to that place.
      	* if apparmor is asked to do pathname relative to chroot jail
      and __d_path() tells it we it's not in that jail, the sucker just calls
      d_absolute_path() instead.  That's the other remaining caller of __d_path(),
      BTW.
              * seq_path_root() does _NOT_ return -ENAMETOOLONG (it's stupid anyway -
      the normal seq_file logics will take care of growing the buffer and redoing
      the call of ->show() just fine).  However, if it gets path not reachable
      from root, it returns SEQ_SKIP.  The only caller adjusted (i.e. stopped
      ignoring the return value as it used to do).
      Reviewed-by: NJohn Johansen <john.johansen@canonical.com>
      ACKed-by: NJohn Johansen <john.johansen@canonical.com>
      Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
      Cc: stable@vger.kernel.org
      02125a82
  3. 23 11月, 2011 1 次提交
  4. 17 11月, 2011 2 次提交
  5. 28 10月, 2011 1 次提交
  6. 27 9月, 2011 1 次提交
  7. 24 7月, 2011 1 次提交
    • T
      VFS : mount lock scalability for internal mounts · 423e0ab0
      Tim Chen 提交于
      For a number of file systems that don't have a mount point (e.g. sockfs
      and pipefs), they are not marked as long term. Therefore in
      mntput_no_expire, all locks in vfs_mount lock are taken instead of just
      local cpu's lock to aggregate reference counts when we release
      reference to file objects.  In fact, only local lock need to have been
      taken to update ref counts as these file systems are in no danger of
      going away until we are ready to unregister them.
      
      The attached patch marks file systems using kern_mount without
      mount point as long term.  The contentions of vfs_mount lock
      is now eliminated.  Before un-registering such file system,
      kern_unmount should be called to remove the long term flag and
      make the mount point ready to be freed.
      Signed-off-by: NTim Chen <tim.c.chen@linux.intel.com>
      Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
      423e0ab0
  8. 21 7月, 2011 1 次提交
  9. 26 5月, 2011 1 次提交
    • R
      fs/namespace.c: bound mount propagation fix · 7c6e984d
      Roman Borisov 提交于
      This issue was discovered by users of busybox.  And the bug is actual for
      busybox users, I don't know how it affects others.  Apparently, mount is
      called with and without MS_SILENT, and this affects mount() behaviour.
      But MS_SILENT is only supposed to affect kernel logging verbosity.
      
      The following script was run in an empty test directory:
      
      mkdir -p mount.dir mount.shared1 mount.shared2
      touch mount.dir/a mount.dir/b
      mount -vv --bind         mount.shared1 mount.shared1
      mount -vv --make-rshared mount.shared1
      mount -vv --bind         mount.shared2 mount.shared2
      mount -vv --make-rshared mount.shared2
      mount -vv --bind mount.shared2 mount.shared1
      mount -vv --bind mount.dir     mount.shared2
      ls -R mount.dir mount.shared1 mount.shared2
      umount mount.dir mount.shared1 mount.shared2 2>/dev/null
      umount mount.dir mount.shared1 mount.shared2 2>/dev/null
      umount mount.dir mount.shared1 mount.shared2 2>/dev/null
      rm -f mount.dir/a mount.dir/b mount.dir/c
      rmdir mount.dir mount.shared1 mount.shared2
      
      mount -vv was used to show the mount() call arguments and result.
      Output shows that flag argument has 0x00008000 = MS_SILENT bit:
      
      mount: mount('mount.shared1','mount.shared1','(null)',0x00009000,'(null)'):0
      mount: mount('','mount.shared1','',0x0010c000,''):0
      mount: mount('mount.shared2','mount.shared2','(null)',0x00009000,'(null)'):0
      mount: mount('','mount.shared2','',0x0010c000,''):0
      mount: mount('mount.shared2','mount.shared1','(null)',0x00009000,'(null)'):0
      mount: mount('mount.dir','mount.shared2','(null)',0x00009000,'(null)'):0
      mount.dir:
      a
      b
      
      mount.shared1:
      
      mount.shared2:
      a
      b
      
      After adding --loud option to remove MS_SILENT bit from just one mount cmd:
      
      mkdir -p mount.dir mount.shared1 mount.shared2
      touch mount.dir/a mount.dir/b
      mount -vv --bind         mount.shared1 mount.shared1 2>&1
      mount -vv --make-rshared mount.shared1               2>&1
      mount -vv --bind         mount.shared2 mount.shared2 2>&1
      mount -vv --loud --make-rshared mount.shared2               2>&1  # <-HERE
      mount -vv --bind mount.shared2 mount.shared1         2>&1
      mount -vv --bind mount.dir     mount.shared2         2>&1
      ls -R mount.dir mount.shared1 mount.shared2      2>&1
      umount mount.dir mount.shared1 mount.shared2 2>/dev/null
      umount mount.dir mount.shared1 mount.shared2 2>/dev/null
      umount mount.dir mount.shared1 mount.shared2 2>/dev/null
      rm -f mount.dir/a mount.dir/b mount.dir/c
      rmdir mount.dir mount.shared1 mount.shared2
      
      The result is different now - look closely at mount.shared1 directory listing.
      Now it does show files 'a' and 'b':
      
      mount: mount('mount.shared1','mount.shared1','(null)',0x00009000,'(null)'):0
      mount: mount('','mount.shared1','',0x0010c000,''):0
      mount: mount('mount.shared2','mount.shared2','(null)',0x00009000,'(null)'):0
      mount: mount('','mount.shared2','',0x00104000,''):0
      mount: mount('mount.shared2','mount.shared1','(null)',0x00009000,'(null)'):0
      mount: mount('mount.dir','mount.shared2','(null)',0x00009000,'(null)'):0
      
      mount.dir:
      a
      b
      
      mount.shared1:
      a
      b
      
      mount.shared2:
      a
      b
      
      The analysis shows that MS_SILENT flag which is ON by default in any
      busybox-> mount operations cames to flags_to_propagation_type function and
      causes the error return while is_power_of_2 checking because the function
      expects only one bit set.  This doesn't allow to do busybox->mount with
      any --make-[r]shared, --make-[r]private etc options.
      
      Moreover, the recently added flags_to_propagation_type() function doesn't
      allow us to do such operations as --make-[r]private --make-[r]shared etc.
      when MS_SILENT is on.  The idea or clearing the MS_SILENT flag came from
      to Denys Vlasenko.
      Signed-off-by: NRoman Borisov <ext-roman.borisov@nokia.com>
      Reported-by: NDenys Vlasenko <vda.linux@googlemail.com>
      Cc: Chuck Ebbert <cebbert@redhat.com>
      Cc: Alexander Shishkin <virtuoso@slind.org>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Christoph Hellwig <hch@lst.de>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
      7c6e984d