1. 20 7月, 2011 10 次提交
  2. 13 7月, 2011 1 次提交
  3. 20 6月, 2011 2 次提交
  4. 16 6月, 2011 2 次提交
    • A
      VFS: Fix vfsmount overput on simultaneous automount · 8aef1884
      Al Viro 提交于
      [Kudos to dhowells for tracking that crap down]
      
      If two processes attempt to cause automounting on the same mountpoint at the
      same time, the vfsmount holding the mountpoint will be left with one too few
      references on it, causing a BUG when the kernel tries to clean up.
      
      The problem is that lock_mount() drops the caller's reference to the
      mountpoint's vfsmount in the case where it finds something already mounted on
      the mountpoint as it transits to the mounted filesystem and replaces path->mnt
      with the new mountpoint vfsmount.
      
      During a pathwalk, however, we don't take a reference on the vfsmount if it is
      the same as the one in the nameidata struct, but do_add_mount() doesn't know
      this.
      
      The fix is to make sure we have a ref on the vfsmount of the mountpoint before
      calling do_add_mount().  However, if lock_mount() doesn't transit, we're then
      left with an extra ref on the mountpoint vfsmount which needs releasing.
      We can handle that in follow_managed() by not making assumptions about what
      we can and what we cannot get from lookup_mnt() as the current code does.
      
      The callers of follow_managed() expect that reference to path->mnt will be
      grabbed iff path->mnt has been changed.  follow_managed() and follow_automount()
      keep track of whether such reference has been grabbed and assume that it'll
      happen in those and only those cases that'll have us return with changed
      path->mnt.  That assumption is almost correct - it breaks in case of
      racing automounts and in even harder to hit race between following a mountpoint
      and a couple of mount --move.  The thing is, we don't need to make that
      assumption at all - after the end of loop in follow_manage() we can check
      if path->mnt has ended up unchanged and do mntput() if needed.
      
      The BUG can be reproduced with the following test program:
      
      	#include <stdio.h>
      	#include <sys/types.h>
      	#include <sys/stat.h>
      	#include <unistd.h>
      	#include <sys/wait.h>
      	int main(int argc, char **argv)
      	{
      		int pid, ws;
      		struct stat buf;
      		pid = fork();
      		stat(argv[1], &buf);
      		if (pid > 0) wait(&ws);
      		return 0;
      	}
      
      and the following procedure:
      
       (1) Mount an NFS volume that on the server has something else mounted on a
           subdirectory.  For instance, I can mount / from my server:
      
      	mount warthog:/ /mnt -t nfs4 -r
      
           On the server /data has another filesystem mounted on it, so NFS will see
           a change in FSID as it walks down the path, and will mark /mnt/data as
           being a mountpoint.  This will cause the automount code to be triggered.
      
           !!! Do not look inside the mounted fs at this point !!!
      
       (2) Run the above program on a file within the submount to generate two
           simultaneous automount requests:
      
      	/tmp/forkstat /mnt/data/testfile
      
       (3) Unmount the automounted submount:
      
      	umount /mnt/data
      
       (4) Unmount the original mount:
      
      	umount /mnt
      
           At this point the kernel should throw a BUG with something like the
           following:
      
      	BUG: Dentry ffff880032e3c5c0{i=2,n=} still in use (1) [unmount of nfs4 0:12]
      
      Note that the bug appears on the root dentry of the original mount, not the
      mountpoint and not the submount because sys_umount() hasn't got to its final
      mntput_no_expire() yet, but this isn't so obvious from the call trace:
      
       [<ffffffff8117cd82>] shrink_dcache_for_umount+0x69/0x82
       [<ffffffff8116160e>] generic_shutdown_super+0x37/0x15b
       [<ffffffffa00fae56>] ? nfs_super_return_all_delegations+0x2e/0x1b1 [nfs]
       [<ffffffff811617f3>] kill_anon_super+0x1d/0x7e
       [<ffffffffa00d0be1>] nfs4_kill_super+0x60/0xb6 [nfs]
       [<ffffffff81161c17>] deactivate_locked_super+0x34/0x83
       [<ffffffff811629ff>] deactivate_super+0x6f/0x7b
       [<ffffffff81186261>] mntput_no_expire+0x18d/0x199
       [<ffffffff811862a8>] mntput+0x3b/0x44
       [<ffffffff81186d87>] release_mounts+0xa2/0xbf
       [<ffffffff811876af>] sys_umount+0x47a/0x4ba
       [<ffffffff8109e1ca>] ? trace_hardirqs_on_caller+0x1fd/0x22f
       [<ffffffff816ea86b>] system_call_fastpath+0x16/0x1b
      
      as do_umount() is inlined.  However, you can see release_mounts() in there.
      
      Note also that it may be necessary to have multiple CPU cores to be able to
      trigger this bug.
      Tested-by: NJeff Layton <jlayton@redhat.com>
      Tested-by: NIan Kent <raven@themaw.net>
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
      8aef1884
    • T
      fix wrong iput on d_inode introduced by e6bc45d6 · 50338b88
      Török Edwin 提交于
      Git bisection shows that commit e6bc45d6 causes
      BUG_ONs under high I/O load:
      
      kernel BUG at fs/inode.c:1368!
      [ 2862.501007] Call Trace:
      [ 2862.501007]  [<ffffffff811691d8>] d_kill+0xf8/0x140
      [ 2862.501007]  [<ffffffff81169c19>] dput+0xc9/0x190
      [ 2862.501007]  [<ffffffff8115577f>] fput+0x15f/0x210
      [ 2862.501007]  [<ffffffff81152171>] filp_close+0x61/0x90
      [ 2862.501007]  [<ffffffff81152251>] sys_close+0xb1/0x110
      [ 2862.501007]  [<ffffffff814c14fb>] system_call_fastpath+0x16/0x1b
      
      A reliable way to reproduce this bug is:
      Login to KDE, run 'rsnapshot sync', and apt-get install openjdk-6-jdk,
      and apt-get remove openjdk-6-jdk.
      
      The buggy part of the patch is this:
      	struct inode *inode = NULL;
      .....
      -               if (nd.last.name[nd.last.len])
      -                       goto slashes;
                      inode = dentry->d_inode;
      -               if (inode)
      -                       ihold(inode);
      +               if (nd.last.name[nd.last.len] || !inode)
      +                       goto slashes;
      +               ihold(inode)
      ...
      	if (inode)
      		iput(inode);	/* truncate the inode here */
      
      If nd.last.name[nd.last.len] is nonzero (and thus goto slashes branch is taken),
      and dentry->d_inode is non-NULL, then this code now does an additional iput on
      the inode, which is wrong.
      
      Fix this by only setting the inode variable if nd.last.name[nd.last.len] is 0.
      
      Reference: https://lkml.org/lkml/2011/6/15/50Reported-by: NNorbert Preining <preining@logic.at>
      Reported-by: NTörök Edwin <edwintorok@gmail.com>
      Cc: "Theodore Ts'o" <tytso@mit.edu>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: NTörök Edwin <edwintorok@gmail.com>
      Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
      50338b88
  5. 07 6月, 2011 1 次提交
  6. 30 5月, 2011 1 次提交
  7. 27 5月, 2011 3 次提交
  8. 26 5月, 2011 11 次提交
  9. 21 5月, 2011 1 次提交
  10. 14 5月, 2011 1 次提交
    • L
      vfs: micro-optimize acl_permission_check() · 26cf46be
      Linus Torvalds 提交于
      It's a hot function, and we're better off not mixing types in the mask
      calculations.  The compiler just ends up mixing 16-bit and 32-bit
      operations, for no good reason.
      
      So do everything in 'unsigned int' rather than mixing 'unsigned int'
      masking with a 'umode_t' (16-bit) mode variable.
      
      This, together with the parent commit (47a150ed: "Cache user_ns in
      struct cred") makes acl_permission_check() much nicer.
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      26cf46be
  11. 16 4月, 2011 1 次提交
  12. 31 3月, 2011 1 次提交
  13. 25 3月, 2011 1 次提交
  14. 24 3月, 2011 2 次提交
  15. 23 3月, 2011 1 次提交
  16. 18 3月, 2011 1 次提交