1. 23 6月, 2006 4 次提交
    • N
      [PATCH] Fix dcache race during umount · 0feae5c4
      NeilBrown 提交于
      The race is that the shrink_dcache_memory shrinker could get called while a
      filesystem is being unmounted, and could try to prune a dentry belonging to
      that filesystem.
      
      If it does, then it will call in to iput on the inode while the dentry is
      no longer able to be found by the umounting process.  If iput takes a
      while, generic_shutdown_super could get all the way though
      shrink_dcache_parent and shrink_dcache_anon and invalidate_inodes without
      ever waiting on this particular inode.
      
      Eventually the superblock gets freed anyway and if the iput tried to touch
      it (which some filesystems certainly do), it will lose.  The promised
      "Self-destruct in 5 seconds" doesn't lead to a nice day.
      
      The race is closed by holding s_umount while calling prune_one_dentry on
      someone else's dentry.  As a down_read_trylock is used,
      shrink_dcache_memory will no longer try to prune the dentry of a filesystem
      that is being unmounted, and unmount will not be able to start until any
      such active prune_one_dentry completes.
      
      This requires that prune_dcache *knows* which filesystem (if any) it is
      doing the prune on behalf of so that it can be careful of other
      filesystems.  shrink_dcache_memory isn't called it on behalf of any
      filesystem, and so is careful of everything.
      
      shrink_dcache_anon is now passed a super_block rather than the s_anon list
      out of the superblock, so it can get the s_anon list itself, and can pass
      the superblock down to prune_dcache.
      
      If prune_dcache finds a dentry that it cannot free, it leaves it where it
      is (at the tail of the list) and exits, on the assumption that some other
      thread will be removing that dentry soon.  To try to make sure that some
      work gets done, a limited number of dnetries which are untouchable are
      skipped over while choosing the dentry to work on.
      
      I believe this race was first found by Kirill Korotaev.
      
      Cc: Jan Blunck <jblunck@suse.de>
      Acked-by: NKirill Korotaev <dev@openvz.org>
      Cc: Olaf Hering <olh@suse.de>
      Acked-by: NBalbir Singh <balbir@in.ibm.com>
      Signed-off-by: NNeil Brown <neilb@suse.de>
      Signed-off-by: NBalbir Singh <balbir@in.ibm.com>
      Acked-by: NDavid Howells <dhowells@redhat.com>
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      0feae5c4
    • M
      [PATCH] remove steal_locks() · c89681ed
      Miklos Szeredi 提交于
      This patch removes the steal_locks() function.
      
      steal_locks() doesn't work correctly with any filesystem that does it's own
      lock management, including NFS, CIFS, etc.
      
      In addition it has weird semantics on local filesystems in case tasks
      sharing file-descriptor tables are doing POSIX locking operations in
      parallel to execve().
      
      The steal_locks() function has an effect on applications doing:
      
      clone(CLONE_FILES)
        /* in child */
        lock
        execve
        lock
      
      POSIX locks acquired before execve (by "child", "parent" or any further
      task sharing files_struct) will after the execve be owned exclusively by
      "child".
      
      According to Chris Wright some LSB/LTP kind of suite triggers without the
      stealing behavior, but there's no known real-world application that would
      also fail.
      
      Apps using NPTL are not affected, since all other threads are killed before
      execve.
      
      Apps using LinuxThreads are only affected if they
      
        - have multiple threads during exec (LinuxThreads doesn't kill other
          threads, the app may do it with pthread_kill_other_threads_np())
        - rely on POSIX locks being inherited across exec
      
      Both conditions are documented, but not their interaction.
      
      Apps using clone() natively are affected if they
      
        - use clone(CLONE_FILES)
        - rely on POSIX locks being inherited across exec
      
      The above scenarios are unlikely, but possible.
      
      If the patch is vetoed, there's a plan B, that involves mostly keeping the
      weird stealing semantics, but changing the way lock ownership is handled so
      that network and local filesystems work consistently.
      
      That would add more complexity though, so this solution seems to be
      preferred by most people.
      Signed-off-by: NMiklos Szeredi <miklos@szeredi.hu>
      Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
      Cc: Matthew Wilcox <willy@debian.org>
      Cc: Chris Wright <chrisw@sous-sol.org>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Steven French <sfrench@us.ibm.com>
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      c89681ed
    • O
      [PATCH] Fix a race condition between ->i_mapping and iput() · 09d967c6
      OGAWA Hirofumi 提交于
      This race became a cause of oops, and can reproduce by the following.
      
          while true; do
      	dd if=/dev/zero of=/dev/.static/dev/hdg1 bs=512 count=1000 & sync
          done
      
      This race condition was between __sync_single_inode() and iput().
      
                cpu0 (fs's inode)                 cpu1 (bdev's inode)
                -----------------                 -------------------
                                             close("/dev/hda2")
                                             [...]
      __sync_single_inode()
         /* copy the bdev's ->i_mapping */
         mapping = inode->i_mapping;
      
                                             generic_forget_inode()
                                                bdev_clear_inode()
      					     /* restre the fs's ->i_mapping */
      				             inode->i_mapping = &inode->i_data;
      				          /* bdev's inode was freed */
                                                destroy_inode(inode);
      
         if (wait) {
            /* dereference a freed bdev's mapping->host */
            filemap_fdatawait(mapping);  /* Oops */
      
      Since __sync_single_inode() is only taking a ref-count of fs's inode, the
      another process can be close() and freeing the bdev's inode while writing
      fs's inode.  So, __sync_signle_inode() accesses the freed ->i_mapping,
      oops.
      
      This patch takes a ref-count on the bdev's inode for the fs's inode before
      setting a ->i_mapping, and the clear_inode() of the fs's inode does iput() on
      the bdev's inode.  So if the fs's inode is still living, bdev's inode
      shouldn't be freed.
      Signed-off-by: NOGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      09d967c6
    • A
      [PATCH] NTFS: Critical bug fix (affects MIPS and possibly others) · f893afbe
      Anton Altaparmakov 提交于
      Many thanks to Pauline Ng for the detailed bug report and analysis!
      Signed-off-by: NAnton Altaparmakov <aia21@cantab.net>
      Cc: <stable@kernel.org>
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      f893afbe
  2. 22 6月, 2006 1 次提交
  3. 20 6月, 2006 11 次提交
  4. 19 6月, 2006 7 次提交
  5. 18 6月, 2006 3 次提交
  6. 14 6月, 2006 1 次提交
  7. 13 6月, 2006 1 次提交
  8. 09 6月, 2006 12 次提交