1. 12 Mar, 2011 9 commits
  2. 11 Mar, 2011 1 commit
    • nfs: close NFSv4 COMMIT vs. CLOSE race · d2224e7a
      Committed by Jeff Layton
      I've been adding in more artificial delays in the NFSv4 commit and close
      codepaths to uncover races. The kernel I'm testing has the patch to
      close the race in __rpc_wait_for_completion_task that's in Trond's
      cthon2011 branch. The reproducer I've been using does this in a loop:
      
      	mkdir("DIR");
      	fd = open("DIR/FILE", O_WRONLY|O_CREAT|O_EXCL, 0644);
      	write(fd, "abcdefg", 7);
      	close(fd);
      	unlink("DIR/FILE");
      	rmdir("DIR");
      
      The above reproducer shouldn't result in any silly-renaming. However,
      when I add a "msleep(100)" just after the nfs_commit_clear_lock call in
      nfs_commit_release, I can almost always force one to occur. If I can
      force it to occur with that, then it can happen without that delay
      given the right timing.
      
      nfs_commit_inode waits for the NFS_INO_COMMIT bit to clear when called
      with FLUSH_SYNC set. nfs_commit_rpcsetup on the other hand does not wait
      for the task to complete before putting its reference to it, so the last
      reference gets put in rpc_release task and gets queued to a workqueue.
      
      In this situation, the last open context reference may be put by the
      COMMIT release instead of the close() syscall. The close() syscall
      returns too quickly and the unlink runs while the d_count is still
      high since the COMMIT release hasn't put its dentry reference yet.
      
      Fix this by having nfs_commit_rpcsetup wait for the RPC call to complete
      before putting the task reference when FLUSH_SYNC is set. With this, the
      last reference is put by the process that's initiating the FLUSH_SYNC
      commit and the race is closed.
      Signed-off-by: Jeff Layton <jlayton@redhat.com>
      Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
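The reproducer loop described in this commit can be sketched as a small C function. This is only an illustration under stated assumptions: the function name `count_leftover_dirs`, the `base` parameter, and the per-iteration directory names are hypothetical and not part of the original message, which reuses the single name "DIR". The sketch counts iterations in which rmdir() fails because the directory is not empty, which is what a silly-renamed file left behind on an NFS mount would be expected to cause.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Run the mkdir/open/write/close/unlink/rmdir cycle `iterations` times
 * under `base` (hypothetical parameter; point it at an NFS mount to
 * exercise the race). Returns the number of iterations in which rmdir()
 * failed with ENOTEMPTY or EEXIST, or -1 on any other error.
 */
int count_leftover_dirs(const char *base, int iterations)
{
    char dir[4096];
    char file[4200];
    int leftovers = 0;

    assert(base != NULL && iterations >= 0);

    for (int i = 0; i < iterations; i++) {
        /* Unique directory per iteration so one failure does not block the next. */
        snprintf(dir, sizeof(dir), "%s/DIR.%ld.%d", base, (long)getpid(), i);
        snprintf(file, sizeof(file), "%s/FILE", dir);

        if (mkdir(dir, 0755) != 0)
            return -1;

        int fd = open(file, O_WRONLY | O_CREAT | O_EXCL, 0644);
        if (fd < 0)
            return -1;
        if (write(fd, "abcdefg", 7) != 7) {
            close(fd);
            return -1;
        }
        if (close(fd) != 0)
            return -1;
        if (unlink(file) != 0)
            return -1;

        if (rmdir(dir) != 0) {
            /* A leftover entry in the directory makes rmdir() fail this way. */
            if (errno == ENOTEMPTY || errno == EEXIST)
                leftovers++;
            else
                return -1;
        }
    }
    return leftovers;
}
```

On a local filesystem the function should return 0. A non-zero result on an NFS mount would suggest that the unlink ran while a reference to the dentry was still held.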
  3. 20 Jan, 2011 1 commit
  4. 08 Dec, 2010 1 commit
    • nfs: remove extraneous and problematic calls to nfs_clear_request · 2df485a7
      Committed by Trond Myklebust
      When an nfs_page is freed, nfs_free_request is called which also calls
      nfs_clear_request to clean out the lock and open contexts and free the
      pagecache page.
      
      However, a couple of places in the nfs code call nfs_clear_request
      themselves. What happens here if the refcount on the request is still high?
      We'll be releasing contexts and freeing pointers while the request is
      possibly still in use.
      
      Remove those bare calls to nfs_clear_request. That should only be done when
      the request is being freed.
      
      Note that when doing this, we need to watch out for tests of req->wb_page.
      Previously, nfs_set_page_tag_locked() and nfs_clear_page_tag_locked()
      would check the value of req->wb_page to figure out if the page is mapped
      into the nfsi->nfs_page_tree. We now indicate the page is mapped using
      the new bit PG_MAPPED in req->wb_flags.
      Reported-by: Jeff Layton <jlayton@redhat.com>
      Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
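The idea in this commit, releasing a request's resources only when the last reference is dropped and tracking "mapped" state in a flag bit instead of inferring it from a pointer, can be sketched in user-space C. Everything here is a simplified, single-threaded illustration: `struct request`, `REQ_MAPPED`, and the helper names are hypothetical stand-ins, not the kernel's nfs_page API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define REQ_MAPPED (1u << 0)   /* hypothetical stand-in for PG_MAPPED */

struct request {
    int refcount;              /* number of holders */
    unsigned flags;            /* state bits, independent of the pointers below */
    void *page;                /* stand-in for req->wb_page */
    void *context;             /* stand-in for the open/lock contexts */
};

/* Drop the request's resources. Only called from request_put(). */
static void request_clear(struct request *req)
{
    req->page = NULL;
    req->context = NULL;
}

void request_get(struct request *req)
{
    assert(req->refcount > 0);
    req->refcount++;
}

/* Returns true if this call dropped the last reference and cleared the request. */
bool request_put(struct request *req)
{
    assert(req->refcount > 0);
    if (--req->refcount > 0)
        return false;          /* still in use: leave page and context alone */
    request_clear(req);
    return true;
}

/* Mapped state comes from the flag bit, so it never depends on req->page. */
bool request_is_mapped(const struct request *req)
{
    return (req->flags & REQ_MAPPED) != 0;
}
```

Because clearing is reachable only through the final put, no other holder can observe a request whose page or context has been released underneath it.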
  5. 27 Oct, 2010 1 commit
    • writeback: remove nonblocking/encountered_congestion references · 1b430bee
      Committed by Wu Fengguang
      This removes more dead code that was somehow missed by commit 0d99519e
      (writeback: remove unused nonblocking and congestion checks).  There is
      no behavior change except for the removal of two entries from one of the
      ext4 tracing interfaces.
      
      The nonblocking checks in ->writepages are no longer used because the
      flusher now prefers to block on get_request_wait() rather than skip inodes
      on IO congestion.  The latter would lead to more seeky IO.
      
      The nonblocking checks in ->writepage are no longer used because they are
      redundant with the WB_SYNC_NONE check.
      
      We no longer set ->nonblocking in VM page out and page migration, because
      a) it's effectively redundant with WB_SYNC_NONE in current code
      b) its old semantic of "Don't get stuck on request queues" is misbehavior:
         that would skip some dirty inodes on congestion and page out others, which
         is unfair in terms of LRU age.
      
      Inspired by Christoph Hellwig. Thanks!
      Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
      Cc: Theodore Ts'o <tytso@mit.edu>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Sage Weil <sage@newdream.net>
      Cc: Steve French <sfrench@samba.org>
      Cc: Chris Mason <chris.mason@oracle.com>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Christoph Hellwig <hch@infradead.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  6. 30 Sep, 2010 1 commit
  7. 24 Sep, 2010 1 commit
  8. 02 Aug, 2010 1 commit
  9. 31 Jul, 2010 3 commits
  10. 23 Jun, 2010 1 commit
  11. 26 May, 2010 2 commits
  12. 28 Apr, 2010 1 commit
  13. 23 Apr, 2010 1 commit
    • NFS: Fix an unstable write data integrity race · 71d0a611
      Committed by Trond Myklebust
      Commit 2c61be0a (NFS: Ensure that the WRITE
      and COMMIT RPC calls are always uninterruptible) exposed a race on file
      close. In order to ensure correct close-to-open behaviour, we want to wait
      for all outstanding background commit operations to complete.
      
      This patch adds an inode flag that indicates if a commit operation is under
      way, and provides a mechanism to allow ->write_inode() to wait for its
      completion if this is a data integrity flush.
      Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
  14. 10 Apr, 2010 3 commits
  15. 06 Mar, 2010 8 commits
  16. 10 Feb, 2010 1 commit
  17. 27 Jan, 2010 2 commits
  18. 10 Dec, 2009 2 commits
    • NFS: Fix nfs_migrate_page() · 190f38e5
      Committed by Trond Myklebust
      The call to migrate_page() will cause the page->private field to be
      cleared.
      Also fix up the locking around the page->private transfer, so that we ensure
      that calls to nfs_page_find_request() don't end up racing.
      
      Finally, fix up a double free bug: nfs_unlock_request() already calls
      nfs_release_request() for us...
      Reported-by: Wu Fengguang <fengguang.wu@intel.com>
      Tested-by: Andi Kleen <andi@firstfloor.org>
      Cc: stable@kernel.org
      Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
    • vfs: Implement proper O_SYNC semantics · 6b2f3d1f
      Committed by Christoph Hellwig
      While Linux has provided an O_SYNC flag basically since day 1, it took until
      Linux 2.4.0-test12pre2 to actually get it implemented for filesystems.
      Since that day we have had generic_osync_around with only minor changes and
      the great "For now, when the user asks for O_SYNC, we'll actually give
      O_DSYNC" comment.  This patch intends to actually give us real O_SYNC
      semantics in addition to the O_DSYNC semantics.  After Jan's O_SYNC
      patches, which are required before this patch, it's actually surprisingly
      simple: we just need to figure out when to set the datasync flag to
      vfs_fsync_range and when not.
      
      This patch renames the existing O_SYNC flag to O_DSYNC while keeping its
      numerical value to keep binary compatibility, and adds a new real O_SYNC
      flag.  To guarantee backwards compatibility it is defined as expanding to
      both the O_DSYNC and the new additional binary flag (__O_SYNC) to make
      sure we are backwards-compatible when compiled against the new headers.
      
      This also means that all places that don't care about the differences can
      just check O_DSYNC and get the right behaviour for O_SYNC, too - only
      places that actually care need to check __O_SYNC in addition.  Drivers and
      network filesystems have been updated in a fail safe way to always do the
      full sync magic if O_DSYNC is set.  The few places setting O_SYNC for
      lower layers are kept that way for now to stay failsafe.
      
      We enforce that O_DSYNC is set when __O_SYNC is set early in the open path
      to make sure we always get these sane options.
      
      Note that parisc really screwed up their headers as they already define an
      O_DSYNC that has always been a no-op.  We try to repair it by using it for
      the new O_DSYNC and redefining O_SYNC to send both the traditional
      O_SYNC numerical value _and_ the O_DSYNC one.
      
      Cc: Richard Henderson <rth@twiddle.net>
      Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
      Cc: Grant Grundler <grundler@parisc-linux.org>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Andreas Dilger <adilger@sun.com>
      Acked-by: Trond Myklebust <Trond.Myklebust@netapp.com>
      Acked-by: Kyle McMartin <kyle@mcmartin.ca>
      Acked-by: Ulrich Drepper <drepper@redhat.com>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Jan Kara <jack@suse.cz>
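The flag layout described in this commit can be illustrated with a short C sketch. The `EX_`-prefixed constants and the helper functions are hypothetical names used only for this example, and the numeric values are illustrative (the message itself notes that architectures such as parisc differ). The sketch shows O_SYNC expanding to both bits, the open-path rule that __O_SYNC implies O_DSYNC, and how a caller might derive a datasync argument.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative values only; real values are architecture-specific. */
#define EX_O_DSYNC   00010000u                   /* the old O_SYNC bit, renamed */
#define EX___O_SYNC  04000000u                   /* new, additional bit */
#define EX_O_SYNC    (EX___O_SYNC | EX_O_DSYNC)  /* expands to both bits */

/* Enforce early in the open path that __O_SYNC never appears without O_DSYNC. */
unsigned normalize_sync_flags(unsigned flags)
{
    if (flags & EX___O_SYNC)
        flags |= EX_O_DSYNC;
    assert(!(flags & EX___O_SYNC) || (flags & EX_O_DSYNC));
    return flags;
}

/* Code that does not care about the distinction checks O_DSYNC only. */
bool wants_data_sync(unsigned flags)
{
    return (flags & EX_O_DSYNC) != 0;
}

/* Code that does care additionally checks __O_SYNC. */
bool wants_full_sync(unsigned flags)
{
    return wants_data_sync(flags) && (flags & EX___O_SYNC) != 0;
}

/* Hypothetical helper: 1 means data-only sync, 0 means data plus metadata. */
int datasync_arg(unsigned flags)
{
    return (flags & EX___O_SYNC) ? 0 : 1;
}
```

Since EX_O_SYNC includes the EX_O_DSYNC bit, a binary built against older headers that sets only the old bit keeps getting the data-sync behaviour it always had, while new code that sets O_SYNC is also caught by every O_DSYNC check.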