1. 08 May, 2007 1 commit
  2. 05 May, 2007 1 commit
  3. 02 May, 2007 2 commits
  4. 01 May, 2007 15 commits
    •
      NFS: Fix directory caching problem - with test case and patch. · 83672d39
      Committed by Neil Brown
      Try running this script in an NFS mounted directory (Client relatively
      recent - 2.6.18 has the problem as does 2.6.20).
      
      ------------------------------------------------------
      #!/bin/bash
      #
      # This script will produce the following error message from tar:
      #
      #   tar: newdir/innerdir/innerfile: file changed as we read it
      
      # create dirs
      rm -rf nfstest
      mkdir -p nfstest/dir/innerdir
      
      # create files (should not be empty)
      echo "Hello World!" >nfstest/dir/file
      echo "Hello World!" >nfstest/dir/innerdir/innerfile
      
      # problem only happens if we sleep before chmod
      sleep 1
      
      # change file modes
      chmod -R a+r nfstest
      
      # rename dir
      mv nfstest/dir nfstest/newdir
      
      # tar it
      tar -cf nfstest/nfstest.tar -C nfstest newdir
      
      # restore old dir name
      mv nfstest/newdir nfstest/dir
      --------------------------------------------------------
      
      What happens:
      
      The 'chmod -R' does a readdir_plus in each directory and the results
      get cached in the page cache.  It then updates the ctime on each file
      by one second.  When this happens, the post-op attributes are used to
      update the ctime stored on the client to match the value in the kernel.
      
      The 'mv' calls shrink_dcache_parent on the directory tree which
      flushes all the dentries (so a new lookup will be required) but
      doesn't flush the inodes or pagecache.
      
      The 'tar' does a readdir on each directory, but (in the case of
      'innerdir' at least) satisfies it from the pagecache and uses the
      READDIRPLUS data to update all the inodes.  In the case of
      'innerdir/innerfile', the ctime is out of date.
      
      'tar' then calls 'lstat' on innerdir/innerfile getting an old ctime.
      It then opens the file (triggering a GETATTR), reads the content, and
      then calls fstat to see if anything has changed.  It finds that ctime
      has changed and so complains.
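
The sequence tar follows here can be sketched in user space as below. This is a simplified illustration of the check described above, not tar's actual code; the function name `changed_while_reading` is hypothetical, and only the ctime comparison is modelled.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: returns 1 if the ctime seen before opening differs
 * from the ctime seen after reading, 0 if they match, and -1 on error.
 */
int changed_while_reading(const char *path)
{
    struct stat before, after;
    char buf[4096];
    ssize_t n;
    int fd;

    assert(path != NULL);

    /* First look: on NFS this may be answered from cached attributes. */
    if (lstat(path, &before) != 0)
        return -1;

    /* Per the description above, the open triggers a GETATTR on NFS. */
    fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    /* Read the whole file; a real archiver would store the data. */
    while ((n = read(fd, buf, sizeof buf)) > 0)
        ;

    if (n < 0 || fstat(fd, &after) != 0) {
        close(fd);
        return -1;
    }
    close(fd);

    if (before.st_ctime != after.st_ctime) {
        fprintf(stderr, "%s: file changed as we read it\n", path);
        return 1;
    }
    return 0;
}
```

If the first lookup is served from stale READDIRPLUS data and the second from fresh attributes, the two ctime values differ even though nobody touched the file in between.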
      
      The problem seems to be that the cached readdirplus info is kept around
      for too long.
      
      My patch below discards pagecache data for directories when
      dentry_iput is called on them.  This effectively removes the symptom
      which convinces me that I correctly understand the problem.  However
      I'm not convinced that is a proper solution, as there could easily be
      other races that trigger the same problem without being affected by
      this 'fix'.
      
      One possibility would be to require that readdirplus pagecache data be
      only used *once* to instantiate an inode.  Somehow it should then be
      invalidated so that if the dentry subsequently disappears, it will
      cause a new request to the server to fill in the stat data.
      
      Another possibility is to compare the cache_change_attribute on the
      inode with something similar for the readdirplus info and reject the
      info from readdirplus if it is too old.
      
      I haven't tried to implement these and would value other opinions
      before I do.
      
      Thanks,
      NeilBrown
      Signed-off-by: Neil Brown <neilb@suse.de>
      Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
    •
      NFS: Set meaningful value for fattr->time_start in readdirplus results. · 1f4eab7e
      Committed by Neil Brown
      Don't use uninitialised value for fattr->time_start in readdirplus results.

      The 'fattr' structure filled in by nfs3_decode_dirent does not get a
      value for ->time_start set.
      Thus if an entry is for an inode that we already have in cache,
      when nfs_readdir_lookup calls nfs_fhget, it will call nfs_refresh_inode
      and may update the inode with out-of-date information.
      
      Directories are read a page at a time, so each page could have a
      different timestamp that "should" be used to set the time_start for
      the fattr for info in that page.  However storing the timestamp per
      page is awkward.  (We could stick it in the first 4 bytes and only read 4092
      bytes, but that is a bigger code change than I am interested in).
      
      This patch ignores the readdir_plus attributes if a readdir finds the
      information already in cache, and otherwise sets ->time_start to the time
      the readdir request was sent to the server.
      
      It might be nice to store - in the directory inode - the time stamp for
      the earliest readdir request that is still in the page cache, so that we
      don't ignore attribute data that we don't have to.  This patch doesn't do
      that.
      Signed-off-by: Neil Brown <neilb@suse.de>
      Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
    •
      NFS: Added support to turn off the NFSv3 READDIRPLUS RPC. · 74dd34e6
      Committed by Steve Dickson
      READDIRPLUS can be a performance hindrance when the client is working with
      large directories. In addition, some servers still have bugs in their
      implementations (e.g. Tru64 returns wrong values for the fsid).
      
      Add a mount flag to enable users to turn it off at mount time following the
      implementation in Apple's NFS client.
      Signed-off-by: Steve Dickson <steved@redhat.com>
      Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
    •
      NFS: switch NFSROOT to use new rpcbind client · df8b172a
      Committed by Chuck Lever
      It is arguable whether NFSROOT will support IPv6, and thus whether
      rpcb_getport_external needs to support rpcbind versions greater than 2.
      Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
      Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
    •
      SUNRPC: RPC buffer size estimates are too large · 2bea90d4
      Committed by Chuck Lever
      The RPC buffer size estimation logic in net/sunrpc/clnt.c always
      significantly overestimates the requirements for the buffer size.
      A little instrumentation demonstrated that in fact rpc_malloc was never
      allocating the buffer from the mempool, but almost always called kmalloc.
      
      To compute the size of the RPC buffer more precisely, split p_bufsiz into
      two fields; one for the argument size, and one for the result size.
      
      Then, compute the sum of the exact call and reply header sizes, and split
      the RPC buffer precisely between the two.  That should keep almost all RPC
      buffers within the 2KiB buffer mempool limit.
      
      And, we can finally be rid of RPC_SLACK_SPACE!
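
A minimal sketch of the sizing arithmetic described above, assuming sizes are counted in 4-byte XDR words. The struct and field names (`proc_sizes`, `arg_words`, `res_words`) are hypothetical; only the idea of separate argument and result sizes and the 2KiB mempool limit come from the message.

```c
#include <assert.h>
#include <stddef.h>

#define MEMPOOL_BUF_LIMIT 2048u   /* 2KiB limit named in the commit message */
#define XDR_WORD_BYTES    4u      /* assumption: sizes counted in 32-bit words */

/* Hypothetical per-procedure sizes, replacing a single combined estimate. */
struct proc_sizes {
    size_t arg_words;   /* maximum encoded argument size */
    size_t res_words;   /* maximum encoded result size */
};

struct buf_layout {
    size_t call_bytes;  /* call header plus arguments */
    size_t reply_bytes; /* reply header plus results */
};

/* Split one buffer exactly between the call and the reply. */
struct buf_layout size_rpc_buffer(size_t call_hdr_words, size_t reply_hdr_words,
                                  const struct proc_sizes *p)
{
    struct buf_layout l;

    assert(p != NULL);
    l.call_bytes  = (call_hdr_words + p->arg_words) * XDR_WORD_BYTES;
    l.reply_bytes = (reply_hdr_words + p->res_words) * XDR_WORD_BYTES;
    return l;
}

/* Nonzero if the whole buffer can come from the mempool. */
int fits_in_mempool(const struct buf_layout *l)
{
    assert(l != NULL);
    return l->call_bytes + l->reply_bytes <= MEMPOOL_BUF_LIMIT;
}
```

Because the two halves are computed from exact header sizes rather than padded with a fixed slack, small procedures stay well under the limit and need no fallback allocation.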
      Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
      Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
    •
    •
      NFS: Clean up nfs_sync_mapping_wait() · 724c439c
      Committed by Trond Myklebust
      It has no business touching wbc->pages_skipped.
      Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
    •
    •
      NFS: Fix a race when doing NFS write coalescing · c63c7b05
      Committed by Trond Myklebust
      Currently we do write coalescing in a very inefficient manner: one pass in
      generic_writepages() in order to lock the pages for writing, then one pass
      in nfs_flush_mapping() and/or nfs_sync_mapping_wait() in order to gather
      the locked pages for coalescing into RPC requests of size "wsize".
      
      In fact, it turns out there is actually a deadlock possible here since we
      only start I/O on the second pass. If the user signals the process while
      we're in nfs_sync_mapping_wait(), for instance, then we may exit before
      starting I/O on all the requests that have been queued up.
      Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
    •
      NFS: Cleanup for nfs_readpages() · 8b09bee3
      Committed by Trond Myklebust
      Do the coalescing of read requests into block sized requests at start of
      I/O as we scan through the pages instead of going through a second pass.
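
Single-pass coalescing of this kind can be sketched as below. It is an illustration under stated assumptions, not the NFS client's code: pages are represented by their indices in ascending order, `max_pages` stands in for the request size limit, and a real implementation would start I/O on each request as soon as it is closed.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical request descriptor: a run of contiguous pages. */
struct io_request {
    size_t first_page;
    size_t npages;
};

/*
 * Scan the sorted page indices once, extending the current request while the
 * next page is contiguous and the request is not yet full, and opening a new
 * request otherwise. Returns the number of requests written to 'out'.
 */
size_t coalesce_pages(const size_t *pages, size_t n, size_t max_pages,
                      struct io_request *out, size_t out_cap)
{
    size_t nreq = 0;

    assert(max_pages > 0);
    for (size_t i = 0; i < n; i++) {
        if (nreq > 0) {
            struct io_request *cur = &out[nreq - 1];

            if (cur->npages < max_pages &&
                pages[i] == cur->first_page + cur->npages) {
                cur->npages++;
                continue;
            }
        }
        if (nreq == out_cap)
            break;                      /* caller's array is full */
        out[nreq].first_page = pages[i];
        out[nreq].npages = 1;
        nreq++;
    }
    return nreq;
}
```

With one pass there is no window in which pages have been queued but no request has been built for them.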
      Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
    •
    •
      NFS: Cleanup the coalescing code · d8a5ad75
      Committed by Trond Myklebust
      Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
    •
      NFS: Don't wait for congestion in nfs_update_request() · 91e59c36
      Committed by Trond Myklebust
      It is redundant, and will interfere with the call to
      balance_dirty_pages_ratelimited_nr in generic_file_write().
      Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
    •
      NFS: statfs error-handling fix · 1a0ba9ae
      Committed by Amnon Aaronsohn
      The nfs statfs function returns a success code on error, and fills the
      output buffer with invalid values.  The attached patch makes it return a
      correct error code instead.
      Signed-off-by: Amnon Aaronsohn <amnonaar@gmail.com>
      Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
       (Modified patch to reinstate the dprintk())
    •
      NFS: Fix nfs_set_page_dirty() · d585158b
      Committed by Trond Myklebust
      Be more careful about testing page->mapping.
      Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
  5. 21 April, 2007 4 commits
  6. 16 April, 2007 1 commit
  7. 15 April, 2007 3 commits
  8. 17 March, 2007 2 commits
  9. 15 February, 2007 2 commits
    •
      [PATCH] sysctl: remove insert_at_head from register_sysctl · 0b4d4147
      Committed by Eric W. Biederman
      The semantic effect of insert_at_head is that it would allow newly registered
      sysctl entries to override existing sysctl entries of the same name, which is
      a pain for caching and which the proc interface never implemented.

      I have done an audit and discovered that none of the current users of
      register_sysctl care as (except for directories) they do not register
      duplicate sysctl entries.

      So this patch simply removes the support for overriding existing entries in
      the sys_sysctl interface since no one uses it or cares and it makes future
      enhancements harder.
      Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
      Acked-by: Ralf Baechle <ralf@linux-mips.org>
      Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Russell King <rmk@arm.linux.org.uk>
      Cc: David Howells <dhowells@redhat.com>
      Cc: "Luck, Tony" <tony.luck@intel.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Andi Kleen <ak@muc.de>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Corey Minyard <minyard@acm.org>
      Cc: Neil Brown <neilb@suse.de>
      Cc: "John W. Linville" <linville@tuxdriver.com>
      Cc: James Bottomley <James.Bottomley@steeleye.com>
      Cc: Jan Kara <jack@ucw.cz>
      Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
      Cc: Mark Fasheh <mark.fasheh@oracle.com>
      Cc: David Chinner <dgc@sgi.com>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Patrick McHardy <kaber@trash.net>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    •
      [PATCH] remove many unneeded #includes of sched.h · cd354f1a
      Committed by Tim Schmielau
      After Al Viro (finally) succeeded in removing the sched.h #include in module.h
      recently, it makes sense again to remove other superfluous sched.h includes.
      There are quite a lot of files which include it but don't actually need
      anything defined in there.  Presumably these includes were once needed for
      macros that used to live in sched.h, but moved to other header files in the
      course of cleaning it up.
      
      To ease the pain, this time I did not fiddle with any header files and only
      removed #includes from .c-files, which tend to cause less trouble.
      
      Compile tested against 2.6.20-rc2 and 2.6.20-rc2-mm2 (with offsets) on alpha,
      arm, i386, ia64, mips, powerpc, and x86_64 with allnoconfig, defconfig,
      allmodconfig, and allyesconfig as well as a few randconfigs on x86_64 and all
      configs in arch/arm/configs on arm.  I also checked that no new warnings were
      introduced by the patch (actually, some warnings are removed that were emitted
      by unnecessarily included header files).
      Signed-off-by: Tim Schmielau <tim@physik3.uni-rostock.de>
      Acked-by: Russell King <rmk+kernel@arm.linux.org.uk>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  10. 13 February, 2007 9 commits