1. 10 Aug, 2009 8 commits
    • NFS: read-modify-write page updating · 38c73044
      Peter Staubach committed
      Hi.
      
      I have a proposal for possibly resolving this issue.
      
      I believe that this situation occurs due to the way that the
      Linux NFS client handles writes which modify partial pages.
      
      The Linux NFS client handles partial page modifications by
      allocating a page from the page cache, copying the data from
      the user level into the page, and then keeping track of the
      offset and length of the modified portions of the page.  The
      page is not marked as up to date because there are portions
      of the page which do not contain valid file contents.
      
      When a read call comes in for a portion of the page, the
      contents of the page must be read in from the server.
      However, since the page may already contain some modified
      data, that modified data must be written to the server
      before the file contents can be read back in from the server.
      And, since the writing and reading cannot be done atomically,
      the data must be written and committed to stable storage on
      the server for safety purposes.  This means either a
      FILE_SYNC WRITE or an UNSTABLE WRITE followed by a COMMIT.
      This has been discussed at length previously.
      
      This algorithm could be described as modify-write-read.  It
      is most efficient when the application only updates pages
      and does not read them.
      
      My proposed solution is to add a heuristic to decide whether
      to do this modify-write-read algorithm or switch to a read-
      modify-write algorithm when initially allocating the page
      in the write system call path.  The heuristic uses the modes
      that the file was opened with, the offset in the page to
      read from, and the size of the region to read.
      
      If the file was opened for reading in addition to writing
      and the page would not be filled completely with data from
      the user level, then read in the old contents of the page
      and mark it as Uptodate before copying in the new data.  If
      the page would be completely filled with data from the user
      level, then there would be no reason to read in the old
      contents because they would just be copied over.
      
      This would optimize for applications which randomly access
      and update portions of files.  The linkage editor for the
      C compiler is an example of such a thing.
      
      I tested the attached patch by using rpmbuild to build the
      current Fedora rawhide kernel.  The kernel without the
      patch generated about 269,500 WRITE requests.  The modified
      kernel containing the patch generated about 261,000 WRITE
      requests.  Thus, about 8,500 fewer WRITE requests were
      generated.  I suspect that many of these additional
      WRITE requests were probably FILE_SYNC requests to WRITE
      a single page, but I didn't test this theory.
      
      The difference between this patch and the previous one was
      to remove the unneeded PageDirty() test.  I then retested to
      ensure that the resulting system continued to behave as
      desired.
      
      	Thanx...
      
      		ps
      Signed-off-by: Peter Staubach <staubach@redhat.com>
      Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
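The heuristic described in commit 38c73044 can be sketched in user-space C roughly as follows. This is only an illustration under stated assumptions, not the patch itself: the function name `sketch_want_read_modify_write`, its parameters, and the 4096-byte page size are hypothetical and do not come from the commit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumed page size for this sketch; real systems may differ. */
#define SKETCH_PAGE_SIZE ((size_t)4096)

/*
 * Hypothetical helper: returns true when the old page contents should be
 * read from the server before the user data is copied in (read-modify-write),
 * and false when the existing modify-write-read behaviour is kept.
 */
bool sketch_want_read_modify_write(bool opened_for_read,
                                   size_t offset, size_t len)
{
    /* The write must fit inside a single page. */
    assert(offset <= SKETCH_PAGE_SIZE && len <= SKETCH_PAGE_SIZE - offset);

    /* Files not opened for reading keep the modify-write-read path. */
    if (!opened_for_read)
        return false;

    /* A write that fills the whole page would overwrite anything read in. */
    if (offset == 0 && len == SKETCH_PAGE_SIZE)
        return false;

    /* Partial-page write on a file opened for reading: read first. */
    return true;
}
```

With these assumptions, a 200-byte write at offset 100 on a file opened for reading and writing selects read-modify-write, while a full 4096-byte write at offset 0 does not.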
    • NFS: Add a ->migratepage() aop for NFS · 074cc1de
      Trond Myklebust committed
      Make NFS a bit more friendly to NUMA and memory hot removal...
      Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
    • NFSv4: Clean up the nfs.callback_tcpport option · c140aa91
      Trond Myklebust committed
      Tighten up the validity checking in param_set_port: check for NULL pointers.
      Ensure that the option shows up on 'modinfo' output.
      Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
    • SUNRPC: convert some sysctls into module parameters · cbf11071
      Trond Myklebust committed
      Parameters like the minimum reserved port, and the number of slot entries
      should really be module parameters rather than sysctls.
      Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
    • NFSv4: Don't do idmapper upcalls for asynchronous RPC calls · 80e52ace
      Trond Myklebust committed
      We don't want to cause rpciod to hang...
      Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
    • NFSv4: Add 'server capability' flags for NFSv4 recommended attributes · 62ab460c
      Trond Myklebust committed
      If the NFSv4 server doesn't support a POSIX attribute, the generic NFS code
      needs to know that, so that it doesn't keep trying to poll for it.
      
      However, by the same token, if the NFSv4 server does support that
      attribute, then we should ensure that the inode metadata is appropriately
      labelled as being untrusted. For instance, if we don't know the correct
      value of the file's uid, we should certainly not be caching ACLs or ACCESS
      results.
      Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
    • NFSv4: Don't loop forever on state recovery failure... · a78cb57a
      Trond Myklebust committed
      If the server is broken, then retrying forever won't fix it. We
      should just give up after a while, and return an error to the user.
      We set the number of retries to 10 for now...
      Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
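The bounded-retry idea in commit a78cb57a might look something like the following user-space sketch. Only the limit of 10 comes from the commit message; the names `sketch_recover`, `sketch_always_fail`, and `sketch_succeed_on_third`, the callback interface, and the choice of `-EIO` as the returned error are assumptions made for illustration.

```c
#include <errno.h>

/* Retry limit taken from the commit message. */
#define SKETCH_MAX_RETRIES 10

/*
 * Hypothetical recovery loop: calls attempt(ctx) until it returns 0 or the
 * retry budget is spent. Returning -EIO on exhaustion is an assumption.
 */
int sketch_recover(int (*attempt)(void *ctx), void *ctx)
{
    int tries;

    for (tries = 0; tries < SKETCH_MAX_RETRIES; tries++) {
        if (attempt(ctx) == 0)
            return 0;       /* recovery succeeded */
    }
    return -EIO;            /* give up and report an error to the caller */
}

/* Demo callback: counts calls through ctx and always fails. */
int sketch_always_fail(void *ctx)
{
    int *calls = ctx;

    ++*calls;
    return -1;
}

/* Demo callback: counts calls through ctx and succeeds on the third call. */
int sketch_succeed_on_third(void *ctx)
{
    int *calls = ctx;

    return (++*calls >= 3) ? 0 : -1;
}
```

A fixed budget keeps a permanently broken server from stalling the caller indefinitely, at the cost of occasionally giving up on a server that would have recovered later.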
    • nfs: Keep index within mnt_errtbl[] · dd8ac1da
      Roel Kluin committed
      Ensure that index i remains within array mnt_errtbl[] and mnt3_errtbl[].
      Signed-off-by: Roel Kluin <roel.kluin@gmail.com>
      Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
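One common way to keep a lookup index inside an error table, as commit dd8ac1da describes, is to bound the loop by the array size and fall back to a default when nothing matches. The sketch below is an assumption-laden illustration, not the kernel code: the table contents, `sketch_errtbl`, `SKETCH_ARRAY_SIZE`, and `sketch_status_to_errno` are all hypothetical.

```c
#include <errno.h>
#include <stddef.h>

/* Number of elements in a true array (not a pointer). */
#define SKETCH_ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Illustrative status-to-errno table; the entries are made up. */
struct sketch_err {
    int status;
    int errno_val;
};

static const struct sketch_err sketch_errtbl[] = {
    { 0,  0       },
    { 1,  -EPERM  },
    { 2,  -ENOENT },
    { 13, -EACCES },
};

/*
 * Looks up status in the table. The loop condition guarantees that the
 * index i is never used beyond the last element; unknown statuses map to
 * the caller-supplied fallback instead of reading past the array.
 */
int sketch_status_to_errno(int status, int fallback)
{
    size_t i;

    for (i = 0; i < SKETCH_ARRAY_SIZE(sketch_errtbl); i++) {
        if (sketch_errtbl[i].status == status)
            return sketch_errtbl[i].errno_val;
    }
    return fallback;
}
```

Here the fallback is returned only after the loop ends, so the table is never indexed with i equal to its length.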
  2. 08 Aug, 2009 32 commits