1. 12 Sep, 2017 3 commits
    • NFS: various changes relating to reporting IO errors. · bf4b4905
      NeilBrown committed
      1/ remove 'start' and 'end' args from nfs_file_fsync_commit().
         They aren't used.
      
      2/ Make nfs_context_set_write_error() a "static inline" in internal.h
         so we can...
      
      3/ Use nfs_context_set_write_error() instead of mapping_set_error()
         if nfs_pageio_add_request() fails before sending any request.
         NFS generally keeps errors in the open_context, not the mapping,
         so this is more consistent.
      
      4/ If filemap_write_and_wait_range() reports any error, still
         check ctx->error.  The value in ctx->error is likely to be
         more useful.  As part of this, NFS_CONTEXT_ERROR_WRITE is
         cleared slightly earlier, before nfs_file_fsync_commit() is called,
         rather than at the start of that function.
      Signed-off-by: NeilBrown <neilb@suse.com>
      Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
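
Item 4/ above describes an ordering rule: whatever the page-cache writeback returns, the error stored in the open context is checked and preferred. The following is a minimal userspace sketch of that rule only, not the kernel code. `OpenContext`, `error_write_flag`, and `fsync_result` are hypothetical names introduced for illustration, and errors are modelled as negative errno integers.

```python
from dataclasses import dataclass


@dataclass
class OpenContext:
    """Hypothetical stand-in for the per-open NFS context (ctx)."""
    error: int = 0                  # models ctx->error (negative errno, 0 if none)
    error_write_flag: bool = False  # models the NFS_CONTEXT_ERROR_WRITE bit


def fsync_result(writeback_ret: int, ctx: OpenContext) -> int:
    """Choose the error to report after writeback.

    The context is consulted even when writeback_ret is already non-zero,
    because the value in the context is likely to be more useful.
    """
    if ctx.error_write_flag:
        # The flag is cleared here, before any commit step would run.
        ctx.error_write_flag = False
        ctx_err, ctx.error = ctx.error, 0  # read the stored error and reset it
        if ctx_err:
            return ctx_err
    return writeback_ret
```

With this model, a writeback result of -5 (EIO) combined with a stored context error of -28 (ENOSPC) reports -28, and the stored error is consumed so that it is not reported twice.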
    • NFS: Add static NFS I/O tracepoints · 8224b273
      Chuck Lever committed
      Tools like tcpdump and rpcdebug can be very useful. But there are
      plenty of environments where they are difficult or impossible to
      use. For example, we've had customers report I/O failures during
      workloads so heavy that collecting network traffic or enabling
      RPC debugging are themselves onerous.
      
      The kernel's static tracepoints are lightweight (less likely to
      introduce timing changes) and efficient (the trace data is compact).
      They also work in scenarios where capturing network traffic is not
      possible due to lack of hardware support (some InfiniBand HCAs) or
      where data or network privacy is a concern.
      
      Introduce tracepoints that show when an NFS READ, WRITE, or COMMIT
      is initiated, and when it completes. Record the arguments and
      results of each operation, which are not shown by the existing
      sunrpc module's tracepoints.
      
      For instance, the recorded offset and count can be used to match an
      "initiate" event to a "done" event. If an NFS READ result returns
      fewer bytes than requested or zero, seeing the EOF flag can be
      probative. Seeing an NFS4ERR_BAD_STATEID result is also an indication
      of a particular class of problems. The timing information attached
      to each event record can often be useful as well.
      
      Usage example:
      
      [root@manet tmp]# trace-cmd record -e nfs:*initiate* -e nfs:*done
      /sys/kernel/debug/tracing/events/nfs/*initiate*/filter
      /sys/kernel/debug/tracing/events/nfs/*done/filter
      Hit Ctrl^C to stop recording
      ^CKernel buffer statistics:
        Note: "entries" are the entries left in the kernel ring buffer and are not
              recorded in the trace data. They should all be zero.
      
      CPU: 0
      entries: 0
      overrun: 0
      commit overrun: 0
      bytes: 3680
      oldest event ts:    78.367422
      now ts:   100.124419
      dropped events: 0
      read events: 74
      
      ... and so on.
      Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
      Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
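
The commit message above notes that the recorded offset and count can be used to match an "initiate" event to a "done" event. A rough sketch of that matching follows, assuming the trace has already been parsed into dictionaries. The field names (`op`, `phase`, `offset`, `count`, `ts`) and the `pair_events` helper are assumptions made for illustration; they are not the tracepoints' actual field layout or the output format of trace-cmd.

```python
from collections import defaultdict, deque
from typing import Dict, List, Tuple

Event = Dict[str, object]


def pair_events(events: List[Event]) -> List[Tuple[Event, Event, float]]:
    """Pair each "done" event with the oldest unmatched "initiate" event
    that has the same operation, offset, and count.

    Returns (initiate, done, latency) tuples, where latency is the
    difference between the two timestamps.
    """
    pending = defaultdict(deque)  # (op, offset, count) -> queue of initiate events
    pairs = []
    for ev in events:
        key = (ev["op"], ev["offset"], ev["count"])
        if ev["phase"] == "initiate":
            pending[key].append(ev)
        elif ev["phase"] == "done" and pending[key]:
            start = pending[key].popleft()
            pairs.append((start, ev, ev["ts"] - start["ts"]))
    return pairs
```

Matching in first-in, first-out order is a simplification: if two identical requests are outstanding at once, their completions may be attributed in the wrong order, so an extra key such as a file identifier would help in practice.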
    • pNFS: Use the standard I/O stateid when calling LAYOUTGET · 70d2f7b1
      Trond Myklebust committed
      Instead of having a private method for copying the open/delegation stateid,
      use the same call that is used for standard I/O through the MDS.
      
      Note that this means we transmit the stateid with a zero seqid, avoiding
      issues with NFS4ERR_OLD_STATEID.
      Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
  2. 10 Sep, 2017 4 commits
  3. 09 Sep, 2017 1 commit
    • NFS: Fix 2 use after free issues in the I/O code · 196639eb
      Trond Myklebust committed
      The writeback code wants to send a commit after processing the pages,
      which is why we want to delay releasing the struct path until after
      that's done.
      
      Also, the layout code expects that we do not free the inode before
      we've put the layout segments in pnfs_writehdr_free() and
      pnfs_readhdr_free().
      
      Fixes: 919e3bd9 ("NFS: Ensure we commit after writeback is complete")
      Fixes: 4714fb51 ("nfs: remove pgio_header refcount, related cleanup")
      Cc: stable@vger.kernel.org
      Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
  4. 07 Sep, 2017 6 commits
  5. 06 Sep, 2017 2 commits
    • xprtrdma: Use xprt_pin_rqst in rpcrdma_reply_handler · 9590d083
      Chuck Lever committed
      Adopt the use of xprt_pin_rqst to eliminate contention between
      Call-side users of rb_lock and the use of rb_lock in
      rpcrdma_reply_handler.
      
      This replaces the mechanism introduced in 431af645 ("xprtrdma:
      Fix client lock-up after application signal fires").
      
      Use recv_lock to quickly find the completing rqst, pin it, then
      drop the lock. At that point invalidation and pull-up of the Reply
      XDR can be done. Both are often expensive operations.
      
      Finally, take recv_lock again to signal completion to the RPC
      layer. It also protects adjustment of "cwnd".
      
      This greatly reduces the amount of time a lock is held by the
      reply handler. Comparing lock_stat results shows a marked decrease
      in contention on rb_lock and recv_lock.
      Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
      [trond.myklebust@primarydata.com: Remove call to rpcrdma_buffer_put() from
         the "out_norqst:" path in rpcrdma_reply_handler.]
      Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
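
The locking sequence described in the commit above (find and pin under the lock, drop the lock for the expensive work, retake it to signal completion) can be illustrated with a small userspace analogue. This is only a sketch of the pattern using Python threads. `Transport`, `Request`, `handle_reply`, and the `decode` callback are hypothetical names and do not correspond to the kernel's structures or functions.

```python
import threading
from typing import Callable, Dict, List, Optional


class Request:
    """Hypothetical pending request, identified by its transaction ID."""
    def __init__(self, xid: int) -> None:
        self.xid = xid
        self.pinned = False   # while pinned, the request must not be released
        self.reply: Optional[object] = None


class Transport:
    """Hypothetical transport holding the receive lock and pending requests."""
    def __init__(self) -> None:
        self.recv_lock = threading.Lock()
        self.pending: Dict[int, Request] = {}
        self.completed: List[int] = []


def handle_reply(xprt: Transport, xid: int, raw: bytes,
                 decode: Callable[[bytes], object]) -> Optional[Request]:
    # Step 1: short critical section to find the completing request and pin it.
    with xprt.recv_lock:
        rqst = xprt.pending.get(xid)
        if rqst is None:
            return None
        rqst.pinned = True

    # Step 2: expensive work (stands in for invalidation and XDR pull-up)
    # runs with the lock dropped.
    decoded = decode(raw)

    # Step 3: retake the lock only to publish the result and unpin.
    with xprt.recv_lock:
        rqst.reply = decoded
        rqst.pinned = False
        del xprt.pending[xid]
        xprt.completed.append(xid)
    return rqst
```

The pin is what makes step 2 safe: other code paths that would release the request are expected to check `pinned` first, so the handler can work on it without holding the lock.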
    • Merge tag 'nfs-rdma-for-4.14-1' of git://git.linux-nfs.org/projects/anna/linux-nfs into linux-next · f9773b22
      Trond Myklebust committed
      NFS-over-RDMA client updates for Linux 4.14
      
      Bugfixes and cleanups:
      - Constify rpc_xprt_ops
      - Harden RPC call encoding and decoding
      - Clean up rpc call decoding to use xdr_streams
      - Remove unused variables from various structures
      - Refactor code to remove imul instructions
      - Rearrange rx_stats structure for better cacheline sharing
  6. 23 Aug, 2017 1 commit
  7. 21 Aug, 2017 4 commits
    • Merge branch 'bugfixes' · 7af7a596
      Trond Myklebust committed
    • NFS: Fix NFSv2 security settings · 53a75f22
      Chuck Lever committed
      For a while now any NFSv2 mount where sec= is specified uses
      AUTH_NULL. If sec= is not specified, the mount uses AUTH_UNIX.
      Commit e68fd7c8 ("mount: use sec= that was specified on the
      command line") attempted to address a very similar problem with
      NFSv3, and should have fixed this too, but it has a bug.
      
      The MNTv1 MNT procedure does not return a list of security flavors,
      so our client makes up a list containing just AUTH_NULL. This should
      enable nfs_verify_authflavors() to assign the sec= specified flavor,
      but instead, it incorrectly sets it to AUTH_NULL.
      
      I expect this would also be a problem for any NFSv3 server whose
      MNTv3 MNT procedure returned a security flavor list containing only
      AUTH_NULL.
      
      Fixes: e68fd7c8 ("mount: use sec= that was specified on ... ")
      BugLink: https://bugzilla.linux-nfs.org/show_bug.cgi?id=310
      Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
      Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
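
The behaviour the commit above says nfs_verify_authflavors() should have can be sketched as a small selection function. This is an illustration built only from the commit message, not the kernel implementation. `choose_flavor` is a hypothetical name, and the constants use the standard ONC RPC flavor numbers (AUTH_NULL is 0, AUTH_UNIX is 1).

```python
from typing import Optional, Sequence

AUTH_NULL = 0  # ONC RPC "no authentication" flavor
AUTH_UNIX = 1  # ONC RPC flavor used by sec=sys


def choose_flavor(requested: Sequence[int],
                  server_flavors: Sequence[int]) -> Optional[int]:
    """Pick a security flavor for the mount.

    requested:      flavors given with sec= (assumed non-empty here)
    server_flavors: list returned by the MNT procedure, or the made-up
                    [AUTH_NULL] list when the server returns none (MNTv1)
    """
    # Prefer a flavor that both sides list explicitly.
    for flavor in server_flavors:
        if flavor in requested:
            return flavor
    # A list containing AUTH_NULL should let the sec= flavor through,
    # rather than forcing the mount down to AUTH_NULL.
    if AUTH_NULL in server_flavors and requested:
        return requested[0]
    return None  # no acceptable flavor
```

For an NFSv2 mount with sec=sys, `choose_flavor([AUTH_UNIX], [AUTH_NULL])` yields AUTH_UNIX, which is the outcome the commit message describes as intended.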
    • NFSv4.1: don't use machine credentials for CLOSE when using 'sec=sys' · b79e87e0
      NeilBrown committed
      An NFSv4.1 client might close a file after the user who opened it has
      logged off.  In this case the user's credentials may no longer be
      valid, if they are e.g. kerberos credentials that have expired.
      
      NFSv4.1 has a mechanism to allow the client to use machine credentials
      to close a file.  However, due to a shortcoming in the RFC, a CLOSE
      with those credentials may not be possible if the file in question
      isn't exported to the same security flavor - the required PUTFH must
      be rejected when this is the case.
      
      Specifically if a server and client support kerberos in general and
      have used it to form a machine credential, but the file is only
      exported to "sec=sys", a PUTFH with the machine credentials will fail,
      so CLOSE is not possible.
      
      As RPC_AUTH_UNIX (used by sec=sys) credentials can never expire, there
      is no value in using the machine credential in place of them.
      So in that case, just use the user's credentials for CLOSE etc., as you would
      in NFSv4.0.
      Signed-off-by: Neil Brown <neilb@suse.com>
      Signed-off-by: NeilBrown <neilb@suse.com>
      Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
    • SUNRPC: ECONNREFUSED should cause a rebind. · fd01b259
      NeilBrown committed
      If you
       - mount an NFSv3 filesystem
       - do some file locking which requires the server
         to make a GRANT call back
       - unmount
       - mount again and do the same locking
      
      then the second attempt at locking suffers a 30 second delay.
      Unmounting and remounting causes lockd to stop and restart,
      which causes it to bind to a new port.
      The server still thinks the old port is valid and gets ECONNREFUSED
      when trying to contact it.
      ECONNREFUSED should be seen as a hard error that is not worth
      retrying.  Rebinding is the only reasonable response.
      
      This patch forces a rebind if that makes sense.
      Signed-off-by: NeilBrown <neilb@suse.com>
      Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
  8. 20 Aug, 2017 2 commits
  9. 19 Aug, 2017 2 commits
  10. 17 Aug, 2017 4 commits
  11. 16 Aug, 2017 2 commits
  12. 15 Aug, 2017 9 commits