1. 12 Sep, 2017 2 commits
    • NFS: various changes relating to reporting IO errors. · bf4b4905
      Committed by NeilBrown
      1/ remove 'start' and 'end' args from nfs_file_fsync_commit().
         They aren't used.
      
      2/ Make nfs_context_set_write_error() a "static inline" in internal.h
         so we can...
      
      3/ Use nfs_context_set_write_error() instead of mapping_set_error()
         if nfs_pageio_add_request() fails before sending any request.
         NFS generally keeps errors in the open_context, not the mapping,
         so this is more consistent.
      
      4/ If filemap_write_and_wait_range() reports any error, still
         check ctx->error.  The value in ctx->error is likely to be
         more useful.  As part of this, NFS_CONTEXT_ERROR_WRITE is
         cleared slightly earlier, before nfs_file_fsync_commit() is called,
         rather than at the start of that function.
      Signed-off-by: NeilBrown <neilb@suse.com>
      Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
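Point 4/ above describes an error-precedence rule that is easy to misread, so here is a minimal userspace sketch of it in Python. It is a model, not the kernel code: the `OpenContext` class, the helper names, and the use of a Python set for the flag bits are all assumptions made for illustration. Only the behaviour (prefer the error stored in the open context over the generic writeback error, and clear the flag when it is consumed) comes from the commit message.

```python
import errno

# Hypothetical stand-in for the NFS_CONTEXT_ERROR_WRITE flag bit.
NFS_CONTEXT_ERROR_WRITE = "error_write"


class OpenContext:
    """Toy model of an NFS open context: a flag set plus a saved error."""

    def __init__(self):
        self.flags = set()
        self.error = 0


def context_set_write_error(ctx, error):
    """Record a write error in the open context (not in the mapping)."""
    ctx.error = error
    ctx.flags.add(NFS_CONTEXT_ERROR_WRITE)


def fsync_flush_status(ctx, writeback_ret):
    """Pick the error to report after the writeback step of fsync.

    Even when writeback itself reported an error, the context is still
    checked, because the value stored there is likely to be more useful.
    The flag is cleared here, before any commit step would run.
    """
    ret = writeback_ret
    if NFS_CONTEXT_ERROR_WRITE in ctx.flags:
        ctx.flags.discard(NFS_CONTEXT_ERROR_WRITE)
        saved, ctx.error = ctx.error, 0
        if saved:
            ret = saved
    return ret
```

In this model, a context holding `-errno.ENOSPC` makes `fsync_flush_status(ctx, -errno.EIO)` report `-ENOSPC` once; a second call falls back to the writeback result because the saved error has been consumed.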
    • NFS: Add static NFS I/O tracepoints · 8224b273
      Committed by Chuck Lever
      Tools like tcpdump and rpcdebug can be very useful. But there are
      plenty of environments where they are difficult or impossible to
      use. For example, we've had customers report I/O failures during
      workloads so heavy that collecting network traffic or enabling
      RPC debugging are themselves onerous.
      
      The kernel's static tracepoints are lightweight (less likely to
      introduce timing changes) and efficient (the trace data is compact).
      They also work in scenarios where capturing network traffic is not
      possible due to lack of hardware support (some InfiniBand HCAs) or
      where data or network privacy is a concern.
      
      Introduce tracepoints that show when an NFS READ, WRITE, or COMMIT
      is initiated, and when it completes. Record the arguments and
      results of each operation, which are not shown by the existing
      sunrpc module's tracepoints.
      
      For instance, the recorded offset and count can be used to match an
      "initiate" event to a "done" event. If an NFS READ result returns
      fewer bytes than requested or zero, seeing the EOF flag can be
      probative. Seeing an NFS4ERR_BAD_STATEID result is also an indication
      of a particular class of problems. The timing information attached
      to each event record can often be useful as well.
      
      Usage example:
      
      [root@manet tmp]# trace-cmd record -e nfs:*initiate* -e nfs:*done
      /sys/kernel/debug/tracing/events/nfs/*initiate*/filter
      /sys/kernel/debug/tracing/events/nfs/*done/filter
      Hit Ctrl^C to stop recording
      ^CKernel buffer statistics:
        Note: "entries" are the entries left in the kernel ring buffer and are not
              recorded in the trace data. They should all be zero.
      
      CPU: 0
      entries: 0
      overrun: 0
      commit overrun: 0
      bytes: 3680
      oldest event ts:    78.367422
      now ts:   100.124419
      dropped events: 0
      read events: 74
      
      ... and so on.
      Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
      Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
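The commit message notes that the recorded offset and count can be used to match an "initiate" event to a "done" event. A minimal Python sketch of that matching follows. It assumes the trace has already been parsed into tuples of `(timestamp_us, kind, offset, count)`; that tuple layout and the function name are hypothetical, and the real tracepoint output format is not reproduced here.

```python
from collections import defaultdict, deque


def match_events(events):
    """Pair each "done" event with the oldest pending "initiate" event
    that has the same (offset, count), and report the elapsed time.

    events: iterable of (timestamp_us, kind, offset, count) tuples in
    trace order, where kind is "initiate" or "done" (assumed layout).
    Returns a list of (offset, count, latency_us) tuples.
    """
    pending = defaultdict(deque)  # (offset, count) -> initiate timestamps
    matched = []
    for timestamp_us, kind, offset, count in events:
        key = (offset, count)
        if kind == "initiate":
            pending[key].append(timestamp_us)
        elif kind == "done" and pending[key]:
            started = pending[key].popleft()
            matched.append((offset, count, timestamp_us - started))
    return matched
```

A queue per key is used so that repeated I/O to the same range is paired in order rather than collapsed into one entry.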
  2. 10 Sep, 2017 4 commits
  3. 07 Sep, 2017 1 commit
  4. 20 Aug, 2017 1 commit
  5. 15 Aug, 2017 23 commits
  6. 14 Jul, 2017 2 commits
  7. 09 May, 2017 1 commit
  8. 27 Apr, 2017 1 commit
  9. 26 Apr, 2017 1 commit
  10. 21 Apr, 2017 4 commits
    • nfs: Convert to separately allocated bdi · 0db10944
      Committed by Jan Kara
      Allocate struct backing_dev_info separately instead of embedding it
      inside the superblock. This unifies handling of bdi among users.
      
      CC: Anna Schumaker <anna.schumaker@netapp.com>
      CC: linux-nfs@vger.kernel.org
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Jan Kara <jack@suse.cz>
      Acked-by: Trond Myklebust <trond.myklebust@primarydata.com>
      Signed-off-by: Jens Axboe <axboe@fb.com>
    • NFS: move rw_mode to nfs_pageio_header · fbe77c30
      Committed by Benjamin Coddington
      Let's try to have it in a cacheline in nfs4_proc_pgio_rpc_prepare().
      Signed-off-by: Benjamin Coddington <bcodding@redhat.com>
      Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
    • NFS: Fix use after free in write error path · 1f84ccdf
      Committed by Fred Isaman
      Signed-off-by: Fred Isaman <fred.isaman@gmail.com>
      Fixes: 0bcbf039 ("nfs: handle request add failure properly")
      Cc: stable@vger.kernel.org # v4.5+
      Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
    • NFS: fix usage of mempools. · 518662e0
      Committed by NeilBrown
      When passed GFP flags that allow sleeping (such as
      GFP_NOIO), mempool_alloc() will never return NULL; it will
      wait until memory is available.
      
      This means that we don't need to handle failure, but that we
      do need to ensure one thread doesn't call mempool_alloc()
      twice on the one pool without queuing or freeing the first
      allocation.  If multiple threads did this during times of
      high memory pressure, the pool could be exhausted and a
      deadlock could result.
      
      pnfs_generic_alloc_ds_commits() attempts to allocate from
      the nfs_commit_mempool while already holding an allocation
      from that pool.  This is not safe.  So change
      nfs_commitdata_alloc() to take a flag that indicates whether
      failure is acceptable.
      
      In pnfs_generic_alloc_ds_commits(), accept failure and
      handle it as we currently do.  Elsewhere, do not accept
      failure, and do not handle it.
      
      Even when failure is acceptable, we want to succeed if
      possible.  That means both
       - using an entry from the pool if there is one
       - waiting for direct reclaim if there isn't.
      
      We call mempool_alloc(GFP_NOWAIT) to achieve the first, then
      kmem_cache_alloc(GFP_NOIO|__GFP_NORETRY) to achieve the
      second.  Each of these can fail, but together they do the
      best they can without blocking indefinitely.
      
      The objects returned by kmem_cache_alloc() will still be freed
      by mempool_free().  This is safe as mempool_alloc() uses
      exactly the same function to allocate objects (since the mempool
      was created with mempool_create_slab_pool()).  The objects returned
      by mempool_alloc() and kmem_cache_alloc() are indistinguishable
      so mempool_free() will handle both identically, either adding to the
      pool or calling kmem_cache_free().
      
      Also, don't test for failure when allocating from
      nfs_wdata_mempool.
      Signed-off-by: NeilBrown <neilb@suse.com>
      Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
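The two-step allocation described in this commit message (try the pool without sleeping, then fall back to a direct allocation that may reclaim but gives up rather than waiting) can be modelled in userspace. The Python sketch below is a toy, not kernel code: `ToyMempool`, `commitdata_alloc`, and the callables standing in for the slab allocator are all hypothetical names, and "sleeping" is represented by raising an exception so the hazard is visible.

```python
class ToyMempool:
    """Toy stand-in for a mempool: a backing allocator plus a reserve."""

    def __init__(self, min_nr, backing_alloc):
        self.min_nr = min_nr
        self.backing_alloc = backing_alloc  # returns an object, or None on failure
        self.reserve = [object() for _ in range(min_nr)]

    def alloc_nowait(self):
        """Models mempool_alloc(GFP_NOWAIT): never sleeps, may return None."""
        obj = self.backing_alloc()
        if obj is None and self.reserve:
            obj = self.reserve.pop()
        return obj

    def alloc_blocking(self):
        """Models mempool_alloc(GFP_NOIO): would sleep until an element is freed.

        The toy raises instead of sleeping. If the caller already holds
        the last element of this pool, that sleep would never end.
        """
        obj = self.alloc_nowait()
        if obj is None:
            raise RuntimeError("would sleep waiting for a pool element")
        return obj

    def free(self, obj):
        """Models mempool_free(): refill the reserve first, else drop the object."""
        if len(self.reserve) < self.min_nr:
            self.reserve.append(obj)
        # Otherwise the object is released (kmem_cache_free() in the kernel).


def commitdata_alloc(pool, alloc_with_reclaim, never_fail):
    """Sketch of the allocation policy described in the commit message."""
    if never_fail:
        return pool.alloc_blocking()
    obj = pool.alloc_nowait()       # first: take a pool entry if there is one
    if obj is None:
        obj = alloc_with_reclaim()  # second: one direct attempt, no retry loop
    return obj                      # may still be None; the caller handles that
```

Either path yields an object that `free()` treats identically, which mirrors the commit's argument that objects from the two allocators are indistinguishable to mempool_free().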