1. 25 Jun, 2016 1 commit
  2. 28 May, 2016 1 commit
  3. 18 May, 2016 7 commits
    • pnfs: rework LAYOUTGET retry handling · 183d9e7b
      Committed by Jeff Layton
      There are several problems in the way a stateid is selected for a
      LAYOUTGET operation:
      
      We pick a stateid to use in the RPC prepare op, but that makes
      it difficult to serialize LAYOUTGETs that use the open stateid. That
      serialization is done in pnfs_update_layout, which occurs well before
      the rpc_prepare operation.
      
      Between those two events, the i_lock is dropped and reacquired.
      pnfs_update_layout can find that the list has lsegs in it and not do any
      serialization, but then later pnfs_choose_layoutget_stateid ends up
      choosing the open stateid.
      
      This patch changes the client to select the stateid to use in the
      LAYOUTGET earlier, when we're searching for a usable layout segment.
      This way we can do it all while holding the i_lock the first time, and
      ensure that we serialize any LAYOUTGET call that uses a non-layout
      stateid.
      
      This also means a rework of how LAYOUTGET replies are handled, as we
      must now get the latest stateid if we want to retransmit in response
      to a retryable error.
      
      Most of those errors boil down to the fact that the layout state has
      changed in some fashion. Thus, what we really want to do is to re-search
      for a layout when it fails with a retryable error, so that we can avoid
      reissuing the RPC at all if possible.
      
      While the LAYOUTGET RPC is async, the initiating thread always waits for
      it to complete, so it's effectively synchronous anyway. Currently, when
      we need to retry a LAYOUTGET because of an error, we drive that retry
      via the rpc state machine.
      
      This means that once the call has been submitted, it runs until it
      completes. So, we must move the error handling for this RPC out of the
      rpc_call_done operation and into the caller.
      
      In order to handle errors like NFS4ERR_DELAY properly, we must also
      pass a pointer to the sliding timeout, which is now moved to the stack
      in pnfs_update_layout.
      
      The complicating errors are -NFS4ERR_RECALLCONFLICT and
      -NFS4ERR_LAYOUTTRYLATER, as those involve a timeout after which we give
      up and return NULL back to the caller. So, there is some special
      handling for those errors to ensure that the layers driving the retries
      can handle that appropriately.
      Signed-off-by: Jeff Layton <jeff.layton@primarydata.com>
      Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
      183d9e7b
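The retry flow described in this commit message can be sketched as a loop in the caller rather than in the RPC state machine. This is a minimal, user-space illustration and not the kernel code: `update_layout_sketch`, `next_delay_ms`, the `lg_status` values, and the timeout constants are all hypothetical names and numbers chosen for the example, and the server replies are simulated by an array.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical reply classes; the real client switches on NFS4ERR_* codes. */
enum lg_status { LG_OK, LG_DELAY, LG_TRYLATER, LG_FATAL };

struct lg_result {
    int got_layout;  /* 1 if a layout was obtained, 0 if the caller gets NULL */
    int rpcs_sent;   /* number of LAYOUTGET attempts made */
    long waited_ms;  /* total simulated back-off */
};

/* Sliding timeout: start small, double on each retry, cap at a maximum. */
static long next_delay_ms(long prev_ms)
{
    const long min_ms = 100, max_ms = 1600;  /* illustrative values only */

    if (prev_ms <= 0)
        return min_ms;
    return (prev_ms * 2 > max_ms) ? max_ms : prev_ms * 2;
}

/* Simulated caller-driven retry loop; replies[] stands in for the server. */
struct lg_result update_layout_sketch(const enum lg_status *replies,
                                      size_t nreplies, long giveup_ms)
{
    struct lg_result res = { 0, 0, 0 };
    long timeout_ms = 0;   /* lives on the caller's stack, not in the RPC task */
    long trylater_ms = 0;  /* time already spent on "try later" style errors */
    size_t i;

    assert(replies != NULL || nreplies == 0);

    for (i = 0; i < nreplies; i++) {
        /* A real client would re-search for a usable segment here and
         * select a fresh stateid under the lock before sending. */
        res.rpcs_sent++;
        switch (replies[i]) {
        case LG_OK:
            res.got_layout = 1;
            return res;
        case LG_DELAY:
            /* Back off, then retry from the top of the loop. */
            timeout_ms = next_delay_ms(timeout_ms);
            res.waited_ms += timeout_ms;
            break;
        case LG_TRYLATER:
            /* Bounded retry: give up and report "no layout" after a while. */
            if (trylater_ms >= giveup_ms)
                return res;
            timeout_ms = next_delay_ms(timeout_ms);
            trylater_ms += timeout_ms;
            res.waited_ms += timeout_ms;
            break;
        default:
            return res;  /* non-retryable error */
        }
    }
    return res;
}
```

Keeping the timeout in a local variable of the caller mirrors the point made above: each pass through the loop can start from the current layout state instead of replaying a stale request.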
    • pnfs: only tear down lsegs that precede seqid in LAYOUTRETURN args · 6d597e17
      Committed by Jeff Layton
      LAYOUTRETURN is "special" in that servers and clients are expected to
      work with old stateids. When the client sends a LAYOUTRETURN with an old
      stateid in it then the server is expected to only tear down layout
      segments that were present when that seqid was current. Ensure that the
      client handles its accounting accordingly.
      Signed-off-by: Jeff Layton <jeff.layton@primarydata.com>
      Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
      6d597e17
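The accounting rule in this commit message can be illustrated with a small sketch: each layout segment remembers the seqid that was current when it was added, and a LAYOUTRETURN carrying an older seqid leaves newer segments alone. The names `lseg_sketch`, `seqid_is_newer`, and `mark_returned_lsegs` are hypothetical, and the wraparound-tolerant comparison is an assumption about how 32-bit seqids are usually ordered, not a quote of the kernel code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a layout segment. */
struct lseg_sketch {
    uint32_t seq;  /* seqid that was current when this segment was added */
    int valid;     /* 1 while the client still holds the segment */
};

/* Wraparound-tolerant "s1 is newer than s2" for 32-bit sequence numbers.
 * Relies on the usual two's-complement conversion to int32_t. */
static int seqid_is_newer(uint32_t s1, uint32_t s2)
{
    return (int32_t)(s1 - s2) > 0;
}

/* Invalidate only the segments that existed when return_seq was current.
 * Returns the number of segments torn down. */
size_t mark_returned_lsegs(struct lseg_sketch *lsegs, size_t n,
                           uint32_t return_seq)
{
    size_t i, torn = 0;

    assert(lsegs != NULL || n == 0);

    for (i = 0; i < n; i++) {
        if (!lsegs[i].valid)
            continue;
        if (seqid_is_newer(lsegs[i].seq, return_seq))
            continue;  /* added after the returned seqid: keep it */
        lsegs[i].valid = 0;
        torn++;
    }
    return torn;
}
```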
    • NFSv4: Use the right stateid for delegations in setattr, read and write · abf4e13c
      Committed by Trond Myklebust
      When we're using a delegation to represent our open state, we should
      ensure that we use the stateid that was used to create that delegation.
      Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
      Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
      abf4e13c
    • NFSv4: Label stateids with the type · 93b717fd
      Committed by Trond Myklebust
      In order to more easily distinguish what kind of stateid we are dealing
      with, introduce a type that can be used to label the stateid structure.
      
      The label will be useful both for debugging and when dealing with
      operations like SETATTR, READ and WRITE that can take several different
      types of stateid as arguments.
      Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
      Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
      93b717fd
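As a rough illustration of what labeling a stateid with a type could look like, the sketch below attaches an enum to a stateid structure and uses it to decide whether the stateid is a plausible argument for SETATTR, READ or WRITE. The type names, `stateid_sketch`, `stateid_type_name`, and `stateid_ok_for_io` are hypothetical and are not the identifiers introduced by the patch; only the 4-byte seqid plus 12-byte opaque layout of an NFSv4 stateid is taken from the protocol.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical labels; the patch defines its own enum. */
enum stateid_type_sketch {
    ST_INVALID = 0,
    ST_SPECIAL,
    ST_OPEN,
    ST_LOCK,
    ST_DELEGATION,
    ST_LAYOUT
};

/* An NFSv4 stateid is a 32-bit seqid plus 12 opaque bytes; the type
 * label is client-side bookkeeping and never goes on the wire. */
struct stateid_sketch {
    uint32_t seqid;
    unsigned char other[12];
    enum stateid_type_sketch type;
};

/* Human-readable label, e.g. for debug output. */
const char *stateid_type_name(const struct stateid_sketch *s)
{
    assert(s != NULL);
    switch (s->type) {
    case ST_SPECIAL:    return "special";
    case ST_OPEN:       return "open";
    case ST_LOCK:       return "lock";
    case ST_DELEGATION: return "delegation";
    case ST_LAYOUT:     return "layout";
    default:            return "invalid";
    }
}

/* Assumption for this sketch: I/O-style operations accept special, open,
 * lock and delegation stateids, but not layout stateids. */
int stateid_ok_for_io(const struct stateid_sketch *s)
{
    assert(s != NULL);
    return s->type == ST_SPECIAL || s->type == ST_OPEN ||
           s->type == ST_LOCK || s->type == ST_DELEGATION;
}
```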
    • sunrpc: Advertise maximum backchannel payload size · 6b26cc8c
      Committed by Chuck Lever
      RPC-over-RDMA transports have a limit on how large a backward
      direction (backchannel) RPC message can be. Ensure that the NFSv4.x
      CREATE_SESSION operation advertises this limit to servers.
      Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
      Tested-by: Steve Wise <swise@opengridcomputing.com>
      Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
      6b26cc8c
    • nfs4: client: do not send empty SETATTR after OPEN_CREATE · a1d1c4f1
      Committed by Tigran Mkrtchyan
      OPEN_CREATE with EXCLUSIVE4_1 sends the initial file permissions.
      Ignoring the fact that the server has indicated that the file mode is
      set, the client sends yet another SETATTR request, but, as the mode is
      already set, the new SETATTR is empty. This is not a correctness
      problem, but it costs an extra round trip and slows down opens on
      high-latency networks.
      
      This change aims to skip the extra SETATTR after open if there are
      no attributes to be set.
      Signed-off-by: Tigran Mkrtchyan <tigran.mkrtchyan@desy.de>
      Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
      a1d1c4f1
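The check this commit describes amounts to masking out the attributes the server reports as already set and sending SETATTR only if something is left. The sketch below shows that idea with plain bit flags; the flag names and both functions are hypothetical and do not correspond to the kernel's attribute masks.

```c
#include <assert.h>

/* Hypothetical attribute flags for the example. */
#define SK_ATTR_MODE  0x1u
#define SK_ATTR_SIZE  0x2u
#define SK_ATTR_MTIME 0x4u

/* Attributes the client still has to set after the OPEN reply. */
unsigned int remaining_attrs(unsigned int requested,
                             unsigned int set_by_server)
{
    return requested & ~set_by_server;
}

/* Returns 1 if a follow-up SETATTR is needed, 0 if it would be empty. */
int need_setattr_after_open(unsigned int requested,
                            unsigned int set_by_server)
{
    return remaining_attrs(requested, set_by_server) != 0;
}
```

With only the mode requested and the server reporting the mode as set, `need_setattr_after_open()` returns 0 and the extra round trip is avoided.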
    • NFS: Add COPY nfs operation · 2e72448b
      Committed by Anna Schumaker
      This adds the copy_range file_ops function pointer used by the
      sys_copy_range() function call.  This patch only implements sync copies,
      so if an async copy happens we decode the stateid and ignore it.
      Signed-off-by: Anna Schumaker <bjschuma@netapp.com>
      2e72448b
  4. 09 May, 2016 2 commits
    • nfs: per-name sillyunlink exclusion · 884be175
      Committed by Al Viro
      use d_alloc_parallel() for sillyunlink/lookup exclusion and
      explicit rwsem (nfs_rmdir() being a writer and nfs_call_unlink() -
      a reader) for rmdir/sillyunlink one.
      
      That ought to make lookup/readdir/!O_CREAT atomic_open really
      parallel on NFS.
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      884be175
    • NFS: Fix a LOCK/OPEN race when unlinking an open file · 11476e9d
      Committed by Chuck Lever
      At Connectathon 2016, we found that recent upstream Linux clients
      would occasionally send a LOCK operation with a zero stateid. This
      appeared to happen in close proximity to another thread returning
      a delegation before unlinking the same file while it remained open.
      
      Earlier, the client received a write delegation on this file and
      returned the open stateid. Now, as it is getting ready to unlink the
      file, it returns the write delegation. But there is still an open
      file descriptor on that file, so the client must OPEN the file
      again before it returns the delegation.
      
      Since commit 24311f88 ('NFSv4: Recovery of recalled read
      delegations is broken'), nfs_open_delegation_recall() clears the
      NFS_DELEGATED_STATE flag _before_ it sends the OPEN. This allows a
      racing LOCK on the same inode to be put on the wire before the OPEN
      operation has returned a valid open stateid.
      
      To eliminate this race, serialize delegation return with the
      acquisition of a file lock on the same file. Adopt the same approach
      as is used in the unlock path.
      
      This patch also eliminates a similar race seen when sending a LOCK
      operation at the same time as returning a delegation on the same file.
      
      Fixes: 24311f88 ('NFSv4: Recovery of recalled read ... ')
      Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
      [Anna: Add sentence about LOCK / delegation race]
      Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
      11476e9d
  5. 11 Apr, 2016 1 commit
  6. 14 Mar, 2016 1 commit
  7. 18 Feb, 2016 1 commit
  8. 06 Feb, 2016 2 commits
  9. 25 Jan, 2016 1 commit
  10. 29 Dec, 2015 4 commits
  11. 28 Dec, 2015 6 commits
    • nfs: machine credential support for additional operations · 99ade3c7
      Committed by Andrew Elble
      Allow LAYOUTRETURN and DELEGRETURN to use machine credentials if the
      server supports it. Add request for OPEN_DOWNGRADE as the close path
      also uses that.
      Signed-off-by: Andrew Elble <aweits@rit.edu>
      Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
      99ade3c7
    • T
    • O
      9759b0fb
    • Adding stateid information to tracepoints · 48c9579a
      Committed by Olga Kornievskaia
      Operations to which stateid information is added:
      close, delegreturn, open, read, setattr, layoutget, layoutcommit, test_stateid,
      write, lock, locku, lockt
      
      Format is "stateid=<seqid>:<crc32 hash stateid.other>", also "openstateid=",
      "layoutstateid=", and "lockstateid=" for open_file, layoutget, set_lock
      tracepoints.
      
      A new function, nfs_stateid_hash(), is added to internal.h to compute the hash.
      
      trace_nfs4_setattr() is moved from nfs4_do_setattr() to _nfs4_do_setattr()
      to get access to the stateid.
      
      trace_nfs4_setattr and trace_nfs4_delegreturn are changed from INODE_EVENT
      to a new event type, INODE_STATEID_EVENT, which is the same as INODE_EVENT
      but adds stateid information.
      
      For the locking tracepoints, trace_nfs4_set_lock() is moved into
      _nfs4_do_setlk() to get access to stateid information, and
      trace_nfs4_lock_reclaim() and trace_nfs4_lock_expired() are removed, as
      they call into _nfs4_do_setlk() and both were previously the same
      LOCK_EVENT type.
      Signed-off-by: Olga Kornievskaia <kolga@netapp.com>
      Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
      48c9579a
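The "stateid=<seqid>:<crc32 hash stateid.other>" format from this commit message can be approximated in user space as follows. This is a sketch under stated assumptions: `crc32_sketch` is a plain bitwise CRC-32 (reflected polynomial 0xEDB88320, initial value and final XOR of 0xFFFFFFFF), which may differ in seed or inversion from what nfs_stateid_hash() actually computes, and `format_stateid_sketch` and its exact printf format are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

#define STATEID_OTHER_SIZE 12  /* opaque part of an NFSv4 stateid */

/* Bitwise CRC-32; slow but dependency-free, fine for a 12-byte input. */
uint32_t crc32_sketch(const unsigned char *buf, size_t len)
{
    uint32_t crc = 0xFFFFFFFFu;
    size_t i;
    int bit;

    assert(buf != NULL || len == 0);

    for (i = 0; i < len; i++) {
        crc ^= buf[i];
        for (bit = 0; bit < 8; bit++) {
            if (crc & 1u)
                crc = (crc >> 1) ^ 0xEDB88320u;
            else
                crc >>= 1;
        }
    }
    return ~crc;
}

/* Writes "stateid=<seqid>:0x<hash>" into out; returns snprintf's result. */
int format_stateid_sketch(char *out, size_t outlen, uint32_t seqid,
                          const unsigned char other[STATEID_OTHER_SIZE])
{
    uint32_t hash = crc32_sketch(other, STATEID_OTHER_SIZE);

    assert(out != NULL);
    return snprintf(out, outlen, "stateid=%u:0x%08x",
                    (unsigned int)seqid, (unsigned int)hash);
}
```

Hashing the opaque part keeps trace lines short while still letting two events on the same stateid be correlated by eye.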
    • NFS: Allow the combination pNFS and labeled NFS · 95864c91
      Committed by Trond Myklebust
      Fix the nfs4_pnfs_open_bitmap so that it also allows for labeled NFS.
      Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
      95864c91
    • nfs: Fix race in __update_open_stateid() · 361cad3c
      Committed by Andrew Elble
      We've seen this in a packet capture - I've intermixed what I
      think was going on. The fix here is to grab the so_lock sooner.
      
      1964379 -> #1 open (for write) reply seqid=1
      1964393 -> #2 open (for read) reply seqid=2
      
        __nfs4_close(), state->n_wronly--
        nfs4_state_set_mode_locked(), changes state->state = [R]
        state->flags is [RW]
        state->state is [R], state->n_wronly == 0, state->n_rdonly == 1
      
      1964398 -> #3 open (for write) call -> because close is already running
      1964399 -> downgrade (to read) call seqid=2 (close of #1)
      1964402 -> #3 open (for write) reply seqid=3
      
       __update_open_stateid()
         nfs_set_open_stateid_locked(), changes state->flags
         state->flags is [RW]
         state->state is [R], state->n_wronly == 0, state->n_rdonly == 1
         new sequence number is exposed now via nfs4_stateid_copy()
      
         next step would be update_open_stateflags(), pending so_lock
      
      1964403 -> downgrade reply seqid=2, fails with OLD_STATEID (close of #1)
      
         nfs4_close_prepare() gets so_lock and recalcs flags -> send close
      
      1964405 -> downgrade (to read) call seqid=3 (close of #1 retry)
      
         __update_open_stateid() gets so_lock
       * update_open_stateflags() updates state->n_wronly.
         nfs4_state_set_mode_locked() updates state->state
      
         state->flags is [RW]
         state->state is [RW], state->n_wronly == 1, state->n_rdonly == 1
      
       * should have suppressed the preceding nfs4_close_prepare() from
         sending open_downgrade
      
      1964406 -> write call
      1964408 -> downgrade (to read) reply seqid=4 (close of #1 retry)
      
         nfs_clear_open_stateid_locked()
         state->flags is [R]
         state->state is [RW], state->n_wronly == 1, state->n_rdonly == 1
      
      1964409 -> write reply (fails, openmode)
      Signed-off-by: Andrew Elble <aweits@rit.edu>
      Cc: stable@vger.kernel.org
      Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
      361cad3c
  12. 14 Dec, 2015 2 commits
  13. 07 Dec, 2015 1 commit
  14. 24 Nov, 2015 1 commit
  15. 14 Nov, 2015 1 commit
  16. 04 Nov, 2015 1 commit
  17. 23 Oct, 2015 1 commit
  18. 16 Oct, 2015 2 commits
  19. 08 Oct, 2015 4 commits