1. 24 Oct, 2015 2 commits
    • nfsd: ensure that seqid morphing operations are atomic wrt to copies · 9767feb2
      Committed by Jeff Layton
      Bruce points out that the increment of the seqid in stateids is not
      serialized in any way, so it's possible for racing calls to bump it
      twice and end up sending the same stateid. While we don't have any
      reports of this problem it _is_ theoretically possible, and could lead
      to spurious state recovery by the client.
      
      In the current code, update_stateid is always followed by a memcpy of
      that stateid, so we can combine the two operations. For better
      atomicity, we add a spinlock to the nfs4_stid and hold that when bumping
      the seqid and copying the stateid.
      Signed-off-by: Jeff Layton <jeff.layton@primarydata.com>
      Signed-off-by: J. Bruce Fields <bfields@redhat.com>
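The bump-and-copy idea from this commit can be sketched in userspace C. This is a minimal illustration, not the kernel code: `struct stid`, `stid_init`, and `inc_and_copy_stateid` are hypothetical names, a C11 `atomic_flag` stands in for the kernel spinlock, and skipping 0 on wraparound is an assumption of this sketch.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>
#include <string.h>

/* Simplified stand-in for a stateid: a seqid plus an opaque part. */
typedef struct {
    uint32_t si_generation;        /* the seqid */
    unsigned char si_opaque[12];
} stateid_t;

/* Hypothetical stand-in for nfs4_stid. */
struct stid {
    atomic_flag lock;              /* plays the role of the new spinlock */
    stateid_t stateid;
};

void stid_init(struct stid *s)
{
    atomic_flag_clear(&s->lock);
    memset(&s->stateid, 0, sizeof(s->stateid));
}

static void stid_lock(struct stid *s)
{
    while (atomic_flag_test_and_set_explicit(&s->lock, memory_order_acquire))
        ; /* spin until the holder releases the flag */
}

static void stid_unlock(struct stid *s)
{
    atomic_flag_clear_explicit(&s->lock, memory_order_release);
}

/* Bump the seqid and copy the whole stateid out under one lock hold,
 * so two racing callers can never copy out the same seqid. */
void inc_and_copy_stateid(stateid_t *dst, struct stid *s)
{
    assert(dst != NULL && s != NULL);
    stid_lock(s);
    /* Assumption of this sketch: 0 is treated as reserved and skipped. */
    if (++s->stateid.si_generation == 0)
        s->stateid.si_generation = 1;
    memcpy(dst, &s->stateid, sizeof(*dst));
    stid_unlock(s);
}
```

Each caller passes its own `dst`, so the value it later sends to the client is the one it observed while holding the lock.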
    • nfsd: serialize layout stateid morphing operations · cc8a5532
      Committed by Jeff Layton
      In order to allow the client to make a sane determination of what
      happened with racing LAYOUTGET/LAYOUTRETURN/CB_LAYOUTRECALL calls, we
      must ensure that the seqids returned accurately represent the order of
      operations. The simplest way to do that is to ensure that operations on
      a single stateid are serialized.
      
      This patch adds a mutex to the layout stateid, and locks it when
      checking the layout stateid's seqid. The mutex is held over the entire
      operation and released after the seqid is bumped.
      
      Note that in the case of CB_LAYOUTRECALL we must move the increment of
      the seqid and setting into a new cb "prepare" operation. The lease
      infrastructure will call the lm_break callback with a spinlock held, so
      we can't take the mutex in that codepath.
      
      Cc: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Jeff Layton <jeff.layton@primarydata.com>
      Signed-off-by: J. Bruce Fields <bfields@redhat.com>
  2. 13 Oct, 2015 1 commit
    • nfsd: serialize state seqid morphing operations · 35a92fe8
      Committed by Jeff Layton
      Andrew was seeing a race occur when an OPEN and OPEN_DOWNGRADE were
      running in parallel. The server would receive the OPEN_DOWNGRADE first
      and check its seqid, but then an OPEN would race in and bump it. The
      OPEN_DOWNGRADE would then complete and bump the seqid again.  The result
      was that the OPEN_DOWNGRADE would be applied after the OPEN, even though
      it should have been rejected since the seqid changed.
      
      The only recourse we have here, I think, is to serialize operations that
      bump the seqid in a stateid, particularly when we're given a seqid in
      the call. To address this, we add a new rw_semaphore to the
      nfs4_ol_stateid struct. We do a down_write prior to checking the seqid
      after looking up the stateid to ensure that nothing else is going to
      bump it while we're operating on it.
      
      In the case of OPEN, we do a down_read, as the call doesn't contain a
      seqid. Those can run in parallel -- we just need to serialize them when
      there is a concurrent OPEN_DOWNGRADE or CLOSE.
      
      LOCK and LOCKU however always take the write lock as there is no
      opportunity for parallelizing those.
      Reported-and-Tested-by: Andrew W Elble <aweits@rit.edu>
      Signed-off-by: Jeff Layton <jeff.layton@primarydata.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: J. Bruce Fields <bfields@redhat.com>
  3. 13 Aug, 2015 1 commit
  4. 23 Jun, 2015 1 commit
    • nfsd: take struct file setup fully into nfs4_preprocess_stateid_op · af90f707
      Committed by Christoph Hellwig
      This patch changes nfs4_preprocess_stateid_op so it always returns
      a valid struct file if it has been asked for that.  For that we
      now allocate a temporary struct file for special stateids, and check
      permissions if we got the file structure from the stateid.  This
      ensures that all callers will get their handling of special stateids
      right, and avoids code duplication.
      
      There is a little wart in here because the read code needs to know
      if we allocated a file structure so that it can copy around the
      read-ahead parameters.  In the long run we should probably aim to
      cache full file structures used with special stateids instead.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: J. Bruce Fields <bfields@redhat.com>
  5. 05 Jun, 2015 1 commit
  6. 05 May, 2015 3 commits
    • nfsd: fix callback restarts · cba5f62b
      Committed by Christoph Hellwig
      Checking the rpc_client pointer is not a reliable way to detect
      backchannel changes: cl_cb_client is changed only after shutting down
      the rpc client, so the condition cl_cb_client == tk_client will always be
      true.
      
      Check the RPC_TASK_KILLED flag instead, and rewrite the code to avoid
      the buggy cl_callbacks list and fix the lifetime rules due to double
      calls of the ->prepare callback operations method for this retry case.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: J. Bruce Fields <bfields@redhat.com>
    • nfsd: split transport vs operation errors for callbacks · ef2a1b3e
      Committed by Christoph Hellwig
      We must only increment the sequence id if the client has seen and responded
      to a request.  If we failed to deliver it to the client we must resend with
      the same sequence id.  So, just like the client, track errors at the transport
      level separately from those returned in the XDR.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: J. Bruce Fields <bfields@redhat.com>
    • nfsd: fix pNFS return on close semantics · 8287f009
      Committed by Sachin Bhamare
      For the sake of forgetful clients, the server should return the layouts
      to the file system on 'last close' of a file (assuming that there are no
      delegations outstanding to that particular client) or on delegreturn
      (assuming that there are no opens on a file from that particular
      client).
      
      In theory the information is all there in current data structures, but
      it's not efficiently available; nfs4_file->fi_ref includes references on
      the file across all clients, but we need a per-(client, file) count.
      Walking through lots of stateids to calculate this on each close or
      delegreturn would be painful.
      
      This patch introduces infrastructure to maintain per-client opens and
      delegation counters on a per-file basis.
      
      [hch: ported to the mainline pNFS support, merged various fixes from Jeff]
      Signed-off-by: Sachin Bhamare <sachin.bhamare@primarydata.com>
      Signed-off-by: Jeff Layton <jlayton@primarydata.com>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: J. Bruce Fields <bfields@redhat.com>
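A per-(client, file) counter pair makes the "last close" decision cheap. The sketch below is illustrative only: `struct client_file_state` and the `cfs_*` helpers are hypothetical names, and requiring both counters to reach zero on delegreturn is an assumption of this sketch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical record kept once per (client, file) pair. */
struct client_file_state {
    unsigned int opens;        /* opens this client holds on this file */
    unsigned int delegations;  /* delegations this client holds on it */
};

void cfs_open(struct client_file_state *s)     { s->opens++; }
void cfs_delegate(struct client_file_state *s) { s->delegations++; }

/* Returns true when this close is the client's last use of the file,
 * i.e. when its layouts should be handed back to the file system. */
bool cfs_close(struct client_file_state *s)
{
    assert(s->opens > 0);
    s->opens--;
    return s->opens == 0 && s->delegations == 0;
}

/* Same decision on delegreturn: only return layouts if no opens remain. */
bool cfs_delegreturn(struct client_file_state *s)
{
    assert(s->delegations > 0);
    s->delegations--;
    return s->delegations == 0 && s->opens == 0;
}
```

Both checks are O(1), which avoids walking every stateid on each close or delegreturn.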
  7. 03 Feb, 2015 5 commits
    • nfsd: implement pNFS layout recalls · c5c707f9
      Committed by Christoph Hellwig
      Add support to issue layout recalls to clients.  For now we only support
      full-file recalls to get a simple and stable implementation.  This allows
      embedding an nfsd4_callback structure in the layout_state and thus avoids
      any memory allocations under spinlocks during a recall.  For normal
      use cases that do not intend to share a single file between multiple
      clients this implementation is fully sufficient.
      
      To ensure layouts are recalled on local filesystem access each layout
      state registers a new FL_LAYOUT lease with the kernel file locking code,
      which filesystems that support pNFS exports that require recalls need
      to break on conflicting access patterns.
      
      The XDR code is based on the old pNFS server implementation by
      Andy Adamson, Benny Halevy, Boaz Harrosh, Dean Hildebrand, Fred Isaman,
      Marc Eshel, Mike Sager and Ricardo Labiaga.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
    • nfsd: implement pNFS operations · 9cf514cc
      Committed by Christoph Hellwig
      Add support for the GETDEVICEINFO, LAYOUTGET, LAYOUTCOMMIT and
      LAYOUTRETURN NFSv4.1 operations, as well as backing code to manage
      outstanding layouts and devices.
      
      Layout management is very straightforward, with a nfs4_layout_stateid
      structure that extends nfs4_stid to manage layout stateids as the
      top-level structure.  It is linked into the nfs4_file and nfs4_client
      structures like the other stateids, and contains a linked list of
      layouts that hang off the stateid.  The actual layout operations are
      implemented in layout drivers that are not part of this commit, but
      will be added later.
      
      The worst part of this commit is the management of the pNFS device IDs,
      which suffers from a specification that is not sanely implementable due
      to the fact that the device-IDs are global and not bound to an export,
      and have a small enough size so that we can't store the fsid portion of
      a file handle, and must never be reused.  As we still need to perform all
      export authentication and validation checks on a device ID passed to
      GETDEVICEINFO we are caught between a rock and a hard place.  To work
      around this issue we add a new hash that maps from a 64-bit integer to a
      fsid so that we can look up the export to authenticate against it,
      a 32-bit integer as a generation that we can bump when changing the device,
      and a currently unused 32-bit integer that could be used in the future
      to handle more than a single device per export.  Entries in this hash
      table are never deleted as we can't reuse the ids anyway, and would have
      a severe lifetime problem anyway as Linux export structures are temporary
      structures that can go away under load.
      
      Parts of the XDR data, structures and marshaling/unmarshaling code, as
      well as many concepts are derived from the old pNFS server implementation
      from Andy Adamson, Benny Halevy, Dean Hildebrand, Marc Eshel, Fred Isaman,
      Mike Sager, Ricardo Labiaga and many others.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
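The device ID workaround described above packs three integers into the 16 bytes a pNFS device ID provides. The sketch below shows one possible layout and a never-shrinking id-to-fsid map. All names here are hypothetical, a plain `uint64_t` stands in for the fsid, and a small append-only array replaces the hash table to keep the example short.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout of the 16-byte device ID. */
struct deviceid {
    uint64_t fsid_idx;    /* key into the id -> fsid map below */
    uint32_t generation;  /* bumped when the device changes */
    uint32_t pad;         /* unused for now; room for >1 device per export */
};
static_assert(sizeof(struct deviceid) == 16, "device ID must be 16 bytes");

#define MAP_CAPACITY 32

/* Append-only: slot i holds the fsid for index i + 1. Entries are never
 * removed, because the indexes must never be reused. */
static uint64_t fsid_map[MAP_CAPACITY];
static uint64_t fsid_map_used;

/* Return the stable index for fsid, allocating one on first use.
 * Returns 0 if the table is full. */
uint64_t deviceid_map_fsid(uint64_t fsid)
{
    for (uint64_t i = 0; i < fsid_map_used; i++)
        if (fsid_map[i] == fsid)
            return i + 1;
    if (fsid_map_used == MAP_CAPACITY)
        return 0;
    fsid_map[fsid_map_used++] = fsid;
    return fsid_map_used;
}

/* Resolve an index from a client-supplied device ID back to its fsid,
 * so the export can be looked up and authenticated.
 * Returns 0 on success, -1 for an unknown index. */
int deviceid_lookup_fsid(uint64_t idx, uint64_t *fsid)
{
    assert(fsid != NULL);
    if (idx == 0 || idx > fsid_map_used)
        return -1;
    *fsid = fsid_map[idx - 1];
    return 0;
}
```

Because entries only ever accumulate, a device ID handed out earlier keeps resolving to the same fsid for the lifetime of the table.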
    • 4d227fca
    • e6ba76e1
  8. 08 Jan, 2015 1 commit
  9. 08 Nov, 2014 1 commit
    • nfsd: convert nfs4_file searches to use RCU · 5b095e99
      Committed by Jeff Layton
      The global state_lock protects the file_hashtbl, and that has the
      potential to be a scalability bottleneck.
      
      Address this by making the file_hashtbl use RCU. Add a rcu_head to the
      nfs4_file and use that when freeing ones that have been hashed. In order
      to conserve space, we union the fi_rcu field with the fi_delegations
      list_head which must be clear by the time the last reference to the file
      is dropped.
      
      Convert find_file_locked to use RCU lookup primitives and not to require
      that the state_lock be held, and convert find_file to do a lockless
      lookup. Convert find_or_add_file to attempt a lockless lookup first, and
      then fall back to doing a locked search and insert if that fails to find
      anything.
      
      Also, minimize the number of times we need to calculate the hash value
      by passing it in as an argument to the search and insert functions, and
      optimize the order of arguments in nfsd4_init_file.
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Jeff Layton <jlayton@primarydata.com>
      Signed-off-by: J. Bruce Fields <bfields@redhat.com>
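The "lockless lookup first, locked search and insert as fallback" pattern can be sketched in userspace C. This is not RCU: the sketch assumes entries are only ever inserted and never freed, which is what makes the lockless walk safe here, and it uses C11 acquire/release atomics in place of the kernel's RCU list primitives. `struct file_entry` and the integer key are hypothetical stand-ins for nfs4_file and its file handle.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdlib.h>

#define HASH_SIZE 256

/* Hypothetical stand-in for nfs4_file; the key replaces the file handle. */
struct file_entry {
    unsigned long key;
    struct file_entry *_Atomic next;
};

static struct file_entry *_Atomic file_hashtbl[HASH_SIZE];
static pthread_mutex_t state_lock = PTHREAD_MUTEX_INITIALIZER;

/* Computed once by the caller and passed down, as the commit describes. */
unsigned int file_hashval(unsigned long key)
{
    return (unsigned int)(key % HASH_SIZE);
}

/* Lockless lookup: the acquire loads pair with the release store below. */
struct file_entry *find_file(unsigned long key, unsigned int hashval)
{
    struct file_entry *fp =
        atomic_load_explicit(&file_hashtbl[hashval], memory_order_acquire);
    while (fp != NULL && fp->key != key)
        fp = atomic_load_explicit(&fp->next, memory_order_acquire);
    return fp;
}

struct file_entry *find_or_add_file(unsigned long key, unsigned int hashval)
{
    assert(hashval < HASH_SIZE);
    struct file_entry *fp = find_file(key, hashval);  /* fast path, no lock */
    if (fp != NULL)
        return fp;

    struct file_entry *candidate = malloc(sizeof(*candidate));
    if (candidate == NULL)
        return NULL;
    candidate->key = key;

    pthread_mutex_lock(&state_lock);
    fp = find_file(key, hashval);      /* recheck: another thread may have won */
    if (fp == NULL) {
        atomic_init(&candidate->next,
                    atomic_load_explicit(&file_hashtbl[hashval],
                                         memory_order_relaxed));
        /* Publish only after the entry is fully initialised. */
        atomic_store_explicit(&file_hashtbl[hashval], candidate,
                              memory_order_release);
        fp = candidate;
        candidate = NULL;
    }
    pthread_mutex_unlock(&state_lock);
    free(candidate);                   /* no-op unless we lost the race */
    return fp;
}
```

Supporting removal would need a deferred-free mechanism so that readers still walking the chain never touch freed memory. That is the job RCU and the added rcu_head do in the kernel, and it is deliberately left out of this sketch.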
  10. 24 Oct, 2014 1 commit
  11. 08 Oct, 2014 1 commit
  12. 02 Oct, 2014 1 commit
  13. 27 Sep, 2014 4 commits
  14. 18 Sep, 2014 3 commits
  15. 06 Aug, 2014 2 commits
  16. 05 Aug, 2014 7 commits
  17. 02 Aug, 2014 1 commit
  18. 01 Aug, 2014 4 commits