1. 18 November 2017, 6 commits
    • NFSv4: Fix open create exclusive when the server reboots · 8fd1ab74
      Committed by Trond Myklebust
      If a server that does not implement NFSv4.1 persistent session
      semantics reboots while we are performing an exclusive create,
      then the NFS4ERR_DELAY returned when we replay the open during
      the grace period causes us to lose the verifier.
      When the grace period expires, and we present a new verifier,
      the server will then correctly reply NFS4ERR_EXIST.
      
      This commit ensures that we always present the same verifier when
      replaying the OPEN.
      Reported-by: Tigran Mkrtchyan <tigran.mkrtchyan@desy.de>
      Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
      Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
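      A minimal sketch of the idea, with hypothetical names (this is not the
      kernel's actual data structure): generate the exclusive-create verifier
      once per OPEN and return the cached value on every replay, so a retry
      after NFS4ERR_DELAY or a server reboot presents the same verifier.

          #include <stdbool.h>
          #include <stdint.h>

          struct open_ctx {                 /* hypothetical per-OPEN context */
                  uint64_t verifier;        /* 64-bit exclusive-create verifier */
                  bool     verifier_set;    /* true once the OPEN was first sent */
          };

          static uint64_t new_verifier(void)
          {
                  /* Placeholder: a real client derives this from boot time,
                   * a counter, etc. */
                  static uint64_t counter;
                  return ++counter;
          }

          static uint64_t open_exclusive_verifier(struct open_ctx *ctx)
          {
                  if (!ctx->verifier_set) {
                          ctx->verifier = new_verifier();
                          ctx->verifier_set = true;
                  }
                  /* Replays reuse the cached value instead of generating a new one. */
                  return ctx->verifier;
          }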
    • NFSv4: Fix OPEN / CLOSE race · c9399f21
      Committed by Trond Myklebust
      Ben Coddington has noted the following race between OPEN and CLOSE
      on a single client.
      
      Process 1		Process 2		Server
      =========		=========		======
      
      1)  OPEN file
      2)			OPEN file
      3)						Process OPEN (1) seqid=1
      4)						Process OPEN (2) seqid=2
      5)						Reply OPEN (2)
      6)			Receive reply (2)
      7)			new stateid, seqid=2
      
      8)			CLOSE file, using
      			stateid w/ seqid=2
      9)						Reply OPEN (1)
      10)						Process CLOSE (8)
      11)						Reply CLOSE (8)
      12)						Forget stateid
      						file closed
      
      13)			Receive reply (8)
      14)			Forget stateid
      			file closed.
      
      15) Receive reply (1).
      16) New stateid seqid=1
          is really the same
          stateid that was
          closed.
      
      IOW: the reply to the first OPEN is delayed. Since "Process 2" does
      not wait before closing the file and does not cache the closed
      stateid, the delayed reply, when it is finally received, is treated
      by the client as setting up a new stateid.
      
      The fix is to ensure that the client processes the OPEN and CLOSE calls
      in the same order in which the server processed them.
      
      This commit ensures that we examine the seqid of the stateid
      returned by OPEN. If it is a new stateid, we assume the seqid
      must be equal to the value 1, and that each state transition
      increments the seqid value by 1 (See RFC7530, Section 9.1.4.2,
      and RFC5661, Section 8.2.2).
      
      If the tracker sees that an OPEN returns with a seqid that is greater
      than the cached seqid + 1, then it sets a flag to ensure that the
      caller waits for the RPCs carrying the missing seqids to complete.
      
      Note that there can still be pathologies where the server crashes before
      it can even send us the missing seqids. Since the OPEN call is still
      holding a slot when it waits here, that could cause the recovery to
      stall forever. To avoid that, we time out after a 5 second wait.
      Reported-by: Benjamin Coddington <bcodding@redhat.com>
      Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
      Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
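      A small user-space sketch of the seqid check described above (names are
      illustrative, not the kernel's): a new stateid starts at seqid 1 and
      each state transition adds 1, so an OPEN reply whose seqid is more than
      cached + 1 means earlier replies are still outstanding and the caller
      must wait for them (the kernel bounds that wait to 5 seconds).

          #include <stdbool.h>
          #include <stdint.h>

          struct tracked_stateid {
                  uint32_t seqid;             /* last seqid we have processed */
                  bool     wait_for_gap;      /* replies arrived out of order */
          };

          static void track_open_reply(struct tracked_stateid *st,
                                       uint32_t reply_seqid)
          {
                  if (reply_seqid > st->seqid + 1) {
                          /* Missing seqids: caller should wait for the earlier RPCs. */
                          st->wait_for_gap = true;
                          return;
                  }
                  if (reply_seqid > st->seqid)
                          st->seqid = reply_seqid;
                  st->wait_for_gap = false;
          }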
    • fs, nfs: convert nfs_client.cl_count from atomic_t to refcount_t · 212bf41d
      Committed by Elena Reshetova
      atomic_t variables are currently used to implement reference
      counters with the following properties:
       - counter is initialized to 1 using atomic_set()
       - a resource is freed upon counter reaching zero
       - once counter reaches zero, its further
         increments aren't allowed
       - counter schema uses basic atomic operations
         (set, inc, inc_not_zero, dec_and_test, etc.)
      
      Such atomic variables should be converted to a newly provided
      refcount_t type and API that prevents accidental counter overflows
      and underflows. This is important since overflows and underflows
      can lead to use-after-free situations and be exploitable.
      
      The variable nfs_client.cl_count is used as a pure reference counter.
      Convert it to refcount_t and fix up the operations.
      Suggested-by: Kees Cook <keescook@chromium.org>
      Reviewed-by: David Windsor <dwindsor@gmail.com>
      Reviewed-by: Hans Liljestrand <ishkamiel@gmail.com>
      Signed-off-by: Elena Reshetova <elena.reshetova@intel.com>
      Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
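      The conversion follows a mechanical pattern; the refcount API calls below
      are the real kernel interface, but the surrounding structure and helpers
      are a simplified stand-in for struct nfs_client, not the actual NFS code:

          #include <linux/refcount.h>
          #include <linux/slab.h>

          /*
           * atomic_set(&c, 1)       -> refcount_set(&c, 1)
           * atomic_inc(&c)          -> refcount_inc(&c)
           * atomic_inc_not_zero(&c) -> refcount_inc_not_zero(&c)
           * atomic_dec_and_test(&c) -> refcount_dec_and_test(&c)
           *
           * refcount_t saturates instead of wrapping, so an overflowed counter
           * can no longer be walked back to zero and freed while still in use.
           */
          struct client_like {
                  refcount_t cl_count;
          };

          static void client_init(struct client_like *clp)
          {
                  refcount_set(&clp->cl_count, 1);   /* initial reference */
          }

          static void client_get(struct client_like *clp)
          {
                  refcount_inc(&clp->cl_count);
          }

          static void client_put(struct client_like *clp)
          {
                  if (refcount_dec_and_test(&clp->cl_count))
                          kfree(clp);                /* last reference dropped */
          }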
    • fs, nfs: convert nfs4_lock_state.ls_count from atomic_t to refcount_t · 194bc1f4
      Committed by Elena Reshetova
      atomic_t variables are currently used to implement reference
      counters with the following properties:
       - counter is initialized to 1 using atomic_set()
       - a resource is freed upon counter reaching zero
       - once counter reaches zero, its further
         increments aren't allowed
       - counter schema uses basic atomic operations
         (set, inc, inc_not_zero, dec_and_test, etc.)
      
      Such atomic variables should be converted to a newly provided
      refcount_t type and API that prevents accidental counter overflows
      and underflows. This is important since overflows and underflows
      can lead to use-after-free situations and be exploitable.
      
      The variable nfs4_lock_state.ls_count is used as a pure reference counter.
      Convert it to refcount_t and fix up the operations.
      Suggested-by: Kees Cook <keescook@chromium.org>
      Reviewed-by: David Windsor <dwindsor@gmail.com>
      Reviewed-by: Hans Liljestrand <ishkamiel@gmail.com>
      Signed-off-by: Elena Reshetova <elena.reshetova@intel.com>
      Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
    • NFSv4.1: Fix up replays of interrupted requests · 3be0f80b
      Committed by Trond Myklebust
      If the previous request on a slot was interrupted before it was
      processed by the server, then our slot sequence number may be out of whack,
      and so we try the next operation using the old sequence number.
      
      The problem with this is that not all servers check that the client
      is replaying the same operations as before when they decide to serve
      from the replay cache, so instead of the expected
      NFS4ERR_SEQ_FALSE_RETRY error we get a replay of the old reply, which
      could (if the operations match up) be mistaken by the client for a
      new reply.
      
      To fix this, we attempt to send a COMPOUND containing only the SEQUENCE op
      in order to resync our slot sequence number.
      
      Cc: Olga Kornievskaia <olga.kornievskaia@gmail.com>
      [olga.kornievskaia@gmail.com: fix an Oops]
      Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
      Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
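      A conceptual sketch only; the types and helpers below are hypothetical,
      not the actual nfs4 implementation. The point is that a COMPOUND carrying
      nothing but a SEQUENCE op is safe to replay: a cached reply from the
      server's replay cache cannot be mistaken for the reply to a real
      operation, and a successful round trip leaves the slot's sequence number
      back in sync.

          struct resync_slot {                       /* hypothetical slot state */
                  unsigned int seq_nr;
                  int          interrupted;          /* previous request interrupted */
          };

          /* Hypothetical helper: transmits SEQUENCE(seq_nr) with no other ops. */
          int send_sequence_only_compound(struct resync_slot *slot);

          static int resync_interrupted_slot(struct resync_slot *slot)
          {
                  int status;

                  if (!slot->interrupted)
                          return 0;

                  status = send_sequence_only_compound(slot);
                  if (status == 0)
                          slot->interrupted = 0;     /* slot usable again */
                  return status;
          }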
  2. 17 October 2017, 1 commit
  3. 05 October 2017, 1 commit
    • NFSv4/pnfs: Fix an infinite layoutget loop · e8fa33a6
      Committed by Trond Myklebust
      Since we can now use a lock stateid or a delegation stateid that
      differs from the context stateid, we need to change the test in
      nfs4_layoutget_handle_exception() to take this into account.
      
      This fixes an infinite layoutget loop in the NFS client whereby
      it keeps retrying the initial layoutget using the same broken
      stateid.
      
      Fixes: 70d2f7b1 ("pNFS: Use the standard I/O stateid when...")
      Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
  4. 07 September 2017, 1 commit
  5. 14 August 2017, 2 commits
  6. 10 August 2017, 1 commit
  7. 07 August 2017, 3 commits
  8. 02 August 2017, 2 commits
  9. 29 July 2017, 1 commit
    • NFSv4.1: Fix a race where CB_NOTIFY_LOCK fails to wake a waiter · b7dbcc0e
      Committed by Benjamin Coddington
      nfs4_retry_setlk() sets the task's state to TASK_INTERRUPTIBLE within the
      same region protected by the wait_queue's lock, after checking for a
      notification from the CB_NOTIFY_LOCK callback.  However, after releasing
      that lock, a wakeup for that task may race in before the call to
      freezable_schedule_timeout_interruptible() and set TASK_WAKING; then
      freezable_schedule_timeout_interruptible() will set the state back to
      TASK_INTERRUPTIBLE before the task sleeps.  The result is that the task
      sleeps for the entire duration of the timeout.
      
      Since we've already set TASK_INTERRUPTIBLE in the locked section, just use
      freezable_schedule_timeout() instead.
      
      Fixes: a1d617d8 ("nfs: allow blocking locks to be awoken by lock callbacks")
      Signed-off-by: Benjamin Coddington <bcodding@redhat.com>
      Reviewed-by: Jeff Layton <jlayton@redhat.com>
      Cc: stable@vger.kernel.org # v4.9+
      Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
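      A simplified sketch of the resulting wait pattern (the wait-queue and
      freezer calls are the real kernel API, but this is not the exact function
      from fs/nfs/nfs4proc.c and the helper name is made up): the task state is
      set while the wait queue lock is held, so the plain
      freezable_schedule_timeout() must be used afterwards; the *_interruptible
      variant would overwrite a racing wakeup and sleep for the whole timeout.

          #include <linux/errno.h>
          #include <linux/freezer.h>
          #include <linux/sched.h>
          #include <linux/wait.h>

          static long wait_for_lock_notify(wait_queue_head_t *q,
                                           struct wait_queue_entry *wait,
                                           long timeout)
          {
                  long remaining;

                  spin_lock(&q->lock);
                  /* ... check for a pending CB_NOTIFY_LOCK notification ... */
                  __add_wait_queue(q, wait);
                  set_current_state(TASK_INTERRUPTIBLE);   /* still under q->lock */
                  spin_unlock(&q->lock);

                  /* Do not use the _interruptible variant: the state is already set. */
                  remaining = freezable_schedule_timeout(timeout);

                  finish_wait(q, wait);
                  return remaining ? 0 : -ETIMEDOUT;
          }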
  10. 27 July 2017, 1 commit
  11. 14 July 2017, 4 commits
  12. 28 June 2017, 1 commit
    • NFSv4.1: Fix a race in nfs4_proc_layoutget · bd171930
      Committed by Trond Myklebust
      If the task calling layoutget is signalled, then it is possible for the
      calls to nfs4_sequence_free_slot() and nfs4_layoutget_prepare() to race,
      in which case we leak a slot.
      The fix is to move the call to nfs4_sequence_free_slot() into
      nfs4_layoutget_release() so that it gets called at task teardown time.
      
      Fixes: 2e80dbe7 ("NFSv4.1: Close callback races for OPEN, LAYOUTGET...")
      Cc: stable@vger.kernel.org # v4.8+
      Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
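      A sketch of the pattern, with made-up names around the real struct
      rpc_call_ops interface (this is not the actual layoutget code): cleanup
      that must run exactly once, even when the task is signalled, belongs in
      the .rpc_release callback, which the RPC layer invokes at task teardown,
      rather than in the prepare/done paths that a signal can race with.

          #include <linux/sunrpc/sched.h>

          /* Stand-ins for the NFS pieces named in the commit message. */
          struct nfs4_sequence_res;
          void nfs4_sequence_free_slot(struct nfs4_sequence_res *res);

          struct my_layoutget_calldata {            /* hypothetical calldata */
                  struct nfs4_sequence_res *seq_res;
          };

          void my_layoutget_prepare(struct rpc_task *task, void *calldata);
          void my_layoutget_done(struct rpc_task *task, void *calldata);

          static void my_layoutget_release(void *calldata)
          {
                  struct my_layoutget_calldata *lgp = calldata;

                  nfs4_sequence_free_slot(lgp->seq_res);   /* slot freed exactly once */
                  /* ... drop references, free calldata ... */
          }

          static const struct rpc_call_ops my_layoutget_call_ops = {
                  .rpc_call_prepare = my_layoutget_prepare,
                  .rpc_call_done    = my_layoutget_done,
                  .rpc_release      = my_layoutget_release,
          };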
  13. 20 June 2017, 1 commit
    • sched/wait: Rename wait_queue_t => wait_queue_entry_t · ac6424b9
      Committed by Ingo Molnar
      Rename:
      
      	wait_queue_t		=>	wait_queue_entry_t
      
      'wait_queue_t' was always a slight misnomer: its name implies that it's a "queue",
      but in reality it's a queue *entry*. The 'real' queue is the wait queue head,
      which had to carry the name.
      
      Start sorting this out by renaming it to 'wait_queue_entry_t'.
      
      This also allows the real structure name 'struct __wait_queue' to
      lose its double underscore and become 'struct wait_queue_entry',
      which is the more canonical nomenclature for such data types.
      
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
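      After the rename, a typical open-coded wait looks like the sketch below
      (the wait and scheduler calls are the real kernel API; the surrounding
      function and condition are made up for illustration):

          #include <linux/sched.h>
          #include <linux/wait.h>

          static void wait_for_condition(wait_queue_head_t *head, int *done)
          {
                  /* DEFINE_WAIT() now declares a struct wait_queue_entry,
                   * formerly struct __wait_queue / wait_queue_t. */
                  DEFINE_WAIT(wait);

                  while (!*done) {
                          prepare_to_wait(head, &wait, TASK_INTERRUPTIBLE);
                          if (*done)
                                  break;
                          schedule();
                  }
                  finish_wait(head, &wait);
          }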
  14. 06 June 2017, 1 commit
  15. 10 May 2017, 1 commit
  16. 06 May 2017, 2 commits
  17. 29 April 2017, 2 commits
  18. 21 April 2017, 8 commits
  19. 01 April 2017, 1 commit