1. 15 Sep, 2018 1 commit
    • NFSv4.1 fix infinite loop on I/O. · 994b15b9
      Committed by Trond Myklebust
      The previous fix broke recovery of delegated stateids because it assumes
      that if we did not mark the delegation as suspect, then the delegation has
      effectively been revoked, and so it removes that delegation irrespectively
      of whether or not it is valid and still in use. While this is "mostly
      harmless" for ordinary I/O, we've seen pNFS fail with LAYOUTGET spinning
      in an infinite loop while complaining that we're using an invalid stateid
      (in this case the all-zero stateid).
      
      What we rather want to do here is ensure that the delegation is always
      correctly marked as needing testing when that is the case. So we want
      to close the loophole offered by nfs4_schedule_stateid_recovery(),
      which marks the state as needing to be reclaimed, but not the
      delegation that may be backing it.
      
      Fixes: 0e3d3e5d ("NFSv4.1 fix infinite loop on IO BAD_STATEID error")
      Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
      Cc: stable@vger.kernel.org # v4.11+
      Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
      994b15b9
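The loophole described in this commit can be pictured with a small user-space model. This is only a sketch: `toy_state`, `toy_delegation` and `toy_schedule_stateid_recovery` are hypothetical names invented here, not the kernel's structures or the actual patch. It shows the intended invariant, namely that scheduling recovery for a state also flags the delegation backing it as needing a test.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the kernel structures. */
struct toy_delegation {
	bool needs_test;                /* ask the server whether it is still valid */
};

struct toy_state {
	bool needs_reclaim;             /* open state must be recovered */
	struct toy_delegation *deleg;   /* NULL when no delegation backs the state */
};

/*
 * Mark the state for recovery and, if a delegation backs it, mark that
 * delegation as needing testing too, so it is never silently dropped
 * while still valid.
 */
static void toy_schedule_stateid_recovery(struct toy_state *st)
{
	assert(st != NULL);
	st->needs_reclaim = true;
	if (st->deleg != NULL)
		st->deleg->needs_test = true;
}
```

In this model a later recovery pass can consult `needs_test` and keep a delegation that the server still honours, instead of assuming it was revoked.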
  2. 14 Aug, 2018 2 commits
  3. 09 Aug, 2018 1 commit
  4. 01 Jun, 2018 1 commit
  5. 11 Apr, 2018 1 commit
  6. 15 Jan, 2018 1 commit
    • NFSv4: always set NFS_LOCK_LOST when a lock is lost. · dce2630c
      Committed by NeilBrown
      There are 2 comments in the NFSv4 code which suggest that
      SIGLOST should possibly be sent to a process.  In these
      cases a lock has been lost.
      The current practice is to set NFS_LOCK_LOST so that
      read/write returns EIO when a lock is lost.
      So change these comments to code which sets NFS_LOCK_LOST.
      
      One case is when lock recovery after apparent server restart
      fails with NFS4ERR_DENIED, NFS4ERR_RECLAIM_BAD, or
      NFS4ERR_RECLAIM_CONFLICT.  The other case is when a lock
      attempt as part of lease recovery fails with NFS4ERR_DENIED.
      
      In an ideal world, these should not happen.  However I have
      a packet trace showing an NFSv4.1 session getting
      NFS4ERR_BADSESSION after an extended network partition.  The
      NFSv4.1 client treats this like server reboot until/unless
      it gets NFS4ERR_NO_GRACE, in which case it switches over to
      "nograce" recovery mode.  In this network trace, the client
      attempts to recover a lock and the server (incorrectly)
      reports NFS4ERR_DENIED rather than NFS4ERR_NO_GRACE.  This
      leads to the ineffective comment and the client then
      continues to write using the OPEN stateid.
      Signed-off-by: NeilBrown <neilb@suse.com>
      Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
      dce2630c
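The effect of setting NFS_LOCK_LOST can be sketched as follows. The names `TOY_LOCK_LOST`, `toy_lock_state`, `toy_lock_reclaim_failed` and `toy_io_check` are assumptions made for illustration and do not exist in the kernel. The model only shows the behaviour the message describes: once a reclaim fails the flag is set, and later reads or writes fail with EIO.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical flag bit, modelled on NFS_LOCK_LOST. */
#define TOY_LOCK_LOST (1UL << 0)

struct toy_lock_state {
	unsigned long flags;
};

/* Called when reclaiming the lock fails, e.g. with NFS4ERR_DENIED. */
static void toy_lock_reclaim_failed(struct toy_lock_state *ls)
{
	assert(ls != NULL);
	ls->flags |= TOY_LOCK_LOST;
}

/* I/O paths check the flag first: 0 means proceed, -EIO means the lock is gone. */
static int toy_io_check(const struct toy_lock_state *ls)
{
	assert(ls != NULL);
	return (ls->flags & TOY_LOCK_LOST) ? -EIO : 0;
}
```

Failing the I/O is what stops the client from continuing to write with the OPEN stateid after the lock has been lost.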
  7. 30 Nov, 2017 1 commit
  8. 18 Nov, 2017 7 commits
  9. 14 Jul, 2017 1 commit
    • NFSv4.1: Handle EXCHGID4_FLAG_CONFIRMED_R during NFSv4.1 migration · 8dcbec6d
      Committed by Chuck Lever
      Transparent State Migration copies a client's lease state from the
      server where a filesystem used to reside to the server where it now
      resides. When an NFSv4.1 client first contacts that destination
      server, it uses EXCHANGE_ID to detect trunking relationships.
      
      The lease that was copied there is returned to that client, but the
      destination server sets EXCHGID4_FLAG_CONFIRMED_R when replying to
      the client. This is because the lease was confirmed on the source
      server (before it was copied).
      
      Normally, when CONFIRMED_R is set, a client purges the lease and
      creates a new one. However, that throws away the entire benefit of
      Transparent State Migration.
      
      Therefore, the client must not purge that lease when it is possible
      that Transparent State Migration has occurred.
      Reported-by: Xuan Qi <xuan.qi@oracle.com>
      Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
      Tested-by: Xuan Qi <xuan.qi@oracle.com>
      Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
      8dcbec6d
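The decision this commit describes can be written as a small predicate. This is a sketch under assumptions: `toy_client`, its `migration_possible` field and `toy_should_purge_lease` are hypothetical, and how the real client decides that migration may have occurred is not stated in the message. The flag value is the one defined for EXCHGID4_FLAG_CONFIRMED_R in RFC 5661.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Flag value as defined in RFC 5661 for EXCHANGE_ID results. */
#define EXCHGID4_FLAG_CONFIRMED_R 0x80000000u

/* Hypothetical client record; migration_possible is an assumed input. */
struct toy_client {
	uint32_t exchange_flags;     /* flags returned by EXCHANGE_ID */
	bool migration_possible;     /* Transparent State Migration may have occurred */
};

/*
 * Purge the lease only when the server reports it as already confirmed
 * and the lease cannot have arrived through Transparent State Migration.
 */
static bool toy_should_purge_lease(const struct toy_client *clp)
{
	assert(clp != NULL);
	if (!(clp->exchange_flags & EXCHGID4_FLAG_CONFIRMED_R))
		return false;
	return !clp->migration_possible;
}
```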
  10. 28 Jun, 2017 1 commit
  11. 06 May, 2017 1 commit
  12. 31 Jan, 2017 1 commit
  13. 27 Jan, 2017 1 commit
  14. 14 Jan, 2017 1 commit
  15. 20 Dec, 2016 2 commits
    • NFS: Don't disconnect open-owner on NFS4ERR_BAD_SEQID · 86cfb041
      Committed by NeilBrown
      When an NFS4ERR_BAD_SEQID is received the open-owner is removed from
      the ->state_owners rbtree so that it will no longer be used.
      
      If any stateids attached to this open-owner are still in use, and if a
      request using one gets an NFS4ERR_BAD_STATEID reply, this can go bad.
      
      The state is marked as needing recovery and the nfs4_state_manager()
      is scheduled to clean up.  nfs4_state_manager() finds states to be
      recovered by walking the state_owners rbtree.  As the open-owner is
      not in the rbtree, the bad state is not found so nfs4_state_manager()
      completes having done nothing.  The request is then retried, with a
      predictable result (indefinite retries).
      
      If the stateid is for a delegation, this open_owner will be used
      to open files when the delegation is returned.  For that to work,
      a new open-owner needs to be presented to the server.
      
      This patch changes NFS4ERR_BAD_SEQID handling to leave the open-owner
      in the rbtree but updates the 'create_time' so it looks like a new
      open-owner.  With this the indefinite retries no longer happen.
      Signed-off-by: NeilBrown <neilb@suse.com>
      Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
      86cfb041
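The approach in this commit (keep the open-owner findable, refresh its 'create_time') can be modelled in user space roughly as below. An array stands in for the ->state_owners rbtree, and `toy_owner`, `toy_handle_bad_seqid` and `toy_recover_states` are hypothetical names for this sketch only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical open-owner; in_tree models membership in the state_owners rbtree. */
struct toy_owner {
	bool in_tree;
	uint64_t create_time;         /* part of the owner identity sent to the server */
	bool state_needs_recovery;
};

/* On BAD_SEQID: leave the owner in the tree, but make it look new to the server. */
static void toy_handle_bad_seqid(struct toy_owner *o, uint64_t now)
{
	assert(o != NULL);
	o->create_time = now;
	/* o->in_tree is deliberately left untouched. */
}

/* The state manager only sees owners that are still in the tree. */
static size_t toy_recover_states(struct toy_owner *owners, size_t n)
{
	size_t recovered = 0;

	for (size_t i = 0; i < n; i++) {
		if (owners[i].in_tree && owners[i].state_needs_recovery) {
			owners[i].state_needs_recovery = false;
			recovered++;
		}
	}
	return recovered;
}
```

Because the owner stays in the tree, the recovery walk finds the bad state instead of completing with nothing done, which is the source of the indefinite retries described above.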
    • NFSv4: ensure __nfs4_find_lock_state returns consistent result. · 3f8f2548
      Committed by NeilBrown
      If a file has both flock locks and OFD locks, then it is possible that
      two different nfs4 lock states could apply to file accesses from a
      single process.
      
      It is not possible to know, efficiently, which one is "correct".
      Presumably the state which represents a lock that covers the region
      undergoing IO would be the "correct" one to use, but finding that has
      a non-trivial cost and would provide minuscule value.
      
      Currently we just return whichever is first in the list, which could
      result in inconsistent behaviour if an application ever put itself in
      this position.  As consistent behaviour is preferable (when perfectly
      correct behaviour is not available), change the search to return a
      consistent result in this circumstance.
      Specifically: if there is both a flock and OFD lock state, always return
      the flock one.
      Reviewed-by: Jeff Layton <jlayton@redhat.com>
      Signed-off-by: NeilBrown <neilb@suse.com>
      Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
      3f8f2548
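A search that always prefers one owner over the other, as this commit describes, might look like the following. It is a simplified sketch over an array rather than the kernel's list, and `toy_lock_state` and `toy_find_lock_state` are hypothetical names. The caller passes the flock owner as `preferred` and the OFD owner as `fallback`.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical lock state; owner is an opaque key identifying who holds the lock. */
struct toy_lock_state {
	const void *owner;
};

/*
 * Return the entry owned by `preferred` if one exists, regardless of its
 * position; otherwise return an entry owned by `fallback`, or NULL.
 */
static const struct toy_lock_state *
toy_find_lock_state(const struct toy_lock_state *list, size_t n,
		    const void *preferred, const void *fallback)
{
	const struct toy_lock_state *ret = NULL;

	assert(list != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		if (list[i].owner == preferred)
			return &list[i];      /* preferred owner always wins */
		if (list[i].owner == fallback)
			ret = &list[i];       /* remember it, keep looking */
	}
	return ret;
}
```

The result no longer depends on list order, which is the consistency property the message asks for.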
  16. 05 Dec, 2016 1 commit
  17. 02 Dec, 2016 3 commits
  18. 19 Nov, 2016 1 commit
  19. 28 Sep, 2016 5 commits
  20. 06 Aug, 2016 1 commit
  21. 25 Jun, 2016 1 commit
    • nfs4: Fix potential use after free of state in nfs4_do_reclaim. · cea7f829
      Committed by Oleg Drokin
      Commit e8d975e7 ("fixing infinite OPEN loop in 4.0 stateid recovery")
      introduced access to state after it was just potentially freed by
      nfs4_put_open_state leading to a random data corruption somewhere.
      
      BUG: unable to handle kernel paging request at ffff88004941ee40
      IP: [<ffffffff813baf01>] nfs4_do_reclaim+0x461/0x740
      PGD 3501067 PUD 3504067 PMD 6ff37067 PTE 800000004941e060
      Oops: 0002 [#1] SMP DEBUG_PAGEALLOC
      Modules linked in: loop rpcsec_gss_krb5 acpi_cpufreq tpm_tis joydev i2c_piix4 pcspkr tpm virtio_console nfsd ttm drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops floppy serio_raw virtio_blk drm
      CPU: 6 PID: 2161 Comm: 192.168.10.253- Not tainted 4.7.0-rc1-vm-nfs+ #112
      Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
      task: ffff8800463dcd00 ti: ffff88003ff48000 task.ti: ffff88003ff48000
      RIP: 0010:[<ffffffff813baf01>]  [<ffffffff813baf01>] nfs4_do_reclaim+0x461/0x740
      RSP: 0018:ffff88003ff4bd68  EFLAGS: 00010246
      RAX: 0000000000000000 RBX: ffffffff81a49900 RCX: 00000000000000e8
      RDX: 00000000000000e8 RSI: ffff8800418b9930 RDI: ffff880040c96c88
      RBP: ffff88003ff4bdf8 R08: 0000000000000001 R09: 0000000000000000
      R10: 0000000000000000 R11: 0000000000000000 R12: ffff880040c96c98
      R13: ffff88004941ee20 R14: ffff88004941ee40 R15: ffff88004941ee00
      FS:  0000000000000000(0000) GS:ffff88006d000000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: ffff88004941ee40 CR3: 0000000060b0b000 CR4: 00000000000006e0
      Stack:
       ffffffff813baad5 ffff8800463dcd00 ffff880000000001 ffffffff810e6b68
       ffff880043ddbc88 ffff8800418b9800 ffff8800418b98c8 ffff88004941ee48
       ffff880040c96c90 ffff880040c96c00 ffff880040c96c20 ffff880040c96c40
      Call Trace:
       [<ffffffff813baad5>] ? nfs4_do_reclaim+0x35/0x740
       [<ffffffff810e6b68>] ? trace_hardirqs_on_caller+0x128/0x1b0
       [<ffffffff813bb7cd>] nfs4_run_state_manager+0x5ed/0xa40
       [<ffffffff813bb1e0>] ? nfs4_do_reclaim+0x740/0x740
       [<ffffffff813bb1e0>] ? nfs4_do_reclaim+0x740/0x740
       [<ffffffff810af0d1>] kthread+0x101/0x120
       [<ffffffff810e6b68>] ? trace_hardirqs_on_caller+0x128/0x1b0
       [<ffffffff818843af>] ret_from_fork+0x1f/0x40
       [<ffffffff810aefd0>] ? kthread_create_on_node+0x250/0x250
      Code: 65 80 4c 8b b5 78 ff ff ff e8 fc 88 4c 00 48 8b 7d 88 e8 13 67 d2 ff 49 8b 47 40 a8 02 0f 84 d3 01 00 00 4c 89 ff e8 7f f9 ff ff <f0> 41 80 26 7f 48 8b 7d c8 e8 b1 84 4c 00 e9 39 fd ff ff 3d e6
      RIP  [<ffffffff813baf01>] nfs4_do_reclaim+0x461/0x740
       RSP <ffff88003ff4bd68>
      CR2: ffff88004941ee40
      Signed-off-by: Oleg Drokin <green@linuxhacker.ru>
      Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
      Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
      cea7f829
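The class of bug reported here, touching an object after dropping what may be the last reference, is commonly avoided by finishing all accesses before the put. The sketch below illustrates that ordering with a toy reference-counted object. It is not the kernel patch, and `toy_state`, `toy_put_state` and `toy_finish_reclaim` are hypothetical names.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical flag bit standing in for a "needs reclaim" state flag. */
#define TOY_RECLAIM_FLAG (1UL << 0)

struct toy_state {
	int refcount;
	unsigned long flags;
};

static int toy_freed_count;   /* counts frees so the behaviour can be observed */

static struct toy_state *toy_alloc_state(void)
{
	struct toy_state *s = malloc(sizeof(*s));

	if (s != NULL) {
		s->refcount = 1;
		s->flags = TOY_RECLAIM_FLAG;
	}
	return s;
}

/* Drop one reference; the object is freed when the count reaches zero. */
static void toy_put_state(struct toy_state *s)
{
	assert(s != NULL && s->refcount > 0);
	if (--s->refcount == 0) {
		free(s);
		toy_freed_count++;
	}
}

/* Safe ordering: clear the flag first, then drop the reference. */
static void toy_finish_reclaim(struct toy_state *s)
{
	s->flags &= ~TOY_RECLAIM_FLAG;   /* last access to *s */
	toy_put_state(s);                /* s may be freed from here on */
}
```

Reversing the two statements in `toy_finish_reclaim` would write to freed memory whenever the put drops the last reference, which matches the kind of faulting write shown in the oops above.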
  22. 28 May, 2016 1 commit
    • nfs: fix anonymous member initializer build failure with older compilers · e0714ec4
      Committed by Linus Torvalds
      Older versions of gcc don't understand named initializers inside an
      anonymous structure or union member.  It can be worked around by adding
      the bracing in the initializer for the anonymous member.
      
      Without this, gcc 4.4.4 will fail the build with
      
          CC      fs/nfs/nfs4state.o
        fs/nfs/nfs4state.c:69: error: unknown field ‘data’ specified in initializer
        fs/nfs/nfs4state.c:69: warning: missing braces around initializer
        fs/nfs/nfs4state.c:69: warning: (near initialization for ‘zero_stateid.<anonymous>.data’)
        make[2]: *** [fs/nfs/nfs4state.o] Error 1
      
      introduced in commit 93b717fd ("NFSv4: Label stateids with the type")
      Reported-and-tested-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Cc: Anna Schumaker <Anna.Schumaker@netapp.com>
      Cc: Trond Myklebust <trond.myklebust@primarydata.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      e0714ec4
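The workaround can be illustrated with a stand-in type. `toy_stateid`, its layout and the constants below are assumptions for this sketch and are not the kernel's nfs4_stateid definition. The point is the extra pair of braces around the initializer for the anonymous union, which older gcc versions require before they accept the `.data` designator.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define TOY_STATEID_SIZE 16            /* hypothetical size */
#define TOY_SPECIAL_STATEID_TYPE 1     /* hypothetical type tag */

struct toy_stateid {
	union {                            /* anonymous union member */
		char data[TOY_STATEID_SIZE];
		struct {                       /* anonymous struct overlaying data */
			unsigned int seqid;
			char other[TOY_STATEID_SIZE - sizeof(unsigned int)];
		};
	};
	int type;
};

/*
 * Writing `.data = { 0 }` directly at the top level is what older gcc
 * rejects; bracing the anonymous member first avoids the problem.
 */
static const struct toy_stateid toy_zero_stateid = {
	{ .data = { 0 } },
	.type = TOY_SPECIAL_STATEID_TYPE,
};

/* True when every byte of the stateid payload is zero. */
static bool toy_stateid_is_zero(const struct toy_stateid *s)
{
	static const char zeros[TOY_STATEID_SIZE];

	assert(s != NULL);
	return memcmp(s->data, zeros, sizeof(zeros)) == 0;
}
```

Newer compilers accept both spellings, so the braced form is the portable one.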
  23. 18 May, 2016 2 commits
  24. 03 Oct, 2015 1 commit
    • NFSv4: Don't try to reclaim unused state owners · 4a0954ef
      Committed by Trond Myklebust
      Currently, we don't test if the state owner is in use before we try to
      recover it. The problem is that if the refcount is zero, then the
      state owner will be waiting on the lru list for garbage collection.
      The expectation in that case is that if you bump the refcount, then
      you must also remove the state owner from the lru list. Otherwise
      the call to nfs4_put_state_owner will corrupt that list by trying
      to add our state owner a second time.
      
      Avoid the whole problem by just skipping state owners that hold no
      state.
      Reported-by: Andrew W Elble <aweits@rit.edu>
      Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
      4a0954ef
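Skipping unused state owners can be sketched as "take a reference only if the count is already non-zero". The model below treats a zero reference count as "unused and parked on the LRU list", which is an assumption of this sketch. `toy_owner`, `toy_get_owner_if_in_use` and `toy_reclaim_owners` are hypothetical, and the counter is a plain int rather than an atomic.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical state owner; refcount == 0 means it is waiting for garbage collection. */
struct toy_owner {
	int refcount;
};

/* Take a reference only when the owner is already in use. */
static bool toy_get_owner_if_in_use(struct toy_owner *o)
{
	assert(o != NULL);
	if (o->refcount == 0)
		return false;   /* leave unused owners alone, and on the LRU list */
	o->refcount++;
	return true;
}

/* Walk all owners, recovering only those in use; returns how many were visited. */
static size_t toy_reclaim_owners(struct toy_owner *owners, size_t n)
{
	size_t visited = 0;

	for (size_t i = 0; i < n; i++) {
		if (!toy_get_owner_if_in_use(&owners[i]))
			continue;
		visited++;              /* recovery work would happen here */
		owners[i].refcount--;   /* drop the reference taken above */
	}
	return visited;
}
```

Since an unused owner is never bumped from zero, the later put never tries to add it to the LRU list a second time.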
  25. 18 Sep, 2015 1 commit