1. 04 March 2015, 2 commits
  2. 24 January 2015, 2 commits
  3. 17 January 2015, 3 commits
  4. 29 September 2014, 2 commits
    • NFSv4: fix open/lock state recovery error handling · df817ba3
      Trond Myklebust authored
      The current open/lock state recovery unfortunately does not handle errors
      such as NFS4ERR_CONN_NOT_BOUND_TO_SESSION correctly. Instead of looping,
      it simply proceeds as if the state manager has finished recovering.
      This patch ensures that we loop back, handle higher priority errors
      and complete the open/lock state recovery.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
      df817ba3
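      A minimal sketch of the loop-back behaviour this commit describes,
      assuming a simplified state-manager loop. The state bits are real
      nfs_client flags, but the helper names and control flow are
      illustrative, not the exact upstream diff:

          for (;;) {
                  if (test_and_clear_bit(NFS4CLNT_RECLAIM_REBOOT, &clp->cl_state)) {
                          int status = nfs4_do_reclaim(clp, cred);  /* reclaim opens/locks */

                          if (status < 0) {
                                  /* e.g. NFS4ERR_CONN_NOT_BOUND_TO_SESSION: feed the
                                   * error back and restart the loop instead of acting
                                   * as if recovery had completed. */
                                  nfs4_recovery_handle_error(clp, status);
                                  continue;
                          }
                  }
                  if (!test_bit(NFS4CLNT_MANAGER_RUNNING, &clp->cl_state))
                          break;          /* nothing left to recover */
          }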
    • NFSv4: Fix lock recovery when CREATE_SESSION/SETCLIENTID_CONFIRM fails · a4339b7b
      Trond Myklebust authored
      If an NFSv4.x server returns NFS4ERR_STALE_CLIENTID in response to a
      CREATE_SESSION or SETCLIENTID_CONFIRM in order to tell us that it rebooted
      a second time, then the client will currently take this to mean that it must
      declare all locks to be stale, and hence ineligible for reboot recovery.
      
      RFC3530 and RFC5661 both suggest that the client should instead rely on the
      server to respond to ineligible open share, lock and delegation reclaim
      requests with NFS4ERR_NO_GRACE in this situation.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
      a4339b7b
  5. 25 September 2014, 1 commit
  6. 09 September 2014, 1 commit
    • nfs: revert "nfs4: queue free_lock_state job submission to nfsiod" · 0c0e0d3c
      Jeff Layton authored
      This reverts commit 49a4bda2.
      
      Christoph reported an oops due to the above commit:
      
      generic/089 242s ...[ 2187.041239] general protection fault: 0000 [#1]
      SMP
      [ 2187.042899] Modules linked in:
      [ 2187.044000] CPU: 0 PID: 11913 Comm: kworker/0:1 Not tainted 3.16.0-rc6+ #1151
      [ 2187.044287] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2007
      [ 2187.044287] Workqueue: nfsiod free_lock_state_work
      [ 2187.044287] task: ffff880072b50cd0 ti: ffff88007a4ec000 task.ti: ffff88007a4ec000
      [ 2187.044287] RIP: 0010:[<ffffffff81361ca6>]  [<ffffffff81361ca6>] free_lock_state_work+0x16/0x30
      [ 2187.044287] RSP: 0018:ffff88007a4efd58  EFLAGS: 00010296
      [ 2187.044287] RAX: 6b6b6b6b6b6b6b6b RBX: ffff88007a947ac0 RCX: 8000000000000000
      [ 2187.044287] RDX: ffffffff826af9e0 RSI: ffff88007b093c00 RDI: ffff88007b093db8
      [ 2187.044287] RBP: ffff88007a4efd58 R08: ffffffff832d3e10 R09: 000001c40efc0000
      [ 2187.044287] R10: 0000000000000000 R11: 0000000000059e30 R12: ffff88007fc13240
      [ 2187.044287] R13: ffff88007fc18b00 R14: ffff88007b093db8 R15: 0000000000000000
      [ 2187.044287] FS:  0000000000000000(0000) GS:ffff88007fc00000(0000) knlGS:0000000000000000
      [ 2187.044287] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
      [ 2187.044287] CR2: 00007f93ec33fb80 CR3: 0000000079dc2000 CR4: 00000000000006f0
      [ 2187.044287] Stack:
      [ 2187.044287]  ffff88007a4efdd8 ffffffff810cc877 ffffffff810cc80d ffff88007fc13258
      [ 2187.044287]  000000007a947af0 0000000000000000 ffffffff8353ccc8 ffffffff82b6f3d0
      [ 2187.044287]  0000000000000000 ffffffff82267679 ffff88007a4efdd8 ffff88007fc13240
      [ 2187.044287] Call Trace:
      [ 2187.044287]  [<ffffffff810cc877>] process_one_work+0x1c7/0x490
      [ 2187.044287]  [<ffffffff810cc80d>] ? process_one_work+0x15d/0x490
      [ 2187.044287]  [<ffffffff810cd569>] worker_thread+0x119/0x4f0
      [ 2187.044287]  [<ffffffff810fbbad>] ? trace_hardirqs_on+0xd/0x10
      [ 2187.044287]  [<ffffffff810cd450>] ? init_pwq+0x190/0x190
      [ 2187.044287]  [<ffffffff810d3c6f>] kthread+0xdf/0x100
      [ 2187.044287]  [<ffffffff810d3b90>] ? __init_kthread_worker+0x70/0x70
      [ 2187.044287]  [<ffffffff81d9873c>] ret_from_fork+0x7c/0xb0
      [ 2187.044287]  [<ffffffff810d3b90>] ? __init_kthread_worker+0x70/0x70
      [ 2187.044287] Code: 0f 1f 44 00 00 31 c0 5d c3 66 66 66 2e 0f 1f 84 00 00 00 00 00 55 48 8d b7 48 fe ff ff 48 8b 87 58 fe ff ff 48 89 e5 48 8b 40 30 <48> 8b 00 48 8b 10 48 89 c7 48 8b 92 90 03 00 00 ff 52 28 5d c3
      [ 2187.044287] RIP  [<ffffffff81361ca6>] free_lock_state_work+0x16/0x30
      [ 2187.044287]  RSP <ffff88007a4efd58>
      [ 2187.103626] ---[ end trace 0f11326d28e5d8fa ]---
      
      The original reason for this patch was that the fl_release_private
      operation couldn't sleep. With commit ed9814d8 (locks: defer freeing
      locks in locks_delete_lock until after i_lock has been dropped), this is
      no longer a problem, so we can revert this patch.
      Reported-by: Christoph Hellwig <hch@infradead.org>
      Signed-off-by: Jeff Layton <jlayton@primarydata.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Tested-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
      0c0e0d3c
  7. 16 July 2014, 1 commit
    • sched: Remove proliferation of wait_on_bit() action functions · 74316201
      NeilBrown authored
      The current "wait_on_bit" interface requires an 'action'
      function to be provided which does the actual waiting.
      There are over 20 such functions, many of them identical.
      Most cases can be satisfied by one of just two functions, one
      which uses io_schedule() and one which just uses schedule().
      
      So:
       Rename wait_on_bit and        wait_on_bit_lock to
              wait_on_bit_action and wait_on_bit_lock_action
       to make it explicit that they need an action function.
      
       Introduce new wait_on_bit{,_lock} and wait_on_bit{,_lock}_io
       which are *not* given an action function but implicitly use
       a standard one.
       The decision to error-out if a signal is pending is now made
       based on the 'mode' argument rather than being encoded in the action
       function.
      
       All instances of the old wait_on_bit and wait_on_bit_lock which
       can use the new version have been changed accordingly and their
       action functions have been discarded.
       wait_on_bit{_lock} does not return any specific error code in the
       event of a signal so the caller must check for non-zero and
       interpolate their own error code as appropriate.
      
      The wait_on_bit() call in __fscache_wait_on_invalidate() was
      ambiguous as it specified TASK_UNINTERRUPTIBLE but used
      fscache_wait_bit_interruptible as an action function.
      David Howells confirms this should be uniformly
      "uninterruptible"
      
      The main remaining user of wait_on_bit{,_lock}_action is NFS
      which needs to use a freezer-aware schedule() call.
      
      A comment in fs/gfs2/glock.c notes that having multiple 'action'
      functions is useful as they display differently in the 'wchan'
      field of 'ps' (and in /proc/$PID/wchan).
      As the new bit_wait{,_io} functions are tagged "__sched", they
      will not show up at all, but something higher in the stack.  So
      the distinction will still be visible, only with different
      function names (gfs2_glock_wait versus gfs2_glock_dq_wait in the
      gfs2/glock.c case).
      
      Since the first version of this patch (against 3.15), two new action
      functions have appeared, one in NFS and one in CIFS.  CIFS also now
      uses an action function that makes the same freezer-aware
      schedule call as NFS.
      Signed-off-by: NeilBrown <neilb@suse.de>
      Acked-by: David Howells <dhowells@redhat.com> (fscache, keys)
      Acked-by: Steven Whitehouse <swhiteho@redhat.com> (gfs2)
      Acked-by: Peter Zijlstra <peterz@infradead.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Steve French <sfrench@samba.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Link: http://lkml.kernel.org/r/20140707051603.28027.72349.stgit@notabene.brown
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      74316201
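      A sketch of the interface change described above, using a hypothetical
      flag word and bit. The before/after call shapes follow the commit text;
      the surrounding function is illustrative only:

          #include <linux/wait.h>
          #include <linux/sched.h>

          static unsigned long my_flags;          /* hypothetical flag word */
          #define MY_PENDING_BIT 0

          /* Old interface: every caller supplied an 'action' callback that
           * did the sleeping, e.g.:
           *
           *     static int my_wait_action(void *word)
           *     {
           *             schedule();
           *             return signal_pending(current) ? -EINTR : 0;
           *     }
           *     err = wait_on_bit(&my_flags, MY_PENDING_BIT,
           *                       my_wait_action, TASK_INTERRUPTIBLE);
           */

          /* New interface: no action function; whether a pending signal aborts
           * the wait is decided by the mode argument alone. */
          static int wait_for_my_bit(void)
          {
                  int err = wait_on_bit(&my_flags, MY_PENDING_BIT,
                                        TASK_INTERRUPTIBLE);

                  /* No specific error code is promised on interruption, so the
                   * caller maps non-zero to whatever suits it. */
                  return err ? -ERESTARTSYS : 0;
          }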
  8. 13 July 2014, 2 commits
    • nfs4: queue free_lock_state job submission to nfsiod · 49a4bda2
      Jeff Layton authored
      We got a report of the following warning in Fedora:
      
      BUG: sleeping function called from invalid context at mm/slub.c:969
      in_atomic(): 1, irqs_disabled(): 0, pid: 533, name: bash
      3 locks held by bash/533:
       #0:  (&sp->so_delegreturn_mutex){+.+...}, at: [<ffffffffa033da62>] nfs4_proc_lock+0x262/0x910 [nfsv4]
       #1:  (&nfsi->rwsem){.+.+.+}, at: [<ffffffffa033da6a>] nfs4_proc_lock+0x26a/0x910 [nfsv4]
       #2:  (&sb->s_type->i_lock_key#23){+.+...}, at: [<ffffffff812998dc>] flock_lock_file_wait+0x8c/0x3a0
      CPU: 0 PID: 533 Comm: bash Not tainted 3.15.0-0.rc1.git1.1.fc21.x86_64 #1
      Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
       0000000000000000 00000000d664ff3c ffff880078b69a70 ffffffff817e82e0
       0000000000000000 ffff880078b69a98 ffffffff810cf1a4 0000000000000050
       0000000000000050 ffff88007cc01a00 ffff880078b69ad8 ffffffff8121449e
      Call Trace:
       [<ffffffff817e82e0>] dump_stack+0x4d/0x66
       [<ffffffff810cf1a4>] __might_sleep+0x184/0x240
       [<ffffffff8121449e>] kmem_cache_alloc_trace+0x4e/0x330
       [<ffffffffa0331124>] ? nfs4_release_lockowner+0x74/0x110 [nfsv4]
       [<ffffffffa0331124>] nfs4_release_lockowner+0x74/0x110 [nfsv4]
       [<ffffffffa0352340>] nfs4_put_lock_state+0x90/0xb0 [nfsv4]
       [<ffffffffa0352375>] nfs4_fl_release_lock+0x15/0x20 [nfsv4]
       [<ffffffff81297515>] locks_free_lock+0x45/0x90
       [<ffffffff8129996c>] flock_lock_file_wait+0x11c/0x3a0
       [<ffffffffa033da6a>] ? nfs4_proc_lock+0x26a/0x910 [nfsv4]
       [<ffffffffa033301e>] do_vfs_lock+0x1e/0x30 [nfsv4]
       [<ffffffffa033da79>] nfs4_proc_lock+0x279/0x910 [nfsv4]
       [<ffffffff810dbb26>] ? local_clock+0x16/0x30
       [<ffffffff810f5a3f>] ? lock_release_holdtime.part.28+0xf/0x200
       [<ffffffffa02f820c>] do_unlk+0x8c/0xc0 [nfs]
       [<ffffffffa02f85c5>] nfs_flock+0xa5/0xf0 [nfs]
       [<ffffffff8129a6f6>] locks_remove_file+0xb6/0x1e0
       [<ffffffff812159d8>] ? kfree+0xd8/0x2d0
       [<ffffffff8123bc63>] __fput+0xd3/0x210
       [<ffffffff8123bdee>] ____fput+0xe/0x10
       [<ffffffff810bfb6d>] task_work_run+0xcd/0xf0
       [<ffffffff81019cd1>] do_notify_resume+0x61/0x90
       [<ffffffff817fbea2>] int_signal+0x12/0x17
      
      The problem is that NFSv4 is trying to do an allocation from
      fl_release_private (in order to send a RELEASE_LOCKOWNER call). That
      function can be called while holding the inode->i_lock, and it's
      currently set up to do __GFP_WAIT allocations. v4.1 code has a
      similar problem.
      
      This patch adds a work_struct to the nfs4_lock_state and has the code
      queue the free_lock_state operation to nfsiod.
      Reported-by: Josh Stone <jistone@redhat.com>
      Signed-off-by: Jeff Layton <jlayton@poochiereds.net>
      Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
      49a4bda2
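      A sketch of the deferral pattern this commit describes: embed a
      work_struct in the lock state and push the sleeping RELEASE_LOCKOWNER
      work to the nfsiod workqueue instead of doing it under inode->i_lock.
      The field and helper names are approximations, not the verbatim patch:

          #include <linux/workqueue.h>

          struct nfs4_lock_state {
                  /* ... existing fields ... */
                  struct work_struct ls_release;  /* assumed field name */
          };

          static void free_lock_state_work(struct work_struct *work)
          {
                  struct nfs4_lock_state *lsp =
                          container_of(work, struct nfs4_lock_state, ls_release);

                  /* Runs in nfsiod process context, so a sleeping allocation
                   * for the RELEASE_LOCKOWNER RPC is safe here. */
                  /* ... build and send the RELEASE_LOCKOWNER call for lsp ... */
          }

          /* Called from ->fl_release_private, possibly under inode->i_lock: */
          static void defer_free_lock_state(struct nfs4_lock_state *lsp)
          {
                  INIT_WORK(&lsp->ls_release, free_lock_state_work);
                  queue_work(nfsiod_workqueue, &lsp->ls_release);
          }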
    • nfs4: treat lock owners as opaque values · 8003d3c4
      Jeff Layton authored
      Do the following set of ops with a file on a NFSv4 mount:
      
          exec 3>>/file/on/nfsv4
          flock -x 3
          exec 3>&-
      
      You'll see the LOCK request go across the wire, but no LOCKU when the
      file is closed.
      
      What happens is that the fd is passed across a fork, and the final close
      is done in a different process than the opener. That makes
      __nfs4_find_lock_state miss finding the correct lock state because it
      uses the fl_pid as a search key. A new one is created, and the locking
      code treats it as a delegation stateid (because NFS_LOCK_INITIALIZED
      isn't set).
      
      The root cause of this breakage seems to be commit 77041ed9
      (NFSv4: Ensure the lockowners are labelled using the fl_owner and/or
      fl_pid).
      
      That changed it so that flock lockowners are allocated based on the
      fl_pid. I think this is incorrect. flock locks should be "owned" by the
      struct file, and that is already accounted for in the fl_owner field of
      the lock request when it comes through nfs_flock.
      
      This patch basically reverts the above commit and with it, a LOCKU is
      sent in the above reproducer.
      Signed-off-by: Jeff Layton <jlayton@poochiereds.net>
      Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
      8003d3c4
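      A sketch of the lookup this commit fixes: key flock lock state on the
      opaque fl_owner (normally the struct file) rather than on fl_pid, so a
      close() issued by a different process than the opener still finds the
      same lock state. The structure here is a simplified stand-in:

          #include <linux/fs.h>
          #include <linux/list.h>

          struct my_lock_state {                  /* simplified nfs4_lock_state */
                  struct list_head ls_locks;
                  fl_owner_t       ls_owner;      /* opaque owner, e.g. the struct file */
          };

          static struct my_lock_state *
          find_lock_state(struct list_head *lock_states, fl_owner_t owner)
          {
                  struct my_lock_state *pos;

                  list_for_each_entry(pos, lock_states, ls_locks)
                          if (pos->ls_owner == owner)     /* compare opaque owners only */
                                  return pos;
                  return NULL;
          }

          /* caller: find_lock_state(&state->lock_states, request->fl_owner); */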
  9. 06 June 2014, 1 commit
  10. 18 April 2014, 1 commit
  11. 19 March 2014, 1 commit
  12. 06 March 2014, 1 commit
  13. 20 February 2014, 1 commit
  14. 19 February 2014, 1 commit
  15. 06 January 2014, 1 commit
  16. 15 November 2013, 1 commit
  17. 14 November 2013, 1 commit
    • nfs: don't retry detect_trunking with RPC_AUTH_UNIX more than once · 6d769f1e
      Jeff Layton authored
      Currently, when we try to mount and get back NFS4ERR_CLID_INUSE or
      NFS4ERR_WRONGSEC, we create a new rpc_clnt and then try the call again.
      There is no guarantee that doing so will work, however, so we can end up
      retrying the call in an infinite loop.
      
      Worse yet, we create the new client using rpc_clone_client_set_auth,
      which creates the new client as a child of the old one. Thus, we can end
      up with a *very* long lineage of rpc_clnts. When we go to put all of the
      references to them, we can end up with a long call chain that can smash
      the stack as each rpc_free_client() call can recurse back into itself.
      
      This patch fixes this by simply ensuring that the SETCLIENTID call will
      only be retried in this situation if the last attempt did not use
      RPC_AUTH_UNIX.
      
      Note too that with this change, we don't need the (i > 2) check in the
      -EACCES case since we now have a more reliable test as to whether we
      should reattempt.
      
      Cc: stable@vger.kernel.org # v3.10+
      Cc: Chuck Lever <chuck.lever@oracle.com>
      Tested-by/Acked-by: Weston Andros Adamson <dros@netapp.com>
      Signed-off-by: Jeff Layton <jlayton@redhat.com>
      Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
      6d769f1e
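      A sketch of the retry rule this commit adds, as a fragment of the error
      handling around the trunking-detection SETCLIENTID; the retry label and
      exact structure are assumptions:

          switch (status) {
          case -NFS4ERR_CLID_INUSE:
          case -NFS4ERR_WRONGSEC:
                  /* If the previous attempt already used AUTH_UNIX, cloning yet
                   * another rpc_clnt will not help, so stop instead of looping
                   * (and instead of building a long parent/child clnt chain). */
                  if (clnt->cl_auth->au_flavor == RPC_AUTH_UNIX) {
                          status = -EPERM;
                          break;
                  }
                  clnt = rpc_clone_client_set_auth(clnt, RPC_AUTH_UNIX);
                  if (IS_ERR(clnt)) {
                          status = PTR_ERR(clnt);
                          break;
                  }
                  goto again;     /* assumed label: retry SETCLIENTID once more */
          }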
  18. 01 November 2013, 1 commit
  19. 29 October 2013, 5 commits
    • NFS: Fix possible endless state recovery wait · 0625c2dd
      Chuck Lever authored
      In nfs4_wait_clnt_recover(), hold a reference to the clp being
      waited on.  The state manager can reduce clp->cl_count to 1, in
      which case the nfs_put_client() in nfs4_run_state_manager() can
      free *clp before wait_on_bit() returns and allows
      nfs4_wait_clnt_recover() to run again.
      
      The behavior at that point is non-deterministic.  If the waited-on
      bit still happens to be zero, wait_on_bit() will wake the waiter as
      expected.  If the bit is set again (say, if the memory was poisoned
      when freed) wait_on_bit() can leave the waiter asleep.
      
      This is a narrow fix which ensures the safety of accessing *clp in
      nfs4_wait_clnt_recover(), but does not address the continued use
      of a possibly freed *clp after nfs4_wait_clnt_recover() returns
      (see nfs_end_delegation_return(), for example).
      Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
      Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
      0625c2dd
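      A sketch approximating the narrow fix described above: pin the
      nfs_client across the sleep so the state manager's final
      nfs_put_client() cannot free it while we wait (this follows the
      3.12-era wait_on_bit() calling convention with an action function):

          int nfs4_wait_clnt_recover(struct nfs_client *clp)
          {
                  int res;

                  might_sleep();

                  atomic_inc(&clp->cl_count);     /* hold a reference while waiting */
                  res = wait_on_bit(&clp->cl_state, NFS4CLNT_MANAGER_RUNNING,
                                    nfs_wait_bit_killable, TASK_KILLABLE);
                  if (res == 0 && clp->cl_cons_state < 0)
                          res = clp->cl_cons_state;
                  nfs_put_client(clp);            /* drop the reference we took */
                  return res;
          }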
    • NFS: Handle SEQ4_STATUS_LEASE_MOVED · d1c2331e
      Chuck Lever authored
      With the advent of NFSv4 sessions in NFSv4.1 and following, a "lease
      moved" condition is reported differently than it is in NFSv4.0.
      
      NFSv4 minor version 0 servers return an error status code,
      NFS4ERR_LEASE_MOVED, to signal that a lease has moved.  This error
      causes the whole compound operation to fail.  Normal compounds
      against this server continue to fail until the client performs
      migration recovery on the migrated share.
      
      Minor version 1 and later servers assert a bit flag in the reply to
      a compound's SEQUENCE operation to signal LEASE_MOVED.  This is not
      a fatal condition: operations against this server continue normally.
      The server asserts this flag until the client performs migration
      recovery on the migrated share.
      
      Note that servers MUST NOT return NFS4ERR_LEASE_MOVED to NFSv4
      clients not using NFSv4.0.
      
      After the server asserts any of the sr_status_flags in the SEQUENCE
      operation in a typical compound, our client initiates standard lease
      recovery.  For NFSv4.1+, a stand-alone SEQUENCE operation is
      performed to discover what recovery is needed.
      
      If SEQ4_STATUS_LEASE_MOVED is asserted in this stand-alone SEQUENCE
      operation, our client attempts to discover which FSIDs have been
      migrated, and then performs migration recovery on each.
      Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
      Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
      d1c2331e
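      In outline, the NFSv4.1+ path described above amounts to checking the
      lease-moved bit in the SEQUENCE reply and kicking the state manager;
      the scheduling helper shown here is an assumption about naming:

          /* While processing a SEQUENCE reply's sr_status_flags: */
          if (flags & SEQ4_STATUS_LEASE_MOVED)
                  nfs4_schedule_lease_moved_recovery(clp);  /* assumed helper: set a
                                                             * recovery bit and wake the
                                                             * state manager thread */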
    • NFS: Support NFS4ERR_LEASE_MOVED recovery in state manager · b7f7a66e
      Chuck Lever authored
      A migration on the FSID in play for the current NFS operation
      is reported via the error status code NFS4ERR_MOVED.
      
      "Lease moved" means that a migration has occurred on some other
      FSID than the one for the current operation.  It's a signal that
      the client should take action immediately to handle a migration
      that it may not have noticed otherwise.  This is so that the
      client's lease does not expire unnoticed on the destination server.
      
      In NFSv4.0, a moved lease is reported with the NFS4ERR_LEASE_MOVED
      error status code.
      
      To recover from NFS4ERR_LEASE_MOVED, check each FSID for that server
      to see if it is still present.  Invoke nfs4_try_migration() if the
      FSID is no longer present on the server.
      Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
      Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
      b7f7a66e
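      A sketch of the recovery walk described above. nfs4_try_migration() is
      named by the commit message itself; the presence-test helper is an
      assumption, and locking of the server list is elided:

          struct nfs_server *server;

          /* For each nfs_server (FSID) mounted from this nfs_client, test
           * whether the FSID is still present; if not, recover its migration.
           * (Protection of cl_superblocks is omitted in this sketch.) */
          list_for_each_entry(server, &clp->cl_superblocks, client_link) {
                  if (!fsid_still_present(server))        /* assumed helper */
                          nfs4_try_migration(server, cred);
          }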
    • NFS: Add basic migration support to state manager thread · c9fdeb28
      Chuck Lever authored
      Migration recovery and state recovery must be serialized, so handle
      both in the state manager thread.
      Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
      Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
      c9fdeb28
    • NFSv4 Remove zeroing state kern warnings · 3660cd43
      Andy Adamson authored
      As of commit 5d422301 we no longer zero the
      state.
      Signed-off-by: Andy Adamson <andros@netapp.com>
      Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
      3660cd43
  20. 05 September 2013, 2 commits
    • NFSv4: Don't try to recover NFSv4 locks when they are lost. · ef1820f9
      NeilBrown authored
      When an NFSv4 client loses contact with the server it can lose any
      locks that it holds.
      
      Currently when it reconnects to the server it simply tries to reclaim
      those locks.  This might succeed even though some other client has
      held and released a lock in the meantime.  So the first client might
      think the file is unchanged, but it isn't.  This isn't good.
      
      If, when recovery happens, the locks cannot be claimed because some
      other client still holds the lock, then we get a message in the kernel
      logs, but the client can still write.  So two clients can both think
      they have a lock and can both write at the same time.  This is equally
      not good.
      
      There was a patch a while ago
        http://comments.gmane.org/gmane.linux.nfs/41917
      
      which tried to address some of this, but it didn't seem to go
      anywhere.  That patch would also send a signal to the process.  That
      might be useful but for now this patch just causes writes to fail.
      
      For NFSv4 (unlike v2/v3) there is a strong link between the lock and
      the write request, so we can fairly easily fail any IO if the lock is
      gone.  While some applications might not expect this, it is still
      safer than allowing the write to succeed.
      
      Because this is a fairly big change in behaviour, a module parameter,
      "recover_locks", is introduced which defaults to true (the current
      behaviour) but can be set to "false" to tell the client not to try to
      recover things that were lost.
      Signed-off-by: NeilBrown <neilb@suse.de>
      Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
      ef1820f9
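      A sketch of how such a module parameter is typically declared; the name
      and default follow the commit message ("recover_locks", default true),
      but the exact variable name and description text in the merged patch
      may differ:

          #include <linux/module.h>
          #include <linux/moduleparam.h>

          static bool recover_locks = true;
          module_param(recover_locks, bool, 0644);
          MODULE_PARM_DESC(recover_locks,
                           "Attempt to reclaim locks that were lost while the "
                           "server was unreachable (risking corruption if "
                           "another client modified the file in the meantime)");

      With 0644 permissions the setting can also be flipped at runtime through
      the module's parameters directory in sysfs, not only at module load time.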
    • NFS: Fix warning introduced by NFSv4.0 transport blocking patches · b6a85258
      Chuck Lever authored
      When CONFIG_NFS_V4_1 is not enabled, gcc emits this warning:
      
      linux/fs/nfs/nfs4state.c:255:12: warning:
       ‘nfs4_begin_drain_session’ defined but not used [-Wunused-function]
       static int nfs4_begin_drain_session(struct nfs_client *clp)
                  ^
      
      Eventually NFSv4.0 migration recovery will invoke this function, but
      that has not yet been merged.  Hide nfs4_begin_drain_session()
      behind CONFIG_NFS_V4_1 for now.
      Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
      Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
      b6a85258
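      In outline, the fix compiles the helper only when NFSv4.1 support (its
      only prospective caller for now) is enabled, with the body unchanged:

          #if defined(CONFIG_NFS_V4_1)
          static int nfs4_begin_drain_session(struct nfs_client *clp)
          {
                  /* ... body unchanged ... */
          }
          #endif /* CONFIG_NFS_V4_1 */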
  21. 04 September 2013, 2 commits
  22. 23 August 2013, 1 commit
    • NFS: remove incorrect "Lock reclaim failed!" warning. · 6686390b
      NeilBrown authored
      After reclaiming state that was lost, the NFS client tries to reclaim
      any locks, and then checks that each one has NFS_LOCK_INITIALIZED set
      (which means that the server has confirmed the lock).
      However, if the client holds a delegation, nfs_reclaim_locks() simply aborts
      (or, more accurately, it calls nfs_lock_reclaim() and that returns without
      doing anything).
      
      This is because when a delegation is held, the server doesn't need to
      know about locks.
      
      So if a delegation is held, NFS_LOCK_INITIALIZED is not expected, and
      its absence is certainly not an error.
      
      So don't print the warnings if NFS_DELEGATED_STATE is set.
      Signed-off-by: NeilBrown <neilb@suse.de>
      Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
      6686390b
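      A sketch of the check described above; the reclaim loop is simplified
      and the warning text is illustrative:

          /* Only warn about unconfirmed locks when no delegation is held;
           * with a delegation the server was never told about the locks,
           * so NFS_LOCK_INITIALIZED is legitimately absent. */
          if (!test_bit(NFS_DELEGATED_STATE, &state->flags)) {
                  struct nfs4_lock_state *lock;

                  list_for_each_entry(lock, &state->lock_states, ls_locks) {
                          if (!test_bit(NFS_LOCK_INITIALIZED, &lock->ls_flags))
                                  pr_warn_ratelimited("NFS: %s: Lock reclaim failed!\n",
                                                      __func__);
                  }
          }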
  23. 08 August 2013, 2 commits
    • NFS: Never use user credentials for lease renewal · 73d8bde5
      Chuck Lever authored
      Never try to use a non-UID 0 user credential for lease management,
      as that credential can change out from under us.  The server will
      block NFSv4 lease recovery with NFS4ERR_CLID_INUSE.
      
      Since the mechanism to acquire a credential for lease management
      is now the same for all minor versions, replace the minor version-
      specific callout with a single function.
      Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
      Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
      73d8bde5
    • NFS: Use root's credential for lease management when keytab is missing · d688f7b8
      Chuck Lever authored
      Commit 05f4c350 "NFS: Discover NFSv4 server trunking when mounting"
      Fri Sep 14 17:24:32 2012 introduced Uniform Client String support,
      which forces our NFS client to establish a client ID immediately
      during a mount operation rather than waiting until a user wants to
      open a file.
      
      Normally machine credentials (e.g. from a keytab) are used to perform
      a mount operation that is protected by Kerberos.  Before 05f4c350,
      SETCLIENTID used a machine credential, or fell back to a regular
      user's credential if no keytab was available.
      
      On clients that don't have a keytab, performing SETCLIENTID early
      means there's no user credential to fall back on, since no regular
      user has kinit'd yet.  05f4c350 seems to have broken the ability
      to mount with sec=krb5 on clients that don't have a keytab in
      kernels 3.7 - 3.10.
      
      To address this regression, commit 4edaa308 (NFS: Use "krb5i" to
      establish NFSv4 state whenever possible), Sat Mar 16 15:56:20 2013,
      was merged in 3.10.  This commit forces the NFS client to fall back
      to AUTH_SYS for lease management operations if no keytab is
      available.
      
      Neil Brown noticed that, since root is required to kinit to do a
      sec=krb5 mount when a client doesn't have a keytab, we can try to
      use root's Kerberos credential before AUTH_SYS.
      
      Now, when determining a principal and flavor to use for lease
      management, the NFS client tries in this order:
      
        1.  Flavor: AUTH_GSS, krb5i
            Principal: service principal (via keytab)
      
        2.  Flavor: AUTH_GSS, krb5i
            Principal: user principal established for UID 0 (via kinit)
      
        3.  Flavor: AUTH_SYS
            Principal: UID 0 / GID 0
      Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
      Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
      d688f7b8
  24. 24 July 2013, 1 commit
  25. 04 July 2013, 1 commit
  26. 29 June 2013, 1 commit
    • locks: protect most of the file_lock handling with i_lock · 1c8c601a
      Jeff Layton authored
      Having a global lock that protects all of this code is a clear
      scalability problem. Instead of doing that, move most of the code to be
      protected by the i_lock instead. The exceptions are the global lists
      that the ->fl_link sits on, and the ->fl_block list.
      
      ->fl_link is what connects these structures to the
      global lists, so we must ensure that we hold those locks when iterating
      over or updating these lists.
      
      Furthermore, sound deadlock detection requires that we hold the
      blocked_list state steady while checking for loops. We also must ensure
      that the search and update to the list are atomic.
      
      For the checking and insertion side of the blocked_list, push the
      acquisition of the global lock into __posix_lock_file and ensure that
      checking and update of the blocked_list is done without dropping the
      lock in between.
      
      On the removal side, when waking up blocked lock waiters, take the
      global lock before walking the blocked list and dequeue the waiters from
      the global list prior to removal from the fl_block list.
      
      With this, deadlock detection should be race free while we minimize
      excessive file_lock_lock thrashing.
      
      Finally, in order to avoid a lock inversion problem when handling
      /proc/locks output we must ensure that manipulations of the fl_block
      list are also protected by the file_lock_lock.
      Signed-off-by: Jeff Layton <jlayton@redhat.com>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      1c8c601a
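      A sketch of the locking rule this commit establishes: per-inode
      file_lock state is walked and changed under inode->i_lock, and only the
      global ->fl_link / blocked-list manipulation still takes the global
      lock. The loop reflects the 3.10-era singly linked i_flock list:

          struct file_lock *fl;

          spin_lock(&inode->i_lock);
          for (fl = inode->i_flock; fl != NULL; fl = fl->fl_next) {
                  /* examine or update this inode's lock list; the global
                   * file_lock_lock is not needed here any more */
          }
          spin_unlock(&inode->i_lock);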
  27. 20 June 2013, 1 commit