1. 12 Feb, 2013 5 commits
    • NFSv4: Fix a reboot recovery race when opening a file · c21443c2
      Trond Myklebust committed
      If the server reboots after it has replied to our OPEN, but before we
      call nfs4_opendata_to_nfs4_state(), then the reboot recovery thread
      will not see a stateid for this open, and so will fail to recover it.
      Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
    • NFSv4: Ensure delegation recall and byte range lock removal don't conflict · 65b62a29
      Trond Myklebust committed
      Add a mutex to the struct nfs4_state_owner to ensure that delegation
      recall doesn't conflict with byte range lock removal.
      
      Note that we nest the new mutex _outside_ the state manager reclaim
      protection (nfsi->rwsem) in order to avoid deadlocks.
      Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
    • NFSv4: Fix up the return values of nfs4_open_delegation_recall · 37380e42
      Trond Myklebust committed
      Adjust the return values so that they return EAGAIN to the caller in
      cases where we might want to retry the delegation recall after
      the state recovery has run.
      Note that we can't wait and retry in this routine, because the caller
      may be the state manager thread.
      
      If delegation recall fails due to a session or reboot related issue,
      also ensure that we mark the stateid as delegated so that
      nfs_delegation_claim_opens can find it again later.
      Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
    • NFSv4.1: Don't lose locks when a server reboots during delegation return · d25be546
      Trond Myklebust committed
      If the server reboots while we are converting a delegation into
      OPEN/LOCK stateids as part of a delegation return, the current code
      will simply exit with an error. This causes us to lose both
      delegation state and locking state (i.e. locking atomicity).
      
      Deal with this by exposing the delegation stateid during delegation
      return, so that we can recover the delegation, and then resume
      open/lock recovery.
      
      Note that not having to hold the nfs_inode->rwsem across the
      calls to nfs_delegation_claim_opens() also fixes a deadlock against
      the NFSv4.1 reboot recovery code.
      Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
    • NFSv4.1: Prevent deadlocks between state recovery and file locking · 9a99af49
      Trond Myklebust committed
      We currently have a deadlock in which the state recovery thread
      ends up blocking due to one of the locks which it is trying to
      recover holding the nfs_inode->rwsem.
      The situation is as follows: the state recovery thread is
      scheduled in order to recover from a reboot. It immediately
      drains the session, forcing all ordinary NFSv4.1 calls to
      nfs41_setup_sequence() to be put to sleep.  This includes the
      file locking process that holds the nfs_inode->rwsem.
      When the thread gets to nfs4_reclaim_locks(), it tries to
      grab a write lock on nfs_inode->rwsem, and boom...
      
      The fix is to have the locking code drop the nfs_inode->rwsem while it is
      doing RPC calls. We use a sequence lock in order to signal to
      the locking process whether or not a state recovery thread has
      run on that inode, in which case it should retry the lock.
      Reported-by: Andy Adamson <andros@netapp.com>
      Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
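The retry scheme described in the commit above can be illustrated in user space. The following is a minimal Python sketch, not the kernel code: `SeqCount`, `set_lock`, `send_lock_rpc`, and `max_attempts` are hypothetical names, and the counter is simplified (a kernel seqcount also distinguishes a write in progress, which this sketch does not model). The locker snapshots the counter, performs the RPC without holding the inode lock, and retries if a recovery pass bumped the counter in the meantime.

```python
import threading


class SeqCount:
    """Simplified sequence counter: recovery bumps it, lockers compare snapshots."""

    def __init__(self):
        self._value = 0
        self._guard = threading.Lock()

    def read_begin(self):
        # Snapshot the counter before starting the unlocked work.
        with self._guard:
            return self._value

    def read_retry(self, snapshot):
        # True if a recovery pass ran since the snapshot was taken.
        with self._guard:
            return self._value != snapshot

    def bump(self):
        # Called by the (simulated) state recovery pass.
        with self._guard:
            self._value += 1


def set_lock(seq, send_lock_rpc, max_attempts=5):
    """Hypothetical locker: do the RPC unlocked, retry if recovery ran meanwhile."""
    for attempt in range(1, max_attempts + 1):
        snapshot = seq.read_begin()
        result = send_lock_rpc()  # no inode lock is held across this call
        if not seq.read_retry(snapshot):
            return result, attempt
    raise RuntimeError("state recovery kept interrupting the lock request")
```

The attempt limit is an addition for the sketch only, so that the example always terminates; the commit message does not mention one.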
  2. 01 Feb, 2013 1 commit
  3. 04 Jan, 2013 1 commit
  4. 21 Dec, 2012 1 commit
    • NFS: Use FS-Cache invalidation · de242c0b
      David Howells committed
      Use the new FS-Cache invalidation facility from NFS to deal with foreign
      changes being detected on the server rather than attempting to retire the old
      cookie and get a new one.
      
      The problem with the old method was that NFS did not wait for all outstanding
      storage and retrieval ops on the cache to complete.  There was no automatic
      wait between the calls to ->readpages() and calls to invalidate_inode_pages2()
      as the latter can only wait on locked pages that have been added to the
      pagecache (which they haven't yet on entry to ->readpages()).
      
      This was leading to oopses like the one below when an outstanding read got cut
      off from its cookie by a premature release.
      
      BUG: unable to handle kernel NULL pointer dereference at 00000000000000a8
      IP: [<ffffffffa0075118>] __fscache_read_or_alloc_pages+0x1dd/0x315 [fscache]
      PGD 15889067 PUD 15890067 PMD 0
      Oops: 0000 [#1] SMP
      CPU 0
      Modules linked in: cachefiles nfs fscache auth_rpcgss nfs_acl lockd sunrpc
      
      Pid: 4544, comm: tar Not tainted 3.1.0-rc4-fsdevel+ #1064                  /DG965RY
      RIP: 0010:[<ffffffffa0075118>]  [<ffffffffa0075118>] __fscache_read_or_alloc_pages+0x1dd/0x315 [fscache]
      RSP: 0018:ffff8800158799e8  EFLAGS: 00010246
      RAX: 0000000000000000 RBX: ffff8800070d41e0 RCX: ffff8800083dc1b0
      RDX: 0000000000000000 RSI: ffff880015879960 RDI: ffff88003e627b90
      RBP: ffff880015879a28 R08: 0000000000000002 R09: 0000000000000002
      R10: 0000000000000001 R11: ffff880015879950 R12: ffff880015879aa4
      R13: 0000000000000000 R14: ffff8800083dc158 R15: ffff880015879be8
      FS:  00007f671e9d87c0(0000) GS:ffff88003bc00000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
      CR2: 00000000000000a8 CR3: 000000001587f000 CR4: 00000000000006f0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
      Process tar (pid: 4544, threadinfo ffff880015878000, task ffff880015875040)
      Stack:
       ffffffffa00b1759 ffff8800070dc158 ffff8800000213da ffff88002a286508
       ffff880015879aa4 ffff880015879be8 0000000000000001 ffff88002a2866e8
       ffff880015879a88 ffffffffa00b20be 00000000000200da ffff880015875040
      Call Trace:
       [<ffffffffa00b1759>] ? nfs_fscache_wait_bit+0xd/0xd [nfs]
       [<ffffffffa00b20be>] __nfs_readpages_from_fscache+0x7e/0x13f [nfs]
       [<ffffffff81095fe7>] ? __alloc_pages_nodemask+0x156/0x662
       [<ffffffffa0098763>] nfs_readpages+0xee/0x187 [nfs]
       [<ffffffff81098a5e>] __do_page_cache_readahead+0x1be/0x267
       [<ffffffff81098942>] ? __do_page_cache_readahead+0xa2/0x267
       [<ffffffff81098d7b>] ra_submit+0x1c/0x20
       [<ffffffff8109900a>] ondemand_readahead+0x28b/0x29a
       [<ffffffff810990ce>] page_cache_sync_readahead+0x38/0x3a
       [<ffffffff81091d8a>] generic_file_aio_read+0x2ab/0x67e
       [<ffffffffa008cfbe>] nfs_file_read+0xa4/0xc9 [nfs]
       [<ffffffff810c22c4>] do_sync_read+0xba/0xfa
       [<ffffffff810a62c9>] ? might_fault+0x4e/0x9e
       [<ffffffff81177a47>] ? security_file_permission+0x7b/0x84
       [<ffffffff810c25dd>] ? rw_verify_area+0xab/0xc8
       [<ffffffff810c29a4>] vfs_read+0xaa/0x13a
       [<ffffffff810c2a79>] sys_read+0x45/0x6c
       [<ffffffff813ac37b>] system_call_fastpath+0x16/0x1b
      Reported-by: Mark Moseley <moseleymark@gmail.com>
      Signed-off-by: David Howells <dhowells@redhat.com>
  5. 16 Dec, 2012 3 commits
  6. 13 Dec, 2012 1 commit
    • SUNRPC handle EKEYEXPIRED in call_refreshresult · eb96d5c9
      Andy Adamson committed
      Currently, when an RPCSEC_GSS context has expired or is non-existent
      and the user's (Kerberos) credentials have also expired or are non-existent,
      the client receives the -EKEYEXPIRED error and tries to refresh the context
      forever.  If an application is performing I/O, or other work against the share,
      the application hangs, and the user is not prompted to refresh/establish their
      credentials. This can result in a denial of service for other users.
      
      Users are expected to manage their Kerberos credential lifetimes to mitigate
      this issue.
      
      Move the -EKEYEXPIRED handling into the RPC layer. Try tk_cred_retry number
      of times to refresh the gss_context, and then return -EACCES to the application.
      Signed-off-by: Andy Adamson <andros@netapp.com>
      Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
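The bounded-retry behaviour described in the commit above can be sketched as follows. This is a user-space Python illustration under stated assumptions, not the SUNRPC implementation: `refresh_with_retry_limit`, `refresh`, and `cred_retry` are hypothetical names standing in for the refresh step and the `tk_cred_retry` budget, and the fallback value 127 for `EKEYEXPIRED` is the Linux errno number, used only if the platform's `errno` module lacks it.

```python
import errno

# Linux defines EKEYEXPIRED as 127; fall back to that where the module lacks it.
EKEYEXPIRED = getattr(errno, "EKEYEXPIRED", 127)


def refresh_with_retry_limit(refresh, cred_retry=2):
    """Retry refresh() while it reports an expired key, up to cred_retry times.

    Returns the first status that is not -EKEYEXPIRED, or -EACCES once the
    retry budget is spent, so the caller fails instead of looping forever.
    """
    retries_left = cred_retry
    while True:
        status = refresh()
        if status != -EKEYEXPIRED:
            return status
        if retries_left == 0:
            return -errno.EACCES
        retries_left -= 1
```

With a budget of two retries, a refresh that always reports an expired key is attempted three times in total before the caller sees `-EACCES`.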
  7. 11 Dec, 2012 2 commits
  8. 06 Dec, 2012 20 commits
  9. 27 Nov, 2012 5 commits
  10. 21 Nov, 2012 1 commit