1. 09 9月, 2014 2 次提交
    • J
      nfs: revert "nfs4: queue free_lock_state job submission to nfsiod" · 0c0e0d3c
      Jeff Layton 提交于
      This reverts commit 49a4bda2.
      
      Christoph reported an oops due to the above commit:
      
      generic/089 242s ...[ 2187.041239] general protection fault: 0000 [#1]
      SMP
      [ 2187.042899] Modules linked in:
      [ 2187.044000] CPU: 0 PID: 11913 Comm: kworker/0:1 Not tainted 3.16.0-rc6+ #1151
      [ 2187.044287] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2007
      [ 2187.044287] Workqueue: nfsiod free_lock_state_work
      [ 2187.044287] task: ffff880072b50cd0 ti: ffff88007a4ec000 task.ti: ffff88007a4ec000
      [ 2187.044287] RIP: 0010:[<ffffffff81361ca6>]  [<ffffffff81361ca6>] free_lock_state_work+0x16/0x30
      [ 2187.044287] RSP: 0018:ffff88007a4efd58  EFLAGS: 00010296
      [ 2187.044287] RAX: 6b6b6b6b6b6b6b6b RBX: ffff88007a947ac0 RCX: 8000000000000000
      [ 2187.044287] RDX: ffffffff826af9e0 RSI: ffff88007b093c00 RDI: ffff88007b093db8
      [ 2187.044287] RBP: ffff88007a4efd58 R08: ffffffff832d3e10 R09: 000001c40efc0000
      [ 2187.044287] R10: 0000000000000000 R11: 0000000000059e30 R12: ffff88007fc13240
      [ 2187.044287] R13: ffff88007fc18b00 R14: ffff88007b093db8 R15: 0000000000000000
      [ 2187.044287] FS:  0000000000000000(0000) GS:ffff88007fc00000(0000) knlGS:0000000000000000
      [ 2187.044287] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
      [ 2187.044287] CR2: 00007f93ec33fb80 CR3: 0000000079dc2000 CR4: 00000000000006f0
      [ 2187.044287] Stack:
      [ 2187.044287]  ffff88007a4efdd8 ffffffff810cc877 ffffffff810cc80d ffff88007fc13258
      [ 2187.044287]  000000007a947af0 0000000000000000 ffffffff8353ccc8 ffffffff82b6f3d0
      [ 2187.044287]  0000000000000000 ffffffff82267679 ffff88007a4efdd8 ffff88007fc13240
      [ 2187.044287] Call Trace:
      [ 2187.044287]  [<ffffffff810cc877>] process_one_work+0x1c7/0x490
      [ 2187.044287]  [<ffffffff810cc80d>] ? process_one_work+0x15d/0x490
      [ 2187.044287]  [<ffffffff810cd569>] worker_thread+0x119/0x4f0
      [ 2187.044287]  [<ffffffff810fbbad>] ? trace_hardirqs_on+0xd/0x10
      [ 2187.044287]  [<ffffffff810cd450>] ? init_pwq+0x190/0x190
      [ 2187.044287]  [<ffffffff810d3c6f>] kthread+0xdf/0x100
      [ 2187.044287]  [<ffffffff810d3b90>] ? __init_kthread_worker+0x70/0x70
      [ 2187.044287]  [<ffffffff81d9873c>] ret_from_fork+0x7c/0xb0
      [ 2187.044287]  [<ffffffff810d3b90>] ? __init_kthread_worker+0x70/0x70
      [ 2187.044287] Code: 0f 1f 44 00 00 31 c0 5d c3 66 66 66 2e 0f 1f 84 00 00 00 00 00 55 48 8d b7 48 fe ff ff 48 8b 87 58 fe ff ff 48 89 e5 48 8b 40 30 <48> 8b 00 48 8b 10 48 89 c7 48 8b 92 90 03 00 00 ff 52 28 5d c3
      [ 2187.044287] RIP  [<ffffffff81361ca6>] free_lock_state_work+0x16/0x30
      [ 2187.044287]  RSP <ffff88007a4efd58>
      [ 2187.103626] ---[ end trace 0f11326d28e5d8fa ]---
      
      The original reason for this patch was because the fl_release_private
      operation couldn't sleep. With commit ed9814d8 (locks: defer freeing
      locks in locks_delete_lock until after i_lock has been dropped), this is
      no longer a problem so we can revert this patch.
      Reported-by: NChristoph Hellwig <hch@infradead.org>
      Signed-off-by: NJeff Layton <jlayton@primarydata.com>
      Reviewed-by: NChristoph Hellwig <hch@lst.de>
      Tested-by: NChristoph Hellwig <hch@lst.de>
      Signed-off-by: NTrond Myklebust <trond.myklebust@primarydata.com>
      0c0e0d3c
    • C
      nfs: fix kernel warning when removing proc entry · 21e81002
      Cong Wang 提交于
      I saw the following kernel warning:
      
      [ 1852.321222] ------------[ cut here ]------------
      [ 1852.326527] WARNING: CPU: 0 PID: 118 at fs/proc/generic.c:521 remove_proc_entry+0x154/0x16b()
      [ 1852.335630] remove_proc_entry: removing non-empty directory 'fs/nfsfs', leaking at least 'volumes'
      [ 1852.344084] CPU: 0 PID: 118 Comm: kworker/u8:2 Not tainted 3.16.0+ #540
      [ 1852.350036] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
      [ 1852.354992] Workqueue: netns cleanup_net
      [ 1852.358701]  0000000000000000 ffff880116f2fbd0 ffffffff819c03e9 ffff880116f2fc18
      [ 1852.366474]  ffff880116f2fc08 ffffffff810744ee ffffffff811e0e6e ffff8800d4e96238
      [ 1852.373507]  ffffffff81dbe665 ffff8800d46a5948 0000000000000005 ffff880116f2fc68
      [ 1852.380224] Call Trace:
      [ 1852.381976]  [<ffffffff819c03e9>] dump_stack+0x4d/0x66
      [ 1852.385495]  [<ffffffff810744ee>] warn_slowpath_common+0x7a/0x93
      [ 1852.389869]  [<ffffffff811e0e6e>] ? remove_proc_entry+0x154/0x16b
      [ 1852.393987]  [<ffffffff8107457b>] warn_slowpath_fmt+0x4c/0x4e
      [ 1852.397999]  [<ffffffff811e0e6e>] remove_proc_entry+0x154/0x16b
      [ 1852.402034]  [<ffffffff8129c73d>] nfs_fs_proc_net_exit+0x53/0x56
      [ 1852.406136]  [<ffffffff812a103b>] nfs_net_exit+0x12/0x1d
      [ 1852.409774]  [<ffffffff81785bc9>] ops_exit_list+0x44/0x55
      [ 1852.413529]  [<ffffffff81786389>] cleanup_net+0xee/0x182
      [ 1852.417198]  [<ffffffff81088c9e>] process_one_work+0x209/0x40d
      [ 1852.502320]  [<ffffffff81088bf7>] ? process_one_work+0x162/0x40d
      [ 1852.587629]  [<ffffffff810890c1>] worker_thread+0x1f0/0x2c7
      [ 1852.673291]  [<ffffffff81088ed1>] ? process_scheduled_works+0x2f/0x2f
      [ 1852.759470]  [<ffffffff8108e079>] kthread+0xc9/0xd1
      [ 1852.843099]  [<ffffffff8109427f>] ? finish_task_switch+0x3a/0xce
      [ 1852.926518]  [<ffffffff8108dfb0>] ? __kthread_parkme+0x61/0x61
      [ 1853.008565]  [<ffffffff819cbeac>] ret_from_fork+0x7c/0xb0
      [ 1853.076477]  [<ffffffff8108dfb0>] ? __kthread_parkme+0x61/0x61
      [ 1853.140653] ---[ end trace 69c4c6617f78e32d ]---
      
      It looks wrong that we add "/proc/net/nfsfs" in nfs_fs_proc_net_init()
      while remove "/proc/fs/nfsfs" in nfs_fs_proc_net_exit().
      
      Fixes: commit 65b38851 (NFS: Fix /proc/fs/nfsfs/servers and /proc/fs/nfsfs/volumes)
      Cc: Eric W. Biederman <ebiederm@xmission.com>
      Cc: Trond Myklebust <trond.myklebust@primarydata.com>
      Cc: Dan Aloni <dan@kernelim.com>
      Signed-off-by: NCong Wang <xiyou.wangcong@gmail.com>
      [Trond: replace uses of remove_proc_entry() with remove_proc_subtree()
      as suggested by Al Viro]
      Cc: stable@vger.kernel.org # 3.4.x : 65b38851: NFS: Fix /proc/fs/nfsfs/servers
      Cc: stable@vger.kernel.org # 3.4.x
      Signed-off-by: NTrond Myklebust <trond.myklebust@primarydata.com>
      21e81002
  2. 27 8月, 2014 3 次提交
  3. 23 8月, 2014 8 次提交
  4. 08 8月, 2014 1 次提交
    • J
      dcache: d_obtain_alias callers don't all want DISCONNECTED · 1a0a397e
      J. Bruce Fields 提交于
      There are a few d_obtain_alias callers that are using it to get the
      root of a filesystem which may already have an alias somewhere else.
      
      This is not the same as the filehandle-lookup case, and none of them
      actually need DCACHE_DISCONNECTED set.
      
      It isn't really a serious problem, but it would really be clearer if we
      reserved DCACHE_DISCONNECTED for those cases where it's actually needed.
      
      In the btrfs case this was causing a spurious printk from
      nfsd/nfsfh.c:fh_verify when it found an unexpected DCACHE_DISCONNECTED
      dentry.  Josef worked around this by unsetting DCACHE_DISCONNECTED
      manually in 3a0dfa6a "Btrfs: unset DCACHE_DISCONNECTED when mounting
      default subvol", and this replaces that workaround.
      
      Cc: Josef Bacik <jbacik@fb.com>
      Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
      Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
      1a0a397e
  5. 05 8月, 2014 3 次提交
    • S
      nfs: reject changes to resvport and sharecache during remount · 71a6ec8a
      Scott Mayhew 提交于
      Commit c8e47028 made it possible to change resvport/noresvport and
      sharecache/nosharecache via a remount operation, neither of which should be
      allowed.
      Signed-off-by: NScott Mayhew <smayhew@redhat.com>
      Fixes: c8e47028 (nfs: Apply NFS_MOUNT_CMP_FLAGMASK to nfs_compare_remount_data)
      Cc: stable@vger.kernel.org # 3.16+
      Signed-off-by: NTrond Myklebust <trond.myklebust@primarydata.com>
      71a6ec8a
    • K
      NFS: Avoid infinite loop when RELEASE_LOCKOWNER getting expired error · 5b53dc88
      Kinglong Mee 提交于
      Fix Commit 60ea6812 (NFS: Migration support for RELEASE_LOCKOWNER)
      If getting expired error, client will enter a infinite loop as,
      
      client                            server
         RELEASE_LOCKOWNER(old clid) ----->
                      <--- expired error
         RENEW(old clid)             ----->
                      <--- expired error
         SETCLIENTID                 ----->
                      <--- a new clid
         SETCLIENTID_CONFIRM (new clid) -->
                      <--- ok
         RELEASE_LOCKOWNER(old clid) ----->
                      <--- expired error
         RENEW(new clid)             ----->
                      <-- ok
         RELEASE_LOCKOWNER(old clid) ----->
                      <--- expired error
         RENEW(new clid)             ----->
                      <-- ok
                      ... ...
      Signed-off-by: NKinglong Mee <kinglongmee@gmail.com>
      [Trond: replace call to nfs4_async_handle_error() with
       nfs4_schedule_lease_recovery()]
      Signed-off-by: NTrond Myklebust <trond.myklebust@primarydata.com>
      5b53dc88
    • E
      NFS: Fix /proc/fs/nfsfs/servers and /proc/fs/nfsfs/volumes · 65b38851
      Eric W. Biederman 提交于
      The usage of pid_ns->child_reaper->nsproxy->net_ns in
      nfs_server_list_open and nfs_client_list_open is not safe.
      
      /proc for a pid namespace can remain mounted after the all of the
      process in that pid namespace have exited.  There are also times
      before the initial process in a pid namespace has started or after the
      initial process in a pid namespace has exited where
      pid_ns->child_reaper can be NULL or stale.  Making the idiom
      pid_ns->child_reaper->nsproxy a double whammy of problems.
      
      Luckily all that needs to happen is to move /proc/fs/nfsfs/servers and
      /proc/fs/nfsfs/volumes under /proc/net to /proc/net/nfsfs/servers and
      /proc/net/nfsfs/volumes and add a symlink from the original location,
      and to use seq_open_net as it has been designed.
      
      Cc: stable@vger.kernel.org
      Cc: Trond Myklebust <trond.myklebust@primarydata.com>
      Cc: Stanislav Kinsbursky <skinsbursky@parallels.com>
      Signed-off-by: N"Eric W. Biederman" <ebiederm@xmission.com>
      65b38851
  6. 04 8月, 2014 16 次提交
  7. 23 7月, 2014 1 次提交
  8. 18 7月, 2014 1 次提交
    • D
      KEYS: Allow special keys (eg. DNS results) to be invalidated by CAP_SYS_ADMIN · 0c7774ab
      David Howells 提交于
      Special kernel keys, such as those used to hold DNS results for AFS, CIFS and
      NFS and those used to hold idmapper results for NFS, used to be
      'invalidateable' with key_revoke().  However, since the default permissions for
      keys were reduced:
      
      	Commit: 96b5c8fe
      	KEYS: Reduce initial permissions on keys
      
      it has become impossible to do this.
      
      Add a key flag (KEY_FLAG_ROOT_CAN_INVAL) that will permit a key to be
      invalidated by root.  This should not be used for system keyrings as the
      garbage collector will try and remove any invalidate key.  For system keyrings,
      KEY_FLAG_ROOT_CAN_CLEAR can be used instead.
      
      After this, from userspace, keyctl_invalidate() and "keyctl invalidate" can be
      used by any possessor of CAP_SYS_ADMIN (typically root) to invalidate DNS and
      idmapper keys.  Invalidated keys are immediately garbage collected and will be
      immediately rerequested if needed again.
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      Tested-by: NSteve Dickson <steved@redhat.com>
      0c7774ab
  9. 16 7月, 2014 2 次提交
    • N
      sched: Allow wait_on_bit_action() functions to support a timeout · c1221321
      NeilBrown 提交于
      It is currently not possible for various wait_on_bit functions
      to implement a timeout.
      
      While the "action" function that is called to do the waiting
      could certainly use schedule_timeout(), there is no way to carry
      forward the remaining timeout after a false wake-up.
      As false-wakeups a clearly possible at least due to possible
      hash collisions in bit_waitqueue(), this is a real problem.
      
      The 'action' function is currently passed a pointer to the word
      containing the bit being waited on.  No current action functions
      use this pointer.  So changing it to something else will be a
      little noisy but will have no immediate effect.
      
      This patch changes the 'action' function to take a pointer to
      the "struct wait_bit_key", which contains a pointer to the word
      containing the bit so nothing is really lost.
      
      It also adds a 'private' field to "struct wait_bit_key", which
      is initialized to zero.
      
      An action function can now implement a timeout with something
      like
      
      static int timed_out_waiter(struct wait_bit_key *key)
      {
      	unsigned long waited;
      	if (key->private == 0) {
      		key->private = jiffies;
      		if (key->private == 0)
      			key->private -= 1;
      	}
      	waited = jiffies - key->private;
      	if (waited > 10 * HZ)
      		return -EAGAIN;
      	schedule_timeout(waited - 10 * HZ);
      	return 0;
      }
      
      If any other need for context in a waiter were found it would be
      easy to use ->private for some other purpose, or even extend
      "struct wait_bit_key".
      
      My particular need is to support timeouts in nfs_release_page()
      to avoid deadlocks with loopback mounted NFS.
      
      While wait_on_bit_timeout() would be a cleaner interface, it
      will not meet my need.  I need the timeout to be sensitive to
      the state of the connection with the server, which could change.
       So I need to use an 'action' interface.
      Signed-off-by: NNeilBrown <neilb@suse.de>
      Acked-by: NPeter Zijlstra <peterz@infradead.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Steve French <sfrench@samba.org>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Steven Whitehouse <swhiteho@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Link: http://lkml.kernel.org/r/20140707051604.28027.41257.stgit@notabene.brownSigned-off-by: NIngo Molnar <mingo@kernel.org>
      c1221321
    • N
      sched: Remove proliferation of wait_on_bit() action functions · 74316201
      NeilBrown 提交于
      The current "wait_on_bit" interface requires an 'action'
      function to be provided which does the actual waiting.
      There are over 20 such functions, many of them identical.
      Most cases can be satisfied by one of just two functions, one
      which uses io_schedule() and one which just uses schedule().
      
      So:
       Rename wait_on_bit and        wait_on_bit_lock to
              wait_on_bit_action and wait_on_bit_lock_action
       to make it explicit that they need an action function.
      
       Introduce new wait_on_bit{,_lock} and wait_on_bit{,_lock}_io
       which are *not* given an action function but implicitly use
       a standard one.
       The decision to error-out if a signal is pending is now made
       based on the 'mode' argument rather than being encoded in the action
       function.
      
       All instances of the old wait_on_bit and wait_on_bit_lock which
       can use the new version have been changed accordingly and their
       action functions have been discarded.
       wait_on_bit{_lock} does not return any specific error code in the
       event of a signal so the caller must check for non-zero and
       interpolate their own error code as appropriate.
      
      The wait_on_bit() call in __fscache_wait_on_invalidate() was
      ambiguous as it specified TASK_UNINTERRUPTIBLE but used
      fscache_wait_bit_interruptible as an action function.
      David Howells confirms this should be uniformly
      "uninterruptible"
      
      The main remaining user of wait_on_bit{,_lock}_action is NFS
      which needs to use a freezer-aware schedule() call.
      
      A comment in fs/gfs2/glock.c notes that having multiple 'action'
      functions is useful as they display differently in the 'wchan'
      field of 'ps'. (and /proc/$PID/wchan).
      As the new bit_wait{,_io} functions are tagged "__sched", they
      will not show up at all, but something higher in the stack.  So
      the distinction will still be visible, only with different
      function names (gds2_glock_wait versus gfs2_glock_dq_wait in the
      gfs2/glock.c case).
      
      Since first version of this patch (against 3.15) two new action
      functions appeared, on in NFS and one in CIFS.  CIFS also now
      uses an action function that makes the same freezer aware
      schedule call as NFS.
      Signed-off-by: NNeilBrown <neilb@suse.de>
      Acked-by: David Howells <dhowells@redhat.com> (fscache, keys)
      Acked-by: Steven Whitehouse <swhiteho@redhat.com> (gfs2)
      Acked-by: NPeter Zijlstra <peterz@infradead.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Steve French <sfrench@samba.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Link: http://lkml.kernel.org/r/20140707051603.28027.72349.stgit@notabene.brownSigned-off-by: NIngo Molnar <mingo@kernel.org>
      74316201
  10. 14 7月, 2014 1 次提交
  11. 13 7月, 2014 2 次提交