1. 25 11月, 2014 1 次提交
  2. 25 9月, 2014 1 次提交
  3. 16 7月, 2014 1 次提交
    • N
      sched: Allow wait_on_bit_action() functions to support a timeout · c1221321
      NeilBrown 提交于
      It is currently not possible for various wait_on_bit functions
      to implement a timeout.
      
      While the "action" function that is called to do the waiting
      could certainly use schedule_timeout(), there is no way to carry
      forward the remaining timeout after a false wake-up.
      As false-wakeups a clearly possible at least due to possible
      hash collisions in bit_waitqueue(), this is a real problem.
      
      The 'action' function is currently passed a pointer to the word
      containing the bit being waited on.  No current action functions
      use this pointer.  So changing it to something else will be a
      little noisy but will have no immediate effect.
      
      This patch changes the 'action' function to take a pointer to
      the "struct wait_bit_key", which contains a pointer to the word
      containing the bit so nothing is really lost.
      
      It also adds a 'private' field to "struct wait_bit_key", which
      is initialized to zero.
      
      An action function can now implement a timeout with something
      like
      
      static int timed_out_waiter(struct wait_bit_key *key)
      {
      	unsigned long waited;
      	if (key->private == 0) {
      		key->private = jiffies;
      		if (key->private == 0)
      			key->private -= 1;
      	}
      	waited = jiffies - key->private;
      	if (waited > 10 * HZ)
      		return -EAGAIN;
      	schedule_timeout(waited - 10 * HZ);
      	return 0;
      }
      
      If any other need for context in a waiter were found it would be
      easy to use ->private for some other purpose, or even extend
      "struct wait_bit_key".
      
      My particular need is to support timeouts in nfs_release_page()
      to avoid deadlocks with loopback mounted NFS.
      
      While wait_on_bit_timeout() would be a cleaner interface, it
      will not meet my need.  I need the timeout to be sensitive to
      the state of the connection with the server, which could change.
       So I need to use an 'action' interface.
      Signed-off-by: NNeilBrown <neilb@suse.de>
      Acked-by: NPeter Zijlstra <peterz@infradead.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Steve French <sfrench@samba.org>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Steven Whitehouse <swhiteho@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Link: http://lkml.kernel.org/r/20140707051604.28027.41257.stgit@notabene.brownSigned-off-by: NIngo Molnar <mingo@kernel.org>
      c1221321
  4. 29 5月, 2014 1 次提交
    • D
      net, sunrpc: suppress allocation warning in rpc_malloc() · c6c8fe79
      David Rientjes 提交于
      rpc_malloc() allocates with GFP_NOWAIT without making any attempt at
      reclaim so it easily fails when low on memory.  This ends up spamming the
      kernel log:
      
      SLAB: Unable to allocate memory on node 0 (gfp=0x4000)
        cache: kmalloc-8192, object size: 8192, order: 1
        node 0: slabs: 207/207, objs: 207/207, free: 0
      rekonq: page allocation failure: order:1, mode:0x204000
      CPU: 2 PID: 14321 Comm: rekonq Tainted: G           O  3.15.0-rc3-12.gfc9498b-desktop+ #6
      Hardware name: System manufacturer System Product Name/M4A785TD-V EVO, BIOS 2105    07/23/2010
       0000000000000000 ffff880010ff17d0 ffffffff815e693c 0000000000204000
       ffff880010ff1858 ffffffff81137bd2 0000000000000000 0000001000000000
       ffff88011ffebc38 0000000000000001 0000000000204000 ffff88011ffea000
      Call Trace:
       [<ffffffff815e693c>] dump_stack+0x4d/0x6f
       [<ffffffff81137bd2>] warn_alloc_failed+0xd2/0x140
       [<ffffffff8113be19>] __alloc_pages_nodemask+0x7e9/0xa30
       [<ffffffff811824a8>] kmem_getpages+0x58/0x140
       [<ffffffff81183de6>] fallback_alloc+0x1d6/0x210
       [<ffffffff81183be3>] ____cache_alloc_node+0x123/0x150
       [<ffffffff81185953>] __kmalloc+0x203/0x490
       [<ffffffffa06b0ee2>] rpc_malloc+0x32/0xa0 [sunrpc]
       [<ffffffffa06a6999>] call_allocate+0xb9/0x170 [sunrpc]
       [<ffffffffa06b19d8>] __rpc_execute+0x88/0x460 [sunrpc]
       [<ffffffffa06b2da9>] rpc_execute+0x59/0xc0 [sunrpc]
       [<ffffffffa06a932b>] rpc_run_task+0x6b/0x90 [sunrpc]
       [<ffffffffa077b5c1>] nfs4_call_sync_sequence+0x51/0x80 [nfsv4]
       [<ffffffffa077d45d>] _nfs4_do_setattr+0x1ed/0x280 [nfsv4]
       [<ffffffffa0782a72>] nfs4_do_setattr+0x72/0x180 [nfsv4]
       [<ffffffffa078334c>] nfs4_proc_setattr+0xbc/0x140 [nfsv4]
       [<ffffffffa074a7e8>] nfs_setattr+0xd8/0x240 [nfs]
       [<ffffffff811baa71>] notify_change+0x231/0x380
       [<ffffffff8119cf5c>] chmod_common+0xfc/0x120
       [<ffffffff8119df80>] SyS_chmod+0x40/0x90
       [<ffffffff815f4cfd>] system_call_fastpath+0x1a/0x1f
      ...
      
      If the allocation fails, simply return NULL and avoid spamming the kernel
      log.
      Reported-by: NMarc Dietrich <marvin24@gmx.de>
      Signed-off-by: NDavid Rientjes <rientjes@google.com>
      Signed-off-by: NTrond Myklebust <trond.myklebust@primarydata.com>
      c6c8fe79
  5. 21 3月, 2014 1 次提交
  6. 05 9月, 2013 1 次提交
  7. 07 6月, 2013 3 次提交
  8. 23 5月, 2013 1 次提交
    • T
      SUNRPC: Prevent an rpc_task wakeup race · a3c3cac5
      Trond Myklebust 提交于
      The lockless RPC_IS_QUEUED() test in __rpc_execute means that we need to
      be careful about ordering the calls to rpc_test_and_set_running(task) and
      rpc_clear_queued(task). If we get the order wrong, then we may end up
      testing the RPC_TASK_RUNNING flag after __rpc_execute() has looped
      and changed the state of the rpc_task.
      Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>
      Cc: stable@vger.kernel.org
      a3c3cac5
  9. 12 5月, 2013 1 次提交
    • C
      freezer: add unsafe versions of freezable helpers for NFS · 416ad3c9
      Colin Cross 提交于
      NFS calls the freezable helpers with locks held, which is unsafe
      and will cause lockdep warnings when 6aa97070 "lockdep: check
      that no locks held at freeze time" is reapplied (it was reverted
      in dbf520a9).  NFS shouldn't be doing this, but it has
      long-running syscalls that must hold a lock but also shouldn't
      block suspend.  Until NFS freeze handling is rewritten to use a
      signal to exit out of the critical section, add new *_unsafe
      versions of the helpers that will not run the lockdep test when
      6aa97070 is reapplied, and call them from NFS.
      
      In practice the likley result of holding the lock while freezing
      is that a second task blocked on the lock will never freeze,
      aborting suspend, but it is possible to manufacture a case using
      the cgroup freezer, the lock, and the suspend freezer to create
      a deadlock.  Silencing the lockdep warning here will allow
      problems to be found in other drivers that may have a more
      serious deadlock risk, and prevent new problems from being added.
      Signed-off-by: NColin Cross <ccross@android.com>
      Acked-by: NPavel Machek <pavel@ucw.cz>
      Acked-by: NTejun Heo <tj@kernel.org>
      Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
      416ad3c9
  10. 25 3月, 2013 1 次提交
  11. 31 1月, 2013 1 次提交
  12. 09 1月, 2013 1 次提交
    • T
      SUNRPC: Ensure we release the socket write lock if the rpc_task exits early · 87ed5003
      Trond Myklebust 提交于
      If the rpc_task exits while holding the socket write lock before it has
      allocated an rpc slot, then the usual mechanism for releasing the write
      lock in xprt_release() is defeated.
      
      The problem occurs if the call to xprt_lock_write() initially fails, so
      that the rpc_task is put on the xprt->sending wait queue. If the task
      exits after being assigned the lock by __xprt_lock_write_func, but
      before it has retried the call to xprt_lock_and_alloc_slot(), then
      it calls xprt_release() while holding the write lock, but will
      immediately exit due to the test for task->tk_rqstp != NULL.
      Reported-by: NChris Perl <chris.perl@gmail.com>
      Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>
      Cc: stable@vger.kernel.org [>= 3.1]
      87ed5003
  13. 05 1月, 2013 1 次提交
  14. 06 12月, 2012 2 次提交
  15. 05 11月, 2012 4 次提交
  16. 29 9月, 2012 1 次提交
  17. 01 8月, 2012 1 次提交
    • M
      nfs: enable swap on NFS · a564b8f0
      Mel Gorman 提交于
      Implement the new swapfile a_ops for NFS and hook up ->direct_IO.  This
      will set the NFS socket to SOCK_MEMALLOC and run socket reconnect under
      PF_MEMALLOC as well as reset SOCK_MEMALLOC before engaging the protocol
      ->connect() method.
      
      PF_MEMALLOC should allow the allocation of struct socket and related
      objects and the early (re)setting of SOCK_MEMALLOC should allow us to
      receive the packets required for the TCP connection buildup.
      
      [jlayton@redhat.com: Restore PF_MEMALLOC task flags in all cases]
      [dfeng@redhat.com: Fix handling of multiple swap files]
      [a.p.zijlstra@chello.nl: Original patch]
      Signed-off-by: NMel Gorman <mgorman@suse.de>
      Acked-by: NRik van Riel <riel@redhat.com>
      Cc: Christoph Hellwig <hch@infradead.org>
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Eric B Munson <emunson@mgebm.net>
      Cc: Eric Paris <eparis@redhat.com>
      Cc: James Morris <jmorris@namei.org>
      Cc: Mel Gorman <mgorman@suse.de>
      Cc: Mike Christie <michaelc@cs.wisc.edu>
      Cc: Neil Brown <neilb@suse.de>
      Cc: Sebastian Andrzej Siewior <sebastian@breakpoint.cc>
      Cc: Trond Myklebust <Trond.Myklebust@netapp.com>
      Cc: Xiaotian Feng <dfeng@redhat.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      a564b8f0
  18. 31 7月, 2012 2 次提交
    • J
      nfs: skip commit in releasepage if we're freeing memory for fs-related reasons · 5cf02d09
      Jeff Layton 提交于
      We've had some reports of a deadlock where rpciod ends up with a stack
      trace like this:
      
          PID: 2507   TASK: ffff88103691ab40  CPU: 14  COMMAND: "rpciod/14"
           #0 [ffff8810343bf2f0] schedule at ffffffff814dabd9
           #1 [ffff8810343bf3b8] nfs_wait_bit_killable at ffffffffa038fc04 [nfs]
           #2 [ffff8810343bf3c8] __wait_on_bit at ffffffff814dbc2f
           #3 [ffff8810343bf418] out_of_line_wait_on_bit at ffffffff814dbcd8
           #4 [ffff8810343bf488] nfs_commit_inode at ffffffffa039e0c1 [nfs]
           #5 [ffff8810343bf4f8] nfs_release_page at ffffffffa038bef6 [nfs]
           #6 [ffff8810343bf528] try_to_release_page at ffffffff8110c670
           #7 [ffff8810343bf538] shrink_page_list.clone.0 at ffffffff81126271
           #8 [ffff8810343bf668] shrink_inactive_list at ffffffff81126638
           #9 [ffff8810343bf818] shrink_zone at ffffffff8112788f
          #10 [ffff8810343bf8c8] do_try_to_free_pages at ffffffff81127b1e
          #11 [ffff8810343bf958] try_to_free_pages at ffffffff8112812f
          #12 [ffff8810343bfa08] __alloc_pages_nodemask at ffffffff8111fdad
          #13 [ffff8810343bfb28] kmem_getpages at ffffffff81159942
          #14 [ffff8810343bfb58] fallback_alloc at ffffffff8115a55a
          #15 [ffff8810343bfbd8] ____cache_alloc_node at ffffffff8115a2d9
          #16 [ffff8810343bfc38] kmem_cache_alloc at ffffffff8115b09b
          #17 [ffff8810343bfc78] sk_prot_alloc at ffffffff81411808
          #18 [ffff8810343bfcb8] sk_alloc at ffffffff8141197c
          #19 [ffff8810343bfce8] inet_create at ffffffff81483ba6
          #20 [ffff8810343bfd38] __sock_create at ffffffff8140b4a7
          #21 [ffff8810343bfd98] xs_create_sock at ffffffffa01f649b [sunrpc]
          #22 [ffff8810343bfdd8] xs_tcp_setup_socket at ffffffffa01f6965 [sunrpc]
          #23 [ffff8810343bfe38] worker_thread at ffffffff810887d0
          #24 [ffff8810343bfee8] kthread at ffffffff8108dd96
          #25 [ffff8810343bff48] kernel_thread at ffffffff8100c1ca
      
      rpciod is trying to allocate memory for a new socket to talk to the
      server. The VM ends up calling ->releasepage to get more memory, and it
      tries to do a blocking commit. That commit can't succeed however without
      a connected socket, so we deadlock.
      
      Fix this by setting PF_FSTRANS on the workqueue task prior to doing the
      socket allocation, and having nfs_release_page check for that flag when
      deciding whether to do a commit call. Also, set PF_FSTRANS
      unconditionally in rpc_async_schedule since that function can also do
      allocations sometimes.
      Signed-off-by: NJeff Layton <jlayton@redhat.com>
      Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>
      Cc: stable@vger.kernel.org
      5cf02d09
    • J
      sunrpc: clarify comments on rpc_make_runnable · 506026c3
      Jeff Layton 提交于
      rpc_make_runnable is not generally called with the queue lock held, unless
      it's waking up a task that has been sitting on a waitqueue. This is safe
      when the task has not entered the FSM yet, but the comments don't really
      spell this out.
      Signed-off-by: NJeff Layton <jlayton@redhat.com>
      Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>
      506026c3
  19. 20 3月, 2012 1 次提交
  20. 15 2月, 2012 1 次提交
  21. 01 2月, 2012 2 次提交
  22. 07 12月, 2011 1 次提交
  23. 02 12月, 2011 1 次提交
  24. 18 7月, 2011 1 次提交
  25. 08 7月, 2011 1 次提交
  26. 15 6月, 2011 1 次提交
  27. 27 3月, 2011 1 次提交
    • O
      NFS: Ensure that rpc_release_resources_task() can be called twice. · a271c5a0
      OGAWA Hirofumi 提交于
      BUG: atomic_dec_and_test(): -1: atomic counter underflow at:
      Pid: 2827, comm: mount.nfs Not tainted 2.6.38 #1
      Call Trace:
       [<ffffffffa02223a0>] ? put_rpccred+0x44/0x14e [sunrpc]
       [<ffffffffa021bbe9>] ? rpc_ping+0x4e/0x58 [sunrpc]
       [<ffffffffa021c4a5>] ? rpc_create+0x481/0x4fc [sunrpc]
       [<ffffffffa022298a>] ? rpcauth_lookup_credcache+0xab/0x22d [sunrpc]
       [<ffffffffa028be8c>] ? nfs_create_rpc_client+0xa6/0xeb [nfs]
       [<ffffffffa028c660>] ? nfs4_set_client+0xc2/0x1f9 [nfs]
       [<ffffffffa028cd3c>] ? nfs4_create_server+0xf2/0x2a6 [nfs]
       [<ffffffffa0295d07>] ? nfs4_remote_mount+0x4e/0x14a [nfs]
       [<ffffffff810dd570>] ? vfs_kern_mount+0x6e/0x133
       [<ffffffffa029605a>] ? nfs_do_root_mount+0x76/0x95 [nfs]
       [<ffffffffa029643d>] ? nfs4_try_mount+0x56/0xaf [nfs]
       [<ffffffffa0297434>] ? nfs_get_sb+0x435/0x73c [nfs]
       [<ffffffff810dd59b>] ? vfs_kern_mount+0x99/0x133
       [<ffffffff810dd693>] ? do_kern_mount+0x48/0xd8
       [<ffffffff810f5b75>] ? do_mount+0x6da/0x741
       [<ffffffff810f5c5f>] ? sys_mount+0x83/0xc0
       [<ffffffff8100293b>] ? system_call_fastpath+0x16/0x1b
      
      Well, so, I think this is real bug of nfs codes somewhere. With some
      review, the code
      
      rpc_call_sync()
          rpc_run_task
              rpc_execute()
                  __rpc_execute()
                      rpc_release_task()
                          rpc_release_resources_task()
                              put_rpccred()                <= release cred
          rpc_put_task
              rpc_do_put_task()
                  rpc_release_resources_task()
                      put_rpccred()                        <= release cred again
      
      seems to be release cred unintendedly.
      Signed-off-by: NOGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
      Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>
      a271c5a0
  28. 18 3月, 2011 1 次提交
  29. 12 3月, 2011 2 次提交
  30. 11 3月, 2011 1 次提交
    • T
      SUNRPC: Close a race in __rpc_wait_for_completion_task() · bf294b41
      Trond Myklebust 提交于
      Although they run as rpciod background tasks, under normal operation
      (i.e. no SIGKILL), functions like nfs_sillyrename(), nfs4_proc_unlck()
      and nfs4_do_close() want to be fully synchronous. This means that when we
      exit, we want all references to the rpc_task to be gone, and we want
      any dentry references etc. held by that task to be released.
      
      For this reason these functions call __rpc_wait_for_completion_task(),
      followed by rpc_put_task() in the expectation that the latter will be
      releasing the last reference to the rpc_task, and thus ensuring that the
      callback_ops->rpc_release() has been called synchronously.
      
      This patch fixes a race which exists due to the fact that
      rpciod calls rpc_complete_task() (in order to wake up the callers of
      __rpc_wait_for_completion_task()) and then subsequently calls
      rpc_put_task() without ensuring that these two steps are done atomically.
      
      In order to avoid adding new spin locks, the patch uses the existing
      waitqueue spin lock to order the rpc_task reference count releases between
      the waiting process and rpciod.
      The common case where nobody is waiting for completion is optimised for by
      checking if the RPC_TASK_ASYNC flag is cleared and/or if the rpc_task
      reference count is 1: in those cases we drop trying to grab the spin lock,
      and immediately free up the rpc_task.
      
      Those few processes that need to put the rpc_task from inside an
      asynchronous context and that do not care about ordering are given a new
      helper: rpc_put_task_async().
      Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>
      bf294b41
  31. 25 1月, 2011 1 次提交