1. 06 Feb 2016, 1 commit
  2. 01 Feb 2016, 1 commit
    • SUNRPC: Reorder rpc_task to put waitqueue related info in same cachelines · 5edd1051
      Authored by Trond Myklebust
      Try to group all the data required by the waitqueues, their timers and timer
      callbacks into the same cachelines for performance. With this reordering,
      "pahole" reports the following structure on x86_64:
      
      struct rpc_task {
              atomic_t                   tk_count;             /*     0     4 */
              int                        tk_status;            /*     4     4 */
              struct list_head           tk_task;              /*     8    16 */
              void                       (*tk_callback)(struct rpc_task *); /*    24     8 */
              void                       (*tk_action)(struct rpc_task *); /*    32     8 */
              long unsigned int          tk_timeout;           /*    40     8 */
              long unsigned int          tk_runstate;          /*    48     8 */
              struct rpc_wait_queue *    tk_waitqueue;         /*    56     8 */
              /* --- cacheline 1 boundary (64 bytes) --- */
              union {
                      struct work_struct tk_work;              /*          64 */
                      struct rpc_wait    tk_wait;              /*          56 */
              } u;                                             /*    64    64 */
              /* --- cacheline 2 boundary (128 bytes) --- */
              struct rpc_message         tk_msg;               /*   128    32 */
              void *                     tk_calldata;          /*   160     8 */
              const struct rpc_call_ops  * tk_ops;             /*   168     8 */
              struct rpc_clnt *          tk_client;            /*   176     8 */
              struct rpc_rqst *          tk_rqstp;             /*   184     8 */
              /* --- cacheline 3 boundary (192 bytes) --- */
              struct workqueue_struct *  tk_workqueue;         /*   192     8 */
              ktime_t                    tk_start;             /*   200     8 */
              pid_t                      tk_owner;             /*   208     4 */
              short unsigned int         tk_flags;             /*   212     2 */
              short unsigned int         tk_timeouts;          /*   214     2 */
              short unsigned int         tk_pid;               /*   216     2 */
              unsigned char              tk_priority:2;        /*   218: 6  1 */
              unsigned char              tk_garb_retry:2;      /*   218: 4  1 */
              unsigned char              tk_cred_retry:2;      /*   218: 2  1 */
              unsigned char              tk_rebind_retry:2;    /*   218: 0  1 */
      
              /* size: 224, cachelines: 4, members: 24 */
              /* padding: 5 */
              /* last cacheline: 32 bytes */
      };
      
      whereas on i386, it reports everything fitting into the 1st cacheline.
      Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
      5edd1051
  3. 11 Jun 2015, 1 commit
    • sunrpc: keep a count of swapfiles associated with the rpc_clnt · 3c87ef6e
      Authored by Jeff Layton
      Jerome reported seeing a warning pop when working with a swapfile on
      NFS: nfs_swap_activate() can end up calling sk_set_memalloc() while
      holding the rcu_read_lock(), and that function can sleep.
      
      To fix that, we need to take a reference to the xprt while holding the
      rcu_read_lock, set the socket up for swapping and then drop that
      reference. But xprt_put() is not exported, and having NFS deal with the
      underlying xprt is a bit of a layering violation anyway.
      
      Fix this by adding a set of activate/deactivate functions that take an
      rpc_clnt pointer instead of an rpc_xprt, and have nfs_swap_activate and
      nfs_swap_deactivate call those.
      
      Also, add a per-rpc_clnt atomic counter to keep track of the number of
      active swapfiles associated with it. When the counter does a 0->1
      transition, we enable swapping on the xprt, when we do a 1->0 transition
      we disable swapping on it.
      
      This also allows us to be a bit more selective with the RPC_TASK_SWAPPER
      flag. If non-swapper and swapper clnts are sharing an xprt, then we only
      need to flag the tasks from the swapper clnt with that flag.
      Acked-by: Mel Gorman <mgorman@suse.de>
      Reported-by: Jerome Marchand <jmarchan@redhat.com>
      Signed-off-by: Jeff Layton <jeff.layton@primarydata.com>
      Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
      Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
      3c87ef6e
  4. 05 Jun 2015, 1 commit
  5. 25 Nov 2014, 2 commits
  6. 16 Jul 2014, 1 commit
    • sched: Allow wait_on_bit_action() functions to support a timeout · c1221321
      Authored by NeilBrown
      It is currently not possible for various wait_on_bit functions
      to implement a timeout.
      
      While the "action" function that is called to do the waiting
      could certainly use schedule_timeout(), there is no way to carry
      forward the remaining timeout after a false wake-up.
      As false wake-ups are clearly possible, at least due to
      hash collisions in bit_waitqueue(), this is a real problem.
      
      The 'action' function is currently passed a pointer to the word
      containing the bit being waited on.  No current action functions
      use this pointer.  So changing it to something else will be a
      little noisy but will have no immediate effect.
      
      This patch changes the 'action' function to take a pointer to
      the "struct wait_bit_key", which contains a pointer to the word
      containing the bit so nothing is really lost.
      
      It also adds a 'private' field to "struct wait_bit_key", which
      is initialized to zero.
      
      An action function can now implement a timeout with something
      like
      
      static int timed_out_waiter(struct wait_bit_key *key)
      {
      	unsigned long waited;

      	if (key->private == 0) {
      		/* First call: record the start time, stepping past the
      		 * 0 sentinel in case jiffies is currently 0. */
      		key->private = jiffies;
      		if (key->private == 0)
      			key->private -= 1;
      	}
      	waited = jiffies - key->private;
      	if (waited > 10 * HZ)
      		return -EAGAIN;
      	/* Sleep for the time remaining in the 10-second window. */
      	schedule_timeout(10 * HZ - waited);
      	return 0;
      }
      
      If any other need for context in a waiter were found it would be
      easy to use ->private for some other purpose, or even extend
      "struct wait_bit_key".
      
      My particular need is to support timeouts in nfs_release_page()
      to avoid deadlocks with loopback mounted NFS.
      
      While wait_on_bit_timeout() would be a cleaner interface, it
      will not meet my need.  I need the timeout to be sensitive to
      the state of the connection with the server, which could change,
      so I need to use an 'action' interface.
      Signed-off-by: NeilBrown <neilb@suse.de>
      Acked-by: Peter Zijlstra <peterz@infradead.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Steve French <sfrench@samba.org>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Steven Whitehouse <swhiteho@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Link: http://lkml.kernel.org/r/20140707051604.28027.41257.stgit@notabene.brown
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      c1221321
  7. 18 Apr 2014, 1 commit
  8. 02 Oct 2013, 1 commit
  9. 05 Sep 2013, 1 commit
  10. 08 Aug 2013, 1 commit
    • SUNRPC: If the rpcbind channel is disconnected, fail the call to unregister · 786615bc
      Authored by Trond Myklebust
      If rpcbind causes our connection to the AF_LOCAL socket to close after
      we've registered a service, then we want to be careful about reconnecting
      since the mount namespace may have changed.
      
      By simply refusing to reconnect the AF_LOCAL socket in the case of
      unregister, we avoid the need to somehow save the mount namespace. While
      this may lead to some services not unregistering properly, it should
      be safe.
      Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
      Cc: Nix <nix@esperi.org.uk>
      Cc: Jeff Layton <jlayton@redhat.com>
      Cc: stable@vger.kernel.org # 3.9.x
      786615bc
  11. 07 Jun 2013, 3 commits
  12. 01 Feb 2013, 1 commit
  13. 06 Dec 2012, 1 commit
  14. 18 Feb 2012, 1 commit
  15. 15 Feb 2012, 1 commit
  16. 01 Feb 2012, 3 commits
  17. 18 Jul 2011, 1 commit
  18. 15 Jun 2011, 1 commit
  19. 25 Apr 2011, 1 commit
  20. 19 Apr 2011, 1 commit
  21. 11 Mar 2011, 1 commit
    • SUNRPC: Close a race in __rpc_wait_for_completion_task() · bf294b41
      Authored by Trond Myklebust
      Although they run as rpciod background tasks, under normal operation
      (i.e. no SIGKILL), functions like nfs_sillyrename(), nfs4_proc_unlck()
      and nfs4_do_close() want to be fully synchronous. This means that when we
      exit, we want all references to the rpc_task to be gone, and we want
      any dentry references etc. held by that task to be released.
      
      For this reason these functions call __rpc_wait_for_completion_task(),
      followed by rpc_put_task() in the expectation that the latter will be
      releasing the last reference to the rpc_task, and thus ensuring that the
      callback_ops->rpc_release() has been called synchronously.
      
      This patch fixes a race which exists due to the fact that
      rpciod calls rpc_complete_task() (in order to wake up the callers of
      __rpc_wait_for_completion_task()) and then subsequently calls
      rpc_put_task() without ensuring that these two steps are done atomically.
      
      In order to avoid adding new spin locks, the patch uses the existing
      waitqueue spin lock to order the rpc_task reference count releases between
      the waiting process and rpciod.
      The common case, where nobody is waiting for completion, is optimised by
      checking whether the RPC_TASK_ASYNC flag is cleared and/or the rpc_task
      reference count is 1: in those cases we skip taking the spin lock and
      immediately free the rpc_task.
      
      Those few processes that need to put the rpc_task from inside an
      asynchronous context and that do not care about ordering are given a new
      helper: rpc_put_task_async().
      Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
      bf294b41
  22. 04 Aug 2010, 1 commit
  23. 15 May 2010, 4 commits
  24. 16 Dec 2009, 2 commits
  25. 04 Dec 2009, 1 commit
    • SUNRPC: Allow RPCs to fail quickly if the server is unreachable · 09a21c41
      Authored by Chuck Lever
      The kernel sometimes makes RPC calls to services that aren't running.
      Because the kernel's RPC client always assumes the hard retry semantic
      when reconnecting a connection-oriented RPC transport, the underlying
      reconnect logic takes a long while to time out, even though the remote
      may have responded immediately with ECONNREFUSED.
      
      In certain cases, like upcalls to our local rpcbind daemon, or for NFS
      mount requests, we'd like the kernel to fail immediately if the remote
      service isn't reachable.  This allows another transport to be tried
      immediately, or the pending request can be abandoned quickly.
      
      Introduce a per-request flag which controls how call_transmit_status()
      behaves when request transmission fails because the server cannot be
      reached.
      
      We don't want soft connection semantics to apply to other errors.  The
      default case of the switch statement in call_transmit_status() no
      longer falls through; the fall through code is copied to the default
      case, and a "break;" is added.
      
      The transport's connection re-establishment timeout is also ignored for
      such requests.  We want the request to fail immediately, so the
      reconnect delay is skipped.  Additionally, we don't want a connect
      failure here to further increase the reconnect timeout value, since
      this request will not be retried.
      Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
      Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
      09a21c41
  26. 18 Jun 2009, 2 commits
  27. 10 Jul 2008, 1 commit
  28. 29 Feb 2008, 3 commits