1. 22 8月, 2012 8 次提交
  2. 21 8月, 2012 3 次提交
    • J
      svcrpc: fix svc_xprt_enqueue/svc_recv busy-looping · d10f27a7
      J. Bruce Fields 提交于
      The rpc server tries to ensure that there will be room to send a reply
      before it receives a request.
      
      It does this by tracking, in xpt_reserved, an upper bound on the total
      size of the replies that is has already committed to for the socket.
      
      Currently it is adding in the estimate for a new reply *before* it
      checks whether there is space available.  If it finds that there is not
      space, it then subtracts the estimate back out.
      
      This may lead the subsequent svc_xprt_enqueue to decide that there is
      space after all.
      
      The results is a svc_recv() that will repeatedly return -EAGAIN, causing
      server threads to loop without doing any actual work.
      
      Cc: stable@vger.kernel.org
      Reported-by: NMichael Tokarev <mjt@tls.msk.ru>
      Tested-by: NMichael Tokarev <mjt@tls.msk.ru>
      Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
      d10f27a7
    • J
      svcrpc: sends on closed socket should stop immediately · f06f00a2
      J. Bruce Fields 提交于
      svc_tcp_sendto sets XPT_CLOSE if we fail to transmit the entire reply.
      However, the XPT_CLOSE won't be acted on immediately.  Meanwhile other
      threads could send further replies before the socket is really shut
      down.  This can manifest as data corruption: for example, if a truncated
      read reply is followed by another rpc reply, that second reply will look
      to the client like further read data.
      
      Symptoms were data corruption preceded by svc_tcp_sendto logging
      something like
      
      	kernel: rpc-srv/tcp: nfsd: sent only 963696 when sending 1048708 bytes - shutting down socket
      
      Cc: stable@vger.kernel.org
      Reported-by: NMalahal Naineni <malahal@us.ibm.com>
      Tested-by: NMalahal Naineni <malahal@us.ibm.com>
      Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
      f06f00a2
    • J
      svcrpc: fix BUG() in svc_tcp_clear_pages · be1e4444
      J. Bruce Fields 提交于
      Examination of svc_tcp_clear_pages shows that it assumes sk_tcplen is
      consistent with sk_pages[] (in particular, sk_pages[n] can't be NULL if
      sk_tcplen would lead us to expect n pages of data).
      
      svc_tcp_restore_pages zeroes out sk_pages[] while leaving sk_tcplen.
      This is OK, since both functions are serialized by XPT_BUSY.  However,
      that means the inconsistency must be repaired before dropping XPT_BUSY.
      
      Therefore we should be ensuring that svc_tcp_save_pages repairs the
      problem before exiting svc_tcp_recv_record on error.
      
      Symptoms were a BUG() in svc_tcp_clear_pages.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
      be1e4444
  3. 01 8月, 2012 1 次提交
    • M
      nfs: enable swap on NFS · a564b8f0
      Mel Gorman 提交于
      Implement the new swapfile a_ops for NFS and hook up ->direct_IO.  This
      will set the NFS socket to SOCK_MEMALLOC and run socket reconnect under
      PF_MEMALLOC as well as reset SOCK_MEMALLOC before engaging the protocol
      ->connect() method.
      
      PF_MEMALLOC should allow the allocation of struct socket and related
      objects and the early (re)setting of SOCK_MEMALLOC should allow us to
      receive the packets required for the TCP connection buildup.
      
      [jlayton@redhat.com: Restore PF_MEMALLOC task flags in all cases]
      [dfeng@redhat.com: Fix handling of multiple swap files]
      [a.p.zijlstra@chello.nl: Original patch]
      Signed-off-by: NMel Gorman <mgorman@suse.de>
      Acked-by: NRik van Riel <riel@redhat.com>
      Cc: Christoph Hellwig <hch@infradead.org>
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Eric B Munson <emunson@mgebm.net>
      Cc: Eric Paris <eparis@redhat.com>
      Cc: James Morris <jmorris@namei.org>
      Cc: Mel Gorman <mgorman@suse.de>
      Cc: Mike Christie <michaelc@cs.wisc.edu>
      Cc: Neil Brown <neilb@suse.de>
      Cc: Sebastian Andrzej Siewior <sebastian@breakpoint.cc>
      Cc: Trond Myklebust <Trond.Myklebust@netapp.com>
      Cc: Xiaotian Feng <dfeng@redhat.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      a564b8f0
  4. 31 7月, 2012 4 次提交
    • S
      SUNRPC: return negative value in case rpcbind client creation error · caea33da
      Stanislav Kinsbursky 提交于
      Without this patch kernel will panic on LockD start, because lockd_up() checks
      lockd_up_net() result for negative value.
      From my pow it's better to return negative value from rpcbind routines instead
      of replacing all such checks like in lockd_up().
      Signed-off-by: NStanislav Kinsbursky <skinsbursky@parallels.com>
      Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>
      Cc: stable@vger.kernel.org [>= 3.0]
      caea33da
    • J
      nfs: skip commit in releasepage if we're freeing memory for fs-related reasons · 5cf02d09
      Jeff Layton 提交于
      We've had some reports of a deadlock where rpciod ends up with a stack
      trace like this:
      
          PID: 2507   TASK: ffff88103691ab40  CPU: 14  COMMAND: "rpciod/14"
           #0 [ffff8810343bf2f0] schedule at ffffffff814dabd9
           #1 [ffff8810343bf3b8] nfs_wait_bit_killable at ffffffffa038fc04 [nfs]
           #2 [ffff8810343bf3c8] __wait_on_bit at ffffffff814dbc2f
           #3 [ffff8810343bf418] out_of_line_wait_on_bit at ffffffff814dbcd8
           #4 [ffff8810343bf488] nfs_commit_inode at ffffffffa039e0c1 [nfs]
           #5 [ffff8810343bf4f8] nfs_release_page at ffffffffa038bef6 [nfs]
           #6 [ffff8810343bf528] try_to_release_page at ffffffff8110c670
           #7 [ffff8810343bf538] shrink_page_list.clone.0 at ffffffff81126271
           #8 [ffff8810343bf668] shrink_inactive_list at ffffffff81126638
           #9 [ffff8810343bf818] shrink_zone at ffffffff8112788f
          #10 [ffff8810343bf8c8] do_try_to_free_pages at ffffffff81127b1e
          #11 [ffff8810343bf958] try_to_free_pages at ffffffff8112812f
          #12 [ffff8810343bfa08] __alloc_pages_nodemask at ffffffff8111fdad
          #13 [ffff8810343bfb28] kmem_getpages at ffffffff81159942
          #14 [ffff8810343bfb58] fallback_alloc at ffffffff8115a55a
          #15 [ffff8810343bfbd8] ____cache_alloc_node at ffffffff8115a2d9
          #16 [ffff8810343bfc38] kmem_cache_alloc at ffffffff8115b09b
          #17 [ffff8810343bfc78] sk_prot_alloc at ffffffff81411808
          #18 [ffff8810343bfcb8] sk_alloc at ffffffff8141197c
          #19 [ffff8810343bfce8] inet_create at ffffffff81483ba6
          #20 [ffff8810343bfd38] __sock_create at ffffffff8140b4a7
          #21 [ffff8810343bfd98] xs_create_sock at ffffffffa01f649b [sunrpc]
          #22 [ffff8810343bfdd8] xs_tcp_setup_socket at ffffffffa01f6965 [sunrpc]
          #23 [ffff8810343bfe38] worker_thread at ffffffff810887d0
          #24 [ffff8810343bfee8] kthread at ffffffff8108dd96
          #25 [ffff8810343bff48] kernel_thread at ffffffff8100c1ca
      
      rpciod is trying to allocate memory for a new socket to talk to the
      server. The VM ends up calling ->releasepage to get more memory, and it
      tries to do a blocking commit. That commit can't succeed however without
      a connected socket, so we deadlock.
      
      Fix this by setting PF_FSTRANS on the workqueue task prior to doing the
      socket allocation, and having nfs_release_page check for that flag when
      deciding whether to do a commit call. Also, set PF_FSTRANS
      unconditionally in rpc_async_schedule since that function can also do
      allocations sometimes.
      Signed-off-by: NJeff Layton <jlayton@redhat.com>
      Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>
      Cc: stable@vger.kernel.org
      5cf02d09
    • J
      sunrpc: clarify comments on rpc_make_runnable · 506026c3
      Jeff Layton 提交于
      rpc_make_runnable is not generally called with the queue lock held, unless
      it's waking up a task that has been sitting on a waitqueue. This is safe
      when the task has not entered the FSM yet, but the comments don't really
      spell this out.
      Signed-off-by: NJeff Layton <jlayton@redhat.com>
      Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>
      506026c3
    • J
      sunrpc: clnt: Add missing braces · cac5d07e
      Joe Perches 提交于
      Add a missing set of braces that commit 4e0038b6
      ("SUNRPC: Move clnt->cl_server into struct rpc_xprt")
      forgot.
      Signed-off-by: NJoe Perches <joe@perches.com>
      Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>
      Cc: stable@vger.kernel.org [>= 3.4]
      cac5d07e
  5. 19 7月, 2012 1 次提交
    • E
      ipv6: add ipv6_addr_hash() helper · ddbe5032
      Eric Dumazet 提交于
      Introduce ipv6_addr_hash() helper doing a XOR on all bits
      of an IPv6 address, with an optimized x86_64 version.
      
      Use it in flow dissector, as suggested by Andrew McGregor,
      to reduce hash collision probabilities in fq_codel (and other
      users of flow dissector)
      
      Use it in ip6_tunnel.c and use more bit shuffling, as suggested
      by David Laight, as existing hash was ignoring most of them.
      
      Use it in sunrpc and use more bit shuffling, using hash_32().
      
      Use it in net/ipv6/addrconf.c, using hash_32() as well.
      
      As a cleanup, use it in net/ipv4/tcp_metrics.c
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Reported-by: NAndrew McGregor <andrewmcgr@gmail.com>
      Cc: Dave Taht <dave.taht@gmail.com>
      Cc: Tom Herbert <therbert@google.com>
      Cc: David Laight <David.Laight@ACULAB.COM>
      Cc: Joe Perches <joe@perches.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      ddbe5032
  6. 18 7月, 2012 1 次提交
  7. 17 7月, 2012 1 次提交
    • C
      SUNRPC: Add rpcauth_list_flavors() · 6a1a1e34
      Chuck Lever 提交于
      The gss_mech_list_pseudoflavors() function provides a list of
      currently registered GSS pseudoflavors.  This list does not include
      any non-GSS flavors that have been registered with the RPC client.
      nfs4_find_root_sec() currently adds these extra flavors by hand.
      
      Instead, nfs4_find_root_sec() should be looking at the set of flavors
      that have been explicitly registered via rpcauth_register().  And,
      other areas of code will soon need the same kind of list that
      contains all flavors the kernel currently knows about (see below).
      
      Rather than cloning the open-coded logic in nfs4_find_root_sec() to
      those new places, introduce a generic RPC function that generates a
      full list of registered auth flavors and pseudoflavors.
      
      A new rpc_authops method is added that lists a flavor's
      pseudoflavors, if it has any.  I encountered an interesting module
      loader loop when I tried to get the RPC client to invoke
      gss_mech_list_pseudoflavors() by name.
      
      This patch is a pre-requisite for server trunking discovery, and a
      pre-requisite for fixing up the in-kernel mount client to do better
      automatic security flavor selection.
      Signed-off-by: NChuck Lever <chuck.lever@oracle.com>
      Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>
      6a1a1e34
  8. 12 7月, 2012 1 次提交
    • N
      SUNRPC/cache: fix reporting of expired cache entries in 'content' file. · 200724a7
      NeilBrown 提交于
      Entries that are in a sunrpc cache but are not valid should be reported
      with a leading '#' so they look like a comment.
      Commit  d202cce8 (sunrpc: never return expired entries in sunrpc_cache_lookup)
      broke this for expired entries.
      
      This particularly applies to entries that have been replaced by newer entries.
      sunrpc_cache_update sets the expiry of the replaced entry to '0', but it
      remains in the cache until the next 'cache_clean'.
      The result is that if you
      
        echo 0 2000000000 1 0 > /proc/net/rpc/auth.unix.gid/channel
      
      several times, then
      
        cat /proc/net/rpc/auth.unix.gid/content
      
      It will display multiple entries for the one uid, which is at least confusing:
      
        #uid cnt: gids...
        0 1: 0
        0 1: 0
        0 1: 0
      
      With this patch, expired entries are marked as comments so you get
      
        #uid cnt: gids...
        0 1: 0
        # 0 1: 0
        # 0 1: 0
      
      These expired entries will never be seen by cache_check() as they are always
      *after* a non-expired entry with the same key - so the extra check is only
      needed in c_show()
      Signed-off-by: NNeilBrown <neilb@suse.de>
      
      --
      It's not a big problem, but it had me confused for a while, so it could
      well confuse others.
      Thanks,
      NeilBrown
      Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
      200724a7
  9. 11 7月, 2012 1 次提交
  10. 05 7月, 2012 1 次提交
    • D
      sunrpc: Don't do a dst_confirm() on an input routes. · 60d354eb
      David S. Miller 提交于
      xs_udp_data_ready() is operating on received packets, and tries to
      do a dst_confirm() on the dst attached to the SKB.
      
      This isn't right, dst confirmation is for output routes, not input
      rights.  It's for resetting the timers on the nexthop neighbour entry
      for the route, indicating that we've got good evidence that we've
      successfully reached it.
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      60d354eb
  11. 29 6月, 2012 9 次提交
  12. 28 6月, 2012 1 次提交
  13. 27 6月, 2012 1 次提交
  14. 12 6月, 2012 1 次提交
  15. 01 6月, 2012 6 次提交