1. 09 10月, 2013 1 次提交
    • E
      ipv6: make lookups simpler and faster · efe4208f
      Eric Dumazet 提交于
      TCP listener refactoring, part 4 :
      
      To speed up inet lookups, we moved IPv4 addresses from inet to struct
      sock_common
      
      Now is time to do the same for IPv6, because it permits us to have fast
      lookups for all kind of sockets, including upcoming SYN_RECV.
      
      Getting IPv6 addresses in TCP lookups currently requires two extra cache
      lines, plus a dereference (and memory stall).
      
      inet6_sk(sk) does the dereference of inet_sk(__sk)->pinet6
      
      This patch is way bigger than its IPv4 counter part, because for IPv4,
      we could add aliases (inet_daddr, inet_rcv_saddr), while on IPv6,
      it's not doable easily.
      
      inet6_sk(sk)->daddr becomes sk->sk_v6_daddr
      inet6_sk(sk)->rcv_saddr becomes sk->sk_v6_rcv_saddr
      
      And timewait socket also have tw->tw_v6_daddr & tw->tw_v6_rcv_saddr
      at the same offset.
      
      We get rid of INET6_TW_MATCH() as INET6_MATCH() is now the generic
      macro.
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      efe4208f
  2. 18 9月, 2013 1 次提交
    • J
      RPCSEC_GSS: fix crash on destroying gss auth · a0f6ed8e
      J. Bruce Fields 提交于
      This fixes a regression since  eb6dc19d
      "RPCSEC_GSS: Share all credential caches on a per-transport basis" which
      could cause an occasional oops in the nfsd code (see below).
      
      The problem was that an auth was left referencing a client that had been
      freed.  To avoid this we need to ensure that auths are shared only
      between descendants of a common client; the fact that a clone of an
      rpc_client takes a reference on its parent then ensures that the parent
      client will last as long as the auth.
      
      Also add a comment explaining what I think was the intention of this
      code.
      
        general protection fault: 0000 [#1] PREEMPT SMP
        Modules linked in: rpcsec_gss_krb5 nfsd auth_rpcgss oid_registry nfs_acl lockd sunrpc
        CPU: 3 PID: 4071 Comm: kworker/u8:2 Not tainted 3.11.0-rc2-00182-g025145f #1665
        Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
        Workqueue: nfsd4_callbacks nfsd4_do_callback_rpc [nfsd]
        task: ffff88003e206080 ti: ffff88003c384000 task.ti: ffff88003c384000
        RIP: 0010:[<ffffffffa00001f3>]  [<ffffffffa00001f3>] rpc_net_ns+0x53/0x70 [sunrpc]
        RSP: 0000:ffff88003c385ab8  EFLAGS: 00010246
        RAX: 6b6b6b6b6b6b6b6b RBX: ffff88003af9a800 RCX: 0000000000000002
        RDX: ffffffffa00001a5 RSI: 0000000000000001 RDI: ffffffff81e284e0
        RBP: ffff88003c385ad8 R08: 0000000000000001 R09: 0000000000000000
        R10: 0000000000000000 R11: 0000000000000015 R12: ffff88003c990840
        R13: ffff88003c990878 R14: ffff88003c385ba8 R15: ffff88003e206080
        FS:  0000000000000000(0000) GS:ffff88003fd80000(0000) knlGS:0000000000000000
        CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
        CR2: 00007fcdf737e000 CR3: 000000003ad2b000 CR4: 00000000000006e0
        Stack:
         ffffffffa00001a5 0000000000000006 0000000000000006 ffff88003af9a800
         ffff88003c385b08 ffffffffa00d52a4 ffff88003c385ba8 ffff88003c751bd8
         ffff88003c751bc0 ffff88003e113600 ffff88003c385b18 ffffffffa00d530c
        Call Trace:
         [<ffffffffa00001a5>] ? rpc_net_ns+0x5/0x70 [sunrpc]
         [<ffffffffa00d52a4>] __gss_pipe_release+0x54/0x90 [auth_rpcgss]
         [<ffffffffa00d530c>] gss_pipe_free+0x2c/0x30 [auth_rpcgss]
         [<ffffffffa00d678b>] gss_destroy+0x9b/0xf0 [auth_rpcgss]
         [<ffffffffa000de63>] rpcauth_release+0x23/0x30 [sunrpc]
         [<ffffffffa0001e81>] rpc_release_client+0x51/0xb0 [sunrpc]
         [<ffffffffa00020d5>] rpc_shutdown_client+0xe5/0x170 [sunrpc]
         [<ffffffff81098a14>] ? cpuacct_charge+0xa4/0xb0
         [<ffffffff81098975>] ? cpuacct_charge+0x5/0xb0
         [<ffffffffa019556f>] nfsd4_process_cb_update.isra.17+0x2f/0x210 [nfsd]
         [<ffffffff819a4ac0>] ? _raw_spin_unlock_irq+0x30/0x60
         [<ffffffff819a4acb>] ? _raw_spin_unlock_irq+0x3b/0x60
         [<ffffffff810703ab>] ? process_one_work+0x15b/0x510
         [<ffffffffa01957dd>] nfsd4_do_callback_rpc+0x8d/0xa0 [nfsd]
         [<ffffffff8107041e>] process_one_work+0x1ce/0x510
         [<ffffffff810703ab>] ? process_one_work+0x15b/0x510
         [<ffffffff810712ab>] worker_thread+0x11b/0x370
         [<ffffffff81071190>] ? manage_workers.isra.24+0x2b0/0x2b0
         [<ffffffff8107854b>] kthread+0xdb/0xe0
         [<ffffffff819a4ac0>] ? _raw_spin_unlock_irq+0x30/0x60
         [<ffffffff81078470>] ? __init_kthread_worker+0x70/0x70
         [<ffffffff819ac7dc>] ret_from_fork+0x7c/0xb0
         [<ffffffff81078470>] ? __init_kthread_worker+0x70/0x70
        Code: a5 01 00 a0 31 d2 31 f6 48 c7 c7 e0 84 e2 81 e8 f4 91 0a e1 48 8b 43 60 48 c7 c2 a5 01 00 a0 be 01 00 00 00 48 c7 c7 e0 84 e2 81 <48> 8b 98 10 07 00 00 e8 91 8f 0a e1 e8
        +3c 4e 07 e1 48 83 c4 18
        RIP  [<ffffffffa00001f3>] rpc_net_ns+0x53/0x70 [sunrpc]
         RSP <ffff88003c385ab8>
      Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
      Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>
      a0f6ed8e
  3. 12 9月, 2013 2 次提交
  4. 11 9月, 2013 1 次提交
    • D
      shrinker: convert remaining shrinkers to count/scan API · 70534a73
      Dave Chinner 提交于
      Convert the remaining couple of random shrinkers in the tree to the new
      API.
      Signed-off-by: NDave Chinner <dchinner@redhat.com>
      Signed-off-by: NGlauber Costa <glommer@openvz.org>
      Cc: Marcelo Tosatti <mtosatti@redhat.com>
      Cc: Gleb Natapov <gleb@redhat.com>
      Cc: Chuck Lever <chuck.lever@oracle.com>
      Cc: J. Bruce Fields <bfields@redhat.com>
      Cc: Trond Myklebust <Trond.Myklebust@netapp.com>
      Cc: "Theodore Ts'o" <tytso@mit.edu>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
      Cc: Arve Hjønnevåg <arve@android.com>
      Cc: Carlos Maiolino <cmaiolino@redhat.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Chuck Lever <chuck.lever@oracle.com>
      Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Gleb Natapov <gleb@redhat.com>
      Cc: Greg Thelen <gthelen@google.com>
      Cc: J. Bruce Fields <bfields@redhat.com>
      Cc: Jan Kara <jack@suse.cz>
      Cc: Jerome Glisse <jglisse@redhat.com>
      Cc: John Stultz <john.stultz@linaro.org>
      Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Cc: Kent Overstreet <koverstreet@google.com>
      Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Cc: Marcelo Tosatti <mtosatti@redhat.com>
      Cc: Mel Gorman <mgorman@suse.de>
      Cc: Steven Whitehouse <swhiteho@redhat.com>
      Cc: Thomas Hellstrom <thellstrom@vmware.com>
      Cc: Trond Myklebust <Trond.Myklebust@netapp.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
      70534a73
  5. 06 9月, 2013 4 次提交
  6. 05 9月, 2013 3 次提交
  7. 04 9月, 2013 3 次提交
    • A
      SUNRPC refactor rpcauth_checkverf error returns · 35fa5f7b
      Andy Adamson 提交于
      Most of the time an error from the credops crvalidate function means the
      server has sent us a garbage verifier. The gss_validate function is the
      exception where there is an -EACCES case if the user GSS_context on the client
      has expired.
      Signed-off-by: NAndy Adamson <andros@netapp.com>
      Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>
      35fa5f7b
    • A
      SUNRPC new rpc_credops to test credential expiry · 4de6caa2
      Andy Adamson 提交于
      This patch provides the RPC layer helper functions to allow NFS to manage
      data in the face of expired credentials - such as avoiding buffered WRITEs
      and COMMITs when the gss context will expire before the WRITEs are flushed
      and COMMITs are sent.
      
      These helper functions enable checking the expiration of an underlying
      credential key for a generic rpc credential, e.g. the gss_cred gss context
      gc_expiry which for Kerberos is set to the remaining TGT lifetime.
      
      A new rpc_authops key_timeout is only defined for the generic auth.
      A new rpc_credops crkey_to_expire is only defined for the generic cred.
      A new rpc_credops crkey_timeout is only defined for the gss cred.
      
      Set a credential key expiry watermark, RPC_KEY_EXPIRE_TIMEO set to 240 seconds
      as a default and can be set via a module parameter as we need to ensure there
      is time for any dirty data to be flushed.
      
      If key_timeout is called on a credential with an underlying credential key that
      will expire within watermark seconds, we set the RPC_CRED_KEY_EXPIRE_SOON
      flag in the generic_cred acred so that the NFS layer can clean up prior to
      key expiration.
      
      Checking a generic credential's underlying credential involves a cred lookup.
      To avoid this lookup in the normal case when the underlying credential has
      a key that is valid (before the watermark), a notify flag is set in
      the generic credential the first time the key_timeout is called. The
      generic credential then stops checking the underlying credential key expiry, and
      the underlying credential (gss_cred) match routine then checks the key
      expiration upon each normal use and sets a flag in the associated generic
      credential only when the key expiration is within the watermark.
      This in turn signals the generic credential key_timeout to perform the extra
      credential lookup thereafter.
      Signed-off-by: NAndy Adamson <andros@netapp.com>
      Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>
      4de6caa2
    • A
      SUNRPC: don't map EKEYEXPIRED to EACCES in call_refreshresult · f1ff0c27
      Andy Adamson 提交于
      The NFS layer needs to know when a key has expired.
      This change also returns -EKEYEXPIRED to the application, and the informative
      "Key has expired" error message is displayed. The user then knows that
      credential renewal is required.
      Signed-off-by: NAndy Adamson <andros@netapp.com>
      Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>
      f1ff0c27
  8. 03 9月, 2013 2 次提交
  9. 01 9月, 2013 5 次提交
  10. 30 8月, 2013 8 次提交
  11. 29 8月, 2013 1 次提交
  12. 08 8月, 2013 1 次提交
    • T
      SUNRPC: If the rpcbind channel is disconnected, fail the call to unregister · 786615bc
      Trond Myklebust 提交于
      If rpcbind causes our connection to the AF_LOCAL socket to close after
      we've registered a service, then we want to be careful about reconnecting
      since the mount namespace may have changed.
      
      By simply refusing to reconnect the AF_LOCAL socket in the case of
      unregister, we avoid the need to somehow save the mount namespace. While
      this may lead to some services not unregistering properly, it should
      be safe.
      Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>
      Cc: Nix <nix@esperi.org.uk>
      Cc: Jeff Layton <jlayton@redhat.com>
      Cc: stable@vger.kernel.org # 3.9.x
      786615bc
  13. 06 8月, 2013 1 次提交
  14. 01 8月, 2013 5 次提交
    • J
      svcrpc: set cr_gss_mech from gss-proxy as well as legacy upcall · 7193bd17
      J. Bruce Fields 提交于
      The change made to rsc_parse() in
      0dc1531a "svcrpc: store gss mech in
      svc_cred" should also have been propagated to the gss-proxy codepath.
      This fixes a crash in the gss-proxy case.
      Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
      7193bd17
    • J
      svcrpc: fix kfree oops in gss-proxy code · 743e2171
      J. Bruce Fields 提交于
      mech_oid.data is an array, not kmalloc()'d memory.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
      743e2171
    • J
      svcrpc: fix gss-proxy xdr decoding oops · dc43376c
      J. Bruce Fields 提交于
      Uninitialized stack data was being used as the destination for memcpy's.
      
      Longer term we'll just delete some of this code; all we're doing is
      skipping over xdr that we don't care about.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
      dc43376c
    • J
      svcrpc: fix gss_rpc_upcall create error · 9f96392b
      J. Bruce Fields 提交于
      Cc: stable@vger.kernel.org
      Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
      9f96392b
    • N
      NFSD/sunrpc: avoid deadlock on TCP connection due to memory pressure. · 447383d2
      NeilBrown 提交于
      Since we enabled auto-tuning for sunrpc TCP connections we do not
      guarantee that there is enough write-space on each connection to
      queue a reply.
      
      If memory pressure causes the window to shrink too small, the request
      throttling in sunrpc/svc will not accept any requests so no more requests
      will be handled.  Even when pressure decreases the window will not
      grow again until data is sent on the connection.
      This means we get a deadlock:  no requests will be handled until there
      is more space, and no space will be allocated until a request is
      handled.
      
      This can be simulated by modifying svc_tcp_has_wspace to inflate the
      number of byte required and removing the 'svc_sock_setbufsize' calls
      in svc_setup_socket.
      
      I found that multiplying by 16 was enough to make the requirement
      exceed the default allocation.  With this modification in place:
         mount -o vers=3,proto=tcp 127.0.0.1:/home /mnt
      would block and eventually time out because the nfs server could not
      accept any requests.
      
      This patch relaxes the request throttling to always allow at least one
      request through per connection.  It does this by checking both
        sk_stream_min_wspace() and xprt->xpt_reserved
      are zero.
      The first is zero when the TCP transmit queue is empty.
      The second is zero when there are no RPC requests being processed.
      When both of these are zero the socket is idle and so one more
      request can safely be allowed through.
      
      Applying this patch allows the above mount command to succeed cleanly.
      Tracing shows that the allocated write buffer space quickly grows and
      after a few requests are handled, the extra tests are no longer needed
      to permit further requests to be processed.
      
      The main purpose of request throttling is to handle the case when one
      client is slow at collecting replies and the send queue gets full of
      replies that the client hasn't acknowledged (at the TCP level) yet.
      As we only change behaviour when the send queue is empty this main
      purpose is still preserved.
      Reported-by: NBen Myers <bpm@sgi.com>
      Signed-off-by: NNeilBrown <neilb@suse.de>
      Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
      447383d2
  15. 25 7月, 2013 1 次提交
  16. 24 7月, 2013 1 次提交