1. 12 1月, 2011 3 次提交
    • J
      rpc: allow xprt_class->setup to return a preexisting xprt · f0418aa4
      J. Bruce Fields 提交于
      This allows us to reuse the xprt associated with a server connection if
      one has already been set up.
      Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
      f0418aa4
    • J
      rpc: keep backchannel xprt as long as server connection · 99de8ea9
      J. Bruce Fields 提交于
      Multiple backchannels can share the same tcp connection; from rfc 5661 section
      2.10.3.1:
      
      	A connection's association with a session is not exclusive.  A
      	connection associated with the channel(s) of one session may be
      	simultaneously associated with the channel(s) of other sessions
      	including sessions associated with other client IDs.
      
      However, multiple backchannels share a connection, they must all share
      the same xid stream (hence the same rpc_xprt); the only way we have to
      match replies with calls at the rpc layer is using the xid.
      
      So, keep the rpc_xprt around as long as the connection lasts, in case
      we're asked to use the connection as a backchannel again.
      
      Requests to create new backchannel clients over a given server
      connection should results in creating new clients that reuse the
      existing rpc_xprt.
      
      But to start, just reject attempts to associate multiple rpc_xprt's with
      the same underlying bc_xprt.
      Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
      99de8ea9
    • J
      rpc: move sk_bc_xprt to svc_xprt · d75faea3
      J. Bruce Fields 提交于
      This seems obviously transport-level information even if it's currently
      used only by the server socket code.
      Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
      d75faea3
  2. 05 1月, 2011 6 次提交
    • J
      svcrpc: ensure cache_check caller sees updated entry · fdef7aa5
      J. Bruce Fields 提交于
      Supposes cache_check runs simultaneously with an update on a different
      CPU:
      
      	cache_check			task doing update
      	^^^^^^^^^^^			^^^^^^^^^^^^^^^^^
      
      	1. test for CACHE_VALID		1'. set entry->data
      	   & !CACHE_NEGATIVE
      
      	2. use entry->data		2'. set CACHE_VALID
      
      If the two memory writes performed in step 1' and 2' appear misordered
      with respect to the reads in step 1 and 2, then the caller could get
      stale data at step 2 even though it saw CACHE_VALID set on the cache
      entry.
      
      Add memory barriers to prevent this.
      Reviewed-by: NNeilBrown <neilb@suse.de>
      Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
      fdef7aa5
    • J
      svcrpc: take lock on turning entry NEGATIVE in cache_check · 6bab93f8
      J. Bruce Fields 提交于
      We attempt to turn a cache entry negative in place.  But that entry may
      already have been filled in by some other task since we last checked
      whether it was valid, so we could be modifying an already-valid entry.
      If nothing else there's a likely leak in such a case when the entry is
      eventually put() and contents are not freed because it has
      CACHE_NEGATIVE set.
      
      So, take the cache_lock just as sunrpc_cache_update() does.
      Reviewed-by: NNeilBrown <neilb@suse.de>
      Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
      6bab93f8
    • J
      svcrpc: simpler request dropping · 9e701c61
      J. Bruce Fields 提交于
      Currently we use -EAGAIN returns to determine when to drop a deferred
      request.  On its own, that is error-prone, as it makes us treat -EAGAIN
      returns from other functions specially to prevent inadvertent dropping.
      
      So, use a flag on the request instead.
      
      Returning an error on request deferral is still required, to prevent
      further processing, but we no longer need worry that an error return on
      its own could result in a drop.
      Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
      9e701c61
    • J
      svcrpc: avoid double reply caused by deferral race · d76d1815
      J. Bruce Fields 提交于
      Commit d29068c4 "sunrpc: Simplify cache_defer_req and related
      functions." asserted that cache_check() could determine success or
      failure of cache_defer_req() by checking the CACHE_PENDING bit.
      
      This isn't quite right.
      
      We need to know whether cache_defer_req() created a deferred request,
      in which case sending an rpc reply has become the responsibility of the
      deferred request, and it is important that we not send our own reply,
      resulting in two different replies to the same request.
      
      And the CACHE_PENDING bit doesn't tell us that; we could have
      succesfully created a deferred request at the same time as another
      thread cleared the CACHE_PENDING bit.
      
      So, partially revert that commit, to ensure that cache_check() returns
      -EAGAIN if and only if a deferred request has been created.
      Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
      Acked-by: NNeilBrown <neilb@suse.de>
      d76d1815
    • J
      SUNRPC: Remove more code when NFSD_DEPRECATED is not configured · bdd5f05d
      J. Bruce Fields 提交于
      Signed-off-by: NNeilBrown <neilb@suse.de>
      [bfields@redhat.com: moved svcauth_unix_purge outside ifdef's.]
      Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
      bdd5f05d
    • J
      svcrpc: modifying valid sunrpc cache entries is racy · 31f7aa65
      J. Bruce Fields 提交于
      Once a sunrpc cache entry is VALID, we should be replacing it (and
      allowing any concurrent users to destroy it on last put) instead of
      trying to update it in place.
      
      Otherwise someone referencing the ip_map we're modifying here could try
      to use the m_client just as we're putting the last reference.
      
      The bug should only be seen by users of the legacy nfsd interfaces.
      
      (Thanks to Neil for suggestion to use sunrpc_invalidate.)
      Reviewed-by: NNeilBrown <neilb@suse.de>
      Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
      31f7aa65
  3. 18 12月, 2010 4 次提交
  4. 08 12月, 2010 1 次提交
    • N
      sunrpc: prevent use-after-free on clearing XPT_BUSY · ed2849d3
      NeilBrown 提交于
      When an xprt is created, it has a refcount of 1, and XPT_BUSY is set.
      The refcount is *not* owned by the thread that created the xprt
      (as is clear from the fact that creators never put the reference).
      Rather, it is owned by the absence of XPT_DEAD.  Once XPT_DEAD is set,
      (And XPT_BUSY is clear) that initial reference is dropped and the xprt
      can be freed.
      
      So when a creator clears XPT_BUSY it is dropping its only reference and
      so must not touch the xprt again.
      
      However svc_recv, after calling ->xpo_accept (and so getting an XPT_BUSY
      reference on a new xprt), calls svc_xprt_recieved.  This clears
      XPT_BUSY and then svc_xprt_enqueue - this last without owning a reference.
      This is dangerous and has been seen to leave svc_xprt_enqueue working
      with an xprt containing garbage.
      
      So we need to hold an extra counted reference over that call to
      svc_xprt_received.
      
      For safety, any time we clear XPT_BUSY and then use the xprt again, we
      first get a reference, and the put it again afterwards.
      
      Note that svc_close_all does not need this extra protection as there are
      no threads running, and the final free can only be called asynchronously
      from such a thread.
      Signed-off-by: NNeilBrown <neilb@suse.de>
      Cc: stable@kernel.org
      Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
      ed2849d3
  5. 23 11月, 2010 1 次提交
  6. 20 11月, 2010 5 次提交
    • J
      svcrpc: fix wspace-checking race · 9c335c0b
      J. Bruce Fields 提交于
      We call svc_xprt_enqueue() after something happens which we think may
      require handling from a server thread.  To avoid such events being lost,
      svc_xprt_enqueue() must guarantee that there will be a svc_serv() call
      from a server thread following any such event.  It does that by either
      waking up a server thread itself, or checking that XPT_BUSY is set (in
      which case somebody else is doing it).
      
      But the check of XPT_BUSY could occur just as someone finishes
      processing some other event, and just before they clear XPT_BUSY.
      
      Therefore it's important not to clear XPT_BUSY without subsequently
      doing another svc_export_enqueue() to check whether the xprt should be
      requeued.
      
      The xpo_wspace() check in svc_xprt_enqueue() breaks this rule, allowing
      an event to be missed in situations like:
      
      	data arrives
      	call svc_tcp_data_ready():
      	call svc_xprt_enqueue():
      	set BUSY
      	find no write space
      				svc_reserve():
      				free up write space
      				call svc_enqueue():
      				test BUSY
      	clear BUSY
      
      So, instead, check wspace in the same places that the state flags are
      checked: before taking BUSY, and in svc_receive().
      Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
      9c335c0b
    • J
      svcrpc: svc_close_xprt comment · b1763316
      J. Bruce Fields 提交于
      Neil Brown had to explain to me why we do this here; record the answer
      for posterity.
      Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
      b1763316
    • J
      svcrpc: simplify svc_close_all · f8c0d226
      J. Bruce Fields 提交于
      There's no need to be fooling with XPT_BUSY now that all the threads
      are gone.
      
      The list_del_init() here could execute at the same time as the
      svc_xprt_enqueue()'s list_add_tail(), with undefined results.  We don't
      really care at this point, but it might result in a spurious
      list-corruption warning or something.
      
      And svc_close() isn't adding any value; just call svc_delete_xprt()
      directly.
      Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
      f8c0d226
    • J
      nfsd4: centralize more calls to svc_xprt_received · ca7896cd
      J. Bruce Fields 提交于
      Follow up on b48fa6b9 by moving all the
      svc_xprt_received() calls for the main xprt to one place.  The clearing
      of XPT_BUSY here is critical to the correctness of the server, so I'd
      prefer it to be obvious where we do it.
      
      The only substantive result is moving svc_xprt_received() after
      svc_receive_deferred().  Other than a (likely insignificant) delay
      waking up the next thread, that should be harmless.
      
      Also reshuffle the exit code a little to skip a few other steps that we
      don't care about the in the svc_delete_xprt() case.
      Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
      ca7896cd
    • J
      svcrpc: don't set then immediately clear XPT_DEFERRED · 62bac4af
      J. Bruce Fields 提交于
      There's no harm to doing this, since the only caller will immediately
      call svc_enqueue() afterwards, ensuring we don't miss the remaining
      deferred requests just because XPT_DEFERRED was briefly cleared.
      
      But why not just do this the simple way?
      Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
      62bac4af
  7. 18 11月, 2010 1 次提交
  8. 17 11月, 2010 1 次提交
  9. 29 10月, 2010 1 次提交
  10. 26 10月, 2010 6 次提交
  11. 25 10月, 2010 2 次提交
    • T
    • T
      SUNRPC: After calling xprt_release(), we must restart from call_reserve · 118df3d1
      Trond Myklebust 提交于
      Rob Leslie reports seeing the following Oops after his Kerberos session
      expired.
      
      BUG: unable to handle kernel NULL pointer dereference at 00000058
      IP: [<e186ed94>] rpcauth_refreshcred+0x11/0x12c [sunrpc]
      *pde = 00000000
      Oops: 0000 [#1]
      last sysfs file: /sys/devices/platform/pc87360.26144/temp3_input
      Modules linked in: autofs4 authenc esp4 xfrm4_mode_transport ipt_LOG ipt_REJECT xt_limit xt_state ipt_REDIRECT xt_owner xt_HL xt_hl xt_tcpudp xt_mark cls_u32 cls_tcindex sch_sfq sch_htb sch_dsmark geodewdt deflate ctr twofish_generic twofish_i586 twofish_common camellia serpent blowfish cast5 cbc xcbc rmd160 sha512_generic sha1_generic hmac crypto_null af_key rpcsec_gss_krb5 nfsd exportfs nfs lockd fscache nfs_acl auth_rpcgss sunrpc ip_gre sit tunnel4 dummy ext3 jbd nf_nat_irc nf_conntrack_irc nf_nat_ftp nf_conntrack_ftp iptable_mangle iptable_nat nf_nat nf_conntrack_ipv4 nf_conntrack nf_defrag_ipv4 iptable_filter ip_tables x_tables pc8736x_gpio nsc_gpio pc87360 hwmon_vid loop aes_i586 aes_generic sha256_generic dm_crypt cs5535_gpio serio_raw cs5535_mfgpt hifn_795x des_generic geode_rng rng_core led_class ext4 mbcache jbd2 crc16 dm_mirror dm_region_hash dm_log dm_snapshot dm_mod sd_mod crc_t10dif ide_pci_generic cs5536 amd74xx ide_core pata_cs5536 ata_generic libata usb_storage via_rhine mii scsi_mod btrfs zlib_deflate crc32c libcrc32c [last unloaded: scsi_wait_scan]
      
      Pid: 12875, comm: sudo Not tainted 2.6.36-net5501 #1 /
      EIP: 0060:[<e186ed94>] EFLAGS: 00010292 CPU: 0
      EIP is at rpcauth_refreshcred+0x11/0x12c [sunrpc]
      EAX: 00000000 EBX: defb13a0 ECX: 00000006 EDX: e18683b8
      ESI: defb13a0 EDI: 00000000 EBP: 00000000 ESP: de571d58
       DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 0068
      Process sudo (pid: 12875, ti=de570000 task=decd1430 task.ti=de570000)
      Stack:
       e186e008 00000000 defb13a0 0000000d deda6000 e1868f22 e196f12b defb13a0
      <0> defb13d8 00000000 00000000 e186e0aa 00000000 defb13a0 de571dac 00000000
      <0> e186956c de571e34 debea5c0 de571dc8 e186967a 00000000 debea5c0 de571e34
      Call Trace:
       [<e186e008>] ? rpc_wake_up_next+0x114/0x11b [sunrpc]
       [<e1868f22>] ? call_decode+0x24a/0x5af [sunrpc]
       [<e196f12b>] ? nfs4_xdr_dec_access+0x0/0xa2 [nfs]
       [<e186e0aa>] ? __rpc_execute+0x62/0x17b [sunrpc]
       [<e186956c>] ? rpc_run_task+0x91/0x97 [sunrpc]
       [<e186967a>] ? rpc_call_sync+0x40/0x5b [sunrpc]
       [<e1969ca2>] ? nfs4_proc_access+0x10a/0x176 [nfs]
       [<e19572fa>] ? nfs_do_access+0x2b1/0x2c0 [nfs]
       [<e186ed61>] ? rpcauth_lookupcred+0x62/0x84 [sunrpc]
       [<e19573b6>] ? nfs_permission+0xad/0x13b [nfs]
       [<c0177824>] ? exec_permission+0x15/0x4b
       [<c0177fbd>] ? link_path_walk+0x4f/0x456
       [<c017867d>] ? path_walk+0x4c/0xa8
       [<c0179678>] ? do_path_lookup+0x1f/0x68
       [<c017a3fb>] ? user_path_at+0x37/0x5f
       [<c016359c>] ? handle_mm_fault+0x229/0x55b
       [<c0170a2d>] ? sys_faccessat+0x93/0x146
       [<c0170aef>] ? sys_access+0xf/0x13
       [<c02cf615>] ? syscall_call+0x7/0xb
      Code: 0f 94 c2 84 d2 74 09 8b 44 24 0c e8 6a e9 8b de 83 c4 14 89 d8 5b 5e 5f 5d c3 55 57 56 53 83 ec 1c fc 89 c6 8b 40 10 89 44 24 04 <8b> 58 58 85 db 0f 85 d4 00 00 00 0f b7 46 70 8b 56 20 89 c5 83
      EIP: [<e186ed94>] rpcauth_refreshcred+0x11/0x12c [sunrpc] SS:ESP 0068:de571d58
      CR2: 0000000000000058
      
      This appears to be caused by the function rpc_verify_header() first
      calling xprt_release(), then doing a call_refresh. If we release the
      transport slot, we should _always_ jump back to call_reserve before
      calling anything else.
      Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>
      Cc: stable@kernel.org
      118df3d1
  12. 24 10月, 2010 1 次提交
  13. 21 10月, 2010 3 次提交
    • C
      SUNRPC: Properly initialize sock_xprt.srcaddr in all cases · 92476850
      Chuck Lever 提交于
      The source address field in the transport's sock_xprt is initialized
      ONLY IF the RPC application passed a pointer to a source address
      during the call to rpc_create().  However, xs_bind() subsequently uses
      the value of this field without regard to whether the source address
      was initialized during transport creation or not.
      
      So far we've been lucky: the uninitialized value of this field is
      zeroes.  xs_bind(), until recently, used only the sin[6]_addr field in
      this sockaddr, and all zeroes is a valid value for this: it means
      ANYADDR.  This is a happy coincidence.
      
      However, xs_bind() now wants to use the sa_family field as well, and
      expects it to be initialized to something other than zero.
      
      Therefore, the source address sockaddr field should be fully
      initialized at transport create time in _every_ case, not just when
      the RPC application wants to use a specific bind address.
      
      Bruce added a workaround for this missing initialization by adjusting
      commit 6bc9638a, but the "right" way to do this is to ensure that the
      source address sockaddr is always correctly initialized from the
      get-go.
      
      This patch doesn't introduce a behavior change.  It's simply a
      clean-up of Bruce's fix, to prevent future problems of this kind.  It
      may look like overkill, but
      
        a) it clearly documents the default initial value of this field,
      
        b) it doesn't assume that the sockaddr_storage memory is first
           initialized to any particular value, and
      
        c) it will fail verbosely if some unknown address family is passed
           in
      
      Originally introduced by commit d3bc9a1d.
      Signed-off-by: NChuck Lever <chuck.lever@oracle.com>
      Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
      92476850
    • C
      SUNRPC: Use conventional switch statement when reclassifying sockets · 4232e863
      Chuck Lever 提交于
      Clean up.
      
      Defensive coding: If "family" is ever something that is neither
      AF_INET nor AF_INET6, xs_reclassify_socket6() is not the appropriate
      default action.  Choose to do nothing in that case.
      
      Introduced by commit 6bc9638a.
      Signed-off-by: NChuck Lever <chuck.lever@oracle.com>
      Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
      4232e863
    • T
      sunrpc/xprtrdma: clean up workqueue usage · a25e758c
      Tejun Heo 提交于
      * Create and use svc_rdma_wq instead of using the system workqueue and
        flush_scheduled_work().  This workqueue is necessary to serve as
        flushing domain for rdma->sc_work which is used to destroy itself
        and thus can't be flushed explicitly.
      
      * Replace cancel_delayed_work() + flush_scheduled_work() with
        cancel_delayed_work_sync().
      
      * Implement synchronous connect in xprt_rdma_connect() using
        flush_delayed_work() on the rdma_connect work instead of using
        flush_scheduled_work().
      
      This is to prepare for the deprecation and removal of
      flush_scheduled_work().
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
      a25e758c
  14. 19 10月, 2010 5 次提交