1. 04 Dec, 2009 7 commits
    • SUNRPC: Use soft connects for autobinding over TCP · 012da158
      Chuck Lever committed
      Autobinding is handled by the rpciod process, not in user processes
      that are generating regular RPC requests.  Thus autobinding is usually
      not affected by signals targeting user processes, such as KILL or
      timer expiration events.
      
      In addition, an RPC request generated by a user process that has
      RPC_TASK_SOFTCONN set and needs to perform an autobind will hang if
      the remote rpcbind service is not available.
      
      For rpcbind queries on connection-oriented transports, let's use the
      new soft connect semantic to return control to the user's process
      quickly, if the kernel's rpcbind client can't connect to the remote
      rpcbind service.
      
      Logic is introduced in call_bind_status() to handle connection errors
      that occurred during an asynchronous rpcbind query.  The logic
      abandons the rpcbind query if the RPC request has SOFTCONN set, and
      retries after a few seconds in the normal case.
      Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
      Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
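The policy described for call_bind_status() can be sketched in user-space C. This is only an illustration of the decision, not the kernel's code: `TASK_SOFTCONN`, `enum bind_action`, `is_connect_error()` and `on_bind_error()` are hypothetical names, and the set of errno values treated as connection errors is an assumption.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the kernel's RPC_TASK_SOFTCONN flag bit. */
#define TASK_SOFTCONN 0x1u

enum bind_action { BIND_ABANDON, BIND_RETRY_LATER, BIND_OTHER };

/* Assumed set of errors meaning "the remote rpcbind could not be reached". */
static bool is_connect_error(int err)
{
    switch (err) {
    case ECONNREFUSED:
    case ECONNRESET:
    case ENOTCONN:
    case EHOSTUNREACH:
    case ENETUNREACH:
    case EPIPE:
        return true;
    default:
        return false;
    }
}

/* Decide what to do after an asynchronous rpcbind query failed with err. */
static enum bind_action on_bind_error(unsigned int task_flags, int err)
{
    assert(err > 0);                 /* positive errno value expected */

    if (!is_connect_error(err))
        return BIND_OTHER;           /* handled by other error paths */
    if (task_flags & TASK_SOFTCONN)
        return BIND_ABANDON;         /* soft connect: give up quickly */
    return BIND_RETRY_LATER;         /* normal case: retry after a delay */
}
```

A request carrying the soft-connect flag gets `BIND_ABANDON` on `ECONNREFUSED`, while the same error without the flag yields `BIND_RETRY_LATER`.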
    • SUNRPC: Use TCP for local rpcbind upcalls · 2a76b3bf
      Chuck Lever committed
      Use TCP with the soft connect semantic for local rpcbind upcalls so
      the kernel can detect immediately if the local rpcbind daemon is not
      running.
      Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    • SUNRPC: Use a cached RPC client and transport for rpcbind upcalls · c526611d
      Chuck Lever committed
      The kernel's rpcbind client creates and deletes an rpc_clnt and its
      underlying transport socket for every upcall to the local rpcbind
      daemon.
      
      When starting a typical NFS server on IPv4 and IPv6, the NFS service
      itself does three upcalls (one per version) times two upcalls (one
      per transport) times two upcalls (one per address family), making 12,
      plus another one for the initial call to unregister previous NFS
      services.  Starting the NLM service adds an additional 13 upcalls,
      for similar reasons.
      
      (Currently the NFS service doesn't start IPv6 listeners, but it will
      soon enough).
      
      Instead, let's create an rpc_clnt for rpcbind upcalls during the
      first local rpcbind query, and cache it.  This saves the overhead of
      creating and destroying an rpc_clnt and a socket for every upcall.
      
      The new logic also prevents the kernel from attempting an RPCB_SET or
      RPCB_UNSET if it knows from the start that the local portmapper does
      not support rpcbind protocol version 4.  This will cut down on the
      number of rpcbind upcalls in legacy environments.
      Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
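The upcall arithmetic and the create-once-then-cache idea can be sketched in plain C. Everything here is hypothetical (`struct upcall_client`, `get_local_client()`, `nfs_startup_upcalls()`); the sketch is single-threaded, whereas real kernel code would need locking around the cached pointer.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an rpc_clnt plus its transport socket. */
struct upcall_client {
    int id;
};

static struct upcall_client cached_storage;
static struct upcall_client *cached_clnt;    /* NULL until first use */
static int clients_created;                  /* counts expensive creations */

/* Create the client on the first call only; later calls reuse it. */
static struct upcall_client *get_local_client(void)
{
    if (cached_clnt == NULL) {
        cached_storage.id = ++clients_created;
        cached_clnt = &cached_storage;
    }
    return cached_clnt;
}

/* versions x transports x address families, plus one initial unregister. */
static int nfs_startup_upcalls(int versions, int transports, int families)
{
    assert(versions > 0 && transports > 0 && families > 0);
    return versions * transports * families + 1;
}
```

With three versions, two transports and two address families the count is 3 x 2 x 2 + 1 = 13, matching the figure in the commit message; with the cache in place only the first of those upcalls pays for client creation.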
    • SUNRPC: Simplify synopsis of rpcb_local_clnt() · 5a462115
      Chuck Lever committed
      Clean up: At one point, rpcb_local_clnt() handled IPv6 loopback
      addresses too, but it doesn't any more; only IPv4 loopback is used
      now.  Get rid of the @addr and @addrlen arguments to
      rpcb_local_clnt().
      Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
      Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
    • SUNRPC: Allow RPCs to fail quickly if the server is unreachable · 09a21c41
      Chuck Lever committed
      The kernel sometimes makes RPC calls to services that aren't running.
      Because the kernel's RPC client always assumes the hard retry semantic
      when reconnecting a connection-oriented RPC transport, the underlying
      reconnect logic takes a long while to time out, even though the remote
      may have responded immediately with ECONNREFUSED.
      
      In certain cases, like upcalls to our local rpcbind daemon, or for NFS
      mount requests, we'd like the kernel to fail immediately if the remote
      service isn't reachable.  This allows another transport to be tried
      immediately, or the pending request can be abandoned quickly.
      
      Introduce a per-request flag which controls how call_transmit_status()
      behaves when request transmission fails because the server cannot be
      reached.
      
      We don't want soft connection semantics to apply to other errors.  The
      default case of the switch statement in call_transmit_status() no
      longer falls through; the fall through code is copied to the default
      case, and a "break;" is added.
      
      The transport's connection re-establishment timeout is also ignored for
      such requests.  We want the request to fail immediately, so the
      reconnect delay is skipped.  Additionally, we don't want a connect
      failure here to further increase the reconnect timeout value, since
      this request will not be retried.
      Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
      Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
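The last paragraph of this message, about skipping the reconnect delay and leaving the reconnect timeout alone for soft-connect requests, might look like this in a user-space sketch. `struct transport_sketch` and both helpers are hypothetical, and the doubling backoff is an assumption chosen for illustration rather than the transport's actual policy.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical transport state; times are in seconds. */
struct transport_sketch {
    unsigned int reconnect_timeout;
    unsigned int max_reconnect_timeout;
};

/* Delay before the next connect attempt: none for soft-connect requests. */
static unsigned int connect_delay(const struct transport_sketch *t, bool soft)
{
    assert(t != NULL);
    return soft ? 0 : t->reconnect_timeout;
}

/* Called when a connect attempt fails. */
static void on_connect_failure(struct transport_sketch *t, bool soft)
{
    assert(t != NULL);
    if (soft)
        return;    /* request will not be retried: keep the timeout as is */

    /* Assumed backoff: double the timeout, capped at the maximum. */
    t->reconnect_timeout *= 2;
    if (t->reconnect_timeout > t->max_reconnect_timeout)
        t->reconnect_timeout = t->max_reconnect_timeout;
}
```

A soft-connect failure leaves `reconnect_timeout` untouched, so later hard-retry requests on the same transport are not penalized by it.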
    • SUNRPC: Check explicitly for tk_status == 0 in call_transmit_status() · 206a134b
      Chuck Lever committed
      The success case, where task->tk_status == 0, is by far the most
      frequent case in call_transmit_status().
      
      The default: arm of the switch statement in call_transmit_status()
      handles the 0 case.  default: was moved close to the top of the switch
      statement in call_transmit_status() under the theory that the compiler
      places object code for the earliest arms of a switch statement first,
      making the CPU do less work.
      
      The default: arm of a switch statement, however, is executed only
      after all the other cases have been checked.  Even if the compiler
      rearranges the object code, the default: arm is the "last resort",
      meaning all of the other cases have been explicitly exhausted.  That
      makes the current arrangement about as inefficient as it gets for the
      common case.
      
      To fix this, add an explicit check for zero before the switch
      statement.  That forces the compiler to do the zero check first, no
      matter what optimizations it might try to do to the switch statement.
      Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
      Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
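The shape of the fix, an explicit zero check placed ahead of the switch, can be illustrated with a small stand-alone function. `enum next_step` and `transmit_status()` are hypothetical and the error cases are chosen only for illustration; this is not the body of call_transmit_status().

```c
#include <assert.h>
#include <errno.h>

enum next_step { STEP_DONE, STEP_RETRY, STEP_FAIL };

/* status is 0 on success or a negative errno value on failure. */
static enum next_step transmit_status(int status)
{
    assert(status <= 0);

    /* Success is by far the common case, so test for it before the switch. */
    if (status == 0)
        return STEP_DONE;

    switch (status) {
    case -EAGAIN:
        return STEP_RETRY;
    case -ECONNREFUSED:
    case -EHOSTUNREACH:
        return STEP_FAIL;
    default:
        /* Only genuinely unexpected errors reach the default arm now. */
        return STEP_FAIL;
    }
}
```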
    • SUNRPC: Display compressed (shorthand) IPv6 presentation addresses · dd1fd90f
      Chuck Lever committed
      Recent changes to snprintf() introduced the %pI6c formatter, which can
      display an IPv6 address with standard shorthanding.  Using a
      shorthanded address can save us a few bytes of memory for each stored
      presentation address, or a few bytes on the wire when sending these in
      a universal address.
      Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
      Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
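The %pI6c formatter is kernel-only, but the effect of shorthanding can be seen from user space with the POSIX inet_pton() and inet_ntop() calls. This is an analogy, not the kernel formatter; `compress_ipv6()` is a hypothetical helper.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/*
 * Parse an IPv6 presentation address and write its shorthand form to out.
 * Returns the length of the shorthand string, or -1 on failure.
 */
static int compress_ipv6(const char *full, char *out, size_t outlen)
{
    struct in6_addr addr;

    assert(full != NULL && out != NULL);

    if (inet_pton(AF_INET6, full, &addr) != 1)
        return -1;                      /* not a valid IPv6 address */
    if (inet_ntop(AF_INET6, &addr, out, (socklen_t)outlen) == NULL)
        return -1;                      /* buffer too small */
    return (int)strlen(out);
}
```

For example, the fully expanded address 2001:0db8:0000:0000:0000:0000:0000:0001 needs 39 characters, while its shorthand form is far shorter, which is the saving the commit message refers to.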
  2. 01 Dec, 2009 2 commits
    • mac80211: fix spurious delBA handling · 827d42c9
      Johannes Berg committed
      Lennert Buytenhek noticed that delBA handling in mac80211
      was broken and has remotely triggerable problems, some of
      which are due to some code shuffling I did that ended up
      changing the order in which things were done -- this was
      
        commit d75636ef
        Author: Johannes Berg <johannes@sipsolutions.net>
        Date:   Tue Feb 10 21:25:53 2009 +0100
      
          mac80211: RX aggregation: clean up stop session
      
      and other parts were already present in the original
      
        commit d92684e6
        Author: Ron Rindjunsky <ron.rindjunsky@intel.com>
        Date:   Mon Jan 28 14:07:22 2008 +0200
      
            mac80211: A-MPDU Tx add delBA from recipient support
      
      The first problem is that I moved a BUG_ON before various
      checks -- thereby making it possible to hit. As the comment
      indicates, the BUG_ON can be removed since the ampdu_action
      callback must already exist when the state is != IDLE.
      
      The second problem isn't easily exploitable but there's a
      race condition due to unconditionally setting the state to
      OPERATIONAL when a delBA frame is received, even when no
      aggregation session was ever initiated. All the drivers
      accept stopping the session even then, but that opens a
      race window where crashes could happen before the driver
      accepts it. Right now, a WARN_ON may happen with non-HT
      drivers, while the race opens only for HT drivers.
      
      For this case, there are two things necessary to fix it:
       1) don't process spurious delBA frames, and be more careful
          about the session state; don't drop the lock
      
       2) HT drivers need to be prepared to handle a session stop
          even before the session was really started -- this is
          true for all drivers (that support aggregation) except
          iwlwifi, which can be fixed easily. The other HT drivers
          (ath9k and ar9170) are behaving properly already.
      Reported-by: Lennert Buytenhek <buytenh@marvell.com>
      Cc: stable@kernel.org
      Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
      Signed-off-by: John W. Linville <linville@tuxdriver.com>
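Point 1 of the fix, ignoring delBA frames that arrive when no session exists, can be pictured as a tiny state machine. The states and `handle_delba()` below are hypothetical simplifications; mac80211 tracks session state with its own bits and performs the check while holding a lock.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified session states. */
enum rx_ba_state { RX_BA_IDLE, RX_BA_OPERATIONAL };

/*
 * Handle a received delBA frame.
 * Returns true if a session was torn down, false if the frame was
 * spurious (no session in progress) and therefore ignored.
 */
static bool handle_delba(enum rx_ba_state *state)
{
    assert(state != NULL);

    if (*state != RX_BA_OPERATIONAL)
        return false;          /* spurious frame: leave the state untouched */

    *state = RX_BA_IDLE;       /* tear down the existing session */
    return true;
}
```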
    • mac80211: fix two remote exploits · 4253119a
      Johannes Berg committed
      Lennert Buytenhek noticed a remotely triggerable problem
      in mac80211, which is due to some code shuffling I did
      that ended up changing the order in which things were
      done -- this was in
      
        commit d75636ef
        Author: Johannes Berg <johannes@sipsolutions.net>
        Date:   Tue Feb 10 21:25:53 2009 +0100
      
          mac80211: RX aggregation: clean up stop session
      
      The problem is that the BUG_ON moved before the various
      checks, and as such can be triggered.
      
      As the comment indicates, the BUG_ON can be removed since
      the ampdu_action callback must already exist when the
      state is OPERATIONAL.
      
      A similar code path leads to a WARN_ON in
      ieee80211_stop_tx_ba_session, which can also be removed.
      
      Cc: stable@kernel.org [2.6.29+]
      Cc: Lennert Buytenhek <buytenh@marvell.com>
      Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
      Signed-off-by: John W. Linville <linville@tuxdriver.com>
  3. 30 Nov, 2009 1 commit
  4. 29 Nov, 2009 1 commit
    • sctp: on T3_RTX retransmit all the in-flight chunks · 5fdd4bae
      Andrei Pelinescu-Onciul committed
      When retransmitting due to a T3 timeout, retransmit all the
      in-flight chunks for the corresponding transport/path, including
      chunks sent less than 1 RTO ago.
      This is the correct behaviour according to RFC 4960 section 6.3.3,
      rule E3, and
      "Note: Any DATA chunks that were sent to the address for which the
       T3-rtx timer expired but did not fit in one MTU (rule E3 above)
       should be marked for retransmission and sent as soon as cwnd
       allows (normally, when a SACK arrives)."
      
      This fixes problems when more than one path is present and the T3
      retransmission of the first chunk that times out stops the T3 timer
      for the initial active path, leaving all the other in-flight
      chunks waiting forever or until a new chunk is transmitted on the
      same path and times out (and this will happen only if the cwnd
      allows sending new chunks, but since cwnd was dropped to MTU by
      the timeout, it will wait until the first heartbeat).
      
      Example: 10 packets in flight, sent at 0.1 s intervals on the
      primary path. The primary path is down and the first packet
      times out. The first packet is retransmitted on another path, the
      T3 timer for the primary path is stopped and cwnd is set to MTU.
      All the other 9 in-flight packets will not be retransmitted
      (unless more new packets are sent on the primary path, which depends
      on cwnd allowing it, and even in this case the 9 packets will be
      retransmitted only after a new packet times out, which even in the
      best case would be more than RTO).
      
      This commit reverts d0ce9291 and also removes the now unused
      transport->last_rto, introduced in b6157d8e.
      
      P.S. The problem is not limited to multiple paths.  It can also
      happen in a single-homed environment.  If the application stops
      sending data, it is possible to have a hung association.
      Signed-off-by: Andrei Pelinescu-Onciul <andrei@iptel.org>
      Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
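The behaviour after this change, marking every in-flight chunk on the timed-out path regardless of how recently it was sent, can be sketched as follows. `struct chunk_sketch` and `mark_for_t3_rtx()` are hypothetical; the real SCTP code works on its own queue structures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified view of a queued DATA chunk. */
struct chunk_sketch {
    int transport_id;   /* path the chunk was sent on */
    bool in_flight;     /* sent but not yet acknowledged */
    bool needs_rtx;     /* marked for retransmission */
};

/*
 * On a T3-rtx timeout for timed_out_transport, mark all in-flight chunks
 * sent on that path, with no exemption for chunks sent less than 1 RTO ago.
 * Returns the number of chunks marked.
 */
static size_t mark_for_t3_rtx(struct chunk_sketch *chunks, size_t n,
                              int timed_out_transport)
{
    size_t marked = 0;

    assert(chunks != NULL || n == 0);

    for (size_t i = 0; i < n; i++) {
        if (chunks[i].in_flight &&
            chunks[i].transport_id == timed_out_transport) {
            chunks[i].needs_rtx = true;
            marked++;
        }
    }
    return marked;
}
```

In the 10-packet example from the message, all 10 chunks on the primary path end up marked, instead of only the first one that timed out.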
  5. 25 Nov, 2009 1 commit
  6. 24 Nov, 2009 2 commits
    • rfkill: fix miscdev ops · 45ba564d
      Johannes Berg committed
      The /dev/rfkill ops don't refer to the module,
      so it is possible to unload the module while
      file descriptors are open. Fix this oversight.
      Reported-by: Maxim Levitsky <maximlevitsky@gmail.com>
      Cc: stable@kernel.org
      Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
      Signed-off-by: John W. Linville <linville@tuxdriver.com>
    • pktgen: Fix device name compares · 593f63b0
      Eric Dumazet committed
      Commit e6fce5b9 (pktgen: multiqueue etc.) tried to relax
      the pktgen restriction of one device per kernel thread, adding a '@'
      tag to device names.
      
      The problem is that we don't perform the check on the full pktgen
      device name. This allows adding the same 'device' to a pktgen thread
      many times:

       pgset "add_device eth0@0"

      One session later:

       pgset "add_device eth0@0"

      (This doesn't find the previous device.)

      This consumes ~1.5 MBytes of vmalloc memory per round and also triggers
      this warning:
      
      [  673.186380] proc_dir_entry 'pktgen/eth0@0' already registered
      [  673.186383] Modules linked in: pktgen ixgbe ehci_hcd psmouse mdio mousedev evdev [last unloaded: pktgen]
      [  673.186406] Pid: 6219, comm: bash Tainted: G        W  2.6.32-rc7-03302-g41cec6f1-dirty #16
      [  673.186410] Call Trace:
      [  673.186417]  [<ffffffff8104a29b>] warn_slowpath_common+0x7b/0xc0
      [  673.186422]  [<ffffffff8104a341>] warn_slowpath_fmt+0x41/0x50
      [  673.186426]  [<ffffffff8114e789>] proc_register+0x109/0x210
      [  673.186433]  [<ffffffff8100bf2e>] ? apic_timer_interrupt+0xe/0x20
      [  673.186438]  [<ffffffff8114e905>] proc_create_data+0x75/0xd0
      [  673.186444]  [<ffffffffa006ad38>] pktgen_thread_write+0x568/0x640 [pktgen]
      [  673.186449]  [<ffffffffa006a7d0>] ? pktgen_thread_write+0x0/0x640 [pktgen]
      [  673.186453]  [<ffffffff81149144>] proc_reg_write+0x84/0xc0
      [  673.186458]  [<ffffffff810f5a58>] vfs_write+0xb8/0x180
      [  673.186463]  [<ffffffff810f5c11>] sys_write+0x51/0x90
      [  673.186468]  [<ffffffff8100b51b>] system_call_fastpath+0x16/0x1b
      [  673.186470] ---[ end trace ccbb991b0a8d994d ]---
      
      The solution to this problem is to use an odevname field (which includes
      the @ tag and suffix) instead of the netdevice name.
      Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: Robert Olsson <robert.olsson@its.uu.se>
      Signed-off-by: David S. Miller <davem@davemloft.net>
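Why the lookup missed the existing entry can be shown with two string comparisons. `struct pkt_dev_sketch` and both match helpers are hypothetical; only the field name `odevname` is taken from the commit message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified pktgen device entry. */
struct pkt_dev_sketch {
    const char *odevname;      /* name as given by the user, e.g. "eth0@0" */
    const char *netdev_name;   /* underlying netdevice name, e.g. "eth0" */
};

/* Old behaviour: compare the requested name against the netdevice name. */
static bool matches_by_netdev(const struct pkt_dev_sketch *d, const char *name)
{
    assert(d != NULL && name != NULL);
    return strcmp(d->netdev_name, name) == 0;
}

/* Fixed behaviour: compare against the full name, including the @ suffix. */
static bool matches_by_odevname(const struct pkt_dev_sketch *d, const char *name)
{
    assert(d != NULL && name != NULL);
    return strcmp(d->odevname, name) == 0;
}
```

Looking up "eth0@0" by netdevice name compares it with "eth0" and fails, so a second `add_device eth0@0` creates a duplicate; comparing against `odevname` finds the existing entry while still treating "eth0@1" as a different device.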
  7. 23 Nov, 2009 1 commit
  8. 20 Nov, 2009 3 commits
  9. 19 Nov, 2009 1 commit
    • mac80211: fix addba timer (again...) · 8ade0082
      Johannes Berg committed
      commit 2171abc5
        Author: Johannes Berg <johannes@sipsolutions.net>
        Date:   Thu Oct 29 08:34:00 2009 +0100
      
            mac80211: fix addba timer
      
      left a problem in there: even if the timer was never
      started, it could be deleted and then added.
      
      Linus pointed out that del_timer_sync() isn't
      actually needed if we make the timer able to
      deal with no longer being needed when it gets
      queued _while_ we're in the locked section that
      also deletes it. For that, the timer function only
      needs to check the HT_ADDBA_RECEIVED_MSK bit as
      well as the HT_ADDBA_REQUESTED_MSK bit; only if
      the former is clear should it do anything.
      
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
      Signed-off-by: John W. Linville <linville@tuxdriver.com>
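The check the timer function needs can be written as a single mask comparison. The bit values and `addba_timer_should_act()` below are hypothetical stand-ins for HT_ADDBA_REQUESTED_MSK and HT_ADDBA_RECEIVED_MSK, and the real function reads the state under a lock.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical bit values standing in for the mac80211 state masks. */
#define ADDBA_REQUESTED 0x1u
#define ADDBA_RECEIVED  0x2u

/*
 * Decide whether the addBA response timer still has work to do:
 * a request must be outstanding and no response may have arrived.
 */
static bool addba_timer_should_act(const unsigned int *state)
{
    assert(state != NULL);
    return (*state & (ADDBA_REQUESTED | ADDBA_RECEIVED)) == ADDBA_REQUESTED;
}
```

If the timer fires after the response was already processed, both bits are set and the function returns false, so the timer can simply do nothing instead of having to be deleted synchronously.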
  10. 17 Nov, 2009 2 commits
  11. 16 Nov, 2009 4 commits
  12. 14 Nov, 2009 6 commits
  13. 13 Nov, 2009 2 commits
  14. 08 Nov, 2009 1 commit
  15. 07 Nov, 2009 2 commits
  16. 06 Nov, 2009 4 commits