1. 01 Aug, 2011 3 commits
  2. 05 Jul, 2011 6 commits
    • dccp ccid-2: Perform congestion-window validation · 113ced1f
      Gerrit Renker committed
      CCID-2's cwnd increases like TCP during slow-start, which has implications for
       * the local Sequence Window value (should be > cwnd),
       * the Ack Ratio value.
      Hence an exponential growth, if it does not reflect the actual network
      conditions, can quickly lead to instability.
      
      This patch adds congestion-window validation (RFC2861) to CCID-2:
       * cwnd is constrained if the sender is application limited;
       * cwnd is reduced after a long idle period, as suggested in the '90 paper
         by Van Jacobson, in RFC 2581 (sec. 4.1);
       * cwnd is never reduced below the RFC 3390 initial window.
      
      As marked in the comments, the code is actually almost a direct copy of the
      TCP congestion-window-validation algorithms. By continuing this work, it may
      in future be possible to use the TCP code (not possible at the moment).
      
      The mechanism can be turned off using a module parameter. Sampling of the
      currently-used window (moving-maximum) is however done constantly; this is
      used to determine the expected window, which can be exploited to regulate
      DCCP's Sequence Window value.
      
      This patch also sets slow-start-after-idle (RFC 4341, 5.1), i.e. it behaves like
      TCP when net.ipv4.tcp_slow_start_after_idle = 1.
      Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
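The two validation rules described above (application-limited and idle) can be sketched roughly as follows. This is a minimal illustration modelled on the RFC 2861 description, not the patch itself: `struct cwv_state`, its fields, and both function names are hypothetical, time is measured in plain milliseconds, and the initial window is passed in as a parameter.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical sender state; names are illustrative, not the kernel's fields. */
struct cwv_state {
	uint32_t cwnd;      /* congestion window, in packets */
	uint32_t ssthresh;  /* slow-start threshold, in packets */
	uint32_t cwnd_used; /* moving maximum of the window actually used */
};

static uint32_t max_u32(uint32_t a, uint32_t b) { return a > b ? a : b; }
static uint32_t min_u32(uint32_t a, uint32_t b) { return a < b ? a : b; }

/* Remember 3/4 of the current window in ssthresh before shrinking cwnd. */
static void cwv_save_ssthresh(struct cwv_state *s)
{
	s->ssthresh = max_u32(s->ssthresh, (s->cwnd >> 1) + (s->cwnd >> 2));
}

/* Application-limited: move cwnd halfway towards the window really used. */
static void cwv_application_limited(struct cwv_state *s, uint32_t init_win)
{
	uint32_t win_used = max_u32(s->cwnd_used, init_win);

	if (win_used < s->cwnd) {
		cwv_save_ssthresh(s);
		s->cwnd = (s->cwnd + win_used) >> 1;
	}
	s->cwnd_used = 0; /* start a new sampling period */
}

/* Idle: halve cwnd once per elapsed RTO, never below the initial window. */
static void cwv_restart_after_idle(struct cwv_state *s, uint32_t init_win,
				   uint32_t idle_ms, uint32_t rto_ms)
{
	uint32_t restart_cwnd = min_u32(s->cwnd, init_win);

	assert(rto_ms > 0);
	cwv_save_ssthresh(s);
	while (idle_ms >= rto_ms && s->cwnd > restart_cwnd) {
		s->cwnd >>= 1;
		idle_ms -= rto_ms;
	}
	s->cwnd = max_u32(s->cwnd, restart_cwnd);
}
```

With a window of 40 packets, an initial window of 3 and three RTOs of idle time, the sketch leaves cwnd at 5; a window that is already close to the initial window is clamped to it rather than halved below it.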
    • dccp ccid-2: Use existing function to test for data packets · 58fdea0f
      Gerrit Renker committed
      This replaces a switch statement with a test, using the equivalent
      function dccp_data_packet(skb).  It also doubles the range of the field
      `rx_num_data_pkts' by changing the type from `int' to `u32', avoiding
      signed/unsigned comparison with the u16 field `dccps_r_ack_ratio'.
      Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
    • dccp ccid-2: move rfc 3390 function into header file · b4d5f4b2
      Gerrit Renker committed
      This moves CCID-2's initial window function into the header file, since several
      parts throughout the CCID-2 code need to call it (CCID-2 still uses RFC 3390).
      Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
      Acked-by: Leandro Melo de Sales <leandro@ic.ufal.br>
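For reference, the RFC 3390 initial window is min(4*MSS, max(2*MSS, 4380 bytes)). The sketch below converts that bound into whole packets; the function name `initial_window_packets` is an assumption made for this illustration and is not necessarily how the CCID-2 helper is named or written. Very large MSS values that would overflow 32-bit arithmetic are not handled.

```c
#include <assert.h>
#include <stdint.h>

#define RFC3390_MAX_BYTES 4380u /* byte bound given in RFC 3390 */

/* Initial window in whole packets for a given sender MSS (in bytes). */
static uint32_t initial_window_packets(uint32_t smss)
{
	uint32_t lower, bytes;

	assert(smss > 0);
	/* max(2*MSS, 4380) */
	lower = 2 * smss > RFC3390_MAX_BYTES ? 2 * smss : RFC3390_MAX_BYTES;
	/* min(4*MSS, max(2*MSS, 4380)) */
	bytes = 4 * smss < lower ? 4 * smss : lower;
	return bytes / smss;
}
```

Small segments (up to 1095 bytes) get four packets, a typical 1460-byte MSS gets three, and segments of 2190 bytes or more get two.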
    • dccp: cosmetics of info message · 1fd9d208
      Gerrit Renker committed
      Change the CCID (de)activation message to start with the
      protocol name, as 'CCID' is already in there.
      Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
    • dccp: combine the functionality of enqeueing and cloning · 8695e801
      Gerrit Renker committed
      Realising the following call pattern,
       * first dccp_entail() is called to enqueue a new skb and
       * then skb_clone() is called to transmit a clone of that skb,
      this patch integrates both into the same function.
      Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
    • dccp: Clean up slow-path input processing · c0c20150
      Gerrit Renker committed
      This patch rearranges the order of statements of the slow-path input processing
      (i.e. any other state than OPEN), to resolve the following issues.
      
       1. Dependencies: the order of statements now better matches RFC 4340, 8.5, i.e.
          step 7 is before step 9 (previously 9 was before 7), and parsing options in
          step 8 (which may consume resources) now comes after step 7.
       2. Sequence number checks are omitted if in state LISTEN/REQUEST, due to the
          note underneath the table in RFC 4340, 7.5.3.
          As a result, CCID processing is now indeed confined to OPEN/PARTOPEN states,
          i.e. congestion control is performed only on the flow of data packets. This
          avoids pathological cases of doing congestion control on those messages
          which set up and terminate the connection.
       3. Packets are now passed on to Ack Vector / CCID processing only after
          - step 7  (receive unexpected packets),
          - step 9  (receive Reset),
          - step 13 (receive CloseReq),
          - step 14 (receive Close)
          and only if the state is PARTOPEN. This simplifies CCID processing:
          - in LISTEN/CLOSED the CCIDs are non-existent;
          - in RESPOND/REQUEST the CCIDs have not yet been negotiated;
          - in CLOSEREQ and active-CLOSING the node has already closed this socket;
          - in passive-CLOSING the client is waiting for its Reset.
          In the last case, RFC 4340, 8.3 leaves it open to ignore further incoming
          data, which is the approach taken here.
      Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
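The state restriction described in points 2 and 3 above amounts to a simple predicate. The following is only a sketch under assumed names: `enum sketch_state` and `ccid_processing_allowed()` are hypothetical and do not correspond to identifiers in the DCCP code.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical state names mirroring the DCCP states listed above. */
enum sketch_state {
	ST_CLOSED, ST_LISTEN, ST_REQUEST, ST_RESPOND,
	ST_PARTOPEN, ST_OPEN, ST_CLOSEREQ, ST_CLOSING
};

/* Packets reach Ack Vector / CCID processing only in OPEN or PARTOPEN. */
static bool ccid_processing_allowed(enum sketch_state st)
{
	assert(st <= ST_CLOSING);
	switch (st) {
	case ST_OPEN:     /* fast path */
	case ST_PARTOPEN: /* the only slow-path state with CCIDs in use */
		return true;
	default:          /* CCIDs absent, not yet negotiated, or socket closed */
		return false;
	}
}
```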
  3. 19 May, 2011 1 commit
  4. 09 May, 2011 3 commits
  5. 07 May, 2011 1 commit
  6. 04 May, 2011 1 commit
  7. 29 Apr, 2011 2 commits
    • ipv4: Get route daddr from flow key in dccp_v4_connect(). · 91ab0b60
      David S. Miller committed
      Now that output route lookups update the flow with
      destination address selection, we can fetch it from
      fl4->daddr instead of rt->rt_dst
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • inet: add RCU protection to inet->opt · f6d8bd05
      Eric Dumazet committed
      We lack proper synchronization to manipulate inet->opt ip_options
      
      Problem is ip_make_skb() calls ip_setup_cork() and
      ip_setup_cork() possibly makes a copy of ipc->opt (struct ip_options),
      without any protection against another thread manipulating inet->opt.
      
      Another thread can change inet->opt pointer and free old one under us.
      
      Use RCU to protect inet->opt (changed to inet->inet_opt).
      
      Instead of handling atomic refcounts, just copy ip_options when
      necessary, to avoid cache line dirtying.
      
      We can't insert an rcu_head in struct ip_options since it's included in
      skb->cb[], so this patch is large because I had to introduce a new
      ip_options_rcu structure.
      Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
      Cc: Herbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  8. 28 Apr, 2011 1 commit
    • ipv4: Sanitize and simplify ip_route_{connect,newports}() · 2d7192d6
      David S. Miller committed
      These functions are used together as a unit for route resolution
      during connect().  They address the chicken-and-egg problem that
      exists when ports need to be allocated during connect() processing,
      yet such port allocations require addressing information from the
      routing code.
      
      It's currently more heavy handed than it needs to be, and in
      particular we allocate and initialize a flow object twice.
      
      Let the callers provide the on-stack flow object.  That way we only
      need to initialize it once in the ip_route_connect() call.
      
      Later, if ip_route_newports() needs to do anything, it re-uses that
      flow object as-is except for the ports which it updates before the
      route re-lookup.
      
      Also, describe why this set of facilities are needed and how it works
      in a big comment.
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Reviewed-by: Eric Dumazet <eric.dumazet@gmail.com>
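The calling pattern described above (the caller owns one flow object, it is initialised once, and only the ports change before a re-lookup) might look like the sketch below. Everything here is hypothetical: `struct flow_sketch`, `route_connect_sketch()` and `route_newports_sketch()` are stand-ins invented for this illustration and do not reproduce the kernel's structures or the real signatures of ip_route_connect() and ip_route_newports().

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for an IPv4 flow key; not the kernel's struct. */
struct flow_sketch {
	uint32_t saddr, daddr;
	uint16_t sport, dport;
	uint8_t  proto;
};

/* First lookup: initialise the caller's on-stack flow exactly once. */
static void route_connect_sketch(struct flow_sketch *fl,
				 uint32_t saddr, uint32_t daddr, uint8_t proto,
				 uint16_t sport, uint16_t dport)
{
	assert(fl != NULL);
	memset(fl, 0, sizeof(*fl));
	fl->saddr = saddr;
	fl->daddr = daddr;
	fl->proto = proto;
	fl->sport = sport;
	fl->dport = dport;
}

/*
 * After port allocation: reuse the same flow as-is and update only the
 * ports. Returns 1 if the ports changed and a route re-lookup is needed.
 */
static int route_newports_sketch(struct flow_sketch *fl,
				 uint16_t sport, uint16_t dport)
{
	assert(fl != NULL);
	if (fl->sport == sport && fl->dport == dport)
		return 0;
	fl->sport = sport;
	fl->dport = dport;
	return 1;
}
```

A caller would declare `struct flow_sketch fl;` on its stack, call the first function before port allocation and the second one afterwards, so the flow is never allocated or fully initialised a second time.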
  9. 23 Apr, 2011 1 commit
  10. 31 Mar, 2011 1 commit
  11. 13 Mar, 2011 6 commits
  12. 03 Mar, 2011 1 commit
  13. 02 Mar, 2011 5 commits
    • dccp: fix oops on Reset after close · 720dc34b
      Gerrit Renker committed
      This fixes a bug in the order of dccp_rcv_state_process() that still permitted
      reception even after closing the socket. A Reset after close thus causes a NULL
      pointer dereference by not preventing operations on an already torn-down socket.
      
       dccp_v4_do_rcv() 
      	|
      	| state other than OPEN
      	v
       dccp_rcv_state_process()
      	|
      	| DCCP_PKT_RESET
      	v
       dccp_rcv_reset()
      	|
      	v
       dccp_time_wait()
      
       WARNING: at net/ipv4/inet_timewait_sock.c:141 __inet_twsk_hashdance+0x48/0x128()
       Modules linked in: arc4 ecb carl9170 rt2870sta(C) mac80211 r8712u(C) crc_ccitt ah
       [<c0038850>] (unwind_backtrace+0x0/0xec) from [<c0055364>] (warn_slowpath_common)
       [<c0055364>] (warn_slowpath_common+0x4c/0x64) from [<c0055398>] (warn_slowpath_n)
       [<c0055398>] (warn_slowpath_null+0x1c/0x24) from [<c02b72d0>] (__inet_twsk_hashd)
       [<c02b72d0>] (__inet_twsk_hashdance+0x48/0x128) from [<c031caa0>] (dccp_time_wai)
       [<c031caa0>] (dccp_time_wait+0x40/0xc8) from [<c031c15c>] (dccp_rcv_state_proces)
       [<c031c15c>] (dccp_rcv_state_process+0x120/0x538) from [<c032609c>] (dccp_v4_do_)
       [<c032609c>] (dccp_v4_do_rcv+0x11c/0x14c) from [<c0286594>] (release_sock+0xac/0)
       [<c0286594>] (release_sock+0xac/0x110) from [<c031fd34>] (dccp_close+0x28c/0x380)
       [<c031fd34>] (dccp_close+0x28c/0x380) from [<c02d9a78>] (inet_release+0x64/0x70)
      
      The fix is by testing the socket state first. Receiving a packet in Closed state
      now also produces the required "No connection" Reset reply of RFC 4340, 8.3.1.
      Reported-and-tested-by: Johan Hovold <jhovold@gmail.com>
      Cc: stable@kernel.org
      Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
      Signed-off-by: David S. Miller <davem@davemloft.net>
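The idea of the fix, testing the socket state before looking at the packet type, can be sketched as a small decision function. The enums and `slow_path_entry()` below are hypothetical names for this illustration only. The sketch also assumes the general RFC 4340 rule that a Reset is never sent in reply to a Reset, which the commit message does not spell out.

```c
#include <assert.h>

/* Hypothetical, reduced state and packet sets for illustration. */
enum sk_state_sketch { SK_CLOSED, SK_CLOSING, SK_OTHER };
enum pkt_sketch { PKT_RESET, PKT_OTHER };
enum verdict {
	V_DROP,
	V_SEND_RESET_NO_CONNECTION,
	V_ENTER_TIME_WAIT,
	V_CONTINUE
};

static enum verdict slow_path_entry(enum sk_state_sketch st,
				    enum pkt_sketch pkt)
{
	assert(st <= SK_OTHER);
	/* State is tested first: a closed socket must not process input. */
	if (st == SK_CLOSED)
		return pkt == PKT_RESET ? V_DROP : V_SEND_RESET_NO_CONNECTION;
	/* Only a socket that is still alive may handle a Reset. */
	if (pkt == PKT_RESET)
		return V_ENTER_TIME_WAIT;
	return V_CONTINUE;
}
```

With this ordering, a Reset arriving after close never reaches the time-wait path shown in the call chain above.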
    • ipv4: Kill can_sleep arg to ip_route_output_flow() · 273447b3
      David S. Miller committed
      This boolean state is now available in the flow flags.
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ipv4: Make final arg to ip_route_output_flow to be boolean "can_sleep" · 420d44da
      David S. Miller committed
      Since that is what the current vague "flags" argument means.
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ipv4: Can final ip_route_connect() arg to boolean "can_sleep". · abdf7e72
      David S. Miller committed
      Since that's what the current vague "flags" thing means.
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ipv6: Consolidate route lookup sequences. · 68d0c6d3
      David S. Miller committed
      Route lookups follow a general pattern in the ipv6 code wherein
      we first find the non-IPSEC route, potentially override the
      flow destination address due to ipv6 options settings, and then
      finally make an IPSEC search using either xfrm_lookup() or
      __xfrm_lookup().
      
      __xfrm_lookup() is used when we want to generate a blackhole route
      if the key manager needs to resolve the IPSEC rules (in this case
      -EREMOTE is returned and the original 'dst' is left unchanged).
      
      Otherwise plain xfrm_lookup() is used and when asynchronous IPSEC
      resolution is necessary, we simply fail the lookup completely.
      
      All of these cases are encapsulated into two routines,
      ip6_dst_lookup_flow and ip6_sk_dst_lookup_flow.  The latter of which
      handles unconnected UDP datagram sockets.
      Signed-off-by: David S. Miller <davem@davemloft.net>
  14. 26 Feb, 2011 1 commit
  15. 25 Feb, 2011 1 commit
    • ipv4: Rearrange how ip_route_newports() gets port keys. · dca8b089
      David S. Miller committed
      ip_route_newports() is the only place in the entire kernel that
      cares about the port members in the routing cache entry's lookup
      flow key.
      
      Therefore the only reason we store an entire flow inside of the
      struct rtentry is for this one special case.
      
      Rewrite ip_route_newports() such that:
      
      1) The caller passes in the original port values, so we don't need
         to use the rth->fl.fl_ip_{s,d}port values to remember them.
      
      2) The lookup flow is constructed by hand instead of being copied
         from the routing cache entry's flow.
      Signed-off-by: David S. Miller <davem@davemloft.net>
  16. 03 Feb, 2011 1 commit
  17. 07 Jan, 2011 3 commits
  18. 10 Dec, 2010 1 commit
  19. 07 Dec, 2010 1 commit