1. 05 9月, 2013 4 次提交
    • D
      net: ipv6: mld: clean up MLD_V1_SEEN macro · 6c567b78
      Daniel Borkmann 提交于
      Replace the macro with a function to make it more readable. GCC will
      eventually decide whether to inline this or not (also, that's not
      fast-path anyway).
      Signed-off-by: NDaniel Borkmann <dborkman@redhat.com>
      Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
      Acked-by: NHannes Frederic Sowa <hannes@stressinduktion.org>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      6c567b78
    • D
      net: ipv6: mld: fix v1/v2 switchback timeout to rfc3810, 9.12. · 89225d1c
      Daniel Borkmann 提交于
      i) RFC3810, 9.2. Query Interval [QI] says:
      
         The Query Interval variable denotes the interval between General
         Queries sent by the Querier. Default value: 125 seconds. [...]
      
      ii) RFC3810, 9.3. Query Response Interval [QRI] says:
      
        The Maximum Response Delay used to calculate the Maximum Response
        Code inserted into the periodic General Queries. Default value:
        10000 (10 seconds) [...] The number of seconds represented by the
        [Query Response Interval] must be less than the [Query Interval].
      
      iii) RFC3810, 9.12. Older Version Querier Present Timeout [OVQPT] says:
      
        The Older Version Querier Present Timeout is the time-out for
        transitioning a host back to MLDv2 Host Compatibility Mode. When an
        MLDv1 query is received, MLDv2 hosts set their Older Version Querier
        Present Timer to [Older Version Querier Present Timeout].
      
        This value MUST be ([Robustness Variable] times (the [Query Interval]
        in the last Query received)) plus ([Query Response Interval]).
      
      Hence, on *default* the timeout results in:
      
        [RV] = 2, [QI] = 125sec, [QRI] = 10sec
        [OVQPT] = [RV] * [QI] + [QRI] = 260sec
      
      Having that said, we currently calculate [OVQPT] (here given as 'switchback'
      variable) as ...
      
        switchback = (idev->mc_qrv + 1) * max_delay
      
      RFC3810, 9.12. says "the [Query Interval] in the last Query received". In
      section "9.14. Configuring timers", it is said:
      
        This section is meant to provide advice to network administrators on
        how to tune these settings to their network. Ambitious router
        implementations might tune these settings dynamically based upon
        changing characteristics of the network. [...]
      
      iv) RFC38010, 9.14.2. Query Interval:
      
        The overall level of periodic MLD traffic is inversely proportional
        to the Query Interval. A longer Query Interval results in a lower
        overall level of MLD traffic. The value of the Query Interval MUST
        be equal to or greater than the Maximum Response Delay used to
        calculate the Maximum Response Code inserted in General Query
        messages.
      
      I assume that was why switchback is calculated as is (3 * max_delay), although
      this setting seems to be meant for routers only to configure their [QI]
      interval for non-default intervals. So usage here like this is clearly wrong.
      
      Concluding, the current behaviour in IPv6's multicast code is not conform
      to the RFC as switch back is calculated wrongly. That is, it has a too small
      value, so MLDv2 hosts switch back again to MLDv2 way too early, i.e. ~30secs
      instead of ~260secs on default.
      
      Hence, introduce necessary helper functions and fix this up properly as it
      should be.
      
      Introduced in 06da92283 ("[IPV6]: Add MLDv2 support."). Credits to Hannes
      Frederic Sowa who also had a hand in this as well. Also thanks to Hangbin Liu
      who did initial testing.
      Signed-off-by: NDaniel Borkmann <dborkman@redhat.com>
      Cc: David Stevens <dlstevens@us.ibm.com>
      Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
      Acked-by: NHannes Frederic Sowa <hannes@stressinduktion.org>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      89225d1c
    • Y
      tcp: better comments for RTO initiallization · 52f20e65
      Yuchung Cheng 提交于
      Commit 1b7fdd2a("tcp: do not use cached RTT for RTT estimation")
      removes important comments on how RTO is initialized and updated.
      Hopefully this patch puts those information back.
      Signed-off-by: NYuchung Cheng <ycheng@google.com>
      Acked-by: NNeal Cardwell <ncardwell@google.com>
      Acked-by: NEric Dumazet <edumazet@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      52f20e65
    • A
      net: sctp: Fix data chunk fragmentation for MTU values which are not multiple of 4 · c08751c8
      Alexander Sverdlin 提交于
      net: sctp: Fix data chunk fragmentation for MTU values which are not multiple of 4
      
      Initially the problem was observed with ipsec, but later it became clear that
      SCTP data chunk fragmentation algorithm has problems with MTU values which are
      not multiple of 4. Test program was used which just transmits 2000 bytes long
      packets to other host. tcpdump was used to observe re-fragmentation in IP layer
      after SCTP already fragmented data chunks.
      
      With MTU 1500:
      12:54:34.082904 IP (tos 0x2,ECT(0), ttl 64, id 0, offset 0, flags [DF], proto SCTP (132), length 1500)
          10.151.38.153.39303 > 10.151.24.91.54321: sctp (1) [DATA] (B) [TSN: 2366088589] [SID: 0] [SSEQ 1] [PPID 0x0]
      12:54:34.082933 IP (tos 0x2,ECT(0), ttl 64, id 0, offset 0, flags [DF], proto SCTP (132), length 596)
          10.151.38.153.39303 > 10.151.24.91.54321: sctp (1) [DATA] (E) [TSN: 2366088590] [SID: 0] [SSEQ 1] [PPID 0x0]
      12:54:34.090576 IP (tos 0x2,ECT(0), ttl 63, id 0, offset 0, flags [DF], proto SCTP (132), length 48)
          10.151.24.91.54321 > 10.151.38.153.39303: sctp (1) [SACK] [cum ack 2366088590] [a_rwnd 79920] [#gap acks 0] [#dup tsns 0]
      
      With MTU 1499:
      13:02:49.955220 IP (tos 0x2,ECT(0), ttl 64, id 48215, offset 0, flags [+], proto SCTP (132), length 1492)
          10.151.38.153.39084 > 10.151.24.91.54321: sctp[|sctp]
      13:02:49.955249 IP (tos 0x2,ECT(0), ttl 64, id 48215, offset 1472, flags [none], proto SCTP (132), length 28)
          10.151.38.153 > 10.151.24.91: ip-proto-132
      13:02:49.955262 IP (tos 0x2,ECT(0), ttl 64, id 0, offset 0, flags [DF], proto SCTP (132), length 600)
          10.151.38.153.39084 > 10.151.24.91.54321: sctp (1) [DATA] (E) [TSN: 404355346] [SID: 0] [SSEQ 1] [PPID 0x0]
      13:02:49.956770 IP (tos 0x2,ECT(0), ttl 63, id 0, offset 0, flags [DF], proto SCTP (132), length 48)
          10.151.24.91.54321 > 10.151.38.153.39084: sctp (1) [SACK] [cum ack 404355346] [a_rwnd 79920] [#gap acks 0] [#dup tsns 0]
      
      Here problem in data portion limit calculation leads to re-fragmentation in IP,
      which is sub-optimal. The problem is max_data initial value, which doesn't take
      into account the fact, that data chunk must be padded to 4-bytes boundary.
      It's enough to correct max_data, because all later adjustments are correctly
      aligned to 4-bytes boundary.
      
      After the fix is applied, everything is fragmented correctly for uneven MTUs:
      15:16:27.083881 IP (tos 0x2,ECT(0), ttl 64, id 0, offset 0, flags [DF], proto SCTP (132), length 1496)
          10.151.38.153.53417 > 10.151.24.91.54321: sctp (1) [DATA] (B) [TSN: 3077098183] [SID: 0] [SSEQ 1] [PPID 0x0]
      15:16:27.083907 IP (tos 0x2,ECT(0), ttl 64, id 0, offset 0, flags [DF], proto SCTP (132), length 600)
          10.151.38.153.53417 > 10.151.24.91.54321: sctp (1) [DATA] (E) [TSN: 3077098184] [SID: 0] [SSEQ 1] [PPID 0x0]
      15:16:27.085640 IP (tos 0x2,ECT(0), ttl 63, id 0, offset 0, flags [DF], proto SCTP (132), length 48)
          10.151.24.91.54321 > 10.151.38.153.53417: sctp (1) [SACK] [cum ack 3077098184] [a_rwnd 79920] [#gap acks 0] [#dup tsns 0]
      
      The bug was there for years already, but
       - is a performance issue, the packets are still transmitted
       - doesn't show up with default MTU 1500, but possibly with ipsec (MTU 1438)
      Signed-off-by: NAlexander Sverdlin <alexander.sverdlin@nsn.com>
      Acked-by: NVlad Yasevich <vyasevich@gmail.com>
      Acked-by: NNeil Horman <nhorman@tuxdriver.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      c08751c8
  2. 04 9月, 2013 20 次提交
  3. 03 9月, 2013 1 次提交
  4. 01 9月, 2013 13 次提交
  5. 31 8月, 2013 2 次提交