1. 23 8月, 2013 1 次提交
    • Y
      tcp: increase throughput when reordering is high · 0f7cc9a3
      Yuchung Cheng 提交于
      The stack currently detects reordering and avoid spurious
      retransmission very well. However the throughput is sub-optimal under
      high reordering because cwnd is increased only if the data is deliverd
      in order. I.e., FLAG_DATA_ACKED check in tcp_ack().  The more packet
      are reordered the worse the throughput is.
      
      Therefore when reordering is proven high, cwnd should advance whenever
      the data is delivered regardless of its ordering. If reordering is low,
      conservatively advance cwnd only on ordered deliveries in Open state,
      and retain cwnd in Disordered state (RFC5681).
      
      Using netperf on a qdisc setup of 20Mbps BW and random RTT from 45ms
      to 55ms (for reordering effect). This change increases TCP throughput
      by 20 - 25% to near bottleneck BW.
      
      A special case is the stretched ACK with new SACK and/or ECE mark.
      For example, a receiver may receive an out of order or ECN packet with
      unacked data buffered because of LRO or delayed ACK. The principle on
      such an ACK is to advance cwnd on the cummulative acked part first,
      then reduce cwnd in tcp_fastretrans_alert().
      Signed-off-by: NYuchung Cheng <ycheng@google.com>
      Acked-by: NNeal Cardwell <ncardwell@google.com>
      Acked-by: NEric Dumazet <edumazet@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      0f7cc9a3
  2. 21 8月, 2013 16 次提交
  3. 20 8月, 2013 1 次提交
  4. 16 8月, 2013 2 次提交
  5. 15 8月, 2013 9 次提交
  6. 14 8月, 2013 5 次提交
    • A
      rtnetlink: rtnl_bridge_getlink: Call nlmsg_find_attr() with ifinfomsg header · 3e805ad2
      Asbjoern Sloth Toennesen 提交于
      Fix the iproute2 command `bridge vlan show`, after switching from
      rtgenmsg to ifinfomsg.
      
      Let's start with a little history:
      
      Feb 20:   Vlad Yasevich got his VLAN-aware bridge patchset included in
                the 3.9 merge window.
                In the kernel commit 6cbdceeb, he added attribute support to
                bridge GETLINK requests sent with rtgenmsg.
      
      Mar 6th:  Vlad got this iproute2 reference implementation of the bridge
                vlan netlink interface accepted (iproute2 9eff0e5c)
      
      Apr 25th: iproute2 switched from using rtgenmsg to ifinfomsg (63338dca)
                http://patchwork.ozlabs.org/patch/239602/
                http://marc.info/?t=136680900700007
      
      Apr 28th: Linus released 3.9
      
      Apr 30th: Stephen released iproute2 3.9.0
      
      The `bridge vlan show` command haven't been working since the switch to
      ifinfomsg, or in a released version of iproute2. Since the kernel side
      only supports rtgenmsg, which iproute2 switched away from just prior to
      the iproute2 3.9.0 release.
      
      I haven't been able to find any documentation, about neither rtgenmsg
      nor ifinfomsg, and in which situation to use which, but kernel commit
      88c5b5ce seams to suggest that ifinfomsg should be used.
      
      Fixing this in kernel will break compatibility, but I doubt that anybody
      have been using it due to this bug in the user space reference
      implementation, at least not without noticing this bug. That said the
      functionality is still fully functional in 3.9, when reversing iproute2
      commit 63338dca.
      
      This could also be fixed in iproute2, but thats an ugly patch that would
      reintroduce rtgenmsg in iproute2, and from searching in netdev it seams
      like rtgenmsg usage is discouraged. I'm assuming that the only reason
      that Vlad implemented the kernel side to use rtgenmsg, was because
      iproute2 was using it at the time.
      Signed-off-by: NAsbjoern Sloth Toennesen <ast@fiberby.net>
      Reviewed-by: NVlad Yasevich <vyasevich@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      3e805ad2
    • H
      ipv6: make unsolicited report intervals configurable for mld · fc4eba58
      Hannes Frederic Sowa 提交于
      Commit cab70040 ("net: igmp:
      Reduce Unsolicited report interval to 1s when using IGMPv3") and
      2690048c ("net: igmp: Allow user-space
      configuration of igmp unsolicited report interval") by William Manley made
      igmp unsolicited report intervals configurable per interface and corrected
      the interval of unsolicited igmpv3 report messages resendings to 1s.
      
      Same needs to be done for IPv6:
      
      MLDv1 (RFC2710 7.10.): 10 seconds
      MLDv2 (RFC3810 9.11.): 1 second
      
      Both intervals are configurable via new procfs knobs
      mldv1_unsolicited_report_interval and mldv2_unsolicited_report_interval.
      
      (also added .force_mld_version to ipv6_devconf_dflt to bring structs in
      line without semantic changes)
      
      v2:
      a) Joined documentation update for IPv4 and IPv6 MLD/IGMP
         unsolicited_report_interval procfs knobs.
      b) incorporate stylistic feedback from William Manley
      
      v3:
      a) add new DEVCONF_* values to the end of the enum (thanks to David
         Miller)
      
      Cc: Cong Wang <xiyou.wangcong@gmail.com>
      Cc: William Manley <william.manley@youview.com>
      Cc: Benjamin LaHaise <bcrl@kvack.org>
      Cc: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
      Signed-off-by: NHannes Frederic Sowa <hannes@stressinduktion.org>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      fc4eba58
    • P
      ip_tunnel: Do not use inner ip-header-id for tunnel ip-header-id. · 4221f405
      Pravin B Shelar 提交于
      Using inner-id for tunnel id is not safe in some rare cases.
      E.g. packets coming from multiple sources entering same tunnel
      can have same id. Therefore on tunnel packet receive we
      could have packets from two different stream but with same
      source and dst IP with same ip-id which could confuse ip packet
      reassembly.
      
      Following patch reverts optimization from commit
      490ab081 (IP_GRE: Fix IP-Identification.)
      
      CC: Jarno Rajahalme <jrajahalme@nicira.com>
      CC: Ansis Atteka <aatteka@nicira.com>
      Signed-off-by: NPravin B Shelar <pshelar@nicira.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      4221f405
    • Y
      tcp: reset reordering est. selectively on timeout · 74c181d5
      Yuchung Cheng 提交于
      On timeout the TCP sender unconditionally resets the estimated degree
      of network reordering (tp->reordering). The idea behind this is that
      the estimate is too large to trigger fast recovery (e.g., due to a IP
      path change).
      
      But for example if the sender only had 2 packets outstanding, then a
      timeout doesn't tell much about reordering. A sender that learns about
      reordering on big writes and loses packets on small writes will end up
      falsely retransmitting again and again, especially when reordering is
      more likely on big writes.
      
      Therefore the sender should only suspect that tp->reordering is too
      high if it could have gone into fast recovery with the (lower) default
      estimate.
      Signed-off-by: NYuchung Cheng <ycheng@google.com>
      Acked-by: NNeal Cardwell <ncardwell@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      74c181d5
    • V
      net: sctp: Add rudimentary infrastructure to account for control chunks · 072017b4
      Vlad Yasevich 提交于
      This patch adds a base infrastructure that allows SCTP to do
      memory accounting for control chunks.  Real accounting code will
      follow.
      
      This patch alos fixes the following triggered bug ...
      
      [  553.109742] kernel BUG at include/linux/skbuff.h:1813!
      [  553.109766] invalid opcode: 0000 [#1] SMP
      [  553.109789] Modules linked in: sctp libcrc32c rfcomm [...]
      [  553.110259]  uinput i915 i2c_algo_bit drm_kms_helper e1000e drm ptp
      pps_core i2c_core wmi video sunrpc
      [  553.110320] CPU: 0 PID: 1636 Comm: lt-test_1_to_1_ Not tainted
      3.11.0-rc3+ #2
      [  553.110350] Hardware name: LENOVO 74597D6/74597D6, BIOS 6DET60WW
      (3.10 ) 09/17/2009
      [  553.110381] task: ffff88020a01dd40 ti: ffff880204ed0000 task.ti:
      ffff880204ed0000
      [  553.110411] RIP: 0010:[<ffffffffa0698017>]  [<ffffffffa0698017>]
      skb_orphan.part.9+0x4/0x6 [sctp]
      [  553.110459] RSP: 0018:ffff880204ed1bb8  EFLAGS: 00010286
      [  553.110483] RAX: ffff8802086f5a40 RBX: ffff880204303300 RCX:
      0000000000000000
      [  553.110487] RDX: ffff880204303c28 RSI: ffff8802086f5a40 RDI:
      ffff880202158000
      [  553.110487] RBP: ffff880204ed1bb8 R08: 0000000000000000 R09:
      0000000000000000
      [  553.110487] R10: ffff88022f2d9a04 R11: ffff880233001600 R12:
      0000000000000000
      [  553.110487] R13: ffff880204303c00 R14: ffff8802293d0000 R15:
      ffff880202158000
      [  553.110487] FS:  00007f31b31fe740(0000) GS:ffff88023bc00000(0000)
      knlGS:0000000000000000
      [  553.110487] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
      [  553.110487] CR2: 000000379980e3e0 CR3: 000000020d225000 CR4:
      00000000000407f0
      [  553.110487] Stack:
      [  553.110487]  ffff880204ed1ca8 ffffffffa068d7fc 0000000000000000
      0000000000000000
      [  553.110487]  0000000000000000 ffff8802293d0000 ffff880202158000
      ffffffff81cb7900
      [  553.110487]  0000000000000000 0000400000001c68 ffff8802086f5a40
      000000000000000f
      [  553.110487] Call Trace:
      [  553.110487]  [<ffffffffa068d7fc>] sctp_sendmsg+0x6bc/0xc80 [sctp]
      [  553.110487]  [<ffffffff8128f185>] ? sock_has_perm+0x75/0x90
      [  553.110487]  [<ffffffff815a3593>] inet_sendmsg+0x63/0xb0
      [  553.110487]  [<ffffffff8128f2b3>] ? selinux_socket_sendmsg+0x23/0x30
      [  553.110487]  [<ffffffff8151c5d6>] sock_sendmsg+0xa6/0xd0
      [  553.110487]  [<ffffffff81637b05>] ? _raw_spin_unlock_bh+0x15/0x20
      [  553.110487]  [<ffffffff8151cd38>] SYSC_sendto+0x128/0x180
      [  553.110487]  [<ffffffff8151ce6b>] ? SYSC_connect+0xdb/0x100
      [  553.110487]  [<ffffffffa0690031>] ? sctp_inet_listen+0x71/0x1f0
      [sctp]
      [  553.110487]  [<ffffffff8151d35e>] SyS_sendto+0xe/0x10
      [  553.110487]  [<ffffffff81640202>] system_call_fastpath+0x16/0x1b
      [  553.110487] Code: e0 48 c7 c7 00 22 6a a0 e8 67 a3 f0 e0 48 c7 [...]
      [  553.110487] RIP  [<ffffffffa0698017>] skb_orphan.part.9+0x4/0x6
      [sctp]
      [  553.110487]  RSP <ffff880204ed1bb8>
      [  553.121578] ---[ end trace 46c20c5903ef5be2 ]---
      
      The approach taken here is to split data and control chunks
      creation a  bit.  Data chunks already have memory accounting
      so noting needs to happen.  For control chunks, add stubs handlers.
      Signed-off-by: NVlad Yasevich <vyasevich@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      072017b4
  7. 13 8月, 2013 5 次提交
  8. 12 8月, 2013 1 次提交