1. 14 4月, 2015 2 次提交
    • E
      tcp/dccp: get rid of central timewait timer · 789f558c
      Eric Dumazet 提交于
      Using a timer wheel for timewait sockets was nice ~15 years ago when
      memory was expensive and machines had a single processor.
      
      This does not scale, code is ugly and source of huge latencies
      (Typically 30 ms have been seen, cpus spinning on death_lock spinlock.)
      
      We can afford to use an extra 64 bytes per timewait sock and spread
      timewait load to all cpus to have better behavior.
      
      Tested:
      
      On following test, /proc/sys/net/ipv4/tcp_tw_recycle is set to 1
      on the target (lpaa24)
      
      Before patch :
      
      lpaa23:~# ./super_netperf 200 -H lpaa24 -t TCP_CC -l 60 -- -p0,0
      419594
      
      lpaa23:~# ./super_netperf 200 -H lpaa24 -t TCP_CC -l 60 -- -p0,0
      437171
      
      While test is running, we can observe 25 or even 33 ms latencies.
      
      lpaa24:~# ping -c 1000 -i 0.02 -qn lpaa23
      ...
      1000 packets transmitted, 1000 received, 0% packet loss, time 20601ms
      rtt min/avg/max/mdev = 0.020/0.217/25.771/1.535 ms, pipe 2
      
      lpaa24:~# ping -c 1000 -i 0.02 -qn lpaa23
      ...
      1000 packets transmitted, 1000 received, 0% packet loss, time 20702ms
      rtt min/avg/max/mdev = 0.019/0.183/33.761/1.441 ms, pipe 2
      
      After patch :
      
      About 90% increase of throughput :
      
      lpaa23:~# ./super_netperf 200 -H lpaa24 -t TCP_CC -l 60 -- -p0,0
      810442
      
      lpaa23:~# ./super_netperf 200 -H lpaa24 -t TCP_CC -l 60 -- -p0,0
      800992
      
      And latencies are kept to minimal values during this load, even
      if network utilization is 90% higher :
      
      lpaa24:~# ping -c 1000 -i 0.02 -qn lpaa23
      ...
      1000 packets transmitted, 1000 received, 0% packet loss, time 19991ms
      rtt min/avg/max/mdev = 0.023/0.064/0.360/0.042 ms
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      789f558c
    • K
      tcp: fix bogus RTT for CC when retransmissions are acked · 3d0d26c7
      Kenneth Klette Jonassen 提交于
      Since retransmitted segments are not used for RTT estimation, previously
      SACKed segments present in the rtx queue are used. This estimation can be
      several times larger than the actual RTT. When a cumulative ack covers both
      previously SACKed and retransmitted segments, CC may thus get a bogus RTT.
      
      Such segments previously had an RTT estimation in tcp_sacktag_one(), so it
      seems reasonable to not reuse them in tcp_clean_rtx_queue() at all.
      
      Afaik, this has had no effect on SRTT/RTO because of Karn's check.
      Signed-off-by: NKenneth Klette Jonassen <kennetkl@ifi.uio.no>
      Acked-by: NNeal Cardwell <ncardwell@google.com>
      Tested-by: NNeal Cardwell <ncardwell@google.com>
      Acked-by: NYuchung Cheng <ycheng@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      3d0d26c7
  2. 13 4月, 2015 5 次提交
  3. 10 4月, 2015 1 次提交
  4. 09 4月, 2015 3 次提交
  5. 08 4月, 2015 6 次提交
  6. 05 4月, 2015 6 次提交
  7. 04 4月, 2015 2 次提交
  8. 03 4月, 2015 6 次提交
  9. 01 4月, 2015 5 次提交
    • J
      netlink: implement nla_get_in_addr and nla_get_in6_addr · 67b61f6c
      Jiri Benc 提交于
      Those are counterparts to nla_put_in_addr and nla_put_in6_addr.
      Signed-off-by: NJiri Benc <jbenc@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      67b61f6c
    • J
      netlink: implement nla_put_in_addr and nla_put_in6_addr · 930345ea
      Jiri Benc 提交于
      IP addresses are often stored in netlink attributes. Add generic functions
      to do that.
      
      For nla_put_in_addr, it would be nicer to pass struct in_addr but this is
      not used universally throughout the kernel, in way too many places __be32 is
      used to store IPv4 address.
      Signed-off-by: NJiri Benc <jbenc@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      930345ea
    • J
      tcp: simplify inetpeer_addr_base use · 8f55db48
      Jiri Benc 提交于
      In many places, the a6 field is typecasted to struct in6_addr. As the
      fields are in union anyway, just add in6_addr type to the union and get rid
      of the typecasting.
      Signed-off-by: NJiri Benc <jbenc@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      8f55db48
    • A
      fib_trie: Cleanup ip_fib_net_exit code path · 6e47d6ca
      Alexander Duyck 提交于
      While fixing a recent issue I noticed that we are doing some unnecessary
      work inside the loop for ip_fib_net_exit.  As such I am pulling out the
      initialization to NULL for the locally stored fib_local, fib_main, and
      fib_default.
      
      In addition I am restoring the original code for flushing the table as
      there is no need to split up the fib_table_flush and hlist_del work since
      the code for packing the tnodes with multiple key vectors was dropped.
      Signed-off-by: NAlexander Duyck <alexander.h.duyck@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      6e47d6ca
    • A
      fib_trie: Fix warning on fib4_rules_exit · ad88d051
      Alexander Duyck 提交于
      This fixes the following warning:
      
       BUG: sleeping function called from invalid context at mm/slub.c:1268
       in_atomic(): 1, irqs_disabled(): 0, pid: 6, name: kworker/u8:0
       INFO: lockdep is turned off.
       CPU: 3 PID: 6 Comm: kworker/u8:0 Tainted: G        W       4.0.0-rc5+ #895
       Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
       Workqueue: netns cleanup_net
        0000000000000006 ffff88011953fa68 ffffffff81a203b6 000000002c3a2c39
        ffff88011952a680 ffff88011953fa98 ffffffff8109daf0 ffff8801186c6aa8
        ffffffff81fbc9e5 00000000000004f4 0000000000000000 ffff88011953fac8
       Call Trace:
        [<ffffffff81a203b6>] dump_stack+0x4c/0x65
        [<ffffffff8109daf0>] ___might_sleep+0x1c3/0x1cb
        [<ffffffff8109db70>] __might_sleep+0x78/0x80
        [<ffffffff8117a60e>] slab_pre_alloc_hook+0x31/0x8f
        [<ffffffff8117d4f6>] __kmalloc+0x69/0x14e
        [<ffffffff818ed0e1>] ? kzalloc.constprop.20+0xe/0x10
        [<ffffffff818ed0e1>] kzalloc.constprop.20+0xe/0x10
        [<ffffffff818ef622>] fib_trie_table+0x27/0x8b
        [<ffffffff818ef6bd>] fib_trie_unmerge+0x37/0x2a6
        [<ffffffff810b06e1>] ? arch_local_irq_save+0x9/0xc
        [<ffffffff818e9793>] fib_unmerge+0x2d/0xb3
        [<ffffffff818f5f56>] fib4_rule_delete+0x1f/0x52
        [<ffffffff817f1c3f>] ? fib_rules_unregister+0x30/0xb2
        [<ffffffff817f1c8b>] fib_rules_unregister+0x7c/0xb2
        [<ffffffff818f64a1>] fib4_rules_exit+0x15/0x18
        [<ffffffff818e8c0a>] ip_fib_net_exit+0x23/0xf2
        [<ffffffff818e91f8>] fib_net_exit+0x32/0x36
        [<ffffffff817c8352>] ops_exit_list+0x45/0x57
        [<ffffffff817c8d3d>] cleanup_net+0x13c/0x1cd
        [<ffffffff8108b05d>] process_one_work+0x255/0x4ad
        [<ffffffff8108af69>] ? process_one_work+0x161/0x4ad
        [<ffffffff8108b4b1>] worker_thread+0x1cd/0x2ab
        [<ffffffff8108b2e4>] ? process_scheduled_works+0x2f/0x2f
        [<ffffffff81090686>] kthread+0xd4/0xdc
        [<ffffffff8109ec8f>] ? local_clock+0x19/0x22
        [<ffffffff810905b2>] ? __kthread_parkme+0x83/0x83
        [<ffffffff81a2c0c8>] ret_from_fork+0x58/0x90
        [<ffffffff810905b2>] ? __kthread_parkme+0x83/0x83
      
      The issue was that as a part of exiting the default rules were being
      deleted which resulted in the local trie being unmerged.  By moving the
      freeing of the FIB tables up we can avoid the unmerge since there is no
      local table left when we call the fib4_rules_exit function.
      
      Fixes: 0ddcf43d ("ipv4: FIB Local/MAIN table collapse")
      Reported-by: NCong Wang <xiyou.wangcong@gmail.com>
      Signed-off-by: NAlexander Duyck <alexander.h.duyck@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      ad88d051
  10. 30 3月, 2015 2 次提交
  11. 26 3月, 2015 1 次提交
  12. 25 3月, 2015 1 次提交