1. 28 8月, 2015 14 次提交
  2. 27 8月, 2015 2 次提交
  3. 26 8月, 2015 4 次提交
    • J
      vxlan: fix multiple inclusion of vxlan.h · 48e92c44
      Jiri Benc 提交于
      The vxlan_get_sk_family inline function was added after the last #endif,
      making multiple inclusion of net/vxlan.h fail. Move it to the proper place.
      Reported-by: NMark Rustad <mark.d.rustad@intel.com>
      Fixes: 705cc62f ("vxlan: provide access function for vxlan socket address family")
      Signed-off-by: NJiri Benc <jbenc@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      48e92c44
    • D
      inetpeer: remove dead code · 2c0027cd
      David Ahern 提交于
      Remove various inlined functions not referenced in the kernel.
      Signed-off-by: NDavid Ahern <dsa@cumulusnetworks.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      2c0027cd
    • E
      tcp: refine pacing rate determination · 43e122b0
      Eric Dumazet 提交于
      When TCP pacing was added back in linux-3.12, we chose
      to apply a fixed ratio of 200 % against current rate,
      to allow probing for optimal throughput even during
      slow start phase, where cwnd can be doubled every other gRTT.
      
      At Google, we found it was better applying a different ratio
      while in Congestion Avoidance phase.
      This ratio was set to 120 %.
      
      We've used the normal tcp_in_slow_start() helper for a while,
      then tuned the condition to select the conservative ratio
      as soon as cwnd >= ssthresh/2 :
      
      - After cwnd reduction, it is safer to ramp up more slowly,
        as we approach optimal cwnd.
      - Initial ramp up (ssthresh == INFINITY) still allows doubling
        cwnd every other RTT.
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Cc: Neal Cardwell <ncardwell@google.com>
      Cc: Yuchung Cheng <ycheng@google.com>
      Acked-by: NNeal Cardwell <ncardwell@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      43e122b0
    • E
      tcp: fix slow start after idle vs TSO/GSO · 6f021c62
      Eric Dumazet 提交于
      slow start after idle might reduce cwnd, but we perform this
      after first packet was cooked and sent.
      
      With TSO/GSO, it means that we might send a full TSO packet
      even if cwnd should have been reduced to IW10.
      
      Moving the SSAI check in skb_entail() makes sense, because
      we slightly reduce number of times this check is done,
      especially for large send() and TCP Small queue callbacks from
      softirq context.
      
      As Neal pointed out, we also need to perform the check
      if/when receive window opens.
      
      Tested:
      
      Following packetdrill test demonstrates the problem
      // Test of slow start after idle
      
      `sysctl -q net.ipv4.tcp_slow_start_after_idle=1`
      
      0.000 socket(..., SOCK_STREAM, IPPROTO_TCP) = 3
      +0    setsockopt(3, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0
      +0    bind(3, ..., ...) = 0
      +0    listen(3, 1) = 0
      
      +0    < S 0:0(0) win 65535 <mss 1000,sackOK,nop,nop,nop,wscale 7>
      +0    > S. 0:0(0) ack 1 <mss 1460,nop,nop,sackOK,nop,wscale 6>
      +.100 < . 1:1(0) ack 1 win 511
      +0    accept(3, ..., ...) = 4
      +0    setsockopt(4, SOL_SOCKET, SO_SNDBUF, [200000], 4) = 0
      
      +0    write(4, ..., 26000) = 26000
      +0    > . 1:5001(5000) ack 1
      +0    > . 5001:10001(5000) ack 1
      +0    %{ assert tcpi_snd_cwnd == 10 }%
      
      +.100 < . 1:1(0) ack 10001 win 511
      +0    %{ assert tcpi_snd_cwnd == 20, tcpi_snd_cwnd }%
      +0    > . 10001:20001(10000) ack 1
      +0    > P. 20001:26001(6000) ack 1
      
      +.100 < . 1:1(0) ack 26001 win 511
      +0    %{ assert tcpi_snd_cwnd == 36, tcpi_snd_cwnd }%
      
      +4 write(4, ..., 20000) = 20000
      // If slow start after idle works properly, we should send 5 MSS here (cwnd/2)
      +0    > . 26001:31001(5000) ack 1
      +0    %{ assert tcpi_snd_cwnd == 10, tcpi_snd_cwnd }%
      +0    > . 31001:36001(5000) ack 1
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Cc: Neal Cardwell <ncardwell@google.com>
      Cc: Yuchung Cheng <ycheng@google.com>
      Acked-by: NNeal Cardwell <ncardwell@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      6f021c62
  4. 25 8月, 2015 1 次提交
  5. 24 8月, 2015 3 次提交
  6. 21 8月, 2015 14 次提交
  7. 20 8月, 2015 2 次提交
    • N
      vrf: vrf_master_ifindex_rcu is not always called with rcu read lock · 18041e31
      Nikolay Aleksandrov 提交于
      While running net-next I hit this:
      [  634.073119] ===============================
      [  634.073150] [ INFO: suspicious RCU usage. ]
      [  634.073182] 4.2.0-rc6+ #45 Not tainted
      [  634.073213] -------------------------------
      [  634.073244] include/net/vrf.h:38 suspicious rcu_dereference_check()
      usage!
      [  634.073274]
                     other info that might help us debug this:
      
      [  634.073307]
                     rcu_scheduler_active = 1, debug_locks = 1
      [  634.073338] 2 locks held by swapper/0/0:
      [  634.073369]  #0:  (((&n->timer))){+.-...}, at: [<ffffffff8112bc35>]
      call_timer_fn+0x5/0x480
      [  634.073412]  #1:  (slock-AF_INET){+.-...}, at: [<ffffffff8174f0f5>]
      icmp_send+0x155/0x5f0
      [  634.073450]
                     stack backtrace:
      [  634.073483] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.2.0-rc6+ #45
      [  634.073514] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS
      VirtualBox 12/01/2006
      [  634.073545]  0000000000000000 0593ba8242d9ace4 ffff88002fc03b48
      ffffffff81803f1b
      [  634.073612]  0000000000000000 ffffffff81e12500 ffff88002fc03b78
      ffffffff811003c5
      [  634.073642]  0000000000000000 ffff88002ec4e600 ffffffff81f00f80
      ffff88002fc03cf0
      [  634.073669] Call Trace:
      [  634.073694]  <IRQ>  [<ffffffff81803f1b>] dump_stack+0x4c/0x65
      [  634.073728]  [<ffffffff811003c5>] lockdep_rcu_suspicious+0xc5/0x100
      [  634.073763]  [<ffffffff8174eb56>] icmp_route_lookup+0x176/0x5c0
      [  634.073793]  [<ffffffff8174f2fb>] ? icmp_send+0x35b/0x5f0
      [  634.073818]  [<ffffffff8174f274>] ? icmp_send+0x2d4/0x5f0
      [  634.073844]  [<ffffffff8174f3ce>] icmp_send+0x42e/0x5f0
      [  634.073873]  [<ffffffff8170b662>] ipv4_link_failure+0x22/0xa0
      [  634.073899]  [<ffffffff8174bdda>] arp_error_report+0x3a/0x80
      [  634.073926]  [<ffffffff816d6100>] ? neigh_lookup+0x2c0/0x2c0
      [  634.073952]  [<ffffffff816d396e>] neigh_invalidate+0x8e/0x110
      [  634.073984]  [<ffffffff816d62ae>] neigh_timer_handler+0x1ae/0x290
      [  634.074013]  [<ffffffff816d6100>] ? neigh_lookup+0x2c0/0x2c0
      [  634.074013]  [<ffffffff8112bce3>] call_timer_fn+0xb3/0x480
      [  634.074013]  [<ffffffff8112bc35>] ? call_timer_fn+0x5/0x480
      [  634.074013]  [<ffffffff816d6100>] ? neigh_lookup+0x2c0/0x2c0
      [  634.074013]  [<ffffffff8112c2bc>] run_timer_softirq+0x20c/0x430
      [  634.074013]  [<ffffffff810af50e>] __do_softirq+0xde/0x630
      [  634.074013]  [<ffffffff810afc97>] irq_exit+0x117/0x120
      [  634.074013]  [<ffffffff81810976>] smp_apic_timer_interrupt+0x46/0x60
      [  634.074013]  [<ffffffff8180e950>] apic_timer_interrupt+0x70/0x80
      [  634.074013]  <EOI>  [<ffffffff8106b9d6>] ? native_safe_halt+0x6/0x10
      [  634.074013]  [<ffffffff81101d8d>] ? trace_hardirqs_on+0xd/0x10
      [  634.074013]  [<ffffffff81027d43>] default_idle+0x23/0x200
      [  634.074013]  [<ffffffff8102852f>] arch_cpu_idle+0xf/0x20
      [  634.074013]  [<ffffffff810f89ba>] default_idle_call+0x2a/0x40
      [  634.074013]  [<ffffffff810f8dcc>] cpu_startup_entry+0x39c/0x4c0
      [  634.074013]  [<ffffffff817f9cad>] rest_init+0x13d/0x150
      [  634.074013]  [<ffffffff81f69038>] start_kernel+0x4a8/0x4c9
      [  634.074013]  [<ffffffff81f68120>] ?
      early_idt_handler_array+0x120/0x120
      [  634.074013]  [<ffffffff81f68339>] x86_64_start_reservations+0x2a/0x2c
      [  634.074013]  [<ffffffff81f68485>] x86_64_start_kernel+0x14a/0x16d
      
      It would seem vrf_master_ifindex_rcu() can be called without RCU held in
      other contexts as well so introduce a new helper which acquires rcu and
      returns the ifindex.
      Also add curly braces around both the "if" and "else" parts as per the
      style guide.
      Signed-off-by: NNikolay Aleksandrov <nikolay@cumulusnetworks.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      18041e31
    • Y
      lwtunnel: Fix the sparse warnings in fib_encap_match · 824e7383
      Ying Xue 提交于
      When CONFIG_LWTUNNEL config is not enabled, the lwtstate_free() is not
      declared in lwtunnel.h at all. However, even in this case, the function
      is still referenced in fib_semantics.c so that there appears the
      following sparse warnings:
      
      net/ipv4/fib_semantics.c:553:17: error: undefined identifier 'lwtstate_free'
        CC      net/ipv4/fib_semantics.o
        net/ipv4/fib_semantics.c: In function ‘fib_encap_match’:
        net/ipv4/fib_semantics.c:553:3: error: implicit declaration of function ‘lwtstate_free’ [-Werror=implicit-function-declaration]
        cc1: some warnings being treated as errors
        make[1]: *** [net/ipv4/fib_semantics.o] Error 1
        make: *** [net/ipv4/fib_semantics.o] Error 2
      
      To eliminate the error, we define an empty function for lwtstate_free()
      in lwtunnel.h when CONFIG_LWTUNNEL is disabled.
      
      Fixes: df383e62 ("lwtunnel: fix memory leak")
      Cc: Jiri Benc <jbenc@redhat.com>
      Reported-by: Nkbuild test robot <fengguang.wu@intel.com>
      Signed-off-by: NYing Xue <ying.xue@windriver.com>
      Acked-by: NJiri Benc <jbenc@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      824e7383