1. 17 Nov, 2016 2 commits
    • net: busy-poll: allow preemption in sk_busy_loop() · 217f6974
      Eric Dumazet committed
      After commit 4cd13c21 ("softirq: Let ksoftirqd do its job"),
      sk_busy_loop() needs a bit of care:
      softirqs might be delayed since we do not allow preemption yet.
      
      This patch adds preemption points in sk_busy_loop(),
      and makes sure no unnecessary cache line dirtying
      or atomic operations are done while looping.
      
      A new flag is added to napi->state: NAPI_STATE_IN_BUSY_POLL
      
      This prevents napi_complete_done() from clearing NAPIF_STATE_SCHED,
      so that sk_busy_loop() does not have to grab it again.
      
      Similarly, netpoll_poll_lock() is called only once.
      
      This gives about a 10 to 20% improvement in various busy polling
      tests, especially when many threads are busy polling in
      configurations with a large number of NIC queues.
      
      This should allow experimenting with bigger delays without
      hurting overall latencies.
      
      Tested:
       On a 40Gb mlx4 NIC, 32 RX/TX queues.
      
       echo 70 >/proc/sys/net/core/busy_read
       for i in `seq 1 40`; do echo -n $i: ; ./super_netperf $i -H lpaa24 -t UDP_RR -- -N -n; done
      
             Before   After
       1:   90072   92819
       2:  157289  184007
       3:  235772  213504
       4:  344074  357513
       5:  394755  458267
       6:  461151  487819
       7:  549116  625963
       8:  544423  716219
       9:  720460  738446
      10:  794686  837612
      11:  915998  923960
      12:  937507  925107
      13: 1019677  971506
      14: 1046831 1113650
      15: 1114154 1148902
      16: 1105221 1179263
      17: 1266552 1299585
      18: 1258454 1383817
      19: 1341453 1312194
      20: 1363557 1488487
      21: 1387979 1501004
      22: 1417552 1601683
      23: 1550049 1642002
      24: 1568876 1601915
      25: 1560239 1683607
      26: 1640207 1745211
      27: 1706540 1723574
      28: 1638518 1722036
      29: 1734309 1757447
      30: 1782007 1855436
      31: 1724806 1888539
      32: 1717716 1944297
      33: 1778716 1869118
      34: 1805738 1983466
      35: 1815694 2020758
      36: 1893059 2035632
      37: 1843406 2034653
      38: 1888830 2086580
      39: 1972827 2143567
      40: 1877729 2181851
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Cc: Willem de Bruijn <willemb@google.com>
      Cc: Adam Belay <abelay@google.com>
      Cc: Tariq Toukan <tariqt@mellanox.com>
      Cc: Yuval Mintz <Yuval.Mintz@cavium.com>
      Cc: Ariel Elior <ariel.elior@cavium.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
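The effect of the new flag on napi_complete_done() can be illustrated with a small userspace model. This is only a sketch under stated assumptions, not the kernel code: `struct fake_napi`, `fake_napi_complete_done()`, `fake_busy_loop()` and the bit values are hypothetical stand-ins, and C11 atomics replace the kernel's bit operations.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical bit values; the kernel defines its own. */
enum {
	FAKE_STATE_SCHED        = 1 << 0, /* models NAPIF_STATE_SCHED */
	FAKE_STATE_IN_BUSY_POLL = 1 << 1, /* models NAPI_STATE_IN_BUSY_POLL */
};

struct fake_napi {
	atomic_uint state;
};

/* Models napi_complete_done(): returns true if SCHED was released. */
bool fake_napi_complete_done(struct fake_napi *n)
{
	unsigned int s = atomic_load(&n->state);

	/* A busy poller owns this instance: leave SCHED set so the
	 * poller does not have to grab it again on its next pass. */
	if (s & FAKE_STATE_IN_BUSY_POLL)
		return false;

	atomic_fetch_and(&n->state, ~(unsigned int)FAKE_STATE_SCHED);
	return true;
}

/* Models the busy loop: take ownership once, poll 'rounds' times,
 * release once. Returns the number of times ownership was acquired,
 * or -1 if another context already owns the instance. */
int fake_busy_loop(struct fake_napi *n, int rounds)
{
	unsigned int expected = 0;
	int acquisitions = 0;

	if (!atomic_compare_exchange_strong(&n->state, &expected,
			FAKE_STATE_SCHED | FAKE_STATE_IN_BUSY_POLL))
		return -1;
	acquisitions++;

	for (int i = 0; i < rounds; i++) {
		/* The driver poll routine would run here and end by
		 * calling the completion helper. */
		bool released = fake_napi_complete_done(n);

		assert(!released); /* ownership survives each pass */
	}

	atomic_store(&n->state, 0); /* single release at the end */
	return acquisitions;
}
```

In this model the loop performs one atomic acquire and one release regardless of `rounds`, which is the saving the commit message describes.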
    • ipv6: sr: add option to control lwtunnel support · 46738b13
      David Lebrun committed
      This patch adds a new option CONFIG_IPV6_SEG6_LWTUNNEL to enable/disable
      support of encapsulation with the lightweight tunnels. When this option
      is enabled, CONFIG_LWTUNNEL is automatically selected.
      
      Fix commit 6c8702c6 ("ipv6: sr: add support for SRH encapsulation and injection with lwtunnels")
      
      Without a proper option to control lwtunnel support for SR-IPv6, if
      CONFIG_LWTUNNEL=n then the IPv6 initialization fails as a consequence
      of seg6_iptunnel_init() failure with EOPNOTSUPP:
      
      NET: Registered protocol family 10
      IPv6: Attempt to unregister permanent protocol 6
      IPv6: Attempt to unregister permanent protocol 136
      IPv6: Attempt to unregister permanent protocol 17
      NET: Unregistered protocol family 10
      
      Tested (compiling, booting, and loading ipv6 module when relevant)
      with possible combinations of CONFIG_IPV6={y,m,n},
      CONFIG_IPV6_SEG6_LWTUNNEL={y,n} and CONFIG_LWTUNNEL={y,n}.
      Reported-by: Lorenzo Colitti <lorenzo@google.com>
      Suggested-by: Roopa Prabhu <roopa@cumulusnetworks.com>
      Signed-off-by: David Lebrun <david.lebrun@uclouvain.be>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  2. 16 Nov, 2016 3 commits
  3. 15 Nov, 2016 2 commits
  4. 14 Nov, 2016 3 commits
    • netfilter: x_tables: simplify IS_ERR_OR_NULL to NULL test · eb1a6bdc
      Julia Lawall committed
      Since commit 7926dbfa ("netfilter: don't use
      mutex_lock_interruptible()"), the function xt_find_table_lock can only
      return NULL on an error.  Simplify the call sites and update the
      comment before the function.
      
      The semantic patch that changes the code is as follows:
      (http://coccinelle.lip6.fr/)
      
      // <smpl>
      @@
      expression t,e;
      @@
      
      t = \(xt_find_table_lock(...)\|
            try_then_request_module(xt_find_table_lock(...),...)\)
      ... when != t=e
      - ! IS_ERR_OR_NULL(t)
      + t
      
      @@
      expression t,e;
      @@
      
      t = \(xt_find_table_lock(...)\|
            try_then_request_module(xt_find_table_lock(...),...)\)
      ... when != t=e
      - IS_ERR_OR_NULL(t)
      + !t
      
      @@
      expression t,e,e1;
      @@
      
      t = \(xt_find_table_lock(...)\|
            try_then_request_module(xt_find_table_lock(...),...)\)
      ... when != t=e
      ?- t ? PTR_ERR(t) : e1
      + e1
      ... when any
      
      // </smpl>
      Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
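The three rewrite rules in the semantic patch above can be illustrated at a single call site. The following is a userspace sketch only: `fake_find_table()`, `fake_is_err_or_null()`, `fake_ptr_err()` and the two `lookup_*` functions are hypothetical stand-ins for xt_find_table_lock(), the kernel's IS_ERR_OR_NULL() and PTR_ERR() macros, and a caller. It assumes, as the commit message states, that the lookup reports failure only by returning NULL.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define FAKE_MAX_ERRNO 4095

/* Stand-in for IS_ERR_OR_NULL(): NULL or a pointer in the errno range. */
int fake_is_err_or_null(const void *p)
{
	return !p || (uintptr_t)p >= (uintptr_t)-FAKE_MAX_ERRNO;
}

/* Stand-in for PTR_ERR(): recover the errno encoded in a pointer. */
long fake_ptr_err(const void *p)
{
	return (long)(intptr_t)p;
}

struct fake_table {
	const char *name;
};

static struct fake_table fake_tables[] = { { "filter" }, { "nat" } };

/* Hypothetical lookup that fails only by returning NULL. */
struct fake_table *fake_find_table(const char *name)
{
	assert(name != NULL);
	for (size_t i = 0; i < sizeof(fake_tables) / sizeof(fake_tables[0]); i++)
		if (strcmp(fake_tables[i].name, name) == 0)
			return &fake_tables[i];
	return NULL;
}

/* Call site before the semantic patch. */
int lookup_old(const char *name)
{
	struct fake_table *t = fake_find_table(name);

	if (!fake_is_err_or_null(t))
		return 0;
	return t ? (int)fake_ptr_err(t) : -ENOENT;
}

/* Call site after the semantic patch: plain NULL test, fixed error. */
int lookup_new(const char *name)
{
	struct fake_table *t = fake_find_table(name);

	if (t)
		return 0;
	return -ENOENT;
}
```

Because the lookup never returns an encoded error pointer, both versions return the same value for every input; the second simply drops the dead branches.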
    • tcp: take care of truncations done by sk_filter() · ac6e7800
      Eric Dumazet committed
      With syzkaller's help, Marco Grassi found a bug in the TCP stack
      that crashes in tcp_collapse().
      
      The root cause is that sk_filter() can truncate the incoming skb,
      but the TCP stack was not really expecting this to happen.
      It was probably expecting a simple DROP or ACCEPT behavior.
      
      We first need to make sure no part of the TCP header can be removed.
      Then we need to adjust TCP_SKB_CB(skb)->end_seq.
      
      Many thanks to the syzkaller team and Marco for giving us a reproducer.
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Reported-by: Marco Grassi <marco.gra@gmail.com>
      Reported-by: Vladis Dronov <vdronov@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
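The two steps named in the message above (never trim into the TCP header, then shrink end_seq by the number of bytes removed) can be sketched in userspace as follows. This is an illustration under assumptions, not the patch itself: `struct fake_skb`, `fake_trim_cap()` and `fake_tcp_filter()` are hypothetical names, the filter is reduced to a requested length, and SYN/FIN sequence-space accounting is ignored.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, heavily reduced packet descriptor. */
struct fake_skb {
	uint32_t len;     /* bytes in the packet, TCP header included */
	uint32_t seq;     /* sequence number of the first payload byte */
	uint32_t end_seq; /* models TCP_SKB_CB(skb)->end_seq */
};

/* Trim the packet to 'keep' bytes, but never below 'cap' bytes. */
void fake_trim_cap(struct fake_skb *skb, uint32_t keep, uint32_t cap)
{
	if (keep < cap)
		keep = cap;
	if (keep < skb->len)
		skb->len = keep;
}

/* Run the (modelled) filter and fix up end_seq afterwards.
 * Returns the number of payload bytes that were removed. */
uint32_t fake_tcp_filter(struct fake_skb *skb, uint32_t keep,
			 uint32_t hdr_len)
{
	uint32_t before = skb->len;
	uint32_t eaten;

	assert(skb->len >= hdr_len); /* a full header must be present */

	/* Step 1: the header length is the floor for any truncation. */
	fake_trim_cap(skb, keep, hdr_len);

	/* Step 2: end_seq must describe only the bytes still present. */
	eaten = before - skb->len;
	skb->end_seq -= eaten;
	return eaten;
}
```

For example, a packet with a 20-byte header and 100 bytes of payload that the filter truncates to 60 bytes keeps 40 payload bytes, so `end_seq` drops by 60 and ends up at `seq + 40`.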
    • ipv4: use new_gw for redirect neigh lookup · 969447f2
      Stephen Suryaputra Lin committed
      In v2.6, ip_rt_redirect() calls arp_bind_neighbour() which returns 0
      and then the state of the neigh for the new_gw is checked. If the state
      isn't valid then the redirected route is deleted. This behavior is
      maintained up to v3.5.7 by check_peer_redirect() because rt->rt_gateway
      is assigned to peer->redirect_learned.a4 before calling
      ipv4_neigh_lookup().
      
      After commit 5943634f ("ipv4: Maintain redirect and PMTU info in
      struct rtable again."), ipv4_neigh_lookup() is performed without the
      rt_gateway assigned to the new_gw. In the case when rt_gateway (old_gw)
      isn't zero, the function uses it as the key. The neigh is most likely
      valid since the old_gw is the one that sends the ICMP redirect message.
      Then the new_gw is assigned to fib_nh_exception. The problem is: the
      ARP entry for the new_gw may never get resolved and the traffic is blackholed.
      
      So, use the new_gw for neigh lookup.
      
      Changes from v1:
       - use __ipv4_neigh_lookup instead (per Eric Dumazet).
      
      Fixes: 5943634f ("ipv4: Maintain redirect and PMTU info in struct rtable again.")
      Signed-off-by: Stephen Suryaputra Lin <ssurya@ieee.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  5. 13 Nov, 2016 10 commits
  6. 11 Nov, 2016 4 commits
  7. 10 Nov, 2016 16 commits