1. 10 Mar, 2018 3 commits
    •
      net: introduce IFF_NO_RX_HANDLER · f5426250
      Paolo Abeni authored
      Some network devices - notably ipvlan slave - are not compatible with
      any kind of rx_handler. Currently the hook can be installed but any
      configuration (bridge, bond, macsec, ...) is nonfunctional.
      
      This change allocates a priv_flag bit to mark such devices and explicitly
      forbids installing an rx_handler if that bit is set. The new bit is used
      by the ipvlan slave device.
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
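The check described above can be modelled in userspace roughly as follows. This is a minimal sketch, not the kernel code: `sketch_dev`, `SKETCH_NO_RX_HANDLER`, `sketch_rx_handler_register()` and the chosen error codes are all hypothetical names and values picked for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical bit position, chosen only for this sketch. */
#define SKETCH_NO_RX_HANDLER (1u << 0)

struct sketch_dev {
	unsigned int priv_flags;
	int (*rx_handler)(void *pkt);   /* NULL when no hook is installed */
};

/* Returns 0 on success, or a negative errno value (illustrative codes). */
int sketch_rx_handler_register(struct sketch_dev *dev,
			       int (*handler)(void *pkt))
{
	assert(dev != NULL && handler != NULL);

	if (dev->rx_handler)
		return -EBUSY;          /* only one hook per device */
	if (dev->priv_flags & SKETCH_NO_RX_HANDLER)
		return -EINVAL;         /* device opted out of rx handlers */

	dev->rx_handler = handler;
	return 0;
}

/* Trivial handler used to exercise the registration path. */
int sketch_noop_handler(void *pkt)
{
	(void)pkt;
	return 0;
}
```

With the flag set, registration is refused up front, so a bridge, bond or macsec setup on such a device fails at configuration time instead of silently not working.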
    •
      pktgen: Remove VLA usage · 35951393
      Gustavo A. R. Silva authored
      In preparation for enabling -Wvla, remove the VLA usage and replace it
      with a fixed-length array instead.
      Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
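The general pattern of trading a VLA for a fixed-length array can be sketched as below. This is not the pktgen change itself: the buffer size `SKETCH_BUF_LEN` and the helper `sketch_copy_bounded()` are hypothetical and exist only to show the idea.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_BUF_LEN 32   /* hypothetical fixed size for this sketch */

/*
 * Copy at most SKETCH_BUF_LEN - 1 bytes of src into a fixed-size buffer,
 * NUL-terminate it, and return how many bytes were copied.
 * A VLA version would instead declare `char tmp[count + 1];`, whose
 * stack usage depends on a runtime value.
 */
size_t sketch_copy_bounded(char dst[SKETCH_BUF_LEN], const char *src,
			   size_t count)
{
	const size_t max = SKETCH_BUF_LEN - 1;
	size_t n;

	assert(dst != NULL && src != NULL);

	n = count < max ? count : max;  /* clamp to the fixed capacity */
	memcpy(dst, src, n);
	dst[n] = '\0';
	return n;
}
```

The trade-off is that input longer than the fixed capacity is truncated, so the bound has to be chosen to cover every legitimate caller.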
    •
      net: do not create fallback tunnels for non-default namespaces · 79134e6c
      Eric Dumazet authored
      Fallback tunnels (like tunl0, gre0, gretap0, erspan0, sit0,
      ip6tnl0, ip6gre0) are automatically created when the corresponding
      module is loaded.
      
      These tunnels are also automatically created when a new network
      namespace is created, at a great cost.
      
      In many cases, netns are used for isolation purposes, and these
      extra network devices are a waste of resources. We are using
      thousands of netns per host, and hit the netns creation/delete
      bottleneck a lot. (Many thanks to Kirill for recent work on this)
      
      Add a new sysctl so that we can opt out of this automatic creation.
      
      Note that these tunnels are still created for the initial namespace,
      to be the least intrusive for typical setups.
      
      Tested:
      lpk43:~# cat add_del_unshare.sh
      for i in `seq 1 40`
      do
       (for j in `seq 1 100` ; do  unshare -n /bin/true >/dev/null ; done) &
      done
      wait
      
      lpk43:~# echo 0 >/proc/sys/net/core/fb_tunnels_only_for_init_net
      lpk43:~# time ./add_del_unshare.sh
      
      real	0m37.521s
      user	0m0.886s
      sys	7m7.084s
      lpk43:~# echo 1 >/proc/sys/net/core/fb_tunnels_only_for_init_net
      lpk43:~# time ./add_del_unshare.sh
      
      real	0m4.761s
      user	0m0.851s
      sys	1m8.343s
      lpk43:~#
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
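The decision the new sysctl controls can be sketched in userspace as a single predicate. This is a model under stated assumptions: `sketch_net`, `sketch_init_net` and `sketch_wants_fallback_tunnels()` are hypothetical names, and the integer variable merely mirrors the 0/1 value written to `fb_tunnels_only_for_init_net` in the test above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct sketch_net {
	int id;
};

/* Stand-in for the initial network namespace. */
struct sketch_net sketch_init_net = { 0 };

/* Mirrors the 0/1 value written to fb_tunnels_only_for_init_net. */
int sketch_fb_tunnels_only_for_init_net = 0;

/*
 * Fallback tunnels are always created for the initial namespace; other
 * namespaces get them only while the opt-out knob is left at 0.
 */
bool sketch_wants_fallback_tunnels(const struct sketch_net *net)
{
	assert(net != NULL);
	return net == &sketch_init_net ||
	       !sketch_fb_tunnels_only_for_init_net;
}
```

Keeping the initial namespace unconditional is what makes the knob unintrusive for typical setups: only newly created namespaces skip the extra devices.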
  2. 09 Mar, 2018 3 commits
  3. 08 Mar, 2018 1 commit
  4. 07 Mar, 2018 1 commit
    •
      net: Make account struct net to memcg · 30855ffc
      Kirill Tkhai authored
      The patch adds SLAB_ACCOUNT to the flags of the net_cachep cache,
      which enables accounting of struct net memory to memcg kmem.
      Since the number of net namespaces may be significant, users
      want to know how much memory they consume, and to control it.
      
      Note that we do not account net_generic to the memcg where net
      was accounted; in fact, we do not account it at all (*).
      We do not want a memory deficit in a single memcg to
      prevent us from registering new pernet_operations.
      
      (*) This is despite !current process accounting already being
      available in linux-next. See kmalloc_memcg() there for the details.
      Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  5. 06 Mar, 2018 1 commit
  6. 05 Mar, 2018 5 commits
  7. 02 Mar, 2018 3 commits
    •
      net: ethtool: don't ignore return from driver get_fecparam method · a6d50512
      Edward Cree authored
      If ethtool_ops->get_fecparam returns an error, pass that error on to the
      user, rather than ignoring it.
      
      Fixes: 1a5f3da2 ("net: ethtool: add support for forward error correction modes")
      Signed-off-by: Edward Cree <ecree@solarflare.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
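The shape of this fix, propagating a driver callback's error instead of dropping it, can be sketched as follows. This is a simplified userspace model: `sketch_ops`, `sketch_fecparam`, `sketch_get_fecparam()` and the two example drivers are hypothetical, and the specific errno values are chosen only for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct sketch_fecparam {
	unsigned int fec;
};

struct sketch_ops {
	int (*get_fecparam)(struct sketch_fecparam *out);
};

/* Returns 0 on success, or the driver's negative errno value. */
int sketch_get_fecparam(const struct sketch_ops *ops,
			struct sketch_fecparam *out)
{
	int rc;

	assert(ops != NULL && out != NULL);

	if (!ops->get_fecparam)
		return -EOPNOTSUPP;     /* driver has no such method */

	rc = ops->get_fecparam(out);
	if (rc)
		return rc;              /* pass the driver error to the caller */

	return 0;
}

/* Example driver callbacks used to exercise both paths. */
int sketch_failing_driver(struct sketch_fecparam *out)
{
	(void)out;
	return -EIO;
}

int sketch_working_driver(struct sketch_fecparam *out)
{
	out->fec = 1;
	return 0;
}
```

Without the `if (rc)` check the caller would report success and hand back whatever happened to be in `out`, which is the behaviour the commit removes.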
    •
      net: allow interface to be set into VRF if VLAN interface in same VRF · 50d629e7
      Mike Manning authored
      Setting an interface into a VRF fails with 'RTNETLINK answers: File
      exists' if one of its VLAN interfaces is already in the same VRF.
      As the VRF is an upper device of the VLAN interface, it is also showing
      up as an upper device of the interface itself. The solution is to
      restrict this check to devices other than master. As only one master
      device can be linked to a device, the check in this case is that the
      upper device (VRF) being linked to is not the same as the master device
      instead of it not being any one of the upper devices.
      
      The following example shows an interface ens12 (with a VLAN interface
      ens12.10) being set into VRF green, which behaves as expected:
      
        # ip link add link ens12 ens12.10 type vlan id 10
        # ip link set dev ens12 master vrfgreen
        # ip link show dev ens12
          3: ens12: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel
             master vrfgreen state UP mode DEFAULT group default qlen 1000
             link/ether 52:54:00:4c:a0:45 brd ff:ff:ff:ff:ff:ff
      
      But if the VLAN interface has previously been set into the same VRF,
      then setting the interface into the VRF fails:
      
        # ip link set dev ens12 nomaster
        # ip link set dev ens12.10 master vrfgreen
        # ip link show dev ens12.10
          39: ens12.10@ens12: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500
          qdisc noqueue master vrfgreen state UP mode DEFAULT group default
          qlen 1000 link/ether 52:54:00:4c:a0:45 brd ff:ff:ff:ff:ff:ff
        # ip link set dev ens12 master vrfgreen
          RTNETLINK answers: File exists
      
      The workaround is to move the VLAN interface back into the default VRF
      beforehand, but it has to be shut down first to avoid the risk of
      traffic leaking from the VRF. This fix removes the need for this workaround.
      Signed-off-by: Mike Manning <mmanning@att.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
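The revised check can be modelled on a toy device graph. This is a sketch under stated assumptions rather than the kernel implementation: `sketch_dev`, `sketch_has_upper()` and `sketch_link_upper()` are hypothetical, the fixed-size `uppers` array is a simplification, and only the "File exists" result (`EEXIST`) comes from the commit message; the other error codes are illustrative.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_MAX_UPPERS 4

struct sketch_dev {
	struct sketch_dev *uppers[SKETCH_MAX_UPPERS]; /* directly linked uppers */
	size_t n_uppers;
	struct sketch_dev *master;                    /* at most one master */
};

/* True if `upper` is reachable through any chain of upper links. */
bool sketch_has_upper(const struct sketch_dev *dev,
		      const struct sketch_dev *upper)
{
	size_t i;

	for (i = 0; i < dev->n_uppers; i++)
		if (dev->uppers[i] == upper ||
		    sketch_has_upper(dev->uppers[i], upper))
			return true;
	return false;
}

int sketch_link_upper(struct sketch_dev *dev, struct sketch_dev *upper,
		      bool as_master)
{
	assert(dev != NULL && upper != NULL);

	if (as_master) {
		/* Master links: compare against the current master only. */
		if (dev->master)
			return dev->master == upper ? -EEXIST : -EBUSY;
	} else if (sketch_has_upper(dev, upper)) {
		/* Non-master links: refuse any already-reachable upper. */
		return -EEXIST;
	}

	if (dev->n_uppers == SKETCH_MAX_UPPERS)
		return -ENOMEM;

	dev->uppers[dev->n_uppers++] = upper;
	if (as_master)
		dev->master = upper;
	return 0;
}
```

In this model, ens12 already reaches vrfgreen through ens12.10, yet linking ens12 to vrfgreen as master succeeds because ens12 has no master of its own yet.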
    •
      net: Fix spelling mistake "greater then" -> "greater than" · 3a053b1a
      Gal Pressman authored
      Fix trivial spelling mistake "greater then" -> "greater than".
      Signed-off-by: Gal Pressman <galp@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  8. 01 Mar, 2018 2 commits
  9. 28 Feb, 2018 1 commit
  10. 27 Feb, 2018 2 commits
  11. 24 Feb, 2018 2 commits
  12. 22 Feb, 2018 3 commits
    •
      bpf: clean up unused-variable warning · a7dcdf6e
      Arnd Bergmann authored
      The only user of this variable is inside of an #ifdef, causing
      a warning without CONFIG_INET:
      
      net/core/filter.c: In function '____bpf_sock_ops_cb_flags_set':
      net/core/filter.c:3382:6: error: unused variable 'val' [-Werror=unused-variable]
        int val = argval & BPF_SOCK_OPS_ALL_CB_FLAGS;
      
      This replaces the #ifdef with a nicer IS_ENABLED() check that
      makes the code more readable and avoids the warning.
      
      Fixes: b13d8807 ("bpf: Adds field bpf_sock_ops_cb_flags to tcp_sock")
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
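The difference between the two styles can be sketched in plain C. This is an illustration only: `SKETCH_FEATURE` is a hypothetical 0/1 macro standing in for a Kconfig option tested with IS_ENABLED(), and `SKETCH_ALL_FLAGS` and `sketch_set_flags()` are made-up names, not the code in net/core/filter.c.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical build-time switch; stands in for a Kconfig option. */
#ifndef SKETCH_FEATURE
#define SKETCH_FEATURE 1
#endif

#define SKETCH_ALL_FLAGS 0x7    /* hypothetical mask of supported flags */

/*
 * With an #ifdef around the only use of `val`, a build without the
 * feature sees an unused variable. With a plain C condition, the
 * compiler still parses the use and then drops the dead branch.
 */
int sketch_set_flags(int *stored, int argval)
{
	int val = argval & SKETCH_ALL_FLAGS;

	assert(stored != NULL);

	if (!SKETCH_FEATURE)
		return -EINVAL;         /* feature compiled out */

	*stored = val;                  /* `val` is referenced in every build */
	return argval & ~SKETCH_ALL_FLAGS; /* report the unsupported bits */
}
```

A side benefit of the plain condition is that both configurations are type-checked on every build, so the disabled path cannot quietly rot.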
    •
      net: Allow a rule to track originating protocol · cac56209
      Donald Sharp authored
      Allow a rule that is being added/deleted/modified or
      dumped to contain the originating protocol's id.
      
      The protocol is handled just like a route's originating
      protocol is. This is especially useful because there
      is starting to be a plethora of different user space
      programs adding rules.
      
      Allow the vrf device to specify that the kernel is the originator
      of the rule created for this device.
      Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    •
      tcp: switch to GSO being always on · 0a6b2a1d
      Eric Dumazet authored
      Oleksandr Natalenko reported performance issues with BBR without the FQ
      packet scheduler that were root-caused to the lack of SG and GSO/TSO on
      his configuration.
      
      In this mode, TCP internal pacing has to set up a high-resolution timer
      for each MSS sent.
      
      We could implement in TCP a strategy similar to the one adopted
      in commit fefa569a ("net_sched: sch_fq: account for schedule/timers drifts")
      or decide to finally switch TCP stack to a GSO only mode.
      
      This has many benefits:
      
      1) Most TCP developments are done with TSO in mind.
      2) Fewer high-resolution timers need to be armed for TCP pacing.
      3) GSO can benefit from the xmit_more hint.
      4) Receiver GRO is more effective (as if TSO was used for real on sender)
         -> Lower ACK traffic
      5) Write queues have less overhead (one skb holds about 64KB of payload).
      6) SACK coalescing just works.
      7) The rtx rb-tree contains fewer packets, so SACK is cheaper.
      
      This patch implements the minimum change, but we can remove some legacy
      code in follow-ups.
      
      Tested:
      
      On 40Gbit link, one netperf -t TCP_STREAM
      
      BBR+fq:
      sg on:  26 Gbits/sec
      sg off: 15.7 Gbits/sec   (was 2.3 Gbit before patch)
      
      BBR+pfifo_fast:
      sg on:  24.2 Gbits/sec
      sg off: 14.9 Gbits/sec  (was 0.66 Gbit before patch !!! )
      
      BBR+fq_codel:
      sg on:  24.4 Gbits/sec
      sg off: 15 Gbits/sec  (was 0.66 Gbit before patch !!! )
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Reported-by: Oleksandr Natalenko <oleksandr@natalenko.name>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  13. 21 Feb, 2018 4 commits
  14. 17 Feb, 2018 2 commits
  15. 15 Feb, 2018 2 commits
  16. 13 Feb, 2018 5 commits