1. 17 11月, 2018 1 次提交
  2. 08 11月, 2018 1 次提交
    • D
      net: vlan: add support for tunnel offload · 7dad9937
      Davide Caratti 提交于
      GSO tunneled packets are always segmented in software before they are
      transmitted by a VLAN, even when the lower device can offload tunnel
      encapsulation and VLAN together (i.e., some bits in NETIF_F_GSO_ENCAP_ALL
      mask are set in the lower device 'vlan_features'). If we let VLANs have
      the same tunnel offload capabilities as their lower device, throughput
      can improve significantly when CPU is limited on the transmitter side.
      
       - set NETIF_F_GSO_ENCAP_ALL bits in the VLAN 'hw_features', to ensure
       that 'features' will have those bits zeroed only when the lower device
       has no hardware support for tunnel encapsulation.
       - for the same reason, copy GSO-related bits of 'hw_enc_features' from
       lower device to VLAN, and ensure to update that value when the lower
       device changes its features.
       - set NETIF_F_HW_CSUM bit in the VLAN 'hw_enc_features' if 'real_dev'
       is able to compute checksums at least for a kind of packets, like done
       with commit 8403debe ("vlan: Keep NETIF_F_HW_CSUM similar to other
       software devices"). This avoids software segmentation due to mismatching
       checksum capabilities between VLAN's 'features' and 'hw_enc_features'.
      Reported-by: NFlavio Leitner <fbl@redhat.com>
      Signed-off-by: NDavide Caratti <dcaratti@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      7dad9937
  3. 02 7月, 2018 1 次提交
    • S
      net: fix use-after-free in GRO with ESP · 603d4cf8
      Sabrina Dubroca 提交于
      Since the addition of GRO for ESP, gro_receive can consume the skb and
      return -EINPROGRESS. In that case, the lower layer GRO handler cannot
      touch the skb anymore.
      
      Commit 5f114163 ("net: Add a skb_gro_flush_final helper.") converted
      some of the gro_receive handlers that can lead to ESP's gro_receive so
      that they wouldn't access the skb when -EINPROGRESS is returned, but
      missed other spots, mainly in tunneling protocols.
      
      This patch finishes the conversion to using skb_gro_flush_final(), and
      adds a new helper, skb_gro_flush_final_remcsum(), used in VXLAN and
      GUE.
      
      Fixes: 5f114163 ("net: Add a skb_gro_flush_final helper.")
      Signed-off-by: NSabrina Dubroca <sd@queasysnail.net>
      Reviewed-by: NStefano Brivio <sbrivio@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      603d4cf8
  4. 26 6月, 2018 1 次提交
    • D
      net: Convert GRO SKB handling to list_head. · d4546c25
      David Miller 提交于
      Manage pending per-NAPI GRO packets via list_head.
      
      Return an SKB pointer from the GRO receive handlers.  When GRO receive
      handlers return non-NULL, it means that this SKB needs to be completed
      at this time and removed from the NAPI queue.
      
      Several operations are greatly simplified by this transformation,
      especially timing out the oldest SKB in the list when gro_count
      exceeds MAX_GRO_SKBS, and napi_gro_flush() which walks the queue
      in reverse order.
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      d4546c25
  5. 18 5月, 2018 1 次提交
  6. 30 3月, 2018 1 次提交
    • G
      net: Call add/kill vid ndo on vlan filter feature toggling · 9daae9bd
      Gal Pressman 提交于
      NETIF_F_HW_VLAN_[CS]TAG_FILTER features require more than just a bit
      flip in dev->features in order to keep the driver in a consistent state.
      These features notify the driver of each added/removed vlan, but toggling
      of vlan-filter does not notify the driver accordingly for each of the
      existing vlans.
      
      This patch implements a similar solution to NETIF_F_RX_UDP_TUNNEL_PORT
      behavior (which notifies the driver about UDP ports in the same manner
      that vids are reported).
      
      Each toggling of the features propagates to the 8021q module, which
      iterates over the vlans and call add/kill ndo accordingly.
      Signed-off-by: NGal Pressman <galp@mellanox.com>
      Reviewed-by: NTariq Toukan <tariqt@mellanox.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      9daae9bd
  7. 28 3月, 2018 1 次提交
  8. 28 2月, 2018 1 次提交
  9. 11 1月, 2018 1 次提交
  10. 11 11月, 2017 1 次提交
  11. 04 11月, 2017 1 次提交
  12. 05 10月, 2017 1 次提交
  13. 20 6月, 2017 1 次提交
    • G
      net: 8021q: Fix one possible panic caused by BUG_ON in free_netdev · 9745e362
      Gao Feng 提交于
      The register_vlan_device would invoke free_netdev directly, when
      register_vlan_dev failed. It would trigger the BUG_ON in free_netdev
      if the dev was already registered. In this case, the netdev would be
      freed in netdev_run_todo later.
      
      So add one condition check now. Only when dev is not registered, then
      free it directly.
      
      The following is the part coredump when netdev_upper_dev_link failed
      in register_vlan_dev. I removed the lines which are too long.
      
      [  411.237457] ------------[ cut here ]------------
      [  411.237458] kernel BUG at net/core/dev.c:7998!
      [  411.237484] invalid opcode: 0000 [#1] SMP
      [  411.237705]  [last unloaded: 8021q]
      [  411.237718] CPU: 1 PID: 12845 Comm: vconfig Tainted: G            E   4.12.0-rc5+ #6
      [  411.237737] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 07/02/2015
      [  411.237764] task: ffff9cbeb6685580 task.stack: ffffa7d2807d8000
      [  411.237782] RIP: 0010:free_netdev+0x116/0x120
      [  411.237794] RSP: 0018:ffffa7d2807dbdb0 EFLAGS: 00010297
      [  411.237808] RAX: 0000000000000002 RBX: ffff9cbeb6ba8fd8 RCX: 0000000000001878
      [  411.237826] RDX: 0000000000000001 RSI: 0000000000000282 RDI: 0000000000000000
      [  411.237844] RBP: ffffa7d2807dbdc8 R08: 0002986100029841 R09: 0002982100029801
      [  411.237861] R10: 0004000100029980 R11: 0004000100029980 R12: ffff9cbeb6ba9000
      [  411.238761] R13: ffff9cbeb6ba9060 R14: ffff9cbe60f1a000 R15: ffff9cbeb6ba9000
      [  411.239518] FS:  00007fb690d81700(0000) GS:ffff9cbebb640000(0000) knlGS:0000000000000000
      [  411.239949] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [  411.240454] CR2: 00007f7115624000 CR3: 0000000077cdf000 CR4: 00000000003406e0
      [  411.240936] Call Trace:
      [  411.241462]  vlan_ioctl_handler+0x3f1/0x400 [8021q]
      [  411.241910]  sock_ioctl+0x18b/0x2c0
      [  411.242394]  do_vfs_ioctl+0xa1/0x5d0
      [  411.242853]  ? sock_alloc_file+0xa6/0x130
      [  411.243465]  SyS_ioctl+0x79/0x90
      [  411.243900]  entry_SYSCALL_64_fastpath+0x1e/0xa9
      [  411.244425] RIP: 0033:0x7fb69089a357
      [  411.244863] RSP: 002b:00007ffcd04e0fc8 EFLAGS: 00000202 ORIG_RAX: 0000000000000010
      [  411.245445] RAX: ffffffffffffffda RBX: 00007ffcd04e2884 RCX: 00007fb69089a357
      [  411.245903] RDX: 00007ffcd04e0fd0 RSI: 0000000000008983 RDI: 0000000000000003
      [  411.246527] RBP: 00007ffcd04e0fd0 R08: 0000000000000000 R09: 1999999999999999
      [  411.246976] R10: 000000000000053f R11: 0000000000000202 R12: 0000000000000004
      [  411.247414] R13: 00007ffcd04e1128 R14: 00007ffcd04e2888 R15: 0000000000000001
      [  411.249129] RIP: free_netdev+0x116/0x120 RSP: ffffa7d2807dbdb0
      Signed-off-by: NGao Feng <gfree.wind@vip.163.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      9745e362
  14. 25 12月, 2016 1 次提交
  15. 18 11月, 2016 1 次提交
    • A
      netns: make struct pernet_operations::id unsigned int · c7d03a00
      Alexey Dobriyan 提交于
      Make struct pernet_operations::id unsigned.
      
      There are 2 reasons to do so:
      
      1)
      This field is really an index into an zero based array and
      thus is unsigned entity. Using negative value is out-of-bound
      access by definition.
      
      2)
      On x86_64 unsigned 32-bit data which are mixed with pointers
      via array indexing or offsets added or subtracted to pointers
      are preffered to signed 32-bit data.
      
      "int" being used as an array index needs to be sign-extended
      to 64-bit before being used.
      
      	void f(long *p, int i)
      	{
      		g(p[i]);
      	}
      
        roughly translates to
      
      	movsx	rsi, esi
      	mov	rdi, [rsi+...]
      	call 	g
      
      MOVSX is 3 byte instruction which isn't necessary if the variable is
      unsigned because x86_64 is zero extending by default.
      
      Now, there is net_generic() function which, you guessed it right, uses
      "int" as an array index:
      
      	static inline void *net_generic(const struct net *net, int id)
      	{
      		...
      		ptr = ng->ptr[id - 1];
      		...
      	}
      
      And this function is used a lot, so those sign extensions add up.
      
      Patch snipes ~1730 bytes on allyesconfig kernel (without all junk
      messing with code generation):
      
      	add/remove: 0/0 grow/shrink: 70/598 up/down: 396/-2126 (-1730)
      
      Unfortunately some functions actually grow bigger.
      This is a semmingly random artefact of code generation with register
      allocator being used differently. gcc decides that some variable
      needs to live in new r8+ registers and every access now requires REX
      prefix. Or it is shifted into r12, so [r12+0] addressing mode has to be
      used which is longer than [r8]
      
      However, overall balance is in negative direction:
      
      	add/remove: 0/0 grow/shrink: 70/598 up/down: 396/-2126 (-1730)
      	function                                     old     new   delta
      	nfsd4_lock                                  3886    3959     +73
      	tipc_link_build_proto_msg                   1096    1140     +44
      	mac80211_hwsim_new_radio                    2776    2808     +32
      	tipc_mon_rcv                                1032    1058     +26
      	svcauth_gss_legacy_init                     1413    1429     +16
      	tipc_bcbase_select_primary                   379     392     +13
      	nfsd4_exchange_id                           1247    1260     +13
      	nfsd4_setclientid_confirm                    782     793     +11
      		...
      	put_client_renew_locked                      494     480     -14
      	ip_set_sockfn_get                            730     716     -14
      	geneve_sock_add                              829     813     -16
      	nfsd4_sequence_done                          721     703     -18
      	nlmclnt_lookup_host                          708     686     -22
      	nfsd4_lockt                                 1085    1063     -22
      	nfs_get_client                              1077    1050     -27
      	tcf_bpf_init                                1106    1076     -30
      	nfsd4_encode_fattr                          5997    5930     -67
      	Total: Before=154856051, After=154854321, chg -0.00%
      Signed-off-by: NAlexey Dobriyan <adobriyan@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      c7d03a00
  16. 21 10月, 2016 1 次提交
    • S
      net: add recursion limit to GRO · fcd91dd4
      Sabrina Dubroca 提交于
      Currently, GRO can do unlimited recursion through the gro_receive
      handlers.  This was fixed for tunneling protocols by limiting tunnel GRO
      to one level with encap_mark, but both VLAN and TEB still have this
      problem.  Thus, the kernel is vulnerable to a stack overflow, if we
      receive a packet composed entirely of VLAN headers.
      
      This patch adds a recursion counter to the GRO layer to prevent stack
      overflow.  When a gro_receive function hits the recursion limit, GRO is
      aborted for this skb and it is processed normally.  This recursion
      counter is put in the GRO CB, but could be turned into a percpu counter
      if we run out of space in the CB.
      
      Thanks to Vladimír Beneš <vbenes@redhat.com> for the initial bug report.
      
      Fixes: CVE-2016-7039
      Fixes: 9b174d88 ("net: Add Transparent Ethernet Bridging GRO support.")
      Fixes: 66e5133f ("vlan: Add GRO support for non hardware accelerated vlan")
      Signed-off-by: NSabrina Dubroca <sd@queasysnail.net>
      Reviewed-by: NJiri Benc <jbenc@redhat.com>
      Acked-by: NHannes Frederic Sowa <hannes@stressinduktion.org>
      Acked-by: NTom Herbert <tom@herbertland.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      fcd91dd4
  17. 19 10月, 2016 1 次提交
  18. 18 10月, 2016 1 次提交
  19. 14 8月, 2016 1 次提交
    • S
      net: remove type_check from dev_get_nest_level() · 952fcfd0
      Sabrina Dubroca 提交于
      The idea for type_check in dev_get_nest_level() was to count the number
      of nested devices of the same type (currently, only macvlan or vlan
      devices).
      This prevented the false positive lockdep warning on configurations such
      as:
      
      eth0 <--- macvlan0 <--- vlan0 <--- macvlan1
      
      However, this doesn't prevent a warning on a configuration such as:
      
      eth0 <--- macvlan0 <--- vlan0
      eth1 <--- vlan1 <--- macvlan1
      
      In this case, all the locks end up with a nesting subclass of 1, so
      lockdep thinks that there is still a deadlock:
      
      - in the first case we have (macvlan_netdev_addr_lock_key, 1) and then
        take (vlan_netdev_xmit_lock_key, 1)
      - in the second case, we have (vlan_netdev_xmit_lock_key, 1) and then
        take (macvlan_netdev_addr_lock_key, 1)
      
      By removing the linktype check in dev_get_nest_level() and always
      incrementing the nesting depth, lockdep considers this configuration
      valid.
      Signed-off-by: NSabrina Dubroca <sd@queasysnail.net>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      952fcfd0
  20. 01 6月, 2016 1 次提交
    • M
      vlan: Propagate MAC address to VLANs · 308453aa
      Mike Manning 提交于
      The MAC address of the physical interface is only copied to the VLAN
      when it is first created, resulting in an inconsistency after MAC
      address changes of only newly created VLANs having an up-to-date MAC.
      
      The VLANs should continue inheriting the MAC address of the physical
      interface until the VLAN MAC address is explicitly set to any value.
      This allows IPv6 EUI64 addresses for the VLAN to reflect any changes
      to the MAC of the physical interface and thus for DAD to behave as
      expected.
      Signed-off-by: NMike Manning <mmanning@brocade.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      308453aa
  21. 18 3月, 2016 1 次提交
  22. 22 2月, 2016 1 次提交
  23. 02 6月, 2015 1 次提交
    • T
      vlan: Add GRO support for non hardware accelerated vlan · 66e5133f
      Toshiaki Makita 提交于
      Currently packets with non-hardware-accelerated vlan cannot be handled
      by GRO. This causes low performance for 802.1ad and stacked vlan, as their
      vlan tags are currently not stripped by hardware.
      
      This patch adds GRO support for non-hardware-accelerated vlan and
      improves receive performance of them.
      
      Test Environment:
       vlan device (.1Q) on vlan device (.1ad) on ixgbe (82599)
      
      Result:
      
      - Before
      
      $ netperf -t TCP_STREAM -H 192.168.20.2 -l 60
      Recv   Send    Send
      Socket Socket  Message  Elapsed
      Size   Size    Size     Time     Throughput
      bytes  bytes   bytes    secs.    10^6bits/sec
      
       87380  16384  16384    60.00    5233.17
      
      Rx side CPU usage:
        %usr      %sys      %irq     %soft     %idle
        0.27     58.03      0.00     41.70      0.00
      
      - After
      
      $ netperf -t TCP_STREAM -H 192.168.20.2 -l 60
      Recv   Send    Send
      Socket Socket  Message  Elapsed
      Size   Size    Size     Time     Throughput
      bytes  bytes   bytes    secs.    10^6bits/sec
      
       87380  16384  16384    60.00    7586.85
      
      Rx side CPU usage:
        %usr      %sys      %irq     %soft     %idle
        0.50     25.83      0.00     59.53     14.14
      
      [ Register VLAN offloads with priority 10 -DaveM ]
      Signed-off-by: NToshiaki Makita <makita.toshiaki@lab.ntt.co.jp>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      66e5133f
  24. 14 5月, 2015 1 次提交
    • V
      vlan: Correctly propagate promisc|allmulti flags in notifier. · be346ffa
      Vlad Yasevich 提交于
      Currently vlan notifier handler will try to update all vlans
      for a device when that device comes up.  A problem occurs,
      however, when the vlan device was set to promiscuous, but not
      by the user (ex: a bridge).  In that case, dev->gflags are
      not updated.  What results is that the lower device ends
      up with an extra promiscuity count.  Here are the
      backtraces that prove this:
      [62852.052179]  [<ffffffff814fe248>] __dev_set_promiscuity+0x38/0x1e0
      [62852.052186]  [<ffffffff8160bcbb>] ? _raw_spin_unlock_bh+0x1b/0x40
      [62852.052188]  [<ffffffff814fe4be>] ? dev_set_rx_mode+0x2e/0x40
      [62852.052190]  [<ffffffff814fe694>] dev_set_promiscuity+0x24/0x50
      [62852.052194]  [<ffffffffa0324795>] vlan_dev_open+0xd5/0x1f0 [8021q]
      [62852.052196]  [<ffffffff814fe58f>] __dev_open+0xbf/0x140
      [62852.052198]  [<ffffffff814fe88d>] __dev_change_flags+0x9d/0x170
      [62852.052200]  [<ffffffff814fe989>] dev_change_flags+0x29/0x60
      
      The above comes from the setting the vlan device to IFF_UP state.
      
      [62852.053569]  [<ffffffff814fe248>] __dev_set_promiscuity+0x38/0x1e0
      [62852.053571]  [<ffffffffa032459b>] ? vlan_dev_set_rx_mode+0x2b/0x30
      [8021q]
      [62852.053573]  [<ffffffff814fe8d5>] __dev_change_flags+0xe5/0x170
      [62852.053645]  [<ffffffff814fe989>] dev_change_flags+0x29/0x60
      [62852.053647]  [<ffffffffa032334a>] vlan_device_event+0x18a/0x690
      [8021q]
      [62852.053649]  [<ffffffff8161036c>] notifier_call_chain+0x4c/0x70
      [62852.053651]  [<ffffffff8109d456>] raw_notifier_call_chain+0x16/0x20
      [62852.053653]  [<ffffffff814f744d>] call_netdevice_notifiers+0x2d/0x60
      [62852.053654]  [<ffffffff814fe1a3>] __dev_notify_flags+0x33/0xa0
      [62852.053656]  [<ffffffff814fe9b2>] dev_change_flags+0x52/0x60
      [62852.053657]  [<ffffffff8150cd57>] do_setlink+0x397/0xa40
      
      And this one comes from the notification code.  What we end
      up with is a vlan with promiscuity count of 1 and and a physical
      device with a promiscuity count of 2.  They should both have
      a count 1.
      
      To resolve this issue, vlan code can use dev_get_flags() api
      which correctly masks promiscuity and allmulti flags.
      Signed-off-by: NVlad Yasevich <vyasevic@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      be346ffa
  25. 19 3月, 2015 1 次提交
    • D
      net: Fix high overhead of vlan sub-device teardown. · 99c4a26a
      David S. Miller 提交于
      When a networking device is taken down that has a non-trivial number
      of VLAN devices configured under it, we eat a full synchronize_net()
      for every such VLAN device.
      
      This is because of the call chain:
      
      	NETDEV_DOWN notifier
      	--> vlan_device_event()
      		--> dev_change_flags()
      		--> __dev_change_flags()
      		--> __dev_close()
      		--> __dev_close_many()
      		--> dev_deactivate_many()
      			--> synchronize_net()
      
      This is kind of rediculous because we already have infrastructure for
      batching doing operation X to a list of net devices so that we only
      incur one sync.
      
      So make use of that by exporting dev_close_many() and adjusting it's
      interfaace so that the caller can fully manage the batch list.  Use
      this in vlan_device_event() and all the overhead goes away.
      Reported-by: NSalam Noureddine <noureddine@arista.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      99c4a26a
  26. 30 7月, 2014 1 次提交
  27. 16 7月, 2014 1 次提交
    • T
      net: set name_assign_type in alloc_netdev() · c835a677
      Tom Gundersen 提交于
      Extend alloc_netdev{,_mq{,s}}() to take name_assign_type as argument, and convert
      all users to pass NET_NAME_UNKNOWN.
      
      Coccinelle patch:
      
      @@
      expression sizeof_priv, name, setup, txqs, rxqs, count;
      @@
      
      (
      -alloc_netdev_mqs(sizeof_priv, name, setup, txqs, rxqs)
      +alloc_netdev_mqs(sizeof_priv, name, NET_NAME_UNKNOWN, setup, txqs, rxqs)
      |
      -alloc_netdev_mq(sizeof_priv, name, setup, count)
      +alloc_netdev_mq(sizeof_priv, name, NET_NAME_UNKNOWN, setup, count)
      |
      -alloc_netdev(sizeof_priv, name, setup)
      +alloc_netdev(sizeof_priv, name, NET_NAME_UNKNOWN, setup)
      )
      
      v9: move comments here from the wrong commit
      Signed-off-by: NTom Gundersen <teg@jklm.no>
      Reviewed-by: NDavid Herrmann <dh.herrmann@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      c835a677
  28. 17 5月, 2014 1 次提交
  29. 28 3月, 2014 1 次提交
  30. 22 1月, 2014 1 次提交
  31. 27 9月, 2013 2 次提交
  32. 04 8月, 2013 1 次提交
    • W
      vlan: cleanup the usage of vlan_dev_priv(dev) · 0c0667a8
      Wang Sheng-Hui 提交于
      This patch cleanup 2 points for the usage of vlan_dev_priv(dev):
      * In vlan_dev.c/vlan_dev_hard_header, we should use the var *vlan directly
        after grabing the pointer at the beginning with
              *vlan = vlan_dev_priv(dev);
        when we need to access the fields of *vlan.
      * In vlan.c/register_vlan_device, add the var *vlan pointer
              struct vlan_dev_priv *vlan;
      to cleanup the code to access the fields of vlan_dev_priv(new_dev).
      Signed-off-by: NWang Sheng-Hui <shhuiw@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      0c0667a8
  33. 24 7月, 2013 1 次提交
  34. 29 5月, 2013 1 次提交
  35. 20 4月, 2013 3 次提交
  36. 25 3月, 2013 1 次提交
  37. 11 2月, 2013 1 次提交