1. 21 3月, 2016 1 次提交
    • J
      tunnels: Remove encapsulation offloads on decap. · a09a4c8d
      Jesse Gross 提交于
      If a packet is either locally encapsulated or processed through GRO
      it is marked with the offloads that it requires. However, when it is
      decapsulated these tunnel offload indications are not removed. This
      means that if we receive an encapsulated TCP packet, aggregate it with
      GRO, decapsulate, and retransmit the resulting frame on a NIC that does
      not support encapsulation, we won't be able to take advantage of hardware
      offloads even though it is just a simple TCP packet at this point.
      
      This fixes the problem by stripping off encapsulation offload indications
      when packets are decapsulated.
      
      The performance impacts of this bug are significant. In a test where a
      Geneve encapsulated TCP stream is sent to a hypervisor, GRO'ed, decapsulated,
      and bridged to a VM performance is improved by 60% (5Gbps->8Gbps) as a
      result of avoiding unnecessary segmentation at the VM tap interface.
      Reported-by: NRamu Ramamurthy <sramamur@linux.vnet.ibm.com>
      Fixes: 68c33163 ("v4 GRE: Add TCP segmentation offload for GRE")
      Signed-off-by: NJesse Gross <jesse@kernel.org>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      a09a4c8d
  2. 19 3月, 2016 1 次提交
  3. 12 3月, 2016 1 次提交
  4. 09 3月, 2016 2 次提交
  5. 19 2月, 2016 1 次提交
  6. 17 2月, 2016 2 次提交
  7. 12 2月, 2016 1 次提交
  8. 10 2月, 2016 1 次提交
    • D
      vxlan, gre, geneve: Set a large MTU on ovs-created tunnel devices · 7e059158
      David Wragg 提交于
      Prior to 4.3, openvswitch tunnel vports (vxlan, gre and geneve) could
      transmit vxlan packets of any size, constrained only by the ability to
      send out the resulting packets.  4.3 introduced netdevs corresponding
      to tunnel vports.  These netdevs have an MTU, which limits the size of
      a packet that can be successfully encapsulated.  The default MTU
      values are low (1500 or less), which is awkwardly small in the context
      of physical networks supporting jumbo frames, and leads to a
      conspicuous change in behaviour for userspace.
      
      Instead, set the MTU on openvswitch-created netdevs to be the relevant
      maximum (i.e. the maximum IP packet size minus any relevant overhead),
      effectively restoring the behaviour prior to 4.3.
      Signed-off-by: NDavid Wragg <david@weave.works>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      7e059158
  9. 26 12月, 2015 1 次提交
  10. 17 11月, 2015 1 次提交
    • J
      ip_tunnel: disable preemption when updating per-cpu tstats · b4fe85f9
      Jason A. Donenfeld 提交于
      Drivers like vxlan use the recently introduced
      udp_tunnel_xmit_skb/udp_tunnel6_xmit_skb APIs. udp_tunnel6_xmit_skb
      makes use of ip6tunnel_xmit, and ip6tunnel_xmit, after sending the
      packet, updates the struct stats using the usual
      u64_stats_update_begin/end calls on this_cpu_ptr(dev->tstats).
      udp_tunnel_xmit_skb makes use of iptunnel_xmit, which doesn't touch
      tstats, so drivers like vxlan, immediately after, call
      iptunnel_xmit_stats, which does the same thing - calls
      u64_stats_update_begin/end on this_cpu_ptr(dev->tstats).
      
      While vxlan is probably fine (I don't know?), calling a similar function
      from, say, an unbound workqueue, on a fully preemptable kernel causes
      real issues:
      
      [  188.434537] BUG: using smp_processor_id() in preemptible [00000000] code: kworker/u8:0/6
      [  188.435579] caller is debug_smp_processor_id+0x17/0x20
      [  188.435583] CPU: 0 PID: 6 Comm: kworker/u8:0 Not tainted 4.2.6 #2
      [  188.435607] Call Trace:
      [  188.435611]  [<ffffffff8234e936>] dump_stack+0x4f/0x7b
      [  188.435615]  [<ffffffff81915f3d>] check_preemption_disabled+0x19d/0x1c0
      [  188.435619]  [<ffffffff81915f77>] debug_smp_processor_id+0x17/0x20
      
      The solution would be to protect the whole
      this_cpu_ptr(dev->tstats)/u64_stats_update_begin/end blocks with
      disabling preemption and then reenabling it.
      Signed-off-by: NJason A. Donenfeld <Jason@zx2c4.com>
      Acked-by: NHannes Frederic Sowa <hannes@stressinduktion.org>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      b4fe85f9
  11. 25 9月, 2015 1 次提交
    • J
      ipv4: send arp replies to the correct tunnel · 63d008a4
      Jiri Benc 提交于
      When using ip lwtunnels, the additional data for xmit (basically, the actual
      tunnel to use) are carried in ip_tunnel_info either in dst->lwtstate or in
      metadata dst. When replying to ARP requests, we need to send the reply to
      the same tunnel the request came from. This means we need to construct
      proper metadata dst for ARP replies.
      
      We could perform another route lookup to get a dst entry with the correct
      lwtstate. However, this won't always ensure that the outgoing tunnel is the
      same as the incoming one, and it won't work anyway for IPv4 duplicate
      address detection.
      
      The only thing to do is to "reverse" the ip_tunnel_info.
      Signed-off-by: NJiri Benc <jbenc@redhat.com>
      Acked-by: NThomas Graf <tgraf@suug.ch>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      63d008a4
  12. 01 9月, 2015 1 次提交
  13. 30 8月, 2015 2 次提交
  14. 21 8月, 2015 5 次提交
  15. 11 8月, 2015 1 次提交
  16. 23 7月, 2015 2 次提交
  17. 22 7月, 2015 4 次提交
    • T
      fib: Add fib rule match on tunnel id · e7030878
      Thomas Graf 提交于
      This add the ability to select a routing table based on the tunnel
      id which allows to maintain separate routing tables for each virtual
      tunnel network.
      
      ip rule add from all tunnel-id 100 lookup 100
      ip rule add from all tunnel-id 200 lookup 200
      
      A new static key controls the collection of metadata at tunnel level
      upon demand.
      Signed-off-by: NThomas Graf <tgraf@suug.ch>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      e7030878
    • T
      route: Per route IP tunnel metadata via lightweight tunnel · 3093fbe7
      Thomas Graf 提交于
      This introduces a new IP tunnel lightweight tunnel type which allows
      to specify IP tunnel instructions per route. Only IPv4 is supported
      at this point.
      Signed-off-by: NThomas Graf <tgraf@suug.ch>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      3093fbe7
    • T
      vxlan: Flow based tunneling · ee122c79
      Thomas Graf 提交于
      Allows putting a VXLAN device into a new flow-based mode in which
      skbs with a ip_tunnel_info dst metadata attached will be encapsulated
      according to the instructions stored in there with the VXLAN device
      defaults taken into consideration.
      
      Similar on the receive side, if the VXLAN_F_COLLECT_METADATA flag is
      set, the packet processing will populate a ip_tunnel_info struct for
      each packet received and attach it to the skb using the new metadata
      dst.  The metadata structure will contain the outer header and tunnel
      header fields which have been stripped off. Layers further up in the
      stack such as routing, tc or netfitler can later match on these fields
      and perform forwarding. It is the responsibility of upper layers to
      ensure that the flag is set if the metadata is needed. The flag limits
      the additional cost of metadata collecting based on demand.
      
      This prepares the VXLAN device to be steered by the routing and other
      subsystems which allows to support encapsulation for a large number
      of tunnel endpoints and tunnel ids through a single net_device which
      improves the scalability.
      
      It also allows for OVS to leverage this mode which in turn allows for
      the removal of the OVS specific VXLAN code.
      
      Because the skb is currently scrubed in vxlan_rcv(), the attachment of
      the new dst metadata is postponed until after scrubing which requires
      the temporary addition of a new member to vxlan_metadata. This member
      is removed again in a later commit after the indirect VXLAN receive API
      has been removed.
      Signed-off-by: NThomas Graf <tgraf@suug.ch>
      Signed-off-by: NPravin B Shelar <pshelar@nicira.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      ee122c79
    • T
      ip_tunnel: Make ovs_tunnel_info and ovs_key_ipv4_tunnel generic · 1d8fff90
      Thomas Graf 提交于
      Rename the tunnel metadata data structures currently internal to
      OVS and make them generic for use by all IP tunnels.
      
      Both structures are kernel internal and will stay that way. Their
      members are exposed to user space through individual Netlink
      attributes by OVS. It will therefore be possible to extend/modify
      these structures without affecting user ABI.
      Signed-off-by: NThomas Graf <tgraf@suug.ch>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      1d8fff90
  18. 03 4月, 2015 1 次提交
  19. 20 1月, 2015 1 次提交
  20. 15 1月, 2015 1 次提交
    • T
      openvswitch: Support VXLAN Group Policy extension · 1dd144cf
      Thomas Graf 提交于
      Introduces support for the group policy extension to the VXLAN virtual
      port. The extension is disabled by default and only enabled if the user
      has provided the respective configuration.
      
        ovs-vsctl add-port br0 vxlan0 -- \
           set Interface vxlan0 type=vxlan options:exts=gbp
      
      The configuration interface to enable the extension is based on a new
      attribute OVS_VXLAN_EXT_GBP nested inside OVS_TUNNEL_ATTR_EXTENSION
      which can carry additional extensions as needed in the future.
      
      The group policy metadata is stored as binary blob (struct ovs_vxlan_opts)
      internally just like Geneve options but transported as nested Netlink
      attributes to user space.
      
      Renames the existing TUNNEL_OPTIONS_PRESENT to TUNNEL_GENEVE_OPT with the
      binary value kept intact, a new flag TUNNEL_VXLAN_OPT is introduced.
      
      The attributes OVS_TUNNEL_KEY_ATTR_VXLAN_OPTS and existing
      OVS_TUNNEL_KEY_ATTR_GENEVE_OPTS are implemented mutually exclusive.
      Signed-off-by: NThomas Graf <tgraf@suug.ch>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      1dd144cf
  21. 13 11月, 2014 1 次提交
  22. 06 10月, 2014 2 次提交
  23. 20 9月, 2014 1 次提交
  24. 31 7月, 2014 1 次提交
  25. 16 4月, 2014 1 次提交
  26. 21 2月, 2014 1 次提交
  27. 18 1月, 2014 1 次提交
    • E
      ipv4: fix a dst leak in tunnels · 6c7e7610
      Eric Dumazet 提交于
      This patch :
      
      1) Remove a dst leak if DST_NOCACHE was set on dst
         Fix this by holding a reference only if dst really cached.
      
      2) Remove a lockdep warning in __tunnel_dst_set()
          This was reported by Cong Wang.
      
      3) Remove usage of a spinlock where xchg() is enough
      
      4) Remove some spurious inline keywords.
         Let compiler decide for us.
      
      Fixes: 7d442fab ("ipv4: Cache dst in tunnels")
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Cc: Cong Wang <cwang@twopensource.com>
      Cc: Tom Herbert <therbert@google.com>
      Cc: Maciej Żenczykowski <maze@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      6c7e7610
  28. 05 1月, 2014 1 次提交