1. 23 1月, 2018 1 次提交
  2. 24 11月, 2017 1 次提交
    • W
      net: accept UFO datagrams from tuntap and packet · 0c19f846
      Willem de Bruijn 提交于
      Tuntap and similar devices can inject GSO packets. Accept type
      VIRTIO_NET_HDR_GSO_UDP, even though not generating UFO natively.
      
      Processes are expected to use feature negotiation such as TUNSETOFFLOAD
      to detect supported offload types and refrain from injecting other
      packets. This process breaks down with live migration: guest kernels
      do not renegotiate flags, so destination hosts need to expose all
      features that the source host does.
      
      Partially revert the UFO removal from 182e0b6b~1..d9d30adf.
      This patch introduces nearly(*) no new code to simplify verification.
      It brings back verbatim tuntap UFO negotiation, VIRTIO_NET_HDR_GSO_UDP
      insertion and software UFO segmentation.
      
      It does not reinstate protocol stack support, hardware offload
      (NETIF_F_UFO), SKB_GSO_UDP tunneling in SKB_GSO_SOFTWARE or reception
      of VIRTIO_NET_HDR_GSO_UDP packets in tuntap.
      
      To support SKB_GSO_UDP reappearing in the stack, also reinstate
      logic in act_csum and openvswitch. Achieve equivalence with v4.13 HEAD
      by squashing in commit 93991221 ("net: skb_needs_check() removes
      CHECKSUM_UNNECESSARY check for tx.") and reverting commit 8d63bee6
      ("net: avoid skb_warn_bad_offload false positives on UFO").
      
      (*) To avoid having to bring back skb_shinfo(skb)->ip6_frag_id,
      ipv6_proxy_select_ident is changed to return a __be32 and this is
      assigned directly to the frag_hdr. Also, SKB_GSO_UDP is inserted
      at the end of the enum to minimize code churn.
      
      Tested
        Booted a v4.13 guest kernel with QEMU. On a host kernel before this
        patch `ethtool -k eth0` shows UFO disabled. After the patch, it is
        enabled, same as on a v4.13 host kernel.
      
        A UFO packet sent from the guest appears on the tap device:
          host:
            nc -l -p -u 8000 &
            tcpdump -n -i tap0
      
          guest:
            dd if=/dev/zero of=payload.txt bs=1 count=2000
            nc -u 192.16.1.1 8000 < payload.txt
      
        Direct tap to tap transmission of VIRTIO_NET_HDR_GSO_UDP succeeds,
        packets arriving fragmented:
      
          ./with_tap_pair.sh ./tap_send_ufo tap0 tap1
          (from https://github.com/wdebruij/kerneltools/tree/master/tests)
      
      Changes
        v1 -> v2
          - simplified set_offload change (review comment)
          - documented test procedure
      
      Link: http://lkml.kernel.org/r/<CAF=yD-LuUeDuL9YWPJD9ykOZ0QCjNeznPDr6whqZ9NGMNF12Mw@mail.gmail.com>
      Fixes: fb652fdf ("macvlan/macvtap: Remove NETIF_F_UFO advertisement.")
      Reported-by: NMichal Kubecek <mkubecek@suse.cz>
      Signed-off-by: NWillem de Bruijn <willemb@google.com>
      Acked-by: NJason Wang <jasowang@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      0c19f846
  3. 09 10月, 2017 1 次提交
  4. 09 8月, 2017 1 次提交
    • W
      net: avoid skb_warn_bad_offload false positives on UFO · 8d63bee6
      Willem de Bruijn 提交于
      skb_warn_bad_offload triggers a warning when an skb enters the GSO
      stack at __skb_gso_segment that does not have CHECKSUM_PARTIAL
      checksum offload set.
      
      Commit b2504a5d ("net: reduce skb_warn_bad_offload() noise")
      observed that SKB_GSO_DODGY producers can trigger the check and
      that passing those packets through the GSO handlers will fix it
      up. But, the software UFO handler will set ip_summed to
      CHECKSUM_NONE.
      
      When __skb_gso_segment is called from the receive path, this
      triggers the warning again.
      
      Make UFO set CHECKSUM_UNNECESSARY instead of CHECKSUM_NONE. On
      Tx these two are equivalent. On Rx, this better matches the
      skb state (checksum computed), as CHECKSUM_NONE here means no
      checksum computed.
      
      See also this thread for context:
      http://patchwork.ozlabs.org/patch/799015/
      
      Fixes: b2504a5d ("net: reduce skb_warn_bad_offload() noise")
      Signed-off-by: NWillem de Bruijn <willemb@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      8d63bee6
  5. 18 7月, 2017 2 次提交
  6. 25 4月, 2017 1 次提交
    • A
      udp: disable inner UDP checksum offloads in IPsec case · b40c5f4f
      Ansis Atteka 提交于
      Otherwise, UDP checksum offloads could corrupt ESP packets by attempting
      to calculate UDP checksum when this inner UDP packet is already protected
      by IPsec.
      
      One way to reproduce this bug is to have a VM with virtio_net driver (UFO
      set to ON in the guest VM); and then encapsulate all guest's Ethernet
      frames in Geneve; and then further encrypt Geneve with IPsec.  In this
      case following symptoms are observed:
      1. If using ixgbe NIC, then it will complain with following error message:
         ixgbe 0000:01:00.1: partial checksum but l4 proto=32!
      2. Receiving IPsec stack will drop all the corrupted ESP packets and
         increase XfrmInStateProtoError counter in /proc/net/xfrm_stat.
      3. iperf UDP test from the VM with packet sizes above MTU will not work at
         all.
      4. iperf TCP test from the VM will get ridiculously low performance because.
      Signed-off-by: NAnsis Atteka <aatteka@ovn.org>
      Co-authored-by: NSteffen Klassert <steffen.klassert@secunet.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      b40c5f4f
  7. 21 10月, 2016 1 次提交
    • S
      net: add recursion limit to GRO · fcd91dd4
      Sabrina Dubroca 提交于
      Currently, GRO can do unlimited recursion through the gro_receive
      handlers.  This was fixed for tunneling protocols by limiting tunnel GRO
      to one level with encap_mark, but both VLAN and TEB still have this
      problem.  Thus, the kernel is vulnerable to a stack overflow, if we
      receive a packet composed entirely of VLAN headers.
      
      This patch adds a recursion counter to the GRO layer to prevent stack
      overflow.  When a gro_receive function hits the recursion limit, GRO is
      aborted for this skb and it is processed normally.  This recursion
      counter is put in the GRO CB, but could be turned into a percpu counter
      if we run out of space in the CB.
      
      Thanks to Vladimír Beneš <vbenes@redhat.com> for the initial bug report.
      
      Fixes: CVE-2016-7039
      Fixes: 9b174d88 ("net: Add Transparent Ethernet Bridging GRO support.")
      Fixes: 66e5133f ("vlan: Add GRO support for non hardware accelerated vlan")
      Signed-off-by: NSabrina Dubroca <sd@queasysnail.net>
      Reviewed-by: NJiri Benc <jbenc@redhat.com>
      Acked-by: NHannes Frederic Sowa <hannes@stressinduktion.org>
      Acked-by: NTom Herbert <tom@herbertland.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      fcd91dd4
  8. 20 9月, 2016 1 次提交
  9. 21 5月, 2016 1 次提交
    • T
      gso: Remove arbitrary checks for unsupported GSO · 5c7cdf33
      Tom Herbert 提交于
      In several gso_segment functions there are checks of gso_type against
      a seemingly arbitrary list of SKB_GSO_* flags. This seems like an
      attempt to identify unsupported GSO types, but since the stack is
      the one that set these GSO types in the first place this seems
      unnecessary to do. If a combination isn't valid in the first
      place that stack should not allow setting it.
      
      This is a code simplication especially for add new GSO types.
      Signed-off-by: NTom Herbert <tom@herbertland.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      5c7cdf33
  10. 07 5月, 2016 1 次提交
    • J
      udp_offload: Set encapsulation before inner completes. · 229740c6
      Jarno Rajahalme 提交于
      UDP tunnel segmentation code relies on the inner offsets being set for
      an UDP tunnel GSO packet, but the inner *_complete() functions will
      set the inner offsets only if 'encapsulation' is set before calling
      them.  Currently, udp_gro_complete() sets 'encapsulation' only after
      the inner *_complete() functions are done.  This causes the inner
      offsets having invalid values after udp_gro_complete() returns, which
      in turn will make it impossible to properly segment the packet in case
      it needs to be forwarded, which would be visible to the user either as
      invalid packets being sent or as packet loss.
      
      This patch fixes this by setting skb's 'encapsulation' in
      udp_gro_complete() before calling into the inner complete functions,
      and by making each possible UDP tunnel gro_complete() callback set the
      inner_mac_header to the beginning of the tunnel payload.
      Signed-off-by: NJarno Rajahalme <jarno@ovn.org>
      Reviewed-by: NAlexander Duyck <aduyck@mirantis.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      229740c6
  11. 15 4月, 2016 1 次提交
    • A
      GSO: Support partial segmentation offload · 802ab55a
      Alexander Duyck 提交于
      This patch adds support for something I am referring to as GSO partial.
      The basic idea is that we can support a broader range of devices for
      segmentation if we use fixed outer headers and have the hardware only
      really deal with segmenting the inner header.  The idea behind the naming
      is due to the fact that everything before csum_start will be fixed headers,
      and everything after will be the region that is handled by hardware.
      
      With the current implementation it allows us to add support for the
      following GSO types with an inner TSO_MANGLEID or TSO6 offload:
      NETIF_F_GSO_GRE
      NETIF_F_GSO_GRE_CSUM
      NETIF_F_GSO_IPIP
      NETIF_F_GSO_SIT
      NETIF_F_UDP_TUNNEL
      NETIF_F_UDP_TUNNEL_CSUM
      
      In the case of hardware that already supports tunneling we may be able to
      extend this further to support TSO_TCPV4 without TSO_MANGLEID if the
      hardware can support updating inner IPv4 headers.
      Signed-off-by: NAlexander Duyck <aduyck@mirantis.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      802ab55a
  12. 08 4月, 2016 2 次提交
    • T
      udp: Remove udp_offloads · 46aa2f30
      Tom Herbert 提交于
      Now that the UDP encapsulation GRO functions have been moved to the UDP
      socket we not longer need the udp_offload insfrastructure so removing it.
      Signed-off-by: NTom Herbert <tom@herbertland.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      46aa2f30
    • T
      udp: Add GRO functions to UDP socket · a6024562
      Tom Herbert 提交于
      This patch adds GRO functions (gro_receive and gro_complete) to UDP
      sockets. udp_gro_receive is changed to perform socket lookup on a
      packet. If a socket is found the related GRO functions are called.
      
      This features obsoletes using UDP offload infrastructure for GRO
      (udp_offload). This has the advantage of not being limited to provide
      offload on a per port basis, GRO is now applied to whatever individual
      UDP sockets are bound to.  This also allows the possbility of
      "application defined GRO"-- that is we can attach something like
      a BPF program to a UDP socket to perfrom GRO on an application
      layer protocol.
      Signed-off-by: NTom Herbert <tom@herbertland.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      a6024562
  13. 24 3月, 2016 1 次提交
  14. 21 3月, 2016 1 次提交
    • J
      tunnels: Don't apply GRO to multiple layers of encapsulation. · fac8e0f5
      Jesse Gross 提交于
      When drivers express support for TSO of encapsulated packets, they
      only mean that they can do it for one layer of encapsulation.
      Supporting additional levels would mean updating, at a minimum,
      more IP length fields and they are unaware of this.
      
      No encapsulation device expresses support for handling offloaded
      encapsulated packets, so we won't generate these types of frames
      in the transmit path. However, GRO doesn't have a check for
      multiple levels of encapsulation and will attempt to build them.
      
      UDP tunnel GRO actually does prevent this situation but it only
      handles multiple UDP tunnels stacked on top of each other. This
      generalizes that solution to prevent any kind of tunnel stacking
      that would cause problems.
      
      Fixes: bf5a755f ("net-gre-gro: Add GRE support to the GRO stack")
      Signed-off-by: NJesse Gross <jesse@kernel.org>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      fac8e0f5
  15. 14 3月, 2016 1 次提交
  16. 27 2月, 2016 1 次提交
    • A
      GSO: Provide software checksum of tunneled UDP fragmentation offload · 22463876
      Alexander Duyck 提交于
      On reviewing the code I realized that GRE and UDP tunnels could cause a
      kernel panic if we used GSO to segment a large UDP frame that was sent
      through the tunnel with an outer checksum and hardware offloads were not
      available.
      
      In order to correct this we need to update the feature flags that are
      passed to the skb_segment function so that in the event of UDP
      fragmentation being requested for the inner header the segmentation
      function will correctly generate the checksum for the payload if we cannot
      segment the outer header.
      Signed-off-by: NAlexander Duyck <aduyck@mirantis.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      22463876
  17. 11 2月, 2016 4 次提交
  18. 11 1月, 2016 1 次提交
    • H
      udp: restrict offloads to one namespace · 787d7ac3
      Hannes Frederic Sowa 提交于
      udp tunnel offloads tend to aggregate datagrams based on inner
      headers. gro engine gets notified by tunnel implementations about
      possible offloads. The match is solely based on the port number.
      
      Imagine a tunnel bound to port 53, the offloading will look into all
      DNS packets and tries to aggregate them based on the inner data found
      within. This could lead to data corruption and malformed DNS packets.
      
      While this patch minimizes the problem and helps an administrator to find
      the issue by querying ip tunnel/fou, a better way would be to match on
      the specific destination ip address so if a user space socket is bound
      to the same address it will conflict.
      
      Cc: Tom Herbert <tom@herbertland.com>
      Cc: Eric Dumazet <edumazet@google.com>
      Signed-off-by: NHannes Frederic Sowa <hannes@stressinduktion.org>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      787d7ac3
  19. 16 12月, 2015 1 次提交
  20. 04 4月, 2015 1 次提交
  21. 12 2月, 2015 1 次提交
  22. 15 1月, 2015 1 次提交
  23. 06 11月, 2014 3 次提交
  24. 21 10月, 2014 1 次提交
    • F
      net: gso: use feature flag argument in all protocol gso handlers · 1e16aa3d
      Florian Westphal 提交于
      skb_gso_segment() has a 'features' argument representing offload features
      available to the output path.
      
      A few handlers, e.g. GRE, instead re-fetch the features of skb->dev and use
      those instead of the provided ones when handing encapsulation/tunnels.
      
      Depending on dev->hw_enc_features of the output device skb_gso_segment() can
      then return NULL even when the caller has disabled all GSO feature bits,
      as segmentation of inner header thinks device will take care of segmentation.
      
      This e.g. affects the tbf scheduler, which will silently drop GRE-encap GSO skbs
      that did not fit the remaining token quota as the segmentation does not work
      when device supports corresponding hw offload capabilities.
      
      Cc: Pravin B Shelar <pshelar@nicira.com>
      Signed-off-by: NFlorian Westphal <fw@strlen.de>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      1e16aa3d
  25. 04 10月, 2014 1 次提交
    • T
      fou: eliminate IPv4,v6 specific GRO functions · efc98d08
      Tom Herbert 提交于
      This patch removes fou[46]_gro_receive and fou[46]_gro_complete
      functions. The v4 or v6 variants were chosen for the UDP offloads
      based on the address family of the socket this is not necessary
      or correct. Alternatively, this patch adds is_ipv6 to napi_gro_skb.
      This is set in udp6_gro_receive and unset in udp4_gro_receive. In
      fou_gro_receive the value is used to select the correct inet_offloads
      for the protocol of the outer IP header.
      Signed-off-by: NTom Herbert <therbert@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      efc98d08
  26. 02 10月, 2014 1 次提交
    • T
      udp: Generalize skb_udp_segment · 8bce6d7d
      Tom Herbert 提交于
      skb_udp_segment is the function called from udp4_ufo_fragment to
      segment a UDP tunnel packet. This function currently assumes
      segmentation is transparent Ethernet bridging (i.e. VXLAN
      encapsulation). This patch generalizes the function to
      operate on either Ethertype or IP protocol.
      
      The inner_protocol field must be set to the protocol of the inner
      header. This can now be either an Ethertype or an IP protocol
      (in a union). A new flag in the skbuff indicates which type is
      effective. skb_set_inner_protocol and skb_set_inner_ipproto
      helper functions were added to set the inner_protocol. These
      functions are called from the point where the tunnel encapsulation
      is occuring.
      
      When skb_udp_tunnel_segment is called, the function to segment the
      inner packet is selected based on the inner IP or Ethertype. In the
      case of an IP protocol encapsulation, the function is derived from
      inet[6]_offloads. In the case of Ethertype, skb->protocol is
      set to the inner_protocol and skb_mac_gso_segment is called. (GRE
      currently does this, but it might be possible to lookup the protocol
      in offload_base and call the appropriate segmenation function
      directly).
      Signed-off-by: NTom Herbert <therbert@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      8bce6d7d
  27. 26 9月, 2014 2 次提交
  28. 20 9月, 2014 1 次提交
  29. 13 9月, 2014 1 次提交
    • S
      udp: Fix inverted NAPI_GRO_CB(skb)->flush test · 2d8f7e2c
      Scott Wood 提交于
      Commit 2abb7cdc ("udp: Add support for doing checksum unnecessary
      conversion") caused napi_gro_cb structs with the "flush" field zero to
      take the "udp_gro_receive" path rather than the "set flush to 1" path
      that they would previously take.  As a result I saw booting from an NFS
      root hang shortly after starting userspace, with "server not
      responding" messages.
      
      This change to the handling of "flush == 0" packets appears to be
      incidental to the goal of adding new code in the case where
      skb_gro_checksum_validate_zero_check() returns zero.  Based on that and
      the fact that it breaks things, I'm assuming that it is unintentional.
      
      Fixes: 2abb7cdc ("udp: Add support for doing checksum unnecessary conversion")
      Cc: Tom Herbert <therbert@google.com>
      Signed-off-by: NScott Wood <scottwood@freescale.com>
      Acked-by: NEric Dumazet <edumazet@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      2d8f7e2c
  30. 10 9月, 2014 1 次提交
  31. 02 9月, 2014 1 次提交
    • T
      udp: Add support for doing checksum unnecessary conversion · 2abb7cdc
      Tom Herbert 提交于
      Add support for doing CHECKSUM_UNNECESSARY to CHECKSUM_COMPLETE
      conversion in UDP tunneling path.
      
      In the normal UDP path, we call skb_checksum_try_convert after locating
      the UDP socket. The check is that checksum conversion is enabled for
      the socket (new flag in UDP socket) and that checksum field is
      non-zero.
      
      In the UDP GRO path, we call skb_gro_checksum_try_convert after
      checksum is validated and checksum field is non-zero. Since this is
      already in GRO we assume that checksum conversion is always wanted.
      Signed-off-by: NTom Herbert <therbert@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      2abb7cdc
  32. 30 8月, 2014 1 次提交
    • T
      net: Allow GRO to use and set levels of checksum unnecessary · 662880f4
      Tom Herbert 提交于
      Allow GRO path to "consume" checksums provided in CHECKSUM_UNNECESSARY
      and to report new checksums verfied for use in fallback to normal
      path.
      
      Change GRO checksum path to track csum_level using a csum_cnt field
      in NAPI_GRO_CB. On GRO initialization, if ip_summed is
      CHECKSUM_UNNECESSARY set NAPI_GRO_CB(skb)->csum_cnt to
      skb->csum_level + 1. For each checksum verified, decrement
      NAPI_GRO_CB(skb)->csum_cnt while its greater than zero. If a checksum
      is verfied and NAPI_GRO_CB(skb)->csum_cnt == 0, we have verified a
      deeper checksum than originally indicated in skbuf so increment
      csum_level (or initialize to CHECKSUM_UNNECESSARY if ip_summed is
      CHECKSUM_NONE or CHECKSUM_COMPLETE).
      Signed-off-by: NTom Herbert <therbert@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      662880f4