1. 17 4月, 2016 1 次提交
  2. 16 4月, 2016 1 次提交
    • E
      tcp: do not mess with listener sk_wmem_alloc · b3d05147
      Eric Dumazet 提交于
      When removing sk_refcnt manipulation on synflood, I missed that
      using skb_set_owner_w() was racy, if sk->sk_wmem_alloc had already
      transitioned to 0.
      
      We should hold sk_refcnt instead, but this is a big deal under attack.
      (Doing so increase performance from 3.2 Mpps to 3.8 Mpps only)
      
      In this patch, I chose to not attach a socket to syncookies skb.
      
      Performance is now 5 Mpps instead of 3.2 Mpps.
      
      Following patch will remove last known false sharing in
      tcp_rcv_state_process()
      
      Fixes: 3b24d854 ("tcp/dccp: do not touch listener sk_refcnt under synflood")
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      b3d05147
  3. 15 4月, 2016 3 次提交
    • A
      GSO: Support partial segmentation offload · 802ab55a
      Alexander Duyck 提交于
      This patch adds support for something I am referring to as GSO partial.
      The basic idea is that we can support a broader range of devices for
      segmentation if we use fixed outer headers and have the hardware only
      really deal with segmenting the inner header.  The idea behind the naming
      is due to the fact that everything before csum_start will be fixed headers,
      and everything after will be the region that is handled by hardware.
      
      With the current implementation it allows us to add support for the
      following GSO types with an inner TSO_MANGLEID or TSO6 offload:
      NETIF_F_GSO_GRE
      NETIF_F_GSO_GRE_CSUM
      NETIF_F_GSO_IPIP
      NETIF_F_GSO_SIT
      NETIF_F_UDP_TUNNEL
      NETIF_F_UDP_TUNNEL_CSUM
      
      In the case of hardware that already supports tunneling we may be able to
      extend this further to support TSO_TCPV4 without TSO_MANGLEID if the
      hardware can support updating inner IPv4 headers.
      Signed-off-by: NAlexander Duyck <aduyck@mirantis.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      802ab55a
    • A
      GRO: Add support for TCP with fixed IPv4 ID field, limit tunnel IP ID values · 1530545e
      Alexander Duyck 提交于
      This patch does two things.
      
      First it allows TCP to aggregate TCP frames with a fixed IPv4 ID field.  As
      a result we should now be able to aggregate flows that were converted from
      IPv6 to IPv4.  In addition this allows us more flexibility for future
      implementations of segmentation as we may be able to use a fixed IP ID when
      segmenting the flow.
      
      The second thing this does is that it places limitations on the outer IPv4
      ID header in the case of tunneled frames.  Specifically it forces the IP ID
      to be incrementing by 1 unless the DF bit is set in the outer IPv4 header.
      This way we can avoid creating overlapping series of IP IDs that could
      possibly be fragmented if the frame goes through GRO and is then
      resegmented via GSO.
      Signed-off-by: NAlexander Duyck <aduyck@mirantis.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      1530545e
    • A
      GSO: Add GSO type for fixed IPv4 ID · cbc53e08
      Alexander Duyck 提交于
      This patch adds support for TSO using IPv4 headers with a fixed IP ID
      field.  This is meant to allow us to do a lossless GRO in the case of TCP
      flows that use a fixed IP ID such as those that convert IPv6 header to IPv4
      headers.
      
      In addition I am adding a feature that for now I am referring to TSO with
      IP ID mangling.  Basically when this flag is enabled the device has the
      option to either output the flow with incrementing IP IDs or with a fixed
      IP ID regardless of what the original IP ID ordering was.  This is useful
      in cases where the DF bit is set and we do not care if the original IP ID
      value is maintained.
      Signed-off-by: NAlexander Duyck <aduyck@mirantis.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      cbc53e08
  4. 14 4月, 2016 1 次提交
  5. 10 4月, 2016 1 次提交
  6. 08 4月, 2016 5 次提交
  7. 06 4月, 2016 3 次提交
  8. 05 4月, 2016 9 次提交
  9. 31 3月, 2016 1 次提交
  10. 29 3月, 2016 1 次提交
  11. 28 3月, 2016 4 次提交
  12. 24 3月, 2016 1 次提交
  13. 21 3月, 2016 2 次提交
    • J
      tunnels: Remove encapsulation offloads on decap. · a09a4c8d
      Jesse Gross 提交于
      If a packet is either locally encapsulated or processed through GRO
      it is marked with the offloads that it requires. However, when it is
      decapsulated these tunnel offload indications are not removed. This
      means that if we receive an encapsulated TCP packet, aggregate it with
      GRO, decapsulate, and retransmit the resulting frame on a NIC that does
      not support encapsulation, we won't be able to take advantage of hardware
      offloads even though it is just a simple TCP packet at this point.
      
      This fixes the problem by stripping off encapsulation offload indications
      when packets are decapsulated.
      
      The performance impacts of this bug are significant. In a test where a
      Geneve encapsulated TCP stream is sent to a hypervisor, GRO'ed, decapsulated,
      and bridged to a VM performance is improved by 60% (5Gbps->8Gbps) as a
      result of avoiding unnecessary segmentation at the VM tap interface.
      Reported-by: NRamu Ramamurthy <sramamur@linux.vnet.ibm.com>
      Fixes: 68c33163 ("v4 GRE: Add TCP segmentation offload for GRE")
      Signed-off-by: NJesse Gross <jesse@kernel.org>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      a09a4c8d
    • J
      tunnels: Don't apply GRO to multiple layers of encapsulation. · fac8e0f5
      Jesse Gross 提交于
      When drivers express support for TSO of encapsulated packets, they
      only mean that they can do it for one layer of encapsulation.
      Supporting additional levels would mean updating, at a minimum,
      more IP length fields and they are unaware of this.
      
      No encapsulation device expresses support for handling offloaded
      encapsulated packets, so we won't generate these types of frames
      in the transmit path. However, GRO doesn't have a check for
      multiple levels of encapsulation and will attempt to build them.
      
      UDP tunnel GRO actually does prevent this situation but it only
      handles multiple UDP tunnels stacked on top of each other. This
      generalizes that solution to prevent any kind of tunnel stacking
      that would cause problems.
      
      Fixes: bf5a755f ("net-gre-gro: Add GRE support to the GRO stack")
      Signed-off-by: NJesse Gross <jesse@kernel.org>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      fac8e0f5
  14. 16 3月, 2016 1 次提交
    • P
      tags: Fix DEFINE_PER_CPU expansions · 25528213
      Peter Zijlstra 提交于
      $ make tags
        GEN     tags
      ctags: Warning: drivers/acpi/processor_idle.c:64: null expansion of name pattern "\1"
      ctags: Warning: drivers/xen/events/events_2l.c:41: null expansion of name pattern "\1"
      ctags: Warning: kernel/locking/lockdep.c:151: null expansion of name pattern "\1"
      ctags: Warning: kernel/rcu/rcutorture.c:133: null expansion of name pattern "\1"
      ctags: Warning: kernel/rcu/rcutorture.c:135: null expansion of name pattern "\1"
      ctags: Warning: kernel/workqueue.c:323: null expansion of name pattern "\1"
      ctags: Warning: net/ipv4/syncookies.c:53: null expansion of name pattern "\1"
      ctags: Warning: net/ipv6/syncookies.c:44: null expansion of name pattern "\1"
      ctags: Warning: net/rds/page.c:45: null expansion of name pattern "\1"
      
      Which are all the result of the DEFINE_PER_CPU pattern:
      
        scripts/tags.sh:200:	'/\<DEFINE_PER_CPU([^,]*, *\([[:alnum:]_]*\)/\1/v/'
        scripts/tags.sh:201:	'/\<DEFINE_PER_CPU_SHARED_ALIGNED([^,]*, *\([[:alnum:]_]*\)/\1/v/'
      
      The below cures them. All except the workqueue one are within reasonable
      distance of the 80 char limit. TJ do you have any preference on how to
      fix the wq one, or shall we just not care its too long?
      Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Acked-by: NDavid S. Miller <davem@davemloft.net>
      Acked-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      25528213
  15. 15 3月, 2016 2 次提交
    • J
      netfilter: Allow calling into nat helper without skb_dst. · 26461905
      Jarno Rajahalme 提交于
      NAT checksum recalculation code assumes existence of skb_dst, which
      becomes a problem for a later patch in the series ("openvswitch:
      Interface with NAT.").  Simplify this by removing the check on
      skb_dst, as the checksum will be dealt with later in the stack.
      Suggested-by: NPravin Shelar <pshelar@nicira.com>
      Signed-off-by: NJarno Rajahalme <jarno@ovn.org>
      Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
      26461905
    • M
      tcp: Add RFC4898 tcpEStatsPerfDataSegsOut/In · a44d6eac
      Martin KaFai Lau 提交于
      Per RFC4898, they count segments sent/received
      containing a positive length data segment (that includes
      retransmission segments carrying data).  Unlike
      tcpi_segs_out/in, tcpi_data_segs_out/in excludes segments
      carrying no data (e.g. pure ack).
      
      The patch also updates the segs_in in tcp_fastopen_add_skb()
      so that segs_in >= data_segs_in property is kept.
      
      Together with retransmission data, tcpi_data_segs_out
      gives a better signal on the rxmit rate.
      
      v6: Rebase on the latest net-next
      
      v5: Eric pointed out that checking skb->len is still needed in
      tcp_fastopen_add_skb() because skb can carry a FIN without data.
      Hence, instead of open coding segs_in and data_segs_in, tcp_segs_in()
      helper is used.  Comment is added to the fastopen case to explain why
      segs_in has to be reset and tcp_segs_in() has to be called before
      __skb_pull().
      
      v4: Add comment to the changes in tcp_fastopen_add_skb()
      and also add remark on this case in the commit message.
      
      v3: Add const modifier to the skb parameter in tcp_segs_in()
      
      v2: Rework based on recent fix by Eric:
      commit a9d99ce2 ("tcp: fix tcpi_segs_in after connection establishment")
      Signed-off-by: NMartin KaFai Lau <kafai@fb.com>
      Cc: Chris Rapier <rapier@psc.edu>
      Cc: Eric Dumazet <edumazet@google.com>
      Cc: Marcelo Ricardo Leitner <mleitner@redhat.com>
      Cc: Neal Cardwell <ncardwell@google.com>
      Cc: Yuchung Cheng <ycheng@google.com>
      Acked-by: NEric Dumazet <edumazet@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      a44d6eac
  16. 14 3月, 2016 2 次提交
  17. 12 3月, 2016 1 次提交
  18. 09 3月, 2016 1 次提交