1. 19 Jul, 2017 2 commits
    • xfrm: add xdst pcpu cache · ec30d78c
      Florian Westphal committed
      Retain the last used xfrm_dst in a pcpu cache.
      On the next request, reuse this dst if the policies are the same.
      
      The cache will not help with strict RR workloads as there is no hit.
      
      The packet-path part of the cache is reasonably small. The notifier part
      is needed so we do not add long hangs when a device is dismantled while
      some pcpu xdst still holds a reference. There are also calls to the flush
      operation when userspace deletes SAs, so that modules can be removed.
      
      We need to run the dst_release on the correct cpu to avoid races with
      packet path.  This is done by adding a work_struct for each cpu and then
      doing the actual test/release on each affected cpu via schedule_work_on().
      
      Test results using 4 network namespaces and null encryption:
      
      ns1           ns2          -> ns3           -> ns4
      netperf -> xfrm/null enc   -> xfrm/null dec -> netserver
      
      what                    TCP_STREAM      UDP_STREAM      UDP_RR
      Flow cache:             14644.61        294.35          327231.64
      No flow cache:          14349.81        242.64          202301.72
      Pcpu cache:             14629.70        292.21          205595.22
      
      UDP tests used 64byte packets, tests ran for one minute each,
      value is average over ten iterations.
      
      'Flow cache' is 'net-next', 'No flow cache' is net-next plus this
      series but without this patch.
      Signed-off-by: Florian Westphal <fw@strlen.de>
      Signed-off-by: David S. Miller <davem@davemloft.net>
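The caching idea described in this commit can be sketched in user-space C as follows. This is a minimal illustration, not the kernel code: every name (`sketch_xdst`, `sketch_last_dst`, `sketch_cache_lookup`, `sketch_cache_store`) is hypothetical, CPUs are modelled as plain array indices, and reference counting is a bare integer. The per-CPU work items that the commit uses to release entries on the correct CPU are left out.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_NR_CPUS  4   /* assumed CPU count, for illustration only */
#define SKETCH_MAX_POLS 2   /* assumed upper bound on policies per bundle */

/* Hypothetical stand-in for a bundle: remembers which policies built it. */
struct sketch_xdst {
    const void *pols[SKETCH_MAX_POLS];
    int num_pols;
    int refcnt;
};

/* One slot per CPU holding the last bundle used on that CPU. */
static struct sketch_xdst *sketch_last_dst[SKETCH_NR_CPUS];

/* Return the cached bundle if it was built from exactly the same policies. */
static struct sketch_xdst *sketch_cache_lookup(unsigned int cpu,
                                               const void *const *pols,
                                               int num_pols)
{
    assert(cpu < SKETCH_NR_CPUS);
    assert(num_pols >= 0 && num_pols <= SKETCH_MAX_POLS);

    struct sketch_xdst *xdst = sketch_last_dst[cpu];

    if (!xdst || xdst->num_pols != num_pols)
        return NULL;
    if (memcmp(xdst->pols, pols, sizeof(*pols) * (size_t)num_pols) != 0)
        return NULL;

    xdst->refcnt++;             /* the caller receives its own reference */
    return xdst;
}

/* Remember xdst as the last bundle used on this CPU. */
static void sketch_cache_store(unsigned int cpu, struct sketch_xdst *xdst)
{
    assert(cpu < SKETCH_NR_CPUS);

    struct sketch_xdst *old = sketch_last_dst[cpu];

    if (old == xdst)
        return;
    if (old)
        old->refcnt--;          /* drop the reference held by the cache */
    xdst->refcnt++;             /* the cache now holds a reference */
    sketch_last_dst[cpu] = xdst;
}
```

A request/response workload that alternates between flows with different policies overwrites the single slot on every packet, which is consistent with the commit's remark that strict RR workloads see no hit.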
    • xfrm: remove flow cache · 09c75704
      Florian Westphal committed
      After rcu conversions performance degradation in forward tests isn't that
      noticeable anymore.
      
      See next patch for some numbers.
      
      A followup patch could then also remove genid from the policies
      as we do not cache bundles anymore.
      Signed-off-by: Florian Westphal <fw@strlen.de>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  2. 05 Jul, 2017 3 commits
  3. 07 Jun, 2017 3 commits
  4. 04 May, 2017 1 commit
    • xfrm: fix stack access out of bounds with CONFIG_XFRM_SUB_POLICY · 9b3eb541
      Sabrina Dubroca committed
      When CONFIG_XFRM_SUB_POLICY=y, xfrm_dst stores a copy of the flowi for
      that dst. Unfortunately, the code that allocates and fills this copy
      doesn't care about what type of flowi (flowi, flowi4, flowi6) gets
      passed. In multiple code paths (from raw_sendmsg, from TCP when
      replying to a FIN, in vxlan, geneve, and gre), the flowi that gets
      passed to xfrm is actually an on-stack flowi4, so we end up reading
      stuff from the stack past the end of the flowi4 struct.
      
      Since xfrm_dst->origin isn't used anywhere following commit
      ca116922 ("xfrm: Eliminate "fl" and "pol" args to
      xfrm_bundle_ok()."), just get rid of it.  xfrm_dst->partner isn't used
      either, so get rid of that too.
      
      Fixes: 9d6ec938 ("ipv4: Use flowi4 in public route lookup interfaces.")
      Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
      Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
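The size mismatch behind this bug can be illustrated with a small C sketch. The struct layouts below are simplified, hypothetical stand-ins (the real flowi, flowi4 and flowi6 carry many more fields); only the relationship between the sizes matters here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified, hypothetical flow keys: IPv4 addresses are 4 bytes each,
 * IPv6 addresses are 16 bytes each. */
struct sketch_flowi4 { uint32_t saddr, daddr; };
struct sketch_flowi6 { uint8_t saddr[16], daddr[16]; };

/* The generic key is as large as its largest member. */
union sketch_flowi {
    struct sketch_flowi4 ip4;
    struct sketch_flowi6 ip6;
};

/* Copying sizeof(union sketch_flowi) bytes out of an object that is only
 * a struct sketch_flowi4 would read past its end. */
static_assert(sizeof(union sketch_flowi) > sizeof(struct sketch_flowi4),
              "the generic key is larger than the IPv4-only key");

/* Number of bytes such a copy would read beyond the IPv4-only object. */
static size_t sketch_overread_bytes(void)
{
    return sizeof(union sketch_flowi) - sizeof(struct sketch_flowi4);
}
```

The commit avoids the problem by removing the stored copy altogether, so no size-aware copy is needed.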
  5. 14 Apr, 2017 5 commits
  6. 27 Mar, 2017 1 commit
    • xfrm: branchless addr4_match() on 64-bit · 6c786bcb
      Alexey Dobriyan committed
      The current addr4_match() code has a special test for /0 prefixes because
      shifting a 32-bit value by 32 bits is undefined behaviour in C. However,
      the test can be omitted on 64-bit because the shift can be done within a
      64-bit register and then truncated to the expected value (which is a 0 mask).
      
      Implicit truncation by htonl() fits nicely into R32-within-R64 model
      on x86-64.
      
      Space savings: none (coincidence)
      Branch savings: 1
      
      Before:
      
      	movzx  eax,BYTE PTR [rdi+0x2a]		# ->prefixlen_d
      	test   al,al
      	jne    xfrm_selector_match + 0x23f
      		...
      	movzx  eax,BYTE PTR [rbx+0x2b]		# ->prefixlen_s
      	test   al,al
      	je     xfrm_selector_match + 0x1c7
      
      After (no branches):
      
      	mov    r8d,0x20
      	mov    rdx,0xffffffffffffffff
      	mov    esi,DWORD PTR [rsi+0x2c]
      	mov    ecx,r8d
      	sub    cl,BYTE PTR [rdi+0x2a]
      	xor    esi,DWORD PTR [rbx]
      	mov    rdi,rdx
      	xor    eax,eax
      	shl    rdi,cl
      	bswap  edi
      Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
      Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
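The shift-then-truncate trick from this commit can be sketched in portable user-space C. This is an illustration under assumptions, not the kernel function: `addr4_match_sketch` is a hypothetical name, addresses are taken as `uint32_t` values in network byte order, and an explicit `uint64_t` replaces the kernel's reliance on a 64-bit `long`.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Compare the first prefixlen bits of two IPv4 addresses (network byte
 * order) without a special case for prefixlen == 0. */
static bool addr4_match_sketch(uint32_t a1, uint32_t a2, uint8_t prefixlen)
{
    assert(prefixlen <= 32);

    /* The shift count is in 0..32, which is well defined for a 64-bit
     * operand. For prefixlen == 0 the low 32 bits become all zero. */
    uint64_t wide = ~(uint64_t)0 << (32 - prefixlen);

    /* Truncate to 32 bits, then convert the mask to network byte order. */
    uint32_t mask = htonl((uint32_t)wide);

    return ((a1 ^ a2) & mask) == 0;
}
```

With a prefix length of 0 the mask is 0, so any two addresses match; with 32 the mask is all ones and only identical addresses match.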
  7. 24 Mar, 2017 2 commits
  8. 15 Feb, 2017 4 commits
  9. 09 Feb, 2017 4 commits
  10. 17 Jan, 2017 1 commit
  11. 10 Jan, 2017 2 commits
  12. 21 Sep, 2016 1 commit
    • vti6: fix input path · 63c43787
      Nicolas Dichtel committed
      Since commit 1625f452, vti6 is broken, all input packets are dropped
      (LINUX_MIB_XFRMINNOSTATES is incremented).
      
      XFRM_TUNNEL_SKB_CB(skb)->tunnel.ip6 is set by vti6_rcv() before calling
      xfrm6_rcv()/xfrm6_rcv_spi(), so xfrm6_rcv_spi() must not reset that
      value to NULL.
      
      A new function, xfrm6_rcv_tnl(), is added so that a value can be passed
      to xfrm6_rcv_spi() without touching xfrm6_rcv() (that function is used
      in several handlers).
      
      CC: Alexey Kodanev <alexey.kodanev@oracle.com>
      Fixes: 1625f452 ("net/xfrm_input: fix possible NULL deref of tunnel.ip6->parms.i_key")
      Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
      Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
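The shape of this fix, a new entry point that carries the tunnel pointer while the existing entry point keeps its signature, can be sketched as follows. All types and names here (`sketch_tnl`, `sketch_skb`, `sketch_rcv_spi`, `sketch_rcv_tnl`, `sketch_rcv`) are hypothetical simplifications and do not reproduce the kernel signatures.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the tunnel and the packet control block. */
struct sketch_tnl { int id; };
struct sketch_skb { const struct sketch_tnl *tunnel_ip6; };

/* Common receive path: records whatever tunnel the caller supplies
 * instead of unconditionally clearing the field. */
static int sketch_rcv_spi(struct sketch_skb *skb, const struct sketch_tnl *t)
{
    assert(skb != NULL);
    skb->tunnel_ip6 = t;
    return 0;
}

/* New entry point for callers that already know their tunnel. */
static int sketch_rcv_tnl(struct sketch_skb *skb, const struct sketch_tnl *t)
{
    return sketch_rcv_spi(skb, t);
}

/* Existing entry point: same signature as before, no tunnel context. */
static int sketch_rcv(struct sketch_skb *skb)
{
    return sketch_rcv_tnl(skb, NULL);
}
```

Handlers that call the existing entry point need no changes, while a tunnel receive path can call the new one with its own tunnel pointer.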
  13. 10 Aug, 2016 1 commit
  14. 28 Apr, 2016 2 commits
  15. 12 Dec, 2015 2 commits
  16. 08 Oct, 2015 1 commit
  17. 18 Sep, 2015 1 commit
  18. 11 Aug, 2015 1 commit
  19. 28 May, 2015 2 commits
  20. 08 Apr, 2015 1 commit
    • netfilter: Pass socket pointer down through okfn(). · 7026b1dd
      David Miller committed
      On the output paths in particular, we sometimes have to deal with two
      socket contexts.  The first, usually skb->sk, is the local socket that
      generated the frame.
      
      The second is potentially the socket used to control a tunneling
      socket, such as one that encapsulates using UDP.
      
      We do not want to disassociate skb->sk when encapsulating in order
      to fix this, because that would break socket memory accounting.
      
      The most extreme case where this can cause huge problems is an
      AF_PACKET socket transmitting over a vxlan device.  We hit code
      paths doing checks that assume they are dealing with an ipv4
      socket, but are actually operating upon the AF_PACKET one.
      Signed-off-by: David S. Miller <davem@davemloft.net>