1. 08 Apr, 2019 7 commits
    • xfrm: remove output2 indirection from xfrm_mode · 1de70830
      Florian Westphal committed
      Similar to the previous patch: no external module dependencies,
      so we can avoid the indirection by placing this in the core.
      
      This change removes the last indirection from xfrm_mode and the
      xfrm4|6_mode_{beet,tunnel}.c modules contain (almost) no code anymore.
      
      Before:
         text    data     bss     dec     hex filename
         3957     136       0    4093     ffd net/xfrm/xfrm_output.o
          587      44       0     631     277 net/ipv4/xfrm4_mode_beet.o
          649      32       0     681     2a9 net/ipv4/xfrm4_mode_tunnel.o
          625      44       0     669     29d net/ipv6/xfrm6_mode_beet.o
          599      32       0     631     277 net/ipv6/xfrm6_mode_tunnel.o
      After:
         text    data     bss     dec     hex filename
         5359     184       0    5543    15a7 net/xfrm/xfrm_output.o
          171      24       0     195      c3 net/ipv4/xfrm4_mode_beet.o
          171      24       0     195      c3 net/ipv4/xfrm4_mode_tunnel.o
          172      24       0     196      c4 net/ipv6/xfrm6_mode_beet.o
          172      24       0     196      c4 net/ipv6/xfrm6_mode_tunnel.o
      
      v2: fold the *encap_add functions into xfrm*_prepare_output
          preserve (move) output2 comment (Sabrina)
          use x->outer_mode->encap, not inner
          fix a build breakage on ppc (kbuild robot)
      Signed-off-by: Florian Westphal <fw@strlen.de>
      Reviewed-by: Sabrina Dubroca <sd@queasysnail.net>
      Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
    • xfrm: remove input2 indirection from xfrm_mode · b3284df1
      Florian Westphal committed
      No external dependencies on any module, place this in the core.
      The increase is about 1800 bytes for xfrm_input.o.
      
      The beet helpers get added to internal header, as they can be reused
      from xfrm_output.c in the next patch (kernel contains several
      copies of them in the xfrm{4,6}_mode_beet.c files).
      
      Before:
         text    data     bss     dec filename
         5578     176    2364    8118 net/xfrm/xfrm_input.o
         1180      64       0    1244 net/ipv4/xfrm4_mode_beet.o
          171      40       0     211 net/ipv4/xfrm4_mode_transport.o
         1163      40       0    1203 net/ipv4/xfrm4_mode_tunnel.o
         1083      52       0    1135 net/ipv6/xfrm6_mode_beet.o
          172      40       0     212 net/ipv6/xfrm6_mode_ro.o
          172      40       0     212 net/ipv6/xfrm6_mode_transport.o
         1056      40       0    1096 net/ipv6/xfrm6_mode_tunnel.o
      
      After:
         text    data     bss     dec filename
         7373     200    2364    9937 net/xfrm/xfrm_input.o
          587      44       0     631 net/ipv4/xfrm4_mode_beet.o
          171      32       0     203 net/ipv4/xfrm4_mode_transport.o
          649      32       0     681 net/ipv4/xfrm4_mode_tunnel.o
          625      44       0     669 net/ipv6/xfrm6_mode_beet.o
          172      32       0     204 net/ipv6/xfrm6_mode_ro.o
          172      32       0     204 net/ipv6/xfrm6_mode_transport.o
          599      32       0     631 net/ipv6/xfrm6_mode_tunnel.o
      
      v2: pass inner_mode to xfrm_inner_mode_encap_remove to fix
          AF_UNSPEC selector breakage (bisected by Benedict Wong)
      Signed-off-by: Florian Westphal <fw@strlen.de>
      Reviewed-by: Sabrina Dubroca <sd@queasysnail.net>
      Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
    • xfrm: remove gso_segment indirection from xfrm_mode · 7613b92b
      Florian Westphal committed
      These functions are small and we only have versions for tunnel
      and transport mode for ipv4 and ipv6 respectively.
      
      Just place the 'transport or tunnel' conditional in the protocol
      specific function instead of using an indirection.
      
      Before:
          3226       12       0     3238   net/ipv4/esp4_offload.o
          7004      492       0     7496   net/ipv4/ip_vti.o
          3339       12       0     3351   net/ipv6/esp6_offload.o
         11294      460       0    11754   net/ipv6/ip6_vti.o
          1180       72       0     1252   net/ipv4/xfrm4_mode_beet.o
           428       48       0      476   net/ipv4/xfrm4_mode_transport.o
          1271       48       0     1319   net/ipv4/xfrm4_mode_tunnel.o
          1083       60       0     1143   net/ipv6/xfrm6_mode_beet.o
           172       48       0      220   net/ipv6/xfrm6_mode_ro.o
           429       48       0      477   net/ipv6/xfrm6_mode_transport.o
          1164       48       0     1212   net/ipv6/xfrm6_mode_tunnel.o
      15730428  6937008 4046908 26714344   vmlinux
      
      After:
          3461       12       0     3473   net/ipv4/esp4_offload.o
          7000      492       0     7492   net/ipv4/ip_vti.o
          3574       12       0     3586   net/ipv6/esp6_offload.o
         11295      460       0    11755   net/ipv6/ip6_vti.o
          1180       64       0     1244   net/ipv4/xfrm4_mode_beet.o
           171       40       0      211   net/ipv4/xfrm4_mode_transport.o
          1163       40       0     1203   net/ipv4/xfrm4_mode_tunnel.o
          1083       52       0     1135   net/ipv6/xfrm6_mode_beet.o
           172       40       0      212   net/ipv6/xfrm6_mode_ro.o
           172       40       0      212   net/ipv6/xfrm6_mode_transport.o
          1056       40       0     1096   net/ipv6/xfrm6_mode_tunnel.o
      15730424  6937008 4046908 26714340   vmlinux
      Signed-off-by: Florian Westphal <fw@strlen.de>
      Reviewed-by: Sabrina Dubroca <sd@queasysnail.net>
      Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
    • xfrm: remove xmit indirection from xfrm_mode · 303c5fab
      Florian Westphal committed
      There are only two versions (tunnel and transport). The ip/ipv6 versions
      differ only in sizeof(iphdr) vs. sizeof(ipv6hdr).
      
      Place this in the core and use x->outer_mode->encap type to call the
      correct adjustment helper.
      
      Before:
         text   data    bss     dec      filename
      15730311  6937008 4046908 26714227 vmlinux
      
      After:
      15730428  6937008 4046908 26714344 vmlinux
      
      (an increase of about 117 bytes)
      
      v2: use family from x->outer_mode, not inner
      Signed-off-by: Florian Westphal <fw@strlen.de>
      Reviewed-by: Sabrina Dubroca <sd@queasysnail.net>
      Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
    • xfrm: remove output indirection from xfrm_mode · 0c620e97
      Florian Westphal committed
      Same as the input indirection.  Only exception: we need to export
      xfrm_outer_mode_output for pktgen.
      
      Increases the size of vmlinux by about 163 bytes:
      Before:
         text    data     bss     dec      filename
      15730208  6936948 4046908 26714064   vmlinux
      
      After:
      15730311  6937008 4046908 26714227   vmlinux
      
      xfrm_inner_extract_output has no more external callers, make it static.
      
      v2: add IS_ENABLED(IPV6) guard in xfrm6_prepare_output
          add two missing breaks in xfrm_outer_mode_output (Sabrina Dubroca)
          add WARN_ON_ONCE for 'call AF_INET6 related output function, but
          CONFIG_IPV6=n' case.
          make xfrm_inner_extract_output static
      Signed-off-by: Florian Westphal <fw@strlen.de>
      Reviewed-by: Sabrina Dubroca <sd@queasysnail.net>
      Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
    • xfrm: remove input indirection from xfrm_mode · c2d305e5
      Florian Westphal committed
      No need for any indirection or abstraction here, both functions
      are pretty much the same and quite small, they also have no external
      dependencies.
      
      xfrm_prepare_input can then be made static.
      
      With an allmodconfig build, the size increase of vmlinux is 25 bytes:
      
      Before:
         text   data     bss     dec      filename
      15730207  6936924 4046908 26714039  vmlinux
      
      After:
      15730208  6936948 4046908 26714064 vmlinux
      
      v2: Fix INET_XFRM_MODE_TRANSPORT name in is-enabled test (Sabrina Dubroca)
          change copied comment to refer to transport and network header,
          not skb->{h,nh}, which don't exist anymore. (Sabrina)
          make xfrm_prepare_input static (Eyal Birger)
      Signed-off-by: Florian Westphal <fw@strlen.de>
      Reviewed-by: Sabrina Dubroca <sd@queasysnail.net>
      Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
    • xfrm: place af number into xfrm_mode struct · b262a695
      Florian Westphal committed
      This will be useful to know if we're supposed to decode ipv4 or ipv6.
      
      While at it, make the unregister function return void, all module_exit
      functions did just BUG(); there is never a point in doing error checks
      if there is no way to handle such error.
      Signed-off-by: Florian Westphal <fw@strlen.de>
      Reviewed-by: Sabrina Dubroca <sd@queasysnail.net>
      Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
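
The commits in this group replace per-mode function pointers with a direct dispatch on the mode's encapsulation type. A minimal user-space sketch of that pattern follows. All names here (demo_encap, demo_mode_indirect, demo_output_direct, and so on) are hypothetical stand-ins, not the kernel's identifiers, and the header overheads are made-up numbers chosen only to make the two paths comparable.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the encapsulation modes. */
enum demo_encap { DEMO_MODE_TRANSPORT, DEMO_MODE_TUNNEL, DEMO_MODE_BEET };

/* Small per-mode helpers; the added header sizes are illustrative only. */
int demo_transport_output(int payload_len) { return payload_len; }
int demo_tunnel_output(int payload_len)    { return payload_len + 20; }
int demo_beet_output(int payload_len)      { return payload_len + 8; }

/* Before: each mode object carries a callback that callers must chase. */
struct demo_mode_indirect {
    enum demo_encap encap;
    int (*output)(int payload_len);
};

int demo_output_indirect(const struct demo_mode_indirect *mode, int len)
{
    assert(mode != NULL && mode->output != NULL);
    return mode->output(len);           /* indirect call */
}

/* After: the mode only records its type; the core dispatches directly. */
struct demo_mode_direct {
    enum demo_encap encap;
};

int demo_output_direct(const struct demo_mode_direct *mode, int len)
{
    assert(mode != NULL);
    switch (mode->encap) {
    case DEMO_MODE_TRANSPORT:
        return demo_transport_output(len);
    case DEMO_MODE_TUNNEL:
        return demo_tunnel_output(len);
    case DEMO_MODE_BEET:
        return demo_beet_output(len);
    }
    return -1;                          /* unknown mode */
}
```

Both paths produce the same result for a given mode. The direct version trades a slightly larger core object for the removal of the callback table, which matches the size shifts reported in the commit messages.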
  2. 05 Feb, 2019 1 commit
    • xfrm: destroy xfrm_state synchronously on net exit path · f75a2804
      Cong Wang committed
      xfrm_state_put() moves struct xfrm_state to the GC list
      and schedules the GC work to clean it up. On net exit call
      path, xfrm_state_flush() is called to clean up and
      xfrm_flush_gc() is called to wait for the GC work to complete
      before exit.
      
      However, this doesn't work because one of the ->destructor(),
      ipcomp_destroy(), schedules the same GC work again inside
      the GC work. It is hard to wait for such a nested async
      callback. This is also why syzbot still reports the following
      warning:
      
       WARNING: CPU: 1 PID: 33 at net/ipv6/xfrm6_tunnel.c:351 xfrm6_tunnel_net_exit+0x2cb/0x500 net/ipv6/xfrm6_tunnel.c:351
       ...
        ops_exit_list.isra.0+0xb0/0x160 net/core/net_namespace.c:153
        cleanup_net+0x51d/0xb10 net/core/net_namespace.c:551
        process_one_work+0xd0c/0x1ce0 kernel/workqueue.c:2153
        worker_thread+0x143/0x14a0 kernel/workqueue.c:2296
        kthread+0x357/0x430 kernel/kthread.c:246
        ret_from_fork+0x3a/0x50 arch/x86/entry/entry_64.S:352
      
      In fact, it is perfectly fine to bypass GC and destroy xfrm_state
      synchronously on net exit call path, because it is in process context
      and doesn't need a work struct to do any blocking work.
      
      This patch introduces xfrm_state_put_sync() which simply bypasses
      GC, and lets its callers decide whether to use this synchronous
      version. On net exit path, xfrm_state_fini() and
      xfrm6_tunnel_net_exit() use it. And, as ipcomp_destroy() itself is
      blocking, it can use xfrm_state_put_sync() directly too.
      
      Also rename xfrm_state_gc_destroy() to ___xfrm_state_destroy() to
      reflect this change.
      
      Fixes: b48c05ab ("xfrm: Fix warning in xfrm6_tunnel_net_exit.")
      Reported-and-tested-by: syzbot+e9aebef558e3ed673934@syzkaller.appspotmail.com
      Cc: Steffen Klassert <steffen.klassert@secunet.com>
      Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
      Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
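
The difference between the deferred and the synchronous put described in this commit can be sketched in user space as follows. This is only an illustration under simplifying assumptions: there is no locking and no workqueue, and every name (demo_state, demo_state_put, demo_state_put_sync, demo_gc_run) is hypothetical rather than taken from the kernel.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a reference-counted state object. */
struct demo_state {
    int refcnt;
    struct demo_state *gc_next;
};

struct demo_state *demo_gc_list;   /* states waiting for the deferred pass */
int demo_destroyed;                /* number of completed destructions */

struct demo_state *demo_state_alloc(void)
{
    struct demo_state *x = calloc(1, sizeof(*x));

    assert(x != NULL);
    x->refcnt = 1;
    return x;
}

void demo_state_destroy(struct demo_state *x)
{
    demo_destroyed++;
    free(x);
}

/* Deferred variant: the last put only queues the object for later cleanup. */
void demo_state_put(struct demo_state *x)
{
    assert(x->refcnt > 0);
    if (--x->refcnt == 0) {
        x->gc_next = demo_gc_list;
        demo_gc_list = x;
    }
}

/* Synchronous variant: the last put destroys the object immediately. */
void demo_state_put_sync(struct demo_state *x)
{
    assert(x->refcnt > 0);
    if (--x->refcnt == 0)
        demo_state_destroy(x);
}

/* Stands in for the GC work item: drains whatever has been queued so far. */
void demo_gc_run(void)
{
    struct demo_state *x = demo_gc_list;

    demo_gc_list = NULL;
    while (x != NULL) {
        struct demo_state *next = x->gc_next;

        demo_state_destroy(x);
        x = next;
    }
}
```

With the deferred variant, a caller that needs the object gone has to wait for a later GC pass. With the synchronous variant, the object is gone when the call returns, which is the property the net exit path relies on.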
  3. 20 Dec, 2018 5 commits
  4. 10 Dec, 2018 1 commit
  5. 23 Nov, 2018 1 commit
  6. 09 Nov, 2018 3 commits
  7. 20 Jul, 2018 1 commit
    • xfrm: Remove xfrmi interface ID from flowi · bc56b334
      Benedict Wong committed
      In order to remove the performance impact of having the extra u32 in every
      single flowi, this change removes the flowi_xfrm struct, preferring to
      take the if_id as a method parameter where needed.
      
      In the inbound direction, if_id is only needed during the
      __xfrm_check_policy() function, and the if_id can be determined at that
      point based on the skb. As such, xfrmi_decode_session() is only called
      with the skb in __xfrm_check_policy().
      
      In the outbound direction, the only place where if_id is needed is the
      xfrm_lookup() call in xfrmi_xmit2(). With this change, the if_id is
      directly passed into the xfrm_lookup_with_ifid() call. All existing
      callers can still call xfrm_lookup(), which uses a default if_id of 0.
      
      This change does not change any behavior of XFRMIs except for improving
      overall system performance via flowi size reduction.
      
      This change has been tested against the Android Kernel Networking Tests:
      
      https://android.googlesource.com/kernel/tests/+/master/net/test
      Signed-off-by: Benedict Wong <benedictwong@google.com>
      Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
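
The outbound part of this change, passing if_id as an explicit argument while existing callers keep a wrapper that defaults it to 0, might look roughly like the sketch below. The policy table, the flow struct, and the demo_ function names are assumptions made for illustration; they do not mirror the kernel's real data structures or signatures.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flow key: note that it carries no interface ID field. */
struct demo_flow {
    uint32_t daddr;
};

/* Hypothetical policy entry, keyed by destination and interface ID. */
struct demo_policy {
    uint32_t daddr;
    uint32_t if_id;
    int action;
};

const struct demo_policy demo_policies[] = {
    { 0x0a000001u, 0, 1 },   /* no interface bound */
    { 0x0a000001u, 7, 2 },   /* bound to interface ID 7 */
};

/* The interface ID arrives as a parameter instead of living in the flow. */
int demo_lookup_with_ifid(const struct demo_flow *fl, uint32_t if_id)
{
    size_t i;

    assert(fl != NULL);
    for (i = 0; i < sizeof(demo_policies) / sizeof(demo_policies[0]); i++) {
        if (demo_policies[i].daddr == fl->daddr &&
            demo_policies[i].if_id == if_id)
            return demo_policies[i].action;
    }
    return -1;               /* no matching policy */
}

/* Existing callers keep the old entry point, which uses if_id 0. */
int demo_lookup(const struct demo_flow *fl)
{
    return demo_lookup_with_ifid(fl, 0);
}
```

Only the caller that actually knows an interface ID uses the longer entry point, so the flow key stays small for everyone else.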
  8. 11 Jul, 2018 1 commit
  9. 25 Jun, 2018 1 commit
    • xfrm: policy: remove pcpu policy cache · e4db5b61
      Florian Westphal committed
      Kristian Evensen says:
        In a project I am involved in, we are running ipsec (Strongswan) on
        different mt7621-based routers. Each router is configured as an
        initiator and has around ~30 tunnels to different responders (running
        on misc. devices). Before the flow cache was removed (kernel 4.9), we
        got a combined throughput of around 70Mbit/s for all tunnels on one
        router. However, we recently switched to kernel 4.14 (4.14.48), and
        the total throughput is somewhere around 57Mbit/s (best-case). I.e., a
        drop of around 20%. Reverting the flow cache removal restores, as
        expected, performance levels to that of kernel 4.9.
      
      When pcpu xdst exists, it has to be validated first before it can be
      used.
      
      A negative hit thus increases cost vs. no-cache.
      
      As number of tunnels increases, hit rate decreases so this pcpu caching
      isn't a viable strategy.
      
      Furthermore, the xdst cache also needs to run with BH off, so when
      removing this the bh disable/enable pairs can be removed too.
      
      Kristian tested a 4.14.y backport of this change and reported
      increased performance:
      
        In our tests, the throughput reduction has been reduced from around -20%
        to -5%. We also see that the overall throughput is independent of the
        number of tunnels, while before the throughput was reduced as the number
        of tunnels increased.
      Reported-by: Kristian Evensen <kristian.evensen@gmail.com>
      Signed-off-by: Florian Westphal <fw@strlen.de>
      Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
  10. 23 Jun, 2018 3 commits
  11. 19 Jun, 2018 1 commit
  12. 15 May, 2018 2 commits
  13. 16 Apr, 2018 1 commit
  14. 30 Mar, 2018 1 commit
  15. 01 Mar, 2018 1 commit
  16. 18 Jan, 2018 1 commit
  17. 21 Dec, 2017 1 commit
  18. 20 Dec, 2017 3 commits
  19. 19 Dec, 2017 1 commit
  20. 30 Nov, 2017 4 commits
    • xfrm: Move dst->path into struct xfrm_dst · 0f6c480f
      David Miller committed
      The first member of an IPSEC route bundle chain sets its dst->path to
      the underlying ipv4/ipv6 route that carries the bundle.
      
      Stated another way, if one were to follow the xfrm_dst->child chain of
      the bundle, the final non-NULL pointer would be the path and point to
      either an ipv4 or an ipv6 route.
      
      This is largely used to make sure that PMTU events propagate down to
      the correct ipv4 or ipv6 route.
      
      When we don't have the top of an IPSEC bundle 'dst->path == dst'.
      
      Move it down into xfrm_dst and key off of dst->xfrm.
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Reviewed-by: Eric Dumazet <edumazet@google.com>
    • xfrm: Move child route linkage into xfrm_dst. · b6ca8bd5
      David Miller committed
      XFRM bundle child chains look like this:
      
      	xdst1 --> xdst2 --> xdst3 --> path_dst
      
      All of xdstN are xfrm_dst objects and xdst->u.dst.xfrm is non-NULL.
      The final child pointer in the chain, here called 'path_dst', is some
      other kind of route such as an ipv4 or ipv6 one.
      
      The xfrm output path pops routes, one at a time, via the child
      pointer, until we hit one which has a dst->xfrm pointer which
      is NULL.
      
      We can easily preserve the above mechanisms with child sitting
      only in the xfrm_dst structure.  All children in the chain
      before we break out of the xfrm_output() loop have dst->xfrm
      non-NULL and are therefore xfrm_dst objects.
      
      Since we break out of the loop when we find dst->xfrm NULL, we
      will not try to dereference 'dst' as if it were an xfrm_dst.
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ipsec: Create and use new helpers for dst child access. · 45b018be
      David Miller committed
      This will make a future change moving the dst->child pointer less
      invasive.
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Reviewed-by: Eric Dumazet <edumazet@google.com>
    • net: Create and use new helper xfrm_dst_child(). · b92cf4aa
      David Miller committed
      Only IPSEC routes have a non-NULL dst->child pointer.  And IPSEC
      routes are identified by a non-NULL dst->xfrm pointer.
      Signed-off-by: David S. Miller <davem@davemloft.net>
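
The bundle chain described in this group (xdst1 --> xdst2 --> xdst3 --> path_dst) can be walked with a helper that only follows the child pointer when the route is an xfrm route. The sketch below is a simplified user-space model: the demo_ structs and functions are hypothetical, and it assumes the generic route is embedded as the first member of the xfrm-specific one so the cast is valid.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical generic route: xfrm is non-NULL only for IPsec routes. */
struct demo_dst {
    const void *xfrm;
};

/* Hypothetical IPsec route: the child link lives here, not in demo_dst. */
struct demo_xfrm_dst {
    struct demo_dst dst;        /* must stay first so the cast below works */
    struct demo_dst *child;
};

/* Returns the child for IPsec routes and NULL for every other route. */
struct demo_dst *demo_xfrm_dst_child(const struct demo_dst *dst)
{
    assert(dst != NULL);
    if (dst->xfrm != NULL)
        return ((const struct demo_xfrm_dst *)dst)->child;
    return NULL;
}

/*
 * Pops routes one at a time until it reaches one whose xfrm pointer is
 * NULL, so a plain route is never dereferenced as a demo_xfrm_dst.
 */
struct demo_dst *demo_walk_to_path(struct demo_dst *dst, int *hops)
{
    assert(hops != NULL);
    *hops = 0;
    while (dst != NULL && dst->xfrm != NULL) {
        dst = demo_xfrm_dst_child(dst);
        (*hops)++;
    }
    return dst;
}
```

For a three-element bundle the walk takes three hops and ends at the underlying path route, at which point the loop exits without touching any xfrm-only field.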