1. 28 Mar, 2018 1 commit
  2. 07 Mar, 2018 1 commit
  3. 27 Feb, 2018 1 commit
  4. 20 Feb, 2018 1 commit
  5. 19 Feb, 2018 1 commit
  6. 13 Feb, 2018 2 commits
    • net: Convert pernet_subsys, registered from inet_init() · f84c6821
      Kirill Tkhai committed
      arp_net_ops just adds/removes /proc entry.
      
      devinet_ops allocates and frees duplicate of init_net tables
      and (un)registers sysctl entries.
      
      fib_net_ops allocates and frees pernet tables, creates/destroys
      netlink socket and (un)initializes /proc entries. Foreign
      pernet_operations do not touch them.
      
      ip_rt_proc_ops only modifies pernet /proc entries.
      
      xfrm_net_ops creates/destroys /proc entries, allocates/frees
      pernet statistics, hashes and tables, and (un)initializes
      sysctl files. These are not touched by foreign pernet_operations.
      
      xfrm4_net_ops allocates/frees private pernet memory, and
      configures sysctls.
      
      sysctl_route_ops creates/destroys sysctls.
      
      rt_genid_ops only initializes fields of just allocated net.
      
      ipv4_inetpeer_ops allocates/frees net private memory.
      
      igmp_net_ops just creates/destroys /proc files and a socket,
      which no one else is interested in.
      
      tcp_sk_ops seems to be safe, because tcp_sk_init() does not
      depend on any other pernet_operations modifications. Iteration
      over the hash table in inet_twsk_purge() is done under the RCU
      lock, and it's safe to iterate the table this way. Removal from
      the table happens in inet_twsk_deschedule_put(), but this
      function is safe without any external locks, as it synchronizes
      internally. It is already used this way in many different
      contexts. So, it's safe to leave tcp_sk_exit_batch() unlocked.
      
      tcp_net_metrics_ops is synchronized on tcp_metrics_lock and safe.
      
      udplite4_net_ops only creates/destroys pernet /proc file.
      
      icmp_sk_ops creates percpu sockets, not touched by foreign
      pernet_operations.
      
      ipmr_net_ops creates/destroys pernet fib tables, (un)registers
      fib rules and /proc files. This seems to be safe to execute
      in parallel with foreign pernet_operations.
      
      af_inet_ops just sets up default parameters of newly created net.
      
      ipv4_mib_ops creates and destroys pernet percpu statistics.
      
      raw_net_ops, tcp4_net_ops, udp4_net_ops, ping_v4_net_ops
      and ip_proc_ops only create/destroy pernet /proc files.
      
      ip4_frags_ops creates and destroys sysctl file.
      
      So, it's safe to make the pernet_operations async.
      
      Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
      Acked-by: Andrei Vagin <avagin@virtuozzo.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • xfrm: Fix policy hold queue after flowcache removal. · 2471c981
      Steffen Klassert committed
      Now that the flowcache is removed we need to generate
      a new dummy bundle every time we check if the needed
      SAs are in place because the dummy bundle is not cached
      anymore. Fix it by passing the XFRM_LOOKUP_QUEUE flag
      to xfrm_lookup(). This makes sure that we get a dummy
      bundle in case the SAs are not yet in place.
      
      Fixes: 3ca28286 ("xfrm_policy: bypass flow_cache_lookup")
      Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
  7. 10 Jan, 2018 1 commit
  8. 08 Jan, 2018 1 commit
  9. 30 Dec, 2017 1 commit
    • xfrm: skip policies marked as dead while rehashing · 862591bf
      Florian Westphal committed
      syzkaller triggered the following KASAN splat:
      
      BUG: KASAN: slab-out-of-bounds in xfrm_hash_rebuild+0xdbe/0xf00 net/xfrm/xfrm_policy.c:618
      read of size 2 at addr ffff8801c8e92fe4 by task kworker/1:1/23 [..]
      Workqueue: events xfrm_hash_rebuild [..]
       __asan_report_load2_noabort+0x14/0x20 mm/kasan/report.c:428
       xfrm_hash_rebuild+0xdbe/0xf00 net/xfrm/xfrm_policy.c:618
       process_one_work+0xbbf/0x1b10 kernel/workqueue.c:2112
       worker_thread+0x223/0x1990 kernel/workqueue.c:2246 [..]
      
      The reproducer triggers:
              if (error) {
                      list_move_tail(&walk->walk.all, &x->all);
                      goto out;
              }
      
      in xfrm_policy_walk() via pfkey (it sets tiny rcv space, dump
      callback returns -ENOBUFS).
      
      In this case, *walk is located in the pfkey socket struct, so this socket
      becomes visible in the global policy list.
      
      It looks like this is intentional -- phony walker has walk.dead set to 1
      and all other places skip such "policies".
      
      Ccing original authors of the two commits that seem to expose this
      issue (first patch missed ->dead check, second patch adds pfkey
      sockets to policies dumper list).
      
      Fixes: 880a6fab ("xfrm: configure policy hash table thresholds by netlink")
      Fixes: 12a169e7 ("ipsec: Put dumpers on the dump list")
      Cc: Herbert Xu <herbert@gondor.apana.org.au>
      Cc: Timo Teras <timo.teras@iki.fi>
      Cc: Christophe Gouault <christophe.gouault@6wind.com>
      Reported-by: syzbot <bot+c028095236fcb6f4348811565b75084c754dc729@syzkaller.appspotmail.com>
      Signed-off-by: Florian Westphal <fw@strlen.de>
      Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
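The rule described in the commit above (list walkers are linked into the same list as real policies and must be skipped) can be illustrated with a small userspace model. This is only a sketch: `struct entry`, its fields, and `count_short_prefixes()` are hypothetical names chosen for illustration and are not the kernel's data structures.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for an entry on the global policy list.
 * Dump walkers sit on the same list with dead set to 1. */
struct entry {
    int dead;            /* 1 for a walker placeholder, 0 for a real policy */
    int prefixlen;       /* only meaningful for real policies */
    struct entry *next;
};

/* Count the real policies whose prefix length is below a threshold,
 * skipping placeholders the way a rehash pass has to. */
static int count_short_prefixes(const struct entry *head, int threshold)
{
    int n = 0;

    assert(threshold >= 0);
    for (const struct entry *e = head; e != NULL; e = e->next) {
        if (e->dead)
            continue;    /* not a policy: its other fields must not be read */
        if (e->prefixlen < threshold)
            n++;
    }
    return n;
}
```

Without the `dead` check, the loop would interpret a walker's memory as policy fields, which is the kind of out-of-bounds read the KASAN report shows.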
  10. 12 Dec, 2017 1 commit
  11. 01 Dec, 2017 1 commit
  12. 30 Nov, 2017 5 commits
  13. 15 Nov, 2017 1 commit
  14. 14 Nov, 2017 1 commit
  15. 03 Nov, 2017 2 commits
    • xfrm: Fix stack-out-of-bounds read in xfrm_state_find. · c9f3f813
      Steffen Klassert committed
      When we do tunnel or beet mode, we pass saddr and daddr from the
      template to xfrm_state_find(), this is ok. On transport mode,
      we pass the addresses from the flowi, assuming that the IP
      addresses (and address family) don't change during transformation.
      This assumption is wrong in the IPv4-mapped IPv6 case, where the
      packet is IPv4 and the template is IPv6. Fix this by using the
      addresses from the template unconditionally.
      
      Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
    • xfrm: do unconditional template resolution before pcpu cache check · cf379667
      Florian Westphal committed
      Stephen Smalley says:
       Since 4.14-rc1, the selinux-testsuite has been encountering sporadic
       failures during testing of labeled IPSEC. git bisect pointed to
       commit ec30d ("xfrm: add xdst pcpu cache").
       The xdst pcpu cache is only checking that the policies are the same,
       but does not validate that the policy, state, and flow match with respect
       to security context labeling.
       As a result, the wrong SA could be used and the receiver could end up
       performing permission checking and providing SO_PEERSEC or SCM_SECURITY
       values for the wrong security context.
      
      This fix makes it so that we always do the template resolution, and
      then check that the found states match those in the pcpu bundle.
      
      This has the disadvantage of doing a bit more work (lookup in state hash
      table) if we can reuse the xdst entry (we only avoid xdst alloc/free)
      but we don't add a lot of extra work in case we can't reuse.
      
      xfrm_pol_dead() check is removed, reasoning is that
      xfrm_tmpl_resolve does all needed checks.
      
      Cc: Paul Moore <paul@paul-moore.com>
      Fixes: ec30d78c ("xfrm: add xdst pcpu cache")
      Reported-by: Stephen Smalley <sds@tycho.nsa.gov>
      Tested-by: Stephen Smalley <sds@tycho.nsa.gov>
      Signed-off-by: Florian Westphal <fw@strlen.de>
      Acked-by: Paul Moore <paul@paul-moore.com>
      Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
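The check described in the commit above (resolve the templates first, then reuse the cached bundle only if it was built from the same states) can be sketched in userspace as follows. The types, the `MAX_DEPTH` bound, and `bundle_matches()` are assumptions made for illustration; they do not reproduce the kernel's structures or function names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_DEPTH 6   /* illustrative bound, not taken from the kernel */

struct state { int id; };           /* hypothetical stand-in for an SA */

struct cached_bundle {
    const struct state *states[MAX_DEPTH];
    size_t num_states;
};

/* Return true only when the cached bundle was built from exactly the
 * states that the fresh template resolution just returned. */
static bool bundle_matches(const struct cached_bundle *b,
                           const struct state *const *resolved, size_t n)
{
    assert(n <= MAX_DEPTH);
    if (b->num_states != n)
        return false;
    for (size_t i = 0; i < n; i++) {
        if (b->states[i] != resolved[i])
            return false;
    }
    return true;
}
```

Comparing the states themselves, rather than only the policies, is what keeps a bundle built for one security context from being reused for another.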
  16. 24 Oct, 2017 1 commit
  17. 18 Oct, 2017 1 commit
  18. 11 Oct, 2017 1 commit
  19. 24 Aug, 2017 1 commit
  20. 11 Aug, 2017 1 commit
    • net: xfrm: support setting an output mark. · 077fbac4
      Lorenzo Colitti committed
      On systems that use mark-based routing it may be necessary for
      routing lookups to use marks in order for packets to be routed
      correctly. An example of such a system is Android, which uses
      socket marks to route packets via different networks.
      
      Currently, routing lookups in tunnel mode always use a mark of
      zero, making routing incorrect on such systems.
      
      This patch adds a new output_mark element to the xfrm state and
      a corresponding XFRMA_OUTPUT_MARK netlink attribute. The output
      mark differs from the existing xfrm mark in two ways:
      
      1. The xfrm mark is used to match xfrm policies and states, while
         the xfrm output mark is used to set the mark (and influence
         the routing) of the packets emitted by those states.
      2. The existing mark is constrained to be a subset of the bits of
         the originating socket or transformed packet, but the output
         mark is arbitrary and depends only on the state.
      
      The use of a separate mark provides additional flexibility. For
      example:
      
      - A packet subject to two transforms (e.g., transport mode inside
        tunnel mode) can have two different output marks applied to it,
        one for the transport mode SA and one for the tunnel mode SA.
      - On a system where socket marks determine routing, the packets
        emitted by an IPsec tunnel can be routed based on a mark that
        is determined by the tunnel, not by the marks of the
        unencrypted packets.
      - Support for setting the output marks can be introduced without
        breaking any existing setups that employ both mark-based
        routing and xfrm tunnel mode. Simply changing the code to use
        the xfrm mark for routing output packets could change
        behaviour in a way that breaks these setups.
      
      If the output mark is unspecified or set to zero, the mark is not
      set or changed.
      
      Tested: make allyesconfig; make -j64
      Tested: https://android-review.googlesource.com/452776
      
      Signed-off-by: Lorenzo Colitti <lorenzo@google.com>
      Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
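The rule stated at the end of the commit message above (a zero or unspecified output mark leaves the packet's mark alone) can be modelled in a few lines. This is a minimal sketch: `struct packet`, `struct sa`, and `apply_output_mark()` are hypothetical reductions for illustration, not the kernel's types or the patch's code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, reduced views of a packet and an xfrm state. */
struct packet { uint32_t mark; };
struct sa     { uint32_t output_mark; };

/* Apply the state's output mark to an emitted packet.
 * Zero means "unspecified": the packet's mark is left untouched. */
static void apply_output_mark(struct packet *pkt, const struct sa *x)
{
    assert(pkt != NULL && x != NULL);
    if (x->output_mark != 0)
        pkt->mark = x->output_mark;
}
```

Because the mark comes only from the state, two nested transforms can each stamp their own value as the packet passes through them.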
  21. 08 Aug, 2017 1 commit
  22. 03 Aug, 2017 1 commit
  23. 19 Jul, 2017 7 commits
  24. 05 Jul, 2017 1 commit
  25. 18 Jun, 2017 3 commits
    • net: remove DST_NOCACHE flag · a4c2fd7f
      Wei Wang committed
      DST_NOCACHE flag check has been removed from dst_release() and
      dst_hold_safe() in a previous patch because all the dst are now ref
      counted properly and can be released based on refcnt only.
      Looking at the rest of the DST_NOCACHE use, all of them can now be
      removed or replaced with other checks.
      So this patch gets rid of all the DST_NOCACHE usage and removes this
      flag completely.
      
      Signed-off-by: Wei Wang <weiwan@google.com>
      Acked-by: Martin KaFai Lau <kafai@fb.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: remove DST_NOGC flag · b2a9c0ed
      Wei Wang committed
      Now that all the components have been changed to release dst based on
      refcnt only and not depend on dst gc anymore, we can remove the
      temporary flag DST_NOGC.
      
      Note that we also need to remove the DST_NOCACHE check in dst_release()
      and dst_hold_safe() because now all the dst are released based on refcnt
      and behave as DST_NOCACHE.
      
      Signed-off-by: Wei Wang <weiwan@google.com>
      Acked-by: Martin KaFai Lau <kafai@fb.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • xfrm: take refcnt of dst when creating struct xfrm_dst bundle · 52df157f
      Wei Wang committed
      During the creation of xfrm_dst bundle, always take ref count when
      allocating the dst. This way, xfrm_bundle_create() will form a linked
      list of dst with dst->child pointing to a ref counted dst child. And
      the returned dst pointer is also ref counted. This makes the link from
      the flow cache to this dst now ref counted properly.
      As the dst is always ref counted properly, we can safely mark
      DST_NOGC flag so dst_release() will release dst based on refcnt only.
      And dst gc is no longer needed and all dst_free() and its related
      function calls should be replaced with dst_release() or
      dst_release_immediate().
      
      The special handling logic for dst->child in dst_destroy() can be
      replaced with a simple dst_release_immediate() call on the child to
      release the whole list linked by dst->child pointer.
      The previously used DST_NOHASH flag is not needed anymore either. The
      reason that DST_NOHASH is used in the existing code is mainly to prevent
      the dst inserted in the fib tree to be wrongly destroyed during the
      deletion of the xfrm_dst bundle. So in the existing code, DST_NOHASH
      flag is marked in all the dst children except the one which is in the
      fib tree.
      However, with this patch series removing the dst gc logic and releasing
      dst only based on ref count, it is safe to release all the children of
      an xfrm_dst bundle as long as the dst children are all ref counted
      properly, which is already the case in the existing code.
      So, this patch removes the use of DST_NOHASH flag.
      
      Signed-off-by: Wei Wang <weiwan@google.com>
      Acked-by: Martin KaFai Lau <kafai@fb.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
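The release scheme described in the commit above (each dst holds a reference on its child, so dropping the last reference on the parent releases the chain) can be modelled in userspace. This is a sketch under stated assumptions: `struct dst`, `dst_new()`, `dst_put()`, and the `live_dsts` counter are hypothetical names for illustration and are not the kernel's `dst_release()` implementation.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical userspace model of a dst entry; not the kernel's struct. */
struct dst {
    int refcnt;
    struct dst *child;   /* holds one reference on the child */
};

static int live_dsts;    /* bookkeeping so the example can be checked */

static struct dst *dst_new(struct dst *child)
{
    struct dst *d = malloc(sizeof(*d));

    assert(d != NULL);
    d->refcnt = 1;       /* the caller owns the initial reference */
    d->child = child;    /* takes over the caller's reference on child */
    live_dsts++;
    return d;
}

/* Drop one reference; at zero, free the entry and then drop the
 * reference it held on its child, walking down the chain. */
static void dst_put(struct dst *d)
{
    while (d != NULL && --d->refcnt == 0) {
        struct dst *child = d->child;

        free(d);
        live_dsts--;
        d = child;
    }
}
```

A child that is still referenced elsewhere (for example from a routing table) survives the bundle's release, which is why a separate flag to protect it is no longer needed in this model.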
  26. 12 Jun, 2017 1 commit
    • xfrm: move xfrm_garbage_collect out of xfrm_policy_flush · 138437f5
      Hangbin Liu committed
      Currently we force garbage collection whenever a policy is removed in
      xfrm_policy_flush(). But during xfrm_net_exit() we call flow_cache_fini()
      first, which sets fc->percpu to NULL. When we then call xfrm_policy_fini()
      -> xfrm_policy_flush() -> flow_cache_flush(), we get a NULL pointer
      dereference when checking percpu_empty. The code path looks like:
      
      flow_cache_fini()
        - fc->percpu = NULL
      xfrm_policy_fini()
        - xfrm_policy_flush()
          - xfrm_garbage_collect()
            - flow_cache_flush()
              - flow_cache_percpu_empty()
                - fcp = per_cpu_ptr(fc->percpu, cpu)
      
      To reproduce, just add ipsec in netns and then remove the netns.
      
      v2:
      As Xin Long suggested, since only two other places need to call it, move
      xfrm_garbage_collect() outside xfrm_policy_flush().
      
      v3:
      Fix subject mismatch after v2 fix.
      
      Fixes: 35db0691 ("xfrm: do the garbage collection after flushing policy")
      Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
      Reviewed-by: Xin Long <lucien.xin@gmail.com>
      Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
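The ordering problem and the fix described in the commit above can be modelled with a toy namespace object. This is only a sketch: `struct netns`, `policy_flush()`, `user_flush()`, and `netns_exit()` are hypothetical names that mirror the roles in the commit message, not the kernel functions themselves.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model: the flow cache's per-CPU area and a policy count. */
struct flow_cache { int *percpu; };
struct netns {
    struct flow_cache fc;
    int npolicies;
    int gc_runs;
};

/* Garbage collection touches the per-CPU area, so it must not run
 * after that area has been torn down. */
static void garbage_collect(struct netns *ns)
{
    assert(ns->fc.percpu != NULL);
    ns->gc_runs++;
}

/* After the fix: flushing only removes policies and reports how many. */
static int policy_flush(struct netns *ns)
{
    int removed = ns->npolicies;

    ns->npolicies = 0;
    return removed;
}

/* A user-triggered flush runs garbage collection itself. */
static void user_flush(struct netns *ns)
{
    if (policy_flush(ns) > 0)
        garbage_collect(ns);
}

/* Namespace teardown: the cache goes away first, then policies are
 * flushed without any garbage collection. */
static void netns_exit(struct netns *ns)
{
    ns->fc.percpu = NULL;
    policy_flush(ns);
}
```

Had `policy_flush()` still called `garbage_collect()`, the teardown path in this model would trip the assertion, which corresponds to the NULL pointer dereference in the report.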