Commits · 42e8e6d906dabb58a0e0ea53443b0e0a0821f1d5 · openeuler / Kernel

29 Aug, 2022 1 commit

xfrm: interface: support collect metadata mode · abc340b3

Authored by Eyal Birger on Aug 26, 2022

This commit adds support for 'collect_md' mode on xfrm interfaces.

Each net can have one collect_md device, created by providing the
IFLA_XFRM_COLLECT_METADATA flag at creation. This device cannot be
altered and has no if_id or link device attributes.

On transmit to this device, the if_id is fetched from the attached dst
metadata on the skb. If it exists, the link property is also fetched from
the metadata. The dst metadata type used is METADATA_XFRM which holds
these properties.

On the receive side, xfrmi_rcv_cb() populates a dst metadata for each
packet received and attaches it to the skb. The if_id used in this case is
fetched from the xfrm state, and the link is fetched from the incoming
device. This information can later be used by upper layers such as tc,
ebpf, and ip rules.

Because the skb is scrubbed in xfrmi_rcv_cb(), the attachment of the dst
metadata is postponed until after scrubbing. Similarly, xfrm_input() is
adapted to avoid dropping metadata dsts by only dropping 'valid'
(skb_valid_dst(skb) == true) dsts.

Policy matching on packets arriving from collect_md xfrmi devices is
done by using the xfrm state existing in the skb's sec_path.
The xfrm_if_cb.decode_cb() interface implemented by xfrmi_decode_session()
is changed to keep the details of the if_id extraction tucked away
in xfrm_interface.c.
Reviewed-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org>
Signed-off-by: Eyal Birger <eyal.birger@gmail.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>

23 Aug, 2022 1 commit

xfrm: Drop unused argument · 0de19788

Authored by Hongbin Wang on Aug 21, 2022

Drop unused argument from xfrm_policy_match,
__xfrm_policy_eval_candidates and xfrm_policy_eval_candidates.
No functional changes intended.
Signed-off-by: Hongbin Wang <wh_bin@126.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>

17 Aug, 2022 1 commit

xfrm: policy: fix metadata dst->dev xmit null pointer dereference · 17ecd4a4

Authored by Nikolay Aleksandrov on Aug 16, 2022

When we try to transmit an skb with metadata_dst attached (i.e. dst->dev
== NULL) through xfrm interface we can hit a null pointer dereference[1]
in xfrmi_xmit2() -> xfrm_lookup_with_ifid() due to the check for a
loopback skb device when there's no policy which dereferences dst->dev
unconditionally. Not having dst->dev can be interpreted as it not being
a loopback device, so just add a check for a null dst_orig->dev.

With this fix xfrm interface's Tx error counters go up as usual.
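
The guard described above can be sketched in user space as follows. This is only an illustration under stated assumptions: `struct net_device` and `struct dst_entry` are reduced stand-ins for the kernel structures, and `dst_is_loopback()` is a hypothetical helper name, not the code in xfrm_lookup_with_ifid().

```c
#include <stdbool.h>
#include <stddef.h>

/* Same value as the kernel's IFF_LOOPBACK interface flag. */
#define IFF_LOOPBACK 0x8

/* Reduced stand-ins: only the fields needed for the illustration. */
struct net_device {
	unsigned int flags;
};

struct dst_entry {
	struct net_device *dev; /* NULL when a metadata dst is attached */
};

/* Hypothetical helper: a dst without a device is treated as
 * "not a loopback device" instead of being dereferenced. */
static bool dst_is_loopback(const struct dst_entry *dst)
{
	return dst->dev != NULL && (dst->dev->flags & IFF_LOOPBACK) != 0;
}
```

With the NULL test placed first, the flags read is short-circuited for metadata dsts, which is the dereference that faulted in the trace below.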

[1] net-next calltrace captured via netconsole:
  BUG: kernel NULL pointer dereference, address: 00000000000000c0
  #PF: supervisor read access in kernel mode
  #PF: error_code(0x0000) - not-present page
  PGD 0 P4D 0
  Oops: 0000 [#1] PREEMPT SMP
  CPU: 1 PID: 7231 Comm: ping Kdump: loaded Not tainted 5.19.0+ #24
  Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.0-1.fc36 04/01/2014
  RIP: 0010:xfrm_lookup_with_ifid+0x5eb/0xa60
  Code: 8d 74 24 38 e8 26 a4 37 00 48 89 c1 e9 12 fc ff ff 49 63 ed 41 83 fd be 0f 85 be 01 00 00 41 be ff ff ff ff 45 31 ed 48 8b 03 <f6> 80 c0 00 00 00 08 75 0f 41 80 bc 24 19 0d 00 00 01 0f 84 1e 02
  RSP: 0018:ffffb0db82c679f0 EFLAGS: 00010246
  RAX: 0000000000000000 RBX: ffffd0db7fcad430 RCX: ffffb0db82c67a10
  RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffffb0db82c67a80
  RBP: ffffb0db82c67a80 R08: ffffb0db82c67a14 R09: 0000000000000000
  R10: 0000000000000000 R11: ffff8fa449667dc8 R12: ffffffff966db880
  R13: 0000000000000000 R14: 00000000ffffffff R15: 0000000000000000
  FS:  00007ff35c83f000(0000) GS:ffff8fa478480000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 00000000000000c0 CR3: 000000001ebb7000 CR4: 0000000000350ee0
  Call Trace:
   <TASK>
   xfrmi_xmit+0xde/0x460
   ? tcf_bpf_act+0x13d/0x2a0
   dev_hard_start_xmit+0x72/0x1e0
   __dev_queue_xmit+0x251/0xd30
   ip_finish_output2+0x140/0x550
   ip_push_pending_frames+0x56/0x80
   raw_sendmsg+0x663/0x10a0
   ? try_charge_memcg+0x3fd/0x7a0
   ? __mod_memcg_lruvec_state+0x93/0x110
   ? sock_sendmsg+0x30/0x40
   sock_sendmsg+0x30/0x40
   __sys_sendto+0xeb/0x130
   ? handle_mm_fault+0xae/0x280
   ? do_user_addr_fault+0x1e7/0x680
   ? kvm_read_and_reset_apf_flags+0x3b/0x50
   __x64_sys_sendto+0x20/0x30
   do_syscall_64+0x34/0x80
   entry_SYSCALL_64_after_hwframe+0x46/0xb0
  RIP: 0033:0x7ff35cac1366
  Code: eb 0b 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b8 0f 1f 00 41 89 ca 64 8b 04 25 18 00 00 00 85 c0 75 11 b8 2c 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 72 c3 90 55 48 83 ec 30 44 89 4c 24 2c 4c 89
  RSP: 002b:00007fff738e4028 EFLAGS: 00000246 ORIG_RAX: 000000000000002c
  RAX: ffffffffffffffda RBX: 00007fff738e57b0 RCX: 00007ff35cac1366
  RDX: 0000000000000040 RSI: 0000557164e4b450 RDI: 0000000000000003
  RBP: 0000557164e4b450 R08: 00007fff738e7a2c R09: 0000000000000010
  R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000040
  R13: 00007fff738e5770 R14: 00007fff738e4030 R15: 0000001d00000001
   </TASK>
  Modules linked in: netconsole veth br_netfilter bridge bonding virtio_net [last unloaded: netconsole]
  CR2: 00000000000000c0

CC: Steffen Klassert <steffen.klassert@secunet.com>
CC: Daniel Borkmann <daniel@iogearbox.net>
Fixes: 2d151d39 ("xfrm: Add possibility to set the default to block if we have no policy")
Signed-off-by: Nikolay Aleksandrov <razor@blackwall.org>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>

27 Jul, 2022 1 commit

xfrm: fix refcount leak in __xfrm_policy_check() · 9c9cb23e

Authored by Xin Xiong on Jul 24, 2022

The issue happens on an error path in __xfrm_policy_check(). When the
fetching process of the object `pols[1]` fails, the function simply
returns 0, forgetting to decrement the reference count of `pols[0]`,
which is incremented earlier by either xfrm_sk_policy_lookup() or
xfrm_policy_lookup(). This may result in memory leaks.

Fix it by decreasing the reference count of `pols[0]` in that path.
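
A minimal sketch of the error-path pattern, assuming a toy `struct policy` with a plain integer refcount. The names `pol_hold()`, `pol_put()` and `policy_check()` are hypothetical and only mirror the shape of the fix, not the kernel function.

```c
#include <stddef.h>

struct policy {
	int refcnt;
};

static void pol_hold(struct policy *p) { p->refcnt++; }
static void pol_put(struct policy *p)  { p->refcnt--; }

/* Hypothetical check routine. 'second_lookup_fails' simulates the
 * failure to fetch pols[1]. Returns 1 on success, 0 on failure. */
static int policy_check(struct policy *first, struct policy *second,
			int second_lookup_fails)
{
	struct policy *pols[2] = { NULL, NULL };

	pol_hold(first);          /* reference taken by the first lookup */
	pols[0] = first;

	if (second_lookup_fails) {
		pol_put(pols[0]); /* the fix: balance the earlier hold */
		return 0;
	}

	pol_hold(second);
	pols[1] = second;

	/* ... policy evaluation would happen here ... */

	pol_put(pols[1]);
	pol_put(pols[0]);
	return 1;
}
```

Without the `pol_put()` in the failure branch, every failed second lookup would leave one reference on `first` behind.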

Fixes: 134b0fc5 ("IPsec: propagate security module errors up from flow_cache_lookup")
Signed-off-by: Xin Xiong <xiongx18@fudan.edu.cn>
Signed-off-by: Xin Tan <tanxin.ctf@gmail.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>

02 Jun, 2022 1 commit

xfrm: xfrm_policy: fix a possible double xfrm_pols_put() in xfrm_bundle_lookup() · f85daf0e

Authored by Hangyu Hua on Jun 01, 2022

xfrm_policy_lookup() calls xfrm_pol_hold_rcu() to take a reference on
pols[0]. This reference can be dropped in xfrm_expand_policies() when
xfrm_expand_policies() returns an error, at which point the refcount of
pols[0] is already balanced. But xfrm_bundle_lookup() also calls
xfrm_pols_put() with num_pols == 1 to drop this reference when
xfrm_expand_policies() returns an error, so it is dropped twice.

This patch also fixes an illegal address access. pols[0] holds an error
pointer when xfrm_policy_lookup() fails. This leads xfrm_pols_put() to
dereference an illegal address in the error path of xfrm_bundle_lookup().

Fix these by setting num_pols = 0 in xfrm_expand_policies()'s error path.
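
The ownership hand-off can be sketched as below, assuming a toy refcounted `struct policy`. `expand_policies()` and `bundle_lookup()` are hypothetical simplifications that only show why zeroing the count in the error path prevents the second put.

```c
#include <stddef.h>

struct policy {
	int refcnt;
};

/* Drop one reference on each of the first 'npols' entries. */
static void pols_put(struct policy **pols, int npols)
{
	for (int i = npols - 1; i >= 0; i--)
		pols[i]->refcnt--;
}

/* Hypothetical expand step: on error it releases the references
 * itself and reports zero live policies back to the caller. */
static int expand_policies(struct policy **pols, int *num_pols, int fail)
{
	if (fail) {
		pols_put(pols, *num_pols);
		*num_pols = 0; /* the fix: nothing left for the caller to put */
		return -1;
	}
	return 0;
}

static int bundle_lookup(struct policy *p, int fail)
{
	struct policy *pols[2] = { p, NULL };
	int num_pols = 1;
	int err;

	p->refcnt++; /* reference taken by the lookup */
	err = expand_policies(pols, &num_pols, fail);

	/* Common exit: drops only what is still held. */
	pols_put(pols, num_pols);
	return err;
}
```

Because the caller's cleanup is driven by `num_pols`, resetting it to 0 also keeps the caller from touching a `pols[0]` that might hold an error pointer.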

Fixes: 80c802f3 ("xfrm: cache bundles instead of policies for outgoing flows")
Signed-off-by: Hangyu Hua <hbh25y@gmail.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>

17 May, 2022 1 commit

xfrm: set dst dev to blackhole_netdev instead of loopback_dev in ifdown · 4d33ab08

Authored by Xin Long on May 15, 2022

The global blackhole_netdev has replaced the per-net loopback_dev as the
device given to objects that hold a netdev on ifdown in many places of
ipv4 and ipv6 since commit 8d7017fd ("blackhole_netdev: use
blackhole_netdev to invalidate dst entries").

Especially after commit faab39f6 ("net: allow out-of-order netdev
unregistration"), it's no longer safe to use loopback_dev that may be
freed before other netdev.

This patch is to set dst dev to blackhole_netdev instead of loopback_dev
in ifdown.

v1->v2:
  - add Fixes tag as Eric suggested.

Fixes: faab39f6 ("net: allow out-of-order netdev unregistration")
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/r/e8c87482998ca6fcdab214f5a9d582899ec0c648.1652665047.git.lucien.xin@gmail.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>

04 Apr, 2022 1 commit

xfrm: Pass flowi_oif or l3mdev as oif to xfrm_dst_lookup · 748b82c2

Authored by David Ahern on Apr 01, 2022

The commit referenced in the Fixes tag no longer changes the
flow oif to the l3mdev ifindex. An xfrm use case was expecting
the flowi_oif to be the VRF if relevant and the change broke
that test. Update xfrm_bundle_create to pass oif if set and any
potential flowi_l3mdev if oif is not set.

Fixes: 40867d74 ("net: Add l3mdev index to flow struct and avoid oif reset for port devices")
Reported-by: kernel test robot <oliver.sang@intel.com>
Signed-off-by: David Ahern <dsahern@kernel.org>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>

18 Mar, 2022 1 commit

xfrm: rework default policy structure · b58b1f56

Authored by Nicolas Dichtel on Mar 14, 2022

This is a follow up of commit f8d858e6 ("xfrm: make user policy API
complete"). The goal is to align userland API to the internal structures.
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Reviewed-by: Antony Antony <antony.antony@secunet.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>

26 Jan, 2022 1 commit

xfrm: Check if_id in xfrm_migrate · c1aca308

Authored by Yan Yan on Jan 18, 2022

This patch enables distinguishing SAs and SPs based on if_id during
the xfrm_migrate flow. This ensures support for xfrm interfaces
throughout the SA/SP lifecycle.

When there are multiple existing SPs with the same direction,
the same xfrm_selector and different endpoint addresses,
xfrm_migrate might fail with ENODATA.

Specifically, the code path for performing xfrm_migrate is:
  Stage 1: find policy to migrate with
    xfrm_migrate_policy_find(sel, dir, type, net)
  Stage 2: find and update state(s) with
    xfrm_migrate_state_find(mp, net)
  Stage 3: update endpoint address(es) of template(s) with
    xfrm_policy_migrate(pol, m, num_migrate)

Currently "Stage 1" always returns the first xfrm_policy that
matches, and "Stage 3" looks for the xfrm_tmpl that matches the
old endpoint address. Thus if there are multiple xfrm_policy
with same selector, direction, type and net, "Stage 1" might
return a wrong xfrm_policy and "Stage 3" will fail with ENODATA
because it cannot find a xfrm_tmpl with the matching endpoint
address.

The fix is to allow userspace to pass an if_id and add if_id
to the matching rule in Stage 1 and Stage 2 since if_id is a
unique ID for xfrm_policy and xfrm_state. For compatibility,
if_id will only be checked if the attribute is set.
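
The compatibility rule can be sketched as below. It assumes an if_id of 0 stands for "attribute not set", and it uses a reduced `struct policy` in which a single `selector` field stands in for selector, direction and type. `if_id_matches()` and `policy_find()` are hypothetical names.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct policy {
	uint32_t selector; /* stand-in for selector, direction and type */
	uint32_t if_id;
};

/* Assumption: 0 means the caller did not supply an if_id, so the
 * check is skipped and the old first-match behaviour is kept. */
static bool if_id_matches(uint32_t requested, uint32_t candidate)
{
	return requested == 0 || requested == candidate;
}

/* Return the first policy matching the selector and, if given, the if_id. */
static const struct policy *policy_find(const struct policy *pols, size_t n,
					uint32_t selector, uint32_t if_id)
{
	for (size_t i = 0; i < n; i++) {
		if (pols[i].selector == selector &&
		    if_id_matches(if_id, pols[i].if_id))
			return &pols[i];
	}
	return NULL;
}
```

With two policies that share a selector, passing the second policy's if_id selects it directly, while passing 0 still returns the first match as before.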

Tested with additions to Android's kernel unit test suite:
https://android-review.googlesource.com/c/kernel/tests/+/1668886

Signed-off-by: Yan Yan <evitayan@google.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>

12 Jan, 2022 1 commit

xfrm: Don't accidentally set RTO_ONLINK in decode_session4() · 23e7b1bf

Authored by Guillaume Nault on Jan 10, 2022

Similar to commit 94e22389 ("xfrm4: strip ECN bits from tos field"),
clear the ECN bits from iph->tos when setting ->flowi4_tos.
This ensures that the last bit of ->flowi4_tos is cleared, so
ip_route_output_key_hash() isn't going to restrict the scope of the
route lookup.

Use ~INET_ECN_MASK instead of IPTOS_RT_MASK, because we have no reason
to clear the high order bits.
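
The bit arithmetic can be checked with a small user-space sketch. The three constants are copied here with the values they have in the kernel headers as far as I know (RTO_ONLINK 0x01, INET_ECN_MASK 0x03, IPTOS_RT_MASK 0x1C); the helper names are hypothetical.

```c
#include <stdint.h>

#define RTO_ONLINK    0x01 /* low bit of the tos field, abused as a scope flag */
#define INET_ECN_MASK 0x03 /* the two ECN bits */
#define IPTOS_RT_MASK 0x1C /* legacy TOS bits without the low two bits */

/* Clears only the ECN bits and keeps the DSCP bits intact. */
static uint8_t tos_strip_ecn(uint8_t tos)
{
	return (uint8_t)(tos & ~INET_ECN_MASK);
}

/* Also clears the three high-order bits, which the commit avoids. */
static uint8_t tos_apply_rt_mask(uint8_t tos)
{
	return (uint8_t)(tos & IPTOS_RT_MASK);
}
```

For a header tos of 0xB9 (DSCP EF with ECT(1) set), stripping ECN yields 0xB8 with the RTO_ONLINK bit clear, whereas IPTOS_RT_MASK would reduce it to 0x18 and lose the high-order bits.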

Found by code inspection, compile tested only.

Fixes: 4da3089f ("[IPSEC]: Use TOS when doing tunnel lookups")
Signed-off-by: Guillaume Nault <gnault@redhat.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

01 Dec, 2021 1 commit

net: xfrm: drop check of pols[0] for the second time · ac1077e9

Authored by Jean Sacren on Nov 30, 2021

!pols[0] is checked earlier.  If we don't return, pols[0] is always
true.  We should drop the check of pols[0] for the second time and the
binary is also smaller.

Before:
   text	   data	    bss	    dec	    hex	filename
  48395	    957	    240	  49592	   c1b8	net/xfrm/xfrm_policy.o

After:
   text	   data	    bss	    dec	    hex	filename
  48379	    957	    240	  49576	   c1a8	net/xfrm/xfrm_policy.o
Signed-off-by: Jean Sacren <sakiwit@gmail.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>

23 Nov, 2021 1 commit

xfrm: fix policy lookup for ipv6 gre packets · bcf141b2

Authored by Ghalem Boudour on Nov 19, 2021

On egress side, xfrm lookup is called from __gre6_xmit() with the
fl6_gre_key field not initialized, leading to policy selector check
failures. Consequently, gre packets are sent without encryption.

On ingress side, INET6_PROTO_NOPOLICY was set, thus packets were not
checked against xfrm policies. Like for egress side, fl6_gre_key should be
correctly set, this is now done in decode_session6().

Fixes: c12b395a ("gre: Support GRE over IPv6")
Cc: stable@vger.kernel.org
Signed-off-by: Ghalem Boudour <ghalem.boudour@6wind.com>
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>

19 Nov, 2021 1 commit

xfrm: Remove duplicate assignment · 2e180920

Authored by luo penghao on Nov 04, 2021

The statement in the switch duplicates the statement at the
beginning of the while loop, so it is redundant.

The clang_analyzer complains as follows:

net/xfrm/xfrm_policy.c:3392:2 warning: Value stored to 'exthdr' is never read

Reported-by: Zeal Robot <zealci@zte.com.cn>
Signed-off-by: luo penghao <luo.penghao@zte.com.cn>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>

19 Oct, 2021 1 commit

xfrm: Use memset_after() to clear padding · caf283d0

Authored by Kees Cook on Jun 17, 2021

In preparation for FORTIFY_SOURCE performing compile-time and run-time
field bounds checking for memset(), avoid intentionally writing across
neighboring fields.

Clear trailing padding bytes using the new helper so that memset()
doesn't get confused about writing "past the end" of the last struct
member. There is no change to the resulting machine code.
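
A user-space approximation of the idea follows. `OFFSETOFEND()` and `CLEAR_AFTER()` are hypothetical macros written for this sketch (they take the type explicitly to avoid compiler extensions) and are not the kernel's memset_after() definition. `struct sample` is an invented layout that has trailing padding on typical ABIs.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Offset of the first byte after 'member' within 'type'. */
#define OFFSETOFEND(type, member) \
	(offsetof(type, member) + sizeof(((type *)0)->member))

/* Zero everything in *obj that follows 'member', including padding.
 * The destination is computed from the object's start, so the memset()
 * is bounded by the whole object rather than by one of its fields. */
#define CLEAR_AFTER(type, obj, member)                               \
	memset((unsigned char *)(obj) + OFFSETOFEND(type, member), 0, \
	       sizeof(type) - OFFSETOFEND(type, member))

struct sample {
	uint32_t a;
	uint8_t  b; /* usually followed by 3 bytes of trailing padding */
};

static void sample_clear_tail(struct sample *s)
{
	CLEAR_AFTER(struct sample, s, b);
}
```

The members up to and including `b` are left untouched; only the bytes after `b` are zeroed, however many the ABI adds.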

Cc: Steffen Klassert <steffen.klassert@secunet.com>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: netdev@vger.kernel.org
Signed-off-by: Kees Cook <keescook@chromium.org>

21 Jul, 2021 1 commit

xfrm: Add possibility to set the default to block if we have no policy · 2d151d39

Authored by Steffen Klassert on Jul 18, 2021

As the default we assume the traffic to pass, if we have no
matching IPsec policy. With this patch, we have a possibility to
change this default from allow to block. It can be configured
via netlink. Each direction (input/output/forward) can be
configured separately. With the default to block configured,
we need allow policies for all packet flows we accept.
We do not use default policy lookup for the loopback device.

v1->v2
 - fix compiling when XFRM is disabled
 - Reported-by: kernel test robot <lkp@intel.com>
Co-developed-by: Christian Langrock <christian.langrock@secunet.com>
Signed-off-by: Christian Langrock <christian.langrock@secunet.com>
Co-developed-by: Antony Antony <antony.antony@secunet.com>
Signed-off-by: Antony Antony <antony.antony@secunet.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>

02 Jul, 2021 2 commits

xfrm: Fix RCU vs hash_resize_mutex lock inversion · 2580d3f4

Authored by Frederic Weisbecker on Jun 28, 2021

xfrm_bydst_resize() calls synchronize_rcu() while holding
hash_resize_mutex. But then on PREEMPT_RT configurations,
xfrm_policy_lookup_bytype() may acquire that mutex while running in an
RCU read side critical section. This results in a deadlock.

In fact the scope of hash_resize_mutex is way beyond the purpose of
xfrm_policy_lookup_bytype() to just fetch a coherent and stable policy
for a given destination/direction, along with other details.

The lower level net->xfrm.xfrm_policy_lock, which among other things
protects per destination/direction references to policy entries, is
enough to serialize and benefit from priority inheritance against the
write side. As a bonus, it makes it officially a per network namespace
synchronization business where a policy table resize on namespace A
shouldn't block a policy lookup on namespace B.

Fixes: 77cc278f (xfrm: policy: Use sequence counters with associated lock)
Cc: stable@vger.kernel.org
Cc: Ahmed S. Darwish <a.darwish@linutronix.de>
Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Varad Gautam <varad.gautam@suse.com>
Cc: Steffen Klassert <steffen.klassert@secunet.com>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>

Revert "xfrm: policy: Read seqcount outside of rcu-read side in xfrm_policy_lookup_bytype" · eaf22826

Authored by Steffen Klassert on Jul 02, 2021

This reverts commit d7b04089.

This commit tried to fix a locking bug introduced by commit 77cc278f
("xfrm: policy: Use sequence counters with associated lock"). As it
turned out, this patch did not really fix the bug. A proper fix
for this bug is applied on top of this revert.
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>

11 Jun, 2021 1 commit

xfrm: policy: fix a spelling mistake · 7a7ae1eb

Authored by gushengxian on Jun 08, 2021

Fix a spelling mistake.
Signed-off-by: gushengxian <gushengxian@yulong.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>

01 Jun, 2021 1 commit

xfrm: policy: Read seqcount outside of rcu-read side in xfrm_policy_lookup_bytype · d7b04089

Authored by Varad Gautam on May 28, 2021

xfrm_policy_lookup_bytype loops on seqcount mutex xfrm_policy_hash_generation
within an RCU read side critical section. Although ill advised, this is fine if
the loop is bounded.

xfrm_policy_hash_generation wraps mutex hash_resize_mutex, which is used to
serialize writers (xfrm_hash_resize, xfrm_hash_rebuild). This is fine too.

On PREEMPT_RT=y, the read_seqcount_begin call within xfrm_policy_lookup_bytype
emits a mutex lock/unlock for hash_resize_mutex. Mutex locking is fine, since
RCU read side critical sections are allowed to sleep with PREEMPT_RT.

xfrm_hash_resize can, however, block on synchronize_rcu while holding
hash_resize_mutex.

This leads to the following situation on PREEMPT_RT, where the writer is
blocked on RCU grace period expiry, while the reader is blocked on a lock held
by the writer:

Thread 1 (xfrm_hash_resize)	Thread 2 (xfrm_policy_lookup_bytype)

				rcu_read_lock();
mutex_lock(&hash_resize_mutex);
				read_seqcount_begin(&xfrm_policy_hash_generation);
				mutex_lock(&hash_resize_mutex); // block
xfrm_bydst_resize();
synchronize_rcu(); // block
		<RCU stalls in xfrm_policy_lookup_bytype>

Move the read_seqcount_begin call outside of the RCU read side critical section,
and do an rcu_read_unlock/retry if we got stale data within the critical section.

On non-PREEMPT_RT, this shortens the time spent within RCU read side critical
section in case the seqcount needs a retry, and avoids unbounded looping.

Fixes: 77cc278f ("xfrm: policy: Use sequence counters with associated lock")
Signed-off-by: Varad Gautam <varad.gautam@suse.com>
Cc: linux-rt-users <linux-rt-users@vger.kernel.org>
Cc: netdev@vger.kernel.org
Cc: stable@vger.kernel.org # v4.9
Cc: Steffen Klassert <steffen.klassert@secunet.com>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Florian Westphal <fw@strlen.de>
Cc: "Ahmed S. Darwish" <a.darwish@linutronix.de>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Acked-by: Ahmed S. Darwish <a.darwish@linutronix.de>

11 May, 2021 1 commit

selinux: delete selinux_xfrm_policy_lookup() useless argument · 8a922805

Authored by Zhongjun Tan on Apr 09, 2021

selinux_xfrm_policy_lookup() is the hook for security_xfrm_policy_lookup().
The dir argument is useless in security_xfrm_policy_lookup(), so
remove the dir argument from selinux_xfrm_policy_lookup() and
security_xfrm_policy_lookup().
Signed-off-by: Zhongjun Tan <tanzhongjun@yulong.com>
[PM: reformat the subject line]
Signed-off-by: Paul Moore <paul@paul-moore.com>

19 Apr, 2021 2 commits

xfrm: remove stray synchronize_rcu from xfrm_init · 7baf867f

Authored by Florian Westphal on Apr 14, 2021

This function is called during boot, from ipv4 stack, there is no need
to set the pointer to NULL (static storage duration, so already NULL).

No need for the synchronize_rcu either.  Remove both.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>

flow: remove spi key from flowi struct · b07dd26f

Authored by Florian Westphal on Apr 14, 2021

xfrm session decode ipv4 path (but not ipv6) sets this, but there are no
consumers.  Remove it.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>

29 Mar, 2021 1 commit

xfrm_policy.c : Mundane typo fix · aa8ef1b9

Authored by Bhaskar Chowdhury on Mar 27, 2021

s/sucessful/successful/
Signed-off-by: Bhaskar Chowdhury <unixbhaskar@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

04 Jan, 2021 2 commits

xfrm: Fix wraparound in xfrm_policy_addr_delta() · da64ae2d

Authored by Visa Hankala on Dec 30, 2020

Use three-way comparison for address components to avoid integer
wraparound in the result of xfrm_policy_addr_delta(). This ensures
that the search trees are built and traversed correctly.

Treat IPv4 and IPv6 similarly by returning 0 when prefixlen == 0.
Prefix /0 has only one equivalence class.
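
The wraparound can be reproduced with a short sketch. It is an illustration only: the function names are hypothetical, addresses are plain host-order `uint32_t` values, and the subtraction variant relies on the usual two's-complement conversion to `int`.

```c
#include <stdint.h>

/* Subtraction-based delta: the 32-bit difference may not fit in an int,
 * so the sign of the result can be wrong. */
static int delta_by_subtraction(uint32_t a, uint32_t b)
{
	return (int)(a - b);
}

/* Three-way comparison: always -1, 0 or 1 with the correct sign. */
static int delta_three_way(uint32_t a, uint32_t b)
{
	return (a > b) - (a < b);
}

/* Compare two IPv4 addresses under a prefix length in the range 0..32.
 * A /0 prefix has a single equivalence class, so it always yields 0. */
static int addr_delta_v4(uint32_t a, uint32_t b, unsigned int prefixlen)
{
	uint32_t mask;

	if (prefixlen == 0)
		return 0;

	mask = ~(uint32_t)0 << (32 - prefixlen);
	return delta_three_way(a & mask, b & mask);
}
```

For 0xC0000000 and 0x10000000 the difference is 0xB0000000, which reads as negative once converted to `int` even though the first value is larger. An ordering built on that sign would place nodes on the wrong side of a search tree.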

Fixes: 9cf545eb ("xfrm: policy: store inexact policies in a tree ordered by destination address")
Signed-off-by: Visa Hankala <visa@hankala.org>
Acked-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>

xfrm: fix disable_xfrm sysctl when used on xfrm interfaces · 9f8550e4

Authored by Eyal Birger on Dec 23, 2020

The disable_xfrm flag signals that xfrm should not be performed during
routing towards a device before reaching device xmit.

For xfrm interfaces this is usually desired as they perform the outbound
policy lookup as part of their xmit using their if_id.

Before this change enabling this flag on xfrm interfaces prevented them
from xmitting as xfrm_lookup_with_ifid() would not perform a policy lookup
in case the original dst had the DST_NOXFRM flag.

This optimization is incorrect when the lookup is done by the xfrm
interface xmit logic.

Fix by performing policy lookup when invoked by xfrmi as if_id != 0.

Similarly it's unlikely for the 'no policy exists on net' check to yield
any performance benefits when invoked from xfrmi.

Fixes: f203b76d ("xfrm: Add virtual xfrm interfaces")
Signed-off-by: Eyal Birger <eyal.birger@gmail.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>

24 Aug, 2020 1 commit

treewide: Use fallthrough pseudo-keyword · df561f66

Authored by Gustavo A. R. Silva on Aug 23, 2020

Replace the existing /* fall through */ comments and their variants with
the new pseudo-keyword macro fallthrough[1]. Also, remove unnecessary
fall-through markings where applicable.

[1] https://www.kernel.org/doc/html/v5.7/process/deprecated.html?highlight=fallthrough#implicit-switch-case-fall-through

Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>

29 Jul, 2020 1 commit

xfrm: policy: Use sequence counters with associated lock · 77cc278f

Authored by Ahmed S. Darwish on Jul 20, 2020

A sequence counter write side critical section must be protected by some
form of locking to serialize writers. If the serialization primitive is
not disabling preemption implicitly, preemption has to be explicitly
disabled before entering the sequence counter write side critical
section.

A plain seqcount_t does not contain the information of which lock must
be held when entering a write side critical section.

Use the new seqcount_spinlock_t and seqcount_mutex_t data types instead,
which allow associating a lock with the sequence counter. This enables
lockdep to verify that the lock used for writer serialization is held
when the write side critical section is entered.

If lockdep is disabled this lock association is compiled out and has
neither storage size nor runtime overhead.
Signed-off-by: Ahmed S. Darwish <a.darwish@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20200720155530.1173732-17-a.darwish@linutronix.de

21 Jul, 2020 1 commit

xfrm: Make the policy hold queue work with VTI. · b328ecc4

Authored by Steffen Klassert on Jul 17, 2020

We forgot to support the xfrm policy hold queue when
VTI was implemented. This patch adds everything we
need so that we can use the policy hold queue together
with VTI interfaces.
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>

17 Jul, 2020 1 commit

xfrm: policy: fix IPv6-only espintcp compilation · 95a35b42

Authored by Sabrina Dubroca on Jul 16, 2020

In case we're compiling espintcp support only for IPv6, we should
still initialize the common code.

Fixes: 26333c37 ("xfrm: add IPv6 support for espintcp")
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>

24 Jun, 2020 1 commit

xfrm: policy: match with both mark and mask on user interfaces · 4f47e8ab

Authored by Xin Long on Jun 22, 2020

In commit ed17b8d3 ("xfrm: fix a warning in xfrm_policy_insert_list"),
it would take 'priority' to make a policy unique, and allow duplicated
policies with different 'priority' to be added, which is not expected
by userland, as Tobias reported in strongswan.

To fix this duplicated policies issue, and also fix the issue in
commit ed17b8d3 ("xfrm: fix a warning in xfrm_policy_insert_list"),
when doing add/del/get/update on user interfaces, this patch is to change
to look up a policy with both mark and mask by doing:

  mark.v == pol->mark.v && mark.m == pol->mark.m

and leave the check:

  (mark & pol->mark.m) == pol->mark.v

for tx/rx path only.

As the userland expects an exact mark and mask match to manage policies.
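
The two checks quoted above can be compared side by side in a small sketch. `struct mark` mirrors the value/mask pair used in the expressions; the function names `mark_match_exact()` and `mark_match_datapath()` are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

struct mark {
	uint32_t v; /* value */
	uint32_t m; /* mask */
};

/* Management path (add/del/get/update): value and mask must both be equal. */
static bool mark_match_exact(const struct mark *req, const struct mark *pol)
{
	return req->v == pol->v && req->m == pol->m;
}

/* tx/rx path: the packet mark is masked before comparing with the value. */
static bool mark_match_datapath(uint32_t pkt_mark, const struct mark *pol)
{
	return (pkt_mark & pol->m) == pol->v;
}
```

A policy with mark 0 and mask 0x10 is then distinct from one with mark 0 and mask 0x1 for management purposes, even though a packet mark such as 0x20 passes the masked check for both.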

v1->v2:
  - make xfrm_policy_mark_match inline and fix the changelog as
    Tobias suggested.

Fixes: 295fae56 ("xfrm: Allow user space manipulation of SPD mark")
Fixes: ed17b8d3 ("xfrm: fix a warning in xfrm_policy_insert_list")
Reported-by: Tobias Brunner <tobias@strongswan.org>
Tested-by: Tobias Brunner <tobias@strongswan.org>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>

25 May, 2020 1 commit

xfrm: fix a warning in xfrm_policy_insert_list · ed17b8d3

Authored by Xin Long on May 25, 2020

This warning can be triggered simply by:

  # ip xfrm policy update src 192.168.1.1/24 dst 192.168.1.2/24 dir in \
    priority 1 mark 0 mask 0x10  #[1]
  # ip xfrm policy update src 192.168.1.1/24 dst 192.168.1.2/24 dir in \
    priority 2 mark 0 mask 0x1   #[2]
  # ip xfrm policy update src 192.168.1.1/24 dst 192.168.1.2/24 dir in \
    priority 2 mark 0 mask 0x10  #[3]

Then dmesg shows:

  [ ] WARNING: CPU: 1 PID: 7265 at net/xfrm/xfrm_policy.c:1548
  [ ] RIP: 0010:xfrm_policy_insert_list+0x2f2/0x1030
  [ ] Call Trace:
  [ ]  xfrm_policy_inexact_insert+0x85/0xe50
  [ ]  xfrm_policy_insert+0x4ba/0x680
  [ ]  xfrm_add_policy+0x246/0x4d0
  [ ]  xfrm_user_rcv_msg+0x331/0x5c0
  [ ]  netlink_rcv_skb+0x121/0x350
  [ ]  xfrm_netlink_rcv+0x66/0x80
  [ ]  netlink_unicast+0x439/0x630
  [ ]  netlink_sendmsg+0x714/0xbf0
  [ ]  sock_sendmsg+0xe2/0x110

The issue was introduced by Commit 7cb8a939 ("xfrm: Allow inserting
policies with matching mark and different priorities"). After that, the
policies [1] and [2] would be able to be added with different priorities.

However, policy [3] will actually match both [1] and [2]. Policy [1]
was matched due to the 1st 'return true' in xfrm_policy_mark_match(),
and policy [2] was matched due to the 2nd 'return true' in there. It
caused WARN_ON() in xfrm_policy_insert_list().

This patch fixes it by treating only policies with the same mark value
and priority as the same policy in xfrm_policy_mark_match().

Thanks to Yuehaibing, we could make this fix better.

v1->v2:
  - check policy->mark.v == pol->mark.v only without mask.

Fixes: 7cb8a939 ("xfrm: Allow inserting policies with matching mark and different priorities")
Reported-by: Xiumei Mu <xmu@redhat.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>

24 Mar, 2020 2 commits

xfrm: policy: Fix doulbe free in xfrm_policy_timer · 4c59406e

Authored by YueHaibing on Mar 23, 2020

After xfrm_add_policy adds a policy, its ref is 2, then

                             xfrm_policy_timer
                               read_lock
                               xp->walk.dead is 0
                               ....
                               mod_timer()
xfrm_policy_kill
  policy->walk.dead = 1
  ....
  del_timer(&policy->timer)
    xfrm_pol_put //ref is 1
  xfrm_pol_put  //ref is 0
    xfrm_policy_destroy
      call_rcu
                                 xfrm_pol_hold //ref is 1
                               read_unlock
                               xfrm_pol_put //ref is 0
                                 xfrm_policy_destroy
                                  call_rcu

xfrm_policy_destroy is called twice, which may lead to a
double free.

Call Trace:
RIP: 0010:refcount_warn_saturate+0x161/0x210
...
 xfrm_policy_timer+0x522/0x600
 call_timer_fn+0x1b3/0x5e0
 ? __xfrm_decode_session+0x2990/0x2990
 ? msleep+0xb0/0xb0
 ? _raw_spin_unlock_irq+0x24/0x40
 ? __xfrm_decode_session+0x2990/0x2990
 ? __xfrm_decode_session+0x2990/0x2990
 run_timer_softirq+0x5c5/0x10e0

Fix this by using write_lock_bh in xfrm_policy_kill.

Fixes: ea2dea9d ("xfrm: remove policy lock when accessing policy->walk.dead")
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Acked-by: Timo Teräs <timo.teras@iki.fi>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>

Remove DST_HOST · af13b3c3

Authored by David Laight on Mar 23, 2020

Previous changes to the IP routing code have removed all the
tests for the DST_HOST route flag.
Remove the flag and all the code that sets it.
Signed-off-by: David Laight <david.laight@aculab.com>
Acked-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

04 Feb, 2020 1 commit

treewide: remove redundant IS_ERR() before error code check · 45586c70

Authored by Masahiro Yamada on Feb 03, 2020

'PTR_ERR(p) == -E*' is a stronger condition than IS_ERR(p).
Hence, IS_ERR(p) is unneeded.

The semantic patch that generates this commit is as follows:

// <smpl>
@@
expression ptr;
constant error_code;
@@
-IS_ERR(ptr) && (PTR_ERR(ptr) == - error_code)
+PTR_ERR(ptr) == - error_code
// </smpl>
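
The equivalence the semantic patch relies on can be checked with a user-space re-creation of the helpers. This is a sketch under assumptions: the three functions below imitate the kernel's ERR_PTR(), PTR_ERR() and IS_ERR() using `intptr_t` casts and a MAX_ERRNO of 4095, and they are not the kernel's own definitions.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* Encode a negative errno value in a pointer. */
static inline void *ERR_PTR(intptr_t error)
{
	return (void *)error;
}

/* Read the pointer back as an integer. */
static inline intptr_t PTR_ERR(const void *ptr)
{
	return (intptr_t)ptr;
}

/* True when the pointer lies in the top MAX_ERRNO addresses. */
static inline bool IS_ERR(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}
```

Any pointer whose integer value equals a specific `-E*` constant already lies in the error range, so `PTR_ERR(p) == -ENOENT` implies `IS_ERR(p)` and the extra test adds nothing.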

Link: http://lkml.kernel.org/r/20200106045833.1725-1-masahiroy@kernel.org
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Cc: Julia Lawall <julia.lawall@lip6.fr>
Acked-by: Stephen Boyd <sboyd@kernel.org> [drivers/clk/clk.c]
Acked-by: Bartosz Golaszewski <bgolaszewski@baylibre.com> [GPIO]
Acked-by: Wolfram Sang <wsa@the-dreams.de> [drivers/i2c]
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> [acpi/scan.c]
Acked-by: Rob Herring <robh@kernel.org>
Cc: Eric Biggers <ebiggers@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

09 Dec, 2019 1 commit

xfrm: add espintcp (RFC 8229) · e27cca96

Authored by Sabrina Dubroca on Nov 25, 2019

TCP encapsulation of IKE and IPsec messages (RFC 8229) is implemented
as a TCP ULP, overriding in particular the sendmsg and recvmsg
operations. A Stream Parser is used to extract messages out of the TCP
stream using the first 2 bytes as length marker. Received IKE messages
are put on "ike_queue", waiting to be dequeued by the custom recvmsg
implementation. Received ESP messages are sent to XFRM, like with UDP
encapsulation.
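
The length-prefixed framing described above can be illustrated with a minimal userspace sketch. This is not the kernel's Stream Parser code. It assumes, following a reading of RFC 8229, that the 2-byte big-endian length covers the length field itself and that IKE messages start with a 4-byte all-zero non-ESP marker. The names `parse_one`, `framed_msg`, and `msg_kind` are hypothetical:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum msg_kind { MSG_INCOMPLETE, MSG_INVALID, MSG_IKE, MSG_ESP };

struct framed_msg {
    enum msg_kind kind;
    const uint8_t *payload; /* bytes following the 2-byte length prefix */
    size_t payload_len;
    size_t consumed;        /* bytes to drop from the front of the stream */
};

/* Try to extract one message from the start of a TCP byte stream. */
struct framed_msg parse_one(const uint8_t *buf, size_t len)
{
    static const uint8_t non_esp_marker[4] = { 0, 0, 0, 0 };
    struct framed_msg m = { MSG_INCOMPLETE, NULL, 0, 0 };
    size_t total;

    assert(buf != NULL || len == 0);

    if (len < 2)
        return m;                       /* length prefix not complete yet */

    total = ((size_t)buf[0] << 8) | buf[1];  /* big-endian length */
    if (total < 2) {
        m.kind = MSG_INVALID;           /* assumed: length includes itself */
        return m;
    }
    if (len < total)
        return m;                       /* wait for more stream data */

    m.payload = buf + 2;
    m.payload_len = total - 2;
    m.consumed = total;

    /* Assumed classification: zero marker means IKE, anything else ESP. */
    if (m.payload_len >= 4 && memcmp(m.payload, non_esp_marker, 4) == 0)
        m.kind = MSG_IKE;               /* would be queued on "ike_queue" */
    else
        m.kind = MSG_ESP;               /* would be handed to XFRM */
    return m;
}
```

A caller would invoke `parse_one()` repeatedly, advancing by `consumed` each time, and stop when it returns `MSG_INCOMPLETE`.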

Some of this code is taken from the original submission by Herbert
Xu. Currently, only IPv4 is supported, like for UDP encapsulation.
Co-developed-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>

e27cca96

02 Oct, 2019 · 1 commit

netfilter: drop bridge nf reset from nf_reset · 895b5c9f

Committed by Florian Westphal on Sep 29, 2019

commit 174e2381
("sk_buff: drop all skb extensions on free and skb scrubbing") made napi
recycle always drop skb extensions.  The additional skb_ext_del() that is
performed via nf_reset on napi skb recycle is not needed anymore.

Most nf_reset() calls in the stack are there so queued skb won't block
'rmmod nf_conntrack' indefinitely.

This removes the skb_ext_del from nf_reset, and renames it to a more
fitting nf_reset_ct().

In a few selected places, add a call to skb_ext_reset to make sure that
no active extensions remain.

I am submitting this for "net", because we're still early in the release
cycle.  The patch applies to net-next too, but I think the rename causes
needless divergence between those trees.
Suggested-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

895b5c9f

25 Aug, 2019 · 1 commit

xfrm/xfrm_policy: fix dst dev null pointer dereference in collect_md mode · c3b4c3a4

Committed by Hangbin Liu on Aug 22, 2019

In decode_session{4,6} there is a possibility that the skb dst dev is NULL,
e.g. with tunnel collect_md mode, which will cause a kernel crash.
Here is what the code path looks like for GRE:

- ip6gre_tunnel_xmit
  - ip6gre_xmit_ipv6
    - __gre6_xmit
      - ip6_tnl_xmit
        - if skb->len - t->tun_hlen - eth_hlen > mtu; return -EMSGSIZE
    - icmpv6_send
      - icmpv6_route_lookup
        - xfrm_decode_session_reverse
          - decode_session4
            - oif = skb_dst(skb)->dev->ifindex; <-- here
          - decode_session6
            - oif = skb_dst(skb)->dev->ifindex; <-- here

The reason is that __metadata_dst_init() initializes dst->dev to NULL by
default. We cannot fix this in __metadata_dst_init() because no dev is
supplied there. On the other hand, skb_dst(skb)->dev is not actually needed,
because we called decode_session{4,6} via xfrm_decode_session_reverse(), so
oif is not used by: fl4->flowi4_oif = reverse ? skb->skb_iif : oif;

So adding a dst dev check here should be clean and safe.
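
The shape of such a check can be sketched in userspace as follows. The structs are trimmed, hypothetical stand-ins for the kernel types, and `safe_oif` and `flow_oif` are illustrative names rather than the functions touched by the patch:

```c
#include <assert.h>
#include <stddef.h>

/* Trimmed stand-ins for the kernel structures (illustration only). */
struct net_device { int ifindex; };
struct dst_entry { struct net_device *dev; };
struct sk_buff { struct dst_entry *dst; int skb_iif; };

/* Return the output ifindex, or 0 when no dst or no device is attached. */
int safe_oif(const struct sk_buff *skb)
{
    assert(skb != NULL);

    /* A metadata dst has dev == NULL, so check before dereferencing. */
    if (skb->dst && skb->dst->dev)
        return skb->dst->dev->ifindex;
    return 0;
}

/* Mirrors the selection quoted above: oif is ignored on the reverse path. */
int flow_oif(const struct sk_buff *skb, int reverse)
{
    int oif = safe_oif(skb);

    return reverse ? skb->skb_iif : oif;
}
```

With a metadata-style dst whose `dev` is NULL, `flow_oif()` no longer dereferences a null pointer. On the reverse path it returns `skb_iif`, so the fallback value of `oif` never reaches the flow.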

v4: No changes.

v3: No changes.

v2: fix the issue in decode_session{4,6} instead of updating shared dst dev
in {ip_md, ip6}_tunnel_xmit.

Fixes: 8d79266b ("ip6_tunnel: add collect_md mode to IPv6 tunnels")
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Tested-by: Jonathan Lemon <jonathan.lemon@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

c3b4c3a4

20 Aug, 2019 · 1 commit

xfrm: policy: avoid warning splat when merging nodes · 769a807d

Committed by Florian Westphal on Aug 12, 2019

syzbot reported a splat:
 xfrm_policy_inexact_list_reinsert+0x625/0x6e0 net/xfrm/xfrm_policy.c:877
 CPU: 1 PID: 6756 Comm: syz-executor.1 Not tainted 5.3.0-rc2+ #57
 Call Trace:
  xfrm_policy_inexact_node_reinsert net/xfrm/xfrm_policy.c:922 [inline]
  xfrm_policy_inexact_node_merge net/xfrm/xfrm_policy.c:958 [inline]
  xfrm_policy_inexact_insert_node+0x537/0xb50 net/xfrm/xfrm_policy.c:1023
  xfrm_policy_inexact_alloc_chain+0x62b/0xbd0 net/xfrm/xfrm_policy.c:1139
  xfrm_policy_inexact_insert+0xe8/0x1540 net/xfrm/xfrm_policy.c:1182
  xfrm_policy_insert+0xdf/0xce0 net/xfrm/xfrm_policy.c:1574
  xfrm_add_policy+0x4cf/0x9b0 net/xfrm/xfrm_user.c:1670
  xfrm_user_rcv_msg+0x46b/0x720 net/xfrm/xfrm_user.c:2676
  netlink_rcv_skb+0x1f0/0x460 net/netlink/af_netlink.c:2477
  xfrm_netlink_rcv+0x74/0x90 net/xfrm/xfrm_user.c:2684
  netlink_unicast_kernel net/netlink/af_netlink.c:1302 [inline]
  netlink_unicast+0x809/0x9a0 net/netlink/af_netlink.c:1328
  netlink_sendmsg+0xa70/0xd30 net/netlink/af_netlink.c:1917
  sock_sendmsg_nosec net/socket.c:637 [inline]
  sock_sendmsg net/socket.c:657 [inline]

There is no reproducer; however, the warning can be reproduced
by adding rules with ever smaller prefixes.

The sanity check ("does the policy match the node") uses the prefix value
of the node before it is updated to the smaller value.

To fix this, update the prefix earlier.  The bug has no impact on tree
correctness; this is only to prevent a false warning.

Reported-by: syzbot+8cc27ace5f6972910b31@syzkaller.appspotmail.com
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>

769a807d

03 Jul, 2019 · 1 commit

xfrm: policy: fix bydst hlist corruption on hash rebuild · fd709721

Committed by Florian Westphal on Jul 02, 2019

syzbot reported the following splat:

BUG: KASAN: use-after-free in __write_once_size include/linux/compiler.h:221
BUG: KASAN: use-after-free in hlist_del_rcu include/linux/rculist.h:455
BUG: KASAN: use-after-free in xfrm_hash_rebuild+0xa0d/0x1000 net/xfrm/xfrm_policy.c:1318
Write of size 8 at addr ffff888095e79c00 by task kworker/1:3/8066
Workqueue: events xfrm_hash_rebuild
Call Trace:
 __write_once_size include/linux/compiler.h:221 [inline]
 hlist_del_rcu include/linux/rculist.h:455 [inline]
 xfrm_hash_rebuild+0xa0d/0x1000 net/xfrm/xfrm_policy.c:1318
 process_one_work+0x814/0x1130 kernel/workqueue.c:2269
Allocated by task 8064:
 __kmalloc+0x23c/0x310 mm/slab.c:3669
 kzalloc include/linux/slab.h:742 [inline]
 xfrm_hash_alloc+0x38/0xe0 net/xfrm/xfrm_hash.c:21
 xfrm_policy_init net/xfrm/xfrm_policy.c:4036 [inline]
 xfrm_net_init+0x269/0xd60 net/xfrm/xfrm_policy.c:4120
 ops_init+0x336/0x420 net/core/net_namespace.c:130
 setup_net+0x212/0x690 net/core/net_namespace.c:316

The faulting address is the address of the old chain head,
free'd by xfrm_hash_resize().

In xfrm_hash_rebuild(), chain heads get re-initialized without
any hlist_del_rcu:

 for (i = hmask; i >= 0; i--)
    INIT_HLIST_HEAD(odst + i);

Then, hlist_del_rcu() gets called on the about-to-be-reinserted policy
when iterating the per-net list of policies.

hlist_del_rcu() will then make chain->first be nonzero again:

static inline void __hlist_del(struct hlist_node *n)
{
        struct hlist_node *next = n->next;    // address of next element in list
        struct hlist_node **pprev = n->pprev; // location of previous elem, this
                                              // can point at chain->first
        WRITE_ONCE(*pprev, next);             // chain->first points to next elem
        if (next)
                next->pprev = pprev;
}

Then, when we walk the chain list to find the insertion point, we may find a
non-empty list even though we're supposedly reinserting the first
policy to an empty chain.

To fix this, first unlink all exact and inexact policies instead of
zeroing the list heads.
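
The effect described above can be reproduced with a minimal userspace hlist. This sketch leaves out RCU and WRITE_ONCE, and its head stays allocated, whereas in the reported bug the old head had already been freed. `reset_then_unlink` and `unlink_all` are hypothetical names:

```c
#include <assert.h>
#include <stddef.h>

struct hlist_node { struct hlist_node *next, **pprev; };
struct hlist_head { struct hlist_node *first; };

void hlist_add_head(struct hlist_node *n, struct hlist_head *h)
{
    n->next = h->first;
    if (h->first)
        h->first->pprev = &n->next;
    h->first = n;
    n->pprev = &h->first;
}

void hlist_del(struct hlist_node *n)
{
    struct hlist_node *next = n->next;
    struct hlist_node **pprev = n->pprev;

    assert(pprev != NULL);
    *pprev = next;              /* may write into the chain head */
    if (next)
        next->pprev = pprev;
}

/* Buggy order: zero the head, then unlink a node that still points at it. */
void reset_then_unlink(struct hlist_head *h, struct hlist_node *first)
{
    h->first = NULL;            /* like INIT_HLIST_HEAD() without unlinking */
    hlist_del(first);           /* stale pprev makes h->first non-NULL again */
}

/* Fixed order: unlink every node, so the head ends up empty by itself. */
void unlink_all(struct hlist_head *h)
{
    while (h->first)
        hlist_del(h->first);
}
```

With a chain `head -> a -> b`, calling `reset_then_unlink(&head, &a)` leaves `head.first` pointing at `b`, while `unlink_all(&head)` leaves the chain empty.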

Add the commands equivalent to the syzbot reproducer to xfrm_policy.sh.
Without the fix, KASAN catches the corruption as it happens; SLUB poisoning
detects it a bit later.

Reported-by: syzbot+0165480d4ef07360eeda@syzkaller.appspotmail.com
Fixes: 1548bc4e ("xfrm: policy: delete inexact policies from inexact list on hash rebuild")
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>

fd709721

02 Jul, 2019 · 1 commit

xfrm: remove a duplicated assignment · 52e63a4e

Committed by Cong Wang on Jun 29, 2019

Fixes: 30846090 ("xfrm: policy: add sequence count to sync with hash resize")
Cc: Florian Westphal <fw@strlen.de>
Cc: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>

52e63a4e

openeuler / Kernel · synced successfully more than a year ago