Commits · 56c1c77948ba3576df1c387cefcf3bab93600822 · openeuler / Kernel

07 Dec, 2021 4 commits

ipv6: add net device refcount tracker to struct ip6_tnl · 56c1c779
Authored by Eric Dumazet on Dec 04, 2021
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

sit: add net device refcount tracking to ip_tunnel · c0fd407a

Authored by Eric Dumazet on Dec 04, 2021

Note that other ip_tunnel users do not seem to hold a reference
on tunnel->dev. This probably needs some investigation.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

ipv6: add net device refcount tracker to rt6_probe_deferred() · fb67510b
Authored by Eric Dumazet on Dec 04, 2021
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: dst: add net device refcount tracking to dst_entry · 9038c320

Authored by Eric Dumazet on Dec 04, 2021

We want to track all dev_hold()/dev_put() to ease leak hunting.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

02 Dec, 2021 1 commit

gro: Fix inconsistent indenting · 1ebb87cc

Authored by Jiapeng Chong on Dec 02, 2021

Eliminate the following smatch warning:

net/ipv6/ip6_offload.c:249 ipv6_gro_receive() warn: inconsistent
indenting.
Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

30 Nov, 2021 1 commit

net: ipv6: use the new fib6_nh_release_dsts helper in fib6_nh_release · 61308050

Authored by Nikolay Aleksandrov on Nov 29, 2021

We can remove a bit of code duplication by reusing the new
fib6_nh_release_dsts helper in fib6_nh_release. Their only difference is
that fib6_nh_release's version doesn't use atomic operation to swap the
pointers because it assumes the fib6_nh is no longer visible, while
fib6_nh_release_dsts can be used anywhere.
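
The difference between the two release paths can be sketched in userspace C11. This is a simplified model under stated assumptions, not the kernel code: `demo_dst`, `demo_nh`, `DEMO_NR_CPUS`, and both function names are hypothetical, and `atomic_exchange` stands in for the kernel's atomic pointer swap.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

#define DEMO_NR_CPUS 4  /* hypothetical CPU count for this model */

/* Hypothetical stand-in for a cached per-cpu route entry. */
struct demo_dst {
    int released;
};

/* Hypothetical stand-in for a nexthop holding one cached dst per CPU. */
struct demo_nh {
    _Atomic(struct demo_dst *) pcpu[DEMO_NR_CPUS];
};

/*
 * Usable while the nexthop may still be visible to other threads: each
 * slot is atomically swapped with NULL, so an entry is handed to exactly
 * one releaser. Returns the number of entries released.
 */
static int demo_nh_release_dsts(struct demo_nh *nh)
{
    int count = 0;

    assert(nh != NULL);
    for (int cpu = 0; cpu < DEMO_NR_CPUS; cpu++) {
        struct demo_dst *dst =
            atomic_exchange(&nh->pcpu[cpu], (struct demo_dst *)NULL);

        if (dst) {
            dst->released++;
            count++;
        }
    }
    return count;
}

/* Final teardown reuses the helper instead of duplicating the loop. */
static int demo_nh_release(struct demo_nh *nh)
{
    return demo_nh_release_dsts(nh);
}
```

In this model the teardown path pays for an atomic exchange it does not strictly need, which is the trade-off the message accepts in return for less duplicated code.
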
Suggested-by: David Ahern <dsahern@gmail.com>
Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

29 Nov, 2021 1 commit

ipv6: fix memory leak in fib6_rule_suppress · cdef4852

Authored by msizanoen1 on Nov 23, 2021

The kernel leaks memory when a `fib` rule is present in IPv6 nftables
firewall rules and a suppress_prefix rule is present in the IPv6 routing
rules (used by certain tools such as wg-quick). In such scenarios, every
incoming packet will leak an allocation in `ip6_dst_cache` slab cache.

After some hours of `bpftrace`-ing and source code reading, I tracked
down the issue to ca7a03c4 ("ipv6: do not free rt if
FIB_LOOKUP_NOREF is set on suppress rule").

The problem with that change is that the generic `args->flags` always has
`FIB_LOOKUP_NOREF` set[1][2] but the IPv6-specific flag
`RT6_LOOKUP_F_DST_NOREF` might not be, leading to `fib6_rule_suppress` not
decreasing the refcount when needed.

How to reproduce:
 - Add the following nftables rule to a prerouting chain:
     meta nfproto ipv6 fib saddr . mark . iif oif missing drop
   This can be done with:
     sudo nft create table inet test
     sudo nft create chain inet test test_chain '{ type filter hook prerouting priority filter + 10; policy accept; }'
     sudo nft add rule inet test test_chain meta nfproto ipv6 fib saddr . mark . iif oif missing drop
 - Run:
     sudo ip -6 rule add table main suppress_prefixlength 0
 - Watch `sudo slabtop -o | grep ip6_dst_cache` to see memory usage increase
   with every incoming ipv6 packet.

This patch exposes the protocol-specific flags to the protocol-specific
`suppress` function and checks the protocol-specific `flags`
argument for RT6_LOOKUP_F_DST_NOREF instead of the generic
FIB_LOOKUP_NOREF when decreasing the refcount.
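
The effect of checking the wrong flag word can be modelled in a few lines of userspace C. This is only a sketch: `demo_rt`, both functions, and the `DEMO_` flag values are hypothetical and chosen for illustration, not copied from the kernel sources.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative flag values for this model only. */
#define DEMO_FIB_LOOKUP_NOREF        0x01  /* generic flag, always set */
#define DEMO_RT6_LOOKUP_F_DST_NOREF  0x80  /* IPv6 flag, may be clear */

/* Hypothetical stand-in for a route on which the lookup took a reference. */
struct demo_rt {
    int refcnt;
};

/* Buggy shape: consults the generic flags, so the reference is never dropped. */
static void demo_suppress_generic(struct demo_rt *rt, int generic_flags)
{
    assert(rt != NULL);
    if (!(generic_flags & DEMO_FIB_LOOKUP_NOREF))
        rt->refcnt--;
}

/* Fixed shape: consults the protocol-specific flags handed to the callback. */
static void demo_suppress_proto(struct demo_rt *rt, int proto_flags)
{
    assert(rt != NULL);
    if (!(proto_flags & DEMO_RT6_LOOKUP_F_DST_NOREF))
        rt->refcnt--;
}
```

With the generic flag always set, the first function leaves the reference in place on every call, which corresponds to one leaked `ip6_dst_cache` allocation per packet in the report above.
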

[1]: https://github.com/torvalds/linux/blob/ca7a03c4175366a92cee0ccc4fec0038c3266e26/net/ipv6/fib6_rules.c#L71
[2]: https://github.com/torvalds/linux/blob/ca7a03c4175366a92cee0ccc4fec0038c3266e26/net/ipv6/fib6_rules.c#L99

Link: https://bugzilla.kernel.org/show_bug.cgi?id=215105
Fixes: ca7a03c4 ("ipv6: do not free rt if FIB_LOOKUP_NOREF is set on suppress rule")
Cc: stable@vger.kernel.org
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

25 Nov, 2021 4 commits

net-ipv6: changes to ->tclass (via IPV6_TCLASS) should sk_dst_reset() · 305e95bb

Authored by Maciej Żenczykowski on Nov 23, 2021

This is to match ipv4 behaviour, see __ip_sock_set_tos()
implementation.

Technically for ipv6 this might not be required because normally we
do not allow tclass to influence routing, yet the cli tooling does
support it:

lpk11:~# ip -6 rule add pref 5 tos 45 lookup 5
lpk11:~# ip -6 rule
5:      from all tos 0x45 lookup 5

and in general dscp/tclass based routing does make sense.

We already have cases where dscp can affect vlan priority and/or
transmit queue (especially on wifi).

So let's just make things match.  Easier to reason about and no harm.

Cc: Eric Dumazet <edumazet@google.com>
Cc: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Maciej Żenczykowski <maze@google.com>
Link: https://lore.kernel.org/r/20211123223208.1117871-1-zenczykowski@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net-ipv6: do not allow IPV6_TCLASS to muck with tcp's ECN · 9f7b3a69

Authored by Maciej Żenczykowski on Nov 23, 2021

This is to match ipv4 behaviour, see __ip_sock_set_tos()
implementation at ipv4/ip_sockglue.c:579

void __ip_sock_set_tos(struct sock *sk, int val)
{
  if (sk->sk_type == SOCK_STREAM) {
    val &= ~INET_ECN_MASK;
    val |= inet_sk(sk)->tos & INET_ECN_MASK;
  }
  if (inet_sk(sk)->tos != val) {
    inet_sk(sk)->tos = val;
    sk->sk_priority = rt_tos2priority(val);
    sk_dst_reset(sk);
  }
}
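
The masking step in the function quoted above can be tried in isolation. The following is a userspace sketch, assuming the ECN field is the two low-order bits of the TOS/traffic-class byte; `demo_apply_tclass` and `DEMO_ECN_MASK` are hypothetical names, not kernel symbols.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define DEMO_ECN_MASK 0x3u  /* two low-order bits of the TOS/tclass byte */

/*
 * For stream sockets, take the upper six bits from the requested value
 * and keep the ECN bits the socket already has. Other socket types get
 * the requested value unchanged.
 */
static uint8_t demo_apply_tclass(bool is_stream, uint8_t current,
                                 uint8_t requested)
{
    if (is_stream) {
        requested &= (uint8_t)~DEMO_ECN_MASK;
        requested |= current & DEMO_ECN_MASK;
    }
    return requested;
}
```

For example, a stream socket whose current value carries ECN bits `0x02` and that requests `0x45` ends up with `0x46`: the upper bits come from the request, the ECN bits from the socket.
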

Cc: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Maciej Żenczykowski <maze@google.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/r/20211123223154.1117794-1-zenczykowski@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

gro: remove rcu_read_lock/rcu_read_unlock from gro_complete handlers · 627b94f7

Authored by Eric Dumazet on Nov 23, 2021

All gro_complete() handlers are called from napi_gro_complete()
while rcu_read_lock() has been called.

There is no point in stacking more rcu_read_lock() calls.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

gro: remove rcu_read_lock/rcu_read_unlock from gro_receive handlers · fc1ca334

Authored by Eric Dumazet on Nov 23, 2021

All gro_receive() handlers are called from dev_gro_receive()
while rcu_read_lock() has been called.

There is no point in stacking more rcu_read_lock() calls.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

22 Nov, 2021 2 commits

net: ipv6: add fib6_nh_release_dsts stub · 8837cbbf

Authored by Nikolay Aleksandrov on Nov 22, 2021

We need a way to release a fib6_nh's per-cpu dsts when replacing
nexthops; otherwise we can end up with stale per-cpu dsts that hold net
device references. So add a new IPv6 stub called fib6_nh_release_dsts.
It must be used after an RCU grace period, so no new dsts can be created
through a group's nexthop entry.
Similar to fib6_nh_release it shouldn't be used if fib6_nh_init has failed
so it doesn't need a dummy stub when IPv6 is not enabled.

Fixes: 7bf4796d ("nexthops: add support for replace")
Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ipv6: fix typos in __ip6_finish_output() · 19d36c5f

Authored by Eric Dumazet on Nov 18, 2021

We deal with IPv6 packets, so we need to use IP6CB(skb)->flags and
IP6SKB_REROUTED, instead of IPCB(skb)->flags and IPSKB_REROUTED

Found by code inspection, please double check that fixing this bug
does not surface other bugs.

Fixes: 09ee9dba ("ipv6: Reinject IPv6 packets if IPsec policy matches after SNAT")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Tobias Brunner <tobias@strongswan.org>
Cc: Steffen Klassert <steffen.klassert@secunet.com>
Cc: David Ahern <dsahern@kernel.org>
Reviewed-by: David Ahern <dsahern@kernel.org>
Tested-by: Tobias Brunner <tobias@strongswan.org>
Acked-by: Tobias Brunner <tobias@strongswan.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

19 Nov, 2021 1 commit

ipv6: Use memset_after() to zero rt6_info · 8f2a83b4

Authored by Kees Cook on Nov 18, 2021

In preparation for FORTIFY_SOURCE performing compile-time and run-time
field bounds checking for memset(), avoid intentionally writing across
neighboring fields.

Use memset_after() to clear everything after the dst_entry member of
struct rt6_info.
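
The idea of clearing only what follows a given member can be sketched in userspace C with `offsetof`. This is an illustration, not the kernel helper itself: `demo_dst_entry`, `demo_rt6_info`, and `demo_rt6_info_init` are hypothetical stand-ins, and the start offset is computed by hand.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for struct dst_entry and struct rt6_info. */
struct demo_dst_entry {
    int refcnt;
    void *dev;
};

struct demo_rt6_info {
    struct demo_dst_entry dst;  /* leading member that must be preserved */
    int flags;
    int prefix_len;
};

/* Zero every byte that follows `dst`, leaving `dst` itself untouched. */
static void demo_rt6_info_init(struct demo_rt6_info *rt)
{
    size_t start;

    assert(rt != NULL);
    start = offsetof(struct demo_rt6_info, dst) + sizeof(rt->dst);
    memset((unsigned char *)rt + start, 0, sizeof(*rt) - start);
}
```

Stating the range in terms of the member name keeps the write inside the bounds of the enclosing object, rather than starting a `memset()` at one field and deliberately running into its neighbours.
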
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

18 Nov, 2021 3 commits

ipv6: ah6: use swap() to make code cleaner · 4cdf85ef

Authored by Yao Jing on Nov 18, 2021

Use the macro 'swap()' defined in 'include/linux/minmax.h' to avoid
open-coding it.
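
For readers unfamiliar with the pattern, the change is of the following kind. This is a userspace sketch: `demo_swap` is a hypothetical macro that relies on the GNU C `__typeof__` extension (GCC and Clang), and the two functions only contrast the open-coded and macro forms.

```c
#include <assert.h>
#include <stddef.h>

/* Type-generic swap; relies on the GNU C __typeof__ extension. */
#define demo_swap(a, b) \
    do { __typeof__(a) tmp_ = (a); (a) = (b); (b) = tmp_; } while (0)

/* Open-coded form: three statements and an explicit temporary. */
static void demo_swap_open_coded(int *x, int *y)
{
    int tmp = *x;

    *x = *y;
    *y = tmp;
}

/* Same effect expressed with the macro. */
static void demo_swap_with_macro(int *x, int *y)
{
    assert(x != NULL && y != NULL);
    demo_swap(*x, *y);
}
```
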
Reported-by: Zeal Robot <zealci@zte.com.cn>
Signed-off-by: Yao Jing <yao.jing2@zte.com.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>

ipv6: check return value of ipv6_skip_exthdr · 5f9c55c8

Authored by Jordy Zomer on Nov 17, 2021

The offset value is used in pointer math on skb->data.
Since ipv6_skip_exthdr() may return -1, the uh and th pointers
may not point to the actual UDP and TCP headers, and accesses through
them could potentially overwrite other data. This is why I think the
return value should be checked.
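
The check being argued for has roughly the following shape. This is a userspace sketch with a hypothetical helper, `demo_transport_header`; the `offset` argument models the return value of a header-skipping routine that reports failure as a negative number, and the additional upper-bound check is an assumption of this example rather than part of the patch.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Turn a header offset into a pointer only when the offset is usable.
 * Returns NULL instead of performing pointer math on a bad offset.
 */
static const unsigned char *demo_transport_header(const unsigned char *data,
                                                  size_t len, int offset)
{
    assert(data != NULL);

    if (offset < 0)
        return NULL;        /* skipping failed: do not touch the pointer */
    if ((size_t)offset >= len)
        return NULL;        /* offset lies outside the buffer */
    return data + offset;
}
```

A caller then bails out on `NULL` before dereferencing anything, so a failed skip can no longer yield a pointer just before the start of the buffer.
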

EDIT:  added {}'s, thanks Kees
Signed-off-by: Jordy Zomer <jordy@pwning.systems>
Signed-off-by: David S. Miller <davem@davemloft.net>

ipv4/raw: support binding to nonlocal addresses · 8ff978b8

Authored by Riccardo Paolo Bestetti on Nov 17, 2021

Add support to inet v4 raw sockets for binding to nonlocal addresses
through the IP_FREEBIND and IP_TRANSPARENT socket options, as well as
the ipv4.ip_nonlocal_bind kernel parameter.

Add a helper function to inet_sock.h to check for bind address validity on
the basis of the address type and whether nonlocal addresses are enabled
for the socket via any of the sockopts/sysctl, deduplicating checks in
ipv4/ping.c, ipv4/af_inet.c, ipv6/af_inet6.c (for mapped v4->v6
addresses), and ipv4/raw.c.

Add test cases with IP[V6]_FREEBIND verifying that both v4 and v6 raw
sockets support binding to nonlocal addresses after the change. Add
necessary support for the test cases to nettest.
Signed-off-by: Riccardo Paolo Bestetti <pbl@bestov.io>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://lore.kernel.org/r/20211117090010.125393-1-pbl@bestov.io
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

17 Nov, 2021 1 commit

net: align static siphash keys · 49ecc2e9