提交 · 178c49d9f9a4b5ade00c93480d714708fe971e24 · openeuler / Kernel

25 7月, 2020 7 次提交

net: pass a sockptr_t into ->setsockopt · a7b75c5a

由 Christoph Hellwig 提交于 7月 23, 2020

Rework the remaining setsockopt code to pass a sockptr_t instead of a
plain user pointer.  This removes the last remaining set_fs(KERNEL_DS)
outside of architecture specific code.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Acked-by: Stefan Schmidt <stefan@datenfreihafen.org> [ieee802154]
Acked-by: NMatthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

a7b75c5a

net/ipv6: switch do_ipv6_setsockopt to sockptr_t · 894cfbc0

由 Christoph Hellwig 提交于 7月 23, 2020

Pass a sockptr_t to prepare for set_fs-less handling of the kernel
pointer from bpf-cgroup.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

894cfbc0

net/ipv6: factor out a ipv6_set_opt_hdr helper · b84d2b73

由 Christoph Hellwig 提交于 7月 23, 2020

Factour out a helper to set the IPv6 option headers from
do_ipv6_setsockopt.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

b84d2b73

net/ipv6: switch ipv6_flowlabel_opt to sockptr_t · 86298285

由 Christoph Hellwig 提交于 7月 23, 2020

Pass a sockptr_t to prepare for set_fs-less handling of the kernel
pointer from bpf-cgroup.

Note that the get case is pretty weird in that it actually copies data
back to userspace from setsockopt.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

86298285

net/ipv6: switch ip6_mroute_setsockopt to sockptr_t · b43c6153

由 Christoph Hellwig 提交于 7月 23, 2020

Pass a sockptr_t to prepare for set_fs-less handling of the kernel
pointer from bpf-cgroup.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

b43c6153

netfilter: switch nf_setsockopt to sockptr_t · c2f12630

由 Christoph Hellwig 提交于 7月 23, 2020

Pass a sockptr_t to prepare for set_fs-less handling of the kernel
pointer from bpf-cgroup.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

c2f12630

net/xfrm: switch xfrm_user_policy to sockptr_t · c6d1b26a

由 Christoph Hellwig 提交于 7月 23, 2020

Pass a sockptr_t to prepare for set_fs-less handling of the kernel
pointer from bpf-cgroup.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

c6d1b26a

20 7月, 2020 6 次提交

net/ipv6: remove compat_ipv6_{get,set}sockopt · 3021ad52

由 Christoph Hellwig 提交于 7月 17, 2020

Handle the few cases that need special treatment in-line using
in_compat_syscall().  This also removes all the now unused
compat_{get,set}sockopt methods.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

3021ad52

net/ipv6: factor out mcast join/leave setsockopt helpers · fdf5bdd8

由 Christoph Hellwig 提交于 7月 17, 2020

Factor out one helper each for setting the native and compat
version of the MCAST_MSFILTER option.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

fdf5bdd8

net/ipv6: factor out MCAST_MSFILTER setsockopt helpers · ca0e65eb

由 Christoph Hellwig 提交于 7月 17, 2020

Factor out one helper each for setting the native and compat
version of the MCAST_MSFILTER option.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

ca0e65eb

net/ipv6: factor out MCAST_MSFILTER getsockopt helpers · d5541e85

由 Christoph Hellwig 提交于 7月 17, 2020

Factor out one helper each for getting the native and compat
version of the MCAST_MSFILTER option.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

d5541e85

netfilter: remove the compat_{get,set} methods · 77d4df41

由 Christoph Hellwig 提交于 7月 17, 2020

All instances handle compat sockopts via in_compat_syscall() now, so
remove the compat_{get,set} methods as well as the
compat_nf_{get,set}sockopt wrappers.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

77d4df41

net: remove compat_sock_common_{get,set}sockopt · 8c918ffb

由 Christoph Hellwig 提交于 7月 17, 2020

Add the compat handling to sock_common_{get,set}sockopt instead,
keyed of in_compat_syscall().  This allow to remove the now unused
->compat_{get,set}sockopt methods from struct proto_ops.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Acked-by: NMatthieu Baerts <matthieu.baerts@tessares.net>
Acked-by: NStefan Schmidt <stefan@datenfreihafen.org>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

8c918ffb

05 6月, 2020 1 次提交

seg6: fix seg6_validate_srh() to avoid slab-out-of-bounds · bb986a50

由 Ahmed Abdelsalam 提交于 6月 03, 2020

The seg6_validate_srh() is used to validate SRH for three cases:

case1: SRH of data-plane SRv6 packets to be processed by the Linux kernel.
Case2: SRH of the netlink message received  from user-space (iproute2)
Case3: SRH injected into packets through setsockopt

In case1, the SRH can be encoded in the Reduced way (i.e., first SID is
carried in DA only and not represented as SID in the SRH) and the
seg6_validate_srh() now handles this case correctly.

In case2 and case3, the SRH shouldn’t be encoded in the Reduced way
otherwise we lose the first segment (i.e., the first hop).

The current implementation of the seg6_validate_srh() allow SRH of case2
and case3 to be encoded in the Reduced way. This leads a slab-out-of-bounds
problem.

This patch verifies SRH of case1, case2 and case3. Allowing case1 to be
reduced while preventing SRH of case2 and case3 from being reduced .

Reported-by: syzbot+e8c028b62439eac42073@syzkaller.appspotmail.com
Reported-by: NYueHaibing <yuehaibing@huawei.com>
Fixes: 0cb7498f ("seg6: fix SRH processing to comply with RFC8754")
Signed-off-by: NAhmed Abdelsalam <ahabdels@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

bb986a50

02 6月, 2020 1 次提交

ipv6: fix IPV6_ADDRFORM operation logic · 79a1f0cc

由 Hangbin Liu 提交于 6月 01, 2020

Socket option IPV6_ADDRFORM supports UDP/UDPLITE and TCP at present.
Previously the checking logic looks like:
if (sk->sk_protocol == IPPROTO_UDP || sk->sk_protocol == IPPROTO_UDPLITE)
	do_some_check;
else if (sk->sk_protocol != IPPROTO_TCP)
	break;

After commit b6f61189 ("ipv6: restrict IPV6_ADDRFORM operation"), TCP
was blocked as the logic changed to:
if (sk->sk_protocol == IPPROTO_UDP || sk->sk_protocol == IPPROTO_UDPLITE)
	do_some_check;
else if (sk->sk_protocol == IPPROTO_TCP)
	do_some_check;
	break;
else
	break;

Then after commit 82c9ae44 ("ipv6: fix restrict IPV6_ADDRFORM operation")
UDP/UDPLITE were blocked as the logic changed to:
if (sk->sk_protocol == IPPROTO_UDP || sk->sk_protocol == IPPROTO_UDPLITE)
	do_some_check;
if (sk->sk_protocol == IPPROTO_TCP)
	do_some_check;

if (sk->sk_protocol != IPPROTO_TCP)
	break;

Fix it by using Eric's code and simply remove the break in TCP check, which
looks like:
if (sk->sk_protocol == IPPROTO_UDP || sk->sk_protocol == IPPROTO_UDPLITE)
	do_some_check;
else if (sk->sk_protocol == IPPROTO_TCP)
	do_some_check;
else
	break;

Fixes: 82c9ae44 ("ipv6: fix restrict IPV6_ADDRFORM operation")
Signed-off-by: NHangbin Liu <liuhangbin@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

79a1f0cc

29 5月, 2020 1 次提交

ipv6: add ip6_sock_set_addr_preferences · 18d5ad62

由 Christoph Hellwig 提交于 5月 28, 2020

Add a helper to directly set the IPV6_ADD_PREFERENCES sockopt from kernel
space without going through a fake uaccess.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

18d5ad62

21 5月, 2020 8 次提交

handle the group_source_req options directly · b212c322

由 Al Viro 提交于 4月 27, 2020

Native ->setsockopt() handling of these options (MCAST_..._SOURCE_GROUP
and MCAST_{,UN}BLOCK_SOURCE) consists of copyin + call of a helper that
does the actual work. The only change needed for ->compat_setsockopt()
is a slightly different copyin - the helpers can be reused as-is.
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>

b212c322

A
ipv6: take handling of group_source_req options into a helper · fcfa0b09
由 Al Viro 提交于 4月 27, 2020
```
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
```
fcfa0b09

ipv[46]: do compat setsockopt for MCAST_{JOIN,LEAVE}_GROUP directly · 2f984f11

由 Al Viro 提交于 4月 26, 2020

direct parallel to the way these two are handled in the native
->setsockopt() instances - the helpers that do the real work
are already separated and can be reused as-is in this case.
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>

2f984f11

ipv6: do compat setsockopt for MCAST_MSFILTER directly · 168a2cca

由 Al Viro 提交于 3月 30, 2020

similar to the ipv4 counterpart of that patch - the same
trick used to align the tail array properly.
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>

168a2cca

A
ip6_mc_msfilter(): pass the address list separately · d59eb177
由 Al Viro 提交于 3月 30, 2020
```
that way we'll be able to reuse it for compat case
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
```
d59eb177

get rid of compat_mc_getsockopt() · 0dfe6581

由 Al Viro 提交于 3月 29, 2020

now we can do MCAST_MSFILTER in compat ->getsockopt() without
playing silly buggers with copying things back and forth.
We can form a native struct group_filter (sans the variable-length
tail) on stack, pass that + pointer to the tail of original request
to the helper doing the bulk of the work, then do the rest of
copyout - same as the native getsockopt() does.
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>

0dfe6581

ip*_mc_gsfget(): lift copyout of struct group_filter into callers · 931ca7ab

由 Al Viro 提交于 3月 29, 2020

pass the userland pointer to the array in its tail, so that part
gets copied out by our functions; copyout of everything else is
done in the callers.  Rationale: reuse for compat; the array
is the same in native and compat, the layout of parts before it
is different for compat.
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>

931ca7ab

compat_ip{,v6}_setsockopt(): enumerate MCAST_... options explicitly · e9c375fb

由 Al Viro 提交于 5月 09, 2020

We want to check if optname is among the MCAST_... ones; do that as
an explicit switch.
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>

e9c375fb

14 5月, 2020 1 次提交

ipv6: set msg_control_is_user in do_ipv6_getsockopt · 1b2f08df

由 Christoph Hellwig 提交于 5月 13, 2020

While do_ipv6_getsockopt does not call the high-level recvmsg helper,
the msghdr eventually ends up being passed to put_cmsg anyway, and thus
needs msg_control_is_user set to the proper value.

Fixes: 1f466e1f ("net: cleanly handle kernel vs user buffers for ->msg_control")
Reported-by: NEric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

1b2f08df

21 4月, 2020 1 次提交

ipv6: fix restrict IPV6_ADDRFORM operation · 82c9ae44

由 John Haxby 提交于 4月 18, 2020

Commit b6f61189 ("ipv6: restrict IPV6_ADDRFORM operation") fixed a
problem found by syzbot an unfortunate logic error meant that it
also broke IPV6_ADDRFORM.

Rearrange the checks so that the earlier test is just one of the series
of checks made before moving the socket from IPv6 to IPv4.

Fixes: b6f61189 ("ipv6: restrict IPV6_ADDRFORM operation")
Signed-off-by: NJohn Haxby <john.haxby@oracle.com>
Cc: stable@vger.kernel.org
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

82c9ae44

27 2月, 2020 1 次提交

ipv6: restrict IPV6_ADDRFORM operation · b6f61189

由 Eric Dumazet 提交于 2月 25, 2020

IPV6_ADDRFORM is able to transform IPv6 socket to IPv4 one.
While this operation sounds illogical, we have to support it.

One of the things it does for TCP socket is to switch sk->sk_prot
to tcp_prot.

We now have other layers playing with sk->sk_prot, so we should make
sure to not interfere with them.

This patch makes sure sk_prot is the default pointer for TCP IPv6 socket.

syzbot reported :
BUG: kernel NULL pointer dereference, address: 0000000000000000
PGD a0113067 P4D a0113067 PUD a8771067 PMD 0
Oops: 0010 [#1] PREEMPT SMP KASAN
CPU: 0 PID: 10686 Comm: syz-executor.0 Not tainted 5.6.0-rc2-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
RIP: 0010:0x0
Code: Bad RIP value.
RSP: 0018:ffffc9000281fce0 EFLAGS: 00010246
RAX: 1ffffffff15f48ac RBX: ffffffff8afa4560 RCX: dffffc0000000000
RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff8880a69a8f40
RBP: ffffc9000281fd10 R08: ffffffff86ed9b0c R09: ffffed1014d351f5
R10: ffffed1014d351f5 R11: 0000000000000000 R12: ffff8880920d3098
R13: 1ffff1101241a613 R14: ffff8880a69a8f40 R15: 0000000000000000
FS:  00007f2ae75db700(0000) GS:ffff8880aea00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffffffffffffffd6 CR3: 00000000a3b85000 CR4: 00000000001406f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 inet_release+0x165/0x1c0 net/ipv4/af_inet.c:427
 __sock_release net/socket.c:605 [inline]
 sock_close+0xe1/0x260 net/socket.c:1283
 __fput+0x2e4/0x740 fs/file_table.c:280
 ____fput+0x15/0x20 fs/file_table.c:313
 task_work_run+0x176/0x1b0 kernel/task_work.c:113
 tracehook_notify_resume include/linux/tracehook.h:188 [inline]
 exit_to_usermode_loop arch/x86/entry/common.c:164 [inline]
 prepare_exit_to_usermode+0x480/0x5b0 arch/x86/entry/common.c:195
 syscall_return_slowpath+0x113/0x4a0 arch/x86/entry/common.c:278
 do_syscall_64+0x11f/0x1c0 arch/x86/entry/common.c:304
 entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x45c429
Code: ad b6 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 7b b6 fb ff c3 66 2e 0f 1f 84 00 00 00 00
RSP: 002b:00007f2ae75dac78 EFLAGS: 00000246 ORIG_RAX: 0000000000000036
RAX: 0000000000000000 RBX: 00007f2ae75db6d4 RCX: 000000000045c429
RDX: 0000000000000001 RSI: 000000000000011a RDI: 0000000000000004
RBP: 000000000076bf20 R08: 0000000000000038 R09: 0000000000000000
R10: 0000000020000180 R11: 0000000000000246 R12: 00000000ffffffff
R13: 0000000000000a9d R14: 00000000004ccfb4 R15: 000000000076bf2c
Modules linked in:
CR2: 0000000000000000
---[ end trace 82567b5207e87bae ]---
RIP: 0010:0x0
Code: Bad RIP value.
RSP: 0018:ffffc9000281fce0 EFLAGS: 00010246
RAX: 1ffffffff15f48ac RBX: ffffffff8afa4560 RCX: dffffc0000000000
RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff8880a69a8f40
RBP: ffffc9000281fd10 R08: ffffffff86ed9b0c R09: ffffed1014d351f5
R10: ffffed1014d351f5 R11: 0000000000000000 R12: ffff8880920d3098
R13: 1ffff1101241a613 R14: ffff8880a69a8f40 R15: 0000000000000000
FS:  00007f2ae75db700(0000) GS:ffff8880aea00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffffffffffffffd6 CR3: 00000000a3b85000 CR4: 00000000001406f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400

Fixes: 604326b4 ("bpf, sockmap: convert to generic sk_msg interface")
Signed-off-by: NEric Dumazet <edumazet@google.com>
Reported-by: syzbot+1938db17e275e85dc328@syzkaller.appspotmail.com
Cc: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

b6f61189

22 11月, 2019 1 次提交

net-ipv6: IPV6_TRANSPARENT - check NET_RAW prior to NET_ADMIN · 35fc59c9

由 Maciej Żenczykowski 提交于 11月 21, 2019

NET_RAW is less dangerous, so more likely to be available to a process,
so check it first to prevent some spurious logging.

This matches IP_TRANSPARENT which checks NET_RAW first.
Signed-off-by: NMaciej Żenczykowski <maze@google.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

35fc59c9

31 5月, 2019 1 次提交

treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 152 · 2874c5fd

由 Thomas Gleixner 提交于 5月 27, 2019

Based on 1 normalized pattern(s):

  this program is free software you can redistribute it and or modify
  it under the terms of the gnu general public license as published by
  the free software foundation either version 2 of the license or at
  your option any later version

extracted by the scancode license scanner the SPDX license identifier

  GPL-2.0-or-later

has been chosen to replace the boilerplate/reference in 3029 file(s).
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
Reviewed-by: NAllison Randal <allison@lohutok.net>
Cc: linux-spdx@vger.kernel.org
Link: https://lkml.kernel.org/r/20190527070032.746973796@linutronix.deSigned-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>

2874c5fd

26 5月, 2019 1 次提交

ipv6_sockglue: Fix a missing-check bug in ip6_ra_control() · 95baa60a

由 Gen Zhang 提交于 5月 24, 2019

In function ip6_ra_control(), the pointer new_ra is allocated a memory
space via kmalloc(). And it is used in the following codes. However,
when there is a memory allocation error, kmalloc() fails. Thus null
pointer dereference may happen. And it will cause the kernel to crash.
Therefore, we should check the return value and handle the error.
Signed-off-by: NGen Zhang <blackgod016574@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

95baa60a

04 3月, 2019 1 次提交

net: ipv6: add socket option IPV6_ROUTER_ALERT_ISOLATE · 9036b2fe

由 Francesco Ruggeri 提交于 3月 01, 2019

By default IPv6 socket with IPV6_ROUTER_ALERT socket option set will
receive all IPv6 RA packets from all namespaces.
IPV6_ROUTER_ALERT_ISOLATE socket option restricts packets received by
the socket to be only from the socket's namespace.
Signed-off-by: NMaxim Martynov <maxim@arista.com>
Signed-off-by: NFrancesco Ruggeri <fruggeri@arista.com>
Reviewed-by: NDavid Ahern <dsahern@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

9036b2fe

08 11月, 2018 1 次提交

ipv6: allow ping to link-local address in VRF · d839a0eb

由 Mike Manning 提交于 11月 07, 2018

If link-local packets are marked as enslaved to a VRF, then to allow
ping to the link-local from a vrf, the error handling for IPV6_PKTINFO
needs to be relaxed to also allow the pkt ipi6_ifindex to be that of a
slave device to the vrf.

Note that the real device also needs to be retrieved in icmp6_iif()
to set the ipv6 flow oif to this for icmp echo reply handling. The
recent commit 24b711ed ("net/ipv6: Fix linklocal to global address
with VRF") takes care of this, so the sdif does not need checking here.

This fix makes ping to link-local consistent with that to global
addresses, in that this can now be done from within the same VRF that
the address is in.
Signed-off-by: NMike Manning <mmanning@vyatta.att-mail.com>
Reviewed-by: NDavid Ahern <dsahern@gmail.com>
Tested-by: NDavid Ahern <dsahern@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

d839a0eb

13 9月, 2018 1 次提交

ipv6: Add sockopt IPV6_MULTICAST_ALL analogue to IP_MULTICAST_ALL · 15033f04

由 Andre Naujoks 提交于 9月 10, 2018

The socket option will be enabled by default to ensure current behaviour
is not changed. This is the same for the IPv4 version.

A socket bound to in6addr_any and a specific port will receive all traffic
on that port. Analogue to IP_MULTICAST_ALL, disable this behaviour, if
one or more multicast groups were joined (using said socket) and only
pass on multicast traffic from groups, which were explicitly joined via
this socket.

Without this option disabled a socket (system even) joined to multiple
multicast groups is very hard to get right. Filtering by destination
address has to take place in user space to avoid receiving multicast
traffic from other multicast groups, which might have traffic on the same
port.

The extension of the IP_MULTICAST_ALL socketoption to just apply to ipv6,
too, is not done to avoid changing the behaviour of current applications.
Signed-off-by: NAndre Naujoks <nautsch2@gmail.com>
Acked-By: NYOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

15033f04

17 7月, 2018 1 次提交

ipv6/mcast: init as INCLUDE when join SSM INCLUDE group · c7ea20c9

由 Hangbin Liu 提交于 7月 10, 2018

This an IPv6 version patch of "ipv4/igmp: init group mode as INCLUDE when
join source group". From RFC3810, part 6.1:

If no per-interface state existed for that
multicast address before the change (i.e., the change consisted of
creating a new per-interface record), or if no state exists after the
change (i.e., the change consisted of deleting a per-interface
record), then the "non-existent" state is considered to have an
INCLUDE filter mode and an empty source list.

Which means a new multicast group should start with state IN(). Currently,
for MLDv2 SSM JOIN_SOURCE_GROUP mode, we first call ipv6_sock_mc_join(),
then ip6_mc_source(), which will trigger a TO_IN() message instead of
ALLOW().

The issue was exposed by commit a052517a ("net/multicast: should not
send source list records when have filter mode change"). Before this change,
we sent both ALLOW(A) and TO_IN(A). Now, we only send TO_IN(A).

Fix it by adding a new parameter to init group mode. Also add some wrapper
functions to avoid changing too much code.

v1 -> v2:
In the first version I only cleared the group change record. But this is not
enough. Because when a new group join, it will init as EXCLUDE and trigger
a filter mode change in ip/ip6_mc_add_src(), which will clear all source
addresses sf_crcount. This will prevent early joined address sending state
change records if multi source addressed joined at the same time.

In v2 patch, I fixed it by directly initializing the mode to INCLUDE for SSM
JOIN_SOURCE_GROUP. I also split the original patch into two separated patches
for IPv4 and IPv6.

There is also a difference between v4 and v6 version. For IPv6, when the
interface goes down and up, we will send correct state change record with
unspecified IPv6 address (::) with function ipv6_mc_up(). But after DAD is
completed, we resend the change record TO_IN() in mld_send_initial_cr().
Fix it by sending ALLOW() for INCLUDE mode in mld_send_initial_cr().

Fixes: a052517a ("net/multicast: should not send source list records when have filter mode change")
Reviewed-by: NStefano Brivio <sbrivio@redhat.com>
Signed-off-by: NHangbin Liu <liuhangbin@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

c7ea20c9

07 7月, 2018 1 次提交

ipv6: fold sockcm_cookie into ipcm6_cookie · 5fdaa88d

由 Willem de Bruijn 提交于 7月 06, 2018

ipcm_cookie includes sockcm_cookie. Do the same for ipcm6_cookie.

This reduces the number of arguments that need to be passed around,
applies ipcm6_init to all cookie fields at once and reduces code
differentiation between ipv4 and ipv6.
Signed-off-by: NWillem de Bruijn <willemb@google.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

5fdaa88d

05 7月, 2018 1 次提交

ipv6: make ipv6_renew_options() interrupt/kernel safe · a9ba23d4

由 Paul Moore 提交于 7月 04, 2018

At present the ipv6_renew_options_kern() function ends up calling into
access_ok() which is problematic if done from inside an interrupt as
access_ok() calls WARN_ON_IN_IRQ() on some (all?) architectures
(x86-64 is affected).  Example warning/backtrace is shown below:

 WARNING: CPU: 1 PID: 3144 at lib/usercopy.c:11 _copy_from_user+0x85/0x90
 ...
 Call Trace:
  <IRQ>
  ipv6_renew_option+0xb2/0xf0
  ipv6_renew_options+0x26a/0x340
  ipv6_renew_options_kern+0x2c/0x40
  calipso_req_setattr+0x72/0xe0
  netlbl_req_setattr+0x126/0x1b0
  selinux_netlbl_inet_conn_request+0x80/0x100
  selinux_inet_conn_request+0x6d/0xb0
  security_inet_conn_request+0x32/0x50
  tcp_conn_request+0x35f/0xe00
  ? __lock_acquire+0x250/0x16c0
  ? selinux_socket_sock_rcv_skb+0x1ae/0x210
  ? tcp_rcv_state_process+0x289/0x106b
  tcp_rcv_state_process+0x289/0x106b
  ? tcp_v6_do_rcv+0x1a7/0x3c0
  tcp_v6_do_rcv+0x1a7/0x3c0
  tcp_v6_rcv+0xc82/0xcf0
  ip6_input_finish+0x10d/0x690
  ip6_input+0x45/0x1e0
  ? ip6_rcv_finish+0x1d0/0x1d0
  ipv6_rcv+0x32b/0x880
  ? ip6_make_skb+0x1e0/0x1e0
  __netif_receive_skb_core+0x6f2/0xdf0
  ? process_backlog+0x85/0x250
  ? process_backlog+0x85/0x250
  ? process_backlog+0xec/0x250
  process_backlog+0xec/0x250
  net_rx_action+0x153/0x480
  __do_softirq+0xd9/0x4f7
  do_softirq_own_stack+0x2a/0x40
  </IRQ>
  ...

While not present in the backtrace, ipv6_renew_option() ends up calling
access_ok() via the following chain:

  access_ok()
  _copy_from_user()
  copy_from_user()
  ipv6_renew_option()

The fix presented in this patch is to perform the userspace copy
earlier in the call chain such that it is only called when the option
data is actually coming from userspace; that place is
do_ipv6_setsockopt().  Not only does this solve the problem seen in
the backtrace above, it also allows us to simplify the code quite a
bit by removing ipv6_renew_options_kern() completely.  We also take
this opportunity to cleanup ipv6_renew_options()/ipv6_renew_option()
a small amount as well.

This patch is heavily based on a rough patch by Al Viro.  I've taken
his original patch, converted a kmemdup() call in do_ipv6_setsockopt()
to a memdup_user() call, made better use of the e_inval jump target in
the same function, and cleaned up the use ipv6_renew_option() by
ipv6_renew_options().

CC: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: NPaul Moore <paul@paul-moore.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

a9ba23d4

01 3月, 2018 1 次提交

inet: whitespace cleanup · 82695b30

由 Stephen Hemminger 提交于 2月 27, 2018

Ran simple script to find/remove trailing whitespace and blank lines
at EOF because that kind of stuff git whines about and editors leave
behind.
Signed-off-by: NStephen Hemminger <stephen@networkplumber.org>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

82695b30

15 2月, 2018 1 次提交

netfilter: drop outermost socket lock in getsockopt() · 01ea306f

由 Paolo Abeni 提交于 2月 08, 2018

The Syzbot reported a possible deadlock in the netfilter area caused by
rtnl lock, xt lock and socket lock being acquired with a different order
on different code paths, leading to the following backtrace:
Reviewed-by: NXin Long <lucien.xin@gmail.com>

======================================================
WARNING: possible circular locking dependency detected
4.15.0+ #301 Not tainted
------------------------------------------------------
syzkaller233489/4179 is trying to acquire lock:
  (rtnl_mutex){+.+.}, at: [<0000000048e996fd>] rtnl_lock+0x17/0x20
net/core/rtnetlink.c:74

but task is already holding lock:
  (&xt[i].mutex){+.+.}, at: [<00000000328553a2>]
xt_find_table_lock+0x3e/0x3e0 net/netfilter/x_tables.c:1041

which lock already depends on the new lock.
===

Since commit 3f34cfae1230 ("netfilter: on sockopt() acquire sock lock
only in the required scope"), we already acquire the socket lock in
the innermost scope, where needed. In such commit I forgot to remove
the outer-most socket lock from the getsockopt() path, this commit
addresses the issues dropping it now.

v1 -> v2: fix bad subj, added relavant 'fixes' tag

Fixes: 22265a5c ("netfilter: xt_TEE: resolve oif using netdevice notifiers")
Fixes: 202f59af ("netfilter: ipt_CLUSTERIP: do not hold dev")
Fixes: 3f34cfae1230 ("netfilter: on sockopt() acquire sock lock only in the required scope")
Reported-by: syzbot+ddde1c7b7ff7442d7f2d@syzkaller.appspotmail.com
Suggested-by: NFlorian Westphal <fw@strlen.de>
Signed-off-by: NPaolo Abeni <pabeni@redhat.com>
Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>

01ea306f

31 1月, 2018 1 次提交

netfilter: on sockopt() acquire sock lock only in the required scope · 3f34cfae

由 Paolo Abeni 提交于 1月 30, 2018

Syzbot reported several deadlocks in the netfilter area caused by
rtnl lock and socket lock being acquired with a different order on
different code paths, leading to backtraces like the following one:

======================================================
WARNING: possible circular locking dependency detected
4.15.0-rc9+ #212 Not tainted
------------------------------------------------------
syzkaller041579/3682 is trying to acquire lock:
  (sk_lock-AF_INET6){+.+.}, at: [<000000008775e4dd>] lock_sock
include/net/sock.h:1463 [inline]
  (sk_lock-AF_INET6){+.+.}, at: [<000000008775e4dd>]
do_ipv6_setsockopt.isra.8+0x3c5/0x39d0 net/ipv6/ipv6_sockglue.c:167

but task is already holding lock:
  (rtnl_mutex){+.+.}, at: [<000000004342eaa9>] rtnl_lock+0x17/0x20
net/core/rtnetlink.c:74

which lock already depends on the new lock.

the existing dependency chain (in reverse order) is:

-> #1 (rtnl_mutex){+.+.}:
        __mutex_lock_common kernel/locking/mutex.c:756 [inline]
        __mutex_lock+0x16f/0x1a80 kernel/locking/mutex.c:893
        mutex_lock_nested+0x16/0x20 kernel/locking/mutex.c:908
        rtnl_lock+0x17/0x20 net/core/rtnetlink.c:74
        register_netdevice_notifier+0xad/0x860 net/core/dev.c:1607
        tee_tg_check+0x1a0/0x280 net/netfilter/xt_TEE.c:106
        xt_check_target+0x22c/0x7d0 net/netfilter/x_tables.c:845
        check_target net/ipv6/netfilter/ip6_tables.c:538 [inline]
        find_check_entry.isra.7+0x935/0xcf0
net/ipv6/netfilter/ip6_tables.c:580
        translate_table+0xf52/0x1690 net/ipv6/netfilter/ip6_tables.c:749
        do_replace net/ipv6/netfilter/ip6_tables.c:1165 [inline]
        do_ip6t_set_ctl+0x370/0x5f0 net/ipv6/netfilter/ip6_tables.c:1691
        nf_sockopt net/netfilter/nf_sockopt.c:106 [inline]
        nf_setsockopt+0x67/0xc0 net/netfilter/nf_sockopt.c:115
        ipv6_setsockopt+0x115/0x150 net/ipv6/ipv6_sockglue.c:928
        udpv6_setsockopt+0x45/0x80 net/ipv6/udp.c:1422
        sock_common_setsockopt+0x95/0xd0 net/core/sock.c:2978
        SYSC_setsockopt net/socket.c:1849 [inline]
        SyS_setsockopt+0x189/0x360 net/socket.c:1828
        entry_SYSCALL_64_fastpath+0x29/0xa0

-> #0 (sk_lock-AF_INET6){+.+.}:
        lock_acquire+0x1d5/0x580 kernel/locking/lockdep.c:3914
        lock_sock_nested+0xc2/0x110 net/core/sock.c:2780
        lock_sock include/net/sock.h:1463 [inline]
        do_ipv6_setsockopt.isra.8+0x3c5/0x39d0 net/ipv6/ipv6_sockglue.c:167
        ipv6_setsockopt+0xd7/0x150 net/ipv6/ipv6_sockglue.c:922
        udpv6_setsockopt+0x45/0x80 net/ipv6/udp.c:1422
        sock_common_setsockopt+0x95/0xd0 net/core/sock.c:2978
        SYSC_setsockopt net/socket.c:1849 [inline]
        SyS_setsockopt+0x189/0x360 net/socket.c:1828
        entry_SYSCALL_64_fastpath+0x29/0xa0

other info that might help us debug this:

  Possible unsafe locking scenario:

        CPU0                    CPU1
        ----                    ----
   lock(rtnl_mutex);
                                lock(sk_lock-AF_INET6);
                                lock(rtnl_mutex);
   lock(sk_lock-AF_INET6);

  *** DEADLOCK ***

1 lock held by syzkaller041579/3682:
  #0:  (rtnl_mutex){+.+.}, at: [<000000004342eaa9>] rtnl_lock+0x17/0x20
net/core/rtnetlink.c:74

The problem, as Florian noted, is that nf_setsockopt() is always
called with the socket held, even if the lock itself is required only
for very tight scopes and only for some operation.

This patch addresses the issues moving the lock_sock() call only
where really needed, namely in ipv*_getorigdst(), so that nf_setsockopt()
does not need anymore to acquire both locks.

Fixes: 22265a5c ("netfilter: xt_TEE: resolve oif using netdevice notifiers")
Reported-by: syzbot+a4c2dc980ac1af699b36@syzkaller.appspotmail.com
Suggested-by: NFlorian Westphal <fw@strlen.de>
Signed-off-by: NPaolo Abeni <pabeni@redhat.com>
Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>

3f34cfae

24 1月, 2018 1 次提交

ipv6: Fix getsockopt() for sockets with default IPV6_AUTOFLOWLABEL · e9191ffb

由 Ben Hutchings 提交于 1月 22, 2018

Commit 513674b5 ("net: reevalulate autoflowlabel setting after
sysctl setting") removed the initialisation of
ipv6_pinfo::autoflowlabel and added a second flag to indicate
whether this field or the net namespace default should be used.

The getsockopt() handling for this case was not updated, so it
currently returns 0 for all sockets for which IPV6_AUTOFLOWLABEL is
not explicitly enabled. Fix it to return the effective value, whether
that has been set at the socket or net namespace level.

Fixes: 513674b5 ("net: reevalulate autoflowlabel setting after sysctl ...")
Signed-off-by: NBen Hutchings <ben.hutchings@codethink.co.uk>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

e9191ffb

openeuler / Kernel 1 年多 前同步成功

openeuler / Kernel
1 年多前同步成功