提交 · 21985f43376cee092702d6cb963ff97a9d2ede68 · openeuler / Kernel

13 10月, 2022 2 次提交

udp: Call inet6_destroy_sock() in setsockopt(IPV6_ADDRFORM). · 21985f43

由 Kuniyuki Iwashima 提交于 10月 06, 2022

Commit 4b340ae2 ("IPv6: Complete IPV6_DONTFRAG support") forgot
to add a change to free inet6_sk(sk)->rxpmtu while converting an IPv6
socket into IPv4 with IPV6_ADDRFORM.  After conversion, sk_prot is
changed to udp_prot and ->destroy() never cleans it up, resulting in
a memory leak.

This is due to the discrepancy between inet6_destroy_sock() and
IPV6_ADDRFORM, so let's call inet6_destroy_sock() from IPV6_ADDRFORM
to remove the difference.

However, this is not enough for now because rxpmtu can be changed
without lock_sock() after commit 03485f2a ("udpv6: Add lockless
sendmsg() support").  We will fix this case in the following patch.

Note we will rename inet6_destroy_sock() to inet6_cleanup_sock() and
remove unnecessary inet6_destroy_sock() calls in sk_prot->destroy()
in the future.

Fixes: 4b340ae2 ("IPv6: Complete IPV6_DONTFRAG support")
Signed-off-by: NKuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: NJakub Kicinski <kuba@kernel.org>

21985f43

tcp/udp: Fix memory leak in ipv6_renew_options(). · 3c52c6bb

由 Kuniyuki Iwashima 提交于 10月 06, 2022

syzbot reported a memory leak [0] related to IPV6_ADDRFORM.

The scenario is that while one thread is converting an IPv6 socket into
IPv4 with IPV6_ADDRFORM, another thread calls do_ipv6_setsockopt() and
allocates memory to inet6_sk(sk)->XXX after conversion.

Then, the converted sk with (tcp|udp)_prot never frees the IPv6 resources,
which inet6_destroy_sock() should have cleaned up.

setsockopt(IPV6_ADDRFORM)                 setsockopt(IPV6_DSTOPTS)
+-----------------------+                 +----------------------+
- do_ipv6_setsockopt(sk, ...)
  - sockopt_lock_sock(sk)                 - do_ipv6_setsockopt(sk, ...)
    - lock_sock(sk)                         ^._ called via tcpv6_prot
  - WRITE_ONCE(sk->sk_prot, &tcp_prot)          before WRITE_ONCE()
  - xchg(&np->opt, NULL)
  - txopt_put(opt)
  - sockopt_release_sock(sk)
    - release_sock(sk)                      - sockopt_lock_sock(sk)
                                              - lock_sock(sk)
                                            - ipv6_set_opt_hdr(sk, ...)
                                              - ipv6_update_options(sk, opt)
                                                - xchg(&inet6_sk(sk)->opt, opt)
                                                  ^._ opt is never freed.

                                            - sockopt_release_sock(sk)
                                              - release_sock(sk)

Since IPV6_DSTOPTS allocates options under lock_sock(), we can avoid this
memory leak by testing whether sk_family is changed by IPV6_ADDRFORM after
acquiring the lock.

This issue exists from the initial commit between IPV6_ADDRFORM and
IPV6_PKTOPTIONS.

[0]:
BUG: memory leak
unreferenced object 0xffff888009ab9f80 (size 96):
  comm "syz-executor583", pid 328, jiffies 4294916198 (age 13.034s)
  hex dump (first 32 bytes):
    01 00 00 00 48 00 00 00 08 00 00 00 00 00 00 00  ....H...........
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
  backtrace:
    [<000000002ee98ae1>] kmalloc include/linux/slab.h:605 [inline]
    [<000000002ee98ae1>] sock_kmalloc+0xb3/0x100 net/core/sock.c:2566
    [<0000000065d7b698>] ipv6_renew_options+0x21e/0x10b0 net/ipv6/exthdrs.c:1318
    [<00000000a8c756d7>] ipv6_set_opt_hdr net/ipv6/ipv6_sockglue.c:354 [inline]
    [<00000000a8c756d7>] do_ipv6_setsockopt.constprop.0+0x28b7/0x4350 net/ipv6/ipv6_sockglue.c:668
    [<000000002854d204>] ipv6_setsockopt+0xdf/0x190 net/ipv6/ipv6_sockglue.c:1021
    [<00000000e69fdcf8>] tcp_setsockopt+0x13b/0x2620 net/ipv4/tcp.c:3789
    [<0000000090da4b9b>] __sys_setsockopt+0x239/0x620 net/socket.c:2252
    [<00000000b10d192f>] __do_sys_setsockopt net/socket.c:2263 [inline]
    [<00000000b10d192f>] __se_sys_setsockopt net/socket.c:2260 [inline]
    [<00000000b10d192f>] __x64_sys_setsockopt+0xbe/0x160 net/socket.c:2260
    [<000000000a80d7aa>] do_syscall_x64 arch/x86/entry/common.c:50 [inline]
    [<000000000a80d7aa>] do_syscall_64+0x38/0x90 arch/x86/entry/common.c:80
    [<000000004562b5c6>] entry_SYSCALL_64_after_hwframe+0x63/0xcd

Fixes: 1da177e4 ("Linux-2.6.12-rc2")
Reported-by: Nsyzbot <syzkaller@googlegroups.com>
Signed-off-by: NKuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: NJakub Kicinski <kuba@kernel.org>

3c52c6bb

12 10月, 2022 2 次提交

netfilter: rpfilter/fib: Populate flowic_l3mdev field · acc641ab

由 Phil Sutter 提交于 10月 05, 2022

Use the introduced field for correct operation with VRF devices instead
of conditionally overwriting flowic_oif. This is a partial revert of
commit b575b24b ("netfilter: Fix rpfilter dropping vrf packets by
mistake"), implementing a simpler solution.
Signed-off-by: NPhil Sutter <phil@nwl.cc>
Reviewed-by: NDavid Ahern <dsahern@kernel.org>
Reviewed-by: NGuillaume Nault <gnault@redhat.com>
Signed-off-by: NFlorian Westphal <fw@strlen.de>

acc641ab

inet: ping: fix recent breakage · 0d24148b

由 Eric Dumazet 提交于 10月 11, 2022

Blamed commit broke the assumption used by ping sendmsg() that
allocated skb would have MAX_HEADER bytes in skb->head.

This patch changes the way ping works, by making sure
the skb head contains space for the icmp header,
and adjusting ping_getfrag() which was desperate
about going past the icmp header :/

This is adopting what UDP does, mostly.

syzbot is able to crash a host using both kfence and following repro in a loop.

fd = socket(AF_INET6, SOCK_DGRAM, IPPROTO_ICMPV6)
connect(fd, {sa_family=AF_INET6, sin6_port=htons(0), sin6_flowinfo=htonl(0),
		inet_pton(AF_INET6, "::1", &sin6_addr), sin6_scope_id=0}, 28
sendmsg(fd, {msg_name=NULL, msg_namelen=0, msg_iov=[
		{iov_base="\200\0\0\0\23\0\0\0\0\0\0\0\0\0"..., iov_len=65496}],
		msg_iovlen=1, msg_controllen=0, msg_flags=0}, 0

When kfence triggers, skb->head only has 64 bytes, immediately followed
by struct skb_shared_info (no extra headroom based on ksize(ptr))

Then icmpv6_push_pending_frames() is overwriting first bytes
of skb_shinfo(skb), making nr_frags bigger than MAX_SKB_FRAGS,
and/or setting shinfo->gso_size to a non zero value.

If nr_frags is mangled, a crash happens in skb_release_data()

If gso_size is mangled, we have the following report:

lo: caps=(0x00000516401d7c69, 0x00000516401d7c69)
WARNING: CPU: 0 PID: 7548 at net/core/dev.c:3239 skb_warn_bad_offload+0x119/0x230 net/core/dev.c:3239
Modules linked in:
CPU: 0 PID: 7548 Comm: syz-executor268 Not tainted 6.0.0-syzkaller-02754-g557f0501 #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 09/22/2022
RIP: 0010:skb_warn_bad_offload+0x119/0x230 net/core/dev.c:3239
Code: 70 03 00 00 e8 58 c3 24 fa 4c 8d a5 e8 00 00 00 e8 4c c3 24 fa 4c 89 e9 4c 89 e2 4c 89 f6 48 c7 c7 00 53 f5 8a e8 13 ac e7 01 <0f> 0b 5b 5d 41 5c 41 5d 41 5e e9 28 c3 24 fa e8 23 c3 24 fa 48 89
RSP: 0018:ffffc9000366f3e8 EFLAGS: 00010282
RAX: 0000000000000000 RBX: ffff88807a9d9d00 RCX: 0000000000000000
RDX: ffff8880780c0000 RSI: ffffffff8160f6f8 RDI: fffff520006cde6f
RBP: ffff888079952000 R08: 0000000000000005 R09: 0000000000000000
R10: 0000000000000400 R11: 0000000000000000 R12: ffff8880799520e8
R13: ffff88807a9da070 R14: ffff888079952000 R15: 0000000000000000
FS: 0000555556be6300(0000) GS:ffff8880b9a00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000020010000 CR3: 000000006eb7b000 CR4: 00000000003506f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
<TASK>
gso_features_check net/core/dev.c:3521 [inline]
netif_skb_features+0x83e/0xb90 net/core/dev.c:3554
validate_xmit_skb+0x2b/0xf10 net/core/dev.c:3659
__dev_queue_xmit+0x998/0x3ad0 net/core/dev.c:4248
dev_queue_xmit include/linux/netdevice.h:3008 [inline]
neigh_hh_output include/net/neighbour.h:530 [inline]
neigh_output include/net/neighbour.h:544 [inline]
ip6_finish_output2+0xf97/0x1520 net/ipv6/ip6_output.c:134
__ip6_finish_output net/ipv6/ip6_output.c:195 [inline]
ip6_finish_output+0x690/0x1160 net/ipv6/ip6_output.c:206
NF_HOOK_COND include/linux/netfilter.h:291 [inline]
ip6_output+0x1ed/0x540 net/ipv6/ip6_output.c:227
dst_output include/net/dst.h:445 [inline]
ip6_local_out+0xaf/0x1a0 net/ipv6/output_core.c:161
ip6_send_skb+0xb7/0x340 net/ipv6/ip6_output.c:1966
ip6_push_pending_frames+0xdd/0x100 net/ipv6/ip6_output.c:1986
icmpv6_push_pending_frames+0x2af/0x490 net/ipv6/icmp.c:303
ping_v6_sendmsg+0xc44/0x1190 net/ipv6/ping.c:190
inet_sendmsg+0x99/0xe0 net/ipv4/af_inet.c:819
sock_sendmsg_nosec net/socket.c:714 [inline]
sock_sendmsg+0xcf/0x120 net/socket.c:734
____sys_sendmsg+0x712/0x8c0 net/socket.c:2482
___sys_sendmsg+0x110/0x1b0 net/socket.c:2536
__sys_sendmsg+0xf3/0x1c0 net/socket.c:2565
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x63/0xcd
RIP: 0033:0x7f21aab42b89
Code: 28 00 00 00 75 05 48 83 c4 28 c3 e8 41 15 00 00 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 c0 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007fff1729d038 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 00007f21aab42b89
RDX: 0000000000000000 RSI: 0000000020000180 RDI: 0000000000000003
RBP: 0000000000000000 R08: 000000000000000d R09: 000000000000000d
R10: 000000000000000d R11: 0000000000000246 R12: 00007fff1729d050
R13: 00000000000f4240 R14: 0000000000021dd1 R15: 00007fff1729d044
</TASK>

Fixes: 47cf8899 ("net: unify alloclen calculation for paged requests")
Reported-by: Nsyzbot <syzkaller@googlegroups.com>
Signed-off-by: NEric Dumazet <edumazet@google.com>
Cc: Pavel Begunkov <asml.silence@gmail.com>
Cc: Lorenzo Colitti <lorenzo@google.com>
Cc: Willem de Bruijn <willemb@google.com>
Cc: Maciej Żenczykowski <maze@google.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

0d24148b

03 10月, 2022 2 次提交

net: Add helper function to parse netlink msg of ip_tunnel_parm · b86fca80

由 Liu Jian 提交于 9月 29, 2022

Add ip_tunnel_netlink_parms to parse netlink msg of ip_tunnel_parm.
Reduces duplicate code, no actual functional changes.
Signed-off-by: NLiu Jian <liujian56@huawei.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

b86fca80

net: Add helper function to parse netlink msg of ip_tunnel_encap · 537dd2d9

由 Liu Jian 提交于 9月 29, 2022

Add ip_tunnel_netlink_encap_parms to parse netlink msg of ip_tunnel_encap.
Reduces duplicate code, no actual functional changes.
Signed-off-by: NLiu Jian <liujian56@huawei.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

537dd2d9

30 9月, 2022 2 次提交

ip6_vti:Remove the space before the comma · 0f5ef005

由 Hongbin Wang 提交于 9月 29, 2022

There should be no space before the comma
Signed-off-by: NHongbin Wang <wh_bin@126.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

0f5ef005

net-next: skbuff: refactor pskb_pull · d427c899

由 Richard Gobert 提交于 9月 28, 2022

pskb_may_pull already contains all of the checks performed by
pskb_pull.
Use pskb_may_pull for validation in pskb_pull, eliminating the
duplication and making __pskb_pull obsolete.
Replace __pskb_pull with pskb_pull where applicable.
Signed-off-by: NRichard Gobert <richardbgobert@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

d427c899

29 9月, 2022 7 次提交

S
xfrm: mip6: add extack to mip6_destopt_init_state, mip6_rthdr_init_state · 28b5dbd5
由 Sabrina Dubroca 提交于 9月 27, 2022
```
Signed-off-by: NSabrina Dubroca <sd@queasysnail.net>
Signed-off-by: NSteffen Klassert <steffen.klassert@secunet.com>
```
28b5dbd5

xfrm: ipcomp: add extack to ipcomp{4,6}_init_state · 6ee55320

由 Sabrina Dubroca 提交于 9月 27, 2022

And the shared helper ipcomp_init_state.
Signed-off-by: NSabrina Dubroca <sd@queasysnail.net>
Signed-off-by: NSteffen Klassert <steffen.klassert@secunet.com>

6ee55320

S
xfrm: tunnel: add extack to ipip_init_state, xfrm6_tunnel_init_state · 25ec92cd
由 Sabrina Dubroca 提交于 9月 27, 2022
```
Signed-off-by: NSabrina Dubroca <sd@queasysnail.net>
Signed-off-by: NSteffen Klassert <steffen.klassert@secunet.com>
```
25ec92cd

xfrm: esp: add extack to esp_init_state, esp6_init_state · 67c44f93

由 Sabrina Dubroca 提交于 9月 27, 2022

Signed-off-by: NSabrina Dubroca <sd@queasysnail.net>
Signed-off-by: NSteffen Klassert <steffen.klassert@secunet.com>

67c44f93

xfrm: ah: add extack to ah_init_state, ah6_init_state · ef87a4f8

由 Sabrina Dubroca 提交于 9月 27, 2022

Signed-off-by: NSabrina Dubroca <sd@queasysnail.net>
Signed-off-by: NSteffen Klassert <steffen.klassert@secunet.com>

ef87a4f8

xfrm: pass extack down to xfrm_type ->init_state · e1e10b44

由 Sabrina Dubroca 提交于 9月 27, 2022

Signed-off-by: NSabrina Dubroca <sd@queasysnail.net>
Signed-off-by: NSteffen Klassert <steffen.klassert@secunet.com>

e1e10b44

net: shrink struct ubuf_info · e7d2b510

由 Pavel Begunkov 提交于 9月 23, 2022

We can benefit from a smaller struct ubuf_info, so leave only mandatory
fields and let users to decide how they want to extend it. Convert
MSG_ZEROCOPY to struct ubuf_info_msgzc and remove duplicated fields.
This reduces the size from 48 bytes to just 16.
Signed-off-by: NPavel Begunkov <asml.silence@gmail.com>
Acked-by: NPaolo Abeni <pabeni@redhat.com>
Signed-off-by: NJakub Kicinski <kuba@kernel.org>

e7d2b510

28 9月, 2022 2 次提交

netfilter: nft_fib: Fix for rpath check with VRF devices · 2a8a7c0e

由 Phil Sutter 提交于 9月 21, 2022

Analogous to commit b575b24b ("netfilter: Fix rpfilter
dropping vrf packets by mistake") but for nftables fib expression:
Add special treatment of VRF devices so that typical reverse path
filtering via 'fib saddr . iif oif' expression works as expected.

Fixes: f6d0cbcf ("netfilter: nf_tables: add fib expression")
Signed-off-by: NPhil Sutter <phil@nwl.cc>
Signed-off-by: NFlorian Westphal <fw@strlen.de>

2a8a7c0e

Add skb drop reasons to IPv6 UDP receive path · 0d92efde

由 Donald Hunter 提交于 9月 26, 2022

Enumerate the skb drop reasons in the receive path for IPv6 UDP packets.
Signed-off-by: NDonald Hunter <donald.hunter@redhat.com>
Link: https://lore.kernel.org/r/20220926120350.14928-1-donald.hunter@gmail.comSigned-off-by: NJakub Kicinski <kuba@kernel.org>

0d92efde

24 9月, 2022 1 次提交

ipv6: tcp: send consistent autoflowlabel in RST packets · 9258b8b1

由 Eric Dumazet 提交于 9月 22, 2022

Blamed commit added a txhash parameter to tcp_v6_send_response()
but forgot to update tcp_v6_send_reset() accordingly.

Fixes: aa51b80e ("ipv6: tcp: send consistent autoflowlabel in SYN_RECV state")
Signed-off-by: NEric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/r/20220922165036.1795862-1-eric.dumazet@gmail.comSigned-off-by: NJakub Kicinski <kuba@kernel.org>

9258b8b1

21 9月, 2022 6 次提交

ipv6: Fix crash when IPv6 is administratively disabled · 76dd0728

由 Ido Schimmel 提交于 9月 16, 2022

The global 'raw_v6_hashinfo' variable can be accessed even when IPv6 is
administratively disabled via the 'ipv6.disable=1' kernel command line
option, leading to a crash [1].

Fix by restoring the original behavior and always initializing the
variable, regardless of IPv6 support being administratively disabled or
not.

[1]
 BUG: unable to handle page fault for address: ffffffffffffffc8
 #PF: supervisor read access in kernel mode
 #PF: error_code(0x0000) - not-present page
 PGD 173e18067 P4D 173e18067 PUD 173e1a067 PMD 0
 Oops: 0000 [#1] PREEMPT SMP KASAN
 CPU: 3 PID: 271 Comm: ss Not tainted 6.0.0-rc4-custom-00136-g0727a9a5 #1396
 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.0-1.fc36 04/01/2014
 RIP: 0010:raw_diag_dump+0x310/0x7f0
 [...]
 Call Trace:
  <TASK>
  __inet_diag_dump+0x10f/0x2e0
  netlink_dump+0x575/0xfd0
  __netlink_dump_start+0x67b/0x940
  inet_diag_handler_cmd+0x273/0x2d0
  sock_diag_rcv_msg+0x317/0x440
  netlink_rcv_skb+0x15e/0x430
  sock_diag_rcv+0x2b/0x40
  netlink_unicast+0x53b/0x800
  netlink_sendmsg+0x945/0xe60
  ____sys_sendmsg+0x747/0x960
  ___sys_sendmsg+0x13a/0x1e0
  __sys_sendmsg+0x118/0x1e0
  do_syscall_64+0x34/0x80
  entry_SYSCALL_64_after_hwframe+0x63/0xcd

Fixes: 1da177e4 ("Linux-2.6.12-rc2")
Fixes: 0daf07e5 ("raw: convert raw sockets to RCU")
Reported-by: NRoberto Ricci <rroberto2r@gmail.com>
Tested-by: NRoberto Ricci <rroberto2r@gmail.com>
Signed-off-by: NIdo Schimmel <idosch@nvidia.com>
Reviewed-by: NDavid Ahern <dsahern@kernel.org>
Link: https://lore.kernel.org/r/20220916084821.229287-1-idosch@nvidia.comSigned-off-by: NJakub Kicinski <kuba@kernel.org>

76dd0728

tcp: Save unnecessary inet_twsk_purge() calls. · edc12f03

由 Kuniyuki Iwashima 提交于 9月 07, 2022

While destroying netns, we call inet_twsk_purge() in tcp_sk_exit_batch()
and tcpv6_net_exit_batch() for AF_INET and AF_INET6.  These commands
trigger the kernel to walk through the potentially big ehash twice even
though the netns has no TIME_WAIT sockets.

  # ip netns add test
  # ip netns del test

  or

  # unshare -n /bin/true >/dev/null

When tw_refcount is 1, we need not call inet_twsk_purge() at least
for the net.  We can save such unneeded iterations if all netns in
net_exit_list have no TIME_WAIT sockets.  This change eliminates
the tax by the additional unshare() described in the next patch to
guarantee the per-netns ehash size.

Tested:

  # mount -t debugfs none /sys/kernel/debug/
  # echo cleanup_net > /sys/kernel/debug/tracing/set_ftrace_filter
  # echo inet_twsk_purge >> /sys/kernel/debug/tracing/set_ftrace_filter
  # echo function > /sys/kernel/debug/tracing/current_tracer
  # cat ./add_del_unshare.sh
  for i in `seq 1 40`
  do
      (for j in `seq 1 100` ; do  unshare -n /bin/true >/dev/null ; done) &
  done
  wait;
  # ./add_del_unshare.sh

Before the patch:

  # cat /sys/kernel/debug/tracing/trace_pipe
    kworker/u128:0-8       [031] ...1.   174.162765: cleanup_net <-process_one_work
    kworker/u128:0-8       [031] ...1.   174.240796: inet_twsk_purge <-cleanup_net
    kworker/u128:0-8       [032] ...1.   174.244759: inet_twsk_purge <-tcp_sk_exit_batch
    kworker/u128:0-8       [034] ...1.   174.290861: cleanup_net <-process_one_work
    kworker/u128:0-8       [039] ...1.   175.245027: inet_twsk_purge <-cleanup_net
    kworker/u128:0-8       [046] ...1.   175.290541: inet_twsk_purge <-tcp_sk_exit_batch
    kworker/u128:0-8       [037] ...1.   175.321046: cleanup_net <-process_one_work
    kworker/u128:0-8       [024] ...1.   175.941633: inet_twsk_purge <-cleanup_net
    kworker/u128:0-8       [025] ...1.   176.242539: inet_twsk_purge <-tcp_sk_exit_batch

After:

  # cat /sys/kernel/debug/tracing/trace_pipe
    kworker/u128:0-8       [038] ...1.   428.116174: cleanup_net <-process_one_work
    kworker/u128:0-8       [038] ...1.   428.262532: cleanup_net <-process_one_work
    kworker/u128:0-8       [030] ...1.   429.292645: cleanup_net <-process_one_work
Signed-off-by: NKuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: NEric Dumazet <edumazet@google.com>
Signed-off-by: NJakub Kicinski <kuba@kernel.org>

edc12f03

tcp: Access &tcp_hashinfo via net. · 4461568a

由 Kuniyuki Iwashima 提交于 9月 07, 2022

We will soon introduce an optional per-netns ehash.

This means we cannot use tcp_hashinfo directly in most places.

Instead, access it via net->ipv4.tcp_death_row.hashinfo.

The access will be valid only while initialising tcp_hashinfo
itself and creating/destroying each netns.
Signed-off-by: NKuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: NEric Dumazet <edumazet@google.com>
Signed-off-by: NJakub Kicinski <kuba@kernel.org>

4461568a

tcp: Set NULL to sk->sk_prot->h.hashinfo. · 429e42c1

由 Kuniyuki Iwashima 提交于 9月 07, 2022

We will soon introduce an optional per-netns ehash.

This means we cannot use the global sk->sk_prot->h.hashinfo
to fetch a TCP hashinfo.

Instead, set NULL to sk->sk_prot->h.hashinfo for TCP and get
a proper hashinfo from net->ipv4.tcp_death_row.hashinfo.

Note that we need not use sk->sk_prot->h.hashinfo if DCCP is
disabled.
Signed-off-by: NKuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: NEric Dumazet <edumazet@google.com>
Signed-off-by: NJakub Kicinski <kuba@kernel.org>

429e42c1

tcp: Don't allocate tcp_death_row outside of struct netns_ipv4. · e9bd0cca

由 Kuniyuki Iwashima 提交于 9月 07, 2022

We will soon introduce an optional per-netns ehash and access hash
tables via net->ipv4.tcp_death_row->hashinfo instead of &tcp_hashinfo
in most places.

It could harm the fast path because dereferences of two fields in net
and tcp_death_row might incur two extra cache line misses.  To save one
dereference, let's place tcp_death_row back in netns_ipv4 and fetch
hashinfo via net->ipv4.tcp_death_row"."hashinfo.

Note tcp_death_row was initially placed in netns_ipv4, and commit
fbb82952 ("tcp: allocate tcp_death_row outside of struct netns_ipv4")
changed it to a pointer so that we can fire TIME_WAIT timers after freeing
net.  However, we don't do so after commit 04c494e6 ("Revert "tcp/dccp:
get rid of inet_twsk_purge()""), so we need not define tcp_death_row as a
pointer.

Also, we move refcount_dec_and_test(&tw_refcount) from tcp_sk_exit() to
tcp_sk_exit_batch() as a debug check.
Signed-off-by: NKuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: NEric Dumazet <edumazet@google.com>
Signed-off-by: NJakub Kicinski <kuba@kernel.org>

e9bd0cca

tcp: Clean up some functions. · 08eaef90

由 Kuniyuki Iwashima 提交于 9月 07, 2022

This patch adds no functional change and cleans up some functions
that the following patches touch around so that we make them tidy
and easy to review/revert.  The changes are

  - Keep reverse christmas tree order
  - Remove unnecessary init of port in inet_csk_find_open_port()
  - Use req_to_sk() once in reqsk_queue_unlink()
  - Use sock_net(sk) once in tcp_time_wait() and tcp_v[46]_connect()
Signed-off-by: NKuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: NEric Dumazet <edumazet@google.com>
Signed-off-by: NJakub Kicinski <kuba@kernel.org>

08eaef90

20 9月, 2022 4 次提交

ipmr: Always call ip{,6}_mr_forward() from RCU read-side critical section · b07a9b26

由 Ido Schimmel 提交于 9月 14, 2022

These functions expect to be called from RCU read-side critical section,
but this only happens when invoked from the data path via
ip{,6}_mr_input(). They can also be invoked from process context in
response to user space adding a multicast route which resolves a cache
entry with queued packets [1][2].

Fix by adding missing rcu_read_lock() / rcu_read_unlock() in these call
paths.

[1]
WARNING: suspicious RCU usage
6.0.0-rc3-custom-15969-g049d233c8bcc-dirty #1387 Not tainted
-----------------------------
net/ipv4/ipmr.c:84 suspicious rcu_dereference_check() usage!

other info that might help us debug this:

rcu_scheduler_active = 2, debug_locks = 1
1 lock held by smcrouted/246:
 #0: ffffffff862389b0 (rtnl_mutex){+.+.}-{3:3}, at: ip_mroute_setsockopt+0x11c/0x1420

stack backtrace:
CPU: 0 PID: 246 Comm: smcrouted Not tainted 6.0.0-rc3-custom-15969-g049d233c8bcc-dirty #1387
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.0-1.fc36 04/01/2014
Call Trace:
 <TASK>
 dump_stack_lvl+0x91/0xb9
 vif_dev_read+0xbf/0xd0
 ipmr_queue_xmit+0x135/0x1ab0
 ip_mr_forward+0xe7b/0x13d0
 ipmr_mfc_add+0x1a06/0x2ad0
 ip_mroute_setsockopt+0x5c1/0x1420
 do_ip_setsockopt+0x23d/0x37f0
 ip_setsockopt+0x56/0x80
 raw_setsockopt+0x219/0x290
 __sys_setsockopt+0x236/0x4d0
 __x64_sys_setsockopt+0xbe/0x160
 do_syscall_64+0x34/0x80
 entry_SYSCALL_64_after_hwframe+0x63/0xcd

[2]
WARNING: suspicious RCU usage
6.0.0-rc3-custom-15969-g049d233c8bcc-dirty #1387 Not tainted
-----------------------------
net/ipv6/ip6mr.c:69 suspicious rcu_dereference_check() usage!

other info that might help us debug this:

rcu_scheduler_active = 2, debug_locks = 1
1 lock held by smcrouted/246:
 #0: ffffffff862389b0 (rtnl_mutex){+.+.}-{3:3}, at: ip6_mroute_setsockopt+0x6b9/0x2630

stack backtrace:
CPU: 1 PID: 246 Comm: smcrouted Not tainted 6.0.0-rc3-custom-15969-g049d233c8bcc-dirty #1387
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.0-1.fc36 04/01/2014
Call Trace:
 <TASK>
 dump_stack_lvl+0x91/0xb9
 vif_dev_read+0xbf/0xd0
 ip6mr_forward2.isra.0+0xc9/0x1160
 ip6_mr_forward+0xef0/0x13f0
 ip6mr_mfc_add+0x1ff2/0x31f0
 ip6_mroute_setsockopt+0x1825/0x2630
 do_ipv6_setsockopt+0x462/0x4440
 ipv6_setsockopt+0x105/0x140
 rawv6_setsockopt+0xd8/0x690
 __sys_setsockopt+0x236/0x4d0
 __x64_sys_setsockopt+0xbe/0x160
 do_syscall_64+0x34/0x80
 entry_SYSCALL_64_after_hwframe+0x63/0xcd

Fixes: ebc31979 ("ipmr: add rcu protection over (struct vif_device)->dev")
Signed-off-by: NIdo Schimmel <idosch@nvidia.com>
Signed-off-by: NJakub Kicinski <kuba@kernel.org>

b07a9b26

seg6: add NEXT-C-SID support for SRv6 End behavior · 848f3c0d

由 Andrea Mayer 提交于 9月 12, 2022

The NEXT-C-SID mechanism described in [1] offers the possibility of
encoding several SRv6 segments within a single 128 bit SID address. Such
a SID address is called a Compressed SID (C-SID) container. In this way,
the length of the SID List can be drastically reduced.

A SID instantiated with the NEXT-C-SID flavor considers an IPv6 address
logically structured in three main blocks: i) Locator-Block; ii)
Locator-Node Function; iii) Argument.

                        C-SID container
+------------------------------------------------------------------+
|     Locator-Block      |Loc-Node|            Argument            |
|                        |Function|                                |
+------------------------------------------------------------------+
<--------- B -----------> <- NF -> <------------- A --------------->

   (i) The Locator-Block can be any IPv6 prefix available to the provider;

  (ii) The Locator-Node Function represents the node and the function to
       be triggered when a packet is received on the node;

 (iii) The Argument carries the remaining C-SIDs in the current C-SID
       container.

The NEXT-C-SID mechanism relies on the "flavors" framework defined in
[2]. The flavors represent additional operations that can modify or
extend a subset of the existing behaviors.

This patch introduces the support for flavors in SRv6 End behavior
implementing the NEXT-C-SID one. An SRv6 End behavior with NEXT-C-SID
flavor works as an End behavior but it is capable of processing the
compressed SID List encoded in C-SID containers.

An SRv6 End behavior with NEXT-C-SID flavor can be configured to support
user-provided Locator-Block and Locator-Node Function lengths. In this
implementation, such lengths must be evenly divisible by 8 (i.e. must be
byte-aligned), otherwise the kernel informs the user about invalid
values with a meaningful error code and message through netlink_ext_ack.

If Locator-Block and/or Locator-Node Function lengths are not provided
by the user during configuration of an SRv6 End behavior instance with
NEXT-C-SID flavor, the kernel will choose their default values i.e.,
32-bit Locator-Block and 16-bit Locator-Node Function.

[1] - https://datatracker.ietf.org/doc/html/draft-ietf-spring-srv6-srh-compression
[2] - https://datatracker.ietf.org/doc/html/rfc8986Signed-off-by: NAndrea Mayer <andrea.mayer@uniroma2.it>
Reviewed-by: NDavid Ahern <dsahern@kernel.org>
Signed-off-by: NPaolo Abeni <pabeni@redhat.com>

848f3c0d

seg6: add netlink_ext_ack support in parsing SRv6 behavior attributes · e2a8ecc4

由 Andrea Mayer 提交于 9月 12, 2022

An SRv6 behavior instance can be set up using mandatory and/or optional
attributes.
In the setup phase, each supplied attribute is parsed and processed. If
the parsing operation fails, the creation of the behavior instance stops
and an error number/code is reported to the user.  In many cases, it is
challenging for the user to figure out exactly what happened by relying
only on the error code.

For this reason, we add the support for netlink_ext_ack in parsing SRv6
behavior attributes. In this way, when an SRv6 behavior attribute is
parsed and an error occurs, the kernel can send a message to the
userspace describing the error through a meaningful text message in
addition to the classic error code.
Signed-off-by: NAndrea Mayer <andrea.mayer@uniroma2.it>
Reviewed-by: NDavid Ahern <dsahern@kernel.org>
Signed-off-by: NPaolo Abeni <pabeni@redhat.com>

e2a8ecc4

net-next: gro: Fix use of skb_gro_header_slow · cb628a9a

由 Richard Gobert 提交于 9月 11, 2022

In the cited commit, the function ipv6_gro_receive was accidentally
changed to use skb_gro_header_slow, without attempting the fast path.
Fix it.

Fixes: 35ffb665 ("net: gro: skb_gro_header helper function")
Signed-off-by: NRichard Gobert <richardbgobert@gmail.com>
Link: https://lore.kernel.org/r/20220911184835.GA105063@debianSigned-off-by: NPaolo Abeni <pabeni@redhat.com>

cb628a9a

10 9月, 2022 1 次提交

bpf: Invoke cgroup/connect{4,6} programs for unprivileged ICMP ping · 0ffe2412

由 YiFei Zhu 提交于 9月 09, 2022

Usually when a TCP/UDP connection is initiated, we can bind the socket
to a specific IP attached to an interface in a cgroup/connect hook.
But for pings, this is impossible, as the hook is not being called.

This adds the hook invocation to unprivileged ICMP ping (i.e. ping
sockets created with SOCK_DGRAM IPPROTO_ICMP(V6) as opposed to
SOCK_RAW. Logic is mirrored from UDP sockets where the hook is invoked
during pre_connect, after a check for suficiently sized addr_len.
Signed-off-by: NYiFei Zhu <zhuyifei@google.com>
Link: https://lore.kernel.org/r/5764914c252fad4cd134fb6664c6ede95f409412.1662682323.git.zhuyifei@google.comSigned-off-by: NMartin KaFai Lau <martin.lau@kernel.org>

0ffe2412

05 9月, 2022 2 次提交

ipv6: sr: fix out-of-bounds read when setting HMAC data. · 84a53580

由 David Lebrun 提交于 9月 02, 2022

The SRv6 layer allows defining HMAC data that can later be used to sign IPv6
Segment Routing Headers. This configuration is realised via netlink through
four attributes: SEG6_ATTR_HMACKEYID, SEG6_ATTR_SECRET, SEG6_ATTR_SECRETLEN and
SEG6_ATTR_ALGID. Because the SECRETLEN attribute is decoupled from the actual
length of the SECRET attribute, it is possible to provide invalid combinations
(e.g., secret = "", secretlen = 64). This case is not checked in the code and
with an appropriately crafted netlink message, an out-of-bounds read of up
to 64 bytes (max secret length) can occur past the skb end pointer and into
skb_shared_info:

Breakpoint 1, seg6_genl_sethmac (skb=<optimized out>, info=<optimized out>) at net/ipv6/seg6.c:208
208		memcpy(hinfo->secret, secret, slen);
(gdb) bt
 #0  seg6_genl_sethmac (skb=<optimized out>, info=<optimized out>) at net/ipv6/seg6.c:208
 #1  0xffffffff81e012e9 in genl_family_rcv_msg_doit (skb=skb@entry=0xffff88800b1f9f00, nlh=nlh@entry=0xffff88800b1b7600,
    extack=extack@entry=0xffffc90000ba7af0, ops=ops@entry=0xffffc90000ba7a80, hdrlen=4, net=0xffffffff84237580 <init_net>, family=<optimized out>,
    family=<optimized out>) at net/netlink/genetlink.c:731
 #2  0xffffffff81e01435 in genl_family_rcv_msg (extack=0xffffc90000ba7af0, nlh=0xffff88800b1b7600, skb=0xffff88800b1f9f00,
    family=0xffffffff82fef6c0 <seg6_genl_family>) at net/netlink/genetlink.c:775
 #3  genl_rcv_msg (skb=0xffff88800b1f9f00, nlh=0xffff88800b1b7600, extack=0xffffc90000ba7af0) at net/netlink/genetlink.c:792
 #4  0xffffffff81dfffc3 in netlink_rcv_skb (skb=skb@entry=0xffff88800b1f9f00, cb=cb@entry=0xffffffff81e01350 <genl_rcv_msg>)
    at net/netlink/af_netlink.c:2501
 #5  0xffffffff81e00919 in genl_rcv (skb=0xffff88800b1f9f00) at net/netlink/genetlink.c:803
 #6  0xffffffff81dff6ae in netlink_unicast_kernel (ssk=0xffff888010eec800, skb=0xffff88800b1f9f00, sk=0xffff888004aed000)
    at net/netlink/af_netlink.c:1319
 #7  netlink_unicast (ssk=ssk@entry=0xffff888010eec800, skb=skb@entry=0xffff88800b1f9f00, portid=portid@entry=0, nonblock=<optimized out>)
    at net/netlink/af_netlink.c:1345
 #8  0xffffffff81dff9a4 in netlink_sendmsg (sock=<optimized out>, msg=0xffffc90000ba7e48, len=<optimized out>) at net/netlink/af_netlink.c:1921
...
(gdb) p/x ((struct sk_buff *)0xffff88800b1f9f00)->head + ((struct sk_buff *)0xffff88800b1f9f00)->end
$1 = 0xffff88800b1b76c0
(gdb) p/x secret
$2 = 0xffff88800b1b76c0
(gdb) p slen
$3 = 64 '@'

The OOB data can then be read back from userspace by dumping HMAC state. This
commit fixes this by ensuring SECRETLEN cannot exceed the actual length of
SECRET.
Reported-by: NLucas Leong <wmliang.tw@gmail.com>
Tested: verified that EINVAL is correctly returned when secretlen > len(secret)
Fixes: 4f4853dc ("ipv6: sr: implement API to control SR HMAC structure")
Signed-off-by: NDavid Lebrun <dlebrun@google.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

84a53580

bonding: add all node mcast address when slave up · fd16eb94

由 Hangbin Liu 提交于 8月 30, 2022

When a link is enslave to bond, it need to set the interface down first.
This makes the slave remove mac multicast address 33:33:00:00:00:01(The
IPv6 multicast address ff02::1 is kept even when the interface down). When
bond set the slave up, ipv6_mc_up() was not called due to commit c2edacf8
("bonding / ipv6: no addrconf for slaves separately from master").

This is not an issue before we adding the lladdr target feature for bonding,
as the mac multicast address will be added back when bond interface up and
join group ff02::1.

But after adding lladdr target feature for bonding. When user set a lladdr
target, the unsolicited NA message with all-nodes multicast dest will be
dropped as the slave interface never add 33:33:00:00:00:01 back.

Fix this by calling ipv6_mc_up() to add 33:33:00:00:00:01 back when
the slave interface up.
Reported-by: NLiLiang <liali@redhat.com>
Fixes: 5e1eeef6 ("bonding: NS target should accept link local address")
Signed-off-by: NHangbin Liu <liuhangbin@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

fd16eb94

03 9月, 2022 5 次提交

bpf: Change bpf_getsockopt(SOL_IPV6) to reuse do_ipv6_getsockopt() · 38566ec0

由 Martin KaFai Lau 提交于 9月 01, 2022

This patch changes bpf_getsockopt(SOL_IPV6) to reuse
do_ipv6_getsockopt().  It removes the duplicated code from
bpf_getsockopt(SOL_IPV6).

This also makes bpf_getsockopt(SOL_IPV6) supporting the same
set of optnames as in bpf_setsockopt(SOL_IPV6).  In particular,
this adds IPV6_AUTOFLOWLABEL support to bpf_getsockopt(SOL_IPV6).

ipv6 could be compiled as a module.  Like how other code solved it
with stubs in ipv6_stubs.h, this patch adds the do_ipv6_getsockopt
to the ipv6_bpf_stub.
Signed-off-by: NMartin KaFai Lau <martin.lau@kernel.org>
Link: https://lore.kernel.org/r/20220902002931.2896218-1-kafai@fb.comSigned-off-by: NAlexei Starovoitov <ast@kernel.org>

38566ec0

bpf: net: Avoid do_ipv6_getsockopt() taking sk lock when called from bpf · 0f95f7d4

由 Martin KaFai Lau 提交于 9月 01, 2022

Similar to the earlier patch that changes sk_getsockopt() to
use sockopt_{lock,release}_sock() such that it can avoid taking the
lock when called from bpf. This patch also changes do_ipv6_getsockopt()
to use sockopt_{lock,release}_sock() such that bpf_getsockopt(SOL_IPV6)
can reuse do_ipv6_getsockopt().

Although bpf_getsockopt(SOL_IPV6) currently does not support optname
that requires lock_sock(), using sockopt_{lock,release}_sock()
consistently across *_getsockopt() will make future optname addition
harder to miss the sockopt_{lock,release}_sock() usage. eg. when
adding new optname that requires a lock and the new optname is
needed in bpf_getsockopt() also.
Signed-off-by: NMartin KaFai Lau <martin.lau@kernel.org>
Link: https://lore.kernel.org/r/20220902002859.2893064-1-kafai@fb.comSigned-off-by: NAlexei Starovoitov <ast@kernel.org>

0f95f7d4

bpf: net: Change do_ipv6_getsockopt() to take the sockptr_t argument · 6dadbe4b

由 Martin KaFai Lau 提交于 9月 01, 2022

Similar to the earlier patch that changes sk_getsockopt() to
take the sockptr_t argument .  This patch also changes
do_ipv6_getsockopt() to take the sockptr_t argument such that
a latter patch can make bpf_getsockopt(SOL_IPV6) to reuse
do_ipv6_getsockopt().

Note on the change in ip6_mc_msfget().  This function is to
return an array of sockaddr_storage in optval.  This function
is shared between ipv6_get_msfilter() and compat_ipv6_get_msfilter().
However, the sockaddr_storage is stored at different offset of the
optval because of the difference between group_filter and
compat_group_filter.  Thus, a new 'ss_offset' argument is
added to ip6_mc_msfget().
Signed-off-by: NMartin KaFai Lau <martin.lau@kernel.org>
Link: https://lore.kernel.org/r/20220902002853.2892532-1-kafai@fb.comSigned-off-by: NAlexei Starovoitov <ast@kernel.org>

6dadbe4b

net: Add a len argument to compat_ipv6_get_msfilter() · 9c3f9707

由 Martin KaFai Lau 提交于 9月 01, 2022

Pass the len to the compat_ipv6_get_msfilter() instead of
compat_ipv6_get_msfilter() getting it again from optlen.
Its counter part ipv6_get_msfilter() is also taking the
len from do_ipv6_getsockopt().
Signed-off-by: NMartin KaFai Lau <martin.lau@kernel.org>
Link: https://lore.kernel.org/r/20220902002846.2892091-1-kafai@fb.comSigned-off-by: NAlexei Starovoitov <ast@kernel.org>

9c3f9707

net: Remove unused flags argument from do_ipv6_getsockopt · 75f23979

由 Martin KaFai Lau 提交于 9月 01, 2022

The 'unsigned int flags' argument is always 0, so it can be removed.
Signed-off-by: NMartin KaFai Lau <martin.lau@kernel.org>
Link: https://lore.kernel.org/r/20220902002840.2891763-1-kafai@fb.comSigned-off-by: NAlexei Starovoitov <ast@kernel.org>

75f23979

02 9月, 2022 1 次提交

ipv6: tcp: send consistent autoflowlabel in SYN_RECV state · aa51b80e

由 Eric Dumazet 提交于 8月 31, 2022

This is a followup of commit c67b8555 ("ipv6: tcp: send consistent
autoflowlabel in TIME_WAIT state"), but for SYN_RECV state.

In some cases, TCP sends a challenge ACK on behalf of a SYN_RECV request.
WHen this happens, we want to use the flow label that was used when
the prior SYNACK packet was sent, instead of another one.

After his patch, following packetdrill passes:

    0 socket(..., SOCK_STREAM, IPPROTO_TCP) = 3
   +0 setsockopt(3, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0
   +0 bind(3, ..., ...) = 0
   +0 listen(3, 1) = 0

  +.2 < S 0:0(0) win 32792 <mss 1000,sackOK,nop,nop,nop,wscale 7>
   +0 > (flowlabel 0x11) S. 0:0(0) ack 1 <...>
// Test if a challenge ack is properly sent (same flowlabel than prior SYNACK)
   +.01 < . 4000000000:4000000000(0) ack 1 win 320
   +0  > (flowlabel 0x11) . 1:1(0) ack 1
Signed-off-by: NEric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/r/20220831203729.458000-1-eric.dumazet@gmail.comSigned-off-by: NJakub Kicinski <kuba@kernel.org>

aa51b80e

01 9月, 2022 1 次提交

rxrpc: Fix ICMP/ICMP6 error handling · ac56a0b4

由 David Howells 提交于 8月 26, 2022

Because rxrpc pretends to be a tunnel on top of a UDP/UDP6 socket, allowing
it to siphon off UDP packets early in the handling of received UDP packets
thereby avoiding the packet going through the UDP receive queue, it doesn't
get ICMP packets through the UDP ->sk_error_report() callback.  In fact, it
doesn't appear that there's any usable option for getting hold of ICMP
packets.

Fix this by adding a new UDP encap hook to distribute error messages for
UDP tunnels.  If the hook is set, then the tunnel driver will be able to
see ICMP packets.  The hook provides the offset into the packet of the UDP
header of the original packet that caused the notification.

An alternative would be to call the ->error_handler() hook - but that
requires that the skbuff be cloned (as ip_icmp_error() or ipv6_cmp_error()
do, though isn't really necessary or desirable in rxrpc's case is we want
to parse them there and then, not queue them).

Changes
=======
ver #3)
 - Fixed an uninitialised variable.

ver #2)
 - Fixed some missing CONFIG_AF_RXRPC_IPV6 conditionals.

Fixes: 5271953c ("rxrpc: Use the UDP encap_rcv hook")
Signed-off-by: NDavid Howells <dhowells@redhat.com>

ac56a0b4

30 8月, 2022 1 次提交

net: unify alloclen calculation for paged requests · 47cf8899

由 Pavel Begunkov 提交于 8月 25, 2022

Consolidate alloclen and pagedlen calculation for zerocopy and normal
paged requests. The current non-zerocopy paged version can a bit
overallocate and unnecessary copy a small chunk of data into the linear
part.

Cc: Willem de Bruijn <willemb@google.com>
Link: https://lore.kernel.org/netdev/CA+FuTSf0+cJ9_N_xrHmCGX_KoVCWcE0YQBdtgEkzGvcLMSv7Qw@mail.gmail.com/Signed-off-by: NPavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/b0e4edb7b91f171c7119891d3c61040b8c56596e.1661428921.git.asml.silence@gmail.comSigned-off-by: NPaolo Abeni <pabeni@redhat.com>

47cf8899

29 8月, 2022 1 次提交

genetlink: start to validate reserved header bytes · 9c5d03d3

由 Jakub Kicinski 提交于 8月 24, 2022

We had historically not checked that genlmsghdr.reserved
is 0 on input which prevents us from using those precious
bytes in the future.

One use case would be to extend the cmd field, which is
currently just 8 bits wide and 256 is not a lot of commands
for some core families.

To make sure that new families do the right thing by default
put the onus of opting out of validation on existing families.
Signed-off-by: NJakub Kicinski <kuba@kernel.org>
Acked-by: Paul Moore <paul@paul-moore.com> (NetLabel)
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

9c5d03d3

openeuler / Kernel 1 年多 前同步成功

openeuler / Kernel
1 年多前同步成功