提交 · fb7bc9204095090731430c8921f9e629740c110a · openeuler / Kernel

30 12月, 2021 2 次提交

ipv6: raw: check passed optlen before reading · fb7bc920

由 Tamir Duberstein 提交于 12月 29, 2021

Add a check that the user-provided option is at least as long as the
number of bytes we intend to read. Before this patch we would blindly
read sizeof(int) bytes even in cases where the user passed
optlen<sizeof(int), which would potentially read garbage or fault.

Discovered by new tests in https://github.com/google/gvisor/pull/6957 .

The original get_user call predates history in the git repo.
Signed-off-by: NTamir Duberstein <tamird@gmail.com>
Signed-off-by: NWillem de Bruijn <willemb@google.com>
Link: https://lore.kernel.org/r/20211229200947.2862255-1-willemdebruijn.kernel@gmail.comSigned-off-by: NJakub Kicinski <kuba@kernel.org>

fb7bc920

xsk: Initialise xskb free_list_node · 5bec7ca2

由 Ciara Loftus 提交于 12月 20, 2021

This commit initialises the xskb's free_list_node when the xskb is
allocated. This prevents a potential false negative returned from a call
to list_empty for that node, such as the one introduced in commit
199d983b ("xsk: Fix crash on double free in buffer pool")

In my environment this issue caused packets to not be received by
the xdpsock application if the traffic was running prior to application
launch. This happened when the first batch of packets failed the xskmap
lookup and XDP_PASS was returned from the bpf program. This action is
handled in the i40e zc driver (and others) by allocating an skbuff,
freeing the xdp_buff and adding the associated xskb to the
xsk_buff_pool's free_list if it hadn't been added already. Without this
fix, the xskb is not added to the free_list because the check to determine
if it was added already returns an invalid positive result. Later, this
caused allocation errors in the driver and the failure to receive packets.

Fixes: 199d983b ("xsk: Fix crash on double free in buffer pool")
Fixes: 2b43470a ("xsk: Introduce AF_XDP buffer allocation API")
Signed-off-by: NCiara Loftus <ciara.loftus@intel.com>
Acked-by: NMagnus Karlsson <magnus.karlsson@intel.com>
Link: https://lore.kernel.org/r/20211220155250.2746-1-ciara.loftus@intel.comSigned-off-by: NJakub Kicinski <kuba@kernel.org>

5bec7ca2

29 12月, 2021 4 次提交

Merge branch '1GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue · 9665e03a

由 Jakub Kicinski 提交于 12月 28, 2021

Tony Nguyen says:

====================
Intel Wired LAN Driver Updates 2021-12-28

This series contains updates to igc driver only.

Vinicius disables support for crosstimestamp on i225-V as lockups are being
observed.

James McLaughlin fixes Tx timestamping support on non-MSI-X platforms.

* '1GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue:
  igc: Fix TX timestamp support for non-MSI-X platforms
  igc: Do not enable crosstimestamping for i225-V models
====================

Link: https://lore.kernel.org/r/20211228182421.340354-1-anthony.l.nguyen@intel.comSigned-off-by: NJakub Kicinski <kuba@kernel.org>

9665e03a

ionic: Initialize the 'lif->dbid_inuse' bitmap · 140c7bc7

由 Christophe JAILLET 提交于 12月 26, 2021

When allocated, this bitmap is not initialized. Only the first bit is set a
few lines below.

Use bitmap_zalloc() to make sure that it is cleared before being used.

Fixes: 6461b446 ("ionic: Add interrupts and doorbells")
Signed-off-by: NChristophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: NShannon Nelson <snelson@pensando.io>
Link: https://lore.kernel.org/r/6a478eae0b5e6c63774e1f0ddb1a3f8c38fa8ade.1640527506.git.christophe.jaillet@wanadoo.frSigned-off-by: NJakub Kicinski <kuba@kernel.org>

140c7bc7

igc: Fix TX timestamp support for non-MSI-X platforms · f85846bb

由 James McLaughlin 提交于 12月 17, 2021

Time synchronization was not properly enabled on non-MSI-X platforms.

Fixes: 2c344ae2 ("igc: Add support for TX timestamping")
Signed-off-by: NJames McLaughlin <james.mclaughlin@qsc.com>
Reviewed-by: NVinicius Costa Gomes <vinicius.gomes@intel.com>
Tested-by: NNechama Kraus <nechamax.kraus@linux.intel.com>
Signed-off-by: NTony Nguyen <anthony.l.nguyen@intel.com>

f85846bb

igc: Do not enable crosstimestamping for i225-V models · 1e81dcc1

由 Vinicius Costa Gomes 提交于 12月 13, 2021

It was reported that when PCIe PTM is enabled, some lockups could
be observed with some integrated i225-V models.

While the issue is investigated, we can disable crosstimestamp for
those models and see no loss of functionality, because those models
don't have any support for time synchronization.

Fixes: a90ec848 ("igc: Add support for PTP getcrosststamp()")
Link: https://lore.kernel.org/all/924175a188159f4e03bd69908a91e606b574139b.camel@gmx.de/Reported-by: NStefan Dietrich <roots@gmx.de>
Signed-off-by: NVinicius Costa Gomes <vinicius.gomes@intel.com>
Tested-by: NNechama Kraus <nechamax.kraus@linux.intel.com>
Signed-off-by: NTony Nguyen <anthony.l.nguyen@intel.com>

1e81dcc1

28 12月, 2021 7 次提交

Merge branch 'smc-fixes' · 16fa29ae

由 David S. Miller 提交于 12月 28, 2021

Dust Li says:

====================
net/smc: fix kernel panic caused by race of smc_sock

This patchset fixes the race between smc_release triggered by
close(2) and cdc_handle triggered by underlaying RDMA device.

The race is caused because the smc_connection may been released
before the pending tx CDC messages got its CQEs. In order to fix
this, I add a counter to track how many pending WRs we have posted
through the smc_connection, and only release the smc_connection
after there is no pending WRs on the connection.

The first patch prevents posting WR on a QP that is not in RTS
state. This patch is needed because if we post WR on a QP that
is not in RTS state, ib_post_send() may success but no CQE will
return, and that will confuse the counter tracking the pending
WRs.

The second patch add a counter to track how many WRs were posted
through the smc_connection, and don't reset the QP on link destroying
to prevent leak of the counter.
====================
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

16fa29ae

net/smc: fix kernel panic caused by race of smc_sock · 349d4312

由 Dust Li 提交于 12月 28, 2021

A crash occurs when smc_cdc_tx_handler() tries to access smc_sock
but smc_release() has already freed it.

[ 4570.695099] BUG: unable to handle page fault for address: 000000002eae9e88
[ 4570.696048] #PF: supervisor write access in kernel mode
[ 4570.696728] #PF: error_code(0x0002) - not-present page
[ 4570.697401] PGD 0 P4D 0
[ 4570.697716] Oops: 0002 [#1] PREEMPT SMP NOPTI
[ 4570.698228] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.16.0-rc4+ #111
[ 4570.699013] Hardware name: Alibaba Cloud Alibaba Cloud ECS, BIOS 8c24b4c 04/0
[ 4570.699933] RIP: 0010:_raw_spin_lock+0x1a/0x30
<...>
[ 4570.711446] Call Trace:
[ 4570.711746]  <IRQ>
[ 4570.711992]  smc_cdc_tx_handler+0x41/0xc0
[ 4570.712470]  smc_wr_tx_tasklet_fn+0x213/0x560
[ 4570.712981]  ? smc_cdc_tx_dismisser+0x10/0x10
[ 4570.713489]  tasklet_action_common.isra.17+0x66/0x140
[ 4570.714083]  __do_softirq+0x123/0x2f4
[ 4570.714521]  irq_exit_rcu+0xc4/0xf0
[ 4570.714934]  common_interrupt+0xba/0xe0

Though smc_cdc_tx_handler() checked the existence of smc connection,
smc_release() may have already dismissed and released the smc socket
before smc_cdc_tx_handler() further visits it.

smc_cdc_tx_handler()           |smc_release()
if (!conn)                     |
                               |
                               |smc_cdc_tx_dismiss_slots()
                               |      smc_cdc_tx_dismisser()
                               |
                               |sock_put(&smc->sk) <- last sock_put,
                               |                      smc_sock freed
bh_lock_sock(&smc->sk) (panic) |

To make sure we won't receive any CDC messages after we free the
smc_sock, add a refcount on the smc_connection for inflight CDC
message(posted to the QP but haven't received related CQE), and
don't release the smc_connection until all the inflight CDC messages
haven been done, for both success or failed ones.

Using refcount on CDC messages brings another problem: when the link
is going to be destroyed, smcr_link_clear() will reset the QP, which
then remove all the pending CQEs related to the QP in the CQ. To make
sure all the CQEs will always come back so the refcount on the
smc_connection can always reach 0, smc_ib_modify_qp_reset() was replaced
by smc_ib_modify_qp_error().
And remove the timeout in smc_wr_tx_wait_no_pending_sends() since we
need to wait for all pending WQEs done, or we may encounter use-after-
free when handling CQEs.

For IB device removal routine, we need to wait for all the QPs on that
device been destroyed before we can destroy CQs on the device, or
the refcount on smc_connection won't reach 0 and smc_sock cannot be
released.

Fixes: 5f08318f ("smc: connection data control (CDC)")
Reported-by: NWen Gu <guwen@linux.alibaba.com>
Signed-off-by: NDust Li <dust.li@linux.alibaba.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

349d4312

net/smc: don't send CDC/LLC message if link not ready · 90cee52f

由 Dust Li 提交于 12月 28, 2021

We found smc_llc_send_link_delete_all() sometimes wait
for 2s timeout when testing with RDMA link up/down.
It is possible when a smc_link is in ACTIVATING state,
the underlaying QP is still in RESET or RTR state, which
cannot send any messages out.

smc_llc_send_link_delete_all() use smc_link_usable() to
checks whether the link is usable, if the QP is still in
RESET or RTR state, but the smc_link is in ACTIVATING, this
LLC message will always fail without any CQE entering the
CQ, and we will always wait 2s before timeout.

Since we cannot send any messages through the QP before
the QP enter RTS. I add a wrapper smc_link_sendable()
which checks the state of QP along with the link state.
And replace smc_link_usable() with smc_link_sendable()
in all LLC & CDC message sending routine.

Fixes: 5f08318f ("smc: connection data control (CDC)")
Signed-off-by: NDust Li <dust.li@linux.alibaba.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

90cee52f

NFC: st21nfca: Fix memory leak in device probe and remove · 1b9dadba

由 Wei Yongjun 提交于 12月 28, 2021

'phy->pending_skb' is alloced when device probe, but forgot to free
in the error handling path and remove path, this cause memory leak
as follows:

unreferenced object 0xffff88800bc06800 (size 512):
  comm "8", pid 11775, jiffies 4295159829 (age 9.032s)
  hex dump (first 32 bytes):
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
  backtrace:
    [<00000000d66c09ce>] __kmalloc_node_track_caller+0x1ed/0x450
    [<00000000c93382b3>] kmalloc_reserve+0x37/0xd0
    [<000000005fea522c>] __alloc_skb+0x124/0x380
    [<0000000019f29f9a>] st21nfca_hci_i2c_probe+0x170/0x8f2

Fix it by freeing 'pending_skb' in error and remove.

Fixes: 68957303 ("NFC: ST21NFCA: Add driver for STMicroelectronics ST21NFCA NFC Chip")
Reported-by: NHulk Robot <hulkci@huawei.com>
Signed-off-by: NWei Yongjun <weiyongjun1@huawei.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

1b9dadba

net: lantiq_xrx200: fix statistics of received bytes · 5be60a94

由 Aleksander Jan Bajkowski 提交于 12月 27, 2021

Received frames have FCS truncated. There is no need
to subtract FCS length from the statistics.

Fixes: fe1a5642 ("net: lantiq: Add Lantiq / Intel VRX200 Ethernet driver")
Signed-off-by: NAleksander Jan Bajkowski <olek2@wp.pl>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

5be60a94

net: ag71xx: Fix a potential double free in error handling paths · 1cd5384c

由 Christophe JAILLET 提交于 12月 26, 2021

'ndev' is a managed resource allocated with devm_alloc_etherdev(), so there
is no need to call free_netdev() explicitly or there will be a double
free().

Simplify all error handling paths accordingly.

Fixes: d51b6ce4 ("net: ethernet: add ag71xx driver")
Signed-off-by: NChristophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

1cd5384c

mISDN: change function names to avoid conflicts · 8b5fdfc5

由 wolfgang huang 提交于 12月 28, 2021

As we build for mips, we meet following error. l1_init error with
multiple definition. Some architecture devices usually marked with
l1, l2, lxx as the start-up phase. so we change the mISDN function
names, align with Isdnl2_xxx.

mips-linux-gnu-ld: drivers/isdn/mISDN/layer1.o: in function `l1_init':
(.text+0x890): multiple definition of `l1_init'; \
arch/mips/kernel/bmips_5xxx_init.o:(.text+0xf0): first defined here
make[1]: *** [home/mips/kernel-build/linux/Makefile:1161: vmlinux] Error 1
Signed-off-by: Nwolfgang huang <huangjinhui@kylinos.cn>
Reported-by: Nk2ci <kernel-bot@kylinos.cn>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

8b5fdfc5

27 12月, 2021 7 次提交

nfc: uapi: use kernel size_t to fix user-space builds · 79b69a83

由 Krzysztof Kozlowski 提交于 12月 26, 2021

Fix user-space builds if it includes /usr/include/linux/nfc.h before
some of other headers:

  /usr/include/linux/nfc.h:281:9: error: unknown type name ‘size_t’
    281 |         size_t service_name_len;
        |         ^~~~~~

Fixes: d646960f ("NFC: Initial LLCP support")
Cc: <stable@vger.kernel.org>
Signed-off-by: NKrzysztof Kozlowski <krzysztof.kozlowski@canonical.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

79b69a83

uapi: fix linux/nfc.h userspace compilation errors · 7175f02c

由 Dmitry V. Levin 提交于 12月 26, 2021

Replace sa_family_t with __kernel_sa_family_t to fix the following
linux/nfc.h userspace compilation errors:

/usr/include/linux/nfc.h:266:2: error: unknown type name 'sa_family_t'
  sa_family_t sa_family;
/usr/include/linux/nfc.h:274:2: error: unknown type name 'sa_family_t'
  sa_family_t sa_family;

Fixes: 23b7869c ("NFC: add the NFC socket raw protocol")
Fixes: d646960f ("NFC: Initial LLCP support")
Cc: <stable@vger.kernel.org>
Signed-off-by: NDmitry V. Levin <ldv@altlinux.org>
Reviewed-by: NKrzysztof Kozlowski <krzysztof.kozlowski@canonical.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

7175f02c

net: usb: pegasus: Do not drop long Ethernet frames · ca506fca

由 Matthias-Christian Ott 提交于 12月 26, 2021

The D-Link DSB-650TX (2001:4002) is unable to receive Ethernet frames
that are longer than 1518 octets, for example, Ethernet frames that
contain 802.1Q VLAN tags.

The frames are sent to the pegasus driver via USB but the driver
discards them because they have the Long_pkt field set to 1 in the
received status report. The function read_bulk_callback of the pegasus
driver treats such received "packets" (in the terminology of the
hardware) as errors but the field simply does just indicate that the
Ethernet frame (MAC destination to FCS) is longer than 1518 octets.

It seems that in the 1990s there was a distinction between
"giant" (> 1518) and "runt" (< 64) frames and the hardware includes
flags to indicate this distinction. It seems that the purpose of the
distinction "giant" frames was to not allow infinitely long frames due
to transmission errors and to allow hardware to have an upper limit of
the frame size. However, the hardware already has such limit with its
2048 octet receive buffer and, therefore, Long_pkt is merely a
convention and should not be treated as a receive error.

Actually, the hardware is even able to receive Ethernet frames with 2048
octets which exceeds the claimed limit frame size limit of the driver of
1536 octets (PEGASUS_MTU).

Fixes: 1da177e4 ("Linux-2.6.12-rc2")
Signed-off-by: NMatthias-Christian Ott <ott@mirix.org>
Reviewed-by: NAndrew Lunn <andrew@lunn.ch>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

ca506fca

atlantic: Fix buff_ring OOB in aq_ring_rx_clean · 5f501532

由 Zekun Shen 提交于 12月 26, 2021

The function obtain the next buffer without boundary check.
We should return with I/O error code.

The bug is found by fuzzing and the crash report is attached.
It is an OOB bug although reported as use-after-free.

[    4.804724] BUG: KASAN: use-after-free in aq_ring_rx_clean+0x1e88/0x2730 [atlantic]
[    4.805661] Read of size 4 at addr ffff888034fe93a8 by task ksoftirqd/0/9
[    4.806505]
[    4.806703] CPU: 0 PID: 9 Comm: ksoftirqd/0 Tainted: G        W         5.6.0 #34
[    4.809030] Call Trace:
[    4.809343]  dump_stack+0x76/0xa0
[    4.809755]  print_address_description.constprop.0+0x16/0x200
[    4.810455]  ? aq_ring_rx_clean+0x1e88/0x2730 [atlantic]
[    4.811234]  ? aq_ring_rx_clean+0x1e88/0x2730 [atlantic]
[    4.813183]  __kasan_report.cold+0x37/0x7c
[    4.813715]  ? aq_ring_rx_clean+0x1e88/0x2730 [atlantic]
[    4.814393]  kasan_report+0xe/0x20
[    4.814837]  aq_ring_rx_clean+0x1e88/0x2730 [atlantic]
[    4.815499]  ? hw_atl_b0_hw_ring_rx_receive+0x9a5/0xb90 [atlantic]
[    4.816290]  aq_vec_poll+0x179/0x5d0 [atlantic]
[    4.816870]  ? _GLOBAL__sub_I_65535_1_aq_pci_func_init+0x20/0x20 [atlantic]
[    4.817746]  ? __next_timer_interrupt+0xba/0xf0
[    4.818322]  net_rx_action+0x363/0xbd0
[    4.818803]  ? call_timer_fn+0x240/0x240
[    4.819302]  ? __switch_to_asm+0x40/0x70
[    4.819809]  ? napi_busy_loop+0x520/0x520
[    4.820324]  __do_softirq+0x18c/0x634
[    4.820797]  ? takeover_tasklets+0x5f0/0x5f0
[    4.821343]  run_ksoftirqd+0x15/0x20
[    4.821804]  smpboot_thread_fn+0x2f1/0x6b0
[    4.822331]  ? smpboot_unregister_percpu_thread+0x160/0x160
[    4.823041]  ? __kthread_parkme+0x80/0x100
[    4.823571]  ? smpboot_unregister_percpu_thread+0x160/0x160
[    4.824301]  kthread+0x2b5/0x3b0
[    4.824723]  ? kthread_create_on_node+0xd0/0xd0
[    4.825304]  ret_from_fork+0x35/0x40
Signed-off-by: NZekun Shen <bruceshenzk@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

5f501532

net: udp: fix alignment problem in udp4_seq_show() · 6c25449e

由 yangxingwu 提交于 12月 27, 2021

$ cat /pro/net/udp

before:

  sl  local_address rem_address   st tx_queue rx_queue tr tm->when
26050: 0100007F:0035 00000000:0000 07 00000000:00000000 00:00000000
26320: 0100007F:0143 00000000:0000 07 00000000:00000000 00:00000000
27135: 00000000:8472 00000000:0000 07 00000000:00000000 00:00000000

after:

   sl  local_address rem_address   st tx_queue rx_queue tr tm->when
26050: 0100007F:0035 00000000:0000 07 00000000:00000000 00:00000000
26320: 0100007F:0143 00000000:0000 07 00000000:00000000 00:00000000
27135: 00000000:8472 00000000:0000 07 00000000:00000000 00:00000000
Signed-off-by: Nyangxingwu <xingwu.yang@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

6c25449e

net/smc: fix using of uninitialized completions · 6d7373da

由 Karsten Graul 提交于 12月 27, 2021

In smc_wr_tx_send_wait() the completion on index specified by
pend->idx is initialized and after smc_wr_tx_send() was called the wait
for completion starts. pend->idx is used to get the correct index for
the wait, but the pend structure could already be cleared in
smc_wr_tx_process_cqe().
Introduce pnd_idx to hold and use a local copy of the correct index.

Fixes: 09c61d24 ("net/smc: wait for departure of an IB message")
Signed-off-by: NKarsten Graul <kgraul@linux.ibm.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

6d7373da

ip6_vti: initialize __ip6_tnl_parm struct in vti6_siocdevprivate · c1833c39

由 William Zhao 提交于 12月 23, 2021

The "__ip6_tnl_parm" struct was left uninitialized causing an invalid
load of random data when the "__ip6_tnl_parm" struct was used elsewhere.
As an example, in the function "ip6_tnl_xmit_ctl()", it tries to access
the "collect_md" member. With "__ip6_tnl_parm" being uninitialized and
containing random data, the UBSAN detected that "collect_md" held a
non-boolean value.

The UBSAN issue is as follows:
===============================================================
UBSAN: invalid-load in net/ipv6/ip6_tunnel.c:1025:14
load of value 30 is not a valid value for type '_Bool'
CPU: 1 PID: 228 Comm: kworker/1:3 Not tainted 5.16.0-rc4+ #8
Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
Workqueue: ipv6_addrconf addrconf_dad_work
Call Trace:
<TASK>
dump_stack_lvl+0x44/0x57
ubsan_epilogue+0x5/0x40
__ubsan_handle_load_invalid_value+0x66/0x70
? __cpuhp_setup_state+0x1d3/0x210
ip6_tnl_xmit_ctl.cold.52+0x2c/0x6f [ip6_tunnel]
vti6_tnl_xmit+0x79c/0x1e96 [ip6_vti]
? lock_is_held_type+0xd9/0x130
? vti6_rcv+0x100/0x100 [ip6_vti]
? lock_is_held_type+0xd9/0x130
? rcu_read_lock_bh_held+0xc0/0xc0
? lock_acquired+0x262/0xb10
dev_hard_start_xmit+0x1e6/0x820
__dev_queue_xmit+0x2079/0x3340
? mark_lock.part.52+0xf7/0x1050
? netdev_core_pick_tx+0x290/0x290
? kvm_clock_read+0x14/0x30
? kvm_sched_clock_read+0x5/0x10
? sched_clock_cpu+0x15/0x200
? find_held_lock+0x3a/0x1c0
? lock_release+0x42f/0xc90
? lock_downgrade+0x6b0/0x6b0
? mark_held_locks+0xb7/0x120
? neigh_connected_output+0x31f/0x470
? lockdep_hardirqs_on+0x79/0x100
? neigh_connected_output+0x31f/0x470
? ip6_finish_output2+0x9b0/0x1d90
? rcu_read_lock_bh_held+0x62/0xc0
? ip6_finish_output2+0x9b0/0x1d90
ip6_finish_output2+0x9b0/0x1d90
? ip6_append_data+0x330/0x330
? ip6_mtu+0x166/0x370
? __ip6_finish_output+0x1ad/0xfb0
? nf_hook_slow+0xa6/0x170
ip6_output+0x1fb/0x710
? nf_hook.constprop.32+0x317/0x430
? ip6_finish_output+0x180/0x180
? __ip6_finish_output+0xfb0/0xfb0
? lock_is_held_type+0xd9/0x130
ndisc_send_skb+0xb33/0x1590
? __sk_mem_raise_allocated+0x11cf/0x1560
? dst_output+0x4a0/0x4a0
? ndisc_send_rs+0x432/0x610
addrconf_dad_completed+0x30c/0xbb0
? addrconf_rs_timer+0x650/0x650
? addrconf_dad_work+0x73c/0x10e0
addrconf_dad_work+0x73c/0x10e0
? addrconf_dad_completed+0xbb0/0xbb0
? rcu_read_lock_sched_held+0xaf/0xe0
? rcu_read_lock_bh_held+0xc0/0xc0
process_one_work+0x97b/0x1740
? pwq_dec_nr_in_flight+0x270/0x270
worker_thread+0x87/0xbf0
? process_one_work+0x1740/0x1740
kthread+0x3ac/0x490
? set_kthread_struct+0x100/0x100
ret_from_fork+0x22/0x30
</TASK>
===============================================================

The solution is to initialize "__ip6_tnl_parm" struct to zeros in the
"vti6_siocdevprivate()" function.
Signed-off-by: NWilliam Zhao <wizhao@redhat.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

c1833c39

26 12月, 2021 2 次提交

selftests: mptcp: Remove the deprecated config NFT_COUNTER · e6007b85

由 Ma Xinjian 提交于 12月 24, 2021

NFT_COUNTER was removed since
390ad4295aa ("netfilter: nf_tables: make counter support built-in")
LKP/0Day will check if all configs listing under selftests are able to
be enabled properly.

For the missing configs, it will report something like:
LKP WARN miss config CONFIG_NFT_COUNTER= of net/mptcp/config

- it's not reasonable to keep the deprecated configs.
- configs under kselftests are recommended by corresponding tests.
So if some configs are missing, it will impact the testing results
Reported-by: Nkernel test robot <lkp@intel.com>
Signed-off-by: NMa Xinjian <xinjianx.ma@intel.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

e6007b85

sctp: use call_rcu to free endpoint · 5ec7d18d

由 Xin Long 提交于 12月 23, 2021

This patch is to delay the endpoint free by calling call_rcu() to fix
another use-after-free issue in sctp_sock_dump():

  BUG: KASAN: use-after-free in __lock_acquire+0x36d9/0x4c20
  Call Trace:
    __lock_acquire+0x36d9/0x4c20 kernel/locking/lockdep.c:3218
    lock_acquire+0x1ed/0x520 kernel/locking/lockdep.c:3844
    __raw_spin_lock_bh include/linux/spinlock_api_smp.h:135 [inline]
    _raw_spin_lock_bh+0x31/0x40 kernel/locking/spinlock.c:168
    spin_lock_bh include/linux/spinlock.h:334 [inline]
    __lock_sock+0x203/0x350 net/core/sock.c:2253
    lock_sock_nested+0xfe/0x120 net/core/sock.c:2774
    lock_sock include/net/sock.h:1492 [inline]
    sctp_sock_dump+0x122/0xb20 net/sctp/diag.c:324
    sctp_for_each_transport+0x2b5/0x370 net/sctp/socket.c:5091
    sctp_diag_dump+0x3ac/0x660 net/sctp/diag.c:527
    __inet_diag_dump+0xa8/0x140 net/ipv4/inet_diag.c:1049
    inet_diag_dump+0x9b/0x110 net/ipv4/inet_diag.c:1065
    netlink_dump+0x606/0x1080 net/netlink/af_netlink.c:2244
    __netlink_dump_start+0x59a/0x7c0 net/netlink/af_netlink.c:2352
    netlink_dump_start include/linux/netlink.h:216 [inline]
    inet_diag_handler_cmd+0x2ce/0x3f0 net/ipv4/inet_diag.c:1170
    __sock_diag_cmd net/core/sock_diag.c:232 [inline]
    sock_diag_rcv_msg+0x31d/0x410 net/core/sock_diag.c:263
    netlink_rcv_skb+0x172/0x440 net/netlink/af_netlink.c:2477
    sock_diag_rcv+0x2a/0x40 net/core/sock_diag.c:274

This issue occurs when asoc is peeled off and the old sk is freed after
getting it by asoc->base.sk and before calling lock_sock(sk).

To prevent the sk free, as a holder of the sk, ep should be alive when
calling lock_sock(). This patch uses call_rcu() and moves sock_put and
ep free into sctp_endpoint_destroy_rcu(), so that it's safe to try to
hold the ep under rcu_read_lock in sctp_transport_traverse_process().

If sctp_endpoint_hold() returns true, it means this ep is still alive
and we have held it and can continue to dump it; If it returns false,
it means this ep is dead and can be freed after rcu_read_unlock, and
we should skip it.

In sctp_sock_dump(), after locking the sk, if this ep is different from
tsp->asoc->ep, it means during this dumping, this asoc was peeled off
before calling lock_sock(), and the sk should be skipped; If this ep is
the same with tsp->asoc->ep, it means no peeloff happens on this asoc,
and due to lock_sock, no peeloff will happen either until release_sock.

Note that delaying endpoint free won't delay the port release, as the
port release happens in sctp_endpoint_destroy() before calling call_rcu().
Also, freeing endpoint by call_rcu() makes it safe to access the sk by
asoc->base.sk in sctp_assocs_seq_show() and sctp_rcv().

Thanks Jones to bring this issue up.

v1->v2:
  - improve the changelog.
  - add kfree(ep) into sctp_endpoint_destroy_rcu(), as Jakub noticed.

Reported-by: syzbot+9276d76e83e3bcde6c99@syzkaller.appspotmail.com
Reported-by: NLee Jones <lee.jones@linaro.org>
Fixes: d25adbeb ("sctp: fix an use-after-free issue in sctp_sock_dump")
Signed-off-by: NXin Long <lucien.xin@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

5ec7d18d

25 12月, 2021 1 次提交

net: phy: fixed_phy: Fix NULL vs IS_ERR() checking in __fixed_phy_register · b45396af

由 Miaoqian Lin 提交于 12月 24, 2021

The fixed_phy_get_gpiod function() returns NULL, it doesn't return error
pointers, using NULL checking to fix this.i

Fixes: 5468e82f ("net: phy: fixed-phy: Drop GPIO from fixed_phy_add()")
Signed-off-by: NMiaoqian Lin <linmq006@gmail.com>
Link: https://lore.kernel.org/r/20211224021500.10362-1-linmq006@gmail.comSigned-off-by: NJakub Kicinski <kuba@kernel.org>

b45396af

24 12月, 2021 17 次提交

selftests: Calculate udpgso segment count without header adjustment · 5471d522

由 Coco Li 提交于 12月 23, 2021

The below referenced commit correctly updated the computation of number
of segments (gso_size) by using only the gso payload size and
removing the header lengths.

With this change the regression test started failing. Update
the tests to match this new behavior.

Both IPv4 and IPv6 tests are updated, as a separate patch in this series
will update udp_v6_send_skb to match this change in udp_send_skb.

Fixes: 158390e4 ("udp: using datalen to cap max gso segments")
Signed-off-by: NCoco Li <lixiaoyan@google.com>
Reviewed-by: NWillem de Bruijn <willemb@google.com>
Link: https://lore.kernel.org/r/20211223222441.2975883-2-lixiaoyan@google.comSigned-off-by: NJakub Kicinski <kuba@kernel.org>

5471d522

udp: using datalen to cap ipv6 udp max gso segments · 736ef37f

由 Coco Li 提交于 12月 23, 2021

The max number of UDP gso segments is intended to cap to
UDP_MAX_SEGMENTS, this is checked in udp_send_skb().

skb->len contains network and transport header len here, we should use
only data len instead.

This is the ipv6 counterpart to the below referenced commit,
which missed the ipv6 change

Fixes: 158390e4 ("udp: using datalen to cap max gso segments")
Signed-off-by: NCoco Li <lixiaoyan@google.com>
Reviewed-by: NWillem de Bruijn <willemb@google.com>
Link: https://lore.kernel.org/r/20211223222441.2975883-1-lixiaoyan@google.comSigned-off-by: NJakub Kicinski <kuba@kernel.org>

736ef37f

Merge tag 'mlx5-fixes-2021-12-22' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux · 6f6f0ac6

由 Jakub Kicinski 提交于 12月 23, 2021

Saeed Mahameed says:

====================
mlx5 fixes 2021-12-22

This series provides bug fixes to mlx5 driver.

* tag 'mlx5-fixes-2021-12-22' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux:
  net/mlx5: Fix some error handling paths in 'mlx5e_tc_add_fdb_flow()'
  net/mlx5e: Delete forward rule for ct or sample action
  net/mlx5e: Fix ICOSQ recovery flow for XSK
  net/mlx5e: Fix interoperability between XSK and ICOSQ recovery flow
  net/mlx5e: Fix skb memory leak when TC classifier action offloads are disabled
  net/mlx5e: Wrap the tx reporter dump callback to extract the sq
  net/mlx5: Fix tc max supported prio for nic mode
  net/mlx5: Fix SF health recovery flow
  net/mlx5: Fix error print in case of IRQ request failed
  net/mlx5: Use first online CPU instead of hard coded CPU
  net/mlx5: DR, Fix querying eswitch manager vport for ECPF
  net/mlx5: DR, Fix NULL vs IS_ERR checking in dr_domain_init_resources
====================

Link: https://lore.kernel.org/r/20211223190441.153012-1-saeed@kernel.orgSigned-off-by: NJakub Kicinski <kuba@kernel.org>

6f6f0ac6

Merge tag 'drm-fixes-2021-12-24' of git://anongit.freedesktop.org/drm/drm · 95b40115

由 Linus Torvalds 提交于 12月 23, 2021

Pull drm fixes from Dave Airlie:
 "Happy Xmas. Nothing major, one mediatek and a couple of i915 locking
  fixes. There might be a few stragglers over next week or so but I
  don't expect much before next release.

  mediatek:
   - NULL pointer check

  i915:
   - guc submission locking fixes"

* tag 'drm-fixes-2021-12-24' of git://anongit.freedesktop.org/drm/drm:
  drm/i915/guc: Only assign guc_id.id when stealing guc_id
  drm/i915/guc: Use correct context lock when callig clr_context_registered
  drm/mediatek: hdmi: Perform NULL pointer check for mtk_hdmi_conf

95b40115

Merge tag 'io_uring-5.16-2021-12-23' of git://git.kernel.dk/linux-block · a026fa54

由 Linus Torvalds 提交于 12月 23, 2021

Pull io_uring fix from Jens Axboe:
 "Single fix for not clearing kiocb->ki_pos back to 0 for a stream,
  destined for stable as well"

* tag 'io_uring-5.16-2021-12-23' of git://git.kernel.dk/linux-block:
  io_uring: zero iocb->ki_pos for stream file types

a026fa54

Merge branch 'ucount-rlimit-fixes-for-v5.16' of... · 7fe2bc1b

由 Linus Torvalds 提交于 12月 23, 2021

Merge branch 'ucount-rlimit-fixes-for-v5.16' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace

Pull ucount fix from Eric Biederman:
 "This fixes a silly logic bug in the ucount rlimits code, where it was
  comparing against the wrong limit"

* 'ucount-rlimit-fixes-for-v5.16' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace:
  ucounts: Fix rlimit max values check

7fe2bc1b

Merge tag 'net-5.16-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net · 76657eae

由 Linus Torvalds 提交于 12月 23, 2021

Pull networking fixes from Jakub Kicinski:
 "Including fixes from netfilter.

  Current release - regressions:

   - revert "tipc: use consistent GFP flags"

  Previous releases - regressions:

   - igb: fix deadlock caused by taking RTNL in runtime resume path

   - accept UFOv6 packages in virtio_net_hdr_to_skb

   - netfilter: fix regression in looped (broad|multi)cast's MAC
     handling

   - bridge: fix ioctl old_deviceless bridge argument

   - ice: xsk: do not clear status_error0 for ntu + nb_buffs descriptor,
     avoid stalls when multiple sockets use an interface

  Previous releases - always broken:

   - inet: fully convert sk->sk_rx_dst to RCU rules

   - veth: ensure skb entering GRO are not cloned

   - sched: fix zone matching for invalid conntrack state

   - bonding: fix ad_actor_system option setting to default

   - nf_tables: fix use-after-free in nft_set_catchall_destroy()

   - lantiq_xrx200: increase buffer reservation to avoid mem corruption

   - ice: xsk: avoid leaking app buffers during clean up

   - tun: avoid double free in tun_free_netdev"

* tag 'net-5.16-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (45 commits)
  net: stmmac: dwmac-visconti: Fix value of ETHER_CLK_SEL_FREQ_SEL_2P5M
  r8152: sync ocp base
  r8152: fix the force speed doesn't work for RTL8156
  net: bridge: fix ioctl old_deviceless bridge argument
  net: stmmac: ptp: fix potentially overflowing expression
  net: dsa: tag_ocelot: use traffic class to map priority on injected header
  veth: ensure skb entering GRO are not cloned.
  asix: fix wrong return value in asix_check_host_enable()
  asix: fix uninit-value in asix_mdio_read()
  sfc: falcon: Check null pointer of rx_queue->page_ring
  sfc: Check null pointer of rx_queue->page_ring
  net: ks8851: Check for error irq
  drivers: net: smc911x: Check for error irq
  fjes: Check for error irq
  bonding: fix ad_actor_system option setting to default
  igb: fix deadlock caused by taking RTNL in RPM resume path
  gve: Correct order of processing device options
  net: skip virtio_net_hdr_set_proto if protocol already set
  net: accept UFOv6 packages in virtio_net_hdr_to_skb
  docs: networking: replace skb_hwtstamp_tx with skb_tstamp_tx
  ...

76657eae

net: stmmac: dwmac-visconti: Fix value of ETHER_CLK_SEL_FREQ_SEL_2P5M · 391e5975

由 Nobuhiro Iwamatsu 提交于 12月 23, 2021

ETHER_CLK_SEL_FREQ_SEL_2P5M is not 0 bit of the register. This is a
value, which is 0. Fix from BIT(0) to 0.
Reported-by: NYuji Ishikawa <yuji2.ishikawa@toshiba.co.jp>
Fixes: b38dd98f ("net: stmmac: Add Toshiba Visconti SoCs glue driver")
Signed-off-by: NNobuhiro Iwamatsu <nobuhiro1.iwamatsu@toshiba.co.jp>
Link: https://lore.kernel.org/r/20211223073633.101306-1-nobuhiro1.iwamatsu@toshiba.co.jpSigned-off-by: NJakub Kicinski <kuba@kernel.org>

391e5975

Merge branch 'r8152-fix-bugs' · 65fd0c33

由 Jakub Kicinski 提交于 12月 23, 2021

Hayes Wang says:

====================
r8152: fix bugs

Patch #1 fix the issue of force speed mode for RTL8156.
Patch #2 fix the issue of unexpected ocp_base.
====================

Link: https://lore.kernel.org/r/20211223092702.23841-386-nic_swsd@realtek.comSigned-off-by: NJakub Kicinski <kuba@kernel.org>

65fd0c33

r8152: sync ocp base · b24edca3

由 Hayes Wang 提交于 12月 23, 2021

There are some chances that the actual base of hardware is different
from the value recorded by driver, so we have to reset the variable
of ocp_base to sync it.

Set ocp_base to -1. Then, it would be updated and the new base would be
set to the hardware next time.
Signed-off-by: NHayes Wang <hayeswang@realtek.com>
Signed-off-by: NJakub Kicinski <kuba@kernel.org>

b24edca3

r8152: fix the force speed doesn't work for RTL8156 · 45bf944e

由 Hayes Wang 提交于 12月 23, 2021

It needs to set mdio force mode. Otherwise, link off always occurs when
setting force speed.

Fixes: 195aae32 ("r8152: support new chips")
Signed-off-by: NHayes Wang <hayeswang@realtek.com>
Signed-off-by: NJakub Kicinski <kuba@kernel.org>

45bf944e

Merge tag 'sound-5.16-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound · 996a18eb

由 Linus Torvalds 提交于 12月 23, 2021

Pull sound fixes from Takashi Iwai:
 "Quite a few small fixes, hopefully the last batch for 5.16.

  Most of them are device-specific quirks and/or fixes, and nothing
  looks scary for the late stage"

* tag 'sound-5.16-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
  ALSA: hda/realtek: Fix quirk for Clevo NJ51CU
  ALSA: rawmidi - fix the uninitalized user_pversion
  ALSA: hda: intel-sdw-acpi: go through HDAS ACPI at max depth of 2
  ALSA: hda: intel-sdw-acpi: harden detection of controller
  ALSA: hda/hdmi: Disable silent stream on GLK
  ALSA: hda/realtek: fix mute/micmute LEDs for a HP ProBook
  ASoC: meson: aiu: Move AIU_I2S_MISC hold setting to aiu-fifo-i2s
  ASoC: meson: aiu: fifo: Add missing dma_coerce_mask_and_coherent()
  ASoC: tas2770: Fix setting of high sample rates
  ASoC: rt5682: fix the wrong jack type detected
  ALSA: hda/realtek: Add new alc285-hp-amp-init model
  ALSA: hda/realtek: Amp init fixup for HP ZBook 15 G6
  ASoC: tegra: Restore headphones jack name on Nyan Big
  ASoC: tegra: Add DAPM switches for headphones and mic jack
  ALSA: jack: Check the return value of kstrdup()
  ALSA: drivers: opl3: Fix incorrect use of vp->state
  ASoC: SOF: Intel: pci-tgl: add new ADL-P variant
  ASoC: SOF: Intel: pci-tgl: add ADL-N support

996a18eb

net: bridge: fix ioctl old_deviceless bridge argument · d95a5620

由 Remi Pommarel 提交于 12月 23, 2021

Commit 561d8352 ("bridge: use ndo_siocdevprivate") changed the
source and destination arguments of copy_{to,from}_user in bridge's
old_deviceless() from args[1] to uarg breaking SIOC{G,S}IFBR ioctls.

Commit cbd7ad29 ("net: bridge: fix ioctl old_deviceless bridge
argument") fixed only BRCTL_{ADD,DEL}_BRIDGES commands leaving
BRCTL_GET_BRIDGES one untouched.

The fixes BRCTL_GET_BRIDGES as well and has been tested with busybox's
brctl.

Example of broken brctl:
$ brctl show
bridge name     bridge id               STP enabled     interfaces
brctl: can't get bridge name for index 0: No such device or address

Example of fixed brctl:
$ brctl show
bridge name     bridge id               STP enabled     interfaces
br0             8000.000000000000       no

Fixes: 561d8352 ("bridge: use ndo_siocdevprivate")
Signed-off-by: NRemi Pommarel <repk@triplefau.lt>
Reviewed-by: NArnd Bergmann <arnd@arndb.de>
Acked-by: NNikolay Aleksandrov <nikolay@nvidia.com>
Link: https://lore.kernel.org/all/20211223153139.7661-2-repk@triplefau.lt/Signed-off-by: NJakub Kicinski <kuba@kernel.org>

d95a5620

net: stmmac: ptp: fix potentially overflowing expression · eccffcf4

由 Xiaoliang Yang 提交于 12月 23, 2021

Convert the u32 variable to type u64 in a context where expression of
type u64 is required to avoid potential overflow.

Fixes: e9e37200 ("net: stmmac: ptp: update tas basetime after ptp adjust")
Signed-off-by: NXiaoliang Yang <xiaoliang.yang_1@nxp.com>
Link: https://lore.kernel.org/r/20211223073928.37371-1-xiaoliang.yang_1@nxp.comSigned-off-by: NJakub Kicinski <kuba@kernel.org>

eccffcf4

net: dsa: tag_ocelot: use traffic class to map priority on injected header · ae2778a6

由 Xiaoliang Yang 提交于 12月 23, 2021

For Ocelot switches, the CPU injected frames have an injection header
where it can specify the QoS class of the packet and the DSA tag, now it
uses the SKB priority to set that. If a traffic class to priority
mapping is configured on the netdevice (with mqprio for example ...), it
won't be considered for CPU injected headers. This patch make the QoS
class aligned to the priority to traffic class mapping if it exists.

Fixes: 8dce89aa ("net: dsa: ocelot: add tagger for Ocelot/Felix switches")
Signed-off-by: NXiaoliang Yang <xiaoliang.yang_1@nxp.com>
Signed-off-by: NMarouen Ghodhbane <marouen.ghodhbane@nxp.com>
Link: https://lore.kernel.org/r/20211223072211.33130-1-xiaoliang.yang_1@nxp.comSigned-off-by: NJakub Kicinski <kuba@kernel.org>

ae2778a6

Merge tag 'gpio-fixes-for-v5.16-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux · 3bf6f013

由 Linus Torvalds 提交于 12月 23, 2021

Pull gpio fixes from Bartosz Golaszewski:

 - fix interrupts when replugging the device in gpio-dln2

 - remove the arbitrary timeout on virtio requests from gpio-virtio

* tag 'gpio-fixes-for-v5.16-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux:
  gpio: virtio: remove timeout
  gpio: dln2: Fix interrupts when replugging the device

3bf6f013

veth: ensure skb entering GRO are not cloned. · 9695b7de

由 Paolo Abeni 提交于 12月 22, 2021

After commit d3256efd ("veth: allow enabling NAPI even without XDP"),
if GRO is enabled on a veth device and TSO is disabled on the peer
device, TCP skbs will go through the NAPI callback. If there is no XDP
program attached, the veth code does not perform any share check, and
shared/cloned skbs could enter the GRO engine.

Ignat reported a BUG triggered later-on due to the above condition:

[   53.970529][    C1] kernel BUG at net/core/skbuff.c:3574!
[   53.981755][    C1] invalid opcode: 0000 [#1] PREEMPT SMP KASAN PTI
[   53.982634][    C1] CPU: 1 PID: 19 Comm: ksoftirqd/1 Not tainted 5.16.0-rc5+ #25
[   53.982634][    C1] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015
[   53.982634][    C1] RIP: 0010:skb_shift+0x13ef/0x23b0
[   53.982634][    C1] Code: ea 03 0f b6 04 02 48 89 fa 83 e2 07 38 d0
7f 08 84 c0 0f 85 41 0c 00 00 41 80 7f 02 00 4d 8d b5 d0 00 00 00 0f
85 74 f5 ff ff <0f> 0b 4d 8d 77 20 be 04 00 00 00 4c 89 44 24 78 4c 89
f7 4c 89 8c
[   53.982634][    C1] RSP: 0018:ffff8881008f7008 EFLAGS: 00010246
[   53.982634][    C1] RAX: 0000000000000000 RBX: ffff8881180b4c80 RCX: 0000000000000000
[   53.982634][    C1] RDX: 0000000000000002 RSI: ffff8881180b4d3c RDI: ffff88810bc9cac2
[   53.982634][    C1] RBP: ffff8881008f70b8 R08: ffff8881180b4cf4 R09: ffff8881180b4cf0
[   53.982634][    C1] R10: ffffed1022999e5c R11: 0000000000000002 R12: 0000000000000590
[   53.982634][    C1] R13: ffff88810f940c80 R14: ffff88810f940d50 R15: ffff88810bc9cac0
[   53.982634][    C1] FS:  0000000000000000(0000) GS:ffff888235880000(0000) knlGS:0000000000000000
[   53.982634][    C1] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   53.982634][    C1] CR2: 00007ff5f9b86680 CR3: 0000000108ce8004 CR4: 0000000000170ee0
[   53.982634][    C1] Call Trace:
[   53.982634][    C1]  <TASK>
[   53.982634][    C1]  tcp_sacktag_walk+0xaba/0x18e0
[   53.982634][    C1]  tcp_sacktag_write_queue+0xe7b/0x3460
[   53.982634][    C1]  tcp_ack+0x2666/0x54b0
[   53.982634][    C1]  tcp_rcv_established+0x4d9/0x20f0
[   53.982634][    C1]  tcp_v4_do_rcv+0x551/0x810
[   53.982634][    C1]  tcp_v4_rcv+0x22ed/0x2ed0
[   53.982634][    C1]  ip_protocol_deliver_rcu+0x96/0xaf0
[   53.982634][    C1]  ip_local_deliver_finish+0x1e0/0x2f0
[   53.982634][    C1]  ip_sublist_rcv_finish+0x211/0x440
[   53.982634][    C1]  ip_list_rcv_finish.constprop.0+0x424/0x660
[   53.982634][    C1]  ip_list_rcv+0x2c8/0x410
[   53.982634][    C1]  __netif_receive_skb_list_core+0x65c/0x910
[   53.982634][    C1]  netif_receive_skb_list_internal+0x5f9/0xcb0
[   53.982634][    C1]  napi_complete_done+0x188/0x6e0
[   53.982634][    C1]  gro_cell_poll+0x10c/0x1d0
[   53.982634][    C1]  __napi_poll+0xa1/0x530
[   53.982634][    C1]  net_rx_action+0x567/0x1270
[   53.982634][    C1]  __do_softirq+0x28a/0x9ba
[   53.982634][    C1]  run_ksoftirqd+0x32/0x60
[   53.982634][    C1]  smpboot_thread_fn+0x559/0x8c0
[   53.982634][    C1]  kthread+0x3b9/0x490
[   53.982634][    C1]  ret_from_fork+0x22/0x30
[   53.982634][    C1]  </TASK>

Address the issue by skipping the GRO stage for shared or cloned skbs.
To reduce the chance of OoO, try to unclone the skbs before giving up.

v1 -> v2:
 - use avoid skb_copy and fallback to netif_receive_skb  - Eric
Reported-by: NIgnat Korchagin <ignat@cloudflare.com>
Fixes: d3256efd ("veth: allow enabling NAPI even without XDP")
Signed-off-by: NPaolo Abeni <pabeni@redhat.com>
Tested-by: NIgnat Korchagin <ignat@cloudflare.com>
Reviewed-by: NEric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/r/b5f61c5602aab01bac8d711d8d1bfab0a4817db7.1640197544.git.pabeni@redhat.comSigned-off-by: NJakub Kicinski <kuba@kernel.org>

9695b7de

openeuler / Kernel 接近 2 年 前同步成功

openeuler / Kernel
接近 2 年前同步成功