提交 · 3116f59c12bd24c513194cd3acb3ec1f7d468954 · openeuler / Kernel

05 1月, 2022 2 次提交

i40e: fix use-after-free in i40e_sync_filters_subtask() · 3116f59c

由 Di Zhu 提交于 11月 29, 2021

Using ifconfig command to delete the ipv6 address will cause
the i40e network card driver to delete its internal mac_filter and
i40e_service_task kernel thread will concurrently access the mac_filter.
These two processes are not protected by lock
so causing the following use-after-free problems.

 print_address_description+0x70/0x360
 ? vprintk_func+0x5e/0xf0
 kasan_report+0x1b2/0x330
 i40e_sync_vsi_filters+0x4f0/0x1850 [i40e]
 i40e_sync_filters_subtask+0xe3/0x130 [i40e]
 i40e_service_task+0x195/0x24c0 [i40e]
 process_one_work+0x3f5/0x7d0
 worker_thread+0x61/0x6c0
 ? process_one_work+0x7d0/0x7d0
 kthread+0x1c3/0x1f0
 ? kthread_park+0xc0/0xc0
 ret_from_fork+0x35/0x40

Allocated by task 2279810:
 kasan_kmalloc+0xa0/0xd0
 kmem_cache_alloc_trace+0xf3/0x1e0
 i40e_add_filter+0x127/0x2b0 [i40e]
 i40e_add_mac_filter+0x156/0x190 [i40e]
 i40e_addr_sync+0x2d/0x40 [i40e]
 __hw_addr_sync_dev+0x154/0x210
 i40e_set_rx_mode+0x6d/0xf0 [i40e]
 __dev_set_rx_mode+0xfb/0x1f0
 __dev_mc_add+0x6c/0x90
 igmp6_group_added+0x214/0x230
 __ipv6_dev_mc_inc+0x338/0x4f0
 addrconf_join_solict.part.7+0xa2/0xd0
 addrconf_dad_work+0x500/0x980
 process_one_work+0x3f5/0x7d0
 worker_thread+0x61/0x6c0
 kthread+0x1c3/0x1f0
 ret_from_fork+0x35/0x40

Freed by task 2547073:
 __kasan_slab_free+0x130/0x180
 kfree+0x90/0x1b0
 __i40e_del_filter+0xa3/0xf0 [i40e]
 i40e_del_mac_filter+0xf3/0x130 [i40e]
 i40e_addr_unsync+0x85/0xa0 [i40e]
 __hw_addr_sync_dev+0x9d/0x210
 i40e_set_rx_mode+0x6d/0xf0 [i40e]
 __dev_set_rx_mode+0xfb/0x1f0
 __dev_mc_del+0x69/0x80
 igmp6_group_dropped+0x279/0x510
 __ipv6_dev_mc_dec+0x174/0x220
 addrconf_leave_solict.part.8+0xa2/0xd0
 __ipv6_ifa_notify+0x4cd/0x570
 ipv6_ifa_notify+0x58/0x80
 ipv6_del_addr+0x259/0x4a0
 inet6_addr_del+0x188/0x260
 addrconf_del_ifaddr+0xcc/0x130
 inet6_ioctl+0x152/0x190
 sock_do_ioctl+0xd8/0x2b0
 sock_ioctl+0x2e5/0x4c0
 do_vfs_ioctl+0x14e/0xa80
 ksys_ioctl+0x7c/0xa0
 __x64_sys_ioctl+0x42/0x50
 do_syscall_64+0x98/0x2c0
 entry_SYSCALL_64_after_hwframe+0x65/0xca

Fixes: 41c445ff ("i40e: main driver core")
Signed-off-by: NDi Zhu <zhudi2@huawei.com>
Signed-off-by: NRui Zhang <zhangrui182@huawei.com>
Tested-by: NGurucharan G <gurucharanx.g@intel.com>
Signed-off-by: NTony Nguyen <anthony.l.nguyen@intel.com>

3116f59c

i40e: Fix to not show opcode msg on unsuccessful VF MAC change · 01cbf508

由 Mateusz Palczewski 提交于 3月 03, 2021

Hide i40e opcode information sent during response to VF in case when
untrusted VF tried to change MAC on the VF interface.

This is implemented by adding an additional parameter 'hide' to the
response sent to VF function that hides the display of error
information, but forwards the error code to VF.

Previously it was not possible to send response with some error code
to VF without displaying opcode information.

Fixes: 5c3c48ac ("i40e: implement virtual device interface")
Signed-off-by: NGrzegorz Szczurek <grzegorzx.szczurek@intel.com>
Signed-off-by: NMateusz Palczewski <mateusz.palczewski@intel.com>
Reviewed-by: NPaul M Stillwell Jr <paul.m.stillwell.jr@intel.com>
Reviewed-by: NAleksandr Loktionov <aleksandr.loktionov@intel.com>
Tested-by: NTony Brelinski <tony.brelinski@intel.com>
Signed-off-by: NTony Nguyen <anthony.l.nguyen@intel.com>

01cbf508

04 1月, 2022 1 次提交

Revert "net: phy: fixed_phy: Fix NULL vs IS_ERR() checking in __fixed_phy_register" · 065e1ae0

由 Florian Fainelli 提交于 1月 03, 2022

This reverts commit b45396af ("net: phy:
fixed_phy: Fix NULL vs IS_ERR() checking in __fixed_phy_register")
since it prevents any system that uses a fixed PHY without a GPIO
descriptor from properly working:

[    5.971952] brcm-systemport 9300000.ethernet: failed to register fixed PHY
[    5.978854] brcm-systemport: probe of 9300000.ethernet failed with error -22
[    5.986047] brcm-systemport 9400000.ethernet: failed to register fixed PHY
[    5.992947] brcm-systemport: probe of 9400000.ethernet failed with error -22

Fixes: b45396af ("net: phy: fixed_phy: Fix NULL vs IS_ERR() checking in __fixed_phy_register")
Signed-off-by: NFlorian Fainelli <f.fainelli@gmail.com>
Reviewed-by: NAndrew Lunn <andrew@lunn.ch>
Link: https://lore.kernel.org/r/20220103193453.1214961-1-f.fainelli@gmail.comSigned-off-by: NJakub Kicinski <kuba@kernel.org>

065e1ae0

03 1月, 2022 2 次提交

net/fsl: Remove leftover definition in xgmac_mdio · 1ef5e1d0

由 Markus Koch 提交于 1月 02, 2022

commit 26eee021 ("net/fsl: fix a bug in xgmac_mdio") fixed a bug in
the QorIQ mdio driver but left the (now unused) incorrect bit definition
for MDIO_DATA_BSY in the code. This commit removes it.
Signed-off-by: NMarkus Koch <markus@notsyncing.net>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

1ef5e1d0

rndis_host: support Hytera digital radios · 29262e1f

由 Thomas Toye 提交于 1月 01, 2022

Hytera makes a range of digital (DMR) radios. These radios can be
programmed to a allow a computer to control them over Ethernet over USB,
either using NCM or RNDIS.

This commit adds support for RNDIS for Hytera radios. I tested with a
Hytera PD785 and a Hytera MD785G. When these radios are programmed to
set up a Radio to PC Network using RNDIS, an USB interface will be added
with class 2 (Communications), subclass 2 (Abstract Modem Control) and
an interface protocol of 255 ("vendor specific" - lsusb even hints "MSFT
RNDIS?").

This patch is similar to the solution of this StackOverflow user, but
that only works for the Hytera MD785:
https://stackoverflow.com/a/53550858

To use the "Radio to PC Network" functionality of Hytera DMR radios, the
radios need to be programmed correctly in CPS (Hytera's Customer
Programming Software). "Forward to PC" should be checked in "Network"
(under "General Setting" in "Conventional") and the "USB Network
Communication Protocol" should be set to RNDIS.
Signed-off-by: NThomas Toye <thomas@toye.io>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

29262e1f

02 1月, 2022 3 次提交

net: ena: Fix error handling when calculating max IO queues number · 5055dc03

由 Arthur Kiyanovski 提交于 1月 02, 2022

The role of ena_calc_max_io_queue_num() is to return the number
of queues supported by the device, which means the return value
should be >=0.

The function that calls ena_calc_max_io_queue_num(), checks
the return value. If it is 0, it means the device reported
it supports 0 IO queues. This case is considered an error
and is handled by the calling function accordingly.

However the current implementation of ena_calc_max_io_queue_num()
is wrong, since when it detects the device supports 0 IO queues,
it returns -EFAULT.

In such a case the calling function doesn't detect the error,
and therefore doesn't handle it.

This commit changes ena_calc_max_io_queue_num() to return 0
in case the device reported it supports 0 queues, allowing the
calling function to properly handle the error case.

Fixes: 736ce3f4 ("net: ena: make ethtool -l show correct max number of queues")
Signed-off-by: NShay Agroskin <shayagr@amazon.com>
Signed-off-by: NArthur Kiyanovski <akiyano@amazon.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

5055dc03

net: ena: Fix wrong rx request id by resetting device · cb3d4f98

由 Arthur Kiyanovski 提交于 1月 02, 2022

A wrong request id received from the device is a sign that
something is wrong with it, therefore trigger a device reset.

Also add some debug info to the "Page is NULL" print to make
it easier to debug.

Fixes: 1738cd3e ("net: ena: Add a driver for Amazon Elastic Network Adapters (ENA)")
Signed-off-by: NArthur Kiyanovski <akiyano@amazon.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

cb3d4f98

net: ena: Fix undefined state when tx request id is out of bounds · c255a34e

由 Arthur Kiyanovski 提交于 1月 02, 2022

ena_com_tx_comp_req_id_get() checks the req_id of a received completion,
and if it is out of bounds returns -EINVAL. This is a sign that
something is wrong with the device and it needs to be reset.

The current code does not reset the device in this case, which leaves
the driver in an undefined state, where this completion is not properly
handled.

This commit adds a call to handle_invalid_req_id() in ena_clean_tx_irq()
and ena_clean_xdp_irq() which resets the device to fix the issue.

This commit also removes unnecessary request id checks from
validate_tx_req_id() and validate_xdp_req_id(). This check is unneeded
because it was already performed in ena_com_tx_comp_req_id_get(), which
is called right before these functions.

Fixes: 548c4940 ("net: ena: Implement XDP_TX action")
Signed-off-by: NShay Agroskin <shayagr@amazon.com>
Signed-off-by: NArthur Kiyanovski <akiyano@amazon.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

c255a34e

30 12月, 2021 1 次提交

fsl/fman: Fix missing put_device() call in fman_port_probe · bf2b09fe

由 Miaoqian Lin 提交于 12月 30, 2021

The reference taken by 'of_find_device_by_node()' must be released when
not needed anymore.
Add the corresponding 'put_device()' in the and error handling paths.

Fixes: 18a6c85f ("fsl/fman: Add FMan Port Support")
Signed-off-by: NMiaoqian Lin <linmq006@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

bf2b09fe

29 12月, 2021 5 次提交

net/mlx5e: Fix wrong features assignment in case of error · 992d8a4e

由 Gal Pressman 提交于 11月 29, 2021

In case of an error in mlx5e_set_features(), 'netdev->features' must be
updated with the correct state of the device to indicate which features
were updated successfully.
To do that we maintain a copy of 'netdev->features' and update it after
successful feature changes, so we can assign it to back to
'netdev->features' if needed.

However, since not all netdev features are handled by the driver (e.g.
GRO/TSO/etc), some features may not be updated correctly in case of an
error updating another feature.

For example, while requesting to disable TSO (feature which is not
handled by the driver) and enable HW-GRO, if an error occurs during
HW-GRO enable, 'oper_features' will be assigned with 'netdev->features'
and HW-GRO turned off. TSO will remain enabled in such case, which is a
bug.

To solve that, instead of using 'netdev->features' as the baseline of
'oper_features' and changing it on set feature success, use 'features'
instead and update it in case of errors.

Fixes: 75b81ce7 ("net/mlx5e: Don't override netdev features field unless in error flow")
Signed-off-by: NGal Pressman <gal@nvidia.com>
Signed-off-by: NSaeed Mahameed <saeedm@nvidia.com>

992d8a4e

net/mlx5e: TC, Fix memory leak with rules with internal port · 077cdda7

由 Roi Dayan 提交于 12月 22, 2021

Fix a memory leak with decap rule with internal port as destination
device. The driver allocates a modify hdr action but doesn't set
the flow attr modify hdr action which results in skipping releasing
the modify hdr action when releasing the flow.

backtrace:
    [<000000005f8c651c>] krealloc+0x83/0xd0
    [<000000009f59b143>] alloc_mod_hdr_actions+0x156/0x310 [mlx5_core]
    [<000000002257f342>] mlx5e_tc_match_to_reg_set_and_get_id+0x12a/0x360 [mlx5_core]
    [<00000000b44ea75a>] mlx5e_tc_add_fdb_flow+0x962/0x1470 [mlx5_core]
    [<0000000003e384a0>] __mlx5e_add_fdb_flow+0x54c/0xb90 [mlx5_core]
    [<00000000ed8b22b6>] mlx5e_configure_flower+0xe45/0x4af0 [mlx5_core]
    [<00000000024f4ab5>] mlx5e_rep_indr_offload.isra.0+0xfe/0x1b0 [mlx5_core]
    [<000000006c3bb494>] mlx5e_rep_indr_setup_tc_cb+0x90/0x130 [mlx5_core]
    [<00000000d3dac2ea>] tc_setup_cb_add+0x1d2/0x420

Fixes: b16eb3c8 ("net/mlx5: Support internal port as decap route device")
Signed-off-by: NRoi Dayan <roid@nvidia.com>
Signed-off-by: NSaeed Mahameed <saeedm@nvidia.com>

077cdda7

ionic: Initialize the 'lif->dbid_inuse' bitmap · 140c7bc7

由 Christophe JAILLET 提交于 12月 26, 2021

When allocated, this bitmap is not initialized. Only the first bit is set a
few lines below.

Use bitmap_zalloc() to make sure that it is cleared before being used.

Fixes: 6461b446 ("ionic: Add interrupts and doorbells")
Signed-off-by: NChristophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: NShannon Nelson <snelson@pensando.io>
Link: https://lore.kernel.org/r/6a478eae0b5e6c63774e1f0ddb1a3f8c38fa8ade.1640527506.git.christophe.jaillet@wanadoo.frSigned-off-by: NJakub Kicinski <kuba@kernel.org>

140c7bc7

igc: Fix TX timestamp support for non-MSI-X platforms · f85846bb

由 James McLaughlin 提交于 12月 17, 2021

Time synchronization was not properly enabled on non-MSI-X platforms.

Fixes: 2c344ae2 ("igc: Add support for TX timestamping")
Signed-off-by: NJames McLaughlin <james.mclaughlin@qsc.com>
Reviewed-by: NVinicius Costa Gomes <vinicius.gomes@intel.com>
Tested-by: NNechama Kraus <nechamax.kraus@linux.intel.com>
Signed-off-by: NTony Nguyen <anthony.l.nguyen@intel.com>

f85846bb

igc: Do not enable crosstimestamping for i225-V models · 1e81dcc1

由 Vinicius Costa Gomes 提交于 12月 13, 2021

It was reported that when PCIe PTM is enabled, some lockups could
be observed with some integrated i225-V models.

While the issue is investigated, we can disable crosstimestamp for
those models and see no loss of functionality, because those models
don't have any support for time synchronization.

Fixes: a90ec848 ("igc: Add support for PTP getcrosststamp()")
Link: https://lore.kernel.org/all/924175a188159f4e03bd69908a91e606b574139b.camel@gmx.de/Reported-by: NStefan Dietrich <roots@gmx.de>
Signed-off-by: NVinicius Costa Gomes <vinicius.gomes@intel.com>
Tested-by: NNechama Kraus <nechamax.kraus@linux.intel.com>
Signed-off-by: NTony Nguyen <anthony.l.nguyen@intel.com>

1e81dcc1

28 12月, 2021 2 次提交

net: lantiq_xrx200: fix statistics of received bytes · 5be60a94

由 Aleksander Jan Bajkowski 提交于 12月 27, 2021

Received frames have FCS truncated. There is no need
to subtract FCS length from the statistics.

Fixes: fe1a5642 ("net: lantiq: Add Lantiq / Intel VRX200 Ethernet driver")
Signed-off-by: NAleksander Jan Bajkowski <olek2@wp.pl>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

5be60a94

net: ag71xx: Fix a potential double free in error handling paths · 1cd5384c

由 Christophe JAILLET 提交于 12月 26, 2021

'ndev' is a managed resource allocated with devm_alloc_etherdev(), so there
is no need to call free_netdev() explicitly or there will be a double
free().

Simplify all error handling paths accordingly.

Fixes: d51b6ce4 ("net: ethernet: add ag71xx driver")
Signed-off-by: NChristophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

1cd5384c

27 12月, 2021 2 次提交

net: usb: pegasus: Do not drop long Ethernet frames · ca506fca

由 Matthias-Christian Ott 提交于 12月 26, 2021

The D-Link DSB-650TX (2001:4002) is unable to receive Ethernet frames
that are longer than 1518 octets, for example, Ethernet frames that
contain 802.1Q VLAN tags.

The frames are sent to the pegasus driver via USB but the driver
discards them because they have the Long_pkt field set to 1 in the
received status report. The function read_bulk_callback of the pegasus
driver treats such received "packets" (in the terminology of the
hardware) as errors but the field simply does just indicate that the
Ethernet frame (MAC destination to FCS) is longer than 1518 octets.

It seems that in the 1990s there was a distinction between
"giant" (> 1518) and "runt" (< 64) frames and the hardware includes
flags to indicate this distinction. It seems that the purpose of the
distinction "giant" frames was to not allow infinitely long frames due
to transmission errors and to allow hardware to have an upper limit of
the frame size. However, the hardware already has such limit with its
2048 octet receive buffer and, therefore, Long_pkt is merely a
convention and should not be treated as a receive error.

Actually, the hardware is even able to receive Ethernet frames with 2048
octets which exceeds the claimed limit frame size limit of the driver of
1536 octets (PEGASUS_MTU).

Fixes: 1da177e4 ("Linux-2.6.12-rc2")
Signed-off-by: NMatthias-Christian Ott <ott@mirix.org>
Reviewed-by: NAndrew Lunn <andrew@lunn.ch>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

ca506fca

atlantic: Fix buff_ring OOB in aq_ring_rx_clean · 5f501532

由 Zekun Shen 提交于 12月 26, 2021

The function obtain the next buffer without boundary check.
We should return with I/O error code.

The bug is found by fuzzing and the crash report is attached.
It is an OOB bug although reported as use-after-free.

[    4.804724] BUG: KASAN: use-after-free in aq_ring_rx_clean+0x1e88/0x2730 [atlantic]
[    4.805661] Read of size 4 at addr ffff888034fe93a8 by task ksoftirqd/0/9
[    4.806505]
[    4.806703] CPU: 0 PID: 9 Comm: ksoftirqd/0 Tainted: G        W         5.6.0 #34
[    4.809030] Call Trace:
[    4.809343]  dump_stack+0x76/0xa0
[    4.809755]  print_address_description.constprop.0+0x16/0x200
[    4.810455]  ? aq_ring_rx_clean+0x1e88/0x2730 [atlantic]
[    4.811234]  ? aq_ring_rx_clean+0x1e88/0x2730 [atlantic]
[    4.813183]  __kasan_report.cold+0x37/0x7c
[    4.813715]  ? aq_ring_rx_clean+0x1e88/0x2730 [atlantic]
[    4.814393]  kasan_report+0xe/0x20
[    4.814837]  aq_ring_rx_clean+0x1e88/0x2730 [atlantic]
[    4.815499]  ? hw_atl_b0_hw_ring_rx_receive+0x9a5/0xb90 [atlantic]
[    4.816290]  aq_vec_poll+0x179/0x5d0 [atlantic]
[    4.816870]  ? _GLOBAL__sub_I_65535_1_aq_pci_func_init+0x20/0x20 [atlantic]
[    4.817746]  ? __next_timer_interrupt+0xba/0xf0
[    4.818322]  net_rx_action+0x363/0xbd0
[    4.818803]  ? call_timer_fn+0x240/0x240
[    4.819302]  ? __switch_to_asm+0x40/0x70
[    4.819809]  ? napi_busy_loop+0x520/0x520
[    4.820324]  __do_softirq+0x18c/0x634
[    4.820797]  ? takeover_tasklets+0x5f0/0x5f0
[    4.821343]  run_ksoftirqd+0x15/0x20
[    4.821804]  smpboot_thread_fn+0x2f1/0x6b0
[    4.822331]  ? smpboot_unregister_percpu_thread+0x160/0x160
[    4.823041]  ? __kthread_parkme+0x80/0x100
[    4.823571]  ? smpboot_unregister_percpu_thread+0x160/0x160
[    4.824301]  kthread+0x2b5/0x3b0
[    4.824723]  ? kthread_create_on_node+0xd0/0xd0
[    4.825304]  ret_from_fork+0x35/0x40
Signed-off-by: NZekun Shen <bruceshenzk@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

5f501532

25 12月, 2021 1 次提交

net: phy: fixed_phy: Fix NULL vs IS_ERR() checking in __fixed_phy_register · b45396af

由 Miaoqian Lin 提交于 12月 24, 2021

The fixed_phy_get_gpiod function() returns NULL, it doesn't return error
pointers, using NULL checking to fix this.i

Fixes: 5468e82f ("net: phy: fixed-phy: Drop GPIO from fixed_phy_add()")
Signed-off-by: NMiaoqian Lin <linmq006@gmail.com>
Link: https://lore.kernel.org/r/20211224021500.10362-1-linmq006@gmail.comSigned-off-by: NJakub Kicinski <kuba@kernel.org>

b45396af

24 12月, 2021 5 次提交

net: stmmac: dwmac-visconti: Fix value of ETHER_CLK_SEL_FREQ_SEL_2P5M · 391e5975

由 Nobuhiro Iwamatsu 提交于 12月 23, 2021

ETHER_CLK_SEL_FREQ_SEL_2P5M is not 0 bit of the register. This is a
value, which is 0. Fix from BIT(0) to 0.
Reported-by: NYuji Ishikawa <yuji2.ishikawa@toshiba.co.jp>
Fixes: b38dd98f ("net: stmmac: Add Toshiba Visconti SoCs glue driver")
Signed-off-by: NNobuhiro Iwamatsu <nobuhiro1.iwamatsu@toshiba.co.jp>
Link: https://lore.kernel.org/r/20211223073633.101306-1-nobuhiro1.iwamatsu@toshiba.co.jpSigned-off-by: NJakub Kicinski <kuba@kernel.org>

391e5975

r8152: sync ocp base · b24edca3

由 Hayes Wang 提交于 12月 23, 2021

There are some chances that the actual base of hardware is different
from the value recorded by driver, so we have to reset the variable
of ocp_base to sync it.

Set ocp_base to -1. Then, it would be updated and the new base would be
set to the hardware next time.
Signed-off-by: NHayes Wang <hayeswang@realtek.com>
Signed-off-by: NJakub Kicinski <kuba@kernel.org>

b24edca3

r8152: fix the force speed doesn't work for RTL8156 · 45bf944e

由 Hayes Wang 提交于 12月 23, 2021

It needs to set mdio force mode. Otherwise, link off always occurs when
setting force speed.

Fixes: 195aae32 ("r8152: support new chips")
Signed-off-by: NHayes Wang <hayeswang@realtek.com>
Signed-off-by: NJakub Kicinski <kuba@kernel.org>

45bf944e

net: stmmac: ptp: fix potentially overflowing expression · eccffcf4

由 Xiaoliang Yang 提交于 12月 23, 2021

Convert the u32 variable to type u64 in a context where expression of
type u64 is required to avoid potential overflow.

Fixes: e9e37200 ("net: stmmac: ptp: update tas basetime after ptp adjust")
Signed-off-by: NXiaoliang Yang <xiaoliang.yang_1@nxp.com>
Link: https://lore.kernel.org/r/20211223073928.37371-1-xiaoliang.yang_1@nxp.comSigned-off-by: NJakub Kicinski <kuba@kernel.org>

eccffcf4

veth: ensure skb entering GRO are not cloned. · 9695b7de

由 Paolo Abeni 提交于 12月 22, 2021

After commit d3256efd ("veth: allow enabling NAPI even without XDP"),
if GRO is enabled on a veth device and TSO is disabled on the peer
device, TCP skbs will go through the NAPI callback. If there is no XDP
program attached, the veth code does not perform any share check, and
shared/cloned skbs could enter the GRO engine.

Ignat reported a BUG triggered later-on due to the above condition:

[   53.970529][    C1] kernel BUG at net/core/skbuff.c:3574!
[   53.981755][    C1] invalid opcode: 0000 [#1] PREEMPT SMP KASAN PTI
[   53.982634][    C1] CPU: 1 PID: 19 Comm: ksoftirqd/1 Not tainted 5.16.0-rc5+ #25
[   53.982634][    C1] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015
[   53.982634][    C1] RIP: 0010:skb_shift+0x13ef/0x23b0
[   53.982634][    C1] Code: ea 03 0f b6 04 02 48 89 fa 83 e2 07 38 d0
7f 08 84 c0 0f 85 41 0c 00 00 41 80 7f 02 00 4d 8d b5 d0 00 00 00 0f
85 74 f5 ff ff <0f> 0b 4d 8d 77 20 be 04 00 00 00 4c 89 44 24 78 4c 89
f7 4c 89 8c
[   53.982634][    C1] RSP: 0018:ffff8881008f7008 EFLAGS: 00010246
[   53.982634][    C1] RAX: 0000000000000000 RBX: ffff8881180b4c80 RCX: 0000000000000000
[   53.982634][    C1] RDX: 0000000000000002 RSI: ffff8881180b4d3c RDI: ffff88810bc9cac2
[   53.982634][    C1] RBP: ffff8881008f70b8 R08: ffff8881180b4cf4 R09: ffff8881180b4cf0
[   53.982634][    C1] R10: ffffed1022999e5c R11: 0000000000000002 R12: 0000000000000590
[   53.982634][    C1] R13: ffff88810f940c80 R14: ffff88810f940d50 R15: ffff88810bc9cac0
[   53.982634][    C1] FS:  0000000000000000(0000) GS:ffff888235880000(0000) knlGS:0000000000000000
[   53.982634][    C1] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   53.982634][    C1] CR2: 00007ff5f9b86680 CR3: 0000000108ce8004 CR4: 0000000000170ee0
[   53.982634][    C1] Call Trace:
[   53.982634][    C1]  <TASK>
[   53.982634][    C1]  tcp_sacktag_walk+0xaba/0x18e0
[   53.982634][    C1]  tcp_sacktag_write_queue+0xe7b/0x3460
[   53.982634][    C1]  tcp_ack+0x2666/0x54b0
[   53.982634][    C1]  tcp_rcv_established+0x4d9/0x20f0
[   53.982634][    C1]  tcp_v4_do_rcv+0x551/0x810
[   53.982634][    C1]  tcp_v4_rcv+0x22ed/0x2ed0
[   53.982634][    C1]  ip_protocol_deliver_rcu+0x96/0xaf0
[   53.982634][    C1]  ip_local_deliver_finish+0x1e0/0x2f0
[   53.982634][    C1]  ip_sublist_rcv_finish+0x211/0x440
[   53.982634][    C1]  ip_list_rcv_finish.constprop.0+0x424/0x660
[   53.982634][    C1]  ip_list_rcv+0x2c8/0x410
[   53.982634][    C1]  __netif_receive_skb_list_core+0x65c/0x910
[   53.982634][    C1]  netif_receive_skb_list_internal+0x5f9/0xcb0
[   53.982634][    C1]  napi_complete_done+0x188/0x6e0
[   53.982634][    C1]  gro_cell_poll+0x10c/0x1d0
[   53.982634][    C1]  __napi_poll+0xa1/0x530
[   53.982634][    C1]  net_rx_action+0x567/0x1270
[   53.982634][    C1]  __do_softirq+0x28a/0x9ba
[   53.982634][    C1]  run_ksoftirqd+0x32/0x60
[   53.982634][    C1]  smpboot_thread_fn+0x559/0x8c0
[   53.982634][    C1]  kthread+0x3b9/0x490
[   53.982634][    C1]  ret_from_fork+0x22/0x30
[   53.982634][    C1]  </TASK>

Address the issue by skipping the GRO stage for shared or cloned skbs.
To reduce the chance of OoO, try to unclone the skbs before giving up.

v1 -> v2:
 - use avoid skb_copy and fallback to netif_receive_skb  - Eric
Reported-by: NIgnat Korchagin <ignat@cloudflare.com>
Fixes: d3256efd ("veth: allow enabling NAPI even without XDP")
Signed-off-by: NPaolo Abeni <pabeni@redhat.com>
Tested-by: NIgnat Korchagin <ignat@cloudflare.com>
Reviewed-by: NEric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/r/b5f61c5602aab01bac8d711d8d1bfab0a4817db7.1640197544.git.pabeni@redhat.comSigned-off-by: NJakub Kicinski <kuba@kernel.org>

9695b7de

23 12月, 2021 16 次提交

net/mlx5: Fix some error handling paths in 'mlx5e_tc_add_fdb_flow()' · 4390c6ed

由 Christophe JAILLET 提交于 11月 06, 2021

All the error handling paths of 'mlx5e_tc_add_fdb_flow()' end to 'err_out'
where 'flow_flag_set(flow, FAILED);' is called.

All but the new error handling paths added by the commits given in the
Fixes tag below.

Fix these error handling paths and branch to 'err_out'.

Fixes: 166f431e ("net/mlx5e: Add indirect tc offload of ovs internal port")
Fixes: b16eb3c8 ("net/mlx5: Support internal port as decap route device")
Signed-off-by: NChristophe JAILLET <christophe.jaillet@wanadoo.fr>
Reviewed-by: NRoi Dayan <roid@nvidia.com>
Signed-off-by: NSaeed Mahameed <saeedm@nvidia.com>
(cherry picked from commit 31108d14)

4390c6ed

net/mlx5e: Delete forward rule for ct or sample action · 2820110d

由 Chris Mi 提交于 12月 02, 2021

When there is ct or sample action, the ct or sample rule will be deleted
and return. But if there is an extra mirror action, the forward rule can't
be deleted because of the return.

Fix it by removing the return.

Fixes: 69e2916e ("net/mlx5: CT: Add support for mirroring")
Fixes: f94d6389 ("net/mlx5e: TC, Add support to offload sample action")
Signed-off-by: NChris Mi <cmi@nvidia.com>
Reviewed-by: NRoi Dayan <roid@nvidia.com>
Signed-off-by: NSaeed Mahameed <saeedm@nvidia.com>

2820110d

net/mlx5e: Fix ICOSQ recovery flow for XSK · 19c4aba2

由 Maxim Mikityanskiy 提交于 7月 22, 2020

There are two ICOSQs per channel: one is needed for RX, and the other
for async operations (XSK TX, kTLS offload). Currently, the recovery
flow for both is the same, and async ICOSQ is mistakenly treated like
the regular ICOSQ.

This patch prevents running the regular ICOSQ recovery on async ICOSQ.
The purpose of async ICOSQ is to handle XSK wakeup requests and post
kTLS offload RX parameters, it has nothing to do with RQ and XSKRQ UMRs,
so the regular recovery sequence is not applicable here.

Fixes: be5323c8 ("net/mlx5e: Report and recover from CQE error on ICOSQ")
Signed-off-by: NMaxim Mikityanskiy <maximmi@mellanox.com>
Reviewed-by: NAya Levin <ayal@nvidia.com>
Signed-off-by: NSaeed Mahameed <saeedm@nvidia.com>

19c4aba2

net/mlx5e: Fix interoperability between XSK and ICOSQ recovery flow · 17958d7c

由 Maxim Mikityanskiy 提交于 10月 12, 2021

Both regular RQ and XSKRQ use the same ICOSQ for UMRs. When doing
recovery for the ICOSQ, don't forget to deactivate XSKRQ.

XSK can be opened and closed while channels are active, so a new mutex
prevents the ICOSQ recovery from running at the same time. The ICOSQ
recovery deactivates and reactivates XSKRQ, so any parallel change in
XSK state would break consistency. As the regular RQ is running, it's
not enough to just flush the recovery work, because it can be
rescheduled.

Fixes: be5323c8 ("net/mlx5e: Report and recover from CQE error on ICOSQ")
Signed-off-by: NMaxim Mikityanskiy <maximmi@nvidia.com>
Signed-off-by: NSaeed Mahameed <saeedm@nvidia.com>

17958d7c

net/mlx5e: Fix skb memory leak when TC classifier action offloads are disabled · a0cb9096

由 Gal Pressman 提交于 12月 13, 2021

When TC classifier action offloads are disabled (CONFIG_MLX5_CLS_ACT in
Kconfig), the mlx5e_rep_tc_receive() function which is responsible for
passing the skb to the stack (or freeing it) is defined as a nop, and
results in leaking the skb memory. Replace the nop with a call to
napi_gro_receive() to resolve the leak.

Fixes: 28e7606f ("net/mlx5e: Refactor rx handler of represetor device")
Signed-off-by: NGal Pressman <gal@nvidia.com>
Reviewed-by: NAriel Levkovich <lariel@nvidia.com>
Signed-off-by: NSaeed Mahameed <saeedm@nvidia.com>

a0cb9096

net/mlx5e: Wrap the tx reporter dump callback to extract the sq · 918fc385

由 Amir Tzin 提交于 11月 30, 2021

Function mlx5e_tx_reporter_dump_sq() casts its void * argument to struct
mlx5e_txqsq *, but in TX-timeout-recovery flow the argument is actually
of type struct mlx5e_tx_timeout_ctx *.

 mlx5_core 0000:08:00.1 enp8s0f1: TX timeout detected
 mlx5_core 0000:08:00.1 enp8s0f1: TX timeout on queue: 1, SQ: 0x11ec, CQ: 0x146d, SQ Cons: 0x0 SQ Prod: 0x1, usecs since last trans: 21565000
 BUG: stack guard page was hit at 0000000093f1a2de (stack is 00000000b66ea0dc..000000004d932dae)
 kernel stack overflow (page fault): 0000 [#1] SMP NOPTI
 CPU: 5 PID: 95 Comm: kworker/u20:1 Tainted: G W OE 5.13.0_mlnx #1
 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
 Workqueue: mlx5e mlx5e_tx_timeout_work [mlx5_core]
 RIP: 0010:mlx5e_tx_reporter_dump_sq+0xd3/0x180
 [mlx5_core]
 Call Trace:
 mlx5e_tx_reporter_dump+0x43/0x1c0 [mlx5_core]
 devlink_health_do_dump.part.91+0x71/0xd0
 devlink_health_report+0x157/0x1b0
 mlx5e_reporter_tx_timeout+0xb9/0xf0 [mlx5_core]
 ? mlx5e_tx_reporter_err_cqe_recover+0x1d0/0x1d0
 [mlx5_core]
 ? mlx5e_health_queue_dump+0xd0/0xd0 [mlx5_core]
 ? update_load_avg+0x19b/0x550
 ? set_next_entity+0x72/0x80
 ? pick_next_task_fair+0x227/0x340
 ? finish_task_switch+0xa2/0x280
   mlx5e_tx_timeout_work+0x83/0xb0 [mlx5_core]
   process_one_work+0x1de/0x3a0
   worker_thread+0x2d/0x3c0
 ? process_one_work+0x3a0/0x3a0
   kthread+0x115/0x130
 ? kthread_park+0x90/0x90
   ret_from_fork+0x1f/0x30
 --[ end trace 51ccabea504edaff ]---
 RIP: 0010:mlx5e_tx_reporter_dump_sq+0xd3/0x180
 PKRU: 55555554
 Kernel panic - not syncing: Fatal exception
 Kernel Offset: disabled
 end Kernel panic - not syncing: Fatal exception

To fix this bug add a wrapper for mlx5e_tx_reporter_dump_sq() which
extracts the sq from struct mlx5e_tx_timeout_ctx and set it as the
TX-timeout-recovery flow dump callback.

Fixes: 5f29458b ("net/mlx5e: Support dump callback in TX reporter")
Signed-off-by: NAya Levin <ayal@nvidia.com>
Signed-off-by: NAmir Tzin <amirtz@nvidia.com>
Signed-off-by: NSaeed Mahameed <saeedm@nvidia.com>

918fc385

net/mlx5: Fix tc max supported prio for nic mode · d671e109

由 Chris Mi 提交于 12月 14, 2021

Only prio 1 is supported if firmware doesn't support ignore flow
level for nic mode. The offending commit removed the check wrongly.
Add it back.

Fixes: 9a99c8f1 ("net/mlx5e: E-Switch, Offload all chain 0 priorities when modify header and forward action is not supported")
Signed-off-by: NChris Mi <cmi@nvidia.com>
Reviewed-by: NRoi Dayan <roid@nvidia.com>
Signed-off-by: NSaeed Mahameed <saeedm@nvidia.com>

d671e109

net/mlx5: Fix SF health recovery flow · 33de865f

由 Moshe Shemesh 提交于 11月 23, 2021

SF do not directly control the PCI device. During recovery flow SF
should not be allowed to do pci disable or pci reset, its PF will do it.

It fixes the following kernel trace:
mlx5_core.sf mlx5_core.sf.25: mlx5_health_try_recover:387:(pid 40948): starting health recovery flow
mlx5_core 0000:03:00.0: mlx5_pci_slot_reset was called
mlx5_core 0000:03:00.0: wait vital counter value 0xab175 after 1 iterations
mlx5_core.sf mlx5_core.sf.25: firmware version: 24.32.532
mlx5_core.sf mlx5_core.sf.23: mlx5_health_try_recover:387:(pid 40946): starting health recovery flow
mlx5_core 0000:03:00.0: mlx5_pci_slot_reset was called
mlx5_core 0000:03:00.0: wait vital counter value 0xab193 after 1 iterations
mlx5_core.sf mlx5_core.sf.23: firmware version: 24.32.532
mlx5_core.sf mlx5_core.sf.25: mlx5_cmd_check:813:(pid 40948): ENABLE_HCA(0x104) op_mod(0x0) failed,
status bad resource state(0x9), syndrome (0x658908)
mlx5_core.sf mlx5_core.sf.25: mlx5_function_setup:1292:(pid 40948): enable hca failed
mlx5_core.sf mlx5_core.sf.25: mlx5_health_try_recover:389:(pid 40948): health recovery failed

Fixes: 1958fc2f ("net/mlx5: SF, Add auxiliary device driver")
Signed-off-by: NMoshe Shemesh <moshe@nvidia.com>
Signed-off-by: NSaeed Mahameed <saeedm@nvidia.com>

33de865f

net/mlx5: Fix error print in case of IRQ request failed · aa968f92

由 Shay Drory 提交于 11月 24, 2021

In case IRQ layer failed to find or to request irq, the driver is
printing the first cpu of the provided affinity as part of the error
print. Empty affinity is a valid input for the IRQ layer, and it is
an error to call cpumask_first() on empty affinity.

Remove the first cpu print from the error message.

Fixes: c36326d3 ("net/mlx5: Round-Robin EQs over IRQs")
Signed-off-by: NShay Drory <shayd@nvidia.com>
Reviewed-by: NMoshe Shemesh <moshe@nvidia.com>
Signed-off-by: NSaeed Mahameed <saeedm@nvidia.com>

aa968f92

net/mlx5: Use first online CPU instead of hard coded CPU · 26a7993c

由 Shay Drory 提交于 10月 26, 2021

Hard coded CPU (0 in our case) might be offline. Hence, use the first
online CPU instead.

Fixes: f891b7cd ("net/mlx5: Enable single IRQ for PCI Function")
Signed-off-by: NShay Drory <shayd@nvidia.com>
Reviewed-by: NMoshe Shemesh <moshe@nvidia.com>
Signed-off-by: NSaeed Mahameed <saeedm@nvidia.com>

26a7993c

net/mlx5: DR, Fix querying eswitch manager vport for ECPF · 624bf42c

由 Yevgeny Kliteynik 提交于 12月 12, 2021

On BlueField the E-Switch manager is the ECPF (vport 0xFFFE), but when
querying capabilities of ECPF eswitch manager, need to query vport 0
with other_vport = 0.

Fixes: 9091b821 ("net/mlx5: DR, Handle eswitch manager and uplink vports separately")
Signed-off-by: NYevgeny Kliteynik <kliteyn@nvidia.com>
Reviewed-by: NAlex Vesker <valex@nvidia.com>
Signed-off-by: NSaeed Mahameed <saeedm@nvidia.com>

624bf42c

net/mlx5: DR, Fix NULL vs IS_ERR checking in dr_domain_init_resources · 6b8b4258

由 Miaoqian Lin 提交于 12月 22, 2021

The mlx5_get_uars_page() function  returns error pointers.
Using IS_ERR() to check the return value to fix this.

Fixes: 4ec9e7b0 ("net/mlx5: DR, Expose steering domain functionality")
Signed-off-by: NMiaoqian Lin <linmq006@gmail.com>
Signed-off-by: NSaeed Mahameed <saeedm@nvidia.com>

6b8b4258

asix: fix wrong return value in asix_check_host_enable() · d1652b70

由 Pavel Skripkin 提交于 12月 21, 2021

If asix_read_cmd() returns 0 on 30th interation, 0 will be returned from
asix_check_host_enable(), which is logically wrong. Fix it by returning
-ETIMEDOUT explicitly if we have exceeded 30 iterations

Also, replaced 30 with #define as suggested by Andrew

Fixes: a786e319 ("net: asix: fix uninit value bugs")
Reported-by: NAndrew Lunn <andrew@lunn.ch>
Signed-off-by: NPavel Skripkin <paskripkin@gmail.com>
Reviewed-by: NAndrew Lunn <andrew@lunn.ch>
Link: https://lore.kernel.org/r/ecd3470ce6c2d5697ac635d0d3b14a47defb4acb.1640117288.git.paskripkin@gmail.comSigned-off-by: NJakub Kicinski <kuba@kernel.org>

d1652b70

asix: fix uninit-value in asix_mdio_read() · 8035b1a2

由 Pavel Skripkin 提交于 12月 21, 2021

asix_read_cmd() may read less than sizeof(smsr) bytes and in this case
smsr will be uninitialized.

Fail log:
BUG: KMSAN: uninit-value in asix_check_host_enable drivers/net/usb/asix_common.c:82 [inline]
BUG: KMSAN: uninit-value in asix_check_host_enable drivers/net/usb/asix_common.c:82 [inline] drivers/net/usb/asix_common.c:497
BUG: KMSAN: uninit-value in asix_mdio_read+0x3c1/0xb00 drivers/net/usb/asix_common.c:497 drivers/net/usb/asix_common.c:497
asix_check_host_enable drivers/net/usb/asix_common.c:82 [inline]
asix_check_host_enable drivers/net/usb/asix_common.c:82 [inline] drivers/net/usb/asix_common.c:497
asix_mdio_read+0x3c1/0xb00 drivers/net/usb/asix_common.c:497 drivers/net/usb/asix_common.c:497

Fixes: d9fe64e5 ("net: asix: Add in_pm parameter")
Reported-and-tested-by: syzbot+f44badb06036334e867a@syzkaller.appspotmail.com
Reviewed-by: NAndrew Lunn <andrew@lunn.ch>
Signed-off-by: NPavel Skripkin <paskripkin@gmail.com>
Link: https://lore.kernel.org/r/8966e3b514edf39857dd93603fc79ec02e000a75.1640117288.git.paskripkin@gmail.comSigned-off-by: NJakub Kicinski <kuba@kernel.org>

8035b1a2

sfc: falcon: Check null pointer of rx_queue->page_ring · 9b8bdd1e

由 Jiasheng Jiang 提交于 12月 20, 2021

Because of the possible failure of the kcalloc, it should be better to
set rx_queue->page_ptr_mask to 0 when it happens in order to maintain
the consistency.

Fixes: 5a6681e2 ("sfc: separate out SFC4000 ("Falcon") support into new sfc-falcon driver")
Signed-off-by: NJiasheng Jiang <jiasheng@iscas.ac.cn>
Acked-by: NMartin Habets <habetsm.xilinx@gmail.com>
Link: https://lore.kernel.org/r/20211220140344.978408-1-jiasheng@iscas.ac.cnSigned-off-by: NJakub Kicinski <kuba@kernel.org>

9b8bdd1e

sfc: Check null pointer of rx_queue->page_ring · bdf1b5c3

由 Jiasheng Jiang 提交于 12月 20, 2021

Because of the possible failure of the kcalloc, it should be better to
set rx_queue->page_ptr_mask to 0 when it happens in order to maintain
the consistency.

Fixes: 5a6681e2 ("sfc: separate out SFC4000 ("Falcon") support into new sfc-falcon driver")
Signed-off-by: NJiasheng Jiang <jiasheng@iscas.ac.cn>
Acked-by: NMartin Habets <habetsm.xilinx@gmail.com>
Link: https://lore.kernel.org/r/20211220135603.954944-1-jiasheng@iscas.ac.cnSigned-off-by: NJakub Kicinski <kuba@kernel.org>

bdf1b5c3

openeuler / Kernel 接近 2 年 前同步成功

openeuler / Kernel
接近 2 年前同步成功