- 06 5月, 2022 10 次提交
-
-
由 Jacob Keller 提交于
The ice_for_each_vf macros have comments describing the implementation. One of the arguments has a period on the end, which is not our typical style. Remove the unnecessary period. Signed-off-by: NJacob Keller <jacob.e.keller@intel.com> Signed-off-by: NTony Nguyen <anthony.l.nguyen@intel.com>
-
由 Jacob Keller 提交于
This function definition was missing a comment describing its implementation. Add one. Signed-off-by: NJacob Keller <jacob.e.keller@intel.com> Signed-off-by: NTony Nguyen <anthony.l.nguyen@intel.com>
-
由 Jacob Keller 提交于
The comment explaining ice_reset_vf has an extraneous "the" with the "if the resets are disabled". Remove it. Signed-off-by: NJacob Keller <jacob.e.keller@intel.com> Signed-off-by: NTony Nguyen <anthony.l.nguyen@intel.com>
-
由 Jacob Keller 提交于
Since commit fe99d1c0 ("ice: make ice_reset_all_vfs void"), the ice_reset_all_vfs function has not returned anything. The function comment still indicated it did. Fix this. While here, also add a line to clarify the function resets all VFs at once in response to hardware resets such as a PF reset. Signed-off-by: NJacob Keller <jacob.e.keller@intel.com> Signed-off-by: NTony Nguyen <anthony.l.nguyen@intel.com>
-
由 Jacob Keller 提交于
The ice_get_vf_vsi function can return NULL in some cases, such as if handling messages during a reset where the VSI is being removed and recreated. Several places throughout the driver do not bother to check whether this VSI pointer is valid. Static analysis tools maybe report issues because they detect paths where a potentially NULL pointer could be dereferenced. Fix this by checking the return value of ice_get_vf_vsi everywhere. Signed-off-by: NJacob Keller <jacob.e.keller@intel.com> Reviewed-by: NPaul Menzel <pmenzel@molgen.mpg.de> Tested-by: NKonrad Jankowski <konrad0.jankowski@intel.com> Signed-off-by: NTony Nguyen <anthony.l.nguyen@intel.com>
-
由 Jacob Keller 提交于
The debug print in ice_vf_fdir_dump_info does not end in newlines. This can look confusing when reading the kernel log, as the next print will immediately continue on the same line. Fix this by adding the forgotten newline. Signed-off-by: NJacob Keller <jacob.e.keller@intel.com> Reviewed-by: NPaul Menzel <pmenzel@molgen.mpg.de> Signed-off-by: NTony Nguyen <anthony.l.nguyen@intel.com>
-
由 Michal Swiatkowski 提交于
Switch id should be the same for each netdevice on a driver. The id must be unique between devices on the same system, but does not need to be unique between devices on different systems. The switch id is used to locate ports on a switch and to know if aggregated ports belong to the same switch. To meet this requirements, use pci_get_dsn as switch id value, as this is unique value for each devices on the same system. Implementing switch id is needed by automatic tools for kubernetes. Set switch id by setting devlink port attribiutes and calling devlink_port_attrs_set while creating pf (for uplink) and vf (for representator) devlink port. To get switch id (in switchdev mode): cat /sys/class/net/$PF0/phys_switch_id Signed-off-by: NMichal Swiatkowski <michal.swiatkowski@linux.intel.com> Signed-off-by: NMarcin Szycik <marcin.szycik@linux.intel.com> Tested-by: NSandeep Penigalapati <sandeep.penigalapati@intel.com> Signed-off-by: NTony Nguyen <anthony.l.nguyen@intel.com>
-
由 Wojciech Drewek 提交于
When number of words exceeds ICE_MAX_CHAIN_WORDS, -ENOSPC should be returned not -EINVAL. Do not overwrite this error code in ice_add_tc_flower_adv_fltr. Signed-off-by: NWojciech Drewek <wojciech.drewek@intel.com> Suggested-by: NMarcin Szycik <marcin.szycik@linux.intel.com> Acked-by: NMaciej Fijalkowski <maciej.fijalkowski@intel.com> Tested-by: NSandeep Penigalapati <sandeep.penigalapati@intel.com> Signed-off-by: NTony Nguyen <anthony.l.nguyen@intel.com>
-
由 Maciej Fijalkowski 提交于
Both ice_idc.c and ice_virtchnl.c carry their own implementation of a helper function that is looking for a given VSI based on provided vsi_num. Their functionality is the same, so let's introduce the common function in ice.h that both of the mentioned sites will use. This is a strictly cleanup thing, no functionality is changed. Reviewed-by: NAlexander Lobakin <alexandr.lobakin@intel.com> Signed-off-by: NMaciej Fijalkowski <maciej.fijalkowski@intel.com> Tested-by: NKonrad Jankowski <konrad0.jankowski@intel.com> Signed-off-by: NTony Nguyen <anthony.l.nguyen@intel.com>
-
由 Wan Jiabing 提交于
Fix the following coccicheck warning: ./drivers/net/ethernet/intel/ice/ice_gnss.c:79:26-27: WARNING opportunity for min() Signed-off-by: NWan Jiabing <wanjiabing@vivo.com> Tested-by: Gurucharan <gurucharanx.g@intel.com> (A Contingent worker at Intel) Signed-off-by: NTony Nguyen <anthony.l.nguyen@intel.com>
-
- 27 4月, 2022 4 次提交
-
-
由 Jacob Keller 提交于
During ice_sriov_configure, if num_vfs is 0, we are being asked by the kernel to remove all VFs. The driver first de-initializes the snapshot before freeing all the VFs. This results in a use-after-free BUG detected by KASAN. The bug occurs because the snapshot can still be accessed until all VFs are removed. Fix this by freeing all the VFs first before calling ice_mbx_deinit_snapshot. [ +0.032591] ================================================================== [ +0.000021] BUG: KASAN: use-after-free in ice_mbx_vf_state_handler+0x1c3/0x410 [ice] [ +0.000315] Write of size 28 at addr ffff889908eb6f28 by task kworker/55:2/1530996 [ +0.000029] CPU: 55 PID: 1530996 Comm: kworker/55:2 Kdump: loaded Tainted: G S I 5.17.0-dirty #1 [ +0.000022] Hardware name: Dell Inc. PowerEdge R740/0923K0, BIOS 1.6.13 12/17/2018 [ +0.000013] Workqueue: ice ice_service_task [ice] [ +0.000279] Call Trace: [ +0.000012] <TASK> [ +0.000011] dump_stack_lvl+0x33/0x42 [ +0.000030] print_report.cold.13+0xb2/0x6b3 [ +0.000028] ? ice_mbx_vf_state_handler+0x1c3/0x410 [ice] [ +0.000295] kasan_report+0xa5/0x120 [ +0.000026] ? __switch_to_asm+0x21/0x70 [ +0.000024] ? ice_mbx_vf_state_handler+0x1c3/0x410 [ice] [ +0.000298] kasan_check_range+0x183/0x1e0 [ +0.000019] memset+0x1f/0x40 [ +0.000018] ice_mbx_vf_state_handler+0x1c3/0x410 [ice] [ +0.000304] ? ice_conv_link_speed_to_virtchnl+0x160/0x160 [ice] [ +0.000297] ? ice_vsi_dis_spoofchk+0x40/0x40 [ice] [ +0.000305] ice_is_malicious_vf+0x1aa/0x250 [ice] [ +0.000303] ? ice_restore_all_vfs_msi_state+0x160/0x160 [ice] [ +0.000297] ? __mutex_unlock_slowpath.isra.15+0x410/0x410 [ +0.000022] ? ice_debug_cq+0xb7/0x230 [ice] [ +0.000273] ? __kasan_slab_alloc+0x2f/0x90 [ +0.000022] ? memset+0x1f/0x40 [ +0.000017] ? do_raw_spin_lock+0x119/0x1d0 [ +0.000022] ? rwlock_bug.part.2+0x60/0x60 [ +0.000024] __ice_clean_ctrlq+0x3a6/0xd60 [ice] [ +0.000273] ? newidle_balance+0x5b1/0x700 [ +0.000026] ? ice_print_link_msg+0x2f0/0x2f0 [ice] [ +0.000271] ? update_cfs_group+0x1b/0x140 [ +0.000018] ? load_balance+0x1260/0x1260 [ +0.000022] ? ice_process_vflr_event+0x27/0x130 [ice] [ +0.000301] ice_service_task+0x136e/0x1470 [ice] [ +0.000281] process_one_work+0x3b4/0x6c0 [ +0.000030] worker_thread+0x65/0x660 [ +0.000023] ? __kthread_parkme+0xe4/0x100 [ +0.000021] ? process_one_work+0x6c0/0x6c0 [ +0.000020] kthread+0x179/0x1b0 [ +0.000018] ? kthread_complete_and_exit+0x20/0x20 [ +0.000022] ret_from_fork+0x22/0x30 [ +0.000026] </TASK> [ +0.000018] Allocated by task 10742: [ +0.000013] kasan_save_stack+0x1c/0x40 [ +0.000018] __kasan_kmalloc+0x84/0xa0 [ +0.000016] kmem_cache_alloc_trace+0x16c/0x2e0 [ +0.000015] intel_iommu_probe_device+0xeb/0x860 [ +0.000015] __iommu_probe_device+0x9a/0x2f0 [ +0.000016] iommu_probe_device+0x43/0x270 [ +0.000015] iommu_bus_notifier+0xa7/0xd0 [ +0.000015] blocking_notifier_call_chain+0x90/0xc0 [ +0.000017] device_add+0x5f3/0xd70 [ +0.000014] pci_device_add+0x404/0xa40 [ +0.000015] pci_iov_add_virtfn+0x3b0/0x550 [ +0.000016] sriov_enable+0x3bb/0x600 [ +0.000013] ice_ena_vfs+0x113/0xa79 [ice] [ +0.000293] ice_sriov_configure.cold.17+0x21/0xe0 [ice] [ +0.000291] sriov_numvfs_store+0x160/0x200 [ +0.000015] kernfs_fop_write_iter+0x1db/0x270 [ +0.000018] new_sync_write+0x21d/0x330 [ +0.000013] vfs_write+0x376/0x410 [ +0.000013] ksys_write+0xba/0x150 [ +0.000012] do_syscall_64+0x3a/0x80 [ +0.000012] entry_SYSCALL_64_after_hwframe+0x44/0xae [ +0.000028] Freed by task 10742: [ +0.000011] kasan_save_stack+0x1c/0x40 [ +0.000015] kasan_set_track+0x21/0x30 [ +0.000016] kasan_set_free_info+0x20/0x30 [ +0.000012] __kasan_slab_free+0x104/0x170 [ +0.000016] kfree+0x9b/0x470 [ +0.000013] devres_destroy+0x1c/0x20 [ +0.000015] devm_kfree+0x33/0x40 [ +0.000012] ice_mbx_deinit_snapshot+0x39/0x70 [ice] [ +0.000295] ice_sriov_configure+0xb0/0x260 [ice] [ +0.000295] sriov_numvfs_store+0x1bc/0x200 [ +0.000015] kernfs_fop_write_iter+0x1db/0x270 [ +0.000016] new_sync_write+0x21d/0x330 [ +0.000012] vfs_write+0x376/0x410 [ +0.000012] ksys_write+0xba/0x150 [ +0.000012] do_syscall_64+0x3a/0x80 [ +0.000012] entry_SYSCALL_64_after_hwframe+0x44/0xae [ +0.000024] Last potentially related work creation: [ +0.000010] kasan_save_stack+0x1c/0x40 [ +0.000016] __kasan_record_aux_stack+0x98/0xa0 [ +0.000013] insert_work+0x34/0x160 [ +0.000015] __queue_work+0x20e/0x650 [ +0.000016] queue_work_on+0x4c/0x60 [ +0.000015] nf_nat_masq_schedule+0x297/0x2e0 [nf_nat] [ +0.000034] masq_device_event+0x5a/0x60 [nf_nat] [ +0.000031] raw_notifier_call_chain+0x5f/0x80 [ +0.000017] dev_close_many+0x1d6/0x2c0 [ +0.000015] unregister_netdevice_many+0x4e3/0xa30 [ +0.000015] unregister_netdevice_queue+0x192/0x1d0 [ +0.000014] iavf_remove+0x8f9/0x930 [iavf] [ +0.000058] pci_device_remove+0x65/0x110 [ +0.000015] device_release_driver_internal+0xf8/0x190 [ +0.000017] pci_stop_bus_device+0xb5/0xf0 [ +0.000014] pci_stop_and_remove_bus_device+0xe/0x20 [ +0.000016] pci_iov_remove_virtfn+0x19c/0x230 [ +0.000015] sriov_disable+0x4f/0x170 [ +0.000014] ice_free_vfs+0x9a/0x490 [ice] [ +0.000306] ice_sriov_configure+0xb8/0x260 [ice] [ +0.000294] sriov_numvfs_store+0x1bc/0x200 [ +0.000015] kernfs_fop_write_iter+0x1db/0x270 [ +0.000016] new_sync_write+0x21d/0x330 [ +0.000012] vfs_write+0x376/0x410 [ +0.000012] ksys_write+0xba/0x150 [ +0.000012] do_syscall_64+0x3a/0x80 [ +0.000012] entry_SYSCALL_64_after_hwframe+0x44/0xae [ +0.000025] The buggy address belongs to the object at ffff889908eb6f00 which belongs to the cache kmalloc-96 of size 96 [ +0.000016] The buggy address is located 40 bytes inside of 96-byte region [ffff889908eb6f00, ffff889908eb6f60) [ +0.000026] The buggy address belongs to the physical page: [ +0.000010] page:00000000b7e99a2e refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x1908eb6 [ +0.000016] flags: 0x57ffffc0000200(slab|node=1|zone=2|lastcpupid=0x1fffff) [ +0.000024] raw: 0057ffffc0000200 ffffea0069d9fd80 dead000000000002 ffff88810004c780 [ +0.000015] raw: 0000000000000000 0000000000200020 00000001ffffffff 0000000000000000 [ +0.000009] page dumped because: kasan: bad access detected [ +0.000016] Memory state around the buggy address: [ +0.000012] ffff889908eb6e00: fa fb fb fb fb fb fb fb fb fb fb fb fc fc fc fc [ +0.000014] ffff889908eb6e80: fa fb fb fb fb fb fb fb fb fb fb fb fc fc fc fc [ +0.000014] >ffff889908eb6f00: fa fb fb fb fb fb fb fb fb fb fb fb fc fc fc fc [ +0.000011] ^ [ +0.000013] ffff889908eb6f80: fa fb fb fb fb fb fb fb fb fb fb fb fc fc fc fc [ +0.000013] ffff889908eb7000: fa fb fb fb fb fb fb fb fc fc fc fc fa fb fb fb [ +0.000012] ================================================================== Fixes: 0891c896 ("ice: warn about potentially malicious VFs") Reported-by: NSlawomir Laba <slawomirx.laba@intel.com> Signed-off-by: NJacob Keller <jacob.e.keller@intel.com> Tested-by: NKonrad Jankowski <konrad0.jankowski@intel.com> Signed-off-by: NTony Nguyen <anthony.l.nguyen@intel.com>
-
由 Petr Oros 提交于
We need to wait 5 s for EMP reset after firmware flash. Code was extracted from OOT driver (ice v1.8.3 downloaded from sourceforge). Without this wait, fw_activate let card in inconsistent state and recoverable only by second flash/activate. Flash was tested on these fw's: From -> To 3.00 -> 3.10/3.20 3.10 -> 3.00/3.20 3.20 -> 3.00/3.10 Reproducer: [root@host ~]# devlink dev flash pci/0000:ca:00.0 file E810_XXVDA4_FH_O_SEC_FW_1p6p1p9_NVM_3p10_PLDMoMCTP_0.11_8000AD7B.bin Preparing to flash [fw.mgmt] Erasing [fw.mgmt] Erasing done [fw.mgmt] Flashing 100% [fw.mgmt] Flashing done 100% [fw.undi] Erasing [fw.undi] Erasing done [fw.undi] Flashing 100% [fw.undi] Flashing done 100% [fw.netlist] Erasing [fw.netlist] Erasing done [fw.netlist] Flashing 100% [fw.netlist] Flashing done 100% Activate new firmware by devlink reload [root@host ~]# devlink dev reload pci/0000:ca:00.0 action fw_activate reload_actions_performed: fw_activate [root@host ~]# ip link show ens7f0 71: ens7f0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc mq state DOWN mode DEFAULT group default qlen 1000 link/ether b4:96:91:dc:72:e0 brd ff:ff:ff:ff:ff:ff altname enp202s0f0 dmesg after flash: [ 55.120788] ice: Copyright (c) 2018, Intel Corporation. [ 55.274734] ice 0000:ca:00.0: Get PHY capabilities failed status = -5, continuing anyway [ 55.569797] ice 0000:ca:00.0: The DDP package was successfully loaded: ICE OS Default Package version 1.3.28.0 [ 55.603629] ice 0000:ca:00.0: Get PHY capability failed. [ 55.608951] ice 0000:ca:00.0: ice_init_nvm_phy_type failed: -5 [ 55.647348] ice 0000:ca:00.0: PTP init successful [ 55.675536] ice 0000:ca:00.0: DCB is enabled in the hardware, max number of TCs supported on this port are 8 [ 55.685365] ice 0000:ca:00.0: FW LLDP is disabled, DCBx/LLDP in SW mode. [ 55.692179] ice 0000:ca:00.0: Commit DCB Configuration to the hardware [ 55.701382] ice 0000:ca:00.0: 126.024 Gb/s available PCIe bandwidth, limited by 16.0 GT/s PCIe x8 link at 0000:c9:02.0 (capable of 252.048 Gb/s with 16.0 GT/s PCIe x16 link) Reboot doesn’t help, only second flash/activate with OOT or patched driver put card back in consistent state. After patch: [root@host ~]# devlink dev flash pci/0000:ca:00.0 file E810_XXVDA4_FH_O_SEC_FW_1p6p1p9_NVM_3p10_PLDMoMCTP_0.11_8000AD7B.bin Preparing to flash [fw.mgmt] Erasing [fw.mgmt] Erasing done [fw.mgmt] Flashing 100% [fw.mgmt] Flashing done 100% [fw.undi] Erasing [fw.undi] Erasing done [fw.undi] Flashing 100% [fw.undi] Flashing done 100% [fw.netlist] Erasing [fw.netlist] Erasing done [fw.netlist] Flashing 100% [fw.netlist] Flashing done 100% Activate new firmware by devlink reload [root@host ~]# devlink dev reload pci/0000:ca:00.0 action fw_activate reload_actions_performed: fw_activate [root@host ~]# ip link show ens7f0 19: ens7f0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000 link/ether b4:96:91:dc:72:e0 brd ff:ff:ff:ff:ff:ff altname enp202s0f0 Fixes: 399e27db ("ice: support immediate firmware activation via devlink reload") Signed-off-by: NPetr Oros <poros@redhat.com> Tested-by: Gurucharan <gurucharanx.g@intel.com> (A Contingent worker at Intel) Signed-off-by: NTony Nguyen <anthony.l.nguyen@intel.com>
-
由 Ivan Vecera 提交于
Previous patch labelled "ice: Fix incorrect locking in ice_vc_process_vf_msg()" fixed an issue with ignored messages sent by VF driver but a small race window still left. Recently caught trace during 'ip link set ... vf 0 vlan ...' operation: [ 7332.995625] ice 0000:3b:00.0: Clearing port VLAN on VF 0 [ 7333.001023] iavf 0000:3b:01.0: Reset indication received from the PF [ 7333.007391] iavf 0000:3b:01.0: Scheduling reset task [ 7333.059575] iavf 0000:3b:01.0: PF returned error -5 (IAVF_ERR_PARAM) to our request 3 [ 7333.059626] ice 0000:3b:00.0: Invalid message from VF 0, opcode 3, len 4, error -1 Setting of VLAN for VF causes a reset of the affected VF using ice_reset_vf() function that runs with cfg_lock taken: 1. ice_notify_vf_reset() informs IAVF driver that reset is needed and IAVF schedules its own reset procedure 2. Bit ICE_VF_STATE_DIS is set in vf->vf_state 3. Misc initialization steps 4. ice_sriov_post_vsi_rebuild() -> ice_vf_set_initialized() and that clears ICE_VF_STATE_DIS in vf->vf_state Step 3 is mentioned race window because IAVF reset procedure runs in parallel and one of its step is sending of VIRTCHNL_OP_GET_VF_RESOURCES message (opcode==3). This message is handled in ice_vc_process_vf_msg() and if it is received during the mentioned race window then it's marked as invalid and error is returned to VF driver. Protect vf_state check in ice_vc_process_vf_msg() by cfg_lock to avoid this race condition. Fixes: e6ba5273 ("ice: Fix race conditions between virtchnl handling and VF ndo ops") Tested-by: NFei Liu <feliu@redhat.com> Signed-off-by: NIvan Vecera <ivecera@redhat.com> Reviewed-by: NJacob Keller <jacob.e.keller@intel.com> Tested-by: NKonrad Jankowski <konrad0.jankowski@intel.com> Signed-off-by: NTony Nguyen <anthony.l.nguyen@intel.com>
-
由 Ivan Vecera 提交于
Usage of mutex_trylock() in ice_vc_process_vf_msg() is incorrect because message sent from VF is ignored and never processed. Use mutex_lock() instead to fix the issue. It is safe because this mutex is used to prevent races between VF related NDOs and handlers processing request messages from VF and these handlers are running in ice_service_task() context. Additionally move this mutex lock prior ice_vc_is_opcode_allowed() call to avoid potential races during allowlist access. Fixes: e6ba5273 ("ice: Fix race conditions between virtchnl handling and VF ndo ops") Signed-off-by: NIvan Vecera <ivecera@redhat.com> Tested-by: NKonrad Jankowski <konrad0.jankowski@intel.com> Signed-off-by: NTony Nguyen <anthony.l.nguyen@intel.com>
-
- 16 4月, 2022 4 次提交
-
-
由 Maciej Fijalkowski 提交于
Call alloc Rx routine for ZC driver only when the amount of unallocated descriptors exceeds given threshold. Signed-off-by: NMaciej Fijalkowski <maciej.fijalkowski@intel.com> Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20220413153015.453864-14-maciej.fijalkowski@intel.com
-
由 Maciej Fijalkowski 提交于
Currently, when debugging AF_XDP workloads, one can correlate the -ENXIO return code as the case that XSK is not in the bound state. Returning same code from ndo_xsk_wakeup can be misleading and simply makes it harder to follow what is going on. Change ENXIOs in ice's ndo_xsk_wakeup() implementation to EINVALs, so that when probing it is clear that something is wrong on the driver side, not the xsk_{recv,send}msg. There is a -ENETDOWN that can happen from both kernel/driver sides though, but I don't have a correct replacement for this on one of the sides, so let's keep it that way. Signed-off-by: NMaciej Fijalkowski <maciej.fijalkowski@intel.com> Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20220413153015.453864-9-maciej.fijalkowski@intel.com
-
由 Maciej Fijalkowski 提交于
When XSK pool uses need_wakeup feature, correlate -ENOBUFS that was returned from xdp_do_redirect() with a XSK Rx queue being full. In such case, terminate the Rx processing that is being done on the current HW Rx ring and let the user space consume descriptors from XSK Rx queue so that there is room that driver can use later on. Introduce new internal return code ICE_XDP_EXIT that will indicate case described above. Note that it does not affect Tx processing that is bound to the same NAPI context, nor the other Rx rings. Signed-off-by: NMaciej Fijalkowski <maciej.fijalkowski@intel.com> Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20220413153015.453864-6-maciej.fijalkowski@intel.com
-
由 Maciej Fijalkowski 提交于
ice_run_xdp_zc() suggests to compiler that XDP_REDIRECT is the most probable action returned from BPF program that AF_XDP has in its pipeline. Let's also bring this suggestion up to the callsite of ice_run_xdp_zc() so that compiler will be able to generate more optimized code which in turn will make branch predictor happy. Suggested-by: NJesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: NMaciej Fijalkowski <maciej.fijalkowski@intel.com> Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20220413153015.453864-4-maciej.fijalkowski@intel.com
-
- 14 4月, 2022 4 次提交
-
-
由 Jianglei Nie 提交于
A memory chunk was allocated for orom_data in ice_get_orom_civd_data() by vzmalloc(). But when ice_read_flash_module() fails, the allocated memory is not freed, which will lead to a memory leak. We can fix it by freeing the orom_data when ce_read_flash_module() fails. Fixes: af18d886 ("ice: reduce time to read Option ROM CIVD data") Signed-off-by: NJianglei Nie <niejianglei2021@163.com> Tested-by: Gurucharan <gurucharanx.g@intel.com> (A Contingent worker at Intel) Signed-off-by: NTony Nguyen <anthony.l.nguyen@intel.com>
-
由 Wojciech Drewek 提交于
Below steps end up with crash: - modprobe ice - devlink dev eswitch set $PF1_PCI mode switchdev - echo 64 > /sys/class/net/$PF1/device/sriov_numvfs - rmmod ice Calling ice_eswitch_port_start_xmit while the process of removing VFs is in progress ends up with NULL pointer dereference. That's because PR netdev is not released but some resources are already freed. Fix it by checking if ICE_VF_DIS bit is set. Call trace: [ 1379.595146] BUG: kernel NULL pointer dereference, address: 0000000000000040 [ 1379.595284] #PF: supervisor read access in kernel mode [ 1379.595410] #PF: error_code(0x0000) - not-present page [ 1379.595535] PGD 0 P4D 0 [ 1379.595657] Oops: 0000 [#1] PREEMPT SMP PTI [ 1379.595783] CPU: 4 PID: 974 Comm: NetworkManager Kdump: loaded Tainted: G OE 5.17.0-rc8_mrq_dev-queue+ #12 [ 1379.595926] Hardware name: Intel Corporation S1200SP/S1200SP, BIOS S1200SP.86B.03.01.0042.013020190050 01/30/2019 [ 1379.596063] RIP: 0010:ice_eswitch_port_start_xmit+0x46/0xd0 [ice] [ 1379.596292] Code: c7 c8 09 00 00 e8 9a c9 fc ff 84 c0 0f 85 82 00 00 00 4c 89 e7 e8 ca 70 fe ff 48 8b 7d 58 48 89 c3 48 85 ff 75 5e 48 8b 53 20 <8b> 42 40 85 c0 74 78 8d 48 01 f0 0f b1 4a 40 75 f2 0f b6 95 84 00 [ 1379.596456] RSP: 0018:ffffaba0c0d7bad0 EFLAGS: 00010246 [ 1379.596584] RAX: ffff969c14c71680 RBX: ffff969c14c71680 RCX: 000100107a0f0000 [ 1379.596715] RDX: 0000000000000000 RSI: ffff969b9d631000 RDI: 0000000000000000 [ 1379.596846] RBP: ffff969c07b46500 R08: ffff969becfca8ac R09: 0000000000000001 [ 1379.596977] R10: 0000000000000004 R11: ffffaba0c0d7bbec R12: ffff969b9d631000 [ 1379.597106] R13: ffffffffc08357a0 R14: ffff969c07b46500 R15: ffff969b9d631000 [ 1379.597237] FS: 00007f72c0e25c80(0000) GS:ffff969f13500000(0000) knlGS:0000000000000000 [ 1379.597414] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 1379.597562] CR2: 0000000000000040 CR3: 000000012b316006 CR4: 00000000003706e0 [ 1379.597713] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 1379.597863] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 1379.598015] Call Trace: [ 1379.598153] <TASK> [ 1379.598294] dev_hard_start_xmit+0xd9/0x220 [ 1379.598444] sch_direct_xmit+0x8a/0x340 [ 1379.598592] __dev_queue_xmit+0xa3c/0xd30 [ 1379.598739] ? packet_parse_headers+0xb4/0xf0 [ 1379.598890] packet_sendmsg+0xa15/0x1620 [ 1379.599038] ? __check_object_size+0x46/0x140 [ 1379.599186] sock_sendmsg+0x5e/0x60 [ 1379.599330] ____sys_sendmsg+0x22c/0x270 [ 1379.599474] ? import_iovec+0x17/0x20 [ 1379.599622] ? sendmsg_copy_msghdr+0x59/0x90 [ 1379.599771] ___sys_sendmsg+0x81/0xc0 [ 1379.599917] ? __pollwait+0xd0/0xd0 [ 1379.600061] ? preempt_count_add+0x68/0xa0 [ 1379.600210] ? _raw_write_lock_irq+0x1a/0x40 [ 1379.600369] ? ep_done_scan+0xc9/0x110 [ 1379.600494] ? _raw_spin_unlock_irqrestore+0x25/0x40 [ 1379.600622] ? preempt_count_add+0x68/0xa0 [ 1379.600747] ? _raw_spin_lock_irq+0x1a/0x40 [ 1379.600899] ? __fget_light+0x8f/0x110 [ 1379.601024] __sys_sendmsg+0x49/0x80 [ 1379.601148] ? release_ds_buffers+0x50/0xe0 [ 1379.601274] do_syscall_64+0x3b/0x90 [ 1379.601399] entry_SYSCALL_64_after_hwframe+0x44/0xae [ 1379.601525] RIP: 0033:0x7f72c1e2e35d Fixes: f5396b8a ("ice: switchdev slow path") Signed-off-by: NWojciech Drewek <wojciech.drewek@intel.com> Reported-by: NMarcin Szycik <marcin.szycik@linux.intel.com> Reviewed-by: NMichal Swiatkowski <michal.swiatkowski@linux.intel.com> Tested-by: NSandeep Penigalapati <sandeep.penigalapati@intel.com> Signed-off-by: NTony Nguyen <anthony.l.nguyen@intel.com>
-
由 Maciej Fijalkowski 提交于
Currently for !CONFIG_NET_SWITCHDEV kernel builds it is not possible to create VFs properly as call to ice_eswitch_configure() returns -EOPNOTSUPP for us. This is because CONFIG_ICE_SWITCHDEV depends on CONFIG_NET_SWITCHDEV. Change the ice_eswitch_configure() implementation for !CONFIG_ICE_SWITCHDEV to return 0 instead -EOPNOTSUPP and let ice_ena_vfs() finish its work properly. CC: Grzegorz Nitka <grzegorz.nitka@intel.com> Fixes: 1a1c40df ("ice: set and release switchdev environment") Signed-off-by: NMaciej Fijalkowski <maciej.fijalkowski@intel.com> Signed-off-by: NMichal Swiatkowski <michal.swiatkowski@intel.com> Tested-by: NKonrad Jankowski <konrad0.jankowski@intel.com> Signed-off-by: NTony Nguyen <anthony.l.nguyen@intel.com>
-
由 Maciej Fijalkowski 提交于
__ice_alloc_rx_bufs_zc() checks if a number of the descriptors to be allocated would cause the ring wrap. In that case, driver will issue two calls to xsk_buff_alloc_batch() - one that will fill the ring up to the end and the second one that will start with filling descriptors from the beginning of the ring. ice_fill_rx_descs() is a wrapper for taking care of what xsk_buff_alloc_batch() gave back to the driver. It works in a best effort approach, so for example when driver asks for 64 buffers, ice_fill_rx_descs() could assign only 32. Such case needs to be checked when ring is being filled up to the end, because in that situation ntu might not reached the end of the ring. Fix the ring wrap by checking if nb_buffs_extra has the expected value. If not, bump ntu and go directly to tail update. Fixes: 3876ff52 ("ice: xsk: Handle SW XDP ring wrap and bump tail more often") Signed-off-by: NMagnus Karlsson <magnus.karlsson@intel.com> Signed-off-by: NMaciej Fijalkowski <maciej.fijalkowski@intel.com> Tested-by: NShwetha Nagaraju <Shwetha.nagaraju@intel.com> Signed-off-by: NTony Nguyen <anthony.l.nguyen@intel.com>
-
- 13 4月, 2022 1 次提交
-
-
由 Joe Damato 提交于
Attempt to add mpls+tso support. I don't have ice hardware available to test myself, but I just implemented this feature in i40e and thought it might be useful to implement for ice while this is fresh in my brain. Hoping some one at intel will be able to test this on my behalf. Signed-off-by: NJoe Damato <jdamato@fastly.com> Tested-by: Gurucharan <gurucharanx.g@intel.com> (A Contingent worker at Intel) Signed-off-by: NTony Nguyen <anthony.l.nguyen@intel.com>
-
- 09 4月, 2022 1 次提交
-
-
由 Alexander Lobakin 提交于
The CI testing bots triggered the following splat: [ 718.203054] BUG: KASAN: use-after-free in free_irq_cpu_rmap+0x53/0x80 [ 718.206349] Read of size 4 at addr ffff8881bd127e00 by task sh/20834 [ 718.212852] CPU: 28 PID: 20834 Comm: sh Kdump: loaded Tainted: G S W IOE 5.17.0-rc8_nextqueue-devqueue-02643-g23f3121aca93 #1 [ 718.219695] Hardware name: Intel Corporation S2600WFT/S2600WFT, BIOS SE5C620.86B.02.01.0012.070720200218 07/07/2020 [ 718.223418] Call Trace: [ 718.227139] [ 718.230783] dump_stack_lvl+0x33/0x42 [ 718.234431] print_address_description.constprop.9+0x21/0x170 [ 718.238177] ? free_irq_cpu_rmap+0x53/0x80 [ 718.241885] ? free_irq_cpu_rmap+0x53/0x80 [ 718.245539] kasan_report.cold.18+0x7f/0x11b [ 718.249197] ? free_irq_cpu_rmap+0x53/0x80 [ 718.252852] free_irq_cpu_rmap+0x53/0x80 [ 718.256471] ice_free_cpu_rx_rmap.part.11+0x37/0x50 [ice] [ 718.260174] ice_remove_arfs+0x5f/0x70 [ice] [ 718.263810] ice_rebuild_arfs+0x3b/0x70 [ice] [ 718.267419] ice_rebuild+0x39c/0xb60 [ice] [ 718.270974] ? asm_sysvec_apic_timer_interrupt+0x12/0x20 [ 718.274472] ? ice_init_phy_user_cfg+0x360/0x360 [ice] [ 718.278033] ? delay_tsc+0x4a/0xb0 [ 718.281513] ? preempt_count_sub+0x14/0xc0 [ 718.284984] ? delay_tsc+0x8f/0xb0 [ 718.288463] ice_do_reset+0x92/0xf0 [ice] [ 718.292014] ice_pci_err_resume+0x91/0xf0 [ice] [ 718.295561] pci_reset_function+0x53/0x80 <...> [ 718.393035] Allocated by task 690: [ 718.433497] Freed by task 20834: [ 718.495688] Last potentially related work creation: [ 718.568966] The buggy address belongs to the object at ffff8881bd127e00 which belongs to the cache kmalloc-96 of size 96 [ 718.574085] The buggy address is located 0 bytes inside of 96-byte region [ffff8881bd127e00, ffff8881bd127e60) [ 718.579265] The buggy address belongs to the page: [ 718.598905] Memory state around the buggy address: [ 718.601809] ffff8881bd127d00: fa fb fb fb fb fb fb fb fb fb fb fb fc fc fc fc [ 718.604796] ffff8881bd127d80: 00 00 00 00 00 00 00 00 00 00 fc fc fc fc fc fc [ 718.607794] >ffff8881bd127e00: fa fb fb fb fb fb fb fb fb fb fb fb fc fc fc fc [ 718.610811] ^ [ 718.613819] ffff8881bd127e80: 00 00 00 00 00 00 00 00 00 00 00 00 fc fc fc fc [ 718.617107] ffff8881bd127f00: fa fb fb fb fb fb fb fb fb fb fb fb fc fc fc fc This is due to that free_irq_cpu_rmap() is always being called *after* (devm_)free_irq() and thus it tries to work with IRQ descs already freed. For example, on device reset the driver frees the rmap right before allocating a new one (the splat above). Make rmap creation and freeing function symmetrical with {request,free}_irq() calls i.e. do that on ifup/ifdown instead of device probe/remove/resume. These operations can be performed independently from the actual device aRFS configuration. Also, make sure ice_vsi_free_irq() clears IRQ affinity notifiers only when aRFS is disabled -- otherwise, CPU rmap sets and clears its own and they must not be touched manually. Fixes: 28bf2672 ("ice: Implement aRFS") Co-developed-by: NIvan Vecera <ivecera@redhat.com> Signed-off-by: NIvan Vecera <ivecera@redhat.com> Signed-off-by: NAlexander Lobakin <alexandr.lobakin@intel.com> Tested-by: NIvan Vecera <ivecera@redhat.com> Signed-off-by: NTony Nguyen <anthony.l.nguyen@intel.com>
-
- 07 4月, 2022 5 次提交
-
-
由 Alexander Lobakin 提交于
Trade text size for rodata size and replace tons of nested if-elses to the const mask match based structs. The almost entire ice_find_dummy_packet() now becomes just one plain while-increment loop. The order in ice_dummy_pkt_profiles[] should be same with the if-elses order previously, as masks become less and less strict through the array to follow the original code flow. Apart from removing 80 locs of 4-level if-elses, it brings a solid text size optimization: add/remove: 0/1 grow/shrink: 1/1 up/down: 2/-1058 (-1056) Function old new delta ice_fill_adv_dummy_packet 289 291 +2 ice_adv_add_update_vsi_list 201 - -201 ice_add_adv_rule 2950 2093 -857 Total: Before=414512, After=413456, chg -0.25% add/remove: 53/52 grow/shrink: 0/0 up/down: 4660/-3988 (672) RO Data old new delta ice_dummy_pkt_profiles - 672 +672 Total: Before=37895, After=38567, chg +1.77% Signed-off-by: NAlexander Lobakin <alexandr.lobakin@intel.com> Reviewed-by: NMichal Swiatkowski <michal.swiatkowski@linux.intel.com> Tested-by: NMarcin Szycik <marcin.szycik@linux.intel.com> Tested-by: NSandeep Penigalapati <sandeep.penigalapati@intel.com> Signed-off-by: NTony Nguyen <anthony.l.nguyen@intel.com>
-
由 Alexander Lobakin 提交于
Declarations of dummy/template packet headers and offsets can be minified to improve readability and simplify adding new templates. Move all the repetitive constructions into two macros and let them do the name and type expansions. Linewrap removal is yet another positive side effect. Signed-off-by: NAlexander Lobakin <alexandr.lobakin@intel.com> Reviewed-by: NMichal Swiatkowski <michal.swiatkowski@linux.intel.com> Tested-by: NMarcin Szycik <marcin.szycik@linux.intel.com> Tested-by: NSandeep Penigalapati <sandeep.penigalapati@intel.com> Signed-off-by: NTony Nguyen <anthony.l.nguyen@intel.com>
-
由 Alexander Lobakin 提交于
ice_find_dummy_packet() contains a lot of boilerplate code and a nice room for copy-paste mistakes. Instead of passing 3 separate pointers back and forth to get packet template (dummy) params, directly return a structure containing them. Then, use a macro to compose compound literals and avoid code duplication on return path. Now, dummy packet type/name is needed only once to return a full correct triple pkt-pkt_len-offsets, and those are all one-liners. dummy_ipv4_gtpu_ipv4_packet_offsets is just moved around and renamed (as well as dummy_ipv6_gtp_packet_offsets) with no function changes. Signed-off-by: NAlexander Lobakin <alexandr.lobakin@intel.com> Reviewed-by: NMichal Swiatkowski <michal.swiatkowski@linux.intel.com> Tested-by: NMarcin Szycik <marcin.szycik@linux.intel.com> Tested-by: NSandeep Penigalapati <sandeep.penigalapati@intel.com> Signed-off-by: NTony Nguyen <anthony.l.nguyen@intel.com>
-
由 Alexander Lobakin 提交于
A loop performing header modification according to the provided mask in ice_fill_adv_dummy_packet() is very cryptic (and error-prone). Replace two identical cast-deferences with a variable. Replace three struct-member-array-accesses with a variable. Invert the condition, reduce the indentation by one -> eliminate line wraps. Signed-off-by: NAlexander Lobakin <alexandr.lobakin@intel.com> Reviewed-by: NMichal Swiatkowski <michal.swiatkowski@linux.intel.com> Tested-by: NMarcin Szycik <marcin.szycik@linux.intel.com> Tested-by: NSandeep Penigalapati <sandeep.penigalapati@intel.com> Signed-off-by: NTony Nguyen <anthony.l.nguyen@intel.com>
-
由 Alexander Lobakin 提交于
ice_adv_lkup_elem fields h_u and m_u are being accessed as raw u16 arrays in several places. To reduce cast and braces burden, add permanent array-of-u16 aliases with the same size as the `union ice_prot_hdr` itself via anonymous unions to the actual struct declaration, and just access them directly. This: - removes the need to cast the union to u16[] and then dereference it each time -> reduces the horizon for potential bugs; - improves -Warray-bounds coverage -- the array size is now known at compilation time; - addresses cppcheck complaints. Signed-off-by: NAlexander Lobakin <alexandr.lobakin@intel.com> Reviewed-by: NMichal Swiatkowski <michal.swiatkowski@linux.intel.com> Tested-by: NMarcin Szycik <marcin.szycik@linux.intel.com> Tested-by: NSandeep Penigalapati <sandeep.penigalapati@intel.com> Signed-off-by: NTony Nguyen <anthony.l.nguyen@intel.com>
-
- 06 4月, 2022 3 次提交
-
-
由 Maciej Fijalkowski 提交于
Currently when XDP rings are created, each descriptor gets its DD bit set, which turns out to be the wrong approach as it can lead to a situation where more descriptors get cleaned than it was supposed to, e.g. when AF_XDP busy poll is run with a large batch size. In this situation, the driver would request for more buffers than it is able to handle. Fix this by not setting the DD bits in ice_xdp_alloc_setup_rings(). They should be initialized to zero instead. Fixes: 9610bd98 ("ice: optimize XDP_TX workloads") Signed-off-by: NMaciej Fijalkowski <maciej.fijalkowski@intel.com> Tested-by: NShwetha Nagaraju <shwetha.nagaraju@intel.com> Signed-off-by: NTony Nguyen <anthony.l.nguyen@intel.com>
-
由 Maciej Fijalkowski 提交于
ICE_DOWN is dedicated for pf->state. Check for ICE_VSI_DOWN being set on vsi->state in ice_xsk_wakeup(). Fixes: 2d4238f5 ("ice: Add support for AF_XDP") Signed-off-by: NMaciej Fijalkowski <maciej.fijalkowski@intel.com> Tested-by: NShwetha Nagaraju <shwetha.nagaraju@intel.com> Signed-off-by: NTony Nguyen <anthony.l.nguyen@intel.com>
-
由 Maciej Fijalkowski 提交于
Unfortunately, the ice driver doesn't respect the RCU critical section that XSK wakeup is surrounded with. To fix this, add synchronize_rcu() calls to paths that destroy resources that might be in use. This was addressed in other AF_XDP ZC enabled drivers, for reference see for example commit b3873a5b ("net/i40e: Fix concurrency issues between config flow and XSK") Fixes: efc2214b ("ice: Add support for XDP") Fixes: 2d4238f5 ("ice: Add support for AF_XDP") Signed-off-by: NMaciej Fijalkowski <maciej.fijalkowski@intel.com> Tested-by: NShwetha Nagaraju <shwetha.nagaraju@intel.com> Signed-off-by: NTony Nguyen <anthony.l.nguyen@intel.com>
-
- 05 4月, 2022 2 次提交
-
-
由 Anatolii Gerasymenko 提交于
Disable check for queue being enabled in ice_vc_dis_qs_msg, because there could be a case when queues were created, but were not enabled. We still need to delete those queues. Normal workflow for VF looks like: Enable path: VIRTCHNL_OP_ADD_ETH_ADDR (opcode 10) VIRTCHNL_OP_CONFIG_VSI_QUEUES (opcode 6) VIRTCHNL_OP_ENABLE_QUEUES (opcode 8) Disable path: VIRTCHNL_OP_DISABLE_QUEUES (opcode 9) VIRTCHNL_OP_DEL_ETH_ADDR (opcode 11) The issue appears only in stress conditions when VF is enabled and disabled very fast. Eventually there will be a case, when queues are created by VIRTCHNL_OP_CONFIG_VSI_QUEUES, but are not enabled by VIRTCHNL_OP_ENABLE_QUEUES. In turn, these queues are not deleted by VIRTCHNL_OP_DISABLE_QUEUES, because there is a check whether queues are enabled in ice_vc_dis_qs_msg. When we bring up the VF again, we will see the "Failed to set LAN Tx queue context" error during VIRTCHNL_OP_CONFIG_VSI_QUEUES step. This happens because old 16 queues were not deleted and VF requests to create 16 more, but ice_sched_get_free_qparent in ice_ena_vsi_txq would fail to find a parent node for first newly requested queue (because all nodes are allocated to 16 old queues). Testing Hints: Just enable and disable VF fast enough, so it would be disabled before reaching VIRTCHNL_OP_ENABLE_QUEUES. while true; do ip link set dev ens785f0v0 up sleep 0.065 # adjust delay value for you machine ip link set dev ens785f0v0 down done Fixes: 77ca27c4 ("ice: add support for virtchnl_queue_select.[tx|rx]_queues bitmap") Signed-off-by: NAnatolii Gerasymenko <anatolii.gerasymenko@intel.com> Tested-by: NKonrad Jankowski <konrad0.jankowski@intel.com> Signed-off-by: NAlice Michael <alice.michael@intel.com> Signed-off-by: NTony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: NPaolo Abeni <pabeni@redhat.com>
-
由 Anatolii Gerasymenko 提交于
When VF is freshly created, but not brought up, ring->txq_teid value is by default set to 0. But 0 is a valid TEID. On some platforms the Root Node of Tx scheduler has a TEID = 0. This can cause issues as shown below. The proper way is to set ring->txq_teid to ICE_INVAL_TEID (0xFFFFFFFF). Testing Hints: echo 1 > /sys/class/net/ens785f0/device/sriov_numvfs ip link set dev ens785f0v0 up ip link set dev ens785f0v0 down If we have freshly created VF and quickly turn it on and off, so there would be no time to reach VIRTCHNL_OP_CONFIG_VSI_QUEUES stage, then VIRTCHNL_OP_DISABLE_QUEUES stage will fail with error: [ 639.531454] disable queue 89 failed 14 [ 639.532233] Failed to disable LAN Tx queues, error: ICE_ERR_AQ_ERROR [ 639.533107] ice 0000:02:00.0: Failed to stop Tx ring 0 on VSI 5 The reason for the fail is that we are trying to send AQ command to delete queue 89, which has never been created and receive an "invalid argument" error from firmware. As this queue has never been created, it's teid and ring->txq_teid have default value 0. ice_dis_vsi_txq has a check against non-existent queues: node = ice_sched_find_node_by_teid(pi->root, q_teids[i]); if (!node) continue; But on some platforms the Root Node of Tx scheduler has a teid = 0. Hence, ice_sched_find_node_by_teid finds a node with teid = 0 (it is pi->root), and we go further to submit an erroneous request to firmware. Fixes: 37bb8390 ("ice: Move common functions out of ice_main.c part 7/7") Signed-off-by: NAnatolii Gerasymenko <anatolii.gerasymenko@intel.com> Reviewed-by: NMaciej Fijalkowski <maciej.fijalkowski@intel.com> Tested-by: NKonrad Jankowski <konrad0.jankowski@intel.com> Signed-off-by: NAlice Michael <alice.michael@intel.com> Signed-off-by: NTony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: NPaolo Abeni <pabeni@redhat.com>
-
- 01 4月, 2022 3 次提交
-
-
由 Ivan Vecera 提交于
Handling of all-multicast flag and associated multicast promiscuous mode is broken in ice driver. When an user switches allmulticast flag on or off the driver checks whether any VLANs are configured over the interface (except default VLAN 0). If any extra VLANs are registered it enables multicast promiscuous mode for all these VLANs (including default VLAN 0) using ICE_SW_LKUP_PROMISC_VLAN look-up type. In this situation all multicast packets tagged with known VLAN ID or untagged are received and multicast packets tagged with unknown VLAN ID ignored. If no extra VLANs are registered (so only VLAN 0 exists) it enables multicast promiscuous mode for VLAN 0 and uses ICE_SW_LKUP_PROMISC look-up type. In this situation any multicast packets including tagged ones are received. The driver handles IFF_ALLMULTI in ice_vsi_sync_fltr() this way: ice_vsi_sync_fltr() { ... if (changed_flags & IFF_ALLMULTI) { if (netdev->flags & IFF_ALLMULTI) { if (vsi->num_vlans > 1) ice_set_promisc(..., ICE_MCAST_VLAN_PROMISC_BITS); else ice_set_promisc(..., ICE_MCAST_PROMISC_BITS); } else { if (vsi->num_vlans > 1) ice_clear_promisc(..., ICE_MCAST_VLAN_PROMISC_BITS); else ice_clear_promisc(..., ICE_MCAST_PROMISC_BITS); } } ... } The code above depends on value vsi->num_vlan that specifies number of VLANs configured over the interface (including VLAN 0) and this is problem because that value is modified in NDO callbacks ice_vlan_rx_add_vid() and ice_vlan_rx_kill_vid(). Scenario 1: 1. ip link set ens7f0 allmulticast on 2. ip link add vlan10 link ens7f0 type vlan id 10 3. ip link set ens7f0 allmulticast off 4. ip link set ens7f0 allmulticast on [1] In this scenario IFF_ALLMULTI is enabled and the driver calls ice_set_promisc(..., ICE_MCAST_PROMISC_BITS) that installs multicast promisc rule with non-VLAN look-up type. [2] Then VLAN with ID 10 is added and vsi->num_vlan incremented to 2 [3] Command switches IFF_ALLMULTI off and the driver calls ice_clear_promisc(..., ICE_MCAST_VLAN_PROMISC_BITS) but this call is effectively NOP because it looks for multicast promisc rules for VLAN 0 and VLAN 10 with VLAN look-up type but no such rules exist. So the all-multicast remains enabled silently in hardware. [4] Command tries to switch IFF_ALLMULTI on and the driver calls ice_clear_promisc(..., ICE_MCAST_PROMISC_BITS) but this call fails (-EEXIST) because non-VLAN multicast promisc rule already exists. Scenario 2: 1. ip link add vlan10 link ens7f0 type vlan id 10 2. ip link set ens7f0 allmulticast on 3. ip link add vlan20 link ens7f0 type vlan id 20 4. ip link del vlan10 ; ip link del vlan20 5. ip link set ens7f0 allmulticast off [1] VLAN with ID 10 is added and vsi->num_vlan==2 [2] Command switches IFF_ALLMULTI on and driver installs multicast promisc rules with VLAN look-up type for VLAN 0 and 10 [3] VLAN with ID 20 is added and vsi->num_vlan==3 but no multicast promisc rules is added for this new VLAN so the interface does not receive MC packets from VLAN 20 [4] Both VLANs are removed but multicast rule for VLAN 10 remains installed so interface receives multicast packets from VLAN 10 [5] Command switches IFF_ALLMULTI off and because vsi->num_vlan is 1 the driver tries to remove multicast promisc rule for VLAN 0 with non-VLAN look-up that does not exist. All-multicast looks disabled from user point of view but it is partially enabled in HW (interface receives all multicast packets either untagged or tagged with VLAN ID 10) To resolve these issues the patch introduces these changes: 1. Adds handling for IFF_ALLMULTI to ice_vlan_rx_add_vid() and ice_vlan_rx_kill_vid() callbacks. So when VLAN is added/removed and IFF_ALLMULTI is enabled an appropriate multicast promisc rule for that VLAN ID is added/removed. 2. In ice_vlan_rx_add_vid() when first VLAN besides VLAN 0 is added so (vsi->num_vlan == 2) and IFF_ALLMULTI is enabled then look-up type for existing multicast promisc rule for VLAN 0 is updated to ICE_MCAST_VLAN_PROMISC_BITS. 3. In ice_vlan_rx_kill_vid() when last VLAN besides VLAN 0 is removed so (vsi->num_vlan == 1) and IFF_ALLMULTI is enabled then look-up type for existing multicast promisc rule for VLAN 0 is updated to ICE_MCAST_PROMISC_BITS. 4. Both ice_vlan_rx_{add,kill}_vid() have to run under ICE_CFG_BUSY bit protection to avoid races with ice_vsi_sync_fltr() that runs in ice_service_task() context. 5. Bit ICE_VSI_VLAN_FLTR_CHANGED is use-less and can be removed. 6. Error messages added to ice_fltr_*_vsi_promisc() helper functions to avoid them in their callers 7. Small improvements to increase readability Fixes: 5eda8afd ("ice: Add support for PF/VF promiscuous mode") Signed-off-by: NIvan Vecera <ivecera@redhat.com> Reviewed-by: NJacob Keller <jacob.e.keller@intel.com> Signed-off-by: NAlice Michael <alice.michael@intel.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Ivan Vecera 提交于
Commit 2ccc1c1c ("ice: Remove excess error variables") merged the usage of 'status' and 'err' variables into single one in function ice_set_mac_address(). Unfortunately this causes a regression when call of ice_fltr_add_mac() returns -EEXIST because this return value does not indicate an error in this case but value of 'err' remains to be -EEXIST till the end of the function and is returned to caller. Prior mentioned commit this does not happen because return value of ice_fltr_add_mac() was stored to 'status' variable first and if it was -EEXIST then 'err' remains to be zero. Fix the problem by reset 'err' to zero when ice_fltr_add_mac() returns -EEXIST. Fixes: 2ccc1c1c ("ice: Remove excess error variables") Signed-off-by: NIvan Vecera <ivecera@redhat.com> Reviewed-by: NJacob Keller <jacob.e.keller@intel.com> Acked-by: NAlexander Lobakin <alexandr.lobakin@intel.com> Signed-off-by: NAlice Michael <alice.michael@intel.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Ivan Vecera 提交于
VSI is set as default forwarding one when promisc mode is set for PF interface, when PF is switched to switchdev mode or when VF driver asks to enable allmulticast or promisc mode for the VF interface (when vf-true-promisc-support priv flag is off). The third case is buggy because in that case VSI associated with VF remains as default one after VF removal. Reproducer: 1. Create VF echo 1 > sys/class/net/ens7f0/device/sriov_numvfs 2. Enable allmulticast or promisc mode on VF ip link set ens7f0v0 allmulticast on ip link set ens7f0v0 promisc on 3. Delete VF echo 0 > sys/class/net/ens7f0/device/sriov_numvfs 4. Try to enable promisc mode on PF ip link set ens7f0 promisc on Although it looks that promisc mode on PF is enabled the opposite is true because ice_vsi_sync_fltr() responsible for IFF_PROMISC handling first checks if any other VSI is set as default forwarding one and if so the function does not do anything. At this point it is not possible to enable promisc mode on PF without re-probe device. To resolve the issue this patch clear default forwarding VSI during ice_vsi_release() when the VSI to be released is the default one. Fixes: 01b5e89a ("ice: Add VF promiscuous support") Signed-off-by: NIvan Vecera <ivecera@redhat.com> Reviewed-by: NMichal Swiatkowski <michal.swiatkowski@linux.intel.com> Reviewed-by: NMaciej Fijalkowski <maciej.fijalkowski@intel.com> Signed-off-by: NAlice Michael <alice.michael@intel.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 29 3月, 2022 3 次提交
-
-
由 Maciej Fijalkowski 提交于
Ice driver tries to always create XDP rings array to be num_possible_cpus() sized, regardless of user's queue count setting that can be changed via ethtool -L for example. Currently, ice_tx_xsk_pool() calculates the qid by decrementing the ring->q_index by the count of XDP queues, but ring->q_index is set to 'i + vsi->alloc_txq'. When user did ethtool -L $IFACE combined 1, alloc_txq is 1, but vsi->num_xdp_txq is still num_possible_cpus(). Then, ice_tx_xsk_pool() will do OOB access and in the final result ring would not get xsk_pool pointer assigned. Then, each ice_xsk_wakeup() call will fail with error and it will not be possible to get into NAPI and do the processing from driver side. Fix this by decrementing vsi->alloc_txq instead of vsi->num_xdp_txq from ring-q_index in ice_tx_xsk_pool() so the calculation is reflected to the setting of ring->q_index. Fixes: 22bf877e ("ice: introduce XDP_TX fallback path") Signed-off-by: NMaciej Fijalkowski <maciej.fijalkowski@intel.com> Signed-off-by: NAlexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20220328142123.170157-5-maciej.fijalkowski@intel.com
-
由 Maciej Fijalkowski 提交于
This can happen with big budget values and some breakage of re-filling descriptors as we do not clear the entry that ntu is pointing at the end of ice_alloc_rx_bufs_zc. So if ntc is at ntu then it might be the case that status_error0 has an old, uncleared value and ntc would go over with processing which would result in false results. Break Rx loop when ntc == ntu to avoid broken behavior. Fixes: 3876ff52 ("ice: xsk: Handle SW XDP ring wrap and bump tail more often") Signed-off-by: NMaciej Fijalkowski <maciej.fijalkowski@intel.com> Signed-off-by: NAlexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20220328142123.170157-4-maciej.fijalkowski@intel.com
-
由 Magnus Karlsson 提交于
The NIC Tx ring completion routine cleans entries from the ring in batches. However, it processes one more batch than it is supposed to. Note that this does not matter from a functionality point of view since it will not find a set DD bit for the next batch and just exit the loop. But from a performance perspective, it is faster to terminate the loop before and not issue an expensive read over PCIe to get the DD bit. Fixes: 126cdfe1 ("ice: xsk: Improve AF_XDP ZC Tx and use batching API") Signed-off-by: NMagnus Karlsson <magnus.karlsson@intel.com> Signed-off-by: NAlexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20220328142123.170157-3-maciej.fijalkowski@intel.com
-