提交 · 2018b22a759e26a4c7e3ac6c60c283cfbd2c9c93 · openeuler / Kernel

07 9月, 2022 1 次提交

iavf: Fix race between iavf_close and iavf_reset_task · 11c12adc

由 Michal Jaron 提交于 8月 18, 2022

During stress tests with adding VF to namespace and changing vf's
trust there was a race between iavf_reset_task and iavf_close.
Sometimes when IAVF_FLAG_AQ_DISABLE_QUEUES from iavf_close was sent
to PF after reset and before IAVF_AQ_GET_CONFIG was sent then PF
returns error IAVF_NOT_SUPPORTED to disable queues request and
following requests. There is need to get_config before other
aq_required will be send but iavf_close clears all flags, if
get_config was not sent before iavf_close, then it will not be send
at all.

In case when IAVF_FLAG_AQ_GET_OFFLOAD_VLAN_V2_CAPS was sent before
IAVF_FLAG_AQ_DISABLE_QUEUES then there was rtnl_lock deadlock
between iavf_close and iavf_adminq_task until iavf_close timeouts
and disable queues was sent after iavf_close ends.

There was also a problem with sending delete/add filters.
Sometimes when filters was not yet added to PF and in
iavf_close all filters was set to remove there might be a try
to remove nonexistent filters on PF.

Add aq_required_tmp to save aq_required flags and send them after
disable_queues will be handled. Clear flags given to iavf_down
different than IAVF_FLAG_AQ_GET_CONFIG as this flag is necessary
to sent other aq_required. Remove some flags that we don't
want to send as we are in iavf_close and we want to disable
interface. Remove filters which was not yet sent and send del
filters flags only when there are filters to remove.
Signed-off-by: NMichal Jaron <michalx.jaron@intel.com>
Signed-off-by: NMateusz Palczewski <mateusz.palczewski@intel.com>
Tested-by: NKonrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: NTony Nguyen <anthony.l.nguyen@intel.com>

11c12adc

12 8月, 2022 3 次提交

iavf: Fix deadlock in initialization · cbe9e511

由 Ivan Vecera 提交于 8月 08, 2022

Fix deadlock that occurs when iavf interface is a part of failover
configuration.

1. Mutex crit_lock is taken at the beginning of iavf_watchdog_task()
2. Function iavf_init_config_adapter() is called when adapter
   state is __IAVF_INIT_CONFIG_ADAPTER
3. iavf_init_config_adapter() calls register_netdevice() that emits
   NETDEV_REGISTER event
4. Notifier function failover_event() then calls
   net_failover_slave_register() that calls dev_open()
5. dev_open() calls iavf_open() that tries to take crit_lock in
   end-less loop

Stack trace:
...
[  790.251876]  usleep_range_state+0x5b/0x80
[  790.252547]  iavf_open+0x37/0x1d0 [iavf]
[  790.253139]  __dev_open+0xcd/0x160
[  790.253699]  dev_open+0x47/0x90
[  790.254323]  net_failover_slave_register+0x122/0x220 [net_failover]
[  790.255213]  failover_slave_register.part.7+0xd2/0x180 [failover]
[  790.256050]  failover_event+0x122/0x1ab [failover]
[  790.256821]  notifier_call_chain+0x47/0x70
[  790.257510]  register_netdevice+0x20f/0x550
[  790.258263]  iavf_watchdog_task+0x7c8/0xea0 [iavf]
[  790.259009]  process_one_work+0x1a7/0x360
[  790.259705]  worker_thread+0x30/0x390

To fix the situation we should check the current adapter state after
first unsuccessful mutex_trylock() and return with -EBUSY if it is
__IAVF_INIT_CONFIG_ADAPTER.

Fixes: 226d5285 ("iavf: fix locking of critical sections")
Signed-off-by: NIvan Vecera <ivecera@redhat.com>
Tested-by: NKonrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: NTony Nguyen <anthony.l.nguyen@intel.com>

cbe9e511

iavf: Fix reset error handling · 31071173

由 Przemyslaw Patynowski 提交于 7月 19, 2022

Do not call iavf_close in iavf_reset_task error handling. Doing so can
lead to double call of napi_disable, which can lead to deadlock there.
Removing VF would lead to iavf_remove task being stuck, because it
requires crit_lock, which is held by iavf_close.
Call iavf_disable_vf if reset fail, so that driver will clean up
remaining invalid resources.
During rapid VF resets, HW can fail to setup VF mailbox. Wrong
error handling can lead to iavf_remove being stuck with:
[ 5218.999087] iavf 0000:82:01.0: Failed to init adminq: -53
...
[ 5267.189211] INFO: task repro.sh:11219 blocked for more than 30 seconds.
[ 5267.189520]       Tainted: G S          E     5.18.0-04958-ga54ce370-dirty #1
[ 5267.189764] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 5267.190062] task:repro.sh        state:D stack:    0 pid:11219 ppid:  8162 flags:0x00000000
[ 5267.190347] Call Trace:
[ 5267.190647]  <TASK>
[ 5267.190927]  __schedule+0x460/0x9f0
[ 5267.191264]  schedule+0x44/0xb0
[ 5267.191563]  schedule_preempt_disabled+0x14/0x20
[ 5267.191890]  __mutex_lock.isra.12+0x6e3/0xac0
[ 5267.192237]  ? iavf_remove+0xf9/0x6c0 [iavf]
[ 5267.192565]  iavf_remove+0x12a/0x6c0 [iavf]
[ 5267.192911]  ? _raw_spin_unlock_irqrestore+0x1e/0x40
[ 5267.193285]  pci_device_remove+0x36/0xb0
[ 5267.193619]  device_release_driver_internal+0xc1/0x150
[ 5267.193974]  pci_stop_bus_device+0x69/0x90
[ 5267.194361]  pci_stop_and_remove_bus_device+0xe/0x20
[ 5267.194735]  pci_iov_remove_virtfn+0xba/0x120
[ 5267.195130]  sriov_disable+0x2f/0xe0
[ 5267.195506]  ice_free_vfs+0x7d/0x2f0 [ice]
[ 5267.196056]  ? pci_get_device+0x4f/0x70
[ 5267.196496]  ice_sriov_configure+0x78/0x1a0 [ice]
[ 5267.196995]  sriov_numvfs_store+0xfe/0x140
[ 5267.197466]  kernfs_fop_write_iter+0x12e/0x1c0
[ 5267.197918]  new_sync_write+0x10c/0x190
[ 5267.198404]  vfs_write+0x24e/0x2d0
[ 5267.198886]  ksys_write+0x5c/0xd0
[ 5267.199367]  do_syscall_64+0x3a/0x80
[ 5267.199827]  entry_SYSCALL_64_after_hwframe+0x46/0xb0
[ 5267.200317] RIP: 0033:0x7f5b381205c8
[ 5267.200814] RSP: 002b:00007fff8c7e8c78 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
[ 5267.201981] RAX: ffffffffffffffda RBX: 0000000000000002 RCX: 00007f5b381205c8
[ 5267.202620] RDX: 0000000000000002 RSI: 00005569420ee900 RDI: 0000000000000001
[ 5267.203426] RBP: 00005569420ee900 R08: 000000000000000a R09: 00007f5b38180820
[ 5267.204327] R10: 000000000000000a R11: 0000000000000246 R12: 00007f5b383c06e0
[ 5267.205193] R13: 0000000000000002 R14: 00007f5b383bb880 R15: 0000000000000002
[ 5267.206041]  </TASK>
[ 5267.206970] Kernel panic - not syncing: hung_task: blocked tasks
[ 5267.207809] CPU: 48 PID: 551 Comm: khungtaskd Kdump: loaded Tainted: G S          E     5.18.0-04958-ga54ce370-dirty #1
[ 5267.208726] Hardware name: Dell Inc. PowerEdge R730/0WCJNT, BIOS 2.11.0 11/02/2019
[ 5267.209623] Call Trace:
[ 5267.210569]  <TASK>
[ 5267.211480]  dump_stack_lvl+0x33/0x42
[ 5267.212472]  panic+0x107/0x294
[ 5267.213467]  watchdog.cold.8+0xc/0xbb
[ 5267.214413]  ? proc_dohung_task_timeout_secs+0x30/0x30
[ 5267.215511]  kthread+0xf4/0x120
[ 5267.216459]  ? kthread_complete_and_exit+0x20/0x20
[ 5267.217505]  ret_from_fork+0x22/0x30
[ 5267.218459]  </TASK>

Fixes: f0db7892 ("i40evf: use netdev variable in reset task")
Signed-off-by: NPrzemyslaw Patynowski <przemyslawx.patynowski@intel.com>
Signed-off-by: NJedrzej Jagielski <jedrzej.jagielski@intel.com>
Tested-by: NMarek Szlosek <marek.szlosek@intel.com>
Signed-off-by: NTony Nguyen <anthony.l.nguyen@intel.com>

31071173

iavf: Fix NULL pointer dereference in iavf_get_link_ksettings · 541a1af4

由 Przemyslaw Patynowski 提交于 7月 19, 2022

Fix possible NULL pointer dereference, due to freeing of adapter->vf_res
in iavf_init_get_resources. Previous commit introduced a regression,
where receiving IAVF_ERR_ADMIN_QUEUE_NO_WORK from iavf_get_vf_config
would free adapter->vf_res. However, netdev is still registered, so
ethtool_ops can be called. Calling iavf_get_link_ksettings with no vf_res,
will result with:
[ 9385.242676] BUG: kernel NULL pointer dereference, address: 0000000000000008
[ 9385.242683] #PF: supervisor read access in kernel mode
[ 9385.242686] #PF: error_code(0x0000) - not-present page
[ 9385.242690] PGD 0 P4D 0
[ 9385.242696] Oops: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC PTI
[ 9385.242701] CPU: 6 PID: 3217 Comm: pmdalinux Kdump: loaded Tainted: G S          E     5.18.0-04958-ga54ce370-dirty #1
[ 9385.242708] Hardware name: Dell Inc. PowerEdge R730/0WCJNT, BIOS 2.11.0 11/02/2019
[ 9385.242710] RIP: 0010:iavf_get_link_ksettings+0x29/0xd0 [iavf]
[ 9385.242745] Code: 00 0f 1f 44 00 00 b8 01 ef ff ff 48 c7 46 30 00 00 00 00 48 c7 46 38 00 00 00 00 c6 46 0b 00 66 89 46 08 48 8b 87 68 0e 00 00 <f6> 40 08 80 75 50 8b 87 5c 0e 00 00 83 f8 08 74 7a 76 1d 83 f8 20
[ 9385.242749] RSP: 0018:ffffc0560ec7fbd0 EFLAGS: 00010246
[ 9385.242755] RAX: 0000000000000000 RBX: ffffc0560ec7fc08 RCX: 0000000000000000
[ 9385.242759] RDX: ffffffffc0ad4550 RSI: ffffc0560ec7fc08 RDI: ffffa0fc66674000
[ 9385.242762] RBP: 00007ffd1fb2bf50 R08: b6a2d54b892363ee R09: ffffa101dc14fb00
[ 9385.242765] R10: 0000000000000000 R11: 0000000000000004 R12: ffffa0fc66674000
[ 9385.242768] R13: 0000000000000000 R14: ffffa0fc66674000 R15: 00000000ffffffa1
[ 9385.242771] FS:  00007f93711a2980(0000) GS:ffffa0fad72c0000(0000) knlGS:0000000000000000
[ 9385.242775] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 9385.242778] CR2: 0000000000000008 CR3: 0000000a8e61c003 CR4: 00000000003706e0
[ 9385.242781] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 9385.242784] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 9385.242787] Call Trace:
[ 9385.242791]  <TASK>
[ 9385.242793]  ethtool_get_settings+0x71/0x1a0
[ 9385.242814]  __dev_ethtool+0x426/0x2f40
[ 9385.242823]  ? slab_post_alloc_hook+0x4f/0x280
[ 9385.242836]  ? kmem_cache_alloc_trace+0x15d/0x2f0
[ 9385.242841]  ? dev_ethtool+0x59/0x170
[ 9385.242848]  dev_ethtool+0xa7/0x170
[ 9385.242856]  dev_ioctl+0xc3/0x520
[ 9385.242866]  sock_do_ioctl+0xa0/0xe0
[ 9385.242877]  sock_ioctl+0x22f/0x320
[ 9385.242885]  __x64_sys_ioctl+0x84/0xc0
[ 9385.242896]  do_syscall_64+0x3a/0x80
[ 9385.242904]  entry_SYSCALL_64_after_hwframe+0x46/0xb0
[ 9385.242918] RIP: 0033:0x7f93702396db
[ 9385.242923] Code: 73 01 c3 48 8b 0d ad 57 38 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 7d 57 38 00 f7 d8 64 89 01 48
[ 9385.242927] RSP: 002b:00007ffd1fb2bf18 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[ 9385.242932] RAX: ffffffffffffffda RBX: 000055671b1d2fe0 RCX: 00007f93702396db
[ 9385.242935] RDX: 00007ffd1fb2bf20 RSI: 0000000000008946 RDI: 0000000000000007
[ 9385.242937] RBP: 00007ffd1fb2bf20 R08: 0000000000000003 R09: 0030763066307330
[ 9385.242940] R10: 0000000000000000 R11: 0000000000000246 R12: 00007ffd1fb2bf80
[ 9385.242942] R13: 0000000000000007 R14: 0000556719f6de90 R15: 00007ffd1fb2c1b0
[ 9385.242948]  </TASK>
[ 9385.242949] Modules linked in: iavf(E) xt_CHECKSUM xt_MASQUERADE xt_conntrack ipt_REJECT nft_compat nf_nat_tftp nft_objref nf_conntrack_tftp bridge stp llc nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 ip_set nf_tables rfkill nfnetlink vfat fat irdma ib_uverbs ib_core intel_rapl_msr intel_rapl_common sb_edac x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm iTCO_wdt iTCO_vendor_support ice irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel rapl i40e pcspkr intel_cstate joydev mei_me intel_uncore mxm_wmi mei ipmi_ssif lpc_ich ipmi_si acpi_power_meter xfs libcrc32c mgag200 i2c_algo_bit drm_shmem_helper drm_kms_helper sd_mod t10_pi crc64_rocksoft crc64 syscopyarea sg sysfillrect sysimgblt fb_sys_fops drm ixgbe ahci libahci libata crc32c_intel mdio dca wmi dm_mirror dm_region_hash dm_log dm_mod ipmi_devintf ipmi_msghandler fuse
[ 9385.243065]  [last unloaded: iavf]

Dereference happens in if (ADV_LINK_SUPPORT(adapter)) statement

Fixes: 209f2f9c ("iavf: Add support for VIRTCHNL_VF_OFFLOAD_VLAN_V2 negotiation")
Signed-off-by: NPrzemyslaw Patynowski <przemyslawx.patynowski@intel.com>
Signed-off-by: NJedrzej Jagielski <jedrzej.jagielski@intel.com>
Tested-by: NMarek Szlosek <marek.szlosek@intel.com>
Signed-off-by: NTony Nguyen <anthony.l.nguyen@intel.com>

541a1af4

29 7月, 2022 2 次提交

iavf: Fix 'tc qdisc show' listing too many queues · 93cb804e

由 Przemyslaw Patynowski 提交于 6月 15, 2022

Fix tc qdisc show dev <ethX> root displaying too many fq_codel qdiscs.
tc_modify_qdisc, which is caller of ndo_setup_tc, expects driver to call
netif_set_real_num_tx_queues, which prepares qdiscs.
Without this patch, fq_codel qdiscs would not be adjusted to number of
queues on VF.
e.g.:
tc qdisc show dev <ethX>
qdisc mq 0: root
qdisc fq_codel 0: parent :4 limit 10240p flows 1024 quantum 1514 target 5ms interval 100ms memory_limit 32Mb ecn drop_batch 64
qdisc fq_codel 0: parent :3 limit 10240p flows 1024 quantum 1514 target 5ms interval 100ms memory_limit 32Mb ecn drop_batch 64
qdisc fq_codel 0: parent :2 limit 10240p flows 1024 quantum 1514 target 5ms interval 100ms memory_limit 32Mb ecn drop_batch 64
qdisc fq_codel 0: parent :1 limit 10240p flows 1024 quantum 1514 target 5ms interval 100ms memory_limit 32Mb ecn drop_batch 64
tc qdisc add dev <ethX> root mqprio num_tc 2 map 1 0 0 0 0 0 0 0 queues 1@0 1@1 hw 1 mode channel shaper bw_rlimit max_rate 5000Mbit 150Mbit
tc qdisc show dev <ethX>
qdisc mqprio 8003: root tc 2 map 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
queues:(0:0) (1:1)
mode:channel
shaper:bw_rlimit max_rate:5Gbit 150Mbit
qdisc fq_codel 0: parent 8003:4 limit 10240p flows 1024 quantum 1514 target 5ms interval 100ms memory_limit 32Mb ecn drop_batch 64
qdisc fq_codel 0: parent 8003:3 limit 10240p flows 1024 quantum 1514 target 5ms interval 100ms memory_limit 32Mb ecn drop_batch 64
qdisc fq_codel 0: parent 8003:2 limit 10240p flows 1024 quantum 1514 target 5ms interval 100ms memory_limit 32Mb ecn drop_batch 64
qdisc fq_codel 0: parent 8003:1 limit 10240p flows 1024 quantum 1514 target 5ms interval 100ms memory_limit 32Mb ecn drop_batch 64

While after fix:
tc qdisc add dev <ethX> root mqprio num_tc 2 map 1 0 0 0 0 0 0 0 queues 1@0 1@1 hw 1 mode channel shaper bw_rlimit max_rate 5000Mbit 150Mbit
tc qdisc show dev <ethX> #should show 2, shows 4
qdisc mqprio 8004: root tc 2 map 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
queues:(0:0) (1:1)
mode:channel
shaper:bw_rlimit max_rate:5Gbit 150Mbit
qdisc fq_codel 0: parent 8004:2 limit 10240p flows 1024 quantum 1514 target 5ms interval 100ms memory_limit 32Mb ecn drop_batch 64
qdisc fq_codel 0: parent 8004:1 limit 10240p flows 1024 quantum 1514 target 5ms interval 100ms memory_limit 32Mb ecn drop_batch 64

Fixes: d5b33d02 ("i40evf: add ndo_setup_tc callback to i40evf")
Signed-off-by: NPrzemyslaw Patynowski <przemyslawx.patynowski@intel.com>
Co-developed-by: NGrzegorz Szczurek <grzegorzx.szczurek@intel.com>
Signed-off-by: NGrzegorz Szczurek <grzegorzx.szczurek@intel.com>
Co-developed-by: NKiran Patil <kiran.patil@intel.com>
Signed-off-by: NKiran Patil <kiran.patil@intel.com>
Signed-off-by: NJedrzej Jagielski <jedrzej.jagielski@intel.com>
Tested-by: NBharathi Sreenivas <bharathi.sreenivas@intel.com>
Signed-off-by: NTony Nguyen <anthony.l.nguyen@intel.com>

93cb804e

iavf: Fix max_rate limiting · ec60d54c

由 Przemyslaw Patynowski 提交于 6月 13, 2022

Fix max_rate option in TC, check for proper quanta boundaries.
Check for minimum value provided and if it fits expected 50Mbps
quanta.

Without this patch, iavf could send settings for max_rate limiting
that would be accepted from by PF even the max_rate option is less
than expected 50Mbps quanta. It results in no rate limiting
on traffic as rate limiting will be floored to 0.

Example:
tc qdisc add dev $vf root mqprio num_tc 3 map 0 2 1 queues \
2@0 2@2 2@4 hw 1 mode channel shaper bw_rlimit \
max_rate 50Mbps 500Mbps 500Mbps

Should limit TC0 to circa 50 Mbps

tc qdisc add dev $vf root mqprio num_tc 3 map 0 2 1 queues \
2@0 2@2 2@4 hw 1 mode channel shaper bw_rlimit \
max_rate 0Mbps 100Kbit 500Mbps

Should return error

Fixes: d5b33d02 ("i40evf: add ndo_setup_tc callback to i40evf")
Signed-off-by: NPrzemyslaw Patynowski <przemyslawx.patynowski@intel.com>
Signed-off-by: NJun Zhang <xuejun.zhang@intel.com>
Tested-by: NBharathi Sreenivas <bharathi.sreenivas@intel.com>
Signed-off-by: NTony Nguyen <anthony.l.nguyen@intel.com>

ec60d54c

23 7月, 2022 1 次提交

iavf: Check for duplicate TC flower filter before parsing · 40e589ba

由 Avinash Dayanand 提交于 6月 21, 2022

Record of all the TC flower filters are kept for local book keeping, so
take advantage of that and check for duplicate filter even before sending
a request to the PF driver.
Signed-off-by: NAvinash Dayanand <avinash.dayanand@intel.com>
Signed-off-by: NJun Zhang <xuejun.zhang@intel.com>
Tested-by: NBharathi Sreenivas <bharathi.sreenivas@intel.com>
Signed-off-by: NTony Nguyen <anthony.l.nguyen@intel.com>

40e589ba

19 7月, 2022 2 次提交

iavf: Disallow changing rx/tx-frames and rx/tx-frames-irq · 4635fd3a

由 Przemyslaw Patynowski 提交于 6月 13, 2022

Remove from supported_coalesce_params ETHTOOL_COALESCE_MAX_FRAMES
and ETHTOOL_COALESCE_MAX_FRAMES_IRQ. As tx-frames-irq allowed
user to change budget for iavf_clean_tx_irq, remove work_limit
and use define for budget.

Without this patch there would be possibility to change rx/tx-frames
and rx/tx-frames-irq, which for rx/tx-frames did nothing, while for
rx/tx-frames-irq it changed rx/tx-frames and only changed budget
for cleaning NAPI poll.

Fixes: fbb7ddfe ("i40evf: core ethtool functionality")
Signed-off-by: NPrzemyslaw Patynowski <przemyslawx.patynowski@intel.com>
Signed-off-by: NJun Zhang <xuejun.zhang@intel.com>
Tested-by: NMarek Szlosek <marek.szlosek@intel.com>
Signed-off-by: NTony Nguyen <anthony.l.nguyen@intel.com>

4635fd3a

iavf: Fix VLAN_V2 addition/rejection · 968996c0

由 Przemyslaw Patynowski 提交于 6月 10, 2022

Fix VLAN addition, so that PF driver does not reject whole VLAN batch.
Add VLAN reject handling, so rejected VLANs, won't litter VLAN filter
list. Fix handling of active_(c/s)vlans, so it will be possible to
re-add VLAN filters for user.
Without this patch, after changing trust to off, with VLAN filters
saturated, no VLAN is added, due to PF rejecting addition.

Fixes: 92fc5085 ("iavf: Restrict maximum VLAN filters for VIRTCHNL_VF_OFFLOAD_VLAN_V2")
Signed-off-by: NPrzemyslaw Patynowski <przemyslawx.patynowski@intel.com>
Signed-off-by: NJedrzej Jagielski <jedrzej.jagielski@intel.com>
Tested-by: NKonrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: NTony Nguyen <anthony.l.nguyen@intel.com>

968996c0

01 7月, 2022 1 次提交

intel/iavf:fix repeated words in comments · afdc8a54

由 Jilin Yuan 提交于 6月 29, 2022

Delete the redundant word 'a'.
Signed-off-by: NJilin Yuan <yuanjilin@cdjrlc.com>
Signed-off-by: NTony Nguyen <anthony.l.nguyen@intel.com>

afdc8a54

09 6月, 2022 1 次提交

iavf: Fix issue with MAC address of VF shown as zero · 64560384

由 Michal Wilczynski 提交于 5月 20, 2022

After reinitialization of iavf, ice driver gets VIRTCHNL_OP_ADD_ETH_ADDR
message with incorrectly set type of MAC address. Hardware address should
have is_primary flag set as true. This way ice driver knows what it has
to set as a MAC address.

Check if the address is primary in iavf_add_filter function and set flag
accordingly.

To test set all-zero MAC on a VF. This triggers iavf re-initialization
and VIRTCHNL_OP_ADD_ETH_ADDR message gets sent to PF.
For example:

ip link set dev ens785 vf 0 mac 00:00:00:00:00:00

This triggers re-initialization of iavf. New MAC should be assigned.
Now check if MAC is non-zero:

ip link show dev ens785

Fixes: a3e839d5 ("iavf: Add usage of new virtchnl format to set default MAC")
Signed-off-by: NMichal Wilczynski <michal.wilczynski@intel.com>
Reviewed-by: NMaciej Fijalkowski <maciej.fijalkowski@intel.com>
Tested-by: NKonrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: NTony Nguyen <anthony.l.nguyen@intel.com>

64560384

08 6月, 2022 1 次提交

iavf: Add waiting for response from PF in set mac · 35a2443d

由 Mateusz Palczewski 提交于 5月 20, 2022

Make iavf_set_mac synchronous by waiting for a response
from a PF. Without this iavf_set_mac is always returning
success even though set_mac can be rejected by a PF.
This ensures that when set_mac exits netdev MAC is updated.
This is needed for sending ARPs with correct MAC after
changing VF's MAC. This is also needed by bonding module.
Signed-off-by: NSylwester Dziedziuch <sylwesterx.dziedziuch@intel.com>
Signed-off-by: NMateusz Palczewski <mateusz.palczewski@intel.com>
Tested-by: NKonrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: NTony Nguyen <anthony.l.nguyen@intel.com>

35a2443d

09 4月, 2022 1 次提交

Revert "iavf: Fix deadlock occurrence during resetting VF interface" · 7d59706d

由 Mateusz Palczewski 提交于 3月 24, 2022

This change caused a regression with resetting while changing network
namespaces. By clearing the IFF_UP flag, the kernel now thinks it has
fully closed the device.

This reverts commit 0cc318d2.

Fixes: 0cc318d2 ("iavf: Fix deadlock occurrence during resetting VF interface")
Signed-off-by: NMateusz Palczewski <mateusz.palczewski@intel.com>
Tested-by: NKonrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: NTony Nguyen <anthony.l.nguyen@intel.com>

7d59706d

18 3月, 2022 1 次提交

iavf: Fix hang during reboot/shutdown · b04683ff

由 Ivan Vecera 提交于 3月 17, 2022

Recent commit 97457801 ("iavf: Add waiting so the port is
initialized in remove") adds a wait-loop at the beginning of
iavf_remove() to ensure that port initialization is finished
prior unregistering net device. This causes a regression
in reboot/shutdown scenario because in this case callback
iavf_shutdown() is called and this callback detaches the device,
makes it down if it is running and sets its state to __IAVF_REMOVE.
Later shutdown callback of associated PF driver (e.g. ice_shutdown)
is called. That callback calls among other things sriov_disable()
that calls indirectly iavf_remove() (see stack trace below).
As the adapter state is already __IAVF_REMOVE then the mentioned
loop is end-less and shutdown process hangs.

The patch fixes this by checking adapter's state at the beginning
of iavf_remove() and skips the rest of the function if the adapter
is already in remove state (shutdown is in progress).

Reproducer:
1. Create VF on PF driven by ice or i40e driver
2. Ensure that the VF is bound to iavf driver
3. Reboot

[52625.981294] sysrq: SysRq : Show Blocked State
[52625.988377] task:reboot          state:D stack:    0 pid:17359 ppid:     1 f2
[52625.996732] Call Trace:
[52625.999187]  __schedule+0x2d1/0x830
[52626.007400]  schedule+0x35/0xa0
[52626.010545]  schedule_hrtimeout_range_clock+0x83/0x100
[52626.020046]  usleep_range+0x5b/0x80
[52626.023540]  iavf_remove+0x63/0x5b0 [iavf]
[52626.027645]  pci_device_remove+0x3b/0xc0
[52626.031572]  device_release_driver_internal+0x103/0x1f0
[52626.036805]  pci_stop_bus_device+0x72/0xa0
[52626.040904]  pci_stop_and_remove_bus_device+0xe/0x20
[52626.045870]  pci_iov_remove_virtfn+0xba/0x120
[52626.050232]  sriov_disable+0x2f/0xe0
[52626.053813]  ice_free_vfs+0x7c/0x340 [ice]
[52626.057946]  ice_remove+0x220/0x240 [ice]
[52626.061967]  ice_shutdown+0x16/0x50 [ice]
[52626.065987]  pci_device_shutdown+0x34/0x60
[52626.070086]  device_shutdown+0x165/0x1c5
[52626.074011]  kernel_restart+0xe/0x30
[52626.077593]  __do_sys_reboot+0x1d2/0x210
[52626.093815]  do_syscall_64+0x5b/0x1a0
[52626.097483]  entry_SYSCALL_64_after_hwframe+0x65/0xca

Fixes: 97457801 ("iavf: Add waiting so the port is initialized in remove")
Signed-off-by: NIvan Vecera <ivecera@redhat.com>
Link: https://lore.kernel.org/r/20220317104524.2802848-1-ivecera@redhat.comSigned-off-by: NJakub Kicinski <kuba@kernel.org>

b04683ff

16 3月, 2022 1 次提交

iavf: Fix double free in iavf_reset_task · 16b2dd8c

由 Przemyslaw Patynowski 提交于 3月 09, 2022

Fix double free possibility in iavf_disable_vf, as crit_lock is
freed in caller, iavf_reset_task. Add kernel-doc for iavf_disable_vf.
Remove mutex_unlock in iavf_disable_vf.
Without this patch there is double free scenario, when calling
iavf_reset_task.

Fixes: e85ff9c6 ("iavf: Fix deadlock in iavf_reset_task")
Signed-off-by: NPrzemyslaw Patynowski <przemyslawx.patynowski@intel.com>
Suggested-by: NDan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: NMateusz Palczewski <mateusz.palczewski@intel.com>
Tested-by: NKonrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: NTony Nguyen <anthony.l.nguyen@intel.com>

16b2dd8c

09 3月, 2022 1 次提交

iavf: Fix adopting new combined setting · 57d03f56

由 Michal Maloszewski 提交于 2月 02, 2022

In some cases overloaded flag IAVF_FLAG_REINIT_ITR_NEEDED
which should indicate that interrupts need to be completely
reinitialized during reset leads to RTNL deadlocks using ethtool -C
while a reset is in progress.
To fix, it was added a new flag IAVF_FLAG_REINIT_MSIX_NEEDED
used to trigger MSI-X reinit.
New combined setting is fixed adopt after VF reset.
This has been implemented by call reinit interrupt scheme
during VF reset.
Without this fix new combined setting has never been adopted.

Fixes: 209f2f9c ("iavf: Add support for VIRTCHNL_VF_OFFLOAD_VLAN_V2 negotiation")
Signed-off-by: NGrzegorz Szczurek <grzegorzx.szczurek@intel.com>
Signed-off-by: NMitch Williams <mitch.a.williams@intel.com>
Signed-off-by: NMichal Maloszewski <michal.maloszewski@intel.com>
Tested-by: NKonrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: NTony Nguyen <anthony.l.nguyen@intel.com>

57d03f56

02 3月, 2022 5 次提交

iavf: Remove non-inclusive language · 0a62b209

由 Mateusz Palczewski 提交于 2月 03, 2022

Remove non-inclusive language from the iavf driver.
Signed-off-by: NAleksandr Loktionov <aleksandr.loktionov@intel.com>
Signed-off-by: NMateusz Palczewski <mateusz.palczewski@intel.com>
Signed-off-by: NTony Nguyen <anthony.l.nguyen@intel.com>

0a62b209

iavf: stop leaking iavf_status as "errno" values · bae569d0

由 Mateusz Palczewski 提交于 1月 27, 2022

Several functions in the iAVF core files take status values of the enum
iavf_status and convert them into integer values. This leads to
confusion as functions return both Linux errno values and status codes
intermixed. Reporting status codes as if they were "errno" values can
lead to confusion when reviewing error logs. Additionally, it can lead
to unexpected behavior if a return value is not interpreted properly.

Fix this by introducing iavf_status_to_errno, a switch that explicitly
converts from the status codes into an appropriate error value. Also
introduce a virtchnl_status_to_errno function for the one case where we
were returning both virtchnl status codes and iavf_status codes in the
same function.
Signed-off-by: NJacob Keller <jacob.e.keller@intel.com>
Signed-off-by: NMateusz Palczewski <mateusz.palczewski@intel.com>
Tested-by: NKonrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: NTony Nguyen <anthony.l.nguyen@intel.com>

bae569d0

iavf: remove redundant ret variable · c3fec56e

由 Minghao Chi 提交于 1月 10, 2022

Return value directly instead of taking this in another redundant
variable.
Reported-by: NZeal Robot <zealci@zte.com.cn>
Signed-off-by: NMinghao Chi <chi.minghao@zte.com.cn>
Signed-off-by: NCGEL ZTE <cgel.zte@gmail.com>
Tested-by: NKonrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: NTony Nguyen <anthony.l.nguyen@intel.com>

c3fec56e

iavf: Add usage of new virtchnl format to set default MAC · a3e839d5

由 Mateusz Palczewski 提交于 1月 19, 2022

Use new type field of VIRTCHNL_OP_ADD_ETH_ADDR and
VIRTCHNL_OP_DEL_ETH_ADDR requests to indicate that
VF wants to change its default MAC address.
Signed-off-by: NSylwester Dziedziuch <sylwesterx.dziedziuch@intel.com>
Signed-off-by: NJedrzej Jagielski <jedrzej.jagielski@intel.com>
Signed-off-by: NMateusz Palczewski <mateusz.palczewski@intel.com>
Tested-by: NKonrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: NTony Nguyen <anthony.l.nguyen@intel.com>

a3e839d5

iavf: refactor processing of VLAN V2 capability message · 87dba256

由 Mateusz Palczewski 提交于 1月 14, 2022

In order to handle the capability exchange necessary for
VIRTCHNL_VF_OFFLOAD_VLAN_V2, the driver must send
a VIRTCHNL_OP_GET_OFFLOAD_VLAN_V2_CAPS message. This must occur prior to
__IAVF_CONFIG_ADAPTER, and the driver must wait for the response from
the PF.

To handle this, the __IAVF_INIT_GET_OFFLOAD_VLAN_V2_CAPS state was
introduced. This state is intended to process the response from the VLAN
V2 caps message. This works ok, but is difficult to extend to adding
more extended capability exchange.

Existing (and future) AVF features are relying more and more on these
sort of extended ops for processing additional capabilities. Just like
VLAN V2, this exchange must happen prior to __IAVF_CONFIG_ADPATER.

Since we only send one outstanding AQ message at a time during init, it
is not clear where to place this state. Adding more capability specific
states becomes a mess. Instead of having the "previous" state send
a message and then transition into a capability-specific state,
introduce __IAVF_EXTENDED_CAPS state. This state will use a list of
extended_caps that determines what messages to send and receive. As long
as there are extended_caps bits still set, the driver will remain in
this state performing one send or one receive per state machine loop.

Refactor the VLAN V2 negotiation to use this new state, and remove the
capability-specific state. This makes it significantly easier to add
a new similar capability exchange going forward.

Extended capabilities are processed by having an associated SEND and
RECV extended capability bit. During __IAVF_EXTENDED_CAPS, the
driver checks these bits in order by feature, first the send bit for
a feature, then the recv bit for a feature. Each send flag will call
a function that sends the necessary response, while each receive flag
will wait for the response from the PF. If a given feature can't be
negotiated with the PF, the associated flags will be cleared in
order to skip processing of that feature.
Signed-off-by: NJacob Keller <jacob.e.keller@intel.com>
Signed-off-by: NMateusz Palczewski <mateusz.palczewski@intel.com>
Tested-by: NKonrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: NTony Nguyen <anthony.l.nguyen@intel.com>

87dba256

26 2月, 2022 8 次提交

iavf: Fix __IAVF_RESETTING state usage · 14756b2a

由 Slawomir Laba 提交于 2月 23, 2022

The setup of __IAVF_RESETTING state in watchdog task had no
effect and could lead to slow resets in the driver as
the task for __IAVF_RESETTING state only requeues watchdog.
Till now the __IAVF_RESETTING was interpreted by reset task
as running state which could lead to errors with allocating
and resources disposal.

Make watchdog_task queue the reset task when it's necessary.
Do not update the state to __IAVF_RESETTING so the reset task
knows exactly what is the current state of the adapter.

Fixes: 898ef1cb ("iavf: Combine init and watchdog state machines")
Signed-off-by: NSlawomir Laba <slawomirx.laba@intel.com>
Signed-off-by: NPhani Burra <phani.r.burra@intel.com>
Signed-off-by: NJacob Keller <jacob.e.keller@intel.com>
Signed-off-by: NMateusz Palczewski <mateusz.palczewski@intel.com>
Tested-by: NKonrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: NTony Nguyen <anthony.l.nguyen@intel.com>

14756b2a

iavf: Fix missing check for running netdev · d2c0f45f

由 Slawomir Laba 提交于 2月 23, 2022

The driver was queueing reset_task regardless of the netdev
state.

Do not queue the reset task in iavf_change_mtu if netdev
is not running.

Fixes: fdd4044f ("iavf: Remove timer for work triggering, use delaying work instead")
Signed-off-by: NSlawomir Laba <slawomirx.laba@intel.com>
Signed-off-by: NPhani Burra <phani.r.burra@intel.com>
Signed-off-by: NJacob Keller <jacob.e.keller@intel.com>
Signed-off-by: NMateusz Palczewski <mateusz.palczewski@intel.com>
Tested-by: NKonrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: NTony Nguyen <anthony.l.nguyen@intel.com>

d2c0f45f

iavf: Fix deadlock in iavf_reset_task · e85ff9c6

由 Slawomir Laba 提交于 2月 23, 2022

There exists a missing mutex_unlock call on crit_lock in
iavf_reset_task call path.

Unlock the crit_lock before returning from reset task.

Fixes: 5ac49f3c ("iavf: use mutexes for locking of critical sections")
Signed-off-by: NSlawomir Laba <slawomirx.laba@intel.com>
Signed-off-by: NPhani Burra <phani.r.burra@intel.com>
Signed-off-by: NJacob Keller <jacob.e.keller@intel.com>
Signed-off-by: NMateusz Palczewski <mateusz.palczewski@intel.com>
Tested-by: NKonrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: NTony Nguyen <anthony.l.nguyen@intel.com>

e85ff9c6

iavf: Fix race in init state · a472eb5c

由 Slawomir Laba 提交于 2月 23, 2022

When iavf_init_version_check sends VIRTCHNL_OP_GET_VF_RESOURCES
message, the driver will wait for the response after requeueing
the watchdog task in iavf_init_get_resources call stack. The
logic is implemented this way that iavf_init_get_resources has
to be called in order to allocate adapter->vf_res. It is polling
for the AQ response in iavf_get_vf_config function. Expect a
call trace from kernel when adminq_task worker handles this
message first. adapter->vf_res will be NULL in
iavf_virtchnl_completion.

Make the watchdog task not queue the adminq_task if the init
process is not finished yet.

Fixes: 898ef1cb ("iavf: Combine init and watchdog state machines")
Signed-off-by: NSlawomir Laba <slawomirx.laba@intel.com>
Signed-off-by: NPhani Burra <phani.r.burra@intel.com>
Signed-off-by: NJacob Keller <jacob.e.keller@intel.com>
Signed-off-by: NMateusz Palczewski <mateusz.palczewski@intel.com>
Tested-by: NKonrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: NTony Nguyen <anthony.l.nguyen@intel.com>

a472eb5c

iavf: Fix locking for VIRTCHNL_OP_GET_OFFLOAD_VLAN_V2_CAPS · 0579fafd

由 Slawomir Laba 提交于 2月 23, 2022

iavf_virtchnl_completion is called under crit_lock but when
the code for VIRTCHNL_OP_GET_OFFLOAD_VLAN_V2_CAPS is called,
this lock is released in order to obtain rtnl_lock to avoid
ABBA deadlock with unregister_netdev.

Along with the new way iavf_remove behaves, there exist
many risks related to the lock release and attmepts to regrab
it. The driver faces crashes related to races between
unregister_netdev and netdev_update_features. Yet another
risk is that the driver could already obtain the crit_lock
in order to destroy it and iavf_virtchnl_completion could
crash or block forever.

Make iavf_virtchnl_completion never relock crit_lock in it's
call paths.

Extract rtnl_lock locking logic to the driver for
unregister_netdev in order to set the netdev_registered flag
inside the lock.

Introduce a new flag that will inform adminq_task to perform
the code from VIRTCHNL_OP_GET_OFFLOAD_VLAN_V2_CAPS right after
it finishes processing messages. Guard this code with remove
flags so it's never called when the driver is in remove state.

Fixes: 5951a2b9 ("iavf: Fix VLAN feature flags after VFR")
Signed-off-by: NSlawomir Laba <slawomirx.laba@intel.com>
Signed-off-by: NPhani Burra <phani.r.burra@intel.com>
Signed-off-by: NJacob Keller <jacob.e.keller@intel.com>
Signed-off-by: NMateusz Palczewski <mateusz.palczewski@intel.com>
Tested-by: NKonrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: NTony Nguyen <anthony.l.nguyen@intel.com>

0579fafd

iavf: Fix init state closure on remove · 3ccd54ef

由 Slawomir Laba 提交于 2月 23, 2022

When init states of the adapter work, the errors like lack
of communication with the PF might hop in. If such events
occur the driver restores previous states in order to retry
initialization in a proper way. When remove task kicks in,
this situation could lead to races with unregistering the
netdevice as well as resources cleanup. With the commit
introducing the waiting in remove for init to complete,
this problem turns into an endless waiting if init never
recovers from errors.

Introduce __IAVF_IN_REMOVE_TASK bit to indicate that the
remove thread has started.

Make __IAVF_COMM_FAILED adapter state respect the
__IAVF_IN_REMOVE_TASK bit and set the __IAVF_INIT_FAILED
state and return without any action instead of trying to
recover.

Make __IAVF_INIT_FAILED adapter state respect the
__IAVF_IN_REMOVE_TASK bit and return without any further
actions.

Make the loop in the remove handler break when adapter has
__IAVF_INIT_FAILED state set.

Fixes: 898ef1cb ("iavf: Combine init and watchdog state machines")
Signed-off-by: NSlawomir Laba <slawomirx.laba@intel.com>
Signed-off-by: NPhani Burra <phani.r.burra@intel.com>
Signed-off-by: NJacob Keller <jacob.e.keller@intel.com>
Signed-off-by: NMateusz Palczewski <mateusz.palczewski@intel.com>
Tested-by: NKonrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: NTony Nguyen <anthony.l.nguyen@intel.com>

3ccd54ef

iavf: Add waiting so the port is initialized in remove · 97457801

由 Slawomir Laba 提交于 2月 23, 2022

There exist races when port is being configured and remove is
triggered.

unregister_netdev is not and can't be called under crit_lock
mutex since it is calling ndo_stop -> iavf_close which requires
this lock. Depending on init state the netdev could be still
unregistered so unregister_netdev never cleans up, when shortly
after that the device could become registered.

Make iavf_remove wait until port finishes initialization.
All critical state changes are atomic (under crit_lock).
Crashes that come from iavf_reset_interrupt_capability and
iavf_free_traffic_irqs should now be solved in a graceful
manner.

Fixes: 605ca7c5 ("iavf: Fix kernel BUG in free_msi_irqs")
Signed-off-by: NSlawomir Laba <slawomirx.laba@intel.com>
Signed-off-by: NPhani Burra <phani.r.burra@intel.com>
Signed-off-by: NJacob Keller <jacob.e.keller@intel.com>
Signed-off-by: NMateusz Palczewski <mateusz.palczewski@intel.com>
Tested-by: NKonrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: NTony Nguyen <anthony.l.nguyen@intel.com>

97457801

iavf: Rework mutexes for better synchronisation · fc2e6b3b

由 Slawomir Laba 提交于 2月 23, 2022

The driver used to crash in multiple spots when put to stress testing
of the init, reset and remove paths.

The user would experience call traces or hangs when creating,
resetting, removing VFs. Depending on the machines, the call traces
are happening in random spots, like reset restoring resources racing
with driver remove.

Make adapter->crit_lock mutex a mandatory lock for guarding the
operations performed on all workqueues and functions dealing with
resource allocation and disposal.

Make __IAVF_REMOVE a final state of the driver respected by
workqueues that shall not requeue, when they fail to obtain the
crit_lock.

Make the IRQ handler not to queue the new work for adminq_task
when the __IAVF_REMOVE state is set.

Fixes: 5ac49f3c ("iavf: use mutexes for locking of critical sections")
Signed-off-by: NSlawomir Laba <slawomirx.laba@intel.com>
Signed-off-by: NPhani Burra <phani.r.burra@intel.com>
Signed-off-by: NJacob Keller <jacob.e.keller@intel.com>
Signed-off-by: NMateusz Palczewski <mateusz.palczewski@intel.com>
Tested-by: NKonrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: NTony Nguyen <anthony.l.nguyen@intel.com>

fc2e6b3b

28 1月, 2022 1 次提交

iavf: Remove useless DMA-32 fallback configuration · 9498d4af

由 Christophe JAILLET 提交于 1月 09, 2022

As stated in [1], dma_set_mask() with a 64-bit mask never fails if
dev->dma_mask is non-NULL.
So, if it fails, the 32 bits case will also fail for the same reason.

Simplify code and remove some dead code accordingly.

[1]: https://lkml.org/lkml/2021/6/7/398Signed-off-by: NChristophe JAILLET <christophe.jaillet@wanadoo.fr>
Reviewed-by: NChristoph Hellwig <hch@lst.de>
Reviewed-by: NAlexander Lobakin <alexandr.lobakin@intel.com>
Tested-by: NKonrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: NTony Nguyen <anthony.l.nguyen@intel.com>

9498d4af

05 1月, 2022 1 次提交

iavf: Fix limit of total number of queues to active queues of VF · b712941c

由 Karen Sornek 提交于 9月 01, 2021

In the absence of this validation, if the user requests to
configure queues more than the enabled queues, it results in
sending the requested number of queues to the kernel stack
(due to the asynchronous nature of VF response), in which
case the stack might pick a queue to transmit that is not
enabled and result in Tx hang. Fix this bug by
limiting the total number of queues allocated for VF to
active queues of VF.

Fixes: d5b33d02 ("i40evf: add ndo_setup_tc callback to i40evf")
Signed-off-by: NAshwin Vijayavel <ashwin.vijayavel@intel.com>
Signed-off-by: NKaren Sornek <karen.sornek@intel.com>
Tested-by: NKonrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: NTony Nguyen <anthony.l.nguyen@intel.com>

b712941c

18 12月, 2021 5 次提交

iavf: Restrict maximum VLAN filters for VIRTCHNL_VF_OFFLOAD_VLAN_V2 · 92fc5085