1. 18 May 2022, 1 commit
  2. 10 May 2022, 4 commits
    • net/mlx5: Lag, add debugfs to query hardware lag state · 7f46a0b7
      Mark Bloch committed
      Lag state has become very complicated, with many modes, flags, types and
      port selection methods, and future work will add additional features.
      
      Add a debugfs to query the current lag state. A new directory named "lag"
      will be created under the mlx5 debugfs directory. As the driver has a
      debugfs directory per PCI function, the location will be: <debugfs>/mlx5/<BDF>/lag
      
      For example:
      /sys/kernel/debug/mlx5/0000:08:00.0/lag
      
      The following files are exposed:
      
      - state: Returns "active" or "disabled". If "active" it means hardware
               lag is active.
      
      - members: Returns the BDFs of all members of the lag object.
      
      - type: Returns the type of the lag currently configured. Valid only
      	if hardware lag is active.
      	* "roce" - Members are bare metal PFs.
      	* "switchdev" - Members are in switchdev mode.
      	* "multipath" - ECMP offloads.
      
      - port_sel_mode: Returns the egress port selection method, valid
      		 only if hardware lag is active.
      		 * "queue_affinity" - Egress port is selected by
      		   the QP/SQ affinity.
      		 * "hash" - Egress port is selected by hash done on
      		   each packet. Controlled by: xmit_hash_policy of the
      		   bond device.
      - flags: Returns flags that are specific to the lag @type. Valid only if
      	 hardware lag is active.
      	 * "shared_fdb" - "on" or "off"; if "on", a single FDB is used.
      
      - mapping: Returns the mapping used to select the egress port.
      	   Valid only if hardware lag is active.
      	   If @port_sel_mode is "hash", returns the active egress ports;
      	   the hash result will select only active ports.
      	   If @port_sel_mode is "queue_affinity", returns the mapping
      	   between the configured port affinity of the QP/SQ and the
      	   actual egress port. For example:
      	   * 1:1 - Mapping means that if the configured affinity is port 1,
      	           traffic will egress via port 1.
      	   * 1:2 - Mapping means that if the configured affinity is port 1,
      		   traffic will egress via port 2. This can happen
      		   if port 1 is down, or in active/backup mode when port 1
      		   is the backup.
      Signed-off-by: Mark Bloch <mbloch@nvidia.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
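      Below is a minimal, hypothetical sketch of how a read-only file such as
      "state" could be wired up with the standard debugfs/seq_file helpers. It
      only illustrates the mechanism described in the entry above and is not
      the driver's actual code; the directory handle and the lag_active flag
      are assumptions.

      #include <linux/debugfs.h>
      #include <linux/module.h>
      #include <linux/seq_file.h>

      /* Sketch only: expose a "state" file that prints "active"/"disabled". */
      static int lag_state_show(struct seq_file *file, void *priv)
      {
      	bool *active = file->private;	/* flag handed over at file creation */

      	seq_printf(file, "%s\n", *active ? "active" : "disabled");
      	return 0;
      }
      DEFINE_SHOW_ATTRIBUTE(lag_state);

      /* Called with the per-function <debugfs>/mlx5/<BDF> directory dentry. */
      static void lag_debugfs_init(struct dentry *mlx5_dbg_root, bool *lag_active)
      {
      	struct dentry *lag_dir = debugfs_create_dir("lag", mlx5_dbg_root);

      	debugfs_create_file("state", 0444, lag_dir, lag_active, &lag_state_fops);
      }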
    • net/mlx5: Support devices with more than 2 ports · 4cd14d44
      Mark Bloch committed
      Increase the define MLX5_MAX_PORTS to 4 as the driver is ready
      to support NICs with 4 ports.
      Signed-off-by: Mark Bloch <mbloch@nvidia.com>
      Reviewed-by: Maor Gottlieb <maorg@nvidia.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
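      The change itself is just a constant bump; as a sketch (the previous
      value of 2 and the header location are from memory, not from this log):

      /* include/linux/mlx5/driver.h (approximate location) */
      #define MLX5_MAX_PORTS 4	/* was 2; sizes per-lag port arrays in the driver */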
    • net/mlx5: Lag, expose number of lag ports · 34a30d76
      Mark Bloch committed
      Downstream patches will add support for hardware lag with
      more than 2 ports. Add a way for users to query the number of lag ports.
      Signed-off-by: Mark Bloch <mbloch@nvidia.com>
      Reviewed-by: Maor Gottlieb <maorg@nvidia.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
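      A hedged usage sketch follows; mlx5_lag_get_num_ports() is my guess at
      the helper this commit exports (based on the title, not the diff), so
      treat the name and signature as assumptions.

      #include <linux/mlx5/driver.h>

      /* Sketch: a lag-aware consumer sizing its per-port state from the
       * number of ports in the hardware lag instead of hard-coding 2.
       */
      static void sketch_init_per_port_state(struct mlx5_core_dev *mdev)
      {
      	u8 i, num_ports = mlx5_lag_get_num_ports(mdev);	/* assumed helper */

      	for (i = 0; i < num_ports; i++) {
      		/* set up per-port resources here instead of assuming 2 ports */
      	}
      }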
    • net/mlx5: Add exit route when waiting for FW · 8324a02c
      Gavin Li committed
      Currently, removing a device needs to take the driver interface lock
      before doing any cleanup. If the driver is waiting in a loop for FW
      init, there is no way to cancel the wait; instead, device cleanup waits
      for the loop to conclude and release the lock.
      
      To allow immediate response to remove device commands, check the TEARDOWN
      flag while waiting for FW init, and exit the loop if it has been set.
      Signed-off-by: Gavin Li <gavinl@nvidia.com>
      Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
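      A simplified sketch of the described behavior; fw_ready() and the
      polling interval are placeholders, and while MLX5_INTERFACE_STATE_TEARDOWN
      is the flag the commit message names, the exact check in the driver may
      differ.

      #include <linux/bitops.h>
      #include <linux/delay.h>
      #include <linux/errno.h>
      #include <linux/jiffies.h>
      #include <linux/mlx5/driver.h>

      /* Sketch: wait for FW initialization, but exit immediately once the
       * TEARDOWN flag is set so device removal does not block on the loop.
       */
      static int sketch_wait_fw_init(struct mlx5_core_dev *dev, u32 max_wait_ms)
      {
      	unsigned long end = jiffies + msecs_to_jiffies(max_wait_ms);

      	while (!fw_ready(dev)) {	/* placeholder FW-initialized predicate */
      		if (test_bit(MLX5_INTERFACE_STATE_TEARDOWN, &dev->intf_state))
      			return -ENODEV;	/* removal requested: stop waiting */
      		if (time_after(jiffies, end))
      			return -EBUSY;	/* FW never became ready in time */
      		msleep(20);
      	}
      	return 0;
      }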
  3. 09 April 2022, 2 commits
  4. 18 March 2022, 2 commits
  5. 10 March 2022, 4 commits
  6. 27 February 2022, 1 commit
    • net/mlx5: Expose APIs to get/put the mlx5 core device · 1695b97b
      Yishai Hadas committed
      Expose an API to get the mlx5 core device from a given VF PCI device if
      mlx5_core is its driver.
      
      The get API keeps the intf_state_mutex locked to make sure that the
      device can't go away or be unloaded until the caller completes its work
      on the device; any flow that takes the lock is expected to hold it only
      for a short period of time.

      The put API unlocks the intf_state_mutex.
      
      The use case for these APIs is the migration flow of a VF over VFIO PCI.
      In that case the VF doesn't ride on mlx5_core, because two different PCI
      devices are involved: the PF owned by mlx5_core and the VF owned by the
      vfio driver.
      
      The mlx5_core of the PF is accessed only during the narrow window of the
      VF's ioctl that requires its services.
      
      This allows the PF driver to be more independent of the VF driver, so
      long as it doesn't reset the FW.
      
      Link: https://lore.kernel.org/all/20220224142024.147653-6-yishaih@nvidia.com
      Signed-off-by: Yishai Hadas <yishaih@nvidia.com>
      Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
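      A hedged usage sketch of the get/put pattern; mlx5_vf_get_core_dev() and
      mlx5_vf_put_core_dev() are the names I would expect this commit to
      export, so verify them against include/linux/mlx5/driver.h before
      relying on this.

      #include <linux/errno.h>
      #include <linux/pci.h>
      #include <linux/mlx5/driver.h>

      /* Sketch: a vfio-pci style caller briefly borrowing the PF's mlx5_core
       * device to service a VF migration ioctl. The get keeps intf_state_mutex
       * held, so the window between get and put must stay short.
       */
      static int sketch_vf_migration_step(struct pci_dev *vf_pdev)
      {
      	struct mlx5_core_dev *mdev;

      	mdev = mlx5_vf_get_core_dev(vf_pdev);	/* assumed API from this commit */
      	if (!mdev)
      		return -ENOTCONN;

      	/* ... issue the short-lived migration command(s) against mdev ... */

      	mlx5_vf_put_core_dev(mdev);		/* drops intf_state_mutex */
      	return 0;
      }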
  7. 24 February 2022, 5 commits
  8. 03 December 2021, 1 commit
    • net/mlx5: Dynamically resize flow counters query buffer · b247f32a
      Avihai Horon committed
      The flow counters bulk query buffer is allocated once during
      mlx5_fc_init_stats(). For PFs and VFs this buffer usually takes a little
      more than 512KB of memory, which is then aligned up to the next power of
      2, i.e. 1MB. For SFs, this buffer is reduced and takes around 128 bytes.
      
      The buffer size determines the maximum number of flow counters that
      can be queried at a time. Thus, having a bigger buffer can improve
      performance for users that need to query many flow counters.
      
      There are cases that don't use many flow counters and don't need a big
      buffer (e.g. SFs, VFs). Since this size becomes critical at large scale,
      the buffer size should be reduced in these cases.
      
      In order to reduce memory consumption while maintaining query
      performance, change the query buffer's allocation scheme to the
      following:
      - First allocate the buffer with small initial size.
      - If the number of counters surpasses the initial size, resize the
        buffer to the maximum size.
      
      The buffer only grows and is never shrunk, because users with many flow
      counters don't care about the buffer size, and we don't want to add
      resize overhead if the current number of counters drops.
      
      This solution is preferable to the current one, which is less accurate
      and only addresses SFs.
      Signed-off-by: Avihai Horon <avihaih@nvidia.com>
      Reviewed-by: Mark Bloch <mbloch@nvidia.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
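      A simplified sketch of the grow-only allocation scheme; the structure,
      sizes and names below are illustrative, not the driver's actual code.

      #include <linux/mm.h>
      #include <linux/slab.h>

      struct sketch_bulk_query {
      	void *buf;
      	int slots;	/* number of counters the buffer can currently hold */
      };

      /* Sketch: start small and, once the counter count exceeds the initial
       * capacity, reallocate once at the maximum size. Never shrink.
       */
      static int sketch_ensure_capacity(struct sketch_bulk_query *q, int counters,
      				  int init_slots, int max_slots, size_t slot_sz)
      {
      	int want = counters <= init_slots ? init_slots : max_slots;
      	void *buf;

      	if (q->buf && q->slots >= want)
      		return 0;	/* grow-only: current buffer is big enough */

      	buf = kvzalloc(want * slot_sz, GFP_KERNEL);
      	if (!buf)
      		return -ENOMEM;

      	kvfree(q->buf);
      	q->buf = buf;
      	q->slots = want;
      	return 0;
      }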
  9. 27 October 2021, 1 commit
  10. 26 October 2021, 2 commits
  11. 21 October 2021, 1 commit
  12. 19 October 2021, 5 commits
  13. 16 October 2021, 5 commits
  14. 05 October 2021, 1 commit
  15. 28 September 2021, 1 commit
  16. 12 August 2021, 4 commits
    • net/mlx5: Allocate individual capability · 48f02eef
      Parav Pandit committed
      Currently mlx5_core_dev contains an array of capabilities: 19 valid
      device capabilities, 2 reserved entries and 12 holes. For those 14
      unused entries, mlx5_core_dev allocates 14 * 8K = 112K bytes of memory
      that is never used. Because of this the mlx5_core_dev structure is
      roughly 270K bytes, and the allocation is further aligned up to the next
      power of 2, 512K bytes.
      
      By skipping the non-existent entries:
      (a) 112KB is saved,
      (b) mlx5_core_dev shrinks to 8KB after alignment, and
      (c) ~350KB is saved on alignment.
      
      In the future, individual capability allocation can be used to skip
      allocating a capability when it is disabled at the device level. This
      patch prepares mlx5_core_dev to hold capabilities via pointers instead
      of an inline array.
      Signed-off-by: Parav Pandit <parav@nvidia.com>
      Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
      Reviewed-by: Shay Drory <shayd@nvidia.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
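      A hedged before/after sketch of the data-structure change; the names,
      entry count and allocation helper below are illustrative, not the actual
      mlx5 definitions.

      #include <linux/slab.h>

      #define SKETCH_CAP_TYPES 33	/* illustrative: 19 valid + 2 reserved + 12 holes */

      struct sketch_hca_cap;		/* per-type capability data (~8K each) */

      /* Before: a large inline array embedded in mlx5_core_dev, wasting the
       * reserved entries and holes. After: an array of pointers, where only
       * capability types that actually exist get an allocation.
       */
      struct sketch_dev_caps {
      	struct sketch_hca_cap *hca[SKETCH_CAP_TYPES];
      };

      static int sketch_alloc_cap(struct sketch_dev_caps *caps, int type, size_t sz)
      {
      	caps->hca[type] = kzalloc(sz, GFP_KERNEL);	/* skipped for unused types */
      	return caps->hca[type] ? 0 : -ENOMEM;
      }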
    • net/mlx5: Reorganize current and maximal capabilities to be per-type · 5958a6fa
      Parav Pandit committed
      In the current code, the current and maximal capabilities are maintained
      in separate arrays which are both per type. To allow such a basic
      structure to be created as a dynamically allocated array, move the curr
      and max fields into a unified structure so that a specific capability
      can be allocated as one unit.
      Signed-off-by: Parav Pandit <parav@nvidia.com>
      Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
      Reviewed-by: Shay Drory <shayd@nvidia.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
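      A short sketch of the unified per-type layout described above; the array
      size and the struct name are illustrative.

      #include <linux/types.h>

      /* Current and maximal values for one capability type live in one unit,
       * so the pair can later be allocated (or skipped) together.
       */
      struct sketch_hca_cap {
      	u32 cur[1024];	/* illustrative size; real code sizes this from the HCA cap layout */
      	u32 max[1024];
      };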
    • net/mlx5: Delete impossible dev->state checks · 8e792700
      Leon Romanovsky committed
      The new mlx5_core device structure is allocated through devlink_alloc()
      with kzalloc, which ensures that all fields are zeroed, including
      ->state.

      That means the checks of that field in mlx5_init_one() are completely
      redundant, because that function is called only once, at the beginning
      of the mlx5_core_dev lifetime.
      
      PCI:
       .probe()
        -> probe_one()
         -> mlx5_init_one()
      
      The recovery flow can't run at that time or before it, because the
      relevant work is initialized later, in mlx5_init_once().
      
      This initialization flow ensures that dev->state can never be
      MLX5_DEVICE_STATE_UNINITIALIZED, so remove the impossible checks.
      Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
    • net/mlx5: Fix typo in comments · 39c538d6
      Cai Huoqing committed
      Fix typos:
      *vectores  ==> vectors
      *realeased  ==> released
      *erros  ==> errors
      *namepsace  ==> namespace
      *trafic  ==> traffic
      *proccessed  ==> processed
      *retore  ==> restore
      *Currenlty  ==> Currently
      *crated  ==> created
      *chane  ==> change
      *cannnot  ==> cannot
      *usuallly  ==> usually
      *failes  ==> fails
      *importent  ==> important
      *reenabled  ==> re-enabled
      *alocation  ==> allocation
      *recived  ==> received
      *tanslation  ==> translation
      Signed-off-by: Cai Huoqing <caihuoqing@baidu.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>