Commits · 345502af4e42cef57782118520c3c326b55f1071 · openeuler / Kernel

10 Jun, 2021 11 commits

net: stmmac: Fix missing { } around two statements in an if statement · 345502af

Committed by Colin Ian King on Jun 09, 2021

There are missing { } around a block of code on an if statement. Fix this
by adding them in.
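
For illustration, the class of bug looks roughly like the sketch below. The struct and function names are hypothetical and are not taken from the stmmac driver; only the shape of the mistake is the point.

```c
#include <stdbool.h>

/* Hypothetical configuration record; not the stmmac data structures. */
struct link_cfg {
	int max_speed;
	bool an_inband;
};

/* Buggy shape: the indentation suggests both statements are conditional,
 * but without braces only the first one belongs to the if. */
void setup_buggy(struct link_cfg *cfg, bool is_2500)
{
	if (is_2500)
		cfg->max_speed = 2500;
		cfg->an_inband = false;	/* always executed */
}

/* Fixed shape: the braces make both statements conditional. */
void setup_fixed(struct link_cfg *cfg, bool is_2500)
{
	if (is_2500) {
		cfg->max_speed = 2500;
		cfg->an_inband = false;
	}
}
```

Compilers can flag the buggy shape (GCC has -Wmisleading-indentation), which matches the kind of report Coverity produced here.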

Addresses-Coverity: ("Nesting level does not match indentation")
Fixes: 46682cb8 ("net: stmmac: enable Intel mGbE 2.5Gbps link speed")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: ethernet: ti: cpsw-phy-sel: Use devm_platform_ioremap_resource_byname() · ba539319

Committed by Yang Yingliang on Jun 09, 2021

Use the devm_platform_ioremap_resource_byname() helper instead of
calling platform_get_resource_byname() and devm_ioremap_resource()
separately.
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

mvpp2: prefetch page · 2f128eb3

Committed by Matteo Croce on Jun 09, 2021

Most of the time in the RX path is spent in the compound_head() call
made at the end of the RX loop:

       │     build_skb():
       [...]
       │     static inline struct page *compound_head(struct page *page)
       │     {
       │     unsigned long head = READ_ONCE(page->compound_head);
 65.23 │       ldr  x2, [x1, #8]

Prefetch the page struct as early as possible. This speeds up the RX path
noticeably, raising the packet rate by about 3-4% in a drop test.

       │     build_skb():
       [...]
       │     static inline struct page *compound_head(struct page *page)
       │     {
       │     unsigned long head = READ_ONCE(page->compound_head);
 17.92 │       ldr  x2, [x1, #8]
Signed-off-by: Matteo Croce <mcroce@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

mvpp2: prefetch right address · d8ea89fe

Committed by Matteo Croce on Jun 09, 2021

In the RX buffer, the received data starts after a headroom used to
align the IP header and to allow prepending headers efficiently.
The prefetch() should take this into account, and prefetch from
the very start of the received data.
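
A minimal user-space sketch of the idea follows. The two offsets are placeholder values chosen for illustration; the driver defines its own MVPP2_MH_SIZE and MVPP2_SKB_HEADROOM. __builtin_prefetch() is the GCC/Clang builtin, used here in place of the kernel's prefetch().

```c
#include <stddef.h>

/* Assumed values for illustration only; the driver defines its own
 * MVPP2_MH_SIZE and MVPP2_SKB_HEADROOM. */
#define DEMO_MH_SIZE       2
#define DEMO_SKB_HEADROOM  64

/* Return the address where received data starts inside an RX buffer. */
const unsigned char *rx_data_start(const unsigned char *buf)
{
	return buf + DEMO_MH_SIZE + DEMO_SKB_HEADROOM;
}

/* Prefetch the first bytes the stack will actually read, not the
 * unused headroom at the start of the buffer. */
void rx_prefetch(const unsigned char *buf)
{
	/* GCC/Clang builtin: read access (0), high temporal locality (3). */
	__builtin_prefetch(rx_data_start(buf), 0, 3);
}
```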

We can see that ether_addr_equal_64bits(), which is the first function
to access the data, drops from the top of the perf top output.

prefetch(data):

Overhead  Shared Object     Symbol
  11.64%  [kernel]          [k] eth_type_trans

prefetch(data + MVPP2_MH_SIZE + MVPP2_SKB_HEADROOM):

Overhead  Shared Object     Symbol
  13.42%  [kernel]          [k] build_skb
  10.35%  [mvpp2]           [k] mvpp2_rx
   9.35%  [kernel]          [k] __netif_receive_skb_core
   8.24%  [kernel]          [k] kmem_cache_free
   7.97%  [kernel]          [k] dev_gro_receive
   7.68%  [kernel]          [k] page_pool_put_page
   7.32%  [kernel]          [k] kmem_cache_alloc
   7.09%  [mvpp2]           [k] mvpp2_bm_pool_put
   3.36%  [kernel]          [k] eth_type_trans

Also, move the eth_type_trans() call a bit down, to give the RAM more
time to prefetch the data.
Signed-off-by: Matteo Croce <mcroce@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: ethernet: ti: am65-cpts: Use devm_platform_ioremap_resource_byname() · e77e2cf4

Committed by Yang Yingliang on Jun 09, 2021

Use the devm_platform_ioremap_resource_byname() helper instead of
calling platform_get_resource_byname() and devm_ioremap_resource()
separately.
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: stmmac: Use devm_platform_ioremap_resource_byname() · 3a5a32b5

Committed by Yang Yingliang on Jun 09, 2021

Use the devm_platform_ioremap_resource_byname() helper instead of
calling platform_get_resource_byname() and devm_ioremap_resource()
separately.
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: sgi: ioc3-eth: check return value after calling platform_get_resource() · db8f7be1

Committed by Yang Yingliang on Jun 09, 2021

platform_get_resource() may return NULL, and dereferencing that result
causes a NULL pointer dereference, so check the return value.
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: hns3: use list_move_tail instead of list_del/list_add_tail in hclge_main.c · 4724acc4

Committed by Baokun Li on Jun 09, 2021

Use list_move_tail() instead of list_del() + list_add_tail() in hclge_main.c.
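
The equivalence can be sketched in user space with a stripped-down stand-in for the kernel's circular doubly linked list. This is a simplified illustration, not the code from include/linux/list.h.

```c
#include <stddef.h>

/* Minimal user-space stand-in for the kernel's struct list_head. */
struct list_head {
	struct list_head *next, *prev;
};

/* An empty list is a head that points at itself. */
void list_init(struct list_head *h)
{
	h->next = h;
	h->prev = h;
}

/* Link n in just before head, i.e. at the tail of the list. */
void list_add_tail(struct list_head *n, struct list_head *head)
{
	n->prev = head->prev;
	n->next = head;
	head->prev->next = n;
	head->prev = n;
}

/* Unlink e from whatever list it is currently on. */
void list_del(struct list_head *e)
{
	e->prev->next = e->next;
	e->next->prev = e->prev;
}

/* One call replaces the list_del() + list_add_tail() pair. */
void list_move_tail(struct list_head *e, struct list_head *head)
{
	list_del(e);
	list_add_tail(e, head);
}
```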
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Baokun Li <libaokun1@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: hns3: use list_move_tail instead of list_del/list_add_tail in hclgevf_main.c · 49768ce9

Committed by Baokun Li on Jun 09, 2021

Use list_move_tail() instead of list_del() + list_add_tail() in hclgevf_main.c.
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Baokun Li <libaokun1@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

nfp: use list_move instead of list_del/list_add in nfp_cppcore.c · 39c3783e

Committed by Baokun Li on Jun 09, 2021

Use list_move() instead of list_del() + list_add() in nfp_cppcore.c.
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Baokun Li <libaokun1@huawei.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: ethernet: ravb: Use devm_platform_get_and_ioremap_resource() · e89a2cdb

Committed by Yang Yingliang on Jun 09, 2021

Use devm_platform_get_and_ioremap_resource() to simplify
code.
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Reviewed-by: Sergei Shtylyov <sergei.shtylyov@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

09 Jun, 2021 27 commits

net: stmmac: explicitly deassert GMAC_AHB_RESET · e67f325e

Committed by Matthew Hagan on Jun 08, 2021

We are currently assuming that GMAC_AHB_RESET will already be deasserted
by the bootloader. However if this has not been done, probing of the GMAC
will fail. To remedy this we must ensure GMAC_AHB_RESET has been deasserted
prior to probing.

v2 changes:
 - remove NULL condition check for stmmac_ahb_rst in stmmac_main.c
 - unwrap dev_err() message in stmmac_main.c
 - add PTR_ERR() around plat->stmmac_ahb_rst in stmmac_platform.c

v3 changes:
 - add error pointer to dev_err() output
 - add reset_control_assert(stmmac_ahb_rst) in stmmac_dvr_remove
 - revert PTR_ERR() around plat->stmmac_ahb_rst since this is performed
   on the returned value of ret by the calling function
Signed-off-by: Matthew Hagan <mnhagan88@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

sh_eth: Use devm_platform_get_and_ioremap_resource() · 52481e58

Committed by Yang Yingliang on Jun 08, 2021

Use devm_platform_get_and_ioremap_resource() to simplify
code.
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Reviewed-by: Sergei Shtylyov <sergei.shtylyov@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: nixge: simplify code with devm platform functions · 5b38b97f

Committed by Yang Yingliang on Jun 08, 2021

Use devm_platform_get_and_ioremap_resource() and
devm_platform_ioremap_resource_byname() to simplify
code.
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ethernet/qlogic: Use list_for_each_entry() to simplify code in qlcnic_hw.c · 78595dfc

Committed by Wang Hai on Jun 08, 2021

Convert list_for_each() to list_for_each_entry() where
applicable. This simplifies the code.
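
As a rough user-space illustration of what the conversion buys, the sketch below contrasts the two iteration styles. The macros are simplified stand-ins for the kernel ones (they rely on the GNU __typeof__ extension), and struct item is a hypothetical payload type.

```c
#include <stddef.h>

struct list_head { struct list_head *next, *prev; };

/* Recover the enclosing structure from a pointer to one of its members. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Iterate over raw list nodes. */
#define list_for_each(pos, head) \
	for ((pos) = (head)->next; (pos) != (head); (pos) = (pos)->next)

/* Iterate over the containing entries directly (GNU __typeof__). */
#define list_for_each_entry(pos, head, member) \
	for (pos = container_of((head)->next, __typeof__(*pos), member); \
	     &pos->member != (head); \
	     pos = container_of(pos->member.next, __typeof__(*pos), member))

/* Hypothetical payload type embedding a list node. */
struct item { int value; struct list_head node; };

/* Old style: walk the nodes, then convert each one by hand. */
int sum_old(struct list_head *head)
{
	struct list_head *p;
	int sum = 0;

	list_for_each(p, head) {
		struct item *it = container_of(p, struct item, node);
		sum += it->value;
	}
	return sum;
}

/* New style: the macro hands back the entry itself. */
int sum_new(struct list_head *head)
{
	struct item *it;
	int sum = 0;

	list_for_each_entry(it, head, node)
		sum += it->value;
	return sum;
}
```

Both functions visit the same entries; the second drops the temporary node pointer and the explicit container_of() call.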
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Wang Hai <wanghai38@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: stmmac: fix NPD with phylink_set_pcs if there is no MDIO bus · b55b1d50

Committed by Vladimir Oltean on Jun 08, 2021

priv->plat->mdio_bus_data is optional and some platforms may not set it,
yet we proceed to look straight at priv->plat->mdio_bus_data->has_xpcs.

Since the xpcs is instantiated based on the has_xpcs property, we can
avoid looking at the priv->plat->mdio_bus_data structure altogether and
just check for the presence of the xpcs pointer.

Fixes: 11059740 ("net: pcs: xpcs: convert to phylink_pcs_ops")
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: qede: Use list_for_each_entry() to simplify code · 36861d1f

Committed by Wang Hai on Jun 08, 2021

Convert list_for_each() to list_for_each_entry() where
applicable. This simplifies the code.
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Wang Hai <wanghai38@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: hns3: add error handling compatibility during initialization · 1c360a4a

Committed by Jiaran Zhang on Jun 08, 2021

During initialization, the driver logs and clears the hw errors that
already occurred. For devices that support the imp-handle RAS capability,
the driver needs to handle a different error status, otherwise it may
trigger a wrong reset.

So fix it by adding a new processing branch.
Signed-off-by: Jiaran Zhang <zhangjiaran@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: hns3: update error recovery module and type · 8a95e360

Committed by Jiaran Zhang on Jun 08, 2021

Update error recovery module and type for RoCE.

The enumeration values of module names and error types are not sorted
in sequence, so the current printing mode cannot print them correctly.

Use the index mode instead: if mod_id and type_id match an enumerated
value, display the corresponding information.
Signed-off-by: Jiaran Zhang <zhangjiaran@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: hns3: add support for imp-handle ras capability · e65e9f5c

Committed by Jiaran Zhang on Jun 08, 2021

The IMP (Intelligent Management Processor) firmware adds a new feature to
handle and consolidate RAS information for new devices, so the NIC driver
only needs to query the reported RAS information. The NIC driver adds
support for this feature.

The driver queries the device capability to check whether the IMP supports
this feature. If yes, it executes the new RAS processing branch.

To provide a way to check whether the PF supports the imp-handle RAS
feature, also dump this info in debugfs.
Signed-off-by: Jiaran Zhang <zhangjiaran@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: hns3: add the RAS compatibility adaptation solution · 2e2deee7

Committed by Jiaran Zhang on Jun 08, 2021

To adapt to hardware modification and ensure that the driver is
compatible with the original error handling content, we need to add the
RAS compatibility adaptation solution.

Add a processing branch to the driver during error handling. In the new
processing branch, NIC fault information is integrated by the IMP. An
interaction command is added between the driver and IMP to query
and clear the fault source and interrupt source. The IMP integrates
error information and reports the highest reset level to the driver.
Signed-off-by: Jiaran Zhang <zhangjiaran@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: hns3: add support for handling all errors through MSI-X · 17f59244

Committed by Yufeng Mo on Jun 08, 2021

Currently, hardware errors can be reported through AER or MSI-X mode.
However, the AER mode is intended to handle only bus errors, but not
hardware errors. On the other hand, virtual machines cannot handle
AER errors. When an AER error is reported, virtual machines will be
suspended. So add support for handling all these hardware errors
through MSI-X mode, which depends on a newer version of firmware,
and keep the handler of the AER mode for compatibility.
Signed-off-by: Yufeng Mo <moyufeng@huawei.com>
Signed-off-by: Jiaran Zhang <zhangjiaran@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: ena: re-organize code to improve readability · a01f2cd0

Committed by Shay Agroskin on Jun 08, 2021

Restructure some ethtool code into switch-case blocks to make it more
uniform with other similar functions.
Also reorder variable declarations into reverse Christmas tree order.
Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com>
Signed-off-by: Shay Agroskin <shayagr@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: ena: Use dev_alloc() in RX buffer allocation · 947c54c3

Committed by Shay Agroskin on Jun 08, 2021

Use dev_alloc() when allocating RX buffers instead of specifying the
allocation flags explicitly. This results in the same behaviour with less
code.

Also move the page allocation and its DMA mapping into a function. This
creates a logical block, which may make the code easier to understand.
Signed-off-by: Shay Agroskin <shayagr@amazon.com>
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: ena: aggregate doorbell common operations into a function · 9e8afb05

Committed by Shay Agroskin on Jun 08, 2021

The ena_ring_tx_doorbell() is introduced to call the doorbell and
increase the driver's corresponding stat.
Signed-off-by: Ido Segev <idose@amazon.com>
Signed-off-by: Shay Agroskin <shayagr@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: ena: Remove module param and change message severity · 15efff76

Committed by Shay Agroskin on Jun 08, 2021

Remove the module param 'debug', which allows specifying the message
level of the driver. This value can be set using the ethtool command.
Also reduce the message level of LLQ support to a warning since it is
not an indication of an error.
Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com>
Signed-off-by: Shay Agroskin <shayagr@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: ena: add jiffies of last napi call to stats · 0ee251cd

Committed by Shay Agroskin on Jun 08, 2021

There are instances when we want to know when the last napi was
called for debugging.

On stuck or heavily loaded CPUs, the ena napi handler might not be
called for a long period of time. This stat can help us to
determine how much time has passed since the last execution of napi.
Signed-off-by: Sameeh Jubran <sameehj@amazon.com>
Signed-off-by: Shay Agroskin <shayagr@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: ena: use build_skb() in RX path · 9e5269a9

Committed by Shay Agroskin on Jun 08, 2021

This patch converts the RX path to use build_skb() for packets larger
than copybreak (set to 256 by default). This function makes the first
descriptor's page the linear part of the sk_buff struct buffer.

Also remove the SKB description from the README since most of it is no
longer relevant and the parts that are left don't add information.
Signed-off-by: Shay Agroskin <shayagr@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: ena: Improve error logging in driver · 091d0e85

Committed by Shay Agroskin on Jun 08, 2021

Add prints to improve logging of driver's errors.
Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com>
Signed-off-by: Shay Agroskin <shayagr@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: ena: Remove unused code · 9912c72e

Committed by Shay Agroskin on Jun 08, 2021

The ENA_DEFAULT_MIN_RX_BUFF_ALLOC_SIZE macro, the
ena_xdp_queues_present() function and the SUSPEND_RESUME enums aren't
used in the driver, and so are not needed.
Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com>
Signed-off-by: Gal Pressman <galpress@amazon.com>
Signed-off-by: Sameeh Jubran <sameehj@amazon.com>
Signed-off-by: Shay Agroskin <shayagr@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: ena: optimize data access in fast-path code · e4ac382e

Committed by Shay Agroskin on Jun 08, 2021

This tweaks several small places to improve the data access in fast
path:

* Remove duplicates of first_interrupt flag and surround it with
  WRITE/READ_ONCE macros:

  The flag is used to detect HW disorders in its
  interrupt communication with the driver. The flag is set when an
  interrupt is received and used in the health check function
  (ena_timer_service()) to help it find irregularities.

* Reorder some fields in ena_napi struct to take better advantage of
  cache access pattern.

* Move XDP TX queue number to a variable to save its calculation for
  every packet.

* Use likely() in a condition to improve branch prediction

The 'first_interrupt' and 'interrupt_masked' flags were moved to reside
in the same cache line as the first fields of the 'napi' struct. This
placement ensures that all memory accessed during the upper-half handler
resides in the same cache line (napi_schedule_irqoff() only accesses the
'state' and 'poll_list' fields, which are at the beginning of the napi
struct).
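
The WRITE/READ_ONCE part of the first bullet can be pictured with the user-space sketch below. The macros are simplified stand-ins for the kernel's READ_ONCE()/WRITE_ONCE() (they rely on the GNU __typeof__ extension), and struct demo_napi with its two helpers is hypothetical; only the flag name comes from the commit text.

```c
/* Simplified user-space stand-ins for the kernel macros: a volatile
 * access makes the compiler emit the load or store instead of caching
 * or eliding it. */
#define READ_ONCE(x)      (*(const volatile __typeof__(x) *)&(x))
#define WRITE_ONCE(x, v)  (*(volatile __typeof__(x) *)&(x) = (v))

/* Hypothetical per-queue state; only the flag name is from the commit. */
struct demo_napi {
	unsigned char first_interrupt;
};

/* Interrupt path: record that an interrupt has been seen. */
void demo_irq(struct demo_napi *n)
{
	WRITE_ONCE(n->first_interrupt, 1);
}

/* Health check: report whether any interrupt has arrived so far. */
int demo_seen_interrupt(struct demo_napi *n)
{
	return READ_ONCE(n->first_interrupt);
}
```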
Signed-off-by: Sameeh Jubran <sameehj@amazon.com>
Signed-off-by: Shay Agroskin <shayagr@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

mlxsw: thermal: Read module temperature thresholds using MTMP register · 72a64c2f

Committed by Mykola Kostenok on Jun 08, 2021

mlxsw_thermal_module_trips_update() is used to update the trip points of
the module's thermal zone. Currently, this is done by querying the
thresholds from the module's EEPROM via the MCIA register. This data does
not pass validation and in some cases can be unreliable, for example
due to some problem with the transceiver module.

The previous patch made it possible to read the module's temperature and
thresholds via the MTMP register. Therefore, extend
mlxsw_thermal_module_trips_update() to use the thresholds queried from
MTMP, if valid.

This is both more reliable and more efficient than the current method, as
temperature and thresholds are queried in one transaction instead of
three. This is significant when working over a slow bus such as I2C.
Signed-off-by: Mykola Kostenok <c_mykolak@nvidia.com>
Acked-by: Vadim Pasternak <vadimp@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

mlxsw: thermal: Add function for reading module temperature and thresholds · e57977b3

Committed by Mykola Kostenok on Jun 08, 2021

Provide a new function mlxsw_thermal_module_temp_and_thresholds_get() for
reading temperature and temperature thresholds in a single operation.
The motivation is to reduce the number of transactions with the device,
which is important when operating over a slow bus such as I2C.

Currently, the sole caller of the function is only using it to read the
module's temperature. The next patch will also use it to query the
module's temperature thresholds.
Signed-off-by: Mykola Kostenok <c_mykolak@nvidia.com>
Acked-by: Vadim Pasternak <vadimp@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

mlxsw: core_env: Read module temperature thresholds using MTMP register · befc2048

Committed by Mykola Kostenok on Jun 08, 2021

Currently, module temperature thresholds are obtained from the Management
Cable Info Access (MCIA) register by specifying the threshold offsets
within the module EEPROM layout. This data does not pass validation and in
some cases can be unreliable, for example due to some problem with the
module.

Add support for a new feature provided by the Management Temperature (MTMP)
register for sanitization of temperature threshold values.

Extend mlxsw_env_module_temp_thresholds_get() to get temperature
thresholds through the MTMP field 'max_operational_temperature': if it is
not zero, the feature is supported. Otherwise fall back to the old method
and get the thresholds through MCIA.
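
The selection logic described above can be sketched as follows. All struct and function names are hypothetical and do not mirror the mlxsw code; the mapping of the high and maximum MTMP thresholds to critical and emergency follows the description of the MTMP register extension in this series. For simplicity the MCIA values are passed in up front, whereas a driver would presumably issue the MCIA reads only on the fallback path.

```c
#include <stdbool.h>

/* Hypothetical snapshot of values unpacked from MTMP. */
struct demo_mtmp {
	int high_threshold;              /* maps to the critical threshold */
	int max_operational_temperature; /* maps to emergency; 0 if unsupported */
};

struct demo_thresholds {
	int crit;
	int emerg;
};

/* Returns true when MTMP supplied the thresholds, false when the
 * MCIA (module EEPROM) values were used as a fallback. */
bool demo_thresholds_get(const struct demo_mtmp *mtmp,
			 const struct demo_thresholds *mcia,
			 struct demo_thresholds *out)
{
	if (mtmp->max_operational_temperature != 0) {
		out->crit = mtmp->high_threshold;
		out->emerg = mtmp->max_operational_temperature;
		return true;
	}

	*out = *mcia;	/* fallback: thresholds read from the module EEPROM */
	return false;
}
```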
Signed-off-by: Mykola Kostenok <c_mykolak@nvidia.com>
Acked-by: Vadim Pasternak <vadimp@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

mlxsw: reg: Extend MTMP register with new threshold field · 314dbb19

Committed by Mykola Kostenok on Jun 08, 2021

Extend the Management Temperature (MTMP) register with a new field
specifying the maximum temperature threshold.

Extend the mlxsw_reg_mtmp_unpack() function with two extra arguments,
providing high and maximum temperature thresholds. For modules, these
thresholds correspond to critical and emergency thresholds that are read
from the module's EEPROM.
Signed-off-by: Mykola Kostenok <c_mykolak@nvidia.com>
Acked-by: Vadim Pasternak <vadimp@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

mlxsw: spectrum_router: Remove abort mechanism · a08a6193

Committed by Amit Cohen on Jun 08, 2021

The abort mechanism was introduced in commit 8e05fd71 ("fib: hook
IPv4 fib for hardware offload") with the purpose of falling back to
software-based routing in case of a route programming error in hardware.
The process is irreversible and requires users to reload the offloading
driver or reboot the machine.

While this approach might make sense in theory, it makes very little
sense in practice. In the case of high speed ASICs such as the Spectrum
ASIC, the abort mechanism effectively kills the machine upon a non-fatal
error such as a route programming error.

Such an extreme policy does not belong in the kernel, especially when
user space can simply try to reprogram the route following the
RTM_NEWROUTE failure notification.

Therefore, remove the abort mechanism.
Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: stmmac: enable Intel mGbE 2.5Gbps link speed · 46682cb8

Committed by Voon Weifeng on Jun 08, 2021

The Intel mGbE supports 2.5Gbps link speed by increasing the clock rate by
2.5 times of the original rate. In this mode, the serdes/PHY operates at a
serial baud rate of 3.125 Gbps and the PCS data path and GMII interface of
the MAC operate at 312.5 MHz instead of 125 MHz.
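
The figures above are consistent with a plain 2.5x scaling, which the small sketch below checks in integer arithmetic. The 1.25 Gbaud baseline is not stated in the text; it is inferred here as 3.125 / 2.5, and the macro and function names are illustrative.

```c
/* Rates in kHz so the 2.5x factor stays in integer arithmetic. */
#define GMII_CLK_1G_KHZ     125000UL   /* 125 MHz, from the commit text */
#define SERDES_BAUD_1G_KHZ  1250000UL  /* 1.25 Gbaud, inferred baseline */

/* Multiply by 2.5 without floating point: x * 5 / 2. */
unsigned long overclock_2500(unsigned long rate_khz)
{
	return rate_khz * 5UL / 2UL;
}
```

Applied to the two baselines, this yields 312500 kHz (312.5 MHz) and 3125000 kHz (3.125 Gbaud).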

For Intel mGbE, the 2.5 times overclocking needed to support 2.5G can
only be configured in the BIOS during boot time. The kernel driver has
no access to modify the clock rate for 1G/2.5G mode. The way to
determine the current 1G/2.5G mode is by reading a dedicated adhoc
register through the mdio bus. In short, after the system boots up, it is
either in 1G mode or 2.5G mode, which cannot be changed on the fly.

Compared to 1G mode, the 2.5G mode selects 2500BASEX as the PHY interface
and disables the xpcs_an_inband. This is to cater for some PHYs that only
support the 2500BASEX PHY interface with no autonegotiation.

v2: remove MAC supported link speed masking
v3: Restructure to introduce intel_speed_mode_2500() to read serdes registers
for max speed supported and select the appropriate configuration.
Use max_speed to determine the supported link speed mask.
Signed-off-by: Voon Weifeng <weifeng.voon@intel.com>
Signed-off-by: Michael Sit Wei Hong <michael.wei.hong.sit@intel.com>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: stmmac: split xPCS setup from mdio register · 597a68ce

Committed by Voon Weifeng on Jun 08, 2021

This patch is a preparation patch for the enabling of Intel mGbE 2.5Gbps
link speed. The Intel mGbE link speed configuration (1G/2.5G) depends on
an mdio ADHOC register which can be configured in the BIOS menu.
As the PHY interface might be different for 1G and 2.5G, the mdio bus needs
to be ready to check the link speed and select the PHY interface before
probing the xPCS.
Signed-off-by: Voon Weifeng <weifeng.voon@intel.com>
Signed-off-by: Michael Sit Wei Hong <michael.wei.hong.sit@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

08 Jun, 2021 2 commits

mvneta: recycle buffers · e4017570

Committed by Matteo Croce on Jun 07, 2021

Use the new recycling API for page_pool.
In a drop rate test, the packet rate increased by 10%,
from 296 Kpps to 326 Kpps.

perf top on a stock system shows:

Overhead  Shared Object     Symbol
  23.66%  [kernel]          [k] __pi___inval_dcache_area
  22.85%  [mvneta]          [k] mvneta_rx_swbm
   7.54%  [kernel]          [k] kmem_cache_alloc
   6.49%  [kernel]          [k] eth_type_trans
   3.94%  [kernel]          [k] dev_gro_receive
   3.91%  [kernel]          [k] __netif_receive_skb_core
   3.91%  [kernel]          [k] kmem_cache_free
   3.76%  [kernel]          [k] page_pool_release_page
   3.56%  [kernel]          [k] free_unref_page
   2.40%  [kernel]          [k] build_skb
   1.49%  [kernel]          [k] skb_release_data
   1.45%  [kernel]          [k] __alloc_pages_bulk
   1.30%  [kernel]          [k] page_frag_free

And this is the same output with recycling enabled:

Overhead  Shared Object     Symbol
  26.41%  [kernel]          [k] __pi___inval_dcache_area
  25.00%  [mvneta]          [k] mvneta_rx_swbm
   8.14%  [kernel]          [k] kmem_cache_alloc
   6.84%  [kernel]          [k] eth_type_trans
   4.44%  [kernel]          [k] __netif_receive_skb_core
   4.38%  [kernel]          [k] kmem_cache_free
   4.16%  [kernel]          [k] dev_gro_receive
   3.21%  [kernel]          [k] page_pool_put_page
   2.41%  [kernel]          [k] build_skb
   1.82%  [kernel]          [k] skb_release_data
   1.61%  [kernel]          [k] napi_gro_receive
   1.25%  [kernel]          [k] page_pool_refill_alloc_cache
   1.16%  [kernel]          [k] __netif_receive_skb_list_core

We can see that page_pool_release_page(), free_unref_page() and
__alloc_pages_bulk() are no longer on top of the list when receiving
traffic.

The test was done with mausezahn on the TX side with 64 byte raw
ethernet frames.
Signed-off-by: Matteo Croce <mcroce@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

mvpp2: recycle buffers · 133637fc

Committed by Matteo Croce on Jun 07, 2021

Use the new recycling API for page_pool.
In a drop rate test, the packet rate is almost doubled,
from 1110 Kpps to 2128 Kpps.
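
To put a number on "almost doubled", the relative gain can be computed as in the small helper below. The function name is illustrative, and the result is expressed in tenths of a percent to stay in integer arithmetic.

```c
/* Relative gain in tenths of a percent, using integer arithmetic.
 * Assumes before != 0 and after >= before. */
unsigned long gain_permille(unsigned long before, unsigned long after)
{
	return (after - before) * 1000UL / before;
}
```

For the rates quoted here, going from 1110 Kpps to 2128 Kpps is a gain of about 91.7%.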

perf top on a stock system shows:

Overhead  Shared Object     Symbol
  34.88%  [kernel]          [k] page_pool_release_page
   8.06%  [kernel]          [k] free_unref_page
   6.42%  [mvpp2]           [k] mvpp2_rx
   6.07%  [kernel]          [k] eth_type_trans
   5.18%  [kernel]          [k] __netif_receive_skb_core
   4.95%  [kernel]          [k] build_skb
   4.88%  [kernel]          [k] kmem_cache_free
   3.97%  [kernel]          [k] kmem_cache_alloc
   3.45%  [kernel]          [k] dev_gro_receive
   2.73%  [kernel]          [k] page_frag_free
   2.07%  [kernel]          [k] __alloc_pages_bulk
   1.99%  [kernel]          [k] arch_local_irq_save
   1.84%  [kernel]          [k] skb_release_data
   1.20%  [kernel]          [k] netif_receive_skb_list_internal

With packet rate stable at 1100 Kpps:

tx: 0 bps 0 pps rx: 532.7 Mbps 1110 Kpps
tx: 0 bps 0 pps rx: 532.6 Mbps 1110 Kpps
tx: 0 bps 0 pps rx: 532.4 Mbps 1109 Kpps
tx: 0 bps 0 pps rx: 532.1 Mbps 1109 Kpps
tx: 0 bps 0 pps rx: 531.9 Mbps 1108 Kpps
tx: 0 bps 0 pps rx: 531.9 Mbps 1108 Kpps

And this is the same output with recycling enabled:

Overhead  Shared Object     Symbol
  12.91%  [kernel]          [k] eth_type_trans
  12.54%  [mvpp2]           [k] mvpp2_rx
   9.67%  [kernel]          [k] build_skb
   9.63%  [kernel]          [k] __netif_receive_skb_core
   8.44%  [kernel]          [k] page_pool_put_page
   8.07%  [kernel]          [k] kmem_cache_free
   7.79%  [kernel]          [k] kmem_cache_alloc
   6.86%  [kernel]          [k] dev_gro_receive
   3.19%  [kernel]          [k] skb_release_data
   2.41%  [kernel]          [k] netif_receive_skb_list_internal
   2.18%  [kernel]          [k] page_pool_refill_alloc_cache
   1.76%  [kernel]          [k] napi_gro_receive
   1.61%  [kernel]          [k] kfree_skb
   1.20%  [kernel]          [k] dma_sync_single_for_device
   1.16%  [mvpp2]           [k] mvpp2_poll
   1.12%  [mvpp2]           [k] mvpp2_read

With packet rate above 2100 Kpps:

tx: 0 bps 0 pps rx: 1021 Mbps 2128 Kpps
tx: 0 bps 0 pps rx: 1021 Mbps 2127 Kpps
tx: 0 bps 0 pps rx: 1021 Mbps 2128 Kpps
tx: 0 bps 0 pps rx: 1021 Mbps 2128 Kpps
tx: 0 bps 0 pps rx: 1022 Mbps 2128 Kpps
tx: 0 bps 0 pps rx: 1022 Mbps 2129 Kpps

The major performance increase is explained by the fact that the most CPU
consuming functions (page_pool_release_page, page_frag_free and
free_unref_page) are no longer called on a per packet basis.

The test was done by sending to the macchiatobin 64 byte ethernet frames
with an invalid ethertype, so the packets are dropped early in the RX path.
Signed-off-by: Matteo Croce <mcroce@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

openeuler / Kernel synced successfully almost 2 years ago