Commits · 205a55f4e65353dd4846547d376a6f85cdda3d04 · openeuler / Kernel

Jul 24, 2020 · 1 commit

sfc: convert to new udp_tunnel infrastructure · 205a55f4

Committed by Jakub Kicinski on Jul 22, 2020

Check MC_CMD_DRV_ATTACH_EXT_OUT_FLAG_TRUSTED, before setting
the info, which will hopefully protect us from -EPERM errors
the previous code was gracefully ignoring. Ed reports this
is not the 100% correct bit, but it's the best approximation
we have. Shared code reports the port information back to user
space, so we really want to know what was added and what failed.
Ignoring -EPERM is not an option.

The driver does not call udp_tunnel_get_rx_info(), so its own
management of table state is not really all that problematic,
we can leave it be. This allows the driver to continue with its
copious table syncing, and matching the ports to TX frames,
which it will reportedly do one day.

Leave the feature checking in the callbacks, as the device may
remove the capabilities on reset.

Inline the loop from __efx_ef10_udp_tnl_lookup_port() into
efx_ef10_udp_tnl_has_port(), since it's the only caller now.

With new infra this driver gains port replace - when space frees
up in a full table a new port will be selected for offload.
Plus efx will no longer sleep in an atomic context.

v2:
 - amend the commit message about TRUSTED not being 100%
 - add TUNNEL_ENCAP_UDP_PORT_ENTRY_INVALID to mark unused
   entries

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Acked-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Jul 23, 2020 · 20 commits

qede: add .ndo_xdp_xmit() and XDP_REDIRECT support · d1b25b79

Committed by Alexander Lobakin on Jul 23, 2020

Add XDP_REDIRECT case handling and the corresponding NDO to support
redirecting XDP frames. This also includes registering the driver memory
model (currently order-0 page mode) with the BPF subsystem.
The total number of XDP queues is usually 1:1 with the Rx ones.

Signed-off-by: Alexander Lobakin <alobakin@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: Michal Kalderon <michal.kalderon@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

qede: refactor XDP Tx processing · 4c2bacbe

Committed by Alexander Lobakin on Jul 23, 2020

Current XDP Tx logic is suboptimal and can't be reused for XDP_REDIRECT
path.
Make qede_xdp_{tx_int,xmit}() more universal and efficient in general to
allow future expansion.

Misc: use unlikely() hints where appropriate and replace "fallthrough"
comments with pseudo-keywords.

Signed-off-by: Alexander Lobakin <alobakin@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: Michal Kalderon <michal.kalderon@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

qede: reformat net_device_ops declarations · f285ad57

Committed by Alexander Lobakin on Jul 23, 2020

Correct the indentation of net_device_ops declarations for a cleaner look.

Signed-off-by: Alexander Lobakin <alobakin@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: Michal Kalderon <michal.kalderon@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

qede: reformat several structures in "qede.h" · f35535f7

Committed by Alexander Lobakin on Jul 23, 2020

Make the file more readable and easier to extend with new fields.

Misc: use IFNAMSIZ and netdev_name() instead of sizeof_field()
and direct net_device::name dereferencing.

Signed-off-by: Alexander Lobakin <alobakin@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: Michal Kalderon <michal.kalderon@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

qed: add support for different page sizes for chains · 15506586

Committed by Alexander Lobakin on Jul 23, 2020

Extend the current infrastructure to store the chain page size in a struct
and use it in all functions instead of the fixed QED_CHAIN_PAGE_SIZE.
Its value remains the default one, but can be overridden in
qed_chain_init_params before chain allocation.

Signed-off-by: Alexander Lobakin <alobakin@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: Michal Kalderon <michal.kalderon@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

qed: simplify chain allocation with init params struct · b6db3f71

Committed by Alexander Lobakin on Jul 23, 2020

To simplify the qed_chain_alloc() prototype and call sites, introduce struct
qed_chain_init_params to specify chain params, and pass a pointer to
the filled struct to the actual qed_chain_alloc() instead of a long list
of separate arguments.

Signed-off-by: Alexander Lobakin <alobakin@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: Michal Kalderon <michal.kalderon@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
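
The parameter-struct pattern described in this commit can be sketched in userspace C roughly as follows. This is only an illustration: struct chain, struct chain_init_params, chain_alloc() and DEFAULT_PAGE_SIZE are hypothetical stand-ins, not the qed definitions, and calloc() stands in for the driver's DMA allocation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define DEFAULT_PAGE_SIZE 4096u /* assumed default, for illustration only */

/* Hypothetical parameter block: one struct instead of a long argument list. */
struct chain_init_params {
	uint32_t num_elems; /* number of elements in the chain */
	size_t elem_size;   /* size of one element in bytes */
	size_t page_size;   /* 0 selects DEFAULT_PAGE_SIZE */
};

struct chain {
	void *mem;
	uint32_t num_elems;
	size_t elem_size;
	size_t page_size;
};

/* Returns 0 on success, -1 on invalid parameters or allocation failure. */
int chain_alloc(struct chain *chain, const struct chain_init_params *params)
{
	assert(chain != NULL && params != NULL);

	if (!params->num_elems || !params->elem_size)
		return -1;

	/* The page size keeps its default unless the caller overrides it. */
	chain->page_size = params->page_size ? params->page_size
					     : DEFAULT_PAGE_SIZE;
	chain->num_elems = params->num_elems;
	chain->elem_size = params->elem_size;
	chain->mem = calloc(params->num_elems, params->elem_size);

	return chain->mem ? 0 : -1;
}
```

A caller fills only the fields it cares about, for example `struct chain_init_params p = { .num_elems = 8, .elem_size = 16 };`, and new optional fields can be added later without touching existing call sites.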

qed: simplify initialization of the chains with an external PBL · c3a321b0

Committed by Alexander Lobakin on Jul 23, 2020

Fill the PBL table parameters for chains with external PBL data earlier, in
qed_chain_init_params(), rather than during allocation itself. This simplifies
the allocation code and allows struct ext_pbl to be extended for other chain
types.

Signed-off-by: Alexander Lobakin <alobakin@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: Michal Kalderon <michal.kalderon@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

qed: move chain initialization inlines next to allocation functions · 5e776d80

Committed by Alexander Lobakin on Jul 23, 2020

qed_chain_init*() are used in only one place, on a "cold" path, so they
can be uninlined and moved next to the call sites.

Signed-off-by: Alexander Lobakin <alobakin@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: Michal Kalderon <michal.kalderon@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

qed: sanitize PBL chains allocation · 9b6ee3cf

Committed by Alexander Lobakin on Jul 23, 2020

PBL chain elements are actually DMA addresses stored in __le64, but
currently their size is hardcoded to 8, and DMA addresses are assigned
via cast to variable-sized dma_addr_t without any bitwise conversions.
Change the type of the pbl_virt array to match the actual one, add a new
field to store the size of the allocated DMA memory, and sanitize element
assignment.

Misc: give more logical names to the members of the qed_chain::pbl_sp
embedded struct.

Signed-off-by: Alexander Lobakin <alobakin@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: Michal Kalderon <michal.kalderon@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

qed: prevent possible double-frees of the chains · 96ca4c50

Committed by Alexander Lobakin on Jul 23, 2020

Zero-initialize the chain in qed_chain_free(), so that it cannot be freed
twice and provoke undefined behaviour.

Signed-off-by: Alexander Lobakin <alobakin@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: Michal Kalderon <michal.kalderon@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
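
The idea of zeroing an object on free so that a second free becomes harmless can be sketched in plain C as below. The struct and function names are hypothetical and much simpler than the qed chain code; the sketch also assumes a platform where an all-zero bit pattern is a null pointer, which holds on mainstream targets.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified chain descriptor. */
struct chain {
	void *mem;
	size_t size;
};

/* Returns 0 on success, -1 on allocation failure. */
int chain_alloc(struct chain *chain, size_t size)
{
	assert(chain != NULL);

	chain->mem = calloc(1, size);
	if (!chain->mem)
		return -1;

	chain->size = size;
	return 0;
}

void chain_free(struct chain *chain)
{
	assert(chain != NULL);

	free(chain->mem);                 /* free(NULL) is a no-op */
	memset(chain, 0, sizeof(*chain)); /* a repeated call now sees mem == NULL */
}
```

After the first chain_free() the descriptor holds no stale pointer, so calling chain_free() again only passes NULL to free().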

qed: move chain methods to a separate file · a08c9b2c

Committed by Alexander Lobakin on Jul 23, 2020

Move the chain allocation/freeing functions to a new file so that they are
not mixed with hardware-related code.

Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Alexander Lobakin <alobakin@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: Michal Kalderon <michal.kalderon@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

qed: reformat Makefile · bdaf98f6

Committed by Alexander Lobakin on Jul 23, 2020

List one entry per line and sort them alphabetically to simplify the
addition of new ones.

Signed-off-by: Alexander Lobakin <alobakin@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: Michal Kalderon <michal.kalderon@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: qed_hsi.h: Avoid the use of one-element array · f1fa27f5

Committed by Gustavo A. R. Silva on Jul 22, 2020

One-element arrays are being deprecated[1]. Replace the one-element
array with a simple value type '__le32 reserved1'[2], since it seems
this is just a placeholder for alignment.

[1] https://github.com/KSPP/linux/issues/79
[2] https://github.com/KSPP/linux/issues/86

Tested-by: kernel test robot <lkp@intel.com>
Link: https://github.com/GustavoARSilva/linux-hardening/blob/master/cii/0-day/qed_hsi-20200718.md
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
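
As a rough illustration of why this kind of replacement is layout-neutral, the sketch below compares a struct that pads with a one-element array against one that pads with a plain scalar. The struct names and fields are invented for the example and do not come from qed_hsi.h.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Before: a one-element array used only as a placeholder. */
struct msg_old {
	uint8_t opcode;
	uint8_t rsvd[1];
	uint16_t len;
};

/* After: the same placeholder expressed as a simple value type. */
struct msg_new {
	uint8_t opcode;
	uint8_t rsvd;
	uint16_t len;
};

/* Compile-time checks that size and member offsets did not change. */
_Static_assert(sizeof(struct msg_old) == sizeof(struct msg_new),
	       "struct size changed");
_Static_assert(offsetof(struct msg_old, len) == offsetof(struct msg_new, len),
	       "member offset changed");
```

Because the placeholder occupies the same bytes either way, code that never indexes the reserved member is unaffected by the change.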

bna: bfi.h: Avoid the use of one-element array · 6fcf9aff

Committed by Gustavo A. R. Silva on Jul 22, 2020

One-element arrays are being deprecated[1]. Replace the one-element
array with a simple value type 'u8 rsvd'[2], since it seems this is
just a placeholder for alignment.

[1] https://github.com/KSPP/linux/issues/79
[2] https://github.com/KSPP/linux/issues/86

Tested-by: kernel test robot <lkp@intel.com>
Link: https://github.com/GustavoARSilva/linux-hardening/blob/master/cii/0-day/bfi-20200718.md
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

tg3: Avoid the use of one-element array · 7ec3e95e

Committed by Gustavo A. R. Silva on Jul 22, 2020

One-element arrays are being deprecated[1]. Replace the one-element
array with a simple value type 'u32 reserved2'[2], since it seems
this is just a placeholder for alignment.

[1] https://github.com/KSPP/linux/issues/79
[2] https://github.com/KSPP/linux/issues/86

Tested-by: kernel test robot <lkp@intel.com>
Link: https://github.com/GustavoARSilva/linux-hardening/blob/master/cii/0-day/tg3-20200718.md
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Reviewed-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ionic: fix memory leak of object 'lid' · 4b1debbe

Committed by Colin Ian King on Jul 22, 2020

Currently, when the netdev fails to allocate, the error return path
fails to free the allocated object 'lid'. Fix this by setting
err to the return error code and jumping to a new label that
performs the kfree of lid before returning.

Addresses-Coverity: ("Resource leak")
Fixes: 4b03b273 ("ionic: get MTU from lif identity")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Acked-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
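
The error-path shape described above (set err, jump to a label that frees the earlier allocation) looks roughly like the following userspace sketch. All names are hypothetical, the fail_netdev argument merely simulates the second allocation failing, and free() stands in for kfree().

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical, simplified types for illustration. */
struct lif_identity {
	unsigned int max_frame_size;
};

struct device {
	struct lif_identity *lid;
};

/* Returns 0 on success or a negative errno value on failure. */
int device_setup(struct device *dev, int fail_netdev)
{
	struct lif_identity *lid;
	int err;

	assert(dev != NULL);

	lid = calloc(1, sizeof(*lid));
	if (!lid)
		return -ENOMEM;

	if (fail_netdev) {
		/* Record the error and unwind instead of returning directly. */
		err = -ENOMEM;
		goto err_out_free_lid;
	}

	dev->lid = lid; /* ownership passes to dev on success */
	return 0;

err_out_free_lid:
	free(lid);
	return err;
}
```

Routing every failure after the first allocation through the label keeps the cleanup in one place, which is why a direct return in the middle of such a function tends to leak.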

lan743x: remove redundant initialization of variable current_head_index · bb809a04

Committed by Colin Ian King on Jul 22, 2020

The variable current_head_index is being initialized with a value that
is never read and it is being updated later with a new value.  Replace
the initialization of -1 with the latter assignment.

Addresses-Coverity: ("Unused value")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

enetc: Remove the imdio bus on PF probe bailout · c6dd6488

Committed by Claudiu Manoil on Jul 22, 2020

enetc_imdio_remove() is missing from the enetc_pf_probe()
bailout path. Not surprisingly, because enetc_setup_serdes()
is registering the imdio bus for internal purposes, and it's
not obvious that enetc_imdio_remove() currently performs the
teardown of enetc_setup_serdes().
To fix this, define enetc_teardown_serdes() to wrap
enetc_imdio_remove() (to improve maintainability) and call it
on the bailout and remove paths.

Fixes: 975d183e ("net: enetc: Initialize SerDes for SGMII and USXGMII protocols")
Signed-off-by: Claudiu Manoil <claudiu.manoil@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: qed: Remove unneeded cast from memory allocation · 7979a7d2

Committed by Wang Hai on Jul 22, 2020

Remove the cast of the value returned by the memory allocation function.

Coccinelle emits WARNING: casting value returned by memory allocation
function to (struct roce_destroy_qp_req_output_params *) is useless.

This issue was detected by using the Coccinelle software.

Signed-off-by: Wang Hai <wanghai38@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: mscc: ocelot: fix non-initialized CPU port on VSC7514 · 8bb849d6

Committed by Vladimir Oltean on Jul 22, 2020

The VSC7514 is marketed as a 10-port switch, however it has 11 physical
ports (0->10) in the block diagram:
https://www.microsemi.com/product-directory/ethernet-switches/3992-vsc7514
(also in the device tree at arch/mips/boot/dts/mscc/ocelot.dtsi)

Additionally, by architecture it has one more entry in the analyzer
block, situated right after the physical ports, for the CPU port module.
This is not a physical port, it only represents a channel for frame
injection and extraction. That entry for the CPU port is at index 11 in
the analyzer.

When the register groups for QSYS_SWITCH_PORT_MODE, SYS_PORT_MODE and
SYS_PAUSE_CFG are declared to be replicated 11 times, the entry at
index 11 in the array of regfields is not initialized, so the CPU port
module is not initialized either.

The documentation of QSYS_SWITCH_PORT_MODE for VSC7514 also says that
this register group is replicated 12 times, so this patch is simply
reflecting that and not introducing any further inconsistency.

Fixes: 886e1387 ("net: mscc: ocelot: convert QSYS_SWITCH_PORT_MODE and SYS_PORT_MODE to regfields")
Fixes: 541132f0 ("net: mscc: ocelot: convert SYS_PAUSE_CFG register access to regfield")
Reported-by: Bryan Whitehead <bryan.whitehead@microchip.com>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Jul 22, 2020 · 19 commits

ionic: interface file updates · 1b897e7d

Committed by Shannon Nelson on Jul 21, 2020

Add some new interface values and update a few more descriptions.

Signed-off-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: David S. Miller <davem@davemloft.net>

ionic: rearrange reset and bus-master control · 6a6014e2

Committed by Shannon Nelson on Jul 21, 2020

We can prevent potential incorrect DMA access attempts from the
NIC by enabling bus-master after the reset, and by disabling
bus-master earlier in cleanup.

Signed-off-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: David S. Miller <davem@davemloft.net>

ionic: update eid test for overflow · 3fbc9bb6

Committed by Shannon Nelson on Jul 21, 2020

Fix up our comparison to better handle a potential (but highly
unlikely) wrap-around.

Signed-off-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
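
A common way to make a "newer than" comparison on a monotonically increasing counter survive wrap-around is to compare the signed difference rather than the raw values. The sketch below shows that general technique; the function name is hypothetical and the commit message does not say that the driver uses exactly this form.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Returns true if event id `eid` is newer than `last_eid`, even if the
 * 64-bit counter has wrapped in between. The unsigned subtraction wraps
 * modulo 2^64; reinterpreting the result as signed gives the distance.
 * The conversion relies on two's complement behaviour, which GCC and
 * Clang provide.
 */
bool eid_is_newer(uint64_t eid, uint64_t last_eid)
{
	return (int64_t)(eid - last_eid) > 0;
}
```

With a plain `eid > last_eid` test, an id of 2 arriving after UINT64_MAX would be treated as old; the signed difference treats it as three steps ahead.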

ionic: remove unused ionic_coal_hw_to_usec · 4471b1c1

Committed by Shannon Nelson on Jul 21, 2020

Clean up some unused code.

Signed-off-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: David S. Miller <davem@davemloft.net>

ionic: set netdev default name · c8768e73

Committed by Shannon Nelson on Jul 21, 2020

If the host system's udev fails to set a new name for the
network port, there is no NETDEV_CHANGENAME event to trigger
the driver to send the name down to the firmware.  It is safe
to set the lif name multiple times, so we add a call early on
to set the default netdev name to be sure the FW has something
to use in its internal debug logging.  Then when udev gets
around to changing it we can update it to the actual name the
system will be using.

Signed-off-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: David S. Miller <davem@davemloft.net>

ionic: get MTU from lif identity · 4b03b273

Committed by Shannon Nelson on Jul 21, 2020

Stop using hardcoded MTU limits and use the firmware-defined
limits instead. The value from the LIF attributes is
the frame size, so we take off the header size to convert to
MTU size.

Signed-off-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
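
The frame-size-to-MTU conversion mentioned above is simple arithmetic, sketched below. Which headers the driver actually subtracts is not stated in the message; the sketch assumes a 14-byte Ethernet header plus one 4-byte VLAN tag, and the macro and function names are made up for the example.

```c
#include <assert.h>
#include <stdint.h>

#define ETH_HDR_LEN  14u /* destination MAC + source MAC + EtherType */
#define VLAN_HDR_LEN 4u  /* assumption: one 802.1Q tag counts as header */

/* Convert a firmware-reported frame size into an MTU value. */
uint32_t frame_size_to_mtu(uint32_t frame_size)
{
	/* A frame no larger than its headers cannot carry a payload. */
	assert(frame_size > ETH_HDR_LEN + VLAN_HDR_LEN);

	return frame_size - ETH_HDR_LEN - VLAN_HDR_LEN;
}
```

Under these assumptions a reported frame size of 1518 bytes corresponds to the familiar 1500-byte MTU.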

dpaa2-eth: add support for TBF offload · 3657cdaf

Committed by Ioana Ciornei on Jul 21, 2020

React to TC_SETUP_QDISC_TBF and configure the egress shaper as
appropriate with the maximum rate and burst size requested by the user.
TBF can only be offloaded on DPAA2 when it's the root qdisc, i.e. it's a
per-port shaper.

Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

dpaa2-eth: add API for Tx shaping · 39344a89

Committed by Ioana Ciornei on Jul 21, 2020

Add the necessary API (dpni_set_tx_shaping) for configuring the rate and
burst size of a per-port shaper in DPAA2.

Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

dpaa2-eth: move the mqprio setup into a separate function · e3ec13be

Committed by Ioana Ciornei on Jul 21, 2020

Move the setup done for MQPRIO into a separate function so that
with the addition of another offload we do not crowd
dpaa2_eth_setup_tc(). After this restructuring it's easier to see what
is supported in terms of Qdisc offloading.

Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

r8169: allow to enable ASPM on RTL8125A · 3fc364c0

Committed by Heiner Kallweit on Jul 21, 2020

For most chip versions this has been added already. Allow enabling
ASPM on RTL8125A as well.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: ena: support new LLQ acceleration mode · 0e3a3f6d

Committed by Arthur Kiyanovski on Jul 21, 2020

New devices add a new hardware acceleration engine, which adds some
restrictions to the driver.
Metadata descriptor must be present for each packet and the maximum
burst size between two doorbells is now limited to a number
advertised by the device.

This patch adds:
1. A handshake protocol between the driver and the device, so the
device will enable the accelerated queues only when both sides
support it.

2. The driver support for the new acceleration engine:
2.1. Send metadata descriptor for each Tx packet.
2.2. Limit the number of packets sent between doorbells.(*)

(*) A previous driver implementation of this feature was committed in
commit 05d62ca2 ("net: ena: add handling of llq max tx burst size"),
however the design of the interface between the driver and device
has changed since then. This change is reflected in this commit.

Signed-off-by: Netanel Belgazal <netanel@amazon.com>
Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: ena: move llq configuration from ena_probe to ena_device_init() · c29efeae

Committed by Arthur Kiyanovski on Jul 21, 2020

When the ENA device resets to recover from some error state, all LLQ
configuration values are reset to their defaults, because LLQ is
initialized only once during ena_probe().

Changes in this commit:
1. Move the LLQ configuration process into ena_device_init()
which is called from both ena_probe() and ena_restore_device(). This
way, LLQ setup configurations that are different from the default
values will survive resets.

2. Extract the LLQ bar mapping to ena_map_llq_bar(),
and call once in the lifetime of the driver from ena_probe(),
since there is no need to unmap and map the LLQ bar again every reset.

3. Map the LLQ bar if it exists, regardless of whether initialization
of the LLQ placement policy (ENA_ADMIN_PLACEMENT_POLICY_DEV) succeeded
or not. Initialization might fail the first time, falling back to the
ENA_ADMIN_PLACEMENT_POLICY_HOST placement policy, but later succeed
after device reset, in which case the LLQ bar needs to be mapped
already.

Signed-off-by: Sameeh Jubran <sameehj@amazon.com>
Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: ena: enable support of rss hash key and function changes · 0ee60edf

Committed by Arthur Kiyanovski on Jul 21, 2020

Add the rss_configurable_function_key bit to driver_supported_feature.

This bit tells the device that the driver in question supports
retrieving and updating the RSS function and hash key, and therefore
the device should allow RSS function and key manipulation.

This commit turns on device support for hash key and RSS function
management. Without this commit this feature is turned off at the
device and appears to the user as unsupported.

This commit concludes the following series of already merged commits:
commit 0af3c4e2 ("net: ena: changes to RSS hash key allocation")
commit c1bd17e5 ("net: ena: change default RSS hash function to Toeplitz")
commit f66c2ea3 ("net: ena: allow setting the hash function without changing the key")
commit e9a1de37 ("net: ena: fix error returning in ena_com_get_hash_function()")
commit 80f8443f ("net: ena: avoid unnecessary admin command when RSS function set fails")
commit 6a4f7dc8 ("net: ena: rss: do not allocate key when not supported")
commit 0d1c3de7 ("net: ena: fix incorrect default RSS key")

The above commits represent the last part of the implementation of
this feature, and with them merged the feature can be enabled
in the device.

Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: ena: add support for traffic mirroring · 0f505c60

Committed by Arthur Kiyanovski on Jul 21, 2020

Add support for traffic mirroring, where the hardware reads the
buffer from the instance memory directly.

Traffic Mirroring needs access to the rx buffers in the instance.
To have this access, this patch:
1. Changes the code to map and unmap the rx buffers bidirectionally.
2. Enables the relevant bit in driver_supported_features to indicate
   to the FW that this driver supports traffic mirroring.

Rx completion is not generated until mirroring is done to avoid
the situation where the driver changes the buffer before it is
mirrored.

Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: ena: cosmetic: change ena_com_stats_admin stats to u64 · 0dcec686

Committed by Arthur Kiyanovski on Jul 21, 2020

The size of the admin statistics in ena_com_stats_admin is changed
from 32 bits to 64 bits to align with the sizes of the other statistics
in the driver (i.e. rx_stats, tx_stats and ena_stats_dev).

This is done as part of an effort to create a unified API to read
statistics.

Signed-off-by: Shay Agroskin <shayagr@amazon.com>
Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: ena: cosmetic: satisfy gcc warning · 79890d3f

Committed by Arthur Kiyanovski on Jul 21, 2020

gcc 4.8 reports a warning when initializing with = {0}.
Dropping the "0" from the braces fixes the issue.
This fix is not ANSI compatible but is allowed by gcc.

Signed-off-by: Sameeh Jubran <sameehj@amazon.com>
Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: ena: add reserved PCI device ID · 866032ab

Committed by Arthur Kiyanovski on Jul 21, 2020

Add a reserved PCI device ID to the driver's table.
It is used for internal testing purposes.

Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: ena: avoid unnecessary rearming of interrupt vector when busy-polling · 1e5ae350

Committed by Arthur Kiyanovski on Jul 21, 2020

For an overview of the race created by this patch, go to the
"synchronization" label.

In napi busy-poll mode, the kernel invokes the napi handler of the
device repeatedly to poll the NIC's receive queues. This process
repeats until a timeout, specific for each connection, is up.
By polling packets in busy-poll mode the user may gain lower latency
and higher throughput (since the kernel no longer waits for interrupts
to poll the queues) at the expense of CPU usage.

Upon completing a napi routine, the driver checks whether
the routine was called by an interrupt handler. If so, the driver
re-enables interrupts for the device. This is needed since an
interrupt routine invocation disables future invocations until
explicitly re-enabled.

The driver avoids re-enabling the interrupts if they were not disabled
in the first place (e.g. if the driver is in busy-poll mode).
Originally, the driver checked whether interrupt re-enabling is needed
by reading the 'ena_napi->unmask_interrupt' variable. This atomic
variable was set upon interrupt and cleared after re-enabling it.

In the 4.10 Linux version, the 'napi_complete_done' call was changed
so that it returns 'false' when the device should not re-enable
interrupts, and 'true' otherwise. The change includes reading the
"NAPIF_STATE_IN_BUSY_POLL" flag to check if the napi call is in
busy-poll mode, and if so, return 'false'.
The driver was changed to re-enable interrupts according to this
routine's return value.
The Linux community rejected the use of the
'ena_napi->unmask_interrupt' variable to determine whether
unmasking is needed, and urged using the napi_complete_done()
return value only.
See https://lore.kernel.org/patchwork/patch/741149/ for more details.

As explained, a busy-poll session exists for a specified timeout
value, after which it exits the busy-poll mode and re-enters it later.
This leads to many invocations of the napi handler where
napi_complete_done() falsely indicates that interrupts should be
re-enabled.
This creates a bug in which the interrupts are re-enabled
unnecessarily.
To reproduce this bug:
    1) echo 50 | sudo tee /proc/sys/net/core/busy_poll
    2) echo 50 | sudo tee /proc/sys/net/core/busy_read
    3) Add counters that check whether
    'ena_unmask_interrupt(tx_ring, rx_ring);'
    is called without the interrupts having been disabled in the
    first place (i.e. without the interrupt routine
    ena_intr_msix_io() having been called)

Steps 1+2 enable busy-poll as the default mode for new connections.

The busy poll routine rearms the interrupts after every session by
design, and so we need to add an extra check that the interrupts were
masked in the first place.

synchronization:
This patch introduces a race between the interrupt handler
ena_intr_msix_io() and the napi routine ena_io_poll().
Some macros and instructions were added to prevent this race from leaving
the interrupts masked. The following specifies the different race
scenarios in this patch:

1) interrupt handler and napi routine run sequentially
    i) interrupt handler is called, sets 'interrupts_masked' flag and
	successfully schedules the napi handler via softirq.

    In this scenario the napi routine might not see the flag change
    for several reasons:
	a) The flag is stored in a register by the compiler. For this
	case the WRITE_ONCE macro is used, which prevents this.
	b) The compiler might reorder the instructions. For this the
	smp_wmb() instruction was used, which implies a compiler memory
	barrier.
	c) On archs with weak consistency model (like ARM64) the napi
	routine might be scheduled and start running before the flag
	STORE instruction is committed to cache/memory. To ensure this
	doesn't happen, the smp_wmb() instruction was added. It ensures
	that the flag set instruction is committed before scheduling
	napi.

    ii) compiler reorders the flag's value check in the 'if' with
    the flag set in the napi routine.

    This scenario is prevented by smp_rmb() call after the flag check.

2) interrupt handler and napi routine run in parallel (can happen when
busy poll routine invokes the napi handler)

    i) interrupt handler sets the flag in one core, while the napi
    routine reads it in another core.

    This scenario also is divided into two cases:
	a) napi_complete_done() doesn't finish running, in which case
	napi_sched() would just set NAPIF_STATE_MISSED and the napi
	routine would reschedule itself without changing the flag's value.

	b) napi_complete_done() finishes running. In this case the
	napi routine might override the flag's value.
	This doesn't present any risk since it later unmasks the
	interrupt vector.

Signed-off-by: Shay Agroskin <shayagr@amazon.com>
Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
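
The flag handshake described in this message can be approximated in userspace with C11 atomics, as sketched below. This is not the driver code: the struct, the function names and the counters are hypothetical, and release/acquire atomics are used here as a stand-in for the kernel's WRITE_ONCE(), smp_wmb() and smp_rmb().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-queue context. */
struct napi_ctx {
	atomic_bool interrupts_masked; /* set by the interrupt handler */
	int sched_count;               /* stand-in for scheduling napi */
	int unmask_count;              /* stand-in for unmasking the vector */
};

void napi_ctx_init(struct napi_ctx *ctx)
{
	assert(ctx != NULL);
	atomic_init(&ctx->interrupts_masked, false);
	ctx->sched_count = 0;
	ctx->unmask_count = 0;
}

/* Interrupt handler: publish the flag, then schedule the poll routine. */
void irq_handler(struct napi_ctx *ctx)
{
	assert(ctx != NULL);
	/* Release ordering makes the flag visible before polling is scheduled. */
	atomic_store_explicit(&ctx->interrupts_masked, true,
			      memory_order_release);
	ctx->sched_count++;
}

/*
 * Poll routine: `poll_done` plays the role of the napi_complete_done()
 * return value. The vector is unmasked only if an interrupt masked it.
 */
void poll_routine(struct napi_ctx *ctx, bool poll_done)
{
	assert(ctx != NULL);

	if (!poll_done)
		return;

	if (atomic_load_explicit(&ctx->interrupts_masked,
				 memory_order_acquire)) {
		atomic_store_explicit(&ctx->interrupts_masked, false,
				      memory_order_relaxed);
		ctx->unmask_count++;
	}
}
```

In this sketch a poll that was started by busy-polling, with no interrupt in between, leaves unmask_count untouched, which is the unnecessary rearm the patch sets out to avoid.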

qed: Fix ILT and XRCD bitmap memory leaks · d4eae993

Committed by Yuval Basson on Jul 21, 2020

- Free ILT lines used for XRC-SRQ's contexts.
- Free XRCD bitmap

Fixes: b8204ad8 ("qed: changes to ILT to support XRC")
Fixes: 7bfb399e ("qed: Add XRC to RoCE")
Signed-off-by: Michal Kalderon <mkalderon@marvell.com>
Signed-off-by: Yuval Basson <ybason@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
