提交 · 5bded8259ee3815a91791462dfb3312480779c3d · openeuler / Kernel

09 10月, 2021 2 次提交

net: dsa: mv88e6xxx: isolate the ATU databases of standalone and bridged ports · 5bded825

由 Vladimir Oltean 提交于 10月 07, 2021

Similar to commit 6087175b ("net: dsa: mt7530: use independent VLAN
learning on VLAN-unaware bridges"), software forwarding between an
unoffloaded LAG port (a bonding interface with an unsupported policy)
and a mv88e6xxx user port directly under a bridge is broken.

We adopt the same strategy, which is to make the standalone ports not
find any ATU entry learned on a bridge port.

Theory: the mv88e6xxx ATU is looked up by FID and MAC address. There are
as many FIDs as VIDs (4096). The FID is derived from the VID when
possible (the VTU maps a VID to a FID), with a fallback to the port
based default FID value when not (802.1Q Mode is disabled on the port,
or the classified VID isn't present in the VTU).

The mv88e6xxx driver makes the following use of FIDs and VIDs:

- the port's DefaultVID (to which untagged & pvid-tagged packets get
  classified) is 0 and is absent from the VTU, so this kind of packets is
  processed in FID 0, the default FID assigned by mv88e6xxx_setup_port.

- every time a bridge VLAN is created, mv88e6xxx_port_vlan_join() ->
  mv88e6xxx_atu_new() associates a FID with that VID which increases
  linearly starting from 1. Like this:

  bridge vlan add dev lan0 vid 100 # FID 1
  bridge vlan add dev lan1 vid 100 # still FID 1
  bridge vlan add dev lan2 vid 1024 # FID 2

The FID allocation made by the driver is sub-optimal for the following
reasons:

(a) A standalone port has a DefaultPVID of 0 and a default FID of 0 too.
    A VLAN-unaware bridged port has a DefaultPVID of 0 and a default FID
    of 0 too. The difference is that the bridged ports may learn ATU
    entries, while the standalone port has the requirement that it must
    not, and must not find them either. Standalone ports must not use
    the same FID as ports belonging to a bridge. All standalone ports
    can use the same FID, since the ATU will never have an entry in
    that FID.

(b) Multiple VLAN-unaware bridges will all use a DefaultPVID of 0 and a
    default FID of 0 on all their ports. The FDBs will not be isolated
    between these bridges. Every VLAN-unaware bridge must use the same
    FID on all its ports, different from the FID of other bridge ports.

(c) Each bridge VLAN uses a unique FID which is useful for Independent
    VLAN Learning, but the same VLAN ID on multiple VLAN-aware bridges
    will result in the same FID being used by mv88e6xxx_atu_new().
    The correct behavior is for VLAN 1 in br0 to have a different FID
    compared to VLAN 1 in br1.

This patch cannot fix all the above. Traditionally the DSA framework did
not care about this, and the reality is that DSA core involvement is
needed for the aforementioned issues to be solved. The only thing we can
solve here is an issue which does not require API changes, and that is
issue (a), aka use a different FID for standalone ports vs ports under
VLAN-unaware bridges.

The first step is deciding what VID and FID to use for standalone ports,
and what VID and FID for bridged ports. The 0/0 pair for standalone
ports is what they used up till now, let's keep using that. For bridged
ports, there are 2 cases:

- VLAN-aware ports will never end up using the port default FID, because
  packets will always be classified to a VID in the VTU or dropped
  otherwise. The FID is the one associated with the VID in the VTU.

- On VLAN-unaware ports, we _could_ leave their DefaultVID (pvid) at
  zero (just as in the case of standalone ports), and just change the
  port's default FID from 0 to a different number (say 1).

However, Tobias points out that there is one more requirement to cater to:
cross-chip bridging. The Marvell DSA header does not carry the FID in
it, only the VID. So once a packet crosses a DSA link, if it has a VID
of zero it will get classified to the default FID of that cascade port.
Relying on a port default FID for upstream cascade ports results in
contradictions: a default FID of 0 breaks ATU isolation of bridged ports
on the downstream switch, a default FID of 1 breaks standalone ports on
the downstream switch.

So not only must standalone ports have different FIDs compared to
bridged ports, they must also have different DefaultVID values.
IEEE 802.1Q defines two reserved VID values: 0 and 4095. So we simply
choose 4095 as the DefaultVID of ports belonging to VLAN-unaware
bridges, and VID 4095 maps to FID 1.

For the xmit operation to look up the same ATU database, we need to put
VID 4095 in DSA tags sent to ports belonging to VLAN-unaware bridges
too. All shared ports are configured to map this VID to the bridging
FID, because they are members of that VLAN in the VTU. Shared ports
don't need to have 802.1QMode enabled in any way, they always parse the
VID from the DSA header, they don't need to look at the 802.1Q header.

We install VID 4095 to the VTU in mv88e6xxx_setup_port(), with the
mention that mv88e6xxx_vtu_setup() which was located right below that
call was flushing the VTU so those entries wouldn't be preserved.
So we need to relocate the VTU flushing prior to the port initialization
during ->setup(). Also note that this is why it is safe to assume that
VID 4095 will get associated with FID 1: the user ports haven't been
created, so there is no avenue for the user to create a bridge VLAN
which could otherwise race with the creation of another FID which would
otherwise use up the non-reserved FID value of 1.

[ Currently mv88e6xxx_port_vlan_join() doesn't have the option of
  specifying a preferred FID, it always calls mv88e6xxx_atu_new(). ]

mv88e6xxx_port_db_load_purge() is the function to access the ATU for
FDB/MDB entries, and it used to determine the FID to use for
VLAN-unaware FDB entries (VID=0) using mv88e6xxx_port_get_fid().
But the driver only called mv88e6xxx_port_set_fid() once, during probe,
so no surprises, the port FID was always 0, the call to get_fid() was
redundant. As much as I would have wanted to not touch that code, the
logic is broken when we add a new FID which is not the port-based
default. Now the port-based default FID only corresponds to standalone
ports, and FDB/MDB entries belong to the bridging service. So while in
the future, when the DSA API will support FDB isolation, we will have to
figure out the FID based on the bridge number, for now there's a single
bridging FID, so hardcode that.

Lastly, the tagger needs to check, when it is transmitting a VLAN
untagged skb, whether it is sending it towards a bridged or a standalone
port. When we see it is bridged we assume the bridge is VLAN-unaware.
Not because it cannot be VLAN-aware but:

- if we are transmitting from a VLAN-aware bridge we are likely doing so
  using TX forwarding offload. That code path guarantees that skbs have
  a vlan hwaccel tag in them, so we would not enter the "else" branch
  of the "if (skb->protocol == htons(ETH_P_8021Q))" condition.

- if we are transmitting on behalf of a VLAN-aware bridge but with no TX
  forwarding offload (no PVT support, out of space in the PVT, whatever),
  we would indeed be transmitting with VLAN 4095 instead of the bridge
  device's pvid. However we would be injecting a "From CPU" frame, and
  the switch won't learn from that - it only learns from "Forward" frames.
  So it is inconsequential for address learning. And VLAN 4095 is
  absolutely enough for the frame to exit the switch, since we never
  remove that VLAN from any port.

Fixes: 57e661aa ("net: dsa: mv88e6xxx: Link aggregation support")
Reported-by: NTobias Waldekranz <tobias@waldekranz.com>
Signed-off-by: NVladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: NJakub Kicinski <kuba@kernel.org>

5bded825

net: dsa: mv88e6xxx: keep the pvid at 0 when VLAN-unaware · 8b6836d8

由 Vladimir Oltean 提交于 10月 07, 2021

The VLAN support in mv88e6xxx has a loaded history. Commit 2ea7a679
("net: dsa: Don't add vlans when vlan filtering is disabled") noticed
some issues with VLAN and decided the best way to deal with them was to
make the DSA core ignore VLANs added by the bridge while VLAN awareness
is turned off. Those issues were never explained, just presented as
"at least one corner case".

That approach had problems of its own, presented by
commit 54a0ed0d ("net: dsa: provide an option for drivers to always
receive bridge VLANs") for the DSA core, followed by
commit 1fb74191 ("net: dsa: mv88e6xxx: fix vlan setup") which
applied ds->configure_vlan_while_not_filtering = true for mv88e6xxx in
particular.

We still don't know what corner case Andrew saw when he wrote
commit 2ea7a679 ("net: dsa: Don't add vlans when vlan filtering is
disabled"), but Tobias now reports that when we use TX forwarding
offload, pinging an external station from the bridge device is broken if
the front-facing DSA user port has flooding turned off. The full
description is in the link below, but for short, when a mv88e6xxx port
is under a VLAN-unaware bridge, it inherits that bridge's pvid.
So packets ingressing a user port will be classified to e.g. VID 1
(assuming that value for the bridge_default_pvid), whereas when
tag_dsa.c xmits towards a user port, it always sends packets using a VID
of 0 if that port is standalone or under a VLAN-unaware bridge - or at
least it did so prior to commit d82f8ab0 ("net: dsa: tag_dsa:
offload the bridge forwarding process").

In any case, when there is a conversation between the CPU and a station
connected to a user port, the station's MAC address is learned in VID 1
but the CPU tries to transmit through VID 0. The packets reach the
intended station, but via flooding and not by virtue of matching the
existing ATU entry.

DSA has established (and enforced in other drivers: sja1105, felix,
mt7530) that a VLAN-unaware port should use a private pvid, and not
inherit the one from the bridge. The bridge's pvid should only be
inherited when that bridge is VLAN-aware, so all state transitions need
to be handled. On the other hand, all bridge VLANs should sit in the VTU
starting with the moment when the bridge offloads them via switchdev,
they are just not used.

This solves the problem that Tobias sees because packets ingressing on
VLAN-unaware user ports now get classified to VID 0, which is also the
VID used by tag_dsa.c on xmit.

Fixes: d82f8ab0 ("net: dsa: tag_dsa: offload the bridge forwarding process")
Link: https://patchwork.kernel.org/project/netdevbpf/patch/20211003222312.284175-2-vladimir.oltean@nxp.com/#24491503Reported-by: NTobias Waldekranz <tobias@waldekranz.com>
Signed-off-by: NVladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: NJakub Kicinski <kuba@kernel.org>

8b6836d8

08 10月, 2021 2 次提交

net: stmmac: add support for dwmac 3.40a · 9cb1d19f

由 Herve Codina 提交于 10月 08, 2021

dwmac 3.40a is an old ip version that can be found on SPEAr3xx soc.
Signed-off-by: NHerve Codina <herve.codina@bootlin.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

9cb1d19f

net: stmmac: fix get_hw_feature() on old hardware · 075da584

由 Herve Codina 提交于 10月 08, 2021

Some old IPs do not provide the hardware feature register.
On these IPs, this register is read 0x00000000.

In old driver version, this feature was handled but a regression came
with the commit f10a6a35 ("stmmac: rework get_hw_feature function").
Indeed, this commit removes the return value in dma->get_hw_feature().
This return value was used to indicate the validity of retrieved
information and used later on in stmmac_hw_init() to override
priv->plat data if this hardware feature were valid.

This patch restores the return code in ->get_hw_feature() in order
to indicate the hardware feature validity and override priv->plat
data only if this hardware feature is valid.

Fixes: f10a6a35 ("stmmac: rework get_hw_feature function")
Signed-off-by: NHerve Codina <herve.codina@bootlin.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

075da584

07 10月, 2021 3 次提交

iavf: fix double unlock of crit_lock · 54ee3943

由 Stefan Assmann 提交于 8月 24, 2021

The crit_lock mutex could be unlocked twice as reported here
https://lists.osuosl.org/pipermail/intel-wired-lan/Week-of-Mon-20210823/025525.html

Remove the superfluous unlock. Technically the problem was already
present before 5ac49f3c as that commit only replaced the locking
primitive, but no functional change.
Reported-by: NDan Carpenter <dan.carpenter@oracle.com>
Fixes: 5ac49f3c ("iavf: use mutexes for locking of critical sections")
Fixes: bac84861 ("iavf: Refactor the watchdog state machine")
Signed-off-by: NStefan Assmann <sassmann@kpanic.de>
Signed-off-by: NTony Nguyen <anthony.l.nguyen@intel.com>

54ee3943

i40e: Fix freeing of uninitialized misc IRQ vector · 2e5a2057

由 Sylwester Dziedziuch 提交于 9月 24, 2021

When VSI set up failed in i40e_probe() as part of PF switch set up
driver was trying to free misc IRQ vectors in
i40e_clear_interrupt_scheme and produced a kernel Oops:

   Trying to free already-free IRQ 266
   WARNING: CPU: 0 PID: 5 at kernel/irq/manage.c:1731 __free_irq+0x9a/0x300
   Workqueue: events work_for_cpu_fn
   RIP: 0010:__free_irq+0x9a/0x300
   Call Trace:
   ? synchronize_irq+0x3a/0xa0
   free_irq+0x2e/0x60
   i40e_clear_interrupt_scheme+0x53/0x190 [i40e]
   i40e_probe.part.108+0x134b/0x1a40 [i40e]
   ? kmem_cache_alloc+0x158/0x1c0
   ? acpi_ut_update_ref_count.part.1+0x8e/0x345
   ? acpi_ut_update_object_reference+0x15e/0x1e2
   ? strstr+0x21/0x70
   ? irq_get_irq_data+0xa/0x20
   ? mp_check_pin_attr+0x13/0xc0
   ? irq_get_irq_data+0xa/0x20
   ? mp_map_pin_to_irq+0xd3/0x2f0
   ? acpi_register_gsi_ioapic+0x93/0x170
   ? pci_conf1_read+0xa4/0x100
   ? pci_bus_read_config_word+0x49/0x70
   ? do_pci_enable_device+0xcc/0x100
   local_pci_probe+0x41/0x90
   work_for_cpu_fn+0x16/0x20
   process_one_work+0x1a7/0x360
   worker_thread+0x1cf/0x390
   ? create_worker+0x1a0/0x1a0
   kthread+0x112/0x130
   ? kthread_flush_work_fn+0x10/0x10
   ret_from_fork+0x1f/0x40

The problem is that at that point misc IRQ vectors
were not allocated yet and we get a call trace
that driver is trying to free already free IRQ vectors.

Add a check in i40e_clear_interrupt_scheme for __I40E_MISC_IRQ_REQUESTED
PF state before calling i40e_free_misc_vector. This state is set only if
misc IRQ vectors were properly initialized.

Fixes: c17401a1 ("i40e: use separate state bit for miscellaneous IRQ setup")
Reported-by: NPJ Waskiewicz <pwaskiewicz@jumptrading.com>
Signed-off-by: NSylwester Dziedziuch <sylwesterx.dziedziuch@intel.com>
Signed-off-by: NMateusz Palczewski <mateusz.palczewski@intel.com>
Tested-by: NDave Switzer <david.switzer@intel.com>
Signed-off-by: NTony Nguyen <anthony.l.nguyen@intel.com>

2e5a2057

i40e: fix endless loop under rtnl · 857b6c6f

由 Jiri Benc 提交于 9月 14, 2021

The loop in i40e_get_capabilities can never end. The problem is that
although i40e_aq_discover_capabilities returns with an error if there's
a firmware problem, the returned error is not checked. There is a check for
pf->hw.aq.asq_last_status but that value is set to I40E_AQ_RC_OK on most
firmware problems.

When i40e_aq_discover_capabilities encounters a firmware problem, it will
encounter the same problem on its next invocation. As the result, the loop
becomes endless. We hit this with I40E_ERR_ADMIN_QUEUE_TIMEOUT but looking
at the code, it can happen with a range of other firmware errors.

I don't know what the correct behavior should be: whether the firmware
should be retried a few times, or whether pf->hw.aq.asq_last_status should
be always set to the encountered firmware error (but then it would be
pointless and can be just replaced by the i40e_aq_discover_capabilities
return value). However, the current behavior with an endless loop under the
rtnl mutex(!) is unacceptable and Intel has not submitted a fix, although we
explained the bug to them 7 months ago.

This may not be the best possible fix but it's better than hanging the whole
system on a firmware bug.

Fixes: 56a62fc8 ("i40e: init code and hardware support")
Tested-by: NStefan Assmann <sassmann@redhat.com>
Signed-off-by: NJiri Benc <jbenc@redhat.com>
Reviewed-by: NJesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: NDave Switzer <david.switzer@intel.com>
Signed-off-by: NTony Nguyen <anthony.l.nguyen@intel.com>

857b6c6f

06 10月, 2021 8 次提交

ionic: move filter sync_needed bit set · 3707428d

由 Shannon Nelson 提交于 10月 05, 2021

Move the setting of the filter-sync-needed bit to the error
case in the filter add routine to be sure we're checking the
live filter status rather than a copy of the pre-sync status.

Fixes: 969f8439 ("ionic: sync the filters in the work task")
Signed-off-by: NShannon Nelson <snelson@pensando.io>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

3707428d

gve: report 64bit tx_bytes counter from gve_handle_report_stats() · 17c37d74

由 Eric Dumazet 提交于 10月 05, 2021

Each tx queue maintains a 64bit counter for bytes, there is
no reason to truncate this to 32bit (or this has not been
documented)

Fixes: 24aeb56f ("gve: Add Gvnic stats AQ command and ethtool show/set-priv-flags.")
Signed-off-by: NEric Dumazet <edumazet@google.com>
Cc: Yangchun Fu <yangchun@google.com>
Cc: Kuo Zhao <kuozhao@google.com>
Cc: David Awogbemila <awogbemila@google.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

17c37d74

gve: fix gve_get_stats() · 2f57d497

由 Eric Dumazet 提交于 10月 05, 2021

gve_get_stats() can report wrong numbers if/when u64_stats_fetch_retry()
returns true.

What is needed here is to sample values in temporary variables,
and only use them after each loop is ended.

Fixes: f5cedc84 ("gve: Add transmit and receive support")
Signed-off-by: NEric Dumazet <edumazet@google.com>
Cc: Catherine Sullivan <csully@google.com>
Cc: Sagi Shahar <sagis@google.com>
Cc: Jon Olson <jonolson@google.com>
Cc: Willem de Bruijn <willemb@google.com>
Cc: Luigi Rizzo <lrizzo@google.com>
Cc: Jeroen de Borst <jeroendb@google.com>
Cc: Tao Liu <xliutaox@google.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

2f57d497

gve: Properly handle errors in gve_assign_qpl · d4b111fd

由 Catherine Sullivan 提交于 10月 05, 2021

Ignored errors would result in crash.

Fixes: ede3fcf5 ("gve: Add support for raw addressing to the rx path")
Signed-off-by: NCatherine Sullivan <csully@google.com>
Signed-off-by: NJeroen de Borst <jeroendb@google.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

d4b111fd

gve: Avoid freeing NULL pointer · 922aa9bc

由 Tao Liu 提交于 10月 05, 2021

Prevent possible crashes when cleaning up after unsuccessful
initializations.

Fixes: 893ce44d ("gve: Add basic driver framework for Compute Engine Virtual NIC")
Signed-off-by: NTao Liu <xliutaox@google.com>
Signed-off-by: NCatherine Sully <csully@google.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

922aa9bc

gve: Correct available tx qpl check · d03477ee

由 Catherine Sullivan 提交于 10月 05, 2021

The qpl_map_size is rounded up to a multiple of sizeof(long), but the
number of qpls doesn't have to be.

Fixes: f5cedc84 ("gve: Add transmit and receive support")
Signed-off-by: NCatherine Sullivan <csully@google.com>
Signed-off-by: NJeroen de Borst <jeroendb@google.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

d03477ee

net: stmmac: trigger PCS EEE to turn off on link down · d4aeaed8

由 Wong Vee Khee 提交于 10月 05, 2021

The current implementation enable PCS EEE feature in the event of link
up, but PCS EEE feature is not disabled on link down.

This patch makes sure PCE EEE feature is disabled on link down.

Fixes: 656ed8b0 ("net: stmmac: fix EEE init issue when paired with EEE capable PHYs")
Signed-off-by: NWong Vee Khee <vee.khee.wong@linux.intel.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

d4aeaed8

net: pcs: xpcs: fix incorrect steps on disable EEE · 590df78b

由 Wong Vee Khee 提交于 10月 05, 2021

When Energy-Efficient Ethernet(EEE) is disable from the MAC side,
we need to clear the DW_VR_MII_EEE_TRN_LPI bit of DW_VR_MII_EEE_MCTRL1
register.

Fixes: 7617af3d ("net: pcs: Introducing support for DWC xpcs Energy Efficient Ethernet")
Cc: Michael Sit Wei Hong <michael.wei.hong.sit@intel.com>
Signed-off-by: NWong Vee Khee <vee.khee.wong@linux.intel.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

590df78b

05 10月, 2021 3 次提交

net: pcs: xpcs: fix incorrect CL37 AN sequence · e3cf002d

由 Wong Vee Khee 提交于 10月 05, 2021

According to Synopsys DesignWare Cores Ethernet PCS databook, it is
required to disable Clause 37 auto-negotiation by programming bit-12
(AN_ENABLE) to 0 if it is already enabled, before programming various
fields of VR_MII_AN_CTRL registers.

After all these programming are done, it is then required to enable
Clause 37 auto-negotiation by programming bit-12 (AN_ENABLE) to 1.

Fixes: b97b5331 ("net: pcs: add C37 SGMII AN support for intel mGbE controller")
Cc: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: NWong Vee Khee <vee.khee.wong@linux.intel.com>
Reviewed-by: NVladimir Oltean <vladimir.oltean@nxp.com>
Tested-by: NVladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

e3cf002d

net: sfp: Fix typo in state machine debug string · 25a9da66

由 Sean Anderson 提交于 10月 04, 2021

The string should be "tx_disable" to match the state enum.

Fixes: 4005a7cb ("net: phy: sftp: print debug message with text, not numbers")
Signed-off-by: NSean Anderson <sean.anderson@seco.com>
Reviewed-by: NAndrew Lunn <andrew@lunn.ch>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

25a9da66

r8152: avoid to resubmit rx immediately · baf33d7a

由 Hayes Wang 提交于 10月 04, 2021

For the situation that the disconnect event comes very late when the
device is unplugged, the driver would resubmit the RX bulk transfer
after getting the callback with -EPROTO immediately and continually.
Finally, soft lockup occurs.

This patch avoids to resubmit RX immediately. It uses a workqueue to
schedule the RX NAPI. And the NAPI would resubmit the RX. It let the
disconnect event have opportunity to stop the submission before soft
lockup.
Reported-by: NJason-ch Chen <jason-ch.chen@mediatek.com>
Tested-by: NJason-ch Chen <jason-ch.chen@mediatek.com>
Signed-off-by: NHayes Wang <hayeswang@realtek.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

baf33d7a

02 10月, 2021 2 次提交

net: stmmac: dwmac-rk: Fix ethernet on rk3399 based devices · aec3f415

由 Punit Agrawal 提交于 9月 29, 2021

Commit 2d26f6e3 ("net: stmmac: dwmac-rk: fix unbalanced pm_runtime_enable warnings")
while getting rid of a runtime PM warning ended up breaking ethernet
on rk3399 based devices. By dropping an extra reference to the device,
the commit ends up enabling suspend / resume of the ethernet device -
which appears to be broken.

While the issue with runtime pm is being investigated, partially
revert commit 2d26f6e3 to restore the network on rk3399.

Fixes: 2d26f6e3 ("net: stmmac: dwmac-rk: fix unbalanced pm_runtime_enable warnings")
Suggested-by: NHeiko Stuebner <heiko@sntech.de>
Signed-off-by: NPunit Agrawal <punitagrawal@gmail.com>
Cc: Michael Riesch <michael.riesch@wolfvision.net>
Tested-by: NHeiko Stuebner <heiko@sntech.de>
Link: https://lore.kernel.org/r/20210929135049.3426058-1-punitagrawal@gmail.comSigned-off-by: NJakub Kicinski <kuba@kernel.org>

aec3f415

net: mscc: ocelot: fix VCAP filters remaining active after being deleted · 019d9329

由 Vladimir Oltean 提交于 9月 30, 2021

When ocelot_flower.c calls ocelot_vcap_filter_add(), the filter has a
given filter->id.cookie. This filter is added to the block->rules list.

However, when ocelot_flower.c calls ocelot_vcap_block_find_filter_by_id()
which passes the cookie as argument, the filter is never found by
filter->id.cookie when searching through the block->rules list.

This is unsurprising, since the filter->id.cookie is an unsigned long,
but the cookie argument provided to ocelot_vcap_block_find_filter_by_id()
is a signed int, and the comparison fails.

Fixes: 50c6cc5b ("net: mscc: ocelot: store a namespaced VCAP filter ID")
Signed-off-by: NVladimir Oltean <vladimir.oltean@nxp.com>
Link: https://lore.kernel.org/r/20210930125330.2078625-1-vladimir.oltean@nxp.comSigned-off-by: NJakub Kicinski <kuba@kernel.org>

019d9329

01 10月, 2021 12 次提交

phy: mdio: fix memory leak · ca6e11c3

由 Pavel Skripkin 提交于 9月 30, 2021

Syzbot reported memory leak in MDIO bus interface, the problem was in
wrong state logic.

MDIOBUS_ALLOCATED indicates 2 states:
	1. Bus is only allocated
	2. Bus allocated and __mdiobus_register() fails, but
	   device_register() was called

In case of device_register() has been called we should call put_device()
to correctly free the memory allocated for this device, but mdiobus_free()
calls just kfree(dev) in case of MDIOBUS_ALLOCATED state

To avoid this behaviour we need to set bus->state to MDIOBUS_UNREGISTERED
_before_ calling device_register(), because put_device() should be
called even in case of device_register() failure.

Link: https://lore.kernel.org/netdev/YVMRWNDZDUOvQjHL@shell.armlinux.org.uk/
Fixes: 46abc021 ("phylib: give mdio buses a device tree presence")
Reported-and-tested-by: syzbot+398e7dc692ddbbb4cfec@syzkaller.appspotmail.com
Reviewed-by: NDan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: NPavel Skripkin <paskripkin@gmail.com>
Link: https://lore.kernel.org/r/eceae1429fbf8fa5c73dd2a0d39d525aa905074d.1633024062.git.paskripkin@gmail.comSigned-off-by: NJakub Kicinski <kuba@kernel.org>

ca6e11c3

Revert "net: mdiobus: Fix memory leak in __mdiobus_register" · 10eff1f5

由 Pavel Skripkin 提交于 9月 30, 2021

This reverts commit ab609f25.

This patch is correct in the sense that we _should_ call device_put() in
case of device_register() failure, but the problem in this code is more
vast.

We need to set bus->state to UNMDIOBUS_REGISTERED before calling
device_register() to correctly release the device in mdiobus_free().
This patch prevents us from doing it, since in case of device_register()
failure put_device() will be called 2 times and it will cause UAF or
something else.

Also, Reported-by: tag in revered commit was wrong, since syzbot
reported different leak in same function.

Link: https://lore.kernel.org/netdev/20210928092657.GI2048@kadam/Acked-by: NYanfei Xu <yanfei.xu@windriver.com>
Signed-off-by: NPavel Skripkin <paskripkin@gmail.com>
Link: https://lore.kernel.org/r/f12fb1faa4eccf0f355788225335eb4309ff2599.1633024062.git.paskripkin@gmail.comSigned-off-by: NJakub Kicinski <kuba@kernel.org>

10eff1f5

net/mlx5e: Mutually exclude setting of TX-port-TS and MQPRIO in channel mode · 3bf1742f

由 Aya Levin 提交于 9月 13, 2021

TX-port-TS hijacks the PTP traffic to a specific HW TX-queue. This
conflicts with MQPRIO in channel mode, which specifies explicitly which
TC accepts the packet. This patch mutually excludes the above
configuration.

Fixes: ec60c458 ("net/mlx5e: Support MQPRIO channel mode")
Signed-off-by: NAya Levin <ayal@nvidia.com>
Reviewed-by: NTariq Toukan <tariqt@nvidia.com>
Signed-off-by: NSaeed Mahameed <saeedm@nvidia.com>

3bf1742f

net/mlx5e: Fix the presented RQ index in PTP stats · dd1979cf

由 Lama Kayal 提交于 8月 29, 2021

PTP-RQ counters title format contains PTP-RQ identifier, which is
mistakenly not passed to sprinft().
This leads to unexpected garbage values instead.
This patch fixes it.

Before applying the patch:
ethtool -S eth3 | grep ptp_rq
     ptp_rq15_packets: 0
     ptp_rq8_bytes: 0
     ptp_rq6_csum_complete: 0
     ptp_rq14_csum_complete_tail: 0
     ptp_rq3_csum_complete_tail_slow : 0
     ptp_rq9_csum_unnecessary: 0
     ptp_rq1_csum_unnecessary_inner: 0
     ptp_rq7_csum_none: 0
     ptp_rq10_xdp_drop: 0
     ptp_rq9_xdp_redirect: 0
     ptp_rq13_lro_packets: 0
     ptp_rq12_lro_bytes: 0
     ptp_rq10_ecn_mark: 0
     ptp_rq9_removed_vlan_packets: 0
     ptp_rq5_wqe_err: 0
     ptp_rq8_mpwqe_filler_cqes: 0
     ptp_rq2_mpwqe_filler_strides: 0
     ptp_rq5_oversize_pkts_sw_drop: 0
     ptp_rq6_buff_alloc_err: 0
     ptp_rq15_cqe_compress_blks: 0
     ptp_rq2_cqe_compress_pkts: 0
     ptp_rq2_cache_reuse: 0
     ptp_rq12_cache_full: 0
     ptp_rq11_cache_empty: 256
     ptp_rq12_cache_busy: 0
     ptp_rq11_cache_waive: 0
     ptp_rq12_congst_umr: 0
     ptp_rq11_arfs_err: 0
     ptp_rq9_recover: 0

After applying the patch:
ethtool -S eth3 | grep ptp_rq
     ptp_rq0_packets: 0
     ptp_rq0_bytes: 0
     ptp_rq0_csum_complete: 0
     ptp_rq0_csum_complete_tail: 0
     ptp_rq0_csum_complete_tail_slow : 0
     ptp_rq0_csum_unnecessary: 0
     ptp_rq0_csum_unnecessary_inner: 0
     ptp_rq0_csum_none: 0
     ptp_rq0_xdp_drop: 0
     ptp_rq0_xdp_redirect: 0
     ptp_rq0_lro_packets: 0
     ptp_rq0_lro_bytes: 0
     ptp_rq0_ecn_mark: 0
     ptp_rq0_removed_vlan_packets: 0
     ptp_rq0_wqe_err: 0
     ptp_rq0_mpwqe_filler_cqes: 0
     ptp_rq0_mpwqe_filler_strides: 0
     ptp_rq0_oversize_pkts_sw_drop: 0
     ptp_rq0_buff_alloc_err: 0
     ptp_rq0_cqe_compress_blks: 0
     ptp_rq0_cqe_compress_pkts: 0
     ptp_rq0_cache_reuse: 0
     ptp_rq0_cache_full: 0
     ptp_rq0_cache_empty: 256
     ptp_rq0_cache_busy: 0
     ptp_rq0_cache_waive: 0
     ptp_rq0_congst_umr: 0
     ptp_rq0_arfs_err: 0
     ptp_rq0_recover: 0

Fixes: a28359e9 ("net/mlx5e: Add PTP-RX statistics")
Signed-off-by: NLama Kayal <lkayal@nvidia.com>
Reviewed-by: NTariq Toukan <tariqt@nvidia.com>
Signed-off-by: NSaeed Mahameed <saeedm@nvidia.com>

dd1979cf

net/mlx5: Fix setting number of EQs of SFs · f88c4876

由 Shay Drory 提交于 9月 14, 2021

When setting number of completion EQs of the SF, consider number of
online CPUs.
Without this consideration, when number of online cpus are less than 8,
unnecessary 8 completion EQs are allocated.

Fixes: c36326d3 ("net/mlx5: Round-Robin EQs over IRQs")
Signed-off-by: NShay Drory <shayd@nvidia.com>
Reviewed-by: NParav Pandit <parav@nvidia.com>
Signed-off-by: NSaeed Mahameed <saeedm@nvidia.com>

f88c4876

net/mlx5: Fix length of irq_index in chars · ac8b7d50

由 Shay Drory 提交于 8月 19, 2021

The maximum irq_index can be 2047, This means irq_name should have 4
characters reserve for the irq_index. Hence, increase it to 4.

Fixes: 3af26495 ("net/mlx5: Enlarge interrupt field in CREATE_EQ")
Signed-off-by: NShay Drory <shayd@nvidia.com>
Reviewed-by: NParav Pandit <parav@nvidia.com>
Signed-off-by: NSaeed Mahameed <saeedm@nvidia.com>

ac8b7d50

net/mlx5: Avoid generating event after PPS out in Real time mode · 99b9a678

由 Aya Levin 提交于 9月 23, 2021

When in Real-time mode, HW clock is synced with the PTP daemon. Hence
driver should not re-calibrate the next pulse (via MTPPSE repetitive
events mechanism).

This patch arms repetitive events only in free-running mode.

Fixes: 432119de ("net/mlx5: Add cyc2time HW translation mode support")
Signed-off-by: NAya Levin <ayal@nvidia.com>
Reviewed-by: NEran Ben Elisha <eranbe@nvidia.com>
Signed-off-by: NSaeed Mahameed <saeedm@nvidia.com>

99b9a678

net/mlx5: Force round second at 1PPS out start time · 64728294

由 Aya Levin 提交于 9月 23, 2021

Allow configuration of 1PPS start time only with time-stamp representing
a round second. Prior to this patch driver allowed setting of a
non-round-second which is not supported by the device. Avoid unexpected
behavior by restricting start-time configuration to a round-second.

Fixes: 4272f9b8 ("net/mlx5e: Change 1PPS out scheme")
Signed-off-by: NAya Levin <ayal@nvidia.com>
Reviewed-by: NEran Ben Elisha <eranbe@nvidia.com>
Signed-off-by: NSaeed Mahameed <saeedm@nvidia.com>

64728294

net/mlx5: E-Switch, Fix double allocation of acl flow counter · a586775f

由 Moshe Shemesh 提交于 9月 23, 2021

Flow counter is allocated in eswitch legacy acl setting functions
without checking if already allocated by previous setting. Add a check
to avoid such double allocation.

Fixes: 07bab950 ("net/mlx5: E-Switch, Refactor eswitch ingress acl codes")
Fixes: ea651a86 ("net/mlx5: E-Switch, Refactor eswitch egress acl codes")
Signed-off-by: NMoshe Shemesh <moshe@nvidia.com>
Reviewed-by: NTariq Toukan <tariqt@nvidia.com>
Signed-off-by: NSaeed Mahameed <saeedm@nvidia.com>

a586775f

net/mlx5e: Improve MQPRIO resiliency · 7dbc849b

由 Tariq Toukan 提交于 9月 29, 2021

* Add netdev->tc_to_txq rollback in case of failure in
  mlx5e_update_netdev_queues().
* Fix broken transition between the two modes:
  MQPRIO DCB mode with tc==8, and MQPRIO channel mode.
* Disable MQPRIO channel mode if re-attaching with a different number
  of channels.
* Improve code sharing.

Fixes: ec60c458 ("net/mlx5e: Support MQPRIO channel mode")
Signed-off-by: NTariq Toukan <tariqt@nvidia.com>
Reviewed-by: NMaxim Mikityanskiy <maximmi@nvidia.com>
Signed-off-by: NSaeed Mahameed <saeedm@nvidia.com>

7dbc849b

net/mlx5e: Keep the value for maximum number of channels in-sync · 9d758d4a

由 Tariq Toukan 提交于 9月 02, 2021

The value for maximum number of channels is first calculated based
on the netdev's profile and current function resources (specifically,
number of MSIX vectors, which depends among other things on the number
of online cores in the system).
This value is then used to calculate the netdev's number of rxqs/txqs.
Once created (by alloc_etherdev_mqs), the number of netdev's rxqs/txqs
is constant and we must not exceed it.

To achieve this, keep the maximum number of channels in sync upon any
netdevice re-attach.

Use mlx5e_get_max_num_channels() for calculating the number of netdev's
rxqs/txqs. After netdev is created, use mlx5e_calc_max_nch() (which
coinsiders core device resources, profile, and netdev) to init or
update priv->max_nch.

Before this patch, the value of priv->max_nch might get out of sync,
mistakenly allowing accesses to out-of-bounds objects, which would
crash the system.

Track the number of channels stats structures used in a separate
field, as they are persistent to suspend/resume operations. All the
collected stats of every channel index that ever existed should be
preserved. They are reset only when struct mlx5e_priv is,
in mlx5e_priv_cleanup(), which is part of the profile changing flow.

There is no point anymore in blocking a profile change due to max_nch
mismatch in mlx5e_netdev_change_profile(). Remove the limitation.

Fixes: a1f240f1 ("net/mlx5e: Adjust to max number of channles when re-attaching")
Signed-off-by: NTariq Toukan <tariqt@nvidia.com>
Reviewed-by: NAya Levin <ayal@nvidia.com>
Reviewed-by: NMaxim Mikityanskiy <maximmi@nvidia.com>
Signed-off-by: NSaeed Mahameed <saeedm@nvidia.com>

9d758d4a

net/mlx5e: IPSEC RX, enable checksum complete · f9a10440

由 Raed Salem 提交于 8月 26, 2021

Currently in Rx data path IPsec crypto offloaded packets uses
csum_none flag, so checksum is handled by the stack, this naturally
have some performance/cpu utilization impact on such flows. As Nvidia
NIC starting from ConnectX6DX provides checksum complete value out of
the box also for such flows there is no sense in taking csum_none path,
furthermore the stack (xfrm) have the method to handle checksum complete
corrections for such flows i.e. IPsec trailer removal and consequently
checksum value adjustment.

Because of the above and in addition the ConnectX6DX is the first HW
which supports IPsec crypto offload then it is safe to report csum
complete for IPsec offloaded traffic.

Fixes: b2ac7541 ("net/mlx5e: IPsec: Add Connect-X IPsec Rx data path offload")
Signed-off-by: NRaed Salem <raeds@nvidia.com>
Signed-off-by: NSaeed Mahameed <saeedm@nvidia.com>

f9a10440

30 9月, 2021 1 次提交

net: stmmac: fix EEE init issue when paired with EEE capable PHYs · 656ed8b0

由 Wong Vee Khee 提交于 9月 30, 2021

When STMMAC is paired with Energy-Efficient Ethernet(EEE) capable PHY,
and the PHY is advertising EEE by default, we need to enable EEE on the
xPCS side too, instead of having user to manually trigger the enabling
config via ethtool.

Fixed this by adding xpcs_config_eee() call in stmmac_eee_init().

Fixes: 7617af3d ("net: pcs: Introducing support for DWC xpcs Energy Efficient Ethernet")
Cc: Michael Sit Wei Hong <michael.wei.hong.sit@intel.com>
Signed-off-by: NWong Vee Khee <vee.khee.wong@linux.intel.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

656ed8b0

29 9月, 2021 7 次提交

net: phy: bcm7xxx: Fixed indirect MMD operations · d88fd1b5

由 Florian Fainelli 提交于 9月 28, 2021

When EEE support was added to the 28nm EPHY it was assumed that it would
be able to support the standard clause 45 over clause 22 register access
method. It turns out that the PHY does not support that, which is the
very reason for using the indirect shadow mode 2 bank 3 access method.

Implement {read,write}_mmd to allow the standard PHY library routines
pertaining to EEE querying and configuration to work correctly on these
PHYs. This forces us to implement a __phy_set_clr_bits() function that
does not grab the MDIO bus lock since the PHY driver's {read,write}_mmd
functions are always called with that lock held.

Fixes: 83ee102a ("net: phy: bcm7xxx: add support for 28nm EPHY")
Signed-off-by: NFlorian Fainelli <f.fainelli@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

d88fd1b5

net: hns3: disable firmware compatible features when uninstall PF · 0178839c

由 Guangbin Huang 提交于 9月 29, 2021

Currently, the firmware compatible features are enabled in PF driver
initialization process, but they are not disabled in PF driver
deinitialization process and firmware keeps these features in enabled
status.

In this case, if load an old PF driver (for example, in VM) which not
support the firmware compatible features, firmware will still send mailbox
message to PF when link status changed and PF will print
"un-supported mailbox message, code = 201".

To fix this problem, disable these firmware compatible features in PF
driver deinitialization process.

Fixes: ed8fb4b2 ("net: hns3: add link change event report")
Signed-off-by: NGuangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

0178839c

net: hns3: fix always enable rx vlan filter problem after selftest · 27bf4af6

由 Guangbin Huang 提交于 9月 29, 2021

Currently, the rx vlan filter will always be disabled before selftest and
be enabled after selftest as the rx vlan filter feature is fixed on in
old device earlier than V3.

However, this feature is not fixed in some new devices and it can be
disabled by user. In this case, it is wrong if rx vlan filter is enabled
after selftest. So fix it.

Fixes: bcc26e8d ("net: hns3: remove unused code in hns3_self_test()")
Signed-off-by: NGuangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

27bf4af6

net: hns3: PF enable promisc for VF when mac table is overflow · 276e6042

由 Guangbin Huang 提交于 9月 29, 2021

If unicast mac address table is full, and user add a new mac address, the
unicast promisc needs to be enabled for the new unicast mac address can be
used. So does the multicast promisc.

Now this feature has been implemented for PF, and VF should be implemented
too. When the mac table of VF is overflow, PF will enable promisc for this
VF.

Fixes: 1e6e7610 ("net: hns3: configure promisc mode for VF asynchronously")
Signed-off-by: NGuangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

276e6042

net: hns3: fix show wrong state when add existing uc mac address · 108b3c78

由 Jian Shen 提交于 9月 29, 2021

Currently, if function adds an existing unicast mac address, eventhough
driver will not add this address into hardware, but it will return 0 in
function hclge_add_uc_addr_common(). It will cause the state of this
unicast mac address is ACTIVE in driver, but it should be in TO-ADD state.

To fix this problem, function hclge_add_uc_addr_common() returns -EEXIST
if mac address is existing, and delete two error log to avoid printing
them all the time after this modification.

Fixes: 72110b56 ("net: hns3: return 0 and print warning when hit duplicate MAC")
Signed-off-by: NJian Shen <shenjian15@huawei.com>
Signed-off-by: NGuangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

108b3c78

net: hns3: fix mixed flag HCLGE_FLAG_MQPRIO_ENABLE and HCLGE_FLAG_DCB_ENABLE · 0472e95f

由 Jian Shen 提交于 9月 29, 2021

HCLGE_FLAG_MQPRIO_ENABLE is supposed to set when enable
multiple TCs with tc mqprio, and HCLGE_FLAG_DCB_ENABLE is
supposed to set when enable multiple TCs with ets. But
the driver mixed the flags when updating the tm configuration.

Furtherly, PFC should be available when HCLGE_FLAG_MQPRIO_ENABLE
too, so remove the unnecessary limitation.

Fixes: 5a5c9091 ("net: hns3: add support for tc mqprio offload")
Signed-off-by: NJian Shen <shenjian15@huawei.com>
Signed-off-by: NGuangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

0472e95f

net: hns3: don't rollback when destroy mqprio fail · d82650be

由 Jian Shen 提交于 9月 29, 2021

For destroy mqprio is irreversible in stack, so it's unnecessary
to rollback the tc configuration when destroy mqprio failed.
Otherwise, it may cause the configuration being inconsistent
between driver and netstack.

As the failure is usually caused by reset, and the driver will
restore the configuration after reset, so it can keep the
configuration being consistent between driver and hardware.

Fixes: 5a5c9091 ("net: hns3: add support for tc mqprio offload")
Signed-off-by: NJian Shen <shenjian15@huawei.com>
Signed-off-by: NGuangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

d82650be

openeuler / Kernel 1 年多 前同步成功

openeuler / Kernel
1 年多前同步成功