1. 09 Oct, 2021 2 commits
    • net: dsa: mv88e6xxx: isolate the ATU databases of standalone and bridged ports · 5bded825
      Vladimir Oltean committed
      Similar to commit 6087175b ("net: dsa: mt7530: use independent VLAN
      learning on VLAN-unaware bridges"), software forwarding between an
      unoffloaded LAG port (a bonding interface with an unsupported policy)
      and a mv88e6xxx user port directly under a bridge is broken.
      
      We adopt the same strategy, which is to make the standalone ports not
      find any ATU entry learned on a bridge port.
      
      Theory: the mv88e6xxx ATU is looked up by FID and MAC address. There are
      as many FIDs as VIDs (4096). The FID is derived from the VID when
      possible (the VTU maps a VID to a FID), with a fallback to the port
      based default FID value when not (802.1Q Mode is disabled on the port,
      or the classified VID isn't present in the VTU).
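The classification rule in the paragraph above can be sketched as a small C model. This is only an illustration of the rule as stated for a user port, not driver code: `vtu_entry`, `port_model` and `classify_fid` are hypothetical names introduced here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define N_VID 4096 /* VIDs 0..4095, hence as many possible FIDs */

/* Hypothetical model of one VTU entry: maps a VID to a FID when valid. */
struct vtu_entry {
	bool valid;
	uint16_t fid;
};

/* Hypothetical model of the per-port state relevant to classification. */
struct port_model {
	bool dot1q_enabled;   /* 802.1Q Mode on the port */
	uint16_t default_fid; /* port-based default FID */
};

/* Return the FID used for the ATU lookup of a frame classified to 'vid'. */
static uint16_t classify_fid(const struct vtu_entry vtu[N_VID],
			     const struct port_model *port, uint16_t vid)
{
	assert(vid < N_VID);

	/* The FID is derived from the VID when possible. */
	if (port->dot1q_enabled && vtu[vid].valid)
		return vtu[vid].fid;

	/* Otherwise fall back to the port-based default FID. */
	return port->default_fid;
}
```

With VID 100 mapped to FID 1 in the model VTU, a port with 802.1Q Mode enabled resolves VID 100 to FID 1, while an unknown VID, or any VID on a port with 802.1Q Mode disabled, resolves to the port default FID.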
      
      The mv88e6xxx driver makes the following use of FIDs and VIDs:
      
      - the port's DefaultVID (to which untagged & pvid-tagged packets get
        classified) is 0 and is absent from the VTU, so packets of this kind
        are processed in FID 0, the default FID assigned by
        mv88e6xxx_setup_port().
      
      - every time a bridge VLAN is created, mv88e6xxx_port_vlan_join() ->
        mv88e6xxx_atu_new() associates a FID with that VID which increases
        linearly starting from 1. Like this:
      
        bridge vlan add dev lan0 vid 100 # FID 1
        bridge vlan add dev lan1 vid 100 # still FID 1
        bridge vlan add dev lan2 vid 1024 # FID 2
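The allocation pattern shown in the three commands above can be modelled roughly as follows. This is a minimal sketch assuming a simple "next unused FID" counter; `fid_alloc`, `fid_alloc_init` and `vlan_join` are hypothetical names and do not reproduce how mv88e6xxx_atu_new() searches for a free FID.

```c
#include <assert.h>
#include <stdint.h>

#define N_VID 4096

/* Hypothetical allocator state: fid_of_vid[v] == 0 means "no FID yet". */
struct fid_alloc {
	uint16_t fid_of_vid[N_VID];
	uint16_t next_fid; /* next unused FID; FID 0 is the port default */
};

static void fid_alloc_init(struct fid_alloc *a)
{
	for (int v = 0; v < N_VID; v++)
		a->fid_of_vid[v] = 0;
	a->next_fid = 1; /* allocation starts from 1 */
}

/* Return the FID for 'vid', allocating a new one on first use of the VID. */
static uint16_t vlan_join(struct fid_alloc *a, uint16_t vid)
{
	assert(vid > 0 && vid < N_VID);

	if (a->fid_of_vid[vid] == 0) {
		assert(a->next_fid < N_VID); /* out of FIDs in this sketch */
		a->fid_of_vid[vid] = a->next_fid++;
	}
	return a->fid_of_vid[vid];
}
```

Joining VID 100 twice (for lan0 and lan1) yields FID 1 both times, and joining VID 1024 afterwards yields FID 2, matching the comments in the example above. Note that the port is not an input at all, which is why two bridges using the same VID end up sharing a FID.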
      
      The FID allocation made by the driver is sub-optimal for the following
      reasons:
      
      (a) A standalone port has a DefaultVID of 0 and a default FID of 0 too.
          A VLAN-unaware bridged port has a DefaultVID of 0 and a default FID
          of 0 too. The difference is that the bridged ports may learn ATU
          entries, while the standalone port has the requirement that it must
          not, and must not find them either. Standalone ports must not use
          the same FID as ports belonging to a bridge. All standalone ports
          can use the same FID, since the ATU will never have an entry in
          that FID.
      
      (b) Multiple VLAN-unaware bridges will all use a DefaultVID of 0 and a
          default FID of 0 on all their ports. The FDBs will not be isolated
          between these bridges. Every VLAN-unaware bridge must use the same
          FID on all its ports, different from the FID of other bridge ports.
      
      (c) Each bridge VLAN uses a unique FID which is useful for Independent
          VLAN Learning, but the same VLAN ID on multiple VLAN-aware bridges
          will result in the same FID being used by mv88e6xxx_atu_new().
          The correct behavior is for VLAN 1 in br0 to have a different FID
          compared to VLAN 1 in br1.
      
      This patch cannot fix all the above. Traditionally the DSA framework did
      not care about this, and the reality is that DSA core involvement is
      needed for the aforementioned issues to be solved. The only thing we can
      solve here is an issue which does not require API changes, and that is
      issue (a), aka use a different FID for standalone ports vs ports under
      VLAN-unaware bridges.
      
      The first step is deciding what VID and FID to use for standalone ports,
      and what VID and FID for bridged ports. The 0/0 pair for standalone
      ports is what they used up till now, let's keep using that. For bridged
      ports, there are 2 cases:
      
      - VLAN-aware ports will never end up using the port default FID, because
        packets will always be classified to a VID in the VTU or dropped
        otherwise. The FID is the one associated with the VID in the VTU.
      
      - On VLAN-unaware ports, we _could_ leave their DefaultVID (pvid) at
        zero (just as in the case of standalone ports), and just change the
        port's default FID from 0 to a different number (say 1).
      
      However, Tobias points out that there is one more requirement to cater to:
      cross-chip bridging. The Marvell DSA header does not carry the FID in
      it, only the VID. So once a packet crosses a DSA link, if it has a VID
      of zero it will get classified to the default FID of that cascade port.
      Relying on a port default FID for upstream cascade ports results in
      contradictions: a default FID of 0 breaks ATU isolation of bridged ports
      on the downstream switch, a default FID of 1 breaks standalone ports on
      the downstream switch.
      
      So not only must standalone ports have different FIDs compared to
      bridged ports, they must also have different DefaultVID values.
      IEEE 802.1Q defines two reserved VID values: 0 and 4095. So we simply
      choose 4095 as the DefaultVID of ports belonging to VLAN-unaware
      bridges, and VID 4095 maps to FID 1.
      
      For the xmit operation to look up the same ATU database, we need to put
      VID 4095 in DSA tags sent to ports belonging to VLAN-unaware bridges
      too. All shared ports are configured to map this VID to the bridging
      FID, because they are members of that VLAN in the VTU. Shared ports
      don't need to have 802.1QMode enabled in any way, they always parse the
      VID from the DSA header, they don't need to look at the 802.1Q header.
      
      We install VID 4095 to the VTU in mv88e6xxx_setup_port(), with the
      mention that mv88e6xxx_vtu_setup() which was located right below that
      call was flushing the VTU so those entries wouldn't be preserved.
      So we need to relocate the VTU flushing prior to the port initialization
      during ->setup(). Also note that this is why it is safe to assume that
      VID 4095 will get associated with FID 1: the user ports haven't been
      created, so there is no avenue for the user to create a bridge VLAN
      which could otherwise race with the creation of another FID which would
      otherwise use up the non-reserved FID value of 1.
      
      [ Currently mv88e6xxx_port_vlan_join() doesn't have the option of
        specifying a preferred FID, it always calls mv88e6xxx_atu_new(). ]
      
      mv88e6xxx_port_db_load_purge() is the function to access the ATU for
      FDB/MDB entries, and it used to determine the FID to use for
      VLAN-unaware FDB entries (VID=0) using mv88e6xxx_port_get_fid().
      But the driver only called mv88e6xxx_port_set_fid() once, during probe,
      so no surprises, the port FID was always 0, the call to get_fid() was
      redundant. As much as I would have wanted to not touch that code, the
      logic is broken when we add a new FID which is not the port-based
      default. Now the port-based default FID only corresponds to standalone
      ports, and FDB/MDB entries belong to the bridging service. So while in
      the future, when the DSA API will support FDB isolation, we will have to
      figure out the FID based on the bridge number, for now there's a single
      bridging FID, so hardcode that.
      
      Lastly, the tagger needs to check, when it is transmitting a VLAN
      untagged skb, whether it is sending it towards a bridged or a standalone
      port. When we see it is bridged we assume the bridge is VLAN-unaware.
      Not because it cannot be VLAN-aware but:
      
      - if we are transmitting from a VLAN-aware bridge we are likely doing so
        using TX forwarding offload. That code path guarantees that skbs have
        a vlan hwaccel tag in them, so we would not enter the "else" branch
        of the "if (skb->protocol == htons(ETH_P_8021Q))" condition.
      
      - if we are transmitting on behalf of a VLAN-aware bridge but with no TX
        forwarding offload (no PVT support, out of space in the PVT, whatever),
        we would indeed be transmitting with VLAN 4095 instead of the bridge
        device's pvid. However we would be injecting a "From CPU" frame, and
        the switch won't learn from that - it only learns from "Forward" frames.
        So it is inconsequential for address learning. And VLAN 4095 is
        absolutely enough for the frame to exit the switch, since we never
        remove that VLAN from any port.
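The tagger decision described above can be condensed into a small C sketch. The constants take their values from the text (VID 0 for standalone, VID 4095 for bridged), but their names, `xmit_ctx` and `dsa_tag_vid` are hypothetical and stand in for whatever tag_dsa.c actually uses.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative constants following the values given in the text. */
#define VID_STANDALONE 0
#define VID_BRIDGED    4095

/* Hypothetical summary of what the tagger knows about one skb. */
struct xmit_ctx {
	bool has_vlan_tag;    /* models skb->protocol == htons(ETH_P_8021Q) */
	uint16_t tagged_vid;  /* VID taken from the 802.1Q header, if any */
	bool port_is_bridged; /* egress user port is under a bridge */
};

/* Pick the VID to place in the DSA tag on transmit. */
static uint16_t dsa_tag_vid(const struct xmit_ctx *ctx)
{
	assert(ctx);

	/* VLAN-tagged skb: reuse the VID it already carries. */
	if (ctx->has_vlan_tag)
		return ctx->tagged_vid & 0x0fff;

	/* Untagged skb: a bridged port is assumed to be VLAN-unaware. */
	return ctx->port_is_bridged ? VID_BRIDGED : VID_STANDALONE;
}
```

An untagged skb towards a bridged port is sent with VID 4095 and one towards a standalone port with VID 0, so each transmit path looks up the same ATU database that the corresponding receive path learns in.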
      
      Fixes: 57e661aa ("net: dsa: mv88e6xxx: Link aggregation support")
      Reported-by: Tobias Waldekranz <tobias@waldekranz.com>
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • net: dsa: mv88e6xxx: keep the pvid at 0 when VLAN-unaware · 8b6836d8
      Vladimir Oltean committed
      The VLAN support in mv88e6xxx has a loaded history. Commit 2ea7a679
      ("net: dsa: Don't add vlans when vlan filtering is disabled") noticed
      some issues with VLAN and decided the best way to deal with them was to
      make the DSA core ignore VLANs added by the bridge while VLAN awareness
      is turned off. Those issues were never explained, just presented as
      "at least one corner case".
      
      That approach had problems of its own, presented by
      commit 54a0ed0d ("net: dsa: provide an option for drivers to always
      receive bridge VLANs") for the DSA core, followed by
      commit 1fb74191 ("net: dsa: mv88e6xxx: fix vlan setup") which
      applied ds->configure_vlan_while_not_filtering = true for mv88e6xxx in
      particular.
      
      We still don't know what corner case Andrew saw when he wrote
      commit 2ea7a679 ("net: dsa: Don't add vlans when vlan filtering is
      disabled"), but Tobias now reports that when we use TX forwarding
      offload, pinging an external station from the bridge device is broken if
      the front-facing DSA user port has flooding turned off. The full
      description is in the link below, but in short, when a mv88e6xxx port
      is under a VLAN-unaware bridge, it inherits that bridge's pvid.
      So packets ingressing a user port will be classified to e.g. VID 1
      (assuming that value for the bridge_default_pvid), whereas when
      tag_dsa.c xmits towards a user port, it always sends packets using a VID
      of 0 if that port is standalone or under a VLAN-unaware bridge - or at
      least it did so prior to commit d82f8ab0 ("net: dsa: tag_dsa:
      offload the bridge forwarding process").
      
      In any case, when there is a conversation between the CPU and a station
      connected to a user port, the station's MAC address is learned in VID 1
      but the CPU tries to transmit through VID 0. The packets reach the
      intended station, but via flooding and not by virtue of matching the
      existing ATU entry.
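The mismatch described above can be reproduced with a toy address table keyed by (database, MAC). This is a sketch under simplifying assumptions: the table is a tiny linear array, the database id stands for whatever the VID resolves to, and all names (`atu`, `atu_learn`, `atu_lookup`, `PORT_FLOOD`) are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define ATU_SIZE   8
#define PORT_FLOOD (-1) /* lookup miss: the frame is flooded */

struct atu_entry {
	bool used;
	uint16_t db;    /* database the address was learned in */
	uint8_t mac[6];
	int port;
};

struct atu {
	struct atu_entry e[ATU_SIZE];
};

/* Learn (or refresh) 'mac' on 'port' in database 'db'. */
static void atu_learn(struct atu *t, uint16_t db, const uint8_t mac[6], int port)
{
	assert(port >= 0);

	for (int i = 0; i < ATU_SIZE; i++) {
		struct atu_entry *e = &t->e[i];

		if (e->used && e->db == db && memcmp(e->mac, mac, 6) == 0) {
			e->port = port;
			return;
		}
	}
	for (int i = 0; i < ATU_SIZE; i++) {
		struct atu_entry *e = &t->e[i];

		if (!e->used) {
			e->used = true;
			e->db = db;
			memcpy(e->mac, mac, 6);
			e->port = port;
			return;
		}
	}
	/* Table full: the address is simply not learned in this sketch. */
}

/* Return the destination port, or PORT_FLOOD when there is no match. */
static int atu_lookup(const struct atu *t, uint16_t db, const uint8_t mac[6])
{
	for (int i = 0; i < ATU_SIZE; i++) {
		const struct atu_entry *e = &t->e[i];

		if (e->used && e->db == db && memcmp(e->mac, mac, 6) == 0)
			return e->port;
	}
	return PORT_FLOOD;
}
```

Learning the station in database 1 and then looking it up in database 0 returns `PORT_FLOOD`, which is the situation above: the frame only reaches the station if flooding is enabled on the user port.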
      
      DSA has established (and enforced in other drivers: sja1105, felix,
      mt7530) that a VLAN-unaware port should use a private pvid, and not
      inherit the one from the bridge. The bridge's pvid should only be
      inherited when that bridge is VLAN-aware, so all state transitions need
      to be handled. On the other hand, all bridge VLANs should sit in the VTU
      starting with the moment when the bridge offloads them via switchdev,
      they are just not used.
      
      This solves the problem that Tobias sees because packets ingressing on
      VLAN-unaware user ports now get classified to VID 0, which is also the
      VID used by tag_dsa.c on xmit.
      
      Fixes: d82f8ab0 ("net: dsa: tag_dsa: offload the bridge forwarding process")
      Link: https://patchwork.kernel.org/project/netdevbpf/patch/20211003222312.284175-2-vladimir.oltean@nxp.com/#24491503
      Reported-by: Tobias Waldekranz <tobias@waldekranz.com>
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
  2. 08 Oct, 2021 2 commits
  3. 07 Oct, 2021 3 commits
    • iavf: fix double unlock of crit_lock · 54ee3943
      Stefan Assmann committed
      The crit_lock mutex could be unlocked twice as reported here
      https://lists.osuosl.org/pipermail/intel-wired-lan/Week-of-Mon-20210823/025525.html
      
      Remove the superfluous unlock. Technically the problem was already
      present before 5ac49f3c, as that commit only replaced the locking
      primitive without any functional change.
      Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
      Fixes: 5ac49f3c ("iavf: use mutexes for locking of critical sections")
      Fixes: bac84861 ("iavf: Refactor the watchdog state machine")
      Signed-off-by: Stefan Assmann <sassmann@kpanic.de>
      Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
    • i40e: Fix freeing of uninitialized misc IRQ vector · 2e5a2057
      Sylwester Dziedziuch committed
      When VSI setup failed in i40e_probe() as part of PF switch setup, the
      driver tried to free misc IRQ vectors in
      i40e_clear_interrupt_scheme() and produced a kernel Oops:
      
         Trying to free already-free IRQ 266
         WARNING: CPU: 0 PID: 5 at kernel/irq/manage.c:1731 __free_irq+0x9a/0x300
         Workqueue: events work_for_cpu_fn
         RIP: 0010:__free_irq+0x9a/0x300
         Call Trace:
         ? synchronize_irq+0x3a/0xa0
         free_irq+0x2e/0x60
         i40e_clear_interrupt_scheme+0x53/0x190 [i40e]
         i40e_probe.part.108+0x134b/0x1a40 [i40e]
         ? kmem_cache_alloc+0x158/0x1c0
         ? acpi_ut_update_ref_count.part.1+0x8e/0x345
         ? acpi_ut_update_object_reference+0x15e/0x1e2
         ? strstr+0x21/0x70
         ? irq_get_irq_data+0xa/0x20
         ? mp_check_pin_attr+0x13/0xc0
         ? irq_get_irq_data+0xa/0x20
         ? mp_map_pin_to_irq+0xd3/0x2f0
         ? acpi_register_gsi_ioapic+0x93/0x170
         ? pci_conf1_read+0xa4/0x100
         ? pci_bus_read_config_word+0x49/0x70
         ? do_pci_enable_device+0xcc/0x100
         local_pci_probe+0x41/0x90
         work_for_cpu_fn+0x16/0x20
         process_one_work+0x1a7/0x360
         worker_thread+0x1cf/0x390
         ? create_worker+0x1a0/0x1a0
         kthread+0x112/0x130
         ? kthread_flush_work_fn+0x10/0x10
         ret_from_fork+0x1f/0x40
      
      The problem is that at that point the misc IRQ vectors had not been
      allocated yet, so the driver tried to free IRQ vectors that were
      already free, which produced the call trace above.
      
      Add a check in i40e_clear_interrupt_scheme for __I40E_MISC_IRQ_REQUESTED
      PF state before calling i40e_free_misc_vector. This state is set only if
      misc IRQ vectors were properly initialized.
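The guard described above can be illustrated with a userspace model. This is a sketch, not the driver change itself: `pf_model` and the functions below are hypothetical, and `MISC_IRQ_REQUESTED` merely stands in for the __I40E_MISC_IRQ_REQUESTED state bit.

```c
#include <assert.h>
#include <stdbool.h>

#define MISC_IRQ_REQUESTED (1u << 0) /* stands in for the PF state bit */

/* Hypothetical model of the PF state involved in the bug. */
struct pf_model {
	unsigned int state;  /* state bits tracked by the driver */
	bool irq_allocated;  /* what the IRQ core actually holds */
	int free_calls;      /* number of times the vector was freed */
};

static void request_misc_vector(struct pf_model *pf)
{
	pf->irq_allocated = true;
	pf->state |= MISC_IRQ_REQUESTED; /* set only after a successful request */
}

static void free_misc_vector(struct pf_model *pf)
{
	/* Freeing a vector that was never requested is the reported bug. */
	assert(pf->irq_allocated);
	pf->irq_allocated = false;
	pf->free_calls++;
}

static void clear_interrupt_scheme(struct pf_model *pf)
{
	/* Only free the misc vector if it was requested, and only once. */
	if (pf->state & MISC_IRQ_REQUESTED) {
		pf->state &= ~MISC_IRQ_REQUESTED;
		free_misc_vector(pf);
	}
}
```

Calling `clear_interrupt_scheme()` on a zero-initialized model (the failed-probe path) frees nothing, while calling it after `request_misc_vector()` frees the vector exactly once, even if the cleanup runs again.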
      
      Fixes: c17401a1 ("i40e: use separate state bit for miscellaneous IRQ setup")
      Reported-by: PJ Waskiewicz <pwaskiewicz@jumptrading.com>
      Signed-off-by: Sylwester Dziedziuch <sylwesterx.dziedziuch@intel.com>
      Signed-off-by: Mateusz Palczewski <mateusz.palczewski@intel.com>
      Tested-by: Dave Switzer <david.switzer@intel.com>
      Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
    • i40e: fix endless loop under rtnl · 857b6c6f
      Jiri Benc committed
      The loop in i40e_get_capabilities can never end. The problem is that
      although i40e_aq_discover_capabilities returns with an error if there's
      a firmware problem, the returned error is not checked. There is a check for
      pf->hw.aq.asq_last_status but that value is set to I40E_AQ_RC_OK on most
      firmware problems.
      
      When i40e_aq_discover_capabilities encounters a firmware problem, it will
      encounter the same problem on its next invocation. As a result, the loop
      becomes endless. We hit this with I40E_ERR_ADMIN_QUEUE_TIMEOUT but looking
      at the code, it can happen with a range of other firmware errors.
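The failure mode and one way to bound the loop can be sketched in a userspace model. Everything here is an assumption made for illustration: `hw_model`, the error and status constants, and the buffer-resize retry are hypothetical stand-ins, and the sketch does not claim to match the actual change made to i40e_get_capabilities.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical return codes and admin-queue status values. */
enum { ERR_OK = 0, ERR_AQ = -1, ERR_TIMEOUT = -2 };
enum { RC_OK = 0, RC_ENOMEM = 1 };

struct hw_model {
	int asq_last_status;  /* models pf->hw.aq.asq_last_status */
	int needed_len;       /* buffer size the firmware wants */
	bool firmware_broken; /* every command times out */
	int calls;            /* how often the firmware was queried */
};

/* Models a capability query: may fail with or without a status code. */
static int discover_capabilities(struct hw_model *hw, int buf_len)
{
	assert(buf_len > 0);
	hw->calls++;

	if (hw->firmware_broken) {
		hw->asq_last_status = RC_OK; /* status stays OK on a timeout */
		return ERR_TIMEOUT;
	}
	if (buf_len < hw->needed_len) {
		hw->asq_last_status = RC_ENOMEM;
		return ERR_AQ;
	}
	hw->asq_last_status = RC_OK;
	return ERR_OK;
}

/* Retry only for "buffer too small"; give up on any other error. */
static int get_capabilities(struct hw_model *hw)
{
	int buf_len = 40;

	for (;;) {
		int err = discover_capabilities(hw, buf_len);

		if (!err)
			return ERR_OK;
		if (hw->asq_last_status == RC_ENOMEM) {
			buf_len = hw->needed_len; /* resize and retry */
			continue;
		}
		return err; /* checking 'err' is what ends the loop */
	}
}
```

With `firmware_broken` set, the model returns `ERR_TIMEOUT` after a single query. A loop that inspected only `asq_last_status` would see `RC_OK` alongside a non-zero return value and retry forever.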
      
      I don't know what the correct behavior should be: whether the firmware
      should be retried a few times, or whether pf->hw.aq.asq_last_status should
      be always set to the encountered firmware error (but then it would be
      pointless and can be just replaced by the i40e_aq_discover_capabilities
      return value). However, the current behavior with an endless loop under the
      rtnl mutex(!) is unacceptable and Intel has not submitted a fix, although we
      explained the bug to them 7 months ago.
      
      This may not be the best possible fix but it's better than hanging the whole
      system on a firmware bug.
      
      Fixes: 56a62fc8 ("i40e: init code and hardware support")
      Tested-by: Stefan Assmann <sassmann@redhat.com>
      Signed-off-by: Jiri Benc <jbenc@redhat.com>
      Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
      Tested-by: Dave Switzer <david.switzer@intel.com>
      Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
  4. 06 Oct, 2021 8 commits
  5. 05 Oct, 2021 3 commits
  6. 02 Oct, 2021 2 commits
  7. 01 Oct, 2021 12 commits
  8. 30 Sep, 2021 1 commit
  9. 29 Sep, 2021 7 commits