1. 15 2月, 2021 6 次提交
    • V
      net: dsa: propagate extack to .port_vlan_filtering · 89153ed6
      Vladimir Oltean 提交于
      Some drivers can't dynamically change the VLAN filtering option, or
      impose some restrictions, it would be nice to propagate this info
      through netlink instead of printing it to a kernel log that might never
      be read. Also netlink extack includes the module that emitted the
      message, which means that it's easier to figure out which ones are
      driver-generated errors as opposed to command misuse.
      Signed-off-by: NVladimir Oltean <vladimir.oltean@nxp.com>
      Reviewed-by: NFlorian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      89153ed6
    • V
      net: dsa: propagate extack to .port_vlan_add · 31046a5f
      Vladimir Oltean 提交于
      Allow drivers to communicate their restrictions to user space directly,
      instead of printing to the kernel log. Where the conversion would have
      been lossy and things like VLAN ID could no longer be conveyed (due to
      the lack of support for printf format specifier in netlink extack), I
      chose to keep the messages in full form to the kernel log only, and
      leave it up to individual driver maintainers to move more messages to
      extack.
      Signed-off-by: NVladimir Oltean <vladimir.oltean@nxp.com>
      Reviewed-by: NFlorian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      31046a5f
    • V
      net: dsa: tag_ocelot_8021q: add support for PTP timestamping · 0a6f17c6
      Vladimir Oltean 提交于
      For TX timestamping, we use the felix_txtstamp method which is common
      with the regular (non-8021q) ocelot tagger. This method says that skb
      deferral is needed, prepares a timestamp request ID, and puts a clone of
      the skb in a queue waiting for the timestamp IRQ.
      
      felix_txtstamp is called by dsa_skb_tx_timestamp() just before the
      tagger's xmit method. In the tagger xmit, we divert the packets
      classified by dsa_skb_tx_timestamp() as PTP towards the MMIO-based
      injection registers, and we declare them as dead towards dsa_slave_xmit.
      If not PTP, we proceed with normal tag_8021q stuff.
      
      Then the timestamp IRQ fires, the clone queued up from felix_txtstamp is
      matched to the TX timestamp retrieved from the switch's FIFO based on
      the timestamp request ID, and the clone is delivered to the stack.
      
      On RX, thanks to the VCAP IS2 rule that redirects the frames with an
      EtherType for 1588 towards two destinations:
      - the CPU port module (for MMIO based extraction) and
      - if the "no XTR IRQ" workaround is in place, the dsa_8021q CPU port
      the relevant data path processing starts in the ptp_classify_raw BPF
      classifier installed by DSA in the RX data path (post tagger, which is
      completely unaware that it saw a PTP packet).
      
      This time we can't reuse the same implementation of .port_rxtstamp that
      also works with the default ocelot tagger. That is because felix_rxtstamp
      is given an skb with a freshly stripped DSA header, and it says "I don't
      need deferral for its RX timestamp, it's right in it, let me show you";
      and it just points to the header right behind skb->data, from where it
      unpacks the timestamp and annotates the skb with it.
      
      The same thing cannot happen with tag_ocelot_8021q, because for one
      thing, the skb did not have an extraction frame header in the first
      place, but a VLAN tag with no timestamp information. So the code paths
      in felix_rxtstamp for the regular and 8021q tagger are completely
      independent. With tag_8021q, the timestamp must come from the packet's
      duplicate delivered to the CPU port module, but there is potentially
      complex logic to be handled [ and prone to reordering ] if we were to
      just start reading packets from the CPU port module, and try to match
      them to the one we received over Ethernet and which needs an RX
      timestamp. So we do something simple: we tell DSA "give me some time to
      think" (we request skb deferral by returning false from .port_rxtstamp)
      and we just drop the frame we got over Ethernet with no attempt to match
      it to anything - we just treat it as a notification that there's data to
      be processed from the CPU port module's queues. Then we proceed to read
      the packets from those, one by one, which we deliver up the stack,
      timestamped, using netif_rx - the same function that any driver would
      use anyway if it needed RX timestamp deferral. So the assumption is that
      we'll come across the PTP packet that triggered the CPU extraction
      notification eventually, but we don't know when exactly. Thanks to the
      VCAP IS2 trap/redirect rule and the exclusion of the CPU port module
      from the flooding replicators, only PTP frames should be present in the
      CPU port module's RX queues anyway.
      
      There is just one conflict between the VCAP IS2 trapping rule and the
      semantics of the BPF classifier. Namely, ptp_classify_raw() deems
      general messages as non-timestampable, but still, those are trapped to
      the CPU port module since they have an EtherType of ETH_P_1588. So, if
      the "no XTR IRQ" workaround is in place, we need to run another BPF
      classifier on the frames extracted over MMIO, to avoid duplicates being
      sent to the stack (once over Ethernet, once over MMIO). It doesn't look
      like it's possible to install VCAP IS2 rules based on keys extracted
      from the 1588 frame headers.
      Signed-off-by: NVladimir Oltean <vladimir.oltean@nxp.com>
      Reviewed-by: NFlorian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      0a6f17c6
    • V
      net: dsa: felix: setup MMIO filtering rules for PTP when using tag_8021q · c8c0ba4f
      Vladimir Oltean 提交于
      Since the tag_8021q tagger is software-defined, it has no means by
      itself for retrieving hardware timestamps of PTP event messages.
      
      Because we do want to support PTP on ocelot even with tag_8021q, we need
      to use the CPU port module for that. The RX timestamp is present in the
      Extraction Frame Header. And because we can't use NPI mode which redirects
      the CPU queues to an "external CPU" (meaning the ARM CPU running Linux),
      then we need to poll the CPU port module through the MMIO registers to
      retrieve TX and RX timestamps.
      
      Sadly, on NXP LS1028A, the Felix switch was integrated into the SoC
      without wiring the extraction IRQ line to the ARM GIC. So, if we want to
      be notified of any PTP packets received on the CPU port module, we have
      a problem.
      
      There is a possible workaround, which is to use the Ethernet CPU port as
      a notification channel that packets are available on the CPU port module
      as well. When a PTP packet is received by the DSA tagger (without timestamp,
      of course), we go to the CPU extraction queues, poll for it there, then
      we drop the original Ethernet packet and masquerade the packet retrieved
      over MMIO (plus the timestamp) as the original when we inject it up the
      stack.
      
      Create a quirk in struct felix is selected by the Felix driver (but not
      by Seville, since that doesn't support PTP at all). We want to do this
      such that the workaround is minimally invasive for future switches that
      don't require this workaround.
      
      The only traffic for which we need timestamps is PTP traffic, so add a
      redirection rule to the CPU port module for this. Currently we only have
      the need for PTP over L2, so redirection rules for UDP ports 319 and 320
      are TBD for now.
      
      Note that for the workaround of matching of PTP-over-Ethernet-port with
      PTP-over-MMIO queues to work properly, both channels need to be
      absolutely lossless. There are two parts to achieving that:
      - We keep flow control enabled on the tag_8021q CPU port
      - We put the DSA master interface in promiscuous mode, so it will never
        drop a PTP frame (for the profiles we are interested in, these are
        sent to the multicast MAC addresses of 01-80-c2-00-00-0e and
        01-1b-19-00-00-00).
      Signed-off-by: NVladimir Oltean <vladimir.oltean@nxp.com>
      Reviewed-by: NFlorian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      c8c0ba4f
    • V
      net: dsa: tag_ocelot: create separate tagger for Seville · 7c4bb540
      Vladimir Oltean 提交于
      The ocelot tagger is a hot mess currently, it relies on memory
      initialized by the attached driver for basic frame transmission.
      This is against all that DSA tagging protocols stand for, which is that
      the transmission and reception of a DSA-tagged frame, the data path,
      should be independent from the switch control path, because the tag
      protocol is in principle hot-pluggable and reusable across switches
      (even if in practice it wasn't until very recently). But if another
      driver like dsa_loop wants to make use of tag_ocelot, it couldn't.
      
      This was done to have common code between Felix and Ocelot, which have
      one bit difference in the frame header format. Quoting from commit
      67c24049 ("net: dsa: felix: create a template for the DSA tags on
      xmit"):
      
          Other alternatives have been analyzed, such as:
          - Create a separate tag_seville.c: too much code duplication for just 1
            bit field difference.
          - Create a separate DSA_TAG_PROTO_SEVILLE under tag_ocelot.c, just like
            tag_brcm.c, which would have a separate .xmit function. Again, too
            much code duplication for just 1 bit field difference.
          - Allocate the template from the init function of the tag_ocelot.c
            module, instead of from the driver: couldn't figure out a method of
            accessing the correct port template corresponding to the correct
            tagger in the .xmit function.
      
      The really interesting part is that Seville should have had its own
      tagging protocol defined - it is not compatible on the wire with Ocelot,
      even for that single bit. In principle, a packet generated by
      DSA_TAG_PROTO_OCELOT when booted on NXP LS1028A would look in a certain
      way, but when booted on NXP T1040 it would look differently. The reverse
      is also true: a packet generated by a Seville switch would be
      interpreted incorrectly by Wireshark if it was told it was generated by
      an Ocelot switch.
      
      Actually things are a bit more nuanced. If we concentrate only on the
      DSA tag, what I said above is true, but Ocelot/Seville also support an
      optional DSA tag prefix, which can be short or long, and it is possible
      to distinguish the two taggers based on an integer constant put in that
      prefix. Nonetheless, creating a separate tagger is still justified,
      since the tag prefix is optional, and without it, there is again no way
      to distinguish.
      
      Claiming backwards binary compatibility is a bit more tough, since I've
      already changed the format of tag_ocelot once, in commit 5124197c
      ("net: dsa: tag_ocelot: use a short prefix on both ingress and egress").
      Therefore I am not very concerned with treating this as a bugfix and
      backporting it to stable kernels (which would be another mess due to the
      fact that there would be lots of conflicts with the other DSA_TAG_PROTO*
      definitions). It's just simpler to say that the string values of the
      taggers have ABI value starting with kernel 5.12, which will be when the
      changing of tag protocol via /sys/class/net/<dsa-master>/dsa/tagging
      goes live.
      Signed-off-by: NVladimir Oltean <vladimir.oltean@nxp.com>
      Reviewed-by: NFlorian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      7c4bb540
    • V
      net: mscc: ocelot: use common tag parsing code with DSA · 40d3f295
      Vladimir Oltean 提交于
      The Injection Frame Header and Extraction Frame Header that the switch
      prepends to frames over the NPI port is also prepended to frames
      delivered over the CPU port module's queues.
      
      Let's unify the handling of the frame headers by making the ocelot
      driver call some helpers exported by the DSA tagger. Among other things,
      this allows us to get rid of the strange cpu_to_be32 when transmitting
      the Injection Frame Header on ocelot, since the packing API uses
      network byte order natively (when "quirks" is 0).
      
      The comments above ocelot_gen_ifh talk about setting pop_cnt to 3, and
      the cpu extraction queue mask to something, but the code doesn't do it,
      so we don't do it either.
      Signed-off-by: NVladimir Oltean <vladimir.oltean@nxp.com>
      Reviewed-by: NFlorian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      40d3f295
  2. 13 2月, 2021 5 次提交
    • V
      net: dsa: sja1105: offload bridge port flags to device · 4d942354
      Vladimir Oltean 提交于
      The chip can configure unicast flooding, broadcast flooding and learning.
      Learning is per port, while flooding is per {ingress, egress} port pair
      and we need to configure the same value for all possible ingress ports
      towards the requested one.
      
      While multicast flooding is not officially supported, we can hack it by
      using a feature of the second generation (P/Q/R/S) devices, which is that
      FDB entries are maskable, and multicast addresses always have an odd
      first octet. So by putting a match-all for 00:01:00:00:00:00 addr and
      00:01:00:00:00:00 mask at the end of the FDB, we make sure that it is
      always checked last, and does not take precedence in front of any other
      MDB. So it behaves effectively as an unknown multicast entry.
      
      For the first generation switches, this feature is not available, so
      unknown multicast will always be treated the same as unknown unicast.
      So the only thing we can do is request the user to offload the settings
      for these 2 flags in tandem, i.e.
      
      ip link set swp2 type bridge_slave flood off
      Error: sja1105: This chip cannot configure multicast flooding independently of unicast.
      ip link set swp2 type bridge_slave flood off mcast_flood off
      ip link set swp2 type bridge_slave mcast_flood on
      Error: sja1105: This chip cannot configure multicast flooding independently of unicast.
      Signed-off-by: NVladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      4d942354
    • V
      net: mscc: ocelot: offload bridge port flags to device · 421741ea
      Vladimir Oltean 提交于
      We should not be unconditionally enabling address learning, since doing
      that is actively detrimential when a port is standalone and not offloading
      a bridge. Namely, if a port in the switch is standalone and others are
      offloading the bridge, then we could enter a situation where we learn an
      address towards the standalone port, but the bridged ports could not
      forward the packet there, because the CPU is the only path between the
      standalone and the bridged ports. The solution of course is to not
      enable address learning unless the bridge asks for it.
      
      We need to set up the initial port flags for no learning and flooding
      everything, and also when the port joins and leaves the bridge.
      The flood configuration was already configured ok for standalone mode
      in ocelot_init, we just need to disable learning in ocelot_init_port.
      Signed-off-by: NVladimir Oltean <vladimir.oltean@nxp.com>
      Reviewed-by: NAlexandre Belloni <alexandre.belloni@bootlin.com>
      Reviewed-by: NFlorian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      421741ea
    • V
      net: mscc: ocelot: use separate flooding PGID for broadcast · b360d94f
      Vladimir Oltean 提交于
      In preparation of offloading the bridge port flags which have
      independent settings for unknown multicast and for broadcast, we should
      also start reserving one destination Port Group ID for the flooding of
      broadcast packets, to allow configuring it individually.
      Signed-off-by: NVladimir Oltean <vladimir.oltean@nxp.com>
      Reviewed-by: NFlorian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      b360d94f
    • V
      net: dsa: felix: restore multicast flood to CPU when NPI tagger reinitializes · 6edb9e8d
      Vladimir Oltean 提交于
      ocelot_init sets up PGID_MC to include the CPU port module, and that is
      fine, but the ocelot-8021q tagger removes the CPU port module from the
      unknown multicast replicator. So after a transition from the default
      ocelot tagger towards ocelot-8021q and then again towards ocelot,
      multicast flooding towards the CPU port module will be disabled.
      
      Fixes: e21268ef ("net: dsa: felix: perform switch setup for tag_8021q")
      Signed-off-by: NVladimir Oltean <vladimir.oltean@nxp.com>
      Reviewed-by: NFlorian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      6edb9e8d
    • V
      net: dsa: act as passthrough for bridge port flags · a8b659e7
      Vladimir Oltean 提交于
      There are multiple ways in which a PORT_BRIDGE_FLAGS attribute can be
      expressed by the bridge through switchdev, and not all of them can be
      emulated by DSA mid-layer API at the same time.
      
      One possible configuration is when the bridge offloads the port flags
      using a mask that has a single bit set - therefore only one feature
      should change. However, DSA currently groups together unicast and
      multicast flooding in the .port_egress_floods method, which limits our
      options when we try to add support for turning off broadcast flooding:
      do we extend .port_egress_floods with a third parameter which b53 and
      mv88e6xxx will ignore? But that means that the DSA layer, which
      currently implements the PRE_BRIDGE_FLAGS attribute all by itself, will
      see that .port_egress_floods is implemented, and will report that all 3
      types of flooding are supported - not necessarily true.
      
      Another configuration is when the user specifies more than one flag at
      the same time, in the same netlink message. If we were to create one
      individual function per offloadable bridge port flag, we would limit the
      expressiveness of the switch driver of refusing certain combinations of
      flag values. For example, a switch may not have an explicit knob for
      flooding of unknown multicast, just for flooding in general. In that
      case, the only correct thing to do is to allow changes to BR_FLOOD and
      BR_MCAST_FLOOD in tandem, and never allow mismatched values. But having
      a separate .port_set_unicast_flood and .port_set_multicast_flood would
      not allow the driver to possibly reject that.
      
      Also, DSA doesn't consider it necessary to inform the driver that a
      SWITCHDEV_ATTR_ID_BRIDGE_MROUTER attribute was offloaded, because it
      just calls .port_egress_floods for the CPU port. When we'll add support
      for the plain SWITCHDEV_ATTR_ID_PORT_MROUTER, that will become a real
      problem because the flood settings will need to be held statefully in
      the DSA middle layer, otherwise changing the mrouter port attribute will
      impact the flooding attribute. And that's _assuming_ that the underlying
      hardware doesn't have anything else to do when a multicast router
      attaches to a port than flood unknown traffic to it.  If it does, there
      will need to be a dedicated .port_set_mrouter anyway.
      
      So we need to let the DSA drivers see the exact form that the bridge
      passes this switchdev attribute in, otherwise we are standing in the
      way. Therefore we also need to use this form of language when
      communicating to the driver that it needs to configure its initial
      (before bridge join) and final (after bridge leave) port flags.
      
      The b53 and mv88e6xxx drivers are converted to the passthrough API and
      their implementation of .port_egress_floods is split into two: a
      function that configures unicast flooding and another for multicast.
      The mv88e6xxx implementation is quite hairy, and it turns out that
      the implementations of unknown unicast flooding are actually the same
      for 6185 and for 6352:
      
      behind the confusing names actually lie two individual bits:
      NO_UNKNOWN_MC -> FLOOD_UC = 0x4 = BIT(2)
      NO_UNKNOWN_UC -> FLOOD_MC = 0x8 = BIT(3)
      
      so there was no reason to entangle them in the first place.
      
      Whereas the 6185 writes to MV88E6185_PORT_CTL0_FORWARD_UNKNOWN of
      PORT_CTL0, which has the exact same bit index. I have left the
      implementations separate though, for the only reason that the names are
      different enough to confuse me, since I am not able to double-check with
      a user manual. The multicast flooding setting for 6185 is in a different
      register than for 6352 though.
      Signed-off-by: NVladimir Oltean <vladimir.oltean@nxp.com>
      Reviewed-by: NFlorian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      a8b659e7
  3. 12 2月, 2021 3 次提交
  4. 10 2月, 2021 1 次提交
    • V
      net: dsa: felix: implement port flushing on .phylink_mac_link_down · eb4733d7
      Vladimir Oltean 提交于
      There are several issues which may be seen when the link goes down while
      forwarding traffic, all of which can be attributed to the fact that the
      port flushing procedure from the reference manual was not closely
      followed.
      
      With flow control enabled on both the ingress port and the egress port,
      it may happen when a link goes down that Ethernet packets are in flight.
      In flow control mode, frames are held back and not dropped. When there
      is enough traffic in flight (example: iperf3 TCP), then the ingress port
      might enter congestion and never exit that state. This is a problem,
      because it is the egress port's link that went down, and that has caused
      the inability of the ingress port to send packets to any other port.
      This is solved by flushing the egress port's queues when it goes down.
      
      There is also a problem when performing stream splitting for
      IEEE 802.1CB traffic (not yet upstream, but a sort of multicast,
      basically). There, if one port from the destination ports mask goes
      down, splitting the stream towards the other destinations will no longer
      be performed. This can be traced down to this line:
      
      	ocelot_port_writel(ocelot_port, 0, DEV_MAC_ENA_CFG);
      
      which should have been instead, as per the reference manual:
      
      	ocelot_port_rmwl(ocelot_port, 0, DEV_MAC_ENA_CFG_RX_ENA,
      			 DEV_MAC_ENA_CFG);
      
      Basically only DEV_MAC_ENA_CFG_RX_ENA should be disabled, but not
      DEV_MAC_ENA_CFG_TX_ENA - I don't have further insight into why that is
      the case, but apparently multicasting to several ports will cause issues
      if at least one of them doesn't have DEV_MAC_ENA_CFG_TX_ENA set.
      
      I am not sure what the state of the Ocelot VSC7514 driver is, but
      probably not as bad as Felix/Seville, since VSC7514 uses phylib and has
      the following in ocelot_adjust_link:
      
      	if (!phydev->link)
      		return;
      
      therefore the port is not really put down when the link is lost, unlike
      the DSA drivers which use .phylink_mac_link_down for that.
      
      Nonetheless, I put ocelot_port_flush() in the common ocelot.c because it
      needs to access some registers from drivers/net/ethernet/mscc/ocelot_rew.h
      which are not exported in include/soc/mscc/ and a bugfix patch should
      probably not move headers around.
      
      Fixes: bdeced75 ("net: dsa: felix: Add PCS operations for PHYLINK")
      Signed-off-by: NVladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      eb4733d7
  5. 07 2月, 2021 1 次提交
  6. 05 2月, 2021 2 次提交
  7. 02 2月, 2021 3 次提交
  8. 30 1月, 2021 5 次提交
    • V
      net: dsa: felix: perform switch setup for tag_8021q · e21268ef
      Vladimir Oltean 提交于
      Unlike sja1105, the only other user of the software-defined tag_8021q.c
      tagger format, the implementation we choose for the Felix DSA switch
      driver preserves full functionality under a vlan_filtering bridge
      (i.e. IP termination works through the DSA user ports under all
      circumstances).
      
      The tag_8021q protocol just wants:
      - Identifying the ingress switch port based on the RX VLAN ID, as seen
        by the CPU. We achieve this by using the TCAM engines (which are also
        used for tc-flower offload) to push the RX VLAN as a second, outer
        tag, on egress towards the CPU port.
      - Steering traffic injected into the switch from the network stack
        towards the correct front port based on the TX VLAN, and consuming
        (popping) that header on the switch's egress.
      
      A tc-flower pseudocode of the static configuration done by the driver
      would look like this:
      
      $ tc qdisc add dev <cpu-port> clsact
      $ for eth in swp0 swp1 swp2 swp3; do \
      	tc filter add dev <cpu-port> egress flower indev ${eth} \
      		action vlan push id <rxvlan> protocol 802.1ad; \
      	tc filter add dev <cpu-port> ingress protocol 802.1Q flower
      		vlan_id <txvlan> action vlan pop \
      		action mirred egress redirect dev ${eth}; \
      done
      
      but of course since DSA does not register network interfaces for the CPU
      port, this configuration would be impossible for the user to do. Also,
      due to the same reason, it is impossible for the user to inadvertently
      delete these rules using tc. These rules do not collide in any way with
      tc-flower, they just consume some TCAM space, which is something we can
      live with.
      Signed-off-by: NVladimir Oltean <vladimir.oltean@nxp.com>
      Reviewed-by: NFlorian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: NJakub Kicinski <kuba@kernel.org>
      e21268ef
    • V
      net: dsa: add a second tagger for Ocelot switches based on tag_8021q · 7c83a7c5
      Vladimir Oltean 提交于
      There are use cases for which the existing tagger, based on the NPI
      (Node Processor Interface) functionality, is insufficient.
      
      Namely:
      - Frames injected through the NPI port bypass the frame analyzer, so no
        source address learning is performed, no TSN stream classification,
        etc.
      - Flow control is not functional over an NPI port (PAUSE frames are
        encapsulated in the same Extraction Frame Header as all other frames)
      - There can be at most one NPI port configured for an Ocelot switch. But
        in NXP LS1028A and T1040 there are two Ethernet CPU ports. The non-NPI
        port is currently either disabled, or operated as a plain user port
        (albeit an internally-facing one). Having the ability to configure the
        two CPU ports symmetrically could pave the way for e.g. creating a LAG
        between them, to increase bandwidth seamlessly for the system.
      
      So there is a desire to have an alternative to the NPI mode. This change
      keeps the default tagger for the Seville and Felix switches as "ocelot",
      but it can be changed via the following device attribute:
      
      echo ocelot-8021q > /sys/class/<dsa-master>/dsa/tagging
      Signed-off-by: NVladimir Oltean <vladimir.oltean@nxp.com>
      Reviewed-by: NFlorian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: NJakub Kicinski <kuba@kernel.org>
      7c83a7c5
    • V
      net: dsa: felix: convert to the new .change_tag_protocol DSA API · adb3dccf
      Vladimir Oltean 提交于
      In expectation of the new tag_ocelot_8021q tagger implementation, we
      need to be able to do runtime switchover between one tagger and another.
      So we must structure the existing code for the current NPI-based tagger
      in a certain way.
      
      We move the felix_npi_port_init function in expectation of the future
      driver configuration necessary for tag_ocelot_8021q: we would like to
      not have the NPI-related bits interspersed with the tag_8021q bits.
      
      The conversion from this:
      
      	ocelot_write_rix(ocelot,
      			 ANA_PGID_PGID_PGID(GENMASK(ocelot->num_phys_ports, 0)),
      			 ANA_PGID_PGID, PGID_UC);
      
      to this:
      
      	cpu_flood = ANA_PGID_PGID_PGID(BIT(ocelot->num_phys_ports));
      	ocelot_rmw_rix(ocelot, cpu_flood, cpu_flood, ANA_PGID_PGID, PGID_UC);
      
      is perhaps non-trivial, but is nonetheless non-functional. The PGID_UC
      (replicator for unknown unicast) is already configured out of hardware
      reset to flood to all ports except ocelot->num_phys_ports (the CPU port
      module). All we change is that we use a read-modify-write to only add
      the CPU port module to the unknown unicast replicator, as opposed to
      doing a full write to the register.
      Signed-off-by: NVladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: NJakub Kicinski <kuba@kernel.org>
      adb3dccf
    • V
      net: mscc: ocelot: don't use NPI tag prefix for the CPU port module · cacea62f
      Vladimir Oltean 提交于
      Context: Ocelot switches put the injection/extraction frame header in
      front of the Ethernet header. When used in NPI mode, a DSA master would
      see junk instead of the destination MAC address, and it would most
      likely drop the packets. So the Ocelot frame header can have an optional
      prefix, which is just "ff:ff:ff:ff:ff:fe > ff:ff:ff:ff:ff:ff" padding
      put before the actual tag (still before the real Ethernet header) such
      that the DSA master thinks it's looking at a broadcast frame with a
      strange EtherType.
      
      Unfortunately, a lesson learned in commit 69df578c ("net: mscc:
      ocelot: eliminate confusion between CPU and NPI port") seems to have
      been forgotten in the meanwhile.
      
      The CPU port module and the NPI port have independent settings for the
      length of the tag prefix. However, the driver is using the same variable
      to program both of them.
      
      There is no reason really to use any tag prefix with the CPU port
      module, since that is not connected to any Ethernet port. So this patch
      makes the inj_prefix and xtr_prefix variables apply only to the NPI
      port (which the switchdev ocelot_vsc7514 driver does not use).
      Signed-off-by: NVladimir Oltean <vladimir.oltean@nxp.com>
      Reviewed-by: NFlorian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: NJakub Kicinski <kuba@kernel.org>
      cacea62f
    • K
      net: dsa: hellcreek: Add missing TAPRIO dependency · 6c13d75b
      Kurt Kanzenbach 提交于
      Add missing dependency to TAPRIO to avoid build failures such as:
      
      |ERROR: modpost: "taprio_offload_get" [drivers/net/dsa/hirschmann/hellcreek_sw.ko] undefined!
      |ERROR: modpost: "taprio_offload_free" [drivers/net/dsa/hirschmann/hellcreek_sw.ko] undefined!
      
      Fixes: 24dfc6eb ("net: dsa: hellcreek: Add TAPRIO offloading support")
      Reported-by: NRandy Dunlap <rdunlap@infradead.org>
      Signed-off-by: NKurt Kanzenbach <kurt@linutronix.de>
      Acked-by: NArnd Bergmann <arnd@arndb.de>
      Acked-by: Randy Dunlap <rdunlap@infradead.org> # build-tested
      Link: https://lore.kernel.org/r/20210128163338.22665-1-kurt@linutronix.deSigned-off-by: NJakub Kicinski <kuba@kernel.org>
      6c13d75b
  9. 28 1月, 2021 2 次提交
  10. 27 1月, 2021 2 次提交
  11. 26 1月, 2021 1 次提交
  12. 24 1月, 2021 3 次提交
  13. 21 1月, 2021 3 次提交
  14. 20 1月, 2021 1 次提交
  15. 19 1月, 2021 1 次提交
  16. 16 1月, 2021 1 次提交
    • V
      net: mscc: ocelot: configure watermarks using devlink-sb · f59fd9ca
      Vladimir Oltean 提交于
      Using devlink-sb, we can configure 12/16 (the important 75%) of the
      switch's controlling watermarks for congestion drops, and we can monitor
      50% of the watermark occupancies (we can monitor the reservation
      watermarks, but not the sharing watermarks, which are exposed as pool
      sizes).
      
      The following definitions can be made:
      
      SB_BUF=0 # The devlink-sb for frame buffers
      SB_REF=1 # The devlink-sb for frame references
      POOL_ING=0 # The pool for ingress traffic. Both devlink-sb instances
                 # have one of these.
      POOL_EGR=1 # The pool for egress traffic. Both devlink-sb instances
                 # have one of these.
      
      Editing the hardware watermarks is done in the following way:
      BUF_xxxx_I is accessed when sb=$SB_BUF and pool=$POOL_ING
      REF_xxxx_I is accessed when sb=$SB_REF and pool=$POOL_ING
      BUF_xxxx_E is accessed when sb=$SB_BUF and pool=$POOL_EGR
      REF_xxxx_E is accessed when sb=$SB_REF and pool=$POOL_EGR
      
      Configuring the sharing watermarks for COL_SHR(dp=0) is done implicitly
      by modifying the corresponding pool size. By default, the pool size has
      maximum size, so this can be skipped.
      
      devlink sb pool set pci/0000:00:00.5 sb $SB_BUF pool $POOL_ING \
      	size 129840 thtype static
      
      Since by default there is no buffer reservation, the above command has
      maxed out BUF_COL_SHR_I(dp=0).
      
      Configuring the per-port reservation watermark (P_RSRV) is done in the
      following way:
      
      devlink sb port pool set pci/0000:00:00.5/0 sb $SB_BUF \
      	pool $POOL_ING th 1000
      
      The above command sets BUF_P_RSRV_I(port 0) to 1000 bytes. After this
      command, the sharing watermarks are internally reconfigured with 1000
      bytes less, i.e. from 129840 bytes to 128840 bytes.
      
      Configuring the per-port-tc reservation watermarks (Q_RSRV) is done in
      the following way:
      
      for tc in {0..7}; do
      	devlink sb tc bind set pci/0000:00:00.5/0 sb 0 tc $tc \
      		type ingress pool $POOL_ING \
      		th 3000
      done
      
      The above command sets BUF_Q_RSRV_I(port 0, tc 0..7) to 3000 bytes.
      The sharing watermarks are again reconfigured with 24000 bytes less.
      Signed-off-by: NVladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: NJakub Kicinski <kuba@kernel.org>
      f59fd9ca