1. 06 1月, 2021 1 次提交
  2. 05 10月, 2020 1 次提交
    • V
      net: dsa: propagate switchdev vlan_filtering prepare phase to drivers · 2e554a7a
      Vladimir Oltean 提交于
      A driver may refuse to enable VLAN filtering for any reason beyond what
      the DSA framework cares about, such as:
      - having tc-flower rules that rely on the switch being VLAN-aware
      - the particular switch does not support VLAN, even if the driver does
        (the DSA framework just checks for the presence of the .port_vlan_add
        and .port_vlan_del pointers)
      - simply not supporting this configuration to be toggled at runtime
      
      Currently, when a driver rejects a configuration it cannot support, it
      does this from the commit phase, which triggers various warnings in
      switchdev.
      
      So propagate the prepare phase to drivers, to give them the ability to
      refuse invalid configurations cleanly and avoid the warnings.
      
      Since we need to modify all function prototypes and check for the
      prepare phase from within the drivers, take that opportunity and move
      the existing driver restrictions within the prepare phase where that is
      possible and easy.
      
      Cc: Florian Fainelli <f.fainelli@gmail.com>
      Cc: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
      Cc: Hauke Mehrtens <hauke@hauke-m.de>
      Cc: Woojung Huh <woojung.huh@microchip.com>
      Cc: Microchip Linux Driver Support <UNGLinuxDriver@microchip.com>
      Cc: Sean Wang <sean.wang@mediatek.com>
      Cc: Landen Chao <Landen.Chao@mediatek.com>
      Cc: Andrew Lunn <andrew@lunn.ch>
      Cc: Vivien Didelot <vivien.didelot@gmail.com>
      Cc: Jonathan McDowell <noodles@earth.li>
      Cc: Linus Walleij <linus.walleij@linaro.org>
      Cc: Alexandre Belloni <alexandre.belloni@bootlin.com>
      Cc: Claudiu Manoil <claudiu.manoil@nxp.com>
      Signed-off-by: NVladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      2e554a7a
  3. 26 9月, 2020 2 次提交
  4. 21 9月, 2020 1 次提交
    • V
      net: dsa: tag_8021q: add VLANs to the master interface too · bbed0bbd
      Vladimir Oltean 提交于
      The whole purpose of tag_8021q is to send VLAN-tagged traffic to the
      CPU, from which the driver can decode the source port and switch id.
      
      Currently this only works if the VLAN filtering on the master is
      disabled. Change that by explicitly adding code to tag_8021q.c to add
      the VLANs corresponding to the tags to the filter of the master
      interface.
      
      Because we now need to call vlan_vid_add, then we also need to hold the
      RTNL mutex. Propagate that requirement to the callers of dsa_8021q_setup
      and modify the existing call sites as appropriate. Note that one call
      path, sja1105_best_effort_vlan_filtering_set -> sja1105_vlan_filtering
      -> sja1105_setup_8021q_tagging, was already holding this lock.
      Signed-off-by: NVladimir Oltean <vladimir.oltean@nxp.com>
      Reviewed-by: NFlorian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      bbed0bbd
  5. 12 9月, 2020 2 次提交
    • V
      net: dsa: tag_8021q: add a context structure · 5899ee36
      Vladimir Oltean 提交于
      While working on another tag_8021q driver implementation, some things
      became apparent:
      
      - It is not mandatory for a DSA driver to offload the tag_8021q VLANs by
        using the VLAN table per se. For example, it can add custom TCAM rules
        that simply encapsulate RX traffic, and redirect & decapsulate rules
        for TX traffic. For such a driver, it makes no sense to receive the
        tag_8021q configuration through the same callback as it receives the
        VLAN configuration from the bridge and the 8021q modules.
      
      - Currently, sja1105 (the only tag_8021q user) sets a
        priv->expect_dsa_8021q variable to distinguish between the bridge
        calling, and tag_8021q calling. That can be improved, to say the
        least.
      
      - The crosschip bridging operations are, in fact, stateful already. The
        list of crosschip_links must be kept by the caller and passed to the
        relevant tag_8021q functions.
      
      So it would be nice if the tag_8021q configuration was more
      self-contained. This patch attempts to do that.
      
      Create a struct dsa_8021q_context which encapsulates a struct
      dsa_switch, and has 2 function pointers for adding and deleting a VLAN.
      These will replace the previous channel to the driver, which was through
      the .port_vlan_add and .port_vlan_del callbacks of dsa_switch_ops.
      
      Also put the list of crosschip_links into this dsa_8021q_context.
      Drivers that don't support cross-chip bridging can simply omit to
      initialize this list, as long as they dont call any cross-chip function.
      
      The sja1105_vlan_add and sja1105_vlan_del functions are refactored into
      a smaller sja1105_vlan_add_one, which now has 2 entry points:
      - sja1105_vlan_add, from struct dsa_switch_ops
      - sja1105_dsa_8021q_vlan_add, from the tag_8021q ops
      But even this change is fairly trivial. It just reflects the fact that
      for sja1105, the VLANs from these 2 channels end up in the same hardware
      table. However that is not necessarily true in the general sense (and
      that's the reason for making this change).
      
      The rest of the patch is mostly plain refactoring of "ds" -> "ctx". The
      dsa_8021q_context structure needs to be propagated because adding a VLAN
      is now done through the ops function pointers inside of it.
      Signed-off-by: NVladimir Oltean <vladimir.oltean@nxp.com>
      Reviewed-by: NFlorian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      5899ee36
    • V
      net: dsa: tag_8021q: setup tagging via a single function call · 7e092af2
      Vladimir Oltean 提交于
      There is no point in calling dsa_port_setup_8021q_tagging for each
      individual port. Additionally, it will become more difficult to do that
      when we'll have a context structure to tag_8021q (next patch). So
      refactor this now.
      Signed-off-by: NVladimir Oltean <vladimir.oltean@nxp.com>
      Reviewed-by: NFlorian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      7e092af2
  6. 25 8月, 2020 1 次提交
  7. 06 8月, 2020 1 次提交
    • V
      net: dsa: sja1105: use detected device id instead of DT one on mismatch · 0b0e2997
      Vladimir Oltean 提交于
      Although we can detect the chip revision 100% at runtime, it is useful
      to specify it in the device tree compatible string too, because
      otherwise there would be no way to assess the correctness of device tree
      bindings statically, without booting a board (only some switch versions
      have internal RGMII delays and/or an SGMII port).
      
      But for testing the P/Q/R/S support, what I have is a reworked board
      with the SJA1105T replaced by a pin-compatible SJA1105Q, and I don't
      want to keep a separate device tree blob just for this one-off board.
      Since just the chip has been replaced, its RGMII delay setup is
      inherently the same (meaning: delays added by the PHY on the slave
      ports, and by PCB traces on the fixed-link CPU port).
      
      For this board, I'd rather have the driver shout at me, but go ahead and
      use what it found even if it doesn't match what it's been told is there.
      
      [    2.970826] sja1105 spi0.1: Device tree specifies chip SJA1105T but found SJA1105Q, please fix it!
      [    2.980010] sja1105 spi0.1: Probed switch chip: SJA1105Q
      [    3.005082] sja1105 spi0.1: Enabled switch tagging
      Signed-off-by: NVladimir Oltean <olteanv@gmail.com>
      Reviewed-by: NFlorian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      0b0e2997
  8. 30 6月, 2020 1 次提交
  9. 30 5月, 2020 1 次提交
    • V
      net: dsa: sja1105: avoid invalid state in sja1105_vlan_filtering · aef31718
      Vladimir Oltean 提交于
      Be there 2 switches spi/spi2.0 and spi/spi2.1 in a cross-chip setup,
      both under the same VLAN-filtering bridge, both in the
      SJA1105_VLAN_BEST_EFFORT state.
      
      If we try to change the VLAN state of one of the switches (to
      SJA1105_VLAN_FILTERING_FULL) we get the following error:
      
      devlink dev param set spi/spi2.1 name best_effort_vlan_filtering value
      false cmode runtime
      [   38.325683] sja1105 spi2.1: Not allowed to overcommit frame memory.
                     L2 memory partitions and VL memory partitions share the
                     same space. The sum of all 16 memory partitions is not
                     allowed to be larger than 929 128-byte blocks (or 910
                     with retagging). Please adjust
                     l2-forwarding-parameters-table.part_spc and/or
                     vl-forwarding-parameters-table.partspc.
      [   38.356803] sja1105 spi2.1: Invalid config, cannot upload
      
      This is because the spi/spi2.1 switch doesn't support tagging anymore in
      the SJA1105_VLAN_FILTERING_FULL state, so it doesn't need to have any
      retagging rules defined. Great, so it can use more frame memory
      (retagging consumes extra memory).
      
      But the built-in low-level static config checker from the sja1105 driver
      says "not so fast, you've increased the frame memory to non-retagging
      values, but you still kept the retagging rules in the static config".
      
      So we need to rebuild the VLAN table immediately before re-uploading the
      static config, operation which will take care, based on the new VLAN
      state, of removing the retagging rules.
      
      Fixes: 3f01c91a ("net: dsa: sja1105: implement VLAN retagging for dsa_8021q sub-VLANs")
      Signed-off-by: NVladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      aef31718
  10. 29 5月, 2020 1 次提交
    • V
      net: dsa: sja1105: offload the Credit-Based Shaper qdisc · 4d752508
      Vladimir Oltean 提交于
      SJA1105, being AVB/TSN switches, provide hardware assist for the
      Credit-Based Shaper as described in the IEEE 8021Q-2018 document.
      
      First generation has 10 shapers, freely assignable to any of the 4
      external ports and 8 traffic classes, and second generation has 16
      shapers.
      
      The Credit-Based Shaper tables are accessed through the dynamic
      reconfiguration interface, so we have to restore them manually after a
      switch reset. The tables are backed up by the static config only on
      P/Q/R/S, and we don't want to add custom code only for that family,
      since the procedure that is in place now works for both.
      
      Tested with the following commands:
      
      data_rate_kbps=67000
      port_transmit_rate_kbps=1000000
      idleslope=$data_rate_kbps
      sendslope=$(($idleslope - $port_transmit_rate_kbps))
      locredit=$((-0x80000000))
      hicredit=$((0x7fffffff))
      tc qdisc add dev swp2 root handle 1: mqprio hw 0 num_tc 8 \
              map 0 1 2 3 4 5 6 7 \
              queues 1@0 1@1 1@2 1@3 1@4 1@5 1@6 1@7
      tc qdisc replace dev swp2 parent 1:1 cbs \
              idleslope $idleslope \
              sendslope $sendslope \
              hicredit $hicredit \
              locredit $locredit \
              offload 1
      Signed-off-by: NVladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      4d752508
  11. 13 5月, 2020 10 次提交
  12. 11 5月, 2020 1 次提交
    • V
      net: dsa: sja1105: implement cross-chip bridging operations · ac02a451
      Vladimir Oltean 提交于
      sja1105 uses dsa_8021q for DSA tagging, a format which is VLAN at heart
      and which is compatible with cascading. A complete description of this
      tagging format is in net/dsa/tag_8021q.c, but a quick summary is that
      each external-facing port tags incoming frames with a unique pvid, and
      this special VLAN is transmitted as tagged towards the inside of the
      system, and as untagged towards the exterior. The tag encodes the switch
      id and the source port index.
      
      This means that cross-chip bridging for dsa_8021q only entails adding
      the dsa_8021q pvids of one switch to the RX filter of the other
      switches. Everything else falls naturally into place, as long as the
      bottom-end of ports (the leaves in the tree) is comprised exclusively of
      dsa_8021q-compatible (i.e. sja1105 switches). Otherwise, there would be
      a chance that a front-panel switch transmits a packet tagged with a
      dsa_8021q header, header which it wouldn't be able to remove, and which
      would hence "leak" out.
      
      The only use case I tested (due to lack of board availability) was when
      the sja1105 switches are part of disjoint trees (however, this doesn't
      change the fact that multiple sja1105 switches still need unique switch
      identifiers in such a system). But in principle, even "true" single-tree
      setups (with DSA links) should work just as fine, except for a small
      change which I can't test: dsa_towards_port should be used instead of
      dsa_upstream_port (I made the assumption that the routing port that any
      sja1105 should use towards its neighbours is the CPU port. That might
      not hold true in other setups).
      Signed-off-by: NVladimir Oltean <vladimir.oltean@nxp.com>
      Reviewed-by: NFlorian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: NJakub Kicinski <kuba@kernel.org>
      ac02a451
  13. 08 5月, 2020 2 次提交
    • V
      net: dsa: sja1105: implement tc-gate using time-triggered virtual links · 834f8933
      Vladimir Oltean 提交于
      Restrict the TTEthernet hardware support on this switch to operate as
      closely as possible to IEEE 802.1Qci as possible. This means that it can
      perform PTP-time-based ingress admission control on streams identified
      by {DMAC, VID, PCP}, which is useful when trying to ensure the
      determinism of traffic scheduled via IEEE 802.1Qbv.
      
      The oddity comes from the fact that in hardware (and in TTEthernet at
      large), virtual links always need a full-blown action, including not
      only the type of policing, but also the list of destination ports. So in
      practice, a single tc-gate action will result in all packets getting
      dropped. Additional actions (either "trap" or "redirect") need to be
      specified in the same filter rule such that the conforming packets are
      actually forwarded somewhere.
      
      Apart from the VL Lookup, Policing and Forwarding tables which need to
      be programmed for each flow (virtual link), the Schedule engine also
      needs to be told to open/close the admission gates for each individual
      virtual link. A fairly accurate (and detailed) description of how that
      works is already present in sja1105_tas.c, since it is already used to
      trigger the egress gates for the tc-taprio offload (IEEE 802.1Qbv). Key
      point here, we remember that the schedule engine supports 8
      "subschedules" (execution threads that iterate through the global
      schedule in parallel, and that no 2 hardware threads must execute a
      schedule entry at the same time). For tc-taprio, each egress port used
      one of these 8 subschedules, leaving a total of 4 subschedules unused.
      In principle we could have allocated 1 subschedule for the tc-gate
      offload of each ingress port, but actually the schedules of all virtual
      links installed on each ingress port would have needed to be merged
      together, before they could have been programmed to hardware. So
      simplify our life and just merge the entire tc-gate configuration, for
      all virtual links on all ingress ports, into a single subschedule. Be
      sure to check that against the usual hardware scheduling conflicts, and
      program it to hardware alongside any tc-taprio subschedule that may be
      present.
      
      The following scenarios were tested:
      
      1. Quantitative testing:
      
         tc qdisc add dev swp2 clsact
         tc filter add dev swp2 ingress flower skip_sw \
                 dst_mac 42:be:24:9b:76:20 \
                 action gate index 1 base-time 0 \
                 sched-entry OPEN 1200 -1 -1 \
                 sched-entry CLOSE 1200 -1 -1 \
                 action trap
      
         ping 192.168.1.2 -f
         PING 192.168.1.2 (192.168.1.2) 56(84) bytes of data.
         .............................
         --- 192.168.1.2 ping statistics ---
         948 packets transmitted, 467 received, 50.7384% packet loss, time 9671ms
      
      2. Qualitative testing (with a phase-aligned schedule - the clocks are
         synchronized by ptp4l, not shown here):
      
         Receiver (sja1105):
      
         tc qdisc add dev swp2 clsact
         now=$(phc_ctl /dev/ptp1 get | awk '/clock time is/ {print $5}') && \
                 sec=$(echo $now | awk -F. '{print $1}') && \
                 base_time="$(((sec + 2) * 1000000000))" && \
                 echo "base time ${base_time}"
         tc filter add dev swp2 ingress flower skip_sw \
                 dst_mac 42:be:24:9b:76:20 \
                 action gate base-time ${base_time} \
                 sched-entry OPEN  60000 -1 -1 \
                 sched-entry CLOSE 40000 -1 -1 \
                 action trap
      
         Sender (enetc):
         now=$(phc_ctl /dev/ptp0 get | awk '/clock time is/ {print $5}') && \
                 sec=$(echo $now | awk -F. '{print $1}') && \
                 base_time="$(((sec + 2) * 1000000000))" && \
                 echo "base time ${base_time}"
         tc qdisc add dev eno0 parent root taprio \
                 num_tc 8 \
                 map 0 1 2 3 4 5 6 7 \
                 queues 1@0 1@1 1@2 1@3 1@4 1@5 1@6 1@7 \
                 base-time ${base_time} \
                 sched-entry S 01  50000 \
                 sched-entry S 00  50000 \
                 flags 2
      
         ping -A 192.168.1.1
         PING 192.168.1.1 (192.168.1.1): 56 data bytes
         ...
         ^C
         --- 192.168.1.1 ping statistics ---
         1425 packets transmitted, 1424 packets received, 0% packet loss
         round-trip min/avg/max = 0.322/0.361/0.990 ms
      
         And just for comparison, with the tc-taprio schedule deleted:
      
         ping -A 192.168.1.1
         PING 192.168.1.1 (192.168.1.1): 56 data bytes
         ...
         ^C
         --- 192.168.1.1 ping statistics ---
         33 packets transmitted, 19 packets received, 42% packet loss
         round-trip min/avg/max = 0.336/0.464/0.597 ms
      Signed-off-by: NVladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      834f8933
    • V
      net: dsa: sja1105: support flow-based redirection via virtual links · dfacc5a2
      Vladimir Oltean 提交于
      Implement tc-flower offloads for redirect, trap and drop using
      non-critical virtual links.
      
      Commands which were tested to work are:
      
        # Send frames received on swp2 with a DA of 42:be:24:9b:76:20 to the
        # CPU and to swp3. This type of key (DA only) when the port's VLAN
        # awareness state is off.
        tc qdisc add dev swp2 clsact
        tc filter add dev swp2 ingress flower skip_sw dst_mac 42:be:24:9b:76:20 \
                action mirred egress redirect dev swp3 \
                action trap
      
        # Drop frames received on swp2 with a DA of 42:be:24:9b:76:20, a VID
        # of 100 and a PCP of 0.
        tc filter add dev swp2 ingress protocol 802.1Q flower skip_sw \
                dst_mac 42:be:24:9b:76:20 vlan_id 100 vlan_prio 0 action drop
      
      Under the hood, all rules match on DMAC, VID and PCP, but when VLAN
      filtering is disabled, those are set internally by the driver to the
      port-based defaults. Because we would be put in an awkward situation if
      the user were to change the VLAN filtering state while there are active
      rules (packets would no longer match on the specified keys), we simply
      deny changing vlan_filtering unless the list of flows offloaded via
      virtual links is empty. Then the user can re-add new rules.
      Signed-off-by: NVladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      dfacc5a2
  14. 31 3月, 2020 2 次提交
    • V
      net: dsa: sja1105: add broadcast and per-traffic class policers · a6af7763
      Vladimir Oltean 提交于
      This patch adds complete support for manipulating the L2 Policing Tables
      from this switch. There are 45 table entries, one entry per each port
      and traffic class, and one dedicated entry for broadcast traffic for
      each ingress port.
      
      Policing entries are shareable, and we use this functionality to support
      shared block filters.
      
      We are modeling broadcast policers as simple tc-flower matches on
      dst_mac. As for the traffic class policers, the switch only deduces the
      traffic class from the VLAN PCP field, so it makes sense to model this
      as a tc-flower match on vlan_prio.
      
      How to limit broadcast traffic coming from all front-panel ports to a
      cumulated total of 10 Mbit/s:
      
      tc qdisc add dev sw0p0 ingress_block 1 clsact
      tc qdisc add dev sw0p1 ingress_block 1 clsact
      tc qdisc add dev sw0p2 ingress_block 1 clsact
      tc qdisc add dev sw0p3 ingress_block 1 clsact
      tc filter add block 1 flower skip_sw dst_mac ff:ff:ff:ff:ff:ff \
      	action police rate 10mbit burst 64k
      
      How to limit traffic with VLAN PCP 0 (also includes untagged traffic) to
      100 Mbit/s on port 0 only:
      
      tc filter add dev sw0p0 ingress protocol 802.1Q flower skip_sw \
      	vlan_prio 0 action police rate 100mbit burst 64k
      
      The broadcast, VLAN PCP and port policers are compatible with one
      another (can be installed at the same time on a port).
      Signed-off-by: NVladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      a6af7763
    • V
      net: dsa: sja1105: add configuration of port policers · a7cc081c
      Vladimir Oltean 提交于
      This adds partial configuration support for the L2 Policing Table. Out
      of the 45 policing entries, only 5 are used (one for each port), in a
      shared manner. All 8 traffic classes, and the broadcast policer, are
      redirected to a common instance which belongs to the ingress port.
      Signed-off-by: NVladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      a7cc081c
  15. 28 3月, 2020 1 次提交
    • V
      net: dsa: sja1105: implement the port MTU callbacks · c279c726
      Vladimir Oltean 提交于
      On this switch, the frame length enforcements are performed by the
      ingress policers. There are 2 types of those: regular L2 (also called
      best-effort) and Virtual Link policers (an ARINC664/AFDX concept for
      defining L2 streams with certain QoS abilities). To avoid future
      confusion, I prefer to call the reset reason "Best-effort policers",
      even though the VL policers are not yet supported.
      
      We also need to change the setup of the initial static config, such that
      DSA calls to .change_mtu (which are expensive) become no-ops and don't
      reset the switch 5 times.
      
      A driver-level decision is to unconditionally allow single VLAN-tagged
      traffic on all ports. The CPU port must accept an additional VLAN header
      for the DSA tag, which is again a driver-level decision.
      
      The policers actually count bytes not only from the SDU, but also from
      the Ethernet header and FCS, so those need to be accounted for as well.
      Signed-off-by: NVladimir Oltean <vladimir.oltean@nxp.com>
      Reviewed-by: NFlorian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      c279c726
  16. 24 3月, 2020 2 次提交
    • V
      net: dsa: sja1105: configure the PTP_CLK pin as EXT_TS or PER_OUT · 747e5eb3
      Vladimir Oltean 提交于
      The SJA1105 switch family has a PTP_CLK pin which emits a signal with
      fixed 50% duty cycle, but variable frequency and programmable start time.
      
      On the second generation (P/Q/R/S) switches, this pin supports even more
      functionality. The use case described by the hardware documents talks
      about synchronization via oneshot pulses: given 2 sja1105 switches,
      arbitrarily designated as a master and a slave, the master emits a
      single pulse on PTP_CLK, while the slave is configured to timestamp this
      pulse received on its PTP_CLK pin (which must obviously be configured as
      input). The difference between the timestamps then exactly becomes the
      slave offset to the master.
      
      The only trouble with the above is that the hardware is very much tied
      into this use case only, and not very generic beyond that:
       - When emitting a oneshot pulse, instead of being told when to emit it,
         the switch just does it "now" and tells you later what time it was,
         via the PTPSYNCTS register. [ Incidentally, this is the same register
         that the slave uses to collect the ext_ts timestamp from, too. ]
       - On the sync slave, there is no interrupt mechanism on reception of a
         new extts, and no FIFO to buffer them, because in the foreseen use
         case, software is in control of both the master and the slave pins,
         so it "knows" when there's something to collect.
      
      These 2 problems mean that:
       - We don't support (at least yet) the quirky oneshot mode exposed by
         the hardware, just normal periodic output.
       - We abuse the hardware a little bit when we expose generic extts.
         Because there's no interrupt mechanism, we need to poll at double the
         frequency we expect to receive a pulse. Currently that means a
         non-configurable "twice a second".
      Signed-off-by: NVladimir Oltean <vladimir.oltean@nxp.com>
      Acked-by: NRichard Cochran <richardcochran@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      747e5eb3
    • V
      net: dsa: sja1105: unconditionally set DESTMETA and SRCMETA in AVB table · 79d5511c
      Vladimir Oltean 提交于
      These fields configure the destination and source MAC address that the
      switch will put in the Ethernet frames sent towards the CPU port that
      contain RX timestamps for PTP.
      
      These fields do not enable the feature itself, that is configured via
      SEND_META0 and SEND_META1 in the General Params table.
      
      The implication of this patch is that the AVB Params table will always
      be present in the static config. Which doesn't really hurt.
      
      This is needed because in a future patch, we will add another field from
      this table, CAS_MASTER, for configuring the PTP_CLK pin function. That
      can be configured irrespective of whether RX timestamping is enabled or
      not, so always having this table present is going to simplify things a
      bit.
      Signed-off-by: NVladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      79d5511c
  17. 20 3月, 2020 2 次提交
  18. 15 3月, 2020 1 次提交
  19. 04 3月, 2020 1 次提交
  20. 01 3月, 2020 1 次提交
    • V
      net: dsa: sja1105: Don't destroy not-yet-created xmit_worker · 52c0d4e3
      Vladimir Oltean 提交于
      Fixes the following NULL pointer dereference on PHY connect error path
      teardown:
      
      [    2.291010] sja1105 spi0.1: Probed switch chip: SJA1105T
      [    2.310044] sja1105 spi0.1: Enabled switch tagging
      [    2.314970] fsl-gianfar soc:ethernet@2d90000 eth2: error -19 setting up slave phy
      [    2.322463] 8<--- cut here ---
      [    2.325497] Unable to handle kernel NULL pointer dereference at virtual address 00000018
      [    2.333555] pgd = (ptrval)
      [    2.336241] [00000018] *pgd=00000000
      [    2.339797] Internal error: Oops: 5 [#1] SMP ARM
      [    2.344384] Modules linked in:
      [    2.347420] CPU: 1 PID: 64 Comm: kworker/1:1 Not tainted 5.5.0-rc5 #1
      [    2.353820] Hardware name: Freescale LS1021A
      [    2.358070] Workqueue: events deferred_probe_work_func
      [    2.363182] PC is at kthread_destroy_worker+0x4/0x74
      [    2.368117] LR is at sja1105_teardown+0x70/0xb4
      [    2.372617] pc : [<c036cdd4>]    lr : [<c0b89238>]    psr: 60000013
      [    2.378845] sp : eeac3d30  ip : eeab1900  fp : eef45480
      [    2.384036] r10: eef4549c  r9 : 00000001  r8 : 00000000
      [    2.389227] r7 : eef527c0  r6 : 00000034  r5 : ed8ddd0c  r4 : ed8ddc40
      [    2.395714] r3 : 00000000  r2 : 00000000  r1 : eef4549c  r0 : 00000000
      [    2.402204] Flags: nZCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment none
      [    2.409297] Control: 10c5387d  Table: 8020406a  DAC: 00000051
      [    2.415008] Process kworker/1:1 (pid: 64, stack limit = 0x(ptrval))
      [    2.421237] Stack: (0xeeac3d30 to 0xeeac4000)
      [    2.612635] [<c036cdd4>] (kthread_destroy_worker) from [<c0b89238>] (sja1105_teardown+0x70/0xb4)
      [    2.621379] [<c0b89238>] (sja1105_teardown) from [<c10717fc>] (dsa_switch_teardown.part.1+0x48/0x74)
      [    2.630467] [<c10717fc>] (dsa_switch_teardown.part.1) from [<c1072438>] (dsa_register_switch+0x8b0/0xbf4)
      [    2.639984] [<c1072438>] (dsa_register_switch) from [<c0b89c30>] (sja1105_probe+0x2ac/0x464)
      [    2.648378] [<c0b89c30>] (sja1105_probe) from [<c0b11a5c>] (spi_drv_probe+0x7c/0xa0)
      [    2.656081] [<c0b11a5c>] (spi_drv_probe) from [<c0a26ab8>] (really_probe+0x208/0x480)
      [    2.663871] [<c0a26ab8>] (really_probe) from [<c0a26f0c>] (driver_probe_device+0x78/0x1c4)
      [    2.672093] [<c0a26f0c>] (driver_probe_device) from [<c0a24c48>] (bus_for_each_drv+0x80/0xc4)
      [    2.680574] [<c0a24c48>] (bus_for_each_drv) from [<c0a26810>] (__device_attach+0xd0/0x168)
      [    2.688794] [<c0a26810>] (__device_attach) from [<c0a259d8>] (bus_probe_device+0x84/0x8c)
      [    2.696927] [<c0a259d8>] (bus_probe_device) from [<c0a25f24>] (deferred_probe_work_func+0x84/0xc4)
      [    2.705842] [<c0a25f24>] (deferred_probe_work_func) from [<c03667b0>] (process_one_work+0x22c/0x560)
      [    2.714926] [<c03667b0>] (process_one_work) from [<c0366d8c>] (worker_thread+0x2a8/0x5d4)
      [    2.723059] [<c0366d8c>] (worker_thread) from [<c036cf94>] (kthread+0x150/0x154)
      [    2.730416] [<c036cf94>] (kthread) from [<c03010e8>] (ret_from_fork+0x14/0x2c)
      
      Checking for NULL pointer is correct because the per-port xmit kernel
      threads are created in sja1105_probe immediately after calling
      dsa_register_switch.
      
      Fixes: a68578c2 ("net: dsa: Make deferred_xmit private to sja1105")
      Signed-off-by: NVladimir Oltean <olteanv@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      52c0d4e3
  21. 28 2月, 2020 1 次提交
  22. 17 1月, 2020 1 次提交
  23. 09 1月, 2020 1 次提交
    • F
      net: dsa: Get information about stacked DSA protocol · 4d776482
      Florian Fainelli 提交于
      It is possible to stack multiple DSA switches in a way that they are not
      part of the tree (disjoint) but the DSA master of a switch is a DSA
      slave of another. When that happens switch drivers may have to know this
      is the case so as to determine whether their tagging protocol has a
      remove chance of working.
      
      This is useful for specific switch drivers such as b53 where devices
      have been known to be stacked in the wild without the Broadcom tag
      protocol supporting that feature. This allows b53 to continue supporting
      those devices by forcing the disabling of Broadcom tags on the outermost
      switches if necessary.
      
      The get_tag_protocol() function is therefore updated to gain an
      additional enum dsa_tag_protocol argument which denotes the current
      tagging protocol used by the DSA master we are attached to, else
      DSA_TAG_PROTO_NONE for the top of the dsa_switch_tree.
      Signed-off-by: NFlorian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      4d776482
  24. 06 1月, 2020 2 次提交
    • V
      net: dsa: Make deferred_xmit private to sja1105 · a68578c2
      Vladimir Oltean 提交于
      There are 3 things that are wrong with the DSA deferred xmit mechanism:
      
      1. Its introduction has made the DSA hotpath ever so slightly more
         inefficient for everybody, since DSA_SKB_CB(skb)->deferred_xmit needs
         to be initialized to false for every transmitted frame, in order to
         figure out whether the driver requested deferral or not (a very rare
         occasion, rare even for the only driver that does use this mechanism:
         sja1105). That was necessary to avoid kfree_skb from freeing the skb.
      
      2. Because L2 PTP is a link-local protocol like STP, it requires
         management routes and deferred xmit with this switch. But as opposed
         to STP, the deferred work mechanism needs to schedule the packet
         rather quickly for the TX timstamp to be collected in time and sent
         to user space. But there is no provision for controlling the
         scheduling priority of this deferred xmit workqueue. Too bad this is
         a rather specific requirement for a feature that nobody else uses
         (more below).
      
      3. Perhaps most importantly, it makes the DSA core adhere a bit too
         much to the NXP company-wide policy "Innovate Where It Doesn't
         Matter". The sja1105 is probably the only DSA switch that requires
         some frames sent from the CPU to be routed to the slave port via an
         out-of-band configuration (register write) rather than in-band (DSA
         tag). And there are indeed very good reasons to not want to do that:
         if that out-of-band register is at the other end of a slow bus such
         as SPI, then you limit that Ethernet flow's throughput to effectively
         the throughput of the SPI bus. So hardware vendors should definitely
         not be encouraged to design this way. We do _not_ want more
         widespread use of this mechanism.
      
      Luckily we have a solution for each of the 3 issues:
      
      For 1, we can just remove that variable in the skb->cb and counteract
      the effect of kfree_skb with skb_get, much to the same effect. The
      advantage, of course, being that anybody who doesn't use deferred xmit
      doesn't need to do any extra operation in the hotpath.
      
      For 2, we can create a kernel thread for each port's deferred xmit work.
      If the user switch ports are named swp0, swp1, swp2, the kernel threads
      will be named swp0_xmit, swp1_xmit, swp2_xmit (there appears to be a 15
      character length limit on kernel thread names). With this, the user can
      change the scheduling priority with chrt $(pidof swp2_xmit).
      
      For 3, we can actually move the entire implementation to the sja1105
      driver.
      
      So this patch deletes the generic implementation from the DSA core and
      adds a new one, more adequate to the requirements of PTP TX
      timestamping, in sja1105_main.c.
      Suggested-by: NFlorian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: NVladimir Oltean <olteanv@gmail.com>
      Reviewed-by: NFlorian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      a68578c2
    • V
      net: dsa: sja1105: Always send through management routes in slot 0 · 0a51826c
      Vladimir Oltean 提交于
      I finally found out how the 4 management route slots are supposed to
      be used, but.. it's not worth it.
      
      The description from the comment I've just deleted in this commit is
      still true: when more than 1 management slot is active at the same time,
      the switch will match frames incoming [from the CPU port] on the lowest
      numbered management slot that matches the frame's DMAC.
      
      My issue was that one was not supposed to statically assign each port a
      slot. Yes, there are 4 slots and also 4 non-CPU ports, but that is a
      mere coincidence.
      
      Instead, the switch can be used like this: every management frame gets a
      slot at the right of the most recently assigned slot:
      
      Send mgmt frame 1 through S0:    S0 x  x  x
      Send mgmt frame 2 through S1:    S0 S1 x  x
      Send mgmt frame 3 through S2:    S0 S1 S2 x
      Send mgmt frame 4 through S3:    S0 S1 S2 S3
      
      The difference compared to the old usage is that the transmission of
      frames 1-4 doesn't need to wait until the completion of the management
      route. It is safe to use a slot to the right of the most recently used
      one, because by protocol nobody will program a slot to your left and
      "steal" your route towards the correct egress port.
      
      So there is a potential throughput benefit here.
      
      But mgmt frame 5 has no more free slot to use, so it has to wait until
      _all_ of S0, S1, S2, S3 are full, in order to use S0 again.
      
      And that's actually exactly the problem: I was looking for something
      that would bring more predictable transmission latency, but this is
      exactly the opposite: 3 out of 4 frames would be transmitted quicker,
      but the 4th would draw the short straw and have a worse worst-case
      latency than before.
      
      Useless.
      
      Things are made even worse by PTP TX timestamping, which is something I
      won't go deeply into here. Suffice to say that the fact there is a
      driver-level lock on the SPI bus offsets any potential throughput gains
      that parallelism might bring.
      
      So there's no going back to the multi-slot scheme, remove the
      "mgmt_slot" variable from sja1105_port and the dummy static assignment
      made at probe time.
      
      While passing by, also remove the assignment to casc_port altogether.
      Don't pretend that we support cascaded setups.
      Signed-off-by: NVladimir Oltean <olteanv@gmail.com>
      Reviewed-by: NFlorian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      0a51826c