Commits · eb55d7b65f1735dfb39fb14e47007d3c8fb74c43 · openeuler / Kernel

08 May, 2020 4 commits

net: dsa: sja1105: implement tc-gate using time-triggered virtual links · 834f8933

Authored by Vladimir Oltean on May 05, 2020

Restrict the TTEthernet hardware support on this switch to operate as
closely as possible to IEEE 802.1Qci. This means that it can
perform PTP-time-based ingress admission control on streams identified
by {DMAC, VID, PCP}, which is useful when trying to ensure the
determinism of traffic scheduled via IEEE 802.1Qbv.
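
As a rough illustration of PTP-time-based admission control, the sketch below decides whether a cyclic gate schedule is open at a given PTP time. The names `gate_entry` and `gate_is_open` are hypothetical and are not taken from the driver; treating the gate as closed before the base time is also an assumption.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* One tc-gate schedule entry: the gate state is held for interval_ns. */
struct gate_entry {
	bool open;
	uint64_t interval_ns;
};

/*
 * Hypothetical helper: report whether the gate is open at PTP time now_ns
 * for a schedule that starts at base_time_ns and then repeats forever.
 * Before base_time_ns the gate is treated as closed (an assumption).
 */
bool gate_is_open(uint64_t now_ns, uint64_t base_time_ns,
		  const struct gate_entry *entries, size_t n)
{
	uint64_t cycle_ns = 0, offset_ns;
	size_t i;

	for (i = 0; i < n; i++)
		cycle_ns += entries[i].interval_ns;
	if (cycle_ns == 0 || now_ns < base_time_ns)
		return false;

	/* Position inside the current cycle. */
	offset_ns = (now_ns - base_time_ns) % cycle_ns;
	for (i = 0; i < n; i++) {
		if (offset_ns < entries[i].interval_ns)
			return entries[i].open;
		offset_ns -= entries[i].interval_ns;
	}
	return false; /* not reached, since offset_ns < cycle_ns */
}
```

With an OPEN 60000 / CLOSE 40000 schedule, a frame arriving 59999 ns into a cycle would be admitted and one arriving 60000 ns into it would be dropped.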

The oddity comes from the fact that in hardware (and in TTEthernet at
large), virtual links always need a full-blown action, including not
only the type of policing, but also the list of destination ports. So in
practice, a single tc-gate action will result in all packets getting
dropped. Additional actions (either "trap" or "redirect") need to be
specified in the same filter rule such that the conforming packets are
actually forwarded somewhere.

Apart from the VL Lookup, Policing and Forwarding tables which need to
be programmed for each flow (virtual link), the Schedule engine also
needs to be told to open/close the admission gates for each individual
virtual link. A fairly accurate (and detailed) description of how that
works is already present in sja1105_tas.c, since it is already used to
trigger the egress gates for the tc-taprio offload (IEEE 802.1Qbv). The
key point to remember is that the schedule engine supports 8
"subschedules" (execution threads that iterate through the global
schedule in parallel, with the constraint that no 2 hardware threads may
execute a schedule entry at the same time). For tc-taprio, each egress
port used one of these 8 subschedules, leaving a total of 4 subschedules
unused.
In principle we could have allocated 1 subschedule for the tc-gate
offload of each ingress port, but actually the schedules of all virtual
links installed on each ingress port would have needed to be merged
together, before they could have been programmed to hardware. So
simplify our life and just merge the entire tc-gate configuration, for
all virtual links on all ingress ports, into a single subschedule. Be
sure to check that against the usual hardware scheduling conflicts, and
program it to hardware alongside any tc-taprio subschedule that may be
present.

The following scenarios were tested:

1. Quantitative testing:

   tc qdisc add dev swp2 clsact
   tc filter add dev swp2 ingress flower skip_sw \
           dst_mac 42:be:24:9b:76:20 \
           action gate index 1 base-time 0 \
           sched-entry OPEN 1200 -1 -1 \
           sched-entry CLOSE 1200 -1 -1 \
           action trap

   ping 192.168.1.2 -f
   PING 192.168.1.2 (192.168.1.2) 56(84) bytes of data.
   .............................
   --- 192.168.1.2 ping statistics ---
   948 packets transmitted, 467 received, 50.7384% packet loss, time 9671ms

2. Qualitative testing (with a phase-aligned schedule - the clocks are
   synchronized by ptp4l, not shown here):

   Receiver (sja1105):

   tc qdisc add dev swp2 clsact
   now=$(phc_ctl /dev/ptp1 get | awk '/clock time is/ {print $5}') && \
           sec=$(echo $now | awk -F. '{print $1}') && \
           base_time="$(((sec + 2) * 1000000000))" && \
           echo "base time ${base_time}"
   tc filter add dev swp2 ingress flower skip_sw \
           dst_mac 42:be:24:9b:76:20 \
           action gate base-time ${base_time} \
           sched-entry OPEN  60000 -1 -1 \
           sched-entry CLOSE 40000 -1 -1 \
           action trap

   Sender (enetc):
   now=$(phc_ctl /dev/ptp0 get | awk '/clock time is/ {print $5}') && \
           sec=$(echo $now | awk -F. '{print $1}') && \
           base_time="$(((sec + 2) * 1000000000))" && \
           echo "base time ${base_time}"
   tc qdisc add dev eno0 parent root taprio \
           num_tc 8 \
           map 0 1 2 3 4 5 6 7 \
           queues 1@0 1@1 1@2 1@3 1@4 1@5 1@6 1@7 \
           base-time ${base_time} \
           sched-entry S 01  50000 \
           sched-entry S 00  50000 \
           flags 2

   ping -A 192.168.1.1
   PING 192.168.1.1 (192.168.1.1): 56 data bytes
   ...
   ^C
   --- 192.168.1.1 ping statistics ---
   1425 packets transmitted, 1424 packets received, 0% packet loss
   round-trip min/avg/max = 0.322/0.361/0.990 ms

   And just for comparison, with the tc-taprio schedule deleted:

   ping -A 192.168.1.1
   PING 192.168.1.1 (192.168.1.1): 56 data bytes
   ...
   ^C
   --- 192.168.1.1 ping statistics ---
   33 packets transmitted, 19 packets received, 42% packet loss
   round-trip min/avg/max = 0.336/0.464/0.597 ms
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: dsa: sja1105: support flow-based redirection via virtual links · dfacc5a2

Authored by Vladimir Oltean on May 05, 2020

Implement tc-flower offloads for redirect, trap and drop using
non-critical virtual links.

Commands which were tested to work are:

  # Send frames received on swp2 with a DA of 42:be:24:9b:76:20 to the
  # CPU and to swp3. This type of key (DA only) is used when the port's
  # VLAN awareness state is off.
  tc qdisc add dev swp2 clsact
  tc filter add dev swp2 ingress flower skip_sw dst_mac 42:be:24:9b:76:20 \
          action mirred egress redirect dev swp3 \
          action trap

  # Drop frames received on swp2 with a DA of 42:be:24:9b:76:20, a VID
  # of 100 and a PCP of 0.
  tc filter add dev swp2 ingress protocol 802.1Q flower skip_sw \
          dst_mac 42:be:24:9b:76:20 vlan_id 100 vlan_prio 0 action drop

Under the hood, all rules match on DMAC, VID and PCP, but when VLAN
filtering is disabled, those are set internally by the driver to the
port-based defaults. Because we would be put in an awkward situation if
the user were to change the VLAN filtering state while there are active
rules (packets would no longer match on the specified keys), we simply
deny changing vlan_filtering unless the list of flows offloaded via
virtual links is empty. Then the user can re-add new rules.
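
The guard described above can be sketched as follows. This is a simplified model, not the driver's code: `vl_state`, `set_vlan_filtering` and the choice of `-EBUSY` as the error code are all assumptions made for illustration.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified driver state. */
struct vl_state {
	size_t num_vl_rules;	/* flows currently offloaded via virtual links */
	bool vlan_filtering;
};

/*
 * Refuse to flip the VLAN filtering state while virtual link rules exist,
 * because their keys were built for the current state.
 * Returns 0 on success or -EBUSY (an assumed error code) on refusal.
 */
int set_vlan_filtering(struct vl_state *st, bool enabled)
{
	if (st->num_vl_rules > 0 && enabled != st->vlan_filtering)
		return -EBUSY;

	st->vlan_filtering = enabled;
	return 0;
}
```

Once the user has deleted every offloaded rule, the same call succeeds and new rules can be added against the new state.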
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: dsa: sja1105: make room for virtual link parsing in flower offload · b70bb8d4

Authored by Vladimir Oltean on May 05, 2020

Virtual links are a sja1105 hardware concept of executing various flow
actions based on a key extracted from the frame's DMAC, VID and PCP.

Currently the tc-flower offload code supports only parsing the DMAC if
that is the broadcast MAC address, and the VLAN PCP. Extract the key
parsing logic from the L2 policers functionality and move it into its
own function, after adding extra logic for matching on any DMAC and VID.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: dsa: sja1105: add static tables for virtual links · 94f94d4a

Authored by Vladimir Oltean on May 05, 2020

This patch adds the register definitions for the:
- VL Lookup Table
- VL Policing Table
- VL Forwarding Table
- VL Forwarding Parameters Table

These are needed in order to perform TTEthernet operations: QoS
classification, flow-based policing and/or frame redirecting with the
switch.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

21 Apr, 2020 1 commit

net: dsa: sja1105: enable internal pull-down for RX_DV/CRS_DV/RX_CTL and RX_ER · 135e3018

Authored by Vladimir Oltean on Apr 17, 2020

Some boards do not have the RX_ER MII signal connected. Normally in such
a situation the pin would be grounded, but some boards leave it
electrically floating.

When sending traffic to those switch ports, one can see that the
N_SOFERR statistics counter increments once per packet. The
user manual states for this counter that it may count the number of
frames "that have the MII error input being asserted prior to or
up to the SOF delimiter byte". So the switch MAC is sampling an
electrically floating signal, and that prevents proper traffic
reception.

As a workaround, enable the internal weak pull-downs on the input pads
for the MII control signals. This way, a floating signal would be
internally tied to ground.

The logic levels of signals which _are_ externally driven should not be
bothered by this 40-50 KOhm internal resistor. So it is not an issue to
enable the internal pull-down unconditionally, irrespective of PHY
interface type (MII, RMII, RGMII, SGMII) and of board layout.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

31 Mar, 2020 1 commit

net: dsa: sja1105: add broadcast and per-traffic class policers · a6af7763

Authored by Vladimir Oltean on Mar 29, 2020

This patch adds complete support for manipulating the L2 Policing Tables
from this switch. There are 45 table entries: one entry per port
and traffic class, and one dedicated entry for broadcast traffic for
each ingress port.
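
The figure of 45 entries is consistent with 5 ports and 8 traffic classes (5 x 8 + 5). A minimal sketch of one possible index layout follows; the ordering (per-traffic-class entries first, broadcast entries last) and the helper names are assumptions, not something stated in this commit message.

```c
enum {
	NUM_PORTS = 5,	/* implied by 45 = 5 * 8 + 5 */
	NUM_TC = 8,
	NUM_POLICERS = NUM_PORTS * NUM_TC + NUM_PORTS,
};

/* Assumed layout: entries 0..39 are indexed by (port, traffic class). */
int tc_policer_index(int port, int tc)
{
	return port * NUM_TC + tc;
}

/* Assumed layout: entries 40..44 are the per-port broadcast policers. */
int bcast_policer_index(int port)
{
	return NUM_PORTS * NUM_TC + port;
}
```

Under this layout, a vlan_prio match on port 0 with PCP 0 would touch entry 0, and a broadcast dst_mac match on port 0 would touch entry 40.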

Policing entries are shareable, and we use this functionality to support
shared block filters.

We are modeling broadcast policers as simple tc-flower matches on
dst_mac. As for the traffic class policers, the switch only deduces the
traffic class from the VLAN PCP field, so it makes sense to model this
as a tc-flower match on vlan_prio.

How to limit broadcast traffic coming from all front-panel ports to a
cumulative total of 10 Mbit/s:

tc qdisc add dev sw0p0 ingress_block 1 clsact
tc qdisc add dev sw0p1 ingress_block 1 clsact
tc qdisc add dev sw0p2 ingress_block 1 clsact
tc qdisc add dev sw0p3 ingress_block 1 clsact
tc filter add block 1 flower skip_sw dst_mac ff:ff:ff:ff:ff:ff \
	action police rate 10mbit burst 64k

How to limit traffic with VLAN PCP 0 (also includes untagged traffic) to
100 Mbit/s on port 0 only:

tc filter add dev sw0p0 ingress protocol 802.1Q flower skip_sw \
	vlan_prio 0 action police rate 100mbit burst 64k

The broadcast, VLAN PCP and port policers are compatible with one
another (can be installed at the same time on a port).
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

30 Mar, 2020 1 commit

net: dsa: sja1105: show more ethtool statistics counters for P/Q/R/S · 336aa67b

Authored by Vladimir Oltean on Mar 27, 2020

It looks like the P/Q/R/S series supports some more counters,
generically named "Ethernet statistics counter", which we were not
printing. Add them.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

28 Mar, 2020 1 commit

net: dsa: sja1105: implement the port MTU callbacks · c279c726

Authored by Vladimir Oltean on Mar 27, 2020

On this switch, the frame length enforcements are performed by the
ingress policers. There are 2 types of those: regular L2 (also called
best-effort) and Virtual Link policers (an ARINC664/AFDX concept for
defining L2 streams with certain QoS abilities). To avoid future
confusion, I prefer to call the reset reason "Best-effort policers",
even though the VL policers are not yet supported.

We also need to change the setup of the initial static config, such that
DSA calls to .change_mtu (which are expensive) become no-ops and don't
reset the switch 5 times.

A driver-level decision is to unconditionally allow single VLAN-tagged
traffic on all ports. The CPU port must accept an additional VLAN header
for the DSA tag, which is again a driver-level decision.

The policers actually count bytes not only from the SDU, but also from
the Ethernet header and FCS, so those need to be accounted for as well.
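
The accounting described above can be sketched as a small calculation. The constants are the standard Ethernet sizes; `policer_maxlen` is a hypothetical helper name, and the exact way the driver combines these terms is an assumption.

```c
#include <stdbool.h>

enum {
	ETH_HLEN = 14,		/* destination MAC, source MAC, EtherType */
	VLAN_HLEN = 4,		/* one 802.1Q tag */
	ETH_FCS_LEN = 4,	/* frame check sequence */
};

/*
 * Hypothetical helper: frame length limit to program into a policer for a
 * given MTU (SDU size). Every port admits one VLAN tag; the CPU port admits
 * a second one for the DSA tag.
 */
int policer_maxlen(int mtu, bool is_cpu_port)
{
	int maxlen = mtu + ETH_HLEN + VLAN_HLEN + ETH_FCS_LEN;

	if (is_cpu_port)
		maxlen += VLAN_HLEN;
	return maxlen;
}
```

For the default MTU of 1500 this gives 1522 bytes on a user port and 1526 bytes on the CPU port.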
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

24 Mar, 2020 2 commits

net: dsa: sja1105: configure the PTP_CLK pin as EXT_TS or PER_OUT · 747e5eb3

Authored by Vladimir Oltean on Mar 24, 2020

The SJA1105 switch family has a PTP_CLK pin which emits a signal with
fixed 50% duty cycle, but variable frequency and programmable start time.

On the second generation (P/Q/R/S) switches, this pin supports even more
functionality. The use case described by the hardware documents talks
about synchronization via oneshot pulses: given 2 sja1105 switches,
arbitrarily designated as a master and a slave, the master emits a
single pulse on PTP_CLK, while the slave is configured to timestamp this
pulse received on its PTP_CLK pin (which must obviously be configured as
input). The difference between the timestamps then exactly becomes the
slave offset to the master.

The only trouble with the above is that the hardware is very much tied
into this use case only, and not very generic beyond that:
 - When emitting a oneshot pulse, instead of being told when to emit it,
   the switch just does it "now" and tells you later what time it was,
   via the PTPSYNCTS register. [ Incidentally, this is the same register
   that the slave uses to collect the ext_ts timestamp from, too. ]
 - On the sync slave, there is no interrupt mechanism on reception of a
   new extts, and no FIFO to buffer them, because in the foreseen use
   case, software is in control of both the master and the slave pins,
   so it "knows" when there's something to collect.

These 2 problems mean that:
 - We don't support (at least yet) the quirky oneshot mode exposed by
   the hardware, just normal periodic output.
 - We abuse the hardware a little bit when we expose generic extts.
   Because there's no interrupt mechanism, we need to poll at double the
   frequency we expect to receive a pulse. Currently that means a
   non-configurable "twice a second".
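
A minimal sketch of the polling approach follows, assuming that a new external timestamp is detected by noticing that the timestamp register value has changed since the previous poll. The structure and function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

struct extts_poller {
	uint64_t last_ts;	/* last value seen in the timestamp register */
};

/* Poll at twice the expected pulse rate so that no pulse goes unnoticed. */
uint64_t extts_poll_interval_ns(uint64_t expected_period_ns)
{
	return expected_period_ns / 2;
}

/*
 * Returns true when reg_value differs from the value seen on the previous
 * poll, which is taken to mean that a new pulse was timestamped.
 */
bool extts_poll_once(struct extts_poller *p, uint64_t reg_value)
{
	if (reg_value == p->last_ts)
		return false;

	p->last_ts = reg_value;
	return true;
}
```

For a 1 Hz pulse train the interval works out to 500 ms, which matches the "twice a second" polling mentioned above.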
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Acked-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: dsa: sja1105: make the AVB table dynamically reconfigurable · 0a7e984c

Authored by Vladimir Oltean on Mar 24, 2020

The AVB table contains the CAS_MASTER field (to be added in the next
patch) which decides the direction of the PTP_CLK pin.

Reconfiguring this field dynamically is highly preferable to having to
reset the switch and upload a new static configuration, so we add
support for exactly that.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

20 Mar, 2020 1 commit

net: dsa: sja1105: Add support for the SGMII port · ffe10e67

Authored by Vladimir Oltean on Mar 20, 2020

SJA1105 switches R and S have one SerDes port with an 802.3z
quasi-compatible PCS, hardwired on port 4. The other ports are still
MII/RMII/RGMII. The PCS performs rate adaptation to lower link speeds;
the MAC on this port is hardwired at gigabit. Only full duplex is
supported.

The SGMII port can be configured as part of the static config tables, as
well as through a dedicated SPI address region for its pseudo-clause-22
registers. However it looks like the static configuration is not
able to change some out-of-reset values (like the value of MII_BMCR), so
at the end of the day, having code for it is utterly pointless. We are
just going to use the pseudo-C22 interface.

Because the PCS gets reset when the switch resets, we have to add even
more restoration logic to sja1105_static_config_reload, otherwise the
SGMII port breaks after operations such as enabling PTP timestamping
which require a switch reset.

From a PHYLINK perspective, the switch supports *only* SGMII (it doesn't
support 1000Base-X). It also doesn't expose access to the raw config
word for in-band AN in registers MII_ADV/MII_LPA.
It is able to work in the following modes:
 - Forced speed
 - SGMII in-band AN slave (speed received from PHY)
 - SGMII in-band AN master (acting as a PHY)

The latter mode is not supported by this patch. It is even unclear to me
how that would be described. There is some code for it left in the
patch, but 'an_master' is always passed as false.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>

15 Nov, 2019 3 commits

net: dsa: sja1105: Simplify reset handling · abfb228a

Authored by Vladimir Oltean on Nov 13, 2019

We don't really need 10k species of reset. Remove everything except cold
reset which is what is actually used. Too bad the hardware designers
couldn't agree to use the same bit field for rev 1 and rev 2, so the
(*reset_cmd) function pointer is there to stay.

However let's simplify the prototype and give it a struct dsa_switch (we
want to avoid forward-declarations of structures, in this case struct
sja1105_private, wherever we can).
Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: dsa: sja1105: Implement state machine for TAS with PTP clock source · 86db36a3

Authored by Vladimir Oltean on Nov 12, 2019

Tested using the following bash script and the tc from iproute2-next:

	#!/bin/bash

	set -e -u -o pipefail

	NSEC_PER_SEC="1000000000"

	gatemask() {
		local tc_list="$1"
		local mask=0

		for tc in ${tc_list}; do
			mask=$((${mask} | (1 << ${tc})))
		done

		printf "%02x" ${mask}
	}

	if ! systemctl is-active --quiet ptp4l; then
		echo "Please start the ptp4l service"
		exit 1
	fi

	now=$(phc_ctl /dev/ptp1 get | gawk '/clock time is/ { print $5; }')
	# Phase-align the base time to the start of the next second.
	sec=$(echo "${now}" | gawk -F. '{ print $1; }')
	base_time="$(((${sec} + 1) * ${NSEC_PER_SEC}))"

	tc qdisc add dev swp5 parent root handle 100 taprio \
		num_tc 8 \
		map 0 1 2 3 4 5 6 7 \
		queues 1@0 1@1 1@2 1@3 1@4 1@5 1@6 1@7 \
		base-time ${base_time} \
		sched-entry S $(gatemask 7) 100000 \
		sched-entry S $(gatemask "0 1 2 3 4 5 6") 400000 \
		clockid CLOCK_TAI flags 2

The "state machine" is a workqueue invoked after each manipulation
command on the PTP clock (reset, adjust time, set time, adjust
frequency) which checks over the state of the time-aware scheduler.
So it is not monitored periodically, only in reaction to a PTP command
typically triggered from a userspace daemon (linuxptp). Otherwise there
is no reason for things to go wrong.

Now that the timecounter/cyclecounter has been replaced with hardware
operations on the PTP clock, the TAS Kconfig now depends upon PTP and
the standalone clocksource operating mode has been removed.
Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: dsa: sja1105: Make the PTP command read-write · 41603d78

Authored by Vladimir Oltean on Nov 12, 2019

The PTPSTRTSCH and PTPSTOPSCH bits are actually readable and indicate
whether the time-aware scheduler is running or not. We will be using
that for monitoring the scheduler in the next patch, so refactor the PTP
command API in order to allow that.
Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

13 Nov, 2019 1 commit

net: dsa: sja1105: Print the reset reason · 2eea1fa8

Authored by Vladimir Oltean on Nov 12, 2019

Sometimes it can be quite opaque even for me why the driver decided to
reset the switch. So instead of adding dump_stack() calls each time for
debugging, just add a reset reason to sja1105_static_config_reload
calls which gets printed to the console.
Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

12 Nov, 2019 1 commit

net: dsa: sja1105: Implement the .gettimex64 system call for PTP · 34d76e9f

Authored by Vladimir Oltean on Nov 09, 2019

Through the PTP_SYS_OFFSET_EXTENDED ioctl, it is possible for userspace
applications (e.g. phc2sys) to compensate for the delays incurred while
reading the PHC's time.

The task itself of taking the software timestamp is delegated to the SPI
subsystem, through the newly introduced API in struct spi_transfer. The
goal is to cross-timestamp I/O operations on the switch's PTP clock with
values in the local system clock (CLOCK_REALTIME). For that we need to
understand a bit of the hardware internals.

The 'read PTP time' message is a 12 byte structure, first 4 bytes of
which represent the SPI header, and the last 8 bytes represent the
64-bit PTP time. The switch itself starts processing the command
immediately after receiving the last bit of the address, i.e. at the
middle of byte 3 (last byte of header). The PTP time is shadowed to a
buffer register in the switch, and retrieved atomically during the
subsequent SPI frames.

A similar thing goes on for the 'write PTP time' message, although in
that case the switch waits until the 64-bit PTP time becomes fully
available before taking any action. So the byte that needs to be
software-timestamped is byte 11 (last) of the transfer.
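
The message layout and the choice of timestamped byte can be summarized in a short sketch. All names below are hypothetical; the midpoint estimate is one common way for userspace to use the system timestamps taken just before and just after the PHC reading, and is shown here as an assumption about the consumer rather than as part of this patch.

```c
#include <stdbool.h>
#include <stdint.h>

enum {
	SPI_HDR_LEN = 4,				/* SPI header bytes */
	PTP_TIME_LEN = 8,				/* 64-bit PTP time */
	PTP_MSG_LEN = SPI_HDR_LEN + PTP_TIME_LEN,	/* 12 bytes in total */
};

/* Index of the byte around which system timestamps should be taken. */
int ptp_sts_byte(bool is_write)
{
	/* Reads latch the time during byte 3, writes take effect after byte 11. */
	return is_write ? PTP_MSG_LEN - 1 : SPI_HDR_LEN - 1;
}

/*
 * Midpoint estimate of (PHC time - system time), given the system time
 * sampled just before (pre) and just after (post) the timestamped byte.
 */
int64_t phc_offset_ns(int64_t sys_pre_ns, int64_t phc_ns, int64_t sys_post_ns)
{
	int64_t sys_mid_ns = sys_pre_ns + (sys_post_ns - sys_pre_ns) / 2;

	return phc_ns - sys_mid_ns;
}
```

The narrower the window between the two system timestamps, the smaller the uncertainty of the estimate, which is why the timestamps are taken around a single byte rather than around the whole transfer.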

The patch creates a common (and local) sja1105_xfer implementation for
the SPI I/O, and offers 3 front-ends:

- sja1105_xfer_u32 and sja1105_xfer_u64: these are capable of optionally
  requesting a PTP timestamp

- sja1105_xfer_buf: this is for large transfers (e.g. the static config
  buffer) and other misc data, and there is no point in giving
  timestamping capabilities to this.
Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

19 Oct, 2019 1 commit

net: dsa: sja1105: Switch to hardware operations for PTP · 2fb079a2

Authored by Vladimir Oltean on Oct 16, 2019

Adjusting the hardware clock (PTPCLKVAL, PTPCLKADD, PTPCLKRATE) is a
requirement for the auxiliary PTP functionality of the switch
(TTEthernet, PPS input, PPS output).

Therefore we need to switch to using these registers to keep a
synchronized time in hardware, instead of the timecounter/cyclecounter
implementation, which is reliant on the free-running PTPTSCLK.
Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

16 Oct, 2019 2 commits

net: dsa: sja1105: Use the correct style for SPDX License Identifier · b790b554

Authored by Nishad Kamdar on Oct 14, 2019

This patch corrects the SPDX License Identifier style
in header files related to Distributed Switch Architecture
drivers for NXP SJA1105 series Ethernet switch support.
It uses an explicit block comment for the SPDX License
Identifier.

Changes made by using a script provided by Joe Perches here:
https://lkml.org/lkml/2019/2/7/46.
Suggested-by: Joe Perches <joe@perches.com>
Signed-off-by: Nishad Kamdar <nishadkamdar@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: dsa: sja1105: Switch to scatter/gather API for SPI · 08839c06

Authored by Vladimir Oltean on Oct 12, 2019

This reworks the SPI transfer implementation to make use of more of the
SPI core features. The main benefit is to avoid the memcpy in
sja1105_xfer_buf().

The memcpy was only needed because the function was transferring a
single buffer at a time. So it needed to copy the caller-provided buffer
at buf + 4, to store the SPI message header in the "headroom" area.

But the SPI core supports scatter-gather messages, comprised of multiple
transfers. We can actually use those to break apart every SPI message
into 2 transfers: one for the header and one for the actual payload.

To keep the behavior the same regarding the chip select signal, it is
necessary to tell the SPI core to de-assert the chip select after each
chunk. This was not needed before, because each spi_message contained
only 1 single transfer.

The meaning of the per-transfer cs_change=1 is:

- If the transfer is the last one of the message, keep CS asserted
- Otherwise, deassert CS

We need to deassert CS in the "otherwise" case, which was implicit
before.

Avoiding the memcpy creates yet another opportunity. The device can't
process more than 256 bytes of SPI payload at a time, so the
sja1105_xfer_long_buf() function used to exist, to split the larger
caller buffer into chunks.

But these chunks couldn't be used as scatter/gather buffers for
spi_message until now, because of that memcpy (we would have needed more
memory for each chunk). So we can now remove the sja1105_xfer_long_buf()
function and have a single implementation for long and short buffers.
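
The chunking step can be sketched as below. This only plans the (offset, length) pairs; `chunk` and `plan_chunks` are hypothetical names, and the real function additionally builds the SPI transfers and headers.

```c
#include <stddef.h>

enum { SPI_PAYLOAD_MAX = 256 };	/* payload limit per SPI message, per the text */

struct chunk {
	size_t offset;	/* offset into the caller-supplied buffer */
	size_t len;	/* payload bytes in this chunk */
};

/*
 * Split len payload bytes into chunks of at most SPI_PAYLOAD_MAX bytes.
 * Returns the number of chunks written to out, or 0 if len is 0 or if
 * max_chunks is too small to hold the plan.
 */
size_t plan_chunks(size_t len, struct chunk *out, size_t max_chunks)
{
	size_t n = 0, off = 0;

	while (off < len) {
		size_t clen = len - off;

		if (clen > SPI_PAYLOAD_MAX)
			clen = SPI_PAYLOAD_MAX;
		if (n == max_chunks)
			return 0;

		out[n].offset = off;
		out[n].len = clen;
		n++;
		off += clen;
	}
	return n;
}
```

Each planned chunk would then map to two transfers: a small header transfer, and a payload transfer that points directly at the caller's buffer plus the chunk offset, so no copy is needed.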

Another benefit is lower usage of stack memory. Previously we had to
store 2 SPI buffers for each chunk. Due to the elimination of the
memcpy, we can now send pointers to the actual chunks from the
caller-supplied buffer to the SPI core.

Since the patch merges two functions into a rewritten implementation,
the function prototype was also changed, mainly for cosmetic consistency
with the structures used within it.
Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

15 Oct, 2019 3 commits

net: dsa: sja1105: Change the PTP command access pattern · 66427778

Authored by Vladimir Oltean on Oct 12, 2019

The PTP command register contains enable bits for:
- Putting the 64-bit PTPCLKVAL register in add/subtract or write mode
- Taking timestamps off of the corrected vs free-running clock
- Starting/stopping the TTEthernet scheduling
- Starting/stopping PPS output
- Resetting the switch

When a command needs to be issued (e.g. "change the PTPCLKVAL from write
mode to add/subtract mode"), one cannot simply write to the command
register setting the PTPCLKADD bit to 1, because that would zeroize the
other settings. One also cannot do a read-modify-write (that would be
too easy for this hardware) because not all bits of the command register
are readable over SPI.

So this leaves us with the only option of keeping the value of the PTP
command register in the driver, and operating on that.

Actually there are 2 types of PTP operations now:
- Operations that modify the cached PTP command. These operate on
  ptp_data->cmd as a pointer.
- Operations that apply all previously cached PTP settings, but don't
  otherwise cache what they did themselves. The sja1105_ptp_reset
  function is such an example. It copies the ptp_data->cmd on stack
  before modifying and writing it to SPI.

This practically means that struct sja1105_ptp_cmd is no longer an
implementation detail, since it needs to be stored in full into struct
sja1105_ptp_data, and hence in struct sja1105_private. So the (*ptp_cmd)
function prototype can change and take struct sja1105_ptp_cmd as second
argument now.
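
The two kinds of operation can be illustrated with a small model in which the SPI write is replaced by a plain struct copy. The field and function names are illustrative only and the set of fields is deliberately incomplete.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative subset of the PTP command fields. */
struct ptp_cmd {
	bool ptpclkadd;	/* PTPCLKVAL in add/subtract (true) or write (false) mode */
	bool resptp;	/* one-shot reset request, never meant to be cached */
};

struct ptp_data {
	struct ptp_cmd cmd;		/* cached copy; the register is not fully readable */
	struct ptp_cmd last_written;	/* stands in for the SPI write in this sketch */
};

/* Always writes the full command, so unrelated settings are preserved. */
static void ptp_cmd_commit(struct ptp_data *d, const struct ptp_cmd *cmd)
{
	d->last_written = *cmd;
}

/* Type 1: modify the cached command, then write it out. */
void ptp_set_add_mode(struct ptp_data *d, bool add)
{
	d->cmd.ptpclkadd = add;
	ptp_cmd_commit(d, &d->cmd);
}

/* Type 2: apply cached settings plus a one-shot bit from a stack copy. */
void ptp_reset(struct ptp_data *d)
{
	struct ptp_cmd cmd = d->cmd;

	cmd.resptp = true;
	ptp_cmd_commit(d, &cmd);
}
```

After `ptp_reset()`, the value written to hardware carries both the reset bit and the previously cached mode, while the cache itself still has the reset bit cleared.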
Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: dsa: sja1105: Move PTP data to its own private structure · a9d6ed7a

Authored by Vladimir Oltean on Oct 12, 2019

This is a non-functional change with 2 goals (both for the case when
CONFIG_NET_DSA_SJA1105_PTP is not enabled):

- Reduce the size of the sja1105_private structure.
- Make the PTP code more self-contained.

Leaving priv->ptp_data.lock to be initialized in sja1105_main.c is not a
leftover: it will be used in a future patch "net: dsa: sja1105: Restore
PTP time after switch reset".
Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: dsa: sja1105: Make all public PTP functions take dsa_switch as argument · 61c77126

Authored by Vladimir Oltean on Oct 12, 2019

The new rule (as already started for sja1105_tas.h) is for functions of
optional driver components (ones which may be disabled via Kconfig - PTP
and TAS) to take struct dsa_switch *ds instead of struct sja1105_private
*priv as first argument.

This is so that forward-declarations of struct sja1105_private can be
avoided.

So make sja1105_ptp.h the second user of this rule.
Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

03 Oct, 2019 2 commits

net: dsa: sja1105: Rename sja1105_spi_send_packed_buf to sja1105_xfer_buf · 1bd44870

Authored by Vladimir Oltean on Oct 01, 2019

The most commonly called function in the driver is long overdue for a
rename. The "packed" word is redundant (it doesn't make sense to
transfer an unpacked structure, since that is in CPU endianness yadda
yadda), and the "spi" word is also redundant since argument 2 of the
function is SPI_READ or SPI_WRITE.

As for the sja1105_spi_send_long_packed_buf function, it is only being
used from sja1105_spi.c, so remove its global prototype.
Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: dsa: sja1105: Replace sja1105_spi_send_int with sja1105_xfer_{u32, u64} · dff79620

Authored by Vladimir Oltean on Oct 01, 2019

Having a function that takes a variable number of unpacked bytes which
it generically calls an "int" is confusing and makes auditing patches
next to impossible.

We only use spi_send_int with the int sizes of 32 and 64 bits. So just
make the spi_send_int function less generic and replace it with the
appropriate two explicit functions, which can now type-check the int
pointer type.

Note that there is still a small weirdness in the u32 function, which
has to convert it to a u64 temporary. This is because of how the packing
API works at the moment, but the weirdness is at least hidden from
callers of sja1105_xfer_u32 now.
Suggested-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

17 Sep, 2019 1 commit

net: dsa: sja1105: Configure the Time-Aware Scheduler via tc-taprio offload · 317ab5b8

Authored by Vladimir Oltean on Sep 15, 2019

This qdisc offload is the closest thing to what the SJA1105 supports in
hardware for time-based egress shaping. The switch core really is built
around SAE AS6802/TTEthernet (a TTTech standard) but can be made to
operate similarly to IEEE 802.1Qbv with some constraints:

- The gate control list is a global list for all ports. There are 8
  execution threads that iterate through this global list in parallel.
  I don't know why 8, there are only 4 front-panel ports.

- Care must be taken by the user to make sure that two execution threads
  never get to execute a GCL entry simultaneously. I created an O(n^4)
  checker for this hardware limitation, which runs prior to accepting a
  taprio offload configuration as valid.

- The spec says that if a GCL entry's interval is shorter than the frame
  length, you shouldn't send it (and end up in head-of-line blocking).
  Well, this switch does anyway.

- The switch has no concept of ADMIN and OPER configurations. Because
  it's so simple, the TAS settings are loaded through the static config
  tables interface, so there isn't even place for any discussion about
  'graceful switchover between ADMIN and OPER'. You just reset the
  switch and upload a new OPER config.

- The switch accepts multiple time sources for the gate events. Right
  now I am using the standalone clock source as opposed to PTP. So the
  base time parameter doesn't really do much. Support for the PTP clock
  source will be added in a future series.
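
To make the scheduling-conflict limitation from the list above concrete, here is a sketch of a pairwise check between two cyclic subschedules. It is not the nested-iteration checker mentioned in the text: it uses a gcd congruence test instead, and it assumes that a "conflict" means two entries starting at exactly the same instant. All names are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct subschedule {
	uint64_t cycle_ns;		/* cycle time, must be non-zero */
	const uint64_t *start_ns;	/* start time of each entry within the first cycle */
	size_t num_entries;
};

static uint64_t gcd_u64(uint64_t a, uint64_t b)
{
	while (b) {
		uint64_t t = a % b;

		a = b;
		b = t;
	}
	return a;
}

/*
 * Entry i of a and entry j of b collide in some cycle if and only if
 * ta + k * Ca == tb + l * Cb has an integer solution (k, l), which holds
 * exactly when gcd(Ca, Cb) divides (ta - tb).
 */
bool subschedules_conflict(const struct subschedule *a,
			   const struct subschedule *b)
{
	uint64_t g = gcd_u64(a->cycle_ns, b->cycle_ns);
	size_t i, j;

	for (i = 0; i < a->num_entries; i++) {
		for (j = 0; j < b->num_entries; j++) {
			uint64_t ta = a->start_ns[i], tb = b->start_ns[j];
			uint64_t diff = ta > tb ? ta - tb : tb - ta;

			if (diff % g == 0)
				return true;
		}
	}
	return false;
}
```

For example, a schedule with a 100 ns cycle and an entry at 0 conflicts with one that has a 150 ns cycle and an entry at 50, because both fire at t = 200.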
Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

10 Jun, 2019 3 commits

net: dsa: sja1105: Add RGMII delay support for P/Q/R/S chips · c05ec3d4

Authored by Vladimir Oltean on Jun 08, 2019

As per the DT phy-mode specification, RGMII delays are applied by the
MAC when there is no PHY present on the link.
Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: dsa: sja1105: Remove duplicate rgmii_pad_mii_tx from regs · b5b0c7f4

Authored by Vladimir Oltean on Jun 08, 2019

The pad_mii_tx registers point to the same memory region but were
unused. So convert to using these for RGMII I/O cell configuration, as
they bear a shorter name.
Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: dsa: sja1105: Export the sja1105_inhibit_tx function · d114fb04

Authored by Vladimir Oltean on Jun 08, 2019

This will be used to stop egress traffic in .phylink_mac_link_up.
Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


09 Jun, 2019 3 commits

net: dsa: sja1105: Add a global sja1105_tagger_data structure · 844d7edc

Committed by Vladimir Oltean on Jun 08, 2019

This will be used to keep state for RX timestamping. It is global
because the switch serializes timestampable and meta frames when
trapping them towards the CPU port (lower port indices have higher
priority) and therefore having one state machine per port would create
unnecessary complications.
Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


net: dsa: sja1105: Add logic for TX timestamping · 47ed985e

Committed by Vladimir Oltean on Jun 08, 2019

On TX, timestamping is performed synchronously from the
port_deferred_xmit worker thread.
In management routes, the switch is requested to take egress timestamps
(again partial), which are reconstructed and appended to a clone of the
skb that was just sent.  The cloning is done by DSA and we retrieve the
pointer from the structure that DSA keeps in skb->cb.
Then these clones are enqueued to the socket's error queue for
application-level processing.
Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

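
Reconstructing a full timestamp from a partial one can be sketched as follows. This is an illustration under stated assumptions rather than the driver's actual code: the function name and the ts_bits parameter are hypothetical, the partial timestamp is assumed to carry the low ts_bits bits of the counter, and "now" is assumed to be read after the frame was stamped and less than one wraparound of the partial field later.

```c
#include <stdint.h>

/* Rebuild a 64-bit timestamp from its low ts_bits bits (ts_partial) and
 * a full counter value read shortly afterwards (now).
 * ts_bits must be in the range 1..63.
 */
static uint64_t tstamp_reconstruct(uint64_t now, uint64_t ts_partial,
				   unsigned int ts_bits)
{
	uint64_t partial_mask = (1ULL << ts_bits) - 1;
	uint64_t ts;

	ts_partial &= partial_mask;

	/* Borrow the upper bits from the current time. */
	ts = (now & ~partial_mask) | ts_partial;

	/* If the low bits of "now" are not ahead of the partial value, the
	 * partial field wrapped between the stamp and the read, so the
	 * upper bits belong to the previous period.
	 */
	if ((now & partial_mask) <= ts_partial)
		ts -= partial_mask + 1;

	return ts;
}
```

With a 24-bit partial field, for example, a counter value of 0x2000003 and a partial timestamp of 0xFFFFF0 reconstruct to 0x1FFFFF0, because the field wrapped after the frame was stamped.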

net: dsa: sja1105: Add support for the PTP clock · bb77f36a

Committed by Vladimir Oltean on Jun 08, 2019

The design of this PHC driver is influenced by the switch's behavior
w.r.t. timestamping.  It exposes two PTP counters, one free-running
(PTPTSCLK) and the other offset- and frequency-corrected in hardware
through PTPCLKVAL, PTPCLKADD and PTPCLKRATE.  The MACs can sample either
of these for frame timestamps.

However, the user manual warns that taking timestamps based on the
corrected clock is less than useful, as the switch can deliver corrupted
timestamps in a variety of circumstances.

Therefore, this PHC uses the free-running PTPTSCLK together with a
timecounter/cyclecounter structure that translates it into a software
time domain.  Thus, the settime/adjtime and adjfine callbacks are
hardware no-ops.

The timestamps (introduced in a further patch) will also be translated
to the correct time domain before being handed over to the userspace PTP
stack.

The introduction of a second set of PHC operations that operate on the
hardware PTPCLKVAL/PTPCLKADD/PTPCLKRATE in the future is somewhat
unavoidable, as the TTEthernet core uses the corrected PTP time domain.
However, the free-running counter + timecounter structure combination
will suffice for now, as the resulting timestamps yield a sub-50 ns
synchronization offset in steady state using linuxptp.

For this patch, in the absence of frame timestamping, the operations of the
switch PHC were tested by syncing it to the system time as a local slave
clock with:

phc2sys -s CLOCK_REALTIME -c swp2 -O 0 -m -S 0.01
Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

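
The idea of layering a software time domain on top of a free-running counter can be sketched like this. It is a simplified model, not the kernel's timecounter/cyclecounter implementation: the struct and function names are hypothetical, fractional nanoseconds are dropped, and the mult/shift values used in the note below (8 ns per tick) are assumptions for illustration.

```c
#include <stdint.h>

/* Hypothetical software clock driven by a free-running hardware counter. */
struct soft_clock {
	uint64_t last_cycles;	/* counter value at the last read */
	uint64_t ns;		/* software time at the last read */
	uint32_t mult;		/* ns per cycle, scaled by 2^shift */
	uint32_t shift;
	uint64_t mask;		/* valid bits of the hardware counter */
};

/* Advance the software time by the cycles elapsed since the last read.
 * The caller must read often enough that delta * mult fits in 64 bits.
 */
static uint64_t soft_clock_read(struct soft_clock *c, uint64_t cycles_now)
{
	uint64_t delta = (cycles_now - c->last_cycles) & c->mask;

	c->ns += (delta * c->mult) >> c->shift;
	c->last_cycles = cycles_now;
	return c->ns;
}

/* adjtime only shifts the software time; the hardware is not touched. */
static void soft_clock_adjtime(struct soft_clock *c, int64_t delta_ns)
{
	c->ns += (uint64_t)delta_ns;
}

/* A frequency adjustment only rescales future conversions. */
static void soft_clock_set_mult(struct soft_clock *c, uint32_t mult)
{
	c->mult = mult;
}
```

With mult = 8 << 28 and shift = 28, 100 elapsed cycles advance the software time by 800 ns. Time and frequency corrections only change the software state, which is why they are no-ops from the hardware's point of view.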

05 Jun, 2019 2 commits

net: dsa: sja1105: Add FDB operations for P/Q/R/S series · 1da73821

Committed by Vladimir Oltean on Jun 03, 2019

This adds support for manipulating the L2 forwarding database (dump,
add, delete) for the second generation of NXP SJA1105 switches.

At the moment only FDB entries installed statically through 'bridge fdb'
are visible in the dump callback - the dynamically learned ones are
still under investigation.
Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


net: dsa: sja1105: Make room for P/Q/R/S FDB operations · 9dfa6911

Committed by Vladimir Oltean on Jun 03, 2019

The DSA callbacks were written with the E/T (first generation) in mind,
which is quite different.

For P/Q/R/S, completely new implementations need to be provided, which
are held as function pointers in the priv->info structure.  We are
taking a slightly roundabout way for this (a function from
sja1105_main.c reads a structure defined in sja1105_spi.c that
points to a function defined in sja1105_main.c), but it is what it is.

The FDB dump callback works for both families, hence no function pointer
for that.
Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

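
The per-generation dispatch described above can be sketched with a small table of function pointers. All names here (chip_info, switch_priv, et_fdb_add, pqrs_fdb_add, switch_fdb_add) are hypothetical and only mirror the shape of the design, not the driver's actual structures.

```c
#include <stdint.h>

struct switch_priv;

/* Per-chip-family description holding the family-specific operation. */
struct chip_info {
	const char *name;
	int (*fdb_add)(struct switch_priv *priv, int port,
		       const uint8_t *addr, uint16_t vid);
};

struct switch_priv {
	const struct chip_info *info;
	int last_generation_used;	/* only here to make the call visible */
};

/* First-generation (E/T) implementation, stubbed out. */
static int et_fdb_add(struct switch_priv *priv, int port,
		      const uint8_t *addr, uint16_t vid)
{
	(void)port; (void)addr; (void)vid;
	priv->last_generation_used = 1;
	return 0;
}

/* Second-generation (P/Q/R/S) implementation, stubbed out. */
static int pqrs_fdb_add(struct switch_priv *priv, int port,
			const uint8_t *addr, uint16_t vid)
{
	(void)port; (void)addr; (void)vid;
	priv->last_generation_used = 2;
	return 0;
}

static const struct chip_info et_info = { "E/T", et_fdb_add };
static const struct chip_info pqrs_info = { "P/Q/R/S", pqrs_fdb_add };

/* Generic callback: it never needs to know which family it talks to. */
static int switch_fdb_add(struct switch_priv *priv, int port,
			  const uint8_t *addr, uint16_t vid)
{
	return priv->info->fdb_add(priv, port, addr, vid);
}
```

The generic callback stays identical for both families; only the info pointer chosen at probe time differs.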

06 May, 2019 1 commit

net: dsa: sja1105: Add support for traffic through standalone ports · 227d07a0

Committed by Vladimir Oltean on May 05, 2019

In order to support this, we are creating a makeshift switch tag out of
a VLAN trunk configured on the CPU port. Termination of normal traffic
on switch ports only works when not under a vlan_filtering bridge.
Termination of management (PTP, BPDU) traffic works under all
circumstances because it uses a different tagging mechanism
(incl_srcpt). We are making use of the generic CONFIG_NET_DSA_TAG_8021Q
code and leveraging it from our own CONFIG_NET_DSA_TAG_SJA1105.

There are two types of traffic: regular and link-local.

The link-local traffic received on the CPU port is trapped from the
switch's regular forwarding decisions because it matched one of the two
DMAC filters for management traffic.

On transmission, the switch requires special massaging for these
link-local frames. Due to a weird implementation of the switching IP, by
default it drops link-local frames that originate on the CPU port.
It needs to be told where to forward them to, through an SPI command
("management route") that is valid for only a single frame.
So when we're sending link-local traffic, we are using the
dsa_defer_xmit mechanism.
Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

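
Classifying a frame as link-local by matching its destination MAC against two masked filters could look like the sketch below. The filter values are assumptions chosen for illustration (the IEEE 802.1 reserved 01:80:C2 prefix used by BPDUs and the 01:1B:19 prefix used by PTP over Ethernet); the macro and function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed management DMAC filters: value plus mask, as 48-bit integers. */
#define LINKLOCAL_FILTER_A	0x0180C2000000ULL
#define LINKLOCAL_FILTER_A_MASK	0xFFFFFF000000ULL
#define LINKLOCAL_FILTER_B	0x011B19000000ULL
#define LINKLOCAL_FILTER_B_MASK	0xFFFFFF000000ULL

/* Pack a 6-byte MAC address (network byte order) into a 64-bit integer. */
static uint64_t mac_to_u64(const uint8_t mac[6])
{
	uint64_t v = 0;

	for (int i = 0; i < 6; i++)
		v = (v << 8) | mac[i];
	return v;
}

/* True if the destination MAC matches either management filter. */
static bool is_link_local(const uint8_t dmac[6])
{
	uint64_t d = mac_to_u64(dmac);

	return (d & LINKLOCAL_FILTER_A_MASK) == LINKLOCAL_FILTER_A ||
	       (d & LINKLOCAL_FILTER_B_MASK) == LINKLOCAL_FILTER_B;
}
```

On transmission, a frame for which a check like this returns true would take the deferred path, so that a one-shot management route can be installed before it is sent.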

03 May, 2019 6 commits

net: dsa: sja1105: Prevent PHY jabbering during switch reset · 1a4c6940

Committed by Vladimir Oltean on May 02, 2019

Resetting the switch at runtime is currently done while changing the
vlan_filtering setting (due to the required TPID change).

But reset is asynchronous with packet egress, and the switch core will
not wait for egress to finish before carrying on with the reset
operation.

As a result, a connected PHY such as the BCM5464 would see an
unterminated Ethernet frame and start to jabber (repeat the last seen
Ethernet symbols - jabber is by definition an oversized Ethernet frame
with bad FCS). This behavior is strange in itself, but it also causes
the MACs of some link partners (such as the FRDM-LS1012A) to completely
lock up.

So as a remedy for this situation, when switch reset is required, simply
inhibit Tx on all ports, and wait long enough for any frame that may
still be in the egress queue (not even the Tx inhibit command is
instantaneous) to be flushed.
Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>

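
One way to estimate how long a single frame needs to drain is to divide its on-wire size by the link speed. This is a back-of-the-envelope sketch, not necessarily how the driver picks its delay: the function name is hypothetical, and the 8 extra bytes account for the preamble and start-of-frame delimiter.

```c
#include <stdint.h>

/* Time, in microseconds (rounded up), to transmit one frame of
 * frame_bytes (including FCS) plus 8 bytes of preamble/SFD at
 * speed_mbps. Returns UINT32_MAX for an invalid speed of 0.
 */
static uint32_t frame_drain_time_us(uint32_t frame_bytes, uint32_t speed_mbps)
{
	uint64_t bits;

	if (!speed_mbps)
		return UINT32_MAX;

	bits = ((uint64_t)frame_bytes + 8) * 8;

	/* 1 Mbps is 1 bit per microsecond, so bits / Mbps gives us. */
	return (uint32_t)((bits + speed_mbps - 1) / speed_mbps);
}
```

For a 1518-byte frame this gives about 1221 us at 10 Mbps, 123 us at 100 Mbps and 13 us at 1000 Mbps, so the slowest supported link speed dominates the wait.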

net: dsa: sja1105: Add support for configuring address ageing time · 8456721d

Committed by Vladimir Oltean on May 02, 2019

If STP is active, this setting is applied on bridged ports each time an
Ethernet link is established (topology changes).

Since the setting is global to the switch and a reset is required to
change it, the reset is skipped if the new callback does not change the
value that the hardware is already programmed with.
Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>

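
Skipping the reset when nothing changes can be sketched as a compare-before-write on a cached value. The names below are hypothetical, the reset is reduced to a counter, and the 10 ms hardware granularity is an assumption made for the example.

```c
#include <stdint.h>

/* Hypothetical cached switch state. */
struct switch_state {
	uint32_t ageing_units;		/* value currently programmed */
	unsigned int reset_count;	/* number of resets performed */
};

/* Stand-in for the expensive "reset and reload static config" step. */
static void switch_reset_and_reload(struct switch_state *sw)
{
	sw->reset_count++;
}

/* Apply a new ageing time, resetting the switch only on a real change. */
static int set_ageing_time(struct switch_state *sw, uint32_t ageing_ms)
{
	/* Assumed hardware granularity of 10 ms per unit. */
	uint32_t units = ageing_ms / 10;

	if (units == sw->ageing_units)
		return 0;	/* nothing to do, avoid the reset */

	sw->ageing_units = units;
	switch_reset_and_reload(sw);
	return 0;
}
```

Repeated calls with the same value, which is what happens on every topology change when STP is active, then cost no reset at all.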

net: dsa: sja1105: Add support for ethtool port counters · 52c34e6e

Committed by Vladimir Oltean on May 02, 2019

Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


net: dsa: sja1105: Error out if RGMII delays are requested in DT · f5b8631c

Committed by Vladimir Oltean on May 02, 2019

Documentation/devicetree/bindings/net/ethernet.txt is confusing because
it says what the MAC should not do, but not what it *should* do:

* "rgmii-rxid" (RGMII with internal RX delay provided by the PHY, the MAC
should not add an RX delay in this case)

The gap in semantics is threefold:
1. Is it illegal for the MAC to apply the Rx internal delay by itself,
   and simplify the phy_mode (mask off "rgmii-rxid" into "rgmii") before
   passing it to of_phy_connect? The documentation would suggest yes.
2. For "rgmii-rxid", while the situation with the Rx clock skew is more
   or less clear (needs to be added by the PHY), what should the MAC
   driver do about the Tx delays? Is it an implicit wild card for the
   MAC to apply delays in the Tx direction if it can? What if those were
   already added as serpentine PCB traces, how could that be made more
   obvious through DT bindings so that the MAC doesn't attempt to add
   them twice and again potentially break the link?
3. If the interface is a fixed-link and therefore the PHY object is
   fixed (a purely software entity that obviously cannot add clock
   skew), what is the meaning of the above property?

So an interpretation of the RGMII bindings was chosen that hopefully
does not contradict their intention but also makes them more practical.
The SJA1105 driver acts upon "rgmii-*id" phy-mode bindings only
if the port is in the PHY role (either explicitly, or if it is a
fixed-link). Otherwise it always passes the duty of setting up delays to
the PHY driver.

The error behavior that this patch adds is required on SJA1105E/T where
the MAC really cannot apply internal delays. If the other end of the
fixed-link cannot apply RGMII delays either (this would be specified
through its own DT bindings), then the situation requires PCB delays.

For SJA1105P/Q/R/S, this is however hardware supported and the error is
thus only temporary. I created a stub function pointer for configuring
delays per-port on RXC and TXC, and will implement it when I have access
to a board with this hardware setup.

Meanwhile do not allow the user to select an invalid configuration.
Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>

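
The chosen interpretation can be summarized as a small decision function. This is a sketch of the rules stated in this message only: the enum, struct and function names are hypothetical, and chip_can_delay stands for "this chip generation has a working delay setup routine".

```c
#include <errno.h>
#include <stdbool.h>

enum phy_mode { MODE_RGMII, MODE_RGMII_ID, MODE_RGMII_RXID, MODE_RGMII_TXID };
enum port_role { ROLE_MAC, ROLE_PHY };

struct port_cfg {
	enum phy_mode mode;
	enum port_role role;
	bool fixed_link;
};

/* Decide which RGMII delays the switch port itself must apply.
 * Returns 0 on success or -EINVAL if delays are requested but the
 * hardware (or the driver, for now) cannot provide them.
 */
static int port_rgmii_delays(const struct port_cfg *p, bool chip_can_delay,
			     bool *rx_delay, bool *tx_delay)
{
	*rx_delay = false;
	*tx_delay = false;

	/* In the MAC role with a real PHY, delays are the PHY's job. */
	if (p->role != ROLE_PHY && !p->fixed_link)
		return 0;

	if (p->mode == MODE_RGMII_RXID || p->mode == MODE_RGMII_ID)
		*rx_delay = true;
	if (p->mode == MODE_RGMII_TXID || p->mode == MODE_RGMII_ID)
		*tx_delay = true;

	if ((*rx_delay || *tx_delay) && !chip_can_delay)
		return -EINVAL;

	return 0;
}
```

A fixed-link port with "rgmii-id" on a chip that cannot apply delays is rejected, while the same phy-mode on a MAC-role port attached to a real PHY is accepted and left to the PHY driver.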

net: dsa: sja1105: Add support for FDB and MDB management · 291d1e72

Committed by Vladimir Oltean on May 02, 2019

Currently only the (more difficult) first generation E/T series is
supported. Here the TCAM is only 4-way associative, and to know where
the hardware will search for an FDB entry, we need to perform the same
hash algorithm in order to install the entry in the correct bin.

On P/Q/R/S, the TCAM should be fully associative. However the SPI
command interface is different, and because I don't have access to a
new-generation device at the moment, support for it is TODO.
Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

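
Installing an entry into a 4-way set-associative table can be sketched as follows. This is a toy model under loud assumptions: the names are hypothetical, the table geometry (256 bins of 4 ways) is assumed for illustration, and toy_hash() is a placeholder that is NOT the hash the hardware uses. A real driver has to reproduce the hardware's own hash so that the entry lands in the bin the switch will search.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define NUM_WAYS 4	/* entries per bin (4-way associative) */
#define NUM_BINS 256	/* assumed number of bins */

struct fdb_entry {
	bool valid;
	uint8_t mac[6];
	uint16_t vid;
};

struct fdb_table {
	struct fdb_entry e[NUM_BINS * NUM_WAYS];
};

/* Placeholder hash over {MAC, VID}; not the hardware's algorithm. */
static unsigned int toy_hash(const uint8_t mac[6], uint16_t vid)
{
	unsigned int h = (vid & 0xff) ^ (vid >> 8);

	for (int i = 0; i < 6; i++)
		h ^= mac[i];
	return h;
}

/* Install {mac, vid} in the bin selected by the hash. Returns the table
 * index used, or -1 if all ways of that bin are taken by other entries.
 */
static int fdb_install(struct fdb_table *t, const uint8_t mac[6], uint16_t vid)
{
	unsigned int bin = toy_hash(mac, vid) % NUM_BINS;

	for (int way = 0; way < NUM_WAYS; way++) {
		struct fdb_entry *e = &t->e[bin * NUM_WAYS + way];

		if (e->valid && e->vid == vid && !memcmp(e->mac, mac, 6))
			return (int)(bin * NUM_WAYS + way); /* already there */

		if (!e->valid) {
			e->valid = true;
			e->vid = vid;
			memcpy(e->mac, mac, 6);
			return (int)(bin * NUM_WAYS + way);
		}
	}

	return -1;	/* bin full: a fifth colliding entry cannot be stored */
}
```

The limitation that follows from this layout is that a fifth address hashing to the same bin cannot be stored even when the rest of the table is empty.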

net: dsa: Introduce driver for NXP SJA1105 5-port L2 switch · 8aa9ebcc

Committed by Vladimir Oltean on May 02, 2019

At this moment the following is supported:
* Link state management through phylib
* Autonomous L2 forwarding managed through iproute2 bridge commands.

IP termination must currently be done through the master netdevice,
since the switch is unmanaged at this point and uses
DSA_TAG_PROTO_NONE.
Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: Georg Waibel <georg.waibel@sensor-technik.de>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

