1. 18 8月, 2022 1 次提交
    • V
      net: mscc: ocelot: fix incorrect ndo_get_stats64 packet counters · 5152de7b
      Vladimir Oltean 提交于
      Reading stats using the SYS_COUNT_* register definitions is only used by
      ocelot_get_stats64() from the ocelot switchdev driver, however,
      currently the bucket definitions are incorrect.
      
      Separately, on both RX and TX, we have the following problems:
      - a 256-1023 bucket which actually tracks the 256-511 packets
      - the 1024-1526 bucket actually tracks the 512-1023 packets
      - the 1527-max bucket actually tracks the 1024-1526 packets
      
      => nobody tracks the packets from the real 1527-max bucket
      
      Additionally, the RX_PAUSE, RX_CONTROL, RX_LONGS and RX_CLASSIFIED_DROPS
      all track the wrong thing. However this doesn't seem to have any
      consequence, since ocelot_get_stats64() doesn't use these.
      
      Even though this problem only manifests itself for the switchdev driver,
      we cannot split the fix for ocelot and for DSA, since it requires fixing
      the bucket definitions from enum ocelot_reg, which makes us necessarily
      adapt the structures from felix and seville as well.
      
      Fixes: 84705fc1 ("net: dsa: felix: introduce support for Seville VSC9953 switch")
      Fixes: 56051948 ("net: dsa: ocelot: add driver for Felix switch family")
      Fixes: a556c76a ("net: mscc: Add initial Ocelot switch support")
      Signed-off-by: NVladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: NJakub Kicinski <kuba@kernel.org>
      5152de7b
  2. 12 5月, 2022 1 次提交
  3. 30 4月, 2022 1 次提交
    • C
      net: ethernet: ocelot: remove the need for num_stats initializer · 2f187bfa
      Colin Foster 提交于
      There is a desire to share the oclot_stats_layout struct outside of the
      current vsc7514 driver. In order to do so, the length of the array needs to
      be known at compile time, and defined in the struct ocelot and struct
      felix_info.
      
      Since the array is defined in a .c file and would be declared in the header
      file via:
      extern struct ocelot_stat_layout[];
      the size of the array will not be known at compile time to outside modules.
      
      To fix this, remove the need for defining the number of stats at compile
      time and allow this number to be determined at initialization.
      Signed-off-by: NColin Foster <colin.foster@in-advantage.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      2f187bfa
  4. 28 2月, 2022 1 次提交
  5. 26 2月, 2022 1 次提交
  6. 09 2月, 2022 1 次提交
    • V
      net: dsa: seville: register the mdiobus under devres · bd488afc
      Vladimir Oltean 提交于
      As explained in commits:
      74b6d7d1 ("net: dsa: realtek: register the MDIO bus under devres")
      5135e96a ("net: dsa: don't allocate the slave_mii_bus using devres")
      
      mdiobus_free() will panic when called from devm_mdiobus_free() <-
      devres_release_all() <- __device_release_driver(), and that mdiobus was
      not previously unregistered.
      
      The Seville VSC9959 switch is a platform device, so the initial set of
      constraints that I thought would cause this (I2C or SPI buses which call
      ->remove on ->shutdown) do not apply. But there is one more which
      applies here.
      
      If the DSA master itself is on a bus that calls ->remove from ->shutdown
      (like dpaa2-eth, which is on the fsl-mc bus), there is a device link
      between the switch and the DSA master, and device_links_unbind_consumers()
      will unbind the seville switch driver on shutdown.
      
      So the same treatment must be applied to all DSA switch drivers, which
      is: either use devres for both the mdiobus allocation and registration,
      or don't use devres at all.
      
      The seville driver has a code structure that could accommodate both the
      mdiobus_unregister and mdiobus_free calls, but it has an external
      dependency upon mscc_miim_setup() from mdio-mscc-miim.c, which calls
      devm_mdiobus_alloc_size() on its behalf. So rather than restructuring
      that, and exporting yet one more symbol mscc_miim_teardown(), let's work
      with devres and replace of_mdiobus_register with the devres variant.
      When we use all-devres, we can ensure that devres doesn't free a
      still-registered bus (it either runs both callbacks, or none).
      
      Fixes: ac3a68d5 ("net: phy: don't abuse devres in devm_mdiobus_register()")
      Signed-off-by: NVladimir Oltean <vladimir.oltean@nxp.com>
      Reviewed-by: NFlorian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: NJakub Kicinski <kuba@kernel.org>
      bd488afc
  7. 03 1月, 2022 2 次提交
  8. 08 12月, 2021 1 次提交
  9. 29 11月, 2021 2 次提交
  10. 18 11月, 2021 1 次提交
  11. 24 10月, 2021 1 次提交
    • S
      net: convert users of bitmap_foo() to linkmode_foo() · 4973056c
      Sean Anderson 提交于
      This converts instances of
      	bitmap_foo(args..., __ETHTOOL_LINK_MODE_MASK_NBITS)
      to
      	linkmode_foo(args...)
      
      I manually fixed up some lines to prevent them from being excessively
      long. Otherwise, this change was generated with the following semantic
      patch:
      
      // Generated with
      // echo linux/linkmode.h > includes
      // git grep -Flf includes include/ | cut -f 2- -d / | cat includes - \
      // | sort | uniq | tee new_includes | wc -l && mv new_includes includes
      // and repeating until the number stopped going up
      @i@
      @@
      
      (
       #include <linux/acpi_mdio.h>
      |
       #include <linux/brcmphy.h>
      |
       #include <linux/dsa/loop.h>
      |
       #include <linux/dsa/sja1105.h>
      |
       #include <linux/ethtool.h>
      |
       #include <linux/ethtool_netlink.h>
      |
       #include <linux/fec.h>
      |
       #include <linux/fs_enet_pd.h>
      |
       #include <linux/fsl/enetc_mdio.h>
      |
       #include <linux/fwnode_mdio.h>
      |
       #include <linux/linkmode.h>
      |
       #include <linux/lsm_audit.h>
      |
       #include <linux/mdio-bitbang.h>
      |
       #include <linux/mdio.h>
      |
       #include <linux/mdio-mux.h>
      |
       #include <linux/mii.h>
      |
       #include <linux/mii_timestamper.h>
      |
       #include <linux/mlx5/accel.h>
      |
       #include <linux/mlx5/cq.h>
      |
       #include <linux/mlx5/device.h>
      |
       #include <linux/mlx5/driver.h>
      |
       #include <linux/mlx5/eswitch.h>
      |
       #include <linux/mlx5/fs.h>
      |
       #include <linux/mlx5/port.h>
      |
       #include <linux/mlx5/qp.h>
      |
       #include <linux/mlx5/rsc_dump.h>
      |
       #include <linux/mlx5/transobj.h>
      |
       #include <linux/mlx5/vport.h>
      |
       #include <linux/of_mdio.h>
      |
       #include <linux/of_net.h>
      |
       #include <linux/pcs-lynx.h>
      |
       #include <linux/pcs/pcs-xpcs.h>
      |
       #include <linux/phy.h>
      |
       #include <linux/phy_led_triggers.h>
      |
       #include <linux/phylink.h>
      |
       #include <linux/platform_data/bcmgenet.h>
      |
       #include <linux/platform_data/xilinx-ll-temac.h>
      |
       #include <linux/pxa168_eth.h>
      |
       #include <linux/qed/qed_eth_if.h>
      |
       #include <linux/qed/qed_fcoe_if.h>
      |
       #include <linux/qed/qed_if.h>
      |
       #include <linux/qed/qed_iov_if.h>
      |
       #include <linux/qed/qed_iscsi_if.h>
      |
       #include <linux/qed/qed_ll2_if.h>
      |
       #include <linux/qed/qed_nvmetcp_if.h>
      |
       #include <linux/qed/qed_rdma_if.h>
      |
       #include <linux/sfp.h>
      |
       #include <linux/sh_eth.h>
      |
       #include <linux/smsc911x.h>
      |
       #include <linux/soc/nxp/lpc32xx-misc.h>
      |
       #include <linux/stmmac.h>
      |
       #include <linux/sunrpc/svc_rdma.h>
      |
       #include <linux/sxgbe_platform.h>
      |
       #include <net/cfg80211.h>
      |
       #include <net/dsa.h>
      |
       #include <net/mac80211.h>
      |
       #include <net/selftests.h>
      |
       #include <rdma/ib_addr.h>
      |
       #include <rdma/ib_cache.h>
      |
       #include <rdma/ib_cm.h>
      |
       #include <rdma/ib_hdrs.h>
      |
       #include <rdma/ib_mad.h>
      |
       #include <rdma/ib_marshall.h>
      |
       #include <rdma/ib_pack.h>
      |
       #include <rdma/ib_pma.h>
      |
       #include <rdma/ib_sa.h>
      |
       #include <rdma/ib_smi.h>
      |
       #include <rdma/ib_umem.h>
      |
       #include <rdma/ib_umem_odp.h>
      |
       #include <rdma/ib_verbs.h>
      |
       #include <rdma/iw_cm.h>
      |
       #include <rdma/mr_pool.h>
      |
       #include <rdma/opa_addr.h>
      |
       #include <rdma/opa_port_info.h>
      |
       #include <rdma/opa_smi.h>
      |
       #include <rdma/opa_vnic.h>
      |
       #include <rdma/rdma_cm.h>
      |
       #include <rdma/rdma_cm_ib.h>
      |
       #include <rdma/rdmavt_cq.h>
      |
       #include <rdma/rdma_vt.h>
      |
       #include <rdma/rdmavt_qp.h>
      |
       #include <rdma/rw.h>
      |
       #include <rdma/tid_rdma_defs.h>
      |
       #include <rdma/uverbs_ioctl.h>
      |
       #include <rdma/uverbs_named_ioctl.h>
      |
       #include <rdma/uverbs_std_types.h>
      |
       #include <rdma/uverbs_types.h>
      |
       #include <soc/mscc/ocelot.h>
      |
       #include <soc/mscc/ocelot_ptp.h>
      |
       #include <soc/mscc/ocelot_vcap.h>
      |
       #include <trace/events/ib_mad.h>
      |
       #include <trace/events/rdma_core.h>
      |
       #include <trace/events/rdma.h>
      |
       #include <trace/events/rpcrdma.h>
      |
       #include <uapi/linux/ethtool.h>
      |
       #include <uapi/linux/ethtool_netlink.h>
      |
       #include <uapi/linux/mdio.h>
      |
       #include <uapi/linux/mii.h>
      )
      
      @depends on i@
      expression list args;
      @@
      
      (
      - bitmap_zero(args, __ETHTOOL_LINK_MODE_MASK_NBITS)
      + linkmode_zero(args)
      |
      - bitmap_copy(args, __ETHTOOL_LINK_MODE_MASK_NBITS)
      + linkmode_copy(args)
      |
      - bitmap_and(args, __ETHTOOL_LINK_MODE_MASK_NBITS)
      + linkmode_and(args)
      |
      - bitmap_or(args, __ETHTOOL_LINK_MODE_MASK_NBITS)
      + linkmode_or(args)
      |
      - bitmap_empty(args, ETHTOOL_LINK_MODE_MASK_NBITS)
      + linkmode_empty(args)
      |
      - bitmap_andnot(args, __ETHTOOL_LINK_MODE_MASK_NBITS)
      + linkmode_andnot(args)
      |
      - bitmap_equal(args, __ETHTOOL_LINK_MODE_MASK_NBITS)
      + linkmode_equal(args)
      |
      - bitmap_intersects(args, __ETHTOOL_LINK_MODE_MASK_NBITS)
      + linkmode_intersects(args)
      |
      - bitmap_subset(args, __ETHTOOL_LINK_MODE_MASK_NBITS)
      + linkmode_subset(args)
      )
      
      Add missing linux/mii.h include to mellanox. -DaveM
      Signed-off-by: NSean Anderson <sean.anderson@seco.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      4973056c
  12. 19 9月, 2021 1 次提交
    • V
      net: dsa: be compatible with masters which unregister on shutdown · 0650bf52
      Vladimir Oltean 提交于
      Lino reports that on his system with bcmgenet as DSA master and KSZ9897
      as a switch, rebooting or shutting down never works properly.
      
      What does the bcmgenet driver have special to trigger this, that other
      DSA masters do not? It has an implementation of ->shutdown which simply
      calls its ->remove implementation. Otherwise said, it unregisters its
      network interface on shutdown.
      
      This message can be seen in a loop, and it hangs the reboot process there:
      
      unregister_netdevice: waiting for eth0 to become free. Usage count = 3
      
      So why 3?
      
      A usage count of 1 is normal for a registered network interface, and any
      virtual interface which links itself as an upper of that will increment
      it via dev_hold. In the case of DSA, this is the call path:
      
      dsa_slave_create
      -> netdev_upper_dev_link
         -> __netdev_upper_dev_link
            -> __netdev_adjacent_dev_insert
               -> dev_hold
      
      So a DSA switch with 3 interfaces will result in a usage count elevated
      by two, and netdev_wait_allrefs will wait until they have gone away.
      
      Other stacked interfaces, like VLAN, watch NETDEV_UNREGISTER events and
      delete themselves, but DSA cannot just vanish and go poof, at most it
      can unbind itself from the switch devices, but that must happen strictly
      earlier compared to when the DSA master unregisters its net_device, so
      reacting on the NETDEV_UNREGISTER event is way too late.
      
      It seems that it is a pretty established pattern to have a driver's
      ->shutdown hook redirect to its ->remove hook, so the same code is
      executed regardless of whether the driver is unbound from the device, or
      the system is just shutting down. As Florian puts it, it is quite a big
      hammer for bcmgenet to unregister its net_device during shutdown, but
      having a common code path with the driver unbind helps ensure it is well
      tested.
      
      So DSA, for better or for worse, has to live with that and engage in an
      arms race of implementing the ->shutdown hook too, from all individual
      drivers, and do something sane when paired with masters that unregister
      their net_device there. The only sane thing to do, of course, is to
      unlink from the master.
      
      However, complications arise really quickly.
      
      The pattern of redirecting ->shutdown to ->remove is not unique to
      bcmgenet or even to net_device drivers. In fact, SPI controllers do it
      too (see dspi_shutdown -> dspi_remove), and presumably, I2C controllers
      and MDIO controllers do it too (this is something I have not researched
      too deeply, but even if this is not the case today, it is certainly
      plausible to happen in the future, and must be taken into consideration).
      
      Since DSA switches might be SPI devices, I2C devices, MDIO devices, the
      insane implication is that for the exact same DSA switch device, we
      might have both ->shutdown and ->remove getting called.
      
      So we need to do something with that insane environment. The pattern
      I've come up with is "if this, then not that", so if either ->shutdown
      or ->remove gets called, we set the device's drvdata to NULL, and in the
      other hook, we check whether the drvdata is NULL and just do nothing.
      This is probably not necessary for platform devices, just for devices on
      buses, but I would really insist for consistency among drivers, because
      when code is copy-pasted, it is not always copy-pasted from the best
      sources.
      
      So depending on whether the DSA switch's ->remove or ->shutdown will get
      called first, we cannot really guarantee even for the same driver if
      rebooting will result in the same code path on all platforms. But
      nonetheless, we need to do something minimally reasonable on ->shutdown
      too to fix the bug. Of course, the ->remove will do more (a full
      teardown of the tree, with all data structures freed, and this is why
      the bug was not caught for so long). The new ->shutdown method is kept
      separate from dsa_unregister_switch not because we couldn't have
      unregistered the switch, but simply in the interest of doing something
      quick and to the point.
      
      The big question is: does the DSA switch's ->shutdown get called earlier
      than the DSA master's ->shutdown? If not, there is still a risk that we
      might still trigger the WARN_ON in unregister_netdevice that says we are
      attempting to unregister a net_device which has uppers. That's no good.
      Although the reference to the master net_device won't physically go away
      even if DSA's ->shutdown comes afterwards, remember we have a dev_hold
      on it.
      
      The answer to that question lies in this comment above device_link_add:
      
       * A side effect of the link creation is re-ordering of dpm_list and the
       * devices_kset list by moving the consumer device and all devices depending
       * on it to the ends of these lists (that does not happen to devices that have
       * not been registered when this function is called).
      
      so the fact that DSA uses device_link_add towards its master is not
      exactly for nothing. device_shutdown() walks devices_kset from the back,
      so this is our guarantee that DSA's shutdown happens before the master's
      shutdown.
      
      Fixes: 2f1e8ea7 ("net: dsa: link interfaces with the DSA master to get rid of lockdep warnings")
      Link: https://lore.kernel.org/netdev/20210909095324.12978-1-LinoSanfilippo@gmx.de/Reported-by: NLino Sanfilippo <LinoSanfilippo@gmx.de>
      Signed-off-by: NVladimir Oltean <vladimir.oltean@nxp.com>
      Tested-by: NAndrew Lunn <andrew@lunn.ch>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      0650bf52
  13. 08 6月, 2021 1 次提交
  14. 15 2月, 2021 2 次提交
    • V
      net: dsa: tag_ocelot: create separate tagger for Seville · 7c4bb540
      Vladimir Oltean 提交于
      The ocelot tagger is a hot mess currently, it relies on memory
      initialized by the attached driver for basic frame transmission.
      This is against all that DSA tagging protocols stand for, which is that
      the transmission and reception of a DSA-tagged frame, the data path,
      should be independent from the switch control path, because the tag
      protocol is in principle hot-pluggable and reusable across switches
      (even if in practice it wasn't until very recently). But if another
      driver like dsa_loop wants to make use of tag_ocelot, it couldn't.
      
      This was done to have common code between Felix and Ocelot, which have
      one bit difference in the frame header format. Quoting from commit
      67c24049 ("net: dsa: felix: create a template for the DSA tags on
      xmit"):
      
          Other alternatives have been analyzed, such as:
          - Create a separate tag_seville.c: too much code duplication for just 1
            bit field difference.
          - Create a separate DSA_TAG_PROTO_SEVILLE under tag_ocelot.c, just like
            tag_brcm.c, which would have a separate .xmit function. Again, too
            much code duplication for just 1 bit field difference.
          - Allocate the template from the init function of the tag_ocelot.c
            module, instead of from the driver: couldn't figure out a method of
            accessing the correct port template corresponding to the correct
            tagger in the .xmit function.
      
      The really interesting part is that Seville should have had its own
      tagging protocol defined - it is not compatible on the wire with Ocelot,
      even for that single bit. In principle, a packet generated by
      DSA_TAG_PROTO_OCELOT when booted on NXP LS1028A would look in a certain
      way, but when booted on NXP T1040 it would look differently. The reverse
      is also true: a packet generated by a Seville switch would be
      interpreted incorrectly by Wireshark if it was told it was generated by
      an Ocelot switch.
      
      Actually things are a bit more nuanced. If we concentrate only on the
      DSA tag, what I said above is true, but Ocelot/Seville also support an
      optional DSA tag prefix, which can be short or long, and it is possible
      to distinguish the two taggers based on an integer constant put in that
      prefix. Nonetheless, creating a separate tagger is still justified,
      since the tag prefix is optional, and without it, there is again no way
      to distinguish.
      
      Claiming backwards binary compatibility is a bit more tough, since I've
      already changed the format of tag_ocelot once, in commit 5124197c
      ("net: dsa: tag_ocelot: use a short prefix on both ingress and egress").
      Therefore I am not very concerned with treating this as a bugfix and
      backporting it to stable kernels (which would be another mess due to the
      fact that there would be lots of conflicts with the other DSA_TAG_PROTO*
      definitions). It's just simpler to say that the string values of the
      taggers have ABI value starting with kernel 5.12, which will be when the
      changing of tag protocol via /sys/class/net/<dsa-master>/dsa/tagging
      goes live.
      Signed-off-by: NVladimir Oltean <vladimir.oltean@nxp.com>
      Reviewed-by: NFlorian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      7c4bb540
    • V
      net: mscc: ocelot: use common tag parsing code with DSA · 40d3f295
      Vladimir Oltean 提交于
      The Injection Frame Header and Extraction Frame Header that the switch
      prepends to frames over the NPI port is also prepended to frames
      delivered over the CPU port module's queues.
      
      Let's unify the handling of the frame headers by making the ocelot
      driver call some helpers exported by the DSA tagger. Among other things,
      this allows us to get rid of the strange cpu_to_be32 when transmitting
      the Injection Frame Header on ocelot, since the packing API uses
      network byte order natively (when "quirks" is 0).
      
      The comments above ocelot_gen_ifh talk about setting pop_cnt to 3, and
      the cpu extraction queue mask to something, but the code doesn't do it,
      so we don't do it either.
      Signed-off-by: NVladimir Oltean <vladimir.oltean@nxp.com>
      Reviewed-by: NFlorian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      40d3f295
  15. 30 1月, 2021 1 次提交
    • V
      net: dsa: felix: convert to the new .change_tag_protocol DSA API · adb3dccf
      Vladimir Oltean 提交于
      In expectation of the new tag_ocelot_8021q tagger implementation, we
      need to be able to do runtime switchover between one tagger and another.
      So we must structure the existing code for the current NPI-based tagger
      in a certain way.
      
      We move the felix_npi_port_init function in expectation of the future
      driver configuration necessary for tag_ocelot_8021q: we would like to
      not have the NPI-related bits interspersed with the tag_8021q bits.
      
      The conversion from this:
      
      	ocelot_write_rix(ocelot,
      			 ANA_PGID_PGID_PGID(GENMASK(ocelot->num_phys_ports, 0)),
      			 ANA_PGID_PGID, PGID_UC);
      
      to this:
      
      	cpu_flood = ANA_PGID_PGID_PGID(BIT(ocelot->num_phys_ports));
      	ocelot_rmw_rix(ocelot, cpu_flood, cpu_flood, ANA_PGID_PGID, PGID_UC);
      
      is perhaps non-trivial, but is nonetheless non-functional. The PGID_UC
      (replicator for unknown unicast) is already configured out of hardware
      reset to flood to all ports except ocelot->num_phys_ports (the CPU port
      module). All we change is that we use a read-modify-write to only add
      the CPU port module to the unknown unicast replicator, as opposed to
      doing a full write to the register.
      Signed-off-by: NVladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: NJakub Kicinski <kuba@kernel.org>
      adb3dccf
  16. 16 1月, 2021 3 次提交
  17. 06 12月, 2020 1 次提交
    • V
      net: mscc: ocelot: fix dropping of unknown IPv4 multicast on Seville · edd2410b
      Vladimir Oltean 提交于
      The current assumption is that the felix DSA driver has flooding knobs
      per traffic class, while ocelot switchdev has a single flooding knob.
      This was correct for felix VSC9959 and ocelot VSC7514, but with the
      introduction of seville VSC9953, we see a switch driven by felix.c which
      has a single flooding knob.
      
      So it is clear that we must do what should have been done from the
      beginning, which is not to overwrite the configuration done by ocelot.c
      in felix, but instead to teach the common ocelot library about the
      differences in our switches, and set up the flooding PGIDs centrally.
      
      The effect that the bogus iteration through FELIX_NUM_TC has upon
      seville is quite dramatic. ANA_FLOODING is located at 0x00b548, and
      ANA_FLOODING_IPMC is located at 0x00b54c. So the bogus iteration will
      actually overwrite ANA_FLOODING_IPMC when attempting to write
      ANA_FLOODING[1]. There is no ANA_FLOODING[1] in sevile, just ANA_FLOODING.
      
      And when ANA_FLOODING_IPMC is overwritten with a bogus value, the effect
      is that ANA_FLOODING_IPMC gets the value of 0x0003CF7D:
      	MC6_DATA = 61,
      	MC6_CTRL = 61,
      	MC4_DATA = 60,
      	MC4_CTRL = 0.
      Because MC4_CTRL is zero, this means that IPv4 multicast control packets
      are not flooded, but dropped. An invalid configuration, and this is how
      the issue was actually spotted.
      Reported-by: NEldar Gasanov <eldargasanov2@gmail.com>
      Reported-by: NMaxim Kochetkov <fido_max@inbox.ru>
      Tested-by: NEldar Gasanov <eldargasanov2@gmail.com>
      Fixes: 84705fc1 ("net: dsa: felix: introduce support for Seville VSC9953 switch")
      Fixes: 3c7b51bd ("net: dsa: felix: allow flooding for all traffic classes")
      Signed-off-by: NVladimir Oltean <vladimir.oltean@nxp.com>
      Reviewed-by: NAlexandre Belloni <alexandre.belloni@bootlin.com>
      Link: https://lore.kernel.org/r/20201204175416.1445937-1-vladimir.oltean@nxp.comSigned-off-by: NJakub Kicinski <kuba@kernel.org>
      edd2410b
  18. 20 10月, 2020 1 次提交
  19. 06 10月, 2020 1 次提交
  20. 03 10月, 2020 1 次提交
  21. 30 9月, 2020 6 次提交
  22. 27 9月, 2020 1 次提交
    • V
      net: dsa: tag_ocelot: use a short prefix on both ingress and egress · 5124197c
      Vladimir Oltean 提交于
      There are 2 goals that we follow:
      
      - Reduce the header size
      - Make the header size equal between RX and TX
      
      The issue that required long prefix on RX was the fact that the ocelot
      DSA tag, being put before Ethernet as it is, would overlap with the area
      that a DSA master uses for RX filtering (destination MAC address
      mainly).
      
      Now that we can ask DSA to put the master in promiscuous mode, in theory
      we could remove the prefix altogether and call it a day, but it looks
      like we can't. Using no prefix on ingress, some packets (such as ICMP)
      would be received, while others (such as PTP) would not be received.
      This is because the DSA master we use (enetc) triggers parse errors
      ("MAC rx frame errors") presumably because it sees Ethernet frames with
      a bad length. And indeed, when using no prefix, the EtherType (bytes
      12-13 of the frame, bits 96-111) falls over the REW_VAL field from the
      extraction header, aka the PTP timestamp.
      
      When turning the short (32-bit) prefix on, the EtherType overlaps with
      bits 64-79 of the extraction header, which are a reserved area
      transmitted as zero by the switch. The packets are not dropped by the
      DSA master with a short prefix. Actually, the frames look like this in
      tcpdump (below is a PTP frame, with an extra dsa_8021q tag - dadb 0482 -
      added by a downstream sja1105).
      
      89:0c:a9:f2:01:00 > 88:80:00:0a:00:1d, 802.3, length 0: LLC, \
      	dsap Unknown (0x10) Individual, ssap ProWay NM (0x0e) Response, \
      	ctrl 0x0004: Information, send seq 2, rcv seq 0, \
      	Flags [Response], length 78
      
      0x0000:  8880 000a 001d 890c a9f2 0100 0000 100f  ................
      0x0010:  0400 0000 0180 c200 000e 001f 7b63 0248  ............{c.H
      0x0020:  dadb 0482 88f7 1202 0036 0000 0000 0000  .........6......
      0x0030:  0000 0000 0000 0000 0000 001f 7bff fe63  ............{..c
      0x0040:  0248 0001 1f81 0500 0000 0000 0000 0000  .H..............
      0x0050:  0000 0000 0000 0000 0000 0000            ............
      
      So the short prefix is our new default: we've shortened our RX frames by
      12 octets, increased TX by 4, and headers are now equal between RX and
      TX. Note that we still need promiscuous mode for the DSA master to not
      drop it.
      Signed-off-by: NVladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      5124197c
  23. 22 9月, 2020 1 次提交
  24. 19 9月, 2020 7 次提交