1. 24 Aug, 2022 1 commit
  2. 23 Aug, 2022 1 commit
    • net: dsa: microchip: keep compatibility with device tree blobs with no phy-mode · 5fbb08eb
      Vladimir Oltean committed
      DSA has multiple ways of specifying a MAC connection to an internal PHY.
      One requires a DT description like this:
      
      	port@0 {
      		reg = <0>;
      		phy-handle = <&internal_phy>;
      		phy-mode = "internal";
      	};
      
      (which is IMO the recommended approach, as it is the clearest
      description)
      
      but it is also possible to leave the specification as just:
      
      	port@0 {
      		reg = <0>;
      	};
      
      and if the driver implements ds->ops->phy_read and ds->ops->phy_write,
      the DSA framework "knows" it should create a ds->slave_mii_bus, and it
      should connect to a non-OF-based internal PHY on this MDIO bus, at an
      MDIO address equal to the port address.
      
      There is also an intermediary way of describing things:
      
      	port@0 {
      		reg = <0>;
      		phy-handle = <&internal_phy>;
      	};
      
      In case 2, DSA calls phylink_connect_phy() and in case 3, it calls
      phylink_of_phy_connect(). In both cases, phylink_create() has been
      called with a phy_interface_t of PHY_INTERFACE_MODE_NA, and in both
      cases, PHY_INTERFACE_MODE_NA is translated into phy->interface.
      
      It is important to note that phy_device_create() initializes
      dev->interface = PHY_INTERFACE_MODE_GMII, and so, when we use
      phylink_create(PHY_INTERFACE_MODE_NA), no one will override this, and we
      will end up with a PHY_INTERFACE_MODE_GMII interface inherited from the
      PHY.
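
      The inheritance described above can be modeled with a minimal C sketch.
      The enum, struct, and function names below are hypothetical stand-ins,
      not the phylib or phylink API. The sketch only assumes that a newly
      created PHY defaults to GMII and that a link created with NA does not
      override that default.

```c
/* Simplified model of PHY interface mode inheritance (hypothetical names). */
enum iface_mode { MODE_NA, MODE_INTERNAL, MODE_GMII };

struct phy_dev {
	enum iface_mode interface;
};

/* Mirrors the default described above: a new PHY starts out as GMII. */
struct phy_dev phy_create(void)
{
	struct phy_dev phy = { .interface = MODE_GMII };

	return phy;
}

/*
 * MODE_NA means "no explicit mode was requested", so the PHY's own
 * interface mode is what the MAC driver ends up seeing.
 */
enum iface_mode effective_mode(enum iface_mode requested,
			       const struct phy_dev *phy)
{
	return requested == MODE_NA ? phy->interface : requested;
}
```

      In this model, a port with no phy-mode property requests MODE_NA, so
      effective_mode() resolves to MODE_GMII rather than MODE_INTERNAL.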
      
      All this means that in order to maintain compatibility with device tree
      blobs where the phy-mode property is missing, we need to allow the
      "gmii" phy-mode and treat it as "internal".
      
      Fixes: 2c709e0b ("net: dsa: microchip: ksz8795: add phylink support")
      Link: https://bugzilla.kernel.org/show_bug.cgi?id=216320
      Reported-by: Craig McQueen <craig@mcqueen.id.au>
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Reviewed-by: Alvin Šipraga <alsi@bang-olufsen.dk>
      Tested-by: Rasmus Villemoes <rasmus.villemoes@prevas.dk>
      Link: https://lore.kernel.org/r/20220818143250.2797111-1-vladimir.oltean@nxp.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
  3. 18 Aug, 2022 7 commits
    • net: mscc: ocelot: keep ocelot_stat_layout by reg address, not offset · d4c36765
      Vladimir Oltean committed
      With so many counter addresses recently discovered as being wrong, it is
      desirable to at least have a central database of information, rather
      than two: one through the SYS_COUNT_* registers (used for
      ndo_get_stats64), and the other through the offset field of struct
      ocelot_stat_layout elements (used for ethtool -S).
      
      The strategy will be to keep the SYS_COUNT_* definitions as the single
      source of truth, but for that we need to expand our current definitions
      to cover all registers. Then we need to convert the ocelot region
      creation logic, and stats worker, to the read semantics imposed by going
      through SYS_COUNT_* absolute register addresses, rather than offsets
      of 32-bit words relative to SYS_COUNT_RX_OCTETS (which should have been
      SYS_CNT, by the way).

      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • net: mscc: ocelot: make struct ocelot_stat_layout array indexable · 91904600
      Vladimir Oltean committed
      The ocelot counters are 32-bit and require periodic reading, every 2
      seconds, by ocelot_port_update_stats(), so that wraparounds are
      detected.
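
      Why periodic reading matters can be illustrated with a small C sketch
      that folds a 32-bit hardware counter into a 64-bit total. This is not
      the driver's code: struct stat_acc and stat_update() are hypothetical
      names, and the sketch assumes at most one wraparound between two
      consecutive readings.

```c
#include <stdint.h>

/* Hypothetical accumulator for one 32-bit hardware counter. */
struct stat_acc {
	uint64_t total;    /* 64-bit accumulated value */
	uint32_t last_raw; /* raw value seen at the previous reading */
};

/*
 * Unsigned subtraction is performed modulo 2^32, so the delta is still
 * correct when the raw counter has wrapped once since the last reading.
 */
void stat_update(struct stat_acc *s, uint32_t raw)
{
	s->total += (uint32_t)(raw - s->last_raw);
	s->last_raw = raw;
}
```

      If the counter could wrap more than once between two calls, the lost
      wraps would be undetectable, which is why the polling period has to be
      shorter than the fastest possible wrap time.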
      
      Currently, the counters reported by ocelot_get_stats64() come from the
      32-bit hardware counters directly, rather than from the 64-bit
      accumulated ocelot->stats, and this is a problem for their integrity.
      
      The strategy is to make ocelot_get_stats64() able to cherry-pick
      individual stats from ocelot->stats the way in which it currently reads
      them out from SYS_COUNT_* registers. But currently it can't, because
      ocelot->stats is an opaque u64 array that's used only to feed data into
      ethtool -S.
      
      To solve that problem, we need to make ocelot->stats indexable, and
      associate each element with an element of struct ocelot_stat_layout used
      by ethtool -S.
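
      One way to picture an indexable stats array is an enum that indexes
      both the layout and the accumulated values. The sketch below is
      illustrative only: the enum members, names, and register addresses are
      hypothetical and are not taken from the ocelot driver.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stat indices; NUM_STATS doubles as the array size. */
enum stat_id { STAT_RX_OCTETS, STAT_RX_UNICAST, STAT_TX_OCTETS, NUM_STATS };

struct stat_layout {
	const char *name; /* NULL means "not populated for this switch" */
	uint32_t reg;     /* hypothetical register address */
};

/* Designated initializers leave unlisted entries zeroed (name == NULL). */
const struct stat_layout layout[NUM_STATS] = {
	[STAT_RX_OCTETS] = { "rx_octets", 0x00 },
	[STAT_TX_OCTETS] = { "tx_octets", 0x40 },
};

/* Cherry-pick one accumulated counter by its index. */
uint64_t stat_get(const uint64_t stats[NUM_STATS], enum stat_id id)
{
	return stats[id];
}

/* Count the entries that are populated, skipping the holes. */
size_t stat_count_populated(void)
{
	size_t i, n = 0;

	for (i = 0; i < NUM_STATS; i++)
		if (layout[i].name)
			n++;
	return n;
}
```

      Because the array size is known from the enum, no sentinel element is
      needed to terminate iteration.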
      
      This makes ocelot_stat_layout a fat (and possibly sparse) array, so we
      need to change the way in which we access it. We no longer need
      OCELOT_STAT_END as a sentinel, because we know the array's size
      (OCELOT_NUM_STATS). We just need to skip the array elements that were
      left unpopulated for the switch revision (ocelot, felix, seville).

      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • net: mscc: ocelot: turn stats_lock into a spinlock · 22d842e3
      Vladimir Oltean committed
      ocelot_get_stats64() currently runs unlocked and therefore may collide
      with ocelot_port_update_stats() which indirectly accesses the same
      counters. However, ocelot_get_stats64() runs in atomic context, and we
      cannot simply take the sleepable ocelot->stats_lock mutex. We need to
      convert it to an atomic spinlock first. Do that as a preparatory change.

      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • net: mscc: ocelot: fix incorrect ndo_get_stats64 packet counters · 5152de7b
      Vladimir Oltean committed
      Reading stats using the SYS_COUNT_* register definitions is only done by
      ocelot_get_stats64() from the ocelot switchdev driver. However, the
      bucket definitions are currently incorrect.
      
      Separately, on both RX and TX, we have the following problems:
      - a 256-1023 bucket which actually tracks the 256-511 packets
      - the 1024-1526 bucket actually tracks the 512-1023 packets
      - the 1527-max bucket actually tracks the 1024-1526 packets
      
      => nobody tracks the packets from the real 1527-max bucket
      
      Additionally, the RX_PAUSE, RX_CONTROL, RX_LONGS and RX_CLASSIFIED_DROPS
      all track the wrong thing. However this doesn't seem to have any
      consequence, since ocelot_get_stats64() doesn't use these.
      
      Even though this problem only manifests itself for the switchdev driver,
      we cannot split the fix for ocelot and for DSA, since it requires fixing
      the bucket definitions from enum ocelot_reg, which makes us necessarily
      adapt the structures from felix and seville as well.
      
      Fixes: 84705fc1 ("net: dsa: felix: introduce support for Seville VSC9953 switch")
      Fixes: 56051948 ("net: dsa: ocelot: add driver for Felix switch family")
      Fixes: a556c76a ("net: mscc: Add initial Ocelot switch support")
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • net: dsa: felix: fix ethtool 256-511 and 512-1023 TX packet counters · 40d21c45
      Vladimir Oltean committed
      What the driver actually reports as 256-511 is in fact 512-1023, and the
      TX packets in the 256-511 bucket are not reported. Fix that.
      
      Fixes: 56051948 ("net: dsa: ocelot: add driver for Felix switch family")
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • net: dsa: sja1105: fix buffer overflow in sja1105_setup_devlink_regions() · fd8e899c
      Rustam Subkhankulov committed
      If an error occurs in dsa_devlink_region_create(), then the
      'priv->regions' array will be accessed at the negative index '-1'.
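
      An error-unwind loop that stays within array bounds can be sketched as
      follows. This is a generic illustration, not the sja1105 code:
      region_create(), region_destroy(), and the fail_at parameter are
      hypothetical helpers used only to simulate a failure.

```c
#include <stddef.h>

#define NUM_REGIONS 4

struct region {
	int id;
};

struct region pool[NUM_REGIONS];
int live_regions; /* number of regions currently created */

/* Hypothetical stand-in that fails when idx == fail_at. */
struct region *region_create(int idx, int fail_at)
{
	if (idx == fail_at)
		return NULL;
	pool[idx].id = idx;
	live_regions++;
	return &pool[idx];
}

void region_destroy(struct region *r)
{
	(void)r;
	live_regions--;
}

int setup_regions(struct region *regions[NUM_REGIONS], int fail_at)
{
	int i;

	for (i = 0; i < NUM_REGIONS; i++) {
		struct region *r = region_create(i, fail_at);

		if (!r) {
			/*
			 * Slot i was never stored. Pre-decrement so the loop
			 * visits i-1 down to 0 and never touches index -1.
			 */
			while (--i >= 0)
				region_destroy(regions[i]);
			return -1;
		}
		regions[i] = r;
	}
	return 0;
}
```

      A post-decrement condition such as "i-- >= 0" would test the old value
      of i and then use the decremented one, so a failure at index 0 would
      lead to an access at index -1.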
      
      Found by Linux Verification Center (linuxtesting.org) with SVACE.
      Signed-off-by: Rustam Subkhankulov <subkhankulov@ispras.ru>
      Fixes: bf425b82 ("net: dsa: sja1105: expose static config as devlink region")
      Link: https://lore.kernel.org/r/20220817003845.389644-1-subkhankulov@ispras.ru
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • net: dsa: microchip: ksz9477: fix fdb_dump last invalid entry · 36c0d935
      Arun Ramadoss committed
      The ksz9477_fdb_dump() function reads the ALU control register and
      exits the timeout loop if there is a valid entry or the search is
      complete. After exiting the loop, it reads the ALU entry and reports it
      to user space regardless of whether the entry is valid. This works as
      long as the entry is valid. If the loop exited because the search is
      complete, it still reads the ALU table. The table returns all ones,
      which are reported to user space, so "bridge fdb show" gives
      ff:ff:ff:ff:ff:ff as the last entry for every port. To fix this, the
      entry is reported after exiting the loop only if it is valid.
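
      The control flow of the fix can be sketched in plain C against a
      simulated table. Everything here is hypothetical (the flag values,
      struct alu_sim, and the helper names); it is not the ksz9477 register
      interface, only a model of "report the entry only if it is valid".

```c
#include <stddef.h>
#include <stdint.h>

#define ALU_VALID      0x1u /* hypothetical flag: an entry is ready */
#define ALU_SEARCH_END 0x2u /* hypothetical flag: search is complete */

/* Simulated hardware table with a read cursor. */
struct alu_sim {
	const uint64_t *entries;
	size_t count;
	size_t pos;
};

uint32_t alu_read_status(const struct alu_sim *hw)
{
	return hw->pos < hw->count ? ALU_VALID : ALU_SEARCH_END;
}

uint64_t alu_read_entry(struct alu_sim *hw)
{
	if (hw->pos < hw->count)
		return hw->entries[hw->pos++];
	return UINT64_MAX; /* reads back all ones once the search has ended */
}

size_t fdb_dump(struct alu_sim *hw, uint64_t *out, size_t max)
{
	size_t n = 0;

	while (n < max) {
		uint32_t status;
		int timeout = 1000;

		/* Poll until an entry is valid or the search is complete. */
		do {
			status = alu_read_status(hw);
		} while (!(status & (ALU_VALID | ALU_SEARCH_END)) &&
			 --timeout > 0);

		/* Report only valid entries; stop on end or timeout. */
		if (!(status & ALU_VALID))
			break;
		out[n++] = alu_read_entry(hw);
	}
	return n;
}
```

      With a two-entry table, fdb_dump() returns exactly two entries and the
      all-ones value is never copied to the output.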
      
      Fixes: b987e98e ("dsa: add DSA switch driver for Microchip KSZ9477")
      Signed-off-by: Arun Ramadoss <arun.ramadoss@microchip.com>
      Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
      Link: https://lore.kernel.org/r/20220816105516.18350-1-arun.ramadoss@microchip.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
  4. 13 Aug, 2022 1 commit
    • net: dsa: mv88e6060: prevent crash on an unused port · 246bbf2f
      Sergei Antonov committed
      If the port is neither a CPU port nor a user port, 'cpu_dp'
      is a null pointer, and dereferencing it in mv88e6060_setup_port()
      crashes the kernel:
      
      [    9.575872] Unable to handle kernel NULL pointer dereference at virtual address 00000014
      ...
      [    9.942216]  mv88e6060_setup from dsa_register_switch+0x814/0xe84
      [    9.948616]  dsa_register_switch from mdio_probe+0x2c/0x54
      [    9.954433]  mdio_probe from really_probe.part.0+0x98/0x2a0
      [    9.960375]  really_probe.part.0 from driver_probe_device+0x30/0x10c
      [    9.967029]  driver_probe_device from __device_attach_driver+0xb8/0x13c
      [    9.973946]  __device_attach_driver from bus_for_each_drv+0x90/0xe0
      [    9.980509]  bus_for_each_drv from __device_attach+0x110/0x184
      [    9.986632]  __device_attach from bus_probe_device+0x8c/0x94
      [    9.992577]  bus_probe_device from deferred_probe_work_func+0x78/0xa8
      [    9.999311]  deferred_probe_work_func from process_one_work+0x290/0x73c
      [   10.006292]  process_one_work from worker_thread+0x30/0x4b8
      [   10.012155]  worker_thread from kthread+0xd4/0x10c
      [   10.017238]  kthread from ret_from_fork+0x14/0x3c
      
      Fixes: 0abfd494 ("net: dsa: use dedicated CPU port")
      CC: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
      CC: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: Sergei Antonov <saproj@gmail.com>
      Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
      Link: https://lore.kernel.org/r/20220811070939.1717146-1-saproj@gmail.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
  5. 10 Aug, 2022 1 commit
    • net: dsa: felix: suppress non-changes to the tagging protocol · 4c46bb49
      Vladimir Oltean committed
      The way in which dsa_tree_change_tag_proto() works is that when
      dsa_tree_notify() fails, it doesn't know whether the operation failed
      midway in a multi-switch tree, or it failed for a single-switch tree.
      So even though drivers need to fail cleanly in
      ds->ops->change_tag_protocol(), DSA will still call dsa_tree_notify()
      again, to restore the old tag protocol for potential switches in the
      tree where the change did succeed (before failing for others).
      
      This means for the felix driver that if we report an error in
      felix_change_tag_protocol(), we'll get another call where proto_ops ==
      old_proto_ops. If we proceed to act upon that, we may do unexpected
      things. For example, we will call dsa_tag_8021q_register() twice in a
      row, without any dsa_tag_8021q_unregister() in between. Then we will
      actually call dsa_tag_8021q_unregister() via old_proto_ops->teardown,
      which (if it manages to run at all, after walking through corrupted data
      structures) will leave the ports inoperational anyway.
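
      Suppressing a non-change amounts to an early return when the requested
      protocol is already the active one. The following C sketch shows the
      idea with hypothetical types and callbacks (struct proto_ops, struct sw,
      demo_setup(), demo_teardown()); it is a simplified model, not the felix
      driver code.

```c
#include <stddef.h>

/* Hypothetical per-protocol operations. */
struct proto_ops {
	int (*setup)(void);
	void (*teardown)(void);
};

/* Hypothetical switch state: the currently active protocol. */
struct sw {
	const struct proto_ops *cur;
};

int setup_calls;
int teardown_calls;

int demo_setup(void)
{
	setup_calls++;
	return 0;
}

void demo_teardown(void)
{
	teardown_calls++;
}

const struct proto_ops demo_ops = { demo_setup, demo_teardown };

int change_tag_protocol(struct sw *s, const struct proto_ops *new_ops)
{
	const struct proto_ops *old_ops = s->cur;
	int err;

	if (!new_ops)
		return -1; /* unsupported protocol */

	/* Restoring the protocol that is already active is a no-op. */
	if (new_ops == old_ops)
		return 0;

	err = new_ops->setup();
	if (err)
		return err;

	if (old_ops)
		old_ops->teardown();

	s->cur = new_ops;
	return 0;
}
```

      With the early return in place, a second request for the same protocol
      neither repeats setup nor runs teardown on the active protocol.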
      
      The bug can be readily reproduced if we force an error while in
      tag_8021q mode; this crashes the kernel.
      
      echo ocelot-8021q > /sys/class/net/eno2/dsa/tagging
      echo edsa > /sys/class/net/eno2/dsa/tagging # -EPROTONOSUPPORT
      
      Unable to handle kernel NULL pointer dereference at virtual address 0000000000000014
      Call trace:
       vcap_entry_get+0x24/0x124
       ocelot_vcap_filter_del+0x198/0x270
       felix_tag_8021q_vlan_del+0xd4/0x21c
       dsa_switch_tag_8021q_vlan_del+0x168/0x2cc
       dsa_switch_event+0x68/0x1170
       dsa_tree_notify+0x14/0x34
       dsa_port_tag_8021q_vlan_del+0x84/0x110
       dsa_tag_8021q_unregister+0x15c/0x1c0
       felix_tag_8021q_teardown+0x16c/0x180
       felix_change_tag_protocol+0x1bc/0x230
       dsa_switch_event+0x14c/0x1170
       dsa_tree_change_tag_proto+0x118/0x1c0
      
      Fixes: 7a29d220 ("net: dsa: felix: reimplement tagging protocol change with function pointers")
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
      Link: https://lore.kernel.org/r/20220808125127.3344094-1-vladimir.oltean@nxp.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
  6. 09 Aug, 2022 1 commit
    • net: dsa: felix: fix min gate len calculation for tc when its first gate is closed · 7e4babff
      Vladimir Oltean committed
      min_gate_len[tc] is supposed to track the shortest interval of
      continuously open gates for a traffic class. For example, in the
      following case:
      
      TC 76543210
      
      t0 00000001b 200000 ns
      t1 00000010b 200000 ns
      
      min_gate_len[0] and min_gate_len[1] should be 200000, while
      min_gate_len[2-7] should be 0.
      
      However what happens is that min_gate_len[0] is 200000, but
      min_gate_len[1] ends up being 0 (despite gate_len[1] being 200000 at the
      point where the logic detects the gate close event for TC 1).
      
      The problem is that the code considers a "gate close" event whenever it
      sees that there is a 0 for that TC (essentially it's level rather than
      edge triggered). By doing that, any time a gate is seen as closed
      without having been open prior, gate_len, which is 0, will be written
      into min_gate_len. Once min_gate_len becomes 0, it's impossible for it
      to track anything higher than that (the length of actually open
      intervals).
      
      To fix this, we make the writing to min_gate_len[tc] be edge-triggered,
      which avoids writes for gates that are closed in consecutive intervals.
      However, this means we need to special-case the permanently closed
      gates at the end.
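
      The edge-triggered approach can be sketched as a standalone C function.
      This is a simplified model under stated assumptions, not the driver's
      implementation: struct sched_entry, NUM_TC, and min_gate_lengths() are
      hypothetical names, and walking the schedule twice is a choice made in
      this sketch so that open windows wrapping around the end of the cycle
      are also closed and measured.

```c
#include <stddef.h>
#include <stdint.h>

#define NUM_TC 8

struct sched_entry {
	uint8_t gate_mask;    /* bit N set = gate for traffic class N is open */
	uint64_t interval_ns; /* duration of this schedule entry */
};

void min_gate_lengths(const struct sched_entry *entries, size_t n,
		      uint64_t min_gate_len[NUM_TC])
{
	uint64_t gate_len[NUM_TC] = { 0 };
	uint8_t ever_opened = 0;
	size_t i;
	int tc;

	for (tc = 0; tc < NUM_TC; tc++)
		min_gate_len[tc] = UINT64_MAX;

	/* Walk the schedule twice (assumption of this sketch, see above). */
	for (i = 0; i < 2 * n; i++) {
		const struct sched_entry *e = &entries[i % n];

		for (tc = 0; tc < NUM_TC; tc++) {
			if (e->gate_mask & (1u << tc)) {
				gate_len[tc] += e->interval_ns;
				ever_opened |= (uint8_t)(1u << tc);
				continue;
			}
			/*
			 * Record a minimum only on an open-to-closed edge,
			 * i.e. when a nonzero open window has just ended.
			 */
			if (gate_len[tc] && gate_len[tc] < min_gate_len[tc])
				min_gate_len[tc] = gate_len[tc];
			gate_len[tc] = 0;
		}
	}

	/* Special case: gates that never opened have a minimum of 0. */
	for (tc = 0; tc < NUM_TC; tc++)
		if (!(ever_opened & (1u << tc)))
			min_gate_len[tc] = 0;
}
```

      For the two-entry schedule from the example above, this yields 200000
      for traffic classes 0 and 1, and 0 for traffic classes 2 through 7. A
      gate that never closes keeps the initial UINT64_MAX value.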
      
      Fixes: 55a515b1 ("net: dsa: felix: drop oversized frames with tc-taprio instead of hanging the port")
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Link: https://lore.kernel.org/r/20220804202817.1677572-1-vladimir.oltean@nxp.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
  7. 29 Jul, 2022 14 commits
  8. 28 Jul, 2022 1 commit
  9. 27 Jul, 2022 9 commits
  10. 19 Jul, 2022 3 commits
  11. 18 Jul, 2022 1 commit