1. 01 4月, 2020 1 次提交
    • R
      net: dsa: fix oops while probing Marvell DSA switches · 765bda93
      Russell King 提交于
      Fix an oops in dsa_port_phylink_mac_change() caused by a combination
      of a20f9970 ("net: dsa: Don't instantiate phylink for CPU/DSA
      ports unless needed") and the net-dsa-improve-serdes-integration
      series of patches 65b7a2c8 ("Merge branch
      'net-dsa-improve-serdes-integration'").
      
      Unable to handle kernel NULL pointer dereference at virtual address 00000124
      pgd = c0004000
      [00000124] *pgd=00000000
      Internal error: Oops: 805 [#1] SMP ARM
      Modules linked in: tag_edsa spi_nor mtd xhci_plat_hcd mv88e6xxx(+) xhci_hcd armada_thermal marvell_cesa dsa_core ehci_orion libdes phy_armada38x_comphy at24 mcp3021 sfp evbug spi_orion sff mdio_i2c
      CPU: 1 PID: 214 Comm: irq/55-mv88e6xx Not tainted 5.6.0+ #470
      Hardware name: Marvell Armada 380/385 (Device Tree)
      PC is at phylink_mac_change+0x10/0x88
      LR is at mv88e6352_serdes_irq_status+0x74/0x94 [mv88e6xxx]
      Signed-off-by: NRussell King <rmk+kernel@armlinux.org.uk>
      Reviewed-by: NVivien Didelot <vivien.didelot@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      765bda93
  2. 31 3月, 2020 2 次提交
  3. 28 3月, 2020 2 次提交
    • V
      net: dsa: implement auto-normalization of MTU for bridge hardware datapath · bff33f7e
      Vladimir Oltean 提交于
      Many switches don't have an explicit knob for configuring the MTU
      (maximum transmission unit per interface).  Instead, they do the
      length-based packet admission checks on the ingress interface, for
      reasons that are easy to understand (why would you accept a packet in
      the queuing subsystem if you know you're going to drop it anyway).
      
      So it is actually the MRU that these switches permit configuring.
      
      In Linux there only exists the IFLA_MTU netlink attribute and the
      associated dev_set_mtu function. The comments like to play blind and say
      that it's changing the "maximum transfer unit", which is to say that
      there isn't any directionality in the meaning of the MTU word. So that
      is the interpretation that this patch is giving to things: MTU == MRU.
      
      When 2 interfaces having different MTUs are bridged, the bridge driver
      MTU auto-adjustment logic kicks in: what br_mtu_auto_adjust() does is it
      adjusts the MTU of the bridge net device itself (and not that of the
      slave net devices) to the minimum value of all slave interfaces, in
      order for forwarded packets to not exceed the MTU regardless of the
      interface they are received and send on.
      
      The idea behind this behavior, and why the slave MTUs are not adjusted,
      is that normal termination from Linux over the L2 forwarding domain
      should happen over the bridge net device, which _is_ properly limited by
      the minimum MTU. And termination over individual slave devices is
      possible even if those are bridged. But that is not "forwarding", so
      there's no reason to do normalization there, since only a single
      interface sees that packet.
      
      The problem with those switches that can only control the MRU is with
      the offloaded data path, where a packet received on an interface with
      MRU 9000 would still be forwarded to an interface with MRU 1500. And the
      br_mtu_auto_adjust() function does not really help, since the MTU
      configured on the bridge net device is ignored.
      
      In order to enforce the de-facto MTU == MRU rule for these switches, we
      need to do MTU normalization, which means: in order for no packet larger
      than the MTU configured on this port to be sent, then we need to limit
      the MRU on all ports that this packet could possibly come from. AKA
      since we are configuring the MRU via MTU, it means that all ports within
      a bridge forwarding domain should have the same MTU.
      
      And that is exactly what this patch is trying to do.
      
      >From an implementation perspective, we try to follow the intent of the
      user, otherwise there is a risk that we might livelock them (they try to
      change the MTU on an already-bridged interface, but we just keep
      changing it back in an attempt to keep the MTU normalized). So the MTU
      that the bridge is normalized to is either:
      
       - The most recently changed one:
      
         ip link set dev swp0 master br0
         ip link set dev swp1 master br0
         ip link set dev swp0 mtu 1400
      
         This sequence will make swp1 inherit MTU 1400 from swp0.
      
       - The one of the most recently added interface to the bridge:
      
         ip link set dev swp0 master br0
         ip link set dev swp1 mtu 1400
         ip link set dev swp1 master br0
      
         The above sequence will make swp0 inherit MTU 1400 as well.
      Suggested-by: NFlorian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: NVladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      bff33f7e
    • V
      net: dsa: configure the MTU for switch ports · bfcb8132
      Vladimir Oltean 提交于
      It is useful be able to configure port policers on a switch to accept
      frames of various sizes:
      
      - Increase the MTU for better throughput from the default of 1500 if it
        is known that there is no 10/100 Mbps device in the network.
      - Decrease the MTU to limit the latency of high-priority frames under
        congestion, or work around various network segments that add extra
        headers to packets which can't be fragmented.
      
      For DSA slave ports, this is mostly a pass-through callback, called
      through the regular ndo ops and at probe time (to ensure consistency
      across all supported switches).
      
      The CPU port is called with an MTU equal to the largest configured MTU
      of the slave ports. The assumption is that the user might want to
      sustain a bidirectional conversation with a partner over any switch
      port.
      
      The DSA master is configured the same as the CPU port, plus the tagger
      overhead. Since the MTU is by definition L2 payload (sans Ethernet
      header), it is up to each individual driver to figure out if it needs to
      do anything special for its frame tags on the CPU port (it shouldn't
      except in special cases). So the MTU does not contain the tagger
      overhead on the CPU port.
      However the MTU of the DSA master, minus the tagger overhead, is used as
      a proxy for the MTU of the CPU port, which does not have a net device.
      This is to avoid uselessly calling the .change_mtu function on the CPU
      port when nothing should change.
      
      So it is safe to assume that the DSA master and the CPU port MTUs are
      apart by exactly the tagger's overhead in bytes.
      
      Some changes were made around dsa_master_set_mtu(), function which was
      now removed, for 2 reasons:
        - dev_set_mtu() already calls dev_validate_mtu(), so it's redundant to
          do the same thing in DSA
        - __dev_set_mtu() returns 0 if ops->ndo_change_mtu is an absent method
      That is to say, there's no need for this function in DSA, we can safely
      call dev_set_mtu() directly, take the rtnl lock when necessary, and just
      propagate whatever errors get reported (since the user probably wants to
      be informed).
      
      Some inspiration (mainly in the MTU DSA notifier) was taken from a
      vaguely similar patch from Murali and Florian, who are credited as
      co-developers down below.
      Co-developed-by: NMurali Krishna Policharla <murali.policharla@broadcom.com>
      Signed-off-by: NMurali Krishna Policharla <murali.policharla@broadcom.com>
      Co-developed-by: NFlorian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: NFlorian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: NVladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      bfcb8132
  4. 18 3月, 2020 1 次提交
  5. 09 3月, 2020 1 次提交
  6. 04 3月, 2020 2 次提交
  7. 09 1月, 2020 1 次提交
    • F
      net: dsa: Get information about stacked DSA protocol · 4d776482
      Florian Fainelli 提交于
      It is possible to stack multiple DSA switches in a way that they are not
      part of the tree (disjoint) but the DSA master of a switch is a DSA
      slave of another. When that happens switch drivers may have to know this
      is the case so as to determine whether their tagging protocol has a
      remove chance of working.
      
      This is useful for specific switch drivers such as b53 where devices
      have been known to be stacked in the wild without the Broadcom tag
      protocol supporting that feature. This allows b53 to continue supporting
      those devices by forcing the disabling of Broadcom tags on the outermost
      switches if necessary.
      
      The get_tag_protocol() function is therefore updated to gain an
      additional enum dsa_tag_protocol argument which denotes the current
      tagging protocol used by the DSA master we are attached to, else
      DSA_TAG_PROTO_NONE for the top of the dsa_switch_tree.
      Signed-off-by: NFlorian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      4d776482
  8. 06 1月, 2020 1 次提交
    • V
      net: dsa: Make deferred_xmit private to sja1105 · a68578c2
      Vladimir Oltean 提交于
      There are 3 things that are wrong with the DSA deferred xmit mechanism:
      
      1. Its introduction has made the DSA hotpath ever so slightly more
         inefficient for everybody, since DSA_SKB_CB(skb)->deferred_xmit needs
         to be initialized to false for every transmitted frame, in order to
         figure out whether the driver requested deferral or not (a very rare
         occasion, rare even for the only driver that does use this mechanism:
         sja1105). That was necessary to avoid kfree_skb from freeing the skb.
      
      2. Because L2 PTP is a link-local protocol like STP, it requires
         management routes and deferred xmit with this switch. But as opposed
         to STP, the deferred work mechanism needs to schedule the packet
         rather quickly for the TX timstamp to be collected in time and sent
         to user space. But there is no provision for controlling the
         scheduling priority of this deferred xmit workqueue. Too bad this is
         a rather specific requirement for a feature that nobody else uses
         (more below).
      
      3. Perhaps most importantly, it makes the DSA core adhere a bit too
         much to the NXP company-wide policy "Innovate Where It Doesn't
         Matter". The sja1105 is probably the only DSA switch that requires
         some frames sent from the CPU to be routed to the slave port via an
         out-of-band configuration (register write) rather than in-band (DSA
         tag). And there are indeed very good reasons to not want to do that:
         if that out-of-band register is at the other end of a slow bus such
         as SPI, then you limit that Ethernet flow's throughput to effectively
         the throughput of the SPI bus. So hardware vendors should definitely
         not be encouraged to design this way. We do _not_ want more
         widespread use of this mechanism.
      
      Luckily we have a solution for each of the 3 issues:
      
      For 1, we can just remove that variable in the skb->cb and counteract
      the effect of kfree_skb with skb_get, much to the same effect. The
      advantage, of course, being that anybody who doesn't use deferred xmit
      doesn't need to do any extra operation in the hotpath.
      
      For 2, we can create a kernel thread for each port's deferred xmit work.
      If the user switch ports are named swp0, swp1, swp2, the kernel threads
      will be named swp0_xmit, swp1_xmit, swp2_xmit (there appears to be a 15
      character length limit on kernel thread names). With this, the user can
      change the scheduling priority with chrt $(pidof swp2_xmit).
      
      For 3, we can actually move the entire implementation to the sja1105
      driver.
      
      So this patch deletes the generic implementation from the DSA core and
      adds a new one, more adequate to the requirements of PTP TX
      timestamping, in sja1105_main.c.
      Suggested-by: NFlorian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: NVladimir Oltean <olteanv@gmail.com>
      Reviewed-by: NFlorian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      a68578c2
  9. 05 11月, 2019 1 次提交
    • A
      net: of_get_phy_mode: Change API to solve int/unit warnings · 0c65b2b9
      Andrew Lunn 提交于
      Before this change of_get_phy_mode() returned an enum,
      phy_interface_t. On error, -ENODEV etc, is returned. If the result of
      the function is stored in a variable of type phy_interface_t, and the
      compiler has decided to represent this as an unsigned int, comparision
      with -ENODEV etc, is a signed vs unsigned comparision.
      
      Fix this problem by changing the API. Make the function return an
      error, or 0 on success, and pass a pointer, of type phy_interface_t,
      where the phy mode should be stored.
      
      v2:
      Return with *interface set to PHY_INTERFACE_MODE_NA on error.
      Add error checks to all users of of_get_phy_mode()
      Fixup a few reverse christmas tree errors
      Fixup a few slightly malformed reverse christmas trees
      
      v3:
      Fix 0-day reported errors.
      Reported-by: NDan Carpenter <dan.carpenter@oracle.com>
      Signed-off-by: NAndrew Lunn <andrew@lunn.ch>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      0c65b2b9
  10. 31 10月, 2019 1 次提交
  11. 25 10月, 2019 1 次提交
    • T
      net: core: add generic lockdep keys · ab92d68f
      Taehee Yoo 提交于
      Some interface types could be nested.
      (VLAN, BONDING, TEAM, MACSEC, MACVLAN, IPVLAN, VIRT_WIFI, VXLAN, etc..)
      These interface types should set lockdep class because, without lockdep
      class key, lockdep always warn about unexisting circular locking.
      
      In the current code, these interfaces have their own lockdep class keys and
      these manage itself. So that there are so many duplicate code around the
      /driver/net and /net/.
      This patch adds new generic lockdep keys and some helper functions for it.
      
      This patch does below changes.
      a) Add lockdep class keys in struct net_device
         - qdisc_running, xmit, addr_list, qdisc_busylock
         - these keys are used as dynamic lockdep key.
      b) When net_device is being allocated, lockdep keys are registered.
         - alloc_netdev_mqs()
      c) When net_device is being free'd llockdep keys are unregistered.
         - free_netdev()
      d) Add generic lockdep key helper function
         - netdev_register_lockdep_key()
         - netdev_unregister_lockdep_key()
         - netdev_update_lockdep_key()
      e) Remove unnecessary generic lockdep macro and functions
      f) Remove unnecessary lockdep code of each interfaces.
      
      After this patch, each interface modules don't need to maintain
      their lockdep keys.
      Signed-off-by: NTaehee Yoo <ap420073@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      ab92d68f
  12. 17 9月, 2019 1 次提交
  13. 28 8月, 2019 6 次提交
  14. 20 7月, 2019 3 次提交
  15. 10 7月, 2019 5 次提交
  16. 15 6月, 2019 2 次提交
  17. 09 6月, 2019 1 次提交
  18. 31 5月, 2019 1 次提交
  19. 30 5月, 2019 2 次提交
  20. 13 5月, 2019 1 次提交
    • V
      net: dsa: Initialize DSA_SKB_CB(skb)->deferred_xmit variable · 87671375
      Vladimir Oltean 提交于
      The sk_buff control block can have any contents on xmit put there by the
      stack, so initialization is mandatory, since we are checking its value
      after the actual DSA xmit (the tagger may have changed it).
      
      The DSA_SKB_ZERO() macro could have been used for this purpose, but:
      - Zeroizing a 48-byte memory region in the hotpath is best avoided.
      - It would have triggered a warning with newer compilers since
        __dsa_skb_cb contains a structure within a structure, and the {0}
        initializer was incorrect for that purpose.
      
      So simply remove the DSA_SKB_ZERO() macro and initialize the
      deferred_xmit variable by hand (which should be done for all further
      dsa_skb_cb variables which need initialization - currently none - to
      avoid the performance penalty).
      
      Fixes: 97a69a0d ("net: dsa: Add support for deferred xmit")
      Signed-off-by: NVladimir Oltean <olteanv@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      87671375
  21. 08 5月, 2019 1 次提交
  22. 06 5月, 2019 2 次提交
    • V
      net: dsa: Add support for deferred xmit · 97a69a0d
      Vladimir Oltean 提交于
      Some hardware needs to take work to get convinced to receive frames on
      the CPU port (such as the sja1105 which takes temporary L2 forwarding
      rules over SPI that last for a single frame). Such work needs a
      sleepable context, and because the regular .ndo_start_xmit is atomic,
      this cannot be done in the tagger. So introduce a generic DSA mechanism
      that sets up a transmit skb queue and a workqueue for deferred
      transmission.
      
      The new driver callback (.port_deferred_xmit) is in dsa_switch and not
      in the tagger because the operations that require sleeping typically
      also involve interacting with the hardware, and not simply skb
      manipulations. Therefore having it there simplifies the structure a bit
      and makes it unnecessary to export functions from the driver to the
      tagger.
      
      The driver is responsible of calling dsa_enqueue_skb which transfers it
      to the master netdevice. This is so that it has a chance of performing
      some more work afterwards, such as cleanup or TX timestamping.
      
      To tell DSA that skb xmit deferral is required, I have thought about
      changing the return type of the tagger .xmit from struct sk_buff * into
      a enum dsa_tx_t that could potentially encode a DSA_XMIT_DEFER value.
      
      But the trailer tagger is reallocating every skb on xmit and therefore
      making a valid use of the pointer return value. So instead of reworking
      the API in complicated ways, right now a boolean property in the newly
      introduced DSA_SKB_CB is set.
      Signed-off-by: NVladimir Oltean <olteanv@gmail.com>
      Reviewed-by: NFlorian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      97a69a0d
    • P
      net/dsa: use intermediate representation for matchall offload · 9681e8b3
      Pieter Jansen van Vuuren 提交于
      Updates dsa hardware switch handling infrastructure to use the newer
      intermediate representation for flow actions in matchall offloads.
      Signed-off-by: NPieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com>
      Reviewed-by: NJakub Kicinski <jakub.kicinski@netronome.com>
      Acked-by: NJiri Pirko <jiri@mellanox.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      9681e8b3
  23. 01 5月, 2019 1 次提交