1. 25 Aug, 2021 32 commits
    • B
      ravb: Add ptp_cfg_active to struct ravb_hw_info · a69a3d09
      Biju Das committed
      There are some H/W differences for the gPTP feature between
      R-Car Gen3, R-Car Gen2, and RZ/G2L as below.
      
      1) On R-Car Gen3, gPTP support is active in config mode.
      2) On R-Car Gen2, gPTP support is not active in config mode.
      3) RZ/G2L does not support the gPTP feature.
      
      Add a ptp_cfg_active hw feature bit to struct ravb_hw_info for
      supporting gPTP active in config mode for R-Car Gen3.
      This patch also removes enum ravb_chip_id, chip_id from both
      struct ravb_hw_info and struct ravb_private, as it is unused.
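
      A minimal sketch of how per-SoC feature bits of this kind can replace
      a chip-id check. It is not the driver code: struct hw_info_sketch, the
      three tables, and both helper functions are hypothetical, and only the
      bit names ptp_cfg_active and no_ptp_cfg_active come from the ravb
      commits in this log.

```c
#include <stdbool.h>

/* Hypothetical, trimmed-down stand-in for struct ravb_hw_info. */
struct hw_info_sketch {
	const char *name;
	unsigned ptp_cfg_active:1;    /* gPTP active in config mode (R-Car Gen3) */
	unsigned no_ptp_cfg_active:1; /* gPTP not active in config mode (R-Car Gen2) */
};

/* One table per SoC family; RZ/G2L sets neither bit (no gPTP support). */
const struct hw_info_sketch gen3_info  = { .name = "R-Car Gen3", .ptp_cfg_active = 1 };
const struct hw_info_sketch gen2_info  = { .name = "R-Car Gen2", .no_ptp_cfg_active = 1 };
const struct hw_info_sketch rzg2l_info = { .name = "RZ/G2L" };

/* True when gPTP may be set up while the controller is in config mode. */
bool ptp_init_in_config_mode(const struct hw_info_sketch *info)
{
	return info->ptp_cfg_active;
}

/* Assumption of this sketch: gPTP exists if either bit is set. */
bool has_ptp(const struct hw_info_sketch *info)
{
	return info->ptp_cfg_active || info->no_ptp_cfg_active;
}
```

      Callers test a feature bit instead of comparing a chip id, so a new
      SoC only needs a new table.
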
      Signed-off-by: Biju Das <biju.das.jz@bp.renesas.com>
      Reviewed-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      a69a3d09
    • B
      ravb: Add no_ptp_cfg_active to struct ravb_hw_info · 8f27219a
      Biju Das committed
      There are some H/W differences for the gPTP feature between
      R-Car Gen3, R-Car Gen2, and RZ/G2L as below.
      
      1) On R-Car Gen2, gPTP support is not active in config mode.
      2) On R-Car Gen3, gPTP support is active in config mode.
      3) RZ/G2L does not support the gPTP feature.
      
      Add a no_ptp_cfg_active hw feature bit to struct ravb_hw_info for
      handling gPTP for R-Car Gen2.
      Signed-off-by: Biju Das <biju.das.jz@bp.renesas.com>
      Reviewed-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      8f27219a
    • B
      ravb: Add multi_irq to struct ravb_hw_info · 6de19fa0
      Biju Das committed
      R-Car Gen3 supports separate interrupts for E-MAC and DMA queues,
      whereas R-Car Gen2 and RZ/G2L have a single interrupt instead.
      
      Add a multi_irq hw feature bit to struct ravb_hw_info to enable
      this only for R-Car Gen3.
      Signed-off-by: Biju Das <biju.das.jz@bp.renesas.com>
      Reviewed-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      6de19fa0
    • B
      ravb: Remove the macros NUM_TX_DESC_GEN[23] · c81d8942
      Biju Das committed
      To address the 4-byte alignment restriction on the transmission
      buffer, R-Car Gen2 uses 2 descriptors per packet, whereas all other
      cases use a single descriptor.
      Replace the macros NUM_TX_DESC_GEN[23] with magic numbers and
      add a comment to explain them.
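
      A hedged sketch of why an alignment restriction can cost a second
      descriptor. The commit only states the descriptor counts; the split
      shown here (an unaligned head in the first descriptor, the aligned
      remainder in the second) is an assumption, and TX_ALIGN, num_tx_desc()
      and head_len() are hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>

#define TX_ALIGN 4u /* 4-byte alignment restriction named in the commit */

/* Descriptors per packet: 2 on R-Car Gen2, 1 otherwise (per the commit). */
unsigned int num_tx_desc(int is_gen2)
{
	return is_gen2 ? 2 : 1;
}

/*
 * Assumed split for the 2-descriptor case: the bytes in front of the next
 * aligned address go into the first descriptor. If the buffer is already
 * aligned, one full alignment unit is used so that neither descriptor is
 * empty.
 */
size_t head_len(uintptr_t data_addr)
{
	size_t len = (TX_ALIGN - (data_addr % TX_ALIGN)) % TX_ALIGN;

	return len ? len : TX_ALIGN;
}
```
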
      Signed-off-by: Biju Das <biju.das.jz@bp.renesas.com>
      Suggested-by: Geert Uytterhoeven <geert+renesas@glider.be>
      Reviewed-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      c81d8942
    • V
      net: dsa: tag_sja1105: stop asking the sja1105 driver in sja1105_xmit_tpid · 8ded9160
      Vladimir Oltean committed
      Introduced in commit 38b5beea ("net: dsa: sja1105: prepare tagger
      for handling DSA tags and VLAN simultaneously"), the sja1105_xmit_tpid
      function solved quite a different problem than our needs are now.
      
      Then, we used best-effort VLAN filtering and we were using the xmit_tpid
      to tunnel packets coming from an 8021q upper through the TX VLAN allocated
      by tag_8021q to that egress port. The need for a different VLAN protocol
      depending on switch revision came from the fact that this in itself was
      more of a hack to trick the hardware into accepting tunneled VLANs in
      the first place.
      
      Right now, we deny 8021q uppers (see sja1105_prechangeupper). Even if we
      supported them again, we would not do that using the same method of
      {tunneling the VLAN on egress, retagging the VLAN on ingress} that we
      had in the best-effort VLAN filtering mode. It seems rather simpler that
      we just allocate a VLAN in the VLAN table that is simply not used by the
      bridge at all, or by any other port.
      
      Anyway, I have 2 gripes with the current sja1105_xmit_tpid:
      
      1. When sending packets on behalf of a VLAN-aware bridge (with the new
         TX forwarding offload framework) plus untagged (with the tag_8021q
         VLAN added by the tagger) packets, we can see that on SJA1105P/Q/R/S
         and later (which have a qinq_tpid of ETH_P_8021AD), some packets sent
         through the DSA master have a VLAN protocol of 0x8100 and others of
         0x88a8. This is strange and there is no reason for it now. If we have
         a bridge and are therefore forced to send using that bridge's TPID,
         we can as well blend with that bridge's VLAN protocol for all packets.
      
      2. The sja1105_xmit_tpid introduces a dependency on the sja1105 driver,
         because it looks inside dp->priv. It is desirable to keep as much
         separation between taggers and switch drivers as possible. With
         this change, the tagger no longer does that.
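
      A rough sketch of a TPID choice that depends only on generic port
      state rather than on driver-private data. This is an illustration
      under assumptions, not the patch: struct port_sketch and
      xmit_tpid_sketch() are hypothetical, and the 0xdadb value for the
      VLAN-unaware case is taken from another sja1105 commit in this log.

```c
#include <stdbool.h>
#include <stdint.h>

#define TPID_8021Q  0x8100u
#define TPID_8021AD 0x88a8u
#define TPID_CUSTOM 0xdadbu /* custom TPID used when the switch is VLAN-unaware */

/* Hypothetical view of what the tagger may know without dp->priv. */
struct port_sketch {
	bool vlan_aware;            /* switch is under a VLAN-aware bridge */
	uint16_t bridge_vlan_proto; /* that bridge's VLAN protocol */
};

/* Blend in with the bridge's TPID when there is one, else use the custom TPID. */
uint16_t xmit_tpid_sketch(const struct port_sketch *p)
{
	if (!p->vlan_aware)
		return TPID_CUSTOM;
	return p->bridge_vlan_proto;
}
```
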
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      8ded9160
    • V
      net: dsa: sja1105: drop untagged packets on the CPU and DSA ports · b0b8c67e
      Vladimir Oltean committed
      The sja1105 driver is a bit special in its use of VLAN headers as DSA
      tags. This is because in VLAN-aware mode, the VLAN headers use an actual
      TPID of 0x8100, which is understood even by the DSA master as an actual
      VLAN header.
      
      Furthermore, control packets such as PTP and STP are transmitted with no
      VLAN header as a DSA tag, because, depending on switch generation, there
      are ways to steer these control packets towards a precise egress port
      other than VLAN tags. Transmitting control packets as untagged means
      leaving a door open for traffic in general to be transmitted as untagged
      from the DSA master, and for it to traverse the switch and exit a random
      switch port according to the FDB lookup.
      
      This behavior is a bit out of line with other DSA drivers which have
      native support for DSA tagging. There, it is to be expected that the
      switch only accepts DSA-tagged packets on its CPU port, dropping
      everything that does not match this pattern.
      
      We perhaps rely a bit too much on the switches' hardware dropping on the
      CPU port, and place no other restrictions in the kernel data path to
      avoid that. For example, sja1105 is also a bit special in that STP/PTP
      packets are transmitted using "management routes"
      (sja1105_port_deferred_xmit): when sending a link-local packet from the
      CPU, we must first write a SPI message to the switch to tell it to
      expect a packet towards multicast MAC DA 01-80-c2-00-00-0e, and to route
      it towards port 3 when it gets it. This entry expires as soon as it
      matches a packet received by the switch, and it needs to be reinstalled
      for the next packet etc. All in all quite a ghetto mechanism, but it is
      all that the sja1105 switches offer for injecting a control packet.
      The driver takes a mutex for serializing control packets and making the
      pairs of SPI writes of a management route and its associated skb atomic,
      but to be honest, a mutex is only relevant as long as all parties agree
      to take it. With the DSA design, it is possible to open an AF_PACKET
      socket on the DSA master net device, and blast packets towards
      01-80-c2-00-00-0e, and whatever locking the DSA switch driver might use,
      it all goes kaput because management routes installed by the driver will
      match skbs sent by the DSA master, and not skbs generated by the driver
      itself. So they will end up being routed on the wrong port.
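
      The one-shot nature of a management route can be illustrated with a
      small user-space model. Everything here is hypothetical
      (struct mgmt_route, install_route(), switch_rx_from_cpu()); it only
      models the behavior described above, where the route is consumed by
      the first matching frame regardless of who sent it.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical model of a single one-shot management route. */
struct mgmt_route {
	uint8_t da[6]; /* destination MAC the switch should expect */
	int port;      /* egress port for the matching frame */
	bool valid;
};

/* Models the SPI write that installs the route. */
void install_route(struct mgmt_route *r, const uint8_t da[6], int port)
{
	memcpy(r->da, da, 6);
	r->port = port;
	r->valid = true;
}

/*
 * Models a frame arriving from the DSA master. The route expires on the
 * first match. Returns the egress port, or -1 when no route matched and
 * the frame would be forwarded by the normal lookup instead.
 */
int switch_rx_from_cpu(struct mgmt_route *r, const uint8_t da[6])
{
	if (r->valid && memcmp(r->da, da, 6) == 0) {
		r->valid = false;
		return r->port;
	}
	return -1;
}
```

      If a frame from a raw socket reaches the switch between the route
      installation and the driver's own frame, the foreign frame takes the
      route and the driver's frame no longer matches anything.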
      
      So through the lens of that, maybe it would make sense to avoid that
      from happening by doing something in the network stack, like: introduce
      a new bit in struct sk_buff, like xmit_from_dsa. Then, somewhere around
      dev_hard_start_xmit(), introduce the following check:
      
      	if (netdev_uses_dsa(dev) && !skb->xmit_from_dsa)
      		kfree_skb(skb);
      
      Ok, maybe that is a bit drastic, but that would at least prevent a bunch
      of problems. For example, right now, even though the majority of DSA
      switches drop packets without DSA tags sent by the DSA master (and
      therefore the majority of garbage that user space daemons like avahi and
      udhcpcd and friends create), it is still conceivable that an aggressive
      user space program can open an AF_PACKET socket and inject a spoofed DSA
      tag directly on the DSA master. We have no protection against that; the
      packet will be understood by the switch and be routed wherever user
      space says. Furthermore: there are some DSA switches where we even have
      register access over Ethernet, using DSA tags. So even user space
      drivers are possible in this way. This is a huge hole.
      
      However, the biggest thing that bothers me is that udhcpcd attempts to
      ask for an IP address on all interfaces by default, and with sja1105, it
      will attempt to get a valid IP address on both the DSA master as well as
      on sja1105 switch ports themselves. So with IP addresses in the same
      subnet on multiple interfaces, the routing table will be messed up and
      the system will be unusable for traffic until it is configured manually
      to not ask for an IP address on the DSA master itself.
      
      It turns out that it is possible to avoid that in the sja1105 driver, at
      least very superficially, by requesting the switch to drop VLAN-untagged
      packets on the CPU port. With the exception of control packets, all
      traffic originated from tag_sja1105.c is already VLAN-tagged, so only
      STP and PTP packets need to be converted. For that, we need to uphold
      the equivalence between an untagged and a pvid-tagged packet, and to
      remember that the CPU port of sja1105 uses a pvid of 4095.
      
      Now that we drop untagged traffic on the CPU port, non-aggressive user
      space applications like udhcpcd stop bothering us, and sja1105 effectively
      becomes just as vulnerable to the aggressive kind of user space programs
      as other DSA switches are (ok, users can also create 8021q uppers on top
      of the DSA master in the case of sja1105, but in future patches we can
      easily deny that, but it still doesn't change the fact that VLAN-tagged
      packets can still be injected over raw sockets).
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      b0b8c67e
    • V
      net: dsa: sja1105: prevent tag_8021q VLANs from being received on user ports · 73ceab83
      Vladimir Oltean committed
      Currently it is possible for an attacker to craft packets with a fake
      DSA tag and send them to us, and our user ports will accept them and
      preserve that VLAN when transmitting towards the CPU. Then the tagger
      will be misled into thinking that the packets came on a different port
      than they really came on.
      
      Up until recently there wasn't a good option to prevent this from
      happening. In SJA1105P and later, the MAC Configuration Table introduced
      two options called:
      - DRPSITAG: Drop Single Inner Tagged Frames
      - DRPSOTAG: Drop Single Outer Tagged Frames
      
      Because the sja1105 driver classifies all VLANs as "outer VLANs" (S-Tags),
      it would be in principle possible to enable the DRPSOTAG bit on ports
      using tag_8021q, and drop on ingress all packets which have a VLAN tag.
      When the switch is VLAN-unaware, this works, because it uses a custom
      TPID of 0xdadb, so any "tagged" packets received on a user port are
      probably a spoofing attempt. But when the switch overall is VLAN-aware,
      and some ports are standalone (therefore they use tag_8021q), the TPID
      is 0x8100, and the port can receive a mix of untagged and VLAN-tagged
      packets. The untagged ones will be classified to the tag_8021q pvid, and
      the tagged ones to the VLAN ID from the packet header. Yes, it is true
      that since commit 4fbc08bd ("net: dsa: sja1105: deny 8021q uppers on
      ports") we no longer support this mixed mode, but that is a temporary
      limitation which will eventually be lifted. It would be nice to not
      introduce one more restriction via DRPSOTAG, which would make the
      standalone ports of a VLAN-aware switch drop genuinely VLAN-tagged
      packets.
      
      Also, the DRPSOTAG bit is not available on the first generation of
      switches (SJA1105E, SJA1105T). So since one of the key features of this
      driver is compatibility across switch generations, this makes it an even
      less desirable approach.
      
      The breakthrough comes from commit bef0746c ("net: dsa: sja1105:
      make sure untagged packets are dropped on ingress ports with no pvid"),
      where it became obvious that untagged packets are not dropped even if
      the ingress port is not in the VMEMB_PORT vector of that port's pvid.
      However, VLAN-tagged packets are subject to VLAN ingress
      checking/dropping. This means that instead of using the catch-all
      DRPSOTAG bit introduced in SJA1105P, we can drop tagged packets on a
      per-VLAN basis, and this is already compatible with SJA1105E/T.
      
      This patch adds an "allowed_ingress" argument to sja1105_vlan_add(), and
      we call it with "false" for tag_8021q VLANs on user ports. The tag_8021q
      VLANs still need to be allowed, of course, on ingress to DSA ports and
      CPU ports.
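
      A simplified model of per-VLAN ingress membership, to make the effect
      of the "allowed_ingress" argument concrete. The table layout and the
      functions vlan_add_sketch() and ingress_accepts() are hypothetical;
      the model only encodes the two behaviors described above: tagged
      frames are checked against the VLAN's port membership, untagged
      frames are not.

```c
#include <stdbool.h>
#include <stdint.h>

#define MAX_VID 4096

/* Hypothetical VLAN table: one ingress port bitmask per VLAN ID. */
struct vlan_table {
	uint8_t vmemb_port[MAX_VID];
};

/* Add a VLAN on a port; only mark the port as an ingress member if allowed. */
void vlan_add_sketch(struct vlan_table *t, uint16_t vid, int port,
		     bool allowed_ingress)
{
	if (allowed_ingress)
		t->vmemb_port[vid] |= (uint8_t)(1u << port);
	else
		t->vmemb_port[vid] &= (uint8_t)~(1u << port);
}

/* Tagged frames are subject to the membership check; untagged ones are not. */
bool ingress_accepts(const struct vlan_table *t, int port, bool tagged,
		     uint16_t vid)
{
	if (!tagged)
		return true;
	return (t->vmemb_port[vid] >> port) & 1u;
}
```
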
      
      We also need to refine the drop_untagged check in sja1105_commit_pvid to
      make it not freak out about this new configuration. Currently it will
      try to keep the configuration consistent between untagged and pvid-tagged
      packets, so if the pvid of a port is 1 but VLAN 1 is not in VMEMB_PORT,
      packets tagged with VID 1 will behave the same as untagged packets, and
      be dropped. This behavior is what we want for ports under a VLAN-aware
      bridge, but for the ports with a tag_8021q pvid, we want untagged
      packets to be accepted, but packets tagged with a header recognized by
      the switch as a tag_8021q VLAN to be dropped. So only restrict the
      drop_untagged check to apply to the bridge_pvid, not to the tag_8021q_pvid.
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      73ceab83
    • D
      net: dsa: mt7530: manually set up VLAN ID 0 · 1ca8a193
      DENG Qingfang committed
      The driver was relying on dsa_slave_vlan_rx_add_vid to add VLAN ID 0. After
      the blamed commit, VLAN ID 0 won't be set up anymore, breaking software
      bridging fallback on VLAN-unaware bridges.
      
      Manually set up VLAN ID 0 to fix this.
      
      Fixes: 06cfb2df ("net: dsa: don't advertise 'rx-vlan-filter' when not needed")
      Signed-off-by: DENG Qingfang <dqfext@gmail.com>
      Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
      Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      1ca8a193
    • H
      net: mana: Add WARN_ON_ONCE in case of CQE read overflow · c1a3e9f9
      Haiyang Zhang committed
      This is not an expected case normally.
      Add WARN_ON_ONCE in case of CQE read overflow, instead of failing
      silently.
      Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      c1a3e9f9
    • H
      net: mana: Add support for EQ sharing · 1e2d0824
      Haiyang Zhang committed
      The existing code uses (1 + #vPorts * #Queues) MSIXs, which may exceed
      the device limit.
      
      Support EQ sharing, so that multiple vPorts (NICs) can share the same
      set of MSIXs.
      
      And, report the EQ-sharing capability bit to the host, which means the
      host can potentially offer more vPorts and queues to the VM.
      
      Also update the resource limit checking and error handling for better
      robustness.
      
      Now, we support up to 256 virtual ports per VF (it was 16/VF), and
      support up to 64 queues per vPort (it was 16).
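
      The vector counts can be checked with a little arithmetic. The
      unshared formula is the one quoted above; the shared formula
      (one vector plus one per queue, independent of the vPort count) is an
      assumption of this sketch, and both function names are hypothetical.

```c
/* MSI-X vectors without EQ sharing: 1 + #vPorts * #Queues (from the commit). */
unsigned int msix_unshared(unsigned int vports, unsigned int queues)
{
	return 1 + vports * queues;
}

/* Assumed count with EQ sharing: all vPorts reuse one set of per-queue vectors. */
unsigned int msix_shared(unsigned int queues)
{
	return 1 + queues;
}
```

      With the old limits (16 vPorts, 16 queues) the unshared scheme needs
      257 vectors; with the new limits (256 vPorts, 64 queues) it would
      need 16385, which shows why sharing is required to raise the limits.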
      Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      1e2d0824
    • H
      net: mana: Move NAPI from EQ to CQ · e1b5683f
      Haiyang Zhang committed
      The existing code has NAPI threads polling on EQ directly. To prepare
      for EQ sharing among vPorts, move NAPI from EQ to CQ so that one EQ
      can serve multiple CQs from different vPorts.
      
      The "arm bit" is only set when CQ processing is completed to reduce
      the number of EQ entries, which in turn reduces the number of
      interrupts on EQ.
      Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      e1b5683f
    • S
      netxen_nic: Remove the repeated declaration · 807d1032
      Shaokun Zhang committed
      Function 'netxen_rom_fast_read' is declared twice, so remove the
      repeated declaration.
      
      Cc: Manish Chopra <manishc@marvell.com>
      Cc: Rahul Verma <rahulv@marvell.com>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Jakub Kicinski <kuba@kernel.org>
      Signed-off-by: Shaokun Zhang <zhangshaokun@hisilicon.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      807d1032
    • N
      cxgb4: Properly revert VPD changes · bc4f128d
      Nathan Chancellor committed
      Clang warns:
      
      drivers/net/ethernet/chelsio/cxgb4/t4_hw.c:2785:2: error: variable 'kw_offset' is uninitialized when used here [-Werror,-Wuninitialized]
              FIND_VPD_KW(i, "RV");
              ^~~~~~~~~~~~~~~~~~~~
      drivers/net/ethernet/chelsio/cxgb4/t4_hw.c:2776:39: note: expanded from macro 'FIND_VPD_KW'
              var = pci_vpd_find_info_keyword(vpd, kw_offset, vpdr_len, name); \
                                                   ^~~~~~~~~
      drivers/net/ethernet/chelsio/cxgb4/t4_hw.c:2748:34: note: initialize the variable 'kw_offset' to silence this warning
              unsigned int vpdr_len, kw_offset, id_len;
                                              ^
                                               = 0
      drivers/net/ethernet/chelsio/cxgb4/t4_hw.c:2785:2: error: variable 'vpdr_len' is uninitialized when used here [-Werror,-Wuninitialized]
              FIND_VPD_KW(i, "RV");
              ^~~~~~~~~~~~~~~~~~~~
      drivers/net/ethernet/chelsio/cxgb4/t4_hw.c:2776:50: note: expanded from macro 'FIND_VPD_KW'
              var = pci_vpd_find_info_keyword(vpd, kw_offset, vpdr_len, name); \
                                                              ^~~~~~~~
      drivers/net/ethernet/chelsio/cxgb4/t4_hw.c:2748:23: note: initialize the variable 'vpdr_len' to silence this warning
              unsigned int vpdr_len, kw_offset, id_len;
                                   ^
                                    = 0
      2 errors generated.
      
      The series "PCI/VPD: Convert more users to the new VPD API functions"
      was applied to net-next when it should have been applied to the PCI tree
      because of build errors. However, commit 82e34c8a ("Revert "Revert
      "cxgb4: Search VPD with pci_vpd_find_ro_info_keyword()""") reapplied a
      change, resulting in the warning above.
      
      Properly revert commit 8d63ee60 ("cxgb4: Search VPD with
      pci_vpd_find_ro_info_keyword()") to fix the warning and restore proper
      functionality. This also reverts commit 3a93bede ("cxgb4: Remove
      unused vpd_param member ec") to avoid future merge conflicts, as that
      change has been applied to the PCI tree.
      
      Link: https://lore.kernel.org/r/20210823120929.7c6f7a4f@canb.auug.org.au/
      Link: https://lore.kernel.org/r/1ca29408-7bc7-4da5-59c7-87893c9e0442@gmail.com/
      Signed-off-by: Nathan Chancellor <nathan@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      bc4f128d
    • J
      lan78xx: Limit number of driver warning messages · df0d6f7a
      John Efstathiades committed
      Device removal can result in a large burst of driver warning messages
      (20 - 30) sent to the kernel log. Most of these are register read/write
      failures.
      
      This change limits the rate at which these messages are emitted.
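
      As a generic illustration of message rate limiting (not the
      mechanism this patch uses, which the commit message does not name),
      a window-based limiter allows a burst of messages per interval and
      suppresses the rest. struct ratelimit and ratelimit_ok() are
      hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical limiter: at most `burst` messages per `interval_ms` window. */
struct ratelimit {
	uint64_t interval_ms;
	unsigned int burst;
	uint64_t window_start;
	unsigned int printed;
};

/* Returns true if a message may be emitted at time now_ms. */
bool ratelimit_ok(struct ratelimit *rl, uint64_t now_ms)
{
	/* Start a new window once the current one has expired. */
	if (now_ms - rl->window_start >= rl->interval_ms) {
		rl->window_start = now_ms;
		rl->printed = 0;
	}
	if (rl->printed >= rl->burst)
		return false;
	rl->printed++;
	return true;
}
```
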
      Signed-off-by: John Efstathiades <john.efstathiades@pebblebay.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      df0d6f7a
    • J
      lan78xx: Fix race condition in disconnect handling · 77dfff5b
      John Efstathiades committed
      If there is a device disconnect at roughly the same time as a
      deferred PHY link reset there is a race condition that can result
      in a kernel lock up due to a null pointer dereference in the
      driver's deferred work handling routine lan78xx_delayedwork().
      The following changes fix this problem.
      
      Add new status flag EVENT_DEV_DISCONNECT to indicate when the
      device has been removed and use it to prevent operations, such as
      register access, that will fail once the device is removed.
      
      Stop processing of deferred work items when the driver's USB
      disconnect handler is invoked.
      
      Disconnect the PHY only after the network device has been
      unregistered and all delayed work has been cancelled.
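
      A minimal sketch of guarding register access with a disconnect flag.
      Only the flag name EVENT_DEV_DISCONNECT comes from the commit; the
      bit position, struct dev_sketch, and both functions are hypothetical
      and stand in for the driver's real data structures.

```c
#include <errno.h>
#include <stdint.h>

#define EVENT_DEV_DISCONNECT_BIT 0 /* bit position is hypothetical */
#define NUM_REGS 4u

/* Hypothetical device state with a flags word and a few registers. */
struct dev_sketch {
	unsigned long flags;
	uint32_t regs[NUM_REGS];
};

/* Called from the disconnect handler before anything else is torn down. */
void mark_disconnected(struct dev_sketch *dev)
{
	dev->flags |= 1UL << EVENT_DEV_DISCONNECT_BIT;
}

/* Refuse register access once the device is gone instead of failing later. */
int read_reg(const struct dev_sketch *dev, unsigned int idx, uint32_t *val)
{
	if (dev->flags & (1UL << EVENT_DEV_DISCONNECT_BIT))
		return -ENODEV;
	if (idx >= NUM_REGS)
		return -EINVAL;
	*val = dev->regs[idx];
	return 0;
}
```
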
      Signed-off-by: John Efstathiades <john.efstathiades@pebblebay.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      77dfff5b
    • J
      lan78xx: Fix race conditions in suspend/resume handling · 5f4cc6e2
      John Efstathiades committed
      If the interface is given an IP address while the device is
      suspended (as a result of an auto-suspend event) there is a race
      between lan78xx_resume() and lan78xx_open() that can result in an
      exception or failure to handle incoming packets. The following
      changes fix this problem.
      
      Introduce a mutex to serialise operations in the network interface
      open and stop entry points with respect to the USB driver suspend
      and resume entry points.
      
      Move Tx and Rx data path start/stop to lan78xx_start() and
      lan78xx_stop() respectively and flush the packet FIFOs before
      starting the Tx and Rx data paths. This prevents the MAC and FIFOs
      getting out of step and delivery of malformed packets to the network
      stack.
      
      Stop processing of received packets before disconnecting the
      PHY from the MAC to prevent a kernel exception caused by handling
      packets after the PHY device has been removed.
      
      Refactor device auto-suspend code to make it consistent with the
      system suspend code and make the suspend handler easier to read.
      
      Add new code to stop wake-on-lan packets or PHY events resuming the
      host or device from suspend if the device has not been opened
      (typically after an IP address is assigned).
      
      This patch is dependent on changes to lan78xx_suspend() and
      lan78xx_resume() introduced in the previous patch of this patch set.
      Signed-off-by: John Efstathiades <john.efstathiades@pebblebay.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      5f4cc6e2
    • J
      lan78xx: Fix partial packet errors on suspend/resume · e1210fe6
      John Efstathiades committed
      The MAC can get out of step with the internal packet FIFOs if the
      system goes to sleep when the link is active, especially at high
      data rates. This can result in partial frames in the packet FIFOs
      that in turn result in malformed frames being delivered to the host.
      This occurs because the driver does not enable/disable the internal
      packet FIFOs in step with the corresponding MAC data path. The
      following changes fix this problem.
      
      Update code that enables/disables the MAC receiver and transmitter
      to the more general Rx and Tx data path, where the data path in each
      direction consists of both the MAC function (Tx or Rx) and the
      corresponding packet FIFO.
      
      In the receive path the packet FIFO must be enabled before the MAC
      receiver but disabled after the MAC receiver.
      
      In the transmit path the opposite is true: the packet FIFO must be
      enabled after the MAC transmitter but disabled before the MAC
      transmitter.
      
      The packet FIFOs can be flushed safely once the corresponding data
      path is stopped.
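
      The enable/disable ordering described above can be written down as a
      small model that records each step. The functions and the log format
      are hypothetical; in the driver each step would be a register write.

```c
#include <string.h>

/* Hypothetical model: each step is appended to a log instead of a register. */
struct datapath {
	char log[64];
};

static void step(struct datapath *d, const char *what)
{
	strncat(d->log, what, sizeof(d->log) - strlen(d->log) - 1);
}

/* Rx: FIFO is enabled before the MAC receiver and disabled after it. */
void rx_start(struct datapath *d) { step(d, "rxfifo+ "); step(d, "rxmac+ "); }
void rx_stop(struct datapath *d)  { step(d, "rxmac- ");  step(d, "rxfifo- "); }

/* Tx: FIFO is enabled after the MAC transmitter and disabled before it. */
void tx_start(struct datapath *d) { step(d, "txmac+ ");  step(d, "txfifo+ "); }
void tx_stop(struct datapath *d)  { step(d, "txfifo- "); step(d, "txmac- "); }
```

      In both directions the stage that produces data is switched on last
      and switched off first, so the consuming stage is always active
      while frames are in flight.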
      Signed-off-by: John Efstathiades <john.efstathiades@pebblebay.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      e1210fe6
    • J
      lan78xx: Fix exception on link speed change · b1f6696d
      John Efstathiades committed
      An exception is sometimes seen when the link speed is changed
      from auto-negotiation to a fixed speed, or vice versa. The
      exception occurs when the MAC is reset (due to the link speed
      change) at the same time as the PHY state machine is accessing
      a PHY register. The following changes fix this problem.
      
      Rework the MAC reset to ensure there is no outstanding MDIO
      register transaction before the reset and then wait until the
      reset is complete before allowing any further MAC register access.
      Signed-off-by: John Efstathiades <john.efstathiades@pebblebay.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      b1f6696d
    • J
      lan78xx: Add missing return code checks · 3415f6ba
      John Efstathiades committed
      There are many places in the driver where the return code from a
      function call is captured but without a subsequent test of the
      return code and appropriate action taken.
      
      This patch adds the missing return code tests and action. In most
      cases the action is an early exit from the calling function.
      
      The function lan78xx_set_suspend() was also updated to make it
      consistent with lan78xx_suspend().
      Signed-off-by: John Efstathiades <john.efstathiades@pebblebay.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      3415f6ba
    • J
      lan78xx: Remove unused pause frame queue · 40b8452f
      John Efstathiades committed
      Remove the pause frame queue from the driver. It is initialised
      but not actually used.
      Signed-off-by: John Efstathiades <john.efstathiades@pebblebay.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      40b8452f
    • J
      lan78xx: Set flow control threshold to prevent packet loss · dc35f854
      John Efstathiades committed
      Set the threshold at which flow control is triggered to 3/4 of the
      internal Rx packet FIFO capacity to prevent packet drops at high data
      rates. The new setting reduces the number of dropped UDP frames
      and TCP retransmit requests, especially on less capable CPUs.
      Signed-off-by: John Efstathiades <john.efstathiades@pebblebay.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      dc35f854
    • J
      lan78xx: Remove unused timer · 3bef6b9e
      John Efstathiades committed
      Remove kernel timer that is not used by the driver.
      Signed-off-by: John Efstathiades <john.efstathiades@pebblebay.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      3bef6b9e
    • J
      lan78xx: Fix white space and style issues · 9ceec7d3
      John Efstathiades committed
      Fix white space and code style issues identified by checkpatch.
      Signed-off-by: John Efstathiades <john.efstathiades@pebblebay.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      9ceec7d3
    • J
      xen/netfront: don't trust the backend response data blindly · a884daa6
      Juergen Gross committed
      Today netfront will trust the backend to send only sane response data.
      In order to avoid privilege escalations or crashes in case of malicious
      backends, verify the data to be within expected limits. In particular,
      make sure that the response always references an outstanding request.
      
      Note that only the tx queue needs special id handling, as for the rx
      queue the id is equal to the index in the ring page.
      
      Introduce a new indicator for the device whether it is broken and let
      the device stop working when it is set. Set this indicator in case the
      backend sets any weird data.
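
      A hedged sketch of the kind of check described above: a tx response
      is accepted only if its id is in range and refers to an outstanding
      request, and any violation marks the device as broken. RING_SIZE,
      struct queue_sketch, and tx_response_ok() are hypothetical and do not
      mirror the netfront data structures.

```c
#include <stdbool.h>
#include <stdint.h>

#define RING_SIZE 256u /* hypothetical ring size */

/* Hypothetical per-queue state tracking which request ids are in flight. */
struct queue_sketch {
	bool outstanding[RING_SIZE];
	bool broken; /* set once the backend sent anything implausible */
};

/* Validate a response id from the backend before acting on it. */
bool tx_response_ok(struct queue_sketch *q, uint16_t id)
{
	if (q->broken)
		return false;
	if (id >= RING_SIZE || !q->outstanding[id]) {
		q->broken = true; /* stop using the device from here on */
		return false;
	}
	q->outstanding[id] = false;
	return true;
}
```
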
      Signed-off-by: Juergen Gross <jgross@suse.com>
      Reviewed-by: Jan Beulich <jbeulich@suse.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      a884daa6
    • J
      xen/netfront: disentangle tx_skb_freelist · 21631d2d
      Juergen Gross committed
      The tx_skb_freelist elements are in a single linked list with the
      request id used as link reference. The per element link field is in a
      union with the skb pointer of an in use request.
      
      Move the link reference out of the union in order to enable a later
      reuse of it for requests which need a populated skb pointer.
      
      Rename add_id_to_freelist() and get_id_from_freelist() to
      add_id_to_list() and get_id_from_list() in order to prepare using
      those for other lists as well. Define ~0 as value to indicate the end
      of a list and place that value into the link for a request not being
      on the list.
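
      The id-linked list can be sketched as an array of links with ~0 as
      the end marker. The function names follow the commit message, but
      their signatures, struct id_list, NUM_IDS, and LINK_END are
      assumptions made for this illustration.

```c
#define NUM_IDS  8u
#define LINK_END (~0u) /* marks the end of a list and "not on any list" */

/* Hypothetical list of request ids, linked through a separate link array. */
struct id_list {
	unsigned int head;
	unsigned int link[NUM_IDS];
};

void id_list_init(struct id_list *l)
{
	l->head = LINK_END;
	for (unsigned int i = 0; i < NUM_IDS; i++)
		l->link[i] = LINK_END;
}

/* Push an id onto the front of the list. */
void add_id_to_list(struct id_list *l, unsigned int id)
{
	l->link[id] = l->head;
	l->head = id;
}

/* Pop the first id, or return LINK_END if the list is empty. */
unsigned int get_id_from_list(struct id_list *l)
{
	unsigned int id = l->head;

	if (id != LINK_END) {
		l->head = l->link[id];
		l->link[id] = LINK_END;
	}
	return id;
}
```

      Because the link no longer shares storage with the skb pointer, an
      entry can be on a list and hold a valid skb pointer at the same time.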
      
      When freeing a skb, zero the skb pointer in the request. Use a NULL
      value of the skb pointer instead of skb_entry_is_link() for deciding
      whether a request has a skb linked to it.
      
      Remove skb_entry_set_link() and open code it instead as it is really
      trivial now.
      Signed-off-by: Juergen Gross <jgross@suse.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      21631d2d
    • J
      xen/netfront: don't read data from request on the ring page · 162081ec
      Juergen Gross committed
      In order to avoid a malicious backend being able to influence the local
      processing of a request, build the request locally first and then copy
      it to the ring page. Any reading from the request influencing the
      processing in the frontend needs to be done on the local instance.
      Signed-off-by: Juergen Gross <jgross@suse.com>
      Reviewed-by: Jan Beulich <jbeulich@suse.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      162081ec
    • J
      xen/netfront: read response from backend only once · 8446066b
      Juergen Gross committed
      In order to avoid problems in case the backend is modifying a response
      on the ring page while the frontend has already seen it, just read the
      response into a local buffer in one go and then operate on that buffer
      only.
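
      The read-once pattern can be sketched as a single copy out of shared
      memory followed by work on the copy only. struct tx_response_sketch
      and read_response_once() are hypothetical and simplified; they are
      not the netfront ring macros.

```c
#include <stdint.h>

/* Hypothetical, simplified response layout on the shared ring page. */
struct tx_response_sketch {
	uint16_t id;
	int16_t status;
};

/*
 * Copy the response out of shared memory in one go. All later checks and
 * processing must use *local, so that concurrent changes by the backend
 * cannot alter a value after it has been validated.
 */
void read_response_once(const volatile struct tx_response_sketch *shared,
			struct tx_response_sketch *local)
{
	local->id = shared->id;
	local->status = shared->status;
}
```
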
      Signed-off-by: Juergen Gross <jgross@suse.com>
      Reviewed-by: Jan Beulich <jbeulich@suse.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      8446066b
    • A
      qed: Enable automatic recovery on error condition. · 755f9053
      Alok Prasad committed
      This patch enables automatic recovery by default in case of various
      error conditions such as firmware asserts and hardware errors.
      It also ensures that the driver can handle multiple iterations of
      assertion conditions.
      Signed-off-by: Ariel Elior <aelior@marvell.com>
      Signed-off-by: Shai Malin <smalin@marvell.com>
      Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
      Signed-off-by: Alok Prasad <palok@marvell.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      755f9053
    • V
      igc: Add support for PTP getcrosststamp() · a90ec848
      Vinicius Costa Gomes committed
      i225 supports PCIe Precision Time Measurement (PTM), allowing us to
      support the PTP_SYS_OFFSET_PRECISE ioctl() in the driver via the
      getcrosststamp() function.
      
      The easiest way to expose the PTM registers would be to configure the PTM
      dialogs to run periodically, but the PTP_SYS_OFFSET_PRECISE ioctl()
      semantics are more aligned to using a kind of "one-shot" way of retrieving
      the PTM timestamps. But this causes a bit more code to be written: the
      trigger registers for the PTM dialogs are not cleared automatically.
      
      i225 can be configured to send "fake" packets with the PTM
      information, adding support for handling these types of packets is
      left for the future.
      
      PTM improves the accuracy of time synchronization, for example, using
      phc2sys, while a simple application is sending packets as fast as
      possible. First, without .getcrosststamp():
      
      phc2sys[191.382]: enp4s0 sys offset      -959 s2 freq    -454 delay   4492
      phc2sys[191.482]: enp4s0 sys offset       798 s2 freq   +1015 delay   4069
      phc2sys[191.583]: enp4s0 sys offset       962 s2 freq   +1418 delay   3849
      phc2sys[191.683]: enp4s0 sys offset       924 s2 freq   +1669 delay   3753
      phc2sys[191.783]: enp4s0 sys offset       664 s2 freq   +1686 delay   3349
      phc2sys[191.883]: enp4s0 sys offset       218 s2 freq   +1439 delay   2585
      phc2sys[191.983]: enp4s0 sys offset       761 s2 freq   +2048 delay   3750
      phc2sys[192.083]: enp4s0 sys offset       756 s2 freq   +2271 delay   4061
      phc2sys[192.183]: enp4s0 sys offset       809 s2 freq   +2551 delay   4384
      phc2sys[192.283]: enp4s0 sys offset      -108 s2 freq   +1877 delay   2480
      phc2sys[192.383]: enp4s0 sys offset     -1145 s2 freq    +807 delay   4438
      phc2sys[192.484]: enp4s0 sys offset       571 s2 freq   +2180 delay   3849
      phc2sys[192.584]: enp4s0 sys offset       241 s2 freq   +2021 delay   3389
      phc2sys[192.684]: enp4s0 sys offset       405 s2 freq   +2257 delay   3829
      phc2sys[192.784]: enp4s0 sys offset        17 s2 freq   +1991 delay   3273
      phc2sys[192.884]: enp4s0 sys offset       152 s2 freq   +2131 delay   3948
      phc2sys[192.984]: enp4s0 sys offset      -187 s2 freq   +1837 delay   3162
      phc2sys[193.084]: enp4s0 sys offset     -1595 s2 freq    +373 delay   4557
      phc2sys[193.184]: enp4s0 sys offset       107 s2 freq   +1597 delay   3740
      phc2sys[193.284]: enp4s0 sys offset       199 s2 freq   +1721 delay   4010
      phc2sys[193.385]: enp4s0 sys offset      -169 s2 freq   +1413 delay   3701
      phc2sys[193.485]: enp4s0 sys offset       -47 s2 freq   +1484 delay   3581
      phc2sys[193.585]: enp4s0 sys offset       -65 s2 freq   +1452 delay   3778
      phc2sys[193.685]: enp4s0 sys offset        95 s2 freq   +1592 delay   3888
      phc2sys[193.785]: enp4s0 sys offset       206 s2 freq   +1732 delay   4445
      phc2sys[193.885]: enp4s0 sys offset      -652 s2 freq    +936 delay   2521
      phc2sys[193.985]: enp4s0 sys offset      -203 s2 freq   +1189 delay   3391
      phc2sys[194.085]: enp4s0 sys offset      -376 s2 freq    +955 delay   2951
      phc2sys[194.185]: enp4s0 sys offset      -134 s2 freq   +1084 delay   3330
      phc2sys[194.285]: enp4s0 sys offset       -22 s2 freq   +1156 delay   3479
      phc2sys[194.386]: enp4s0 sys offset        32 s2 freq   +1204 delay   3602
      phc2sys[194.486]: enp4s0 sys offset       122 s2 freq   +1303 delay   3731
      
      Statistics for this run (total of 2179 lines), in nanoseconds:
        average: -1.12
        stdev: 634.80
        max: 1551
        min: -2215
      
      With .getcrosststamp() via PCIe PTM:
      
      phc2sys[367.859]: enp4s0 sys offset         6 s2 freq   +1727 delay      0
      phc2sys[367.959]: enp4s0 sys offset        -2 s2 freq   +1721 delay      0
      phc2sys[368.059]: enp4s0 sys offset         5 s2 freq   +1727 delay      0
      phc2sys[368.160]: enp4s0 sys offset        -1 s2 freq   +1723 delay      0
      phc2sys[368.260]: enp4s0 sys offset        -4 s2 freq   +1719 delay      0
      phc2sys[368.360]: enp4s0 sys offset        -5 s2 freq   +1717 delay      0
      phc2sys[368.460]: enp4s0 sys offset         1 s2 freq   +1722 delay      0
      phc2sys[368.560]: enp4s0 sys offset        -3 s2 freq   +1718 delay      0
      phc2sys[368.660]: enp4s0 sys offset         5 s2 freq   +1725 delay      0
      phc2sys[368.760]: enp4s0 sys offset        -1 s2 freq   +1721 delay      0
      phc2sys[368.860]: enp4s0 sys offset         0 s2 freq   +1721 delay      0
      phc2sys[368.960]: enp4s0 sys offset         0 s2 freq   +1721 delay      0
      phc2sys[369.061]: enp4s0 sys offset         4 s2 freq   +1725 delay      0
      phc2sys[369.161]: enp4s0 sys offset         1 s2 freq   +1724 delay      0
      phc2sys[369.261]: enp4s0 sys offset         4 s2 freq   +1727 delay      0
      phc2sys[369.361]: enp4s0 sys offset         8 s2 freq   +1732 delay      0
      phc2sys[369.461]: enp4s0 sys offset         7 s2 freq   +1733 delay      0
      phc2sys[369.561]: enp4s0 sys offset         4 s2 freq   +1733 delay      0
      phc2sys[369.661]: enp4s0 sys offset         1 s2 freq   +1731 delay      0
      phc2sys[369.761]: enp4s0 sys offset         1 s2 freq   +1731 delay      0
      phc2sys[369.861]: enp4s0 sys offset        -5 s2 freq   +1725 delay      0
      phc2sys[369.961]: enp4s0 sys offset        -4 s2 freq   +1725 delay      0
      phc2sys[370.062]: enp4s0 sys offset         2 s2 freq   +1730 delay      0
      phc2sys[370.162]: enp4s0 sys offset        -7 s2 freq   +1721 delay      0
      phc2sys[370.262]: enp4s0 sys offset        -3 s2 freq   +1723 delay      0
      phc2sys[370.362]: enp4s0 sys offset         1 s2 freq   +1726 delay      0
      phc2sys[370.462]: enp4s0 sys offset        -3 s2 freq   +1723 delay      0
      phc2sys[370.562]: enp4s0 sys offset        -1 s2 freq   +1724 delay      0
      phc2sys[370.662]: enp4s0 sys offset        -4 s2 freq   +1720 delay      0
      phc2sys[370.762]: enp4s0 sys offset        -7 s2 freq   +1716 delay      0
      phc2sys[370.862]: enp4s0 sys offset        -2 s2 freq   +1719 delay      0
      
      Statistics for this run (total of 2179 lines), in nanoseconds:
        average: 0.14
        stdev: 5.03
        max: 48
        min: -27
      
      For reference, the statistics for runs without PCIe congestion show
      that the improvements from enabling PTM are less dramatic. For two
      runs of 16466 entries:
        without PTM: avg -0.04 stdev 10.57 max 39 min -42
        with PTM: avg 0.01 stdev 4.20 max 19 min -16
      
      One possible explanation is that when PTM is not enabled and there is a
      lot of traffic in the PCIe fabric, some register reads take longer
      than others because of congestion.
      
      When PTM is enabled, even if the PTM dialogs take more time to
      complete under heavy traffic, the time measurements do not depend on
      the time to read the registers.
      
      This was implemented following the i225 EAS version 0.993.
      Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
      Tested-by: Dvora Fuxbrumer <dvorax.fuxbrumer@linux.intel.com>
      Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
      a90ec848
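The summary figures quoted in the commit above (average, stdev, max, min of the "sys offset" column) could be reproduced from a phc2sys log along these lines. This is a sketch under assumptions: the commit does not say how the statistics were computed, so the use of the sample standard deviation here is a guess, and `parse_offsets` and `summarize` are hypothetical helper names.

```python
import re
import statistics

# Matches lines such as:
#   phc2sys[191.382]: enp4s0 sys offset      -959 s2 freq    -454 delay   4492
OFFSET_RE = re.compile(r"sys offset\s+(-?\d+)\s")


def parse_offsets(lines):
    """Return the 'sys offset' values (nanoseconds) found in phc2sys output."""
    offsets = []
    for line in lines:
        match = OFFSET_RE.search(line)
        if match:
            offsets.append(int(match.group(1)))
    return offsets


def summarize(offsets):
    """Compute the four figures reported for each run."""
    return {
        "average": statistics.mean(offsets),
        "stdev": statistics.stdev(offsets),  # assumption: sample stdev
        "max": max(offsets),
        "min": min(offsets),
    }
```

Feeding the full log of each run (2179 lines in the commit message) through `summarize(parse_offsets(lines))` gives one dictionary per run that can be compared side by side.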
    • V
      igc: Enable PCIe PTM · 1b5d73fb
      Vinicius Costa Gomes authored
      Enables PCIe PTM (Precision Time Measurement) support in the igc
      driver. Notifies the PCI devices that PCIe PTM should be enabled.

      PCIe PTM is a protocol similar to PTP (Precision Time Protocol) that
      runs in the PCIe fabric. It allows devices to report time measurements
      from their internal clocks and the correlation with the PCIe root clock.

      The i225 NIC exposes these time measurements through registers, which
      later patches will use to implement the PTP_SYS_OFFSET_PRECISE
      ioctl().
      Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
      Tested-by: Dvora Fuxbrumer <dvorax.fuxbrumer@linux.intel.com>
      Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
      1b5d73fb
    • V
      PCI: Add pcie_ptm_enabled() · 014408cd
      Vinicius Costa Gomes authored
      Add a predicate that returns whether PCIe PTM (Precision Time
      Measurement) is enabled.

      It returns true only if PTM is enabled on every port in the path
      from the device to the root.
      Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
      Acked-by: Bjorn Helgaas <bhelgaas@google.com>
      Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
      014408cd
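The condition stated in the commit above (enabled on every port between the device and the root) can be illustrated with a toy hierarchy. This models only the condition, not how the kernel implements pcie_ptm_enabled(); `Port` and `ptm_enabled_on_path` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Port:
    """Hypothetical node in a PCIe hierarchy; parent is None at the root."""
    name: str
    ptm_enabled: bool
    parent: Optional["Port"] = None


def ptm_enabled_on_path(dev: Port) -> bool:
    """True only if PTM is enabled on dev and on every port up to the root."""
    node: Optional[Port] = dev
    while node is not None:
        if not node.ptm_enabled:
            return False
        node = node.parent
    return True
```

A single switch port with PTM disabled is enough to make the predicate false for every device below it.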
    • V
      Revert "PCI: Make pci_enable_ptm() private" · 1d71eb53
      Vinicius Costa Gomes authored
      Make pci_enable_ptm() accessible from the drivers.

      Exposing this to the driver enables the driver to use the
      'ptm_enabled' field of 'pci_dev' to check whether PTM is enabled.

      This reverts commit ac6c26da ("PCI: Make pci_enable_ptm() private").
      Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
      Acked-by: Bjorn Helgaas <bhelgaas@google.com>
      Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
      1d71eb53
  2. 24 Aug, 2021 6 commits
    • Y
      net: hns3: add ethtool support for CQE/EQE mode configuration · cce1689e
      Yufeng Mo authored
      Add support in ethtool for switching EQE/CQE mode.
      Signed-off-by: Yufeng Mo <moyufeng@huawei.com>
      Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      cce1689e
    • Y
      net: hns3: add support for EQE/CQE mode configuration · 9f0c6f4b
      Yufeng Mo authored
      For devices of version V3 and above, the GL can select EQE or CQE
      mode, so add support for it.

      In CQE mode, the coalesced timer will restart when the first new
      completion occurs, while in EQE mode, the timer will not restart.
      Signed-off-by: Yufeng Mo <moyufeng@huawei.com>
      Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      9f0c6f4b
    • Y
      ethtool: extend coalesce setting uAPI with CQE mode · f3ccfda1
      Yufeng Mo authored
      To support more coalesce parameters through netlink, add two new
      parameters, kernel_coal and extack, to .set_coalesce and
      .get_coalesce, so that extra information can be returned to the
      user through the netlink API.
      Signed-off-by: Yufeng Mo <moyufeng@huawei.com>
      Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      f3ccfda1
    • H
      r8169: enable ASPM L0s state · 18a9eae2
      Heiner Kallweit authored
      ASPM is disabled completely because we've seen different types of
      problems in the past. However, it seems these problems occurred with
      L1 or L1 sub-states only. On all the chip versions I've seen, the
      acceptable L0s exit latency is 512ns. This should be short enough not
      to cause problems. If the actual L0s exit latency of the PCIe link
      is bigger than 512ns, the PCI core will disable L0s anyway.
      So let's give it a try and disable L1 and L1 sub-states only.
      Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      18a9eae2
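The latency argument in the commit above reduces to a single comparison. The sketch below only restates that comparison; `l0s_stays_enabled` is a hypothetical helper and does not correspond to a PCI core function.

```python
ACCEPTABLE_L0S_EXIT_NS = 512  # figure quoted in the commit message


def l0s_stays_enabled(link_exit_latency_ns: int,
                      acceptable_ns: int = ACCEPTABLE_L0S_EXIT_NS) -> bool:
    """L0s is kept only if the link can exit it within the acceptable time."""
    return link_exit_latency_ns <= acceptable_ns
```

For example, a link reporting a 256 ns exit latency would keep L0s, while one reporting 1024 ns would have it disabled.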
    • V
      net: dsa: let drivers state that they need VLAN filtering while standalone · 58adf9dc
      Vladimir Oltean authored
      As explained in commit e358bef7 ("net: dsa: Give drivers the chance
      to veto certain upper devices"), the hellcreek driver uses some tricks
      to comply with the network stack expectations: it enforces port
      separation in standalone mode using VLANs. For untagged traffic,
      bridging between ports is prevented by using different PVIDs, and for
      VLAN-tagged traffic, it never accepts 8021q uppers with the same VID on
      two ports, so packets with one VLAN cannot leak from one port to another.
      
      That is almost fine*, and has worked because hellcreek relied on an
      implicit behavior of the DSA core that was changed by the previous
      patch: the standalone ports declare the 'rx-vlan-filter' feature as 'on
      [fixed]'. Since most of the DSA drivers are actually VLAN-unaware in
      standalone mode, that feature was actually incorrectly reflecting the
      hardware/driver state, so there was a desire to fix it. This leaves the
      hellcreek driver in a situation where it has to explicitly request this
      behavior from the DSA framework.
      
      We configure the ports as follows:
      
      - Standalone: 'rx-vlan-filter' is on. An 8021q upper on top of a
        standalone hellcreek port will go through dsa_slave_vlan_rx_add_vid
        and will add a VLAN to the hardware tables, giving the driver the
        opportunity to refuse it through .port_prechangeupper.
      
      - Bridged with vlan_filtering=0: 'rx-vlan-filter' is off. An 8021q upper
        on top of a bridged hellcreek port will not go through
        dsa_slave_vlan_rx_add_vid, because there will not be any attempt to
        offload this VLAN. The driver already disables VLAN awareness, so that
        upper should receive the traffic it needs.
      
      - Bridged with vlan_filtering=1: 'rx-vlan-filter' is on. An 8021q upper
        on top of a bridged hellcreek port will call dsa_slave_vlan_rx_add_vid,
        and can again be vetoed through .port_prechangeupper.
      
      *It is not actually completely fine, because if I follow through
      correctly, we can have the following situation:
      
      ip link add br0 type bridge vlan_filtering 0
      ip link set lan0 master br0 # lan0 now becomes VLAN-unaware
      ip link set lan0 nomaster # lan0 fails to become VLAN-aware again, therefore breaking isolation
      
      This patch fixes that corner case by extending the DSA core logic, based
      on this requested attribute, to change the VLAN awareness state of the
      switch (port) when it leaves the bridge.
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
      Acked-by: Kurt Kanzenbach <kurt@linutronix.de>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      58adf9dc
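The three port configurations listed in the commit above can be written as a small truth table. This is a sketch of the described behavior for a driver that needs VLAN filtering while standalone, not the DSA core code; both function names are hypothetical.

```python
from typing import Optional


def rx_vlan_filter_on(bridged: bool, vlan_filtering: Optional[bool] = None) -> bool:
    """State of 'rx-vlan-filter' in the three cases from the commit message."""
    if not bridged:
        return True              # standalone: filtering stays on
    return bool(vlan_filtering)  # bridged: follows the bridge setting


def upper_can_be_vetoed(bridged: bool, vlan_filtering: Optional[bool] = None) -> bool:
    """An 8021q upper reaches the driver, and can be refused, only when the
    VLAN is offloaded, which is when 'rx-vlan-filter' is on."""
    return rx_vlan_filter_on(bridged, vlan_filtering)
```

In this model, the corner case from the commit message corresponds to a port going from `rx_vlan_filter_on(True, False)` back to `rx_vlan_filter_on(False)`: the expected state flips from off to on, so the switch port has to become VLAN-aware again when it leaves the bridge.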
    • H
      cxgb4: improve printing NIC information · 1bb39cb6
      Heiner Kallweit authored
      Currently the interface name and PCI address are printed twice, because
      netdev_info() already prints this information implicitly. This results
      in messages like the one below. Remove the duplicated information.

      cxgb4 0000:81:00.4 eth3: eth3: Chelsio T6225-OCP-SO (0000:81:00.4) 1G/10G/25GBASE-SFP28
      Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      1bb39cb6
  3. 23 Aug, 2021 2 commits