1. 18 Jun, 2021 2 commits
  2. 16 Jun, 2021 1 commit
  3. 15 Jun, 2021 5 commits
  4. 13 Jun, 2021 3 commits
  5. 12 Jun, 2021 18 commits
    • V
      net: pcs: xpcs: export xpcs_do_config and xpcs_link_up · a853c68e
      Vladimir Oltean committed
      The sja1105 hardware has a quirk in that some changes require a switch
      reset, which loses all configuration. When the reset is initiated,
      everything needs to be reprogrammed, including the MACs and the PCS.
      This is currently done in sja1105_static_config_reload() - we manually
      call sja1105_adjust_port_config(), sja1105_sgmii_pcs_config() and
      sja1105_sgmii_pcs_force_speed() which are all internal functions.
      
      There is a desire for sja1105 to use the common xpcs driver, and that
      means that the equivalents of those functions, xpcs_do_config() and
      xpcs_link_up() respectively, will no longer be local functions.
      
      Forcing phylink to retrigger a resolve somehow, say by doing dev_close()
      followed by dev_open() is not really an option, because the CPU port
      might have a PCS as well, and there is no net device which we can close
      and reopen for that. Additionally, the dev_close/dev_open sequence might
      force a renegotiation of the copper-side link for SGMII ports connected
      to a PHY, and this is undesirable as well, because the switch reset is
      much quicker than a PHY autoneg, so we would have a lot more downtime.
      
      The only solution I see is for the sja1105 driver to keep doing what
      it's doing, and that means we need to export the equivalents from xpcs
      for sja1105_sgmii_pcs_config and sja1105_sgmii_pcs_force_speed, and call
      them directly in sja1105_static_config_reload(). This will be done
      during the conversion patch.
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      a853c68e
    • V
      net: pcs: xpcs: add support for NXP SJA1110 · f7380bba
      Vladimir Oltean committed
      The NXP SJA1110 switch integrates its own, non-Synopsys PMA, but it
      manages it through the register space of the XPCS itself, in a small
      register window inside MDIO_MMD_VEND2 from address 0x8030 to 0x806e.
      
      This coincides with where the registers for the default Synopsys PMA
      are, but the register definitions are of course not the same.
      
      This situation is an odd hardware quirk, but the simplest way to manage
      it is to drive the SJA1110's PMA from within the XPCS driver.
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      f7380bba
    • V
      net: pcs: xpcs: add support for NXP SJA1105 · dd0721ea
      Vladimir Oltean committed
      The NXP SJA1105 DSA switch integrates a Synopsys SGMII XPCS on port 4.
      The generic code works fine, except there is an integration issue which
      needs to be dealt with: in this switch, the XPCS is integrated with a
      PMA that has the TX lane polarity inverted by default (PLUS is MINUS,
      MINUS is PLUS).
      
      To obtain normal non-inverted behavior, the TX lane polarity must be
      inverted in the PCS, via the DIGITAL_CONTROL_2 register.
      
      We introduce a pma_config() method in xpcs_compat which is called by the
      phylink_pcs_config() implementation.
      
      Also, the NXP SJA1105 returns all zeroes in the PHY ID registers 2 and 3.
      We need to hack up an ad-hoc PHY ID (OUI is zero, device ID is 1) in
      order for the XPCS driver to recognize it. This PHY ID is added to the
      public include/linux/pcs/pcs-xpcs.h for that reason (for the sja1105
      driver to be able to use it in a later patch).
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      dd0721ea
    • V
      net: pcs: xpcs: rename mdio_xpcs_args to dw_xpcs · 5673ef86
      Vladimir Oltean committed
      The struct mdio_xpcs_args is reminiscent of when a similarly named
      struct mdio_xpcs_ops existed. Now that that is removed, we can shorten
      the name to dw_xpcs (dw for DesignWare).
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      5673ef86
    • A
      virtio/vsock: rest of SOCK_SEQPACKET support · 9ac841f5
      Arseny Krasnov committed
      Small updates to make SOCK_SEQPACKET work:
      1) Send SHUTDOWN on socket close for SEQPACKET type.
      2) Set SEQPACKET packet type during send.
      3) Set 'VIRTIO_VSOCK_SEQ_EOR' bit in flags for last
         packet of message.
      4) Implement data check function for SEQPACKET.
      5) Check for max datagram size.
      Signed-off-by: Arseny Krasnov <arseny.krasnov@kaspersky.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      9ac841f5
    • A
      virtio/vsock: dequeue callback for SOCK_SEQPACKET · 44931195
      Arseny Krasnov committed
      The callback fetches RW packets from the socket's rx queue until the whole
      record is copied (if the user's buffer is full, the user is not woken up).
      This is done so as not to stall the sender: if we woke up the user and it
      left the syscall, nobody would send a credit update for the rest of the
      record, and the sender would wait for the receiver to enter the read
      syscall again. So if the user buffer is full, we just send a credit
      update and drop the data.
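The dequeue behaviour described in this message can be modelled roughly in Python. This is only a sketch: the queue of `(payload, eor)` tuples and the name `dequeue_record` are assumptions made for illustration, and the credit update that the real callback sends is left out.

```python
from collections import deque


def dequeue_record(rx_queue, buf_len):
    """Consume one whole record from rx_queue, copying at most buf_len bytes.

    rx_queue holds (payload, eor) tuples; eor marks the last packet of a record.
    Bytes that do not fit into the user buffer are dropped, but the packets are
    still consumed so that the sender is not stalled.
    """
    copied = bytearray()
    record_len = 0
    while rx_queue:
        payload, eor = rx_queue.popleft()
        record_len += len(payload)
        room = buf_len - len(copied)  # remaining space in the user buffer
        copied += payload[:room]      # room == 0 copies nothing
        if eor:
            break
    return bytes(copied), record_len
```

Returning the full record length alongside the copied bytes lets a caller notice that the record was truncated.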
      Signed-off-by: Arseny Krasnov <arseny.krasnov@kaspersky.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      44931195
    • C
      net: phylink: introduce phylink_fwnode_phy_connect() · 25396f68
      Calvin Johnson committed
      Define phylink_fwnode_phy_connect() to connect phy specified by
      a fwnode to a phylink instance.
      Signed-off-by: Calvin Johnson <calvin.johnson@oss.nxp.com>
      Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
      Acked-by: Grant Likely <grant.likely@arm.com>
      Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      25396f68
    • C
      net: mdio: Add ACPI support code for mdio · 803ca24d
      Calvin Johnson committed
      Define acpi_mdiobus_register() to register mii_bus and create PHYs for
      each ACPI child node.
      Signed-off-by: Calvin Johnson <calvin.johnson@oss.nxp.com>
      Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
      Acked-by: Rafael J. Wysocki <rafael@kernel.org>
      Acked-by: Grant Likely <grant.likely@arm.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      803ca24d
    • C
      ACPI: utils: Introduce acpi_get_local_address() · 7ec16433
      Calvin Johnson committed
      Introduce a wrapper around the _ADR evaluation.
      Signed-off-by: Calvin Johnson <calvin.johnson@oss.nxp.com>
      Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com>
      Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
      Acked-by: Rafael J. Wysocki <rafael@kernel.org>
      Acked-by: Grant Likely <grant.likely@arm.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      7ec16433
    • C
      net: mdiobus: Introduce fwnode_mdiobus_register_phy() · bc1bee3b
      Calvin Johnson committed
      Introduce fwnode_mdiobus_register_phy() to register PHYs on the
      mdiobus. From the compatible string, identify whether the PHY is
      c45 and based on this create a PHY device instance which is
      registered on the mdiobus.
      
      Along with fwnode_mdiobus_register_phy() also introduce
      fwnode_find_mii_timestamper() and fwnode_mdiobus_phy_device_register()
      since they are needed.
      While at it, also use the newly introduced fwnode operation in
      of_mdiobus_phy_device_register().
      Signed-off-by: Calvin Johnson <calvin.johnson@oss.nxp.com>
      Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
      Acked-by: Grant Likely <grant.likely@arm.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      bc1bee3b
    • C
      net: phy: Introduce fwnode_get_phy_id() · 114dea60
      Calvin Johnson committed
      Extract phy_id from compatible string. This will be used by
      fwnode_mdiobus_register_phy() to create phy device using the
      phy_id.
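As an illustration of extracting a PHY ID from a compatible string, here is a small Python sketch. It assumes the compatible string follows the `ethernet-phy-idAAAA.BBBB` form used by the Ethernet PHY devicetree binding (the commit message does not spell out the format), and the helper name `phy_id_from_compatible` is hypothetical.

```python
import re

# Assumed format: "ethernet-phy-id" followed by two 4-digit hex halves.
_COMPATIBLE_RE = re.compile(r"ethernet-phy-id([0-9a-fA-F]{4})\.([0-9a-fA-F]{4})")


def phy_id_from_compatible(compatible):
    """Return the 32-bit PHY ID encoded in the string, or None if absent."""
    match = _COMPATIBLE_RE.fullmatch(compatible)
    if match is None:
        return None
    upper, lower = (int(group, 16) for group in match.groups())
    # The first half forms the upper 16 bits, the second half the lower 16 bits.
    return (upper << 16) | lower
```

A string that does not carry an explicit ID, such as a generic clause 45 compatible, simply yields `None` in this sketch.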
      Signed-off-by: Calvin Johnson <calvin.johnson@oss.nxp.com>
      Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
      Acked-by: Grant Likely <grant.likely@arm.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      114dea60
    • C
      net: phy: Introduce phy related fwnode functions · 425775ed
      Calvin Johnson committed
      Define fwnode_phy_find_device() to iterate an mdiobus and find the
      phy device of the provided phy fwnode. Additionally define
      device_phy_find_device() to find phy device of provided device.
      
      Define fwnode_get_phy_node() to get phy_node using named reference.
      Signed-off-by: Calvin Johnson <calvin.johnson@oss.nxp.com>
      Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
      Acked-by: Grant Likely <grant.likely@arm.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      425775ed
    • C
      net: phy: Introduce fwnode_mdio_find_device() · 0fb16976
      Calvin Johnson committed
      Define fwnode_mdio_find_device() to get a pointer to the
      mdio_device from fwnode passed to the function.
      
      Refactor of_mdio_find_device() to use fwnode_mdio_find_device().
      Signed-off-by: Calvin Johnson <calvin.johnson@oss.nxp.com>
      Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
      Acked-by: Grant Likely <grant.likely@arm.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      0fb16976
    • V
      net: dsa: sja1105: implement TX timestamping for SJA1110 · 566b18c8
      Vladimir Oltean committed
      The TX timestamping procedure for SJA1105 is a bit unconventional
      because the transmit procedure itself is unconventional.
      
      Control packets (and therefore PTP as well) are transmitted to a
      specific port in SJA1105 using "management routes" which must be written
      over SPI to the switch. These are one-shot rules that match by
      destination MAC address on traffic coming from the CPU port, and select
      the precise destination port for that packet. So to transmit a packet
      from NET_TX softirq context, we actually need to defer to a process
      context so that we can perform that SPI write before we send the packet.
      The DSA master dev_queue_xmit() runs in process context, and we poll
      until the switch confirms it took the TX timestamp, then we annotate the
      skb clone with that TX timestamp. This is why the sja1105 driver does
      not need an skb queue for TX timestamping.
      
      But the SJA1110 is a bit (not much!) more conventional, and you can
      request 2-step TX timestamping through the DSA header, as well as give
      the switch a cookie (timestamp ID) which it will give back to you when
      it has the timestamp. So now we do need a queue for keeping the skb
      clones until their TX timestamps become available.
      
      The interesting part is that the metadata frames from SJA1105 haven't
      disappeared completely. On SJA1105 they were used as follow-ups which
      contained RX timestamps, but on SJA1110 they are actually TX completion
      packets, which contain a variable (up to 32) array of timestamps.
      Why an array? Because:
      - not only is the TX timestamp on the egress port being communicated,
        but also the RX timestamp on the CPU port. Nice, but we don't care
        about that, so we ignore it.
      - because a packet could be multicast to multiple egress ports, each
        port takes its own timestamp, and the TX completion packet contains
        the individual timestamps on each port.
      
      This is unconventional because switches typically have a timestamping
      FIFO and raise an interrupt, but this one doesn't. So the tagger needs
      to detect and parse meta frames, and call into the main switch driver,
      which pairs the timestamps with the skbs in the TX timestamping queue
      which are waiting for one.
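The pairing of TX completion timestamps with queued skb clones can be sketched in Python as follows. This is a simplified model, not the driver's code: the class name, the `(ts_id, is_tx, tstamp)` entry layout and the one-clone-per-timestamp-ID bookkeeping are all assumptions made for illustration.

```python
class TxTimestampQueue:
    """Holds skb clones until the switch reports their TX timestamp."""

    def __init__(self):
        self._pending = {}  # timestamp ID (cookie) -> skb clone

    def enqueue(self, ts_id, skb_clone):
        # Called at transmit time, after the cookie was placed in the DSA header.
        self._pending[ts_id] = skb_clone

    def handle_meta_frame(self, entries):
        """Process one TX completion frame.

        entries is an iterable of (ts_id, is_tx, tstamp) tuples. Entries that
        are not TX timestamps (the RX timestamp taken on the CPU port) are
        ignored, as are IDs for which no clone is waiting.
        """
        completed = []
        for ts_id, is_tx, tstamp in entries:
            if not is_tx:
                continue
            skb_clone = self._pending.pop(ts_id, None)
            if skb_clone is not None:
                completed.append((skb_clone, tstamp))
        return completed
```

In this model the tagger would call `handle_meta_frame()` when it detects a meta frame, and the returned pairs are the clones that can now be delivered with their timestamp.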
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      566b18c8
    • V
      net: dsa: add support for the SJA1110 native tagging protocol · 4913b8eb
      Vladimir Oltean committed
      The SJA1110 has improved a few things compared to SJA1105:
      
      - To send a control packet from the host port with SJA1105, one needed
        to program a one-shot "management route" over SPI. This is no longer
        true with SJA1110, you can actually send "in-band control extensions"
        in the packets sent by DSA, these are in fact DSA tags which contain
        the destination port and switch ID.
      
      - When receiving a control packet from the switch with SJA1105, the
        source port and switch ID were written in bytes 3 and 4 of the
        destination MAC address of the frame (which was a very poor shot at a
        DSA header). If the control packet also had an RX timestamp, that
        timestamp was sent in an actual follow-up packet, so there were
        reordering concerns on multi-core/multi-queue DSA masters, where the
        metadata frame with the RX timestamp might get processed before the
        actual packet to which that timestamp belonged (there is no way to
        pair a packet to its timestamp other than the order in which they were
        received). On SJA1110, this is no longer true, control packets have
        the source port, switch ID and timestamp all in the DSA tags.
      
      - Timestamps from the switch were partial: to get a 64-bit timestamp as
        required by PTP stacks, one would need to take the partial 24-bit or
        32-bit timestamp from the packet, then read the current PTP time very
        quickly, and then patch in the high bits of the current PTP time into
        the captured partial timestamp, to reconstruct what the full 64-bit
        timestamp must have been. That is awful because packet processing is
        done in NAPI context, but reading the current PTP time is done over
        SPI and therefore needs sleepable context.
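The partial timestamp reconstruction described in the last bullet can be sketched in Python. This is a minimal model that assumes the current PTP time is read less than one wraparound period of the partial counter after the capture; the function name is hypothetical.

```python
def reconstruct_timestamp(now, partial, bits):
    """Rebuild a full timestamp from its low `bits` bits.

    now     -- full PTP time read shortly after the packet was captured
    partial -- captured low-order bits of the timestamp (24 or 32 bits wide)
    """
    mask = (1 << bits) - 1
    # Patch the high bits of the current time onto the partial timestamp.
    full = (now & ~mask) | (partial & mask)
    # The capture happened before `now`. If the result lies in the future,
    # the low bits wrapped between the capture and the read, so step back
    # by one wraparound period.
    if full > now:
        full -= 1 << bits
    return full
```

For example, with a 32-bit partial value of `0xFFFFFFF0` and a current time of `0x100000005`, the counter wrapped after the capture, so the reconstructed value is `0xFFFFFFF0` rather than `0x1FFFFFFF0`.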
      
      But it also aggravated a few things:
      
      - Not only is there a DSA header in SJA1110, but there is a DSA trailer
        in fact, too. So DSA needs to be extended to support taggers which
        have both a header and a trailer. Very unconventional - my understanding
        is that the trailer exists because the timestamps couldn't be prepared
        in time for putting them in the header area.
      
      - Like SJA1105, not all packets sent to the CPU have the DSA tag added
        to them, only control packets do:
      
        * the ones which match the destination MAC filters/traps in
          MAC_FLTRES1 and MAC_FLTRES0
        * the ones which match FDB entries which have TRAP or TAKETS bits set
      
        So we could in theory hack something up to request the switch to take
        timestamps for all packets that reach the CPU, and those would be
        DSA-tagged and contain the source port / switch ID by virtue of the
        fact that there needs to be a timestamp trailer provided. BUT:
      
      - The SJA1110 does not parse its own DSA tags in a way that is useful
        for routing in cross-chip topologies, a la Marvell. And the sja1105
        driver already supports cross-chip bridging from the SJA1105 days.
        It does that by automatically setting up the DSA links as VLAN trunks
        which contain all the necessary tag_8021q RX VLANs that must be
        communicated between the switches that span the same bridge. So when
        using tag_8021q on sja1105, it is possible to have 2 switches with
        ports sw0p0, sw0p1, sw1p0, sw1p1, and 2 VLAN-unaware bridges br0 and
        br1, and br0 can take sw0p0 and sw1p0, and br1 can take sw0p1 and
        sw1p1, and forwarding will happen according to the expected rules of
        the Linux bridge.
        We like that, and we don't want that to go away, so as a matter of
        fact, the SJA1110 tagger still needs to support tag_8021q.
      
      So the sja1110 tagger is a hybrid between tag_8021q for data packets,
      and the native hardware support for control packets.
      
      On RX, packets have a 13-byte trailer if they contain an RX timestamp.
      That trailer is padded in such a way that its byte 8 (the start of the
      "residence time" field - not parsed by Linux because we don't care) is
      aligned on a 16 byte boundary. So the padding has a variable length
      between 0 and 15 bytes. The DSA header contains the offset of the
      beginning of the padding relative to the beginning of the frame (and the
      end of the padding is obviously the end of the packet minus 13 bytes,
      the length of the trailer). So we discard it.
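A rough Python sketch of discarding the padding on RX, assuming the padding offset has already been read from the DSA header. The function name and the error handling are illustrative only.

```python
TRAILER_LEN = 13  # length of the RX timestamp trailer, per the text above
MAX_PADDING = 15  # the padding is between 0 and 15 bytes long


def split_rx_frame(frame, pad_offset):
    """Split a frame into (data before the padding, trailer).

    pad_offset is the offset of the start of the padding, relative to the
    beginning of the frame. The padding ends where the trailer begins.
    """
    trailer_start = len(frame) - TRAILER_LEN
    pad_len = trailer_start - pad_offset
    if pad_offset < 0 or not 0 <= pad_len <= MAX_PADDING:
        raise ValueError("padding offset inconsistent with frame length")
    return frame[:pad_offset], frame[trailer_start:]
```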
      
      Packets which don't have a trailer contain the source port and switch ID
      information in the header (they are "trap-to-host" packets). Packets
      which have a trailer contain the source port and switch ID in the trailer.
      
      On TX, the destination port mask and switch ID is always in the trailer,
      so we always need to say in the header that a trailer is present.
      
      The header needs a custom EtherType and this was chosen as 0xdadc, after
      0xdada which is for Marvell and 0xdadb which is for VLANs in
      VLAN-unaware mode on SJA1105 (and SJA1110 in fact too).
      
      Because we use tag_8021q in concert with the native tagging protocol,
      control packets will have 2 DSA tags.
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      4913b8eb
    • V
      net: dsa: sja1105: make SJA1105_SKB_CB fit a full timestamp · 617ef8d9
      Vladimir Oltean committed
      In SJA1105, RX timestamps for packets sent to the CPU are transmitted in
      separate follow-up packets (metadata frames). These contain partial
      timestamps (24 or 32 bits) which are kept in SJA1105_SKB_CB(skb)->meta_tstamp.
      
      Thankfully, SJA1110 improved that, and the RX timestamps are now
      transmitted in-band with the actual packet, in the timestamp trailer.
      The RX timestamps are now full-width 64 bits.
      
      Because we process the RX DSA tags in the rcv() method in the tagger,
      but we would like to preserve the DSA code structure in that we populate
      the skb timestamp in the port_rxtstamp() call which only happens later,
      the implication is that we must somehow pass the 64-bit timestamp from
      the rcv() method all the way to port_rxtstamp(). We can use the skb->cb
      for that.
      
      Rename the field in struct sja1105_skb_cb from "meta_tstamp" to
      "tstamp", and increase its size to 64 bits.
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      617ef8d9
    • V
      net: dsa: tag_8021q: refactor RX VLAN parsing into a dedicated function · 233697b3
      Vladimir Oltean committed
      The added value of this function is that it can deal with both the case
      where the VLAN header is in the skb head, as well as in the offload field.
      This is something I was not able to do using other functions in the
      network stack.
      
      Since both ocelot-8021q and sja1105 need to do the same stuff, let's
      make it a common service provided by tag_8021q.
      
      This is done as refactoring for the new SJA1110 tagger, which partly
      uses tag_8021q as well (just like SJA1105), and will be the third caller.
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      233697b3
    • V
      net: dsa: tag_8021q: remove shim declarations · ab6a303c
      Vladimir Oltean committed
      All users of tag_8021q select it in Kconfig, so shim functions are not
      needed because it is not possible for it to be disabled and its callers
      enabled.
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      ab6a303c
  6. 11 Jun, 2021 1 commit
    • J
      ice: add low level PTP clock access functions · 03cb4473
      Jacob Keller committed
      Add the ice_ptp_hw.c file and some associated definitions to the ice
      driver folder. This file contains basic low level definitions for
      functions that interact with the device hardware.
      
      For now, only E810-based devices are supported. The ice hardware
      supports 2 major variants which have different PHYs with different
      procedures necessary for interacting with the device clock.
      
      Because the device captures timestamps in the PHY, each PHY has its own
      internal timer. The timers are synchronized in hardware by first
      preparing the source timer and the PHY timer shadow registers, and then
      issuing a synchronization command. This ensures that both the source
      timer and PHY timers are programmed simultaneously. The timers
      themselves are all driven from the same oscillator source.
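The prepare-then-synchronize pattern described in this paragraph can be modelled with a short Python sketch. The `Timer` class and `sync_timers()` helper are hypothetical and say nothing about the actual register layout; the second loop stands in for the single synchronization command.

```python
class Timer:
    """A timer with a shadow register that takes effect only on a sync."""

    def __init__(self):
        self.value = 0
        self.shadow = None

    def prepare(self, new_value):
        # Writing the shadow register does not change the running timer.
        self.shadow = new_value

    def latch(self):
        # Effect of the synchronization command on this timer.
        if self.shadow is not None:
            self.value = self.shadow
            self.shadow = None


def sync_timers(source, phy_timers, new_time):
    """Program the source timer and all PHY timers with the same value."""
    timers = [source, *phy_timers]
    for timer in timers:
        timer.prepare(new_time)
    for timer in timers:  # models one command latching every timer at once
        timer.latch()
```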
      
      The functions in ice_ptp_hw.c abstract over the differences between how
      the PHYs in E810 are programmed vs how the PHYs in E822 devices are
      programmed. This series only implements E810 support, but E822 support
      will be added in a future change.
      Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
      Tested-by: Tony Brelinski <tonyx.brelinski@intel.com>
      Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
      03cb4473
  7. 10 Jun, 2021 4 commits
    • V
      net/mlx5: Bridge, add offload infrastructure · 19e9bfa0
      Vlad Buslov committed
      Create new files bridge.{c|h} in the en/rep directory that implement bridge
      interaction with representor netdevices and handle the required
      events/notifications, and bridge.{c|h} in the esw directory that implement
      all necessary eswitch offloading infrastructure and work on the
      vport/eswitch level. Provide a new kconfig MLX5_BRIDGE which is
      automatically selected when both kernel bridge and mlx5 eswitch configs
      are enabled.
      
      Provide basic infrastructure for bridge offloads:
      
      - struct mlx5_esw_bridge_offloads - per-eswitch bridge offload structure
      that encapsulates generic bridge-offloads data (notifier blocks, ingress
      flow table/group, etc.) that is created/deleted on enable/disable eswitch
      offloads.
      
      - struct mlx5_esw_bridge - per-bridge structure that encapsulates
      per-bridge data (reference counter, FDB, egress flow table/group, etc.)
      that is created when the first eswitch representor is attached to a new
      bridge and deleted when the last representor is removed from the bridge as
      a result of a NETDEV_CHANGEUPPER event.
      
      The bridge tables are created with a new priority, FDB_BR_OFFLOAD, in the
      FDB namespace. The new priority sits between the tc-miss and slow path
      priorities. It consists of two levels: the ingress table, which is global
      per eswitch, matches incoming packets by src_mac/vid and redirects them to
      the next level (the egress table), which is chosen according to the ingress
      port's bridge membership and matches on dst_mac/vid in order to redirect
      the packet to a vport, according to the following diagram:
      
                      +
                      |
            +---------v----------+
            |                    |
            |   FDB_TC_OFFLOAD   |
            |                    |
            +---------+----------+
                      |
                      |
            +---------v----------+
            |                    |
            |   FDB_FT_OFFLOAD   |
            |                    |
            +---------+----------+
                      |
                      |
            +---------v----------+
            |                    |
            |    FDB_TC_MISS     |
            |                    |
            +---------+----------+
                      |
      +--------------------------------------+
      |               |                      |
      |        +------+                      |
      |        |                             |
      | +------v--------+   FDB_BR_OFFLOAD   |
      | | INGRESS_TABLE |                    |
      | +------+---+----+                    |
      |        |   |      match              |
      |        |   +---------+               |
      |        |             |               |    +-------+
      |        |     +-------v-------+ match |    |       |
      |        |     | EGRESS_TABLE  +------------> vport |
      |        |     +-------+-------+       |    |       |
      |        |             |               |    +-------+
      |        |    miss     |               |
      |        +------+------+               |
      |               |                      |
      +--------------------------------------+
                      |
                      |
            +---------v----------+
            |                    |
            |   FDB_SLOW_PATH    |
            |                    |
            +---------+----------+
                      |
                      v
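The two-level lookup in the diagram can be sketched in Python with plain dictionaries. This is a conceptual model only: the table layout, the `bridge_lookup()` name and the string identifiers are assumptions, and real flow tables match in hardware.

```python
SLOW_PATH = "FDB_SLOW_PATH"


def bridge_lookup(ingress_table, egress_tables, src_mac, dst_mac, vid):
    """Return the destination vport, or SLOW_PATH when either level misses.

    ingress_table -- per-eswitch dict: (src_mac, vid) -> bridge identifier
    egress_tables -- dict: bridge identifier -> {(dst_mac, vid): vport}
    """
    # Level 1: the ingress table, global per eswitch, matches src_mac/vid
    # and selects the egress table of the bridge the port belongs to.
    bridge = ingress_table.get((src_mac, vid))
    if bridge is None:
        return SLOW_PATH
    # Level 2: the per-bridge egress table matches dst_mac/vid.
    vport = egress_tables.get(bridge, {}).get((dst_mac, vid))
    return SLOW_PATH if vport is None else vport
```

A miss at either level falls through to the next priority, which is the slow path in the diagram.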
      Signed-off-by: Vlad Buslov <vladbu@nvidia.com>
      Reviewed-by: Jianbo Liu <jianbol@nvidia.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
      19e9bfa0
    • V
      net/mlx5: Create TC-miss priority and table · ec3be887
      Vlad Buslov committed
      In order to adhere to kernel software datapath model bridge offloads must
      come after TC and NF FDBs. Following patches in this series add new FDB
      priority for bridge after FDB_FT_OFFLOAD. However, since netfilter offload
      is implemented with unmanaged tables, its miss path is not automatically
      connected to next priority and requires the code to manually connect with
      slow table. To keep bridge offloads encapsulated and not mix it with
      eswitch offloads, create a new FDB_TC_MISS priority between FDB_FT_OFFLOAD
      and FDB_SLOW_PATH:
      
                +
                |
      +---------v----------+
      |                    |
      |   FDB_TC_OFFLOAD   |
      |                    |
      +---------+----------+
                |
                |
                |
      +---------v----------+
      |                    |
      |   FDB_FT_OFFLOAD   |
      |                    |
      +---------+----------+
                |
                |
                |
      +---------v----------+
      |                    |
      |    FDB_TC_MISS     |
      |                    |
      +---------+----------+
                |
                |
                |
      +---------v----------+
      |                    |
      |   FDB_SLOW_PATH    |
      |                    |
      +---------+----------+
                |
                v
      
      Initialize the new priority with a single default empty managed table and
      use the table as the TC/NF miss path instead of the slow table. This
      approach allows bridge offloads to be created as a new FDB namespace
      priority between FDB_TC_MISS and FDB_SLOW_PATH without exposing its
      internal tables to any other modules, since the miss path of the managed
      TC-miss table is automatically wired to the next priority.
      Signed-off-by: Vlad Buslov <vladbu@nvidia.com>
      Reviewed-by: Jianbo Liu <jianbol@nvidia.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
      ec3be887
    • Y
      net/mlx5: Added new parameters to reformat context · 3f3f05ab
      Yevgeny Kliteynik committed
      Adding new reformat context type (INSERT_HEADER) requires adding two new
      parameters to reformat context - reformat_param_0 and reformat_param_1.
      As defined by HW spec, these parameters have different meaning for
      different reformat context type.
      
      The first parameter (reformat_param_0) is not new to HW spec, but it
      wasn't used by any of the supported reformats. The second parameter
      (reformat_param_1) is new to the HW spec - it was added to allow
      supporting INSERT_HEADER.
      
      For INSERT_HEADER, reformat_param_0 indicates the header used to
      reference the location of the inserted header, and reformat_param_1
      indicates the offset of the inserted header from the reference point
      defined by reformat_param_0.
      Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
      3f3f05ab
    • Y
      net/mlx5: mlx5_ifc support for header insert/remove · 67133eaa
      Yevgeny Kliteynik committed
      Add support for HCA caps 2 that contains capabilities for the new
      insert/remove header actions.
      
      Added the required definitions for supporting the new reformat type:
      added packet reformat parameters, reformat anchors and definitions
      to allow copy/set into the inserted EMD (Embedded MetaData) tag.
      Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
      Signed-off-by: Vlad Buslov <vladbu@nvidia.com>
      Reviewed-by: Jianbo Liu <jianbol@nvidia.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
      67133eaa
  8. 09 Jun, 2021 4 commits
    • M
      net: stmmac: explicitly deassert GMAC_AHB_RESET · e67f325e
      Matthew Hagan committed
      We are currently assuming that GMAC_AHB_RESET will already be deasserted
      by the bootloader. However if this has not been done, probing of the GMAC
      will fail. To remedy this we must ensure GMAC_AHB_RESET has been deasserted
      prior to probing.
      
      v2 changes:
       - remove NULL condition check for stmmac_ahb_rst in stmmac_main.c
       - unwrap dev_err() message in stmmac_main.c
       - add PTR_ERR() around plat->stmmac_ahb_rst in stmmac_platform.c
      
      v3 changes:
       - add error pointer to dev_err() output
       - add reset_control_assert(stmmac_ahb_rst) in stmmac_dvr_remove
       - revert PTR_ERR() around plat->stmmac_ahb_rst since this is performed
         on the returned value of ret by the calling function
      Signed-off-by: Matthew Hagan <mnhagan88@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      e67f325e
    • S
      net: wwan: make WWAN_PORT_MAX meaning less surprised · b64d76b7
      Sergey Ryazanov committed
      It is quite unusual for a value not to be allowed to equal the defined
      range maximum. Also, most subsystems define FOO_TYPE_MAX as the maximum
      valid value. So turn the meaning of WWAN_PORT_MAX from the number of
      supported port types into the maximum valid port type.
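The difference between the two meanings of a `MAX` constant can be shown with a short Python sketch. The enum members and names below are illustrative only and are not the kernel's definitions.

```python
from enum import IntEnum


class PortType(IntEnum):
    # Hypothetical port types, numbered from zero.
    AT = 0
    MBIM = 1
    QMI = 2


PORT_COUNT = len(PortType)  # old meaning: the number of port types (3)
PORT_MAX = max(PortType)    # new meaning: the largest valid port type (2)


def is_valid_port_type(value):
    # With the new meaning, the maximum itself is a valid value.
    return 0 <= value <= PORT_MAX
```

With the old meaning a range check has to use a strict `value < MAX`, which is the surprise the commit removes.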
      Signed-off-by: Sergey Ryazanov <ryazanov.s.a@gmail.com>
      Reviewed-by: Loic Poulain <loic.poulain@linaro.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      b64d76b7
    • V
      net: stmmac: enable Intel mGbE 2.5Gbps link speed · 46682cb8
      Voon Weifeng committed
      The Intel mGbE supports 2.5Gbps link speed by increasing the clock rate by
      2.5 times of the original rate. In this mode, the serdes/PHY operates at a
      serial baud rate of 3.125 Gbps and the PCS data path and GMII interface of
      the MAC operate at 312.5 MHz instead of 125 MHz.
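The numbers in this paragraph follow from scaling the 1G rates by 2.5. A small Python check, assuming the usual 1.25 Gbaud serial rate of a 1 Gbps link with 8b/10b coding (that baseline is not stated in the commit message):

```python
OVERCLOCK_FACTOR = 2.5

GMII_CLOCK_1G_MHZ = 125.0    # PCS data path / GMII clock in 1G mode
SERDES_RATE_1G_GBAUD = 1.25  # assumption: 1 Gbps payload with 8b/10b coding


def scale_for_2500(rate_1g):
    """Scale a 1G-mode clock or baud rate to its 2.5G-mode value."""
    return rate_1g * OVERCLOCK_FACTOR


gmii_clock_2500_mhz = scale_for_2500(GMII_CLOCK_1G_MHZ)        # 312.5
serdes_rate_2500_gbaud = scale_for_2500(SERDES_RATE_1G_GBAUD)  # 3.125
```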
      
      For Intel mGbE, the overclocking of 2.5 times clock rate to support 2.5G
      can only be configured in the BIOS at boot time. The kernel driver has
      no access to modify the clock rate for 1G/2.5G mode. The way to
      determine the current 1G/2.5G mode is by reading a dedicated ad hoc
      register through the MDIO bus. In short, after the system boots up, it is
      either in 1G mode or in 2.5G mode, which cannot be changed on the fly.
      
      Compared to 1G mode, the 2.5G mode selects 2500BASEX as the PHY interface
      and disables xpcs_an_inband. This is to cater for some PHYs that only
      support the 2500BASEX PHY interface with no autonegotiation.
      
      v2: remove MAC supported link speed masking
      v3: Restructure  to introduce intel_speed_mode_2500() to read serdes registers
          for max speed supported and select the appropritate configuration.
          Use max_speed to determine the supported link speed mask.
      Signed-off-by: NVoon Weifeng <weifeng.voon@intel.com>
      Signed-off-by: NMichael Sit Wei Hong <michael.wei.hong.sit@intel.com>
      Reviewed-by: NVladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      46682cb8
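The clock arithmetic and mode selection described in the commit above can be sketched as follows. This is a user-space illustration, not the stmmac driver code: every `demo_` name is hypothetical, the boolean argument stands in for the value the driver reads from the serdes register, and the choice of SGMII with in-band autonegotiation for 1G mode is an assumption made for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical interface modes, for illustration only. */
enum demo_phy_interface { DEMO_IF_SGMII, DEMO_IF_2500BASEX };

struct demo_link_cfg {
	enum demo_phy_interface iface;
	bool an_inband;          /* in-band autonegotiation enabled? */
	uint32_t max_speed_mbps;
	uint32_t gmii_clk_khz;
};

#define DEMO_BASE_GMII_CLK_KHZ 125000u /* 125 MHz, from the commit text */

/* Scale a clock by 2.5; multiply first so no precision is lost early. */
static uint32_t demo_overclock_khz(uint32_t base_khz)
{
	uint64_t scaled = (uint64_t)base_khz * 5 / 2;

	assert(scaled <= UINT32_MAX);
	return (uint32_t)scaled;
}

/*
 * mode_2500 models the mode fixed by the BIOS at boot; it is only read,
 * never changed, at run time.
 */
static struct demo_link_cfg demo_select_cfg(bool mode_2500)
{
	struct demo_link_cfg cfg;

	if (mode_2500) {
		cfg.iface = DEMO_IF_2500BASEX;
		cfg.an_inband = false;  /* 2.5G mode disables in-band AN */
		cfg.max_speed_mbps = 2500;
		cfg.gmii_clk_khz = demo_overclock_khz(DEMO_BASE_GMII_CLK_KHZ);
	} else {
		cfg.iface = DEMO_IF_SGMII; /* assumption for this sketch */
		cfg.an_inband = true;
		cfg.max_speed_mbps = 1000;
		cfg.gmii_clk_khz = DEMO_BASE_GMII_CLK_KHZ;
	}
	return cfg;
}
```

For the 2.5G case this yields a 312500 kHz (312.5 MHz) GMII clock, matching the figure quoted in the commit message.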
    • V
      net: pcs: add 2500BASEX support for Intel mGbE controller · f27abde3
      Voon Weifeng committed
      The XPCS IP supports 2500BASEX as a PHY interface. It is configured with
      autonegotiation disabled to cater for PHYs that do not support 2500BASEX
      autonegotiation.
      
      v2: Add supported link speed masking.
      v3: Restructure to introduce xpcs_config_2500basex() used to configure the
          xpcs for 2.5G speeds. Added 2500BASEX specific information for
          configuration.
      v4: Fix indentation error
      Signed-off-by: Voon Weifeng <weifeng.voon@intel.com>
      Signed-off-by: Michael Sit Wei Hong <michael.wei.hong.sit@intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      f27abde3
  9. 08 Jun, 2021 2 commits
    • I
      page_pool: Allow drivers to hint on SKB recycling · 6a5bcd84
      Ilias Apalodimas committed
      Up to now, several high-speed NICs have had custom mechanisms for recycling
      the allocated memory they use for their payloads.
      Our page_pool API already has recycling capabilities that are always
      used when we are running in 'XDP mode'. So let's tweak the API and the
      kernel network stack slightly and allow the recycling to happen even
      during standard operation.
      The API doesn't currently take into account the 'split page' policies used
      by those drivers, but it can be extended once we have users for that.
      
      The idea is to be able to intercept the packet on skb_release_data().
      If it's a buffer coming from our page_pool API, recycle it back to the
      pool for further usage; otherwise, just release the packet entirely.
      
      To achieve that we introduce a bit in struct sk_buff (pp_recycle:1) and
      a field in struct page (page->pp) to store the page_pool pointer.
      Storing the information in page->pp allows us to recycle both SKBs and
      their fragments.
      We could have skipped the skb bit entirely, since identical information
      can be derived from struct page. However, in an effort to affect the free
      path as little as possible, reading a single bit in the skb, which is
      already in cache, is better than trying to derive identical information
      from the data stored in the page.
      
      The driver or page_pool has to take care of the sync operations on its own
      during buffer recycling, since the buffer is never unmapped after opting in
      to recycling.
      
      Since the gain for drivers depends on the architecture, we are not
      enabling recycling by default if the page_pool API is used by a driver.
      To enable recycling, the driver must call skb_mark_for_recycle(),
      which stores the information we need for recycling in page->pp and
      enables the recycling bit, or page_pool_store_mem_info() for a fragment.
      Co-developed-by: Jesper Dangaard Brouer <brouer@redhat.com>
      Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
      Co-developed-by: Matteo Croce <mcroce@microsoft.com>
      Signed-off-by: Matteo Croce <mcroce@microsoft.com>
      Signed-off-by: Ilias Apalodimas <ilias.apalodimas@linaro.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      6a5bcd84
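The free-path decision described in the commit above can be modelled in user space roughly as follows. This is a simplified sketch, not the kernel implementation: the `demo_` structures and functions are hypothetical stand-ins for struct sk_buff, struct page, the page_pool and skb_mark_for_recycle(), and the "pool" here is just a counter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for a page_pool: counts the buffers handed back to it. */
struct demo_pool { int recycled; };

/* Stand-in for struct page, carrying the pool pointer (page->pp). */
struct demo_page { struct demo_pool *pp; bool freed; };

/* Stand-in for struct sk_buff with the one-bit recycling hint. */
struct demo_skb {
	unsigned int pp_recycle:1;
	struct demo_page *head_page;
};

/* Driver opt-in: record the pool in the page and set the skb bit. */
static void demo_mark_for_recycle(struct demo_skb *skb, struct demo_pool *pool)
{
	assert(skb && skb->head_page && pool);
	skb->head_page->pp = pool;
	skb->pp_recycle = 1;
}

/* Returns true if the buffer went back to its pool. */
static bool demo_try_recycle(struct demo_skb *skb)
{
	struct demo_page *page = skb->head_page;

	/* Cheap check first: the bit lives in the skb, already in cache. */
	if (!skb->pp_recycle || page->pp == NULL)
		return false;

	page->pp->recycled++;
	return true;
}

/* Models the interception point on the release path. */
static void demo_release_data(struct demo_skb *skb)
{
	assert(skb && skb->head_page);
	if (!demo_try_recycle(skb))
		skb->head_page->freed = true; /* ordinary release */
}
```

A buffer that was never marked takes the ordinary release branch, which mirrors the commit's point that recycling is opt-in per driver.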
    • M
      skbuff: add a parameter to __skb_frag_unref · c420c989
      Matteo Croce committed
      This is a prerequisite patch; the next one enables recycling of
      skbs and fragments. Add an extra argument to __skb_frag_unref() to
      handle recycling, and update the current users of the function accordingly.
      Signed-off-by: Matteo Croce <mcroce@microsoft.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      c420c989