1. 24 Apr, 2019 19 commits
    • Merge tag 'mlx5-updates-2019-04-22' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux · 20eb08b2
      David S. Miller committed
      Saeed Mahameed says:
      
      ====================
      mlx5-updates-2019-04-22
      
      This series includes updates to the mlx5e driver RX data path and some
      significant XDP RX/TX improvements to overcome/mitigate HW and PCIe
      bottlenecks.
      
      From Tariq:
      1) Some enhancements in rq->flags
      2) Stabilize RX packet rate (on Striding RQ) with
      multiple outstanding UMR posts
      In this patch, we add support for multiple outstanding UMR posts,
      to allow faster gap closure between consuming MPWQEs and reposting
      them back into the WQ.
      
      Performance test:
      As expected, huge improvement in large-scale (48 cores).
      
      xdp_redirect_map, 64B UDP multi-stream.
      Redirect from ConnectX-5 100Gbps to ConnectX-6 100Gbps.
      CPU: Intel(R) Xeon(R) CPU E5-2680 v3 @ 2.50GHz.
      
      Before: Unstable, 7 to 30 Mpps
      After:  Stable,   at 70.5 Mpps
      
      From Shay:
      3) XDP, Inline small packets into the TX MPWQE in XDP xmit flow
      
      At high packet rates with multi-CPU TX workloads, much of the HCA's
      resources are spent on prefetching TX descriptors, which limits
      transmission rates.
      This patch mitigates the problem by moving some of the workload to the
      CPU and reducing the HW data prefetch overhead for small packets (<= 256B).
      
      When forwarding packets with XDP, a packet that is smaller
      than a certain size (set to ~256 bytes) is sent inline within
      its WQE TX descriptor (mem-copied) when the hardware TX queue is congested
      beyond a pre-defined watermark.
      
      Performance:
          Tested packet rate for UDP 64Byte multi-stream
          over two dual port ConnectX-5 100Gbps NICs.
          CPU: Intel(R) Xeon(R) CPU E5-2680 v3 @ 2.50GHz
      
          * Tested with hyper-threading disabled
      
          XDP_TX:
      
          |          | before | after   |       |
          | 24 rings | 51Mpps | 116Mpps | +126% |
          | 1 ring   | 12Mpps | 12Mpps  | same  |
      
          XDP_REDIRECT:
      
          ** Below is the transmit rate, not the redirection rate
          which might be larger, and is not affected by this patch.
      
          |          | before  | after   |      |
          | 32 rings | 64Mpps  | 92Mpps  | +43% |
          | 1 ring   | 6.4Mpps | 6.4Mpps | same |
      
      As we can see, the feature significantly improves scaling without
      hurting single-ring performance.
      
      From Maxim:
      4) Some trivial refactoring and code improvements prior to a larger series
      to support AF_XDP.
      ====================
      Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net/mlx5e: Use #define for the WQE wait timeout constant · f8ebecf2
      Maxim Mikityanskiy committed
      Create a #define for the timeout of mlx5e_wait_for_min_rx_wqes to
      clarify the meaning of a magic number.
      Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com>
      Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
    • net/mlx5e: Remove unused rx_page_reuse stat · 03ceda6f
      Maxim Mikityanskiy committed
      Remove the no longer used page_reuse stat of RQs.
      Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com>
      Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
    • net/mlx5e: Take HW interrupt trigger into a function · 63d26b49
      Maxim Mikityanskiy committed
      mlx5e_trigger_irq posts a NOP to the ICO SQ just to trigger an IRQ and
      enter the NAPI poll on the right CPU according to the affinity. Use it
      in mlx5e_activate_rq.
      Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com>
      Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
    • net/mlx5e: Remove unused parameter · 10961c56
      Maxim Mikityanskiy committed
      mdev is unused in mlx5e_rx_is_linear_skb.
      Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com>
      Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
    • net/mlx5e: Add an underflow warning comment · b1b187e1
      Maxim Mikityanskiy committed
      mlx5e_mpwqe_get_log_rq_size calculates the number of WQEs (N) based on
      the requested number of frames in the RQ (F) and the number of packets
      per WQE (P). It ensures that N is not less than the minimum number of
      WQEs in an RQ (N_min). Arithmetically, it means that F / P >= N_min
      should be true. This function deals with logarithms, so it should check
      that log(F) - log(P) >= log(N_min). However, if F < P, this expression
      will cause an unsigned underflow. Check log(F) >= log(P) + log(N_min)
      instead.
      Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com>
      Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
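The underflow described in the commit above can be reproduced outside the kernel. The following is a minimal user-space sketch, not the driver's code: the function names and the use of `unsigned int` are assumptions made for illustration. It contrasts the subtracting form of the check with the rearranged form the commit message recommends.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical helpers (not from the mlx5e driver). All arguments are
 * base-2 logarithms: log_f for the requested frames, log_p for the packets
 * per WQE, log_n_min for the minimum number of WQEs.
 */

/* Subtracting form: when log_f < log_p the unsigned difference wraps
 * around to a huge value, so the comparison wrongly succeeds. */
static bool rq_size_ok_naive(unsigned int log_f, unsigned int log_p,
                             unsigned int log_n_min)
{
    return log_f - log_p >= log_n_min;
}

/* Rearranged form: no subtraction, so nothing can wrap below zero. */
static bool rq_size_ok(unsigned int log_f, unsigned int log_p,
                       unsigned int log_n_min)
{
    /* Logarithms of 32-bit quantities are small, so the sum cannot overflow. */
    assert(log_p <= 31 && log_n_min <= 31);
    return log_f >= log_p + log_n_min;
}
```

With `log_f = 3`, `log_p = 5`, and `log_n_min = 2`, the naive form accepts the configuration because `3u - 5u` wraps, while the rearranged form rejects it.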
    • net/mlx5e: Move parameter calculation functions to en/params.c · 9a22d5d8
      Maxim Mikityanskiy committed
      This commit moves the parameter calculation functions to a separate file
      for better modularity and code sharing with future features.
      Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com>
      Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
    • net/mlx5e: Report mlx5e_xdp_set errors · 74bbaebf
      Maxim Mikityanskiy committed
      If the channels fail to reopen after setting an XDP program, return the
      error code instead of 0. A proper fix is still needed, as now any error
      while reopening the channels brings the interface down. This patch only
      adds error reporting.
      Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com>
      Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
    • net/mlx5e: Remove unused parameter · 83b2fd64
      Maxim Mikityanskiy committed
      params is unused in mlx5e_init_di_list.
      Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com>
      Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
    • net/mlx5e: XDP, Inline small packets into the TX MPWQE in XDP xmit flow · c2273219
      Shay Agroskin committed
      At high packet rates with multi-CPU TX workloads, much of the HCA's
      resources are spent on prefetching TX descriptors, which limits
      transmission rates.
      This patch mitigates the problem by moving some of the workload to the
      CPU and reducing the HW data prefetch overhead for small packets (<= 256B).
      
      When forwarding packets with XDP, a packet that is smaller
      than a certain size (set to ~256 bytes) is sent inline within
      its WQE TX descriptor (mem-copied) when the hardware TX queue is congested
      beyond a pre-defined watermark.
      
      This is added to better utilize the HW resources (which now makes
      one less packet data prefetch) and allow better scalability, on the
      account of CPU usage (which now 'memcpy's the packet into the WQE).
      
      To load balance between HW and CPU and get max packet rate, we use
      watermarks to detect how much the HW is congested and move the work
      loads back and forth between HW and CPU.
      
      Performance:
      Tested packet rate for UDP 64Byte multi-stream
      over two dual port ConnectX-5 100Gbps NICs.
      CPU: Intel(R) Xeon(R) CPU E5-2680 v3 @ 2.50GHz
      
      * Tested with hyper-threading disabled
      
      XDP_TX:
      
      |          | before | after   |       |
      | 24 rings | 51Mpps | 116Mpps | +126% |
      | 1 ring   | 12Mpps | 12Mpps  | same  |
      
      XDP_REDIRECT:
      
      ** Below is the transmit rate, not the redirection rate
      which might be larger, and is not affected by this patch.
      
      |          | before  | after   |      |
      | 32 rings | 64Mpps  | 92Mpps  | +43% |
      | 1 ring   | 6.4Mpps | 6.4Mpps | same |
      
      As we can see, the feature significantly improves scaling without
      hurting single-ring performance.
      Signed-off-by: Shay Agroskin <shayag@mellanox.com>
      Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
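The watermark-based switching between hardware prefetch and CPU copy described above can be sketched as a small hysteresis decision. This is an illustrative sketch only: the struct, the function name, and the watermark values used in the usage note are hypothetical, and only the 256-byte cutoff comes from the commit message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define INLINE_MAX_LEN 256u /* size cutoff quoted in the commit message */

/* Hypothetical per-queue state; the real driver keeps different fields. */
struct inline_state {
    unsigned int lo_wmark; /* at or below this backlog, stop inlining */
    unsigned int hi_wmark; /* at or above this backlog, start inlining */
    bool inline_on;        /* current mode, kept between calls */
};

/*
 * Decide whether a packet of 'len' bytes should be copied into its TX
 * descriptor, given 'outstanding' descriptors not yet completed by HW.
 * Between the two watermarks the previous mode is kept, so the decision
 * does not flap on every packet.
 */
static bool should_inline(struct inline_state *st, unsigned int outstanding,
                          size_t len)
{
    assert(st->lo_wmark < st->hi_wmark);

    if (outstanding >= st->hi_wmark)
        st->inline_on = true;   /* HW is congested: move work to the CPU */
    else if (outstanding <= st->lo_wmark)
        st->inline_on = false;  /* HW caught up: let it fetch the data */

    return st->inline_on && len <= INLINE_MAX_LEN;
}
```

With watermarks of, say, 2 and 10, a backlog of 5 leaves the mode unchanged: inlining stays off until the backlog first reaches 10, then stays on until it falls back to 2.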
    • net/mlx5e: XDP, Add TX MPWQE session counter · 73cab880
      Shay Agroskin committed
      This counter tracks how many TX MPWQE sessions are started in XDP SQ
      in XDP TX/REDIRECT flow. It counts per-channel and global stats.
      Signed-off-by: Shay Agroskin <shayag@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
    • net/mlx5e: XDP, Enhance RQ indication for XDP redirect flush · 15143bf5
      Tariq Toukan committed
      The XDP redirect flush indication belongs to the receive queue,
      not to its XDP send queue.
      
      For this, use a new bit on rq->flags.
      Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
      Reviewed-by: Shay Agroskin <shayag@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
    • net/mlx5e: XDP, Fix shifted flag index in RQ bitmap · f03590f7
      Tariq Toukan committed
      Values in enum mlx5e_rq_flag are used as bit indices.
      The intention was to use them with no BIT(i) wrapping.
      
      No functional bug fix here, as the same (shifted) flag bit
      is used for all set, test, and clear operations.
      
      Fixes: 121e8927 ("net/mlx5e: Refactor RQ XDP_TX indication")
      Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
      Reviewed-by: Shay Agroskin <shayag@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
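The difference between a bit index and a `BIT(i)` mask, and why the mix-up was harmless here, can be shown with a small user-space sketch. The enum names and the `*_sketch` helpers below are hypothetical, non-atomic stand-ins written for this example; they are not the kernel's `set_bit()`/`test_bit()` or the driver's enum.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

#define BIT(n) (1UL << (n))
#define BITS_PER_LONG_SKETCH (sizeof(unsigned long) * CHAR_BIT)

/* Non-atomic stand-ins for bit helpers that take a bit INDEX, not a mask. */
static void set_bit_sketch(unsigned int nr, unsigned long *word)
{
    assert(nr < BITS_PER_LONG_SKETCH);
    *word |= 1UL << nr;
}

static bool test_bit_sketch(unsigned int nr, const unsigned long *word)
{
    assert(nr < BITS_PER_LONG_SKETCH);
    return (*word >> nr) & 1UL;
}

/* Values wrapped in BIT(): 1, 2, 4. Used as indices they select bits 1, 2, 4. */
enum rq_flag_shifted { FLAG_A_SHIFTED = BIT(0), FLAG_B_SHIFTED = BIT(1),
                       FLAG_C_SHIFTED = BIT(2) };

/* Plain indices: 0, 1, 2. Used as indices they select bits 0, 1, 2. */
enum rq_flag_index { FLAG_A, FLAG_B, FLAG_C };
```

Setting `FLAG_C_SHIFTED` through the index-based helper sets bit 4 rather than bit 2. Because the same value is used for set, test, and clear, the flag still behaves consistently, which matches the commit's note that no functional bug is fixed.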
    • net/mlx5e: RX, Support multiple outstanding UMR posts · fd9b4be8
      Tariq Toukan committed
      The buffer mapping of the Multi-Packet WQEs (of Striding RQ)
      is done via UMR posts, one UMR WQE per RX MPWQE.
      
      A single MPWQE is capable of serving many incoming packets,
      usually more than the budget of a single napi cycle.
      Hence, posting a single UMR WQE per napi cycle (and handling its
      completion in the next cycle) works fine in many common cases,
      but not always.
      
      When an XDP program is loaded, every MPWQE is capable of serving fewer
      packets, to satisfy the packet-per-page requirement.
      Thus, for the same number of packets more MPWQEs (and UMR posts)
      are needed (twice as many for the default MTU), leaving less latency
      room for the UMR completions.
      
      In this patch, we add support for multiple outstanding UMR posts,
      to allow faster gap closure between consuming MPWQEs and reposting
      them back into the WQ.
      
      For better SW and HW locality, we combine the UMR posts in bulks of
      (at least) two.
      
      This is expected to improve packet rate in high CPU scale.
      
      Performance test:
      As expected, huge improvement in large-scale (48 cores).
      
      xdp_redirect_map, 64B UDP multi-stream.
      Redirect from ConnectX-5 100Gbps to ConnectX-6 100Gbps.
      CPU: Intel(R) Xeon(R) CPU E5-2680 v3 @ 2.50GHz.
      
      Before: Unstable, 7 to 30 Mpps
      After:  Stable,   at 70.5 Mpps
      
      No degradation in other tested scenarios.
      Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
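The idea of keeping several UMR posts outstanding and issuing them in bulks of at least two can be sketched as simple bookkeeping. This is a simplified model under stated assumptions, not the driver's implementation: the struct, the function names, and the `UMR_BULK_SKETCH` constant are hypothetical, and real posting involves building and ringing hardware work queue entries.

```c
#include <assert.h>

#define UMR_BULK_SKETCH 2u /* "bulks of (at least) two" per the commit message */

/* Hypothetical bookkeeping for one receive queue. */
struct rq_sketch {
    unsigned int missing;       /* consumed MPWQEs not yet back in the WQ */
    unsigned int umr_in_flight; /* UMR posts issued but not yet completed */
};

/*
 * Called once per polling cycle. Posts a UMR for every missing MPWQE that
 * does not already have one in flight, but only when at least one full
 * bulk is available. Returns the number of UMR posts issued.
 */
static unsigned int post_umr_bulk(struct rq_sketch *rq)
{
    unsigned int to_post;

    assert(rq->umr_in_flight <= rq->missing);
    to_post = rq->missing - rq->umr_in_flight;
    if (to_post < UMR_BULK_SKETCH)
        return 0; /* too few to be worth a post; wait for more */

    rq->umr_in_flight += to_post;
    return to_post;
}

/* Called when 'n' UMR posts complete and their MPWQEs re-enter the WQ. */
static void complete_umr(struct rq_sketch *rq, unsigned int n)
{
    assert(n <= rq->umr_in_flight);
    rq->umr_in_flight -= n;
    rq->missing -= n;
}
```

The point of the model is that more than one post can be in flight at once, so a cycle that consumes several MPWQEs does not have to wait several cycles to get them all reposted.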
    • S
    • Merge branch 'net-phy-mscc-Improvements-to-VSC8514-PHY-driver' · 539b593d
      David S. Miller committed
      Kavya Sree Kotagiri says:
      
      ====================
      net: phy: mscc: Improvements to VSC8514 PHY driver.
      
          The VSC8514 is a 4-port PHY that supports 10/100/1000BASE-T, 100BASE-FX,
          and 1000BASE-X, and can communicate with the MAC via QSGMII.
          The MAC interface protocol for each port within QSGMII can
          be either 1000BASE-X or SGMII, if the QSGMII MAC that the VSC8514 is
          connecting to supports this functionality.
          VSC8514 also supports SGMII MAC-side autonegotiation on each individual
          port, downshifting, can set the blinking pattern of each of its 4 LEDs,
          SyncE, 1000BASE-T Ring Resiliency as well as HP Auto-MDIX detection.
      
          This patch series adds support for 10BASE-T, 100BASE-TX, and
          1000BASE-T, QSGMII link with the MAC, downshifting, HP Auto-MDIX
          detection and blinking pattern for its 4 LEDs.
      
          The GPIO register bank is a set of registers that are common to all
          PHYs in the package. So any modification in any register of this bank
          affects all PHYs of the package.
      
          If the PHYs haven't been reset before booting the Linux kernel and were
          configured to use interrupts for e.g. link status updates, the
          interrupt mask register of all PHYs must be cleared before
          interrupts can be used with any PHY. The first PHY of the package to
          be initialized takes care of clearing the interrupt mask registers
          of all PHYs. Thus, we need to keep track of whether the package init
          sequence has already been done or is still to be done.
      
          Most of the init sequence of a PHY of the package is common to all PHYs
          in the package, thus we use the SMI broadcast feature which enables us
          to propagate a write in one register of one PHY to all PHYs in the same
          package.
      
          This patch series adds support for VSC8514 in Microsemi driver(mscc.c)
          and removes support from Vitesse driver(vitesse.c).
      
      v8
      - mscc: Added appropriate code using phy_modify() in vsc8514_config_init().
      
      v7
      - mscc: Handled return values in vsc8514_config_init().
      
      v6
      - mscc: Added proper return value in vsc85xx_csr_ctrl_phy_read().
      - mscc: Replaced __mdiobus_write and __mdiobus_read with __phy_write and __phy_read resp.
      - mscc: Replaced register addresses in vsc8514_config_init() with proper constants.
      
      v5
      - mscc: Added return error statements for few function calls.
      - mscc: Added comments in vsc85xx_csr_ctrl_phy_read() and vsc85xx_csr_ctrl_phy_write().
      
      v4
      - mscc: Removed features settings
      - mscc: Removed aneg_done settings.
      
      v3
      - mscc: Used BIT(x) for PHY_MCB_S6G_WRITE and PHY_MCB_S6G_READ
              instead of hex.
      - mscc: Replaced magic numbers with proper constants.
      - mscc: Handled delays and timeouts at appropriate points.
      - mscc: Added comments/explanation where requested.
      
      v2
      - mscc: Sorted variable declarations in reverse christmas tree order.
      
      v1
      - Added 0/2 file.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: phy: vitesse: Remove support for VSC8514. · edeb207b
      Kavya Sree Kotagiri committed
      Remove VSC8514 support from the Vitesse driver (vitesse.c); the
      Microsemi driver (mscc.c) now supports this PHY with more features.
      Signed-off-by: Kavya Sree Kotagiri <kavyasree.kotagiri@microchip.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: phy: mscc: add support for VSC8514 PHY. · e4f9ba64
      Kavya Sree Kotagiri committed
      The VSC8514 is a 4-port PHY that supports 10/100/1000BASE-T, 100BASE-FX,
      and 1000BASE-X, and can communicate with the MAC via QSGMII.
      The MAC interface protocol for each port within QSGMII can
      be either 1000BASE-X or SGMII, if the QSGMII MAC that the VSC8514 is
      connecting to supports this functionality.
      VSC8514 also supports SGMII MAC-side autonegotiation on each individual
      port, downshifting, can set the blinking pattern of each of its 4 LEDs,
      SyncE, 1000BASE-T Ring Resiliency as well as HP Auto-MDIX detection.
      
      This adds support for 10BASE-T, 100BASE-TX, and 1000BASE-T,
      QSGMII link with the MAC, downshifting, HP Auto-MDIX detection
      and blinking pattern for its 4 LEDs.
      
      The GPIO register bank is a set of registers that are common to all PHYs
      in the package. So any modification in any register of this bank affects
      all PHYs of the package.
      
      If the PHYs haven't been reset before booting the Linux kernel and were
      configured to use interrupts for e.g. link status updates, the
      interrupt mask register of all PHYs must be cleared before
      interrupts can be used with any PHY. The first PHY of the package to
      be initialized takes care of clearing the interrupt mask registers
      of all PHYs. Thus, we need to keep track of whether the package init
      sequence has already been done or is still to be done.
      
      Most of the init sequence of a PHY of the package is common to all PHYs
      in the package, thus we use the SMI broadcast feature which enables us
      to propagate a write in one register of one PHY to all PHYs in the same
      package.
      Signed-off-by: Kavya Sree Kotagiri <kavyasree.kotagiri@microchip.com>
      Signed-off-by: Quentin Schulz <quentin.schulz@bootlin.com>
      Co-developed-by: Quentin Schulz <quentin.schulz@bootlin.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
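The "first PHY initializes the whole package" rule described above amounts to a once-only guard on shared state. The sketch below models that in user space; the struct, the function name, and the array standing in for the per-PHY interrupt mask registers are all hypothetical, and the real driver performs register writes over MDIO instead.

```c
#include <assert.h>
#include <stdbool.h>

#define PHYS_PER_PACKAGE 4 /* the VSC8514 is described as a 4-port PHY */

/* Hypothetical state shared by all PHYs in one package. */
struct phy_package_sketch {
    bool init_done;                          /* package-wide init already ran */
    unsigned int irq_mask[PHYS_PER_PACKAGE]; /* stand-in for mask registers */
};

/*
 * Called from each PHY's init path. Only the first caller, whichever port
 * it is, clears the interrupt masks of every PHY in the package. Returns
 * true if this call performed the package-wide step.
 */
static bool phy_package_init_once(struct phy_package_sketch *pkg,
                                  unsigned int port)
{
    unsigned int i;

    assert(port < PHYS_PER_PACKAGE);
    if (pkg->init_done)
        return false; /* another PHY of the package already did it */

    for (i = 0; i < PHYS_PER_PACKAGE; i++)
        pkg->irq_mask[i] = 0; /* models a broadcast write to all PHYs */

    pkg->init_done = true;
    return true;
}
```

A later call from another port leaves the masks untouched, so interrupts that an already-initialized PHY has since enabled are not cleared again.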
    • net: phy: marvell: add new default led configure for m88e151x · a93f7fe1
      Jian Shen committed
      The default m88e151x LED configuration is 0x1177, which uses LED[0]
      for 1000M link, LED[1] for 100M link, and LED[2] for activity.
      Some boards, however, use LED[0] for link and LED[1] for
      activity, and need 0x1040 instead. To support this case,
      this patch defines a new dev_flag, which the HNS3 driver sets before
      connecting the PHY. During PHY initialization, the new
      LED configuration is used if this dev_flag is set.
      Signed-off-by: Jian Shen <shenjian15@huawei.com>
      Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
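The flag-driven choice between the two LED register values can be sketched as a single selection function. The two values, 0x1177 and 0x1040, come from the commit message; the macro names, the function name, and the bit position chosen for the flag are hypothetical and may differ from what the driver defines.

```c
#include <assert.h>
#include <stdint.h>

#define LED_CFG_DEFAULT_SKETCH            0x1177u /* LED[0] 1000M, LED[1] 100M, LED[2] activity */
#define LED_CFG_LED0_LINK_LED1_ACT_SKETCH 0x1040u /* LED[0] link, LED[1] activity */

/* Hypothetical flag bit; the MAC driver would set it before connecting the PHY. */
#define PHY_FLAG_LED0_LINK_LED1_ACT_SKETCH (1u << 0)

/* PHY registers are 16 bits wide, so both values must fit in uint16_t. */
static_assert(LED_CFG_DEFAULT_SKETCH <= 0xffffu &&
              LED_CFG_LED0_LINK_LED1_ACT_SKETCH <= 0xffffu,
              "LED configuration must fit a 16-bit PHY register");

/* Pick the LED register value from the flags handed over by the MAC driver. */
static uint16_t m88e151x_led_config_sketch(uint32_t dev_flags)
{
    if (dev_flags & PHY_FLAG_LED0_LINK_LED1_ACT_SKETCH)
        return LED_CFG_LED0_LINK_LED1_ACT_SKETCH;
    return LED_CFG_DEFAULT_SKETCH;
}
```

Boards that do not set the flag keep the previous default, so existing users of the PHY driver see no change in behavior.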
  2. 23 Apr, 2019 21 commits