1. 10 Oct 2020, 1 commit
  2. 16 Sep 2020, 1 commit
    • net/mlx5e: Add CQE compression support for multi-strides packets · b7cf0806
      Ofer Levi authored
      Add CQE compression support for completions of packets that span
      multiple strides in a Striding RQ, per the HW capability.
      In our memory model, we use small strides (256B as of today) for the
      non-linear SKB mode. This feature allows CQE compression to work also
      for multi-stride packets; in this case, decompressing the mini CQE
      array uses the stride index provided by HW as part of each mini CQE.
      Before this feature, compression was possible only for single-stride
      packets, i.e. packets of size up to 256 bytes when in non-linear
      mode, and the index was maintained by SW.
      This feature is supported for ConnectX-5 and above.
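      
      A minimal sketch of the decompression idea follows (illustrative only:
      the struct and field names below, e.g. stride_index and sw_counter,
      are assumptions for exposition, not the driver's actual code):
      
      /* Illustrative sketch: when expanding a compressed mini-CQE array,
       * the stride index comes from the mini CQE (HW-provided) rather
       * than from a SW-maintained counter. */
      #include <stdint.h>
      #include <stdbool.h>
      
      struct mini_cqe  { uint16_t stride_index; };
      struct title_cqe { uint16_t wqe_counter;  };
      struct rq_state  { bool hw_stride_index; uint16_t sw_counter; };
      
      static void expand_mini_cqe(struct rq_state *rq,
                                  const struct mini_cqe *mini,
                                  struct title_cqe *title)
      {
              if (rq->hw_stride_index)            /* multi-stride capable HW */
                      title->wqe_counter = mini->stride_index;
              else                                /* legacy single-stride path */
                      title->wqe_counter = rq->sw_counter++;
      }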
      
      Feature performance test:
      This was whitebox-tested: we reduced the PCI speed from 125 Gb/s to
      62.5 Gb/s to overload the PCI bus, and modified the mlx5 driver to drop
      incoming packets before building the SKB, keeping CPU utilization low.
      The outcome is low CPU utilization, with the PCI bus as the only bottleneck.
      Test setup:
      Server: Intel(R) Xeon(R) Silver 4108 CPU @ 1.80GHz server, 32 cores
      NIC: ConnectX-6 DX.
      Sender side generates 300-byte packets at full PCI bandwidth.
      Receiver side configuration:
      Single channel, one CPU processing with one ring allocated. CPU utilization
      is ~20% while PCI bandwidth is fully utilized.
      For the generated traffic and an interface MTU of 4500B (to activate the
      non-linear SKB mode), the packet rate improves by about 19%, from ~17.6 Mpps
      to ~21 Mpps.
      Without this feature, counters show no CQE compression blocks for
      this setup; with the feature, counters show ~20.7 Mpps of compressed CQEs
      in ~500K compression blocks.
      Signed-off-by: Ofer Levi <oferle@mellanox.com>
      Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
  3. 27 Jul 2020, 2 commits
  4. 16 Jul 2020, 1 commit
  5. 28 Jun 2020, 1 commit
  6. 29 Apr 2020, 4 commits
  7. 28 Mar 2020, 1 commit
  8. 17 Jan 2020, 1 commit
  9. 11 Jan 2020, 1 commit
  10. 06 Sep 2019, 1 commit
  11. 02 Sep 2019, 1 commit
  12. 14 Aug 2019, 1 commit
  13. 09 Aug 2019, 1 commit
  14. 04 Jul 2019, 2 commits
  15. 14 Jun 2019, 1 commit
  16. 01 Jun 2019, 1 commit
  17. 30 Apr 2019, 2 commits
  18. 06 Apr 2019, 1 commit
  19. 15 Feb 2019, 3 commits
    • net/mlx5: Add host params change event · 7f0d11c7
      Bodong Wang authored
      In Embedded CPU (EC) configurations, the EC driver needs to know when
      the number of virtual functions changes on the corresponding PF on the
      host side. This is required so the EC driver can create or destroy the
      representor net devices that represent the VF ports.
      
      Whenever a change in the number of VFs occurs, firmware generates an
      event towards the EC, which triggers a work to complete the rest of
      the handling. The specifics of the handling will be introduced in a
      downstream patch.
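      
      The expected pattern is sketched below (an assumption: a standard
      kernel workqueue deferral; host_params_ctx and the handler names are
      illustrative, not the driver's actual code):
      
      /* Illustrative sketch: defer host-params-change handling to a work
       * item, since the EQE arrives in a context that must stay short. */
      #include <linux/workqueue.h>
      
      struct host_params_ctx {
              struct work_struct work;
      };
      
      static void host_params_work_handler(struct work_struct *work)
      {
              /* re-read host params and create/destroy VF representors;
               * the specifics arrive in the downstream patch */
      }
      
      static void host_params_init(struct host_params_ctx *ctx)
      {
              INIT_WORK(&ctx->work, host_params_work_handler);
      }
      
      static void host_params_eqe_handler(struct host_params_ctx *ctx)
      {
              queue_work(system_wq, &ctx->work);  /* run outside EQ context */
      }
      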
      Signed-off-by: Bodong Wang <bodong@mellanox.com>
      Signed-off-by: Eli Cohen <eli@mellanox.com>
      Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
    • net/mlx5: Introduce Mellanox SmartNIC and modify page management logic · 591905ba
      Bodong Wang authored
      Mellanox's SmartNIC combines embedded CPU (e.g., ARM) processing power
      with advanced network offloads to accelerate a multitude of security,
      networking and storage applications.
      
      With the introduction of the SmartNIC, there is a new PCI function
      called the Embedded CPU Physical Function (ECPF), and it is possible
      for a PF to get its ICM pages from the ECPF PCI function. The driver
      identifies whether it is running on such a function by reading a bit
      in the initialization segment.
      
      When firmware asks for pages, it issues a page request event
      specifying how many pages it requests and for which function. The
      driver responds with a manage_pages command providing the requested
      pages, along with an indication of which function it is providing
      these pages for.
      
      The encoding before this patch was as follows:
          function_id == 0: pages are requested for the function receiving
                            the EQE.
          function_id != 0: pages are requested for VF identified by the
                            function_id value
      
      A new one-bit field in the EQE identifies that pages are requested
      for the ECPF.
      
      This introduces the notion of a page supplier; to support it, the
      manage_pages and query_pages commands were modified so firmware can
      distinguish the following cases:
      
      1. Function provides pages for itself
      2. PF provides pages for its VF
      3. ECPF provides pages to itself
      4. ECPF provides pages for another function
      
      This distinction is possible through the introduction of the bit
      "embedded_cpu_function" in query_pages, manage_pages and page request
      EQE.
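      
      A sketch of how the supplier side might fill the command, following
      the driver's MLX5_SET() convention (the field names match those
      introduced here; the surrounding command setup is an assumption):
      
      /* mark both the target function id and whether that id refers to
       * an embedded CPU function */
      MLX5_SET(manage_pages_in, in, opcode, MLX5_CMD_OP_MANAGE_PAGES);
      MLX5_SET(manage_pages_in, in, function_id, func_id);
      MLX5_SET(manage_pages_in, in, embedded_cpu_function, ec_function);
      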
      Signed-off-by: Bodong Wang <bodong@mellanox.com>
      Signed-off-by: Eli Cohen <eli@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
    • net/mlx5: Use void pointer as the type in address_of macro · 20bbf22a
      Bodong Wang authored
      Better to use void * and avoid unnecessary casts.
      
      This patch doesn't change any functionality.
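      
      The gist of the change, sketched in the mlx5 header style (the exact
      macro body in the tree may differ slightly):
      
      /* returning void * lets callers assign the result to any pointer
       * type without an explicit cast */
      #define MLX5_ADDR_OF(typ, p, fld) \
              ((void *)((char *)(p) + MLX5_BYTE_OFF(typ, fld)))
      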
      Signed-off-by: Bodong Wang <bodong@mellanox.com>
      Signed-off-by: Eli Cohen <eli@mellanox.com>
      Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
      Reviewed-by: Daniel Jurgens <danielj@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
  20. 03 Feb 2019, 1 commit
  21. 21 Dec 2018, 1 commit
    • net/mlx5e: XDP, Support Enhanced Multi-Packet TX WQE · 5e0d2eef
      Tariq Toukan authored
      Add support for the HW feature of multi-packet WQE in the XDP
      xmit flow.
      
      The conventional TX descriptor (WQE, Work Queue Element) serves
      a single packet. Our HW supports multi-packet WQE (MPWQE), in
      which a single descriptor serves multiple TX packets.
      
      This reduces both the PCI overhead and the CPU cycles spent on
      writing descriptors.
      
      In this patch we add support for the HW feature, which is supported
      starting from ConnectX-5.
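      
      The batching idea, as a rough sketch (the session struct and helper
      names are illustrative, not the driver's exact API):
      
      /* accumulate packets into one open MPWQE "session", then emit a
       * single descriptor and doorbell for the whole batch */
      struct mpwqe_session {
              int pkt_count;
              int max_pkts;                    /* HW limit per MPWQE */
      };
      
      static void mpwqe_close(struct mpwqe_session *s)
      {
              /* post the multi-packet descriptor; one doorbell covers
               * every packet added to the session */
              s->pkt_count = 0;
      }
      
      static void mpwqe_add_packet(struct mpwqe_session *s)
      {
              /* append this packet's data segment to the open WQE */
              if (++s->pkt_count == s->max_pkts)
                      mpwqe_close(s);
      }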
      
      Performance:
      Tested packet rate for UDP 64-byte multi-stream over ConnectX-5 NICs.
      CPU: Intel(R) Xeon(R) CPU E5-2680 v3 @ 2.50GHz
      
      XDP_TX:
      We see a huge gain on single port ConnectX-5, and reach the 100 Mpps
      milestone.
      * Single-port HCA:
      	Before:   70 Mpps
      	After:   100 Mpps (+42.8%)
      
      * Dual-port HCA:
      	Before: 51.7 Mpps
      	After:  57.3 Mpps (+10.8%)
      
      * In both cases we tested traffic on one port only. On dual-port HCAs
        we currently see only a small gain; we are working to overcome this
        bottleneck, but for the moment the numbers seen on single-port HCAs
        can be reached on dual-port HCAs only with experimental firmware.
      
      XDP_REDIRECT:
      Redirect from (A) ConnectX-5 to (B) ConnectX-5.
      Due to a setup limitation, (A) and (B) are on different NUMA nodes,
      so absolute performance numbers are not optimal.
      Note:
        Below is the transmit rate of (B), not the redirect rate of (A)
        which is in some cases higher.
      
      * (B) is single-port:
      	Before:   77 Mpps
      	After:    90 Mpps (+16.8%)
      
      * (B) is dual-port:
      	Before:  61 Mpps
      	After:   72 Mpps (+18%)
      Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
  22. 11 Dec 2018, 1 commit
  23. 10 Dec 2018, 1 commit
  24. 27 Nov 2018, 1 commit
    • net/mlx5: EQ, Introduce atomic notifier chain subscription API · 0f597ed4
      Saeed Mahameed authored
      Use an atomic_notifier_chain to fire firmware events at internal mlx5
      core components such as eswitch/fpga/clock/FW tracer, etc. This avoids
      explicit calls from the low-level mlx5_core to upper components and
      simplifies the mlx5_core API for future development.
      
      Simply provide register/unregister notifier APIs and call the notifier
      chain on firmware async events.
      
      Example: to subscribe to a FW event:
      struct mlx5_nb port_event;
      
      MLX5_NB_INIT(&port_event, port_event_handler, PORT_CHANGE);
      mlx5_eq_notifier_register(mdev, &port_event);
      
      where:
       - port_event_handler is the notifier block callback.
       - PORT_CHANGE is the suffix of MLX5_EVENT_TYPE_PORT_CHANGE.
      
      The above will guarantee that port_event_handler will receive all FW
      events of the type MLX5_EVENT_TYPE_PORT_CHANGE.
      
      To receive all FW/HW events one can subscribe to
      MLX5_EVENT_TYPE_NOTIFY_ANY.
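      
      For completeness, a sketch of the handler side, using the standard
      kernel notifier_block callback signature (the body is illustrative):
      
      static int port_event_handler(struct notifier_block *nb,
                                    unsigned long event, void *data)
      {
              struct mlx5_eqe *eqe = data;   /* FW event payload */
      
              /* react to the port change here */
              return NOTIFY_OK;
      }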
      
      The next few patches will start moving all mlx5 core components to use
      this new API, and will clean up the mlx5_eq_async_int MSI-X handler,
      removing component-explicit calls and component-specific logic.
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
  25. 13 Nov 2018, 1 commit
  26. 04 Oct 2018, 1 commit
    • net/mlx5: Add Fast teardown support · fcd29ad1
      Feras Daoud authored
      Today mlx5 devices support two teardown modes:
      1- Regular teardown
      2- Force teardown
      
      This change introduces an enhanced version of "Force teardown" that
      allows SW to perform the teardown faster, without the need to reclaim
      all the pages.
      
      Fast teardown provides the following advantages:
      1- Fixes a FW race condition that could cause command timeout
      2- Avoids moving to polling mode
      3- Closes the vport to prevent a PCI ACK from being sent without the
      data being scattered to memory
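      
      At unload time, the driver presumably tries the fastest supported mode
      first and falls back (a sketch only; the real flow is more involved,
      and try_fast_unload is an illustrative name):
      
      /* assumes the teardown command helpers introduced by these patches */
      static int try_fast_unload(struct mlx5_core_dev *dev)
      {
              if (!mlx5_cmd_fast_teardown_hca(dev))   /* no page reclaim */
                      return 0;
              if (!mlx5_cmd_force_teardown_hca(dev))  /* force teardown */
                      return 0;
              return -EOPNOTSUPP;            /* caller does a regular teardown */
      }
      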
      Signed-off-by: Feras Daoud <ferasda@mellanox.com>
      Reviewed-by: Majd Dibbiny <majd@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
  27. 05 Sep 2018, 1 commit
  28. 09 Aug 2018, 1 commit
  29. 24 Jul 2018, 1 commit
    • net/mlx5: FW tracer, events handling · c71ad41c
      Feras Daoud authored
      The tracer has one event, event 0x26, with two subtypes:
      - Subtype 0: Ownership change
      - Subtype 1: Traces available
      
      An ownership change occurs in the following cases:
      1- The owner releases its ownership; in this case, an event is
      sent to inform others to reattempt acquiring ownership.
      2- Ownership was taken by a higher-priority tool; in this case
      the owner should understand that it lost ownership and go through
      the teardown flow.
      
      The second subtype indicates that there are traces in the trace buffer.
      In this case, the driver polls the tracer buffer for new traces, parses
      them and prepares the messages for printing.
      
      The HW starts tracing from the first address in the tracer buffer.
      The driver receives an event notifying it that a new trace block exists.
      HW posts a timestamp event in the last 8B of every 256B block.
      Comparing that timestamp to the last handled timestamp indicates
      whether this is a new trace block; once a new timestamp is detected,
      the entire block is considered valid.
      
      Block validation and parsing should be done after copying the current
      block to a different location, in order to avoid the block being
      overwritten during processing.
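      
      A compact sketch of that validation step (block and timestamp sizes
      come from the text above; function and variable names are illustrative):
      
      #include <stdint.h>
      #include <string.h>
      
      #define TRACER_BLOCK_SIZE 256
      #define TIMESTAMP_SIZE    8
      
      /* each 256B block carries a timestamp in its last 8 bytes; a newer
       * timestamp than the last handled one marks a fresh block, which is
       * copied out before parsing so HW cannot overwrite it meanwhile */
      static int handle_block(const uint8_t *block, uint64_t *last_ts,
                              uint8_t *scratch)
      {
              uint64_t ts;
      
              memcpy(&ts, block + TRACER_BLOCK_SIZE - TIMESTAMP_SIZE,
                     sizeof(ts));
              if (ts <= *last_ts)
                      return 0;                          /* stale block, skip */
      
              memcpy(scratch, block, TRACER_BLOCK_SIZE); /* copy, then parse */
              *last_ts = ts;
              return 1;                                  /* parse from scratch */
      }
      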
      Signed-off-by: Feras Daoud <ferasda@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
  30. 19 Jul 2018, 1 commit
  31. 20 Jun 2018, 1 commit
  32. 01 Jun 2018, 1 commit