1. 25 May 2018, 2 commits
  2. 18 May 2018, 3 commits
  3. 17 May 2018, 1 commit
  4. 01 May 2018, 1 commit
    • net/mlx5: Accel, Add TLS tx offload interface · 1ae17322
      Committed by Ilya Lesokhin
      Add routines for manipulating TLS TX offload contexts.
      
      In Innova TLS, TLS contexts are added or deleted
      via a command message over the SBU connection.
      The HW then sends a response message over the same connection.
      
      Add implementation for Innova TLS (FPGA-based) hardware.
      
      These routines will be used by the TLS offload support in a later patch.
      
      mlx5/accel is a middle acceleration layer to allow mlx5e and other ULPs
      to work directly with mlx5_core rather than Innova FPGA or other mlx5
      acceleration providers.
      
      In the future, when IPSec/TLS or any other acceleration gets integrated
      into the ConnectX chip, the mlx5/accel layer will provide the integrated
      acceleration rather than the Innova one.
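      
      As a minimal sketch of that layering idea (the struct and function names
      below are invented for illustration, not the actual mlx5/accel
      interface), a ULP calls the accel layer, which forwards the request to
      whichever provider backs the device, today the Innova FPGA command path:
      
      #include <linux/types.h>
      
      /* Hypothetical provider ops; sketch only, not the real mlx5/accel API. */
      struct example_tls_provider {
              int  (*add_tx_ctx)(void *dev, const void *crypto_info, u32 *swid);
              void (*del_tx_ctx)(void *dev, u32 swid);
      };
      
      /* ULPs (e.g. mlx5e) call the accel layer, not the FPGA code directly. */
      static int example_accel_tls_add_tx_ctx(void *dev,
                                              const struct example_tls_provider *prov,
                                              const void *crypto_info, u32 *swid)
      {
              /* Today this forwards to the Innova (FPGA) provider, which sends
               * a command message over the SBU connection and waits for the HW
               * response; a ConnectX-integrated provider could later be
               * plugged in here without touching the ULPs. */
              return prov->add_tx_ctx(dev, crypto_info, swid);
      }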
      Signed-off-by: Ilya Lesokhin <ilyal@mellanox.com>
      Signed-off-by: Boris Pismenny <borisp@mellanox.com>
      Acked-by: Saeed Mahameed <saeedm@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  5. 27 Apr 2018, 1 commit
    • net/mlx5: Fix mlx5_get_vector_affinity function · 6082d9c9
      Committed by Israel Rukshin
      Adding the vector offset when calling mlx5_vector2eqn() is wrong,
      because mlx5_vector2eqn() checks whether the EQ index equals the vector
      number, and the internal completion vectors that mlx5 allocates do not
      get an EQ index.
      
      The second problem is that using effective_affinity_mask gives the same
      CPU for different vectors, which leads to unmapped queues when called
      from blk_mq_rdma_map_queues(). This does not happen when the
      affinity_hint mask is used.
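      
      A rough illustration of the affinity-hint approach (hypothetical helper,
      not the driver's actual function):
      
      #include <linux/irq.h>
      #include <linux/irqdesc.h>
      
      /* Sketch only: return the affinity hint of a completion vector's IRQ;
       * the caller applies no extra offset when translating vector to IRQ. */
      static const struct cpumask *example_vector_affinity_hint(unsigned int irq)
      {
              struct irq_desc *desc = irq_to_desc(irq);
      
              /* The effective affinity mask may report the same CPU for several
               * vectors, leaving blk_mq_rdma_map_queues() with unmapped queues;
               * the affinity hint preserves the per-vector CPU spreading. */
              return desc ? desc->affinity_hint : NULL;
      }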
      
      Fixes: 2572cf57 ("mlx5: fix mlx5_get_vector_affinity to start from completion vector 0")
      Fixes: 05e0cc84 ("net/mlx5: Fix get vector affinity helper function")
      Signed-off-by: Israel Rukshin <israelr@mellanox.com>
      Reviewed-by: Max Gurtovoy <maxg@mellanox.com>
      Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
  6. 06 Apr 2018, 3 commits
  7. 05 Apr 2018, 1 commit
  8. 31 Mar 2018, 2 commits
    • net/mlx5e: Use linear SKB in Striding RQ · 619a8f2a
      Committed by Tariq Toukan
      The current Striding RQ HW feature packs the RX buffers so that there is
      no wasted room between the strides, which maximises memory utilization.
      This prevents the use of build_skb() (which requires headroom and
      tailroom) and forces a memcpy of the packet headers into the skb linear
      part.
      
      In this patch, whenever a set of conditions holds, we apply
      an RQ configuration that allows combining the use of linear SKB
      on top of a Striding RQ.
      
      To use build_skb() with Striding RQ, the following must hold:
      1. packet does not cross a page boundary.
      2. there is enough headroom and tailroom surrounding the packet.
      
      We can satisfy 1 and 2 by configuring:
      	stride size = MTU + headroom + tailroom.
      
      This is possible only when:
      a. (MTU + headroom + tailroom) does not exceed PAGE_SIZE.
      b. HW LRO is turned off.
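      
      A rough sketch of this eligibility check (invented helper, not the
      driver's code):
      
      #include <linux/skbuff.h>
      #include <linux/mm.h>
      
      static bool example_striding_rq_linear_skb_ok(unsigned int mtu,
                                                    bool lro_enabled)
      {
              unsigned int headroom = NET_SKB_PAD + NET_IP_ALIGN;
              unsigned int tailroom =
                      SKB_DATA_ALIGN(sizeof(struct skb_shared_info));
              unsigned int stride   = headroom + mtu + tailroom;
      
              /* build_skb() is usable only if a whole stride (one packet plus
               * its surrounding room) fits in a single page and HW LRO is off. */
              return !lro_enabled && stride <= PAGE_SIZE;
      }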
      
      Using linear SKB has many advantages:
      - Saves a memcpy of the headers.
      - No page-boundary checks in datapath.
      - No filler CQEs.
      - Significantly smaller CQ.
      - SKB data resides contiguously in the linear part, instead of being
        split between a small linear part and a large fragment.
        This saves datapath cycles in the driver and improves utilization of
        SKB fragments in GRO.
      - The fragments of a resulting GRO SKB follow the IP forwarding
        assumption of equal-size fragments.
      
      Some implementation details:
      HW writes packets to the beginning of a stride, i.e. it does not keep
      headroom. To overcome this, we make sure we can extend backwards and use
      the last bytes of stride i-1. Extra care is needed for stride 0, as it
      has no preceding stride. We make headroom bytes available by shifting
      the buffer pointer passed to HW by headroom bytes.
      
      This configuration now becomes the default whenever possible.
      Of course, this implies turning LRO off.
      
      Performance testing:
      ConnectX-5, single core, single RX ring, default MTU.
      
      UDP packet rate, early drop in TC layer:
      
      --------------------------------------------
      | pkt size | before    | after     | ratio |
      --------------------------------------------
      | 1500byte | 4.65 Mpps | 5.96 Mpps | 1.28x |
      |  500byte | 5.23 Mpps | 5.97 Mpps | 1.14x |
      |   64byte | 5.94 Mpps | 5.96 Mpps | 1.00x |
      --------------------------------------------
      
      TCP streams: ~20% gain
      Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
    • net/mlx5: Eliminate query xsrq dead code · b2d3907c
      Committed by Saeed Mahameed
      1. This function is not used anywhere in the mlx5 driver.
      2. It has a memcpy statement that makes no sense and produces a build
      warning with gcc 8:
      
      drivers/net/ethernet/mellanox/mlx5/core/transobj.c: In function 'mlx5_core_query_xsrq':
      drivers/net/ethernet/mellanox/mlx5/core/transobj.c:347:3: error: 'memcpy' source argument is the same as destination [-Werror=restrict]
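      
      For reference, the kind of self-copy gcc 8 rejects under -Werror=restrict
      looks like this (illustrative example, not the removed driver code):
      
      #include <string.h>
      
      struct example_out {
              char data[16];
      };
      
      static void example_self_copy(struct example_out *out)
      {
              /* Source and destination are the same object; memcpy() requires
               * non-overlapping arguments, so gcc 8 flags this with
               * -Werror=restrict. */
              memcpy(out->data, out->data, sizeof(out->data));
      }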
      
      Fixes: 01949d01 ("net/mlx5_core: Enable XRCs and SRQs when using ISSI > 0")
      Reported-by: Arnd Bergmann <arnd@arndb.de>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
  9. 28 Mar 2018, 3 commits
  10. 27 Mar 2018, 6 commits
  11. 20 Mar 2018, 1 commit
    • net/mlx5: Packet pacing enhancement · 05d3ac97
      Committed by Bodong Wang
      Add two new parameters: max_burst_sz and typical_pkt_size (both
      in bytes) to rate limit configurations.
      
      max_burst_sz: The device will schedule packet bursts no larger than this
      value for an SQ attached to this rate. A value of 0x0 means packet
      bursts are limited to the device defaults. This field should be used if
      packet bursts must be kept strictly under a certain value.
      
      typical_pkt_size: When the rate limit is intended for a stream of
      similar packets, stating the typical packet size can improve the
      accuracy of the rate limiter. The expected packet size will be
      the same for all SQs associated with the same rate limit index.
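      
      A sketch of how such a configuration could be grouped (illustrative
      field layout only; the exact fields and widths are defined by the
      driver/firmware interface):
      
      #include <linux/types.h>
      
      struct example_rate_limit {
              u32 rate;             /* requested rate                               */
              u32 max_burst_sz;     /* max burst size in bytes, 0 = device default  */
              u16 typical_pkt_size; /* typical packet size in bytes, 0 = not stated */
      };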
      
      The Ethernet driver is updated according to this change, but these two
      parameters are kept at 0, since there is currently no proper way to get
      the configuration from user space without changing the
      ndo_set_tx_maxrate interface.
      Signed-off-by: Bodong Wang <bodong@mellanox.com>
      Reviewed-by: Daniel Jurgens <danielj@mellanox.com>
      Reviewed-by: Yishai Hadas <yishaih@mellanox.com>
      Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
      Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
  12. 15 Mar 2018, 1 commit
  13. 14 Mar 2018, 1 commit
  14. 08 Mar 2018, 8 commits
  15. 07 Mar 2018, 3 commits
  16. 24 Feb 2018, 2 commits
  17. 15 Feb 2018, 1 commit
    • IB/mlx5: Implement fragmented completion queue (CQ) · 388ca8be
      Committed by Yonatan Cohen
      The current implementation of create CQ requires contiguous memory.
      Such a requirement is problematic once memory is fragmented or the
      system is low on memory, and it causes failures in
      dma_zalloc_coherent().
      
      This patch implements a new scheme of fragmented CQs to overcome this
      issue by introducing a new type, 'struct mlx5_frag_buf_ctrl', to
      allocate fragmented buffers rather than contiguous ones.
      
      Base the Completion Queues (CQs) on this new fragmented buffer.
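      
      An illustrative sketch of the fragmented-buffer idea (invented names,
      not the mlx5 implementation): the CQ buffer is allocated as many small
      DMA chunks instead of one large contiguous block.
      
      #include <linux/dma-mapping.h>
      #include <linux/errno.h>
      
      struct example_frag {
              void       *buf;
              dma_addr_t  map;
      };
      
      static int example_frag_buf_alloc(struct device *dev,
                                        struct example_frag *frags,
                                        int nfrags, size_t frag_sz)
      {
              int i;
      
              for (i = 0; i < nfrags; i++) {
                      /* Each fragment is at most one page, so no high-order
                       * contiguous allocation (the order:6 failure below) is
                       * needed. */
                      frags[i].buf = dma_alloc_coherent(dev, frag_sz,
                                                        &frags[i].map, GFP_KERNEL);
                      if (!frags[i].buf)
                              goto err_free;
              }
              return 0;
      
      err_free:
              while (--i >= 0)
                      dma_free_coherent(dev, frag_sz, frags[i].buf, frags[i].map);
              return -ENOMEM;
      }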
      
      It fixes the following crashes:
      kworker/29:0: page allocation failure: order:6, mode:0x80d0
      CPU: 29 PID: 8374 Comm: kworker/29:0 Tainted: G OE 3.10.0
      Workqueue: ib_cm cm_work_handler [ib_cm]
      Call Trace:
      [<>] dump_stack+0x19/0x1b
      [<>] warn_alloc_failed+0x110/0x180
      [<>] __alloc_pages_slowpath+0x6b7/0x725
      [<>] __alloc_pages_nodemask+0x405/0x420
      [<>] dma_generic_alloc_coherent+0x8f/0x140
      [<>] x86_swiotlb_alloc_coherent+0x21/0x50
      [<>] mlx5_dma_zalloc_coherent_node+0xad/0x110 [mlx5_core]
      [<>] ? mlx5_db_alloc_node+0x69/0x1b0 [mlx5_core]
      [<>] mlx5_buf_alloc_node+0x3e/0xa0 [mlx5_core]
      [<>] mlx5_buf_alloc+0x14/0x20 [mlx5_core]
      [<>] create_cq_kernel+0x90/0x1f0 [mlx5_ib]
      [<>] mlx5_ib_create_cq+0x3b0/0x4e0 [mlx5_ib]
      Signed-off-by: Yonatan Cohen <yonatanc@mellanox.com>
      Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
      Signed-off-by: Leon Romanovsky <leon@kernel.org>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>