1. 20 Jan 2017, 1 commit
  2. 10 Jan 2017, 1 commit
    • IB/mlx5: Allow future extension of libmlx5 input data · b037c29a
      Eli Cohen authored
      The current check requires that new fields in struct
      mlx5_ib_alloc_ucontext_req_v2 that are not known to the driver be zero.
      This was introduced so that new libraries passing additional information
      to the kernel through struct mlx5_ib_alloc_ucontext_req_v2 are notified
      by old kernels that do not support their request, by failing the
      operation. This scheme is problematic since it requires libmlx5 to issue
      requests with descending input sizes for struct
      mlx5_ib_alloc_ucontext_req_v2.
      
      To avoid this, we require that new features obey the following rules:
      if the feature requires one or more fields in the response, and at least
      one of those fields can be encoded such that a zero value means the
      kernel ignored the request, then that field provides the indication to
      the library. If no response is required, or if zero is a valid response,
      a new field should be added that indicates to the library whether its
      request was processed.
      
      Fixes: b368d7cb ('IB/mlx5: Add hca_core_clock_offset to udata in init_ucontext')
      Signed-off-by: Eli Cohen <eli@mellanox.com>
      Reviewed-by: Matan Barak <matanb@mellanox.com>
      Signed-off-by: Leon Romanovsky <leon@kernel.org>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
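      The rule above can be illustrated with a minimal sketch. The struct and
      field names below (ucontext_req, ucontext_resp, feature_index) are
      hypothetical stand-ins, not the actual mlx5_ib_alloc_ucontext_req_v2 /
      response layout; the point is only how a library detects whether the
      kernel processed its request.

          /* Hypothetical structs; not the actual mlx5 uAPI layout. */
          #include <stdbool.h>
          #include <stdint.h>

          struct ucontext_req {                /* stands in for mlx5_ib_alloc_ucontext_req_v2 */
                  uint64_t new_feature_flag;   /* library requests the new feature */
          };

          struct ucontext_resp {               /* stands in for the response struct */
                  uint32_t feature_index;      /* encoded so that 0 == "kernel ignored it" */
          };

          static bool feature_supported(const struct ucontext_resp *resp)
          {
                  /*
                   * An old kernel never writes this trailing field, so a
                   * zero-initialized response stays zero and the library
                   * knows its request was not processed.
                   */
                  return resp->feature_index != 0;
          }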
  3. 08 Jan 2017, 2 commits
  4. 03 Jan 2017, 1 commit
  5. 29 Dec 2016, 1 commit
  6. 19 Nov 2016, 2 commits
  7. 13 Oct 2016, 1 commit
    • net/mlx5: Add MLX5_ARRAY_SET64 to fix BUILD_BUG_ON · b8a4ddb2
      Tom Herbert authored
      I am hitting this in mlx5:
      
      drivers/net/ethernet/mellanox/mlx5/core/pagealloc.c: In function
      reclaim_pages_cmd.clone.0:
      drivers/net/ethernet/mellanox/mlx5/core/pagealloc.c:346: error: call
      to __compiletime_assert_346 declared with attribute error:
      BUILD_BUG_ON failed: __mlx5_bit_off(manage_pages_out, pas[i]) % 64
      drivers/net/ethernet/mellanox/mlx5/core/pagealloc.c: In function give_pages:
      drivers/net/ethernet/mellanox/mlx5/core/pagealloc.c:291: error: call
      to __compiletime_assert_291 declared with attribute error:
      BUILD_BUG_ON failed: __mlx5_bit_off(manage_pages_in, pas[i]) % 64
      
      The problem is that this performs a BUILD_BUG_ON on a non-constant
      expression, because it tries to take the offset of pas[i] in the
      structure.

      The fix is to create MLX5_ARRAY_SET64, which takes an additional
      argument, the field index, so the BUILD_BUG_ON on the constant array
      field is separated from the indexed field the value is assigned to.
      The two callers of MLX5_SET64 that try to use a variable offset are
      changed to call MLX5_ARRAY_SET64, passing 'pas' and 'i' as the
      arguments used for the offset check and the indexed value assignment.
      
      Fixes: a533ed5e ("net/mlx5: Pages management commands via mlx5 ifc")
      Signed-off-by: Tom Herbert <tom@herbertland.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
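      A minimal sketch of the split the commit describes, written in plain
      C11 instead of the kernel's __mlx5_bit_off/__MLX5_SET64 helpers; the
      struct and macro names here (manage_pages_box, ARRAY_SET64) are
      illustrative, not the kernel definitions. The compile-time check is
      applied to the array field itself (a constant offset), while the
      runtime index is used only for the assignment.

          #include <stddef.h>
          #include <stdint.h>

          struct manage_pages_box {            /* stands in for manage_pages_in/out */
                  uint32_t hdr[4];
                  uint64_t pas[8];             /* physical address array */
          };

          /* Check alignment of the array (a constant), then index at runtime. */
          #define ARRAY_SET64(p, fld, idx, v) do {                                 \
                  _Static_assert(offsetof(struct manage_pages_box, fld) % 8 == 0,  \
                                 "64-bit field must be 64-bit aligned");           \
                  (p)->fld[(idx)] = (v);                                           \
          } while (0)

          static void fill_pas(struct manage_pages_box *box,
                               const uint64_t *addrs, int n)
          {
                  for (int i = 0; i < n; i++)
                          ARRAY_SET64(box, pas, i, addrs[i]);   /* no check on pas[i] itself */
          }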
  8. 18 Aug 2016, 1 commit
  9. 17 Aug 2016, 1 commit
  10. 14 Aug 2016, 5 commits
  11. 26 Jul 2016, 1 commit
  12. 27 Jun 2016, 1 commit
    • net/mlx5: Rate limit tables support · 1466cc5b
      Yevgeny Petrilin authored
      Configure and manage HW rate limit tables. The HW holds a table of rate
      limits; each rate is associated with an index in that table. A Send
      Queue later uses this index to set its rate limit. Multiple Send Queues
      can have the same rate limit, which is represented by a single entry in
      the table. Even though a rate can be shared, each queue is rate limited
      independently of the others.

      The SW shadow of this table holds the rate itself, the index in the HW
      table, and the refcount (the number of queues) working with this rate.

      The exported functions are mlx5_rl_add_rate and mlx5_rl_remove_rate.
      The number of different rates and their values are derived from HW
      capabilities.
      Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
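      A minimal sketch of the SW shadow table described above, assuming a
      simplified entry of {rate, HW index, refcount}; the names (rl_table,
      rl_add_rate, rl_remove_rate) are illustrative stand-ins, not the actual
      mlx5_rl_* implementation.

          #include <stddef.h>
          #include <stdint.h>

          struct rl_entry {
                  uint32_t rate_mbps;   /* 0 means the slot is free */
                  uint16_t hw_index;    /* index programmed into the HW table */
                  int      refcount;    /* number of send queues using this rate */
          };

          struct rl_table {
                  struct rl_entry *entries;
                  size_t           num_entries;   /* derived from HW capabilities */
          };

          /* Returns the HW index to use in the SQ context, or -1 if full. */
          static int rl_add_rate(struct rl_table *t, uint32_t rate_mbps)
          {
                  struct rl_entry *free_slot = NULL;

                  for (size_t i = 0; i < t->num_entries; i++) {
                          if (t->entries[i].rate_mbps == rate_mbps) {
                                  t->entries[i].refcount++;   /* rate is shared */
                                  return t->entries[i].hw_index;
                          }
                          if (!free_slot && t->entries[i].rate_mbps == 0)
                                  free_slot = &t->entries[i];
                  }
                  if (!free_slot)
                          return -1;
                  free_slot->rate_mbps = rate_mbps;   /* would also program the HW entry here */
                  free_slot->refcount  = 1;
                  return free_slot->hw_index;
          }

          static void rl_remove_rate(struct rl_table *t, uint32_t rate_mbps)
          {
                  for (size_t i = 0; i < t->num_entries; i++) {
                          if (t->entries[i].rate_mbps == rate_mbps &&
                              --t->entries[i].refcount == 0) {
                                  t->entries[i].rate_mbps = 0;   /* free the HW entry too */
                                  return;
                          }
                  }
          }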
  13. 10 Jun 2016, 2 commits
  14. 12 May 2016, 1 commit
  15. 05 May 2016, 1 commit
    • net/mlx5: Flow steering, Add vport ACL support · efdc810b
      Mohamad Haj Yahia authored
      Update the relevant flow steering device structs and commands to
      support vports.
      Update the flow steering core API to receive a vport number.
      Add ingress and egress ACL flow table namespaces.
      Add ACL flow table support:
      * An ACL (Access Control List) flow table is a table that contains
      only allow/drop steering rules.

      * There are two types of ACL flow tables: ingress and egress.

      * ACLs handle traffic sent from/to the E-Switch FDB table; ingress
      refers to traffic sent from the vport to the E-Switch, and egress
      refers to traffic sent from the E-Switch to the vport.

      * Ingress ACL flow table allow/drop rules are checked against traffic
      sent from the VF.

      * Egress ACL flow table allow/drop rules are checked against traffic
      sent to the VF.
      Signed-off-by: Mohamad Haj Yahia <mohamad@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
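      A conceptual sketch of the per-vport ACL namespaces described above.
      All names below (acl_direction, acl_ns_get, acl_table_create,
      acl_rule_add) are hypothetical; they only stand in for the flow
      steering core API, which after this change takes a vport number, and
      are not the driver's actual functions.

          /* Each (direction, vport) pair gets its own ACL namespace. */
          enum acl_direction {
                  ACL_INGRESS,   /* traffic sent from the vport (VF) to the E-Switch */
                  ACL_EGRESS,    /* traffic sent from the E-Switch to the vport (VF) */
          };

          enum acl_action {
                  ACL_ALLOW,
                  ACL_DROP,      /* ACL tables hold only allow/drop rules */
          };

          struct acl_ns;         /* opaque namespace handle */
          struct acl_table;      /* opaque flow table handle */

          /* Hypothetical API shape: the namespace lookup is keyed by vport. */
          struct acl_ns    *acl_ns_get(void *dev, enum acl_direction dir, int vport);
          struct acl_table *acl_table_create(struct acl_ns *ns, int max_rules);
          int               acl_rule_add(struct acl_table *ft, const void *match,
                                         enum acl_action action);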
  16. 28 Apr 2016, 1 commit
  17. 27 Apr 2016, 3 commits
  18. 22 Apr 2016, 1 commit
    • net/mlx5e: Support RX multi-packet WQE (Striding RQ) · 461017cb
      Tariq Toukan authored
      Introduce the multi-packet WQE (RX Work Queue Element) feature,
      referred to as MPWQE or Striding RQ, in which WQEs are larger and each
      serves multiple packets.
      
      Every WQE consists of many strides of the same size; every received
      packet is aligned to the beginning of a stride and is written to
      consecutive strides within a WQE.
      
      In the regular approach, each WQE is big enough to serve one received
      packet of any size up to the MTU, or up to 64K when device LRO is
      enabled, which is very wasteful when dealing with small packets or when
      device LRO is enabled.
      
      Thanks to its flexibility, MPWQE allows better memory utilization
      (implying improvements in CPU utilization and packet rate), as packets
      consume strides according to their size, leaving the rest of the WQE
      available for other packets.
      
      MPWQE default configuration:
      	Num of WQEs	= 16
      	Strides Per WQE = 2048
      	Stride Size	= 64 byte
      
      The default WQE memory footprint went from 1024 * MTU (~1.5MB) to
      16 * 2048 * 64 = 2MB per ring.
      However, HW LRO can now be supported at no additional cost in memory
      footprint, so we turn it on by default and get even better performance.
      
      Performance was tested on ConnectX4-Lx 50G.
      To isolate the feature under test, the numbers below were measured with
      HW LRO turned off. We verified that performance only improves further
      when LRO is turned back on.
      
      * Netperf single TCP stream:
      - BW raised by 10-15% for representative packet sizes:
        default, 64B, 1024B, 1478B, 65536B.
      
      * Netperf multi TCP stream:
      - No degradation, line rate reached.
      
      * Pktgen: packet rate raised by 2-10% for traffic of different message
      sizes: 64B, 128B, 256B, 1024B, and 1500B.
      
      * Pktgen: packet loss in bursts of small messages (64 byte),
      single stream:
        | num packets | packet loss before | packet loss after |
        |     2K      |        ~1K         |         0         |
        |     8K      |        ~6K         |         0         |
        |     16K     |        ~13K        |         0         |
        |     32K     |        ~28K        |         0         |
        |     64K     |        ~57K        |        ~24K       |
      
      This is expected, as the driver can receive as many small packets
      (<=64B) as the total number of strides in the ring (default = 2048 * 16)
      vs. 1024 (the default ring size, regardless of packet size) before this
      feature.
      Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
      Signed-off-by: Achiad Shochat <achiad@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
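      The memory-footprint and small-packet-capacity arithmetic above can be
      reproduced with a short calculation; the constants come from the
      commit's default MPWQE configuration, and the ~1500-byte MTU used for
      the legacy estimate is an assumption.

          #include <stdio.h>

          int main(void)
          {
                  const long num_wqes        = 16;     /* WQEs per RQ ring */
                  const long strides_per_wqe = 2048;
                  const long stride_size     = 64;     /* bytes */

                  /* New footprint: 16 * 2048 * 64 = 2 MB per ring. */
                  long mpwqe_bytes = num_wqes * strides_per_wqe * stride_size;

                  /* Old footprint: 1024 WQEs, each sized for ~one MTU packet. */
                  long legacy_bytes = 1024 * 1500;

                  /* Small-packet capacity: one stride vs. one WQE per packet. */
                  long mpwqe_small_pkts  = num_wqes * strides_per_wqe;   /* 32768 */
                  long legacy_small_pkts = 1024;

                  printf("MPWQE ring:  %ld bytes, up to %ld small packets\n",
                         mpwqe_bytes, mpwqe_small_pkts);
                  printf("Legacy ring: %ld bytes, up to %ld small packets\n",
                         legacy_bytes, legacy_small_pkts);
                  return 0;
          }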
  19. 22 Mar 2016, 1 commit
  20. 10 Mar 2016, 1 commit
  21. 01 Mar 2016, 2 commits
  22. 25 Feb 2016, 2 commits
  23. 22 Jan 2016, 1 commit
  24. 12 Jan 2016, 1 commit
  25. 06 Jan 2016, 1 commit
  26. 24 Dec 2015, 3 commits
  27. 04 Dec 2015, 1 commit