1. 01 Oct, 2020 1 commit
  2. 18 Sep, 2020 3 commits
  3. 11 Sep, 2020 1 commit
  4. 10 Sep, 2020 10 commits
  5. 31 Aug, 2020 1 commit
  6. 19 Aug, 2020 1 commit
  7. 30 Jul, 2020 1 commit
  8. 07 Jul, 2020 7 commits
  9. 03 Jul, 2020 1 commit
  10. 25 Jun, 2020 1 commit
  11. 24 Jun, 2020 1 commit
  12. 23 Jun, 2020 4 commits
  13. 09 Jun, 2020 1 commit
  14. 03 Jun, 2020 3 commits
  15. 30 May, 2020 2 commits
    • RDMA/core: Introduce shared CQ pool API · c7ff819a
      Committed by Yamin Friedman
      Allow a ULP to ask the core for a completion queue (CQ) chosen by a
      least-used search over per-device CQ pools. The device CQ pools grow
      lazily as more CQs are requested.
      
      This feature reduces the number of interrupts when using many QPs. Using
      shared CQs allows for more efficient completion handling. It also reduces
      the overhead needed for CQ contexts.
      
      Test setup:
      Intel(R) Xeon(R) Platinum 8176M CPU @ 2.10GHz servers.
      Running NVMeoF 4KB read IOs over ConnectX-5EX across Spectrum switch.
      TX-depth = 32. The patch was applied in the nvme driver on both the target
      and initiator. Four controllers are accessed from each core. In the
      current test case we have exposed sixteen NVMe namespaces using four
      different subsystems (four namespaces per subsystem) from one NVM port.
      Each controller allocated X queues (RDMA QPs) and attached to Y CQs.
      Before this series we had X == Y, i.e., for four controllers we created
      a total of 4X QPs and 4X CQs. In the shared case, we created 4X QPs and
      only X CQs, which means that four controllers share one completion
      queue per core. Up to fourteen cores there is no significant change in
      performance, and the number of interrupts per second is below one
      million in the current case.
      ==================================================
      |Cores|Current KIOPs  |Shared KIOPs  |improvement|
      |-----|---------------|--------------|-----------|
      |14   |2332           |2723          |16.7%      |
      |-----|---------------|--------------|-----------|
      |20   |2086           |2712          |30%        |
      |-----|---------------|--------------|-----------|
      |28   |1971           |2669          |35.4%      |
      ==================================================
      |Cores|Current avg lat|Shared avg lat|improvement|
      |-----|---------------|--------------|-----------|
      |14   |767us          |657us         |14.3%      |
      |-----|---------------|--------------|-----------|
      |20   |1225us         |943us         |23%        |
      |-----|---------------|--------------|-----------|
      |28   |1816us         |1341us        |26.1%      |
      ========================================================
      |Cores|Current interrupts|Shared interrupts|improvement|
      |-----|------------------|-----------------|-----------|
      |14   |1.6M/sec          |0.4M/sec         |72%        |
      |-----|------------------|-----------------|-----------|
      |20   |2.8M/sec          |0.6M/sec         |72.4%      |
      |-----|------------------|-----------------|-----------|
      |28   |2.9M/sec          |0.8M/sec         |63.4%      |
      ====================================================================
      |Cores|Current 99.99th PCTL lat|Shared 99.99th PCTL lat|improvement|
      |-----|------------------------|-----------------------|-----------|
      |14   |67ms                    |6ms                    |90.9%      |
      |-----|------------------------|-----------------------|-----------|
      |20   |5ms                     |6ms                    |-10%       |
      |-----|------------------------|-----------------------|-----------|
      |28   |8.7ms                   |6ms                    |25.9%      |
      ====================================================================
      
      Performance improvement with sixteen disks (sixteen CQs per core) is
      comparable.
      
      Link: https://lore.kernel.org/r/1590568495-101621-3-git-send-email-yaminf@mellanox.com
      Signed-off-by: Yamin Friedman <yaminf@mellanox.com>
      Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
      Reviewed-by: Max Gurtovoy <maxg@mellanox.com>
      Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
      Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
    • RDMA/core: Add protection for shared CQs used by ULPs · 3446cbd2
      Committed by Yamin Friedman
      A preparatory step for adding shared CQs. Add the infrastructure to
      prevent shared CQ users from altering the CQ configuration. For now all
      CQs are marked as private (non-shared). The core driver should use the
      new force functions to perform resize/destroy/moderation changes that
      are not allowed for users of shared CQs.
      
      Link: https://lore.kernel.org/r/1590568495-101621-2-git-send-email-yaminf@mellanox.com
      Signed-off-by: Yamin Friedman <yaminf@mellanox.com>
      Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
      Reviewed-by: Max Gurtovoy <maxg@mellanox.com>
      Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
      Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
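
The two commit messages above describe a least-used search over a lazily growing CQ pool, plus a guard that stops users of shared CQs from reconfiguring them. The following is a minimal userspace sketch of that idea, not the kernel implementation: every name in it (`model_cq`, `model_pool`, `pool_get`, `pool_put`, `user_resize`, `core_force_resize`, `pool_free`) and both constants are hypothetical, and the tie-breaking and capacity rules are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical userspace model; none of these names come from the kernel. */
#define POOL_MAX_CQS 64    /* assumed upper bound on CQs in the model pool */
#define CQ_CAPACITY  1024  /* assumed number of CQEs per model CQ */

struct model_cq {
    unsigned int capacity; /* total CQEs this CQ can hold */
    unsigned int used;     /* CQEs currently claimed by users */
    int shared;            /* 1 if owned by the pool, 0 if private */
};

struct model_pool {
    struct model_cq *cqs[POOL_MAX_CQS];
    size_t count;          /* CQs allocated so far; grows lazily */
};

/* Return the least-used CQ with room for nr_cqe, allocating one if none fits. */
static struct model_cq *pool_get(struct model_pool *pool, unsigned int nr_cqe)
{
    struct model_cq *best = NULL;
    size_t i;

    if (nr_cqe > CQ_CAPACITY)
        return NULL;

    for (i = 0; i < pool->count; i++) {
        struct model_cq *cq = pool->cqs[i];

        if (cq->capacity - cq->used < nr_cqe)
            continue;                  /* not enough free entries */
        if (!best || cq->used < best->used)
            best = cq;                 /* ties keep the earlier CQ */
    }

    if (!best) {                       /* lazy growth: allocate on demand */
        if (pool->count == POOL_MAX_CQS)
            return NULL;
        best = calloc(1, sizeof(*best));
        if (!best)
            return NULL;
        best->capacity = CQ_CAPACITY;
        best->shared = 1;
        pool->cqs[pool->count++] = best;
    }

    best->used += nr_cqe;
    return best;
}

/* Give nr_cqe entries back to the CQ they were claimed from. */
static void pool_put(struct model_cq *cq, unsigned int nr_cqe)
{
    cq->used -= (nr_cqe > cq->used) ? cq->used : nr_cqe;
}

/* User-facing resize: refused for shared CQs. */
static int user_resize(struct model_cq *cq, unsigned int new_capacity)
{
    if (cq->shared || new_capacity < cq->used)
        return -1;
    cq->capacity = new_capacity;
    return 0;
}

/* Core-only resize: allowed even when the CQ is shared. */
static int core_force_resize(struct model_cq *cq, unsigned int new_capacity)
{
    if (new_capacity < cq->used)
        return -1;
    cq->capacity = new_capacity;
    return 0;
}

/* Release every CQ the pool allocated. */
static void pool_free(struct model_pool *pool)
{
    while (pool->count > 0)
        free(pool->cqs[--pool->count]);
}
```

In this model, two requests of 600 entries each force the pool to grow to two CQs, while later small requests are spread across the existing CQs according to their current usage. A caller holding a shared CQ gets an error from `user_resize`, whereas `core_force_resize` stands in for the core-only "force" path mentioned in the second commit.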
  16. 22 May, 2020 2 commits