1. 07 Jul 2020 (6 commits)
  2. 03 Jul 2020 (1 commit)
  3. 25 Jun 2020 (1 commit)
  4. 24 Jun 2020 (1 commit)
  5. 23 Jun 2020 (4 commits)
  6. 09 Jun 2020 (1 commit)
  7. 03 Jun 2020 (4 commits)
  8. 30 May 2020 (3 commits)
    • RDMA/core: Introduce shared CQ pool API · c7ff819a
      Committed by Yamin Friedman
      Allow a ULP to ask the core to provide a completion queue based on a
      least-used search of the per-device CQ pools. The device CQ pools grow
      lazily when more CQs are requested.
      
      This feature reduces the number of interrupts when using many QPs.
      Using shared CQs allows for more efficient completion handling. It
      also reduces the overhead needed for CQ contexts.
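
      A minimal usage sketch of the pool API this patch adds
      (ib_cq_pool_get()/ib_cq_pool_put()); the device pointer, CQE count,
      and polling context below are illustrative placeholders:
      
      #include <linux/err.h>
      #include <rdma/ib_verbs.h>
      
      static int ulp_setup_cq(struct ib_device *dev, struct ib_cq **cq)
      {
              /* Ask the pool for a CQ with room for 128 CQEs; the core
               * returns the least-used pool CQ, growing the per-device
               * pool lazily if needed. -1 means no completion vector
               * affinity hint. */
              *cq = ib_cq_pool_get(dev, 128, -1, IB_POLL_SOFTIRQ);
              if (IS_ERR(*cq))
                      return PTR_ERR(*cq);
              return 0;
      }
      
      static void ulp_teardown_cq(struct ib_cq *cq)
      {
              /* Return the 128 CQEs to the shared CQ's free budget. */
              ib_cq_pool_put(cq, 128);
      }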
      
      Test setup:
      Intel(R) Xeon(R) Platinum 8176M CPU @ 2.10GHz servers.
      Running NVMeoF 4KB read IOs over ConnectX-5EX across a Spectrum
      switch. TX-depth = 32. The patch was applied in the nvme driver on
      both the target and the initiator. Four controllers are accessed from
      each core. In the current test case we have exposed sixteen NVMe
      namespaces using four different subsystems (four namespaces per
      subsystem) from one NVMe port. Each controller allocates X queues
      (RDMA QPs) attached to Y CQs. Before this series we had X == Y, i.e.,
      for four controllers we created a total of 4X QPs and 4X CQs. In the
      shared case, we create 4X QPs and only X CQs, which means that four
      controllers share a completion queue per core. Up to fourteen cores
      there is no significant change in performance, and the number of
      interrupts per second stays below one million in the current case.
      ==================================================
      |Cores|Current KIOPs  |Shared KIOPs  |improvement|
      |-----|---------------|--------------|-----------|
      |14   |2332           |2723          |16.7%      |
      |-----|---------------|--------------|-----------|
      |20   |2086           |2712          |30%        |
      |-----|---------------|--------------|-----------|
      |28   |1971           |2669          |35.4%      |
      ==================================================
      |Cores|Current avg lat|Shared avg lat|improvement|
      |-----|---------------|--------------|-----------|
      |14   |767us          |657us         |14.3%      |
      |-----|---------------|--------------|-----------|
      |20   |1225us         |943us         |23%        |
      |-----|---------------|--------------|-----------|
      |28   |1816us         |1341us        |26.1%      |
      ========================================================
      |Cores|Current interrupts|Shared interrupts|improvement|
      |-----|------------------|-----------------|-----------|
      |14   |1.6M/sec          |0.4M/sec         |72%        |
      |-----|------------------|-----------------|-----------|
      |20   |2.8M/sec          |0.6M/sec         |72.4%      |
      |-----|------------------|-----------------|-----------|
      |28   |2.9M/sec          |0.8M/sec         |63.4%      |
      ====================================================================
      |Cores|Current 99.99th PCTL lat|Shared 99.99th PCTL lat|improvement|
      |-----|------------------------|-----------------------|-----------|
      |14   |67ms                    |6ms                    |90.9%      |
      |-----|------------------------|-----------------------|-----------|
      |20   |5ms                     |6ms                    |-10%       |
      |-----|------------------------|-----------------------|-----------|
      |28   |8.7ms                   |6ms                    |25.9%      |
      ====================================================================
      
      Performance improvement with sixteen disks (sixteen CQs per core) is
      comparable.
      
      Link: https://lore.kernel.org/r/1590568495-101621-3-git-send-email-yaminf@mellanox.com
      Signed-off-by: Yamin Friedman <yaminf@mellanox.com>
      Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
      Reviewed-by: Max Gurtovoy <maxg@mellanox.com>
      Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
      Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
      c7ff819a
    • RDMA/core: Add protection for shared CQs used by ULPs · 3446cbd2
      Committed by Yamin Friedman
      A pre-step for adding shared CQs. Add the infrastructure to prevent
      shared CQ users from altering the CQ configuration. For now, all CQs
      are marked as private (non-shared). The core driver should use the new
      force functions to perform resize/destroy/moderation changes that are
      not allowed for users of shared CQs.
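
      An illustrative sketch of the guard pattern described above (the names
      and struct layout are hypothetical, not the exact kernel code):
      user-facing entry points check a shared flag, while a *_force variant
      remains available to the core driver:
      
      #include <errno.h>
      #include <stdbool.h>
      
      struct demo_cq {
              bool shared;    /* true once the CQ is owned by the core pool */
              int cqe;        /* current CQ depth */
      };
      
      /* Core-only path: performs the change unconditionally. */
      static int demo_resize_cq_force(struct demo_cq *cq, int cqe)
      {
              cq->cqe = cqe;
              return 0;
      }
      
      /* ULP-facing path: refuses to alter a shared CQ. */
      static int demo_resize_cq(struct demo_cq *cq, int cqe)
      {
              if (cq->shared)
                      return -EOPNOTSUPP;
              return demo_resize_cq_force(cq, cqe);
      }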
      
      Link: https://lore.kernel.org/r/1590568495-101621-2-git-send-email-yaminf@mellanox.com
      Signed-off-by: Yamin Friedman <yaminf@mellanox.com>
      Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
      Reviewed-by: Max Gurtovoy <maxg@mellanox.com>
      Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
      Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
      3446cbd2
  9. 28 May 2020 (8 commits)
  10. 22 May 2020 (2 commits)
  11. 21 May 2020 (4 commits)
  12. 18 May 2020 (1 commit)
  13. 13 May 2020 (1 commit)
    • IB/rdmavt: Replace zero-length array with flexible-array · 0cb9e4f9
      Committed by Gustavo A. R. Silva
      The current codebase makes use of the zero-length array language
      extension to the C90 standard, but the preferred mechanism to declare
      variable-length types such as these is a flexible array member[1][2],
      introduced in C99:
      
      struct foo {
              int stuff;
              struct boo array[];
      };
      
      By making use of the mechanism above, we will get a compiler warning
      in case the flexible array does not occur last in the structure, which
      will help us prevent some kinds of undefined behavior bugs from being
      inadvertently introduced[3] into the codebase from now on.
      
      Also, notice that dynamic memory allocations won't be affected by
      this change:
      
      "Flexible array members have incomplete type, and so the sizeof operator
      may not be applied. As a quirk of the original implementation of
      zero-length arrays, sizeof evaluates to zero."[1]
      
      sizeof(flexible-array-member) triggers a warning because flexible
      array members have incomplete type[1]. There are some instances of
      code in which the sizeof operator is incorrectly applied to
      zero-length arrays, and the result is zero. Such instances may be
      hiding bugs. So this work (flexible-array member conversions) will
      also help to get rid of those issues entirely.
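
      As an illustration (not taken from the patch itself; the struct names
      follow the example above, and struct_size() is the kernel helper from
      <linux/overflow.h>), allocations of such structures typically look
      like this after the conversion:
      
      #include <linux/overflow.h>
      #include <linux/slab.h>
      
      struct boo {
              int x;
      };
      
      struct foo {
              int stuff;
              struct boo array[];     /* flexible array member */
      };
      
      /* Allocate a foo with room for 'count' trailing boo entries.
       * struct_size() computes sizeof(struct foo) plus
       * count * sizeof(struct boo), with overflow checking. */
      static struct foo *alloc_foo(size_t count)
      {
              struct foo *p;
      
              p = kzalloc(struct_size(p, array, count), GFP_KERNEL);
              return p;
      }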
      
      This issue was found with the help of Coccinelle.
      
      [1] https://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html
      [2] https://github.com/KSPP/linux/issues/21
      [3] commit 76497732 ("cxgb3/l2t: Fix undefined behaviour")
      
      Link: https://lore.kernel.org/r/20200507185342.GA14476@embeddedor
      Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
      Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
      0cb9e4f9
  14. 07 May 2020 (1 commit)
  15. 06 May 2020 (2 commits)