1. 09 Jul, 2019 5 commits
    • RDMA/nldev: Added configuration of RDMA dynamic interrupt moderation to netlink · f8fc8cd9
      Committed by Yamin Friedman
      Added a parameter in ib_device for enabling dynamic interrupt moderation so
      that it can be configured from userspace using the rdma tool.
      
      To set adaptive-moderation for an IB device, the command is:
      rdma dev set [DEV] adaptive-moderation [on|off]
      Either on or off must be specified.
      
      rdma dev show
      0: mlx5_0: node_type ca fw 16.26.0055 node_guid 248a:0703:00a5:29d0
      sys_image_guid 248a:0703:00a5:29d0 adaptive-moderation on
      
      rdma resource show cq
      dev mlx5_0 cqn 0 cqe 1023 users 4 poll-ctx UNBOUND_WORKQUEUE
      adaptive-moderation off comm [ib_core]
      Signed-off-by: Yamin Friedman <yaminf@mellanox.com>
      Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
      Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
      Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
    • RDMA/core: Provide RDMA DIM support for ULPs · da662979
      Committed by Yamin Friedman
      Added the interface in the infiniband driver that applies the rdma_dim
      adaptive moderation. There is now a special function for allocating an
      ib_cq that uses rdma_dim.
      
      Performance improvement (ConnectX-5 100GbE, x86) running FIO benchmark over
      NVMf between two equal end-hosts with 56 cores across a Mellanox switch
      using null_blk device:
      
      IO READS without DIM:
      blk size | BW        | IOPS  | 99th percentile latency | 99.99th latency
      512B     | 3.8GiB/s  | 7.7M  | 1401 usec               | 2442 usec
      4k       | 7.0GiB/s  | 1.8M  | 4817 usec               | 6587 usec
      64k      | 10.7GiB/s | 175k  | 9896 usec               | 10028 usec
      
      IO WRITES without DIM:
      blk size | BW        | IOPS  | 99th percentile latency | 99.99th latency
      512B     | 3.6GiB/s  | 7.5M  | 1434 usec               | 2474 usec
      4k       | 6.3GiB/s  | 1.6M  | 938 usec                | 1221 usec
      64k      | 10.7GiB/s | 175k  | 8979 usec               | 12780 usec
      
      IO READS with DIM:
      blk size | BW        | IOPS  | 99th percentile latency | 99.99th latency
      512B     | 4GiB/s    | 8.2M  | 816 usec                | 889 usec
      4k       | 10.1GiB/s | 2.65M | 3359 usec               | 5080 usec
      64k      | 10.7GiB/s | 175k  | 9896 usec               | 10028 usec
      
      IO WRITES with DIM:
      blk size | BW        | IOPS  | 99th percentile latency | 99.99th latency
      512B     | 3.9GiB/s  | 8.1M  | 799 usec                | 922 usec
      4k       | 9.6GiB/s  | 2.5M  | 717 usec                | 1004 usec
      64k      | 10.7GiB/s | 176k  | 8586 usec               | 12256 usec
      
      The rdma_dim algorithm was designed to measure the effectiveness of
      moderation on the flow in a general way and thus should be appropriate
      for all RDMA storage protocols.
      
      rdma_dim is configured to be the default option based on performance
      improvement seen after extensive tests.
      Signed-off-by: Yamin Friedman <yaminf@mellanox.com>
      Reviewed-by: Max Gurtovoy <maxg@mellanox.com>
      Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
      Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
      Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
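The benchmark tables in the commit above are easier to compare as relative changes. The following is a minimal sketch, not part of the commit: the helper names `pct_change` and `print_4k_iops_change` are made up for illustration, and the inputs are the 4k IOPS figures (in millions) copied from the tables.

```c
#include <stdio.h>

/* Relative change from `before` to `after`, in percent. */
static double pct_change(double before, double after)
{
    return (after - before) / before * 100.0;
}

/* Print the change in 4k IOPS (values in millions, taken from the tables). */
static void print_4k_iops_change(void)
{
    printf("4k reads:  %+.1f%% IOPS\n", pct_change(1.8, 2.65));
    printf("4k writes: %+.1f%% IOPS\n", pct_change(1.6, 2.5));
}
```

With these inputs the 4k case improves by roughly 47% for reads and 56% for writes, while the 64k rows are essentially unchanged because the link is already saturated at 10.7GiB/s.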
    • IB/mlx5: Report correctly tag matching rendezvous capability · 89705e92
      Committed by Danit Goldberg
      Userspace expects the IB_TM_CAP_RC bit to indicate that the device
      supports RC transport tag matching with rendezvous offload. However, the
      firmware splits this into two capabilities, one for eager and one for
      rendezvous tag matching.
      
      Only if the FW supports both modes should userspace be told the tag
      matching capability is available.
      
      Cc: <stable@vger.kernel.org> # 4.13
      Fixes: eb761894 ("IB/mlx5: Fill XRQ capabilities")
      Signed-off-by: Danit Goldberg <danitg@mellanox.com>
      Reviewed-by: Yishai Hadas <yishaih@mellanox.com>
      Reviewed-by: Artemy Kovalyov <artemyko@mellanox.com>
      Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
      Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
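The rule described in the commit above (report the capability only when the firmware supports both modes) can be sketched in plain C. This is an illustration under assumptions, not the driver code: `demo_fw_caps`, `demo_tm_flags`, and the flag value `DEMO_TM_CAP_RC` are hypothetical stand-ins for the firmware capability bits and the userspace-visible flag.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag value, for illustration only. */
#define DEMO_TM_CAP_RC (1u << 0)

/* Firmware-side view: eager and rendezvous are separate capabilities. */
struct demo_fw_caps {
    bool eager_tag_matching; /* eager tag matching supported */
    bool rndv_offload_rc;    /* rendezvous offload over RC supported */
};

/* Report the combined capability only if both firmware modes are present. */
static uint32_t demo_tm_flags(const struct demo_fw_caps *fw)
{
    uint32_t flags = 0;

    if (fw->eager_tag_matching && fw->rndv_offload_rc)
        flags |= DEMO_TM_CAP_RC;
    return flags;
}
```

A device with eager support but no rendezvous offload therefore reports no tag matching flag, which matches what userspace expects the bit to mean.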
    • IB/mlx5: Implement VHCA tunnel mechanism in DEVX · b6142608
      Committed by Max Gurtovoy
      This mechanism allows function-A to perform operations "on behalf" of
      function-B via a tunnel object. Function-A has the privileges for creating
      and using this tunnel object.
      
      For example, with the device emulation capability presented in the
      Bluefield-1 SoC, one can present an NVMe function to the host OS.
      
      Since the NVMe function doesn't have a normal command interface to the HCA
      HW, there is a need to create a channel that can issue commands
      "on behalf" of this function.
      
      This channel is the VHCA_TUNNEL general object. The emulation software
      creates this tunnel for every managed function and issues commands via
      the DEVX general command interface using the appropriate tunnel ID. When a
      DEVX context receives a command with a non-zero vhca_tunnel_id, it passes
      the command as-is down to the HCA.
      
      All validation, security and resource tracking of the commands and the
      created tunneled objects is the responsibility of the HCA FW. When a
      VHCA_TUNNEL object is destroyed, the device issues an internal
      FLR (function level reset) to the emulated function associated with this
      tunnel. This destroys all the resources created through the tunnel
      mechanism.
      Signed-off-by: Max Gurtovoy <maxg@mellanox.com>
      Reviewed-by: Yishai Hadas <yishaih@mellanox.com>
      Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
      Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
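The pass-through decision described in the commit above depends only on whether the command carries a non-zero vhca_tunnel_id. A minimal sketch of that check follows. The header layout `demo_cmd_hdr` and the function `demo_forward_as_is` are hypothetical, since the real command format is defined by the device and is not shown in the commit message.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified command header (not the real device layout). */
struct demo_cmd_hdr {
    uint16_t opcode;
    uint16_t vhca_tunnel_id; /* 0 means the command is not tunneled */
};

/*
 * A command with a non-zero tunnel ID is forwarded to the HCA unmodified;
 * validation is then left to the firmware, as the commit message states.
 */
static bool demo_forward_as_is(const struct demo_cmd_hdr *hdr)
{
    return hdr->vhca_tunnel_id != 0;
}
```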
    • RDMA/rvt: Do not use a kernel header in the ABI · f10ff380
      Committed by Jason Gunthorpe
      rvt was using ib_sge as part of its ABI, which is not allowed. Introduce
      a new struct with the same layout and use it instead.
      
      Fixes: dabac6e4 ("IB/hfi1: Move receive work queue struct into uapi directory")
      Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
      Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
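The "same layout" requirement in the commit above can be checked at compile time. The sketch below is a userspace illustration, not the kernel code: `demo_kernel_sge` and `demo_abi_sge` are stand-ins, and the field list (a 64-bit address followed by a 32-bit length and a 32-bit key) is an assumption about what the shared layout looks like.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the kernel-internal struct (assumed fields). */
struct demo_kernel_sge {
    uint64_t addr;
    uint32_t length;
    uint32_t lkey;
};

/* Stand-in for the replacement struct owned by the ABI header. */
struct demo_abi_sge {
    uint64_t addr;
    uint32_t length;
    uint32_t lkey;
};

/* The build fails if the two layouts ever drift apart. */
_Static_assert(sizeof(struct demo_abi_sge) == sizeof(struct demo_kernel_sge),
               "size mismatch");
_Static_assert(offsetof(struct demo_abi_sge, addr) ==
               offsetof(struct demo_kernel_sge, addr), "addr offset mismatch");
_Static_assert(offsetof(struct demo_abi_sge, length) ==
               offsetof(struct demo_kernel_sge, length), "length offset mismatch");
_Static_assert(offsetof(struct demo_abi_sge, lkey) ==
               offsetof(struct demo_kernel_sge, lkey), "lkey offset mismatch");
```

Duplicating the struct keeps the ABI header independent of kernel-internal definitions, and the assertions guard the one property the two copies must share.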
  2. 08 Jul, 2019 1 commit
    • RDMA/siw: Fix DEFINE_PER_CPU compilation when ARCH_NEEDS_WEAK_PER_CPU · 4c7d6dcd
      Committed by Jason Gunthorpe
      The initializer for the variable cannot be inside the macro (and zero
      initialization isn't needed anyhow).
      
      include/linux/percpu-defs.h:92:33: warning: '__pcpu_unique_use_cnt' initialized and declared 'extern'
        extern __PCPU_DUMMY_ATTRS char __pcpu_unique_##name;  \
                                       ^~~~~~~~~~~~~~
      include/linux/percpu-defs.h:115:2: note: in expansion of macro 'DEFINE_PER_CPU_SECTION'
        DEFINE_PER_CPU_SECTION(type, name, "")
        ^~~~~~~~~~~~~~~~~~~~~~
      drivers/infiniband/sw/siw/siw_main.c:129:8: note: in expansion of macro 'DEFINE_PER_CPU'
       static DEFINE_PER_CPU(atomic_t, use_cnt = ATOMIC_INIT(0));
              ^~~~~~~~~~~~~~
      
      Also, the rules for PER_CPU require the variable names to be globally
      unique, so prefix them with siw_.
      
      Fixes: b9be6f18 ("rdma/siw: transmit path")
      Fixes: bdcf26bf ("rdma/siw: network and RDMA core interface")
      Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
      Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
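The warning quoted in the commit above comes from token pasting: the macro builds a helper symbol from the variable name, so an initializer written inside the macro argument ends up attached to an `extern` declaration. The sketch below reproduces the mechanism in userspace with a simplified, hypothetical macro `DEMO_DEFINE_PER_CPU`. It is not the kernel's definition and it creates ordinary globals, not per-CPU data.

```c
/* Simplified stand-in for the weak per-CPU definition macro. */
#define DEMO_DEFINE_PER_CPU(type, name)   \
    extern char demo_unique_##name;       \
    char demo_unique_##name;              \
    type name

/*
 * Problematic form: the initializer is part of the `name` argument, so
 *   DEMO_DEFINE_PER_CPU(int, use_cnt = 0);
 * would expand to
 *   extern char demo_unique_use_cnt = 0; ...
 * which is an initialized 'extern' declaration.
 */

/*
 * Fixed form: no initializer (objects with static storage duration start
 * out zeroed) and a prefixed name to keep the symbol globally unique.
 */
DEMO_DEFINE_PER_CPU(int, demo_siw_use_cnt);
```

The helper symbol is also why the name has to be globally unique: two files defining a variable with the same short name would both define the same `demo_unique_` symbol.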
  3. 07 Jul, 2019 5 commits
  4. 05 Jul, 2019 25 commits
  5. 04 Jul, 2019 4 commits