Commits · 6044414fa849e14fa0de60a75e3f85ea048c89db · openeuler / Kernel

04 Jul, 2019 5 commits

RDMA/hns: Remove set but not used variable 'fclr_write_fail_flag' · 6044414f

Committed by YueHaibing on Jul 03, 2019

Fixes gcc '-Wunused-but-set-variable' warning:

drivers/infiniband/hw/hns/hns_roce_hw_v2.c: In function 'hns_roce_function_clear':
drivers/infiniband/hw/hns/hns_roce_hw_v2.c:1135:7: warning:
 variable 'fclr_write_fail_flag' set but not used [-Wunused-but-set-variable]

It is never used, so can be removed.
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Reviewed-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>

RDMA/i40iw: Set queue pair state when being queried · 2e67e775

Committed by Liu, Changcheng on Jun 28, 2019

The API for ib_query_qp requires the driver to set qp_state and
cur_qp_state on return. Add the missing sets.

Fixes: d3749841 ("i40iw: add files for iwarp interface")
Signed-off-by: Changcheng Liu <changcheng.liu@aliyun.com>
Acked-by: Shiraz Saleem <shiraz.saleem@intel.com>
Reviewed-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>

IB/i40iw: Use kmemdup rather than open coding · cda8cf56

Committed by Fuqian Huang on Jul 04, 2019

Use kmemdup instead of kzalloc + memcpy.
Signed-off-by: Fuqian Huang <huangfq.daxian@gmail.com>
Reviewed-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
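The difference between the open-coded form and the helper can be illustrated with a minimal user-space sketch. It is not the i40iw change itself: calloc() and malloc() stand in for the kernel allocators, and the names dup_open_coded() and memdup_sketch() are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Open-coded form: allocate zeroed memory, then copy.
 * calloc() is a user-space stand-in for a zeroing kernel allocation. */
void *dup_open_coded(const void *src, size_t len)
{
	void *p = calloc(1, len);

	if (p)
		memcpy(p, src, len);	/* the size is spelled out a second time */
	return p;
}

/* Helper form: user-space stand-in for what kmemdup() provides.
 * No zeroing is needed because every byte is overwritten by the copy. */
void *memdup_sketch(const void *src, size_t len)
{
	void *p;

	assert(src != NULL);
	p = malloc(len);
	if (p)
		memcpy(p, src, len);
	return p;
}
```

Both functions return an independent copy that the caller must free; the helper form simply keeps the allocate-and-copy pattern in one place.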

IB/ipoib: Remove memset after vzalloc in ipoib_cm.c · 5d7d78ea

Committed by Fuqian Huang on Jun 28, 2019

vzalloc has already zeroed the memory, so the memset is unneeded.
Signed-off-by: Fuqian Huang <huangfq.daxian@gmail.com>
Reviewed-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>

IB: Remove unneeded memset · 4c44d463

Committed by Fuqian Huang on Jun 28, 2019

Since commit af7ddd8a ("Merge tag 'dma-mapping-4.21' of
git://git.infradead.org/users/hch/dma-mapping"),
dma_alloc_coherent/dmam_alloc_coherent always zero the returned memory.
So the memset after a coherent allocation function is not needed.
Signed-off-by: Fuqian Huang <huangfq.daxian@gmail.com>
Reviewed-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>

03 Jul, 2019 12 commits

Merge branch 'siw' into rdma.git for-next · c5cfcfcb

Committed by Jason Gunthorpe on Jul 02, 2019

Bernard Metzler says:

====================
This patch set contributes the SoftiWarp driver rebased for latest
rdma-next. SoftiWarp (siw) implements the iWarp RDMA protocol over kernel
TCP sockets. The driver integrates with the linux-rdma framework.

A matching userlevel driver is available as PR at
https://github.com/linux-rdma/rdma-core/pull/536

Many thanks for reviewing and testing the driver, especially to Leon,
Jason, Steve, Doug, Olga, Dennis, Gal. You all helped to significantly
improve the driver over the last year.

Please find below a list of changes and comments, compared to older
versions of the siw driver.

Many thanks!
Bernard.

CHANGES:
========

v3 (this version)
-----------------

- Rebased to rdma-next

- Removed unnecessary initialization of enums in siw-abi.h

- Added comment on sizing of all work queues to power of two.

v2
-----------------

- Changed receive path CRC calculation to compute CRC32c not
  on target buffer after placement, but on original skbuf.
  This change severely hurts performance, if CRC is switched
  on, since skb must now be walked twice. It is planned to
  work on an extension to skb_copy_bits() to fold in CRC
  computation.

- Moved debugging to using ibdev_dbg().

- Dropped detailed packet debug printing.

- Removed siw_debug.[ch] files.

- Removed resource tracking, code now relies on restrack of
  RDMA midlayer. Only object counting to enforce reported
  device limits is left in place.

- Removed all nested switch-case statements.

- Cleaned up header file #include's

- Moved CQ create/destroy to new semantics,
  where midlayer creates/destroys containing object.

- Set siw's ABI version to 1 (was 0 before)

- Removed all enum initialization where not needed.

- Fixed MAINTAINERS entry for siw driver

- This version stays with the current siw specific
  management of user memory (siw_umem_get() vs.
  ib_umem_get(), etc.), since the current ib_umem
  implementation is less efficient for user page lookup
  on the fast path, where efficiency is important for a
  SW RDMA driver.
  It is planned to contribute enhancements to the ib_umem
  framework, which make it suitable for SW drivers as well.

v1 (first version after v9 of siw RFC)
--------------------------------------

- Rebased to 5.2-rc1

- All IDR code got removed.

- Both MR and QP deallocation verbs now synchronously
  free the resources referenced by the RDMA mid-layer.

- IPv6 support was added.

- For compatibility with Chelsio iWarp hardware, the RX
  path was slightly reworked. It now allows packet intersection
  between tagged and untagged RDMAP operations. While not
  a defined behavior as of IETF RFC 5040/5041, some RDMA hardware
  may intersect an ongoing outbound (large) tagged message, such
  as a multisegment RDMA Read Response with sending an untagged
  message, such as an RDMA Send frame. This behavior was only
  detected in an NVMeF setup, where siw was used at target side,
  and RDMA hardware at client side (during file write). siw now
  implements two input paths for tagged and untagged messages each,
  and allows the intersected placement of both messages.

- The siw kernel abi file got renamed from siw_user.h to siw-abi.h.
====================

* branch 'siw':
  SIW addition to kernel build environment
  SIW completion queue methods
  SIW receive path
  SIW transmit path
  SIW queue pair methods
  SIW application buffer management
  SIW application interface
  SIW connection management
  SIW network and RDMA core interface
  SIW main include file
  iWarp wire packet format

rdma/siw: addition to kernel build environment · c0cf5bdd

Committed by Bernard Metzler on Jun 20, 2019

Broken up commit to add the Soft iWarp RDMA driver.
Signed-off-by: Bernard Metzler <bmt@zurich.ibm.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>

rdma/siw: completion queue methods · b0fff731

Committed by Bernard Metzler on Jun 20, 2019

Broken up commit to add the Soft iWarp RDMA driver.
Signed-off-by: Bernard Metzler <bmt@zurich.ibm.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>

rdma/siw: receive path · 8b6a361b

Committed by Bernard Metzler on Jun 20, 2019

Broken up commit to add the Soft iWarp RDMA driver.
Signed-off-by: Bernard Metzler <bmt@zurich.ibm.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>

rdma/siw: transmit path · b9be6f18

Committed by Bernard Metzler on Jun 20, 2019

Broken up commit to add the Soft iWarp RDMA driver.
Signed-off-by: Bernard Metzler <bmt@zurich.ibm.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>

rdma/siw: queue pair methods · f29dd55b

Committed by Bernard Metzler on Jun 20, 2019

Broken up commit to add the Soft iWarp RDMA driver.
Signed-off-by: Bernard Metzler <bmt@zurich.ibm.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>

rdma/siw: application buffer management · 2251334d

Committed by Bernard Metzler on Jun 20, 2019

Broken up commit to add the Soft iWarp RDMA driver.
Signed-off-by: Bernard Metzler <bmt@zurich.ibm.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>

rdma/siw: application interface · 303ae1cd

Committed by Bernard Metzler on Jun 20, 2019

Broken up commit to add the Soft iWarp RDMA driver.
Signed-off-by: Bernard Metzler <bmt@zurich.ibm.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>

rdma/siw: connection management · 6c52fdc2

Committed by Bernard Metzler on Jun 20, 2019

Broken up commit to add the Soft iWarp RDMA driver.
Signed-off-by: Bernard Metzler <bmt@zurich.ibm.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>

rdma/siw: network and RDMA core interface · bdcf26bf

Committed by Bernard Metzler on Jun 20, 2019

Broken up commit to add the Soft iWarp RDMA driver.
Signed-off-by: Bernard Metzler <bmt@zurich.ibm.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>

rdma/siw: main include file · a5319752

Committed by Bernard Metzler on Jun 20, 2019

Broken up commit to add the Soft iWarp RDMA driver.
Signed-off-by: Bernard Metzler <bmt@zurich.ibm.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>

rdma/siw: iWarp wire packet format · 0e935ae6

Committed by Bernard Metzler on Jun 20, 2019

Broken up commit to add the Soft iWarp RDMA driver.
Signed-off-by: Bernard Metzler <bmt@zurich.ibm.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>

29 Jun, 2019 13 commits

IB/hfi1: No need to use try_module_get for debugfs · 09fbca8e

Committed by Dennis Dalessandro on Jun 28, 2019

The call in debugfs.c for try_module_get() is not needed. A reference to
the module will be taken by the VFS layer as long as the owner field is
set in the file ops struct. So set this as well as remove the call.
Suggested-by: Jason Gunthorpe <jgg@mellanox.com>
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>

IB/rdmavt: Add trace for map_mr_sg · 8bd516bd

Committed by Mike Marciniszyn on Jun 28, 2019

Add trace to debug map_mr_sg handling.
Reviewed-by: Kaike Wan <kaike.wan@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>

IB/rdmavt: Enhance trace information for FRWR debug · 315aed11

Committed by Mike Marciniszyn on Jun 28, 2019

This patch enhances the MR trace information to enable more focused debug
of MR issues.
Reviewed-by: Kaike Wan <kaike.wan@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>

IB/hfi1: Add missing INVALIDATE opcodes for trace · aa9b79ec

Committed by Mike Marciniszyn on Jun 28, 2019

This was missed in the original implementation of the memory management
extensions.

Fixes: 0db3dfa0 ("IB/hfi1: Work request processing for fast register mr and invalidate")
Reviewed-by: Michael J. Ruhl <michael.j.ruhl@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>

IB/hfi1: Reduce excessive aspm inlines · bf3b1e0c

Committed by Michael J. Ruhl on Jun 28, 2019

Uninline the aspm API since it increases code space for no reason.

Move the aspm module param to the new aspm C file.
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Michael J. Ruhl <michael.j.ruhl@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>

IB/{rdmavt, hfi1, qib}: Add helpers to hide SWQE WR details · 2b0ad2da

Committed by Michael J. Ruhl on Jun 28, 2019

Add some helper functions to hide struct rvt_swqe details.
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Michael J. Ruhl <michael.j.ruhl@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>

IB/{rdmavt, hfi1, qib}: Remove AH refcount for UD QPs · d310c4bf

Committed by Michael J. Ruhl on Jun 28, 2019

Historically rdmavt destroy_ah() has returned an -EBUSY when the AH has a
non-zero reference count.  IBTA 11.2.2 notes no such return value or error
case:

	Output Modifiers:
	- Verb results:
	- Operation completed successfully.
	- Invalid HCA handle.
	- Invalid address handle.

ULPs never test for this error and this will leak memory.

The reference count exists to allow for driver independent progress
mechanisms to process UD SWQEs in parallel with post sends.  The SWQE will
hold a reference count until the UD SWQE completes and then drops the
reference.

Fix by removing need to reference count the AH.  Add a UD specific
allocation to each SWQE entry to cache the necessary information for
independent progress.  Copy the information during the post send
processing.
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Michael J. Ruhl <michael.j.ruhl@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>

IB/rdmavt: Set QP allowed opcodes after QP allocation · fe2ac047

Committed by Michael J. Ruhl on Jun 28, 2019

Currently QP allowed_ops is set after the QP is completely initialized.
This curtails the use of this optimization for any initialization before
allowed_ops is set.

Fix by adding a helper to determine the correct allowed_ops and moving the
setting of the allowed_ops to just after QP allocation.
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Michael J. Ruhl <michael.j.ruhl@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>

IB/{hfi1, qib, rdmavt}: Put qp in error state when cq is full · 5136bfea

Committed by Kamenee Arumugam on Jun 28, 2019

When a completion queue is full, the associated queue pairs are not put
into the error state. According to the IBTA specification, this is a
violation.

Quote from IBTA spec:
C9-218: A Requester Class F error occurs when the CQ is inaccessible or
full and an attempt is made to complete a WQE. The Affected QP shall be
moved to the error state and affiliated asynchronous errors generated as
described in 11.6.3.1 Affiliated Asynchronous Events on page 678. The
current WQE and any subsequent WQEs are left in an unknown state.

C11-37: The CI shall generate a CQ Error when a CQ overrun is
detected. This condition will result in an Affiliated Asynchronous Error
for any associated Work Queues when they attempt to use that
CQ. Completions can no longer be added to the CQ. It is not guaranteed
that completions present in the CQ at the time the error occurred can be
retrieved. Possible causes include a CQ overrun or a CQ protection error.

Put the qp in error state when cq is full. Implement a state called full
to continue to put other associated QPs in error state.
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Reviewed-by: Michael J. Ruhl <michael.j.ruhl@intel.com>
Signed-off-by: Kamenee Arumugam <kamenee.arumugam@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>

IB/rdmavt: Fracture single lock used for posting and processing RWQEs · f592ae3c

Committed by Kamenee Arumugam on Jun 28, 2019

Using a single lock prevents posting and processing of receive work
queue entries from progressing simultaneously, which impacts overall
performance.

Fracture the single lock used for posting and processing Receive Work
Queue Entries (RWQEs) to allow the circular buffer to be filled and
emptied at the same time. Two new spinlocks, one for the producers and
one for the consumers, are used for posting and processing RWQEs
simultaneously, and the two indices are defined on two different cache
lines. The threshold count is used to avoid reading the other index in
a different cache line every time.
Signed-off-by: Harish Chegondi <harish.chegondi@intel.com>
Signed-off-by: Kamenee Arumugam <kamenee.arumugam@intel.com>
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>

IB/hfi1: Move receive work queue struct into uapi directory · dabac6e4

Committed by Kamenee Arumugam on Jun 28, 2019

The rvt_rwqe and rvt_rwq struct elements are shared between rdmavt and the
providers but are not in the uapi directory.  As per the comment in
https://marc.info/?l=linux-rdma&m=152296522708522&w=2, the hfi1 driver and
the rdma core driver are not using shared structures in the uapi
directory.

Move the rvt_rwqe and rvt_rwq structs into the rvt-abi.h header in the uapi
directory.
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Reviewed-by: Michael J. Ruhl <michael.j.ruhl@intel.com>
Signed-off-by: Kamenee Arumugam <kamenee.arumugam@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>

IB/hfi1: Move rvt_cq_wc struct into uapi directory · 239b0e52

Committed by Kamenee Arumugam on Jun 28, 2019

The rvt_cq_wc struct elements are shared between rdmavt and the providers
but are not in the uapi directory.  As per the comment in
https://marc.info/?l=linux-rdma&m=152296522708522&w=2, the hfi1 driver and
the rdma core driver are not using shared structures in the uapi
directory.

In that case, move the rvt_cq_wc struct into the rvt-abi.h header file and
create a rvt_k_cq_w for the kernel completion queue.
Signed-off-by: Kamenee Arumugam <kamenee.arumugam@intel.com>
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>

Merge tag 'v5.2-rc6' into rdma.git for-next · 371bb621

Committed by Jason Gunthorpe on Jun 28, 2019

For dependencies in next patches.

Resolve conflicts:
- Use uverbs_get_cleared_udata() with new cq allocation flow
- Continue to delete nes despite SPDX conflict
- Resolve list appends in mlx5_command_str()
- Use u16 for vport_rule stuff
- Resolve list appends in struct ib_client
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>

26 Jun, 2019 3 commits

RDMA/hns: fix spelling mistake "attatch" -> "attach" · 10dcc744

Committed by Colin Ian King on Jun 24, 2019

There is a spelling mistake in a dev_err message. Fix it.
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>

RDMA/netlink: Audit policy settings for netlink attributes · 34d65cd8

Committed by Doug Ledford on Jun 21, 2019

For all string attributes for which we don't currently accept the element
as input, we only use it as output, set the string length to
RDMA_NLDEV_ATTR_EMPTY_STRING which is defined as 1. That way we will only
accept a null string for that element. This will prevent someone from
writing a new input routine that uses the element without also updating
the policy to have a valid value.

Also while there, make sure the existing entries that are valid have the
correct policy, if not, correct the policy. Remove unnecessary checks
for nla_strlcpy() overflow once the policy has been set correctly.
Signed-off-by: Doug Ledford <dledford@redhat.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>

RDMA/hns: Cleanup unnecessary exported symbols · e9816ddf

Committed by Lijun Ou on Jun 19, 2019

This patch removes hns-roce.ko in order to clean up all the exported
symbols in the common part.
Signed-off-by: Xi Wang <wangxi11@huawei.com>
Signed-off-by: Lijun Ou <oulijun@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>

25 Jun, 2019 3 commits

docs: infiniband: convert docs to ReST and rename to *.rst · 97162a1e

Committed by Mauro Carvalho Chehab on Jun 08, 2019

The InfiniBand docs are plain text with no markups. So, all we needed to
do was to add the title markups and some markup sequences in order to
properly parse tables, lists and literal blocks.

At its new index.rst, let's add a :orphan: while this is not linked to the
main index.rst file, in order to avoid build warnings.
Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>

RDMA/hns: Fix an error code in hns_roce_set_user_sq_size() · b417c087

Committed by Dan Carpenter on Jun 08, 2019

This function is supposed to return negative kernel error codes but here
it returns CMD_RST_PRC_EBUSY (2).  The error code eventually gets passed
to IS_ERR() and since it's not an error pointer it leads to an Oops in
hns_roce_v1_rsv_lp_qp().
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
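Why a positive status code slips past IS_ERR() can be shown with a small user-space model of the kernel's error-pointer convention. This is a sketch under assumptions: err_ptr() and is_err() are hypothetical stand-ins for ERR_PTR() and IS_ERR(), and CMD_BUSY_SKETCH is an illustrative constant taking the value 2 quoted in the message.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* Illustrative stand-in for the positive status code (2) named above. */
#define CMD_BUSY_SKETCH 2

/* Encode an error code in a pointer value, in the style of ERR_PTR(). */
void *err_ptr(long error)
{
	return (void *)(intptr_t)error;
}

/* True only for the top MAX_ERRNO addresses, i.e. negative error codes. */
int is_err(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}
```

In this model is_err(err_ptr(-EBUSY)) is true, while is_err(err_ptr(CMD_BUSY_SKETCH)) is false, so a caller would go on to treat the value 2 as a valid pointer and dereference it.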

RDMA/hns: fix potential integer overflow on left shift · 7ef75875

Committed by Colin Ian King on Jun 24, 2019

There is a potential integer overflow when int i is left shifted as this
is evaluated using 32 bit arithmetic but is being used in a context that
expects an expression of type dma_addr_t. Fix this by casting integer i
to dma_addr_t before shifting to avoid the overflow.

Addresses-Coverity: ("Unintentional integer overflow")
Fixes: 2ac0bc5e ("RDMA/hns: Add a group interfaces for optimizing buffers getting flow")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
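The effect of widening before the shift can be reproduced in a few lines of user-space C. This is a sketch, not the hns code: dma_addr_sketch_t is an assumed 64-bit stand-in for dma_addr_t, the function names are hypothetical, and an unsigned 32-bit index is used so that the narrow shift wraps in a well-defined way (with a signed int, as in the message, the overflow would be undefined behavior).

```c
#include <assert.h>
#include <stdint.h>

/* Assumed 64-bit stand-in for the kernel's dma_addr_t. */
typedef uint64_t dma_addr_sketch_t;

/* Shift evaluated in 32-bit arithmetic; widening happens too late. */
dma_addr_sketch_t offset_narrow(uint32_t i, unsigned int shift)
{
	assert(shift < 32);
	return i << shift;
}

/* Widen first, then shift, so the high bits survive. */
dma_addr_sketch_t offset_wide(uint32_t i, unsigned int shift)
{
	assert(shift < 64);
	return (dma_addr_sketch_t)i << shift;
}
```

For example, with i = 1 << 20 and a shift of 12, the narrow version loses the only set bit and yields 0, while the widened version yields 0x100000000.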

24 Jun, 2019 4 commits

RDMA/mlx5: Refactor MR descriptors allocation · 7796d2a3

Committed by Max Gurtovoy on Jun 11, 2019

Improve code readability using static helpers for each memory region
type. Re-use the common logic to get smaller functions that are easy
to maintain and reduce code duplication.
Signed-off-by: Max Gurtovoy <maxg@mellanox.com>
Signed-off-by: Israel Rukshin <israelr@mellanox.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>

RDMA/mlx5: Use PA mapping for PI handover · 2563e2f3

Committed by Max Gurtovoy on Jun 11, 2019

If possible, avoid doing a UMR operation to register data and protection
buffers (via MTT/KLM mkeys). Instead, use the local DMA key and map the
SG lists using PA access. This is safe, since the internal key for data
and protection is never exposed to the remote server (only the signature
key might be exposed). If PA mappings are not possible, perform mapping
using MTT/KLM descriptors.

The setup of the tested benchmark (using iSER ULP):
 - 2 servers with 24 cores (1 initiator and 1 target)
 - ConnectX-4/ConnectX-5 adapters
 - 24 target sessions with 1 LUN each
 - ramdisk backstore
 - PI active

Performance results running fio (24 jobs, 128 iodepth) using
write_generate=1 and read_verify=1 (w/w.o patch):

bs      IOPS(read)        IOPS(write)
----    ----------        ----------
512   1266.4K/1262.4K    1720.1K/1732.1K
4k    793139/570902      1129.6K/773982
32k   72660/72086        97229/96164

Using write_generate=0 and read_verify=0 (w/w.o patch):
bs      IOPS(read)        IOPS(write)
----    ----------        ----------
512   1590.2K/1600.1K    1828.2K/1830.3K
4k    1078.1K/937272     1142.1K/815304
32k   77012/77369        98125/97435
Signed-off-by: Max Gurtovoy <maxg@mellanox.com>
Signed-off-by: Israel Rukshin <israelr@mellanox.com>
Suggested-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>

RDMA/mlx5: Improve PI handover performance · de0ae958

Committed by Israel Rukshin on Jun 11, 2019

In some loads, there is performance degradation when using KLM mkey
instead of MTT mkey. This is because KLM descriptor access is via
indirection that might require more HW resources and cycles.
Using a KLM descriptor is not necessary when there are no gaps in the
data/metadata sg lists. As an optimization, use MTT mkey whenever it
is possible. For that matter, allocate an internal MTT mkey and choose the
effective pi_mr for the transaction according to the required mapping
scheme.

The setup of the tested benchmark (using iSER ULP):
 - 2 servers with 24 cores (1 initiator and 1 target)
 - ConnectX-4/ConnectX-5 adapters
 - 24 target sessions with 1 LUN each
 - ramdisk backstore
 - PI active

Performance results running fio (24 jobs, 128 iodepth) using
write_generate=1 and read_verify=1 (w/w.o/baseline):

bs      IOPS(read)                IOPS(write)
----    ----------                ----------
512   1262.4K/1243.3K/1147.1K    1732.1K/1725.1K/1423.8K
4k    570902/571233/457874       773982/743293/642080
32k   72086/72388/71933          96164/71789/93249

Using write_generate=0 and read_verify=0 (w/w.o/baseline):
bs      IOPS(read)                IOPS(write)
----    ----------                ----------
512   1600.1K/1572.1K/1393.3K    1830.3K/1823.5K/1557.2K
4k    937272/921992/762934       815304/753772/646071
32k   77369/75052/72058          97435/73180/94612
Signed-off-by: Israel Rukshin <israelr@mellanox.com>
Reviewed-by: Max Gurtovoy <maxg@mellanox.com>
Suggested-by: Max Gurtovoy <maxg@mellanox.com>
Suggested-by: Idan Burstein <idanb@mellanox.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>

RDMA/mlx5: Remove unused IB_WR_REG_SIG_MR code · 5c171cbe

Committed by Israel Rukshin on Jun 11, 2019

IB_WR_REG_SIG_MR is no longer needed now that IB_WR_REG_MR_INTEGRITY
is used.
Signed-off-by: Israel Rukshin <israelr@mellanox.com>
Reviewed-by: Max Gurtovoy <maxg@mellanox.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>

openeuler / Kernel: synced successfully almost 2 years ago