1. 28 September 2021, 2 commits
  2. 24 September 2021, 3 commits
    • RDMA/usnic: Lock VF with mutex instead of spinlock · a86cd017
      Committed by Leon Romanovsky
      The usnic VF doesn't need a lock in atomic context to create QPs, so it
      is safe to use a mutex instead of a spinlock. This change fixes the
      following smatch error.
      
      Smatch static checker warning:
      
         lib/kobject.c:289 kobject_set_name_vargs()
          warn: sleeping in atomic context
      
      Fixes: 514aee66 ("RDMA: Globally allocate and release QP memory")
      Link: https://lore.kernel.org/r/2a0e295786c127e518ebee8bb7cafcb819a625f6.1631520231.git.leonro@nvidia.com
      Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
      Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
      Reviewed-by: Håkon Bugge <haakon.bugge@oracle.com>
      Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
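      The substitution above can be sketched in plain userspace C with POSIX threads; the struct and function names are hypothetical stand-ins for the driver's, and the point is only that work which may sleep is legal inside a mutex-protected section, while it would be forbidden under a spinlock:

```c
#include <pthread.h>

/* Hypothetical sketch: the VF lock only serializes QP setup, which
 * runs in process context and may sleep, so a sleeping lock (mutex)
 * is the right primitive.  A spinlock would make any sleeping
 * allocation inside the critical section illegal. */
struct vf_ctx {
	pthread_mutex_t lock;	/* was a spinlock before the change */
	int qp_count;
};

static int vf_create_qp(struct vf_ctx *vf)
{
	pthread_mutex_lock(&vf->lock);
	/* ...allocations that may sleep are now legal here... */
	vf->qp_count++;
	pthread_mutex_unlock(&vf->lock);
	return 0;
}
```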
    • RDMA/hns: Work around broken constant propagation in gcc 8 · 14351f08
      Committed by Jason Gunthorpe
      gcc 8.3 and 5.4 throw this:
      
      In function 'modify_qp_init_to_rtr',
      ././include/linux/compiler_types.h:322:38: error: call to '__compiletime_assert_1859' declared with attribute error: FIELD_PREP: value too large for the field
        _compiletime_assert(condition, msg, __compiletime_assert_, __COUNTER__)
      [..]
      drivers/infiniband/hw/hns/hns_roce_common.h:91:52: note: in expansion of macro 'FIELD_PREP'
         *((__le32 *)ptr + (field_h) / 32) |= cpu_to_le32(FIELD_PREP(   \
                                                          ^~~~~~~~~~
      drivers/infiniband/hw/hns/hns_roce_common.h:95:39: note: in expansion of macro '_hr_reg_write'
       #define hr_reg_write(ptr, field, val) _hr_reg_write(ptr, field, val)
                                             ^~~~~~~~~~~~~
      drivers/infiniband/hw/hns/hns_roce_hw_v2.c:4412:2: note: in expansion of macro 'hr_reg_write'
        hr_reg_write(context, QPC_LP_PKTN_INI, lp_pktn_ini);
      
      Because gcc has miscalculated the constantness of lp_pktn_ini:
      
      	mtu = ib_mtu_enum_to_int(ib_mtu);
      	if (WARN_ON(mtu < 0)) [..]
      	lp_pktn_ini = ilog2(MAX_LP_MSG_LEN / mtu);
      
      Since mtu is limited to {256,512,1024,2048,4096}, lp_pktn_ini is between
      4 and 8, which fits in the 4-bit field in the FIELD_PREP.
      
      Work around this broken compiler by adding a 'can never be true'
      constraint on lp_pktn_ini's value, which resolves the problem.
      
      Fixes: f0cb411a ("RDMA/hns: Use new interface to modify QP context")
      Link: https://lore.kernel.org/r/0-v1-c773ecb137bc+11f-hns_gcc8_jgg@nvidia.com
      Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
      Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
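      The arithmetic and the added bound can be illustrated with a small userspace sketch; the value of MAX_LP_MSG_LEN here is an assumption inferred from the 4-to-8 range stated above, and ilog2_u32() is a stand-in for the kernel's ilog2():

```c
/* Hedged sketch of the computation plus a "can never be true" bound
 * that lets the compiler prove the result fits a 4-bit field. */
#define MAX_LP_MSG_LEN 0x10000	/* assumed value, for illustration */

static unsigned int ilog2_u32(unsigned int v)
{
	unsigned int r = 0;

	while (v >>= 1)
		r++;
	return r;
}

static int compute_lp_pktn_ini(int mtu)
{
	/* Valid IB MTUs are 256..4096; the explicit bound both rejects
	 * bad input and constrains the compiler's value analysis. */
	if (mtu < 256 || mtu > 4096)
		return -1;

	return (int)ilog2_u32(MAX_LP_MSG_LEN / mtu);	/* 4..8 */
}
```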
    • RDMA/cma: Ensure rdma_addr_cancel() happens before issuing more requests · 305d568b
      Committed by Jason Gunthorpe
      The FSM can run in a circle allowing rdma_resolve_ip() to be called twice
      on the same id_priv. While this cannot happen without going through the
      work, it violates the invariant that the same address resolution
      background request cannot be active twice.
      
             CPU 1                                  CPU 2
      
      rdma_resolve_addr():
        RDMA_CM_IDLE -> RDMA_CM_ADDR_QUERY
        rdma_resolve_ip(addr_handler)  #1
      
      			 process_one_req(): for #1
                                addr_handler():
                                  RDMA_CM_ADDR_QUERY -> RDMA_CM_ADDR_BOUND
                                  mutex_unlock(&id_priv->handler_mutex);
                                  [.. handler still running ..]
      
      rdma_resolve_addr():
        RDMA_CM_ADDR_BOUND -> RDMA_CM_ADDR_QUERY
        rdma_resolve_ip(addr_handler)
          !! two requests are now on the req_list
      
      rdma_destroy_id():
       destroy_id_handler_unlock():
        _destroy_id():
         cma_cancel_operation():
          rdma_addr_cancel()
      
                                // process_one_req() self removes it
      		          spin_lock_bh(&lock);
                                 cancel_delayed_work(&req->work);
      	                   if (!list_empty(&req->list)) == true
      
      ! rdma_addr_cancel() returns after process_one_req #1 is done
      
         kfree(id_priv)
      
      			 process_one_req(): for #2
                                addr_handler():
      	                    mutex_lock(&id_priv->handler_mutex);
                                  !! Use after free on id_priv
      
      rdma_addr_cancel() expects there to be one req on the list and only
      cancels the first one. The self-removal behavior of the work only happens
      after the handler has returned. This yields a situation where the
      req_list can have two reqs for the same "handle" but rdma_addr_cancel()
      only cancels the first one.
      
      The second req remains active beyond rdma_destroy_id() and will
      use-after-free id_priv once it inevitably triggers.
      
      Fix this by remembering if the id_priv has called rdma_resolve_ip() and
      always cancel before calling it again. This ensures the req_list never
      gets more than one item in it and doesn't cost anything in the normal flow
      that never uses this strange error path.
      
      Link: https://lore.kernel.org/r/0-v1-3bc675b8006d+22-syz_cancel_uaf_jgg@nvidia.com
      Cc: stable@vger.kernel.org
      Fixes: e51060f0 ("IB: IP address based RDMA connection manager")
      Reported-by: syzbot+dc3dfba010d7671e05f5@syzkaller.appspotmail.com
      Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
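      The invariant the fix restores ("the req_list never holds more than one request per id") can be sketched abstractly; the struct, helpers, and counter are illustrative stand-ins, not the cma code:

```c
#include <stdbool.h>

/* Hypothetical model: "pending" counts this id's entries on the
 * req_list; "used_resolve" remembers that resolution was issued. */
struct id_ctx {
	bool used_resolve;	/* rdma_resolve_ip() was called before */
	int pending;
};

static void addr_cancel(struct id_ctx *id)
{
	if (id->pending)
		id->pending--;
}

static void resolve_addr(struct id_ctx *id)
{
	if (id->used_resolve)
		addr_cancel(id);	/* flush any old request first */
	id->used_resolve = true;
	id->pending++;			/* at most one is ever queued */
}
```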
  3. 23 September 2021, 1 commit
    • RDMA/cma: Do not change route.addr.src_addr.ss_family · bc0bdc5a
      Committed by Jason Gunthorpe
      If the state is not idle then rdma_bind_addr() will immediately fail and
      no change to global state should happen.
      
      For instance if the state is already RDMA_CM_LISTEN then this will corrupt
      the src_addr and would cause the test in cma_cancel_operation():
      
      		if (cma_any_addr(cma_src_addr(id_priv)) && !id_priv->cma_dev)
      
      to see a mangled src_addr, e.g. an IPv6 loopback address paired with an
      IPv4 family, and so fail the test.
      
      This would manifest as this trace from syzkaller:
      
        BUG: KASAN: use-after-free in __list_add_valid+0x93/0xa0 lib/list_debug.c:26
        Read of size 8 at addr ffff8881546491e0 by task syz-executor.1/32204
      
        CPU: 1 PID: 32204 Comm: syz-executor.1 Not tainted 5.12.0-rc8-syzkaller #0
        Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
        Call Trace:
         __dump_stack lib/dump_stack.c:79 [inline]
         dump_stack+0x141/0x1d7 lib/dump_stack.c:120
         print_address_description.constprop.0.cold+0x5b/0x2f8 mm/kasan/report.c:232
         __kasan_report mm/kasan/report.c:399 [inline]
         kasan_report.cold+0x7c/0xd8 mm/kasan/report.c:416
         __list_add_valid+0x93/0xa0 lib/list_debug.c:26
         __list_add include/linux/list.h:67 [inline]
         list_add_tail include/linux/list.h:100 [inline]
         cma_listen_on_all drivers/infiniband/core/cma.c:2557 [inline]
         rdma_listen+0x787/0xe00 drivers/infiniband/core/cma.c:3751
         ucma_listen+0x16a/0x210 drivers/infiniband/core/ucma.c:1102
         ucma_write+0x259/0x350 drivers/infiniband/core/ucma.c:1732
         vfs_write+0x28e/0xa30 fs/read_write.c:603
         ksys_write+0x1ee/0x250 fs/read_write.c:658
         do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46
         entry_SYSCALL_64_after_hwframe+0x44/0xae
      
      Which is indicating that an rdma_id_private was destroyed without doing
      cma_cancel_listens().
      
      Instead of trying to re-use the src_addr memory to indirectly create an
      any address, build one explicitly on the stack and bind to that, as any
      other normal flow would do.
      
      Link: https://lore.kernel.org/r/0-v1-9fbb33f5e201+2a-cma_listen_jgg@nvidia.com
      Cc: stable@vger.kernel.org
      Fixes: 732d41c5 ("RDMA/cma: Make the locking for automatic state transition more clear")
      Reported-by: syzbot+6bb0528b13611047209c@syzkaller.appspotmail.com
      Tested-by: Hao Sun <sunhao.th@gmail.com>
      Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
      Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
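      The stack-built wildcard address described above might look like this in plain socket terms; this is a hedged sketch of the approach, not the cma implementation:

```c
#include <string.h>
#include <sys/socket.h>
#include <netinet/in.h>

/* Build the "any" address in a local sockaddr_storage instead of
 * rewriting the id's stored src_addr, so the stored address and its
 * family are never mangled. */
static void build_any_addr(struct sockaddr_storage *out, int family)
{
	memset(out, 0, sizeof(*out));
	if (family == AF_INET) {
		struct sockaddr_in *in4 = (struct sockaddr_in *)out;

		in4->sin_family = AF_INET;
		in4->sin_addr.s_addr = htonl(INADDR_ANY);
	} else {
		struct sockaddr_in6 *in6 = (struct sockaddr_in6 *)out;

		in6->sin6_family = AF_INET6;
		in6->sin6_addr = in6addr_any;
	}
}
```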
  4. 21 September 2021, 4 commits
  5. 15 September 2021, 2 commits
    • RDMA/cma: Fix listener leak in rdma_cma_listen_on_all() failure · ca465e1f
      Committed by Tao Liu
      If cma_listen_on_all() fails it leaves the per-device ID still on the
      listen_list but the state is not set to RDMA_CM_ADDR_BOUND.
      
      When the cmid is eventually destroyed cma_cancel_listens() is not called
      due to the wrong state, however the per-device IDs are still holding the
      refcount preventing the ID from being destroyed, thus deadlocking:
      
       task:rping state:D stack:   0 pid:19605 ppid: 47036 flags:0x00000084
       Call Trace:
        __schedule+0x29a/0x780
        ? free_unref_page_commit+0x9b/0x110
        schedule+0x3c/0xa0
        schedule_timeout+0x215/0x2b0
        ? __flush_work+0x19e/0x1e0
        wait_for_completion+0x8d/0xf0
        _destroy_id+0x144/0x210 [rdma_cm]
        ucma_close_id+0x2b/0x40 [rdma_ucm]
        __destroy_id+0x93/0x2c0 [rdma_ucm]
        ? __xa_erase+0x4a/0xa0
        ucma_destroy_id+0x9a/0x120 [rdma_ucm]
        ucma_write+0xb8/0x130 [rdma_ucm]
        vfs_write+0xb4/0x250
        ksys_write+0xb5/0xd0
        ? syscall_trace_enter.isra.19+0x123/0x190
        do_syscall_64+0x33/0x40
        entry_SYSCALL_64_after_hwframe+0x44/0xa9
      
      Ensure that cma_listen_on_all() atomically unwinds its actions under the
      lock on error.
      
      Fixes: c80a0c52 ("RDMA/cma: Add missing error handling of listen_id")
      Link: https://lore.kernel.org/r/20210913093344.17230-1-thomas.liu@ucloud.cn
      Signed-off-by: Tao Liu <thomas.liu@ucloud.cn>
      Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
      Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
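      The unwind-under-lock pattern can be sketched abstractly; the devices are modeled as an array and every name here is hypothetical, not the cma code:

```c
/* If attaching a listener to one device fails, remove every listener
 * added so far before dropping the lock, so no half-registered ids
 * remain holding references that would later deadlock destroy. */
#define NDEV 4

static int attach(int dev, int fail_at)
{
	return dev == fail_at ? -1 : 0;
}

static int listen_on_all(int added[NDEV], int fail_at)
{
	int i;

	/* lock is held across the loop, including the unwind */
	for (i = 0; i < NDEV; i++) {
		if (attach(i, fail_at) < 0) {
			while (i-- > 0)
				added[i] = 0;	/* undo prior attaches */
			return -1;
		}
		added[i] = 1;
	}
	return 0;
}
```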
    • IB/cma: Do not send IGMP leaves for sendonly Multicast groups · 2cc74e1e
      Committed by Christoph Lameter
      ROCE uses IGMP for Multicast instead of the native Infiniband system where
      joins are required in order to post messages on the Multicast group.  On
      Ethernet one can send Multicast messages to arbitrary addresses without
      the need to subscribe to a group.
      
      So ROCE correctly does not send IGMP joins during rdma_join_multicast().
      
      For example, in cma_iboe_join_multicast() we see:
      
      	if (addr->sa_family == AF_INET) {
      		if (gid_type == IB_GID_TYPE_ROCE_UDP_ENCAP) {
      			ib.rec.hop_limit = IPV6_DEFAULT_HOPLIMIT;
      			if (!send_only) {
      				err = cma_igmp_send(ndev, &ib.rec.mgid,
      						    true);
      			}
      		}
      	} else {
      
      So the IGMP join is suppressed as it is unnecessary.
      
      However, no such check is done in destroy_mc(), and therefore leaving a
      sendonly multicast group will send an IGMP leave.
      
      This means that the following scenario can lead to a multicast receiver
      unexpectedly being unsubscribed from a MC group:
      
      1. Sender thread does a sendonly join on MC group X. No IGMP join
         is sent.
      
      2. Receiver thread does a regular join on the same MC group X.
         IGMP join is sent and the receiver begins to get messages.
      
      3. Sender thread terminates and destroys MC group X.
         IGMP leave is sent and the receiver no longer receives data.
      
      This patch adds the same logic for sendonly joins to destroy_mc() that is
      also used in cma_iboe_join_multicast().
      
      Fixes: ab15c95a ("IB/core: Support for CMA multicast join flags")
      Link: https://lore.kernel.org/r/alpine.DEB.2.22.394.2109081340540.668072@gentwo.de
      Signed-off-by: Christoph Lameter <cl@linux.com>
      Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
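      The leave-side check the patch adds mirrors the join-side condition quoted above; as a hedged sketch with stand-in booleans rather than the cma data structures:

```c
#include <stdbool.h>

/* The conditions model those in cma_iboe_join_multicast(); applying
 * the same test on the destroy path means a send-only attach (which
 * never sent an IGMP join) never sends an IGMP leave either. */
static bool should_send_igmp_leave(bool ipv4, bool udp_encap, bool send_only)
{
	return ipv4 && udp_encap && !send_only;
}
```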
  6. 14 September 2021, 1 commit
  7. 08 September 2021, 5 commits
  8. 30 August 2021, 1 commit
  9. 26 August 2021, 15 commits
  10. 25 August 2021, 3 commits
  11. 24 August 2021, 3 commits