提交 · a372173bf314d374da4dd1155549d8ca7fc44709 · openeuler / Kernel

19 1月, 2021 1 次提交

RDMA/cxgb4: Fix the reported max_recv_sge value · a372173b

由 Kamal Heib 提交于 1月 14, 2021

The max_recv_sge value is wrongly reported when calling query_qp, This is
happening due to a typo when assigning the max_recv_sge value, the value
of sq_max_sges was assigned instead of rq_max_sges.

Fixes: 3e5c02c9 ("iw_cxgb4: Support query_qp() verb")
Link: https://lore.kernel.org/r/20210114191423.423529-1-kamalheib1@gmail.comSigned-off-by: NKamal Heib <kamalheib1@gmail.com>
Reviewed-by: NPotnuri Bharat Teja <bharat@chelsio.com>
Signed-off-by: NJason Gunthorpe <jgg@nvidia.com>

a372173b

15 1月, 2021 4 次提交

RDMA/cma: Fix error flow in default_roce_mode_store · 7c7b3e5d

由 Neta Ostrovsky 提交于 1月 13, 2021

In default_roce_mode_store(), we took a reference to cma_dev, but didn't
return it with cma_dev_put in the error flow.

Fixes: 1c15b4f2 ("RDMA/core: Modify enum ib_gid_type and enum rdma_network_type")
Link: https://lore.kernel.org/r/20210113130214.562108-1-leon@kernel.orgSigned-off-by: NNeta Ostrovsky <netao@nvidia.com>
Signed-off-by: NLeon Romanovsky <leonro@nvidia.com>
Signed-off-by: NJason Gunthorpe <jgg@nvidia.com>

7c7b3e5d

RDMA/mlx5: Fix wrong free of blue flame register on error · 1c3aa6bd

由 Mark Bloch 提交于 1月 13, 2021

If the allocation of the fast path blue flame register fails, the driver
should free the regular blue flame register allocated a statement above,
not the one that it just failed to allocate.

Fixes: 16c1975f ("IB/mlx5: Create profile infrastructure to add and remove stages")
Link: https://lore.kernel.org/r/20210113121703.559778-6-leon@kernel.orgReported-by: NHans Petter Selasky <hanss@nvidia.com>
Signed-off-by: NMark Bloch <mbloch@nvidia.com>
Signed-off-by: NLeon Romanovsky <leonro@nvidia.com>
Signed-off-by: NJason Gunthorpe <jgg@nvidia.com>

1c3aa6bd

IB/mlx5: Fix error unwinding when set_has_smi_cap fails · 2cb091f6

由 Parav Pandit 提交于 1月 13, 2021

When set_has_smi_cap() fails, multiport master cleanup is missed. Fix it
by doing the correct error unwinding goto.

Fixes: a989ea01 ("RDMA/mlx5: Move SMI caps logic")
Link: https://lore.kernel.org/r/20210113121703.559778-3-leon@kernel.orgSigned-off-by: NParav Pandit <parav@nvidia.com>
Signed-off-by: NLeon Romanovsky <leonro@nvidia.com>
Signed-off-by: NJason Gunthorpe <jgg@nvidia.com>

2cb091f6

RDMA/umem: Avoid undefined behavior of rounddown_pow_of_two() · b79f2dc5

由 Aharon Landau 提交于 1月 13, 2021

rounddown_pow_of_two() is undefined when the input is 0. Therefore we need
to avoid it in ib_umem_find_best_pgsz and return 0. Otherwise, it could
result in not rejecting an invalid page size which eventually causes a
kernel oops due to the logical inconsistency.

Fixes: 3361c29e ("RDMA/umem: Use simpler logic for ib_umem_find_best_pgsz()")
Link: https://lore.kernel.org/r/20210113121703.559778-2-leon@kernel.orgSigned-off-by: NAharon Landau <aharonl@nvidia.com>
Reviewed-by: NJason Gunthorpe <jgg@nvidia.com>
Reviewed-by: NMaor Gottlieb <maorg@mellanox.com>
Signed-off-by: NLeon Romanovsky <leonro@nvidia.com>
Signed-off-by: NJason Gunthorpe <jgg@nvidia.com>

b79f2dc5

08 1月, 2021 3 次提交

RDMA/ocrdma: Fix use after free in ocrdma_dealloc_ucontext_pd() · f2bc3af6

由 Tom Rix 提交于 12月 29, 2020

In ocrdma_dealloc_ucontext_pd() uctx->cntxt_pd is assigned to the variable
pd and then after uctx->cntxt_pd is freed, the variable pd is passed to
function _ocrdma_dealloc_pd() which dereferences pd directly or through
its call to ocrdma_mbx_dealloc_pd().

Reorder the free using the variable pd.

Cc: stable@vger.kernel.org
Fixes: 21a428a0 ("RDMA: Handle PD allocations by IB/core")
Link: https://lore.kernel.org/r/20201230024653.1516495-1-trix@redhat.comSigned-off-by: NTom Rix <trix@redhat.com>
Reviewed-by: NLeon Romanovsky <leonro@nvidia.com>
Signed-off-by: NJason Gunthorpe <jgg@nvidia.com>

f2bc3af6

RDMA/usnic: Fix memleak in find_free_vf_and_create_qp_grp · a306aba9

由 Dinghao Liu 提交于 12月 26, 2020

If usnic_ib_qp_grp_create() fails at the first call, dev_list
will not be freed on error, which leads to memleak.

Fixes: e3cf00d0 ("IB/usnic: Add Cisco VIC low-level hardware driver")
Link: https://lore.kernel.org/r/20201226074248.2893-1-dinghao.liu@zju.edu.cnSigned-off-by: NDinghao Liu <dinghao.liu@zju.edu.cn>
Reviewed-by: NLeon Romanovsky <leonro@nvidia.com>
Signed-off-by: NJason Gunthorpe <jgg@nvidia.com>

a306aba9

RDMA/restrack: Don't treat as an error allocation ID wrapping · 3c638cdb

由 Leon Romanovsky 提交于 12月 16, 2020

xa_alloc_cyclic() call returns positive number if ID allocation
succeeded but wrapped. It is not an error, so normalize the "ret"
variable to zero as marker of not-an-error.

drivers/infiniband/core/restrack.c:261 rdma_restrack_add()
warn: 'ret' can be either negative or positive

Fixes: fd47c2f9 ("RDMA/restrack: Convert internal DB from hash to XArray")
Link: https://lore.kernel.org/r/20201216100753.1127638-1-leon@kernel.orgReported-by: NDan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: NLeon Romanovsky <leonro@nvidia.com>
Signed-off-by: NJason Gunthorpe <jgg@nvidia.com>

3c638cdb

07 1月, 2021 1 次提交

RDMA/ucma: Do not miss ctx destruction steps in some cases · 8ae291cc

由 Jason Gunthorpe 提交于 1月 05, 2021

The destruction flow is very complicated here because the cm_id can be
destroyed from the event handler at any time if the device is
hot-removed. This leaves behind a partial ctx with no cm_id in the
xarray, and will let user space leak memory.

Make everything consistent in this flow in all places:

 - Return the xarray back to XA_ZERO_ENTRY before beginning any
   destruction. The thread that reaches this first is responsible to
   kfree, everyone else does nothing.

 - Test the xarray during the special hot-removal case to block the
   queue_work, this has much simpler locking and doesn't require a
   'destroying'

 - Fix the ref initialization so that it is only positive if cm_id !=
   NULL, then rely on that to guide the destruction process in all cases.

Now the new ucma_destroy_private_ctx() can be called in all places that
want to free the ctx, including all the error unwinds, and none of the
details are missed.

Fixes: a1d33b70 ("RDMA/ucma: Rework how new connections are passed through event delivery")
Link: https://lore.kernel.org/r/20210105111327.230270-1-leon@kernel.orgSigned-off-by: NLeon Romanovsky <leonro@nvidia.com>
Signed-off-by: NJason Gunthorpe <jgg@nvidia.com>

8ae291cc

17 12月, 2020 1 次提交

block/rnbd-clt: Does not request pdu to rtrs-clt · 9aaf9a2a

由 Gioh Kim 提交于 12月 10, 2020

Previously the rnbd client requested the rtrs to allocate rnbd_iu
just after the rtrs_iu. So the rnbd client passes the size of
rnbd_iu for rtrs_clt_open() and rtrs creates an array of
rnbd_iu and rtrs_iu.

For IO handling, rnbd_iu exists after the request because we pass
the size of rnbd_iu when setting the tag-set. Therefore we do not
use the rnbd_iu allocated by rtrs for IO handling.
We only use the rnbd_iu allocated by rtrs when doing session
initialization. Almost all rnbd_iu allocated by rtrs are wasted.

By this patch the rnbd client does not request rnbd_iu allocation
to rtrs but allocate it for itself when doing session initialization.

Also remove unused rtrs_permit_to_pdu from rtrs.
Signed-off-by: NGioh Kim <gi-oh.kim@cloud.ionos.com>
Signed-off-by: NJack Wang <jinpu.wang@cloud.ionos.com>
Signed-off-by: NJens Axboe <axboe@kernel.dk>

9aaf9a2a

15 12月, 2020 2 次提交

RDMA/cma: Don't overwrite sgid_attr after device is released · e246b7c0

由 Leon Romanovsky 提交于 12月 13, 2020

As part of the cma_dev release, that pointer will be set to NULL. In case
it happens in rdma_bind_addr() (part of an error flow), the next call to
addr_handler() will have a call to cma_acquire_dev_by_src_ip() which will
overwrite sgid_attr without releasing it.

WARNING: CPU: 2 PID: 108 at drivers/infiniband/core/cma.c:606 cma_bind_sgid_attr drivers/infiniband/core/cma.c:606 [inline]
WARNING: CPU: 2 PID: 108 at drivers/infiniband/core/cma.c:606 cma_acquire_dev_by_src_ip+0x470/0x4b0 drivers/infiniband/core/cma.c:649
CPU: 2 PID: 108 Comm: kworker/u8:1 Not tainted 5.10.0-rc6+ #257
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
Workqueue: ib_addr process_one_req
RIP: 0010:cma_bind_sgid_attr drivers/infiniband/core/cma.c:606 [inline]
RIP: 0010:cma_acquire_dev_by_src_ip+0x470/0x4b0 drivers/infiniband/core/cma.c:649
Code: 66 d9 4a ff 4d 8b 6e 10 49 8d bd 1c 08 00 00 e8 b6 d6 4a ff 45 0f b6 bd 1c 08 00 00 41 83 e7 01 e9 49 fd ff ff e8 90 c5 29 ff <0f> 0b e9 80 fe ff ff e8 84 c5 29 ff 4c 89 f7 e8 2c d9 4a ff 4d 8b
RSP: 0018:ffff8881047c7b40 EFLAGS: 00010293
RAX: ffff888104789c80 RBX: 0000000000000001 RCX: ffffffff820b8ef8
RDX: 0000000000000000 RSI: ffffffff820b9080 RDI: ffff88810cd4c998
RBP: ffff8881047c7c08 R08: ffff888104789c80 R09: ffffed10209f4036
R10: ffff888104fa01ab R11: ffffed10209f4035 R12: ffff88810cd4c800
R13: ffff888105750e28 R14: ffff888108f0a100 R15: ffff88810cd4c998
FS: 0000000000000000(0000) GS:ffff888119c00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000000 CR3: 0000000104e60005 CR4: 0000000000370ea0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
addr_handler+0x266/0x350 drivers/infiniband/core/cma.c:3190
process_one_req+0xa3/0x300 drivers/infiniband/core/addr.c:645
process_one_work+0x54c/0x930 kernel/workqueue.c:2272
worker_thread+0x82/0x830 kernel/workqueue.c:2418
kthread+0x1ca/0x220 kernel/kthread.c:292
ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:296

Fixes: ff11c6cd ("RDMA/cma: Introduce and use cma_acquire_dev_by_src_ip()")
Link: https://lore.kernel.org/r/20201213132940.345554-5-leon@kernel.orgSigned-off-by: NLeon Romanovsky <leonro@nvidia.com>
Signed-off-by: NJason Gunthorpe <jgg@nvidia.com>

e246b7c0

RDMA/mlx5: Fix MR cache memory leak · e8993890

由 Maor Gottlieb 提交于 12月 13, 2020

If the MR cache entry invalidation failed, then we detach this entry from
the cache, therefore we must to free the memory as well.

Allcation backtrace for the leaker:

    [<00000000d8e423b0>] alloc_cache_mr+0x23/0xc0 [mlx5_ib]
    [<000000001f21304c>] create_cache_mr+0x3f/0xf0 [mlx5_ib]
    [<000000009d6b45dc>] mlx5_ib_alloc_implicit_mr+0x41/0×210 [mlx5_ib]
    [<00000000879d0d68>] mlx5_ib_reg_user_mr+0x9e/0×6e0 [mlx5_ib]
    [<00000000be74bf89>] create_qp+0x2fc/0xf00 [ib_uverbs]
    [<000000001a532d22>] ib_uverbs_handler_UVERBS_METHOD_COUNTERS_READ+0x1d9/0×230 [ib_uverbs]
    [<0000000070f46001>] rdma_alloc_commit_uobject+0xb5/0×120 [ib_uverbs]
    [<000000006d8a0b38>] uverbs_alloc+0x2b/0xf0 [ib_uverbs]
    [<00000000075217c9>] ksysioctl+0x234/0×7d0
    [<00000000eb5c120b>] __x64_sys_ioctl+0x16/0×20
    [<00000000db135b48>] do_syscall_64+0x59/0×2e0

Fixes: 1769c4c5 ("RDMA/mlx5: Always remove MRs from the cache before destroying them")
Link: https://lore.kernel.org/r/20201213132940.345554-2-leon@kernel.orgSigned-off-by: NMaor Gottlieb <maorg@nvidia.com>
Signed-off-by: NLeon Romanovsky <leonro@nvidia.com>
Signed-off-by: NJason Gunthorpe <jgg@nvidia.com>

e8993890

12 12月, 2020 12 次提交

RDMA/rxe: Use acquire/release for memory ordering · d21a1240

由 Bob Pearson 提交于 12月 10, 2020

Change work and completion queues to use smp_load_acquire() and
smp_store_release() to synchronize between driver and users. This commit
goes with a matching series of commits in the rxe user space provider.

Link: https://lore.kernel.org/r/20201210174258.5234-1-rpearson@hpe.comSigned-off-by: NBob Pearson <rpearson@hpe.com>
Signed-off-by: NJason Gunthorpe <jgg@nvidia.com>

d21a1240

RDMA/hns: Simplify AEQE process for different types of queue · d8cc403b

由 Yixian Liu 提交于 12月 11, 2020

There is no need to get queue number repeatly for different queues from an
AEQE entity, as they are the same. Furthermore, redefine the AEQE
structure to make the codes more readable.

In addition, HNS_ROCE_EVENT_TYPE_CEQ_OVERFLOW is removed because the
hardware never reports this event.

Link: https://lore.kernel.org/r/1607650657-35992-12-git-send-email-liweihang@huawei.comSigned-off-by: NYixian Liu <liuyixian@huawei.com>
Signed-off-by: NWenpeng Liang <liangwenpeng@huawei.com>
Signed-off-by: NWeihang Li <liweihang@huawei.com>
Signed-off-by: NJason Gunthorpe <jgg@nvidia.com>

d8cc403b

RDMA/hns: Fix inaccurate prints · 61918e9b

由 Yixing Liu 提交于 12月 11, 2020

Some %d in print format string should be %u, and some prints miss the
useful errno or are in nonstandard format. Just fix above issues.

Link: https://lore.kernel.org/r/1607650657-35992-11-git-send-email-liweihang@huawei.comSigned-off-by: NYixing Liu <liuyixing1@huawei.com>
Signed-off-by: NWeihang Li <liweihang@huawei.com>
Signed-off-by: NJason Gunthorpe <jgg@nvidia.com>

61918e9b

RDMA/hns: Fix incorrect symbol types · dcdc366a

由 Wenpeng Liang 提交于 12月 11, 2020

Types of some fields, variables and parameters of some functions should be
unsigned.

Link: https://lore.kernel.org/r/1607650657-35992-10-git-send-email-liweihang@huawei.comSigned-off-by: NWenpeng Liang <liangwenpeng@huawei.com>
Signed-off-by: NWeihang Li <liweihang@huawei.com>
Signed-off-by: NJason Gunthorpe <jgg@nvidia.com>

dcdc366a

RDMA/hns: Clear redundant variable initialization · 62f3b70e

由 Xinhao Liu 提交于 12月 11, 2020

There is no need to initialize some variable because they will be assigned
with a value later.

Link: https://lore.kernel.org/r/1607650657-35992-9-git-send-email-liweihang@huawei.comSigned-off-by: NXinhao Liu <liuxinhao5@hisilicon.com>
Signed-off-by: NWeihang Li <liweihang@huawei.com>
Signed-off-by: NJason Gunthorpe <jgg@nvidia.com>

62f3b70e

RDMA/hns: Fix coding style issues · dc93a0d9

由 Lang Cheng 提交于 12月 11, 2020

Just format the code without modifying anything, including fixing some
redundant and missing blanks and spaces and changing the variable
definition order.

Link: https://lore.kernel.org/r/1607650657-35992-8-git-send-email-liweihang@huawei.comSigned-off-by: NLang Cheng <chenglang@huawei.com>
Signed-off-by: NWeihang Li <liweihang@huawei.com>
Signed-off-by: NJason Gunthorpe <jgg@nvidia.com>

dc93a0d9

RDMA/hns: Remove unnecessary access right set during INIT2INIT · 29b52027

由 Yixian Liu 提交于 12月 11, 2020

As the qp access right is checked and setted in common function
hns_roce_v2_set_opt_fields(), there is no need to set again for a special
case INIT2INIT.

Fixes: 926a01dc ("RDMA/hns: Add QP operations support for hip08 SoC")
Fixes: 7db82697 ("RDMA/hns: Add support for extended atomic in userspace")
Link: https://lore.kernel.org/r/1607650657-35992-7-git-send-email-liweihang@huawei.comSigned-off-by: NYixian Liu <liuyixian@huawei.com>
Signed-off-by: NWeihang Li <liweihang@huawei.com>
Signed-off-by: NJason Gunthorpe <jgg@nvidia.com>

29b52027

RDMA/hns: WARN_ON if get a reserved sl from users · f7550683

由 Weihang Li 提交于 12月 11, 2020

According to the RoCE v1 specification, the sl (service level) 0-7 are
mapped directly to priorities 0-7 respectively, sl 8-15 are reserved. The
driver should verify whether the value of sl is larger than 7, if so, an
exception should be returned.

Link: https://lore.kernel.org/r/1607650657-35992-6-git-send-email-liweihang@huawei.comSigned-off-by: NWeihang Li <liweihang@huawei.com>
Signed-off-by: NJason Gunthorpe <jgg@nvidia.com>

f7550683

RDMA/hns: Avoid filling sl in high 3 bits of vlan_id · 94a8c4df

由 Weihang Li 提交于 12月 11, 2020

Only the low 12 bits of vlan_id is valid, and service level has been
filled in Address Vector. So there is no need to fill sl in vlan_id in
Address Vector.

Fixes: 7406c003 ("RDMA/hns: Only record vlan info for HIP08")
Link: https://lore.kernel.org/r/1607650657-35992-5-git-send-email-liweihang@huawei.comSigned-off-by: NWeihang Li <liweihang@huawei.com>
Signed-off-by: NJason Gunthorpe <jgg@nvidia.com>

94a8c4df

RDMA/hns: Do shift on traffic class when using RoCEv2 · 603bee93

由 Weihang Li 提交于 12月 11, 2020

The high 6 bits of traffic class in GRH is DSCP (Differentiated Services
Codepoint), the driver should shift it before the hardware gets it when
using RoCEv2.

Fixes: 606bf89e ("RDMA/hns: Refactor for hns_roce_v2_modify_qp function")
Fixes: fba429fc ("RDMA/hns: Fix missing fields in address vector")
Link: https://lore.kernel.org/r/1607650657-35992-4-git-send-email-liweihang@huawei.comSigned-off-by: NWeihang Li <liweihang@huawei.com>
Signed-off-by: NJason Gunthorpe <jgg@nvidia.com>

603bee93

RDMA/hns: Normalization the judgment of some features · 4ddeacf6

由 Wenpeng Liang 提交于 12月 11, 2020

Whether to enable the these features should better depend on the enable
flags, not the value of related fields.

Fixes: 5c1f167a ("RDMA/hns: Init SRQ table for hip08")
Fixes: 3cb2c996 ("RDMA/hns: Add support for SCCC in size of 64 Bytes")
Link: https://lore.kernel.org/r/1607650657-35992-3-git-send-email-liweihang@huawei.comSigned-off-by: NWenpeng Liang <liangwenpeng@huawei.com>
Signed-off-by: NWeihang Li <liweihang@huawei.com>
Signed-off-by: NJason Gunthorpe <jgg@nvidia.com>

4ddeacf6

RDMA/hns: Limit the length of data copied between kernel and userspace · 1c0ca9cd

由 Wenpeng Liang 提交于 12月 11, 2020

For ib_copy_from_user(), the length of udata may not be the same as that
of cmd. For ib_copy_to_user(), the length of udata may not be the same as
that of resp. So limit the length to prevent out-of-bounds read and write
operations from ib_copy_from_user() and ib_copy_to_user().

Fixes: de77503a ("RDMA/hns: RDMA/hns: Assign rq head pointer when enable rq record db")
Fixes: 633fb4d9 ("RDMA/hns: Use structs to describe the uABI instead of opencoding")
Fixes: ae85bf92 ("RDMA/hns: Optimize qp param setup flow")
Fixes: 6fd610c5 ("RDMA/hns: Support 0 hop addressing for SRQ buffer")
Fixes: 9d9d4ff7 ("RDMA/hns: Update the kernel header file of hns")
Link: https://lore.kernel.org/r/1607650657-35992-2-git-send-email-liweihang@huawei.comSigned-off-by: NWenpeng Liang <liangwenpeng@huawei.com>
Signed-off-by: NWeihang Li <liweihang@huawei.com>
Signed-off-by: NJason Gunthorpe <jgg@nvidia.com>

1c0ca9cd

11 12月, 2020 6 次提交

RDMA/mlx4: Remove bogus dev_base_lock usage · 6f320f69

由 Vladimir Oltean 提交于 12月 08, 2020

It is not clear what this lock protects. If the authors wanted to ensure
that "dev" does not disappear, that is impossible, given the following
code path:

mlx4_ib_netdev_event (under RTNL mutex)
-> mlx4_ib_scan_netdevs
   -> mlx4_ib_update_qps

Also, the dev_base_lock does not protect dev->dev_addr either.

So it serves no purpose here. Remove it.

Link: https://lore.kernel.org/r/20201208193928.1500893-1-vladimir.oltean@nxp.comReviewed-by: NLeon Romanovsky <leonro@nvidia.com>
Signed-off-by: NVladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: NJason Gunthorpe <jgg@nvidia.com>

6f320f69

RDMA/uverbs: Fix incorrect variable type · e0da6899

由 Avihai Horon 提交于 12月 08, 2020

Fix incorrect type of max_entries in UVERBS_METHOD_QUERY_GID_TABLE -
max_entries is of type size_t although it can take negative values.

The following static check revealed it:

drivers/infiniband/core/uverbs_std_types_device.c:338 ib_uverbs_handler_UVERBS_METHOD_QUERY_GID_TABLE() warn: 'max_entries' unsigned <= 0

Fixes: 9f85cbe5 ("RDMA/uverbs: Expose the new GID query API to user space")
Link: https://lore.kernel.org/r/20201208073545.9723-4-leon@kernel.orgReported-by: NDan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: NAvihai Horon <avihaih@nvidia.com>
Signed-off-by: NLeon Romanovsky <leonro@nvidia.com>
Signed-off-by: NJason Gunthorpe <jgg@nvidia.com>

e0da6899

RDMA/core: Do not indicate device ready when device enablement fails · 779e0bf4

由 Jack Morgenstein 提交于 12月 08, 2020

In procedure ib_register_device, procedure kobject_uevent is called
(advertising that the device is ready for userspace usage) even when
device_enable_and_get() returned an error.

As a result, various RDMA modules attempted to register for the device
even while the device driver was preparing to unregister the device.

Fix this by advertising the device availability only after enabling the
device succeeds.

Fixes: e7a5b4aa ("RDMA/device: Don't fire uevent before device is fully initialized")
Link: https://lore.kernel.org/r/20201208073545.9723-3-leon@kernel.orgSuggested-by: NLeon Romanovsky <leonro@mellanox.com>
Signed-off-by: NJack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: NLeon Romanovsky <leonro@nvidia.com>
Signed-off-by: NJason Gunthorpe <jgg@nvidia.com>

779e0bf4

RDMA/core: Clean up cq pool mechanism · 286e1d3f

由 Jack Morgenstein 提交于 12月 08, 2020

The CQ pool mechanism had two problems:

1. The CQ pool lists were uninitialized in the device registration error
   flow.  As a result, all the list pointers remained NULL.  This caused
   the kernel to crash (in procedure ib_cq_pool_destroy) when that error
   flow was taken (and unregister called).  The stack trace snippet:

     BUG: kernel NULL pointer dereference, address: 0000000000000000
     #PF: supervisor read access in kernel mode
     #PF: error_code(0×0000) ? not-present page
     PGD 0 P4D 0
     Oops: 0000 [#1] SMP PTI
     . . .
     RIP: 0010:ib_cq_pool_destroy+0x1b/0×70 [ib_core]
     . . .
     Call Trace:
      disable_device+0x9f/0×130 [ib_core]
      __ib_unregister_device+0x35/0×90 [ib_core]
      ib_register_device+0x529/0×610 [ib_core]
      __mlx5_ib_add+0x3a/0×70 [mlx5_ib]
      mlx5_add_device+0x87/0×1c0 [mlx5_core]
      mlx5_register_interface+0x74/0xc0 [mlx5_core]
      do_one_initcall+0x4b/0×1f4
      do_init_module+0x5a/0×223
      load_module+0x1938/0×1d40

2. At device unregister, when cleaning up the cq pool, the cq's in the
   pool lists were freed, but the cq entries were left in the list.

The fix for the first issue is to initialize the cq pool lists when the
ib_device structure is allocated for a new device (in procedure
_ib_alloc_device).

The fix for the second problem is to delete cq entries from the pool lists
when cleaning up the cq pool.

In addition, procedure ib_cq_pool_destroy() is renamed to the more
appropriate name ib_cq_pool_cleanup().

Fixes: 4aa16152 ("RDMA/core: Fix ordering of CQ pool destruction")
Link: https://lore.kernel.org/r/20201208073545.9723-2-leon@kernel.orgSuggested-by: NJason Gunthorpe <jgg@nvidia.com>
Signed-off-by: NJack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: NLeon Romanovsky <leonro@nvidia.com>
Signed-off-by: NJason Gunthorpe <jgg@nvidia.com>

286e1d3f

RDMA/core: Update kernel documentation for ib_create_named_qp() · d1dec0ca

由 Lukas Bulwahn 提交于 12月 07, 2020

Commit 66f57b87 ("RDMA/restrack: Support all QP types") extends
ib_create_qp() to a named ib_create_named_qp(), which takes the caller's
name as argument, but it did not add the new argument description to the
function's kerneldoc.

make htmldocs warns:

  ./drivers/infiniband/core/verbs.c:1206: warning: Function parameter or member 'caller' not described in 'ib_create_named_qp'

Add a description for this new argument based on the description of the
same argument in other related functions.

Fixes: 66f57b87 ("RDMA/restrack: Support all QP types")
Link: https://lore.kernel.org/r/20201207173255.13355-1-lukas.bulwahn@gmail.comSigned-off-by: NLukas Bulwahn <lukas.bulwahn@gmail.com>
Reviewed-by: NLeon Romanovsky <leonro@nvidia.com>
Signed-off-by: NJason Gunthorpe <jgg@nvidia.com>

d1dec0ca

RDMA/mlx5: Remove unneeded semicolon · 7f1d2dfa

由 Tom Rix 提交于 10月 31, 2020

A semicolon is not needed after a switch statement.

Link: https://lore.kernel.org/r/20201031134638.2135060-1-trix@redhat.comSigned-off-by: NTom Rix <trix@redhat.com>
Signed-off-by: NJason Gunthorpe <jgg@nvidia.com>

7f1d2dfa

10 12月, 2020 1 次提交

RDMA/cm: Fix an attempt to use non-valid pointer when cleaning timewait · 340b940e

由 Leon Romanovsky 提交于 12月 04, 2020

If cm_create_timewait_info() fails, the timewait_info pointer will contain
an error value and will be used in cm_remove_remote() later.

general protection fault, probably for non-canonical address 0xdffffc0000000024: 0000 [#1] SMP KASAN PTI
KASAN: null-ptr-deref in range [0×0000000000000120-0×0000000000000127]
CPU: 2 PID: 12446 Comm: syz-executor.3 Not tainted 5.10.0-rc5-5d4c0742a60e #27
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
RIP: 0010:cm_remove_remote.isra.0+0x24/0×170 drivers/infiniband/core/cm.c:978
Code: 84 00 00 00 00 00 41 54 55 53 48 89 fb 48 8d ab 2d 01 00 00 e8 7d bf 4b fe 48 89 ea 48 b8 00 00 00 00 00 fc ff df 48 c1 ea 03 <0f> b6 04 02 48 89 ea 83 e2 07 38 d0 7f 08 84 c0 0f 85 fc 00 00 00
RSP: 0018:ffff888013127918 EFLAGS: 00010006
RAX: dffffc0000000000 RBX: fffffffffffffff4 RCX: ffffc9000a18b000
RDX: 0000000000000024 RSI: ffffffff82edc573 RDI: fffffffffffffff4
RBP: 0000000000000121 R08: 0000000000000001 R09: ffffed1002624f1d
R10: 0000000000000003 R11: ffffed1002624f1c R12: ffff888107760c70
R13: ffff888107760c40 R14: fffffffffffffff4 R15: ffff888107760c9c
FS: 00007fe1ffcc1700(0000) GS:ffff88811a600000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000001b2ff21000 CR3: 000000010f504001 CR4: 0000000000370ee0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
cm_destroy_id+0x189/0×15b0 drivers/infiniband/core/cm.c:1155
cma_connect_ib drivers/infiniband/core/cma.c:4029 [inline]
rdma_connect_locked+0x1100/0×17c0 drivers/infiniband/core/cma.c:4107
rdma_connect+0x2a/0×40 drivers/infiniband/core/cma.c:4140
ucma_connect+0x277/0×340 drivers/infiniband/core/ucma.c:1069
ucma_write+0x236/0×2f0 drivers/infiniband/core/ucma.c:1724
vfs_write+0x220/0×830 fs/read_write.c:603
ksys_write+0x1df/0×240 fs/read_write.c:658
do_syscall_64+0x33/0×40 arch/x86/entry/common.c:46
entry_SYSCALL_64_after_hwframe+0x44/0xa9

Fixes: a977049d ("[PATCH] IB: Add the kernel CM implementation")
Link: https://lore.kernel.org/r/20201204064205.145795-1-leon@kernel.orgReviewed-by: NMaor Gottlieb <maorg@nvidia.com>
Reported-by: NAmit Matityahu <mitm@nvidia.com>
Signed-off-by: NLeon Romanovsky <leonro@nvidia.com>
Signed-off-by: NJason Gunthorpe <jgg@nvidia.com>

340b940e

08 12月, 2020 9 次提交

RDMA/core: Fix empty gid table for non IB/RoCE devices · e432c04c

由 Gal Pressman 提交于 12月 06, 2020

The query_gid_table ioctl skips non IB/RoCE ports, which as a result
returns an empty gid table for devices such as EFA which have a GID table,
but are not IB/RoCE.

Fixes: c4b4d548 ("RDMA/core: Introduce new GID table query API")
Link: https://lore.kernel.org/r/20201206153238.34878-1-galpress@amazon.comSigned-off-by: NGal Pressman <galpress@amazon.com>
Signed-off-by: NJason Gunthorpe <jgg@nvidia.com>

e432c04c

RDMA/iser: Remove in_interrupt() usage · 0583531b

由 Sebastian Andrzej Siewior 提交于 12月 04, 2020

iser_initialize_task_headers() uses in_interrupt() to find out if it is
safe to acquire a mutex.

in_interrupt() is deprecated as it is ill defined and does not provide
what it suggests. Aside of that it covers only parts of the contexts in
which a mutex may not be acquired.

The following callchains exist:

iscsi_queuecommand() *locks* iscsi_session::frwd_lock
-> iscsi_prep_scsi_cmd_pdu()
   -> session->tt->init_task() (iscsi_iser_task_init())
      -> iser_initialize_task_headers()
-> iscsi_iser_task_xmit() (iscsi_transport::xmit_task)
  -> iscsi_iser_task_xmit_unsol_data()
    -> iser_send_data_out()
      -> iser_initialize_task_headers()

iscsi_data_xmit() *locks* iscsi_session::frwd_lock
-> iscsi_prep_mgmt_task()
   -> session->tt->init_task() (iscsi_iser_task_init())
      -> iser_initialize_task_headers()
-> iscsi_prep_scsi_cmd_pdu()
   -> session->tt->init_task() (iscsi_iser_task_init())
      -> iser_initialize_task_headers()

__iscsi_conn_send_pdu() caller has iscsi_session::frwd_lock
  -> iscsi_prep_mgmt_task()
     -> session->tt->init_task() (iscsi_iser_task_init())
        -> iser_initialize_task_headers()
  -> session->tt->xmit_task() (

The only callchain that is close to be invoked in preemptible context:
iscsi_xmitworker() worker
-> iscsi_data_xmit()
   -> iscsi_xmit_task()
      -> conn->session->tt->xmit_task() (iscsi_iser_task_xmit()

In iscsi_iser_task_xmit() there is this check:
   if (!task->sc)
      return iscsi_iser_mtask_xmit(conn, task);

so it does end up in iser_initialize_task_headers() and
iser_initialize_task_headers() relies on iscsi_task::sc == NULL.

Remove conditional locking of iser_conn::state_mutex because there is no
call chain to do so. Remove the goto label and return early now that there
is no clean up needed.

Link: https://lore.kernel.org/r/20201204174256.62xfcvudndt7oufl@linutronix.deSigned-off-by: NSebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Sagi Grimberg <sagi@grimberg.me>
Cc: Max Gurtovoy <maxg@nvidia.com>
Cc: Doug Ledford <dledford@redhat.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: linux-rdma@vger.kernel.org
Signed-off-by: NJason Gunthorpe <jgg@nvidia.com>

0583531b

RDMA/mlx5: Assign dev to DM MR · ca991a7d

由 Maor Gottlieb 提交于 12月 03, 2020

Currently, DM MR registration flow doesn't set the mlx5_ib_dev pointer and
can cause a NULL pointer dereference if userspace dumps the MR via rdma
tool.

Assign the IB device together with the other fields and remove the
redundant reference of mlx5_ib_dev from mlx5_ib_mr.

Cc: stable@vger.kernel.org
Fixes: 6c29f57e ("IB/mlx5: Device memory mr registration support")
Link: https://lore.kernel.org/r/20201203190807.127189-1-leon@kernel.orgSigned-off-by: NMaor Gottlieb <maorg@nvidia.com>
Signed-off-by: NLeon Romanovsky <leonro@nvidia.com>
Signed-off-by: NJason Gunthorpe <jgg@nvidia.com>

ca991a7d

RDMA/hns: Move capability flags of QP and CQ to hns-abi.h · 53ef4999

由 Weihang Li 提交于 12月 02, 2020

These flags will be returned to the userspace through ABI, so they should
be defined in hns-abi.h. Furthermore, there is no need to include
hns-abi.h in every source files, it just needs to be included in the
common header file.

Link: https://lore.kernel.org/r/1606872560-17823-1-git-send-email-liweihang@huawei.comReported-by: Nkernel test robot <lkp@intel.com>
Signed-off-by: NWeihang Li <liweihang@huawei.com>
Signed-off-by: NJason Gunthorpe <jgg@nvidia.com>

53ef4999

IB: Fix kernel-doc markups · 2988ca08

由 Mauro Carvalho Chehab 提交于 12月 01, 2020

Some functions have different names between their prototypes and the
kernel-doc markup.

Others need to be fixed, as kernel-doc markups should use this format:
identifier - description

Link: https://lore.kernel.org/r/78b98c41a5a0f4c0106433d305b143028a4168b0.1606823973.git.mchehab+huawei@kernel.orgSigned-off-by: NMauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: NJason Gunthorpe <jgg@nvidia.com>

2988ca08

RDMA/bnxt_re: Fix max_qp_wrs reported · c63e1c4d

由 Selvin Xavier 提交于 11月 30, 2020

While creating qps, the driver adds one extra entry to the sq size passed
by the ULPs in order to avoid queue full condition. When ULPs creates QPs
with max_qp_wr reported, driver creates QP with 1 more than the max_wqes
supported by HW. Create QP fails in this case. To avoid this error, reduce
1 entry in max_qp_wqes and report it to the stack.

Link: https://lore.kernel.org/r/1606741986-16477-1-git-send-email-selvin.xavier@broadcom.comSigned-off-by: NDevesh Sharma <devesh.sharma@broadcom.com>
Signed-off-by: NSelvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: NJason Gunthorpe <jgg@nvidia.com>

c63e1c4d

RDMA/i40iw: Replace atomic_add_return(1, ..) · c277f98b

由 Yejune Deng 提交于 11月 30, 2020

atomic_inc_return() is a little neater

Link: https://lore.kernel.org/r/1606726376-7675-1-git-send-email-yejune.deng@gmail.comSigned-off-by: NYejune Deng <yejune.deng@gmail.com>
Signed-off-by: NJason Gunthorpe <jgg@nvidia.com>

c277f98b

RDMA/mlx5: Fix error unwinds for rereg_mr · ef3642c4

由 Jason Gunthorpe 提交于 11月 30, 2020

This is all a giant train wreck of error handling, in many cases the MR is
left in some corrupted state where continuing on is going to lead to
chaos, or various unwinds/order is missed.

rereg had three possible completely different actions, depending on flags
and various details about the MR. Split the three actions into three
functions, and call the right action from the start.

For each action carefully design the error handling to fit the action:

- UMR access/PD update is a simple UMR, if it fails the MR isn't changed,
so do nothing

- PAS update over UMR is multiple UMR operations. To keep everything sane
revoke access to the MKey while it is being changed and restore it once
the MR is correct.

- Recreating the mkey should completely build a parallel MR with a fully
loaded PAS then swap and destroy the old one. If it fails the original
should be left untouched. This is handled in the core code. Directly
call the normal MR creation functions, possibly re-using the existing
umem.

Add support for working with ODP MRs. The READ/WRITE access flags can be
changed by UMR and we can trivially convert to/from ODP MRs using the
logic to build a completely new MR.

This new logic also fixes various problems with MRs continuing to work
while their PAS lists are no longer valid, eg during a page size change.

Link: https://lore.kernel.org/r/20201130075839.278575-6-leon@kernel.orgSigned-off-by: NLeon Romanovsky <leonro@nvidia.com>
Signed-off-by: NJason Gunthorpe <jgg@nvidia.com>

ef3642c4

RDMA/mlx5: Reorganize mlx5_ib_reg_user_mr() · 38f8ff5b

由 Jason Gunthorpe 提交于 11月 30, 2020

This function handles an ODP and regular MR flow all mushed together, even
though the two flows are quite different. Split them into two dedicated
functions.

Link: https://lore.kernel.org/r/20201130075839.278575-5-leon@kernel.orgSigned-off-by: NLeon Romanovsky <leonro@nvidia.com>
Signed-off-by: NJason Gunthorpe <jgg@nvidia.com>

38f8ff5b

openeuler / Kernel 1 年多 前同步成功

openeuler / Kernel
1 年多前同步成功