- 10 5月, 2023 2 次提交
-
-
由 Patrisious Haddad 提交于
mainline inclusion from mainline-v6.3-rc1 commit 8d037973 category: bugfix bugzilla: https://gitee.com/src-openeuler/kernel/issues/I6X49E CVE: CVE-2023-2176 Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=8d037973d48c026224ab285e6a06985ccac6f7bf --------------------------- Refactor rdma_bind_addr function so that it doesn't require that the cma destination address be changed before calling it. So now it will update the destination address internally only when it is really needed and after passing all the required checks. Which in turn results in a cleaner and more sensible call and error handling flows for the functions that call it directly or indirectly. Signed-off-by: NPatrisious Haddad <phaddad@nvidia.com> Reported-by: NWei Chen <harperchen1110@gmail.com> Reviewed-by: NMark Zhang <markzhang@nvidia.com> Link: https://lore.kernel.org/r/3d0e9a2fd62bc10ba02fed1c7c48a48638952320.1672819273.git.leonro@nvidia.comSigned-off-by: NLeon Romanovsky <leon@kernel.org> (cherry picked from commit 8d037973) Signed-off-by: NLiu Jian <liujian56@huawei.com> Reviewed-by: NYue Haibing <yuehaibing@huawei.com> Reviewed-by: NWang Weiyang <wangweiyang2@huawei.com> Signed-off-by: NJialin Zhang <zhangjialin11@huawei.com>
-
由 Jason Gunthorpe 提交于
mainline inclusion from mainline-v5.15-rc4 commit 305d568b category: bugfix bugzilla: https://gitee.com/src-openeuler/kernel/issues/I6X49E CVE: CVE-2023-2176 Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=305d568b72f17f674155a2a8275f865f207b3808 --------------------------- The FSM can run in a circle allowing rdma_resolve_ip() to be called twice on the same id_priv. While this cannot happen without going through the work, it violates the invariant that the same address resolution background request cannot be active twice. CPU 1 CPU 2 rdma_resolve_addr(): RDMA_CM_IDLE -> RDMA_CM_ADDR_QUERY rdma_resolve_ip(addr_handler) #1 process_one_req(): for #1 addr_handler(): RDMA_CM_ADDR_QUERY -> RDMA_CM_ADDR_BOUND mutex_unlock(&id_priv->handler_mutex); [.. handler still running ..] rdma_resolve_addr(): RDMA_CM_ADDR_BOUND -> RDMA_CM_ADDR_QUERY rdma_resolve_ip(addr_handler) !! two requests are now on the req_list rdma_destroy_id(): destroy_id_handler_unlock(): _destroy_id(): cma_cancel_operation(): rdma_addr_cancel() // process_one_req() self removes it spin_lock_bh(&lock); cancel_delayed_work(&req->work); if (!list_empty(&req->list)) == true ! rdma_addr_cancel() returns after process_on_req #1 is done kfree(id_priv) process_one_req(): for #2 addr_handler(): mutex_lock(&id_priv->handler_mutex); !! Use after free on id_priv rdma_addr_cancel() expects there to be one req on the list and only cancels the first one. The self-removal behavior of the work only happens after the handler has returned. This yields a situations where the req_list can have two reqs for the same "handle" but rdma_addr_cancel() only cancels the first one. The second req remains active beyond rdma_destroy_id() and will use-after-free id_priv once it inevitably triggers. Fix this by remembering if the id_priv has called rdma_resolve_ip() and always cancel before calling it again. This ensures the req_list never gets more than one item in it and doesn't cost anything in the normal flow that never uses this strange error path. Link: https://lore.kernel.org/r/0-v1-3bc675b8006d+22-syz_cancel_uaf_jgg@nvidia.com Cc: stable@vger.kernel.org Fixes: e51060f0 ("IB: IP address based RDMA connection manager") Reported-by: syzbot+dc3dfba010d7671e05f5@syzkaller.appspotmail.com Signed-off-by: NJason Gunthorpe <jgg@nvidia.com> (cherry picked from commit 305d568b) Signed-off-by: NLiu Jian <liujian56@huawei.com> Conflicts: drivers/infiniband/core/cma_priv.h Reviewed-by: NYue Haibing <yuehaibing@huawei.com> Reviewed-by: NWang Weiyang <wangweiyang2@huawei.com> Signed-off-by: NJialin Zhang <zhangjialin11@huawei.com>
-
- 13 2月, 2023 1 次提交
-
-
由 Michael Guralnik 提交于
stable inclusion from stable-v5.10.143 commit e9ea271c2e43af02c9d9c876519c078b09beeb5c category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I6D0U6 Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=e9ea271c2e43af02c9d9c876519c078b09beeb5c -------------------------------- [ Upstream commit 27cfde79 ] Fix the order of source and destination addresses when resolving the route between server and client to validate use of correct net device. The reverse order we had so far didn't actually validate the net device as the server would try to resolve the route to itself, thus always getting the server's net device. The issue was discovered when running cm applications on a single host between 2 interfaces with same subnet and source based routing rules. When resolving the reverse route the source based route rules were ignored. Fixes: f887f2ac ("IB/cma: Validate routing of incoming requests") Link: https://lore.kernel.org/r/1c1ec2277a131d277ebcceec987fd338d35b775f.1661251872.git.leonro@nvidia.comSigned-off-by: NMichael Guralnik <michaelgur@nvidia.com> Signed-off-by: NLeon Romanovsky <leon@kernel.org> Signed-off-by: NSasha Levin <sashal@kernel.org> Signed-off-by: NJialin Zhang <zhangjialin11@huawei.com> Reviewed-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
- 06 7月, 2022 1 次提交
-
-
由 Håkon Bugge 提交于
stable inclusion from stable-v5.10.110 commit 5e6b030ac345033bd9259f7a591c424213247000 bugzilla: https://gitee.com/openeuler/kernel/issues/I574AL Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=5e6b030ac345033bd9259f7a591c424213247000 -------------------------------- [ Upstream commit 748663c8 ] XRC INI QPs should be able to adjust their local ACK timeout. Fixes: 2c1619ed ("IB/cma: Define option to set ack timeout and pack tos_set") Link: https://lore.kernel.org/r/1644421175-31943-1-git-send-email-haakon.bugge@oracle.comSigned-off-by: NHåkon Bugge <haakon.bugge@oracle.com> Suggested-by: NAvneesh Pant <avneesh.pant@oracle.com> Reviewed-by: NLeon Romanovsky <leonro@nvidia.com> Signed-off-by: NJason Gunthorpe <jgg@nvidia.com> Signed-off-by: NSasha Levin <sashal@kernel.org> Signed-off-by: NYu Liao <liaoyu15@huawei.com> Reviewed-by: NWei Li <liwei391@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
- 26 5月, 2022 1 次提交
-
-
由 Jason Gunthorpe 提交于
stable inclusion from stable-v5.10.103 commit 5b1cef5798b4fd6e4fd5522e7b8a26248beeacaa bugzilla: https://gitee.com/openeuler/kernel/issues/I56NE7 Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=5b1cef5798b4fd6e4fd5522e7b8a26248beeacaa -------------------------------- commit 22e9f710 upstream. If the state is not idle then resolve_prepare_src() should immediately fail and no change to global state should happen. However, it unconditionally overwrites the src_addr trying to build a temporary any address. For instance if the state is already RDMA_CM_LISTEN then this will corrupt the src_addr and would cause the test in cma_cancel_operation(): if (cma_any_addr(cma_src_addr(id_priv)) && !id_priv->cma_dev) Which would manifest as this trace from syzkaller: BUG: KASAN: use-after-free in __list_add_valid+0x93/0xa0 lib/list_debug.c:26 Read of size 8 at addr ffff8881546491e0 by task syz-executor.1/32204 CPU: 1 PID: 32204 Comm: syz-executor.1 Not tainted 5.12.0-rc8-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:79 [inline] dump_stack+0x141/0x1d7 lib/dump_stack.c:120 print_address_description.constprop.0.cold+0x5b/0x2f8 mm/kasan/report.c:232 __kasan_report mm/kasan/report.c:399 [inline] kasan_report.cold+0x7c/0xd8 mm/kasan/report.c:416 __list_add_valid+0x93/0xa0 lib/list_debug.c:26 __list_add include/linux/list.h:67 [inline] list_add_tail include/linux/list.h:100 [inline] cma_listen_on_all drivers/infiniband/core/cma.c:2557 [inline] rdma_listen+0x787/0xe00 drivers/infiniband/core/cma.c:3751 ucma_listen+0x16a/0x210 drivers/infiniband/core/ucma.c:1102 ucma_write+0x259/0x350 drivers/infiniband/core/ucma.c:1732 vfs_write+0x28e/0xa30 fs/read_write.c:603 ksys_write+0x1ee/0x250 fs/read_write.c:658 do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46 entry_SYSCALL_64_after_hwframe+0x44/0xae This is indicating that an rdma_id_private was destroyed without doing cma_cancel_listens(). Instead of trying to re-use the src_addr memory to indirectly create an any address derived from the dst build one explicitly on the stack and bind to that as any other normal flow would do. rdma_bind_addr() will copy it over the src_addr once it knows the state is valid. This is similar to commit bc0bdc5a ("RDMA/cma: Do not change route.addr.src_addr.ss_family") Link: https://lore.kernel.org/r/0-v2-e975c8fd9ef2+11e-syz_cma_srcaddr_jgg@nvidia.com Cc: stable@vger.kernel.org Fixes: 732d41c5 ("RDMA/cma: Make the locking for automatic state transition more clear") Reported-by: syzbot+c94a3675a626f6333d74@syzkaller.appspotmail.com Reviewed-by: NLeon Romanovsky <leonro@nvidia.com> Signed-off-by: NJason Gunthorpe <jgg@nvidia.com> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NYu Liao <liaoyu15@huawei.com> Reviewed-by: NWei Li <liwei391@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
- 17 5月, 2022 1 次提交
-
-
由 Maor Gottlieb 提交于
stable inclusion from stable-v5.10.99 commit 371979069a577ee5bc1bcaaa39fb53d9e4dc7e3f bugzilla: https://gitee.com/openeuler/kernel/issues/I55O7H Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=371979069a577ee5bc1bcaaa39fb53d9e4dc7e3f -------------------------------- commit d9e410eb upstream. In RoCE we should use cma_iboe_set_mgid() and not cma_set_mgid to generate the mgid, otherwise we will generate an IGMP for an incorrect address. Fixes: b5de0c60 ("RDMA/cma: Fix use after free race in roce multicast join") Link: https://lore.kernel.org/r/913bc6783fd7a95fe71ad9454e01653ee6fb4a9a.1642491047.git.leonro@nvidia.comSigned-off-by: NMaor Gottlieb <maorg@nvidia.com> Signed-off-by: NLeon Romanovsky <leonro@nvidia.com> Signed-off-by: NJason Gunthorpe <jgg@nvidia.com> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NYu Liao <liaoyu15@huawei.com> Reviewed-by: NWei Li <liwei391@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
- 27 4月, 2022 1 次提交
-
-
由 Avihai Horon 提交于
stable inclusion from stable-v5.10.94 commit 7c0d9c815ce87257e2eba1a346c27211e0867b81 bugzilla: https://gitee.com/openeuler/kernel/issues/I531X9 Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=7c0d9c815ce87257e2eba1a346c27211e0867b81 -------------------------------- [ Upstream commit 20679094 ] Currently, when cma_resolve_ib_dev() searches for a matching GID it will stop searching after encountering the first empty GID table entry. This behavior is wrong since neither IB nor RoCE spec enforce tightly packed GID tables. For example, when the matching valid GID entry exists at index N, and if a GID entry is empty at index N-1, cma_resolve_ib_dev() will fail to find the matching valid entry. Fix it by making cma_resolve_ib_dev() continue searching even after encountering missing entries. Fixes: f17df3b0 ("RDMA/cma: Add support for AF_IB to rdma_resolve_addr()") Link: https://lore.kernel.org/r/b7346307e3bb396c43d67d924348c6c496493991.1639055490.git.leonro@nvidia.comSigned-off-by: NAvihai Horon <avihaih@nvidia.com> Reviewed-by: NMark Zhang <markzhang@nvidia.com> Signed-off-by: NLeon Romanovsky <leonro@nvidia.com> Signed-off-by: NJason Gunthorpe <jgg@nvidia.com> Signed-off-by: NSasha Levin <sashal@kernel.org> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com> Acked-by: NXie XiuQi <xiexiuqi@huawei.com>
-
- 15 11月, 2021 3 次提交
-
-
由 Tao Liu 提交于
stable inclusion from stable-5.10.71 commit 3f4e68902d2e545033c80d7ad62fd9a439e573f4 bugzilla: 182981 https://gitee.com/openeuler/kernel/issues/I4I3KD Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=3f4e68902d2e545033c80d7ad62fd9a439e573f4 -------------------------------- [ Upstream commit ca465e1f ] If cma_listen_on_all() fails it leaves the per-device ID still on the listen_list but the state is not set to RDMA_CM_ADDR_BOUND. When the cmid is eventually destroyed cma_cancel_listens() is not called due to the wrong state, however the per-device IDs are still holding the refcount preventing the ID from being destroyed, thus deadlocking: task:rping state:D stack: 0 pid:19605 ppid: 47036 flags:0x00000084 Call Trace: __schedule+0x29a/0x780 ? free_unref_page_commit+0x9b/0x110 schedule+0x3c/0xa0 schedule_timeout+0x215/0x2b0 ? __flush_work+0x19e/0x1e0 wait_for_completion+0x8d/0xf0 _destroy_id+0x144/0x210 [rdma_cm] ucma_close_id+0x2b/0x40 [rdma_ucm] __destroy_id+0x93/0x2c0 [rdma_ucm] ? __xa_erase+0x4a/0xa0 ucma_destroy_id+0x9a/0x120 [rdma_ucm] ucma_write+0xb8/0x130 [rdma_ucm] vfs_write+0xb4/0x250 ksys_write+0xb5/0xd0 ? syscall_trace_enter.isra.19+0x123/0x190 do_syscall_64+0x33/0x40 entry_SYSCALL_64_after_hwframe+0x44/0xa9 Ensure that cma_listen_on_all() atomically unwinds its action under the lock during error. Fixes: c80a0c52 ("RDMA/cma: Add missing error handling of listen_id") Link: https://lore.kernel.org/r/20210913093344.17230-1-thomas.liu@ucloud.cnSigned-off-by: NTao Liu <thomas.liu@ucloud.cn> Reviewed-by: NLeon Romanovsky <leonro@nvidia.com> Signed-off-by: NJason Gunthorpe <jgg@nvidia.com> Signed-off-by: NSasha Levin <sashal@kernel.org> Signed-off-by: NChen Jun <chenjun102@huawei.com> Acked-by: NWeilong Chen <chenweilong@huawei.com> Signed-off-by: NChen Jun <chenjun102@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Christoph Lameter 提交于
stable inclusion from stable-5.10.71 commit 62ba3c50104b405b522bf539eebf70cf0dce8fcd bugzilla: 182981 https://gitee.com/openeuler/kernel/issues/I4I3KD Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=62ba3c50104b405b522bf539eebf70cf0dce8fcd -------------------------------- [ Upstream commit 2cc74e1e ] ROCE uses IGMP for Multicast instead of the native Infiniband system where joins are required in order to post messages on the Multicast group. On Ethernet one can send Multicast messages to arbitrary addresses without the need to subscribe to a group. So ROCE correctly does not send IGMP joins during rdma_join_multicast(). F.e. in cma_iboe_join_multicast() we see: if (addr->sa_family == AF_INET) { if (gid_type == IB_GID_TYPE_ROCE_UDP_ENCAP) { ib.rec.hop_limit = IPV6_DEFAULT_HOPLIMIT; if (!send_only) { err = cma_igmp_send(ndev, &ib.rec.mgid, true); } } } else { So the IGMP join is suppressed as it is unnecessary. However no such check is done in destroy_mc(). And therefore leaving a sendonly multicast group will send an IGMP leave. This means that the following scenario can lead to a multicast receiver unexpectedly being unsubscribed from a MC group: 1. Sender thread does a sendonly join on MC group X. No IGMP join is sent. 2. Receiver thread does a regular join on the same MC Group x. IGMP join is sent and the receiver begins to get messages. 3. Sender thread terminates and destroys MC group X. IGMP leave is sent and the receiver no longer receives data. This patch adds the same logic for sendonly joins to destroy_mc() that is also used in cma_iboe_join_multicast(). Fixes: ab15c95a ("IB/core: Support for CMA multicast join flags") Link: https://lore.kernel.org/r/alpine.DEB.2.22.394.2109081340540.668072@gentwo.deSigned-off-by: NChristoph Lameter <cl@linux.com> Signed-off-by: NJason Gunthorpe <jgg@nvidia.com> Signed-off-by: NSasha Levin <sashal@kernel.org> Signed-off-by: NChen Jun <chenjun102@huawei.com> Acked-by: NWeilong Chen <chenweilong@huawei.com> Signed-off-by: NChen Jun <chenjun102@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Jason Gunthorpe 提交于
stable inclusion from stable-5.10.71 commit 0a16c9751e0f1de96f08643216cf1f19e8a5a787 bugzilla: 182981 https://gitee.com/openeuler/kernel/issues/I4I3KD Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=0a16c9751e0f1de96f08643216cf1f19e8a5a787 -------------------------------- commit bc0bdc5a upstream. If the state is not idle then rdma_bind_addr() will immediately fail and no change to global state should happen. For instance if the state is already RDMA_CM_LISTEN then this will corrupt the src_addr and would cause the test in cma_cancel_operation(): if (cma_any_addr(cma_src_addr(id_priv)) && !id_priv->cma_dev) To view a mangled src_addr, eg with a IPv6 loopback address but an IPv4 family, failing the test. This would manifest as this trace from syzkaller: BUG: KASAN: use-after-free in __list_add_valid+0x93/0xa0 lib/list_debug.c:26 Read of size 8 at addr ffff8881546491e0 by task syz-executor.1/32204 CPU: 1 PID: 32204 Comm: syz-executor.1 Not tainted 5.12.0-rc8-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:79 [inline] dump_stack+0x141/0x1d7 lib/dump_stack.c:120 print_address_description.constprop.0.cold+0x5b/0x2f8 mm/kasan/report.c:232 __kasan_report mm/kasan/report.c:399 [inline] kasan_report.cold+0x7c/0xd8 mm/kasan/report.c:416 __list_add_valid+0x93/0xa0 lib/list_debug.c:26 __list_add include/linux/list.h:67 [inline] list_add_tail include/linux/list.h:100 [inline] cma_listen_on_all drivers/infiniband/core/cma.c:2557 [inline] rdma_listen+0x787/0xe00 drivers/infiniband/core/cma.c:3751 ucma_listen+0x16a/0x210 drivers/infiniband/core/ucma.c:1102 ucma_write+0x259/0x350 drivers/infiniband/core/ucma.c:1732 vfs_write+0x28e/0xa30 fs/read_write.c:603 ksys_write+0x1ee/0x250 fs/read_write.c:658 do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46 entry_SYSCALL_64_after_hwframe+0x44/0xae Which is indicating that an rdma_id_private was destroyed without doing cma_cancel_listens(). Instead of trying to re-use the src_addr memory to indirectly create an any address build one explicitly on the stack and bind to that as any other normal flow would do. Link: https://lore.kernel.org/r/0-v1-9fbb33f5e201+2a-cma_listen_jgg@nvidia.com Cc: stable@vger.kernel.org Fixes: 732d41c5 ("RDMA/cma: Make the locking for automatic state transition more clear") Reported-by: syzbot+6bb0528b13611047209c@syzkaller.appspotmail.com Tested-by: NHao Sun <sunhao.th@gmail.com> Reviewed-by: NLeon Romanovsky <leonro@nvidia.com> Signed-off-by: NJason Gunthorpe <jgg@nvidia.com> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NChen Jun <chenjun102@huawei.com> Acked-by: NWeilong Chen <chenweilong@huawei.com> Signed-off-by: NChen Jun <chenjun102@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
- 15 10月, 2021 1 次提交
-
-
由 Gerd Rausch 提交于
stable inclusion from stable-5.10.51 commit 3d08b5917984f737f32d5bee9737b9075c3895c6 bugzilla: 175263 https://gitee.com/openeuler/kernel/issues/I4DT6F Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=3d08b5917984f737f32d5bee9737b9075c3895c6 -------------------------------- [ Upstream commit 74f160ea ] Fix a memory leak when "mda_resolve_route() is called more than once on the same "rdma_cm_id". This is possible if cma_query_handler() triggers the RDMA_CM_EVENT_ROUTE_ERROR flow which puts the state machine back and allows rdma_resolve_route() to be called again. Link: https://lore.kernel.org/r/f6662b7b-bdb7-2706-1e12-47c61d3474b6@oracle.comSigned-off-by: NGerd Rausch <gerd.rausch@oracle.com> Signed-off-by: NJason Gunthorpe <jgg@nvidia.com> Signed-off-by: NSasha Levin <sashal@kernel.org> Signed-off-by: NChen Jun <chenjun102@huawei.com> Acked-by: NWeilong Chen <chenweilong@huawei.com> Signed-off-by: NChen Jun <chenjun102@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
- 13 10月, 2021 3 次提交
-
-
由 Leon Romanovsky 提交于
stable inclusion from stable-5.10.50 commit a23ba98e91ffda5a1dc84bd904fe698002f96474 bugzilla: 174522 https://gitee.com/openeuler/kernel/issues/I4DNFY Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=a23ba98e91ffda5a1dc84bd904fe698002f96474 -------------------------------- [ Upstream commit 3d828754 ] Change location of rdma_restrack_del() to fix the bug where task_struct was acquired but not released, causing to resource leak. ucma_create_id() { ucma_alloc_ctx(); rdma_create_user_id() { rdma_restrack_new(); rdma_restrack_set_name() { rdma_restrack_attach_task.part.0(); <--- task_struct was gotten } } ucma_destroy_private_ctx() { ucma_put_ctx(); rdma_destroy_id() { _destroy_id() <--- id_priv was freed } } } Fixes: 889d916b ("RDMA/core: Don't access cm_id after its destruction") Link: https://lore.kernel.org/r/073ec27acb943ca8b6961663c47c5abe78a5c8cc.1624948948.git.leonro@nvidia.comReported-by: NPavel Skripkin <paskripkin@gmail.com> Signed-off-by: NLeon Romanovsky <leonro@nvidia.com> Signed-off-by: NJason Gunthorpe <jgg@nvidia.com> Signed-off-by: NSasha Levin <sashal@kernel.org> Signed-off-by: NChen Jun <chenjun102@huawei.com> Acked-by: NWeilong Chen <chenweilong@huawei.com> Signed-off-by: NChen Jun <chenjun102@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Håkon Bugge 提交于
stable inclusion from stable-5.10.50 commit 11044f8c2c9f50e6de833f330d94fa27719c47c4 bugzilla: 174522 https://gitee.com/openeuler/kernel/issues/I4DNFY Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=11044f8c2c9f50e6de833f330d94fa27719c47c4 -------------------------------- [ Upstream commit e84045ea ] An approximation for the PacketLifeTime is half the local ACK timeout. The encoding for both timers are logarithmic. If the local ACK timeout is set, but zero, it means the timer is disabled. In this case, we choose the CMA_IBOE_PACKET_LIFETIME value, since 50% of infinite makes no sense. Before this commit, the PacketLifeTime became 255 if local ACK timeout was zero (not running). Fixed by explicitly testing for timeout being zero. Fixes: e1ee1e62 ("RDMA/cma: Use ACK timeout for RoCE packetLifeTime") Link: https://lore.kernel.org/r/1624371207-26710-1-git-send-email-haakon.bugge@oracle.comSigned-off-by: NHåkon Bugge <haakon.bugge@oracle.com> Reviewed-by: NLeon Romanovsky <leonro@nvidia.com> Signed-off-by: NJason Gunthorpe <jgg@nvidia.com> Signed-off-by: NSasha Levin <sashal@kernel.org> Signed-off-by: NChen Jun <chenjun102@huawei.com> Acked-by: NWeilong Chen <chenweilong@huawei.com> Signed-off-by: NChen Jun <chenjun102@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Håkon Bugge 提交于
stable inclusion from stable-5.10.50 commit c764f2d899b26eb2001358498eea5b1df031c18e bugzilla: 174522 https://gitee.com/openeuler/kernel/issues/I4DNFY Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=c764f2d899b26eb2001358498eea5b1df031c18e -------------------------------- [ Upstream commit ca0c448d ] The struct rdma_id_private contains three bit-fields, tos_set, timeout_set, and min_rnr_timer_set. These are set by accessor functions without any synchronization. If two or all accessor functions are invoked in close proximity in time, there will be Read-Modify-Write from several contexts to the same variable, and the result will be intermittent. Fixed by protecting the bit-fields by the qp_mutex in the accessor functions. The consumer of timeout_set and min_rnr_timer_set is in rdma_init_qp_attr(), which is called with qp_mutex held for connected QPs. Explicit locking is added for the consumers of tos and tos_set. This commit depends on ("RDMA/cma: Remove unnecessary INIT->INIT transition"), since the call to rdma_init_qp_attr() from cma_init_conn_qp() does not hold the qp_mutex. Fixes: 2c1619ed ("IB/cma: Define option to set ack timeout and pack tos_set") Fixes: 3aeffc46 ("IB/cma: Introduce rdma_set_min_rnr_timer()") Link: https://lore.kernel.org/r/1624369197-24578-3-git-send-email-haakon.bugge@oracle.comSigned-off-by: NHåkon Bugge <haakon.bugge@oracle.com> Signed-off-by: NJason Gunthorpe <jgg@nvidia.com> Signed-off-by: NSasha Levin <sashal@kernel.org> Signed-off-by: NChen Jun <chenjun102@huawei.com> Acked-by: NWeilong Chen <chenweilong@huawei.com> Signed-off-by: NChen Jun <chenjun102@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
- 03 6月, 2021 2 次提交
-
-
由 Shay Drory 提交于
stable inclusion from stable-5.10.40 commit bd538f2f136fe5463458351a5ae045ed0a201cae bugzilla: 51882 CVE: NA -------------------------------- [ Upstream commit 889d916b ] restrack should only be attached to a cm_id while the ID has a valid device pointer. It is set up when the device is first loaded, but not cleared when the device is removed. There is also two copies of the device pointer, one private and one in the public API, and these were left out of sync. Make everything go to NULL together and manipulate restrack right around the device assignments. Found by syzcaller: BUG: KASAN: wild-memory-access in __list_del include/linux/list.h:112 [inline] BUG: KASAN: wild-memory-access in __list_del_entry include/linux/list.h:135 [inline] BUG: KASAN: wild-memory-access in list_del include/linux/list.h:146 [inline] BUG: KASAN: wild-memory-access in cma_cancel_listens drivers/infiniband/core/cma.c:1767 [inline] BUG: KASAN: wild-memory-access in cma_cancel_operation drivers/infiniband/core/cma.c:1795 [inline] BUG: KASAN: wild-memory-access in cma_cancel_operation+0x1f4/0x4b0 drivers/infiniband/core/cma.c:1783 Write of size 8 at addr dead000000000108 by task syz-executor716/334 CPU: 0 PID: 334 Comm: syz-executor716 Not tainted 5.11.0+ #271 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014 Call Trace: __dump_stack lib/dump_stack.c:79 [inline] dump_stack+0xbe/0xf9 lib/dump_stack.c:120 __kasan_report mm/kasan/report.c:400 [inline] kasan_report.cold+0x5f/0xd5 mm/kasan/report.c:413 __list_del include/linux/list.h:112 [inline] __list_del_entry include/linux/list.h:135 [inline] list_del include/linux/list.h:146 [inline] cma_cancel_listens drivers/infiniband/core/cma.c:1767 [inline] cma_cancel_operation drivers/infiniband/core/cma.c:1795 [inline] cma_cancel_operation+0x1f4/0x4b0 drivers/infiniband/core/cma.c:1783 _destroy_id+0x29/0x460 drivers/infiniband/core/cma.c:1862 ucma_close_id+0x36/0x50 drivers/infiniband/core/ucma.c:185 ucma_destroy_private_ctx+0x58d/0x5b0 drivers/infiniband/core/ucma.c:576 ucma_close+0x91/0xd0 drivers/infiniband/core/ucma.c:1797 __fput+0x169/0x540 fs/file_table.c:280 task_work_run+0xb7/0x100 kernel/task_work.c:140 exit_task_work include/linux/task_work.h:30 [inline] do_exit+0x7da/0x17f0 kernel/exit.c:825 do_group_exit+0x9e/0x190 kernel/exit.c:922 __do_sys_exit_group kernel/exit.c:933 [inline] __se_sys_exit_group kernel/exit.c:931 [inline] __x64_sys_exit_group+0x2d/0x30 kernel/exit.c:931 do_syscall_64+0x2d/0x40 arch/x86/entry/common.c:46 entry_SYSCALL_64_after_hwframe+0x44/0xa9 Fixes: 255d0c14 ("RDMA/cma: rdma_bind_addr() leaks a cma_dev reference count") Link: https://lore.kernel.org/r/3352ee288fe34f2b44220457a29bfc0548686363.1620711734.git.leonro@nvidia.comSigned-off-by: NShay Drory <shayd@nvidia.com> Signed-off-by: NLeon Romanovsky <leonro@nvidia.com> Signed-off-by: NJason Gunthorpe <jgg@nvidia.com> Signed-off-by: NSasha Levin <sashal@kernel.org> Signed-off-by: NChen Jun <chenjun102@huawei.com> Acked-by: NWeilong Chen <chenweilong@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Shay Drory 提交于
stable inclusion from stable-5.10.37 commit c5ebaca402f5c5bd61ac8316feb2aa3a0be4d4a8 bugzilla: 51868 CVE: NA -------------------------------- [ Upstream commit cb5cd0ea ] The device attach triggers addition of CM_ID to the restrack DB. However, when error occurs, we releasing this device, but defer CM_ID release. This causes to the situation where restrack sees CM_ID that is not valid anymore. As a solution, add the CM_ID to the resource tracking DB only after the attachment is finished. Found by syzcaller: infiniband syz0: added syz_tun rdma_rxe: ignoring netdev event = 10 for syz_tun infiniband syz0: set down infiniband syz0: ib_query_port failed (-19) restrack: ------------[ cut here ]------------ infiniband syz0: BUG: RESTRACK detected leak of resources restrack: User CM_ID object allocated by syz-executor716 is not freed restrack: ------------[ cut here ]------------ Fixes: b09c4d70 ("RDMA/restrack: Improve readability in task name management") Link: https://lore.kernel.org/r/ab93e56ba831eac65c322b3256796fa1589ec0bb.1618753862.git.leonro@nvidia.comSigned-off-by: NShay Drory <shayd@nvidia.com> Signed-off-by: NLeon Romanovsky <leonro@nvidia.com> Signed-off-by: NJason Gunthorpe <jgg@nvidia.com> Signed-off-by: NSasha Levin <sashal@kernel.org> Signed-off-by: NChen Jun <chenjun102@huawei.com> Acked-by: NWeilong Chen <chenweilong@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
- 09 4月, 2021 1 次提交
-
-
由 Avihai Horon 提交于
stable inclusion from stable-5.10.20 commit 60d613b39e8d0c9f3b526e9c96445422b4562d76 bugzilla: 50608 -------------------------------- [ Upstream commit fe454dc3 ] ucma_process_join() allocates struct ucma_multicast mc and frees it if an error occurs during its run. Specifically, if an error occurs in copy_to_user(), a use-after-free might happen in the following scenario: 1. mc struct is allocated. 2. rdma_join_multicast() is called and succeeds. During its run, cma_iboe_join_multicast() enqueues a work that will later use the aforementioned mc struct. 3. copy_to_user() is called and fails. 4. mc struct is deallocated. 5. The work that was enqueued by cma_iboe_join_multicast() is run and calls ucma_create_uevent() which tries to access mc struct (which is freed by now). Fix this bug by cancelling the work enqueued by cma_iboe_join_multicast(). Since cma_work_handler() frees struct cma_work, we don't use it in cma_iboe_join_multicast() so we can safely cancel the work later. The following syzkaller report revealed it: BUG: KASAN: use-after-free in ucma_create_uevent+0x2dd/0x;3f0 drivers/infiniband/core/ucma.c:272 Read of size 8 at addr ffff88810b3ad110 by task kworker/u8:1/108 CPU: 1 PID: 108 Comm: kworker/u8:1 Not tainted 5.10.0-rc6+ #257 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014 Workqueue: rdma_cm cma_work_handler Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0xbe/0xf9 lib/dump_stack.c:118 print_address_description.constprop.0+0x3e/0×60 mm/kasan/report.c:385 __kasan_report mm/kasan/report.c:545 [inline] kasan_report.cold+0x1f/0×37 mm/kasan/report.c:562 ucma_create_uevent+0x2dd/0×3f0 drivers/infiniband/core/ucma.c:272 ucma_event_handler+0xb7/0×3c0 drivers/infiniband/core/ucma.c:349 cma_cm_event_handler+0x5d/0×1c0 drivers/infiniband/core/cma.c:1977 cma_work_handler+0xfa/0×190 drivers/infiniband/core/cma.c:2718 process_one_work+0x54c/0×930 kernel/workqueue.c:2272 worker_thread+0x82/0×830 kernel/workqueue.c:2418 kthread+0x1ca/0×220 kernel/kthread.c:292 ret_from_fork+0x1f/0×30 arch/x86/entry/entry_64.S:296 Allocated by task 359: kasan_save_stack+0x1b/0×40 mm/kasan/common.c:48 kasan_set_track mm/kasan/common.c:56 [inline] __kasan_kmalloc mm/kasan/common.c:461 [inline] __kasan_kmalloc.constprop.0+0xc2/0xd0 mm/kasan/common.c:434 kmalloc include/linux/slab.h:552 [inline] kzalloc include/linux/slab.h:664 [inline] ucma_process_join+0x16e/0×3f0 drivers/infiniband/core/ucma.c:1453 ucma_join_multicast+0xda/0×140 drivers/infiniband/core/ucma.c:1538 ucma_write+0x1f7/0×280 drivers/infiniband/core/ucma.c:1724 vfs_write fs/read_write.c:603 [inline] vfs_write+0x191/0×4c0 fs/read_write.c:585 ksys_write+0x1a1/0×1e0 fs/read_write.c:658 do_syscall_64+0x2d/0×40 arch/x86/entry/common.c:46 entry_SYSCALL_64_after_hwframe+0x44/0xa9 Freed by task 359: kasan_save_stack+0x1b/0×40 mm/kasan/common.c:48 kasan_set_track+0x1c/0×30 mm/kasan/common.c:56 kasan_set_free_info+0x1b/0×30 mm/kasan/generic.c:355 __kasan_slab_free+0x112/0×160 mm/kasan/common.c:422 slab_free_hook mm/slub.c:1544 [inline] slab_free_freelist_hook mm/slub.c:1577 [inline] slab_free mm/slub.c:3142 [inline] kfree+0xb3/0×3e0 mm/slub.c:4124 ucma_process_join+0x22d/0×3f0 drivers/infiniband/core/ucma.c:1497 ucma_join_multicast+0xda/0×140 drivers/infiniband/core/ucma.c:1538 ucma_write+0x1f7/0×280 drivers/infiniband/core/ucma.c:1724 vfs_write fs/read_write.c:603 [inline] vfs_write+0x191/0×4c0 fs/read_write.c:585 ksys_write+0x1a1/0×1e0 fs/read_write.c:658 do_syscall_64+0x2d/0×40 arch/x86/entry/common.c:46 entry_SYSCALL_64_after_hwframe+0x44/0xa9 The buggy address belongs to the object at ffff88810b3ad100 which belongs to the cache kmalloc-192 of size 192 The buggy address is located 16 bytes inside of 192-byte region [ffff88810b3ad100, ffff88810b3ad1c0) Fixes: b5de0c60 ("RDMA/cma: Fix use after free race in roce multicast join") Link: https://lore.kernel.org/r/20210211090517.1278415-1-leon@kernel.orgReported-by: NAmit Matityahu <mitm@nvidia.com> Signed-off-by: NAvihai Horon <avihaih@nvidia.com> Signed-off-by: NLeon Romanovsky <leonro@nvidia.com> Signed-off-by: NJason Gunthorpe <jgg@nvidia.com> Signed-off-by: NSasha Levin <sashal@kernel.org> Signed-off-by: NChen Jun <chenjun102@huawei.com> Acked-by: NXie XiuQi <xiexiuqi@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
- 12 1月, 2021 3 次提交
-
-
由 Leon Romanovsky 提交于
stable inclusion from stable-5.10.4 commit 04ca5e7fa40d0a82d08d85a282f7b6df975e6989 bugzilla: 46903 -------------------------------- [ Upstream commit e246b7c0 ] As part of the cma_dev release, that pointer will be set to NULL. In case it happens in rdma_bind_addr() (part of an error flow), the next call to addr_handler() will have a call to cma_acquire_dev_by_src_ip() which will overwrite sgid_attr without releasing it. WARNING: CPU: 2 PID: 108 at drivers/infiniband/core/cma.c:606 cma_bind_sgid_attr drivers/infiniband/core/cma.c:606 [inline] WARNING: CPU: 2 PID: 108 at drivers/infiniband/core/cma.c:606 cma_acquire_dev_by_src_ip+0x470/0x4b0 drivers/infiniband/core/cma.c:649 CPU: 2 PID: 108 Comm: kworker/u8:1 Not tainted 5.10.0-rc6+ #257 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014 Workqueue: ib_addr process_one_req RIP: 0010:cma_bind_sgid_attr drivers/infiniband/core/cma.c:606 [inline] RIP: 0010:cma_acquire_dev_by_src_ip+0x470/0x4b0 drivers/infiniband/core/cma.c:649 Code: 66 d9 4a ff 4d 8b 6e 10 49 8d bd 1c 08 00 00 e8 b6 d6 4a ff 45 0f b6 bd 1c 08 00 00 41 83 e7 01 e9 49 fd ff ff e8 90 c5 29 ff <0f> 0b e9 80 fe ff ff e8 84 c5 29 ff 4c 89 f7 e8 2c d9 4a ff 4d 8b RSP: 0018:ffff8881047c7b40 EFLAGS: 00010293 RAX: ffff888104789c80 RBX: 0000000000000001 RCX: ffffffff820b8ef8 RDX: 0000000000000000 RSI: ffffffff820b9080 RDI: ffff88810cd4c998 RBP: ffff8881047c7c08 R08: ffff888104789c80 R09: ffffed10209f4036 R10: ffff888104fa01ab R11: ffffed10209f4035 R12: ffff88810cd4c800 R13: ffff888105750e28 R14: ffff888108f0a100 R15: ffff88810cd4c998 FS: 0000000000000000(0000) GS:ffff888119c00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000000 CR3: 0000000104e60005 CR4: 0000000000370ea0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: addr_handler+0x266/0x350 drivers/infiniband/core/cma.c:3190 process_one_req+0xa3/0x300 drivers/infiniband/core/addr.c:645 process_one_work+0x54c/0x930 kernel/workqueue.c:2272 worker_thread+0x82/0x830 kernel/workqueue.c:2418 kthread+0x1ca/0x220 kernel/kthread.c:292 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:296 Fixes: ff11c6cd ("RDMA/cma: Introduce and use cma_acquire_dev_by_src_ip()") Link: https://lore.kernel.org/r/20201213132940.345554-5-leon@kernel.orgSigned-off-by: NLeon Romanovsky <leonro@nvidia.com> Signed-off-by: NJason Gunthorpe <jgg@nvidia.com> Signed-off-by: NSasha Levin <sashal@kernel.org> Signed-off-by: NChen Jun <chenjun102@huawei.com> Acked-by: NXie XiuQi <xiexiuqi@huawei.com>
-
由 Jason Gunthorpe 提交于
stable inclusion from stable-5.10.4 commit f85d05c0a599ede6f161988ab6eea1a62b1b400f bugzilla: 46903 -------------------------------- [ Upstream commit dd37d2f5 ] rdma_detroy_id() cannot be called under &lock - we must instead keep the error'd ID around until &lock can be released, then destroy it. This is complicated by the usual way listen IDs are destroyed through cma_process_remove() which can run at any time and will asynchronously destroy the same ID. Remove the ID from visiblity of cma_process_remove() before going down the destroy path outside the locking. Fixes: c80a0c52 ("RDMA/cma: Add missing error handling of listen_id") Link: https://lore.kernel.org/r/20201118133756.GK244516@ziepe.ca Reported-by: syzbot+1bc48bf7f78253f664a9@syzkaller.appspotmail.com Reviewed-by: NLeon Romanovsky <leonro@nvidia.com> Signed-off-by: NJason Gunthorpe <jgg@nvidia.com> Signed-off-by: NSasha Levin <sashal@kernel.org> Signed-off-by: NChen Jun <chenjun102@huawei.com> Acked-by: NXie XiuQi <xiexiuqi@huawei.com>
-
由 Leon Romanovsky 提交于
stable inclusion from stable-5.10.4 commit 70ba8b1697e35c04ea5f22edb6e401aeb1208d96 bugzilla: 46903 -------------------------------- [ Upstream commit c80a0c52 ] Don't silently continue if rdma_listen() fails but destroy previously created CM_ID and return an error to the caller. Fixes: d02d1f53 ("RDMA/cma: Fix deadlock destroying listen requests") Link: https://lore.kernel.org/r/20201104144008.3808124-5-leon@kernel.orgSigned-off-by: NLeon Romanovsky <leonro@nvidia.com> Signed-off-by: NJason Gunthorpe <jgg@nvidia.com> Signed-off-by: NSasha Levin <sashal@kernel.org> Signed-off-by: NChen Jun <chenjun102@huawei.com> Acked-by: NXie XiuQi <xiexiuqi@huawei.com>
-
- 28 10月, 2020 1 次提交
-
-
由 Jason Gunthorpe 提交于
There are two flows for handling RDMA_CM_EVENT_ROUTE_RESOLVED, either the handler triggers a completion and another thread does rdma_connect() or the handler directly calls rdma_connect(). In all cases rdma_connect() needs to hold the handler_mutex, but when handler's are invoked this is already held by the core code. This causes ULPs using the 2nd method to deadlock. Provide a rdma_connect_locked() and have all ULPs call it from their handlers. Link: https://lore.kernel.org/r/0-v2-53c22d5c1405+33-rdma_connect_locking_jgg@nvidia.comReported-and-tested-by: NGuoqing Jiang <guoqing.jiang@cloud.ionos.com> Fixes: 2a7cec53 ("RDMA/cma: Fix locking for the RDMA_CM_CONNECT state") Acked-by: NSantosh Shilimkar <santosh.shilimkar@oracle.com> Acked-by: NJack Wang <jinpu.wang@cloud.ionos.com> Reviewed-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NMax Gurtovoy <mgurtovoy@nvidia.com> Reviewed-by: NSagi Grimberg <sagi@grimberg.me> Signed-off-by: NJason Gunthorpe <jgg@nvidia.com>
-
- 02 10月, 2020 1 次提交
-
-
由 Avihai Horon 提交于
Separate IB_GID_TYPE_IB and IB_GID_TYPE_ROCE to two different values, so enum ib_gid_type will match the gid types of the new query GID table API which will be introduced in the following patches. This change in enum ib_gid_type requires to change also enum rdma_network_type by separating RDMA_NETWORK_IB and RDMA_NETWORK_ROCE_V1 values. Link: https://lore.kernel.org/r/20200923165015.2491894-3-leon@kernel.orgSigned-off-by: NAvihai Horon <avihaih@nvidia.com> Signed-off-by: NLeon Romanovsky <leonro@nvidia.com> Signed-off-by: NJason Gunthorpe <jgg@nvidia.com>
-
- 29 9月, 2020 1 次提交
-
-
由 Taehee Yoo 提交于
Functions related to nested interface infrastructure such as netdev_walk_all_{ upper | lower }_dev() pass both private functions and "data" pointer to handle their own things. At this point, the data pointer type is void *. In order to make it easier to expand common variables and functions, this new netdev_nested_priv structure is added. In the following patch, a new member variable will be added into this struct to fix the lockdep issue. Signed-off-by: NTaehee Yoo <ap420073@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 23 9月, 2020 4 次提交
-
-
由 Leon Romanovsky 提交于
Use rdma_restrack_set_name() and rdma_restrack_parent_name() instead of tricky uses of rdma_restrack_attach_task()/rdma_restrack_uadd(). This uniformly makes all restracks add'd using rdma_restrack_add(). Link: https://lore.kernel.org/r/20200922091106.2152715-6-leon@kernel.orgSigned-off-by: NLeon Romanovsky <leonro@mellanox.com> Signed-off-by: NJason Gunthorpe <jgg@nvidia.com>
-
由 Leon Romanovsky 提交于
Have a single rdma_restrack_add() that adds an entry, there is no reason to split the user/kernel here, the rdma_restrack_set_task() is responsible for this difference. This patch prepares the code to the future requirement of making restrack is mandatory for managing ib objects. Link: https://lore.kernel.org/r/20200922091106.2152715-5-leon@kernel.orgSigned-off-by: NLeon Romanovsky <leonro@mellanox.com> Signed-off-by: NJason Gunthorpe <jgg@nvidia.com>
-
由 Leon Romanovsky 提交于
Refactor the restrack code to make sure the kref inside the restrack entry properly kref's the object in which it is embedded. This slight change is needed for future conversions of MR and QP which are refcounted before the release and kfree. The ideal flow from ib_core perspective as follows: * Allocate ib_* structure with rdma_zalloc_*. * Set everything that is known to ib_core to that newly created object. * Initialize kref with restrack help * Call to driver specific allocation functions. * Insert into restrack DB .... * Return and release restrack with restrack_put. Largely this means a rdma_restrack_new() should be called near allocating the containing structure. Link: https://lore.kernel.org/r/20200922091106.2152715-4-leon@kernel.orgSigned-off-by: NLeon Romanovsky <leonro@mellanox.com> Signed-off-by: NJason Gunthorpe <jgg@nvidia.com>
-
由 Leon Romanovsky 提交于
Update the code to have similar destroy pattern like other IB objects. This change create asymmetry to the rdma_id_private create flow to make sure that memory is managed by restrack. Link: https://lore.kernel.org/r/20200922091106.2152715-2-leon@kernel.orgSigned-off-by: NLeon Romanovsky <leonro@mellanox.com> Signed-off-by: NJason Gunthorpe <jgg@nvidia.com>
-
- 17 9月, 2020 8 次提交
-
-
由 Jason Gunthorpe 提交于
The roce path triggers a work queue that continues to touch the id_priv but doesn't hold any reference on it. Futher, unlike in the IB case, the work queue is not fenced during rdma_destroy_id(). This can trigger a use after free if a destroy is triggered in the incredibly narrow window after the queue_work and the work starting and obtaining the handler_mutex. The only purpose of this work queue is to run the ULP event callback from the standard context, so switch the design to use the existing cma_work_handler() scheme. This simplifies quite a lot of the flow: - Use the cma_work_handler() callback to launch the work for roce. This requires generating the event synchronously inside the rdma_join_multicast(), which in turn means the dummy struct ib_sa_multicast can become a simple stack variable. - cm_work_handler() used the id_priv kref, so we can entirely eliminate the kref inside struct cma_multicast. Since the cma_multicast never leaks into an unprotected work queue the kfree can be done at the same time as for IB. - Eliminating the general multicast.ib requires using cma_set_mgid() in a few places to recompute the mgid. Fixes: 3c86aa70 ("RDMA/cm: Add RDMA CM support for IBoE devices") Link: https://lore.kernel.org/r/20200902081122.745412-9-leon@kernel.orgSigned-off-by: NLeon Romanovsky <leonro@nvidia.com> Signed-off-by: NJason Gunthorpe <jgg@nvidia.com>
-
由 Jason Gunthorpe 提交于
Two places were open coding this sequence, and also pull in cma_leave_roce_mc_group() which was called only once. Link: https://lore.kernel.org/r/20200902081122.745412-8-leon@kernel.orgSigned-off-by: NLeon Romanovsky <leonro@nvidia.com> Signed-off-by: NJason Gunthorpe <jgg@nvidia.com>
-
由 Jason Gunthorpe 提交于
There is no kernel user of RDMA CM multicast so this code managing the multicast subscription of the kernel-only internal QP is dead. Remove it. This makes the bug fixes in the next patches much simpler. Link: https://lore.kernel.org/r/20200902081122.745412-7-leon@kernel.orgSigned-off-by: NLeon Romanovsky <leonro@nvidia.com> Signed-off-by: NJason Gunthorpe <jgg@nvidia.com>
-
由 Jason Gunthorpe 提交于
These are the same thing, except that cma_ndev_work doesn't have a state transition. Signal no state transition by setting old_state and new_state == 0. In all cases the handler function should not be called once rdma_destroy_id() has progressed passed setting the state. Link: https://lore.kernel.org/r/20200902081122.745412-6-leon@kernel.orgSigned-off-by: NLeon Romanovsky <leonro@nvidia.com> Signed-off-by: NJason Gunthorpe <jgg@nvidia.com>
-
由 Jason Gunthorpe 提交于
The only place that still uses it is rdma_join_multicast() which is only doing a sanity check that the caller hasn't done something wrong and doesn't need the spinlock. At least in the case of rdma_join_multicast() the information it needs will remain until the ID is destroyed once it enters these states. Similarly there is no reason to check for these specific states in the handler callback, instead use the usual check for a destroyed id under the handler_mutex. Link: https://lore.kernel.org/r/20200902081122.745412-5-leon@kernel.orgSigned-off-by: NLeon Romanovsky <leonro@nvidia.com> Signed-off-by: NJason Gunthorpe <jgg@nvidia.com>
-
由 Jason Gunthorpe 提交于
There is a strange unlocked read of the ID state when checking for reuseaddr. This is because an ID cannot be reusable once it becomes a listening ID. Instead of using the state to exclude reuse, just clear it as part of rdma_listen()'s flow to convert reusable into not reusable. Once a ID goes to listen there is no way back out, and the only use of reusable is on the bind_list check. Finally, update the checks under handler_mutex to use READ_ONCE and audit that once RDMA_CM_LISTEN is observed in a req callback it is stable under the handler_mutex. Link: https://lore.kernel.org/r/20200902081122.745412-4-leon@kernel.orgSigned-off-by: NLeon Romanovsky <leonro@nvidia.com> Signed-off-by: NJason Gunthorpe <jgg@nvidia.com>
-
由 Jason Gunthorpe 提交于
Re-organize things so the state variable is not read unlocked. The first attempt to go directly from ADDR_BOUND immediately tells us if the ID is already bound, if we can't do that then the attempt inside rdma_bind_addr() to go from IDLE to ADDR_BOUND confirms the ID needs binding. Link: https://lore.kernel.org/r/20200902081122.745412-3-leon@kernel.orgSigned-off-by: NLeon Romanovsky <leonro@nvidia.com> Signed-off-by: NJason Gunthorpe <jgg@nvidia.com>
-
由 Jason Gunthorpe 提交于
It is currently a bit confusing, but the design is if the handler_mutex is held, and the state is in RDMA_CM_CONNECT, then the state cannot leave RDMA_CM_CONNECT without also serializing with the handler_mutex. Make this clearer by adding a direct assertion, fixing the usage in rdma_connect and generally using READ_ONCE to read the state value. Link: https://lore.kernel.org/r/20200902081122.745412-2-leon@kernel.orgSigned-off-by: NLeon Romanovsky <leonro@nvidia.com> Signed-off-by: NJason Gunthorpe <jgg@nvidia.com>
-
- 27 8月, 2020 1 次提交
-
-
由 Jason Gunthorpe 提交于
In almost all cases rdma_accept() is called under the handler_mutex by ULPs from their handler callbacks. The one exception was ucma which did not get the handler_mutex. To improve the understand-ability of the locking scheme obtain the mutex for ucma as well. This improves how ucma works by allowing it to directly use handler_mutex for some of its internal locking against the handler callbacks intead of the global file->mut lock. There does not seem to be a serious bug here, other than a DISCONNECT event can be delivered concurrently with accept succeeding. Link: https://lore.kernel.org/r/20200818120526.702120-7-leon@kernel.orgSigned-off-by: NLeon Romanovsky <leonro@mellanox.com> Signed-off-by: NJason Gunthorpe <jgg@nvidia.com>
-
- 24 8月, 2020 1 次提交
-
-
由 Gustavo A. R. Silva 提交于
Replace the existing /* fall through */ comments and its variants with the new pseudo-keyword macro fallthrough[1]. Also, remove unnecessary fall-through markings when it is the case. [1] https://www.kernel.org/doc/html/v5.7/process/deprecated.html?highlight=fallthrough#implicit-switch-case-fall-throughSigned-off-by: NGustavo A. R. Silva <gustavoars@kernel.org>
-
- 30 7月, 2020 3 次提交
-
-
由 Jason Gunthorpe 提交于
When a rdma_cm_id needs to be destroyed after a handler callback fails, part of the destruction pattern is open coded into each call site. Unfortunately the blind assignment to state discards important information needed to do cma_cancel_operation(). This results in active operations being left running after rdma_destroy_id() completes, and the use-after-free bugs from KASAN. Consolidate this entire pattern into destroy_id_handler_unlock() and manage the locking correctly. The state should be set to RDMA_CM_DESTROYING under the handler_lock to atomically ensure no futher handlers are called. Link: https://lore.kernel.org/r/20200723070707.1771101-5-leon@kernel.org Reported-by: syzbot+08092148130652a6faae@syzkaller.appspotmail.com Reported-by: syzbot+a929647172775e335941@syzkaller.appspotmail.com Signed-off-by: NLeon Romanovsky <leonro@mellanox.com> Signed-off-by: NJason Gunthorpe <jgg@nvidia.com>
-
由 Jason Gunthorpe 提交于
The REQ flows are concerned that once the handler is called on the new cm_id the ULP can choose to trigger a rdma_destroy_id() concurrently at any time. However, this is not true, while the ULP can call rdma_destroy_id(), it immediately blocks on the handler_mutex which prevents anything harmful from running concurrently. Remove the confusing extra locking and refcounts and make the handler_mutex protecting state during destroy more clear. Link: https://lore.kernel.org/r/20200723070707.1771101-4-leon@kernel.orgSigned-off-by: NLeon Romanovsky <leonro@mellanox.com> Signed-off-by: NJason Gunthorpe <jgg@nvidia.com>
-
由 Jason Gunthorpe 提交于
Whenever an event is delivered to the handler it should be done under the handler_mutex and upon any non-zero return from the handler it should trigger destruction of the cm_id. cma_process_remove() skips some steps here, it is not necessarily wrong since the state change should prevent any races, but it is confusing and unnecessary. Follow the standard pattern here, with the slight twist that the transition to RDMA_CM_DEVICE_REMOVAL includes a cma_cancel_operation(). Link: https://lore.kernel.org/r/20200723070707.1771101-3-leon@kernel.orgSigned-off-by: NLeon Romanovsky <leonro@mellanox.com> Signed-off-by: NJason Gunthorpe <jgg@nvidia.com>
-