- 30 7月, 2020 2 次提交
-
-
由 Jason Gunthorpe 提交于
Whenever an event is delivered to the handler it should be done under the handler_mutex and upon any non-zero return from the handler it should trigger destruction of the cm_id. cma_process_remove() skips some steps here, it is not necessarily wrong since the state change should prevent any races, but it is confusing and unnecessary. Follow the standard pattern here, with the slight twist that the transition to RDMA_CM_DEVICE_REMOVAL includes a cma_cancel_operation(). Link: https://lore.kernel.org/r/20200723070707.1771101-3-leon@kernel.orgSigned-off-by: NLeon Romanovsky <leonro@mellanox.com> Signed-off-by: NJason Gunthorpe <jgg@nvidia.com>
-
由 Jason Gunthorpe 提交于
cma_process_remove() triggers an unconditional rdma_destroy_id() for internal_id's and skips the event deliver and transition through RDMA_CM_DEVICE_REMOVAL. This is confusing and unnecessary. internal_id always has cma_listen_handler() as the handler, have it catch the RDMA_CM_DEVICE_REMOVAL event and directly consume it and signal removal. This way the FSM sequence never skips the DEVICE_REMOVAL case and the logic in this hard to test area is simplified. Link: https://lore.kernel.org/r/20200723070707.1771101-2-leon@kernel.orgSigned-off-by: NLeon Romanovsky <leonro@mellanox.com> Signed-off-by: NJason Gunthorpe <jgg@nvidia.com>
-
- 03 6月, 2020 1 次提交
-
-
由 Chuck Lever 提交于
The restrack ID for an rdma_cm_id is not assigned until it is associated with a device. Here's an example I captured while testing NFS/RDMA's support for DEVICE_REMOVAL. The new tracepoint name is "cm_id_attach". <...>-4261 [001] 366.581299: cm_event_handler: cm.id=0 src=0.0.0.0:45919 dst=192.168.2.55:20049 tos=0 ADDR_ERROR (1/-19) <...>-4261 [001] 366.581304: cm_event_done: cm.id=0 src=0.0.0.0:45919 dst=192.168.2.55:20049 tos=0 ADDR_ERROR consumer returns 0 <...>-1950 [000] 366.581309: cm_id_destroy: cm.id=0 src=0.0.0.0:45919 dst=192.168.2.55:20049 tos=0 <...>-7 [001] 369.589400: cm_event_handler: cm.id=0 src=0.0.0.0:49023 dst=192.168.2.55:20049 tos=0 ADDR_ERROR (1/-19) <...>-7 [001] 369.589404: cm_event_done: cm.id=0 src=0.0.0.0:49023 dst=192.168.2.55:20049 tos=0 ADDR_ERROR consumer returns 0 <...>-1950 [000] 369.589407: cm_id_destroy: cm.id=0 src=0.0.0.0:49023 dst=192.168.2.55:20049 tos=0 <...>-4261 [001] 372.597650: cm_id_attach: cm.id=0 src=192.168.2.51:47492 dst=192.168.2.55:20049 device=mlx4_0 <...>-4261 [001] 372.597652: cm_event_handler: cm.id=0 src=192.168.2.51:47492 dst=192.168.2.55:20049 tos=0 ADDR_RESOLVED (0/0) <...>-4261 [001] 372.597654: cm_event_done: cm.id=0 src=192.168.2.51:47492 dst=192.168.2.55:20049 tos=0 ADDR_RESOLVED consumer returns 0 <...>-4261 [001] 372.597738: cm_event_handler: cm.id=0 src=192.168.2.51:47492 dst=192.168.2.55:20049 tos=0 ROUTE_RESOLVED (2/0) <...>-4261 [001] 372.597740: cm_event_done: cm.id=0 src=192.168.2.51:47492 dst=192.168.2.55:20049 tos=0 ROUTE_RESOLVED consumer returns 0 <...>-4691 [007] 372.600101: cm_qp_create: cm.id=0 src=192.168.2.51:47492 dst=192.168.2.55:20049 tos=0 pd.id=2 qp_type=RC send_wr=4091 recv_wr=256 qp_num=530 rc=0 <...>-4691 [007] 372.600207: cm_send_req: cm.id=0 src=192.168.2.51:47492 dst=192.168.2.55:20049 tos=0 qp_num=530 <...>-185 [002] 372.601212: cm_send_mra: cm.id=0 src=192.168.2.51:47492 dst=192.168.2.55:20049 tos=0 <...>-185 [002] 372.601362: cm_send_rtu: cm.id=0 src=192.168.2.51:47492 dst=192.168.2.55:20049 tos=0 <...>-185 [002] 372.601372: cm_event_handler: cm.id=0 src=192.168.2.51:47492 dst=192.168.2.55:20049 tos=0 ESTABLISHED (9/0) <...>-185 [002] 372.601379: cm_event_done: cm.id=0 src=192.168.2.51:47492 dst=192.168.2.55:20049 tos=0 ESTABLISHED consumer returns 0 Fixes: ed999f82 ("RDMA/cma: Add trace points in RDMA Connection Manager") Link: https://lore.kernel.org/r/20200530174934.21362.56754.stgit@manet.1015granger.netSigned-off-by: NChuck Lever <chuck.lever@oracle.com> Reviewed-by: NLeon Romanovsky <leonro@mellanox.com> Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>
-
- 28 5月, 2020 4 次提交
-
-
由 Leon Romanovsky 提交于
IBTA declares "vendor option not supported" reject reason in REJ messages if passive side doesn't want to accept proposed ECE options. Due to the fact that ECE is managed by userspace, there is a need to let users to provide such rejected reason. Link: https://lore.kernel.org/r/20200526103304.196371-7-leon@kernel.orgSigned-off-by: NLeon Romanovsky <leonro@mellanox.com> Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>
-
由 Leon Romanovsky 提交于
The rdma_accept() is called by both passive and active sides of CMID connection to mark readiness to start data transfer. For passive side, this is called explicitly, for active side, it is called implicitly while receiving REP message. Provide ECE data to rdma_accept function needed for passive side to send that REP message. Link: https://lore.kernel.org/r/20200526103304.196371-6-leon@kernel.orgSigned-off-by: NLeon Romanovsky <leonro@mellanox.com> Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>
-
由 Leon Romanovsky 提交于
ECE parameters are exchanged through REQ->REP/SIDR_REP messages, this patch adds the data to provide to other side of CMID communication channel. Link: https://lore.kernel.org/r/20200526103304.196371-5-leon@kernel.orgSigned-off-by: NLeon Romanovsky <leonro@mellanox.com> Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>
-
由 Leon Romanovsky 提交于
Active side of CMID initiates connection through librdmacm's rdma_connect() and kernel's ucma_connect(). Extend UCMA interface to handle those new parameters. Link: https://lore.kernel.org/r/20200526103304.196371-3-leon@kernel.orgSigned-off-by: NLeon Romanovsky <leonro@mellanox.com> Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>
-
- 07 5月, 2020 1 次提交
-
-
由 Mark Zhang 提交于
If flow label is not set by the user or it's not IPv4, initialize it with the cma src/dst based on the "Kernighan and Ritchie's hash function". Link: https://lore.kernel.org/r/20200504051935.269708-5-leon@kernel.orgSigned-off-by: NMark Zhang <markz@mellanox.com> Reviewed-by: NMaor Gottlieb <maorg@mellanox.com> Signed-off-by: NLeon Romanovsky <leonro@mellanox.com> Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>
-
- 06 5月, 2020 1 次提交
-
-
由 Jason Gunthorpe 提交于
When a client is added it isn't allowed to fail, but all the client's have various failure paths within their add routines. This creates the very fringe condition where the client was added, failed during add and didn't set the client_data. The core code will then still call other client_data centric ops like remove(), rename(), get_nl_info(), and get_net_dev_by_params() with NULL client_data - which is confusing and unexpected. If the add() callback fails, then do not call any more client ops for the device, even remove. Remove all the now redundant checks for NULL client_data in ops callbacks. Update all the add() callbacks to return error codes appropriately. EOPNOTSUPP is used for cases where the ULP does not support the ib_device - eg because it only works with IB. Link: https://lore.kernel.org/r/20200421172440.387069-1-leon@kernel.orgSigned-off-by: NLeon Romanovsky <leonro@mellanox.com> Acked-by: NUrsula Braun <ubraun@linux.ibm.com> Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>
-
- 15 4月, 2020 1 次提交
-
-
由 Leon Romanovsky 提交于
The function is local to cma.c, so let's limit its scope. Link: https://lore.kernel.org/r/20200413132323.930869-1-leon@kernel.orgReviewed-by: NMaor Gottlieb <maorg@mellanox.com> Signed-off-by: NLeon Romanovsky <leonro@mellanox.com> Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>
-
- 27 3月, 2020 1 次提交
-
-
由 Avihai Horon 提交于
After a successful allocation of path_rec, num_paths is set to 1, but any error after such allocation will leave num_paths uncleared. This causes to de-referencing a NULL pointer later on. Hence, num_paths needs to be set back to 0 if such an error occurs. The following crash from syzkaller revealed it. kasan: CONFIG_KASAN_INLINE enabled kasan: GPF could be caused by NULL-ptr deref or user memory access general protection fault: 0000 [#1] SMP DEBUG_PAGEALLOC KASAN PTI CPU: 0 PID: 357 Comm: syz-executor060 Not tainted 4.18.0+ #311 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.11.0-0-g63451fca13-prebuilt.qemu-project.org 04/01/2014 RIP: 0010:ib_copy_path_rec_to_user+0x94/0x3e0 Code: f1 f1 f1 f1 c7 40 0c 00 00 f4 f4 65 48 8b 04 25 28 00 00 00 48 89 45 c8 31 c0 e8 d7 60 24 ff 48 8d 7b 4c 48 89 f8 48 c1 e8 03 <42> 0f b6 14 30 48 89 f8 83 e0 07 83 c0 03 38 d0 7c 08 84 d2 0f 85 RSP: 0018:ffff88006586f980 EFLAGS: 00010207 RAX: 0000000000000009 RBX: 0000000000000000 RCX: 1ffff1000d5fe475 RDX: ffff8800621e17c0 RSI: ffffffff820d45f9 RDI: 000000000000004c RBP: ffff88006586fa50 R08: ffffed000cb0df73 R09: ffffed000cb0df72 R10: ffff88006586fa70 R11: ffffed000cb0df73 R12: 1ffff1000cb0df30 R13: ffff88006586fae8 R14: dffffc0000000000 R15: ffff88006aff2200 FS: 00000000016fc880(0000) GS:ffff88006d000000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000020000040 CR3: 0000000063fec000 CR4: 00000000000006b0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: ? ib_copy_path_rec_from_user+0xcc0/0xcc0 ? __mutex_unlock_slowpath+0xfc/0x670 ? wait_for_completion+0x3b0/0x3b0 ? ucma_query_route+0x818/0xc60 ucma_query_route+0x818/0xc60 ? ucma_listen+0x1b0/0x1b0 ? sched_clock_cpu+0x18/0x1d0 ? sched_clock_cpu+0x18/0x1d0 ? ucma_listen+0x1b0/0x1b0 ? ucma_write+0x292/0x460 ucma_write+0x292/0x460 ? ucma_close_id+0x60/0x60 ? sched_clock_cpu+0x18/0x1d0 ? sched_clock_cpu+0x18/0x1d0 __vfs_write+0xf7/0x620 ? ucma_close_id+0x60/0x60 ? kernel_read+0x110/0x110 ? time_hardirqs_on+0x19/0x580 ? lock_acquire+0x18b/0x3a0 ? finish_task_switch+0xf3/0x5d0 ? _raw_spin_unlock_irq+0x29/0x40 ? _raw_spin_unlock_irq+0x29/0x40 ? finish_task_switch+0x1be/0x5d0 ? __switch_to_asm+0x34/0x70 ? __switch_to_asm+0x40/0x70 ? security_file_permission+0x172/0x1e0 vfs_write+0x192/0x460 ksys_write+0xc6/0x1a0 ? __ia32_sys_read+0xb0/0xb0 ? entry_SYSCALL_64_after_hwframe+0x3e/0xbe ? do_syscall_64+0x1d/0x470 do_syscall_64+0x9e/0x470 entry_SYSCALL_64_after_hwframe+0x49/0xbe Fixes: 3c86aa70 ("RDMA/cm: Add RDMA CM support for IBoE devices") Link: https://lore.kernel.org/r/20200318101741.47211-1-leon@kernel.orgSigned-off-by: NAvihai Horon <avihaih@mellanox.com> Reviewed-by: NMaor Gottlieb <maorg@mellanox.com> Signed-off-by: NLeon Romanovsky <leonro@mellanox.com> Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>
-
- 11 3月, 2020 1 次提交
-
-
由 Jason Gunthorpe 提交于
This lock ordering only happens when bonding is enabled and a certain bonding related event fires. However, since it can happen this is a global restriction on lock ordering. Teach lockdep about the order directly and unconditionally so bugs here are found quickly. See https://syzkaller.appspot.com/bug?extid=55de90ab5f44172b0c90 Link: https://lore.kernel.org/r/20200227203651.GA27185@ziepe.caSigned-off-by: NJason Gunthorpe <jgg@mellanox.com>
-
- 20 2月, 2020 1 次提交
-
-
由 Parav Pandit 提交于
This reverts commit 219d2e9d. The call chain below requires the cm_id_priv's destination address to be setup before performing rdma_bind_addr(). Otherwise source port allocation fails as cma_port_is_unique() no longer sees the correct tuple to allow duplicate users of the source port. rdma_resolve_addr() cma_bind_addr() rdma_bind_addr() cma_get_port() cma_alloc_any_port() cma_port_is_unique() <- compared with zero daddr This can result in false failures to connect, particularly if the source port range is restricted. Fixes: 219d2e9d ("RDMA/cma: Simplify rdma_resolve_addr() error flow") Link: https://lore.kernel.org/r/20200212072635.682689-4-leon@kernel.orgSigned-off-by: NParav Pandit <parav@mellanox.com> Signed-off-by: NLeon Romanovsky <leonro@mellanox.com> Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>
-
- 12 2月, 2020 6 次提交
-
-
由 Parav Pandit 提交于
Use a refcount_t for atomics being used as a refcount. Link: https://lore.kernel.org/r/20200126142652.104803-8-leon@kernel.orgSigned-off-by: NParav Pandit <parav@mellanox.com> Signed-off-by: NLeon Romanovsky <leonro@mellanox.com> Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>
-
由 Parav Pandit 提交于
Helper functions which increment/decrement reference count of a structure read better when they are named with the get/put suffix. Hence, rename cma_ref/deref_id() to cma_id_get/put(). Also use cma_get_id() wrapper to find the balancing put() calls. Link: https://lore.kernel.org/r/20200126142652.104803-7-leon@kernel.orgSigned-off-by: NParav Pandit <parav@mellanox.com> Signed-off-by: NLeon Romanovsky <leonro@mellanox.com> Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>
-
由 Parav Pandit 提交于
Use the refcount variant to capture the reference counting of the cma device structure. Link: https://lore.kernel.org/r/20200126142652.104803-6-leon@kernel.orgSigned-off-by: NParav Pandit <parav@mellanox.com> Signed-off-by: NLeon Romanovsky <leonro@mellanox.com> Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>
-
由 Parav Pandit 提交于
Helper functions which increment/decrement reference count of the structure read better when they are named with the get/put suffix. Hence, rename cma_ref/deref_dev() to cma_dev_get/put(). Link: https://lore.kernel.org/r/20200126142652.104803-5-leon@kernel.orgSigned-off-by: NParav Pandit <parav@mellanox.com> Signed-off-by: NLeon Romanovsky <leonro@mellanox.com> Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>
-
由 Parav Pandit 提交于
Use RDMA device port iterator to avoid open coding. Link: https://lore.kernel.org/r/20200126142652.104803-4-leon@kernel.orgSigned-off-by: NParav Pandit <parav@mellanox.com> Signed-off-by: NLeon Romanovsky <leonro@mellanox.com> Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>
-
由 Parav Pandit 提交于
To avoid errors, with attaching ownership of work item and its cm_id refcount which is decremented in work handler, tie them up in single helper function. Also avoid code duplication. Link: https://lore.kernel.org/r/20200126142652.104803-3-leon@kernel.orgSigned-off-by: NParav Pandit <parav@mellanox.com> Signed-off-by: NLeon Romanovsky <leonro@mellanox.com> Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>
-
- 29 1月, 2020 1 次提交
-
-
由 Parav Pandit 提交于
Below commit missed the AF_IB and loopback code flow in rdma_resolve_addr(). This leads to an unbalanced cm_id refcount in cma_work_handler() which puts the refcount which was not incremented prior to queuing the work. A call trace is observed with such code flow: BUG: unable to handle kernel NULL pointer dereference at (null) [<ffffffff96b67e16>] __mutex_lock_slowpath+0x166/0x1d0 [<ffffffff96b6715f>] mutex_lock+0x1f/0x2f [<ffffffffc0beabb5>] cma_work_handler+0x25/0xa0 [<ffffffff964b9ebf>] process_one_work+0x17f/0x440 [<ffffffff964baf56>] worker_thread+0x126/0x3c0 Hence, hold the cm_id reference when scheduling the resolve work item. Fixes: 722c7b2b ("RDMA/{cma, core}: Avoid callback on rdma_addr_cancel()") Link: https://lore.kernel.org/r/20200126142652.104803-2-leon@kernel.orgSigned-off-by: NParav Pandit <parav@mellanox.com> Signed-off-by: NLeon Romanovsky <leonro@mellanox.com> Reviewed-by: NJason Gunthorpe <jgg@mellanox.com> Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>
-
- 08 1月, 2020 1 次提交
-
-
由 Chuck Lever 提交于
Record state transitions as each connection is established. The IP address of both peers and the Type of Service is reported. These trace points are not in performance hot paths. Also, record each cm_event_handler call to ULPs. This eliminates the need for each ULP to add its own similar trace point in its CM event handler function. These new trace points appear in a new trace subsystem called "rdma_cma". Sample events: <...>-220 [004] 121.430733: cm_id_create: cm.id=0 <...>-472 [003] 121.430991: cm_event_handler: cm.id=0 src=192.168.2.51:35090 dst=192.168.2.55:20049 tos=0 ADDR_RESOLVED (0/0) <...>-472 [003] 121.430995: cm_event_done: cm.id=0 src=192.168.2.51:35090 dst=192.168.2.55:20049 tos=0 result=0 <...>-472 [003] 121.431172: cm_event_handler: cm.id=0 src=192.168.2.51:35090 dst=192.168.2.55:20049 tos=0 ROUTE_RESOLVED (2/0) <...>-472 [003] 121.431174: cm_event_done: cm.id=0 src=192.168.2.51:35090 dst=192.168.2.55:20049 tos=0 result=0 <...>-220 [004] 121.433480: cm_qp_create: cm.id=0 src=192.168.2.51:35090 dst=192.168.2.55:20049 tos=0 pd.id=2 qp_type=RC send_wr=4091 recv_wr=256 qp_num=521 rc=0 <...>-220 [004] 121.433577: cm_send_req: cm.id=0 src=192.168.2.51:35090 dst=192.168.2.55:20049 tos=0 qp_num=521 kworker/1:2-973 [001] 121.436190: cm_send_mra: cm.id=0 src=192.168.2.51:35090 dst=192.168.2.55:20049 tos=0 kworker/1:2-973 [001] 121.436340: cm_send_rtu: cm.id=0 src=192.168.2.51:35090 dst=192.168.2.55:20049 tos=0 kworker/1:2-973 [001] 121.436359: cm_event_handler: cm.id=0 src=192.168.2.51:35090 dst=192.168.2.55:20049 tos=0 ESTABLISHED (9/0) kworker/1:2-973 [001] 121.436365: cm_event_done: cm.id=0 src=192.168.2.51:35090 dst=192.168.2.55:20049 tos=0 result=0 <...>-1975 [005] 123.161954: cm_disconnect: cm.id=0 src=192.168.2.51:35090 dst=192.168.2.55:20049 tos=0 <...>-1975 [005] 123.161974: cm_sent_dreq: cm.id=0 src=192.168.2.51:35090 dst=192.168.2.55:20049 tos=0 <...>-220 [004] 123.162102: cm_disconnect: cm.id=0 src=192.168.2.51:35090 dst=192.168.2.55:20049 tos=0 kworker/0:1-13 [000] 123.162391: cm_event_handler: cm.id=0 src=192.168.2.51:35090 dst=192.168.2.55:20049 tos=0 DISCONNECTED (10/0) kworker/0:1-13 [000] 123.162393: cm_event_done: cm.id=0 src=192.168.2.51:35090 dst=192.168.2.55:20049 tos=0 result=0 <...>-220 [004] 123.164456: cm_qp_destroy: cm.id=0 src=192.168.2.51:35090 dst=192.168.2.55:20049 tos=0 qp_num=521 <...>-220 [004] 123.165290: cm_id_destroy: cm.id=0 src=192.168.2.51:35090 dst=192.168.2.55:20049 tos=0 Some features to note: - restracker ID of the rdma_cm_id is tagged on each trace event - The source and destination IP addresses and TOS are reported - CM event upcalls are shown with decoded event and status - CM state transitions are reported - rdma_cm_id lifetime events are captured - The latency of ULP CM event handlers is reported - Lifetime events of associated QPs are reported - Device removal and insertion is reported This patch is based on previous work by: Saeed Mahameed <saeedm@mellanox.com> Mukesh Kacker <mukesh.kacker@oracle.com> Ajaykumar Hotchandani <ajaykumar.hotchandani@oracle.com> Aron Silverton <aron.silverton@oracle.com> Avinash Repaka <avinash.repaka@oracle.com> Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com> Link: https://lore.kernel.org/r/20191218201810.30584.3052.stgit@manet.1015granger.netSigned-off-by: NChuck Lever <chuck.lever@oracle.com> Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>
-
- 10 12月, 2019 1 次提交
-
-
由 Chuhong Yuan 提交于
The driver forgets to call unregister_pernet_subsys() in the error path of cma_init(). Add the missed call to fix it. Fixes: 4be74b42 ("IB/cma: Separate port allocation to network namespaces") Signed-off-by: NChuhong Yuan <hslester96@gmail.com> Reviewed-by: NParav Pandit <parav@mellanox.com> Link: https://lore.kernel.org/r/20191206012426.12744-1-hslester96@gmail.comSigned-off-by: NDoug Ledford <dledford@redhat.com>
-
- 17 11月, 2019 1 次提交
-
-
由 Dag Moxnes 提交于
The cma is currently using a hard-coded value, CMA_IBOE_PACKET_LIFETIME, for the PacketLifeTime, as it can not be determined from the network. This value might not be optimal for all networks. The cma module supports the function rdma_set_ack_timeout to set the ACK timeout for a QP associated with a connection. As per IBTA 12.7.34 local ACK timeout = (2 * PacketLifeTime + Local CA’s ACK delay). Assuming a negligible local ACK delay, we can use PacketLifeTime = local ACK timeout/2 as a reasonable approximation for RoCE networks. Link: https://lore.kernel.org/r/1572439440-17416-1-git-send-email-dag.moxnes@oracle.comSigned-off-by: NDag Moxnes <dag.moxnes@oracle.com> Reviewed-by: NJason Gunthorpe <jgg@mellanox.com> Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>
-
- 28 10月, 2019 1 次提交
-
-
由 Leon Romanovsky 提交于
Add Mellanox to lust of copyright holders and replace copyright boilerplate with relevant SPDX tag. Link: https://lore.kernel.org/r/20191020071559.9743-4-leon@kernel.orgSigned-off-by: NLeon Romanovsky <leonro@mellanox.com> Reviewed-by: NJason Gunthorpe <jgg@mellanox.com> Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>
-
- 23 10月, 2019 1 次提交
-
-
由 Parav Pandit 提交于
When a macvlan netdevice is used for RoCE, consider the tos->prio->tc mapping as SL using its lower netdevice. 1. If the lower netdevice is a VLAN netdevice, consider the VLAN netdevice and it's parent netdevice for mapping 2. If the lower netdevice is not a VLAN netdevice, consider tc mapping directly from the lower netdevice Link: https://lore.kernel.org/r/20191015072058.17347-1-leon@kernel.orgSigned-off-by: NParav Pandit <parav@mellanox.com> Signed-off-by: NLeon Romanovsky <leonro@mellanox.com> Reviewed-by: NJason Gunthorpe <jgg@mellanox.com> Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>
-
- 01 10月, 2019 1 次提交
-
-
由 Bart Van Assche 提交于
This patch fixes the lock inversion complaint: ============================================ WARNING: possible recursive locking detected 5.3.0-rc7-dbg+ #1 Not tainted -------------------------------------------- kworker/u16:6/171 is trying to acquire lock: 00000000035c6e6c (&id_priv->handler_mutex){+.+.}, at: rdma_destroy_id+0x78/0x4a0 [rdma_cm] but task is already holding lock: 00000000bc7c307d (&id_priv->handler_mutex){+.+.}, at: iw_conn_req_handler+0x151/0x680 [rdma_cm] other info that might help us debug this: Possible unsafe locking scenario: CPU0 ---- lock(&id_priv->handler_mutex); lock(&id_priv->handler_mutex); *** DEADLOCK *** May be due to missing lock nesting notation 3 locks held by kworker/u16:6/171: #0: 00000000e2eaa773 ((wq_completion)iw_cm_wq){+.+.}, at: process_one_work+0x472/0xac0 #1: 000000001efd357b ((work_completion)(&work->work)#3){+.+.}, at: process_one_work+0x476/0xac0 #2: 00000000bc7c307d (&id_priv->handler_mutex){+.+.}, at: iw_conn_req_handler+0x151/0x680 [rdma_cm] stack backtrace: CPU: 3 PID: 171 Comm: kworker/u16:6 Not tainted 5.3.0-rc7-dbg+ #1 Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011 Workqueue: iw_cm_wq cm_work_handler [iw_cm] Call Trace: dump_stack+0x8a/0xd6 __lock_acquire.cold+0xe1/0x24d lock_acquire+0x106/0x240 __mutex_lock+0x12e/0xcb0 mutex_lock_nested+0x1f/0x30 rdma_destroy_id+0x78/0x4a0 [rdma_cm] iw_conn_req_handler+0x5c9/0x680 [rdma_cm] cm_work_handler+0xe62/0x1100 [iw_cm] process_one_work+0x56d/0xac0 worker_thread+0x7a/0x5d0 kthread+0x1bc/0x210 ret_from_fork+0x24/0x30 This is not a bug as there are actually two lock classes here. Link: https://lore.kernel.org/r/20190930231707.48259-3-bvanassche@acm.org Fixes: de910bd9 ("RDMA/cma: Simplify locking needed for serialization of callbacks") Signed-off-by: NBart Van Assche <bvanassche@acm.org> Reviewed-by: NJason Gunthorpe <jgg@mellanox.com> Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>
-
- 14 9月, 2019 1 次提交
-
-
由 Håkon Bugge 提交于
In addr_handler(), assuming status == 0 and the device already has been acquired (id_priv->cma_dev != NULL), we get the following incorrect "error" message: RDMA CM: ADDR_ERROR: failed to resolve IP. status 0 Fixes: 498683c6 ("IB/cma: Add debug messages to error flows") Link: https://lore.kernel.org/r/20190902092731.1055757-1-haakon.bugge@oracle.comSigned-off-by: NHåkon Bugge <haakon.bugge@oracle.com> Reviewed-by: NJason Gunthorpe <jgg@mellanox.com> Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>
-
- 21 8月, 2019 1 次提交
-
-
由 zhengbin 提交于
In cma_init, if cma_configfs_init fails, need to free the previously memory and return fail, otherwise will trigger null-ptr-deref Read in cma_cleanup. cma_cleanup cma_configfs_exit configfs_unregister_subsystem Fixes: 045959db ("IB/cma: Add configfs for rdma_cm") Reported-by: NHulk Robot <hulkci@huawei.com> Signed-off-by: Nzhengbin <zhengbin13@huawei.com> Reviewed-by: NParav Pandit <parav@mellanox.com> Link: https://lore.kernel.org/r/1566188859-103051-1-git-send-email-zhengbin13@huawei.comSigned-off-by: NDoug Ledford <dledford@redhat.com>
-
- 03 5月, 2019 1 次提交
-
-
由 Parav Pandit 提交于
To access the netdevice of the GID attribute, use an existing API rdma_read_gid_attr_ndev_rcu(). This further reduces dependency on open access to netdevice of GID attribute. Signed-off-by: NParav Pandit <parav@mellanox.com> Signed-off-by: NLeon Romanovsky <leonro@mellanox.com> Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>
-
- 24 4月, 2019 1 次提交
-
-
由 Parav Pandit 提交于
When two netdev have same link local addresses (such as vlan and non vlan), two rdma cm listen id should be able to bind to following different addresses. listener-1: addr=lla, scope_id=A, port=X listener-2: addr=lla, scope_id=B, port=X However while comparing the addresses only addr and port are considered, due to which 2nd listener fails to listen. In below example of two listeners, 2nd listener is failing with address in use error. $ rping -sv -a fe80::268a:7ff:feb3:d113%ens2f1 -p 4545& $ rping -sv -a fe80::268a:7ff:feb3:d113%ens2f1.200 -p 4545 rdma_bind_addr: Address already in use To overcome this, consider the scope_ids as well which forms the accurate IPv6 link local address. Signed-off-by: NParav Pandit <parav@mellanox.com> Reviewed-by: NDaniel Jurgens <danielj@mellanox.com> Signed-off-by: NLeon Romanovsky <leonro@mellanox.com> Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>
-
- 04 4月, 2019 1 次提交
-
-
由 Leon Romanovsky 提交于
Conversion from IDR to XArray missed the fact that idr_alloc() returned index as a return value, this index was saved in port variable and used as query index later on. This caused to the following error. BUG: KASAN: use-after-free in cma_check_port+0x86a/0xa20 [rdma_cm] Read of size 8 at addr ffff888069fde998 by task ucmatose/387 CPU: 3 PID: 387 Comm: ucmatose Not tainted 5.1.0-rc2+ #253 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.11.0-0-g63451fca13-prebuilt.qemu-project.org 04/01/2014 Call Trace: dump_stack+0x7c/0xc0 print_address_description+0x6c/0x23c ? cma_check_port+0x86a/0xa20 [rdma_cm] kasan_report.cold.3+0x1c/0x35 ? cma_check_port+0x86a/0xa20 [rdma_cm] ? cma_check_port+0x86a/0xa20 [rdma_cm] cma_check_port+0x86a/0xa20 [rdma_cm] rdma_bind_addr+0x11bc/0x1b00 [rdma_cm] ? find_held_lock+0x33/0x1c0 ? cma_ndev_work_handler+0x180/0x180 [rdma_cm] ? wait_for_completion+0x3d0/0x3d0 ucma_bind+0x120/0x160 [rdma_ucm] ? ucma_resolve_addr+0x1a0/0x1a0 [rdma_ucm] ucma_write+0x1f8/0x2b0 [rdma_ucm] ? ucma_open+0x260/0x260 [rdma_ucm] vfs_write+0x157/0x460 ksys_write+0xb8/0x170 ? __ia32_sys_read+0xb0/0xb0 ? trace_hardirqs_off_caller+0x5b/0x160 ? do_syscall_64+0x18/0x3c0 do_syscall_64+0x95/0x3c0 entry_SYSCALL_64_after_hwframe+0x49/0xbe Allocated by task 381: __kasan_kmalloc.constprop.5+0xc1/0xd0 cma_alloc_port+0x4d/0x160 [rdma_cm] rdma_bind_addr+0x14e7/0x1b00 [rdma_cm] ucma_bind+0x120/0x160 [rdma_ucm] ucma_write+0x1f8/0x2b0 [rdma_ucm] vfs_write+0x157/0x460 ksys_write+0xb8/0x170 do_syscall_64+0x95/0x3c0 entry_SYSCALL_64_after_hwframe+0x49/0xbe Freed by task 381: __kasan_slab_free+0x12e/0x180 kfree+0xed/0x290 rdma_destroy_id+0x6b6/0x9e0 [rdma_cm] ucma_close+0x110/0x300 [rdma_ucm] __fput+0x25a/0x740 task_work_run+0x10e/0x190 do_exit+0x85e/0x29e0 do_group_exit+0xf0/0x2e0 get_signal+0x2e0/0x17e0 do_signal+0x94/0x1570 exit_to_usermode_loop+0xfa/0x130 do_syscall_64+0x327/0x3c0 entry_SYSCALL_64_after_hwframe+0x49/0xbe Reported-by: <syzbot+2e3e485d5697ea610460@syzkaller.appspotmail.com> Reported-by: NRan Rozenstein <ranro@mellanox.com> Fixes: 63826753 ("cma: Convert portspace IDRs to XArray") Signed-off-by: NLeon Romanovsky <leonro@mellanox.com> Reviewed-by: NBart Van Assche <bvanassche@acm.org> Tested-by: NBart Van Assche <bvanassche@acm.org> Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>
-
- 29 3月, 2019 1 次提交
-
-
由 Parav Pandit 提交于
Introduce an API rdma_dev_access_netns() to check whether a rdma device can be accessed from the specified net namespace or not. Use rdma_dev_access_netns() while opening character uverbs, umad network device and also check while rdma cm_id binds to rdma device. Signed-off-by: NParav Pandit <parav@mellanox.com> Signed-off-by: NLeon Romanovsky <leonro@mellanox.com> Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>
-
- 26 3月, 2019 1 次提交
-
-
由 Matthew Wilcox 提交于
Signed-off-by: NMatthew Wilcox <willy@infradead.org> Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>
-
- 20 2月, 2019 1 次提交
-
-
由 Jason Gunthorpe 提交于
We have many loops iterating over all of the end port numbers on a struct ib_device, simplify them with a for_each helper. Reviewed-by: NParav Pandit <parav@mellanox.com> Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>
-
- 09 2月, 2019 3 次提交
-
-
由 Steve Wise 提交于
This allows drivers to know the tos was actively set by the application. Signed-off-by: NSteve Wise <swise@opengridcomputing.com> Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>
-
由 Steve Wise 提交于
If a user binds to INADDR_ANY and sets the service id, then the device-specific cm_ids should also use this tos. This allows an app to do: rdma_bind_addr(INADDR_ANY) set_service_type() rdma_listen() And connections setup via this listening endpoint will use the correct tos. Signed-off-by: NSteve Wise <swise@opengridcomputing.com> Reviewed-by: NParav Pandit <parav@mellanox.com> Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>
-
由 Danit Goldberg 提交于
Define new option in 'rdma_set_option' to override calculated QP timeout when requested to provide QP attributes to modify a QP. At the same time, pack tos_set to be bitfield. Signed-off-by: NDanit Goldberg <danitg@mellanox.com> Reviewed-by: NMoni Shoua <monis@mellanox.com> Signed-off-by: NLeon Romanovsky <leonro@mellanox.com> Reviewed-by: NParav Pandit <parav@mellanox.com> Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>
-
- 06 2月, 2019 1 次提交
-
-
由 Leon Romanovsky 提交于
Netlink statistics exported by rdma-cm never had any working user space component published to the mailing list or to any open source project. Canvassing various proprietary users, and the original requester, we find that there are no real users of this interface. This patch simply removes all occurrences of RDMA CM netlink in favour of modern nldev implementation, which provides the same information and accompanied by widely used user space component. Signed-off-by: NLeon Romanovsky <leonro@mellanox.com> Reviewed-by: NSteve Wise <swise@opengridcomputing.com> Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>
-
- 15 1月, 2019 1 次提交
-
-
由 Myungho Jung 提交于
If cma_acquire_dev_by_src_ip() returns error in addr_handler(), the device state changes back to RDMA_CM_ADDR_BOUND but the resolved source IP address is still left. After that, if rdma_destroy_id() is called after rdma_listen(), the device is freed without removed from listen_any_list in cma_cancel_operation(). Revert to the previous IP address if acquiring device fails. Reported-by: syzbot+f3ce716af730c8f96637@syzkaller.appspotmail.com Signed-off-by: NMyungho Jung <mhjungk@gmail.com> Reviewed-by: NParav Pandit <parav@mellanox.com> Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>
-
- 09 1月, 2019 1 次提交
-
-
由 Steve Wise 提交于
A recent regression causes a null ptr crash when dumping cm_id resources. The cma is incorrectly adding all cm_id restrack resources as kernel mode. Fixes: af8d7037 ("RDMA/restrack: Resource-tracker should not use uobject pointers") Signed-off-by: NSteve Wise <swise@opengridcomputing.com> Reviewed-by: NLeon Romanovsky <leonro@mellanox.com> Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>
-