- 22 2月, 2018 1 次提交
-
-
由 Leon Romanovsky 提交于
Attempt to modify XRC_TGT QP type from the user space (ibv_xsrq_pingpong invocation) will trigger the following kernel panic. It is caused by the fact that such QPs missed uobject initialization. [ 17.408845] BUG: unable to handle kernel NULL pointer dereference at 0000000000000048 [ 17.412645] IP: rdma_lookup_put_uobject+0x9/0x50 [ 17.416567] PGD 0 P4D 0 [ 17.419262] Oops: 0000 [#1] SMP PTI [ 17.422915] CPU: 0 PID: 455 Comm: ibv_xsrq_pingpo Not tainted 4.16.0-rc1+ #86 [ 17.424765] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.7.5-0-ge51488c-20140602_164612-nilsson.home.kraxel.org 04/01/2014 [ 17.427399] RIP: 0010:rdma_lookup_put_uobject+0x9/0x50 [ 17.428445] RSP: 0018:ffffb8c7401e7c90 EFLAGS: 00010246 [ 17.429543] RAX: 0000000000000000 RBX: ffffb8c7401e7cf8 RCX: 0000000000000000 [ 17.432426] RDX: 0000000000000001 RSI: 0000000000000000 RDI: 0000000000000000 [ 17.437448] RBP: 0000000000000000 R08: 00000000000218f0 R09: ffffffff8ebc4cac [ 17.440223] R10: fffff6038052cd80 R11: ffff967694b36400 R12: ffff96769391f800 [ 17.442184] R13: ffffb8c7401e7cd8 R14: 0000000000000000 R15: ffff967699f60000 [ 17.443971] FS: 00007fc29207d700(0000) GS:ffff96769fc00000(0000) knlGS:0000000000000000 [ 17.446623] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 17.448059] CR2: 0000000000000048 CR3: 000000001397a000 CR4: 00000000000006b0 [ 17.449677] Call Trace: [ 17.450247] modify_qp.isra.20+0x219/0x2f0 [ 17.451151] ib_uverbs_modify_qp+0x90/0xe0 [ 17.452126] ib_uverbs_write+0x1d2/0x3c0 [ 17.453897] ? __handle_mm_fault+0x93c/0xe40 [ 17.454938] __vfs_write+0x36/0x180 [ 17.455875] vfs_write+0xad/0x1e0 [ 17.456766] SyS_write+0x52/0xc0 [ 17.457632] do_syscall_64+0x75/0x180 [ 17.458631] entry_SYSCALL_64_after_hwframe+0x21/0x86 [ 17.460004] RIP: 0033:0x7fc29198f5a0 [ 17.460982] RSP: 002b:00007ffccc71f018 EFLAGS: 00000246 ORIG_RAX: 0000000000000001 [ 17.463043] RAX: ffffffffffffffda RBX: 0000000000000078 RCX: 00007fc29198f5a0 [ 17.464581] RDX: 0000000000000078 RSI: 00007ffccc71f050 RDI: 0000000000000003 [ 17.466148] RBP: 0000000000000000 R08: 0000000000000078 R09: 00007ffccc71f050 [ 17.467750] R10: 000055b6cf87c248 R11: 0000000000000246 R12: 00007ffccc71f300 [ 17.469541] R13: 000055b6cf8733a0 R14: 0000000000000000 R15: 0000000000000000 [ 17.471151] Code: 00 00 0f 1f 44 00 00 48 8b 47 48 48 8b 00 48 8b 40 10 e9 0b 8b 68 00 90 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 55 53 89 f5 <48> 8b 47 48 48 89 fb 40 0f b6 f6 48 8b 00 48 8b 40 20 e8 e0 8a [ 17.475185] RIP: rdma_lookup_put_uobject+0x9/0x50 RSP: ffffb8c7401e7c90 [ 17.476841] CR2: 0000000000000048 [ 17.477764] ---[ end trace 1dbcc5354071a712 ]--- [ 17.478880] Kernel panic - not syncing: Fatal exception [ 17.480277] Kernel Offset: 0xd000000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff) Fixes: 2f08ee36 ("RDMA/restrack: don't use uaccess_kernel()") Signed-off-by: NLeon Romanovsky <leonro@mellanox.com> Signed-off-by: NDoug Ledford <dledford@redhat.com>
-
- 17 2月, 2018 2 次提交
-
-
由 Steve Wise 提交于
uaccess_kernel() isn't sufficient to determine if an rdma resource is user-mode or not. For example, resources allocated in the add_one() function of an ib_client get falsely labeled as user mode, when they are kernel mode allocations. EG: mad qps. The result is that these qps are skipped over during a nldev query because of an erroneous namespace mismatch. So now we determine if the resource is user-mode by looking at the object struct's uobject or similar pointer to know if it was allocated for user mode applications. Fixes: 02d8883f ("RDMA/restrack: Add general infrastructure to track RDMA resources") Signed-off-by: NSteve Wise <swise@opengridcomputing.com> Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>
-
由 Leon Romanovsky 提交于
Update all the flows to ensure that function pointer exists prior to accessing it. This is much safer than checking the uverbs_ex_mask variable, especially since we know that test isn't working properly and will be removed in -next. This prevents a user triggereable oops. Signed-off-by: NLeon Romanovsky <leonro@mellanox.com> Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>
-
- 16 2月, 2018 4 次提交
-
-
由 Leon Romanovsky 提交于
================================================================== BUG: KASAN: use-after-free in copy_ah_attr_from_uverbs+0x6f2/0x8c0 Read of size 4 at addr ffff88006476a198 by task syzkaller697701/265 CPU: 0 PID: 265 Comm: syzkaller697701 Not tainted 4.15.0+ #90 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.7.5-0-ge51488c-20140602_164612-nilsson.home.kraxel.org 04/01/2014 Call Trace: dump_stack+0xde/0x164 ? dma_virt_map_sg+0x22c/0x22c ? show_regs_print_info+0x17/0x17 ? lock_contended+0x11a0/0x11a0 print_address_description+0x83/0x3e0 kasan_report+0x18c/0x4b0 ? copy_ah_attr_from_uverbs+0x6f2/0x8c0 ? copy_ah_attr_from_uverbs+0x6f2/0x8c0 ? lookup_get_idr_uobject+0x120/0x200 ? copy_ah_attr_from_uverbs+0x6f2/0x8c0 copy_ah_attr_from_uverbs+0x6f2/0x8c0 ? modify_qp+0xd0e/0x1350 modify_qp+0xd0e/0x1350 ib_uverbs_modify_qp+0xf9/0x170 ? ib_uverbs_query_qp+0xa70/0xa70 ib_uverbs_write+0x7f9/0xef0 ? attach_entity_load_avg+0x8b0/0x8b0 ? ib_uverbs_query_qp+0xa70/0xa70 ? uverbs_devnode+0x110/0x110 ? cyc2ns_read_end+0x10/0x10 ? print_irqtrace_events+0x280/0x280 ? sched_clock_cpu+0x18/0x200 ? _raw_spin_unlock_irq+0x29/0x40 ? _raw_spin_unlock_irq+0x29/0x40 ? _raw_spin_unlock_irq+0x29/0x40 ? time_hardirqs_on+0x27/0x670 __vfs_write+0x10d/0x700 ? uverbs_devnode+0x110/0x110 ? kernel_read+0x170/0x170 ? _raw_spin_unlock_irq+0x29/0x40 ? finish_task_switch+0x1bd/0x7a0 ? finish_task_switch+0x194/0x7a0 ? prandom_u32_state+0xe/0x180 ? rcu_read_unlock+0x80/0x80 ? security_file_permission+0x93/0x260 vfs_write+0x1b0/0x550 SyS_write+0xc7/0x1a0 ? SyS_read+0x1a0/0x1a0 ? trace_hardirqs_on_thunk+0x1a/0x1c entry_SYSCALL_64_fastpath+0x1e/0x8b RIP: 0033:0x433c29 RSP: 002b:00007ffcf2be82a8 EFLAGS: 00000217 Allocated by task 62: kasan_kmalloc+0xa0/0xd0 kmem_cache_alloc+0x141/0x480 dup_fd+0x101/0xcc0 copy_process.part.62+0x166f/0x4390 _do_fork+0x1cb/0xe90 kernel_thread+0x34/0x40 call_usermodehelper_exec_work+0x112/0x260 process_one_work+0x929/0x1aa0 worker_thread+0x5c6/0x12a0 kthread+0x346/0x510 ret_from_fork+0x3a/0x50 Freed by task 259: kasan_slab_free+0x71/0xc0 kmem_cache_free+0xf3/0x4c0 put_files_struct+0x225/0x2c0 exit_files+0x88/0xc0 do_exit+0x67c/0x1520 do_group_exit+0xe8/0x380 SyS_exit_group+0x1e/0x20 entry_SYSCALL_64_fastpath+0x1e/0x8b The buggy address belongs to the object at ffff88006476a000 which belongs to the cache files_cache of size 832 The buggy address is located 408 bytes inside of 832-byte region [ffff88006476a000, ffff88006476a340) The buggy address belongs to the page: page:ffffea000191da80 count:1 mapcount:0 mapping: (null) index:0x0 compound_mapcount: 0 flags: 0x4000000000008100(slab|head) raw: 4000000000008100 0000000000000000 0000000000000000 0000000100080008 raw: 0000000000000000 0000000100000001 ffff88006bcf7a80 0000000000000000 page dumped because: kasan: bad access detected Memory state around the buggy address: ffff88006476a080: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ffff88006476a100: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb >ffff88006476a180: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ^ ffff88006476a200: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ffff88006476a280: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ================================================================== Cc: syzkaller <syzkaller@googlegroups.com> Cc: <stable@vger.kernel.org> # 4.11 Fixes: 44c58487 ("IB/core: Define 'ib' and 'roce' rdma_ah_attr types") Reported-by: NNoa Osherovich <noaos@mellanox.com> Signed-off-by: NLeon Romanovsky <leonro@mellanox.com> Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>
-
由 Leon Romanovsky 提交于
Avoid circular locking dependency by calling to uobj_alloc_commit() outside of xrcd_tree_mutex lock. ====================================================== WARNING: possible circular locking dependency detected 4.15.0+ #87 Not tainted ------------------------------------------------------ syzkaller401056/269 is trying to acquire lock: (&uverbs_dev->xrcd_tree_mutex){+.+.}, at: [<000000006c12d2cd>] uverbs_free_xrcd+0xd2/0x360 but task is already holding lock: (&ucontext->uobjects_lock){+.+.}, at: [<00000000da010f09>] uverbs_cleanup_ucontext+0x168/0x730 which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #1 (&ucontext->uobjects_lock){+.+.}: __mutex_lock+0x111/0x1720 rdma_alloc_commit_uobject+0x22c/0x600 ib_uverbs_open_xrcd+0x61a/0xdd0 ib_uverbs_write+0x7f9/0xef0 __vfs_write+0x10d/0x700 vfs_write+0x1b0/0x550 SyS_write+0xc7/0x1a0 entry_SYSCALL_64_fastpath+0x1e/0x8b -> #0 (&uverbs_dev->xrcd_tree_mutex){+.+.}: lock_acquire+0x19d/0x440 __mutex_lock+0x111/0x1720 uverbs_free_xrcd+0xd2/0x360 remove_commit_idr_uobject+0x6d/0x110 uverbs_cleanup_ucontext+0x2f0/0x730 ib_uverbs_cleanup_ucontext.constprop.3+0x52/0x120 ib_uverbs_close+0xf2/0x570 __fput+0x2cd/0x8d0 task_work_run+0xec/0x1d0 do_exit+0x6a1/0x1520 do_group_exit+0xe8/0x380 SyS_exit_group+0x1e/0x20 entry_SYSCALL_64_fastpath+0x1e/0x8b other info that might help us debug this: Possible unsafe locking scenario: CPU0 CPU1 ---- ---- lock(&ucontext->uobjects_lock); lock(&uverbs_dev->xrcd_tree_mutex); lock(&ucontext->uobjects_lock); lock(&uverbs_dev->xrcd_tree_mutex); *** DEADLOCK *** 3 locks held by syzkaller401056/269: #0: (&file->cleanup_mutex){+.+.}, at: [<00000000c9f0c252>] ib_uverbs_close+0xac/0x570 #1: (&ucontext->cleanup_rwsem){++++}, at: [<00000000b6994d49>] uverbs_cleanup_ucontext+0xf6/0x730 #2: (&ucontext->uobjects_lock){+.+.}, at: [<00000000da010f09>] uverbs_cleanup_ucontext+0x168/0x730 stack backtrace: CPU: 0 PID: 269 Comm: syzkaller401056 Not tainted 4.15.0+ #87 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.7.5-0-ge51488c-20140602_164612-nilsson.home.kraxel.org 04/01/2014 Call Trace: dump_stack+0xde/0x164 ? dma_virt_map_sg+0x22c/0x22c ? uverbs_cleanup_ucontext+0x168/0x730 ? console_unlock+0x502/0xbd0 print_circular_bug.isra.24+0x35e/0x396 ? print_circular_bug_header+0x12e/0x12e ? find_usage_backwards+0x30/0x30 ? entry_SYSCALL_64_fastpath+0x1e/0x8b validate_chain.isra.28+0x25d1/0x40c0 ? check_usage+0xb70/0xb70 ? graph_lock+0x160/0x160 ? find_usage_backwards+0x30/0x30 ? cyc2ns_read_end+0x10/0x10 ? print_irqtrace_events+0x280/0x280 ? __lock_acquire+0x93d/0x1630 __lock_acquire+0x93d/0x1630 lock_acquire+0x19d/0x440 ? uverbs_free_xrcd+0xd2/0x360 __mutex_lock+0x111/0x1720 ? uverbs_free_xrcd+0xd2/0x360 ? uverbs_free_xrcd+0xd2/0x360 ? __mutex_lock+0x828/0x1720 ? mutex_lock_io_nested+0x1550/0x1550 ? uverbs_cleanup_ucontext+0x168/0x730 ? __lock_acquire+0x9a9/0x1630 ? mutex_lock_io_nested+0x1550/0x1550 ? uverbs_cleanup_ucontext+0xf6/0x730 ? lock_contended+0x11a0/0x11a0 ? uverbs_free_xrcd+0xd2/0x360 uverbs_free_xrcd+0xd2/0x360 remove_commit_idr_uobject+0x6d/0x110 uverbs_cleanup_ucontext+0x2f0/0x730 ? sched_clock_cpu+0x18/0x200 ? uverbs_close_fd+0x1c0/0x1c0 ib_uverbs_cleanup_ucontext.constprop.3+0x52/0x120 ib_uverbs_close+0xf2/0x570 ? ib_uverbs_remove_one+0xb50/0xb50 ? ib_uverbs_remove_one+0xb50/0xb50 __fput+0x2cd/0x8d0 task_work_run+0xec/0x1d0 do_exit+0x6a1/0x1520 ? fsnotify_first_mark+0x220/0x220 ? exit_notify+0x9f0/0x9f0 ? entry_SYSCALL_64_fastpath+0x5/0x8b ? entry_SYSCALL_64_fastpath+0x5/0x8b ? trace_hardirqs_on_thunk+0x1a/0x1c ? time_hardirqs_on+0x27/0x670 ? time_hardirqs_off+0x27/0x490 ? syscall_return_slowpath+0x6c/0x460 ? entry_SYSCALL_64_fastpath+0x5/0x8b do_group_exit+0xe8/0x380 SyS_exit_group+0x1e/0x20 entry_SYSCALL_64_fastpath+0x1e/0x8b RIP: 0033:0x431ce9 Cc: syzkaller <syzkaller@googlegroups.com> Cc: <stable@vger.kernel.org> # 4.11 Fixes: fd3c7904 ("IB/core: Change idr objects to use the new schema") Reported-by: NNoa Osherovich <noaos@mellanox.com> Signed-off-by: NLeon Romanovsky <leonro@mellanox.com> Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>
-
由 Leon Romanovsky 提交于
There is no matching lock for this mutex. Git history suggests this is just a missed remnant from an earlier version of the function before this locking was moved into uverbs_free_xrcd. Originally this lock was protecting the xrcd_table_delete() ===================================== WARNING: bad unlock balance detected! 4.15.0+ #87 Not tainted ------------------------------------- syzkaller223405/269 is trying to release lock (&uverbs_dev->xrcd_tree_mutex) at: [<00000000b8703372>] ib_uverbs_close_xrcd+0x195/0x1f0 but there are no more locks to release! other info that might help us debug this: 1 lock held by syzkaller223405/269: #0: (&uverbs_dev->disassociate_srcu){....}, at: [<000000005af3b960>] ib_uverbs_write+0x265/0xef0 stack backtrace: CPU: 0 PID: 269 Comm: syzkaller223405 Not tainted 4.15.0+ #87 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.7.5-0-ge51488c-20140602_164612-nilsson.home.kraxel.org 04/01/2014 Call Trace: dump_stack+0xde/0x164 ? dma_virt_map_sg+0x22c/0x22c ? ib_uverbs_write+0x265/0xef0 ? console_unlock+0x502/0xbd0 ? ib_uverbs_close_xrcd+0x195/0x1f0 print_unlock_imbalance_bug+0x131/0x160 lock_release+0x59d/0x1100 ? ib_uverbs_close_xrcd+0x195/0x1f0 ? lock_acquire+0x440/0x440 ? lock_acquire+0x440/0x440 __mutex_unlock_slowpath+0x88/0x670 ? wait_for_completion+0x4c0/0x4c0 ? rdma_lookup_get_uobject+0x145/0x2f0 ib_uverbs_close_xrcd+0x195/0x1f0 ? ib_uverbs_open_xrcd+0xdd0/0xdd0 ib_uverbs_write+0x7f9/0xef0 ? cyc2ns_read_end+0x10/0x10 ? ib_uverbs_open_xrcd+0xdd0/0xdd0 ? uverbs_devnode+0x110/0x110 ? cyc2ns_read_end+0x10/0x10 ? cyc2ns_read_end+0x10/0x10 ? sched_clock_cpu+0x18/0x200 __vfs_write+0x10d/0x700 ? uverbs_devnode+0x110/0x110 ? kernel_read+0x170/0x170 ? __fget+0x358/0x5d0 ? security_file_permission+0x93/0x260 vfs_write+0x1b0/0x550 SyS_write+0xc7/0x1a0 ? SyS_read+0x1a0/0x1a0 ? trace_hardirqs_on_thunk+0x1a/0x1c entry_SYSCALL_64_fastpath+0x1e/0x8b RIP: 0033:0x4335c9 Cc: syzkaller <syzkaller@googlegroups.com> Cc: <stable@vger.kernel.org> # 4.11 Fixes: fd3c7904 ("IB/core: Change idr objects to use the new schema") Reported-by: NNoa Osherovich <noaos@mellanox.com> Signed-off-by: NLeon Romanovsky <leonro@mellanox.com> Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>
-
由 Leon Romanovsky 提交于
Once the uobj is committed it is immediately possible another thread could destroy it, which worst case, can result in a use-after-free of the restrack objects. Cc: syzkaller <syzkaller@googlegroups.com> Fixes: 08f294a1 ("RDMA/core: Add resource tracking for create and destroy CQs") Reported-by: NNoa Osherovich <noaos@mellanox.com> Signed-off-by: NLeon Romanovsky <leonro@mellanox.com> Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>
-
- 30 1月, 2018 3 次提交
-
-
由 Leon Romanovsky 提交于
Track create and destroy operations of PD objects. Reviewed-by: NMark Bloch <markb@mellanox.com> Signed-off-by: NLeon Romanovsky <leonro@mellanox.com> Reviewed-by: NSteve Wise <swise@opengridcomputing.com> Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>
-
由 Leon Romanovsky 提交于
Track create and destroy operations of CQ objects. Reviewed-by: NMark Bloch <markb@mellanox.com> Signed-off-by: NLeon Romanovsky <leonro@mellanox.com> Reviewed-by: NSteve Wise <swise@opengridcomputing.com> Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>
-
由 Leon Romanovsky 提交于
Track create and destroy operations of QP objects. Reviewed-by: NMark Bloch <markb@mellanox.com> Signed-off-by: NLeon Romanovsky <leonro@mellanox.com> Reviewed-by: NSteve Wise <swise@opengridcomputing.com> Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>
-
- 16 1月, 2018 1 次提交
-
-
由 Jason Gunthorpe 提交于
This matches what the userspace copy of this header has been doing for a while. imm_data is an opaque 4 byte array carried over the network, and invalidate_rkey is in CPU byte order. Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>
-
- 28 12月, 2017 1 次提交
-
-
由 Moni Shoua 提交于
If the input command length is larger than the kernel supports an error should be returned in case the unsupported bytes are not cleared, instead of the other way aroudn. This matches what all other callers of ib_is_udata_cleared do and will avoid user ABI problems in the future. Cc: <stable@vger.kernel.org> # v4.10 Fixes: 189aba99 ("IB/uverbs: Extend modify_qp and support packet pacing") Reviewed-by: NYishai Hadas <yishaih@mellanox.com> Signed-off-by: NMoni Shoua <monis@mellanox.com> Signed-off-by: NLeon Romanovsky <leon@kernel.org> Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>
-
- 12 12月, 2017 1 次提交
-
-
由 Gomonovych, Vasyl 提交于
Fix ptr_ret.cocci warnings: drivers/infiniband/core/uverbs_cmd.c:1156:1-3: WARNING: PTR_ERR_OR_ZERO can be used Use PTR_ERR_OR_ZERO rather than if(IS_ERR(...)) + PTR_ERR Generated by: scripts/coccinelle/api/ptr_ret.cocci Signed-off-by: NVasyl Gomonovych <gomonovych@gmail.com> Reviewed-by: NLeon Romanovsky <leonro@mellanox.com> Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>
-
- 08 12月, 2017 1 次提交
-
-
由 Daniel Jurgens 提交于
The alternate port number is used as an array index in the IB security implementation, invalid values can result in a kernel panic. Cc: <stable@vger.kernel.org> # v4.12 Fixes: d291f1a6 ("IB/core: Enforce PKey security on QPs") Signed-off-by: NDaniel Jurgens <danielj@mellanox.com> Reviewed-by: NParav Pandit <parav@mellanox.com> Signed-off-by: NLeon Romanovsky <leon@kernel.org> Signed-off-by: NDoug Ledford <dledford@redhat.com>
-
- 14 11月, 2017 3 次提交
-
-
由 Leon Romanovsky 提交于
Current ib_modify_cq() is used to set CQ moderation parameters. This patch renames ib_modify_cq() to be rdma_set_cq_moderation(), because the kernel version of RDMA API doesn't need to follow already exposed to user's API pattern (create_XXX/modify_XXX/query_XXX/destroy_XXX) and better to have more accurate name which describes the actual usage. Signed-off-by: NLeon Romanovsky <leonro@mellanox.com> Signed-off-by: NDoug Ledford <dledford@redhat.com>
-
由 Yonatan Cohen 提交于
The query_device function can now obtain the maximum values for cq_max_count and cq_period, needed for CQ moderation. cq_max_count is a 16 bits number that determines the number of CQEs to accumulate before generating an event. cq_period is a 16 bits number that determines the timeout in micro seconds from the last event generated, upon which a new event will be generated even if cq_max_count was not reached. Signed-off-by: NYonatan Cohen <yonatanc@mellanox.com> Reviewed-by: NMajd Dibbiny <majd@mellanox.com> Signed-off-by: NLeon Romanovsky <leon@kernel.org> Signed-off-by: NDoug Ledford <dledford@redhat.com>
-
由 Yonatan Cohen 提交于
Uverbs support in modify_cq for CQ moderation only. Gives ability to change cq_max_count and cq_period. CQ moderation enhance performance by moderating the number of CQEs needed to create an event instead of application having to suffer from event per-CQE. To achieve CQ moderation the application needs to set cq_max_count and cq_period. cq_max_count - defines the number of CQEs needed to create an event. cq_period - defines the timeout (micro seconds) between last event and a new one that will occur even if cq_max_count was not satisfied Signed-off-by: NYonatan Cohen <yonatanc@mellanox.com> Reviewed-by: NMajd Dibbiny <majd@mellanox.com> Signed-off-by: NLeon Romanovsky <leon@kernel.org> Signed-off-by: NDoug Ledford <dledford@redhat.com>
-
- 11 11月, 2017 1 次提交
-
-
由 Noa Osherovich 提交于
There are root complexes that are able to optimize their performance when incoming data is multiple full cache lines. PCI write end padding is the device's ability to pad the ending of incoming packets (scatter) to full cache line such that the last upstream write generated by an incoming packet will be a full cache line. Add a relevant entry to ib_device_cap_flags to report such capability of an RDMA device. Add the QP and WQ create flags: * A QP/WQ created with a scatter end padding flag will cause HW to pad the last upstream write generated by a packet to cache line. User should consider several factors before activating this feature: - In case of high CPU memory load (which may cause PCI back pressure in turn), if a large percent of the writes are partial cache line, this feature should be checked as an optional solution. - This feature might reduce performance if most packets are between one and two cache lines and PCIe throughput has reached its maximum capacity. E.g. 65B packet from the network port will lead to 128B write on PCIe, which may cause traffic on PCIe to reach high throughput. Signed-off-by: NNoa Osherovich <noaos@mellanox.com> Reviewed-by: NMajd Dibbiny <majd@mellanox.com> Signed-off-by: NLeon Romanovsky <leon@kernel.org> Signed-off-by: NDoug Ledford <dledford@redhat.com>
-
- 19 10月, 2017 1 次提交
-
-
由 Parav Pandit 提交于
Introduce rdma_create_user_ah API which allows passing udata to provider driver and additionally which resolves DMAC for RoCE. ib_resolve_eth_dmac() resolves destination mac address for unicast, multicast, link local ipv4 mapped ipv6 and ipv6 destination gid entry. This allows all RoCE provider drivers to avoid duplicating such code. Such change brings consistency where IB core always resolves dmac and pass it to RoCE provider drivers for user and kernel consumers, with this ah_attr->roce.dmac is always an input field for provider drivers. This uniformity avoids exporting ib_resolve_eth_dmac symbol to providers or other modules. Therefore its removed as exported symbol at later in the patch series. Now uverbs and umad both makes use of rdma_create_user_ah API which fixes the issue where umad has invalid DMAC for address. Signed-off-by: NParav Pandit <parav@mellanox.com> Reviewed-by: NDaniel Jurgens <danielj@mellanox.com> Signed-off-by: NLeon Romanovsky <leon@kernel.org> Signed-off-by: NDoug Ledford <dledford@redhat.com>
-
- 27 9月, 2017 1 次提交
-
-
由 Arnd Bergmann 提交于
After changing INIT_UDATA_BUF_OR_NULL() to an inline function, this does the same change to INIT_UDATA for consistency. I'm keeping it separate as this part is much larger and we wouldn't want to backport this to stable kernels if we ever want to address the gcc warnings by backporting the first patch. Again, using an inline function gives us better type safety here among other issues with macros. I'm using u64_to_user_ptr() to convert the user pointer to simplify the logic rather than adding lots of new type casts. Signed-off-by: NArnd Bergmann <arnd@arndb.de> Signed-off-by: NDoug Ledford <dledford@redhat.com>
-
- 25 9月, 2017 1 次提交
-
-
由 Leon Romanovsky 提交于
The tag matching functionality is implemented by mlx5 driver by extending XRQ, however this internal kernel information was exposed to user space applications with *xrq* name instead of *tm*. This patch renames *xrq* to *tm* to handle that. Fixes: 8d50505a ("IB/uverbs: Expose XRQ capabilities") Signed-off-by: NLeon Romanovsky <leonro@mellanox.com> Reviewed-by: NYishai Hadas <yishaih@mellanox.com> Signed-off-by: NDoug Ledford <dledford@redhat.com>
-
- 09 9月, 2017 1 次提交
-
-
由 Davidlohr Bueso 提交于
Allow interval trees to quickly check for overlaps to avoid unnecesary tree lookups in interval_tree_iter_first(). As of this patch, all interval tree flavors will require using a 'rb_root_cached' such that we can have the leftmost node easily available. While most users will make use of this feature, those with special functions (in addition to the generic insert, delete, search calls) will avoid using the cached option as they can do funky things with insertions -- for example, vma_interval_tree_insert_after(). [jglisse@redhat.com: fix deadlock from typo vm_lock_anon_vma()] Link: http://lkml.kernel.org/r/20170808225719.20723-1-jglisse@redhat.com Link: http://lkml.kernel.org/r/20170719014603.19029-12-dave@stgolabs.netSigned-off-by: NDavidlohr Bueso <dbueso@suse.de> Signed-off-by: NJérôme Glisse <jglisse@redhat.com> Acked-by: NChristian König <christian.koenig@amd.com> Acked-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Acked-by: NDoug Ledford <dledford@redhat.com> Acked-by: NMichael S. Tsirkin <mst@redhat.com> Cc: David Airlie <airlied@linux.ie> Cc: Jason Wang <jasowang@redhat.com> Cc: Christian Benvenuti <benve@cisco.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 29 8月, 2017 3 次提交
-
-
由 Artemy Kovalyov 提交于
Make XRQ capabilities available via ibv_query_device() verb. Signed-off-by: NArtemy Kovalyov <artemyko@mellanox.com> Reviewed-by: NYossi Itigin <yosefe@mellanox.com> Signed-off-by: NLeon Romanovsky <leon@kernel.org> Signed-off-by: NDoug Ledford <dledford@redhat.com>
-
由 Artemy Kovalyov 提交于
Add new SRQ type capable of new tag matching feature. When SRQ receives a message it will search through the matching list for the corresponding posted receive buffer. The process of searching the matching list is called tag matching. In case the tag matching results in a match, the received message will be placed in the address specified by the receive buffer. In case no match was found the message will be placed in a generic buffer until the corresponding receive buffer will be posted. These messages are called unexpected and their set is called an unexpected list. Signed-off-by: NArtemy Kovalyov <artemyko@mellanox.com> Reviewed-by: NYossi Itigin <yosefe@mellanox.com> Signed-off-by: NLeon Romanovsky <leon@kernel.org> Signed-off-by: NDoug Ledford <dledford@redhat.com>
-
由 Artemy Kovalyov 提交于
Before this change CQ attached to SRQ was part of XRC specific extension. Moving CQ handle out makes it available to other types extending SRQ functionality. Signed-off-by: NArtemy Kovalyov <artemyko@mellanox.com> Reviewed-by: NYossi Itigin <yosefe@mellanox.com> Signed-off-by: NLeon Romanovsky <leon@kernel.org> Signed-off-by: NDoug Ledford <dledford@redhat.com>
-
- 25 8月, 2017 2 次提交
-
-
由 Parav Pandit 提交于
This patch introduces two helper functions to copy ah attributes from uverbs to internal ib_ah_attr structure and the other way during modify qp and query qp respectively. Signed-off-by: NParav Pandit <parav@mellanox.com> Reviewed-by: NDaniel Jurgens <danielj@mellanox.com> Signed-off-by: NLeon Romanovsky <leon@kernel.org> Reviewed-by: NDennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: NDoug Ledford <dledford@redhat.com>
-
由 Noa Osherovich 提交于
Commit 44c58487 ("IB/core: Define 'ib' and 'roce' rdma_ah_attr types") introduced the concept of type in ah_attr: * During ib_register_device, each port is checked for its type which is stored in ib_device's port_immutable array. * During uverbs' modify_qp, the type is inferred using the port number in ib_uverbs_qp_dest struct (address vector) by accessing the relevant port_immutable array and the type is passed on to providers. IB spec (version 1.3) enforces a valid port value only in Reset to Init. During Init to RTR, the address vector must be valid but port number is not mentioned as a field in the address vector, so its value is not validated, which leads to accesses to a non-allocated memory when inferring the port type. Save the real port number in ib_qp during modify to Init (when the comp_mask indicates that the port number is valid) and use this value to infer the port type. Avoid copying the address vector fields if the matching bit is not set in the attr_mask. Address vector can't be modified before the port, so no valid flow is affected. Fixes: 44c58487 ('IB/core: Define 'ib' and 'roce' rdma_ah_attr types') Signed-off-by: NNoa Osherovich <noaos@mellanox.com> Reviewed-by: NYishai Hadas <yishaih@mellanox.com> Signed-off-by: NLeon Romanovsky <leon@kernel.org> Signed-off-by: NDoug Ledford <dledford@redhat.com>
-
- 23 8月, 2017 3 次提交
-
-
由 Don Hiatt 提交于
When address handle attributes are initialized, the LIDs are transformed to be in the 32 bit LID space. When constructing the header, hfi1 driver will look at the LID to determine the packet header to be created. Reviewed-by: NDennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: NDasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com> Signed-off-by: NDon Hiatt <don.hiatt@intel.com> Signed-off-by: NDennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: NDoug Ledford <dledford@redhat.com>
-
由 Amrani, Ram 提交于
Most user verbs pass user data to the kernel with the inclusion of the ib_uverbs_cmd_hdr structure. This is problematic because the vendor has no ideas if the verb was called by a legacy verb or an extended verb. Also, the incosistency between the verbs is confusing. Fixes: 565197dd ("IB/core: Extend ib_uverbs_create_cq") Signed-off-by: NRam Amrani <Ram.Amrani@cavium.com> Signed-off-by: NAriel Elior <Ariel.Elior@cavium.com> Signed-off-by: NDoug Ledford <dledford@redhat.com>
-
由 Bharat Potnuri 提交于
Initializing cq_context with ev_queue in create_cq(), leads to NULL pointer dereference in ib_uverbs_comp_handler(), if application doesnot use completion channel. This patch fixes the cq_context initialization. Fixes: 1e7710f3 ("IB/core: Change completion channel to use the reworked") Cc: stable@vger.kernel.org # 4.12 Signed-off-by: NPotnuri Bharat Teja <bharat@chelsio.com> Reviewed-by: NMatan Barak <matanb@mellanox.com> Signed-off-by: NDoug Ledford <dledford@redhat.com> (cherry picked from commit 699a2d5b)
-
- 19 8月, 2017 2 次提交
-
-
由 Hiatt, Don 提交于
This patch series primarily increases sizes of variables that hold lid values from 16 to 32 bits. Additionally, it adds a check in the IB mad stack to verify a properly formatted MAD when OPA extended LIDs are used. Signed-off-by: NDon Hiatt <don.hiatt@intel.com> Reviewed-by: NDennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: NDoug Ledford <dledford@redhat.com>
-
由 Bharat Potnuri 提交于
Initializing cq_context with ev_queue in create_cq(), leads to NULL pointer dereference in ib_uverbs_comp_handler(), if application doesnot use completion channel. This patch fixes the cq_context initialization. Fixes: 1e7710f3 ("IB/core: Change completion channel to use the reworked") Signed-off-by: NPotnuri Bharat Teja <bharat@chelsio.com> Reviewed-by: NMatan Barak <matanb@mellanox.com> Signed-off-by: NDoug Ledford <dledford@redhat.com>
-
- 09 8月, 2017 3 次提交
-
-
由 Hiatt, Don 提交于
slid field in struct ib_wc is increased to 32 bits. This enables core components to use larger LIDs if needed. The user ABI is unchanged and return 16 bit values when queried. Signed-off-by: NDasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com> Reviewed-by: NIra Weiny <ira.weiny@intel.com> Signed-off-by: NDon Hiatt <don.hiatt@intel.com> Signed-off-by: NDoug Ledford <dledford@redhat.com>
-
sm_lid field in struct ib_port_attr is increased to 32 bits. This enables core components to use larger LIDs if needed. The user ABI is unchanged and return 16 bit values when queried. Signed-off-by: NDasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com> Reviewed-by: NIra Weiny <ira.weiny@intel.com> Signed-off-by: NDon Hiatt <don.hiatt@intel.com> Signed-off-by: NDoug Ledford <dledford@redhat.com>
-
lid field in struct ib_port_attr is increased to 32 bits. This enables core components to use larger LIDs if needed. The user ABI is unchanged and return 16 bit values when queried. Signed-off-by: NDasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com> Reviewed-by: NIra Weiny <ira.weiny@intel.com> Signed-off-by: NDon Hiatt <don.hiatt@intel.com> Signed-off-by: NDoug Ledford <dledford@redhat.com>
-
- 05 8月, 2017 1 次提交
-
-
由 Leon Romanovsky 提交于
initialize to zero the response structure to prevent the leakage of "resp.reserved" field. drivers/infiniband/core/uverbs_cmd.c:1178 ib_uverbs_resize_cq() warn: check that 'resp.reserved' doesn't leak information Fixes: 33b9b3ee ("IB: Add userspace support for resizing CQs") Signed-off-by: NLeon Romanovsky <leonro@mellanox.com> Reviewed-by: NDennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: NDoug Ledford <dledford@redhat.com>
-
- 24 7月, 2017 1 次提交
-
-
由 Yishai Hadas 提交于
Enable QP creation with a given source QP number, the created QP will use the source QPN as its wire QP number. To create such a QP, root privileges (i.e. CAP_NET_RAW) are required from the user application. This comes as a pre-patch for downstream patches in this series to allow user space applications to accelerate traffic which is typically handled by IPoIB ULP. Signed-off-by: NYishai Hadas <yishaih@mellanox.com> Reviewed-by: NMaor Gottlieb <maorg@mellanox.com> Signed-off-by: NLeon Romanovsky <leon@kernel.org> Signed-off-by: NDoug Ledford <dledford@redhat.com>
-
- 20 7月, 2017 2 次提交
-
-
由 Ismail, Mustafa 提交于
The port number is only valid if IB_QP_PORT is set in the mask. So only check port number if it is valid to prevent modify_qp from failing due to an invalid port number. Fixes: 5ecce4c9("Check port number supplied by user verbs cmds") Cc: <stable@vger.kernel.org> # v2.6.14+ Reviewed-by: NSteve Wise <swise@opengridcomputing.com> Signed-off-by: NMustafa Ismail <mustafa.ismail@intel.com> Tested-by: NMike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: NDoug Ledford <dledford@redhat.com>
-
由 Matan Barak 提交于
Delete unused variables to prevent sparse warnings. Fixes: db1b5ddd ("IB/core: Rename uverbs event file structure") Fixes: fd3c7904 ("IB/core: Change idr objects to use the new schema") Signed-off-by: NDoug Ledford <dledford@redhat.com>
-
- 18 7月, 2017 1 次提交
-
-
由 Parav Pandit 提交于
This patch makes use of IB core's ib_modify_qp_with_udata function that also resolves the DMAC and handles udata. Signed-off-by: NParav Pandit <parav@mellanox.com> Reviewed-by: NEli Cohen <eli@mellanox.com> Reviewed-by: NDaniel Jurgens <danielj@mellanox.com> Signed-off-by: NLeon Romanovsky <leon@kernel.org> Signed-off-by: NDoug Ledford <dledford@redhat.com>
-