- 17 1月, 2020 1 次提交
-
-
由 Michael Guralnik 提交于
Enable relaxed ordering in the mkey context when requested. As relaxed ordering is not currently supported in UMR, disable UMR usage for relaxed ordering MRs. Link: https://lore.kernel.org/r/1578506740-22188-11-git-send-email-yishaih@mellanox.comSigned-off-by: NMichael Guralnik <michaelgur@mellanox.com> Signed-off-by: NYishai Hadas <yishaih@mellanox.com> Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>
-
- 16 1月, 2020 2 次提交
-
-
由 Jason Gunthorpe 提交于
Till recently it was not possible for userspace to specify a different IOVA, but with the new ibv_reg_mr_iova() library call this can be done. To compute the user_va we must compute: user_va = (iova - iova_start) + user_va_start while being cautious of overflow and other math problems. The iova is not reliably stored in the mmkey when the MR is created. Only the cached creation path (the common one) set it, so it must also be set when creating uncached MRs. Fix the weird use of iova when computing the starting page index in the MR. In the normal case, when iova == umem.address: iova & (~(BIT(page_shift) - 1)) == ALIGN_DOWN(umem.address, odp->page_size) == ib_umem_start(odp) And when iova is different using it in math with a user_va is wrong. Finally, do not allow an implicit ODP to be created with a non-zero IOVA as we have no support for that. Fixes: 7bdf65d4 ("IB/mlx5: Handle page faults") Signed-off-by: NMoni Shoua <monis@mellanox.com> Signed-off-by: NJason Gunthorpe <jgg@mellanox.com> Signed-off-by: NLeon Romanovsky <leonro@mellanox.com>
-
由 Moni Shoua 提交于
So far the assumption was that ib_umem_get() and ib_umem_odp_get() are called from flows that start in UVERBS and therefore has a user context. This assumption restricts flows that are initiated by ULPs and need the service that ib_umem_get() provides. This patch changes ib_umem_get() and ib_umem_odp_get() to get IB device directly by relying on the fact that both UVERBS and ULPs sets that field correctly. Reviewed-by: NGuy Levi <guyle@mellanox.com> Signed-off-by: NMoni Shoua <monis@mellanox.com> Signed-off-by: NLeon Romanovsky <leonro@mellanox.com>
-
- 04 1月, 2020 2 次提交
-
-
由 zhengbin 提交于
Fixes coccicheck warning: drivers/infiniband/hw/mlx5/mr.c:150:2-26: WARNING: Assignment of 0/1 to bool variable drivers/infiniband/hw/mlx5/mr.c:1455:2-26: WARNING: Assignment of 0/1 to bool variable drivers/infiniband/hw/mlx5/qp.c:1874:6-20: WARNING: Assignment of 0/1 to bool variable Link: https://lore.kernel.org/r/1577176812-2238-6-git-send-email-zhengbin13@huawei.comReported-by: NHulk Robot <hulkci@huawei.com> Signed-off-by: Nzhengbin <zhengbin13@huawei.com> Acked-by: NLeon Romanovsky <leonro@mellanox.com> Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>
-
由 Artemy Kovalyov 提交于
Building MR translation table in the ODP case requires additional flexibility, namely random access to DMA addresses. Make both direct and indirect ODP MR use same code path, separated from the non-ODP MR code path. With the restructuring the correct page_shift is now used around __mlx5_ib_populate_pas(). Fixes: d2183c6f ("RDMA/umem: Move page_shift from ib_umem to ib_odp_umem") Link: https://lore.kernel.org/r/20191222124649.52300-2-leon@kernel.orgSigned-off-by: NArtemy Kovalyov <artemyko@mellanox.com> Signed-off-by: NLeon Romanovsky <leonro@mellanox.com> Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>
-
- 24 11月, 2019 1 次提交
-
-
由 Jason Gunthorpe 提交于
Replace the internal interval tree based mmu notifier with the new common mmu_interval_notifier_insert() API. This removes a lot of code and fixes a deadlock that can be triggered in ODP: zap_page_range() mmu_notifier_invalidate_range_start() [..] ib_umem_notifier_invalidate_range_start() down_read(&per_mm->umem_rwsem) unmap_single_vma() [..] __split_huge_page_pmd() mmu_notifier_invalidate_range_start() [..] ib_umem_notifier_invalidate_range_start() down_read(&per_mm->umem_rwsem) // DEADLOCK mmu_notifier_invalidate_range_end() up_read(&per_mm->umem_rwsem) mmu_notifier_invalidate_range_end() up_read(&per_mm->umem_rwsem) The umem_rwsem is held across the range_start/end as the ODP algorithm for invalidate_range_end cannot tolerate changes to the interval tree. However, due to the nested invalidation regions the second down_read() can deadlock if there are competing writers. The new core code provides an alternative scheme to solve this problem. Fixes: ca748c39 ("RDMA/umem: Get rid of per_mm->notifier_count") Link: https://lore.kernel.org/r/20191112202231.3856-6-jgg@ziepe.caTested-by: NArtemy Kovalyov <artemyko@mellanox.com> Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>
-
- 17 11月, 2019 1 次提交
-
-
由 Christoph Hellwig 提交于
The argument is always ignored, so remove it. Link: https://lore.kernel.org/r/20191113073214.9514-3-hch@lst.deSigned-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NJason Gunthorpe <jgg@mellanox.com> Acked-by: NMichal Kalderon <michal.kalderon@marvell.com> Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>
-
- 01 11月, 2019 1 次提交
-
-
由 Leon Romanovsky 提交于
Returned value from mlx5_mr_cache_alloc() is checked to be error or real pointer. Return proper error code instead of NULL which is not checked later. Fixes: 81713d37 ("IB/mlx5: Add implicit MR support") Link: https://lore.kernel.org/r/20191029055721.7192-1-leon@kernel.orgSigned-off-by: NLeon Romanovsky <leonro@mellanox.com> Reviewed-by: NJason Gunthorpe <jgg@mellanox.com> Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>
-
- 29 10月, 2019 6 次提交
-
-
由 Jason Gunthorpe 提交于
For creation, as soon as the umem_odp is created the notifier can be called, however the underlying MR may not have been setup yet. This would cause problems if mlx5_ib_invalidate_range() runs. There is some confusing/ulocked/racy code that might by trying to solve this, but without locks it isn't going to work right. Instead trivially solve the problem by short-circuiting the invalidation if there are not yet any DMA mapped pages. By definition there is nothing to invalidate in this case. The create code will have the umem fully setup before anything is DMA mapped, and npages is fully locked by the umem_mutex. For destroy, invalidate the entire MR at the HW to stop DMA then DMA unmap the pages before destroying the MR. This drives npages to zero and prevents similar racing with invalidate while the MR is undergoing destruction. Arguably it would be better if the umem was created after the MR and destroyed before, but that would require a big rework of the MR code. Fixes: 6aec21f6 ("IB/mlx5: Page faults handling infrastructure") Link: https://lore.kernel.org/r/20191009160934.3143-15-jgg@ziepe.caReviewed-by: NArtemy Kovalyov <artemyko@mellanox.com> Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>
-
由 Jason Gunthorpe 提交于
Use SRCU in a sensible way by removing all MRs in the implicit tree from the two xarrays (the update operation), then a synchronize, followed by a normal single threaded teardown. This is only a little unusual from the normal pattern as there can still be some work pending in the unbound wq that may also require a workqueue flush. This is tracked with a single atomic, consolidating the redundant existing atomics and wait queue. For understand-ability the entire ODP implicit create/destroy flow now largely exists in a single pair of functions within odp.c, with a few support functions for tearing down an unused child. Link: https://lore.kernel.org/r/20191009160934.3143-13-jgg@ziepe.caReviewed-by: NArtemy Kovalyov <artemyko@mellanox.com> Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>
-
由 Jason Gunthorpe 提交于
No users are left, delete it. Link: https://lore.kernel.org/r/20191009160934.3143-5-jgg@ziepe.caReviewed-by: NArtemy Kovalyov <artemyko@mellanox.com> Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>
-
由 Jason Gunthorpe 提交于
There is a per device xarray storing mkeys that is used to store every mkey in the system. However, this xarray is now only read by ODP for certain ODP designated MRs (ODP, implicit ODP, MW, DEVX_INDIRECT). Create an xarray only for use by ODP, that only contains ODP related MKeys. This xarray is protected by SRCU and all erases are protected by a synchronize. This improves performance: - All MRs in the odp_mkeys xarray are ODP MRs, so some tests for is_odp() can be deleted. The xarray will also consume fewer nodes. - normal MR's are never mixed with ODP MRs in a SRCU data structure so performance sucking synchronize_srcu() on every MR destruction is not needed. - No smp_load_acquire(live) and xa_load() double barrier on read Due to the SRCU locking scheme care must be taken with the placement of the xa_store(). Once it completes the MR is immediately visible to other threads and only through a xa_erase() & synchronize_srcu() cycle could it be destroyed. Link: https://lore.kernel.org/r/20191009160934.3143-4-jgg@ziepe.caReviewed-by: NArtemy Kovalyov <artemyko@mellanox.com> Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>
-
由 Jason Gunthorpe 提交于
The locking model for signature is completely different than ODP, do not share the same xarray that relies on SRCU locking to support ODP. Simply store the active mlx5_core_sig_ctx's in an xarray when signature MRs are created and rely on trivial xarray locking to serialize everything. The overhead of storing only a handful of SIG related MRs is going to be much less than an xarray full of every mkey. Link: https://lore.kernel.org/r/20191009160934.3143-3-jgg@ziepe.caReviewed-by: NArtemy Kovalyov <artemyko@mellanox.com> Reviewed-by: NMax Gurtovoy <maxg@mellanox.com> Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>
-
由 Jason Gunthorpe 提交于
The mkey_table xarray is touched by the reg_mr_callback() function which is called from a hard irq. Thus all other uses of xa_lock must use the _irq variants. WARNING: inconsistent lock state 5.4.0-rc1 #12 Not tainted -------------------------------- inconsistent {IN-HARDIRQ-W} -> {HARDIRQ-ON-W} usage. python3/343 [HC0[0]:SC0[0]:HE1:SE1] takes: ffff888182be1d40 (&(&xa->xa_lock)->rlock#3){?.-.}, at: xa_erase+0x12/0x30 {IN-HARDIRQ-W} state was registered at: lock_acquire+0xe1/0x200 _raw_spin_lock_irqsave+0x35/0x50 reg_mr_callback+0x2dd/0x450 [mlx5_ib] mlx5_cmd_exec_cb_handler+0x2c/0x70 [mlx5_core] mlx5_cmd_comp_handler+0x355/0x840 [mlx5_core] [..] Possible unsafe locking scenario: CPU0 ---- lock(&(&xa->xa_lock)->rlock#3); <Interrupt> lock(&(&xa->xa_lock)->rlock#3); *** DEADLOCK *** 2 locks held by python3/343: #0: ffff88818eb4bd38 (&uverbs_dev->disassociate_srcu){....}, at: ib_uverbs_ioctl+0xe5/0x1e0 [ib_uverbs] #1: ffff888176c76d38 (&file->hw_destroy_rwsem){++++}, at: uobj_destroy+0x2d/0x90 [ib_uverbs] stack backtrace: CPU: 3 PID: 343 Comm: python3 Not tainted 5.4.0-rc1 #12 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.org 04/01/2014 Call Trace: dump_stack+0x86/0xca print_usage_bug.cold.50+0x2e5/0x355 mark_lock+0x871/0xb50 ? match_held_lock+0x20/0x250 ? check_usage_forwards+0x240/0x240 __lock_acquire+0x7de/0x23a0 ? __kasan_check_read+0x11/0x20 ? mark_lock+0xae/0xb50 ? mark_held_locks+0xb0/0xb0 ? find_held_lock+0xca/0xf0 lock_acquire+0xe1/0x200 ? xa_erase+0x12/0x30 _raw_spin_lock+0x2a/0x40 ? xa_erase+0x12/0x30 xa_erase+0x12/0x30 mlx5_ib_dealloc_mw+0x55/0xa0 [mlx5_ib] uverbs_dealloc_mw+0x3c/0x70 [ib_uverbs] uverbs_free_mw+0x1a/0x20 [ib_uverbs] destroy_hw_idr_uobject+0x49/0xa0 [ib_uverbs] [..] Fixes: 04177915 ("RDMA/mlx5: Add missing synchronize_srcu() for MW cases") Link: https://lore.kernel.org/r/20191024234910.GA9038@ziepe.caAcked-by: NLeon Romanovsky <leonro@mellanox.com> Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>
-
- 09 10月, 2019 1 次提交
-
-
由 Parav Pandit 提交于
Introduce and use set_mkc_access_pd_addr_fields() which sets mkey context's access rights, PD, address fields. Thereby avoid the code duplication. Link: https://lore.kernel.org/r/20191006155443.31068-1-leon@kernel.orgSigned-off-by: NParav Pandit <parav@mellanox.com> Signed-off-by: NLeon Romanovsky <leonro@mellanox.com> Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>
-
- 05 10月, 2019 4 次提交
-
-
由 Jason Gunthorpe 提交于
While MR uses live as the SRCU 'update', the MW case uses the xarray directly, xa_erase() causes the MW to become inaccessible to the pagefault thread. Thus whenever a MW is removed from the xarray we must synchronize_srcu() before freeing it. This must be done before freeing the mkey as re-use of the mkey while the pagefault thread is using the stale mkey is undesirable. Add the missing synchronizes to MW and DEVX indirect mkey and delete the bogus protection against double destroy in mlx5_core_destroy_mkey() Fixes: 534fd7aa ("IB/mlx5: Manage indirection mkey upon DEVX flow for ODP") Fixes: 6aec21f6 ("IB/mlx5: Page faults handling infrastructure") Link: https://lore.kernel.org/r/20191001153821.23621-7-jgg@ziepe.caReviewed-by: NArtemy Kovalyov <artemyko@mellanox.com> Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>
-
由 Jason Gunthorpe 提交于
live is used to signal to the pagefault thread that the MR is initialized and ready for use. It should be after the umem is assigned and all other setup is completed. This prevents races (at least) of the form: CPU0 CPU1 mlx5_ib_alloc_implicit_mr() implicit_mr_alloc() live = 1 imr->umem = umem num_pending_prefetch_inc() if (live) atomic_inc(num_pending_prefetch) atomic_set(num_pending_prefetch,0) // Overwrites other thread's store Further, live is being used with SRCU as the 'update' in an acquire/release fashion, so it can not be read and written raw. Move all live = 1's to after MR initialization is completed and use smp_store_release/smp_load_acquire() for manipulating it. Add a missing live = 0 when an implicit MR child is deleted, before queuing work to do synchronize_srcu(). The barriers in update_odp_mr() were some broken attempt to create a acquire/release, but were not even applied consistently and missed the point, delete it as well. Fixes: 6aec21f6 ("IB/mlx5: Page faults handling infrastructure") Link: https://lore.kernel.org/r/20191001153821.23621-6-jgg@ziepe.caReviewed-by: NArtemy Kovalyov <artemyko@mellanox.com> Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>
-
由 Jason Gunthorpe 提交于
During destroy setting live = 0 and then synchronize_srcu() prevents num_pending_prefetch from incrementing, and also, ensures that all work holding that count is queued on the WQ. Testing before causes races of the form: CPU0 CPU1 dereg_mr() mlx5_ib_advise_mr_prefetch() srcu_read_lock() num_pending_prefetch_inc() if (!live) live = 0 atomic_read() == 0 // skip flush_workqueue() atomic_inc() queue_work(); srcu_read_unlock() WARN_ON(atomic_read()) // Fails Swap the order so that the synchronize_srcu() prevents this. Fixes: a6bc3875 ("IB/mlx5: Protect against prefetch of invalid MR") Link: https://lore.kernel.org/r/20191001153821.23621-5-jgg@ziepe.caReviewed-by: NArtemy Kovalyov <artemyko@mellanox.com> Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>
-
由 Jason Gunthorpe 提交于
This code is completely broken, the umem of a ODP MR simply cannot be discarded without a lot more locking, nor can an ODP mkey be blithely destroyed via destroy_mkey(). Fixes: 6aec21f6 ("IB/mlx5: Page faults handling infrastructure") Link: https://lore.kernel.org/r/20191001153821.23621-2-jgg@ziepe.caReviewed-by: NArtemy Kovalyov <artemyko@mellanox.com> Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>
-
- 22 8月, 2019 3 次提交
-
-
由 Jason Gunthorpe 提交于
Now that there are allocator APIs that return the ib_umem_odp directly it should be freed through a umem_odp free'er as well. Link: https://lore.kernel.org/r/20190819111710.18440-8-leon@kernel.orgSigned-off-by: NLeon Romanovsky <leonro@mellanox.com> Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>
-
由 Jason Gunthorpe 提交于
This is the last creation API that is overloaded for both, there is very little code sharing and a driver has to be specifically ready for a umem_odp to be created to use the odp version. Link: https://lore.kernel.org/r/20190819111710.18440-7-leon@kernel.orgSigned-off-by: NLeon Romanovsky <leonro@mellanox.com> Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>
-
由 Jason Gunthorpe 提交于
Implicit ODP umems are special, they don't have any page lists, they don't exist in the interval tree and they are never DMA mapped. Instead of trying to guess this based on a zero length use an explicit flag. Further, do not allow non-implicit umems to be 0 size. Link: https://lore.kernel.org/r/20190819111710.18440-4-leon@kernel.orgSigned-off-by: NLeon Romanovsky <leonro@mellanox.com> Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>
-
- 21 8月, 2019 2 次提交
-
-
由 Moni Shoua 提交于
The UMR WQE in the MR re-registration flow requires that modify_atomic and modify_entity_size capabilities are enabled. Therefore, check that the these capabilities are present before going to umr flow and go through slow path if not. Fixes: c8d75a98 ("IB/mlx5: Respect new UMR capabilities") Signed-off-by: NMoni Shoua <monis@mellanox.com> Reviewed-by: NGuy Levi <guyle@mellanox.com> Signed-off-by: NLeon Romanovsky <leonro@mellanox.com> Link: https://lore.kernel.org/r/20190815083834.9245-8-leon@kernel.orgSigned-off-by: NDoug Ledford <dledford@redhat.com>
-
由 Moni Shoua 提交于
Introduce helper function to unify various use_umr checks. Signed-off-by: NMoni Shoua <monis@mellanox.com> Signed-off-by: NLeon Romanovsky <leonro@mellanox.com> Link: https://lore.kernel.org/r/20190815083834.9245-6-leon@kernel.orgSigned-off-by: NDoug Ledford <dledford@redhat.com>
-
- 01 8月, 2019 1 次提交
-
-
由 Guy Levi 提交于
Driver shouldn't allow to use UMR to register a MR when umr_modify_atomic_disabled is set. Otherwise it will always end up with a failure in the post send flow which sets the UMR WQE to modify atomic access right. Fixes: c8d75a98 ("IB/mlx5: Respect new UMR capabilities") Signed-off-by: NGuy Levi <guyle@mellanox.com> Reviewed-by: NMoni Shoua <monis@mellanox.com> Signed-off-by: NLeon Romanovsky <leonro@mellanox.com> Link: https://lore.kernel.org/r/20190731081929.32559-1-leon@kernel.orgSigned-off-by: NDoug Ledford <dledford@redhat.com>
-
- 25 7月, 2019 4 次提交
-
-
由 Yishai Hadas 提交于
Any dma map underlying the MR should only be freed once the MR is fenced at the hardware. As of the above we first destroy the MKEY and just after that can safely call to dma_unmap_single(). Link: https://lore.kernel.org/r/20190723065733.4899-6-leon@kernel.org Cc: <stable@vger.kernel.org> # 4.3 Fixes: 8a187ee5 ("IB/mlx5: Support the new memory registration API") Signed-off-by: NYishai Hadas <yishaih@mellanox.com> Reviewed-by: NArtemy Kovalyov <artemyko@mellanox.com> Signed-off-by: NLeon Romanovsky <leonro@mellanox.com> Reviewed-by: NJason Gunthorpe <jgg@mellanox.com> Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>
-
由 Yishai Hadas 提交于
Fix unreg_umr to move the MR to a kernel owned PD (i.e. the UMR PD) which can't be accessed by userspace. This ensures that nothing can continue to access the MR once it has been placed in the kernels cache for reuse. MRs in the cache continue to have their HW state, including DMA tables, present. Even though the MR has been invalidated, changing the PD provides an additional layer of protection against use of the MR. Link: https://lore.kernel.org/r/20190723065733.4899-5-leon@kernel.org Cc: <stable@vger.kernel.org> # 3.10 Fixes: e126ba97 ("mlx5: Add driver for Mellanox Connect-IB adapters") Signed-off-by: NYishai Hadas <yishaih@mellanox.com> Reviewed-by: NArtemy Kovalyov <artemyko@mellanox.com> Signed-off-by: NLeon Romanovsky <leonro@mellanox.com> Reviewed-by: NJason Gunthorpe <jgg@mellanox.com> Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>
-
由 Yishai Hadas 提交于
Use a direct firmware command to destroy the mkey in case the unreg UMR operation has failed. This prevents a case that a mkey will leak out from the cache post a failure to be destroyed by a UMR WR. In case the MR cache limit didn't reach a call to add another entry to the cache instead of the destroyed one is issued. In addition, replaced a warn message to WARN_ON() as this flow is fatal and can't happen unless some bug around. Link: https://lore.kernel.org/r/20190723065733.4899-4-leon@kernel.org Cc: <stable@vger.kernel.org> # 4.10 Fixes: 49780d42 ("IB/mlx5: Expose MR cache for mlx5_ib") Signed-off-by: NYishai Hadas <yishaih@mellanox.com> Reviewed-by: NArtemy Kovalyov <artemyko@mellanox.com> Signed-off-by: NLeon Romanovsky <leonro@mellanox.com> Reviewed-by: NJason Gunthorpe <jgg@mellanox.com> Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>
-
由 Yishai Hadas 提交于
Fix unreg_umr to ignore the mkey state and do not fail if was freed. This prevents a case that a user space application already changed the mkey state to free and then the UMR operation will fail leaving the mkey in an inappropriate state. Link: https://lore.kernel.org/r/20190723065733.4899-3-leon@kernel.org Cc: <stable@vger.kernel.org> # 3.19 Fixes: 968e78dd ("IB/mlx5: Enhance UMR support to allow partial page table update") Signed-off-by: NYishai Hadas <yishaih@mellanox.com> Reviewed-by: NArtemy Kovalyov <artemyko@mellanox.com> Signed-off-by: NLeon Romanovsky <leonro@mellanox.com> Reviewed-by: NJason Gunthorpe <jgg@mellanox.com> Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>
-
- 25 6月, 2019 1 次提交
-
-
由 Matthew Wilcox 提交于
The lock protecting the data structure does not need to be an rwlock. The only read access to the lock is in an error path, and if that's limiting your scalability, you have bigger performance problems. Eliminate mlx5_mkey_table in favour of using the xarray directly. reg_mr_callback must use GFP_ATOMIC for allocating XArray nodes as it may be called in interrupt context. This also fixes a minor bug where SRCU locking was being used on the radix tree read side, when RCU was needed too. Signed-off-by: NMatthew Wilcox <willy@infradead.org> Signed-off-by: NJason Gunthorpe <jgg@mellanox.com> Signed-off-by: NSaeed Mahameed <saeedm@mellanox.com>
-
- 24 6月, 2019 5 次提交
-
-
由 Max Gurtovoy 提交于
Improve code readability using static helpers for each memory region type. Re-use the common logic to get smaller functions that are easy to maintain and reduce code duplication. Signed-off-by: NMax Gurtovoy <maxg@mellanox.com> Signed-off-by: NIsrael Rukshin <israelr@mellanox.com> Reviewed-by: NSagi Grimberg <sagi@grimberg.me> Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>
-
由 Max Gurtovoy 提交于
If possibe, avoid doing a UMR operation to register data and protection buffers (via MTT/KLM mkeys). Instead, use the local DMA key and map the SG lists using PA access. This is safe, since the internal key for data and protection never exposed to the remote server (only signature key might be exposed). If PA mappings are not possible, perform mapping using MTT/KLM descriptors. The setup of the tested benchmark (using iSER ULP): - 2 servers with 24 cores (1 initiator and 1 target) - ConnectX-4/ConnectX-5 adapters - 24 target sessions with 1 LUN each - ramdisk backstore - PI active Performance results running fio (24 jobs, 128 iodepth) using write_generate=1 and read_verify=1 (w/w.o patch): bs IOPS(read) IOPS(write) ---- ---------- ---------- 512 1266.4K/1262.4K 1720.1K/1732.1K 4k 793139/570902 1129.6K/773982 32k 72660/72086 97229/96164 Using write_generate=0 and read_verify=0 (w/w.o patch): bs IOPS(read) IOPS(write) ---- ---------- ---------- 512 1590.2K/1600.1K 1828.2K/1830.3K 4k 1078.1K/937272 1142.1K/815304 32k 77012/77369 98125/97435 Signed-off-by: NMax Gurtovoy <maxg@mellanox.com> Signed-off-by: NIsrael Rukshin <israelr@mellanox.com> Suggested-by: NSagi Grimberg <sagi@grimberg.me> Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>
-
由 Israel Rukshin 提交于
In some loads, there is performance degradation when using KLM mkey instead of MTT mkey. This is because KLM descriptor access is via indirection that might require more HW resources and cycles. Using KLM descriptor is not necessary when there are no gaps at the data/metadata sg lists. As an optimization, use MTT mkey whenever it is possible. For that matter, allocate internal MTT mkey and choose the effective pi_mr for in transaction according to the required mapping scheme. The setup of the tested benchmark (using iSER ULP): - 2 servers with 24 cores (1 initiator and 1 target) - ConnectX-4/ConnectX-5 adapters - 24 target sessions with 1 LUN each - ramdisk backstore - PI active Performance results running fio (24 jobs, 128 iodepth) using write_generate=1 and read_verify=1 (w/w.o/baseline): bs IOPS(read) IOPS(write) ---- ---------- ---------- 512 1262.4K/1243.3K/1147.1K 1732.1K/1725.1K/1423.8K 4k 570902/571233/457874 773982/743293/642080 32k 72086/72388/71933 96164/71789/93249 Using write_generate=0 and read_verify=0 (w/w.o patch): bs IOPS(read) IOPS(write) ---- ---------- ---------- 512 1600.1K/1572.1K/1393.3K 1830.3K/1823.5K/1557.2K 4k 937272/921992/762934 815304/753772/646071 32k 77369/75052/72058 97435/73180/94612 Signed-off-by: NIsrael Rukshin <israelr@mellanox.com> Reviewed-by: NMax Gurtovoy <maxg@mellanox.com> Suggested-by: NMax Gurtovoy <maxg@mellanox.com> Suggested-by: NIdan Burstein <idanb@mellanox.com> Reviewed-by: NSagi Grimberg <sagi@grimberg.me> Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>
-
由 Israel Rukshin 提交于
IB_WR_REG_SIG_MR is not needed after IB_WR_REG_MR_INTEGRITY was used. Signed-off-by: NIsrael Rukshin <israelr@mellanox.com> Reviewed-by: NMax Gurtovoy <maxg@mellanox.com> Reviewed-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NSagi Grimberg <sagi@grimberg.me> Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>
-
由 Max Gurtovoy 提交于
mlx5_ib_map_mr_sg_pi() will map the PI and data dma mapped SG lists to the mlx5 memory region prior to the registration operation. In the new API, the mlx5 driver will allocate an internal memory region for the UMR operation to register both PI and data SG lists. The internal MR will use KLM mode in order to map 2 (possibly non-contiguous/non-align) SG lists using 1 memory key. In the new API, each ULP will use 1 memory region for the signature operation (instead of 3 in the old API). This memory region will have a key that will be exposed to remote server to perform RDMA operation. The internal memory key that will map the SG lists will stay private. Signed-off-by: NMax Gurtovoy <maxg@mellanox.com> Signed-off-by: NIsrael Rukshin <israelr@mellanox.com> Reviewed-by: NSagi Grimberg <sagi@grimberg.me> Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>
-
- 21 6月, 2019 1 次提交
-
-
由 Leon Romanovsky 提交于
Update ib_umem_release() to behave similarly to kfree() and allow submitting NULL pointer as safe input to this function. Fixes: a52c8e24 ("RDMA: Clean destroy CQ in drivers do not return errors") Signed-off-by: NLeon Romanovsky <leonro@mellanox.com> Signed-off-by: NDoug Ledford <dledford@redhat.com>
-
- 22 5月, 2019 1 次提交
-
-
由 Jason Gunthorpe 提交于
This value has always been set to PAGE_SHIFT in the core code, the only thing that does differently was the ODP path. Move the value into the ODP struct and still use it for ODP, but change all the non-ODP things to just use PAGE_SHIFT/PAGE_SIZE/PAGE_MASK directly. Reviewed-by: NShiraz Saleem <shiraz.saleem@intel.com> Signed-off-by: NJason Gunthorpe <jgg@mellanox.com> Signed-off-by: NLeon Romanovsky <leonro@mellanox.com>
-
- 06 5月, 2019 2 次提交
-
-
由 Ariel Levkovich 提交于
This patch adds support for allocating, deallocating and registering a new device memory type, STEERING_SW_ICM. This memory can be allocated and used by a privileged user for direct rule insertion and management of the device's steering tables. The type is provided by the user via the dedicated attribute in the alloc_dm ioctl command. Signed-off-by: NAriel Levkovich <lariel@mellanox.com> Reviewed-by: NEli Cohen <eli@mellanox.com> Reviewed-by: NMark Bloch <markb@mellanox.com> Signed-off-by: NLeon Romanovsky <leonro@mellanox.com> Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>
-
由 Ariel Levkovich 提交于
This patch intoruduces a new mlx5_ib driver attribute to the DM allocation method - the DM type. In order to allow addition of new types in downstream patches this patch also refactors the allocation, deallocation and registration handlers to consider the requested type and perform the necessary actions according to it. Since not all future device memory types will be such that are mapped to user memory, the mandatory page index output attribute is modified to be optional. Signed-off-by: NAriel Levkovich <lariel@mellanox.com> Reviewed-by: NEli Cohen <eli@mellanox.com> Reviewed-by: NMark Bloch <markb@mellanox.com> Signed-off-by: NLeon Romanovsky <leonro@mellanox.com> Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>
-
- 11 4月, 2019 1 次提交
-
-
由 Mark Bloch 提交于
In preparation of moving into a model of single IB device multiple ports move rep to be part of the port structure. We mark a representor device by setting is_rep, no functional change with this patch. Signed-off-by: NMark Bloch <markb@mellanox.com> Signed-off-by: NLeon Romanovsky <leonro@mellanox.com> Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>
-