1. 29 Oct 2019, 5 commits
  2. 05 Oct 2019, 3 commits
    • RDMA/mlx5: Put live in the correct place for ODP MRs · aa603815
      Committed by Jason Gunthorpe
      live is used to signal to the pagefault thread that the MR is initialized
      and ready for use. It should be set only after the umem is assigned and
      all other setup is completed. This prevents races (at least) of the form:
      
          CPU0                                     CPU1
      mlx5_ib_alloc_implicit_mr()
       implicit_mr_alloc()
        live = 1
       imr->umem = umem
                                                num_pending_prefetch_inc()
                                                  if (live)
                                                    atomic_inc(num_pending_prefetch)
       atomic_set(num_pending_prefetch, 0)  // Overwrites other thread's store
      
      Further, live is being used with SRCU as the 'update' side in an
      acquire/release fashion, so it cannot be read or written raw.
      
      Move all live = 1's to after MR initialization is completed and use
      smp_store_release/smp_load_acquire() for manipulating it.
      
      Add a missing live = 0 when an implicit MR child is deleted, before
      queuing work to do synchronize_srcu().
      
      The barriers in update_odp_mr() were a broken attempt to create an
      acquire/release pair, but they were not applied consistently and missed
      the point; delete them as well.
      
      Fixes: 6aec21f6 ("IB/mlx5: Page faults handling infrastructure")
      Link: https://lore.kernel.org/r/20191001153821.23621-6-jgg@ziepe.ca
      Reviewed-by: Artemy Kovalyov <artemyko@mellanox.com>
      Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
    • RDMA/odp: Lift umem_mutex out of ib_umem_odp_unmap_dma_pages() · 9dc775e7
      Committed by Jason Gunthorpe
      This fixes a race of the form:
          CPU0                                     CPU1
      mlx5_ib_invalidate_range()                mlx5_ib_invalidate_range()
                                                 // This one actually makes npages == 0
                                                 ib_umem_odp_unmap_dma_pages()
                                                 if (npages == 0 && !dying)
        // This one does nothing
        ib_umem_odp_unmap_dma_pages()
        if (npages == 0 && !dying)
           dying = 1;
                                                   dying = 1;
                                                   schedule_work(&umem_odp->work);
           // Double schedule of the same work
           schedule_work(&umem_odp->work);  // BOOM
      
      npages and dying must be read and written under the umem_mutex lock.
      
      Since mlx5 must call mlx5_ib_update_xlt() whenever
      ib_umem_odp_unmap_dma_pages() is called, and both need to be done in the
      same locking region, hoist the lock out of the unmap function.
      
      This avoids an expensive double critical section in
      mlx5_ib_invalidate_range().
      
      Fixes: 81713d37 ("IB/mlx5: Add implicit MR support")
      Link: https://lore.kernel.org/r/20191001153821.23621-4-jgg@ziepe.ca
      Reviewed-by: Artemy Kovalyov <artemyko@mellanox.com>
      Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
    • RDMA/mlx5: Fix a race with mlx5_ib_update_xlt on an implicit MR · f28b1932
      Committed by Jason Gunthorpe
      mlx5_ib_update_xlt() must be protected against parallel free of the MR it
      is accessing; it must also be called single-threaded while updating the
      HW. Otherwise we can have races of the form:
      
          CPU0                                     CPU1
        mlx5_ib_update_xlt()
         mlx5_odp_populate_klm()
           odp_lookup() == NULL
           pklm = ZAP
                                                implicit_mr_get_data()
                                                  implicit_mr_alloc()
                                                    <update interval tree>
                                                mlx5_ib_update_xlt()
                                                  mlx5_odp_populate_klm()
                                                    odp_lookup() != NULL
                                                    pklm = VALID
                                                  mlx5_ib_post_send_wait()

          mlx5_ib_post_send_wait()  // Replaces VALID with ZAP
      
      This can be solved by putting both the SRCU and the umem_mutex lock around
      every call to mlx5_ib_update_xlt(). This ensures that the content of the
      interval tree relevant to mlx5_odp_populate_klm() (i.e., mr->parent == mr)
      will not change while it is running, and thus the posted WRs to update the
      KLM will always reflect the correct information.
      
      The race above will resolve by either having CPU1 wait till CPU0 completes
      the ZAP or CPU0 will run after the add and instead store VALID.
      
      The pagefault path adding children already holds the umem_mutex and SRCU,
      so the only missed lock is during MR destruction.
      
      Fixes: 81713d37 ("IB/mlx5: Add implicit MR support")
      Link: https://lore.kernel.org/r/20191001153821.23621-3-jgg@ziepe.ca
      Reviewed-by: Artemy Kovalyov <artemyko@mellanox.com>
      Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
  3. 17 Sep 2019, 1 commit
  4. 28 Aug 2019, 2 commits
  5. 22 Aug 2019, 7 commits
  6. 21 Aug 2019, 1 commit
  7. 08 Aug 2019, 1 commit
    • IB/mlx5: Fix implicit MR release flow · f591822c
      Committed by Yishai Hadas
      When an implicit MR is released by ib_umem_notifier_release(), its
      leaves are marked as "dying".

      However, when dereg_mr()->mlx5_ib_free_implicit_mr()->mr_leaf_free() is
      called, it skips running the mr_leaf_free_action (i.e., umem_odp->work)
      for leaves marked as "dying".

      As a result, ib_umem_release() is never called for those leaves and
      their MRs are leaked as well.
      
      When an application exits or is killed without calling dereg_mr(), we
      can hit the above flow.
      
      This fatal scenario is reported by a WARN_ON() in
      mlx5_ib_dealloc_ucontext(), since ibcontext->per_mm_list is not empty;
      the call trace can be seen below.
      
      Originally the "dying" mark in ib_umem_notifier_release() was introduced
      to prevent pagefault_mr() from returning a success response once this
      happened. However, we already have the completion mechanism today, so it
      is no longer needed in those flows. Even if a success response is
      returned, the firmware will not find the pages and an error will be
      returned on the following call, as a released mm causes
      ib_umem_odp_map_dma_pages() to permanently fail mmget_not_zero().
      
      Fix the above issue by dropping the "dying" mark from these flows. The
      other flows that use "dying" still need it for their synchronization
      purposes.
      
         WARNING: CPU: 1 PID: 7218 at drivers/infiniband/hw/mlx5/main.c:2004 mlx5_ib_dealloc_ucontext+0x84/0x90 [mlx5_ib]
         CPU: 1 PID: 7218 Comm: ibv_rc_pingpong Tainted: G     E    5.2.0-rc6+ #13
         Call Trace:
         uverbs_destroy_ufile_hw+0xb5/0x120 [ib_uverbs]
         ib_uverbs_close+0x1f/0x80 [ib_uverbs]
         __fput+0xbe/0x250
         task_work_run+0x88/0xa0
         do_exit+0x2cb/0xc30
         ? __fput+0x14b/0x250
         do_group_exit+0x39/0xb0
         get_signal+0x191/0x920
         ? _raw_spin_unlock_bh+0xa/0x20
         ? inet_csk_accept+0x229/0x2f0
         do_signal+0x36/0x5e0
         ? put_unused_fd+0x5b/0x70
         ? __sys_accept4+0x1a6/0x1e0
         ? inet_hash+0x35/0x40
         ? release_sock+0x43/0x90
         ? _raw_spin_unlock_bh+0xa/0x20
         ? inet_listen+0x9f/0x120
         exit_to_usermode_loop+0x5c/0xc6
         do_syscall_64+0x182/0x1b0
         entry_SYSCALL_64_after_hwframe+0x44/0xa9
      
      Fixes: 81713d37 ("IB/mlx5: Add implicit MR support")
      Link: https://lore.kernel.org/r/20190805083010.21777-1-leon@kernel.org
      Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
      Reviewed-by: Artemy Kovalyov <artemyko@mellanox.com>
      Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
      Reviewed-by: Jason Gunthorpe <jgg@mellanox.com>
      Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
  8. 01 Aug 2019, 1 commit
  9. 25 Jul 2019, 1 commit
  10. 23 Jul 2019, 1 commit
  11. 04 Jul 2019, 1 commit
  12. 25 Jun 2019, 1 commit
  13. 14 Jun 2019, 4 commits
  14. 22 May 2019, 1 commit
  15. 09 Apr 2019, 1 commit
  16. 04 Apr 2019, 1 commit
  17. 28 Mar 2019, 2 commits
  18. 04 Mar 2019, 1 commit
  19. 22 Feb 2019, 2 commits
  20. 05 Feb 2019, 3 commits