1. 27 Oct 2020, 1 commit
      RDMA/mlx5: Fix devlink deadlock on net namespace deletion · fbdd0049
      Committed by Parav Pandit
      When a mlx5 core devlink instance is reloaded in a different net namespace,
      its associated IB device is deleted and recreated.
      
      Example sequence is:
      $ ip netns add foo
      $ devlink dev reload pci/0000:00:08.0 netns foo
      $ ip netns del foo
      
      The mlx5 IB device needs to attach and detach its netdevice through the
      netdev notifier chain during the load and unload sequences.  The call
      graph of the unload flow is shown below.
      
      cleanup_net()
         down_read(&pernet_ops_rwsem); <- first sem acquired
           ops_pre_exit_list()
             pre_exit()
               devlink_pernet_pre_exit()
                 devlink_reload()
                   mlx5_devlink_reload_down()
                     mlx5_unload_one()
                     [...]
                       mlx5_ib_remove()
                         mlx5_ib_unbind_slave_port()
                           mlx5_remove_netdev_notifier()
                             unregister_netdevice_notifier()
                               down_write(&pernet_ops_rwsem); <- recursive lock
      
      Hence, when the net namespace is deleted, the mlx5 reload results in a deadlock.
      
      When the deadlock occurs, the devlink mutex is also held. This deadlocks
      not only the mlx5 device under reload, but also all processes which
      attempt to access unrelated devlink devices.
      
      Hence, fix this by making the mlx5 IB driver register a per-net netdev
      notifier instead of the global one; the per-net notifier operates on the
      net namespace without holding the pernet_ops_rwsem.
      
      Fixes: 4383cfcc ("net/mlx5: Add devlink reload")
      Link: https://lore.kernel.org/r/20201026134359.23150-1-parav@nvidia.com
      Signed-off-by: Parav Pandit <parav@nvidia.com>
      Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
      Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
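The self-deadlock in the call graph above can be illustrated with a small sketch. This is a toy model, not kernel code: Python's non-reentrant `threading.Lock` stands in for `pernet_ops_rwsem`, and the real rwsem distinguishes readers from writers, which this sketch does not.

```python
import threading

# Hypothetical stand-in for pernet_ops_rwsem: a non-reentrant lock,
# like a rwsem re-acquired by the task that already holds it.
pernet_ops_rwsem = threading.Lock()

pernet_ops_rwsem.acquire()        # cleanup_net(): down_read()
# ... devlink_reload() -> mlx5_unload_one() -> ... eventually
# unregister_netdevice_notifier() tries to take the same lock:
acquired = pernet_ops_rwsem.acquire(timeout=0.5)  # down_write()
print(acquired)                   # False: the recursive acquisition
                                  # can never succeed; the real task
                                  # would block here forever
pernet_ops_rwsem.release()
```

The per-net notifier avoids this shape entirely because its unregister path never touches `pernet_ops_rwsem`, which `cleanup_net()` already holds.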
  2. 17 Oct 2020, 1 commit
  3. 02 Oct 2020, 1 commit
  4. 30 Sep 2020, 1 commit
  5. 19 Sep 2020, 1 commit
  6. 18 Sep 2020, 6 commits
  7. 10 Sep 2020, 1 commit
      RDMA: Restore ability to fail on PD deallocate · 91a7c58f
      Committed by Leon Romanovsky
      The IB verbs objects are counted by the kernel, and ib_core ensures that
      PD deallocation will succeed: it is called only once all other objects
      that depend on the PD have been released. This is achieved by managing
      various reference counters on such objects.
      
      The mlx5 driver didn't follow this standard flow when it allowed DEVX
      objects, which are not managed by ib_core, to be interleaved with the
      ones under ib_core responsibility.
      
      In such interleaved scenarios the deallocate command can fail; ib_core
      will leave the uobject in its internal DB and attempt to clean it up
      later to free the resources anyway.
      
      This change partially restores the return value of dealloc_pd() for all
      drivers, while keeping in mind that non-DEVX devices and kernel verbs
      paths shouldn't fail.
      
      Fixes: 21a428a0 ("RDMA: Handle PD allocations by IB/core")
      Link: https://lore.kernel.org/r/20200907120921.476363-2-leon@kernel.org
      Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
      Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
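The reference-counting flow that normally guarantees a successful deallocation can be sketched as follows. This is a toy model, not the ib_core API: the `usecnt` field and the `-EBUSY` return value are illustrative assumptions, used only to show why dealloc must again be allowed to fail.

```python
EBUSY = 16  # illustrative errno value

class PD:
    """Toy protection-domain model with ib_core-style refcounting."""

    def __init__(self):
        self.usecnt = 0   # objects the core tracks against this PD

    def dealloc(self):
        # Kernel-verbs paths release every tracked object before the
        # PD, so usecnt is 0 here and deallocation cannot fail.
        # Untracked (DEVX-like) users interleaved with core objects
        # can leave usecnt nonzero, so the driver must be able to
        # report an error instead of leaking the PD silently.
        if self.usecnt:
            return -EBUSY
        return 0

pd = PD()
print(pd.dealloc())   # 0: the standard, fully tracked flow succeeds
pd.usecnt = 1         # hypothetical untracked DEVX-style reference
print(pd.dealloc())   # -16: the driver can now surface the failure
```

In the real kernel the equivalent change turns the `void` dealloc_pd callback back into one returning `int`, so ib_core can keep the uobject and retry cleanup later.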
  8. 24 Aug 2020, 1 commit
  9. 29 Jul 2020, 1 commit
  10. 09 Jul 2020, 1 commit
  11. 08 Jul 2020, 6 commits
  12. 07 Jul 2020, 6 commits
  13. 24 Jun 2020, 3 commits
  14. 23 Jun 2020, 1 commit
  15. 03 Jun 2020, 1 commit
  16. 30 May 2020, 1 commit
  17. 28 May 2020, 1 commit
  18. 22 May 2020, 1 commit
  19. 18 May 2020, 1 commit
  20. 14 May 2020, 1 commit
  21. 13 May 2020, 1 commit
  22. 07 May 2020, 2 commits