1. 28 Mar, 2018 1 commit
    • IB/cm: Block processing alternate path handling RoCE Rx cm messages · 97c45c2c
      Parav Pandit committed
      For the reasons below, it is better not to support alternate path receive
      messages for RoCE in the near term.
      
      1. Alternate path for RoCE is not supported at the rdmacm layer.
      2. It is not supported in the uverbs/core layer for RoCE.
      3. Alternate path for an IPv6 link-local address cannot resolve the route
      deterministically without a valid incoming interface id, a use case that
      makes sense only in dual port mode.
      4. init_av_from_path while processing LAP messages for IB and RoCE can
      add a duplicate AV entry to the port list, which leads to list
      corruption.
      5. rdma-core, a well-known userspace implementation, has removed
      support for libucm, which uses the ucm.ko module, the only module that
      can trigger alternate path related messages.
      6. Removal of the ucm kernel module from the IB core is requested in
      patch [1].
      
      [1] https://patchwork.kernel.org/patch/10268503/
      Signed-off-by: Parav Pandit <parav@mellanox.com>
      Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
      Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
  2. 24 Mar, 2018 1 commit
    • IB/cma: Resolve route only while receiving CM requests · 114cc9c4
      Parav Pandit committed
      Currently, a CM request for RoCE follows this flow:
      rdma_create_id()
      rdma_resolve_addr()
      rdma_resolve_route()
      For RC QPs:
      rdma_connect()
      ->cma_connect_ib()
        ->ib_send_cm_req()
          ->cm_init_av_by_path()
            ->ib_init_ah_attr_from_path()
      For UD QPs:
      rdma_connect()
      ->cma_resolve_ib_udp()
        ->ib_send_cm_sidr_req()
          ->cm_init_av_by_path()
            ->ib_init_ah_attr_from_path()
      
      In both flows, the route is already resolved before the CM request is
      sent. Therefore, the code is refactored to avoid resolving the route a
      second time in the ib_cm layer.
      ib_init_ah_attr_from_path() is extended to resolve the route when it is
      not yet resolved for the RoCE link layer. This is achieved by the caller
      setting the route_resolved field in the path record whenever it has
      already resolved the route.
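
The "resolve only when the caller has not already done so" idea can be sketched in plain userspace C. This is a minimal illustration, not the kernel code: `demo_path_rec`, `demo_resolve_route()` and `demo_init_ah_attr_from_path()` are hypothetical stand-ins, and only the `route_resolved` flag is taken from the commit message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a path record; only the flag matters here. */
struct demo_path_rec {
    bool route_resolved;  /* set by the caller when the route is already known */
    int  resolve_calls;   /* counts how often the expensive lookup ran */
};

/* Hypothetical stand-in for the route lookup needed on a RoCE link layer. */
static int demo_resolve_route(struct demo_path_rec *rec)
{
    rec->resolve_calls++;
    rec->route_resolved = true;
    return 0;
}

/* Skip the lookup when the caller has already resolved the route
 * (sending side); perform it otherwise (receiving side). */
static int demo_init_ah_attr_from_path(struct demo_path_rec *rec)
{
    assert(rec != NULL);
    if (rec->route_resolved)
        return 0;
    return demo_resolve_route(rec);
}
```

In this sketch a sender that already went through its own resolve step sets `route_resolved = true` before the call, so `resolve_calls` stays at 0, while a record built from an incoming request leaves the flag clear and triggers exactly one lookup.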
      Signed-off-by: Parav Pandit <parav@mellanox.com>
      Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
      Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
  3. 16 Mar, 2018 2 commits
  4. 29 Jan, 2018 2 commits
  5. 19 Dec, 2017 8 commits
  6. 12 Dec, 2017 1 commit
  7. 26 Oct, 2017 1 commit
    • IB/cm: Fix memory corruption in handling CM request · 5a3dc323
      Parav Pandit committed
      In recent code, two path record entries are always cleared, although
      either one or two path record entries may have been allocated.
      This zeroes out unallocated memory.
      
      This fix initializes alternative path record only when alternative path
      is set.
      
      While we are at it: the path record allocation code doesn't check for an
      OPA alternative LID, although the rest of the code checks for an OPA
      alternative path.
      This can lead to further memory corruption when only one path record is
      allocated but an alternative OPA path record is actually present in the
      CM request.
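
The allocation rule described above can be sketched in userspace C as follows. This is an illustration under stated assumptions, not the kernel code: `demo_path_rec`, `demo_path_count()` and `demo_alloc_paths()` are hypothetical names, and `malloc()` stands in for the kernel allocator.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a path record; the real layout is not shown. */
struct demo_path_rec {
    unsigned char raw[64];
};

/* One record for the primary path, plus a second one only if any
 * alternate LID (IB or OPA) is present in the request. */
static size_t demo_path_count(bool has_ib_alt_lid, bool has_opa_alt_lid)
{
    return (has_ib_alt_lid || has_opa_alt_lid) ? 2 : 1;
}

/* Allocate `count` records and clear each one individually, so the
 * alternate record is touched only when it was actually allocated. */
static struct demo_path_rec *demo_alloc_paths(size_t count)
{
    struct demo_path_rec *paths;

    assert(count == 1 || count == 2);
    paths = malloc(count * sizeof(*paths));
    if (!paths)
        return NULL;
    memset(&paths[0], 0, sizeof(paths[0]));
    if (count == 2)
        memset(&paths[1], 0, sizeof(paths[1]));
    return paths;
}
```

The point of the sketch is that the count used for allocation and the count used for clearing come from the same decision, which also takes the OPA alternate LID into account.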
      
      Cc: <stable@vger.kernel.org> # v4.12+
      Fixes: 9fdca4da ("IB/SA: Split struct sa_path_rec based on IB and ROCE specific fields")
      Signed-off-by: Parav Pandit <parav@mellanox.com>
      Reviewed-by: Moni Shoua <monis@mellanox.com>
      Signed-off-by: Leon Romanovsky <leon@kernel.org>
      Signed-off-by: Doug Ledford <dledford@redhat.com>
  8. 15 Oct, 2017 1 commit
  9. 10 Oct, 2017 1 commit
  10. 31 Aug, 2017 1 commit
    • IB/cm: Fix sleeping in atomic when RoCE is used · c7616118
      Roland Dreier committed
      A couple of places in the CM do
      
          spin_lock_irq(&cm_id_priv->lock);
          ...
          if (cm_alloc_response_msg(work->port, work->mad_recv_wc, &msg))
      
      However when the underlying transport is RoCE, this leads to a sleeping function
      being called with the lock held - the callchain is
      
          cm_alloc_response_msg() ->
            ib_create_ah_from_wc() ->
              ib_init_ah_from_wc() ->
                rdma_addr_find_l2_eth_by_grh() ->
                  rdma_resolve_ip()
      
      and rdma_resolve_ip() starts out by doing
      
          req = kzalloc(sizeof *req, GFP_KERNEL);
      
      not to mention rdma_addr_find_l2_eth_by_grh() doing
      
          wait_for_completion(&ctx.comp);
      
      to wait for the task that rdma_resolve_ip() queues up.
      
      Fix this by moving the AH creation out of the lock.
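
The reordering can be illustrated with a small userspace model. This is a sketch under loud assumptions: a boolean flag stands in for the spinlock, and `demo_alloc_response_msg()` is a hypothetical stand-in for a call that may sleep; none of these helpers are the kernel's.

```c
#include <assert.h>
#include <stdbool.h>

static bool demo_lock_held;        /* models "spinlock held, sleeping forbidden" */
static int  demo_sleeps_under_lock; /* counts violations of that rule */

static void demo_lock(void)
{
    assert(!demo_lock_held);
    demo_lock_held = true;
}

static void demo_unlock(void)
{
    demo_lock_held = false;
}

/* Hypothetical stand-in for a call that may sleep (allocation, waiting). */
static int demo_alloc_response_msg(void)
{
    if (demo_lock_held)
        demo_sleeps_under_lock++;
    return 0;
}

/* Problematic ordering: the possibly sleeping call runs with the lock held. */
static int demo_handler_before(void)
{
    int ret;

    demo_lock();
    ret = demo_alloc_response_msg();
    demo_unlock();
    return ret;
}

/* Fixed ordering: do the possibly sleeping work first, then take the lock. */
static int demo_handler_after(void)
{
    int ret = demo_alloc_response_msg();

    if (ret)
        return ret;
    demo_lock();
    /* state checks and updates that need the lock would go here */
    demo_unlock();
    return 0;
}
```

One consequence of this ordering is that anything created before the lock is taken has to be released again if the checks done under the lock fail; the sketch omits that cleanup path.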
      Signed-off-by: Roland Dreier <roland@purestorage.com>
      Reviewed-by: Sean Hefty <sean.hefty@intel.com>
      Signed-off-by: Doug Ledford <dledford@redhat.com>
  11. 19 Aug, 2017 1 commit
  12. 18 Aug, 2017 1 commit
  13. 09 Aug, 2017 4 commits
  14. 02 Jun, 2017 1 commit
    • RDMA/SA: Fix kernel panic in CMA request handler flow · d3957b86
      Majd Dibbiny committed
      Commit 9fdca4da (IB/SA: Split struct sa_path_rec based on IB and
      ROCE specific fields) made service_id an attribute specific to the
      IB and OPA SA Path Records, so it was no longer assigned for RoCE.
      
      This caused the following kernel panic in the CMA request handler flow:
      
      [   27.074594] BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
      [   27.074731] IP: __radix_tree_lookup+0x1d/0xe0
      ...
      [   27.075356] Workqueue: ib_cm cm_work_handler [ib_cm]
      [   27.075401] task: ffff88022e3b8000 task.stack: ffffc90001298000
      [   27.075449] RIP: 0010:__radix_tree_lookup+0x1d/0xe0
      ...
      [   27.075979] Call Trace:
      [   27.076015]  radix_tree_lookup+0xd/0x10
      [   27.076055]  cma_ps_find+0x59/0x70 [rdma_cm]
      [   27.076097]  cma_id_from_event+0xd2/0x470 [rdma_cm]
      [   27.076144]  ? ib_init_ah_from_path+0x39a/0x590 [ib_core]
      [   27.076193]  cma_req_handler+0x25/0x480 [rdma_cm]
      [   27.076237]  cm_process_work+0x25/0x120 [ib_cm]
      [   27.076280]  ? cm_get_bth_pkey.isra.62+0x3c/0xa0 [ib_cm]
      [   27.076350]  cm_req_handler+0xb03/0xd40 [ib_cm]
      [   27.076430]  ? sched_clock_cpu+0x11/0xb0
      [   27.076478]  cm_work_handler+0x194/0x1588 [ib_cm]
      [   27.076525]  process_one_work+0x160/0x410
      [   27.076565]  worker_thread+0x137/0x4a0
      [   27.076614]  kthread+0x112/0x150
      [   27.076684]  ? max_active_store+0x60/0x60
      [   27.077642]  ? kthread_park+0x90/0x90
      [   27.078530]  ret_from_fork+0x2c/0x40
      
      This patch moves it back to the common SA Path Record structure
      and removes the redundant setter and getter.
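
A rough sketch of the resulting layout, assuming hypothetical names throughout (`demo_sa_path_rec`, `demo_rec_type`, `demo_init_rec()` and the per-type members are illustrative and do not reproduce the kernel structure): the service ID sits next to the record type in the common part, so it is assigned for every type, while only type-specific fields live in the union.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

enum demo_rec_type { DEMO_REC_IB, DEMO_REC_ROCE, DEMO_REC_OPA };

/* Hypothetical path record: common fields first, type-specific ones in a union. */
struct demo_sa_path_rec {
    uint64_t service_id;          /* common: meaningful for every record type */
    enum demo_rec_type rec_type;
    union {
        struct { uint16_t dlid; }   ib;
        struct { uint8_t dmac[6]; } roce;
        struct { uint32_t dlid; }   opa;
    } u;
};

/* Fill the common part; no per-type accessor is needed for service_id. */
static void demo_init_rec(struct demo_sa_path_rec *rec,
                          enum demo_rec_type type, uint64_t service_id)
{
    assert(rec != NULL);
    memset(rec, 0, sizeof(*rec));
    rec->rec_type = type;
    rec->service_id = service_id;
}
```

With this shape, a lookup keyed on `service_id` sees the same field whether the record describes IB, RoCE or OPA.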
      
      Tested on Connect-IB and ConnectX-4 in InfiniBand and RoCE respectively.
      
      Fixes: 9fdca4da (IB/SA: Split struct sa_path_rec based on IB and
      	ROCE specific fields)
      Signed-off-by: Majd Dibbiny <majd@mellanox.com>
      Reviewed-by: Parav Pandit <parav@mellanox.com>
      Signed-off-by: Leon Romanovsky <leon@kernel.org>
      Signed-off-by: Doug Ledford <dledford@redhat.com>
  15. 02 May, 2017 10 commits
  16. 25 Jan, 2017 1 commit
  17. 15 Dec, 2016 2 commits
  18. 17 Nov, 2016 1 commit
    • IB/cm: Mark stale CM id's whenever the mad agent was unregistered · 9db0ff53
      Mark Bloch committed
      A CM id object that has a port assigned to it has asked to send through
      that specific port. If that port was removed (a hot-unplug event), the
      cm-id was not updated.
      To fix this, the port keeps a list of all the cm-ids that plan to send
      through it, and marks all of them as invalid whenever the port is
      removed.
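
The bookkeeping described above can be sketched in userspace C. This is a minimal model with hypothetical names (`demo_port`, `demo_cm_id`, `demo_bind()`, `demo_port_remove()`, `demo_send()`); it uses a hand-rolled singly linked list and no locking, both of which are simplifications.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct demo_port;

/* Hypothetical cm-id: remembers its port and whether that port still exists. */
struct demo_cm_id {
    struct demo_port  *port;
    bool               port_valid;
    struct demo_cm_id *next;   /* link in the port's list of users */
};

/* Hypothetical port: head of the list of cm-ids that plan to send through it. */
struct demo_port {
    struct demo_cm_id *ids;
};

/* Register a cm-id with the port it intends to use. */
static void demo_bind(struct demo_cm_id *id, struct demo_port *port)
{
    id->port = port;
    id->port_valid = true;
    id->next = port->ids;
    port->ids = id;
}

/* On port removal, mark every registered cm-id as stale and empty the list. */
static void demo_port_remove(struct demo_port *port)
{
    struct demo_cm_id *id = port->ids;

    while (id) {
        struct demo_cm_id *next = id->next;

        id->port_valid = false;
        id->port = NULL;
        id->next = NULL;
        id = next;
    }
    port->ids = NULL;
}

/* Sending through a stale cm-id fails instead of touching a removed port. */
static int demo_send(const struct demo_cm_id *id)
{
    assert(id != NULL);
    if (!id->port_valid)
        return -1;
    return 0;
}
```

In this model a send attempted after `demo_port_remove()` returns an error immediately, which is the behavior that avoids dereferencing state belonging to a port that no longer exists.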
      
      This commit fixes a kernel panic that occurs when a guest is
      force-rebooted while traffic is running between guests:
      
       Call Trace:
        [<ffffffff815271fa>] ? panic+0xa7/0x16f
        [<ffffffff8152b534>] ? oops_end+0xe4/0x100
        [<ffffffff8104a00b>] ? no_context+0xfb/0x260
        [<ffffffff81084db2>] ? del_timer_sync+0x22/0x30
        [<ffffffff8104a295>] ? __bad_area_nosemaphore+0x125/0x1e0
        [<ffffffff81084240>] ? process_timeout+0x0/0x10
        [<ffffffff8104a363>] ? bad_area_nosemaphore+0x13/0x20
        [<ffffffff8104aabf>] ? __do_page_fault+0x31f/0x480
        [<ffffffff81065df0>] ? default_wake_function+0x0/0x20
        [<ffffffffa0752675>] ? free_msg+0x55/0x70 [mlx5_core]
        [<ffffffffa0753434>] ? cmd_exec+0x124/0x840 [mlx5_core]
        [<ffffffff8105a924>] ? find_busiest_group+0x244/0x9f0
        [<ffffffff8152d45e>] ? do_page_fault+0x3e/0xa0
        [<ffffffff8152a815>] ? page_fault+0x25/0x30
        [<ffffffffa024da25>] ? cm_alloc_msg+0x35/0xc0 [ib_cm]
        [<ffffffffa024e821>] ? ib_send_cm_dreq+0xb1/0x1e0 [ib_cm]
        [<ffffffffa024f836>] ? cm_destroy_id+0x176/0x320 [ib_cm]
        [<ffffffffa024fb00>] ? ib_destroy_cm_id+0x10/0x20 [ib_cm]
        [<ffffffffa034f527>] ? ipoib_cm_free_rx_reap_list+0xa7/0x110 [ib_ipoib]
        [<ffffffffa034f590>] ? ipoib_cm_rx_reap+0x0/0x20 [ib_ipoib]
        [<ffffffffa034f5a5>] ? ipoib_cm_rx_reap+0x15/0x20 [ib_ipoib]
        [<ffffffff81094d20>] ? worker_thread+0x170/0x2a0
        [<ffffffff8109b2a0>] ? autoremove_wake_function+0x0/0x40
        [<ffffffff81094bb0>] ? worker_thread+0x0/0x2a0
        [<ffffffff8109aef6>] ? kthread+0x96/0xa0
        [<ffffffff8100c20a>] ? child_rip+0xa/0x20
        [<ffffffff8109ae60>] ? kthread+0x0/0xa0
        [<ffffffff8100c200>] ? child_rip+0x0/0x20
      
      Fixes: a977049d ("[PATCH] IB: Add the kernel CM implementation")
      Signed-off-by: Mark Bloch <markb@mellanox.com>
      Signed-off-by: Erez Shitrit <erezsh@mellanox.com>
      Reviewed-by: Maor Gottlieb <maorg@mellanox.com>
      Signed-off-by: Leon Romanovsky <leon@kernel.org>
      Signed-off-by: Doug Ledford <dledford@redhat.com>