1. 23 6月, 2016 1 次提交
    • M
      IB/mlx5: Reset flow support for IB kernel ULPs · 89ea94a7
      Maor Gottlieb 提交于
      The driver exposes interfaces that directly relate to HW state.
      Upon fatal error, consumers of these interfaces (ULPs) that rely
      on completion of all their posted work-request could hang, thereby
      introducing dependencies in shutdown order. To prevent this from
      happening, we manage the relevant resources (CQs, QPs) that are used
      by the device. Upon a fatal error, we now generate simulated
      completions for outstanding WQEs that were not completed at the
      time the HW was reset.
      
      It includes invoking the completion event handler for all involved
      CQs so that the ULPs will poll those CQs. When polled we return
      simulated CQEs with IB_WC_WR_FLUSH_ERR return code enabling ULPs
      to clean up their  resources and not wait forever for completions
      upon receiving remove_one.
      
      The above change requires an extra check in the data path to make
      sure that when device is in error state, the simulated CQEs will
      be returned and no further WQEs will be posted.
      Signed-off-by: NMaor Gottlieb <maorg@mellanox.com>
      Signed-off-by: NLeon Romanovsky <leon@kernel.org>
      Signed-off-by: NDoug Ledford <dledford@redhat.com>
      89ea94a7
  2. 18 5月, 2016 1 次提交
    • M
      net/mlx5_core: Use tasklet for user-space CQ completion events · 94c6825e
      Matan Barak 提交于
      Previously, we've fired all our completion callbacks straight from
      our ISR.
      
      Some of those callbacks were lightweight (for example, mlx5 Ethernet
      napi callbacks), but some of them did more work (for example,
      the user-space RDMA stack uverbs' completion handler). Besides that,
      doing more than the minimal work in ISR is generally considered wrong,
      it could even lead to a hard lockup of the system. Since when a lot
      of completion events are generated by the hardware, the loop over
      those events could be so long, that we'll get into a hard lockup by
      the system watchdog.
      
      In order to avoid that, add a new way of invoking completion events
      callbacks. In the interrupt itself, we add the CQs which receive
      completion event to a per-EQ list and schedule a tasklet. In the
      tasklet context we loop over all the CQs in the list and invoke the
      user callback.
      Signed-off-by: NMatan Barak <matanb@mellanox.com>
      Signed-off-by: NDoug Ledford <dledford@redhat.com>
      94c6825e
  3. 18 1月, 2016 1 次提交
    • D
      net/mlx5_core: Fix trimming down IRQ number · 0b6e26ce
      Doron Tsur 提交于
      With several ConnectX-4 cards installed on a server, one may receive
      irqn > 255 from the kernel API, which we mistakenly trim to 8bit.
      
      This causes EQ creation failure with the following stack trace:
      [<ffffffff812a11f4>] dump_stack+0x48/0x64
      [<ffffffff810ace21>] __setup_irq+0x3a1/0x4f0
      [<ffffffff810ad7e0>] request_threaded_irq+0x120/0x180
      [<ffffffffa0923660>] ? mlx5_eq_int+0x450/0x450 [mlx5_core]
      [<ffffffffa0922f64>] mlx5_create_map_eq+0x1e4/0x2b0 [mlx5_core]
      [<ffffffffa091de01>] alloc_comp_eqs+0xb1/0x180 [mlx5_core]
      [<ffffffffa091ea99>] mlx5_dev_init+0x5e9/0x6e0 [mlx5_core]
      [<ffffffffa091ec29>] init_one+0x99/0x1c0 [mlx5_core]
      [<ffffffff812e2afc>] local_pci_probe+0x4c/0xa0
      
      Fixing it by changing of the irqn type from u8 to unsigned int to
      support values > 255
      
      Fixes: 61d0e73e ('net/mlx5_core: Use the the real irqn in eq->irqn')
      Reported-by: NJiri Pirko <jiri@mellanox.com>
      Signed-off-by: NDoron Tsur <doront@mellanox.com>
      Signed-off-by: NMatan Barak <matanb@mellanox.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      0b6e26ce
  4. 31 5月, 2015 1 次提交
  5. 03 4月, 2015 2 次提交
  6. 08 3月, 2014 1 次提交
    • S
      IB/mlx5: Collect signature error completion · d5436ba0
      Sagi Grimberg 提交于
      This commit takes care of the generated signature error CQE generated
      by the HW (if happened).  The underlying mlx5 driver will handle
      signature error completions and will mark the relevant memory region
      as dirty.
      
      Once the consumer gets the completion for the transaction, it must
      check for signature errors on signature memory region using a new
      lightweight verb ib_check_mr_status().
      
      In case the user doesn't check for signature error (i.e. doesn't call
      ib_check_mr_status() with status check IB_MR_CHECK_SIG_STATUS), the
      memory region cannot be used for another signature operation
      (REG_SIG_MR work request will fail).
      Signed-off-by: NSagi Grimberg <sagig@mellanox.com>
      Signed-off-by: NRoland Dreier <roland@purestorage.com>
      d5436ba0
  7. 23 1月, 2014 2 次提交
  8. 09 7月, 2013 1 次提交