1. 31 Aug, 2015 2 commits
    • IB/mlx4: Implement ib_device callbacks · e26be1bf
      Moni Shoua committed
      get_netdev: return the net_device for the physical port behind the IB transport
      port. In port aggregation mode, it must return the netdev of the active port.

      modify_gid: notification of a change in the RoCE GID cache. Handle this by
      writing to the hardware GID table. The indexes in the cache and hardware tables
      may not match, so a translation is required when modifying a QP or creating an
      address handle.
      Signed-off-by: Moni Shoua <monis@mellanox.com>
      Signed-off-by: Doug Ledford <dledford@redhat.com>
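The index translation mentioned for modify_gid can be illustrated with a small user-space sketch. This is not the driver's code: the names `gid_map`, `gid_map_set` and `gid_cache_to_hw_index`, the table size, and the first-free-slot policy are all assumptions made for illustration. It only shows why a lookup is needed when the cache index and the hardware slot differ.

```c
#include <assert.h>
#include <string.h>

#define GID_TABLE_SIZE 8   /* hypothetical table size, for illustration only */
#define GID_LEN 16         /* a GID is 128 bits */

/* One slot of the (simulated) hardware GID table. */
struct hw_gid_entry {
    unsigned char gid[GID_LEN];
    int in_use;
    int cache_index;       /* index of this GID in the software cache */
};

struct gid_map {
    struct hw_gid_entry hw[GID_TABLE_SIZE];
};

/* Write a cache entry into the first free hardware slot.
 * Returns the hardware index, or -1 if the table is full. */
static int gid_map_set(struct gid_map *m, int cache_index,
                       const unsigned char gid[GID_LEN])
{
    assert(m != NULL && gid != NULL);
    for (int i = 0; i < GID_TABLE_SIZE; i++) {
        if (!m->hw[i].in_use) {
            memcpy(m->hw[i].gid, gid, GID_LEN);
            m->hw[i].in_use = 1;
            m->hw[i].cache_index = cache_index;
            return i;
        }
    }
    return -1;
}

/* Translate a cache index into the hardware index that holds the same GID.
 * Returns -1 if the cache entry was never written to hardware. */
static int gid_cache_to_hw_index(const struct gid_map *m, int cache_index)
{
    for (int i = 0; i < GID_TABLE_SIZE; i++)
        if (m->hw[i].in_use && m->hw[i].cache_index == cache_index)
            return i;
    return -1;
}
```

In this sketch, a caller that modifies a QP or creates an address handle would pass the cache index through `gid_cache_to_hw_index()` before programming the hardware.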
    • net/mlx4: Postpone the registration of net_device · 79857cd3
      Moni Shoua committed
      The mlx4 network driver was registered in the context of the 'add'
      function of the core driver (called when HW should be registered).
      As a result, the NETDEV_REGISTER netdev event was sent in a context
      where the get_protocol_dev() callback returns NULL, which may
      confuse listeners of netdev events.
      This patch prepares for the patch that implements the
      get_netdev() callback in the IB/mlx4 driver.
      Signed-off-by: Moni Shoua <monis@mellanox.com>
      Signed-off-by: Doug Ledford <dledford@redhat.com>
  2. 28 Jul, 2015 3 commits
  3. 16 Jun, 2015 4 commits
  4. 13 Jun, 2015 1 commit
  5. 31 May, 2015 1 commit
    • net/mlx4: Add EQ pool · c66fa19c
      Matan Barak committed
      Previously, mlx4_en allocated EQs and used them exclusively.
      This affected RoCE performance, as event-sensitive applications
      were limited to using only the legacy EQs.
      
      Change that by introducing an EQ pool. This pool is managed
      by mlx4_core. EQs are assigned to ports (when the number of EQs is
      limited, multiple ports may be assigned to the same EQ).
      
      An exception to this rule is the ASYNC EQ which handles various events.
      
      Legacy EQs are completely removed as all EQs could be shared.
      
      When a consumer (mlx4_ib/mlx4_en) requests an EQ, it asks for
      an EQ serving a specific port. The core driver calculates which
      EQ should be assigned to that request.
      
      Because IRQs are shared between IB and Ethernet modules, their
      names only include the PCI device BDF address.
      Signed-off-by: Matan Barak <matanb@mellanox.com>
      Signed-off-by: Ido Shamay <idos@mellanox.com>
      Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
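The EQ pool idea can be sketched in user-space C. The commit message does not say how mlx4_core calculates the assignment, so the round-robin distribution and the least-used selection below are assumptions, as are all names (`eq_pool`, `eq_pool_init`, `eq_pool_get`, `eq_pool_put`) and the `MAX_EQS` limit. The ASYNC EQ is left out of the sketch.

```c
#include <assert.h>

#define MAX_EQS 16   /* hypothetical limit for this sketch */

struct eq {
    unsigned int port_mask;   /* bit (p - 1) set: this EQ serves port p */
    int users;                /* consumers currently holding this EQ */
};

struct eq_pool {
    struct eq eqs[MAX_EQS];
    int num_eqs;
};

/* Distribute EQs round-robin over ports. With fewer EQs than ports,
 * a single EQ ends up serving several ports. */
static void eq_pool_init(struct eq_pool *pool, int num_eqs, int num_ports)
{
    assert(num_eqs > 0 && num_eqs <= MAX_EQS);
    assert(num_ports > 0 && num_ports <= 32);
    pool->num_eqs = num_eqs;
    for (int i = 0; i < num_eqs; i++) {
        pool->eqs[i].port_mask = 0;
        pool->eqs[i].users = 0;
    }
    int slots = num_eqs > num_ports ? num_eqs : num_ports;
    for (int i = 0; i < slots; i++)
        pool->eqs[i % num_eqs].port_mask |= 1u << (i % num_ports);
}

/* Pick the least-used EQ that serves `port` (1-based).
 * Returns the EQ index, or -1 if no EQ serves that port. */
static int eq_pool_get(struct eq_pool *pool, int port)
{
    int best = -1;
    for (int i = 0; i < pool->num_eqs; i++) {
        struct eq *e = &pool->eqs[i];
        if (!(e->port_mask & (1u << (port - 1))))
            continue;
        if (best < 0 || e->users < pool->eqs[best].users)
            best = i;
    }
    if (best >= 0)
        pool->eqs[best].users++;
    return best;
}

/* Release an EQ obtained from eq_pool_get(). */
static void eq_pool_put(struct eq_pool *pool, int idx)
{
    assert(idx >= 0 && idx < pool->num_eqs && pool->eqs[idx].users > 0);
    pool->eqs[idx].users--;
}
```

With four EQs and two ports, consecutive requests for port 1 alternate between the two EQs that serve it. With a single EQ, both ports share it.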
  6. 16 Apr, 2015 2 commits
  7. 03 Apr, 2015 12 commits
  8. 01 Apr, 2015 2 commits
  9. 19 Mar, 2015 1 commit
  10. 07 Mar, 2015 1 commit
  11. 03 Mar, 2015 1 commit
  12. 10 Feb, 2015 1 commit
    • IB/mlx4: Reset flow support for IB kernel ULPs · 35f05dab
      Yishai Hadas committed
      The driver exposes interfaces that directly relate to HW state. Upon a fatal
      error, consumers of these interfaces (ULPs) that rely on completion of
      all their posted work requests could hang, thereby introducing dependencies
      in shutdown order. To prevent this from happening, we manage the
      relevant resources (CQs, QPs) that are used by the device. Upon a fatal error,
      we now generate simulated completions for outstanding WQEs that were not
      completed at the time the HW was reset.
      
      This includes invoking the completion event handler for all involved CQs so that
      the ULPs will poll those CQs. When polled, we return simulated CQEs with the
      IB_WC_WR_FLUSH_ERR status, enabling ULPs to clean up their resources instead of
      waiting forever for completions upon receiving remove_one.
      
      The above change requires an extra check in the data path to make sure that when
      the device is in an error state, the simulated CQEs are returned and no further
      WQEs are posted.
      Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
      Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
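The behaviour described above (flush-error completions for outstanding WQEs, and no new posts once the device is in an error state) can be modelled with a small user-space sketch. It does not use the kernel verbs API: `sim_qp`, `sim_wc`, `sim_post_send`, `sim_poll_cq` and the local `wc_status` enum are hypothetical stand-ins, and the `-EIO` return on a rejected post is an assumption.

```c
#include <assert.h>
#include <errno.h>

/* Local stand-ins for completion statuses; not the kernel's enum. */
enum wc_status { WC_SUCCESS, WC_WR_FLUSH_ERR };

struct sim_wc {
    unsigned long wr_id;
    enum wc_status status;
};

#define QUEUE_DEPTH 4

struct sim_qp {
    int device_in_error;                 /* set once the HW has been reset */
    unsigned long pending[QUEUE_DEPTH];  /* wr_ids posted but not completed */
    int head, count;
};

/* Post a work request. The extra data-path check rejects new WQEs
 * once the device is in an error state. */
static int sim_post_send(struct sim_qp *qp, unsigned long wr_id)
{
    assert(qp);
    if (qp->device_in_error)
        return -EIO;
    if (qp->count == QUEUE_DEPTH)
        return -ENOMEM;
    qp->pending[(qp->head + qp->count) % QUEUE_DEPTH] = wr_id;
    qp->count++;
    return 0;
}

/* Poll up to `max` completions. In an error state, every outstanding WQE
 * yields a simulated flush-error completion. Normal hardware completions
 * are not modelled, so a healthy device returns 0 here. */
static int sim_poll_cq(struct sim_qp *qp, struct sim_wc *wc, int max)
{
    int n = 0;

    assert(qp && wc);
    if (!qp->device_in_error)
        return 0;
    while (n < max && qp->count > 0) {
        wc[n].wr_id = qp->pending[qp->head];
        wc[n].status = WC_WR_FLUSH_ERR;
        qp->head = (qp->head + 1) % QUEUE_DEPTH;
        qp->count--;
        n++;
    }
    return n;
}
```

A consumer in this model drains the queue after the error, sees one flush-error completion per outstanding request, and can then release its resources without waiting.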
  13. 05 Feb, 2015 2 commits
  14. 03 Feb, 2015 1 commit
  15. 28 Jan, 2015 2 commits
  16. 26 Jan, 2015 4 commits
    • net/mlx4_core: Enable device recovery flow with SRIOV · 55ad3592
      Yishai Hadas committed
      In SRIOV, both the PF and the VF may attempt device recovery whenever they
      assume that the device is not functioning.  When the PF driver resets the
      device, the VF should detect this and attempt to reinitialize itself.
      
      The VF must be able to reset itself under all circumstances, even
      if the PF is not responsive.
      
      The VF shall reset itself in the following cases:
      
      1. Commands are not processed within a reasonable time over the communication channel.
      This takes the device state into account and returns the correct code for each
      command, as in native mode; it is implemented in the next patch.
      
      2. The VF driver receives an internal error event reported by the PF on the
      communication channel. This occurs when the PF driver resets the device or
      when the VF is out of sync with the PF.
      
      Add 'VF reset' capability, which allows the VF to reinitialize itself even when the
      PF is not responsive.
      
      As the PF and VF may run their reset flows simultaneously, several cases
      are handled:
      - Prevent freeing VF resources upon FLR when the PF is in its unloading stage.
      - Prevent the PF from getting VF commands before it has finished initializing its resources.
      - Upon VF startup, check that the comm-channel is online before sending
        commands to the PF, to avoid timing out.
      Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
      Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net/mlx4_core: Manage interface state for Reset flow cases · c69453e2
      Yishai Hadas committed
      We need to manage the interface state to sync between the reset flow and
      related cases such as remove_one. This has to be done to prevent certain
      races. For example, if the software stack is already down as a result of an
      unload call, remove_one should skip the unload phase.

      Implement the remove_one case; handling of AER and other cases comes next.

      The interface can be up or down. Upon remove_one, the state includes an extra
      bit indicating that the device is being cleaned up, forcing other tasks to
      finish before the final cleanup.
      Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
      Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
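The interface state described above (an up/down bit plus an extra cleanup bit set by remove_one) can be sketched as a pair of flags. The names `IF_STATE_UP`, `IF_STATE_DELETION`, `sim_dev`, `sim_restart` and `sim_remove_one` are hypothetical. Locking is omitted here, so this shows only the state logic, not the synchronization the real driver needs.

```c
#include <assert.h>

/* Hypothetical state bits for this sketch. */
enum {
    IF_STATE_UP       = 1 << 0,  /* software stack is loaded */
    IF_STATE_DELETION = 1 << 1,  /* remove_one has started the final cleanup */
};

struct sim_dev {
    unsigned int interface_state;
    int unload_calls;            /* how many times the stack was torn down */
};

/* Tear down the software stack and mark the interface as down. */
static void sim_unload(struct sim_dev *dev)
{
    assert(dev);
    dev->unload_calls++;
    dev->interface_state &= ~(unsigned int)IF_STATE_UP;
}

/* Reset flow: restart the stack unless the device is being removed.
 * Returns 0 on restart, -1 if the restart was skipped. */
static int sim_restart(struct sim_dev *dev)
{
    if (dev->interface_state & IF_STATE_DELETION)
        return -1;
    if (dev->interface_state & IF_STATE_UP)
        sim_unload(dev);
    dev->interface_state |= IF_STATE_UP;   /* stack reloaded */
    return 0;
}

/* remove_one: mark the device as being cleaned up, then unload only if
 * the stack is still up. If it is already down, the unload phase is skipped. */
static void sim_remove_one(struct sim_dev *dev)
{
    dev->interface_state |= IF_STATE_DELETION;
    if (dev->interface_state & IF_STATE_UP)
        sim_unload(dev);
}
```

In this model, calling `sim_remove_one()` on a device whose stack is already down does not unload it a second time, and a later `sim_restart()` is refused because the deletion bit is set.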
    • net/mlx4_core: Activate reset flow upon fatal command cases · f5aef5aa
      Yishai Hadas committed
      We activate the reset flow upon fatal command errors, when the device enters an
      erroneous state and must be reset.

      The cases below are assumed to be fatal: a FW command timed out, an error from FW
      on closing commands, or PCI is offline when a command is posted or pending.
      
      In those cases we place the device into an error state: chip is reset, pending
      commands are awakened and completed immediately. Subsequent commands will
      return immediately.
      
      The return code in the above cases will depend on the command. Commands which
      free and close resources will return success (because the chip was reset, so
      callers may safely free their kernel resources). Other commands will return -EIO.
      
      Since the device's state was marked as error, the catas poller will
      detect this and restart the device's software stack (as is done when a FW
      internal error is directly detected). The device state is protected by a
      persistent mutex that lives on its mlx4_dev, so the hcr_mutex is no longer
      needed and is removed.
      Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
      Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
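The return-code rule for a device in the error state can be written down as a short sketch. The opcodes and names below (`sim_cmd`, `sim_device`, `sim_cmd_exec`, `sim_cmd_fatal`) are hypothetical, and the firmware path is not modelled. Only the rule from the commit message is shown: commands that free or close resources succeed, everything else returns -EIO.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical command set; the real driver has many more opcodes. */
enum sim_cmd {
    SIM_CMD_QUERY_PORT,
    SIM_CMD_CREATE_QP,
    SIM_CMD_DESTROY_QP,   /* frees a resource */
    SIM_CMD_CLOSE_PORT,   /* closes a resource */
};

struct sim_device {
    int in_error;         /* set when a fatal command error was detected */
};

/* True for commands whose only job is to free or close a resource. */
static int sim_cmd_releases_resource(enum sim_cmd cmd)
{
    return cmd == SIM_CMD_DESTROY_QP || cmd == SIM_CMD_CLOSE_PORT;
}

/* Mark the device as being in the error state (chip reset not modelled). */
static void sim_cmd_fatal(struct sim_device *dev)
{
    assert(dev != 0);
    dev->in_error = 1;
}

/* Execute a command. In the error state the call returns immediately:
 * success for free/close commands, because the chip was reset and the
 * caller may safely release its kernel resources, and -EIO otherwise. */
static int sim_cmd_exec(struct sim_device *dev, enum sim_cmd cmd)
{
    assert(dev != 0);
    if (dev->in_error)
        return sim_cmd_releases_resource(cmd) ? 0 : -EIO;
    return 0;   /* healthy device: firmware path not modelled here */
}
```

Returning success for teardown commands lets callers run their usual cleanup paths unchanged after the reset, which is the motivation the commit message gives.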
    • net/mlx4_core: Enhance the catas flow to support device reset · f6bc11e4
      Yishai Hadas committed
      This includes:
      
      - resetting the chip when a fatal error is detected (the current code
        does not do this).
      
      - exposing the ability to enter the error state from outside the catas code
        by calling its functionality (e.g. on FW command timeout or AER error).
      
      - managing a persistent device state. This is needed to sync between
        reset flow cases.
      Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
      Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>