1. 24 Feb, 2018 4 commits
  2. 15 Feb, 2018 1 commit
    • IB/mlx5: Implement fragmented completion queue (CQ) · 388ca8be
      Committed by Yonatan Cohen
      The current implementation of create CQ requires contiguous
      memory. This requirement is problematic once memory is
      fragmented or the system is low on memory, because it causes
      failures in dma_zalloc_coherent().
      
      This patch implements a new scheme of fragmented CQs to overcome
      this issue by introducing a new type, 'struct mlx5_frag_buf_ctrl',
      to allocate fragmented buffers rather than contiguous ones.
      
      Base the Completion Queues (CQs) on this new fragmented buffer.
      
      It fixes crashes such as the following:
      kworker/29:0: page allocation failure: order:6, mode:0x80d0
      CPU: 29 PID: 8374 Comm: kworker/29:0 Tainted: G OE 3.10.0
      Workqueue: ib_cm cm_work_handler [ib_cm]
      Call Trace:
      [<>] dump_stack+0x19/0x1b
      [<>] warn_alloc_failed+0x110/0x180
      [<>] __alloc_pages_slowpath+0x6b7/0x725
      [<>] __alloc_pages_nodemask+0x405/0x420
      [<>] dma_generic_alloc_coherent+0x8f/0x140
      [<>] x86_swiotlb_alloc_coherent+0x21/0x50
      [<>] mlx5_dma_zalloc_coherent_node+0xad/0x110 [mlx5_core]
      [<>] ? mlx5_db_alloc_node+0x69/0x1b0 [mlx5_core]
      [<>] mlx5_buf_alloc_node+0x3e/0xa0 [mlx5_core]
      [<>] mlx5_buf_alloc+0x14/0x20 [mlx5_core]
      [<>] create_cq_kernel+0x90/0x1f0 [mlx5_ib]
      [<>] mlx5_ib_create_cq+0x3b0/0x4e0 [mlx5_ib]
      Signed-off-by: Yonatan Cohen <yonatanc@mellanox.com>
      Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
      Signed-off-by: Leon Romanovsky <leon@kernel.org>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
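The idea behind a fragmented buffer can be illustrated with a small userspace sketch: instead of one large contiguous allocation, the queue is backed by several page-sized chunks, and an entry index is translated into a fragment number plus an offset. All names here (`frag_buf`, `frag_buf_alloc`, `frag_buf_entry`) and the 4 KiB fragment size are assumptions made for illustration; this is not the `mlx5_frag_buf_ctrl` API from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical names; a userspace sketch, not the mlx5_core API. */
#define FRAG_SHIFT 12                        /* assumed fragment size: 4 KiB */
#define FRAG_SIZE  ((size_t)1 << FRAG_SHIFT)

struct frag_buf {
    void  **frags;       /* separately allocated fragments */
    size_t  nfrags;
    size_t  entry_size;  /* power of two, at most FRAG_SIZE */
};

/* Release every fragment and the fragment table itself. */
static void frag_buf_free(struct frag_buf *b)
{
    for (size_t i = 0; i < b->nfrags; i++)
        free(b->frags[i]);               /* free(NULL) is a no-op */
    free(b->frags);
    b->frags = NULL;
    b->nfrags = 0;
}

/* Allocate room for nent entries as page-sized chunks; 0 on success. */
static int frag_buf_alloc(struct frag_buf *b, size_t nent, size_t entry_size)
{
    assert(nent > 0 && entry_size > 0 && entry_size <= FRAG_SIZE);
    assert((entry_size & (entry_size - 1)) == 0);

    b->entry_size = entry_size;
    b->nfrags = (nent * entry_size + FRAG_SIZE - 1) / FRAG_SIZE;
    b->frags = calloc(b->nfrags, sizeof(*b->frags));
    if (!b->frags) {
        b->nfrags = 0;
        return -1;
    }
    for (size_t i = 0; i < b->nfrags; i++) {
        b->frags[i] = calloc(1, FRAG_SIZE);  /* small, order-0 sized request */
        if (!b->frags[i]) {
            frag_buf_free(b);
            return -1;
        }
    }
    return 0;
}

/* Translate an entry index into its address: fragment number + offset. */
static void *frag_buf_entry(const struct frag_buf *b, size_t idx)
{
    size_t off = idx * b->entry_size;

    return (char *)b->frags[off >> FRAG_SHIFT] + (off & (FRAG_SIZE - 1));
}
```

Because the entry size is a power of two no larger than a fragment, an entry never straddles two fragments, so each lookup is a shift and a mask.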
  3. 30 Jan, 2018 1 commit
  4. 19 Jan, 2018 1 commit
  5. 09 Jan, 2018 6 commits
  6. 04 Jan, 2018 3 commits
  7. 29 Dec, 2017 3 commits
  8. 28 Dec, 2017 1 commit
  9. 11 Nov, 2017 1 commit
  10. 26 Oct, 2017 4 commits
  11. 29 Aug, 2017 1 commit
  12. 25 Aug, 2017 1 commit
  13. 24 Jul, 2017 6 commits
    • IB/mlx5: Add support for QP with a given source QPN · c2e53b2c
      Committed by Yishai Hadas
      Allow user space applications to accelerate send and receive
      traffic which is typically handled by IPoIB ULP by creating
      a UD QP with a given source QPN of the IPoIB UD QP.
      
      A UD QP with a given source QPN should basically be similar to
      a RAW QP from the point of view of its created resources.
      
      However,
      - Its TIS should point to the source QPN.
      - Modify can be done only on its state, as the transport attributes
        are managed by its source QP.
      
      This patch handles the following:
      - Creating/destroying/modifying a UD QP with a given source QPN.
      Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
      Reviewed-by: Maor Gottlieb <maorg@mellanox.com>
      Signed-off-by: Leon Romanovsky <leon@kernel.org>
      Signed-off-by: Doug Ledford <dledford@redhat.com>
    • IB/mlx5: Add delay drop configuration and statistics · fe248c3a
      Committed by Maor Gottlieb
      Add a debugfs interface for monitoring the number of delay drop timeout
      events and the number of existing dropless RQs in the system.
      
      In addition, add a debugfs interface for configuring the global timeout
      value which is used in the SET_DELAY_DROP command.
      Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
      Reviewed-by: Yishai Hadas <yishaih@mellanox.com>
      Signed-off-by: Leon Romanovsky <leon@kernel.org>
      Signed-off-by: Doug Ledford <dledford@redhat.com>
    • IB/mlx5: Add support to dropless RQ · 03404e8a
      Committed by Maor Gottlieb
      RQs that are configured for "delay drop" prevent packet drops
      when their WQEs are depleted.
      Marking an RQ as dropless is done by setting delay_drop_en in the RQ
      context using the CREATE_RQ command.
      
      Since this feature is globally activated/deactivated on all the marked
      RQs by using the SET_DELAY_DROP command, we activate/deactivate
      it according to the number of RQs with 'delay_drop' enabled.
      
      When the timeout expires, the feature is deactivated. Therefore,
      the driver handles the delay drop timeout event and reactivates it.
      Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
      Reviewed-by: Yishai Hadas <yishaih@mellanox.com>
      Signed-off-by: Leon Romanovsky <leon@kernel.org>
      Signed-off-by: Doug Ledford <dledford@redhat.com>
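The activation logic described in this commit message can be sketched as a simple counter: the global feature follows the number of RQs that have delay drop enabled, and a timeout event triggers re-activation. The names below (`delay_drop`, `rq_created`, `timeout_event`) are hypothetical, `set_delay_drop()` is only a stand-in for issuing the SET_DELAY_DROP command, and deactivating when the count returns to zero is an assumption about what "according to the number of RQs" means.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical bookkeeping; not the driver's actual data structure. */
struct delay_drop {
    unsigned int rqs;       /* RQs created with delay drop enabled */
    unsigned int timeouts;  /* delay drop timeout events seen so far */
    bool         active;    /* is the global feature currently on? */
};

/* Stand-in for the SET_DELAY_DROP command; here it only records state. */
static void set_delay_drop(struct delay_drop *dd, bool on)
{
    dd->active = on;
}

/* The first dropless RQ turns the global feature on. */
static void rq_created(struct delay_drop *dd)
{
    if (dd->rqs++ == 0)
        set_delay_drop(dd, true);
}

/* Assumption: the last dropless RQ going away turns the feature off. */
static void rq_destroyed(struct delay_drop *dd)
{
    assert(dd->rqs > 0);
    if (--dd->rqs == 0)
        set_delay_drop(dd, false);
}

/* On timeout the feature has been deactivated; re-arm it if still needed. */
static void timeout_event(struct delay_drop *dd)
{
    dd->timeouts++;
    dd->active = false;
    if (dd->rqs > 0)
        set_delay_drop(dd, true);
}
```

Counters such as `rqs` and `timeouts` are the kind of values a statistics interface could expose, although the sketch does not model debugfs.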
    • IB/mlx5: Add debug control parameters for congestion control · 4a2da0b8
      Committed by Parav Pandit
      This patch adds debug control parameters for congestion control which
      can be read or written through debugfs. They are for reaction point and
      notification point nodes.
      
      These control parameters are as below:
       +------------------------------+-----------------------------------------+
       |      Name                    |           Description                   |
       |------------------------------+-----------------------------------------|
       |rp_clamp_tgt_rate             | When set target rate is updated to      |
       |                              | current rate                            |
       |------------------------------+-----------------------------------------|
       |rp_clamp_tgt_rate_ati         | When set update target rate based on    |
       |                              | timer as well                           |
       |------------------------------+-----------------------------------------|
       |rp_time_reset                 | Time between rate increases if no       |
       |                              | CNP is received, unit is usec           |
       |------------------------------+-----------------------------------------|
       |rp_byte_reset                 | Number of bytes between rate increase   |
       |                              | if no CNP is received                   |
       |------------------------------+-----------------------------------------|
       |rp_threshold                  | Threshold for reaction point rate       |
       |                              | control                                 |
       |------------------------------+-----------------------------------------|
       |rp_ai_rate                    | Rate for target rate, unit in Mbps      |
       |------------------------------+-----------------------------------------|
       |rp_hai_rate                   | Rate for hyper increase state           |
       |                              | unit in Mbps                            |
       |------------------------------+-----------------------------------------|
       |rp_min_dec_fac                | Minimum factor by which the current     |
       |                              | transmit rate can be changed when       |
       |                              | processing a CNP, unit is percentage    |
       |------------------------------+-----------------------------------------|
       |rp_min_rate                   | Minimum value for rate limit,           |
       |                              | unit in Mbps                            |
       |------------------------------+-----------------------------------------|
       |rp_rate_to_set_on_first_cnp   | Rate that is set when first CNP is      |
       |                              | received, unit is Mbps                  |
       |------------------------------+-----------------------------------------|
       |rp_dce_tcp_g                  | Used to calculate alpha                 |
       |------------------------------+-----------------------------------------|
       |rp_dce_tcp_rtt                | Time between updates of alpha value,    |
       |                              | unit is usec                            |
       |------------------------------+-----------------------------------------|
       |rp_rate_reduce_monitor_period | Minimum time between consecutive rate   |
       |                              | reductions                              |
       |------------------------------+-----------------------------------------|
       |rp_initial_alpha_value        | Initial value of alpha                  |
       |------------------------------+-----------------------------------------|
       |rp_gd                         | When CNP is received, flow rate is      |
       |                              | reduced based on gd, rp_gd is given as  |
       |                              | log2(rp_gd)                             |
       |------------------------------+-----------------------------------------|
       |np_cnp_dscp                   | DSCP code point for generated CNP       |
       |------------------------------+-----------------------------------------|
       |np_cnp_prio_mode              | CNP priority mode                       |
       |------------------------------+-----------------------------------------|
       |np_cnp_prio                   | 802.1p priority for generated CNP       |
       +------------------------------+-----------------------------------------+
      Signed-off-by: Parav Pandit <parav@mellanox.com>
      Reviewed-by: Daniel Jurgens <danielj@mellanox.com>
      Reviewed-by: Eli Cohen <eli@mellanox.com>
      Signed-off-by: Leon Romanovsky <leon@kernel.org>
      Signed-off-by: Doug Ledford <dledford@redhat.com>
    • IB/mlx5: Change logic for dispatching IB events for port state · fd65f1b8
      Committed by Moni Shoua
      The old logic ignored link state. This led to missing IB events, such
      as when the link goes down on the switch while the admin state is up,
      or to redundant events, such as when the admin state goes up while the
      link is down.
      To fix that, probe the port state on NETDEV events and compare it to
      the last known state to decide if IB events need to be dispatched.
      
      Fixes: 5ec8c83e ("IB/mlx5: Port events in RoCE now rely on netdev events")
      Signed-off-by: Moni Shoua <monis@mellanox.com>
      Reviewed-by: Noa Osherovich <noaos@mellanox.com>
      Signed-off-by: Leon Romanovsky <leon@kernel.org>
      Signed-off-by: Doug Ledford <dledford@redhat.com>
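The "probe and compare to last known state" approach from this commit message can be sketched as follows. The enum and function names are hypothetical, and treating the port as active only when both the admin state and the link are up is an assumption made for illustration, not the driver's actual probing code.

```c
#include <assert.h>

/* Hypothetical types; not the kernel's ib_event or port state definitions. */
enum port_state { PORT_DOWN, PORT_ACTIVE };
enum port_event { EVENT_NONE, EVENT_PORT_ACTIVE, EVENT_PORT_ERR };

struct port {
    enum port_state last_state;  /* state seen at the previous event */
};

/* Assumption: the port is usable only if admin state AND link are up. */
static enum port_state probe_state(int admin_up, int link_up)
{
    return (admin_up && link_up) ? PORT_ACTIVE : PORT_DOWN;
}

/* Called on every netdev event; reports an event only on a real change. */
static enum port_event on_netdev_event(struct port *p, int admin_up, int link_up)
{
    enum port_state cur = probe_state(admin_up, link_up);

    if (cur == p->last_state)
        return EVENT_NONE;           /* nothing changed: no redundant event */

    p->last_state = cur;
    return cur == PORT_ACTIVE ? EVENT_PORT_ACTIVE : EVENT_PORT_ERR;
}
```

With this shape, bringing the admin state up while the link is down yields no event, while a link drop on the switch side with the admin state still up yields an error event.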
    • IB/mlx5: Add raw ethernet local loopback support · c85023e1
      Committed by Huy Nguyen
      Currently, unicast/multicast loopback raw ethernet
      (non-RDMA) packets are sent back to the vport.
      A unicast loopback packet is a packet whose destination
      MAC address is the same as its source MAC address.
      For multicast, the destination MAC address is in the
      vport's multicast filter list.
      
      Moreover, the local loopback is not needed if
      there is at most one user space context.
      
      After this patch, the raw ethernet unicast and multicast
      local loopback are disabled by default. When there is more
      than one user space context, the local loopback is enabled.
      
      Note that when local loopback is disabled, raw ethernet
      packets are not looped back to the vport and are forwarded
      to the next routing level (eswitch, or multihost switch,
      or out to the wire depending on the configuration).
      Signed-off-by: Huy Nguyen <huyn@mellanox.com>
      Reviewed-by: Daniel Jurgens <danielj@mellanox.com>
      Signed-off-by: Leon Romanovsky <leon@kernel.org>
      Signed-off-by: Doug Ledford <dledford@redhat.com>
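The loopback rule in this commit message (disabled by default, enabled once more than one user space context exists) reduces to a counter check. The sketch below uses hypothetical names (`lb_state`, `context_opened`, `context_closed`); disabling loopback again when the count drops back to one is an assumption, since the message only states when it is enabled.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical bookkeeping; not the driver's actual data structure. */
struct lb_state {
    unsigned int user_contexts;  /* open user space contexts */
    bool         enabled;        /* is local loopback currently on? */
};

/* Loopback is only needed when two or more contexts may talk to each other. */
static void lb_update(struct lb_state *lb)
{
    lb->enabled = lb->user_contexts > 1;
}

static void context_opened(struct lb_state *lb)
{
    lb->user_contexts++;
    lb_update(lb);
}

/* Assumption: loopback is turned off again once fewer than two remain. */
static void context_closed(struct lb_state *lb)
{
    assert(lb->user_contexts > 0);
    lb->user_contexts--;
    lb_update(lb);
}
```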
  14. 02 Jun, 2017 1 commit
  15. 02 May, 2017 1 commit
  16. 26 Apr, 2017 1 commit
  17. 22 Apr, 2017 1 commit
    • IB/mlx5: Support congestion related counters · e1f24a79
      Committed by Parav Pandit
      This patch adds support for querying the congestion related hardware
      counters through a new command and links them with the other hw
      counters available in the hw_counters sysfs location.
      
      In order to reuse existing infrastructure, it renames the related
      q_counter data structures to more generic counters, to cover q_counters,
      congestion counters, and possibly other counters in the future.
      
      New hardware counters:
       * rp_cnp_handled - CNP packets handled by the reaction point
       * rp_cnp_ignored - CNP packets ignored by the reaction point
       * np_cnp_sent    - CNP packets sent by notification point to respond to
                           CE marked RoCE packets
       * np_ecn_marked_roce_packets - CE marked RoCE packets received by
                                      notification point
      
      It also avoids returning ENOSYS, which is specific to invalid
      system calls and produces the following checkpatch.pl warning:
      
      WARNING: ENOSYS means 'invalid syscall nr' and nothing else
      +		return -ENOSYS;
      Signed-off-by: Parav Pandit <parav@mellanox.com>
      Reviewed-by: Eli Cohen <eli@mellanox.com>
      Reviewed-by: Daniel Jurgens <danielj@mellanox.com>
      Signed-off-by: Leon Romanovsky <leon@kernel.org>
      Signed-off-by: Doug Ledford <dledford@redhat.com>
  18. 17 Apr, 2017 1 commit
  19. 15 Feb, 2017 2 commits