1. 20 Mar, 2018 1 commit
    • B
      net/mlx5: Packet pacing enhancement · 05d3ac97
      Bodong Wang committed
      Add two new parameters: max_burst_sz and typical_pkt_size (both
      in bytes) to rate limit configurations.
      
      max_burst_sz: The device will schedule bursts of packets for an
      SQ connected to this rate, smaller than or equal to this value.
      Value 0x0 indicates packet bursts will be limited to the device
      defaults. This field should be used if bursts of packets must be
      strictly kept under a certain value.
      
      typical_pkt_size: When the rate limit is intended for a stream of
      similar packets, stating the typical packet size can improve the
      accuracy of the rate limiter. The expected packet size will be
      the same for all SQs associated with the same rate limit index.
      
      The Ethernet driver is updated according to this change, but these two
      parameters are kept at 0 because there is no proper way to get the
      configuration from user space; doing so would require changing the
      ndo_set_tx_maxrate interface.
      Signed-off-by: Bodong Wang <bodong@mellanox.com>
      Reviewed-by: Daniel Jurgens <danielj@mellanox.com>
      Reviewed-by: Yishai Hadas <yishaih@mellanox.com>
      Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
      Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
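The commit message says the expected packet size is shared by all SQs on the same rate limit index. One way to read that is that a table entry is identified by the full triple (rate, max_burst_sz, typical_pkt_size), so two requests share an index only when all three match. The sketch below illustrates that idea in plain C; `struct rl_params`, `struct rl_entry`, `rl_add` and `RL_TABLE_SIZE` are hypothetical names for illustration, not the driver's actual API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical rate limit parameters; field names follow the commit text. */
struct rl_params {
    uint32_t rate;             /* rate limit (units are device specific) */
    uint32_t max_burst_sz;     /* bytes; 0 means "use the device default" */
    uint32_t typical_pkt_size; /* bytes; 0 means "not specified" */
};

struct rl_entry {
    struct rl_params params;
    unsigned int refcount;     /* number of SQs attached to this entry */
};

#define RL_TABLE_SIZE 4        /* assumed size, for illustration only */

static bool rl_params_equal(const struct rl_params *a,
                            const struct rl_params *b)
{
    return a->rate == b->rate &&
           a->max_burst_sz == b->max_burst_sz &&
           a->typical_pkt_size == b->typical_pkt_size;
}

/* Returns a 1-based rate limit index, or 0 if the table is full. */
static unsigned int rl_add(struct rl_entry *table, const struct rl_params *p)
{
    int free_slot = -1;

    assert(table && p);
    for (int i = 0; i < RL_TABLE_SIZE; i++) {
        /* Share an existing entry only if all three parameters match. */
        if (table[i].refcount && rl_params_equal(&table[i].params, p)) {
            table[i].refcount++;
            return (unsigned int)i + 1;
        }
        if (!table[i].refcount && free_slot < 0)
            free_slot = i;
    }
    if (free_slot < 0)
        return 0;

    table[free_slot].params = *p;
    table[free_slot].refcount = 1;
    return (unsigned int)free_slot + 1;
}
```

With this layout, two SQs asking for the same rate but different burst sizes land on different indices, while leaving both new fields at 0 (as the Ethernet driver does) reproduces the old rate-only behaviour.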
  2. 15 Mar, 2018 1 commit
  3. 14 Mar, 2018 1 commit
  4. 08 Mar, 2018 8 commits
  5. 07 Mar, 2018 3 commits
  6. 24 Feb, 2018 2 commits
  7. 15 Feb, 2018 5 commits
    • Y
      IB/mlx5: Implement fragmented completion queue (CQ) · 388ca8be
      Yonatan Cohen committed
      The current implementation of create CQ requires contiguous
      memory. This requirement is problematic once memory is
      fragmented or the system is low on memory, because it causes
      failures in dma_zalloc_coherent().
      
      This patch implements a new scheme of fragmented CQs to overcome
      this issue by introducing a new type, 'struct mlx5_frag_buf_ctrl',
      to allocate fragmented buffers rather than contiguous ones.
      
      Base the Completion Queues (CQs) on this new fragmented buffer.
      
      It fixes the following crash:
      kworker/29:0: page allocation failure: order:6, mode:0x80d0
      CPU: 29 PID: 8374 Comm: kworker/29:0 Tainted: G OE 3.10.0
      Workqueue: ib_cm cm_work_handler [ib_cm]
      Call Trace:
      [<>] dump_stack+0x19/0x1b
      [<>] warn_alloc_failed+0x110/0x180
      [<>] __alloc_pages_slowpath+0x6b7/0x725
      [<>] __alloc_pages_nodemask+0x405/0x420
      [<>] dma_generic_alloc_coherent+0x8f/0x140
      [<>] x86_swiotlb_alloc_coherent+0x21/0x50
      [<>] mlx5_dma_zalloc_coherent_node+0xad/0x110 [mlx5_core]
      [<>] ? mlx5_db_alloc_node+0x69/0x1b0 [mlx5_core]
      [<>] mlx5_buf_alloc_node+0x3e/0xa0 [mlx5_core]
      [<>] mlx5_buf_alloc+0x14/0x20 [mlx5_core]
      [<>] create_cq_kernel+0x90/0x1f0 [mlx5_ib]
      [<>] mlx5_ib_create_cq+0x3b0/0x4e0 [mlx5_ib]
      Signed-off-by: Yonatan Cohen <yonatanc@mellanox.com>
      Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
      Signed-off-by: Leon Romanovsky <leon@kernel.org>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
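The `order:6` in the trace above means the allocator was asked for 2^6 = 64 physically contiguous pages, which is hard to satisfy on a fragmented system. A fragmented buffer avoids this by allocating several small pieces and translating an entry index into a (fragment, offset) pair. The sketch below shows that index arithmetic only; `struct frag_buf` and `frag_buf_entry` are hypothetical names, and the real `struct mlx5_frag_buf_ctrl` layout is not reproduced here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical fragmented buffer: entries are power-of-two sized and each
 * fragment holds a power-of-two number of entries. */
struct frag_buf {
    void **frags;                  /* one pointer per separately allocated fragment */
    unsigned int nfrags;
    unsigned int log_entry_sz;     /* log2 of one entry's size in bytes */
    unsigned int log_frag_entries; /* log2 of the number of entries per fragment */
};

/* Translate an entry index into the address of that entry. */
static void *frag_buf_entry(const struct frag_buf *fb, uint32_t ix)
{
    uint32_t frag = ix >> fb->log_frag_entries;
    uint32_t in_frag = ix & ((1u << fb->log_frag_entries) - 1);

    assert(frag < fb->nfrags);
    return (char *)fb->frags[frag] + ((size_t)in_frag << fb->log_entry_sz);
}
```

Because every lookup goes through the fragment table, the fragments never need to be adjacent in memory; each one can be a small allocation that is far more likely to succeed.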
    • S
      net/mlx5: Remove redundant EQ API exports · 3ec5693b
      Saeed Mahameed committed
      The EQ structure and API are private to the mlx5_core driver; external
      drivers should not have access to, or the means to manipulate, EQ objects.
      
      Remove redundant exports and move API functions out of the linux/mlx5
      include directory into the driver's mlx5_core.h private include file.
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
      Reviewed-by: Gal Pressman <galp@mellanox.com>
    • S
      net/mlx5: Move CQ completion and event forwarding logic to eq.c · 3ac7afdb
      Saeed Mahameed committed
      Since the CQ tree is now per EQ, CQ completion and event forwarding have
      become a specific implementation of EQ logic. This patch moves that logic
      to eq.c and makes those functions static.
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
      Reviewed-by: Gal Pressman <galp@mellanox.com>
    • S
      net/mlx5: CQ hold/put API · f105b45b
      Saeed Mahameed committed
      Now that the CQ table is per EQ, add an API to hold/put a CQ, to be used
      from eq.c in a downstream patch.
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
      Reviewed-by: Gal Pressman <galp@mellanox.com>
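A hold/put API is usually a thin wrapper around a reference count: hold takes a reference so the object cannot disappear while an event handler uses it, and put drops it, triggering release when the last reference goes away. The following is a minimal user-space sketch of that pattern with C11 atomics; `struct cq_ref`, `cq_hold` and `cq_put` are hypothetical names, and the `released` flag stands in for whatever the driver actually does on the final put.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical reference-counted CQ handle. */
struct cq_ref {
    atomic_uint refcount;
    bool released;       /* set when the last reference is dropped */
};

static void cq_ref_init(struct cq_ref *cq)
{
    atomic_init(&cq->refcount, 1); /* the creator owns the first reference */
    cq->released = false;
}

/* Take a reference; only valid while the caller knows the CQ is alive. */
static void cq_hold(struct cq_ref *cq)
{
    unsigned int old = atomic_fetch_add(&cq->refcount, 1);

    assert(old > 0);
    (void)old;
}

/* Drop a reference; the last put marks the CQ as released. */
static void cq_put(struct cq_ref *cq)
{
    if (atomic_fetch_sub(&cq->refcount, 1) == 1)
        cq->released = true;
}
```

An interrupt path would look the CQ up under the table lock, call the hold function before dropping the lock, run the completion handler, and then call put.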
    • S
      net/mlx5: CQ Database per EQ · 02d92f79
      Saeed Mahameed committed
      Before this patch the driver had one CQ database protected by a single
      spinlock, which synchronizes CQ add/remove operations with CQ IRQ
      interrupt handling.
      
      On a system with a large number of CPUs and a workload that generates
      lots of interrupts, this global spinlock becomes a hotspot: the active
      cores contend for it, which significantly hurts performance and becomes
      a bottleneck that prevents seamless CPU scaling.
      
      To solve this, move the CQ database and its spinlock to be per EQ
      (IRQ), and thus per core.
      
      Tested with:
      system: 2 sockets, 14 cores per socket, hyperthreading, 2x14x2=56 cores
      netperf command: ./super_netperf 200 -P 0 -t TCP_RR  -H <server> -l 30 -- -r 300,300 -o -s 1M,1M -S 1M,1M
      
      WITHOUT THIS PATCH:
      Average:     CPU    %usr   %nice    %sys %iowait    %irq   %soft %steal  %guest  %gnice   %idle
      Average:     all    4.32    0.00   36.15    0.09    0.00   34.02   0.00    0.00    0.00   25.41
      
      Samples: 2M of event 'cycles:pp', Event count (approx.): 1554616897271
      Overhead  Command          Shared Object                 Symbol
      +   14.28%  swapper          [kernel.vmlinux]              [k] intel_idle
      +   12.25%  swapper          [kernel.vmlinux]              [k] queued_spin_lock_slowpath
      +   10.29%  netserver        [kernel.vmlinux]              [k] queued_spin_lock_slowpath
      +    1.32%  netserver        [kernel.vmlinux]              [k] mlx5e_xmit
      
      WITH THIS PATCH:
      Average:     CPU    %usr   %nice    %sys %iowait    %irq   %soft  %steal  %guest  %gnice   %idle
      Average:     all    4.27    0.00   34.31    0.01    0.00   18.71    0.00    0.00    0.00   42.69
      
      Samples: 2M of event 'cycles:pp', Event count (approx.): 1498132937483
      Overhead  Command          Shared Object             Symbol
      +   23.33%  swapper          [kernel.vmlinux]          [k] intel_idle
      +    1.69%  netserver        [kernel.vmlinux]          [k] mlx5e_xmit
      Tested-by: Song Liu <songliubraving@fb.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
      Reviewed-by: Gal Pressman <galp@mellanox.com>
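The change described above replaces one global lock with one lock per EQ, so interrupt handlers running on different cores no longer serialize on the same cache line. The sketch below shows the shape of that design in user-space C11. It is an illustration under stated assumptions: `struct eq_ctx`, `struct cq_obj` and the helper names are hypothetical, a direct-mapped array stands in for the driver's CQ tree, and a simple `atomic_flag` spinlock stands in for the kernel spinlock.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

#define CQ_SLOTS 8 /* assumed table size, for illustration only */

struct cq_obj {
    uint32_t cqn;
};

/* Each EQ owns its CQ table and the lock that protects it. */
struct eq_ctx {
    atomic_flag lock;
    struct cq_obj *slots[CQ_SLOTS];
};

static void eq_lock(struct eq_ctx *eq)
{
    while (atomic_flag_test_and_set_explicit(&eq->lock, memory_order_acquire))
        ; /* spin until the flag is released */
}

static void eq_unlock(struct eq_ctx *eq)
{
    atomic_flag_clear_explicit(&eq->lock, memory_order_release);
}

static void eq_init(struct eq_ctx *eq)
{
    atomic_flag_clear(&eq->lock);
    for (int i = 0; i < CQ_SLOTS; i++)
        eq->slots[i] = NULL;
}

/* Returns 0 on success, -1 if the slot is already taken. */
static int eq_add_cq(struct eq_ctx *eq, struct cq_obj *cq)
{
    int ret = -1;

    eq_lock(eq);
    if (!eq->slots[cq->cqn % CQ_SLOTS]) {
        eq->slots[cq->cqn % CQ_SLOTS] = cq;
        ret = 0;
    }
    eq_unlock(eq);
    return ret;
}

/* Lookup touches only this EQ's lock, never a global one. */
static struct cq_obj *eq_lookup_cq(struct eq_ctx *eq, uint32_t cqn)
{
    struct cq_obj *cq;

    eq_lock(eq);
    cq = eq->slots[cqn % CQ_SLOTS];
    if (cq && cq->cqn != cqn)
        cq = NULL;
    eq_unlock(eq);
    return cq;
}
```

The trade-off is that a CQ is only discoverable through the EQ it is attached to, which is acceptable here because completions for a CQ always arrive on that EQ.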
  8. 05 Feb, 2018 1 commit
  9. 20 Jan, 2018 2 commits
  10. 19 Jan, 2018 1 commit
  11. 18 Jan, 2018 1 commit
  12. 12 Jan, 2018 2 commits
    • S
      net/mlx5: Fix get vector affinity helper function · 05e0cc84
      Saeed Mahameed committed
      mlx5_get_vector_affinity used to call pci_irq_get_affinity. After the
      patch that sets the device affinity via the PCI_IRQ_AFFINITY API was
      reverted, calling pci_irq_get_affinity became useless and broke RDMA
      mlx5 users. To fix this, provide an alternative way to retrieve the
      IRQ vector affinity using the legacy IRQ API, following the
      smp_affinity read procfs implementation.
      
      Fixes: 231243c8 ("Revert mlx5: move affinity hints assignments to generic code")
      Fixes: a435393a ("mlx5: move affinity hints assignments to generic code")
      Cc: Sagi Grimberg <sagi@grimberg.me>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
    • E
      {net,ib}/mlx5: Don't disable local loopback multicast traffic when needed · 8978cc92
      Eran Ben Elisha committed
      There are system platform information management interfaces (such as
      HOST2BMC) for which we cannot disable local loopback multicast traffic.
      
      Separate the disable_local_lb_mc and disable_local_lb_uc capability bits
      so that the driver will not disable multicast loopback traffic if that
      is not supported.
      (It is expected that firmware will not set disable_local_lb_mc if
      HOST2BMC is running, for example.)
      
      The function mlx5_nic_vport_update_local_lb makes a best effort to
      disable/enable UC/MC loopback traffic and returns success only if it
      managed to change everything the firmware allows.
      
      Adapt mlx5_ib and mlx5e to support the new cap bits.
      
      Fixes: 2c43c5a0 ("net/mlx5e: Enable local loopback in loopback selftest")
      Fixes: c85023e1 ("IB/mlx5: Add raw ethernet local loopback support")
      Fixes: bded747b ("net/mlx5: Add raw ethernet local loopback firmware command")
      Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
      Cc: kernel-team@fb.com
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
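The best-effort behaviour described above can be pictured as follows: each loopback type is changed only if its own capability bit says the firmware allows it, and the absence of a capability is not treated as an error. This is a simplified sketch, not the body of mlx5_nic_vport_update_local_lb; the struct and function names are hypothetical, and a failing firmware command is not modelled.

```c
#include <assert.h>
#include <stdbool.h>

/* Capability bits, named after the ones in the commit message. */
struct lb_caps {
    bool disable_local_lb_uc;
    bool disable_local_lb_mc;
};

/* Hypothetical vport state: true means loopback is disabled. */
struct lb_state {
    bool uc_disabled;
    bool mc_disabled;
};

/* Enable or disable local loopback for every traffic type the caps allow.
 * Returns 0; a real implementation would return the command status. */
static int update_local_lb(const struct lb_caps *caps, struct lb_state *st,
                           bool enable)
{
    assert(caps && st);

    /* Nothing is controllable, so there is nothing to change. */
    if (!caps->disable_local_lb_uc && !caps->disable_local_lb_mc)
        return 0;

    if (caps->disable_local_lb_uc)
        st->uc_disabled = !enable;
    if (caps->disable_local_lb_mc)
        st->mc_disabled = !enable;
    return 0;
}
```

With only the unicast capability set, a request to disable loopback turns off unicast loopback and leaves multicast loopback untouched, which is the behaviour a HOST2BMC setup needs.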
  13. 09 Jan, 2018 8 commits
  14. 29 Dec, 2017 2 commits
  15. 22 Dec, 2017 1 commit
  16. 20 Dec, 2017 1 commit
    • M
      net/mlx5: Cleanup IRQs in case of unload failure · d6b2785c
      Moshe Shemesh committed
      When mlx5_stop_eqs fails to destroy any of the EQs, it returns an error.
      In that failure flow the function returns without releasing all EQ
      IRQs, and pci_free_irq_vectors then fails.
      Fix this by only warning on an EQ destroy failure and continuing to
      release the other EQs and their IRQs.
      
      It fixes the following kernel trace:
      kernel: kernel BUG at drivers/pci/msi.c:352!
      ...
      ...
      kernel: Call Trace:
      kernel: pci_disable_msix+0xd3/0x100
      kernel: pci_free_irq_vectors+0xe/0x20
      kernel: mlx5_load_one.isra.17+0x9f5/0xec0 [mlx5_core]
      
      Fixes: e126ba97 ("mlx5: Add driver for Mellanox Connect-IB adapters")
      Signed-off-by: Moshe Shemesh <moshe@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
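The fix follows a common teardown rule: a failure to destroy one resource should be reported but must not stop the release of the remaining ones. The sketch below illustrates that loop shape in user-space C. All names (`struct eq_res`, `destroy_eq_res`, `stop_all_eqs`) are hypothetical, and the assumption that the IRQ is released before the destroy step can fail is part of the illustration, not a statement about the driver's exact ordering.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical EQ resource used only for this illustration. */
struct eq_res {
    bool destroy_fails; /* simulate a failing destroy command */
    bool irq_held;      /* true while the EQ still owns its IRQ */
};

/* Release the IRQ, then report whether the destroy step succeeded. */
static int destroy_eq_res(struct eq_res *eq)
{
    eq->irq_held = false;
    return eq->destroy_fails ? -1 : 0;
}

/* Warn on failure and keep going, so every EQ's IRQ is released.
 * Returns the number of EQs whose destroy step failed. */
static int stop_all_eqs(struct eq_res *eqs, int n)
{
    int failures = 0;

    assert(eqs && n >= 0);
    for (int i = 0; i < n; i++) {
        if (destroy_eq_res(&eqs[i])) {
            fprintf(stderr, "failed to destroy EQ %d, continuing\n", i);
            failures++;
        }
    }
    return failures;
}

/* Count the IRQs that are still held after teardown. */
static int irqs_still_held(const struct eq_res *eqs, int n)
{
    int held = 0;

    for (int i = 0; i < n; i++)
        held += eqs[i].irq_held ? 1 : 0;
    return held;
}
```

Had the loop returned on the first error, the later EQs would still hold their IRQs, which matches the condition that makes pci_free_irq_vectors hit the BUG in the trace above.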