Commits · 05d3ac978ed25b753bfe34fe76c50c31ee506a82 · openanolis / cloud-kernel

20 Mar, 2018 1 commit

net/mlx5: Packet pacing enhancement · 05d3ac97

Committed by Bodong Wang on Mar 19, 2018

Add two new parameters: max_burst_sz and typical_pkt_size (both
in bytes) to rate limit configurations.

max_burst_sz: The device will schedule bursts of packets for an
SQ connected to this rate, smaller than or equal to this value.
Value 0x0 indicates packet bursts will be limited to the device
defaults. This field should be used if bursts of packets must be
strictly kept under a certain value.

typical_pkt_size: When the rate limit is intended for a stream of
similar packets, stating the typical packet size can improve the
accuracy of the rate limiter. The expected packet size will be
the same for all SQs associated with the same rate limit index.

The Ethernet driver is updated for this change, but both parameters
are kept at 0 because there is no proper way to get the configuration
from user space; that would require changing the ndo_set_tx_maxrate
interface.
Signed-off-by: Bodong Wang <bodong@mellanox.com>
Reviewed-by: Daniel Jurgens <danielj@mellanox.com>
Reviewed-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>

15 Mar, 2018 1 commit

IB/mlx5: Expose more priorities for bypass namespace · 72f7cc09

Committed by Mark Bloch on Mar 13, 2018

BYPASS namespace is used by the RDMA side to insert flow rules into
the vport RX flow tables. Currently only 8 priorities are exposed;
increase this to 16 to allow more flexibility. This change will also
cause the BYPASS namespace to use 32 levels (as opposed to 16 today) of
flow tables: 16 levels for regular rules and 16 for don't-trap rules.
Reviewed-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>

14 Mar, 2018 1 commit

IB/mlx5: Fix integer overflows in mlx5_ib_create_srq · c2b37f76

Committed by Boris Pismenny on Mar 08, 2018

This patch validates user provided input to prevent integer overflow due
to integer manipulation in the mlx5_ib_create_srq function.
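
The kind of validation described above can be sketched in userspace C. This is a minimal illustration, not the driver's code: `srq_buf_size` and its parameters are hypothetical names, and the sketch only shows how user-provided values can be rejected before a multiplication wraps.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: compute desc_size * max_wr without wrapping.
 * Returns false (and leaves *out untouched) when the request is
 * degenerate or the product would not fit in size_t.
 */
bool srq_buf_size(size_t desc_size, size_t max_wr, size_t *out)
{
	assert(out != NULL);

	if (desc_size == 0 || max_wr == 0)
		return false;            /* reject degenerate requests */
	if (max_wr > SIZE_MAX / desc_size)
		return false;            /* product would overflow */
	*out = desc_size * max_wr;
	return true;
}
```

The division-based check runs before the multiplication, so the wrapped value is never computed at all.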

Cc: syzkaller <syzkaller@googlegroups.com>
Fixes: e126ba97 ("mlx5: Add driver for Mellanox Connect-IB adapters")
Signed-off-by: Boris Pismenny <borisp@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>

08 Mar, 2018 8 commits

net/mlx5: IPSec, Add support for ESN · cb010083

Committed by Aviad Yehezkel on Jan 18, 2018

Currently ESN is not supported with IPSec device offload.

This patch adds ESN support to IPsec device offload. It implements a
new xfrm device operation to synchronize the offloading device's ESN
with the SN received by xfrm, and a new QP command to update the SA
state at the following points:

           ESN 1                    ESN 2                  ESN 3
|-----------*-----------|-----------*-----------|-----------*
^           ^           ^           ^           ^           ^

^ - marks where QP command invoked to update the SA ESN state
    machine.
| - marks the start of the ESN scope (0-2^32-1). At this point move SA
    ESN overlap bit to zero and increment ESN.
* - marks the middle of the ESN scope (2^31). At this point move SA
    ESN overlap bit to one.
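
The transitions in the diagram can be sketched as a small state machine. This is only an illustration under assumptions: `struct esn_state` and `esn_update` are hypothetical names, and the real driver issues a QP command where this sketch merely returns true.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define ESN_HALF (UINT32_C(1) << 31)   /* middle of the 32-bit SN scope */

/* Hypothetical software view of the SA ESN state. */
struct esn_state {
	uint32_t esn;      /* high 32 bits of the extended sequence number */
	bool overlap;      /* set once the SN passes the middle of the scope */
};

/*
 * Feed the low 32 bits of the latest received SN. Returns true when the
 * state changed, i.e. at a point where an update command would be sent.
 */
bool esn_update(struct esn_state *st, uint32_t seq)
{
	assert(st != NULL);

	if (!st->overlap && seq >= ESN_HALF) {
		/* '*': crossed the middle of the scope, set the overlap bit */
		st->overlap = true;
		return true;
	}
	if (st->overlap && seq < ESN_HALF) {
		/* '|': SN wrapped into a new scope, clear overlap, bump ESN */
		st->overlap = false;
		st->esn++;
		return true;
	}
	return false;
}
```

Each scope therefore produces two updates, one at the middle and one at the wrap, matching the `^` marks in the diagram.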
Signed-off-by: Aviad Yehezkel <aviadye@mellanox.com>
Signed-off-by: Yossef Efraim <yossefe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>

net/mlx5: Add flow-steering commands for FPGA IPSec implementation · 05564d0a

Committed by Aviad Yehezkel on Feb 18, 2018

In order to add a context to the FPGA, we need to get both the software
transform context (which includes the keys, etc) and the
source/destination IPs (which are included in the steering
rule). Therefore, we register a new set of firmware-like commands for
the FPGA. Each time a rule is added, the steering core infrastructure
calls the FPGA command layer. If the rule is intended for the FPGA,
it combines the IP information with the software transformation
context and creates the respective hardware transform.
Afterwards, it calls the standard steering command layer.
Signed-off-by: Aviad Yehezkel <aviadye@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>

net/mlx5: Refactor accel IPSec code · d6c4f029

Committed by Aviad Yehezkel on Jan 18, 2018

The current code has one layer that executes FPGA commands, and
the Ethernet part uses this code directly. Since downstream patches
introduce support for IPSec in mlx5_ib, we need to provide some
abstractions. This patch refactors the accel code into one layer
that creates a software IPSec transformation and another one which
creates the actual hardware context.
The internal command implementation is now hidden in the FPGA
core layer. The code also adds the ability to share FPGA hardware
contexts. If two contexts are the same, only a reference count
is taken.
Signed-off-by: Aviad Yehezkel <aviadye@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>

net/mlx5: Added required metadata capability for ipsec · af9fe19d

Committed by Aviad Yehezkel on Jan 17, 2018

Currently our device requires additional metadata in the packet
to perform ipsec crypto offload.
Signed-off-by: Aviad Yehezkel <aviadye@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>

net/mlx5: Export ipsec capabilities · 1d2005e2

Committed by Aviad Yehezkel on Jan 29, 2018

We will need that for ipsec verbs.
Signed-off-by: Aviad Yehezkel <aviadye@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>

net/mlx5: IPSec, Add command V2 support · 65802f48

Committed by Aviad Yehezkel on Jan 16, 2018

This patch adds V2 command support.
New FPGA devices support extended features (UDP encap, ESN, etc.). These
features require a new hardware SADB format, so there is a new version
of the commands to manipulate it.
Signed-off-by: Yossef Efraim <yossefe@mellanox.com>
Signed-off-by: Aviad Yehezkel <aviadye@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>

net/mlx5e: IPSec, Add support for ESP trailer removal by hardware · 788a8210

Committed by Yossi Kuperman on Oct 22, 2017

Current hardware decrypts and authenticates incoming ESP packets.
Subsequently, the software extracts the nexthdr field, truncates the
trailer and adjusts csum accordingly.

With this patch and a capable device, the trailer is removed
by the hardware and the nexthdr field is conveyed via PET. This way
we avoid both the need to access the trailer (cache miss) and to
compute its relative checksum, which significantly improves
performance.

An experiment with netperf shows that trailer removal improves
performance by 2 Gbps in both forwarding and host-to-host configurations.
Signed-off-by: Yossi Kuperman <yossiku@mellanox.com>
Signed-off-by: Aviad Yehezkel <aviadye@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>

net/mlx5: IPSec, Generalize sandbox QP commands · 581fddde

Committed by Yossi Kuperman on Oct 22, 2017

The current code assumes only SA QP commands.
Refactor in order to pave the way for new QP commands:
1. Generic cmd response format.
2. SA cmd checks are in dedicated functions.
3. Aligned debug prints.
Signed-off-by: Yossi Kuperman <yossiku@mellanox.com>
Signed-off-by: Aviad Yehezkel <aviadye@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>

07 Mar, 2018 3 commits

{net,IB}/mlx5: Add flow steering helpers · 3346c487

Committed by Boris Pismenny on Aug 20, 2017

Add helper functions that check if a protocol is
part of a flow steering match criteria.
Signed-off-by: Boris Pismenny <borisp@mellanox.com>
Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Aviad Yehezkel <aviadye@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>

net/mlx5: Add empty egress namespace to flow steering core · 5f418378

Committed by Aviad Yehezkel on Feb 18, 2018

Currently, we don't support egress flow steering namespace in mlx5
flow steering core implementation. However, when we want to encrypt
a packet, we model it as a flow steering rule in the egress path.
To overcome this, we add an empty egress namespace to flow steering.
This namespace is initialized only when ipsec support exists.
In the future, this will grow to a full-blown flow steering
implementation, resembling the ingress path.
Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Aviad Yehezkel <aviadye@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>

{net,IB}/mlx5: Add has_tag to mlx5_flow_act · a9db0ecf

Committed by Matan Barak on Aug 16, 2017

The has_tag member will indicate whether a tag action was specified
in flow specification.

A flow tag of 0 (MLX5_FS_DEFAULT_FLOW_TAG) is treated as a valid flow tag
and is currently used by the mlx5 RDMA driver, whereas in HW flow_tag = 0
means that the user doesn't care about flow_tag. HW always provides
a flow_tag = 0 if all flow tags requested on a specific flow are 0.

So we need a way (in the driver) to differentiate between a user really
requesting flow_tag = 0 and a user who does not care, in order to be
able to report conflicting flow tags on a specific flow.
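
The distinction can be sketched as follows. This is a reduced illustration, not the driver's code: `struct flow_act` here is a hypothetical stand-in with only the two relevant fields, and `flow_act_merge` is an invented helper showing how an explicit tag of 0 differs from "don't care".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, reduced stand-in for the flow action descriptor. */
struct flow_act {
	bool has_tag;       /* true only if the user specified a tag */
	uint32_t flow_tag;  /* meaningful only when has_tag is set */
};

/*
 * Merge a new request into the action already installed on a flow.
 * Returns 0 on success, -1 when both sides asked for different tags.
 */
int flow_act_merge(struct flow_act *cur, const struct flow_act *req)
{
	assert(cur != NULL && req != NULL);

	if (!req->has_tag)
		return 0;                 /* requester does not care */
	if (cur->has_tag && cur->flow_tag != req->flow_tag)
		return -1;                /* two explicit, different tags */
	cur->has_tag = true;
	cur->flow_tag = req->flow_tag;
	return 0;
}
```

Without the flag, a request for tag 5 on a flow that explicitly asked for tag 0 would look identical to a request on a flow that never asked for any tag, and the conflict could not be reported.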
Signed-off-by: Matan Barak <matanb@mellanox.com>
Reviewed-by: Aviad Yehezkel <aviadye@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>

24 Feb, 2018 2 commits

net/mlx5: E-Switch, Add definition of IB representor · 5e65b02c

Committed by Mark Bloch on Jan 16, 2018

Create a new representor type, REP_IB, which will be initialized by an IB
device that is used as a logical representor of an eswitch vport (VF or
uplink), just like we have a net device today in switchdev mode.
Signed-off-by: Mark Bloch <markb@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>

net/mlx5: E-Switch, Move representors definition to a global scope · 57cbd893

Committed by Mark Bloch on Jan 16, 2018

In preparation for IB representors, move the representor structs to a
global scope and expose the functions needed for registration,
unregistration, eswitch mode and creating a flow rule to direct traffic
from SQs to the right VF.
Signed-off-by: Mark Bloch <markb@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>

15 Feb, 2018 5 commits

IB/mlx5: Implement fragmented completion queue (CQ) · 388ca8be

Committed by Yonatan Cohen on Jan 02, 2018

The current implementation of create CQ requires contiguous
memory. This requirement is problematic once memory is
fragmented or the system is low on memory, because it causes
failures in dma_zalloc_coherent().

This patch implements a new scheme of fragmented CQs to overcome
this issue by introducing a new type, 'struct mlx5_frag_buf_ctrl',
to allocate fragmented buffers rather than contiguous ones.

Base the Completion Queues (CQs) on this new fragmented buffer.
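
The idea of a fragmented buffer can be sketched in userspace C. This is an illustration only: `struct frag_buf` and its helpers are hypothetical and do not reproduce the layout of `struct mlx5_frag_buf_ctrl`. It shows how many small allocations can back one logical queue, with an entry index mapped to a fragment plus an offset.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical fragmented buffer: many small chunks instead of one big one. */
struct frag_buf {
	uint8_t **frags;            /* one pointer per fragment */
	size_t nfrags;
	unsigned log_stride;        /* log2 of the entry size in bytes */
	unsigned log_frag_strides;  /* log2 of entries per fragment */
};

/* Allocate nfrags independent fragments; returns 0 on success, -1 on failure. */
int frag_buf_alloc(struct frag_buf *b, size_t nfrags,
		   unsigned log_stride, unsigned log_frag_strides)
{
	size_t frag_bytes = (size_t)1 << (log_stride + log_frag_strides);

	assert(b != NULL);
	b->frags = calloc(nfrags, sizeof(*b->frags));
	if (!b->frags)
		return -1;
	for (size_t i = 0; i < nfrags; i++) {
		b->frags[i] = calloc(1, frag_bytes);
		if (!b->frags[i]) {
			while (i--)          /* undo the fragments already allocated */
				free(b->frags[i]);
			free(b->frags);
			return -1;
		}
	}
	b->nfrags = nfrags;
	b->log_stride = log_stride;
	b->log_frag_strides = log_frag_strides;
	return 0;
}

/* Map an entry index to its address: pick the fragment, then the offset. */
void *frag_buf_entry(const struct frag_buf *b, size_t ix)
{
	size_t frag = ix >> b->log_frag_strides;
	size_t off = (ix & (((size_t)1 << b->log_frag_strides) - 1))
		     << b->log_stride;

	assert(frag < b->nfrags);
	return b->frags[frag] + off;
}

void frag_buf_free(struct frag_buf *b)
{
	for (size_t i = 0; i < b->nfrags; i++)
		free(b->frags[i]);
	free(b->frags);
}
```

With 64-byte entries and 64 entries per fragment, each allocation is a single 4 KiB chunk, so no high-order contiguous allocation is needed however large the queue is.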

It fixes crashes such as the following:
kworker/29:0: page allocation failure: order:6, mode:0x80d0
CPU: 29 PID: 8374 Comm: kworker/29:0 Tainted: G OE 3.10.0
Workqueue: ib_cm cm_work_handler [ib_cm]
Call Trace:
[<>] dump_stack+0x19/0x1b
[<>] warn_alloc_failed+0x110/0x180
[<>] __alloc_pages_slowpath+0x6b7/0x725
[<>] __alloc_pages_nodemask+0x405/0x420
[<>] dma_generic_alloc_coherent+0x8f/0x140
[<>] x86_swiotlb_alloc_coherent+0x21/0x50
[<>] mlx5_dma_zalloc_coherent_node+0xad/0x110 [mlx5_core]
[<>] ? mlx5_db_alloc_node+0x69/0x1b0 [mlx5_core]
[<>] mlx5_buf_alloc_node+0x3e/0xa0 [mlx5_core]
[<>] mlx5_buf_alloc+0x14/0x20 [mlx5_core]
[<>] create_cq_kernel+0x90/0x1f0 [mlx5_ib]
[<>] mlx5_ib_create_cq+0x3b0/0x4e0 [mlx5_ib]
Signed-off-by: Yonatan Cohen <yonatanc@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>

net/mlx5: Remove redundant EQ API exports · 3ec5693b

Committed by Saeed Mahameed on Feb 01, 2018

The EQ structure and API are private to the mlx5_core driver; external
drivers should not have access or the means to manipulate EQ objects.

Remove redundant exports and move API functions out of the linux/mlx5
include directory into the driver's mlx5_core.h private include file.
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Reviewed-by: Gal Pressman <galp@mellanox.com>

net/mlx5: Move CQ completion and event forwarding logic to eq.c · 3ac7afdb

Committed by Saeed Mahameed on Feb 01, 2018

Since the CQ tree is now per EQ, CQ completion and event forwarding have
become a specific implementation of EQ logic. This patch moves that logic
to eq.c and makes those functions static.
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Reviewed-by: Gal Pressman <galp@mellanox.com>

net/mlx5: CQ hold/put API · f105b45b

Committed by Saeed Mahameed on Feb 01, 2018

Now that the CQ table is per EQ, add an API to hold/put a CQ, to be used
from eq.c in a downstream patch.
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Reviewed-by: Gal Pressman <galp@mellanox.com>

net/mlx5: CQ Database per EQ · 02d92f79

Committed by Saeed Mahameed on Jan 19, 2018

Before this patch the driver had one CQ database protected by one
spinlock. This spinlock synchronizes CQ adding/removing with CQ
interrupt handling.

On a system with a large number of CPUs and a workload that requires
lots of interrupts, this global spinlock becomes a very nasty hotspot
and introduces contention between the active cores, which
significantly hurts performance and becomes a bottleneck that prevents
seamless CPU scaling.

To solve this we simply move the CQ database and its spinlock to be per
EQ (IRQ), thus per core.

Tested with:
system: 2 sockets, 14 cores per socket, hyperthreading, 2x14x2=56 cores
netperf command: ./super_netperf 200 -P 0 -t TCP_RR  -H <server> -l 30 -- -r 300,300 -o -s 1M,1M -S 1M,1M

WITHOUT THIS PATCH:
Average:     CPU    %usr   %nice    %sys %iowait    %irq   %soft %steal  %guest  %gnice   %idle
Average:     all    4.32    0.00   36.15    0.09    0.00   34.02   0.00    0.00    0.00   25.41

Samples: 2M of event 'cycles:pp', Event count (approx.): 1554616897271
Overhead  Command          Shared Object                 Symbol
+   14.28%  swapper          [kernel.vmlinux]              [k] intel_idle
+   12.25%  swapper          [kernel.vmlinux]              [k] queued_spin_lock_slowpath
+   10.29%  netserver        [kernel.vmlinux]              [k] queued_spin_lock_slowpath
+    1.32%  netserver        [kernel.vmlinux]              [k] mlx5e_xmit

WITH THIS PATCH:
Average:     CPU    %usr   %nice    %sys %iowait    %irq   %soft  %steal  %guest  %gnice   %idle
Average:     all    4.27    0.00   34.31    0.01    0.00   18.71    0.00    0.00    0.00   42.69

Samples: 2M of event 'cycles:pp', Event count (approx.): 1498132937483
Overhead  Command          Shared Object             Symbol
+   23.33%  swapper          [kernel.vmlinux]          [k] intel_idle
+    1.69%  netserver        [kernel.vmlinux]          [k] mlx5e_xmit
Tested-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Reviewed-by: Gal Pressman <galp@mellanox.com>

05 Feb, 2018 1 commit

mlx5: fix mlx5_get_vector_affinity to start from completion vector 0 · 2572cf57

Committed by Sagi Grimberg on Feb 05, 2018

The consumers of this routine expect the affinity map of the vector
index relative to the first completion vector. The upper layers are
not aware of internal/private completion vectors that mlx5 allocates
for its own usage.

Hence, return the affinity map of vector index relative to the first
completion vector.
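
The index translation can be sketched like this. All names and values are assumptions made for illustration: `COMP_BASE` stands for the number of private vectors that precede the first completion vector, and each affinity mask is reduced to a single `unsigned long`.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed layout: the first COMP_BASE vectors are private to the driver. */
#define COMP_BASE ((size_t)4)   /* illustrative value, not from the driver */

/*
 * Look up the affinity mask for a completion vector as seen by the upper
 * layers (0 = first completion vector). Returns NULL when out of range.
 */
const unsigned long *comp_vector_affinity(const unsigned long *masks,
					  size_t nvec, size_t vector)
{
	assert(masks != NULL);

	if (nvec <= COMP_BASE || vector >= nvec - COMP_BASE)
		return NULL;
	return &masks[COMP_BASE + vector];
}
```

Callers never see the private vectors: vector 0 always resolves to the first completion vector, whatever the driver reserves in front of it.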

Fixes: 05e0cc84 ("net/mlx5: Fix get vector affinity helper function")
Reported-by: Logan Gunthorpe <logang@deltatee.com>
Tested-by: Max Gurtovoy <maxg@mellanox.com>
Reviewed-by: Max Gurtovoy <maxg@mellanox.com>
Cc: <stable@vger.kernel.org> # v4.15
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Doug Ledford <dledford@redhat.com>

20 Jan, 2018 2 commits

net/mlx5: Enable setting hairpin queue size · 4d533e0f

Committed by Or Gerlitz on Jan 04, 2018

Allow specifying the size of the hairpin queues along with the
packet buffer data size from the core setup code.

If the driver doesn't provide this, the FW applies a proper value that
matches the provided data size and a FW-chosen RQ stride size.
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>

net/mlx5: Vectorize the low level core hairpin object · ddae74ac

Committed by Or Gerlitz on Nov 23, 2017

Enhance the hairpin setup code at the core to support a set of N
(RQ,SQ) pairs. This will be later used by the caller to set RSS
spreading among the different RQs.
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>

19 Jan, 2018 1 commit

net/mlx5e: Add clock info page to mlx5 core devices · 24d33d2c

Committed by Feras Daoud on Jan 16, 2018

Adds a new page to mlx5 core containing clock info data that allows
user-level applications to translate a CQE timestamp to
nanoseconds. The information stored in this page is represented
through mlx5_ib_clock_info.

To synchronize between kernel and user space, a sequence
number is incremented at the beginning and end of each update.
An odd number means the data is being updated, while an even number
means the update has completed. To guarantee that the data structure
was accessed atomically, the user will:

repeat:
        seq1 = <read sequence>
        if seq1 is odd goto repeat
        <read data structure>
        seq2 = <read sequence>
        if seq1 != seq2 goto repeat
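
The read loop above is the classic sequence-lock pattern, and a userspace sketch in C11 might look as follows. The structure and function names are hypothetical and the layout is not that of mlx5_ib_clock_info; the data fields are made atomic here so that the sketch stays within the C11 memory model.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical shared page layout; the real mlx5_ib_clock_info differs. */
struct clock_page {
	_Atomic uint32_t seq;     /* odd while the writer is updating */
	_Atomic uint64_t cycles;
	_Atomic uint64_t nsec;
};

struct clock_snapshot {
	uint64_t cycles;
	uint64_t nsec;
};

/* Writer side: bump seq before and after touching the data. */
void clock_page_write(struct clock_page *p, uint64_t cycles, uint64_t nsec)
{
	assert(p != NULL);
	uint32_t s = atomic_load_explicit(&p->seq, memory_order_relaxed);

	atomic_store_explicit(&p->seq, s + 1, memory_order_relaxed);
	atomic_thread_fence(memory_order_release);
	atomic_store_explicit(&p->cycles, cycles, memory_order_relaxed);
	atomic_store_explicit(&p->nsec, nsec, memory_order_relaxed);
	atomic_store_explicit(&p->seq, s + 2, memory_order_release);
}

/* Reader side: retry until an even, unchanged sequence brackets the read. */
struct clock_snapshot clock_page_read(struct clock_page *p)
{
	struct clock_snapshot out = { 0, 0 };
	uint32_t seq1, seq2 = 0;

	assert(p != NULL);
	do {
		seq1 = atomic_load_explicit(&p->seq, memory_order_acquire);
		if (seq1 & 1)
			continue;        /* update in progress, retry */
		out.cycles = atomic_load_explicit(&p->cycles, memory_order_relaxed);
		out.nsec = atomic_load_explicit(&p->nsec, memory_order_relaxed);
		atomic_thread_fence(memory_order_acquire);
		seq2 = atomic_load_explicit(&p->seq, memory_order_relaxed);
	} while ((seq1 & 1) || seq1 != seq2);
	return out;
}
```

The reader never blocks the writer; it only pays for a retry when an update happens to overlap its read.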
Reviewed-by: Yishai Hadas <yishaih@mellanox.com>
Reviewed-by: Jason Gunthorpe <jgg@mellanox.com>
Reviewed-by: Alex Vesker <valex@mellanox.com>
Signed-off-by: Feras Daoud <ferasda@mellanox.com>
Signed-off-by: Eitan Rabin <rabin@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>

18 Jan, 2018 1 commit

net/mlx5: Fix build break · 2d83619d

Committed by Saeed Mahameed on Jan 17, 2018

The latest merge between net and net-next introduced a compiler assert in
the mlx5 driver. In hca_cap_bits, older fields are kept along with newer
fields that should have replaced them.

Fixes: c02b3741 ("Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net")
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

12 Jan, 2018 2 commits

net/mlx5: Fix get vector affinity helper function · 05e0cc84

Committed by Saeed Mahameed on Jan 04, 2018

mlx5_get_vector_affinity used to call pci_irq_get_affinity. After
reverting the patch that sets the device affinity via the PCI_IRQ_AFFINITY
API, calling pci_irq_get_affinity became useless and broke RDMA
mlx5 users. To fix this, this patch provides an alternative way to
retrieve IRQ vector affinity using the legacy IRQ API, following the
smp_affinity read procfs implementation.

Fixes: 231243c8 ("Revert mlx5: move affinity hints assignments to generic code")
Fixes: a435393a ("mlx5: move affinity hints assignments to generic code")
Cc: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>

{net,ib}/mlx5: Don't disable local loopback multicast traffic when needed · 8978cc92

Committed by Eran Ben Elisha on Jan 09, 2018

There are systems platform information management interfaces (such as
HOST2BMC) for which we cannot disable local loopback multicast traffic.

Separate the disable_local_lb_mc and disable_local_lb_uc capability bits
so the driver will not disable multicast loopback traffic if not supported.
(It is expected that Firmware will not set disable_local_lb_mc if
HOST2BMC is running for example.)

Function mlx5_nic_vport_update_local_lb will make a best effort to
disable/enable UC/MC loopback traffic and return success only if it
succeeded in changing everything allowed by Firmware.

Adapt mlx5_ib and mlx5e to support the new cap bits.

Fixes: 2c43c5a0 ("net/mlx5e: Enable local loopback in loopback selftest")
Fixes: c85023e1 ("IB/mlx5: Add raw ethernet local loopback support")
Fixes: bded747b ("net/mlx5: Add raw ethernet local loopback firmware command")
Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Cc: kernel-team@fb.com
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>

09 Jan, 2018 8 commits

net/mlx5: Hairpin pair core object setup · 18e568c3

Committed by Or Gerlitz on Nov 12, 2017

Low level code to set up a hairpin pair core object. It deals with:
 - create hairpin RQs/SQs
 - destroy hairpin RQs/SQs
 - modifying hairpin RQs/SQs - pairing (rst2rdy) and unpairing (rdy2rst)

Unlike conventional RQs/SQs, the memory used for the packet and descriptor
buffers is allocated by the firmware and not the driver. The driver sets
the overall data size (log).
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>

net/mlx5: Add hairpin definitions to the FW API · 40817cdb

Committed by Or Gerlitz on Jun 25, 2017

Add hairpin definitions to the IFC file.

This includes the HCA ID, a few HCA hairpin capabilities, new
fields in RQ/SQ used later for the pairing and the WQ hairpin
data size attribute.
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>

{net, IB}/mlx5: Change set_roce_gid to take a port number · cfe4e37f

Committed by Daniel Jurgens on Jan 04, 2018

In dual port mode, setting a RoCE GID for any port flows through the
master port's mlx5_core_dev. Provide an interface to set the port when
sending this command.
Signed-off-by: Daniel Jurgens <danielj@mellanox.com>
Reviewed-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>

{net, IB}/mlx5: Manage port association for multiport RoCE · 32f69e4b

Committed by Daniel Jurgens on Jan 04, 2018

When mlx5_ib_add is called determine if the mlx5 core device being
added is capable of dual port RoCE operation. If it is, determine
whether it is a master device or a slave device using the
num_vhca_ports and affiliate_nic_vport_criteria capabilities.

If the device is a slave, attempt to find a master device to affiliate it
with. Devices that can be affiliated will share a system image guid. If
none are found place it on a list of unaffiliated ports. If a master is
found bind the port to it by configuring the port affiliation in the NIC
vport context.

Similarly when mlx5_ib_remove is called determine the port type. If it's
a slave port, unaffiliate it from the master device, otherwise just
remove it from the unaffiliated port list.

The IB device is registered as a multiport device, even if a 2nd port is
not available for affiliation. When the 2nd port is affiliated later the
GID cache must be refreshed in order to get the default GIDs for the 2nd
port in the cache. Export roce_rescan_device to provide a mechanism to
refresh the cache after a new port is bound.

In a multiport configuration, all commands related to IB objects (QP,
MR, PD, etc.) should flow through the master mlx5_core_dev. Other
commands must be sent to the slave port mlx5_core_mdev; an interface is
provided to get the correct mdev for non-IB-object commands.
Signed-off-by: Daniel Jurgens <danielj@mellanox.com>
Reviewed-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>

IB/mlx5: Make netdev notifications multiport capable · 7fd8aefb

Committed by Daniel Jurgens on Jan 04, 2018

When multiple RoCE ports are supported registration for events on
multiple netdevs is required. Refactor the event registration and
handling to support multiple ports.
Signed-off-by: Daniel Jurgens <danielj@mellanox.com>
Reviewed-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>

net/mlx5: Set software owner ID during init HCA · 8737f818

Committed by Daniel Jurgens on Jan 04, 2018

Generate a unique 128bit identifier for each host and pass that value to
firmware in the INIT_HCA command if it reports the sw_owner_id
capability. Each device bound to the mlx5_core driver will have the same
software owner ID.

In subsequent patches mlx5_core devices will be bound via a new VPort
command so that they can operate together under a single InfiniBand
device. Only devices that have the same software owner ID can be bound,
to prevent traffic intended for one host arriving at another.

The INIT_HCA command length was expanded by 128 bits. The command
length is provided as an input to FW commands. Older FW does not have a
problem receiving this command in the new longer form.
Signed-off-by: Daniel Jurgens <danielj@mellanox.com>
Reviewed-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>

net/mlx5: Fix race for multiple RoCE enable · 734dc065

Committed by Daniel Jurgens on Jan 04, 2018

There are two potential problems with the existing implementation.

1. Enable and disable can race after the atomic operations.
2. If a command fails the refcount is left in an inconsistent state.

Introduce a lock and perform error checking.
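
The fix pattern, a lock around the count plus error checking before the count changes, can be sketched in userspace C. Everything here is hypothetical: `struct roce_dev`, the `cmd` callback standing in for the firmware command, and the two stub commands exist only so the sketch can be exercised.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical device state; cmd() stands in for the firmware command. */
struct roce_dev {
	pthread_mutex_t lock;
	int enabled_count;
	int (*cmd)(int enable);   /* returns 0 on success, negative on error */
};

int roce_enable(struct roce_dev *dev)
{
	int err = 0;

	assert(dev != NULL && dev->cmd != NULL);
	pthread_mutex_lock(&dev->lock);
	if (dev->enabled_count == 0)
		err = dev->cmd(1);        /* only the first user talks to FW */
	if (!err)
		dev->enabled_count++;     /* count only successful enables */
	pthread_mutex_unlock(&dev->lock);
	return err;
}

int roce_disable(struct roce_dev *dev)
{
	int err = 0;

	assert(dev != NULL && dev->cmd != NULL);
	pthread_mutex_lock(&dev->lock);
	if (dev->enabled_count > 0) {
		if (dev->enabled_count == 1)
			err = dev->cmd(0);    /* last user disables in FW */
		if (!err)
			dev->enabled_count--;
	}
	pthread_mutex_unlock(&dev->lock);
	return err;
}

/* Stub commands for trying the sketch out. */
int fake_cmd_ok(int enable) { (void)enable; return 0; }
int fake_cmd_fail(int enable) { (void)enable; return -1; }
```

Because the command and the count update happen under the same lock, an enable and a disable can no longer interleave between them, and a failed command leaves the count exactly where it was.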

Fixes: a6f7d2af ("net/mlx5: Add support for multiple RoCE enable")
Signed-off-by: Daniel Jurgens <danielj@mellanox.com>
Reviewed-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>

net/mlx5: Add DCT command interface · 57cda166

Committed by Moni Shoua on Jan 02, 2018

Add a missing command interface to work with a DCT. It includes creating
and destroying a DCT and getting events for it.
Signed-off-by: Moni Shoua <monis@mellanox.com>
Reviewed-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>

29 Dec, 2017 2 commits

net/mlx5: Separate ingress/egress namespaces for each vport · 9b93ab98

Committed by Gal Pressman on Nov 28, 2017

Each vport has its own root flow table for the ACL flow tables, and the
root flow table is per namespace. Therefore we should create a namespace
for each vport.

Fixes: efdc810b ("net/mlx5: Flow steering, Add vport ACL support")
Signed-off-by: Gal Pressman <galp@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>

IB/mlx5: Extend UAR stuff to support dynamic allocation · 31a78a5a

Committed by Yishai Hadas on Dec 24, 2017

This patch extends the alloc context flow to be prepared for working
with dynamic UAR allocations.

Currently, upon alloc context, a fixed number of UARs is allocated
(named 'static allocation') and there is no option for a user
application to ask for more or to control which UAR will be used by
which QP.

In this patch the driver prepares its data structures to manage both the
static and the dynamic allocations and lets the user driver know about
the max value of dynamic blue-flame registers that are allowed.

Downstream patches from this series will enable the dynamic allocation
and the association as part of QP creation.
Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>

22 Dec, 2017 1 commit

IB/mlx5: Fix congestion counters in LAG mode · 71a0ff65

Committed by Majd Dibbiny on Dec 21, 2017

Congestion counters are counted and queried per physical function.
When working in LAG mode, CNP packets can be sent or received on both
of the functions, thus congestion counters should be aggregated from
the two physical functions.

Fixes: e1f24a79 ("IB/mlx5: Support congestion related counters")
Signed-off-by: Majd Dibbiny <majd@mellanox.com>
Reviewed-by: Aviv Heller <avivh@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>

20 Dec, 2017 1 commit

net/mlx5: Cleanup IRQs in case of unload failure · d6b2785c

Committed by Moshe Shemesh on Nov 21, 2017

When mlx5_stop_eqs fails to destroy any of the EQs, it returns with an
error. In this failure flow the function returns without releasing all
EQ IRQs, and then pci_free_irq_vectors fails.
Fix this by only warning on a destroy EQ failure and continuing to
release the other EQs and their IRQs.
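
The "warn and continue" teardown can be sketched as follows. This is a userspace illustration under assumptions: `struct eq`, its `destroy` callback and the two stubs are hypothetical, and releasing an IRQ is reduced to clearing a flag.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical EQ record; destroy() stands in for the firmware teardown. */
struct eq {
	int id;
	int irq_held;              /* 1 while the IRQ is still requested */
	int (*destroy)(struct eq *eq);
};

/*
 * Tear down every EQ. A failed destroy is only reported; the IRQ is
 * released regardless, and the loop moves on to the remaining EQs.
 * Returns the number of EQs whose destroy step failed.
 */
int stop_eqs(struct eq *eqs, size_t n)
{
	int failures = 0;

	assert(eqs != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		if (eqs[i].destroy(&eqs[i])) {
			fprintf(stderr, "failed to destroy EQ %d\n", eqs[i].id);
			failures++;
		}
		eqs[i].irq_held = 0;   /* always release the IRQ */
	}
	return failures;
}

/* Stubs for trying the sketch out. */
int eq_destroy_ok(struct eq *eq) { (void)eq; return 0; }
int eq_destroy_fail(struct eq *eq) { (void)eq; return -1; }
```

Returning early on the first failure would leave later IRQs requested, which is the condition that the subsequent vector release trips over in the trace below.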

It fixes the following kernel trace:
kernel: kernel BUG at drivers/pci/msi.c:352!
...
...
kernel: Call Trace:
kernel: pci_disable_msix+0xd3/0x100
kernel: pci_free_irq_vectors+0xe/0x20
kernel: mlx5_load_one.isra.17+0x9f5/0xec0 [mlx5_core]

Fixes: e126ba97 ("mlx5: Add driver for Mellanox Connect-IB adapters")
Signed-off-by: Moshe Shemesh <moshe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>

openanolis / cloud-kernel synced successfully 12 months ago