Commits · bd99fdea420b00925e9b83a50f2ccc5e1f07ef7d · openanolis / cloud-kernel

08 Oct, 2016 6 commits

IB/{core,hw}: Add constant for node_desc · bd99fdea

Authored by Yuval Shaia on Aug 25, 2016

Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>

IB/mlx4/alias_GUID: Remove deprecated create_singlethread_workqueue · fb6375d7

Authored by Bhaktipriya Shridhar on Aug 15, 2016

alloc_ordered_workqueue() with WQ_MEM_RECLAIM set replaces the
deprecated create_singlethread_workqueue(). This is the identity
conversion.

The workqueue "wq" queues a work item that maps to alias_guid_work.
It has been identity converted.

WQ_MEM_RECLAIM has been set to ensure forward progress under
memory pressure.

Signed-off-by: Bhaktipriya Shridhar <bhaktipriya96@gmail.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>

IB/mlx4/mcg: Remove deprecated create_singlethread_workqueue · fcf621dd

Authored by Bhaktipriya Shridhar on Aug 15, 2016

alloc_ordered_workqueue() with WQ_MEM_RECLAIM set replaces the
deprecated create_singlethread_workqueue(). This is the identity
conversion.

The workqueue "mcg_wq" queues work items &group->work
and &group->timeout_work.

The workqueue "clean_wq" queues work item mcg_clean_task.

Both have been identity converted.

WQ_MEM_RECLAIM has been set to ensure forward progress under
memory pressure.

Signed-off-by: Bhaktipriya Shridhar <bhaktipriya96@gmail.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>

IB/mlx4/mad: Remove deprecated create_singlethread_workqueue · 90b14b32

Authored by Bhaktipriya Shridhar on Aug 15, 2016

alloc_ordered_workqueue() with WQ_MEM_RECLAIM set replaces the
deprecated create_singlethread_workqueue(). This is the identity
conversion.

The workqueue "wq" queues work item &ctx->work and the workqueue "ud_wq"
queues work item &dm[i]->work.

Both workqueues have been identity converted.

WQ_MEM_RECLAIM has been set to ensure forward progress under
memory pressure.

Signed-off-by: Bhaktipriya Shridhar <bhaktipriya96@gmail.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>

IB/mlx4: Remove deprecated create_singlethread_workqueue · 41cd3944

Authored by Bhaktipriya Shridhar on Aug 15, 2016

alloc_ordered_workqueue() with WQ_MEM_RECLAIM set replaces the
deprecated create_singlethread_workqueue(). This is the identity
conversion.

The workqueue "wq" queues work items &dm[i]->work and &ew->work.
It has been identity converted.

WQ_MEM_RECLAIM has been set to ensure forward progress under
memory pressure.

Signed-off-by: Bhaktipriya Shridhar <bhaktipriya96@gmail.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>

IB/mlx4: Add validation to flow specifications parsing · 1f02a09c

Authored by Maor Gottlieb on Aug 30, 2016

Add a validation check that all fields set in a flow specification
are supported by the vendor.

Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>

24 Sep, 2016 1 commit

IB/core: add support to create a unsafe global rkey to ib_create_pd · ed082d36

Authored by Christoph Hellwig on Sep 05, 2016

Instead of exposing ib_get_dma_mr to ULPs and letting them use it more or
less unchecked, this moves the capability of creating a global rkey into
the RDMA core, where it can be easily audited.  It also prints a warning
every time this feature is used.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
Reviewed-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>

04 Aug, 2016 2 commits

IB/mlx4: Add diagnostic hardware counters · 3f85f2aa

Authored by Mark Bloch on Jul 19, 2016

Expose IB diagnostic hardware counters.
The counters count IB events and are applicable for IB and RoCE.

The counters can be divided into two groups, per device and per port.
Device counters are always exposed.
Port counters are exposed only if the firmware supports per port counters.

rq_num_dup and sq_num_to are only exposed if we have firmware support
for them. If we do, we expose them per device and per port.
rq_num_udsdprd and num_cqovf are device only counters.

rq - denotes responder.
sq - denotes requester.

|-----------------------|---------------------------------------|
|	Name		|	Description			|
|-----------------------|---------------------------------------|
|rq_num_lle		| Number of local length errors		|
|-----------------------|---------------------------------------|
|sq_num_lle		| Number of local length errors		|
|-----------------------|---------------------------------------|
|rq_num_lqpoe		| Number of local QP operation errors	|
|-----------------------|---------------------------------------|
|sq_num_lqpoe		| Number of local QP operation errors	|
|-----------------------|---------------------------------------|
|rq_num_lpe		| Number of local protection errors	|
|-----------------------|---------------------------------------|
|sq_num_lpe		| Number of local protection errors	|
|-----------------------|---------------------------------------|
|rq_num_wrfe		| Number of CQEs with error		|
|-----------------------|---------------------------------------|
|sq_num_wrfe		| Number of CQEs with error		|
|-----------------------|---------------------------------------|
|sq_num_mwbe		| Number of Memory Window bind errors	|
|-----------------------|---------------------------------------|
|sq_num_bre		| Number of bad response errors		|
|-----------------------|---------------------------------------|
|sq_num_rire		| Number of Remote Invalid request	|
|			| errors				|
|-----------------------|---------------------------------------|
|rq_num_rire		| Number of Remote Invalid request	|
|			| errors				|
|-----------------------|---------------------------------------|
|sq_num_rae		| Number of remote access errors	|
|-----------------------|---------------------------------------|
|rq_num_rae		| Number of remote access errors	|
|-----------------------|---------------------------------------|
|sq_num_roe		| Number of remote operation errors	|
|-----------------------|---------------------------------------|
|sq_num_tree		| Number of transport retries exceeded	|
|			| errors				|
|-----------------------|---------------------------------------|
|sq_num_rree		| Number of RNR NAK retries exceeded	|
|			| errors				|
|-----------------------|---------------------------------------|
|rq_num_rnr		| Number of RNR NAKs sent		|
|-----------------------|---------------------------------------|
|sq_num_rnr		| Number of RNR NAKs received		|
|-----------------------|---------------------------------------|
|rq_num_oos		| Number of Out of Sequence requests	|
|			| received				|
|-----------------------|---------------------------------------|
|sq_num_oos		| Number of Out of Sequence NAKs	|
|			| received				|
|-----------------------|---------------------------------------|
|rq_num_udsdprd		| Number of UD packets silently		|
|			| discarded on the Receive Queue due to	|
|			| lack of receive descriptor		|
|-----------------------|---------------------------------------|
|rq_num_dup		| Number of duplicate requests received	|
|-----------------------|---------------------------------------|
|sq_num_to		| Number of timeouts received		|
|-----------------------|---------------------------------------|
|num_cqovf		| Number of CQ overflows		|
|-----------------------|---------------------------------------|

Signed-off-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>

IB/mlx4: Don't use GFP_ATOMIC for CQ resize struct · 0c87b672

Authored by Roland Dreier on Jul 28, 2016

We allocate a small tracking structure as part of mlx4_ib_resize_cq().
However, we don't need to use GFP_ATOMIC -- immediately after the
allocation, we call mlx4_cq_resize(), which allocates a command
mailbox with GFP_KERNEL and then sleeps on a firmware command, so we
better not be in an atomic context.

This actually has a real impact, because when this GFP_ATOMIC
allocation fails (and GFP_ATOMIC does fail in practice) then a
userspace consumer resizing a CQ will get a spurious failure that we
can easily avoid.

Signed-off-by: Roland Dreier <roland@purestorage.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>

20 Jul, 2016 1 commit

net/mlx4_en: break out tx_desc write into separate function · 224e92e0

Authored by Brenden Blanco on Jul 19, 2016

In preparation for writing the tx descriptor from multiple functions,
create a helper for both normal and blueflame access.

Signed-off-by: Brenden Blanco <bblanco@plumgrid.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

24 Jun, 2016 1 commit

IB/mlx4: Support device FW version string · e9db59fc

Authored by Ira Weiny on Jun 15, 2016

And remove the sysfs in favor of the common core version.

Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>

23 Jun, 2016 5 commits

IB/mlx4: Prevent cross page boundary allocation · cbc9355a

Authored by Chuck Lever on Jun 22, 2016

Prevent cross page boundary allocation by allocating a
new page. This is required to comply with ConnectX-3 HW
requirements.

Not doing so might cause an "RDMA read local protection" error.

Fixes: 1b2cd0fc ('IB/mlx4: Support the new memory registration API')
Suggested-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>

IB/mlx4: Fix memory leak if QP creation failed · 5b420d9c

Authored by Dotan Barak on Jun 22, 2016

When RC, UC, or RAW QPs are created, a qp object is allocated (kzalloc).
If at a later point (in procedure create_qp_common) the qp creation fails,
this qp object must be freed.

Fixes: 1ffeb2eb ("IB/mlx4: SR-IOV IB context objects and proxy/tunnel SQP support")
Signed-off-by: Dotan Barak <dotanb@dev.mellanox.co.il>
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>

IB/mlx4: Verify port number in flow steering create flow · 5533c18a

Authored by Yishai Hadas on Jun 22, 2016

In procedure mlx4_ib_create_flow, passing an invalid port number
will cause an out-of-bounds array access. Data passed to this procedure
can come from user-space.  Therefore, the port number must be validated
before proceeding onwards.

Note that we check against the number of physical ports declared at
the verbs (ib core) level. When bonding is active, the verbs level
sees one physical port, even though the low-level driver sees two ports.
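
A minimal user-space sketch of this kind of bounds check, assuming 1-based port numbers as used at the verbs level. The function name `check_flow_port` and its parameters are hypothetical and chosen for illustration. This is not the driver's actual code:

```c
#include <errno.h>
#include <stdint.h>

/*
 * Hypothetical helper: reject a user-supplied port number before it is
 * used as an array index. Ports are assumed to be numbered from 1, and
 * num_ports is the port count seen at the verbs level (which may be
 * smaller than the low-level driver's count when bonding is active).
 */
int check_flow_port(uint8_t port, uint8_t num_ports)
{
	if (port < 1 || port > num_ports)
		return -EINVAL;
	return 0;
}
```

With bonding active and a verbs-level count of 1, port 2 is rejected by this check even though the low-level driver has a second port.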

Fixes: f77c0162 ("IB/mlx4: Add receive flow steering support")
Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
Reviewed-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Reviewed-by: Moni Shoua <monis@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>

IB/mlx4: Fix error flow when sending mads under SRIOV · a6100603

Authored by Yishai Hadas on Jun 22, 2016

Fix mad send error flow to prevent double freeing address handles,
and leaking tx_ring entries when SRIOV is active.

If ib_mad_post_send fails, the address handle pointer in the tx_ring entry
must be set to NULL (or there will be a double-free) and tx_tail must be
incremented (or there will be a leak of tx_ring entries).
The tx_ring is handled the same way in the send-completion handler.
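
The error flow described above can be illustrated with a toy user-space ring. Everything here is an assumption made for illustration: the names (`tx_ring_sketch`, `send_sketch`, `complete_sketch`), the ring size, and the use of `malloc`/`free` as a stand-in for creating and destroying an address handle. It is not the driver's code:

```c
#include <stddef.h>
#include <stdlib.h>

#define TX_RING_SIZE 4u	/* power of two so the index can be masked */

struct tx_ring_sketch {
	void *ah[TX_RING_SIZE];	/* handle owned by each slot, or NULL */
	unsigned int head;	/* number of slots ever claimed */
	unsigned int tail;	/* number of slots ever released */
};

/*
 * Claim a slot, attach a freshly allocated handle and "post" the send.
 * post_ok stands in for the result of the real post call.
 * Returns the slot index on success, -1 on failure.
 */
int send_sketch(struct tx_ring_sketch *ring, int post_ok)
{
	unsigned int slot;
	void *ah;

	if (ring->head - ring->tail >= TX_RING_SIZE)
		return -1;		/* ring full */

	ah = malloc(1);			/* stand-in for an address handle */
	if (!ah)
		return -1;

	slot = ring->head++ & (TX_RING_SIZE - 1);
	ring->ah[slot] = ah;

	if (post_ok)
		return (int)slot;	/* completion handler cleans up later */

	/* Error path: undo both pieces of state, each exactly once. */
	ring->ah[slot] = NULL;		/* later cleanup cannot free it again */
	ring->tail++;			/* the claimed slot is not leaked */
	free(ah);
	return -1;
}

/* Completion side: release the handle and the slot in the same way. */
void complete_sketch(struct tx_ring_sketch *ring, unsigned int slot)
{
	slot &= TX_RING_SIZE - 1;
	free(ring->ah[slot]);		/* free(NULL) is a no-op */
	ring->ah[slot] = NULL;
	ring->tail++;
}
```

After a failed post, `head - tail` is unchanged and the slot holds NULL, so neither a double free nor a leaked ring entry can follow.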

Fixes: 37bfc7c1 ("IB/mlx4: SR-IOV multiplex and demultiplex MADs")
Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
Reviewed-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>

IB/mlx4: Fix the SQ size of an RC QP · f2940e2c

Authored by Yishai Hadas on Jun 22, 2016

When calculating the required size of an RC QP send queue, leave
enough space for masked atomic operations, which require more space than
"regular" atomic operations.

Fixes: 6fa8f719 ("IB/mlx4: Add support for masked atomic operations")
Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
Reviewed-by: Jack Morgenstein <jackm@mellanox.co.il>
Reviewed-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>

18 Jun, 2016 1 commit

IB/mlx4: Properly initialize GRH TClass and FlowLabel in AHs · 8c5122e4

Authored by Jason Gunthorpe on Jun 08, 2016

When this code was reworked for IBoE support, the order of assignments
for the sl_tclass_flowlabel got flipped around, resulting in
TClass & FlowLabel being permanently set to 0 in the packet headers.

This breaks IB routers that rely on these headers, but only affects
kernel users - libmlx4 does this properly for user space.

Cc: stable@vger.kernel.org
Fixes: fa417f7b ("IB/mlx4: Add support for IBoE")
Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>

07 Jun, 2016 1 commit

IB/mlx4: Fix device managed flow steering support test · ca920f5b

Authored by Bart Van Assche on Jun 03, 2016

Perform the test for device managed flow steering support even if
memory windows are not supported. I noticed this because smatch
reported inconsistent indentation for the device managed flow
steering support test.

Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Cc: Yishai Hadas <yishaih@mellanox.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>

18 May, 2016 1 commit

IB/mlx4: Fix unaligned access in send_reply_to_slave · 04ef0f1a

Authored by shamir rabinovitch on May 18, 2016

The problem is that the function 'send_reply_to_slave' gets the
'req_sa_mad' as a pointer whose address is only aligned to 4 bytes
but is 8 bytes in size.  This can result in unaligned access faults
on certain architectures.

Sowmini Varadhan pointed to this reply from Dave Miller that says
that memcpy should not be used to solve alignment issues:
https://lkml.org/lkml/2015/10/21/352

Optimization of memcpy to an 'ldx' instruction can only happen if the
compiler knows that the size of the data we are copying is 8 bytes
and it assumes it is aligned to 8 bytes. If the compiler knows the
type is not aligned to 8 it must not optimize the 8 byte copy.
Defining the data type as aligned to 4 forces the compiler to treat
all accesses as though they aren't aligned and avoids the 'ldx'
optimization.
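
The idea can be sketched in user space with the GCC/Clang attributes that the kernel's `__packed` and `__aligned(4)` macros expand to. The struct and function names below are hypothetical; the commit message does not say which type was annotated, so this only illustrates the technique:

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical 8-byte field whose type promises only 4-byte alignment.
 * Because the type itself says "aligned to 4", the compiler may not
 * assume an 8-byte-aligned address when it loads the value.
 */
struct comp_mask_a4 {
	uint64_t value;
} __attribute__((packed, aligned(4)));

/* Hypothetical message layout: the 8-byte field lands at offset 4. */
struct sa_msg_sketch {
	uint32_t header;
	struct comp_mask_a4 mask;
};

/* Reads through the annotated type, so the access is treated as 4-aligned. */
uint64_t read_mask(const struct comp_mask_a4 *m)
{
	return m->value;
}
```

On architectures that fault on misaligned 8-byte loads, the compiler is expected to split such a read into accesses that are safe for a 4-byte-aligned address.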

Full credit for the idea goes to Jason Gunthorpe
<jgunthorpe@obsidianresearch.com>.

Signed-off-by: Shamir Rabinovitch <shamir.rabinovitch@oracle.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>

14 May, 2016 4 commits

IB/mlx4: Use list_for_each_entry_safe · ee71b968

Authored by Geliang Tang on Dec 07, 2015

Simplify the code in search_relocate_mgid0_group by using
list_for_each_entry_safe().

Signed-off-by: Geliang Tang <geliangtang@163.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>

IB/mlx4: trivial fix of spelling mistake on "argument" · aa703453

Authored by Colin Ian King on Apr 25, 2016

Fix spelling mistake, argumant -> argument.

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>

IB/core: Enhance ib_map_mr_sg() · 9aa8b321

Authored by Bart Van Assche on May 12, 2016

The SRP initiator allows max_sectors to be set to a value that exceeds
the largest amount of data that can be mapped at once with an mlx4
HCA using fast registration and a page size of 4 KB. Hence modify
ib_map_mr_sg() such that it can map partial sg-elements. If an
sg-element has been mapped partially, let the caller know
which fraction has been mapped by adjusting *sg_offset.

Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Tested-by: Laurence Oberman <loberman@redhat.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Doug Ledford <dledford@redhat.com>

IB/core: Add passing an offset into the SG to ib_map_mr_sg · ff2ba993

Authored by Christoph Hellwig on May 03, 2016

Signed-off-by: Christoph Hellwig <hch@lst.de>
Tested-by: Steve Wise <swise@opengridcomputing.com>
Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>

06 May, 2016 1 commit

net/mlx4: Avoid wrong virtual mappings · 73898db0

Authored by Haggai Abramovsky on May 04, 2016

The dma_alloc_coherent() function returns a virtual address which can
be used for coherent access to the underlying memory.  On some
architectures, like arm64, undefined behavior results if this memory is
also accessed via virtual mappings that are not coherent.  Because of
their undefined nature, operations like virt_to_page() return garbage
when passed virtual addresses obtained from dma_alloc_coherent().  Any
subsequent mappings via vmap() of the garbage page values are unusable
and result in bad things like bus errors (synchronous aborts in ARM64
speak).

The mlx4 driver contains code that does the equivalent of
vmap(virt_to_page(dma_alloc_coherent)), which results in an oops when the
device is opened.

Prevent the Ethernet driver from running this problematic code by forcing it to
allocate contiguous memory. The InfiniBand driver first tries to
allocate contiguous memory and, in case of failure, falls
back to working with fragmented memory.

Signed-off-by: Haggai Abramovsky <hagaya@mellanox.com>
Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
Reported-by: David Daney <david.daney@cavium.com>
Tested-by: Sinan Kaya <okaya@codeaurora.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

28 Apr, 2016 1 commit

IB/mlx4: printk fix · 35fc7b7d

Authored by Colin Ian King on Apr 25, 2016

Fix spelling mistake, argumant -> argument.

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>

04 Mar, 2016 1 commit

net: mellanox: add DEVLINK dependencies · 3d1cbe83

Authored by Arnd Bergmann on Mar 02, 2016

The new NET_DEVLINK infrastructure can be a loadable module, but the drivers
using it might be built-in, which causes link errors like:

drivers/net/built-in.o: In function `mlx4_load_one':
:(.text+0x2fbfda): undefined reference to `devlink_port_register'
:(.text+0x2fc084): undefined reference to `devlink_port_unregister'
drivers/net/built-in.o: In function `mlxsw_sx_port_remove':
:(.text+0x33a03a): undefined reference to `devlink_port_type_clear'
:(.text+0x33a04e): undefined reference to `devlink_port_unregister'

There are multiple ways to avoid this:

a) add 'depends on NET_DEVLINK || !NET_DEVLINK' dependencies
   for each user
b) use 'select NET_DEVLINK' from each driver that uses it
   and hide the symbol in Kconfig.
c) make NET_DEVLINK a 'bool' option so we don't have to
   list it as a dependency, and rely on the APIs to be
   stubbed out when it is disabled
d) use IS_REACHABLE() rather than IS_ENABLED() to check for
   NET_DEVLINK in include/net/devlink.h

This implements a variation of approach a) by adding an
intermediate symbol that drivers can depend on, and changes
the three drivers using it.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Fixes: 09d4d087 ("mlx4: Implement devlink interface")
Fixes: c4745500 ("mlxsw: Implement devlink interface")
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

02 Mar, 2016 2 commits

mlx4: Implement devlink interface · 09d4d087

Authored by Jiri Pirko on Feb 26, 2016

Implement the newly introduced devlink interface. Add devlink port instances
for every port and set the port types accordingly.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
v2->v3:
-add dev param to devlink_register (api change)
Signed-off-by: David S. Miller <davem@davemloft.net>

IB/core: Add vendor's specific data to alloc mw · b2a239df

Authored by Matan Barak on Feb 29, 2016

Pass udata to the vendor's driver in order to pass data from the
user-space driver to the kernel-space driver. This data will be
used in downstream patches.

Signed-off-by: Matan Barak <matanb@mellanox.com>
Reviewed-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>

01 Mar, 2016 3 commits

IB/mlx4: Add support for the don't trap rule · 0e451e88

Authored by Marina Varshaver on Feb 18, 2016

Add support for receiving multicast/unicast traffic with
the don't trap rule.

Sniffing these packets requires a flow steering rule of type NORMAL
at priority 0 with flag IB_FLOW_ATTR_FLAGS_DONT_TRAP set.
Choosing between multicast and unicast is done via the Ethernet L2 dest_mac
mask and value:
- If the mask is all zeros, both unicast and multicast are set.
- If the mask is non-zero, only a mask with the multicast bit set to 1 and
  the rest 0 is supported; the MAC value chooses whether it is a
  multicast or a unicast rule.

If the mask multicast bit is on and some other bits are on too, the
request is for a specific multicast or unicast address. This is not
supported: a rule receives either all multicast or all unicast.

Only when the limitations are met will the registered QP receive the
requested type; other QPs can receive the same traffic if registered for it.
Otherwise, if the limitations are not met, an error is returned.

Limitations:
- Rule must be with priority 0.
- A0 mode is not supported.
- Sniffer QP cannot appear in any other flow steering rule.
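
The dest_mac mask rules above can be sketched as a small user-space classifier. The enum and function names are hypothetical, and the sketch assumes the multicast bit is the least significant bit of the first MAC octet (the Ethernet group-address bit). It does not model the other limitations (priority, A0 mode, QP reuse):

```c
#include <stdint.h>
#include <string.h>

enum sniff_kind_sketch {
	SNIFF_UNSUPPORTED = -1,	/* mask selects a specific address */
	SNIFF_ALL = 0,		/* mask all zeros: unicast and multicast */
	SNIFF_MULTICAST = 1,	/* multicast-bit mask, value bit is 1 */
	SNIFF_UNICAST = 2,	/* multicast-bit mask, value bit is 0 */
};

/* Classify a don't-trap rule from its L2 destination MAC mask and value. */
enum sniff_kind_sketch classify_dont_trap(const uint8_t mask[6],
					  const uint8_t val[6])
{
	static const uint8_t zero[6] = { 0 };
	static const uint8_t mc_only[6] = { 0x01, 0, 0, 0, 0, 0 };

	if (memcmp(mask, zero, 6) == 0)
		return SNIFF_ALL;
	if (memcmp(mask, mc_only, 6) != 0)
		return SNIFF_UNSUPPORTED;
	return (val[0] & 0x01) ? SNIFF_MULTICAST : SNIFF_UNICAST;
}
```

A mask of ff:ff:ff:ff:ff:ff, for example, asks for one specific address and therefore falls into the unsupported case.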

Signed-off-by: Marina Varshaver <marinav@mellanox.com>
Reviewed-by: Matan Barak <matanb@mellanox.com>
Reviewed-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>

IB/core: Add don't trap flag to flow creation · a3100a78

Authored by Marina Varshaver on Feb 18, 2016

The don't trap flag (i.e. IB_FLOW_ATTR_FLAGS_DONT_TRAP) indicates that the QP
will receive traffic, but will not steal it.

When a packet matches a flow steering rule that was created with
the don't trap flag, the QPs assigned to this rule will get this
packet, but matching will continue to other equal/lower priority
rules. This will let other QPs assigned to those rules get the
packet too.

If both don't trap rule and other rules have the same priority
and match the same packet, the behavior is undefined.

The don't trap flag can't be set with default rule types
(i.e. IB_FLOW_ATTR_ALL_DEFAULT, IB_FLOW_ATTR_MC_DEFAULT) as default rules
don't have rules after them and don't trap has no meaning here.

Signed-off-by: Marina Varshaver <marinav@mellanox.com>
Reviewed-by: Matan Barak <matanb@mellanox.com>
Reviewed-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>

IB/mlx4: Use boottime · 571e09ee

Authored by Abhilash Jindal on Jan 31, 2016

Wall time obtained from ktime_get_real_ns is susceptible to sudden jumps due to
the user setting the time or due to NTP.  Boot time is a constantly increasing time
that is better suited for comparing two timestamps.

Signed-off-by: Abhilash Jindal <klock.android@gmail.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>

17 Feb, 2016 3 commits

net/mlx4_core: Set UAR page size to 4KB regardless of system page size · 85743f1e

Authored by Huy Nguyen on Feb 17, 2016

Problem description:

The current code sets the UAR page size equal to the system page size.
The ConnectX-3 and ConnectX-3 Pro HWs require a minimum of 128 UAR pages.
The mlx4 kernel drivers are not loaded if there are fewer than 128 UAR pages.

Solution:

Always set the UAR page size to 4KB. This allows more UAR pages if the OS
has a PAGE_SIZE larger than 4KB. For example, the PowerPC kernel uses a 64KB
system page size; with a 4MB uar region, there are 4MB/2/64KB = 32
uars (half for uar, half for blueflame). This does not meet the minimum 128
UAR pages requirement. With a 4KB UAR page, there are 4MB/2/4KB = 512 uars
which meet the minimum requirement.
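
The arithmetic above can be reproduced with a short user-space sketch. The helper names and the `MIN_UAR_PAGES` constant are hypothetical names for illustration; the numbers come from the commit message:

```c
#include <stdint.h>

#define MIN_UAR_PAGES 128u	/* minimum stated for ConnectX-3 / ConnectX-3 Pro */

/*
 * Number of UARs in a UAR region: half of the region is used for UARs,
 * the other half for blueflame.
 */
uint64_t uar_count(uint64_t region_bytes, uint64_t uar_page_bytes)
{
	return region_bytes / 2 / uar_page_bytes;
}

/* Non-zero if the region provides at least the required number of UARs. */
int meets_min_uars(uint64_t region_bytes, uint64_t uar_page_bytes)
{
	return uar_count(region_bytes, uar_page_bytes) >= MIN_UAR_PAGES;
}
```

For a 4MB region, `uar_count` gives 32 with 64KB pages and 512 with 4KB pages, so only the 4KB setting passes `meets_min_uars`.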

Note that only code in mlx4_core that deals with firmware knows that the uar
page size is 4KB. Code that deals with the usr page in cq and qp context
(mlx4_ib, mlx4_en and part of mlx4_core) still assumes
that the uar page size equals the system page size.

Note that with this implementation, on a kernel with a 64KB system page size, there
are 16 uars per system page but only one uar is used. The other 15
uars are ignored because of the above assumption.

Regarding SR-IOV, mlx4_core in the hypervisor will set the uar page size
to 4KB and mlx4_core code in the virtual OS will obtain the uar page size from
firmware.

Regarding backward compatibility in SR-IOV, if the hypervisor has this new code,
the virtual OS must be updated. If the hypervisor has old code, and the virtual
OS has this new code, the new code will be backward compatible with the
old code. If the uar size is big enough, this new code in the VF continues to
work with a 64KB uar page size (on a PowerPC kernel). If the uar size does not
meet the 128 uars requirement, this new code is not loaded in the VF and prints the same
error message as the old code in the hypervisor.

Signed-off-by: Huy Nguyen <huyn@mellanox.com>
Reviewed-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

IB/mlx4: Add support for the port info class for RoCE ports · c2bab619

Authored by Eran Ben Elisha on Feb 11, 2016

Report that the driver supports IB_PMA_CLASS_CAP_EXT_WIDTH in response to an
IB_MGMT_CLASS_PERF_MGMT mad with the IB_PMA_CLASS_PORT_INFO attr id.

Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Matan Barak <matanb@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>

IB/mlx4: Add support for extended counters over RoCE ports · c3c0c836

Authored by Eran Ben Elisha on Feb 11, 2016

When attribute IB_PMA_PORT_COUNTERS_EXT is set, we now return 64 bit
values for the counters.

Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Matan Barak <matanb@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>

20 Jan, 2016 6 commits

IB/mlx4: Advertise RoCE v2 support · 4ed088e6

Authored by Matan Barak on Jan 14, 2016

Advertise RoCE v2 support in port_immutable attributes according to
the hardware's capabilities. This enables the verbs stack to use
RoCE v2 mode.

Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>

IB/mlx4: Create and use another QP1 for RoCEv2 · e1b866c6

Authored by Moni Shoua on Jan 14, 2016

The mlx4 driver uses a special QP to implement the GSI QP. This kind
of QP allows building the InfiniBand headers in software.
When mlx4 hardware builds the packet, it calculates the ICRC and puts
it at the end of the payload. However, this ICRC calculation depends
on the QP configuration, which is determined when the QP is modified
(roce_mode during INIT->RTR).
When receiving a packet, the ICRC verification doesn't depend on this
configuration.
Therefore, two GSI QPs for send (one for each RoCE version) and
one GSI QP for receive are required.

Signed-off-by: Moni Shoua <monis@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>

IB/mlx4: Enable send of RoCE QP1 packets with IP/UDP headers · 3ef967a4

Authored by Moni Shoua on Jan 14, 2016

RoCEv2 packets are sent over IP/UDP protocols.
The mlx4 driver uses a type of RAW QP to send packets for QP1 and
therefore needs to build the network headers below BTH in software.

This patch adds an option to build QP1 packets with IP and UDP headers if
RoCEv2 is requested.

Signed-off-by: Moni Shoua <monis@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>

IB/mlx4: Enable RoCE v2 when the IB device is added · 71a39bbb

Authored by Moni Shoua on Jan 14, 2016

If the hardware supports RoCE v2, we configure the hardware UDP
port according to the RoCE v2 Annex when the mlx4_ib device is added.

Signed-off-by: Moni Shoua <monis@mellanox.com>
Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>

IB/mlx4: Support modify_qp for RoCE v2 · 3b5daf28

Authored by Moni Shoua on Jan 14, 2016

In order to support modify_qp for RoCE v2, we need to set
the gid_type in the QP context.

Signed-off-by: Moni Shoua <monis@mellanox.com>
Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>

IB/mlx4: Add support for setting RoCEv2 gids in hardware · 7e57b85c

Authored by Moni Shoua on Jan 14, 2016

To tell hardware about a gid with type RoCEv2, software needs a new
modifier to the SET_PORT command: MLX4_SET_PORT_ROCE_ADDR. This can
replace the old method, MLX4_SET_PORT_GID_TABLE, for RoCEv1 gids.

Signed-off-by: Moni Shoua <monis@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>

openanolis / cloud-kernel: synced successfully almost 2 years ago
