提交 · b9b7cee405797cc395f699d8dee4747b96b1e0a8 · openeuler / raspberrypi-kernel

07 4月, 2016 7 次提交

mlxsw: reg: Add QoS ETS Element Configuration register · b9b7cee4

由 Ido Schimmel 提交于 4月 06, 2016

We are going to introduce support for DCB, so we need to be able to
configure the traffic selection algorithm (TSA) used by each traffic
class (TC), as well as the bandwidth percentage allocated to each TC in
case of ETS.

Add the QoS ETS Element Configuration register, which controls the
above parameters.
Signed-off-by: NIdo Schimmel <idosch@mellanox.com>
Signed-off-by: NJiri Pirko <jiri@mellanox.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

b9b7cee4

mlxsw: spectrum: Set port's shared buffer size to 0 · d6b7c13b

由 Ido Schimmel 提交于 4月 06, 2016

In addition to the priority group (PG) buffers in the headroom, the
device enables the allocation of headroom shared buffer, which can
be shared between different PGs.

However, we are not going to use the headroom shared buffer and instead
allow the user to use its size for PGs or the switch's shared buffer.
Signed-off-by: NIdo Schimmel <idosch@mellanox.com>
Signed-off-by: NJiri Pirko <jiri@mellanox.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

d6b7c13b

mlxsw: reg: Use correct PBMC register length · 7ad7cd61

由 Ido Schimmel 提交于 4月 06, 2016

The last field of the PBMC register is at offset 0x64 and its size is
0x8, so the correct register's length is 0x6C bytes.
Signed-off-by: NIdo Schimmel <idosch@mellanox.com>
Signed-off-by: NJiri Pirko <jiri@mellanox.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

7ad7cd61

mlxsw: spectrum: Correctly configure headroom size · ff6551ec

由 Ido Schimmel 提交于 4月 06, 2016

When packets ingress the switch they are assigned a switch priority and
directed to the corresponding priority group (PG) buffer in the port's
headroom buffer.

Since we now map all switch priorities to priority group 0 (PG0) by
default, there is no need to allocate the other priority groups during
initialization. The only exception is PG9, which is used for control
traffic.

At minimum, the PG should be able to store the currently classified
packet (pipeline latency isn't 0) and also the packets arriving during
the classification time. However, an incoming packet will not be
buffered if there is no available MTU-sized buffer space for storing it.

The buffer needed to accommodate for pipeline latency is variable and
needs to take into account both the current link speed and current
latency of the pipeline, which is time-dependent. Testing showed that
setting the PG's size to twice the current MTU is optimal.

Since PG9 is used strictly for control packets and not subject to flow
control, we are not going to resize it according to user configuration,
so we simply set it according to worst case scenario, which is twice the
maximum MTU.

In any case, later patches in the series will allow a user to direct
lossless flows to other PGs than PG0 and set their size to accommodate
for round-trip propagation delay.

The above change also requires us to resize the PG buffer whenever the
port's MTU is changed.
Signed-off-by: NIdo Schimmel <idosch@mellanox.com>
Signed-off-by: NJiri Pirko <jiri@mellanox.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

ff6551ec

mlxsw: spectrum: Add bytes to cells helper · 1a198449

由 Ido Schimmel 提交于 4月 06, 2016

Buffers in the switch store packets in units called buffer cells. Add a
helper to convert from bytes to cells, so that the actual number of
cells required (result is round up) is returned.

Also, drop the SB (shared buffer) acronym from the BYTES_PER_CELL macro,
as this unit is also used in the ports' buffers and not only the
switch's shared buffer.
Signed-off-by: NIdo Schimmel <idosch@mellanox.com>
Signed-off-by: NJiri Pirko <jiri@mellanox.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

1a198449

mlxsw: spectrum: Map all switch priorities to priority group 0 · dd6cb0f9

由 Ido Schimmel 提交于 4月 06, 2016

During transmission, the skb's priority is used to map the skb to a
traffic class, where the idea is to group priorities with similar
characteristics (e.g. lossy, lossless) to the same traffic class. By
default, all priorities are mapped to traffic class 0.

In the device, we model the skb's priority as the switch priority, which
is assigned to a packet according to its PCP value and ingress port
(untagged packets are assigned the port's default switch priority - 0).

At ingress, the packet is directed to a priority group (PG) buffer in
the port's headroom buffer according to the packet's switch priority and
switch priority to buffer mapping.

While it's possible to configure the egress mapping between skb's
priority (switch priority) and traffic class, there is no mechanism to
configure the ingress mapping to a PG.

In order to keep things simple and since grouping certain priorities into
a traffic class at egress also implies they should be grouped the same
at ingress, treat a PG as the ingress counterpart of an egress traffic
class.

Having established the above, during initialization map all the switch
priorities to PG0 in accordance with the Linux defaults for traffic
class mapping.
Signed-off-by: NIdo Schimmel <idosch@mellanox.com>
Signed-off-by: NJiri Pirko <jiri@mellanox.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

dd6cb0f9

mlxsw: reg: Add Port Prio To Buffer register · b98ff151

由 Ido Schimmel 提交于 4月 06, 2016

When packets ingress the switch they are assigned a switch priority
number that dictates the packet's priority group (PG) buffer in the
port's headroom buffer.

Add the Port Prio To Buffer (PPTB) register, which configures the switch
priority to PG mapping.
Signed-off-by: NIdo Schimmel <idosch@mellanox.com>
Signed-off-by: NJiri Pirko <jiri@mellanox.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

b98ff151

06 4月, 2016 2 次提交

mlxsw: spectrum: Add support for physical port names · 2bf9a586

由 Ido Schimmel 提交于 4月 05, 2016

Export to userspace the front panel name of the port, so that udev can
rename the ports accordingly. The convention suggested by switchdev
documentation is used:

1) Non-split: pX
2) Split: pXsY
Signed-off-by: NIdo Schimmel <idosch@mellanox.com>
Signed-off-by: NJiri Pirko <jiri@mellanox.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

2bf9a586

mlxsw: spectrum: Reduce number of supported 802.1D bridges · b555cf4a

由 Ido Schimmel 提交于 4月 05, 2016

Resources allocated for these bridges at init time cannot be later used
for other purposes. While current number is supported by the device,
it's mostly theoretical with regards to any real use case, which leads
to poor utilization of device's resources. Solve that by reducing the
number.

The long term plan is to make this value (along with others) user
configurable via devlink and write it to NVRAM, so that it can be used
during the next init. Until then we must hardcode such values.
Signed-off-by: NIdo Schimmel <idosch@mellanox.com>
Signed-off-by: NJiri Pirko <jiri@mellanox.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

b555cf4a

22 3月, 2016 5 次提交

net/mlx5_core: Implement modify HCA vport command · 1f324bff

由 Eli Cohen 提交于 3月 11, 2016

Implement the modify HCA vport commands used to modify the parameters of
virtual HCA's ports.
Signed-off-by: NEli Cohen <eli@mellanox.com>
Reviewed-by: NOr Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: NDoug Ledford <dledford@redhat.com>

1f324bff

net/mlx5_core: Add VF param when querying vport counter · 2a4826fe

由 Eli Cohen 提交于 3月 11, 2016

Add a vf parameter to mlx5_core_query_vport_counter so we can call it to
query counters of virtual functions. Also update current users of the
API.

PFs may call mlx5_core_query_vport_counter with other_vport set to
indicate that they are querying a virtual function. The virtual
function to be queried is given by the vf parameter. Virtual function
numbering is zero based so the first VF is 0 and so on. When a PF
queries its own function, the other_vport parameter is cleared.
Signed-off-by: NEli Cohen <eli@mellanox.com>
Reviewed-by: NOr Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: NDoug Ledford <dledford@redhat.com>

2a4826fe

net/mlx5_core: Introduce offload arithmetic hardware capabilities · 3f0393a5

由 Sagi Grimberg 提交于 2月 23, 2016

Define the necessary hardware structures for the offload
arithmetic capabilities and read/cache them on driver load.
Signed-off-by: NSagi Grimberg <sagig@mellanox.com>
Signed-off-by: NLeon Romanovsky <leonro@mellanox.com>
Reviewed-by: NSaeed Mahameed <saeedm@mellanox.com>
Signed-off-by: NDoug Ledford <dledford@redhat.com>

3f0393a5

net/mlx5_core: Refactor device capability function · b06e7de8

由 Leon Romanovsky 提交于 2月 23, 2016

Device capability function was called similar in all places.
It was called twice for every queried parameter, while the
difference between calls was in HCA capability mode only.

The change proposed unify these calls into one function.
Signed-off-by: NLeon Romanovsky <leonro@mellanox.com>
Reviewed-by: NSaeed Mahameed <saeedm@mellanox.com>
Signed-off-by: NDoug Ledford <dledford@redhat.com>

b06e7de8

net/mlx5_core: Fix caching ATOMIC endian mode capability · 91d9ed84

由 Leon Romanovsky 提交于 2月 23, 2016

Add caching of maximum device capability of ATOMIC endian mode.

Fixes: f91e6d89 ('net/mlx5_core: Add setting ATOMIC endian mode')
Signed-off-by: NLeon Romanovsky <leonro@mellanox.com>
Reviewed-by: NSaeed Mahameed <saeedm@mellanox.com>
Signed-off-by: NDoug Ledford <dledford@redhat.com>

91d9ed84

21 3月, 2016 1 次提交

net/mlx4: remove unused array zero_gid[] · 5a779c4f

由 Colin Ian King 提交于 3月 20, 2016

zero_gid is not used, so remove this redundant array.
Signed-off-by: NColin Ian King <colin.king@canonical.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

5a779c4f

19 3月, 2016 1 次提交

net/mlx4_core: Fix backward compatibility on VFs · 76e39ccf

由 Eli Cohen 提交于 3月 17, 2016

Commit 85743f1e ("net/mlx4_core: Set UAR page size to 4KB regardless
of system page size") introduced dependency where old VF drivers without
this fix fail to load if the PF driver runs with this commit.

To resolve this add a module parameter which disables that functionality
by default.  If both the PF and VFs are running with a driver with that
commit the administrator may set the module param to true.

The module parameter is called enable_4k_uar.

Fixes: 85743f1e ('net/mlx4_core: Set UAR page size to 4KB ...')
Signed-off-by: NEli Cohen <eli@mellanox.com>
Tested-by: NAlexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

76e39ccf

18 3月, 2016 1 次提交

mm: introduce page reference manipulation functions · fe896d18

由 Joonsoo Kim 提交于 3月 17, 2016

The success of CMA allocation largely depends on the success of
migration and key factor of it is page reference count.  Until now, page
reference is manipulated by direct calling atomic functions so we cannot
follow up who and where manipulate it.  Then, it is hard to find actual
reason of CMA allocation failure.  CMA allocation should be guaranteed
to succeed so finding offending place is really important.

In this patch, call sites where page reference is manipulated are
converted to introduced wrapper function.  This is preparation step to
add tracepoint to each page reference manipulation function.  With this
facility, we can easily find reason of CMA allocation failure.  There is
no functional change in this patch.

In addition, this patch also converts reference read sites.  It will
help a second step that renames page._count to something else and
prevents later attempt to direct access to it (Suggested by Andrew).
Signed-off-by: NJoonsoo Kim <iamjoonsoo.kim@lge.com>
Acked-by: NMichal Nazarewicz <mina86@mina86.com>
Acked-by: NVlastimil Babka <vbabka@suse.cz>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

fe896d18

15 3月, 2016 1 次提交

mlx4: add missing braces in verify_qp_parameters · baefd701

由 Arnd Bergmann 提交于 3月 14, 2016

The implementation of QP paravirtualization back in linux-3.7 included
some code that looks very dubious, and gcc-6 has grown smart enough
to warn about it:

drivers/net/ethernet/mellanox/mlx4/resource_tracker.c: In function 'verify_qp_parameters':
drivers/net/ethernet/mellanox/mlx4/resource_tracker.c:3154:5: error: statement is indented as if it were guarded by... [-Werror=misleading-indentation]
     if (optpar & MLX4_QP_OPTPAR_ALT_ADDR_PATH) {
     ^~
drivers/net/ethernet/mellanox/mlx4/resource_tracker.c:3144:4: note: ...this 'if' clause, but it is not
    if (slave != mlx4_master_func_num(dev))

>From looking at the context, I'm reasonably sure that the indentation
is correct but that it should have contained curly braces from the
start, as the update_gid() function in the same patch correctly does.
Signed-off-by: NArnd Bergmann <arnd@arndb.de>
Fixes: 54679e14 ("mlx4: Implement QP paravirtualization and maintain phys_pkey_cache for smp_snoop")
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

baefd701

14 3月, 2016 3 次提交

mlx5: use napi_consume_skb API to get bulk free operations · 8ec736e5

由 Jesper Dangaard Brouer 提交于 3月 11, 2016

Bulk free of SKBs happen transparently by the API call napi_consume_skb().
The napi budget parameter is needed by napi_consume_skb() to detect
if called from netpoll.
Signed-off-by: NJesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

8ec736e5

mlx4: use napi_consume_skb API to get bulk free operations · b4a53379

由 Jesper Dangaard Brouer 提交于 3月 11, 2016

Bulk free of SKBs happen transparently by the API call napi_consume_skb().
The napi budget parameter is usually needed by napi_consume_skb()
to detect if called from netpoll.  In this patch it has an extra meaning.

For mlx4 driver, the mlx4_en_stop_port() call is done outside
NAPI/softirq context, and cleanup the entire TX ring via
mlx4_en_free_tx_buf().  The code mlx4_en_free_tx_desc() for
freeing SKBs are shared with NAPI calls.

To handle this shared use the zero budget indication is reused,
and handled appropriately in napi_consume_skb(). To reflect this,
variable is called napi_mode for the function call that needed
this distinction.
Signed-off-by: NJesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

b4a53379

mlxsw: pci: Implement reset done check · 233fa44b

由 Jiri Pirko 提交于 3月 10, 2016

Firmware now tells us that the reset is done by passing a magic value
via register. Use it to shorten the wait in case this is supported.
With old firmware, we still wait until the timeout is reached.
Signed-off-by: NJiri Pirko <jiri@mellanox.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

233fa44b

12 3月, 2016 1 次提交

mlxsw: spectrum: Check requested ageing time is valid · 869f63a4

由 Ido Schimmel 提交于 3月 08, 2016

Commit c62987bb ("bridge: push bridge setting ageing_time down to
switchdev") added a check for minimum and maximum ageing time, but this
breaks existing behaviour where one can set ageing time to 0 for a
non-learning bridge.

Push this check down to the driver and allow the check in the bridge
layer to be removed. Currently ageing time 0 is refused by the driver,
but we can later add support for this functionality.
Signed-off-by: NIdo Schimmel <idosch@mellanox.com>
Acked-by: NJiri Pirko <jiri@mellanox.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

869f63a4

11 3月, 2016 6 次提交

net/mlx5e: Support offload cls_flower with skbedit mark action · 12185a9f

由 Amir Vadai 提交于 3月 08, 2016

Introduce offloading of skbedit mark action.

For example, to mark with 0x1234, all TCP (ip_proto 6) packets arriving
to interface ens9:

 # tc qdisc add dev ens9 ingress
 # tc filter add dev ens9 protocol ip parent ffff: \
     flower ip_proto 6 \
     indev ens9 \
     action skbedit mark 0x1234
Signed-off-by: NAmir Vadai <amir@vadai.me>
Signed-off-by: NOr Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

12185a9f

net/mlx5e: Support offload cls_flower with drop action · e3a2b7ed

由 Amir Vadai 提交于 3月 08, 2016

Parse tc_cls_flower_offload into device specific commands and program
the hardware to classify and act accordingly.

For example, to drop ICMP (ip_proto 1) packets from specific smac, dmac,
src_ip, src_ip, arriving to interface ens9:

 # tc qdisc add dev ens9 ingress

 # tc filter add dev ens9 protocol ip parent ffff: \
     flower ip_proto 1 \
     dst_mac 7c:fe:90:69:81:62 src_mac 7c:fe:90:69:81:56 \
     dst_ip 11.11.11.11 src_ip 11.11.11.12 indev ens9 \
     action drop
Signed-off-by: NAmir Vadai <amir@vadai.me>
Signed-off-by: NOr Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

e3a2b7ed

net/mlx5e: Introduce tc offload support · e8f887ac

由 Amir Vadai 提交于 3月 08, 2016

Extend ndo_setup_tc() to support ingress tc offloading. Will be used by
later patches to offload tc flower filter.

Feature is off by default and could be enabled by issuing:
 # ethtool  -K eth0 hw-tc-offload on

Offloads flow table is dynamically created when first filter is
added.
Rules are saved in a hash table that is maintained by the consumer (for
example - the flower offload in the next patch).
When last filter is removed and no filters exist in the hash table, the
offload flow table is destroyed.
Signed-off-by: NAmir Vadai <amir@vadai.me>
Signed-off-by: NOr Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

e8f887ac

net/mlx5e: Add a new priority for kernel flow tables · b6172aac

由 Amir Vadai 提交于 3月 08, 2016

Move the vlan and main flow tables to use priority 1. This will allow
the upcoming TC offload logic to use a higher priority (0) for the
offload steering table.
Signed-off-by: NAmir Vadai <amir@vadai.me>
Signed-off-by: NOr Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

b6172aac

net/mlx5e: Relax ndo_setup_tc handle restriction · 67ba422e

由 Amir Vadai 提交于 3月 08, 2016

Restricting handle to TC_H_ROOT breaks the old instantiation of mqprio
to setup a hardware qdisc. This patch relaxes the test, to only check the
type.

Fixes: 08fb1dac ("net/mlx5e: Support DCBNL IEEE ETS")
Signed-off-by: NAmir Vadai <amir@vadai.me>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

67ba422e

net/mlx5_core: Set flow steering dest only for forward rules · 60ab4584

由 Amir Vadai 提交于 3月 08, 2016

We need to handle flow table entry destinations only if the action
associated with the rule is forwarding (MLX5_FLOW_CONTEXT_ACTION_FWD_DEST).

Fixes: 26a81453 ('net/mlx5_core: Introduce flow steering firmware commands')
Signed-off-by: NAmir Vadai <amir@vadai.me>
Signed-off-by: NMaor Gottlieb <maorg@mellanox.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

60ab4584

10 3月, 2016 2 次提交

net/mlx5_core: Introduce forward to next priority action · b3638e1a

由 Maor Gottlieb 提交于 3月 07, 2016

Add support to create flow rule that forward packets
to the first flow table in the next priority (next priority
could be the first priority in the next namespace or the
next priority in the same namespace).
This feature could be used for DONT_TRAP rules or rules
that only want to mark the packet with flow tag.

In order to do it optimally, each flow table has list
of all rules that point to this flow table,
when a flow table is destroyed/created, we update the list
head correspondingly.

This kind of rule is created when destination is NULL and
action is MLX5_FLOW_CONTEXT_ACTION_FWD_NEXT_PRIO.
Signed-off-by: NMaor Gottlieb <maorg@mellanox.com>
Reviewed-by: NMatan Barak <matanb@mellanox.com>
Signed-off-by: NDoug Ledford <dledford@redhat.com>

b3638e1a

net/mlx5_core: Create anchor of last flow table · 153fefbf

由 Maor Gottlieb 提交于 3月 07, 2016

Create an empty flow table in the end of NIC rx namesapce.
Adding this flow table simplify the implementation of "forward
to next prio" rules.
Signed-off-by: NMaor Gottlieb <maorg@mellanox.com>
Reviewed-by: NMatan Barak <matanb@mellanox.com>
Signed-off-by: NDoug Ledford <dledford@redhat.com>

153fefbf

08 3月, 2016 2 次提交

mlxsw: pci: Correctly determine if descriptor queue is full · 5091730d

由 Ido Schimmel 提交于 3月 07, 2016

The descriptor queues for sending (SDQs) and receiving (RDQs) packets
are managed by two counters - producer and consumer - which are both
16-bit in size. A queue is considered full when the difference between
the two equals the queue's maximum number of descriptors.

However, if the producer counter overflows, then it's possible for the
full queue check to fail, as it doesn't take the overflow into account.
In such a case, descriptors already passed to the device - but for which
a completion has yet to be posted - will be overwritten, thereby causing
undefined behavior. The above can be achieved under heavy load (~30
netperf instances).

Fix that by casting the subtraction result to u16, preventing it from
being treated as a signed integer.

Fixes: eda6500a ("mlxsw: Add PCI bus implementation")
Signed-off-by: NIdo Schimmel <idosch@mellanox.com>
Signed-off-by: NJiri Pirko <jiri@mellanox.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

5091730d

mlxsw: spectrum: Always decrement bridge's ref count · 912b1c89

由 Ido Schimmel 提交于 3月 07, 2016

Since we only support one VLAN filtering bridge we need to associate a
reference count with it, so that when the last port netdev leaves it, we
would know that a different bridge can be offloaded to hardware.

When a LAG device is memeber in a bridge and port netdevs are leaving
the LAG, we should always decrement the bridge's reference count, as it's
incremented for any port in the LAG.

Fixes: 4dc236c3 ("mlxsw: spectrum: Handle port leaving LAG while bridged")
Signed-off-by: NIdo Schimmel <idosch@mellanox.com>
Signed-off-by: NJiri Pirko <jiri@mellanox.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

912b1c89

04 3月, 2016 2 次提交

net: mellanox: add DEVLINK dependencies · 3d1cbe83

由 Arnd Bergmann 提交于 3月 02, 2016

The new NET_DEVLINK infrastructure can be a loadable module, but the drivers
using it might be built-in, which causes link errors like:

drivers/net/built-in.o: In function `mlx4_load_one':
:(.text+0x2fbfda): undefined reference to `devlink_port_register'
:(.text+0x2fc084): undefined reference to `devlink_port_unregister'
drivers/net/built-in.o: In function `mlxsw_sx_port_remove':
:(.text+0x33a03a): undefined reference to `devlink_port_type_clear'
:(.text+0x33a04e): undefined reference to `devlink_port_unregister'

There are multiple ways to avoid this:

a) add 'depends on NET_DEVLINK || !NET_DEVLINK' dependencies
   for each user
b) use 'select NET_DEVLINK' from each driver that uses it
   and hide the symbol in Kconfig.
c) make NET_DEVLINK a 'bool' option so we don't have to
   list it as a dependency, and rely on the APIs to be
   stubbed out when it is disabled
d) use IS_REACHABLE() rather than IS_ENABLED() to check for
   NET_DEVLINK in include/net/devlink.h

This implements a variation of approach a) by adding an
intermediate symbol that drivers can depend on, and changes
the three drivers using it.
Signed-off-by: NArnd Bergmann <arnd@arndb.de>
Fixes: 09d4d087 ("mlx4: Implement devlink interface")
Fixes: c4745500 ("mlxsw: Implement devlink interface")
Acked-by: NJiri Pirko <jiri@mellanox.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

3d1cbe83

net: relax setup_tc ndo op handle restriction · 5eb4dce3

由 John Fastabend 提交于 2月 29, 2016

I added this check in setup_tc to multiple drivers,

 if (handle != TC_H_ROOT || tc->type != TC_SETUP_MQPRIO)

Unfortunately restricting to TC_H_ROOT like this breaks the old
instantiation of mqprio to setup a hardware qdisc. This patch
relaxes the test to only check the type to make it equivalent
to the check before I broke it. With this the old instantiation
continues to work.

A good smoke test is to setup mqprio with,

# tc qdisc add dev eth4 root mqprio num_tc 8 \
  map 0 1 2 3 4 5 6 7 \
  queues 0@0 1@1 2@2 3@3 4@4 5@5 6@6 7@7

Fixes: e4c6734e ("net: rework ndo tc op to consume additional qdisc handle paramete")
Reported-by: NSingh Krishneil <krishneil.k.singh@intel.com>
Reported-by: NJake Keller <jacob.e.keller@intel.com>
CC: Murali Karicheri <m-karicheri2@ti.com>
CC: Shradha Shah <sshah@solarflare.com>
CC: Or Gerlitz <ogerlitz@mellanox.com>
CC: Ariel Elior <ariel.elior@qlogic.com>
CC: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
CC: Bruce Allan <bruce.w.allan@intel.com>
CC: Jesse Brandeburg <jesse.brandeburg@intel.com>
CC: Don Skidmore <donald.c.skidmore@intel.com>
Signed-off-by: NJohn Fastabend <john.r.fastabend@intel.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

5eb4dce3

03 3月, 2016 6 次提交

net/mlx4_core: Allow resetting VF admin mac to zero · 6e522422

由 Jack Morgenstein 提交于 3月 02, 2016

The VF administrative mac addresses (stored in the PF driver) are
initialized to zero when the PF driver starts up.

These addresses may be modified in the PF driver through ndo calls
initiated by iproute2 or libvirt.

While we allow the PF/host to change the VF admin mac address from zero
to a valid unicast mac, we do not allow restoring the VF admin mac to
zero. We currently only allow changing this mac to a different unicast mac.

This leads to problems when libvirt scripts are used to deal with
VF mac addresses, and libvirt attempts to revoke the mac so this
host will not use it anymore.

Fix this by allowing resetting a VF administrative MAC back to zero.

Fixes: 8f7ba3ca ('net/mlx4: Add set VF mac address support')
Signed-off-by: NJack Morgenstein <jackm@dev.mellanox.co.il>
Reported-by: NMoshe Levi <moshele@mellanox.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

6e522422

net/mlx4_core: Check the correct limitation on VFs for HA mode · 00ada910

由 Moni Shoua 提交于 3月 02, 2016

The limit of 63 is only for virtual functions while the actual enforcement
was for VFs plus physical functions, fix that.

Fixes: e57968a1 ('net/mlx4_core: Support the HA mode for SRIOV VFs too')
Signed-off-by: NMoni Shoua <monis@mellanox.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

00ada910

net/mlx4_core: Fix lockdep warning in handling of mac/vlan tables · 03a79f31

由 Jack Morgenstein 提交于 3月 02, 2016

In the mac and vlan register/unregister/replace functions, the driver locks
the mac table mutex (or vlan table mutex) on both ports.

We move to use mutex_lock_nested() to prevent warnings, such as the one below.

[ 101.828445] =============================================
[ 101.834820] [ INFO: possible recursive locking detected ]
[ 101.841199] 4.5.0-rc2+  #49 Not tainted
[ 101.850251] ---------------------------------------------
[ 101.856621] modprobe/3054 is trying to acquire lock:
[ 101.862514] (&table->mutex#2){+.+.+.}, at: [<ffffffffa079c10e>] __mlx4_register_mac+0x87e/0xa90 [mlx4_core]
[ 101.874598]
[ 101.874598] but task is already holding lock:
[ 101.881703] (&table->mutex#2){+.+.+.}, at: [<ffffffffa079c0f0>] __mlx4_register_mac+0x860/0xa90 [mlx4_core]
[ 101.893776]
[ 101.893776] other info that might help us debug this:
[ 101.901658] Possible unsafe locking scenario:
[ 101.901658]
[ 101.908859] CPU0
[ 101.911923] ----
[ 101.914985] lock(&table->mutex#2);
[ 101.919595] lock(&table->mutex#2);
[ 101.924199]
[ 101.924199] * DEADLOCK *
[ 101.924199]
[ 101.931643] May be due to missing lock nesting notation

Fixes: 5f61385d ('net/mlx4_core: Keep VLAN/MAC tables mirrored in multifunc HA mode')
Signed-off-by: NJack Morgenstein <jackm@dev.mellanox.co.il>
Suggested-by: NDoron Tsur <doront@mellanox.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

03a79f31

net/mlx5e: Provide correct packet/bytes statistics · faf4478b

由 Gal Pressman 提交于 2月 29, 2016

Using the HW VPort counters for traffic (rx/tx packets/bytes)
statistics is wrong. This is because frames dropped due to steering or
out of buffer will be counted as received. To fix that, we move to use
the packet/bytes accounting done by the driver for what the netdev
reports out.

Fixes: f62b8bb8 ('net/mlx5: Extend mlx5_core to support [...]')
Signed-off-by: NGal Pressman <galp@mellanox.com>
Signed-off-by: NSaeed Mahameed <saeedm@mellanox.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

faf4478b

net/mlx5e: Add rx/tx bytes software counters · b081da5e

由 Gal Pressman 提交于 2月 29, 2016

Sum up rx/tx bytes in software as we do for rx/tx packets, to be reported
in upcoming statistics fix.
Signed-off-by: NGal Pressman <galp@mellanox.com>
Signed-off-by: NSaeed Mahameed <saeedm@mellanox.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

b081da5e

net/mlx5e: Correctly handle RSS indirection table when changing number of channels · 85082dba

由 Tariq Toukan 提交于 2月 29, 2016

Upon changing num_channels, reset the RSS indirection table to
match the new value.

Fixes: 2d75b2bc ('net/mlx5e: Add ethtool RSS configuration options')
Signed-off-by: NTariq Toukan <tariqt@mellanox.com>
Signed-off-by: NSaeed Mahameed <saeedm@mellanox.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

85082dba