提交 · 0ff1fb654bec0cff62ddf81a8a8edec4263604a0 · openeuler / Kernel

08 7月, 2012 6 次提交

{NET, IB}/mlx4: Add device managed flow steering firmware API · 0ff1fb65

由 Hadar Hen Zion 提交于 7月 05, 2012

The driver is modified to support three operation modes.

If supported by firmware use the device managed flow steering
API, that which we call device managed steering mode. Else, if
the firmware supports the B0 steering mode use it, and finally,
if none of the above, use the A0 steering mode.

When the steering mode is device managed, the code is modified
such that L2 based rules set by the mlx4_en driver for Ethernet
unicast and multicast, and the IB stack multicast attach calls
done through the mlx4_ib driver are all routed to use the device
managed API.

When attaching rule using device managed flow steering API,
the firmware returns a 64 bit registration id, which is to be
provided during detach.

Currently the firmware is always programmed during HCA initialization
to use standard L2 hashing. Future work should be done to allow
configuring the flow-steering hash function with common, non
proprietary means.
Signed-off-by: NHadar Hen Zion <hadarh@mellanox.co.il>
Signed-off-by: NOr Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

0ff1fb65

net/mlx4_core: Add firmware commands to support device managed flow steering · 8fcfb4db

由 Hadar Hen Zion 提交于 7月 05, 2012

Add support for firmware commands to attach/detach a new device managed
steering mode. Such network steering rules allow the user to provide an
L2/L3/L4 flow specification to the firmware and have the device to steer
traffic that matches that specification to the provided QP.
Signed-off-by: NHadar Hen Zion <hadarh@mellanox.co.il>
Signed-off-by: NOr Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

8fcfb4db

net/mlx4: Set steering mode according to device capabilities · c96d97f4

由 Hadar Hen Zion 提交于 7月 05, 2012

Instead of checking the firmware supported steering mode in various
places in the code, add a dedicated field in the mlx4 device capabilities
structure which is written once during the initialization flow and read
across the code.

This also set the grounds for add new steering modes. Currently two modes
are supported, and are named after the ConnectX HW versions A0 and B0.

A0 steering uses mac_index, vlan_index and priority to steer traffic
into pre-defined range of QPs.

B0 steering uses Ethernet L2 hashing rules and is enabled only
if the firmware supports both unicast and multicast B0 steering,

The current steering modes are relevant for Ethernet traffic only,
such that Infiniband steering remains untouched.
Signed-off-by: NHadar Hen Zion <hadarh@mellanox.co.il>
Signed-off-by: NOr Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

c96d97f4

net/mlx4_en: Re-design multicast attachments flow · 6d199937

由 Yevgeny Petrilin 提交于 7月 05, 2012

Currently, for every change in the net device multicast list, the driver
detaches all the addresses from the HW device, and then attaches the
updated list. This behavior is wrong from two aspects: first, it causes
a load of firmware commands and second, there is period of time where
the correct addresses are not attached, which turned into packet loss.

To improve - a copy of the multicast list is saved by the driver. For
every change in the multicast list, the multicast list copy is used
to find the delta between those two lists and add or remove multicast
addresses as needed.
Reported-by: NShawn Bohrer <sbohrer@rgmadvisors.com>
Cc: Shawn Bohrer <sbohrer@rgmadvisors.com>
Signed-off-by: NHadar Hen Zion <hadarh@mellanox.co.il>
Signed-off-by: NYevgeny Petrilin <yevgenyp@mellanox.co.il>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

6d199937

net/mlx4_core: Change resource tracking ID to be 64 bit · aa1ec3dd

由 Hadar Hen Zion 提交于 7月 05, 2012

Currently the IDs used by the resource tracker are of type u32, so far this was
ok since all the different resources we were tracking could be encoded in 32bit.

As a preparation step for tracking of resources whose IDs need > 32 bits such
as network flow steering rules, who are 64 bit in size, move to use 64 bit
based resource IDs.
Signed-off-by: NHadar Hen Zion <hadarh@mellanox.co.il>
Signed-off-by: NOr Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

aa1ec3dd

net/mlx4_core: Change resource tracking mechanism to use red-black tree · 4af1c048

由 Hadar Hen Zion 提交于 7月 05, 2012

Change the data structure used for managing the SRIOV resource tracking
mechanism from radix tree to red-black tree. This is preparation step
for supporting resource IDs which are 64bit long, such as network flow
steering rules. Such IDs can't be used as radix-tree keys on 32bit
architectures and hence the reason for the change.
Signed-off-by: NHadar Hen Zion <hadarh@mellanox.co.il>
Signed-off-by: NOr Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

4af1c048

05 7月, 2012 1 次提交

mlx4: set maximal number of default RSS queues · 90b1ebe7

由 Yuval Mintz 提交于 7月 01, 2012

Signed-off-by: NYuval Mintz <yuvalmin@broadcom.com>
Signed-off-by: NEilon Greenstein <eilong@broadcom.com>

Cc: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

90b1ebe7

26 6月, 2012 3 次提交

net/mlx4_en: Release QP range in free_resources · 044ca2a5

由 Yevgeny Petrilin 提交于 6月 25, 2012

Add a missing resource release in ring cleanup.
Not doing this leaves a range of QPs that are being reserved,
and no one can use them.
Signed-off-by: NYevgeny Petrilin <yevgenyp@mellanox.co.il>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

044ca2a5

net/mlx4: Use single completion vector after NOP failure · 9858d2d1

由 Yevgeny Petrilin 提交于 6月 25, 2012

Fix a crash at the error flow of NOP command which caused the driver to try and use
a completion vector which wasn't allocated.
Signed-off-by: NYevgeny Petrilin <yevgenyp@mellanox.co.il>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

9858d2d1

net/mlx4_en: Set correct port parameters during device initialization · 5c8e9046

由 Yevgeny Petrilin 提交于 6月 25, 2012

Set valid port parameters: MTU and flow control configuration when
configuring the port during HW device initialization,
prior to the net device open() being called.
Using  invalid parameters (such as all zeros)
could lead to bad firmware behavior.
Signed-off-by: NYevgeny Petrilin <yevgenyp@mellanox.co.il>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

5c8e9046

07 6月, 2012 2 次提交

mlx4_core: Fix setting VL_cap in mlx4_SET_PORT wrapper flow · edc4a67e

由 Jack Morgenstein 提交于 5月 24, 2012

Commit 096335b3 ("mlx4_core: Allow dynamic MTU configuration for
IB ports") modifies the port VL setting.  This exposes a bug in
mlx4_common_set_port(), where the VL cap value passed in (inside the
command mailbox) is incorrectly zeroed-out:

mlx4_SET_PORT modifies the VL_cap field (byte 3 of the mailbox).
Since the SET_PORT command is paravirtualized on the master as well as
on the slaves, mlx4_SET_PORT_wrapper() is invoked on the master.  This
calls mlx4_common_set_port() where mailbox byte 3 gets overwritten by
code which should only set a single bit in that byte (for the reset
qkey counter flag) -- but instead overwrites the entire byte.

The result is that when running in SR-IOV mode, the VL_cap will be set
to zero -- fix this.
Signed-off-by: NJack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: NOr Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: NRoland Dreier <roland@purestorage.com>

edc4a67e

ethernet: Remove casts to same type · 64699336

由 Joe Perches 提交于 6月 04, 2012

Adding casts of objects to the same type is unnecessary
and confusing for a human reader.

For example, this cast:

        int y;
        int *p = (int *)&y;

I used the coccinelle script below to find and remove these
unnecessary casts.  I manually removed the conversions this
script produces of casts with __force, __iomem and __user.

@@
type T;
T *p;
@@

-       (T *)p
+       p

A function in atl1e_main.c was passed a const pointer
when it actually modified elements of the structure.

Change the argument to a non-const pointer.

A function in stmmac needed a __force to avoid a sparse
warning.  Added it.
Signed-off-by: NJoe Perches <joe@perches.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

64699336

01 6月, 2012 6 次提交

net/mlx4_core: Fix obscure mlx4_cmd_box parameter in QUERY_DEV_CAP · 401453a3

由 Jack Morgenstein 提交于 5月 30, 2012

The "!mlx4_is_slave" is totally confusing.  Fix with
constant MLX4_CMD_NATIVE, which is the intended behavior.
Signed-off-by: NJack Morgenstein <jackm@dev.mellanox.co.il>
Reviewed-by: NOr Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

401453a3

net/mlx4_core: Check port out-of-range before using in mlx4_slave_cap · 6230bb23

由 Jack Morgenstein 提交于 5月 30, 2012

The range check was performed after using the port number.

Reverse this to prevent a potential array overflow.
Signed-off-by: NJack Morgenstein <jackm@dev.mellanox.co.il>
Reviewed-by: NOr Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

6230bb23

net/mlx4_core: Fixes for VF / Guest startup flow · b91cb3eb

由 Jack Morgenstein 提交于 5月 30, 2012

- pass the following parameters:
  - firmware version (added QUERY_FW paravirtualization for that)

  - disable Blueflame on slaves. KVM disables write combining on guests,
    and we get better performance without BF in this case. (This requires
    QUERY_DEV_CAP paravirtualization, also in this commit)

  - max qp rdma as destination

- get rid of a chunk of "if (0)" dead code
Signed-off-by: NJack Morgenstein <jackm@dev.mellanox.co.il>
Reviewed-by: NOr Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

b91cb3eb

net/mlx4_en: Fix improper use of "port" parameter in mlx4_en_event · 13bf58b7

由 Jack Morgenstein 提交于 5月 30, 2012

Port is used as an array index before we know if that is proper.

For example, in the catas event case, port is zero; however,
the port index should lie in the range (1..2).

Fix this by using 'port' only in the events where it is of interest.

Test for port out of range in the default (unhandled event) case,
and do not output a message if it is not an ethernet port.
Signed-off-by: NJack Morgenstein <jackm@dev.mellanox.co.il>
Reviewed-by: NOr Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

13bf58b7

net/mlx4_core: Fix number of EQs used in ICM initialisation · 3fc929e2

由 Marcel Apfelbaum 提交于 5月 30, 2012

In SRIOV mode, the number of EQs used when computing the total ICM size
was incorrect.

To fix this, we do the following:
1. We add a new structure to mlx4_dev, mlx4_phys_caps, to contain physical HCA
   capabilities.  The PPF uses the phys capabilities when it computes things
   like ICM size.

   The dev_caps structure will then contain the paravirtualized values, making
   bookkeeping much easier in SRIOV mode. We add a structure rather than a
   single parameter because there will be other fields in the phys_caps.

   The first field we add to the mlx4_phys_caps structure is num_phys_eqs.

2. In INIT_HCA, when running in SRIOV mode, the "log_num_eqs" parameter
   passed to the FW is the number of EQs per VF/PF; each function (PF or VF)
   has this number of EQs available.

   However, the total number of EQs which must be allowed for in the ICM is
   (1 << log_num_eqs) * (#VFs + #PFs).  Rather than compute this quantity,
   we allocate ICM space for 1024 EQs (which is the device maximum
   number of EQs, and which is the value we place in the mlx4_phys_caps structure).

   For INIT_HCA, however, we use the per-function number of EQs as described
   above.
Signed-off-by: NMarcel Apfelbaum <marcela@dev.mellanox.co.il>
Signed-off-by: NJack Morgenstein <jackm@dev.mellanox.co.il>
Reviewed-by: NOr Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

3fc929e2

net/mlx4_core: Fix the slave_id out-of-range test in mlx4_eq_int · 30f7c73b

由 Jack Morgenstein 提交于 5月 30, 2012

Ths fixes the comparison in the FLR (Function Level Reset) event case.
Signed-off-by: NJack Morgenstein <jackm@dev.mellanox.co.il>
Reviewed-by: NOr Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

30f7c73b

18 5月, 2012 1 次提交

net/mlx4_en: num cores tx rings for every UP · bc6a4744

由 Amir Vadai 提交于 5月 17, 2012

Change the TX ring scheme such that the number of rings for untagged packets
and for tagged packets (per each of the vlan priorities) is the same, unlike
the current situation where for tagged traffic there's one ring per priority
and for untagged rings as the number of core.

Queue selection is done as follows:

If the mqprio qdisc is operates on the interface, such that the core networking
code invoked the device setup_tc ndo callback, a mapping of skb->priority =>
queue set is forced - for both, tagged and untagged traffic.

Else, the egress map skb->priority => User priority is used for tagged traffic, and
all untagged traffic is sent through tx rings of UP 0.

The patch follows the convergence of discussing that issue with John Fastabend
over this thread http://comments.gmane.org/gmane.linux.network/229877

Cc: John Fastabend <john.r.fastabend@intel.com>
Cc: Liran Liss <liranl@mellanox.com>
Signed-off-by: NAmir Vadai <amirv@mellanox.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

bc6a4744

16 5月, 2012 8 次提交

net/mlx4_core: Fixed error flow in rem_slave_eqs · eb71d0d6

由 Jack Morgenstein 提交于 5月 15, 2012

Signed-off-by: NJack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: NOr Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

eb71d0d6

net/mlx4_core: Add XRC domains and counters to resource tracker · ba062d52

由 Jack Morgenstein 提交于 5月 15, 2012

Add missing resource tracking for XRC domains and complete the tracking for HCA
network flow counters.
Signed-off-by: NJack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: NOr Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

ba062d52

net/mlx4_core: Fix potential kernel Oops in res tracker during Dom0 driver unload · b8924951

由 Jack Morgenstein 提交于 5月 15, 2012

Currently the slave and master resources are deleted after master freed
all bitmaps. If any resources were not properly cleaned up during the
shutdown process, an Oops would result.

Fix so that delete slave (only) resources during cleanup. Master resources
are cleaned up during unload process, and need not separately be cleaned.

Note that during cleanup, we need to split the resource-tracker freeing
functionality.

Before removing all the bitmaps, we free any leftover slave resources.
However, we can only remove the resource tracker linked list after
all bitmap frees, since some of the freeing functions (e.g.,
mlx4_cleanup_eq_table) use paravirtualized FW commands which expect
the resource tracker linked list to be present.
Found-by: NAviad Yehezkel <aviadye@mellanox.com>
Signed-off-by: NJack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: NOr Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

b8924951

net/mlx4_core: Do not reset module-parameter num_vfs when fail to enable sriov · 681372a7

由 Jack Morgenstein 提交于 5月 15, 2012

Consider the following scenario: 2 HCAs, where only one of which can run SRIOV.

If we reset the module parameter, all the VFs of the SRIOV HCA will be
claimed by the PPF host (-- the code relies on num_vfs being non-zero
to avoid this claiming, and num_vfs was reset when pci_enable_sriov failed
for the non-SRIOV HCA).

The solution is not to touch the num_vfs parameter.

Also, eliminate the unneeded check of num_vfs when disabling sriov
(the dev flag bit is sufficient).
Signed-off-by: NJack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: NOr Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

681372a7

net/mlx4_core: Remove unused *_str functions from the resource tracker · b9985f41

由 Jack Morgenstein 提交于 5月 15, 2012

Removed unsued *_str helper functions from resource_tracker.c
Signed-off-by: NJack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: NOr Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

b9985f41

net/mlx4_core: Change SYNC_TPT to be native (not wrapped) · 5e92d803

由 Jack Morgenstein 提交于 5月 15, 2012

The "wrapped" was incorrect, since no wrapper function was defined.
Signed-off-by: NJack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: NOr Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

5e92d803

net/mlx4_core: Fix init_port mask state for slaves · 8bac9ede

由 Jack Morgenstein 提交于 5月 15, 2012

In function mlx4_INIT_PORT_wrapper, the port state mask for the
slave is only set if we are invoking the INIT_PORT fw command.

However, the reference count for the (initialized) port is
incremented anyway.

This creates a problem in that when we have multiple slaves,
then the CLOSE_PORT command will never be invoked. The
reason is that in the CLOSE_PORT wrapper, if the port-state
mask is zero for the slave (which it is), the wrapper returns
without doing anything. The only slave which will not return
immediately in the CLOSE_PORT wrapper is that slave for which
INIT_PORT was invoked.

The fix is to not have the port-state mask setting depend
on the logic for calling the INIT_PORT fw command.
Signed-off-by: NJack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: NOr Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

8bac9ede

net/mlx4: Address build warnings on set but not used variables · 162344ed

由 Or Gerlitz 提交于 5月 15, 2012

Handle the compiler warnings on variables which are set but not used
by removing the relevant variable or casting a return value which is
ignored on purpose to void.
Signed-off-by: NOr Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

162344ed

15 5月, 2012 1 次提交

mlx4_core: Change bitmap allocator to work in round-robin fashion · f4ec9e95

由 Jack Morgenstein 提交于 5月 10, 2012

Under most circumstances, the bitmap allocator does not allocate the
same full 24-bit QP number immediately after a QP is destroyed.

This works by using the upper bits of a 24-bit QP number, beyond the
number of QPs that are actually available in the low level driver.
For example, say that the HCA is willing to allocate a maximum of 64K
qps.  We use the bits 23..16 as a "counter" which is incremented by 1
at each allocation so that even if the same physical QP is
re-allocated, it will not receive the same 24-bit QP number.

However, we have seen the following scenario:
1. Allocate, say, 255 QPs in succession.  This will cause a wrap of the "counter".
2. Destroy the first QP allocated, then allocate a new QP.  The new QP,
   because of the counter wraparound, will get the same FULL QP number as
   the QP just destroyed!

This is a problem because packets in transit can be erroneously
delivered to the new QP when they were meant for the old (destroyed)
QP, because the full QP number of the new QP is identical to the
destroyed QP.  (The "counter" mechanism is meant to prevent this by
having the full 24-bit QP numbers differ even if the physical QP on
the HCA is the same.  As we see above, however, this mechanism does
not always work).

The best fix for this problem is to allocate QPs in round-robin mode,
so that the physical QP numbers are not immediately re-used.
Found-by: NMatthew Finlay <matt@mellanox.com>
Signed-off-by: NJack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: NRoland Dreier <roland@purestorage.com>

f4ec9e95

09 5月, 2012 1 次提交

mlx4_core: Add second capabilities flags field · b3416f44

由 Shlomo Pongratz 提交于 4月 29, 2012

This patch adds a 64-bit flags2 features member to struct mlx4_dev to
export further features of the hardware.  The original flags field
tracks features whose support bits are advertised by the firmware in
offsets 0x40 and 0x44 of the query device capabilities command.
flags2 will track features whose support bits are scattered at various
offsets.

RSS support is the first feature to be exported through flags2.  RSS
capabilities are located at offset 0x2e.  The size of the RSS
indirection table is also given in this offset.
Signed-off-by: NShlomo Pongratz <shlomop@mellanox.com>
Signed-off-by: NRoland Dreier <roland@purestorage.com>

b3416f44

24 4月, 2012 3 次提交

mlx4_en: Byte Queue Limit support · 5b263f53

由 Yevgeny Petrilin 提交于 4月 23, 2012

Signed-off-by: NYevgeny Petrilin <yevgenyp@mellanox.co.il>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

5b263f53

mlx4_en: Moving to Interrupts for TX completions · e22979d9

由 Yevgeny Petrilin 提交于 4月 23, 2012

Moving to interrupts instead of polling fpr TX completions
Avoiding situations where skb can be held in by the driver for
a long time (till timer expires).
The change is also necessary for supporting BQL.

Removing comp_lock that was required because we could handle TX
completions from several contexts: Interrupts, timer, polling.
Now there is only interrupts
Signed-off-by: NYevgeny Petrilin <yevgenyp@mellanox.co.il>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

e22979d9

Y
mlx4_en: Added Ethtool support for TX Interrupt coalescing · a19a848a
由 Yevgeny Petrilin 提交于 4月 23, 2012
```
Signed-off-by: NYevgeny Petrilin <yevgenyp@mellanox.co.il>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>
```
a19a848a

16 4月, 2012 1 次提交

drivers/net: fix unresolved 64bit math in mellanox/mlx4/en_dcb_nl.c · 43c880df

由 Paul Gortmaker 提交于 4月 15, 2012

Commit 109d2446

    "net/mlx4_en: Set max rate-limit for a TC"

introduced 64 bit math operations into mlx4_en_dcbnl_ieee_setmaxrate()

causing the following final link failure on an x86_32 allmodconfig

  ERROR: "__udivdi3" [drivers/net/ethernet/mellanox/mlx4/mlx4_en.ko] undefined!

Convert it to use div_u64() instead.

Cc: Amir Vadai <amirv@mellanox.com>
Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: NPaul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

43c880df

15 4月, 2012 1 次提交

net: Fix spelling typo in net · fd9071ec

由 Masanari Iida 提交于 4月 13, 2012

Correct spelling typo within drivers/net.
Signed-off-by: NMasanari Iida <standby24x7@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

fd9071ec

05 4月, 2012 6 次提交

net/mlx4_en: Set max rate-limit for a TC · 109d2446

由 Amir Vadai 提交于 4月 04, 2012

This patch is using the DCB netlink to set rate limit per ETS TC
Values are accepted in Kbps and rounded up to the nearest multiply of 100Mbps.
Signed-off-by: NAmir Vadai <amirv@mellanox.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

109d2446

net/mlx4_en: sk_prio <=> UP for untagged traffic · 897d7846

由 Amir Vadai 提交于 4月 04, 2012

Since vlan egress map is only good for tagged traffic, need to have other
mapping to be used by untagged traffic.
For that, the driver uses sch_mqprio mapping. This mapping could be set by
using tc tool from iproute2 package.
Mapped UP will be used by the HW for QoS purposes, but won't go out on the
wire.
Signed-off-by: NAmir Vadai <amirv@mellanox.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

897d7846

net/mlx4_en: DCB QoS support · 564c274c

由 Amir Vadai 提交于 4月 04, 2012

Set TSA, promised BW and PFC using IEEE 802.1qaz netlink commands.
Signed-off-by: NAmir Vadai <amirv@mellanox.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

564c274c

net/mlx4_core: set port QoS attributes · e5395e92

由 Amir Vadai 提交于 4月 04, 2012

Adding QoS firmware commands:
- mlx4_en_SET_PORT_PRIO2TC - set UP <=> TC
- mlx4_en_SET_PORT_SCHEDULER - set promised BW, max BW and PG number
Signed-off-by: NAmir Vadai <amirv@mellanox.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

e5395e92

net/mlx4_en: Force user priority by QP attribute · 0e98b523

由 Amir Vadai 提交于 4月 04, 2012

Instead of relying on HW to change schedule queue by UP, schedule
queue is fixed for a tx_ring, and UP in WQE is ignored in this aspect.  This
resolves two issues with untagged traffic:
1. untagged traffic has no UP in packet which is needed for QoS. The change
   above allows setting the schedule queue (and by that the UP) of such a stream.
2. BlueFlame uses the same field used by vlan tag. So forcing UP from QPC
   allows using BF for untagged but prioritized traffic.

In old firmware that force UP is not supported, untagged traffic will not subject to
QoS.

Because UP is set by QP, need to always have a tx ring per UP, even if pfcrx
module paramter is false.
Signed-off-by: NAmir Vadai <amirv@mellanox.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

0e98b523

mlx4: allocate just enough pages instead of always 4 pages · 117980c4

由 Thadeu Lima de Souza Cascardo 提交于 4月 04, 2012

The driver uses a 2-order allocation, which is too much on architectures
like ppc64, which has a 64KiB page. This particular allocation is used
for large packet fragments that may have a size of 512, 1024, 4096 or
fill the whole allocation. So, a minimum size of 16384 is good enough
and will be the same size that is used in architectures of 4KiB sized
pages.

This will avoid allocation failures that we see when the system is under
stress, but still has plenty of memory, like the one below.

This will also allow us to set the interface MTU to higher values like
9000, which was not possible on ppc64 without this patch.

Node 1 DMA: 737*64kB 37*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB 0*8192kB 0*16384kB = 51904kB
83137 total pagecache pages
0 pages in swap cache
Swap cache stats: add 0, delete 0, find 0/0
Free swap  = 10420096kB
Total swap = 10420096kB
107776 pages RAM
1184 pages reserved
147343 pages shared
28152 pages non-shared
netstat: page allocation failure. order:2, mode:0x4020
Call Trace:
[c0000001a4fa3770] [c000000000012f04] .show_stack+0x74/0x1c0 (unreliable)
[c0000001a4fa3820] [c00000000016af38] .__alloc_pages_nodemask+0x618/0x930
[c0000001a4fa39a0] [c0000000001a71a0] .alloc_pages_current+0xb0/0x170
[c0000001a4fa3a40] [d00000000dcc3e00] .mlx4_en_alloc_frag+0x200/0x240 [mlx4_en]
[c0000001a4fa3b10] [d00000000dcc3f8c] .mlx4_en_complete_rx_desc+0x14c/0x250 [mlx4_en]
[c0000001a4fa3be0] [d00000000dcc4eec] .mlx4_en_process_rx_cq+0x62c/0x850 [mlx4_en]
[c0000001a4fa3d20] [d00000000dcc5150] .mlx4_en_poll_rx_cq+0x40/0x90 [mlx4_en]
[c0000001a4fa3dc0] [c0000000004e2bb8] .net_rx_action+0x178/0x450
[c0000001a4fa3eb0] [c00000000009c9b8] .__do_softirq+0x118/0x290
[c0000001a4fa3f90] [c000000000031df8] .call_do_softirq+0x14/0x24
[c000000184c3b520] [c00000000000e700] .do_softirq+0xf0/0x110
[c000000184c3b5c0] [c00000000009c6d4] .irq_exit+0xb4/0xc0
[c000000184c3b640] [c00000000000e964] .do_IRQ+0x144/0x230
Signed-off-by: NThadeu Lima de Souza Cascardo <cascardo@linux.vnet.ibm.com>
Signed-off-by: NKleber Sacilotto de Souza <klebers@linux.vnet.ibm.com>
Tested-by: NKleber Sacilotto de Souza <klebers@linux.vnet.ibm.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

117980c4

openeuler / Kernel 1 年多 前同步成功

openeuler / Kernel
1 年多前同步成功