Commits · 06efeb555524a8c65ef429f2603885c31a5212b1 · openeuler / Kernel

14 Jun, 2019 15 commits

Documentation: net: mlx5: Devlink health documentation · 06efeb55

Committed by Moshe Shemesh on Jun 01, 2019

Documentation for devlink health reporters supported by mlx5.
Signed-off-by: Moshe Shemesh <moshe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>

net/mlx5: Report devlink health on FW fatal issues · b3bd076f

Committed by Moshe Shemesh on Jan 27, 2019

Report devlink health on FW fatal issues via fw_fatal_reporter. The
driver recovery flow for FW fatal errors is now handled by devlink
health.

With recovery controlled by devlink health, the user can disable
auto-recovery for a debug session and run it manually.

Call mlx5_enter_error_state() before calling devlink_health_report() to
ensure entering device error state even if auto-recovery is off.
Signed-off-by: Moshe Shemesh <moshe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>

net/mlx5: Add support for FW fatal reporter dump · 9b1f2982

Committed by Moshe Shemesh on Jan 16, 2019

Add dump callback support for the mlx5 FW fatal reporter.
The FW fatal dump uses the cr-dump functionality to gather cr-space data
for debugging. The cr-dump uses the vsc interface, which is valid even if
the FW command interface is not functional, as is the case in most FW
fatal errors.

Command example and output:
$ devlink health dump show pci/0000:82:00.0 reporter fw_fatal
 crdump_data:
  00 20 00 01 00 00 00 00 03 00 00 00 00 00 00 00
  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 80
  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  00 00 00 00 00 00 00 00 00 00 00 00 ba 82 00 00
  0c 00 00 00 00 00 00 00 00 00 00 00 00 00 00 20
  00 00 00 00 00 00 00 00 00 00 00 00 00 00 fa 00
  a4 0e 00 00 00 00 00 00 80 c7 fe ff 50 0a 00 00
...
...
Signed-off-by: Moshe Shemesh <moshe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>

net/mlx5: Add fw fatal devlink_health_reporter · 96c82cdf

Committed by Moshe Shemesh on Dec 11, 2018

Create mlx5_devlink_health_reporter for the fw fatal reporter.
The fw fatal reporter is added in addition to the fw reporter and
implements the recover callback.
The point of having two reporters for FW issues is that we don't want
to run FW recovery on every issue, only on fatal ones.
Signed-off-by: Moshe Shemesh <moshe@mellanox.com>
Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>

net/mlx5: Report devlink health on FW issues · d1bf0e2c

Committed by Moshe Shemesh on Dec 11, 2018

Use devlink_health_report() to report any symptom of a FW issue, such as
a FW counter miss or a new health syndrome.
FW issues are detected in mlx5 during poll_health, which is called in
timer atomic context, so the health work queue is used to schedule the
reports.
Signed-off-by: Moshe Shemesh <moshe@mellanox.com>
Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>

net/mlx5: Add support for FW reporter dump · fd1483fe

Committed by Moshe Shemesh on Dec 11, 2018

Add dump callback support for the mlx5 FW reporter.  Once we trigger a FW
dump, the FW writes the core dump to its raw data buffer. The tracer
translates the raw data to traces and saves them to a cyclic array. Once
the dump is done, the saved trace data is filled into the dump buffer. If
the syndrome is not zero, the health buffer content is printed as well.

FW dump example:
$ devlink health dump show pci/0000:82:00.0 reporter fw
 dump fw traces:
   timestamp: 509006640427 lost: false event_id: 185 msg: dump general
info GVMI=0x0000
   timestamp: 509006645474 lost: false event_id: 185 msg: GVMI
management info, gvmi_management context:
   timestamp: 509006654463 lost: false event_id: 185 msg: [000]:
00000000  00000000  00000000  00000000
   timestamp: 509006656127 lost: false event_id: 185 msg: [010]:
00000000  00000000  00000000  00000000
   timestamp: 509006656255 lost: false event_id: 185 msg: [020]:
00000000  00000000  00000000  00000000
   timestamp: 509006656511 lost: false event_id: 185 msg: [030]:
00000000  00000000  00000000  00000000
   timestamp: 509006656639 lost: false event_id: 185 msg: [040]:
00000000  00000000  00000000  00000000
   timestamp: 509006656895 lost: false event_id: 185 msg: [050]:
00000000  00000000  00000000  00000000
   timestamp: 509006657023 lost: false event_id: 185 msg: [060]:
00000000  00000000  00000000  00000000
   timestamp: 509006657180 lost: false event_id: 185 msg: [070]:
00000000  00000000  00000000  00000000
   timestamp: 509006659839 lost: false event_id: 185 msg: CMDIF dbase
from IRON: active_dbase_slots = 0x00000000
   timestamp: 509006667391 lost: false event_id: 185 msg: GVMI=0x0000
hw_toc context:
   timestamp: 509006667647 lost: false event_id: 185 msg: [000]:
00000000  00000000  00000000  fffff000
   timestamp: 509006667775 lost: false event_id: 185 msg: [010]:
00000000  00000000  00000000  80d00000
...
...
Signed-off-by: Moshe Shemesh <moshe@mellanox.com>
Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>

net/mlx5: Create FW devlink_health_reporter · 1e34f3ef

Committed by Moshe Shemesh on Dec 11, 2018

Create mlx5_devlink_health_reporter for the FW reporter. The FW reporter
implements the devlink_health_reporter diagnose callback.

The user can trigger the fw reporter diagnose command at any time to
check the current fw status.
In healthy status, it returns a clear syndrome. Otherwise it returns
the syndrome and a description of the error type.

Command example and output on healthy status:
$ devlink health diagnose pci/0000:82:00.0 reporter fw
Syndrome: 0

Command example and output on unhealthy status:
$ devlink health diagnose pci/0000:82:00.0 reporter fw
Syndrome: 8 Description: unrecoverable hardware error
Signed-off-by: Moshe Shemesh <moshe@mellanox.com>
Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>

net/mlx5: Issue SW reset on FW assert · 3e5b72ac

Committed by Feras Daoud on Nov 12, 2018

If a FW assert is considered fatal, as indicated by a new bit in the
health buffer, reset the FW. After the reset, go through the normal
recovery flow. Only one PF needs to issue the reset, so an attempt is
made to prevent the 2nd function from also issuing the reset.
It's not an error if that happens; it just slows recovery.
Signed-off-by: Feras Daoud <ferasda@mellanox.com>
Signed-off-by: Alex Vesker <valex@mellanox.com>
Signed-off-by: Moshe Shemesh <moshe@mellanox.com>
Signed-off-by: Daniel Jurgens <danielj@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>

net/mlx5: Control CR-space access by different PFs · 1ef6f1a1

Committed by Feras Daoud on Dec 02, 2018

Since the FW can be shared between different PFs/VFs, it is common
for more than one health poll to detect a failure. This can lead to
multiple unneeded resets.

The solution is a FW locking mechanism that uses the semaphore space
so that only one device collects the cr-dump and issues a sw-reset.
Signed-off-by: Feras Daoud <ferasda@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>

net/mlx5: Handle SW reset of FW in error flow · 63cbc552

Committed by Feras Daoud on Nov 12, 2018

New mlx5 adapters allow the driver to reset the FW in the event of an
error; this action is called "SW Reset". When a SW reset is issued on any
PF, all PFs enter the reset state, which is a recoverable condition. The
existing recovery flow was designed to allow the recovery of a VF after
a PF driver reload. This patch adds the sw reset to the NIC states
as preparation for sw reset handling.

When a software reset is issued the following occurs:
1. The NIC interface mode is set to 7 while the reset is in progress.
2. Once the reset completes the NIC interface mode is set to 1.
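
The two-step sequence above can be sketched as a small polling loop. This is only an illustration, not the driver's code: the constant names, `wait_for_reset_done()`, and the `read_mode` callback are hypothetical; only the mode values 7 and 1 come from the commit message.

```python
from typing import Callable

# Mode values taken from the commit message; the names are illustrative.
MODE_RESET_IN_PROGRESS = 7
MODE_RESET_DONE = 1


def wait_for_reset_done(
    read_mode: Callable[[], int],
    max_polls: int = 100,
    sleep: Callable[[], None] = lambda: None,
) -> bool:
    """Poll the NIC interface mode until it reports the reset is complete.

    read_mode is a caller-supplied function returning the current mode.
    Returns True once the mode reads MODE_RESET_DONE, False if max_polls
    reads pass without seeing it.
    """
    for _ in range(max_polls):
        if read_mode() == MODE_RESET_DONE:
            return True
        sleep()  # caller decides how long to wait between polls
    return False
```

Passing the reader and the sleep function in as arguments keeps the sketch testable without hardware.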
Signed-off-by: Feras Daoud <ferasda@mellanox.com>
Signed-off-by: Moshe Shemesh <moshe@mellanox.com>
Signed-off-by: Daniel Jurgens <danielj@mellanox.com>
Reviewed-by: Alex Vesker <valex@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>

net/mlx5: Add Crdump support · 8b9d8baa

Committed by Alex Vesker on Jul 17, 2018

Crdump allows the driver to retrieve a dump of the FW PCI crspace.
This is useful in case of catastrophic issues which may require a FW
reset. The crspace dump can be used for later debugging.
Signed-off-by: Alex Vesker <valex@mellanox.com>
Signed-off-by: Moshe Shemesh <moshe@mellanox.com>
Reviewed-by: Feras Daoud <ferasda@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>

net/mlx5: Add Vendor Specific Capability access gateway · b25bbc2f

Committed by Alex Vesker on Jun 28, 2018

The Vendor Specific Capability (VSC) is used to activate a gateway
interfacing with the device. The gateway is used to read or write
device configurations, which are organized in different domains (spaces).
A configuration access may result in multiple actions (reads and writes).

Example usages are accessing the Crspace domain to read the crspace or
locking a device semaphore using the Semaphore domain.

The configuration access uses pci_cfg_access to prevent parallel access to
the VSC space by the driver and userspace calls.
Signed-off-by: Alex Vesker <valex@mellanox.com>
Signed-off-by: Feras Daoud <ferasda@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>

net/mlx5: Move all devlink related functions calls to devlink.c · 1f28d776

Committed by Eran Ben Elisha on Dec 11, 2018

Centralize all devlink related callbacks in one file.
Later patches will add more functionality; this patch prepares the
driver infrastructure for it.

For now, move the devlink register/unregister function calls into this file.
Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Reviewed-by: Moshe Shemesh <moshe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>

Documentation: net: mlx5: Add mlx5 initial documentation · 00091c0d

Committed by Saeed Mahameed on Jun 11, 2019

Add initial documentation for mlx5 driver.
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>

devlink: Hang reporter's dump method on a dumpit cb · e44ef4e4

Committed by Aya Levin on May 16, 2019

The devlink health reporter provides a dump method on an error. A dump
may contain a large amount of data, in which case the doit cb isn't
sufficient: the user side is blocking and doesn't allow draining of
the socket until the socket runs out of buffers. Using the dumpit cb
is the correct way to go.
Note that the dump op is not yet implemented in any driver, so this
change does not break userspace.

Fixes: 35455e23 ("devlink: Add health dump {get,clear} commands")
Signed-off-by: Aya Levin <ayal@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>

13 Jun, 2019 19 commits

tcp: add optional per socket transmit delay · a842fe14

Committed by Eric Dumazet on Jun 12, 2019

Adding delays to TCP flows is crucial for studying behavior
of TCP stacks, including congestion control modules.

Linux offers the netem module, but it has impractical constraints:
- Need root access to change qdisc
- Hard to set up on egress if combined with a non-trivial qdisc like FQ
- Single delay for all flows.

EDT (Earliest Departure Time) adoption in TCP stack allows us
to enable a per socket delay at a very small cost.

Networking tools can now establish thousands of flows, each of them
with a different delay, simulating real world conditions.

This requires the FQ packet scheduler or an EDT-enabled NIC.

This patch adds the TCP_TX_DELAY socket option, to set a delay in
usec units.

  unsigned int tx_delay = 10000; /* 10 msec */

  setsockopt(fd, SOL_TCP, TCP_TX_DELAY, &tx_delay, sizeof(tx_delay));
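
The same call can be sketched from Python. This is a minimal sketch under assumptions: the numeric value 37 for TCP_TX_DELAY is taken from the Linux UAPI header `<linux/tcp.h>` and is used only if the `socket` module does not expose the constant; `msec_to_usec()` and `set_tx_delay()` are hypothetical helper names, not part of any library.

```python
import socket
import struct

# Assumption: TCP_TX_DELAY is option number 37 in <linux/tcp.h>.
# Prefer the socket module's constant if this Python build provides one.
TCP_TX_DELAY = getattr(socket, "TCP_TX_DELAY", 37)


def msec_to_usec(msec: float) -> int:
    """Convert milliseconds to the microsecond units the option expects."""
    return int(round(msec * 1000))


def set_tx_delay(sock: socket.socket, delay_usec: int) -> bool:
    """Try to set a per-socket transmit delay.

    Returns False instead of raising when the kernel or platform does
    not support the option.
    """
    value = struct.pack("I", delay_usec)  # unsigned int, as in the C example
    try:
        sock.setsockopt(socket.IPPROTO_TCP, TCP_TX_DELAY, value)
    except OSError:
        return False
    return True
```

A caller would create a TCP socket and call `set_tx_delay(sock, msec_to_usec(10))` to request the 10 msec delay shown in the C snippet; as the commit message notes, the delay only takes effect with the FQ packet scheduler or an EDT-enabled NIC.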

Note that FQ packet scheduler limits might need some tweaking :

man tc-fq

PARAMETERS
   limit
       Hard  limit  on  the  real  queue  size. When this limit is
       reached, new packets are dropped. If the value is  lowered,
       packets  are  dropped so that the new limit is met. Default
       is 10000 packets.

   flow_limit
       Hard limit on the maximum  number  of  packets  queued  per
       flow.  Default value is 100.

Use of the TCP_TX_DELAY option will increase the number of skbs in the FQ
qdisc, so packets will be dropped if either of the limits above is hit.

Use of a jump label makes this support runtime-free, for hosts
never using the option.

Also note that TSQ (TCP Small Queues) limits are slightly changed
with this patch: we need to account for artificially delayed skbs so
that they won't stop us from providing more skbs to feed the pipe (netem
uses skb_orphan_partial() for this purpose, but FQ cannot use this trick).

Because of that, using big delays might very well trigger
old bugs in TSO auto defer logic and/or sndbuf limited detection.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'ena-dynamic-queue-sizes' · e0ffbd37

Committed by David S. Miller on Jun 12, 2019

Sameeh Jubran says:

====================
Support for dynamic queue size changes

This patchset introduces the following:
* add new admin command for supporting different queue size for Tx/Rx
* add support for Tx/Rx queues size modification through ethtool
* allow queues allocation backoff when low on memory
* update driver version

Difference from v2:
* Dropped superfluous range checks which are already done in ethtool. [patch 5/7]
* Dropped inline keyword from function. [patch 4/7]
* Added a new patch which drops the inline keyword from all *.c files. [patch 6/7]

Difference from v1:
* Changed ena_update_queue_sizes() signature to use u32 instead of int
  type for the size arguments. [patch 5/7]
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

net: ena: update driver version from 2.0.3 to 2.1.0 · dbbc6e68

Committed by Sameeh Jubran on Jun 11, 2019

Update driver version to match device specification.
Signed-off-by: Sameeh Jubran <sameehj@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: ena: remove inline keyword from functions in *.c · c2b54204

Committed by Sameeh Jubran on Jun 11, 2019

Let the compiler decide if the function should be inline in *.c files.
Signed-off-by: Sameeh Jubran <sameehj@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: ena: add ethtool function for changing io queue sizes · eece4d2a

Committed by Sameeh Jubran on Jun 11, 2019

Implement the set_ringparam() function of the ethtool interface
to enable changing the io queue sizes.
Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com>
Signed-off-by: Sameeh Jubran <sameehj@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: ena: allow queue allocation backoff when low on memory · 13ca32a6

Committed by Sameeh Jubran on Jun 11, 2019

If there is not enough memory to allocate io queues the driver will
try to allocate smaller queues.

The backoff algorithm is as follows:

1. Try to allocate TX and RX, and if successful:
1.1. return success

2. Divide by 2 the size of the larger of RX and TX queues (or both if their size is the same).

3. If TX or RX is smaller than 256
3.1. return failure.
4. else
4.1. go back to 1.
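
The steps above can be sketched as a short loop. This is a minimal sketch of the algorithm as described in the message, not the driver's code: `allocate_with_backoff()` and the `try_alloc` callback are hypothetical names, and the 256 lower bound is taken from step 3.

```python
from typing import Callable, Optional, Tuple

MIN_QUEUE_SIZE = 256  # lower bound from step 3 of the commit message


def allocate_with_backoff(
    tx_size: int,
    rx_size: int,
    try_alloc: Callable[[int, int], bool],
) -> Optional[Tuple[int, int]]:
    """Return the (tx, rx) sizes that were allocated, or None on failure.

    try_alloc is a caller-supplied function that attempts the allocation
    and reports whether it succeeded.
    """
    while True:
        # Step 1: try the current sizes.
        if try_alloc(tx_size, rx_size):
            return tx_size, rx_size
        # Step 2: halve the larger queue, or both if they are equal.
        if tx_size > rx_size:
            tx_size //= 2
        elif rx_size > tx_size:
            rx_size //= 2
        else:
            tx_size //= 2
            rx_size //= 2
        # Step 3: give up once either queue drops below the minimum.
        if tx_size < MIN_QUEUE_SIZE or rx_size < MIN_QUEUE_SIZE:
            return None
        # Step 4: otherwise loop back to step 1.
```

For example, with an allocator that only succeeds when the two sizes sum to at most 2048 entries, a request for TX 1024 and RX 4096 backs off to 1024 and 1024.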

Also change the tx_queue_size, rx_queue_size field names in struct
adapter to requested_tx_queue_size and requested_rx_queue_size, and
use RX and TX queue 0 for actual queue sizes.
Explanation:
The original fields were useless as they were simply used to assign
values once from them to each of the queues in the adapter in ena_probe().
They could simply be deleted. However now that we have a backoff
feature, we have use for them. In case of backoff there is a difference
between the requested queue sizes and the actual sizes. Therefore there
is a need to save the requested queue size for future retries of queue
allocation (for example if allocation failed and then ifdown + ifup was
called we want to start the allocation from the original requested size of
the queues).
Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com>
Signed-off-by: Sameeh Jubran <sameehj@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: ena: make ethtool show correct current and max queue sizes · 9f9ae3f9

Committed by Sameeh Jubran on Jun 11, 2019

Currently ethtool -g shows the same size for current and max queue
sizes.
Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com>
Signed-off-by: Sameeh Jubran <sameehj@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: ena: enable negotiating larger Rx ring size · 31aa9857

Committed by Sameeh Jubran on Jun 11, 2019

Use MAX_QUEUES_EXT get feature capability to query the device.
Signed-off-by: Netanel Belgazal <netanel@amazon.com>
Signed-off-by: Sameeh Jubran <sameehj@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: ena: add MAX_QUEUES_EXT get feature admin command · ba8ef506

Committed by Arthur Kiyanovski on Jun 11, 2019

Add a new admin command to support different queue sizes for Tx/Rx
queues (the change also supports different SQ/CQ sizes).
Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com>
Signed-off-by: Sameeh Jubran <sameehj@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'dpaa2-eth-Add-support-for-MQPRIO-offloading' · f2dec9a2

Committed by David S. Miller on Jun 12, 2019

Ioana Radulescu says:

====================
dpaa2-eth: Add support for MQPRIO offloading

Add support for adding multiple TX traffic classes with mqprio. We can have
up to one netdev queue and hardware frame queue per TC per core.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

dpaa2-eth: Add mqprio support · ab1e6de2

Committed by Ioana Radulescu on Jun 11, 2019

Implement mqprio qdisc support by mapping traffic classes to
different hardware enqueue priorities. The maximum number of
supported traffic classes is an attribute of each DPNI object.

The traffic classes map to hardware priorities from highest (0)
to lowest (highest prio number). The skb priority information
received from the stack is used to select the hardware Tx queue
on which to enqueue the frame.
Signed-off-by: Ioana Radulescu <ruxandra.radulescu@nxp.com>
Signed-off-by: Bogdan Purcareata <bogdan.purcareata@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

dpaa2-eth: Support multiple traffic classes on Tx · 15c87f6b

Committed by Ioana Radulescu on Jun 11, 2019

DPNI objects can have multiple traffic classes, as reflected by
the num_tc attribute. Until now we ignored its value and only
used traffic class 0.

This patch adds support for multiple Tx traffic classes; we have
num_queues x num_tcs hardware queues available for each interface.
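
To make the num_queues x num_tcs count concrete, here is a small sketch that enumerates one hardware queue per (traffic class, queue) pair. The flat layout `tc * num_queues + queue` is an assumption chosen for illustration, and `hw_queue_count()` and `flat_queue_index()` are hypothetical helpers; the commit message does not specify the driver's actual indexing.

```python
def hw_queue_count(num_queues: int, num_tcs: int) -> int:
    """Total hardware Tx queues available for one interface."""
    return num_queues * num_tcs


def flat_queue_index(tc: int, queue: int, num_queues: int, num_tcs: int) -> int:
    """Map a (traffic class, queue) pair to a flat index.

    Assumed layout: all queues of traffic class 0 first, then class 1,
    and so on.
    """
    if not 0 <= tc < num_tcs:
        raise ValueError("traffic class out of range")
    if not 0 <= queue < num_queues:
        raise ValueError("queue out of range")
    return tc * num_queues + queue
```

With 8 queues and 4 traffic classes this gives 32 distinct indices, one per hardware queue.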
Signed-off-by: Ioana Radulescu <ruxandra.radulescu@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

dpaa2-eth: Refactor xps code · 06d5b179

Committed by Ioana Radulescu on Jun 11, 2019

Move the code configuring xps on the netdev TX queues to a
separate function. A subsequent patch will need to call
this in another context as well.
Signed-off-by: Ioana Radulescu <ruxandra.radulescu@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: ethernet: ti: cpts: fix build failure for powerpc · a41efedf

Committed by Grygorii Strashko on Jun 11, 2019

Make TI CPTS depend on the Common CLK framework (COMMON_CLK) to fix the
allyesconfig build for PowerPC:

drivers/net/ethernet/ti/cpts.c: In function 'cpts_of_mux_clk_setup':
drivers/net/ethernet/ti/cpts.c:567:2: error: implicit declaration of function 'of_clk_parent_fill'; did you mean 'of_clk_get_parent_name'? [-Werror=implicit-function-declaration]
  of_clk_parent_fill(refclk_np, parent_names, num_parents);
  ^~~~~~~~~~~~~~~~~~
  of_clk_get_parent_name

Fixes: a3047a81 ("net: ethernet: ti: cpts: add support for ext rftclk selection")
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: dsa: Deal with non-existing PHY/fixed-link · 2131fba5

Committed by Florian Fainelli on Jun 10, 2019

We need to specifically deal with phylink_of_phy_connect() returning
-ENODEV, because this can happen when a CPU/DSA port neither connects
to a PHY nor has a fixed-link property. This is a valid use case that
is permitted by the binding and tells the switch to auto-configure the
port with maximum capabilities.

Fixes: 0e279218 ("net: dsa: Use PHYLINK for the CPU/DSA ports")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: dsa: mv88e6xxx: lock mutex in port_fdb_dump · fcf15367

Committed by Vivien Didelot on Jun 12, 2019

During a port FDB dump operation, the mutex protecting the concurrent
access to the switch registers is currently held by the internal
mv88e6xxx_port_db_dump and mv88e6xxx_port_db_dump_fid helpers.

It must be held at the higher level in mv88e6xxx_port_fdb_dump which
is called directly by DSA through ds->ops->port_fdb_dump. Fix this.
Signed-off-by: Vivien Didelot <vivien.didelot@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

dt-bindings: net: wiznet: add w5x00 support · 0114214e

Committed by Nicolas Saenz Julienne on Jun 12, 2019

Add bindings for Wiznet's w5x00 series of SPI interfaced Ethernet chips.

Based on the bindings for microchip,enc28j60.
Signed-off-by: Nicolas Saenz Julienne <nsaenzjulienne@suse.de>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: ethernet: wiznet: w5X00 add device tree support · b9dd694e

Committed by Nicolas Saenz Julienne on Jun 12, 2019

The w5X00 chip provides an SPI to Ethernet interface. This patch allows
platform devices to be defined through the device tree.
Signed-off-by: Nicolas Saenz Julienne <nsaenzjulienne@suse.de>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: sched: ingress: set 'unlocked' flag for Qdisc ops · 7a096d57

Committed by Vlad Buslov on Jun 12, 2019

To remove rtnl lock dependency in tc filter update API when using ingress
Qdisc, set QDISC_CLASS_OPS_DOIT_UNLOCKED flag in ingress Qdisc_class_ops.

Ingress Qdisc ops don't require any modifications to be used without rtnl
lock on tc filter update path. Ingress implementation never changes its
q->block and only releases it when Qdisc is being destroyed. This means it
is enough for RTM_{NEWTFILTER|DELTFILTER|GETTFILTER} message handlers to
hold ingress Qdisc reference while using it without relying on rtnl lock
protection. Unlocked Qdisc ops support is already implemented in filter
update path by unlocked cls API patch set.
Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

12 Jun, 2019 6 commits

Merge branch 'tls-add-support-for-kernel-driven-resync-and-nfp-RX-offload' · 758a0a4d

Committed by David S. Miller on Jun 11, 2019

Jakub Kicinski says:

====================
tls: add support for kernel-driven resync and nfp RX offload

This series adds TLS RX offload for NFP and completes the offload
by providing resync strategies.  When a TLS data stream loses segments
or experiences reordering, the NIC can no longer perform inline offload.
Resyncs provide information about placement of records in the
stream so that offload can resume.

Existing TLS resync mechanisms are not a great fit for the NFP.
In particular the TX resync is hard to implement for packet-centric
NICs.  This patchset adds an ability to perform TX resync in a way
similar to the way initial sync is done - by calling down to the
driver when new record is created after driver indicated sync had
been lost.

Similarly on the RX side, we try to wait for a gap in the stream
and send record information for the next record.  This works very
well for RPC workloads which are the primary focus at this time.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

nfp: tls: make use of kernel-driven TX resync · 9ed431c1

Committed by Jakub Kicinski on Jun 10, 2019

When TCP stream gets out of sync (driver stops receiving skbs
with expected TCP sequence numbers) request a TX resync from
the kernel.

We try to distinguish retransmissions from missed transmissions
by comparing the sequence number to the expected one: if it is further
ahead than expected, we probably missed packets.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net/tls: add kernel-driven resync mechanism for TX · 50180074

Committed by Jakub Kicinski on Jun 10, 2019

TLS offload drivers keep track of TCP seq numbers to make sure
the packets are fed into the HW in order.

When packets get dropped on the way through the stack, the driver
will get out of sync and have to use fallback encryption, but unless
TCP seq number is resynced it will never match the packets correctly
(or even worse - use incorrect record sequence number after TCP seq
wraps).

Existing drivers (mlx5) feed the entire record on every out-of-order
event, allowing FW/HW to always be in sync.

This patch adds an alternative, more akin to the RX resync.  When the
driver sees a frame which is past its expected sequence number, the
stream must have gotten out of order (if the sequence number is
smaller than expected, it's likely a retransmission which doesn't
require resync).  The driver will ask the stack to perform TX sync
before it submits the next full record, and fall back to software
crypto until the stack has performed the sync.
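
The sequence-number check described above can be sketched as follows. This is an illustration only: `seq_delta()` and `classify()` are hypothetical helpers, and the wraparound-aware comparison over the 32-bit TCP sequence space is an assumption about how "past" and "smaller than" would be evaluated, not code from the patch.

```python
SEQ_MOD = 1 << 32  # TCP sequence numbers are 32-bit and wrap around


def seq_delta(seq: int, expected: int) -> int:
    """Signed distance from expected to seq in 32-bit sequence space."""
    d = (seq - expected) % SEQ_MOD
    return d - SEQ_MOD if d >= SEQ_MOD // 2 else d


def classify(seq: int, expected: int) -> str:
    """Classify a frame by its sequence number relative to the expected one."""
    d = seq_delta(seq, expected)
    if d == 0:
        return "in_order"
    if d > 0:
        # Frame is past the expected sequence number: packets were missed,
        # so the driver would ask the stack for a TX resync.
        return "resync_needed"
    # Sequence number is behind the expected one: likely a retransmission,
    # which does not require a resync.
    return "retransmission"
```

The signed-distance form keeps the comparison correct when the sequence number wraps past 2^32.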
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net/tls: generalize the resync callback · eeb2efaf

Committed by Jakub Kicinski on Jun 10, 2019

Currently only the RX direction is ever resynced; however, TX may
also get out of sequence if packets get dropped on the way to
the driver.  Rename the resync callback and add a direction
parameter.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

nfp: tls: enable TLS RX offload · c0a4948e

Committed by Jakub Kicinski on Jun 10, 2019

Set ethtool TLS RX feature based on NIC capabilities, and enable
TLS RX when connections are added for decryption.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

nfp: tls: implement RX TLS resync · cad228a3

Committed by Dirk van der Merwe on Jun 10, 2019

Enable kernel-controlled RX resync and propagate TLS connection
RX resync from kernel TLS to firmware.
Signed-off-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
