Commits · e1f78ecdfd59d560c42bc04b9bfb746ef8a9dfb1 · openeuler / Kernel

15 Mar, 2021 24 commits

mlxsw: spectrum: Remove unnecessary RCU read-side critical section · e1f78ecd

Committed by Ido Schimmel on Mar 14, 2021

Since commit 7d8e8f34 ("mlxsw: core: Increase scope of RCU read-side
critical section"), all Rx handlers are called from an RCU read-side
critical section.

Remove the unnecessary rcu_read_lock() / rcu_read_unlock().
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

e1f78ecd

mlxsw: pci: Set extra metadata in skb control block · 5ab6dc9f

Committed by Ido Schimmel on Mar 14, 2021

Packets that are mirrored / sampled to the CPU have extra metadata
encoded in their corresponding Completion Queue Element (CQE). Retrieve
this metadata from the CQE and set it in the skb control block so that
it can be accessed by the switch driver (i.e., 'mlxsw_spectrum').
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

5ab6dc9f

mlxsw: Create dedicated field for Rx metadata in skb control block · d4cabaad

Committed by Ido Schimmel on Mar 14, 2021

The next patch will need to encode more Rx metadata in the skb control
block, so create a dedicated field for it and move the cookie index
there.
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

d4cabaad

mlxsw: pci: Add more metadata fields to CQEv2 · e0eeede3

Committed by Ido Schimmel on Mar 14, 2021

The Completion Queue Element version 2 (CQEv2) includes various metadata
fields for packets that are mirrored / sampled to the CPU.

Add these fields so that they can be used by a later patch.
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

e0eeede3

selftests: netdevsim: Test psample functionality · f26b3091

Committed by Ido Schimmel on Mar 14, 2021

Test various aspects of psample functionality over netdevsim and in
particular test that the psample module correctly reports the provided
metadata.

Example:

 # ./psample.sh
 TEST: psample enable / disable                                      [ OK ]
 TEST: psample group number                                          [ OK ]
 TEST: psample metadata                                              [ OK ]
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

f26b3091

netdevsim: Add dummy psample implementation · a8700c3d

Committed by Ido Schimmel on Mar 14, 2021

Allow netdevsim to report "sampled" packets to the psample module by
periodically generating packets from a work queue. The behavior can be
enabled / disabled (default) and the various metadata attributes can be
controlled via debugfs knobs.

This implementation enables both testing of the psample module with all
the optional attributes as well as development of user space
applications on top of psample such as hsflowd and a Wireshark dissector
for psample generic netlink packets.
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

a8700c3d

psample: Add additional metadata attributes · 07e1a580

Committed by Ido Schimmel on Mar 14, 2021

Extend psample to report the following attributes when available:

* Output traffic class as a 16-bit value
* Output traffic class occupancy in bytes as a 64-bit value
* End-to-end latency of the packet at nanosecond resolution
* Software timestamp at nanosecond resolution (always available)
* Packet's protocol. Needed for packet dissection in user space (always
  available)
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

07e1a580

psample: Encapsulate packet metadata in a struct · a03e99d3

Committed by Ido Schimmel on Mar 14, 2021

Currently, callers of psample_sample_packet() pass three metadata
attributes: Ingress port, egress port and truncated size. Subsequent
patches are going to add more attributes (e.g., egress queue occupancy),
which also need an indication whether they are valid or not.

Encapsulate packet metadata in a struct in order to keep the number of
arguments reasonable.
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

a03e99d3
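
The idea in the commit above, bundling per-packet metadata together with "is this field valid" indications, can be sketched in user-space C as follows. This is only an illustration under stated assumptions: the struct name, field names, and the helper `sample_optional_attr_count()` are hypothetical and are not taken from the kernel sources.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for the packet metadata passed to a sampling
 * function. Optional attributes carry a validity flag so the reporter
 * can tell "value is zero" apart from "value was not provided".
 */
struct sample_metadata {
    uint32_t trunc_size;   /* truncated size of the sampled packet */
    int in_ifindex;        /* ingress port */
    int out_ifindex;       /* egress port */
    uint16_t out_tc;       /* output traffic class */
    uint64_t out_tc_occ;   /* output traffic class occupancy, bytes */
    uint64_t latency;      /* end-to-end latency, nanoseconds */
    bool out_tc_valid;
    bool out_tc_occ_valid;
    bool latency_valid;
};

/* Count how many optional attributes a reporter would emit. */
static size_t sample_optional_attr_count(const struct sample_metadata *md)
{
    size_t n = 0;

    if (md->out_tc_valid)
        n++;
    if (md->out_tc_occ_valid)
        n++;
    if (md->latency_valid)
        n++;
    return n;
}
```

Passing one pointer to such a struct keeps the function signature stable when further attributes are added later.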

Merge branch 'skbuff-micro-optimize-flow-dissection' · c6baf7ee

Committed by David S. Miller on Mar 14, 2021

Alexander Lobakin says:

====================
skbuff: micro-optimize flow dissection

This little number makes all of the flow dissection functions take
raw input data pointer as const (1-5) and shuffles the branches in
__skb_header_pointer() according to their hit probability.

The result is +20 Mbps per flow/core with one Flow Dissector pass
per packet. This affects RPS (with software hashing), drivers that
use eth_get_headlen() on their Rx path and so on.

From v2 [1]:
 - reword some commit messages as a potential fix for NIPA;
 - no functional changes.

From v1 [0]:
 - rebase on top of the latest net-next. This was super-weird, but
   I double-checked that the series applies with no conflicts, and
   then on Patchwork it didn't;
 - no other changes.

[0] https://lore.kernel.org/netdev/20210312194538.337504-1-alobakin@pm.me
[1] https://lore.kernel.org/netdev/20210313113645.5949-1-alobakin@pm.me
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

c6baf7ee

skbuff: micro-optimize {,__}skb_header_pointer() · d206121f

Committed by Alexander Lobakin on Mar 14, 2021

{,__}skb_header_pointer() helpers exist mainly for preventing
accesses-beyond-end of the linear data.
In the vast majority of cases, they bail out on the first condition.
All code going after is mostly a fallback.
Mark the most common branch as 'likely' one to move it in-line.
Also, skb_copy_bits() can return negative values only when the input
arguments are invalid, e.g. offset is greater than skb->len. It can
be safely marked as 'unlikely' branch, assuming that hotpath code
provides sane input to not fail here.

These two changes bump the throughput with a single Flow Dissector pass on
every packet (e.g. with RPS or a driver that uses eth_get_headlen())
by 20 Mbps per flow/core.
Signed-off-by: Alexander Lobakin <alobakin@pm.me>
Signed-off-by: David S. Miller <davem@davemloft.net>

d206121f
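
The branch-hinting pattern described in the commit above can be sketched outside the kernel like this. It is a simplified user-space model, not the kernel code: `struct toy_buf`, `toy_copy_bits()` and `toy_header_pointer()` are hypothetical names, and the `likely` / `unlikely` macros are defined here via the GCC/Clang builtin `__builtin_expect`.

```c
#include <stddef.h>

#define likely(x)   __builtin_expect(!!(x), 1)
#define unlikely(x) __builtin_expect(!!(x), 0)

/* Toy buffer: a linear area followed by one non-linear fragment. */
struct toy_buf {
    const unsigned char *linear;
    size_t linear_len;
    const unsigned char *frag;
    size_t frag_len;
};

/* Slow path: copy bytes that may span both areas; fails on bad input. */
static int toy_copy_bits(const struct toy_buf *b, size_t offset,
                         void *to, size_t len)
{
    size_t total = b->linear_len + b->frag_len;
    unsigned char *dst = to;
    size_t i;

    if (offset > total || len > total - offset)
        return -1;
    for (i = 0; i < len; i++) {
        size_t pos = offset + i;

        dst[i] = pos < b->linear_len ? b->linear[pos]
                                     : b->frag[pos - b->linear_len];
    }
    return 0;
}

/* Return a pointer to len bytes at offset, copying only when needed. */
static const void *toy_header_pointer(const struct toy_buf *b, size_t offset,
                                      size_t len, void *scratch)
{
    /* Common case: the requested range lies fully in the linear area. */
    if (likely(offset <= b->linear_len && len <= b->linear_len - offset))
        return b->linear + offset;

    /* Fallback: copy into the scratch buffer; failure is not expected. */
    if (unlikely(toy_copy_bits(b, offset, scratch, len) < 0))
        return NULL;
    return scratch;
}
```

The hints do not change behavior; they only tell the compiler which branch to lay out as the fall-through path.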

ethernet: constify eth_get_headlen()'s data argument · 59753ce8

Committed by Alexander Lobakin on Mar 14, 2021

It's used only for flow dissection, which now takes constant data
pointers.
Signed-off-by: Alexander Lobakin <alobakin@pm.me>
Signed-off-by: David S. Miller <davem@davemloft.net>

59753ce8

linux/etherdevice.h: misc trailing whitespace cleanup · 805a25f3

Committed by Alexander Lobakin on Mar 14, 2021

Caught by the text editor. Fix it separately from the actual changes.
Signed-off-by: Alexander Lobakin <alobakin@pm.me>
Signed-off-by: David S. Miller <davem@davemloft.net>

805a25f3

flow_dissector: constify raw input data argument · f96533cd

Committed by Alexander Lobakin on Mar 14, 2021

Flow Dissector code never modifies the input buffer, neither skb nor
raw data.
Make the 'data' argument const for all of the Flow Dissector's functions.
Signed-off-by: Alexander Lobakin <alobakin@pm.me>
Signed-off-by: David S. Miller <davem@davemloft.net>

f96533cd

skbuff: make __skb_header_pointer()'s data argument const · e3305138

Committed by Alexander Lobakin on Mar 14, 2021

The function never modifies the input buffer, so the 'data' argument
can be marked as const.
This implies one harmless cast-away.
Signed-off-by: Alexander Lobakin <alobakin@pm.me>
Signed-off-by: David S. Miller <davem@davemloft.net>

e3305138

flow_dissector: constify bpf_flow_dissector's data pointers · dac06b32

Committed by Alexander Lobakin on Mar 14, 2021

BPF Flow dissection programs are read-only and don't touch input
buffers.
Mark 'data' and 'data_end' in struct bpf_flow_dissector as const
in preparation for global input constifying.
Signed-off-by: Alexander Lobakin <alobakin@pm.me>
Signed-off-by: David S. Miller <davem@davemloft.net>

dac06b32

Merge branch 'gro-micro-optimize-dev_gro_receive' · 3f79eb3c

Committed by David S. Miller on Mar 14, 2021

Alexander Lobakin says:

====================
gro: micro-optimize dev_gro_receive()

This random series addresses some of the suboptimal constructions used
in the main GRO entry point.
The main body is gro_list_prepare() simplification and pointer usage
optimization in dev_gro_receive() itself. Being mostly cosmetic, it
gives like +10 Mbps on my setup to both TCP and UDP (both single- and
multi-flow).

Since v1 [0]:
 - drop the replacement of bucket index calculation with
   reciprocal_scale() since it makes absolutely no sense (Eric);
 - improve stack usage in dev_gro_receive() (Eric);
 - reverse the order of patches to avoid changes superseding.

[0] https://lore.kernel.org/netdev/20210312162127.239795-1-alobakin@pm.me
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

3f79eb3c

gro: give 'hash' variable in dev_gro_receive() a less confusing name · d0eed5c3

Committed by Alexander Lobakin on Mar 13, 2021

'hash' stores not the flow hash, but the index of the GRO bucket
corresponding to it.
Change its name to 'bucket' to avoid confusion while reading lines
like '__set_bit(hash, &napi->gro_bitmask)'.
Signed-off-by: Alexander Lobakin <alobakin@pm.me>
Signed-off-by: David S. Miller <davem@davemloft.net>

d0eed5c3

gro: consistentify napi->gro_hash[x] access in dev_gro_receive() · 9dc2c313

Committed by Alexander Lobakin on Mar 13, 2021

GRO bucket index doesn't change through the entire function.
Store a pointer to the corresponding bucket instead of its member
and use it consistently through the function.
It is performance-safe since &gro_list->list == gro_list.

Misc: remove superfluous braces around single-line branches.
Signed-off-by: Alexander Lobakin <alobakin@pm.me>
Signed-off-by: David S. Miller <davem@davemloft.net>

9dc2c313

gro: simplify gro_list_prepare() · 0ccf4d50

Committed by Alexander Lobakin on Mar 13, 2021

gro_list_prepare() always returns &napi->gro_hash[bucket].list,
without any variations. Moreover, it uses 'napi' argument only to
have access to this list, and calculates the bucket index for the
second time (firstly it happens at the beginning of
dev_gro_receive()) to do that.
Given that dev_gro_receive() already has an index to the needed
list, just pass it as the first argument to eliminate redundant
calculations, and make gro_list_prepare() return void.
Also, both arguments of gro_list_prepare() can be constified since
this function can only modify the skbs from the bucket list.
Signed-off-by: Alexander Lobakin <alobakin@pm.me>
Signed-off-by: David S. Miller <davem@davemloft.net>

0ccf4d50
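
The three GRO commits above share one theme: derive the bucket index from the flow hash once, keep a pointer to that bucket, and use it everywhere. A minimal user-space sketch of that shape follows. All `toy_` names are hypothetical, and the bucket count of 8 (a power of two, so the index can be taken with a mask) is an assumption made for this example.

```c
#include <stdint.h>

#define TOY_GRO_HASH_BUCKETS 8 /* assumed power of two */

struct toy_gro_list {
    unsigned int count; /* packets currently held in this bucket */
};

struct toy_napi {
    unsigned long gro_bitmask; /* one bit per non-empty bucket */
    struct toy_gro_list gro_hash[TOY_GRO_HASH_BUCKETS];
};

/*
 * 'bucket' is the index of the GRO bucket, not the flow hash itself.
 * It is computed once, and the bucket pointer is reused afterwards
 * instead of re-indexing napi->gro_hash[] at every access.
 */
static uint32_t toy_gro_receive(struct toy_napi *napi, uint32_t flow_hash)
{
    const uint32_t bucket = flow_hash & (TOY_GRO_HASH_BUCKETS - 1);
    struct toy_gro_list *gro_list = &napi->gro_hash[bucket];

    gro_list->count++;
    napi->gro_bitmask |= 1UL << bucket;
    return bucket;
}
```

A helper that only needs the bucket list can then take `gro_list` directly, which removes the need to recompute the index inside it.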

net: dsa: bcm_sf2: Fill in BCM4908 CFP entries · f4e6d7cd

Committed by Florian Fainelli on Mar 12, 2021

The BCM4908 switch has 256 CFP entries; update that setting so CFP can be
used.
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

f4e6d7cd

hv_netvsc: Add a comment clarifying batching logic · bd49fea7

Committed by Shachar Raindel on Mar 12, 2021

The batching logic in netvsc_send is non-trivial, due to
a combination of the Linux API and the underlying hypervisor
interface. Add a comment explaining why the code is written this
way.
Signed-off-by: Shachar Raindel <shacharr@microsoft.com>
Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Reviewed-by: Dexuan Cui <decui@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

bd49fea7

Merge branch 'pktgen-scripts-improvements' · 0f88e6f3

Committed by David S. Miller on Mar 14, 2021

Igor Russkikh says:

====================
pktgen: scripts improvements

Please consider these small improvements to the pktgen scripts we use in our environment.

Adding a delay parameter through the command line,
Adding a new -a (append) parameter to allow more flexible runs

v3: change us to ns in docs
v2: Review comments from Jesper

CC: Jesper Dangaard Brouer <brouer@redhat.com>
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

0f88e6f3

samples: pktgen: new append mode · c8fd4852

Committed by Igor Russkikh on Mar 11, 2021

To configure various complex flows we can certainly create custom
pktgen init scripts, but sometimes that's not that easy.

The new "-a" (append) option in all the existing sample scripts allows
appending more "devices" to pktgen threads.

The most straightforward use cases for that are:
- using multiple devices. We have to generate full line rate on
  all physical functions (ports) of our multiport device.
- pushing multiple flows (with different packet options)
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

c8fd4852

samples: pktgen: allow to specify delay parameter via new opt · ef700f2e

Committed by Igor Russkikh on Mar 11, 2021

DELAY may now be explicitly specified via the common parameter -w.
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ef700f2e

14 Mar, 2021 16 commits

docs: net: add missing devlink health cmd - trigger · 6f162909

Committed by Jakub Kicinski on Mar 12, 2021

Documentation is missing and it's not very clear what
this callback is for - presumably testing the recovery?
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

6f162909

docs: net: tweak devlink health documentation · 3cc9b29a

Committed by Jakub Kicinski on Mar 12, 2021

Minor tweaks and improvement of wording about the diagnose callback.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

3cc9b29a

net: stmmac: Set FIFO sizes for ipq806x · e127906b

Committed by Jonathan McDowell on Mar 13, 2021

Commit eaf4fac4 ("net: stmmac: Do not accept invalid MTU values")
started using the TX FIFO size to verify what counts as a valid MTU
request for the stmmac driver.  This is unset for the ipq806x variant.
Looking at older patches for this it seems the RX + TX buffers can be
up to 8k, so set them appropriately.

(I sent this as an RFC patch in June last year, but received no replies.
I've been running with this on my hardware (a MikroTik RB3011) since
then with larger MTUs to support both the internal qca8k switch and
VLANs with no problems. Without the patch it's impossible to set the
larger MTU required to support this.)
Signed-off-by: Jonathan McDowell <noodles@earth.li>
Signed-off-by: David S. Miller <davem@davemloft.net>

e127906b

drivers: net: vxlan.c: Fix declaration issue · 6fadbdd6

Committed by Sanjana Srinidhi on Mar 13, 2021

Added a blank line after structure declaration.
This is done to maintain code uniformity.
Signed-off-by: Sanjana Srinidhi <sanjanasrinidhi1810@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

6fadbdd6

net: ethernet: marvell: Fixed typo in the file sky2.c · 65c7bc1b

Committed by Bhaskar Chowdhury on Mar 13, 2021

s/calclation/calculation/
Signed-off-by: Bhaskar Chowdhury <unixbhaskar@gmail.com>
Acked-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

65c7bc1b

Merge branch 'dsa-hewllcreek-dumps' · b8eccf2a

Committed by David S. Miller on Mar 13, 2021

Kurt Kanzenbach says:

====================
net: dsa: hellcreek: Add support for dumping tables

Add support for dumping the VLAN and FDB table via devlink. As the driver uses
internal VLANs and static FDB entries, this is a useful debugging feature.

Changes since v1:

 * Drop memory reporting as there are better APIs to expose this
 * Move comment to VLAN patch

Previous versions:

 * https://lkml.kernel.org/netdev/20210311175344.3084-1-kurt@kmk-computers.de/
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

b8eccf2a

net: dsa: hellcreek: Add devlink FDB region · 292cd449

Committed by Kurt Kanzenbach on Mar 13, 2021

Allow dumping the FDB table via devlink. This is a useful debugging feature.
Signed-off-by: Kurt Kanzenbach <kurt@kmk-computers.de>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

292cd449

net: dsa: hellcreek: Move common code to helper · eb5f3d31

Committed by Kurt Kanzenbach on Mar 13, 2021

There are two functions which need to populate FDB entries. Move that to a
helper function.
Signed-off-by: Kurt Kanzenbach <kurt@kmk-computers.de>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

eb5f3d31

net: dsa: hellcreek: Use boolean value · e81813fb

Committed by Kurt Kanzenbach on Mar 13, 2021

hellcreek_select_vlan() takes a boolean instead of an integer.
So, use false accordingly.
Signed-off-by: Kurt Kanzenbach <kurt@kmk-computers.de>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

e81813fb

net: dsa: hellcreek: Add devlink VLAN region · ba2d1c28

Committed by Kurt Kanzenbach on Mar 13, 2021

Allow dumping the VLAN table via devlink. This is especially useful, because the
driver internally leverages VLANs for the port separation. These are not visible
via the bridge utility.
Signed-off-by: Kurt Kanzenbach <kurt@kmk-computers.de>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ba2d1c28

Merge tag 'batadv-next-pullrequest-20210312' of git://git.open-mesh.org/linux-merge · ebc71a38

Committed by David S. Miller on Mar 13, 2021

Simon Wunderlich says:

====================
There is only a single patch this time:

 - Use netif_rx_any_context(), by Sebastian Andrzej Siewior
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

ebc71a38

Merge branch 'pps-policing' · 361f7e4a

Committed by David S. Miller on Mar 13, 2021

Simon Horman says:

====================
net/sched: act_police: add support for packet-per-second policing

This series enhances the TC policer action implementation to allow a
policer action instance to enforce a rate-limit based on
packets-per-second, configurable using a packet-per-second rate and burst
parameters.

In the hope of aiding review this is broken up into three patches.

* [PATCH 1/3] flow_offload: add support for packet-per-second policing

  Add support for this feature to the flow_offload API that is used to allow
  programming flows, including TC rules and their actions, into hardware.

* [PATCH 2/3] flow_offload: reject configuration of packet-per-second policing in offload drivers

  Teach all existing users of the flow_offload API that allow offload of
  policer action instances to reject offload if packet-per-second rate
  limiting is configured: none support it at this time.

* [PATCH 3/3] net/sched: act_police: add support for packet-per-second policing

  With the above groundwork in place, add the new feature to the TC policer
  action itself.

With the above in place the feature may be used.

As follow-ups we plan to provide:
* Corresponding updates to iproute2
* Corresponding self tests (which depend on the iproute2 changes)
* Hardware offload support for the NFP driver

Key changes since v2:
* Added patches 1 and 2, which make adding patch 3 safe for existing
  hardware offload of the policer action
* Re-worked patch 3 so that a TC policer action instance may be configured
  for packet-per-second or byte-per-second rate limiting, but not both.
* Corrected kdoc usage
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

361f7e4a

net/sched: act_police: add support for packet-per-second policing · 2ffe0395

Committed by Baowen Zheng on Mar 12, 2021

Allow a policer action to enforce a rate-limit based on packets-per-second,
configurable using a packet-per-second rate and burst parameters.

e.g.
tc filter add dev tap1 parent ffff: u32 match \
        u32 0 0 police pkts_rate 3000 pkts_burst 1000

Testing was unable to uncover a performance impact of this change on
existing features.
Signed-off-by: Baowen Zheng <baowen.zheng@corigine.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: Louis Peens <louis.peens@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

2ffe0395
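
To make the rate and burst parameters in the commit above concrete, here is a generic token-bucket sketch that limits packets per second. It is not the act_police implementation: the `toy_pps_` names are hypothetical, time is passed in explicitly as nanoseconds, and the rate is assumed to be between 1 and 1,000,000,000 packets per second.

```c
#include <stdbool.h>
#include <stdint.h>

#define TOY_NSEC_PER_SEC 1000000000ULL

/* Credit is kept in nanoseconds: one packet costs 1 s / rate. */
struct toy_pps_policer {
    uint64_t pkt_cost_ns;   /* time credit consumed per packet */
    uint64_t max_credit_ns; /* cap on credit: burst * pkt_cost_ns */
    uint64_t credit_ns;     /* currently available credit */
    uint64_t last_ns;       /* timestamp of the previous check */
};

/* rate_pps must be in 1..1e9 (assumption of this sketch). */
static void toy_pps_init(struct toy_pps_policer *p, uint64_t rate_pps,
                         uint64_t burst_pkts, uint64_t now_ns)
{
    p->pkt_cost_ns = TOY_NSEC_PER_SEC / rate_pps;
    p->max_credit_ns = burst_pkts * p->pkt_cost_ns;
    p->credit_ns = p->max_credit_ns; /* start with a full burst */
    p->last_ns = now_ns;
}

/* Returns true if a packet arriving at now_ns conforms to the rate. */
static bool toy_pps_conform(struct toy_pps_policer *p, uint64_t now_ns)
{
    uint64_t credit = p->credit_ns + (now_ns - p->last_ns);

    if (credit > p->max_credit_ns)
        credit = p->max_credit_ns;
    p->last_ns = now_ns;

    if (credit < p->pkt_cost_ns) {
        p->credit_ns = credit; /* keep the partial credit */
        return false;
    }
    p->credit_ns = credit - p->pkt_cost_ns;
    return true;
}
```

With a rate of 1000 and a burst of 2, two back-to-back packets conform, a third is rejected, and one more packet conforms after a millisecond has passed.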

flow_offload: reject configuration of packet-per-second policing in offload drivers · 6a56e199

Committed by Baowen Zheng on Mar 12, 2021

A follow-up patch will allow users to configure packet-per-second policing
in the software datapath. In preparation for this, teach all drivers that
support offload of the policer action to reject such configuration as
currently none of them support it.
Signed-off-by: Baowen Zheng <baowen.zheng@corigine.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: Louis Peens <louis.peens@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

6a56e199
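
The rejection described in the commit above amounts to a guard at the top of a driver's policer offload path. A minimal sketch, assuming a simplified action descriptor in which a packet rate of 0 means "not configured"; the struct and function names are hypothetical and do not come from any driver:

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical, simplified view of a policer action given to a driver. */
struct toy_police_action {
    uint64_t rate_bytes_ps; /* byte-per-second rate, 0 if unset */
    uint32_t burst_bytes;
    uint64_t rate_pkt_ps;   /* packet-per-second rate, 0 if unset */
    uint32_t burst_pkts;
};

/* Returns 0 on success or a negative errno value. */
static int toy_driver_offload_police(const struct toy_police_action *act)
{
    /* This hardware model cannot police by packet rate: refuse offload. */
    if (act->rate_pkt_ps)
        return -EOPNOTSUPP;

    /* Byte-rate policing would be programmed into hardware here. */
    return 0;
}
```

Refusing the offload leaves the rule to be handled in software rather than silently installing a policer with the wrong semantics.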

flow_offload: add support for packet-per-second policing · 25660156

Committed by Xingfeng Hu on Mar 12, 2021

Allow flow_offload API to configure packet-per-second policing using rate
and burst parameters.

Dummy implementations of tcf_police_rate_pkt_ps() and
tcf_police_burst_pkt() are supplied which return 0, the unconfigured state.
This is to facilitate splitting the offload, driver, and TC code portions of
this feature into separate patches with the aim of providing a logical flow
for review. The implementation of these helpers will be filled in by a
follow-up patch.
Signed-off-by: Xingfeng Hu <xingfeng.hu@corigine.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: Louis Peens <louis.peens@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

25660156

Merge branch 'hns3-imp-phys' · 4849d9be

Committed by David S. Miller on Mar 13, 2021

Huazhong Tan says:

====================
net: hns3: support imp-controlled PHYs

This series adds support for imp-controlled PHYs in the HNS3
ethernet driver.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

4849d9be

openeuler / Kernel synced successfully more than 1 year ago
