Commits · 0d22a3cf8da164dfc694cc159eabd355f14aba7e · openanolis / cloud-kernel

02 Sep, 2017 8 commits

Merge branch 'mlxsw-next-fixes' · 0d22a3cf

Authored by David S. Miller on Sep 01, 2017

Jiri Pirko says:

====================
mlxsw: spectrum_router: Couple of fixes

Ido Schimmel (2):
  mlxsw: spectrum_router: Trap packets hitting anycast routes
  mlxsw: spectrum_router: Set abort trap in all virtual routers
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

0d22a3cf

mlxsw: spectrum_router: Set abort trap in all virtual routers · 241bc859

Authored by Ido Schimmel on Sep 01, 2017

When the abort mechanism is invoked a default route directing packets to
the CPU is programmed in all the virtual routers currently in use. This
can result in packet loss in case a new VRF is configured.

Upon abort, program the default route in all virtual routers, whether
they are in use or not.

The patch is directed at net-next since post-abort fixes aren't critical
and packet loss due to a missing default route will be insignificant
compared to packet loss caused by the CPU port policer.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

241bc859

mlxsw: spectrum_router: Trap packets hitting anycast routes · d3b6d377

Authored by Ido Schimmel on Sep 01, 2017

I relied on the fact that anycast routes use the loopback device as
their nexthop device to trap packets hitting them to the CPU.

After commit 4832c30d ("net: ipv6: put host and anycast routes on
device with address") this is no longer the case and such routes are
programmed with a forward action (note the 'offload' flag):

anycast cafe:: dev enp3s0np7 proto kernel metric 0 offload pref medium

This will prevent the router from locally receiving packets destined to
the Subnet-Router anycast address.

Fix this by specifically programming anycast routes with action trap,
which results in the following output:

anycast cafe:: dev enp3s0np7 proto kernel metric 0 pref medium

Fixes: 4832c30d ("net: ipv6: put host and anycast routes on device with address")
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

d3b6d377

Merge branch 'bpf-Improve-LRU-map-lookup-performance' · 843bd2b3

Authored by David S. Miller on Sep 01, 2017

Martin KaFai Lau says:

====================
bpf: Improve LRU map lookup performance

This patchset improves the lookup performance of the LRU map.
Please see individual patch for details.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

843bd2b3

bpf: Only set node->ref = 1 if it has not been set · bb9b9f88

Authored by Martin KaFai Lau on Aug 31, 2017

This patch writes 'node->ref = 1' only if node->ref is 0.
The number of lookups/s for a ~1M entries LRU map increased by
~30% (260097 to 343313).

Other writes of 'node->ref = 0' are not changed.  In those cases, the
same cache line has to be changed anyway.
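
The conditional write described above can be sketched in user-space C. This is only an illustration: `struct lru_node` and `lru_node_set_ref()` are hypothetical stand-ins, not the kernel's definitions, and the helper returns whether a store happened purely so the behaviour can be observed. The idea is that a lookup on an already-referenced node stays read-only, which presumably avoids dirtying the node's cache line.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the kernel's LRU node; only the field
 * discussed in the commit message is modelled here. */
struct lru_node {
	unsigned char ref;
};

/* Mark a node as referenced on lookup.  Returns true if a store was
 * actually issued, so callers (or tests) can observe the difference. */
static bool lru_node_set_ref(struct lru_node *node)
{
	if (node->ref)
		return false;	/* already set: read-only path, no store */

	node->ref = 1;		/* first reference since the flag was cleared */
	return true;
}
```

Calling `lru_node_set_ref()` twice on the same node writes only on the first call; the second call just reads the flag.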

First column: Size of the LRU hash
Second column: Number of lookups/s

Before:
> echo "$((2**20+1)): $(./map_perf_test 1024 1 $((2**20+1)) 10000000 | awk '{print $3}')"
1048577: 260097

After:
> echo "$((2**20+1)): $(./map_perf_test 1024 1 $((2**20+1)) 10000000 | awk '{print $3}')"
1048577: 343313
Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

bb9b9f88

bpf: Inline LRU map lookup · cc555421

Authored by Martin KaFai Lau on Aug 31, 2017

Inline the lru map lookup to save the cost in making calls to
bpf_map_lookup_elem() and htab_lru_map_lookup_elem().

Different LRU hash sizes are tested.  The benefit diminishes when
cache misses start to dominate in the bigger LRU hashes.
Considering the change is simple, it is still worth optimizing.

First column: Size of the LRU hash
Second column: Number of lookups/s

Before:
> for i in $(seq 9 20); do echo "$((2**i+1)): $(./map_perf_test 1024 1 $((2**i+1)) 10000000 | awk '{print $3}')"; done
513: 1132020
1025: 1056826
2049: 1007024
4097: 853298
8193: 742723
16385: 712600
32769: 688142
65537: 677028
131073: 619437
262145: 498770
524289: 316695
1048577: 260038

After:
> for i in $(seq 9 20); do echo "$((2**i+1)): $(./map_perf_test 1024 1 $((2**i+1)) 10000000 | awk '{print $3}')"; done
513: 1221851
1025: 1144695
2049: 1049902
4097: 884460
8193: 773731
16385: 729673
32769: 721989
65537: 715530
131073: 671665
262145: 516987
524289: 321125
1048577: 260048
Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

cc555421

bpf: Add lru_hash_lookup performance test · 637cd8c3

Authored by Martin KaFai Lau on Aug 31, 2017

Create a new case to test the LRU lookup performance.

At the beginning, the LRU map is fully loaded (i.e. the number of keys
is equal to map->max_entries).   The lookup is done through key 0
to num_map_entries and then repeats from 0 again.

This patch also creates an anonymous struct to properly
name the test params in stress_lru_hmap_alloc() in map_perf_test_kern.c.
Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

637cd8c3

Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec-next · 08daaec7

Authored by David S. Miller on Sep 01, 2017

Steffen Klassert says:

====================
pull request (net-next): ipsec-next 2017-09-01

This should be the last ipsec-next pull request for this
release cycle:

1) Support netdevice ESP trailer removal when decryption
   is offloaded. From Yossi Kuperman.

2) Fix overwritten return value of copy_sec_ctx().

Please pull or let me know if there are problems.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

08daaec7

01 Sep, 2017 23 commits

Merge branch 'bpf-Add-option-to-set-mark-and-priority-in-cgroup-sock-programs' · 8fd68207

Authored by David S. Miller on Sep 01, 2017

David Ahern says:

====================
bpf: Add option to set mark and priority in cgroup sock programs

Add option to set mark and priority in addition to bound device for newly
created sockets. Also, allow the bpf programs to use the get_current_uid_gid
helper, meaning the socket mark, priority and device can be set based on the
uid/gid of the running process.

Sample programs are updated to demonstrate the new options.

v3
- no changes to Patches 1 and 2 which Alexei acked in previous versions
- dropped change related to recursive programs in a cgroup
- updated tests per dropped patch

v2
- added flag to control recursive behavior as requested by Alexei
- added comment to sock_filter_func_proto regarding use of
  get_current_uid_gid helper
- updated test programs for recursive option
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

8fd68207

samples/bpf: Update cgroup socket examples to use uid gid helper · 0adc3dd9

Authored by David Ahern on Aug 31, 2017

Signed-off-by: David Ahern <dsahern@gmail.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

0adc3dd9

samples/bpf: Update cgrp2 socket tests · 33aeb5e3

Authored by David Ahern on Aug 31, 2017

Update cgrp2 bpf sock tests to check that device, mark and priority
can all be set on a socket via bpf programs attached to a cgroup.
Signed-off-by: David Ahern <dsahern@gmail.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

33aeb5e3

samples/bpf: Add option to dump socket settings · f776d460

Authored by David Ahern on Aug 31, 2017

Add option to dump socket settings. Will be used in the next patch
to verify bpf programs are correctly setting mark, priority and
device based on the cgroup attachment for the program run.
Signed-off-by: David Ahern <dsahern@gmail.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

f776d460

samples/bpf: Add detach option to test_cgrp2_sock · 609b1c32

Authored by David Ahern on Aug 31, 2017

Add option to detach programs from a cgroup.
Signed-off-by: David Ahern <dsahern@gmail.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

609b1c32

samples/bpf: Update sock test to allow setting mark and priority · fa38aa17

Authored by David Ahern on Aug 31, 2017

Update sock test to set mark and priority on socket create.
Signed-off-by: David Ahern <dsahern@gmail.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

fa38aa17

bpf: Allow cgroup sock filters to use get_current_uid_gid helper · ae2cf1c4

Authored by David Ahern on Aug 31, 2017

Allow BPF programs run on sock create to use the get_current_uid_gid
helper. IPv4 and IPv6 sockets are created in a process context so
there is always a valid uid/gid.
Signed-off-by: David Ahern <dsahern@gmail.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

ae2cf1c4

bpf: Add mark and priority to sock options that can be set · 482dca93

Authored by David Ahern on Aug 31, 2017

Add socket mark and priority to fields that can be set by
ebpf program when a socket is created.
Signed-off-by: David Ahern <dsahern@gmail.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

482dca93

Merge branch 'mlxsw-Add-IPv6-host-dpipe-table' · e12f1a59

Authored by David S. Miller on Aug 31, 2017

Jiri Pirko says:

====================
mlxsw: Add IPv6 host dpipe table

This patchset adds IPv6 host dpipe table support. This will provide the
ability to observe the hardware offloaded IPv6 neighbors.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

e12f1a59

mlxsw: spectrum_dpipe: Add support for controlling IPv6 neighbor counters · 0fb5fe3c

Authored by Arkadi Sharshevsky on Aug 31, 2017

Add support for controlling IPv6 neighbor counters via dpipe.
Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

0fb5fe3c

mlxsw: spectrum_router: Add support for setting counters on IPv6 neighbors · 1ed5574c

Authored by Arkadi Sharshevsky on Aug 31, 2017

Add support for setting counters on IPv6 neighbors based on dpipe's host6
table counter status.
Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

1ed5574c

mlxsw: spectrum_dpipe: Add support for IPv6 host table dump · 410774bd

Authored by Arkadi Sharshevsky on Aug 31, 2017

Add support for IPv6 host table dump.
Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

410774bd

mlxsw: spectrum_dpipe: Make host entry fill handler more generic · 6049e539

Authored by Arkadi Sharshevsky on Aug 31, 2017

Change the host entry filler helper to be applicable for both IPv4/6
addresses.
Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

6049e539

mlxsw: spectrum_router: Add IPv6 neighbor access helper · 0250768c

Authored by Arkadi Sharshevsky on Aug 31, 2017

Add helper for accessing destination IP in case of IPv6 neighbor.
Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

0250768c

mlxsw: spectrum_dpipe: Add IPv6 host table initial support · 506f7dd5

Authored by Arkadi Sharshevsky on Aug 31, 2017

Add IPv6 host table initial support. The action behavior for both IPv4/6
tables is the same, thus the same action dump op is used. Neighbors with
link local address are ignored.
Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

506f7dd5

mlxsw: spectrum_router: Export IPv6 link local address check helper · 1d1056d8

Authored by Arkadi Sharshevsky on Aug 31, 2017

Neighbors with link-local addresses are not offloaded to the host table,
yet they are maintained in the driver for adjacency table usage. When
dumping the IPv6 host neighbors, these link-local neighbors should be
ignored. This patch exports the link-local address check helper for dpipe usage.
Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

1d1056d8

devlink: Add IPv6 header for dpipe · 1797f5b3

Authored by Arkadi Sharshevsky on Aug 31, 2017

This will be used by the IPv6 host table which will be introduced in the
following patches. The fields in the header are added per-use. This header
is global and can be reused by many drivers.
Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

1797f5b3

x86: bpf_jit: small optimization in emit_bpf_tail_call() · 84ccac6e

Authored by Eric Dumazet on Aug 31, 2017

Saves 4 bytes by replacing the following instructions:

lea rax, [rsi + rdx * 8 + offsetof(...)]
mov rax, qword ptr [rax]
cmp rax, 0

with:

mov rax, [rsi + rdx * 8 + offsetof(...)]
test rax, rax
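
The 4-byte figure can be checked by laying out the machine code for both sequences. The byte arrays below are hand-assembled for illustration and are an assumption, not copied from the patch; the 32-bit displacement stands in for the elided `offsetof(...)` and is zeroed because its value does not affect the instruction length. The names `tail_call_before`, `tail_call_after` and `bytes_saved()` are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Placeholder for the 32-bit displacement; the real value is an offsetof()
 * that the commit message elides, so it is zeroed here. */
#define DISP32 0x00, 0x00, 0x00, 0x00

/* Before: lea + mov + cmp */
static const uint8_t tail_call_before[] = {
	0x48, 0x8D, 0x84, 0xD6, DISP32,	/* lea rax, [rsi + rdx * 8 + disp32] */
	0x48, 0x8B, 0x00,		/* mov rax, qword ptr [rax] */
	0x48, 0x83, 0xF8, 0x00,		/* cmp rax, 0 */
};

/* After: mov with the full addressing mode + test */
static const uint8_t tail_call_after[] = {
	0x48, 0x8B, 0x84, 0xD6, DISP32,	/* mov rax, [rsi + rdx * 8 + disp32] */
	0x48, 0x85, 0xC0,		/* test rax, rax */
};

/* Difference in encoded length between the two sequences. */
static size_t bytes_saved(void)
{
	return sizeof(tail_call_before) - sizeof(tail_call_after);
}
```

Under these encodings the old sequence is 15 bytes and the new one is 11: folding the address computation into the load drops the 3-byte `mov rax, [rax]`, and `test rax, rax` is one byte shorter than `cmp rax, 0`.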
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

84ccac6e

samples/bpf: Fix compilation issue in redirect dummy program · 3edcf18e

Authored by Tariq Toukan on Aug 31, 2017

Fix compilation error below:

$ make samples/bpf/

LLVM ERROR: 'xdp_redirect_dummy' label emitted multiple times to
assembly file
make[1]: *** [samples/bpf/xdp_redirect_kern.o] Error 1
make: *** [samples/bpf/] Error 2

Fixes: 306da4e6 ("samples/bpf: xdp_redirect load XDP dummy prog on TX device")
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

3edcf18e

net: fix two typos in net_device_ops documentation. · f16ded59

Authored by Rami Rosen on Aug 31, 2017

This patch fixes two trivial typos in net_device_ops documentation,
related to ndo_xdp_flush callback.
Signed-off-by: Rami Rosen <rami.rosen@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

f16ded59

net: dccp: Add handling of IPV6_PKTOPTIONS to dccp_v6_do_rcv() · 323fbd0e

Authored by Andrii on Aug 31, 2017

Add handling of IPV6_PKTOPTIONS to dccp_v6_do_rcv() in net/dccp/ipv6.c,
similar to the handling in net/ipv6/tcp_ipv6.c.
Signed-off-by: Andrii Vladyka <tulup@mail.ru>
Signed-off-by: David S. Miller <davem@davemloft.net>

323fbd0e

bridge: add tracepoint in br_fdb_update · e3cfddd5

Authored by Roopa Prabhu on Aug 30, 2017

This extends bridge fdb table tracepoints to also cover
learned fdb entries in the br_fdb_update path. Note that
unlike other tracepoints I have moved this to when the fdb
is modified because this is in the datapath and can generate
a lot of noise in the trace output. br_fdb_update is also called
from added_by_user context in the NTF_USE case which is already
traced, hence the !added_by_user check.
Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
Acked-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

e3cfddd5

net_sched: add reverse binding for tc class · 07d79fc7

Authored by Cong Wang on Aug 30, 2017

TC filters when used as classifiers are bound to TC classes.
However, there is a hidden difference when adding them in different
orders:

1. If we add tc classes before their filters, everything is fine.
   Logically, the classes exist before we specify their IDs in
   filters, it is easy to bind them together, just as in the current
   code base.

2. If we add tc filters before the tc classes they bind, we have to
   do dynamic lookup in fast path. What's worse, this happens all
   the time not just once, because on fast path tcf_result is passed
   on stack, there is no way to propagate back to the one in tc filters.

This hidden difference hurts performance silently if we have many tc
classes in hierarchy.

This patch intends to close this gap by doing the reverse binding when
we create a new class. At that point we can search all the filters in
its parent, then match and fix them up by classid. And because
tcf_result is specific to each type of tc filter, we have to introduce
a new op for each filter to tell how to bind the class.

Note, we still can NOT totally get rid of those class lookups in
->enqueue() because cgroup and flow filters have no way to determine
the classid at setup time; they still have to go through dynamic lookup.
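
The reverse binding idea can be sketched in plain C with simplified data structures. Everything here is hypothetical: `struct tc_class`, `struct tc_filter`, `bind_class_to_filters()` and `classify()` are illustrative stand-ins and do not match the kernel's types or the op added by the patch. The sketch only shows the control flow: when a class is created, filters that already name its classid get a cached pointer, so the fast path no longer needs a lookup for them.

```c
#include <stddef.h>
#include <stdint.h>

/* Simplified, hypothetical stand-ins for the kernel structures. */
struct tc_class {
	uint32_t classid;
};

struct tc_filter {
	uint32_t classid;		/* class ID named when the filter was added */
	struct tc_class *bound;		/* cached class, NULL until bound */
};

/* Reverse binding: called when a class is created.  Walks the parent's
 * filters and fixes up every one that already names this classid.
 * Returns the number of filters bound. */
static size_t bind_class_to_filters(struct tc_class *cl,
				    struct tc_filter *filters, size_t n)
{
	size_t bound = 0;

	for (size_t i = 0; i < n; i++) {
		if (filters[i].classid == cl->classid && !filters[i].bound) {
			filters[i].bound = cl;
			bound++;
		}
	}
	return bound;
}

/* Fast path: use the cached pointer when present.  A NULL return is
 * where the real code would fall back to a dynamic lookup. */
static struct tc_class *classify(const struct tc_filter *f)
{
	return f->bound;
}
```

With filters added first, `classify()` returns NULL until the class is created; after `bind_class_to_filters()` runs once, matching filters resolve without any per-packet search.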

Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

07d79fc7

31 Aug, 2017 9 commits

xfrm: Fix return value check of copy_sec_ctx. · 8598112d

Authored by Steffen Klassert on Aug 31, 2017

A recent commit added an output_mark. When copying
this output_mark, the return value of copy_sec_ctx
is overwritten without a check. Fix this by copying
the output_mark before the security context.
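
The ordering problem can be shown with a small user-space sketch. The functions below are hypothetical: the two copy steps are passed in as callbacks so both orderings can be exercised, and `stub_ok()` / `stub_fail()` exist only for the usage example. The explicit check after the output mark copy in the fixed version is an assumption added for clarity; the commit message only states that the output mark is copied before the security context.

```c
/* Hypothetical helpers standing in for the real copy steps; each
 * returns 0 on success or a negative error code. */
typedef int (*copy_fn)(void);

/* Stubs used only to exercise the two orderings. */
static int stub_ok(void)   { return 0; }
static int stub_fail(void) { return -1; }

/* Buggy ordering: the second assignment overwrites the result of the
 * security-context copy before anyone checks it. */
static int copy_state_buggy(copy_fn copy_sec_ctx, copy_fn copy_output_mark)
{
	int ret;

	ret = copy_sec_ctx();
	ret = copy_output_mark();	/* earlier failure is lost here */
	return ret;
}

/* Fixed ordering, as the commit message describes: copy the output
 * mark first, so the security-context result is the last one set. */
static int copy_state_fixed(copy_fn copy_sec_ctx, copy_fn copy_output_mark)
{
	int ret;

	ret = copy_output_mark();
	if (ret)
		return ret;
	ret = copy_sec_ctx();
	return ret;
}
```

When the security-context copy fails and the mark copy succeeds, `copy_state_buggy(stub_fail, stub_ok)` reports success while `copy_state_fixed(stub_fail, stub_ok)` returns the error.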

Fixes: 077fbac4 ("net: xfrm: support setting an output mark.")
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>

8598112d

xfrm: Add support for network devices capable of removing the ESP trailer · 47ebcc0b

Authored by Yossi Kuperman on Aug 30, 2017

In conjunction with crypto offload [1], removing the ESP trailer by
hardware can potentially improve the performance by avoiding (1) a
cache miss incurred by reading the nexthdr field and (2) the necessity
to calculate the csum value of the trailer in order to keep skb->csum
valid.

This patch introduces the changes to the xfrm stack and merely serves
as infrastructure. A subsequent patch to the mlx5 driver will put this
to good use.

[1] https://www.mail-archive.com/netdev@vger.kernel.org/msg175733.html
Signed-off-by: Yossi Kuperman <yossiku@mellanox.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>

47ebcc0b

Merge tag 'mlx5-GRE-Offload' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux · ea3100ab

Authored by David S. Miller on Aug 30, 2017

Saeed Mahameed says:

====================
mlx5-updates-2017-08-31 (GRE Offloads support)

This series provides support for MPLS RSS, GRE TX offloads and GRE RSS.

The first patch from Gal and Ariel provides the mlx5 driver support for
ConnectX capability to perform IP version identification and matching in
order to distinguish between IPv4 and IPv6 without the need to specify the
encapsulation type, thus performing RSS in MPLS automatically without
specifying the MPLS ethertype. This patch will also serve for inner GRE IPv4/6
classification for inner GRE RSS.

The 2nd patch, from Gal, adds TX offload support for GRE tunneled packets
by reporting the needed netdev features.

The 3rd patch, from Gal, adds GRE inner RSS support by creating the needed device
resources (steering tables/rules and traffic classifiers) to match GRE traffic
and perform RSS hashing on the inner headers.

Improvement:
Testing 8 TCP streams bandwidth over GRE:
    System: Intel(R) Xeon(R) CPU E5-2680 v3 @ 2.50GHz
    NIC: Mellanox Technologies MT28800 Family [ConnectX-5 Ex]
    Before: 21.3 Gbps (Single RQ)
    Now   : 90.5 Gbps (RSS spread on 8 RQs)
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

ea3100ab

liquidio: fix crash in presence of zeroed-out base address regs · acfb98b9

Authored by Rick Farrington on Aug 30, 2017

Fix crash in Linux PF driver when BARs have been cleared/de-programmed;
fail early init (prior to mapping BARs) if the BAR0 or
BAR1 registers are zero.
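
A minimal sketch of such an early check, assuming the raw BAR values have already been read from PCI config space. The function name `check_bars_programmed()` and the choice of `-ENODEV` are hypothetical and are not taken from the liquidio driver; the point is only that the check runs before any BAR mapping is attempted.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical early-init check: bar0 and bar1 are the raw values read
 * from the device's base address registers before any mapping. */
static int check_bars_programmed(uint64_t bar0, uint64_t bar1)
{
	if (bar0 == 0 || bar1 == 0)
		return -ENODEV;	/* error code chosen for illustration */

	return 0;
}
```

A probe routine would call this first and bail out on a non-zero result, leaving the de-programmed device untouched.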

This situation can arise when the PF is added to a VM (PCI pass-through),
then a PF FLR is issued (in the VM).  After this occurs, the BAR registers
will be zero. If we attempt to load the PF driver in the host
(after the VM has been shut down), the host can reset.
Signed-off-by: Rick Farrington <ricardo.farrington@cavium.com>
Signed-off-by: Raghu Vatsavayi <raghu.vatsavayi@cavium.com>
Signed-off-by: Felix Manlunas <felix.manlunas@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

acfb98b9

devlink: Maintain consistency in mac field name · 12bdc5e1

Authored by David Ahern on Aug 30, 2017

IPv4 name uses "destination ip" as does the IPv6 patch set.
Make the mac field consistent.
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

12bdc5e1

hv_netvsc: Fix typos in the document of UDP hashing · d35d6e92

Authored by Haiyang Zhang on Aug 30, 2017

There are two typos in the document, netvsc.txt,
regarding UDP hashing level. This patch fixes them.
Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

d35d6e92

xen-netfront: be more drop monitor friendly · 62f3250f

Authored by Eric Dumazet on Aug 30, 2017

xennet_start_xmit() might copy skb with inappropriate layout
into a fresh one.

Old skb is freed, and at this point it is not a drop, but
a consume. New skb will then be either consumed or dropped.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

62f3250f

net/mlx5e: Support RSS for GRE tunneled packets · 7b3722fa

Authored by Gal Pressman on Aug 13, 2017

Introduce a new flow table and indirect TIRs which are used to hash the
inner packet headers of GRE tunneled packets.

When a GRE tunneled packet is received, the TTC flow table will match
the new IPv4/6->GRE rules which will forward it to the inner TTC table.
The inner TTC is similar to its counterpart outer TTC table, but
matching the inner packet headers instead of the outer ones (and does
not include the new IPv4/6->GRE rules).
The new rules will not add steering hops since they are added to an
already existing flow group which will be matched regardless of this
patch. Non-GRE traffic will not be affected.

The inner flow table will forward the packet to inner indirect TIRs
which hash the inner packet and thus result in RSS for the tunneled
packets.

Testing 8 TCP streams bandwidth over GRE:
System: Intel(R) Xeon(R) CPU E5-2680 v3 @ 2.50GHz
NIC: Mellanox Technologies MT28800 Family [ConnectX-5 Ex]
Before: 21.3 Gbps (Single RQ)
Now   : 90.5 Gbps (RSS spread on 8 RQs)
Signed-off-by: Gal Pressman <galp@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>

7b3722fa

net/mlx5e: Support TSO and TX checksum offloads for GRE tunnels · 27299841

Authored by Gal Pressman on Aug 13, 2017

Add TX offloads support for GRE tunneled packets by reporting the needed
netdev features.
Signed-off-by: Gal Pressman <galp@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>

27299841
