提交 · 9ffe79a9c2eec0f30687c2fd8b452bda5c8287b0 · gsplhtlxg / clone-Linux

29 9月, 2017 18 次提交

net: hns3: Support for dynamically assigning tx buffer to TC · 9ffe79a9

由 Yunsheng Lin 提交于 9月 27, 2017

This patch add support of dynamically assigning tx buffer to
TC when the TC is enabled.
It will save buffer for rx direction to avoid packet loss.
Signed-off-by: NYunsheng Lin <linyunsheng@huawei.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

9ffe79a9

arp: make arp_hdr_len() return unsigned int · 6ade97da

由 Alexey Dobriyan 提交于 9月 26, 2017

Negative ARP header length are not a thing.

Constify arguments while I'm at it.

Space savings:

	add/remove: 0/0 grow/shrink: 0/1 up/down: 0/-3 (-3)
	function                        old     new   delta
	arpt_do_table                  1163    1160      -3
Signed-off-by: NAlexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

6ade97da

net-next/hinic: Fix a case of Tx Queue is Stopped forever · bbdc9e68

由 Aviad Krawczyk 提交于 9月 27, 2017

Fix the following scenario:
1. tx_free_poll is running on cpu X
2. xmit function is running on cpu Y and fails to get sq wqe
3. tx_free_poll frees wqes on cpu X and checks the queue is not stopped
4. xmit function stops the queue after failed to get sq wqe
5. The queue is stopped forever
Signed-off-by: NAviad Krawczyk <aviad.krawczyk@huawei.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

bbdc9e68

net-next/hinic: Set Rxq irq to specific cpu for NUMA · 352f58b0

由 Aviad Krawczyk 提交于 9月 27, 2017

Set Rxq irq to specific cpu for allocating and receiving the skb from
the same node.
Signed-off-by: NAviad Krawczyk <aviad.krawczyk@huawei.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

352f58b0

Merge branch 'bpf-verifier-disassembly-improvements' · 93771b01

由 David S. Miller 提交于 9月 28, 2017

Edward Cree says:

====================
bpf/verifier: disassembly improvements

Fix the output of print_bpf_insn() for ALU ops that don't look like
 compound assignment (i.e. BPF_END and BPF_NEG).

Sample output for a short test program:
0: (b4) (u32) r0 = (u32) 0
1: (dc) r0 = be32 r0
2: (84) r0 = (u32) -r0
3: (95) exit
processed 4 insns, stack depth 0
====================
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

93771b01

bpf/verifier: improve disassembly of BPF_NEG instructions · 73c864b3

由 Edward Cree 提交于 9月 26, 2017

BPF_NEG takes only one operand, unlike the bulk of BPF_ALU[64] which are
 compound-assignments.  So give it its own format in print_bpf_insn().
Signed-off-by: NEdward Cree <ecree@solarflare.com>
Acked-by: NAlexei Starovoitov <ast@kernel.org>
Acked-by: NDaniel Borkmann <daniel@iogearbox.net>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

73c864b3

bpf/verifier: improve disassembly of BPF_END instructions · 2b7c6ba9

由 Edward Cree 提交于 9月 26, 2017

print_bpf_insn() was treating all BPF_ALU[64] the same, but BPF_END has a
 different structure: it has a size in insn->imm (even if it's BPF_X) and
 uses the BPF_SRC (X or K) to indicate which endianness to use.  So it
 needs different code to print it.
Signed-off-by: NEdward Cree <ecree@solarflare.com>
Acked-by: NAlexei Starovoitov <ast@kernel.org>
Acked-by: NDaniel Borkmann <daniel@iogearbox.net>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

2b7c6ba9

cxgb4: make function ch_flower_stats_cb, fixes warning · 4d8806fd

由 Colin Ian King 提交于 9月 26, 2017

The function ch_flower_stats_cb is local to the source and does not need
to be in global scope, so make it static.

Cleans up sparse warnings:
symbol 'ch_flower_stats_cb' was not declared. Should it be static?
Signed-off-by: NColin Ian King <colin.king@canonical.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

4d8806fd

Merge branch 'rtnl-pushdown-prep' · f46d9632

由 David S. Miller 提交于 9月 28, 2017

Florian Westphal says:

====================
rtnetlink: preparation patches for further rtnl lock pushdown/removal

Patches split large rtnl_fill_ifinfo into smaller chunks
to better see which parts

1. require rtnl
2. do not require it at all
3. rely on rtnl locking now but could be converted

Changes since v3:

I dropped the 'ifalias' patch, I have a change to decouple ifalias and
rtnl mutex, I will send it once this series has been merged.
====================
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

f46d9632

rtnetlink: rtnl_have_link_slave_info doesn't need rtnl · 4c82a95e

由 Florian Westphal 提交于 9月 26, 2017

it can be switched to rcu.
Reviewed-by: NDavid Ahern <dsahern@gmail.com>
Signed-off-by: NFlorian Westphal <fw@strlen.de>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

4c82a95e

rtnetlink: add helpers to dump netnsid information · b1e66b9a

由 Florian Westphal 提交于 9月 26, 2017

Reviewed-by: NDavid Ahern <dsahern@gmail.com>
Signed-off-by: NFlorian Westphal <fw@strlen.de>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

b1e66b9a

rtnetlink: add helpers to dump vf information · 250fc3df

由 Florian Westphal 提交于 9月 26, 2017

similar to earlier patches, split out more parts of this function to
better see what is happening and where we assume rtnl is locked.
Reviewed-by: NDavid Ahern <dsahern@gmail.com>
Signed-off-by: NFlorian Westphal <fw@strlen.de>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

250fc3df

rtnetlink: add helper to put master and link ifindexes · 79110a04

由 Florian Westphal 提交于 9月 26, 2017

rtnl_fill_ifinfo currently requires caller to hold the rtnl mutex.
Unfortunately the function is quite large which makes it harder to see
which spots require the lock, which spots assume it and which ones could
do without.

Add helpers to factor out the ifindex dumping, one can use rcu to avoid
rtnl dependency.
Reviewed-by: NDavid Ahern <dsahern@gmail.com>
Signed-off-by: NFlorian Westphal <fw@strlen.de>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

79110a04

selftests: rtnetlink.sh: add rudimentary vrf test · 61f26d92

由 Florian Westphal 提交于 9月 26, 2017

Acked-by: NDavid Ahern <dsahern@gmail.com>
Signed-off-by: NFlorian Westphal <fw@strlen.de>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

61f26d92

net_sched: use idr to allocate u32 filter handles · e7614370

由 Cong Wang 提交于 9月 25, 2017

Instead of calling u32_lookup_ht() in a loop to find
a unused handle, just switch to idr API to allocate
new handles. u32 filters are special as the handle
could contain a hash table id and a key id, so we
need two IDR to allocate each of them.

Cc: Chris Mi <chrism@mellanox.com>
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: NCong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

e7614370

net_sched: use idr to allocate basic filter handles · 1d8134fe

由 Cong Wang 提交于 9月 25, 2017

Instead of calling basic_get() in a loop to find
a unused handle, just switch to idr API to allocate
new handles.

Cc: Chris Mi <chrism@mellanox.com>
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: NCong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

1d8134fe

net_sched: use idr to allocate bpf filter handles · 76cf546c

由 Cong Wang 提交于 9月 25, 2017

Instead of calling cls_bpf_get() in a loop to find
a unused handle, just switch to idr API to allocate
new handles.

Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Chris Mi <chrism@mellanox.com>
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: NCong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

76cf546c

inetpeer: speed up inetpeer_invalidate_tree() · 8f1975e3

由 Eric Dumazet 提交于 9月 25, 2017

As measured in my prior patch ("sch_netem: faster rb tree removal"),
rbtree_postorder_for_each_entry_safe() is nice looking but much slower
than using rb_next() directly, except when tree is small enough
to fit in CPU caches (then the cost is the same)

From: Eric Dumazet <edumazet@google.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

8f1975e3

28 9月, 2017 13 次提交

Merge branch 'mlxsw-Add-support-for-offloading-IPv4-multicast-routes' · a2e4a219

由 David S. Miller 提交于 9月 27, 2017

Jiri Pirko says:

====================
mlxsw: Add support for offloading IPv4 multicast routes

Yotam says:

This patch-set introduces offloading of the kernel IPv4 multicast router
logic in the Spectrum driver.

The first patch makes the Spectrum driver ignore FIB notifications that are
not of address family IPv4 or IPv6. This is needed in order to prevent
crashes while the next patches introduce the RTNL_FAMILY_IPMR FIB
notifications.

Patches 2-5 update ipmr to use the FIB notification chain for both MFC and
VIF notifications, and patches 8-12 update the Spectrum driver to register
to these notifications and offload the routes.

Similarly to IPv4 and IPv6, any failure will trigger the abort mechanism
which is updated in this patch-set to eject multicast route tables too.

At this stage, the following limitations apply:
 - A multicast MFC route will be offloaded by the driver if all the output
   interfaces are Spectrum router interfaces (RIFs). In any other case
   (which includes pimreg device, tunnel devices and management ports) the
   route will be trapped to the CPU and the packets will be forwarded by
   software.
 - ipmr proxy routes are not supported and will trigger the abort
   mechanism.
 - The MFC TTL values are currently treated as boolean: if the value is
   different than 255, the traffic is forwarded to the interface and if the
   value is 255 it is not forwarded. Dropping packets based on their TTL isn't
   currently supported.

To allow users to have visibility on which of the routes are offloaded and
which are not, patch 6 introduces a per-route offload indication similar to
IPv4 and IPv6 routes which is sent to the user via the RTNetlink interface.

The Spectrum driver multicast router offloading support, which is
introduced in patches 8 and 9, is divided into two parts:
 - The hardware logic which abstracts the Spectrum hardware and provides a
   simple API for the upper levels.
 - The offloading logic which gets the MFC and VIF notifications from the
   kernel and updates the hardware using the hardware logic part.

Finally, the last patch makes the Spectrum router logic not ignore the
multicast FIB notifications and call the corresponding functions in the
multicast router offloading logic.

---
v2->v3:
 - Move the ipmr_rule_default function definition to be inside the already
   existing CONFIG_IP_MROUTE_MULTIPLE_TABLES ifdef block (patch 6)
 - Remove double =0 initialization in spectrum_mr.c (patch 7)
 - Fix route4 allocation size (patch 7)
v1->v2:
 - Add comments for struct fields in mroute.h
 - Take the mrt_lock while dumping VIFs in the fib_notifier dump callback
 - Update the MFC lastuse field too
====================
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

a2e4a219

mlxsw: spectrum: router: Don't ignore IPMR notifications · 664375e9

由 Yotam Gigi 提交于 9月 27, 2017

Make the Spectrum router logic not ignore the RTNL_FAMILY_IPMR FIB
notifications.

Past commits added the IPMR VIF and MFC add/del notifications via the
fib_notifier chain. In addition, a code for handling these notifications in
the Spectrum router logic was added. Make the Spectrum router logic not
ignore these notifications and forward the requests to the Spectrum
multicast router offloading logic.
Signed-off-by: NYotam Gigi <yotamg@mellanox.com>
Reviewed-by: NIdo Schimmel <idosch@mellanox.com>
Signed-off-by: NJiri Pirko <jiri@mellanox.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

664375e9

mlxsw: spectrum: Notify multicast router on RIF MTU changes · fd890fe9

由 Yotam Gigi 提交于 9月 27, 2017

Due to the fact that multicast routes hold the minimum MTU of all the
egress RIFs and trap packets that don't meet it, notify the mulitcast
router code on RIF MTU changes.
Signed-off-by: NYotam Gigi <yotamg@mellanox.com>
Reviewed-by: NIdo Schimmel <idosch@mellanox.com>
Signed-off-by: NJiri Pirko <jiri@mellanox.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

fd890fe9

mlxsw: spectrum_router: Add multicast routes notification handling functionality · d42b0965

由 Yotam Gigi 提交于 9月 27, 2017

Add functionality for calling the multicast routing offloading logic upon
MFC and VIF add and delete notifications. In addition, call the multicast
routing upon RIF addition and deletion events.

As the multicast routing offload logic may sleep, the actual calls are done
in a deferred work. To ensure the MFC object is not freed in that interval,
a reference is held to it. In case of a failure, the abort mechanism is
used, which ejects all the routes from the hardware and triggers the
traffic to flow through the kernel.

Note: At that stage, the FIB notifications are still ignored, and will be
enabled in a further patch.
Signed-off-by: NYotam Gigi <yotamg@mellanox.com>
Reviewed-by: NIdo Schimmel <idosch@mellanox.com>
Signed-off-by: NJiri Pirko <jiri@mellanox.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

d42b0965

mlxsw: spectrum: router: Squash the default route table to main · 7e50d435

由 Yotam Gigi 提交于 9月 27, 2017

Currently, the mlxsw Spectrum driver offloads only either the RT_TABLE_MAIN
FIB table or the VRF tables, so the RT_TABLE_LOCAL table is squashed to the
RT_TABLE_MAIN table to allow local routes to be offloaded too.

By default, multicast MFC routes which are not assigned to any user
requested table are put in the RT_TABLE_DEFAULT table.

Due to the fact that offloading multicast MFC routes support in Spectrum
router logic is going to be introduced soon, squash the default table to
MAIN too.
Signed-off-by: NYotam Gigi <yotamg@mellanox.com>
Reviewed-by: NIdo Schimmel <idosch@mellanox.com>
Signed-off-by: NJiri Pirko <jiri@mellanox.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

7e50d435

mlxsw: spectrum: Add the multicast routing hardware logic · 0e14c777

由 Yotam Gigi 提交于 9月 27, 2017

Implement the multicast routing hardware API introduced in previous patch
for the specific spectrum hardware.

The spectrum hardware multicast routes are written using the RMFT2 register
and point to an ACL flexible action set. The actions used for multicast
routes are:
 - Counter action, which allows counting bytes and packets on multicast
   routes.
 - Multicast route action, which provide RPF check and do the actual packet
   duplication to a list of RIFs.
 - Trap action, in the case the route action specified by the called is
   trap.
Signed-off-by: NYotam Gigi <yotamg@mellanox.com>
Reviewed-by: NIdo Schimmel <idosch@mellanox.com>
Signed-off-by: NJiri Pirko <jiri@mellanox.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

0e14c777

mlxsw: spectrum: Add the multicast routing offloading logic · c011ec1b

由 Yotam Gigi 提交于 9月 27, 2017

Add the multicast router offloading logic, which is in charge of handling
the VIF and MFC notifications and translating it to the hardware logic API.

The offloading logic has to overcome several obstacles in order to safely
comply with the kernel multicast router user API:
 - It must keep track of the mapping between VIFs to netdevices. The user
   can add an MFC cache entry pointing to a VIF, delete the VIF and add
   re-add it with a different netdevice. The offloading logic has to handle
   this in order to be compatible with the kernel logic.
 - It must keep track of the mapping between netdevices to spectrum RIFs,
   as the current hardware implementation assume having a RIF for every
   port in a multicast router.
 - It must handle routes pointing to pimreg device to be trapped to the
   kernel, as the packet should be delivered to userspace.
 - It must handle routes pointing tunnel VIFs. The current implementation
   does not support multicast forwarding to tunnels, thus routes that point
   to a tunnel should be trapped to the kernel.
 - It must be aware of proxy multicast routes, which include both (*,*)
   routes and duplicate routes. Currently proxy routes are not offloaded
   and trigger the abort mechanism: removal of all routes from hardware and
   triggering the traffic to go through the kernel.

The multicast routing offloading logic also updates the counters of the
offloaded MFC routes in a periodic work.
Signed-off-by: NYotam Gigi <yotamg@mellanox.com>
Reviewed-by: NIdo Schimmel <idosch@mellanox.com>
Signed-off-by: NJiri Pirko <jiri@mellanox.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

c011ec1b

net: mroute: Check if rule is a default rule · 478e4c2f

由 Yotam Gigi 提交于 9月 27, 2017

When the ipmr starts, it adds one default FIB rule that matches all packets
and sends them to the DEFAULT (multicast) FIB table. A more complex rule
can be added by user to specify that for a specific interface, a packet
should be look up at either an arbitrary table or according to the l3mdev
of the interface.

For drivers willing to offload the ipmr logic into a hardware but don't
want to offload all the FIB rules functionality, provide a function that
can indicate whether the FIB rule is the default multicast rule, thus only
one routing table is needed.

This way, a driver can register to the FIB notification chain, get
notifications about FIB rules added and trigger some kind of an internal
abort mechanism when a non default rule is added by the user.
Signed-off-by: NYotam Gigi <yotamg@mellanox.com>
Reviewed-by: NIdo Schimmel <idosch@mellanox.com>
Signed-off-by: NJiri Pirko <jiri@mellanox.com>
Reviewed-by: NNikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

478e4c2f

net: ipmr: Add MFC offload indication · c7c0bbea

由 Yotam Gigi 提交于 9月 27, 2017

Allow drivers, registered to the fib notification chain indicate whether a
multicast MFC route is offloaded or not, similarly to unicast routes. The
indication of whether a route is offloaded is done using the mfc_flags
field on an mfc_cache struct, and the information is sent to the userspace
via the RTNetlink interface only.

Currently, MFC routes are either offloaded or not, thus there is no need to
add per-VIF offload indication.
Signed-off-by: NYotam Gigi <yotamg@mellanox.com>
Reviewed-by: NIdo Schimmel <idosch@mellanox.com>
Signed-off-by: NJiri Pirko <jiri@mellanox.com>
Reviewed-by: NNikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

c7c0bbea

ipmr: Send FIB notifications on MFC and VIF entries · b362053a

由 Yotam Gigi 提交于 9月 27, 2017

Use the newly introduced notification chain to send events upon VIF and MFC
addition and deletion. The MFC notifications are sent only on resolved MFC
entries, as unresolved cannot be offloaded.
Signed-off-by: NYotam Gigi <yotamg@mellanox.com>
Reviewed-by: NIdo Schimmel <idosch@mellanox.com>
Signed-off-by: NJiri Pirko <jiri@mellanox.com>
Reviewed-by: NNikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

b362053a

ipmr: Add FIB notification access functions · 4d65b948

由 Yotam Gigi 提交于 9月 27, 2017

Make the ipmr module register as a FIB notifier. To do that, implement both
the ipmr_seq_read and ipmr_dump ops.

The ipmr_seq_read op returns a sequence counter that is incremented on
every notification related operation done by the ipmr. To implement that,
add a sequence counter in the netns_ipv4 struct and increment it whenever a
new MFC route or VIF are added or deleted. The sequence operations are
protected by the RTNL lock.

The ipmr_dump iterates the list of MFC routes and the list of VIF entries
and sends notifications about them. The entries dump is done under RCU
where the VIF dump uses the mrt_lock too, as the vif->dev field can change
under RCU.
Signed-off-by: NYotam Gigi <yotamg@mellanox.com>
Reviewed-by: NIdo Schimmel <idosch@mellanox.com>
Signed-off-by: NJiri Pirko <jiri@mellanox.com>
Reviewed-by: NNikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

4d65b948

ipmr: Add reference count to MFC entries · 310ebbba

由 Yotam Gigi 提交于 9月 27, 2017

Next commits will introduce MFC notifications through the atomic
fib_notification chain, thus allowing modules to be aware of MFC entries.

Due to the fact that modules may need to hold a reference to an MFC entry,
add reference count to MFC entries to prevent them from being freed while
these modules use them.

The reference counting is done only on resolved MFC entries currently.
Signed-off-by: NYotam Gigi <yotamg@mellanox.com>
Reviewed-by: NIdo Schimmel <idosch@mellanox.com>
Signed-off-by: NJiri Pirko <jiri@mellanox.com>
Reviewed-by: NNikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

310ebbba

fib: notifier: Add VIF add and delete event types · 85e48228

由 Yotam Gigi 提交于 9月 27, 2017

In order for an interface to forward packets according to the kernel
multicast routing table, it must be configured with a VIF index according
to the mroute user API. The VIF index is then used to refer to that
interface in the mroute user API, for example, to set the iif and oifs of
an MFC entry.

In order to allow drivers to be aware and offload multicast routes, they
have to be aware of the VIF add and delete notifications.

Due to the fact that a specific VIF can be deleted and re-added pointing to
another netdevice, and the MFC routes that point to it will forward the
matching packets to the new netdevice, a driver willing to offload MFC
cache entries must be aware of the VIF add and delete events in addition to
MFC routes notifications.
Signed-off-by: NYotam Gigi <yotamg@mellanox.com>
Reviewed-by: NIdo Schimmel <idosch@mellanox.com>
Signed-off-by: NJiri Pirko <jiri@mellanox.com>
Reviewed-by: NNikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

85e48228

27 9月, 2017 9 次提交

Merge branch 'nfp-flower-vxlan-tunnel-offload' · 1ca94d79

由 David S. Miller 提交于 9月 26, 2017

Simon Horman says:

====================
nfp: flower vxlan tunnel offload

John says:

This patch set allows offloading of TC flower match and set tunnel fields
to the NFP. The initial focus is on VXLAN traffic. Due to the current
state of the NFP firmware, only VXLAN traffic on well known port 4789 is
handled. The match and action fields must explicity set this value to be
supported. Tunnel end point information is also offloaded to the NFP for
both encapsulation and decapsulation. The NFP expects 3 separate data sets
to be supplied.

For decapsulation, 2 separate lists exist; a list of MAC addresses
referenced by an index comprised of the port number, and a list of IP
addresses. These IP addresses are not connected to a MAC or port. The MAC
addresses can be written as a block or one at a time (because they have an
index, previous values can be overwritten) while the IP addresses are
always written as a list of all the available IPs. Because the MAC address
used as a tunnel end point may be associated with a physical port or may
be a virtual netdev like an OVS bridge, we do not know which addresses
should be offloaded. For this reason, all MAC addresses of active netdevs
are offloaded to the NFP. A notifier checks for changes to any currently
offloaded MACs or any new netdevs that may occur. For IP addresses, the
tunnel end point used in the rules is known as the destination IP address
must be specified in the flower classifier rule. When a new IP address
appears in a rule, the IP address is offloaded. The IP is removed from the
offloaded list when all rules matching on that IP are deleted.

For encapsulation, a next hop table is updated on the NFP that contains
the source/dest IPs, MACs and egress port. These are written individually
when requested. If the NFP tries to encapsulate a packet but does not know
the next hop, then is sends a request to the host. The host carries out a
route lookup and populates the given entry on the NFP table. A notifier
also exists to check for any links changing or going down in the kernel
next hop table. If an offloaded next hop entry is removed from the kernel
then it is also removed on the NFP.

The NFP periodically sends a message to the host telling it which tunnel
ports have packets egressing the system. The host uses this information to
update the used value in the neighbour entry. This means that, rather than
expire when it times out, the kernel will send an ARP to check if the link
is still live. From an NFP perspective, this means that valid entries will
not be removed from its next hop table.
====================
Acked-by: NJakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

1ca94d79

nfp: flower vxlan neighbour keep-alive · 856f5b13

由 John Hurley 提交于 9月 25, 2017

Periodically receive messages containing the destination IPs of tunnels
that have recently forwarded traffic. Update the neighbour entries 'used'
value for these IPs next hop.

This prevents the neighbour entry from expiring on timeout but rather
signals an ARP to verify the connection. From an NFP perspective, packets
will not fall back mid-flow unless the link is verified to be down.
Signed-off-by: NJohn Hurley <john.hurley@netronome.com>
Reviewed-by: NSimon Horman <simon.horman@netronome.com>
Signed-off-by: NSimon Horman <simon.horman@netronome.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

856f5b13

nfp: flower vxlan neighbour offload · 8e6a9046

由 John Hurley 提交于 9月 25, 2017

Receive a request when the NFP does not know the next hop for a packet
that is to be encapsulated in a VXLAN tunnel. Do a route lookup, determine
the next hop entry and update neighbour table on NFP. Monitor the kernel
neighbour table for link changes and update NFP with relevant information.
Overwrite routes with zero values on the NFP when they expire.
Signed-off-by: NJohn Hurley <john.hurley@netronome.com>
Reviewed-by: NSimon Horman <simon.horman@netronome.com>
Signed-off-by: NSimon Horman <simon.horman@netronome.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

8e6a9046

nfp: offload vxlan IPv4 endpoints of flower rules · 2d9ad71a

由 John Hurley 提交于 9月 25, 2017

Maintain a list of IPv4 addresses used as the tunnel destination IP match
fields in currently active flower rules. Offload the entire list of
NFP_FL_IPV4_ADDRS_MAX (even if some are unused) when new IPs are added or
removed. The NFP should only be aware of tunnel end points that are
currently used by rules on the device
Signed-off-by: NJohn Hurley <john.hurley@netronome.com>
Reviewed-by: NSimon Horman <simon.horman@netronome.com>
Signed-off-by: NSimon Horman <simon.horman@netronome.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

2d9ad71a

nfp: offload flower vxlan endpoint MAC addresses · fd0dd1ab

由 John Hurley 提交于 9月 25, 2017

Generate a list of MAC addresses of netdevs that could be used as VXLAN
tunnel end points. Give offloaded MACs an index for storage on the NFP in
the ranges:
0x100-0x1ff physical port representors
0x200-0x2ff VF port representors
0x300-0x3ff other offloads (e.g. vxlan netdevs, ovs bridges)

Assign phys and vf indexes based on unique 8 bit values in the port num.
Maintain list of other netdevs to ensure same netdev is not offloaded
twice and each gets a unique ID without exhausting the entries. Because
the IDs are unique but constant for a netdev, any changes are implemented
by overwriting the index on NFP.
Signed-off-by: NJohn Hurley <john.hurley@netronome.com>
Signed-off-by: NSimon Horman <simon.horman@netronome.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

fd0dd1ab

nfp: compile flower vxlan tunnel set actions · b27d6a95

由 John Hurley 提交于 9月 25, 2017

Compile set tunnel actions for tc flower. Only support VXLAN and ensure a
tunnel destination port of 4789 is used.
Signed-off-by: NJohn Hurley <john.hurley@netronome.com>
Signed-off-by: NSimon Horman <simon.horman@netronome.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

b27d6a95

nfp: compile flower vxlan tunnel metadata match fields · 611aec10

由 John Hurley 提交于 9月 25, 2017

Compile ovs-tc flower vxlan metadata match fields for offloading. Only
support offload of tunnel data when the VXLAN port specifically matches
well known port 4789.
Signed-off-by: NJohn Hurley <john.hurley@netronome.com>
Signed-off-by: NSimon Horman <simon.horman@netronome.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

611aec10

nfp: add helper to get flower cmsg length · 79ede4ae

由 John Hurley 提交于 9月 25, 2017

Add a helper function that returns the length of the cmsg data when given
the cmsg skb
Signed-off-by: NJohn Hurley <john.hurley@netronome.com>
Signed-off-by: NSimon Horman <simon.horman@netronome.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

79ede4ae

Merge branch 'mlxsw-pass-gact' · 14a0d032

由 David S. Miller 提交于 9月 26, 2017

Jiri Pirko says:

====================
mlxsw: Introduce support for "pass" gact action offloading

Very simple patchset adds ability for user to insert filters with "pass"
gact action and offload it. That allows scenarios like this:

$ tc filter add dev enp3s0np19 ingress protocol ip pref 10 flower skip_sw dst_ip 192.168.101.0/24 action drop
$ tc filter add dev enp3s0np19 ingress protocol ip pref 9 flower skip_sw dst_ip 192.168.101.1 action pass
====================
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

14a0d032