提交 · 54ef207ac8f7a17d677082157a29f4df8499dc81 · openanolis / cloud-kernel

15 2月, 2017 5 次提交

xfrm: Extend the sec_path for IPsec offloading · 54ef207a

由 Steffen Klassert 提交于 2月 15, 2017

We need to keep per packet offloading informations across
the layers. So we extend the sec_path to carry these for
the input and output offload codepath.
Signed-off-by: NSteffen Klassert <steffen.klassert@secunet.com>

54ef207a

xfrm: Export xfrm_parse_spi. · 1e295370

由 Steffen Klassert 提交于 2月 15, 2017

We need it in the ESP offload handlers, so export it.
Signed-off-by: NSteffen Klassert <steffen.klassert@secunet.com>

1e295370

net: Prepare gro for packet consuming gro callbacks · 25393d3f

由 Steffen Klassert 提交于 2月 15, 2017

The upcomming IPsec ESP gro callbacks will consume the skb,
so prepare for that.
Signed-off-by: NSteffen Klassert <steffen.klassert@secunet.com>

25393d3f

net: Add a skb_gro_flush_final helper. · 5f114163

由 Steffen Klassert 提交于 2月 15, 2017

Add a skb_gro_flush_final helper to prepare for  consuming
skbs in call_gro_receive. We will extend this helper to not
touch the skb if the skb is consumed by a gro callback with
a followup patch. We need this to handle the upcomming IPsec
ESP callbacks as they reinject the skb to the napi_gro_receive
asynchronous. The handler is used in all gro_receive functions
that can call the ESP gro handlers.
Signed-off-by: NSteffen Klassert <steffen.klassert@secunet.com>

5f114163

xfrm: Add a secpath_set helper. · b0fcee82

由 Steffen Klassert 提交于 2月 15, 2017

Add a new helper to set the secpath to the skb.
This avoids code duplication, as this is used
in multiple places.
Signed-off-by: NSteffen Klassert <steffen.klassert@secunet.com>

b0fcee82

09 2月, 2017 7 次提交

xfrm: policy: make policy backend const · 37b10383

由 Florian Westphal 提交于 2月 07, 2017

Signed-off-by: NFlorian Westphal <fw@strlen.de>
Signed-off-by: NSteffen Klassert <steffen.klassert@secunet.com>

37b10383

xfrm: policy: remove xfrm_policy_put_afinfo · bdba9fe0

由 Florian Westphal 提交于 2月 07, 2017

Alternative is to keep it an make the (unused) afinfo arg const to avoid
the compiler warnings once the afinfo structs get constified.
Signed-off-by: NFlorian Westphal <fw@strlen.de>
Signed-off-by: NSteffen Klassert <steffen.klassert@secunet.com>

bdba9fe0

xfrm: policy: remove family field · a2817d8b

由 Florian Westphal 提交于 2月 07, 2017

Only needed it to register the policy backend at init time.
Signed-off-by: NFlorian Westphal <fw@strlen.de>
Signed-off-by: NSteffen Klassert <steffen.klassert@secunet.com>

a2817d8b

xfrm: policy: remove garbage_collect callback · 3d7d25a6

由 Florian Westphal 提交于 2月 07, 2017

Just call xfrm_garbage_collect_deferred() directly.
This gets rid of a write to afinfo in register/unregister and allows to
constify afinfo later on.
Signed-off-by: NFlorian Westphal <fw@strlen.de>
Signed-off-by: NSteffen Klassert <steffen.klassert@secunet.com>

3d7d25a6

xfrm: policy: xfrm_policy_unregister_afinfo can return void · 2b61997a

由 Florian Westphal 提交于 2月 07, 2017

Nothing checks the return value. Also, the errors returned on unregister
are impossible (we only support INET and INET6, so no way
xfrm_policy_afinfo[afinfo->family] can be anything other than 'afinfo'
itself).
Signed-off-by: NFlorian Westphal <fw@strlen.de>
Signed-off-by: NSteffen Klassert <steffen.klassert@secunet.com>

2b61997a

xfrm: policy: xfrm_get_tos cannot fail · f5e2bb4f

由 Florian Westphal 提交于 2月 07, 2017

The comment makes it look like get_tos() is used to validate something,
but it turns out the comment was about xfrm_find_bundle() which got removed
years ago.

xfrm_get_tos will return either the tos (ipv4) or 0 (ipv6).
Signed-off-by: NFlorian Westphal <fw@strlen.de>
Signed-off-by: NSteffen Klassert <steffen.klassert@secunet.com>

f5e2bb4f

xfrm: input: constify xfrm_input_afinfo · 960fdfde

由 Florian Westphal 提交于 2月 07, 2017

Nothing writes to these structures (the module owner was not used).

While at it, size xfrm_input_afinfo[] by the highest existing xfrm family
(INET6), not AF_MAX.
Signed-off-by: NFlorian Westphal <fw@strlen.de>
Signed-off-by: NSteffen Klassert <steffen.klassert@secunet.com>

960fdfde

07 2月, 2017 28 次提交

Merge branch 'bridge-improve-cache-utilization' · 152bff37

由 David S. Miller 提交于 2月 06, 2017

Nikolay Aleksandrov says:

====================
bridge: improve cache utilization

This is the first set which begins to deal with the bad bridge cache
access patterns. The first patch rearranges the bridge and port structs
a little so the frequently (and closely) accessed members are in the same
cache line. The second patch then moves the garbage collection to a
workqueue trying to improve system responsiveness under load (many fdbs)
and more importantly removes the need to check if the matched entry is
expired in __br_fdb_get which was a major source of false-sharing.
The third patch is a preparation for the final one which
If properly configured, i.e. ports bound to CPUs (thus updating "updated"
locally) then the bridge's HitM goes from 100% to 0%, but even without
binding we get a win because previously every lookup that iterated over
the hash chain caused false-sharing due to the first cache line being
used for both mac/vid and used/updated fields.

Some results from tests I've run:
(note that these were run in good conditions for the baseline, everything
 ran on a single NUMA node and there were only 3 fdbs)

1. baseline
100% Load HitM on the fdbs (between everyone who has done lookups and hit
                            one of the 3 hash chains of the communicating
                            src/dst fdbs)
Overall 5.06% Load HitM for the bridge, first place in the list

2. patched & ports bound to CPUs
0% Local load HitM, bridge is not even in the c2c report list
Also there's 3% consistent improvement in netperf tests.
====================
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

152bff37

bridge: fdb: write to used and updated at most once per jiffy · 83a718d6

由 Nikolay Aleksandrov 提交于 2月 04, 2017

Writing once per jiffy is enough to limit the bridge's false sharing.
After this change the bridge doesn't show up in the local load HitM stats.
Suggested-by: NDavid S. Miller <davem@davemloft.net>
Signed-off-by: NNikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

83a718d6

bridge: move write-heavy fdb members in their own cache line · 1214628c

由 Nikolay Aleksandrov 提交于 2月 04, 2017

Fdb's used and updated fields are written to on every packet forward and
packet receive respectively. Thus if we are receiving packets from a
particular fdb, they'll cause false-sharing with everyone who has looked
it up (even if it didn't match, since mac/vid share cache line!). The
"used" field is even worse since it is updated on every packet forward
to that fdb, thus the standard config where X ports use a single gateway
results in 100% fdb false-sharing. Note that this patch does not prevent
the last scenario, but it makes it better for other bridge participants
which are not using that fdb (and are only doing lookups over it).
The point is with this move we make sure that only communicating parties
get the false-sharing, in a later patch we'll show how to avoid that too.
Signed-off-by: NNikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

1214628c

bridge: move to workqueue gc · f7cdee8a

由 Nikolay Aleksandrov 提交于 2月 04, 2017

Move the fdb garbage collector to a workqueue which fires at least 10
milliseconds apart and cleans chain by chain allowing for other tasks
to run in the meantime. When having thousands of fdbs the system is much
more responsive. Most importantly remove the need to check if the
matched entry has expired in __br_fdb_get that causes false-sharing and
is completely unnecessary if we cleanup entries, at worst we'll get 10ms
of traffic for that entry before it gets deleted.
Signed-off-by: NNikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

f7cdee8a

bridge: modify bridge and port to have often accessed fields in one cache line · 1f90c7f3

由 Nikolay Aleksandrov 提交于 2月 04, 2017

Move around net_bridge so the vlan fields are in the beginning since
they're checked on every packet even if vlan filtering is disabled.
For the port move flags & vlan group to the beginning, so they're in the
same cache line with the port's state (both flags and state are checked
on each packet).
Signed-off-by: NNikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

1f90c7f3

bpf: enable verifier to add 0 to packet ptr · 63dfef75

由 William Tu 提交于 2月 04, 2017

The patch fixes the case when adding a zero value to the packet
pointer.  The zero value could come from src_reg equals type
BPF_K or CONST_IMM.  The patch fixes both, otherwise the verifer
reports the following error:
  [...]
    R0=imm0,min_value=0,max_value=0
    R1=pkt(id=0,off=0,r=4)
    R2=pkt_end R3=fp-12
    R4=imm4,min_value=4,max_value=4
    R5=pkt(id=0,off=4,r=4)
  269: (bf) r2 = r0     // r2 becomes imm0
  270: (77) r2 >>= 3
  271: (bf) r4 = r1     // r4 becomes pkt ptr
  272: (0f) r4 += r2    // r4 += 0
  addition of negative constant to packet pointer is not allowed
Signed-off-by: NWilliam Tu <u9012063@gmail.com>
Signed-off-by: NMihai Budiu <mbudiu@vmware.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Alexei Starovoitov <ast@kernel.org>
Acked-by: NDaniel Borkmann <daniel@iogearbox.net>
Acked-by: NAlexei Starovoitov <ast@kernel.org>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

63dfef75

bpf: test for AND edge cases · 29200c19

由 Josef Bacik 提交于 2月 03, 2017

These two tests are based on the work done for f23cc643. The first test is
just a basic one to make sure we don't allow AND'ing negative values, even if it
would result in a valid index for the array. The second is a cleaned up version
of the original testcase provided by Jann Horn that resulted in the commit.
Acked-by: NAlexei Starovoitov <ast@kernel.org>
Acked-by: NDaniel Borkmann <daniel@iogearbox.net>
Signed-off-by: NJosef Bacik <jbacik@fb.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

29200c19

Merge branch 'dsa-add-fabric-notifier' · 9172d2a0

由 David S. Miller 提交于 2月 06, 2017

Vivien Didelot says:

====================
net: dsa: add fabric notifier

When a switch fabric is composed of multiple switch chips, these chips
must be programmed accordingly when an event occurred on one of them.

Examples of such event include hardware bridging: when a Linux bridge
spans interconnected chips, they must be programmed to allow external
ports to ingress frames on their internal ports.

Another example is cross-chip hardware VLANs. Switch chips in-between
interconnected bridge ports must also configure a given VLAN to allow
packets to pass through them.

In order to support that, this patchset introduces a non-intrusive
notifier mechanism. It adds a notifier head in every DSA switch tree
(the said fabric), and a notifier block in every DSA switch chip.

When an even occurs, it is chained to all notifiers of the fabric.
Switch chips can react accordingly if they are cross-chip capable.

On a dynamic debug enabled system, bridging a port in a multi-chip
fabric will print something like this (ZII Rev B board):

    # brctl addif br0 lan3
    mv88e6085 0.1:00: crosschip DSA port 1.0 bridged to br0
    mv88e6085 0.4:00: crosschip DSA port 1.0 bridged to br0
    # brctl delif br0 lan3
    mv88e6085 0.1:00: crosschip DSA port 1.0 unbridged from br0
    mv88e6085 0.4:00: crosschip DSA port 1.0 unbridged from br0

Currently only bridging events are added. A patchset introducing support
for cross-chip hardware bridging configuration in mv88e6xxx will follow
right after. Then events for switchdev operations are next on the line.

We should note that non-switchdev events do not support rolling-back
switch-wide operations. We'll have to work on closer integration with
switchdev for that, like introducing new attributes or objects, to
benefit from the prepare and commit phases.
====================
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

9172d2a0

net: dsa: introduce bridge notifier · 04d3a4c6

由 Vivien Didelot 提交于 2月 03, 2017

A slave device will now notify the switch fabric once its port is
bridged or unbridged, instead of calling directly its switch operations.

This code allows propagating cross-chip bridging events in the fabric.
Signed-off-by: NVivien Didelot <vivien.didelot@savoirfairelinux.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

04d3a4c6

net: dsa: add switch notifier · f515f192

由 Vivien Didelot 提交于 2月 03, 2017

Add a notifier block per DSA switch, registered against a notifier head
in the switch fabric they belong to.

This infrastructure will allow to propagate fabric-wide events such as
port bridging, VLAN configuration, etc. If a DSA switch driver cares
about cross-chip configuration, such events can be caught.
Signed-off-by: NVivien Didelot <vivien.didelot@savoirfairelinux.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

f515f192

net: dsa: change state setter scope · c5d35cb3

由 Vivien Didelot 提交于 2月 03, 2017

The scope of the functions inside net/dsa/slave.c must be the slave
net_device pointer. Change to state setter helper accordingly to
simplify callers.
Signed-off-by: NVivien Didelot <vivien.didelot@savoirfairelinux.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

c5d35cb3

net: dsa: rollback bridging on error · 9c265426

由 Vivien Didelot 提交于 2月 03, 2017

When an error is returned during the bridging of a port in a
NETDEV_CHANGEUPPER event, net/core/dev.c rolls back the operation.

Be consistent and unassign dp->bridge_dev when this happens.

In the meantime, add comments to document this behavior.
Signed-off-by: NVivien Didelot <vivien.didelot@savoirfairelinux.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

9c265426

net: dsa: simplify netdevice events handling · 8e92ab3a

由 Vivien Didelot 提交于 2月 03, 2017

Simplify the code handling the slave netdevice notifier call by
providing a dsa_slave_changeupper helper for NETDEV_CHANGEUPPER, and so
on (only this event is supported at the moment.)

Return NOTIFY_DONE when we did not care about an event, and NOTIFY_OK
when we were concerned but no error occurred, as the API suggests.
Signed-off-by: NVivien Didelot <vivien.didelot@savoirfairelinux.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

8e92ab3a

net: dsa: move netdevice notifier registration · 88e4f0ca

由 Vivien Didelot 提交于 2月 03, 2017

Move the netdevice notifier block register code in slave.c and provide
helpers for dsa.c to register and unregister it.

At the same time, check for errors since (un)register_netdevice_notifier
may fail.
Signed-off-by: NVivien Didelot <vivien.didelot@savoirfairelinux.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

88e4f0ca

net/mlx5e: fix another maybe-uninitialized false-positive · 321fa4ff

由 Arnd Bergmann 提交于 2月 03, 2017

In commit abeffce9 ("net/mlx5e: Fix a -Wmaybe-uninitialized warning"), I fixed a
gcc warning for the ipv4 offload handling. Now we get the same warning for the
added ipv6 support:

drivers/net/ethernet/mellanox/mlx5/core/en_tc.c:815:40: warning: 'out_dev' may be used uninitialized in this function [-Wmaybe-uninitialized]

We can apply the same workaround here as well.

Fixes: ce99f6b9 ("net/mlx5e: Support SRIOV TC encapsulation offloads for IPv6 tunnels")
Signed-off-by: NArnd Bergmann <arnd@arndb.de>
Acked-by: NOr Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

321fa4ff

net-next: treewide use is_vlan_dev() helper function. · d0d7b10b

由 Parav Pandit 提交于 2月 04, 2017

This patch makes use of is_vlan_dev() function instead of flag
comparison which is exactly done by is_vlan_dev() helper function.
Signed-off-by: NParav Pandit <parav@mellanox.com>
Reviewed-by: NDaniel Jurgens <danielj@mellanox.com>
Acked-by: NStephen Hemminger <stephen@networkplumber.org>
Acked-by: NJon Maxwell <jmaxwell37@gmail.com>
Acked-by: NJohannes Thumshirn <jth@kernel.org>
Acked-by: NHaiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

d0d7b10b

net/mlx4_en: fix a condition · 73cfb2a2

由 Dan Carpenter 提交于 2月 03, 2017

There is a "||" vs "|" typo here so we test 0x1 instead of 0x6.

Fixes: 1f8176f7 ("net/mlx4_en: Check the enabling pptx/pprx flags in SET_PORT wrapper flow")
Signed-off-by: NDan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: NTariq Toukan <tariqt@mellanox.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

73cfb2a2

sfc: don't rearm interrupts if busy polling · f820c0ac

由 Bert Kenward 提交于 2月 06, 2017

Since commit 364b6055 ("net: busy-poll: return busypolling status
to drivers"), napi_complete_done() returns a boolean that can be used
by drivers to conditionally rearm interrupts.

Testing with a 7142 shows a small latency improvement of ~100 ns.
Signed-off-by: NBert Kenward <bkenward@solarflare.com>
Cc: Eric Dumazet <edumazet@google.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

f820c0ac

sctp: process fwd tsn chunk only when prsctp is enabled · d15c9ede

由 Xin Long 提交于 2月 03, 2017

This patch is to check if asoc->peer.prsctp_capable is set before
processing fwd tsn chunk, if not, it will return an ERROR to the
peer, just as rfc3758 section 3.3.1 demands.
Reported-by: NJulian Cordes <julian.cordes@gmail.com>
Signed-off-by: NXin Long <lucien.xin@gmail.com>
Acked-by: NNeil Horman <nhorman@tuxdriver.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

d15c9ede

Merge branch 'mlxsw-cleanup-neigh-handling' · 3bc32d03

由 David S. Miller 提交于 2月 06, 2017

Jiri Pirko says:

====================
mlxsw: cleanup neigh handling

Ido says:

This series addresses long standing issues in the mlxsw driver
concerning neighbour reflection. It also prepares the code for follow-up
changes dealing with proper resource cleanup and nexthop reflection.

The first two patches convert the neighbour reflection code to use an
ordered workqueue, to prevent re-ordering of NEIGH_UPDATE events that
may happen following subsequent patches.

The third to fifth patches remove the ndo_neigh_{construct,destroy}
entry points from the driver, thereby relying only on NEIGH_UPDATE
events for neighbour reflection. This simplifies the code considerably.

Last patches are fallout and adjust nits in the code I noticed while
going over it.
====================
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

3bc32d03

mlxsw: spectrum_router: Fix typo in comment · fd76d910