提交 · fc08a01a6925a1a0d69bb9026f266606a6a96a20 · openeuler / raspberrypi-kernel

19 2月, 2016 12 次提交

H
cxgb4: Use __dev_uc_sync/__dev_mc_sync to sync MAC address · fc08a01a
由 Hariprasad Shenai 提交于 2月 16, 2016
```
Signed-off-by: NHariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>
```
fc08a01a

vxlan: tun_id is 64bit, not 32bit · 07dabf20

由 Jiri Benc 提交于 2月 18, 2016

The tun_id field in struct ip_tunnel_key is __be64, not __be32. We need to
convert the vni to tun_id correctly.

Fixes: 54bfd872 ("vxlan: keep flags and vni in network byte order")
Reported-by: NPaolo Abeni <pabeni@redhat.com>
Tested-by: NPaolo Abeni <pabeni@redhat.com>
Signed-off-by: NJiri Benc <jbenc@redhat.com>
Acked-by: NThadeu Lima de Souza Cascardo <cascardo@redhat.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

07dabf20

Merge branch 'netlink-mmap-remove' · f169af2c

由 David S. Miller 提交于 2月 18, 2016

Florian Westphal says:

====================
netlink: remove mmapped netlink support

As discussed during netconf 2016 in Seville, this series removes
CONFIG_NETLINK_MMAP.

Close to three years after it was merged it has retained several problems
that do not appear to be fixable.

No official netfilter libmnl release contains support for mmap backed netlink
sockets. No openvswitch release makes use of it either.

To use the mmap interface, userspace not only has to probe for mmap netlink
support, it also has to implement a recv/socket receive path in order to
handle messages that exceed the size of an rx ring element (NL_MMAP_STATUS_COPY).

So if there are odd programs out there that attempt to use MMAP netlink
they should continue to work as they already need a socket based code path
to work properly.

The actual revert (first patch) has a list of problems.
The followup patches remove a couple of helpers that are no longer needed
after the revert.

I did a few tests with mmap vs. socket based interface on a 4.4 based
kernel on an i7-4790 box and there are no performance advantages:

loopback, single nfqueue, queueing in -t filter INPUT:
traffic generated by 8 * ping -q -f localhost:
socket backend:
real    0m27.325s
user    0m3.993s
sys     0m23.292s

with mmap ring backend:
real    0m29.054s
user    0m4.924s
sys     0m24.127s

with single tcp stream, unidirectional, loopback mtu set at 1500
(nc localhost discard < /dev/zero > /dev/null):

socket interface:
time nfqdump -b $((8 * 1024 * 1024 * 1024)) -w /dev/null
real    0m15.960s
user    0m1.756s
sys     0m11.143s

mmap ring:
real    0m16.441s
user    0m3.040s
sys     0m13.687s

socket interface nfqdump[1] with --gso option (i.e. MTU is exceeded,
no kernel-side segmentation and checksum fixups) completes in about 5s.

I also tested dumping a conntrack table with 1m entries.
On my box this takes about 2.4 seconds for both mmap and socket backend:

time LD_PRELOAD=../../src/.libs/libmnl.so ./nfct-dump-sk > /dev/null
mnl_cb_run: Success
messages: 1000000
real    0m2.485s
user    0m1.085s
sys     0m1.400s

time LD_PRELOAD=../../src/.libs/libmnl.so ./nfct-dump-mmap > /dev/null
messages: 1000000
real    0m2.451s
user    0m1.124s
sys     0m1.328s

[1] https://git.breakpoint.cc/cgit/fw/nfqdump.git/
====================
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

f169af2c

nfnetlink: Revert "nfnetlink: add support for memory mapped netlink" · c5b0db32

由 Florian Westphal 提交于 2月 18, 2016

reverts commit 3ab1f683 ("nfnetlink: add support for memory mapped
netlink")'

Like previous commits in the series, remove wrappers that are not needed
after mmapped netlink removal.
Signed-off-by: NFlorian Westphal <fw@strlen.de>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

c5b0db32

nfnetlink: remove nfnetlink_alloc_skb · 905f0a73

由 Florian Westphal 提交于 2月 18, 2016

Following mmapped netlink removal this code can be simplified by
removing the alloc wrapper.
Signed-off-by: NFlorian Westphal <fw@strlen.de>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

905f0a73

Revert "genl: Add genlmsg_new_unicast() for unicast message allocation" · 263ea090

由 Florian Westphal 提交于 2月 18, 2016

This reverts commit bb9b18fb ("genl: Add genlmsg_new_unicast() for
unicast message allocation")'.

Nothing wrong with it; its no longer needed since this was only for
mmapped netlink support.
Signed-off-by: NFlorian Westphal <fw@strlen.de>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

263ea090

openvswitch: Revert: "Enable memory mapped Netlink i/o" · 551ddc05

由 Florian Westphal 提交于 2月 18, 2016

revert commit 795449d8 ("openvswitch: Enable memory mapped Netlink i/o").
Following the mmaped netlink removal this code can be removed.
Signed-off-by: NFlorian Westphal <fw@strlen.de>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

551ddc05

netlink: remove mmapped netlink support · d1b4c689

由 Florian Westphal 提交于 2月 18, 2016

mmapped netlink has a number of unresolved issues:

- TX zerocopy support had to be disabled more than a year ago via
  commit 4682a035 ("netlink: Always copy on mmap TX.")
  because the content of the mmapped area can change after netlink
  attribute validation but before message processing.

- RX support was implemented mainly to speed up nfqueue dumping packet
  payload to userspace.  However, since commit ae08ce00
  ("netfilter: nfnetlink_queue: zero copy support") we avoid one copy
  with the socket-based interface too (via the skb_zerocopy helper).

The other problem is that skbs attached to mmaped netlink socket
behave different from normal skbs:

- they don't have a shinfo area, so all functions that use skb_shinfo()
(e.g. skb_clone) cannot be used.

- reserving headroom prevents userspace from seeing the content as
it expects message to start at skb->head.
See for instance
commit aa3a0220 ("netlink: not trim skb for mmaped socket when dump").

- skbs handed e.g. to netlink_ack must have non-NULL skb->sk, else we
crash because it needs the sk to check if a tx ring is attached.

Also not obvious, leads to non-intuitive bug fixes such as 7c7bdf35
("netfilter: nfnetlink: use original skbuff when acking batches").

mmaped netlink also didn't play nicely with the skb_zerocopy helper
used by nfqueue and openvswitch.  Daniel Borkmann fixed this via
commit 6bb0fef4 ("netlink, mmap: fix edge-case leakages in nf queue
zero-copy")' but at the cost of also needing to provide remaining
length to the allocation function.

nfqueue also has problems when used with mmaped rx netlink:
- mmaped netlink doesn't allow use of nfqueue batch verdict messages.
  Problem is that in the mmap case, the allocation time also determines
  the ordering in which the frame will be seen by userspace (A
  allocating before B means that A is located in earlier ring slot,
  but this also means that B might get a lower sequence number then A
  since seqno is decided later.  To fix this we would need to extend the
  spinlocked region to also cover the allocation and message setup which
  isn't desirable.
- nfqueue can now be configured to queue large (GSO) skbs to userspace.
  Queing GSO packets is faster than having to force a software segmentation
  in the kernel, so this is a desirable option.  However, with a mmap based
  ring one has to use 64kb per ring slot element, else mmap has to fall back
  to the socket path (NL_MMAP_STATUS_COPY) for all large packets.

To use the mmap interface, userspace not only has to probe for mmap netlink
support, it also has to implement a recv/socket receive path in order to
handle messages that exceed the size of an rx ring element.

Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Ken-ichirou MATSUZAWA <chamaken@gmail.com>
Cc: Pablo Neira Ayuso <pablo@netfilter.org>
Cc: Patrick McHardy <kaber@trash.net>
Cc: Thomas Graf <tgraf@suug.ch>
Signed-off-by: NFlorian Westphal <fw@strlen.de>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

d1b4c689

net_sched: Improve readability of filter processing · 7e6e18fb

由 Jamal Hadi Salim 提交于 2月 18, 2016

Signed-off-by: NJamal Hadi Salim <jhs@mojatatu.com>
Acked-by: NDaniel Borkmann <daniel@iogearbox.net>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

7e6e18fb

bridge: switchdev: Offload VLAN flags to hardware bridge · 7fbac984

由 Ido Schimmel 提交于 2月 18, 2016

When VLANs are created / destroyed on a VLAN filtering bridge (MASTER
flag set), the configuration is passed down to the hardware. However,
when only the flags (e.g. PVID) are toggled, the configuration is done
in the software bridge alone.

While it is possible to pass these flags to hardware when invoked with
the SELF flag set, this creates inconsistency with regards to the way
the VLANs are initially configured.

Pass the flags down to the hardware even when the VLAN already exists
and only the flags are toggled.
Signed-off-by: NIdo Schimmel <idosch@mellanox.com>
Signed-off-by: NJiri Pirko <jiri@mellanox.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

7fbac984

net: phy: Add SGMII support for Marvell 88E1510/1512/1514/1518 · 930b37ee

由 Stefan Roese 提交于 2月 18, 2016

Add code to select SGMII-to-copper mode upon SGMII interface selection.
Signed-off-by: NStefan Roese <sr@denx.de>
Cc: Andrew Lunn <andrew@lunn.ch>
Cc: Florian Fainelli <f.fainelli@gmail.com>
Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

930b37ee

isdn: divamnt: use y2038-safe ktime_get_ts64() for trace data timestamps · 096f6262

由 Alison Schofield 提交于 2月 17, 2016

divamnt stores a start_time at module init and uses it to calculate
elapsed time. The elapsed time, stored in secs and usecs, is part of
the trace data the driver maintains for the DIVA Server ISDN cards.
No change to the format of that time data is required.

To avoid overflow on 32-bit systems use ktime_get_ts64() to return
the elapsed monotonic time since system boot.

This is a change from real to monotonic time. Since the driver only
stores elapsed time, monotonic time is sufficient and more robust
against real time clock changes. These new monotonic values can be
more useful for debugging because they can be easily compared to
other monotonic timestamps.

Note elaspsed time values will now start at system boot time rather
than module load time, so they will differ slightly from previously
reported values.

Remove declaration and init of previously unused time constants:
start_sec, start_usec.
Signed-off-by: NAlison Schofield <amsfield22@gmail.com>
Reviewed-by: NArnd Bergmann <arnd@arndb.de>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

096f6262

18 2月, 2016 28 次提交

Merge branch '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue · d56fddaa

由 David S. Miller 提交于 2月 18, 2016

Jeff Kirsher says:

====================
40GbE Intel Wired LAN Driver Updates 2016-02-17

This series contains updates to i40e/i40evf once again.

Mitch updates the use of a define instead of a magic number.  Adds support
for packet split receive on VFs, which is disabled by default.  Expands on
a code comment which was not verbose or really helpful.  Fixes an issue
where if a reset fails to complete and was not properly setting the
adapter state, which would cause a panic on rmmod, so set the adpater
state to DOWN to avoid a panic.

Jesse cleans up a "dump" in debugfs that never panned out to be useful.

Anjali adds a workaround for cases where we might have interrupts that get
lost but wright-back (WB) happened.  Fixes an issue by falling back to
enabling unicast, multicast and broadcast promiscuous mode when the driver
must disable it's use of "default port" (defport mode) due to internal
incompatibility with Multiple Function per Port (MFP).  Fixes an issue
where queues should never be enabled/disabled in the interrupt handler.

Kiran cleans up th code which used hard coded base VEB SEID since it was
removed from the specification.

Shannon adds a few bits for better debug messages.  Fixes an obscure corner
case, where it was possible to clear the NVM update wait flag when no
update_done message was actually received.
====================
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

d56fddaa

net-sysfs: remove unused fmt_long_hex · fbbef866

由 Colin Ian King 提交于 2月 15, 2016

Ever since commit 04ed3e74
("net: change netdev->features to u32") the format string
fmt_long_hex has not been used, so we may as well remove it.
Signed-off-by: NColin Ian King <colin.king@canonical.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

fbbef866

i40e/i40evf: Bump i40e to 1.4.15 and i40evf to 1.4.11. · 8888fd88

由 Catherine Sullivan 提交于 1月 15, 2016

Bump.

Change-ID: Ie280dc67e37a1cf667c3469499a4fb90f4177b75
Signed-off-by: NCatherine Sullivan <catherine.sullivan@intel.com>
Tested-by: NAndrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com>

8888fd88

i40e: When in promisc mode apply promisc mode to Tx Traffic as well · 3b120089

由 Anjali Singhai Jain 提交于 1月 15, 2016

In MFP mode particularly when we were setting the PF VSI in limited
promiscuous, the HW switch was still mirroring the outgoing packets
from other VSIs (VF/VMdq) onto the PF VSI.

With this new bit set, the mirroring doesn't happen any more and so
we are in limited promiscuous on the PF VSI in MFP which is similar
to defport.

An API check is not required, since this bit is reserved for FW API
version < 1.5

Also update copyright year in file headers.

Change-ID: I9840cb95f11dde733d943cb03ce84f68b9611bc8
Signed-off-by: NAnjali Singhai Jain <anjali.singhai@intel.com>
Tested-by: NAndrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com>

3b120089

i40e: clean event descriptor before use · 73b03f98

由 Shannon Nelson 提交于 1月 15, 2016

In one obscure corner case, it was possible to clear the NVM update wait
flag when no update_done message was actually received.  This patch
cleans the event descriptor before use, and moves the opcode check to
where it won't get done if there was no event to clean.

Also update copyright year in file headers.

Change-ID: I68bbc41965e93f4adf07cbe98b9dfd63d41509a4
Signed-off-by: NShannon Nelson <shannon.nelson@intel.com>
Tested-by: NAndrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com>

73b03f98

i40evf: set adapter state on reset failure · 9b9344f7

由 Mitch Williams 提交于 1月 15, 2016

If a reset fails to complete, the driver gets its affairs in order and
awaits the cold solace of rmmod. Unfortunately, it was not properly
setting the adapter state, which would cause a panic on rmmod, instead
of the desired surcease.

Set the adapter state to DOWN in this case, and avoid a panic.

Change-ID: I6fdd9906da52e023f8dc744f7da44b5d95278ca9
Signed-off-by: NMitch Williams <mitch.a.williams@intel.com>
Tested-by: NAndrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com>

9b9344f7

i40e: better error reporting for nvmupdate · 6e93d0c9

由 Shannon Nelson 提交于 1月 15, 2016

Make sure we return EBUSY while finishing up a reset, and add a few bits
for better debug messages.

Change-ID: I23f6c28a8d96d7aa171abcc265737cec7826c292
Signed-off-by: NShannon Nelson <shannon.nelson@intel.com>
Tested-by: NAndrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com>

6e93d0c9

i40e: expand comment · 0d790327

由 Mitch Williams 提交于 1月 15, 2016

Explain why we cannot remove this code, even though it works differently
than any of our other interrupt cause handling code.

Change-ID: Ie66203bd037a466066036611c31d44f759ec5176
Signed-off-by: NMitch Williams <mitch.a.williams@intel.com>
Tested-by: NAndrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com>

0d790327

i40e: Do not disable queues in the Legacy/MSI Interrupt handler · a16ae2d5

由 Anjali Singhai Jain 提交于 1月 15, 2016

The queues should never be enabled/disabled in the interrupt handler,
ICR0 interrupt enable should be the only thing that needs to be
dynamically changed in the handler.

This patch fixes that. Without this patch X722 platforms were
seeing weird ping timings when in Legacy mode since it takes
a whole lot of time for the HW/FW to re-enable queues.

Change-ID: If065afc45d81c5a19d4a94a00cd5b8f61cefc40c
Signed-off-by: NAnjali Singhai Jain <anjali.singhai@intel.com>
Tested-by: NAndrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com>

a16ae2d5

i40e/i40evf: avoid atomics · 16fd08b8

由 Mitch Williams 提交于 1月 15, 2016

In the case where we have a page fully used by receive data, we need to
release the page fully to the stack. Instead of calling get_page (which
increments the page count) followed by free_page (which decrements the
page count), just donate our reference to the stack. Although this
donation is not tax deductible, it does allow us to avoid two very
expensive atomic operations that reverse each other.

Change-ID: If70739792d5748995fc175ec92ac2171ed4ad8fc
Signed-off-by: NMitch Williams <mitch.a.williams@intel.com>
Tested-by: NAndrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com>

16fd08b8

i40e: Removal of code which relies on BASE VEB SEID · 4147e2c5

由 Kiran Patil 提交于 1月 15, 2016

Fixed mapping of SEID is removed from specification. Hence
this patch removes code which was using hard coded base VEB SEID.

Changed FCoE code to use "hw->pf_id" to obtain correct "idx"
and verified.

Removed defines for BASE VSI/VEB SEID and BASE_PF_SEID since it
is not used anymore.

Change-ID: Id507cf4b1fae1c0145e3f08ae9ea5846ea5840de
Signed-off-by: NKiran Patil <kiran.patil@intel.com>
Tested-by: NAndrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com>

4147e2c5

i40e: Fix PROMISC mode for Multi-function per port (MFP) devices · 6784ed5a

由 Anjali Singhai Jain 提交于 1月 15, 2016

This patch falls back to enabling unicast, multicast and
broadcast promiscuous mode when the driver must disable it's use
of "default port" aka defport mode (which is normally used to
provide a promiscuous mode), due to internal incompatibility
with Multiple Function per Port (aka MFP).

The situation that requires this patch is when Physical
Function 0 is the device being used, and it can support SR-IOV
when MFP is enabled, via the driver creating a VEB on an MFP
enabled adapter.

Change-ID: Ie90b00d0d58782a5dfcf2c3c9725a2eb90bd63d8
Signed-off-by: NAnjali Singhai Jain <anjali.singhai@intel.com>
Tested-by: NAndrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com>

6784ed5a

i40e: Add a SW workaround for lost interrupts · dd353109

由 Anjali Singhai Jain 提交于 1月 15, 2016

This patch adds a workaround for cases where we might have
interrupts that got lost but WB happened.
If that happens without this patch we will see a tx_timeout.
To work around it, this patch goes ahead and reschedules NAPI
in that situation, if NAPI is not already scheduled.
We also add a counter in ethtool to keep track of when
we detect a case of tx_lost_interrupt.

Note: napi_reschedule() can be safely called from process/service_task
context and is done in other drivers as well without an issue.

Change-ID: I00f98f1ce3774524d9421227652bef20fcbd0d20
Signed-off-by: NAnjali Singhai Jain <anjali.singhai@intel.com>
Tested-by: NAndrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com>

dd353109

i40e: trivial: cleanup use of pf->hw · f734dfff

由 Jesse Brandeburg 提交于 1月 15, 2016

This patch makes use of a pointer called hw consistent
in the i40e_remove function.

Change-ID: Idacc7ff0a09a68289c57457a78618bf5497de077
Signed-off-by: NJesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: NAndrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com>

f734dfff

i40evf: support packet split receive · 00e5ec4b

由 Mitch Williams 提交于 1月 15, 2016

Support packet split receive on VFs. This is off by default but can be
enabled using ethtool private flags. Because we need to trigger a reset
from outside of i40evf_main.c, create a new function to do so, and
export it.

Also update copyright year in file headers.

Change-ID: I721aa5d70113d3d6d94102e5f31526f6fc57cbbb
Signed-off-by: NMitch Williams <mitch.a.williams@intel.com>
Tested-by: NAndrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com>

00e5ec4b

i40e: drop unused debugfs file "dump" · cb5c260e

由 Jesse Brandeburg 提交于 1月 15, 2016

There was a completely unused file "dump" in debugfs that
never panned out to be useful.

Change-ID: I12bb9e37b5a83299725dda815a8746157baf6562
Signed-off-by: NJesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: NAndrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com>

cb5c260e

i40e: get rid of magic number · d6b3bca1

由 Mitch Williams 提交于 1月 15, 2016

We have a define for this, use it. No functional change.

Change-ID: Ic0e3ea4f562e46de63b2a8de07f291ccc10205fd
Signed-off-by: NMitch Williams <mitch.a.williams@intel.com>
Tested-by: NAndrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com>

d6b3bca1

Merge branch 'vxlan-cleanups' · 5270c4da

由 David S. Miller 提交于 2月 17, 2016

Jiri Benc says:

====================
vxlan: clean up rx path, consolidating extension handling

The rx path of VXLAN turned over time into kind of spaghetti code. The rx
processing is split between vxlan_udp_encap_recv and vxlan_rcv but in an
artificial way: vxlan_rcv is just called at the end of vxlan_udp_encap_recv,
continuing the rx processing where vxlan_udp_encap_recv left it. There's no
clear border between those two functions.

It makes sense to combine those functions into one; this will be actually
needed for VXLAN-GPE where we'll need to skip part of the processing which
is hard to do with the current code.

However, both functions are too long already. This patchset is shortening
them, consolidating extension handling that is spread all around together
and moving it to separate functions. (Later patchsets will do more
consolidation in other parts of the functions with the final goal of merging
vxlan_udp_encap_recv and vxlan_rcv.)

In process of consolidation of the extension handling, I needed to deal with
vni field in a generic way, as its lower 8 bits mean different things for
different extensions. While cleaning up the code to strictly distinguish
between "vni" and "vni field" (which contains vni plus an additional byte),
I also converted the code not to convert endianess back and forth.

The full picture can be seen at:
https://github.com/jbenc/linux-vxlan/commits/master
====================
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

5270c4da

vxlan: treat vni in metadata based tunnels consistently · b9167b2e

由 Jiri Benc 提交于 2月 16, 2016

For metadata based tunnels, VNI is ignored when doing vxlan device lookups
(because such tunnel receives all VNIs). However, this was not honored by
vxlan_xmit_one when doing encapsulation bypass. Move the check for metadata
based tunnel to the common place where it belongs.
Signed-off-by: NJiri Benc <jbenc@redhat.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

b9167b2e

vxlan: clean up rx error path · 288b01c8

由 Jiri Benc 提交于 2月 16, 2016

When there are unrecognized flags present in the vxlan header, it doesn't
make much sense to return the packet for further UDP processing, especially
considering that for other invalid flag combinations we drop the packet
because of previous checks.

This means we return positive value only at the beginning of the function
where tun_dst is not yet allocated. This allows us to get rid of the
bad_flags and error jump labels.

When we're dropping packet, we need to free tun_dst now.
Signed-off-by: NJiri Benc <jbenc@redhat.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

288b01c8

vxlan: clean up extension handling on rx · f14ecebb

由 Jiri Benc 提交于 2月 16, 2016

Bring the extension handling to a single place and move the actual handling
logic out of vxlan_udp_encap_recv as much as possible.
Signed-off-by: NJiri Benc <jbenc@redhat.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

f14ecebb

vxlan: move GBP header parsing to a separate function · 3288af08

由 Jiri Benc 提交于 2月 16, 2016

To make vxlan_udp_encap_recv shorter and more comprehensible.
Signed-off-by: NJiri Benc <jbenc@redhat.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

3288af08

vxlan: simplify vxlan_remcsum · be5cfeab

由 Jiri Benc 提交于 2月 16, 2016

Part of the parameters is not needed. Simplify the caller of this function
in preparation of making vxlan rx more comprehensible.
Signed-off-by: NJiri Benc <jbenc@redhat.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

be5cfeab

vxlan: keep flags and vni in network byte order · 54bfd872

由 Jiri Benc 提交于 2月 16, 2016

Prevent repeated conversions from and to network order in the fast path.

To achieve this, define all flag constants in big endian order and store VNI
as __be32. To prevent confusion between the actual VNI value and the VNI
field from the header (which contains additional reserved byte), strictly
distinguish between "vni" and "vni_field".
Signed-off-by: NJiri Benc <jbenc@redhat.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

54bfd872

vxlan: introduce vxlan_hdr · d4ac05ff

由 Jiri Benc 提交于 2月 16, 2016

Currently, pointer to the vxlan header is kept in a local variable. It has
to be reloaded whenever the pskb pull operations are performed which usually
happens somewhere deep in called functions.

Create a vxlan_hdr function and use it to reference the vxlan header
instead.
Signed-off-by: NJiri Benc <jbenc@redhat.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

d4ac05ff

Merge branch '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue · d8ef0347

由 David S. Miller 提交于 2月 17, 2016

Jeff Kirsher says:

====================
40GbE Intel Wired LAN Driver Updates 2016-02-17

This series contains updates to i40e/i40evf only (again).

Jesse moves sync_vsi_filters() up in the service_task because it may need
to request a reset, and we do not want to wait another round of service
task time.  Refactored the enable_icr0() in order to allow it to be
decided by the caller whether the CLEARPBA (clear pending events) bit will
be set while re-enabling the interrupt.  Also provides the "Don't Give Up"
patch, where the driver will keep polling trying to allocate receive buffers
until it succeeds.  This should keep all receive queues running even in
the face of memory pressure.  Cleans up the debugging helpers by putting
everything in hex to be consistent.

Neerav updates the DCB firmware version related checkes specific to X710
and XL710 only since the checks are not required for X722 devices.

Shannon adds the use of the new shared MAC filter bit for multicast and
broadcast filters in order to make better use of the filters available
from the device.  Added a parameter to allow the driver to set the
enable/disable of statistics gathering in the hardware switch.  Also the
L2 cloud filtering parameter is removed since it was never used.

Anjali refactors the force_wb and WB_ON_ITR functionality since
Force-WriteBack functionality in X710/XL710 devices has been moved out of
the clean routine and into the service task, so we need to make sure
WriteBack-On-ITR is separated out since it is still called from clean.

Catherine changes the VF driver string to reflect all the products that
are supported.

Mitch refactors the packet split receive code to properly use half-pages
for receives.  Also changes the use of bitwise operators to logical
operators on clean_complete variable, while making a witty reference to
Mr. Spock.  Cleans up (i.e. removes) the hsplit field in the ring
structure and use the existing macro to detect packet split enablement,
which allows debugfs dumps of the VSI to properly show which recevie
routine is in use.
====================
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

d8ef0347

Merge branch 'rocker-worlds' · 2bc6b4f4

由 David S. Miller 提交于 2月 17, 2016

Jiri Pirko says:

====================
rocker: do world split

This patchset allows new rocker worlds to be easily added in future.
Two new worlds are now under development: P4 and eBPF.

The main part of the patchset is the OF-DPA carve-out. It resuts in OF-DPA
specific file. Clean cut.

Note this patchset is based on my original attempt in October 2015.
I had to rebase, included all suggestions and did lot of small changes.
Main change to go with all-port-one-world approach. Port world is set according
to what is setup in HW. Not possible to change worlds from driver.

v1->v2:
  patch 12/13:
  - split port_init into pre-init and init
====================
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

2bc6b4f4

rocker: return -EOPNOTSUPP for undefined world ops · fccd84d4

由 Jiri Pirko 提交于 2月 16, 2016

Suggested-by: NScott Feldman <sfeldma@gmail.com>
Signed-off-by: NJiri Pirko <jiri@resnulli.us>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

fccd84d4