Commits · 6b633e82b0f902a4cceb9bcdcb5bb31d04ca6264 · openeuler / Kernel

22 Apr, 2017 25 commits

Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec-next · 6b633e82

Committed by David S. Miller on Apr 21, 2017

Steffen Klassert says:

====================
pull request (net-next): ipsec-next 2017-04-20

This adds the basic infrastructure for IPsec hardware
offloading. It creates a configuration API and adjusts
the packet path.

1) Add the needed netdev features to configure IPsec offloads.

2) Add the IPsec hardware offloading API.

3) Prepare the ESP packet path for hardware offloading.

4) Add GSO handlers for esp4 and esp6. This implements
   the software fallback for GSO packets.

5) Add xfrm replay handler functions for offloading.

6) Change ESP to use a synchronous crypto algorithm on
   offloading, since we don't have the option for asynchronous
   returns when we handle IPsec at layer 2.

7) Add an xfrm validate function to validate_xmit_skb. This
   implements the software fallback for non-GSO packets.

8) Set the inner_network and inner_transport members of
   the SKB, as well as encapsulation, to reflect the actual
   positions of these headers, and remove them only once
   encryption is done on the payload.
   From Ilan Tayari.

9) Prepare the ESP GRO codepath for hardware offloading.

10) Fix incorrect null pointer check in esp6.
    From Colin Ian King.

11) Fix for the GSO software fallback path to detect the
    fallback correctly.
    From Ilan Tayari.

Please pull or let me know if there are problems.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

6b633e82

MAINTAINERS: Add new IPsec offloading files. · 77999328

Committed by Steffen Klassert on Apr 20, 2017

This adds two new files to IPsec maintenance scope:

net/ipv4/esp4_offload.c
net/ipv6/ip6_offload.c
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

77999328

Merge branch '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue · 072cec77

Committed by David S. Miller on Apr 21, 2017

Jeff Kirsher says:

====================
40GbE Intel Wired LAN Driver Updates 2017-04-19

This series contains updates to i40e and i40evf only, most notable being
the addition of trace points for BPF programs.

Tobias Klauser updates i40evf to use net_device stats struct instead
of a local private copy.

Preethi updates the VF driver to not enable receive checksum offload by
default for tunneled packets.

Alex fixes an issue he introduced when he converted the code over to
using the length field to determine if a descriptor was done or not.

Mitch adds the ability to dump additional information on the VFs, which
is not available through 'ip link show' using debugfs.

Scott adds trace points to the drivers so that BPF programs can be
attached for feature testing and verification.

Jingjing adds admin queue functions for Pipeline Personalization Profile
commands.

Jake does most of the heavy lifting in this series, starting with a
reduction in the scope of the RTNL lock being held while resetting VFs
to allow multiple PFs to reset in a timely manner.  Factored out the
direct queue modification so that we are able to re-use the code.
Reduced the wait time for admin queue commands to complete, since we were
waiting a minimum of a millisecond, when in practice the admin queue
command is often processed much faster.  Cleaned up code (flag) we never
use.  Made the code that resets all the VFs work in parallel instead of
in the current serialized fashion, to help reduce the time it takes.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

072cec77

netvsc: fix use after free on module removal · 76bb5db5

Committed by stephen hemminger on Apr 19, 2017

The NAPI data structure is embedded in the netvsc_device structure
and is freed when device is closed. There is still a reference
(in NAPI list) to this which causes a crash in netif_napi_del
when device is removed. Fix by managing NAPI instances correctly.
Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

76bb5db5

Merge branch 'tc-filter-cleanup-destroy-delete' · dfb05553

Committed by David S. Miller on Apr 21, 2017

Cong Wang says:

====================
net_sched: clean up tc filter destroy and delete logic

The first patch fixes a potential race condition, the second one
is pure cleanup.
====================
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

dfb05553

net_sched: remove useless NULL to tp->root · 43920538

Committed by WANG Cong on Apr 19, 2017

There is no need to NULL tp->root in ->destroy(), since tp is
going to be freed very soon, and existing readers are still
safe to read them.

For cls_route, we always init its tp->root, so it can't be NULL,
we can drop more useless code.

Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

43920538

net_sched: move the empty tp check from ->destroy() to ->delete() · 763dbf63

Committed by WANG Cong on Apr 19, 2017

We could have a race condition where in ->classify() path we
dereference tp->root and meanwhile a parallel ->destroy() makes it
a NULL. Daniel cured this bug in commit d9363774
("net, sched: respect rcu grace period on cls destruction").

This happens when ->destroy() is called for deleting a filter to
check if we are the last one in tp, this tp is still linked and
visible at that time. The root cause of this problem is the semantic
of ->destroy(), it does two things (for non-force case):

1) check if tp is empty
2) if tp is empty we could really destroy it

and its caller, if it cares, needs to check its return value to see if it
is really destroyed. Therefore we can't unlink tp unless we know it is
empty.

As suggested by Daniel, we could actually move the test logic to ->delete()
so that we can safely unlink tp after ->delete() tells us the last one is
just deleted and before ->destroy().

Fixes: 1e052be6 ("net_sched: destroy proto tp when all filters are gone")
Cc: Roi Dayan <roid@mellanox.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
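
The reordering described in this commit message can be sketched in plain userspace C. This is only a toy model under stated assumptions: `struct toy_tp` and the `toy_*` functions are hypothetical stand-ins rather than the kernel's structures or callbacks, and RCU grace periods are left out entirely. It shows the shape of the idea: the delete step reports whether the last filter was just removed, and the caller unlinks the tp before calling the destroy step.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a tc classifier instance (tp). */
struct toy_tp {
    int nr_filters;   /* filters still attached to this tp */
    bool linked;      /* still reachable by classify-side readers */
    bool destroyed;   /* teardown has run */
};

/* delete analogue: remove one filter and report whether it was the last. */
int toy_delete(struct toy_tp *tp, bool *last)
{
    assert(tp != NULL && last != NULL);
    if (tp->nr_filters == 0)
        return -1;    /* nothing left to delete */
    tp->nr_filters--;
    *last = (tp->nr_filters == 0);
    return 0;
}

/* destroy analogue: pure teardown, no emptiness test any more. */
void toy_destroy(struct toy_tp *tp)
{
    assert(!tp->linked);   /* must already be unlinked */
    tp->destroyed = true;
}

/* Caller: unlink between delete and destroy, only once tp became empty. */
int toy_del_filter(struct toy_tp *tp)
{
    bool last = false;
    int err = toy_delete(tp, &last);

    if (err)
        return err;
    if (last) {
        tp->linked = false;   /* readers can no longer find tp */
        toy_destroy(tp);
    }
    return 0;
}
```

With two filters attached, the first `toy_del_filter()` call leaves the tp linked and intact; only the second call unlinks and destroys it.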

763dbf63

bpf: add napi_id read access to __sk_buff · b1d9fc41

Committed by Daniel Borkmann on Apr 19, 2017

Add napi_id access to __sk_buff for socket filter program types, tc
program types and other bpf_convert_ctx_access() users. Having access
to skb->napi_id is useful for per RX queue listener siloing, e.g.
in combination with SO_ATTACH_REUSEPORT_EBPF and when busy polling is
used, meaning SO_REUSEPORT enabled listeners can then select the
corresponding socket at SYN time already [1]. The skb is marked via
skb_mark_napi_id() early in the receive path (e.g., napi_gro_receive()).

Currently, sockets can only use SO_INCOMING_NAPI_ID from 6d433902
("net: Introduce SO_INCOMING_NAPI_ID") as a socket option to look up
the NAPI ID associated with the queue for steering, which requires a
prior sk_mark_napi_id() after the socket was looked up.

Semantics for the __sk_buff napi_id access are similar, meaning if
skb->napi_id is < MIN_NAPI_ID (e.g. outgoing packets using sender_cpu),
then an invalid napi_id of 0 is returned to the program, otherwise a
valid non-zero napi_id.

  [1] http://netdevconf.org/2.1/slides/apr6/dumazet-BUSY-POLLING-Netdev-2.1.pdf

Suggested-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
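
The read semantics described above (values below MIN_NAPI_ID are reported as 0) can be illustrated with a small userspace sketch. `TOY_MIN_NAPI_ID` and `toy_read_napi_id()` are hypothetical names, and the threshold value is an assumption chosen for the example; the real MIN_NAPI_ID depends on the kernel configuration.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed threshold for illustration only, not the kernel's value. */
#define TOY_MIN_NAPI_ID 1025u

/* 0 is handed to the program as the "invalid" ID, so the threshold
 * itself must be non-zero. */
static_assert(TOY_MIN_NAPI_ID > 0, "0 must stay reserved as the invalid ID");

/* Value a program would observe when reading napi_id: the raw value if
 * it is a valid NAPI ID, otherwise 0. */
uint32_t toy_read_napi_id(uint32_t raw_napi_id)
{
    return raw_napi_id >= TOY_MIN_NAPI_ID ? raw_napi_id : 0;
}
```

A raw value such as 3 (for instance a sender CPU number on an outgoing packet) would therefore read back as 0 in this model, while 1025 would be passed through unchanged.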

b1d9fc41

netvsc: Deal with rescinded channels correctly · 73e64fa4

Committed by K. Y. Srinivasan on Apr 19, 2017

We will not be able to send packets over a channel that has been
rescinded. Make necessary adjustments so we can properly cleanup
even when the channel is rescinded. This issue can be triggered
in the NIC hot-remove path.
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

73e64fa4

Merge branch 'ibmvnic-updates-and-bug-fixes' · 87e978ed

Committed by David S. Miller on Apr 21, 2017

Nathan Fontenot says:

====================
ibmvnic: Updates and bug fixes

This set of patches is a series of updates to remove some unneeded
and unused code in the driver as well as bug fixes for the
ibmvnic driver.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

87e978ed

ibmvnic: Remove unused bouce buffer · d76e0fec

Committed by Nathan Fontenot on Apr 19, 2017

The bounce buffer is not used in the ibmvnic driver, just
get rid of it.
Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

d76e0fec

ibmvnic: Allocate zero-filled memory for sub crqs · 7f7adc50

Committed by Nathan Fontenot on Apr 19, 2017

Update the allocation of memory for the sub crq structs and their
associated pages to allocate zero-filled memory.
Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

7f7adc50

ibmvnic: Disable irq prior to close · dd9c20fa

Committed by Brian King on Apr 19, 2017

    Add some code to call disable_irq on all the vnic interface's irqs.
    This fixes a crash observed when closing an active interface, as
    seen in the oops below when we try to access a buffer in the interrupt
    handler which we've already freed.

    Unable to handle kernel paging request for data at address 0x00000001
    Faulting instruction address: 0xd000000003886824
    Oops: Kernel access of bad area, sig: 11 [#1]
    SMP NR_CPUS=2048 NUMA pSeries
    Modules linked in: ibmvnic(OEN) rpadlpar_io(X) rpaphp(X) tcp_diag udp_diag inet_diag unix_diag af_packet_diag netlink_diag rpcsec_
    Supported: No, Unsupported modules are loaded
    CPU: 8 PID: 0 Comm: swapper/8 Tainted: G           OE   NX 4.4.49-92.11-default #1
    task: c00000007f990110 ti: c0000000fffa0000 task.ti: c00000007f9b8000
    NIP: d000000003886824 LR: d000000003886824 CTR: c0000000007eff60
    REGS: c0000000fffa3a70 TRAP: 0300   Tainted: G           OE   NX  (4.4.49-92.11-default)
    MSR: 8000000000009033 <SF,EE,ME,IR,DR,RI,LE>  CR: 22008042  XER: 20000008
    CFAR: c000000000008468 DAR: 0000000000000001 DSISR: 40000000 SOFTE: 0
    GPR00: d000000003886824 c0000000fffa3cf0 d000000003894118 0000000000000000
    GPR04: 0000000000000000 0000000000000000 c000000001249da0 0000000000000000
    GPR08: 000000000000000e 0000000000000000 c0000000ccb00000 d000000003889180
    GPR12: c0000000007eff60 c000000007af4c00 0000000000000001 c0000000010def30
    GPR16: c00000007f9b8000 c000000000b98c30 c00000007f9b8080 c000000000bab858
    GPR20: 0000000000000005 0000000000000000 c0000000ff5d7e80 c0000000f809f648
    GPR24: c0000000ff5d7ec8 0000000000000000 0000000000000000 c0000000ccb001a0
    GPR28: 000000000000000a c0000000f809f600 c0000000fd4cd900 c0000000f9cd5b00
    NIP [d000000003886824] ibmvnic_interrupt_tx+0x114/0x380 [ibmvnic]
    LR [d000000003886824] ibmvnic_interrupt_tx+0x114/0x380 [ibmvnic]
    Call Trace:
    [c0000000fffa3cf0] [d000000003886824] ibmvnic_interrupt_tx+0x114/0x380 [ibmvnic] (unreliable)
    [c0000000fffa3dd0] [c000000000132940] __handle_irq_event_percpu+0x90/0x2e0
    [c0000000fffa3e90] [c000000000132bcc] handle_irq_event_percpu+0x3c/0x90
    [c0000000fffa3ed0] [c000000000132c88] handle_irq_event+0x68/0xc0
    [c0000000fffa3f00] [c000000000137edc] handle_fasteoi_irq+0xec/0x250
    [c0000000fffa3f30] [c000000000131b04] generic_handle_irq+0x54/0x80
    [c0000000fffa3f60] [c000000000011190] __do_irq+0x80/0x1d0
    [c0000000fffa3f90] [c0000000000248d8] call_do_irq+0x14/0x24
    [c00000007f9bb9e0] [c000000000011380] do_IRQ+0xa0/0x120
    [c00000007f9bba40] [c000000000002594] hardware_interrupt_common+0x114/0x180
Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

dd9c20fa

ibmvnic: Correct crq and resource releasing · 37489055

Committed by Nathan Fontenot on Apr 19, 2017

We should not be releasing the crq's when calling close for the
adapter; these need to remain open to facilitate operations such
as updating the mac address. The crq's should be released in the
adapter's remove routine.

Additionally, we need to call release_resources from remove. This
corrects the scenario of trying to remove an adapter that has only
been probed.
Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

37489055

ibmvnic: Remove inflight list · 661a2622

Committed by Nathan Fontenot on Apr 19, 2017

The inflight list used to track memory that is allocated for crq that are
inflight is not needed. The one piece of the inflight list that does need
to be cleaned at module exit is the error buffer list which is already
attached to the adapter struct.

This patch removes the inflight list and moves checking the error buffer
list to ibmvnic_remove.
Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

661a2622

ibmvnic: Do not disable IRQ after scheduling tasklet · ed7ecbf7

Committed by Brian King on Apr 19, 2017

Since the primary CRQ is only used for service functions and
not in the performance path, simplify the code a bit and avoid
disabling the IRQ.
Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ed7ecbf7

ibmvnic: Fixup atomic API usage · 58c8c0c0

Committed by Brian King on Apr 19, 2017

Replace a couple of places where an atomic is modified and then
read back separately, a sequence that is not atomic as a whole, with
the atomic_XX_return variants to avoid race conditions.
Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
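
The pattern being fixed can be shown with a userspace analogue built on C11 `<stdatomic.h>`. This is a sketch, not the driver code: the `toy_*` names are hypothetical, and C11's `atomic_fetch_sub()` returns the old value, whereas the kernel's `*_return` helpers return the new one, so the new value is derived as `old - 1` here.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Racy pattern: the decrement and the read are two separate operations,
 * so another thread may change the counter in between them. */
bool toy_put_racy(atomic_int *pending)
{
    atomic_fetch_sub(pending, 1);
    return atomic_load(pending) == 0;
}

/* Fixed pattern: decide based on the value produced by the
 * read-modify-write itself. */
bool toy_put_fixed(atomic_int *pending)
{
    int old = atomic_fetch_sub(pending, 1);

    assert(old > 0);          /* caller must hold a reference */
    return old - 1 == 0;      /* true only for whoever dropped the last one */
}
```

In a single thread both variants behave the same; the difference only matters under concurrency, where the racy variant can report "reached zero" to more than one caller, or to none.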

58c8c0c0

ibmvnic: Unmap longer term buffer before free · 59af56c2

Committed by Brian King on Apr 19, 2017

Make sure we unregister long term buffers from the adapter
prior to DMA unmapping it and freeing the buffer. Failure
to do so could result in a DMA to a now invalid address.
Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

59af56c2

ibmvnic: Fix ibmvnic_change_mac_addr struct format · 993a82b0

Committed by Murilo Fossa Vicentini on Apr 19, 2017

The ibmvnic_change_mac_addr struct alignment was not matching the defined
format in PAPR+, it had the reserved and return code fields swapped. As a
consequence, the CHANGE_MAC_ADDR_RSP commands were being improperly handled
and executed even when the operation wasn't successfully completed by the
system firmware.

Also changing the endianness of the debug message to make it easier to
parse the CRQ content.
Signed-off-by: Murilo Fossa Vicentini <muvic@linux.vnet.ibm.com>
Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

993a82b0

ibmvnic: Report errors when failing to release sub-crqs · ffa73855

Committed by Thomas Falcon on Apr 19, 2017

Add reporting of errors when releasing sub-crqs fails.
Signed-off-by: Thomas Falcon <tlfalcon@us.ibm.com>
Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ffa73855

liquidio: remove unnecessary variable assignment · ca1cb28d

Committed by Arnd Bergmann on Apr 19, 2017

gcc points out a useless assignment that was added during code refactoring:

drivers/net/ethernet/cavium/liquidio/lio_ethtool.c: In function 'octnet_intrmod_callback':
drivers/net/ethernet/cavium/liquidio/lio_ethtool.c:1315:59: error: parameter 'oct_dev' set but not used [-Werror=unused-but-set-parameter]

This is harmless but can clearly be removed to avoid the warning.

Fixes: 50c0add5 ("liquidio: refactor interrupt moderation code")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>

ca1cb28d

Replace 2 jiffies with sysctl netdev_budget_usecs to enable softirq tuning · 7acf8a1e

Committed by Matthew Whitehead on Apr 19, 2017

Constants used for tuning are generally a bad idea, especially as hardware
changes over time. Replace the constant 2 jiffies with sysctl variable
netdev_budget_usecs to enable sysadmins to tune the softirq processing.
Also document the variable.

For example, a very fast machine might tune this to 1000 microseconds,
while my regression testing 486DX-25 needs it to be 4000 microseconds on
a nearly idle network to prevent time_squeeze from being incremented.

Version 2: changed jiffies to microseconds for predictable units.
Signed-off-by: Matthew Whitehead <tedheadster@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
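
To see why a fixed "2 jiffies" limit is an unpredictable unit, the following sketch converts jiffies to microseconds for different tick rates and models the stop condition of a budgeted poll loop. The `toy_*` functions are hypothetical helpers written for this illustration; they are not the kernel's implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Duration of n jiffies in microseconds for a given tick rate (HZ). */
uint64_t toy_jiffies_to_usecs(uint64_t n, unsigned int hz)
{
    assert(hz > 0);
    return n * 1000000u / hz;
}

/* True once a poll loop should stop: either the packet budget is spent
 * or the elapsed time has reached the configured limit in microseconds. */
bool toy_softirq_should_stop(int budget_left, uint64_t elapsed_usecs,
                             uint64_t budget_usecs)
{
    return budget_left <= 0 || elapsed_usecs >= budget_usecs;
}
```

Under this arithmetic, 2 jiffies is 2000 microseconds at HZ=1000 but 20000 microseconds at HZ=100, which is the kind of configuration-dependent spread that a limit expressed directly in microseconds avoids.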

7acf8a1e

Merge branch 'iptunnel-policy-based-routing' · 20da848f

Committed by David S. Miller on Apr 21, 2017

Craig Gallek says:

====================
ip_tunnel: Allow policy-based routing through tunnels

iproute2 changes to follow.  Example usage:
  ip link add gre-test type gre local 10.0.0.1 remote 10.0.0.2 fwmark 0x4
  ip -detail link show gre-test
  ...
  ip link set gre-test type gre fwmark 0
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

20da848f

ip_tunnel: Allow policy-based routing through tunnels · 9830ad4c

Committed by Craig Gallek on Apr 19, 2017

This feature allows the administrator to set an fwmark for
packets traversing a tunnel.  This allows the use of independent
routing tables for tunneled packets without the use of iptables.

There is no concept of per-packet routing decisions through IPv4
tunnels, so this implementation does not need to work with
per-packet route lookups as the v6 implementation may
(with IP6_TNL_F_USE_ORIG_FWMARK).

Further, since the v4 tunnel ioctls share data structures
(which cannot be trivially modified) with the kernel's internal
tunnel configuration structures, the mark attribute must be stored
in the tunnel structure itself and passed as a parameter when
creating or changing tunnel attributes.
Signed-off-by: Craig Gallek <kraig@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

9830ad4c

ip6_tunnel: Allow policy-based routing through tunnels · 0a473b82

Committed by Craig Gallek on Apr 19, 2017

This feature allows the administrator to set an fwmark for
packets traversing a tunnel.  This allows the use of independent
routing tables for tunneled packets without the use of iptables.
Signed-off-by: Craig Gallek <kraig@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

0a473b82

21 Apr, 2017 15 commits

net: dsa: Remove redundant NULL dst check · 8e6c1812

Committed by Florian Fainelli on Apr 20, 2017

tag_lan9303.c does check for a NULL dst but that's already checked by
dsa_switch_rcv() one layer above.
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Acked-by: Juergen Borleis <jbe@pengutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>

8e6c1812

net/mlx5e: IPoIB, Fix error handling in mlx5_rdma_netdev_alloc() · 6905e5a5

Committed by Dan Carpenter on Apr 19, 2017

The labels were out of order, so it could result in either an Oops or a
leak.

Fixes: 48935bbb ("net/mlx5e: IPoIB, Add netdevice profile skeleton")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
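
For readers unfamiliar with the label-ordering convention this fix refers to, here is a generic userspace sketch of goto-based unwinding. It is not the mlx5 code: `struct toy_dev`, `toy_dev_alloc()` and the `fail_at` parameter (used to simulate a failing step) are hypothetical. The point is that cleanup labels appear in reverse order of acquisition, so each failure jumps to a label that frees exactly what was already allocated.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical device with two dependent allocations. */
struct toy_dev {
    int *rx;
    int *tx;
};

/* fail_at: 0 = no failure, 1..3 = pretend that allocation step fails. */
struct toy_dev *toy_dev_alloc(int fail_at)
{
    struct toy_dev *dev;

    assert(fail_at >= 0 && fail_at <= 3);

    dev = fail_at == 1 ? NULL : calloc(1, sizeof(*dev));
    if (!dev)
        return NULL;

    dev->rx = fail_at == 2 ? NULL : calloc(16, sizeof(*dev->rx));
    if (!dev->rx)
        goto err_free_dev;

    dev->tx = fail_at == 3 ? NULL : calloc(16, sizeof(*dev->tx));
    if (!dev->tx)
        goto err_free_rx;

    return dev;

    /* Labels unwind in reverse order of acquisition. Swapping them would
     * either touch dev after it was freed or leak dev->rx. */
err_free_rx:
    free(dev->rx);
err_free_dev:
    free(dev);
    return NULL;
}

void toy_dev_free(struct toy_dev *dev)
{
    if (!dev)
        return;
    free(dev->tx);
    free(dev->rx);
    free(dev);
}
```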

6905e5a5

qede: allocate enough data for ->arfs_fltr_bmap · f6ca26f2

Committed by Dan Carpenter on Apr 19, 2017

We've got the number of longs, yes, but we should multiply by
sizeof(long) to get the number of bytes needed.

Fixes: e4917d46 ("qede: Add aRFS support")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
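
The sizing rule in this message (number of longs times sizeof(long)) can be shown with a small userspace bitmap sketch. The `toy_*` helpers and `TOY_BITS_PER_LONG` are hypothetical names for this example and only mirror the arithmetic, not the driver's code.

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>

#define TOY_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/* Number of unsigned longs needed to hold nbits bits (rounded up). */
size_t toy_bits_to_longs(size_t nbits)
{
    return (nbits + TOY_BITS_PER_LONG - 1) / TOY_BITS_PER_LONG;
}

/* Allocate a zeroed bitmap. calloc(count, size) multiplies the number of
 * longs by sizeof(unsigned long), giving the size in bytes; passing the
 * long count alone as a byte size would under-allocate. */
unsigned long *toy_bitmap_zalloc(size_t nbits)
{
    assert(nbits > 0);
    return calloc(toy_bits_to_longs(nbits), sizeof(unsigned long));
}

void toy_set_bit(unsigned long *map, size_t bit)
{
    map[bit / TOY_BITS_PER_LONG] |= 1UL << (bit % TOY_BITS_PER_LONG);
}

int toy_test_bit(const unsigned long *map, size_t bit)
{
    return (int)((map[bit / TOY_BITS_PER_LONG] >> (bit % TOY_BITS_PER_LONG)) & 1UL);
}
```

On a machine with 64-bit longs, a 200-bit map needs 4 longs, which is 32 bytes rather than 4.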

f6ca26f2

tcp_cubic: fix typo in module param description · d6ecf328

Committed by Chema Gonzalez on Apr 18, 2017

Signed-off-by: Chema Gonzalez <chemag@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

d6ecf328

Add Jiri Pirko as TC subsystem co-maintainer · b603aa4d

Committed by Jamal Hadi Salim on Apr 18, 2017

Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
Acked-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>

b603aa4d

Add Cong Wang as TC subsystem co-maintainer · 7ab273be

Committed by Jamal Hadi Salim on Apr 18, 2017

Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
Acked-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

7ab273be

Merge branch '10GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue · a5f62ca6

Committed by David S. Miller on Apr 20, 2017

Jeff Kirsher says:

====================
10GbE Intel Wired LAN Driver Updates 2017-04-18

This series contains updates to mainly ixgbe with only one ixgbevf change.

Usha adds a check to ensure the number of VFs being created is valid based
on the traffic classes configured, all to avoid transmit hangs.

Joe Perches reduces the use of pr_cont since the output can be interleaved
by other processes.

Tony cleans up the code overwriting the KX4 config, which is configured by
the NVM.  Adds a check for MMNGC.MNG_VETO, to resolve an issue where we
were getting a link loss for the BMC when loading the driver.

Don fixes up SGMII x553 config details which were missed in earlier
implementations.  Added support for the x552 XFI backplane interface.
Cleaned up an unused define, which was causing confusion on supported
devices.

Emil fixes a link issue on KR parts by making sure the default setting is
set.  Refactors the code that allocates memory for the list of MAC
addresses that the VFs can use into its own function.  Made some code
cleanups to help readability and ensure notification of SRIOV
being enabled is done upon completion.  Fixed an issue where if we failed
to allocate vfinfo in __ixgbe_enable_sriov() the driver would crash with
a NULL pointer dereference.

Philippe Reynes updates ixgbevf to use the new API for
{get|set}_link_ksettings.

Alex increases the headroom allocation when using build_skb() on a
system with 4K pages.  Fixed an issue in ixgbe_dump() where we were no
longer clearing the status bit.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

a5f62ca6

net: ipv6: Fix UDP early demux lookup with udp_l3mdev_accept=0 · 0bd84065

Committed by subashab@codeaurora.org on Apr 18, 2017

David Ahern reported that 5425077d ("net: ipv6: Add early demux
handler for UDP unicast") breaks udp_l3mdev_accept=0 since early
demux for IPv6 UDP was doing a generic socket lookup which does not
require an exact match. Fix this by making UDPv6 early demux match
connected sockets only.

v1->v2: Take reference to socket after match as suggested by Eric
v2->v3: Add comment before break

Fixes: 5425077d ("net: ipv6: Add early demux handler for UDP unicast")
Reported-by: David Ahern <dsa@cumulusnetworks.com>
Signed-off-by: Subash Abhinov Kasiviswanathan <subashab@codeaurora.org>
Cc: Eric Dumazet <edumazet@google.com>
Acked-by: David Ahern <dsa@cumulusnetworks.com>
Tested-by: David Ahern <dsa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

0bd84065

Merge branch 'tcp_poll-flakes' · 8ad0921b

Committed by David S. Miller on Apr 20, 2017

Eric Dumazet says:

====================
tcp: address two poll() flakes

Some packetdrill tests are failing when host kernel is using ASAN
or other debugging infrastructure.

I was able to fix the flakes by making sure we were not
sending wakeup events too soon.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

8ad0921b

tcp: remove poll() flakes with FastOpen · 0f9fa831

Committed by Eric Dumazet on Apr 18, 2017

When using TCP FastOpen for an active session, we send one wakeup event
from tcp_finish_connect(), right before the data eventually contained in
the received SYNACK is queued to sk->sk_receive_queue.

This means that depending on machine load or luck, poll() users
might receive POLLOUT events instead of POLLIN|POLLOUT.

To fix this, we need to move the call to sk->sk_state_change()
after the (optional) call to tcp_rcv_fastopen_synack().
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
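
The ordering issue can be modelled in a few lines of userspace C. This is a toy model under stated assumptions, not the TCP stack: `struct toy_sock`, the `TOY_POLL*` flags and the `toy_*` functions are hypothetical, and the wakeup is reduced to recording what a poller woken at that instant would observe.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical event flags, deliberately not the real poll() constants. */
#define TOY_POLLIN  0x1u
#define TOY_POLLOUT 0x4u

struct toy_sock {
    bool established;
    int rx_queued;        /* bytes sitting in the receive queue */
    unsigned int seen;    /* events a poller observed at wakeup time */
};

/* Events visible on the socket right now. */
unsigned int toy_poll(const struct toy_sock *sk)
{
    unsigned int mask = 0;

    if (sk->established)
        mask |= TOY_POLLOUT;
    if (sk->rx_queued > 0)
        mask |= TOY_POLLIN;
    return mask;
}

/* Wakeup analogue: record what a poller woken right now would see. */
void toy_state_change(struct toy_sock *sk)
{
    sk->seen = toy_poll(sk);
}

/* wake_first = true models the old order (wake, then queue SYNACK data);
 * wake_first = false models the fixed order (queue data, then wake). */
void toy_finish_connect(struct toy_sock *sk, int synack_data, bool wake_first)
{
    assert(synack_data >= 0);
    sk->established = true;
    if (wake_first)
        toy_state_change(sk);
    sk->rx_queued += synack_data;
    if (!wake_first)
        toy_state_change(sk);
}
```

With SYNACK data present, the old order lets the poller see only the writable event, while the fixed order exposes readable and writable together.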

0f9fa831

tcp: remove poll() flakes when receiving RST · 3d476263

Committed by Eric Dumazet on Apr 18, 2017

When a RST packet is processed, we send two wakeup events to interested
polling users.

First one by a sk->sk_error_report(sk) from tcp_reset(),
followed by a sk->sk_state_change(sk) from tcp_done().

Depending on machine load and luck, poll() can either return POLLERR,
or POLLIN|POLLOUT|POLLERR|POLLHUP (this happens in 99% of cases).

This is probably fine, but we can avoid the confusion by reordering
things so that we have more TCP fields updated before the first wakeup.

This might even allow us to remove some barriers we added in the past.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

3d476263

Merge branch 'mlxsw-flow-based-forwarding-OVS' · d02e93d6

Committed by David S. Miller on Apr 20, 2017

Jiri Pirko says:

====================
mlxsw: Allow flow based forwarding in OVS

This patchset does some fixes so the HW is setup correctly to do
flow-based (ACL based) forwarding for OVS-enslaved port.

The first patch is just trivial fix spotted on the way.

Patches 2-4 take care of proper FID setup which HW needs in order to
do ACL based forwarding.

The 7th patch (with dependency of patch 5 and 6) takes care of proper setup
of ports that are enslaved in OVS.

The last patch implements new FID miss trap that is used to push
packets belonging to unknown flows to kernel and userspace.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

d02e93d6

mlxsw: spectrum: Add FID miss trap · 9d41accc

Committed by Jiri Pirko on Apr 18, 2017

When there is no FID set for a specific packet, the HW will drop it.
However, it is useful to deliver these packets to the CPU so that
it can inspect them and program the HW accordingly. So add this trap.

This would only ever happen when port is enslaved to an OVS master.
Otherwise, packets would be dropped during VLAN / STP filtering,
before FID classification.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

9d41accc

mlxsw: spectrum: Allow ports to work under OVS master · 2b94e58d

Committed by Jiri Pirko on Apr 18, 2017

From now on, a port can become a slave of OVS master. All vlans
are enabled, STP state is set to "forwarding". It is up to the OVS
userspace daemon to setup the flows either in kernel or in HW using TC
flower offload.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

2b94e58d

net: add netif_is_ovs_port helper · 5be66141

Committed by Jiri Pirko on Apr 18, 2017

To find out if a netdev is an OVS port.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

5be66141
