Commits · 7303a1475008bee5c3e82a06a282568415690d72 · openeuler / raspberrypi-kernel

10 Sep, 2016 4 commits

sctp: identify chunks that need to be fragmented at IP level · 7303a147

Authored by Marcelo Ricardo Leitner on Sep 08, 2016

Previously, without GSO, it was easy to identify it: if the chunk didn't
fit and there was no data chunk in the packet yet, we could fragment at
IP level. So if there was an auth chunk and we were bundling a big data
chunk, it would fragment regardless of the size of the auth chunk. This
also works for the context of PMTU reductions.

But with GSO, we cannot distinguish such PMTU events anymore, as the
packet is allowed to exceed PMTU.

So we need another check to ensure that the chunk we are adding
actually fits the current PMTU. If it doesn't, trigger a flush and let
it be fragmented at IP level in the next round.
Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
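
The extra check described in the sctp commit above can be sketched in
plain C. This is a simplified model, not the kernel code: struct
pkt_model, check_chunk() and the verdict names are hypothetical, and the
model assumes that only queued DATA chunks count as packet content.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified model of an outgoing packet (not the kernel struct). */
struct pkt_model {
    size_t pmtu;     /* current path MTU in bytes */
    size_t overhead; /* IP + SCTP common header bytes */
    size_t auth_len; /* size of a bundled AUTH chunk, 0 if none */
    bool has_data;   /* true once a DATA chunk is queued */
};

enum verdict { APPEND_OK, FLUSH_FIRST };

/*
 * With GSO the queued size may legitimately exceed the PMTU, so it cannot
 * be used to spot an oversized chunk. Compare the chunk itself against the
 * room that a single PMTU-sized packet offers.
 */
static enum verdict check_chunk(const struct pkt_model *p, size_t chunk_len)
{
    assert(p->pmtu > p->overhead + p->auth_len);
    size_t room = p->pmtu - p->overhead - p->auth_len;

    if (chunk_len > room && p->has_data)
        return FLUSH_FIRST; /* send what is queued, retry the chunk alone */
    return APPEND_OK;       /* fits, or goes out alone and IP fragments it */
}
```

An oversized chunk arriving at a packet that already carries data yields
FLUSH_FIRST; on the next round the packet has no data yet, so the chunk
is accepted and left to IP-level fragmentation.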

Merge branch 'mlxsw-fixes' · 1b672f5f

Authored by David S. Miller on Sep 09, 2016

Jiri Pirko says:

====================
mlxsw: couple of fixes

Couple of fixes from Ido and myself.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

mlxsw: spectrum: Set port type before setting its address · 3247ff2b

Authored by Ido Schimmel on Sep 08, 2016

During port init, we currently set the port's type to Ethernet after
setting its MAC address. However, the hardware documentation states this
should be the other way around.

Align the driver with the hardware documentation and set the port's MAC
address after setting its type.

Fixes: 56ade8fe ("mlxsw: spectrum: Add initial support for Spectrum ASIC")
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

mlxsw: spectrum_router: Fix error path in mlxsw_sp_router_init · 40d25904

Authored by Jiri Pirko on Sep 08, 2016

When neigh_init fails, we have to do proper cleanup including
router_fini call.

Fixes: 6cf3c971 ("mlxsw: spectrum_router: Add private neigh table")
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Acked-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

09 Sep, 2016 13 commits

Merge branch 'nfp-fixes' · 2c2c8e33

Authored by David S. Miller on Sep 08, 2016

Jakub Kicinski says:

====================
nfp: fixes and trivial cleanup

First patch drops unnecessary version.h includes.  Second one
drops support for pre-release versions of FW ABI.  Removing
FW ABI 0.0 from supported set is particularly good since 0
could just be uninitialized memory.  Last but not least I drop
unnecessary padding of frames on RX which makes us count bytes
incorrectly for the VF2VF traffic.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

nfp: don't pad frames on receive · ebecefc8

Authored by Jakub Kicinski on Sep 07, 2016

There is no need to pad frames to ETH_ZLEN on RX.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Reviewed-by: Dinan Gunawardena <dinan.gunawardena@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

nfp: drop support for old firmware ABIs · 313b345c

Authored by Jakub Kicinski on Sep 07, 2016

Be more strict about FW versions.  Drop support for old
transitional revisions which were never used in production.
Dropping support for FW ABI version 0.0.0.0 is particularly
useful because 0 could just be uninitialized memory.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dinan Gunawardena <dinan.gunawardena@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

nfp: remove linux/version.h includes · 312fada1

Authored by Jakub Kicinski on Sep 07, 2016

Remove unnecessary version.h includes.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Reviewed-by: Dinan Gunawardena <dinan.gunawardena@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

tcp: cwnd does not increase in TCP YeAH · db7196a0

Authored by Artem Germanov on Sep 07, 2016

Commit 76174004
(tcp: do not slow start when cwnd equals ssthresh )
introduced a regression in TCP YeAH. On a virtual ethernet link with
100ms delay and 1% loss, kernel 4.2 shows a bandwidth of ~500KB/s for a
single TCP connection, while kernel 4.3 and above (including 4.8-rc4)
shows ~100KB/s.
   That is caused by cwnd stalling when cwnd equals ssthresh. This patch
fixes it by properly increasing cwnd in this case.
Signed-off-by: Artem Germanov <agermanov@anchorfree.com>
Acked-by: Dmitry Adamushko <d.adamushko@anchorfree.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'mlx5-fixes' · 81d1a366

Authored by David S. Miller on Sep 08, 2016

Saeed Mahameed says:

====================
Mellanox 100G mlx5 fixes 2016-09-07

The following series contains bug fixes for the mlx5e driver.

From Gal,
	- Static code checker cleanup (casting overflow)
	- Fix global PFC counter statistics reading
	- Fix HW LRO when vlan stripping is off

From Bodong,
	- Deprecate old autoneg capability bit and use new one.

From Tariq,
	- Fix xmit more counter race condition
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

net/mlx5e: Fix parsing of vlan packets when updating lro header · cd17d230

Authored by Gal Pressman on Sep 07, 2016

Currently, vlan-tagged packets are not parsed correctly
and are assumed to be regular IPv4/IPv6 packets.
We should check for 802.1Q/802.1ad tags and update the lro header
accordingly.
This fixes the use case where LRO is on and rxvlan is off
(vlan stripping is off).

Fixes: e586b3b0 ('net/mlx5: Ethernet Datapath files')
Signed-off-by: Gal Pressman <galp@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net/mlx5e: Fix global PFC counters replication · 4e39883d

Authored by Gal Pressman on Sep 07, 2016

Currently when reading global PFC statistics we left the counter
iterator out of the equation and we ended up reading the same counter
over and over again.

Instead of reading the counter at index 0 on every iteration we now read
the counter at index (i).

Fixes: e989d5a5 ('net/mlx5e: Expose flow control counters to ethtool')
Signed-off-by: Gal Pressman <galp@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
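
The counter replication bug described in the commit above comes down to
a loop that ignores its own iterator. A minimal sketch, assuming the
hardware counters are modeled as a plain array; the function names are
hypothetical and this is not the driver code:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Buggy pattern: the iterator never reaches the source index. */
static void copy_counters_buggy(const uint64_t *hw, uint64_t *out, size_t n)
{
    assert(hw != NULL && out != NULL);
    for (size_t i = 0; i < n; i++)
        out[i] = hw[0]; /* always reads counter 0 */
}

/* Fixed pattern: iteration i reads the counter at index i. */
static void copy_counters_fixed(const uint64_t *hw, uint64_t *out, size_t n)
{
    assert(hw != NULL && out != NULL);
    for (size_t i = 0; i < n; i++)
        out[i] = hw[i];
}
```

With three distinct source values, the buggy variant reports the first
value three times, while the fixed variant reports each value once.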

net/mlx5e: Prevent casting overflow · 7abc2110

Authored by Gal Pressman on Sep 07, 2016

On 64-bit architectures, unsigned long is longer than u32, so
casting to unsigned long will result in overflow.
We need to first allocate an unsigned long variable, then assign the
wanted value.

Fixes: 665bc539 ('net/mlx5e: Use new ethtool get/set link ksettings API')
Signed-off-by: Gal Pressman <galp@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
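
The casting problem described in the commit above can be illustrated
with a small bitmap helper. This is a sketch under assumptions:
count_set_bits() and count_link_modes() are hypothetical stand-ins, and
uint32_t plays the role of the kernel's u32.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)

/* Count the set bits among the first nbits of a bitmap of unsigned long words. */
static int count_set_bits(const unsigned long *map, unsigned int nbits)
{
    int count = 0;

    assert(map != NULL);
    for (unsigned int i = 0; i < nbits; i++)
        if (map[i / BITS_PER_WORD] & (1UL << (i % BITS_PER_WORD)))
            count++;
    return count;
}

static int count_link_modes(uint32_t eth_proto_cap)
{
    /*
     * Unsafe: count_set_bits((unsigned long *)&eth_proto_cap, 32) would
     * access 8 bytes of a 4-byte object on LP64 targets.
     * Safe: widen by value first, then pass the address of the copy.
     */
    unsigned long proto_cap = eth_proto_cap;

    return count_set_bits(&proto_cap, 32);
}
```

Copying the value into an unsigned long gives the helper a full word to
read, whatever the word size of the target is.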

net/mlx5e: Move an_disable_cap bit to a new position · e7e31ca4

Authored by Bodong Wang on Sep 07, 2016

The previous an_disable_cap position, bit31, is deprecated for use in the
driver with newer firmware.  New firmware will advertise the same capability
in bit29.

Old capability didn't allow setting more than one protocol for a
specific speed when autoneg is off, while newer firmware will allow
this and it is indicated in the new capability location.
Signed-off-by: Bodong Wang <bodong@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net/mlx5e: Fix xmit_more counter race issue · 0dbf657c

Authored by Tariq Toukan on Sep 07, 2016

Update the xmit_more counter before notifying the HW,
to prevent a possible use-after-free of the skb.

Fixes: c8cf78fe ("net/mlx5e: Add ethtool counter for TX xmit_more")
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

tcp: fastopen: avoid negative sk_forward_alloc · 76061f63

Authored by Eric Dumazet on Sep 07, 2016

When DATA and/or FIN are carried in a SYN/ACK message or SYN message,
we append an skb in socket receive queue, but we forget to call
sk_forced_mem_schedule().

Effect is that the socket has a negative sk->sk_forward_alloc as long as
the message is not read by the application.

Josh Hunt fixed a similar issue in commit d22e1537 ("tcp: fix tcp
fin memory accounting")

Fixes: 168a8f58 ("tcp: TCP Fast Open Server - main code path")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Josh Hunt <johunt@akamai.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
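
The accounting effect described in the fastopen commit above can be
modeled with a single integer. This is a rough sketch, not the kernel
implementation: struct sock_model, MEM_QUANTUM, forced_mem_schedule()
and queue_rx() are hypothetical, and the rounding rule is an assumption
made for illustration.

```c
#include <assert.h>

#define MEM_QUANTUM 4096 /* hypothetical accounting granularity in bytes */

struct sock_model {
    int forward_alloc; /* bytes reserved ahead of use; must not go negative */
};

/* Reserve whole quanta until `size` bytes are covered; never fails. */
static void forced_mem_schedule(struct sock_model *sk, int size)
{
    assert(size >= 0);
    if (size <= sk->forward_alloc)
        return;
    int missing = size - sk->forward_alloc;
    int quanta = (missing + MEM_QUANTUM - 1) / MEM_QUANTUM;
    sk->forward_alloc += quanta * MEM_QUANTUM;
}

/* Charge a queued buffer against the socket's reserved memory. */
static void queue_rx(struct sock_model *sk, int truesize, int schedule_first)
{
    if (schedule_first)
        forced_mem_schedule(sk, truesize);
    sk->forward_alloc -= truesize;
}
```

Queueing a buffer without reserving first drives the reservation below
zero, which is the state the commit describes. Reserving first keeps it
non-negative until the application reads the data.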

Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec · 40e3012e

Authored by David S. Miller on Sep 08, 2016

Steffen Klassert says:

====================
ipsec 2016-09-08

1) Fix a crash when xfrm_dump_sa returns an error.
   From Vegard Nossum.

2) Remove some incorrect WARN() on normal error handling.
   From Vegard Nossum.

3) Ignore socket policies when rebuilding hash tables,
   socket policies are not inserted into the hash tables.
   From Tobias Brunner.

4) Initialize and check tunnel pointers properly before
   we use them. From Alexey Kodanev.

5) Fix l3mdev oif setting on xfrm dst lookups.
   From David Ahern.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

08 Sep, 2016 1 commit

MAINTAINERS: Update CPMAC email address · 9dd4aaef

Authored by Florian Fainelli on Sep 06, 2016

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

07 Sep, 2016 6 commits

ipv6: addrconf: fix dev refcont leak when DAD failed · 751eb6b6

Authored by Wei Yongjun on Sep 05, 2016

In general, when DAD detects an IPv6 duplicate address, ifp->state
is set to INET6_IFADDR_STATE_ERRDAD and DAD is stopped by a
delayed work. The call tree looks like this:

ndisc_recv_ns
  -> addrconf_dad_failure        <- missing ifp put
     -> addrconf_mod_dad_work
       -> schedule addrconf_dad_work()
         -> addrconf_dad_stop()  <- missing ifp hold before call it

addrconf_dad_failure() is called with an ifp refcount held but never puts it.
addrconf_dad_work() calls addrconf_dad_stop() without holding an extra
refcount. This will not cause any issue normally.

But the race between addrconf_dad_failure() and addrconf_dad_work()
may cause an ifp refcount leak, so the netdevice cannot be unregistered,
and dmesg shows the following messages:

IPv6: eth0: IPv6 duplicate address fe80::XX:XXXX:XXXX:XX detected!
...
unregister_netdevice: waiting for eth0 to become free. Usage count = 1

Cc: stable@vger.kernel.org
Fixes: c15b1cca ("ipv6: move DAD and addrconf_verify processing
to workqueue")
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
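
The hold/put pairing that the DAD commit above restores can be sketched
with a plain counter. This is a simplified, single-threaded model under
assumptions: struct ifaddr_model and all function names are
hypothetical, and the deferred work runs inline instead of on a
workqueue.

```c
#include <assert.h>

struct ifaddr_model {
    int refcnt;
    int dad_stopped;
};

static void ifa_hold(struct ifaddr_model *ifp) { ifp->refcnt++; }

static void ifa_put(struct ifaddr_model *ifp)
{
    assert(ifp->refcnt > 0);
    ifp->refcnt--;
}

/* Consumes one reference, like a stop routine that ends with a put. */
static void dad_stop(struct ifaddr_model *ifp)
{
    ifp->dad_stopped = 1;
    ifa_put(ifp);
}

/* Worker: runs with the reference that was taken when it was scheduled. */
static void dad_work(struct ifaddr_model *ifp)
{
    ifa_hold(ifp); /* extra reference that dad_stop() will drop */
    dad_stop(ifp);
    ifa_put(ifp);  /* drop the reference taken at scheduling time */
}

/* Failure path: entered with a reference held by the caller's lookup. */
static void dad_failure(struct ifaddr_model *ifp)
{
    ifa_hold(ifp); /* reference handed to the scheduled work */
    dad_work(ifp); /* inline here; deferred to a workqueue in the kernel */
    ifa_put(ifp);  /* release the lookup reference */
}
```

Every hold has a matching put, so after the failure path only the
owner's reference remains and the device can be unregistered.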

bnxt_en: Fix TX push operation on ARM64. · 9d13744b

Authored by Michael Chan on Sep 05, 2016

There is a code path where we are calling __iowrite64_copy() on
an address that is not 64-bit aligned.  This causes an exception on
some architectures such as arm64.  Fix that code path by using
__iowrite32_copy().
Reported-by: JD Zheng <jiandong.zheng@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: Don't delete routes in different VRFs · 5a56a0b3

Authored by Mark Tomlinson on Sep 05, 2016

When deleting an IP address from an interface, there is a clean-up of
routes which refer to this local address. However, there was no check to
see that the VRF matched. This meant that deletion wasn't confined to
the VRF it should have been.

To solve this, a new field has been added to fib_info to hold a table
id. When removing fib entries corresponding to a local ip address, this
table id is also used in the comparison.

The table id is populated when the fib_info is created. This was already
done in some places, but not in ip_rt_ioctl(). This has now been fixed.

Fixes: 021dd3b8 ("net: Add routes to the table associated with the device")
Acked-by: David Ahern <dsa@cumulusnetworks.com>
Tested-by: David Ahern <dsa@cumulusnetworks.com>
Signed-off-by: Mark Tomlinson <mark.tomlinson@alliedtelesis.co.nz>
Signed-off-by: David S. Miller <davem@davemloft.net>
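
The table id comparison described in the VRF commit above can be
sketched as follows. This is a simplified model: struct fib_info_model
and sync_down_addr() are hypothetical, and addresses and table ids are
plain integers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct fib_info_model {
    uint32_t prefsrc; /* local address the route refers to */
    uint32_t tb_id;   /* table the route was installed in */
    int dead;         /* set when the route is marked for removal */
};

/* Mark routes that refer to `local`, but only inside table `tb_id`. */
static int sync_down_addr(struct fib_info_model *fi, size_t n,
                          uint32_t local, uint32_t tb_id)
{
    int marked = 0;

    assert(fi != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (fi[i].prefsrc == local && fi[i].tb_id == tb_id) {
            fi[i].dead = 1;
            marked++;
        }
    }
    return marked;
}
```

Two routes with the same local address in different tables are no longer
removed together; only the one whose table id matches is marked.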

net: smsc: remove build warning of duplicate definition · daa7ee8d

Authored by Sudip Mukherjee on Sep 04, 2016

The build of m32r was giving warning:

In file included from drivers/net/ethernet/smsc/smc91x.c:92:0:
drivers/net/ethernet/smsc/smc91x.h:448:0: warning: "SMC_inb" redefined
 #define SMC_inb(ioaddr, reg)  ({ BUG(); 0; })

drivers/net/ethernet/smsc/smc91x.h:106:0:
	note: this is the location of the previous definition
 #define SMC_inb(a, r)  inb(((u32)a) + (r))

drivers/net/ethernet/smsc/smc91x.h:449:0: warning: "SMC_outb" redefined
 #define SMC_outb(x, ioaddr, reg) BUG()

drivers/net/ethernet/smsc/smc91x.h:108:0:
	note: this is the location of the previous definition
 #define SMC_outb(v, a, r) outb(v, ((u32)a) + (r))
Signed-off-by: Sudip Mukherjee <sudip.mukherjee@codethink.co.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: macb: initialize checksum when using checksum offloading · 007e4ba3

Authored by Helmut Buchsbaum on Sep 04, 2016

I'm still struggling to get this fix right..

Changes since v2:
 - do not blindly modify SKB contents according to Dave's legitimate
   objection

Changes since v1:
 - dropped disabling HW checksum offload for Zynq
 - initialize checksum similar to net/ethernet/freescale/fec_main.c

-- >8 --
MACB/GEM needs the checksum field initialized to 0 to get correct
results on transmit in all cases, e.g. on Zynq, UDP packets with
payload <= 2 otherwise contain a wrong checksum.
Signed-off-by: Helmut Buchsbaum <helmut.buchsbaum@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
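
Clearing the checksum field before the frame is handed to the hardware,
as the macb commit above describes, can be sketched on a raw byte
buffer. clear_csum_field() and its parameters are hypothetical; the
offsets used in the usage note assume an untagged Ethernet frame
carrying IPv4 without options and UDP.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Zero the 16-bit checksum field located csum_offset bytes into the
 * transport header that starts at csum_start.
 * Returns 0 on success, -1 if the field lies outside the frame.
 */
static int clear_csum_field(uint8_t *frame, size_t len,
                            size_t csum_start, size_t csum_offset)
{
    size_t pos = csum_start + csum_offset;

    assert(frame != NULL);
    if (pos + 2 > len)
        return -1;
    memset(frame + pos, 0, 2);
    return 0;
}
```

For the assumed layout the UDP header starts at byte 34 (14 bytes of
Ethernet plus 20 bytes of IPv4) and its checksum sits 6 bytes in, so
bytes 40 and 41 are zeroed and their neighbours are left untouched.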

ipv6: release dst in ping_v6_sendmsg · 03c2778a

Authored by Dave Jones on Sep 02, 2016

Neither the failure nor the success paths of ping_v6_sendmsg release
the dst it acquires.  This leads to a flood of warnings from
"net/core/dst.c:288 dst_release" on older kernels that
don't have 8bf4ada2 backported.

That patch optimistically hoped this had been fixed post 3.10, but
it seems at least one case wasn't, where I've seen this triggered
a lot from machines doing unprivileged icmp sockets.

Cc: Martin Lau <kafai@fb.com>
Signed-off-by: Dave Jones <davej@codemonkey.org.uk>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

05 Sep, 2016 6 commits

af_unix: split 'u->readlock' into two: 'iolock' and 'bindlock' · 6e1ce3c3

Authored by Linus Torvalds on Sep 01, 2016

Right now we use the 'readlock' both for protecting some of the af_unix
IO path and for making the bind be single-threaded.

The two are independent, but using the same lock makes for a nasty
deadlock due to ordering with regards to filesystem locking. The bind
locking would want to nest outside the VFS pathname locking, but the IO
locking wants to nest inside some of those same locks.

We tried to fix this earlier with commit c845acb3 ("af_unix: Fix
splice-bind deadlock") which moved the readlock inside the vfs locks,
but that caused problems with overlayfs that will then call back into
filesystem routines that take the lock in the wrong order anyway.

Splitting the locks means that we can go back to having the bind lock be
the outermost lock, and we don't have any deadlocks with lock ordering.
Acked-by: Rainer Weikusat <rweikusat@cyberadapt.com>
Acked-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

Revert "af_unix: Fix splice-bind deadlock" · 38f7bd94

Authored by Linus Torvalds on Sep 01, 2016

This reverts commit c845acb3.

It turns out that it just replaces one deadlock with another one: we can
still get the wrong lock ordering with the readlock due to overlayfs
calling back into the filesystem layer and still taking the vfs locks
after the readlock.

The proper solution ends up being to just split the readlock into two
pieces: the bind lock (taken *outside* the vfs locks) and the IO lock
(taken *inside* the filesystem locks).  The two locks are independent
anyway.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Reviewed-by: Shmulik Ladkani <shmulik.ladkani@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'vxlan-fixes' · 2f83a53a

Authored by David S. Miller on Sep 04, 2016

Jiri Benc says:

====================
vxlan: fix error reporting

This patchset improves checking for invalid configuration in VXLAN and
fixes problems with duplicated and inappropriate error messages.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

vxlan: fix duplicated and wrong error messages · 3555621d

Authored by Jiri Benc on Sep 02, 2016

vxlan_dev_configure outputs error messages before returning, no need to
print the same messages again in vxlan_newlink. Also, vxlan_dev_configure may
return a particular error code for a different reason than vxlan_newlink
thinks.

Move the remaining error messages into vxlan_dev_configure and let
vxlan_newlink just pass on the error code.
Signed-off-by: Jiri Benc <jbenc@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

vxlan: reject multicast destination without an interface · 9b4cdd51

Authored by Jiri Benc on Sep 02, 2016

Currently, kernel accepts configurations such as:

ip l a type vxlan dstport 4789 id 1 group 239.192.0.1
ip l a type vxlan dstport 4789 id 1 group ff0e::110

However, neither of those really works. In the IPv4 case, the interface
cannot be brought up ("RTNETLINK answers: No such device"). This is because
multicast join will be rejected without the interface being specified.

In the IPv6 case, multicast will be joined on the first interface found. This
is not what the user wants as it depends on random factors (order of
interfaces).

Note that it's possible to add a local address but it doesn't solve
anything. For IPv4, it's not considered in the multicast join (thus the same
error as above is returned on ifup). This could be added but it wouldn't
help for IPv6 anyway. For IPv6, we do need the interface.

Just reject a configuration that sets multicast address and does not provide
an interface. Nobody can depend on the previous behavior as it never worked.
Signed-off-by: Jiri Benc <jbenc@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
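
The rejection rule from the vxlan commit above can be sketched for the
IPv4 case. This is an illustration under assumptions: validate_remote()
is a hypothetical helper, addresses are in host byte order, an ifindex
of 0 means "no interface given", and -EINVAL is chosen here for the
example.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* IPv4 multicast is 224.0.0.0/4: the top four address bits are 1110. */
static int ipv4_is_multicast(uint32_t addr)
{
    return (addr >> 28) == 0xE;
}

/* Refuse a multicast destination when no lower interface is specified. */
static int validate_remote(uint32_t remote_addr, int ifindex)
{
    assert(ifindex >= 0);
    if (ipv4_is_multicast(remote_addr) && ifindex == 0)
        return -EINVAL;
    return 0;
}
```

With this check, a group address such as 239.192.0.1 without an
interface is refused at configuration time instead of failing later
when the device is brought up.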

bonding: Fix bonding crash · 24b27fc4

Authored by Mahesh Bandewar on Sep 01, 2016

The following steps will crash the kernel:

  (a) Create bonding master
      > modprobe bonding miimon=50
  (b) Create macvlan bridge on eth2
      > ip link add link eth2 dev mvl0 address aa:0:0:0:0:01 \
	   type macvlan
  (c) Now try adding eth2 into the bond
      > echo +eth2 > /sys/class/net/bond0/bonding/slaves
      <crash>

Bonding does lots of things before checking if the device enslaved is
busy or not.

In this case when the notifier call-chain sends notifications, the
bond_netdev_event() assumes that the rx_handler /rx_handler_data is
registered while the bond_enslave() hasn't progressed far enough to
register rx_handler for the new slave.

This patch adds a rx_handler check that can be performed right at the
beginning of the enslave code to avoid getting into this situation.
Signed-off-by: Mahesh Bandewar <maheshb@google.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
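
The early check described in the bonding commit above follows a common
pattern: test for a conflicting owner before changing any state. A
minimal sketch, assuming a hypothetical struct netdev_model and
enslave() function; -EBUSY is chosen here for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct netdev_model {
    void *rx_handler; /* non-NULL when another user (e.g. macvlan) owns it */
    int enslaved;
};

static int enslave(struct netdev_model *slave, void *bond_handler)
{
    assert(slave != NULL);

    /* Check first, before any setup that notifiers could observe. */
    if (slave->rx_handler != NULL)
        return -EBUSY;

    /* ... the remaining enslave steps would run here ... */
    slave->rx_handler = bond_handler;
    slave->enslaved = 1;
    return 0;
}
```

A device whose rx_handler is already taken is rejected with no side
effects, so later event handlers never see a half-initialized slave.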

03 Sep, 2016 6 commits

Merge branch 'smsc911x-fixes' · 312565a0

Authored by David S. Miller on Sep 02, 2016

Jeremy Linton says:

====================
net: smsc911x: Move phy and interrupt config

v2-v3: Move error handling into separate patch, replace a couple cases
 of fixed errors with the errors being returned from the failing functions.
 Hoist irq handler.

The smsc911x driver is doing a number of things in its probe routine that
should be delayed until the interface is started. Because of this, the module
cannot be unloaded, the phy states are incorrect/stale if the interface isn't
running, open's unnecessarily fail causing network configuration problems, and
the /proc/irq nodes are incorrectly named.

Clean up a number of these problems by moving the mdio and interrupt
configuration into the smsc911x_open routine.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

net: smsc911x: Move interrupt allocation to open/stop · f252974e

Authored by Jeremy Linton on Sep 01, 2016

The /proc/irq/xx information is incorrect for smsc911x because
the request_irq is happening before the register_netdev has the
proper device name. Moving it to the open also fixes the case
of when the device is renamed.
Reported-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Jeremy Linton <jeremy.linton@arm.com>
Tested-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: smsc911x: Move interrupt handler before open · a85f00c3

Authored by Jeremy Linton on Sep 01, 2016

In preparation for allocating/enabling interrupts
in the ndo_open routine, move the irq handler before it.
Signed-off-by: Jeremy Linton <jeremy.linton@arm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: smsc911x: Fix register_netdev, phy startup, driver unload ordering · aea95dd5

Authored by Jeremy Linton on Sep 01, 2016

Move phy startup/shutdown into the smsc911x_open/stop routines. This
allows the module to be unloaded because phy_connect_direct is no longer
always holding the module use count. This one change also resolves a
number of other problems.

The link status of a downed interface no longer reflects a stale state.
Errors caused by the net device being opened before the mdio/phy was
configured are avoided. There is also a potential power saving, as the
phys don't remain powered when the interface isn't running.
Signed-off-by: Jeremy Linton <jeremy.linton@arm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: smsc911x: Remove multiple exit points from smsc911x_open · 1358bd5a

Authored by Jeremy Linton on Sep 01, 2016

Rework the error handling in smsc911x open in preparation
for the mdio startup being moved here.
Signed-off-by: Jeremy Linton <jeremy.linton@arm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

l2tp: fix use-after-free during module unload · 2f86953e

Authored by Sabrina Dubroca on Sep 02, 2016

Tunnel deletion is delayed by both a workqueue (l2tp_tunnel_delete -> wq
 -> l2tp_tunnel_del_work) and RCU (sk_destruct -> RCU ->
l2tp_tunnel_destruct).

By the time l2tp_tunnel_destruct() runs to destroy the tunnel and finish
destroying the socket, the private data reserved via the net_generic
mechanism has already been freed, but l2tp_tunnel_destruct() actually
uses this data.

Make sure tunnel deletion for the netns has completed before returning
from l2tp_exit_net() by first flushing the tunnel removal workqueue, and
then waiting for RCU callbacks to complete.

Fixes: 167eb17e ("l2tp: create tunnel sockets in the right namespace")
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

02 Sep, 2016 4 commits

ipv6: Don't unset flowi6_proto in ipxip6_tnl_xmit() · ab343801

Authored by Eli Cooper on Aug 26, 2016

Commit 8eb30be0 ("ipv6: Create ip6_tnl_xmit") unsets
flowi6_proto in ip4ip6_tnl_xmit() and ip6ip6_tnl_xmit().
Since xfrm_selector_match() relies on this info, IPv6 packets
sent by an ip6tunnel cannot be properly selected by their
protocols after removing it. This patch puts flowi6_proto back.

Cc: stable@vger.kernel.org
Fixes: 8eb30be0 ("ipv6: Create ip6_tnl_xmit")
Signed-off-by: Eli Cooper <elicooper@gmx.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

bnx2x: don't reset chip on cleanup if PCI function is offline · b44e108b

Authored by Guilherme G. Piccoli on Aug 31, 2016

When a PCI error is detected, on some architectures (like PowerPC) a slot
reset is performed - the driver's error handlers are in charge of disabling
the device before the reset and re-enabling it after a successful slot reset.

There are two cases, though, in which another path is taken in the code: if the
slot reset is not successful or if too many errors already happened in the
specific adapter (meaning that possibly the device is experiencing a HW
failure that slot reset is not able to solve), the core PCI error mechanism
(called EEH in PowerPC) will remove the adapter from the system, since it
will consider this as a permanent failure on device. In this case, a path
is taken that leads to bnx2x_chip_cleanup() calling bnx2x_reset_hw(), which
then tries to perform a HW reset on chip. This reset won't succeed since
the HW is in a fault state, which can be seen by multiple messages on
kernel log like below:

bnx2x: [bnx2x_issue_dmae_with_comp:552(eth1)]DMAE timeout!
bnx2x: [bnx2x_write_dmae:600(eth1)]DMAE returned failure -1

After some time, the PCI error mechanism gives up waiting for the driver's
correct removal procedure and forcibly removes the adapter from the system.
We can see a soft lockup while the core PCI error mechanism is waiting for
the driver to accomplish the right removal process.

This patch adds a verification to avoid a chip reset whenever the function
is in PCI error state - since this case is only reached when we have a
device being removed because of a permanent failure, the HW chip reset is
not expected to work, nor is it necessary.

Also, as a minor improvement in error path, we avoid the MCP information dump
in case of non-recoverable PCI error (when adapter is about to be removed),
since it will certainly fail.
Reported-by: Harsha Thyagaraja <hathyaga@in.ibm.com>
Signed-off-by: Guilherme G. Piccoli <gpiccoli@linux.vnet.ibm.com>
Acked-By: Yuval Mintz <Yuval.Mintz@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

rps: flow_dissector: Fix uninitialized flow_keys used in __skb_get_hash possibly · 635c223c

Authored by Gao Feng on Aug 31, 2016

The original code depends on the function parameters being evaluated from
left to right. But the parameter evaluation order is actually not defined
by the C standard.

When flow_keys_have_l4(&keys) is invoked before ___skb_get_hash(skb, &keys,
hashrnd) with some compilers or environment, the keys passed to
flow_keys_have_l4 is not initialized.

Fixes: 6db61d79 ("flow_dissector: Ignore flow dissector return value from ___skb_get_hash")
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Gao Feng <fgao@ikuai8.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
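
The evaluation order hazard described in the flow_dissector commit above
can be shown with small stand-in functions. All names here are
hypothetical, and the hash is an arbitrary multiplication chosen only
for the example.

```c
#include <assert.h>
#include <stddef.h>

struct keys_model {
    int has_ports;
};

struct hash_result {
    unsigned int hash;
    int is_l4;
};

/* Fills *keys as a side effect and returns a hash of the input. */
static unsigned int dissect_and_hash(struct keys_model *keys, unsigned int ports)
{
    assert(keys != NULL);
    keys->has_ports = (ports != 0);
    return ports * 2654435761u;
}

static int keys_have_l4(const struct keys_model *keys)
{
    return keys->has_ports;
}

static struct hash_result make_result(unsigned int hash, int is_l4)
{
    struct hash_result r = { hash, is_l4 };
    return r;
}

static struct hash_result get_hash(unsigned int ports)
{
    struct keys_model keys;

    /*
     * Fragile form:
     *   make_result(dissect_and_hash(&keys, ports), keys_have_l4(&keys));
     * The compiler may evaluate keys_have_l4() first and read keys
     * before it has been filled.
     */
    unsigned int hash = dissect_and_hash(&keys, ports); /* sequenced first */

    return make_result(hash, keys_have_l4(&keys));
}
```

Storing the first call's result in a local variable puts a sequence
point between the two calls, so the keys are always filled before they
are inspected.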

tcp: fastopen: fix rcv_wup initialization for TFO server on SYN/data · 28b346cb

Authored by Neal Cardwell on Aug 30, 2016

Yuchung noticed that on the first TFO server data packet sent after
the (TFO) handshake, the server echoed the TCP timestamp value in the
SYN/data instead of the timestamp value in the final ACK of the
handshake. This problem did not happen on regular opens.

The tcp_replace_ts_recent() logic that decides whether to remember an
incoming TS value needs tp->rcv_wup to hold the latest receive
sequence number that we have ACKed (latest tp->rcv_nxt we have
ACKed). This commit fixes this issue by ensuring that a TFO server
properly updates tp->rcv_wup to match tp->rcv_nxt at the time it sends
a SYN/ACK for the SYN/data.
Reported-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Soheil Hassas Yeganeh <soheil@google.com>
Fixes: 168a8f58 ("tcp: TCP Fast Open Server - main code path")
Signed-off-by: David S. Miller <davem@davemloft.net>
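
The dependency on tp->rcv_wup described in the commit above can be
sketched with a reduced model of the timestamp decision. This is a
simplification under assumptions: struct tp_model and
replace_ts_recent() are hypothetical, other checks such as PAWS are left
out, and the stale rcv_wup value used in the usage note is assumed for
illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Wraparound-safe "a is after b" for 32-bit sequence numbers. */
static int seq_after(uint32_t a, uint32_t b)
{
    return (int32_t)(b - a) < 0;
}

struct tp_model {
    uint32_t rcv_nxt;   /* next sequence number expected */
    uint32_t rcv_wup;   /* rcv_nxt at the time of the last ACK sent */
    uint32_t ts_recent; /* timestamp value to echo */
};

/* Remember the segment's timestamp only if it does not start past rcv_wup. */
static void replace_ts_recent(struct tp_model *tp, uint32_t seg_seq,
                              uint32_t seg_tsval)
{
    assert(tp != NULL);
    if (!seq_after(seg_seq, tp->rcv_wup))
        tp->ts_recent = seg_tsval;
}
```

Take a SYN with initial sequence number 1000 that carries 100 bytes of
data, so rcv_nxt becomes 1101. If rcv_wup is left at 1001, the final ACK
of the handshake (sequence 1101) looks like it is ahead of what was
acknowledged and its timestamp is ignored. With rcv_wup set to 1101 the
timestamp is stored.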