提交 · 20a41fba679d665cdae2808e2b9cae97c073351f · openeuler / raspberrypi-kernel

23 10月, 2015 13 次提交

amd-xgbe: Use wmb before updating current descriptor count · 20a41fba

由 Lendacky, Thomas 提交于 10月 21, 2015

The code currently uses the lightweight dma_wmb barrier before updating
the current descriptor count. Under heavy load, the Tx cleanup routine
was seeing the updated current descriptor count before the updated
descriptor information. As a result, the Tx descriptor was being cleaned
up before it was used because it was not "owned" by the hardware yet,
resulting in a Tx queue hang.

Using the wmb barrier insures that the descriptor is updated before the
descriptor counter preventing the Tx queue hang. For extra insurance,
the Tx cleanup routine is changed to grab the current decriptor count on
entry and uses that initial value in the processing loop rather than
trying to chase the current value.
Signed-off-by: NTom Lendacky <thomas.lendacky@amd.com>
Tested-by: NChristoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

20a41fba

net/phy: micrel: Add workaround for bad autoneg · d2fd719b

由 Nathan Sullivan 提交于 10月 21, 2015

Very rarely, the KSZ9031 will appear to complete autonegotiation, but
will drop all traffic afterwards.  When this happens, the idle error
count will read 0xFF after autonegotiation completes.  Reset the PHY
when in that state.
Signed-off-by: NNathan Sullivan <nathan.sullivan@ni.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

d2fd719b

Merge branch 'ipv6-overflow-arith' · ec3661b4

由 David S. Miller 提交于 10月 23, 2015

Hannes Frederic Sowa says:

====================
overflow-arith: begin to add support for overflow builtins functions

I add a new header, linux/overflow-arith.h, as the central place to add
overflow and wrap-around checking functions. The reason I am doing so
is that it can make use of compiler supported builtin functions which
can leverage hardware.

As I need this for a fix in the ipv6 stack, which is also included in
this series, I propose to add it sooner than later over Davem's net
tree. This is also the reason why I start slowly with only the one
function I need at this time.
====================
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

ec3661b4

ipv6: protect mtu calculation of wrap-around and infinite loop by rounding issues · b72a2b01

由 Hannes Frederic Sowa 提交于 10月 16, 2015

Raw sockets with hdrincl enabled can insert ipv6 extension headers
right into the data stream. In case we need to fragment those packets,
we reparse the options header to find the place where we can insert
the fragment header. If the extension headers exceed the link's MTU we
actually cannot make progress in such a case.

Instead of ending up in broken arithmetic or rounding towards 0 and
entering an endless loop in ip6_fragment, just prevent those cases by
aborting early and signal -EMSGSIZE to user space.
Reported-by: NDmitry Vyukov <dvyukov@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: NHannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

b72a2b01

overflow-arith: begin to add support for overflow builtin functions · 79907146

由 Hannes Frederic Sowa 提交于 10月 16, 2015

The idea of the overflow-arith.h header is to collect overflow checking
functions in one central place.

If gcc compiler supports the __builtin_overflow_* builtins we use them
because they might give better performance, otherwise the code falls
back to normal overflow checking functions.

The builtin_overflow functions are supported by gcc-5 and clang. The
matter of supporting clang is to just provide a corresponding
CC_HAVE_BUILTIN_OVERFLOW, because the specific overflow checking builtins
don't differ between gcc and clang.

I just provide overflow_usub function here as I intend this to get merged
into net, more functions will definitely follow as they are needed.
Signed-off-by: NHannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

79907146

tcp: allow dctcp alpha to drop to zero · c80dbe04

由 Andrew Shewmaker 提交于 10月 18, 2015

If alpha is strictly reduced by alpha >> dctcp_shift_g and if alpha is less
than 1 << dctcp_shift_g, then alpha may never reach zero. For example,
given shift_g=4 and alpha=15, alpha >> dctcp_shift_g yields 0 and alpha
remains 15. The effect isn't noticeable in this case below cwnd=137, but
could gradually drive uncongested flows with leftover alpha down to
cwnd=137. A larger dctcp_shift_g would have a greater effect.

This change causes alpha=15 to drop to 0 instead of being decrementing by 1
as it would when alpha=16. However, it requires one less conditional to
implement since it doesn't have to guard against subtracting 1 from 0U. A
decay of 15 is not unreasonable since an equal or greater amount occurs at
alpha >= 240.
Signed-off-by: NAndrew G. Shewmaker <agshew@gmail.com>
Acked-by: NFlorian Westphal <fw@strlen.de>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

c80dbe04

ipv6: fix the incorrect return value of throw route · ab997ad4

由 lucien 提交于 10月 23, 2015

The error condition -EAGAIN, which is signaled by throw routes, tells
the rules framework to walk on searching for next matches. If the walk
ends and we stop walking the rules with the result of a throw route we
have to translate the error conditions to -ENETUNREACH.
Signed-off-by: NXin Long <lucien.xin@gmail.com>
Signed-off-by: NHannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

ab997ad4

macvtap: unbreak receiving of gro skb with frag list · f23d538b

由 Jason Wang 提交于 10月 23, 2015

We don't have fraglist support in TAP_FEATURES. This will lead
software segmentation of gro skb with frag list. Fixes by having
frag list support in TAP_FEATURES.

With this patch single session of netperf receiving were restored from
about 5Gb/s to about 12Gb/s on mlx4.

Fixes a567dd62 ("macvtap: simplify usage of tap_features")
Cc: Vlad Yasevich <vyasevic@redhat.com>
Cc: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: NJason Wang <jasowang@redhat.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

f23d538b

openvswitch: Fix egress tunnel info. · fc4099f1

由 Pravin B Shelar 提交于 10月 22, 2015

While transitioning to netdev based vport we broke OVS
feature which allows user to retrieve tunnel packet egress
information for lwtunnel devices.  Following patch fixes it
by introducing ndo operation to get the tunnel egress info.
Same ndo operation can be used for lwtunnel devices and compat
ovs-tnl-vport devices. So after adding such device operation
we can remove similar operation from ovs-vport.

Fixes: 614732ea ("openvswitch: Use regular VXLAN net_device device").
Signed-off-by: NPravin B Shelar <pshelar@nicira.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

fc4099f1

Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/net-queue · 0c472b9b

由 David S. Miller 提交于 10月 22, 2015

Jeff Kirsher says:

====================
Intel Wired LAN Driver Updates 2015-10-22

This series contains fixes to i40e only.

Jesse provides two small fixes for i40e, first fixes counters that were
being displayed incorrectly due to indexing beyond the array of strings
when printing stats.  Then fixed the fact that the driver was printing
a message about not being able to assign VMDq because a lack of MSI-X
vectors, when it was not true.  It was due to a line missing that
initialized a variable.
====================
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

0c472b9b

VSOCK: Fix lockdep issue. · 8566b86a

由 Jorgen Hansen 提交于 10月 22, 2015

The recent fix for the vsock sock_put issue used the wrong
initializer for the transport spin_lock causing an issue when
running with lockdep checking.

Testing: Verified fix on kernel with lockdep enabled.
Reviewed-by: NThomas Hellstrom <thellstrom@vmware.com>
Signed-off-by: NJorgen Hansen <jhansen@vmware.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

8566b86a

i40e: fix annoying message · e9e53662

由 Jesse Brandeburg 提交于 10月 02, 2015

The driver was printing a message about not being able
to assign VMDq because of a lack of MSI-X vectors.

This was because a line was missing that initialized a variable,
simply a merge error.
Signed-off-by: NJesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: NAndrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com>

e9e53662

i40e: fix stats offsets · 74a6c665

由 Jesse Brandeburg 提交于 10月 02, 2015

The code was setting up stats that were not being initialized.
This caused several counters to be displayed incorrectly, due
to indexing beyond the array of strings when printing stats.
Signed-off-by: NJesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: NAndrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com>

74a6c665

22 10月, 2015 22 次提交

qmi_wwan: add Sierra Wireless MC74xx/EM74xx · 0db65fcf

由 Bjørn Mork 提交于 10月 22, 2015

New device IDs shamelessly lifted from the vendor driver.
Signed-off-by: NBjørn Mork <bjorn@mork.no>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

0db65fcf

Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec · 199c6550

由 David S. Miller 提交于 10月 22, 2015

Steffen Klassert says:

====================
pull request (net): ipsec 2015-10-22

1) Fix IPsec pre-encap fragmentation for GSO packets.
   From Herbert Xu.

2) Fix some header checks in _decode_session6.
   We skip the header informations if the data pointer points
   already behind the header in question for some protocols.
   This is because we call pskb_may_pull with a negative value
   converted to unsigened int from pskb_may_pull in this case.
   Skipping the header informations can lead to incorrect policy
   lookups. From Mathias Krause.

3) Allow to change the replay threshold and expiry timer of a
   state without having to set other attributes like replay
   counter and byte lifetime. Changing these other attributes
   may break the SA. From Michael Rossberg.

4) Fix pmtu discovery for local generated packets.
   We may fail dispatch to the inner address family.
   As a reault, the local error handler is not called
   and the mtu value is not reported back to userspace.

Please pull or let me know if there are problems.
====================
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

199c6550

net: ipv6: Dont add RT6_LOOKUP_F_IFACE flag if saddr set · d46a9d67

由 David Ahern 提交于 10月 21, 2015

741a11d9 ("net: ipv6: Add RT6_LOOKUP_F_IFACE flag if oif is set")
adds the RT6_LOOKUP_F_IFACE flag to make device index mismatch fatal if
oif is given. Hajime reported that this change breaks the Mobile IPv6
use case that wants to force the message through one interface yet use
the source address from another interface. Handle this case by only
adding the flag if oif is set and saddr is not set.

Fixes: 741a11d9 ("net: ipv6: Add RT6_LOOKUP_F_IFACE flag if oif is set")
Cc: Hajime Tazaki <thehajime@gmail.com>
Signed-off-by: NDavid Ahern <dsa@cumulusnetworks.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

d46a9d67

Merge branch 'isdn-null-deref' · 92a93fd5

由 David S. Miller 提交于 10月 22, 2015

Karsten Keil says:

====================
Fix potential NULL pointer access and memory leak in ISDN layer2 functions

Insu Yun did brinup the issue with not checking the skb_clone() return
value in the layer2 I-frame ull functions.
This series fix the issue in a way which avoid protocol violations/data loss
on a temporary memory shortage.
====================
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

92a93fd5

mISDN: fix OOM condition for sending queued I-Frames · c96356a9

由 Karsten Keil 提交于 10月 21, 2015

The old code did not check the return value of skb_clone().
The extra skb_clone() is not needed at all, if using skb_realloc_headroom()
instead, which gives us a private copy with enough headroom as well.
We need to requeue the original skb if the call failed, because we cannot
inform upper layers about the data loss. Restructure the code to minimise
rollback effort if it happens.
This fix kernel bug #86091

Thanks to Insu Yun <wuninsu@gmail.com> to remind me on this issue.
Signed-off-by: NKarsten Keil <keil@b1-systems.de>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

c96356a9

ISDN: fix OOM condition for sending queued I-Frames · c7a7c95e

由 Karsten Keil 提交于 10月 21, 2015

The skb_clone() return value was not checked and the skb_realloc_headroom()
usage was wrong, the old skb was not freed. It turned out, that the
skb_clone is not needed at all, the skb_realloc_headroom() will create a
private copy with enough headroom and the original SKB can be used for the
ACK queue.
We need to requeue the original skb if the call failed, since the upper
layer cannot be informed about memory shortage.

Thanks to Insu Yun <wuninsu@gmail.com> to remind me on this issue.
Signed-off-by: NKarsten Keil <keil@b1-systems.de>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

c7a7c95e

VSOCK: sock_put wasn't safe to call in interrupt context · 4ef7ea91

由 Jorgen Hansen 提交于 10月 21, 2015

In the vsock vmci_transport driver, sock_put wasn't safe to call
in interrupt context, since that may call the vsock destructor
which in turn calls several functions that should only be called
from process context. This change defers the callling of these
functions  to a worker thread. All these functions were
deallocation of resources related to the transport itself.

Furthermore, an unused callback was removed to simplify the
cleanup.

Multiple customers have been hitting this issue when using
VMware tools on vSphere 2015.

Also added a version to the vmci transport module (starting from
1.0.2.0-k since up until now it appears that this module was
sharing version with vsock that is currently at 1.0.1.0-k).
Reviewed-by: NAditya Asarwade <asarwade@vmware.com>
Reviewed-by: NThomas Hellstrom <thellstrom@vmware.com>
Signed-off-by: NJorgen Hansen <jhansen@vmware.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

4ef7ea91

netlink: fix locking around NETLINK_LIST_MEMBERSHIPS · 47191d65

由 David Herrmann 提交于 10月 21, 2015

Currently, NETLINK_LIST_MEMBERSHIPS grabs the netlink table while copying
the membership state to user-space. However, grabing the netlink table is
effectively a write_lock_irq(), and as such we should not be triggering
page-faults in the critical section.

This can be easily reproduced by the following snippet:
    int s = socket(AF_NETLINK, SOCK_RAW, NETLINK_ROUTE);
    void *p = mmap(0, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANON, -1, 0);
    int r = getsockopt(s, 0x10e, 9, p, (void*)((char*)p + 4092));

This should work just fine, but currently triggers EFAULT and a possible
WARN_ON below handle_mm_fault().

Fix this by reducing locking of NETLINK_LIST_MEMBERSHIPS to a read-side
lock. The write-lock was overkill in the first place, and the read-lock
allows page-faults just fine.
Reported-by: NDmitry Vyukov <dvyukov@google.com>
Signed-off-by: NDavid Herrmann <dh.herrmann@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

47191d65

net: phy: dp83848: Add TI DP83848 Ethernet PHY · 34e45ad9

由 Andrew F. Davis 提交于 10月 20, 2015

Add support for the TI DP83848 Ethernet PHY device.

The DP83848 is a highly reliable, feature rich, IEEE 802.3 compliant
single port 10/100 Mb/s Ethernet Physical Layer Transceiver supporting
the MII and RMII interfaces.
Signed-off-by: NAndrew F. Davis <afd@ti.com>
Signed-off-by: NDan Murphy <dmurphy@ti.com>
Reviewed-by: NFlorian Fainelli <f.fainelli@gmail.com>
Acked-by: NDan Murphy <dmurphy@ti.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

34e45ad9

net: sun4i-emac: Properly free resources on probe failure and remove · 104eb270

由 Hans de Goede 提交于 10月 20, 2015

Fix sun4i-emac not releasing the following resources:
-iomapped memory not released on probe-failure nor on remove
-clock not getting disabled on probe-failure nor on remove
-sram not being released on remove

And while at it also add error checking to the clk_prepare_enable call
done on probe.
Signed-off-by: NHans de Goede <hdegoede@redhat.com>
Acked-by: NMaxime Ripard <maxime.ripard@free-electrons.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

104eb270

openvswitch: Serialize nested ct actions if provided · e754ec69

由 Joe Stringer 提交于 10月 19, 2015

If userspace provides a ct action with no nested mark or label, then the
storage for these fields is zeroed. Later when actions are requested,
such zeroed fields are serialized even though userspace didn't
originally specify them. Fix the behaviour by ensuring that no action is
serialized in this case, and reject actions where userspace attempts to
set these fields with mask=0. This should make netlink marshalling
consistent across deserialization/reserialization.
Reported-by: NJarno Rajahalme <jrajahalme@nicira.com>
Signed-off-by: NJoe Stringer <joestringer@nicira.com>
Acked-by: NPravin B Shelar <pshelar@nicira.com>
Acked-by: NThomas Graf <tgraf@suug.ch>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

e754ec69

openvswitch: Mark connections new when not confirmed. · 4f0909ee

由 Joe Stringer 提交于 10月 19, 2015

New, related connections are marked as such as part of ovs_ct_lookup(),
but they are not marked as "new" if the commit flag is used. Make this
consistent by setting the "new" flag whenever !nf_ct_is_confirmed(ct).
Reported-by: NJarno Rajahalme <jrajahalme@nicira.com>
Signed-off-by: NJoe Stringer <joestringer@nicira.com>
Acked-by: NThomas Graf <tgraf@suug.ch>
Acked-by: NPravin B Shelar <pshelar@nicira.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

4f0909ee

openvswitch: Clarify conntrack COMMIT behaviour · 1d008a1d

由 Joe Stringer 提交于 10月 19, 2015

The presence of this attribute does not modify the ct_state for the
current packet, only future packets. Make this more clear in the header
definition.
Signed-off-by: NJoe Stringer <joestringer@nicira.com>
Acked-by: NThomas Graf <tgraf@suug.ch>
Acked-by: NPravin B Shelar <pshelar@nicira.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

1d008a1d

openvswitch: Reject ct_state masks for unknown bits · 9e384715

由 Joe Stringer 提交于 10月 19, 2015

Currently, 0-bits are generated in ct_state where the bit position is
undefined, and matches are accepted on these bit-positions. If userspace
requests to match the 0-value for this bit then it may expect only a
subset of traffic to match this value, whereas currently all packets
will have this bit set to 0. Fix this by rejecting such masks.
Signed-off-by: NJoe Stringer <joestringer@nicira.com>
Acked-by: NPravin B Shelar <pshelar@nicira.com>
Acked-by: NThomas Graf <tgraf@suug.ch>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

9e384715

tcp: remove improper preemption check in tcp_xmit_probe_skb() · e2e8009f

由 Renato Westphal 提交于 10月 19, 2015

Commit e520af48 introduced the following bug when setting the
TCP_REPAIR sockoption:

[ 2860.657036] BUG: using __this_cpu_add() in preemptible [00000000] code: daemon/12164
[ 2860.657045] caller is __this_cpu_preempt_check+0x13/0x20
[ 2860.657049] CPU: 1 PID: 12164 Comm: daemon Not tainted 4.2.3 #1
[ 2860.657051] Hardware name: Dell Inc. PowerEdge R210 II/0JP7TR, BIOS 2.0.5 03/13/2012
[ 2860.657054]  ffffffff81c7f071 ffff880231e9fdf8 ffffffff8185d765 0000000000000002
[ 2860.657058]  0000000000000001 ffff880231e9fe28 ffffffff8146ed91 ffff880231e9fe18
[ 2860.657062]  ffffffff81cd1a5d ffff88023534f200 ffff8800b9811000 ffff880231e9fe38
[ 2860.657065] Call Trace:
[ 2860.657072]  [<ffffffff8185d765>] dump_stack+0x4f/0x7b
[ 2860.657075]  [<ffffffff8146ed91>] check_preemption_disabled+0xe1/0xf0
[ 2860.657078]  [<ffffffff8146edd3>] __this_cpu_preempt_check+0x13/0x20
[ 2860.657082]  [<ffffffff817e0bc7>] tcp_xmit_probe_skb+0xc7/0x100
[ 2860.657085]  [<ffffffff817e1e2d>] tcp_send_window_probe+0x2d/0x30
[ 2860.657089]  [<ffffffff817d1d8c>] do_tcp_setsockopt.isra.29+0x74c/0x830
[ 2860.657093]  [<ffffffff817d1e9c>] tcp_setsockopt+0x2c/0x30
[ 2860.657097]  [<ffffffff81767b74>] sock_common_setsockopt+0x14/0x20
[ 2860.657100]  [<ffffffff817669e1>] SyS_setsockopt+0x71/0xc0
[ 2860.657104]  [<ffffffff81865172>] entry_SYSCALL_64_fastpath+0x16/0x75

Since tcp_xmit_probe_skb() can be called from process context, use
NET_INC_STATS() instead of NET_INC_STATS_BH().

Fixes: e520af48 ("tcp: add TCPWinProbe and TCPKeepAlive SNMP counters")
Signed-off-by: NRenato Westphal <renatow@taghos.com.br>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

e2e8009f

Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf · 36a28b21

由 David S. Miller 提交于 10月 21, 2015

Pablo Neira Ayuso says:

====================
Netfilter fixes for net

The following patchset contains four Netfilter fixes for net, they are:

1) Fix Kconfig dependencies of new nf_dup_ipv4 and nf_dup_ipv6.

2) Remove bogus test nh_scope in IPv4 rpfilter match that is breaking
   --accept-local, from Xin Long.

3) Wait for RCU grace period after dropping the pending packets in the
   nfqueue, from Florian Westphal.

4) Fix sleeping allocation while holding spin_lock_bh, from Nikolay Borisov.
====================
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

36a28b21

tipc: conditionally expand buffer headroom over udp tunnel · e5356794

由 Jon Paul Maloy 提交于 10月 19, 2015

In commit d999297c ("tipc: reduce locking scope during packet reception")
we altered the packet retransmission function. Since then, when
restransmitting packets, we create a clone of the original buffer
using __pskb_copy(skb, MIN_H_SIZE), where MIN_H_SIZE is the size of
the area we want to have copied, but also the smallest possible TIPC
packet size. The value of MIN_H_SIZE is 24.

Unfortunately, __pskb_copy() also has the effect that the headroom
of the cloned buffer takes the size MIN_H_SIZE. This is too small
for carrying the packet over the UDP tunnel bearer, which requires
a minimum headroom of 28 bytes. A change to just use pskb_copy()
lets the clone inherit the original headroom of 80 bytes, but also
assumes that the copied data area is of at least that size, something
that is not always the case. So that is not a viable solution.

We now fix this by adding a check for sufficient headroom in the
transmit function of udp_media.c, and expanding it when necessary.

Fixes: commit d999297c ("tipc: reduce locking scope during packet reception")
Signed-off-by: NJon Maloy <jon.maloy@ericsson.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

e5356794

net: cavium: change NET_VENDOR_CAVIUM to bool · 7a4264a9

由 Andreas Schwab 提交于 10月 19, 2015

CONFIG_NET_VENDOR_CAVIUM is only used to hide/show config options and to
include subdirectories in the build, so it doesn't make sense to make it
tristate.
Signed-off-by: NAndreas Schwab <schwab@suse.de>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

7a4264a9

tipc: allow non-linear first fragment buffer · 45c8b7b1

由 Jon Paul Maloy 提交于 10月 19, 2015

The current code for message reassembly is erroneously assuming that
the the first arriving fragment buffer always is linear, and then goes
ahead resetting the fragment list of that buffer in anticipation of
more arriving fragments.

However, if the buffer already happens to be non-linear, we will
inadvertently drop the already attached fragment list, and later
on trig a BUG() in __pskb_pull_tail().

We see this happen when running fragmented TIPC multicast across UDP,
something made possible since
commit d0f91938 ("tipc: add ip/udp media type")

We fix this by not resetting the fragment list when the buffer is non-
linear, and by initiatlizing our private fragment list tail pointer to
the tail of the existing fragment list.

Fixes: commit d0f91938 ("tipc: add ip/udp media type")
Signed-off-by: NJon Maloy <jon.maloy@ericsson.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

45c8b7b1

openvswitch: Allocate memory for ovs internal device stats. · 1241365f

由 James Morse 提交于 10月 19, 2015

"openvswitch: Remove vport stats" removed the per-vport statistics, in
order to use the netdev's statistics fields.
"openvswitch: Fix ovs_vport_get_stats()" fixed the export of these stats
to user-space, by using the provided netdev_ops to collate them - but ovs
internal devices still use an unallocated dev->tstats field to count
packets, which are no longer exported by this api.

Allocate the dev->tstats field for ovs internal devices, and wire up
ndo_get_stats64 with the original implementation of
ovs_vport_get_stats().

On its own, "openvswitch: Fix ovs_vport_get_stats()" fixes the OOPs,
unmasking a full-on panic on arm64:

=============%<==============
[<ffffffbffc00ce4c>] internal_dev_recv+0xa8/0x170 [openvswitch]
[<ffffffbffc0008b4>] do_output.isra.31+0x60/0x19c [openvswitch]
[<ffffffbffc000bf8>] do_execute_actions+0x208/0x11c0 [openvswitch]
[<ffffffbffc001c78>] ovs_execute_actions+0xc8/0x238 [openvswitch]
[<ffffffbffc003dfc>] ovs_packet_cmd_execute+0x21c/0x288 [openvswitch]
[<ffffffc0005e8c5c>] genl_family_rcv_msg+0x1b0/0x310
[<ffffffc0005e8e60>] genl_rcv_msg+0xa4/0xe4
[<ffffffc0005e7ddc>] netlink_rcv_skb+0xb0/0xdc
[<ffffffc0005e8a94>] genl_rcv+0x38/0x50
[<ffffffc0005e76c0>] netlink_unicast+0x164/0x210
[<ffffffc0005e7b70>] netlink_sendmsg+0x304/0x368
[<ffffffc0005a21c0>] sock_sendmsg+0x30/0x4c
[SNIP]
Kernel panic - not syncing: Fatal exception in interrupt
=============%<==============

Fixes: 8c876639 ("openvswitch: Remove vport stats.")
Signed-off-by: NJames Morse <james.morse@arm.com>
Acked-by: NPravin B Shelar <pshelar@nicira.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

1241365f

net: Really fix vti6 with oif in dst lookups · f1900fb5

由 David Ahern 提交于 10月 19, 2015

6e28b000 ("net: Fix vti use case with oif in dst lookups for IPv6")
is missing the checks on FLOWI_FLAG_SKIP_NH_OIF. Add them.

Fixes: 42a7b32b ("xfrm: Add oif to dst lookups")
Cc: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: NDavid Ahern <dsa@cumulusnetworks.com>
Acked-by: NSteffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

f1900fb5

tipc: extend broadcast link window size · 53387c4e

由 Jon Paul Maloy 提交于 10月 19, 2015

The default fix broadcast window size is currently set to 20 packets.
This is a very low value, set at a time when we were still testing on
10 Mb/s hubs, and a change to it is long overdue.

Commit 7845989c ("net: tipc: fix stall during bclink wakeup procedure")
revealed a problem with this low value. For messages of importance LOW,
the backlog queue limit will be  calculated to 30 packets, while a
single, maximum sized message of 66000 bytes, carried across a 1500 MTU
network consists of 46 packets.

This leads to the following scenario (among others leading to the same
situation):

1: Msg 1 of 46 packets is sent. 20 packets go to the transmit queue, 26
   packets to the backlog queue.
2: Msg 2 of 46 packets is attempted sent, but rejected because there is
   no more space in the backlog queue at this level. The sender is added
   to the wakeup queue with a "pending packets chain size" number of 46.
3: Some packets in the transmit queue are acked and released. We try to
   wake up the sender, but the pending size of 46 is bigger than the LOW
   wakeup limit of 30, so this doesn't happen.
5: Subsequent acks releases all the remaining buffers. Each time we test
   for the wakeup criteria and find that 46 still is larger than 30,
   even after both the transmit and the backlog queues are empty.
6: The sender is never woken up and given a chance to send its message.
   He is stuck.

We could now loosen the wakeup criteria (used by link_prepare_wakeup())
to become equal to the send criteria (used by tipc_link_xmit()), i.e.,
by ignoring the "pending packets chain size" value altogether, or we can
just increase the queue limits so that the criteria can be satisfied
anyway. There are good reasons (potentially multiple waiting senders) to
not opt for the former solution, so we choose the latter one.

This commit fixes the problem by giving the broadcast link window a
default value of 50 packets. We also introduce a new minimum link
window size BCLINK_MIN_WIN of 32, which is enough to always avoid the
described situation. Finally, in order to not break any existing users
which may set the window explicitly, we enforce that the window is set
to the new minimum value in case the user is trying to set it to
anything lower.

Fixes: 7845989c ("net: tipc: fix stall during bclink wakeup procedure")
Signed-off-by: NJon Maloy <jon.maloy@ericsson.com>
Reviewed-by: NYing Xue <ying.xue@windriver.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

53387c4e

21 10月, 2015 5 次提交

irda: precedence bug in irlmp_seq_hb_idx() · 50010c20

由 Dan Carpenter 提交于 10月 19, 2015

This is decrementing the pointer, instead of the value stored in the
pointer. KASan detects it as an out of bounds reference.
Reported-by: N"Berry Cheng 程君(成淼)" <chengmiao.cj@alibaba-inc.com>
Signed-off-by: NDan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

50010c20

xen-netfront: update num_queues to real created · ca88ea12

由 Joe Jin 提交于 10月 19, 2015

Sometimes xennet_create_queues() may failed to created all requested
queues, we need to update num_queues to real created to avoid NULL
pointer dereference.
Signed-off-by: NJoe Jin <joe.jin@oracle.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Wei Liu <wei.liu2@citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
Cc: David S. Miller <davem@davemloft.net>
Reviewed-by: NBoris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

ca88ea12

vsock: fix missing cleanup when misc_register failed · f6a835bb

由 Gao feng 提交于 10月 18, 2015

reset transport and unlock if misc_register failed.
Signed-off-by: NGao feng <omarapazanadi@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

f6a835bb

Merge branch 'mv643xx-fixes' · 10be15ff

由 David S. Miller 提交于 10月 21, 2015

Philipp Kirchhofer says:

====================
net: mv643xx_eth: TSO TX data corruption fixes

as previously discussed [1] the mv643xx_eth driver has some
issues with data corruption when using TCP segmentation offload (TSO).

The following patch set improves this situation by fixing two data
corruption bugs in the TSO TX path.

Before applying the patches repeatedly accessing large files located on
a SMB share on my NSA325 NAS with TSO enabled resulted in different
hash sums, which confirmed that data corruption is happening during
file transfer. After applying the patches the hash sums were the same.

As this is my first patch submission please feel free to point out any
issues with the patch set.

[1] http://thread.gmane.org/gmane.linux.network/336530
====================
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

10be15ff

net: mv643xx_eth: Defer writing the first TX descriptor when using TSO · 968200f3

由 Philipp Kirchhofer 提交于 10月 18, 2015

To prevent a race between the TX DMA engine and the CPU the writing of the
first transmit descriptor must be deferred until all following descriptors
have been updated. The network card may otherwise start transmitting before
all packet descriptors are set up correctly, which leads to data corruption
or an aborted transmit operation.

This deferral is already done in the non-TSO TX path, implement it also in
the TSO TX path.
Signed-off-by: NPhilipp Kirchhofer <philipp@familie-kirchhofer.de>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

968200f3