提交 · 61a3855bb726cbb062ef02a31a832dea455456e0 · xiphi1978 / linux

19 3月, 2015 3 次提交

IB/mlx4: Saturate RoCE port PMA counters in case of overflow · 61a3855b

由 Majd Dibbiny 提交于 3月 18, 2015

For RoCE ports, we set the u32 PMA values based on u64 HCA counters. In case of
overflow, according to the IB spec, we have to saturate a counter to its
max value, do that.

Fixes: c3779134 ('IB/mlx4: Support PMA counters for IBoE')
Signed-off-by: NMajd Dibbiny <majd@mellanox.com>
Signed-off-by: NEran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: NHadar Hen Zion <hadarh@mellanox.com>
Signed-off-by: NOr Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

61a3855b

net/mlx4_en: Fix off-by-one in ethtool statistics display · a16f3565

由 Eran Ben Elisha 提交于 3月 18, 2015

NUM_PORT_STATS was 9 instead of 10, which caused off-by-one bug when
displaying the statistics starting from tx_chksum_offload in ethtool.

Fixes: f8c6455b ('net/mlx4_en: Extend checksum offloading by CHECKSUM COMPLETE')
Signed-off-by: NEran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: NHadar Hen Zion <hadarh@mellanox.com>
Signed-off-by: NOr Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

a16f3565

IB/mlx4: Verify net device validity on port change event · 217e8b16

由 Moni Shoua 提交于 3月 18, 2015

Processing an event is done in a different context from the one when
the event was dispatched. This requires a check that the slave
net device is still valid when the event is being processed. The check is done
under the iboe lock which ensure correctness.

Fixes: a5750090 ('IB/mlx4: Add port aggregation support')
Signed-off-by: NMoni Shoua <monis@mellanox.com>
Signed-off-by: NOr Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

217e8b16

18 3月, 2015 5 次提交

act_bpf: allow non-default TC_ACT opcodes as BPF exec outcome · ced585c8

由 Daniel Borkmann 提交于 3月 17, 2015

Revisiting commit d23b8ad8 ("tc: add BPF based action") with regards
to eBPF support, I was thinking that it might be better to improve
return semantics from a BPF program invoked through BPF_PROG_RUN().

Currently, in case filter_res is 0, we overwrite the default action
opcode with TC_ACT_SHOT. A default action opcode configured through tc's
m_bpf can be: TC_ACT_RECLASSIFY, TC_ACT_PIPE, TC_ACT_SHOT, TC_ACT_UNSPEC,
TC_ACT_OK.

In cls_bpf, we have the possibility to overwrite the default class
associated with the classifier in case filter_res is _not_ 0xffffffff
(-1).

That allows us to fold multiple [e]BPF programs into a single one, where
they would otherwise need to be defined as a separate classifier with
its own classid, needlessly redoing parsing work, etc.

Similarly, we could do better in act_bpf: Since above TC_ACT* opcodes
are exported to UAPI anyway, we reuse them for return-code-to-tc-opcode
mapping, where we would allow above possibilities. Thus, like in cls_bpf,
a filter_res of 0xffffffff (-1) means that the configured _default_ action
is used. Any unkown return code from the BPF program would fail in
tcf_bpf() with TC_ACT_UNSPEC.

Should we one day want to make use of TC_ACT_STOLEN or TC_ACT_QUEUED,
which both have the same semantics, we have the option to either use
that as a default action (filter_res of 0xffffffff) or non-default BPF
return code.

All that will allow us to transparently use tcf_bpf() for both BPF
flavours.
Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>
Cc: Jiri Pirko <jiri@resnulli.us>
Cc: Alexei Starovoitov <ast@plumgrid.com>
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Acked-by: NJiri Pirko <jiri@resnulli.us>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

ced585c8

Revert "smc91x: retrieve IRQ and trigger flags in a modern way" · 8d7d9cca

由 Robert Jarzmik 提交于 2月 12, 2015

The commit breaks the legacy platforms, ie. these not using device-tree,
and setting up the interrupt resources with a flag to activate edge
detection. The issue was found on the zylonite platform.

The reason is that zylonite uses platform resources to pass the interrupt number
and the irq flags (here IORESOURCE_IRQ_HIGHEDGE). It expects the driver to
request the irq with these flags, which in turn setups the irq as high edge
triggered.

After the patch, this was supposed to be taken care of with :
irq_resflags = irqd_get_trigger_type(irq_get_irq_data(ndev->irq));

But irq_resflags is 0 for legacy platforms, while for example in
arch/arm/mach-pxa/zylonite.c, in struct resource smc91x_resources[] the
irq flag is specified. This breaks zylonite because the interrupt is not
setup as triggered, and hardware doesn't provide interrupts.
Signed-off-by: NRobert Jarzmik <robert.jarzmik@free.fr>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

8d7d9cca

inet: Clean up inet_csk_wait_for_connect() vs. might_sleep() · cb7cf8a3

由 Eric Dumazet 提交于 3月 16, 2015

I got the following trace with current net-next kernel :

[14723.885290] WARNING: CPU: 26 PID: 22658 at kernel/sched/core.c:7285 __might_sleep+0x89/0xa0()
[14723.885325] do not call blocking ops when !TASK_RUNNING; state=1 set at [<ffffffff810e8734>] prepare_to_wait_exclusive+0x34/0xa0
[14723.885355] CPU: 26 PID: 22658 Comm: netserver Not tainted 4.0.0-dbg-DEV #1379
[14723.885359]  ffffffff81a223a8 ffff881fae9e7ca8 ffffffff81650b5d 0000000000000001
[14723.885364]  ffff881fae9e7cf8 ffff881fae9e7ce8 ffffffff810a72e7 0000000000000000
[14723.885367]  ffffffff81a57620 000000000000093a 0000000000000000 ffff881fae9e7e64
[14723.885371] Call Trace:
[14723.885377]  [<ffffffff81650b5d>] dump_stack+0x4c/0x65
[14723.885382]  [<ffffffff810a72e7>] warn_slowpath_common+0x97/0xe0
[14723.885386]  [<ffffffff810a73e6>] warn_slowpath_fmt+0x46/0x50
[14723.885390]  [<ffffffff810f4c5d>] ? trace_hardirqs_on_caller+0x10d/0x1d0
[14723.885393]  [<ffffffff810e8734>] ? prepare_to_wait_exclusive+0x34/0xa0
[14723.885396]  [<ffffffff810e8734>] ? prepare_to_wait_exclusive+0x34/0xa0
[14723.885399]  [<ffffffff810ccdc9>] __might_sleep+0x89/0xa0
[14723.885403]  [<ffffffff81581846>] lock_sock_nested+0x36/0xb0
[14723.885406]  [<ffffffff815829a3>] ? release_sock+0x173/0x1c0
[14723.885411]  [<ffffffff815ea1f7>] inet_csk_accept+0x157/0x2a0
[14723.885415]  [<ffffffff810e8900>] ? abort_exclusive_wait+0xc0/0xc0
[14723.885419]  [<ffffffff8161b96d>] inet_accept+0x2d/0x150
[14723.885424]  [<ffffffff8157db6f>] SYSC_accept4+0xff/0x210
[14723.885428]  [<ffffffff8165a451>] ? retint_swapgs+0xe/0x44
[14723.885431]  [<ffffffff810f4c5d>] ? trace_hardirqs_on_caller+0x10d/0x1d0
[14723.885437]  [<ffffffff81369c0e>] ? trace_hardirqs_on_thunk+0x3a/0x3f
[14723.885441]  [<ffffffff8157ef40>] SyS_accept+0x10/0x20
[14723.885444]  [<ffffffff81659872>] system_call_fastpath+0x12/0x17
[14723.885447] ---[ end trace ff74cd83355b1873 ]---

In commit 26cabd31
Peter added a sched_annotate_sleep() in sk_wait_event()

Is the following patch needed as well ?

Alternative would be to use sk_wait_event() from inet_csk_wait_for_connect()
Signed-off-by: NEric Dumazet <edumazet@google.com>
Acked-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

cb7cf8a3

ip6_tunnel: fix error code when tunnel exists · 37355565

由 Nicolas Dichtel 提交于 3月 16, 2015

After commit 2b0bb01b, the kernel returns -ENOBUFS when user tries to add
an existing tunnel with ioctl API:
$ ip -6 tunnel add ip6tnl1 mode ip6ip6 dev eth1
add tunnel "ip6tnl0" failed: No buffer space available

It's confusing, the right error is EEXIST.

This patch also change a bit the code returned:
 - ENOBUFS -> ENOMEM
 - ENOENT -> ENODEV

Fixes: 2b0bb01b ("ip6_tunnel: Return an error when adding an existing tunnel.")
CC: Steffen Klassert <steffen.klassert@secunet.com>
Reported-by: NPierre Cheynier <me@pierre-cheynier.net>
Signed-off-by: NNicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

37355565

netdevice.h: fix ndo_bridge_* comments · ad41faa8

由 Nicolas Dichtel 提交于 3月 17, 2015

The argument 'flags' was missing in ndo_bridge_setlink().
ndo_bridge_dellink() was missing.

Fixes: 407af329 ("bridge: Add netlink interface to configure vlans on bridge ports")
Fixes: add511b3 ("bridge: add flags argument to ndo_bridge_setlink and ndo_bridge_dellink")
CC: Vlad Yasevich <vyasevic@redhat.com>
CC: Roopa Prabhu <roopa@cumulusnetworks.com>
Signed-off-by: NNicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

ad41faa8

17 3月, 2015 3 次提交

bnx2x: fix encapsulation features on 57710/57711 · a8e0c246

由 Michal Schmidt 提交于 3月 16, 2015

E1x chips (57710, 57711(E)) have no support for encapsulation
offload. bnx2x incorrectly advertises the support as available.

Setting of those features is conditional on "!CHIP_IS_E1x(bp)", but
the bp struct is not initialized yet at this point and consequently
any chip passes the check.
The check must use the "chip_is_e1x" local variable instead to work
correctly.
Signed-off-by: NMichal Schmidt <mschmidt@redhat.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

a8e0c246

Merge tag 'mac80211-for-davem-2015-03-16' of... · 48b810d9

由 David S. Miller 提交于 3月 16, 2015

Merge tag 'mac80211-for-davem-2015-03-16' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211

Johannes Berg says:

====================
Here are a few fixes that I'd like to still get in:
 * disable U-APSD for better interoperability, from Michal Kazior
 * drop unencrypted frames in mesh forwarding, from Bob Copeland
 * treat non-QoS/WMM HT stations as non-HT, to fix confusion when
   they connect and then get QoS packets anyway due to HT
 * fix counting interfaces for combination checks, otherwise the
   interface combinations aren't properly enforced (from Andrei)
 * fix pure ECSA by reacting to the IE change
 * ignore erroneous (E)CSA to the current channel which sometimes
   happens due to AP/GO bugs
====================
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

48b810d9

Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec · ca00942a

由 David S. Miller 提交于 3月 16, 2015

Steffen Klassert says:

====================
pull request (net): ipsec 2015-03-16

1) Fix the network header offset in _decode_session6
   when multiple IPv6 extension headers are present.
   From Hajime Tazaki.

2) Fix an interfamily tunnel crash. We set outer mode
   protocol too early and may dispatch to the wrong
   address family. Move the setting of the outer mode
   protocol behind the last accessing of the inner mode
   to fix the crash.

3) Most callers of xfrm_lookup() expect that dst_orig
   is released on error. But xfrm_lookup_route() may
   need dst_orig to handle certain error cases. So
   introduce a flag that tells what should be done in
   case of error. From Huaibin Wang.

Please pull or let me know if there are problems.
====================
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

ca00942a

16 3月, 2015 7 次提交

mac80211: ignore CSA to same channel · f84eaa10

由 Johannes Berg 提交于 3月 12, 2015

If the AP is confused and starts doing a CSA to the same channel,
just ignore that request instead of trying to act it out since it
was likely sent in error anyway.

In the case of the bug I was investigating the GO was misbehaving
and sending out a beacon with CSA IEs still included after having
actually done the channel switch.
Signed-off-by: NJohannes Berg <johannes.berg@intel.com>

f84eaa10

nl80211: ignore HT/VHT capabilities without QoS/WMM · 496fcc29

由 Johannes Berg 提交于 3月 12, 2015

As HT/VHT depend heavily on QoS/WMM, it's not a good idea to
let userspace add clients that have HT/VHT but not QoS/WMM.
Since it does so in certain cases we've observed (client is
using HT IEs but not QoS/WMM) just ignore the HT/VHT info at
this point and don't pass it down to the drivers which might
unconditionally use it.

Cc: stable@vger.kernel.org
Signed-off-by: NJohannes Berg <johannes.berg@intel.com>

496fcc29

mac80211: ask for ECSA IE to be considered for beacon parse CRC · 70a3fd6c

由 Johannes Berg 提交于 3月 12, 2015

When a beacon from the AP contains only the ECSA IE, and not a CSA IE
as well, this ECSA IE is not considered for calculating the CRC and
the beacon might be dropped as not being interesting. This is clearly
wrong, it should be handled and the channel switch should be executed.

Fix this by including the ECSA IE ID in the bitmap of interesting IEs.
Reported-by: NGil Tribush <gil.tribush@intel.com>
Reviewed-by: NLuciano Coelho <luciano.coelho@intel.com>
Signed-off-by: NJohannes Berg <johannes.berg@intel.com>

70a3fd6c

mac80211: count interfaces correctly for combination checks · 0f611d28

由 Andrei Otcheretianski 提交于 3月 12, 2015

Since moving the interface combination checks to mac80211, it's
broken because it now only considers interfaces with an assigned
channel context, so for example any interface that isn't active
can still be up, which is clearly an issue; also, in particular
P2P-Device wdevs are an issue since they never have a chanctx.

Fix this by counting running interfaces instead the ones with a
channel context assigned.

Cc: stable@vger.kernel.org [3.16+]
Fixes: 73de86a3 ("cfg80211/mac80211: move interface counting for combination check to mac80211")
Signed-off-by: NAndrei Otcheretianski <andrei.otcheretianski@intel.com>
Signed-off-by: NEmmanuel Grumbach <emmanuel.grumbach@intel.com>
[rewrite commit message, dig out the commit it fixes]
Signed-off-by: NJohannes Berg <johannes.berg@intel.com>

0f611d28

isdn: icn: use strlcpy() when parsing setup options · 10640d34

由 Dan Carpenter 提交于 3月 15, 2015

If you pass an invalid string here then you probably deserve the memory
corruption, but it annoys static analysis tools so lets fix it.
Signed-off-by: NDan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

10640d34

rxrpc: bogus MSG_PEEK test in rxrpc_recvmsg() · 7d985ed1

由 Al Viro 提交于 3月 14, 2015

[I would really like an ACK on that one from dhowells; it appears to be
quite straightforward, but...]

MSG_PEEK isn't passed to ->recvmsg() via msg->msg_flags; as the matter of
fact, neither the kernel users of rxrpc, nor the syscalls ever set that bit
in there.  It gets passed via flags; in fact, another such check in the same
function is done correctly - as flags & MSG_PEEK.

It had been that way (effectively disabled) for 8 years, though, so the patch
needs beating up - that case had never been tested.  If it is correct, it's
-stable fodder.
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

7d985ed1

caif: fix MSG_OOB test in caif_seqpkt_recvmsg() · 3eeff778

由 Al Viro 提交于 3月 14, 2015

It should be checking flags, not msg->msg_flags.  It's ->sendmsg()
instances that need to look for that in ->msg_flags, ->recvmsg() ones
(including the other ->recvmsg() instance in that file, as well as
unix_dgram_recvmsg() this one claims to be imitating) check in flags.
Braino had been introduced in commit dcda13 ("caif: Bugfix - use MSG_TRUNC
in receive") back in 2010, so it goes quite a while back.
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

3eeff778

15 3月, 2015 2 次提交

bridge: reset bridge mtu after deleting an interface · 4c906c27

由 Venkat Venkatsubra 提交于 3月 13, 2015

On adding an interface br_add_if() sets the MTU to the min of
all the interfaces. Do the same thing on removing an interface too
in br_del_if.
Signed-off-by: NVenkat Venkatsubra <venkat.x.venkatsubra@oracle.com>
Acked-by: NRoopa Prabhu <roopa@cumulusnetworks.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

4c906c27

Merge tag 'linux-can-fixes-for-4.0-20150314' of... · 07c21715

由 David S. Miller 提交于 3月 14, 2015

Merge tag 'linux-can-fixes-for-4.0-20150314' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can

Marc Kleine-Budde says:

====================
Hello David, this is a pull request for net/master, consisting of two patches:

In the first patch Michal Simek enables the xilinx CAN driver for ARM64. The
second patch by Ahmed S. Darwish fixes a race condition in the tx-queue of the
kvaser_usb driver.
====================
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

07c21715

14 3月, 2015 6 次提交

can: kvaser_usb: Fix tx queue start/stop race conditions · a9dc960c

由 Ahmed S. Darwish 提交于 3月 14, 2015

A number of tx queue wake-up events went missing due to the
outlined scenario below. Start state is a pool of 16 tx URBs,
active tx_urbs count = 15, with the netdev tx queue open.

CPU #1 [softirq]                         CPU #2 [softirq]
start_xmit()                             tx_acknowledge()
................                         ................

atomic_inc(&tx_urbs);
if (atomic_read(&tx_urbs) >= 16) {
                        -->
                                         atomic_dec(&tx_urbs);
                                         netif_wake_queue();
                                         return;
                        <--
    netif_stop_queue();
}

At the end, the correct state expected is a 15 tx_urbs count
value with the tx queue state _open_. Due to the race, we get
the same tx_urbs value but with the tx queue state _stopped_.
The wake-up event is completely lost.

Thus avoid hand-rolled concurrency mechanisms and use a proper
lock for contexts and tx queue protection.
Signed-off-by: NAhmed S. Darwish <ahmed.darwish@valeo.com>
Cc: linux-stable <stable@vger.kernel.org>
Signed-off-by: NMarc Kleine-Budde <mkl@pengutronix.de>

a9dc960c

net: can: Enable xilinx driver for ARM64 · 963a822b

由 Michal Simek 提交于 3月 09, 2015

Enable the xilinx driver for ARM64.
Signed-off-by: NMichal Simek <michal.simek@xilinx.com>
Acked-by: NSören Brinkmann <soren.brinkmann@xilinx.com>
Signed-off-by: NMarc Kleine-Budde <mkl@pengutronix.de>

963a822b

inet_diag: fix possible overflow in inet_diag_dump_one_icsk() · c8e2c80d

由 Eric Dumazet 提交于 3月 13, 2015

inet_diag_dump_one_icsk() allocates too small skb.

Add inet_sk_attr_size() helper right before inet_sk_diag_fill()
so that it can be updated if/when new attributes are added.

iproute2/ss currently does not use this dump_one() interface,
this might explain nobody noticed this problem yet.
Signed-off-by: NEric Dumazet <edumazet@google.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

c8e2c80d

Revert "net: fec: fix the warning found by dma debug" · a2fe37b6

由 Fabio Estevam 提交于 3月 13, 2015

This reverts commit 2b995f63.

Панов Андрей reported the following regression:

"Commit 2b995f63 in 4.0.0-rc3 introduces a
nasty bug in transmit, corrupting packets.

To reproduce:

$ dd if=/dev/zero of=zeros bs=1M count=20
$ md5sum -b zeros
8f4e33f3dc3e414ff94e5fb6905cba8c *zeros

This checksum is correct.

Copy file "zeros" to another host with NFS, and it gets corrupted, checksum is
changed.
File should be big, small amounts of transmit isn't affected.

I use an i.MX6 Quad board.

If this commit is reverted, all works fine."
Reported-by: NПанов Андрей <rockford@yandex.ru>
Signed-off-by: NFabio Estevam <fabio.estevam@freescale.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

a2fe37b6

vxlan: fix wrong usage of VXLAN_VID_MASK · 40fb70f3

由 Alexey Kodanev 提交于 3月 13, 2015

commit dfd8645e wrongly assumes that VXLAN_VDI_MASK includes
eight lower order reserved bits of VNI field that are using for remote
checksum offload.

Right now, when VNI number greater then 0xffff, vxlan_udp_encap_recv()
will always return with 'bad_flag' error, reducing the usable vni range
from 0..16777215 to 0..65535. Also, it doesn't really check whether RCO
bits processed or not.

Fix it by adding new VNI mask which has all 32 bits of VNI field:
24 bits for id and 8 bits for other usage.
Signed-off-by: NAlexey Kodanev <alexey.kodanev@oracle.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

40fb70f3

tulip_core.c : out-of-bounds check. · b57578b3

由 Ameen Ali 提交于 3月 13, 2015

Array index 'j' is used before limits check.

Suggest put limit check before index use.

Signed-off-by : <Ameenali023@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

b57578b3

13 3月, 2015 1 次提交

virtio-net: correctly delete napi hash · ab3971b1

由 Jason Wang 提交于 3月 12, 2015

We don't delete napi from hash list during module exit. This will
cause the following panic when doing module load and unload:

BUG: unable to handle kernel paging request at 0000004e00000075
IP: [<ffffffff816bd01b>] napi_hash_add+0x6b/0xf0
PGD 3c5d5067 PUD 0
Oops: 0000 [#1] SMP
...
Call Trace:
[<ffffffffa0a5bfb7>] init_vqs+0x107/0x490 [virtio_net]
[<ffffffffa0a5c9f2>] virtnet_probe+0x562/0x791815639d880be [virtio_net]
[<ffffffff8139e667>] virtio_dev_probe+0x137/0x200
[<ffffffff814c7f2a>] driver_probe_device+0x7a/0x250
[<ffffffff814c81d3>] __driver_attach+0x93/0xa0
[<ffffffff814c8140>] ? __device_attach+0x40/0x40
[<ffffffff814c6053>] bus_for_each_dev+0x63/0xa0
[<ffffffff814c7a79>] driver_attach+0x19/0x20
[<ffffffff814c76f0>] bus_add_driver+0x170/0x220
[<ffffffffa0a60000>] ? 0xffffffffa0a60000
[<ffffffff814c894f>] driver_register+0x5f/0xf0
[<ffffffff8139e41b>] register_virtio_driver+0x1b/0x30
[<ffffffffa0a60010>] virtio_net_driver_init+0x10/0x12 [virtio_net]

This patch fixes this by doing this in virtnet_free_queues(). And also
don't delete napi in virtnet_freeze() since it will call
virtnet_free_queues() which has already did this.

Fixes 91815639 ("virtio-net: rx busy polling support")
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: NJason Wang <jasowang@redhat.com>
Acked-by: NMichael S. Tsirkin <mst@redhat.com>
Reviewed-by: NMichael S. Tsirkin <mst@redhat.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

ab3971b1

12 3月, 2015 8 次提交

rds: avoid potential stack overflow · f862e07c

由 Arnd Bergmann 提交于 3月 11, 2015

The rds_iw_update_cm_id function stores a large 'struct rds_sock' object
on the stack in order to pass a pair of addresses. This happens to just
fit withint the 1024 byte stack size warning limit on x86, but just
exceed that limit on ARM, which gives us this warning:

net/rds/iw_rdma.c:200:1: warning: the frame size of 1056 bytes is larger than 1024 bytes [-Wframe-larger-than=]

As the use of this large variable is basically bogus, we can rearrange
the code to not do that. Instead of passing an rds socket into
rds_iw_get_device, we now just pass the two addresses that we have
available in rds_iw_update_cm_id, and we change rds_iw_get_mr accordingly,
to create two address structures on the stack there.
Signed-off-by: NArnd Bergmann <arnd@arndb.de>
Acked-by: NSowmini Varadhan <sowmini.varadhan@oracle.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

f862e07c

sock: fix possible NULL sk dereference in __skb_tstamp_tx · 3a8dd971

由 Willem de Bruijn 提交于 3月 11, 2015

Test that sk != NULL before reading sk->sk_tsflags.

Fixes: 49ca0d8b ("net-timestamp: no-payload option")
Reported-by: NOne Thousand Gnomes <gnomes@lxorguk.ukuu.org.uk>
Signed-off-by: NWillem de Bruijn <willemb@google.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

3a8dd971

xps: must clear sender_cpu before forwarding · c29390c6

由 Eric Dumazet 提交于 3月 11, 2015

John reported that my previous commit added a regression
on his router.

This is because sender_cpu & napi_id share a common location,
so get_xps_queue() can see garbage and perform an out of bound access.

We need to make sure sender_cpu is cleared before doing the transmit,
otherwise any NIC busy poll enabled (skb_mark_napi_id()) can trigger
this bug.
Signed-off-by: NEric Dumazet <edumazet@google.com>
Reported-by: NJohn <jw@nuclearfallout.net>
Bisected-by: NJohn <jw@nuclearfallout.net>
Fixes: 2bd82484 ("xps: fix xps for stacked devices")
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

c29390c6

xen-netback: notify immediately after pushing Tx response. · c8a4d299

由 David Vrabel 提交于 3月 11, 2015

This fixes a performance regression introduced by
7fbb9d84 (xen-netback: release pending
index before pushing Tx responses)

Moving the notify outside of the spin locks means it can be delayed a
long time (if the dealloc thread is descheduled or there is an
interrupt or softirq).
Signed-off-by: NDavid Vrabel <david.vrabel@citrix.com>
Reviewed-by: NZoltan Kiss <zoltan.kiss@linaro.org>
Acked-by: NWei Liu <wei.liu2@citrix.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

c8a4d299

net: sysctl_net_core: check SNDBUF and RCVBUF for min length · b1cb59cf

由 Alexey Kodanev 提交于 3月 11, 2015

sysctl has sysctl.net.core.rmem_*/wmem_* parameters which can be
set to incorrect values. Given that 'struct sk_buff' allocates from
rcvbuf, incorrectly set buffer length could result to memory
allocation failures. For example, set them as follows:

    # sysctl net.core.rmem_default=64
      net.core.wmem_default = 64
    # sysctl net.core.wmem_default=64
      net.core.wmem_default = 64
    # ping localhost -s 1024 -i 0 > /dev/null

This could result to the following failure:

skbuff: skb_over_panic: text:ffffffff81628db4 len:-32 put:-32
head:ffff88003a1cc200 data:ffff88003a1cc200 tail:0xffffffe0 end:0xc0 dev:<NULL>
kernel BUG at net/core/skbuff.c:102!
invalid opcode: 0000 [#1] SMP
...
task: ffff88003b7f5550 ti: ffff88003ae88000 task.ti: ffff88003ae88000
RIP: 0010:[<ffffffff8155fbd1>]  [<ffffffff8155fbd1>] skb_put+0xa1/0xb0
RSP: 0018:ffff88003ae8bc68  EFLAGS: 00010296
RAX: 000000000000008d RBX: 00000000ffffffe0 RCX: 0000000000000000
RDX: ffff88003fdcf598 RSI: ffff88003fdcd9c8 RDI: ffff88003fdcd9c8
RBP: ffff88003ae8bc88 R08: 0000000000000001 R09: 0000000000000000
R10: 0000000000000001 R11: 00000000000002b2 R12: 0000000000000000
R13: 0000000000000000 R14: ffff88003d3f7300 R15: ffff88000012a900
FS:  00007fa0e2b4a840(0000) GS:ffff88003fc00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000d0f7e0 CR3: 000000003b8fb000 CR4: 00000000000006f0
Stack:
 ffff88003a1cc200 00000000ffffffe0 00000000000000c0 ffffffff818cab1d
 ffff88003ae8bd68 ffffffff81628db4 ffff88003ae8bd48 ffff88003b7f5550
 ffff880031a09408 ffff88003b7f5550 ffff88000012aa48 ffff88000012ab00
Call Trace:
 [<ffffffff81628db4>] unix_stream_sendmsg+0x2c4/0x470
 [<ffffffff81556f56>] sock_write_iter+0x146/0x160
 [<ffffffff811d9612>] new_sync_write+0x92/0xd0
 [<ffffffff811d9cd6>] vfs_write+0xd6/0x180
 [<ffffffff811da499>] SyS_write+0x59/0xd0
 [<ffffffff81651532>] system_call_fastpath+0x12/0x17
Code: 00 00 48 89 44 24 10 8b 87 c8 00 00 00 48 89 44 24 08 48 8b 87 d8 00
      00 00 48 c7 c7 30 db 91 81 48 89 04 24 31 c0 e8 4f a8 0e 00 <0f> 0b
      eb fe 66 66 2e 0f 1f 84 00 00 00 00 00 55 48 89 e5 48 83
RIP  [<ffffffff8155fbd1>] skb_put+0xa1/0xb0
RSP <ffff88003ae8bc68>
Kernel panic - not syncing: Fatal exception

Moreover, the possible minimum is 1, so we can get another kernel panic:
...
BUG: unable to handle kernel paging request at ffff88013caee5c0
IP: [<ffffffff815604cf>] __alloc_skb+0x12f/0x1f0
...
Signed-off-by: NAlexey Kodanev <alexey.kodanev@oracle.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

b1cb59cf

tcp: restore 1.5x per RTT limit to CUBIC cwnd growth in congestion avoidance · d578e18c

由 Neal Cardwell 提交于 3月 10, 2015

Commit 814d488c ("tcp: fix the timid additive increase on stretch
ACKs") fixed a bug where tcp_cong_avoid_ai() would either credit a
connection with an increase of snd_cwnd_cnt, or increase snd_cwnd, but
not both, resulting in cwnd increasing by 1 packet on at most every
alternate invocation of tcp_cong_avoid_ai().

Although the commit correctly implemented the CUBIC algorithm, which
can increase cwnd by as much as 1 packet per 1 packet ACKed (2x per
RTT), in practice that could be too aggressive: in tests on network
paths with small buffers, YouTube server retransmission rates nearly
doubled.

This commit restores CUBIC to a maximum cwnd growth rate of 1 packet
per 2 packets ACKed (1.5x per RTT). In YouTube tests this restored
retransmit rates to low levels.

Testing: This patch has been tested in datacenter netperf transfers
and live youtube.com and google.com servers.

Fixes: 9cd981dc ("tcp: fix stretch ACK bugs in CUBIC")
Signed-off-by: NNeal Cardwell <ncardwell@google.com>
Signed-off-by: NYuchung Cheng <ycheng@google.com>
Signed-off-by: NEric Dumazet <edumazet@google.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

d578e18c

tcp: fix tcp_cong_avoid_ai() credit accumulation bug with decreases in w · 9949afa4

由 Neal Cardwell 提交于 3月 10, 2015

The recent change to tcp_cong_avoid_ai() to handle stretch ACKs
introduced a bug where snd_cwnd_cnt could accumulate a very large
value while w was large, and then if w was reduced snd_cwnd could be
incremented by a large delta, leading to a large burst and high packet
loss. This was tickled when CUBIC's bictcp_update() sets "ca->cnt =
100 * cwnd".

This bug crept in while preparing the upstream version of
814d488c.

Testing: This patch has been tested in datacenter netperf transfers
and live youtube.com and google.com servers.

Fixes: 814d488c ("tcp: fix the timid additive increase on stretch ACKs")
Signed-off-by: NNeal Cardwell <ncardwell@google.com>
Signed-off-by: NYuchung Cheng <ycheng@google.com>
Signed-off-by: NEric Dumazet <edumazet@google.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

9949afa4

MAINTAINERS: Update my email address · 366c1bd1

由 chas williams - CONTRACTOR 提交于 3月 11, 2015

Changed to my private email address.
Signed-off-by: NChas Williams -- CONTRACTOR <chas@cmf.nrl.navy.mil>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

366c1bd1

11 3月, 2015 5 次提交

net: Handle unregister properly when netdev namespace change fails. · 43638900

由 David S. Miller 提交于 3月 10, 2015

If rtnl_newlink() fails on it's call to dev_change_net_namespace(), we
have to make use of the ->dellink() method, if present, just like we
do when rtnl_configure_link() fails.

Fixes: 317f4810 ("rtnl: allow to create device with IFLA_LINK_NETNSID set")
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

43638900

net: add comment for sock_efree() usage · 7768eed8

由 Oliver Hartkopp 提交于 3月 10, 2015

Signed-off-by: NOliver Hartkopp <socketcan@hartkopp.net>
Acked-by: NAlexander Duyck <alexander.h.duyck@redhat.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

7768eed8

Merge tag 'wireless-drivers-for-davem-2015-03-10' of... · 3f39d625

由 David S. Miller 提交于 3月 10, 2015

Merge tag 'wireless-drivers-for-davem-2015-03-10' of git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/wireless-drivers

iwlwifi:

* fix ROC removal - avoids a firmware crash
* fix throughput regression on iwldvm devices
* fix panic in BT Coex
* fixes in rate control
* fixes in scan

b43:

* fix support for 5 GHz only BCM43228 model

rtlwifi:

* improve handling of IPv6 packets

brcmfmac:

* perform bound checking on vendor command buffer
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

3f39d625

cxgb4: fix coccinelle warnings · e3d50738

由 Hariprasad Shenai 提交于 3月 10, 2015

Commit 16e47624 ("cxgb4: Add new scheme to update T4/T5 firmware")
introduced below coccinelle warning.

>> drivers/net/ethernet/chelsio/cxgb4/t4_hw.c:994:2-8: Replace memcpy with
   struct assignment
Reported-by: NFengguang Wu <fengguang.wu@intel.com>
Signed-off-by: NHariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

e3d50738

net: fec: fix receive VLAN CTAG HW acceleration issue · af5cbc98

由 Nimrod Andy 提交于 3月 10, 2015

The current driver support receive VLAN CTAG HW acceleration feature
(NETIF_F_HW_VLAN_CTAG_RX) through software simulation. There calls the
api .skb_copy_to_linear_data_offset() to skip the VLAN tag, but there
have overlap between the two memory data point range. The patch just fix
the issue.

V2:
Michael Grzeschik suggest to use memmove() instead of skb_copy_to_linear_data_offset().
Reported-by: NMichael Grzeschik <m.grzeschik@pengutronix.de>
Fixes: 1b7bde6d ("net: fec: implement rx_copybreak to improve rx performance")
Signed-off-by: NFugang Duan <B38611@freescale.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

af5cbc98