提交 · c53ed7423619b4e8108914a9f31b426dd58ad591 · openanolis / cloud-kernel

20 11月, 2013 5 次提交

genetlink: only pass array to genl_register_family_with_ops() · c53ed742

由 Johannes Berg 提交于 11月 19, 2013

As suggested by David Miller, make genl_register_family_with_ops()
a macro and pass only the array, evaluating ARRAY_SIZE() in the
macro, this is a little safer.

The openvswitch has some indirection, assing ops/n_ops directly in
that code. This might ultimately just assign the pointers in the
family initializations, saving the struct genl_family_and_ops and
code (once mcast groups are handled differently.)
Signed-off-by: NJohannes Berg <johannes.berg@intel.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

c53ed742

tcp: don't update snd_nxt, when a socket is switched from repair mode · dbde4979

由 Andrey Vagin 提交于 11月 19, 2013

snd_nxt must be updated synchronously with sk_send_head.  Otherwise
tp->packets_out may be updated incorrectly, what may bring a kernel panic.

Here is a kernel panic from my host.
[  103.043194] BUG: unable to handle kernel NULL pointer dereference at 0000000000000048
[  103.044025] IP: [<ffffffff815aaaaf>] tcp_rearm_rto+0xcf/0x150
...
[  146.301158] Call Trace:
[  146.301158]  [<ffffffff815ab7f0>] tcp_ack+0xcc0/0x12c0

Before this panic a tcp socket was restored. This socket had sent and
unsent data in the write queue. Sent data was restored in repair mode,
then the socket was switched from reapair mode and unsent data was
restored. After that the socket was switched back into repair mode.

In that moment we had a socket where write queue looks like this:
snd_una    snd_nxt   write_seq
   |_________|________|
             |
	  sk_send_head

After a second switching from repair mode the state of socket was
changed:

snd_una          snd_nxt, write_seq
   |_________ ________|
             |
	  sk_send_head

This state is inconsistent, because snd_nxt and sk_send_head are not
synchronized.

Bellow you can find a call trace, how packets_out can be incremented
twice for one skb, if snd_nxt and sk_send_head are not synchronized.
In this case packets_out will be always positive, even when
sk_write_queue is empty.

tcp_write_wakeup
	skb = tcp_send_head(sk);
	tcp_fragment
		if (!before(tp->snd_nxt, TCP_SKB_CB(buff)->end_seq))
			tcp_adjust_pcount(sk, skb, diff);
	tcp_event_new_data_sent
		tp->packets_out += tcp_skb_pcount(skb);

I think update of snd_nxt isn't required, when a socket is switched from
repair mode.  Because it's initialized in tcp_connect_init. Then when a
write queue is restored, snd_nxt is incremented in tcp_event_new_data_sent,
so it's always is in consistent state.

I have checked, that the bug is not reproduced with this patch and
all tests about restoring tcp connections work fine.

Cc: Pavel Emelyanov <xemul@parallels.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>
Cc: James Morris <jmorris@namei.org>
Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>
Cc: Patrick McHardy <kaber@trash.net>
Signed-off-by: NAndrey Vagin <avagin@openvz.org>
Acked-by: NPavel Emelyanov <xemul@parallels.com>
Acked-by: NEric Dumazet <edumazet@google.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

dbde4979

atm: idt77252: fix dev refcnt leak · b5de4a22

由 Ying Xue 提交于 11月 19, 2013

init_card() calls dev_get_by_name() to get a network deceive. But it
doesn't decrease network device reference count after the device is
used.
Signed-off-by: NYing Xue <ying.xue@windriver.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

b5de4a22

xfrm: Release dst if this dst is improper for vti tunnel · 236c9f84

由 fan.du 提交于 11月 19, 2013

After searching rt by the vti tunnel dst/src parameter,
if this rt has neither attached to any transformation
nor the transformation is not tunnel oriented, this rt
should be released back to ip layer.

otherwise causing dst memory leakage.
Signed-off-by: NFan Du <fan.du@windriver.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

236c9f84

netlink: fix documentation typo in netlink_set_err() · 840e93f2

由 Johannes Berg 提交于 11月 19, 2013

The parameter is just 'group', not 'groups', fix the documentation typo.
Signed-off-by: NJohannes Berg <johannes.berg@intel.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

840e93f2

19 11月, 2013 14 次提交

be2net: Delete secondary unicast MAC addresses during be_close · d11a347d

由 Ajit Khaparde 提交于 11月 18, 2013

Secondary unicast MAC addresses will get deleted only when the interface is UP.
When the interface is DOWN, though these secondary MAC addresses are unusable
and awaiting to be deleted, cause the firmware to believe that they are being used.
If the user intends to set a MAC address as primary MAC from one of these
secondary MAC addresses, the firmware returns a MAC address Collision error.
Delete these secondary MAC addresses during be_close.

The secondary MAC addresses list will be refreshed during interface open anyway.
Signed-off-by: NAjit Khaparde <ajit.khaparde@emulex.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

d11a347d

be2net: Fix unconditional enabling of Rx interface options · 012bd387

由 Ajit Khaparde 提交于 11月 18, 2013

The driver currently requests the firmware to enable rx_interface options
without considering if the interface was created with that capability.
This could cause commands to firmware to fail.

To avoid this, enable only those options on an interface if the interface
was created with that capability.
Signed-off-by: NAjit Khaparde <ajit.khaparde@emulex.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

012bd387

net, virtio_net: replace the magic value · 0f13b66b

由 Zhi Yong Wu 提交于 11月 18, 2013

It is more appropriate to use # of queue pairs currently used by
the driver instead of a magic value.
Signed-off-by: NZhi Yong Wu <wuzhy@linux.vnet.ibm.com>
Acked-by: NMichael S. Tsirkin <mst@redhat.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

0f13b66b

ping: prevent NULL pointer dereference on write to msg_name · cf970c00

由 Hannes Frederic Sowa 提交于 11月 18, 2013

A plain read() on a socket does set msg->msg_name to NULL. So check for
NULL pointer first.
Signed-off-by: NHannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

cf970c00

Merge branch 'bnx2x' · 4861292f

由 David S. Miller 提交于 11月 18, 2013

Yuval Mintz says:

====================
bnx2x: Bug fixes patch series

This series contains several fixes, relating either to SR-IOV flows
or to critical sections protected by the rtnl lock.

Please consider applying these patches to `net'.
====================
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

4861292f

bnx2x: Prevent "timeout waiting for state X" · 6ffa39f2

由 Dmitry Kravkov 提交于 11月 17, 2013

Current driver release rtnl lock in between DCB re-configuration.
As a result, other flows (e.g., mtu config) may enter in between and fail
due to halted tx path for dcb configuration.
Signed-off-by: NDmitry Kravkov <dmitry@broadcom.com>
Signed-off-by: NYuval Mintz <yuvalmin@broadcom.com>
Signed-off-by: NAriel Elior <ariele@broadcom.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

6ffa39f2

bnx2x: prevent CFC attention · ffa1cb96

由 Dmitry Kravkov 提交于 11月 17, 2013

During VF load, prior to sending messages on HW channel to PF the VF
checks its bulletin board to see whether the PF indicated it has closed;
If a closed PF is encountered, the VF skips sending the message.

Due to incorrect return values, there's a possible scenario in which the VF
finishes loading "successfully", while the PF hasn't actually fully configured
FW/HW for the VFs supposed configuration.
Once VF tries to send Tx packets, HW will raise an attention (and FW possibly
will start treat the VF as malicious).

The patch fails the loading process in such a scenario.
Signed-off-by: NDmitry Kravkov <dmitry@broadcom.com>
Signed-off-by: NYuval Mintz <yuvalmin@broadcom.com>
Signed-off-by: NAriel Elior <ariele@broadcom.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

ffa1cb96

bnx2x: Prevent panic during DMAE timeout · 9dcd9acd

由 Dmitry Kravkov 提交于 11月 17, 2013

If chip enters a recovery flow just after the driver issues a DMAE request
the DMAE will timeout. Current code will cause a bnx2x_panic() as a result,
which means interface will no longer be usable (regardless of the recovery
results), as bnx2x_panic() is irreversible for the driver.

As this is a possible flow, the panic should be reached only when driver
is compiled with STOP_ON_ERROR.
Signed-off-by: NDmitry Kravkov <dmitry@broadcom.com>
Signed-off-by: NYuval Mintz <yuvalmin@broadcom.com>
Signed-off-by: NAriel Elior <ariele@broadcom.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

9dcd9acd

bnx2x: Clean the sp rtnl task upon unload · a0d307b2

由 Dmitry Kravkov 提交于 11月 17, 2013

While unloading, bnx2x needs to clean the sp_rtnl_state to prevent
configuration made before the unload to be applied afterwards with
stale values.
Signed-off-by: NDmitry Kravkov <dmitry@broadcom.com>
Signed-off-by: NYuval Mintz <yuvalmin@broadcom.com>
Signed-off-by: NAriel Elior <ariele@broadcom.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

a0d307b2

ipv6: Fix inet6_init() cleanup order · eca42aaf

由 Vlad Yasevich 提交于 11月 16, 2013

Commit 6d0bfe22
	net: ipv6: Add IPv6 support to the ping socket

introduced a change in the cleanup logic of inet6_init and
has a bug in that ipv6_packet_cleanup() may not be called.
Fix the cleanup ordering.

CC: Hannes Frederic Sowa <hannes@stressinduktion.org>
CC: Lorenzo Colitti <lorenzo@google.com>
CC: Fabio Estevam <fabio.estevam@freescale.com>
Signed-off-by: NVlad Yasevich <vyasevich@gmail.com>
Acked-by: NHannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

eca42aaf

genetlink: rename shadowed variable · 029b234f

由 Johannes Berg 提交于 11月 18, 2013

Sparse pointed out that the new flags variable I had added
shadowed an existing one, rename the new one to avoid that,
making the code clearer.
Signed-off-by: NJohannes Berg <johannes.berg@intel.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

029b234f

inet: prevent leakage of uninitialized memory to user in recv syscalls · bceaa902

由 Hannes Frederic Sowa 提交于 11月 18, 2013

Only update *addr_len when we actually fill in sockaddr, otherwise we
can return uninitialized memory from the stack to the caller in the
recvfrom, recvmmsg and recvmsg syscalls. Drop the the (addr_len == NULL)
checks because we only get called with a valid addr_len pointer either
from sock_common_recvmsg or inet_recvmsg.

If a blocking read waits on a socket which is concurrently shut down we
now return zero and set msg_msgnamelen to 0.
Reported-by: Nmpb <mpb.mail@gmail.com>
Suggested-by: NEric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: NHannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

bceaa902

net: ipv6: ndisc: Fix warning when CONFIG_SYSCTL=n · bcd081a3

由 Fabio Estevam 提交于 11月 16, 2013

When CONFIG_SYSCTL=n the following build warning happens:

net/ipv6/ndisc.c:1730:1: warning: label 'out' defined but not used [-Wunused-label]

The 'out' label is only used when CONFIG_SYSCTL=y, so move it inside the
'ifdef CONFIG_SYSCTL' block.
Signed-off-by: NFabio Estevam <fabio.estevam@freescale.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

bcd081a3

xen-netfront: fix missing rx_refill_timer when allocate memory failed · fdcf7765

由 Ma JieYue 提交于 11月 15, 2013

There was a bug in xennet_alloc_rx_buffers, when allocating page or
sk_buff failed, and at the same time rx_batch queue not empty,
the rx_refill_timer timer won't be scheduled. If finally the remaining
request buffers in rx ring less than what backend driver expected,
the backend driver would think of rx ring as full and start dropping packets.
In such situation, there is no way for the netfront driver to recover
automatically, so that the device can not work properly.

The patch fixes the problem by always scheduling rx_refill_timer timer when
alloc_page or __netdev_alloc_skb fails, no matter whether rx_batch queue is
empty or not. It ensures that the rx ring request buffers will finally meet
the backend needs.
Signed-off-by: NMa JieYue <jieyue.majy@alibaba-inc.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

fdcf7765

16 11月, 2013 10 次提交

net: Handle CHECKSUM_COMPLETE more adequately in pskb_trim_rcsum(). · 018c5bba

由 David S. Miller 提交于 11月 15, 2013

Currently pskb_trim_rcsum() just balks on CHECKSUM_COMPLETE packets
and remarks them as CHECKSUM_NONE, forcing a software checksum
validation later.

We have all of the mechanics available to fixup the skb->csum value,
even for complicated fragmented packets, via the helpers
skb_checksum() and csum_sub().

So just use them.

Based upon a suggestion by Herbert Xu.
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

018c5bba

pkt_sched: fq: fix pacing for small frames · f52ed899

由 Eric Dumazet 提交于 11月 15, 2013

For performance reasons, sch_fq tried hard to not setup timers for every
sent packet, using a quantum based heuristic : A delay is setup only if
the flow exhausted its credit.

Problem is that application limited flows can refill their credit
for every queued packet, and they can evade pacing.

This problem can also be triggered when TCP flows use small MSS values,
as TSO auto sizing builds packets that are smaller than the default fq
quantum (3028 bytes)

This patch adds a 40 ms delay to guard flow credit refill.

Fixes: afe4fd06 ("pkt_sched: fq: Fair Queue packet scheduler")
Signed-off-by: NEric Dumazet <edumazet@google.com>
Cc: Maciej Żenczykowski <maze@google.com>
Cc: Willem de Bruijn <willemb@google.com>
Cc: Yuchung Cheng <ycheng@google.com>
Cc: Neal Cardwell <ncardwell@google.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

f52ed899

pkt_sched: fq: warn users using defrate · 65c5189a

由 Eric Dumazet 提交于 11月 15, 2013

Commit 7eec4174 ("pkt_sched: fq: fix non TCP flows pacing")
obsoleted TCA_FQ_FLOW_DEFAULT_RATE without notice for the users.

Suggested by David Miller
Signed-off-by: NEric Dumazet <edumazet@google.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

65c5189a

genetlink: unify registration functions · 568508aa

由 Johannes Berg 提交于 11月 15, 2013

Now that the ops assignment is just two variables rather than a
long list iteration etc., there's no reason to separately export
__genl_register_family() and __genl_register_family_with_ops().

Unify the two functions into __genl_register_family() and make
genl_register_family_with_ops() call it after assigning the ops.
Signed-off-by: NJohannes Berg <johannes.berg@intel.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

568508aa

MAINTAINERS: Update Pegasus and RTL8150 repositories; · 052e3128

由 Petko Manolov 提交于 11月 15, 2013

The diff is against latest 'net' repository;
Signed-off-by: NPetko Manolov <petkan@nucleusys.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

052e3128

net: ethernet: ti/cpsw: do not crash on single-MAC machines during resume · 1e7a2e21

由 Daniel Mack 提交于 11月 15, 2013

During resume, use for_each_slave to walk the slaves of the cpsw, and
soft-reset each of them. This prevents oopses if there is only one
slave configured.
Signed-off-by: NDaniel Mack <zonque@gmail.com>
Acked-by: NMugunthan V N <mugunthanvnm@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

1e7a2e21

Merge branch 'macvlan' · 82c80e9d

由 David S. Miller 提交于 11月 15, 2013

Michal Kubecek says:

====================
macvlan: disable LRO on lowerdev instead of a macvlan

A customer of ours encountered a problem with LRO on an ixgbe network
card. Analysis showed that it was a known conflict of forwarding and LRO
but the forwarding was enabled in an LXC container where only a macvlan
was, not the ethernet device itself.

I believe the solution is exactly the same as what we do for "normal"
(802.1q) VLAN devices: if dev_disable_lro() is called for such device,
LRO is disabled on the underlying "real" device instead.

v2: adapt to changes merged from net-next

v3: use BUG() in macvlan_dev_real_dev() if compiled without macvlan
====================
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

82c80e9d

macvlan: disable LRO on lower device instead of macvlan · 529d0489

由 Michal Kubeček 提交于 11月 15, 2013

A macvlan device has always LRO disabled so that calling
dev_disable_lro() on it does nothing. If we need to disable LRO
e.g. because

  - the macvlan device is inserted into a bridge
  - IPv6 forwarding is enabled for it
  - it is in a different namespace than lowerdev and IPv4
    forwarding is enabled in it

we need to disable LRO on its underlying device instead (as we
do for 802.1q VLAN devices).

v2: use newly introduced netif_is_macvlan()
Signed-off-by: NMichal Kubecek <mkubecek@suse.cz>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

529d0489

macvlan: introduce macvlan_dev_real_dev() helper function · be9eac48

由 Michal Kubeček 提交于 11月 15, 2013

Introduce helper function macvlan_dev_real_dev which returns the
underlying device of a macvlan device, similar to vlan_dev_real_dev()
for 802.1q VLAN devices.

v2: IFF_MACVLAN flag and equivalent of is_macvlan_dev() were
introduced in the meantime

v3: do BUG() if compiled without macvlan support
Signed-off-by: NMichal Kubecek <mkubecek@suse.cz>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

be9eac48

bonding: add ip checks when store ip target · f9de11a1

由 Wang Weidong 提交于 11月 15, 2013

I met a Bug when I add ip target with the wrong ip address:

echo +500.500.500.500 > /sys/class/net/bond0/bonding/arp_ip_target

the wrong ip address will transfor to 245.245.245.244 and add
to the ip target success, it is uncorrect, so I add checks to avoid
adding wrong address.

The in4_pton() will set wrong ip address to 0.0.0.0, it will return by
the next check and will not add to ip target.

v2
According Veaceslav's opinion, simplify the code.

v3
According Veaceslav's opinion, add broadcast check and make a micro
definition to package it.

v4
Solve the problem of the format which David point out.
Suggested-by: NVeaceslav Falico <vfalico@redhat.com>
Suggested-by: NDavid S. Miller <davem@davemloft.net>
Signed-off-by: NDing Tianhong <dingtianhong@huawei.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

f9de11a1

15 11月, 2013 11 次提交

6lowpan: Uncompression of traffic class field was incorrect · 1188f054

由 Jukka Rissanen 提交于 11月 13, 2013

If priority/traffic class field in IPv6 header is set (seen when
using ssh), the uncompression sets the TC and Flow fields incorrectly.

Example:

This is IPv6 header of a sent packet. Note the priority/TC (=1) in
the first byte.

00000000: 61 00 00 00 00 2c 06 40 fe 80 00 00 00 00 00 00
00000010: 02 02 72 ff fe c6 42 10 fe 80 00 00 00 00 00 00
00000020: 02 1e ab ff fe 4c 52 57

This gets compressed like this in the sending side

00000000: 72 31 04 06 02 1e ab ff fe 4c 52 57 ec c2 00 16
00000010: aa 2d fe 92 86 4e be c6 ....

In the receiving end, the packet gets uncompressed to this
IPv6 header

00000000: 60 06 06 02 00 2a 1e 40 fe 80 00 00 00 00 00 00
00000010: 02 02 72 ff fe c6 42 10 fe 80 00 00 00 00 00 00
00000020: ab ff fe 4c 52 57 ec c2

First four bytes are set incorrectly and we have also lost
two bytes from destination address.

The fix is to switch the case values in switch statement
when checking the TC field.
Signed-off-by: NJukka Rissanen <jukka.rissanen@linux.intel.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

1188f054

tipc: fix dereference before check warning · 3db0a197

由 Erik Hugne 提交于 11月 13, 2013

This fixes the following Smatch warning:
net/tipc/link.c:2364 tipc_link_recv_fragment()
    warn: variable dereferenced before check '*head' (see line 2361)

A null pointer might be passed to skb_try_coalesce if
a malicious sender injects orphan fragments on a link.
Signed-off-by: NErik Hugne <erik.hugne@ericsson.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

3db0a197

ipv4: fix possible seqlock deadlock · c9e90429

由 Eric Dumazet 提交于 11月 14, 2013

ip4_datagram_connect() being called from process context,
it should use IP_INC_STATS() instead of IP_INC_STATS_BH()
otherwise we can deadlock on 32bit arches, or get corruptions of
SNMP counters.

Fixes: 584bdf8c ("[IPV4]: Fix "ipOutNoRoutes" counter error for TCP and UDP")
Signed-off-by: NEric Dumazet <edumazet@google.com>
Reported-by: NDave Jones <davej@redhat.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

c9e90429

net/hsr: Fix possible leak in 'hsr_get_node_status()' · 84a035f6

由 Geyslan G. Bem 提交于 11月 14, 2013

If 'hsr_get_node_data()' returns error, going directly to 'fail' label
doesn't free the memory pointed by 'skb_out'.
Signed-off-by: NGeyslan G. Bem <geyslan@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

84a035f6

Merge branch 'for-davem' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless · 8422d1f1

由 David S. Miller 提交于 11月 14, 2013

John W. Linville says:

====================
pull request: wireless 2013-11-14

Please pull this batch of fixes intended for the 3.13 stream!

Amitkumar Karwar offers a quartet of mwifiex fixes, including an
endian fix and three fixes for invalid memory access.

Avinash Patil trims the packet length value for packets received from
an SDIO interface.

Colin Ian King fixes a NULL pointer dereference in the rtlwifi
efuse code.

Dan Carpenter cleans-up an mwifiex integer underflow, a potential
libertas oops, a memory corrupion bug in wcn36xx, and a locking issue
also in wcn36xx.

Dan Williams helps prism54 devices to avoid being misclassified as
Ethernet devices.

Felipe Pena fixes a couple of typo errors, one in rt2x00 and the
other in rtlwifi.

Janusz Dziedzic corrects a pair of DFS-related problems in ath9k.

Larry Finger patches three rtlwifi drivers to correctly report signal
strength even for an unassociated AP.

Mark Cave-Ayland rewrites some endian-illiterate packet type extraction
code in rtlwifi.

Stanislaw Gruszka addresses an rt2x00 regression related to setting
HT station WCID and AMPDU density parameters.

Sujith Manoharan corrects the initvals settings for AR9485.

Ujjal Roy patches an obscure bit of code in mwifiex that was using
the wrong definition of eth_hdr when briding patches in AP mode.

Wei Yongjun fixes a couple of bugs: one is a return code handling
bug in libertas; and, the other is a locking issue in wcn36xx.
====================
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

8422d1f1

virtio-net: mergeable buffer size should include virtio-net header · 5061de36

由 Michael Dalton 提交于 11月 14, 2013

Commit 2613af0e ("virtio_net: migrate mergeable rx buffers to page
frag allocators") changed the mergeable receive buffer size from PAGE_SIZE
to MTU-size. However, the merge buffer size does not take into account the
size of the virtio-net header. Consequently, packets that are MTU-size
will take two buffers intead of one (to store the virtio-net header),
substantially decreasing the throughput of MTU-size traffic due to TCP
window / SKB truesize effects.

This commit changes the mergeable buffer size to include the virtio-net
header. The buffer size is cacheline-aligned because skb_page_frag_refill
will not automatically align the requested size.

Benchmarks taken from an average of 5 netperf 30-second TCP_STREAM runs
between two QEMU VMs on a single physical machine. Each VM has two VCPUs and
vhost enabled. All VMs and vhost threads run in a single 4 CPU cgroup
cpuset, using cgroups to ensure that other processes in the system will not
be scheduled on the benchmark CPUs. Transmit offloads and mergeable receive
buffers are enabled, but guest_tso4 / guest_csum are explicitly disabled to
force MTU-sized packets on the receiver.

next-net trunk before 2613af0e (PAGE_SIZE buf): 3861.08Gb/s
net-next trunk (MTU 1500- packet uses two buf due to size bug): 4076.62Gb/s
net-next trunk (MTU 1480- packet fits in one buf): 6301.34Gb/s
net-next trunk w/ size fix (MTU 1500 - packet fits in one buf): 6445.44Gb/s
Suggested-by: NEric Northup <digitaleric@google.com>
Signed-off-by: NMichael Dalton <mwdalton@google.com>
Acked-by: NEric Dumazet <edumazet@google.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

5061de36

connector: improved unaligned access error fix · 1ca1a4cf

由 Chris Metcalf 提交于 11月 14, 2013

In af3e095a, Erik Jacobsen fixed one type of unaligned access
bug for ia64 by converting a 64-bit write to use put_unaligned().
Unfortunately, since gcc will convert a short memset() to a series
of appropriately-aligned stores, the problem is now visible again
on tilegx, where the memset that zeros out proc_event is converted
to three 64-bit stores, causing an unaligned access panic.

A better fix for the original problem is to ensure that proc_event
is aligned to 8 bytes here. We can do that relatively easily by
arranging to start the struct cn_msg aligned to 8 bytes and then
offset by 4 bytes. Doing so means that the immediately following
proc_event structure is then correctly aligned to 8 bytes.

The result is that the memset() stores are now aligned, and as an
added benefit, we can remove the put_unaligned() calls in the code.
Signed-off-by: NChris Metcalf <cmetcalf@tilera.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

1ca1a4cf

pkt_sched: fq: change classification of control packets · 2abc2f07

由 Maciej Żenczykowski 提交于 11月 14, 2013

Initial sch_fq implementation copied code from pfifo_fast to classify
a packet as a high prio packet.

This clashes with setups using PRIO with say 7 bands, as one of the
band could be incorrectly (mis)classified by FQ.

Packets would be queued in the 'internal' queue, and no pacing ever
happen for this special queue.

Fixes: afe4fd06 ("pkt_sched: fq: Fair Queue packet scheduler")
Signed-off-by: NMaciej Żenczykowski <maze@google.com>
Signed-off-by: NEric Dumazet <edumazet@google.com>
Cc: Stephen Hemminger <stephen@networkplumber.org>
Cc: Willem de Bruijn <willemb@google.com>
Cc: Yuchung Cheng <ycheng@google.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

2abc2f07

alx: Reset phy speed after resume · b54629e2

由 hahnjo 提交于 11月 12, 2013

This fixes bug 62491 (https://bugzilla.kernel.org/show_bug.cgi?id=62491).
After resuming some users got the following error flooding the kernel log:
alx 0000:02:00.0: invalid PHY speed/duplex: 0xffff
Signed-off-by: NJonas Hahnfeld <linux@hahnjo.de>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

b54629e2

Merge branch 'genetlink' · 4fb09a87

由 David S. Miller 提交于 11月 14, 2013

Johannes Berg says:

====================
genetlink: reduce ops size and complexity (v2)

As before - reduce the complexity and data/code size of genetlink ops
by making them an array rather than a linked list. Most users already
use an array thanks to genl_register_family_with_ops(), so convert the
remaining ones allowing us to get rid of the list head in each op.

Also make them const, this just makes sense at that point and the security
people like making function pointers const as well :-)
====================
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

4fb09a87

genetlink: make genl_ops flags a u8 and move to end · 3f5ccd06

由 Johannes Berg 提交于 11月 14, 2013

To save some space in the struct on 32-bit systems,
make the flags a u8 (only 4 bits are used) and also
move them to the end of the struct.

This has no impact on 64-bit systems as alignment of
the struct in an array uses up the space anyway.
Signed-off-by: NJohannes Berg <johannes.berg@intel.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

3f5ccd06

openanolis / cloud-kernel 1 年多 前同步成功

openanolis / cloud-kernel
1 年多前同步成功