提交 · bf73130d7f98c8c4db143e2dc4982f4eefd5d5e5 · openeuler / raspberrypi-kernel

24 4月, 2010 5 次提交

IPv6: Complete IPV6_DONTFRAG support · 4b340ae2

由 Brian Haley 提交于 4月 23, 2010

Finally add support to detect a local IPV6_DONTFRAG event
and return the relevant data to the user if they've enabled
IPV6_RECVPATHMTU on the socket.  The next recvmsg() will
return no data, but have an IPV6_PATHMTU as ancillary data.
Signed-off-by: NBrian Haley <brian.haley@hp.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

4b340ae2

IPv6: Add dontfrag argument to relevant functions · 13b52cd4

由 Brian Haley 提交于 4月 23, 2010

Add dontfrag argument to relevant functions for
IPV6_DONTFRAG support, as well as allowing the value
to be passed-in via ancillary cmsg data.
Signed-off-by: NBrian Haley <brian.haley@hp.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

13b52cd4

IPv6: data structure changes for new socket options · 793b1473

由 Brian Haley 提交于 4月 23, 2010

Add underlying data structure changes and basic setsockopt()
and getsockopt() support for IPV6_RECVPATHMTU, IPV6_PATHMTU,
and IPV6_DONTFRAG.  IPV6_PATHMTU is actually fully functional
at this point.
Signed-off-by: NBrian Haley <brian.haley@hp.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

793b1473

l2tp_eth: fix memory allocation · 3a737028

由 Jiri Pirko 提交于 4月 23, 2010

Since .size is set properly in "struct pernet_operations l2tp_eth_net_ops",
allocating space for "struct l2tp_eth_net" by hand is not correct, even causes
memory leakage.
Signed-off-by: NJiri Pirko <jpirko@redhat.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

3a737028

l2tp: fix memory allocation · e773aaff

由 Jiri Pirko 提交于 4月 23, 2010

Since .size is set properly in "struct pernet_operations l2tp_net_ops",
allocating space for "struct l2tp_net" by hand is not correct, even causes
memory leakage.
Signed-off-by: NJiri Pirko <jpirko@redhat.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

e773aaff

23 4月, 2010 7 次提交

Y
bridge br_multicast: IPv6 MLD support. · 08b202b6
由 YOSHIFUJI Hideaki 提交于 4月 23, 2010
```
Signed-off-by: NYOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
```
08b202b6

bridge br_multicast: Make functions less ipv4 dependent. · 8ef2a9a5

由 YOSHIFUJI Hideaki 提交于 4月 18, 2010

Introduce struct br_ip{} to store ip address and protocol
and make functions more generic so that we can support
both IPv4 and IPv6 with less pain.
Signed-off-by: NYOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>

8ef2a9a5

Y
ipv6 mcast: Introduce include/net/mld.h for MLD definitions. · 6e7cb837
由 YOSHIFUJI Hideaki 提交于 4月 18, 2010
```
Signed-off-by: NYOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
```
6e7cb837

X25: Add if_x25.h and x25 to device identifiers · 5ebfbc06

由 Andrew Hendry 提交于 4月 22, 2010

V2 Feedback from John Hughes.
- Add header for userspace implementations such as xot/xoe to use
- Use explicit values for interface stability
- No changes to driver patches

V1
- Use identifiers instead of magic numbers for X25 layer 3 to device interface.
- Also fixed checkpatch notes on updated code.

[ Add new user header to include/linux/Kbuild  -DaveM ]
Signed-off-by: NAndrew Hendry <andrew.hendry@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

5ebfbc06

net: Socket filter ancilliary data access for skb->dev->type · 40eaf962

由 Paul LeoNerd Evans 提交于 4月 22, 2010

Add an SKF_AD_HATYPE field to the packet ancilliary data area, giving
access to skb->dev->type, as reported in the sll_hatype field.

When capturing packets on a PF_PACKET/SOCK_RAW socket bound to all
interfaces, there doesn't appear to be a way for the filter program to
actually find out the underlying hardware type the packet was captured
on. This patch adds such ability.

This patch also handles the case where skb->dev can be NULL, such as on
netlink sockets.
Signed-off-by: NPaul Evans <leonerd@leonerd.org.uk>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

40eaf962

tcp: fix outsegs stat for TSO segments · aa2ea058

由 Tom Herbert 提交于 4月 22, 2010

Account for TSO segments of an skb in TCP_MIB_OUTSEGS counter.  Without
doing this, the counter can be off by orders of magnitude from the
actual number of segments sent.
Signed-off-by: NTom Herbert <therbert@google.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

aa2ea058

IPv6: Generic TTL Security Mechanism (final version) · e802af9c

由 Stephen Hemminger 提交于 4月 22, 2010

This patch adds IPv6 support for RFC5082 Generalized TTL Security Mechanism.  

Not to users of mapped address; the IPV6 and IPV4 socket options are seperate.
The server does have to deal with both IPv4 and IPv6 socket options
and the client has to handle the different for each family.

On client:
	int ttl = 255;
	getaddrinfo(argv[1], argv[2], &hint, &result);

	for (rp = result; rp != NULL; rp = rp->ai_next) {
		s = socket(rp->ai_family, rp->ai_socktype, rp->ai_protocol);
		if (s < 0) continue;

		if (rp->ai_family == AF_INET) {
			setsockopt(s, IPPROTO_IP, IP_TTL, &ttl, sizeof(ttl));
		} else if (rp->ai_family == AF_INET6) {
			setsockopt(s, IPPROTO_IPV6,  IPV6_UNICAST_HOPS, 
					&ttl, sizeof(ttl)))
		}
			
		if (connect(s, rp->ai_addr, rp->ai_addrlen) == 0) {
		   ...

On server:
	int minttl = 255 - maxhops;
   
	getaddrinfo(NULL, port, &hints, &result);
	for (rp = result; rp != NULL; rp = rp->ai_next) {
		s = socket(rp->ai_family, rp->ai_socktype, rp->ai_protocol);
		if (s < 0) continue;

		if (rp->ai_family == AF_INET6)
			setsockopt(s, IPPROTO_IPV6,  IPV6_MINHOPCOUNT,
					&minttl, sizeof(minttl));
		setsockopt(s, IPPROTO_IP, IP_MINTTL, &minttl, sizeof(minttl));
			
		if (bind(s, rp->ai_addr, rp->ai_addrlen) == 0)
			break
...
Signed-off-by: NStephen Hemminger <shemminger@vyatta.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

e802af9c

22 4月, 2010 6 次提交

net: Orphan and de-dst skbs earlier in xmit path. · 9ccb8975

由 David S. Miller 提交于 4月 22, 2010

This way GSO packets don't get handled differently.

With help from Eric Dumazet.
Signed-off-by: NDavid S. Miller <davem@davemloft.net>
Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com>

9ccb8975

rps: immediate send IPI in process_backlog() · e326bed2

由 Eric Dumazet 提交于 4月 22, 2010

If some skb are queued to our backlog, we are delaying IPI sending at
the end of net_rx_action(), increasing latencies. This defeats the
queueing, since we want to quickly dispatch packets to the pool of
worker cpus, then eventually deeply process our packets.

It's better to send IPI before processing our packets in upper layers,
from process_backlog().

Change the _and_disable_irq suffix to _and_enable_irq(), since we enable
local irq in net_rps_action(), sorry for the confusion.
Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

e326bed2

ethernet: print protocol in host byte order · b002a861

由 Johannes Berg 提交于 4月 20, 2010

Eric's recent patch added __force, but this
place would seem to require actually doing
a byte order conversion so the printk is
consistent across architectures.

Cc: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: NJohannes Berg <johannes@sipsolutions.net>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

b002a861

net: Introduce skb_orphan_try() · 9a20e319

由 Eric Dumazet 提交于 4月 20, 2010

At this point, skb->destructor is not the original one (stored in
DEV_GSO_CB(skb)->destructor)
Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

9a20e319

fasync: RCU and fine grained locking · 989a2979

由 Eric Dumazet 提交于 4月 14, 2010

kill_fasync() uses a central rwlock, candidate for RCU conversion, to
avoid cache line ping pongs on SMP.

fasync_remove_entry() and fasync_add_entry() can disable IRQS on a short
section instead during whole list scan.

Use a spinlock per fasync_struct to synchronize kill_fasync_rcu() and
fasync_{remove|add}_entry(). This spinlock is IRQ safe, so sock_fasync()
doesnt need its own implementation and can use fasync_helper(), to
reduce code size and complexity.

We can remove __kill_fasync() direct use in net/socket.c, and rename it
to kill_fasync_rcu().
Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

989a2979

tcp: Mark v6 response packets as CHECKSUM_PARTIAL · e5700aff

由 David S. Miller 提交于 4月 21, 2010

Otherwise we only get the checksum right for data-less TCP responses.

Noticed by Herbert Xu.
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

e5700aff

21 4月, 2010 7 次提交

tcp: Fix ipv6 checksumming on response packets for real. · f71b70e1

由 David S. Miller 提交于 4月 21, 2010

Commit 6651ffc8
("ipv6: Fix tcp_v6_send_response transport header setting.")
fixed one half of why ipv6 tcp response checksums were
invalid, but it's not the whole story.

If we're going to use CHECKSUM_PARTIAL for these things (which we are
since commit 2e8e18ef "tcp: Set
CHECKSUM_UNNECESSARY in tcp_init_nondata_skb"), we can't be setting
buff->csum as we always have been here in tcp_v6_send_response.  We
need to leave it at zero.

Kill that line and checksums are good again.
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

f71b70e1

net: Fix an RCU warning in dev_pick_tx() · 05d17608

由 David Howells 提交于 4月 20, 2010

Fix the following RCU warning in dev_pick_tx():

===================================================
[ INFO: suspicious rcu_dereference_check() usage. ]
---------------------------------------------------
net/core/dev.c:1993 invoked rcu_dereference_check() without protection!

other info that might help us debug this:

rcu_scheduler_active = 1, debug_locks = 0
2 locks held by swapper/0:
 #0:  (&idev->mc_ifc_timer){+.-...}, at: [<ffffffff81039e65>] run_timer_softirq+0x17b/0x278
 #1:  (rcu_read_lock_bh){.+....}, at: [<ffffffff812ea3eb>] dev_queue_xmit+0x14e/0x4dc

stack backtrace:
Pid: 0, comm: swapper Not tainted 2.6.34-rc5-cachefs #4
Call Trace:
 <IRQ>  [<ffffffff810516c4>] lockdep_rcu_dereference+0xaa/0xb2
 [<ffffffff812ea4f6>] dev_queue_xmit+0x259/0x4dc
 [<ffffffff812ea3eb>] ? dev_queue_xmit+0x14e/0x4dc
 [<ffffffff81052324>] ? trace_hardirqs_on+0xd/0xf
 [<ffffffff81035362>] ? local_bh_enable_ip+0xbc/0xc1
 [<ffffffff812f0954>] neigh_resolve_output+0x24b/0x27c
 [<ffffffff8134f673>] ip6_output_finish+0x7c/0xb4
 [<ffffffff81350c34>] ip6_output2+0x256/0x261
 [<ffffffff81052324>] ? trace_hardirqs_on+0xd/0xf
 [<ffffffff813517fb>] ip6_output+0xbbc/0xbcb
 [<ffffffff8135bc5d>] ? fib6_force_start_gc+0x2b/0x2d
 [<ffffffff81368acb>] mld_sendpack+0x273/0x39d
 [<ffffffff81368858>] ? mld_sendpack+0x0/0x39d
 [<ffffffff81052099>] ? mark_held_locks+0x52/0x70
 [<ffffffff813692fc>] mld_ifc_timer_expire+0x24f/0x288
 [<ffffffff81039ed6>] run_timer_softirq+0x1ec/0x278
 [<ffffffff81039e65>] ? run_timer_softirq+0x17b/0x278
 [<ffffffff813690ad>] ? mld_ifc_timer_expire+0x0/0x288
 [<ffffffff81035531>] ? __do_softirq+0x69/0x140
 [<ffffffff8103556a>] __do_softirq+0xa2/0x140
 [<ffffffff81002e0c>] call_softirq+0x1c/0x28
 [<ffffffff81004b54>] do_softirq+0x38/0x80
 [<ffffffff81034f06>] irq_exit+0x45/0x47
 [<ffffffff810177c3>] smp_apic_timer_interrupt+0x88/0x96
 [<ffffffff810028d3>] apic_timer_interrupt+0x13/0x20
 <EOI>  [<ffffffff810488dd>] ? __atomic_notifier_call_chain+0x0/0x86
 [<ffffffff810096bf>] ? mwait_idle+0x6e/0x78
 [<ffffffff810096b6>] ? mwait_idle+0x65/0x78
 [<ffffffff810011cb>] cpu_idle+0x4d/0x83
 [<ffffffff81380b05>] rest_init+0xb9/0xc0
 [<ffffffff81380a4c>] ? rest_init+0x0/0xc0
 [<ffffffff8168dcf0>] start_kernel+0x392/0x39d
 [<ffffffff8168d2a3>] x86_64_start_reservations+0xb3/0xb7
 [<ffffffff8168d38b>] x86_64_start_kernel+0xe4/0xeb

An rcu_dereference() should be an rcu_dereference_bh().
Signed-off-by: NDavid Howells <dhowells@redhat.com>
Acked-by: NEric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

05d17608

ipv6: Fix tcp_v6_send_response transport header setting. · 6651ffc8

由 Herbert Xu 提交于 4月 21, 2010

My recent patch to remove the open-coded checksum sequence in
tcp_v6_send_response broke it as we did not set the transport
header pointer on the new packet.

Actually, there is code there trying to set the transport
header properly, but it sets it for the wrong skb ('skb'
instead of 'buff').

This bug was introduced by commit
a8fdf2b3 ("ipv6: Fix
tcp_v6_send_response(): it didn't set skb transport header")
Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

6651ffc8

net: Remove two unnecessary exports (skbuff). · ccb7c773

由 Rami Rosen 提交于 4月 20, 2010

There is no need to export skb_under_panic() and skb_over_panic() in
skbuff.c, since these methods are used only in skbuff.c ; this patch
removes these two exports. It also marks these functions as 'static'
and removeS the extern declarations of them from
include/linux/skbuff.h
Signed-off-by: NRami Rosen <ramirose@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

ccb7c773

net: Fix various endianness glitches · 0eae88f3

由 Eric Dumazet 提交于 4月 20, 2010

Sparse can help us find endianness bugs, but we need to make some
cleanups to be able to more easily spot real bugs.
Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

0eae88f3

bridge: add a missing ntohs() · 8eabf95c

由 Eric Dumazet 提交于 4月 20, 2010

grec_nsrcs is in network order, we should convert to host horder in
br_multicast_igmp3_report()
Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com>
Acked-by: NHerbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

8eabf95c

net: sk_sleep() helper · aa395145

由 Eric Dumazet 提交于 4月 20, 2010

Define a new function to return the waitqueue of a "struct sock".

static inline wait_queue_head_t *sk_sleep(struct sock *sk)
{
	return sk->sk_sleep;
}

Change all read occurrences of sk_sleep by a call to this function.

Needed for a future RCU conversion. sk_sleep wont be a field directly
available.
Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

aa395145

20 4月, 2010 13 次提交

mac80211: Fix ieee80211_sta_conn_mon_timer with hw connection monitoring · 7bdfcaaf

由 Juuso Oikarinen 提交于 4月 20, 2010

When IEEE80211_HW_CONNECTION_MONITOR is configured by the driver, starting
of ieee80211_sta_conn_mon_timer should be prevented, as it is then not needed.

This is currently partially the case. As it seems, when a probe-response is
received from the AP the timer is still restarted, thus restarting the host
based connection keep-alive mechanism. These probe-responses happen at least
when scanning while associated.

Fix this by preventing starting of the ieee80211_sta_conn_mon_timer in the
ieee80211_rx_mgmt_probe_resp function.
Signed-off-by: NJuuso Oikarinen <juuso.oikarinen@nokia.com>
Signed-off-by: NJohn W. Linville <linville@tuxdriver.com>

7bdfcaaf

mac80211: sample survey implementation for mac80211 & hwsim · 1289723e

由 Holger Schurig 提交于 4月 19, 2010

This adds the survey function to both mac80211 itself and to mac80211_hwsim.
For the latter driver, we simply invent some noise level.A real driver which
cannot determine the real channel noise MUST NOT report any noise, especially
not a magically conjured one :-)
Signed-off-by: NHolger Schurig <holgerschurig@gmail.com>
Signed-off-by: NJohn W. Linville <linville@tuxdriver.com>

1289723e

ath9k: Group Key fix for VAPs · 03ceedea

由 Daniel Yingqiang Ma 提交于 4月 13, 2010

When I set up multiple VAPs with ath9k, I encountered an issue that
the traffic may be lost after a while.

The detailed phenomenon is
1. After a while the clients connected to one of these VAPs will get
into a state that no broadcast/multicast packets can be transfered
successfully while the unicast packets can be transfered normally.
2. Minutes latter the unitcast packets transfer will fail as well,
because the ARP entry is expired and it can't be freshed due to the
broadcast trouble.

It's caused by the group key overwritten and someone discussed this
issue in ath9k-devel maillist before, but haven't work out a fix yet.

I referred the method in madwifi, and made a patch for ath9k.
The method is to set the high bit of the sender(AP)'s address, and
associated that mac and the group key. It requires the hardware
supports multicast frame key search. It seems true for AR9160.

Not sure whether it's the correct way to fix this issue. But it seems
to work in my test. The patch is attached, feel free to revise it.
Signed-off-by: NDaniel Yingqiang ma <yma.cool@gmail.com>
Signed-off-by: NJohn W. Linville <linville@tuxdriver.com>

03ceedea

net: emphasize rtnl lock required in call_netdevice_notifiers · ab930471

由 Jiri Pirko 提交于 4月 20, 2010

Since netdev_chain is guarded by rtnl_lock, ASSERT_RTNL should be
present here to make sure that all callers of call_netdevice_notifiers
does the locking properly.
Signed-off-by: NJiri Pirko <jpirko@redhat.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

ab930471

rps: consistent rxhash · b249dcb8

由 Eric Dumazet 提交于 4月 19, 2010

In case we compute a software skb->rxhash, we can generate a consistent
hash : Its value will be the same in both flow directions.

This helps some workloads, like conntracking, since the same state needs
to be accessed in both directions.

tbench + RFS + this patch gives better results than tbench with default
kernel configuration (no RPS, no RFS)

Also fixed some sparse warnings.
Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

b249dcb8

rps: cleanups · e36fa2f7

由 Eric Dumazet 提交于 4月 19, 2010

struct softnet_data holds many queues, so consistent use "sd" name
instead of "queue" is better.

Adds a rps_ipi_queued() helper to cleanup enqueue_to_backlog()

Adds a _and_irq_disable suffix to net_rps_action() name, as David
suggested.

incr_input_queue_head() becomes input_queue_head_incr()
Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

e36fa2f7

rps: static functions · f5acb907

由 Eric Dumazet 提交于 4月 19, 2010

store_rps_map() & store_rps_dev_flow_table_cnt() are static.
Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

f5acb907

mac80211: add missing newline · 67e0f392

由 Johannes Berg 提交于 4月 19, 2010

One HT debugging printk is missing a newline,
add it.
Signed-off-by: NJohannes Berg <johannes@sipsolutions.net>
Signed-off-by: NJohn W. Linville <linville@tuxdriver.com>

67e0f392

mac80211: Prevent running sta_cleanup timer unnecessarily · 3393a608

由 Juuso Oikarinen 提交于 4月 19, 2010

The sta_cleanup timer is used to periodically expire buffered frames from the
tx buf. The timer is executing periodically, regardless of the need for it.
This is wasting resources.

Fix this simply by not restarting the sta_cleanup timer if the tx buffer was
empty. Restart the timer when there is some more tx-traffic.

Cc: Janne Ylälehto <janne.ylalehto@nokia.com>
Signed-off-by: NJuuso Oikarinen <juuso.oikarinen@nokia.com>
Signed-off-by: NJohn W. Linville <linville@tuxdriver.com>

3393a608

mac80211: fix stopping RX BA session from timer · 2aab4c27

由 Johannes Berg 提交于 4月 19, 2010

Kalle reported that his system deadlocks since my
recent work in this area. The reason quickly became
apparent: we try to cancel_timer_sync() a timer
from within itself. Fix that by making the function
aware of the context it is called from.
Reported-by: NKalle Valo <kvalo@adurom.com>
Signed-off-by: NJohannes Berg <johannes@sipsolutions.net>
Tested-by: NKalle Valo <kvalo@adurom.com>
Signed-off-by: NJohn W. Linville <linville@tuxdriver.com>

2aab4c27

mac80211: pass HT changes to driver when off channel · fe6f212c

由 Reinette Chatre 提交于 4月 19, 2010

Since "mac80211: make off-channel work generic" drivers have not been
notified of configuration changes after association or authentication. This
caused more dependence on current state to ensure driver will be notified
when configuration changes occur. One such problem arises if off-channel is
in progress when HT information changes. Since HT is only enabled on the
"oper_channel" the driver will never be notified of this change. Usually
the driver is notified soon after of a BSS information change
(BSS_CHANGED_HT) ... but since the driver did not get a notification that
this is a HT channel the new BSS information does not make sense.

Fix this by also changing the off-channel information when HT is enabled
and thus cause driver to be notified correctly.

This fixes a problem in 4965 when associated with 5GHz 40MHz channel.
Without this patch the system can associate but is unable to transfer any
data, not even ping.

See http://bugzilla.intellinuxwireless.org/show_bug.cgi?id=2158Signed-off-by: NReinette Chatre <reinette.chatre@intel.com>
Signed-off-by: NJohn W. Linville <linville@tuxdriver.com>

fe6f212c

mac80211: remove bogus TX agg state assignment · b4bb5c3f

由 Johannes Berg 提交于 4月 19, 2010

When the addba timer expires but has no work to do,
it should not affect the state machine. If it does,
TX will not see the successfully established and we
can also crash trying to re-establish the session.

Cc: stable@kernel.org [2.6.32, 2.6.33]
Signed-off-by: NJohannes Berg <johannes@sipsolutions.net>
Signed-off-by: NJohn W. Linville <linville@tuxdriver.com>

b4bb5c3f

rps: shortcut net_rps_action() · 88751275

由 Eric Dumazet 提交于 4月 19, 2010

net_rps_action() is a bit expensive on NR_CPUS=64..4096 kernels, even if
RPS is not active.

Tom Herbert used two bitmasks to hold information needed to send IPI,
but a single LIFO list seems more appropriate.

Move all RPS logic into net_rps_action() to cleanup net_rx_action() code
(remove two ifdefs)

Move rps_remote_softirq_cpus into softnet_data to share its first cache
line, filling an existing hole.

In a future patch, we could call net_rps_action() from process_backlog()
to make sure we send IPI before handling this cpu backlog.
Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

88751275

18 4月, 2010 2 次提交

net: Introduce skb_orphan_try() · fc6055a5

由 Eric Dumazet 提交于 4月 16, 2010

Transmitted skb might be attached to a socket and a destructor, for
memory accounting purposes.

Traditionally, this destructor is called at tx completion time, when skb
is freed.

When tx completion is performed by another cpu than the sender, this
forces some cache lines to change ownership. XPS was an attempt to give
tx completion to initial cpu.

David idea is to call destructor right before giving skb to device (call
to ndo_start_xmit()). Because device queues are usually small, orphaning
skb before tx completion is not a big deal. Some drivers already do
this, we could do it in upper level.

There is one known exception to this early orphaning, called tx
timestamping. It needs to keep a reference to socket until device can
give a hardware or software timestamp.

This patch adds a skb_orphan_try() helper, to centralize all exceptions
to early orphaning in one spot, and use it in dev_hard_start_xmit().

"tbench 16" results on a Nehalem machine (2 X5570  @ 2.93GHz)
before: Throughput 4428.9 MB/sec 16 procs
after: Throughput 4448.14 MB/sec 16 procs

UDP should get even better results, its destructor being more complex,
since SOCK_USE_WRITE_QUEUE is not set (four atomic ops instead of one)
Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

fc6055a5

net: remove time limit in process_backlog() · 9958da05

由 Eric Dumazet 提交于 4月 17, 2010

- There is no point to enforce a time limit in process_backlog(), since
other napi instances dont follow same rule. We can exit after only one
packet processed...
The normal quota of 64 packets per napi instance should be the norm, and
net_rx_action() already has its own time limit.
Note : /proc/net/core/dev_weight can be used to tune this 64 default
value.

- Use DEFINE_PER_CPU_ALIGNED for softnet_data definition.
Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

9958da05