Commits · 0c78a92fbd655ab990e2799f645707f05f548e2f · openanolis / cloud-kernel

12 Jun, 2010 2 commits

Authored by Eric Dumazet on Jun 09, 2010

econet lacks proper locking. It holds econet_lock only when inserting or
deleting an entry in econet_sklist, not during lookups.

- convert econet_lock from rwlock to spinlock

- use econet_lock in ec_listening_socket() lookup

- use appropriate sock_hold() / sock_put() to avoid corruptions.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

0c78a92f
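The pattern described in this commit (take the list lock during lookups as well, and pin the result with a reference before the lock is dropped) can be illustrated in userspace. This is a minimal sketch and not the econet code: `struct entry`, `entry_insert()`, `entry_lookup()` and `entry_put()` are hypothetical names, and a C11 `atomic_flag` spinlock stands in for the kernel spinlock.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for a socket kept on a lookup list. */
struct entry {
    int port;
    atomic_int refcnt;
    struct entry *next;
};

static atomic_flag list_lock = ATOMIC_FLAG_INIT;
static struct entry *list_head;

static void list_lock_acquire(void)
{
    /* Spin until the flag was previously clear. */
    while (atomic_flag_test_and_set_explicit(&list_lock, memory_order_acquire))
        ;
}

static void list_lock_release(void)
{
    atomic_flag_clear_explicit(&list_lock, memory_order_release);
}

/* Insert under the lock; the list itself owns one reference. */
static void entry_insert(struct entry *e)
{
    atomic_store(&e->refcnt, 1);
    list_lock_acquire();
    e->next = list_head;
    list_head = e;
    list_lock_release();
}

/* Lookup takes the same lock and grabs a reference before releasing it,
 * so the entry cannot be freed while the caller is still using it. */
static struct entry *entry_lookup(int port)
{
    struct entry *e;

    list_lock_acquire();
    for (e = list_head; e != NULL; e = e->next) {
        if (e->port == port) {
            atomic_fetch_add(&e->refcnt, 1);
            break;
        }
    }
    list_lock_release();
    return e; /* NULL when nothing matched */
}

/* Drop a reference; returns 1 when the last reference is gone. */
static int entry_put(struct entry *e)
{
    return atomic_fetch_sub(&e->refcnt, 1) == 1;
}
```

A caller pairs every successful `entry_lookup()` with one `entry_put()`, mirroring the sock_hold() / sock_put() pairing named in the commit message.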

pkt_sched: gen_kill_estimator() rcu fixes · c7de2cf0

Authored by Eric Dumazet on Jun 09, 2010

The gen_kill_estimator() API is incomplete or not well documented, since
the caller should make sure an RCU grace period is respected before
freeing stats_lock.

This was partially addressed in commit 5d944c64
(gen_estimator: deadlock fix), but the same problem exists for all
gen_kill_estimator() users if the lock they use is not already RCU
protected.

A code review shows xt_RATEEST.c, act_api.c and act_police.c have this
problem. The others are OK because they use the qdisc lock, which is
already RCU protected.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

c7de2cf0

11 Jun, 2010 5 commits

net-next: remove useless union keyword · d8d1f30b

Authored by Changli Gao on Jun 10, 2010

Remove the useless union keyword in rtable, rt6_info and dn_route.

Since each union has only one member, the union keyword isn't useful.
Signed-off-by: Changli Gao <xiaosuo@gmail.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

d8d1f30b

pktgen: Fix accuracy of inter-packet delay. · 07a0f0f0

Authored by Daniel Turull on Jun 10, 2010

This patch corrects a bug in the pktgen delay.
It makes sure the inter-packet interval is accurate.
Signed-off-by: Daniel Turull <daniel.turull@gmail.com>
Signed-off-by: Robert Olsson <robert.olsson@its.uu.se>
Signed-off-by: David S. Miller <davem@davemloft.net>

07a0f0f0

pkt_sched: gen_estimator: add a new lock · ae638c47

Authored by Eric Dumazet on Jun 08, 2010

gen_kill_estimator() / gen_new_estimator() are not always called with
RTNL held.

net/netfilter/xt_RATEEST.c is one user of these APIs that does not hold
RTNL, so random corruptions can occur between "tc" and "iptables".

Add a new fine-grained lock instead of trying to use RTNL in netfilter.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ae638c47

ip: ip_ra_control() rcu fix · 592fcb9d

Authored by Eric Dumazet on Jun 09, 2010

commit 66018506 (ip: Router Alert RCU conversion) introduced RCU
lookups to ip_call_ra_chain(). It missed a proper deinit phase:
when ip_ra_control() deletes an ip_ra_chain, it should make sure
ip_call_ra_chain() users cannot start to use the socket during the RCU
grace period. It should also delay the sock_put() until after the grace
period, or we risk premature socket freeing and corruptions, as
raw sockets are not RCU protected yet.

This delay avoids using expensive atomic_inc_not_zero() in
ip_call_ra_chain().
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

592fcb9d

net: deliver skbs on inactive slaves to exact matches · 597a264b

Authored by John Fastabend on Jun 03, 2010

Currently, the accelerated receive path for VLANs will
drop packets if the real device is an inactive slave and
the packet is not one of the special pkts tested for in
skb_bond_should_drop().  This behavior is different from
the non-accelerated path and for pkts over a bonded vlan.

For example,

vlanx -> bond0 -> ethx

will be dropped in the vlan path and not delivered to any
packet handlers at all.  However,

bond0 -> vlanx -> ethx

and

bond0 -> ethx

will be delivered to handlers that match the exact dev,
because the VLAN path checks the real_dev which is not a
slave and netif_recv_skb() doesn't drop frames but only
delivers them to exact matches.

This patch adds a sk_buff flag which is used for tagging
skbs that would previously have been dropped and allows the
skb to continue to skb_netif_recv().  Here we add
logic to check for the deliver_no_wcard flag and, if it
is set, only deliver to handlers that match exactly.  This
makes both paths above consistent and gives pkt handlers
a way to identify skbs that come from inactive slaves.
Without this patch, in some configurations skbs will be
delivered to handlers with exact matches and in others
be dropped outright in the vlan path.

I have tested the following 4 configurations in failover modes
and load balancing modes.

# bond0 -> ethx

# vlanx -> bond0 -> ethx

# bond0 -> vlanx -> ethx

# bond0 -> ethx
            |
  vlanx -> --
Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

597a264b

10 Jun, 2010 6 commits

ipv6: fix ICMP6_MIB_OUTERRORS · 00d9d6a1

Authored by Eric Dumazet on Jun 07, 2010

In commit 1f8438a8 (icmp: Account for ICMP out errors), I made a typo
on the IPv6 side, using ICMP6_MIB_OUTMSGS instead of ICMP6_MIB_OUTERRORS.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

00d9d6a1

ipv6: mcast: RCU conversions · 96b52e61

Authored by Eric Dumazet on Jun 07, 2010

- ipv6_sock_mc_join() : doesn't touch dev refcount

- ipv6_sock_mc_drop() : doesn't touch dev/idev refcounts

- ip6_mc_find_dev() becomes ip6_mc_find_dev_rcu() (called from rcu),
                    and doesn't touch dev/idev refcounts

- ipv6_sock_mc_close() : doesn't touch dev/idev refcounts

- ip6_mc_source() uses ip6_mc_find_dev_rcu()

- ip6_mc_msfilter() uses ip6_mc_find_dev_rcu()

- ip6_mc_msfget() uses ip6_mc_find_dev_rcu()

- ipv6_dev_mc_dec(), ipv6_chk_mcast_addr(),
  igmp6_event_query(), igmp6_event_report(),
  mld_sendpack(), igmp6_send() don't touch idev refcount
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

96b52e61

icmp: RCU conversion in icmp_address_reply() · cfa087f6

Authored by Eric Dumazet on Jun 07, 2010

- rcu_read_lock() already held by caller
- use __in_dev_get_rcu() instead of in_dev_get() / in_dev_put()
- remove goto out;
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

cfa087f6

caif: fix a couple range checks · aea34e7a

Authored by Dan Carpenter on Jun 07, 2010

The extra ! character means that these conditions are always false.
Signed-off-by: Dan Carpenter <error27@gmail.com>
Acked-by: Sjur Braendeland <sjur.brandeland@stericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

aea34e7a
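Why a stray `!` makes a range check always false can be shown with a small sketch. The caif code itself is not reproduced on this page, so `MAX_ID` and both function names are hypothetical; only the operator-precedence behavior is the point. Unary `!` binds tighter than `>`, so `!id > MAX_ID` compares 0 or 1 against the bound.

```c
#include <stdbool.h>

#define MAX_ID 7 /* hypothetical upper bound, any value >= 1 behaves the same */

/* Buggy form: "!id > MAX_ID" parses as "(!id) > MAX_ID". The parentheses
 * are written out here only to make the parse explicit. (!id) is 0 or 1,
 * so the comparison can never be true. */
static bool out_of_range_buggy(int id)
{
    return (!id) > MAX_ID;
}

/* Intended form: compare the value itself against the bound. */
static bool out_of_range_fixed(int id)
{
    return id > MAX_ID;
}
```

With these definitions, `out_of_range_buggy(100)` is false while `out_of_range_fixed(100)` is true, which is the class of defect the commit message describes.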

phonet: use call_rcu for phonet device free · 88e7594a

Authored by Jiri Pirko on Jun 07, 2010

Use call_rcu rather than synchronize_rcu.
Signed-off-by: Jiri Pirko <jpirko@redhat.com>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

88e7594a

net: Print num_rx_queues imbalance warning only when there are allocated queues · 08c801f8

Authored by Tim Gardner on Jun 08, 2010

BugLink: http://bugs.launchpad.net/bugs/591416

There are a number of network drivers (bridge, bonding, etc) that are not yet
receive multi-queue enabled and use alloc_netdev(), so don't print a
num_rx_queues imbalance warning in that case.

Also, only print the warning once for those drivers that _are_ multi-queue
enabled.
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>

08c801f8

09 Jun, 2010 1 commit

mac80211: fix deauth before assoc · b054b747

Authored by Johannes Berg on Jun 07, 2010

When we receive a deauthentication frame before
having successfully associated, we neither print
a message nor abort association. The former makes
it hard to debug, while the latter later causes
a warning in cfg80211 when, as will typically be
the case, association timed out.

This warning was reported by many, e.g. in
https://bugzilla.kernel.org/show_bug.cgi?id=15981,
but I couldn't initially pinpoint it. I verified
the fix by hacking hostapd to send a deauth frame
instead of an association response.

Cc: stable@kernel.org
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Tested-by: Miles Lane <miles.lane@gmail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>

b054b747

08 Jun, 2010 8 commits

mac80211: Add netif state checking to ieee80211_ifa_changed · 90b72609

Authored by Juuso Oikarinen on Jun 07, 2010

There's a window for ieee80211_ifa_changed() to get called whilst the
managed mode mutex has not been initialized when opening and stopping the
interface. Currently this causes a kernel BUG like the following:

[  132.460013] kernel BUG at /home/wifi/iwlwifi-2.6/net/mac80211/main.c:380!
[  132.460013] invalid opcode: 0000 [#1] SMP

The mutex is initialized during open(), hence once netif_running() is true,
the mutex should be valid. Fix by adding a netif_running() check to the
function.
Reported-by: Reinette Chatre <reinette.chatre@intel.com>
Signed-off-by: Juuso Oikarinen <juuso.oikarinen@nokia.com>
Tested-by: Reinette Chatre <reinette.chatre@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>

90b72609

anycast: Some RCU conversions · bb69ae04

Authored by Eric Dumazet on Jun 07, 2010

- dev_get_by_flags() changed to dev_get_by_flags_rcu()

- ipv6_sock_ac_join() doesn't touch dev & idev refcounts
- ipv6_sock_ac_drop() doesn't touch dev & idev refcounts
- ipv6_sock_ac_close() doesn't touch dev & idev refcounts
- ipv6_dev_ac_dec() doesn't touch idev refcount
- ipv6_chk_acast_addr() doesn't touch idev refcount
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
CC: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

bb69ae04

net: avoid two atomic ops in ip_rcv_options() · 6e8b11b4

Authored by Eric Dumazet on Jun 07, 2010

in_dev_get() -> __in_dev_get_rcu() in an RCU-protected function.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

6e8b11b4

ipv4: avoid two atomic ops in ip_rt_redirect() · ed7865a4

Authored by Eric Dumazet on Jun 07, 2010

in_dev_get() -> __in_dev_get_rcu() in an RCU-protected function.

[ Fix build with CONFIG_IP_ROUTE_VERBOSE disabled. -DaveM ]
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ed7865a4

igmp: avoid two atomic ops in igmp_rcv() · 9a57a9d2

Authored by Eric Dumazet on Jun 07, 2010

in_dev_get() -> __in_dev_get_rcu() in an RCU-protected function.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

9a57a9d2

ip: Router Alert RCU conversion · 66018506

Authored by Eric Dumazet on Jun 07, 2010

Straightforward conversion to RCU.

One rwlock becomes a spinlock, and is static.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

66018506

mac80211: fix lock leak w/ ARP filtering and w/o CONFIG_INET · 11b7c609

Authored by John W. Linville on Jun 07, 2010

"mac80211: make ARP filtering depend on CONFIG_INET" introduced this
potential locking leak.
Reported-by: Juuso Oikarinen <juuso.oikarinen@nokia.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>

11b7c609

mac80211: fix function pointer check · 35dd0509

Authored by Holger Schurig on Jun 07, 2010

This makes "iw wlan0 dump survey" work again with
mac80211-based drivers that support it, e.g. ath5k.
Signed-off-by: Holger Schurig <holgerschurig@gmail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>

35dd0509

07 Jun, 2010 8 commits

ipmr: dont corrupt lists · 035320d5

Authored by Eric Dumazet on Jun 06, 2010

ipmr_rules_exit() and ip6mr_rules_exit() free a list of items, but
forget to properly remove these items from list. List head is not
changed and still points to freed memory.

This can trigger a fault later when icmpv6_sk_exit() is called.

Fix is to either reinit list, or use list_del() to properly remove items
from list before freeing them.

bugzilla report : https://bugzilla.kernel.org/show_bug.cgi?id=16120

Introduced by commit d1db275d (ipv6: ip6mr: support multiple
tables) and commit f0ad0860 (ipv4: ipmr: support multiple tables)
Reported-by: Alex Zhavnerchik <alex.vizor@gmail.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
CC: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

035320d5
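The failure mode in this commit (freeing list items without unlinking them, so the head keeps pointing at freed memory) and the fix can be sketched with a small circular doubly linked list in userspace. This is an illustrative sketch only: `struct node` and the helpers below are simplified stand-ins written for this example, not the kernel's list API or the ipmr code.

```c
#include <stdlib.h>

/* Minimal circular doubly linked list, loosely modelled on kernel lists. */
struct node {
    struct node *prev, *next;
};

static void list_init(struct node *head)
{
    head->prev = head->next = head;
}

static void list_add(struct node *n, struct node *head)
{
    n->next = head->next;
    n->prev = head;
    head->next->prev = n;
    head->next = n;
}

static void list_del(struct node *n)
{
    n->prev->next = n->next;
    n->next->prev = n->prev;
}

static int list_empty(const struct node *head)
{
    return head->next == head;
}

/* Free every item, unlinking each one first so that the head never
 * refers to freed memory once the function returns. */
static void list_free_all(struct node *head)
{
    while (!list_empty(head)) {
        struct node *n = head->next;

        list_del(n);
        free(n);
    }
}
```

After `list_free_all()` the head points back at itself, so later code that walks or tears down the list sees an empty list instead of dangling pointers.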

net: Remove unnecessary net action assertion · 271c1dfa

Authored by jamal on Jun 04, 2010

The extra assertion, which allows packet munging only when there are
no other ptypes listening and which may have worked around an old bug,
is unnecessary. It is sufficient to check if the skb is cloned before
trampling on it. Thanks to Herbert Xu for being persistent and patient
in getting this across.
[Note that cloning checks and assertions are the general rule used
by tc actions (documentation/networking/tc-actions-env-rules.txt)].
Signed-off-by: Jamal Hadi Salim <hadi@cyberus.ca>
Signed-off-by: David S. Miller <davem@davemloft.net>

271c1dfa

net sched: make pedit check for clones instead · 9dacaf17

Authored by jamal on Jun 04, 2010

Now that the core path doesn't set OK to munge, we detect
writable skbs by looking to see if they are cloned.
Signed-off-by: Jamal Hadi Salim <hadi@cyberus.ca>
Signed-off-by: David S. Miller <davem@davemloft.net>

9dacaf17

htb: remove two unnecessary assignments · f2a03367

Authored by Changli Gao on Jun 04, 2010

Remove two unnecessary assignments.

We don't need to assign NULL when initializing structure objects.
Signed-off-by: Changli Gao <xiaosuo@gmail.com>
----
 net/sched/sch_htb.c |    2 --
 1 file changed, 2 deletions(-)
Signed-off-by: David S. Miller <davem@davemloft.net>

f2a03367

raw: avoid two atomics in xmit · 1789a640

Authored by Eric Dumazet on Jun 03, 2010

Avoid two atomic ops per raw_send_hdrinc() call

Avoid two atomic ops per raw6_send_hdrinc() call
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

1789a640

net-caif: Added missing lock validator constants · fe33147a

Authored by Alex Lorca on Jun 07, 2010

CAIF is using "xxx-AF_MAX" strings for the lock validator. It should use
its own strings.
Signed-off-by: Alex Lorca <alex.lorca@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

fe33147a

tcp: Fix slowness in read /proc/net/tcp · a8b690f9

Authored by Tom Herbert on Jun 07, 2010

This patch addresses a serious performance issue in reading the
TCP sockets table (/proc/net/tcp).

Reading the full table is done by a number of sequential read
operations.  At each read operation, a seek is done to find the
last socket that was previously read.  This seek operation requires
that the sockets in the table be counted up to the current
file position, and counting each of these requires taking a lock for
each non-empty bucket.  The whole algorithm is O(n^2).

The fix is to cache the last bucket value, offset within the bucket,
and the file position returned by the last read operation.  On the
next sequential read, the bucket and offset are used to find the
last read socket immediately without needing to scan the previous
buckets of the table.  This algorithm to read the whole table is O(n).

The improvement offered by this patch is easily shown by
cat'ing /proc/net/tcp on a machine with a lot of connections.  With
about 182K connections in the table, I see the following:

- Without patch
time cat /proc/net/tcp > /dev/null

real	1m56.729s
user	0m0.214s
sys	1m56.344s

- With patch
time cat /proc/net/tcp > /dev/null

real	0m0.894s
user	0m0.290s
sys	0m0.594s
Signed-off-by: Tom Herbert <therbert@google.com>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

a8b690f9
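The caching idea in this commit message (remember the bucket, the offset within it, and the file position of the last read, then resume from there) can be sketched against a toy bucketed table. This is an assumption-laden illustration, not the /proc/net/tcp seq_file code: `struct table`, `struct cursor`, `table_read()` and the `steps` counter are all hypothetical, and locking is omitted.

```c
#include <stddef.h>

#define NBUCKETS   4
#define BUCKET_CAP 8

/* Toy hash table: each bucket holds count[b] valid items. */
struct table {
    int items[NBUCKETS][BUCKET_CAP];
    size_t count[NBUCKETS];
};

/* Cached iteration state: where the entry at file position 'pos' lives. */
struct cursor {
    size_t bucket;
    size_t offset;
    size_t pos;
    unsigned long steps; /* work counter, only for comparing strategies */
};

/* Skip exhausted or empty buckets; returns 0 once the table is exhausted. */
static int cursor_settle(const struct table *t, struct cursor *c)
{
    while (c->bucket < NBUCKETS && c->offset >= t->count[c->bucket]) {
        c->bucket++;
        c->offset = 0;
        c->steps++;
    }
    return c->bucket < NBUCKETS;
}

/* Read the entry at file position 'pos' into *out; returns 1 on success.
 * If the cursor already sits at 'pos' (a sequential read), no rescan is
 * needed. Otherwise the table is walked from the start, which is the
 * per-read cost that made the uncached algorithm quadratic overall. */
static int table_read(const struct table *t, size_t pos,
                      struct cursor *c, int *out)
{
    if (c->pos != pos) {
        c->bucket = c->offset = c->pos = 0;
        while (c->pos < pos) {
            if (!cursor_settle(t, c))
                return 0;
            c->offset++;
            c->pos++;
            c->steps++;
        }
    }
    if (!cursor_settle(t, c))
        return 0;
    *out = t->items[c->bucket][c->offset];
    c->offset++;
    c->pos++;
    c->steps++;
    return 1;
}
```

Reading positions 0, 1, 2, ... with one zero-initialized cursor touches each entry once. Invalidating the cursor before every read reproduces the rescan-from-start behavior, and the `steps` counter grows much faster.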

ip6mr: fix a typo in ip6mr_for_each_table() · 8ffb335e

Authored by Eric Dumazet on Jun 06, 2010

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

8ffb335e

05 Jun, 2010 10 commits

ipv6: avoid high order allocations · 72e09ad1

Authored by Eric Dumazet on Jun 05, 2010

With mtu=9000, mld_newpack() uses order-2 GFP_ATOMIC allocations, which
are very unreliable on machines where PAGE_SIZE=4K.

Limit allocated skbs to at most one page (order-0 allocations).
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

72e09ad1
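The "order-2" figure follows from simple arithmetic: a buffer of about 9000 bytes does not fit in one or two 4K pages, and page allocations come in power-of-two multiples, so four contiguous pages are needed. The helper below is a hypothetical userspace sketch of that calculation, similar in spirit to the kernel's get_order() but not its implementation, and it ignores skb overhead.

```c
#include <stddef.h>

/* Smallest order such that (page_size << order) >= size.
 * Order 0 is one page, order 1 is two pages, order 2 is four pages. */
static unsigned int alloc_order(size_t size, size_t page_size)
{
    unsigned int order = 0;

    while ((page_size << order) < size)
        order++;
    return order;
}
```

For example, `alloc_order(9000, 4096)` is 2, while any size up to 4096 bytes gives order 0, which is why capping the skb at one page avoids the higher-order request.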

skbuff: add check for non-linear to warn_if_lro and needs_linearize · b78462eb

Authored by Alexander Duyck on Jun 02, 2010

We can avoid an unnecessary cache miss by checking if the skb is non-linear
before accessing gso_size/gso_type in skb_warn_if_lro. The same can also be
done to avoid a cache miss on nr_frags if data_len is 0.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

b78462eb

syncookies: update mss tables · 5918e2fb

Authored by Florian Westphal on Jun 03, 2010

- ipv6 msstab: account for ipv6 header size
- ipv4 msstab: add mss for Jumbograms.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>

5918e2fb

syncookies: avoid unneeded tcp header flag double check · af9b4738

Authored by Florian Westphal on Jun 03, 2010

caller: if (!th->rst && !th->syn && th->ack)
callee: if (!th->ack)

Make the caller only check for !syn (common for 3whs), and move
the !rst / ack test to the callee.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>

af9b4738
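The refactoring described here keeps behavior identical while removing the duplicated `ack` test. A small sketch makes that checkable: the old split (caller tests `!rst && !syn && ack`, callee tests `!ack` again) and the new split (caller tests only `!syn`, callee tests `!ack || rst`) accept exactly the same segments. `struct tcp_flags` and all function names are hypothetical, and "accept" here just means the cookie check would proceed.

```c
/* Hypothetical subset of TCP header flags. */
struct tcp_flags {
    unsigned int syn : 1;
    unsigned int ack : 1;
    unsigned int rst : 1;
};

/* Old split: the callee re-tests ack although the caller already did. */
static int old_callee(struct tcp_flags th)
{
    if (!th.ack)
        return 0;
    return 1;
}

static int old_caller(struct tcp_flags th)
{
    if (!th.rst && !th.syn && th.ack)
        return old_callee(th);
    return 0;
}

/* New split: the caller only filters out SYNs, the callee owns the
 * rst / ack tests. */
static int new_callee(struct tcp_flags th)
{
    if (!th.ack || th.rst)
        return 0;
    return 1;
}

static int new_caller(struct tcp_flags th)
{
    if (!th.syn)
        return new_callee(th);
    return 0;
}
```

Enumerating all eight flag combinations shows `old_caller()` and `new_caller()` agree, with a plain ACK being the only accepted case.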

syncookies: make v4/v6 synflood warning behaviour the same · 2a1d4bd4

Authored by Florian Westphal on Jun 03, 2010

Both syn_flood_warning functions print a message, but the
ipv4 version only prints a warning if CONFIG_SYN_COOKIES=y.

Make the v4 one behave like the v6 one.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>

2a1d4bd4

tcp: use correct net ns in cookie_v4_check() · c4464921

Authored by Eric Dumazet on Jun 03, 2010

It's better to make a route lookup in the appropriate namespace.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

c4464921

rps: tcp: fix rps_sock_flow_table table updates · ca55158c

Authored by Eric Dumazet on Jun 03, 2010

I believe a moderate SYN flood attack can corrupt RFS flow table
(rps_sock_flow_table), making RPS/RFS much less effective.

Even in a normal situation, servers handling short-lived sessions suffer
from bad steering for the first data packet of a session, if another SYN
packet is received for another session.

We do the following in tcp_v4_rcv() :

	sock_rps_save_rxhash(sk, skb->rxhash);

We should _not_ do this if sk is a LISTEN socket, as almost every
packet received on a LISTEN socket has a different rxhash than
the previous one.
 -> RPS_NO_CPU markers are spread all over rps_sock_flow_table.

Also, it makes sense to protect sk->rxhash field changes with socket
lock (We currently can change it even if user thread owns the lock
and might use rxhash)

This patch moves sock_rps_save_rxhash() to a sock locked section,
and only for non LISTEN sockets.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ca55158c

syncookies: remove Kconfig text line about disabled-by-default · 57f1553e

Authored by Florian Westphal on Jun 03, 2010

syncookies have defaulted to on since
e994b7c9
(tcp: Don't make syn cookies initial setting depend on CONFIG_SYSCTL).
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>

57f1553e

net: check for refcount if pop a stacked dst_entry · 8764ab2c

Authored by Steffen Klassert on Jun 04, 2010

xfrm triggers a warning if dst_pop() drops a refcount
on a noref dst. This patch changes dst_pop() to
skb_dst_pop(). skb_dst_pop() drops the refcnt only
on a refcounted dst. Also, we don't clone the child
dst_entry, so it is not refcounted and we can use
skb_dst_set_noref() in xfrm_output_one().
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

8764ab2c

mac80211: process station blockack action frames from work · 8b9a4e6e

Authored by Johannes Berg on May 28, 2010

Processing an association response could take a bit
of time while we set up the hardware etc. During that
time, the AP might already send a blockack request.
If this happens very quickly on a fairly slow machine,
we can end up processing the blockack request before
the association processing has finished. Since the
blockack processing cannot sleep right now, we also
cannot make it wait in the driver.

As a result, sometimes on slow machines the iwlagn
driver gets totally confused, and no traffic can pass
when the aggregation setup was done before the assoc
setup completed.

I'm working on a proper fix for this, which involves
queuing all blockack category action frames from a
work struct, and also allowing the ampdu_action driver
callback to sleep, which will generally clean up the
code and make things easier.

However, this is a very involved and complex change.
To fix the problem at hand in a way that can also be
backported to stable, I've come up with this patch.
Here, I simply process all aggregation action frames
from the managed interface skb queue, which means
their processing will be serialized with processing
the association response, thereby fixing the problem.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Cc: stable@kernel.org
Signed-off-by: John W. Linville <linville@tuxdriver.com>

8b9a4e6e

openanolis / cloud-kernel synced successfully nearly 2 years ago
