Commits · 65ac6a5fa658b90f1be700c55e7cd72e4611015d · openeuler / raspberrypi-kernel

21 Oct, 2010 4 commits

vlan: Avoid hash table lookup to find group. · 65ac6a5f

Authored by Jesse Gross on Oct 20, 2010

A struct net_device always maps to zero or one vlan groups and we
always know the device when we are looking up a group.  We currently
do a hash table lookup on the device to find the group but it is
much simpler to just store a pointer.
Signed-off-by: Jesse Gross <jesse@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
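
As a rough illustration of the idea in this commit message, the userspace sketch below contrasts a hash lookup keyed on the device with a pointer stored in the device itself. Every name here (`demo_net_device`, `demo_vlan_group`, `group_by_hash`, `group_by_pointer`, `demo_register`) is hypothetical and does not reproduce the kernel's structures.

```c
#include <stddef.h>

#define DEMO_HASH_SIZE 32  /* illustrative bucket count, not the kernel's */

struct demo_vlan_group;

/* Hypothetical stand-in for struct net_device. */
struct demo_net_device {
    int ifindex;
    struct demo_vlan_group *vlgrp;   /* zero or one group per device */
};

/* Hypothetical stand-in for a vlan group. */
struct demo_vlan_group {
    struct demo_net_device *real_dev;
    struct demo_vlan_group *next;    /* hash chain link */
};

static struct demo_vlan_group *demo_hash[DEMO_HASH_SIZE];

/* Lookup by hashing the device, then walking the bucket chain. */
static struct demo_vlan_group *group_by_hash(const struct demo_net_device *dev)
{
    struct demo_vlan_group *g = demo_hash[(unsigned)dev->ifindex % DEMO_HASH_SIZE];

    for (; g != NULL; g = g->next)
        if (g->real_dev == dev)
            return g;
    return NULL;
}

/* Lookup by following the pointer the device already carries. */
static struct demo_vlan_group *group_by_pointer(const struct demo_net_device *dev)
{
    return dev->vlgrp;
}

/* Register a group in both places so the two lookups can be compared. */
static void demo_register(struct demo_net_device *dev, struct demo_vlan_group *grp)
{
    unsigned idx = (unsigned)dev->ifindex % DEMO_HASH_SIZE;

    grp->real_dev = dev;
    grp->next = demo_hash[idx];
    demo_hash[idx] = grp;
    dev->vlgrp = grp;
}
```

Both lookups return the same group; the pointer version simply skips the hashing and the chain walk, which is possible because the device is always known at lookup time.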


vlan: Enable software emulation for vlan acceleration. · 7b9c6090

Authored by Jesse Gross on Oct 20, 2010

Currently users of hardware vlan acceleration need to know whether
the device supports it before generating packets.  However, vlan
acceleration will soon be available in a more flexible manner so
knowing ahead of time becomes much more difficult.  This adds
a software fallback path for vlan packets on devices without the
necessary offloading support, similar to other types of hardware
acceleration.
Signed-off-by: Jesse Gross <jesse@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


vlan: Rename VLAN_GROUP_ARRAY_LEN to VLAN_N_VID. · b738127d

Authored by Jesse Gross on Oct 20, 2010

VLAN_GROUP_ARRAY_LEN is simply the number of possible vlan VIDs.
Since vlan groups will soon be more of an implementation detail
for vlan devices, rename the constant to be descriptive of its
actual purpose.
Signed-off-by: Jesse Gross <jesse@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


ebtables: Allow filtering of hardware accelerated vlan frames. · 13937911

Authored by Jesse Gross on Oct 20, 2010

An upcoming commit will allow packets with hardware vlan acceleration
information to be passed through more parts of the network stack, including
packets trunked through the bridge.  This adds support for matching and
filtering those packets through ebtables.
Signed-off-by: Jesse Gross <jesse@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


20 Oct, 2010 6 commits

net: avoid RCU for NOCACHE dst · 27b75c95

Authored by Eric Dumazet on Oct 15, 2010

There is no point using RCU for dst we allocate for a very short time
(used once).

Change dst_release() to take DST_NOCACHE into account, but also change
skb_dst_set_noref() to force a refcount increment for such dst.

This is a _huge_ gain, because we don't waste memory to store xx thousand
of dsts. Instead of queueing them to RCU, we can free them instantly.

CPU caches can stay hot, re-using same memory blocks to hold temporary
dsts.

Note : remove unneeded smp_mb__before_atomic_dec(); in dst_release(),
since atomic_dec_return() implies a full memory barrier.

Stress test, 160.000.000 udp frames sent, IP route cache disabled
(DDOS).

Before:

real    0m38.091s
user    0m13.189s
sys     7m53.018s

After:

real	0m29.946s
user	0m12.157s
sys	7m40.605s

For reference, if IP route cache was enabled :

real	0m32.030s
user	0m10.521s
sys	8m15.243s
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


net: allocate tx queues in register_netdevice · e6484930

Authored by Tom Herbert on Oct 18, 2010

This patch introduces netif_alloc_netdev_queues which is called from
register_netdevice instead of alloc_netdev_mq.  This makes TX queue
allocation symmetric with RX allocation.  Also, queue locks allocation
is done in netdev_init_one_queue.  Change set_real_num_tx_queues to
fail if requested number < 1 or greater than number of allocated
queues.
Signed-off-by: Tom Herbert <therbert@google.com>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
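
The bounds check described for set_real_num_tx_queues can be sketched in userspace as follows. This is only an assumption-laden illustration: `demo_dev` and `demo_set_real_num_tx_queues` are hypothetical names, and the choice of `-EINVAL` as the error code is an assumption, since the commit message only says the call fails.

```c
#include <errno.h>

/* Hypothetical stand-in holding only the two queue counters. */
struct demo_dev {
    unsigned int num_tx_queues;       /* queues allocated at registration */
    unsigned int real_num_tx_queues;  /* queues currently in use */
};

/*
 * Reject a request below 1 or above the allocated count (error code
 * assumed to be -EINVAL); otherwise record the new active count.
 */
static int demo_set_real_num_tx_queues(struct demo_dev *dev, unsigned int txq)
{
    if (txq < 1 || txq > dev->num_tx_queues)
        return -EINVAL;

    dev->real_num_tx_queues = txq;
    return 0;
}
```

A rejected request leaves the active queue count unchanged, so callers can treat a non-zero return as "nothing happened".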


net: cleanups in RX queue allocation · bd25fa7b

Authored by Tom Herbert on Oct 18, 2010

Clean up in RX queue allocation.  In netif_set_real_num_rx_queues
return error on attempt to set zero queues, or requested number is
greater than number of allocated queues.  In netif_alloc_rx_queues,
do BUG_ON if queue_count is zero.
Signed-off-by: Tom Herbert <therbert@google.com>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


net: fail alloc_netdev_mq if queue count < 1 · 55513fb4

Authored by Tom Herbert on Oct 18, 2010

In alloc_netdev_mq fail if requested queue_count < 1.
Signed-off-by: Tom Herbert <therbert@google.com>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


phonet: remove the unused variable pn · c5e90f56

Authored by Changli Gao on Oct 19, 2010

Signed-off-by: Changli Gao <xiaosuo@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


netpoll: Revert napi_poll fix for bonding driver · f13d493d

Authored by Neil Horman on Oct 19, 2010

In an earlier patch I modified napi_poll so that devices with IFF_MASTER polled
the per_cpu list instead of the device list for napi. I did this because the
bonding driver has no napi instances to poll, it instead expects to check the
slave devices napi instances, which napi_poll was unaware of. Looking at this
more closely however, I now see this isn't strictly needed. As the bond driver
poll_controller calls the slaves poll_controller via netpoll_poll_dev, which
recursively calls poll_napi on each slave, allowing those napi instances to get
serviced. The earlier patch isn't at all harmful, it's just not needed, so let's
revert it to make the code cleaner. Sorry for the noise.
Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Reviewed-by: WANG Cong <amwang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


19 Oct, 2010 2 commits

inet: RCU changes in inetdev_by_index() · 8723e1b4

Authored by Eric Dumazet on Oct 19, 2010

Convert inetdev_by_index() to not increment in_dev refcount.

Callers hold RCU or RTNL, and should not decrement in_dev refcount.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


net: avoid a dev refcount in ip_mc_find_dev() · 9e917dca

Authored by Eric Dumazet on Oct 19, 2010

We hold RTNL in ip_mc_find_dev(), no need to touch device refcount.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


18 Oct, 2010 10 commits

bonding: Fix napi poll for bonding driver · 990c3d6f

Authored by Neil Horman on Oct 13, 2010

Usually the netpoll path, when performing a napi poll, can get away with just
polling all the napi instances of the configured device. That's not the case for
the bonding driver however, as the napi instances which may wind up getting
flagged as needing polling after the poll_controller call don't belong to the
bonded device, but rather to the slave devices. Fix this by checking the device
in question for the IFF_MASTER flag; if set, we know we need to check the full
poll list for this cpu, rather than just the device's napi instance list.
Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


bonding: Fix bonding drivers improper modification of netpoll structure · c2355e1a

Authored by Neil Horman on Oct 13, 2010

The bonding driver currently modifies the netpoll structure in its xmit path
while sending frames from netpoll. This is racy, as other cpus can access the
netpoll structure in parallel. Since the bonding driver points np->dev to a
slave device, other cpus can inadvertently attempt to send data directly to
slave devices, leading to improper locking with the bonding master, lost frames,
and deadlocks. This patch fixes that up.

This patch also removes the real_dev pointer from the netpoll structure as that
data is really only used by bonding in the poll_controller, and we can emulate
its behavior by checking each slave for IS_UP.
Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


IPv4: route.c: Change checks against 0xffffffff to ipv4_is_lbcast() · 27a954bd

Authored by Andy Walls on Oct 17, 2010

Change a few checks against the hardcoded broadcast address,
0xffffffff, to ipv4_is_lbcast().  Remove some existing checks
using ipv4_is_lbcast() that are now obviously superfluous.
Signed-off-by: Andy Walls <awalls@md.metrocast.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
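
For readers unfamiliar with the helper, a userspace approximation of what a limited-broadcast check does might look like the sketch below. `demo_is_lbcast` is a hypothetical name standing in for the kernel's ipv4_is_lbcast(); the point is only that a named predicate replaces a comparison against the literal 0xffffffff.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Userspace sketch: true when addr_be (network byte order) is the
 * limited broadcast address 255.255.255.255.
 */
static bool demo_is_lbcast(uint32_t addr_be)
{
    return addr_be == htonl(INADDR_BROADCAST);
}
```

Because all 32 bits are set, the broadcast value is the same in either byte order, which is why the hardcoded comparison worked; the named helper makes the intent explicit.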


tipc: Simplify bearer shutdown logic · ccc901ee

Authored by Allan Stephens on Oct 14, 2010

Optimize processing in TIPC's bearer shutdown code, including:

1. Remove an unnecessary check to see if TIPC bearers can exist.
2. Don't release spinlocks before calling a media-specific disabling
routine, since the routine can't sleep.
3. Make bearer_disable() operate directly on a struct bearer, instead
of needlessly taking a name and then mapping that to the struct.
Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Reviewed-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


tipc: Kill tipc_get_mode() completely. · 724829b3

Authored by David S. Miller on Oct 18, 2010

It's completely unused and exporting a static symbol
makes no sense and breaks the build.
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>


fib_hash: RCU conversion phase 2 · 19f57256

Authored by Eric Dumazet on Oct 14, 2010

Get rid of fib_hash_lock rwlock.

The fn_zone hash table resize is the noticeable part of this patch.

I added a seqlock per fn_zone, so that readers can restart their lookup
in the (very rare) case a writer expanded the hash table.

Add rcu heads in fib_alias and fib_node, use call_rcu() to defer their
freeing, and use appropriate _rcu list manipulations.

Stress test (160.000.000 udp frames sent, IP route cache disabled to
mimic DDOS attack, FIB_HASH)

Before:
real	0m41.191s
user	0m13.137s
sys	8m55.241s

After:
real	0m38.091s
user	0m13.189s
sys	7m53.018s
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
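
The "readers restart their lookup" behaviour can be illustrated with a sequence-counter sketch in portable C11. This is not the kernel's seqlock API: all names (`demo_zone`, `demo_read_begin`, `demo_read_retry`, `demo_resize`, `demo_read_divisor`) are hypothetical, default sequentially consistent atomics are used for simplicity, and the sketch assumes writers are already serialized by some outer lock.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical zone: a sequence counter guarding the table size. */
struct demo_zone {
    atomic_uint seq;      /* even: stable, odd: resize in progress */
    atomic_uint divisor;  /* current number of hash buckets */
};

/* Wait until no resize is in progress and return the counter seen. */
static unsigned int demo_read_begin(struct demo_zone *z)
{
    unsigned int s;

    do {
        s = atomic_load(&z->seq);
    } while (s & 1u);
    return s;
}

/* True when a resize started or finished since demo_read_begin(). */
static bool demo_read_retry(struct demo_zone *z, unsigned int start)
{
    return atomic_load(&z->seq) != start;
}

/* Writer side: bump to odd, change the table size, bump back to even. */
static void demo_resize(struct demo_zone *z, unsigned int new_divisor)
{
    atomic_fetch_add(&z->seq, 1u);
    atomic_store(&z->divisor, new_divisor);
    atomic_fetch_add(&z->seq, 1u);
}

/* Reader side: repeat the read until it was not disturbed by a resize. */
static unsigned int demo_read_divisor(struct demo_zone *z)
{
    unsigned int s, d;

    do {
        s = demo_read_begin(z);
        d = atomic_load(&z->divisor);
    } while (demo_read_retry(z, s));
    return d;
}
```

Readers never block the writer; they only pay for a retry in the rare case where a resize overlaps their lookup.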


fib_hash: RCU conversion phase 1 · 117a8cde

Authored by Eric Dumazet on Oct 14, 2010

First step for RCU conversion of fib_hash :

struct fn_zone are created and never deleted.

Very classic conversion, using rcu_assign_pointer(), rcu_dereference()
and rtnl_dereference() verbs.

__rcu markers on fz_next and fn_zone_list

They are created under RTNL, we don't need fib_hash_lock anymore in
fn_new_zone().
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


fib_hash: embed initial hash table in fn_zone · 9bef83ed

Authored by Eric Dumazet on Oct 14, 2010

While looking for false sharing problems, I noticed
sizeof(struct fn_zone) was small (28 bytes) and possibly sharing a cache
line with an often written kernel structure.

Most of the time, fn_zone uses its initial hash table of 16 slots.

We can avoid the false sharing problem by embedding this initial hash
table in fn_zone itself, so that sizeof(fn_zone) > L1_CACHE_BYTES

We did a similar optimization in commit a6501e08 (Reduce memory needs
and speedup lookups)

Add a fz_revorder field to speedup fn_hash() a bit.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
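
The size argument in this commit message can be made concrete with a small layout sketch. The struct names, field names, and the 64-byte cache line below are assumptions for illustration only; they are not the real struct fn_zone.

```c
#include <stddef.h>

#define DEMO_CACHE_LINE 64      /* assumed cache line size; varies by CPU */
#define DEMO_EMBEDDED_SLOTS 16  /* initial table size named in the commit */

struct demo_node;  /* opaque bucket entry */

/* Small header whose hash table lives in a separate allocation. */
struct demo_zone_small {
    struct demo_node **hash;
    unsigned int divisor;
    unsigned int entries;
};

/* Same header with the initial 16-slot table embedded in the struct. */
struct demo_zone_embedded {
    struct demo_node **hash;  /* points at initial_hash until the table grows */
    unsigned int divisor;
    unsigned int entries;
    struct demo_node *initial_hash[DEMO_EMBEDDED_SLOTS];
};

/* The embedded variant is larger than one (assumed) cache line. */
_Static_assert(sizeof(struct demo_zone_embedded) > DEMO_CACHE_LINE,
               "embedded zone should exceed one cache line");

/* Start a zone on its own embedded table. */
static void demo_zone_init(struct demo_zone_embedded *z)
{
    size_t i;

    for (i = 0; i < DEMO_EMBEDDED_SLOTS; i++)
        z->initial_hash[i] = NULL;
    z->hash = z->initial_hash;
    z->divisor = DEMO_EMBEDDED_SLOTS;
    z->entries = 0;
}
```

Embedding the table also saves one allocation and one pointer chase in the common case where the zone never grows past its initial size.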


tcp: use correct counters in CA_CWR state too · c60ce4e2

Authored by Ilpo Järvinen on Oct 14, 2010

As CWR is stronger than CA_Disorder state, we can miscount
SACK/Reno failure into other timeouts. Not a bad problem as
it can happen only due to ECN, FRTO detecting spurious RTO
or xmit error which are the only callers of tcp_enter_cwr.
And even then losses and RTO must still follow thereafter
to actually end up into the relevant code paths.

Compile tested.
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>


tcp: sack lost marking fixes · 1fdb9361

Authored by Ilpo Järvinen on Oct 14, 2010

When only fast rexmit should be done, tcp_mark_head_lost marks
L too far. Also, sacked_upto below 1 is perfectly valid number,
the packets == 0 then needs to be trapped elsewhere.
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>


17 Oct, 2010 6 commits

tipc: cleanup function namespace · 31e3c3f6

Authored by stephen hemminger on Oct 13, 2010

Do some cleanups of TIPC based on make namespacecheck
  1. Don't export unused symbols
  2. Eliminate dead code
  3. Make functions and variables local
  4. Rename buf_acquire to tipc_buf_acquire since it is used in several files

Compile tested only.
This may break out-of-tree kernel modules that depend on TIPC routines.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Acked-by: Jon Maloy <jon.maloy@ericsson.com>
Acked-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


fib: avoid false sharing on fib_table_hash · 10da66f7

Authored by Eric Dumazet on Oct 13, 2010

While doing profile analysis, I found fib_hash_table was sometimes in a
cache line shared by a possibly often written kernel structure.

(CONFIG_IP_ROUTE_MULTIPATH || !CONFIG_IPV6_MULTIPLE_TABLES)

It's hard to detect because it is not easily reproducible.

Make sure we allocate a full cache line to keep this shared in all cpus
caches.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


fib_trie: use fls() instead of open coded loop · 874ffa8f

Authored by Eric Dumazet on Oct 13, 2010

fib_table_lookup() might use fls() to speedup an open coded loop.

Noticed while doing a profile analysis.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
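
As a rough illustration of the idea (not the actual fib_trie code), the sketch below compares an open-coded loop that finds the highest set bit with a builtin-based equivalent. The names `demo_fls_loop` and `demo_fls_builtin` are hypothetical, and `__builtin_clz` is a GCC/Clang builtin used here as a stand-in for the kernel's fls().

```c
#include <limits.h>

/* Open-coded loop: 1-based index of the highest set bit, 0 when x == 0. */
static int demo_fls_loop(unsigned int x)
{
    int pos = 0;

    while (x != 0) {
        pos++;
        x >>= 1;
    }
    return pos;
}

/* Same result computed from the count of leading zero bits. */
static int demo_fls_builtin(unsigned int x)
{
    if (x == 0)
        return 0;  /* __builtin_clz(0) is undefined, so handle it first */
    return (int)(sizeof(x) * CHAR_BIT) - __builtin_clz(x);
}
```

On most architectures the builtin compiles to a single instruction, whereas the loop runs once per bit position up to the highest set bit.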


fib: remove a useless synchronize_rcu() call · a0a4a85a

Authored by Eric Dumazet on Oct 13, 2010

fib_nl_delrule() calls synchronize_rcu() for no apparent reason,
while rtnl is held.

I suspect it was done to avoid an atomic_inc_not_zero() in
fib_rules_lookup(), which commit 7fa7cb71 added anyway.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


fib6: use FIB_LOOKUP_NOREF in fib6_rule_lookup() · 2c1c0004

Authored by Eric Dumazet on Oct 13, 2010

Avoid two atomic ops on found rule in fib6_rule_lookup()
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


net: allocate skbs on local node · 564824b0

Authored by Eric Dumazet on Oct 11, 2010

commit b30973f8 (node-aware skb allocation) spread a wrong habit of
allocating net drivers skbs on a given memory node : The one closest to
the NIC hardware. This is wrong because as soon as we try to scale
network stack, we need to use many cpus to handle traffic and hit
slub/slab management on cross-node allocations/frees when these cpus
have to alloc/free skbs bound to a central node.

skb allocated in RX path are ephemeral, they have a very short
lifetime : Extra cost to maintain NUMA affinity is too expensive. What
appeared as a nice idea four years ago is in fact a bad one.

In 2010, NIC hardwares are multiqueue, or we use RPS to spread the load,
and two 10Gb NIC might deliver more than 28 million packets per second,
needing all the available cpus.

Cost of cross-node handling in network and vm stacks outperforms the
small benefit hardware had when doing its DMA transfer in its 'local'
memory node at RX time. Even trying to differentiate the two allocations
done for one skb (the sk_buff on local node, the data part on NIC
hardware node) is not enough to bring good performance.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Acked-by: Tom Herbert <therbert@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


16 Oct, 2010 1 commit

radiotap: fix vendor namespace parsing · 9ebad4ab

Authored by Johannes Berg on Oct 14, 2010

There's a bug with radiotap vendor namespace
parsing if you don't register for the given
namespace extensions. Fix this by passing
only the unknown vendor namespaces and the
registered data to frontends, but not both.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>


14 Oct, 2010 5 commits

Phonet: 'connect' socket implementation for Pipe controller · b3d62553

Authored by Kumar Sanghvi on Oct 12, 2010

Based on suggestion by Rémi Denis-Courmont to implement 'connect'
for Pipe controller logic,  this patch implements 'connect' socket
call for the Pipe controller logic.
The patch does following:-
- Removes setsockopts for PNPIPE_CREATE and PNPIPE_DESTROY
- Adds setsockopt for setting the Pipe handle value
- Implements connect socket call
- Updates the Pipe controller logic

User-space should now follow below sequence with Pipe controller:-
- socket
- bind
- setsockopt for PNPIPE_PIPE_HANDLE
- connect
- setsockopt for PNPIPE_ENCAP_IP
- setsockopt for PNPIPE_ENABLE

GPRS/3G data has been tested working fine with this.
Signed-off-by: Kumar Sanghvi <kumar.sanghvi@stericsson.com>
Acked-by: Rémi Denis-Courmont <remi.denis-courmont@nokia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


tipc: clean out all instances of #if 0'd unused code · 7368ddf1

Authored by Paul Gortmaker on Oct 12, 2010

Remove all instances of legacy, or as yet to be implemented code
that is currently living within an #if 0 ... #endif block.
In the rare instance that some of it be needed in the future,
it can still be dragged out of history, but there is no need
for it to sit in mainline.
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


mac80211: fix SMPS request · e4b55957

Authored by Johannes Berg on Oct 13, 2010

It looks like I submitted a different patch
than I tested, because clearly the code in
mac80211 is missing actually propagating the
requested SMPS mode. Fix that!
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>


mac80211: add probe request filter flag · 7be5086d

Authored by Johannes Berg on Oct 13, 2010

Using the frame registration notification, we
can see when probe requests are requested and
notify the low-level driver via filtering. The
flag is also set in AP and IBSS modes.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>


cfg80211: notify drivers about frame registrations · 271733cf

Authored by Johannes Berg on Oct 13, 2010

Drivers may need to adjust their filters according
to frame registrations, so notify them about them.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>


13 Oct, 2010 2 commits

wireless: Print wiphy name in sysfs. · cfd8e12f

Authored by Ben Greear on Oct 11, 2010

The index cannot be used to reliably reconstruct a phy
name, so explicitly add the phy name to sysfs so that scripts
can figure out the parent phy device for a particular
wireless interface.
Signed-off-by: Ben Greear <greearb@candelatech.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>


net: percpu net_device refcount · 29b4433d

Authored by Eric Dumazet on Oct 11, 2010

We tried very hard to remove all possible dev_hold()/dev_put() pairs in
network stack, using RCU conversions.

There is still an unavoidable device refcount change for every dst we
create/destroy, and this can slow down some workloads (routers or some
app servers, mmap af_packet)

We can switch to a percpu refcount implementation, now dynamic per_cpu
infrastructure is mature. On a 64 cpus machine, this consumes 256 bytes
per device.

On x86, dev_hold(dev) code :

before
        lock    incl 0x280(%ebx)
after:
        movl    0x260(%ebx),%eax
        incl    fs:(%eax)

Stress bench :

(Sending 160.000.000 UDP frames,
IP route cache disabled, dual E5540 @2.53GHz,
32bit kernel, FIB_TRIE)

Before:

real    1m1.662s
user    0m14.373s
sys     12m55.960s

After:

real    0m51.179s
user    0m15.329s
sys     10m15.942s
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
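
The per-CPU counting scheme can be sketched in plain C as one counter slot per CPU, summed only when the total is needed. This is a single-threaded illustration under stated assumptions: the names (`demo_pcpu_dev`, `demo_dev_hold`, `demo_dev_put`, `demo_dev_refcnt`) are hypothetical, and the CPU index is passed explicitly instead of using the kernel's per-CPU accessors.

```c
#define DEMO_NR_CPUS 64  /* matches the 64-CPU example in the commit message */

/* Hypothetical device with one reference counter slot per CPU. */
struct demo_pcpu_dev {
    int pcpu_refcnt[DEMO_NR_CPUS];
};

/* Take a reference: touch only the calling CPU's slot. */
static void demo_dev_hold(struct demo_pcpu_dev *dev, unsigned int cpu)
{
    dev->pcpu_refcnt[cpu % DEMO_NR_CPUS]++;
}

/* Drop a reference; this may run on a different CPU than the hold. */
static void demo_dev_put(struct demo_pcpu_dev *dev, unsigned int cpu)
{
    dev->pcpu_refcnt[cpu % DEMO_NR_CPUS]--;
}

/* The real count is the sum of all slots; a single slot may be negative. */
static int demo_dev_refcnt(const struct demo_pcpu_dev *dev)
{
    int sum = 0;
    unsigned int i;

    for (i = 0; i < DEMO_NR_CPUS; i++)
        sum += dev->pcpu_refcnt[i];
    return sum;
}
```

With a 4-byte int, 64 slots occupy 256 bytes, which is consistent with the per-device figure quoted above. The trade-off is that reading the total is slower than reading one shared counter, which is acceptable when holds and puts vastly outnumber reads.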


12 Oct, 2010 4 commits

Bluetooth: clean up rfcomm code · 534c92fd

Authored by Andrei Emeltchenko on Oct 01, 2010

Remove dead code and unused rfcomm thread events
Signed-off-by: Andrei Emeltchenko <andrei.emeltchenko@nokia.com>
Acked-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>


Bluetooth: Update conf_state before send config_req out · ab3e5715

Authored by Haijun Liu on Sep 30, 2010

Update conf_state with L2CAP_CONF_REQ_SENT before sending config_req out in
l2cap_config_req().
Signed-off-by: Haijun Liu <haijun.liu@atheros.com>
Acked-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>


Bluetooth: Use the proper error value from bt_skb_send_alloc() · 0175d629

Authored by Gustavo F. Padovan on Sep 24, 2010

&err points to the proper error set by bt_skb_send_alloc() when it
fails.
Acked-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>


Bluetooth: make batostr() print in the right order · d6b2eb2f

Authored by Gustavo F. Padovan on Sep 03, 2010

The Bluetooth core uses the BD_ADDR in the opposite order from the
human readable order. So we are changing batostr() to print in the
correct order and then removing some baswap(), as they are not needed
anymore.
Acked-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
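
To show what "printing in the right order" means in practice, here is a minimal userspace sketch, assuming the address is stored least significant byte first as the message describes. `demo_bdaddr` and `demo_batostr` are hypothetical names, and the exact format string is an assumption rather than a copy of the kernel's batostr().

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for a Bluetooth address: six bytes, LSB first. */
struct demo_bdaddr {
    uint8_t b[6];
};

/*
 * Format the address into out (at least 18 bytes), walking the bytes
 * from b[5] down to b[0] so the text reads in human order.
 */
static void demo_batostr(const struct demo_bdaddr *ba, char *out, size_t len)
{
    snprintf(out, len, "%02X:%02X:%02X:%02X:%02X:%02X",
             ba->b[5], ba->b[4], ba->b[3], ba->b[2], ba->b[1], ba->b[0]);
}
```

Doing the reversal inside the formatter means callers no longer need to swap the address before printing it.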
