1. 12 Nov 2008 (2 commits)
  2. 11 Nov 2008 (3 commits)
  3. 08 Nov 2008 (3 commits)
  4. 07 Nov 2008 (3 commits)
  5. 06 Nov 2008 (3 commits)
    • net: Don't leak packets when a netns is going down · 0a36b345
      Eric W. Biederman committed
      I have been tracking for a while a case where, when the
      network namespace exits, the cleanup gets stuck in an
      endless procession of:
      
      unregister_netdevice: waiting for lo to become free. Usage count = 3
      unregister_netdevice: waiting for lo to become free. Usage count = 3
      unregister_netdevice: waiting for lo to become free. Usage count = 3
      unregister_netdevice: waiting for lo to become free. Usage count = 3
      unregister_netdevice: waiting for lo to become free. Usage count = 3
      unregister_netdevice: waiting for lo to become free. Usage count = 3
      unregister_netdevice: waiting for lo to become free. Usage count = 3
      
      It turns out that if you listen on a multicast address, an unsubscribe
      packet is sent when the network device goes down.  If you shut down
      the network namespace without carefully cleaning up, this can trigger
      the unsubscribe packet to be sent over the loopback interface while
      the network namespace is going down.
      
      All of which is fine, except that we drop the packet and forget to
      free it, leaking the skb and the dst entry attached to it.  As it
      turns out, the dst entry holds a reference to the idev, which holds
      the dev and keeps everything from being cleaned up.  Yuck!
      
      Fixing my earlier thinko and adding the needed kfree_skb makes
      everything clean up beautifully.
      Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      0a36b345
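      A minimal sketch of the pattern this fix describes, with illustrative
      names (netns_alive() is a hypothetical stand-in; the actual patch
      touches the drop path hit during namespace shutdown): when the packet
      is dropped instead of transmitted, kfree_skb() releases the skb and the
      dst attached to it, which in turn drops the idev and dev references.

          /* Sketch only; netns_alive() is hypothetical. */
          static int xmit_or_drop(struct sk_buff *skb)
          {
                  if (!netns_alive(dev_net(skb->dev))) {
                          /* was: dropped without freeing, leaking skb + dst */
                          kfree_skb(skb);
                          return NET_XMIT_DROP;
                  }
                  return dev_queue_xmit(skb);
          }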
    • net: Guarantee the proper ordering of the loopback device. · ae33bc40
      Eric W. Biederman committed
      I was recently hunting a bug that occurred in network namespace
      cleanup.  In looking at the code it became apparent that we have,
      and will continue to have, cases where if anything is going
      on in a network namespace there will be assumptions that the
      loopback device is present.  Things like sending igmp unsubscribe
      messages when we bring down network devices invoke the routing
      code, which assumes that at least the loopback driver is present.
      
      Therefore, to avoid magic initcall ordering hackery that is hard
      to follow and hard to get right, insert a call to register the
      loopback device directly from net_dev_init().  This guarantees
      that the loopback device is the first device registered and
      the last network device to go away.
      Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      ae33bc40
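      A sketch of the shape of the change, assuming loopback's pernet
      operations are exposed under the symbol loopback_net_ops (an
      assumption, not confirmed by the message):

          /* In net_dev_init(), before any other pernet device registration,
           * so loopback is the first netdev in every namespace and the
           * last to be unregistered on cleanup. */
          if (register_pernet_device(&loopback_net_ops))
                  goto out;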
    • netns: Delete virtual interfaces during namespace cleanup · d0c082ce
      Eric W. Biederman committed
      When physical devices are inside of a network namespace and that
      network namespace terminates, we cannot make them go away.  We
      have to keep them, and moving them to the initial network namespace
      is the best we can do.
      
      For virtual devices left in a network namespace that is exiting
      we have no need to preserve them, and we now have the infrastructure
      that allows us to delete them.  So delete virtual devices when we
      exit a network namespace.  This makes the necessary user space cleanup
      after a network namespace exits much more tractable.
      Acked-by: Daniel Lezcano <dlezcano@fr.ibm.com>
      Acked-by: Pavel Emelyanov <xemul@openvz.org>
      Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      d0c082ce
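      A sketch of the cleanup rule, assuming rtnl_link_ops->dellink is what
      marks (and removes) a virtual device; the handler name follows the
      era's conventions but is not guaranteed to match the patch:

          static void __net_exit default_device_exit(struct net *net)
          {
                  struct net_device *dev, *next;

                  rtnl_lock();
                  for_each_netdev_safe(net, dev, next) {
                          if (dev->rtnl_link_ops && dev->rtnl_link_ops->dellink)
                                  /* virtual device: just delete it */
                                  dev->rtnl_link_ops->dellink(dev);
                          else
                                  /* physical device: move back to init_net */
                                  dev_change_net_namespace(dev, &init_net,
                                                           "dev%d");
                  }
                  rtnl_unlock();
          }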
  6. 05 Nov 2008 (2 commits)
    • net: sk_free_datagram() should use sk_mem_reclaim_partial() · 270acefa
      Eric Dumazet committed
      I noticed contention on udp_memory_allocated in regular UDP applications.
      
      While tcp_memory_allocated is seldom used, it appears each incoming UDP
      frame currently touches udp_memory_allocated when queued, and again when
      received by the application.
      
      One possible solution is to use sk_mem_reclaim_partial() instead of
      sk_mem_reclaim(), so that we keep a small reserve (less than one page)
      of memory for each UDP socket.
      
      We did something very similar on the TCP side in commit
      9993e7d3
      ([TCP]: Do not purge sk_forward_alloc entirely in tcp_delack_timer())
      
      A more complex solution would need to convert prot->memory_allocated to
      use a percpu_counter with batches of 64 or 128 pages.
      Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      270acefa
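      A minimal sketch of the change as described; the body approximates the
      era's datagram-free helper rather than reproducing the patch:

          void skb_free_datagram(struct sock *sk, struct sk_buff *skb)
          {
                  kfree_skb(skb);
                  /* was sk_mem_reclaim(sk): now keep a sub-page reserve per
                   * socket instead of touching udp_memory_allocated on
                   * every datagram */
                  sk_mem_reclaim_partial(sk);
          }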
    • net: fix packet socket delivery in rx irq handler · 9b22ea56
      Patrick McHardy committed
      The changes to deliver hardware accelerated VLAN packets to packet
      sockets (commit bc1d0411) caused a warning for non-NAPI drivers.
      The __vlan_hwaccel_rx() function is called directly from the driver's
      RX function; for non-NAPI drivers that means it is still in RX IRQ
      context:
      
      [   27.779463] ------------[ cut here ]------------
      [   27.779509] WARNING: at kernel/softirq.c:136 local_bh_enable+0x37/0x81()
      ...
      [   27.782520]  [<c0264755>] netif_nit_deliver+0x5b/0x75
      [   27.782590]  [<c02bba83>] __vlan_hwaccel_rx+0x79/0x162
      [   27.782664]  [<f8851c1d>] atl1_intr+0x9a9/0xa7c [atl1]
      [   27.782738]  [<c0155b17>] handle_IRQ_event+0x23/0x51
      [   27.782808]  [<c015692e>] handle_edge_irq+0xc2/0x102
      [   27.782878]  [<c0105fd5>] do_IRQ+0x4d/0x64
      
      Split hardware accelerated VLAN reception into two parts to fix this:
      
      - __vlan_hwaccel_rx just stores the VLAN TCI and performs the VLAN
        device lookup, then calls netif_receive_skb()/netif_rx()
      
      - vlan_hwaccel_do_receive(), which is invoked by netif_receive_skb()
        in softirq context, performs the real reception and delivery to
        packet sockets.
      Reported-and-tested-by: Ramon Casellas <ramon.casellas@cttc.es>
      Signed-off-by: Patrick McHardy <kaber@trash.net>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      9b22ea56
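      A sketch of the two halves, assuming the skb->vlan_tci field and the
      vlan_group_get_device() helper of the era (details are illustrative):

          /* IRQ-context half: store the TCI, look up the vlan device,
           * defer everything else to softirq context */
          int __vlan_hwaccel_rx(struct sk_buff *skb, struct vlan_group *grp,
                                u16 vlan_tci, int polling)
          {
                  skb->vlan_tci = vlan_tci;
                  skb->dev = vlan_group_get_device(grp,
                                                   vlan_tci & VLAN_VID_MASK);
                  return polling ? netif_receive_skb(skb) : netif_rx(skb);
          }

          /* Softirq half, invoked from netif_receive_skb(), where calling
           * netif_nit_deliver() (packet socket delivery) is safe */
          void vlan_hwaccel_do_receive(struct sk_buff *skb);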
  7. 04 Nov 2008 (2 commits)
  8. 02 Nov 2008 (1 commit)
  9. 01 Nov 2008 (1 commit)
  10. 31 Oct 2008 (1 commit)
  11. 29 Oct 2008 (4 commits)
    • udp: RCU handling for Unicast packets. · 271b72c7
      Eric Dumazet committed
      The goals are:
      
      1) Optimize handling of incoming unicast UDP frames, so that no memory
       writes happen in the fast path.
      
       Note: multicasts and broadcasts will still need to take a lock,
       because doing a full lockless lookup in this case is difficult.
      
      2) No expensive operations in the socket bind/unhash phases:
        - No expensive synchronize_rcu() calls.
      
        - No added rcu_head in the socket structure, which would increase
        memory needs but, more importantly, would force us to use call_rcu()
        calls, which have the bad property of making the socket structure cold.
        (The rcu grace period between a socket's freeing and its potential
         reuse makes the socket cold in the CPU cache.)
        David did a previous patch using call_rcu() and noticed a 20%
        impact on TCP connection rates.
        Quoting Christopher Lameter:
         "Right. That results in cacheline cooldown. You'd want to recycle
          the object as they are cache hot on a per cpu basis. That is screwed
          up by the delayed regular rcu processing. We have seen multiple
          regressions due to cacheline cooldown.
          The only choice in cacheline hot sensitive areas is to deal with the
          complexity that comes with SLAB_DESTROY_BY_RCU or give up on RCU."
      
        - Because udp sockets are allocated from a dedicated kmem_cache,
        use of SLAB_DESTROY_BY_RCU can help here.
      
      Theory of operation:
      --------------------
      
      As the lookup is lockless (using rcu_read_lock()/rcu_read_unlock()),
      special care must be taken by readers and writers.
      
      Use of SLAB_DESTROY_BY_RCU is tricky too, because a socket can be freed
      and reused, and inserted in a different chain, or in the worst case the
      same chain, while readers are doing lookups at the same time.
      
      In order to avoid loops, a reader must check that each socket found in
      a chain really belongs to the chain the reader was traversing. If it
      finds a mismatch, the lookup must start again at the beginning. This
      *restart* loop is the reason we had to use a read lock for the multicast
      case, because we don't want to send the same message several times to
      the same socket.
      
      We use RCU only for the fast path.
      Thus, /proc/net/udp still takes spinlocks.
      Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      271b72c7
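      A sketch of the restart rule the last paragraphs describe; match() and
      the iteration macro are illustrative stand-ins, not the patch's exact
      names:

          static struct sock *udp_lookup_rcu(struct hlist_head *chain,
                                             unsigned int hash)
          {
                  struct sock *sk;
                  struct hlist_node *node;

          begin:
                  sk_for_each_rcu(sk, node, chain) {
                          if (!match(sk)) /* stand-in for addr/port checks */
                                  continue;
                          /* SLAB_DESTROY_BY_RCU: sk may have been freed and
                           * reused while we walked; if it no longer hashes
                           * to this chain we may have wandered into another
                           * chain (or a loop), so restart from the head */
                          if (unlikely(sk->sk_hash != hash))
                                  goto begin;
                          if (unlikely(!atomic_inc_not_zero(&sk->sk_refcnt)))
                                  goto begin; /* mid-free: retry */
                          return sk;
                  }
                  return NULL;
          }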
    • net: don't use INIT_RCU_HEAD · 93adcc80
      Alexey Dobriyan committed
      call_rcu() will unconditionally rewrite the RCU head anyway.
      This applies to:
      	struct neigh_parms
      	struct neigh_table
      	struct net
      	struct cipso_v4_doi
      	struct in_ifaddr
      	struct in_device
      	rt->u.dst
      Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
      Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      93adcc80
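      The point in miniature, with names borrowed from the neighbour code
      (the exact call sites differ per structure, and the callback name is
      an assumption):

          /* No INIT_RCU_HEAD(&parms->rcu_head) is needed beforehand;
           * call_rcu() rewrites the rcu_head unconditionally. */
          call_rcu(&parms->rcu_head, neigh_rcu_free_parms);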
    • net: reduce structures when XFRM=n · def8b4fa
      Alexey Dobriyan committed
      ifdef out
      * struct sk_buff::sp		(pointer)
      * struct dst_entry::xfrm	(pointer)
      * struct sock::sk_policy	(2 pointers)
      Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      def8b4fa
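      A sketch of the pattern for one of the three structures; surrounding
      fields abbreviated:

          struct sk_buff {
                  /* ... other members unchanged ... */
          #ifdef CONFIG_XFRM
                  struct sec_path         *sp;    /* absent when XFRM=n */
          #endif
                  /* ... */
          };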
    • pktgen: fix multiple queue warning · 88271660
      Jesse Brandeburg committed
      When testing the new pktgen module with multiple queues and ixgbe with:
      	pgset "flag QUEUE_MAP_CPU"
      
      I found that I was getting errors in dmesg like:
      pktgen: WARNING: QUEUE_MAP_CPU disabled because CPU count (8) exceeds number
      <4>pktgen: WARNING: of tx queues (8) on eth15
      
      You'll note, 8 really doesn't exceed 8.
      
      This patch seemed to fix the logic errors, and also the attempts at
      limiting line length in printk (which didn't work anyway, as the stray
      "<4>" log level marker above shows).
      Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
      Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
      Signed-off-by: Robert Olsson <robert.olsson@its.uu.se>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      88271660
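      An illustrative shape of the fix, assuming the device's queue count
      lives in real_num_tx_queues (the pktgen variable names here are
      assumptions): with 8 CPUs and 8 tx queues the warning must not fire,
      so the test has to be strict, and the message goes out as one printk:

          if (num_online_cpus() > odev->real_num_tx_queues)
                  printk(KERN_WARNING "pktgen: WARNING: QUEUE_MAP_CPU "
                         "disabled because CPU count (%d) exceeds number of "
                         "tx queues (%d) on %s\n",
                         num_online_cpus(), odev->real_num_tx_queues,
                         odev->name);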
  12. 28 Oct 2008 (2 commits)
  13. 23 Oct 2008 (1 commit)
    • net: Fix disjunct computation of netdev features · b63365a2
      Herbert Xu committed
      My change
      
          commit e2a6b852
          net: Enable TSO if supported by at least one device
      
      didn't do what was intended because the netdev_compute_features
      function was designed for conjunctions.  So what happened was that
      it would simply take the TSO status of the last constituent device.
      
      This patch extends it to support both conjunctions and disjunctions
      under the new name of netdev_increment_features.
      
      It also adds a new function netdev_fix_features which does the
      sanity checking that usually occurs upon registration.  This ensures
      that the computation doesn't result in an illegal combination
      since this checking is absent when the change is initiated via
      ethtool.
      
      The two users of netdev_compute_features have been converted.
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      b63365a2
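      A sketch of the difference between the two operations on feature
      flags; this fragment is illustrative and does not reproduce the
      patch's exact masking logic:

          unsigned long all_features, one_dev_features;

          /* conjunction: the combined device can only do what every
           * constituent can do (this is what took the last device's
           * TSO status) */
          all_features &= one_dev_features;

          /* disjunction ("increment"): software-assisted features such as
           * TSO can be enabled if at least one constituent supports them */
          all_features |= one_dev_features & NETIF_F_ALL_TSO;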
  14. 20 Oct 2008 (1 commit)
  15. 17 Oct 2008 (1 commit)
  16. 15 Oct 2008 (1 commit)
  17. 14 Oct 2008 (2 commits)
  18. 13 Oct 2008 (1 commit)
  19. 08 Oct 2008 (5 commits)
    • netns: export netns list · b76a461f
      Alexey Dobriyan committed
      Conntrack code will use it for
      a) removing expectations and helpers when the corresponding module is
         removed, and
      b) removing conntracks when an L3 protocol conntrack module is removed.
      Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
      Signed-off-by: Patrick McHardy <kaber@trash.net>
      b76a461f
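      A sketch of the consumer pattern the export enables, using the
      existing for_each_net() helper; the cleanup callback is hypothetical:

          struct net *net;

          /* on module unload, walk every namespace under the lock that
           * protects the namespace list */
          rtnl_lock();
          for_each_net(net)
                  remove_conntracks_for(net); /* hypothetical per-netns hook */
          rtnl_unlock();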
    • net: Fix netdev_run_todo dead-lock · 58ec3b4d
      Herbert Xu committed
      Benjamin Thery tracked down a bug that explains many instances
      of the error
      
      unregister_netdevice: waiting for %s to become free. Usage count = %d
      
      It turns out that netdev_run_todo can dead-lock with itself if
      a second instance of it is run in a thread that will then free
      a reference to the device waited on by the first instance.
      
      The problem is really quite silly.  We were trying to create
      parallelism where none was required.  As netdev_run_todo always
      follows an RTNL section, and todo tasks can only be added
      with the RTNL held, by definition you should only need to wait
      for the very ones you've added and be done with it.
      
      There is no need for a second mutex or spinlock.
      
      This is exactly what the following patch does.
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      58ec3b4d
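      A sketch of the simplified flow; net_todo_list and the helpers follow
      the era's names, but the body is an approximation:

          void netdev_run_todo(void)
          {
                  struct list_head list;

                  /* Still under the RTNL here: snapshot our own todo
                   * entries and let later RTNL sections build their own
                   * list.  No extra mutex or spinlock required. */
                  list_replace_init(&net_todo_list, &list);
                  __rtnl_unlock();

                  while (!list_empty(&list)) {
                          struct net_device *dev =
                                  list_first_entry(&list, struct net_device,
                                                   todo_list);
                          list_del(&dev->todo_list);
                          /* only waits on the entries we ourselves added */
                          netdev_wait_allrefs(dev);
                          /* final per-device teardown elided */
                  }
          }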
    • net: only invoke dev->change_rx_flags when device is UP · b6c40d68
      Patrick McHardy committed
      Jesper Dangaard Brouer <hawk@comx.dk> reported a bug when setting a VLAN
      device down that is in promiscuous mode:
      
      When the VLAN device is set down, the promiscuous count on the real
      device is decremented by one by vlan_dev_stop(). When removing the
      promiscuous flag from the VLAN device afterwards, the promiscuous
      count on the real device is decremented a second time by the
      vlan_change_rx_flags() callback.
      
      The root cause is that the ->change_rx_flags() callback is invoked
      while the device is down. The intended semantics mirror those of the
      ->set_rx_mode callbacks: the ->open function is responsible for doing
      a full sync on open, the ->close() function is responsible for doing
      full cleanup on ->stop(), and ->change_rx_flags() is meant to do
      incremental changes while the device is UP.
      
      Only invoke ->change_rx_flags() while the device is UP to provide the
      intended behaviour.
      Tested-by: Jesper Dangaard Brouer <jdb@comx.dk>
      Signed-off-by: Patrick McHardy <kaber@trash.net>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      b6c40d68
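      A sketch of the gating helper the description implies; the helper name
      is an assumption, but the check is exactly what the message states:

          static void dev_change_rx_flags(struct net_device *dev, int flags)
          {
                  /* mirror ->set_rx_mode semantics: ->open does the full
                   * sync, ->stop the full cleanup, so only report
                   * incremental changes while the device is UP */
                  if ((dev->flags & IFF_UP) && dev->change_rx_flags)
                          dev->change_rx_flags(dev, flags);
          }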
    • net: packet split receive api · 654bed16
      Peter Zijlstra committed
      Add some packet-split receive hooks.
      
      For one, this allows doing NUMA-node-affine page allocations. Later on,
      these hooks will be extended to do emergency reserve allocations for
      fragments.
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      654bed16
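      A sketch of a hook pair of the kind described, assuming the names
      netdev_alloc_page() and skb_add_rx_frag() (an assumption based on the
      description, not confirmed by it):

          /* page allocation goes through a hook so it can later be made
           * NUMA-node affine or served from an emergency reserve */
          struct page *netdev_alloc_page(struct net_device *dev)
          {
                  return alloc_page(GFP_ATOMIC); /* placeholder policy */
          }

          /* attach an rx fragment to the skb and fix up the accounting */
          void skb_add_rx_frag(struct sk_buff *skb, int i, struct page *page,
                               int off, int size)
          {
                  skb_fill_page_desc(skb, i, page, off, size);
                  skb->len += size;
                  skb->data_len += size;
                  skb->truesize += size;
          }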
    • net: wrap sk->sk_backlog_rcv() · c57943a1
      Peter Zijlstra committed
      Wrap calls to sk->sk_backlog_rcv() in a function. This will allow
      extending the generic sk_backlog_rcv behaviour.
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      c57943a1
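      The wrapper in its simplest form, following the description directly
      (a static inline in the socket header):

          static inline int sk_backlog_rcv(struct sock *sk, struct sk_buff *skb)
          {
                  /* single place to hook extensions into backlog receive */
                  return sk->sk_backlog_rcv(sk, skb);
          }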
  20. 01 Oct 2008 (1 commit)