Commits · 65ac6a5fa658b90f1be700c55e7cd72e4611015d · openeuler / raspberrypi-kernel

21 Oct, 2010 1 commit

vlan: Enable software emulation for vlan acceleration. · 7b9c6090

Authored by Jesse Gross on Oct 20, 2010

Currently users of hardware vlan acceleration need to know whether
the device supports it before generating packets.  However, vlan
acceleration will soon be available in a more flexible manner so
knowing ahead of time becomes much more difficult.  This adds
a software fallback path for vlan packets on devices without the
necessary offloading support, similar to other types of hardware
acceleration.
Signed-off-by: Jesse Gross <jesse@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
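
The software fallback described in this commit comes down to writing the 802.1Q tag into the frame when the device cannot insert it. The following is a minimal user-space sketch of that idea, not the kernel code: `sw_vlan_insert`, `prepare_vlan_frame` and the `hw_tx_offload` flag are hypothetical names, and a plain byte buffer stands in for an skb.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_ETH_ALEN    6       /* bytes in one MAC address */
#define SKETCH_VLAN_HLEN   4       /* bytes added by an 802.1Q tag */
#define SKETCH_ETH_P_8021Q 0x8100  /* 802.1Q tag protocol identifier */

/*
 * Hypothetical helper: insert an 802.1Q tag carrying `tci` right after the
 * two MAC addresses of the Ethernet frame in buf[0..len). `cap` is the size
 * of the buffer. Returns the new frame length, or 0 if the frame is too
 * short or the buffer has no room for the tag.
 */
size_t sw_vlan_insert(uint8_t *buf, size_t len, size_t cap, uint16_t tci)
{
    const size_t mac_len = 2 * SKETCH_ETH_ALEN;

    assert(buf != NULL);
    if (len < mac_len + 2 || cap < len + SKETCH_VLAN_HLEN)
        return 0;

    /* Shift EtherType and payload right to open a 4-byte gap. */
    memmove(buf + mac_len + SKETCH_VLAN_HLEN, buf + mac_len, len - mac_len);

    /* Fill the gap: TPID, then TCI, both big-endian. */
    buf[mac_len]     = SKETCH_ETH_P_8021Q >> 8;
    buf[mac_len + 1] = SKETCH_ETH_P_8021Q & 0xff;
    buf[mac_len + 2] = (uint8_t)(tci >> 8);
    buf[mac_len + 3] = (uint8_t)(tci & 0xff);

    return len + SKETCH_VLAN_HLEN;
}

/*
 * Hypothetical transmit-path decision: take the software path only when the
 * device lacks tag-insertion offload.
 */
size_t prepare_vlan_frame(int hw_tx_offload, uint8_t *buf, size_t len,
                          size_t cap, uint16_t tci)
{
    if (hw_tx_offload)
        return len;  /* hardware inserts the tag itself */
    return sw_vlan_insert(buf, len, cap, tci);
}
```

With this shape the caller no longer needs to know in advance whether the device offloads tagging; the check happens once, at transmit time.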

20 Oct, 2010 3 commits

net: allocate tx queues in register_netdevice · e6484930

Authored by Tom Herbert on Oct 18, 2010

This patch introduces netif_alloc_netdev_queues which is called from
register_netdevice instead of alloc_netdev_mq.  This makes TX queue
allocation symmetric with RX allocation.  Also, queue locks allocation
is done in netdev_init_one_queue.  Change set_real_num_tx_queues to
fail if requested number < 1 or greater than number of allocated
queues.
Signed-off-by: Tom Herbert <therbert@google.com>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
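
The range check on the requested queue count can be sketched as follows. This is an illustration only: `struct txq_dev` and `sketch_set_real_num_tx_queues` are hypothetical stand-ins, and returning `-EINVAL` is an assumption, since the commit message only says the call fails.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for the two TX queue counters kept per device. */
struct txq_dev {
    unsigned int num_tx_queues;      /* queues allocated at registration */
    unsigned int real_num_tx_queues; /* queues currently in use */
};

/*
 * Sketch of the range check: accept 1 <= txq <= allocated queues.
 * The -EINVAL return value is an assumption made for this example.
 */
int sketch_set_real_num_tx_queues(struct txq_dev *dev, unsigned int txq)
{
    assert(dev != NULL);
    if (txq < 1 || txq > dev->num_tx_queues)
        return -EINVAL;
    dev->real_num_tx_queues = txq;
    return 0;
}
```

A rejected request leaves the current count untouched, so a driver can keep running with its previous configuration.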

net: cleanups in RX queue allocation · bd25fa7b

Authored by Tom Herbert on Oct 18, 2010

Clean up RX queue allocation.  In netif_set_real_num_rx_queues,
return an error on an attempt to set zero queues or when the requested
number is greater than the number of allocated queues.  In
netif_alloc_rx_queues, do BUG_ON if queue_count is zero.
Signed-off-by: Tom Herbert <therbert@google.com>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: fail alloc_netdev_mq if queue count < 1 · 55513fb4

Authored by Tom Herbert on Oct 18, 2010

In alloc_netdev_mq fail if requested queue_count < 1.
Signed-off-by: Tom Herbert <therbert@google.com>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

13 Oct, 2010 1 commit

net: percpu net_device refcount · 29b4433d

Authored by Eric Dumazet on Oct 11, 2010

We tried very hard to remove all possible dev_hold()/dev_put() pairs in
network stack, using RCU conversions.

There is still an unavoidable device refcount change for every dst we
create/destroy, and this can slow down some workloads (routers or some
app servers, mmap af_packet)

We can switch to a percpu refcount implementation, now that the dynamic
per_cpu infrastructure is mature. On a 64-cpu machine, this consumes 256
bytes per device.

On x86, dev_hold(dev) code :

before
        lock    incl 0x280(%ebx)
after:
        movl    0x260(%ebx),%eax
        incl    fs:(%eax)

Stress bench :

(Sending 160.000.000 UDP frames,
IP route cache disabled, dual E5540 @2.53GHz,
32bit kernel, FIB_TRIE)

Before:

real    1m1.662s
user    0m14.373s
sys     12m55.960s

After:

real    0m51.179s
user    0m15.329s
sys     10m15.942s
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
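
The per-cpu counting idea can be sketched in user space as one counter slot per cpu, summed only when the true count is needed. This is a single-threaded simulation under stated assumptions: `struct refcnt_dev` and the `sketch_*` functions are hypothetical, the cpu number is passed in explicitly, and a plain array stands in for the kernel's per-cpu memory. With a 4-byte `int`, 64 slots occupy the 256 bytes mentioned in the commit message.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_NR_CPUS 64  /* assumption: the 64-cpu machine from the text */

/* Hypothetical device with one counter slot per cpu, not one shared counter. */
struct refcnt_dev {
    int pcpu_refcnt[SKETCH_NR_CPUS];
};

/* Take a reference: touch only the calling cpu's slot. */
void sketch_dev_hold(struct refcnt_dev *dev, unsigned int cpu)
{
    assert(cpu < SKETCH_NR_CPUS);
    dev->pcpu_refcnt[cpu]++;
}

/* Drop a reference: may run on another cpu, so a slot can go negative. */
void sketch_dev_put(struct refcnt_dev *dev, unsigned int cpu)
{
    assert(cpu < SKETCH_NR_CPUS);
    dev->pcpu_refcnt[cpu]--;
}

/* The true count is the sum over all slots; only slow paths need it. */
int sketch_dev_refcnt(const struct refcnt_dev *dev)
{
    int sum = 0;

    for (size_t i = 0; i < SKETCH_NR_CPUS; i++)
        sum += dev->pcpu_refcnt[i];
    return sum;
}
```

The trade-off is that hold and put become cheap local updates, while reading the total becomes a loop over every cpu.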

09 Oct, 2010 2 commits

net: Fix rxq ref counting · 4315d834

Authored by Tom Herbert on Oct 07, 2010

The rx->count reference is used to track reference counts to the
number of rx-queue kobjects created for the device.  This patch
eliminates initialization of the counter in netif_alloc_rx_queues
and instead increments the counter each time a kobject is created.
This is now symmetric with the decrement that is done when an object is
released.
Signed-off-by: Tom Herbert <therbert@google.com>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: Update kernel-doc for netif_set_real_num_rx_queues() · 4e7f7951

Authored by Ben Hutchings on Oct 08, 2010

Synchronise the comment with the preceding implementation change.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

07 Oct, 2010 1 commit

net: netif_set_real_num_rx_queues may cap num_rx_queues at init time · 3d3211ef

Authored by John Fastabend on Oct 06, 2010

Do not set num_rx_queues in netif_set_real_num_rx_queues(), because some
drivers will increase real_num_rx_queues later due to feature
changes or available interrupts increasing. Setting num_rx_queues
here ends up creating a cap on the number of rx queues
available.

For example the ixgbe driver sets the max number of queues it intends
to use ever then sets the current number in use with the
netif_set_real_num_{rx|tx}_queues calls. With the current implementation
the number of rx queues gets limited so when a feature such as DCB
or FCoE is enabled the queues are no longer available.

kobjects will only be allocated for real_num_rx_queues so the waste
in memory is minimal.
Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

06 Oct, 2010 1 commit

net: add a core netdev->rx_dropped counter · caf586e5

Authored by Eric Dumazet on Sep 30, 2010

In various situations, a device provides a packet to our stack and we
drop it before it enters protocol stack :
- softnet backlog full (accounted in /proc/net/softnet_stat)
- bad vlan tag (not accounted)
- unknown/unregistered protocol (not accounted)

We can handle a per-device counter of such dropped frames at core level,
and automatically add it to the device provided stats (rx_dropped), so
that standard tools can be used (ifconfig, ip link, cat /proc/net/dev).

This is a generalization of commit 8990f468 (net: rx_dropped
accounting), thus reverting it.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

05 Oct, 2010 1 commit

net: dynamic ingress_queue allocation · 24824a09

Authored by Eric Dumazet on Oct 02, 2010

ingress being not used very much, and net_device->ingress_queue being
quite a big object (128 or 256 bytes), use a dynamic allocation if
needed (tc qdisc add dev eth0 ingress ...)

dev_ingress_queue(dev) helper should be used only with RTNL taken.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

30 Sep, 2010 2 commits

net: rename netdev rx_queue to ingress_queue · bfa5ae63

Authored by Eric Dumazet on Sep 28, 2010

There is some confusion with rx_queue name after RPS, and net drivers
private rx_queue fields.

I suggest to rename "struct net_device"->rx_queue to ingress_queue.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: add a recursion limit in xmit path · 745e20f1

Authored by Eric Dumazet on Sep 29, 2010

As tunnel devices are going to be lockless, we need to make sure a
misconfigured machine won't enter an infinite loop.

Add a percpu variable, and limit to three the number of stacked xmits.
Reported-by: Jesse Gross <jesse@nicira.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
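
A depth counter of the kind described here can be sketched in user space with a thread-local variable standing in for the kernel's percpu variable. The names `sketch_xmit` and `looping_tunnel`, the `-ELOOP` return value, and the exact comparison against the limit are assumptions made for illustration; the commit message only states that stacked transmits are limited to three.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define SKETCH_RECURSION_LIMIT 3  /* at most three stacked transmits */

/* Per-thread depth counter, standing in for the percpu variable. */
static _Thread_local int xmit_recursion;

typedef int (*xmit_fn)(void *ctx);

/*
 * Hypothetical transmit entry point: refuse to nest deeper than the limit,
 * otherwise hand the packet to the next device's transmit function.
 */
int sketch_xmit(xmit_fn next, void *ctx)
{
    int rc;

    if (xmit_recursion >= SKETCH_RECURSION_LIMIT)
        return -ELOOP;  /* assumption: report the loop to the caller */

    xmit_recursion++;
    rc = next ? next(ctx) : 0;
    xmit_recursion--;
    assert(xmit_recursion >= 0);
    return rc;
}

/* A misconfigured "tunnel" that always transmits back into itself. */
int looping_tunnel(void *ctx)
{
    int *entered = ctx;

    (*entered)++;
    return sketch_xmit(looping_tunnel, ctx);
}
```

Calling `sketch_xmit(looping_tunnel, &count)` unwinds after three nested entries instead of recursing until the stack overflows.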

28 Sep, 2010 1 commit

net: Allow changing number of RX queues after device allocation · 62fe0b40

Authored by Ben Hutchings on Sep 27, 2010

For RPS, we create a kobject for each RX queue based on the number of
queues passed to alloc_netdev_mq().  However, drivers generally do not
determine the numbers of hardware queues to use until much later, so
this usually represents the maximum number the driver may use and not
the actual number in use.

For TX queues, drivers can update the actual number using
netif_set_real_num_tx_queues().  Add a corresponding function for RX
queues, netif_set_real_num_rx_queues().
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

27 Sep, 2010 2 commits

rps: allocate rx queues in register_netdevice only · 1b4bf461

Authored by Eric Dumazet on Sep 23, 2010

Instead of having two places where we allocate dev->_rx, introduce
netif_alloc_rx_queues() helper and call it only from
register_netdevice(), not from alloc_netdev_mq().

Goal is to let drivers change dev->num_rx_queues after allocating netdev
and before registering it.

This also removes a lot of ifdefs in net/core/dev.c
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: propagate NETIF_F_HIGHDMA to vlans · c5256c51

Authored by Eric Dumazet on Sep 23, 2010

Automatically allows vlans to get NETIF_F_HIGHDMA if underlying device
supports it.

On 32bit arches (and more precisely if CONFIG_HIGHMEM is enabled), it
can help to reduce cost of illegal_highdma() and __skb_linearize()
calls.

Tested on tg3, bnx2 and bonding; this worked very well.

This is a generalization of a patch provided by Yi Zou & Jeff Kirsher.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

18 Sep, 2010 1 commit

netns: keep vlan slaves on master netns move · 3b27e105

Authored by David Lamparter on Sep 17, 2010

Previously, if a vlan master device was moved from one network namespace
to another, all 802.1q and macvlan slaves were deleted.

We can use dev->reg_state to figure out whether dev_change_net_namespace
is happening, since that won't set dev->reg_state to NETREG_UNREGISTERING.
So this changes 8021q and macvlan to ignore NETDEV_UNREGISTER when
reg_state is not NETREG_UNREGISTERING.
Signed-off-by: David Lamparter <equinox@diac24.net>
Reviewed-by: "Eric W. Biederman" <ebiederm@xmission.com>
Acked-by: Daniel Lezcano <daniel.lezcano@free.fr>
Acked-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

17 Sep, 2010 1 commit

net: include inetdevice.h for rcu_dereference_raw api change · caeda9b9

Authored by Stephen Rothwell on Sep 16, 2010

rcu_dereference_raw() now needs to know the type of its argument.
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>

16 Sep, 2010 2 commits

net: enable GRO by default for vlan devices · 16c3ea78

Authored by Brandon Philips on Sep 15, 2010

Currently vlan devices don't have GRO by default as none of the Ethernet
drivers add NETIF_F_GRO to their vlan_features.

As GRO is a software feature add GRO to dev->vlan_features in
register_netdevice() and let vlan_dev_init() take care that it gets
enabled only when dev->features has NETIF_F_GRO too.
Signed-off-by: Brandon Philips <bphilips@suse.de>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
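
The two-step feature logic in this commit can be illustrated with plain bitmasks. This is a sketch under assumptions: the flag values, `struct feat_dev` and both functions are hypothetical, and the vlan device is assumed to take the intersection of the parent's `features` and `vlan_features`.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical flag bits; the real NETIF_F_* values are not used here. */
#define SKETCH_F_SG  (1u << 0)
#define SKETCH_F_GRO (1u << 1)

struct feat_dev {
    unsigned int features;      /* what the device itself has enabled */
    unsigned int vlan_features; /* what vlans on top of it may inherit */
};

/* Registration step: GRO is a software feature, so always offer it to vlans. */
void sketch_register(struct feat_dev *dev)
{
    assert(dev != NULL);
    dev->vlan_features |= SKETCH_F_GRO;
}

/* vlan init step: inherit only what the parent both offers and has enabled. */
unsigned int sketch_vlan_features(const struct feat_dev *real_dev)
{
    return real_dev->features & real_dev->vlan_features;
}
```

A parent with GRO enabled passes it on to its vlans even though the driver never listed it in `vlan_features`; a parent with GRO disabled does not.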

ipv4: ip_ptr cleanups · 95ae6b22

Authored by Eric Dumazet on Sep 15, 2010

dev->ip_ptr is protected by rtnl and rcu.

Yet some places don't use appropriate primitives and/or locking rules.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

15 Sep, 2010 1 commit

net: use rcu_barrier() in rollback_registered_many · ef885afb

Authored by Eric Dumazet on Sep 13, 2010

netdev_wait_allrefs() waits until all references to a device vanish.

It currently uses a _very_ pessimistic 250 ms delay between each probe.
Some users reported that no more than 4 devices can be dismantled per
second; this is a pretty serious problem for some setups.

Most of the time, a refcount is about to be released by an RCU callback,
that is still in flight because rollback_registered_many() uses a
synchronize_rcu() call instead of rcu_barrier(). Problem is visible if
number of online cpus is one, because synchronize_rcu() is then a no op.

time to remove 50 ipip tunnels on a UP machine :

before patch : real 11.910s
after patch : real 1.250s
Reported-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Reported-by: Octavian Purdila <opurdila@ixiacom.com>
Reported-by: Benjamin LaHaise <bcrl@kvack.org>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

09 Sep, 2010 1 commit

net: rps: add the shortcut for one rps_cpus · 6febfca9

Authored by Changli Gao on Sep 03, 2010

When there is only one rps_cpus, skb_get_rxhash() can be eliminated.
Signed-off-by: Changli Gao <xiaosuo@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
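
The shortcut can be sketched as an early return before the hash is ever computed. Everything here is illustrative: `struct sketch_rps_map`, `sketch_get_rps_cpu` and `counting_hash` are hypothetical, and scaling the 32-bit hash by the map length is one way to pick an index, assumed for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_MAP_MAX 8

/* Hypothetical map of cpus that may process packets of one RX queue. */
struct sketch_rps_map {
    unsigned int len;
    uint16_t cpus[SKETCH_MAP_MAX];
};

typedef uint32_t (*hash_fn)(const void *pkt);

/* Returns the target cpu, or -1 when the map is empty. */
int sketch_get_rps_cpu(const struct sketch_rps_map *map, const void *pkt,
                       hash_fn hash)
{
    assert(map != NULL && map->len <= SKETCH_MAP_MAX);

    if (map->len == 0)
        return -1;
    if (map->len == 1)
        return map->cpus[0];  /* shortcut: no hash needed */

    /* Scale the 32-bit hash into the range [0, len). */
    return map->cpus[((uint64_t)hash(pkt) * map->len) >> 32];
}

/* Test double that records how often the hash was requested. */
unsigned int hash_calls;

uint32_t counting_hash(const void *pkt)
{
    (void)pkt;
    hash_calls++;
    return 0x80000000u;
}
```

With a single-entry map, `hash_calls` stays at zero, which is the saving the commit is after.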

08 Sep, 2010 1 commit

net: fix tx queue selection for bridged devices implementing select_queue · deabc772

Authored by Helmut Schaa on Sep 03, 2010

When a net device is implementing the select_queue callback and is part of
a bridge, frames coming from the bridge already have a tx queue associated
to the socket (introduced in commit a4ee3ce3,
"net: Use sk_tx_queue_mapping for connected sockets"). The call to
sk_tx_queue_get will then return the tx queue used by the bridge instead
of calling the select_queue callback.

In case of mac80211 this broke QoS which is implemented by using the
select_queue callback. Furthermore it introduced problems with rt2x00
because frames with the same TID and RA sometimes appeared on different
tx queues which the hw cannot handle correctly.

Fix this by always calling select_queue first if it is available and only
afterwards use the socket tx queue mapping.
Signed-off-by: Helmut Schaa <helmut.schaa@googlemail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
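
The ordering introduced by this fix (driver callback first, cached socket mapping second) can be sketched as follows. The types and functions are hypothetical stand-ins, and the final hash-based fallback is an assumption added so the example is complete.

```c
#include <assert.h>
#include <stddef.h>

struct pick_sock {
    int tx_queue_mapping;  /* -1 when no queue is cached on the socket */
};

struct pick_dev {
    unsigned int real_num_tx_queues;
    /* Optional driver callback; NULL when the driver provides none. */
    unsigned int (*select_queue)(const struct pick_dev *dev,
                                 unsigned int flow_hash);
};

/* Sketch of the lookup order after the fix. */
unsigned int sketch_pick_tx(const struct pick_dev *dev,
                            const struct pick_sock *sk,
                            unsigned int flow_hash)
{
    unsigned int q;

    assert(dev != NULL);
    if (dev->real_num_tx_queues == 0)
        return 0;

    if (dev->select_queue)                      /* 1. driver decides */
        q = dev->select_queue(dev, flow_hash);
    else if (sk && sk->tx_queue_mapping >= 0)   /* 2. cached socket queue */
        q = (unsigned int)sk->tx_queue_mapping;
    else                                        /* 3. assumed fallback */
        q = flow_hash % dev->real_num_tx_queues;

    /* Never hand back a queue the device does not have. */
    return q < dev->real_num_tx_queues ? q : 0;
}

/* Stand-in for a QoS-aware driver callback. */
unsigned int sketch_qos_select(const struct pick_dev *dev,
                               unsigned int flow_hash)
{
    (void)dev;
    return flow_hash & 3u;
}
```

A queue cached on the socket by a bridge therefore no longer overrides a driver such as mac80211 that relies on its own callback.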

03 Sep, 2010 1 commit

net: dev_add_pack() & __dev_remove_pack() changes · c07b68e8

Authored by Eric Dumazet on Sep 02, 2010

Add a small helper ptype_head() to get the head to manipulate

dev_add_pack() & __dev_remove_pack() can use a spinlock without
blocking BH, since softirq use RCU, and these functions are run from
process context only.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

02 Sep, 2010 1 commit

skge: add GRO support · 86cac58b

Authored by Eric Dumazet on Aug 31, 2010

- napi_gro_flush() is exported from net/core/dev.c, to avoid
  an irq_save/irq_restore in the packet receive path.
- use napi_gro_receive() instead of netif_receive_skb()
- use napi_gro_flush() before calling __napi_complete()
- turn on NETIF_F_GRO by default
- Tested on a Marvell 88E8001 Gigabit NIC
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

27 Aug, 2010 1 commit

gro: __napi_gro_receive() optimizations · 40d0802b

Authored by Eric Dumazet on Aug 26, 2010

compare_ether_header() can have a special implementation on 64 bit
arches if CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS is defined.

__napi_gro_receive() and vlan_gro_common() can avoid a conditional
branch to perform device match.

On x86_64, __napi_gro_receive() now has 38 instructions instead of 53.

As gcc-4.4.3 still chooses not to inline it, add the inline keyword to this
performance critical function.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
CC: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>

23 Aug, 2010 2 commits

net: Rename skb_has_frags to skb_has_frag_list · 21dc3301

Authored by David S. Miller on Aug 23, 2010

SKBs can be "fragmented" in two ways, via a page array (called
skb_shinfo(skb)->frags[]) and via a list of SKBs (called
skb_shinfo(skb)->frag_list).

Since skb_has_frags() tests the latter, its name is confusing
since it sounds more like it's testing the former.
Signed-off-by: David S. Miller <davem@davemloft.net>

net: 802.1q: make vlan_hwaccel_do_receive() return void · 05532121

Authored by Changli Gao on Aug 22, 2010

vlan_hwaccel_do_receive() always returns 0, so make it return void.
Signed-off-by: Changli Gao <xiaosuo@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

22 Aug, 2010 1 commit

net: rps: fix the wrong network header pointer · 1003489e

Authored by Changli Gao on Aug 21, 2010

__skb_get_rxhash() was broken after the commit:

 commit bfb564e7
 Author: Krishna Kumar <krkumar2@in.ibm.com>
 Date:   Wed Aug 4 06:15:52 2010 +0000

 core: Factor out flow calculation from get_rps_cpu
Signed-off-by: Changli Gao <xiaosuo@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

20 Aug, 2010 3 commits

net: rps: use proto_ports_offset() to handle the AH message correctly · 12fcdefb

Authored by Changli Gao on Aug 17, 2010

The SPI isn't at the beginning of an AH message.
Signed-off-by: Changli Gao <xiaosuo@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
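
Why AH needs special handling can be shown with the header layouts: an ESP header starts with the SPI, while an AH header carries Next Header, Payload Length and a reserved field in its first four bytes, so the SPI sits at offset 4. The sketch below is a user-space illustration; `sketch_spi_offset` and `sketch_read_spi` are hypothetical helpers, not the kernel's proto_ports_offset().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_IPPROTO_ESP 50
#define SKETCH_IPPROTO_AH  51

/* Byte offset of the SPI inside the protocol header. */
size_t sketch_spi_offset(uint8_t proto)
{
    /* AH: next header (1) + payload len (1) + reserved (2), then the SPI. */
    return proto == SKETCH_IPPROTO_AH ? 4 : 0;
}

/* Read the big-endian 32-bit SPI from an ESP or AH header. */
uint32_t sketch_read_spi(const uint8_t *hdr, uint8_t proto)
{
    assert(hdr != NULL);
    const uint8_t *p = hdr + sketch_spi_offset(proto);

    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}
```

Reading at offset 0 for AH would hash the Next Header and length bytes instead of the SPI, so unrelated flows could collapse onto one cpu.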

net: rps: skip fragment when computing rxhash · dbe5775b

Authored by Changli Gao on Aug 17, 2010

Fragmented IP packets may have no transport header, so when computing
rxhash, we should skip them.
Signed-off-by: Changli Gao <xiaosuo@gmail.com>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
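
Detecting a fragment takes one mask over the IPv4 flags and fragment-offset field: the packet is a fragment if the "more fragments" bit is set or the offset is nonzero. The helper names below are hypothetical, and the field is assumed to be already converted to host byte order.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_IP_DF     0x4000  /* "don't fragment" flag (not a fragment) */
#define SKETCH_IP_MF     0x2000  /* "more fragments" flag */
#define SKETCH_IP_OFFSET 0x1fff  /* fragment offset mask */

/* frag_off: IPv4 flags/fragment-offset field in host byte order. */
int sketch_ip_is_fragment(uint16_t frag_off)
{
    return (frag_off & (SKETCH_IP_MF | SKETCH_IP_OFFSET)) != 0;
}

/* Ports may only be read for hashing when the packet is not a fragment. */
int sketch_can_hash_ports(uint16_t frag_off)
{
    int ok = !sketch_ip_is_fragment(frag_off);

    assert(ok == 0 || ok == 1);
    return ok;
}
```

For fragments the hash would then fall back to the addresses alone, so all pieces of one datagram still land on the same cpu.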

net: rps: reset network header before calling skb_get_rxhash() · 2d47b459

Authored by Changli Gao on Aug 17, 2010

skb_get_rxhash() assumes the network header pointer of the skb is set
properly after the commit:

commit bfb564e7
Author: Krishna Kumar <krkumar2@in.ibm.com>
Date:   Wed Aug 4 06:15:52 2010 +0000

    core: Factor out flow calculation from get_rps_cpu
Signed-off-by: Changli Gao <xiaosuo@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

19 Aug, 2010 1 commit

net: simplify flags for tx timestamping · 2244d07b

Authored by Oliver Hartkopp on Aug 17, 2010

This patch removes the abstraction introduced by the union skb_shared_tx in
the shared skb data.

The access of the different union elements at several places led to some
confusion about accessing the shared tx_flags e.g. in skb_orphan_try().

http://marc.info/?l=linux-netdev&m=128084897415886&w=2
Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

18 Aug, 2010 1 commit

net: Fix a memmove bug in dev_gro_receive() · e5093aec

Authored by Jarek Poplawski on Aug 11, 2010

> Xin Xiaohui wrote:
> I looked into the code dev_gro_receive(), found the code here:
> if the frags[0] is pulled to 0, then the page will be released,
> and memmove() frags left.
> Is that right? I'm not sure if memmove do right or not, but
> frags[0].size is never set after memmove at least. what I think
> a simple way is not to do anything if we found frags[0].size == 0.
> The patch is as followed.
...

This version of the patch fixes the bug directly in memmove.
Reported-by: "Xin, Xiaohui" <xiaohui.xin@intel.com>
Signed-off-by: Jarek Poplawski <jarkao2@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

17 Aug, 2010 1 commit

core: Factor out flow calculation from get_rps_cpu · bfb564e7

Authored by Krishna Kumar on Aug 04, 2010

Factor out flow calculation code from get_rps_cpu, since other
functions can use the same code.

Revisions:

v2 (Ben): Separate flow calculation out and use in select queue.
v3 (Arnd): Don't re-implement MIN.
v4 (Changli): skb->data points to ethernet header in macvtap, and
	make a fast path. Tested macvtap with this patch.
v5 (Changli):
	- Cache skb->rxhash in skb_get_rxhash
	- macvtap may not have pow(2) queues, so change code for
	  queue selection.
    (Arnd):
	- Use first available queue if all fails.
Signed-off-by: Krishna Kumar <krkumar2@in.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

08 Aug, 2010 1 commit

net: disable preemption before call smp_processor_id() · cece1945

Authored by Changli Gao on Aug 07, 2010

Although netif_rx() isn't expected to be called in process context with
preemption enabled, it'd better handle this case. And this is why get_cpu()
is used in the non-RPS #ifdef branch. If tree RCU is selected,
rcu_read_lock() won't disable preemption, so preempt_disable() should be
called explicitly.
Signed-off-by: Changli Gao <xiaosuo@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

06 Aug, 2010 1 commit

net: Fix napi_gro_frags vs netpoll path · ce9e76c8

Authored by Jarek Poplawski on Aug 05, 2010

The netpoll_rx_on() check in __napi_gro_receive() skips part of the
"common" GRO_NORMAL path, especially "pull:" in dev_gro_receive(),
where at least eth header should be copied for entirely paged skbs.
Signed-off-by: Jarek Poplawski <jarkao2@gmail.com>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>

03 Aug, 2010 2 commits

Revert "net: remove zap_completion_queue" · 3578b0c8

Authored by David S. Miller on Aug 03, 2010

This reverts commit 15e83ed7.

As explained by Johannes Berg, the optimization made here is
invalid.  Or, at best, incomplete.

Not only destructor invocation, but conntrack entry releasing
must be executed outside of hw IRQ context.

So just checking "skb->destructor" is insufficient.
Signed-off-by: David S. Miller <davem@davemloft.net>

net: cleanup inclusion · a427615e

Authored by Changli Gao on Aug 02, 2010

Commit ab95bfe0 replaces bridge and macvlan
hooks in __netif_receive_skb(), so dev.c doesn't need to include their headers.
Signed-off-by: Changli Gao <xiaosuo@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

01 Aug, 2010 1 commit

net: ingress filter message limit · de384830

Authored by Stephen Hemminger on Aug 01, 2010

If the user misconfigures ingress and causes a redirection loop, don't
overwhelm the log.  This is also an error case so make it unlikely.
Found by inspection, luckily not in a real system.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

26 Jul, 2010 1 commit

net: dev_forward_skb should call nf_reset · c736eefa

Authored by Ben Greear on Jul 22, 2010

With conn-track zones and probably with different network
namespaces, the netfilter logic needs to be re-calculated
on packet receive.  If the netfilter logic is not reset,
it will not be recalculated properly.  This patch adds
the nf_reset logic to dev_forward_skb.
Signed-off-by: Ben Greear <greearb@candelatech.com>
Signed-off-by: David S. Miller <davem@davemloft.net>