Commits · 7489594cb249aeb178287c9a43a9e4f366044259 · OpenHarmony / kernel_linux

27 May, 2009 3 commits

gro: Optimise length comparison in skb_gro_header · 7489594c

Committed by Herbert Xu on May 26, 2009

By caching frag0_len, we can avoid checking both frag0 and the
length separately in skb_gro_header.  This helps as skb_gro_header
is called four times per packet which amounts to a few million
times at 10Gb/s.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>

7489594c
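
The idea of caching frag0_len can be illustrated with a small userspace sketch. This is not the kernel code: `struct gro_cb_sketch` and both function names are hypothetical, and the sketch assumes that frag0_len is kept at 0 whenever frag0 is NULL and that the requested header length is greater than zero. Under those assumptions, a single length comparison covers both the "no frag0" case and the "header does not fit" case.

```c
#include <stddef.h>

/* Hypothetical stand-in for the per-packet GRO state; not the kernel layout. */
struct gro_cb_sketch {
	const unsigned char *frag0;	/* start of the first fragment, or NULL */
	size_t frag0_len;		/* cached length; kept at 0 when frag0 is NULL */
	size_t data_offset;		/* current parse offset into the packet */
};

/* Before: the pointer and the length are tested separately. */
static const unsigned char *
gro_header_two_checks(const struct gro_cb_sketch *cb, size_t hlen, size_t frag_size)
{
	if (!cb->frag0)
		return NULL;
	if (cb->data_offset + hlen > frag_size)
		return NULL;
	return cb->frag0 + cb->data_offset;
}

/* After: one comparison, because frag0_len == 0 also encodes "no frag0". */
/* Assumes hlen > 0. */
static const unsigned char *
gro_header_cached(const struct gro_cb_sketch *cb, size_t hlen)
{
	if (cb->data_offset + hlen > cb->frag0_len)
		return NULL;
	return cb->frag0 + cb->data_offset;
}
```

A caller that gets NULL back would fall through to a slower path; that path is outside the scope of this sketch.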

gro: Only use skb_gro_header for completely non-linear packets · 78d3fd0b

Committed by Herbert Xu on May 26, 2009

Currently skb_gro_header is used for packets which put the hardware
header in skb->data with the rest in frags.  Since the drivers that
need this optimisation all provide completely non-linear packets,
we can gain extra optimisations by only performing the frag0
optimisation for completely non-linear packets.

In particular, we can simply test frag0 (instead of skb_headlen)
to see whether the optimisation is in force.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>

78d3fd0b

gro: Inline skb_gro_header and cache frag0 virtual address · 78a478d0

Committed by Herbert Xu on May 26, 2009

The function skb_gro_header is called four times per packet which
quickly adds up at 10Gb/s.  This patch inlines it to allow better
optimisations.

Some architectures perform multiplication for page_address, which
is done by each skb_gro_header invocation.  This patch caches that
value in skb->cb to avoid the unnecessary multiplications.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>

78a478d0

26 May, 2009 1 commit

net: txq_trans_update() helper · 08baf561

Committed by Eric Dumazet on May 25, 2009

We would like to get rid of netdev->trans_start = jiffies; which nearly all net
drivers have to use in their start_xmit() function, and use txq->trans_start
instead.

This can be done generically in core network, as suggested by David.

Some devices (particularly loopback) don't need trans_start updates, because
they don't have a transmit watchdog. We could add a new device flag, or rely
on the fact that txq->trans_start can be updated if txq->xmit_lock_owner is
different from -1. Use a helper function to hide our choice.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

08baf561
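
The choice described in this message can be sketched in userspace C as follows. The struct and function names are hypothetical, and the `now` parameter stands in for the kernel's jiffies counter; the sketch only shows the rule "update trans_start when the queue has a lock owner", not the kernel helper itself.

```c
/* Hypothetical, simplified transmit queue; field names follow the message. */
struct txq_sketch {
	int xmit_lock_owner;		/* CPU holding the xmit lock, or -1 if none */
	unsigned long trans_start;	/* time of the last transmit */
};

/* Update trans_start only when some CPU owns the queue lock (owner != -1). */
static void txq_trans_update_sketch(struct txq_sketch *txq, unsigned long now)
{
	if (txq->xmit_lock_owner != -1)
		txq->trans_start = now;
}
```

Under this rule, a lockless device such as loopback never touches trans_start, so no extra device flag is needed.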

25 May, 2009 1 commit

net: remove COMPAT_NET_DEV_OPS · e3804cbe

Committed by Alexander Beregalov on May 25, 2009

All drivers are already converted to the new net_device_ops API
and nobody uses the old API anymore.
Signed-off-by: Alexander Beregalov <a.beregalov@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

e3804cbe

19 May, 2009 1 commit

net: add tx_packets/tx_bytes/tx_dropped counters in struct netdev_queue · 7004bf25

Committed by Eric Dumazet on May 18, 2009

offsetof(struct net_device, features)=0x44
offsetof(struct net_device, stats.tx_packets)=0x54
offsetof(struct net_device, stats.tx_bytes)=0x5c
offsetof(struct net_device, stats.tx_dropped)=0x6c

Network drivers that touch dev->stats.tx_packets/stats.tx_bytes in their
tx path can slow down SMP operations, since they dirty a cache line
that should stay shared (dev->features is needed in rx and tx paths).

We could move the stats field away in net_device but it won't help that much.
(Two cache lines dirtied in tx path, we can do one only.)

A better solution is to add tx_packets/tx_bytes/tx_dropped in struct
netdev_queue because this structure is already touched in tx path and
counter updates will then be free (no increase in size).
Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

7004bf25

18 May, 2009 1 commit

net: tx scalability works : trans_start · 9d21493b

Committed by Eric Dumazet on May 17, 2009

struct net_device trans_start field is a hot spot on SMP and high performance
devices, particularly multi-queue ones, because every transmitter dirties
it. Its main use is tx watchdog and bonding alive checks.

But as most devices don't use NETIF_F_LLTX, we have to lock
a netdev_queue before calling their ndo_start_xmit(). So it makes
sense to move trans_start from net_device to netdev_queue. Its update
will occur on an already present (and in exclusive state) cache line, for
free.

We can do this transition smoothly. An old driver continues to
update dev->trans_start, while an updated one updates txq->trans_start.

Further patches could also put tx_bytes/tx_packets counters in
netdev_queue to avoid dirtying dev->stats (vlan device comes to mind).
Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

9d21493b

07 May, 2009 1 commit

net: Add missing rculist.h include to netdevice.h · 4d5b78c0

Committed by David S. Miller on May 06, 2009

Otherwise list_for_each_entry_rcu() et al. aren't visible
and we get build failures in some configurations.
Signed-off-by: David S. Miller <davem@davemloft.net>

4d5b78c0

06 May, 2009 1 commit

net: introduce a list of device addresses dev_addr_list (v6) · f001fde5

Committed by Jiri Pirko on May 05, 2009

v5 -> v6 (current):
-removed so far unused static functions
-corrected dev_addr_del_multiple to call del instead of add

v4 -> v5:
-added device address type (suggested by davem)
-removed refcounting (better to have simpler code than to save potentially a few
 bytes)

v3 -> v4:
-changed kzalloc to kmalloc in __hw_addr_add_ii()
-ASSERT_RTNL() avoided in dev_addr_flush() and dev_addr_init()

v2 -> v3:
-removed unnecessary rcu read locking
-moved dev_addr_flush() calling to ensure no null dereference of dev_addr

v1 -> v2:
-added forgotten ASSERT_RTNL to dev_addr_init and dev_addr_flush
-removed unnecessary rcu_read locking in dev_addr_init
-use compare_ether_addr_64bits instead of compare_ether_addr
-use L1_CACHE_BYTES as size for allocating struct netdev_hw_addr
-use call_rcu instead of rcu_synchronize
-moved is_etherdev_addr into __KERNEL__ ifdef

This patch introduces a new list in struct net_device and brings a set of
functions to handle the work with the device address list. The list is a replacement
for the original dev_addr field, because in some situations there is a need to
carry several device addresses with the net device. To be backward compatible,
dev_addr is made to point to the first member of the list so original drivers
see no difference.
Signed-off-by: Jiri Pirko <jpirko@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

f001fde5
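
The backward-compatibility trick in the last paragraph (dev_addr pointing at the first list member) can be sketched like this. It is a simplified userspace illustration with hypothetical names; the real patch uses the kernel's list and RCU primitives and a `struct netdev_hw_addr`, none of which are modelled here.

```c
#include <stdlib.h>
#include <string.h>

#define ADDR_LEN_SKETCH 6

/* Hypothetical list node; the kernel's struct netdev_hw_addr differs. */
struct hw_addr_sketch {
	unsigned char addr[ADDR_LEN_SKETCH];
	struct hw_addr_sketch *next;
};

/* Hypothetical device with both the new list and the legacy pointer. */
struct addr_dev_sketch {
	struct hw_addr_sketch *dev_addr_list;	/* all device addresses */
	unsigned char *dev_addr;		/* legacy view: first address in the list */
};

/* Append an address; returns 0 on success, -1 on allocation failure. */
static int dev_addr_add_sketch(struct addr_dev_sketch *dev,
			       const unsigned char *addr)
{
	struct hw_addr_sketch **tail = &dev->dev_addr_list;
	struct hw_addr_sketch *ha = malloc(sizeof(*ha));

	if (!ha)
		return -1;
	memcpy(ha->addr, addr, ADDR_LEN_SKETCH);
	ha->next = NULL;
	while (*tail)
		tail = &(*tail)->next;
	*tail = ha;
	/* Keep the legacy pointer aimed at the first list member. */
	dev->dev_addr = dev->dev_addr_list->addr;
	return 0;
}
```

Code that only reads `dev_addr` keeps working unchanged, while newer code can walk `dev_addr_list` to see every address.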

28 Apr, 2009 2 commits

net: netif_tx_queue_stopped too expensive · 6a321cb3

Committed by Eric Dumazet on Apr 28, 2009

netif_tx_queue_stopped(txq) is most of the time false.

Yet it is very expensive on SMP.

static inline int netif_tx_queue_stopped(const struct netdev_queue *dev_queue)
{
	return test_bit(__QUEUE_STATE_XOFF, &dev_queue->state);
}

I saw this on oprofile hunting and bnx2 driver bnx2_tx_int().

We probably should split "struct netdev_queue" in two parts, one
being read mostly.

__netif_tx_lock() touches _xmit_lock & xmit_lock_owner, these
deserve a separate cache line.
Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

6a321cb3

sctp: add feature bit for SCTP offload in hardware · 8dc92f7e

Committed by Jesse Brandeburg on Apr 27, 2009

This is the SCTP code to enable hardware crc32c offload for
adapters that support it.

Originally by: Vlad Yasevich <vladislav.yasevich@hp.com>

Modified by Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

8dc92f7e

27 Apr, 2009 4 commits

net: Fix typo in net_device_ops description. · 37b607c5

Committed by Mike Rapoport on Apr 27, 2009

Signed-off-by: Mike Rapoport <mike@compulab.co.il>
Signed-off-by: David S. Miller <davem@davemloft.net>

37b607c5

gro: Fix COMPLETE checksum handling · 36e7b1b8

Committed by Herbert Xu on Apr 27, 2009

On a brand new GRO skb, we cannot call ip_hdr since the header
may lie in the non-linear area.  This patch adds the helper
skb_gro_network_header to handle this.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>

36e7b1b8

gro: Fix handling of headers that extend over the tail · edbd9e30

Committed by Herbert Xu on Apr 27, 2009

The skb_gro_* code fails to handle the case where a header starts
in the linear area but ends in the frags area.  Since the goal
of skb_gro_* is to optimise the case of completely non-linear
packets, we can simply bail out if we have anything in the linear
area.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>

edbd9e30

net: Fix LL_MAX_HEADER for CONFIG_TR_MODULE · c759a6b4

Committed by Adrian Bunk on Apr 27, 2009

Unless I miss anything this should fix a bug.
Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

c759a6b4

21 Apr, 2009 1 commit

net: factor out ethtool invocation of vlan/macvlan drivers · b1b67dd4

Committed by Patrick McHardy on Apr 20, 2009

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

b1b67dd4

16 Apr, 2009 1 commit

gro: New frags interface to avoid copying shinfo · 76620aaf

Committed by Herbert Xu on Apr 16, 2009

It turns out that copying a 16-byte area at ~800k times a second
can be really expensive :) This patch redesigns the frags GRO
interface to avoid copying that area twice.

The two disciples of the frags interface have been converted.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>

76620aaf

28 Mar, 2009 1 commit

net: Add missing include into include/linux/netdevice.h · cc0be322

Committed by Dmitri Vorobiev on Mar 27, 2009

The inline function skb_gro_mac_header defined in include/linux/netdevice.h
makes use of page_address(). Depending on configuration options, the latter
is either defined as a macro or is declared as a function in another header
file, namely include/linux/mm.h. However, include/linux/netdevice.h does not
include include/linux/mm.h.

On MIPS, this has produced the following build error:

  CC      kernel/sysctl_check.o
In file included from include/linux/icmpv6.h:173,
                 from include/linux/ipv6.h:208,
                 from include/net/ip_vs.h:26,
                 from kernel/sysctl_check.c:6:
include/linux/netdevice.h: In function 'skb_gro_mac_header':
include/linux/netdevice.h:1132: error: implicit declaration of function
'page_address'
include/linux/netdevice.h:1133: warning: pointer/integer type mismatch
in conditional expression
make[1]: *** [kernel/sysctl_check.o] Error 1
make: *** [kernel] Error 2

The patch adds the missing include and fixes the build error.
Signed-off-by: Dmitri Vorobiev <dmitri.vorobiev@movial.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

cc0be322

17 Mar, 2009 1 commit

GRO: Move netpoll checks to correct location · d1c76af9

Committed by Herbert Xu on Mar 16, 2009

As my netpoll fix for net doesn't really work for net-next, we
need this update to move the checks into the right place.  As it
stands we may pass freed skbs to netpoll_receive_skb.

This patch also introduces a netpoll_rx_on function to avoid GRO
completely if we're invoked through netpoll.  This might seem
paranoid but as netpoll may have an external receive hook it's
better to be safe than sorry.  I don't think we need this for
2.6.29 though since there's nothing immediately broken by it.

This patch also moves the GRO_* return values to netdevice.h since
VLAN needs them too (I tried to avoid this originally but alas
this seems to be the easiest way out).  This fixes a bug in VLAN
where it continued to use the old return value 2 instead of the
correct GRO_DROP.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>

d1c76af9

14 Mar, 2009 3 commits

[SCSI] net: add FCoE offload support through net_device · 4d288d57

Committed by Yi Zou on Feb 27, 2009

This adds support to provide Fiber Channel over Ethernet (FCoE) offload
through net_device's net_device_ops struct. The offload through net_device
for FCoE is enabled in kernel as built-in or module driver.
Signed-off-by: Yi Zou <yi.zou@intel.com>
Acked-by: David Miller <davem@davemloft.net>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>

4d288d57

[SCSI] net: define feature flags for FCoE offloads · 01d5b2fc

Committed by Chris Leech on Feb 27, 2009

Define feature flags for FCoE offloads.
Signed-off-by: Chris Leech <christopher.leech@intel.com>
Signed-off-by: Yi Zou <yi.zou@intel.com>
Acked-by: David Miller <davem@davemloft.net>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>

01d5b2fc

[SCSI] net: reclaim 8 upper bits of the netdev->features from GSO · 43eb99c5

Committed by Chris Leech on Feb 27, 2009

Reclaim 8 upper bits of netdev->features from GSO.
Signed-off-by: Chris Leech <christopher.leech@intel.com>
Signed-off-by: Yi Zou <yi.zou@intel.com>
Acked-by: David Miller <davem@davemloft.net>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>

43eb99c5

05 Mar, 2009 1 commit

vlan: Fix vlan-in-vlan crashes. · 9d40bbda

Committed by David S. Miller on Mar 04, 2009

As analyzed by Patrick McHardy, vlan needs to reset its
netdev_ops pointer in its ->init() function but this
leaves the compat method pointers stale.

Add a netdev_resync_ops() and call it from the vlan code.

Any other driver which changes ->netdev_ops after register_netdevice()
will need to call this new function after doing so too.

With help from Patrick McHardy.
Tested-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

9d40bbda

15 Feb, 2009 1 commit

net: replace __constant_{endian} uses in net headers · f3a7c66b

Committed by Harvey Harrison on Feb 14, 2009

Base versions handle constant folding now.  For headers exposed to
userspace, we must only expose the __ prefixed versions.
Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

f3a7c66b

09 Feb, 2009 2 commits

gro: Optimise Ethernet header comparison · aa4b9f53

Committed by Herbert Xu on Feb 08, 2009

This patch optimises the Ethernet header comparison to use 2-byte
and 4-byte xors instead of memcmp.  In order to facilitate this,
the actual comparison is now carried out by the callers of the
shared dev_gro_receive function.

This has a significant impact when receiving 1500B packets through
10GbE.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>

aa4b9f53
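
A comparison built from 4-byte and 2-byte XORs, as this message describes, might look roughly like the following. This is a portable userspace sketch with a hypothetical function name, not the kernel's implementation; it uses memcpy for the loads so that it stays safe on unaligned buffers, whereas kernel code can rely on known header alignment.

```c
#include <stdint.h>
#include <string.h>

/* Returns nonzero if the two 14-byte Ethernet headers differ, zero if equal. */
static unsigned int eth_header_differs_sketch(const unsigned char *a,
					      const unsigned char *b)
{
	uint32_t a32[3], b32[3];
	uint16_t a16, b16;

	/* memcpy keeps the loads alignment-safe in portable C. */
	memcpy(a32, a, sizeof(a32));		/* bytes 0..11: both MAC addresses */
	memcpy(b32, b, sizeof(b32));
	memcpy(&a16, a + 12, sizeof(a16));	/* bytes 12..13: protocol field */
	memcpy(&b16, b + 12, sizeof(b16));

	/* OR the XORs together: any differing bit leaves the result nonzero. */
	return (a32[0] ^ b32[0]) | (a32[1] ^ b32[1]) | (a32[2] ^ b32[2]) |
	       (unsigned int)(a16 ^ b16);
}
```

Unlike memcmp, the result carries no ordering information; it only answers "same or different", which is all a flow match needs.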

gro: Remember number of held packets instead of counting every time · 4ae5544f

Committed by Herbert Xu on Feb 08, 2009

This patch prepares for the move of the same_flow checks out of
dev_gro_receive.  As such we need to remember the number of held
packets since doing a loop just to count them every time is silly.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>

4ae5544f

06 Feb, 2009 1 commit

net: fix some trailing whitespaces · fe2918b0

Committed by Graf Yang on Feb 05, 2009

Signed-off-by: Graf Yang <graf.yang@analog.com>
Signed-off-by: Bryan Wu <cooloney@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

fe2918b0

30 Jan, 2009 2 commits

gro: Avoid copying headers of unmerged packets · 86911732

Committed by Herbert Xu on Jan 29, 2009

Unfortunately simplicity isn't always the best.  The fraginfo
interface turned out to be suboptimal.  The problem was quite
obvious.  For every packet, we have to copy the headers from
the frags structure into skb->head, even though for 99% of the
packets this part is immediately thrown away after the merge.

LRO didn't have this problem because it directly read the headers
from the frags structure.

This patch attempts to address this by creating an interface
that allows GRO to access the headers in the first frag without
having to copy it.  Because all drivers that use frags place the
headers in the first frag this optimisation should be enough.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>

86911732

gro: Move common completion code into helpers · 5d0d9be8

Committed by Herbert Xu on Jan 29, 2009

Currently VLAN still has a bit of common code handling the aftermath
of GRO that's shared with the common path.  This patch moves them
into shared helpers to reduce code duplication.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>

5d0d9be8

22 Jan, 2009 1 commit

net: Remove redundant NAPI functions · 288379f0

Committed by Ben Hutchings on Jan 19, 2009

Following the removal of the unused struct net_device * parameter from
the NAPI functions named *netif_rx_* in commit 908a7a16, they are
exactly equivalent to the corresponding *napi_* functions and are
therefore redundant.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

288379f0

15 Jan, 2009 1 commit

net: Add init_dummy_netdev() and fix EMAC driver using it · 937f1ba5

Committed by Benjamin Herrenschmidt on Jan 14, 2009

This adds an init_dummy_netdev() function that gets a network device
structure (allocation and lifetime entirely under caller's control) and
initializes the minimum amount of fields so it can be used to schedule
NAPI polls without registering a full blown interface. This is to be
used by drivers that need to tie several hardware interfaces to a single
NAPI poll scheduler due to HW limitations.

It also updates the ibm_newemac driver to use that, thus fixing the
oops on 2.6.29 due to passing NULL as "dev" to netif_napi_add().

Symbol is exported GPL only as I don't think we want binary drivers doing
that sort of acrobatics (if we want them at all).
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Tested-by: Geert Uytterhoeven <Geert.Uytterhoeven@sonycom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

937f1ba5

13 Jan, 2009 1 commit

net: Fix a comment in include/linux/netdevice.h. · 985ebdb5

Committed by Krzysztof Hałasa on Jan 12, 2009

Fix a comment in include/linux/netdevice.h.
Signed-off-by: Krzysztof Hałasa <khc@pm.waw.pl>
Signed-off-by: David S. Miller <davem@davemloft.net>

985ebdb5

07 Jan, 2009 2 commits

gro: Add internal interfaces for VLAN · 96e93eab

Committed by Herbert Xu on Jan 06, 2009

Previously GRO's only entry point from the outside is through
napi_gro_receive and napi_gro_frags.  These interfaces are for
device drivers.

This patch rearranges things to provide a new set of interfaces
for VLANs.  These interfaces are for internal use only.  The
VLAN code itself can then provide a set of entry points for
device drivers.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>

96e93eab

net_dma: convert to dma_find_channel · f67b4599

Committed by Dan Williams on Jan 06, 2009

Use the general-purpose channel allocation provided by dmaengine.
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>

f67b4599

05 Jan, 2009 1 commit

gro: Add page frag support · 5d38a079

Committed by Herbert Xu on Jan 04, 2009

This patch allows GRO to merge page frags (skb_shinfo(skb)->frags)
in one skb, rather than using the less efficient frag_list.

It also adds a new interface, napi_gro_frags to allow drivers
to inject page frags directly into the stack without allocating
an skb.  This is intended to be the GRO equivalent for LRO's
lro_receive_frags interface.

The existing GSO interface can already handle page frags with
or without an appended frag_list so nothing needs to be changed
there.

The merging itself is rather simple.  We store any new frag entries
after the last existing entry, without checking whether the first
new entry can be merged with the last existing entry.  Making this
check would actually be easy but since no existing driver can
produce contiguous frags anyway it would just be mental masturbation.

If the total number of entries would exceed the capacity of a
single skb, we simply resort to using frag_list as we do now.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>

5d38a079
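
The merging rule described above (append new frag entries after the last existing one, no coalescing, fall back when capacity would be exceeded) can be sketched as follows. All names and the capacity of 4 are hypothetical choices for illustration; the kernel's skb structures and MAX_SKB_FRAGS are not modelled.

```c
#include <stddef.h>

#define MAX_FRAGS_SKETCH 4	/* illustrative capacity, not MAX_SKB_FRAGS */

struct frag_sketch {
	const void *page;
	size_t offset;
	size_t size;
};

struct skb_sketch {
	struct frag_sketch frags[MAX_FRAGS_SKETCH];
	size_t nr_frags;
};

/*
 * Append all frags of src after the last frag of dst, with no attempt to
 * coalesce adjacent entries. Returns 0 on success, or -1 if the combined
 * count would exceed capacity (the caller would then fall back to
 * frag_list, which this sketch does not implement).
 */
static int merge_frags_sketch(struct skb_sketch *dst,
			      const struct skb_sketch *src)
{
	size_t i;

	if (dst->nr_frags + src->nr_frags > MAX_FRAGS_SKETCH)
		return -1;
	for (i = 0; i < src->nr_frags; i++)
		dst->frags[dst->nr_frags + i] = src->frags[i];
	dst->nr_frags += src->nr_frags;
	return 0;
}
```

On the failure path dst is left untouched, so the caller can safely choose another merging strategy.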

23 Dec, 2008 1 commit

net: Remove unused netdev arg from some NAPI interfaces. · 908a7a16

Committed by Neil Horman on Dec 22, 2008

When the napi api was changed to separate its 1:1 binding to the net_device
struct, the netif_rx_[prep|schedule|complete] api failed to remove the now
vestigial net_device structure parameter.  This patch cleans up that api by
properly removing it.
Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

908a7a16

16 Dec, 2008 2 commits

net: Add Generic Receive Offload infrastructure · d565b0a1

Committed by Herbert Xu on Dec 15, 2008

This patch adds the top-level GRO (Generic Receive Offload) infrastructure.
This is pretty similar to LRO except that this is protocol-independent.
Instead of holding packets in an lro_mgr structure, they're now held in
napi_struct.

For drivers that intend to use this, they can set the NETIF_F_GRO bit and
call napi_gro_receive instead of netif_receive_skb or just call netif_rx.
The latter will call napi_receive_skb automatically. When napi_gro_receive
is used, the driver must either call napi_complete/napi_rx_complete, or
call napi_gro_flush in softirq context if the driver uses the primitives
__napi_complete/__napi_rx_complete.

Protocols will set the gro_receive and gro_complete function pointers in
order to participate in this scheme.

In addition to the packet, gro_receive will get a list of currently held
packets. Each packet in the list has a same_flow field which is non-zero
if it is a potential match for the new packet. For each packet that may
match, they also have a flush field which is non-zero if the held packet
must not be merged with the new packet.

Once gro_receive has determined that the new skb matches a held packet,
the held packet may be processed immediately if the new skb cannot be
merged with it. In this case gro_receive should return the pointer to
the existing skb in gro_list. Otherwise the new skb should be merged into
the existing packet and NULL should be returned, unless the new skb makes
it impossible for any further merges to be made (e.g., FIN packet) where
the merged skb should be returned.

Whenever the skb is merged into an existing entry, the gro_receive
function should set NAPI_GRO_CB(skb)->same_flow. Note that if an skb
merely matches an existing entry but can't be merged with it, then
this shouldn't be set.

If gro_receive finds it pointless to hold the new skb for future merging,
it should set NAPI_GRO_CB(skb)->flush.

Held packets will be flushed by napi_gro_flush which is called by
napi_complete and napi_rx_complete.

Currently held packets are stored in a singly linked list just like LRO.
The list is limited to a maximum of 8 entries. In future, this may be
expanded to use a hash table to allow more flows to be held for merging.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>

d565b0a1

net: Add frag_list support to GSO · 1a881f27

Committed by Herbert Xu on Dec 15, 2008

This patch allows GSO to handle frag_list in a limited way for the
purposes of allowing packets merged by GRO to be refragmented on
output.

Most hardware won't (and aren't expected to) support handling GRO
frag_list packets directly.  Therefore we will perform GSO in
software for those cases.

However, for drivers that can support it (such as virtual NICs) we
may not have to segment the packets at all.

Whether the added overhead of GRO/GSO is worthwhile for bridges
and routers when weighed against the benefit of potentially
increasing the MTU within the host is still an open question.
However, for the case of host nodes this is undoubtedly a win.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>

1a881f27

10 Dec, 2008 1 commit

netpoll: fix race on poll_list resulting in garbage entry · 7b363e44

Committed by Neil Horman on Dec 09, 2008

A few months back a race was discussed between the netpoll napi service
path, and the fast path through net_rx_action:
http://kerneltrap.org/mailarchive/linux-netdev/2007/10/16/345470

A patch was submitted for that bug, but I think we missed a case.

Consider the following scenario:

INITIAL STATE
CPU0 has one napi_struct A on its poll_list
CPU1 is calling netpoll_send_skb and needs to call poll_napi on the same
napi_struct A that CPU0 has on its list



CPU0						CPU1
net_rx_action					poll_napi
!list_empty (returns true)			locks poll_lock for A
						 poll_one_napi
						  napi->poll
						   netif_rx_complete
						    __napi_complete
						    (removes A from poll_list)
list_entry(list->next)


In the above scenario, net_rx_action assumes that the per-cpu poll_list is
exclusive to that cpu.  netpoll of course violates that, and because the netpoll
path can dequeue from the poll list, it's possible for CPU0 to detect a non-empty
list at the top of the while loop in net_rx_action, but have it become empty by
the time it calls list_entry.  Since the poll_list isn't surrounded by any other
structure, the returned data from that list_entry call in this situation is
garbage, and any number of crashes can result based on what exactly that garbage
is.

Given that it's not feasible for performance reasons to place exclusive locks
around each cpu's poll list to provide that mutual exclusion, I think the best
solution is to modify the netpoll path in such a way that we continue to guarantee
that the poll_list for a cpu is in fact exclusive to that cpu.  To do this I've
implemented the patch below.  It adds an additional bit to the state field in
the napi_struct.  When executing napi->poll from the netpoll_path, this bit will
be set. When a driver calls netif_rx_complete, if that bit is set, it will not
remove the napi_struct from the poll_list.  That work will be saved for the next
iteration of net_rx_action.

I've tested this and it seems to work well.  About the biggest drawback I can
see to it is the fact that it might result in an extra loop through
net_rx_action in the event that the device is actually contended for (i.e. the
netpoll path actually performs all the needed work on the device, and the call
to net_rx_action winds up doing nothing, except removing the napi_struct from
the poll_list).  However I think this is probably a small price to pay, given
that the alternative is a crash.
Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

7b363e44
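
The state-bit approach in this message can be sketched as follows. The bit names, struct, and function are hypothetical, and the sketch is single-threaded with plain integer operations; the kernel uses atomic bit operations on the real napi_struct state field.

```c
#include <stdbool.h>

/* Hypothetical state bits; the real napi_struct uses its own bit names. */
#define NAPI_SKETCH_ON_POLL_LIST (1u << 0)
#define NAPI_SKETCH_NPSVC        (1u << 1)	/* set while netpoll runs ->poll */

struct napi_sketch {
	unsigned int state;
};

/*
 * Completion path: take the entry off the poll list unless the netpoll
 * path is currently servicing it. Returns true if the entry was removed.
 */
static bool napi_complete_sketch(struct napi_sketch *n)
{
	if (n->state & NAPI_SKETCH_NPSVC)
		return false;	/* leave removal to the next net_rx_action pass */
	n->state &= ~NAPI_SKETCH_ON_POLL_LIST;
	return true;
}
```

Deferring the removal keeps the per-cpu poll_list modified only by its owning CPU, at the cost of one possibly empty pass through net_rx_action.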

08 Dec, 2008 1 commit

netdevice: Kill netdev->priv · b74ca3a8

Committed by Wang Chen on Dec 08, 2008

This is the last patch of this series.
After removing all direct references to netdev->priv, I am killing
"priv" of "struct net_device" and fixing related comments/docs.

Nobody will be allowed to reference netdev->priv directly.
If you want to reference the memory of private data, use netdev_priv()
instead.
If the private data is not allocated at alloc_netdev() time, use
netdev->ml_priv to point to that memory after you create that private
data.
Signed-off-by: Wang Chen <wangchen@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

b74ca3a8
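
Why an accessor can replace a stored priv pointer is easiest to see in a sketch where the private area is allocated in the same block as the device, at a fixed aligned offset. Everything below is a hypothetical userspace illustration: the struct, the function names, and the alignment of 32 are assumptions, not the kernel definitions.

```c
#include <stdlib.h>
#include <stddef.h>

#define PRIV_ALIGN_SKETCH 32	/* illustrative alignment, an assumption */

/* Hypothetical, heavily trimmed device structure. */
struct priv_dev_sketch {
	char name[16];
	void *ml_priv;	/* for private data the driver allocates separately */
};

/* Size of the device struct rounded up to the private-area alignment. */
static size_t priv_offset_sketch(void)
{
	return (sizeof(struct priv_dev_sketch) + PRIV_ALIGN_SKETCH - 1) &
	       ~(size_t)(PRIV_ALIGN_SKETCH - 1);
}

/* Allocate the device and its private area in one zeroed block. */
static struct priv_dev_sketch *alloc_netdev_sketch(size_t sizeof_priv)
{
	return calloc(1, priv_offset_sketch() + sizeof_priv);
}

/* Accessor: computes the private area's address instead of storing it. */
static void *netdev_priv_sketch(struct priv_dev_sketch *dev)
{
	return (char *)dev + priv_offset_sketch();
}
```

Because the address is computed from the device pointer, no separate priv field is needed, and `ml_priv` remains available for data that is allocated later.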

OpenHarmony / kernel_linux · last synced more than 3 years ago
