提交 · 01d5b2fca1fa58ed5039239fd531e9f658971ace · openeuler / raspberrypi-kernel

01 3月, 2009 1 次提交

netpoll: Add drop checks to all entry points · 4ead4431

由 Herbert Xu 提交于 3月 01, 2009

The netpoll entry checks are required to ensure that we don't
receive normal packets when invoked via netpoll.  Unfortunately
it only ever worked for the netif_receive_skb/netif_rx entry
points.  The VLAN (and subsequently GRO) entry point didn't
have the check and therefore can trigger all sorts of weird
problems.

This patch adds the netpoll check to all entry points.

I'm still uneasy with receiving at all under netpoll (which
apparently is only used by the out-of-tree kdump code).  The
reason is it is perfectly legal to receive all data including
headers into highmem if netpoll is off, but if you try to do
that with netpoll on and someone gets a printk in an IRQ handler                                             
you're going to get a nice BUG_ON.
Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

4ead4431

24 2月, 2009 2 次提交

net: amend the fix for SO_BSDCOMPAT gsopt infoleak · 50fee1de

由 Eugene Teo 提交于 2月 23, 2009

The fix for CVE-2009-0676 (upstream commit df0bca04) is incomplete. Note
that the same problem of leaking kernel memory will reappear if someone
on some architecture uses struct timeval with some internal padding (for
example tv_sec 64-bit and tv_usec 32-bit) --- then, you are going to
leak the padded bytes to userspace.
Signed-off-by: NEugene Teo <eugeneteo@kernel.sg>
Reported-by: NMikulas Patocka <mpatocka@redhat.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

50fee1de

netns: build fix for net_alloc_generic · ebe47d47

由 Clemens Noss 提交于 2月 23, 2009

net_alloc_generic was defined in #ifdef CONFIG_NET_NS, but used
unconditionally. Move net_alloc_generic out of #ifdef.
Signed-off-by: NClemens Noss <cnoss@gmx.de>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

ebe47d47

22 2月, 2009 1 次提交

netns: fix double free at netns creation · 486a87f1

由 Daniel Lezcano 提交于 2月 22, 2009

This patch fix a double free when a network namespace fails.
The previous code does a kfree of the net_generic structure when
one of the init subsystem initialization fails.
The 'setup_net' function does kfree(ng) and returns an error.
The caller, 'copy_net_ns', call net_free on error, and this one
calls kfree(net->gen), making this pointer freed twice.

This patch make the code symetric, the net_alloc does the net_generic
allocation and the net_free frees the net_generic.
Signed-off-by: NDaniel Lezcano <daniel.lezcano@free.fr>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

486a87f1

18 2月, 2009 1 次提交

net: Kill skb_truesize_check(), it only catches false-positives. · 92a0acce

由 David S. Miller 提交于 2月 17, 2009

A long time ago we had bugs, primarily in TCP, where we would modify
skb->truesize (for TSO queue collapsing) in ways which would corrupt
the socket memory accounting.

skb_truesize_check() was added in order to try and catch this error
more systematically.

However this debugging check has morphed into a Frankenstein of sorts
and these days it does nothing other than catch false-positives.
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

92a0acce

13 2月, 2009 1 次提交

net: 4 bytes kernel memory disclosure in SO_BSDCOMPAT gsopt try #2 · df0bca04

由 Clément Lecigne 提交于 2月 12, 2009

In function sock_getsockopt() located in net/core/sock.c, optval v.val
is not correctly initialized and directly returned in userland in case
we have SO_BSDCOMPAT option set.

This dummy code should trigger the bug:

int main(void)
{
	unsigned char buf[4] = { 0, 0, 0, 0 };
	int len;
	int sock;
	sock = socket(33, 2, 2);
	getsockopt(sock, 1, SO_BSDCOMPAT, &buf, &len);
	printf("%x%x%x%x\n", buf[0], buf[1], buf[2], buf[3]);
	close(sock);
}

Here is a patch that fix this bug by initalizing v.val just after its
declaration.
Signed-off-by: NClément Lecigne <clement.lecigne@netasq.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

df0bca04

07 2月, 2009 1 次提交

net_dma: call dmaengine_get only if NET_DMA enabled · b4bd07c2

由 David S. Miller 提交于 2月 06, 2009

Based upon a patch from Atsushi Nemoto <anemo@mba.ocn.ne.jp>

--------------------
The commit 649274d9 ("net_dma:
acquire/release dma channels on ifup/ifdown") added unconditional call
of dmaengine_get() to net_dma.  The API should be called only if
NET_DMA was enabled.
--------------------
Signed-off-by: NDavid S. Miller <davem@davemloft.net>
Acked-by: NDan Williams <dan.j.williams@intel.com>

b4bd07c2

06 2月, 2009 1 次提交

neigh: some entries can be skipped during dumping · efc683fc

由 Gautam Kachroo 提交于 2月 06, 2009

neightbl_dump_info and neigh_dump_table  can skip entries if the
*fill*info functions return an error. This results in an incomplete
dump ((invoked by netlink requests for RTM_GETNEIGHTBL or
RTM_GETNEIGH)

nidx and idx should not be incremented if the current entry was not
placed in the output buffer
Signed-off-by: NGautam Kachroo <gk@aristanetworks.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

efc683fc

30 1月, 2009 2 次提交

net: Fix OOPS in skb_seq_read(). · 71b3346d

由 Shyam Iyer 提交于 1月 29, 2009

It oopsd for me in skb_seq_read. addr2line said it was
linux-2.6/net/core/skbuff.c:2228, which is this line:


	while (st->frag_idx < skb_shinfo(st->cur_skb)->nr_frags) {


I added some printks in there and it looks like we hit this:

        } else if (st->root_skb == st->cur_skb &&
                   skb_shinfo(st->root_skb)->frag_list) {
                 st->cur_skb = skb_shinfo(st->root_skb)->frag_list;
                 st->frag_idx = 0;
                 goto next_skb;
        }



Actually I did some testing and added a few printks and found that the
st->cur_skb->data was 0 and hence the ptr used by iscsi_tcp was null.
This caused the kernel panic.

 	if (abs_offset < block_limit) {
-		*data = st->cur_skb->data + abs_offset;
+		*data = st->cur_skb->data + (abs_offset - st->stepped_offset);

I enabled the debug_tcp and with a few printks found that the code did
not go to the next_skb label and could find that the sequence being
followed was this -

It hit this if condition -

        if (st->cur_skb->next) {
                st->cur_skb = st->cur_skb->next;
                st->frag_idx = 0;
                goto next_skb;

And so, now the st pointer is shifted to the next skb whereas actually
it should have hit the second else if first since the data is in the
frag_list.

        else if (st->root_skb == st->cur_skb &&
                 skb_shinfo(st->root_skb)->frag_list) {
                st->cur_skb = skb_shinfo(st->root_skb)->frag_list;
                goto next_skb;
        }

Reversing the two conditions the attached patch fixes the issue for me
on top of Herbert's patches. 
Signed-off-by: NShyam Iyer <shyam_iyer@dell.com>
Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

71b3346d

net: Fix frag_list handling in skb_seq_read · 95e3b24c

由 Herbert Xu 提交于 1月 29, 2009

The frag_list handling was broken in skb_seq_read:

1) We didn't add the stepped offset when looking at the head
are of fragments other than the first.

2) We didn't take the stepped offset away when setting the data
pointer in the head area.

3) The frag index wasn't reset.

This patch fixes both issues.
Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

95e3b24c

21 1月, 2009 3 次提交

gro: Fix merging of paged packets · 37fe4732

由 Herbert Xu 提交于 1月 17, 2009

The previous fix to paged packets broke the merging because it
reset the skb->len before we added it to the merged packet.  This
wasn't detected because it simply resulted in the truncation of
the packet while the missing bit is subsequently retransmitted.

The fix is to store skb->len before we clobber it.
Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

37fe4732

gro: Fix error handling on extremely short frags · 9a8e47ff

由 Herbert Xu 提交于 1月 17, 2009

When a frag is shorter than an Ethernet header, we'd return a
zeroed packet instead of aborting.  This patch fixes that.
Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

9a8e47ff

NET: net_namespace, fix lock imbalance · 357f5b0b

由 Jiri Slaby 提交于 1月 17, 2009

register_pernet_gen_subsys omits mutex_unlock in one fail path.
Fix it.
Signed-off-by: NJiri Slaby <jirislaby@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

357f5b0b

20 1月, 2009 2 次提交

net: Fix data corruption when splicing from sockets. · 8b9d3728

由 Jarek Poplawski 提交于 1月 19, 2009

The trick in socket splicing where we try to convert the skb->data
into a page based reference using virt_to_page() does not work so
well.

The idea is to pass the virt_to_page() reference via the pipe
buffer, and refcount the buffer using a SKB reference.

But if we are splicing from a socket to a socket (via sendpage)
this doesn't work.

The from side processing will grab the page (and SKB) references.
The sendpage() calls will grab page references only, return, and
then the from side processing completes and drops the SKB ref.

The page based reference to skb->data is not enough to keep the
kmalloc() buffer backing it from being reused.  Yet, that is
all that the socket send side has at this point.

This leads to data corruption if the skb->data buffer is reused
by SLAB before the send side socket actually gets the TX packet
out to the device.

The fix employed here is to simply allocate a page and copy the
skb->data bytes into that page.

This will hurt performance, but there is no clear way to fix this
properly without a copy at the present time, and it is important
to get rid of the data corruption.

With fixes from Herbert Xu.
Tested-by: NWilly Tarreau <w@1wt.eu>
Foreseen-by: NChangli Gao <xiaosuo@gmail.com>
Diagnosed-by: NWilly Tarreau <w@1wt.eu>
Reported-by: NWilly Tarreau <w@1wt.eu>
Fixed-by: NJens Axboe <jens.axboe@oracle.com>
Signed-off-by: NJarek Poplawski <jarkao2@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

8b9d3728

net: Add debug info to track down GSO checksum bug · 67fd1a73

由 Herbert Xu 提交于 1月 19, 2009

I'm trying to track down why people're hitting the checksum warning
in skb_gso_segment.  As the problem seems to be hitting lots of
people and I can't reproduce it or locate the bug, here is a patch
to print out more details which hopefully should help us to track
this down.
Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

67fd1a73

15 1月, 2009 3 次提交

net: Add init_dummy_netdev() and fix EMAC driver using it · 937f1ba5

由 Benjamin Herrenschmidt 提交于 1月 14, 2009

This adds an init_dummy_netdev() function that gets a network device
structure (allocation and lifetime entirely under caller's control) and
initialize the minimum amount of fields so it can be used to schedule
NAPI polls without registering a full blown interface. This is to be
used by drivers that need to tie several hardware interfaces to a single
NAPI poll scheduler due to HW limitations.

It also updates the ibm_newemac driver to use that, this fixing the
oops on 2.6.29 due to passing NULL as "dev" to netif_napi_add()

Symbol is exported GPL only a I don't think we want binary drivers doing
that sort of acrobatics (if we want them at all).
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
Tested-by: NGeert Uytterhoeven <Geert.Uytterhoeven@sonycom.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

937f1ba5

gro: Fix page ref count for skbs freed normally · f5572068

由 Herbert Xu 提交于 1月 14, 2009

When an skb with page frags is merged into an existing one, we
cannibalise its reference count.  This is OK when the skb is
reused because we set nr_frags to zero in that case.  However,
for the case where the skb is freed through kfree_skb, we didn't
clear nr_frags which causes the page to be freed prematurely.

This is fixed by moving the skb resetting into skb_gro_receive.
Reported-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

f5572068

gro: Check for GSO packets and packets with frag_list · f17f5c91

由 Herbert Xu 提交于 1月 14, 2009

As GRO cannot be applied to packets with frag_list we need to
make sure that we reject such packets if they are fed to us,
e.g., through a tunnel device.

Also there is no point in applying GRO on GSO packets so they
too should be rejected.  This allows GRO to be used in virtio-net
which may produce GSO packets directly but may still benefit
from GRO if the other end of it doesn't support GSO.
Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

f17f5c91

11 1月, 2009 1 次提交

net_dma: acquire/release dma channels on ifup/ifdown · 649274d9

由 Dan Williams 提交于 1月 11, 2009

The recent dmaengine rework removed the capability to remove dma device
driver modules while net_dma is active. Rather than notify
dmaengine-clients that channels are trying to be removed, we now rely on
clients to notify dmaengine when they no longer have a need for
channels. Teach net_dma to release channels by taking dmaengine
references at netdevice open and dropping references at netdevice close.
Acked-by: NMaciej Sosnowski <maciej.sosnowski@intel.com>
Signed-off-by: NDan Williams <dan.j.williams@intel.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

649274d9

07 1月, 2009 5 次提交

gro: Add internal interfaces for VLAN · 96e93eab

由 Herbert Xu 提交于 1月 06, 2009

Previously GRO's only entry point from the outside is through
napi_gro_receive and napi_gro_frags.  These interfaces are for
device drivers.

This patch rearranges things to provide a new set of interfaces
for VLANs.  These interfaces are for internal use only.  The
VLAN code itself can then provide a set of entry points for
device drivers.
Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

96e93eab

dmaengine: kill struct dma_client and supporting infrastructure · aa1e6f1a

由 Dan Williams 提交于 1月 06, 2009

All users have been converted to either the general-purpose allocator,
dma_find_channel, or dma_request_channel.
Reviewed-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NDan Williams <dan.j.williams@intel.com>

aa1e6f1a

dmaengine: replace dma_async_client_register with dmaengine_get · 209b84a8

由 Dan Williams 提交于 1月 06, 2009

Now that clients no longer need to be notified of channel arrival
dma_async_client_register can simply increment the dmaengine_ref_count.
Reviewed-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NDan Williams <dan.j.williams@intel.com>

209b84a8

net_dma: convert to dma_find_channel · f67b4599

由 Dan Williams 提交于 1月 06, 2009

Use the general-purpose channel allocation provided by dmaengine.
Reviewed-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NDan Williams <dan.j.williams@intel.com>

f67b4599

dmaengine: provide a common 'issue_pending_all' implementation · 2ba05622

由 Dan Williams 提交于 1月 06, 2009

async_tx and net_dma each have open-coded versions of issue_pending_all,
so provide a common routine in dmaengine.

The implementation needs to walk the global device list, so implement
rcu to allow dma_issue_pending_all to run lockless. Clients protect
themselves from channel removal events by holding a dmaengine reference.
Reviewed-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NDan Williams <dan.j.williams@intel.com>

2ba05622

06 1月, 2009 1 次提交

Revert "net: Fix for initial link state in 2.6.28" · c276e098

由 David S. Miller 提交于 1月 05, 2009

This reverts commit 22604c86.

We can't fix this issue in this way, because we now can try
to take the dev_base_lock rwlock as a writer in software interrupt
context and that is not allowed without major surgery elsewhere.

This initial link state problem needs to be solved in some other
way.
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

c276e098

05 1月, 2009 3 次提交

net: Fix for initial link state in 2.6.28 · 22604c86

由 Michael Marineau 提交于 1月 04, 2009

From: Michael Marineau <mike@marineau.org>

Commit b4730016 "Do not fire linkwatch
events until the device is registered." was made as a workaround for
drivers that call netif_carrier_off before registering the device.
Unfortunately this causes these drivers to incorrectly report their
link status as IF_OPER_UNKNOWN which can falsely set the IFF_RUNNING
flag when the interface is first brought up. This issues was
previously pointed out[1] but was dismissed saying that IFF_RUNNING is
not related to the link status. From my digging IFF_RUNNING, as
reported to userspace, is based on the link state. It is set based on
__LINK_STATE_START and IF_OPER_UP or IF_OPER_UNKNOWN. See [2], [3],
and [4]. (Whether or not the kernel has IFF_RUNNING set in flags is
not reported to user space so it may well be independent of the link,
I don't know if and when it may get set.)

The end result depends slightly depending on the driver. The the two I
tested were e1000e and b44. With e1000e if the system is booted
without a network cable attached the interface will falsely report
RUNNING when it is brought up causing NetworkManager to attempt to
start it and eventually time out. With b44 when the system is booted
with a network cable attached and brought up with dhcpcd it will time
out the first time.

The attached patch that will still set the operstate variable
correctly to IF_OPER_UP/DOWN/etc when linkwatch_fire_event is called
but then return rather than skipping the linkwatch_fire_event call
entirely as the previous fix did. (sorry it isn't inline, I don't have
a patch friendly email client at the moment)
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

22604c86

gro: Add page frag support · 5d38a079

由 Herbert Xu 提交于 1月 04, 2009

This patch allows GRO to merge page frags (skb_shinfo(skb)->frags)
in one skb, rather than using the less efficient frag_list.

It also adds a new interface, napi_gro_frags to allow drivers
to inject page frags directly into the stack without allocating
an skb.  This is intended to be the GRO equivalent for LRO's
lro_receive_frags interface.

The existing GSO interface can already handle page frags with
or without an appended frag_list so nothing needs to be changed
there.

The merging itself is rather simple.  We store any new frag entries
after the last existing entry, without checking whether the first
new entry can be merged with the last existing entry.  Making this
check would actually be easy but since no existing driver can
produce contiguous frags anyway it would just be mental masturbation.

If the total number of entries would exceed the capacity of a
single skb, we simply resort to using frag_list as we do now.
Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

5d38a079

gro: Use gso_size to store MSS · b530256d

由 Herbert Xu 提交于 1月 04, 2009

In order to allow GRO packets without frag_list at all, we need to
store the MSS in the packet itself.  The obvious place is gso_size.
The only thing to watch out for is if the packet ends up not being
GRO then we need to clear gso_size before pushing the packet into
the stack.
Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

b530256d

30 12月, 2008 2 次提交

cpumask: prepare for iterators to only go to nr_cpu_ids/nr_cpumask_bits: net · 0f23174a

由 Rusty Russell 提交于 12月 29, 2008

In future all cpumask ops will only be valid (in general) for bit
numbers < nr_cpu_ids.  So use that instead of NR_CPUS in iterators
and other comparisons.

This is always safe: no cpu number can be >= nr_cpu_ids, and
nr_cpu_ids is initialized to NR_CPUS at boot.
Signed-off-by: NRusty Russell <rusty@rustcorp.com.au>
Signed-off-by: NMike Travis <travis@sgi.com>
Acked-by: NIngo Molnar <mingo@elte.hu>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

0f23174a

netns: foreach_netdev_safe is insufficient in default_device_exit · 8eb79863

由 Eric W. Biederman 提交于 12月 29, 2008

During network namespace teardown we either move or delete
all of the network devices associated with a network namespace.
In the case of veth devices deleting one will also delete it's
pair device.  If both devices are in the same network namespace
then for_each_netdev_safe is insufficient as next may point
to the second veth device we have deleted.

To avoid problems I do what we do in __rtnl_kill_links and
restart the scan of the device list, after we have deleted
a device.

Currently dev_change_netnamespace does not appear to suffer from
this problem, but wireless devices are also paired and likely
should be moved between network namespaces together.  So I have
errored on the side of caution and restart the scan of the network
devices in that case as well.
Signed-off-by: NEric W. Biederman <ebiederm@aristanetworks.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

8eb79863

27 12月, 2008 1 次提交

gro: Fix potential use after free · 0da2afd5

由 Herbert Xu 提交于 12月 26, 2008

The initial skb may have been freed after napi_gro_complete in
napi_gro_receive if it was merged into an existing packet.  Thus
we cannot check same_flow (which indicates whether it was merged)
after calling napi_gro_complete.

This patch fixes this by saving the same_flow status before the
call to napi_gro_complete.
Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

0da2afd5

26 12月, 2008 1 次提交

net: Init NAPI dev_list on napi_del · d7b06636

由 Peter P Waskiewicz Jr 提交于 12月 26, 2008

The recent GRO patches introduced the NAPI removal of devices in
free_netdev. For drivers that can change the number of queues during
driver operation, the NAPI infrastructure doesn't allow the freeing and
re-addition of NAPI entities without reloading the driver.

This change reinitializes the dev_list in each NAPI struct on delete,
instead of just deleting it (and assigning the list pointers to POISON).
Drivers that wish to remove/re-add NAPI will need to re-initialize the
netdev napi_list after removing all NAPI instances, before re-adding NAPI
devices again.
Signed-off-by: NPeter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com>
Signed-off-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

d7b06636

23 12月, 2008 1 次提交

net: Fix oops in dev_ifsioc() · 5f2f6da7

由 Jarek Poplawski 提交于 12月 22, 2008

A command like this: "brctl addif br1 eth1" issued as a user gave me
an oops when bridge module wasn't loaded. It's caused by using a dev
pointer before checking for NULL.
Signed-off-by: NJarek Poplawski <jarkao2@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

5f2f6da7

18 12月, 2008 3 次提交

Revert "net: release skb->dst in sock_queue_rcv_skb()" · 49ad9599

由 David S. Miller 提交于 12月 17, 2008

This reverts commit 70355602.

As pointed out by Mark McLoughlin IP_PKTINFO cmsg data is one
post-queueing user, so this optimization is not valid right
now.
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

49ad9599

Phonet: allocate separate ARP type for GPRS over a Phonet pipe · 57c81fff

由 Rémi Denis-Courmont 提交于 12月 17, 2008

A separate xmit lock class supports GPRS over a Phonet pipe over a TUN
device (type ARPHRD_NONE).
Signed-off-by: NRémi Denis-Courmont <remi.denis-courmont@nokia.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

57c81fff

Phonet: allocate a non-Ethernet ARP type · 2d91d78b

由 Rémi Denis-Courmont 提交于 12月 17, 2008

Also leave some room for more 802.11 types.
Signed-off-by: NRémi Denis-Courmont <remi.denis-courmont@nokia.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

2d91d78b

16 12月, 2008 4 次提交

ethtool: Add GGRO and SGRO ops · b240a0e5

由 Herbert Xu 提交于 12月 15, 2008

This patch adds the ethtool ops to enable and disable GRO.  It also
makes GRO depend on RX checksum offload much the same as how TSO
depends on SG support.
Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

b240a0e5

net: Add skb_gro_receive · 71d93b39

由 Herbert Xu 提交于 12月 15, 2008

This patch adds the helper skb_gro_receive to merge packets for
GRO.  The current method is to allocate a new header skb and then
chain the original packets to its frag_list.  This is done to
make it easier to integrate into the existing GSO framework.

In future as GSO is moved into the drivers, we can undo this and
simply chain the original packets together.
Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

71d93b39

net: Add Generic Receive Offload infrastructure · d565b0a1

由 Herbert Xu 提交于 12月 15, 2008

This patch adds the top-level GRO (Generic Receive Offload) infrastructure.
This is pretty similar to LRO except that this is protocol-independent.
Instead of holding packets in an lro_mgr structure, they're now held in
napi_struct.

For drivers that intend to use this, they can set the NETIF_F_GRO bit and
call napi_gro_receive instead of netif_receive_skb or just call netif_rx.
The latter will call napi_receive_skb automatically. When napi_gro_receive
is used, the driver must either call napi_complete/napi_rx_complete, or
call napi_gro_flush in softirq context if the driver uses the primitives
__napi_complete/__napi_rx_complete.

Protocols will set the gro_receive and gro_complete function pointers in
order to participate in this scheme.

In addition to the packet, gro_receive will get a list of currently held
packets. Each packet in the list has a same_flow field which is non-zero
if it is a potential match for the new packet. For each packet that may
match, they also have a flush field which is non-zero if the held packet
must not be merged with the new packet.

Once gro_receive has determined that the new skb matches a held packet,
the held packet may be processed immediately if the new skb cannot be
merged with it. In this case gro_receive should return the pointer to
the existing skb in gro_list. Otherwise the new skb should be merged into
the existing packet and NULL should be returned, unless the new skb makes
it impossible for any further merges to be made (e.g., FIN packet) where
the merged skb should be returned.

Whenever the skb is merged into an existing entry, the gro_receive
function should set NAPI_GRO_CB(skb)->same_flow. Note that if an skb
merely matches an existing entry but can't be merged with it, then
this shouldn't be set.

If gro_receive finds it pointless to hold the new skb for future merging,
it should set NAPI_GRO_CB(skb)->flush.

Held packets will be flushed by napi_gro_flush which is called by
napi_complete and napi_rx_complete.

Currently held packets are stored in a singly liked list just like LRO.
The list is limited to a maximum of 8 entries. In future, this may be
expanded to use a hash table to allow more flows to be held for merging.
Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

d565b0a1

net: Add frag_list support to GSO · 1a881f27

由 Herbert Xu 提交于 12月 15, 2008

This patch allows GSO to handle frag_list in a limited way for the
purposes of allowing packets merged by GRO to be refragmented on
output.

Most hardware won't (and aren't expected to) support handling GRO
frag_list packets directly.  Therefore we will perform GSO in
software for those cases.

However, for drivers that can support it (such as virtual NICs) we
may not have to segment the packets at all.

Whether the added overhead of GRO/GSO is worthwhile for bridges
and routers when weighed against the benefit of potentially
increasing the MTU within the host is still an open question.
However, for the case of host nodes this is undoubtedly a win.
Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

1a881f27