Commits · 073a24364fe6de7eef0a3dec0ec7d48e56624092 · openeuler / raspberrypi-kernel

27 Jan, 2009 · 1 commit

virtio_net: use correct accessors for scatterlists · 8527bec5

Authored by Ira W. Snyder on Jan 26, 2009

Without this fix, virtio_net makes incorrect usage of scatterlists. It sets
the end of the scatterlist chain after the first element, despite the fact
that more entries come after it.

If you try to run dma_map_sg() on one of the scatterlists given to you by
add_buf(), you will get a null pointer oops.
Signed-off-by: Ira W. Snyder <iws@ovro.caltech.edu>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>

8527bec5
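
The failure mode described in 8527bec5 can be modelled in user space. This is only a sketch: `toy_sg` and the `toy_sg_*` helpers are hypothetical stand-ins, not the kernel scatterlist API. They model one idea, namely that anything walking the list stops at the entry carrying the end marker, so marking element 0 as the end hides every entry after it.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy scatterlist entry. The kernel encodes the end marker inside the
 * entry itself; a plain flag is used here for clarity. */
struct toy_sg {
    const void *buf;
    size_t len;
    int is_end;
};

/* Initialise a whole table at once: only the last entry is the end. */
static void toy_sg_init_table(struct toy_sg *sg, size_t nents)
{
    assert(nents > 0);
    memset(sg, 0, nents * sizeof(*sg));
    sg[nents - 1].is_end = 1;
}

/* Initialise a single-entry list: that entry is marked as the end. */
static void toy_sg_init_one(struct toy_sg *sg, const void *buf, size_t len)
{
    toy_sg_init_table(sg, 1);
    sg->buf = buf;
    sg->len = len;
}

/* Fill an entry without touching the end marker. */
static void toy_sg_set_buf(struct toy_sg *sg, const void *buf, size_t len)
{
    sg->buf = buf;
    sg->len = len;
}

/* Walk the list the way a mapping routine would: stop at the end marker. */
static size_t toy_sg_count(const struct toy_sg *sg, size_t max)
{
    size_t n = 0;
    while (n < max) {
        n++;
        if (sg[n - 1].is_end)
            break;
    }
    return n;
}

/* Buggy pattern: "init one" on element 0 of a longer array. */
static size_t toy_buggy_entries(void)
{
    static const char hdr[10], data[64];
    struct toy_sg sg[2];
    memset(sg, 0, sizeof(sg));
    toy_sg_init_one(&sg[0], hdr, sizeof(hdr));
    toy_sg_set_buf(&sg[1], data, sizeof(data));
    return toy_sg_count(sg, 2);
}

/* Fixed pattern: initialise the table, then fill each entry. */
static size_t toy_fixed_entries(void)
{
    static const char hdr[10], data[64];
    struct toy_sg sg[2];
    toy_sg_init_table(sg, 2);
    toy_sg_set_buf(&sg[0], hdr, sizeof(hdr));
    toy_sg_set_buf(&sg[1], data, sizeof(data));
    return toy_sg_count(sg, 2);
}
```

In this model the buggy pattern exposes one entry to the walker and the fixed pattern exposes both.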

26 Jan, 2009 · 1 commit

virtio_net: Fix MAX_PACKET_LEN to support 802.1Q VLANs · e918085a

Authored by Alex Williamson on Jan 25, 2009

802.1Q expanded the maximum ethernet frame size by 4 bytes for the
VLAN tag.  We're not taking this into account in virtio_net, which
means the buffers we provide to the backend in the virtqueue RX ring
aren't big enough to hold a full MTU VLAN packet.  For QEMU/KVM,
this results in the backend exiting with a packet truncation error.
Signed-off-by: Alex Williamson <alex.williamson@hp.com>
Acked-by: Mark McLoughlin <markmc@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

e918085a
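
The arithmetic behind e918085a, as a small sketch. The constants use the standard Ethernet sizes but are redefined locally with a `TOY_` prefix, and `toy_max_packet_len()` is a hypothetical helper, not a function from the driver.

```c
#include <assert.h>

/* Standard Ethernet sizes in bytes, redefined locally so the sketch is
 * self-contained. */
enum {
    TOY_ETH_HLEN = 14,       /* destination MAC + source MAC + EtherType */
    TOY_VLAN_HLEN = 4,       /* 802.1Q tag */
    TOY_ETH_DATA_LEN = 1500  /* default MTU payload */
};

/* Largest frame a receive buffer must hold, with or without a VLAN tag. */
static unsigned int toy_max_packet_len(int vlan)
{
    assert(vlan == 0 || vlan == 1);
    return TOY_ETH_HLEN + (vlan ? TOY_VLAN_HLEN : 0) + TOY_ETH_DATA_LEN;
}
```

A buffer sized for the untagged case is 4 bytes short of a full-MTU tagged frame, which is the truncation the commit message describes.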

22 Jan, 2009 · 2 commits

virtio_net: add link status handling · 9f4d26d0

Authored by Mark McLoughlin on Jan 19, 2009

Allow the host to inform us that the link is down by adding
a VIRTIO_NET_F_STATUS which indicates that device status is
available in virtio_net config.

This is currently useful for simulating link down conditions
(e.g. using proposed qemu 'set_link' monitor command) but
would also be needed if we were to support device assignment
via virtio.
Signed-off-by: Mark McLoughlin <markmc@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (added future masking)
Signed-off-by: David S. Miller <davem@davemloft.net>

9f4d26d0

net: Remove redundant NAPI functions · 288379f0

Authored by Ben Hutchings on Jan 19, 2009

Following the removal of the unused struct net_device * parameter from
the NAPI functions named *netif_rx_* in commit 908a7a16, they are
exactly equivalent to the corresponding *napi_* functions and are
therefore redundant.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

288379f0

07 Jan, 2009 · 1 commit

virtio: convert to net_device_ops · 76288b4e

Authored by Stephen Hemminger on Jan 06, 2009

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Acked-by: Mark McLoughlin <markmc@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

76288b4e

23 Dec, 2008 · 1 commit

net: Remove unused netdev arg from some NAPI interfaces. · 908a7a16

Authored by Neil Horman on Dec 22, 2008

When the napi api was changed to separate its 1:1 binding to the net_device
struct, the netif_rx_[prep|schedule|complete] api failed to remove the now
vestigial net_device structure parameter.  This patch cleans up that api by
properly removing it.
Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

908a7a16

02 Dec, 2008 · 1 commit

virtio_net: large tx MTU support · 39da5814

Authored by Mark McLoughlin on Nov 26, 2008

We don't really have a max tx packet size limit, so allow configuring
the device with up to 64k tx MTU.
Signed-off-by: Mark McLoughlin <markmc@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>

39da5814

17 Nov, 2008 · 3 commits

virtio_net: VIRTIO_NET_F_MRG_RXBUF (improve rcv buffer allocation) · 3f2c31d9

Authored by Mark McLoughlin on Nov 16, 2008

If segmentation offload is enabled by the host, we currently allocate
maximum sized packet buffers and pass them to the host. This uses up
20 ring entries, allowing us to supply only 20 packet buffers to the
host with a 256 entry ring. This is a huge overhead when receiving
small packets, and is most keenly felt when receiving MTU sized
packets from off-host.

The VIRTIO_NET_F_MRG_RXBUF feature flag is set by hosts which support
using receive buffers which are smaller than the maximum packet size.
In order to transfer large packets to the guest, the host merges
together multiple receive buffers to form a larger logical buffer.
The number of merged buffers is returned to the guest via a field in
the virtio_net_hdr.

Make use of this support by supplying single page receive buffers to
the host. On receive, we extract the virtio_net_hdr, copy 128 bytes of
the payload to the skb's linear data buffer and adjust the fragment
offset to point to the remaining data. This ensures proper alignment
and allows us to not use any paged data for small packets. If the
payload occupies multiple pages, we simply append those pages as
fragments and free the associated skbs.

This scheme allows us to be efficient in our use of ring entries
while still supporting large packets. Benchmarking using netperf from
an external machine to a guest over a 10Gb/s network shows a 100%
improvement from ~1Gb/s to ~2Gb/s. With a local host->guest benchmark
with GSO disabled on the host side, throughput was seen to increase
from 700Mb/s to 1.7Gb/s.

Based on a patch from Herbert Xu.
Signed-off-by: Mark McLoughlin <markmc@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (use netdev_priv)
Signed-off-by: David S. Miller <davem@davemloft.net>

3f2c31d9
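
A rough sketch of the receive-side bookkeeping described in 3f2c31d9, under stated assumptions: a 4096-byte page, one page per receive buffer, and the 128-byte copy mentioned in the message. The struct and function names are hypothetical, and the header length is passed in rather than taken from any real structure definition.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_PAGE_SIZE 4096u  /* assumed page size */
#define TOY_COPY_LEN   128u  /* bytes copied into the linear area, per the commit text */

/* How many single-page receive buffers the host must merge to deliver
 * `total` bytes (header plus payload). */
static unsigned int toy_buffers_needed(size_t total)
{
    assert(total > 0);
    return (unsigned int)((total + TOY_PAGE_SIZE - 1) / TOY_PAGE_SIZE);
}

struct toy_split {
    size_t linear_len;   /* payload bytes copied to the linear buffer */
    size_t frag_offset;  /* where the remaining data starts in the first page */
    size_t frag_len;     /* payload bytes left in the first page */
};

/* Split the first page: skip the header, copy up to TOY_COPY_LEN bytes,
 * and point the fragment at whatever remains. */
static struct toy_split toy_split_first_page(size_t hdr_len, size_t payload_in_page)
{
    struct toy_split s;
    assert(hdr_len + payload_in_page <= TOY_PAGE_SIZE);
    s.linear_len = payload_in_page < TOY_COPY_LEN ? payload_in_page : TOY_COPY_LEN;
    s.frag_offset = hdr_len + s.linear_len;
    s.frag_len = payload_in_page - s.linear_len;
    return s;
}
```

A small packet ends up entirely in the linear area (`frag_len` is 0), which matches the message's point that small packets need no paged data.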

virtio_net: hook up the set-tso ethtool op · 0276b497

Authored by Mark McLoughlin on Nov 16, 2008

Seems like an oversight that we have set-tx-csum and set-sg hooked
up, but not set-tso.

Also leads to the strange situation that if you e.g. disable tx-csum,
then tso doesn't get disabled.
Signed-off-by: Mark McLoughlin <markmc@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>

0276b497

virtio_net: Recycle some more rx buffer pages · 0a888fd1

Authored by Mark McLoughlin on Nov 16, 2008

Each time we re-fill the recv queue with buffers, we allocate
one too many skbs and free it again when adding fails. We should
recycle the pages allocated in this case.

A previous version of this patch made trim_pages() trim trailing
unused pages from skbs with some paged data, but this actually
caused a barely measurable slowdown.
Signed-off-by: Mark McLoughlin <markmc@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (use netdev_priv)
Signed-off-by: David S. Miller <davem@davemloft.net>

0a888fd1

13 Nov, 2008 · 1 commit

netdevice: safe convert to netdev_priv() #part-3 · 8f15ea42

Authored by Wang Chen on Nov 12, 2008

We have some reasons to kill netdev->priv:
1. netdev->priv is equal to netdev_priv().
2. netdev_priv() wraps the calculation of netdev->priv's offset, so
   netdev_priv() is more flexible than netdev->priv.
But we can't kill netdev->priv yet, because so many drivers reference it
directly.

This patch is a safe conversion from netdev->priv to netdev_priv(netdev),
since every converted use of netdev->priv is read-only.
But it is too big to be sent in one mail, so I split it into 4 parts and
made every part smaller than 100,000 bytes, which is the maximum size
allowed by vger.
Signed-off-by: Wang Chen <wangchen@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

8f15ea42
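
What "wraps the calculation of the offset" in 8f15ea42 means can be sketched as follows. This is an illustration, not the kernel definition: `toy_net_device`, `toy_alloc_netdev()`, `toy_netdev_priv()` and the 32-byte alignment are all assumptions. The idea is that the private area lives in the same allocation as the device, at a fixed aligned offset, so no stored pointer is needed to find it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define TOY_ALIGN 32u  /* assumed alignment of the private area */

struct toy_net_device {
    char name[16];
    unsigned int mtu;
};

/* Round n up to the next multiple of TOY_ALIGN. */
static size_t toy_align_up(size_t n)
{
    return (n + TOY_ALIGN - 1) & ~(size_t)(TOY_ALIGN - 1);
}

/* Allocate the device and its private area as one zeroed block. */
static struct toy_net_device *toy_alloc_netdev(size_t sizeof_priv)
{
    return calloc(1, toy_align_up(sizeof(struct toy_net_device)) + sizeof_priv);
}

/* The private area starts at a fixed, aligned offset after the device,
 * so it can be computed instead of read from a pointer field. */
static void *toy_netdev_priv(struct toy_net_device *dev)
{
    assert(dev != NULL);
    return (char *)dev + toy_align_up(sizeof(struct toy_net_device));
}
```

Because callers only ever read the result, replacing a stored pointer with the computed accessor does not change behaviour, which is why the message calls the conversion safe.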

28 Oct, 2008 · 1 commit

net: convert print_mac to %pM · e174961c

Authored by Johannes Berg on Oct 27, 2008

This converts pretty much every print_mac() user to %pM. There were
a few things that had conflicts which I have just dropped for
now, no harm done.

I've built an allyesconfig with this and looked at the files
that weren't built very carefully, but it's a huge patch.
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

e174961c

25 Jul, 2008 · 4 commits

virtio: Recycle unused recv buffer pages for large skbs in net driver · fb6813f4

Authored by Rusty Russell on Jul 25, 2008

If we hack the virtio_net driver to always allocate full-sized (64k+)
skbuffs, the driver slows down (lguest numbers):

  Time to receive 1GB (small buffers): 10.85 seconds
  Time to receive 1GB (64k+ buffers): 24.75 seconds

Of course, large buffers use up more space in the ring, so we increase
that from 128 to 2048:

  Time to receive 1GB (64k+ buffers, 2k ring): 16.61 seconds

If we recycle pages rather than using alloc_page/free_page:

  Time to receive 1GB (64k+ buffers, 2k ring, recycle pages): 10.81 seconds

This demonstrates that with efficient allocation, we don't need to
have a separate "small buffer" queue.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>

fb6813f4
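
The page recycling that fb6813f4 measures can be sketched as a simple free list. The names below (`toy_pool`, `toy_get_page()`, `toy_give_page()`) are hypothetical, and `malloc()` stands in for the kernel page allocator; the point is only that a returned page is reused before the allocator is asked for a new one.

```c
#include <assert.h>
#include <stdlib.h>

#define TOY_PAGE_SIZE 4096u  /* assumed page size */

/* A free page doubles as a list node, so the pool needs no extra memory. */
struct toy_page {
    struct toy_page *next;
};

struct toy_pool {
    struct toy_page *free_list;
    unsigned int fresh_allocs;  /* how often we fell back to the allocator */
};

/* Return a page to the pool instead of freeing it. */
static void toy_give_page(struct toy_pool *pool, void *page)
{
    struct toy_page *p = page;
    assert(p != NULL);
    p->next = pool->free_list;
    pool->free_list = p;
}

/* Reuse a pooled page if there is one; otherwise allocate a fresh one. */
static void *toy_get_page(struct toy_pool *pool)
{
    struct toy_page *p = pool->free_list;
    if (p) {
        pool->free_list = p->next;
        return p;
    }
    pool->fresh_allocs++;
    return malloc(TOY_PAGE_SIZE);
}
```

The pool is last-in, first-out, so the page handed back most recently is the next one reused.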

virtio net: Allow receiving SG packets · 97402b96

Authored by Herbert Xu on Apr 18, 2008

Finally this patch lets virtio_net receive GSO packets in addition
to sending them. This can definitely be optimised for the non-GSO
case. For comparison the Xen approach stores one page in each skb
and uses subsequent skb's pages to construct an SG skb instead of
preallocating the maximum amount of pages per skb.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (added feature bits)

97402b96

virtio net: Add ethtool ops for SG/GSO · a9ea3fc6

Authored by Herbert Xu on Apr 18, 2008

This patch adds some basic ethtool operations to virtio_net so
I could test SG without GSO (which was really useful because TSO
turned out to be buggy :)

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (remove MTU setting)

a9ea3fc6

virtio: fix virtio_net xmit of freed skb bug · 9953ca6c

Authored by Mark McLoughlin on May 27, 2008

On Mon, 2008-05-26 at 17:42 +1000, Rusty Russell wrote:
> If we fail to transmit a packet, we assume the queue is full and put
> the skb into last_xmit_skb.  However, if more space frees up before we
> xmit it, we loop, and the result can be transmitting the same skb twice.
>
> Fix is simple: set skb to NULL if we've used it in some way, and check
> before sending.
...
> diff -r 564237b31993 drivers/net/virtio_net.c
> --- a/drivers/net/virtio_net.c	Mon May 19 12:22:00 2008 +1000
> +++ b/drivers/net/virtio_net.c	Mon May 19 12:24:58 2008 +1000
> @@ -287,21 +287,25 @@ again:
>  	free_old_xmit_skbs(vi);
>
>  	/* If we has a buffer left over from last time, send it now. */
> -	if (vi->last_xmit_skb) {
> +	if (unlikely(vi->last_xmit_skb)) {
>  		if (xmit_skb(vi, vi->last_xmit_skb) != 0) {
>  			/* Drop this skb: we only queue one. */
>  			vi->dev->stats.tx_dropped++;
>  			kfree_skb(skb);
> +			skb = NULL;
>  			goto stop_queue;
>  		}
>  		vi->last_xmit_skb = NULL;

With this, we may drop an skb and then later in the function discover that
we could have sent it after all. Poor wee skb :)

How about the incremental patch below?

Cheers,
Mark.

Subject: [PATCH] virtio_net: Delay dropping tx skbs

Currently we drop the skb in start_xmit() if we have a
queued buffer and fail to transmit it.

However, if we delay dropping it until we've stopped the
queue and enabled the tx notification callback, then there
is a chance space might become available for it.
Signed-off-by: Mark McLoughlin <markmc@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>

9953ca6c

11 Jul, 2008 · 1 commit

virtio_net: Set VIRTIO_NET_F_GUEST_CSUM feature · 5e4fe5c4

Authored by Mark McLoughlin on Jul 08, 2008

We can handle receiving partial csums, so set the
appropriate feature bit.
Signed-off-by: Mark McLoughlin <markmc@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>

5e4fe5c4

11 Jun, 2008 · 3 commits

virtio: use callback on empty in virtio_net · 363f1514

Authored by Rusty Russell on Jun 08, 2008

virtio_net uses a timer to free old transmitted packets, rather than
leaving callbacks enabled all the time.  If the host promises to
always notify us when the transmit ring is empty, we can free packets
at that point and avoid the timer.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>

363f1514

virtio: virtio_net free transmit skbs in a timer · 14c998f0

Authored by Mark McLoughlin on Jun 08, 2008

virtio_net currently only frees old transmit skbs just
before queueing new ones. If the queue is full, it then
enables interrupts and waits for notification that more
work has been performed.

However, a side-effect of this scheme is that there are
always xmit skbs left dangling when no new packets are
sent, against the Documentation/networking/driver.txt
guideline:

  "... it is not allowed for your TX mitigation scheme
   to let TX packets "hang out" in the TX ring unreclaimed
   forever if no new TX packets are sent."

Add a timer to ensure that any time we queue new TX
skbs, we will shortly free them again.

This fixes an easily reproduced hang at shutdown where
iptables attempts to unload nf_conntrack and nf_conntrack
waits for an skb it is tracking to be freed, but virtio_net
never frees it.
Signed-off-by: Mark McLoughlin <markmc@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>

14c998f0

virtio_net: Fix skb->csum_start computation · 23cde76d

Authored by Mark McLoughlin on Jun 08, 2008

hdr->csum_start is the offset from the start of the ethernet
header to the transport layer checksum field. skb->csum_start
is the offset from skb->head.

skb_partial_csum_set() assumes that skb->data points to the
ethernet header - i.e. it computes skb->csum_start by adding
the headroom to hdr->csum_start.

Since eth_type_trans() skb_pull()s the ethernet header,
skb_partial_csum_set() should be called before
eth_type_trans().

(Without this patch, GSO packets from a guest to the world outside the
host are corrupted).
Signed-off-by: Mark McLoughlin <markmc@redhat.com>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>

23cde76d
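
The ordering problem fixed by 23cde76d, as a toy model. `toy_skb` tracks only the headroom and the checksum start, and the two helpers are hypothetical stand-ins for `skb_partial_csum_set()` and the `skb_pull()` inside `eth_type_trans()`; nothing here is the kernel implementation.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_ETH_HLEN 14u  /* Ethernet header length in bytes */

/* Minimal model of an skb: offsets are measured from the buffer head. */
struct toy_skb {
    size_t headroom;    /* distance from head to data */
    size_t csum_start;  /* offset from head, like skb->csum_start */
};

/* Models skb_partial_csum_set(): assumes data points at the Ethernet header. */
static void toy_partial_csum_set(struct toy_skb *skb, size_t hdr_csum_start)
{
    skb->csum_start = skb->headroom + hdr_csum_start;
}

/* Models the pull done by eth_type_trans(): data moves past the header. */
static void toy_eth_type_trans(struct toy_skb *skb)
{
    skb->headroom += TOY_ETH_HLEN;
}

/* Correct order: record the checksum start, then strip the Ethernet header. */
static size_t toy_csum_start_fixed(size_t headroom, size_t hdr_csum_start)
{
    struct toy_skb skb = { headroom, 0 };
    toy_partial_csum_set(&skb, hdr_csum_start);
    toy_eth_type_trans(&skb);
    return skb.csum_start;
}

/* Buggy order: the pull happens first, so the result is 14 bytes too large. */
static size_t toy_csum_start_buggy(size_t headroom, size_t hdr_csum_start)
{
    struct toy_skb skb = { headroom, 0 };
    toy_eth_type_trans(&skb);
    toy_partial_csum_set(&skb, hdr_csum_start);
    return skb.csum_start;
}
```

With 16 bytes of headroom and a header offset of 34 (a 14-byte Ethernet header plus a 20-byte IPv4 header, chosen only for illustration), the two orders differ by exactly the Ethernet header length.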

31 May, 2008 · 2 commits

virtio: fix delayed xmit of packet and freeing of old packets. · 11a3a154

Authored by Rusty Russell on May 26, 2008

Because we cache the last failed-to-xmit packet, if there are no
packets queued behind that one we may never send it (reproduced here
as TCP stalls, "cured" by an outgoing ping).

Cc: Mark McLoughlin <markmc@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>

11a3a154

virtio: fix virtio_net xmit of freed skb bug · 7eb2e251

Authored by Rusty Russell on May 26, 2008

If we fail to transmit a packet, we assume the queue is full and put
the skb into last_xmit_skb.  However, if more space frees up before we
xmit it, we loop, and the result can be transmitting the same skb twice.

Fix is simple: set skb to NULL if we've used it in some way, and check
before sending.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>

7eb2e251

23 May, 2008 · 1 commit

VIRTIO: Use __skb_queue_purge() · 288369cc

Authored by Wang Chen on May 22, 2008

Use standard routine for queue purging.
Signed-off-by: Wang Chen <wangchen@cn.fujitsu.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>

288369cc

02 May, 2008 · 5 commits

virtio: explicit advertisement of driver features · c45a6816

Authored by Rusty Russell on May 02, 2008

A recent proposed feature addition to the virtio block driver revealed
some flaws in the API: in particular, we assume that feature
negotiation is complete once a driver's probe function returns.

There is nothing in the API to require this, however, and even I
didn't notice when it was violated.

So instead, we require the driver to specify what features it supports
in a table, we can then move the feature negotiation into the virtio
core.  The intersection of device and driver features are presented in
a new 'features' bitmap in the struct virtio_device.

Note that this highlights the difference between Linux unsigned-long
bitmaps where each unsigned long is in native endian, and a
straight-forward little-endian array of bytes.

Drivers can still remove feature bits in their probe routine if they
really have to.

API changes:
- dev->config->feature() no longer gets and acks a feature.
- drivers should advertise their features in the 'feature_table' field
- use virtio_has_feature() for extra sanity when checking feature bits
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>

c45a6816
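
The table-driven negotiation described in c45a6816 can be sketched like this. The function names and the 32-bit bitmap are assumptions made for illustration, and the feature bit numbers used with it are arbitrary; the sketch shows only that the negotiated set is the intersection of what the device offers and what the driver lists.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Build a bitmap from a driver's table of supported feature bit numbers. */
static uint32_t toy_driver_features(const unsigned int *table, size_t n)
{
    uint32_t bits = 0;
    for (size_t i = 0; i < n; i++) {
        assert(table[i] < 32);
        bits |= (uint32_t)1 << table[i];
    }
    return bits;
}

/* The negotiated set is the intersection of both sides' features. */
static uint32_t toy_negotiate(uint32_t device_bits,
                              const unsigned int *table, size_t n)
{
    return device_bits & toy_driver_features(table, n);
}

/* Check one negotiated feature bit. */
static int toy_has_feature(uint32_t negotiated, unsigned int fbit)
{
    assert(fbit < 32);
    return (int)((negotiated >> fbit) & 1u);
}
```

Doing the intersection in one place, rather than in each driver's probe routine, is what lets the core know negotiation is complete.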

virtio: finer-grained features for virtio_net · 5539ae96

Authored by Rusty Russell on May 02, 2008

So, we previously had a 'VIRTIO_NET_F_GSO' bit which meant that 'the
host can handle csum offload, and any TSO (v4&v6 incl ECN) or UFO
packets you might want to send'.  I thought this was good enough for
Linux, but it actually isn't, since we don't do UFO in software.

So, add separate feature bits for what the host can handle.  Add
equivalent ones for the guest to say what it can handle, because LRO
is coming too (thanks Herbert!).
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>

5539ae96

virtio: wean net driver off NETDEV_TX_BUSY · 99ffc696

Authored by Rusty Russell on May 02, 2008

Herbert tells me that returning NETDEV_TX_BUSY from hard_start_xmit is
seen as a poor thing to do; we should cache the packet and stop the queue.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>

99ffc696

virtio: fix scatterlist sizing in net driver. · 05271685

Authored by Rusty Russell on May 02, 2008

Herbert Xu points out (within another patch) that my scatterlists are
too short: one entry for the gso header, one for the skb->data, and
MAX_SKB_FRAGS for all the fragments.

Fix both xmit and recv sides (recv currently unused, coming in later
patch).
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>

05271685

virtio: fix tx_ stats in virtio_net · 655aa31f

Authored by Rusty Russell on May 02, 2008

get_buf() gives the length written by the other side, which will be
zero.  We want to add the skb length.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>

655aa31f

09 Apr, 2008 · 1 commit

[NET]: Undo code bloat in hot paths due to print_mac(). · 21f644f3

Authored by David S. Miller on Apr 08, 2008

If print_mac() is used inside of a pr_debug() the compiler
can't see that the call is redundant so still performs it
even if pr_debug() ends up being a nop.

So don't use print_mac() in such cases in hot code paths,
use MAC_FMT et al. instead.

As noted by Joe Perches, pr_debug() could be modified to
handle this better, but that is a change to an interface
used by the entire kernel and thus needs to be validated
carefully.  This here is thus the less risky fix for
2.6.25.
Signed-off-by: David S. Miller <davem@davemloft.net>

21f644f3
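
The effect described in 21f644f3 can be reproduced in user space. In this sketch `toy_print_mac()` stands in for print_mac(), `toy_debug_fn()` models a debug call compiled out as an empty function, and `toy_debug_macro()` models one compiled out as a macro; all three names are hypothetical. A call counter makes the side effect visible, which is also why the compiler may not drop the call.

```c
#include <assert.h>
#include <stdio.h>

static unsigned int toy_format_calls;  /* counts calls to the formatter */

/* Stand-in for print_mac(): formats a MAC address into buf. */
static const char *toy_print_mac(char *buf, size_t len, const unsigned char *mac)
{
    toy_format_calls++;
    snprintf(buf, len, "%02x:%02x:%02x:%02x:%02x:%02x",
             mac[0], mac[1], mac[2], mac[3], mac[4], mac[5]);
    return buf;
}

/* Debug logging compiled out as an empty function:
 * the arguments are still evaluated at the call site. */
static inline int toy_debug_fn(const char *fmt, ...)
{
    (void)fmt;
    return 0;
}

/* Debug logging compiled out as a macro: the arguments vanish entirely. */
#define toy_debug_macro(...) ((void)0)

/* Returns how many times the formatter ran for one disabled debug call. */
static unsigned int toy_calls_via_function(const unsigned char *mac)
{
    char buf[18];
    toy_format_calls = 0;
    toy_debug_fn("mac %s\n", toy_print_mac(buf, sizeof(buf), mac));
    return toy_format_calls;
}

static unsigned int toy_calls_via_macro(const unsigned char *mac)
{
    char buf[18];
    toy_format_calls = 0;
    toy_debug_macro("mac %s\n", toy_print_mac(buf, sizeof(buf), mac));
    (void)buf;
    (void)mac;
    return toy_format_calls;
}
```

The function variant still pays for the formatting even though nothing is logged; the macro variant corresponds to the interface change the message mentions as the riskier alternative.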

08 Apr, 2008 · 1 commit

virtio_net: remove overzealous printk · 6ea0a467

Authored by Anthony Liguori on Apr 07, 2008

The 'disable_cb' is really just a hint and as such, it's possible for more
work to get queued up while callbacks are disabled. Under stress with an
SMP guest, this printk triggers very frequently. There is no race here, this
is how things are designed to work so let's just remove the printk.
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Acked-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

6ea0a467

17 Mar, 2008 · 2 commits

virtio: fix race in enable_cb · 4265f161

Authored by Christian Borntraeger on Mar 14, 2008

There is a race in virtio_net, dealing with disabling/enabling the callback.
I saw the following oops:

kernel BUG at /space/kvm/drivers/virtio/virtio_ring.c:218!
illegal operation: 0001 [#1] SMP
Modules linked in: sunrpc dm_mod
CPU: 2 Not tainted 2.6.25-rc1zlive-host-10623-gd358142-dirty #99
Process swapper (pid: 0, task: 000000000f85a610, ksp: 000000000f873c60)
Krnl PSW : 0404300180000000 00000000002b81a6 (vring_disable_cb+0x16/0x20)
           R:0 T:1 IO:0 EX:0 Key:0 M:1 W:0 P:0 AS:0 CC:3 PM:0 EA:3
Krnl GPRS: 0000000000000001 0000000000000001 0000000010005800 0000000000000001
           000000000f3a0900 000000000f85a610 0000000000000000 0000000000000000
           0000000000000000 000000000f870000 0000000000000000 0000000000001237
           000000000f3a0920 000000000010ff74 00000000002846f6 000000000fa0bcd8
Krnl Code: 00000000002b819a: a7110001           tmll    %r1,1
           00000000002b819e: a7840004           brc     8,2b81a6
           00000000002b81a2: a7f40001           brc     15,2b81a4
          >00000000002b81a6: a51b0001           oill    %r1,1
           00000000002b81aa: 40102000           sth     %r1,0(%r2)
           00000000002b81ae: 07fe               bcr     15,%r14
           00000000002b81b0: eb7ff0380024       stmg    %r7,%r15,56(%r15)
           00000000002b81b6: a7f13e00           tmll    %r15,15872
Call Trace:
([<000000000fa0bcd0>] 0xfa0bcd0)
 [<00000000002b8350>] vring_interrupt+0x5c/0x6c
 [<000000000010ab08>] do_extint+0xb8/0xf0
 [<0000000000110716>] ext_no_vtime+0x16/0x1a
 [<0000000000107e72>] cpu_idle+0x1c2/0x1e0

The problem can be triggered with a high amount of host->guest traffic.
I think it's the following race:

poll says netif_rx_complete
poll calls enable_cb
enable_cb opens the interrupt mask
a new packet comes, an interrupt is triggered----\
enable_cb sees that there is more work           |
enable_cb disables the interrupt                 |
       .                                         V
       .                            interrupt is delivered
       .                            skb_recv_done does atomic napi test, ok
 some waiting                       disable_cb is called->check fails->bang!
       .
poll would do napi check
poll would do disable_cb

The fix is to let enable_cb not disable the interrupt again, but expect the
caller to do the cleanup if it returns false. In that case, the interrupt is
only disabled if the napi test_set_bit was successful.
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (cleaned up doco)

4265f161
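
The rule that 4265f161 arrives at, "only disable the callback after winning the atomic test", can be modelled with a C11 `atomic_flag`. This is a single-threaded sketch with hypothetical names: `toy_vq` stands in for the virtqueue plus NAPI state, and the `assert()` in `toy_disable_cb()` plays the role of the BUG in the oops.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

struct toy_vq {
    atomic_flag sched;      /* models the NAPI "scheduled" bit */
    bool cb_enabled;        /* models the interrupt mask */
    unsigned int disables;  /* how many times callbacks were disabled */
};

static void toy_vq_init(struct toy_vq *vq)
{
    atomic_flag_clear(&vq->sched);
    vq->cb_enabled = true;
    vq->disables = 0;
}

/* Models disable_cb(): disabling twice in a row is the failure in the oops. */
static void toy_disable_cb(struct toy_vq *vq)
{
    assert(vq->cb_enabled);
    vq->cb_enabled = false;
    vq->disables++;
}

/* Models the schedule-prep test: true only for the caller that wins the flag. */
static bool toy_schedule_prep(struct toy_vq *vq)
{
    return !atomic_flag_test_and_set(&vq->sched);
}

/* Pattern shared by the interrupt handler and the tail of poll:
 * touch the callback state only after winning the atomic test. */
static bool toy_claim_and_disable(struct toy_vq *vq)
{
    if (!toy_schedule_prep(vq))
        return false;
    toy_disable_cb(vq);
    return true;
}
```

Whichever path loses the test leaves the callback state alone, so the second disable that triggered the BUG cannot happen in this model.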

virtio: Enable netpoll interface for netconsole logging · da74e89d

Authored by Amit Shah on Feb 29, 2008

Add a new poll_controller handler that the netpoll interface needs.

This enables netconsole logging from a kvm guest over the virtio
net interface.
Signed-off-by: Amit Shah <amitshah@gmx.net>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>

da74e89d

24 Feb, 2008 · 1 commit

virtio_net: Fix oops on early interrupts - introduced by virtio reset code · d9d5dcc8

Authored by Christian Borntraeger on Feb 18, 2008

Signed-off-by: Jeff Garzik <jeff@garzik.org>

d9d5dcc8

06 Feb, 2008 · 1 commit

virtio net: fix oops on interface-up · 370076d9

Authored by Christian Borntraeger on Feb 06, 2008

I got the following oops during interface ifup. Unfortunately it's not
easily reproducible, so I can't say for sure that my fix fixes this
problem, but I am confident and I think it's correct anyway:

   <2>kernel BUG at /space/kvm/drivers/virtio/virtio_ring.c:234!
    <4>illegal operation: 0001 [#1] PREEMPT SMP
    <4>Modules linked in:
    <4>CPU: 0 Not tainted 2.6.24zlive-guest-07293-gf1ca1512-dirty #91
    <4>Process swapper (pid: 0, task: 0000000000800938, ksp: 000000000084ddb8)
    <4>Krnl PSW : 0404300180000000 0000000000466374 (vring_disable_cb+0x30/0x34)
    <4>           R:0 T:1 IO:0 EX:0 Key:0 M:1 W:0 P:0 AS:0 CC:3 PM:0 EA:3
    <4>Krnl GPRS: 0000000000000001 0000000000000001 0000000010003800 0000000000466344
    <4>           000000000e980900 00000000008848b0 000000000084e748 0000000000000000
    <4>           000000000087b300 0000000000001237 0000000000001237 000000000f85bdd8
    <4>           000000000e980920 00000000001137c0 0000000000464754 000000000f85bdd8
    <4>Krnl Code: 0000000000466368: e3b0b0700004        lg      %r11,112(%r11)
    <4>           000000000046636e: 07fe                bcr     15,%r14
    <4>           0000000000466370: a7f40001            brc     15,466372
    <4>          >0000000000466374: a7f4fff6            brc     15,466360
    <4>           0000000000466378: eb7ff0500024        stmg    %r7,%r15,80(%r15)
    <4>           000000000046637e: a7f13e00            tmll    %r15,15872
    <4>           0000000000466382: b90400ef            lgr     %r14,%r15
    <4>           0000000000466386: a7840001            brc     8,466388
    <4>Call Trace:
    <4>([<000201500f85c000>] 0x201500f85c000)
    <4> [<0000000000466556>] vring_interrupt+0x72/0x88
    <4> [<00000000004801a0>] kvm_extint_handler+0x34/0x44
    <4> [<000000000010d22c>] do_extint+0xbc/0xf8
    <4> [<0000000000113f98>] ext_no_vtime+0x16/0x1a
    <4> [<000000000010a182>] cpu_idle+0x216/0x238
    <4>([<000000000010a162>] cpu_idle+0x1f6/0x238)
    <4> [<0000000000568656>] rest_init+0xaa/0xb8
    <4> [<000000000084ee2c>] start_kernel+0x3fc/0x490
    <4> [<0000000000100020>] _stext+0x20/0x80
    <4>
    <4> <0>Kernel panic - not syncing: Fatal exception in interrupt
    <4>

After looking at the code and the dump I think the following scenario
happened: Ifup was running on cpu2 and the interrupt arrived on cpu0.
Now virtnet_open on cpu 2 managed to execute napi_enable and disable_cb
but did not execute rx_schedule. Meanwhile on cpu 0 skb_recv_done was
called by vring_interrupt, executed netif_rx_schedule_prep, which
succeeded and therefore called disable_cb. This triggered the BUG_ON,
as interrupts were already disabled by cpu 2.

I think the proper solution is to make the call to disable_cb depend on
the atomic update of NAPI_STATE_SCHED by using netif_rx_schedule_prep
in the same way as skb_recv_done.
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Acked-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Jeff Garzik <jeff@garzik.org>

370076d9

04 Feb, 2008 · 6 commits

virtio_net: parametrize the napi_weight for virtio receive queue. · 6c0cd7c0

Authored by Dor Laor on Dec 16, 2007

It is done in order to improve performance.
Signed-off-by: Dor Laor <dor.laor@qumranet.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>

6c0cd7c0

virtio: free transmit skbs when notified, not on next xmit. · 2cb9c6ba

Authored by Rusty Russell on Feb 04, 2008

This fixes a potential dangling xmit problem.

We also suppress refill interrupts until we need them.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>

2cb9c6ba

virtio: flush buffers on open · a48bd8f6

Authored by Rusty Russell on Feb 04, 2008

Fix bug found by Christian Borntraeger: if the other side fills all
the registered network buffers before we enable NAPI, we will never
get an interrupt.  The simplest fix is to process the input queue once
on open.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>

a48bd8f6

virtnet: remove double ether_setup · e70f2f1b

Authored by Christian Borntraeger on Dec 06, 2007

Hello Rusty,

virtnet_probe already calls alloc_etherdev, which calls ether_setup.
There is no need to do that again.
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>

e70f2f1b

virtio: reset function · 6e5aa7ef

Authored by Rusty Russell on Feb 04, 2008

A reset function solves three problems:

1) It allows us to renegotiate features, eg. if we want to upgrade a
   guest driver without rebooting the guest.

2) It gives us a clean way of shutting down virtqueues: after a reset,
   we know that the buffers won't be used by the host, and

3) It helps the guest recover from messed-up drivers.

So we remove the ->shutdown hook, and the only way we now remove
feature bits is via reset.

We leave it to the driver to do the reset before it deletes queues:
the balloon driver, for example, needs to chat to the host in its
remove function.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>

6e5aa7ef

virtio: populate network rings in the probe routine, not open · b3369c1f

Authored by Rusty Russell on Feb 04, 2008

Since we want to reset the device to remove them, this is simpler
(device is reset for us on driver remove).
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>

b3369c1f