1. 21 January 2015, 1 commit
  2. 23 December 2014, 1 commit
    • virtio_net: Fix napi poll list corruption · 8acdf999
      Authored by Herbert Xu
      The commit d75b1ade (net: less
      interrupt masking in NAPI) breaks virtio_net in an insidious way.
      
      It is now required that if the entire budget is consumed when poll
      returns, the napi poll_list must remain empty.  However, like some
      other drivers virtio_net tries to do a last-ditch check and if
      there is more work it will call napi_schedule and then immediately
      process some of this new work.  Should the entire budget be consumed
      while processing such new work then we will violate the new caller
      contract.
      
      This patch fixes this by not touching any work when we reschedule
      in virtio_net.
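
      A minimal sketch of a poll routine that honours this contract (illustrative
      only; my_poll(), my_process_rx() and more_work_pending() are hypothetical
      names, not the driver's actual code):
      
      	static int my_poll(struct napi_struct *napi, int budget)
      	{
      		int received = my_process_rx(napi, budget);
      
      		if (received < budget) {
      			/* Budget not exhausted: complete, then reschedule if
      			 * new work raced in - but do not process that new
      			 * work in this invocation. */
      			napi_complete(napi);
      			if (more_work_pending(napi))
      				napi_schedule(napi);
      		}
      
      		/* If received == budget we return immediately, leaving the
      		 * napi instance on the poll_list as the core now requires. */
      		return received;
      	}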
      
      The worst part of this bug is that the list corruption causes other
      napi users to be moved off-list.  In my case I was chasing a stall
      in IPsec (IPsec uses netif_rx) and I only belatedly realised that it
      was virtio_net which caused the stall even though the virtio_net
      poll was still functioning perfectly after IPsec stalled.
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
      Acked-by: Jason Wang <jasowang@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  3. 09 December 2014, 8 commits
  4. 21 November 2014, 1 commit
  5. 31 October 2014, 1 commit
    • drivers/net: Disable UFO through virtio · 3d0ad094
      Authored by Ben Hutchings
      IPv6 does not allow fragmentation by routers, so there is no
      fragmentation ID in the fixed header.  UFO for IPv6 requires the ID to
      be passed separately, but there is no provision for this in the virtio
      net protocol.
      
      Until recently our software implementation of UFO/IPv6 generated a new
      ID, but this was a bug.  Now we will use ID=0 for any UFO/IPv6 packet
      passed through a tap, which is even worse.
      
      Unfortunately there is no distinction between UFO/IPv4 and v6
      features, so disable UFO on taps and virtio_net completely until we
      have a proper solution.
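
      As a rough illustration of what "disable UFO" means at the driver level
      (a sketch, not the actual patch; the function name is made up), the
      feature bit is simply dropped from the advertised feature masks:
      
      	/* Sketch: stop advertising UDP fragmentation offload. */
      	static void my_disable_ufo(struct net_device *dev)
      	{
      		dev->hw_features &= ~NETIF_F_UFO;
      		dev->features    &= ~NETIF_F_UFO;
      	}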
      
      We cannot depend on VM managers respecting the tap feature flags, so
      keep accepting UFO packets but log a warning the first time we do
      this.
      Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
      Fixes: 916e4cf4 ("ipv6: reuse ip6_frag_id from ip6_ufo_append_data")
      Signed-off-by: David S. Miller <davem@davemloft.net>
  6. 16 October 2014, 1 commit
  7. 15 October 2014, 6 commits
  8. 14 September 2014, 1 commit
    • virtio_net: pass well-formed sgs to virtqueue_add_*() · a5835440
      Authored by Rusty Russell
      This is the only driver which doesn't hand virtqueue_add_inbuf and
      virtqueue_add_outbuf a well-formed, well-terminated sg.  Fix it,
      so we can make virtio_add_* simpler.
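
      For reference, a well-formed, well-terminated sg is what sg_init_table()
      produces; a minimal sketch (hdr, hdr_len, data, data_len, vq and skb are
      placeholders, not the driver's variables):
      
      	struct scatterlist sg[2];
      	int err;
      
      	sg_init_table(sg, 2);               /* zeroes entries, marks sg[1] as last */
      	sg_set_buf(&sg[0], hdr, hdr_len);   /* virtio-net header */
      	sg_set_buf(&sg[1], data, data_len); /* packet payload    */
      
      	err = virtqueue_add_outbuf(vq, sg, 2, skb, GFP_ATOMIC);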
      
      pktgen results:
      	modprobe pktgen
      	echo 'add_device eth0' > /proc/net/pktgen/kpktgend_0
      	echo nowait 1 > /proc/net/pktgen/eth0
      	echo count 1000000 > /proc/net/pktgen/eth0
      	echo clone_skb 100000 > /proc/net/pktgen/eth0
      	echo dst_mac 4e:14:25:a9:30:ac > /proc/net/pktgen/eth0
      	echo dst 192.168.1.2 > /proc/net/pktgen/eth0
      	for i in `seq 20`; do echo start > /proc/net/pktgen/pgctrl; tail -n1 /proc/net/pktgen/eth0; done
      
      Before:
        746547-793084(786421+/-9.6e+03)pps 346-367(364.4+/-4.4)Mb/sec (346397808-367990976(3.649e+08+/-4.5e+06)bps) errors: 0
      
      After:
        767390-792966(785159+/-6.5e+03)pps 356-367(363.75+/-2.9)Mb/sec (356068960-367936224(3.64314e+08+/-3e+06)bps) errors: 0
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  9. 28 August 2014, 1 commit
  10. 26 August 2014, 1 commit
    • net: Remove ndo_xmit_flush netdev operation, use signalling instead. · 0b725a2c
      Authored by David S. Miller
      As reported by Jesper Dangaard Brouer, for high packet rates the
      overhead of having another indirect call in the TX path is
      non-trivial.
      
      There is the indirect call itself, and then there is all of the
      reloading of the state to refetch the tail pointer value and
      then write the device register.
      
      Move to a more passive scheme, which requires very light modifications
      to the device drivers.
      
      The signal is a new skb->xmit_more value; if it is non-zero, it means
      that more SKBs are pending to be transmitted on the same queue as the
      current SKB, and therefore the driver may elide the tail pointer
      update.
      
      Right now skb->xmit_more is always zero.
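
      A sketch of how a driver consumes the new signal (my_start_xmit(),
      my_queue_desc() and my_write_tail_ptr() are placeholder names):
      
      	static netdev_tx_t my_start_xmit(struct sk_buff *skb, struct net_device *dev)
      	{
      		struct my_priv *priv = netdev_priv(dev);
      
      		my_queue_desc(priv, skb);
      
      		/* Elide the tail pointer / doorbell write while more SKBs
      		 * are pending for this queue; the last SKB in the burst
      		 * flushes. */
      		if (!skb->xmit_more)
      			my_write_tail_ptr(priv);
      
      		return NETDEV_TX_OK;
      	}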
      Signed-off-by: David S. Miller <davem@davemloft.net>
  11. 25 August 2014, 1 commit
  12. 24 July 2014, 2 commits
    • virtio-net: rx busy polling support · 91815639
      Authored by Jason Wang
      Add basic support for rx busy polling. Instead of introducing new
      states and a spinlock to synchronize between NAPI and the polling
      method, this patch just reuses the NAPI state to avoid extra overhead
      on the fast path and simplifies the code.
      
      The test was done between a KVM guest and an external host. The two
      hosts were connected through 40Gb mlx4 cards. With both busy_poll and
      busy_read set to 50 in the guest, 1-byte netperf TCP_RR shows a 127%
      improvement: the transaction rate increased from 8353.33 to 18966.87.
      
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Michael S. Tsirkin <mst@redhat.com>
      Cc: Vlad Yasevich <vyasevic@redhat.com>
      Cc: Eric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: Jason Wang <jasowang@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • virtio-net: introduce virtnet_receive() · 2ffa7598
      Authored by Jason Wang
      Move common receive logic to a new helper, virtnet_receive(). It will
      also be used by the rx busy polling method.
      
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Michael S. Tsirkin <mst@redhat.com>
      Cc: Vlad Yasevich <vyasevic@redhat.com>
      Cc: Eric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: Jason Wang <jasowang@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  13. 14 May 2014, 1 commit
  14. 01 May 2014, 1 commit
    • virtio-net: Set needed_headroom for virtio-net when VIRTIO_F_ANY_LAYOUT is true · 6ebbc1a6
      Authored by Zhangjie (HZ)
      This is a small supplement for commit e7428e95
      ("virtio-net: put virtio-net header inline with data"). TCP packets have
      enough room to put the virtio-net header in, but UDP packets do not. By
      setting dev->needed_headroom for the virtio-net device, UDP packets can
      have enough room.
      
      For UDP packets, the sk_buff is allocated in __ip_append_data(). The size
      is "alloclen + hh_len + 15", where "hh_len = LL_RESERVED_SPACE(rt->dst.dev);".
      The macro is defined as follows:
      #define LL_RESERVED_SPACE(dev) \
           ((((dev)->hard_header_len+(dev)->needed_headroom)\
           &~(HH_DATA_MOD - 1)) + HH_DATA_MOD)
      By default, for UDP packets, only 16 bytes are reserved after the skb is
      allocated, and only 2 bytes remain after the MAC header is set. That is
      not enough to put the virtio-net header in. If we set dev->needed_headroom
      to 12 or 10 (depending on whether mergeable_rx_bufs is on or off), more
      room can be reserved, leaving enough room for UDP packets to put the
      header in.
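
      To make the arithmetic concrete, here is a small standalone mock of the
      macro (assuming an Ethernet hard_header_len of 14 and HH_DATA_MOD of 16;
      this is an illustration, not kernel code):
      
      	#include <stdio.h>
      
      	#define HH_DATA_MOD 16
      	#define LL_RESERVED_SPACE(hhl, nhr) \
      		((((hhl) + (nhr)) & ~(HH_DATA_MOD - 1)) + HH_DATA_MOD)
      
      	int main(void)
      	{
      		/* Default: ((14 + 0) & ~15) + 16 = 16 reserved, so only
      		 * 16 - 14 = 2 bytes of headroom remain after the MAC header. */
      		printf("%d\n", LL_RESERVED_SPACE(14, 0) - 14);   /* prints 2  */
      
      		/* needed_headroom = 12: ((14 + 12) & ~15) + 16 = 32 reserved,
      		 * 32 - 14 = 18 bytes left - enough for the 12-byte header. */
      		printf("%d\n", LL_RESERVED_SPACE(14, 12) - 14);  /* prints 18 */
      		return 0;
      	}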
      
      Test results are listed below.
      Guest and host: suse11sp3, netperf, Intel 2.4GHz.
      +-------+---------+---------+---------+---------+
      |  UDP  |        old        |        new        |
      | bytes +---------+---------+---------+---------+
      |       |  Gbit/s |   pps   |  Gbit/s |   pps   |
      | 64    |  0.57   | 692232  |  0.61   | 742420  |
      | 256   |  1.60   | 686860  |  1.71   | 733331  |
      | 512   |  2.92   | 674576  |  3.07   | 710446  |
      | 1024  |  4.99   | 598977  |  5.17   | 620821  |
      | 1460  |  5.68   | 483757  |  7.16   | 610519  |
      | 4096  |  6.98   | 637468  |  7.21   | 658471  |
      +-------+---------+---------+---------+---------+
      Signed-off-by: Zhang Jie <zhangjie14@huawei.com>
      Acked-by: Rusty Russell <rusty@rustcorp.com.au>
      Acked-by: Jason Wang <jasowang@redhat.com>
      Acked-by: Michael S. Tsirkin <mst@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  15. 23 April 2014, 1 commit
  16. 28 March 2014, 1 commit
    • virtio-net: correct error handling of virtqueue_kick() · 681daee2
      Authored by Jason Wang
      The current error handling of virtqueue_kick() is wrong in two places:
      - The skb was freed immediately when virtqueue_kick() failed during
        xmit. This may lead to a double free, since the skb was not detached
        from the virtqueue.
      - try_fill_recv() returned false when virtqueue_kick() failed. This
        leads to unnecessary rescheduling of the refill work.
      
      Actually, it's safe to just ignore the kick failure in those two
      places. So this patch fixes this by partially reverting commit
      67975901.
      
      Fixes: 67975901 ("virtio_net: verify if virtqueue_kick() succeeded").
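
      A sketch of the resulting pattern (the function shape and name are
      illustrative; the real driver differs in detail): the buffer is already
      on the virtqueue, so a failed kick must not be turned into a free or a
      "refill failed" result.
      
      	static bool my_add_recv_buf(struct virtqueue *vq, void *buf,
      				    size_t len, gfp_t gfp)
      	{
      		struct scatterlist sg;
      		int err;
      
      		sg_init_one(&sg, buf, len);
      		err = virtqueue_add_inbuf(vq, &sg, 1, buf, gfp);
      
      		/* Kick unconditionally and ignore its return value. */
      		virtqueue_kick(vq);
      
      		return err >= 0;   /* only the add result decides success */
      	}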
      
      Cc: Heinz Graalfs <graalfs@linux.vnet.ibm.com>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Michael S. Tsirkin <mst@redhat.com>
      Signed-off-by: Jason Wang <jasowang@redhat.com>
      Acked-by: Michael S. Tsirkin <mst@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  17. 25 March 2014, 1 commit
  18. 15 March 2014, 1 commit
  19. 13 March 2014, 1 commit
  20. 25 February 2014, 1 commit
  21. 17 January 2014, 4 commits
    • virtio-net: initial rx sysfs support, export mergeable rx buffer size · fbf28d78
      Authored by Michael Dalton
      Add initial support for per-rx queue sysfs attributes to virtio-net. If
      mergeable packet buffers are enabled, adds a read-only mergeable packet
      buffer size sysfs attribute for each RX queue.
      Suggested-by: Michael S. Tsirkin <mst@redhat.com>
      Acked-by: Michael S. Tsirkin <mst@redhat.com>
      Signed-off-by: Michael Dalton <mwdalton@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • virtio-net: auto-tune mergeable rx buffer size for improved performance · ab7db917
      Authored by Michael Dalton
      Commit 2613af0e ("virtio_net: migrate mergeable rx buffers to page frag
      allocators") changed the mergeable receive buffer size from PAGE_SIZE to
      MTU-size, introducing a single-stream regression for benchmarks with large
      average packet size. There is no single optimal buffer size for all
      workloads.  For workloads with packet size <= MTU bytes, MTU + virtio-net
      header-sized buffers are preferred as larger buffers reduce the TCP window
      due to SKB truesize. However, single-stream workloads with large average
      packet sizes have higher throughput if larger (e.g., PAGE_SIZE) buffers
      are used.
      
      This commit auto-tunes the mergeable receive buffer packet size by
      choosing the packet buffer size based on an EWMA of the recent packet
      sizes for the receive queue. Packet buffer sizes range from MTU-size +
      virtio-net header length to PAGE_SIZE. This improves throughput for
      large-packet workloads, as any workload with average packet size >=
      PAGE_SIZE will use PAGE_SIZE buffers.
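
      A sketch of the auto-tuning idea (constants, names and the EWMA weight
      are illustrative, not the driver's actual implementation):
      
      	#define HDR_LEN       12     /* assumed mergeable virtio-net header size */
      	#define MIN_BUF_LEN   (1500 + HDR_LEN)
      	#define MAX_BUF_LEN   4096   /* PAGE_SIZE on this example system */
      	#define EWMA_WEIGHT   64     /* larger weight = slower-moving average */
      
      	static unsigned long pkt_len_avg = MIN_BUF_LEN;   /* per-RX-queue EWMA */
      
      	static void note_rx_packet_len(unsigned int len)
      	{
      		pkt_len_avg = (pkt_len_avg * (EWMA_WEIGHT - 1) + len) / EWMA_WEIGHT;
      	}
      
      	static unsigned int pick_rx_buf_len(void)
      	{
      		unsigned long len = pkt_len_avg + HDR_LEN;
      
      		/* Clamp between an MTU-sized buffer and a full page. */
      		if (len < MIN_BUF_LEN)
      			len = MIN_BUF_LEN;
      		if (len > MAX_BUF_LEN)
      			len = MAX_BUF_LEN;
      		return len;
      	}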
      
      These optimizations interact positively with recent commit
      ba275241 ("virtio-net: coalesce rx frags when possible during rx"),
      which coalesces adjacent RX SKB fragments in virtio_net. The coalescing
      optimizations benefit buffers of any size.
      
      Benchmarks taken from an average of 5 netperf 30-second TCP_STREAM runs
      between two QEMU VMs on a single physical machine. Each VM has two VCPUs
      with all offloads & vhost enabled. All VMs and vhost threads run in a
      single 4 CPU cgroup cpuset, using cgroups to ensure that other processes
      in the system will not be scheduled on the benchmark CPUs. Trunk includes
      SKB rx frag coalescing.
      
      net-next w/ virtio_net before 2613af0e (PAGE_SIZE bufs): 14642.85Gb/s
      net-next (MTU-size bufs):  13170.01Gb/s
      net-next + auto-tune: 14555.94Gb/s
      
      Jason Wang also reported a throughput increase on mlx4 from 22Gb/s
      using MTU-sized buffers to about 26Gb/s using auto-tuning.
      Signed-off-by: Michael Dalton <mwdalton@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • virtio-net: use per-receive queue page frag alloc for mergeable bufs · fb51879d
      Authored by Michael Dalton
      The virtio-net driver currently uses netdev_alloc_frag() for GFP_ATOMIC
      mergeable rx buffer allocations. This commit migrates virtio-net to use
      per-receive-queue page frags for GFP_ATOMIC allocation. This change unifies
      mergeable rx buffer memory allocation, which will now use skb_page_frag_refill()
      for both atomic and GFP-WAIT buffer allocations.
      
      To address fragmentation concerns, if after buffer allocation there
      is too little space left in the page frag to allocate a subsequent
      buffer, the remaining space is added to the current allocated buffer
      so that the remaining space can be used to store packet data.
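
      A sketch of that fragmentation rule (simplified; struct page_frag fields
      as in the kernel, the function name otherwise illustrative):
      
      	static unsigned int carve_mergeable_buf(struct page_frag *frag,
      						unsigned int buf_len)
      	{
      		unsigned int leftover;
      
      		frag->offset += buf_len;
      		leftover = frag->size - frag->offset;
      
      		/* If the leftover cannot hold another buffer, fold it into
      		 * the buffer just carved out so it can still store packet
      		 * data. */
      		if (leftover < buf_len) {
      			buf_len += leftover;
      			frag->offset = frag->size;
      		}
      
      		return buf_len;   /* final length of the buffer just carved out */
      	}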
      Acked-by: Michael S. Tsirkin <mst@redhat.com>
      Signed-off-by: Michael Dalton <mwdalton@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • virtio-net: drop rq->max and rq->num · be121f46
      Authored by Jason Wang
      It looks like there's no need for those two fields:
      
      - Unless there's a failure on the first refill try, rq->max should always
        be equal to the vring size.
      - rq->num is only used to determine whether we need to do a refill; we
        could check vq->num_free instead (see the sketch below).
      - rq->num was required to be increased or decreased explicitly after each
        get/put, which results in a bad API.
      
      So this patch removes them both to make the code simpler.
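
      A sketch of the refill condition using the vring's own bookkeeping
      (the half-ring threshold is illustrative):
      
      	static bool rx_needs_refill(struct virtqueue *vq)
      	{
      		/* Refill when more than half of the ring entries are unused. */
      		return vq->num_free > virtqueue_get_vring_size(vq) / 2;
      	}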
      
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Michael S. Tsirkin <mst@redhat.com>
      Signed-off-by: Jason Wang <jasowang@redhat.com>
      Acked-by: Rusty Russell <rusty@rustcorp.com.au>
      Acked-by: Michael S. Tsirkin <mst@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  22. 03 January 2014, 1 commit
    • virtio-net: fix refill races during restore · 6cd4ce00
      Authored by Jason Wang
      During restore, try_fill_recv() was called with neither the napi lock
      held nor napi disabled. This could lead to two try_fill_recv() calls
      running at the same time. Fix this by refilling before trying to enable
      napi, as sketched below.
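
      A sketch of the restore ordering described here (vi, rq and the helper
      names are placeholders): refill each queue first, and only then enable
      its napi instance.
      
      	static int my_restore(struct virtio_device *vdev)
      	{
      		struct my_info *vi = vdev->priv;
      		int i;
      
      		for (i = 0; i < vi->num_queues; i++) {
      			/* Refill while napi is still disabled, so no second
      			 * try_fill_recv() can run concurrently. */
      			if (!my_try_fill_recv(&vi->rq[i], GFP_KERNEL))
      				my_schedule_refill(vi);
      
      			napi_enable(&vi->rq[i].napi);
      		}
      
      		return 0;
      	}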
      
      Fixes: 0741bcb5 ("virtio: net: Add freeze, restore handlers to support S4").
      
      Cc: Amit Shah <amit.shah@redhat.com>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Michael S. Tsirkin <mst@redhat.com>
      Cc: Eric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: Jason Wang <jasowang@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  23. 11 December 2013, 2 commits