Commits · 0f13b66b01c6e2ec4913a7812414183844d1cc4f · openanolis / cloud-kernel

19 Nov, 2013 1 commit

net, virtio_net: replace the magic value · 0f13b66b

Authored by Zhi Yong Wu on Nov 18, 2013

It is more appropriate to use the number of queue pairs currently used by
the driver instead of a magic value.

Signed-off-by: Zhi Yong Wu <wuzhy@linux.vnet.ibm.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

0f13b66b

15 Nov, 2013 1 commit

virtio-net: mergeable buffer size should include virtio-net header · 5061de36

Authored by Michael Dalton on Nov 14, 2013

Commit 2613af0e ("virtio_net: migrate mergeable rx buffers to page
frag allocators") changed the mergeable receive buffer size from PAGE_SIZE
to MTU-size. However, the merge buffer size does not take into account the
size of the virtio-net header. Consequently, packets that are MTU-size
will take two buffers instead of one (to store the virtio-net header),
substantially decreasing the throughput of MTU-size traffic due to TCP
window / SKB truesize effects.

This commit changes the mergeable buffer size to include the virtio-net
header. The buffer size is cacheline-aligned because skb_page_frag_refill
will not automatically align the requested size.
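
The sizing argument can be checked with a little arithmetic. The sketch below is illustrative only: the constants (14-byte Ethernet header, 4-byte VLAN tag, 12-byte mergeable-rx virtio-net header, 64-byte cache line) are assumed typical values, and `align_up` and `buffers_needed` are hypothetical helpers, not driver functions.

```python
import math

ETH_HLEN = 14        # Ethernet header bytes (assumed)
VLAN_HLEN = 4        # VLAN tag bytes (assumed)
VNET_HDR_LEN = 12    # mergeable-rx virtio-net header bytes (assumed)
CACHE_LINE = 64      # cache line size in bytes (assumed)


def align_up(size: int, alignment: int) -> int:
    """Round size up to the next multiple of alignment."""
    return (size + alignment - 1) // alignment * alignment


def buffers_needed(mtu: int, buf_len: int) -> int:
    """Buffers consumed by one MTU-sized frame plus its virtio-net header."""
    frame = ETH_HLEN + mtu + VNET_HDR_LEN
    return math.ceil(frame / buf_len)


# Buffer sized for the packet alone (header not accounted for).
old_buf = ETH_HLEN + VLAN_HLEN + 1500
# Buffer sized for packet plus header, rounded up to a cache line.
new_buf = align_up(ETH_HLEN + VLAN_HLEN + 1500 + VNET_HDR_LEN, CACHE_LINE)

print(old_buf, buffers_needed(1500, old_buf))  # 1518 2
print(old_buf, buffers_needed(1480, old_buf))  # 1518 1
print(new_buf, buffers_needed(1500, new_buf))  # 1536 1
```

Under these assumptions an MTU-1500 frame spills into a second buffer with the old size, while an MTU-1480 frame does not, which is consistent with the benchmark rows below.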

Benchmarks taken from an average of 5 netperf 30-second TCP_STREAM runs
between two QEMU VMs on a single physical machine. Each VM has two VCPUs and
vhost enabled. All VMs and vhost threads run in a single 4 CPU cgroup
cpuset, using cgroups to ensure that other processes in the system will not
be scheduled on the benchmark CPUs. Transmit offloads and mergeable receive
buffers are enabled, but guest_tso4 / guest_csum are explicitly disabled to
force MTU-sized packets on the receiver.

net-next trunk before 2613af0e (PAGE_SIZE buf): 3861.08Gb/s
net-next trunk (MTU 1500 - packet uses two buf due to size bug): 4076.62Gb/s
net-next trunk (MTU 1480 - packet fits in one buf): 6301.34Gb/s
net-next trunk w/ size fix (MTU 1500 - packet fits in one buf): 6445.44Gb/s

Suggested-by: Eric Northup <digitaleric@google.com>
Signed-off-by: Michael Dalton <mwdalton@google.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

5061de36

06 Nov, 2013 1 commit

virtio-net: switch to use XPS to choose txq · 9bb8ca86

Authored by Jason Wang on Nov 05, 2013

We used to use a percpu structure vq_index to record the cpu to queue
mapping. This is suboptimal since it duplicates the work of XPS and
loses all other XPS functionality, such as allowing users to configure
their own transmission steering strategy.

So this patch switches to XPS and suggests a default mapping when
the number of cpus is equal to the number of queues. With XPS support,
there's no need to keep the per-cpu vq_index and .ndo_select_queue(),
so they were removed also.
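
As a rough illustration of the default mapping described above, the sketch below suggests a one-to-one tx queue to CPU assignment only when the counts match. `default_xps_mapping` is a hypothetical helper written for this example and does not mirror the driver's code.

```python
from typing import Dict, List, Optional


def default_xps_mapping(online_cpus: List[int],
                        num_queues: int) -> Optional[Dict[int, int]]:
    """Suggest a tx queue -> CPU mapping, or None when the counts differ."""
    if len(online_cpus) != num_queues:
        # No default suggestion; steering is left to user configuration.
        return None
    return {queue: cpu for queue, cpu in enumerate(sorted(online_cpus))}


print(default_xps_mapping([0, 1, 2, 3], 4))  # {0: 0, 1: 1, 2: 2, 3: 3}
print(default_xps_mapping([0, 1], 4))        # None
```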

Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Rusty Russell <rusty@rustcorp.com.au>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

9bb8ca86

05 Nov, 2013 1 commit

virtio-net: coalesce rx frags when possible during rx · ba275241

Authored by Jason Wang on Nov 01, 2013

Commit 2613af0e (virtio_net: migrate mergeable
rx buffers to page frag allocators) tries to increase the payload/truesize ratio
for MTU-sized traffic. But this introduces extra overhead for received GSO
packets because of the frag list. This commit reduces that overhead by
coalescing rx frags when possible during rx. Test results show about a 15%
improvement on full-size GSO packet receiving (even better than before commit
2613af0e).

Before this commit:
./netperf -H 192.168.100.4
MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.100.4
() port 0 AF_INET : demo
Recv   Send    Send
Socket Socket  Message  Elapsed
Size   Size    Size     Time     Throughput
bytes  bytes   bytes    secs.    10^6bits/sec

 87380  16384  16384    10.00    20303.87

After this commit:
./netperf -H 192.168.100.4
MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.100.4
() port 0 AF_INET : demo
Recv   Send    Send
Socket Socket  Message  Elapsed
Size   Size    Size     Time     Throughput
bytes  bytes   bytes    secs.    10^6bits/sec

 87380  16384  16384    10.00    23841.26

Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Michael Dalton <mwdalton@google.com>
Cc: Eric Dumazet <edumazet@google.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ba275241

30 Oct, 2013 1 commit

virtio-net: correctly handle cpu hotplug notifier during resuming · ec9debbd

Authored by Jason Wang on Oct 29, 2013

Commit 3ab098df (virtio-net: don't respond to
cpu hotplug notifier if we're not ready) tries to bypass the cpu hotplug
notifier by checking config_enable and doing nothing if it is false. To do so
it needs to take the config_lock mutex, which may happen in atomic
context and leads to the following warnings:

[  622.944441] CPU0 attaching NULL sched-domain.
[  622.944446] CPU1 attaching NULL sched-domain.
[  622.944485] CPU0 attaching NULL sched-domain.
[  622.950795] BUG: sleeping function called from invalid context at kernel/mutex.c:616
[  622.950796] in_atomic(): 1, irqs_disabled(): 1, pid: 10, name: migration/1
[  622.950796] no locks held by migration/1/10.
[  622.950798] CPU: 1 PID: 10 Comm: migration/1 Not tainted 3.12.0-rc5-wl-01249-gb91e82d #317
[  622.950799] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
[  622.950802]  0000000000000000 ffff88001d42dba0 ffffffff81a32f22 ffff88001bfb9c70
[  622.950803]  ffff88001d42dbb0 ffffffff810edb02 ffff88001d42dc38 ffffffff81a396ed
[  622.950805]  0000000000000046 ffff88001d42dbe8 ffffffff810e861d 0000000000000000
[  622.950805] Call Trace:
[  622.950810]  [<ffffffff81a32f22>] dump_stack+0x54/0x74
[  622.950815]  [<ffffffff810edb02>] __might_sleep+0x112/0x114
[  622.950817]  [<ffffffff81a396ed>] mutex_lock_nested+0x3c/0x3c6
[  622.950818]  [<ffffffff810e861d>] ? up+0x39/0x3e
[  622.950821]  [<ffffffff8153ea7c>] ? acpi_os_signal_semaphore+0x21/0x2d
[  622.950824]  [<ffffffff81565ed1>] ? acpi_ut_release_mutex+0x5e/0x62
[  622.950828]  [<ffffffff816d04ec>] virtnet_cpu_callback+0x33/0x87
[  622.950830]  [<ffffffff81a42576>] notifier_call_chain+0x3c/0x5e
[  622.950832]  [<ffffffff810e86a8>] __raw_notifier_call_chain+0xe/0x10
[  622.950835]  [<ffffffff810c5556>] __cpu_notify+0x20/0x37
[  622.950836]  [<ffffffff810c5580>] cpu_notify+0x13/0x15
[  622.950838]  [<ffffffff81a237cd>] take_cpu_down+0x27/0x3a
[  622.950841]  [<ffffffff81136289>] stop_machine_cpu_stop+0x93/0xf1
[  622.950842]  [<ffffffff81136167>] cpu_stopper_thread+0xa0/0x12f
[  622.950844]  [<ffffffff811361f6>] ? cpu_stopper_thread+0x12f/0x12f
[  622.950847]  [<ffffffff81119710>] ? lock_release_holdtime.part.7+0xa3/0xa8
[  622.950848]  [<ffffffff81135e4b>] ? cpu_stop_should_run+0x3f/0x47
[  622.950850]  [<ffffffff810ea9b0>] smpboot_thread_fn+0x1c5/0x1e3
[  622.950852]  [<ffffffff810ea7eb>] ? lg_global_unlock+0x67/0x67
[  622.950854]  [<ffffffff810e36b7>] kthread+0xd8/0xe0
[  622.950857]  [<ffffffff81a3bfad>] ? wait_for_common+0x12f/0x164
[  622.950859]  [<ffffffff810e35df>] ? kthread_create_on_node+0x124/0x124
[  622.950861]  [<ffffffff81a45ffc>] ret_from_fork+0x7c/0xb0
[  622.950862]  [<ffffffff810e35df>] ? kthread_create_on_node+0x124/0x124
[  622.950876] smpboot: CPU 1 is now offline
[  623.194556] SMP alternatives: lockdep: fixing up alternatives
[  623.194559] smpboot: Booting Node 0 Processor 1 APIC 0x1
...

A correct fix is to unregister the hotcpu notifier during restore and register a
new one in resume.

Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Tested-by: Fengguang Wu <fengguang.wu@intel.com>
Cc: Wanlong Gao <gaowanlong@cn.fujitsu.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Wanlong Gao <gaowanlong@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ec9debbd

29 Oct, 2013 1 commit

virtio_net: migrate mergeable rx buffers to page frag allocators · 2613af0e

Authored by Michael Dalton on Oct 28, 2013

The virtio_net driver's mergeable receive buffer allocator
uses 4KB packet buffers. For MTU-sized traffic, SKB truesize
is > 4KB but only ~1500 bytes of the buffer is used to store
packet data, reducing the effective TCP window size
substantially. This patch addresses the performance concerns
with mergeable receive buffers by allocating MTU-sized packet
buffers using page frag allocators. If more than MAX_SKB_FRAGS
buffers are needed, the SKB frag_list is used.
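
The effect on the TCP window can be approximated with a simplified model in which each received packet is charged its buffer size against the socket's receive budget. The numbers below are assumptions for illustration: real SKB truesize also includes structure overhead, the 1518-byte buffer is an assumed MTU-sized allocation, and the 87380-byte budget is just a common default receive socket size.

```python
def payload_fraction(payload: int, buf_len: int) -> float:
    """Fraction of a receive buffer that actually carries packet data."""
    return payload / buf_len


def packets_per_window(window_bytes: int, truesize: int) -> int:
    """Packets that fit in a receive budget when charged by buffer size."""
    return window_bytes // truesize


PAGE_BUF = 4096   # 4KB buffer per packet
MTU_BUF = 1518    # assumed MTU-sized buffer
WINDOW = 87380    # assumed receive budget in bytes

print(round(payload_fraction(1500, PAGE_BUF), 2))  # 0.37
print(round(payload_fraction(1500, MTU_BUF), 2))   # 0.99
print(packets_per_window(WINDOW, PAGE_BUF))        # 21
print(packets_per_window(WINDOW, MTU_BUF))         # 57
```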

Signed-off-by: Michael Dalton <mwdalton@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

2613af0e

18 Oct, 2013 2 commits

virtio-net: refill only when device is up during setting queues · 35ed159b

Authored by Jason Wang on Oct 15, 2013

We used to schedule the refill work unconditionally after changing the
number of queues. This may lead to an issue if the device is not
up. Since we only try to cancel the work in ndo_stop(), the refill
work may still run after the device is removed. Fix this by only
scheduling the work when the device is up.

The bug was introduced by commit 9b9cd802
(virtio-net: fix the race between channels setting and refill).

Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

35ed159b

virtio-net: don't respond to cpu hotplug notifier if we're not ready · 3ab098df

Authored by Jason Wang on Oct 15, 2013

We re-configure the affinity unconditionally in the cpu hotplug
callback. This may cause issues when resuming from s3/s4 since

- virt queues haven't been allocated at that time.
- it's unnecessary since the thaw method will re-configure the affinity.

Fix this issue by checking config_enable and doing nothing if we're not ready.

The bug was introduced by commit 8de4b2f3
(virtio-net: reset virtqueue affinity when doing cpu hotplug).

Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Wanlong Gao <gaowanlong@cn.fujitsu.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Wanlong Gao <gaowanlong@cn.fujitsu.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

3ab098df

04 Sep, 2013 1 commit

virtio-net: Set RXCSUM feature if GUEST_CSUM is available · 4f49129b

Authored by Thomas Huth on Aug 27, 2013

If the VIRTIO_NET_F_GUEST_CSUM virtio feature is available, the guest
does not have to calculate the checksums on all received packets. This
is pretty much the same feature as RX checksum offloading on real
network cards, so the virtio-net driver should report this by setting
the NETIF_F_RXCSUM flag. When the user now runs "ethtool -k", he or she
can see whether the virtio-net interface has to calculate RX checksums
or not.

Signed-off-by: Thomas Huth <thuth@linux.vnet.ibm.com>
Acked-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>

4f49129b

28 Jul, 2013 1 commit

virtio-net: put virtio net header inline with data · e7428e95

Authored by Michael S. Tsirkin on Jul 25, 2013

For small packets we can simplify xmit processing
by linearizing buffers with the header:
most packets seem to have enough head room
that we can use for this purpose.
Since existing hypervisors require that the header
be the first s/g element, we need a feature bit
for this.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>

e7428e95

10 Jul, 2013 1 commit

virtio_net: fix race in RX VQ processing · cbdadbbf

Authored by Michael S. Tsirkin on Jul 09, 2013

virtio net called virtqueue_enable_cb on the RX path after napi_complete, so
with NAPI_STATE_SCHED clear - outside the implicit napi lock.
This violates the requirement to synchronize virtqueue_enable_cb wrt
virtqueue_add_buf.  In particular, the used event can move backwards,
causing us to lose interrupts.
In a debug build, this can trigger a panic within START_USE.

Jason Wang reports that he can trigger the races artificially,
by adding udelay() in virtqueue_enable_cb() after virtio_mb().

However, we must call napi_complete to clear NAPI_STATE_SCHED before
polling the virtqueue for used buffers, otherwise napi_schedule_prep in
a callback will fail, causing us to lose RX events.

To fix, call virtqueue_enable_cb_prepare with NAPI_STATE_SCHED
set (under napi lock), later call virtqueue_poll with
NAPI_STATE_SCHED clear (outside the lock).
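
The ordering of the fix can be illustrated with a toy model. This is a sketch under heavy assumptions: `ToyVirtqueue` tracks only a used-ring index, has no real ring, barriers, or interrupts, and `finish_poll` is a hypothetical helper that only shows the prepare, complete, poll sequence named above.

```python
class ToyVirtqueue:
    """Toy stand-in for a virtqueue; tracks only the used-ring index."""

    def __init__(self) -> None:
        self.used_idx = 0       # advanced by the simulated device
        self.last_used_idx = 0  # how far the driver has consumed

    def device_adds_used(self, count: int = 1) -> None:
        """Simulate the device completing buffers."""
        self.used_idx += count

    def enable_cb_prepare(self) -> int:
        """Re-enable callbacks and return a snapshot to poll against later."""
        return self.last_used_idx

    def poll(self, snapshot: int) -> bool:
        """True if buffers were used since the snapshot was taken."""
        return snapshot != self.used_idx


def finish_poll(vq: ToyVirtqueue, napi_complete) -> bool:
    """Return True if the caller should reschedule NAPI."""
    snapshot = vq.enable_cb_prepare()  # still "under the napi lock"
    napi_complete()                    # stands in for clearing NAPI_STATE_SCHED
    return vq.poll(snapshot)           # outside the lock: was anything missed?


quiet = ToyVirtqueue()
print(finish_poll(quiet, lambda: None))                    # False
busy = ToyVirtqueue()
print(finish_poll(busy, lambda: busy.device_adds_used()))  # True
```

In the second call a buffer arrives in the window between the snapshot and the poll, and the poll detects it, so the event is not lost.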

Reported-by: Jason Wang <jasowang@redhat.com>
Tested-by: Jason Wang <jasowang@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

cbdadbbf

04 Jul, 2013 1 commit

virtio-net: fix the race between channels setting and refill · 9b9cd802

Authored by Jason Wang on Jul 04, 2013

Commit 55257d72 (virtio-net: fill only rx queues
which are being used) tries to refill on demand when changing the number of
channels by calling try_refill_recv() directly. This may race with:

- the refill work, which may do the refill at the same time
- the try_refill_recv() called in bh, since napi was not disabled

This may lead the guest to complain while setting channels:

virtio_net virtio0: input.1:id 0 is not a head!

Solve this issue by scheduling a refill work, which guarantees the
serialization of refill.

Cc: Sasha Levin <sasha.levin@oracle.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>

9b9cd802

23 May, 2013 1 commit

virtio_net: enable napi for all possible queues during open · e4166625

Authored by Jason Wang on May 21, 2013

Commit 55257d72 (virtio-net: fill only rx
queues which are being used) only enables napi during open for
curr_queue_pairs. This breaks multiqueue receiving since napi of the new queues
is still disabled after changing the number of queues.

This patch fixes this by enabling napi for all possible queues during open.

Cc: Sasha Levin <sasha.levin@oracle.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Acked-by: Rusty Russell <rusty@rustcorp.com.au>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

e4166625

12 May, 2013 1 commit

virtio_net: use default napi weight by default · d34710e3

Authored by Amerigo Wang on May 09, 2013

Since commit 82dc3c63 ("net: introduce NAPI_POLL_WEIGHT")
we warn drivers when they use a napi weight higher than NAPI_POLL_WEIGHT,
but virtio_net still uses 128 by default. This patch changes its default
value to NAPI_POLL_WEIGHT.

Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: Cong Wang <amwang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

d34710e3

29 Apr, 2013 1 commit

virtio-net: fill only rx queues which are being used · 55257d72

Authored by Sasha Levin on Apr 29, 2013

Due to MQ support we may allocate a whole bunch of rx queues but
never use them. With this patch we'll save the space used by
the receive buffers until they are actually in use:

sh-4.2# free -h
             total       used       free     shared    buffers     cached
Mem:          490M        35M       455M         0B         0B       4.1M
-/+ buffers/cache:        31M       459M
Swap:           0B         0B         0B
sh-4.2# ethtool -L eth0 combined 8
sh-4.2# free -h
             total       used       free     shared    buffers     cached
Mem:          490M       162M       327M         0B         0B       4.1M
-/+ buffers/cache:       158M       331M
Swap:           0B         0B         0B

Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>

55257d72

20 Apr, 2013 2 commits

net: vlan: prepare for 802.1ad VLAN filtering offload · 80d5c368

Authored by Patrick McHardy on Apr 19, 2013

Change the rx_{add,kill}_vid callbacks to take a protocol argument in
preparation for 802.1ad support. The protocol argument used so far is
always htons(ETH_P_8021Q).

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

80d5c368

net: vlan: rename NETIF_F_HW_VLAN_* feature flags to NETIF_F_HW_VLAN_CTAG_* · f646968f

Authored by Patrick McHardy on Apr 19, 2013

Rename the hardware VLAN acceleration features to include "CTAG" to indicate
that they only support CTAGs. Follow-up patches will introduce 802.1ad
service provider tagging (STAGs) and require the distinction for hardware not
supporting accelerating both.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

f646968f

12 Apr, 2013 1 commit

virtio-net: initialize vlan_features · 4fda8302

Authored by Jason Wang on Apr 10, 2013

There's nothing that prevents passing the device features of virtio_net to its
vlan device. So this patch simply passes those to the vlan device to benefit from
advanced features.

Netperf shows better sending performance for the vlan device since TSO can work on
vlan now.

before:
netperf -H 192.168.5.2
MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.5.2 ()
port 0 AF_INET : demo
Recv   Send    Send
Socket Socket  Message  Elapsed
Size   Size    Size     Time     Throughput
bytes  bytes   bytes    secs.    10^6bits/sec

 87380  16384  16384    10.00    4162.35

after:
netperf -H 192.168.5.2
MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.5.2 ()
port 0 AF_INET : demo
Recv   Send    Send
Socket Socket  Message  Elapsed
Size   Size    Size     Time     Throughput
bytes  bytes   bytes    secs.    10^6bits/sec

 87380  16384  16384    10.00    9365.42

Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

4fda8302

22 Mar, 2013 1 commit

virtio: remove obsolete virtqueue_get_queue_index() · 9d0ca6ed

Authored by Rusty Russell on Mar 21, 2013

You can access it directly now, since 3.8: v3.7-rc1-13-g06ca287d
'virtio: move queue_index and num_free fields into core struct
virtqueue.'

Cc: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

9d0ca6ed

20 Mar, 2013 2 commits

virtio_net: use simplified virtqueue accessors. · 9dc7b9e4

Authored by Rusty Russell on Mar 20, 2013

We never add buffers with input and output parts, so use the new accessors.

Cc: "Michael S. Tsirkin" <mst@redhat.com>
Reviewed-by: Asias He <asias@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>

9dc7b9e4

virtio_net: use virtqueue_add_sgs[] for command buffers. · f7bc9594

Authored by Rusty Russell on Mar 20, 2013

It's a bit cleaner to hand multiple sgs, rather than one big one.

Cc: "Michael S. Tsirkin" <mst@redhat.com>
Tested-by: Wanlong Gao <gaowanlong@cn.fujitsu.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>

f7bc9594

14 Feb, 2013 1 commit

net: Fix possible wrong checksum generation. · c9af6db4

Authored by Pravin B Shelar on Feb 11, 2013

Patch cef401de (net: fix possible wrong checksum
generation) fixed wrong checksum calculation but it broke TSO by
defining a new GSO type but not a netdev feature for that type.
net_gso_ok() would not allow hardware checksum/segmentation
offload of such packets without the feature.

The following patch fixes TSO and wrong checksum. This patch uses
the same logic that Eric Dumazet used. The patch introduces a new flag,
SKBTX_SHARED_FRAG, set if at least one frag can be modified by
the user, but the SKBTX_SHARED_FRAG flag is kept in skb shared
info tx_flags rather than gso_type.

tx_flags is better compared to gso_type since we can have an skb with
a shared frag without a gso packet. It does not link SHARED_FRAG to
GSO, so there is no need to define a netdev feature for this.

Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

c9af6db4

13 Feb, 2013 1 commit

virtio: use module_virtio_driver. · b2a17029

Authored by Rusty Russell on Feb 13, 2013

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>

b2a17029

05 Feb, 2013 1 commit

drivers:net:misc: Remove unnecessary alloc/OOM messages · e68ed8f0

Authored by Joe Perches on Feb 03, 2013

alloc failures already get standardized OOM
messages and a dump_stack.

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

e68ed8f0

28 Jan, 2013 1 commit

net: fix possible wrong checksum generation · cef401de

Authored by Eric Dumazet on Jan 25, 2013

Pravin Shelar mentioned that GSO could potentially generate
wrong TX checksum if skb has fragments that are overwritten
by the user between the checksum computation and transmit.

He suggested to linearize skbs but this extra copy can be
avoided for normal tcp skbs cooked by tcp_sendmsg().

This patch introduces a new SKB_GSO_SHARED_FRAG flag, set
in skb_shinfo(skb)->gso_type if at least one frag can be
modified by the user.

Typical sources of such possible overwrites are {vm}splice(),
sendfile(), and macvtap/tun/virtio_net drivers.

Tested:

$ netperf -H 7.7.8.84
MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to
7.7.8.84 () port 0 AF_INET
Recv   Send    Send
Socket Socket  Message  Elapsed
Size   Size    Size     Time     Throughput
bytes  bytes   bytes    secs.    10^6bits/sec

 87380  16384  16384    10.00    3959.52

$ netperf -H 7.7.8.84 -t TCP_SENDFILE
TCP SENDFILE TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 7.7.8.84 ()
port 0 AF_INET
Recv   Send    Send
Socket Socket  Message  Elapsed
Size   Size    Size     Time     Throughput
bytes  bytes   bytes    secs.    10^6bits/sec

 87380  16384  16384    10.00    3216.80

Performance of the SENDFILE is impacted by the extra allocation and
copy, and because we use order-0 pages, while the TCP_STREAM uses
bigger pages.

Reported-by: Pravin Shelar <pshelar@nicira.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

cef401de

27 Jan, 2013 3 commits

virtio-net: reset virtqueue affinity when doing cpu hotplug · 8de4b2f3

Authored by Wanlong Gao on Jan 24, 2013

Add a cpu notifier to virtio-net, so that we can reset the
virtqueue affinity when cpu hotplug happens. This improves
performance by enabling or disabling the virtqueue
affinity after cpu hotplug.

Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Jason Wang <jasowang@redhat.com>
Cc: Eric Dumazet <erdnetdev@gmail.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: virtualization@lists.linux-foundation.org
Cc: netdev@vger.kernel.org
Signed-off-by: Wanlong Gao <gaowanlong@cn.fujitsu.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

8de4b2f3

virtio-net: split out clean affinity function · 8898c21c

Authored by Wanlong Gao on Jan 24, 2013

Split out the clean affinity function to virtnet_clean_affinity().

Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Jason Wang <jasowang@redhat.com>
Cc: Eric Dumazet <erdnetdev@gmail.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: virtualization@lists.linux-foundation.org
Cc: netdev@vger.kernel.org
Signed-off-by: Wanlong Gao <gaowanlong@cn.fujitsu.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

8898c21c

virtio-net: fix the set affinity bug when CPU IDs are not consecutive · 47be2479

Authored by Wanlong Gao on Jan 24, 2013

As Michael mentioned, set affinity and select queue will not work very
well when CPU IDs are not consecutive; this can happen with hot unplug.
Fix this bug by traversing the online CPUs and creating a per-cpu variable
to find the mapping from CPU to the preferable virtual-queue.
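
A minimal sketch of the idea, assuming queues are assigned by each CPU's position among the online CPUs rather than by its raw ID. `map_cpus_to_queues` is a hypothetical helper, and the wrap-around with `%` for the case of more CPUs than queues is an assumption of this example.

```python
from typing import Dict, Iterable


def map_cpus_to_queues(online_cpus: Iterable[int],
                       num_queues: int) -> Dict[int, int]:
    """Assign each online CPU a queue by its position, not by its CPU ID."""
    return {cpu: pos % num_queues
            for pos, cpu in enumerate(sorted(online_cpus))}


# CPU 1 has been hot-unplugged, so the IDs are no longer consecutive.
# Indexing queues by raw CPU ID would send CPU 3 to a non-existent queue 3.
print(map_cpus_to_queues([0, 2, 3], 3))  # {0: 0, 2: 1, 3: 2}
```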

Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Jason Wang <jasowang@redhat.com>
Cc: Eric Dumazet <erdnetdev@gmail.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: virtualization@lists.linux-foundation.org
Cc: netdev@vger.kernel.org
Signed-off-by: Wanlong Gao <gaowanlong@cn.fujitsu.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

47be2479

22 Jan, 2013 2 commits

virtio-net: introduce a new control to set macaddr · 7e58d5ae

Authored by Amos Kong on Jan 21, 2013

Currently we write the MAC address to pci config space byte by byte,
which means there is an intermediate step where the MAC is wrong.
This patch introduces a new control command to set the MAC address
atomically.

VIRTIO_NET_F_CTRL_MAC_ADDR is a new feature bit for compatibility.
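
The intermediate state is easy to see in a small simulation. This sketch is illustrative only: `write_mac_bytewise` is a hypothetical helper that models single-byte config writes, and the two MAC addresses are made-up example values.

```python
from typing import Iterator


def write_mac_bytewise(current: bytes, new: bytes) -> Iterator[bytes]:
    """Yield the MAC as an observer would see it after each one-byte write."""
    mac = bytearray(current)
    for i, value in enumerate(new):
        mac[i] = value
        yield bytes(mac)


old = bytes.fromhex("525400000001")  # example address
new = bytes.fromhex("525400abcdef")  # example address

states = list(write_mac_bytewise(old, new))
# States that match neither the old nor the new address are "torn".
torn = [s.hex() for s in states if s not in (old, new)]
print(torn)  # ['525400ab0001', '525400abcd01']
```

A single control command that carries all six bytes avoids exposing these torn values.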

Signed-off-by: Amos Kong <akong@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

7e58d5ae

move virtnet_send_command() above virtnet_set_mac_address() · 40cbfc37

Authored by Amos Kong on Jan 21, 2013

We want to send a vq command to set the MAC address in
virtnet_set_mac_address(), so move this function.
Also fix a minor coding style issue.

Signed-off-by: Amos Kong <akong@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

40cbfc37

18 Dec, 2012 4 commits

virtio: net: make it clear that virtqueue_add_buf() no longer returns > 0 · 0e3daa64

Authored by Rusty Russell on Oct 16, 2012

We simplified virtqueue_add_buf(), make it clear in the callers.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>

0e3daa64

virtio_net: don't rely on virtqueue_add_buf() returning capacity. · 9ed4cb07

Authored by Rusty Russell on Oct 16, 2012

Now that we can easily use vq->num_free to determine if there are descriptors
left in the queue, we're about to change virtqueue_add_buf() to return 0
on success.  The virtio_net driver is the only one which actually uses
the return value, so change that.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Acked-by: Michael S. Tsirkin <mst@redhat.com>

9ed4cb07

virtio-net: remove unused skb_vnet_hdr->num_sg field · 7bedc7dc

Authored by Michael S. Tsirkin on Oct 16, 2012

[Split from "correct capacity math on ring full" -- Rusty]

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Acked-by: Michael S. Tsirkin <mst@redhat.com>

7bedc7dc

virtio-net: correct capacity math on ring full · 6ee57bcc

Authored by Michael S. Tsirkin on Oct 16, 2012

Capacity math on ring full is wrong: we are
looking at num_sg but that might be optimistic
because of indirect buffer use.

The implementation also penalizes fast path
with extra memory accesses for the benefit of
ring full condition handling which is slow path.

It's easy to query ring capacity so let's do just that.
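
Querying capacity directly reduces the ring-full check to a comparison against a worst-case packet. The sketch below is an assumption-laden illustration: `should_stop_tx` is a hypothetical helper, the value 17 for `MAX_SKB_FRAGS` is only a typical figure, and the worst case is modeled as one descriptor for the header, one for linear data, and one per page fragment.

```python
MAX_SKB_FRAGS = 17  # assumed value; depends on kernel configuration


def should_stop_tx(num_free: int, max_frags: int = MAX_SKB_FRAGS) -> bool:
    """Stop the tx queue when a worst-case packet might no longer fit.

    Worst case assumed here: one descriptor for the virtio-net header,
    one for the linear data, and one per page fragment.
    """
    return num_free < 2 + max_frags


print(should_stop_tx(18))  # True
print(should_stop_tx(19))  # False
```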

This change also makes it easier to move the vnet header
for tx around, as a follow-up patch does.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Acked-by: Michael S. Tsirkin <mst@redhat.com>

6ee57bcc

11 Dec, 2012 1 commit

virtio_net: fix a typo in virtnet_alloc_queues() · 008d4278

Authored by Amerigo Wang on Dec 10, 2012

Obviously it should check !vi->rq.

Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Cc: Jason Wang <jasowang@redhat.com>
Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: Cong Wang <amwang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

008d4278

09 Dec, 2012 3 commits

virtio-net: support changing the number of queue pairs through ethtool · d73bcd2c

Authored by Jason Wang on Dec 07, 2012

This patch implements the ethtool_{set|get}_channels methods of virtio-net to
allow the user to change the number of queues on demand while the device is running.

Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

d73bcd2c

virtio_net: multiqueue support · 986a4f4d

Authored by Jason Wang on Dec 07, 2012

This patch adds multiqueue (VIRTIO_NET_F_MQ) support to the virtio_net
driver. A VIRTIO_NET_F_MQ capable device allows the driver to do packet
transmission and reception through multiple queue pairs and does the packet
steering to get better performance. By default, only one queue pair is used; the
user can change the number of queue pairs with ethtool in the next patch.

When multiple queue pairs are used and the number of queue pairs is equal to the
number of vcpus, the driver does the following optimizations to implement per-cpu
virt queue pairs:

- select the txq based on the smp processor id.
- smp affinity hint to the cpu that owns the queue pairs.

This could be used with the flow steering support of the device to guarantee that
the packets of a single flow are handled by the same cpu.

Signed-off-by: Krishna Kumar <krkumar2@in.ibm.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

986a4f4d

virtio-net: separate fields of sending/receiving queue from virtnet_info · e9d7417b

Authored by Jason Wang on Dec 07, 2012

To support multiqueue transmitq/receiveq, the first step is to separate the
queue-related structures from virtnet_info. This patch introduces the send_queue
and receive_queue structures and uses pointers to them as parameters in the
functions handling sending/receiving.

Signed-off-by: Krishna Kumar <krkumar2@in.ibm.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

e9d7417b

04 Dec, 2012 1 commit

virtio_net: remove __dev* attributes · 8cc085d6

Authored by Bill Pemberton on Dec 03, 2012

CONFIG_HOTPLUG is going away as an option.  As a result the __dev*
markings will be going away.

Remove use of __devinit, __devexit_p, __devinitdata, __devinitconst,
and __devexit.

Signed-off-by: Bill Pemberton <wfp5p@virginia.edu>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: virtualization@lists.linux-foundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

8cc085d6

10 Nov, 2012 1 commit

virtio_net: use net_*_ratelimited() helpers · be443899

Authored by Amerigo Wang on Nov 08, 2012

These can be converted to net_*_ratelimited().

Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: David Miller <davem@davemloft.net>
Signed-off-by: Cong Wang <amwang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

be443899

openanolis / cloud-kernel synced successfully about 1 year ago
