1. 07 Jul 2013, 1 commit
  2. 11 Jun 2013, 1 commit
  3. 06 May 2013, 4 commits
  4. 01 May 2013, 4 commits
  5. 30 Jan 2013, 1 commit
    • vhost_net: handle polling errors when setting backend · 2b8b328b
      Committed by Jason Wang
      Currently, polling errors are ignored, which can lead to the following issues:
      
      - vhost removes itself unconditionally from the waitqueue when stopping the poll;
        this may crash the kernel, since the previous attempt to start polling may have
        failed to add it to the waitqueue
      - userspace may think the backend was successfully set even when polling failed
      
      Solve this by:
      
      - checking poll->wqh before trying to remove from the waitqueue
      - reporting polling errors from vhost_poll_start() and tx_poll_start(); the return
        value is checked and propagated when userspace sets the backend
      
      After this fix a polling failure can still occur after the backend is set; that
      case is addressed by the next patch. (A simplified sketch of the fix follows this
      entry.)
      Signed-off-by: Jason Wang <jasowang@redhat.com>
      Acked-by: Michael S. Tsirkin <mst@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      2b8b328b
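
      A minimal, hedged sketch of the two fixes described above, written against ordinary
      kernel wait-queue types; the demo_* structure and helpers are simplified stand-ins,
      not the actual vhost code:

      #include <linux/wait.h>
      #include <linux/poll.h>
      #include <linux/errno.h>

      /* Stand-in for struct vhost_poll: wqh is non-NULL only after a
       * successful start, which is what makes the stop path safe. */
      struct demo_poll {
              wait_queue_head_t *wqh;
              wait_queue_entry_t wait;
      };

      /* Stop path: only detach if an earlier start really attached us. */
      static void demo_poll_stop(struct demo_poll *poll)
      {
              if (poll->wqh) {
                      remove_wait_queue(poll->wqh, &poll->wait);
                      poll->wqh = NULL;
              }
      }

      /* Start path: report the failure instead of ignoring it, so the
       * SET_BACKEND ioctl can return an error to userspace. */
      static int demo_poll_start(struct demo_poll *poll, unsigned int mask,
                                 wait_queue_head_t *wqh)
      {
              if (mask & POLLERR) {
                      demo_poll_stop(poll);
                      return -EINVAL;
              }
              poll->wqh = wqh;
              return 0;
      }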
  6. 06 Dec 2012, 1 commit
    • vhost: avoid backend flush on vring ops · 935cdee7
      Committed by Michael S. Tsirkin
      vring changes already do a flush internally where appropriate, so we do
      not need a second flush.
      
      The flush is currently not very expensive, but a follow-up patch makes flushing
      more heavyweight, so remove the extra flush here to avoid regressing performance
      when call or kick fds are changed on the data path. (A toy illustration follows
      this entry.)
      Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
      935cdee7
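
      A toy, purely illustrative model of the point above; all names (toy_vring,
      toy_set_call_fd, toy_flush_vring) are invented for this sketch and do not appear
      in the kernel:

      /* The per-vring update performs whatever flush it needs itself ... */
      struct toy_vring { int call_fd; };

      static void toy_flush_vring(struct toy_vring *vq)
      {
              (void)vq;                 /* stands in for waiting out the worker */
      }

      static void toy_set_call_fd(struct toy_vring *vq, int fd)
      {
              vq->call_fd = fd;
              toy_flush_vring(vq);      /* ... the flush happens inside the update ... */
      }

      static void toy_ioctl_set_call(struct toy_vring *vq, int fd)
      {
              toy_set_call_fd(vq, fd);
              /* ... so no additional device-wide flush is issued here any more */
      }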
  7. 03 Nov 2012, 4 commits
  8. 22 Jul 2012, 2 commits
  9. 14 Apr 2012, 1 commit
  10. 28 Feb 2012, 1 commit
  11. 27 Jul 2011, 1 commit
  12. 19 Jul 2011, 2 commits
    • vhost: init used ring after backend was set · f59281da
      Committed by Jason Wang
      Move the used ring initialization to after the backend is set. This
      makes it possible to disable the backend, tweak the used ring, and
      then restart. It also makes it possible to log the used ring write
      correctly. (A small ordering sketch follows this entry.)
      Signed-off-by: Jason Wang <jasowang@redhat.com>
      Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
      f59281da
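
      A small sketch of the ordering only, with invented names (toy_vq, toy_set_backend,
      toy_init_used); the real code operates on struct vhost_virtqueue:

      struct toy_vq {
              int backend_fd;        /* -1 means no backend attached      */
              unsigned int used_idx; /* stand-in for used-ring state      */
      };

      static void toy_init_used(struct toy_vq *vq)
      {
              vq->used_idx = 0;      /* in the kernel this write can also be logged */
      }

      static int toy_set_backend(struct toy_vq *vq, int fd)
      {
              vq->backend_fd = fd;   /* attach the backend first ...      */
              toy_init_used(vq);     /* ... then initialize the used ring */
              return 0;
      }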
    • vhost: vhost TX zero-copy support · bab632d6
      Committed by Michael S. Tsirkin
      From: Shirley Ma <mashirle@us.ibm.com>
      
      This adds experimental zero-copy support to vhost-net,
      disabled by default. To enable it, set the
      experimental_zcopytx module option to 1.
      
      This patch keeps the outstanding userspace buffers in the
      sequence they are delivered to vhost. An outstanding userspace buffer
      is marked as done once the lower device has finished DMA on it.
      This is detected through the last-reference kfree_skb callback. Two
      buffer indices are used for this purpose.
      
      The vhost-net device passes the userspace buffer info to the lower
      device's skb through message control. DMA-done status checks and guest
      notification are handled by handle_tx: in the worst case all buffers
      in the vq are in pending/done state, so we need to notify the guest to
      release DMA-done buffers before we can get any new buffers from the
      vq. (A sketch of the two-index bookkeeping follows this entry.)
      
      One known problem is that if the guest stops submitting
      buffers, buffers might never get used until some
      further action, e.g. a device reset. This does not
      seem to affect Linux guests.
      Signed-off-by: Shirley <xma@us.ibm.com>
      Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      bab632d6
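
      A self-contained sketch of the two-index bookkeeping described above; the names
      and the fixed ring size are invented (the kernel keeps upend/done-style indices
      on the virtqueue), so treat this as an illustration of the idea rather than the
      vhost-net code:

      #define TOY_RING 8

      enum toy_state { TOY_FREE, TOY_PENDING, TOY_DONE };

      struct toy_zcopy {
              enum toy_state st[TOY_RING];
              unsigned int upend;     /* next slot handed to the lower device */
              unsigned int done;      /* first slot not yet returned to guest */
      };

      /* Record a buffer in submission order when it is handed to the NIC. */
      static int toy_submit(struct toy_zcopy *z)
      {
              if (z->upend - z->done == TOY_RING)
                      return -1;      /* all slots pending/done: reap first */
              z->st[z->upend % TOY_RING] = TOY_PENDING;
              return (int)(z->upend++ % TOY_RING);
      }

      /* Stands in for the last-reference kfree_skb callback: DMA finished. */
      static void toy_complete(struct toy_zcopy *z, int slot)
      {
              z->st[slot] = TOY_DONE;
      }

      /* Called from the TX path: return finished buffers to the guest in order. */
      static unsigned int toy_reap(struct toy_zcopy *z)
      {
              unsigned int n = 0;

              while (z->done != z->upend && z->st[z->done % TOY_RING] == TOY_DONE) {
                      z->st[z->done % TOY_RING] = TOY_FREE;
                      z->done++;
                      n++;
              }
              return n;               /* caller adds used entries and notifies guest */
      }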
  13. 30 May 2011, 1 commit
  14. 01 Feb 2011, 1 commit
    • vhost: rcu annotation fixup · 5e18247b
      Committed by Michael S. Tsirkin
      When built with RCU checks enabled, vhost triggers
      bogus warnings, because vhost features are sometimes read without
      dev->mutex, and the private pointer is read under our variant of
      RCU, where the work item serves as the read-side critical section.
      
      Fixing this properly is not trivial.
      Disable the warnings by stubbing out the checks for now.
      Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
      5e18247b
  15. 09 Dec 2010, 1 commit
  16. 05 Oct 2010, 1 commit
    • vhost: max s/g to match qemu · e0e9b406
      Committed by Jason Wang
      Qemu supports up to UIO_MAXIOV s/g entries, so we have to match that, because
      guest drivers may rely on it.
      
      Allocate the indirect and log arrays dynamically to avoid using too much
      contiguous memory, and make the length of the hdr array match the header length,
      since each iovec entry holds at least one byte. (An allocation sketch follows
      this entry.)
      
      Tested by copying large files with and without migration in both Linux and
      Windows guests.
      Signed-off-by: Jason Wang <jasowang@redhat.com>
      Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
      e0e9b406
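
      A hedged sketch of the allocation pattern, using a simplified structure (toy_vq)
      rather than struct vhost_virtqueue; the point is only that the big per-vq arrays
      are allocated separately instead of being embedded in the structure:

      #include <stdlib.h>
      #include <sys/uio.h>            /* struct iovec; UIO_MAXIOV on Linux */

      #ifndef UIO_MAXIOV
      #define UIO_MAXIOV 1024
      #endif

      struct toy_vq {
              struct iovec *indirect;  /* UIO_MAXIOV entries, allocated at setup */
              unsigned long long *log; /* one log slot per descriptor            */
      };

      static int toy_vq_alloc(struct toy_vq *vq)
      {
              vq->indirect = calloc(UIO_MAXIOV, sizeof(*vq->indirect));
              vq->log = calloc(UIO_MAXIOV, sizeof(*vq->log));
              if (!vq->indirect || !vq->log) {
                      free(vq->indirect);
                      free(vq->log);
                      return -1;
              }
              return 0;               /* the vq structure itself stays small */
      }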
  17. 22 Aug 2010, 1 commit
  18. 28 Jul 2010, 2 commits
    • vhost-net: mergeable buffers support · 8dd014ad
      Committed by David Stevens
      This adds support for mergeable buffers in vhost-net: this is needed
      for older guests without indirect buffer support, as well
      as for zero copy with some devices.
      
      Includes changes by Michael S. Tsirkin to make the
      patch as low-risk as possible (i.e., close to no changes
      when the feature is disabled).
      Signed-off-by: David Stevens <dlstevens@us.ibm.com>
      Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
      8dd014ad
    • vhost: replace vhost_workqueue with per-vhost kthread · c23f3445
      Committed by Tejun Heo
      Replace vhost_workqueue with a per-vhost kthread.  Other than the callback
      argument changing from struct work_struct * to struct vhost_work *,
      there is no visible change to the vhost_poll_*() interface.
      
      This conversion makes each vhost use a dedicated kthread so that
      resource control via cgroups can be applied. (A user-space model of the
      worker follows this entry.)
      
      Partially based on Sridhar Samudrala's patch.
      
      * Updated to use the vhost_work substructure instead of using
        vhost_poll directly, at Michael's suggestion.
      
      * Added the flusher wake_up() optimization at Michael's suggestion.
      
      Changes by MST:
      * Converted atomics/barrier use to a spinlock.
      * Create the thread on SET_OWNER.
      * Fix flushing.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
      Cc: Sridhar Samudrala <samudrala.sridhar@gmail.com>
      c23f3445
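
      A compact user-space model of the per-device worker described above (the kernel
      uses a kthread plus a spinlock-protected list of struct vhost_work; the pthread
      names here are invented for illustration):

      #include <pthread.h>
      #include <stddef.h>

      struct toy_work {
              void (*fn)(struct toy_work *w);
              struct toy_work *next;
      };

      struct toy_dev {
              pthread_mutex_t lock;   /* plays the role of the vhost spinlock */
              pthread_cond_t kick;
              struct toy_work *head, *tail;
              int stop;
              pthread_t worker;       /* one dedicated worker per device */
      };

      static void *toy_worker(void *arg)
      {
              struct toy_dev *d = arg;

              pthread_mutex_lock(&d->lock);
              for (;;) {
                      while (!d->head && !d->stop)
                              pthread_cond_wait(&d->kick, &d->lock);
                      if (!d->head)
                              break;  /* stop requested and queue drained */
                      struct toy_work *w = d->head;
                      d->head = w->next;
                      if (!d->head)
                              d->tail = NULL;
                      pthread_mutex_unlock(&d->lock);
                      w->fn(w);       /* run the work item outside the lock */
                      pthread_mutex_lock(&d->lock);
              }
              pthread_mutex_unlock(&d->lock);
              return NULL;
      }

      /* Queue work and wake the device's worker (vhost_work_queue in spirit). */
      static void toy_queue(struct toy_dev *d, struct toy_work *w)
      {
              pthread_mutex_lock(&d->lock);
              w->next = NULL;
              if (d->tail)
                      d->tail->next = w;
              else
                      d->head = w;
              d->tail = w;
              pthread_cond_signal(&d->kick);
              pthread_mutex_unlock(&d->lock);
      }

      /* In the kernel the thread is created on the SET_OWNER ioctl. */
      static int toy_dev_start(struct toy_dev *d)
      {
              return pthread_create(&d->worker, NULL, toy_worker, d);
      }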
  19. 27 Jun 2010, 1 commit
  20. 15 Jan 2010, 1 commit
    • vhost_net: a kernel-level virtio server · 3a4d5c94
      Committed by Michael S. Tsirkin
      What it is: vhost net is a character device that can be used to reduce
      the number of system calls involved in virtio networking.
      Existing virtio net code is used in the guest without modification.
      
      There is some similarity with vringfd, with some differences and reduced scope:
      - uses eventfd for signalling
      - structures can be moved around in memory at any time (good for
        migration, bug work-arounds in userspace)
      - write logging is supported (good for migration)
      - support memory table and not just an offset (needed for kvm)
      
      Common virtio-related code has been put in a separate file, vhost.c, and
      can be made into a separate module if/when more backends appear.  I used
      Rusty's lguest.c as the source for developing this part: it supplied
      me with witty comments I wouldn't be able to write myself.
      
      What it is not: vhost net is not a bus, and not a generic new system
      call. No assumptions are made about how the guest performs hypercalls.
      Userspace hypervisors are supported as well as kvm.
      
      How it works: Basically, we connect virtio frontend (configured by
      userspace) to a backend. The backend could be a network device, or a tap
      device.  The backend is also configured by userspace, including vlan/mac
      etc.
      
      Status: This works for me, and I haven't seen any crashes.
      Compared to userspace, people reported improved latency (as I save up to
      4 system calls per packet), as well as better bandwidth and CPU
      utilization.
      
      Features that I plan to look at in the future:
      - mergeable buffers
      - zero copy
      - scalability tuning: figure out the best threading model to use
      
      Note on RCU usage (this is also documented in vhost.h, near
      private_pointer, which is the value protected by this variant of RCU):
      what is happening is that rcu_dereference() is used in a
      workqueue item.  The role of rcu_read_lock() is taken by the start of
      execution of the workqueue item, that of rcu_read_unlock() by the end of
      its execution, and that of synchronize_rcu() by
      flush_workqueue()/flush_work(). In the future we might need to apply
      some gcc attribute or sparse annotation to the function passed to
      INIT_WORK(). Paul's ack below is for this RCU usage. (A sketch of this
      pattern follows this entry.)
      
      (Includes fixes by Alan Cox <alan@linux.intel.com>,
      David L Stevens <dlstevens@us.ibm.com>,
      Chris Wright <chrisw@redhat.com>)
      Acked-by: Rusty Russell <rusty@rustcorp.com.au>
      Acked-by: Arnd Bergmann <arnd@arndb.de>
      Acked-by: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
      Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      3a4d5c94
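
      A hedged sketch of the RCU pattern described in the note above, written against
      ordinary kernel workqueue/RCU APIs; the demo_* structure and functions are
      illustrative, not the vhost code itself:

      #include <linux/kernel.h>
      #include <linux/rcupdate.h>
      #include <linux/workqueue.h>

      struct demo_dev {
              void __rcu *private_data;  /* the backend pointer being protected */
              struct work_struct work;   /* plays the part of handle_tx/rx work */
      };

      /* Bound to d->work via INIT_WORK(&d->work, demo_work_fn) at setup.
       * The work item is the read-side critical section: the backend pointer
       * read here stays valid until the item finishes running. */
      static void demo_work_fn(struct work_struct *work)
      {
              struct demo_dev *d = container_of(work, struct demo_dev, work);
              void *backend = rcu_dereference(d->private_data);

              if (backend) {
                      /* ... transmit / receive using the backend ... */
              }
      }

      /* Updater: publish the new pointer, then use flush_work() where
       * classic RCU would use synchronize_rcu(), so any in-flight work
       * item still using the old backend is waited out. */
      static void demo_set_backend(struct demo_dev *d, void *new_backend)
      {
              rcu_assign_pointer(d->private_data, new_backend);
              flush_work(&d->work);
      }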