1. 07 Jul 2013, 1 commit
  2. 11 Jun 2013, 1 commit
  3. 06 May 2013, 4 commits
  4. 01 May 2013, 4 commits
  5. 30 Jan 2013, 1 commit
    • vhost_net: handle polling errors when setting backend · 2b8b328b
      Committed by Jason Wang
      Currently, polling errors are ignored, which can lead to the following issues:
      
      - vhost removes itself unconditionally from the waitqueue when stopping the poll;
        this may crash the kernel, since the previous attempt to start polling may have
        failed to add it to the waitqueue
      - userspace may think the backend was successfully set even when polling failed
      
      Solve this by:
      
      - checking poll->wqh before trying to remove from the waitqueue
      - reporting polling errors from vhost_poll_start() and tx_poll_start(); the return
        value is checked and propagated when userspace sets the backend
      
      After this fix a polling failure can still occur after the backend is set; that
      case is addressed by the next patch. (A simplified sketch of the fix follows this
      entry.)
      Signed-off-by: Jason Wang <jasowang@redhat.com>
      Acked-by: Michael S. Tsirkin <mst@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      2b8b328b
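
      A minimal, hedged sketch of the two fixes described above, written against ordinary
      kernel wait-queue types; the demo_* structure and helpers are simplified stand-ins,
      not the actual vhost code:

      #include <linux/wait.h>
      #include <linux/poll.h>
      #include <linux/errno.h>

      /* Stand-in for struct vhost_poll: wqh is non-NULL only after a
       * successful start, which is what makes the stop path safe. */
      struct demo_poll {
              wait_queue_head_t *wqh;
              wait_queue_entry_t wait;
      };

      /* Stop path: only detach if an earlier start really attached us. */
      static void demo_poll_stop(struct demo_poll *poll)
      {
              if (poll->wqh) {
                      remove_wait_queue(poll->wqh, &poll->wait);
                      poll->wqh = NULL;
              }
      }

      /* Start path: report the failure instead of ignoring it, so the
       * SET_BACKEND ioctl can return an error to userspace. */
      static int demo_poll_start(struct demo_poll *poll, unsigned int mask,
                                 wait_queue_head_t *wqh)
      {
              if (mask & POLLERR) {
                      demo_poll_stop(poll);
                      return -EINVAL;
              }
              poll->wqh = wqh;
              return 0;
      }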
  6. 06 Dec 2012, 1 commit
    • vhost: avoid backend flush on vring ops · 935cdee7
      Committed by Michael S. Tsirkin
      vring changes already do a flush internally where appropriate, so we do
      not need a second flush.
      
      The flush is currently not very expensive, but a follow-up patch makes flushing
      more heavyweight, so remove the extra flush here to avoid regressing performance
      when call or kick fds are changed on the data path. (A toy illustration follows
      this entry.)
      Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
      935cdee7
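
      A toy, purely illustrative model of the point above; all names (toy_vring,
      toy_set_call_fd, toy_flush_vring) are invented for this sketch and do not appear
      in the kernel:

      /* The per-vring update performs whatever flush it needs itself ... */
      struct toy_vring { int call_fd; };

      static void toy_flush_vring(struct toy_vring *vq)
      {
              (void)vq;                 /* stands in for waiting out the worker */
      }

      static void toy_set_call_fd(struct toy_vring *vq, int fd)
      {
              vq->call_fd = fd;
              toy_flush_vring(vq);      /* ... the flush happens inside the update ... */
      }

      static void toy_ioctl_set_call(struct toy_vring *vq, int fd)
      {
              toy_set_call_fd(vq, fd);
              /* ... so no additional device-wide flush is issued here any more */
      }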
  7. 03 Nov 2012, 4 commits
  8. 22 Jul 2012, 2 commits
  9. 14 Apr 2012, 1 commit
  10. 28 Feb 2012, 1 commit
  11. 27 Jul 2011, 1 commit
  12. 19 Jul 2011, 2 commits
    • vhost: init used ring after backend was set · f59281da
      Committed by Jason Wang
      Move the used ring initialization to after the backend is set. This
      makes it possible to disable the backend, tweak the used ring, and
      then restart. It also makes it possible to log the used ring write
      correctly. (A small ordering sketch follows this entry.)
      Signed-off-by: Jason Wang <jasowang@redhat.com>
      Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
      f59281da
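
      A small sketch of the ordering only, with invented names (toy_vq, toy_set_backend,
      toy_init_used); the real code operates on struct vhost_virtqueue:

      struct toy_vq {
              int backend_fd;        /* -1 means no backend attached      */
              unsigned int used_idx; /* stand-in for used-ring state      */
      };

      static void toy_init_used(struct toy_vq *vq)
      {
              vq->used_idx = 0;      /* in the kernel this write can also be logged */
      }

      static int toy_set_backend(struct toy_vq *vq, int fd)
      {
              vq->backend_fd = fd;   /* attach the backend first ...      */
              toy_init_used(vq);     /* ... then initialize the used ring */
              return 0;
      }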
    • vhost: vhost TX zero-copy support · bab632d6
      Committed by Michael S. Tsirkin
      From: Shirley Ma <mashirle@us.ibm.com>
      
      This adds experimental zero-copy support to vhost-net,
      disabled by default. To enable it, set the
      experimental_zcopytx module option to 1.
      
      This patch keeps the outstanding userspace buffers in the
      sequence they are delivered to vhost. An outstanding userspace buffer
      is marked as done once the lower device has finished DMA on it.
      This is detected through the last-reference kfree_skb callback. Two
      buffer indices are used for this purpose.
      
      The vhost-net device passes the userspace buffer info to the lower
      device's skb through message control. DMA-done status checks and guest
      notification are handled by handle_tx: in the worst case all buffers
      in the vq are in pending/done state, so we need to notify the guest to
      release DMA-done buffers before we can get any new buffers from the
      vq. (A sketch of the two-index bookkeeping follows this entry.)
      
      One known problem is that if the guest stops submitting
      buffers, buffers might never get used until some
      further action, e.g. a device reset. This does not
      seem to affect Linux guests.
      Signed-off-by: Shirley <xma@us.ibm.com>
      Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      bab632d6
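
      A self-contained sketch of the two-index bookkeeping described above; the names
      and the fixed ring size are invented (the kernel keeps upend/done-style indices
      on the virtqueue), so treat this as an illustration of the idea rather than the
      vhost-net code:

      #define TOY_RING 8

      enum toy_state { TOY_FREE, TOY_PENDING, TOY_DONE };

      struct toy_zcopy {
              enum toy_state st[TOY_RING];
              unsigned int upend;     /* next slot handed to the lower device */
              unsigned int done;      /* first slot not yet returned to guest */
      };

      /* Record a buffer in submission order when it is handed to the NIC. */
      static int toy_submit(struct toy_zcopy *z)
      {
              if (z->upend - z->done == TOY_RING)
                      return -1;      /* all slots pending/done: reap first */
              z->st[z->upend % TOY_RING] = TOY_PENDING;
              return (int)(z->upend++ % TOY_RING);
      }

      /* Stands in for the last-reference kfree_skb callback: DMA finished. */
      static void toy_complete(struct toy_zcopy *z, int slot)
      {
              z->st[slot] = TOY_DONE;
      }

      /* Called from the TX path: return finished buffers to the guest in order. */
      static unsigned int toy_reap(struct toy_zcopy *z)
      {
              unsigned int n = 0;

              while (z->done != z->upend && z->st[z->done % TOY_RING] == TOY_DONE) {
                      z->st[z->done % TOY_RING] = TOY_FREE;
                      z->done++;
                      n++;
              }
              return n;               /* caller adds used entries and notifies guest */
      }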
  13. 30 May 2011, 1 commit
  14. 01 Feb 2011, 1 commit
    • vhost: rcu annotation fixup · 5e18247b
      Committed by Michael S. Tsirkin
      When built with RCU checks enabled, vhost triggers
      bogus warnings, because vhost features are sometimes read without
      dev->mutex, and the private pointer is read under our variant of
      RCU, where the work item serves as the read-side critical section.
      
      Fixing this properly is not trivial.
      Disable the warnings by stubbing out the checks for now.
      Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
      5e18247b
  15. 09 Dec 2010, 1 commit
  16. 05 Oct 2010, 1 commit
    • vhost: max s/g to match qemu · e0e9b406
      Committed by Jason Wang
      Qemu supports up to UIO_MAXIOV s/g entries, so we have to match that, because
      guest drivers may rely on it.
      
      Allocate the indirect and log arrays dynamically to avoid using too much
      contiguous memory, and make the length of the hdr array match the header length,
      since each iovec entry holds at least one byte. (An allocation sketch follows
      this entry.)
      
      Tested by copying large files with and without migration in both Linux and
      Windows guests.
      Signed-off-by: Jason Wang <jasowang@redhat.com>
      Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
      e0e9b406
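
      A hedged sketch of the allocation pattern, using a simplified structure (toy_vq)
      rather than struct vhost_virtqueue; the point is only that the big per-vq arrays
      are allocated separately instead of being embedded in the structure:

      #include <stdlib.h>
      #include <sys/uio.h>            /* struct iovec; UIO_MAXIOV on Linux */

      #ifndef UIO_MAXIOV
      #define UIO_MAXIOV 1024
      #endif

      struct toy_vq {
              struct iovec *indirect;  /* UIO_MAXIOV entries, allocated at setup */
              unsigned long long *log; /* one log slot per descriptor            */
      };

      static int toy_vq_alloc(struct toy_vq *vq)
      {
              vq->indirect = calloc(UIO_MAXIOV, sizeof(*vq->indirect));
              vq->log = calloc(UIO_MAXIOV, sizeof(*vq->log));
              if (!vq->indirect || !vq->log) {
                      free(vq->indirect);
                      free(vq->log);
                      return -1;
              }
              return 0;               /* the vq structure itself stays small */
      }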
  17. 22 Aug 2010, 1 commit
  18. 28 Jul 2010, 2 commits
    • vhost-net: mergeable buffers support · 8dd014ad
      Committed by David Stevens
      This adds support for mergeable buffers in vhost-net: this is needed
      for older guests without indirect buffer support, as well
      as for zero copy with some devices.
      
      Includes changes by Michael S. Tsirkin to make the
      patch as low-risk as possible (i.e., close to no changes
      when the feature is disabled).
      Signed-off-by: David Stevens <dlstevens@us.ibm.com>
      Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
      8dd014ad
    • vhost: replace vhost_workqueue with per-vhost kthread · c23f3445
      Committed by Tejun Heo
      Replace vhost_workqueue with a per-vhost kthread.  Other than the callback
      argument changing from struct work_struct * to struct vhost_work *,
      there is no visible change to the vhost_poll_*() interface.
      
      This conversion makes each vhost use a dedicated kthread so that
      resource control via cgroups can be applied. (A user-space model of the
      worker follows this entry.)
      
      Partially based on Sridhar Samudrala's patch.
      
      * Updated to use the vhost_work substructure instead of using
        vhost_poll directly, at Michael's suggestion.
      
      * Added the flusher wake_up() optimization at Michael's suggestion.
      
      Changes by MST:
      * Converted atomics/barrier use to a spinlock.
      * Create the thread on SET_OWNER.
      * Fix flushing.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
      Cc: Sridhar Samudrala <samudrala.sridhar@gmail.com>
      c23f3445
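
      A compact user-space model of the per-device worker described above (the kernel
      uses a kthread plus a spinlock-protected list of struct vhost_work; the pthread
      names here are invented for illustration):

      #include <pthread.h>
      #include <stddef.h>

      struct toy_work {
              void (*fn)(struct toy_work *w);
              struct toy_work *next;
      };

      struct toy_dev {
              pthread_mutex_t lock;   /* plays the role of the vhost spinlock */
              pthread_cond_t kick;
              struct toy_work *head, *tail;
              int stop;
              pthread_t worker;       /* one dedicated worker per device */
      };

      static void *toy_worker(void *arg)
      {
              struct toy_dev *d = arg;

              pthread_mutex_lock(&d->lock);
              for (;;) {
                      while (!d->head && !d->stop)
                              pthread_cond_wait(&d->kick, &d->lock);
                      if (!d->head)
                              break;  /* stop requested and queue drained */
                      struct toy_work *w = d->head;
                      d->head = w->next;
                      if (!d->head)
                              d->tail = NULL;
                      pthread_mutex_unlock(&d->lock);
                      w->fn(w);       /* run the work item outside the lock */
                      pthread_mutex_lock(&d->lock);
              }
              pthread_mutex_unlock(&d->lock);
              return NULL;
      }

      /* Queue work and wake the device's worker (vhost_work_queue in spirit). */
      static void toy_queue(struct toy_dev *d, struct toy_work *w)
      {
              pthread_mutex_lock(&d->lock);
              w->next = NULL;
              if (d->tail)
                      d->tail->next = w;
              else
                      d->head = w;
              d->tail = w;
              pthread_cond_signal(&d->kick);
              pthread_mutex_unlock(&d->lock);
      }

      /* In the kernel the thread is created on the SET_OWNER ioctl. */
      static int toy_dev_start(struct toy_dev *d)
      {
              return pthread_create(&d->worker, NULL, toy_worker, d);
      }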
  19. 27 Jun 2010, 1 commit
  20. 15 Jan 2010, 1 commit
    • vhost_net: a kernel-level virtio server · 3a4d5c94
      Committed by Michael S. Tsirkin
      What it is: vhost net is a character device that can be used to reduce
      the number of system calls involved in virtio networking.
      Existing virtio net code is used in the guest without modification.
      
      There is some similarity with vringfd, with some differences and reduced scope:
      - uses eventfd for signalling
      - structures can be moved around in memory at any time (good for
        migration, bug work-arounds in userspace)
      - write logging is supported (good for migration)
      - support memory table and not just an offset (needed for kvm)
      
      Common virtio-related code has been put in a separate file, vhost.c, and
      can be made into a separate module if/when more backends appear.  I used
      Rusty's lguest.c as the source for developing this part: it supplied
      me with witty comments I wouldn't be able to write myself.
      
      What it is not: vhost net is not a bus, and not a generic new system
      call. No assumptions are made about how the guest performs hypercalls.
      Userspace hypervisors are supported as well as kvm.
      
      How it works: Basically, we connect virtio frontend (configured by
      userspace) to a backend. The backend could be a network device, or a tap
      device.  The backend is also configured by userspace, including vlan/mac
      etc.
      
      Status: This works for me, and I haven't seen any crashes.
      Compared to userspace, people reported improved latency (as I save up to
      4 system calls per packet), as well as better bandwidth and CPU
      utilization.
      
      Features that I plan to look at in the future:
      - mergeable buffers
      - zero copy
      - scalability tuning: figure out the best threading model to use
      
      Note on RCU usage (this is also documented in vhost.h, near
      private_pointer, which is the value protected by this variant of RCU):
      what is happening is that rcu_dereference() is used in a
      workqueue item.  The role of rcu_read_lock() is taken by the start of
      execution of the workqueue item, that of rcu_read_unlock() by the end of
      its execution, and that of synchronize_rcu() by
      flush_workqueue()/flush_work(). In the future we might need to apply
      some gcc attribute or sparse annotation to the function passed to
      INIT_WORK(). Paul's ack below is for this RCU usage. (A sketch of this
      pattern follows this entry.)
      
      (Includes fixes by Alan Cox <alan@linux.intel.com>,
      David L Stevens <dlstevens@us.ibm.com>,
      Chris Wright <chrisw@redhat.com>)
      Acked-by: Rusty Russell <rusty@rustcorp.com.au>
      Acked-by: Arnd Bergmann <arnd@arndb.de>
      Acked-by: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
      Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      3a4d5c94
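
      A hedged sketch of the RCU pattern described in the note above, written against
      ordinary kernel workqueue/RCU APIs; the demo_* structure and functions are
      illustrative, not the vhost code itself:

      #include <linux/kernel.h>
      #include <linux/rcupdate.h>
      #include <linux/workqueue.h>

      struct demo_dev {
              void __rcu *private_data;  /* the backend pointer being protected */
              struct work_struct work;   /* plays the part of handle_tx/rx work */
      };

      /* Bound to d->work via INIT_WORK(&d->work, demo_work_fn) at setup.
       * The work item is the read-side critical section: the backend pointer
       * read here stays valid until the item finishes running. */
      static void demo_work_fn(struct work_struct *work)
      {
              struct demo_dev *d = container_of(work, struct demo_dev, work);
              void *backend = rcu_dereference(d->private_data);

              if (backend) {
                      /* ... transmit / receive using the backend ... */
              }
      }

      /* Updater: publish the new pointer, then use flush_work() where
       * classic RCU would use synchronize_rcu(), so any in-flight work
       * item still using the old backend is waited out. */
      static void demo_set_backend(struct demo_dev *d, void *new_backend)
      {
              rcu_assign_pointer(d->private_data, new_backend);
              flush_work(&d->work);
      }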