1. 30 May 2011, 3 commits
    • virtio: add api for delayed callbacks · 7ab358c2
      Michael S. Tsirkin committed
      Add an API that tells the other side that callbacks
      should be delayed until a lot of work has been done.
      Implement using the new event_idx feature.
      
      Note: it might seem advantageous to let the drivers
      ask for a callback after a specific capacity has
      been reached. However, as a single head can
      free many entries in the descriptor table,
      we don't really have a clue about capacity
      until get_buf is called. The API is the simplest
      to implement at the moment, we'll see what kind of
      hints drivers can pass when there's more than one
      user of the feature.
      Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
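A minimal sketch of how a delayed callback could be requested through event_idx: instead of asking for a callback on the very next used buffer, the driver publishes an event index some way into the outstanding work. The helper name `delayed_used_event` and the three-quarters fraction are illustrative assumptions; the commit message does not state how far ahead the index is placed.

```c
#include <stdint.h>

/*
 * Hypothetical helper: choose the used index at which the next callback
 * is wanted. last_used_idx is the last used entry the driver consumed,
 * avail_idx is the last index the driver made available.
 * The 3/4 fraction is an assumption made for this sketch.
 */
static inline uint16_t delayed_used_event(uint16_t last_used_idx,
                                          uint16_t avail_idx)
{
    /* Buffers handed to the other side and not yet consumed (mod 2^16). */
    uint16_t outstanding = (uint16_t)(avail_idx - last_used_idx);

    /* Ask for a callback only after most of them have been used. */
    return (uint16_t)(last_used_idx + outstanding * 3 / 4);
}
```

With 8 buffers outstanding and nothing consumed yet, the sketch asks for a callback once 6 of them have been used, so one interrupt covers a batch of completions.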
    • virtio_ring: support event idx feature · a5c262c5
      Michael S. Tsirkin committed
      Support for the new event idx feature:
      1. When enabling interrupts, publish the current avail index
         value to the host to get interrupts on the next update.
      2. Use the new avail_event feature to reduce the number
         of exits from the guest.
      
      Simple test with the simulator:
      
      [virtio]# time ./virtio_test
      spurious wakeus: 0x7
      
      real    0m0.169s
      user    0m0.140s
      sys     0m0.019s
      [virtio]# time ./virtio_test --no-event-idx
      spurious wakeus: 0x11
      
      real    0m0.649s
      user    0m0.295s
      sys     0m0.335s
      Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
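The check at the heart of event idx can be sketched as below. The function name `need_event` is chosen for this sketch; the expression mirrors, to my understanding, the `vring_need_event()` helper described in the virtio specification. One side publishes the index at which it wants to be notified, and the other side notifies only if its index has just moved past that value.

```c
#include <stdint.h>

/*
 * Returns nonzero when the index moved from old_idx to new_idx and, in
 * doing so, stepped past event_idx (the index the other side published).
 * All arithmetic is modulo 2^16, so index wrap-around is handled.
 */
static inline int need_event(uint16_t event_idx, uint16_t new_idx,
                             uint16_t old_idx)
{
    return (uint16_t)(new_idx - event_idx - 1) < (uint16_t)(new_idx - old_idx);
}
```

Publishing the current index when enabling interrupts, as item 1 of the commit message describes, makes the very next update satisfy this test; publishing an index further ahead suppresses the notifications in between.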
    • virtio balloon: kill tell-host-first logic · bf50e69f
      Dave Hansen committed
      The virtio balloon driver has a VIRTIO_BALLOON_F_MUST_TELL_HOST
      feature bit.  Whenever the bit is set, the guest kernel must
      always tell the host before we free pages back to the allocator.
      Without this feature, we might free a page (and have another
      user touch it) while the hypervisor is unprepared for it.
      
      But, if the bit is _not_ set, we are under no obligation to
      reverse the order; we're under no obligation to do _anything_.
      As of now, qemu-kvm defines the bit, but doesn't set it.
      
      This patch makes the "tell host first" logic the only case.  This
      should make everybody happy, and reduce the amount of untested or
      untestable code in the kernel.
      
      This _also_ means that we don't have to preserve a pfn list
      after the pages are freed, which should let us get rid of some
      temporary storage (vb->pfns) eventually.
      Signed-off-by: Dave Hansen <dave@linux.vnet.ibm.com>
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
  2. 21 April 2011, 2 commits
    • virtio_pci: Prevent double-free of pci regions after device hot-unplug · 31a3ddda
      Amit Shah committed
      In the case where a virtio-console port is in use (opened by a program)
      and a virtio-console device is removed, the port is kept around but all
      the virtio-related state is assumed to be gone.
      
      When the port is finally released (close() called), we call
      device_destroy() on the port's device.  This results in the parent
      device's structures being freed as well.  This includes the PCI regions
      for the virtio-console PCI device.
      
      Once this is done, however, virtio_pci_release_dev() kicks in, as the
      last ref to the virtio device is now gone, and attempts to do
      
           pci_iounmap(pci_dev, vp_dev->ioaddr);
           pci_release_regions(pci_dev);
           pci_disable_device(pci_dev);
      
      which results in a double-free warning.
      
      Move the code that releases regions, etc., to the virtio_pci_remove()
      function, and all that's now left in release_dev is the final freeing of
      the vp_dev.
      Signed-off-by: Amit Shah <amit.shah@redhat.com>
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
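The split of responsibilities described above can be sketched with toy types. Every name below (`toy_pci_dev`, `toy_vp_dev`, the `toy_` functions) is hypothetical and only models the ordering; this is not the driver's actual code.

```c
/* Toy stand-ins for the PCI device and the virtio-pci device state. */
struct toy_pci_dev {
    int regions_held;   /* 1 while the PCI regions are claimed */
    int release_calls;  /* how many times the regions were released */
};

struct toy_vp_dev {
    struct toy_pci_dev *pci;
    int freed;          /* 1 once the vp_dev itself has been freed */
};

/* Models the iounmap / release_regions / disable_device sequence. */
static void toy_release_regions(struct toy_pci_dev *pci)
{
    pci->release_calls++;
    pci->regions_held = 0;
}

/* remove(): runs at hot-unplug and is the only place regions are released. */
static void toy_virtio_pci_remove(struct toy_vp_dev *vp)
{
    toy_release_regions(vp->pci);
}

/* release_dev(): runs when the last reference drops; only frees vp_dev. */
static void toy_virtio_pci_release_dev(struct toy_vp_dev *vp)
{
    vp->freed = 1;
}
```

Because the late release callback no longer touches the PCI regions, they are released exactly once no matter how long an open port keeps the virtio device alive.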
    • virtio: Decrement avail idx on buffer detach · b3258ff1
      Amit Shah committed
      When detaching a buffer from a vq, the avail.idx value should be
      decremented as well.
      
      This was noticed by hot-unplugging a virtio console port and then
      plugging in a new one on the same number (re-using the vqs which were
      just 'disowned').  qemu reported
      
         'Guest moved used index from 0 to 256'
      
      when any IO was attempted on the new port.
      
      CC: stable@kernel.org
      Reported-by: juzhang <juzhang@redhat.com>
      Signed-off-by: Amit Shah <amit.shah@redhat.com>
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
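A toy model of the fix, assuming a simplified queue in which `data[i]` holds the driver's token for a buffer that was offered but never consumed. The struct layout, `RING_SIZE`, and the `toy_` names are assumptions made for illustration only.

```c
#include <stddef.h>
#include <stdint.h>

#define RING_SIZE 4  /* assumed toy ring size */

struct toy_vq {
    void *data[RING_SIZE];  /* token per descriptor head, NULL when unused */
    uint16_t avail_idx;     /* stand-in for avail.idx as seen by the host */
    unsigned int num_free;  /* free descriptor count */
};

/* Detach one buffer the host never consumed; returns NULL when none is left. */
static void *toy_detach_unused_buf(struct toy_vq *vq)
{
    for (unsigned int i = 0; i < RING_SIZE; i++) {
        void *buf = vq->data[i];

        if (!buf)
            continue;
        vq->data[i] = NULL;
        vq->num_free++;
        vq->avail_idx--;  /* the buffer is no longer offered to the host */
        return buf;
    }
    return NULL;
}
```

Without the decrement, a queue that is reused after hot-unplug would start with a stale non-zero avail index, which is the kind of mismatch qemu reported.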
  3. 20 January 2011, 1 commit
  4. 24 November 2010, 2 commits
  5. 26 July 2010, 1 commit
  6. 23 June 2010, 2 commits
  7. 19 May 2010, 3 commits
  8. 22 April 2010, 1 commit
  9. 30 March 2010, 1 commit
    • include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h · 5a0e3ad6
      Tejun Heo committed
      
      percpu.h is included by sched.h and module.h and thus ends up being
      included when building most .c files.  percpu.h includes slab.h which
      in turn includes gfp.h making everything defined by the two files
      universally available and complicating inclusion dependencies.
      
      The percpu.h -> slab.h dependency is about to be removed.  Prepare for
      this change by updating users of gfp and slab facilities to include those
      headers directly instead of assuming availability.  As this conversion
      needs to touch a large number of source files, the following script is
      used as the basis of conversion.
      
        http://userweb.kernel.org/~tj/misc/slabh-sweep.py
      
      The script does the following.
      
      * Scan files for gfp and slab usages and update includes such that
        only the necessary includes are there.  ie. if only gfp is used,
        gfp.h, if slab is used, slab.h.
      
      * When the script inserts a new include, it looks at the include
        blocks and tries to put the new include such that its order conforms
        to its surroundings.  It's put in the include block which contains
        core kernel includes, in the same order that the rest are ordered -
        alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
        doesn't seem to be any matching order.
      
      * If the script can't find a place to put a new include (mostly
        because the file doesn't have fitting include block), it prints out
        an error message indicating which .h file needs to be added to the
        file.
      
      The conversion was done in the following steps.
      
      1. The initial automatic conversion of all .c files updated slightly
         over 4000 files, deleting around 700 includes and adding ~480 gfp.h
         and ~3000 slab.h inclusions.  The script emitted errors for ~400
         files.
      
      2. Each error was manually checked.  Some didn't need the inclusion,
         some needed manual addition while adding it to implementation .h or
         embedding .c file was more appropriate for others.  This step added
         inclusions to around 150 files.
      
      3. The script was run again and the output was compared to the edits
         from #2 to make sure no file was left behind.
      
      4. Several build tests were done and a couple of problems were fixed.
         e.g. lib/decompress_*.c used malloc/free() wrappers around slab
         APIs requiring slab.h to be added manually.
      
      5. The script was run on all .h files but without automatically
         editing them as sprinkling gfp.h and slab.h inclusions around .h
         files could easily lead to inclusion dependency hell.  Most gfp.h
         inclusion directives were ignored as stuff from gfp.h was usually
         widely available and often used in preprocessor macros.  Each
         slab.h inclusion directive was examined and added manually as
         necessary.
      
      6. percpu.h was updated not to include slab.h.
      
      7. Build tests were done on the following configurations and failures
         were fixed.  CONFIG_GCOV_KERNEL was turned off for all tests (as my
         distributed build env didn't work with gcov compiles) and a few
         more options had to be turned off depending on archs to make things
         build (like ipr on powerpc/64 which failed due to missing writeq).
      
         * x86 and x86_64 UP and SMP allmodconfig and a custom test config.
         * powerpc and powerpc64 SMP allmodconfig
         * sparc and sparc64 SMP allmodconfig
         * ia64 SMP allmodconfig
         * s390 SMP allmodconfig
         * alpha SMP allmodconfig
         * um on x86_64 SMP allmodconfig
      
      8. percpu.h modifications were reverted so that it could be applied as
         a separate patch and serve as bisection point.
      
      Given the fact that I had only a couple of failures from tests on step
      6, I'm fairly confident about the coverage of this conversion patch.
      If there is a breakage, it's likely to be something in one of the arch
      headers which should be easily discoverable on most builds of
      the specific arch.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
  10. 16 March 2010, 1 commit
  11. 02 March 2010, 1 commit
  12. 01 March 2010, 1 commit
    • virtio: fix out of range array access · 31198159
      Michael S. Tsirkin committed
      I have observed the following error on virtio-net module unload:
      
      ------------[ cut here ]------------
      WARNING: at kernel/irq/manage.c:858 __free_irq+0xa0/0x14c()
      Hardware name: Bochs
      Trying to free already-free IRQ 0
      Modules linked in: virtio_net(-) virtio_blk virtio_pci virtio_ring
      virtio af_packet e1000 shpchp aacraid uhci_hcd ohci_hcd ehci_hcd [last
      unloaded: scsi_wait_scan]
      Pid: 1957, comm: rmmod Not tainted 2.6.33-rc8-vhost #24
      Call Trace:
       [<ffffffff8103e195>] warn_slowpath_common+0x7c/0x94
       [<ffffffff8103e204>] warn_slowpath_fmt+0x41/0x43
       [<ffffffff810a7a36>] ? __free_pages+0x5a/0x70
       [<ffffffff8107cc00>] __free_irq+0xa0/0x14c
       [<ffffffff8107cceb>] free_irq+0x3f/0x65
       [<ffffffffa0081424>] vp_del_vqs+0x81/0xb1 [virtio_pci]
       [<ffffffffa0091d29>] virtnet_remove+0xda/0x10b [virtio_net]
       [<ffffffffa0075200>] virtio_dev_remove+0x22/0x4a [virtio]
       [<ffffffff812709ee>] __device_release_driver+0x66/0xac
       [<ffffffff81270ab7>] driver_detach+0x83/0xa9
       [<ffffffff8126fc66>] bus_remove_driver+0x91/0xb4
       [<ffffffff81270fcf>] driver_unregister+0x6c/0x74
       [<ffffffffa0075418>] unregister_virtio_driver+0xe/0x10 [virtio]
       [<ffffffffa0091c4d>] fini+0x15/0x17 [virtio_net]
       [<ffffffff8106997b>] sys_delete_module+0x1c3/0x230
       [<ffffffff81007465>] ? old_ich_force_enable_hpet+0x117/0x164
       [<ffffffff813bb720>] ? do_page_fault+0x29c/0x2cc
       [<ffffffff81028e58>] sysenter_dispatch+0x7/0x27
      ---[ end trace 15e88e4c576cc62b ]---
      
      The bug is in virtio-pci: we use msix_vector as array index to get irq
      entry, but some vqs do not have a dedicated vector so this causes an out
      of bounds access.  By chance, we seem to often get 0 value, which
      results in this error.
      
      Fix by verifying that vector is legal before using it as index.
      Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
      Acked-by: Anthony Liguori <aliguori@us.ibm.com>
      Acked-by: Shirley Ma <xma@us.ibm.com>
      Acked-by: Amit Shah <amit.shah@redhat.com>
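The guard described above can be sketched as follows. `NO_VECTOR`, `irq_for_vector`, and the plain `int` table are assumptions for this sketch; the point is only that the per-queue vector must be validated before it indexes the table of MSI-X entries.

```c
#include <stdint.h>

#define NO_VECTOR 0xffff  /* assumed sentinel: the vq has no dedicated vector */

/*
 * Look up the irq for a queue's MSI-X vector.
 * Returns -1 when the queue has no dedicated vector or the value is out
 * of range, so the caller can skip freeing an irq it never requested.
 */
static int irq_for_vector(const int *irqs, unsigned int nvectors,
                          uint16_t vector)
{
    if (vector == NO_VECTOR || vector >= nvectors)
        return -1;
    return irqs[vector];
}
```

A caller in the teardown path would call the equivalent of `free_irq()` only when the result is not -1.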
  13. 24 February 2010, 8 commits
    • virtio: Initialize vq->data entries to NULL · 3b870624
      Amit Shah committed
      vq operations depend on vq->data[i] being NULL to figure out if the vq
      entry is in use (since the previous patch).
      
      We have to initialize them to NULL to ensure we don't work with junk
      data and trigger false BUG_ONs.
      Signed-off-by: Amit Shah <amit.shah@redhat.com>
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Shirley Ma <xma@us.ibm.com>
    • virtio: Add ability to detach unused buffers from vrings · c021eac4
      Shirley Ma committed
      There's currently no way for a virtio driver to ask for unused
      buffers, so it has to keep a list itself to reclaim them at shutdown.
      This is redundant, since virtio_ring stores that information.  So
      add a new hook to do this.
      Signed-off-by: Shirley Ma <xma@us.ibm.com>
      Signed-off-by: Amit Shah <amit.shah@redhat.com>
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
    • virtio: use smp_XX barriers on SMP · d57ed95d
      Michael S. Tsirkin committed
      virtio is communicating with a virtual "device" that actually runs on
      another host processor. Thus SMP barriers can be used to control
      memory access ordering.
      
      Where possible, we should use SMP barriers which are more lightweight than
      mandatory barriers, because mandatory barriers also control MMIO effects on
      accesses through relaxed memory I/O windows (which virtio does not use)
      (compare specifically smp_rmb and rmb on x86_64).
      
      We can't just use smp_mb and friends though, because
      we must force memory ordering even if guest is UP since host could be
      running on another CPU, but SMP barriers are defined to barrier() in
      that configuration. So, for UP fall back to mandatory barriers instead.
      Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
    • virtio: remove bogus barriers from DEBUG version of virtio_ring.c · 97a545ab
      Rusty Russell committed
      With DEBUG defined, we add an ->in_use flag to detect if the caller
      invokes two virtio methods in parallel.  The barriers attempt to ensure
      timely update of the ->in_use flag.
      
      But they're voodoo: if we need these barriers it implies that the
      calling code doesn't have sufficient synchronization to ensure the
      code paths aren't invoked at the same time anyway, and we want to
      detect it.
      
      Also, adding barriers changes timing, so turning on debug has more
      chance of hiding real problems.
      
      Thanks to MST for drawing my attention to this code...
      
      CC: Michael S. Tsirkin <mst@redhat.com>
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
    • virtio: fix balloon without VIRTIO_BALLOON_F_STATS_VQ · 169c246a
      Rusty Russell committed
      When running under qemu-kvm-0.11.0:
      
      	BUG: unable to handle kernel paging request at 56e58955
      	...
      	Process vballoon (pid: 1297, ti=c7976000 task=c70a6ca0 task.ti=c7
      	...
      	Call Trace:
      	 [<c88253a3>] ? balloon+0x1b3/0x440 [virtio_balloon]
      	 [<c041c2d7>] ? schedule+0x327/0x9d0
      	 [<c88251f0>] ? balloon+0x0/0x440 [virtio_balloon]
      	 [<c014a2d4>] ? kthread+0x74/0x80
      	 [<c014a260>] ? kthread+0x0/0x80
      	 [<c0103b36>] ? kernel_thread_helper+0x6/0x30
      
      need_stats_update should be zero-initialized.
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
      Acked-by: Adam Litke <agl@us.ibm.com>
    • virtio: Fix scheduling while atomic in virtio_balloon stats · 1f34c71a
      Adam Litke committed
      This is a fix for my earlier patch: "virtio: Add memory statistics reporting to
      the balloon driver (V4)".
      
      I discovered that all_vm_events() can sleep and therefore stats collection
      cannot be done in interrupt context.  One solution is to handle the interrupt
      by noting that stats need to be collected and waking the existing vballoon
      kthread which will complete the work via stats_handle_request().  Rusty, is
      this a saner way of doing business?
      
      There is one issue that I would like a broader opinion on.  In stats_request, I
      update vb->need_stats_update and then wake up the kthread.  The kthread uses
      vb->need_stats_update as a condition variable.  Do I need a memory barrier
      between the update and wake_up to ensure that my kthread sees the correct
      value?  My testing suggests that it is not needed but I would like some
      confirmation from the experts.
      Signed-off-by: Adam Litke <agl@us.ibm.com>
      To: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Anthony Liguori <aliguori@linux.vnet.ibm.com>
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
    • virtio: Add memory statistics reporting to the balloon driver (V4) · 9564e138
      Adam Litke committed
      Changes since V3:
       - Do not do endian conversions as they will be done in the host
       - Report stats that reference a quantity of memory in bytes
       - Minor coding style updates
      
      Changes since V2:
       - Increase stat field size to 64 bits
       - Report all sizes in kb (not pages)
       - Drop anon_pages stat and fix endianness conversion
      
      Changes since V1:
       - Use a virtqueue instead of the device config space
      
      When using ballooning to manage overcommitted memory on a host, a system for
      guests to communicate their memory usage to the host can provide information
      that will minimize the impact of ballooning on the guests.  The current method
      employs a daemon running in each guest that communicates memory statistics to a
      host daemon at a specified time interval.  The host daemon aggregates this
      information and inflates and/or deflates balloons according to the level of
      host memory pressure.  This approach is effective but overly complex since a
      daemon must be installed inside each guest and coordinated to communicate with
      the host.  A simpler approach is to collect memory statistics in the virtio
      balloon driver and communicate them directly to the hypervisor.
      
      This patch enables the guest-side support by adding stats collection and
      reporting to the virtio balloon driver.
      Signed-off-by: Adam Litke <agl@us.ibm.com>
      Cc: Anthony Liguori <anthony@codemonkey.ws>
      Cc: virtualization@lists.linux-foundation.org
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (minor fixes)
    • Add __devexit_p around reference to virtio_pci_remove · 1f08b833
      Jamie Lokier committed
      This is needed to compile with CONFIG_VIRTIO_PCI=y,
      because virtio_pci_remove is marked __devexit.
      Signed-off-by: Jamie Lokier <jamie@shareable.org>
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
  14. 03 February 2010, 1 commit
  15. 17 January 2010, 1 commit
  16. 29 October 2009, 2 commits
    • virtio: order used ring after used index read · 2d61ba95
      Michael S. Tsirkin committed
      On SMP guests, reads from the ring might bypass used index reads. This
      causes guest crashes because host writes to used index to signal ring
      data readiness.  Fix this by inserting rmb before used ring reads.
      Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
      Cc: stable@kernel.org
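The ordering requirement can be illustrated in userspace with C11 atomics, where an acquire fence plays the role of the kernel's rmb. The `toy_used` layout and the `toy_` function names are assumptions for this sketch and are not the vring structures.

```c
#include <stdatomic.h>
#include <stdint.h>

#define RING_SIZE 8  /* assumed toy ring size */

struct toy_used {
    _Atomic uint16_t idx;      /* bumped by the producer after writing ring[] */
    uint32_t ring[RING_SIZE];  /* entries published by the producer */
};

/* Producer side: write the entry first, then publish the new index. */
static void toy_put_used(struct toy_used *u, uint32_t val)
{
    uint16_t idx = atomic_load_explicit(&u->idx, memory_order_relaxed);

    u->ring[idx % RING_SIZE] = val;
    atomic_store_explicit(&u->idx, (uint16_t)(idx + 1), memory_order_release);
}

/* Consumer side: returns 1 and stores the next entry in *out, else 0. */
static int toy_get_used(struct toy_used *u, uint16_t *last_seen, uint32_t *out)
{
    uint16_t idx = atomic_load_explicit(&u->idx, memory_order_relaxed);

    if (idx == *last_seen)
        return 0;

    /* Read barrier: the ring read below must not be satisfied before
     * the index read above. */
    atomic_thread_fence(memory_order_acquire);

    *out = u->ring[*last_seen % RING_SIZE];
    (*last_seen)++;
    return 1;
}
```

Without the fence, a consumer on another CPU could observe the new index yet still read a stale ring entry, which matches the failure the commit message describes.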
    • virtio-pci: fix per-vq MSI-X request logic · 0b22bd0b
      Michael S. Tsirkin committed
      Commit f68d2408 in 2.6.32-rc1 broke requesting IRQs for per-VQ MSI-X vectors:
      - vector number was used instead of the vector itself
      - we try to request an IRQ for VQ which does not
        have a callback handler
      
      This is a regression that causes warnings in kernel log,
      potentially lower performance as we need to scan vq list,
      and might cause system failure if the interrupt
      requested is in fact needed by another system.
      
      This was not noticed earlier because in most cases
      we were falling back on shared interrupt for all vqs.
      
      The warnings often look like this:
      
      virtio-pci 0000:00:03.0: irq 26 for MSI/MSI-X
      virtio-pci 0000:00:03.0: irq 27 for MSI/MSI-X
      virtio-pci 0000:00:03.0: irq 28 for MSI/MSI-X
      IRQ handler type mismatch for IRQ 1
      current handler: i8042
      Pid: 2400, comm: modprobe Tainted: G        W
      2.6.32-rc3-11952-gf3ed8d8-dirty #1
      Call Trace:
       [<ffffffff81072aed>] ? __setup_irq+0x299/0x304
       [<ffffffff81072ff3>] ? request_threaded_irq+0x144/0x1c1
       [<ffffffff813455af>] ? vring_interrupt+0x0/0x30
       [<ffffffff81346598>] ? vp_try_to_find_vqs+0x583/0x5c7
       [<ffffffffa0015188>] ? skb_recv_done+0x0/0x34 [virtio_net]
       [<ffffffff81346609>] ? vp_find_vqs+0x2d/0x83
       [<ffffffff81345d00>] ? vp_get+0x3c/0x4e
       [<ffffffffa0016373>] ? virtnet_probe+0x2f1/0x428 [virtio_net]
       [<ffffffffa0015188>] ? skb_recv_done+0x0/0x34 [virtio_net]
       [<ffffffffa00150d8>] ? skb_xmit_done+0x0/0x39 [virtio_net]
       [<ffffffff8110ab92>] ? sysfs_do_create_link+0xcb/0x116
       [<ffffffff81345cc2>] ? vp_get_status+0x14/0x16
       [<ffffffff81345464>] ? virtio_dev_probe+0xa9/0xc8
       [<ffffffff8122b11c>] ? driver_probe_device+0x8d/0x128
       [<ffffffff8122b206>] ? __driver_attach+0x4f/0x6f
       [<ffffffff8122b1b7>] ? __driver_attach+0x0/0x6f
       [<ffffffff8122a9f9>] ? bus_for_each_dev+0x43/0x74
       [<ffffffff8122a374>] ? bus_add_driver+0xea/0x22d
       [<ffffffff8122b4a3>] ? driver_register+0xa7/0x111
       [<ffffffffa001a000>] ? init+0x0/0xc [virtio_net]
       [<ffffffff81009051>] ? do_one_initcall+0x50/0x148
       [<ffffffff8106e117>] ? sys_init_module+0xc5/0x21a
       [<ffffffff8100af02>] ? system_call_fastpath+0x16/0x1b
      virtio-pci 0000:00:03.0: irq 26 for MSI/MSI-X
      virtio-pci 0000:00:03.0: irq 27 for MSI/MSI-X
      Reported-by: Marcelo Tosatti <mtosatti@redhat.com>
      Reported-by: Shirley Ma <xma@us.ibm.com>
      Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
  17. 22 October 2009, 2 commits
  18. 23 September 2009, 3 commits
  19. 30 July 2009, 3 commits
  20. 17 July 2009, 1 commit