1. 10 Sep, 2009 3 commits
    • KVM: cleanup io_device code · d76685c4
      Gregory Haskins committed
      We modernize the io_device code so that we use container_of() instead of
      dev->private, and move the vtable to a separate ops structure
      (theoretically allows better caching for multiple instances of the same
      ops structure)
      Signed-off-by: Gregory Haskins <ghaskins@novell.com>
      Acked-by: Chris Wright <chrisw@sous-sol.org>
      Signed-off-by: Avi Kivity <avi@redhat.com>
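The pattern this commit describes can be sketched in plain userspace C. This is a minimal illustration, not the kernel code: `struct demo_timer`, `demo_timer_ops`, and `io_device_read()` are hypothetical names, and `container_of()` is redefined locally as a stand-in for the kernel macro.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-in for the kernel's container_of() macro. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

struct io_device;

/* Shared, read-only vtable: one instance can serve many devices. */
struct io_device_ops {
	int (*read)(struct io_device *dev, unsigned long addr);
};

/* Generic device: no private pointer, only a pointer to its ops. */
struct io_device {
	const struct io_device_ops *ops;
};

/* Hypothetical concrete device that embeds the generic one. */
struct demo_timer {
	int ticks;
	struct io_device dev;
};

/* Recover the enclosing device from the embedded member. */
static int demo_timer_read(struct io_device *dev, unsigned long addr)
{
	struct demo_timer *t = container_of(dev, struct demo_timer, dev);

	(void)addr; /* unused in this sketch */
	return t->ticks;
}

static const struct io_device_ops demo_timer_ops = {
	.read = demo_timer_read,
};

/* Generic dispatch through the ops structure. */
static int io_device_read(struct io_device *dev, unsigned long addr)
{
	assert(dev && dev->ops && dev->ops->read);
	return dev->ops->read(dev, addr);
}
```

Because the generic structure is embedded rather than pointed to, the callback needs no `private` field: pointer arithmetic on the member offset is enough to reach the enclosing device.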
    • KVM: No disable_irq for MSI/MSI-X interrupt on device assignment · 968a6347
      Sheng Yang committed
      Disabling the interrupt in the interrupt handler and re-enabling it on guest
      ack is only needed for level-triggered interrupts, to prevent the interrupt
      from being reinjected. MSI/MSI-X don't need it.
      
      One possible problem is that multiple interrupts with the same vector, injected
      between the irq handler and the scheduled work handler, would be merged into
      one for MSI/MSI-X. But AFAIK, the drivers handle it well.
      
      The patch fixed the oplin card performance issue (MSI-X performance is half of
      MSI/INTx).
      Signed-off-by: Sheng Yang <sheng@linux.intel.com>
      Signed-off-by: Avi Kivity <avi@redhat.com>
    • KVM: irqfd · 721eecbf
      Gregory Haskins committed
      KVM provides a complete virtual system environment for guests, including
      support for injecting interrupts modeled after the real exception/interrupt
      facilities present on the native platform (such as the IDT on x86).
      Virtual interrupts can come from a variety of sources (emulated devices,
      pass-through devices, etc.) but all must be injected to the guest via
      the KVM infrastructure.  This patch adds a new mechanism to inject a specific
      interrupt to a guest using a decoupled eventfd mechanism:  Any legal signal
      on the irqfd (using eventfd semantics from either userspace or kernel) will
      translate into an injected interrupt in the guest at the next available
      interrupt window.
      Signed-off-by: Gregory Haskins <ghaskins@novell.com>
      Signed-off-by: Avi Kivity <avi@redhat.com>
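The "eventfd semantics" the irqfd commit relies on can be illustrated from userspace on Linux. The sketch below only shows how signals accumulate in an eventfd counter and are consumed by a read; associating the descriptor with a guest interrupt inside KVM is not shown. The helper names `signal_fd()` and `drain_fd()` are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <sys/eventfd.h>
#include <unistd.h>

/* Signal the eventfd `count` times; each write adds 1 to its counter. */
static int signal_fd(int fd, unsigned int count)
{
	uint64_t one = 1;

	assert(fd >= 0);
	while (count--)
		if (write(fd, &one, sizeof(one)) != (ssize_t)sizeof(one))
			return -1;
	return 0;
}

/*
 * Consume pending signals. In the default (non-semaphore) mode a read
 * returns the accumulated counter and resets it to zero. Returns 0 if
 * nothing is pending (assumes the fd was opened with EFD_NONBLOCK).
 */
static uint64_t drain_fd(int fd)
{
	uint64_t pending = 0;

	assert(fd >= 0);
	if (read(fd, &pending, sizeof(pending)) != (ssize_t)sizeof(pending))
		return 0;
	return pending;
}
```

A descriptor created with `eventfd(0, EFD_NONBLOCK)` and signalled three times yields a single read of 3, so a consumer that reads only occasionally sees several signals coalesced into one wakeup.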
  2. 28 Jun, 2009 2 commits
  3. 12 Jun, 2009 1 commit
  4. 10 Jun, 2009 12 commits
  5. 09 Jun, 2009 1 commit
    • kvm: fix kvm reboot crash when MAXSMP is used · 8437a617
      Avi Kivity committed
      One system was found to crash during reboot when kvm and MAXSMP are used:
      Sending all processes the KILL signal...                              done
      Please stand by while rebooting the system...
      [ 1721.856538] md: stopping all md devices.
      [ 1722.852139] kvm: exiting hardware virtualization
      [ 1722.854601] BUG: unable to handle kernel NULL pointer dereference at (null)
      [ 1722.872219] IP: [<ffffffff8102c6b6>] hardware_disable+0x4c/0xb4
      [ 1722.877955] PGD 0
      [ 1722.880042] Oops: 0000 [#1] SMP
      [ 1722.892548] last sysfs file: /sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0/host0/target0:2:0/0:2:0:0/vendor
      [ 1722.900977] CPU 9
      [ 1722.912606] Modules linked in:
      [ 1722.914226] Pid: 0, comm: swapper Not tainted 2.6.30-rc7-tip-01843-g2305324-dirty #299 ...
      [ 1722.932589] RIP: 0010:[<ffffffff8102c6b6>]  [<ffffffff8102c6b6>] hardware_disable+0x4c/0xb4
      [ 1722.942709] RSP: 0018:ffffc900010b6ed8  EFLAGS: 00010046
      [ 1722.956121] RAX: 0000000000000000 RBX: ffffc9000e253140 RCX: 0000000000000009
      [ 1722.972202] RDX: 000000000000b020 RSI: ffffc900010c3220 RDI: ffffffffffffd790
      [ 1722.977399] RBP: ffffc900010b6f08 R08: 0000000000000000 R09: 0000000000000000
      [ 1722.995149] R10: 00000000000004b8 R11: 966912b6c78fddbd R12: 0000000000000009
      [ 1723.011551] R13: 000000000000b020 R14: 0000000000000009 R15: 0000000000000000
      [ 1723.019898] FS:  0000000000000000(0000) GS:ffffc900010b3000(0000) knlGS:0000000000000000
      [ 1723.034389] CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
      [ 1723.041164] CR2: 0000000000000000 CR3: 0000000001001000 CR4: 00000000000006e0
      [ 1723.056192] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      [ 1723.072546] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
      [ 1723.080562] Process swapper (pid: 0, threadinfo ffff88107e464000, task ffff88047e5a2550)
      [ 1723.096144] Stack:
      [ 1723.099071]  0000000000000046 ffffc9000e253168 966912b6c78fddbd ffffc9000e253140
      [ 1723.115471]  ffff880c7d4304d0 ffffc9000e253168 ffffc900010b6f28 ffffffff81011022
      [ 1723.132428]  ffffc900010b6f48 966912b6c78fddbd ffffc900010b6f48 ffffffff8100b83b
      [ 1723.141973] Call Trace:
      [ 1723.142981]  <IRQ> <0> [<ffffffff81011022>] kvm_arch_hardware_disable+0x26/0x3c
      [ 1723.158153]  [<ffffffff8100b83b>] hardware_disable+0x3f/0x55
      [ 1723.172168]  [<ffffffff810b95f6>] generic_smp_call_function_interrupt+0x76/0x13c
      [ 1723.178836]  [<ffffffff8104cbea>] smp_call_function_interrupt+0x3a/0x5e
      [ 1723.194689]  [<ffffffff81035bf3>] call_function_interrupt+0x13/0x20
      [ 1723.199750]  <EOI> <0> [<ffffffff814ad3b4>] ? acpi_idle_enter_c1+0xd3/0xf4
      [ 1723.217508]  [<ffffffff814ad3ae>] ? acpi_idle_enter_c1+0xcd/0xf4
      [ 1723.232172]  [<ffffffff814ad4bc>] ? acpi_idle_enter_bm+0xe7/0x2ce
      [ 1723.235141]  [<ffffffff81a8d93f>] ? __atomic_notifier_call_chain+0x0/0xac
      [ 1723.253381]  [<ffffffff818c3dff>] ? menu_select+0x58/0xd2
      [ 1723.258179]  [<ffffffff818c2c9d>] ? cpuidle_idle_call+0xa4/0xf3
      [ 1723.272828]  [<ffffffff81034085>] ? cpu_idle+0xb8/0x101
      [ 1723.277085]  [<ffffffff81a80163>] ? start_secondary+0x1bc/0x1d7
      [ 1723.293708] Code: b0 00 00 65 48 8b 04 25 28 00 00 00 48 89 45 e0 31 c0 48 8b 04 cd 30 ee 27 82 49 89 cc 49 89 d5 48 8b 04 10 48 8d b8 90 d7 ff ff <48> 8b 87 70 28 00 00 48 8d 98 90 d7 ff ff eb 16 e8 e9 fe ff ff
      [ 1723.335524] RIP  [<ffffffff8102c6b6>] hardware_disable+0x4c/0xb4
      [ 1723.342076]  RSP <ffffc900010b6ed8>
      [ 1723.352021] CR2: 0000000000000000
      [ 1723.354348] ---[ end trace e2aec53dae150aa1 ]---
      
      It turns out that we need to clear cpus_hardware_enabled in that case.
      Reported-and-tested-by: Yinghai Lu <yinghai@kernel.org>
      Signed-off-by: Yinghai Lu <yinghai@kernel.org>
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
  6. 08 Jun, 2009 1 commit
  7. 22 Apr, 2009 2 commits
  8. 24 Mar, 2009 8 commits
    • KVM: Get support IRQ routing entry counts · 36463146
      Sheng Yang committed
      In capability probing ioctl.
      Signed-off-by: Sheng Yang <sheng@linux.intel.com>
      Signed-off-by: Avi Kivity <avi@redhat.com>
    • KVM: fix kvm_vm_ioctl_deassign_device · 4a906e49
      Weidong Han committed
      Only assigned_dev_id needs to be set for deassignment; use
      match->flags to judge and deassign it.
      Acked-by: Mark McLoughlin <markmc@redhat.com>
      Signed-off-by: Weidong Han <weidong.han@intel.com>
      Signed-off-by: Avi Kivity <avi@redhat.com>
    • KVM: MMU: handle compound pages in kvm_is_mmio_pfn · fc5659c8
      Joerg Roedel committed
      The function kvm_is_mmio_pfn is called before put_page is called on a
      page by KVM. This is a problem when this function is called on some
      struct page which is part of a compound page. It does not test the
      reserved flag of the compound page but of the struct page within the
      compound page. This is a problem when KVM works with hugepages allocated
      at boot time. These pages have the reserved bit set in all tail pages.
      Only the flag in the compound head is cleared. KVM would not put such a
      page, which results in a memory leak.
      Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
      Acked-by: Marcelo Tosatti <mtosatti@redhat.com>
      Signed-off-by: Avi Kivity <avi@redhat.com>
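The head-versus-tail distinction described in this commit can be modelled in a few lines of userspace C. This is a simplified sketch under stated assumptions: `struct demo_page` and every function here are hypothetical stand-ins, not the kernel's `struct page` or its helpers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, heavily simplified stand-in for the kernel's struct page. */
struct demo_page {
	bool reserved;            /* models the reserved page flag */
	struct demo_page *head;   /* NULL for a head page or a normal page */
};

/* Resolve a tail page to the head of its compound page. */
static struct demo_page *demo_compound_head(struct demo_page *page)
{
	assert(page != NULL);
	return page->head ? page->head : page;
}

/* Problematic check: looks at the flag of the page it was handed. */
static bool demo_is_mmio_naive(struct demo_page *page)
{
	return page->reserved;
}

/* Check in the spirit of the fix: consult the compound head instead. */
static bool demo_is_mmio_fixed(struct demo_page *page)
{
	return demo_compound_head(page)->reserved;
}
```

With a head page whose flag is clear and a tail page whose flag is set, the naive check classifies the tail as MMIO (so it would never be released), while the head-based check does not.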
    • KVM: Use irq routing API for MSI · 79950e10
      Sheng Yang committed
      Merge the MSI userspace interface with the IRQ routing table. Note that the API
      has changed, and the IRQ routing table will be the only interface that
      kvm-userspace supports.
      Signed-off-by: Sheng Yang <sheng@linux.intel.com>
      Signed-off-by: Avi Kivity <avi@redhat.com>
    • KVM: Userspace controlled irq routing · 399ec807
      Avi Kivity committed
      Currently KVM has a static routing from GSI numbers to interrupts (namely,
      0-15 are mapped 1:1 to both PIC and IOAPIC, and 16-23 are mapped 1:1 to
      the IOAPIC).  This is insufficient for several reasons:
      
      - HPET requires non 1:1 mapping for the timer interrupt
      - MSIs need a new method to assign interrupt numbers and dispatch them
      - ACPI APIC mode needs to be able to reassign the PCI LINK interrupts to the
        ioapics
      
      This patch implements an interrupt routing table (as a linked list, but this
      can be easily changed) and a userspace interface to replace the table.  The
      routing table is initialized according to the current hardwired mapping.
      Signed-off-by: Avi Kivity <avi@redhat.com>
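The default mapping described in this commit (GSIs 0-15 routed to both PIC and IOAPIC, GSIs 16-23 routed to the IOAPIC only) can be sketched as a linked list in userspace C. All type and function names below are hypothetical and chosen for illustration; the kernel's actual structures and the userspace interface are not reproduced here.

```c
#include <assert.h>
#include <stdlib.h>

enum demo_chip { DEMO_PIC, DEMO_IOAPIC };

/* One routing entry: a GSI delivered to a pin on one interrupt chip. */
struct demo_route {
	unsigned int gsi;
	enum demo_chip chip;
	unsigned int pin;
	struct demo_route *next;
};

/* Prepend an entry and return the new list head. */
static struct demo_route *demo_add_route(struct demo_route *head,
					 unsigned int gsi,
					 enum demo_chip chip,
					 unsigned int pin)
{
	struct demo_route *r = malloc(sizeof(*r));

	assert(r != NULL);
	r->gsi = gsi;
	r->chip = chip;
	r->pin = pin;
	r->next = head;
	return r;
}

/* Build the hardwired default: 0-15 to PIC and IOAPIC, 16-23 to IOAPIC. */
static struct demo_route *demo_default_routing(void)
{
	struct demo_route *head = NULL;
	unsigned int gsi;

	for (gsi = 0; gsi < 24; gsi++) {
		if (gsi < 16)
			head = demo_add_route(head, gsi, DEMO_PIC, gsi);
		head = demo_add_route(head, gsi, DEMO_IOAPIC, gsi);
	}
	return head;
}

/* Count how many chip pins a given GSI fans out to. */
static unsigned int demo_count_targets(const struct demo_route *head,
				       unsigned int gsi)
{
	unsigned int n = 0;

	for (; head; head = head->next)
		if (head->gsi == gsi)
			n++;
	return n;
}

/* Release every entry of a routing table. */
static void demo_free_routing(struct demo_route *head)
{
	while (head) {
		struct demo_route *next = head->next;

		free(head);
		head = next;
	}
}
```

Replacing the table wholesale, as the commit's userspace interface does, then amounts to building a new list and freeing the old one; a non-1:1 entry (for example a timer GSI routed to a different pin) is just another node with `gsi != pin`.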
    • KVM: Interrupt mask notifiers for ioapic · 75858a84
      Avi Kivity committed
      Allow clients to request notifications when the guest masks or unmasks a
      particular irq line.  This complements irq ack notifications, as the guest
      will not ack an irq line that is masked.
      
      Currently implemented for the ioapic only.
      Signed-off-by: Avi Kivity <avi@redhat.com>
    • KVM: Add support to disable MSI for assigned device · 17071fe7
      Sheng Yang committed
      MSI is always enabled by default for msi2intx=1. But if msi2intx=0, we
      have to disable MSI if the guest requires it.
      
      The patch also drops the unnecessary msi2intx check when the guest wants to
      update MSI state.
      
      Note that KVM_DEV_IRQ_ASSIGN_MSI_ACTION is a mask which should cover all MSI
      related operations, though there is only one for now.
      Signed-off-by: Sheng Yang <sheng@linux.intel.com>
      Signed-off-by: Avi Kivity <avi@redhat.com>
    • KVM: New guest debug interface · d0bfb940
      Jan Kiszka committed
      This rips out the support for KVM_DEBUG_GUEST and introduces a new IOCTL
      instead: KVM_SET_GUEST_DEBUG. The IOCTL payload consists of a generic
      part, controlling the "main switch" and the single-step feature. The
      arch specific part adds an x86 interface for intercepting both types of
      debug exceptions separately and re-injecting them when the host was not
      interested. Moreover, the foundation for guest debugging via debug
      registers is laid.
      
      To signal breakpoint events properly back to userland, an arch-specific
      data block is now returned along KVM_EXIT_DEBUG. For x86, the arch block
      contains the PC, the debug exception, and relevant debug registers to
      tell debug events properly apart.
      
      The availability of this new interface is signaled by
      KVM_CAP_SET_GUEST_DEBUG. Empty stubs for not yet supported archs are
      provided.
      
      Note that both SVM and VTX are supported, but only the latter has been tested
      so far. Based on the experience with all those VTX corner cases, I would be
      fairly surprised if SVM works out of the box.
      Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
      Signed-off-by: Avi Kivity <avi@redhat.com>
  9. 15 Feb, 2009 5 commits
  10. 03 Jan, 2009 4 commits
  11. 31 Dec, 2008 1 commit
    • KVM: fix handling of ACK from shared guest IRQ · defaf158
      Mark McLoughlin committed
      If an assigned device shares a guest irq with an emulated
      device then we currently interpret an ack generated by the
      emulated device as originating from the assigned device
      leading to e.g. "Unbalanced enable for IRQ 4347" from the
      enable_irq() in kvm_assigned_dev_ack_irq().
      
      The fix is fairly simple - don't enable the physical device
      irq unless it was previously disabled.
      
      Of course, this can still lead to a situation where a
      non-assigned device ACK can cause the physical device irq to
      be reenabled before the device was serviced. However, being
      level sensitive, the interrupt will merely be regenerated.
      Signed-off-by: Mark McLoughlin <markmc@redhat.com>
      Signed-off-by: Avi Kivity <avi@redhat.com>
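The fix described here, re-enabling the physical irq only if it was previously disabled, can be modelled with a small state machine. The sketch is illustrative only: `struct demo_assigned_dev`, its fields, and both functions are hypothetical, and a plain counter stands in for the host's disable/enable nesting.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of an assigned device's host interrupt state. */
struct demo_assigned_dev {
	bool host_irq_disabled;   /* set when the handler masked the host irq */
	int disable_depth;        /* models the host's disable nesting count */
};

/* Interrupt handler path: mask the host irq until the guest acks. */
static void demo_irq_handler(struct demo_assigned_dev *dev)
{
	dev->disable_depth++;             /* stands in for disabling the irq */
	dev->host_irq_disabled = true;
}

/* Guest ack path: only undo a disable that this device actually did. */
static void demo_ack_irq(struct demo_assigned_dev *dev)
{
	if (!dev->host_irq_disabled)
		return;   /* ack came from a device sharing the guest irq */

	assert(dev->disable_depth > 0);
	dev->disable_depth--;             /* stands in for enable_irq() */
	dev->host_irq_disabled = false;
}
```

Without the flag check, an ack triggered by the emulated device would decrement the depth below zero, which is the condition the "Unbalanced enable" warning in the commit message reports.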