1. 09 Jan, 2014 1 commit
  2. 22 Dec, 2013 10 commits
    • KVM: arm-vgic: Support CPU interface reg access · fa20f5ae
      Christoffer Dall committed
      Implement support for the CPU interface register access driven by MMIO
      address offsets from the CPU interface base address.  Useful for user
      space to support save/restore of the VGIC state.
      
      This commit adds support only for the same logic as the current VGIC
      support, and no more.  For example, the active priority registers are
      handled as RAZ/WI, just like setting priorities on the emulated
      distributor.
      Acked-by: Marc Zyngier <marc.zyngier@arm.com>
      Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
    • KVM: arm-vgic: Add GICD_SPENDSGIR and GICD_CPENDSGIR handlers · 90a5355e
      Christoffer Dall committed
      Handle MMIO accesses to the two registers which should support both the
      case where the VMs want to read/write either of these registers and the
      case where user space reads/writes these registers to do save/restore of
      the VGIC state.
      
      Note that the added complexity compared to simple set/clear enable
      registers stems from the bookkeeping of source CPU IDs.  It may be
      possible to change the underlying data structure to reduce this
      complexity, but since this is not in the critical path at all, this will
      do.
      
      Also note that reading this register from a live guest will not be
      as accurate as on hardware, because some state may be living in
      the CPU LRs, and the only way to give a consistent read would be to force
      all the VCPUs to stop and request them to unqueue the LR state onto the
      distributor.  Until we have an actual user of live reading this
      register, we can live with the difference.
      Acked-by: Marc Zyngier <marc.zyngier@arm.com>
      Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
    • KVM: arm-vgic: Support unqueueing of LRs to the dist · cbd333a4
      Christoffer Dall committed
      To properly access the VGIC state from user space it is very impractical
      to have to loop through all the LRs in all register access functions.
      Instead, support moving all pending state from LRs to the distributor,
      but leave active state LRs alone.
      
      Note that to accurately present the active and pending state to VCPUs
      reading these distributor registers from a live VM, we would have to
      stop all VCPUs other than the calling VCPU, ask each of them to unqueue
      its LR state onto the distributor, and add fields to track active state
      on the distributor side as well.  We don't have any users of such
      functionality yet and there are other inaccuracies in the GIC emulation,
      so don't provide accurate synchronized access to this state just yet.
      However, when the time comes, having this function should help.
      Acked-by: Marc Zyngier <marc.zyngier@arm.com>
      Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
    • KVM: arm-vgic: Add vgic reg access from dev attr · c07a0191
      Christoffer Dall committed
      Add infrastructure to handle distributor and cpu interface register
      accesses through the KVM_{GET/SET}_DEVICE_ATTR interface by adding the
      KVM_DEV_ARM_VGIC_GRP_DIST_REGS and KVM_DEV_ARM_VGIC_GRP_CPU_REGS groups
      and defining the semantics of the attr field to be the MMIO offset as
      specified in the GICv2 specs.
      
      Missing register accesses and other changes to individual register access
      functions needed to support save/restore of the VGIC state are added in
      subsequent patches.
      Acked-by: Marc Zyngier <marc.zyngier@arm.com>
      Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
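A minimal sketch of how the attr field described in this commit could be composed and decoded. The commit message only states that attr carries the GICv2 MMIO offset; placing a target CPU id in bits 39:32, and every `VGIC_ATTR_*` and `vgic_attr_*` name below, are assumptions made for illustration and are not taken from the kernel headers.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative layout (assumption): bits 31:0 hold the MMIO offset from
 * the distributor or CPU interface base, bits 39:32 hold a CPU id. */
#define VGIC_ATTR_OFFSET_MASK 0xffffffffULL
#define VGIC_ATTR_CPUID_SHIFT 32
#define VGIC_ATTR_CPUID_MASK  (0xffULL << VGIC_ATTR_CPUID_SHIFT)

/* Pack a CPU id and a register offset into one 64-bit attr value. */
uint64_t vgic_attr_encode(unsigned int cpuid, uint32_t offset)
{
    assert(cpuid <= 0xff); /* only 8 bits are reserved for the CPU id */
    return ((uint64_t)cpuid << VGIC_ATTR_CPUID_SHIFT) | offset;
}

/* Extract the MMIO offset part of an attr value. */
uint32_t vgic_attr_offset(uint64_t attr)
{
    return (uint32_t)(attr & VGIC_ATTR_OFFSET_MASK);
}

/* Extract the CPU id part of an attr value. */
unsigned int vgic_attr_cpuid(uint64_t attr)
{
    return (unsigned int)((attr & VGIC_ATTR_CPUID_MASK) >> VGIC_ATTR_CPUID_SHIFT);
}
```

With this layout, a request for offset 0x100 (GICD_ISENABLER0 in the GICv2 register map) as seen by CPU 1 would be encoded as `vgic_attr_encode(1, 0x100)`, and the resulting value would be placed in the attr field passed to KVM_{GET/SET}_DEVICE_ATTR.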
    • KVM: arm-vgic: Make vgic mmio functions more generic · 1006e8cb
      Christoffer Dall committed
      Rename the vgic_ranges array to vgic_dist_ranges to be more specific and
      to prepare for handling CPU interface register access as well (for
      save/restore of VGIC state).
      
      Pass offset from distributor or interface MMIO base to
      find_matching_range function instead of the physical address of the
      access in the VM memory map.  This allows other callers unaware of the
      VM specifics, but with generic VGIC knowledge to reuse the function.
      Acked-by: Marc Zyngier <marc.zyngier@arm.com>
      Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
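The idea of matching on an offset rather than a guest physical address can be sketched as follows. This is a simplified stand-in, not the kernel's find_matching_range: `struct reg_range`, `find_range`, and the example table are hypothetical, and the table lists only a few GICv2 distributor offsets for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified register range entry. */
struct reg_range {
    uint32_t base;     /* start of the range, as an offset from the MMIO base */
    uint32_t len;      /* length of the range in bytes */
    const char *name;  /* label, for illustration only */
};

/* Example table with a few GICv2 distributor offsets (not exhaustive). */
const struct reg_range example_dist_ranges[] = {
    { 0x000, 12,   "GICD_CTLR/TYPER/IIDR" },
    { 0x080, 0x80, "GICD_IGROUPR" },
    { 0x100, 0x80, "GICD_ISENABLER" },
};

/* Return the range that fully contains [offset, offset + access_len),
 * or NULL if no range does.  The caller passes an offset, so this code
 * needs no knowledge of where the device sits in the VM memory map. */
const struct reg_range *find_range(const struct reg_range *ranges, size_t n,
                                   uint32_t offset, uint32_t access_len)
{
    assert(ranges != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        const struct reg_range *r = &ranges[i];
        /* Widen to 64 bits so offset + access_len cannot wrap around. */
        if (offset >= r->base &&
            (uint64_t)offset + access_len <= (uint64_t)r->base + r->len)
            return r;
    }
    return NULL;
}
```

An MMIO trap handler would compute `offset = fault_address - distributor_base` before calling the lookup, while a user space save/restore path could pass the register offset directly.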
    • KVM: arm-vgic: Set base addr through device API · ce01e4e8
      Christoffer Dall committed
      Support setting the distributor and cpu interface base addresses in the
      VM physical address space through the KVM_{SET,GET}_DEVICE_ATTR API
      in addition to the ARM specific API.
      
      This has the added benefit of being able to share more code in user
      space and do things in a uniform manner.
      
      Also deprecate the older API at the same time, but backwards
      compatibility will be maintained.
      Acked-by: Marc Zyngier <marc.zyngier@arm.com>
      Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
    • KVM: arm-vgic: Support KVM_CREATE_DEVICE for VGIC · 7330672b
      Christoffer Dall committed
      Support creating the ARM VGIC device through the KVM_CREATE_DEVICE
      ioctl, so that the device can later be accessed through
      KVM_{GET/SET}_DEVICE_ATTR.  This is useful both for setting addresses
      through a more generic API than the ARM-specific one and for
      save/restore of VGIC state.
      
      Adds KVM_CAP_DEVICE_CTRL to ARM capabilities.
      
      Note that we change the check for creating a VGIC from bailing out if
      any VCPUs were created, to bailing out if any VCPUs were ever run.  This
      is an important distinction that shouldn't break anything, but allows
      creating the VGIC after the VCPUs have been created.
      Acked-by: Marc Zyngier <marc.zyngier@arm.com>
      Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
    • ARM: KVM: Allow creating the VGIC after VCPUs · e1ba0207
      Christoffer Dall committed
      Rework the VGIC initialization slightly to allow initialization of the
      vgic cpu-specific state even if the irqchip (the VGIC) hasn't been
      created by user space yet.  This is safe, because the vgic data
      structures are already allocated when the CPU is allocated if VGIC
      support is compiled into the kernel.  Further, the init process does not
      depend on any other information and the sacrifice is a slight
      performance degradation for creating VMs in the no-VGIC case.
      
      The reason is that the new device control API doesn't mandate creating
      the VGIC before creating the VCPU and it is unreasonable to require user
      space to create the VGIC before creating the VCPUs.
      
      At the same time move the irqchip_in_kernel check out of
      kvm_vcpu_first_run_init and into the init function to make the per-vcpu
      and global init functions symmetric and add comments on the exported
      functions making it a bit easier to understand the init flow by only
      looking at vgic.c.
      Acked-by: Marc Zyngier <marc.zyngier@arm.com>
      Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
    • ARM/KVM: save and restore generic timer registers · 39735a3a
      Andre Przywara committed
      For migration to work we need to save (and later restore) the state of
      each core's virtual generic timer.
      Since this is per VCPU, we can use the [gs]et_one_reg ioctl and export
      the three needed registers (control, counter, compare value).
      Though they live in cp15 space, we don't use the existing list, since
      they need special accessor functions and the arch timer is optional.
      Acked-by: Marc Zyngier <marc.zyngier@arm.com>
      Signed-off-by: Andre Przywara <andre.przywara@linaro.org>
      Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
    • arm/arm64: KVM: arch_timer: Initialize cntvoff at kvm_init · a1a64387
      Christoffer Dall committed
      Initialize the cntvoff at kvm_init_vm time, not before running the VCPUs
      for the first time, because doing it that late would overwrite any values
      potentially restored from user space.
      
      Cc: Andre Przywara <andre.przywara@linaro.org>
      Acked-by: Marc Zyngier <marc.zyngier@arm.com>
      Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
  3. 13 Dec, 2013 1 commit
  4. 21 Nov, 2013 1 commit
    • KVM: kvm_clear_guest_page(): fix empty_zero_page usage · 8a3caa6d
      Heiko Carstens committed
      Using the address of 'empty_zero_page' as source address in order to
      clear a page is wrong. On some architectures empty_zero_page is only the
      pointer to the struct page of the empty_zero_page.  Therefore the clear
      page operation would copy the contents of a couple of struct pages instead
      of clearing a page.  For kvm only arm/arm64 are affected by this bug.
      
      To fix this, use the ZERO_PAGE macro instead, which will return the struct
      page address of the empty_zero_page on all architectures.
      Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
      Signed-off-by: Gleb Natapov <gleb@redhat.com>
  5. 06 Nov, 2013 1 commit
  6. 05 Nov, 2013 1 commit
    • KVM: IOMMU: hva align mapping page size · 27ef63c7
      Greg Edwards committed
      When determining the page size we could use to map with the IOMMU, the
      page size should also be aligned with the hva, not just the gfn.  The
      gfn may not reflect the real alignment within the hugetlbfs file.
      
      Most of the time, this works fine.  However, if the hugetlbfs file is
      backed by non-contiguous huge pages, a multi-huge page memslot starts at
      an unaligned offset within the hugetlbfs file, and the gfn is aligned
      with respect to the huge page size, kvm_host_page_size() will return the
      huge page size and we will use that to map with the IOMMU.
      
      When we later unpin that same memslot, the IOMMU returns the unmap size
      as the huge page size, and we happily unpin that many pfns in
      monotonically increasing order, not realizing we are spanning
      non-contiguous huge pages and partially unpin the wrong huge page.
      
      Ensure the IOMMU mapping page size is aligned with the hva corresponding
      to the gfn, which does reflect the alignment within the hugetlbfs file.
      Reviewed-by: Marcelo Tosatti <mtosatti@redhat.com>
      Signed-off-by: Greg Edwards <gedwards@ddn.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: Gleb Natapov <gleb@redhat.com>
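The alignment rule described in this commit can be sketched as a small standalone function: start from the candidate page size and halve it until both the guest physical address and the hva are aligned to it. This is an illustrative sketch, not the kernel code; `iommu_map_page_size` is a hypothetical name, the hva is passed in directly instead of being derived from a memslot, and 4 KiB base pages are assumed.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT 12                  /* assumption: 4 KiB base pages */
#define PAGE_SIZE  (1ULL << PAGE_SHIFT)

/* Shrink page_size until both the guest physical address (gfn << PAGE_SHIFT)
 * and the host virtual address are aligned to it. */
uint64_t iommu_map_page_size(uint64_t gfn, uint64_t hva, uint64_t page_size)
{
    /* page_size must be a power of two no smaller than a base page,
     * and hva must itself be base-page aligned. */
    assert(page_size >= PAGE_SIZE && (page_size & (page_size - 1)) == 0);
    assert((hva & (PAGE_SIZE - 1)) == 0);

    /* Make sure the gfn is aligned to the page size we want to map. */
    while ((gfn << PAGE_SHIFT) & (page_size - 1))
        page_size >>= 1;

    /* Make sure the hva is aligned as well; checking the gfn alone is
     * the gap this commit closes. */
    while (hva & (page_size - 1))
        page_size >>= 1;

    return page_size;
}
```

For a gfn aligned to 2 MiB whose hva sits one base page past a 2 MiB boundary, the function falls back to 4 KiB, so an unmap would not span two unrelated huge pages.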
  7. 31 Oct, 2013 3 commits
  8. 30 Oct, 2013 1 commit
  9. 28 Oct, 2013 1 commit
    • KVM: Mapping IOMMU pages after updating memslot · e0230e13
      Yang Zhang committed
      In kvm_iommu_map_pages(), we need to know the page size by calling
      kvm_host_page_size(), which checks whether the target slot
      is valid before returning the right page size.
      Currently, we map the IOMMU pages when creating a new slot,
      but we call kvm_iommu_map_pages() while preparing the new slot.
      At that time, the new slot is not yet visible to the domain (it is still
      being prepared), so we cannot get the right page size from
      kvm_host_page_size(), and this breaks the IOMMU super page logic.
      The solution is to map the IOMMU pages after we insert the new slot
      into the domain.
      Signed-off-by: Yang Zhang <yang.z.zhang@Intel.com>
      Tested-by: Patrick Lu <patrick.lu@intel.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
  10. 17 Oct, 2013 2 commits
  11. 15 Oct, 2013 1 commit
  12. 03 Oct, 2013 2 commits
  13. 30 Sep, 2013 3 commits
    • KVM: Convert kvm_lock back to non-raw spinlock · 2f303b74
      Paolo Bonzini committed
      In commit e935b837 ("KVM: Convert kvm_lock to raw_spinlock"),
      the kvm_lock was made a raw lock.  However, the kvm mmu_shrink()
      function tries to grab the (non-raw) mmu_lock within the scope of
      the raw locked kvm_lock being held.  This leads to the following:
      
      BUG: sleeping function called from invalid context at kernel/rtmutex.c:659
      in_atomic(): 1, irqs_disabled(): 0, pid: 55, name: kswapd0
      Preemption disabled at:[<ffffffffa0376eac>] mmu_shrink+0x5c/0x1b0 [kvm]
      
      Pid: 55, comm: kswapd0 Not tainted 3.4.34_preempt-rt
      Call Trace:
       [<ffffffff8106f2ad>] __might_sleep+0xfd/0x160
       [<ffffffff817d8d64>] rt_spin_lock+0x24/0x50
       [<ffffffffa0376f3c>] mmu_shrink+0xec/0x1b0 [kvm]
       [<ffffffff8111455d>] shrink_slab+0x17d/0x3a0
       [<ffffffff81151f00>] ? mem_cgroup_iter+0x130/0x260
       [<ffffffff8111824a>] balance_pgdat+0x54a/0x730
       [<ffffffff8111fe47>] ? set_pgdat_percpu_threshold+0xa7/0xd0
       [<ffffffff811185bf>] kswapd+0x18f/0x490
       [<ffffffff81070961>] ? get_parent_ip+0x11/0x50
       [<ffffffff81061970>] ? __init_waitqueue_head+0x50/0x50
       [<ffffffff81118430>] ? balance_pgdat+0x730/0x730
       [<ffffffff81060d2b>] kthread+0xdb/0xe0
       [<ffffffff8106e122>] ? finish_task_switch+0x52/0x100
       [<ffffffff817e1e94>] kernel_thread_helper+0x4/0x10
       [<ffffffff81060c50>] ? __init_kthread_worker+0x
      
      After the previous patch, kvm_lock need not be a raw spinlock anymore,
      so change it back.
      Reported-by: Paul Gortmaker <paul.gortmaker@windriver.com>
      Cc: kvm@vger.kernel.org
      Cc: gleb@redhat.com
      Cc: jan.kiszka@siemens.com
      Reviewed-by: Gleb Natapov <gleb@redhat.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • KVM: protect kvm_usage_count with its own spinlock · 4a937f96
      Paolo Bonzini committed
      The VM list need not be protected by a raw spinlock.  Separate the
      two so that kvm_lock can be made non-raw.
      
      Cc: kvm@vger.kernel.org
      Cc: gleb@redhat.com
      Cc: jan.kiszka@siemens.com
      Reviewed-by: Gleb Natapov <gleb@redhat.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • KVM: cleanup (physical) CPU hotplug · 4fa92fb2
      Paolo Bonzini committed
      Remove the useless argument, and do not do anything if there are no
      VMs running at the time of the hotplug.
      
      Cc: kvm@vger.kernel.org
      Cc: gleb@redhat.com
      Cc: jan.kiszka@siemens.com
      Reviewed-by: Gleb Natapov <gleb@redhat.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
  14. 25 Sep, 2013 1 commit
  15. 17 Sep, 2013 2 commits
  16. 04 Sep, 2013 1 commit
  17. 30 Aug, 2013 3 commits
  18. 28 Aug, 2013 1 commit
  19. 27 Aug, 2013 1 commit
    • kvm: optimize away THP checks in kvm_is_mmio_pfn() · 11feeb49
      Andrea Arcangeli committed
      The checks on PG_reserved in the page structure on head and tail pages
      aren't necessary because split_huge_page wouldn't transfer the
      PG_reserved bit from head to tail anyway.
      
      This was a forward-thinking check done in the case PageReserved was
      set by a driver-owned page mapped in userland with something like
      remap_pfn_range in a VM_PFNMAP region, but using hugepmds (not
      possible right now). It was meant to be very safe, but it's overkill
      as it's unlikely split_huge_page could ever run without the driver
      noticing and tearing down the hugepage itself.
      
      And if a driver in the future really wants to map a reserved
      hugepage in userland using a huge pmd, it should simply take care of
      marking all subpages reserved too, to keep KVM safe. This of course
      would require such a hypothetical driver to tear down the huge pmd
      and split the hugepage itself, instead of relying on
      split_huge_page, but that sounds very reasonable, especially
      considering split_huge_page wouldn't currently transfer the reserved
      bit anyway.
      Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
      Signed-off-by: Gleb Natapov <gleb@redhat.com>
  20. 26 Aug, 2013 1 commit
  21. 29 Jul, 2013 1 commit
  22. 18 Jul, 2013 1 commit