1. 17 Sep, 2015 1 commit
    • M
      arm/arm64: KVM: Remove 'config KVM_ARM_MAX_VCPUS' · ef748917
      Ming Lei committed
      This patch removes the KVM_ARM_MAX_VCPUS config option and,
      like other ARCHs, just chooses the maximum value allowed
      by the hardware, for the following reasons:
      
      1) from a distribution's point of view, the option has to be
      defined as the maximum allowed value because it needs to
      meet all kinds of virtualization applications and
      needs to support most SoCs;
      
      2) using a bigger value doesn't introduce extra memory
      consumption, and the help text in Kconfig isn't accurate
      because the kvm_vcpu structure isn't allocated until a request
      to create a VCPU is sent from QEMU;
      
      3) the main effect is that the vcpus[] field in 'struct kvm'
      becomes a bit bigger (sizeof(void *) per vcpu) and needs more cache
      lines to hold the structure, but 'struct kvm' is a generic struct,
      and it has already worked well this way on other ARCHs. Also,
      the world switch frequency is often low; for example, it is ~2000
      when running a kernel build load in a VM on an APM xgene KVM host,
      so the effect is very small, and the difference can't be observed
      in my test at all.
      
      Cc: Dann Frazier <dann.frazier@canonical.com>
      Signed-off-by: Ming Lei <ming.lei@canonical.com>
      Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
      Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
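Reasons 2) and 3) above rest on the vcpus[] array holding only pointers, with each per-vCPU structure allocated when creation is requested. The following is a minimal sketch of that layout, not the kernel's code: `toy_vm`, `toy_vcpu`, `TOY_MAX_VCPUS` and the helper functions are hypothetical names made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical limit standing in for the hardware-derived maximum. */
#define TOY_MAX_VCPUS 8

struct toy_vcpu {
    int id;
    unsigned long regs[32];              /* placeholder per-vCPU state */
};

struct toy_vm {
    /* Pointers only: raising the limit costs sizeof(void *) per slot. */
    struct toy_vcpu *vcpus[TOY_MAX_VCPUS];
    int online_vcpus;
};

/* Allocate one vCPU on demand; returns 0 on success, -1 on failure. */
static int toy_vm_create_vcpu(struct toy_vm *vm, int id)
{
    assert(vm != NULL);
    if (id < 0 || id >= TOY_MAX_VCPUS || vm->vcpus[id] != NULL)
        return -1;

    struct toy_vcpu *vcpu = calloc(1, sizeof(*vcpu));
    if (vcpu == NULL)
        return -1;

    vcpu->id = id;
    vm->vcpus[id] = vcpu;
    vm->online_vcpus++;
    return 0;
}

/* Free every vCPU that was actually created. */
static void toy_vm_destroy(struct toy_vm *vm)
{
    assert(vm != NULL);
    for (int i = 0; i < TOY_MAX_VCPUS; i++) {
        free(vm->vcpus[i]);              /* free(NULL) is a no-op */
        vm->vcpus[i] = NULL;
    }
    vm->online_vcpus = 0;
}
```

In this sketch a VM that creates a single vCPU pays for one `struct toy_vcpu` plus the fixed pointer array, whatever the configured maximum is.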
  2. 12 Aug, 2015 1 commit
  3. 21 Jul, 2015 4 commits
  4. 12 Jun, 2015 1 commit
  5. 13 Mar, 2015 1 commit
    • M
      arm/arm64: KVM: Implement Stage-2 page aging · 35307b9a
      Marc Zyngier committed
      Until now, KVM/arm didn't care much for page aging (who was swapping
      anyway?), and simply provided empty hooks to the core KVM code. With
      server-type systems now being available, things are quite different.
      
      This patch implements very simple support for page aging, by clearing
      the Access flag in the Stage-2 page tables. On access fault, the current
      fault handling will write the PTE or PMD again, putting the Access flag
      back on.
      
      It should be possible to implement a much faster handling for Access
      faults, but that's left for a later patch.
      
      With this in place, performance in VMs is degraded much more gracefully.
      Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
      Acked-by: Christoffer Dall <christoffer.dall@linaro.org>
      Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
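The aging scheme described in this commit (clear the Access flag, let the next access fault set it again) can be sketched on a single toy page-table entry. This is only an illustration under assumptions: `TOY_PTE_AF`, `toy_age_pte` and `toy_handle_access_fault` are hypothetical names, the bit position is chosen for the example, and real Stage-2 handling also involves TLB maintenance that is omitted here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed position of the Access flag in the toy descriptor. */
#define TOY_PTE_AF (UINT64_C(1) << 10)

/*
 * Aging step: report whether the page was touched since the last
 * check, then clear the flag so the next access faults.
 */
static bool toy_age_pte(uint64_t *pte)
{
    assert(pte != NULL);
    bool young = (*pte & TOY_PTE_AF) != 0;
    *pte &= ~TOY_PTE_AF;
    return young;
}

/* Access fault path: write the entry again with the flag set. */
static void toy_handle_access_fault(uint64_t *pte)
{
    assert(pte != NULL);
    *pte |= TOY_PTE_AF;
}
```

A page that is never touched between two aging passes reports "not young" on the second pass, which is the signal reclaim needs in order to pick cold pages first.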
  6. 12 Mar, 2015 1 commit
  7. 06 Feb, 2015 1 commit
    • P
      kvm: add halt_poll_ns module parameter · f7819512
      Paolo Bonzini committed
      This patch introduces a new module parameter for the KVM module; when it
      is present, KVM attempts a bit of polling on every HLT before scheduling
      itself out via kvm_vcpu_block.
      
      This parameter helps a lot for latency-bound workloads---in particular
      I tested it with O_DSYNC writes with a battery-backed disk in the host.
      In this case, writes are fast (because the data doesn't have to go all
      the way to the platters) but they cannot be merged by either the host or
      the guest.  KVM's performance here is usually around 30% of bare metal,
      or 50% if you use cache=directsync or cache=writethrough (these
      parameters avoid that the guest sends pointless flush requests, and
      at the same time they are not slow because of the battery-backed cache).
      The bad performance happens because on every halt the host CPU decides
      to halt itself too.  When the interrupt comes, the vCPU thread is then
      migrated to a new physical CPU, and in general the latency is horrible
      because the vCPU thread has to be scheduled back in.
      
      With this patch performance reaches 60-65% of bare metal and, more
      important, 99% of what you get if you use idle=poll in the guest.  This
      means that the tunable gets rid of this particular bottleneck, and more
      work can be done to improve performance in the kernel or QEMU.
      
      Of course there is some price to pay; every time an otherwise idle vCPU
      is interrupted by an interrupt, it will poll unnecessarily and thus
      impose a little load on the host.  The above results were obtained with
      a mostly random value of the parameter (500000), and the load was around
      1.5-2.5% CPU usage on one of the host's cores for each idle guest vCPU.
      
      The patch also adds a new stat, /sys/kernel/debug/kvm/halt_successful_poll,
      that can be used to tune the parameter.  It counts how many HLT
      instructions received an interrupt during the polling period; each
      successful poll avoids that Linux schedules the VCPU thread out and back
      in, and may also avoid a likely trip to C1 and back for the physical CPU.
      
      While the VM is idle, a Linux 4 VCPU VM halts around 10 times per second.
      Of these halts, almost all are failed polls.  During the benchmark,
      instead, basically all halts end within the polling period, except a more
      or less constant stream of 50 per second coming from vCPUs that are not
      running the benchmark.  The wasted time is thus very low.  Things may
      be slightly different for Windows VMs, which have a ~10 ms timer tick.
      
      The effect is also visible on Marcelo's recently-introduced latency
      test for the TSC deadline timer.  Though of course a non-RT kernel has
      awful latency bounds, the latency of the timer is around 8000-10000 clock
      cycles compared to 20000-120000 without setting halt_poll_ns.  For the TSC
      deadline timer, thus, the effect is both a smaller average latency and
      a smaller variance.
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
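The poll-before-block idea from this commit can be sketched with a simulated clock so the behaviour is deterministic. This is not the kernel implementation: every `toy_*` name is hypothetical, time is advanced by a fake clock instead of a real timer, and the point where the real code would schedule out via kvm_vcpu_block is reduced to a return value.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Simulated time source: each clock read advances time by step_ns. */
struct toy_sim {
    uint64_t now_ns;
    uint64_t step_ns;
    uint64_t irq_at_ns;                  /* when the interrupt arrives */
};

static uint64_t toy_sim_now(struct toy_sim *s)
{
    uint64_t t = s->now_ns;
    s->now_ns += s->step_ns;
    return t;
}

static bool toy_sim_irq_pending(const struct toy_sim *s)
{
    return s->now_ns >= s->irq_at_ns;
}

enum toy_halt_result { TOY_POLL_HIT, TOY_BLOCKED };

struct toy_halt_stats {
    uint64_t halt_successful_poll;       /* mirrors the new debugfs stat */
};

/*
 * On a guest halt, spin for up to halt_poll_ns waiting for an
 * interrupt; only if none arrives fall back to blocking.
 */
static enum toy_halt_result toy_vcpu_halt(uint64_t halt_poll_ns,
                                          struct toy_sim *sim,
                                          struct toy_halt_stats *stats)
{
    assert(sim != NULL && stats != NULL);
    if (halt_poll_ns != 0) {
        uint64_t deadline = toy_sim_now(sim) + halt_poll_ns;
        do {
            if (toy_sim_irq_pending(sim)) {
                stats->halt_successful_poll++;
                return TOY_POLL_HIT;     /* no schedule-out needed */
            }
        } while (toy_sim_now(sim) < deadline);
    }
    return TOY_BLOCKED;                  /* real code would sleep here */
}
```

With a poll window of 0 the function always blocks, which corresponds to the parameter being unset; a larger window trades host CPU time for fewer schedule-out and schedule-in round trips.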
  8. 30 Jan, 2015 1 commit
    • M
      arm/arm64: KVM: Use set/way op trapping to track the state of the caches · 3c1e7165
      Marc Zyngier committed
      Trying to emulate the behaviour of set/way cache ops is fairly
      pointless, as there are too many ways we can end up missing stuff.
      Also, there are some system caches out there that simply ignore
      set/way operations.
      
      So instead of trying to implement them, let's convert it to VA ops,
      and use them as a way to re-enable the trapping of VM ops. That way,
      we can detect the point when the MMU/caches are turned off, and do
      a full VM flush (which is what the guest was trying to do anyway).
      
      This allows a 32bit zImage to boot on the APM thingy, and will
      probably help bootloaders in general.
      Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
      Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
  9. 21 Jan, 2015 2 commits
  10. 16 Jan, 2015 1 commit
  11. 13 Dec, 2014 2 commits
  12. 24 Sep, 2014 2 commits
    • T
      kvm: Add arch specific mmu notifier for page invalidation · fe71557a
      Tang Chen committed
      This will be used to let the guest run while the APIC access page is
      not pinned.  Because subsequent patches will fill in the function
      for x86, place the (still empty) x86 implementation in the x86.c file
      instead of adding an inline function in kvm_host.h.
      Signed-off-by: Tang Chen <tangchen@cn.fujitsu.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • A
      kvm: Fix page ageing bugs · 57128468
      Andres Lagar-Cavilla committed
      1. We were calling clear_flush_young_notify in unmap_one, but we are
      within an mmu notifier invalidate range scope. The spte exists no more
      (due to range_start) and the accessed bit info has already been
      propagated (due to kvm_pfn_set_accessed). Simply call
      clear_flush_young.
      
      2. We clear_flush_young on a primary MMU PMD, but this may be mapped
      as a collection of PTEs by the secondary MMU (e.g. during log-dirty).
      This required expanding the interface of the clear_flush_young mmu
      notifier, so a lot of code has been trivially touched.
      
      3. In the absence of shadow_accessed_mask (e.g. EPT A bit), we emulate
      the access bit by blowing the spte. This requires proper synchronizing
      with MMU notifier consumers, like every other removal of spte's does.
      Signed-off-by: Andres Lagar-Cavilla <andreslc@google.com>
      Acked-by: Rik van Riel <riel@redhat.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
  13. 29 Aug, 2014 3 commits
  14. 28 Aug, 2014 2 commits
  15. 01 Aug, 2014 1 commit
  16. 11 Jul, 2014 5 commits
  17. 30 Apr, 2014 1 commit
  18. 28 Dec, 2013 1 commit
  19. 14 Oct, 2013 1 commit
  20. 03 Oct, 2013 1 commit
  21. 09 Aug, 2013 1 commit
  22. 12 Jun, 2013 3 commits
  23. 07 Jun, 2013 2 commits
  24. 19 May, 2013 1 commit
    • M
      ARM: KVM: move GIC/timer code to a common location · 7275acdf
      Marc Zyngier committed
      As KVM/arm64 is looming on the horizon, it makes sense to move some
      of the common code to a single location in order to reduce duplication.
      
      The code could live anywhere. Actually, most of KVM is already built
      with a bunch of ugly ../../.. hacks in the various Makefiles, so we're
      not exactly talking about style here. But maybe it is time to start
      moving into a less ugly direction.
      
      The include files must be in a "public" location, as they are accessed
      from non-KVM files (arch/arm/kernel/asm-offsets.c).
      
      For this purpose, introduce two new locations:
      - virt/kvm/arm/ : x86 and ia64 already share the ioapic code in
        virt/kvm, so this could be seen as a (very ugly) precedent.
      - include/kvm/  : there is already an include/xen, and while the
        intent is slightly different, this seems as good a location as
        any
      
      Eventually, we should probably have independent Makefiles at every
      level (just like everywhere else in the kernel), but this is just
      the first step.
      Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
      Signed-off-by: Gleb Natapov <gleb@redhat.com>