1. 01 May 2019 · 1 commit
    • KVM: x86: Omit caching logic for always-available GPRs · de3cd117
      Authored by Sean Christopherson
      Except for RSP and RIP, which are held in VMX's VMCS, GPRs are always
      treated as "available and dirty" on both VMX and SVM, i.e. they are
      unconditionally loaded/saved immediately before/after VM-Enter/VM-Exit.
      
      Eliminating the unnecessary caching code reduces the size of KVM by a
      non-trivial amount, much of which comes from the most common code paths.
      E.g. on x86_64, kvm_emulate_cpuid() is reduced from 342 to 182 bytes and
      kvm_emulate_hypercall() from 1362 to 1143, with the total size of KVM
      dropping by ~1000 bytes.  With CONFIG_RETPOLINE=y, the numbers are even
      more pronounced, e.g.: 353->182, 1418->1172 and well over 2000 bytes.
      Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      de3cd117
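
      As a rough illustration of what the commit removes, the self-contained C
      sketch below contrasts an "is this register cached yet?" check in front of
      every GPR read with a direct read for registers that are always reloaded
      around VM-Enter/VM-Exit. The enum, struct, and function names are simplified
      stand-ins, not the actual KVM accessors.

      #include <stdio.h>

      enum reg { REG_RAX, REG_RCX, REG_RSP, REG_RIP, NR_REGS };

      struct vcpu {
              unsigned long regs[NR_REGS];
              unsigned long regs_avail;        /* bitmap of "cached" registers */
      };

      /* Old style: every read pays for an availability check. */
      static unsigned long read_reg_cached(struct vcpu *v, enum reg r)
      {
              if (!(v->regs_avail & (1UL << r)))
                      v->regs_avail |= 1UL << r;   /* would call vendor code here */
              return v->regs[r];
      }

      /* New style for always-available GPRs: plain array access, smaller code. */
      static unsigned long read_rax(struct vcpu *v)
      {
              return v->regs[REG_RAX];
      }

      int main(void)
      {
              struct vcpu v = { .regs = { [REG_RAX] = 0x1234 } };

              printf("cached: %#lx direct: %#lx\n",
                     read_reg_cached(&v, REG_RAX), read_rax(&v));
              return 0;
      }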
  2. 19 Apr 2019 · 1 commit
    • x86: kvm: hyper-v: deal with buggy TLB flush requests from WS2012 · da66761c
      Authored by Vitaly Kuznetsov
      It was reported that with some special Multi Processor Group configuration,
      e.g:
       bcdedit.exe /set groupsize 1
       bcdedit.exe /set maxgroup on
       bcdedit.exe /set groupaware on
      for a 16-vCPU guest, WS2012 shows a BSOD on boot when the PV TLB flush
      mechanism is in use.
      
      Tracing kvm_hv_flush_tlb immediately reveals the issue:
      
       kvm_hv_flush_tlb: processor_mask 0x0 address_space 0x0 flags 0x2
      
      The only flag set in this request is HV_FLUSH_ALL_VIRTUAL_ADDRESS_SPACES;
      however, processor_mask is 0x0 and HV_FLUSH_ALL_PROCESSORS is not specified.
      We therefore flush nothing, which is apparently not what Windows expects.
      
      The TLFS doesn't say anything about such requests, and newer Windows versions
      seem to be unaffected. This all looks like a WS2012 bug which is, however,
      easy to work around in KVM: flush everything when we see an empty flush
      request, since over-flushing doesn't hurt.
      Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      da66761c
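
      The workaround boils down to one extra condition when deciding which vCPUs
      to flush. Below is a minimal, self-contained C sketch of that decision with
      simplified types; the structure layout and helper name are illustrative
      rather than the actual kvm_hv_flush_tlb() code (HV_FLUSH_ALL_PROCESSORS is
      bit 0 per the TLFS).

      #include <stdbool.h>
      #include <stdint.h>
      #include <stdio.h>

      #define HV_FLUSH_ALL_PROCESSORS (1u << 0)

      struct hv_tlb_flush {
              uint64_t processor_mask;
              uint32_t flags;
      };

      /* Should this request be treated as "flush on all vCPUs"? */
      static bool flush_all_cpus(const struct hv_tlb_flush *req)
      {
              if (req->flags & HV_FLUSH_ALL_PROCESSORS)
                      return true;
              /*
               * WS2012 workaround: an empty processor_mask without
               * HV_FLUSH_ALL_PROCESSORS means "flush everything";
               * over-flushing is harmless.
               */
              return req->processor_mask == 0;
      }

      int main(void)
      {
              /* The buggy request from the trace: mask 0x0, flags 0x2. */
              struct hv_tlb_flush buggy = { .processor_mask = 0, .flags = 0x2 };

              printf("flush all vCPUs: %d\n", flush_all_cpus(&buggy));
              return 0;
      }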
  3. 29 Mar 2019 · 1 commit
    • x86/kvm/hyper-v: avoid spurious pending stimer on vCPU init · 013cc6eb
      Authored by Vitaly Kuznetsov
      When userspace initializes guest vCPUs it may want to zero all supported
      MSRs, including Hyper-V related ones such as HV_X64_MSR_STIMERn_CONFIG/
      HV_X64_MSR_STIMERn_COUNT. With commit f3b138c5 ("kvm/x86: Update SynIC
      timers on guest entry only") we began calling stimer_mark_pending()
      unconditionally on every config change.
      
      The issue I'm observing manifests itself as follows:
      - QEMU writes 0 to the STIMERn_{CONFIG,COUNT} MSRs; KVM marks all stimers as
        pending in stimer_pending_bitmap and arms KVM_REQ_HV_STIMER;
      - kvm_hv_has_stimer_pending() starts returning true;
      - kvm_vcpu_has_events() starts returning true;
      - kvm_arch_vcpu_runnable() starts returning true;
      - when kvm_arch_vcpu_ioctl_run() gets into
        (vcpu->arch.mp_state == KVM_MP_STATE_UNINITIALIZED) case:
        - kvm_vcpu_block() gets in 'kvm_vcpu_check_block(vcpu) < 0' and returns
          immediately, avoiding normal wait path;
        - -EAGAIN is returned from kvm_arch_vcpu_ioctl_run() immediately forcing
          userspace to retry.
      
      So instead of the normal wait path we get a busy loop on all secondary vCPUs
      before they get the INIT signal. This seems undesirable, especially given
      that it happens even when Hyper-V extensions are not used.
      
      Generally, it seems pointless to mark a stimer as pending in
      stimer_pending_bitmap and arm KVM_REQ_HV_STIMER, as the only thing
      kvm_hv_process_stimers() will do is clear the corresponding bit. We can
      simply not mark disabled timers as pending instead.
      
      Fixes: f3b138c5 ("kvm/x86: Update SynIC timers on guest entry only")
      Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      013cc6eb
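
      A minimal, self-contained sketch of the fix: only an enabled timer is marked
      pending on a config write, so zero-initializing the MSRs no longer raises a
      spurious request. Types and names below are simplified stand-ins for the
      kvm_hv_* code, not the actual implementation.

      #include <stdbool.h>
      #include <stdint.h>
      #include <stdio.h>

      #define STIMER_CONFIG_ENABLE 0x1ULL     /* enable bit in the config MSR */

      struct stimer {
              uint64_t config;
              uint64_t count;
              bool pending;                   /* stand-in for stimer_pending_bitmap */
      };

      static void stimer_mark_pending(struct stimer *t)
      {
              t->pending = true;              /* would also arm KVM_REQ_HV_STIMER */
      }

      static void stimer_set_config(struct stimer *t, uint64_t config)
      {
              t->config = config;
              /* Only a timer that is actually enabled needs processing. */
              if (config & STIMER_CONFIG_ENABLE)
                      stimer_mark_pending(t);
      }

      int main(void)
      {
              struct stimer t = { 0 };

              stimer_set_config(&t, 0);       /* userspace zeroing the MSR */
              printf("pending after zero-init: %d\n", t.pending);
              return 0;
      }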
  4. 21 Feb 2019 · 1 commit
    • kvm: x86: Add memcg accounting to KVM allocations · 254272ce
      Authored by Ben Gardon
      There are many KVM kernel memory allocations which are tied to the life of
      the VM process and should be charged to the VM process's cgroup. If the
      allocations aren't tied to the process, the OOM killer will not know
      that killing the process will free the associated kernel memory.
      Add __GFP_ACCOUNT flags to many of the allocations which are not yet being
      charged to the VM process's cgroup.
      
      Tested:
      	Ran all kvm-unit-tests on a 64-bit Haswell machine; the patch
      	introduced no new failures.
      	Ran a kernel memory accounting test which creates a VM to touch
      	memory and then checks that the kernel memory allocated for the
      	process is within certain bounds.
      	With this patch we account for much more of the vmalloc and slab memory
      	allocated for the VM.
      
      There remain a few allocations which should be charged to the VM's
      cgroup but are not. In x86, they include:
      	vcpu->arch.pio_data
      These allocations are left unaccounted in this patch because they are mapped
      to userspace, and accounting them to a cgroup causes problems. This
      should be addressed in a future patch.
      Signed-off-by: Ben Gardon <bgardon@google.com>
      Reviewed-by: Shakeel Butt <shakeelb@google.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      254272ce
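
      The mechanical change is just the GFP flag: allocations tied to the VM's
      lifetime pass GFP_KERNEL_ACCOUNT (GFP_KERNEL | __GFP_ACCOUNT) so the memory
      is charged to the allocating process's memcg. The kernel-style fragment
      below illustrates the pattern; the helper name is hypothetical and it is
      not a standalone program.

      #include <linux/slab.h>

      /* Hypothetical helper showing the allocation pattern used by the patch. */
      static void *alloc_vm_private_data(size_t size)
      {
              /*
               * Before: kzalloc(size, GFP_KERNEL). That memory was invisible to
               * memcg, so the OOM killer could not attribute it to the VM process.
               */
              return kzalloc(size, GFP_KERNEL_ACCOUNT);
      }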
  5. 26 Jan 2019 · 4 commits
  6. 15 Dec 2018 · 8 commits
  7. 17 Oct 2018 · 11 commits
  8. 06 Aug 2018 · 1 commit
    • KVM: x86: ensure all MSRs can always be KVM_GET/SET_MSR'd · 44883f01
      Authored by Paolo Bonzini
      Some of the MSRs returned by GET_MSR_INDEX_LIST currently cannot be sent back
      to KVM_GET_MSR and/or KVM_SET_MSR; either they can never be sent back, or
      they are only accepted under special conditions.  This makes the API a pain
      to use.
      
      To avoid this pain, this patch makes it so that the result of the get-list
      ioctl can always be used for host-initiated get and set.  Since we don't have
      a separate way to check for read-only MSRs, this means some Hyper-V MSRs are
      ignored when written.  Arguably they should not even be in the result of
      GET_MSR_INDEX_LIST, but I am leaving them there in case userspace is using
      the outcome of GET_MSR_INDEX_LIST to derive support for the corresponding
      Hyper-V feature.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      44883f01
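
      A self-contained sketch of the resulting behaviour: for a host-initiated
      access (KVM_SET_MSR from userspace), every index the list reports is
      accepted, and a write to an effectively read-only MSR is silently ignored
      rather than rejected. The MSR index and function names below are
      hypothetical, not the actual KVM handlers.

      #include <stdbool.h>
      #include <stdint.h>
      #include <stdio.h>

      #define HV_MSR_READ_ONLY_EXAMPLE 0x4000ffffu    /* hypothetical index */

      struct msr_access {
              uint32_t index;
              uint64_t data;
              bool host_initiated;    /* true for KVM_GET_MSR/KVM_SET_MSR ioctls */
      };

      /* Returns 0 on success, non-zero if the access is rejected. */
      static int set_msr(const struct msr_access *m)
      {
              switch (m->index) {
              case HV_MSR_READ_ONLY_EXAMPLE:
                      /* Accept and ignore host-initiated writes to read-only MSRs
                       * so the result of GET_MSR_INDEX_LIST is always settable. */
                      return m->host_initiated ? 0 : 1;
              default:
                      return 1;
              }
      }

      int main(void)
      {
              struct msr_access m = {
                      .index = HV_MSR_READ_ONLY_EXAMPLE,
                      .host_initiated = true,
              };

              printf("set_msr returned %d\n", set_msr(&m));
              return 0;
      }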
  9. 26 May 2018 · 5 commits
  10. 11 May 2018 · 2 commits
  11. 29 Mar 2018 · 1 commit
  12. 24 Mar 2018 · 1 commit
  13. 17 Mar 2018 · 3 commits