1. 18 Jun, 2021 8 commits
  2. 27 May, 2021 1 commit
    • KVM: X86: hyper-v: Task srcu lock when accessing kvm_memslots() · da6d63a0
      Committed by Wanpeng Li
         WARNING: suspicious RCU usage
         5.13.0-rc1 #4 Not tainted
         -----------------------------
         ./include/linux/kvm_host.h:710 suspicious rcu_dereference_check() usage!
      
        other info that might help us debug this:
      
        rcu_scheduler_active = 2, debug_locks = 1
         1 lock held by hyperv_clock/8318:
          #0: ffffb6b8cb05a7d8 (&hv->hv_lock){+.+.}-{3:3}, at: kvm_hv_invalidate_tsc_page+0x3e/0xa0 [kvm]
      
        stack backtrace:
        CPU: 3 PID: 8318 Comm: hyperv_clock Not tainted 5.13.0-rc1 #4
        Call Trace:
         dump_stack+0x87/0xb7
         lockdep_rcu_suspicious+0xce/0xf0
         kvm_write_guest_page+0x1c1/0x1d0 [kvm]
         kvm_write_guest+0x50/0x90 [kvm]
         kvm_hv_invalidate_tsc_page+0x79/0xa0 [kvm]
         kvm_gen_update_masterclock+0x1d/0x110 [kvm]
         kvm_arch_vm_ioctl+0x2a7/0xc50 [kvm]
         kvm_vm_ioctl+0x123/0x11d0 [kvm]
         __x64_sys_ioctl+0x3ed/0x9d0
         do_syscall_64+0x3d/0x80
         entry_SYSCALL_64_after_hwframe+0x44/0xae
      
      kvm_memslots() will be called by kvm_write_guest(), so we should take the srcu lock.
      
      Fixes: e880c6ea (KVM: x86: hyper-v: Prevent using not-yet-updated TSC page by secondary CPUs)
      Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com>
      Signed-off-by: Wanpeng Li <wanpengli@tencent.com>
      Message-Id: <1621339235-11131-4-git-send-email-wanpengli@tencent.com>
      Reviewed-by: Sean Christopherson <seanjc@google.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
  3. 18 Mar, 2021 2 commits
    • KVM: x86: hyper-v: Don't touch TSC page values when guest opted for re-enlightenment · 0469f2f7
      Committed by Vitaly Kuznetsov
      When the guest opts for re-enlightenment notifications upon migration, it is
      within its rights to assume that TSC page values never change (as they're only
      supposed to change upon migration and the host has to keep things as they
      are before it receives confirmation from the guest). This is mostly true
      until the guest is migrated somewhere. KVM userspace (e.g. QEMU) will
      trigger a masterclock update by writing to HV_X64_MSR_REFERENCE_TSC, by
      calling KVM_SET_CLOCK,... and as the TSC value and kvmclock reading drift
      apart (even slightly), the update causes TSC page values to change.

      The issue at hand is that when Hyper-V is migrated, it uses stale (cached)
      TSC page values to compute the difference between its own clocksource
      (provided by KVM) and its guests' TSC pages to program synthetic timers
      and in some cases, when the TSC page is updated, this puts all stimer
      expirations in the past. This, in turn, causes an interrupt storm
      and leaves L2 guests making little forward progress.
      
      Note, KVM doesn't fully implement re-enlightenment notification. Basically,
      the support for reenlightenment MSRs is just a stub and userspace is only
      expected to expose the feature when TSC scaling on the expected destination
      hosts is available. With TSC scaling, no real re-enlightenment is needed
      as TSC frequency doesn't change. With TSC scaling becoming ubiquitous, it
      likely makes little sense to fully implement re-enlightenment in KVM.
      
      Prevent TSC page from being updated after migration. In case it's not the
      guest who's initiating the change and when TSC page is already enabled,
      just keep it as it is: TSC value is supposed to be preserved across
      migration and TSC frequency can't change with re-enlightenment enabled.
      The guest is doomed anyway if any of this is not true.
      Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
      Message-Id: <20210316143736.964151-5-vkuznets@redhat.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • KVM: x86: hyper-v: Track Hyper-V TSC page status · cc9cfddb
      Committed by Vitaly Kuznetsov
      Create an infrastructure for tracking Hyper-V TSC page status, i.e. if it
      was updated from guest/host side or if we've failed to set it up (because
      e.g. guest wrote some garbage to HV_X64_MSR_REFERENCE_TSC) and there's no
      need to retry.
      
      Also, in a hypothetical situation when we are in 'always catchup' mode for
      TSC we can now avoid contending 'hv->hv_lock' on every guest enter by
      setting the state to HV_TSC_PAGE_BROKEN after compute_tsc_page_parameters()
      returns false.
      
      Check for HV_TSC_PAGE_SET state instead of '!hv->tsc_ref.tsc_sequence' in
      get_time_ref_counter() to properly handle the situation when we failed to
      write the updated TSC page values to the guest.
      Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
      Message-Id: <20210316143736.964151-4-vkuznets@redhat.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
  4. 17 Mar, 2021 2 commits
  5. 26 Feb, 2021 1 commit
    • KVM: x86: hyper-v: Fix Hyper-V context null-ptr-deref · 919f4ebc
      Committed by Wanpeng Li
      Reported by syzkaller:
      
          KASAN: null-ptr-deref in range [0x0000000000000140-0x0000000000000147]
          CPU: 1 PID: 8370 Comm: syz-executor859 Not tainted 5.11.0-syzkaller #0
          RIP: 0010:synic_get arch/x86/kvm/hyperv.c:165 [inline]
          RIP: 0010:kvm_hv_set_sint_gsi arch/x86/kvm/hyperv.c:475 [inline]
          RIP: 0010:kvm_hv_irq_routing_update+0x230/0x460 arch/x86/kvm/hyperv.c:498
          Call Trace:
           kvm_set_irq_routing+0x69b/0x940 arch/x86/kvm/../../../virt/kvm/irqchip.c:223
           kvm_vm_ioctl+0x12d0/0x2800 arch/x86/kvm/../../../virt/kvm/kvm_main.c:3959
           vfs_ioctl fs/ioctl.c:48 [inline]
           __do_sys_ioctl fs/ioctl.c:753 [inline]
           __se_sys_ioctl fs/ioctl.c:739 [inline]
           __x64_sys_ioctl+0x193/0x200 fs/ioctl.c:739
           do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46
           entry_SYSCALL_64_after_hwframe+0x44/0xae
      
      The Hyper-V context is allocated lazily, only once Hyper-V specific MSRs are
      accessed or SynIC is enabled. However, the syzkaller testcase sets the irq
      routing table directly without enabling SynIC. This results in a
      null-ptr-deref when the SynIC Hyper-V context is accessed. This patch fixes it.
      
      syzkaller source: https://syzkaller.appspot.com/x/repro.c?x=163342ccd00000
      
      Reported-by: syzbot+6987f3b2dbd9eda95f12@syzkaller.appspotmail.com
      Fixes: 8f014550 ("KVM: x86: hyper-v: Make Hyper-V emulation enablement conditional")
      Signed-off-by: Wanpeng Li <wanpengli@tencent.com>
      Message-Id: <1614326399-5762-1-git-send-email-wanpengli@tencent.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
  6. 09 Feb, 2021 11 commits
  7. 04 Feb, 2021 2 commits
    • KVM: x86/xen: Fix coexistence of Xen and Hyper-V hypercalls · 79033beb
      Committed by Joao Martins
      Disambiguate Xen vs. Hyper-V calls by adding 'orl $0x80000000, %eax'
      at the start of the Hyper-V hypercall page when Xen hypercalls are
      also enabled.
      
      That bit is reserved in the Hyper-V ABI, and those hypercall numbers
      will never be used by Xen (because it does precisely the same trick).
      
      Switch to using kvm_vcpu_write_guest() while we're at it, instead of
      open-coding it.
      Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
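The disambiguation trick described in this commit can be modelled in a few lines. This is only an illustrative sketch, not the KVM code: the names `HV_TAG_BIT`, `tag_hyperv_call` and `is_hyperv_call` are hypothetical, and the dispatch check is an assumption about how a host could tell the two ABIs apart once the bit is set.

```python
# Bit 31, the mask used by 'orl $0x80000000, %eax' in the commit message.
HV_TAG_BIT = 0x80000000


def tag_hyperv_call(eax: int) -> int:
    """Model the 'orl' prepended to the Hyper-V hypercall page (32-bit register)."""
    return (eax | HV_TAG_BIT) & 0xFFFFFFFF


def is_hyperv_call(eax: int) -> bool:
    """Assumed dispatch rule: a set bit 31 marks the call as Hyper-V rather than Xen."""
    return bool(eax & HV_TAG_BIT)


if __name__ == "__main__":
    raw = 0x5C  # arbitrary example hypercall number
    tagged = tag_hyperv_call(raw)
    print(hex(tagged), is_hyperv_call(tagged), is_hyperv_call(raw))
```

Because the OR is idempotent, tagging an already tagged value leaves it unchanged, so the extra instruction is harmless if a call passes through the page more than once.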
    • KVM: x86: use static calls to reduce kvm_x86_ops overhead · b3646477
      Committed by Jason Baron
      Convert kvm_x86_ops to use static calls. Note that all kvm_x86_ops are
      covered here except for 'pmu_ops' and 'nested ops'.
      
      Here are some numbers running cpuid in a loop of 1 million calls averaged
      over 5 runs, measured in the vm (lower is better).
      
      Intel Xeon 3000MHz:
      
                 |default    |mitigations=off
      -------------------------------------
      vanilla    |.671s      |.486s
      static call|.573s(-15%)|.458s(-6%)
      
      AMD EPYC 2500MHz:
      
                 |default    |mitigations=off
      -------------------------------------
      vanilla    |.710s      |.609s
      static call|.664s(-6%) |.609s(0%)
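The percentages in the two tables above can be re-derived from the raw timings. The following is a small sketch for checking the arithmetic; the helper name `percent_change` and the dictionary layout are illustrative choices, and the timings are copied from the tables.

```python
def percent_change(baseline_s: float, new_s: float) -> int:
    """Relative change of new_s against baseline_s, rounded to a whole percent."""
    return round((new_s - baseline_s) / baseline_s * 100)


# (vanilla, static call) timings in seconds, taken from the tables above.
timings = {
    ("Intel", "default"): (0.671, 0.573),
    ("Intel", "mitigations=off"): (0.486, 0.458),
    ("AMD", "default"): (0.710, 0.664),
    ("AMD", "mitigations=off"): (0.609, 0.609),
}

if __name__ == "__main__":
    for (cpu, mode), (vanilla, static_call) in timings.items():
        print(cpu, mode, percent_change(vanilla, static_call))
```

The results agree with the figures quoted in the tables: -15% and -6% on the Intel machine, -6% and 0% on the AMD machine.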
      
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: Sean Christopherson <seanjc@google.com>
      Signed-off-by: Jason Baron <jbaron@akamai.com>
      Message-Id: <e057bf1b8a7ad15652df6eeba3f907ae758d3399.1610680941.git.jbaron@akamai.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
  8. 15 Nov, 2020 1 commit
  9. 28 Sep, 2020 2 commits
  10. 27 Sep, 2020 1 commit
  11. 24 Aug, 2020 1 commit
  12. 11 Aug, 2020 1 commit
    • x86/kvm/hyper-v: Synic default SCONTROL MSR needs to be enabled · 99b48ecc
      Committed by Jon Doron
      Based on an analysis of the HyperV firmwares (Gen1 and Gen2), it seems
      that SCONTROL is not being set to the ENABLED state by the firmware,
      contrary to what we had thought.

      Also, from a test done by Vitaly Kuznetsov running a nested HyperV, it
      was concluded that the first read of the SCONTROL MSR returned the
      value 0x1, aka HV_SYNIC_CONTROL_ENABLE.

      It's important to note that this diverges from the value of 0 stated in
      the HyperV TLFS.
      Signed-off-by: Jon Doron <arilou@gmail.com>
      Message-Id: <20200717125238.1103096-2-arilou@gmail.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
  13. 04 Jun, 2020 1 commit
  14. 01 Jun, 2020 3 commits
    • x86/kvm/hyper-v: Add support for synthetic debugger via hypercalls · b187038b
      Committed by Jon Doron
      There is another mode for the synthetic debugger which uses hypercalls
      to send/recv network data instead of the MSR interface.

      This interface is much slower and less recommended, since you might get
      a lot of VMExits while KDVM polls for new packets to recv, rather
      than simply checking the pending page to see if there is data available
      and then requesting it.
      Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com>
      Signed-off-by: Jon Doron <arilou@gmail.com>
      Message-Id: <20200529134543.1127440-6-arilou@gmail.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • x86/kvm/hyper-v: enable hypercalls regardless of hypercall page · 45c38973
      Committed by Jon Doron
      Microsoft's kdvm.dll dbgtransport module does not respect the hypercall
      page: it simply identifies the CPU being used (AMD/Intel) and makes
      hypercalls with the corresponding instruction
      (vmmcall/vmcall respectively).
      
      The relevant function in kdvm is KdHvConnectHypervisor which first checks
      if the hypercall page has been enabled via HV_X64_MSR_HYPERCALL_ENABLE,
      and in case it was not it simply sets the HV_X64_MSR_GUEST_OS_ID to
      0x1000101010001 which means:
      build_number = 0x0001
      service_version = 0x01
      minor_version = 0x01
      major_version = 0x01
      os_id = 0x00 (Undefined)
      vendor_id = 1 (Microsoft)
      os_type = 0 (A value of 0 indicates a proprietary, closed source OS)
      
      and starts issuing the hypercall without setting the hypercall page.
      
      To resolve this issue, also enable hypercalls when the guest_os_id
      is not 0.
      Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com>
      Signed-off-by: Jon Doron <arilou@gmail.com>
      Message-Id: <20200529134543.1127440-5-arilou@gmail.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
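The field breakdown of 0x1000101010001 given in this commit message can be reproduced with plain bit shifts. The sketch below assumes the usual TLFS layout for a proprietary guest OS ID (build number in bits 15:0, service version in 23:16, minor version in 31:24, major version in 39:32, OS ID in 47:40, vendor ID in 62:48, OS type in bit 63). The function names are hypothetical, and `hypercalls_enabled` only restates the rule from the commit message rather than the kernel's actual helper.

```python
def decode_guest_os_id(value: int) -> dict:
    """Split a 64-bit HV_X64_MSR_GUEST_OS_ID value into its fields (assumed layout)."""
    return {
        "build_number": value & 0xFFFF,            # bits 15:0
        "service_version": (value >> 16) & 0xFF,   # bits 23:16
        "minor_version": (value >> 24) & 0xFF,     # bits 31:24
        "major_version": (value >> 32) & 0xFF,     # bits 39:32
        "os_id": (value >> 40) & 0xFF,             # bits 47:40
        "vendor_id": (value >> 48) & 0x7FFF,       # bits 62:48
        "os_type": (value >> 63) & 0x1,            # bit 63
    }


def hypercalls_enabled(hypercall_page_enabled: bool, guest_os_id: int) -> bool:
    """Rule from the commit message: a non-zero guest OS ID also enables hypercalls."""
    return hypercall_page_enabled or guest_os_id != 0


if __name__ == "__main__":
    fields = decode_guest_os_id(0x1000101010001)
    for name, val in fields.items():
        print(f"{name} = {val:#x}")
    print(hypercalls_enabled(False, 0x1000101010001))
```

Under that layout the decoded fields match the list in the commit message: every version field is 1, the OS ID is 0, the vendor ID is 1 and the OS type is 0.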
    • x86/kvm/hyper-v: Add support for synthetic debugger interface · f97f5a56
      Committed by Jon Doron
      Add support for the Hyper-V synthetic debugger (syndbg) interface.
      The syndbg interface uses MSRs to emulate a way to send/recv packet
      data.

      The debug transport dll (kdvm/kdnet) will identify whether Hyper-V is
      enabled, and if it supports the synthetic debugger interface it will
      attempt to use it instead of trying to initialize a network adapter.
      Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com>
      Signed-off-by: Jon Doron <arilou@gmail.com>
      Message-Id: <20200529134543.1127440-4-arilou@gmail.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
  15. 20 May, 2020 1 commit
  16. 08 May, 2020 1 commit
  17. 23 Apr, 2020 1 commit
    • KVM: x86: move nested-related kvm_x86_ops to a separate struct · 33b22172
      Committed by Paolo Bonzini
      Clean up some of the patching of kvm_x86_ops, by moving kvm_x86_ops related to
      nested virtualization into a separate struct.
      
      As a result, these ops will always be non-NULL on VMX.  This is not a problem:
      
      * check_nested_events is only called if is_guest_mode(vcpu) returns true
      
      * get_nested_state treats VMXOFF state the same as nested being disabled
      
      * set_nested_state fails if you attempt to set nested state while
        nesting is disabled
      
      * nested_enable_evmcs could already be called on a CPU without VMX enabled
        in CPUID.
      
      * nested_get_evmcs_version was fixed in the previous patch
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>