1. 13 March 2021, 4 commits
    • KVM: LAPIC: Advancing the timer expiration on guest initiated write · 35737d2d
      Committed by Wanpeng Li
      Advancing the timer expiration should only be necessary on guest initiated
      writes. When we cancel the timer and clear .pending during state restore,
      clear expired_tscdeadline as well.
      Reviewed-by: Sean Christopherson <seanjc@google.com>
      Signed-off-by: Wanpeng Li <wanpengli@tencent.com>
      Message-Id: <1614818118-965-1-git-send-email-wanpengli@tencent.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      35737d2d
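      A toy model of the behavior described above; the struct and function
      names are made-up stand-ins, not the real KVM LAPIC code:

      /* Illustrative sketch only; not the actual kvm_timer structure. */
      struct lapic_timer_sketch {
              unsigned int pending;                   /* expirations not yet injected */
              unsigned long long tscdeadline;         /* deadline programmed by guest */
              unsigned long long expired_tscdeadline; /* used to advance expiration   */
      };

      /* Guest-initiated write: advancing the expiration is meaningful here. */
      static void guest_writes_deadline(struct lapic_timer_sketch *t,
                                        unsigned long long deadline)
      {
              t->tscdeadline = deadline;
      }

      /* Userspace state restore: cancel the timer and drop all expiry state. */
      static void restore_timer_state(struct lapic_timer_sketch *t)
      {
              t->pending = 0;
              t->expired_tscdeadline = 0;   /* the fix: clear this here as well */
      }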
    • KVM: x86/mmu: Skip !MMU-present SPTEs when removing SP in exclusive mode · 8df9f1af
      Committed by Sean Christopherson
      If mmu_lock is held for write, don't bother setting !PRESENT SPTEs to
      REMOVED_SPTE when recursively zapping SPTEs as part of shadow page
      removal.  The concurrent write protections provided by REMOVED_SPTE are
      not needed, there are no backing page side effects to record, and MMIO
      SPTEs can be left as is since they are protected by the memslot
      generation, not by ensuring that the MMIO SPTE is unreachable (which
      is racy with respect to lockless walks regardless of zapping behavior).
      
      Skipping !PRESENT drastically reduces the number of updates needed to
      tear down sparsely populated MMUs, e.g. when tearing down a 6gb VM that
      didn't touch much memory, 6929/7168 (~96.6%) of SPTEs were '0' and could
      be skipped.
      
      Avoiding the write itself is likely close to a wash, but avoiding
      __handle_changed_spte() is a clear-cut win as that involves saving and
      restoring all non-volatile GPRs (it's a subtly big function), as well as
      several conditional branches before bailing out.
      
      Cc: Ben Gardon <bgardon@google.com>
      Signed-off-by: Sean Christopherson <seanjc@google.com>
      Message-Id: <20210310003029.1250571-1-seanjc@google.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      8df9f1af
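      A rough sketch of the shape of the optimization; the constants and
      helpers below are stand-ins for the real TDP MMU internals, not the
      actual patch:

      /* Illustrative only; SPTE_COUNT, REMOVED_SPTE_MARK and the helpers are stand-ins. */
      #define SPTE_COUNT        512
      #define REMOVED_SPTE_MARK 0x5aULL

      static int spte_is_mmu_present(unsigned long long spte)
      {
              return (int)(spte & 1);   /* placeholder for the real present check */
      }

      static void zap_child_sptes(unsigned long long *sptes, int lock_held_for_write)
      {
              int i;

              for (i = 0; i < SPTE_COUNT; i++) {
                      unsigned long long old = sptes[i];

                      /*
                       * With mmu_lock held for write nothing can race with us,
                       * so a !PRESENT entry needs neither the REMOVED_SPTE
                       * marker nor a trip through the changed-SPTE handler.
                       */
                      if (lock_held_for_write && !spte_is_mmu_present(old))
                              continue;

                      sptes[i] = REMOVED_SPTE_MARK;
                      /* __handle_changed_spte(old, REMOVED_SPTE_MARK) would go here */
              }
      }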
    • KVM: kvmclock: Fix vCPUs > 64 can't be online/hotplugged · d7eb79c6
      Committed by Wanpeng Li
      # lscpu
      Architecture:          x86_64
      CPU op-mode(s):        32-bit, 64-bit
      Byte Order:            Little Endian
      CPU(s):                88
      On-line CPU(s) list:   0-63
      Off-line CPU(s) list:  64-87
      
      # cat /proc/cmdline
      BOOT_IMAGE=/vmlinuz-5.10.0-rc3-tlinux2-0050+ root=/dev/mapper/cl-root ro
      rd.lvm.lv=cl/root rhgb quiet console=ttyS0 LANG=en_US.UTF-8 no-kvmclock-vsyscall
      
      # echo 1 > /sys/devices/system/cpu/cpu76/online
      -bash: echo: write error: Cannot allocate memory
      
      The per-cpu vsyscall pvclock data pointer is assigned either an element
      of the static array hv_clock_boot (#vCPU <= 64) or dynamically allocated
      memory hvclock_mem (vCPU > 64). The dynamic memory is not allocated when
      the kvmclock vsyscall is disabled, so onlining a CPU above 64 fails in
      kvmclock_setup_percpu() with -ENOMEM. This breaks the no-vsyscall case,
      and vsyscall can also end up disabled if the host does something strange.
      Fix it by allocating the dynamic memory unconditionally, even when
      vsyscall is disabled.
      
      Fixes: 6a1cac56 ("x86/kvm: Use __bss_decrypted attribute in shared variables")
      Reported-by: Zelin Deng <zelin.deng@linux.alibaba.com>
      Cc: Brijesh Singh <brijesh.singh@amd.com>
      Cc: stable@vger.kernel.org # v4.19-rc5+
      Signed-off-by: Wanpeng Li <wanpengli@tencent.com>
      Message-Id: <1614130683-24137-1-git-send-email-wanpengli@tencent.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      d7eb79c6
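      A conceptual sketch of the before/after control flow, with hypothetical
      helper names standing in for the real kvmclock code:

      /* Conceptual sketch only; all names below are illustrative stand-ins. */
      #include <stdlib.h>

      #define HVC_BOOT_ARRAY_SIZE 64

      static void *hv_clock_boot[HVC_BOOT_ARRAY_SIZE]; /* static pages, vCPU <= 64 */
      static void *hvclock_mem;                         /* dynamic pages, vCPU > 64 */

      static void kvmclock_init_mem_sketch(void)
      {
              if (!hvclock_mem)
                      hvclock_mem = calloc(1, 4096);    /* stands in for the real alloc */
      }

      static void kvmclock_setup_sketch(int vsyscall_enabled)
      {
              /*
               * Before the fix the allocation was reachable only from the
               * vsyscall setup path, so with vsyscall disabled a vCPU above 64
               * had no pvclock page and onlining it failed with -ENOMEM.
               * The fix allocates unconditionally:
               */
              kvmclock_init_mem_sketch();

              if (vsyscall_enabled) {
                      /* map hv_clock_boot / hvclock_mem into the vDSO here */
              }
      }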
    • kvm: x86: annotate RCU pointers · 6fcd9cbc
      Committed by Muhammad Usama Anjum
      This patch adds the missing __rcu annotations to fix the following
      sparse errors:
      arch/x86/kvm//x86.c:8147:15: error: incompatible types in comparison expression (different address spaces):
      arch/x86/kvm//x86.c:8147:15:    struct kvm_apic_map [noderef] __rcu *
      arch/x86/kvm//x86.c:8147:15:    struct kvm_apic_map *
      arch/x86/kvm//x86.c:10628:16: error: incompatible types in comparison expression (different address spaces):
      arch/x86/kvm//x86.c:10628:16:    struct kvm_apic_map [noderef] __rcu *
      arch/x86/kvm//x86.c:10628:16:    struct kvm_apic_map *
      arch/x86/kvm//x86.c:10629:15: error: incompatible types in comparison expression (different address spaces):
      arch/x86/kvm//x86.c:10629:15:    struct kvm_pmu_event_filter [noderef] __rcu *
      arch/x86/kvm//x86.c:10629:15:    struct kvm_pmu_event_filter *
      arch/x86/kvm//lapic.c:267:15: error: incompatible types in comparison expression (different address spaces):
      arch/x86/kvm//lapic.c:267:15:    struct kvm_apic_map [noderef] __rcu *
      arch/x86/kvm//lapic.c:267:15:    struct kvm_apic_map *
      arch/x86/kvm//lapic.c:269:9: error: incompatible types in comparison expression (different address spaces):
      arch/x86/kvm//lapic.c:269:9:    struct kvm_apic_map [noderef] __rcu *
      arch/x86/kvm//lapic.c:269:9:    struct kvm_apic_map *
      arch/x86/kvm//lapic.c:637:15: error: incompatible types in comparison expression (different address spaces):
      arch/x86/kvm//lapic.c:637:15:    struct kvm_apic_map [noderef] __rcu *
      arch/x86/kvm//lapic.c:637:15:    struct kvm_apic_map *
      arch/x86/kvm//lapic.c:994:15: error: incompatible types in comparison expression (different address spaces):
      arch/x86/kvm//lapic.c:994:15:    struct kvm_apic_map [noderef] __rcu *
      arch/x86/kvm//lapic.c:994:15:    struct kvm_apic_map *
      arch/x86/kvm//lapic.c:1036:15: error: incompatible types in comparison expression (different address spaces):
      arch/x86/kvm//lapic.c:1036:15:    struct kvm_apic_map [noderef] __rcu *
      arch/x86/kvm//lapic.c:1036:15:    struct kvm_apic_map *
      arch/x86/kvm//lapic.c:1173:15: error: incompatible types in comparison expression (different address spaces):
      arch/x86/kvm//lapic.c:1173:15:    struct kvm_apic_map [noderef] __rcu *
      arch/x86/kvm//lapic.c:1173:15:    struct kvm_apic_map *
      arch/x86/kvm//pmu.c:190:18: error: incompatible types in comparison expression (different address spaces):
      arch/x86/kvm//pmu.c:190:18:    struct kvm_pmu_event_filter [noderef] __rcu *
      arch/x86/kvm//pmu.c:190:18:    struct kvm_pmu_event_filter *
      arch/x86/kvm//pmu.c:251:18: error: incompatible types in comparison expression (different address spaces):
      arch/x86/kvm//pmu.c:251:18:    struct kvm_pmu_event_filter [noderef] __rcu *
      arch/x86/kvm//pmu.c:251:18:    struct kvm_pmu_event_filter *
      arch/x86/kvm//pmu.c:522:18: error: incompatible types in comparison expression (different address spaces):
      arch/x86/kvm//pmu.c:522:18:    struct kvm_pmu_event_filter [noderef] __rcu *
      arch/x86/kvm//pmu.c:522:18:    struct kvm_pmu_event_filter *
      arch/x86/kvm//pmu.c:522:18: error: incompatible types in comparison expression (different address spaces):
      arch/x86/kvm//pmu.c:522:18:    struct kvm_pmu_event_filter [noderef] __rcu *
      arch/x86/kvm//pmu.c:522:18:    struct kvm_pmu_event_filter *
      Signed-off-by: Muhammad Usama Anjum <musamaanjum@gmail.com>
      Message-Id: <20210305191123.GA497469@LEGION>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      6fcd9cbc
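      A generic illustration of the __rcu pattern these annotations rely on;
      the struct names are invented and this is not the actual KVM code:

      /* Outside the kernel, define the annotation away; sparse supplies the real one. */
      #define __rcu

      struct apic_map_sketch { int nr_entries; };

      struct arch_sketch {
              /* Annotated field: accessed via rcu_dereference()/rcu_assign_pointer(). */
              struct apic_map_sketch __rcu *apic_map;
      };

      /*
       * Comparing or dereferencing an __rcu pointer as a plain pointer is what
       * makes sparse report "incompatible types in comparison expression
       * (different address spaces)"; the cure is to annotate the other side of
       * the comparison too, or to go through the RCU accessors.
       */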
  2. 11 March 2021, 2 commits
  3. 06 March 2021, 1 commit
  4. 05 March 2021, 2 commits
  5. 03 March 2021, 6 commits
  6. 27 February 2021, 4 commits
  7. 26 February 2021, 4 commits
  8. 25 February 2021, 1 commit
    • KVM: SVM: Fix nested VM-Exit on #GP interception handling · 2df8d380
      Committed by Sean Christopherson
      Fix the interpretation of nested_svm_vmexit()'s return value when
      synthesizing a nested VM-Exit after intercepting an SVM instruction while
      L2 was running.  The helper returns '0' on success, whereas a return
      value of '0' in the exit handler path means "exit to userspace".  The
      incorrect return value causes KVM to exit to userspace without filling
      the run state, e.g. QEMU logs "KVM: unknown exit, hardware reason 0".
      
      Fixes: 14c2bf81 ("KVM: SVM: Fix #GP handling for doubly-nested virtualization")
      Signed-off-by: Sean Christopherson <seanjc@google.com>
      Message-Id: <20210224005627.657028-1-seanjc@google.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      2df8d380
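      A minimal sketch of the two clashing return-value conventions, with
      invented function names rather than the real handlers:

      /* Illustrative sketch of the return-value mismatch; not the exact KVM code. */

      /* Helper convention: 0 means success, negative means error. */
      static int nested_vmexit_sketch(void)
      {
              /* ... synthesize the nested VM-Exit for L1 ... */
              return 0;
      }

      /*
       * Exit-handler convention: 1 means "resume the guest", 0 means "exit to
       * userspace" (with the run state filled in elsewhere).
       */
      static int instr_intercept_handler_sketch(void)
      {
              int ret = nested_vmexit_sketch();

              /*
               * Buggy form: "return nested_vmexit_sketch();" -- a successful 0
               * is then misread as "exit to userspace" with no run state set.
               */
              if (ret)
                      return ret;   /* propagate a real failure */

              return 1;             /* success: keep running the guest */
      }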
  9. 24 February 2021, 5 commits
  10. 23 February 2021, 4 commits
    • KVM: x86/mmu: Consider the hva in mmu_notifier retry · 4a42d848
      Committed by David Stevens
      Track the range being invalidated by mmu_notifier and skip page fault
      retries if the fault address is not affected by the in-progress
      invalidation. Handle concurrent invalidations by finding the minimal
      range which includes all ranges being invalidated. Although the combined
      range may include unrelated addresses and cannot be shrunk as individual
      invalidation operations complete, it is unlikely the marginal gains of
      proper range tracking are worth the additional complexity.
      
      The primary benefit of this change is the reduction in the likelihood of
      extreme latency when handling a page fault due to another thread having
      been preempted while modifying host virtual addresses.
      Signed-off-by: David Stevens <stevensd@chromium.org>
      Message-Id: <20210222024522.1751719-3-stevensd@google.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      4a42d848
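      A simplified model of the range tracking described above; the struct and
      helpers are illustrative only:

      /* Illustrative model; not the actual mmu_notifier bookkeeping. */
      struct invalidate_range_sketch {
              int count;                   /* in-progress invalidations      */
              unsigned long start, end;    /* union of all in-flight ranges  */
      };

      static void range_start(struct invalidate_range_sketch *r,
                              unsigned long start, unsigned long end)
      {
              if (!r->count++) {
                      r->start = start;
                      r->end = end;
              } else {
                      /* Grow to the minimal range covering every in-flight one. */
                      if (start < r->start)
                              r->start = start;
                      if (end > r->end)
                              r->end = end;
              }
      }

      static void range_end(struct invalidate_range_sketch *r)
      {
              r->count--;   /* the combined range is deliberately not shrunk */
      }

      /* Retry the page fault only if its hva falls inside the tracked range. */
      static int should_retry_fault(const struct invalidate_range_sketch *r,
                                    unsigned long hva)
      {
              return r->count && hva >= r->start && hva < r->end;
      }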
    • KVM: x86/mmu: Skip mmu_notifier check when handling MMIO page fault · 5f8a7cf2
      Committed by Sean Christopherson
      Don't retry a page fault due to an mmu_notifier invalidation when
      handling a page fault for a GPA that did not resolve to a memslot, i.e.
      an MMIO page fault.  Invalidations from the mmu_notifier signal a change
      in a host virtual address (HVA) mapping; without a memslot, there is no
      HVA and thus no possibility that the invalidation is relevant to the
      page fault being handled.
      
      Note, the MMIO vs. memslot generation checks handle the case where a
      pending memslot will create a memslot overlapping the faulting GPA.  The
      mmu_notifier checks are orthogonal to memslot updates.
      Signed-off-by: Sean Christopherson <seanjc@google.com>
      Message-Id: <20210222024522.1751719-2-stevensd@google.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      5f8a7cf2
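      A small, hypothetical sketch of the memslot short-circuit; the fields
      below are stand-ins for the real fault state:

      /* Illustrative only. */
      struct fault_info_sketch {
              int has_memslot;     /* false for MMIO faults (no backing memslot) */
              unsigned long hva;   /* only meaningful when has_memslot is true   */
      };

      static int mmu_notifier_retry_needed(const struct fault_info_sketch *f,
                                           int invalidation_in_progress)
      {
              /* No memslot, no hva: an invalidation cannot concern this fault. */
              if (!f->has_memslot)
                      return 0;

              return invalidation_in_progress;   /* the hva range check would go here */
      }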
    • KVM: nSVM: prepare guest save area while is_guest_mode is true · d2df592f
      Committed by Paolo Bonzini
      Right now, enter_svm_guest_mode is calling nested_prepare_vmcb_save and
      nested_prepare_vmcb_control.  This results in is_guest_mode being false
      until the end of nested_prepare_vmcb_control.
      
      This is a problem because nested_prepare_vmcb_save can in turn cause
      changes to the intercepts and these have to be applied to the "host VMCB"
      (stored in svm->nested.hsave) and then merged with the VMCB12 intercepts
      into svm->vmcb.
      
      In particular, without this change we forget to set the CR0 read and CR0
      write intercepts when running a real mode L2 guest with NPT disabled.
      The guest is therefore able to see the CR0.PG bit that KVM sets to
      enable "paged real mode".  This patch fixes the svm.flat mode_switch
      test case with npt=0.  There are no other problematic calls in
      nested_prepare_vmcb_save.
      
      Setting is_guest_mode only at the end dates back to commit 06fc7772
      ("KVM: SVM: Activate nested state only when guest state is complete",
      2010-04-25).  However, back then KVM didn't grab a different VMCB
      when updating the intercepts: it had already copied/merged L1's state
      into L0's VMCB, and then updated L0's VMCB regardless of is_nested().
      Later recalc_intercepts was introduced in commit 384c6368
      ("KVM: SVM: Add function to recalculate intercept masks", 2011-01-12).
      This introduced the bug, because recalc_intercepts now throws away
      the intercept manipulations that svm_set_cr0 had done in the meanwhile
      to svm->vmcb.
      
      [1] https://lore.kernel.org/kvm/1266493115-28386-1-git-send-email-joerg.roedel@amd.com/
      Reviewed-by: Sean Christopherson <seanjc@google.com>
      Tested-by: Vitaly Kuznetsov <vkuznets@redhat.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      d2df592f
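      A toy model of why the ordering matters, assuming a single merged
      intercept mask; every name below is invented and greatly simplified
      relative to the real VMCB handling:

      /* Illustrative only; not the actual SVM intercept code. */
      #define CR0_READ_INTERCEPT  0x1UL
      #define CR0_WRITE_INTERCEPT 0x2UL

      struct nsvm_sketch {
              int is_guest_mode;
              unsigned long host_intercepts;    /* the "hsave"-side set          */
              unsigned long vmcb12_intercepts;  /* what L1 asked to intercept    */
              unsigned long active_intercepts;  /* what actually gets programmed */
      };

      static void recalc_intercepts_sketch(struct nsvm_sketch *v)
      {
              if (!v->is_guest_mode)
                      return;
              /* Rebuilt from scratch: direct writes to active are thrown away. */
              v->active_intercepts = v->host_intercepts | v->vmcb12_intercepts;
      }

      static void set_cr_intercepts_sketch(struct nsvm_sketch *v, unsigned long bits)
      {
              /* Intercept helpers target the host copy only while in guest mode. */
              if (v->is_guest_mode)
                      v->host_intercepts |= bits;
              else
                      v->active_intercepts |= bits;
              recalc_intercepts_sketch(v);
      }

      static void enter_guest_mode_sketch(struct nsvm_sketch *v)
      {
              /*
               * The fix: flip is_guest_mode *before* preparing the save area,
               * so the CR0 intercepts set there land in host_intercepts and
               * survive the final recalc.  With the old ordering they landed
               * in active_intercepts and were overwritten below.
               */
              v->is_guest_mode = 1;
              set_cr_intercepts_sketch(v, CR0_READ_INTERCEPT | CR0_WRITE_INTERCEPT);
              recalc_intercepts_sketch(v);
      }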
    • bpf, x86: Fix BPF_FETCH atomic and/or/xor with r0 as src · b29dd96b
      Committed by Brendan Jackman
      This code generates a CMPXCHG loop in order to implement atomic_fetch
      bitwise operations. Because CMPXCHG is hard-coded to use rax (which
      holds the BPF r0 value), it saves the _real_ r0 value into the
      internal "ax" temporary register and restores it once the loop is
      complete.
      
      In the middle of the loop, the actual bitwise operation is performed
      using src_reg. The bug occurs when src_reg is r0: as described above,
      r0 has been clobbered and the real r0 value is in the ax register.
      
      Therefore, perform this operation on the ax register instead, when
      src_reg is r0.
      
      Fixes: 981f94c3 ("bpf: Add bitwise atomic instructions")
      Signed-off-by: Brendan Jackman <jackmanb@google.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Acked-by: KP Singh <kpsingh@kernel.org>
      Link: https://lore.kernel.org/bpf/20210216125307.1406237-1-jackmanb@google.com
      b29dd96b
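      A minimal sketch of the register fix-up, with invented constants standing
      in for BPF_REG_0 and the JIT's internal AX register:

      /* Illustrative sketch; not the actual x86 BPF JIT code. */
      enum { SKETCH_R0 = 0, SKETCH_AX = 11 };

      /*
       * The CMPXCHG loop clobbers r0 (it lives in rax), so the real r0 value
       * is parked in the internal AX register for the duration of the loop.
       * Any use of "r0 as the source operand" inside the loop must therefore
       * read AX instead.
       */
      static int effective_src_reg(int src_reg)
      {
              return src_reg == SKETCH_R0 ? SKETCH_AX : src_reg;
      }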
  11. 22 February 2021, 3 commits
  12. 19 February 2021, 4 commits