Commits · ffd1925a596ce68bed7d81c61cb64bc35f788a9d · openeuler / Kernel

May 25, 2022 · 15 commits

KVM: x86: Fix the intel_pt PMI handling wrongly considered from guest · ffd1925a

Committed by Yanfei Xu on May 23, 2022

When the kernel handles a VM-exit caused by an external interrupt or NMI,
it always sets kvm_intr_type to indicate whether it is dealing with an IRQ
or an NMI. For the PMI scenario, it could be either.

However, intel_pt PMIs are only generated for HARDWARE perf events, and
HARDWARE events are always configured to generate NMIs.  Use
kvm_handling_nmi_from_guest() to precisely identify if the intel_pt PMI
came from the guest; this avoids false positives if an intel_pt PMI/NMI
arrives while the host is handling an unrelated IRQ VM-Exit.
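
The difference between "handling some interrupt from the guest" and "handling an NMI from the guest" can be sketched as a small userspace model. The struct layout, enum values, and the `model_` function names below are assumptions made for illustration; they are not the kernel's actual definitions.

```c
#include <stdbool.h>
#include <stddef.h>

/* Illustrative model only: values and layout are assumptions. */
enum model_intr_type {
	MODEL_NOT_HANDLING = 0,
	MODEL_HANDLING_IRQ,
	MODEL_HANDLING_NMI,
};

struct model_vcpu {
	enum model_intr_type handling_intr_from_guest;
};

/* Broad check: true while any interrupt VM-exit (IRQ or NMI) is handled. */
static bool model_handling_intr_from_guest(const struct model_vcpu *v)
{
	return v != NULL && v->handling_intr_from_guest != MODEL_NOT_HANDLING;
}

/* Precise check: true only while an NMI VM-exit is being handled. */
static bool model_handling_nmi_from_guest(const struct model_vcpu *v)
{
	return v != NULL && v->handling_intr_from_guest == MODEL_HANDLING_NMI;
}
```

With the broad check, a PMI that arrives while the host handles an unrelated IRQ VM-exit would be attributed to the guest; the precise check rejects that case.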

Fixes: db215756 ("KVM: x86: More precisely identify NMI from guest when handling PMI")
Signed-off-by: Yanfei Xu <yanfei.xu@intel.com>
Message-Id: <20220523140821.1345605-1-yanfei.xu@intel.com>
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

ffd1925a

KVM: selftests: x86: Sync the new name of the test case to .gitignore · 366d4a12

Committed by Like Xu on May 19, 2022

Fix a side effect of the so-called opportunistic change in that commit.

Fixes: dc8a9febbab0 ("KVM: selftests: x86: Fix test failure on arch lbr capable platforms")
Signed-off-by: Like Xu <likexu@tencent.com>
Message-Id: <20220518170118.66263-2-likexu@tencent.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

366d4a12

Documentation: kvm: reorder ARM-specific section about KVM_SYSTEM_EVENT_SUSPEND · 186af6bb

Committed by Paolo Bonzini on May 20, 2022

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

186af6bb

x86, kvm: use correct GFP flags for preemption disabled · baec4f5a

Committed by Paolo Bonzini on May 24, 2022

Commit ddd7ed842627 ("x86/kvm: Alloc dummy async #PF token outside of
raw spinlock") leads to the following Smatch static checker warning:

	arch/x86/kernel/kvm.c:212 kvm_async_pf_task_wake()
	warn: sleeping in atomic context

arch/x86/kernel/kvm.c
    202         raw_spin_lock(&b->lock);
    203         n = _find_apf_task(b, token);
    204         if (!n) {
    205                 /*
    206                  * Async #PF not yet handled, add a dummy entry for the token.
    207                  * Allocating the token must be down outside of the raw lock
    208                  * as the allocator is preemptible on PREEMPT_RT kernels.
    209                  */
    210                 if (!dummy) {
    211                         raw_spin_unlock(&b->lock);
--> 212                         dummy = kzalloc(sizeof(*dummy), GFP_KERNEL);
                                                                ^^^^^^^^^^
Smatch thinks the caller has preempt disabled.  The `smdb.py preempt
kvm_async_pf_task_wake` output call tree is:

sysvec_kvm_asyncpf_interrupt() <- disables preempt
-> __sysvec_kvm_asyncpf_interrupt()
   -> kvm_async_pf_task_wake()

The caller is this:

arch/x86/kernel/kvm.c
   290        DEFINE_IDTENTRY_SYSVEC(sysvec_kvm_asyncpf_interrupt)
   291        {
   292                struct pt_regs *old_regs = set_irq_regs(regs);
   293                u32 token;
   294
   295                ack_APIC_irq();
   296
   297                inc_irq_stat(irq_hv_callback_count);
   298
   299                if (__this_cpu_read(apf_reason.enabled)) {
   300                        token = __this_cpu_read(apf_reason.token);
   301                        kvm_async_pf_task_wake(token);
   302                        __this_cpu_write(apf_reason.token, 0);
   303                        wrmsrl(MSR_KVM_ASYNC_PF_ACK, 1);
   304                }
   305
   306                set_irq_regs(old_regs);
   307        }

The DEFINE_IDTENTRY_SYSVEC() is a wrapper that calls this function
from the call_on_irqstack_cond().  It's inside the call_on_irqstack_cond()
where preempt is disabled (unless it's already disabled).  The
irq_enter/exit_rcu() functions disable/enable preempt.

Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

baec4f5a

KVM: LAPIC: Drop pending LAPIC timer injection when canceling the timer · 619f51da

Committed by Wanpeng Li on May 20, 2022

The timer is disarmed when switching between TSC deadline and other modes;
however, the pending timer is still in-flight, so let's accurately remove
any traces of the previous mode.

Fixes: 44275932 ("KVM: x86: thoroughly disarm LAPIC timer around TSC deadline switch")
Signed-off-by: Wanpeng Li <wanpengli@tencent.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

619f51da

x86/kvm: Alloc dummy async #PF token outside of raw spinlock · 0547758a

Committed by Sean Christopherson on May 19, 2022

Drop the raw spinlock in kvm_async_pf_task_wake() before allocating the
dummy async #PF token; the allocator is preemptible on PREEMPT_RT
kernels and must not be called from truly atomic contexts.

Opportunistically document why it's ok to loop on allocation failure,
i.e. why the function won't get stuck in an infinite loop.
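
The drop-lock, allocate, retake-lock pattern described above can be sketched in userspace as follows. This is a minimal model, not the kernel code: the flag-based lock, the `model_` names, and the list layout are assumptions, and where the kernel loops on allocation failure this sketch simply returns an error.

```c
#include <stdbool.h>
#include <stdlib.h>

struct model_node {
	unsigned int token;
	struct model_node *next;
};

struct model_bucket {
	bool locked;			/* stand-in for the raw spinlock */
	struct model_node *head;
};

static int model_allocs_under_lock;	/* counts allocations made while locked */

static struct model_node *model_find(struct model_bucket *b, unsigned int token)
{
	struct model_node *n;

	for (n = b->head; n != NULL; n = n->next)
		if (n->token == token)
			return n;
	return NULL;
}

/* Adds a dummy entry for token if none exists; returns 0 on success. */
static int model_add_dummy_if_missing(struct model_bucket *b, unsigned int token)
{
	struct model_node *dummy = NULL;

again:
	b->locked = true;
	if (model_find(b, token) == NULL) {
		if (dummy == NULL) {
			/* Drop the lock before calling the allocator. */
			b->locked = false;
			if (b->locked)
				model_allocs_under_lock++;
			dummy = calloc(1, sizeof(*dummy));
			if (dummy == NULL)
				return -1;	/* model only: the kernel retries */
			goto again;	/* re-check under the lock */
		}
		dummy->token = token;
		dummy->next = b->head;
		b->head = dummy;
		dummy = NULL;
	}
	b->locked = false;
	free(dummy);	/* entry appeared while unlocked: discard the spare */
	return 0;
}
```

The lookup is repeated after retaking the lock because the entry may have been added by someone else while the lock was dropped.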

Reported-by: Yajun Deng <yajun.deng@linux.dev>
Cc: stable@vger.kernel.org
Signed-off-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

0547758a

KVM: x86: avoid calling x86 emulator without a decoded instruction · fee060cd

Committed by Sean Christopherson on March 11, 2022

Whenever x86_decode_emulated_instruction() detects a breakpoint, it
returns the value that kvm_vcpu_check_breakpoint() writes into its
pass-by-reference second argument.  Unfortunately this is completely
bogus because the expected outcome of x86_decode_emulated_instruction
is an EMULATION_* value.

Then, if kvm_vcpu_check_breakpoint() does "*r = 0" (corresponding to
a KVM_EXIT_DEBUG userspace exit), it is misunderstood as EMULATION_OK
and x86_emulate_instruction() is called without having decoded the
instruction.  This causes various havoc from running with a stale
emulation context.

The fix is to move the call to kvm_vcpu_check_breakpoint() where it was
before commit 4aa2691d ("KVM: x86: Factor out x86 instruction
emulation with decoding") introduced x86_decode_emulated_instruction().
The other caller of the function does not need breakpoint checks,
because it is invoked as part of a vmexit and the processor has already
checked those before executing the instruction that #GP'd.
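
The return-value confusion can be reproduced in a small userspace model. All names prefixed with `model_` and `MODEL_` are hypothetical stand-ins, and the constant values are assumptions chosen so that "exit to userspace" and "decode succeeded" are both 0, which is the collision the commit message describes.

```c
#include <stdbool.h>

/* Model constants; the real EMULATION_* values live in KVM's x86 code. */
enum { MODEL_EMULATION_FAILED = -1, MODEL_EMULATION_OK = 0 };

/*
 * Stand-in for kvm_vcpu_check_breakpoint(): returns true when a breakpoint
 * was hit and stores the caller's exit code in *r (0 models a
 * KVM_EXIT_DEBUG exit to userspace).
 */
static bool model_check_breakpoint(bool bp_hit, int *r)
{
	if (!bp_hit)
		return false;
	*r = 0;
	return true;
}

/* Buggy shape: the breakpoint result leaks out as a decode status. */
static int model_decode_buggy(bool bp_hit, bool *decoded)
{
	int r;

	*decoded = false;
	if (model_check_breakpoint(bp_hit, &r))
		return r;	/* 0 is indistinguishable from MODEL_EMULATION_OK */
	*decoded = true;
	return MODEL_EMULATION_OK;
}

/*
 * Fixed shape: the caller checks breakpoints itself, so decode only ever
 * reports a decode status. Returns true if the emulator may run.
 */
static bool model_emulate_fixed(bool bp_hit, int *exit_code)
{
	bool decoded;

	if (model_check_breakpoint(bp_hit, exit_code))
		return false;	/* exit to userspace; emulator never runs */
	(void)model_decode_buggy(false, &decoded);
	return decoded;
}
```

In the buggy shape, a caller that only tests for `MODEL_EMULATION_OK` proceeds to emulate even though `decoded` is false.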

This fixes CVE-2022-1852.

Reported-by: Qiuhao Li <qiuhao@sysec.org>
Reported-by: Gaoning Pan <pgn@zju.edu.cn>
Reported-by: Yongkang Jia <kangel@zju.edu.cn>
Fixes: 4aa2691d ("KVM: x86: Factor out x86 instruction emulation with decoding")
Cc: stable@vger.kernel.org
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20220311032801.3467418-2-seanjc@google.com>
[Rewrote commit message according to Qiuhao's report, since a patch
 already existed to fix the bug. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

fee060cd

KVM: SVM: Use kzalloc for sev ioctl interfaces to prevent kernel data leak · d22d2474

Committed by Ashish Kalra on May 16, 2022

For some sev ioctl interfaces, the length parameter that is passed may be
less than or equal to SEV_FW_BLOB_MAX_SIZE, but larger than the data
that PSP firmware returns. In this case, kmalloc will allocate memory
that is the size of the input rather than the size of the data.
Since PSP firmware doesn't fully overwrite the allocated buffer, these
sev ioctl interfaces may return uninitialized kernel slab memory.
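
The leak can be illustrated with a userspace model in which a fake slab object is pre-filled with a stale marker. The `model_` helpers, the marker bytes, and the 64-byte `MODEL_BLOB_MAX` are assumptions for illustration only; they do not reflect the real allocator or the real value of SEV_FW_BLOB_MAX_SIZE.

```c
#include <stddef.h>
#include <string.h>

#define MODEL_BLOB_MAX 64	/* hypothetical stand-in for the size limit */
#define MODEL_STALE    0x5A	/* marker for leftover slab contents */

static unsigned char model_slab[MODEL_BLOB_MAX];

/* Simulate a slab object that still holds data from a previous user. */
static void model_dirty_slab(void)
{
	memset(model_slab, MODEL_STALE, sizeof(model_slab));
}

/* kmalloc-like: hands back memory without clearing it. */
static unsigned char *model_kmalloc(size_t len)
{
	return len <= MODEL_BLOB_MAX ? model_slab : NULL;
}

/* kzalloc-like: same memory, but zeroed first. */
static unsigned char *model_kzalloc(size_t len)
{
	unsigned char *p = model_kmalloc(len);

	if (p != NULL)
		memset(p, 0, len);
	return p;
}

/* Firmware writes only fw_len bytes, fewer than the caller asked for. */
static void model_firmware_fill(unsigned char *buf, size_t fw_len)
{
	memset(buf, 0xAB, fw_len);
}

/* Count stale bytes that would be copied out in [fw_len, user_len). */
static size_t model_stale_bytes(const unsigned char *buf, size_t fw_len,
				size_t user_len)
{
	size_t i, n = 0;

	for (i = fw_len; i < user_len; i++)
		if (buf[i] == MODEL_STALE)
			n++;
	return n;
}
```

With a 32-byte request and an 8-byte firmware reply, the kmalloc-like path leaves 24 stale bytes in the buffer, while the kzalloc-like path leaves none.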

Reported-by: Andy Nguyen <theflow@google.com>
Suggested-by: David Rientjes <rientjes@google.com>
Suggested-by: Peter Gonda <pgonda@google.com>
Cc: kvm@vger.kernel.org
Cc: stable@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Fixes: eaf78265 ("KVM: SVM: Move SEV code to separate file")
Fixes: 2c07ded0 ("KVM: SVM: add support for SEV attestation command")
Fixes: 4cfdd47d ("KVM: SVM: Add KVM_SEV SEND_START command")
Fixes: d3d1af85 ("KVM: SVM: Add KVM_SEND_UPDATE_DATA command")
Fixes: eba04b20 ("KVM: x86: Account a variety of miscellaneous allocations")
Signed-off-by: Ashish Kalra <ashish.kalra@amd.com>
Reviewed-by: Peter Gonda <pgonda@google.com>
Message-Id: <20220516154310.3685678-1-Ashish.Kalra@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

d22d2474

x86/fpu: KVM: Set the base guest FPU uABI size to sizeof(struct kvm_xsave) · d187ba53

Committed by Sean Christopherson on May 4, 2022

Set the starting uABI size of KVM's guest FPU to 'struct kvm_xsave',
i.e. to KVM's historical uABI size.  When saving FPU state for userspace,
KVM (well, now the FPU) sets the FP+SSE bits in the XSAVE header even if
the host doesn't support XSAVE.  Setting the XSAVE header allows the VM
to be migrated to a host that does support XSAVE without the new host
having to handle FPU state that may or may not be compatible with XSAVE.

Setting the uABI size to the host's default size results in out-of-bounds
writes (setting the FP+SSE bits) and data corruption (that is thankfully
caught by KASAN) when running on hosts without XSAVE, e.g. on Core2 CPUs.

WARN if the default size is larger than KVM's historical uABI size; all
features that can push the FPU size beyond the historical size must be
opt-in.

  ==================================================================
  BUG: KASAN: slab-out-of-bounds in fpu_copy_uabi_to_guest_fpstate+0x86/0x130
  Read of size 8 at addr ffff888011e33a00 by task qemu-build/681
  CPU: 1 PID: 681 Comm: qemu-build Not tainted 5.18.0-rc5-KASAN-amd64 #1
  Hardware name:  /DG35EC, BIOS ECG3510M.86A.0118.2010.0113.1426 01/13/2010
  Call Trace:
   <TASK>
   dump_stack_lvl+0x34/0x45
   print_report.cold+0x45/0x575
   kasan_report+0x9b/0xd0
   fpu_copy_uabi_to_guest_fpstate+0x86/0x130
   kvm_arch_vcpu_ioctl+0x72a/0x1c50 [kvm]
   kvm_vcpu_ioctl+0x47f/0x7b0 [kvm]
   __x64_sys_ioctl+0x5de/0xc90
   do_syscall_64+0x31/0x50
   entry_SYSCALL_64_after_hwframe+0x44/0xae
   </TASK>
  Allocated by task 0:
  (stack is not available)
  The buggy address belongs to the object at ffff888011e33800
   which belongs to the cache kmalloc-512 of size 512
  The buggy address is located 0 bytes to the right of
   512-byte region [ffff888011e33800, ffff888011e33a00)
  The buggy address belongs to the physical page:
  page:0000000089cd4adb refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x11e30
  head:0000000089cd4adb order:2 compound_mapcount:0 compound_pincount:0
  flags: 0x4000000000010200(slab|head|zone=1)
  raw: 4000000000010200 dead000000000100 dead000000000122 ffff888001041c80
  raw: 0000000000000000 0000000080100010 00000001ffffffff 0000000000000000
  page dumped because: kasan: bad access detected
  Memory state around the buggy address:
   ffff888011e33900: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
   ffff888011e33980: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  >ffff888011e33a00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
                     ^
   ffff888011e33a80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
   ffff888011e33b00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
  ==================================================================
  Disabling lock debugging due to kernel taint

Fixes: be50b206 ("kvm: x86: Add support for getting/setting expanded xstate buffer")
Fixes: c60427dd ("x86/fpu: Add uabi_size to guest_fpu")
Reported-by: Zdenek Kaspar <zkaspar82@gmail.com>
Cc: Maciej S. Szmigiero <mail@maciej.szmigiero.name>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: kvm@vger.kernel.org
Cc: stable@vger.kernel.org
Signed-off-by: Sean Christopherson <seanjc@google.com>
Tested-by: Zdenek Kaspar <zkaspar82@gmail.com>
Message-Id: <20220504001219.983513-1-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

d187ba53

s390/uv_uapi: depend on CONFIG_S390 · eb3de2d8

Committed by Paolo Bonzini on May 23, 2022

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

eb3de2d8

Merge tag 'kvm-s390-next-5.19-1' of... · 1644e270

Committed by Paolo Bonzini on May 25, 2022

Merge tag 'kvm-s390-next-5.19-1' of git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into HEAD

KVM: s390: Fix and feature for 5.19

- ultravisor communication device driver
- fix TEID on terminating storage key ops

1644e270

Merge tag 'kvm-riscv-5.19-1' of https://github.com/kvm-riscv/linux into HEAD · b699da3d

Committed by Paolo Bonzini on May 25, 2022

KVM/riscv changes for 5.19

- Added Sv57x4 support for G-stage page table
- Added range based local HFENCE functions
- Added remote HFENCE functions based on VCPU requests
- Added ISA extension registers in ONE_REG interface
- Updated KVM RISC-V maintainers entry to cover selftests support

b699da3d

Merge tag 'kvmarm-5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD · 47e8eec8

Committed by Paolo Bonzini on May 20, 2022

KVM/arm64 updates for 5.19

- Add support for the ARMv8.6 WFxT extension

- Guard pages for the EL2 stacks

- Trap and emulate AArch32 ID registers to hide unsupported features

- Ability to select and save/restore the set of hypercalls exposed
  to the guest

- Support for PSCI-initiated suspend in collaboration with userspace

- GICv3 register-based LPI invalidation support

- Move host PMU event merging into the vcpu data structure

- GICv3 ITS save/restore fixes

- The usual set of small-scale cleanups and fixes

[Due to the conflict, KVM_SYSTEM_EVENT_SEV_TERM is relocated
 from 4 to 6. - Paolo]

47e8eec8

KVM: selftests: x86: Fix test failure on arch lbr capable platforms · 825be3b5

Committed by Yang Weijiang on May 12, 2022

On Arch LBR capable platforms, LBR_FMT in the perf capability MSR is 0x3f,
so the last format test will fail. Use a truly invalid format (0x30) for
the test if it's running on these platforms. Opportunistically change
the file name to reflect the tests actually carried out.

Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Yang Weijiang <weijiang.yang@intel.com>
Message-Id: <20220512084046.105479-1-weijiang.yang@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

825be3b5

KVM: LAPIC: Trace LAPIC timer expiration on every vmentry · e0ac5351

Committed by Wanpeng Li on April 26, 2022

In commit ec0671d5 ("KVM: LAPIC: Delay trace_kvm_wait_lapic_expire
tracepoint to after vmexit", 2019-06-04), trace_kvm_wait_lapic_expire
was moved after guest_exit_irqoff() because invoking tracepoints within
kvm_guest_enter/kvm_guest_exit caused a lockdep splat.

These days this is not necessary, because commit 87fa7f3e ("x86/kvm:
Move context tracking where it belongs", 2020-07-09) restricted
the RCU extended quiescent state to be closer to vmentry/vmexit.
Moving the tracepoint back to __kvm_wait_lapic_expire is more accurate,
because it will be reported even if vcpu_enter_guest causes multiple
vmentries via the IPI/Timer fast paths, and it allows the removal of
advance_expire_delta.

Signed-off-by: Wanpeng Li <wanpengli@tencent.com>
Message-Id: <1650961551-38390-1-git-send-email-wanpengli@tencent.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

e0ac5351

May 20, 2022 · 15 commits

KVM: s390: selftest: Test suppression indication on key prot exception · c7115964

Committed by Janis Schoetterl-Glausch on May 12, 2022

Check that suppression is not indicated on injection of a key checked
protection exception caused by a memop after it already modified guest
memory, as that violates the definition of suppression.

Signed-off-by: Janis Schoetterl-Glausch <scgl@linux.ibm.com>
Reviewed-by: Christian Borntraeger <borntraeger@linux.ibm.com>
Link: https://lore.kernel.org/r/20220512131019.2594948-3-scgl@linux.ibm.com
Signed-off-by: Christian Borntraeger <borntraeger@linux.ibm.com>
Signed-off-by: Janosch Frank <frankja@linux.ibm.com>

c7115964

KVM: s390: Don't indicate suppression on dirtying, failing memop · c783631b

Committed by Janis Schoetterl-Glausch on May 12, 2022

If user space uses a memop to emulate an instruction and that
memop fails, the execution of the instruction ends.
Instruction execution can end in different ways, one of which is
suppression, which requires that the instruction execute like a no-op.
A writing memop that spans multiple pages and fails due to key
protection may have modified guest memory; as a result, the likely
correct ending is termination. Therefore, do not indicate a
suppressing instruction ending in this case.

Signed-off-by: Janis Schoetterl-Glausch <scgl@linux.ibm.com>
Reviewed-by: Christian Borntraeger <borntraeger@linux.ibm.com>
Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
Link: https://lore.kernel.org/r/20220512131019.2594948-2-scgl@linux.ibm.com
Signed-off-by: Christian Borntraeger <borntraeger@linux.ibm.com>
Signed-off-by: Janosch Frank <frankja@linux.ibm.com>

c783631b

selftests: drivers/s390x: Add uvdevice tests · cbac9242

Committed by Steffen Eiden on May 10, 2022

Add some selftests to test ioctl error paths of the uv-uapi.
The Kconfig S390_UV_UAPI must be selected and the Ultravisor facility
must be available. The test can be executed by non-root; however, the
uvdevice special file /dev/uv must be accessible for reading and
writing, which may imply root privileges.

  ./test-uv-device
  TAP version 13
  1..6
  # Starting 6 tests from 3 test cases.
  #  RUN           uvio_fixture.att.fault_ioctl_arg ...
  #            OK  uvio_fixture.att.fault_ioctl_arg
  ok 1 uvio_fixture.att.fault_ioctl_arg
  #  RUN           uvio_fixture.att.fault_uvio_arg ...
  #            OK  uvio_fixture.att.fault_uvio_arg
  ok 2 uvio_fixture.att.fault_uvio_arg
  #  RUN           uvio_fixture.att.inval_ioctl_cb ...
  #            OK  uvio_fixture.att.inval_ioctl_cb
  ok 3 uvio_fixture.att.inval_ioctl_cb
  #  RUN           uvio_fixture.att.inval_ioctl_cmd ...
  #            OK  uvio_fixture.att.inval_ioctl_cmd
  ok 4 uvio_fixture.att.inval_ioctl_cmd
  #  RUN           attest_fixture.att_inval_request ...
  #            OK  attest_fixture.att_inval_request
  ok 5 attest_fixture.att_inval_request
  #  RUN           attest_fixture.att_inval_addr ...
  #            OK  attest_fixture.att_inval_addr
  ok 6 attest_fixture.att_inval_addr
  # PASSED: 6 / 6 tests passed.
  # Totals: pass:6 fail:0 xfail:0 xpass:0 skip:0 error:0

Signed-off-by: Steffen Eiden <seiden@linux.ibm.com>
Acked-by: Janosch Frank <frankja@linux.ibm.com>
Message-Id: <20220510144724.3321985-3-seiden@linux.ibm.com>
Link: https://lore.kernel.org/kvm/20220510144724.3321985-3-seiden@linux.ibm.com/
Signed-off-by: Janosch Frank <frankja@linux.ibm.com>

cbac9242

drivers/s390/char: Add Ultravisor io device · 4689752c

Committed by Steffen Eiden on May 16, 2022

This patch adds a new miscdevice to expose some Ultravisor functions
to userspace. Userspace can send IOCTLs to the uvdevice that will then
emit a corresponding Ultravisor Call and hand the result over to
userspace. The uvdevice is available if the Ultravisor Call facility is
present.
Userspace can call the Retrieve Attestation Measurement
Ultravisor Call using IOCTLs on the uvdevice.

The uvdevice will do some sanity checks first. It then copies the
request data to kernel space, builds the UVCB, performs the UV call,
and copies the result back to userspace.

Signed-off-by: Steffen Eiden <seiden@linux.ibm.com>
Reviewed-by: Janosch Frank <frankja@linux.ibm.com>
Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
Link: https://lore.kernel.org/kvm/20220516113335.338212-1-seiden@linux.ibm.com/
Message-Id: <20220516113335.338212-1-seiden@linux.ibm.com>
Signed-off-by: Janosch Frank <frankja@linux.ibm.com> (whitespace and tristate fixes, pick)

4689752c

MAINTAINERS: Update KVM RISC-V entry to cover selftests support · fed9b26b

Committed by Anup Patel on May 9, 2022

Update the KVM RISC-V maintainers entry to include the appropriate KVM
selftests directories so that RISC-V related KVM selftests patches
are CC'ed to the KVM RISC-V mailing list.

Signed-off-by: Anup Patel <anup@brainfault.org>

fed9b26b

RISC-V: KVM: Introduce ISA extension register · affa28e4

Committed by Atish Patra on May 9, 2022

Currently, there is no provision for the VMM (qemu-kvm or kvmtool) to
query multi-letter ISA extensions. The config register
is only used for base single-letter ISA extensions.

A new ISA extension register is added that allows the VMM
to query any ISA extension one at a time. It is enabled for
both single-letter and multi-letter ISA extensions. The ISA extension
register is useful if the VMM needs to retrieve or set a single
extension, while the config register should be used if all the base
ISA extensions need to be retrieved or set.

For any multi-letter ISA extensions, the new register interface
must be used.
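
Why a single config register cannot describe multi-letter extensions can be seen from a letter-per-bit encoding. The sketch below assumes the base ISA is held as a misa-style bitmask with bit 0 for "a" through bit 25 for "z"; that layout and the helper name `model_isa_single_letter_bit` are assumptions for illustration, not taken from the patch.

```c
#include <ctype.h>
#include <string.h>

/*
 * Returns the bitmask bit for a single-letter extension name, or 0 when the
 * name cannot be expressed in a letter-per-bit mask (for example any
 * multi-letter extension).
 */
static unsigned long model_isa_single_letter_bit(const char *ext)
{
	int c;

	if (ext == NULL || strlen(ext) != 1)
		return 0;
	c = tolower((unsigned char)ext[0]);
	if (c < 'a' || c > 'z')
		return 0;
	return 1UL << (c - 'a');
}
```

A name such as "svinval" has no bit in this scheme, which is the gap a per-extension register interface fills.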

Signed-off-by: Atish Patra <atishp@rivosinc.com>
Signed-off-by: Anup Patel <anup@brainfault.org>

affa28e4

RISC-V: KVM: Cleanup stale TLB entries when host CPU changes · 92e45050

Committed by Anup Patel on May 9, 2022

On RISC-V platforms with hardware VMID support, we share the same
VMID for all VCPUs of a particular Guest/VM. This means we might
have stale G-stage TLB entries on the current Host CPU due to
some other VCPU of the same Guest which ran previously on the
current Host CPU.

To clean up stale TLB entries, we simply flush all G-stage TLB
entries by VMID whenever the underlying Host CPU changes for a VCPU.
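
The flush-on-migration rule can be sketched as a small model. The field name `last_host_cpu`, the counter, and the `model_` functions are hypothetical; the real code issues an HFENCE by VMID rather than incrementing a counter.

```c
/* Illustrative model; -1 means the VCPU has not run on any host CPU yet. */
struct model_vcpu {
	int last_host_cpu;
};

static unsigned int model_flush_count;

/* Stand-in for flushing all G-stage TLB entries of the guest's VMID. */
static void model_flush_gstage_by_vmid(void)
{
	model_flush_count++;
}

/* Called before entering the guest on host_cpu. */
static void model_vcpu_enter(struct model_vcpu *v, int host_cpu)
{
	if (v->last_host_cpu != host_cpu) {
		/* Another VCPU sharing this VMID may have run here before. */
		model_flush_gstage_by_vmid();
		v->last_host_cpu = host_cpu;
	}
}
```

Repeated entries on the same host CPU cost nothing extra; only a change of host CPU triggers a flush.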

Signed-off-by: Anup Patel <apatel@ventanamicro.com>
Reviewed-by: Atish Patra <atishp@rivosinc.com>
Signed-off-by: Anup Patel <anup@brainfault.org>

92e45050

RISC-V: KVM: Add remote HFENCE functions based on VCPU requests · 13acfec2

Committed by Anup Patel on May 9, 2022

The generic KVM has support for VCPU requests which can be used
to do arch-specific work in the run-loop. We introduce remote
HFENCE functions which will internally use VCPU requests instead
of host SBI calls.

Advantages of doing remote HFENCEs as VCPU requests are:
1) Multiple VCPUs of a Guest may be running on different Host CPUs
   so it is not always possible to determine the Host CPU mask for
   doing Host SBI call. For example, when VCPU X wants to do HFENCE
   on VCPU Y, it is possible that VCPU Y is blocked or in user-space
   (i.e. vcpu->cpu < 0).
2) To support nested virtualization, we will be having a separate
   shadow G-stage for each VCPU and a common host G-stage for the
   entire Guest/VM. The VCPU-request-based remote HFENCEs help
   us easily synchronize the common host G-stage and shadow G-stage
   of each VCPU without any additional IPI calls.

This is also a preparatory patch for upcoming nested virtualization
support where we will be having a shadow G-stage page table for
each Guest VCPU.

Signed-off-by: Anup Patel <apatel@ventanamicro.com>
Reviewed-by: Atish Patra <atishp@rivosinc.com>
Signed-off-by: Anup Patel <anup@brainfault.org>

13acfec2

RISC-V: KVM: Reduce KVM_MAX_VCPUS value · 486a3842

Committed by Anup Patel on May 9, 2022

Currently, the KVM_MAX_VCPUS value is 16384 for RV64 and 128
for RV32.

The KVM_MAX_VCPUS value is too high for RV64 and too low for
RV32 compared to other architectures (e.g. x86 sets it to 1024
and ARM64 sets it to 512). The too high value of KVM_MAX_VCPUS
on RV64 also leads to VCPU mask on stack consuming 2KB.

We set KVM_MAX_VCPUS to 1024 for both RV64 and RV32 to be
aligned with other architectures.
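
The 2KB figure follows from one bit per VCPU in the mask. A short calculation, with the helper name `model_vcpu_mask_bytes` being a hypothetical choice for this sketch:

```c
#include <limits.h>
#include <stddef.h>

/* Bytes needed for a bitmap with one bit per VCPU, rounded up to longs. */
static size_t model_vcpu_mask_bytes(size_t max_vcpus)
{
	size_t bits_per_long = sizeof(unsigned long) * CHAR_BIT;
	size_t longs = (max_vcpus + bits_per_long - 1) / bits_per_long;

	return longs * sizeof(unsigned long);
}
```

For 16384 VCPUs this gives 16384 / 8 = 2048 bytes, while 1024 VCPUs need only 128 bytes of stack.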

Signed-off-by: Anup Patel <apatel@ventanamicro.com>
Reviewed-by: Atish Patra <atishp@rivosinc.com>
Signed-off-by: Anup Patel <anup@brainfault.org>

486a3842

RISC-V: KVM: Introduce range based local HFENCE functions · 2415e46e

Committed by Anup Patel on May 9, 2022

The various __kvm_riscv_hfence_xyz() functions implemented in
kvm/tlb.S are equivalent to the corresponding HFENCE.GVMA instructions,
and we don't have range based local HFENCE functions.

This patch provides a complete set of local HFENCE functions which
support range based TLB invalidation and HFENCE.VVMA
based functions. This is also a preparatory patch for upcoming
Svinval support in KVM RISC-V.
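
One plausible shape for a range based local fence is a loop that invalidates one page-sized step at a time and falls back to a full flush when the range is large. The sketch below is a userspace model under that assumption: the `model_` names, the operation counter, and the `MODEL_MAX_RANGE_OPS` threshold are all hypothetical and not taken from the patch.

```c
#include <stdint.h>

#define MODEL_MAX_RANGE_OPS 64	/* hypothetical cut-over to a full flush */

static unsigned long model_fence_ops;	/* counts simulated fence instructions */

/* Stand-in for a per-address HFENCE.GVMA. */
static void model_hfence_gvma_addr(uint64_t gpa)
{
	(void)gpa;
	model_fence_ops++;
}

/* Stand-in for an HFENCE.GVMA covering all addresses. */
static void model_hfence_gvma_all(void)
{
	model_fence_ops++;
}

/* Invalidate [gpa, gpa + size) in steps of 2^order bytes. */
static void model_local_hfence_gvma_range(uint64_t gpa, uint64_t size,
					  unsigned int order)
{
	uint64_t step = UINT64_C(1) << order;
	uint64_t pos;

	if (size / step > MODEL_MAX_RANGE_OPS) {
		/* Too many steps: one full flush is cheaper. */
		model_hfence_gvma_all();
		return;
	}
	for (pos = gpa; pos < gpa + size; pos += step)
		model_hfence_gvma_addr(pos);
}
```

A 16 KiB range with 4 KiB pages costs four per-address fences in this model, while a 1 GiB range collapses into a single full flush.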

Signed-off-by: Anup Patel <apatel@ventanamicro.com>
Reviewed-by: Atish Patra <atishp@rivosinc.com>
Signed-off-by: Anup Patel <anup@brainfault.org>

2415e46e

RISC-V: KVM: Treat SBI HFENCE calls as NOPs · c7fa3c48

Committed by Anup Patel on May 9, 2022

We should treat SBI HFENCE calls as NOPs until nested virtualization
is supported by KVM RISC-V. This will help us test booting a hypervisor
under KVM RISC-V.

Signed-off-by: Anup Patel <apatel@ventanamicro.com>
Reviewed-by: Atish Patra <atishp@rivosinc.com>
Signed-off-by: Anup Patel <anup@brainfault.org>

c7fa3c48

RISC-V: KVM: Add Sv57x4 mode support for G-stage · b4bbb95e

Committed by Anup Patel on May 9, 2022

Latest QEMU supports G-stage Sv57x4 mode so this patch extends KVM
RISC-V G-stage handling to detect and use Sv57x4 mode when available.

Signed-off-by: Anup Patel <apatel@ventanamicro.com>
Reviewed-by: Atish Patra <atishp@rivosinc.com>
Signed-off-by: Anup Patel <anup@brainfault.org>

b4bbb95e

RISC-V: KVM: Use G-stage name for hypervisor page table · 26708234

Committed by Anup Patel on May 9, 2022

The two-stage address translation defined by the RISC-V privileged
specification consists of: VS-stage (guest virtual address to guest
physical address) programmed by the Guest OS and G-stage (guest
physical address to host physical address) programmed by the
hypervisor.

To align with the above terminology, we replace "stage2" with "gstage"
and "Stage2" with "G-stage" everywhere in KVM RISC-V sources.

Signed-off-by: Anup Patel <apatel@ventanamicro.com>
Reviewed-by: Atish Patra <atishp@rivosinc.com>
Signed-off-by: Anup Patel <anup@brainfault.org>

26708234

KVM: selftests: riscv: Remove unneeded semicolon · dba90d6f

Committed by Jiapeng Chong on May 6, 2022

Fix the following coccicheck warnings:

./tools/testing/selftests/kvm/lib/riscv/processor.c:353:3-4: Unneeded
semicolon.

Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
Signed-off-by: Anup Patel <anup@brainfault.org>

dba90d6f

KVM: selftests: riscv: Improve unexpected guest trap handling · ac6c85e9

Committed by Anup Patel on April 9, 2022

Currently, we simply hang using "while (1) ;" upon any unexpected
guest traps because the default guest trap handler is guest_hang().

The above approach is not useful to anyone because KVM selftests
users will only see a hung application upon any unexpected guest
trap.

This patch improves unexpected guest trap handling for KVM RISC-V
selftests by doing the following:
1) Return to host user-space
2) Dump VCPU registers
3) Die using TEST_ASSERT(0, ...)

Signed-off-by: Anup Patel <apatel@ventanamicro.com>
Tested-by: Mayuresh Chitale <mchitale@ventanamicro.com>
Signed-off-by: Anup Patel <anup@brainfault.org>

ac6c85e9

May 17, 2022 · 7 commits

Merge branch kvm-arm64/its-save-restore-fixes-5.19 into kvmarm-master/next · 5c0ad551