Commits · 7eef87dc99e419b1cc051e4417c37e4744d7b661 · openeuler / Kernel

30 Oct, 2011 3 commits

KVM: s390: fix register setting · 7eef87dc

Authored by Carsten Otte on Oct 18, 2011

KVM common code does vcpu_load prior to calling our arch ioctls and
vcpu_put after we're done here. Via the kvm_arch_vcpu_load/put
callbacks we do load the fpu and access register state into the
processor, which saves us moving the state on every SIE exit the
kernel handles. However this breaks register setting from userspace,
because of the following sequence:
1a. vcpu load stores userspace register content
1b. vcpu load loads guest register content
2.  kvm_arch_vcpu_ioctl_set_fpu/sregs updates saved guest register content
3a. vcpu put stores the guest registers and overwrites the new content
3b. vcpu put loads the userspace register set again

This patch loads the new guest register state into the cpu, so that the correct
(new) set of guest registers will be stored in step 3a.
Signed-off-by: Carsten Otte <cotte@de.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
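
The load/set/put sequence described in this commit can be replayed with a toy model. This is only an illustrative sketch: the `toy_*` names, the single integer "register" and the value 100 for userspace state are assumptions made for the example, not kernel code.

```c
#include <assert.h>

/* Toy model: one "hardware" register plus the saved copies. */
struct toy_vcpu {
    int hw_reg;       /* what the CPU currently holds */
    int saved_guest;  /* guest state kept in the vcpu structure */
    int saved_host;   /* userspace state kept across load/put */
};

static void toy_vcpu_load(struct toy_vcpu *v)
{
    v->saved_host = v->hw_reg;   /* 1a: store userspace register content */
    v->hw_reg = v->saved_guest;  /* 1b: load guest register content */
}

static void toy_vcpu_put(struct toy_vcpu *v)
{
    v->saved_guest = v->hw_reg;  /* 3a: store guest registers */
    v->hw_reg = v->saved_host;   /* 3b: load userspace registers again */
}

/* Step 2, buggy variant: only the saved copy is updated. */
static void toy_set_reg_buggy(struct toy_vcpu *v, int val)
{
    v->saved_guest = val;
}

/* Step 2, fixed variant: the new value is also loaded into the "CPU". */
static void toy_set_reg_fixed(struct toy_vcpu *v, int val)
{
    v->saved_guest = val;
    v->hw_reg = val;
}

/* Run one load/set/put cycle and return the guest value that survives. */
static int toy_ioctl_cycle(int old_guest, int new_guest, int fixed)
{
    struct toy_vcpu v = { .hw_reg = 100, .saved_guest = old_guest, .saved_host = 0 };

    toy_vcpu_load(&v);
    if (fixed)
        toy_set_reg_fixed(&v, new_guest);
    else
        toy_set_reg_buggy(&v, new_guest);
    toy_vcpu_put(&v);

    assert(v.hw_reg == 100);  /* userspace state is restored either way */
    return v.saved_guest;
}
```

With the buggy setter, step 3a writes the stale hardware value back over the new content; with the fixed setter the new value survives the put.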

KVM: s390: fix return value of kvm_arch_init_vm · b290411a

Authored by Carsten Otte on Oct 18, 2011

This patch fixes the return value of kvm_arch_init_vm in case a memory
allocation goes wrong.
Signed-off-by: Carsten Otte <cotte@de.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

KVM: s390: check cpu_id prior to using it · 4d47555a

Authored by Carsten Otte on Oct 18, 2011

We use the cpu id provided by userspace as array index here. Thus we
clearly need to check it first. Ooops.

CC: <stable@vger.kernel.org>
Signed-off-by: Carsten Otte <cotte@de.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
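
A minimal sketch of the kind of check this commit describes: validate an index supplied by userspace before using it to index an array. The structure, the limit `TOY_MAX_VCPUS` and the error codes chosen here are assumptions for illustration and are not taken from the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define TOY_MAX_VCPUS 64  /* hypothetical limit, not the kernel's constant */

struct toy_vm {
    void *vcpus[TOY_MAX_VCPUS];
};

/*
 * Returns 0 on success, -EINVAL if id is out of range,
 * -EEXIST if the slot is already occupied.
 */
static int toy_register_vcpu(struct toy_vm *vm, unsigned int id, void *vcpu)
{
    assert(vm != NULL);

    if (id >= TOY_MAX_VCPUS)      /* check before indexing */
        return -EINVAL;
    if (vm->vcpus[id] != NULL)
        return -EEXIST;

    vm->vcpus[id] = vcpu;
    return 0;
}
```

Taking the id as an unsigned value means a single comparison also rejects what would otherwise be negative indices.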

05 Oct, 2011 1 commit

KVM: emulate lapic tsc deadline timer for guest · a3e06bbe

Authored by Liu, Jinsong on Sep 22, 2011

This patch emulates the lapic tsc deadline timer for the guest:
Enumerate tsc deadline timer capability by CPUID;
Enable tsc deadline timer mode by lapic MMIO;
Start tsc deadline timer by WRMSR;

[jan: use do_div()]
[avi: fix for !irqchip_in_kernel()]
[marcelo: another fix for !irqchip_in_kernel()]
Signed-off-by: Liu, Jinsong <jinsong.liu@intel.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

26 Sep, 2011 36 commits

x86: TSC deadline definitions · b90dfb04

Authored by Liu, Jinsong on Sep 22, 2011

These pre-definitions prepare for KVM tsc deadline timer emulation, but
are not themselves KVM specific.
Signed-off-by: Liu, Jinsong <jinsong.liu@intel.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

KVM: Fix simultaneous NMIs · 7460fb4a

Authored by Avi Kivity on Sep 20, 2011

If simultaneous NMIs happen, we're supposed to queue the second
and next (collapsing them), but currently we sometimes collapse
the second into the first.

Fix by using a counter for pending NMIs instead of a bool; since
the counter limit depends on whether the processor is currently
in an NMI handler, which can only be checked in vcpu context
(via the NMI mask), we add a new KVM_REQ_NMI to request recalculation
of the counter.
Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
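
The counter-with-limit idea from this commit can be sketched as follows. The names are hypothetical, and the concrete limits (two pending NMIs when no handler is running, one when a handler is already running) are an assumption chosen to match the description, not values quoted from the patch.

```c
#include <assert.h>

struct toy_nmi_state {
    unsigned int queued;   /* NMIs raised since the last recalculation */
    unsigned int pending;  /* NMIs waiting to be injected */
    int in_nmi_handler;    /* stand-in for the NMI mask */
};

/* May be called from any context: just count the request. */
static void toy_queue_nmi(struct toy_nmi_state *s)
{
    s->queued++;
}

/* Runs in vcpu context, where the NMI mask can be checked. */
static void toy_process_nmi(struct toy_nmi_state *s)
{
    /* Assumed limits: one NMI running plus one pending behind it. */
    unsigned int limit = s->in_nmi_handler ? 1 : 2;

    s->pending += s->queued;
    s->queued = 0;
    if (s->pending > limit)
        s->pending = limit;  /* collapse the excess */

    assert(s->pending <= 2);
}
```

A plain bool cannot distinguish "one NMI pending" from "two NMIs pending", which is why a counter is needed here.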

KVM: x86 emulator: convert push %sreg/pop %sreg to direct decode · 1cd196ea

Authored by Avi Kivity on Sep 13, 2011

Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

KVM: x86 emulator: switch lds/les/lss/lfs/lgs to direct decode · d4b4325f

Authored by Avi Kivity on Sep 13, 2011

Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

KVM: x86 emulator: streamline decode of segment registers · c191a7a0

Authored by Avi Kivity on Sep 13, 2011

The opcodes

  push %seg
  pop %seg
  l%seg, %mem, %reg  (e.g. lds/les/lss/lfs/lgs)

all have a segment register encoded in the instruction.  To allow reuse,
decode the segment number into src2 during the decode stage instead of the
execution stage.
Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
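
To illustrate what "a segment register encoded in the instruction" means, here is a small sketch that extracts the segment number from the one-byte `push %seg` / `pop %seg` opcodes (0x06/0x07 for ES, 0x0e for CS, 0x16/0x17 for SS, 0x1e/0x1f for DS). The function and enum names are hypothetical; this is not the emulator's decode code, and the two-byte FS/GS forms are left out.

```c
#include <assert.h>
#include <stdint.h>

enum toy_seg { TOY_ES = 0, TOY_CS = 1, TOY_SS = 2, TOY_DS = 3 };

/*
 * One-byte push/pop %seg opcodes have the bit pattern 000ss11x,
 * where ss is the segment number. Returns the segment number,
 * or -1 if the opcode is not one of these forms.
 */
static int toy_decode_seg(uint8_t opcode)
{
    int seg;

    if (opcode == 0x0f)            /* two-byte escape, not "pop %cs" */
        return -1;
    if ((opcode & 0xe6) != 0x06)   /* bits 7..5 clear, bits 2..1 set */
        return -1;

    seg = (opcode >> 3) & 3;
    assert(seg <= TOY_DS);
    return seg;
}
```

Doing this once at decode time means the execution handlers can share a single operand instead of each re-deriving the segment from the opcode byte.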

KVM: x86 emulator: simplify OpMem64 decode · 41ddf978

Authored by Avi Kivity on Sep 13, 2011

Use the same technique as the other OpMem variants, and goto mem_common.
Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

KVM: x86 emulator: switch src decode to decode_operand() · 0fe59128

Authored by Avi Kivity on Sep 13, 2011

Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

KVM: x86 emulator: qualify OpReg inhibit_byte_regs hack · 5217973e

Authored by Avi Kivity on Sep 13, 2011

OpReg decoding has a hack that inhibits byte registers for movsx and movzx
instructions.  It should be replaced by something better, but meanwhile,
qualify that the hack is only active for the destination operand.

Note these instructions only use OpReg for the destination, but better to
be explicit about it.
Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

KVM: x86 emulator: switch OpImmUByte decode to decode_imm() · 608aabe3

Authored by Avi Kivity on Sep 13, 2011

Similar to SrcImmUByte.
Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

KVM: x86 emulator: free up some flag bits near src, dst · 20c29ff2

Authored by Avi Kivity on Sep 13, 2011

Op fields are going to grow by a bit, so we need two free bits.
Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

KVM: x86 emulator: switch src2 to generic decode_operand() · 4dd6a57d

Authored by Avi Kivity on Sep 13, 2011

Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

KVM: x86 emulator: expand decode flags to 64 bits · b1ea50b2

Authored by Avi Kivity on Sep 13, 2011

Unifying the operands means not taking advantage of the fact that some
operand types can only go into certain operands (for example, DI can only
be used by the destination), so we need more bits to hold the operand type.
Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
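
The bit-budget argument in this commit can be sketched as follows: if every operand slot (dst, src, src2) has to hold any operand type, each slot needs the same field width, and the fields may no longer fit in 32 bits. The field width and shift values below are hypothetical and chosen only to show a field living above bit 31; they are not the emulator's actual layout.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_OP_BITS    5
#define TOY_OP_MASK    ((uint64_t)((1u << TOY_OP_BITS) - 1))
#define TOY_DST_SHIFT  1
#define TOY_SRC_SHIFT  6
#define TOY_SRC2_SHIFT 32  /* only reachable with a 64-bit flags word */

/* Store an operand type in the field that starts at 'shift'. */
static uint64_t toy_pack_operand(uint64_t flags, unsigned int shift,
                                 unsigned int type)
{
    assert(type <= TOY_OP_MASK);
    flags &= ~(TOY_OP_MASK << shift);
    return flags | ((uint64_t)type << shift);
}

/* Read back the operand type stored in the field at 'shift'. */
static unsigned int toy_operand_type(uint64_t flags, unsigned int shift)
{
    return (unsigned int)((flags >> shift) & TOY_OP_MASK);
}
```

Because all three fields share one width and one mask, a single generic decode routine can take the extracted type regardless of which slot it came from.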

KVM: x86 emulator: split dst decode to a generic decode_operand() · a9945549

Authored by Avi Kivity on Sep 13, 2011

Instead of decoding each operand using its own code, use a generic
function.  Start with the destination operand.
Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

KVM: x86 emulator: move memop, memopp into emulation context · f09ed83e

Authored by Avi Kivity on Sep 13, 2011

Simplifies further generalization of decode.
Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

KVM: x86 emulator: convert group 3 instructions to direct decode · 3329ece1

Authored by Avi Kivity on Sep 13, 2011

Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

KVM: x86: Add module parameter for lapic periodic timer limit · 9bc5791d

Authored by Jan Kiszka on Sep 12, 2011

Certain guests, specifically RTOSes, request faster periodic timers than
what we allow by default. Add a module parameter to adjust the limit for
non-standard setups. Also add a rate-limited warning in case the guest
requested more.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

KVM: Clean up and extend rate-limited output · bd80158a

Authored by Jan Kiszka on Sep 12, 2011

The use of printk_ratelimit is discouraged; replace it with
pr*_ratelimited or __ratelimit. While at it, convert remaining
guest-triggerable printks to rate-limited variants.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

KVM: x86: Avoid guest-triggerable printks in APIC model · 7712de87

Authored by Jan Kiszka on Sep 12, 2011

Convert remaining printks that the guest can trigger to apic_printk.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

KVM: x86: Move kvm_trace_exit into atomic vmexit section · 1e2b1dd7

Authored by Jan Kiszka on Sep 12, 2011

This prevents events that cause the vmexit from being recorded before
the actual exit reason.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

KVM: x86 emulator: disable writeback for TEST · caa8a168

Authored by Avi Kivity on Sep 11, 2011

The TEST instruction doesn't write its destination operand.  This
could cause problems if an MMIO register was accessed using the TEST
instruction.  Recently Windows XP was observed to use TEST against
the APIC ICR; this can cause spurious IPIs.
Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
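
Why writeback matters for TEST can be shown with a toy emulator step. The sketch below is illustrative only: the `toy_*` names and the write counter standing in for an MMIO side effect are assumptions, not the emulator's real structures.

```c
#include <assert.h>
#include <stdint.h>

/* Toy MMIO register: every write counts as a side effect (e.g. an IPI). */
struct toy_mmio {
    uint32_t value;
    unsigned int writes;
};

enum toy_op { TOY_AND, TOY_TEST };

/*
 * Both operations compute dst & src and report the zero flag,
 * but only AND stores the result back to the destination.
 */
static int toy_emulate(enum toy_op op, struct toy_mmio *dst, uint32_t src)
{
    uint32_t result;

    assert(dst != 0);
    result = dst->value & src;

    if (op != TOY_TEST) {       /* TEST: writeback disabled */
        dst->value = result;
        dst->writes++;
    }
    return result == 0;         /* zero flag */
}
```

If TEST were written back like AND, each flag check against a register with write side effects would trigger that side effect again.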

KVM: x86 emulator: simplify emulate_1op_rax_rdx() · e8f2b1d6

Authored by Avi Kivity on Sep 07, 2011

emulate_1op_rax_rdx() is always called with the same parameters.  Simplify
by passing just the emulation context.
Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

KVM: x86 emulator: merge the two emulate_1op_rax_rdx implementations · 9fef72ce

Authored by Avi Kivity on Sep 07, 2011

We have two emulate-with-extended-accumulator implementations: one
which expects traps (_ex) and one which doesn't (plain).  Drop the
plain implementation and always use the one which expects traps;
it will simply return 0 in the _ex argument and we can happily ignore
it.
Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

KVM: x86 emulator: simplify emulate_1op() · d1eef45d

Authored by Avi Kivity on Sep 07, 2011

emulate_1op() is always called with the same parameters.  Simplify
by passing just the emulation context.
Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

KVM: x86 emulator: simplify emulate_2op_cl() · 29053a60

Authored by Avi Kivity on Sep 07, 2011

emulate_2op_cl() is always called with the same parameters.  Simplify
by passing just the emulation context.
Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

KVM: x86 emulator: simplify emulate_2op_cl() · 761441b9

Authored by Avi Kivity on Sep 07, 2011

emulate_2op_cl() is always called with the same parameters.  Simplify
by passing just the emulation context.
Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

KVM: x86 emulator: simplify emulate_2op_SrcV() · a31b9cea

Authored by Avi Kivity on Sep 07, 2011

emulate_2op_SrcV(), and its siblings, emulate_2op_SrcV_nobyte()
and emulate_2op_SrcB(), all use the same calling conventions
and all get passed exactly the same parameters.  Simplify them
by passing just the emulation context.
Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

KVM: PPC: Implement H_CEDE hcall for book3s_hv in real-mode code · 19ccb76a

Authored by Paul Mackerras on Jul 23, 2011

With a KVM guest operating in SMT4 mode (i.e. 4 hardware threads per
core), whenever a CPU goes idle, we have to pull all the other
hardware threads in the core out of the guest, because the H_CEDE
hcall is handled in the kernel. This is inefficient.

This adds code to book3s_hv_rmhandlers.S to handle the H_CEDE hcall
in real mode. When a guest vcpu does an H_CEDE hcall, we now only
exit to the kernel if all the other vcpus in the same core are also
idle. Otherwise we mark this vcpu as napping, save state that could
be lost in nap mode (mainly GPRs and FPRs), and execute the nap
instruction. When the thread wakes up, because of a decrementer or
external interrupt, we come back in at kvm_start_guest (from the
system reset interrupt vector), find the `napping' flag set in the
paca, and go to the resume path.

This has some other ramifications. First, when starting a core, we
now start all the threads, both those that are immediately runnable and
those that are idle. This is so that we don't have to pull all the
threads out of the guest when an idle thread gets a decrementer interrupt
and wants to start running. In fact the idle threads will all start
with the H_CEDE hcall returning; being idle they will just do another
H_CEDE immediately and go to nap mode.

This required some changes to kvmppc_run_core() and kvmppc_run_vcpu().
These functions have been restructured to make them simpler and clearer.
We introduce a level of indirection in the wait queue that gets woken
when external and decrementer interrupts get generated for a vcpu, so
that we can have the 4 vcpus in a vcore using the same wait queue.
We need this because the 4 vcpus are being handled by one thread.

Secondly, when we need to exit from the guest to the kernel, we now
have to generate an IPI for any napping threads, because an HDEC
interrupt doesn't wake up a napping thread.

Thirdly, we now need to be able to handle virtual external interrupts
and decrementer interrupts becoming pending while a thread is napping,
and deliver those interrupts to the guest when the thread wakes.
This is done in kvmppc_cede_reentry, just before fast_guest_return.

Finally, since we are not using the generic kvm_vcpu_block for book3s_hv,
and hence not calling kvm_arch_vcpu_runnable, we can remove the #ifdef
from kvm_arch_vcpu_runnable.
Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
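
The core decision described in this commit (nap the hardware thread unless every vcpu in the core is idle) can be sketched in plain C. The real handling lives in real-mode assembly; the array of idle flags and the `toy_*` names here are assumptions for illustration only.

```c
#include <assert.h>

#define TOY_THREADS_PER_CORE 4  /* SMT4, as in the commit message */

enum toy_cede_action {
    TOY_NAP,            /* stay in real mode and nap this hardware thread */
    TOY_EXIT_TO_KERNEL  /* every thread is idle: leave the guest */
};

/* Called when the vcpu on hardware thread 'self' issues H_CEDE. */
static enum toy_cede_action toy_handle_cede(int idle[TOY_THREADS_PER_CORE],
                                            int self)
{
    int t;

    assert(self >= 0 && self < TOY_THREADS_PER_CORE);
    idle[self] = 1;  /* this vcpu is ceding */

    for (t = 0; t < TOY_THREADS_PER_CORE; t++) {
        if (!idle[t])
            return TOY_NAP;  /* another thread is still running guest code */
    }
    return TOY_EXIT_TO_KERNEL;
}
```

Only the last thread to go idle pays the cost of exiting to the kernel; the others nap without disturbing their siblings.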

KVM: PPC: book3s_pr: Simplify transitions between virtual and real mode · 02143947

Authored by Paul Mackerras on Jul 23, 2011

This simplifies the way that the book3s_pr makes the transition to
real mode when entering the guest.  We now call kvmppc_entry_trampoline
(renamed from kvmppc_rmcall) in the base kernel using a normal function
call instead of doing an indirect call through a pointer in the vcpu.
If kvm is a module, the module loader takes care of generating a
trampoline as it does for other calls to functions outside the module.

kvmppc_entry_trampoline then disables interrupts and jumps to
kvmppc_handler_trampoline_enter in real mode using an rfi[d].
That then uses the link register as the address to return to
(potentially in module space) when the guest exits.

This also simplifies the way that we call the Linux interrupt handler
when we exit the guest due to an external, decrementer or performance
monitor interrupt.  Instead of turning on the MMU, then deciding that
we need to call the Linux handler and turning the MMU back off again,
we now go straight to the handler at the point where we would turn the
MMU on.  The handler will then return to the virtual-mode code
(potentially in the module).

Along the way, this moves the setting and clearing of the HID5 DCBZ32
bit into real-mode interrupts-off code, and also makes sure that
we clear the MSR[RI] bit before loading values into SRR0/1.

The net result is that we no longer need any code addresses to be
stored in vcpu->arch.
Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Alexander Graf <agraf@suse.de>

KVM: PPC: Assemble book3s{,_hv}_rmhandlers.S separately · 177339d7

Authored by Paul Mackerras on Jul 23, 2011

This makes arch/powerpc/kvm/book3s_rmhandlers.S and
arch/powerpc/kvm/book3s_hv_rmhandlers.S be assembled as
separate compilation units rather than having them #included in
arch/powerpc/kernel/exceptions-64s.S.  We no longer have any
conditional branches between the exception prologs in
exceptions-64s.S and the KVM handlers, so there is no need to
keep their contents close together in the vmlinux image.

In their current location, they are using up part of the limited
space between the first-level interrupt handlers and the firmware
NMI data area at offset 0x7000, and with some kernel configurations
this area will overflow (e.g. allyesconfig), leading to an
"attempt to .org backwards" error when compiling exceptions-64s.S.

Moving them out requires that we add some #includes that the
book3s_{,hv_}rmhandlers.S code was previously getting implicitly
via exceptions-64s.S.
Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Alexander Graf <agraf@suse.de>

KVM: PPC: Add sanity checking to vcpu_run · af8f38b3

Authored by Alexander Graf on Aug 10, 2011

There are multiple features in PowerPC KVM that can now be enabled
depending on the user's wishes. Some of the combinations don't make
sense or don't work though.

So this patch adds a way to check if the executing environment would
actually be able to run the guest properly. It also adds sanity
checks if PVR is set (should always be true given the current code
flow), if PAPR is only used with book3s_64 where it works and that
HV KVM is only used in PAPR mode.
Signed-off-by: Alexander Graf <agraf@suse.de>

KVM: PPC: Enable the PAPR CAP for Book3S · 930b412a

Authored by Alexander Graf on Aug 08, 2011

Now that Book3S PV mode can also run PAPR guests, we can add a PAPR cap and
enable it for all Book3S targets. Enabling that CAP switches KVM into PAPR
mode.
Signed-off-by: Alexander Graf <agraf@suse.de>

KVM: PPC: Support SC1 hypercalls for PAPR in PR mode · a668f2bd

Authored by Alexander Graf on Aug 08, 2011

PAPR defines hypercalls as SC1 instructions. Using these, the guest modifies
page tables and does other privileged operations that it wouldn't be allowed
to do in supervisor mode.

This patch adds support for PR KVM to trap these instructions and route them
through the same PAPR hypercall interface that we already use for HV style
KVM.
Signed-off-by: Alexander Graf <agraf@suse.de>

KVM: PPC: Stub emulate CFAR and PURR SPRs · aacf9aa3

Authored by Alexander Graf on Aug 08, 2011

Recent Linux versions use the CFAR and PURR SPRs, but don't really care about
their contents (yet). So for now, we can simply return 0 when the guest wants
to read them.
Signed-off-by: Alexander Graf <agraf@suse.de>

KVM: PPC: Add PAPR hypercall code for PR mode · 0254f074

Authored by Alexander Graf on Aug 08, 2011

When running a PAPR guest, we need to handle a few hypercalls in kernel space,
most prominently the page table invalidation (to sync the shadows).

So this patch adds handling for a few PAPR hypercalls to PR mode KVM. I tried
to share the code with HV mode, but it ended up being a lot easier this way
around, as the two differ too much in those details.
Signed-off-by: Alexander Graf <agraf@suse.de>

---

v1 -> v2:

  - whitespace fix

KVM: PPC: Add support for explicit HIOR setting · a15bd354

Authored by Alexander Graf on Aug 08, 2011

Until now, we always set HIOR based on the PVR, but this is just wrong.
Instead, we should be setting HIOR explicitly, so user space can decide
what the initial HIOR value is - just like on real hardware.

We keep the old PVR based way around for backwards compatibility, but
once user space uses the SREGS based method, we drop the PVR logic.
Signed-off-by: Alexander Graf <agraf@suse.de>

KVM: PPC: Read out syscall instruction on trap · 77e675ad

Authored by Alexander Graf on Aug 08, 2011

We have a few traps where we cache the instruction that caused the trap
for analysis later on. Since we now need to be able to distinguish
between SC 0 and SC 1 system calls and the only way to find out which
is which is by looking at the instruction, we also read out the instruction
causing the system call.
Signed-off-by: Alexander Graf <agraf@suse.de>
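
A sketch of how SC 0 and SC 1 can be told apart once the trapping instruction word is available. It relies on the PowerPC `sc` encoding (primary opcode 17, with the LEV field just above the low bits), so `sc` assembles to 0x44000002 and `sc 1` to 0x44000022. The function name is hypothetical and this is not the code added by the patch.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_INST_SC0 0x44000002u  /* sc   (LEV = 0): ordinary system call */
#define TOY_INST_SC1 0x44000022u  /* sc 1 (LEV = 1): hypercall under PAPR */

/*
 * Returns the LEV field of an sc instruction,
 * or -1 if the word is not an sc instruction at all.
 */
static int toy_sc_level(uint32_t inst)
{
    int lev;

    if ((inst >> 26) != 17)   /* primary opcode must be 17 */
        return -1;
    if (!(inst & 0x2))        /* bit that marks the sc form */
        return -1;

    lev = (int)((inst >> 5) & 0x7f);
    assert(lev <= 0x7f);
    return lev;
}
```

The trap itself looks the same for both forms, which is why the handler has to inspect the instruction word.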
