- 03 January 2013 (5 commits)
-
-
Committed by Gleb Natapov
Segment registers will be fixed according to the current emulation policy during the first switch to real mode.
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
-
Committed by Gleb Natapov
Currently, when emulation of invalid guest state is enabled (emulate_invalid_guest_state=1), segment registers are still sometimes fixed up for entry to vm86 mode. Segment register fixing is avoided in enter_rmode(), but vmx_set_segment() still does it unconditionally. This patch fixes that.
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
-
Committed by Gleb Natapov
Currently vm86 mode can be entered even if the segment limit is greater than 0xffff and the D/B bit is set. Both of these can cause the CPU to execute instructions incorrectly, since in vm86 mode the limit will be set to 0xffff and D/B will be forced to 0.
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
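A minimal user-space sketch of the validity rule described above; the names below are illustrative, not the kernel's rmode_segment_valid(). The point is only that a segment is safe to hand to vm86-mode virtualization if its limit fits in 16 bits and its D/B bit is clear, because hardware vm86 mode forces both.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

struct seg_desc {
    uint32_t limit;   /* byte-granular segment limit */
    bool     db;      /* D/B bit: 1 = 32-bit default operand size */
};

static bool usable_for_vm86(const struct seg_desc *s)
{
    if (s->limit > 0xffff)
        return false;   /* vm86 would silently truncate the limit */
    if (s->db)
        return false;   /* vm86 forces a 16-bit default size */
    return true;
}

int main(void)
{
    struct seg_desc flat32 = { .limit = 0xffffffff, .db = true };
    struct seg_desc real16 = { .limit = 0xffff,     .db = false };

    printf("flat32 usable: %d\n", usable_for_vm86(&flat32)); /* 0 */
    printf("real16 usable: %d\n", usable_for_vm86(&real16)); /* 1 */
    return 0;
}
```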
-
Committed by Gleb Natapov
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
-
Committed by Gleb Natapov
According to Intel SDM Vol. 3, Section 5.5 "Privilege Levels" and Section 5.6 "Privilege Level Checking When Accessing Data Segments", RPL checking is done while loading a segment selector, not during data access. We already perform the check when the segment selector is loaded, so drop the check during data access. Checking RPL during data access triggers #GP if the RPL bits of a segment selector are still set after a transition from real mode to protected mode.
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
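As a hedged illustration of the SDM rule being cited, here is a small stand-alone check of the data-segment privilege rule as applied at selector-load time, which is where the commit says the check belongs. Function and variable names are invented for the example, not KVM's emulator API.

```c
#include <stdbool.h>
#include <stdio.h>

/* SDM 5.6 data-segment rule at selector load: #GP unless DPL >= max(CPL, RPL). */
static bool data_segment_load_ok(int cpl, int rpl, int dpl)
{
    int effective = cpl > rpl ? cpl : rpl;
    return dpl >= effective;
}

int main(void)
{
    printf("%d\n", data_segment_load_ok(3, 3, 0)); /* 0: user code loading kernel data -> #GP at load */
    printf("%d\n", data_segment_load_ok(0, 0, 0)); /* 1: load succeeds; later accesses are not re-checked */
    return 0;
}
```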
-
- 23 December 2012 (7 commits)
-
-
Committed by Gleb Natapov
Reviewed-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
-
Committed by Gleb Natapov
Move all vm86_active logic into one place.
Reviewed-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
-
Committed by Gleb Natapov
The segment descriptor's base is already fixed by the call to fix_rmode_seg(); no need to do it twice.
Reviewed-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
-
Committed by Gleb Natapov
The code for SS and CS does the same thing fix_rmode_seg() does. Use it instead of hand-crafted code.
Reviewed-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
-
Committed by Gleb Natapov
VMX without unrestricted mode cannot virtualize real mode, so if emulate_invalid_guest_state=0 KVM uses vm86 mode to approximate it. Sometimes, when a guest moves from protected mode to real mode, it leaves segment descriptors in a state not suitable for vm86-mode virtualization, so we keep a shadow copy of the segment descriptors for internal use and load fake registers into the VMCS so that guest entry succeeds. Until now we kept shadows for all segments except SS and CS (for SS and CS we returned the values directly from the VMCS), but since commit a5625189 the emulator enforces segment limits in real mode. This causes a #GP during the move from protected mode to real mode when the emulator fetches the first instruction after entering real mode, since it uses an incorrect CS base and limit to linearize %rip. Fix this by keeping shadows for SS and CS too.
Reviewed-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
-
Committed by Gleb Natapov
rmode_segment_valid() checks whether a segment descriptor can be used to enter vm86 mode. The VMX spec mandates that in vm86 mode the CS register be of type data, not code. Let's allow guest entry through vm86 mode if the only problem with the CS register is an incorrect type; otherwise all of real mode has to be emulated.
Reviewed-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
-
Committed by Gleb Natapov
Set the segment fields explicitly instead of using binary operations. No behaviour change.
Reviewed-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
-
- 18 December 2012 (1 commit)
-
-
Committed by Nickolai Zeldovich
The kvm i8254 emulation for counter 0 (but not for counters 1 and 2) has at least two bugs in mode 0:
1. The OUT bit, computed by pit_get_out(), is never set high.
2. The counter value, computed by pit_get_count(), wraps back around to the initial counter value rather than to 0xFFFF (which is the behavior described in the comment in __kpit_elapsed, the behavior implemented by qemu, and the behavior observed on AMD hardware).
The bug stems from __kpit_elapsed computing the elapsed time mod the initial counter value (stored as nanoseconds in ps->period). This is both unnecessary (none of the callers of kpit_elapsed expect the value to be at most the initial counter value) and incorrect (it causes pit_get_count to appear to wrap around to the initial counter value rather than to 0xFFFF). Removing this mod from __kpit_elapsed fixes both of the above bugs.
Signed-off-by: Nickolai Zeldovich <nickolai@csail.mit.edu>
Reviewed-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
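For illustration, a small user-space sketch of the mode-0 counter arithmetic described above, assuming the elapsed time has already been converted to PIT ticks. This is not the kernel's pit_get_count(); it only shows the post-fix wrap behavior: through 0xFFFF rather than back to the programmed initial count.

```c
#include <stdint.h>
#include <stdio.h>

static uint16_t pit_mode0_count(uint32_t initial, uint64_t elapsed_ticks)
{
    if (elapsed_ticks <= initial)
        return (uint16_t)(initial - elapsed_ticks);
    /* After the first expiry the counter rolls over through 0xFFFF,
     * not back to 'initial' (the mistaken wrap the patch removes). */
    return (uint16_t)(0x10000u - ((elapsed_ticks - initial) & 0xFFFFu));
}

int main(void)
{
    printf("%u\n", (unsigned)pit_mode0_count(100, 40));   /* 60: still counting down     */
    printf("%u\n", (unsigned)pit_mode0_count(100, 101));  /* 65535: wrapped to 0xFFFF    */
    printf("%u\n", (unsigned)pit_mode0_count(100, 150));  /* 65486, not 50               */
    return 0;
}
```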
-
- 15 December 2012 (1 commit)
-
-
Committed by Gleb Natapov
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
-
- 14 December 2012 (5 commits)
-
-
Committed by Alex Williamson
There's no need for this to be an int; it holds a boolean. Move it to the end of the struct for alignment.
Reviewed-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
-
Committed by Alex Williamson
It's easy to confuse KVM_MEMORY_SLOTS and KVM_MEM_SLOTS_NUM. One is the user-accessible slots and the other is user + private. Make this more obvious.
Reviewed-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
-
Committed by Gleb Natapov
According to Intel SDM Volume 3, Section 10.8.1 "Interrupt Handling with the Pentium 4 and Intel Xeon Processors" and Section 10.8.2 "Interrupt Handling with the P6 Family and Pentium Processors", ExtINT interrupts are sent directly to the processor core for handling. Currently KVM checks the APIC before it considers ExtINT interrupts for injection, which is backwards from the spec. Make the code behave according to the SDM.
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Acked-by: "Zhang, Yang Z" <yang.z.zhang@intel.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
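A rough sketch of the ordering described above, with invented names (this is not KVM's injection code); the only point is that a pending ExtINT is considered before consulting the local APIC.

```c
#include <stdbool.h>
#include <stdio.h>

struct intr_state {
    bool extint_pending;   /* legacy PIC output routed as ExtINT      */
    bool apic_pending;     /* local APIC has a deliverable interrupt  */
};

/* Which source should be looked at first when deciding what to inject. */
static const char *next_interrupt_source(const struct intr_state *s)
{
    if (s->extint_pending)
        return "ExtINT (PIC)";   /* handled directly by the core, per the SDM */
    if (s->apic_pending)
        return "local APIC";
    return "none";
}

int main(void)
{
    struct intr_state s = { .extint_pending = true, .apic_pending = true };
    printf("inject from: %s\n", next_interrupt_source(&s));  /* ExtINT (PIC) */
    return 0;
}
```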
-
Committed by Nadav Amit
The MOV immediate instruction (opcodes 0xB8-0xBF) may take a 64-bit operand. The previous emulation implementation assumed the operand was no longer than 32 bits. Add OpImm64 for this case.
Fixes https://bugzilla.redhat.com/show_bug.cgi?id=881579
Signed-off-by: Nadav Amit <nadav.amit@gmail.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
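As a hedged illustration of the decode rule OpImm64 adds (the helper below is invented, not the emulator's API): for MOV r, imm the immediate width follows the operand size, so with a REX.W prefix a full 8 bytes must be fetched rather than 4.

```c
#include <stdint.h>
#include <stdio.h>

/* Fetch a little-endian immediate of 'op_bytes' bytes (2, 4 or 8). */
static uint64_t fetch_imm(const uint8_t *p, int op_bytes)
{
    uint64_t v = 0;
    for (int i = 0; i < op_bytes; i++)
        v |= (uint64_t)p[i] << (8 * i);
    return v;
}

int main(void)
{
    /* 48 B8 ff .. ff : movabs $0xffffffffffffffff, %rax (REX.W present) */
    const uint8_t imm[8] = { 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff };

    /* Pre-fix behaviour: only 4 bytes fetched regardless of REX.W. */
    printf("as 32-bit imm: %#llx\n", (unsigned long long)fetch_imm(imm, 4));
    /* OpImm64: with REX.W the full 8 bytes are fetched. */
    printf("as 64-bit imm: %#llx\n", (unsigned long long)fetch_imm(imm, 8));
    return 0;
}
```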
-
Committed by Gleb Natapov
Windows 2000 uses it during boot.
This fixes https://bugzilla.kernel.org/show_bug.cgi?id=50921
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
-
- 12 December 2012 (3 commits)
-
-
Committed by Gleb Natapov
In real mode the CS register is writable, so do not #GP on write.
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
-
Committed by Gleb Natapov
If enable_unrestricted_guest is true, vmx->rmode.vm86_active will always be false.
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
-
Committed by Gleb Natapov
On CPUs without support for unrestricted guests, DPL cannot be smaller than RPL for data segments during guest entry, but this state can occur if a data segment selector is changed, while the vcpu is in real mode, to a value whose lowest two bits are != 00. Fix that by forcing DPL == RPL on the transition to protected mode. This is a regression introduced by c865c43d.
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
-
- 07 December 2012 (1 commit)
-
-
Committed by Zhang Yanfei
The vmclear function is assigned to the callback function pointer when the kvm-intel module is loaded, and the bitmap indicates whether the VMCLEAR operation should be performed in kdump. The bits in the bitmap are set and cleared according to different conditions.
Signed-off-by: Zhang Yanfei <zhangyanfei@cn.fujitsu.com>
Acked-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
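An illustrative user-space sketch of the pattern described, with hypothetical names (the kernel's actual symbols may differ): the module publishes a callback plus per-CPU state, and the crash path invokes the callback only where a VMCS may still be loaded.

```c
#include <stdbool.h>
#include <stdio.h>

#define NR_CPUS 8

/* Published by the "module" when it loads; NULL while it is not loaded. */
static void (*crash_vmclear_cb)(int cpu);
/* One flag per CPU: set while that CPU has a VMCS loaded. */
static bool vmcs_loaded[NR_CPUS];

static void vmx_crash_vmclear(int cpu)
{
    printf("VMCLEAR on cpu %d\n", cpu);
}

/* What a kdump/crash path would do before kexec'ing the capture kernel. */
static void crash_path(void)
{
    for (int cpu = 0; cpu < NR_CPUS; cpu++)
        if (crash_vmclear_cb && vmcs_loaded[cpu])
            crash_vmclear_cb(cpu);
}

int main(void)
{
    crash_vmclear_cb = vmx_crash_vmclear;  /* "kvm-intel loaded" */
    vmcs_loaded[2] = true;                 /* a vcpu ran on cpu 2 */
    crash_path();                          /* only cpu 2 is cleared */
    return 0;
}
```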
-
- 06 December 2012 (2 commits)
-
-
Committed by Xiao Guangrong
There are two cases where we need to adjust the page size in set_spte:
1) another vcpu creates a new sp in the window between mapping_level() and acquiring the mmu-lock;
2) the new sp is created by the vcpu itself (page-fault path) when the guest uses the target gfn as its page table.
In the current code, set_spte drops the spte and emulates the access in these cases, which does not work well:
- for case 1, it may destroy the mapping established by the other vcpu and perform expensive instruction emulation;
- for case 2, it may emulate the access even if the guest is not accessing the page being used as a page table. For example, if 0~2M is used as a huge page in the guest and only page 3 in that range is used as a page table, then guest reads/writes on the other pages cause instruction emulation.
Both cases can be fixed by allowing the guest to retry the access: it will refault, and then we can establish the mapping using a small page.
Signed-off-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
Acked-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
-
Committed by Julian Stecklina
VMX now behaves like SVM with respect to FPU initialization. The code has been moved to the generic code path. General-purpose registers are now cleared on reset and INIT, and the SVM code properly initializes EDX.
Signed-off-by: Julian Stecklina <jsteckli@os.inf.tu-dresden.de>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
-
- 05 December 2012 (2 commits)
-
-
Committed by Zhang Xiantao
Bit 24 in VMX_EPT_VPID_CAP_MASI is not used for reporting the address-specific invalidation capability, so remove it from KVM to avoid conflicts in the future.
Signed-off-by: Zhang Xiantao <xiantao.zhang@intel.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
-
Committed by Zhang Xiantao
Bit 6 in the EPT vmexit's exit qualification is not defined in the SDM, so remove it.
Signed-off-by: Zhang Xiantao <xiantao.zhang@intel.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
-
- 02 December 2012 (1 commit)
-
-
Committed by Jan Kiszka
This is a regression caused by 18595411.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
-
- 01 December 2012 (2 commits)
-
-
Committed by Will Auld
CPUID.7.0.EBX[1]=1 indicates that the IA32_TSC_ADJUST MSR (0x3b) is supported.

The basic design is to emulate the MSR by allowing reads and writes to a guest-vcpu-specific location that stores the value of the emulated MSR, while adding that value to the VMCS tsc_offset. In this way the IA32_TSC_ADJUST value will be included in all reads of the TSC, whether through rdmsr or rdtsc, as long as the "use TSC counter offsetting" VM-execution control is enabled, as well as the IA32_TSC_ADJUST control.

However, because hardware only returns TSC + IA32_TSC_ADJUST + vmcs tsc_offset to a guest process when it does an rdtsc (with the correct settings), the value of our virtualized IA32_TSC_ADJUST must be stored in one of these three locations. The argument against storing it in the actual MSR is performance: it is likely to be seldom used, while the save/restore would be required on every transition. IA32_TSC_ADJUST was created as a way to solve some issues with writing the TSC itself, so that is not an option either. The remaining option, chosen here, has the problem of returning incorrect vmcs tsc_offset values (unless we intercept and fix them, which is not done here), as mentioned above.

More problematic, though, is that storing the data in vmcs tsc_offset has a different semantic effect on the system than using the actual MSR, as the following example illustrates: the hypervisor sets IA32_TSC_ADJUST, then the guest sets it, and a guest process performs an rdtsc. In this case the guest process sees TSC + IA32_TSC_ADJUST_hypervisor + vmcs tsc_offset, which includes IA32_TSC_ADJUST_guest. While the total system semantics change, the semantics as seen by the guest do not, and hence this will not cause a problem.
Signed-off-by: Will Auld <will.auld@intel.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
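A minimal arithmetic sketch of the scheme described above, assuming 'tsc_offset' stands in for the VMCS TSC-offset field (names and layout are illustrative, not KVM's): guest writes to the emulated MSR are folded into tsc_offset, so RDTSC in the guest observes them without an extra intercept.

```c
#include <stdint.h>
#include <stdio.h>

struct vcpu_state {
    int64_t tsc_offset;   /* stand-in for the VMCS "TSC offset" field        */
    int64_t tsc_adjust;   /* shadow of the emulated IA32_TSC_ADJUST MSR      */
};

static void guest_wrmsr_tsc_adjust(struct vcpu_state *v, int64_t val)
{
    v->tsc_offset += val - v->tsc_adjust;  /* fold the delta into tsc_offset */
    v->tsc_adjust  = val;                  /* remembered for later RDMSR     */
}

static uint64_t guest_rdtsc(const struct vcpu_state *v, uint64_t host_tsc)
{
    return host_tsc + (uint64_t)v->tsc_offset;  /* what hardware would return */
}

int main(void)
{
    struct vcpu_state v = { .tsc_offset = 1000, .tsc_adjust = 0 };
    guest_wrmsr_tsc_adjust(&v, -500);
    printf("guest TSC: %llu\n",
           (unsigned long long)guest_rdtsc(&v, 1000000)); /* 1000000 + 1000 - 500 */
    return 0;
}
```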
-
Committed by Will Auld
In order to track who initiated a call (host or guest) to modify an MSR value, change the function call parameters along the call path. The specific change is to add a struct pointer parameter that carries the (index, data, caller) information, rather than passing this information as individual parameters. The initial use of this capability is updating the IA32_TSC_ADJUST MSR while setting the TSC value; it is anticipated that it will be useful for other tasks as well.
Signed-off-by: Will Auld <will.auld@intel.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
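A sketch of the kind of descriptor being threaded through the call path; the field names are modelled on KVM's struct msr_data but should be read as illustrative rather than authoritative.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

struct msr_data {
    bool     host_initiated;  /* true: ioctl from userspace; false: guest WRMSR */
    uint32_t index;           /* MSR number, e.g. 0x3b for IA32_TSC_ADJUST      */
    uint64_t data;            /* value being written                            */
};

/* A setter can now tell host restores apart from guest-initiated writes. */
static int set_msr(const struct msr_data *m)
{
    if (m->host_initiated)
        printf("host restore of MSR %#x = %#llx (no side effects)\n",
               m->index, (unsigned long long)m->data);
    else
        printf("guest write to MSR %#x = %#llx (apply side effects)\n",
               m->index, (unsigned long long)m->data);
    return 0;
}

int main(void)
{
    struct msr_data m = { .host_initiated = true, .index = 0x3b, .data = 0 };
    return set_msr(&m);
}
```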
-
- 30 November 2012 (1 commit)
-
-
Committed by Xiao Guangrong
vmcs->cpu indicates which cpu the vmcs is loaded on; -1 means the vmcs is not loaded on any cpu. If a vcpu loads a vmcs with vmcs->cpu == -1, the vmcs can be added directly to that cpu's per-cpu list. The list can be corrupted if the cpu prefetches the vmcs's list entry before reading vmcs->cpu. Likewise, we should remove the vmcs from the list before making vmcs->cpu == -1 visible.
Signed-off-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
-
- 29 November 2012 (1 commit)
-
-
Committed by Xiao Guangrong
In loaded_vmcs_clear, loaded_vmcs->cpu is the first parameter passed to smp_call_function_single. If the target cpu is going down (being hot-removed), loaded_vmcs->cpu can become -1, and then -1 is passed to smp_call_function_single. This can be triggered when a vcpu is being destroyed, since loaded_vmcs_clear is called in a preemptible context.
Signed-off-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
-
- 28 November 2012 (7 commits)
-
-
Committed by Marcelo Tosatti
As requested by Glauber, do not update the kvmclock area on vcpu->pcpu migration if the host has a stable TSC. This reduces cacheline bouncing.
Acked-by: Glauber Costa <glommer@parallels.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
-
Committed by Marcelo Tosatti
With the master clock, a pvclock read calculates:

    ret = system_timestamp + [ (rdtsc + tsc_offset) - tsc_timestamp ]

where 'rdtsc' is the host TSC. system_timestamp and tsc_timestamp are unique, one tuple per VM: the "master clock". Given a host with synchronized TSCs, it is obvious that the guest TSCs must be matched for the above to guarantee monotonicity. Allow master clock usage only if guest TSCs are synchronized.
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
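A hedged stand-alone sketch of that read-side formula, with the tsc_to_system_mul/shift scaling omitted for clarity (i.e. assuming one TSC tick equals one nanosecond); names are illustrative, not KVM's.

```c
#include <stdint.h>
#include <stdio.h>

struct master_clock {
    uint64_t system_timestamp;  /* ns at the moment the tuple was sampled */
    uint64_t tsc_timestamp;     /* guest TSC value at that same moment    */
};

static uint64_t pvclock_read(const struct master_clock *mc,
                             uint64_t host_tsc, int64_t tsc_offset)
{
    uint64_t guest_tsc = host_tsc + (uint64_t)tsc_offset;
    return mc->system_timestamp + (guest_tsc - mc->tsc_timestamp);
}

int main(void)
{
    /* One <system_timestamp, tsc_timestamp> tuple shared by all vcpus plus
     * synchronized TSCs: two vcpus reading at the same instant see the same
     * host_tsc and therefore the same result, so the clock stays monotonic. */
    struct master_clock mc = { .system_timestamp = 5000, .tsc_timestamp = 12000 };
    printf("%llu\n", (unsigned long long)pvclock_read(&mc, 13000, 0)); /* 6000 */
    return 0;
}
```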
-
Committed by Marcelo Tosatti
TSC initialization will soon make use of online_vcpus.
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
-
Committed by Marcelo Tosatti
KVM added a global variable to guarantee monotonicity in the guest. One of the reasons for that is that the time between

    1. ktime_get_ts(&timespec);
    2. rdtscll(tsc);

is variable. That is, given a host with a stable TSC, suppose two VCPUs read the same time via ktime_get_ts() above. The time required to execute step 2 is not the same for those two instances executing on different VCPUs (cache misses, interrupts, ...).

If the TSC value used by the host to interpolate when calculating the monotonic time is the same value used to calculate the tsc_timestamp value stored in the pvclock data structure, and a single <system_timestamp, tsc_timestamp> tuple is visible to all vcpus simultaneously, this problem disappears. See the comment on top of pvclock_update_vm_gtod_copy for details.

Monotonicity is then guaranteed by the synchronicity of the host TSCs and guest TSCs. Set the TSC-stable pvclock flag in that case, allowing the guest to read the clock from userspace.
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
-
Committed by Marcelo Tosatti
Register a notifier for the clocksource change event. In case the host switches to a clock other than the TSC, disable master clock usage.
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
-
Committed by Marcelo Tosatti
Allow the caller to pass the host TSC value to kvm_x86_ops->read_l1_tsc().
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
-
Committed by Marcelo Tosatti
Otherwise it is possible for an unrelated KVM_REQ_UPDATE_CLOCK (such as one due to CPU migration) to clear the bit. Noticed by Paolo Bonzini.
Reviewed-by: Gleb Natapov <gleb@redhat.com>
Reviewed-by: Glauber Costa <glommer@parallels.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
-
- 27 November 2012 (1 commit)
-
-
Committed by H. Peter Anvin
In __emulate_1op_rax_rdx, we use "+a" and "+d", which are input/output constraints, and *then* use "a" and "d" as input constraints. This is incorrect, but happens to work on some versions of gcc. However, it breaks gcc with -O0 and icc, and may break future versions of gcc.
Reported-and-tested-by: Melanie Blower <melanie.blower@intel.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Link: http://lkml.kernel.org/r/B3584E72CFEBED439A3ECA9BCE67A4EF1B17AF90@FMSMSX107.amr.corp.intel.com
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Acked-by: Marcelo Tosatti <mtosatti@redhat.com>
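As a hedged illustration of the constraint rule involved (this is a standalone x86-64 example, not the kernel's __emulate_1op_rax_rdx macro): "+a"/"+d" already make RAX/RDX both inputs and outputs, so they must not be listed again among the inputs.

```c
#include <stdint.h>
#include <stdio.h>

/* 64-bit MUL: reads RAX, writes RDX:RAX. Correct constraints use only the
 * read-write "+a"/"+d" operands, without duplicate "a"/"d" input entries. */
static void mul64(uint64_t *rax, uint64_t *rdx, uint64_t src)
{
    uint64_t a = *rax, d = *rdx;

    __asm__("mulq %2"
            : "+a"(a), "+d"(d)
            : "r"(src)
            : "cc");

    *rax = a;
    *rdx = d;
}

int main(void)
{
    uint64_t rax = 1ULL << 40, rdx = 0;
    mul64(&rax, &rdx, 1ULL << 30);
    printf("rdx:rax = %#llx:%#llx\n",
           (unsigned long long)rdx, (unsigned long long)rax); /* 0x40:0x0 == 2^70 */
    return 0;
}
```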
-