Commits · 6fd01b711bee96ce3356f7b6f370ab708e37504b · openanolis / cloud-kernel

20 Sep, 2012 · 7 commits

KVM: MMU: Optimize is_last_gpte() · 6fd01b71

Authored by Avi Kivity on Sep 12, 2012

Instead of branchy code depending on level, gpte.ps, and mmu configuration,
prepare everything in a bitmap during mode changes and look it up during
runtime.

Reviewed-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

6fd01b71
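
The lookup described in this commit can be illustrated with a small user-space sketch. It is not the kernel code: `struct walk_cfg`, `cfg_update_last_bitmap()` and `is_last_entry()` are hypothetical names, and the rule that the page-size bit only terminates a walk at levels 2 and 3 when large pages are enabled is a simplifying assumption. The point is only that the decision is precomputed once and then read with a shift and a mask.

```c
#include <stdbool.h>
#include <stdint.h>

#define PS_BIT_SHIFT 7u          /* bit 7 of a guest pte: page size (PS) */
#define PS_INDEX_BIT (1u << 2)   /* bit 2 of the lookup index carries PS */

/* Hypothetical configuration, refreshed only on (rare) mode changes. */
struct walk_cfg {
    unsigned root_level;   /* 2, 3 or 4 */
    bool large_pages;      /* assumption: PS is honoured at levels 2 and 3 */
    uint8_t last_bitmap;   /* one bit per (level - 1) | (PS << 2) */
};

/* Slow path: decide for every (level, PS) pair whether the walk ends there. */
static void cfg_update_last_bitmap(struct walk_cfg *cfg)
{
    unsigned map = 0;

    /* Level 1 always terminates the walk, whatever PS says. */
    map |= 1u << 0;
    map |= 1u << (PS_INDEX_BIT | 0u);

    for (unsigned level = 2; level <= cfg->root_level; ++level)
        if (level <= 3 && cfg->large_pages)
            map |= 1u << (PS_INDEX_BIT | (level - 1));

    cfg->last_bitmap = (uint8_t)map;
}

/* Fast path: no branches on level, PS or configuration. */
static bool is_last_entry(const struct walk_cfg *cfg, unsigned level, uint64_t pte)
{
    unsigned index = level - 1;

    index |= (unsigned)((pte >> PS_BIT_SHIFT) & 1u) << 2;
    return (cfg->last_bitmap >> index) & 1u;
}
```

A caller would run `cfg_update_last_bitmap()` whenever the paging mode changes and call `is_last_entry()` at each step of the walk.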

KVM: MMU: Simplify walk_addr_generic() loop · 13d22b6a

Authored by Avi Kivity on Sep 12, 2012

The page table walk is coded as an infinite loop, with a special
case on the last pte.

Code it as an ordinary loop with a termination condition on the last
pte (large page or walk length exhausted), and put the last pte handling
code after the loop where it belongs.

Reviewed-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

13d22b6a

KVM: MMU: Optimize pte permission checks · 97d64b78

Authored by Avi Kivity on Sep 12, 2012

walk_addr_generic() permission checks are a maze of branchy code, which is
performed four times per lookup.  It depends on the type of access, efer.nxe,
cr0.wp, cr4.smep, and in the near future, cr4.smap.

Optimize this away by precalculating all variants and storing them in a
bitmap.  The bitmap is recalculated when rarely-changing variables change
(cr0, cr4) and is indexed by the often-changing variables (page fault error
code, pte access permissions).

The permission check is moved to the end of the loop, otherwise an SMEP
fault could be reported as a false positive, when PDE.U=1 but PTE.U=0.
Noted by Xiao Guangrong.

The result is short, branch-free code.

Reviewed-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

97d64b78
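
A minimal user-space sketch of the precalculation idea follows. The bit encodings, `struct perm_cfg`, `perm_cfg_update()` and `access_denied()` are assumptions made for illustration and do not reproduce the kernel's definitions; the rules applied to cr0.wp, efer.nxe and cr4.smep are simplified from the description above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical encodings for this sketch (not the kernel's definitions). */
enum { ACC_EXEC = 1, ACC_WRITE = 2, ACC_USER = 4 };        /* pte access bits */
enum { FAULT_WRITE = 1, FAULT_USER = 2, FAULT_FETCH = 4 }; /* access type bits */

struct perm_cfg {
    bool wp, nx, smep;   /* rarely-changing inputs: cr0.wp, efer.nxe, cr4.smep */
    uint8_t denied[8];   /* denied[access type]: one bit per pte access value */
};

/* Slow path: recompute every (access type, pte access) combination. */
static void perm_cfg_update(struct perm_cfg *cfg)
{
    for (unsigned type = 0; type < 8; ++type) {
        bool wf = type & FAULT_WRITE;
        bool uf = type & FAULT_USER;
        bool ff = type & FAULT_FETCH;
        unsigned map = 0;

        for (unsigned acc = 0; acc < 8; ++acc) {
            bool x = acc & ACC_EXEC;
            bool w = acc & ACC_WRITE;
            bool u = acc & ACC_USER;

            x = x || !cfg->nx;                  /* without NX everything is executable */
            w = w || (!cfg->wp && !uf);         /* WP clear: supervisor may write read-only pages */
            x = x && !(cfg->smep && u && !uf);  /* SMEP: no supervisor fetch from user pages */

            bool fault = (ff && !x) || (uf && !u) || (wf && !w);
            map |= (unsigned)fault << acc;
        }
        cfg->denied[type] = (uint8_t)map;
    }
}

/* Fast path: one load, one shift, one mask. */
static bool access_denied(const struct perm_cfg *cfg, unsigned type, unsigned acc)
{
    return (cfg->denied[type & 7u] >> (acc & 7u)) & 1u;
}
```

`perm_cfg_update()` would run only when one of the control bits changes, while `access_denied()` is cheap enough to call on every lookup.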

KVM: MMU: Update accessed and dirty bits after guest pagetable walk · 8cbc7069

Authored by Avi Kivity on Sep 16, 2012

While unspecified, the behaviour of Intel processors is to first
perform the page table walk, then, if the walk was successful, to
atomically update the accessed and dirty bits of walked paging elements.

While we are not required to follow this exactly, doing so will allow us
to perform the access permissions check after the walk is complete, rather
than after each walk step.

(The tricky case is SMEP: a zero in any pte's U bit makes the referenced
page a supervisor page, so we can't fault on a one bit during the walk
itself.)

Reviewed-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

8cbc7069

KVM: MMU: Move gpte_access() out of paging_tmpl.h · 3d34adec

Authored by Avi Kivity on Sep 12, 2012

We no longer rely on paging_tmpl.h defines, so we can move the function
to mmu.c.

Rely on zero extension to 64 bits to get the correct nx behaviour.

Reviewed-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

3d34adec

KVM: MMU: Optimize gpte_access() slightly · edc2ae84

Authored by Avi Kivity on Sep 12, 2012

If nx is disabled, then if gpte[63] is set we will hit a reserved
bit set fault before checking permissions, so we can ignore the
setting of efer.nxe.

Reviewed-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

edc2ae84

KVM: MMU: Push clean gpte write protection out of gpte_access() · 8ea667f2

Authored by Avi Kivity on Sep 12, 2012

gpte_access() computes the access permissions of a guest pte and also
write-protects clean gptes.  This is wrong when we are servicing a
write fault (since we'll be setting the dirty bit momentarily) but
correct when instantiating a speculative spte, or when servicing a
read fault (since we'll want to trap a following write in order to
set the dirty bit).

It doesn't seem to hurt in practice, but in order to make the code
readable, push the write protection out of gpte_access() and into
a new protect_clean_gpte() which is called explicitly when needed.

Reviewed-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

8ea667f2

18 Sep, 2012 · 1 commit

KVM: make processes waiting on vcpu mutex killable · 9fc77441

Authored by Michael S. Tsirkin on Sep 16, 2012

The vcpu mutex can be held for an unlimited time, so
taking it with mutex_lock on an ioctl is wrong:
one process could be passed a vcpu fd and
call this ioctl on the vcpu used by another process;
it will then be unkillable until the owner exits.

Call mutex_lock_killable instead and return status.
Note: mutex_lock_interruptible would be even nicer,
but I am not sure all users are prepared to handle EINTR
from these ioctls. They might misinterpret it as an error.

Cleanup paths expect a vcpu that can't be used by
any userspace so this will always succeed - catch bugs
by calling BUG_ON.

Catch callers that don't check return state by adding
__must_check.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

9fc77441

17 Sep, 2012 · 3 commits

KVM: SVM: Make use of asm.h · 7454766f

Authored by Avi Kivity on Sep 16, 2012

Use macros for bitness-insensitive register names, instead of
rolling our own.

Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

7454766f

KVM: VMX: Make use of asm.h · b188c81f

Authored by Avi Kivity on Sep 16, 2012

Use macros for bitness-insensitive register names, instead of
rolling our own.

Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

b188c81f

KVM: VMX: Make lto-friendly · 83287ea4

Authored by Avi Kivity on Sep 16, 2012

LTO (link-time optimization) doesn't like local labels to be referred to
from a different function, since the two functions may be built in separate
compilation units.  Use an external variable instead.

Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

83287ea4

13 Sep, 2012 · 1 commit

KVM: x86: lapic: Clean up find_highest_vector() and count_vectors() · ecba9a52

Authored by Takuya Yoshikawa on Sep 05, 2012

find_highest_vector() and count_vectors():
 - Instead of using magic values, define and use proper macros.

find_highest_vector():
 - Remove likely(), which is there only for historical reasons and no
   longer predicts branches correctly.  Using such heuristics
   to optimize this function is not worth it now.  Let CPUs predict
   things instead.

 - Stop checking word[0] separately.  This was only needed for the
   likely() optimization.

 - Use a for loop, not while, to iterate over the register array to make
   the code clearer.

Note that we actually confirmed that the likely() made wrong predictions
by inserting debug code.

Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

ecba9a52
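
The shape of the cleaned-up helpers can be sketched in user space as follows. This is an illustration under stated assumptions rather than the kernel code: the macro names are invented here, the register layout (eight 32-bit registers, each on a 16-byte boundary) is modelled as a `uint32_t` array with a stride of four words, and `__builtin_clz()` / `__builtin_popcount()` are GCC/Clang builtins standing in for the kernel's own bit helpers.

```c
#include <stdint.h>

/* Named constants instead of magic values (names are hypothetical). */
#define MAX_VECTOR        256
#define VECTORS_PER_REG   32
#define REG_STRIDE_WORDS  4   /* each 32-bit register sits on a 16-byte boundary */
#define NR_WORDS          (MAX_VECTOR / VECTORS_PER_REG * REG_STRIDE_WORDS)
#define REG_WORD(vec)     (((vec) / VECTORS_PER_REG) * REG_STRIDE_WORDS)
#define VEC_BIT(vec)      (1u << ((vec) % VECTORS_PER_REG))

static void set_vector(uint32_t *regs, int vec)
{
    regs[REG_WORD(vec)] |= VEC_BIT(vec);
}

/* Plain for loop from the top register down; no likely(), no special word[0]. */
static int find_highest_vector(const uint32_t *regs)
{
    for (int vec = MAX_VECTOR - VECTORS_PER_REG; vec >= 0; vec -= VECTORS_PER_REG) {
        uint32_t reg = regs[REG_WORD(vec)];

        if (reg)
            return vec + (VECTORS_PER_REG - 1) - __builtin_clz(reg);
    }
    return -1;   /* no vector set */
}

static int count_vectors(const uint32_t *regs)
{
    int count = 0;

    for (int vec = 0; vec < MAX_VECTOR; vec += VECTORS_PER_REG)
        count += __builtin_popcount(regs[REG_WORD(vec)]);
    return count;
}
```

The register array passed in must hold at least `NR_WORDS` elements.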

10 Sep, 2012 · 2 commits

KVM: MMU: remove unnecessary check · 7de5bdc9

Authored by Xiao Guangrong on Sep 07, 2012

Checking the return value of kvm_mmu_get_page is unnecessary since
success is guaranteed by the memory cache.

Signed-off-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

7de5bdc9

KVM: Depend on HIGH_RES_TIMERS · 92b5265d

Authored by Liu, Jinsong on Sep 10, 2012

The KVM lapic timer and tsc deadline timer are based on hrtimer: they
set the leftmost node in the rb tree and then reprogram the hrtimer.
If hrtimers are not configured as high resolution, hrtimer_enqueue_reprogram
does nothing, which makes the KVM lapic timer and tsc deadline timer fail.

Signed-off-by: Liu, Jinsong <jinsong.liu@intel.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

92b5265d

06 Sep, 2012 · 7 commits

KVM: use symbolic constant for nr interrupts · a50abc3b

Authored by Michael S. Tsirkin on Sep 05, 2012

interrupt_bitmap is KVM_NR_INTERRUPTS bits in size,
so just use that instead of hard-coded constants
and math.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

a50abc3b

KVM: emulator: optimize "rep ins" handling · b3356bf0

Authored by Gleb Natapov on Sep 03, 2012

Optimize "rep ins" by allowing the emulator to write back more than one
datum at a time. Introduce a new operand type OP_MEM_STR which tells
writeback() that dst contains a pointer to an array that should be written
back, as opposed to just one data element.

Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

b3356bf0
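
As a rough illustration of the distinction this commit draws, the sketch below writes back either a single element or a whole array depending on the operand type. Only the name OP_MEM_STR comes from the commit message; `struct operand`, its fields, and this `writeback()` signature are hypothetical, and a plain byte buffer stands in for guest memory.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum operand_type { OP_MEM, OP_MEM_STR };

/* Hypothetical operand layout for this sketch only. */
struct operand {
    enum operand_type type;
    size_t bytes;         /* size of one element */
    size_t count;         /* number of elements, used only by OP_MEM_STR */
    uint8_t *dest;        /* stand-in for guest memory */
    const uint8_t *data;  /* one element (OP_MEM) or an array (OP_MEM_STR) */
};

/* Write back one element, or the whole array in a single pass. */
static size_t writeback(const struct operand *dst)
{
    size_t len = dst->bytes;

    if (dst->type == OP_MEM_STR)
        len *= dst->count;

    memcpy(dst->dest, dst->data, len);
    return len;   /* number of bytes written */
}
```

With OP_MEM_STR a string instruction that has buffered several elements needs one writeback call instead of one per element.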

KVM: emulator: string_addr_inc() cleanup · f3bd64c6

Authored by Gleb Natapov on Sep 03, 2012

Remove the unneeded segment argument. The address structure already has
the correct segment, which was put there during decode.

Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

f3bd64c6

KVM: emulator: make x86 emulation modes enum instead of defines · 9d1b39a9

Authored by Gleb Natapov on Sep 03, 2012

Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

9d1b39a9

KVM: Provide userspace IO exit completion callback · 716d51ab

Authored by Gleb Natapov on Sep 03, 2012

Current code assumes that an IO exit was due to instruction emulation
and hands execution back to the emulator directly. This patch adds a new
userspace IO exit completion callback that can be set by any other code
that caused an IO exit to userspace.

Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

716d51ab

KVM: move postcommit flush to x86, as mmio sptes are x86 specific · 3b4dc3a0

Authored by Marcelo Tosatti on Aug 28, 2012

Other arches do not need this.

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

v2: fix incorrect deletion of mmio sptes on gpa move (noticed by Takuya)
Signed-off-by: Avi Kivity <avi@redhat.com>

3b4dc3a0

KVM: split kvm_arch_flush_shadow · 2df72e9b

Authored by Marcelo Tosatti on Aug 24, 2012

Introduce kvm_arch_flush_shadow_memslot to invalidate the
translations of a single memory slot.

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

2df72e9b

05 Sep, 2012 · 8 commits

KVM: SVM: constify lookup tables · 09941fbb

Authored by Mathias Krause on Aug 30, 2012

We never modify direct_access_msrs[], msrpm_ranges[],
svm_exit_handlers[] or x86_intercept_map[] at runtime.
Mark them r/o.

Signed-off-by: Mathias Krause <minipli@googlemail.com>
Cc: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

09941fbb

KVM: VMX: constify lookup tables · 772e0318

Authored by Mathias Krause on Aug 30, 2012

We use vmcs_field_to_offset_table[], kvm_vmx_segment_fields[] and
kvm_vmx_exit_handlers[] as lookup tables only -- make them r/o.

Signed-off-by: Mathias Krause <minipli@googlemail.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

772e0318

KVM: x86: more constification · f1d24831

Authored by Mathias Krause on Aug 30, 2012

Signed-off-by: Mathias Krause <minipli@googlemail.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

f1d24831

KVM: x86: constify read_write_emulator_ops · 0fbe9b0b

Authored by Mathias Krause on Aug 30, 2012

We never change those, make them r/o.

Signed-off-by: Mathias Krause <minipli@googlemail.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

0fbe9b0b

KVM: x86 emulator: constify emulate_ops · 0225fb50

Authored by Mathias Krause on Aug 30, 2012

We never change emulate_ops[] at runtime so it should be r/o.

Signed-off-by: Mathias Krause <minipli@googlemail.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

0225fb50

KVM: x86 emulator: mark opcode tables const · fd0a0d82

Authored by Mathias Krause on Aug 30, 2012

The opcode tables never change at runtime, therefore mark them const.

Signed-off-by: Mathias Krause <minipli@googlemail.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

fd0a0d82

KVM: x86 emulator: use aligned variants of SSE register ops · 89a87c67

Authored by Mathias Krause on Aug 30, 2012

As the compiler ensures that the memory operand is always aligned
to a 16 byte memory location, use the aligned variant of MOVDQ for
read_sse_reg() and write_sse_reg().

Signed-off-by: Mathias Krause <minipli@googlemail.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

89a87c67

KVM: x86: minor size optimization · 326d07cb

Authored by Mathias Krause on Aug 30, 2012

Some fields can be constified and/or made static to reduce code and data
size.

Numbers for a 32 bit build:

        text    data     bss     dec     hex filename
before: 3351      80       0    3431     d67 cpuid.o
 after: 3391       0       0    3391     d3f cpuid.o

Signed-off-by: Mathias Krause <minipli@googlemail.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

326d07cb

04 Sep, 2012 · 1 commit

KVM: cleanup pic reset · ec798660

Authored by Gleb Natapov on Sep 03, 2012

kvm_pic_reset() is not used anywhere. Move reset logic from
pic_ioport_write() there.

Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

ec798660

31 Aug, 2012 · 1 commit

KVM: x86: remove unused variable from kvm_task_switch() · 9a781977

Authored by Marcelo Tosatti on Aug 30, 2012

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

9a781977

28 Aug, 2012 · 9 commits

KVM: VMX: Ignore segment G and D bits when considering whether we can virtualize · a81aba14

Authored by Avi Kivity on Aug 21, 2012

We will enter the guest with G and D cleared; as real hardware ignores D in
real mode, and G is taken care of by the limit test, we allow more code to
run in vm86 mode.

Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

a81aba14

KVM: VMX: Save all segment data in real mode · ce566803

Authored by Avi Kivity on Aug 21, 2012

Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

ce566803

KVM: VMX: Preserve segment limit and access rights in real mode · 1390a28b

Authored by Avi Kivity on Aug 21, 2012

While this is undocumented, real processors do not reload the segment
limit and access rights when loading a segment register in real mode.
Real programs rely on it so we need to comply with this behaviour.

Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

1390a28b

KVM: VMX: Return real real-mode segment data even if emulate_invalid_guest_state=1 · 72636420

Authored by Avi Kivity on Aug 21, 2012

emulate_invalid_guest_state=1 doesn't mean we don't munge the segments in the
vmcs; we do. So we need to return the real ones (maintained by vmx_set_segment).

Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

72636420

KVM: x86 emulator: Fix #GP error code during linearization · 0afbe2f8

Authored by Avi Kivity on Aug 21, 2012

We want the segment selector, not the segment number.

Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

0afbe2f8

KVM: x86 emulator: Check segment limits in real mode too · a5625189

Authored by Avi Kivity on Aug 21, 2012

Segment limits are verified in real mode, not just protected mode.

Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

a5625189

KVM: x86 emulator: Leave segment limit and attributes alone in real mode · 03ebebeb

Authored by Avi Kivity on Aug 21, 2012

When loading a segment in real mode, only the base and selector must
be modified.  The limit needs to be left alone, otherwise big real mode
users will hit a #GP due to limit checking (currently this is suppressed
because we don't check limits in real mode).

Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

03ebebeb
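
The interaction between real-mode segment loads and limit checks can be sketched in a few lines of user-space C. `struct segment`, `load_segment_real_mode()` and `linearize()` are hypothetical names for this illustration and the emulator's real data structures differ; the sketch assumes the usual real-mode rule that the base is the selector shifted left by four, and it reports a limit violation by returning false where the emulator would raise #GP.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical cached segment state for this sketch. */
struct segment {
    uint16_t selector;
    uint32_t base;
    uint32_t limit;   /* may be larger than 0xffff in big real mode */
};

/* Real-mode load: only selector and base change, the limit is preserved. */
static void load_segment_real_mode(struct segment *seg, uint16_t selector)
{
    seg->selector = selector;
    seg->base = (uint32_t)selector << 4;
    /* seg->limit is deliberately left untouched */
}

/* Returns false where a limit violation (#GP) would be raised. */
static bool linearize(const struct segment *seg, uint32_t offset,
                      uint32_t size, uint32_t *linear)
{
    uint64_t last = (uint64_t)offset + size - 1;   /* last byte accessed */

    if (size == 0 || last > seg->limit)
        return false;

    *linear = seg->base + offset;
    return true;
}
```

If the load reset the limit to 0xffff, the big real mode access in the first case below (offset 0x12345 with a 4GB limit) would fail the check, which is the #GP the commit message warns about.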

KVM: VMX: Allow vm86 virtualization of big real mode · e2a610d7

Authored by Avi Kivity on Aug 21, 2012

Usually, big real mode uses large (4GB) segments. Currently we don't
virtualize this; if any segment has a limit other than 0xffff, we emulate.
But if we set the vmx-visible limit to 0xffff, we can use vm86 to virtualize
real mode; if an access overruns the segment limit, the guest will #GP, which
we will trap and forward to the emulator. This results in significantly
faster execution, and less risk of hitting an unemulated instruction.

If the limit is less than 0xffff, we retain the existing behaviour.

Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

e2a610d7

KVM: VMX: Allow real mode emulation using vm86 with dpl=0 · 495e1166

Authored by Avi Kivity on Aug 21, 2012

Real mode is always entered from protected mode with dpl=0.  Since
the dpl doesn't affect execution, and we already override it to 3
in the vmcs (as vmx requires), we can allow execution in that state.

Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

495e1166

openanolis / cloud-kernel · synced successfully more than 1 year ago