Commits · 3457e4192e367fd4e0da5e9f46f9df85fa99cd11 · openanolis / cloud-kernel

01 Aug, 2010 21 commits

KVM: handle emulation failure case first · 3457e419

Committed by Gleb Natapov on Apr 28, 2010

If emulation failed, return immediately.
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

3457e419

KVM: do not inject #PF in (read|write)_emulated() callbacks · 8fe681e9

Committed by Gleb Natapov on Apr 28, 2010

Return an error to the x86 emulator instead of injecting an exception behind its back.
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

8fe681e9

KVM: remove export of emulator_write_emulated() · f181b96d

Committed by Gleb Natapov on Apr 28, 2010

It is no longer called directly outside of the file it is defined in.
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

f181b96d

KVM: x86 emulator: x86_emulate_insn() return -1 only in case of emulation failure · c3cd7ffa

Committed by Gleb Natapov on Apr 28, 2010

Currently the emulator returns -1 both when emulation failed and when IO
is needed. The caller tries to guess whether emulation failed by looking
at other variables. Make it easier for the caller to recognise the error
condition by always returning -1 in case of failure. For this, a new
emulator-internal return value, X86EMUL_IO_NEEDED, is introduced. It is
used to distinguish between an error condition (which returns
X86EMUL_UNHANDLEABLE) and a condition that requires an IO exit to
userspace to continue emulation.
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

c3cd7ffa
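
The return-value convention described in this commit message can be sketched roughly as follows. This is a minimal userspace illustration, not the kernel code: the enum names, their numeric values, the `emul_outcome` struct, and the `emulate_insn_result()` helper are all hypothetical stand-ins for the emulator-internal codes the message mentions.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the emulator-internal result codes.
 * The numeric values are illustrative only. */
enum emul_result {
    EMUL_CONTINUE,      /* instruction emulated, nothing special */
    EMUL_UNHANDLEABLE,  /* emulation failed */
    EMUL_IO_NEEDED,     /* must exit to userspace to finish the IO */
};

/* Hypothetical side channel telling the caller that IO is pending. */
struct emul_outcome {
    bool io_needed;
};

/* Map an internal result to the value seen by the caller:
 * -1 only for a real failure, 0 otherwise. */
int emulate_insn_result(enum emul_result rc, struct emul_outcome *out)
{
    out->io_needed = false;

    switch (rc) {
    case EMUL_UNHANDLEABLE:
        return -1;
    case EMUL_IO_NEEDED:
        out->io_needed = true;
        return 0;
    default:
        return 0;
    }
}
```

With this shape the caller can test for `-1` directly instead of inferring failure from unrelated state.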

KVM: fill in run->mmio details in (read|write)_emulated function · 411c35b7

Committed by Gleb Natapov on Apr 28, 2010

Fill in run->mmio details in the (read|write)_emulated function, just
like pio does. There is no point in filling only vcpu fields there just
to copy them into vcpu->run a little bit later.
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

411c35b7

KVM: x86 emulator: make (get|set)_dr() callback return error if it fails · 338dbc97

Committed by Gleb Natapov on Apr 28, 2010

Make the (get|set)_dr() callback return an error if it fails instead of
injecting an exception behind the emulator's back.
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

338dbc97

KVM: x86 emulator: make set_cr() callback return error if it fails · 0f12244f

Committed by Gleb Natapov on Apr 28, 2010

Make the set_cr() callback return an error if it fails instead of
injecting #GP behind the emulator's back.
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

0f12244f

KVM: x86 emulator: cleanup some direct calls into kvm to use existing callbacks · 79168fd1

Committed by Gleb Natapov on Apr 28, 2010

Use callbacks from x86_emulate_ops to access segments instead of calling
into kvm directly.
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

79168fd1

KVM: x86 emulator: add get_cached_segment_base() callback to x86_emulate_ops · 5951c442

Committed by Gleb Natapov on Apr 28, 2010

On VMX it is expensive to call get_cached_descriptor() just to get the
segment base, since multiple vmcs_reads are done instead of only one.
Introduce a new callback, get_cached_segment_base(), for efficiency.
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

5951c442

KVM: x86 emulator: add (set|get)_msr callbacks to x86_emulate_ops · 3fb1b5db

Committed by Gleb Natapov on Apr 28, 2010

Add (set|get)_msr callbacks to x86_emulate_ops instead of calling
them directly.
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

3fb1b5db

KVM: x86 emulator: add (set|get)_dr callbacks to x86_emulate_ops · 35aa5375

Committed by Gleb Natapov on Apr 28, 2010

Add (set|get)_dr callbacks to x86_emulate_ops instead of calling
them directly.
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

35aa5375

KVM: x86 emulator: handle "far address" source operand · 414e6277

Committed by Gleb Natapov on Apr 28, 2010

The ljmp/lcall instruction operand contains an address and a segment.
It can be 10 bytes long. Currently we decode it as two different
operands. Fix it by introducing a new kind of operand that can hold
an entire far address.
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

414e6277

KVM: x86 emulator: cleanup nop emulation · b8a98945

Committed by Gleb Natapov on Apr 28, 2010

Make it more explicit what we are checking for.
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

b8a98945

KVM: x86 emulator: cleanup xchg emulation · f0c13ef1

Committed by Gleb Natapov on Apr 28, 2010

The dst operand is already initialized during the decoding stage. No
need to reinitialize it.
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

f0c13ef1

KVM: x86 emulator: fix Move r/m16 to segment register decoding · 054fe9f6

Committed by Gleb Natapov on Apr 28, 2010

This instruction does not need generic decoding for its dst operand.
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

054fe9f6

KVM: x86 emulator: introduce read cache · 9de41573

Committed by Gleb Natapov on Apr 28, 2010

Introduce a read cache, which is needed for instructions that require
more than one exit to userspace. After returning from userspace, the
instruction will be re-executed with the cached read value.
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

9de41573

KVM: VMX: Avoid writing HOST_CR0 every entry · 1c11e713

Committed by Avi Kivity on May 03, 2010

cr0.ts may change between entries, so we copy cr0 to HOST_CR0 before each
entry.  That is slow, so instead, set HOST_CR0 to have TS set unconditionally
(which is a safe value), and issue a clts() just before exiting vcpu context
if the task indeed owns the fpu.

Saves ~50 cycles/exit.
Signed-off-by: Avi Kivity <avi@redhat.com>

1c11e713

KVM: kvm_pdptr_read() may sleep · 08acfa18

Committed by Avi Kivity on May 04, 2010

Annotate it thusly.
Signed-off-by: Avi Kivity <avi@redhat.com>

08acfa18

KVM: x86: avoid unnecessary bitmap allocation when memslot is clean · 914ebccd

Committed by Takuya Yoshikawa on Apr 28, 2010

Although we always allocate a new dirty bitmap in x86's get_dirty_log(),
when the memslot is clean it is only used as a zero-source for
copy_to_user() and freed right after that. This patch uses clear_user()
instead of doing this unnecessary zero-source allocation.

Performance improvement: as one would expect, the time needed to
allocate a bitmap is eliminated entirely. In my test, the improved ioctl
was about 4 to 10 times faster than the original one for clean slots.
Furthermore, reducing memory allocations and copies should also benefit
the caches.
Signed-off-by: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp>
Signed-off-by: Avi Kivity <avi@redhat.com>

914ebccd

KVM: VMX: Simplify vmx_get_nmi_mask() · c332c83a

Committed by Avi Kivity on May 04, 2010

!! is not needed due to the cast to bool.
Signed-off-by: Avi Kivity <avi@redhat.com>

c332c83a
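
The point about `!!` can be illustrated with a small standalone sketch. It is not the kernel function: the `nmi_masked()` helper and the `INTR_STATE_NMI` bit value are hypothetical. In C99 and later, converting any nonzero scalar to `bool` yields exactly 1, so a masked bit test returned from a `bool` function needs no explicit normalisation.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit position, chosen only for illustration. */
#define INTR_STATE_NMI 0x8u

/* The conversion to bool on return maps any nonzero value to true,
 * so writing !!(x & mask) here would be redundant. */
bool nmi_masked(uint32_t interruptibility)
{
    return interruptibility & INTR_STATE_NMI;
}
```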

KVM: Avoid killing userspace through guest SRAO MCE on unmapped pages · bf998156

Committed by Huang Ying on May 31, 2010

In common cases, a guest SRAO MCE causes the corresponding poisoned page
to be unmapped and SIGBUS to be sent to QEMU-KVM; QEMU-KVM then relays
the MCE to the guest OS.

But it is reported that if the poisoned page is accessed in the guest
after unmapping and before the MCE is relayed to the guest OS, userspace
will be killed.

The reason is as follows. Because the poisoned page has been unmapped,
a guest access causes a guest exit and kvm_mmu_page_fault is called.
kvm_mmu_page_fault cannot get the poisoned page for the fault address,
so kernel and user space MMIO processing is tried in turn. In user MMIO
processing, the poisoned page is accessed again, and userspace is then
killed by force_sig_info.

To fix the bug, kvm_mmu_page_fault sends a HWPOISON signal to QEMU-KVM
and does not try kernel and user space MMIO processing for a poisoned
page.

[xiao: fix warning introduced by avi]
Reported-by: Max Asbock <masbock@linux.vnet.ibm.com>
Signed-off-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

bf998156

23 Jul, 2010 2 commits

KVM: Use kmalloc() instead of vmalloc() for KVM_[GS]ET_MSR · 7a73c028

Committed by Avi Kivity on Jul 22, 2010

We don't need more than a page, and vmalloc() is slower (much
slower recently due to a regression).
Signed-off-by: Avi Kivity <avi@redhat.com>

7a73c028

KVM: MMU: fix conflict access permissions in direct sp · 6aa0b9de

Committed by Xiao Guangrong on Jun 30, 2010

In non-direct mapping, we mark an sp as 'direct' when we map the guest's
large page, but its access is derived from the upper page-structure
entries and does not include the last mapping, which causes an access
conflict.

For example, have this mapping:
        [W]
      / PDE1 -> |---|
  P[W]          |   | LPA
      \ PDE2 -> |---|
        [R]

P has two children, PDE1 and PDE2, and both PDE1 and PDE2 map the
same large page (LPA). P's access is WR, PDE1's access is WR, and
PDE2's access is RO (only read-write permissions are considered here).

When the guest accesses through PDE1, we create a direct sp for LPA.
The sp's access comes from P, which is W, so we mark the ptes in this
sp as W.

Then, when the guest accesses through PDE2, we find LPA's shadow page,
which is the same one as PDE1's, and mark the ptes as RO.

So, if the guest accesses through PDE1 again, an incorrect #PF occurs.

Fixed by encoding the last mapping's access into the direct shadow page.
Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

6aa0b9de

19 Jul, 2010 1 commit

mm: add context argument to shrinker callback · 7f8275d0

Committed by Dave Chinner on Jul 19, 2010

The current shrinker implementation requires the registered callback
to have global state to work from. This makes it difficult to shrink
caches that are not global (e.g. per-filesystem caches). Pass the shrinker
structure to the callback so that users can embed the shrinker structure
in the context the shrinker needs to operate on and get back to it in the
callback via container_of().
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>

7f8275d0
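
The embedding pattern this commit message describes can be sketched in plain userspace C. This is only an illustration under stated assumptions: `struct shrinker` here is a simplified stand-in whose callback takes just the shrinker pointer and a scan count, and `struct fs_cache`, `fs_cache_shrink()` and `fs_cache_init()` are hypothetical names. The `container_of()` macro is written out with `offsetof()` so the example builds outside the kernel.

```c
#include <stddef.h>

/* Userspace equivalent of the kernel's container_of(). */
#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Simplified stand-in: the callback receives the shrinker itself. */
struct shrinker {
    int (*shrink)(struct shrinker *s, int nr_to_scan);
};

/* Hypothetical non-global cache that embeds its own shrinker. */
struct fs_cache {
    int nr_objects;
    struct shrinker shrinker;
};

/* Recover the owning cache from the shrinker pointer, free up to
 * nr_to_scan objects, and report how many remain. */
int fs_cache_shrink(struct shrinker *s, int nr_to_scan)
{
    struct fs_cache *cache = container_of(s, struct fs_cache, shrinker);

    if (nr_to_scan > cache->nr_objects)
        nr_to_scan = cache->nr_objects;
    cache->nr_objects -= nr_to_scan;
    return cache->nr_objects;
}

void fs_cache_init(struct fs_cache *cache, int nr_objects)
{
    cache->nr_objects = nr_objects;
    cache->shrinker.shrink = fs_cache_shrink;
}
```

Because each cache carries its own shrinker, two caches can share one callback without any global state.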

13 Jul, 2010 1 commit

KVM: MMU: flush remote tlbs when overwriting spte with different pfn · 91546356

Committed by Xiao Guangrong on Jun 30, 2010

After removing an rmap, we should flush the TLBs of all vcpus.
Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

91546356

06 Jul, 2010 1 commit

KVM: VMX: Fix host MSR_KERNEL_GS_BASE corruption · da38f438

Committed by Avi Kivity on Jul 06, 2010

enter_lmode() and exit_lmode() modify the guest's EFER.LMA before calling
vmx_set_efer().  However, the latter function depends on the value of EFER.LMA
to determine whether MSR_KERNEL_GS_BASE needs reloading, via
vmx_load_host_state().  With EFER.LMA changing under its feet, it took the
wrong choice and corrupted userspace's %gs.

This causes 32-on-64 host userspace to fault.

Fix by not touching EFER.LMA; instead ask vmx_set_efer() to change it.
Signed-off-by: Avi Kivity <avi@redhat.com>

da38f438

09 Jun, 2010 4 commits

KVM: MMU: Remove user access when allowing kernel access to gpte.w=0 page · 69325a12

Committed by Avi Kivity on May 27, 2010

If cr0.wp=0, we have to allow the guest kernel access to a page with pte.w=0.
We do that by setting spte.w=1, since the host cr0.wp must remain set so the
host can write protect pages. Once we allow write access, we must remove
user access; otherwise we mistakenly allow the user to write the page.
Reviewed-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

69325a12
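
The permission rule in this commit message can be sketched as a small pure function. This is an illustration only, not the MMU code: the `ACC_WRITE` and `ACC_USER` bit values and the `shadow_access()` helper are hypothetical, and the sketch assumes the adjustment is made when the guest kernel takes a write fault with cr0.wp=0.

```c
#include <stdbool.h>

/* Hypothetical permission bits, for illustration only. */
#define ACC_WRITE 0x2u
#define ACC_USER  0x4u

/* Compute the access bits for the shadow pte from the guest pte's
 * access bits. With cr0.wp=0 the guest kernel may write a read-only
 * page, so the shadow pte is made writable; user access is dropped
 * at the same time so guest user mode cannot write through it. */
unsigned int shadow_access(unsigned int gpte_access, bool cr0_wp,
                           bool kernel_write_fault)
{
    unsigned int access = gpte_access;

    if (!cr0_wp && kernel_write_fault && !(access & ACC_WRITE)) {
        access |= ACC_WRITE;
        access &= ~ACC_USER;
    }
    return access;
}
```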

KVM: MMU: invalidate and flush on spte small->large page size change · 3be2264b

Committed by Marcelo Tosatti on May 28, 2010

Always invalidate spte and flush TLBs when changing page size, to make
sure different sized translations for the same address are never cached
in a CPU's TLB.

Currently the only case where this occurs is when a non-leaf spte pointer is
overwritten by a leaf, large spte entry. This can happen after dirty
logging is disabled on a memslot, for example.

Noticed by Andrea.

KVM-Stable-Tag
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

3be2264b

KVM: SVM: Implement workaround for Erratum 383 · 67ec6607

Committed by Joerg Roedel on May 17, 2010

This patch implements a workaround for AMD erratum 383 in
KVM. Without this erratum fix it is possible for a guest to
kill the host machine. This patch implements the suggested
workaround for hypervisors, which will be published in the
next revision guide update.

[jan: fix overflow warning on i386]
[xiao: fix unused variable warning]

Cc: stable@kernel.org
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

67ec6607

KVM: SVM: Handle MCEs early in the vmexit process · fe5913e4

Committed by Joerg Roedel on May 17, 2010

This patch moves handling of the MC vmexits to an earlier
point in the vmexit. The handle_exit function is too late
because the vcpu might already have changed its physical
cpu.

Cc: stable@kernel.org
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

fe5913e4

19 May, 2010 10 commits

KVM: x86: Add missing locking to arch specific vcpu ioctls · 8fbf065d

Committed by Avi Kivity on May 13, 2010

Signed-off-by: Avi Kivity <avi@redhat.com>

8fbf065d

KVM: MMU: Segregate shadow pages with different cr0.wp · 3dbe1415

Committed by Avi Kivity on May 12, 2010

When cr0.wp=0, we may shadow a gpte having u/s=1 and r/w=0 with an spte
having u/s=0 and r/w=1.  This allows excessive access if the guest sets
cr0.wp=1 and accesses through this spte.

Fix by making cr0.wp part of the base role; we'll have different sptes for
the two cases and the problem disappears.
Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

3dbe1415

KVM: x86: Check LMA bit before set_efer · a3d204e2

Committed by Sheng Yang on May 12, 2010

kvm_x86_ops->set_efer() executes vcpu->arch.efer = efer, so the
check of the LMA bit did not work.
Signed-off-by: Sheng Yang <sheng@linux.intel.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

a3d204e2

KVM: Don't allow lmsw to clear cr0.pe · f78e9176

Committed by Avi Kivity on May 12, 2010

The current lmsw implementation allows the guest to clear cr0.pe, contrary
to the manual, which breaks EMM386.EXE.

Fix by ORing the old cr0.pe with lmsw's operand.
Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

f78e9176
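
The fix described here ("ORing the old cr0.pe with lmsw's operand") can be sketched as a pure function on the register value. This is a minimal illustration, not the KVM code: `lmsw_apply()` is a hypothetical helper, and it assumes lmsw affects only the low four CR0 bits (PE, MP, EM, TS).

```c
#include <stdint.h>

/* Apply an lmsw operand to an existing CR0 value.
 * Bits 1-3 (MP, EM, TS) are replaced by the operand. Bit 0 (PE) is
 * kept from the old value and ORed with the operand's bit 0, so
 * lmsw can set PE but never clear it. */
uint64_t lmsw_apply(uint64_t old_cr0, uint16_t msw)
{
    return (old_cr0 & ~(uint64_t)0x0e) | (msw & 0x0f);
}
```

For example, applying an operand of 0 to a CR0 value with PE set leaves PE set, which is the behaviour the commit message says the manual requires.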

KVM: x86: Tell the guest we'll warn it about tsc stability · 371bcf64

Committed by Glauber Costa on May 11, 2010

This patch puts up the flag that tells the guest that we'll warn it
about the tsc being trustworthy or not. For now, we also say
it is not.
Signed-off-by: Glauber Costa <glommer@redhat.com>
Acked-by: Zachary Amsden <zamsden@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

371bcf64

KVM: x86: export paravirtual cpuid flags in KVM_GET_SUPPORTED_CPUID · 84478c82

Committed by Glauber Costa on May 11, 2010

Until now, we have been using individual KVM_CAP entities to tell
userspace which cpuids we support. This is suboptimal, since it
introduces a delay between the feature arriving in the host and
being available in the guest.

A much better mechanism is to list para features in KVM_GET_SUPPORTED_CPUID.
This makes userspace automatically aware of what we provide. And if we
ever add a new cpuid bit in the future, we have to do that again,
which creates some complexity and delay in feature adoption.
Signed-off-by: Glauber Costa <glommer@redhat.com>
Acked-by: Zachary Amsden <zamsden@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

84478c82

KVM: x86: change msr numbers for kvmclock · 11c6bffa

Committed by Glauber Costa on May 11, 2010

Avi pointed out a while ago that those MSRs fall into the Pentium
PMU range. So the idea here is to add new ones and, after a while,
deprecate the old ones.
Signed-off-by: Glauber Costa <glommer@redhat.com>
Acked-by: Zachary Amsden <zamsden@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

11c6bffa

KVM: x86: Inject #GP with the right rip on efer writes · b69e8cae

Committed by Roedel, Joerg on May 06, 2010

This patch fixes a bug in the KVM efer-msr write path. If a
guest writes to a reserved efer bit, the set_efer function
injects the #GP directly. The architecture dependent wrmsr
function does not see this, assumes success and advances the
rip. This results in a #GP in the guest with the wrong rip.
This patch fixes this by reporting efer write errors back to
the architectural wrmsr function.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

b69e8cae

KVM: SVM: Don't allow nested guest to VMMCALL into host · 0d945bd9

Committed by Joerg Roedel on May 05, 2010

This patch removes the possibility for an l2-guest to do a
VMMCALL directly into the host. This would happen if the
l1-hypervisor doesn't intercept VMMCALL and the l2-guest
executes this instruction.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

0d945bd9

KVM: x86: Fix exception reinjection forced to true · 3f0fd292

Committed by Joerg Roedel on May 05, 2010

The recently merged patch that allows marking an exception
as reinjected has a bug: it always marks the exception as
reinjected. This breaks the nested-svm shadow-on-shadow
implementation.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

3f0fd292

openanolis / cloud-kernel synced successfully about 1 year ago