Commits · e4b502ead259fcf70839414abb7c8cdc3b523f01 · openanolis / cloud-kernel

02 Aug, 2010 · 40 commits

KVM: MMU: cleanup spte set and accessed/dirty tracking · e4b502ea

Committed by Xiao Guangrong on Jul 16, 2010

Introduce set_spte_track_bits() to clean up the current code.
Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: MMU: don't atomically set spte if it's not present · be233d49

Committed by Xiao Guangrong on Jul 16, 2010

If the old mapping is not present, spte.a cannot be lost, so no atomic
operation is needed to set it.
Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: MMU: fix page dirty tracking lost while sync page · 9ed5520d

Committed by Xiao Guangrong on Jul 16, 2010

In the sync-page path, if spte.writable changes, page dirty tracking is
lost. For example:

Assume spte.writable = 0 in an unsync page. When it is synced, the spte is
mapped writable (spte.writable = 1). Later the guest writes to spte.gfn, so
spte.gfn is dirty. The guest then changes this mapping to read-only; after it
is synced, spte.writable = 0.

So when the host releases the spte, it sees spte.writable = 0 and does not
mark the page dirty.
Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

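The scenario in 9ed5520d can be reproduced in a small user-space model. This is only a sketch of one way to avoid the loss, not the actual KVM change: the bit position, `struct page_state`, `sync_spte()` and `release_spte()` are all hypothetical names made up for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit for this sketch; not KVM's actual mask. */
#define SPTE_WRITABLE (UINT64_C(1) << 1)

struct page_state { bool dirty; };   /* hypothetical host-side page flag */

/*
 * Replace *sptep with new_spte. If the old mapping was writable and the
 * new one is not, the guest may already have written through it, so the
 * page is conservatively marked dirty before write access is withdrawn.
 */
void sync_spte(uint64_t *sptep, uint64_t new_spte, struct page_state *page)
{
    uint64_t old_spte = *sptep;

    if ((old_spte & SPTE_WRITABLE) && !(new_spte & SPTE_WRITABLE))
        page->dirty = true;
    *sptep = new_spte;
}

/* Release path from the commit message: only a writable spte marks dirty. */
void release_spte(uint64_t *sptep, struct page_state *page)
{
    if (*sptep & SPTE_WRITABLE)
        page->dirty = true;
    *sptep = 0;
}
```

Without the check in `sync_spte()`, the writable-to-read-only transition followed by `release_spte()` would leave `dirty` false even though the guest wrote to the page.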

KVM: MMU: fix broken page accessed tracking with ept enabled · daa3db69

Committed by Xiao Guangrong on Jul 16, 2010

In the current code, if ept is enabled (shadow_accessed_mask = 0), page
accessed tracking is lost.
Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: MMU: add missing reserved bits check in speculative path · fa1de2bf

Committed by Xiao Guangrong on Jul 16, 2010

In the speculative path, we should check the guest pte's reserved bits just
as the real processor does.
Reported-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: MMU: fix mmu notifier invalidate handler for huge spte · 6e3e243c

Committed by Andrea Arcangeli on Jul 16, 2010

The index wasn't calculated correctly (off by one) for huge sptes, so KVM
guests were unstable with transparent hugepages.
Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
Reviewed-by: Rik van Riel <riel@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: x86 emulator: fix xchg instruction emulation · c19b8bd6

Committed by Wei Yongjun on Jul 15, 2010

If the destination is a memory operand and the memory cannot be
mapped to a valid page, emulation of xchg and of locked
instructions does not work on io regions and gets stuck in an endless
loop. Emulate the exchange as a write to fix it.
Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Acked-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

KVM: x86: Call mask notifiers from pic · 9195c4da

Committed by Gleb Natapov on Jul 15, 2010

If the pit delivers an interrupt while the pic is masking it, the OS will
never do EOI and the ack notifier will not be called, so when the pit is
unmasked no more pit interrupts will be delivered. Calling mask notifiers
solves this issue.
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

KVM: x86: never re-execute instruction with enabled tdp · 68be0803

Committed by Gleb Natapov on Jul 14, 2010

With tdp enabled we should get into the emulator only when emulating io, so
re-execution will always bring us back into the emulator.
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

KVM: Document KVM_GET_SUPPORTED_CPUID2 ioctl · d153513d

Committed by Avi Kivity on Jul 14, 2010

Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

KVM: x86: emulator: inc/dec can have lock prefix · c0e0608c

Committed by Gleb Natapov on Jul 13, 2010

Mark inc (0xfe/0 0xff/0) and dec (0xfe/1 0xff/1) as lock prefix capable.
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

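In the 0xfe/0 notation used by c0e0608c, the digit after the slash is the reg field (bits 5:3) of the ModRM byte that follows the opcode. A minimal sketch of that check follows; the function name is hypothetical and this is not how the emulator's decode tables are actually laid out.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Return true for the group encodings named in the commit:
 * inc is 0xfe/0 and 0xff/0, dec is 0xfe/1 and 0xff/1.
 */
bool inc_dec_lock_capable(uint8_t opcode, uint8_t modrm)
{
    uint8_t reg = (modrm >> 3) & 7;   /* the "/digit" part of the notation */

    return (opcode == 0xfe || opcode == 0xff) && reg <= 1;
}
```

Other members of the 0xff group, such as call (0xff/2), are rejected because their reg field is greater than 1.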

KVM: MMU: Eliminate redundant temporaries in FNAME(fetch) · 24157aaf

Committed by Avi Kivity on Jul 13, 2010

'level' and 'sptep' are aliases for 'iterator.level' and 'iterator.sptep', no
need for them.
Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

KVM: MMU: Validate all gptes during fetch, not just those used for new pages · 5991b332

Committed by Avi Kivity on Jul 13, 2010

Currently, when we fetch an spte, we only verify that gptes match those that
the walker saw if we build new shadow pages for them.

However, this misses the following race:

  vcpu1            vcpu2

  walk
                  change gpte
                  walk
                  instantiate sp

  fetch existing sp

Fix by validating every gpte, regardless of whether it is used for building
a new sp or not.
Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

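The fix in 5991b332 amounts to re-reading every gpte the walker saw and comparing it with the walker's snapshot, at every level. The sketch below models that in user space; `struct walker`, `gpte_unchanged()` and `fetch_validate()` are hypothetical names and do not mirror the kernel's data structures.

```c
#include <stdbool.h>
#include <stdint.h>

#define MAX_LEVELS 4

/* Hypothetical snapshot taken by the guest page table walker. */
struct walker {
    int top_level;                         /* highest level walked */
    int level;                             /* lowest level walked */
    uint64_t ptes[MAX_LEVELS];             /* gpte seen at level i + 1 */
    const uint64_t *pte_addr[MAX_LEVELS];  /* where that gpte lives */
};

/* Has the gpte at this level changed since the walk? */
bool gpte_unchanged(const struct walker *gw, int level)
{
    return *gw->pte_addr[level - 1] == gw->ptes[level - 1];
}

/* Check every level, whether or not a new shadow page is built for it. */
bool fetch_validate(const struct walker *gw)
{
    for (int level = gw->top_level; level >= gw->level; level--)
        if (!gpte_unchanged(gw, level))
            return false;   /* raced with another vcpu; caller retries */
    return true;
}
```

In the race shown above, vcpu1 would get `false` from `fetch_validate()` after vcpu2 changed the gpte, even though the shadow page it is about to use already exists.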

KVM: MMU: Simplify spte fetch() function · 0b3c9333

Committed by Avi Kivity on Jul 13, 2010

Partition the function into three sections:

- fetching indirect shadow pages (host_level > guest_level)
- fetching direct shadow pages (page_level < host_level <= guest_level)
- the final spte (page_level == host_level)

Instead of the current spaghetti.

A slight change from the original code is that we call validate_direct_spte()
more often: previously we called it only for gw->level, now we also call it for
lower levels.  The change should have no effect.

[xiao: fix regression caused by validate_direct_spte() called too late]
Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

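The three-way partition in 0b3c9333 can be written as a classification of each level visited on the way down. This is a sketch of the conditions listed in the commit message only; the enum and function names are hypothetical, and the real fetch() does far more work per level.

```c
/* Which section of the partitioned fetch a given level belongs to. */
enum fetch_step {
    FETCH_INDIRECT,   /* host_level > guest_level */
    FETCH_DIRECT,     /* page_level < host_level <= guest_level */
    FETCH_FINAL,      /* page_level == host_level */
    FETCH_DONE        /* below the mapping level; nothing to do */
};

enum fetch_step classify_level(int host_level, int guest_level, int page_level)
{
    if (host_level > guest_level)
        return FETCH_INDIRECT;
    if (host_level > page_level)
        return FETCH_DIRECT;
    if (host_level == page_level)
        return FETCH_FINAL;
    return FETCH_DONE;
}
```

For example, with a guest mapping at level 2 backed by a host mapping at level 1, levels 4 and 3 are indirect, level 2 is direct, and level 1 holds the final spte.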

KVM: MMU: Add gpte_valid() helper · 39c8c672

Committed by Avi Kivity on Jul 13, 2010

Move the code to check whether a gpte has changed since we fetched it into
a helper.
Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

KVM: MMU: Add validate_direct_spte() helper · a357bd22

Committed by Avi Kivity on Jul 13, 2010

Add a helper to verify that a direct shadow page is valid wrt the required
access permissions; drop the page if it is not valid.
Reviewed-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

KVM: MMU: Add drop_large_spte() helper · a3aa51cf

Committed by Avi Kivity on Jul 13, 2010

To clarify spte fetching code, move large spte handling into a helper.
Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

KVM: MMU: Use __set_spte to link shadow pages · 121eee97

Committed by Avi Kivity on Jul 13, 2010

To avoid split accesses to 64 bit sptes on i386, use __set_spte() to link
shadow pages together.

(not technically required since shadow pages are __GFP_KERNEL, so upper 32
bits are always clear)
Reviewed-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

KVM: MMU: Add link_shadow_page() helper · 32ef26a3

Committed by Avi Kivity on Jul 13, 2010

To simplify the process of fetching an spte, add a helper that links
a shadow page to an spte.
Reviewed-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

KVM: Convert mask notifiers to use irqchip/pin instead of gsi · 4a994358

Committed by Gleb Natapov on Jul 11, 2010

Devices register mask notifiers using gsi, but the irqchip knows about
irqchip/pin, so conversion from irqchip/pin to gsi should be done before
looking for the mask notifier to call.
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

KVM: Document MCE banks non-exposure via KVM_GET_MSR_INDEX_LIST · 2e2602ca

Committed by Avi Kivity on Jul 07, 2010

Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

KVM: Expose MCE control MSRs to userspace · 908e75f3

Committed by Avi Kivity on Jul 07, 2010

Userspace needs to reset and save/restore these MSRs.

The MCE banks are not exposed since their number varies from vcpu to vcpu.
Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

KVM: PIT: stop vpit before freeing irq_routing · aea924f6

Committed by Xiao Guangrong on Jul 10, 2010

Fix:
general protection fault: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC
......
Call Trace:
 [<ffffffffa0159bd1>] ? kvm_set_irq+0xdd/0x24b [kvm]
 [<ffffffff8106ea8b>] ? trace_hardirqs_off_caller+0x1f/0x10e
 [<ffffffff813ad17f>] ? sub_preempt_count+0xe/0xb6
 [<ffffffff8106d273>] ? put_lock_stats+0xe/0x27
...
RIP  [<ffffffffa0159c72>] kvm_set_irq+0x17e/0x24b [kvm]

This bug is triggered when the guest is shut down, because we freed
irq_routing before the pit thread stopped.
Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

KVM: Reenter guest after emulation failure if due to access to non-mmio address · a6f177ef

Committed by Gleb Natapov on Jul 08, 2010

When shadow pages are in use, KVM sometimes tries to emulate an instruction
when it accesses a shadowed page. If emulation fails, KVM un-shadows the
page and reenters the guest to allow the vcpu to execute the instruction. If
the page is not in the shadow page hash, KVM assumes that this was an attempt
to do MMIO and reports emulation failure to userspace, since there is no way
to fix the situation. This logic has a race, though. If two vcpus try to
write to the same shadowed page simultaneously, both will enter the emulator,
but only one of them will find the page in the shadow page hash, since the
one that finds it also removes it from there, so the other vcpu will report
failure to userspace and abort the guest.

Fix this by checking (in addition to checking the shadow page hash) that the
page that caused the emulation belongs to a valid memory slot. If it does,
reenter the guest to allow the vcpu to reexecute the instruction.
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

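The decision described in a6f177ef can be modelled as a small function. This is a sketch under stated assumptions: `struct memslot`, `gfn_in_memslot()` and `on_emulation_failure()` are hypothetical, and the shadow page hash lookup is reduced to a boolean the caller supplies.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical memory slot: a contiguous range of guest frame numbers. */
struct memslot { uint64_t base_gfn; uint64_t npages; };

enum emul_fail_action { REENTER_GUEST, REPORT_TO_USERSPACE };

bool gfn_in_memslot(const struct memslot *slots, size_t n, uint64_t gfn)
{
    for (size_t i = 0; i < n; i++)
        if (gfn >= slots[i].base_gfn &&
            gfn - slots[i].base_gfn < slots[i].npages)
            return true;
    return false;
}

/*
 * was_shadowed: the page was found in (and removed from) the shadow page
 * hash. A vcpu that lost the race sees false here, so the memory slot
 * check decides whether the access was ordinary RAM or real MMIO.
 */
enum emul_fail_action on_emulation_failure(bool was_shadowed,
                                           const struct memslot *slots,
                                           size_t n, uint64_t gfn)
{
    if (was_shadowed || gfn_in_memslot(slots, n, gfn))
        return REENTER_GUEST;
    return REPORT_TO_USERSPACE;
}
```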

KVM: Return EFAULT from kvm ioctl when guest accesses bad area · edba23e5

Committed by Gleb Natapov on Jul 07, 2010

Currently, if the guest accesses an address that belongs to a memory slot but
is not backed by a page, or the page is read-only, KVM treats it like an MMIO
access. Remove that capability. It was never part of the interface and should
not be relied upon.
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: define hwpoison variables static · fa7bff8f

Committed by Gleb Natapov on Jul 07, 2010

They are not used outside of the file.
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: fix lock imbalance in kvm_create_pit() · 673813e8

Committed by Jiri Slaby on Jul 07, 2010

Stanse found that there is an omitted unlock in one failure path of
kvm_create_pit. Add the proper unlock there.
Signed-off-by: Jiri Slaby <jirislaby@gmail.com>
Cc: Avi Kivity <avi@redhat.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: x86@kernel.org
Cc: Gleb Natapov <gleb@redhat.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Gregory Haskins <ghaskins@novell.com>
Cc: kvm@vger.kernel.org
Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: MMU: Keep going on permission error · f59c1d2d

Committed by Avi Kivity on Jul 06, 2010

Real hardware disregards permission errors when computing page fault error
code bit 0 (page present).  Do the same.
Reviewed-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: MMU: Only indicate a fetch fault in page fault error code if nx is enabled · b0eeec29

Committed by Avi Kivity on Jul 06, 2010

Bit 4 of the page fault error code is set only if EFER.NX is set.
Reviewed-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

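The rule in b0eeec29 is easy to state in code. The sketch below builds an x86 page fault error code from the fault's properties; the macro and function names are hypothetical, and it covers only the bits that existed at the time of these commits.

```c
#include <stdbool.h>
#include <stdint.h>

/* x86 page fault error code bits. */
#define PFERR_PRESENT (1u << 0)   /* fault on a present page */
#define PFERR_WRITE   (1u << 1)   /* access was a write */
#define PFERR_USER    (1u << 2)   /* access came from user mode */
#define PFERR_FETCH   (1u << 4)   /* access was an instruction fetch */

uint32_t pf_error_code(bool present, bool write, bool user,
                       bool fetch, bool efer_nx)
{
    uint32_t ec = 0;

    if (present)
        ec |= PFERR_PRESENT;
    if (write)
        ec |= PFERR_WRITE;
    if (user)
        ec |= PFERR_USER;
    if (fetch && efer_nx)         /* bit 4 is reported only with EFER.NX set */
        ec |= PFERR_FETCH;
    return ec;
}
```

With EFER.NX clear, a user-mode instruction fetch that faults on a present page reports 0x5 rather than 0x15.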

KVM: x86 emulator: re-implementing 'mov AL,moffs' instruction decoding · 5d55f299

Committed by Wei Yongjun on Jul 07, 2010

Use DstAcc for decoding 'mov AL, moffs' and introduce SrcAcc for
decoding 'mov moffs, AL'.
Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: x86 emulator: fix cli/sti instruction emulation · 07cbc6c1

Committed by Wei Yongjun on Jul 06, 2010

If the IOPL check fails, the cli/sti emulation raises GP, and then we
should skip writeback since the default write OP is OP_REG.
Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: x86 emulator: fix 'mov rm,sreg' instruction decoding · b16b2b7b

Committed by Wei Yongjun on Jul 06, 2010

The source operand of 'mov rm,sreg' is a segment register, not a
general-purpose register, so remove SrcReg from decoding.
Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: x86 emulator: fix 'and AL,imm8' instruction decoding · e97e883f

Committed by Wei Yongjun on Jul 06, 2010

'and AL,imm8' should be marked as ByteOp; otherwise the dest operand
length will not be correct and we may fill the full EAX on writeback.
Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

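The effect of a wrong operand size on writeback in e97e883f can be shown with a small model. This is a sketch only: `write_register()` and `emulate_and_al_imm8()` are hypothetical helpers, and the operand size is passed in explicitly instead of coming from decode flags.

```c
#include <stdint.h>

/* Write val into a register using the given operand size in bytes. */
uint64_t write_register(uint64_t reg, uint64_t val, unsigned bytes)
{
    switch (bytes) {
    case 1:  /* byte write: upper bits of the register are preserved */
        return (reg & ~UINT64_C(0xff)) | (val & 0xff);
    case 2:  /* word write: upper bits are preserved */
        return (reg & ~UINT64_C(0xffff)) | (val & 0xffff);
    case 4:  /* dword write: replaces EAX and zero-extends */
        return val & 0xffffffff;
    default:
        return val;
    }
}

/* 'and AL,imm8' with the destination size the decoder chose. */
uint64_t emulate_and_al_imm8(uint64_t rax, uint8_t imm, unsigned op_bytes)
{
    uint64_t result = (rax & 0xff) & imm;

    return write_register(rax, result, op_bytes);
}
```

With rax = 0x12345678 and imm = 0x0f, a 1-byte destination yields 0x12345608, while a 4-byte destination overwrites the whole of EAX and yields 0x8.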

KVM: x86 emulator: fix the comment of out instruction · ce7a0ad3

Committed by Wei Yongjun on Jul 06, 2010

Fix the comment of the out instruction, using the same style as the
other instructions.
Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: x86 emulator: fix 'mov sreg,rm16' instruction decoding · a5046e6c

Committed by Wei Yongjun on Jul 06, 2010

Memory reads for 'mov sreg,rm16' should be 16 bits only.
Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: MMU: Don't drop accessed bit while updating an spte · b79b93f9

Committed by Avi Kivity on Jun 06, 2010

__set_spte() will happily replace an spte with the accessed bit set with
one that has the accessed bit clear.  Add a helper update_spte() which checks
for this condition and updates the page flag if needed.
Signed-off-by: Avi Kivity <avi@redhat.com>

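The condition that b79b93f9 describes can be sketched as follows. The bit position, `struct page_flags` and the function name are hypothetical, and this non-atomic model is not the kernel's update_spte(); it only shows the check that carries the accessed bit over to the page.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit for this sketch; not KVM's actual mask. */
#define SPTE_ACCESSED (UINT64_C(1) << 5)

struct page_flags { bool accessed; };   /* hypothetical host page flag */

/*
 * Install new_spte. If that clears an accessed bit the old spte had set,
 * record the access on the page so the information is not dropped.
 */
void update_spte_sketch(uint64_t *sptep, uint64_t new_spte,
                        struct page_flags *page)
{
    uint64_t old_spte = *sptep;

    *sptep = new_spte;
    if ((old_spte & SPTE_ACCESSED) && !(new_spte & SPTE_ACCESSED))
        page->accessed = true;
}
```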

KVM: MMU: Atomically check for accessed bit when dropping an spte · a9221dd5

Committed by Avi Kivity on Jun 06, 2010

Currently, in the window between the check for the accessed bit, and actually
dropping the spte, a vcpu can access the page through the spte and set the bit,
which will be ignored by the mmu.

Fix by using an exchange operation to atomically fetch the spte and drop it.
Signed-off-by: Avi Kivity <avi@redhat.com>

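The exchange-based drop in a9221dd5 can be illustrated with C11 atomics in user space. This is a sketch, not the kernel code: the bit positions and the function name are hypothetical, and `atomic_exchange()` stands in for whatever primitive the kernel uses.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit layout for this sketch; not KVM's actual masks. */
#define SPTE_PRESENT  (UINT64_C(1) << 0)
#define SPTE_ACCESSED (UINT64_C(1) << 5)

/*
 * Clear the spte and report whether its accessed bit was set. A single
 * exchange returns the value that was actually replaced, so a bit set
 * just before the drop cannot fall between a separate load and store.
 */
bool drop_spte_atomic(_Atomic uint64_t *sptep)
{
    uint64_t old = atomic_exchange(sptep, 0);

    return (old & SPTE_ACCESSED) != 0;
}
```

A load followed by a separate store would leave exactly the window the commit message describes between the two operations.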

KVM: MMU: Move accessed/dirty bit checks from rmap_remove() to drop_spte() · ce061867

Committed by Avi Kivity on Jun 06, 2010

Since we need to make the check atomic, move it to the place that will
set the new spte.
Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: MMU: Introduce drop_spte() · be38d276

Committed by Avi Kivity on Jun 06, 2010

When we call rmap_remove(), we (almost) always immediately follow it by
an __set_spte() to a nonpresent pte.  Since we need to perform the two
operations atomically, to avoid losing the dirty and accessed bits, introduce
a helper drop_spte() and convert all call sites.

The operation is still nonatomic at this point.
Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: VMX: fix tlb flush with invalid root · dd180b3e

Committed by Xiao Guangrong on Jul 03, 2010

Commit 341d9b535b6c simplified the reload logic on guest entry; it
avoids an unnecessary sync-root if KVM_REQ_MMU_RELOAD and
KVM_REQ_MMU_SYNC are both set.

But it causes an issue: when we handle 'KVM_REQ_TLB_FLUSH', the root
may be invalid. This was triggered during my test:

Kernel BUG at ffffffffa00212b8 [verbose debug info unavailable]
......

Fix by returning directly if the root is not ready.
Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

openanolis / cloud-kernel · synced successfully more than 1 year ago