Commits · a18de5a403f9b5010527b2e7b05049b539b4facd · openeuler / raspberrypi-kernel

16 Jul, 2007 10 commits

KVM: Move shadow pte modifications from set_pte/set_pde to set_pde_common() · a18de5a4

Authored by Avi Kivity on May 31, 2007

We want all shadow pte modifications in one place.
Signed-off-by: Avi Kivity <avi@qumranet.com>

KVM: MMU: Fold fix_write_pf() into set_pte_common() · 97a0a01e

Authored by Avi Kivity on May 31, 2007

This prevents some work from being performed twice, and, more importantly,
reduces the number of places where we modify shadow ptes.
Signed-off-by: Avi Kivity <avi@qumranet.com>

KVM: MMU: Fold fix_read_pf() into set_pte_common() · 63b1ad24

Authored by Avi Kivity on May 31, 2007

Signed-off-by: Avi Kivity <avi@qumranet.com>

KVM: MMU: Pass the guest pde to set_pte_common · 6598c8b2

Authored by Avi Kivity on May 31, 2007

We will need the accessed bit (in addition to the dirty bit) and
also write access (for setting the dirty bit) in a future patch.
Signed-off-by: Avi Kivity <avi@qumranet.com>

KVM: MMU: Move set_pte_common() to pte width dependent code · e60d75ea

Authored by Avi Kivity on May 30, 2007

In preparation for some modifications.
Signed-off-by: Avi Kivity <avi@qumranet.com>

KVM: MMU: Simplify fetch() a little bit · ef0197e8

Authored by Avi Kivity on May 30, 2007

Signed-off-by: Avi Kivity <avi@qumranet.com>

KVM: Use symbolic constants instead of magic numbers · 8d728203

Authored by Eddie Dong on May 29, 2007

Signed-off-by: Avi Kivity <avi@qumranet.com>

KVM: MMU: Store shadow page tables as kernel virtual addresses, not physical · 47ad8e68

Authored by Avi Kivity on May 06, 2007

Simplifies things a bit.
Signed-off-by: Avi Kivity <avi@qumranet.com>

KVM: Update shadow pte on write to guest pte · 0028425f

Authored by Avi Kivity on May 01, 2007

A typical demand page/copy on write pattern is:

- page fault on vaddr
- kvm propagates fault to guest
- guest handles fault, updates pte
- kvm traps write, clears shadow pte, resumes guest
- guest returns to userspace, re-faults on same vaddr
- kvm installs shadow pte, resumes guest
- guest continues

So, three vmexits for a single guest page fault.  But if instead of clearing
the page table entry, we update to correspond to the value that the guest
has just written, we eliminate the third vmexit.

This patch does exactly that, reducing kbuild time by about 10%.
Signed-off-by: Avi Kivity <avi@qumranet.com>
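
The exit counts in the message above can be reproduced with a toy model. This is only a sketch in Python, not KVM code: the function name, the `update_on_write` flag, and the dictionaries standing in for page tables are all hypothetical.

```python
def vmexits_for_demand_fault(update_on_write: bool) -> int:
    """Count vmexits for one guest demand fault in a toy shadow-paging model."""
    shadow = {}      # vaddr -> shadow pte (a missing key means "not present")
    guest_pt = {}    # vaddr -> guest pte
    vaddr = 0x1000
    exits = 0

    # 1. The guest touches vaddr. No shadow pte exists, so the fault exits
    #    to kvm, which propagates it to the guest.
    exits += 1

    # 2. The guest fault handler writes its pte. The guest page table is
    #    write protected, so the write traps into kvm.
    exits += 1
    guest_pt[vaddr] = 0xABC000 | 1
    if update_on_write:
        shadow[vaddr] = guest_pt[vaddr]   # mirror the value just written
    else:
        shadow.pop(vaddr, None)           # old behaviour: clear the shadow pte

    # 3. The guest returns to userspace and retries the access.
    if vaddr not in shadow:
        exits += 1                        # re-fault; kvm installs the shadow pte
        shadow[vaddr] = guest_pt[vaddr]
    return exits
```

In this model, clearing the shadow pte costs three exits and updating it costs two, which matches the description in the commit message.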

KVM: Reduce misfirings of the fork detector · a25f7e1f

Authored by Avi Kivity on Apr 30, 2007

The kvm mmu tries to detect forks by looking for repeated writes to a
page table. If it sees a fork, it unshadows the page table so the page
table copying can proceed at native speed instead of being emulated.

However, the detector also triggered on simple demand paging access patterns:
a linear walk of memory would of course cause repeated writes to the same
pagetable page, causing it to unshadow prematurely.

Fix by resetting the fork detector if we detect a demand fault.
Signed-off-by: Avi Kivity <avi@qumranet.com>
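
A rough way to picture the heuristic is a counter of consecutive writes to the same page table that is reset whenever a demand fault is seen. The class below is a hypothetical Python model; the threshold value and all names are assumptions and are not taken from the patch.

```python
class WriteFloodDetector:
    """Toy model: unshadow a page table after repeated writes to it."""

    def __init__(self, threshold: int = 3):
        self.threshold = threshold   # hypothetical value
        self.last_gfn = None
        self.count = 0

    def note_pt_write(self, gfn: int) -> bool:
        """Record an emulated page table write; return True to unshadow."""
        if gfn == self.last_gfn:
            self.count += 1
        else:
            self.last_gfn, self.count = gfn, 1
        return self.count >= self.threshold

    def note_demand_fault(self) -> None:
        """A demand fault suggests ordinary paging, not a fork: start over."""
        self.last_gfn, self.count = None, 0
```

With the reset in place, a linear walk that alternates page table writes with demand faults never reaches the threshold, while an uninterrupted burst of writes still does.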

03 May, 2007 3 commits

KVM: Per-vcpu statistics · 1165f5fe

Authored by Avi Kivity on Apr 19, 2007

Make the exit statistics per-vcpu instead of global.  This gives a 3.5%
boost when running one virtual machine per core on my two socket dual core
(4 cores total) machine.
Signed-off-by: Avi Kivity <avi@qumranet.com>
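
The shape of the change can be sketched as follows: each vcpu owns its counters and a reader sums them on demand. This is an illustrative Python model only; `Vcpu`, `stat`, and `total_stats` are hypothetical names, and the model says nothing about where the measured speedup comes from.

```python
from collections import Counter


class Vcpu:
    """Each vcpu keeps its own counters, updated only by its own thread."""

    def __init__(self):
        self.stat = Counter()


def total_stats(vcpus):
    """Aggregate per-vcpu counters when a global view is requested."""
    total = Counter()
    for vcpu in vcpus:
        total.update(vcpu.stat)
    return total
```

The hot path (incrementing a counter on exit) touches only per-vcpu state; the cost of combining the numbers moves to the rarely used read side.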

KVM: MMU: Fix hugepage pdes mapping same physical address with different access · d28c6cfb

Authored by Avi Kivity on Mar 23, 2007

The kvm mmu keeps a shadow page for hugepage pdes; if several such pdes map
the same physical address, they share the same shadow page. This is a fairly
common case (kernel mappings on i386 nonpae Linux, for example).

However, if the two pdes map the same memory but with different permissions, kvm
will happily use the cached shadow page. If an access through the more
permissive pde occurs after an access through the strict pde, an endless pagefault
loop is generated and the guest makes no progress.

Fix by making the access permissions part of the cache lookup key.

The fix allows Xen pae to boot on kvm and run guest domains.

Thanks to Jeremy Fitzhardinge for reporting the bug and testing the fix.
Signed-off-by: Avi Kivity <avi@qumranet.com>
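
The idea of widening the lookup key can be shown with a small dictionary-backed cache. This is a sketch under stated assumptions: `HugePdeKey`, `ShadowCache`, and the encoding of `access` (bit 0 writable, bit 1 user) are invented for illustration and do not mirror the kernel's data structures.

```python
from typing import NamedTuple


class HugePdeKey(NamedTuple):
    gfn: int      # guest frame number the hugepage pde maps
    access: int   # hypothetical encoding: bit 0 = writable, bit 1 = user


class ShadowCache:
    def __init__(self):
        self.pages = {}

    def lookup_or_create(self, gfn: int, access: int) -> dict:
        """Return the shadow page for (gfn, access), creating it if needed."""
        key = HugePdeKey(gfn, access)
        if key not in self.pages:
            self.pages[key] = {"gfn": gfn, "access": access, "sptes": {}}
        return self.pages[key]
```

Two pdes that map the same frame with different permissions now get separate shadow pages, so a read-only mapping can no longer be reused for a writable one.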

KVM: MMU: Remove unnecessary check for pdptr access · ca5aac1f

Authored by Avi Kivity on Mar 20, 2007

We already special case the pdptr access, so no need to check it again.
Signed-off-by: Avi Kivity <avi@qumranet.com>

04 Mar, 2007 2 commits

KVM: Cosmetics · d27d4aca

Authored by Avi Kivity on Feb 19, 2007

Signed-off-by: Avi Kivity <avi@qumranet.com>

KVM: mmu: add missing dirty page tracking cases · bf3f8e86

Authored by Avi Kivity on Feb 19, 2007

We fail to mark a page dirty in three cases:

- setting the accessed bit in a pte
- setting the dirty bit in a pte
- emulating a write into a pagetable

This fix adds the missing cases.
Signed-off-by: Avi Kivity <avi@qumranet.com>
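
All three cases are writes that the hypervisor itself makes to guest memory. One way to avoid missing such cases, sketched here in Python with hypothetical names (`GuestMemory`, `set_pte_bits`), is to route every hypervisor write through a helper that also records the dirty frame. The accessed and dirty bit positions (5 and 6) are the architectural x86 pte bits; everything else is an assumption.

```python
PAGE_SHIFT = 12
PT_ACCESSED = 1 << 5
PT_DIRTY = 1 << 6


class GuestMemory:
    """Toy guest memory: a sparse map of guest physical address -> word."""

    def __init__(self):
        self.words = {}
        self.dirty_gfns = set()

    def read(self, gpa: int) -> int:
        return self.words.get(gpa, 0)

    def write(self, gpa: int, value: int) -> None:
        """Every hypervisor write marks the containing frame dirty."""
        self.words[gpa] = value
        self.dirty_gfns.add(gpa >> PAGE_SHIFT)


def set_pte_bits(mem: GuestMemory, pte_gpa: int, bits: int) -> None:
    """Set accessed/dirty bits in a guest pte through the tracked path."""
    mem.write(pte_gpa, mem.read(pte_gpa) | bits)
```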

13 Feb, 2007 1 commit

[PATCH] kvm: Fix gva_to_gpa() · e119d117

Authored by Avi Kivity on Feb 12, 2007

gva_to_gpa() needs to be updated to the new walk_addr() calling convention,
otherwise it may oops under some circumstances.

Use the opportunity to remove all the code duplication in gva_to_gpa(), which
essentially repeats the calculations in walk_addr().
Signed-off-by: Avi Kivity <avi@qumranet.com>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

27 Jan, 2007 2 commits

[PATCH] KVM: MMU: Report nx faults to the guest · 73b1087e

Authored by Avi Kivity on Jan 26, 2007

With the recent guest page fault change, we perform access checks on our
own instead of relying on the cpu.  This means we have to perform the nx
checks as well.

Software like the google toolbar on windows appears to rely on this
somehow.
Signed-off-by: Avi Kivity <avi@qumranet.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

[PATCH] KVM: MMU: Perform access checks in walk_addr() · 7993ba43

Authored by Avi Kivity on Jan 26, 2007

Check pte permission bits in walk_addr(), instead of scattering the checks all
over the code.  This has the following benefits:

1. We no longer set the accessed bit for accesses which fail permission checks.
2. Setting the accessed bit is simplified.
3. Under some circumstances, we used to pretend a page fault was fixed when
   it would actually fail the access checks.  This caused an unnecessary
   vmexit.
4. The error code for guest page faults is now correct.

The fix helps netbsd further along booting, and allows kvm to pass the new mmu
testsuite.
Signed-off-by: Avi Kivity <avi@qumranet.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
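
Centralizing the checks might look roughly like the single-entry check below. It is a simplified Python sketch, not the patch: it examines one pte rather than every level of the walk, and `check_access` with its parameters is a hypothetical name. The bit positions are the architectural x86 pte and page fault error code bits.

```python
PT_PRESENT, PT_WRITABLE, PT_USER, PT_ACCESSED = 1 << 0, 1 << 1, 1 << 2, 1 << 5
PT_NX = 1 << 63
PFERR_PRESENT, PFERR_WRITE, PFERR_USER, PFERR_FETCH = 1 << 0, 1 << 1, 1 << 2, 1 << 4


def check_access(pte, write, user, fetch, cr0_wp=True, nx_enabled=True):
    """Return (new_pte, error_code); error_code is None if the access is allowed."""
    err = (PFERR_WRITE if write else 0) | (PFERR_USER if user else 0)
    if fetch and nx_enabled:
        err |= PFERR_FETCH
    if not pte & PT_PRESENT:
        return pte, err                      # not-present fault, P bit clear
    denied = (
        (write and not pte & PT_WRITABLE and (user or cr0_wp))
        or (user and not pte & PT_USER)
        or (fetch and nx_enabled and pte & PT_NX)
    )
    if denied:
        return pte, err | PFERR_PRESENT      # protection fault, pte untouched
    return pte | PT_ACCESSED, None           # accessed bit set only on success
```

Because the accessed bit is set in the same place that decides success, a failed access cannot leave it set, and the error code is assembled from the same inputs that drove the decision.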

23 Jan, 2007 1 commit

[PATCH] KVM: fix bogus pagefault on writable pages · fc3dffe1

Authored by Avi Kivity on Jan 22, 2007

If a page is marked as dirty in the guest pte, set_pte_common() can set the
writable bit on newly-instantiated shadow pte.  This optimization avoids
a write fault after the initial read fault.

However, if a write fault instantiates the pte, fix_write_pf() incorrectly
reports the fault as a guest page fault, and the guest oopses on what appears
to be a correctly-mapped page.

Fix is to detect the condition and only report a guest page fault on a user
access to a kernel page.

With the fix, a kvm guest can survive a whole night of running the kernel
hacker's screensaver (make -j9 in a loop).
Signed-off-by: Avi Kivity <avi@qumranet.com>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

06 Jan, 2007 15 commits

[PATCH] KVM: MMU: Add missing dirty bit · 760db773

Authored by Avi Kivity on Jan 05, 2007

If we emulate a write, we fail to set the dirty bit on the guest pte, leading
the guest to believe the page is clean, and thus lose data.  Bad.

Fix by setting the guest pte dirty bit under such conditions.
Signed-off-by: Avi Kivity <avi@qumranet.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

[PATCH] KVM: MMU: add audit code to check mappings, etc are correct · 37a7d8b0

Authored by Avi Kivity on Jan 05, 2007

Signed-off-by: Avi Kivity <avi@qumranet.com>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

[PATCH] KVM: MMU: Detect oom conditions and propagate error to userspace · e2dec939

Authored by Avi Kivity on Jan 05, 2007

Signed-off-by: Avi Kivity <avi@qumranet.com>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

[PATCH] KVM: MMU: Replace atomic allocations by preallocated objects · 714b93da

Authored by Avi Kivity on Jan 05, 2007

The mmu sometimes needs memory for reverse mapping and parent pte chains.
However, we can't allocate from within the mmu because of the atomic context.

So, move the allocations to a central place that can be executed before the
main mmu machinery, where we can bail out on failure before any damage is
done.

(Error handling is deferred for now, but the basic structure is there.)
Signed-off-by: Avi Kivity <avi@qumranet.com>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
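
The pattern described here, filling a pool where failure is still harmless and only drawing from it later, can be sketched as below. `ObjectCache`, `topup`, and `take` are hypothetical names chosen for this Python illustration and are not claimed to match the patch.

```python
class ObjectCache:
    """Pool that is filled ahead of time and drained without allocating."""

    def __init__(self, factory, minimum: int):
        self.factory = factory
        self.minimum = minimum
        self.objects = []

    def topup(self) -> None:
        """May fail (for example with MemoryError); call before the atomic section."""
        while len(self.objects) < self.minimum:
            self.objects.append(self.factory())

    def take(self):
        """Never allocates, so it is safe where allocation is not allowed."""
        return self.objects.pop()
```

A caller would run `topup()` at the start of fault handling, bail out if it raises, and use only `take()` once inside the mmu proper.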

[PATCH] KVM: MMU: Treat user-mode faults as a hint that a page is no longer a page table · 14364656

Authored by Avi Kivity on Jan 05, 2007

Signed-off-by: Avi Kivity <avi@qumranet.com>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

[PATCH] KVM: MMU: oom handling · ebeace86

Authored by Avi Kivity on Jan 05, 2007

When beginning to process a page fault, make sure we have enough shadow pages
available to service the fault.  If not, free some pages.
Signed-off-by: Avi Kivity <avi@qumranet.com>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

[PATCH] KVM: MMU: Let the walker extract the target page gfn from the pte · 815af8d4

Authored by Avi Kivity on Jan 05, 2007

This fixes a problem where set_pte_common() looked for shadowed pages based on
the page directory gfn (a huge page) instead of the actual gfn being mapped.
Signed-off-by: Avi Kivity <avi@qumranet.com>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

[PATCH] KVM: MMU: Write protect guest pages when a shadow is created for them · 374cbac0

Authored by Avi Kivity on Jan 05, 2007

When we cache a guest page table into a shadow page table, we need to prevent
further access to that page by the guest, as that would render the cache
incoherent.
Signed-off-by: Avi Kivity <avi@qumranet.com>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

[PATCH] KVM: MMU: Shadow page table caching · cea0f0e7

Authored by Avi Kivity on Jan 05, 2007

Define a hashtable for caching shadow page tables. Look up the cache on
context switch (cr3 change) or during page faults.

The key to the cache is a combination of
- the guest page table frame number
- the number of paging levels in the guest
   * we can cache real mode, 32-bit mode, pae, and long mode page
     tables simultaneously.  this is useful for smp bootup.
- the guest page table level
   * some kernels use a page as both a page table and a page directory.  this
     allows multiple shadow pages to exist for that page, one per level
- the "quadrant"
   * 32-bit mode page tables span 4MB, whereas a shadow page table spans
     2MB.  similarly, a 32-bit page directory spans 4GB, while a shadow
     page directory spans 1GB.  the quadrant allows caching up to 4 shadow page
     tables for one guest page in one level.
- a "metaphysical" bit
   * for real mode, and for pse pages, there is no guest page table, so set
     the bit to avoid write protecting the page.
Signed-off-by: Avi Kivity <avi@qumranet.com>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
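
The components of the key listed above can be put together as a composite lookup key. The sketch below is a Python model with hypothetical names (`Role`, `quadrant`); the quadrant arithmetic is derived from the sizes quoted in the message (10 index bits per level for 32-bit guest tables, 9 for shadow tables) and is an assumption about how such a value could be computed, not a copy of the patch.

```python
from typing import NamedTuple

PAGE_SHIFT = 12
SHADOW_INDEX_BITS = 9    # a shadow table holds 512 entries
GUEST32_INDEX_BITS = 10  # a 32-bit guest table holds 1024 entries


class Role(NamedTuple):
    gfn: int             # guest page table frame number
    glevels: int         # paging levels in the guest (0 could stand for real mode)
    level: int           # level this shadow page represents
    quadrant: int        # which slice of the guest table it covers
    metaphysical: bool   # True when there is no guest page table behind it


def quadrant(gaddr: int, level: int, glevels: int) -> int:
    """Which part of a 32-bit guest table the shadow page at `level` covers."""
    if glevels != 2:
        return 0
    q = gaddr >> (PAGE_SHIFT + SHADOW_INDEX_BITS * level)
    return q & ((1 << ((GUEST32_INDEX_BITS - SHADOW_INDEX_BITS) * level)) - 1)


shadow_cache = {}  # Role -> shadow page (any object)
```

At level 1 the quadrant is a single bit (two 2MB halves of a 4MB guest table); at level 2 it takes two bits (four 1GB quarters of the 4GB directory), which is where the limit of 4 shadow pages per guest page comes from.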

[PATCH] KVM: MMU: Make kvm_mmu_alloc_page() return a kvm_mmu_page pointer · 25c0de2c

Authored by Avi Kivity on Jan 05, 2007

This allows further manipulation on the shadow page table.
Signed-off-by: Avi Kivity <avi@qumranet.com>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

[PATCH] KVM: MMU: Make the shadow page tables also special-case pae · aef3d3fe

Authored by Avi Kivity on Jan 05, 2007

Signed-off-by: Avi Kivity <avi@qumranet.com>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

[PATCH] KVM: MMU: Use the guest pdptrs instead of mapping cr3 in pae mode · 1b0973bd

Authored by Avi Kivity on Jan 05, 2007

This lets us not write protect a partial page, and is anyway what a real
processor does.
Signed-off-by: Avi Kivity <avi@qumranet.com>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

[PATCH] KVM: MMU: Fold fetch_guest() into init_walker() · ac79c978

Authored by Avi Kivity on Jan 05, 2007

It is never necessary to fetch a guest entry from an intermediate page table
level (except for large pages), so avoid some confusion by always descending
into the lowest possible level.

Rename init_walker() to walk_addr() as it is no longer restricted to
initialization.
Signed-off-by: Avi Kivity <avi@qumranet.com>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

[PATCH] KVM: MMU: Teach the page table walker to track guest page table gfns · 6bcbd6ab

Authored by Avi Kivity on Jan 05, 2007

Saving the table gfns removes the need to walk the guest and host page tables
in lockstep.
Signed-off-by: Avi Kivity <avi@qumranet.com>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
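
A walker that remembers the frame number of each table it visits could look like the sketch below. It models only two-level 32-bit paging without large pages, reads guest memory through a caller-supplied `read_gpa` function, and uses hypothetical names throughout; it shares nothing with the kernel's walk_addr() beyond the general idea.

```python
PAGE_SHIFT = 12
PT_PRESENT = 1


def walk_addr(read_gpa, cr3: int, vaddr: int):
    """Walk a two-level 32-bit guest page table.

    Returns (table_gfns, gfn): the frame number of every table visited,
    top level first, and the frame the address maps to (None if not present).
    """
    table_gfns = []
    table = cr3 & ~0xFFF
    for shift in (22, 12):                    # page directory, then page table
        table_gfns.append(table >> PAGE_SHIFT)
        index = (vaddr >> shift) & 0x3FF
        entry = read_gpa(table + index * 4)   # 32-bit entries are 4 bytes
        if not entry & PT_PRESENT:
            return table_gfns, None
        table = entry & ~0xFFF
    return table_gfns, table >> PAGE_SHIFT
```

With the table frame numbers recorded, later code can look up or create the matching shadow pages from the saved values instead of re-reading the guest tables while it descends the shadow tables.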

[PATCH] KVM: MMU: Implement simple reverse mapping · cd4a4e53

Authored by Avi Kivity on Jan 05, 2007

Keep in each host page frame's page->private a pointer to the shadow pte which
maps it.  If there are multiple shadow ptes mapping the page, set bit 0 of
page->private, and use the rest as a pointer to a linked list of all such
mappings.

Reverse mappings are needed because when we cache shadow page tables, we
must protect the guest page tables from being modified by the guest, as that
would invalidate the cached ptes.
Signed-off-by: Avi Kivity <avi@qumranet.com>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
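
The low-bit tagging scheme can be modelled with plain integers standing in for pointers. In this Python sketch, `Page.private` plays the role of page->private, shadow pte addresses are assumed to be at least 2-byte aligned so bit 0 is free, and the "linked list" is simplified to a Python list kept in a fake heap. All names are hypothetical.

```python
class Page:
    def __init__(self):
        self.private = 0   # 0: unmapped, even: one spte address, odd: tagged list


class Rmap:
    def __init__(self):
        self.descs = {}        # fake heap: descriptor address -> list of sptes
        self.next_addr = 0x1000

    def add(self, page: Page, spte_addr: int) -> None:
        assert spte_addr and spte_addr & 1 == 0, "spte addresses must be even"
        if page.private == 0:
            page.private = spte_addr                    # first mapping: store directly
        elif page.private & 1 == 0:
            addr = self.next_addr                       # second mapping: switch to a list
            self.next_addr += 0x40
            self.descs[addr] = [page.private, spte_addr]
            page.private = addr | 1                     # bit 0 marks "this is a list"
        else:
            self.descs[page.private & ~1].append(spte_addr)

    def sptes(self, page: Page) -> list:
        if page.private == 0:
            return []
        if page.private & 1 == 0:
            return [page.private]
        return list(self.descs[page.private & ~1])
```

The common case of a page mapped by a single shadow pte needs no extra memory; the list is only built once a second mapping appears.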

31 Dec, 2006 1 commit

[PATCH] KVM: Simplify is_long_mode() · a9058ecd

Authored by Avi Kivity on Dec 29, 2006

Instead of doing tricky stuff with the arch dependent virtualization
registers, take a peek at the guest's efer.

This simplifies some code, and fixes some confusion in the mmu branch.
Signed-off-by: Avi Kivity <avi@qumranet.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
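
Reading the mode from the guest's EFER value reduces the check to a single bit test. The message does not say which EFER bit the patch examines; the Python sketch below assumes the architectural "long mode active" bit (LMA, bit 10), and the function signature is hypothetical.

```python
EFER_LME = 1 << 8    # long mode enable
EFER_LMA = 1 << 10   # long mode active


def is_long_mode(guest_efer: int) -> bool:
    """Assumption: the guest is in long mode when EFER.LMA is set."""
    return bool(guest_efer & EFER_LMA)
```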

14 Dec, 2006 1 commit

[PATCH] KVM: MMU: Ignore pcd, pwt, and pat bits on ptes · 8c7bb723

Authored by Avi Kivity on Dec 13, 2006

The pcd, pwt, and pat bits on page table entries affect the cpu cache.  Since
the cache is a host resource, the guest should not be able to control it.
Moreover, the meaning of these bits changes depending on whether pat is
enabled or not.

So, force these bits to zero on shadow page table entries at all times.
Signed-off-by: Avi Kivity <avi@qumranet.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
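
Forcing the bits to zero amounts to a mask applied when guest pte bits are copied into a shadow pte. The Python sketch below uses the architectural positions for a 4KB pte (PWT bit 3, PCD bit 4, PAT bit 7); in large page entries the PAT bit sits elsewhere, which this sketch ignores. `shadow_bits` is a hypothetical helper name.

```python
PT_PWT = 1 << 3   # page-level write-through
PT_PCD = 1 << 4   # page-level cache disable
PT_PAT = 1 << 7   # page attribute table index (4KB ptes only)
CACHE_CONTROL_BITS = PT_PWT | PT_PCD | PT_PAT


def shadow_bits(guest_pte: int) -> int:
    """Copy guest pte bits, dropping everything that affects host caching."""
    return guest_pte & ~CACHE_CONTROL_BITS
```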

11 Dec, 2006 1 commit

[PATCH] kvm: userspace interface · 6aa8b732

Authored by Avi Kivity on Dec 10, 2006

web site: http://kvm.sourceforge.net

mailing list: kvm-devel@lists.sourceforge.net
  (http://lists.sourceforge.net/lists/listinfo/kvm-devel)

The following patchset adds a driver for Intel's hardware virtualization
extensions to the x86 architecture.  The driver adds a character device
(/dev/kvm) that exposes the virtualization capabilities to userspace.  Using
this driver, a process can run a virtual machine (a "guest") in a fully
virtualized PC containing its own virtual hard disks, network adapters, and
display.

Using this driver, one can start multiple virtual machines on a host.

Each virtual machine is a process on the host; a virtual cpu is a thread in
that process.  kill(1), nice(1), top(1) work as expected.  In effect, the
driver adds a third execution mode to the existing two: we now have kernel
mode, user mode, and guest mode.  Guest mode has its own address space mapping
guest physical memory (which is accessible to user mode by mmap()ing
/dev/kvm).  Guest mode has no access to any I/O devices; any such access is
intercepted and directed to user mode for emulation.

The driver supports i386 and x86_64 hosts and guests.  All combinations are
allowed except x86_64 guest on i386 host.  For i386 guests and hosts, both pae
and non-pae paging modes are supported.

SMP hosts and UP guests are supported.  At the moment only Intel
hardware is supported, but AMD virtualization support is being worked on.

Performance currently is non-stellar due to the naive implementation of the
mmu virtualization, which throws away most of the shadow page table entries
every context switch.  We plan to address this in two ways:

- cache shadow page tables across tlb flushes
- wait until AMD and Intel release processors with nested page tables

Currently a virtual desktop is responsive but consumes a lot of CPU.  Under
Windows I tried playing pinball and watching a few flash movies; with a recent
CPU one can hardly feel the virtualization.  Linux/X is slower, probably due
to X being in a separate process.

In addition to the driver, you need a slightly modified qemu to provide I/O
device emulation and the BIOS.

Caveats (akpm: might no longer be true):

- The Windows install currently bluescreens due to a problem with the
  virtual APIC.  We are working on a fix.  A temporary workaround is to
  use an existing image or install through qemu
- Windows 64-bit does not work.  That's also true for qemu, so it's
  probably a problem with the device model.

[bero@arklinux.org: build fix]
[simon.kagstrom@bth.se: build fix, other fixes]
[uril@qumranet.com: KVM: Expose interrupt bitmap]
[akpm@osdl.org: i386 build fix]
[mingo@elte.hu: i386 fixes]
[rdreier@cisco.com: add log levels to all printks]
[randy.dunlap@oracle.com: Fix sparse NULL and C99 struct init warnings]
[anthony@codemonkey.ws: KVM: AMD SVM: 32-bit host support]
Signed-off-by: Yaniv Kamay <yaniv@qumranet.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
Cc: Simon Kagstrom <simon.kagstrom@bth.se>
Cc: Bernhard Rosenkraenzer <bero@arklinux.org>
Signed-off-by: Uri Lublin <uril@qumranet.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Roland Dreier <rolandd@cisco.com>
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Anthony Liguori <anthony@codemonkey.ws>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
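
For a feel of what talking to the character device looks like from userspace, the Python sketch below opens /dev/kvm and asks for the API version. It uses the ioctl number from the present-day KVM API (`KVM_GET_API_VERSION`, `_IO(0xAE, 0x00)`), which is an assumption here: the interface introduced by this 2006 patchset may differ. The function returns None when the device is missing or inaccessible.

```python
import fcntl
import os

KVMIO = 0xAE
KVM_GET_API_VERSION = (KVMIO << 8) | 0x00   # _IO(KVMIO, 0x00) in the current API


def kvm_api_version(path: str = "/dev/kvm"):
    """Return the KVM API version, or None if the device cannot be used."""
    try:
        fd = os.open(path, os.O_RDWR)
    except OSError:
        return None
    try:
        return fcntl.ioctl(fd, KVM_GET_API_VERSION, 0)
    except OSError:
        return None
    finally:
        os.close(fd)


if __name__ == "__main__":
    print("KVM API version:", kvm_api_version())
```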