Commits · 2b3c246a682c50f5415c71fc5387a114a6f0d643 · openeuler / raspberrypi-kernel

26 Sep, 2011 1 commit

KVM: Make coalesced mmio use a device per zone · 2b3c246a

Authored by Sasha Levin on Jul 20, 2011

This patch changes coalesced mmio to create one mmio device per
zone instead of handling all zones in one device.

Doing so enables us to take advantage of existing locking and prevents
a race condition between coalesced mmio registration/unregistration
and lookups.
Suggested-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Sasha Levin <levinsasha928@gmail.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

2b3c246a

24 Jul, 2011 1 commit

KVM: MMU: filter out the mmio pfn from the fault pfn · fce92dce

Authored by Xiao Guangrong on Jul 12, 2011

If the page fault is caused by mmio, the gfn cannot be found in memslots, and
'bad_pfn' is returned on the gfn_to_hva path, so we can use 'bad_pfn' to identify
the mmio page fault.
And, to clarify the meaning of mmio pfn, we return the fault page instead of the bad
page when the gfn is not allowed to prefetch.
Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

fce92dce

14 Jul, 2011 1 commit

KVM: Steal time implementation · c9aaa895

Authored by Glauber Costa on Jul 11, 2011

To implement steal time, we need the hypervisor to pass the guest
information about how much time was spent running other processes
outside the VM, while the vcpu had meaningful work to do - halt
time does not count.

This information is acquired through the run_delay field of
delayacct/schedstats infrastructure, that counts time spent in a
runqueue but not running.

Steal time is per-cpu information, so the traditional MSR-based
infrastructure is used. A new msr, KVM_MSR_STEAL_TIME, holds the
address of the memory area containing information about steal time.

This patch contains the hypervisor part of the steal time infrastructure,
and can be backported independently of the guest portion.

[avi, yongjie: export delayacct_on, to avoid build failures in some configs]
Signed-off-by: Glauber Costa <glommer@redhat.com>
Tested-by: Eric B Munson <emunson@mgebm.net>
CC: Rik van Riel <riel@redhat.com>
CC: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
CC: Peter Zijlstra <peterz@infradead.org>
CC: Anthony Liguori <aliguori@us.ibm.com>
Signed-off-by: Yongjie Ren <yongjie.ren@intel.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

c9aaa895
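
The steal time message above says the value comes from run_delay, which counts time spent on a runqueue but not running. A minimal sketch of that accumulation might look like the following; the struct and function names are hypothetical and are not taken from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical per-vcpu bookkeeping; field names are illustrative only. */
struct steal_state {
    uint64_t steal;          /* total stolen time reported to the guest */
    uint64_t last_run_delay; /* run_delay value seen at the previous update */
};

/* Add the runqueue wait accumulated since the last update to the total. */
static void steal_update(struct steal_state *st, uint64_t run_delay)
{
    st->steal += run_delay - st->last_run_delay;
    st->last_run_delay = run_delay;
}
```

Only the delta since the previous update is added, so calling the helper twice with the same run_delay leaves the total unchanged.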

12 Jul, 2011 1 commit

KVM: introduce kvm_read_guest_cached · e03b644f

Authored by Gleb Natapov on Jul 11, 2011

Introduce a kvm_read_guest_cached() function in addition to the write one we
already have.

[ by glauber: export function signature in kvm header ]
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Glauber Costa <glommer@redhat.com>
Acked-by: Rik van Riel <riel@redhat.com>
Tested-by: Eric Munson <emunson@mgebm.net>
Signed-off-by: Avi Kivity <avi@redhat.com>

e03b644f

22 May, 2011 2 commits

KVM: make guest mode entry to be rcu quiescent state · 8fa22068

Authored by Gleb Natapov on May 04, 2011

KVM does not hold any references to rcu protected data when it switches
CPU into a guest mode. In fact switching to a guest mode is very similar
to exiting to userspace from rcu point of view. In addition CPU may stay
in a guest mode for quite a long time (up to one time slice). Let's treat
guest mode as quiescent state, just like we do with user-mode execution.
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

8fa22068

KVM: Use pci_store/load_saved_state() around VM device usage · f8fcfd77

Authored by Alex Williamson on May 10, 2011

Store the device saved state so that we can reload the device back
to the original state when it's unassigned.  This has the benefit
that the state survives across pci_reset_function() calls via
the PCI sysfs reset interface while the VM is using the device.
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Acked-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>

f8fcfd77

11 May, 2011 4 commits

KVM: Fix off by one in kvm_for_each_vcpu iteration · b42fc3cb

Authored by Jeff Mahoney on Apr 12, 2011

This patch avoids gcc issuing the following warning when KVM_MAX_VCPUS=1:
warning: array subscript is above array bounds

kvm_for_each_vcpu currently checks to see if the index for the vcpu is
valid /after/ loading it. We don't run into problems because the address
is still inside the enclosing struct kvm and we never dereference or write
to it, so this isn't a security issue.

The warning occurs when KVM_MAX_VCPUS=1 because the increment portion of
the loop will *always* cause the loop to load an invalid location since
++idx will always be > 0.

This patch moves the load so that the check occurs before the load and
we don't run into the compiler warning.
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

b42fc3cb
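
The kvm_for_each_vcpu message above describes moving the load so that the index check happens first. A simplified sketch of a loop with that shape is shown below; the types and the MAX_VCPUS constant are stand-ins for illustration, not the kernel's definitions.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_VCPUS 1 /* hypothetical stand-in for KVM_MAX_VCPUS */

struct vcpu { int id; };

struct vm {
    int online_vcpus;
    struct vcpu *vcpus[MAX_VCPUS];
};

/* Check the index first and only then load the slot, so that
 * vcpus[MAX_VCPUS] is never read, even after the final increment. */
#define for_each_vcpu(idx, vcpup, vm)                 \
    for ((idx) = 0;                                   \
         (idx) < (vm)->online_vcpus &&                \
         ((vcpup) = (vm)->vcpus[(idx)]) != NULL;      \
         (idx)++)

/* Count the vcpus visited by the loop. */
static int count_vcpus(struct vm *vm)
{
    int idx, n = 0;
    struct vcpu *v;

    for_each_vcpu(idx, v, vm) {
        (void)v; /* the body would normally use the vcpu here */
        n++;
    }
    return n;
}
```

Because the bounds test short-circuits before the array access, the loop condition itself guards every load.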

KVM: 16-byte mmio support · cef4dea0

Authored by Avi Kivity on Jan 20, 2010

Since sse instructions can issue 16-byte mmios, we need to support them. We
can't increase the kvm_run mmio buffer size to 16 bytes without breaking
compatibility, so instead we break the large mmios into two smaller 8-byte
ones. Since the bus is 64-bit we aren't breaking any atomicity guarantees.
Signed-off-by: Avi Kivity <avi@redhat.com>

cef4dea0
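
The 16-byte mmio message above describes breaking one large access into smaller 8-byte ones. The sketch below illustrates the splitting idea only; the callback type, the logging helper, and all names are assumptions made for the example and do not reflect the kernel code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MMIO_CHUNK ((size_t)8) /* largest access the existing buffer can carry */

/* Hypothetical callback that performs one mmio write of at most 8 bytes. */
typedef void (*mmio_write_fn)(uint64_t addr, const uint8_t *data,
                              size_t len, void *ctx);

/* Split a write into chunks no larger than MMIO_CHUNK.
 * Returns the number of chunks issued. */
static int mmio_write_split(uint64_t addr, const uint8_t *data, size_t len,
                            mmio_write_fn write_one, void *ctx)
{
    int chunks = 0;

    while (len > 0) {
        size_t now = len < MMIO_CHUNK ? len : MMIO_CHUNK;

        write_one(addr, data, now, ctx);
        addr += now;
        data += now;
        len -= now;
        chunks++;
    }
    return chunks;
}

/* Example callback: records the last address and the total bytes seen. */
struct mmio_log { uint64_t last_addr; size_t total; };

static void log_write(uint64_t addr, const uint8_t *data, size_t len, void *ctx)
{
    struct mmio_log *log = ctx;

    (void)data;
    log->last_addr = addr;
    log->total += len;
}
```

A 16-byte write at 0x1000 would be issued as two 8-byte writes, at 0x1000 and 0x1008.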

Revert "KVM: Fix race between nmi injection and enabling nmi window" · c761e586

Authored by Marcelo Tosatti on Apr 01, 2011

This reverts commit f8636849.

Simpler fix to follow.
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

c761e586

KVM: cleanup memslot_id function · 0ee8dcb8

Authored by Xiao Guangrong on Mar 09, 2011

We can get the memslot id from memslot->id directly.
Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

0ee8dcb8

18 Mar, 2011 5 commits

KVM: Fix race between nmi injection and enabling nmi window · f8636849

Authored by Avi Kivity on Feb 03, 2011

The interrupt injection logic looks something like

  if an nmi is pending, and nmi injection allowed
    inject nmi
  if an nmi is pending
    request exit on nmi window

the problem is that "nmi is pending" can be set asynchronously by
the PIT; if it happens to fire between the two if statements, we
will request an nmi window even though nmi injection is allowed.  On
SVM, this has disastrous results, since it causes eflags.TF to be
set in random guest code.

The fix is simple; make nmi_pending synchronous using the standard
vcpu->requests mechanism; this ensures the code above is completely
synchronous wrt nmi_pending.
Signed-off-by: Avi Kivity <avi@redhat.com>

f8636849

KVM: use yield_to instead of sleep in kvm_vcpu_on_spin · 217ece61

Authored by Rik van Riel on Feb 01, 2011

Instead of sleeping in kvm_vcpu_on_spin, which can cause gigantic
slowdowns of certain workloads, we instead use yield_to to get
another VCPU in the same KVM guest to run sooner.

This seems to give a 10-15% speedup in certain workloads.
Signed-off-by: Rik van Riel <riel@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

217ece61

KVM: keep track of which task is running a KVM vcpu · 34bb10b7

Authored by Rik van Riel on Feb 01, 2011

Keep track of which task is running a KVM vcpu.  This helps us
figure out later what task to wake up if we want to boost a
vcpu that got preempted.

Unfortunately there are no guarantees that the same task
always keeps the same vcpu, so we can only track the task
across a single "run" of the vcpu.
Signed-off-by: Rik van Riel <riel@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

34bb10b7

KVM: make make_all_cpus_request() lockless · 3cba4130

Authored by Xiao Guangrong on Jan 12, 2011

Now that we have 'vcpu->mode' to judge whether we need to send an IPI to
other cpus, and this check is exact, checking the request bit is needless,
so we can drop the spinlock and make the function lockless.
Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

3cba4130

KVM: Add "exiting guest mode" state · 6b7e2d09

Authored by Xiao Guangrong on Jan 12, 2011

Currently we keep track of only two states: guest mode and host
mode.  This patch adds an "exiting guest mode" state that tells
us that an IPI will happen soon, so unless we need to wait for the
IPI, we can avoid it completely.

Also:
1: No need to read/write ->mode atomically in the vcpu's thread

2: reorganize struct kvm_vcpu to keep ->mode and ->requests
   explicitly in the same cache line
Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

6b7e2d09
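
The "exiting guest mode" message above describes a third state that lets a caller skip an IPI when one is already on its way. A minimal sketch of that idea, using C11 atomics rather than kernel primitives, could look like this; the enum values, struct, and function name are assumptions for illustration.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Three-state model: the names here are illustrative. */
enum vcpu_mode { OUTSIDE_GUEST_MODE, IN_GUEST_MODE, EXITING_GUEST_MODE };

struct vcpu { _Atomic int mode; };

/* Returns true only if this call moved the vcpu from IN_GUEST_MODE to
 * EXITING_GUEST_MODE, i.e. the caller is the one who must send the IPI.
 * If the vcpu is already exiting or is outside guest mode, no IPI is needed. */
static bool vcpu_kick_needs_ipi(struct vcpu *v)
{
    int expected = IN_GUEST_MODE;

    return atomic_compare_exchange_strong(&v->mode, &expected,
                                          EXITING_GUEST_MODE);
}
```

A second kick that arrives while the vcpu is still exiting finds the state already changed and returns false, which is the saving the message describes for callers that do not need to wait for the IPI.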

12 Jan, 2011 13 commits

KVM: Fix build error on s390 due to missing tlbs_dirty · 5c663a15

Authored by Avi Kivity on Dec 08, 2010

Make it available for all archs.
Signed-off-by: Avi Kivity <avi@redhat.com>

5c663a15

KVM: MMU: Make the way of accessing lpage_info more generic · d4dbf470

Authored by Takuya Yoshikawa on Dec 07, 2010

Large page information has two elements, but only one of them, write_count,
is accessed through a helper function.

This patch replaces this helper function with a more generic one that returns
the newly named kvm_lpage_info structure, and uses it to access the other
element, rmap_pde.
Signed-off-by: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp>
Signed-off-by: Avi Kivity <avi@redhat.com>

d4dbf470

KVM: MMU: delay flush all tlbs on sync_page path · a4ee1ca4

Authored by Xiao Guangrong on Nov 23, 2010

Quote from Avi:
| I don't think we need to flush immediately; set a "tlb dirty" bit somewhere
| that is cleared when we flush the tlb.  kvm_mmu_notifier_invalidate_page()
| can consult the bit and force a flush if set.
Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

a4ee1ca4
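
The quote above suggests recording that a flush is owed and letting a later path act on it. A toy sketch of that deferral is given below; it uses a plain boolean and invented names, and ignores the concurrency concerns the real code has to handle.

```c
#include <assert.h>
#include <stdbool.h>

struct vm_tlb {
    bool tlb_dirty; /* set when a flush has been deferred */
    int flushes;    /* number of flushes actually performed */
};

/* Perform the flush and clear the deferred marker. */
static void flush_tlbs(struct vm_tlb *vm)
{
    vm->flushes++;
    vm->tlb_dirty = false;
}

/* sync_page path: note that a flush is owed instead of flushing now. */
static void sync_page_deferred(struct vm_tlb *vm)
{
    vm->tlb_dirty = true;
}

/* invalidate_page path: flush if something was unmapped or a flush is owed. */
static void invalidate_page(struct vm_tlb *vm, bool unmapped)
{
    if (unmapped || vm->tlb_dirty)
        flush_tlbs(vm);
}
```

Several deferred requests collapse into a single flush when the invalidate path next runs.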

KVM: PPC: Fix compile warning · 27923eb1

Authored by Alexander Graf on Nov 25, 2010

KVM compilation fails with the following warning:

include/linux/kvm_host.h: In function 'kvm_irq_routing_update':
include/linux/kvm_host.h:679:2: error: 'struct kvm' has no member named 'irq_routing'

That function is only used and reasonable to have on systems that implement
an in-kernel interrupt chip. PPC doesn't.

Fix by #ifdef'ing it out when no irqchip is available.
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>

27923eb1

KVM: fast-path msi injection with irqfd · bd2b53b2

Authored by Michael S. Tsirkin on Nov 18, 2010

Store irq routing table pointer in the irqfd object,
and use that to inject MSI directly without bouncing out to
a kernel thread.

While we touch this structure, rearrange irqfd fields to make fastpath
better packed for better cache utilization.

This also adds some comments about locking rules and rcu usage in code.

Some notes on the design:
- Use pointer into the rt instead of copying an entry,
  to make it possible to use rcu, thus side-stepping
  locking complexities.  We also save some memory this way.
- Old workqueue code is still used for level irqs.
  I don't think we DTRT with level anyway, however,
  it seems easier to keep the code around as
  it has been thought through and debugged, and fix level later than
  rip out and re-instate it later.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Marcelo Tosatti <mtosatti@redhat.com>
Acked-by: Gregory Haskins <ghaskins@novell.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

bd2b53b2

KVM: Refactor IRQ names of assigned devices · 1e001d49

Authored by Jan Kiszka on Nov 16, 2010

Cosmetic change, but it helps to correlate IRQs with PCI devices.
Acked-by: Alex Williamson <alex.williamson@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

1e001d49

KVM: Switch assigned device IRQ forwarding to threaded handler · 0645211c

Authored by Jan Kiszka on Nov 16, 2010

This improves the IRQ forwarding for assigned devices: By using the
kernel's threaded IRQ scheme, we can get rid of the latency-prone work
queue and simplify the code in the same run.

Moreover, we no longer have to hold assigned_dev_lock while raising the
guest IRQ, which can be a lengthy operation as we may have to iterate
over all VCPUs. The lock is now only used for synchronizing masking vs.
unmasking of INTx-type IRQs, so it is renamed to intx_lock.
Acked-by: Alex Williamson <alex.williamson@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

0645211c

KVM: Clean up vm creation and release · d89f5eff

Authored by Jan Kiszka on Nov 09, 2010

IA64 support forces us to abstract the allocation of the kvm structure.
But instead of mixing this up with arch-specific initialization and
doing the same on destruction, split both steps. This allows moving
generic destruction calls into generic code.

It also fixes error clean-up on failures of kvm_create_vm for IA64.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

d89f5eff

KVM: pre-allocate one more dirty bitmap to avoid vmalloc() · 515a0127

Authored by Takuya Yoshikawa on Oct 27, 2010

Currently x86's kvm_vm_ioctl_get_dirty_log() needs to allocate a bitmap with
vmalloc() for use in the next round of logging, and this has been hurting
VGA updates and live migration: vmalloc() consumes extra system time,
triggers tlb flushes, etc.

This patch resolves this issue by pre-allocating one more bitmap and switching
between two bitmaps during dirty logging.

Performance improvement:
  I measured performance for the case of VGA update by trace-cmd.
  The result was 1.5 times faster than the original one.

  In the case of live migration, the improvement ratio depends on the workload
  and the guest memory size. In general, the larger the memory size is the more
  benefits we get.

Note:
  This does not change other architectures' logic, but the allocation size
  doubles. This will increase the actual memory consumption only when
  the new size changes the number of pages allocated by vmalloc().
Signed-off-by: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp>
Signed-off-by: Fernando Luis Vazquez Cao <fernando@oss.ntt.co.jp>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

515a0127
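
The dirty bitmap message above describes pre-allocating a second bitmap and switching between the two. The following sketch shows that double-buffering pattern in isolation; the fixed-size arrays and all names are assumptions for the example, and the real code allocates the bitmaps dynamically.

```c
#include <assert.h>
#include <string.h>

#define BITMAP_LONGS 4 /* illustrative size */

struct dirty_log {
    unsigned long bufs[2][BITMAP_LONGS]; /* both bitmaps exist up front */
    int active;                          /* index currently receiving dirty bits */
};

/* Record that a page was written. */
static void mark_dirty(struct dirty_log *log, unsigned int page)
{
    unsigned int bits = 8 * sizeof(unsigned long);

    log->bufs[log->active][page / bits] |= 1UL << (page % bits);
}

/* Switch logging to the other bitmap (cleared first) and return the one
 * that was active, so the caller can report it without allocating. */
static unsigned long *dirty_log_swap(struct dirty_log *log)
{
    int old = log->active;
    int next = old ^ 1;

    memset(log->bufs[next], 0, sizeof(log->bufs[next]));
    log->active = next;
    return log->bufs[old];
}
```

Each call to the swap helper hands back a stable snapshot while new writes land in the other bitmap, so no allocation is needed on the logging path.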

KVM: propagate fault r/w information to gup(), allow read-only memory · 612819c3

Authored by Marcelo Tosatti on Oct 22, 2010

As suggested by Andrea, pass the r/w error code to gup(), upgrading a read
fault to writable if the host pte allows it.
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

612819c3

KVM: Add PV MSR to enable asynchronous page faults delivery. · 344d9588

Authored by Gleb Natapov on Oct 14, 2010

The guest enables async PF vcpu functionality using this MSR.
Reviewed-by: Rik van Riel <riel@redhat.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

344d9588

KVM: Add memory slot versioning and use it to provide fast guest write interface · 49c7754c

Authored by Gleb Natapov on Oct 18, 2010

Keep track of memslots changes by keeping a generation number in the
memslots structure. Provide a kvm_write_guest_cached() function that skips
the gfn_to_hva() translation if memslots have not changed since the
previous invocation.
Acked-by: Rik van Riel <riel@redhat.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

49c7754c
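
The memslot versioning message above describes skipping a translation when a generation number has not changed. A small sketch of that caching pattern follows; the structures, the linear stand-in translation, and the counter are all hypothetical and exist only to make the behaviour observable.

```c
#include <assert.h>
#include <stdint.h>

/* Simplified memslots: a generation counter plus a base address. */
struct memslots {
    uint64_t generation;
    uint64_t base_hva;
};

/* Cached translation for one guest physical address. */
struct gfn_cache {
    uint64_t generation; /* generation the cached hva belongs to */
    uint64_t gpa;
    uint64_t hva;
    int translations;    /* how many slow translations were performed */
};

/* Stand-in for the slow translation (hypothetical linear mapping). */
static uint64_t slow_translate(const struct memslots *s, uint64_t gpa)
{
    return s->base_hva + gpa;
}

static void cache_init(struct gfn_cache *c, const struct memslots *s,
                       uint64_t gpa)
{
    c->gpa = gpa;
    c->generation = s->generation;
    c->hva = slow_translate(s, gpa);
    c->translations = 1;
}

/* Reuse the cached address unless the memslots generation has moved on. */
static uint64_t cached_hva(struct gfn_cache *c, const struct memslots *s)
{
    if (c->generation != s->generation) {
        c->hva = slow_translate(s, c->gpa);
        c->generation = s->generation;
        c->translations++;
    }
    return c->hva;
}
```

Whoever modifies the memslots only has to bump the generation; every cache notices the mismatch on its next use and refreshes itself.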

KVM: Halt vcpu if page it tries to access is swapped out · af585b92

Authored by Gleb Natapov on Oct 14, 2010

If a guest accesses swapped out memory do not swap it in from vcpu thread
context. Schedule work to do swapping and put vcpu into halted state
instead.

Interrupts will still be delivered to the guest, and if an interrupt
causes a reschedule, the guest will continue by running another task.

[avi: remove call to get_user_pages_noio(), nacked by Linus; this
      makes everything synchronous again]
Acked-by: Rik van Riel <riel@redhat.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

af585b92

24 Oct, 2010 7 commits

KVM: Fix signature of kvm_iommu_map_pages stub · d7a79b6c

Authored by Jan Kiszka on Oct 14, 2010

Breaks otherwise if CONFIG_IOMMU_API is not set.

KVM-Stable-Tag.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

d7a79b6c

KVM: x86: Rename timer function · 34c238a1

Authored by Zachary Amsden on Sep 18, 2010

This just changes some names to better reflect the usage they
will be given.  Separated out to keep confusion to a minimum.
Signed-off-by: Zachary Amsden <zamsden@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

34c238a1

KVM: Check for pending events before attempting injection · 3842d135

Authored by Avi Kivity on Jul 27, 2010

Instead of blindly attempting to inject an event before each guest entry,
check for a possible event first in vcpu->requests.  Sites that can trigger
event injection are modified to set KVM_REQ_EVENT:

- interrupt, nmi window opening
- ppr updates
- i8259 output changes
- local apic irr changes
- rflags updates
- gif flag set
- event set on exit

This improves non-injecting entry performance, and sets the stage for
non-atomic injection.
Signed-off-by: Avi Kivity <avi@redhat.com>

3842d135

KVM: MMU: Add infrastructure for two-level page walker · c30a358d

Authored by Joerg Roedel on Sep 10, 2010

This patch introduces an mmu-callback to translate gpa
addresses in the walk_addr code. This is later used to
translate l2_gpa addresses into l1_gpa addresses.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

c30a358d

KVM: MMU: rewrite audit_mappings_page() function · 365fb3fd

Authored by Xiao Guangrong on Aug 28, 2010

There is a bug in this function: we call gfn_to_pfn() and kvm_mmu_gva_to_gpa_read() in
atomic context (kvm_mmu_audit() is called under the protection of the mmu_lock spinlock).

This patch fixes it by:
- introducing gfn_to_pfn_atomic instead of gfn_to_pfn
- getting the mapping gfn from kvm_mmu_page_get_gfn()

It also adds a check of 'notrap' ptes in unsync/direct sps.
Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

365fb3fd

KVM: MMU: introduce gfn_to_page_many_atomic() function · 48987781

Authored by Xiao Guangrong on Aug 22, 2010

Introduce this function to get the pages of consecutive gfns; it reduces
gup overhead and is used by a later patch.
Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

48987781

KVM: MMU: introduce hva_to_pfn_atomic function · 887c08ac

Authored by Xiao Guangrong on Aug 22, 2010

Introduce hva_to_pfn_atomic(); it is the fast path and can be used in atomic
context. A later patch will use it.
Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

887c08ac

20 Aug, 2010 1 commit

kvm: add __rcu annotations · 4b6a2872

Authored by Arnd Bergmann on Mar 04, 2010

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Avi Kivity <avi@redhat.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>

4b6a2872

02 Aug, 2010 2 commits

KVM: Convert mask notifiers to use irqchip/pin instead of gsi · 4a994358

Authored by Gleb Natapov on Jul 11, 2010

Devices register mask notifier using gsi, but irqchip knows about
irqchip/pin, so conversion from irqchip/pin to gsi should be done before
looking for mask notifier to call.
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

4a994358

KVM: Return EFAULT from kvm ioctl when guest accesses bad area · edba23e5

Authored by Gleb Natapov on Jul 07, 2010

Currently, if a guest accesses an address that belongs to a memory slot but
is not backed by a page, or the page is read-only, KVM treats it like an
MMIO access. Remove that capability. It was never part of the interface and
should not be relied upon.
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

edba23e5

01 Aug, 2010 2 commits

KVM: Keep slot ID in memory slot structure · e36d96f7

Authored by Avi Kivity on Jun 21, 2010

May be used for distinguishing between internal and user slots, or for sorting
slots in size order.
Signed-off-by: Avi Kivity <avi@redhat.com>

e36d96f7

KVM: Reduce atomic operations on vcpu->requests · 0719837c

Authored by Avi Kivity on May 10, 2010

Usually the vcpu->requests bitmap is sparse, so a test_and_clear_bit() for
each request generates a large number of unneeded atomics if a bit is set.

Replace with a separate test/clear sequence.  This is safe since there is
no clear_bit() outside the vcpu thread.
Signed-off-by: Avi Kivity <avi@redhat.com>

0719837c
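
The vcpu->requests message above describes replacing an atomic test-and-clear with a separate test followed by a clear. A sketch of that pattern with C11 atomics is shown below; the function name and the use of stdatomic instead of kernel bit operations are assumptions for illustration. As the message notes, the split is only safe when bits are cleared from a single thread.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Returns true and clears the request bit if it was set.
 * Assumes only the calling thread ever clears bits in *requests. */
static bool check_request(atomic_ulong *requests, unsigned int req)
{
    unsigned long mask = 1UL << req;

    /* Cheap relaxed load first: in a sparse bitmap the bit is usually clear. */
    if (!(atomic_load_explicit(requests, memory_order_relaxed) & mask))
        return false;

    /* Pay for the atomic read-modify-write only when the bit was set. */
    atomic_fetch_and(requests, ~mask);
    return true;
}
```

Other threads may still set bits concurrently; a bit set after the load is simply picked up by the next check.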