Commits · bc6678a33d9b952981a8e44a4f876c3ad64ca4d8 · openeuler / raspberrypi-kernel

01 Mar, 2010 · 8 commits

KVM: introduce kvm->srcu and convert kvm_set_memory_region to SRCU update · bc6678a3

Authored by Marcelo Tosatti on Dec 23, 2009

Use two steps for memslot deletion: mark the slot invalid (which stops
instantiation of new shadow pages for that slot, but allows destruction),
then instantiate the new empty slot.

Also simplifies kvm_handle_hva locking.
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
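
The two-step deletion described above can be sketched in plain userspace C. This is only an illustration under stated assumptions: `slot_table`, `SLOT_INVALID`, `wait_for_readers()` and the helper names are hypothetical, and the empty `wait_for_readers()` stands in for the grace-period wait that `synchronize_srcu()` would provide in the kernel.

```c
#include <stdatomic.h>
#include <stdlib.h>
#include <string.h>

#define NSLOTS 4
#define SLOT_INVALID 1u /* hypothetical flag, named for this sketch only */

struct slot {
    unsigned long base_gfn;
    unsigned long npages;
    unsigned int flags;
};

struct slot_table {
    struct slot slots[NSLOTS];
};

/* Readers would load this pointer and work on the snapshot they got. */
static _Atomic(struct slot_table *) live_table;

/* Stand-in for a grace-period wait: a real version would block here
 * until no reader can still be using the previous table. */
static void wait_for_readers(void) {}

/* Copy the live table so it can be changed without disturbing readers. */
static struct slot_table *clone_table(void)
{
    struct slot_table *copy = calloc(1, sizeof(*copy));
    struct slot_table *cur = atomic_load(&live_table);

    if (copy && cur)
        memcpy(copy, cur, sizeof(*copy));
    return copy;
}

/* Publish a new table, then retire the old one once readers are done. */
static void publish_table(struct slot_table *next)
{
    struct slot_table *old = atomic_exchange(&live_table, next);

    wait_for_readers();
    free(old);
}

int set_slot(int id, unsigned long base_gfn, unsigned long npages)
{
    struct slot_table *next;

    if (id < 0 || id >= NSLOTS || !(next = clone_table()))
        return -1;
    next->slots[id] = (struct slot){ base_gfn, npages, 0 };
    publish_table(next);
    return 0;
}

int delete_slot(int id)
{
    struct slot_table *next;

    if (id < 0 || id >= NSLOTS)
        return -1;

    /* Step 1: publish a copy in which the slot is marked invalid. */
    if (!(next = clone_table()))
        return -1;
    next->slots[id].flags |= SLOT_INVALID;
    publish_table(next);

    /* Step 2: publish a copy in which the slot is empty. */
    if (!(next = clone_table()))
        return -1;
    memset(&next->slots[id], 0, sizeof(next->slots[id]));
    publish_table(next);
    return 0;
}

/* Number of pages currently published for a slot (0 if empty). */
unsigned long slot_npages(int id)
{
    struct slot_table *cur = atomic_load(&live_table);

    return cur ? cur->slots[id].npages : 0;
}
```

Between the two publishes, a reader can only ever see the old slot, the invalid slot, or the empty slot, never a half-updated table.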

KVM: use gfn_to_pfn_memslot in kvm_iommu_map_pages · 3ad26d81

Authored by Marcelo Tosatti on Dec 23, 2009

So it's possible to iommu map a memslot before making it visible to
kvm.
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

KVM: introduce gfn_to_pfn_memslot · 506f0d6f

Authored by Marcelo Tosatti on Dec 23, 2009

Which takes a memslot pointer instead of using kvm->memslots.

To be used by SRCU conversion later.
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

KVM: split kvm_arch_set_memory_region into prepare and commit · f7784b8e

Authored by Marcelo Tosatti on Dec 23, 2009

Required for SRCU conversion later.
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

KVM: modify memslots layout in struct kvm · 46a26bf5

Authored by Marcelo Tosatti on Dec 23, 2009

Have a pointer to an allocated region inside struct kvm.

[alex: fix ppc book 3s]
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

KVM: Simplify coalesced mmio initialization · 980da6ce

Authored by Avi Kivity on Dec 20, 2009

- add destructor function
- move related allocation into constructor
- add stubs for !CONFIG_KVM_MMIO

Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: Remove ifdefs from mmu notifier initialization · 4c07b0a4

Authored by Avi Kivity on Dec 20, 2009

Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: Disentangle mmu notifiers and coalesced_mmio registration · 283d0c65

Authored by Avi Kivity on Dec 20, 2009

They aren't related.
Signed-off-by: Avi Kivity <avi@redhat.com>

27 Dec, 2009 · 2 commits

KVM: get rid of kvm_create_vm() unused label warning on s390 · b4329db0

Authored by Heiko Carstens on Dec 18, 2009

arch/s390/kvm/../../../virt/kvm/kvm_main.c: In function 'kvm_create_vm':
arch/s390/kvm/../../../virt/kvm/kvm_main.c:409: warning: label 'out_err' defined but not used
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: Fix possible circular locking in kvm_vm_ioctl_assign_device() · fae3a353

Authored by Sheng Yang on Dec 15, 2009

One possible order is:

KVM_CREATE_IRQCHIP ioctl (takes kvm->lock) -> kvm_iobus_register_dev() ->
down_write(kvm->slots_lock).

The other one is in kvm_vm_ioctl_assign_device(), which takes kvm->slots_lock
first, then kvm->lock.

Update the comment on lock order as well.

Observed through kernel locking debug warnings.

Cc: stable@kernel.org
Signed-off-by: Sheng Yang <sheng@linux.intel.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

23 Dec, 2009 · 1 commit

anonfd: Allow making anon files read-only · 628ff7c1

Authored by Roland Dreier on Dec 18, 2009

It seems a couple of places such as arch/ia64/kernel/perfmon.c and
drivers/infiniband/core/uverbs_main.c could use anon_inode_getfile()
instead of a private pseudo-fs + alloc_file(), if only there were a way
to get a read-only file.  So provide this by having anon_inode_getfile()
create a read-only file if we pass O_RDONLY in flags.
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

03 Dec, 2009 · 9 commits

KVM: Allow internal errors reported to userspace to carry extra data · a9c7399d

Authored by Avi Kivity on Nov 04, 2009

Usually userspace will freeze the guest so we can inspect it, but some
internal state is not available.  Add extra data to internal error
reporting so we can expose it to the debugger.  Extra data is specific
to the suberror.
Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: Enable 32bit dirty log pointers on 64bit host · 6ff5894c

Authored by Arnd Bergmann on Oct 22, 2009

With big endian userspace, we can't quite figure out if a pointer
is 32 bit (shifted >> 32) or 64 bit when we read a 64 bit pointer.

This is what happens with dirty logging. To get the pointer interpreted
correctly, we thus need Arnd's patch to implement a compat layer for
the ioctl:

A better way to do this is to add a separate compat_ioctl() method that
converts this for you.

Based on initial patch from Arnd Bergmann.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
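
The ambiguity described above can be reproduced without any kernel code. The sketch below assumes the pointer lives in an 8-byte field that a 32-bit big-endian process fills with a 4-byte pointer followed by padding; the function names are hypothetical, and the byte-wise decoding is used so the example behaves the same on any host.

```c
#include <stdint.h>

/* Decode 8 bytes as one big-endian 64-bit value (what a 64-bit
 * big-endian kernel would see if it read the whole field). */
uint64_t decode_be64(const uint8_t b[8])
{
    uint64_t v = 0;

    for (int i = 0; i < 8; i++)
        v = (v << 8) | b[i];
    return v;
}

/* Decode the first 4 bytes as a big-endian 32-bit value (what the
 * 32-bit process actually stored). */
uint32_t decode_be32(const uint8_t b[4])
{
    uint32_t v = 0;

    for (int i = 0; i < 4; i++)
        v = (v << 8) | b[i];
    return v;
}

/* Compat-style conversion: read only the 32-bit pointer and widen it. */
uint64_t compat_pointer(const uint8_t field[8])
{
    return (uint64_t)decode_be32(field);
}
```

For a field holding the bytes `00 00 10 00 00 00 00 00`, reading all eight bytes yields the pointer shifted into the upper half, while the compat-style read recovers `0x1000`. A dedicated compat path knows which of the two readings applies, so it never has to guess.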

KVM: introduce kvm_vcpu_on_spin · d255f4f2

Authored by Zhai, Edwin on Oct 09, 2009

Introduce kvm_vcpu_on_spin, to be used by VMX/SVM to yield processing
once the cpu detects pause-based looping.
Signed-off-by: "Zhai, Edwin" <edwin.zhai@intel.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

KVM: Activate Virtualization On Demand · 10474ae8

Authored by Alexander Graf on Sep 15, 2009

X86 CPUs need to have some magic happening to enable the virtualization
extensions on them. This magic can result in unpleasant results for
users, like blocking other VMMs from working (vmx) or using invalid TLB
entries (svm).

Currently KVM activates virtualization when the respective kernel module
is loaded. This blocks us from autoloading KVM modules without breaking
other VMMs.

To circumvent this problem at least a bit, this patch introduces
on-demand activation of virtualization. This means that virtualization
is instead enabled on creation of the first virtual machine and
disabled on destruction of the last one.

So using this, KVM can be easily autoloaded, while keeping other
hypervisors usable.
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
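
The "enable on first VM, disable on last VM" rule amounts to a usage counter. A minimal single-threaded sketch follows; every name in it is hypothetical, `hw_enable()`/`hw_disable()` are stand-ins for the per-CPU enabling work, and real code would also need a lock around the counter.

```c
#include <stdbool.h>

static int vm_count;       /* number of live virtual machines */
static bool virt_enabled;  /* stand-in for the hardware state  */

/* Stand-ins for the real enable/disable work. */
static int hw_enable(void)   { virt_enabled = true; return 0; }
static void hw_disable(void) { virt_enabled = false; }

/* Creating the first VM turns virtualization on. */
int vm_create(void)
{
    if (vm_count == 0) {
        int err = hw_enable();

        if (err)
            return err;
    }
    vm_count++;
    return 0;
}

/* Destroying the last VM turns it off again. */
void vm_destroy(void)
{
    if (vm_count > 0 && --vm_count == 0)
        hw_disable();
}

bool virt_is_enabled(void)
{
    return virt_enabled;
}
```

Loading the module touches neither the counter nor the hardware state, which is what leaves other hypervisors usable while no VM exists.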

KVM: Move assigned device code to own file · bfd99ff5

Authored by Avi Kivity on Aug 26, 2009

Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: Drop kvm->irq_lock lock from irq injection path · 680b3648

Authored by Gleb Natapov on Aug 24, 2009

The only thing it protects now is interrupt injection into lapic and
this can work lockless. Even now with kvm->irq_lock in place access
to lapic is not entirely serialized since vcpu access doesn't take
kvm->irq_lock.
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: Move irq ack notifier list to arch independent code · 136bdfee

Authored by Gleb Natapov on Aug 24, 2009

Mask irq notifier list is already there.
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: Change irq routing table to use gsi indexed array · 46e624b9

Authored by Gleb Natapov on Aug 24, 2009

Use gsi indexed array instead of scanning all entries on each interrupt
injection.
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
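
The idea of indexing routes by GSI can be sketched as an array of per-GSI chains, so an injection only walks the entries for its own GSI. The types, the `NR_GSI` bound and the function names below are hypothetical and chosen for this illustration only.

```c
#include <stddef.h>

#define NR_GSI 24 /* hypothetical table size for this sketch */

struct route {
    unsigned int gsi;     /* interrupt number this entry routes from */
    int target;           /* opaque destination identifier           */
    struct route *next;   /* next entry for the same gsi             */
};

struct route_table {
    struct route *by_gsi[NR_GSI]; /* head of the chain for each gsi */
};

/* Link an entry into the chain for its gsi. */
int route_add(struct route_table *t, struct route *r)
{
    if (r->gsi >= NR_GSI)
        return -1;
    r->next = t->by_gsi[r->gsi];
    t->by_gsi[r->gsi] = r;
    return 0;
}

/* Constant-time lookup of the first entry for a gsi. */
const struct route *route_lookup(const struct route_table *t, unsigned int gsi)
{
    return gsi < NR_GSI ? t->by_gsi[gsi] : NULL;
}

/* Count the entries that an injection on this gsi would visit. */
int route_count(const struct route_table *t, unsigned int gsi)
{
    int n = 0;

    for (const struct route *r = route_lookup(t, gsi); r; r = r->next)
        n++;
    return n;
}
```

With a flat list, every injection would compare against all entries; here the cost depends only on how many entries share the GSI.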

KVM: Don't wrap schedule() with vcpu_put()/vcpu_load() · 45ec431c

Authored by Avi Kivity on Aug 23, 2009

Preemption notifiers will do that for us automatically.
Signed-off-by: Avi Kivity <avi@redhat.com>

05 Nov, 2009 · 1 commit

Use Little Endian for Dirty Bitmap · c8240bd6

Authored by Alexander Graf on Oct 30, 2009

We currently use host endian long types to store information
in the dirty bitmap.

This works reasonably well on Little Endian targets, because the
u32 after the first contains the next 32 bits. On Big Endian this
breaks completely though, forcing us to be inventive here.

So Ben suggested to always use Little Endian, which looks reasonable.

We only have dirty bitmap implemented in Little Endian targets so far
and since PowerPC would be the first Big Endian platform, we can just
as well switch to Little Endian always with little effort without
breaking existing targets.
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
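
A fixed little-endian bitmap layout means that bit N always lives in byte N / 8, whatever the host's word size or byte order. The helpers below are a userspace sketch of that layout with hypothetical names; they are not the kernel's bit operations.

```c
#include <stddef.h>
#include <stdint.h>

/* Set bit nr in a bitmap stored in little-endian bit order. */
void set_bit_le(uint8_t *bitmap, size_t nr)
{
    bitmap[nr / 8] |= (uint8_t)(1u << (nr % 8));
}

/* Test bit nr in the same layout; returns 0 or 1. */
int test_bit_le(const uint8_t *bitmap, size_t nr)
{
    return (bitmap[nr / 8] >> (nr % 8)) & 1;
}
```

Because the bytes are addressed directly, a consumer that treats the buffer as an array of 32-bit little-endian words finds bits 32 to 63 in the second word on every host, which is the property the message relies on.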

16 Oct, 2009 · 1 commit

KVM: Prevent kvm_init from corrupting debugfs structures · 0ea4ed8e

Authored by Darrick J. Wong on Oct 14, 2009

I'm seeing an oops condition when kvm-intel and kvm-amd are modprobe'd
during boot (say on an Intel system) and then rmmod'd:

   # modprobe kvm-intel
     kvm_init()
     kvm_init_debug()
     kvm_arch_init()  <-- stores debugfs dentries internally
     (success, etc)

   # modprobe kvm-amd
     kvm_init()
     kvm_init_debug() <-- second initialization clobbers kvm's
                          internal pointers to dentries
     kvm_arch_init()
     kvm_exit_debug() <-- and frees them

   # rmmod kvm-intel
     kvm_exit()
     kvm_exit_debug() <-- double free of debugfs files!

     *BOOM*

If execution gets to the end of kvm_init(), then the calling module has been
established as the kvm provider.  Move the debugfs initialization to the end of
the function, and remove the now-unnecessary call to kvm_exit_debug() from the
error path.  That way we avoid trampling on the debugfs entries and freeing
them twice.

Cc: stable@kernel.org
Signed-off-by: Darrick J. Wong <djwong@us.ibm.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
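
The ordering fix can be modelled in a few lines: shared debug state is initialized only after the step that can fail, so a losing provider never touches it. Everything below is a hypothetical stand-in (a counter replaces the debugfs entries, a string replaces the module), meant only to show the ordering.

```c
#include <stddef.h>

static const char *provider;  /* module that currently owns the slot */
static int debug_entries;     /* stand-in for the debugfs entries    */

/* Fails when another provider is already registered. */
static int arch_init(const char *name)
{
    if (provider)
        return -1;
    provider = name;
    return 0;
}

static void debug_init(void) { debug_entries++; }
static void debug_exit(void) { debug_entries--; }

int provider_init(const char *name)
{
    int err = arch_init(name);

    if (err)
        return err;   /* error path never touches the debug state */
    debug_init();     /* last step: only the winner reaches this  */
    return 0;
}

void provider_exit(void)
{
    debug_exit();
    provider = NULL;
}

int debug_entry_count(void)
{
    return debug_entries;
}
```

A second `provider_init()` call fails before reaching `debug_init()`, so the entries are created once and released once.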

04 Oct, 2009 · 1 commit

KVM: add support for change_pte mmu notifiers · 3da0dd43

Authored by Izik Eidus on Sep 23, 2009

This is needed for KVM if it wants KSM to directly map pages into its
shadow page tables.

[marcelo: cast pfn assignment to u64]
Signed-off-by: Izik Eidus <ieidus@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

02 Oct, 2009 · 1 commit

const: constify remaining file_operations · 828c0950

Authored by Alexey Dobriyan on Oct 01, 2009

[akpm@linux-foundation.org: fix KVM]
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Acked-by: Mike Frysinger <vapier@gentoo.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

28 Sep, 2009 · 1 commit

const: mark struct vm_struct_operations · f0f37e2f

Authored by Alexey Dobriyan on Sep 27, 2009

* mark struct vm_area_struct::vm_ops as const
* mark vm_ops in AGP code

But leave TTM code alone, something is fishy there with global vm_ops
being used.
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

24 Sep, 2009 · 1 commit

cpumask: use zalloc_cpumask_var() where possible · 79f55997

Authored by Li Zefan on Jun 15, 2009

Remove open-coded zalloc_cpumask_var() and zalloc_cpumask_var_node().
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>

10 Sep, 2009 · 14 commits

KVM: fix compile warnings on s390 · 28bcb112

Authored by Heiko Carstens on Sep 03, 2009

CC arch/s390/kvm/../../../virt/kvm/kvm_main.o
arch/s390/kvm/../../../virt/kvm/kvm_main.c: In function '__kvm_set_memory_region':
arch/s390/kvm/../../../virt/kvm/kvm_main.c:485: warning: unused variable 'j'
arch/s390/kvm/../../../virt/kvm/kvm_main.c:484: warning: unused variable 'lpages'
arch/s390/kvm/../../../virt/kvm/kvm_main.c:483: warning: unused variable 'ugfn'

Cc: Carsten Otte <cotte@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

KVM: Move #endif KVM_CAP_IRQ_ROUTING to correct place · 6621fbc2

Authored by Avi Kivity on Aug 10, 2009

The symbol only controls irq routing, not MSI-X.
Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: fix kvm_init() error handling · aed665f7

Authored by Xiao Guangrong on Aug 03, 2009

Remove debugfs file if kvm_arch_init() returns an error
Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: Drop obsolete cpu_get/put in make_all_cpus_request · e601e3be

Authored by Jan Kiszka on Jul 20, 2009

spin_lock disables preemption, so we can simply read the current cpu.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

KVM: Reduce runnability interface with arch support code · a1b37100

Authored by Gleb Natapov on Jul 09, 2009

Remove kvm_cpu_has_interrupt() and kvm_arch_interrupt_allowed() from
interface between general code and arch code. kvm_arch_vcpu_runnable()
checks for interrupts instead.
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: add ioeventfd support · d34e6b17

Authored by Gregory Haskins on Jul 07, 2009

ioeventfd is a mechanism to register PIO/MMIO regions to trigger an eventfd
signal when written to by a guest.  Host userspace can register any
arbitrary IO address with a corresponding eventfd and then pass the eventfd
to a specific end-point of interest for handling.

Normal IO requires a blocking round-trip since the operation may cause
side-effects in the emulated model or may return data to the caller.
Therefore, an IO in KVM traps from the guest to the host, causes a VMX/SVM
"heavy-weight" exit back to userspace, and is ultimately serviced by qemu's
device model synchronously before returning control back to the vcpu.

However, there is a subclass of IO which acts purely as a trigger for
other IO (such as to kick off an out-of-band DMA request, etc).  For these
patterns, the synchronous call is particularly expensive since we really
only want to simply get our notification transmitted asynchronously and
return as quickly as possible.  All the synchronous infrastructure to ensure
proper data-dependencies are met in the normal IO case is just unnecessary
overhead for signalling.  This adds additional computational load on the
system, as well as latency to the signalling path.

Therefore, we provide a mechanism for registration of an in-kernel trigger
point that allows the VCPU to only require a very brief, lightweight
exit just long enough to signal an eventfd.  This also means that any
clients compatible with the eventfd interface (which includes userspace
and kernelspace equally well) can now register to be notified. The end
result should be a more flexible and higher performance notification API
for the backend KVM hypervisor and peripheral components.

To test this theory, we built a test-harness called "doorbell".  This
module has a function called "doorbell_ring()" which simply increments a
counter for each time the doorbell is signaled.  It supports signalling
from either an eventfd, or an ioctl().

We then wired up two paths to the doorbell: One via QEMU via a registered
io region and through the doorbell ioctl().  The other is direct via
ioeventfd.

You can download this test harness here:

ftp://ftp.novell.com/dev/ghaskins/doorbell.tar.bz2

The measured results are as follows:

qemu-mmio:       110000 iops, 9.09us rtt
ioeventfd-mmio: 200100 iops, 5.00us rtt
ioeventfd-pio:  367300 iops, 2.72us rtt

I didn't measure qemu-pio, because I have to figure out how to register a
PIO region with qemu's device model, and I got lazy.  However, for now we
can extrapolate based on the data from the NULLIO runs of +2.56us for MMIO,
and -350ns for HC, we get:

qemu-pio:      153139 iops, 6.53us rtt
ioeventfd-hc: 412585 iops, 2.37us rtt

these are just for fun, for now, until I can gather more data.

Here is a graph for your convenience:

http://developer.novell.com/wiki/images/7/76/Iofd-chart.png

The conclusion to draw is that we save about 4us by skipping the userspace
hop.
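
The relation between the quoted rates and round-trip times can be checked with a small helper. It assumes each operation is one serialized round trip, so that iops = 1,000,000 / rtt in microseconds; that assumption and the function names are ours, not part of the original message.

```c
/* Convert a round-trip time in microseconds to operations per second. */
double iops_from_rtt_us(double rtt_us)
{
    return 1e6 / rtt_us;
}

/* Convert operations per second to a round-trip time in microseconds. */
double rtt_us_from_iops(double iops)
{
    return 1e6 / iops;
}

/* Round-trip time saved, in microseconds, when going from slow to fast. */
double rtt_saving_us(double slow_iops, double fast_iops)
{
    return rtt_us_from_iops(slow_iops) - rtt_us_from_iops(fast_iops);
}
```

Applied to the qemu-mmio and ioeventfd-mmio rows, `rtt_saving_us(110000, 200100)` comes out at a little over 4 us, in line with the stated conclusion.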

--------------------
Signed-off-by: Gregory Haskins <ghaskins@novell.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
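
For readers unfamiliar with the eventfd side of this mechanism, the sketch below shows plain Linux `eventfd(2)` signalling from userspace. It does not register anything with KVM; the helper names are hypothetical, and the writer simply plays the role that the in-kernel trigger point would play.

```c
#include <stdint.h>
#include <sys/eventfd.h>
#include <unistd.h>

/* Create a non-blocking eventfd with its counter at zero. */
int open_notifier(void)
{
    return eventfd(0, EFD_NONBLOCK);
}

/* Add one to the eventfd counter; returns 0 on success, -1 on error. */
int signal_once(int efd)
{
    uint64_t one = 1;

    return write(efd, &one, sizeof(one)) == (ssize_t)sizeof(one) ? 0 : -1;
}

/* Read and reset the counter. Returns the number of signals since the
 * previous drain, or 0 when nothing is pending. */
uint64_t drain_signals(int efd)
{
    uint64_t count = 0;

    if (read(efd, &count, sizeof(count)) != (ssize_t)sizeof(count))
        return 0;
    return count;
}
```

Several signals that arrive before the consumer wakes up are coalesced into one read, which suits a pure "something happened" trigger of the kind described above.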

KVM: make io_bus interface more robust · 090b7aff

Authored by Gregory Haskins on Jul 07, 2009

Today kvm_io_bus_register_dev() returns void and will internally BUG_ON
if it fails.  We want to create dynamic MMIO/PIO entries driven from
userspace later in the series, so we need to enhance the code to be more
robust with the following changes:

   1) Add a return value to the registration function
   2) Fix up all the callsites to check the return code, handle any
      failures, and percolate the error up to the caller.
   3) Add an unregister function that collapses holes in the array

Signed-off-by: Gregory Haskins <ghaskins@novell.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
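
The three changes listed above can be illustrated with a small fixed-size bus. The capacity, type names and function names are hypothetical; filling the hole with the last element is one simple way to keep the array dense and is used here purely as an example.

```c
#include <errno.h>
#include <stddef.h>

#define BUS_MAX_DEVS 4 /* hypothetical capacity for this sketch */

struct io_dev {
    int id;
};

struct io_bus {
    int dev_count;
    struct io_dev *devs[BUS_MAX_DEVS];
};

/* Registration reports failure instead of crashing on a full bus. */
int bus_register(struct io_bus *bus, struct io_dev *dev)
{
    if (bus->dev_count >= BUS_MAX_DEVS)
        return -ENOSPC;
    bus->devs[bus->dev_count++] = dev;
    return 0;
}

/* Unregistration fills the hole with the last entry so the used part
 * of the array stays contiguous. */
void bus_unregister(struct io_bus *bus, struct io_dev *dev)
{
    for (int i = 0; i < bus->dev_count; i++) {
        if (bus->devs[i] == dev) {
            bus->devs[i] = bus->devs[--bus->dev_count];
            bus->devs[bus->dev_count] = NULL;
            return;
        }
    }
}
```

Callers are expected to check the return value of `bus_register()` and pass any error up, which is the second item in the list.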

KVM: document lock nesting rule · 22fc0294

Authored by Michael S. Tsirkin on Jun 29, 2009

Document kvm->lock nesting within kvm->slots_lock
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: remove in_range from io devices · bda9020e

Authored by Michael S. Tsirkin on Jun 29, 2009

This changes bus accesses to use high-level kvm_io_bus_read/kvm_io_bus_write
functions. in_range now becomes unused so it is removed from device ops in
favor of read/write callbacks performing range checks internally.

This allows aliasing (mostly for in-kernel virtio), as well as better error
handling by making it possible to pass errors up to userspace.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
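
The shape of "callbacks that check their own range" can be sketched as follows. The structure layout and names are hypothetical; the point is only that the bus offers each access to the devices in turn and the first callback that accepts it wins, while a non-zero return means "not mine".

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

struct io_dev {
    uint64_t base;      /* first address the device claims         */
    uint64_t size;      /* number of addresses claimed             */
    uint64_t last_val;  /* last value written, kept for inspection */
    int (*write)(struct io_dev *dev, uint64_t addr, uint64_t val);
};

/* Device callback: performs its own range check. */
int simple_write(struct io_dev *dev, uint64_t addr, uint64_t val)
{
    if (addr < dev->base || addr - dev->base >= dev->size)
        return -EOPNOTSUPP;
    dev->last_val = val;
    return 0;
}

/* Bus-level write: no separate in_range step, just try each device. */
int bus_write(struct io_dev **devs, size_t n, uint64_t addr, uint64_t val)
{
    for (size_t i = 0; i < n; i++)
        if (devs[i]->write(devs[i], addr, val) == 0)
            return 0;
    return -EOPNOTSUPP; /* nobody claimed the address */
}
```

Since the bus no longer decides ownership up front, two devices may claim overlapping ranges, and an unclaimed access surfaces as an error code the caller can forward.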

KVM: convert bus to slots_lock · 6c474694

Authored by Michael S. Tsirkin on Jun 29, 2009

Use slots_lock to protect device list on the bus.  slots_lock is already
taken for read everywhere, so we only need to take it for write when
registering devices.  This is in preparation for removing in_range and
kvm->lock around it.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: remove old KVMTRACE support code · 2023a29c

Authored by Marcelo Tosatti on Jun 18, 2009

Return EOPNOTSUPP for KVM_TRACE_ENABLE/PAUSE/DISABLE ioctls.
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: x86: missing locking in PIT/IRQCHIP/SET_BSP_CPU ioctl paths · 894a9c55

Authored by Marcelo Tosatti on Jun 23, 2009

Correct missing locking in a few places in x86's vm_ioctl handling path.
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: Prepare memslot data structures for multiple hugepage sizes · ec04b260

Authored by Joerg Roedel on Jun 19, 2009

[avi: fix build on non-x86]
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: s390: Fix memslot initialization for userspace_addr != 0 · 3eea8437

Authored by Christian Borntraeger on Jun 23, 2009

Since
commit 854b5338196b1175706e99d63be43a4f8d8ab607
Author: Christian Ehrhardt <ehrhardt@linux.vnet.ibm.com>
    KVM: s390: streamline memslot handling

s390 uses the values of the memslot instead of doing everything in the arch
ioctl handler of the KVM_SET_USER_MEMORY_REGION. Unfortunately we failed to
set the userspace_addr of our memslot due to our s390 ifdef in
__kvm_set_memory_region.
Old s390 userspace launchers did not notice, since they started the guest at
userspace address 0.
Because of CONFIG_DEFAULT_MMAP_MIN_ADDR we now put the guest at 1M userspace,
which does not work. This patch makes sure that new.userspace_addr is set
on s390.
This fix should go in quickly. Nevertheless, looking at the code we should
clean up that ifdef in the long term. Any kernel janitors?
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Avi Kivity <avi@redhat.com>