提交 · 5cb0944c0c66004c0d9006a7f0fba5782ae38f69 · openeuler / Kernel

14 12月, 2017 2 次提交

KVM: introduce kvm_arch_vcpu_async_ioctl · 5cb0944c

由 Paolo Bonzini 提交于 12月 12, 2017

After the vcpu_load/vcpu_put pushdown, the handling of asynchronous VCPU
ioctl is already much clearer in that it is obvious that they bypass
vcpu_load and vcpu_put.

However, it is still not perfect in that the different state of the VCPU
mutex is still hidden in the caller.  Separate those ioctls into a new
function kvm_arch_vcpu_async_ioctl that returns -ENOIOCTLCMD for more
"traditional" synchronous ioctls.

Cc: James Hogan <jhogan@kernel.org>
Cc: Paul Mackerras <paulus@ozlabs.org>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: NChristoffer Dall <christoffer.dall@linaro.org>
Reviewed-by: NCornelia Huck <cohuck@redhat.com>
Suggested-by: NCornelia Huck <cohuck@redhat.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

5cb0944c

KVM: Take vcpu->mutex outside vcpu_load · ec7660cc

由 Christoffer Dall 提交于 12月 04, 2017

As we're about to call vcpu_load() from architecture-specific
implementations of the KVM vcpu ioctls, but yet we access data
structures protected by the vcpu->mutex in the generic code, factor
this logic out from vcpu_load().

x86 is the only architecture which calls vcpu_load() outside of the main
vcpu ioctl function, and these calls will no longer take the vcpu mutex
following this patch.  However, with the exception of
kvm_arch_vcpu_postcreate (see below), the callers are either in the
creation or destruction path of the VCPU, which means there cannot be
any concurrent access to the data structure, because the file descriptor
is not yet accessible, or is already gone.

kvm_arch_vcpu_postcreate makes the newly created vcpu potentially
accessible by other in-kernel threads through the kvm->vcpus array, and
we therefore take the vcpu mutex in this case directly.
Signed-off-by: NChristoffer Dall <christoffer.dall@linaro.org>
Reviewed-by: NCornelia Huck <cohuck@redhat.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

ec7660cc

06 12月, 2017 1 次提交

x86,kvm: move qemu/guest FPU switching out to vcpu_run · f775b13e

由 Rik van Riel 提交于 11月 14, 2017

Currently, every time a VCPU is scheduled out, the host kernel will
first save the guest FPU/xstate context, then load the qemu userspace
FPU context, only to then immediately save the qemu userspace FPU
context back to memory. When scheduling in a VCPU, the same extraneous
FPU loads and saves are done.

This could be avoided by moving from a model where the guest FPU is
loaded and stored with preemption disabled, to a model where the
qemu userspace FPU is swapped out for the guest FPU context for
the duration of the KVM_RUN ioctl.

This is done under the VCPU mutex, which is also taken when other
tasks inspect the VCPU FPU context, so the code should already be
safe for this change. That should come as no surprise, given that
s390 already has this optimization.

This can fix a bug where KVM calls get_user_pages while owning the
FPU, and the file system ends up requesting the FPU again:

    [258270.527947]  __warn+0xcb/0xf0
    [258270.527948]  warn_slowpath_null+0x1d/0x20
    [258270.527951]  kernel_fpu_disable+0x3f/0x50
    [258270.527953]  __kernel_fpu_begin+0x49/0x100
    [258270.527955]  kernel_fpu_begin+0xe/0x10
    [258270.527958]  crc32c_pcl_intel_update+0x84/0xb0
    [258270.527961]  crypto_shash_update+0x3f/0x110
    [258270.527968]  crc32c+0x63/0x8a [libcrc32c]
    [258270.527975]  dm_bm_checksum+0x1b/0x20 [dm_persistent_data]
    [258270.527978]  node_prepare_for_write+0x44/0x70 [dm_persistent_data]
    [258270.527985]  dm_block_manager_write_callback+0x41/0x50 [dm_persistent_data]
    [258270.527988]  submit_io+0x170/0x1b0 [dm_bufio]
    [258270.527992]  __write_dirty_buffer+0x89/0x90 [dm_bufio]
    [258270.527994]  __make_buffer_clean+0x4f/0x80 [dm_bufio]
    [258270.527996]  __try_evict_buffer+0x42/0x60 [dm_bufio]
    [258270.527998]  dm_bufio_shrink_scan+0xc0/0x130 [dm_bufio]
    [258270.528002]  shrink_slab.part.40+0x1f5/0x420
    [258270.528004]  shrink_node+0x22c/0x320
    [258270.528006]  do_try_to_free_pages+0xf5/0x330
    [258270.528008]  try_to_free_pages+0xe9/0x190
    [258270.528009]  __alloc_pages_slowpath+0x40f/0xba0
    [258270.528011]  __alloc_pages_nodemask+0x209/0x260
    [258270.528014]  alloc_pages_vma+0x1f1/0x250
    [258270.528017]  do_huge_pmd_anonymous_page+0x123/0x660
    [258270.528021]  handle_mm_fault+0xfd3/0x1330
    [258270.528025]  __get_user_pages+0x113/0x640
    [258270.528027]  get_user_pages+0x4f/0x60
    [258270.528063]  __gfn_to_pfn_memslot+0x120/0x3f0 [kvm]
    [258270.528108]  try_async_pf+0x66/0x230 [kvm]
    [258270.528135]  tdp_page_fault+0x130/0x280 [kvm]
    [258270.528149]  kvm_mmu_page_fault+0x60/0x120 [kvm]
    [258270.528158]  handle_ept_violation+0x91/0x170 [kvm_intel]
    [258270.528162]  vmx_handle_exit+0x1ca/0x1400 [kvm_intel]

No performance changes were detected in quick ping-pong tests on
my 4 socket system, which is expected since an FPU+xstate load is
on the order of 0.1us, while ping-ponging between CPUs is on the
order of 20us, and somewhat noisy.

Cc: stable@vger.kernel.org
Signed-off-by: NRik van Riel <riel@redhat.com>
Suggested-by: NChristian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
[Fixed a bug where reset_vcpu called put_fpu without preceding load_fpu,
 which happened inside from KVM_CREATE_VCPU ioctl. - Radim]
Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com>

f775b13e

28 11月, 2017 1 次提交

KVM: Let KVM_SET_SIGNAL_MASK work as advertised · 20b7035c

由 Jan H. Schönherr 提交于 11月 24, 2017

KVM API says for the signal mask you set via KVM_SET_SIGNAL_MASK, that
"any unblocked signal received [...] will cause KVM_RUN to return with
-EINTR" and that "the signal will only be delivered if not blocked by
the original signal mask".

This, however, is only true, when the calling task has a signal handler
registered for a signal. If not, signal evaluation is short-circuited for
SIG_IGN and SIG_DFL, and the signal is either ignored without KVM_RUN
returning or the whole process is terminated.

Make KVM_SET_SIGNAL_MASK behave as advertised by utilizing logic similar
to that in do_sigtimedwait() to avoid short-circuiting of signals.
Signed-off-by: NJan H. SchÃ¶nherr <jschoenh@amazon.de>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

20b7035c

09 11月, 2017 1 次提交

KVM: s390: vsie: use common code functions for pinning · f7a6509f

由 David Hildenbrand 提交于 9月 01, 2017

We will not see -ENOMEM (gfn_to_hva() will return KVM_ERR_PTR_BAD_PAGE
for all errors). So we can also get rid of special handling in the
callers of pin_guest_page() and always assume that it is a g2 error.

As also kvm_s390_inject_program_int() should never fail, we can
simplify pin_scb(), too.
Signed-off-by: NDavid Hildenbrand <david@redhat.com>
Message-Id: <20170901151143.22714-1-david@redhat.com>
Acked-by: NCornelia Huck <cohuck@redhat.com>
Signed-off-by: NChristian Borntraeger <borntraeger@de.ibm.com>

f7a6509f

08 8月, 2017 1 次提交

KVM: add spinlock optimization framework · 199b5763

由 Longpeng(Mike) 提交于 8月 08, 2017

If a vcpu exits due to request a user mode spinlock, then
the spinlock-holder may be preempted in user mode or kernel mode.
(Note that not all architectures trap spin loops in user mode,
only AMD x86 and ARM/ARM64 currently do).

But if a vcpu exits in kernel mode, then the holder must be
preempted in kernel mode, so we should choose a vcpu in kernel mode
as a more likely candidate for the lock holder.

This introduces kvm_arch_vcpu_in_kernel() to decide whether the
vcpu is in kernel-mode when it's preempted.  kvm_vcpu_on_spin's
new argument says the same of the spinning VCPU.
Signed-off-by: NLongpeng(Mike) <longpeng2@huawei.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

199b5763

07 8月, 2017 1 次提交

KVM: nVMX: get rid of nested_get_page() · 5e2f30b7

由 David Hildenbrand 提交于 8月 03, 2017

nested_get_page() just sounds confusing. All we want is a page from G1.
This is even unrelated to nested.

Let's introduce kvm_vcpu_gpa_to_page() so we don't get too lengthy
lines.
Reviewed-by: NPaolo Bonzini <pbonzini@redhat.com>
Signed-off-by: NDavid Hildenbrand <david@redhat.com>
Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com>
[Squash pasto fix from Wanpeng Li. - Paolo]
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

5e2f30b7

03 8月, 2017 1 次提交

KVM: avoid using rcu_dereference_protected · 3898da94

由 Paolo Bonzini 提交于 8月 02, 2017

During teardown, accesses to memslots and buses are using
rcu_dereference_protected with an always-true condition because
these accesses are done outside the usual mutexes.  This
is because the last reference is gone and there cannot be any
concurrent modifications, but rcu_dereference_protected is
ugly and unobvious.

Instead, check the refcount in kvm_get_bus and __kvm_memslots.
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com>

3898da94

27 7月, 2017 1 次提交

KVM: make pid available for uevents without debugfs · fdeaf7e3

由 Claudio Imbrenda 提交于 7月 24, 2017

Simplify and improve the code so that the PID is always available in
the uevent even when debugfs is not available.

This adds a userspace_pid field to struct kvm, as per Radim's
suggestion, so that the PID can be retrieved on destruction too.
Acked-by: NJanosch Frank <frankja@linux.vnet.ibm.com>
Fixes: 286de8f6 ("KVM: trigger uevents when creating or destroying a VM")
Signed-off-by: NClaudio Imbrenda <imbrenda@linux.vnet.ibm.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

fdeaf7e3

10 7月, 2017 1 次提交

KVM: use correct accessor function for __kvm_memslots · 7e988b10

由 Christian Borntraeger 提交于 7月 07, 2017

kvm memslots are protected by srcu and not by rcu. We must use
srcu_dereference_check instead of rcu_dereference_check.
Signed-off-by: NChristian Borntraeger <borntraeger@de.ibm.com>
Suggested-by: NPaolo Bonzini <pbonzini@redhat.com>

7e988b10

07 7月, 2017 3 次提交

KVM: mark memory slots as rcu · a80cf7b5

由 Christian Borntraeger 提交于 7月 06, 2017

we access the memslots array via srcu. Mark it as such and
use the right access functions also for the freeing of
memory slots.

Found by sparse:
./include/linux/kvm_host.h:565:16: error: incompatible types in
comparison expression (different address spaces)
Signed-off-by: NChristian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: NPaolo Bonzini <pbonzini@redhat.com>

a80cf7b5

KVM: mark kvm->busses as rcu protected · 4a12f951

由 Christian Borntraeger 提交于 7月 07, 2017

mark kvm->busses as rcu protected and use the correct access
function everywhere.

found by sparse
virt/kvm/kvm_main.c:3490:15: error: incompatible types in comparison expression (different address spaces)
virt/kvm/kvm_main.c:3509:15: error: incompatible types in comparison expression (different address spaces)
virt/kvm/kvm_main.c:3561:15: error: incompatible types in comparison expression (different address spaces)
virt/kvm/kvm_main.c:3644:15: error: incompatible types in comparison expression (different address spaces)
Signed-off-by: NChristian Borntraeger <borntraeger@de.ibm.com>

4a12f951

KVM: mark vcpu->pid pointer as rcu protected · 0e4524a5

由 Christian Borntraeger 提交于 7月 06, 2017

We do use rcu to protect the pid pointer. Mark it as such and
adopt all code to use the proper access methods.

This was detected by sparse.
"virt/kvm/kvm_main.c:2248:15: error: incompatible types in comparison
expression (different address spaces)"
Signed-off-by: NChristian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: NPaolo Bonzini <pbonzini@redhat.com>

0e4524a5

04 6月, 2017 2 次提交

KVM: add kvm_request_pending · 2fa6e1e1

由 Radim Krčmář 提交于 6月 04, 2017

A first step in vcpu->requests encapsulation.  Additionally, we now
use READ_ONCE() when accessing vcpu->requests, which ensures we
always load vcpu->requests when it's accessed.  This is important as
other threads can change it any time.  Also, READ_ONCE() documents
that vcpu->requests is used with other threads, likely requiring
memory barriers, which it does.
Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com>
[ Documented the new use of READ_ONCE() and converted another check
  in arch/mips/kvm/vz.c ]
Signed-off-by: NAndrew Jones <drjones@redhat.com>
Acked-by: NChristoffer Dall <cdall@linaro.org>
Signed-off-by: NChristoffer Dall <cdall@linaro.org>

2fa6e1e1

KVM: improve arch vcpu request defining · 2387149e

由 Andrew Jones 提交于 6月 04, 2017

Marc Zyngier suggested that we define the arch specific VCPU request
base, rather than requiring each arch to remember to start from 8.
That suggestion, along with Radim Krcmar's recent VCPU request flag
addition, snowballed into defining something of an arch VCPU request
defining API.

No functional change.

(Looks like x86 is running out of arch VCPU request bits.  Maybe
 someday we'll need to extend to 64.)
Signed-off-by: NAndrew Jones <drjones@redhat.com>
Acked-by: NChristoffer Dall <cdall@linaro.org>
Signed-off-by: NChristoffer Dall <cdall@linaro.org>

2387149e

09 5月, 2017 2 次提交

KVM: Add kvm_vcpu_get_idx to get vcpu index in kvm->vcpus · 497d72d8

由 Christoffer Dall 提交于 5月 08, 2017

There are occasional needs to use the index of vcpu in the kvm->vcpus
array to map something related to a VCPU. For example, unlike the
vcpu->vcpu_id, the vcpu index is guaranteed to not be sparse across all
vcpus which is useful when allocating a memory area for each vcpu.
Signed-off-by: NChristoffer Dall <cdall@linaro.org>
Reviewed-by: NEric Auger <eric.auger@redhat.com>

497d72d8

mm: introduce kv[mz]alloc helpers · a7c3e901

由 Michal Hocko 提交于 5月 08, 2017

Patch series "kvmalloc", v5.

There are many open coded kmalloc with vmalloc fallback instances in the
tree.  Most of them are not careful enough or simply do not care about
the underlying semantic of the kmalloc/page allocator which means that
a) some vmalloc fallbacks are basically unreachable because the kmalloc
part will keep retrying until it succeeds b) the page allocator can
invoke a really disruptive steps like the OOM killer to move forward
which doesn't sound appropriate when we consider that the vmalloc
fallback is available.

As it can be seen implementing kvmalloc requires quite an intimate
knowledge if the page allocator and the memory reclaim internals which
strongly suggests that a helper should be implemented in the memory
subsystem proper.

Most callers, I could find, have been converted to use the helper
instead.  This is patch 6.  There are some more relying on __GFP_REPEAT
in the networking stack which I have converted as well and Eric Dumazet
was not opposed [2] to convert them as well.

[1] http://lkml.kernel.org/r/20170130094940.13546-1-mhocko@kernel.org
[2] http://lkml.kernel.org/r/1485273626.16328.301.camel@edumazet-glaptop3.roam.corp.google.com

This patch (of 9):

Using kmalloc with the vmalloc fallback for larger allocations is a
common pattern in the kernel code.  Yet we do not have any common helper
for that and so users have invented their own helpers.  Some of them are
really creative when doing so.  Let's just add kv[mz]alloc and make sure
it is implemented properly.  This implementation makes sure to not make
a large memory pressure for > PAGE_SZE requests (__GFP_NORETRY) and also
to not warn about allocation failures.  This also rules out the OOM
killer as the vmalloc is a more approapriate fallback than a disruptive
user visible action.

This patch also changes some existing users and removes helpers which
are specific for them.  In some cases this is not possible (e.g.
ext4_kvmalloc, libcfs_kvzalloc) because those seems to be broken and
require GFP_NO{FS,IO} context which is not vmalloc compatible in general
(note that the page table allocation is GFP_KERNEL).  Those need to be
fixed separately.

While we are at it, document that __vmalloc{_node} about unsupported gfp
mask because there seems to be a lot of confusion out there.
kvmalloc_node will warn about GFP_KERNEL incompatible (which are not
superset) flags to catch new abusers.  Existing ones would have to die
slowly.

[sfr@canb.auug.org.au: f2fs fixup]
  Link: http://lkml.kernel.org/r/20170320163735.332e64b7@canb.auug.org.au
Link: http://lkml.kernel.org/r/20170306103032.2540-2-mhocko@kernel.orgSigned-off-by: NMichal Hocko <mhocko@suse.com>
Signed-off-by: NStephen Rothwell <sfr@canb.auug.org.au>
Reviewed-by: Andreas Dilger <adilger@dilger.ca>	[ext4 part]
Acked-by: NVlastimil Babka <vbabka@suse.cz>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: David Miller <davem@davemloft.net>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

a7c3e901

03 5月, 2017 1 次提交

Revert "KVM: Support vCPU-based gfn->hva cache" · 4e335d9e

由 Paolo Bonzini 提交于 5月 02, 2017

This reverts commit bbd64115.

I've been sitting on this revert for too long and it unfortunately
missed 4.11.  It's also the reason why I haven't merged ring-based
dirty tracking for 4.12.

Using kvm_vcpu_memslots in kvm_gfn_to_hva_cache_init and
kvm_vcpu_write_guest_offset_cached means that the MSR value can
now be used to access SMRAM, simply by making it point to an SMRAM
physical address.  This is problematic because it lets the guest
OS overwrite memory that it shouldn't be able to touch.

Cc: stable@vger.kernel.org
Fixes: bbd64115Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

4e335d9e

02 5月, 2017 1 次提交

KVM: x86: don't hold kvm->lock in KVM_SET_GSI_ROUTING · 5c0aea0e

由 David Hildenbrand 提交于 4月 28, 2017

We needed the lock to avoid racing with creation of the irqchip on x86. As
kvm_set_irq_routing() calls srcu_synchronize_expedited(), this lock
might be held for a longer time.

Let's introduce an arch specific callback to check if we can actually
add irq routes. For x86, all we have to do is check if we have an
irqchip in the kernel. We don't need kvm->lock at that point as the
irqchip is marked as inititalized only when actually fully created.
Reported-by: NSteve Rutherford <srutherford@google.com>
Reviewed-by: NRadim Krčmář <rkrcmar@redhat.com>
Fixes: 1df6dded ("KVM: x86: race between KVM_SET_GSI_ROUTING and KVM_CREATE_IRQCHIP")
Signed-off-by: NDavid Hildenbrand <david@redhat.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

5c0aea0e

27 4月, 2017 6 次提交

KVM: mark requests that need synchronization · 7a97cec2

由 Paolo Bonzini 提交于 4月 27, 2017

kvm_make_all_requests() provides a synchronization that waits until all
kicked VCPUs have acknowledged the kick.  This is important for
KVM_REQ_MMU_RELOAD as it prevents freeing while lockless paging is
underway.

This patch adds the synchronization property into all requests that are
currently being used with kvm_make_all_requests() in order to preserve
the current behavior and only introduce a new framework.  Removing it
from requests where it is not necessary is left for future patches.
Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

7a97cec2

KVM: return if kvm_vcpu_wake_up() did wake up the VCPU · 178f02ff

由 Radim Krčmář 提交于 4月 26, 2017

No need to kick a VCPU that we have just woken up.
Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com>
Reviewed-by: NAndrew Jones <drjones@redhat.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

178f02ff

KVM: add explicit barrier to kvm_vcpu_kick · cde9af6e

由 Andrew Jones 提交于 4月 26, 2017

kvm_vcpu_kick() must issue a general memory barrier prior to reading
vcpu->mode in order to ensure correctness of the mutual-exclusion
memory barrier pattern used with vcpu->requests.  While the cmpxchg
called from kvm_vcpu_kick():

 kvm_vcpu_kick
   kvm_arch_vcpu_should_kick
     kvm_vcpu_exiting_guest_mode
       cmpxchg

implies general memory barriers before and after the operation, that
implication is only valid when cmpxchg succeeds.  We need an explicit
barrier for when it fails, otherwise a VCPU thread on its entry path
that reads zero for vcpu->requests does not exclude the possibility
the requesting thread sees !IN_GUEST_MODE when it reads vcpu->mode.

kvm_make_all_cpus_request already had a barrier, so we remove it, as
now it would be redundant.
Signed-off-by: NAndrew Jones <drjones@redhat.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

cde9af6e

KVM: mark requests that do not need a wakeup · 930f7fd6

由 Radim Krčmář 提交于 4月 26, 2017

Some operations must ensure that the guest is not running with stale
data, but if the guest is halted, then the update can wait until another
event happens.  kvm_make_all_requests() currently doesn't wake up, so we
can mark all requests used with it.

First 8 bits were arbitrarily reserved for request numbers.

Most uses of requests have the request type as a constant, so a compiler
will optimize the '&'.

An alternative would be to have an inline function that would return
whether the request needs a wake-up or not, but I like this one better
even though it might produce worse assembly.
Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com>
Reviewed-by: NAndrew Jones <drjones@redhat.com>
Reviewed-by: NCornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

930f7fd6

KVM: add kvm_{test,clear}_request to replace {test,clear}_bit · 72875d8a

由 Radim Krčmář 提交于 4月 26, 2017

Users were expected to use kvm_check_request() for testing and clearing,
but request have expanded their use since then and some users want to
only test or do a faster clear.

Make sure that requests are not directly accessed with bit operations.
Reviewed-by: NChristian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com>
Reviewed-by: NAndrew Jones <drjones@redhat.com>
Reviewed-by: NCornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

72875d8a

KVM: PPC: Book3S HV: Native usage of the XIVE interrupt controller · 5af50993

由 Benjamin Herrenschmidt 提交于 4月 05, 2017

This patch makes KVM capable of using the XIVE interrupt controller
to provide the standard PAPR "XICS" style hypercalls. It is necessary
for proper operations when the host uses XIVE natively.

This has been lightly tested on an actual system, including PCI
pass-through with a TG3 device.
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
[mpe: Cleanup pr_xxx(), unsplit pr_xxx() strings, etc., fix build
 failures by adding KVM_XIVE which depends on KVM_XICS and XIVE, and
 adding empty stubs for the kvm_xive_xxx() routines, fixup subject,
 integrate fixes from Paul for building PR=y HV=n]
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

5af50993

21 4月, 2017 1 次提交

kvm: Move srcu_struct fields to end of struct kvm · 6ade8694

由 Paul E. McKenney 提交于 4月 20, 2017

Parallelizing SRCU callback handling will increase the size of
srcu_struct, which will move the kvm structure's kvm_arch field out
of reach of powerpc's current assembly code, which will result in the
following sort of build error:

arch/powerpc/kvm/book3s_hv_rmhandlers.S:617: Error: operand out of range (0x000000000000b328 is not between 0xffffffffffff8000 and 0x0000000000007fff)

This commit moves the srcu_struct fields in the kvm structure to follow
the kvm_arch field, which will allow powerpc's assembly code to continue
to be able to reach the kvm_arch field.
Reported-by: NStephen Rothwell <sfr@canb.auug.org.au>
Reported-by: NMichael Ellerman <michaele@au1.ibm.com>
Reported-by: Nkbuild test robot <fengguang.wu@intel.com>
Suggested-by: NPaolo Bonzini <pbonzini@redhat.com>
Signed-off-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com>
Tested-by: NMichael Ellerman <mpe@ellerman.id.au>
Acked-by: NPaolo Bonzini <pbonzini@redhat.com>
[ paulmck: Moved this commit to precede SRCU callback parallelization,
  and reworded the commit log into future tense, all in the name of
  bisectability. ]

6ade8694

13 4月, 2017 1 次提交

KVM: x86: rename kvm_vcpu_request_scan_ioapic() · 993225ad

由 David Hildenbrand 提交于 4月 07, 2017

Let's rename it into a proper arch specific callback.
Signed-off-by: NDavid Hildenbrand <david@redhat.com>
Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com>

993225ad

07 4月, 2017 2 次提交

kvm: make KVM_COALESCED_MMIO_PAGE_OFFSET public · 4b4357e0

由 Paolo Bonzini 提交于 3月 31, 2017

Its value has never changed; we might as well make it part of the ABI instead
of using the return value of KVM_CHECK_EXTENSION(KVM_CAP_COALESCED_MMIO).

Because PPC does not always make MMIO available, the code has to be made
dependent on CONFIG_KVM_MMIO rather than KVM_COALESCED_MMIO_PAGE_OFFSET.
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com>

4b4357e0

KVM: x86: drop legacy device assignment · ad6260da

由 Paolo Bonzini 提交于 3月 27, 2017

Legacy device assignment has been deprecated since 4.2 (released
1.5 years ago).  VFIO is better and everyone should have switched to it.
If they haven't, this should convince them. :)
Reviewed-by: NAlex Williamson <alex.williamson@redhat.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

ad6260da

24 3月, 2017 1 次提交

KVM: kvm_io_bus_unregister_dev() should never fail · 90db1043

由 David Hildenbrand 提交于 3月 23, 2017

No caller currently checks the return value of
kvm_io_bus_unregister_dev(). This is evil, as all callers silently go on
freeing their device. A stale reference will remain in the io_bus,
getting at least used again, when the iobus gets teared down on
kvm_destroy_vm() - leading to use after free errors.

There is nothing the callers could do, except retrying over and over
again.

So let's simply remove the bus altogether, print an error and make
sure no one can access this broken bus again (returning -ENOMEM on any
attempt to access it).

Fixes: e93f8a0f ("KVM: convert io_bus to SRCU")
Cc: stable@vger.kernel.org # 3.4+
Reported-by: NDmitry Vyukov <dvyukov@google.com>
Reviewed-by: NCornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: NDavid Hildenbrand <david@redhat.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

90db1043

02 3月, 2017 1 次提交

kvm: convert kvm.users_count from atomic_t to refcount_t · e3736c3e

由 Elena Reshetova 提交于 2月 20, 2017

refcount_t type and corresponding API should be
used instead of atomic_t when the variable is used as
a reference counter. This allows to avoid accidental
refcounter overflows that might lead to use-after-free
situations.
Signed-off-by: NElena Reshetova <elena.reshetova@intel.com>
Signed-off-by: NHans Liljestrand <ishkamiel@gmail.com>
Signed-off-by: NKees Cook <keescook@chromium.org>
Signed-off-by: NDavid Windsor <dwindsor@gmail.com>
Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com>

e3736c3e

17 2月, 2017 2 次提交

KVM: x86: remove code for lazy FPU handling · bd7e5b08

由 Paolo Bonzini 提交于 2月 03, 2017

The FPU is always active now when running KVM.
Reviewed-by: NDavid Matlack <dmatlack@google.com>
Reviewed-by: NBandan Das <bsd@redhat.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

bd7e5b08

KVM: Support vCPU-based gfn->hva cache · bbd64115

由 Cao, Lei 提交于 2月 03, 2017

Provide versions of struct gfn_to_hva_cache functions that
take vcpu as a parameter instead of struct kvm.  The existing functions
are not needed anymore, so delete them.  This allows dirty pages to
be logged in the vcpu dirty ring, instead of the global dirty ring,
for ring-based dirty memory tracking.
Signed-off-by: NLei Cao <lei.cao@stratus.com>
Message-Id: <CY1PR08MB19929BD2AC47A291FD680E83F04F0@CY1PR08MB1992.namprd08.prod.outlook.com>
Reviewed-by: NRadim Krčmář <rkrcmar@redhat.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

bbd64115

30 1月, 2017 1 次提交

arm/arm64: KVM: Get rid of KVM_MEMSLOT_INCOHERENT · 9d93dc1c

由 Marc Zyngier 提交于 1月 25, 2017

KVM_MEMSLOT_INCOHERENT is not used anymore, as we've killed its
only use in the arm/arm64 MMU code. Let's remove the last artifacts.
Reviewed-by: NChristoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: NMarc Zyngier <marc.zyngier@arm.com>

9d93dc1c

28 11月, 2016 1 次提交

KVM: Export kvm module parameter variables · ec76d819

由 Suraj Jitindar Singh 提交于 10月 14, 2016

The kvm module has the parameters halt_poll_ns, halt_poll_ns_grow, and
halt_poll_ns_shrink. Halt polling was recently added to the powerpc kvm-hv
module and these parameters were essentially duplicated for that. There is
no benefit to this duplication and it can lead to confusion when trying to
tune halt polling.

Thus move the definition of these variables to kvm_host.h and export them.
This will allow the kvm-hv module to use the same module parameters by
accessing these variables, which will be implemented in the next patch,
meaning that they will no longer be duplicated.
Signed-off-by: NSuraj Jitindar Singh <sjitindarsingh@gmail.com>
Signed-off-by: NPaul Mackerras <paulus@ozlabs.org>

ec76d819

23 11月, 2016 1 次提交

kvm: x86: don't print warning messages for unimplemented msrs · ae0f5499

由 Bandan Das 提交于 11月 15, 2016

Change unimplemented msrs messages to use pr_debug.
If CONFIG_DYNAMIC_DEBUG is set, then these messages can be
enabled at run time or else -DDEBUG can be used at compile
time to enable them. These messages will still be printed if
ignore_msrs=1.
Signed-off-by: NBandan Das <bsd@redhat.com>
Reviewed-by: NPaolo Bonzini <pbonzini@redhat.com>
Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com>

ae0f5499

22 11月, 2016 1 次提交

kvm: Introduce kvm_write_guest_offset_cached() · 4ec6e863

由 Pan Xinhui 提交于 11月 02, 2016

It allows us to update some status or field of a structure partially.

We can also save a kvm_read_guest_cached() call if we just update one
fild of the struct regardless of its current value.
Signed-off-by: NPan Xinhui <xinhui.pan@linux.vnet.ibm.com>
Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: NPaolo Bonzini <pbonzini@redhat.com>
Cc: David.Laight@ACULAB.COM
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: benh@kernel.crashing.org
Cc: boqun.feng@gmail.com
Cc: borntraeger@de.ibm.com
Cc: bsingharora@gmail.com
Cc: dave@stgolabs.net
Cc: jgross@suse.com
Cc: kernellwp@gmail.com
Cc: konrad.wilk@oracle.com
Cc: linuxppc-dev@lists.ozlabs.org
Cc: mpe@ellerman.id.au
Cc: paulmck@linux.vnet.ibm.com
Cc: paulus@samba.org
Cc: rkrcmar@redhat.com
Cc: virtualization@lists.linux-foundation.org
Cc: will.deacon@arm.com
Cc: xen-devel-request@lists.xenproject.org
Cc: xen-devel@lists.xenproject.org
Link: http://lkml.kernel.org/r/1478077718-37424-8-git-send-email-xinhui.pan@linux.vnet.ibm.com
[ Typo fixes. ]
Signed-off-by: NIngo Molnar <mingo@kernel.org>

4ec6e863

07 10月, 2016 1 次提交

x86/fpu, kvm: Remove KVM vcpu->fpu_counter · 3d42de25

由 Rik van Riel 提交于 10月 04, 2016

With the removal of the lazy FPU code, this field is no longer used.
Get rid of it.
Signed-off-by: NRik van Riel <riel@redhat.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Quentin Casasnovas <quentin.casasnovas@oracle.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: pbonzini@redhat.com
Link: http://lkml.kernel.org/r/1475627678-20788-7-git-send-email-riel@redhat.comSigned-off-by: NIngo Molnar <mingo@kernel.org>

3d42de25

16 9月, 2016 2 次提交

kvm: create per-vcpu dirs in debugfs · 45b5939e

由 Luiz Capitulino 提交于 9月 16, 2016

This commit adds the ability for archs to export
per-vcpu information via a new per-vcpu dir in
the VM's debugfs directory.

If kvm_arch_has_vcpu_debugfs() returns true, then KVM
will create a vcpu dir for each vCPU in the VM's
debugfs directory. Then kvm_arch_create_vcpu_debugfs()
is responsible for populating each vcpu directory
with arch specific entries.

The per-vcpu path in debugfs will look like:

/sys/kernel/debug/kvm/29162-10/vcpu0
/sys/kernel/debug/kvm/29162-10/vcpu1

This is all arch specific for now because the only
user of this interface (x86) wants to export x86-specific
per-vcpu information to user-space.
Signed-off-by: NLuiz Capitulino <lcapitulino@redhat.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

45b5939e

kvm: add stubs for arch specific debugfs support · 235539b4

由 Luiz Capitulino 提交于 9月 07, 2016

Two stubs are added:

 o kvm_arch_has_vcpu_debugfs(): must return true if the arch
   supports creating debugfs entries in the vcpu debugfs dir
   (which will be implemented by the next commit)

 o kvm_arch_create_vcpu_debugfs(): code that creates debugfs
   entries in the vcpu debugfs dir

For x86, this commit introduces a new file to avoid growing
arch/x86/kvm/x86.c even more.
Signed-off-by: NLuiz Capitulino <lcapitulino@redhat.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

235539b4

openeuler / Kernel 1 年多 前同步成功

openeuler / Kernel
1 年多前同步成功