提交 · 2387149eade25f32dcf1398811b3d0293181d005 · openanolis / cloud-kernel

04 6月, 2017 1 次提交

KVM: improve arch vcpu request defining · 2387149e

由 Andrew Jones 提交于 6月 04, 2017

Marc Zyngier suggested that we define the arch specific VCPU request
base, rather than requiring each arch to remember to start from 8.
That suggestion, along with Radim Krcmar's recent VCPU request flag
addition, snowballed into defining something of an arch VCPU request
defining API.

No functional change.

(Looks like x86 is running out of arch VCPU request bits.  Maybe
 someday we'll need to extend to 64.)
Signed-off-by: NAndrew Jones <drjones@redhat.com>
Acked-by: NChristoffer Dall <cdall@linaro.org>
Signed-off-by: NChristoffer Dall <cdall@linaro.org>

2387149e

09 5月, 2017 2 次提交

KVM: Add kvm_vcpu_get_idx to get vcpu index in kvm->vcpus · 497d72d8

由 Christoffer Dall 提交于 5月 08, 2017

There are occasional needs to use the index of vcpu in the kvm->vcpus
array to map something related to a VCPU. For example, unlike the
vcpu->vcpu_id, the vcpu index is guaranteed to not be sparse across all
vcpus which is useful when allocating a memory area for each vcpu.
Signed-off-by: NChristoffer Dall <cdall@linaro.org>
Reviewed-by: NEric Auger <eric.auger@redhat.com>

497d72d8

mm: introduce kv[mz]alloc helpers · a7c3e901

由 Michal Hocko 提交于 5月 08, 2017

Patch series "kvmalloc", v5.

There are many open coded kmalloc with vmalloc fallback instances in the
tree.  Most of them are not careful enough or simply do not care about
the underlying semantic of the kmalloc/page allocator which means that
a) some vmalloc fallbacks are basically unreachable because the kmalloc
part will keep retrying until it succeeds b) the page allocator can
invoke a really disruptive steps like the OOM killer to move forward
which doesn't sound appropriate when we consider that the vmalloc
fallback is available.

As it can be seen implementing kvmalloc requires quite an intimate
knowledge if the page allocator and the memory reclaim internals which
strongly suggests that a helper should be implemented in the memory
subsystem proper.

Most callers, I could find, have been converted to use the helper
instead.  This is patch 6.  There are some more relying on __GFP_REPEAT
in the networking stack which I have converted as well and Eric Dumazet
was not opposed [2] to convert them as well.

[1] http://lkml.kernel.org/r/20170130094940.13546-1-mhocko@kernel.org
[2] http://lkml.kernel.org/r/1485273626.16328.301.camel@edumazet-glaptop3.roam.corp.google.com

This patch (of 9):

Using kmalloc with the vmalloc fallback for larger allocations is a
common pattern in the kernel code.  Yet we do not have any common helper
for that and so users have invented their own helpers.  Some of them are
really creative when doing so.  Let's just add kv[mz]alloc and make sure
it is implemented properly.  This implementation makes sure to not make
a large memory pressure for > PAGE_SZE requests (__GFP_NORETRY) and also
to not warn about allocation failures.  This also rules out the OOM
killer as the vmalloc is a more approapriate fallback than a disruptive
user visible action.

This patch also changes some existing users and removes helpers which
are specific for them.  In some cases this is not possible (e.g.
ext4_kvmalloc, libcfs_kvzalloc) because those seems to be broken and
require GFP_NO{FS,IO} context which is not vmalloc compatible in general
(note that the page table allocation is GFP_KERNEL).  Those need to be
fixed separately.

While we are at it, document that __vmalloc{_node} about unsupported gfp
mask because there seems to be a lot of confusion out there.
kvmalloc_node will warn about GFP_KERNEL incompatible (which are not
superset) flags to catch new abusers.  Existing ones would have to die
slowly.

[sfr@canb.auug.org.au: f2fs fixup]
  Link: http://lkml.kernel.org/r/20170320163735.332e64b7@canb.auug.org.au
Link: http://lkml.kernel.org/r/20170306103032.2540-2-mhocko@kernel.orgSigned-off-by: NMichal Hocko <mhocko@suse.com>
Signed-off-by: NStephen Rothwell <sfr@canb.auug.org.au>
Reviewed-by: Andreas Dilger <adilger@dilger.ca>	[ext4 part]
Acked-by: NVlastimil Babka <vbabka@suse.cz>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: David Miller <davem@davemloft.net>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

a7c3e901

03 5月, 2017 1 次提交

Revert "KVM: Support vCPU-based gfn->hva cache" · 4e335d9e

由 Paolo Bonzini 提交于 5月 02, 2017

This reverts commit bbd64115.

I've been sitting on this revert for too long and it unfortunately
missed 4.11.  It's also the reason why I haven't merged ring-based
dirty tracking for 4.12.

Using kvm_vcpu_memslots in kvm_gfn_to_hva_cache_init and
kvm_vcpu_write_guest_offset_cached means that the MSR value can
now be used to access SMRAM, simply by making it point to an SMRAM
physical address.  This is problematic because it lets the guest
OS overwrite memory that it shouldn't be able to touch.

Cc: stable@vger.kernel.org
Fixes: bbd64115Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

4e335d9e

02 5月, 2017 1 次提交

KVM: x86: don't hold kvm->lock in KVM_SET_GSI_ROUTING · 5c0aea0e

由 David Hildenbrand 提交于 4月 28, 2017

We needed the lock to avoid racing with creation of the irqchip on x86. As
kvm_set_irq_routing() calls srcu_synchronize_expedited(), this lock
might be held for a longer time.

Let's introduce an arch specific callback to check if we can actually
add irq routes. For x86, all we have to do is check if we have an
irqchip in the kernel. We don't need kvm->lock at that point as the
irqchip is marked as inititalized only when actually fully created.
Reported-by: NSteve Rutherford <srutherford@google.com>
Reviewed-by: NRadim Krčmář <rkrcmar@redhat.com>
Fixes: 1df6dded ("KVM: x86: race between KVM_SET_GSI_ROUTING and KVM_CREATE_IRQCHIP")
Signed-off-by: NDavid Hildenbrand <david@redhat.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

5c0aea0e

27 4月, 2017 6 次提交

KVM: mark requests that need synchronization · 7a97cec2

由 Paolo Bonzini 提交于 4月 27, 2017

kvm_make_all_requests() provides a synchronization that waits until all
kicked VCPUs have acknowledged the kick.  This is important for
KVM_REQ_MMU_RELOAD as it prevents freeing while lockless paging is
underway.

This patch adds the synchronization property into all requests that are
currently being used with kvm_make_all_requests() in order to preserve
the current behavior and only introduce a new framework.  Removing it
from requests where it is not necessary is left for future patches.
Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

7a97cec2

KVM: return if kvm_vcpu_wake_up() did wake up the VCPU · 178f02ff

由 Radim Krčmář 提交于 4月 26, 2017

No need to kick a VCPU that we have just woken up.
Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com>
Reviewed-by: NAndrew Jones <drjones@redhat.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

178f02ff

KVM: add explicit barrier to kvm_vcpu_kick · cde9af6e

由 Andrew Jones 提交于 4月 26, 2017

kvm_vcpu_kick() must issue a general memory barrier prior to reading
vcpu->mode in order to ensure correctness of the mutual-exclusion
memory barrier pattern used with vcpu->requests.  While the cmpxchg
called from kvm_vcpu_kick():

 kvm_vcpu_kick
   kvm_arch_vcpu_should_kick
     kvm_vcpu_exiting_guest_mode
       cmpxchg

implies general memory barriers before and after the operation, that
implication is only valid when cmpxchg succeeds.  We need an explicit
barrier for when it fails, otherwise a VCPU thread on its entry path
that reads zero for vcpu->requests does not exclude the possibility
the requesting thread sees !IN_GUEST_MODE when it reads vcpu->mode.

kvm_make_all_cpus_request already had a barrier, so we remove it, as
now it would be redundant.
Signed-off-by: NAndrew Jones <drjones@redhat.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

cde9af6e

KVM: mark requests that do not need a wakeup · 930f7fd6

由 Radim Krčmář 提交于 4月 26, 2017

Some operations must ensure that the guest is not running with stale
data, but if the guest is halted, then the update can wait until another
event happens.  kvm_make_all_requests() currently doesn't wake up, so we
can mark all requests used with it.

First 8 bits were arbitrarily reserved for request numbers.

Most uses of requests have the request type as a constant, so a compiler
will optimize the '&'.

An alternative would be to have an inline function that would return
whether the request needs a wake-up or not, but I like this one better
even though it might produce worse assembly.
Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com>
Reviewed-by: NAndrew Jones <drjones@redhat.com>
Reviewed-by: NCornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

930f7fd6

KVM: add kvm_{test,clear}_request to replace {test,clear}_bit · 72875d8a

由 Radim Krčmář 提交于 4月 26, 2017

Users were expected to use kvm_check_request() for testing and clearing,
but request have expanded their use since then and some users want to
only test or do a faster clear.

Make sure that requests are not directly accessed with bit operations.
Reviewed-by: NChristian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com>
Reviewed-by: NAndrew Jones <drjones@redhat.com>
Reviewed-by: NCornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

72875d8a

KVM: PPC: Book3S HV: Native usage of the XIVE interrupt controller · 5af50993

由 Benjamin Herrenschmidt 提交于 4月 05, 2017

This patch makes KVM capable of using the XIVE interrupt controller
to provide the standard PAPR "XICS" style hypercalls. It is necessary
for proper operations when the host uses XIVE natively.

This has been lightly tested on an actual system, including PCI
pass-through with a TG3 device.
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
[mpe: Cleanup pr_xxx(), unsplit pr_xxx() strings, etc., fix build
 failures by adding KVM_XIVE which depends on KVM_XICS and XIVE, and
 adding empty stubs for the kvm_xive_xxx() routines, fixup subject,
 integrate fixes from Paul for building PR=y HV=n]
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

5af50993

21 4月, 2017 1 次提交

kvm: Move srcu_struct fields to end of struct kvm · 6ade8694

由 Paul E. McKenney 提交于 4月 20, 2017

Parallelizing SRCU callback handling will increase the size of
srcu_struct, which will move the kvm structure's kvm_arch field out
of reach of powerpc's current assembly code, which will result in the
following sort of build error:

arch/powerpc/kvm/book3s_hv_rmhandlers.S:617: Error: operand out of range (0x000000000000b328 is not between 0xffffffffffff8000 and 0x0000000000007fff)

This commit moves the srcu_struct fields in the kvm structure to follow
the kvm_arch field, which will allow powerpc's assembly code to continue
to be able to reach the kvm_arch field.
Reported-by: NStephen Rothwell <sfr@canb.auug.org.au>
Reported-by: NMichael Ellerman <michaele@au1.ibm.com>
Reported-by: Nkbuild test robot <fengguang.wu@intel.com>
Suggested-by: NPaolo Bonzini <pbonzini@redhat.com>
Signed-off-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com>
Tested-by: NMichael Ellerman <mpe@ellerman.id.au>
Acked-by: NPaolo Bonzini <pbonzini@redhat.com>
[ paulmck: Moved this commit to precede SRCU callback parallelization,
  and reworded the commit log into future tense, all in the name of
  bisectability. ]

6ade8694

13 4月, 2017 1 次提交

KVM: x86: rename kvm_vcpu_request_scan_ioapic() · 993225ad

由 David Hildenbrand 提交于 4月 07, 2017

Let's rename it into a proper arch specific callback.
Signed-off-by: NDavid Hildenbrand <david@redhat.com>
Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com>

993225ad

07 4月, 2017 2 次提交

kvm: make KVM_COALESCED_MMIO_PAGE_OFFSET public · 4b4357e0

由 Paolo Bonzini 提交于 3月 31, 2017

Its value has never changed; we might as well make it part of the ABI instead
of using the return value of KVM_CHECK_EXTENSION(KVM_CAP_COALESCED_MMIO).

Because PPC does not always make MMIO available, the code has to be made
dependent on CONFIG_KVM_MMIO rather than KVM_COALESCED_MMIO_PAGE_OFFSET.
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com>

4b4357e0

KVM: x86: drop legacy device assignment · ad6260da

由 Paolo Bonzini 提交于 3月 27, 2017

Legacy device assignment has been deprecated since 4.2 (released
1.5 years ago).  VFIO is better and everyone should have switched to it.
If they haven't, this should convince them. :)
Reviewed-by: NAlex Williamson <alex.williamson@redhat.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

ad6260da

24 3月, 2017 1 次提交

KVM: kvm_io_bus_unregister_dev() should never fail · 90db1043

由 David Hildenbrand 提交于 3月 23, 2017

No caller currently checks the return value of
kvm_io_bus_unregister_dev(). This is evil, as all callers silently go on
freeing their device. A stale reference will remain in the io_bus,
getting at least used again, when the iobus gets teared down on
kvm_destroy_vm() - leading to use after free errors.

There is nothing the callers could do, except retrying over and over
again.

So let's simply remove the bus altogether, print an error and make
sure no one can access this broken bus again (returning -ENOMEM on any
attempt to access it).

Fixes: e93f8a0f ("KVM: convert io_bus to SRCU")
Cc: stable@vger.kernel.org # 3.4+
Reported-by: NDmitry Vyukov <dvyukov@google.com>
Reviewed-by: NCornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: NDavid Hildenbrand <david@redhat.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

90db1043

02 3月, 2017 1 次提交

kvm: convert kvm.users_count from atomic_t to refcount_t · e3736c3e

由 Elena Reshetova 提交于 2月 20, 2017

refcount_t type and corresponding API should be
used instead of atomic_t when the variable is used as
a reference counter. This allows to avoid accidental
refcounter overflows that might lead to use-after-free
situations.
Signed-off-by: NElena Reshetova <elena.reshetova@intel.com>
Signed-off-by: NHans Liljestrand <ishkamiel@gmail.com>
Signed-off-by: NKees Cook <keescook@chromium.org>
Signed-off-by: NDavid Windsor <dwindsor@gmail.com>
Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com>

e3736c3e

17 2月, 2017 2 次提交

KVM: x86: remove code for lazy FPU handling · bd7e5b08

由 Paolo Bonzini 提交于 2月 03, 2017

The FPU is always active now when running KVM.
Reviewed-by: NDavid Matlack <dmatlack@google.com>
Reviewed-by: NBandan Das <bsd@redhat.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

bd7e5b08

KVM: Support vCPU-based gfn->hva cache · bbd64115

由 Cao, Lei 提交于 2月 03, 2017

Provide versions of struct gfn_to_hva_cache functions that
take vcpu as a parameter instead of struct kvm.  The existing functions
are not needed anymore, so delete them.  This allows dirty pages to
be logged in the vcpu dirty ring, instead of the global dirty ring,
for ring-based dirty memory tracking.
Signed-off-by: NLei Cao <lei.cao@stratus.com>
Message-Id: <CY1PR08MB19929BD2AC47A291FD680E83F04F0@CY1PR08MB1992.namprd08.prod.outlook.com>
Reviewed-by: NRadim Krčmář <rkrcmar@redhat.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

bbd64115

30 1月, 2017 1 次提交

arm/arm64: KVM: Get rid of KVM_MEMSLOT_INCOHERENT · 9d93dc1c

由 Marc Zyngier 提交于 1月 25, 2017

KVM_MEMSLOT_INCOHERENT is not used anymore, as we've killed its
only use in the arm/arm64 MMU code. Let's remove the last artifacts.
Reviewed-by: NChristoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: NMarc Zyngier <marc.zyngier@arm.com>

9d93dc1c

28 11月, 2016 1 次提交

KVM: Export kvm module parameter variables · ec76d819

由 Suraj Jitindar Singh 提交于 10月 14, 2016

The kvm module has the parameters halt_poll_ns, halt_poll_ns_grow, and
halt_poll_ns_shrink. Halt polling was recently added to the powerpc kvm-hv
module and these parameters were essentially duplicated for that. There is
no benefit to this duplication and it can lead to confusion when trying to
tune halt polling.

Thus move the definition of these variables to kvm_host.h and export them.
This will allow the kvm-hv module to use the same module parameters by
accessing these variables, which will be implemented in the next patch,
meaning that they will no longer be duplicated.
Signed-off-by: NSuraj Jitindar Singh <sjitindarsingh@gmail.com>
Signed-off-by: NPaul Mackerras <paulus@ozlabs.org>

ec76d819

23 11月, 2016 1 次提交

kvm: x86: don't print warning messages for unimplemented msrs · ae0f5499

由 Bandan Das 提交于 11月 15, 2016

Change unimplemented msrs messages to use pr_debug.
If CONFIG_DYNAMIC_DEBUG is set, then these messages can be
enabled at run time or else -DDEBUG can be used at compile
time to enable them. These messages will still be printed if
ignore_msrs=1.
Signed-off-by: NBandan Das <bsd@redhat.com>
Reviewed-by: NPaolo Bonzini <pbonzini@redhat.com>
Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com>

ae0f5499

22 11月, 2016 1 次提交

kvm: Introduce kvm_write_guest_offset_cached() · 4ec6e863

由 Pan Xinhui 提交于 11月 02, 2016

It allows us to update some status or field of a structure partially.

We can also save a kvm_read_guest_cached() call if we just update one
fild of the struct regardless of its current value.
Signed-off-by: NPan Xinhui <xinhui.pan@linux.vnet.ibm.com>
Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: NPaolo Bonzini <pbonzini@redhat.com>
Cc: David.Laight@ACULAB.COM
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: benh@kernel.crashing.org
Cc: boqun.feng@gmail.com
Cc: borntraeger@de.ibm.com
Cc: bsingharora@gmail.com
Cc: dave@stgolabs.net
Cc: jgross@suse.com
Cc: kernellwp@gmail.com
Cc: konrad.wilk@oracle.com
Cc: linuxppc-dev@lists.ozlabs.org
Cc: mpe@ellerman.id.au
Cc: paulmck@linux.vnet.ibm.com
Cc: paulus@samba.org
Cc: rkrcmar@redhat.com
Cc: virtualization@lists.linux-foundation.org
Cc: will.deacon@arm.com
Cc: xen-devel-request@lists.xenproject.org
Cc: xen-devel@lists.xenproject.org
Link: http://lkml.kernel.org/r/1478077718-37424-8-git-send-email-xinhui.pan@linux.vnet.ibm.com
[ Typo fixes. ]
Signed-off-by: NIngo Molnar <mingo@kernel.org>

4ec6e863

07 10月, 2016 1 次提交

x86/fpu, kvm: Remove KVM vcpu->fpu_counter · 3d42de25

由 Rik van Riel 提交于 10月 04, 2016

With the removal of the lazy FPU code, this field is no longer used.
Get rid of it.
Signed-off-by: NRik van Riel <riel@redhat.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Quentin Casasnovas <quentin.casasnovas@oracle.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: pbonzini@redhat.com
Link: http://lkml.kernel.org/r/1475627678-20788-7-git-send-email-riel@redhat.comSigned-off-by: NIngo Molnar <mingo@kernel.org>

3d42de25

16 9月, 2016 2 次提交

kvm: create per-vcpu dirs in debugfs · 45b5939e

由 Luiz Capitulino 提交于 9月 16, 2016

This commit adds the ability for archs to export
per-vcpu information via a new per-vcpu dir in
the VM's debugfs directory.

If kvm_arch_has_vcpu_debugfs() returns true, then KVM
will create a vcpu dir for each vCPU in the VM's
debugfs directory. Then kvm_arch_create_vcpu_debugfs()
is responsible for populating each vcpu directory
with arch specific entries.

The per-vcpu path in debugfs will look like:

/sys/kernel/debug/kvm/29162-10/vcpu0
/sys/kernel/debug/kvm/29162-10/vcpu1

This is all arch specific for now because the only
user of this interface (x86) wants to export x86-specific
per-vcpu information to user-space.
Signed-off-by: NLuiz Capitulino <lcapitulino@redhat.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

45b5939e

kvm: add stubs for arch specific debugfs support · 235539b4

由 Luiz Capitulino 提交于 9月 07, 2016

Two stubs are added:

 o kvm_arch_has_vcpu_debugfs(): must return true if the arch
   supports creating debugfs entries in the vcpu debugfs dir
   (which will be implemented by the next commit)

 o kvm_arch_create_vcpu_debugfs(): code that creates debugfs
   entries in the vcpu debugfs dir

For x86, this commit introduces a new file to avoid growing
arch/x86/kvm/x86.c even more.
Signed-off-by: NLuiz Capitulino <lcapitulino@redhat.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

235539b4

12 8月, 2016 2 次提交

KVM: Protect device ops->create and list_add with kvm->lock · a28ebea2

由 Christoffer Dall 提交于 8月 09, 2016

KVM devices were manipulating list data structures without any form of
synchronization, and some implementations of the create operations also
suffered from a lack of synchronization.

Now when we've split the xics create operation into create and init, we
can hold the kvm->lock mutex while calling the create operation and when
manipulating the devices list.

The error path in the generic code gets slightly ugly because we have to
take the mutex again and delete the device from the list, but holding
the mutex during anon_inode_getfd or releasing/locking the mutex in the
common non-error path seemed wrong.
Signed-off-by: NChristoffer Dall <christoffer.dall@linaro.org>
Reviewed-by: NPaolo Bonzini <pbonzini@redhat.com>
Acked-by: NChristian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com>

a28ebea2

KVM: PPC: Move xics_debugfs_init out of create · 023e9fdd

由 Christoffer Dall 提交于 8月 09, 2016

As we are about to hold the kvm->lock during the create operation on KVM
devices, we should move the call to xics_debugfs_init into its own
function, since holding a mutex over extended amounts of time might not
be a good idea.

Introduce an init operation on the kvm_device_ops struct which cannot
fail and call this, if configured, after the device has been created.
Signed-off-by: NChristoffer Dall <christoffer.dall@linaro.org>
Reviewed-by: NPaolo Bonzini <pbonzini@redhat.com>
Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com>

023e9fdd

23 7月, 2016 3 次提交

KVM: arm/arm64: Enable MSI routing · 995a0ee9

由 Eric Auger 提交于 7月 22, 2016

Up to now, only irqchip routing entries could be set. This patch
adds the capability to insert MSI routing entries.

For ARM64, let's also increase KVM_MAX_IRQ_ROUTES to 4096: this
include SPI irqchip routes plus MSI routes. In the future this
might be extended.
Signed-off-by: NEric Auger <eric.auger@redhat.com>
Reviewed-by: NAndre Przywara <andre.przywara@arm.com>
Signed-off-by: NMarc Zyngier <marc.zyngier@arm.com>

995a0ee9

KVM: Move kvm_setup_default/empty_irq_routing declaration in arch specific header · d9565a73

由 Eric Auger 提交于 7月 22, 2016

kvm_setup_default_irq_routing and kvm_setup_empty_irq_routing are
not used by generic code. So let's move the declarations in x86 irq.h
header instead of kvm_host.h.
Signed-off-by: NEric Auger <eric.auger@redhat.com>
Suggested-by: NAndre Przywara <andre.przywara@arm.com>
Acked-by: NRadim Krčmář <rkrcmar@redhat.com>
Signed-off-by: NMarc Zyngier <marc.zyngier@arm.com>

d9565a73

KVM: Add devid in kvm_kernel_irq_routing_entry · 0455e72c

由 Eric Auger 提交于 7月 22, 2016

Enhance kvm_kernel_irq_routing_entry to transport the device id
field, devid. A new flags field makes possible to indicate the
devid is valid. Those additions are used for ARM GICv3 ITS MSI
injection. The original struct msi_msg msi field is replaced by
a new custom structure that embeds the new fields.
Signed-off-by: NEric Auger <eric.auger@redhat.com>
Suggested-by: NRadim Krčmář <rkrcmar@redhat.com>
Acked-by: NRadim Krčmář <rkrcmar@redhat.com>
Signed-off-by: NMarc Zyngier <marc.zyngier@arm.com>

0455e72c

19 7月, 2016 1 次提交

KVM: kvm_io_bus: Add kvm_io_bus_get_dev() call · 8a39d006

由 Andre Przywara 提交于 7月 15, 2016

The kvm_io_bus framework is a nice place of holding information about
various MMIO regions for kernel emulated devices.
Add a call to retrieve the kvm_io_device structure which is associated
with a certain MMIO address. This avoids to duplicate kvm_io_bus'
knowledge of MMIO regions without having to fake MMIO calls if a user
needs the device a certain MMIO address belongs to.
This will be used by the ITS emulation to get the associated ITS device
when someone triggers an MSI via an ioctl from userspace.
Signed-off-by: NAndre Przywara <andre.przywara@arm.com>
Reviewed-by: NEric Auger <eric.auger@redhat.com>
Reviewed-by: NMarc Zyngier <marc.zyngier@arm.com>
Acked-by: NChristoffer Dall <christoffer.dall@linaro.org>
Acked-by: NPaolo Bonzini <pbonzini@redhat.com>
Tested-by: NEric Auger <eric.auger@redhat.com>
Signed-off-by: NMarc Zyngier <marc.zyngier@arm.com>

8a39d006

14 7月, 2016 1 次提交

KVM: pass struct kvm to kvm_set_routing_entry · c63cf538

由 Radim Krčmář 提交于 7月 12, 2016

Arch-specific code will use it.
Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

c63cf538

01 7月, 2016 1 次提交

KVM: remove kvm_guest_enter/exit wrappers · 6edaa530

由 Paolo Bonzini 提交于 6月 15, 2016

Use the functions from context_tracking.h directly.

Cc: Andy Lutomirski <luto@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: NRik van Riel <riel@redhat.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

6edaa530

28 6月, 2016 1 次提交

context_tracking: move rcu_virt_note_context_switch out of kvm_host.h · ebaac173

由 Paolo Bonzini 提交于 6月 15, 2016

Make kvm_guest_{enter,exit} and __kvm_guest_{enter,exit} trivial wrappers
around the code in context_tracking.h.  Name the context_tracking.h functions
consistently with what those for kernel<->user switch.

Cc: Andy Lutomirski <luto@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: NRik van Riel <riel@redhat.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

ebaac173

16 6月, 2016 2 次提交

KVM: remove kvm_vcpu_compatible · 557abc40

由 Paolo Bonzini 提交于 6月 13, 2016

The new created_vcpus field makes it possible to avoid the race between
irqchip and VCPU creation in a much nicer way; just check under kvm->lock
whether a VCPU has already been created.

We can then remove KVM_APIC_ARCHITECTURE too, because at this point the
symbol is only governing the default definition of kvm_vcpu_compatible.
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

557abc40

KVM: introduce kvm->created_vcpus · 6c7caebc

由 Paolo Bonzini 提交于 6月 13, 2016

The race between creating the irqchip and the first VCPU is
currently fixed by checking the presence of an irqchip before
updating kvm->online_vcpus, and undoing the whole VCPU creation
if someone created the irqchip in the meanwhile.

Instead, introduce a new field in struct kvm that will count VCPUs
under a mutex, without the atomic access and memory ordering that we
need elsewhere to protect the vcpus array.  This also plugs the race
and is more easily applicable in all similar circumstances.
Reviewed-by: NCornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

6c7caebc

25 5月, 2016 1 次提交

KVM: Create debugfs dir and stat files for each VM · 536a6f88

由 Janosch Frank 提交于 5月 18, 2016

This patch adds a kvm debugfs subdirectory for each VM, which is named
after its pid and file descriptor. The directories contain the same
kind of files that are already in the kvm debugfs directory, but the
data exported through them is now VM specific.

This makes the debugfs kvm data a convenient alternative to the
tracepoints which already have per VM data. The debugfs data is easy
to read and low overhead.

CC: Dan Carpenter <dan.carpenter@oracle.com> [includes fixes by Dan Carpenter]
Signed-off-by: NJanosch Frank <frankja@linux.vnet.ibm.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

536a6f88

19 5月, 2016 1 次提交

KVM: split kvm_vcpu_wake_up from kvm_vcpu_kick · dd1a4cc1

由 Radim Krčmář 提交于 5月 04, 2016

AVIC has a use for kvm_vcpu_wake_up.
Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com>
Tested-by: NSuravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Reviewed-by: NPaolo Bonzini <pbonzini@redhat.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

dd1a4cc1

13 5月, 2016 1 次提交

KVM: halt_polling: provide a way to qualify wakeups during poll · 3491caf2

由 Christian Borntraeger 提交于 5月 13, 2016

Some wakeups should not be considered a sucessful poll. For example on
s390 I/O interrupts are usually floating, which means that _ALL_ CPUs
would be considered runnable - letting all vCPUs poll all the time for
transactional like workload, even if one vCPU would be enough.
This can result in huge CPU usage for large guests.
This patch lets architectures provide a way to qualify wakeups if they
should be considered a good/bad wakeups in regard to polls.

For s390 the implementation will fence of halt polling for anything but
known good, single vCPU events. The s390 implementation for floating
interrupts does a wakeup for one vCPU, but the interrupt will be delivered
by whatever CPU checks first for a pending interrupt. We prefer the
woken up CPU by marking the poll of this CPU as "good" poll.
This code will also mark several other wakeup reasons like IPI or
expired timers as "good". This will of course also mark some events as
not sucessful. As  KVM on z runs always as a 2nd level hypervisor,
we prefer to not poll, unless we are really sure, though.

This patch successfully limits the CPU usage for cases like uperf 1byte
transactional ping pong workload or wakeup heavy workload like OLTP
while still providing a proper speedup.

This also introduced a new vcpu stat "halt_poll_no_tuning" that marks
wakeups that are considered not good for polling.
Signed-off-by: NChristian Borntraeger <borntraeger@de.ibm.com>
Acked-by: Radim Krčmář <rkrcmar@redhat.com> (for an earlier version)
Cc: David Matlack <dmatlack@google.com>
Cc: Wanpeng Li <kernellwp@gmail.com>
[Rename config symbol. - Paolo]
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

3491caf2

openanolis / cloud-kernel 1 年多 前同步成功

openanolis / cloud-kernel
1 年多前同步成功