1. 09 5月, 2017 2 次提交
    • C
      KVM: Add kvm_vcpu_get_idx to get vcpu index in kvm->vcpus · 497d72d8
      Christoffer Dall 提交于
      There are occasional needs to use the index of vcpu in the kvm->vcpus
      array to map something related to a VCPU.  For example, unlike the
      vcpu->vcpu_id, the vcpu index is guaranteed to not be sparse across all
      vcpus which is useful when allocating a memory area for each vcpu.
      Signed-off-by: NChristoffer Dall <cdall@linaro.org>
      Reviewed-by: NEric Auger <eric.auger@redhat.com>
      497d72d8
    • M
      mm: introduce kv[mz]alloc helpers · a7c3e901
      Michal Hocko 提交于
      Patch series "kvmalloc", v5.
      
      There are many open coded kmalloc with vmalloc fallback instances in the
      tree.  Most of them are not careful enough or simply do not care about
      the underlying semantic of the kmalloc/page allocator which means that
      a) some vmalloc fallbacks are basically unreachable because the kmalloc
      part will keep retrying until it succeeds b) the page allocator can
      invoke a really disruptive steps like the OOM killer to move forward
      which doesn't sound appropriate when we consider that the vmalloc
      fallback is available.
      
      As it can be seen implementing kvmalloc requires quite an intimate
      knowledge if the page allocator and the memory reclaim internals which
      strongly suggests that a helper should be implemented in the memory
      subsystem proper.
      
      Most callers, I could find, have been converted to use the helper
      instead.  This is patch 6.  There are some more relying on __GFP_REPEAT
      in the networking stack which I have converted as well and Eric Dumazet
      was not opposed [2] to convert them as well.
      
      [1] http://lkml.kernel.org/r/20170130094940.13546-1-mhocko@kernel.org
      [2] http://lkml.kernel.org/r/1485273626.16328.301.camel@edumazet-glaptop3.roam.corp.google.com
      
      This patch (of 9):
      
      Using kmalloc with the vmalloc fallback for larger allocations is a
      common pattern in the kernel code.  Yet we do not have any common helper
      for that and so users have invented their own helpers.  Some of them are
      really creative when doing so.  Let's just add kv[mz]alloc and make sure
      it is implemented properly.  This implementation makes sure to not make
      a large memory pressure for > PAGE_SZE requests (__GFP_NORETRY) and also
      to not warn about allocation failures.  This also rules out the OOM
      killer as the vmalloc is a more approapriate fallback than a disruptive
      user visible action.
      
      This patch also changes some existing users and removes helpers which
      are specific for them.  In some cases this is not possible (e.g.
      ext4_kvmalloc, libcfs_kvzalloc) because those seems to be broken and
      require GFP_NO{FS,IO} context which is not vmalloc compatible in general
      (note that the page table allocation is GFP_KERNEL).  Those need to be
      fixed separately.
      
      While we are at it, document that __vmalloc{_node} about unsupported gfp
      mask because there seems to be a lot of confusion out there.
      kvmalloc_node will warn about GFP_KERNEL incompatible (which are not
      superset) flags to catch new abusers.  Existing ones would have to die
      slowly.
      
      [sfr@canb.auug.org.au: f2fs fixup]
        Link: http://lkml.kernel.org/r/20170320163735.332e64b7@canb.auug.org.au
      Link: http://lkml.kernel.org/r/20170306103032.2540-2-mhocko@kernel.orgSigned-off-by: NMichal Hocko <mhocko@suse.com>
      Signed-off-by: NStephen Rothwell <sfr@canb.auug.org.au>
      Reviewed-by: Andreas Dilger <adilger@dilger.ca>	[ext4 part]
      Acked-by: NVlastimil Babka <vbabka@suse.cz>
      Cc: John Hubbard <jhubbard@nvidia.com>
      Cc: David Miller <davem@davemloft.net>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      a7c3e901
  2. 03 5月, 2017 1 次提交
    • P
      Revert "KVM: Support vCPU-based gfn->hva cache" · 4e335d9e
      Paolo Bonzini 提交于
      This reverts commit bbd64115.
      
      I've been sitting on this revert for too long and it unfortunately
      missed 4.11.  It's also the reason why I haven't merged ring-based
      dirty tracking for 4.12.
      
      Using kvm_vcpu_memslots in kvm_gfn_to_hva_cache_init and
      kvm_vcpu_write_guest_offset_cached means that the MSR value can
      now be used to access SMRAM, simply by making it point to an SMRAM
      physical address.  This is problematic because it lets the guest
      OS overwrite memory that it shouldn't be able to touch.
      
      Cc: stable@vger.kernel.org
      Fixes: bbd64115Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      4e335d9e
  3. 02 5月, 2017 1 次提交
  4. 27 4月, 2017 6 次提交
  5. 21 4月, 2017 1 次提交
  6. 13 4月, 2017 1 次提交
  7. 07 4月, 2017 2 次提交
  8. 24 3月, 2017 1 次提交
  9. 02 3月, 2017 1 次提交
  10. 17 2月, 2017 2 次提交
  11. 30 1月, 2017 1 次提交
  12. 28 11月, 2016 1 次提交
    • S
      KVM: Export kvm module parameter variables · ec76d819
      Suraj Jitindar Singh 提交于
      The kvm module has the parameters halt_poll_ns, halt_poll_ns_grow, and
      halt_poll_ns_shrink. Halt polling was recently added to the powerpc kvm-hv
      module and these parameters were essentially duplicated for that. There is
      no benefit to this duplication and it can lead to confusion when trying to
      tune halt polling.
      
      Thus move the definition of these variables to kvm_host.h and export them.
      This will allow the kvm-hv module to use the same module parameters by
      accessing these variables, which will be implemented in the next patch,
      meaning that they will no longer be duplicated.
      Signed-off-by: NSuraj Jitindar Singh <sjitindarsingh@gmail.com>
      Signed-off-by: NPaul Mackerras <paulus@ozlabs.org>
      ec76d819
  13. 23 11月, 2016 1 次提交
  14. 22 11月, 2016 1 次提交
  15. 07 10月, 2016 1 次提交
    • R
      x86/fpu, kvm: Remove KVM vcpu->fpu_counter · 3d42de25
      Rik van Riel 提交于
      With the removal of the lazy FPU code, this field is no longer used.
      Get rid of it.
      Signed-off-by: NRik van Riel <riel@redhat.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Quentin Casasnovas <quentin.casasnovas@oracle.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: pbonzini@redhat.com
      Link: http://lkml.kernel.org/r/1475627678-20788-7-git-send-email-riel@redhat.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      3d42de25
  16. 16 9月, 2016 2 次提交
    • L
      kvm: create per-vcpu dirs in debugfs · 45b5939e
      Luiz Capitulino 提交于
      This commit adds the ability for archs to export
      per-vcpu information via a new per-vcpu dir in
      the VM's debugfs directory.
      
      If kvm_arch_has_vcpu_debugfs() returns true, then KVM
      will create a vcpu dir for each vCPU in the VM's
      debugfs directory. Then kvm_arch_create_vcpu_debugfs()
      is responsible for populating each vcpu directory
      with arch specific entries.
      
      The per-vcpu path in debugfs will look like:
      
      /sys/kernel/debug/kvm/29162-10/vcpu0
      /sys/kernel/debug/kvm/29162-10/vcpu1
      
      This is all arch specific for now because the only
      user of this interface (x86) wants to export x86-specific
      per-vcpu information to user-space.
      Signed-off-by: NLuiz Capitulino <lcapitulino@redhat.com>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      45b5939e
    • L
      kvm: add stubs for arch specific debugfs support · 235539b4
      Luiz Capitulino 提交于
      Two stubs are added:
      
       o kvm_arch_has_vcpu_debugfs(): must return true if the arch
         supports creating debugfs entries in the vcpu debugfs dir
         (which will be implemented by the next commit)
      
       o kvm_arch_create_vcpu_debugfs(): code that creates debugfs
         entries in the vcpu debugfs dir
      
      For x86, this commit introduces a new file to avoid growing
      arch/x86/kvm/x86.c even more.
      Signed-off-by: NLuiz Capitulino <lcapitulino@redhat.com>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      235539b4
  17. 12 8月, 2016 2 次提交
  18. 23 7月, 2016 3 次提交
  19. 19 7月, 2016 1 次提交
  20. 14 7月, 2016 1 次提交
  21. 01 7月, 2016 1 次提交
  22. 28 6月, 2016 1 次提交
  23. 16 6月, 2016 2 次提交
    • P
      KVM: remove kvm_vcpu_compatible · 557abc40
      Paolo Bonzini 提交于
      The new created_vcpus field makes it possible to avoid the race between
      irqchip and VCPU creation in a much nicer way; just check under kvm->lock
      whether a VCPU has already been created.
      
      We can then remove KVM_APIC_ARCHITECTURE too, because at this point the
      symbol is only governing the default definition of kvm_vcpu_compatible.
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      557abc40
    • P
      KVM: introduce kvm->created_vcpus · 6c7caebc
      Paolo Bonzini 提交于
      The race between creating the irqchip and the first VCPU is
      currently fixed by checking the presence of an irqchip before
      updating kvm->online_vcpus, and undoing the whole VCPU creation
      if someone created the irqchip in the meanwhile.
      
      Instead, introduce a new field in struct kvm that will count VCPUs
      under a mutex, without the atomic access and memory ordering that we
      need elsewhere to protect the vcpus array.  This also plugs the race
      and is more easily applicable in all similar circumstances.
      Reviewed-by: NCornelia Huck <cornelia.huck@de.ibm.com>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      6c7caebc
  24. 25 5月, 2016 1 次提交
    • J
      KVM: Create debugfs dir and stat files for each VM · 536a6f88
      Janosch Frank 提交于
      This patch adds a kvm debugfs subdirectory for each VM, which is named
      after its pid and file descriptor. The directories contain the same
      kind of files that are already in the kvm debugfs directory, but the
      data exported through them is now VM specific.
      
      This makes the debugfs kvm data a convenient alternative to the
      tracepoints which already have per VM data. The debugfs data is easy
      to read and low overhead.
      
      CC: Dan Carpenter <dan.carpenter@oracle.com> [includes fixes by Dan Carpenter]
      Signed-off-by: NJanosch Frank <frankja@linux.vnet.ibm.com>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      536a6f88
  25. 19 5月, 2016 1 次提交
  26. 13 5月, 2016 1 次提交
    • C
      KVM: halt_polling: provide a way to qualify wakeups during poll · 3491caf2
      Christian Borntraeger 提交于
      Some wakeups should not be considered a sucessful poll. For example on
      s390 I/O interrupts are usually floating, which means that _ALL_ CPUs
      would be considered runnable - letting all vCPUs poll all the time for
      transactional like workload, even if one vCPU would be enough.
      This can result in huge CPU usage for large guests.
      This patch lets architectures provide a way to qualify wakeups if they
      should be considered a good/bad wakeups in regard to polls.
      
      For s390 the implementation will fence of halt polling for anything but
      known good, single vCPU events. The s390 implementation for floating
      interrupts does a wakeup for one vCPU, but the interrupt will be delivered
      by whatever CPU checks first for a pending interrupt. We prefer the
      woken up CPU by marking the poll of this CPU as "good" poll.
      This code will also mark several other wakeup reasons like IPI or
      expired timers as "good". This will of course also mark some events as
      not sucessful. As  KVM on z runs always as a 2nd level hypervisor,
      we prefer to not poll, unless we are really sure, though.
      
      This patch successfully limits the CPU usage for cases like uperf 1byte
      transactional ping pong workload or wakeup heavy workload like OLTP
      while still providing a proper speedup.
      
      This also introduced a new vcpu stat "halt_poll_no_tuning" that marks
      wakeups that are considered not good for polling.
      Signed-off-by: NChristian Borntraeger <borntraeger@de.ibm.com>
      Acked-by: Radim Krčmář <rkrcmar@redhat.com> (for an earlier version)
      Cc: David Matlack <dmatlack@google.com>
      Cc: Wanpeng Li <kernellwp@gmail.com>
      [Rename config symbol. - Paolo]
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      3491caf2
  27. 12 5月, 2016 1 次提交