1. 28 Feb 2020 · 2 commits
    • KVM: s390: protvirt: Implement interrupt injection · 201ae986
      Michael Mueller authored
      This defines the necessary data structures in the SIE control block to
      inject machine checks, external and I/O interrupts. We first define the
      interrupt injection control, which selects the next interrupt to
      inject. Then we define the fields that contain the payload for machine
      checks, external and I/O interrupts.
      This is then used to implement interruption injection for the following
      list of interruption types:
      
         - I/O (uses inject io interruption)
           __deliver_io
      
         - External (uses inject external interruption)
           __deliver_cpu_timer
           __deliver_ckc
           __deliver_emergency_signal
           __deliver_external_call
      
         - cpu restart (uses inject restart interruption)
           __deliver_restart
      
         - machine checks (uses mcic, failing address and external damage)
           __write_machine_check
      
      Please note that posted interrupts (GISA) are not used for protected
      guests as of today.
      
      The service interrupt is handled in a followup patch.
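      
      Purely as an illustration of that layout idea, a hedged sketch of an
      injection control field plus per-type payload fields might look like the
      following; every identifier here is hypothetical and does not claim to
      match the actual SIE control block fields added by this patch:
      
      #include <stdint.h>
      
      /* hypothetical codes for "which interrupt to inject next" */
      enum inject_code {
              INJECT_NONE = 0,
              INJECT_MCHK,            /* machine check */
              INJECT_EXT,             /* external interrupt */
              INJECT_IO,              /* I/O interrupt */
              INJECT_RESTART,         /* cpu restart */
      };
      
      /* hypothetical injection area: one control field plus per-type payload */
      struct inject_area {
              uint8_t  inject_ctl;            /* next interrupt to inject */
              /* external interrupt payload */
              uint16_t ext_code;
              uint32_t ext_param;
              uint64_t ext_param64;
              /* I/O interrupt payload */
              uint16_t io_subchannel_id;
              uint16_t io_subchannel_nr;
              uint32_t io_int_param;
              uint32_t io_int_word;
              /* machine check payload */
              uint64_t mcic;                  /* machine check interruption code */
              uint64_t failing_addr;
              uint64_t ext_damage_code;
      };
      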
      Signed-off-by: Michael Mueller <mimu@linux.ibm.com>
      Reviewed-by: Thomas Huth <thuth@redhat.com>
      Reviewed-by: Cornelia Huck <cohuck@redhat.com>
      [borntraeger@de.ibm.com: patch merging, splitting, fixing]
      Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
    • KVM: s390/interrupt: do not pin adapter interrupt pages · f6547066
      Ulrich Weigand authored
      The adapter interrupt page containing the indicator bits is currently
      pinned. That means a guest with many devices can pin a lot of memory
      pages in the host. This also complicates the reference tracking that is
      needed for memory management handling of protected virtual machines. It
      might also have strange side effects with madvise(MADV_DONTNEED) and
      other operations.
      
      We can simply try to get the userspace page, set the bits and free the
      page. By storing the userspace address in the irq routing entry instead
      of the guest address, we can actually avoid many lookups and list walks,
      so this variant is very likely not slower.
      
      If userspace messes around with the memory slots, the worst thing that
      can happen is that we write to some other memory within that process.
      As we get the page with FOLL_WRITE, this can also not be used to
      write to shared read-only pages.
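      
      A minimal sketch of that flow, assuming a helper that receives the
      userspace address and the bit number kept in the routing entry (the
      names here are made up for illustration, and the real code may use a
      different GUP variant depending on the calling context):
      
      #include <linux/mm.h>
      #include <linux/highmem.h>
      #include <linux/bits.h>
      
      /* illustrative only: resolve the page, set the indicator bit, drop it */
      static int set_indicator_bit(unsigned long uaddr, unsigned long bit_nr)
      {
              struct page *page;
              void *map;
              int ret;
      
              /* FOLL_WRITE so a read-only or COW mapping cannot be abused */
              ret = get_user_pages_fast(uaddr, 1, FOLL_WRITE, &page);
              if (ret != 1)
                      return ret < 0 ? ret : -EFAULT;
      
              map = kmap(page);
              /* offset within the page plus bit number (bit swapping omitted) */
              set_bit(bit_nr + (uaddr & ~PAGE_MASK) * BITS_PER_BYTE, map);
              kunmap(page);
      
              put_page(page);   /* no long-term pin: release immediately */
              return 0;
      }
      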
      Signed-off-by: Ulrich Weigand <Ulrich.Weigand@de.ibm.com>
      Acked-by: David Hildenbrand <david@redhat.com>
      Reviewed-by: Cornelia Huck <cohuck@redhat.com>
      [borntraeger@de.ibm.com: patch simplification]
      Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
  2. 31 Jan 2020 · 1 commit
  3. 04 Oct 2019 · 1 commit
  4. 12 Sep 2019 · 1 commit
    • KVM: s390: Do not leak kernel stack data in the KVM_S390_INTERRUPT ioctl · 53936b5b
      Thomas Huth authored
      When the userspace program runs the KVM_S390_INTERRUPT ioctl to inject
      an interrupt, we convert it from the legacy struct kvm_s390_interrupt
      to the new struct kvm_s390_irq via the s390int_to_s390irq() function.
      However, this function does not take care of all types of interrupts
      that we can inject into the guest later (see do_inject_vcpu()). Since we
      do not clear out the s390irq values before calling s390int_to_s390irq(),
      there is a chance that we copy random data from the kernel stack, which
      could be leaked to userspace later.
      
      Specifically, the problem exists with the KVM_S390_INT_PFAULT_INIT
      interrupt: s390int_to_s390irq() does not handle it, and the function
      __inject_pfault_init() later copies irq->u.ext which contains the
      random kernel stack data. This data can then be leaked either to
      the guest memory in __deliver_pfault_init(), or the userspace might
      retrieve it directly with the KVM_S390_GET_IRQ_STATE ioctl.
      
      Fix it by handling that interrupt type in s390int_to_s390irq(), too,
      and by making sure that the s390irq struct is properly pre-initialized.
      And while we're at it, make sure that s390int_to_s390irq() now
      directly returns -EINVAL for unknown interrupt types, so that we
      immediately get a proper error code in case we add more interrupt
      types to do_inject_vcpu() without updating s390int_to_s390irq()
      sometime in the future.
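      
      Sketched rather than quoted from the patch, the fix amounts to
      pre-initializing the target struct at the call site
      (struct kvm_s390_irq s390irq = {};) and rejecting unknown types in the
      converter; the case list below is abbreviated:
      
      static int s390int_to_s390irq(struct kvm_s390_interrupt *s390int,
                                    struct kvm_s390_irq *irq)
      {
              irq->type = s390int->type;
      
              switch (s390int->type) {
              case KVM_S390_INT_PFAULT_INIT:          /* previously unhandled */
                      irq->u.ext.ext_params = s390int->parm;
                      irq->u.ext.ext_params2 = s390int->parm64;
                      break;
              /* ... other interrupt types handled as before ... */
              default:
                      return -EINVAL;  /* unknown type: error out, never leak */
              }
              return 0;
      }
      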
      
      Cc: stable@vger.kernel.org
      Reviewed-by: David Hildenbrand <david@redhat.com>
      Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
      Reviewed-by: Janosch Frank <frankja@linux.ibm.com>
      Signed-off-by: Thomas Huth <thuth@redhat.com>
      Link: https://lore.kernel.org/kvm/20190912115438.25761-1-thuth@redhat.com
      Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
  5. 20 Jul 2019 · 2 commits
    • KVM: s390: Use kvm_vcpu_wake_up in kvm_s390_vcpu_wakeup · d9847409
      Wanpeng Li authored
      Use kvm_vcpu_wake_up() in kvm_s390_vcpu_wakeup().
      Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Radim Krčmář <rkrcmar@redhat.com>
      Cc: Christian Borntraeger <borntraeger@de.ibm.com>
      Signed-off-by: Wanpeng Li <wanpengli@tencent.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • KVM: Boost vCPUs that are delivering interrupts · d73eb57b
      Wanpeng Li authored
      Inspired by commit 9cac38dd (KVM/s390: Set preempted flag during
      vcpu wakeup and interrupt delivery), we want to boost not just
      lock holders but also vCPUs that are delivering interrupts. Most
      smp_call_function_many calls are synchronous, so the IPI target vCPUs
      are also good yield candidates.  This patch introduces vcpu->ready to
      boost vCPUs during wakeup and interrupt delivery time; unlike s390 we do
      not reuse vcpu->preempted, so that voluntarily preempted vCPUs are taken
      into account by kvm_vcpu_on_spin while vmx_vcpu_pi_put is not affected
      (VT-d PI handles voluntary preemption separately, in pi_pre_block).
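      
      A minimal, self-contained sketch of the idea with simplified types
      (this is not the kvm_main.c code itself):
      
      #include <linux/compiler.h>
      #include <linux/types.h>
      
      /* simplified vcpu with only the two flags that matter here */
      struct vcpu_sketch {
              bool preempted;    /* involuntarily descheduled while running */
              bool ready;        /* just woken to deliver an interrupt/IPI */
      };
      
      /* wakeup / interrupt-delivery path: mark the target as a boost candidate */
      static void vcpu_mark_ready(struct vcpu_sketch *v)
      {
              WRITE_ONCE(v->ready, true);
      }
      
      /* directed yield: preempted lock holders and interrupt targets are both
       * worth donating a timeslice to; keeping 'ready' separate from
       * 'preempted' leaves code that looks only at 'preempted'
       * (e.g. vmx_vcpu_pi_put) unaffected */
      static bool vcpu_good_yield_target(struct vcpu_sketch *v)
      {
              return READ_ONCE(v->preempted) || READ_ONCE(v->ready);
      }
      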
      
      Testing on an 80 HT, 2 socket Xeon Skylake server, with an 80 vCPU VM and 80GB RAM:
      ebizzy -M
      
                  vanilla     boosting    improved
      1VM          21443       23520         9%
      2VM           2800        8000       180%
      3VM           1800        3100        72%
      
      Testing on my Haswell desktop (8 HT), with an 8 vCPU / 8GB RAM VM configuration,
      two VMs: one running ebizzy -M, the other running 'stress --cpu 2':
      
      w/ boosting + w/o pv sched yield(vanilla)
      
                  vanilla     boosting   improved
                    1570         4000      155%
      
      w/ boosting + w/ pv sched yield(vanilla)
      
                  vanilla     boosting   improved
                    1844         5157      179%
      
      w/o boosting, perf top in VM:
      
       72.33%  [kernel]       [k] smp_call_function_many
        4.22%  [kernel]       [k] call_function_interrupt
        3.71%  [kernel]       [k] async_page_fault
      
      w/ boosting, perf top in VM:
      
       38.43%  [kernel]       [k] smp_call_function_many
        6.31%  [kernel]       [k] async_page_fault
        6.13%  libc-2.23.so   [.] __memcpy_avx_unaligned
        4.88%  [kernel]       [k] call_function_interrupt
      
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Radim Krčmář <rkrcmar@redhat.com>
      Cc: Christian Borntraeger <borntraeger@de.ibm.com>
      Cc: Paul Mackerras <paulus@ozlabs.org>
      Cc: Marc Zyngier <maz@kernel.org>
      Signed-off-by: Wanpeng Li <wanpengli@tencent.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
  6. 15 May 2019 · 1 commit
    • mm/gup: change GUP fast to use flags rather than a write 'bool' · 73b0140b
      Ira Weiny authored
      To facilitate additional options to get_user_pages_fast(), change the
      singular 'write' parameter to gup_flags.
      
      This patch does not change any functionality.  New functionality will
      follow in subsequent patches.
      
      Some of the get_user_pages_fast() call sites were unchanged because they
      already passed FOLL_WRITE or 0 for the write parameter.
      
      NOTE: It was suggested to change the ordering of the get_user_pages_fast()
      arguments to ensure that callers were converted.  This breaks the current
      GUP call site convention of having the returned pages be the final
      parameter.  So the suggestion was rejected.
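      
      An illustrative call-site conversion; the wrapper function below is
      hypothetical, and only the get_user_pages_fast() signature change comes
      from this patch:
      
      #include <linux/mm.h>
      
      /* pin one page of a user address for writing; kernel context assumed */
      static int pin_one_page_for_write(unsigned long addr, struct page **page)
      {
              /* before this series: get_user_pages_fast(addr, 1, 1, page);
               * the third argument was the 'write' bool -- read-only callers
               * passed 0 and keep passing 0 as gup_flags */
              return get_user_pages_fast(addr, 1, FOLL_WRITE, page);
      }
      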
      
      Link: http://lkml.kernel.org/r/20190328084422.29911-4-ira.weiny@intel.com
      Link: http://lkml.kernel.org/r/20190317183438.2057-4-ira.weiny@intel.com
      Signed-off-by: Ira Weiny <ira.weiny@intel.com>
      Reviewed-by: Mike Marshall <hubcap@omnibond.com>
      Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: James Hogan <jhogan@kernel.org>
      Cc: Jason Gunthorpe <jgg@ziepe.ca>
      Cc: John Hubbard <jhubbard@nvidia.com>
      Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Michal Hocko <mhocko@kernel.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Rich Felker <dalias@libc.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  7. 29 Apr 2019 · 1 commit
  8. 18 Apr 2019 · 1 commit
  9. 05 Feb 2019 · 12 commits
  10. 20 Jun 2018 · 1 commit
  11. 17 May 2018 · 1 commit
  12. 15 Mar 2018 · 2 commits
  13. 21 Feb 2018 · 1 commit
  14. 14 Feb 2018 · 3 commits
  15. 26 Jan 2018 · 8 commits
  16. 25 Jan 2018 · 2 commits