1. 17 3月, 2018 1 次提交
  2. 16 1月, 2018 1 次提交
    • L
      KVM: nVMX: Fix bug of injecting L2 exception into L1 · 5c7d4f9a
      Liran Alon 提交于
      kvm_clear_exception_queue() should clear pending exception.
      This also includes exceptions which were only marked pending but not
      yet injected. This is because exception.pending is used for both L1
      and L2 to determine if an exception should be raised to guest.
      Note that an exception which is pending but not yet injected will
      be raised again once the guest will be resumed.
      
      Consider the following scenario:
      1) L0 KVM with ignore_msrs=false.
      2) L1 prepare vmcs12 with the following:
          a) No intercepts on MSR (MSR_BITMAP exist and is filled with 0).
          b) No intercept for #GP.
          c) vmx-preemption-timer is configured.
      3) L1 enters into L2.
      4) L2 reads an unhandled MSR that exists in MSR_BITMAP
      (such as 0x1fff).
      
      L2 RDMSR could be handled as described below:
      1) L2 exits to L0 on RDMSR and calls handle_rdmsr().
      2) handle_rdmsr() calls kvm_inject_gp() which sets
      KVM_REQ_EVENT, exception.pending=true and exception.injected=false.
      3) vcpu_enter_guest() consumes KVM_REQ_EVENT and calls
      inject_pending_event() which calls vmx_check_nested_events()
      which sees that exception.pending=true but
      nested_vmx_check_exception() returns 0 and therefore does nothing at
      this point. However let's assume it later sees vmx-preemption-timer
      expired and therefore exits from L2 to L1 by calling
      nested_vmx_vmexit().
      4) nested_vmx_vmexit() calls prepare_vmcs12()
      which calls vmcs12_save_pending_event() but it does nothing as
      exception.injected is false. Also prepare_vmcs12() calls
      kvm_clear_exception_queue() which does nothing as
      exception.injected is already false.
      5) We now return from vmx_check_nested_events() with 0 while still
      having exception.pending=true!
      6) Therefore inject_pending_event() continues
      and we inject L2 exception to L1!...
      
      This commit will fix above issue by changing step (4) to
      clear exception.pending in kvm_clear_exception_queue().
      
      Fixes: 664f8e26 ("KVM: X86: Fix loss of exception which has not yet been injected")
      Signed-off-by: NLiran Alon <liran.alon@oracle.com>
      Reviewed-by: NNikita Leshenko <nikita.leshchenko@oracle.com>
      Reviewed-by: NKrish Sadhukhan <krish.sadhukhan@oracle.com>
      Signed-off-by: NKrish Sadhukhan <krish.sadhukhan@oracle.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com>
      5c7d4f9a
  3. 14 12月, 2017 3 次提交
  4. 02 11月, 2017 1 次提交
    • G
      License cleanup: add SPDX GPL-2.0 license identifier to files with no license · b2441318
      Greg Kroah-Hartman 提交于
      Many source files in the tree are missing licensing information, which
      makes it harder for compliance tools to determine the correct license.
      
      By default all files without license information are under the default
      license of the kernel, which is GPL version 2.
      
      Update the files which contain no license information with the 'GPL-2.0'
      SPDX license identifier.  The SPDX identifier is a legally binding
      shorthand, which can be used instead of the full boiler plate text.
      
      This patch is based on work done by Thomas Gleixner and Kate Stewart and
      Philippe Ombredanne.
      
      How this work was done:
      
      Patches were generated and checked against linux-4.14-rc6 for a subset of
      the use cases:
       - file had no licensing information it it.
       - file was a */uapi/* one with no licensing information in it,
       - file was a */uapi/* one with existing licensing information,
      
      Further patches will be generated in subsequent months to fix up cases
      where non-standard license headers were used, and references to license
      had to be inferred by heuristics based on keywords.
      
      The analysis to determine which SPDX License Identifier to be applied to
      a file was done in a spreadsheet of side by side results from of the
      output of two independent scanners (ScanCode & Windriver) producing SPDX
      tag:value files created by Philippe Ombredanne.  Philippe prepared the
      base worksheet, and did an initial spot review of a few 1000 files.
      
      The 4.13 kernel was the starting point of the analysis with 60,537 files
      assessed.  Kate Stewart did a file by file comparison of the scanner
      results in the spreadsheet to determine which SPDX license identifier(s)
      to be applied to the file. She confirmed any determination that was not
      immediately clear with lawyers working with the Linux Foundation.
      
      Criteria used to select files for SPDX license identifier tagging was:
       - Files considered eligible had to be source code files.
       - Make and config files were included as candidates if they contained >5
         lines of source
       - File already had some variant of a license header in it (even if <5
         lines).
      
      All documentation files were explicitly excluded.
      
      The following heuristics were used to determine which SPDX license
      identifiers to apply.
      
       - when both scanners couldn't find any license traces, file was
         considered to have no license information in it, and the top level
         COPYING file license applied.
      
         For non */uapi/* files that summary was:
      
         SPDX license identifier                            # files
         ---------------------------------------------------|-------
         GPL-2.0                                              11139
      
         and resulted in the first patch in this series.
      
         If that file was a */uapi/* path one, it was "GPL-2.0 WITH
         Linux-syscall-note" otherwise it was "GPL-2.0".  Results of that was:
      
         SPDX license identifier                            # files
         ---------------------------------------------------|-------
         GPL-2.0 WITH Linux-syscall-note                        930
      
         and resulted in the second patch in this series.
      
       - if a file had some form of licensing information in it, and was one
         of the */uapi/* ones, it was denoted with the Linux-syscall-note if
         any GPL family license was found in the file or had no licensing in
         it (per prior point).  Results summary:
      
         SPDX license identifier                            # files
         ---------------------------------------------------|------
         GPL-2.0 WITH Linux-syscall-note                       270
         GPL-2.0+ WITH Linux-syscall-note                      169
         ((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause)    21
         ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause)    17
         LGPL-2.1+ WITH Linux-syscall-note                      15
         GPL-1.0+ WITH Linux-syscall-note                       14
         ((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause)    5
         LGPL-2.0+ WITH Linux-syscall-note                       4
         LGPL-2.1 WITH Linux-syscall-note                        3
         ((GPL-2.0 WITH Linux-syscall-note) OR MIT)              3
         ((GPL-2.0 WITH Linux-syscall-note) AND MIT)             1
      
         and that resulted in the third patch in this series.
      
       - when the two scanners agreed on the detected license(s), that became
         the concluded license(s).
      
       - when there was disagreement between the two scanners (one detected a
         license but the other didn't, or they both detected different
         licenses) a manual inspection of the file occurred.
      
       - In most cases a manual inspection of the information in the file
         resulted in a clear resolution of the license that should apply (and
         which scanner probably needed to revisit its heuristics).
      
       - When it was not immediately clear, the license identifier was
         confirmed with lawyers working with the Linux Foundation.
      
       - If there was any question as to the appropriate license identifier,
         the file was flagged for further research and to be revisited later
         in time.
      
      In total, over 70 hours of logged manual review was done on the
      spreadsheet to determine the SPDX license identifiers to apply to the
      source files by Kate, Philippe, Thomas and, in some cases, confirmation
      by lawyers working with the Linux Foundation.
      
      Kate also obtained a third independent scan of the 4.13 code base from
      FOSSology, and compared selected files where the other two scanners
      disagreed against that SPDX file, to see if there was new insights.  The
      Windriver scanner is based on an older version of FOSSology in part, so
      they are related.
      
      Thomas did random spot checks in about 500 files from the spreadsheets
      for the uapi headers and agreed with SPDX license identifier in the
      files he inspected. For the non-uapi files Thomas did random spot checks
      in about 15000 files.
      
      In initial set of patches against 4.14-rc6, 3 files were found to have
      copy/paste license identifier errors, and have been fixed to reflect the
      correct identifier.
      
      Additionally Philippe spent 10 hours this week doing a detailed manual
      inspection and review of the 12,461 patched files from the initial patch
      version early this week with:
       - a full scancode scan run, collecting the matched texts, detected
         license ids and scores
       - reviewing anything where there was a license detected (about 500+
         files) to ensure that the applied SPDX license was correct
       - reviewing anything where there was no detection but the patch license
         was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied
         SPDX license was correct
      
      This produced a worksheet with 20 files needing minor correction.  This
      worksheet was then exported into 3 different .csv files for the
      different types of files to be modified.
      
      These .csv files were then reviewed by Greg.  Thomas wrote a script to
      parse the csv files and add the proper SPDX tag to the file, in the
      format that the file expected.  This script was further refined by Greg
      based on the output to detect more types of files automatically and to
      distinguish between header and source .c files (which need different
      comment types.)  Finally Greg ran the script using the .csv files to
      generate the patches.
      Reviewed-by: NKate Stewart <kstewart@linuxfoundation.org>
      Reviewed-by: NPhilippe Ombredanne <pombredanne@nexb.com>
      Reviewed-by: NThomas Gleixner <tglx@linutronix.de>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      b2441318
  5. 25 8月, 2017 3 次提交
    • W
      KVM: X86: Fix loss of exception which has not yet been injected · 664f8e26
      Wanpeng Li 提交于
      vmx_complete_interrupts() assumes that the exception is always injected,
      so it can be dropped by kvm_clear_exception_queue().  However,
      an exception cannot be injected immediately if it is: 1) originally
      destined to a nested guest; 2) trapped to cause a vmexit; 3) happening
      right after VMLAUNCH/VMRESUME, i.e. when nested_run_pending is true.
      
      This patch applies to exceptions the same algorithm that is used for
      NMIs, replacing exception.reinject with "exception.injected" (equivalent
      to nmi_injected).
      
      exception.pending now represents an exception that is queued and whose
      side effects (e.g., update RFLAGS.RF or DR7) have not been applied yet.
      If exception.pending is true, the exception might result in a nested
      vmexit instead, too (in which case the side effects must not be applied).
      
      exception.injected instead represents an exception that is going to be
      injected into the guest at the next vmentry.
      Reported-by: NRadim Krčmář <rkrcmar@redhat.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Radim Krčmář <rkrcmar@redhat.com>
      Signed-off-by: NWanpeng Li <wanpeng.li@hotmail.com>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      664f8e26
    • Y
      KVM: MMU: Expose the LA57 feature to VM. · fd8cb433
      Yu Zhang 提交于
      This patch exposes 5 level page table feature to the VM.
      At the same time, the canonical virtual address checking is
      extended to support both 48-bits and 57-bits address width.
      Signed-off-by: NYu Zhang <yu.c.zhang@linux.intel.com>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      fd8cb433
    • Y
      KVM: MMU: Add 5 level EPT & Shadow page table support. · 855feb67
      Yu Zhang 提交于
      Extends the shadow paging code, so that 5 level shadow page
      table can be constructed if VM is running in 5 level paging
      mode.
      
      Also extends the ept code, so that 5 level ept table can be
      constructed if maxphysaddr of VM exceeds 48 bits. Unlike the
      shadow logic, KVM should still use 4 level ept table for a VM
      whose physical address width is less than 48 bits, even when
      the VM is running in 5 level paging mode.
      Signed-off-by: NYu Zhang <yu.c.zhang@linux.intel.com>
      [Unconditionally reset the MMU context in kvm_cpuid_update.
       Changing MAXPHYADDR invalidates the reserved bit bitmasks.
       - Paolo]
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      855feb67
  6. 18 8月, 2017 1 次提交
    • P
      KVM: x86: fix use of L1 MMIO areas in nested guests · 9034e6e8
      Paolo Bonzini 提交于
      There is currently some confusion between nested and L1 GPAs.  The
      assignment to "direct" in kvm_mmu_page_fault tries to fix that, but
      it is not enough.  What this patch does is fence off the MMIO cache
      completely when using shadow nested page tables, since we have neither
      a GVA nor an L1 GPA to put in the cache.  This also allows some
      simplifications in kvm_mmu_page_fault and FNAME(page_fault).
      
      The EPT misconfig likewise does not have an L1 GPA to pass to
      kvm_io_bus_write, so that must be skipped for guest mode.
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      Reviewed-by: NDavid Hildenbrand <david@redhat.com>
      [Changed comment to say "GPAs" instead of "L1's physical addresses", as
       per David's review. - Radim]
      Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com>
      9034e6e8
  7. 21 4月, 2017 1 次提交
    • M
      kvm: better MWAIT emulation for guests · 668fffa3
      Michael S. Tsirkin 提交于
      Guests that are heavy on futexes end up IPI'ing each other a lot. That
      can lead to significant slowdowns and latency increase for those guests
      when running within KVM.
      
      If only a single guest is needed on a host, we have a lot of spare host
      CPU time we can throw at the problem. Modern CPUs implement a feature
      called "MWAIT" which allows guests to wake up sleeping remote CPUs without
      an IPI - thus without an exit - at the expense of never going out of guest
      context.
      
      The decision whether this is something sensible to use should be up to the
      VM admin, so to user space. We can however allow MWAIT execution on systems
      that support it properly hardware wise.
      
      This patch adds a CAP to user space and a KVM cpuid leaf to indicate
      availability of native MWAIT execution. With that enabled, the worst a
      guest can do is waste as many cycles as a "jmp ." would do, so it's not
      a privilege problem.
      
      We consciously do *not* expose the feature in our CPUID bitmap, as most
      people will want to benefit from sleeping vCPUs to allow for over commit.
      Reported-by: N"Gabriel L. Somlo" <gsomlo@gmail.com>
      Signed-off-by: NMichael S. Tsirkin <mst@redhat.com>
      [agraf: fix amd, change commit message]
      Signed-off-by: NAlexander Graf <agraf@suse.de>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      668fffa3
  8. 20 9月, 2016 1 次提交
    • P
      KVM: x86: introduce get_kvmclock_ns · 108b249c
      Paolo Bonzini 提交于
      Introduce a function that reads the exact nanoseconds value that is
      provided to the guest in kvmclock.  This crystallizes the notion of
      kvmclock as a thin veneer over a stable TSC, that the guest will
      (hopefully) convert with NTP.  In other words, kvmclock is *not* a
      paravirtualized host-to-guest NTP.
      
      Drop the get_kernel_ns() function, that was used both to get the base
      value of the master clock and to get the current value of kvmclock.
      The former use is replaced by ktime_get_boot_ns(), the latter is
      the purpose of get_kernel_ns().
      
      This also allows KVM to provide a Hyper-V time reference counter that
      is synchronized with the time that is computed from the TSC page.
      Reviewed-by: NRoman Kagan <rkagan@virtuozzo.com>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      108b249c
  9. 27 6月, 2016 1 次提交
  10. 22 3月, 2016 1 次提交
  11. 09 2月, 2016 2 次提交
  12. 14 9月, 2015 1 次提交
    • D
      x86/fpu: Rename XSAVE macros · d91cab78
      Dave Hansen 提交于
      There are two concepts that have some confusing naming:
       1. Extended State Component numbers (currently called
          XFEATURE_BIT_*)
       2. Extended State Component masks (currently called XSTATE_*)
      
      The numbers are (currently) from 0-9.  State component 3 is the
      bounds registers for MPX, for instance.
      
      But when we want to enable "state component 3", we go set a bit
      in XCR0.  The bit we set is 1<<3.  We can check to see if a
      state component feature is enabled by looking at its bit.
      
      The current 'xfeature_bit's are at best xfeature bit _numbers_.
      Calling them bits is at best inconsistent with ending the enum
      list with 'XFEATURES_NR_MAX'.
      
      This patch renames the enum to be 'xfeature'.  These also
      happen to be what the Intel documentation calls a "state
      component".
      
      We also want to differentiate these from the "XSTATE_*" macros.
      The "XSTATE_*" macros are a mask, and we rename them to match.
      
      These macros are reasonably widely used so this patch is a
      wee bit big, but this really is just a rename.
      
      The only non-mechanical part of this is the
      
      	s/XSTATE_EXTEND_MASK/XFEATURE_MASK_EXTEND/
      
      We need a better name for it, but that's another patch.
      Signed-off-by: NDave Hansen <dave.hansen@linux.intel.com>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Tim Chen <tim.c.chen@linux.intel.com>
      Cc: dave@sr71.net
      Cc: linux-kernel@vger.kernel.org
      Link: http://lkml.kernel.org/r/20150902233126.38653250@viggo.jf.intel.com
      [ Ported to v4.3-rc1. ]
      Signed-off-by: NIngo Molnar <mingo@kernel.org>
      d91cab78
  13. 23 7月, 2015 2 次提交
  14. 19 6月, 2015 3 次提交
  15. 07 5月, 2015 1 次提交
  16. 09 1月, 2015 2 次提交
  17. 03 11月, 2014 1 次提交
  18. 24 9月, 2014 1 次提交
  19. 03 9月, 2014 1 次提交
    • D
      kvm: x86: fix stale mmio cache bug · 56f17dd3
      David Matlack 提交于
      The following events can lead to an incorrect KVM_EXIT_MMIO bubbling
      up to userspace:
      
      (1) Guest accesses gpa X without a memory slot. The gfn is cached in
      struct kvm_vcpu_arch (mmio_gfn). On Intel EPT-enabled hosts, KVM sets
      the SPTE write-execute-noread so that future accesses cause
      EPT_MISCONFIGs.
      
      (2) Host userspace creates a memory slot via KVM_SET_USER_MEMORY_REGION
      covering the page just accessed.
      
      (3) Guest attempts to read or write to gpa X again. On Intel, this
      generates an EPT_MISCONFIG. The memory slot generation number that
      was incremented in (2) would normally take care of this but we fast
      path mmio faults through quickly_check_mmio_pf(), which only checks
      the per-vcpu mmio cache. Since we hit the cache, KVM passes a
      KVM_EXIT_MMIO up to userspace.
      
      This patch fixes the issue by using the memslot generation number
      to validate the mmio cache.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: NDavid Matlack <dmatlack@google.com>
      [xiaoguangrong: adjust the code to make it simpler for stable-tree fix.]
      Signed-off-by: NXiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
      Reviewed-by: NDavid Matlack <dmatlack@google.com>
      Reviewed-by: NXiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
      Tested-by: NDavid Matlack <dmatlack@google.com>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      56f17dd3
  20. 19 6月, 2014 2 次提交
  21. 17 3月, 2014 1 次提交
    • P
      KVM: x86: introduce kvm_supported_xcr0() · 4ff41732
      Paolo Bonzini 提交于
      XSAVE support for KVM is already using host_xcr0 & KVM_SUPPORTED_XCR0 as
      a "dynamic" version of KVM_SUPPORTED_XCR0.
      
      However, this is not enough because the MPX bits should not be presented
      to the guest unless kvm_x86_ops confirms the support.  So, replace all
      instances of host_xcr0 & KVM_SUPPORTED_XCR0 with a new function
      kvm_supported_xcr0() that also has this check.
      
      Note that here:
      
      		if (xstate_bv & ~KVM_SUPPORTED_XCR0)
      			return -EINVAL;
      		if (xstate_bv & ~host_cr0)
      			return -EINVAL;
      
      the code is equivalent to
      
      		if ((xstate_bv & ~KVM_SUPPORTED_XCR0) ||
      		    (xstate_bv & ~host_cr0)
      			return -EINVAL;
      
      i.e. "xstate_bv & (~KVM_SUPPORTED_XCR0 | ~host_cr0)" which is in turn
      equal to "xstate_bv & ~(KVM_SUPPORTED_XCR0 & host_cr0)".  So we should
      also use the new function there.
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      4ff41732
  22. 26 2月, 2014 1 次提交
  23. 15 1月, 2014 1 次提交
  24. 03 10月, 2013 1 次提交
  25. 01 12月, 2012 1 次提交
    • W
      KVM: x86: Add code to track call origin for msr assignment · 8fe8ab46
      Will Auld 提交于
      In order to track who initiated the call (host or guest) to modify an msr
      value I have changed function call parameters along the call path. The
      specific change is to add a struct pointer parameter that points to (index,
      data, caller) information rather than having this information passed as
      individual parameters.
      
      The initial use for this capability is for updating the IA32_TSC_ADJUST msr
      while setting the tsc value. It is anticipated that this capability is
      useful for other tasks.
      Signed-off-by: NWill Auld <will.auld@intel.com>
      Signed-off-by: NMarcelo Tosatti <mtosatti@redhat.com>
      8fe8ab46
  26. 07 8月, 2012 1 次提交
  27. 08 4月, 2012 1 次提交
  28. 27 12月, 2011 1 次提交
  29. 24 7月, 2011 1 次提交
  30. 12 7月, 2011 1 次提交