- 24 10月, 2017 1 次提交
-
-
由 mike.travis@hpe.com 提交于
Fix build problem: >> WARNING: vmlinux.o(.text+0x4223a): Section mismatch in reference from the function uv_tsc_check_sync() to the function .init.text:uv_early_read_mmr() The function uv_tsc_check_sync() references the function __init uv_early_read_mmr(). This is often because uv_tsc_check_sync lacks a __init Signed-off-by: NMike Travis <mike.travis@hpe.com> Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Cc: Prarit Bhargava <prarit@redhat.com> Cc: Dimitri Sivanich <dimitri.sivanich@hpe.com> Cc: Russ Anderson <russ.anderson@hpe.com> Cc: Andrew Banman <andrew.banman@hpe.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Bin Gao <bin.gao@linux.intel.com> Link: https://lkml.kernel.org/r/20171023191841.985614692@stormcage.americas.sgi.com
-
- 17 10月, 2017 6 次提交
-
-
由 Thomas Gleixner 提交于
tsc_async_resets is only available when CONFIG_X86_TSC=y. So a build with CONFIG_X86_TSC=n breaks: arch/x86/kernel/tsc.o: In function `tsc_init': (.init.text+0x87b): undefined reference to `tsc_async_resets' Add a stub define for the TSC=n case. Side note: This config switch should simply be removed. Reported-by: Nkbuild test robot <fengguang.wu@intel.com> Fixes: 341102c3 ("x86/tsc: Add option that TSC on Socket 0 being non-zero is valid") Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Cc: Mike Travis <mike.travis@hpe.com>
-
由 mike.travis@hpe.com 提交于
Insert a check early in UV system startup that checks whether BIOS was able to obtain satisfactory TSC Sync stability. If not, it usually is caused by an error in the external TSC clock generation source. In this case the best fallback is to use the builtin hardware RTC as the kernel will not be able to set an accurate TSC sync either. Signed-off-by: NMike Travis <mike.travis@hpe.com> Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Reviewed-by: NDimitri Sivanich <dimitri.sivanich@hpe.com> Reviewed-by: NRuss Anderson <russ.anderson@hpe.com> Reviewed-by: NAndrew Banman <andrew.abanman@hpe.com> Cc: Prarit Bhargava <prarit@redhat.com> Cc: Andrew Banman <andrew.banman@hpe.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Bin Gao <bin.gao@linux.intel.com> Link: https://lkml.kernel.org/r/20171012163202.406294490@stormcage.americas.sgi.com
-
由 mike.travis@hpe.com 提交于
On systems where multiple chassis are reset asynchronously, and thus the TSC counters are started asynchronously, the offset needed to convert to TSC to ART would be different. Disable ART in that case and rely on the TSC counters to supply the accurate time. Signed-off-by: NMike Travis <mike.travis@hpe.com> Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Reviewed-by: NThomas Gleixner <tglx@linutronix.de> Cc: Prarit Bhargava <prarit@redhat.com> Cc: Dimitri Sivanich <dimitri.sivanich@hpe.com> Cc: Russ Anderson <russ.anderson@hpe.com> Cc: Andrew Banman <andrew.banman@hpe.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Bin Gao <bin.gao@linux.intel.com> Link: https://lkml.kernel.org/r/20171012163202.289397994@stormcage.americas.sgi.com
-
由 mike.travis@hpe.com 提交于
Prior to the TSC ADJUST MSR being available, the method to set TSC's in sync with each other naturally caused a small skew between cpu threads. This was NOT a firmware bug at the time so introducing a whole avalanche of alarming warning messages might cause unnecessary concern and customer complaints. (Example: >3000 msgs in a 32 socket Skylake system.) Simply report the warning condition, if possible do the necessary fixes, and move on. Signed-off-by: NMike Travis <mike.travis@hpe.com> Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Reviewed-by: NDimitri Sivanich <dimitri.sivanich@hpe.com> Reviewed-by: NRuss Anderson <russ.anderson@hpe.com> Reviewed-by: NPeter Zijlstra <peterz@infradead.org> Cc: Prarit Bhargava <prarit@redhat.com> Cc: Andrew Banman <andrew.banman@hpe.com> Cc: Bin Gao <bin.gao@linux.intel.com> Link: https://lkml.kernel.org/r/20171012163202.175062400@stormcage.americas.sgi.com
-
由 mike.travis@hpe.com 提交于
If the TSC has already been determined to be unstable, then checking TSC ADJUST values is a waste of time and generates unnecessary error messages. Signed-off-by: NMike Travis <mike.travis@hpe.com> Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Reviewed-by: NDimitri Sivanich <dimitri.sivanich@hpe.com> Reviewed-by: NRuss Anderson <russ.anderson@hpe.com> Reviewed-by: NPeter Zijlstra <peterz@infradead.org> Cc: Prarit Bhargava <prarit@redhat.com> Cc: Andrew Banman <andrew.banman@hpe.com> Cc: Bin Gao <bin.gao@linux.intel.com> Link: https://lkml.kernel.org/r/20171012163202.060777495@stormcage.americas.sgi.com
-
由 mike.travis@hpe.com 提交于
Add a flag to indicate and process that TSC counters are on chassis that reset at different times during system startup. Therefore which TSC ADJUST values should be zero is not predictable. Signed-off-by: NMike Travis <mike.travis@hpe.com> Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Reviewed-by: NDimitri Sivanich <dimitri.sivanich@hpe.com> Reviewed-by: NRuss Anderson <russ.anderson@hpe.com> Reviewed-by: NAndrew Banman <andrew.abanman@hpe.com> Reviewed-by: NPeter Zijlstra <peterz@infradead.org> Cc: Prarit Bhargava <prarit@redhat.com> Cc: Andrew Banman <andrew.banman@hpe.com> Cc: Bin Gao <bin.gao@linux.intel.com> Link: https://lkml.kernel.org/r/20171012163201.944370012@stormcage.americas.sgi.com
-
- 25 9月, 2017 3 次提交
-
-
由 Boris Ostrovsky 提交于
simple_udelay_calibration() relies on x86_platform's calibration ops. For KVM these ops are set late in setup_arch() and so simple_udelay_calibration() ends up using native version. Besides being possibly incorrect, this significantly increases kernel boot time. For example, on my laptop executing start_kernel() by a guest takes ~10 times more than when KVM's ops are used. Since early_xdbc_setup_hardware() relies on calibration having been performed move it too. Signed-off-by: NBoris Ostrovsky <boris.ostrovsky@oracle.com> Cc: baolu.lu@linux.intel.com Link: https://lkml.kernel.org/r/20170911185111.20636-1-boris.ostrovsky@oracle.comSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
-
由 Dou Liyang 提交于
recalibrate_cpu_khz() is called from powernow K7 and Pentium 4/Xeon CPU freq driver. It recalibrates cpu frequency in case of SMP = n and doesn't need to return anything. Mark it void, also remove the #else branch. Signed-off-by: NDou Liyang <douly.fnst@cn.fujitsu.com> Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Link: https://lkml.kernel.org/r/1500003247-17368-2-git-send-email-douly.fnst@cn.fujitsu.com
-
由 Dou Liyang 提交于
Commit dd759d93 ("x86/timers: Add simple udelay calibration") adds an static function in x86 boot-time initializations. But, this function is actually related to TSC, so it should be maintained in tsc.c, not in setup.c. Move simple_udelay_calibration() from setup.c to tsc.c and rename it to tsc_early_delay_calibrate for more readability. Signed-off-by: NDou Liyang <douly.fnst@cn.fujitsu.com> Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Link: https://lkml.kernel.org/r/1500003247-17368-1-git-send-email-douly.fnst@cn.fujitsu.com
-
- 23 9月, 2017 1 次提交
-
-
由 Josh Poimboeuf 提交于
For inline asm statements which have a CALL instruction, we list the stack pointer as a constraint to convince GCC to ensure the frame pointer is set up first: static inline void foo() { register void *__sp asm(_ASM_SP); asm("call bar" : "+r" (__sp)) } Unfortunately, that pattern causes Clang to corrupt the stack pointer. The fix is easy: convert the stack pointer register variable to a global variable. It should be noted that the end result is different based on the GCC version. With GCC 6.4, this patch has exactly the same result as before: defconfig defconfig-nofp distro distro-nofp before 9820389 9491555 8816046 8516940 after 9820389 9491555 8816046 8516940 With GCC 7.2, however, GCC's behavior has changed. It now changes its behavior based on the conversion of the register variable to a global. That somehow convinces it to *always* set up the frame pointer before inserting *any* inline asm. (Therefore, listing the variable as an output constraint is a no-op and is no longer necessary.) It's a bit overkill, but the performance impact should be negligible. And in fact, there's a nice improvement with frame pointers disabled: defconfig defconfig-nofp distro distro-nofp before 9796316 9468236 9076191 8790305 after 9796957 9464267 9076381 8785949 So in summary, while listing the stack pointer as an output constraint is no longer necessary for newer versions of GCC, it's still needed for older versions. Suggested-by: NAndrey Ryabinin <aryabinin@virtuozzo.com> Reported-by: NMatthias Kaehlcke <mka@chromium.org> Signed-off-by: NJosh Poimboeuf <jpoimboe@redhat.com> Cc: Alexander Potapenko <glider@google.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Dmitriy Vyukov <dvyukov@google.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Miguel Bernal Marin <miguel.bernal.marin@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/3db862e970c432ae823cf515c52b54fec8270e0e.1505942196.git.jpoimboe@redhat.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
- 20 9月, 2017 12 次提交
-
-
由 Josh Poimboeuf 提交于
Using RBP as a temporary register breaks frame pointer convention and breaks stack traces when unwinding from an interrupt in the crypto code. Use R13 instead of RBP. Both are callee-saved registers, so the substitution is straightforward. Reported-by: NEric Biggers <ebiggers@google.com> Reported-by: NPeter Zijlstra <peterz@infradead.org> Tested-by: NEric Biggers <ebiggers@google.com> Acked-by: NEric Biggers <ebiggers@google.com> Signed-off-by: NJosh Poimboeuf <jpoimboe@redhat.com> Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>
-
由 Josh Poimboeuf 提交于
Using RBP as a temporary register breaks frame pointer convention and breaks stack traces when unwinding from an interrupt in the crypto code. Mix things up a little bit to get rid of the RBP usage, without hurting performance too much. Use RDI instead of RBP for the TBL pointer. That will clobber CTX, so spill CTX onto the stack and use R12 to read it in the outer loop. R12 is used as a non-persistent temporary variable elsewhere, so it's safe to use. Also remove the unused y4 variable. Reported-by: NEric Biggers <ebiggers3@gmail.com> Reported-by: NPeter Zijlstra <peterz@infradead.org> Tested-by: NEric Biggers <ebiggers@google.com> Acked-by: NEric Biggers <ebiggers@google.com> Signed-off-by: NJosh Poimboeuf <jpoimboe@redhat.com> Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>
-
由 Josh Poimboeuf 提交于
Using RBP as a temporary register breaks frame pointer convention and breaks stack traces when unwinding from an interrupt in the crypto code. Swap the usages of R12 and RBP. Use R12 for the TBL register, and use RBP to store the pre-aligned stack pointer. Reported-by: NEric Biggers <ebiggers@google.com> Reported-by: NPeter Zijlstra <peterz@infradead.org> Tested-by: NEric Biggers <ebiggers@google.com> Acked-by: NEric Biggers <ebiggers@google.com> Signed-off-by: NJosh Poimboeuf <jpoimboe@redhat.com> Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>
-
由 Josh Poimboeuf 提交于
Using RBP as a temporary register breaks frame pointer convention and breaks stack traces when unwinding from an interrupt in the crypto code. There's no need to use RBP as a temporary register for the TBL value, because it always stores the same value: the address of the K256 table. Instead just reference the address of K256 directly. Reported-by: NEric Biggers <ebiggers@google.com> Reported-by: NPeter Zijlstra <peterz@infradead.org> Tested-by: NEric Biggers <ebiggers@google.com> Acked-by: NEric Biggers <ebiggers@google.com> Signed-off-by: NJosh Poimboeuf <jpoimboe@redhat.com> Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>
-
由 Josh Poimboeuf 提交于
Using RBP as a temporary register breaks frame pointer convention and breaks stack traces when unwinding from an interrupt in the crypto code. Swap the usages of R12 and RBP. Use R12 for the TBL register, and use RBP to store the pre-aligned stack pointer. Reported-by: NEric Biggers <ebiggers@google.com> Reported-by: NPeter Zijlstra <peterz@infradead.org> Tested-by: NEric Biggers <ebiggers@google.com> Acked-by: NEric Biggers <ebiggers@google.com> Signed-off-by: NJosh Poimboeuf <jpoimboe@redhat.com> Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>
-
由 Josh Poimboeuf 提交于
Using RBP as a temporary register breaks frame pointer convention and breaks stack traces when unwinding from an interrupt in the crypto code. Swap the usages of R12 and RBP. Use R12 for the REG_D register, and use RBP to store the pre-aligned stack pointer. Reported-by: NEric Biggers <ebiggers@google.com> Reported-by: NPeter Zijlstra <peterz@infradead.org> Tested-by: NEric Biggers <ebiggers@google.com> Acked-by: NEric Biggers <ebiggers@google.com> Signed-off-by: NJosh Poimboeuf <jpoimboe@redhat.com> Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>
-
由 Josh Poimboeuf 提交于
Using RBP as a temporary register breaks frame pointer convention and breaks stack traces when unwinding from an interrupt in the crypto code. Use R11 instead of RBP. Since R11 isn't a callee-saved register, it doesn't need to be saved and restored on the stack. Reported-by: NEric Biggers <ebiggers@google.com> Reported-by: NPeter Zijlstra <peterz@infradead.org> Tested-by: NEric Biggers <ebiggers@google.com> Acked-by: NEric Biggers <ebiggers@google.com> Signed-off-by: NJosh Poimboeuf <jpoimboe@redhat.com> Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>
-
由 Josh Poimboeuf 提交于
Using RBP as a temporary register breaks frame pointer convention and breaks stack traces when unwinding from an interrupt in the crypto code. Use RSI instead of RBP for RT1. Since RSI is also used as a the 'dst' function argument, it needs to be saved on the stack until the argument is needed. Reported-by: NEric Biggers <ebiggers@google.com> Reported-by: NPeter Zijlstra <peterz@infradead.org> Tested-by: NEric Biggers <ebiggers@google.com> Acked-by: NEric Biggers <ebiggers@google.com> Signed-off-by: NJosh Poimboeuf <jpoimboe@redhat.com> Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>
-
由 Josh Poimboeuf 提交于
Using RBP as a temporary register breaks frame pointer convention and breaks stack traces when unwinding from an interrupt in the crypto code. Use R15 instead of RBP. R15 can't be used as the RID1 register because of x86 instruction encoding limitations. So use R15 for CTX and RDI for CTX. This means that CTX is no longer an implicit function argument. Instead it needs to be explicitly copied from RDI. Reported-by: NEric Biggers <ebiggers@google.com> Reported-by: NPeter Zijlstra <peterz@infradead.org> Tested-by: NEric Biggers <ebiggers@google.com> Acked-by: NEric Biggers <ebiggers@google.com> Signed-off-by: NJosh Poimboeuf <jpoimboe@redhat.com> Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>
-
由 Josh Poimboeuf 提交于
Using RBP as a temporary register breaks frame pointer convention and breaks stack traces when unwinding from an interrupt in the crypto code. Use R15 instead of RBP. R15 can't be used as the RID1 register because of x86 instruction encoding limitations. So use R15 for CTX and RDI for CTX. This means that CTX is no longer an implicit function argument. Instead it needs to be explicitly copied from RDI. Reported-by: NEric Biggers <ebiggers@google.com> Reported-by: NPeter Zijlstra <peterz@infradead.org> Tested-by: NEric Biggers <ebiggers@google.com> Acked-by: NEric Biggers <ebiggers@google.com> Signed-off-by: NJosh Poimboeuf <jpoimboe@redhat.com> Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>
-
由 Josh Poimboeuf 提交于
Using RBP as a temporary register breaks frame pointer convention and breaks stack traces when unwinding from an interrupt in the crypto code. Use R12 instead of RBP. Both are callee-saved registers, so the substitution is straightforward. Reported-by: NEric Biggers <ebiggers@google.com> Reported-by: NPeter Zijlstra <peterz@infradead.org> Tested-by: NEric Biggers <ebiggers@google.com> Acked-by: NEric Biggers <ebiggers@google.com> Signed-off-by: NJosh Poimboeuf <jpoimboe@redhat.com> Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>
-
由 Josh Poimboeuf 提交于
Using RBP as a temporary register breaks frame pointer convention and breaks stack traces when unwinding from an interrupt in the crypto code. Use R12 instead of RBP. R12 can't be used as the RT0 register because of x86 instruction encoding limitations. So use R12 for CTX and RDI for CTX. This means that CTX is no longer an implicit function argument. Instead it needs to be explicitly copied from RDI. Reported-by: NEric Biggers <ebiggers@google.com> Reported-by: NPeter Zijlstra <peterz@infradead.org> Tested-by: NEric Biggers <ebiggers@google.com> Acked-by: NEric Biggers <ebiggers@google.com> Signed-off-by: NJosh Poimboeuf <jpoimboe@redhat.com> Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>
-
- 19 9月, 2017 3 次提交
-
-
由 Haozhong Zhang 提交于
WARN_ON_ONCE(pi_test_sn(&vmx->pi_desc)) in kvm_vcpu_trigger_posted_interrupt() intends to detect the violation of invariant that VT-d PI notification event is not suppressed when vcpu is in the guest mode. Because the two checks for the target vcpu mode and the target suppress field cannot be performed atomically, the target vcpu mode may change in between. If that does happen, WARN_ON_ONCE() here may raise false alarms. As the previous patch fixed the real invariant breaker, remove this WARN_ON_ONCE() to avoid false alarms, and document the allowed cases instead. Signed-off-by: NHaozhong Zhang <haozhong.zhang@intel.com> Reported-by: N"Ramamurthy, Venkatesh" <venkatesh.ramamurthy@intel.com> Reported-by: NDan Williams <dan.j.williams@intel.com> Reviewed-by: NPaolo Bonzini <pbonzini@redhat.com> Fixes: 28b835d6 ("KVM: Update Posted-Interrupts Descriptor when vCPU is preempted") Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com>
-
由 Haozhong Zhang 提交于
In kvm_vcpu_trigger_posted_interrupt() and pi_pre_block(), KVM assumes that PI notification events should not be suppressed when the target vCPU is not blocked. vmx_update_pi_irte() sets the SN field before changing an interrupt from posting to remapping, but it does not check the vCPU mode. Therefore, the change of SN field may break above the assumption. Besides, I don't see reasons to suppress notification events here, so remove the changes of SN field to avoid race condition. Signed-off-by: NHaozhong Zhang <haozhong.zhang@intel.com> Reported-by: N"Ramamurthy, Venkatesh" <venkatesh.ramamurthy@intel.com> Reported-by: NDan Williams <dan.j.williams@intel.com> Reviewed-by: NPaolo Bonzini <pbonzini@redhat.com> Fixes: 28b835d6 ("KVM: Update Posted-Interrupts Descriptor when vCPU is preempted") Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com>
-
由 Yu Zhang 提交于
Routine check_cr_write() will trigger emulator_get_cpuid()-> kvm_cpuid() to get maxphyaddr, and NULL is passed as values for ebx/ecx/edx. This is problematic because kvm_cpuid() will dereference these pointers. Fixes: d1cd3ce9 ("KVM: MMU: check guest CR3 reserved bits based on its physical address width.") Reported-by: NJim Mattson <jmattson@google.com> Signed-off-by: NYu Zhang <yu.c.zhang@linux.intel.com> Reviewed-by: NDavid Hildenbrand <david@redhat.com> Reviewed-by: NJim Mattson <jmattson@google.com> Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com>
-
- 18 9月, 2017 4 次提交
-
-
由 Andy Lutomirski 提交于
For unknown historical reasons (i.e. Borislav doesn't recall), 32-bit kernels invoke cpu_init() on secondary CPUs with initial_page_table loaded into CR3. Then they set current->active_mm to &init_mm and call enter_lazy_tlb() before fixing CR3. This means that the x86 TLB code gets invoked while CR3 is inconsistent, and, with the improved PCID sanity checks I added, we warn. Fix it by loading swapper_pg_dir (i.e. init_mm.pgd) earlier. Reported-by: NPaul Menzel <pmenzel@molgen.mpg.de> Reported-by: NPavel Machek <pavel@ucw.cz> Signed-off-by: NAndy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Borislav Petkov <bpetkov@suse.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Fixes: 72c0098d ("x86/mm: Reinitialize TLB state on hotplug and resume") Link: http://lkml.kernel.org/r/30cdfea504682ba3b9012e77717800a91c22097f.1505663533.git.luto@kernel.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Andy Lutomirski 提交于
Otherwise we might have the PCID feature bit set during cpu_init(). This is just for robustness. I haven't seen any actual bugs here. Signed-off-by: NAndy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bpetkov@suse.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Fixes: cba4671a ("x86/mm: Disable PCID on 32-bit kernels") Link: http://lkml.kernel.org/r/b16dae9d6b0db5d9801ddbebbfd83384097c61f3.1505663533.git.luto@kernel.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Andy Lutomirski 提交于
Putting the logical ASID into CR3's PCID bits directly means that we have two cases to consider separately: ASID == 0 and ASID != 0. This means that bugs that only hit in one of these cases trigger nondeterministically. There were some bugs like this in the past, and I think there's still one in current kernels. In particular, we have a number of ASID-unware code paths that save CR3, write some special value, and then restore CR3. This includes suspend/resume, hibernate, kexec, EFI, and maybe other things I've missed. This is currently dangerous: if ASID != 0, then this code sequence will leave garbage in the TLB tagged for ASID 0. We could potentially see corruption when switching back to ASID 0. In principle, an initialize_tlbstate_and_flush() call after these sequences would solve the problem, but EFI, at least, does not call this. (And it probably shouldn't -- initialize_tlbstate_and_flush() is rather expensive.) Signed-off-by: NAndy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bpetkov@suse.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/cdc14bbe5d3c3ef2a562be09a6368ffe9bd947a6.1505663533.git.luto@kernel.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Andy Lutomirski 提交于
Current, the code that assembles a value to load into CR3 is open-coded everywhere. Factor it out into helpers build_cr3() and build_cr3_noflush(). This makes one semantic change: __get_current_cr3_fast() was wrong on SME systems. No one noticed because the only caller is in the VMX code, and there are no CPUs with both SME and VMX. Signed-off-by: NAndy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bpetkov@suse.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tom Lendacky <Thomas.Lendacky@amd.com> Link: http://lkml.kernel.org/r/ce350cf11e93e2842d14d0b95b0199c7d881f527.1505663533.git.luto@kernel.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
- 16 9月, 2017 1 次提交
-
-
由 Arnd Bergmann 提交于
gcc-4.6 causes a harmless link-time warning: WARNING: vmlinux.o(.text.unlikely+0x48e): Section mismatch in reference from the function xen_find_pt_base() to the function .init.text:m2p() The function xen_find_pt_base() references the function __init m2p(). This is often because xen_find_pt_base lacks a __init annotation or the annotation of m2p is wrong. Newer compilers inline this function, so it never shows up, but marking it __init is the right way to avoid the warning. Fixes: 70e61199 ("xen: move p2m list if conflicting with e820 map") Signed-off-by: NArnd Bergmann <arnd@arndb.de> Signed-off-by: NBoris Ostrovsky <boris.ostrovsky@oracle.com>
-
- 15 9月, 2017 9 次提交
-
-
由 Jim Mattson 提交于
When emulating a nested VM-entry from L1 to L2, several control field validation checks are deferred to the hardware. Should one of these validation checks fail, vcpu_vmx_run will set the vmx->fail flag. When this happens, the L2 guest state is not loaded (even in part), and execution should continue in L1 with the next instruction after the VMLAUNCH/VMRESUME. The VMCS12 is not modified (except for the VM-instruction error field), the VMCS12 MSR save/load lists are not processed, and the CPU state is not loaded from the VMCS12 host area. Moreover, the vmcs02 exit reason is stale, so it should not be consulted for any reason. Signed-off-by: NJim Mattson <jmattson@google.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Jim Mattson 提交于
On an early VMLAUNCH/VMRESUME failure (i.e. one which sets the VM-instruction error field of the current VMCS), the launch state of the current VMCS is not set to "launched," and the VM-exit information fields of the current VMCS (including IDT-vectoring information and exit reason) are stale. On a late VMLAUNCH/VMRESUME failure (i.e. one which sets the high bit of the exit reason field), the launch state of the current VMCS is not set to "launched," and only two of the VM-exit information fields of the current VMCS are modified (exit reason and exit qualification). The remaining VM-exit information fields of the current VMCS (including IDT-vectoring information, in particular) are stale. Signed-off-by: NJim Mattson <jmattson@google.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Jim Mattson 提交于
After a successful VM-entry, RFLAGS is cleared, with the exception of bit 1, which is always set. This is handled by load_vmcs12_host_state. Signed-off-by: NJim Mattson <jmattson@google.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Davidlohr Bueso 提交于
During code inspection, the following potential race was seen: CPU0 CPU1 kvm_async_pf_task_wait apf_task_wake_one [L] swait_active(&n->wq) [S] prepare_to_swait(&n.wq) [L] if (!hlist_unhahed(&n.link)) schedule() [S] hlist_del_init(&n->link); Properly serialize swait_active() checks such that a wakeup is not missed. Signed-off-by: NDavidlohr Bueso <dbueso@suse.de> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Davidlohr Bueso 提交于
A comment might serve future readers. Signed-off-by: NDavidlohr Bueso <dbueso@suse.de> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Jan H. Schönherr 提交于
The value of the guest_irq argument to vmx_update_pi_irte() is ultimately coming from a KVM_IRQFD API call. Do not BUG() in vmx_update_pi_irte() if the value is out-of bounds. (Especially, since KVM as a whole seems to hang after that.) Instead, print a message only once if we find that we don't have a route for a certain IRQ (which can be out-of-bounds or within the array). This fixes CVE-2017-1000252. Fixes: efc64404 ("KVM: x86: Update IRTE for posted-interrupts") Signed-off-by: NJan H. Schönherr <jschoenh@amazon.de> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Jim Mattson 提交于
If L1 does not specify the "use TPR shadow" VM-execution control in vmcs12, then L0 must specify the "CR8-load exiting" and "CR8-store exiting" VM-execution controls in vmcs02. Failure to do so will give the L2 VM unrestricted read/write access to the hardware CR8. This fixes CVE-2017-12154. Signed-off-by: NJim Mattson <jmattson@google.com> Reviewed-by: NDavid Hildenbrand <david@redhat.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Borislav Petkov 提交于
CPUID Fn8000_0007_EDX[CPB] is wrongly 0 on models up to B1. But they do support CPB (AMD's Core Performance Boosting cpufreq CPU feature), so fix that. Signed-off-by: NBorislav Petkov <bp@suse.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sherry Hurwitz <sherry.hurwitz@amd.com> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20170907170821.16021-1-bp@alien8.deSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Wanpeng Li 提交于
qemu-system-x86-8600 [004] d..1 7205.687530: kvm_entry: vcpu 2 qemu-system-x86-8600 [004] .... 7205.687532: kvm_exit: reason EXCEPTION_NMI rip 0xffffffffa921297d info ffffeb2c0e44e018 80000b0e qemu-system-x86-8600 [004] .... 7205.687532: kvm_page_fault: address ffffeb2c0e44e018 error_code 0 qemu-system-x86-8600 [004] .... 7205.687620: kvm_try_async_get_page: gva = 0xffffeb2c0e44e018, gfn = 0x427e4e qemu-system-x86-8600 [004] .N.. 7205.687628: kvm_async_pf_not_present: token 0x8b002 gva 0xffffeb2c0e44e018 kworker/4:2-7814 [004] .... 7205.687655: kvm_async_pf_completed: gva 0xffffeb2c0e44e018 address 0x7fcc30c4e000 qemu-system-x86-8600 [004] .... 7205.687703: kvm_async_pf_ready: token 0x8b002 gva 0xffffeb2c0e44e018 qemu-system-x86-8600 [004] d..1 7205.687711: kvm_entry: vcpu 2 After running some memory intensive workload in guest, I catch the kworker which completes the GUP too quickly, and queues an "Page Ready" #PF exception after the "Page not Present" exception before the next vmentry as the above trace which will result in #DF injected to guest. This patch fixes it by clearing the queue for "Page not Present" if "Page Ready" occurs before the next vmentry since the GUP has already got the required page and shadow page table has already been fixed by "Page Ready" handler. Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: NWanpeng Li <wanpeng.li@hotmail.com> Fixes: 7c90705b ("KVM: Inject asynchronous page fault into a PV guest if page is swapped out.") [Changed indentation and added clearing of injected. - Radim] Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com>
-