提交 · bc829ee36e0ec92383c9d9b88fe08f00d4d592f8 · openanolis / cloud-kernel

30 9月, 2017 1 次提交

x86/mm: Disable branch profiling in mem_encrypt.c · bc829ee3

由 Tom Lendacky 提交于 9月 29, 2017

Some routines in mem_encrypt.c are called very early in the boot process,
e.g. sme_encrypt_kernel(). When CONFIG_TRACE_BRANCH_PROFILING=y is defined
the resulting branch profiling associated with the check to see if SME is
active results in a kernel crash. Disable branch profiling for
mem_encrypt.c by defining DISABLE_BRANCH_PROFILING before including any
header files.
Reported-by: Nkernel test robot <lkp@01.org>
Signed-off-by: NTom Lendacky <thomas.lendacky@amd.com>
Acked-by: NBorislav Petkov <bp@suse.de>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20170929162419.6016.53390.stgit@tlendack-t1.amdoffice.netSigned-off-by: NIngo Molnar <mingo@kernel.org>

bc829ee3

29 9月, 2017 1 次提交

x86/asm: Fix inline asm call constraints for GCC 4.4 · 520a13c5

由 Josh Poimboeuf 提交于 9月 28, 2017

The kernel test bot (run by Xiaolong Ye) reported that the following commit:

  f5caf621 ("x86/asm: Fix inline asm call constraints for Clang")

is causing double faults in a kernel compiled with GCC 4.4.

Linus subsequently diagnosed the crash pattern and the buggy commit and found that
the issue is with this code:

  register unsigned int __asm_call_sp asm("esp");
  #define ASM_CALL_CONSTRAINT "+r" (__asm_call_sp)

Even on a 64-bit kernel, it's using ESP instead of RSP.  That causes GCC
to produce the following bogus code:

  ffffffff8147461d:       89 e0                   mov    %esp,%eax
  ffffffff8147461f:       4c 89 f7                mov    %r14,%rdi
  ffffffff81474622:       4c 89 fe                mov    %r15,%rsi
  ffffffff81474625:       ba 20 00 00 00          mov    $0x20,%edx
  ffffffff8147462a:       89 c4                   mov    %eax,%esp
  ffffffff8147462c:       e8 bf 52 05 00          callq  ffffffff814c98f0 <copy_user_generic_unrolled>

Despite the absurdity of it backing up and restoring the stack pointer
for no reason, the bug is actually the fact that it's only backing up
and restoring the lower 32 bits of the stack pointer.  The upper 32 bits
are getting cleared out, corrupting the stack pointer.

So change the '__asm_call_sp' register variable to be associated with
the actual full-size stack pointer.

This also requires changing the __ASM_SEL() macro to be based on the
actual compiled arch size, rather than the CONFIG value, because
CONFIG_X86_64 compiles some files with '-m32' (e.g., realmode and vdso).
Otherwise Clang fails to build the kernel because it complains about the
use of a 64-bit register (RSP) in a 32-bit file.
Reported-and-Bisected-and-Tested-by: Nkernel test robot <xiaolong.ye@intel.com>
Diagnosed-by: NLinus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: NJosh Poimboeuf <jpoimboe@redhat.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Dmitriy Vyukov <dvyukov@google.com>
Cc: LKP <lkp@01.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Matthias Kaehlcke <mka@chromium.org>
Cc: Miguel Bernal Marin <miguel.bernal.marin@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Fixes: f5caf621 ("x86/asm: Fix inline asm call constraints for Clang")
Link: http://lkml.kernel.org/r/20170928215826.6sdpmwtkiydiytim@trebleSigned-off-by: NIngo Molnar <mingo@kernel.org>

520a13c5

25 9月, 2017 7 次提交

perf/x86/intel/uncore: Correct num_boxes for IIO and IRP · 29b46dfb

由 Kan Liang 提交于 9月 11, 2017

There are 6 IIO/IRP boxes for CBDMA, PCIe0-2, MCP 0 and MCP 1
separately. Correct the num_boxes.
Signed-off-by: NKan Liang <Kan.liang@intel.com>
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
Cc: ak@linux.intel.com
Cc: peterz@infradead.org
Cc: eranian@google.com
Cc: acme@kernel.org
Link: http://lkml.kernel.org/r/1505149816-12580-1-git-send-email-kan.liang@intel.com

29b46dfb

perf/x86/intel/rapl: Add missing CPU IDs · 450a9789

由 Kan Liang 提交于 9月 08, 2017

DENVERTON and GEMINI_LAKE support same RAPL counters as Apollo Lake.
Signed-off-by: NKan Liang <Kan.liang@intel.com>
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
Cc: ak@linux.intel.com
Cc: peterz@infradead.org
Cc: piotr.luc@intel.com
Cc: harry.pan@intel.com
Cc: srinivas.pandruvada@linux.intel.com
Link: http://lkml.kernel.org/r/20170908213449.6224-3-kan.liang@intel.com

450a9789

perf/x86/msr: Add missing CPU IDs · 1aaccc40

由 Kan Liang 提交于 9月 08, 2017

Goldmont, Glodmont plus and Xeon Phi have MSR_SMI_COUNT as well.
Signed-off-by: NKan Liang <Kan.liang@intel.com>
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
Cc: ak@linux.intel.com
Cc: peterz@infradead.org
Cc: piotr.luc@intel.com
Cc: harry.pan@intel.com
Cc: srinivas.pandruvada@linux.intel.com
Link: http://lkml.kernel.org/r/20170908213449.6224-2-kan.liang@intel.com

1aaccc40

perf/x86/intel/cstate: Add missing CPU IDs · b09c146f

由 Kan Liang 提交于 9月 08, 2017

Skylake server uses the same C-state residency events as Sandy Bridge.

Denverton and Gemini lake use the same C-state residency events as
Apollo Lake.
Signed-off-by: NKan Liang <Kan.liang@intel.com>
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
Cc: ak@linux.intel.com
Cc: peterz@infradead.org
Cc: piotr.luc@intel.com
Cc: harry.pan@intel.com
Cc: srinivas.pandruvada@linux.intel.com
Link: http://lkml.kernel.org/r/20170908213449.6224-1-kan.liang@intel.com

b09c146f

x86: Don't cast away the __user in __get_user_asm_u64() · 5ac751d9

由 Ville Syrjälä 提交于 9月 12, 2017

Don't cast away the __user in __get_user_asm_u64() on x86-32.
Prevents sparse getting upset.
Signed-off-by: NVille Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
Cc: Benjamin LaHaise <bcrl@kvack.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/20170912164000.13745-1-ville.syrjala@linux.intel.com

5ac751d9

x86/sysfs: Fix off-by-one error in loop termination · 7d709943

由 Sean Fu 提交于 9月 11, 2017

An off-by-one error in loop terminantion conditions in
create_setup_data_nodes() will lead to memory leak when
create_setup_data_node() failed.
Signed-off-by: NSean Fu <fxinrong@gmail.com>
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1505090001-1157-1-git-send-email-fxinrong@gmail.com

7d709943

x86/mm: Fix fault error path using unsafe vma pointer · a3c4fb7c

由 Laurent Dufour 提交于 9月 04, 2017

commit 7b2d0dba ("x86/mm/pkeys: Pass VMA down in to fault signal
generation code") passes down a vma pointer to the error path, but that is
done once the mmap_sem is released when calling mm_fault_error() from
__do_page_fault().

This is dangerous as the vma structure is no more safe to be used once the
mmap_sem has been released. As only the protection key value is required in
the error processing, we could just pass down this value.

Fix it by passing a pointer to a protection key value down to the fault
signal generation code. The use of a pointer allows to keep the check
generating a warning message in fill_sig_info_pkey() when the vma was not
known. If the pointer is valid, the protection value can be accessed by
deferencing the pointer.

[ tglx: Made *pkey u32 as that's the type which is passed in siginfo ]

Fixes: 7b2d0dba ("x86/mm/pkeys: Pass VMA down in to fault signal generation code")
Signed-off-by: NLaurent Dufour <ldufour@linux.vnet.ibm.com>
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
Cc: linux-mm@kvack.org
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/1504513935-12742-1-git-send-email-ldufour@linux.vnet.ibm.com

a3c4fb7c

23 9月, 2017 1 次提交

x86/asm: Fix inline asm call constraints for Clang · f5caf621

由 Josh Poimboeuf 提交于 9月 20, 2017

For inline asm statements which have a CALL instruction, we list the
stack pointer as a constraint to convince GCC to ensure the frame
pointer is set up first:

  static inline void foo()
  {
	register void *__sp asm(_ASM_SP);
	asm("call bar" : "+r" (__sp))
  }

Unfortunately, that pattern causes Clang to corrupt the stack pointer.

The fix is easy: convert the stack pointer register variable to a global
variable.

It should be noted that the end result is different based on the GCC
version.  With GCC 6.4, this patch has exactly the same result as
before:

	defconfig	defconfig-nofp	distro		distro-nofp
 before	9820389		9491555		8816046		8516940
 after	9820389		9491555		8816046		8516940

With GCC 7.2, however, GCC's behavior has changed.  It now changes its
behavior based on the conversion of the register variable to a global.
That somehow convinces it to *always* set up the frame pointer before
inserting *any* inline asm.  (Therefore, listing the variable as an
output constraint is a no-op and is no longer necessary.)  It's a bit
overkill, but the performance impact should be negligible.  And in fact,
there's a nice improvement with frame pointers disabled:

	defconfig	defconfig-nofp	distro		distro-nofp
 before	9796316		9468236		9076191		8790305
 after	9796957		9464267		9076381		8785949

So in summary, while listing the stack pointer as an output constraint
is no longer necessary for newer versions of GCC, it's still needed for
older versions.
Suggested-by: NAndrey Ryabinin <aryabinin@virtuozzo.com>
Reported-by: NMatthias Kaehlcke <mka@chromium.org>
Signed-off-by: NJosh Poimboeuf <jpoimboe@redhat.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Dmitriy Vyukov <dvyukov@google.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Miguel Bernal Marin <miguel.bernal.marin@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/3db862e970c432ae823cf515c52b54fec8270e0e.1505942196.git.jpoimboe@redhat.comSigned-off-by: NIngo Molnar <mingo@kernel.org>

f5caf621

20 9月, 2017 12 次提交

crypto: x86/twofish - Fix RBP usage · 8f182f84

由 Josh Poimboeuf 提交于 9月 18, 2017

Using RBP as a temporary register breaks frame pointer convention and
breaks stack traces when unwinding from an interrupt in the crypto code.

Use R13 instead of RBP.  Both are callee-saved registers, so the
substitution is straightforward.
Reported-by: NEric Biggers <ebiggers@google.com>
Reported-by: NPeter Zijlstra <peterz@infradead.org>
Tested-by: NEric Biggers <ebiggers@google.com>
Acked-by: NEric Biggers <ebiggers@google.com>
Signed-off-by: NJosh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>

8f182f84

crypto: sha512-avx2 - Fix RBP usage · ca04c823

由 Josh Poimboeuf 提交于 9月 18, 2017

Using RBP as a temporary register breaks frame pointer convention and
breaks stack traces when unwinding from an interrupt in the crypto code.

Mix things up a little bit to get rid of the RBP usage, without hurting
performance too much.  Use RDI instead of RBP for the TBL pointer.  That
will clobber CTX, so spill CTX onto the stack and use R12 to read it in
the outer loop.  R12 is used as a non-persistent temporary variable
elsewhere, so it's safe to use.

Also remove the unused y4 variable.
Reported-by: NEric Biggers <ebiggers3@gmail.com>
Reported-by: NPeter Zijlstra <peterz@infradead.org>
Tested-by: NEric Biggers <ebiggers@google.com>
Acked-by: NEric Biggers <ebiggers@google.com>
Signed-off-by: NJosh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>

ca04c823

crypto: x86/sha256-ssse3 - Fix RBP usage · 539012dc

由 Josh Poimboeuf 提交于 9月 18, 2017

Using RBP as a temporary register breaks frame pointer convention and
breaks stack traces when unwinding from an interrupt in the crypto code.

Swap the usages of R12 and RBP.  Use R12 for the TBL register, and use
RBP to store the pre-aligned stack pointer.
Reported-by: NEric Biggers <ebiggers@google.com>
Reported-by: NPeter Zijlstra <peterz@infradead.org>
Tested-by: NEric Biggers <ebiggers@google.com>
Acked-by: NEric Biggers <ebiggers@google.com>
Signed-off-by: NJosh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>

539012dc

crypto: x86/sha256-avx2 - Fix RBP usage · d3dfbfe2

由 Josh Poimboeuf 提交于 9月 18, 2017

Using RBP as a temporary register breaks frame pointer convention and
breaks stack traces when unwinding from an interrupt in the crypto code.

There's no need to use RBP as a temporary register for the TBL value,
because it always stores the same value: the address of the K256 table.
Instead just reference the address of K256 directly.
Reported-by: NEric Biggers <ebiggers@google.com>
Reported-by: NPeter Zijlstra <peterz@infradead.org>
Tested-by: NEric Biggers <ebiggers@google.com>
Acked-by: NEric Biggers <ebiggers@google.com>
Signed-off-by: NJosh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>

d3dfbfe2

crypto: x86/sha256-avx - Fix RBP usage · 673ac6fb

由 Josh Poimboeuf 提交于 9月 18, 2017

Using RBP as a temporary register breaks frame pointer convention and
breaks stack traces when unwinding from an interrupt in the crypto code.

Swap the usages of R12 and RBP.  Use R12 for the TBL register, and use
RBP to store the pre-aligned stack pointer.
Reported-by: NEric Biggers <ebiggers@google.com>
Reported-by: NPeter Zijlstra <peterz@infradead.org>
Tested-by: NEric Biggers <ebiggers@google.com>
Acked-by: NEric Biggers <ebiggers@google.com>
Signed-off-by: NJosh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>

673ac6fb

crypto: x86/sha1-ssse3 - Fix RBP usage · 6488bce7

由 Josh Poimboeuf 提交于 9月 18, 2017

Using RBP as a temporary register breaks frame pointer convention and
breaks stack traces when unwinding from an interrupt in the crypto code.

Swap the usages of R12 and RBP.  Use R12 for the REG_D register, and use
RBP to store the pre-aligned stack pointer.
Reported-by: NEric Biggers <ebiggers@google.com>
Reported-by: NPeter Zijlstra <peterz@infradead.org>
Tested-by: NEric Biggers <ebiggers@google.com>
Acked-by: NEric Biggers <ebiggers@google.com>
Signed-off-by: NJosh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>

6488bce7

crypto: x86/sha1-avx2 - Fix RBP usage · d7b1722c

由 Josh Poimboeuf 提交于 9月 18, 2017

Using RBP as a temporary register breaks frame pointer convention and
breaks stack traces when unwinding from an interrupt in the crypto code.

Use R11 instead of RBP.  Since R11 isn't a callee-saved register, it
doesn't need to be saved and restored on the stack.
Reported-by: NEric Biggers <ebiggers@google.com>
Reported-by: NPeter Zijlstra <peterz@infradead.org>
Tested-by: NEric Biggers <ebiggers@google.com>
Acked-by: NEric Biggers <ebiggers@google.com>
Signed-off-by: NJosh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>

d7b1722c

crypto: x86/des3_ede - Fix RBP usage · 3ed7b4d6

由 Josh Poimboeuf 提交于 9月 18, 2017

Using RBP as a temporary register breaks frame pointer convention and
breaks stack traces when unwinding from an interrupt in the crypto code.

Use RSI instead of RBP for RT1.  Since RSI is also used as a the 'dst'
function argument, it needs to be saved on the stack until the argument
is needed.
Reported-by: NEric Biggers <ebiggers@google.com>
Reported-by: NPeter Zijlstra <peterz@infradead.org>
Tested-by: NEric Biggers <ebiggers@google.com>
Acked-by: NEric Biggers <ebiggers@google.com>
Signed-off-by: NJosh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>

3ed7b4d6

crypto: x86/cast6 - Fix RBP usage · c66cc3be

由 Josh Poimboeuf 提交于 9月 18, 2017

Using RBP as a temporary register breaks frame pointer convention and
breaks stack traces when unwinding from an interrupt in the crypto code.

Use R15 instead of RBP.  R15 can't be used as the RID1 register because
of x86 instruction encoding limitations.  So use R15 for CTX and RDI for
CTX.  This means that CTX is no longer an implicit function argument.
Instead it needs to be explicitly copied from RDI.
Reported-by: NEric Biggers <ebiggers@google.com>
Reported-by: NPeter Zijlstra <peterz@infradead.org>
Tested-by: NEric Biggers <ebiggers@google.com>
Acked-by: NEric Biggers <ebiggers@google.com>
Signed-off-by: NJosh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>

c66cc3be

crypto: x86/cast5 - Fix RBP usage · 4b156066

由 Josh Poimboeuf 提交于 9月 18, 2017

Using RBP as a temporary register breaks frame pointer convention and
breaks stack traces when unwinding from an interrupt in the crypto code.

Use R15 instead of RBP.  R15 can't be used as the RID1 register because
of x86 instruction encoding limitations.  So use R15 for CTX and RDI for
CTX.  This means that CTX is no longer an implicit function argument.
Instead it needs to be explicitly copied from RDI.
Reported-by: NEric Biggers <ebiggers@google.com>
Reported-by: NPeter Zijlstra <peterz@infradead.org>
Tested-by: NEric Biggers <ebiggers@google.com>
Acked-by: NEric Biggers <ebiggers@google.com>
Signed-off-by: NJosh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>

4b156066

crypto: x86/camellia - Fix RBP usage · b46c9d71

由 Josh Poimboeuf 提交于 9月 18, 2017

Using RBP as a temporary register breaks frame pointer convention and
breaks stack traces when unwinding from an interrupt in the crypto code.

Use R12 instead of RBP.  Both are callee-saved registers, so the
substitution is straightforward.
Reported-by: NEric Biggers <ebiggers@google.com>
Reported-by: NPeter Zijlstra <peterz@infradead.org>
Tested-by: NEric Biggers <ebiggers@google.com>
Acked-by: NEric Biggers <ebiggers@google.com>
Signed-off-by: NJosh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>

b46c9d71

crypto: x86/blowfish - Fix RBP usage · 569f11c9

由 Josh Poimboeuf 提交于 9月 18, 2017

Using RBP as a temporary register breaks frame pointer convention and
breaks stack traces when unwinding from an interrupt in the crypto code.

Use R12 instead of RBP.  R12 can't be used as the RT0 register because
of x86 instruction encoding limitations.  So use R12 for CTX and RDI for
CTX.  This means that CTX is no longer an implicit function argument.
Instead it needs to be explicitly copied from RDI.
Reported-by: NEric Biggers <ebiggers@google.com>
Reported-by: NPeter Zijlstra <peterz@infradead.org>
Tested-by: NEric Biggers <ebiggers@google.com>
Acked-by: NEric Biggers <ebiggers@google.com>
Signed-off-by: NJosh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>

569f11c9

19 9月, 2017 3 次提交

KVM: VMX: remove WARN_ON_ONCE in kvm_vcpu_trigger_posted_interrupt · 5753743f

由 Haozhong Zhang 提交于 9月 18, 2017

WARN_ON_ONCE(pi_test_sn(&vmx->pi_desc)) in kvm_vcpu_trigger_posted_interrupt()
intends to detect the violation of invariant that VT-d PI notification
event is not suppressed when vcpu is in the guest mode. Because the
two checks for the target vcpu mode and the target suppress field
cannot be performed atomically, the target vcpu mode may change in
between. If that does happen, WARN_ON_ONCE() here may raise false
alarms.

As the previous patch fixed the real invariant breaker, remove this
WARN_ON_ONCE() to avoid false alarms, and document the allowed cases
instead.
Signed-off-by: NHaozhong Zhang <haozhong.zhang@intel.com>
Reported-by: N"Ramamurthy, Venkatesh" <venkatesh.ramamurthy@intel.com>
Reported-by: NDan Williams <dan.j.williams@intel.com>
Reviewed-by: NPaolo Bonzini <pbonzini@redhat.com>
Fixes: 28b835d6 ("KVM: Update Posted-Interrupts Descriptor when vCPU is preempted")
Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com>

5753743f

KVM: VMX: do not change SN bit in vmx_update_pi_irte() · dc91f2eb

由 Haozhong Zhang 提交于 9月 18, 2017

In kvm_vcpu_trigger_posted_interrupt() and pi_pre_block(), KVM
assumes that PI notification events should not be suppressed when the
target vCPU is not blocked.

vmx_update_pi_irte() sets the SN field before changing an interrupt
from posting to remapping, but it does not check the vCPU mode.
Therefore, the change of SN field may break above the assumption.
Besides, I don't see reasons to suppress notification events here, so
remove the changes of SN field to avoid race condition.
Signed-off-by: NHaozhong Zhang <haozhong.zhang@intel.com>
Reported-by: N"Ramamurthy, Venkatesh" <venkatesh.ramamurthy@intel.com>
Reported-by: NDan Williams <dan.j.williams@intel.com>
Reviewed-by: NPaolo Bonzini <pbonzini@redhat.com>
Fixes: 28b835d6 ("KVM: Update Posted-Interrupts Descriptor when vCPU is preempted")
Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com>

dc91f2eb

KVM: x86: Fix the NULL pointer parameter in check_cr_write() · d6500149

由 Yu Zhang 提交于 9月 18, 2017

Routine check_cr_write() will trigger emulator_get_cpuid()->
kvm_cpuid() to get maxphyaddr, and NULL is passed as values
for ebx/ecx/edx. This is problematic because kvm_cpuid() will
dereference these pointers.

Fixes: d1cd3ce9 ("KVM: MMU: check guest CR3 reserved bits based on its physical address width.")
Reported-by: NJim Mattson <jmattson@google.com>
Signed-off-by: NYu Zhang <yu.c.zhang@linux.intel.com>
Reviewed-by: NDavid Hildenbrand <david@redhat.com>
Reviewed-by: NJim Mattson <jmattson@google.com>
Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com>

d6500149

18 9月, 2017 4 次提交

x86/mm/32: Load a sane CR3 before cpu_init() on secondary CPUs · 4ba55e65

由 Andy Lutomirski 提交于 9月 17, 2017

For unknown historical reasons (i.e. Borislav doesn't recall),
32-bit kernels invoke cpu_init() on secondary CPUs with
initial_page_table loaded into CR3.  Then they set
current->active_mm to &init_mm and call enter_lazy_tlb() before
fixing CR3.  This means that the x86 TLB code gets invoked while CR3
is inconsistent, and, with the improved PCID sanity checks I added,
we warn.

Fix it by loading swapper_pg_dir (i.e. init_mm.pgd) earlier.
Reported-by: NPaul Menzel <pmenzel@molgen.mpg.de>
Reported-by: NPavel Machek <pavel@ucw.cz>
Signed-off-by: NAndy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Borislav Petkov <bpetkov@suse.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Fixes: 72c0098d ("x86/mm: Reinitialize TLB state on hotplug and resume")
Link: http://lkml.kernel.org/r/30cdfea504682ba3b9012e77717800a91c22097f.1505663533.git.luto@kernel.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>

4ba55e65

x86/mm/32: Move setup_clear_cpu_cap(X86_FEATURE_PCID) earlier · b8b7abae

由 Andy Lutomirski 提交于 9月 17, 2017

Otherwise we might have the PCID feature bit set during cpu_init().

This is just for robustness. I haven't seen any actual bugs here.
Signed-off-by: NAndy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bpetkov@suse.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Fixes: cba4671a ("x86/mm: Disable PCID on 32-bit kernels")
Link: http://lkml.kernel.org/r/b16dae9d6b0db5d9801ddbebbfd83384097c61f3.1505663533.git.luto@kernel.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>

b8b7abae

x86/mm/64: Stop using CR3.PCID == 0 in ASID-aware code · 52a2af40

由 Andy Lutomirski 提交于 9月 17, 2017

Putting the logical ASID into CR3's PCID bits directly means that we
have two cases to consider separately: ASID == 0 and ASID != 0.
This means that bugs that only hit in one of these cases trigger
nondeterministically.

There were some bugs like this in the past, and I think there's
still one in current kernels.  In particular, we have a number of
ASID-unware code paths that save CR3, write some special value, and
then restore CR3.  This includes suspend/resume, hibernate, kexec,
EFI, and maybe other things I've missed.  This is currently
dangerous: if ASID != 0, then this code sequence will leave garbage
in the TLB tagged for ASID 0.  We could potentially see corruption
when switching back to ASID 0.  In principle, an
initialize_tlbstate_and_flush() call after these sequences would
solve the problem, but EFI, at least, does not call this.  (And it
probably shouldn't -- initialize_tlbstate_and_flush() is rather
expensive.)
Signed-off-by: NAndy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bpetkov@suse.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/cdc14bbe5d3c3ef2a562be09a6368ffe9bd947a6.1505663533.git.luto@kernel.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>

52a2af40

x86/mm: Factor out CR3-building code · 47061a24

由 Andy Lutomirski 提交于 9月 17, 2017

Current, the code that assembles a value to load into CR3 is
open-coded everywhere.  Factor it out into helpers build_cr3() and
build_cr3_noflush().

This makes one semantic change: __get_current_cr3_fast() was wrong
on SME systems.  No one noticed because the only caller is in the
VMX code, and there are no CPUs with both SME and VMX.
Signed-off-by: NAndy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bpetkov@suse.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tom Lendacky <Thomas.Lendacky@amd.com>
Link: http://lkml.kernel.org/r/ce350cf11e93e2842d14d0b95b0199c7d881f527.1505663533.git.luto@kernel.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>

47061a24

16 9月, 2017 1 次提交

xen: x86: mark xen_find_pt_base as __init · 51ae2538

由 Arnd Bergmann 提交于 9月 15, 2017

gcc-4.6 causes a harmless link-time warning:

WARNING: vmlinux.o(.text.unlikely+0x48e): Section mismatch in reference from the function xen_find_pt_base() to the function .init.text:m2p()
The function xen_find_pt_base() references
the function __init m2p().
This is often because xen_find_pt_base lacks a __init
annotation or the annotation of m2p is wrong.

Newer compilers inline this function, so it never shows up, but marking
it __init is the right way to avoid the warning.

Fixes: 70e61199 ("xen: move p2m list if conflicting with e820 map")
Signed-off-by: NArnd Bergmann <arnd@arndb.de>
Signed-off-by: NBoris Ostrovsky <boris.ostrovsky@oracle.com>

51ae2538

15 9月, 2017 9 次提交

kvm: nVMX: Handle deferred early VMLAUNCH/VMRESUME failure properly · 4f350c6d

由 Jim Mattson 提交于 9月 14, 2017

When emulating a nested VM-entry from L1 to L2, several control field
validation checks are deferred to the hardware. Should one of these
validation checks fail, vcpu_vmx_run will set the vmx->fail flag. When
this happens, the L2 guest state is not loaded (even in part), and
execution should continue in L1 with the next instruction after the
VMLAUNCH/VMRESUME.

The VMCS12 is not modified (except for the VM-instruction error
field), the VMCS12 MSR save/load lists are not processed, and the CPU
state is not loaded from the VMCS12 host area. Moreover, the vmcs02
exit reason is stale, so it should not be consulted for any reason.
Signed-off-by: NJim Mattson <jmattson@google.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

4f350c6d

kvm: vmx: Handle VMLAUNCH/VMRESUME failure properly · b060ca3b

由 Jim Mattson 提交于 9月 14, 2017

On an early VMLAUNCH/VMRESUME failure (i.e. one which sets the
VM-instruction error field of the current VMCS), the launch state of
the current VMCS is not set to "launched," and the VM-exit information
fields of the current VMCS (including IDT-vectoring information and
exit reason) are stale.

On a late VMLAUNCH/VMRESUME failure (i.e. one which sets the high bit
of the exit reason field), the launch state of the current VMCS is not
set to "launched," and only two of the VM-exit information fields of
the current VMCS are modified (exit reason and exit
qualification). The remaining VM-exit information fields of the
current VMCS (including IDT-vectoring information, in particular) are
stale.
Signed-off-by: NJim Mattson <jmattson@google.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

b060ca3b

kvm: nVMX: Remove nested_vmx_succeed after successful VM-entry · 7881f96c

由 Jim Mattson 提交于 9月 14, 2017

After a successful VM-entry, RFLAGS is cleared, with the exception of
bit 1, which is always set. This is handled by load_vmcs12_host_state.
Signed-off-by: NJim Mattson <jmattson@google.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

7881f96c

kvm,x86: Fix apf_task_wake_one() wq serialization · a0cff57b

由 Davidlohr Bueso 提交于 9月 13, 2017

During code inspection, the following potential race was seen:

CPU0   	    		    	     	CPU1
kvm_async_pf_task_wait			apf_task_wake_one
					  [L] swait_active(&n->wq)
  [S] prepare_to_swait(&n.wq)
  [L] if (!hlist_unhahed(&n.link))
	schedule()			  [S] hlist_del_init(&n->link);

Properly serialize swait_active() checks such that a wakeup is
not missed.
Signed-off-by: NDavidlohr Bueso <dbueso@suse.de>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

a0cff57b

kvm,lapic: Justify use of swait_active() · cc1b4680

由 Davidlohr Bueso 提交于 9月 13, 2017

A comment might serve future readers.
Signed-off-by: NDavidlohr Bueso <dbueso@suse.de>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

cc1b4680

KVM: VMX: Do not BUG() on out-of-bounds guest IRQ · 3a8b0677

由 Jan H. Schönherr 提交于 9月 07, 2017

The value of the guest_irq argument to vmx_update_pi_irte() is
ultimately coming from a KVM_IRQFD API call. Do not BUG() in
vmx_update_pi_irte() if the value is out-of bounds. (Especially,
since KVM as a whole seems to hang after that.)

Instead, print a message only once if we find that we don't have a
route for a certain IRQ (which can be out-of-bounds or within the
array).

This fixes CVE-2017-1000252.

Fixes: efc64404 ("KVM: x86: Update IRTE for posted-interrupts")
Signed-off-by: NJan H. Schönherr <jschoenh@amazon.de>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

3a8b0677

kvm: nVMX: Don't allow L2 to access the hardware CR8 · 51aa68e7

由 Jim Mattson 提交于 9月 12, 2017

If L1 does not specify the "use TPR shadow" VM-execution control in
vmcs12, then L0 must specify the "CR8-load exiting" and "CR8-store
exiting" VM-execution controls in vmcs02. Failure to do so will give
the L2 VM unrestricted read/write access to the hardware CR8.

This fixes CVE-2017-12154.
Signed-off-by: NJim Mattson <jmattson@google.com>
Reviewed-by: NDavid Hildenbrand <david@redhat.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

51aa68e7

x86/cpu/AMD: Fix erratum 1076 (CPB bit) · f7f3dc00

由 Borislav Petkov 提交于 9月 07, 2017

CPUID Fn8000_0007_EDX[CPB] is wrongly 0 on models up to B1. But they do
support CPB (AMD's Core Performance Boosting cpufreq CPU feature), so fix that.
Signed-off-by: NBorislav Petkov <bp@suse.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sherry Hurwitz <sherry.hurwitz@amd.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20170907170821.16021-1-bp@alien8.deSigned-off-by: NIngo Molnar <mingo@kernel.org>

f7f3dc00

KVM: async_pf: Fix #DF due to inject "Page not Present" and "Page Ready" exceptions simultaneously · 9a6e7c39

由 Wanpeng Li 提交于 9月 14, 2017

qemu-system-x86-8600 [004] d..1 7205.687530: kvm_entry: vcpu 2
qemu-system-x86-8600 [004] .... 7205.687532: kvm_exit: reason EXCEPTION_NMI rip 0xffffffffa921297d info ffffeb2c0e44e018 80000b0e
qemu-system-x86-8600 [004] .... 7205.687532: kvm_page_fault: address ffffeb2c0e44e018 error_code 0
qemu-system-x86-8600 [004] .... 7205.687620: kvm_try_async_get_page: gva = 0xffffeb2c0e44e018, gfn = 0x427e4e
qemu-system-x86-8600 [004] .N.. 7205.687628: kvm_async_pf_not_present: token 0x8b002 gva 0xffffeb2c0e44e018
kworker/4:2-7814 [004] .... 7205.687655: kvm_async_pf_completed: gva 0xffffeb2c0e44e018 address 0x7fcc30c4e000
qemu-system-x86-8600 [004] .... 7205.687703: kvm_async_pf_ready: token 0x8b002 gva 0xffffeb2c0e44e018
qemu-system-x86-8600 [004] d..1 7205.687711: kvm_entry: vcpu 2

After running some memory intensive workload in guest, I catch the kworker
which completes the GUP too quickly, and queues an "Page Ready" #PF exception
after the "Page not Present" exception before the next vmentry as the above
trace which will result in #DF injected to guest.

This patch fixes it by clearing the queue for "Page not Present" if "Page Ready"
occurs before the next vmentry since the GUP has already got the required page
and shadow page table has already been fixed by "Page Ready" handler.

Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: NWanpeng Li <wanpeng.li@hotmail.com>
Fixes: 7c90705b ("KVM: Inject asynchronous page fault into a PV guest if page is swapped out.")
[Changed indentation and added clearing of injected. - Radim]
Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com>

9a6e7c39

14 9月, 2017 1 次提交

KVM: X86: Don't block vCPU if there is pending exception · a5f01f8e

由 Wanpeng Li 提交于 9月 13, 2017

Don't block vCPU if there is pending exception.

Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: NWanpeng Li <wanpeng.li@hotmail.com>
Reviewed-by: NPaolo Bonzini <pbonzini@redhat.com>
Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com>

a5f01f8e

openanolis / cloud-kernel 1 年多 前同步成功

openanolis / cloud-kernel
1 年多前同步成功