1. 09 5月, 2018 9 次提交
  2. 08 5月, 2018 2 次提交
  3. 07 5月, 2018 1 次提交
    • C
      PCI: remove PCI_DMA_BUS_IS_PHYS · 325ef185
      Christoph Hellwig 提交于
      This was used by the ide, scsi and networking code in the past to
      determine if they should bounce payloads.  Now that the dma mapping
      always have to support dma to all physical memory (thanks to swiotlb
      for non-iommu systems) there is no need to this crude hack any more.
      Signed-off-by: NChristoph Hellwig <hch@lst.de>
      Acked-by: Palmer Dabbelt <palmer@sifive.com> (for riscv)
      Reviewed-by: NJens Axboe <axboe@kernel.dk>
      325ef185
  4. 02 5月, 2018 2 次提交
  5. 01 5月, 2018 2 次提交
  6. 28 4月, 2018 1 次提交
  7. 27 4月, 2018 8 次提交
    • J
      kvm: apic: Flush TLB after APIC mode/address change if VPIDs are in use · a468f2db
      Junaid Shahid 提交于
      Currently, KVM flushes the TLB after a change to the APIC access page
      address or the APIC mode when EPT mode is enabled. However, even in
      shadow paging mode, a TLB flush is needed if VPIDs are being used, as
      specified in the Intel SDM Section 29.4.5.
      
      So replace vmx_flush_tlb_ept_only() with vmx_flush_tlb(), which will
      flush if either EPT or VPIDs are in use.
      Signed-off-by: NJunaid Shahid <junaids@google.com>
      Reviewed-by: NJim Mattson <jmattson@google.com>
      Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com>
      a468f2db
    • A
      x86/entry/64/compat: Preserve r8-r11 in int $0x80 · 8bb2610b
      Andy Lutomirski 提交于
      32-bit user code that uses int $80 doesn't care about r8-r11.  There is,
      however, some 64-bit user code that intentionally uses int $0x80 to invoke
      32-bit system calls.  From what I've seen, basically all such code assumes
      that r8-r15 are all preserved, but the kernel clobbers r8-r11.  Since I
      doubt that there's any code that depends on int $0x80 zeroing r8-r11,
      change the kernel to preserve them.
      
      I suspect that very little user code is broken by the old clobber, since
      r8-r11 are only rarely allocated by gcc, and they're clobbered by function
      calls, so they only way we'd see a problem is if the same function that
      invokes int $0x80 also spills something important to one of these
      registers.
      
      The current behavior seems to date back to the historical commit
      "[PATCH] x86-64 merge for 2.6.4".  Before that, all regs were
      preserved.  I can't find any explanation of why this change was made.
      
      Update the test_syscall_vdso_32 testcase as well to verify the new
      behavior, and it strengthens the test to make sure that the kernel doesn't
      accidentally permute r8..r15.
      Suggested-by: NDenys Vlasenko <dvlasenk@redhat.com>
      Signed-off-by: NAndy Lutomirski <luto@kernel.org>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Dominik Brodowski <linux@dominikbrodowski.net>
      Link: https://lkml.kernel.org/r/d4c4d9985fbe64f8c9e19291886453914b48caee.1523975710.git.luto@kernel.org
      8bb2610b
    • A
      x86/ipc: Fix x32 version of shmid64_ds and msqid64_ds · 1a512c08
      Arnd Bergmann 提交于
      A bugfix broke the x32 shmid64_ds and msqid64_ds data structure layout
      (as seen from user space)  a few years ago: Originally, __BITS_PER_LONG
      was defined as 64 on x32, so we did not have padding after the 64-bit
      __kernel_time_t fields, After __BITS_PER_LONG got changed to 32,
      applications would observe extra padding.
      
      In other parts of the uapi headers we seem to have a mix of those
      expecting either 32 or 64 on x32 applications, so we can't easily revert
      the path that broke these two structures.
      
      Instead, this patch decouples x32 from the other architectures and moves
      it back into arch specific headers, partially reverting the even older
      commit 73a2d096 ("x86: remove all now-duplicate header files").
      
      It's not clear whether this ever made any difference, since at least
      glibc carries its own (correct) copy of both of these header files,
      so possibly no application has ever observed the definitions here.
      
      Based on a suggestion from H.J. Lu, I tried out the tool from
      https://github.com/hjl-tools/linux-header to find other such
      bugs, which pointed out the same bug in statfs(), which also has
      a separate (correct) copy in glibc.
      
      Fixes: f4b4aae1 ("x86/headers/uapi: Fix __BITS_PER_LONG value for x32 builds")
      Signed-off-by: NArnd Bergmann <arnd@arndb.de>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Cc: "H . J . Lu" <hjl.tools@gmail.com>
      Cc: Jeffrey Walton <noloader@gmail.com>
      Cc: stable@vger.kernel.org
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Link: https://lkml.kernel.org/r/20180424212013.3967461-1-arnd@arndb.de
      1a512c08
    • P
      x86/setup: Do not reserve a crash kernel region if booted on Xen PV · 3db3eb28
      Petr Tesarik 提交于
      Xen PV domains cannot shut down and start a crash kernel. Instead,
      the crashing kernel makes a SCHEDOP_shutdown hypercall with the
      reason code SHUTDOWN_crash, cf. xen_crash_shutdown() machine op in
      arch/x86/xen/enlighten_pv.c.
      
      A crash kernel reservation is merely a waste of RAM in this case. It
      may also confuse users of kexec_load(2) and/or kexec_file_load(2).
      When flags include KEXEC_ON_CRASH or KEXEC_FILE_ON_CRASH,
      respectively, these syscalls return success, which is technically
      correct, but the crash kexec image will never be actually used.
      Signed-off-by: NPetr Tesarik <ptesarik@suse.com>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Reviewed-by: NJuergen Gross <jgross@suse.com>
      Cc: Tom Lendacky <thomas.lendacky@amd.com>
      Cc: Dou Liyang <douly.fnst@cn.fujitsu.com>
      Cc: Mikulas Patocka <mpatocka@redhat.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: xen-devel@lists.xenproject.org
      Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Jean Delvare <jdelvare@suse.de>
      Link: https://lkml.kernel.org/r/20180425120835.23cef60c@ezekiel.suse.cz
      3db3eb28
    • M
      arm64: avoid instrumenting atomic_ll_sc.o · 3789c122
      Mark Rutland 提交于
      Our out-of-line atomics are built with a special calling convention,
      preventing pointless stack spilling, and allowing us to patch call sites
      with ARMv8.1 atomic instructions.
      
      Instrumentation inserted by the compiler may result in calls to
      functions not following this special calling convention, resulting in
      registers being unexpectedly clobbered, and various problems resulting
      from this.
      
      For example, if a kernel is built with KCOV and ARM64_LSE_ATOMICS, the
      compiler inserts calls to __sanitizer_cov_trace_pc in the prologues of
      the atomic functions. This has been observed to result in spurious
      cmpxchg failures, leading to a hang early on in the boot process.
      
      This patch avoids such issues by preventing instrumentation of our
      out-of-line atomics.
      Signed-off-by: NMark Rutland <mark.rutland@arm.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Signed-off-by: NWill Deacon <will.deacon@arm.com>
      3789c122
    • L
      powerpc/kvm/booke: Fix altivec related build break · b2d7ecbe
      Laurentiu Tudor 提交于
      Add missing "altivec unavailable" interrupt injection helper
      thus fixing the linker error below:
      
        arch/powerpc/kvm/emulate_loadstore.o: In function `kvmppc_check_altivec_disabled':
        arch/powerpc/kvm/emulate_loadstore.c: undefined reference to `.kvmppc_core_queue_vec_unavail'
      
      Fixes: 09f98496 ("KVM: PPC: Book3S: Add MMIO emulation for VMX instructions")
      Signed-off-by: NLaurentiu Tudor <laurentiu.tudor@nxp.com>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      b2d7ecbe
    • N
      powerpc: Fix deadlock with multiple calls to smp_send_stop · 6029755e
      Nicholas Piggin 提交于
      smp_send_stop can lock up the IPI path for any subsequent calls,
      because the receiving CPUs spin in their handler function. This
      started becoming a problem with the addition of an smp_send_stop
      call in the reboot path, because panics can reboot after doing
      their own smp_send_stop.
      
      The NMI IPI variant was fixed with ac61c115 ("powerpc: Fix
      smp_send_stop NMI IPI handling"), which leaves the smp_call_function
      variant.
      
      This is fixed by having smp_send_stop only ever do the
      smp_call_function once. This is a bit less robust than the NMI IPI
      fix, because any other call to smp_call_function after smp_send_stop
      could deadlock, but that has always been the case, and it was not
      been a problem before.
      
      Fixes: f2748bdf ("powerpc/powernv: Always stop secondaries before reboot/shutdown")
      Reported-by: NAbdul Haleem <abdhalee@linux.vnet.ibm.com>
      Signed-off-by: NNicholas Piggin <npiggin@gmail.com>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      6029755e
    • J
      x86/cpu/intel: Add missing TLB cpuid values · b837913f
      jacek.tomaka@poczta.fm 提交于
      Make kernel print the correct number of TLB entries on Intel Xeon Phi 7210
      (and others)
      
      Before:
      [ 0.320005] Last level dTLB entries: 4KB 0, 2MB 0, 4MB 0, 1GB 0
      After:
      [ 0.320005] Last level dTLB entries: 4KB 256, 2MB 128, 4MB 128, 1GB 16
      
      The entries do exist in the official Intel SMD but the type column there is
      incorrect (states "Cache" where it should read "TLB"), but the entries for
      the values 0x6B, 0x6C and 0x6D are correctly described as 'Data TLB'.
      Signed-off-by: NJacek Tomaka <jacek.tomaka@poczta.fm>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Link: https://lkml.kernel.org/r/20180423161425.24366-1-jacekt@dugeo.com
      b837913f
  8. 26 4月, 2018 10 次提交
  9. 25 4月, 2018 5 次提交
    • S
      tracing/x86: Update syscall trace events to handle new prefixed syscall func names · 1c758a22
      Steven Rostedt (VMware) 提交于
      Arnaldo noticed that the latest kernel is missing the syscall event system
      directory in x86. I bisected it down to d5a00528 ("syscalls/core,
      syscalls/x86: Rename struct pt_regs-based sys_*() to __x64_sys_*()").
      
      The system call trace events are special, as there is only one trace event
      for all system calls (the raw_syscalls). But a macro that wraps the system
      calls creates meta data for them that copies the name to find the system
      call that maps to the system call table (the number). At boot up, it does a
      kallsyms lookup of the system call table to find the function that maps to
      the meta data of the system call. If it does not find a function, then that
      system call is ignored.
      
      Because the x86 system calls had "__x64_", or "__ia32_" prefixed to the
      "sys" for the names, they do not match the default compare algorithm. As
      this was a problem for power pc, the algorithm can be overwritten by the
      architecture. The solution is to have x86 have its own algorithm to do the
      compare and this brings back the system call trace events.
      
      Link: http://lkml.kernel.org/r/20180417174128.0f3457f0@gandalf.local.homeReported-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Acked-by: NDominik Brodowski <linux@dominikbrodowski.net>
      Acked-by: NThomas Gleixner <tglx@linutronix.de>
      Fixes: d5a00528 ("syscalls/core, syscalls/x86: Rename struct pt_regs-based sys_*() to __x64_sys_*()")
      Signed-off-by: NSteven Rostedt (VMware) <rostedt@goodmis.org>
      1c758a22
    • N
      powerpc: Fix smp_send_stop NMI IPI handling · ac61c115
      Nicholas Piggin 提交于
      The NMI IPI handler for a receiving CPU increments nmi_ipi_busy_count
      over the handler function call, which causes later smp_send_nmi_ipi()
      callers to spin until the call is finished.
      
      The stop_this_cpu() function never returns, so the busy count is never
      decremeted, which can cause the system to hang in some cases. For
      example panic() will call smp_send_stop() early on which calls
      stop_this_cpu() on other CPUs, then later in the reboot path,
      pnv_restart() will call smp_send_stop() again, which hangs.
      
      Fix this by adding a special case to the stop_this_cpu() handler to
      decrement the busy count, because it will never return.
      
      Now that the NMI/non-NMI versions of stop_this_cpu() are different,
      split them out into separate functions rather than doing #ifdef tricks
      to share the body between the two functions.
      
      Fixes: 6bed3237 ("powerpc: use NMI IPI for smp_send_stop")
      Reported-by: NAbdul Haleem <abdhalee@linux.vnet.ibm.com>
      Signed-off-by: NNicholas Piggin <npiggin@gmail.com>
      [mpe: Split out the functions, tweak change log a bit]
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      ac61c115
    • D
      x86/pti: Filter at vma->vm_page_prot population · 316d097c
      Dave Hansen 提交于
      commit ce9962bf7e22bb3891655c349faff618922d4a73
      
      0day reported warnings at boot on 32-bit systems without NX support:
      
      attempted to set unsupported pgprot: 8000000000000025 bits: 8000000000000000 supported: 7fffffffffffffff
      WARNING: CPU: 0 PID: 1 at
      arch/x86/include/asm/pgtable.h:540 handle_mm_fault+0xfc1/0xfe0:
       check_pgprot at arch/x86/include/asm/pgtable.h:535
       (inlined by) pfn_pte at arch/x86/include/asm/pgtable.h:549
       (inlined by) do_anonymous_page at mm/memory.c:3169
       (inlined by) handle_pte_fault at mm/memory.c:3961
       (inlined by) __handle_mm_fault at mm/memory.c:4087
       (inlined by) handle_mm_fault at mm/memory.c:4124
      
      The problem is that due to the recent commit which removed auto-massaging
      of page protections, filtering page permissions at PTE creation time is not
      longer done, so vma->vm_page_prot is passed unfiltered to PTE creation.
      
      Filter the page protections before they are installed in vma->vm_page_prot.
      
      Fixes: fb43d6cb ("x86/mm: Do not auto-massage page protections")
      Reported-by: NFengguang Wu <fengguang.wu@intel.com>
      Signed-off-by: NDave Hansen <dave.hansen@linux.intel.com>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Acked-by: NIngo Molnar <mingo@kernel.org>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: Juergen Gross <jgross@suse.com>
      Cc: Kees Cook <keescook@google.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: David Woodhouse <dwmw2@infradead.org>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: linux-mm@kvack.org
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Nadav Amit <namit@vmware.com>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: Arjan van de Ven <arjan@linux.intel.com>
      Link: https://lkml.kernel.org/r/20180420222028.99D72858@viggo.jf.intel.com
      
      316d097c
    • D
      x86/pti: Disallow global kernel text with RANDSTRUCT · b7c21bc5
      Dave Hansen 提交于
      commit 26d35ca6c3776784f8156e1d6f80cc60d9a2a915
      
      RANDSTRUCT derives its hardening benefits from the attacker's lack of
      knowledge about the layout of kernel data structures.  Keep the kernel
      image non-global in cases where RANDSTRUCT is in use to help keep the
      layout a secret.
      
      Fixes: 8c06c774 (x86/pti: Leave kernel text global for !PCID)
      Reported-by: NKees Cook <keescook@google.com>
      Signed-off-by: NDave Hansen <dave.hansen@linux.intel.com>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Acked-by: NIngo Molnar <mingo@kernel.org>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: Juergen Gross <jgross@suse.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: David Woodhouse <dwmw2@infradead.org>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: linux-mm@kvack.org
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Nadav Amit <namit@vmware.com>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: Arjan van de Ven <arjan@linux.intel.com>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Link: https://lkml.kernel.org/r/20180420222026.D0B4AAC9@viggo.jf.intel.com
      
      b7c21bc5
    • D
      x86/pti: Reduce amount of kernel text allowed to be Global · a44ca8f5
      Dave Hansen 提交于
      commit abb67605203687c8b7943d760638d0301787f8d9
      
      Kees reported to me that I made too much of the kernel image global.
      It was far more than just text:
      
      	I think this is too much set global: _end is after data,
      	bss, and brk, and all kinds of other stuff that could
      	hold secrets. I think this should match what
      	mark_rodata_ro() is doing.
      
      This does exactly that.  We use __end_rodata_hpage_align as our
      marker both because it is huge-page-aligned and it does not contain
      any sections we expect to hold secrets.
      
      Kees's logic was that r/o data is in the kernel image anyway and,
      in the case of traditional distributions, can be freely downloaded
      from the web, so there's no reason to hide it.
      
      Fixes: 8c06c774 (x86/pti: Leave kernel text global for !PCID)
      Reported-by: NKees Cook <keescook@google.com>
      Signed-off-by: NDave Hansen <dave.hansen@linux.intel.com>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Acked-by: NIngo Molnar <mingo@kernel.org>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: Juergen Gross <jgross@suse.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: David Woodhouse <dwmw2@infradead.org>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: linux-mm@kvack.org
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Nadav Amit <namit@vmware.com>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: Arjan van de Ven <arjan@linux.intel.com>
      Link: https://lkml.kernel.org/r/20180420222023.1C8B2B20@viggo.jf.intel.com
      
      a44ca8f5