1. 08 3月, 2016 3 次提交
  2. 06 3月, 2016 1 次提交
  3. 03 3月, 2016 1 次提交
  4. 02 3月, 2016 1 次提交
  5. 28 2月, 2016 1 次提交
    • D
      mm: ASLR: use get_random_long() · 5ef11c35
      Daniel Cashman 提交于
      Replace calls to get_random_int() followed by a cast to (unsigned long)
      with calls to get_random_long().  Also address shifting bug which, in
      case of x86 removed entropy mask for mmap_rnd_bits values > 31 bits.
      Signed-off-by: NDaniel Cashman <dcashman@android.com>
      Acked-by: NKees Cook <keescook@chromium.org>
      Cc: "Theodore Ts'o" <tytso@mit.edu>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Nick Kralevich <nnk@google.com>
      Cc: Jeff Vander Stoep <jeffv@google.com>
      Cc: Mark Salyzyn <salyzyn@android.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      5ef11c35
  6. 27 2月, 2016 2 次提交
  7. 26 2月, 2016 2 次提交
  8. 25 2月, 2016 2 次提交
    • M
      KVM: x86: MMU: fix ubsan index-out-of-range warning · 17e4bce0
      Mike Krinkin 提交于
      Ubsan reports the following warning due to a typo in
      update_accessed_dirty_bits template, the patch fixes
      the typo:
      
      [  168.791851] ================================================================================
      [  168.791862] UBSAN: Undefined behaviour in arch/x86/kvm/paging_tmpl.h:252:15
      [  168.791866] index 4 is out of range for type 'u64 [4]'
      [  168.791871] CPU: 0 PID: 2950 Comm: qemu-system-x86 Tainted: G           O L  4.5.0-rc5-next-20160222 #7
      [  168.791873] Hardware name: LENOVO 23205NG/23205NG, BIOS G2ET95WW (2.55 ) 07/09/2013
      [  168.791876]  0000000000000000 ffff8801cfcaf208 ffffffff81c9f780 0000000041b58ab3
      [  168.791882]  ffffffff82eb2cc1 ffffffff81c9f6b4 ffff8801cfcaf230 ffff8801cfcaf1e0
      [  168.791886]  0000000000000004 0000000000000001 0000000000000000 ffffffffa1981600
      [  168.791891] Call Trace:
      [  168.791899]  [<ffffffff81c9f780>] dump_stack+0xcc/0x12c
      [  168.791904]  [<ffffffff81c9f6b4>] ? _atomic_dec_and_lock+0xc4/0xc4
      [  168.791910]  [<ffffffff81da9e81>] ubsan_epilogue+0xd/0x8a
      [  168.791914]  [<ffffffff81daafa2>] __ubsan_handle_out_of_bounds+0x15c/0x1a3
      [  168.791918]  [<ffffffff81daae46>] ? __ubsan_handle_shift_out_of_bounds+0x2bd/0x2bd
      [  168.791922]  [<ffffffff811287ef>] ? get_user_pages_fast+0x2bf/0x360
      [  168.791954]  [<ffffffffa1794050>] ? kvm_largepages_enabled+0x30/0x30 [kvm]
      [  168.791958]  [<ffffffff81128530>] ? __get_user_pages_fast+0x360/0x360
      [  168.791987]  [<ffffffffa181b818>] paging64_walk_addr_generic+0x1b28/0x2600 [kvm]
      [  168.792014]  [<ffffffffa1819cf0>] ? init_kvm_mmu+0x1100/0x1100 [kvm]
      [  168.792019]  [<ffffffff8129e350>] ? debug_check_no_locks_freed+0x350/0x350
      [  168.792044]  [<ffffffffa1819cf0>] ? init_kvm_mmu+0x1100/0x1100 [kvm]
      [  168.792076]  [<ffffffffa181c36d>] paging64_gva_to_gpa+0x7d/0x110 [kvm]
      [  168.792121]  [<ffffffffa181c2f0>] ? paging64_walk_addr_generic+0x2600/0x2600 [kvm]
      [  168.792130]  [<ffffffff812e848b>] ? debug_lockdep_rcu_enabled+0x7b/0x90
      [  168.792178]  [<ffffffffa17d9a4a>] emulator_read_write_onepage+0x27a/0x1150 [kvm]
      [  168.792208]  [<ffffffffa1794d44>] ? __kvm_read_guest_page+0x54/0x70 [kvm]
      [  168.792234]  [<ffffffffa17d97d0>] ? kvm_task_switch+0x160/0x160 [kvm]
      [  168.792238]  [<ffffffff812e848b>] ? debug_lockdep_rcu_enabled+0x7b/0x90
      [  168.792263]  [<ffffffffa17daa07>] emulator_read_write+0xe7/0x6d0 [kvm]
      [  168.792290]  [<ffffffffa183b620>] ? em_cr_write+0x230/0x230 [kvm]
      [  168.792314]  [<ffffffffa17db005>] emulator_write_emulated+0x15/0x20 [kvm]
      [  168.792340]  [<ffffffffa18465f8>] segmented_write+0xf8/0x130 [kvm]
      [  168.792367]  [<ffffffffa1846500>] ? em_lgdt+0x20/0x20 [kvm]
      [  168.792374]  [<ffffffffa14db512>] ? vmx_read_guest_seg_ar+0x42/0x1e0 [kvm_intel]
      [  168.792400]  [<ffffffffa1846d82>] writeback+0x3f2/0x700 [kvm]
      [  168.792424]  [<ffffffffa1846990>] ? em_sidt+0xa0/0xa0 [kvm]
      [  168.792449]  [<ffffffffa185554d>] ? x86_decode_insn+0x1b3d/0x4f70 [kvm]
      [  168.792474]  [<ffffffffa1859032>] x86_emulate_insn+0x572/0x3010 [kvm]
      [  168.792499]  [<ffffffffa17e71dd>] x86_emulate_instruction+0x3bd/0x2110 [kvm]
      [  168.792524]  [<ffffffffa17e6e20>] ? reexecute_instruction.part.110+0x2e0/0x2e0 [kvm]
      [  168.792532]  [<ffffffffa14e9a81>] handle_ept_misconfig+0x61/0x460 [kvm_intel]
      [  168.792539]  [<ffffffffa14e9a20>] ? handle_pause+0x450/0x450 [kvm_intel]
      [  168.792546]  [<ffffffffa15130ea>] vmx_handle_exit+0xd6a/0x1ad0 [kvm_intel]
      [  168.792572]  [<ffffffffa17f6a6c>] ? kvm_arch_vcpu_ioctl_run+0xbdc/0x6090 [kvm]
      [  168.792597]  [<ffffffffa17f6bcd>] kvm_arch_vcpu_ioctl_run+0xd3d/0x6090 [kvm]
      [  168.792621]  [<ffffffffa17f6a6c>] ? kvm_arch_vcpu_ioctl_run+0xbdc/0x6090 [kvm]
      [  168.792627]  [<ffffffff8293b530>] ? __ww_mutex_lock_interruptible+0x1630/0x1630
      [  168.792651]  [<ffffffffa17f5e90>] ? kvm_arch_vcpu_runnable+0x4f0/0x4f0 [kvm]
      [  168.792656]  [<ffffffff811eeb30>] ? preempt_notifier_unregister+0x190/0x190
      [  168.792681]  [<ffffffffa17e0447>] ? kvm_arch_vcpu_load+0x127/0x650 [kvm]
      [  168.792704]  [<ffffffffa178e9a3>] kvm_vcpu_ioctl+0x553/0xda0 [kvm]
      [  168.792727]  [<ffffffffa178e450>] ? vcpu_put+0x40/0x40 [kvm]
      [  168.792732]  [<ffffffff8129e350>] ? debug_check_no_locks_freed+0x350/0x350
      [  168.792735]  [<ffffffff82946087>] ? _raw_spin_unlock+0x27/0x40
      [  168.792740]  [<ffffffff8163a943>] ? handle_mm_fault+0x1673/0x2e40
      [  168.792744]  [<ffffffff8129daa8>] ? trace_hardirqs_on_caller+0x478/0x6c0
      [  168.792747]  [<ffffffff8129dcfd>] ? trace_hardirqs_on+0xd/0x10
      [  168.792751]  [<ffffffff812e848b>] ? debug_lockdep_rcu_enabled+0x7b/0x90
      [  168.792756]  [<ffffffff81725a80>] do_vfs_ioctl+0x1b0/0x12b0
      [  168.792759]  [<ffffffff817258d0>] ? ioctl_preallocate+0x210/0x210
      [  168.792763]  [<ffffffff8174aef3>] ? __fget+0x273/0x4a0
      [  168.792766]  [<ffffffff8174acd0>] ? __fget+0x50/0x4a0
      [  168.792770]  [<ffffffff8174b1f6>] ? __fget_light+0x96/0x2b0
      [  168.792773]  [<ffffffff81726bf9>] SyS_ioctl+0x79/0x90
      [  168.792777]  [<ffffffff82946880>] entry_SYSCALL_64_fastpath+0x23/0xc1
      [  168.792780] ================================================================================
      Signed-off-by: NMike Krinkin <krinkin.m.u@gmail.com>
      Reviewed-by: NXiao Guangrong <guangrong.xiao@linux.intel.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      17e4bce0
    • A
      x86/entry/compat: Add missing CLAC to entry_INT80_32 · 3d44d51b
      Andy Lutomirski 提交于
      This doesn't seem to fix a regression -- I don't think the CLAC was
      ever there.
      
      I double-checked in a debugger: entries through the int80 gate do
      not automatically clear AC.
      
      Stable maintainers: I can provide a backport to 4.3 and earlier if
      needed.  This needs to be backported all the way to 3.10.
      Reported-by: NBrian Gerst <brgerst@gmail.com>
      Signed-off-by: NAndy Lutomirski <luto@kernel.org>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: <stable@vger.kernel.org> # v3.10 and later
      Fixes: 63bcff2a ("x86, smap: Add STAC and CLAC instructions to control user space access")
      Link: http://lkml.kernel.org/r/b02b7e71ae54074be01fc171cbd4b72517055c0e.1456345086.git.luto@kernel.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>
      3d44d51b
  9. 24 2月, 2016 4 次提交
    • P
      KVM: x86: fix conversion of addresses to linear in 32-bit protected mode · 0c1d77f4
      Paolo Bonzini 提交于
      Commit e8dd2d2d ("Silence compiler warning in arch/x86/kvm/emulate.c",
      2015-09-06) broke boot of the Hurd.  The bug is that the "default:"
      case actually could modify "la", but after the patch this change is
      not reflected in *linear.
      
      The bug is visible whenever a non-zero segment base causes the linear
      address to wrap around the 4GB mark.
      
      Fixes: e8dd2d2d
      Cc: stable@vger.kernel.org
      Reported-by: NAurelien Jarno <aurelien@aurel32.net>
      Tested-by: NAurelien Jarno <aurelien@aurel32.net>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      0c1d77f4
    • P
      KVM: x86: fix missed hardware breakpoints · 172b2386
      Paolo Bonzini 提交于
      Sometimes when setting a breakpoint a process doesn't stop on it.
      This is because the debug registers are not loaded correctly on
      VCPU load.
      
      The following simple reproducer from Oleg Nesterov tries using debug
      registers in two threads.  To see the bug, run a 2-VCPU guest with
      "taskset -c 0" and run "./bp 0 1" inside the guest.
      
          #include <unistd.h>
          #include <signal.h>
          #include <stdlib.h>
          #include <stdio.h>
          #include <sys/wait.h>
          #include <sys/ptrace.h>
          #include <sys/user.h>
          #include <asm/debugreg.h>
          #include <assert.h>
      
          #define offsetof(TYPE, MEMBER) ((size_t) &((TYPE *)0)->MEMBER)
      
          unsigned long encode_dr7(int drnum, int enable, unsigned int type, unsigned int len)
          {
              unsigned long dr7;
      
              dr7 = ((len | type) & 0xf)
                  << (DR_CONTROL_SHIFT + drnum * DR_CONTROL_SIZE);
              if (enable)
                  dr7 |= (DR_GLOBAL_ENABLE << (drnum * DR_ENABLE_SIZE));
      
              return dr7;
          }
      
          int write_dr(int pid, int dr, unsigned long val)
          {
              return ptrace(PTRACE_POKEUSER, pid,
                      offsetof (struct user, u_debugreg[dr]),
                      val);
          }
      
          void set_bp(pid_t pid, void *addr)
          {
              unsigned long dr7;
              assert(write_dr(pid, 0, (long)addr) == 0);
              dr7 = encode_dr7(0, 1, DR_RW_EXECUTE, DR_LEN_1);
              assert(write_dr(pid, 7, dr7) == 0);
          }
      
          void *get_rip(int pid)
          {
              return (void*)ptrace(PTRACE_PEEKUSER, pid,
                      offsetof(struct user, regs.rip), 0);
          }
      
          void test(int nr)
          {
              void *bp_addr = &&label + nr, *bp_hit;
              int pid;
      
              printf("test bp %d\n", nr);
              assert(nr < 16); // see 16 asm nops below
      
              pid = fork();
              if (!pid) {
                  assert(ptrace(PTRACE_TRACEME, 0,0,0) == 0);
                  kill(getpid(), SIGSTOP);
                  for (;;) {
                      label: asm (
                          "nop; nop; nop; nop;"
                          "nop; nop; nop; nop;"
                          "nop; nop; nop; nop;"
                          "nop; nop; nop; nop;"
                      );
                  }
              }
      
              assert(pid == wait(NULL));
              set_bp(pid, bp_addr);
      
              for (;;) {
                  assert(ptrace(PTRACE_CONT, pid, 0, 0) == 0);
                  assert(pid == wait(NULL));
      
                  bp_hit = get_rip(pid);
                  if (bp_hit != bp_addr)
                      fprintf(stderr, "ERR!! hit wrong bp %ld != %d\n",
                          bp_hit - &&label, nr);
              }
          }
      
          int main(int argc, const char *argv[])
          {
              while (--argc) {
                  int nr = atoi(*++argv);
                  if (!fork())
                      test(nr);
              }
      
              while (wait(NULL) > 0)
                  ;
              return 0;
          }
      
      Cc: stable@vger.kernel.org
      Suggested-by: NNadav Amit <namit@cs.technion.ac.il>
      Reported-by: NAndrey Wagin <avagin@gmail.com>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      172b2386
    • A
      x86/entry/32: Add an ASM_CLAC to entry_SYSENTER_32 · 04d1d281
      Andy Lutomirski 提交于
      Both before and after 5f310f73 ("x86/entry/32: Re-implement
      SYSENTER using the new C path"), we relied on a uaccess very early
      in the SYSENTER path to clear AC.  After that change, though, we can
      potentially make it all the way into C code with AC set, which
      enlarges the attack surface for SMAP bypass by doing SYSENTER with
      AC set.
      
      Strengthen the SMAP protection by addding the missing ASM_CLAC right
      at the beginning.
      Signed-off-by: NAndy Lutomirski <luto@kernel.org>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/3e36be110724896e32a4a1fe73bacb349d3cba94.1456262295.git.luto@kernel.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>
      04d1d281
    • L
      x86: fix SMAP in 32-bit environments · de9e478b
      Linus Torvalds 提交于
      In commit 11f1a4b9 ("x86: reorganize SMAP handling in user space
      accesses") I changed how the stac/clac instructions were generated
      around the user space accesses, which then made it possible to do
      batched accesses efficiently for user string copies etc.
      
      However, in doing so, I completely spaced out, and didn't even think
      about the 32-bit case.  And nobody really even seemed to notice, because
      SMAP doesn't even exist until modern Skylake processors, and you'd have
      to be crazy to run 32-bit kernels on a modern CPU.
      
      Which brings us to Andy Lutomirski.
      
      He actually tested the 32-bit kernel on new hardware, and noticed that
      it doesn't work.  My bad.  The trivial fix is to add the required
      uaccess begin/end markers around the raw accesses in <asm/uaccess_32.h>.
      
      I feel a bit bad about this patch, just because that header file really
      should be cleaned up to avoid all the duplicated code in it, and this
      commit just expands on the problem.  But this just fixes the bug without
      any bigger cleanup surgery.
      Reported-and-tested-by: NAndy Lutomirski <luto@kernel.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      de9e478b
  10. 23 2月, 2016 1 次提交
  11. 19 2月, 2016 1 次提交
  12. 18 2月, 2016 5 次提交
    • T
      x86/mce: Check for faults tagged in EXTABLE_CLASS_FAULT exception table entries · b2f9d678
      Tony Luck 提交于
      Extend the severity checking code to add a new context IN_KERN_RECOV
      which is used to indicate that the machine check was triggered by code
      in the kernel tagged with _ASM_EXTABLE_FAULT() so that the ex_handler_fault()
      handler will provide the fixup code with the trap number.
      
      Major re-work to the tail code in do_machine_check() to make all this
      readable/maintainable. One functional change is that tolerant=3 no longer
      stops recovery actions. Revert to only skipping sending SIGBUS to the
      current process.
      Signed-off-by: NTony Luck <tony.luck@intel.com>
      Reviewed-by: NBorislav Petkov <bp@suse.de>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/89d243d05a7943bb187d1074bb30d9c4f482d5f5.1455732970.git.tony.luck@intel.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      b2f9d678
    • T
      x86/mm: Expand the exception table logic to allow new handling options · 548acf19
      Tony Luck 提交于
      Huge amounts of help from  Andy Lutomirski and Borislav Petkov to
      produce this. Andy provided the inspiration to add classes to the
      exception table with a clever bit-squeezing trick, Boris pointed
      out how much cleaner it would all be if we just had a new field.
      
      Linus Torvalds blessed the expansion with:
      
        ' I'd rather not be clever in order to save just a tiny amount of space
          in the exception table, which isn't really criticial for anybody. '
      
      The third field is another relative function pointer, this one to a
      handler that executes the actions.
      
      We start out with three handlers:
      
       1: Legacy - just jumps the to fixup IP
       2: Fault - provide the trap number in %ax to the fixup code
       3: Cleaned up legacy for the uaccess error hack
      Signed-off-by: NTony Luck <tony.luck@intel.com>
      Reviewed-by: NBorislav Petkov <bp@suse.de>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/f6af78fcbd348cf4939875cfda9c19689b5e50b8.1455732970.git.tony.luck@intel.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      548acf19
    • T
      x86/mm: Fix vmalloc_fault() to handle large pages properly · f4eafd8b
      Toshi Kani 提交于
      A kernel page fault oops with the callstack below was observed
      when a read syscall was made to a pmem device after a huge amount
      (>512GB) of vmalloc ranges was allocated by ioremap() on a x86_64
      system:
      
           BUG: unable to handle kernel paging request at ffff880840000ff8
           IP: vmalloc_fault+0x1be/0x300
           PGD c7f03a067 PUD 0
           Oops: 0000 [#1] SM
           Call Trace:
              __do_page_fault+0x285/0x3e0
              do_page_fault+0x2f/0x80
              ? put_prev_entity+0x35/0x7a0
              page_fault+0x28/0x30
              ? memcpy_erms+0x6/0x10
              ? schedule+0x35/0x80
              ? pmem_rw_bytes+0x6a/0x190 [nd_pmem]
              ? schedule_timeout+0x183/0x240
              btt_log_read+0x63/0x140 [nd_btt]
               :
              ? __symbol_put+0x60/0x60
              ? kernel_read+0x50/0x80
              SyS_finit_module+0xb9/0xf0
              entry_SYSCALL_64_fastpath+0x1a/0xa4
      
      Since v4.1, ioremap() supports large page (pud/pmd) mappings in
      x86_64 and PAE.  vmalloc_fault() however assumes that the vmalloc
      range is limited to pte mappings.
      
      vmalloc faults do not normally happen in ioremap'd ranges since
      ioremap() sets up the kernel page tables, which are shared by
      user processes.  pgd_ctor() sets the kernel's PGD entries to
      user's during fork().  When allocation of the vmalloc ranges
      crosses a 512GB boundary, ioremap() allocates a new pud table
      and updates the kernel PGD entry to point it.  If user process's
      PGD entry does not have this update yet, a read/write syscall
      to the range will cause a vmalloc fault, which hits the Oops
      above as it does not handle a large page properly.
      
      Following changes are made to vmalloc_fault().
      
      64-bit:
      
       - No change for the PGD sync operation as it handles large
         pages already.
       - Add pud_huge() and pmd_huge() to the validation code to
         handle large pages.
       - Change pud_page_vaddr() to pud_pfn() since an ioremap range
         is not directly mapped (while the if-statement still works
         with a bogus addr).
       - Change pmd_page() to pmd_pfn() since an ioremap range is not
         backed by struct page (while the if-statement still works
         with a bogus addr).
      
      32-bit:
       - No change for the sync operation since the index3 PGD entry
         covers the entire vmalloc range, which is always valid.
         (A separate change to sync PGD entry is necessary if this
          memory layout is changed regardless of the page size.)
       - Add pmd_huge() to the validation code to handle large pages.
         This is for completeness since vmalloc_fault() won't happen
         in ioremap'd ranges as its PGD entry is always valid.
      Reported-by: NHenning Schild <henning.schild@siemens.com>
      Signed-off-by: NToshi Kani <toshi.kani@hpe.com>
      Acked-by: NBorislav Petkov <bp@alien8.de>
      Cc: <stable@vger.kernel.org> # 4.1+
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Luis R. Rodriguez <mcgrof@suse.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Toshi Kani <toshi.kani@hp.com>
      Cc: linux-mm@kvack.org
      Cc: linux-nvdimm@lists.01.org
      Link: http://lkml.kernel.org/r/1455758214-24623-1-git-send-email-toshi.kani@hpe.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      f4eafd8b
    • B
      Revert "PCI: Add helpers to manage pci_dev->irq and pci_dev->irq_managed" · 67b4eab9
      Bjorn Helgaas 提交于
      Revert 811a4e6f ("PCI: Add helpers to manage pci_dev->irq and
      pci_dev->irq_managed").
      
      This is part of reverting 991de2e5 ("PCI, x86: Implement
      pcibios_alloc_irq() and pcibios_free_irq()") to fix regressions it
      introduced.
      
      Link: https://bugzilla.kernel.org/show_bug.cgi?id=111211
      Fixes: 991de2e5 ("PCI, x86: Implement pcibios_alloc_irq() and pcibios_free_irq()")
      Signed-off-by: NBjorn Helgaas <bhelgaas@google.com>
      Acked-by: NRafael J. Wysocki <rafael@kernel.org>
      CC: Jiang Liu <jiang.liu@linux.intel.com>
      67b4eab9
    • B
      Revert "x86/PCI: Don't alloc pcibios-irq when MSI is enabled" · fe25d078
      Bjorn Helgaas 提交于
      Revert 8affb487 ("x86/PCI: Don't alloc pcibios-irq when MSI is
      enabled").
      
      This is part of reverting 991de2e5 ("PCI, x86: Implement
      pcibios_alloc_irq() and pcibios_free_irq()") to fix regressions it
      introduced.
      
      Link: https://bugzilla.kernel.org/show_bug.cgi?id=111211
      Fixes: 991de2e5 ("PCI, x86: Implement pcibios_alloc_irq() and pcibios_free_irq()")
      Signed-off-by: NBjorn Helgaas <bhelgaas@google.com>
      Acked-by: NRafael J. Wysocki <rafael@kernel.org>
      CC: Jiang Liu <jiang.liu@linux.intel.com>
      CC: Joerg Roedel <jroedel@suse.de>
      fe25d078
  13. 17 2月, 2016 4 次提交
    • M
      hpet: Drop stale URLs · 4e7f9df2
      Michael S. Tsirkin 提交于
      Looks like the HPET spec at intel.com got moved.
      It isn't hard to find so drop the link, just mention
      the revision assumed.
      Suggested-by: NThomas Gleixner <tglx@linutronix.de>
      Signed-off-by: NMichael S. Tsirkin <mst@redhat.com>
      Acked-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Clemens Ladisch <clemens@ladisch.de>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: linux-doc@vger.kernel.org
      Link: http://lkml.kernel.org/r/1455145462-3877-1-git-send-email-mst@redhat.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      4e7f9df2
    • T
      x86/uaccess/64: Handle the caching of 4-byte nocache copies properly in __copy_user_nocache() · a82eee74
      Toshi Kani 提交于
      Data corruption issues were observed in tests which initiated
      a system crash/reset while accessing BTT devices.  This problem
      is reproducible.
      
      The BTT driver calls pmem_rw_bytes() to update data in pmem
      devices.  This interface calls __copy_user_nocache(), which
      uses non-temporal stores so that the stores to pmem are
      persistent.
      
      __copy_user_nocache() uses non-temporal stores when a request
      size is 8 bytes or larger (and is aligned by 8 bytes).  The
      BTT driver updates the BTT map table, which entry size is
      4 bytes.  Therefore, updates to the map table entries remain
      cached, and are not written to pmem after a crash.
      
      Change __copy_user_nocache() to use non-temporal store when
      a request size is 4 bytes.  The change extends the current
      byte-copy path for a less-than-8-bytes request, and does not
      add any overhead to the regular path.
      Reported-and-tested-by: NMicah Parrish <micah.parrish@hpe.com>
      Reported-and-tested-by: NBrian Boylston <brian.boylston@hpe.com>
      Signed-off-by: NToshi Kani <toshi.kani@hpe.com>
      Cc: <stable@vger.kernel.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Luis R. Rodriguez <mcgrof@suse.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Toshi Kani <toshi.kani@hp.com>
      Cc: Vishal Verma <vishal.l.verma@intel.com>
      Cc: linux-nvdimm@lists.01.org
      Link: http://lkml.kernel.org/r/1455225857-12039-3-git-send-email-toshi.kani@hpe.com
      [ Small readability edits. ]
      Signed-off-by: NIngo Molnar <mingo@kernel.org>
      a82eee74
    • T
      x86/uaccess/64: Make the __copy_user_nocache() assembly code more readable · ee9737c9
      Toshi Kani 提交于
      Add comments to __copy_user_nocache() to clarify its procedures
      and alignment requirements.
      
      Also change numeric branch target labels to named local labels.
      
      No code changed:
      
       arch/x86/lib/copy_user_64.o:
      
          text    data     bss     dec     hex filename
          1239       0       0    1239     4d7 copy_user_64.o.before
          1239       0       0    1239     4d7 copy_user_64.o.after
      
       md5:
          58bed94c2db98c1ca9a2d46d0680aaae  copy_user_64.o.before.asm
          58bed94c2db98c1ca9a2d46d0680aaae  copy_user_64.o.after.asm
      Signed-off-by: NToshi Kani <toshi.kani@hpe.com>
      Cc: <stable@vger.kernel.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Luis R. Rodriguez <mcgrof@suse.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Toshi Kani <toshi.kani@hp.com>
      Cc: brian.boylston@hpe.com
      Cc: dan.j.williams@intel.com
      Cc: linux-nvdimm@lists.01.org
      Cc: micah.parrish@hpe.com
      Cc: ross.zwisler@linux.intel.com
      Cc: vishal.l.verma@intel.com
      Link: http://lkml.kernel.org/r/1455225857-12039-2-git-send-email-toshi.kani@hpe.com
      [ Small readability edits and added object file comparison. ]
      Signed-off-by: NIngo Molnar <mingo@kernel.org>
      ee9737c9
    • T
      perf/x86/amd/uncore: Plug reference leak · 8bc9162c
      Thomas Gleixner 提交于
      In the error path of amd_uncore_cpu_up_prepare() the newly allocated uncore
      struct is freed, but the percpu pointer still references it. Set it to NULL.
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Link: http://lkml.kernel.org/r/alpine.DEB.2.11.1602162302170.19512@nanosSigned-off-by: NIngo Molnar <mingo@kernel.org>
      8bc9162c
  14. 15 2月, 2016 1 次提交
  15. 12 2月, 2016 1 次提交
  16. 08 2月, 2016 1 次提交
    • I
      x86/mm/numa: Fix 32-bit memblock range truncation bug on 32-bit NUMA kernels · 59fd1214
      Ingo Molnar 提交于
      The following commit:
      
        a0acda91 ("acpi, numa, mem_hotplug: mark all nodes the kernel resides un-hotpluggable")
      
      Introduced numa_clear_kernel_node_hotplug(), which function is executed
      during early bootup, and which marks all currently reserved memblock
      regions as hot-memory-unswappable as well.
      
      y14sg1 <y14sg1@comcast.net> reported that when running 32-bit NUMA kernels,
      the grsecurity/PAX kernel patch flagged a size overflow in this function:
      
        PAX: size overflow detected in function x86_numa_init arch/x86/mm/numa.c:691 [...]
      
      ... the reason for the overflow is that memblock_clear_hotplug() takes physical
      addresses as arguments, while the start/end variables used by
      numa_clear_kernel_node_hotplug() are 'unsigned long', which is 32-bit on PAE
      kernels, but which has 64-bit physical addresses.
      
      So on 32-bit PAE kernels that have physical memory above the 4GB boundary,
      we truncate a 64-bit physical address range to 32 bits and pass it to
      memblock_clear_hotplug(), which at minimum prevents the original memory-hotplug
      bugfix from working, but might have other side effects as well.
      
      The fix is to use the proper type to handle physical addresses, phys_addr_t.
      Reported-by: Ny14sg1 <y14sg1@comcast.net>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Brad Spengler <spender@grsecurity.net>
      Cc: Chen Tang <imtangchen@gmail.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: PaX Team <pageexec@freemail.hu>
      Cc: Taku Izumi <izumi.taku@jp.fujitsu.com>
      Cc: Tang Chen <tangchen@cn.fujitsu.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Wen Congyang <wency@cn.fujitsu.com>
      Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
      Cc: Zhang Yanfei <zhangyanfei@cn.fujitsu.com>
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: NIngo Molnar <mingo@kernel.org>
      59fd1214
  17. 06 2月, 2016 1 次提交
    • V
      mm, hugetlb: don't require CMA for runtime gigantic pages · 080fe206
      Vlastimil Babka 提交于
      Commit 944d9fec ("hugetlb: add support for gigantic page allocation
      at runtime") has added the runtime gigantic page allocation via
      alloc_contig_range(), making this support available only when CONFIG_CMA
      is enabled.  Because it doesn't depend on MIGRATE_CMA pageblocks and the
      associated infrastructure, it is possible with few simple adjustments to
      require only CONFIG_MEMORY_ISOLATION instead of full CONFIG_CMA.
      
      After this patch, alloc_contig_range() and related functions are
      available and used for gigantic pages with just CONFIG_MEMORY_ISOLATION
      enabled.  Note CONFIG_CMA selects CONFIG_MEMORY_ISOLATION.  This allows
      supporting runtime gigantic pages without the CMA-specific checks in
      page allocator fastpaths.
      Signed-off-by: NVlastimil Babka <vbabka@suse.cz>
      Cc: Luiz Capitulino <lcapitulino@redhat.com>
      Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Cc: Zhang Yanfei <zhangyanfei@cn.fujitsu.com>
      Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
      Cc: Mel Gorman <mgorman@techsingularity.net>
      Cc: Davidlohr Bueso <dave@stgolabs.net>
      Cc: Hillf Danton <hillf.zj@alibaba-inc.com>
      Cc: Mike Kravetz <mike.kravetz@oracle.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      080fe206
  18. 05 2月, 2016 1 次提交
    • D
      x86: Fix KASAN false positives in thread_saved_pc() · 75edb54a
      Dmitry Vyukov 提交于
      thread_saved_pc() reads stack of a potentially running task.
      This can cause false KASAN stack-out-of-bounds reports,
      because the running task concurrently poisons and unpoisons
      own stack.
      
      The same happens in get_wchan(), and get get_wchan() was fixed
      by using READ_ONCE_NOCHECK(). Do the same here.
      
      Example KASAN report triggered by sysrq-t:
      
        BUG: KASAN: out-of-bounds in sched_show_task+0x306/0x3b0 at addr ffff880043c97c18
        Read of size 8 by task syz-executor/23839
        [...]
        page dumped because: kasan: bad access detected
        [...]
        Call Trace:
         [<ffffffff8175ea0e>] __asan_report_load8_noabort+0x3e/0x40
         [<ffffffff813e7a26>] sched_show_task+0x306/0x3b0
         [<ffffffff813e7bf4>] show_state_filter+0x124/0x1a0
         [<ffffffff82d2ca00>] fn_show_state+0x10/0x20
         [<ffffffff82d2cf98>] k_spec+0xa8/0xe0
         [<ffffffff82d3354f>] kbd_event+0xb9f/0x4000
         [<ffffffff843ca8a7>] input_to_handler+0x3a7/0x4b0
         [<ffffffff843d1954>] input_pass_values.part.5+0x554/0x6b0
         [<ffffffff843d29bc>] input_handle_event+0x2ac/0x1070
         [<ffffffff843d3a47>] input_inject_event+0x237/0x280
         [<ffffffff843e8c28>] evdev_write+0x478/0x680
         [<ffffffff817ac653>] __vfs_write+0x113/0x480
         [<ffffffff817ae0e7>] vfs_write+0x167/0x4a0
         [<ffffffff817b13d1>] SyS_write+0x111/0x220
      Signed-off-by: NDmitry Vyukov <dvyukov@google.com>
      Acked-by: NAndrey Ryabinin <aryabinin@virtuozzo.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: glider@google.com
      Cc: kasan-dev@googlegroups.com
      Cc: kcc@google.com
      Cc: linux-kernel@vger.kernel.org
      Cc: ryabinin.a.a@gmail.com
      Signed-off-by: NIngo Molnar <mingo@kernel.org>
      75edb54a
  19. 01 2月, 2016 6 次提交
  20. 29 1月, 2016 1 次提交
    • M
      x86/mm/pat: Avoid truncation when converting cpa->numpages to address · 74256377
      Matt Fleming 提交于
      There are a couple of nasty truncation bugs lurking in the pageattr
      code that can be triggered when mapping EFI regions, e.g. when we pass
      a cpa->pgd pointer. Because cpa->numpages is a 32-bit value, shifting
      left by PAGE_SHIFT will truncate the resultant address to 32-bits.
      
      Viorel-Cătălin managed to trigger this bug on his Dell machine that
      provides a ~5GB EFI region which requires 1236992 pages to be mapped.
      When calling populate_pud() the end of the region gets calculated
      incorrectly in the following buggy expression,
      
        end = start + (cpa->numpages << PAGE_SHIFT);
      
      And only 188416 pages are mapped. Next, populate_pud() gets invoked
      for a second time because of the loop in __change_page_attr_set_clr(),
      only this time no pages get mapped because shifting the remaining
      number of pages (1048576) by PAGE_SHIFT is zero. At which point the
      loop in __change_page_attr_set_clr() spins forever because we fail to
      map progress.
      
      Hitting this bug depends very much on the virtual address we pick to
      map the large region at and how many pages we map on the initial run
      through the loop. This explains why this issue was only recently hit
      with the introduction of commit
      
        a5caa209 ("x86/efi: Fix boot crash by mapping EFI memmap
         entries bottom-up at runtime, instead of top-down")
      
      It's interesting to note that safe uses of cpa->numpages do exist in
      the pageattr code. If instead of shifting ->numpages we multiply by
      PAGE_SIZE, no truncation occurs because PAGE_SIZE is a UL value, and
      so the result is unsigned long.
      
      To avoid surprises when users try to convert very large cpa->numpages
      values to addresses, change the data type from 'int' to 'unsigned
      long', thereby making it suitable for shifting by PAGE_SHIFT without
      any type casting.
      
      The alternative would be to make liberal use of casting, but that is
      far more likely to cause problems in the future when someone adds more
      code and fails to cast properly; this bug was difficult enough to
      track down in the first place.
      Reported-and-tested-by: NViorel-Cătălin Răpițeanu <rapiteanu.catalin@gmail.com>
      Acked-by: NBorislav Petkov <bp@alien8.de>
      Cc: Sai Praneeth Prakhya <sai.praneeth.prakhya@intel.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: NMatt Fleming <matt@codeblueprint.co.uk>
      Link: https://bugzilla.kernel.org/show_bug.cgi?id=110131
      Link: http://lkml.kernel.org/r/1454067370-10374-1-git-send-email-matt@codeblueprint.co.ukSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
      74256377