1. 28 10月, 2016 3 次提交
    • I
      perf/x86/intel: Honour the CPUID for number of fixed counters in hypervisors · f92b7604
      Imre Palik 提交于
      perf doesn't seem to honour the number of fixed counters specified by CPUID
      leaf 0xa. It always assumes that Intel CPUs have at least 3 fixed counters.
      
      So if some of the fixed counters are masked out by the hypervisor, it still
      tries to check/set them.
      
      This patch makes perf behave nicer when the kernel is running under a
      hypervisor that doesn't expose all the counters.
      
      This patch contains some ideas from Matt Wilson.
      Signed-off-by: NImre Palik <imrep@amazon.de>
      Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Reviewed-by: NAndi Kleen <ak@linux.intel.com>
      Cc: Alexander Kozyrev <alexander.kozyrev@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Artyom Kuanbekov <artyom.kuanbekov@intel.com>
      Cc: David Carrillo-Cisneros <davidcc@google.com>
      Cc: David Woodhouse <dwmw@amazon.co.uk>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Matt Wilson <msw@amazon.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/1477037939-15605-1-git-send-email-imrep.amz@gmail.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      f92b7604
    • B
      x86/microcode/AMD: Fix more fallout from CONFIG_RANDOMIZE_MEMORY=y · 1c27f646
      Borislav Petkov 提交于
      We needed the physical address of the container in order to compute the
      offset within the relocated ramdisk. And we did this by doing __pa() on
      the virtual address.
      
      However, __pa() does checks whether the physical address is within
      PAGE_OFFSET and __START_KERNEL_map - see __phys_addr() - which fail
      if we have CONFIG_RANDOMIZE_MEMORY enabled: we feed a virtual address
      which *doesn't* have the randomization offset into a function which uses
      PAGE_OFFSET which *does* have that offset.
      
      This makes this check fire:
      
      	VIRTUAL_BUG_ON((x > y) || !phys_addr_valid(x));
      			^^^^^^
      
      due to the randomization offset.
      
      The fix is as simple as using __pa_nodebug() because we do that
      randomization offset accounting later in that function ourselves.
      Reported-by: NBob Peterson <rpeterso@redhat.com>
      Tested-by: NBob Peterson <rpeterso@redhat.com>
      Signed-off-by: NBorislav Petkov <bp@suse.de>
      Cc: Andreas Gruenbacher <agruenba@redhat.com>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Mel Gorman <mgorman@techsingularity.net>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Steven Whitehouse <swhiteho@redhat.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-mm <linux-mm@kvack.org>
      Cc: stable@vger.kernel.org # 4.9
      Link: http://lkml.kernel.org/r/20161027123623.j2jri5bandimboff@pd.tnicSigned-off-by: NIngo Molnar <mingo@kernel.org>
      1c27f646
    • M
      kconfig.h: remove config_enabled() macro · c0a0aba8
      Masahiro Yamada 提交于
      The use of config_enabled() is ambiguous.  For config options,
      IS_ENABLED(), IS_REACHABLE(), etc.  will make intention clearer.
      Sometimes config_enabled() has been used for non-config options because
      it is useful to check whether the given symbol is defined or not.
      
      I have been tackling on deprecating config_enabled(), and now is the
      time to finish this work.
      
      Some new users have appeared for v4.9-rc1, but it is trivial to replace
      them:
      
       - arch/x86/mm/kaslr.c
        replace config_enabled() with IS_ENABLED() because
        CONFIG_X86_ESPFIX64 and CONFIG_EFI are boolean.
      
       - include/asm-generic/export.h
        replace config_enabled() with __is_defined().
      
      Then, config_enabled() can be removed now.
      
      Going forward, please use IS_ENABLED(), IS_REACHABLE(), etc. for config
      options, and __is_defined() for non-config symbols.
      
      Link: http://lkml.kernel.org/r/1476616078-32252-1-git-send-email-yamada.masahiro@socionext.comSigned-off-by: NMasahiro Yamada <yamada.masahiro@socionext.com>
      Acked-by: NIngo Molnar <mingo@kernel.org>
      Acked-by: NNicolas Pitre <nicolas.pitre@linaro.org>
      Cc: Peter Oberparleiter <oberpar@linux.vnet.ibm.com>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Michal Marek <mmarek@suse.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Thomas Garnier <thgarnie@google.com>
      Cc: Paul Bolle <pebolle@tiscali.nl>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      c0a0aba8
  2. 26 10月, 2016 2 次提交
    • S
      x86: Fix export for mcount and __fentry__ · 5de0a8c0
      Steven Rostedt 提交于
      Commit 784d5699 ("x86: move exports to actual definitions") removed the
      EXPORT_SYMBOL(__fentry__) and EXPORT_SYMBOL(mcount) from x8664_ksyms_64.c,
      and added EXPORT_SYMBOL(function_hook) in mcount_64.S instead. The problem
      is that function_hook isn't a function at all, but a macro that is defined
      as either mcount or __fentry__ depending on the support from gcc.
      
      Originally, I thought this was a macro issue, like what __stringify()
      is used for. But the problem is a bit deeper. The Makefile.build has
      some magic that does post processing of files to create the CRC
      bindings. It does some searches for EXPORT_SYMBOL() and because it
      finds a macro name and not the actual functions, this causes
      function_hook not to be converted into mcount or __fentry__ and they
      are missed.
      
      Instead of adding more magic to Makefile.build, just add
      EXPORT_SYMBOL() for mcount and __fentry__ where the ifdef is used.
      Since this is assembly and not C, it doesn't require being set after
      the function is defined.
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      Tested-by: NBorislav Petkov <bp@alien8.de>
      Cc: Gabriel C <nix.or.die@gmail.com>
      Cc: Nicholas Piggin <npiggin@gmail.com>
      Cc: Al Viro <viro@ZenIV.linux.org.uk>
      Link: http://lkml.kernel.org/r/20161024150148.4f9d90e4@gandalf.local.homeSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
      5de0a8c0
    • D
      x86/io: add interface to reserve io memtype for a resource range. (v1.1) · 8ef42276
      Dave Airlie 提交于
      A recent change to the mm code in:
      87744ab3 mm: fix cache mode tracking in vm_insert_mixed()
      
      started enforcing checking the memory type against the registered list for
      amixed pfn insertion mappings. It happens that the drm drivers for a number
      of gpus relied on this being broken. Currently the driver only inserted
      VRAM mappings into the tracking table when they came from the kernel,
      and userspace mappings never landed in the table. This led to a regression
      where all the mapping end up as UC instead of WC now.
      
      I've considered a number of solutions but since this needs to be fixed
      in fixes and not next, and some of the solutions were going to introduce
      overhead that hadn't been there before I didn't consider them viable at
      this stage. These mainly concerned hooking into the TTM io reserve APIs,
      but these API have a bunch of fast paths I didn't want to unwind to add
      this to.
      
      The solution I've decided on is to add a new API like the arch_phys_wc
      APIs (these would have worked but wc_del didn't take a range), and
      use them from the drivers to add a WC compatible mapping to the table
      for all VRAM on those GPUs. This means we can then create userspace
      mapping that won't get degraded to UC.
      
      v1.1: use CONFIG_X86_PAT + add some comments in io.h
      
      Cc: Toshi Kani <toshi.kani@hp.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: x86@kernel.org
      Cc: mcgrof@suse.com
      Cc: Dan Williams <dan.j.williams@intel.com>
      Acked-by: NIngo Molnar <mingo@kernel.org>
      Reviewed-by: NThomas Gleixner <tglx@linutronix.de>
      Signed-off-by: NDave Airlie <airlied@redhat.com>
      8ef42276
  3. 25 10月, 2016 3 次提交
    • A
      x86/quirks: Hide maybe-uninitialized warning · d320b9a5
      Arnd Bergmann 提交于
      gcc -Wmaybe-uninitialized detects that quirk_intel_brickland_xeon_ras_cap
      uses uninitialized data when CONFIG_PCI is not set:
      
        arch/x86/kernel/quirks.c: In function ‘quirk_intel_brickland_xeon_ras_cap’:
        arch/x86/kernel/quirks.c:641:13: error: ‘capid0’ is used uninitialized in this function [-Werror=uninitialized]
      
      However, the function is also not called in this configuration, so we
      can avoid the warning by moving the existing #ifdef to cover it as well.
      Signed-off-by: NArnd Bergmann <arnd@arndb.de>
      Cc: Bjorn Helgaas <bhelgaas@google.com>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: linux-pci@vger.kernel.org
      Link: http://lkml.kernel.org/r/20161024153325.2752428-1-arnd@arndb.deSigned-off-by: NIngo Molnar <mingo@kernel.org>
      d320b9a5
    • J
      x86/build: Fix build with older GCC versions · a2209b74
      Jan Beulich 提交于
      Older GCC (observed with 4.1.x) doesn't support -Wno-override-init and
      also doesn't ignore unknown -Wno-* options.
      Signed-off-by: NJan Beulich <jbeulich@suse.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Valdis Kletnieks <valdis.kletnieks@vt.edu>
      Cc: Valdis.Kletnieks@vt.edu
      Fixes: 5e44258d "x86/build: Reduce the W=1 warnings noise when compiling x86 syscall tables"
      Link: http://lkml.kernel.org/r/580E3E1C02000078001191C4@prv-mh.provo.novell.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      a2209b74
    • J
      x86/unwind: Fix empty stack dereference in guess unwinder · 7fbe6ac0
      Josh Poimboeuf 提交于
      Vince Waver reported the following bug:
      
        WARNING: CPU: 0 PID: 21338 at arch/x86/mm/fault.c:435 vmalloc_fault+0x58/0x1f0
        CPU: 0 PID: 21338 Comm: perf_fuzzer Not tainted 4.8.0+ #37
        Hardware name: Hewlett-Packard HP Compaq Pro 6305 SFF/1850, BIOS K06 v02.57 08/16/2013
        Call Trace:
         <NMI>  ? dump_stack+0x46/0x59
         ? __warn+0xd5/0xee
         ? vmalloc_fault+0x58/0x1f0
         ? __do_page_fault+0x6d/0x48e
         ? perf_log_throttle+0xa4/0xf4
         ? trace_page_fault+0x22/0x30
         ? __unwind_start+0x28/0x42
         ? perf_callchain_kernel+0x75/0xac
         ? get_perf_callchain+0x13a/0x1f0
         ? perf_callchain+0x6a/0x6c
         ? perf_prepare_sample+0x71/0x2eb
         ? perf_event_output_forward+0x1a/0x54
         ? __default_send_IPI_shortcut+0x10/0x2d
         ? __perf_event_overflow+0xfb/0x167
         ? x86_pmu_handle_irq+0x113/0x150
         ? native_read_msr+0x6/0x34
         ? perf_event_nmi_handler+0x22/0x39
         ? perf_ibs_nmi_handler+0x4a/0x51
         ? perf_event_nmi_handler+0x22/0x39
         ? nmi_handle+0x4d/0xf0
         ? perf_ibs_handle_irq+0x3d1/0x3d1
         ? default_do_nmi+0x3c/0xd5
         ? do_nmi+0x92/0x102
         ? end_repeat_nmi+0x1a/0x1e
         ? entry_SYSCALL_64_after_swapgs+0x12/0x4a
         ? entry_SYSCALL_64_after_swapgs+0x12/0x4a
         ? entry_SYSCALL_64_after_swapgs+0x12/0x4a
         <EOE> ^A4---[ end trace 632723104d47d31a ]---
        BUG: stack guard page was hit at ffffc90008500000 (stack is ffffc900084fc000..ffffc900084fffff)
        kernel stack overflow (page fault): 0000 [#1] SMP
        ...
      
      The NMI hit in the entry code right after setting up the stack pointer
      from 'cpu_current_top_of_stack', so the kernel stack was empty.  The
      'guess' version of __unwind_start() attempted to dereference the "top of
      stack" pointer, which is not actually *on* the stack.
      
      Add a check in the guess unwinder to deal with an empty stack.  (The
      frame pointer unwinder already has such a check.)
      Reported-by: NVince Weaver <vincent.weaver@maine.edu>
      Signed-off-by: NJosh Poimboeuf <jpoimboe@redhat.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Fixes: 7c7900f8 ("x86/unwind: Add new unwind interface and implementations")
      Link: http://lkml.kernel.org/r/20161024133127.e5evgeebdbohnmpb@trebleSigned-off-by: NIngo Molnar <mingo@kernel.org>
      7fbe6ac0
  4. 24 10月, 2016 2 次提交
  5. 22 10月, 2016 1 次提交
    • V
      x86/boot/smp: Don't try to poke disabled/non-existent APIC · ff856051
      Ville Syrjälä 提交于
      Apparently trying to poke a disabled or non-existent APIC
      leads to a box that doesn't even boot. Let's not do that.
      
      No real clue if this is the right fix, but at least my
      P3 machine boots again.
      Signed-off-by: NVille Syrjälä <ville.syrjala@linux.intel.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: Eric Biederman <ebiederm@xmission.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Juergen Gross <jgross@suse.com>
      Cc: Len Brown <len.brown@intel.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Prarit Bhargava <prarit@redhat.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: dyoung@redhat.com
      Cc: kexec@lists.infradead.org
      Cc: stable@vger.kernel.org
      Fixes: 2a51fe08 ("arch/x86: Handle non enumerated CPU after physical hotplug")
      Link: http://lkml.kernel.org/r/1477102684-5092-1-git-send-email-ville.syrjala@linux.intel.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      ff856051
  6. 20 10月, 2016 5 次提交
    • J
      kvm: x86: memset whole irq_eoi · 8678654e
      Jiri Slaby 提交于
      gcc 7 warns:
      arch/x86/kvm/ioapic.c: In function 'kvm_ioapic_reset':
      arch/x86/kvm/ioapic.c:597:2: warning: 'memset' used with length equal to number of elements without multiplication by element size [-Wmemset-elt-size]
      
      And it is right. Memset whole array using sizeof operator.
      Signed-off-by: NJiri Slaby <jslaby@suse.cz>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Radim Krčmář <rkrcmar@redhat.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: x86@kernel.org
      Cc: kvm@vger.kernel.org
      Cc: linux-kernel@vger.kernel.org
      Cc: stable@vger.kernel.org
      Reviewed-by: NPaolo Bonzini <pbonzini@redhat.com>
      [Added x86 subject tag]
      Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com>
      8678654e
    • B
      kvm/x86: Fix unused variable warning in kvm_timer_init() · 758f588d
      Borislav Petkov 提交于
      When CONFIG_CPU_FREQ is not set, int cpu is unused and gcc rightfully
      warns about it:
      
        arch/x86/kvm/x86.c: In function ‘kvm_timer_init’:
        arch/x86/kvm/x86.c:5697:6: warning: unused variable ‘cpu’ [-Wunused-variable]
          int cpu;
              ^~~
      
      But since it is used only in the CONFIG_CPU_FREQ block, simply move it
      there, thus squashing the warning too.
      Signed-off-by: NBorislav Petkov <bp@suse.de>
      Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com>
      758f588d
    • H
      sched/core, x86: Make struct thread_info arch specific again · c8061485
      Heiko Carstens 提交于
      The following commit:
      
        c65eacbe ("sched/core: Allow putting thread_info into task_struct")
      
      ... made 'struct thread_info' a generic struct with only a
      single ::flags member, if CONFIG_THREAD_INFO_IN_TASK_STRUCT=y is
      selected.
      
      This change however seems to be quite x86 centric, since at least the
      generic preemption code (asm-generic/preempt.h) assumes that struct
      thread_info also has a preempt_count member, which apparently was not
      true for x86.
      
      We could add a bit more #ifdefs to solve this problem too, but it seems
      to be much simpler to make struct thread_info arch specific
      again. This also makes the conversion to THREAD_INFO_IN_TASK_STRUCT a
      bit easier for architectures that have a couple of arch specific stuff
      in their thread_info definition.
      
      The arch specific stuff _could_ be moved to thread_struct. However
      keeping them in thread_info makes it easier: accessing thread_info
      members is simple, since it is at the beginning of the task_struct,
      while the thread_struct is at the end. At least on s390 the offsets
      needed to access members of the thread_struct (with task_struct as
      base) are too large for various asm instructions.  This is not a
      problem when keeping these members within thread_info.
      Signed-off-by: NHeiko Carstens <heiko.carstens@de.ibm.com>
      Signed-off-by: NMark Rutland <mark.rutland@arm.com>
      Acked-by: NThomas Gleixner <tglx@linutronix.de>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: keescook@chromium.org
      Cc: linux-arch@vger.kernel.org
      Link: http://lkml.kernel.org/r/1476901693-8492-2-git-send-email-mark.rutland@arm.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      c8061485
    • D
      x86/signal: Remove bogus user_64bit_mode() check from sigaction_compat_abi() · ed1e7db3
      Dmitry Safonov 提交于
      The recent introduction of SA_X32/IA32 sa_flags added a check for
      user_64bit_mode() into sigaction_compat_abi(). user_64bit_mode() is true
      for native 64-bit processes and x32 processes.
      
      Due to that the function returns w/o setting the SA_X32_ABI flag for X32
      processes. In consequence the kernel attempts to deliver the signal to the
      X32 process in native 64-bit mode causing the process to segfault.
      
      Remove the check, so the actual check for X32 mode which sets the ABI flag
      can be reached. There is no side effect for native 64-bit mode.
      
      [ tglx: Rewrote changelog ]
      
      Fixes: 68463510 ("x86/signal: Add SA_{X32,IA32}_ABI sa_flags")
      Reported-by: NMikulas Patocka <mpatocka@redhat.com>
      Tested-by: NAdam Borowski <kilobyte@angband.pl>
      Signed-off-by: NDmitry Safonov <0x7f454c46@gmail.com>
      Cc: Dmitry Safonov <dsafonov@virtuozzo.com>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: linux-mm@kvack.org
      Cc: Cyrill Gorcunov <gorcunov@openvz.org>
      Cc: Pavel Emelyanov <xemul@virtuozzo.com>
      Link: http://lkml.kernel.org/r/CAJwJo6Z8ZWPqNfT6t-i8GW1MKxQrKDUagQqnZ%2B0%2B697%3DMyVeGg@mail.gmail.comSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
      Signed-off-by: NIngo Molnar <mingo@kernel.org>
      ed1e7db3
    • A
      x86/platform/UV: Fix support for EFI_OLD_MEMMAP after BIOS callback updates · caef78b6
      Alex Thorlton 提交于
      Some time ago, we brought our UV BIOS callback code up to speed with the
      new EFI memory mapping scheme, in commit:
      
          d1be84a2 ("x86/uv: Update uv_bios_call() to use efi_call_virt_pointer()")
      
      By leveraging some changes that I made to a few of the EFI runtime
      callback mechanisms, in commit:
      
          80e75596 ("efi: Convert efi_call_virt() to efi_call_virt_pointer()")
      
      This got everything running smoothly on UV, with the new EFI mapping
      code.  However, this left one, small loose end, in that EFI_OLD_MEMMAP
      (a.k.a. efi=old_map) will no longer work on UV, on kernels that include
      the aforementioned changes.
      
      At the time this was not a major issue (in fact, it still really isn't),
      but there's no reason that EFI_OLD_MEMMAP *shouldn't* work on our
      systems.  This commit adds a check into uv_bios_call(), to see if we have
      the EFI_OLD_MEMMAP bit set in efi.flags.  If it is set, we fall back to
      using our old callback method, which uses efi_call() directly on the __va()
      of our function pointer.
      Signed-off-by: NAlex Thorlton <athorlton@sgi.com>
      Acked-by: NMatt Fleming <matt@codeblueprint.co.uk>
      Cc: <stable@vger.kernel.org> # v4.7 and later
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: Dimitri Sivanich <sivanich@sgi.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
      Cc: Mike Travis <travis@sgi.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Russ Anderson <rja@sgi.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-efi@vger.kernel.org
      Link: http://lkml.kernel.org/r/1476928131-170101-1-git-send-email-athorlton@sgi.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      caef78b6
  7. 19 10月, 2016 6 次提交
  8. 18 10月, 2016 2 次提交
  9. 17 10月, 2016 4 次提交
  10. 16 10月, 2016 4 次提交
    • D
      perf/x86/intel: Remove an inconsistent NULL check · 5c38181c
      Dan Carpenter 提交于
      Smatch complains that we don't check "event->ctx" consistently.  It's
      never NULL so we can just remove the check.
      Signed-off-by: NDan Carpenter <dan.carpenter@oracle.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: David Carrillo-Cisneros <davidcc@google.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Cc: kernel-janitors@vger.kernel.org
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: NIngo Molnar <mingo@kernel.org>
      5c38181c
    • D
      x86/e820: Don't merge consecutive E820_PRAM ranges · 23446cb6
      Dan Williams 提交于
      Commit:
      
        917db484 ("x86/boot: Fix kdump, cleanup aborted E820_PRAM max_pfn manipulation")
      
      ... fixed up the broken manipulations of max_pfn in the presence of
      E820_PRAM ranges.
      
      However, it also broke the sanitize_e820_map() support for not merging
      E820_PRAM ranges.
      
      Re-introduce the enabling to keep resource boundaries between
      consecutive defined ranges. Otherwise, for example, an environment that
      boots with memmap=2G!8G,2G!10G will end up with a single 4G /dev/pmem0
      device instead of a /dev/pmem0 and /dev/pmem1 device 2G in size.
      Reported-by: NDave Chinner <david@fromorbit.com>
      Signed-off-by: NDan Williams <dan.j.williams@intel.com>
      Cc: <stable@vger.kernel.org>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Jeff Moyer <jmoyer@redhat.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Zhang Yi <yizhan@redhat.com>
      Cc: linux-nvdimm@lists.01.org
      Fixes: 917db484 ("x86/boot: Fix kdump, cleanup aborted E820_PRAM max_pfn manipulation")
      Link: http://lkml.kernel.org/r/147629530854.10618.10383744751594021268.stgit@dwillia2-desk3.amr.corp.intel.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      23446cb6
    • D
      kprobes: Unpoison stack in jprobe_return() for KASAN · 9f7d416c
      Dmitry Vyukov 提交于
      I observed false KSAN positives in the sctp code, when
      sctp uses jprobe_return() in jsctp_sf_eat_sack().
      
      The stray 0xf4 in shadow memory are stack redzones:
      
      [     ] ==================================================================
      [     ] BUG: KASAN: stack-out-of-bounds in memcmp+0xe9/0x150 at addr ffff88005e48f480
      [     ] Read of size 1 by task syz-executor/18535
      [     ] page:ffffea00017923c0 count:0 mapcount:0 mapping:          (null) index:0x0
      [     ] flags: 0x1fffc0000000000()
      [     ] page dumped because: kasan: bad access detected
      [     ] CPU: 1 PID: 18535 Comm: syz-executor Not tainted 4.8.0+ #28
      [     ] Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      [     ]  ffff88005e48f2d0 ffffffff82d2b849 ffffffff0bc91e90 fffffbfff10971e8
      [     ]  ffffed000bc91e90 ffffed000bc91e90 0000000000000001 0000000000000000
      [     ]  ffff88005e48f480 ffff88005e48f350 ffffffff817d3169 ffff88005e48f370
      [     ] Call Trace:
      [     ]  [<ffffffff82d2b849>] dump_stack+0x12e/0x185
      [     ]  [<ffffffff817d3169>] kasan_report+0x489/0x4b0
      [     ]  [<ffffffff817d31a9>] __asan_report_load1_noabort+0x19/0x20
      [     ]  [<ffffffff82d49529>] memcmp+0xe9/0x150
      [     ]  [<ffffffff82df7486>] depot_save_stack+0x176/0x5c0
      [     ]  [<ffffffff817d2031>] save_stack+0xb1/0xd0
      [     ]  [<ffffffff817d27f2>] kasan_slab_free+0x72/0xc0
      [     ]  [<ffffffff817d05b8>] kfree+0xc8/0x2a0
      [     ]  [<ffffffff85b03f19>] skb_free_head+0x79/0xb0
      [     ]  [<ffffffff85b0900a>] skb_release_data+0x37a/0x420
      [     ]  [<ffffffff85b090ff>] skb_release_all+0x4f/0x60
      [     ]  [<ffffffff85b11348>] consume_skb+0x138/0x370
      [     ]  [<ffffffff8676ad7b>] sctp_chunk_put+0xcb/0x180
      [     ]  [<ffffffff8676ae88>] sctp_chunk_free+0x58/0x70
      [     ]  [<ffffffff8677fa5f>] sctp_inq_pop+0x68f/0xef0
      [     ]  [<ffffffff8675ee36>] sctp_assoc_bh_rcv+0xd6/0x4b0
      [     ]  [<ffffffff8677f2c1>] sctp_inq_push+0x131/0x190
      [     ]  [<ffffffff867bad69>] sctp_backlog_rcv+0xe9/0xa20
      [ ... ]
      [     ] Memory state around the buggy address:
      [     ]  ffff88005e48f380: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
      [     ]  ffff88005e48f400: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
      [     ] >ffff88005e48f480: f4 f4 00 00 00 00 00 00 00 00 00 00 00 00 00 00
      [     ]                    ^
      [     ]  ffff88005e48f500: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
      [     ]  ffff88005e48f580: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
      [     ] ==================================================================
      
      KASAN stack instrumentation poisons stack redzones on function entry
      and unpoisons them on function exit. If a function exits abnormally
      (e.g. with a longjmp like jprobe_return()), stack redzones are left
      poisoned. Later this leads to random KASAN false reports.
      
      Unpoison stack redzones in the frames we are going to jump over
      before doing actual longjmp in jprobe_return().
      Signed-off-by: NDmitry Vyukov <dvyukov@google.com>
      Acked-by: NMasami Hiramatsu <mhiramat@kernel.org>
      Reviewed-by: NMark Rutland <mark.rutland@arm.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
      Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
      Cc: Alexander Potapenko <glider@google.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com>
      Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Masami Hiramatsu <mhiramat@kernel.org>
      Cc: kasan-dev@googlegroups.com
      Cc: surovegin@google.com
      Cc: rostedt@goodmis.org
      Link: http://lkml.kernel.org/r/1476454043-101898-1-git-send-email-dvyukov@google.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      9f7d416c
    • D
      kprobes: Avoid false KASAN reports during stack copy · 9254139a
      Dmitry Vyukov 提交于
      Kprobes save and restore raw stack chunks with memcpy().
      With KASAN these chunks can contain poisoned stack redzones,
      as the result memcpy() interceptor produces false
      stack out-of-bounds reports.
      
      Use __memcpy() instead of memcpy() for stack copying.
      __memcpy() is not instrumented by KASAN and does not lead
      to the false reports.
      
      Currently there is a spew of KASAN reports during boot
      if CONFIG_KPROBES_SANITY_TEST is enabled:
      
      [   ] Kprobe smoke test: started
      [   ] ==================================================================
      [   ] BUG: KASAN: stack-out-of-bounds in setjmp_pre_handler+0x17c/0x280 at addr ffff88085259fba8
      [   ] Read of size 64 by task swapper/0/1
      [   ] page:ffffea00214967c0 count:0 mapcount:0 mapping:          (null) index:0x0
      [   ] flags: 0x2fffff80000000()
      [   ] page dumped because: kasan: bad access detected
      [...]
      Reported-by: NCAI Qian <caiqian@redhat.com>
      Tested-by: NCAI Qian <caiqian@redhat.com>
      Signed-off-by: NDmitry Vyukov <dvyukov@google.com>
      Acked-by: NMasami Hiramatsu <mhiramat@kernel.org>
      Cc: Alexander Potapenko <glider@google.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: kasan-dev@googlegroups.com
      [ Improved various details. ]
      Signed-off-by: NIngo Molnar <mingo@kernel.org>
      9254139a
  11. 14 10月, 2016 2 次提交
  12. 12 10月, 2016 4 次提交
    • P
      kthread: kthread worker API cleanup · 3989144f
      Petr Mladek 提交于
      A good practice is to prefix the names of functions by the name
      of the subsystem.
      
      The kthread worker API is a mix of classic kthreads and workqueues.  Each
      worker has a dedicated kthread.  It runs a generic function that process
      queued works.  It is implemented as part of the kthread subsystem.
      
      This patch renames the existing kthread worker API to use
      the corresponding name from the workqueues API prefixed by
      kthread_:
      
      __init_kthread_worker()		-> __kthread_init_worker()
      init_kthread_worker()		-> kthread_init_worker()
      init_kthread_work()		-> kthread_init_work()
      insert_kthread_work()		-> kthread_insert_work()
      queue_kthread_work()		-> kthread_queue_work()
      flush_kthread_work()		-> kthread_flush_work()
      flush_kthread_worker()		-> kthread_flush_worker()
      
      Note that the names of DEFINE_KTHREAD_WORK*() macros stay
      as they are. It is common that the "DEFINE_" prefix has
      precedence over the subsystem names.
      
      Note that INIT() macros and init() functions use different
      naming scheme. There is no good solution. There are several
      reasons for this solution:
      
        + "init" in the function names stands for the verb "initialize"
          aka "initialize worker". While "INIT" in the macro names
          stands for the noun "INITIALIZER" aka "worker initializer".
      
        + INIT() macros are used only in DEFINE() macros
      
        + init() functions are used close to the other kthread()
          functions. It looks much better if all the functions
          use the same scheme.
      
        + There will be also kthread_destroy_worker() that will
          be used close to kthread_cancel_work(). It is related
          to the init() function. Again it looks better if all
          functions use the same naming scheme.
      
        + there are several precedents for such init() function
          names, e.g. amd_iommu_init_device(), free_area_init_node(),
          jump_label_init_type(),  regmap_init_mmio_clk(),
      
        + It is not an argument but it was inconsistent even before.
      
      [arnd@arndb.de: fix linux-next merge conflict]
       Link: http://lkml.kernel.org/r/20160908135724.1311726-1-arnd@arndb.de
      Link: http://lkml.kernel.org/r/1470754545-17632-3-git-send-email-pmladek@suse.comSuggested-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NPetr Mladek <pmladek@suse.com>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
      Cc: Josh Triplett <josh@joshtriplett.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Jiri Kosina <jkosina@suse.cz>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Michal Hocko <mhocko@suse.cz>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Signed-off-by: NArnd Bergmann <arnd@arndb.de>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      3989144f
    • T
      kdump, vmcoreinfo: report memory sections virtual addresses · 0549a3c0
      Thomas Garnier 提交于
      KASLR memory randomization can randomize the base of the physical memory
      mapping (PAGE_OFFSET), vmalloc (VMALLOC_START) and vmemmap
      (VMEMMAP_START).  Adding these variables on VMCOREINFO so tools can easily
      identify the base of each memory section.
      
      Link: http://lkml.kernel.org/r/1471531632-23003-1-git-send-email-thgarnie@google.comSigned-off-by: NThomas Garnier <thgarnie@google.com>
      Acked-by: NBaoquan He <bhe@redhat.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: "H . Peter Anvin" <hpa@zytor.com>
      Cc: Eric Biederman <ebiederm@xmission.com>
      Cc: Xunlei Pang <xlpang@redhat.com>
      Cc: HATAYAMA Daisuke <d.hatayama@jp.fujitsu.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Eugene Surovegin <surovegin@google.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      0549a3c0
    • H
      x86/panic: replace smp_send_stop() with kdump friendly version in panic path · 0ee59413
      Hidehiro Kawai 提交于
      Daniel Walker reported problems which happens when
      crash_kexec_post_notifiers kernel option is enabled
      (https://lkml.org/lkml/2015/6/24/44).
      
      In that case, smp_send_stop() is called before entering kdump routines
      which assume other CPUs are still online.  As the result, for x86, kdump
      routines fail to save other CPUs' registers and disable virtualization
      extensions.
      
      To fix this problem, call a new kdump friendly function,
      crash_smp_send_stop(), instead of the smp_send_stop() when
      crash_kexec_post_notifiers is enabled.  crash_smp_send_stop() is a weak
      function, and it just call smp_send_stop().  Architecture codes should
      override it so that kdump can work appropriately.  This patch only
      provides x86-specific version.
      
      For Xen's PV kernel, just keep the current behavior.
      
      NOTES:
      
      - Right solution would be to place crash_smp_send_stop() before
        __crash_kexec() invocation in all cases and remove smp_send_stop(), but
        we can't do that until all architectures implement own
        crash_smp_send_stop()
      
      - crash_smp_send_stop()-like work is still needed by
        machine_crash_shutdown() because crash_kexec() can be called without
        entering panic()
      
      Fixes: f06e5153 (kernel/panic.c: add "crash_kexec_post_notifiers" option)
      Link: http://lkml.kernel.org/r/20160810080948.11028.15344.stgit@sysi4-13.yrl.intra.hitachi.co.jpSigned-off-by: NHidehiro Kawai <hidehiro.kawai.ez@hitachi.com>
      Reported-by: NDaniel Walker <dwalker@fifo99.com>
      Cc: Dave Young <dyoung@redhat.com>
      Cc: Baoquan He <bhe@redhat.com>
      Cc: Vivek Goyal <vgoyal@redhat.com>
      Cc: Eric Biederman <ebiederm@xmission.com>
      Cc: Masami Hiramatsu <mhiramat@kernel.org>
      Cc: Daniel Walker <dwalker@fifo99.com>
      Cc: Xunlei Pang <xpang@redhat.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: David Vrabel <david.vrabel@citrix.com>
      Cc: Toshi Kani <toshi.kani@hpe.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: David Daney <david.daney@cavium.com>
      Cc: Aaro Koskinen <aaro.koskinen@iki.fi>
      Cc: "Steven J. Hill" <steven.hill@cavium.com>
      Cc: Corey Minyard <cminyard@mvista.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      0ee59413
    • J
      x86: use simpler API for random address requests · 9c6f0902
      Jason Cooper 提交于
      Currently, all callers to randomize_range() set the length to 0 and
      calculate end by adding a constant to the start address.  We can simplify
      the API to remove a bunch of needless checks and variables.
      
      Use the new randomize_addr(start, range) call to set the requested
      address.
      
      Link: http://lkml.kernel.org/r/20160803233913.32511-3-jason@lakedaemon.netSigned-off-by: NJason Cooper <jason@lakedaemon.net>
      Acked-by: NKees Cook <keescook@chromium.org>
      Cc: "Theodore Ts'o" <tytso@mit.edu>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: "H . Peter Anvin" <hpa@zytor.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      9c6f0902
  13. 08 10月, 2016 2 次提交
    • D
      x86/pkeys: Make protection keys an "eager" feature · d4b05923
      Dave Hansen 提交于
      Our XSAVE features are divided into two categories: those that
      generate FPU exceptions, and those that do not.  MPX and pkeys do
      not generate FPU exceptions and thus can not be used lazily.  We
      disable them when lazy mode is forced on.
      
      We have a pair of masks to collect these two sets of features, but
      XFEATURE_MASK_PKRU was added to the wrong mask: XFEATURE_MASK_LAZY.
      Fix it by moving the feature to XFEATURE_MASK_EAGER.
      
      Note: this only causes problem if you boot with lazy FPU mode
      (eagerfpu=off) which is *not* the default.  It also only affects
      hardware which is not currently publicly available.  It looks like
      eager mode is going away, but we still need this patch applied
      to any kernel that has protection keys and lazy mode, which is 4.6
      through 4.8 at this point, and 4.9 if the lazy removal isn't sent
      to Linus for 4.9.
      
      Fixes: c8df4009 ("x86/fpu, x86/mm/pkeys: Add PKRU xsave fields and data structures")
      Signed-off-by: NDave Hansen <dave.hansen@intel.com>
      Cc: Dave Hansen <dave@sr71.net>
      Cc: stable@vger.kernel.org
      Link: http://lkml.kernel.org/r/20161007162342.28A49813@viggo.jf.intel.comSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
      d4b05923
    • T
      x86/apic: Prevent pointless warning messages · df610d67
      Thomas Gleixner 提交于
      Markus reported that he sees new warnings:
      
        APIC: NR_CPUS/possible_cpus limit of 4 reached.  Processor 4/0x84 ignored.
        APIC: NR_CPUS/possible_cpus limit of 4 reached.  Processor 5/0x85 ignored.
      
      This comes from the recent persistant cpuid - nodeid changes. The code
      which emits the warning has been called prior to these changes only for
      enabled processors. Now it's called for disabled processors as well to get
      the possible cpu accounting correct. So if the kernel is compiled for the
      number of actual available/enabled CPUs and the BIOS reports disabled CPUs
      as well then the above warnings are printed.
      
      That's a pointless exercise as it only makes sense if there are more CPUs
      enabled than the kernel supports.
      
      Nake the warning conditional on enabled processors so we are back to the
      state before these changes.
      
      Fixes: 8f54969d ("x86/acpi: Introduce persistent storage for cpuid <-> apicid mapping") 
      Reported-and-tested-by: NMarkus Trippelsdorf <markus@trippelsdorf.de>
      Cc: One Thousand Gnomes <gnomes@lxorguk.ukuu.org.uk>
      Cc: Dou Liyang <douly.fnst@cn.fujitsu.com>
      Cc: linux-acpi@vger.kernel.org
      Cc: Gu Zheng <guz.fnst@cn.fujitsu.com>
      Link: http://lkml.kernel.org/r/alpine.DEB.2.20.1610071549330.19804@nanosSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
      df610d67