1. 13 4月, 2016 1 次提交
  2. 29 3月, 2016 3 次提交
  3. 26 3月, 2016 2 次提交
    • A
      mm, kasan: stackdepot implementation. Enable stackdepot for SLAB · cd11016e
      Alexander Potapenko 提交于
      Implement the stack depot and provide CONFIG_STACKDEPOT.  Stack depot
      will allow KASAN store allocation/deallocation stack traces for memory
      chunks.  The stack traces are stored in a hash table and referenced by
      handles which reside in the kasan_alloc_meta and kasan_free_meta
      structures in the allocated memory chunks.
      
      IRQ stack traces are cut below the IRQ entry point to avoid unnecessary
      duplication.
      
      Right now stackdepot support is only enabled in SLAB allocator.  Once
      KASAN features in SLAB are on par with those in SLUB we can switch SLUB
      to stackdepot as well, thus removing the dependency on SLUB stack
      bookkeeping, which wastes a lot of memory.
      
      This patch is based on the "mm: kasan: stack depots" patch originally
      prepared by Dmitry Chernenkov.
      
      Joonsoo has said that he plans to reuse the stackdepot code for the
      mm/page_owner.c debugging facility.
      
      [akpm@linux-foundation.org: s/depot_stack_handle/depot_stack_handle_t]
      [aryabinin@virtuozzo.com: comment style fixes]
      Signed-off-by: NAlexander Potapenko <glider@google.com>
      Signed-off-by: NAndrey Ryabinin <aryabinin@virtuozzo.com>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: Pekka Enberg <penberg@kernel.org>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Cc: Andrey Konovalov <adech.fo@gmail.com>
      Cc: Dmitry Vyukov <dvyukov@google.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Konstantin Serebryany <kcc@google.com>
      Cc: Dmitry Chernenkov <dmitryc@google.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      cd11016e
    • A
      arch, ftrace: for KASAN put hard/soft IRQ entries into separate sections · be7635e7
      Alexander Potapenko 提交于
      KASAN needs to know whether the allocation happens in an IRQ handler.
      This lets us strip everything below the IRQ entry point to reduce the
      number of unique stack traces needed to be stored.
      
      Move the definition of __irq_entry to <linux/interrupt.h> so that the
      users don't need to pull in <linux/ftrace.h>.  Also introduce the
      __softirq_entry macro which is similar to __irq_entry, but puts the
      corresponding functions to the .softirqentry.text section.
      Signed-off-by: NAlexander Potapenko <glider@google.com>
      Acked-by: NSteven Rostedt <rostedt@goodmis.org>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: Pekka Enberg <penberg@kernel.org>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Cc: Andrey Konovalov <adech.fo@gmail.com>
      Cc: Dmitry Vyukov <dvyukov@google.com>
      Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
      Cc: Konstantin Serebryany <kcc@google.com>
      Cc: Dmitry Chernenkov <dmitryc@google.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      be7635e7
  4. 23 3月, 2016 2 次提交
    • D
      kernel: add kcov code coverage · 5c9a8750
      Dmitry Vyukov 提交于
      kcov provides code coverage collection for coverage-guided fuzzing
      (randomized testing).  Coverage-guided fuzzing is a testing technique
      that uses coverage feedback to determine new interesting inputs to a
      system.  A notable user-space example is AFL
      (http://lcamtuf.coredump.cx/afl/).  However, this technique is not
      widely used for kernel testing due to missing compiler and kernel
      support.
      
      kcov does not aim to collect as much coverage as possible.  It aims to
      collect more or less stable coverage that is function of syscall inputs.
      To achieve this goal it does not collect coverage in soft/hard
      interrupts and instrumentation of some inherently non-deterministic or
      non-interesting parts of kernel is disbled (e.g.  scheduler, locking).
      
      Currently there is a single coverage collection mode (tracing), but the
      API anticipates additional collection modes.  Initially I also
      implemented a second mode which exposes coverage in a fixed-size hash
      table of counters (what Quentin used in his original patch).  I've
      dropped the second mode for simplicity.
      
      This patch adds the necessary support on kernel side.  The complimentary
      compiler support was added in gcc revision 231296.
      
      We've used this support to build syzkaller system call fuzzer, which has
      found 90 kernel bugs in just 2 months:
      
        https://github.com/google/syzkaller/wiki/Found-Bugs
      
      We've also found 30+ bugs in our internal systems with syzkaller.
      Another (yet unexplored) direction where kcov coverage would greatly
      help is more traditional "blob mutation".  For example, mounting a
      random blob as a filesystem, or receiving a random blob over wire.
      
      Why not gcov.  Typical fuzzing loop looks as follows: (1) reset
      coverage, (2) execute a bit of code, (3) collect coverage, repeat.  A
      typical coverage can be just a dozen of basic blocks (e.g.  an invalid
      input).  In such context gcov becomes prohibitively expensive as
      reset/collect coverage steps depend on total number of basic
      blocks/edges in program (in case of kernel it is about 2M).  Cost of
      kcov depends only on number of executed basic blocks/edges.  On top of
      that, kernel requires per-thread coverage because there are always
      background threads and unrelated processes that also produce coverage.
      With inlined gcov instrumentation per-thread coverage is not possible.
      
      kcov exposes kernel PCs and control flow to user-space which is
      insecure.  But debugfs should not be mapped as user accessible.
      
      Based on a patch by Quentin Casasnovas.
      
      [akpm@linux-foundation.org: make task_struct.kcov_mode have type `enum kcov_mode']
      [akpm@linux-foundation.org: unbreak allmodconfig]
      [akpm@linux-foundation.org: follow x86 Makefile layout standards]
      Signed-off-by: NDmitry Vyukov <dvyukov@google.com>
      Reviewed-by: NKees Cook <keescook@chromium.org>
      Cc: syzkaller <syzkaller@googlegroups.com>
      Cc: Vegard Nossum <vegard.nossum@oracle.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Tavis Ormandy <taviso@google.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Quentin Casasnovas <quentin.casasnovas@oracle.com>
      Cc: Kostya Serebryany <kcc@google.com>
      Cc: Eric Dumazet <edumazet@google.com>
      Cc: Alexander Potapenko <glider@google.com>
      Cc: Kees Cook <keescook@google.com>
      Cc: Bjorn Helgaas <bhelgaas@google.com>
      Cc: Sasha Levin <sasha.levin@oracle.com>
      Cc: David Drysdale <drysdale@google.com>
      Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
      Cc: Kirill A. Shutemov <kirill@shutemov.name>
      Cc: Jiri Slaby <jslaby@suse.cz>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      5c9a8750
    • A
      x86/compat: remove is_compat_task() · f970165b
      Andy Lutomirski 提交于
      x86's is_compat_task always checked the current syscall type, not the
      task type.  It has no non-arch users any more, so just remove it to
      avoid confusion.
      
      On x86, nothing should really be checking the task ABI.  There are
      legitimate users for the syscall ABI and for the mm ABI.
      Signed-off-by: NAndy Lutomirski <luto@kernel.org>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      f970165b
  5. 22 3月, 2016 1 次提交
  6. 21 3月, 2016 4 次提交
    • H
      x86/cpufeature, perf/x86: Add AMD Accumulated Power Mechanism feature flag · 01fe03ff
      Huang Rui 提交于
      AMD CPU family 15h model 0x60 introduces a mechanism for measuring
      accumulated power. It is used to report the processor power consumption
      and support for it is indicated by CPUID Fn8000_0007_EDX[12].
      Signed-off-by: NHuang Rui <ray.huang@amd.com>
      Signed-off-by: NBorislav Petkov <bp@suse.de>
      Cc: Aaron Lu <aaron.lu@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andreas Herrmann <herrmann.der.user@googlemail.com>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@amd.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: Fengguang Wu <fengguang.wu@intel.com>
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: Guenter Roeck <linux@roeck-us.net>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Hector Marco-Gisbert <hecmargi@upv.es>
      Cc: Jacob Shin <jacob.w.shin@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: John Stultz <john.stultz@linaro.org>
      Cc: Kristen Carlson Accardi <kristen@linux.intel.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Robert Richter <rric@kernel.org>
      Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Cc: Wan Zongshun <Vincent.Wan@amd.com>
      Cc: spg_linux_kernel@amd.com
      Link: http://lkml.kernel.org/r/1452739808-11871-4-git-send-email-ray.huang@amd.com
      [ Resolved conflict and moved the synthetic CPUID slot to 19. ]
      Signed-off-by: NIngo Molnar <mingo@kernel.org>
      01fe03ff
    • H
      perf/x86/amd: Move nodes_per_socket into bsp_init_amd() · 8dfeae0d
      Huang Rui 提交于
      nodes_per_socket is static and it needn't be initialized many
      times during every CPU core init. So move its initialization into
      bsp_init_amd().
      Signed-off-by: NHuang Rui <ray.huang@amd.com>
      Signed-off-by: NBorislav Petkov <bp@suse.de>
      Cc: Aaron Lu <aaron.lu@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andreas Herrmann <herrmann.der.user@googlemail.com>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@amd.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: Fengguang Wu <fengguang.wu@intel.com>
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: Guenter Roeck <linux@roeck-us.net>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Hector Marco-Gisbert <hecmargi@upv.es>
      Cc: Jacob Shin <jacob.w.shin@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: John Stultz <john.stultz@linaro.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Robert Richter <rric@kernel.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Cc: spg_linux_kernel@amd.com
      Link: http://lkml.kernel.org/r/1452739808-11871-2-git-send-email-ray.huang@amd.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      8dfeae0d
    • V
      perf/x86/mbm: Add Intel Memory B/W Monitoring enumeration and init · 33c3cc7a
      Vikas Shivappa 提交于
      The MBM init patch enumerates the Intel MBM (Memory b/w monitoring)
      and initializes the perf events and datastructures for monitoring the
      memory b/w.
      
      Its based on original patch series by Tony Luck and Kanaka Juvva.
      
      Memory bandwidth monitoring (MBM) provides OS/VMM a way to monitor
      bandwidth from one level of cache to another. The current patches
      support L3 external bandwidth monitoring. It supports both 'local
      bandwidth' and 'total bandwidth' monitoring for the socket. Local
      bandwidth measures the amount of data sent through the memory controller
      on the socket and total b/w measures the total system bandwidth.
      
      Extending the cache quality of service monitoring (CQM) we add two
      more events to the perf infrastructure:
      
        intel_cqm_llc/local_bytes - bytes sent through local socket memory controller
        intel_cqm_llc/total_bytes - total L3 external bytes sent
      
      The tasks are associated with a Resouce Monitoring ID (RMID) just like
      in CQM and OS uses a MSR write to indicate the RMID of the task during
      scheduling.
      Signed-off-by: NVikas Shivappa <vikas.shivappa@linux.intel.com>
      Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Reviewed-by: NTony Luck <tony.luck@intel.com>
      Acked-by: NThomas Gleixner <tglx@linutronix.de>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Matt Fleming <matt@codeblueprint.co.uk>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Cc: fenghua.yu@intel.com
      Cc: h.peter.anvin@intel.com
      Cc: ravi.v.shankar@intel.com
      Cc: vikas.shivappa@intel.com
      Link: http://lkml.kernel.org/r/1457652732-4499-4-git-send-email-vikas.shivappa@linux.intel.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      33c3cc7a
    • A
      x86/kallsyms: fix GOLD link failure with new relative kallsyms table format · 142b9e6c
      Ard Biesheuvel 提交于
      Commit 2213e9a6 ("kallsyms: add support for relative offsets in
      kallsyms address table") changed the default kallsyms symbol table
      format to use relative references rather than absolute addresses.
      
      This reduces the size of the kallsyms symbol table by 50% on 64-bit
      architectures, and further reduces the size of the relocation tables
      used by relocatable kernels.  Since the memory footprint of the static
      kernel image is always much smaller than 4 GB, these relative references
      are assumed to be representable in 32 bits, even when the native word
      size is 64 bits.
      
      On 64-bit architectures, this obviously only works if the distance
      between each relative reference and the chosen anchor point is
      representable in 32 bits, and so the table generation code in
      scripts/kallsyms.c scans the table for the lowest value that is covered
      by the kernel text, and selects it as the anchor point.
      
      However, when using the GOLD linker rather than the default BFD linker
      to build the x86_64 kernel, the symbol phys_offset_64, which is the
      result of arithmetic defined in the linker script, is emitted as a 'T'
      rather than an 'A' type symbol, resulting in scripts/kallsyms.c to
      mistake it for a suitable anchor point, even though it is far away from
      the actual kernel image in the virtual address space.  This results in
      out-of-range warnings from scripts/kallsyms.c and a broken build.
      
      So let's align with the BFD linker, and emit the phys_offset_[32|64]
      symbols as absolute symbols explicitly.  Note that the out of range
      issue does not exist on 32-bit x86, but this patch changes both symbols
      for symmetry.
      Reported-by: NMarkus Trippelsdorf <markus@trippelsdorf.de>
      Signed-off-by: NArd Biesheuvel <ard.biesheuvel@linaro.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Guenter Roeck <linux@roeck-us.net>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      142b9e6c
  7. 19 3月, 2016 7 次提交
  8. 18 3月, 2016 4 次提交
    • T
      x86/irq: Cure live lock in fixup_irqs() · 551adc60
      Thomas Gleixner 提交于
      Harry reported, that he's able to trigger a system freeze with cpu hot
      unplug. The freeze turned out to be a live lock caused by recent changes in
      irq_force_complete_move().
      
      When fixup_irqs() and from there irq_force_complete_move() is called on the
      dying cpu, then all other cpus are in stop machine an wait for the dying cpu
      to complete the teardown. If there is a move of an interrupt pending then
      irq_force_complete_move() sends the cleanup IPI to the cpus in the old_domain
      mask and waits for them to clear the mask. That's obviously impossible as
      those cpus are firmly stuck in stop machine with interrupts disabled.
      
      I should have known that, but I completely overlooked it being concentrated on
      the locking issues around the vectors. And the existance of the call to
      __irq_complete_move() in the code, which actually sends the cleanup IPI made
      it reasonable to wait for that cleanup to complete. That call was bogus even
      before the recent changes as it was just a pointless distraction.
      
      We have to look at two cases:
      
      1) The move_in_progress flag of the interrupt is set
      
         This means the ioapic has been updated with the new vector, but it has not
         fired yet. In theory there is a race:
      
         set_ioapic(new_vector) <-- Interrupt is raised before update is effective,
         			      i.e. it's raised on the old vector. 
      
         So if the target cpu cannot handle that interrupt before the old vector is
         cleaned up, we get a spurious interrupt and in the worst case the ioapic
         irq line becomes stale, but my experiments so far have only resulted in
         spurious interrupts.
      
         But in case of cpu hotplug this should be a non issue because if the
         affinity update happens right before all cpus rendevouz in stop machine,
         there is no way that the interrupt can be blocked on the target cpu because
         all cpus loops first with interrupts enabled in stop machine, so the old
         vector is not yet cleaned up when the interrupt fires.
      
         So the only way to run into this issue is if the delivery of the interrupt
         on the apic/system bus would be delayed beyond the point where the target
         cpu disables interrupts in stop machine. I doubt that it can happen, but at
         least there is a theroretical chance. Virtualization might be able to
         expose this, but AFAICT the IOAPIC emulation is not as stupid as the real
         hardware.
      
         I've spent quite some time over the weekend to enforce that situation,
         though I was not able to trigger the delayed case.
      
      2) The move_in_progress flag is not set and the old_domain cpu mask is not
         empty.
      
         That means, that an interrupt was delivered after the change and the
         cleanup IPI has been sent to the cpus in old_domain, but not all CPUs have
         responded to it yet.
      
      In both cases we can assume that the next interrupt will arrive on the new
      vector, so we can cleanup the old vectors on the cpus in the old_domain cpu
      mask.
      
      Fixes: 98229aa3 "x86/irq: Plug vector cleanup race"
      Reported-by: NHarry Junior <harryjr@outlook.fr>
      Tested-by: NTony Luck <tony.luck@intel.com>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Joe Lawrence <joe.lawrence@stratus.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Ben Hutchings <ben@decadent.org.uk>
      Cc: stable@vger.kernel.org
      Link: http://lkml.kernel.org/r/alpine.DEB.2.11.1603140931430.3657@nanosSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
      551adc60
    • T
      x86/tsc: Prevent NULL pointer deref in calibrate_delay_is_known() · f508a5ba
      Thomas Gleixner 提交于
      The topology_core_cpumask is used to find a neighbour cpu in
      calibrate_delay_is_known(). It might not be allocated at the first invocation
      of that function on the boot cpu, when CONFIG_CPUMASK_OFFSTACK is set.
      
      The mask is allocated later in native_smp_prepare_cpus. As a consequence the
      underlying find_next_bit() call dereferences a NULL pointer.
      
      Add a proper check to prevent this.
      
      Fixes: c25323c0 "x86/tsc: Use topology functions"
      Reported-and-tested-by: NRichard W.M. Jones <rjones@redhat.com>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Josh Boyer <jwboyer@fedoraproject.org>
      Link: http://lkml.kernel.org/r/alpine.DEB.2.11.1603180843270.3978@nanosSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
      f508a5ba
    • K
      param: convert some "on"/"off" users to strtobool · 4cc7ecb7
      Kees Cook 提交于
      This changes several users of manual "on"/"off" parsing to use
      strtobool.
      
      Some side-effects:
      - these uses will now parse y/n/1/0 meaningfully too
      - the early_param uses will now bubble up parse errors
      Signed-off-by: NKees Cook <keescook@chromium.org>
      Acked-by: NHeiko Carstens <heiko.carstens@de.ibm.com>
      Acked-by: NMichael Ellerman <mpe@ellerman.id.au>
      Cc: Amitkumar Karwar <akarwar@marvell.com>
      Cc: Andy Shevchenko <andy.shevchenko@gmail.com>
      Cc: Daniel Borkmann <daniel@iogearbox.net>
      Cc: Joe Perches <joe@perches.com>
      Cc: Kalle Valo <kvalo@codeaurora.org>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Nishant Sarmukadam <nishants@marvell.com>
      Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
      Cc: Steve French <sfrench@samba.org>
      Cc: Stephen Rothwell <sfr@canb.auug.org.au>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      4cc7ecb7
    • K
      mm: cleanup *pte_alloc* interfaces · 3ed3a4f0
      Kirill A. Shutemov 提交于
      There are few things about *pte_alloc*() helpers worth cleaning up:
      
       - 'vma' argument is unused, let's drop it;
      
       - most __pte_alloc() callers do speculative check for pmd_none(),
         before taking ptl: let's introduce pte_alloc() macro which does
         the check.
      
         The only direct user of __pte_alloc left is userfaultfd, which has
         different expectation about atomicity wrt pmd.
      
       - pte_alloc_map() and pte_alloc_map_lock() are redefined using
         pte_alloc().
      
      [sudeep.holla@arm.com: fix build for arm64 hugetlbpage]
      [sfr@canb.auug.org.au: fix arch/arm/mm/mmu.c some more]
      Signed-off-by: NKirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Signed-off-by: NSudeep Holla <sudeep.holla@arm.com>
      Acked-by: NKirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Signed-off-by: NStephen Rothwell <sfr@canb.auug.org.au>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      3ed3a4f0
  9. 17 3月, 2016 2 次提交
  10. 16 3月, 2016 1 次提交
  11. 13 3月, 2016 1 次提交
    • F
      x86/cpufeature: Enable new AVX-512 features · d0500494
      Fenghua Yu 提交于
      A few new AVX-512 instruction groups/features are added in cpufeatures.h
      for enuermation: AVX512DQ, AVX512BW, and AVX512VL.
      
      Clear the flags in fpu__xstate_clear_all_cpu_caps().
      
      The specification for latest AVX-512 including the features can be found at:
      
        https://software.intel.com/sites/default/files/managed/07/b7/319433-023.pdf
      
      Note, I didn't enable the flags in KVM. Hopefully the KVM guys can pick up
      the flags and enable them in KVM.
      Signed-off-by: NFenghua Yu <fenghua.yu@intel.com>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: Gleb Natapov <gleb@kernel.org>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Quentin Casasnovas <quentin.casasnovas@oracle.com>
      Cc: Ravi V Shankar <ravi.v.shankar@intel.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: kvm@vger.kernel.org
      Link: http://lkml.kernel.org/r/1457667498-37357-1-git-send-email-fenghua.yu@intel.com
      [ Added more detailed feature descriptions. ]
      Signed-off-by: NIngo Molnar <mingo@kernel.org>
      d0500494
  12. 12 3月, 2016 1 次提交
    • B
      x86/fpu: Fix eager-FPU handling on legacy FPU machines · 6e686709
      Borislav Petkov 提交于
      i486 derived cores like Intel Quark support only the very old,
      legacy x87 FPU (FSAVE/FRSTOR, CPUID bit FXSR is not set), and
      our FPU code wasn't handling the saving and restoring there
      properly in the 'eagerfpu' case.
      
      So after we made eagerfpu the default for all CPU types:
      
        58122bf1 x86/fpu: Default eagerfpu=on on all CPUs
      
      these old FPU designs broke. First, Andy Shevchenko reported a splat:
      
        WARNING: CPU: 0 PID: 823 at arch/x86/include/asm/fpu/internal.h:163 fpu__clear+0x8c/0x160
      
      which was us trying to execute FXRSTOR on those machines even though
      they don't support it.
      
      After taking care of that, Bryan O'Donoghue reported that a simple FPU
      test still failed because we weren't initializing the FPU state properly
      on those machines.
      
      Take care of all that.
      Reported-and-tested-by: NBryan O'Donoghue <pure.logic@nexus-software.ie>
      Reported-by: NAndy Shevchenko <andy.shevchenko@gmail.com>
      Signed-off-by: NBorislav Petkov <bp@suse.de>
      Acked-by: NLinus Torvalds <torvalds@linux-foundation.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Quentin Casasnovas <quentin.casasnovas@oracle.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Yu-cheng <yu-cheng.yu@intel.com>
      Link: http://lkml.kernel.org/r/20160311113206.GD4312@pd.tnicSigned-off-by: NIngo Molnar <mingo@kernel.org>
      6e686709
  13. 11 3月, 2016 1 次提交
    • J
      x86/entry/traps: Show unhandled signal for i386 in do_trap() · 10ee7386
      Jianyu Zhan 提交于
      Commit abd4f750 ("x86: i386-show-unhandled-signals-v3") did turn on
      the showing-unhandled-signal behaviour for i386 for some exception handlers,
      but for no reason do_trap() is left out (my naive guess is because turning it on
      for do_trap() would be too noisy since do_trap() is shared by several exceptions).
      
      And since the same commit make "show_unhandled_signals" a debug tunable(in
      /proc/sys/debug/exception-trace), and x86 by default turning it on.
      
      So it would be strange for i386 users who turing it on manually and expect
      seeing the unhandled signal output in log, but nothing.
      
      This patch turns it on for i386 in do_trap() as well.
      Signed-off-by: NJianyu Zhan <nasa4836@gmail.com>
      Reviewed-by: NJan Beulich <jbeulich@suse.com>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: bp@suse.de
      Cc: dave.hansen@linux.intel.com
      Cc: heukelum@fastmail.fm
      Cc: jbeulich@novell.com
      Cc: jdike@addtoit.com
      Cc: joe@perches.com
      Cc: luto@kernel.org
      Link: http://lkml.kernel.org/r/1457612398-4568-1-git-send-email-nasa4836@gmail.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      10ee7386
  14. 10 3月, 2016 7 次提交
    • A
      x86/entry/32: Change INT80 to be an interrupt gate · a798f091
      Andy Lutomirski 提交于
      We want all of the syscall entries to run with interrupts off so that
      we can efficiently run context tracking before enabling interrupts.
      
      This will regress int $0x80 performance on 32-bit kernels by a
      couple of cycles.  This shouldn't matter much -- int $0x80 is not a
      fast path.
      
      This effectively reverts:
      
        657c1eea ("x86/entry/32: Fix entry_INT80_32() to expect interrupts to be on")
      
      ... and fixes the same issue differently.
      Signed-off-by: NAndy Lutomirski <luto@kernel.org>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/59b4f90c9ebfccd8c937305dbbbca680bc74b905.1457558566.git.luto@kernel.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>
      a798f091
    • Y
      x86/fpu: Revert ("x86/fpu: Disable AVX when eagerfpu is off") · a65050c6
      Yu-cheng Yu 提交于
      Leonid Shatz noticed that the SDM interpretation of the following
      recent commit:
      
        394db20c ("x86/fpu: Disable AVX when eagerfpu is off")
      
      ... is incorrect and that the original behavior of the FPU code was correct.
      
      Because AVX is not stated in CR0 TS bit description, it was mistakenly
      believed to be not supported for lazy context switch. This turns out
      to be false:
      
        Intel Software Developer's Manual Vol. 3A, Sec. 2.5 Control Registers:
      
         'TS Task Switched bit (bit 3 of CR0) -- Allows the saving of the x87 FPU/
          MMX/SSE/SSE2/SSE3/SSSE3/SSE4 context on a task switch to be delayed until
          an x87 FPU/MMX/SSE/SSE2/SSE3/SSSE3/SSE4 instruction is actually executed
          by the new task.'
      
        Intel Software Developer's Manual Vol. 2A, Sec. 2.4 Instruction Exception
        Specification:
      
         'AVX instructions refer to exceptions by classes that include #NM
          "Device Not Available" exception for lazy context switch.'
      
      So revert the commit.
      Reported-by: NLeonid Shatz <leonid.shatz@ravellosystems.com>
      Signed-off-by: NYu-cheng Yu <yu-cheng.yu@intel.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ravi V. Shankar <ravi.v.shankar@intel.com>
      Cc: Sai Praneeth Prakhya <sai.praneeth.prakhya@intel.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/1457569734-3785-1-git-send-email-yu-cheng.yu@intel.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      a65050c6
    • A
      x86/entry/32: Add and check a stack canary for the SYSENTER stack · 2a41aa4f
      Andy Lutomirski 提交于
      The first instruction of the SYSENTER entry runs on its own tiny
      stack.  That stack can be used if a #DB or NMI is delivered before
      the SYSENTER prologue switches to a real stack.
      
      We have code in place to prevent us from overflowing the tiny stack.
      For added paranoia, add a canary to the stack and check it in
      do_debug() -- that way, if something goes wrong with the #DB logic,
      we'll eventually notice.
      Signed-off-by: NAndy Lutomirski <luto@kernel.org>
      Cc: Andrew Cooper <andrew.cooper3@citrix.com>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/6ff9a806f39098b166dc2c41c1db744df5272f29.1457578375.git.luto@kernel.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>
      2a41aa4f
    • A
      x86/entry/32: Simplify and fix up the SYSENTER stack #DB/NMI fixup · 7536656f
      Andy Lutomirski 提交于
      Right after SYSENTER, we can get a #DB or NMI.  On x86_32, there's no IST,
      so the exception handler is invoked on the temporary SYSENTER stack.
      
      Because the SYSENTER stack is very small, we have a fixup to switch
      off the stack quickly when this happens.  The old fixup had several issues:
      
       1. It checked the interrupt frame's CS and EIP.  This wasn't
          obviously correct on Xen or if vm86 mode was in use [1].
      
       2. In the NMI handler, it did some frightening digging into the
          stack frame.  I'm not convinced this digging was correct.
      
       3. The fixup didn't switch stacks and then switch back.  Instead, it
          synthesized a brand new stack frame that would redirect the IRET
          back to the SYSENTER code.  That frame was highly questionable.
          For one thing, if NMI nested inside #DB, we would effectively
          abort the #DB prologue, which was probably safe but was
          frightening.  For another, the code used PUSHFL to write the
          FLAGS portion of the frame, which was simply bogus -- by the time
          PUSHFL was called, at least TF, NT, VM, and all of the arithmetic
          flags were clobbered.
      
      Simplify this considerably.  Instead of looking at the saved frame
      to see where we came from, check the hardware ESP register against
      the SYSENTER stack directly.  Malicious user code cannot spoof the
      kernel ESP register, and by moving the check after SAVE_ALL, we can
      use normal PER_CPU accesses to find all the relevant addresses.
      
      With this patch applied, the improved syscall_nt_32 test finally
      passes on 32-bit kernels.
      
      [1] It isn't obviously correct, but it is nonetheless safe from vm86
          shenanigans as far as I can tell.  A user can't point EIP at
          entry_SYSENTER_32 while in vm86 mode because entry_SYSENTER_32,
          like all kernel addresses, is greater than 0xffff and would thus
          violate the CS segment limit.
      Signed-off-by: NAndy Lutomirski <luto@kernel.org>
      Cc: Andrew Cooper <andrew.cooper3@citrix.com>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/b2cdbc037031c07ecf2c40a96069318aec0e7971.1457578375.git.luto@kernel.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>
      7536656f
    • A
      x86/entry: Vastly simplify SYSENTER TF (single-step) handling · f2b37575
      Andy Lutomirski 提交于
      Due to a blatant design error, SYSENTER doesn't clear TF (single-step).
      
      As a result, if a user does SYSENTER with TF set, we will single-step
      through the kernel until something clears TF.  There is absolutely
      nothing we can do to prevent this short of turning off SYSENTER [1].
      
      Simplify the handling considerably with two changes:
      
        1. We already sanitize EFLAGS in SYSENTER to clear NT and AC.  We can
           add TF to that list of flags to sanitize with no overhead whatsoever.
      
        2. Teach do_debug() to ignore single-step traps in the SYSENTER prologue.
      
      That's all we need to do.
      
      Don't get too excited -- our handling is still buggy on 32-bit
      kernels.  There's nothing wrong with the SYSENTER code itself, but
      the #DB prologue has a clever fixup for traps on the very first
      instruction of entry_SYSENTER_32, and the fixup doesn't work quite
      correctly.  The next two patches will fix that.
      
      [1] We could probably prevent it by forcing BTF on at all times and
          making sure we clear TF before any branches in the SYSENTER
          code.  Needless to say, this is a bad idea.
      Signed-off-by: NAndy Lutomirski <luto@kernel.org>
      Cc: Andrew Cooper <andrew.cooper3@citrix.com>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/a30d2ea06fe4b621fe6a9ef911b02c0f38feb6f2.1457578375.git.luto@kernel.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>
      f2b37575
    • A
      x86/entry/traps: Clear DR6 early in do_debug() and improve the comment · 8bb56436
      Andy Lutomirski 提交于
      Leaving any bits set in DR6 on return from a debug exception is
      asking for trouble.  Prevent it by writing zero right away and
      clarify the comment.
      Signed-off-by: NAndy Lutomirski <luto@kernel.org>
      Cc: Andrew Cooper <andrew.cooper3@citrix.com>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/3857676e1be8fb27db4b89bbb1e2052b7f435ff4.1457578375.git.luto@kernel.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>
      8bb56436
    • A
      x86/entry/traps: Clear TIF_BLOCKSTEP on all debug exceptions · 81edd9f6
      Andy Lutomirski 提交于
      The SDM says that debug exceptions clear BTF, and we need to keep
      TIF_BLOCKSTEP in sync with BTF.  Clear it unconditionally and improve
      the comment.
      
      I suspect that the fact that kmemcheck could cause TIF_BLOCKSTEP not
      to be cleared was just an oversight.
      Signed-off-by: NAndy Lutomirski <luto@kernel.org>
      Cc: Andrew Cooper <andrew.cooper3@citrix.com>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/fa86e55d196e6dde5b38839595bde2a292c52fdc.1457578375.git.luto@kernel.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>
      81edd9f6
  15. 09 3月, 2016 2 次提交
    • A
      x86/fpu: Fix 'no387' regression · f363938c
      Andy Lutomirski 提交于
      After fixing FPU option parsing, we now parse the 'no387' boot option
      too early: no387 clears X86_FEATURE_FPU before it's even probed, so
      the boot CPU promptly re-enables it.
      
      I suspect it gets even more confused on SMP.
      
      Fix the probing code to leave X86_FEATURE_FPU off if it's been
      disabled by setup_clear_cpu_cap().
      Signed-off-by: NAndy Lutomirski <luto@kernel.org>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Quentin Casasnovas <quentin.casasnovas@oracle.com>
      Cc: Sai Praneeth Prakhya <sai.praneeth.prakhya@intel.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: yu-cheng yu <yu-cheng.yu@intel.com>
      Fixes: 4f81cbaf ("x86/fpu: Fix early FPU command-line parsing")
      Signed-off-by: NIngo Molnar <mingo@kernel.org>
      f363938c
    • T
      x86/mm, x86/mce: Add memcpy_mcsafe() · 92b0729c
      Tony Luck 提交于
      Make use of the EXTABLE_FAULT exception table entries to write
      a kernel copy routine that doesn't crash the system if it
      encounters a machine check. Prime use case for this is to copy
      from large arrays of non-volatile memory used as storage.
      
      We have to use an unrolled copy loop for now because current
      hardware implementations treat a machine check in "rep mov"
      as fatal. When that is fixed we can simplify.
      
      Return type is a "bool". True means that we copied OK, false means
      that it didn't.
      Signed-off-by: NTony Luck <tony.luck@intel.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Tony Luck <tony.luck@gmail.com>
      Link: http://lkml.kernel.org/r/a44e1055efc2d2a9473307b22c91caa437aa3f8b.1456439214.git.tony.luck@intel.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      92b0729c
  16. 08 3月, 2016 1 次提交
    • A
      x86/asm-offsets: Remove PARAVIRT_enabled · 0dd0036f
      Andy Lutomirski 提交于
      It no longer has any users.
      Signed-off-by: NAndy Lutomirski <luto@kernel.org>
      Reviewed-by: NBorislav Petkov <bp@suse.de>
      Cc: Andrew Cooper <andrew.cooper3@citrix.com>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Luis R. Rodriguez <mcgrof@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: boris.ostrovsky@oracle.com
      Cc: david.vrabel@citrix.com
      Cc: konrad.wilk@oracle.com
      Cc: lguest@lists.ozlabs.org
      Cc: xen-devel@lists.xensource.com
      Signed-off-by: NIngo Molnar <mingo@kernel.org>
      0dd0036f