1. 14 1月, 2017 1 次提交
    • P
      efi/x86: Prune invalid memory map entries and fix boot regression · 0100a3e6
      Peter Jones 提交于
      Some machines, such as the Lenovo ThinkPad W541 with firmware GNET80WW
      (2.28), include memory map entries with phys_addr=0x0 and num_pages=0.
      
      These machines fail to boot after the following commit,
      
        commit 8e80632f ("efi/esrt: Use efi_mem_reserve() and avoid a kmalloc()")
      
      Fix this by removing such bogus entries from the memory map.
      
      Furthermore, currently the log output for this case (with efi=debug)
      looks like:
      
       [    0.000000] efi: mem45: [Reserved           |   |  |  |  |  |  |  |  |  |  |  |  ] range=[0x0000000000000000-0xffffffffffffffff] (0MB)
      
      This is clearly wrong, and also not as informative as it could be.  This
      patch changes it so that if we find obviously invalid memory map
      entries, we print an error and skip those entries.  It also detects the
      display of the address range calculation overflow, so the new output is:
      
       [    0.000000] efi: [Firmware Bug]: Invalid EFI memory map entries:
       [    0.000000] efi: mem45: [Reserved           |   |  |  |  |  |  |  |   |  |  |  |  ] range=[0x0000000000000000-0x0000000000000000] (invalid)
      
      It also detects memory map sizes that would overflow the physical
      address, for example phys_addr=0xfffffffffffff000 and
      num_pages=0x0200000000000001, and prints:
      
       [    0.000000] efi: [Firmware Bug]: Invalid EFI memory map entries:
       [    0.000000] efi: mem45: [Reserved           |   |  |  |  |  |  |  |   |  |  |  |  ] range=[phys_addr=0xfffffffffffff000-0x20ffffffffffffffff] (invalid)
      
      It then removes these entries from the memory map.
      Signed-off-by: NPeter Jones <pjones@redhat.com>
      Signed-off-by: NArd Biesheuvel <ard.biesheuvel@linaro.org>
      [ardb: refactor for clarity with no functional changes, avoid PAGE_SHIFT]
      Signed-off-by: NMatt Fleming <matt@codeblueprint.co.uk>
      [Matt: Include bugzilla info in commit log]
      Cc: <stable@vger.kernel.org> # v4.9+
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: https://bugzilla.kernel.org/show_bug.cgi?id=191121Signed-off-by: NIngo Molnar <mingo@kernel.org>
      0100a3e6
  2. 12 1月, 2017 4 次提交
    • P
      KVM: x86: fix emulation of "MOV SS, null selector" · 33ab9110
      Paolo Bonzini 提交于
      This is CVE-2017-2583.  On Intel this causes a failed vmentry because
      SS's type is neither 3 nor 7 (even though the manual says this check is
      only done for usable SS, and the dmesg splat says that SS is unusable!).
      On AMD it's worse: svm.c is confused and sets CPL to 0 in the vmcb.
      
      The fix fabricates a data segment descriptor when SS is set to a null
      selector, so that CPL and SS.DPL are set correctly in the VMCS/vmcb.
      Furthermore, only allow setting SS to a NULL selector if SS.RPL < 3;
      this in turn ensures CPL < 3 because RPL must be equal to CPL.
      
      Thanks to Andy Lutomirski and Willy Tarreau for help in analyzing
      the bug and deciphering the manuals.
      Reported-by: NXiaohan Zhang <zhangxiaohan1@huawei.com>
      Fixes: 79d5b4c3
      Cc: stable@nongnu.org
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      33ab9110
    • W
      KVM: x86: fix NULL deref in vcpu_scan_ioapic · 546d87e5
      Wanpeng Li 提交于
      Reported by syzkaller:
      
          BUG: unable to handle kernel NULL pointer dereference at 00000000000001b0
          IP: _raw_spin_lock+0xc/0x30
          PGD 3e28eb067
          PUD 3f0ac6067
          PMD 0
          Oops: 0002 [#1] SMP
          CPU: 0 PID: 2431 Comm: test Tainted: G           OE   4.10.0-rc1+ #3
          Call Trace:
           ? kvm_ioapic_scan_entry+0x3e/0x110 [kvm]
           kvm_arch_vcpu_ioctl_run+0x10a8/0x15f0 [kvm]
           ? pick_next_task_fair+0xe1/0x4e0
           ? kvm_arch_vcpu_load+0xea/0x260 [kvm]
           kvm_vcpu_ioctl+0x33a/0x600 [kvm]
           ? hrtimer_try_to_cancel+0x29/0x130
           ? do_nanosleep+0x97/0xf0
           do_vfs_ioctl+0xa1/0x5d0
           ? __hrtimer_init+0x90/0x90
           ? do_nanosleep+0x5b/0xf0
           SyS_ioctl+0x79/0x90
           do_syscall_64+0x6e/0x180
           entry_SYSCALL64_slow_path+0x25/0x25
          RIP: _raw_spin_lock+0xc/0x30 RSP: ffffa43688973cc0
      
      The syzkaller folks reported a NULL pointer dereference due to
      ENABLE_CAP succeeding even without an irqchip.  The Hyper-V
      synthetic interrupt controller is activated, resulting in a
      wrong request to rescan the ioapic and a NULL pointer dereference.
      
          #include <sys/ioctl.h>
          #include <sys/mman.h>
          #include <sys/types.h>
          #include <linux/kvm.h>
          #include <pthread.h>
          #include <stddef.h>
          #include <stdint.h>
          #include <stdlib.h>
          #include <string.h>
          #include <unistd.h>
      
          #ifndef KVM_CAP_HYPERV_SYNIC
          #define KVM_CAP_HYPERV_SYNIC 123
          #endif
      
          void* thr(void* arg)
          {
      	struct kvm_enable_cap cap;
      	cap.flags = 0;
      	cap.cap = KVM_CAP_HYPERV_SYNIC;
      	ioctl((long)arg, KVM_ENABLE_CAP, &cap);
      	return 0;
          }
      
          int main()
          {
      	void *host_mem = mmap(0, 0x1000, PROT_READ|PROT_WRITE,
      			MAP_PRIVATE|MAP_ANONYMOUS, -1, 0);
      	int kvmfd = open("/dev/kvm", 0);
      	int vmfd = ioctl(kvmfd, KVM_CREATE_VM, 0);
      	struct kvm_userspace_memory_region memreg;
      	memreg.slot = 0;
      	memreg.flags = 0;
      	memreg.guest_phys_addr = 0;
      	memreg.memory_size = 0x1000;
      	memreg.userspace_addr = (unsigned long)host_mem;
      	host_mem[0] = 0xf4;
      	ioctl(vmfd, KVM_SET_USER_MEMORY_REGION, &memreg);
      	int cpufd = ioctl(vmfd, KVM_CREATE_VCPU, 0);
      	struct kvm_sregs sregs;
      	ioctl(cpufd, KVM_GET_SREGS, &sregs);
      	sregs.cr0 = 0;
      	sregs.cr4 = 0;
      	sregs.efer = 0;
      	sregs.cs.selector = 0;
      	sregs.cs.base = 0;
      	ioctl(cpufd, KVM_SET_SREGS, &sregs);
      	struct kvm_regs regs = { .rflags = 2 };
      	ioctl(cpufd, KVM_SET_REGS, &regs);
      	ioctl(vmfd, KVM_CREATE_IRQCHIP, 0);
      	pthread_t th;
      	pthread_create(&th, 0, thr, (void*)(long)cpufd);
      	usleep(rand() % 10000);
      	ioctl(cpufd, KVM_RUN, 0);
      	pthread_join(th, 0);
      	return 0;
          }
      
      This patch fixes it by failing ENABLE_CAP if without an irqchip.
      Reported-by: NDmitry Vyukov <dvyukov@google.com>
      Fixes: 5c919412 (kvm/x86: Hyper-V synthetic interrupt controller)
      Cc: stable@vger.kernel.org # 4.5+
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Radim Krčmář <rkrcmar@redhat.com>
      Cc: Dmitry Vyukov <dvyukov@google.com>
      Signed-off-by: NWanpeng Li <wanpeng.li@hotmail.com>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      546d87e5
    • S
      KVM: x86: Introduce segmented_write_std · 129a72a0
      Steve Rutherford 提交于
      Introduces segemented_write_std.
      
      Switches from emulated reads/writes to standard read/writes in fxsave,
      fxrstor, sgdt, and sidt.  This fixes CVE-2017-2584, a longstanding
      kernel memory leak.
      
      Since commit 283c95d0 ("KVM: x86: emulate FXSAVE and FXRSTOR",
      2016-11-09), which is luckily not yet in any final release, this would
      also be an exploitable kernel memory *write*!
      Reported-by: NDmitry Vyukov <dvyukov@google.com>
      Cc: stable@vger.kernel.org
      Fixes: 96051572
      Fixes: 283c95d0Suggested-by: NPaolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: NSteve Rutherford <srutherford@google.com>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      129a72a0
    • D
      KVM: x86: flush pending lapic jump label updates on module unload · cef84c30
      David Matlack 提交于
      KVM's lapic emulation uses static_key_deferred (apic_{hw,sw}_disabled).
      These are implemented with delayed_work structs which can still be
      pending when the KVM module is unloaded. We've seen this cause kernel
      panics when the kvm_intel module is quickly reloaded.
      
      Use the new static_key_deferred_flush() API to flush pending updates on
      module unload.
      Signed-off-by: NDavid Matlack <dmatlack@google.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      cef84c30
  3. 09 1月, 2017 1 次提交
  4. 07 1月, 2017 1 次提交
    • N
      x86/efi: Don't allocate memmap through memblock after mm_init() · 20b1e22d
      Nicolai Stange 提交于
      With the following commit:
      
        4bc9f92e ("x86/efi-bgrt: Use efi_mem_reserve() to avoid copying image data")
      
      ...  efi_bgrt_init() calls into the memblock allocator through
      efi_mem_reserve() => efi_arch_mem_reserve() *after* mm_init() has been called.
      
      Indeed, KASAN reports a bad read access later on in efi_free_boot_services():
      
        BUG: KASAN: use-after-free in efi_free_boot_services+0xae/0x24c
                  at addr ffff88022de12740
        Read of size 4 by task swapper/0/0
        page:ffffea0008b78480 count:0 mapcount:-127
        mapping:          (null) index:0x1 flags: 0x5fff8000000000()
        [...]
        Call Trace:
         dump_stack+0x68/0x9f
         kasan_report_error+0x4c8/0x500
         kasan_report+0x58/0x60
         __asan_load4+0x61/0x80
         efi_free_boot_services+0xae/0x24c
         start_kernel+0x527/0x562
         x86_64_start_reservations+0x24/0x26
         x86_64_start_kernel+0x157/0x17a
         start_cpu+0x5/0x14
      
      The instruction at the given address is the first read from the memmap's
      memory, i.e. the read of md->type in efi_free_boot_services().
      
      Note that the writes earlier in efi_arch_mem_reserve() don't splat because
      they're done through early_memremap()ed addresses.
      
      So, after memblock is gone, allocations should be done through the "normal"
      page allocator. Introduce a helper, efi_memmap_alloc() for this. Use
      it from efi_arch_mem_reserve(), efi_free_boot_services() and, for the sake
      of consistency, from efi_fake_memmap() as well.
      
      Note that for the latter, the memmap allocations cease to be page aligned.
      This isn't needed though.
      Tested-by: NDan Williams <dan.j.williams@intel.com>
      Signed-off-by: NNicolai Stange <nicstange@gmail.com>
      Reviewed-by: NArd Biesheuvel <ard.biesheuvel@linaro.org>
      Cc: <stable@vger.kernel.org> # v4.9
      Cc: Dave Young <dyoung@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Matt Fleming <matt@codeblueprint.co.uk>
      Cc: Mika Penttilä <mika.penttila@nextfour.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-efi@vger.kernel.org
      Fixes: 4bc9f92e ("x86/efi-bgrt: Use efi_mem_reserve() to avoid copying image data")
      Link: http://lkml.kernel.org/r/20170105125130.2815-1-nicstange@gmail.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      20b1e22d
  5. 05 1月, 2017 1 次提交
  6. 30 12月, 2016 2 次提交
    • H
      crypto: aesni - Fix failure when built-in with modular pcbc · 07825f0a
      Herbert Xu 提交于
      If aesni is built-in but pcbc is built as a module, then aesni
      will fail completely because when it tries to register the pcbc
      variant of aes the pcbc template is not available.
      
      This patch fixes this by modifying the pcbc presence test so that
      if aesni is built-in then pcbc must also be built-in for it to be
      used by aesni.
      
      Fixes: 85671860 ("crypto: aesni - Convert to skcipher")
      Reported-by: NStephan Müller <smueller@chronox.de>
      Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>
      07825f0a
    • L
      mm: optimize PageWaiters bit use for unlock_page() · b91e1302
      Linus Torvalds 提交于
      In commit 62906027 ("mm: add PageWaiters indicating tasks are
      waiting for a page bit") Nick Piggin made our page locking no longer
      unconditionally touch the hashed page waitqueue, which not only helps
      performance in general, but is particularly helpful on NUMA machines
      where the hashed wait queues can bounce around a lot.
      
      However, the "clear lock bit atomically and then test the waiters bit"
      sequence turns out to be much more expensive than it needs to be,
      because you get a nasty stall when trying to access the same word that
      just got updated atomically.
      
      On architectures where locking is done with LL/SC, this would be trivial
      to fix with a new primitive that clears one bit and tests another
      atomically, but that ends up not working on x86, where the only atomic
      operations that return the result end up being cmpxchg and xadd.  The
      atomic bit operations return the old value of the same bit we changed,
      not the value of an unrelated bit.
      
      On x86, we could put the lock bit in the high bit of the byte, and use
      "xadd" with that bit (where the overflow ends up not touching other
      bits), and look at the other bits of the result.  However, an even
      simpler model is to just use a regular atomic "and" to clear the lock
      bit, and then the sign bit in eflags will indicate the resulting state
      of the unrelated bit #7.
      
      So by moving the PageWaiters bit up to bit #7, we can atomically clear
      the lock bit and test the waiters bit on x86 too.  And architectures
      with LL/SC (which is all the usual RISC suspects), the particular bit
      doesn't matter, so they are fine with this approach too.
      
      This avoids the extra access to the same atomic word, and thus avoids
      the costly stall at page unlock time.
      
      The only downside is that the interface ends up being a bit odd and
      specialized: clear a bit in a byte, and test the sign bit.  Nick doesn't
      love the resulting name of the new primitive, but I'd rather make the
      name be descriptive and very clear about the limitation imposed by
      trying to work across all relevant architectures than make it be some
      generic thing that doesn't make the odd semantics explicit.
      
      So this introduces the new architecture primitive
      
          clear_bit_unlock_is_negative_byte();
      
      and adds the trivial implementation for x86.  We have a generic
      non-optimized fallback (that just does a "clear_bit()"+"test_bit(7)"
      combination) which can be overridden by any architecture that can do
      better.  According to Nick, Power has the same hickup x86 has, for
      example, but some other architectures may not even care.
      
      All these optimizations mean that my page locking stress-test (which is
      just executing a lot of small short-lived shell scripts: "make test" in
      the git source tree) no longer makes our page locking look horribly bad.
      Before all these optimizations, just the unlock_page() costs were just
      over 3% of all CPU overhead on "make test".  After this, it's down to
      0.66%, so just a quarter of the cost it used to be.
      
      (The difference on NUMA is bigger, but there this micro-optimization is
      likely less noticeable, since the big issue on NUMA was not the accesses
      to 'struct page', but the waitqueue accesses that were already removed
      by Nick's earlier commit).
      Acked-by: NNick Piggin <npiggin@gmail.com>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Bob Peterson <rpeterso@redhat.com>
      Cc: Steven Whitehouse <swhiteho@redhat.com>
      Cc: Andrew Lutomirski <luto@kernel.org>
      Cc: Andreas Gruenbacher <agruenba@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Mel Gorman <mgorman@techsingularity.net>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      b91e1302
  7. 27 12月, 2016 1 次提交
  8. 26 12月, 2016 1 次提交
    • T
      ktime: Cleanup ktime_set() usage · 8b0e1953
      Thomas Gleixner 提交于
      ktime_set(S,N) was required for the timespec storage type and is still
      useful for situations where a Seconds and Nanoseconds part of a time value
      needs to be converted. For anything where the Seconds argument is 0, this
      is pointless and can be replaced with a simple assignment.
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      8b0e1953
  9. 25 12月, 2016 5 次提交
  10. 24 12月, 2016 2 次提交
  11. 23 12月, 2016 3 次提交
    • P
      perf/x86: Fix overlap counter scheduling bug · 1134c2b5
      Peter Zijlstra 提交于
      Jiri reported the overlap scheduling exceeding its max stack.
      
      Looking at the constraint that triggered this, it turns out the
      overlap marker isn't needed.
      
      The comment with EVENT_CONSTRAINT_OVERLAP states: "This is the case if
      the counter mask of such an event is not a subset of any other counter
      mask of a constraint with an equal or higher weight".
      
      Esp. that latter part is of interest here I think, our overlapping mask
      is 0x0e, that has 3 bits set and is the highest weight mask in on the
      PMU, therefore it will be placed last. Can we still create a scenario
      where we would need to rewind that?
      
      The scenario for AMD Fam15h is we're having masks like:
      
      	0x3F -- 111111
      	0x38 -- 111000
      	0x07 -- 000111
      
      	0x09 -- 001001
      
      And we mark 0x09 as overlapping, because it is not a direct subset of
      0x38 or 0x07 and has less weight than either of those. This means we'll
      first try and place the 0x09 event, then try and place 0x38/0x07 events.
      Now imagine we have:
      
      	3 * 0x07 + 0x09
      
      and the initial pick for the 0x09 event is counter 0, then we'll fail to
      place all 0x07 events. So we'll pop back, try counter 4 for the 0x09
      event, and then re-try all 0x07 events, which will now work.
      
      The masks on the PMU in question are:
      
        0x01 - 0001
        0x03 - 0011
        0x0e - 1110
        0x0c - 1100
      
      But since all the masks that have overlap (0xe -> {0xc,0x3}) and (0x3 ->
      0x1) are of heavier weight, it should all work out.
      Reported-by: NJiri Olsa <jolsa@kernel.org>
      Tested-by: NJiri Olsa <jolsa@kernel.org>
      Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Liang Kan <kan.liang@intel.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Robert Richter <rric@kernel.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vince@deater.net>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Link: http://lkml.kernel.org/r/20161109155153.GQ3142@twins.programming.kicks-ass.netSigned-off-by: NIngo Molnar <mingo@kernel.org>
      1134c2b5
    • S
      perf/x86/pebs: Fix handling of PEBS buffer overflows · daa864b8
      Stephane Eranian 提交于
      This patch solves a race condition between PEBS and the PMU handler.
      
      In case multiple PEBS events are sampled at the same time,
      it is possible to have GLOBAL_STATUS bit 62 set indicating
      PEBS buffer overflow and also seeing at most 3 PEBS counters
      having their bits set in the status register. This is a sign
      that there was at least one PEBS record pending at the time
      of the PMU interrupt. PEBS counters must only be processed
      via the drain_pebs() calls, and not via the regular sample
      processing loop coming after that the function, otherwise
      phony regular samples may be generated in the sampling buffer
      not marked with the EXACT tag.
      
      Another possibility is to have one PEBS event and at least
      one non-PEBS event whic hoverflows while PEBS has armed. In this
      case, bit 62 of GLOBAL_STATUS will not be set, yet the overflow
      status bit for the PEBS counter will be on Skylake.
      
      To avoid this problem, we systematically ignore the PEBS-enabled
      counters from the GLOBAL_STATUS mask and we always process PEBS
      events via drain_pebs().
      
      The problem manifested itself by having non-exact samples when
      sampling only PEBS events, i.e., the PERF_SAMPLE_RECORD would
      not have the EXACT flag set.
      
      Note that this problem is only present on Skylake processor.
      This fix is harmless on older processors.
      Reported-by: NPeter Zijlstra <peterz@infradead.org>
      Signed-off-by: NStephane Eranian <eranian@google.com>
      Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Link: http://lkml.kernel.org/r/1482395366-8992-1-git-send-email-eranian@google.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      daa864b8
    • P
      x86/paravirt: Mark unused patch_default label · cef4402d
      Peter Zijlstra 提交于
      A bugfix commit:
      
        45dbea5f ("x86/paravirt: Fix native_patch()")
      
      ... introduced a harmless warning:
      
        arch/x86/kernel/paravirt_patch_32.c: In function 'native_patch':
        arch/x86/kernel/paravirt_patch_32.c:71:1: error: label 'patch_default' defined but not used [-Werror=unused-label]
      
      Fix it by annotating the label as __maybe_unused.
      Reported-by: NArnd Bergmann <arnd@arndb.de>
      Reported-by: NPiotr Gregor <piotrgregor@rsyncme.org>
      Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Fixes: 45dbea5f ("x86/paravirt: Fix native_patch()")
      Signed-off-by: NIngo Molnar <mingo@kernel.org>
      cef4402d
  12. 22 12月, 2016 2 次提交
  13. 21 12月, 2016 1 次提交
  14. 20 12月, 2016 4 次提交
    • N
      x86/platform/intel/quark: Add printf attribute to imr_self_test_result() · 9120cf4f
      Nicolas Iooss 提交于
      __printf() attributes help detecting issues in printf() format strings at
      compile time.
      
      Even though imr_selftest.c is only compiled with
      CONFIG_DEBUG_IMR_SELFTEST=y, GCC complains about a missing format
      attribute when compiling allmodconfig with -Wmissing-format-attribute.
      
      Silence this warning by adding the attribute.
      Signed-off-by: NNicolas Iooss <nicolas.iooss_linux@m4x.org>
      Acked-by: NBryan O'Donoghue <pure.logic@nexus-software.ie>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/20161219132144.4108-1-nicolas.iooss_linux@m4x.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>
      9120cf4f
    • L
      x86/platform/intel-mid: Switch MPU3050 driver to IIO · 634b847b
      Linus Walleij 提交于
      The Intel Mid goes in and creates a I2C device for the
      MPU3050 if the input driver for MPU-3050 is activated.
      
      As of commit:
      
        3904b28e ("iio: gyro: Add driver for the MPU-3050 gyroscope")
      
      .. there is a proper and fully featured IIO driver for this
      device, so deprecate the use of the incomplete input driver
      by augmenting the device population code to react to the
      presence of the IIO driver's Kconfig symbol instead.
      Signed-off-by: NLinus Walleij <linus.walleij@linaro.org>
      Acked-by: NAndy Shevchenko <andriy.shevchenko@linux.intel.com>
      Cc: Dmitry Torokhov <dmitry.torokhov@gmail.com>
      Cc: Jonathan Cameron <jic23@kernel.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/1481722794-4348-1-git-send-email-linus.walleij@linaro.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>
      634b847b
    • B
      x86/alternatives: Do not use sync_core() to serialize I$ · 34bfab0e
      Borislav Petkov 提交于
      We use sync_core() in the alternatives code to stop speculative
      execution of prefetched instructions because we are potentially changing
      them and don't want to execute stale bytes.
      
      What it does on most machines is call CPUID which is a serializing
      instruction. And that's expensive.
      
      However, the instruction cache is serialized when we're on the local CPU
      and are changing the data through the same virtual address. So then, we
      don't need the serializing CPUID but a simple control flow change. Last
      being accomplished with a CALL/RET which the noinline causes.
      Suggested-by: NLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: NBorislav Petkov <bp@suse.de>
      Reviewed-by: NAndy Lutomirski <luto@kernel.org>
      Cc: Andrew Cooper <andrew.cooper3@citrix.com>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Henrique de Moraes Holschuh <hmh@hmh.eng.br>
      Cc: Matthew Whitehead <tedheadster@gmail.com>
      Cc: One Thousand Gnomes <gnomes@lxorguk.ukuu.org.uk>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/20161203150258.vwr5zzco7ctgc4pe@pd.tnicSigned-off-by: NIngo Molnar <mingo@kernel.org>
      34bfab0e
    • V
      x86/hyperv: Handle unknown NMIs on one CPU when unknown_nmi_panic · 59107e2f
      Vitaly Kuznetsov 提交于
      There is a feature in Hyper-V ('Debug-VM --InjectNonMaskableInterrupt')
      which injects NMI to the guest. We may want to crash the guest and do kdump
      on this NMI by enabling unknown_nmi_panic. To make kdump succeed we need to
      allow the kdump kernel to re-establish VMBus connection so it will see
      VMBus devices (storage, network,..).
      
      To properly unload VMBus making it possible to start over during kdump we
      need to do the following:
      
       - Send an 'unload' message to the hypervisor. This can be done on any CPU
         so we do this the crashing CPU.
      
       - Receive the 'unload finished' reply message. WS2012R2 delivers this
         message to the CPU which was used to establish VMBus connection during
         module load and this CPU may differ from the CPU sending 'unload'.
      
      Receiving a VMBus message means the following:
      
       - There is a per-CPU slot in memory for one message. This slot can in
         theory be accessed by any CPU.
      
       - We get an interrupt on the CPU when a message was placed into the slot.
      
       - When we read the message we need to clear the slot and signal the fact
         to the hypervisor. In case there are more messages to this CPU pending
         the hypervisor will deliver the next message. The signaling is done by
         writing to an MSR so this can only be done on the appropriate CPU.
      
      To avoid doing cross-CPU work on crash we have vmbus_wait_for_unload()
      function which checks message slots for all CPUs in a loop waiting for the
      'unload finished' messages. However, there is an issue which arises when
      these conditions are met:
      
       - We're crashing on a CPU which is different from the one which was used
         to initially contact the hypervisor.
      
       - The CPU which was used for the initial contact is blocked with interrupts
         disabled and there is a message pending in the message slot.
      
      In this case we won't be able to read the 'unload finished' message on the
      crashing CPU. This is reproducible when we receive unknown NMIs on all CPUs
      simultaneously: the first CPU entering panic() will proceed to crash and
      all other CPUs will stop themselves with interrupts disabled.
      
      The suggested solution is to handle unknown NMIs for Hyper-V guests on the
      first CPU which gets them only. This will allow us to rely on VMBus
      interrupt handler being able to receive the 'unload finish' message in
      case it is delivered to a different CPU.
      
      The issue is not reproducible on WS2016 as Debug-VM delivers NMI to the
      boot CPU only, WS2012R2 and earlier Hyper-V versions are affected.
      Signed-off-by: NVitaly Kuznetsov <vkuznets@redhat.com>
      Acked-by: NK. Y. Srinivasan <kys@microsoft.com>
      Cc: devel@linuxdriverproject.org
      Cc: Haiyang Zhang <haiyangz@microsoft.com>
      Link: http://lkml.kernel.org/r/20161202100720.28121-1-vkuznets@redhat.comSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
      Signed-off-by: NIngo Molnar <mingo@kernel.org>
      59107e2f
  15. 19 12月, 2016 11 次提交