1. 22 2月, 2018 4 次提交
  2. 15 2月, 2018 3 次提交
  3. 13 2月, 2018 4 次提交
    • T
      x86/mm, mm/hwpoison: Don't unconditionally unmap kernel 1:1 pages · fd0e786d
      Tony Luck 提交于
      In the following commit:
      
        ce0fa3e5 ("x86/mm, mm/hwpoison: Clear PRESENT bit for kernel 1:1 mappings of poison pages")
      
      ... we added code to memory_failure() to unmap the page from the
      kernel 1:1 virtual address space to avoid speculative access to the
      page logging additional errors.
      
      But memory_failure() may not always succeed in taking the page offline,
      especially if the page belongs to the kernel.  This can happen if
      there are too many corrected errors on a page and either mcelog(8)
      or drivers/ras/cec.c asks to take a page offline.
      
      Since we remove the 1:1 mapping early in memory_failure(), we can
      end up with the page unmapped, but still in use. On the next access
      the kernel crashes :-(
      
      There are also various debug paths that call memory_failure() to simulate
      occurrence of an error. Since there is no actual error in memory, we
      don't need to map out the page for those cases.
      
      Revert most of the previous attempt and keep the solution local to
      arch/x86/kernel/cpu/mcheck/mce.c. Unmap the page only when:
      
      	1) there is a real error
      	2) memory_failure() succeeds.
      
      All of this only applies to 64-bit systems. 32-bit kernel doesn't map
      all of memory into kernel space. It isn't worth adding the code to unmap
      the piece that is mapped because nobody would run a 32-bit kernel on a
      machine that has recoverable machine checks.
      Signed-off-by: NTony Luck <tony.luck@intel.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Dave <dave.hansen@intel.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Robert (Persistent Memory) <elliott@hpe.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-mm@kvack.org
      Cc: stable@vger.kernel.org #v4.14
      Fixes: ce0fa3e5 ("x86/mm, mm/hwpoison: Clear PRESENT bit for kernel 1:1 mappings of poison pages")
      Signed-off-by: NIngo Molnar <mingo@kernel.org>
      fd0e786d
    • I
      x86/speculation: Clean up various Spectre related details · 21e433bd
      Ingo Molnar 提交于
      Harmonize all the Spectre messages so that a:
      
          dmesg | grep -i spectre
      
      ... gives us most Spectre related kernel boot messages.
      
      Also fix a few other details:
      
       - clarify a comment about firmware speculation control
      
       - s/KPTI/PTI
      
       - remove various line-breaks that made the code uglier
      Acked-by: NDavid Woodhouse <dwmw@amazon.co.uk>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Arjan van de Ven <arjan@linux.intel.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: David Woodhouse <dwmw2@infradead.org>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: NIngo Molnar <mingo@kernel.org>
      21e433bd
    • D
      Revert "x86/speculation: Simplify indirect_branch_prediction_barrier()" · f208820a
      David Woodhouse 提交于
      This reverts commit 64e16720.
      
      We cannot call C functions like that, without marking all the
      call-clobbered registers as, well, clobbered. We might have got away
      with it for now because the __ibp_barrier() function was *fairly*
      unlikely to actually use any other registers. But no. Just no.
      Signed-off-by: NDavid Woodhouse <dwmw@amazon.co.uk>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Arjan van de Ven <arjan@linux.intel.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: David Woodhouse <dwmw2@infradead.org>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: arjan.van.de.ven@intel.com
      Cc: dave.hansen@intel.com
      Cc: jmattson@google.com
      Cc: karahmed@amazon.de
      Cc: kvm@vger.kernel.org
      Cc: pbonzini@redhat.com
      Cc: rkrcmar@redhat.com
      Cc: sironi@amazon.de
      Link: http://lkml.kernel.org/r/1518305967-31356-3-git-send-email-dwmw@amazon.co.ukSigned-off-by: NIngo Molnar <mingo@kernel.org>
      f208820a
    • D
      x86/speculation: Correct Speculation Control microcode blacklist again · d37fc6d3
      David Woodhouse 提交于
      Arjan points out that the Intel document only clears the 0xc2 microcode
      on *some* parts with CPUID 506E3 (INTEL_FAM6_SKYLAKE_DESKTOP stepping 3).
      For the Skylake H/S platform it's OK but for Skylake E3 which has the
      same CPUID it isn't (yet) cleared.
      
      So removing it from the blacklist was premature. Put it back for now.
      
      Also, Arjan assures me that the 0x84 microcode for Kaby Lake which was
      featured in one of the early revisions of the Intel document was never
      released to the public, and won't be until/unless it is also validated
      as safe. So those can change to 0x80 which is what all *other* versions
      of the doc have identified.
      
      Once the retrospective testing of existing public microcodes is done, we
      should be back into a mode where new microcodes are only released in
      batches and we shouldn't even need to update the blacklist for those
      anyway, so this tweaking of the list isn't expected to be a thing which
      keeps happening.
      Requested-by: NArjan van de Ven <arjan.van.de.ven@intel.com>
      Signed-off-by: NDavid Woodhouse <dwmw@amazon.co.uk>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Arjan van de Ven <arjan@linux.intel.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: David Woodhouse <dwmw2@infradead.org>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: arjan.van.de.ven@intel.com
      Cc: dave.hansen@intel.com
      Cc: kvm@vger.kernel.org
      Cc: pbonzini@redhat.com
      Link: http://lkml.kernel.org/r/1518449255-2182-1-git-send-email-dwmw@amazon.co.ukSigned-off-by: NIngo Molnar <mingo@kernel.org>
      d37fc6d3
  4. 12 2月, 2018 1 次提交
    • L
      vfs: do bulk POLL* -> EPOLL* replacement · a9a08845
      Linus Torvalds 提交于
      This is the mindless scripted replacement of kernel use of POLL*
      variables as described by Al, done by this script:
      
          for V in IN OUT PRI ERR RDNORM RDBAND WRNORM WRBAND HUP RDHUP NVAL MSG; do
              L=`git grep -l -w POLL$V | grep -v '^t' | grep -v /um/ | grep -v '^sa' | grep -v '/poll.h$'|grep -v '^D'`
              for f in $L; do sed -i "-es/^\([^\"]*\)\(\<POLL$V\>\)/\\1E\\2/" $f; done
          done
      
      with de-mangling cleanups yet to come.
      
      NOTE! On almost all architectures, the EPOLL* constants have the same
      values as the POLL* constants do.  But they keyword here is "almost".
      For various bad reasons they aren't the same, and epoll() doesn't
      actually work quite correctly in some cases due to this on Sparc et al.
      
      The next patch from Al will sort out the final differences, and we
      should be all done.
      Scripted-by: NAl Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      a9a08845
  5. 11 2月, 2018 2 次提交
    • B
      x86/MCE: Fix build warning introduced by "x86: do not use print_symbol()" · c80c5ec1
      Borislav Petkov 提交于
      The following commit:
      
        7b606162 ("x86: do not use print_symbol()")
      
      ... introduced a new build warning on 32-bit x86:
      
        arch/x86/kernel/cpu/mcheck/mce.c:237:21: warning: cast to pointer from integer of different size [-Wint-to-pointer-cast]
            pr_cont("{%pS}", (void *)m->ip);
                             ^
      
      Fix the type mismatch between the 'void *' expected by %pS and the mce->ip
      field which is u64 by casting to long.
      Signed-off-by: NBorislav Petkov <bp@suse.de>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: linux-kernel@vger.kernel.org
      Fixes: 7b606162 ("x86: do not use print_symbol()")
      Link: http://lkml.kernel.org/r/20180210145314.22174-1-bp@alien8.de
      [ Cleaned up the changelog. ]
      Signed-off-by: NIngo Molnar <mingo@kernel.org>
      c80c5ec1
    • D
      x86/speculation: Update Speculation Control microcode blacklist · 17513420
      David Woodhouse 提交于
      Intel have retroactively blessed the 0xc2 microcode on Skylake mobile
      and desktop parts, and the Gemini Lake 0x22 microcode is apparently fine
      too. We blacklisted the latter purely because it was present with all
      the other problematic ones in the 2018-01-08 release, but now it's
      explicitly listed as OK.
      
      We still list 0x84 for the various Kaby Lake / Coffee Lake parts, as
      that appeared in one version of the blacklist and then reverted to
      0x80 again. We can change it if 0x84 is actually announced to be safe.
      Signed-off-by: NDavid Woodhouse <dwmw@amazon.co.uk>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Arjan van de Ven <arjan@linux.intel.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: David Woodhouse <dwmw2@infradead.org>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: arjan.van.de.ven@intel.com
      Cc: jmattson@google.com
      Cc: karahmed@amazon.de
      Cc: kvm@vger.kernel.org
      Cc: pbonzini@redhat.com
      Cc: rkrcmar@redhat.com
      Cc: sironi@amazon.de
      Link: http://lkml.kernel.org/r/1518305967-31356-2-git-send-email-dwmw@amazon.co.ukSigned-off-by: NIngo Molnar <mingo@kernel.org>
      17513420
  6. 03 2月, 2018 1 次提交
    • A
      x86/pti: Mark constant arrays as __initconst · 4bf5d56d
      Arnd Bergmann 提交于
      I'm seeing build failures from the two newly introduced arrays that
      are marked 'const' and '__initdata', which are mutually exclusive:
      
      arch/x86/kernel/cpu/common.c:882:43: error: 'cpu_no_speculation' causes a section type conflict with 'e820_table_firmware_init'
      arch/x86/kernel/cpu/common.c:895:43: error: 'cpu_no_meltdown' causes a section type conflict with 'e820_table_firmware_init'
      
      The correct annotation is __initconst.
      
      Fixes: fec9434a ("x86/pti: Do not enable PTI on CPUs which are not vulnerable to Meltdown")
      Signed-off-by: NArnd Bergmann <arnd@arndb.de>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Cc: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Thomas Garnier <thgarnie@google.com>
      Cc: David Woodhouse <dwmw@amazon.co.uk>
      Link: https://lkml.kernel.org/r/20180202213959.611210-1-arnd@arndb.de
      4bf5d56d
  7. 02 2月, 2018 1 次提交
  8. 31 1月, 2018 4 次提交
    • V
      x86/hyperv: Reenlightenment notifications support · 93286261
      Vitaly Kuznetsov 提交于
      Hyper-V supports Live Migration notification. This is supposed to be used
      in conjunction with TSC emulation: when a VM is migrated to a host with
      different TSC frequency for some short period the host emulates the
      accesses to TSC and sends an interrupt to notify about the event. When the
      guest is done updating everything it can disable TSC emulation and
      everything will start working fast again.
      
      These notifications weren't required until now as Hyper-V guests are not
      supposed to use TSC as a clocksource: in Linux the TSC is even marked as
      unstable on boot. Guests normally use 'tsc page' clocksource and host
      updates its values on migrations automatically.
      
      Things change when with nested virtualization: even when the PV
      clocksources (kvm-clock or tsc page) are passed through to the nested
      guests the TSC frequency and frequency changes need to be know..
      
      Hyper-V Top Level Functional Specification (as of v5.0b) wrongly specifies
      EAX:BIT(12) of CPUID:0x40000009 as the feature identification bit. The
      right one to check is EAX:BIT(13) of CPUID:0x40000003. I was assured that
      the fix in on the way.
      Signed-off-by: NVitaly Kuznetsov <vkuznets@redhat.com>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Reviewed-by: NThomas Gleixner <tglx@linutronix.de>
      Cc: Stephen Hemminger <sthemmin@microsoft.com>
      Cc: kvm@vger.kernel.org
      Cc: Radim Krčmář <rkrcmar@redhat.com>
      Cc: Haiyang Zhang <haiyangz@microsoft.com>
      Cc: "Michael Kelley (EOSG)" <Michael.H.Kelley@microsoft.com>
      Cc: Roman Kagan <rkagan@virtuozzo.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: devel@linuxdriverproject.org
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: "K. Y. Srinivasan" <kys@microsoft.com>
      Cc: Cathy Avery <cavery@redhat.com>
      Cc: Mohammed Gamal <mmorsy@redhat.com>
      Link: https://lkml.kernel.org/r/20180124132337.30138-4-vkuznets@redhat.com
      93286261
    • D
      x86/cpuid: Fix up "virtual" IBRS/IBPB/STIBP feature bits on Intel · 7fcae111
      David Woodhouse 提交于
      Despite the fact that all the other code there seems to be doing it, just
      using set_cpu_cap() in early_intel_init() doesn't actually work.
      
      For CPUs with PKU support, setup_pku() calls get_cpu_cap() after
      c->c_init() has set those feature bits. That resets those bits back to what
      was queried from the hardware.
      
      Turning the bits off for bad microcode is easy to fix. That can just use
      setup_clear_cpu_cap() to force them off for all CPUs.
      
      I was less keen on forcing the feature bits *on* that way, just in case
      of inconsistencies. I appreciate that the kernel is going to get this
      utterly wrong if CPU features are not consistent, because it has already
      applied alternatives by the time secondary CPUs are brought up.
      
      But at least if setup_force_cpu_cap() isn't being used, we might have a
      chance of *detecting* the lack of the corresponding bit and either
      panicking or refusing to bring the offending CPU online.
      
      So ensure that the appropriate feature bits are set within get_cpu_cap()
      regardless of how many extra times it's called.
      
      Fixes: 2961298e ("x86/cpufeatures: Clean up Spectre v2 related CPUID flags")
      Signed-off-by: NDavid Woodhouse <dwmw@amazon.co.uk>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Cc: karahmed@amazon.de
      Cc: peterz@infradead.org
      Cc: bp@alien8.de
      Link: https://lkml.kernel.org/r/1517322623-15261-1-git-send-email-dwmw@amazon.co.uk
      7fcae111
    • C
      x86/spectre: Fix spelling mistake: "vunerable"-> "vulnerable" · e698dcdf
      Colin Ian King 提交于
      Trivial fix to spelling mistake in pr_err error message text.
      Signed-off-by: NColin Ian King <colin.king@canonical.com>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: kernel-janitors@vger.kernel.org
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: David Woodhouse <dwmw@amazon.co.uk>
      Link: https://lkml.kernel.org/r/20180130193218.9271-1-colin.king@canonical.com
      e698dcdf
    • D
      x86/spectre: Report get_user mitigation for spectre_v1 · edfbae53
      Dan Williams 提交于
      Reflect the presence of get_user(), __get_user(), and 'syscall' protections
      in sysfs. The expectation is that new and better tooling will allow the
      kernel to grow more usages of array_index_nospec(), for now, only claim
      mitigation for __user pointer de-references.
      Reported-by: NJiri Slaby <jslaby@suse.cz>
      Signed-off-by: NDan Williams <dan.j.williams@intel.com>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Cc: linux-arch@vger.kernel.org
      Cc: kernel-hardening@lists.openwall.com
      Cc: gregkh@linuxfoundation.org
      Cc: torvalds@linux-foundation.org
      Cc: alan@linux.intel.com
      Link: https://lkml.kernel.org/r/151727420158.33451.11658324346540434635.stgit@dwillia2-desk3.amr.corp.intel.com
      edfbae53
  9. 30 1月, 2018 1 次提交
  10. 28 1月, 2018 2 次提交
  11. 27 1月, 2018 1 次提交
  12. 26 1月, 2018 6 次提交
  13. 24 1月, 2018 4 次提交
  14. 19 1月, 2018 1 次提交
  15. 18 1月, 2018 4 次提交
  16. 17 1月, 2018 1 次提交
    • T
      x86/intel_rdt/cqm: Prevent use after free · d4792441
      Thomas Gleixner 提交于
      intel_rdt_iffline_cpu() -> domain_remove_cpu() frees memory first and then
      proceeds accessing it.
      
       BUG: KASAN: use-after-free in find_first_bit+0x1f/0x80
       Read of size 8 at addr ffff883ff7c1e780 by task cpuhp/31/195
       find_first_bit+0x1f/0x80
       has_busy_rmid+0x47/0x70
       intel_rdt_offline_cpu+0x4b4/0x510
      
       Freed by task 195:
       kfree+0x94/0x1a0
       intel_rdt_offline_cpu+0x17d/0x510
      
      Do the teardown first and then free memory.
      
      Fixes: 24247aee ("x86/intel_rdt/cqm: Improve limbo list processing")
      Reported-by: NJoseph Salisbury <joseph.salisbury@canonical.com>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Cc: Ravi Shankar <ravi.v.shankar@intel.com>
      Cc: Peter Zilstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Vikas Shivappa <vikas.shivappa@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: "Roderick W. Smith" <rod.smith@canonical.com>
      Cc: 1733662@bugs.launchpad.net
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: stable@vger.kernel.org
      Link: https://lkml.kernel.org/r/alpine.DEB.2.20.1801161957510.2366@nanos
      d4792441