1. 27 7月, 2016 1 次提交
  2. 24 7月, 2016 2 次提交
    • A
      x86/mm/cpa: Add missing comment in populate_pdg() · 55920d31
      Andy Lutomirski 提交于
      In commit:
      
        21cbc2822aa1 ("x86/mm/cpa: Unbreak populate_pgd(): stop trying to deallocate failed PUDs")
      
      I intended to add this comment, but I failed at using git.
      Signed-off-by: NAndy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/242baf8612394f4e31216f96d13c4d2e9b90d1b7.1469293159.git.luto@kernel.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>
      55920d31
    • A
      x86/mm/cpa: Fix populate_pgd(): Stop trying to deallocate failed PUDs · 530dd8d4
      Andy Lutomirski 提交于
      Valdis Kletnieks bisected a boot failure back to this recent commit:
      
        360cb4d1 ("x86/mm/cpa: In populate_pgd(), don't set the PGD entry until it's populated")
      
      I broke the case where a PUD table got allocated -- populate_pud()
      would wander off a pgd_none entry and get lost.  I'm not sure how
      this survived my testing.
      
      Fix the original issue in a much simpler way.  The problem
      was that, if we allocated a PUD table, failed to populate it, and
      freed it, another CPU could potentially keep using the PGD entry we
      installed (either by copying it via vmalloc_fault or by speculatively
      caching it).  There's a straightforward fix: simply leave the
      top-level entry in place if this happens.  This can't waste any
      significant amount of memory -- there are at most 256 entries like
      this systemwide and, as a practical matter, if we hit this failure
      path repeatedly, we're likely to reuse the same page anyway.
      
      For context, this is a reversion with this hunk added in:
      
      	if (ret < 0) {
      +		/*
      +		 * Leave the PUD page in place in case some other CPU or thread
      +		 * already found it, but remove any useless entries we just
      +		 * added to it.
      +		 */
      -		unmap_pgd_range(cpa->pgd, addr,
      +		unmap_pud_range(pgd_entry, addr,
      			        addr + (cpa->numpages << PAGE_SHIFT));
      		return ret;
      	}
      
      This effectively open-codes what the now-deleted unmap_pgd_range()
      function used to do except that unmap_pgd_range() used to try to
      free the page as well.
      Reported-by: NValdis Kletnieks <Valdis.Kletnieks@vt.edu>
      Signed-off-by: NAndy Lutomirski <luto@kernel.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Luis R. Rodriguez <mcgrof@suse.com>
      Cc: Mike Krinkin <krinkin.m.u@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Toshi Kani <toshi.kani@hp.com>
      Link: http://lkml.kernel.org/r/21cbc2822aa18aa812c0215f4231dbf5f65afa7f.1469249789.git.luto@kernel.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>
      530dd8d4
  3. 22 7月, 2016 3 次提交
    • A
      x86/boot: Simplify EBDA-vs-BIOS reservation logic · 6a79296c
      Andy Lutomirski 提交于
      Both the intent and the effect of reserve_bios_regions() is simple:
      reserve the range from the apparent BIOS start (suitably filtered)
      through 1MB and, if the EBDA start address is sensible, extend that
      reservation downward to cover the EBDA as well.
      
      The code is overcomplicated, though, and contains head-scratchers
      like:
      
      	if (ebda_start < BIOS_START_MIN)
      		ebda_start = BIOS_START_MAX;
      
      That snipped is trying to say "if ebda_start < BIOS_START_MIN,
      ignore it".
      
      Simplify it: reorder the code so that it makes sense.  This should
      have no functional effect under any circumstances.
      Signed-off-by: NAndy Lutomirski <luto@kernel.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Luis R. Rodriguez <mcgrof@suse.com>
      Cc: Mario Limonciello <mario_limonciello@dell.com>
      Cc: Matthew Garrett <mjg59@srcf.ucam.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Toshi Kani <toshi.kani@hp.com>
      Link: http://lkml.kernel.org/r/ef89c0c761be20ead8bd9a3275743e6259b6092a.1469135598.git.luto@kernel.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>
      6a79296c
    • A
      x86/boot: Clarify what x86_legacy_features.reserve_bios_regions does · 30f02739
      Andy Lutomirski 提交于
      It doesn't just control probing for the EBDA -- it controls whether we
      detect and reserve the <1MB BIOS regions in general.
      Signed-off-by: NAndy Lutomirski <luto@kernel.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Luis R. Rodriguez <mcgrof@suse.com>
      Cc: Mario Limonciello <mario_limonciello@dell.com>
      Cc: Matthew Garrett <mjg59@srcf.ucam.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Toshi Kani <toshi.kani@hp.com>
      Link: http://lkml.kernel.org/r/55bd591115498440d461857a7b64f349a5d911f3.1469135598.git.luto@kernel.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>
      30f02739
    • D
      x86/fpu: Do not BUG_ON() in early FPU code · ec3ed4a2
      Dave Hansen 提交于
      I don't think it is really possible to have a system where CPUID
      enumerates support for XSAVE but that it does not have FP/SSE
      (they are "legacy" features and always present).
      
      But, I did manage to hit this case in qemu when I enabled its
      somewhat shaky XSAVE support.  The bummer is that the FPU is set
      up before we parse the command-line or have *any* console support
      including earlyprintk.  That turned what should have been an easy
      thing to debug in to a bit more of an odyssey.
      
      So a BUG() here is worthless.  All it does it guarantee that
      if/when we hit this case we have an empty console.  So, remove
      the BUG() and try to limp along by disabling XSAVE and trying to
      continue.  Add a comment on why we are doing this, and also add
      a common "out_disable" path for leaving fpu__init_system_xstate().
      Signed-off-by: NDave Hansen <dave.hansen@linux.intel.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Dave Hansen <dave@sr71.net>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Quentin Casasnovas <quentin.casasnovas@oracle.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/20160720194551.63BB2B58@viggo.jf.intel.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      ec3ed4a2
  4. 21 7月, 2016 1 次提交
    • I
      x86/boot: Reorganize and clean up the BIOS area reservation code · edce2121
      Ingo Molnar 提交于
      So the reserve_ebda_region() code has accumulated a number of
      problems over the years that make it really difficult to read
      and understand:
      
      - The calculation of 'lowmem' and 'ebda_addr' is an unnecessarily
        interleaved mess of first lowmem, then ebda_addr, then lowmem tweaks...
      
      - 'lowmem' here means 'super low mem' - i.e. 16-bit addressable memory. In other
        parts of the x86 code 'lowmem' means 32-bit addressable memory... This makes it
        super confusing to read.
      
      - It does not help at all that we have various memory range markers, half of which
        are 'start of range', half of which are 'end of range' - but this crucial
        property is not obvious in the naming at all ... gave me a headache trying to
        understand all this.
      
      - Also, the 'ebda_addr' name sucks: it highlights that it's an address (which is
        obvious, all values here are addresses!), while it does not highlight that it's
        the _start_ of the EBDA region ...
      
      - 'BIOS_LOWMEM_KILOBYTES' says a lot of things, except that this is the only value
        that is a pointer to a value, not a memory range address!
      
      - The function name itself is a misnomer: it says 'reserve_ebda_region()' while
        its main purpose is to reserve all the firmware ROM typically between 640K and
        1MB, while the 'EBDA' part is only a small part of that ...
      
      - Likewise, the paravirt quirk flag name 'ebda_search' is misleading as well: this
        too should be about whether to reserve firmware areas in the paravirt case.
      
      - In fact thinking about this as 'end of RAM' is confusing: what this function
        *really* wants to reserve is firmware data and code areas! Once the thinking is
        inverted from a mixed 'ram' and 'reserved firmware area' notion to a pure
        'reserved area' notion everything becomes a lot clearer.
      
      To improve all this rewrite the whole code (without changing the logic):
      
      - Firstly invert the naming from 'lowmem end' to 'BIOS reserved area start'
        and propagate this concept through all the variable names and constants.
      
      	BIOS_RAM_SIZE_KB_PTR		// was: BIOS_LOWMEM_KILOBYTES
      
      	BIOS_START_MIN			// was: INSANE_CUTOFF
      
      	ebda_start			// was: ebda_addr
      	bios_start			// was: lowmem
      
      	BIOS_START_MAX			// was: LOWMEM_CAP
      
      - Then clean up the name of the function itself by renaming it
        to reserve_bios_regions() and renaming the ::ebda_search paravirt
        flag to ::reserve_bios_regions.
      
      - Fix up all the comments (fix typos), harmonize and simplify their
        formulation and remove comments that become unnecessary due to
        the much better naming all around.
      Signed-off-by: NIngo Molnar <mingo@kernel.org>
      edce2121
  5. 15 7月, 2016 17 次提交
  6. 13 7月, 2016 6 次提交
    • D
      x86/mm: Use pte_none() to test for empty PTE · dcb32d99
      Dave Hansen 提交于
      The page table manipulation code seems to have grown a couple of
      sites that are looking for empty PTEs.  Just in case one of these
      entries got a stray bit set, use pte_none() instead of checking
      for a zero pte_val().
      
      The use pte_same() makes me a bit nervous.  If we were doing a
      pte_same() check against two cleared entries and one of them had
      a stray bit set, it might fail the pte_same() check.  But, I
      don't think we ever _do_ pte_same() for cleared entries.  It is
      almost entirely used for checking for races in fault-in paths.
      Signed-off-by: NDave Hansen <dave.hansen@linux.intel.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Dave Hansen <dave@sr71.net>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Luis R. Rodriguez <mcgrof@suse.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Toshi Kani <toshi.kani@hp.com>
      Cc: dave.hansen@intel.com
      Cc: linux-mm@kvack.org
      Cc: mhocko@suse.com
      Link: http://lkml.kernel.org/r/20160708001915.813703D9@viggo.jf.intel.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      dcb32d99
    • D
      x86/mm: Disallow running with 32-bit PTEs to work around erratum · e4a84be6
      Dave Hansen 提交于
      The Intel(R) Xeon Phi(TM) Processor x200 Family (codename: Knights
      Landing) has an erratum where a processor thread setting the Accessed
      or Dirty bits may not do so atomically against its checks for the
      Present bit.  This may cause a thread (which is about to page fault)
      to set A and/or D, even though the Present bit had already been
      atomically cleared.
      
      These bits are truly "stray".  In the case of the Dirty bit, the
      thread associated with the stray set was *not* allowed to write to
      the page.  This means that we do not have to launder the bit(s); we
      can simply ignore them.
      
      If the PTE is used for storing a swap index or a NUMA migration index,
      the A bit could be misinterpreted as part of the swap type.  The stray
      bits being set cause a software-cleared PTE to be interpreted as a
      swap entry.  In some cases (like when the swap index ends up being
      for a non-existent swapfile), the kernel detects the stray value
      and WARN()s about it, but there is no guarantee that the kernel can
      always detect it.
      
      When we have 64-bit PTEs (64-bit mode or 32-bit PAE), we were able
      to move the swap PTE format around to avoid these troublesome bits.
      But, 32-bit non-PAE is tight on bits.  So, disallow it from running
      on this hardware.  I can't imagine anyone wanting to run 32-bit
      non-highmem kernels on this hardware, but disallowing them from
      running entirely is surely the safe thing to do.
      Signed-off-by: NDave Hansen <dave.hansen@linux.intel.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Dave Hansen <dave@sr71.net>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Luis R. Rodriguez <mcgrof@suse.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Toshi Kani <toshi.kani@hp.com>
      Cc: dave.hansen@intel.com
      Cc: linux-mm@kvack.org
      Cc: mhocko@suse.com
      Link: http://lkml.kernel.org/r/20160708001914.D0B50110@viggo.jf.intel.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      e4a84be6
    • D
      x86/mm: Ignore A/D bits in pte/pmd/pud_none() · 97e3c602
      Dave Hansen 提交于
      The erratum we are fixing here can lead to stray setting of the
      A and D bits.  That means that a pte that we cleared might
      suddenly have A/D set.  So, stop considering those bits when
      determining if a pte is pte_none().  The same goes for the
      other pmd_none() and pud_none().  pgd_none() can be skipped
      because it is not affected; we do not use PGD entries for
      anything other than pagetables on affected configurations.
      
      This adds a tiny amount of overhead to all pte_none() checks.
      I doubt we'll be able to measure it anywhere.
      Signed-off-by: NDave Hansen <dave.hansen@linux.intel.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Dave Hansen <dave@sr71.net>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Luis R. Rodriguez <mcgrof@suse.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Toshi Kani <toshi.kani@hp.com>
      Cc: dave.hansen@intel.com
      Cc: linux-mm@kvack.org
      Cc: mhocko@suse.com
      Link: http://lkml.kernel.org/r/20160708001912.5216F89C@viggo.jf.intel.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      97e3c602
    • D
      x86/mm: Move swap offset/type up in PTE to work around erratum · 00839ee3
      Dave Hansen 提交于
      This erratum can result in Accessed/Dirty getting set by the hardware
      when we do not expect them to be (on !Present PTEs).
      
      Instead of trying to fix them up after this happens, we just
      allow the bits to get set and try to ignore them.  We do this by
      shifting the layout of the bits we use for swap offset/type in
      our 64-bit PTEs.
      
      It looks like this:
      
       bitnrs: |     ...            | 11| 10|  9|8|7|6|5| 4| 3|2|1|0|
       names:  |     ...            |SW3|SW2|SW1|G|L|D|A|CD|WT|U|W|P|
       before: |         OFFSET (9-63)          |0|X|X| TYPE(1-5) |0|
        after: | OFFSET (14-63)  |  TYPE (9-13) |0|X|X|X| X| X|X|X|0|
      
      Note that D was already a don't care (X) even before.  We just
      move TYPE up and turn its old spot (which could be hit by the
      A bit) into all don't cares.
      
      We take 5 bits away from the offset, but that still leaves us
      with 50 bits which lets us index into a 62-bit swapfile (4 EiB).
      I think that's probably fine for the moment.  We could
      theoretically reclaim 5 of the bits (1, 2, 3, 4, 7) but it
      doesn't gain us anything.
      Signed-off-by: NDave Hansen <dave.hansen@linux.intel.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Dave Hansen <dave@sr71.net>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Luis R. Rodriguez <mcgrof@suse.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Toshi Kani <toshi.kani@hp.com>
      Cc: dave.hansen@intel.com
      Cc: linux-mm@kvack.org
      Cc: mhocko@suse.com
      Link: http://lkml.kernel.org/r/20160708001911.9A3FD2B6@viggo.jf.intel.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      00839ee3
    • A
      x86/sfi: Enable enumeration of SD devices · 05f310e2
      Andy Shevchenko 提交于
      SFI specification v0.8.2 defines type of devices which are connected to
      SD bus. In particularly WiFi dongle is a such.
      
      Add a callback to enumerate the devices connected to SD bus.
      Signed-off-by: NAndy Shevchenko <andriy.shevchenko@linux.intel.com>
      Cc: Bjorn Helgaas <bhelgaas@google.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/1468322192-62080-1-git-send-email-andriy.shevchenko@linux.intel.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      05f310e2
    • A
      x86/pci: Use MRFLD abbreviation for Merrifield · 707a605b
      Andy Shevchenko 提交于
      Everywhere in the kernel the MRFLD is used as abbreviation of Intel Merrifield.
      Do the same in intel_mid_pci.c module.
      Signed-off-by: NAndy Shevchenko <andriy.shevchenko@linux.intel.com>
      Cc: Bjorn Helgaas <bhelgaas@google.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/1468321462-136016-1-git-send-email-andriy.shevchenko@linux.intel.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      707a605b
  7. 12 7月, 2016 3 次提交
  8. 11 7月, 2016 7 次提交
    • Y
      x86/fpu/xstate: Re-enable XSAVES · b8be15d5
      Yu-cheng Yu 提交于
      We did not handle XSAVES instructions correctly. There were issues in
      converting between standard and compacted format when interfacing with
      user-space. These issues have been corrected.
      
      Add a WARN_ONCE() to make it clear that XSAVES supervisor states are not
      yet implemented.
      Signed-off-by: NYu-cheng Yu <yu-cheng.yu@intel.com>
      Signed-off-by: NFenghua Yu <fenghua.yu@intel.com>
      Reviewed-by: NDave Hansen <dave.hansen@intel.com>
      Cc: H. Peter Anvin <h.peter.anvin@intel.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ravi V Shankar <ravi.v.shankar@intel.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/1468253937-40008-5-git-send-email-fenghua.yu@intel.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      b8be15d5
    • Y
      x86/fpu/xstate: Fix fpstate_init() for XRSTORS · 35ac2d7b
      Yu-cheng Yu 提交于
      In XSAVES mode if fpstate_init() is used to initialize a
      task's extended state area, xsave.header.xcomp_bv[63] must
      be set. Otherwise, when the task is scheduled, a warning is
      triggered from copy_kernel_to_xregs().
      
      One such test case is: setting an invalid extended state
      through PTRACE. When xstateregs_set() rejects the syscall
      and re-initializes the task's extended state area. This triggers
      the warning mentioned above.
      Signed-off-by: NYu-cheng Yu <yu-cheng.yu@intel.com>
      Signed-off-by: NFenghua Yu <fenghua.yu@intel.com>
      Reviewed-by: NDave Hansen <dave.hansen@intel.com>
      Cc: H. Peter Anvin <h.peter.anvin@intel.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ravi V Shankar <ravi.v.shankar@intel.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/1468253937-40008-4-git-send-email-fenghua.yu@intel.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      35ac2d7b
    • Y
      x86/fpu/xstate: Return NULL for disabled xstate component address · 5060b915
      Yu-cheng Yu 提交于
      It is an error to request a disabled XSAVE/XSAVES component address.
      For that case, make __raw_xsave_addr() return a NULL and issue a
      warning.
      Signed-off-by: NYu-cheng Yu <yu-cheng.yu@intel.com>
      Signed-off-by: NFenghua Yu <fenghua.yu@intel.com>
      Reviewed-by: NDave Hansen <dave.hansen@intel.com>
      Cc: H. Peter Anvin <h.peter.anvin@intel.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ravi V Shankar <ravi.v.shankar@intel.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/1468253937-40008-3-git-send-email-fenghua.yu@intel.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      5060b915
    • Y
      x86/fpu/xstate: Fix __fpu_restore_sig() for XSAVES · 1fc2b67b
      Yu-cheng Yu 提交于
      When the kernel is using XSAVES compacted format, we cannot do
      __copy_from_user() from a signal frame, which has standard-format data.
      Fix it by using copyin_to_xsaves(), which converts between formats and
      filters out all supervisor states that we do not allow userspace to
      write.
      Signed-off-by: NYu-cheng Yu <yu-cheng.yu@intel.com>
      Signed-off-by: NFenghua Yu <fenghua.yu@intel.com>
      Reviewed-by: NDave Hansen <dave.hansen@intel.com>
      Cc: H. Peter Anvin <h.peter.anvin@intel.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ravi V Shankar <ravi.v.shankar@intel.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/1468253937-40008-2-git-send-email-fenghua.yu@intel.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      1fc2b67b
    • I
      Revert "perf/x86/intel, watchdog: Switch NMI watchdog to ref cycles on x86" · 44530d58
      Ingo Molnar 提交于
      This reverts commit 2c95afc1.
      
      Stephane reported the following regression:
      
       > Since Andi added:
       >
       > commit 2c95afc1
       > Author: Andi Kleen <ak@linux.intel.com>
       > Date:   Thu Jun 9 06:14:38 2016 -0700
       >
       >    perf/x86/intel, watchdog: Switch NMI watchdog to ref cycles on x86
       >
       > $ perf stat -e ref-cycles ls
       >   <not counted> ....
       >
       > fails systematically because the ref-cycles is now used by the
       > watchdog and given this is a system-wide pinned event, it monopolizes
       > the fixed counter 2 which is the only counter able to measure this event.
      
      Since the next merge window is near, fix the regression for now
      by reverting the commit.
      Reported-by: NStephane Eranian <eranian@google.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: NIngo Molnar <mingo@kernel.org>
      44530d58
    • L
      x86/quirks: Add early quirk to reset Apple AirPort card · abb2bafd
      Lukas Wunner 提交于
      The EFI firmware on Macs contains a full-fledged network stack for
      downloading OS X images from osrecovery.apple.com. Unfortunately
      on Macs introduced 2011 and 2012, EFI brings up the Broadcom 4331
      wireless card on every boot and leaves it enabled even after
      ExitBootServices has been called. The card continues to assert its IRQ
      line, causing spurious interrupts if the IRQ is shared. It also corrupts
      memory by DMAing received packets, allowing for remote code execution
      over the air. This only stops when a driver is loaded for the wireless
      card, which may be never if the driver is not installed or blacklisted.
      
      The issue seems to be constrained to the Broadcom 4331. Chris Milsted
      has verified that the newer Broadcom 4360 built into the MacBookPro11,3
      (2013/2014) does not exhibit this behaviour. The chances that Apple will
      ever supply a firmware fix for the older machines appear to be zero.
      
      The solution is to reset the card on boot by writing to a reset bit in
      its mmio space. This must be done as an early quirk and not as a plain
      vanilla PCI quirk to successfully combat memory corruption by DMAed
      packets: Matthew Garrett found out in 2012 that the packets are written
      to EfiBootServicesData memory (http://mjg59.dreamwidth.org/11235.html).
      This type of memory is made available to the page allocator by
      efi_free_boot_services(). Plain vanilla PCI quirks run much later, in
      subsys initcall level. In-between a time window would be open for memory
      corruption. Random crashes occurring in this time window and attributed
      to DMAed packets have indeed been observed in the wild by Chris
      Bainbridge.
      
      When Matthew Garrett analyzed the memory corruption issue in 2012, he
      sought to fix it with a grub quirk which transitions the card to D3hot:
      http://git.savannah.gnu.org/cgit/grub.git/commit/?id=9d34bb85da56
      
      This approach does not help users with other bootloaders and while it
      may prevent DMAed packets, it does not cure the spurious interrupts
      emanating from the card. Unfortunately the card's mmio space is
      inaccessible in D3hot, so to reset it, we have to undo the effect of
      Matthew's grub patch and transition the card back to D0.
      
      Note that the quirk takes a few shortcuts to reduce the amount of code:
      The size of BAR 0 and the location of the PM capability is identical
      on all affected machines and therefore hardcoded. Only the address of
      BAR 0 differs between models. Also, it is assumed that the BCMA core
      currently mapped is the 802.11 core. The EFI driver seems to always take
      care of this.
      
      Michael Büsch, Bjorn Helgaas and Matt Fleming contributed feedback
      towards finding the best solution to this problem.
      
      The following should be a comprehensive list of affected models:
          iMac13,1        2012  21.5"       [Root Port 00:1c.3 = 8086:1e16]
          iMac13,2        2012  27"         [Root Port 00:1c.3 = 8086:1e16]
          Macmini5,1      2011  i5 2.3 GHz  [Root Port 00:1c.1 = 8086:1c12]
          Macmini5,2      2011  i5 2.5 GHz  [Root Port 00:1c.1 = 8086:1c12]
          Macmini5,3      2011  i7 2.0 GHz  [Root Port 00:1c.1 = 8086:1c12]
          Macmini6,1      2012  i5 2.5 GHz  [Root Port 00:1c.1 = 8086:1e12]
          Macmini6,2      2012  i7 2.3 GHz  [Root Port 00:1c.1 = 8086:1e12]
          MacBookPro8,1   2011  13"         [Root Port 00:1c.1 = 8086:1c12]
          MacBookPro8,2   2011  15"         [Root Port 00:1c.1 = 8086:1c12]
          MacBookPro8,3   2011  17"         [Root Port 00:1c.1 = 8086:1c12]
          MacBookPro9,1   2012  15"         [Root Port 00:1c.1 = 8086:1e12]
          MacBookPro9,2   2012  13"         [Root Port 00:1c.1 = 8086:1e12]
          MacBookPro10,1  2012  15"         [Root Port 00:1c.1 = 8086:1e12]
          MacBookPro10,2  2012  13"         [Root Port 00:1c.1 = 8086:1e12]
      
      For posterity, spurious interrupts caused by the Broadcom 4331 wireless
      card resulted in splats like this (stacktrace omitted):
      
          irq 17: nobody cared (try booting with the "irqpoll" option)
          handlers:
          [<ffffffff81374370>] pcie_isr
          [<ffffffffc0704550>] sdhci_irq [sdhci] threaded [<ffffffffc07013c0>] sdhci_thread_irq [sdhci]
          [<ffffffffc0a0b960>] azx_interrupt [snd_hda_codec]
          Disabling IRQ #17
      
      Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=79301
      Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=111781
      Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=728916
      Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=895951#c16
      Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1009819
      Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1098621
      Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1149632#c5
      Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1279130
      Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1332732
      Tested-by: Konstantin Simanov <k.simanov@stlk.ru>        # [MacBookPro8,1]
      Tested-by: Lukas Wunner <lukas@wunner.de>                # [MacBookPro9,1]
      Tested-by: Bryan Paradis <bryan.paradis@gmail.com>       # [MacBookPro9,2]
      Tested-by: Andrew Worsley <amworsley@gmail.com>          # [MacBookPro10,1]
      Tested-by: Chris Bainbridge <chris.bainbridge@gmail.com> # [MacBookPro10,2]
      Signed-off-by: NLukas Wunner <lukas@wunner.de>
      Acked-by: NRafał Miłecki <zajec5@gmail.com>
      Acked-by: NMatt Fleming <matt@codeblueprint.co.uk>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Bjorn Helgaas <bhelgaas@google.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Chris Milsted <cmilsted@redhat.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Matthew Garrett <mjg59@srcf.ucam.org>
      Cc: Michael Buesch <m@bues.ch>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: b43-dev@lists.infradead.org
      Cc: linux-pci@vger.kernel.org
      Cc: linux-wireless@vger.kernel.org
      Cc: stable@vger.kernel.org
      Cc: stable@vger.kernel.org # 123456789abc: x86/quirks: Apply nvidia_bugs quirk only on root bus
      Cc: stable@vger.kernel.org # 123456789abc: x86/quirks: Reintroduce scanning of secondary buses
      Link: http://lkml.kernel.org/r/48d0972ac82a53d460e5fce77a07b2560db95203.1465690253.git.lukas@wunner.de
      [ Did minor readability edits. ]
      Signed-off-by: NIngo Molnar <mingo@kernel.org>
      abb2bafd
    • L
      x86/quirks: Reintroduce scanning of secondary buses · 850c3210
      Lukas Wunner 提交于
      We used to scan secondary buses until the following commit that
      was applied in 2009:
      
        8659c406 ("x86: only scan the root bus in early PCI quirks")
      
      which commit constrained early quirks to the root bus only. Its
      motivation was to prevent application of the nvidia_bugs quirk
      on secondary buses.
      
      We're about to add a quirk to reset the Broadcom 4331 wireless card on
      2011/2012 Macs, which is located on a secondary bus behind a PCIe root
      port. To facilitate that, reintroduce scanning of secondary buses.
      
      The commit message of 8659c406 notes that scanning only the root bus
      "saves quite some unnecessary scanning work". The algorithm used prior
      to 8659c406 was particularly time consuming because it scanned
      buses 0 to 31 brute force. To avoid lengthening boot time, employ a
      recursive strategy which only scans buses that are actually reachable
      from the root bus.
      
      Yinghai Lu pointed out that the secondary bus number read from a
      bridge's config space may be invalid, in particular a value of 0 would
      cause an infinite loop. The PCI core goes beyond that and recurses to a
      child bus only if its bus number is greater than the parent bus number
      (see pci_scan_bridge()). Since the root bus is numbered 0, this implies
      that secondary buses may not be 0. Do the same on early scanning.
      
      If this algorithm is found to significantly impact boot time or cause
      infinite loops on broken hardware, it would be possible to limit its
      recursion depth: The Broadcom 4331 quirk applies at depth 1, all others
      at depth 0, so the bus need not be scanned deeper than that for now. An
      alternative approach would be to revert to scanning only the root bus,
      and apply the Broadcom 4331 quirk to the root ports 8086:1c12, 8086:1e12
      and 8086:1e16. Apple always positioned the card behind either of these
      three ports. The quirk would then check presence of the card in slot 0
      below the root port and do its deed.
      Signed-off-by: NLukas Wunner <lukas@wunner.de>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Bjorn Helgaas <bhelgaas@google.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: linux-pci@vger.kernel.org
      Link: http://lkml.kernel.org/r/f0daa70dac1a9b2483abdb31887173eb6ab77bdf.1465690253.git.lukas@wunner.deSigned-off-by: NIngo Molnar <mingo@kernel.org>
      850c3210
新手
引导
客服 返回
顶部