1. 10 Aug, 2016 2 commits
    • x86/timers/apic: Inform TSC deadline clockevent device about recalibration · 6731b0d6
      Nicolai Stange committed
      This patch eliminates a source of imprecise APIC timer interrupts;
      this imprecision may result in double interrupts or even late
      interrupts.
      
      The TSC deadline clockevent devices' configuration and registration
      happen before the TSC frequency calibration is refined in
      tsc_refine_calibration_work().
      
      This results in the TSC clocksource and the TSC deadline clockevent
      devices being configured with slightly different frequencies: the former
      gets the refined one and the latter are configured with the inaccurate
      frequency detected earlier by means of the "Fast TSC calibration using PIT".
      
      Within the APIC code, introduce the notifier function
      lapic_update_tsc_freq() which reconfigures all per-CPU TSC deadline
      clockevent devices with the current tsc_khz.
      
      Call it from the TSC code after TSC calibration refinement has happened.
      Signed-off-by: Nicolai Stange <nicstange@gmail.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Acked-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Christopher S. Hall <christopher.s.hall@intel.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Hidehiro Kawai <hidehiro.kawai.ez@hitachi.com>
      Cc: Len Brown <len.brown@intel.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Viresh Kumar <viresh.kumar@linaro.org>
      Link: http://lkml.kernel.org/r/20160714152255.18295-3-nicstange@gmail.com
      [ Pushed #ifdef CONFIG_X86_LOCAL_APIC into header, improved changelog. ]
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      6731b0d6
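The effect of configuring the clockevent device with a slightly different frequency than the clocksource can be estimated with a little arithmetic. The following is a minimal sketch, not kernel code: `deadline_error_ns()` and its parameters are hypothetical names, and the clockevent divisor is ignored because it cancels out of the ratio.

```c
#include <stdint.h>

/*
 * Hypothetical helper, for illustration only.
 *
 * timeout_ns: the delay the timer core wants to wait
 * cfg_hz:     frequency the clockevent device was configured with
 * true_hz:    frequency the counter really runs at
 *
 * Returns how many nanoseconds too early the interrupt fires
 * (a negative value means it fires late).
 * The 64-bit products are assumed not to overflow, which holds for
 * timeouts of about one second at frequencies of a few GHz.
 */
static int64_t deadline_error_ns(uint64_t timeout_ns, uint64_t cfg_hz,
				 uint64_t true_hz)
{
	/* Cycles programmed into the device, based on the configured rate. */
	uint64_t cycles = timeout_ns * cfg_hz / 1000000000ULL;

	/* Time that really elapses until those cycles have passed. */
	uint64_t actual_ns = cycles * 1000000000ULL / true_hz;

	return (int64_t)timeout_ns - (int64_t)actual_ns;
}
```

With made-up numbers, a device configured 50 ppm too low (1999900000 Hz against a true 2000000000 Hz) fires a one-second timeout 50000 ns early, which is the order of magnitude reported in the second commit of this pair.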
    • x86/timers/apic: Fix imprecise timer interrupts by eliminating TSC clockevents... · 1a9e4c56
      Nicolai Stange committed
      x86/timers/apic: Fix imprecise timer interrupts by eliminating TSC clockevents frequency roundoff error
      
      I noticed the following bug/misbehavior on certain Intel systems: with a
      single task running on a NOHZ CPU on an Intel Haswell, I observed not
      just the one expected local_timer APIC interrupt per second, but at
      least two. (!)
      
      Further tracing showed that the first one precedes the programmed deadline
      by up to ~50us and hence, it did nothing except for reprogramming the TSC
      deadline clockevent device to trigger shortly thereafter again.
      
      The reason for this is imprecise calibration: the timeout we program into
      the APIC results in 'too short' timer interrupts. The core (hr)timer code
      notices this (because it has a precise ktime source and sees the short
      interrupt) and fixes it up by programming an additional very short
      interrupt period.
      
      This is obviously suboptimal.
      
      The reason for the imprecise calibration is twofold, and this patch
      fixes the first reason:
      
      In setup_APIC_timer(), the registered clockevent device's frequency
      is calculated by first dividing tsc_khz by TSC_DIVISOR and multiplying
      it with 1000 afterwards:
      
        (tsc_khz / TSC_DIVISOR) * 1000
      
      The multiplication with 1000 is done for converting from kHz to Hz and the
      division by TSC_DIVISOR is carried out in order to make sure that the final
      result fits into an u32.
      
      However, with the order given in this calculation, the roundoff error
      introduced by the division gets magnified by a factor of 1000 by the
      following multiplication.
      
      Reversing the order of the division and the multiplication, a la:
      
        (tsc_khz * 1000) / TSC_DIVISOR
      
      ... already reduces the roundoff error.
      
      Furthermore, if TSC_DIVISOR divides 1000, associativity holds:
      
        (tsc_khz * 1000) / TSC_DIVISOR = tsc_khz * (1000 / TSC_DIVISOR)
      
      and thus, the roundoff error even vanishes and the whole operation can be
      carried out within 32 bits.
      
      The powers of two that divide 1000 are 2, 4 and 8. A value of 8 for
      TSC_DIVISOR still allows for TSC frequencies up to
      2^32 / 10^9ns * 8 = 34.4GHz, which is far higher than anything to be
      expected in the coming years.
      
      Thus, also replace the current TSC_DIVISOR value of 32 with 8 and reverse
      the order of the division and the multiplication in the calculation of
      the registered clockevent device's frequency.
      Signed-off-by: Nicolai Stange <nicstange@gmail.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
      Acked-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Christopher S. Hall <christopher.s.hall@intel.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Hidehiro Kawai <hidehiro.kawai.ez@hitachi.com>
      Cc: Len Brown <len.brown@intel.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Viresh Kumar <viresh.kumar@linaro.org>
      Link: http://lkml.kernel.org/r/20160714152255.18295-2-nicstange@gmail.com
      [ Improved changelog. ]
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      1a9e4c56
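The roundoff argument from the commit message above can be checked numerically. This is a small sketch rather than the kernel's code: only `tsc_khz` and `TSC_DIVISOR` come from the commit, while the function names, the `_OLD`/`_NEW` suffixes and the sample frequency used in the note below are assumptions made for illustration.

```c
#include <stdint.h>

#define TSC_DIVISOR_OLD 32u	/* value before the patch */
#define TSC_DIVISOR_NEW 8u	/* value after the patch; 8 divides 1000 */

/* Old order: divide first, so the truncation error is scaled up by 1000. */
static uint32_t clockevent_hz_old(uint32_t tsc_khz)
{
	return (tsc_khz / TSC_DIVISOR_OLD) * 1000u;
}

/* New order: since 8 divides 1000, tsc_khz * (1000 / 8) loses nothing. */
static uint32_t clockevent_hz_new(uint32_t tsc_khz)
{
	return tsc_khz * (1000u / TSC_DIVISOR_NEW);
}

/* 64-bit reference value: multiply first, then divide (truncating). */
static uint64_t clockevent_hz_exact(uint32_t tsc_khz, uint32_t divisor)
{
	return ((uint64_t)tsc_khz * 1000u) / divisor;
}
```

For a hypothetical `tsc_khz` of 2593993, the old formula yields 81062000 Hz where the reference value is 81062281 Hz, an error of 281 Hz. The new formula yields 324249125 Hz, which matches the reference value for a divisor of 8 exactly.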
  2. 27 Jul, 2016 2 commits
  3. 25 Jul, 2016 1 commit
  4. 22 Jul, 2016 2 commits
    • x86/boot: Simplify EBDA-vs-BIOS reservation logic · 6a79296c
      Andy Lutomirski committed
      Both the intent and the effect of reserve_bios_regions() are simple:
      reserve the range from the apparent BIOS start (suitably filtered)
      through 1MB and, if the EBDA start address is sensible, extend that
      reservation downward to cover the EBDA as well.
      
      The code is overcomplicated, though, and contains head-scratchers
      like:
      
      	if (ebda_start < BIOS_START_MIN)
      		ebda_start = BIOS_START_MAX;
      
      That snippet is trying to say "if ebda_start < BIOS_START_MIN,
      ignore it".
      
      Simplify it: reorder the code so that it makes sense.  This should
      have no functional effect under any circumstances.
      Signed-off-by: Andy Lutomirski <luto@kernel.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Luis R. Rodriguez <mcgrof@suse.com>
      Cc: Mario Limonciello <mario_limonciello@dell.com>
      Cc: Matthew Garrett <mjg59@srcf.ucam.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Toshi Kani <toshi.kani@hp.com>
      Link: http://lkml.kernel.org/r/ef89c0c761be20ead8bd9a3275743e6259b6092a.1469135598.git.luto@kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      6a79296c
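The reservation rule described in the first paragraph of the commit message above can be written down in a few lines. This is a userspace sketch under stated assumptions, not the kernel function: the helper names are hypothetical, the numeric values of `BIOS_START_MIN` and `BIOS_START_MAX` are assumed (128K and 640K), and the exact form of the "suitably filtered" step is a guess.

```c
#include <stdint.h>

/* Assumed values, for illustration only. */
#define BIOS_START_MIN	0x20000u	/* 128K: anything lower is implausible */
#define BIOS_START_MAX	0xa0000u	/* 640K: upper bound for the BIOS start */
#define ONE_MB		0x100000u

/*
 * Return the start of the range to reserve; the range always ends at 1MB.
 * bios_start: first guess at where the BIOS area begins
 * ebda_start: start address of the EBDA as reported by firmware
 */
static uint32_t reserved_region_start(uint32_t bios_start, uint32_t ebda_start)
{
	/* Filter the guess: an implausible value falls back to the maximum. */
	if (bios_start < BIOS_START_MIN || bios_start > BIOS_START_MAX)
		bios_start = BIOS_START_MAX;

	/* A sensible EBDA start below the BIOS area extends the range down. */
	if (ebda_start >= BIOS_START_MIN && ebda_start < bios_start)
		bios_start = ebda_start;

	return bios_start;
}

/* Size of the reserved range, from the computed start up to 1MB. */
static uint32_t reserved_region_size(uint32_t bios_start, uint32_t ebda_start)
{
	return ONE_MB - reserved_region_start(bios_start, ebda_start);
}
```

Written this way, an EBDA start below `BIOS_START_MIN` is simply ignored by the second condition, so the assignment of `BIOS_START_MAX` to `ebda_start` quoted in the commit message has no counterpart here.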
    • x86/fpu: Do not BUG_ON() in early FPU code · ec3ed4a2
      Dave Hansen committed
      I don't think it is really possible to have a system where CPUID
      enumerates support for XSAVE but which does not have FP/SSE
      (they are "legacy" features and always present).
      
      But, I did manage to hit this case in qemu when I enabled its
      somewhat shaky XSAVE support.  The bummer is that the FPU is set
      up before we parse the command-line or have *any* console support
      including earlyprintk.  That turned what should have been an easy
      thing to debug into a bit more of an odyssey.
      
      So a BUG() here is worthless.  All it does is guarantee that
      if/when we hit this case we have an empty console.  So, remove
      the BUG() and try to limp along by disabling XSAVE and trying to
      continue.  Add a comment on why we are doing this, and also add
      a common "out_disable" path for leaving fpu__init_system_xstate().
      Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Dave Hansen <dave@sr71.net>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Quentin Casasnovas <quentin.casasnovas@oracle.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/20160720194551.63BB2B58@viggo.jf.intel.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      ec3ed4a2
  5. 21 Jul, 2016 1 commit
    • x86/boot: Reorganize and clean up the BIOS area reservation code · edce2121
      Ingo Molnar committed
      So the reserve_ebda_region() code has accumulated a number of
      problems over the years that make it really difficult to read
      and understand:
      
      - The calculation of 'lowmem' and 'ebda_addr' is an unnecessarily
        interleaved mess of first lowmem, then ebda_addr, then lowmem tweaks...
      
      - 'lowmem' here means 'super low mem' - i.e. 16-bit addressable memory. In other
        parts of the x86 code 'lowmem' means 32-bit addressable memory... This makes it
        super confusing to read.
      
      - It does not help at all that we have various memory range markers, half of which
        are 'start of range', half of which are 'end of range' - but this crucial
        property is not obvious in the naming at all ... gave me a headache trying to
        understand all this.
      
      - Also, the 'ebda_addr' name sucks: it highlights that it's an address (which is
        obvious, all values here are addresses!), while it does not highlight that it's
        the _start_ of the EBDA region ...
      
      - 'BIOS_LOWMEM_KILOBYTES' says a lot of things, except that this is the only value
        that is a pointer to a value, not a memory range address!
      
      - The function name itself is a misnomer: it says 'reserve_ebda_region()' while
        its main purpose is to reserve all the firmware ROM typically between 640K and
        1MB, while the 'EBDA' part is only a small part of that ...
      
      - Likewise, the paravirt quirk flag name 'ebda_search' is misleading as well: this
        too should be about whether to reserve firmware areas in the paravirt case.
      
      - In fact thinking about this as 'end of RAM' is confusing: what this function
        *really* wants to reserve is firmware data and code areas! Once the thinking is
        inverted from a mixed 'ram' and 'reserved firmware area' notion to a pure
        'reserved area' notion everything becomes a lot clearer.
      
      To improve all this rewrite the whole code (without changing the logic):
      
      - Firstly invert the naming from 'lowmem end' to 'BIOS reserved area start'
        and propagate this concept through all the variable names and constants.
      
      	BIOS_RAM_SIZE_KB_PTR		// was: BIOS_LOWMEM_KILOBYTES
      
      	BIOS_START_MIN			// was: INSANE_CUTOFF
      
      	ebda_start			// was: ebda_addr
      	bios_start			// was: lowmem
      
      	BIOS_START_MAX			// was: LOWMEM_CAP
      
      - Then clean up the name of the function itself by renaming it
        to reserve_bios_regions() and renaming the ::ebda_search paravirt
        flag to ::reserve_bios_regions.
      
      - Fix up all the comments (fix typos), harmonize and simplify their
        formulation and remove comments that become unnecessary due to
        the much better naming all around.
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      edce2121
  6. 20 Jul, 2016 1 commit
  7. 16 Jul, 2016 1 commit
    • x86 / hibernate: Use hlt_play_dead() when resuming from hibernation · 406f992e
      Rafael J. Wysocki committed
      On Intel hardware, native_play_dead() uses mwait_play_dead() by
      default and only falls back to the other methods if that fails.
      That also happens during resume from hibernation, when the restore
      (boot) kernel runs disable_nonboot_cpus() to take all of the CPUs
      except for the boot one offline.
      
      However, that is problematic, because the address passed to
      __monitor() in mwait_play_dead() is likely to be written to in the
      last phase of hibernate image restoration and that causes the "dead"
      CPU to start executing instructions again.  Unfortunately, the page
      containing the address in that CPU's instruction pointer may not be
      valid any more at that point.
      
      First, that page may have been overwritten with image kernel memory
      contents already, so the instructions the CPU attempts to execute may
      simply be invalid.  Second, the page tables previously used by that
      CPU may have been overwritten by image kernel memory contents, so the
      address in its instruction pointer is impossible to resolve then.
      
      A report from Varun Koyyalagunta and investigation carried out by
      Chen Yu show that the latter sometimes happens in practice.
      
      To prevent it from happening, temporarily change the smp_ops.play_dead
      pointer during resume from hibernation so that it points to a special
      "play dead" routine which uses hlt_play_dead() and avoids the
      inadvertent "revivals" of "dead" CPUs this way.
      
      A slightly unpleasant consequence of this change is that if the
      system is hibernated with one or more CPUs offline, it will generally
      draw more power after resume than it did before hibernation, because
      the physical state entered by CPUs via hlt_play_dead() is higher-power
      than the mwait_play_dead() one in the majority of cases.  It is
      possible to work around this, but it is unclear how much of a problem
      that's going to be in practice, so the workaround will be implemented
      later if it turns out to be necessary.
      
      Link: https://bugzilla.kernel.org/show_bug.cgi?id=106371
      Reported-by: Varun Koyyalagunta <cpudebug@centtech.com>
      Original-by: Chen Yu <yu.c.chen@intel.com>
      Tested-by: Chen Yu <yu.c.chen@intel.com>
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
      Acked-by: Ingo Molnar <mingo@kernel.org>
      406f992e
  8. 15 Jul, 2016 12 commits
  9. 14 Jul, 2016 1 commit
  10. 12 Jul, 2016 3 commits
  11. 11 Jul, 2016 8 commits
    • x86/fpu/xstate: Re-enable XSAVES · b8be15d5
      Yu-cheng Yu committed
      We did not handle XSAVES instructions correctly. There were issues in
      converting between standard and compacted format when interfacing with
      user-space. These issues have been corrected.
      
      Add a WARN_ONCE() to make it clear that XSAVES supervisor states are not
      yet implemented.
      Signed-off-by: Yu-cheng Yu <yu-cheng.yu@intel.com>
      Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
      Reviewed-by: Dave Hansen <dave.hansen@intel.com>
      Cc: H. Peter Anvin <h.peter.anvin@intel.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ravi V Shankar <ravi.v.shankar@intel.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/1468253937-40008-5-git-send-email-fenghua.yu@intel.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      b8be15d5
    • x86/fpu/xstate: Fix fpstate_init() for XRSTORS · 35ac2d7b
      Yu-cheng Yu committed
      In XSAVES mode if fpstate_init() is used to initialize a
      task's extended state area, xsave.header.xcomp_bv[63] must
      be set. Otherwise, when the task is scheduled, a warning is
      triggered from copy_kernel_to_xregs().
      
      One such test case is setting an invalid extended state
      through PTRACE: xstateregs_set() rejects the syscall
      and re-initializes the task's extended state area, which triggers
      the warning mentioned above.
      Signed-off-by: Yu-cheng Yu <yu-cheng.yu@intel.com>
      Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
      Reviewed-by: Dave Hansen <dave.hansen@intel.com>
      Cc: H. Peter Anvin <h.peter.anvin@intel.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ravi V Shankar <ravi.v.shankar@intel.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/1468253937-40008-4-git-send-email-fenghua.yu@intel.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      35ac2d7b
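The requirement that xcomp_bv[63] be set when initializing an extended state area in XSAVES mode can be illustrated with a small sketch. Everything here is a simplified stand-in: `struct xsave_header_sketch`, `header_init()` and the `XCOMP_BV_COMPACTED` name are hypothetical, and OR-ing the enabled feature mask into the field is an assumption about what a full initialization would include.

```c
#include <stdint.h>
#include <string.h>

/* Bit 63 of xcomp_bv marks the compacted format. */
#define XCOMP_BV_COMPACTED (1ULL << 63)

/* Simplified stand-in for the 64-byte XSAVE header; not the kernel's struct. */
struct xsave_header_sketch {
	uint64_t xstate_bv;
	uint64_t xcomp_bv;
	uint64_t reserved[6];
};

/*
 * Initialize a header to the "init" state.
 * use_xsaves:       nonzero if the compacted format is in use
 * enabled_features: mask of enabled state components (assumed input)
 */
static void header_init(struct xsave_header_sketch *hdr, int use_xsaves,
			uint64_t enabled_features)
{
	memset(hdr, 0, sizeof(*hdr));

	/* In compacted mode a zeroed xcomp_bv is not a valid header. */
	if (use_xsaves)
		hdr->xcomp_bv = XCOMP_BV_COMPACTED | enabled_features;
}
```

In standard (non-compacted) mode the field stays zero, which is why a plain `memset()` was sufficient before XSAVES came into play.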
    • x86/fpu/xstate: Return NULL for disabled xstate component address · 5060b915
      Yu-cheng Yu committed
      It is an error to request a disabled XSAVE/XSAVES component address.
      For that case, make __raw_xsave_addr() return a NULL and issue a
      warning.
      Signed-off-by: Yu-cheng Yu <yu-cheng.yu@intel.com>
      Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
      Reviewed-by: Dave Hansen <dave.hansen@intel.com>
      Cc: H. Peter Anvin <h.peter.anvin@intel.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ravi V Shankar <ravi.v.shankar@intel.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/1468253937-40008-3-git-send-email-fenghua.yu@intel.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      5060b915
    • x86/fpu/xstate: Fix __fpu_restore_sig() for XSAVES · 1fc2b67b
      Yu-cheng Yu committed
      When the kernel is using XSAVES compacted format, we cannot do
      __copy_from_user() from a signal frame, which has standard-format data.
      Fix it by using copyin_to_xsaves(), which converts between formats and
      filters out all supervisor states that we do not allow userspace to
      write.
      Signed-off-by: Yu-cheng Yu <yu-cheng.yu@intel.com>
      Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
      Reviewed-by: Dave Hansen <dave.hansen@intel.com>
      Cc: H. Peter Anvin <h.peter.anvin@intel.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ravi V Shankar <ravi.v.shankar@intel.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/1468253937-40008-2-git-send-email-fenghua.yu@intel.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      1fc2b67b
    • Revert "perf/x86/intel, watchdog: Switch NMI watchdog to ref cycles on x86" · 44530d58
      Ingo Molnar committed
      This reverts commit 2c95afc1.
      
      Stephane reported the following regression:
      
       > Since Andi added:
       >
       > commit 2c95afc1
       > Author: Andi Kleen <ak@linux.intel.com>
       > Date:   Thu Jun 9 06:14:38 2016 -0700
       >
       >    perf/x86/intel, watchdog: Switch NMI watchdog to ref cycles on x86
       >
       > $ perf stat -e ref-cycles ls
       >   <not counted> ....
       >
       > fails systematically because the ref-cycles is now used by the
       > watchdog and given this is a system-wide pinned event, it monopolizes
       > the fixed counter 2 which is the only counter able to measure this event.
      
      Since the next merge window is near, fix the regression for now
      by reverting the commit.
      Reported-by: Stephane Eranian <eranian@google.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      44530d58
    • x86/quirks: Add early quirk to reset Apple AirPort card · abb2bafd
      Lukas Wunner committed
      The EFI firmware on Macs contains a full-fledged network stack for
      downloading OS X images from osrecovery.apple.com. Unfortunately,
      on Macs introduced in 2011 and 2012, EFI brings up the Broadcom 4331
      wireless card on every boot and leaves it enabled even after
      ExitBootServices has been called. The card continues to assert its IRQ
      line, causing spurious interrupts if the IRQ is shared. It also corrupts
      memory by DMAing received packets, allowing for remote code execution
      over the air. This only stops when a driver is loaded for the wireless
      card, which may be never if the driver is not installed or blacklisted.
      
      The issue seems to be constrained to the Broadcom 4331. Chris Milsted
      has verified that the newer Broadcom 4360 built into the MacBookPro11,3
      (2013/2014) does not exhibit this behaviour. The chances that Apple will
      ever supply a firmware fix for the older machines appear to be zero.
      
      The solution is to reset the card on boot by writing to a reset bit in
      its mmio space. This must be done as an early quirk and not as a plain
      vanilla PCI quirk to successfully combat memory corruption by DMAed
      packets: Matthew Garrett found out in 2012 that the packets are written
      to EfiBootServicesData memory (http://mjg59.dreamwidth.org/11235.html).
      This type of memory is made available to the page allocator by
      efi_free_boot_services(). Plain vanilla PCI quirks run much later, at
      subsys initcall level. In between, a time window would be open for memory
      corruption. Random crashes occurring in this time window and attributed
      to DMAed packets have indeed been observed in the wild by Chris
      Bainbridge.
      
      When Matthew Garrett analyzed the memory corruption issue in 2012, he
      sought to fix it with a grub quirk which transitions the card to D3hot:
      http://git.savannah.gnu.org/cgit/grub.git/commit/?id=9d34bb85da56
      
      This approach does not help users with other bootloaders and while it
      may prevent DMAed packets, it does not cure the spurious interrupts
      emanating from the card. Unfortunately the card's mmio space is
      inaccessible in D3hot, so to reset it, we have to undo the effect of
      Matthew's grub patch and transition the card back to D0.
      
      Note that the quirk takes a few shortcuts to reduce the amount of code:
      The size of BAR 0 and the location of the PM capability is identical
      on all affected machines and therefore hardcoded. Only the address of
      BAR 0 differs between models. Also, it is assumed that the BCMA core
      currently mapped is the 802.11 core. The EFI driver seems to always take
      care of this.
      
      Michael Büsch, Bjorn Helgaas and Matt Fleming contributed feedback
      towards finding the best solution to this problem.
      
      The following should be a comprehensive list of affected models:
          iMac13,1        2012  21.5"       [Root Port 00:1c.3 = 8086:1e16]
          iMac13,2        2012  27"         [Root Port 00:1c.3 = 8086:1e16]
          Macmini5,1      2011  i5 2.3 GHz  [Root Port 00:1c.1 = 8086:1c12]
          Macmini5,2      2011  i5 2.5 GHz  [Root Port 00:1c.1 = 8086:1c12]
          Macmini5,3      2011  i7 2.0 GHz  [Root Port 00:1c.1 = 8086:1c12]
          Macmini6,1      2012  i5 2.5 GHz  [Root Port 00:1c.1 = 8086:1e12]
          Macmini6,2      2012  i7 2.3 GHz  [Root Port 00:1c.1 = 8086:1e12]
          MacBookPro8,1   2011  13"         [Root Port 00:1c.1 = 8086:1c12]
          MacBookPro8,2   2011  15"         [Root Port 00:1c.1 = 8086:1c12]
          MacBookPro8,3   2011  17"         [Root Port 00:1c.1 = 8086:1c12]
          MacBookPro9,1   2012  15"         [Root Port 00:1c.1 = 8086:1e12]
          MacBookPro9,2   2012  13"         [Root Port 00:1c.1 = 8086:1e12]
          MacBookPro10,1  2012  15"         [Root Port 00:1c.1 = 8086:1e12]
          MacBookPro10,2  2012  13"         [Root Port 00:1c.1 = 8086:1e12]
      
      For posterity, spurious interrupts caused by the Broadcom 4331 wireless
      card resulted in splats like this (stacktrace omitted):
      
          irq 17: nobody cared (try booting with the "irqpoll" option)
          handlers:
          [<ffffffff81374370>] pcie_isr
          [<ffffffffc0704550>] sdhci_irq [sdhci] threaded [<ffffffffc07013c0>] sdhci_thread_irq [sdhci]
          [<ffffffffc0a0b960>] azx_interrupt [snd_hda_codec]
          Disabling IRQ #17
      
      Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=79301
      Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=111781
      Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=728916
      Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=895951#c16
      Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1009819
      Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1098621
      Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1149632#c5
      Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1279130
      Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1332732
      Tested-by: Konstantin Simanov <k.simanov@stlk.ru>        # [MacBookPro8,1]
      Tested-by: Lukas Wunner <lukas@wunner.de>                # [MacBookPro9,1]
      Tested-by: Bryan Paradis <bryan.paradis@gmail.com>       # [MacBookPro9,2]
      Tested-by: Andrew Worsley <amworsley@gmail.com>          # [MacBookPro10,1]
      Tested-by: Chris Bainbridge <chris.bainbridge@gmail.com> # [MacBookPro10,2]
      Signed-off-by: Lukas Wunner <lukas@wunner.de>
      Acked-by: Rafał Miłecki <zajec5@gmail.com>
      Acked-by: Matt Fleming <matt@codeblueprint.co.uk>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Bjorn Helgaas <bhelgaas@google.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Chris Milsted <cmilsted@redhat.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Matthew Garrett <mjg59@srcf.ucam.org>
      Cc: Michael Buesch <m@bues.ch>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: b43-dev@lists.infradead.org
      Cc: linux-pci@vger.kernel.org
      Cc: linux-wireless@vger.kernel.org
      Cc: stable@vger.kernel.org
      Cc: stable@vger.kernel.org # 123456789abc: x86/quirks: Apply nvidia_bugs quirk only on root bus
      Cc: stable@vger.kernel.org # 123456789abc: x86/quirks: Reintroduce scanning of secondary buses
      Link: http://lkml.kernel.org/r/48d0972ac82a53d460e5fce77a07b2560db95203.1465690253.git.lukas@wunner.de
      [ Did minor readability edits. ]
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      abb2bafd
    • x86/quirks: Reintroduce scanning of secondary buses · 850c3210
      Lukas Wunner committed
      We used to scan secondary buses until the following commit that
      was applied in 2009:
      
        8659c406 ("x86: only scan the root bus in early PCI quirks")
      
      which commit constrained early quirks to the root bus only. Its
      motivation was to prevent application of the nvidia_bugs quirk
      on secondary buses.
      
      We're about to add a quirk to reset the Broadcom 4331 wireless card on
      2011/2012 Macs, which is located on a secondary bus behind a PCIe root
      port. To facilitate that, reintroduce scanning of secondary buses.
      
      The commit message of 8659c406 notes that scanning only the root bus
      "saves quite some unnecessary scanning work". The algorithm used prior
      to 8659c406 was particularly time consuming because it scanned
      buses 0 to 31 brute force. To avoid lengthening boot time, employ a
      recursive strategy which only scans buses that are actually reachable
      from the root bus.
      
      Yinghai Lu pointed out that the secondary bus number read from a
      bridge's config space may be invalid, in particular a value of 0 would
      cause an infinite loop. The PCI core goes beyond that and recurses to a
      child bus only if its bus number is greater than the parent bus number
      (see pci_scan_bridge()). Since the root bus is numbered 0, this implies
      that secondary buses may not be 0. Do the same on early scanning.
      
      If this algorithm is found to significantly impact boot time or cause
      infinite loops on broken hardware, it would be possible to limit its
      recursion depth: The Broadcom 4331 quirk applies at depth 1, all others
      at depth 0, so the bus need not be scanned deeper than that for now. An
      alternative approach would be to revert to scanning only the root bus,
      and apply the Broadcom 4331 quirk to the root ports 8086:1c12, 8086:1e12
      and 8086:1e16. Apple always positioned the card behind one of these
      three ports. The quirk would then check for the presence of the card in slot 0
      below the root port and do its deed.
      Signed-off-by: Lukas Wunner <lukas@wunner.de>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Bjorn Helgaas <bhelgaas@google.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: linux-pci@vger.kernel.org
      Link: http://lkml.kernel.org/r/f0daa70dac1a9b2483abdb31887173eb6ab77bdf.1465690253.git.lukas@wunner.de
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      850c3210
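The recursive strategy described in the commit message above, including the guard against invalid secondary bus numbers, can be sketched in userspace. This is an illustration, not the kernel's early-quirks code: `struct fake_dev`, the `topology` table and `scan_bus()` are hypothetical stand-ins for real PCI config space reads.

```c
#include <stddef.h>

/* Hypothetical in-memory stand-in for PCI config space. */
struct fake_dev {
	unsigned bus;		/* bus the device sits on */
	int is_bridge;		/* nonzero for a PCI-to-PCI bridge */
	unsigned secondary;	/* secondary bus number read from the bridge */
};

static const struct fake_dev topology[] = {
	{ 0, 0, 0 },	/* endpoint on the root bus */
	{ 0, 1, 2 },	/* root port leading to bus 2 */
	{ 0, 1, 0 },	/* broken bridge reporting secondary bus 0 */
	{ 2, 0, 0 },	/* endpoint behind the root port (depth 1) */
	{ 2, 1, 2 },	/* broken bridge pointing back at its own bus */
	{ 5, 0, 0 },	/* bus 5 is not reachable from the root bus */
};

/* Visit every device reachable from `bus`; return how many were visited. */
static unsigned scan_bus(unsigned bus)
{
	unsigned visited = 0;
	size_t i;

	for (i = 0; i < sizeof(topology) / sizeof(topology[0]); i++) {
		const struct fake_dev *d = &topology[i];

		if (d->bus != bus)
			continue;
		visited++;	/* an early quirk would be matched here */

		/* Recurse only into a strictly higher-numbered bus. */
		if (d->is_bridge && d->secondary > bus)
			visited += scan_bus(d->secondary);
	}
	return visited;
}
```

Starting from bus 0, the sketch visits five devices: the three on the root bus and the two on bus 2. The two broken bridges do not cause a loop, and the unreachable bus 5 is never scanned, in contrast to a brute-force walk over buses 0 to 31.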
    • x86/quirks: Apply nvidia_bugs quirk only on root bus · 447d29d1
      Lukas Wunner committed
      Since the following commit:
      
        8659c406 ("x86: only scan the root bus in early PCI quirks")
      
      ... early quirks are only applied to devices on the root bus.
      
      The motivation was to prevent application of the nvidia_bugs quirk on
      secondary buses.
      
      We're about to reintroduce scanning of secondary buses for a quirk to
      reset the Broadcom 4331 wireless card on 2011/2012 Macs. To prevent
      regressions, open code the requirement to apply nvidia_bugs only on the
      root bus.
      Signed-off-by: Lukas Wunner <lukas@wunner.de>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Bjorn Helgaas <bhelgaas@google.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Link: http://lkml.kernel.org/r/4d5477c1d76b2f0387a780f2142bbcdd9fee869b.1465690253.git.lukas@wunner.de
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      447d29d1
  12. 10 Jul, 2016 6 commits