1. 06 Aug 2014, 1 commit
    • x86: MCE: Add raw_lock conversion again · ed5c41d3
      Authored by Thomas Gleixner
      Commit ea431643 ("x86/mce: Fix CMCI preemption bugs") breaks RT by
      the completely unrelated conversion of the cmci_discover_lock to a
      regular (non raw) spinlock.  This lock was annotated in commit
      59d958d2 ("locking, x86: mce: Annotate cmci_discover_lock as raw")
      with a proper explanation why.
      
      The argument for converting the lock back to a regular spinlock was:
      
       - it does percpu ops without disabling preemption. Preemption is not
         disabled due to the mistaken use of a raw spinlock.
      
      Which is complete nonsense.  The raw_spinlock is disabling preemption in
      the same way as a regular spinlock.  In mainline spinlock maps to
      raw_spinlock, in RT spinlock becomes a "sleeping" lock.
      
      raw_spinlock has on RT exactly the same semantics as in mainline.  And
      because this lock is taken in non preemptible context it must be raw on
      RT.
      
      Undo the locking brainfart.
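      
      A minimal sketch of the resulting shape; the critical-section body
      below is a hypothetical stand-in for the real CMCI discovery code:
      
        /* Declared raw again, so it stays a true spinning lock on RT. */
        static DEFINE_RAW_SPINLOCK(cmci_discover_lock);
        
        static void cmci_critical_section_example(void)
        {
                unsigned long flags;
        
                raw_spin_lock_irqsave(&cmci_discover_lock, flags);
                /* Per-CPU CMCI bank ownership is manipulated here with
                 * interrupts off; the context is non-preemptible on both
                 * mainline and RT, which is what raw_spinlock preserves. */
                raw_spin_unlock_irqrestore(&cmci_discover_lock, flags);
        }
      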
      Reported-by: Clark Williams <williams@redhat.com>
      Reported-by: Steven Rostedt <rostedt@goodmis.org>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: stable@vger.kernel.org
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  2. 31 Jul 2014, 1 commit
  3. 23 Jul 2014, 1 commit
    • x86, cpu: Fix cache topology for early P4-SMT · 2a226155
      Authored by Peter Zijlstra
      P4 systems with cpuid level < 4 can have SMT, but the cache topology
      description available (cpuid2) does not include SMP information.
      
      Now we know that SMT shares all cache levels, and therefore we can
      mark all available cache levels as shared.
      
      We do this by setting cpu_llc_id to ->phys_proc_id, since that is
      the same for each SMT thread. We can do this unconditionally,
      since if there is no SMT it still holds: the one CPU shares cache
      only with itself.
      
      This fixes a problem where such CPUs report an incorrect LLC CPU mask.
      
      This in turn fixes a crash in the scheduler, where the topology
      was built wrong: it assumes the LLC mask includes at least the SMT
      CPUs.
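      
      A sketch of the cpu_llc_id assignment described above (the helper
      name is illustrative; the real change lives in the Intel
      cache-topology setup code):
      
        /* With cpuid level < 4 there is no per-level shared-CPU mask, so
         * treat every cache level as shared across the physical package:
         * the LLC id becomes the package id, which is identical for all
         * SMT siblings and trivially correct when there is no SMT. */
        static void p4_set_llc_id(struct cpuinfo_x86 *c, unsigned int cpu)
        {
                per_cpu(cpu_llc_id, cpu) = c->phys_proc_id;
        }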
      
      Cc: Josh Boyer <jwboyer@redhat.com>
      Cc: Dietmar Eggemann <dietmar.eggemann@arm.com>
      Tested-by: Bruno Wolff III <bruno@wolff.to>
      Signed-off-by: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/20140722133514.GM12054@laptop.lan
      Signed-off-by: H. Peter Anvin <hpa@zytor.com>
  4. 22 Jul 2014, 1 commit
  5. 16 Jul 2014, 5 commits
  6. 15 Jul 2014, 2 commits
  7. 05 Jul 2014, 1 commit
  8. 02 Jul 2014, 1 commit
    • perf/x86/intel: ignore CondChgd bit to avoid false NMI handling · b292d7a1
      Authored by HATAYAMA Daisuke
      Currently, any NMI is falsely handled by the NMI watchdog's NMI
      handler if the CondChgd bit in the MSR_CORE_PERF_GLOBAL_STATUS MSR
      is set.
      
      For example, we use an external NMI to make the system panic and
      get a crash dump, but in this case the external NMI is falsely
      handled due to this issue.
      
      This commit deals with the issue simply by ignoring CondChgd bit.
      
      Here is explanation in detail.
      
      On x86, the NMI watchdog uses the performance monitoring feature
      to periodically signal an NMI each time a performance counter
      overflows.
      
      intel_pmu_handle_irq() is called as an NMI_LOCAL handler from the
      NMI watchdog's NMI handler, perf_event_nmi_handler(). It
      identifies the owner of a given NMI by looking at the overflow
      status bits in the MSR_CORE_PERF_GLOBAL_STATUS MSR. If some of
      the bits are set, it handles the given NMI as its own.
      
      The problem is that intel_pmu_handle_irq() does not distinguish
      the CondChgd bit from the other bits. Unlike the other status
      bits, CondChgd does not represent overflow status for performance
      counters. Thus, the CondChgd bit cannot be taken as a mark that a
      given NMI is the NMI watchdog's.
      
      As a result, if the CondChgd bit is set, any NMI is falsely
      handled by the NMI watchdog's NMI handler. Also, if the type of
      the falsely handled NMI is NMI_UNKNOWN, NMI_SERR or NMI_IO_CHECK,
      the corresponding action is never performed until the CondChgd
      bit is cleared.
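      
      A minimal standalone sketch of the idea (not the exact upstream
      diff; the macro name here is illustrative):
      
        #include <stdint.h>
        
        /* CondChgd is bit 63 of MSR_CORE_PERF_GLOBAL_STATUS. */
        #define GLOBAL_STATUS_COND_CHG  (UINT64_C(1) << 63)
        
        /* Mask CondChgd off before the ownership test so a set bit 63
         * alone can never make the PMU handler claim an unrelated NMI. */
        static inline uint64_t pmu_overflow_bits(uint64_t global_status)
        {
                return global_status & ~GLOBAL_STATUS_COND_CHG;
        }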
      
      I noticed this behavior on systems with Ivy Bridge processors:
      Intel Xeon CPU E5-2630 v2 and Intel Xeon CPU E7-8890 v2. On both
      systems, the CondChgd bit in the MSR_CORE_PERF_GLOBAL_STATUS MSR
      is already set at boot. It is then immediately cleared by the
      next wrmsr to the MSR_CORE_PERF_GLOBAL_CTRL MSR and appears to
      remain 0.
      
      On the other hand, on older processors such as Nehalem (Xeon
      E7540), the CondChgd bit is not set at boot.
      
      I'm not sure about the exact behavior of the CondChgd bit, in
      particular when it is set. Although I read the Intel System
      Programmer's Manual to figure that out, the descriptions I found
      are:
      
        In 18.9.1:
      
        "The MSR_PERF_GLOBAL_STATUS MSR also provides a 'sticky bit' to
         indicate changes to the state of performance monitoring
         hardware"
      
        In Table 35-2, IA-32 Architectural MSRs:
      
        63 CondChg: status bits of this register has changed.
      
      These are different from the behaviour I see on the actual
      systems, as explained above.
      
      At least, I think ignoring the CondChgd bit should be enough from
      the NMI watchdog's perspective.
      Signed-off-by: HATAYAMA Daisuke <d.hatayama@jp.fujitsu.com>
      Acked-by: Don Zickus <dzickus@redhat.com>
      Signed-off-by: Peter Zijlstra <peterz@infradead.org>
      Cc: <stable@vger.kernel.org>
      Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: linux-kernel@vger.kernel.org
      Link: http://lkml.kernel.org/r/20140625.103503.409316067.d.hatayama@jp.fujitsu.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  9. 24 Jun 2014, 1 commit
  10. 23 Jun 2014, 1 commit
  11. 19 Jun 2014, 1 commit
  12. 09 Jun 2014, 1 commit
  13. 05 Jun 2014, 4 commits
  14. 31 May 2014, 2 commits
  15. 30 May 2014, 1 commit
    • x86/xsaves: Detect xsaves/xrstors feature · 6229ad27
      Authored by Fenghua Yu
      Detect the xsaveopt, xsavec, xgetbv, and xsaves features in the
      processor extended state enumeration sub-leaf (eax=0x0d, ecx=1):
      Bit 00: XSAVEOPT is available
      Bit 01: Supports XSAVEC and the compacted form of XRSTOR if set
      Bit 02: Supports XGETBV with ECX = 1 if set
      Bit 03: Supports XSAVES/XRSTORS and IA32_XSS if set
      
      The above features are defined in the new word 10 in cpu features.
      
      The IA32_XSS MSR (index DA0H) contains a state-component bitmap that specifies
      the state components that software has enabled xsaves and xrstors to manage.
      If the bit corresponding to a state component is clear in XCR0 | IA32_XSS,
      xsaves and xrstors will not operate on that state component, regardless of
      the value of the instruction mask.
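      
      A userspace sketch of reading that sub-leaf with GCC/Clang's
      <cpuid.h> (x86 only; checking the maximum supported leaf is
      omitted for brevity):
      
        #include <cpuid.h>
        #include <stdio.h>
        
        int main(void)
        {
                unsigned int eax, ebx, ecx, edx;
        
                /* CPUID.(EAX=0DH, ECX=1): extended-state enumeration */
                __cpuid_count(0x0d, 1, eax, ebx, ecx, edx);
        
                printf("XSAVEOPT       : %u\n",  eax       & 1);
                printf("XSAVEC/XRSTOR  : %u\n", (eax >> 1) & 1);
                printf("XGETBV(ECX=1)  : %u\n", (eax >> 2) & 1);
                printf("XSAVES/XRSTORS : %u\n", (eax >> 3) & 1);
                return 0;
        }
      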
      Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
      Link: http://lkml.kernel.org/r/1401387164-43416-3-git-send-email-fenghua.yu@intel.com
      Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
  16. 21 May 2014, 1 commit
  17. 19 May 2014, 1 commit
  18. 12 May 2014, 1 commit
  19. 07 May 2014, 1 commit
  20. 06 May 2014, 3 commits
  21. 24 Apr 2014, 3 commits
  22. 18 Apr 2014, 2 commits
  23. 17 Apr 2014, 1 commit
  24. 15 Apr 2014, 1 commit
    • x86, irq, pic: Probe for legacy PIC and set legacy_pic appropriately · e179f691
      Authored by K. Y. Srinivasan
      The legacy PIC may or may not be available and we need a mechanism to
      detect the existence of the legacy PIC that is applicable for all
      hardware (both physical as well as virtual) currently supported by
      Linux.
      
      On Hyper-V, the legacy firmware presented to guests emulates the
      legacy PIC, while the EFI-based firmware does not. To support
      Hyper-V EFI firmware, we had to set legacy_pic to null_legacy_pic,
      since we had to bypass PIC-based calibration in the early boot
      code. While on the EFI firmware we know we don't emulate the
      legacy PIC, we need a generic mechanism to detect its presence
      that is not based on boot-time state; this became apparent when we
      tried to get kexec to work on Hyper-V EFI firmware.
      
      This patch implements the proposal put forth by H. Peter Anvin
      <hpa@linux.intel.com>: write a known value to the PIC data port
      and read it back. If the value read back is the value written, we
      do have a PIC; if not, there is no PIC and we can safely set
      legacy_pic to null_legacy_pic. Since a read from an unconnected
      I/O port returns 0xff, we use ~(1 << PIC_CASCADE_IR) (0xfb: mask
      all lines except the cascade line) to probe for the existence of
      the PIC.
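      
      A kernel-style sketch of that probe (outb()/inb() as in asm/io.h;
      the exact upstream code, its locking, and constants may differ):
      
        #define PIC_MASTER_IMR  0x21    /* master 8259A data/IMR port */
        #define PIC_CASCADE_IR  2       /* IRQ2: cascade to the slave PIC */
        
        static int probe_8259A_present(void)
        {
                unsigned char probe_val = ~(1 << PIC_CASCADE_IR);  /* 0xfb */
        
                /* Write a mask that cannot read back as 0xff, then read
                 * it back: with no PIC the floating bus returns 0xff. */
                outb(probe_val, PIC_MASTER_IMR);
                return inb(PIC_MASTER_IMR) == probe_val;
        }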
      
      In version V1 of the patch, I had cleaned up the code based on comments from Peter.
      In version V2 of the patch, I have addressed additional comments from Peter.
      In version V3 of the patch, I have addressed Jan's comments (JBeulich@suse.com).
      In version V4 of the patch, I have addressed additional comments from Peter.
      Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
      Link: http://lkml.kernel.org/r/1397501029-29286-1-git-send-email-kys@microsoft.com
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
  25. 02 Apr 2014, 1 commit
  26. 29 Mar 2014, 1 commit
    • x86, CMCI: Add proper detection of end of CMCI storms · 27f6c573
      Authored by Chen, Gong
      When a CMCI storm persists for a long time (beyond a predefined
      threshold, currently 30 seconds), we can see the storm being
      detected again immediately after it subsides.
      
      ...
      Dec 10 22:04:29 kernel: CMCI storm detected: switching to poll mode
      Dec 10 22:04:59 kernel: CMCI storm subsided: switching to interrupt mode
      Dec 10 22:04:59 kernel: CMCI storm detected: switching to poll mode
      Dec 10 22:05:29 kernel: CMCI storm subsided: switching to interrupt mode
      ...
      
      The problem is that our logic that determines that the storm has
      ended is incorrect. We announce the end, re-enable interrupts and
      realize that the storm is still going on, so we switch back to
      polling mode. Rinse, repeat.
      
      When a storm happens we disable signaling of errors via CMCI and begin
      polling machine check banks instead. If we find any logged errors,
      then we need to set a per-cpu flag so that our per-cpu tests that
      check whether the storm is ongoing will see that errors are still
      being logged independently of whether mce_notify_irq() says that the
      error has been fully processed.
      
      cmci_clear() is not the right tool to disable a bank. It disables
      the interrupt for the bank as desired, but it also clears the bit
      for this bank in "mce_banks_owned", so we would skip the bank when
      polling (and thus fail to see that the storm continues, because we
      stop looking). The new cmci_storm_disable_banks(), sketched below,
      just disables the interrupt while allowing polling to continue.
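      
      A sketch of what such a helper looks like, reconstructed from the
      description above rather than copied verbatim from the patch:
      
        /* Disable the CMCI interrupt for every bank this CPU owns, but
         * do NOT clear mce_banks_owned, so the polling loop keeps
         * checking the banks and can see the storm is still going on. */
        static void cmci_storm_disable_banks(void)
        {
                unsigned long flags, *owned;
                int bank;
                u64 val;
        
                raw_spin_lock_irqsave(&cmci_discover_lock, flags);
                owned = this_cpu_ptr(mce_banks_owned);
                for_each_set_bit(bank, owned, MAX_NR_BANKS) {
                        rdmsrl(MSR_IA32_MCx_CTL2(bank), val);
                        val &= ~MCI_CTL2_CMCI_EN;
                        wrmsrl(MSR_IA32_MCx_CTL2(bank), val);
                }
                raw_spin_unlock_irqrestore(&cmci_discover_lock, flags);
        }
      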
      Reported-by: William Dauchy <wdauchy@gmail.com>
      Signed-off-by: Chen, Gong <gong.chen@linux.intel.com>
      Signed-off-by: Tony Luck <tony.luck@intel.com>