1. 14 Jan 2017 · 1 commit
  2. 25 Dec 2016 · 1 commit
  3. 15 Dec 2016 · 2 commits
    • x86/tsc: Force TSC_ADJUST register to value >= zero · 5bae1562
      Thomas Gleixner authored
      Roland reported that his DELL T5810 sports a value-add BIOS which
      completely wrecks the TSC. The squirmware [(TM) Ingo Molnar] boots with
      random negative TSC_ADJUST values, different on all CPUs. That renders the
      TSC useless because the synchronization check fails.
      
      Roland tested the new TSC_ADJUST mechanism. While it manages to readjust
      the TSCs, he needs to disable the TSC deadline timer; otherwise the
      machine just stops booting.
      
      Deeper investigation unearthed that the TSC deadline timer is sensitive to
      the TSC_ADJUST value. Writing TSC_ADJUST to a negative value results in an
      interrupt storm caused by the TSC deadline timer.
      
      This does not make any sense and it's hard to imagine what kind of hardware
      wreckage is behind that misfeature, but it's reliably reproducible on other
      systems which have TSC_ADJUST and TSC deadline timer.
      
      While it would be understandable that a big enough negative value which
      moves the resulting TSC readout into the negative space could have the
      described effect, this happens even with an adjust value of -1, which keeps
      the TSC readout definitely in the positive space. The compare register for
      the TSC deadline timer is set to a positive value larger than the TSC, but
      despite not having reached the deadline the interrupt is raised
      immediately. If this happens on the boot CPU, then the machine dies
      silently because this setup happens before the NMI watchdog is armed.
      
      Further experiments showed that any other adjustment of TSC_ADJUST works as
      expected as long as it stays in the positive range. The direction of the
      adjustment has no influence either. See the lkml link for further analysis.
      
      Yet another proof for the theory that timers are designed by janitors and
      the underlying (obviously undocumented) mechanisms which allow BIOSes to
      wreckage them are considered a feature. Well done Intel - NOT!
      
      To address this wreckage add the following sanity measures:
      
      - If the TSC_ADJUST value on the boot CPU is not 0, set it to 0
      
      - If the TSC_ADJUST value on any CPU is negative, set it to 0
      
      - Prevent the cross package synchronization mechanism from setting negative
        TSC_ADJUST values.
      Reported-and-tested-by: Roland Scheidegger <rscheidegger_lists@hispeed.ch>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: Bruce Schlobohm <bruce.schlobohm@intel.com>
      Cc: Kevin Stanton <kevin.b.stanton@intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Allen Hung <allen_hung@dell.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Link: http://lkml.kernel.org/r/20161213131211.397588033@linutronix.de
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      5bae1562
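
      A minimal sketch of the clamping policy above, loosely modeled on the
      TSC sync code; the function name and messages are illustrative, not the
      exact upstream implementation:

          /* Sketch: sanitize TSC_ADJUST at CPU bringup (illustrative) */
          static void tsc_sanitize_adjust(bool bootcpu)
          {
                  s64 adj;

                  if (!boot_cpu_has(X86_FEATURE_TSC_ADJUST))
                          return;

                  rdmsrl(MSR_IA32_TSC_ADJUST, adj);

                  /*
                   * Boot CPU: force a clean baseline of 0. Any CPU: never
                   * tolerate a negative value, it upsets the TSC deadline
                   * timer.
                   */
                  if ((bootcpu && adj != 0) || adj < 0) {
                          pr_warn("TSC ADJUST: CPU%d: %lld, forcing to 0\n",
                                  smp_processor_id(), adj);
                          wrmsrl(MSR_IA32_TSC_ADJUST, 0);
                  }
          }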
    • x86/tsc: Validate TSC_ADJUST after resume · 6a369583
      Thomas Gleixner authored
      Some 'feature' BIOSes fiddle with the TSC_ADJUST register during
      suspend/resume, which renders the TSC unusable.
      
      Add sanity checks into the resume path and restore the
      original value if it was adjusted.
      Reported-and-tested-by: Roland Scheidegger <rscheidegger_lists@hispeed.ch>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: Bruce Schlobohm <bruce.schlobohm@intel.com>
      Cc: Kevin Stanton <kevin.b.stanton@intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Allen Hung <allen_hung@dell.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Link: http://lkml.kernel.org/r/20161213131211.317654500@linutronix.de
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      6a369583
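
      A rough sketch of the added resume check, assuming the boot value was
      saved when the CPU first came up (the helper name is illustrative):

          /* Sketch: re-validate TSC_ADJUST after resume (illustrative) */
          static u64 saved_tsc_adjust;    /* captured at CPU bringup */

          static void tsc_verify_adjust_on_resume(void)
          {
                  u64 cur;

                  if (!boot_cpu_has(X86_FEATURE_TSC_ADJUST))
                          return;

                  rdmsrl(MSR_IA32_TSC_ADJUST, cur);
                  if (cur != saved_tsc_adjust) {
                          /* BIOS/SMM fiddled with the MSR across suspend: undo it */
                          wrmsrl(MSR_IA32_TSC_ADJUST, saved_tsc_adjust);
                          pr_warn("TSC ADJUST changed across suspend: %llu != %llu\n",
                                  cur, saved_tsc_adjust);
                  }
          }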
  4. 30 Nov 2016 · 2 commits
    • x86/tsc: Store and check TSC ADJUST MSR · 8b223bc7
      Thomas Gleixner authored
      The TSC_ADJUST MSR shows whether the TSC has been modified. This is helpful
      in two respects:
      
      1) It allows detection of BIOS wreckage, where SMM code tries to 'hide' the
         cycles spent by storing the TSC value at SMM entry and restoring it at
         SMM exit. On affected machines the TSCs run slowly out of sync up to the
         point where the clocksource watchdog (if available) detects it.
      
         The TSC_ADJUST MSR allows the modification to be detected before that
         point and the original value to be restored. This is also important for
         SoCs which have no watchdog clocksource, where TSC wreckage would
         otherwise go undetected and unhandled.
      
      2) All threads in a package are required to have the same TSC_ADJUST
         value. Broken BIOSes break that and as a result the TSC synchronization
         check fails.
      
         The TSC_ADJUST MSR allows the deviation to be detected when a CPU
         comes online. If one is detected, set the MSR to the value of an
         already online CPU in the same package. This also allows the number of
         sync tests to be reduced, because with that in place the test is only
         required for the first CPU in a package.
      
         In principle all CPUs in a system should have the same TSC_ADJUST
         value, even across packages, but with physical CPU hotplug this
         assumption is not true because the TSC starts with power on, so
         physical hotplug has to do some trickery to bring the TSC into sync
         with already running packages, which requires using a TSC_ADJUST value
         different from that of CPUs which were powered up earlier.
      
         A final enhancement is the opportunity to compensate for unsynced
         TSCs across nodes at boot time and thereby make the TSC usable. That
         won't help for TSCs which drift apart due to frequency skew between
         packages, but such skew is detected by the clocksource watchdog later.
      
      The first step toward this is to store the TSC_ADJUST value of a starting
      CPU and compare it with the value of an already online CPU in the same
      package. If they differ, emit a warning and adjust it to the reference
      value. The !SMP version just stores the boot value for later verification.
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Reviewed-by: Ingo Molnar <mingo@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Link: http://lkml.kernel.org/r/20161119134017.655323776@linutronix.de
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      8b223bc7
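
      A condensed sketch of that first step (per-CPU storage plus the
      in-package comparison); the real code in arch/x86/kernel/tsc_sync.c
      carries more state:

          /* Sketch: per-CPU TSC_ADJUST bookkeeping (simplified) */
          struct tsc_adjust {
                  s64 bootval;    /* MSR value seen when the CPU came up */
          };
          static DEFINE_PER_CPU(struct tsc_adjust, tsc_adjust);

          static void tsc_store_and_check_adjust(int refcpu)
          {
                  s64 bootval, refval;

                  rdmsrl(MSR_IA32_TSC_ADJUST, bootval);
                  this_cpu_write(tsc_adjust.bootval, bootval);

                  if (refcpu < 0)
                          return; /* first CPU in the package: nothing to compare */

                  /* All threads in a package must agree on TSC_ADJUST */
                  refval = per_cpu(tsc_adjust, refcpu).bootval;
                  if (bootval != refval) {
                          pr_warn("TSC ADJUST differs within package: %lld != %lld\n",
                                  bootval, refval);
                          wrmsrl(MSR_IA32_TSC_ADJUST, refval);
                  }
          }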
    • x86/tsc: Use X86_FEATURE_TSC_ADJUST in detect_art() · 7b3d2f6e
      Thomas Gleixner authored
      The ART detection code uses rdmsrl_safe() to detect the availability of the
      TSC_ADJUST MSR.
      
      That's pointless because we have a feature bit for this. Use it.
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Reviewed-by: Ingo Molnar <mingo@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Link: http://lkml.kernel.org/r/20161119134017.483561692@linutronix.de
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      7b3d2f6e
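
      The change boils down to replacing a speculative MSR probe with the
      feature-bit test, roughly (a sketch, not the verbatim diff):

          /* Before: probe the MSR just to learn whether it exists */
          u64 tsc_adjust;
          if (rdmsrl_safe(MSR_IA32_TSC_ADJUST, &tsc_adjust))
                  return;         /* MSR not present */

          /* After: consult the CPUID-derived feature bit directly */
          if (!boot_cpu_has(X86_FEATURE_TSC_ADJUST))
                  return;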
  5. 18 Nov 2016 · 4 commits
  6. 20 Sep 2016 · 2 commits
  7. 10 Aug 2016 · 1 commit
    • x86/timers/apic: Inform TSC deadline clockevent device about recalibration · 6731b0d6
      Nicolai Stange authored
      This patch eliminates a source of imprecision in the APIC timer
      interrupts; the imprecision may result in double interrupts or even
      late interrupts.
      
      The TSC deadline clockevent devices' configuration and registration
      happen before the TSC frequency calibration is refined in
      tsc_refine_calibration_work().
      
      This results in the TSC clocksource and the TSC deadline clockevent
      devices being configured with slightly different frequencies: the former
      gets the refined one and the latter are configured with the inaccurate
      frequency detected earlier by means of the "Fast TSC calibration using PIT".
      
      Within the APIC code, introduce the notifier function
      lapic_update_tsc_freq() which reconfigures all per-CPU TSC deadline
      clockevent devices with the current tsc_khz.
      
      Call it from the TSC code after TSC calibration refinement has happened.
      Signed-off-by: Nicolai Stange <nicstange@gmail.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Acked-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Christopher S. Hall <christopher.s.hall@intel.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Hidehiro Kawai <hidehiro.kawai.ez@hitachi.com>
      Cc: Len Brown <len.brown@intel.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Viresh Kumar <viresh.kumar@linaro.org>
      Link: http://lkml.kernel.org/r/20160714152255.18295-3-nicstange@gmail.com
      [ Pushed #ifdef CONFIG_X86_LOCAL_APIC into header, improved changelog. ]
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      6731b0d6
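
      Conceptually the notifier looks like this (a sketch which assumes the
      per-CPU lapic_events clockevent; the upstream body differs in detail):

          /* Sketch: push the refined tsc_khz into the deadline clockevents */
          static void __lapic_update_tsc_freq(void *info)
          {
                  struct clock_event_device *levt = this_cpu_ptr(&lapic_events);

                  if (!this_cpu_has(X86_FEATURE_TSC_DEADLINE_TIMER))
                          return;

                  /* Reprogram this CPU's clockevent with the refined frequency */
                  clockevents_update_freq(levt, tsc_khz * 1000);
          }

          void lapic_update_tsc_freq(void)
          {
                  /* Called from the TSC code once calibration refinement is done */
                  on_each_cpu(__lapic_update_tsc_freq, NULL, 0);
          }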
  8. 15 Jul 2016 · 1 commit
  9. 14 Jul 2016 · 1 commit
    • x86/kernel: Audit and remove any unnecessary uses of module.h · 186f4360
      Paul Gortmaker authored
      Historically a lot of these existed because we did not have
      a distinction between what was modular code and what was providing
      support to modules via EXPORT_SYMBOL and friends.  That changed
      when we forked out support for the latter into the export.h file.
      
      This means we should be able to reduce the usage of module.h
      in code that is obj-y Makefile or bool Kconfig.  The advantage
      in doing so is that module.h itself sources about 15 other headers,
      adding significantly to what we feed cpp, and it can obscure which
      headers we are effectively using.
      
      Since module.h was the source for init.h (for __init) and for
      export.h (for EXPORT_SYMBOL) we consider each obj-y/bool instance
      for the presence of either and replace as needed.  Build testing
      revealed some implicit header usage that was fixed up accordingly.
      
      Note that some bool/obj-y instances remain since module.h is
      the header for some exception table entry stuff, and for things
      like __init_or_module (code that is tossed when MODULES=n).
      Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/20160714001901.31603-4-paul.gortmaker@windriver.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      186f4360
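
      The mechanical pattern of the conversion, as an illustration (the file
      contents below are hypothetical):

          /* Before: module.h pulled in only for EXPORT_SYMBOL and __init */
          #include <linux/module.h>

          /* After: include exactly what the file uses */
          #include <linux/export.h>       /* EXPORT_SYMBOL */
          #include <linux/init.h>         /* __init, early_initcall */

          int foo_read_counter(void)      /* hypothetical exported helper */
          {
                  return 42;
          }
          EXPORT_SYMBOL(foo_read_counter);

          static int __init foo_setup(void)
          {
                  return 0;
          }
          early_initcall(foo_setup);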
  10. 12 Jul 2016 · 3 commits
  11. 13 Apr 2016 · 3 commits
  12. 18 Mar 2016 · 1 commit
  13. 04 Mar 2016 · 1 commit
    • x86/tsc: Always Running Timer (ART) correlated clocksource · f9677e0f
      Christopher S. Hall authored
      On modern Intel systems TSC is derived from the new Always Running Timer
      (ART). ART can be captured simultaneously with the capture of
      audio and network device clocks, allowing a correlation between timebases
      to be constructed. Upon capture, the driver converts the captured ART
      value to the appropriate system clock using the correlated clocksource
      mechanism.
      
      On systems that support ART a new CPUID leaf (0x15) returns parameters
      “m” and “n” such that:
      
      TSC_value = (ART_value * m) / n + k [n >= 1]
      
      [k is an offset that can be adjusted by a privileged agent. The
      IA32_TSC_ADJUST MSR is an example of an interface to adjust k.
      See section 17.14.4 of the Intel SDM for more details.]
      
      Cc: Prarit Bhargava <prarit@redhat.com>
      Cc: Richard Cochran <richardcochran@gmail.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: kevin.b.stanton@intel.com
      Cc: kevin.j.clarke@intel.com
      Cc: hpa@zytor.com
      Cc: jeffrey.t.kirsher@intel.com
      Cc: netdev@vger.kernel.org
      Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Christopher S. Hall <christopher.s.hall@intel.com>
      [jstultz: Tweaked to fix build issue, also reworked math for
      64bit division on 32bit systems, as well as !CONFIG_CPU_FREQ build
      fixes]
      Signed-off-by: John Stultz <john.stultz@linaro.org>
      f9677e0f
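
      A sketch of the conversion this enables, following the formula above
      with the multiply split so 32-bit builds avoid a 128-bit intermediate
      (names are illustrative):

          /* Sketch: TSC_value = (ART_value * m) / n + k, per CPUID leaf 0x15 */
          static u32 art_to_tsc_numerator;        /* m: CPUID.15H EBX */
          static u32 art_to_tsc_denominator;      /* n: CPUID.15H EAX, n >= 1 */
          static u64 art_to_tsc_offset;           /* k: privileged-agent offset */

          static u64 art_to_tsc(u64 art)
          {
                  u64 res, tmp;

                  /* Split the multiplication so the intermediate fits in 64 bits */
                  res  = (art / art_to_tsc_denominator) * art_to_tsc_numerator;
                  tmp  = (art % art_to_tsc_denominator) * art_to_tsc_numerator;
                  res += tmp / art_to_tsc_denominator;

                  return res + art_to_tsc_offset;
          }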
  14. 24 Feb 2016 · 1 commit
  15. 22 Feb 2016 · 1 commit
  16. 19 Nov 2015 · 1 commit
  17. 20 Oct 2015 · 1 commit
    • perf/x86: Fix time_shift in perf_event_mmap_page · b9511cd7
      Adrian Hunter authored
      Commit:
      
        b20112ed ("perf/x86: Improve accuracy of perf/sched clock")
      
      allowed the time_shift value in perf_event_mmap_page to be as much
      as 32.  Unfortunately the documented algorithms for using time_shift
      have it shifting an integer, whereas to work correctly with the value
      32, the type must be u64.
      
      In the case of perf tools, Intel PT decodes correctly but the timestamps
      that are output (for example by perf script) have lost 32 bits of
      granularity so they look like they are not changing at all.
      
      Fix by limiting the shift to 31 and adjusting the multiplier accordingly.
      
      Also update the documentation of perf_event_mmap_page so that new code
      based on it will be more future-proof.
      Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Fixes: b20112ed ("perf/x86: Improve accuracy of perf/sched clock")
      Link: http://lkml.kernel.org/r/1445001845-13688-2-git-send-email-adrian.hunter@intel.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      b9511cd7
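
      For reference, the documented cycles-to-time conversion using the
      perf_event_mmap_page fields, which is only safe for time_shift values
      up to 31 when the low-bits mask is computed as below (a sketch; cyc and
      the time_* values come from the mmap page):

          /* Sketch: userspace cycles -> time via the mmap page fields */
          u64 quot, rem, delta;

          quot  = cyc >> time_shift;                  /* shift capped at 31 */
          rem   = cyc & (((u64)1 << time_shift) - 1); /* u64 math, no UB */
          delta = time_offset + quot * time_mult +
                  ((rem * time_mult) >> time_shift);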
  18. 16 Sep 2015 · 1 commit
  19. 13 Sep 2015 · 1 commit
  20. 04 Aug 2015 · 1 commit
  21. 03 Aug 2015 · 1 commit
  22. 06 Jul 2015 · 5 commits
  23. 23 Jan 2015 · 1 commit
  24. 23 Oct 2014 · 1 commit
  25. 24 Jul 2014 · 1 commit
    • clocksource: Move cycle_last validation to core code · 09ec5442
      Thomas Gleixner authored
      The only user of the cycle_last validation is the x86 TSC. In order to
      provide NMI-safe accessor functions for clock monotonic and
      monotonic_raw, we need to do that validation in the core.
      
      We can't do the TSC specific
      
          if (now < cycle_last)
                  now = cycle_last;
      
      for the other wrapping-around clocksources, but the TSC has
      CLOCKSOURCE_MASK(64), which actually does not mask out anything, so if
      now is less than cycle_last the subtraction gives a negative result.
      So we can check for that in clocksource_delta() and return 0 in that
      case.
      
      Implement and enable it for x86.
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: John Stultz <john.stultz@linaro.org>
      09ec5442
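
      A sketch of the resulting core helper; this mirrors the shape of the
      upstream clocksource_delta(), where the guard is an architecture opt-in:

          /* Sketch: masked delta that treats backward motion as 0 elapsed */
          static inline u64 clocksource_delta(u64 now, u64 last, u64 mask)
          {
                  u64 ret = (now - last) & mask;

                  /*
                   * With CLOCKSOURCE_MASK(64) nothing is masked out, so a
                   * 'now' behind 'last' wraps into the upper half of the
                   * range. Clamp that to 0 instead of returning a huge
                   * bogus delta.
                   */
                  return ret & ~(mask >> 1) ? 0 : ret;
          }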
  26. 02 Jul 2014 · 1 commit
    • x86, tsc: Fix cpufreq lockup · 3896c329
      Peter Zijlstra authored
      Mauro reported that his AMD X2 using the powernow-k8 cpufreq driver
      locked up when doing CPU hotplug.
      
      Because we called set_cyc2ns_scale() from the time_cpufreq_notifier()
      unconditionally, it gets called multiple times for each freq change,
      instead of only once, when the tsc_khz value actually changes.
      
      Because it gets called more than once, we run out of cyc2ns data slots
      and stall, waiting for a free one, but because we're halfway offline,
      there are no consumers to free slots.
      
      By placing the call inside the condition that actually changes tsc_khz
      we avoid superfluous calls and avoid the problem.
      Reported-by: Mauro <registosites@hotmail.com>
      Tested-by: Mauro <registosites@hotmail.com>
      Fixes: 20d1c86a ("sched/clock, x86: Rewrite cyc2ns() to avoid the need to disable IRQs")
      Cc: <stable@vger.kernel.org>
      Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
      Cc: Viresh Kumar <viresh.kumar@linaro.org>
      Cc: Bin Gao <bin.gao@intel.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Mika Westerberg <mika.westerberg@linux.intel.com>
      Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
      Cc: Stefani Seibold <stefani@seibold.net>
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: Peter Zijlstra <peterz@infradead.org>
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      3896c329
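
      The shape of the fix, condensed from time_cpufreq_notifier() in
      arch/x86/kernel/tsc.c (the ref_freq bookkeeping and unstable-TSC
      marking are elided, so treat this as a sketch):

          /* Sketch: rescale cyc2ns only when tsc_khz actually changes */
          static int time_cpufreq_notifier(struct notifier_block *nb,
                                           unsigned long val, void *data)
          {
                  struct cpufreq_freqs *freq = data;

                  if ((val == CPUFREQ_PRECHANGE  && freq->old < ref_freq) ||
                      (val == CPUFREQ_POSTCHANGE && freq->new > ref_freq)) {
                          tsc_khz = cpufreq_scale(tsc_khz_ref, ref_freq, freq->new);

                          /*
                           * Inside the condition: runs once per real frequency
                           * change, so the cyc2ns data slots are not exhausted.
                           */
                          set_cyc2ns_scale(tsc_khz, freq->cpu);
                  }
                  return 0;
          }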