1. 31 7月, 2018 1 次提交
  2. 20 7月, 2018 6 次提交
    • P
      x86/tsc: Make use of tsc_calibrate_cpu_early() · 8dbe4385
      Pavel Tatashin 提交于
      During early boot enable tsc_calibrate_cpu_early() and switch to
      tsc_calibrate_cpu() only later. Do this unconditionally, because it is
      unknown what methods other cpus will use to calibrate once they are
      onlined.
      
      If by the time tsc_init() is called tsc frequency is still unknown do only
      pit_hpet_ptimer_calibrate_cpu() to calibrate, as this function contains the
      only methods wich have not been called and tried earlier.
      Signed-off-by: NPavel Tatashin <pasha.tatashin@oracle.com>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Cc: steven.sistare@oracle.com
      Cc: daniel.m.jordan@oracle.com
      Cc: linux@armlinux.org.uk
      Cc: schwidefsky@de.ibm.com
      Cc: heiko.carstens@de.ibm.com
      Cc: john.stultz@linaro.org
      Cc: sboyd@codeaurora.org
      Cc: hpa@zytor.com
      Cc: douly.fnst@cn.fujitsu.com
      Cc: peterz@infradead.org
      Cc: prarit@redhat.com
      Cc: feng.tang@intel.com
      Cc: pmladek@suse.com
      Cc: gnomes@lxorguk.ukuu.org.uk
      Cc: linux-s390@vger.kernel.org
      Cc: boris.ostrovsky@oracle.com
      Cc: jgross@suse.com
      Cc: pbonzini@redhat.com
      Link: https://lkml.kernel.org/r/20180719205545.16512-27-pasha.tatashin@oracle.com
      8dbe4385
    • P
      x86/tsc: Split native_calibrate_cpu() into early and late parts · 03821f45
      Pavel Tatashin 提交于
      During early boot TSC and CPU frequency can be calibrated using MSR, CPUID,
      and quick PIT calibration methods. The other methods PIT/HPET/PMTIMER are
      available only after ACPI is initialized.
      
      Split native_calibrate_cpu() into early and late parts so they can be
      called separately during early and late tsc calibration.
      Signed-off-by: NPavel Tatashin <pasha.tatashin@oracle.com>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Cc: steven.sistare@oracle.com
      Cc: daniel.m.jordan@oracle.com
      Cc: linux@armlinux.org.uk
      Cc: schwidefsky@de.ibm.com
      Cc: heiko.carstens@de.ibm.com
      Cc: john.stultz@linaro.org
      Cc: sboyd@codeaurora.org
      Cc: hpa@zytor.com
      Cc: douly.fnst@cn.fujitsu.com
      Cc: peterz@infradead.org
      Cc: prarit@redhat.com
      Cc: feng.tang@intel.com
      Cc: pmladek@suse.com
      Cc: gnomes@lxorguk.ukuu.org.uk
      Cc: linux-s390@vger.kernel.org
      Cc: boris.ostrovsky@oracle.com
      Cc: jgross@suse.com
      Cc: pbonzini@redhat.com
      Link: https://lkml.kernel.org/r/20180719205545.16512-26-pasha.tatashin@oracle.com
      03821f45
    • P
      x86/tsc: Use TSC as sched clock early · 4763f03d
      Pavel Tatashin 提交于
      All prerequesites for enabling TSC as sched clock early in the boot
      process are available now:
      
       - Early attempt of TSC calibration
      
       - Early availablity of static branch patching
      
      If TSC frequency can be established in the early calibration, enable the
      static key which switches sched clock to use TSC.
      
      [ tglx: Massaged changelog ]
      Signed-off-by: NPavel Tatashin <pasha.tatashin@oracle.com>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Cc: steven.sistare@oracle.com
      Cc: daniel.m.jordan@oracle.com
      Cc: linux@armlinux.org.uk
      Cc: schwidefsky@de.ibm.com
      Cc: heiko.carstens@de.ibm.com
      Cc: john.stultz@linaro.org
      Cc: sboyd@codeaurora.org
      Cc: hpa@zytor.com
      Cc: douly.fnst@cn.fujitsu.com
      Cc: peterz@infradead.org
      Cc: prarit@redhat.com
      Cc: feng.tang@intel.com
      Cc: pmladek@suse.com
      Cc: gnomes@lxorguk.ukuu.org.uk
      Cc: linux-s390@vger.kernel.org
      Cc: boris.ostrovsky@oracle.com
      Cc: jgross@suse.com
      Cc: pbonzini@redhat.com
      Link: https://lkml.kernel.org/r/20180719205545.16512-22-pasha.tatashin@oracle.com
      4763f03d
    • P
      x86/tsc: Initialize cyc2ns when tsc frequency is determined · e2a9ca29
      Pavel Tatashin 提交于
      cyc2ns converts tsc to nanoseconds, and it is handled in a per-cpu data
      structure.
      
      Currently, the setup code for c2ns data for every possible CPU goes through
      the same sequence of calculations as for the boot CPU, but is based on the
      same tsc frequency as the boot CPU, and thus this is not necessary.
      
      Initialize the boot cpu when tsc frequency is determined. Copy the
      calculated data from the boot CPU to the other CPUs in tsc_init().
      
      In addition do the following:
      
       - Remove unnecessary zeroing of c2ns data by removing cyc2ns_data_init()
      
       - Split set_cyc2ns_scale() into two functions, so set_cyc2ns_scale() can be
         called when system is up, and wraps around __set_cyc2ns_scale() that can
         be called directly when system is booting but avoids saving restoring
         IRQs and going and waking up from idle.
      Suggested-by: NThomas Gleixner <tglx@linutronix.de>
      Signed-off-by: NPavel Tatashin <pasha.tatashin@oracle.com>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Cc: steven.sistare@oracle.com
      Cc: daniel.m.jordan@oracle.com
      Cc: linux@armlinux.org.uk
      Cc: schwidefsky@de.ibm.com
      Cc: heiko.carstens@de.ibm.com
      Cc: john.stultz@linaro.org
      Cc: sboyd@codeaurora.org
      Cc: hpa@zytor.com
      Cc: douly.fnst@cn.fujitsu.com
      Cc: peterz@infradead.org
      Cc: prarit@redhat.com
      Cc: feng.tang@intel.com
      Cc: pmladek@suse.com
      Cc: gnomes@lxorguk.ukuu.org.uk
      Cc: linux-s390@vger.kernel.org
      Cc: boris.ostrovsky@oracle.com
      Cc: jgross@suse.com
      Cc: pbonzini@redhat.com
      Link: https://lkml.kernel.org/r/20180719205545.16512-21-pasha.tatashin@oracle.com
      e2a9ca29
    • P
      x86/tsc: Calibrate tsc only once · cf7a63ef
      Pavel Tatashin 提交于
      During boot tsc is calibrated twice: once in tsc_early_delay_calibrate(),
      and the second time in tsc_init().
      
      Rename tsc_early_delay_calibrate() to tsc_early_init(), and rework it so
      the calibration is done only early, and make tsc_init() to use the values
      already determined in tsc_early_init().
      
      Sometimes it is not possible to determine tsc early, as the subsystem that
      is required is not yet initialized, in such case try again later in
      tsc_init().
      Suggested-by: NThomas Gleixner <tglx@linutronix.de>
      Signed-off-by: NPavel Tatashin <pasha.tatashin@oracle.com>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Cc: steven.sistare@oracle.com
      Cc: daniel.m.jordan@oracle.com
      Cc: linux@armlinux.org.uk
      Cc: schwidefsky@de.ibm.com
      Cc: heiko.carstens@de.ibm.com
      Cc: john.stultz@linaro.org
      Cc: sboyd@codeaurora.org
      Cc: hpa@zytor.com
      Cc: douly.fnst@cn.fujitsu.com
      Cc: peterz@infradead.org
      Cc: prarit@redhat.com
      Cc: feng.tang@intel.com
      Cc: pmladek@suse.com
      Cc: gnomes@lxorguk.ukuu.org.uk
      Cc: linux-s390@vger.kernel.org
      Cc: boris.ostrovsky@oracle.com
      Cc: jgross@suse.com
      Cc: pbonzini@redhat.com
      Link: https://lkml.kernel.org/r/20180719205545.16512-20-pasha.tatashin@oracle.com
      cf7a63ef
    • P
      x86/tsc: Redefine notsc to behave as tsc=unstable · fe9af81e
      Pavel Tatashin 提交于
      Currently, the notsc kernel parameter disables the use of the TSC by
      sched_clock(). However, this parameter does not prevent the kernel from
      accessing tsc in other places.
      
      The only rationale to boot with notsc is to avoid timing discrepancies on
      multi-socket systems where TSC are not properly synchronized, and thus
      exclude TSC from being used for time keeping. But that prevents using TSC
      as sched_clock() as well, which is not necessary as the core sched_clock()
      implementation can handle non synchronized TSC based sched clocks just
      fine.
      
      However, there is another method to solve the above problem: booting with
      tsc=unstable parameter. This parameter allows sched_clock() to use TSC and
      just excludes it from timekeeping.
      
      So there is no real reason to keep notsc, but for compatibility reasons the
      parameter has to stay. Make it behave like 'tsc=unstable' instead.
      
      [ tglx: Massaged changelog ]
      Signed-off-by: NPavel Tatashin <pasha.tatashin@oracle.com>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Reviewed-by: NDou Liyang <douly.fnst@cn.fujitsu.com>
      Reviewed-by: NThomas Gleixner <tglx@linutronix.de>
      Cc: steven.sistare@oracle.com
      Cc: daniel.m.jordan@oracle.com
      Cc: linux@armlinux.org.uk
      Cc: schwidefsky@de.ibm.com
      Cc: heiko.carstens@de.ibm.com
      Cc: john.stultz@linaro.org
      Cc: sboyd@codeaurora.org
      Cc: hpa@zytor.com
      Cc: peterz@infradead.org
      Cc: prarit@redhat.com
      Cc: feng.tang@intel.com
      Cc: pmladek@suse.com
      Cc: gnomes@lxorguk.ukuu.org.uk
      Cc: linux-s390@vger.kernel.org
      Cc: boris.ostrovsky@oracle.com
      Cc: jgross@suse.com
      Cc: pbonzini@redhat.com
      Link: https://lkml.kernel.org/r/20180719205545.16512-12-pasha.tatashin@oracle.com
      fe9af81e
  3. 02 5月, 2018 2 次提交
  4. 17 4月, 2018 1 次提交
    • X
      x86/tsc: Prevent 32bit truncation in calc_hpet_ref() · d3878e16
      Xiaoming Gao 提交于
      The TSC calibration code uses HPET as reference. The conversion normalizes
      the delta of two HPET timestamps:
      
          hpetref = ((tshpet1 - tshpet2) * HPET_PERIOD) / 1e6
      
      and then divides the normalized delta of the corresponding TSC timestamps
      by the result to calulate the TSC frequency.
      
          tscfreq = ((tstsc1 - tstsc2 ) * 1e6) / hpetref
      
      This uses do_div() which takes an u32 as the divisor, which worked so far
      because the HPET frequency was low enough that 'hpetref' never exceeded
      32bit.
      
      On Skylake machines the HPET frequency increased so 'hpetref' can exceed
      32bit. do_div() truncates the divisor, which causes the calibration to
      fail.
      
      Use div64_u64() to avoid the problem.
      
      [ tglx: Fixes whitespace mangled patch and rewrote changelog ]
      Signed-off-by: NXiaoming Gao <newtongao@tencent.com>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Cc: stable@vger.kernel.org
      Cc: peterz@infradead.org
      Cc: hpa@zytor.com
      Link: https://lkml.kernel.org/r/38894564-4fc9-b8ec-353f-de702839e44e@gmail.com
      d3878e16
  5. 16 3月, 2018 1 次提交
    • R
      x86/tsc: Convert ART in nanoseconds to TSC · fc804f65
      Rajvi Jingar 提交于
      Device drivers use get_device_system_crosststamp() to produce precise
      system/device cross-timestamps. The PHC clock and ALSA interfaces, for
      example, make the cross-timestamps available to user applications.  On
      Intel platforms, get_device_system_crosststamp() requires a TSC value
      derived from ART (Always Running Timer) to compute the monotonic raw and
      realtime system timestamps.
      
      Starting with Intel Goldmont platforms, the PCIe root complex supports the
      PTM time sync protocol. PTM requires all timestamps to be in units of
      nanoseconds. The Intel root complex hardware propagates system time derived
      from ART in units of nanoseconds performing the conversion as follows:
      
           ART_NS = ART * 1e9 / <crystal frequency>
      
      When user software requests a cross-timestamp, the system timestamps
      (generally read from device registers) must be converted to TSC by the
      driver software as follows:
      
          TSC = ART_NS * TSC_KHZ / 1e6
      
      This is valid when CPU feature flag X86_FEATURE_TSC_KNOWN_FREQ is set
      indicating that tsc_khz is derived from CPUID[15H]. Drivers should check
      whether this flag is set before conversion to TSC is attempted.
      Suggested-by: NChristopher S. Hall <christopher.s.hall@intel.com>
      Signed-off-by: NRajvi Jingar <rajvi.jingar@intel.com>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Cc: peterz@infradead.org
      Link: https://lkml.kernel.org/r/1520530116-4925-1-git-send-email-rajvi.jingar@intel.com
      fc804f65
  6. 15 1月, 2018 2 次提交
  7. 14 1月, 2018 3 次提交
  8. 10 11月, 2017 1 次提交
  9. 07 11月, 2017 1 次提交
  10. 17 10月, 2017 1 次提交
  11. 25 9月, 2017 2 次提交
  12. 22 6月, 2017 1 次提交
  13. 23 5月, 2017 1 次提交
  14. 15 5月, 2017 6 次提交
    • P
      sched/clock: Remove unused argument to sched_clock_idle_wakeup_event() · ac1e843f
      Peter Zijlstra 提交于
      The argument to sched_clock_idle_wakeup_event() has not been used in a
      long time. Remove it.
      Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: NIngo Molnar <mingo@kernel.org>
      ac1e843f
    • P
      x86/tsc, sched/clock, clocksource: Use clocksource watchdog to provide stable sync points · b421b22b
      Peter Zijlstra 提交于
      Currently we keep sched_clock_tick() active for stable TSC in order to
      keep the per-CPU state semi up-to-date. The (obvious) problem is that
      by the time we detect TSC is borked, our per-CPU state is also borked.
      
      So hook into the clocksource watchdog and call a method after we've
      found it to still be stable.
      
      There's the obvious race where the TSC goes wonky between finding it
      stable and us running the callback, but closing that is too much work
      and not really worth it, since we're already detecting TSC wobbles
      after the fact, so we cannot, per definition, fully avoid funny clock
      values.
      
      And since the watchdog runs less often than the tick, this is also an
      optimization.
      Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: NIngo Molnar <mingo@kernel.org>
      b421b22b
    • P
      x86/tsc: Feed refined TSC calibration into sched_clock() · aa7b630e
      Peter Zijlstra 提交于
      For the (older) CPUs that still need the refined TSC calibration, also
      update the sched_clock() rate.
      Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: NIngo Molnar <mingo@kernel.org>
      aa7b630e
    • P
      x86/tsc: Fix sched_clock() sync · 615cd033
      Peter Zijlstra 提交于
      While looking through the code I noticed that we initialize the cyc2ns
      fields with a different cycle value for each CPU, resulting in a
      slightly different 0 point for each CPU.
      Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: NIngo Molnar <mingo@kernel.org>
      615cd033
    • P
      x86/tsc: Remodel cyc2ns to use seqcount_latch() · 59eaef78
      Peter Zijlstra 提交于
      Replace the custom multi-value scheme with the more regular
      seqcount_latch() scheme. Along with scrapping a lot of lines, the latch
      scheme is better documented and used in more places.
      
      The immediate benefit however is not being limited on the update side.
      The current code has a limit where the writers block which is hit by
      future changes.
      Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: NIngo Molnar <mingo@kernel.org>
      59eaef78
    • P
      x86/tsc: Provide 'tsc=unstable' boot parameter · 8309f86c
      Peter Zijlstra 提交于
      Since the clocksource watchdog will only detect broken TSC after the
      fact, all TSC based clocks will likely have observed non-continuous
      values before/when switching away from TSC.
      
      Therefore only thing to fully avoid random clock movement when your
      BIOS randomly mucks with TSC values from SMI handlers is reporting the
      TSC as unstable at boot.
      Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: NIngo Molnar <mingo@kernel.org>
      8309f86c
  15. 23 3月, 2017 1 次提交
  16. 14 3月, 2017 1 次提交
    • P
      x86/tsc: Fix ART for TSC_KNOWN_FREQ · 44fee88c
      Peter Zijlstra 提交于
      Subhransu reported that convert_art_to_tsc() isn't working for him.
      
      The ART to TSC relation is only set up for systems which use the refined
      TSC calibration. Systems with known TSC frequency (available via CPUID 15)
      are not using the refined calibration and therefor the ART to TSC relation
      is never established.
      
      Add the setup to the known frequency init path which skips ART
      calibration. The init code needs to be duplicated as for systems which use
      refined calibration the ART setup must be delayed until calibration has
      been done.
      
      The problem has been there since the ART support was introdduced, but only
      detected now because Subhransu tested the first time on hardware which has
      TSC frequency enumerated via CPUID 15.
      
      Note for stable: The conditional has changed from TSC_RELIABLE to
           	 	 TSC_KNOWN_FREQUENCY.
      
      [ tglx: Rewrote changelog and identified the proper 'Fixes' commit ]
      
      Fixes: f9677e0f ("x86/tsc: Always Running Timer (ART) correlated clocksource")
      Reported-by: N"Prusty, Subhransu S" <subhransu.s.prusty@intel.com>
      Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: stable@vger.kernel.org
      Cc: christopher.s.hall@intel.com
      Cc: kevin.b.stanton@intel.com
      Cc: john.stultz@linaro.org
      Cc: akataria@vmware.com
      Link: http://lkml.kernel.org/r/20170313145712.GI3312@twins.programming.kicks-ass.netSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
      44fee88c
  17. 02 3月, 2017 2 次提交
    • P
      sched/clock, x86/tsc: Rework the x86 'unstable' sched_clock() interface · f94c8d11
      Peter Zijlstra 提交于
      Wanpeng Li reported that since the following commit:
      
        acb04058 ("sched/clock: Fix hotplug crash")
      
      ... KVM always runs with unstable sched-clock even though KVM's
      kvm_clock _is_ stable.
      
      The problem is that we've tied clear_sched_clock_stable() to the TSC
      state, and overlooked that sched_clock() is a paravirt function.
      
      Solve this by doing two things:
      
       - tie the sched_clock() stable state more clearly to the TSC stable
         state for the normal (!paravirt) case.
      
       - only call clear_sched_clock_stable() when we mark TSC unstable
         when we use native_sched_clock().
      
      The first means we can actually run with stable sched_clock in more
      situations then before, which is good. And since commit:
      
        12907fbb ("sched/clock, clocksource: Add optional cs::mark_unstable() method")
      
      ... this should be reliable. Since any detection of TSC fail now results
      in marking the TSC unstable.
      Reported-by: NWanpeng Li <kernellwp@gmail.com>
      Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-kernel@vger.kernel.org
      Fixes: acb04058 ("sched/clock: Fix hotplug crash")
      Signed-off-by: NIngo Molnar <mingo@kernel.org>
      f94c8d11
    • I
      sched/headers: Prepare for new header dependencies before moving code to <linux/sched/clock.h> · e6017571
      Ingo Molnar 提交于
      We are going to split <linux/sched/clock.h> out of <linux/sched.h>, which
      will have to be picked up from other headers and .c files.
      
      Create a trivial placeholder <linux/sched/clock.h> file that just
      maps to <linux/sched.h> to make this patch obviously correct and
      bisectable.
      
      Include the new header in the files that are going to need it.
      Acked-by: NLinus Torvalds <torvalds@linux-foundation.org>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: NIngo Molnar <mingo@kernel.org>
      e6017571
  18. 10 2月, 2017 1 次提交
    • T
      x86/tsc: Avoid the large time jump when sanitizing TSC ADJUST · f2e04214
      Thomas Gleixner 提交于
      Olof reported that on a machine which has a BIOS wreckaged TSC the
      timestamps in dmesg are making a large jump because the TSC value is
      jumping forward after resetting the TSC ADJUST register to a sane value.
      
      This can be avoided by calling the TSC ADJUST saniziting function before
      initializing the per cpu sched clock machinery. That takes the offset into
      account and avoid the time jump.
      
      What cannot be avoided is that the 'Firmware Bug' warnings on the secondary
      CPUs are printed with the large time offsets because it would be too much
      effort and ugly hackery to print those warnings into a buffer and emit them
      after the adjustemt on the starting CPUs. It's a firmware bug and should be
      fixed in firmware. The weird timestamps are collateral damage and just
      illustrate the sillyness of the BIOS folks:
      
      [    0.397445] smp: Bringing up secondary CPUs ...
      [    0.402100] x86: Booting SMP configuration:
      [    0.406343] .... node  #0, CPUs:      #1
      [1265776479.930667] [Firmware Bug]: TSC ADJUST differs: Reference CPU0: -2978888639075328 CPU1: -2978888639183101
      [1265776479.944664] TSC ADJUST synchronize: Reference CPU0: 0 CPU1: -2978888639183101
      [    0.508119]  #2
      [1265776480.032346] [Firmware Bug]: TSC ADJUST differs: Reference CPU0: -2978888639075328 CPU2: -2978888639183677
      [1265776480.044192] TSC ADJUST synchronize: Reference CPU0: 0 CPU2: -2978888639183677
      [    0.607643]  #3
      [1265776480.131874] [Firmware Bug]: TSC ADJUST differs: Reference CPU0: -2978888639075328 CPU3: -2978888639184530
      [1265776480.143720] TSC ADJUST synchronize: Reference CPU0: 0 CPU3: -2978888639184530
      [    0.707108] smp: Brought up 1 node, 4 CPUs
      [    0.711271] smpboot: Total of 4 processors activated (21698.88 BogoMIPS)
      Reported-by: NOlof Johansson <olof@lixom.net>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/20170209151231.411460506@linutronix.deSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
      f2e04214
  19. 14 1月, 2017 2 次提交
  20. 25 12月, 2016 1 次提交
  21. 15 12月, 2016 2 次提交
    • T
      x86/tsc: Force TSC_ADJUST register to value >= zero · 5bae1562
      Thomas Gleixner 提交于
      Roland reported that his DELL T5810 sports a value add BIOS which
      completely wreckages the TSC. The squirmware [(TM) Ingo Molnar] boots with
      random negative TSC_ADJUST values, different on all CPUs. That renders the
      TSC useless because the sycnchronization check fails.
      
      Roland tested the new TSC_ADJUST mechanism. While it manages to readjust
      the TSCs he needs to disable the TSC deadline timer, otherwise the machine
      just stops booting.
      
      Deeper investigation unearthed that the TSC deadline timer is sensitive to
      the TSC_ADJUST value. Writing TSC_ADJUST to a negative value results in an
      interrupt storm caused by the TSC deadline timer.
      
      This does not make any sense and it's hard to imagine what kind of hardware
      wreckage is behind that misfeature, but it's reliably reproducible on other
      systems which have TSC_ADJUST and TSC deadline timer.
      
      While it would be understandable that a big enough negative value which
      moves the resulting TSC readout into the negative space could have the
      described effect, this happens even with a adjust value of -1, which keeps
      the TSC readout definitely in the positive space. The compare register for
      the TSC deadline timer is set to a positive value larger than the TSC, but
      despite not having reached the deadline the interrupt is raised
      immediately. If this happens on the boot CPU, then the machine dies
      silently because this setup happens before the NMI watchdog is armed.
      
      Further experiments showed that any other adjustment of TSC_ADJUST works as
      expected as long as it stays in the positive range. The direction of the
      adjustment has no influence either. See the lkml link for further analysis.
      
      Yet another proof for the theory that timers are designed by janitors and
      the underlying (obviously undocumented) mechanisms which allow BIOSes to
      wreckage them are considered a feature. Well done Intel - NOT!
      
      To address this wreckage add the following sanity measures:
      
      - If the TSC_ADJUST value on the boot cpu is not 0, set it to 0
      
      - If the TSC_ADJUST value on any cpu is negative, set it to 0
      
      - Prevent the cross package synchronization mechanism from setting negative
        TSC_ADJUST values.
      Reported-and-tested-by: NRoland Scheidegger <rscheidegger_lists@hispeed.ch>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Cc: Bruce Schlobohm <bruce.schlobohm@intel.com>
      Cc: Kevin Stanton <kevin.b.stanton@intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Allen Hung <allen_hung@dell.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Link: http://lkml.kernel.org/r/20161213131211.397588033@linutronix.deSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
      5bae1562
    • T
      x86/tsc: Validate TSC_ADJUST after resume · 6a369583
      Thomas Gleixner 提交于
      Some 'feature' BIOSes fiddle with the TSC_ADJUST register during
      suspend/resume which renders the TSC unusable.
      
      Add sanity checks into the resume path and restore the
      original value if it was adjusted.
      Reported-and-tested-by: NRoland Scheidegger <rscheidegger_lists@hispeed.ch>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Cc: Bruce Schlobohm <bruce.schlobohm@intel.com>
      Cc: Kevin Stanton <kevin.b.stanton@intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Allen Hung <allen_hung@dell.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Link: http://lkml.kernel.org/r/20161213131211.317654500@linutronix.deSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
      6a369583
  22. 30 11月, 2016 1 次提交
    • T
      x86/tsc: Store and check TSC ADJUST MSR · 8b223bc7
      Thomas Gleixner 提交于
      The TSC_ADJUST MSR shows whether the TSC has been modified. This is helpful
      in a two aspects:
      
      1) It allows to detect BIOS wreckage, where SMM code tries to 'hide' the
         cycles spent by storing the TSC value at SMM entry and restoring it at
         SMM exit. On affected machines the TSCs run slowly out of sync up to the
         point where the clocksource watchdog (if available) detects it.
      
         The TSC_ADJUST MSR allows to detect the TSC modification before that and
         eventually restore it. This is also important for SoCs which have no
         watchdog clocksource and therefore TSC wreckage cannot be detected and
         acted upon.
      
      2) All threads in a package are required to have the same TSC_ADJUST
         value. Broken BIOSes break that and as a result the TSC synchronization
         check fails.
      
         The TSC_ADJUST MSR allows to detect the deviation when a CPU comes
         online. If detected set it to the value of an already online CPU in the
         same package. This also allows to reduce the number of sync tests
         because with that in place the test is only required for the first CPU
         in a package.
      
         In principle all CPUs in a system should have the same TSC_ADJUST value
         even across packages, but with physical CPU hotplug this assumption is
         not true because the TSC starts with power on, so physical hotplug has
         to do some trickery to bring the TSC into sync with already running
         packages, which requires to use an TSC_ADJUST value different from CPUs
         which got powered earlier.
      
         A final enhancement is the opportunity to compensate for unsynced TSCs
         accross nodes at boot time and make the TSC usable that way. It won't
         help for TSCs which run apart due to frequency skew between packages,
         but this gets detected by the clocksource watchdog later.
      
      The first step toward this is to store the TSC_ADJUST value of a starting
      CPU and compare it with the value of an already online CPU in the same
      package. If they differ, emit a warning and adjust it to the reference
      value. The !SMP version just stores the boot value for later verification.
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Reviewed-by: NIngo Molnar <mingo@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Link: http://lkml.kernel.org/r/20161119134017.655323776@linutronix.deSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
      8b223bc7