1. 17 Jun, 2009 · 2 commits
    • sched, x86: Fix cpufreq + sched_clock() TSC scaling · 84599f8a
      Authored by Peter Zijlstra
      For frequency-dependent TSCs we only scale the cycles, we do not account
      for the discrepancy in absolute value.
      
      Our current formula is: time = cycles * mult
      
      (where mult is a function of the cpu-speed on variable tsc machines)
      
      Suppose our current cycle count is 10, and we have a multiplier of 5,
      then our time value would end up being 50.
      
      Now cpufreq comes along and changes the multiplier to, say, 3 or 7,
      which would result in our time being 30 or 70, respectively.
      
      That means that we can observe random jumps in the time value due to
      frequency changes, in both the forward and backward direction.
      
      So what this patch does is change the formula to:
      
        time = cycles * frequency + offset
      
      And we calculate offset so that time_before == time_after, thereby
      ridding us of these jumps in time.
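The offset calculation described above can be sketched as follows. This is only an illustration using the commit message's own numbers, not the kernel's implementation: the struct `clock_scale` and the helpers `scaled_time()` and `change_mult()` are hypothetical names, and the multiplier stands in for the frequency-dependent factor in the formula.

```c
#include <stdint.h>

/* Hypothetical state for illustration; not the kernel's data layout. */
struct clock_scale {
    uint64_t mult;    /* current cycles-to-time multiplier */
    int64_t  offset;  /* correction that keeps time continuous */
};

/* time = cycles * mult + offset */
static int64_t scaled_time(const struct clock_scale *cs, uint64_t cycles)
{
    return (int64_t)(cycles * cs->mult) + cs->offset;
}

/*
 * Switch to new_mult at the given cycle count. The offset is chosen so
 * that the time computed just before the switch equals the time computed
 * just after it (time_before == time_after).
 */
static void change_mult(struct clock_scale *cs, uint64_t cycles,
                        uint64_t new_mult)
{
    int64_t time_before = scaled_time(cs, cycles);

    cs->mult = new_mult;
    cs->offset = time_before - (int64_t)(cycles * new_mult);
}
```

With a cycle count of 10 and a multiplier of 5, the time is 50; after switching the multiplier to 3 at that point, the offset becomes 20 and the time is still 50 instead of jumping to 30.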
      
      [ Impact: fix/reduce sched_clock() jumps across frequency changing events ]
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      LKML-Reference: <new-submission>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      Chucked-on-by: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    • time: move PIT_TICK_RATE to linux/timex.h · 08604bd9
      Authored by Arnd Bergmann
      PIT_TICK_RATE is currently defined in four architectures, but in three
      different places.  While linux/timex.h is not the perfect place for it, it
      is still a reasonable replacement for those drivers that traditionally use
      asm/timex.h to get CLOCK_TICK_RATE and expect it to be the PIT frequency.
      
      Note that for Alpha, the actual value changed from 1193180UL to 1193182UL.
      This is unlikely to make a difference, and probably can only improve
      accuracy.  There was a discussion on the correct value of CLOCK_TICK_RATE
      a few years ago, after which every existing instance was getting changed
      to 1193182.  According to the specification, it should be
      1193181.818181...
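To put the two integer values in perspective, here is a small sketch that computes their error against the nominal rate. It assumes the nominal PIT input clock is 315/264 MHz (the 14.31818 MHz crystal divided by 12), which matches the 1193181.818181... figure quoted above; the helper names are made up for illustration.

```c
/* Nominal PIT input clock: 315/264 MHz (14.31818 MHz / 12), in Hz. */
static double pit_nominal_hz(void)
{
    return 315.0e6 / 264.0;
}

/* Signed error of an integer approximation, in parts per million. */
static double rate_error_ppm(double approx_hz)
{
    double nominal = pit_nominal_hz();

    return (approx_hz - nominal) / nominal * 1e6;
}
```

Under that assumption, 1193182 is off by roughly 0.15 ppm and 1193180 by roughly 1.5 ppm, so either value is far inside the tolerances discussed elsewhere in this history.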
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Cc: Richard Henderson <rth@twiddle.net>
      Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Len Brown <lenb@kernel.org>
      Cc: john stultz <johnstul@us.ibm.com>
      Cc: Dmitry Torokhov <dtor@mail.ru>
      Cc: Takashi Iwai <tiwai@suse.de>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  2. 15 Jun, 2009 · 1 commit
    • [CPUFREQ] Clean up convoluted code in arch/x86/kernel/tsc.c:time_cpufreq_notifier() · 931db6a3
      Authored by Dave Jones
      Christoph Hellwig noticed the following potential uninitialised use:
      
       > arch/x86/kernel/tsc.c: In function 'time_cpufreq_notifier':
       > arch/x86/kernel/tsc.c:634: warning: 'dummy' may be used uninitialized in this function
       >
       > where we do have CONFIG_SMP set, freq->flags & CPUFREQ_CONST_LOOPS is
       > true and ref_freq is false.
      
      It seems plausible, though the chances of hitting it are really low.
      Nearly all SMP capable cpufreq drivers set CPUFREQ_CONST_LOOPS.
      powernow-k8 is really the only exception. The older CPUs were typically
      only ever UP. (powernow-k7 never supported SMP, for example.)
      
      It's worth fixing regardless, as it cleans up the code.
      
      Fix possible uninitialized use of dummy, by just removing it,
      and making the setting of lpj more obvious.
      Signed-off-by: Dave Jones <davej@redhat.com>
  3. 28 May, 2009 · 1 commit
  4. 22 Apr, 2009 · 1 commit
  5. 12 Apr, 2009 · 1 commit
    • x86: clean up declarations and variables · 2c1b284e
      Authored by Jaswinder Singh Rajput
      Impact: cleanup, no code changed
      
       - syscalls.h       update declarations due to unifications
       - irq.c            declare smp_generic_interrupt() before it gets used
       - process.c        declare sys_fork() and sys_vfork() before they get used
       - tsc.c            rename tsc_khz shadowed variable
       - apic/probe_32.c  declare apic_default before it gets used
       - apic/nmi.c       prev_nmi_count should be unsigned
       - apic/io_apic.c   declare smp_irq_move_cleanup_interrupt() before it gets used
       - mm/init.c        declare direct_gbpages and free_initrd_mem before they get used
      Signed-off-by: Jaswinder Singh Rajput <jaswinder@kernel.org>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  6. 17 Mar, 2009 · 2 commits
    • Fast TSC calibration: calculate proper frequency error bounds · 9e8912e0
      Authored by Linus Torvalds
      In order for ntpd to correctly synchronize the clocks, the frequency of
      the system clock must not be off by more than 500 ppm (or, put another
      way, 1:2000), or ntpd will give up on trying to synchronize
      properly, and end up resetting the clock in jumps instead.
      
      The fast TSC PIT calibration sometimes failed this test - it was
      assuming that the PIT reads always took about one microsecond each (2us
      for the two reads to get a 16-bit timer), and that calibrating TSC to
      the PIT over 15ms should thus be sufficient to get much closer than
      500ppm (max 2us error on both sides giving 4us over 15ms: a 270 ppm
      error value).
      
      However, that assumption does not always hold: apparently some hardware
      is either very much slower at reading the PIT registers, or there was
      other noise causing at least one machine to get 700+ ppm errors.
      
      So instead of using a fixed 15ms timing loop, this changes the fast PIT
      calibration to read the TSC delta over the individual PIT timer reads,
      and use the result to calculate the error bars on the PIT read timing
      properly.  We then successfully calibrate the TSC only if the maximum
      error bars fall below 500ppm.
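The acceptance test described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the kernel's code: `d_start` and `d_end` are hypothetical names for the TSC uncertainty of the PIT reads at either end of the measurement, `delta` is the TSC delta over the whole measurement, and all three are in the same unit.

```c
#include <stdbool.h>
#include <stdint.h>

/* Worst-case relative error in parts per million; delta must be nonzero. */
static uint64_t error_ppm(uint64_t d_start, uint64_t d_end, uint64_t delta)
{
    return (d_start + d_end) * 1000000ULL / delta;
}

/* Accept the calibration only if the error bars stay below 500 ppm. */
static bool calibration_ok(uint64_t d_start, uint64_t d_end, uint64_t delta)
{
    return delta != 0 && error_ppm(d_start, d_end, delta) < 500;
}
```

Plugging in the old assumption (2us of error on each side over 15ms, all expressed in microseconds) gives about 267 ppm, which passes. If the reads are slow enough that the combined uncertainty reaches 11us over the same 15ms, the result is above 700 ppm and the calibration is rejected.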
      
      In the process, we also relax the timing to allow up to 25ms for the
      calibration, although it can happen much faster depending on hardware.
      Reported-and-tested-by: Jesper Krogh <jesper@krogh.cc>
      Cc: john stultz <johnstul@us.ibm.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Acked-by: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • Fix potential fast PIT TSC calibration startup glitch · a6a80e1d
      Authored by Linus Torvalds
      During bootup, when we reprogram the PIT (programmable interval timer)
      to start counting down from 0xffff in order to use it for the fast TSC
      calibration, we should also make sure to delay a bit afterwards to allow
      the PIT hardware to actually start counting with the new value.
      
      That will happen at the next CLK pulse (1.193182 MHz), so the easiest
      way to do that is to just wait at least one microsecond after
      programming the new PIT counter value.  We do that by just reading the
      counter value back once - which will take about 2us on PC hardware.
      Reported-and-tested-by: john stultz <johnstul@us.ibm.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  7. 11 Mar, 2009 · 1 commit
    • x86, sched_clock(): mark variables read-mostly · f24ade3a
      Authored by Ingo Molnar
      Impact: micro-optimization
      
      There are a number of variables in the sched_clock() path that are
      in .data/.bss - but not marked __read_mostly. This creates the
      danger of accidental false cacheline sharing with some other,
      write-often variable.
      
      So mark them __read_mostly.
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  8. 25 Feb, 2009 · 1 commit
  9. 29 Jan, 2009 · 1 commit
    • x86: replace CONFIG_X86_SMP with CONFIG_SMP · 3e5095d1
      Authored by Ingo Molnar
      The x86/Voyager subarch used to have this distinction between
       'x86 SMP support' and 'Voyager SMP support':
      
       config X86_SMP
      	bool
      	depends on SMP && ((X86_32 && !X86_VOYAGER) || X86_64)
      
      This is a pointless distinction - Voyager can (and already does) use
      smp_ops to implement various SMP quirks it has - and it can be extended
      more to cover all the specialities of Voyager.
      
      So remove this complication in the Kconfig space.
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  10. 09 Nov, 2008 · 1 commit
    • sched: optimize sched_clock() a bit · 7cbaef9c
      Authored by Ingo Molnar
      sched_clock() uses cycles_2_ns() needlessly - which is an irq-disabling
      variant of __cycles_2_ns().
      
      Most of the time sched_clock() is called with irqs disabled already.
      The few places that call it with irqs enabled need to be updated.
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  11. 04 Nov, 2008 · 1 commit
  12. 02 Nov, 2008 · 2 commits
    • x86: Skip verification by the watchdog for TSC clocksource. · 395628ef
      Authored by Alok Kataria
      Impact: Changes timekeeping on VMware (or with tsc=reliable).
      
      This is achieved by resetting the CLOCKSOURCE_MUST_VERIFY flag.
      
      We add a tsc=reliable commandline option to enable this.
      This enables legacy hardware without HPET, LAPIC, or ACPI timers
      to enter high-resolution timer mode.
      
      Along with that, this has been extended for use in virtualized environments
      too. Now we also set this flag if the X86_FEATURE_TSC_RELIABLE bit is set.
      
      This is important since there is a wrap-around problem with the acpi_pm timer.
      The acpi_pm counter is just 24 bits wide and can overflow in ~4 seconds. With
      NO_HZ kernels in a virtualized environment, there can be situations where
      the guest is descheduled for a longer duration, and as a result we may miss the
      wrap of the acpi counter. When the TSC is used as a clocksource and the acpi_pm
      timer is used as the watchdog clocksource, this error in acpi_pm results in the
      TSC being marked as unstable, and essentially results in time dropping in chunks
      of 4 seconds whenever this wrap is missed. Since the virtualized TSC is
      reliable on VMware, we should always use the TSC clocksource on VMware, so
      we skip the verification at runtime by checking for the feature bit.
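The missed-wrap effect can be illustrated with a short sketch. It assumes the standard ACPI PM timer rate of 3.579545 MHz (so a 24-bit counter wraps roughly every 4.7 seconds); the macro and function names are hypothetical and only model the arithmetic, not the kernel's clocksource code.

```c
#include <stdint.h>

#define PMTMR_HZ   3579545u     /* assumed ACPI PM timer rate, in Hz */
#define PMTMR_MASK 0x00FFFFFFu  /* the counter is 24 bits wide */

/*
 * Elapsed ticks between two counter reads. The result is only correct
 * if fewer than 2^24 ticks really passed between the reads.
 */
static uint32_t pmtmr_delta(uint32_t prev, uint32_t now)
{
    return (now - prev) & PMTMR_MASK;
}

/* Value a reader would see after true_ticks have elapsed since start. */
static uint32_t pmtmr_read_after(uint32_t start, uint64_t true_ticks)
{
    return (uint32_t)((start + true_ticks) & PMTMR_MASK);
}
```

If a guest is descheduled for five seconds between two reads, the masked delta comes out exactly 2^24 ticks too short, which is the chunk of time that goes missing when the wrap is not observed.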
      
      Since we reset the flag for mgeode systems too, I have combined
      the mgeode case with the feature bit check.
      Signed-off-by: Jeff Hansen <jhansen@cardaccess-inc.com>
      Signed-off-by: Alok N Kataria <akataria@vmware.com>
      Signed-off-by: Dan Hecht <dhecht@vmware.com>
      Signed-off-by: H. Peter Anvin <hpa@zytor.com>
    • x86: Hypervisor detection and get tsc_freq from hypervisor · 88b094fb
      Authored by Alok Kataria
      Impact: Changes timebase calibration on VMware.
      
      v3->v2 : Abstract the hypervisor detection and feature (tsc_freq) request
      	 behind a hypervisor.c file
      v2->v1 : Add a x86_hyper_vendor field to the cpuinfo_x86 structure.
      	 This avoids multiple calls to the hypervisor detection function.
      
      This patch adds a function to detect whether we are running under VMware.
      The current way to check whether we are on VMware is as follows:
      #  Check if the "hypervisor present bit" is set; if so, read the 0x40000000
         cpuid leaf and check for the "VMwareVMware" signature.
      #  If the above fails, check the DMI vendor name for the "VMware" string;
         if we find it, query the VMware hypervisor port to check whether we are
         under VMware.
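The signature comparison in the first step can be sketched as below. This covers only the string check, under the assumption that the 12 signature bytes are returned in EBX, ECX and EDX (in that order, least significant byte first) by the 0x40000000 cpuid leaf; executing cpuid itself, the DMI lookup and the hypervisor port query are left out, and the function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Store a 32-bit register value as four bytes, least significant first. */
static void unpack_le32(uint32_t reg, char out[4])
{
    for (int i = 0; i < 4; i++)
        out[i] = (char)((reg >> (8 * i)) & 0xff);
}

/* Compare the register contents against the "VMwareVMware" signature. */
static bool is_vmware_signature(uint32_t ebx, uint32_t ecx, uint32_t edx)
{
    char sig[12];

    unpack_le32(ebx, sig);
    unpack_le32(ecx, sig + 4);
    unpack_le32(edx, sig + 8);
    return memcmp(sig, "VMwareVMware", sizeof(sig)) == 0;
}
```

Building the bytes with shifts rather than a memcpy of the registers keeps the sketch independent of the host's byte order.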
      
      The DMI + "VMware hypervisor port check" is needed for older VMware products,
      which don't implement the hypervisor signature cpuid leaf.
      Also note that since we are checking for the DMI signature the hypervisor
      port should never be accessed on native hardware.
      
      This patch also adds a hypervisor_get_tsc_freq function. Instead of
      calibrating the frequency, which can be error-prone in a virtualized
      environment, we ask the hypervisor for it. We get the frequency from
      the hypervisor by accessing the hypervisor port if we are running on VMware.
      Other hypervisors can also add code to the generic routine to get the
      frequency on their platform.
      Signed-off-by: Alok N Kataria <akataria@vmware.com>
      Signed-off-by: Dan Hecht <dhecht@vmware.com>
      Signed-off-by: H. Peter Anvin <hpa@zytor.com>
  13. 31 Oct, 2008 · 1 commit
  14. 07 Sep, 2008 · 1 commit
  15. 05 Sep, 2008 · 2 commits
    • x86: quick TSC calibration, improve · 4156e9a8
      Authored by Ingo Molnar
      - make sure the final TSC timestamp is reliable too
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • x86: quick TSC calibration · 6ac40ed0
      Authored by Linus Torvalds
      Introduce a fast TSC-calibration method on sane hardware.
      
      It only uses 17920 PIT timer ticks to calibrate the TSC, plus 256 ticks on
      each side to make sure the TSC values were very close to the tick, so the
      whole calibration takes 15ms. Yet, despite only taking 15ms,
      we can actually give pretty stringent guarantees of accuracy:
      
       - the code requires that we hit each 256-counter block at least 50 times,
         so the TSC error is basically at *MOST* just a few PIT cycles off in
         any direction. In practice, it's going to be about one microsecond
         off (which is how long it takes to read the counter)
      
       - so over 17920 PIT cycles, we can pretty much guarantee that the
         calibration error is less than one half of a percent.
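The conversion from a measured TSC delta to a frequency can be sketched as follows. This is a simplified illustration, not the kernel's routine: `tsc_khz_from_pit()` is a hypothetical helper, and PIT_TICK_RATE uses the 1193182 Hz value mentioned elsewhere in this history. At that rate, 17920 PIT ticks last about 15.02 ms.

```c
#include <stdint.h>

#define PIT_TICK_RATE 1193182ULL  /* PIT input clock in Hz */

/*
 * TSC frequency in kHz, given the number of TSC cycles counted while
 * the PIT advanced by pit_ticks ticks:
 *
 *   seconds = pit_ticks / PIT_TICK_RATE
 *   kHz     = tsc_delta / seconds / 1000
 */
static uint64_t tsc_khz_from_pit(uint64_t tsc_delta, uint64_t pit_ticks)
{
    return tsc_delta * PIT_TICK_RATE / (pit_ticks * 1000);
}
```

An error of a few PIT ticks out of 17920 changes the result by only a few hundredths of a percent, which is consistent with the bound of less than half a percent claimed above.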
      
      My testing bears this out: on my machine, the quick-calibration reports
      2934.085 MHz, while the slow one reports 2933.415 MHz.
      
      Yes, the slower calibration is still more precise. For me, the slow
      calibration is stable to within about one hundredth of a percent, so it's
      (at a guess) roughly an order-and-a-half of magnitude more precise. The
      longer you wait, the more precise you can be.
      
      However, the nice thing about the fast TSC PIT synchronization is that
      it's pretty much _guaranteed_ to give that 0.5% precision, and fail
      gracefully (and very quickly) if it doesn't get it. And it really is
      fairly simple (even if there's a lot of _details_ there, and I didn't get
      all of those right on the first try or even the second ;)
      
      The patch says "110 insertions", but 63 of those new lines are actually
      comments.
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      ---
       arch/x86/kernel/tsc.c |  111 ++++++++++++++++++++++++++++++++++++++++++++++++-
       1 files changed, 110 insertions(+), 1 deletions(-)
  16. 04 Sep, 2008 · 5 commits
  17. 03 Sep, 2008 · 2 commits
    • Split up PIT part of TSC calibration from native_calibrate_tsc · ec0c15af
      Authored by Linus Torvalds
      The TSC calibration function is still very complicated, but this makes
      it at least a little bit less so by moving the PIT part out into a
      helper function of its own.
      Tested-by: Larry Finger <Larry.Finger@lwfinger.net>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • [x86] Fix TSC calibration issues · fbb16e24
      Authored by Thomas Gleixner
      Larry Finger reported at http://lkml.org/lkml/2008/9/1/90:
      An ancient laptop of mine started throwing errors from b43legacy when
      I started using 2.6.27 on it. This has been bisected to commit bfc0f594
      "x86: merge tsc calibration".
      
      The unification of the TSC code adopted mostly the 64bit code, which
      prefers PMTIMER/HPET over the PIT calibration.
      
      Larry's system has an AMD K6 CPU. Such systems are known to have
      PMTIMER incarnations which run at double speed. This results in a
      miscalibration of the TSC by a factor of 0.5. So the resulting calibrated
      CPU/TSC speed is half of the real CPU speed, which means that the TSC
      based delay loop will run half the time it should run. That might
      explain why the b43legacy driver went berserk.
      
      On the other hand we know about systems, where the PIT based
      calibration results in random crap due to heavy SMI/SMM
      disturbance. On those systems the PMTIMER/HPET based calibration logic
      with SMI detection shows better results.
      
      According to Alok, virtualized systems also suffer from the PIT
      calibration method.
      
      The solution is to use a more wreckage-aware approach than the current
      either/or decision.
      
      1) Reimplement the retry loop which was dropped from the 32bit code
      during the merge. It repeats the calibration and selects the lowest
      frequency value as this is probably the closest estimate to the real
      frequency.
      
      2) Monitor the delta of the TSC values in the delay loop which waits
      for the PIT counter to reach zero. If the maximum value is
      significantly different from the minimum, then we have a pretty safe
      indicator that the loop was disturbed by an SMI.
      
      3) Keep the pmtimer/hpet reference as a backup solution for systems
      where the SMI disturbance is a permanent point of failure for PIT
      based calibration.
      
      4) Do the loop iteration for both methods, record the lowest value and
      decide after all iterations finished.
      
      5) Set a clear preference to PIT based calibration when the result
      makes sense.
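The selection logic in the steps above can be sketched roughly as follows. This is a simplified illustration, not the kernel's implementation: the struct `calib_sample`, the NO_RESULT marker, the factor of 10 used as the "significantly different" threshold, and the treatment of any undisturbed PIT result as one that "makes sense" are all assumptions made for the example.

```c
#include <stdbool.h>
#include <stdint.h>

#define NO_RESULT UINT64_MAX  /* iteration disturbed or method unavailable */

/* One calibration iteration: a PIT result and a pmtimer/hpet result. */
struct calib_sample {
    uint64_t pit_khz;
    uint64_t ref_khz;
};

/*
 * Step 2: treat the delay loop as disturbed (for example by an SMI) if
 * the largest TSC delta between polls dwarfs the smallest one. The
 * factor of 10 is an illustrative threshold.
 */
static bool loop_disturbed(uint64_t tsc_delta_min, uint64_t tsc_delta_max)
{
    return tsc_delta_max > 10 * tsc_delta_min;
}

/*
 * Steps 1, 3, 4 and 5: record the lowest value of each method over all
 * iterations, then prefer the PIT result and fall back to the reference
 * result. Returns 0 if neither method produced a usable value.
 */
static uint64_t pick_tsc_khz(const struct calib_sample *s, int n)
{
    uint64_t pit_min = NO_RESULT, ref_min = NO_RESULT;

    for (int i = 0; i < n; i++) {
        if (s[i].pit_khz < pit_min)
            pit_min = s[i].pit_khz;
        if (s[i].ref_khz < ref_min)
            ref_min = s[i].ref_khz;
    }
    if (pit_min != NO_RESULT)
        return pit_min;
    if (ref_min != NO_RESULT)
        return ref_min;
    return 0;
}
```

A caller would set `pit_khz` to NO_RESULT for any iteration where `loop_disturbed()` fires, so that a system with permanent SMI disturbance ends up using the pmtimer/hpet value.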
      
      The implementation does the reference calibration based on
      HPET/PMTIMER around the delay, which is necessary for the PIT anyway,
      but keeps separate TSC values to ensure the "independency" of the
      resulting calibration values.
      
      Tested on various 32bit/64bit machines including Geode 266 MHz, AMD K6
      (affected machine with a double speed pmtimer which I grabbed out of
      the dump), Pentium class machines and AMD/Intel 64 bit boxen.
      Bisected-by: Larry Finger <Larry.Finger@lwfinger.net>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Tested-by: Larry Finger <Larry.Finger@lwfinger.net>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  18. 25 Aug, 2008 · 2 commits
  19. 18 Aug, 2008 · 1 commit
  20. 11 Aug, 2008 · 1 commit
    • x86, tsc: fix section mismatch warning · 90936cfe
      Authored by Marcin Slusarz
      WARNING: vmlinux.o(.text+0x7950): Section mismatch in reference from the function native_calibrate_tsc() to the function .init.text:tsc_read_refs()
      The function native_calibrate_tsc() references
      the function __init tsc_read_refs().
      This is often because native_calibrate_tsc lacks a __init
      annotation or the annotation of tsc_read_refs is wrong.
      
      tsc_read_refs is called from native_calibrate_tsc which is not __init
      and native_calibrate_tsc cannot be marked __init
      Signed-off-by: Marcin Slusarz <marcin.slusarz@gmail.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Signed-off-by: H. Peter Anvin <hpa@zytor.com>
  21. 16 Jul, 2008 · 1 commit
  22. 11 Jul, 2008 · 1 commit
  23. 09 Jul, 2008 · 6 commits