1. 24 10月, 2007 2 次提交
    • D
      x86: fix more TSC clock source calibration errors · 8c660065
      Dave Johnson 提交于
      The previous patch wasn't correctly handling the 'count' variable.  If
      a CPU gave bad results on the 1st or 2nd run but good results on the
      3rd, it wouldn't do the correct thing.  No idea if any such CPU
      exists, but the patch below handles that case by discarding the bad
      runs.
      
      If a bad result (too quick, or too slow) occurs on any of the 3 runs
      it will be discarded.
      
      Also updated some comments to explain what's going on.
      Signed-off-by: NDave Johnson <djohnson@sw.starentnetworks.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      8c660065
    • D
      x86: fix TSC clock source calibration error · edaf420f
      Dave Johnson 提交于
      I ran into this problem on a system that was unable to obtain NTP sync
      because the clock was running very slow (over 10000ppm slow). ntpd had
      declared all of its peers 'reject' with 'peer_dist' reason.
      
      On investigation, the tsc_khz variable was significantly incorrect
      causing xtime to run slow.  After a reboot tsc_khz was correct so I
      did a reboot test to see how often the problem occurred:
      
      Test was done on a 2000 Mhz Xeon system.  Of 689 reboots, 8 of them
      had unacceptable tsc_khz values (>500ppm):
      
       range of tsc_khz  # of boots  % of boots
       ----------------  ----------  ----------
              < 1999750           0      0.000%
      1999750 - 1999800          21      3.048%
      1999800 - 1999850         166     24.128%
      1999850 - 1999900         241     35.029%
      1999900 - 1999950         211     30.669%
      1999950 - 2000000          42      6.105%
      2000000 - 2000000           0      0.000%
      2000050 - 2000100           0      0.000%
                         [...]
      2000100 - 2015000           1      0.145%  << BAD
      2015000 - 2030000           6      0.872%  << BAD
      2030000 - 2045000           1      0.145%  << BAD
      2045000 <                   0      0.000%
      
      The worst boot was 2032.577 Mhz, over 1.5% off!
      
      It appears that on rare occasions, mach_countup() is taking longer to
      complete than necessary.
      
      I suspect that this is caused by the CPU taking a periodic SMI
      interrupt right at the end of the 30ms calibration loop.  This would
      cause the loop to delay while the SMI BIOS hander runs. The resulting
      TSC value is beyond what it actually should be resulting in a higher
      tsc_khz.
      
      The below patch makes native_calculate_cpu_khz() take the best
      (shortest duration, lowest khz) run of it's 3 calibration loops.  If a
      SMI goes off causing a bad result (long duration, higher khz) it will
      be discarded.
      
      With the patch applied, 300 boots of the same system produce good
      results:
      
       range of tsc_khz  # of boots  % of boots
       ----------------  ----------  ----------
              < 1999750           0      0.000%
      1999750 - 1999800          30     10.000%
      1999800 - 1999850         166     55.333%
      1999850 - 1999900          89     29.667%
      1999900 - 1999950          15      5.000%
      1999950 <                   0      0.000%
      
      Problem was found and tested against 2.6.18.  Patch is against 2.6.22.
      Signed-off-by: NDave Johnson <djohnson@sw.starentnetworks.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      edaf420f
  2. 23 10月, 2007 2 次提交
  3. 22 10月, 2007 6 次提交
  4. 20 10月, 2007 30 次提交