1. 15 7月, 2011 1 次提交
  2. 14 7月, 2011 1 次提交
  3. 01 6月, 2011 1 次提交
  4. 24 5月, 2011 4 次提交
  5. 18 3月, 2011 1 次提交
  6. 15 1月, 2011 1 次提交
    • J
      x86: tsc: Fix calibration refinement conditionals to avoid divide by zero · 62627bec
      John Stultz 提交于
      Konrad Wilk reported that the new delayed calibration crashes with a
      divide by zero on Xen. The reason is that Xen sets the pmtimer
      address, but reading from it returns 0xffffff. That results in the
      ref_start and ref_stop value being the same, so the delta is zero
      which causes the divide by zero later in the calculation.
      
      The conditional (!hpet && !ref_start && !ref_stop) which sanity checks
      the calibration reference values doesn't really make sense. If the
      refs are null, but hpet is on, we still want to break out.
      
      The div by zero would be possible to trigger by chance if both reads
      from the hardware provided the exact same value (due to hardware
      wrapping).
      
      So checking if both the ref values are the same should handle if we
      don't have hardware (both null) or if they are the same value (either by
      invalid hardware, or by chance), avoiding the div by zero issue.
      
      [ tglx: Applied the same fix to native_calibrate_tsc() where this
        	check was copied from ]
      Reported-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Tested-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Signed-off-by: NJohn Stultz <johnstul@us.ibm.com>
      LKML-Reference: <1295024788-15619-1-git-send-email-johnstul@us.ibm.com>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      62627bec
  7. 11 1月, 2011 1 次提交
  8. 30 12月, 2010 1 次提交
  9. 13 12月, 2010 1 次提交
    • T
      x86: Check tsc available/disabled in the delayed init function · a8760eca
      Thomas Gleixner 提交于
      The delayed TSC init function does not check whether the system has no
      TSC or TSC is disabled at the kernel command line, which results in a
      crash in the work queue based extended calibration due to division by
      zero because the basic calibration never happened.
      
      Add the missing checks and do not touch TSC when not available or
      disabled.
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Cc: John Stultz <johnstul@us.ibm.com>
      a8760eca
  10. 03 12月, 2010 1 次提交
    • J
      x86: Improve TSC calibration using a delayed workqueue · 08ec0c58
      John Stultz 提交于
      Boot to boot the TSC calibration may vary by quite a large amount.
      
      While normal variance of 50-100ppm can easily be seen, the quick
      calibration code only requires 500ppm accuracy, which is the limit
      of what NTP can correct for.
      
      This can cause problems for systems being used as NTP servers, as
      every time they reboot it can take hours for them to calculate the
      new drift error caused by the calibration.
      
      The classic trade-off here is calibration accuracy vs slow boot times,
      as during the calibration nothing else can run.
      
      This patch uses a delayed workqueue  to calibrate the TSC over the
      period of a second. This allows very accurate calibration (in my
      tests only varying by 1khz or 0.4ppm boot to boot). Additionally this
      refined calibration step does not block the boot process, and only
      delays the TSC clocksoure registration by a few seconds in early boot.
      If the refined calibration strays 1% from the early boot calibration
      value, the system will fall back to already calculated early boot
      calibration.
      
      Credit to Andi Kleen who suggested using a timer quite awhile back,
      but I dismissed it thinking the timer calibration would be done after
      the clocksource was registered (which would break things). Forgive
      me for my short-sightedness.
      
      This patch has worked very well in my testing, but TSC hardware is
      quite varied so it would probably be good to get some extended
      testing, possibly pushing inclusion out to 2.6.39.
      Signed-off-by: NJohn Stultz <johnstul@us.ibm.com>
      LKML-Reference: <1289003985-29060-1-git-send-email-johnstul@us.ibm.com>
      Reviewed-by: NThomas Gleixner <tglx@linutronix.de>
      CC: Thomas Gleixner <tglx@linutronix.de>
      CC: Ingo Molnar <mingo@elte.hu>
      CC: Martin Schwidefsky <schwidefsky@de.ibm.com>
      CC: Clark Williams <williams@redhat.com>
      CC: Andi Kleen <andi@firstfloor.org>
      08ec0c58
  11. 19 10月, 2010 1 次提交
  12. 11 9月, 2010 2 次提交
  13. 26 8月, 2010 1 次提交
    • B
      x86, tsc: Remove CPU frequency calibration on AMD · acf01734
      Borislav Petkov 提交于
      6b37f5a2 introduced the CPU frequency
      calibration code for AMD CPUs whose TSCs didn't increment with the
      core's P0 frequency. From F10h, revB onward, however, the TSC increment
      rate is denoted by MSRC001_0015[24] and when this bit is set (which
      should be done by the BIOS) the TSC increments with the P0 frequency
      so the calibration is not needed and booting can be a couple of mcecs
      faster on those machines.
      
      Besides, there should be virtually no machines out there which don't
      have this bit set, therefore this calibration can be safely removed. It
      is a shaky hack anyway since it assumes implicitly that the core is in
      P0 when BIOS hands off to the OS, which might not always be the case.
      Signed-off-by: NBorislav Petkov <borislav.petkov@amd.com>
      LKML-Reference: <20100825162823.GE26438@aftab>
      Signed-off-by: NH. Peter Anvin <hpa@linux.intel.com>
      acf01734
  14. 20 8月, 2010 1 次提交
    • S
      x86, tsc, sched: Recompute cyc2ns_offset's during resume from sleep states · cd7240c0
      Suresh Siddha 提交于
      TSC's get reset after suspend/resume (even on cpu's with invariant TSC
      which runs at a constant rate across ACPI P-, C- and T-states). And in
      some systems BIOS seem to reinit TSC to arbitrary large value (still
      sync'd across cpu's) during resume.
      
      This leads to a scenario of scheduler rq->clock (sched_clock_cpu()) less
      than rq->age_stamp (introduced in 2.6.32). This leads to a big value
      returned by scale_rt_power() and the resulting big group power set by the
      update_group_power() is causing improper load balancing between busy and
      idle cpu's after suspend/resume.
      
      This resulted in multi-threaded workloads (like kernel-compilation) go
      slower after suspend/resume cycle on core i5 laptops.
      
      Fix this by recomputing cyc2ns_offset's during resume, so that
      sched_clock() continues from the point where it was left off during
      suspend.
      Reported-by: NFlorian Pritz <flo@xssn.at>
      Signed-off-by: NSuresh Siddha <suresh.b.siddha@intel.com>
      Cc: <stable@kernel.org> # [v2.6.32+]
      Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      LKML-Reference: <1282262618.2675.24.camel@sbsiddha-MOBL3.sc.intel.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      cd7240c0
  15. 27 7月, 2010 1 次提交
  16. 09 2月, 2010 1 次提交
  17. 05 2月, 2010 1 次提交
  18. 18 1月, 2010 1 次提交
  19. 18 12月, 2009 1 次提交
  20. 21 9月, 2009 1 次提交
    • F
      x86: Trivial whitespace cleanups · 878f4f53
      Felipe Contreras 提交于
      Signed-off-by: NFelipe Contreras <felipe.contreras@gmail.com>
      Cc: Vegard Nossum <vegardno@ifi.uio.no>
      Cc: Pekka Enberg <penberg@cs.helsinki.fi>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Alok N Kataria <akataria@vmware.com>
      Cc: "Tan Wei Chong" <wei.chong.tan@intel.com>
      Cc: Len Brown <len.brown@intel.com>
      Cc: Lin Ming <ming.m.lin@intel.com>
      Cc: Bob Moore <robert.moore@intel.com>
      LKML-Reference: <1253137123-18047-2-git-send-email-felipe.contreras@gmail.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      878f4f53
  21. 31 8月, 2009 3 次提交
  22. 29 8月, 2009 2 次提交
    • J
      x86: Make tsc=reliable override boot time stability checks · d3b8f889
      john stultz 提交于
      This patch makes the tsc=reliable option disable the boot time
      stability checks. Currently the option only disables the runtime
      watchdog checks. This change allows folks who want to override the
      boot time TSC stability checks and use the TSC when the system would
      otherwise disqualify it.
      
      There still are some situations that the TSC will be disqualified,
      such as cpufreq scaling. But these are situations where the box will
      hang if allowed.
      
      Patch also includes a fix for an issue found by Thomas Gleixner, where
      the TSC disqualification message wouldn't be printed after a call to
      unsynchronized_tsc().
      Signed-off-by: NJohn Stultz <johnstul@us.ibm.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: akataria@vmware.com
      Cc: Stephen Hemminger <shemminger@vyatta.com>
      LKML-Reference: <1250552447.7212.92.camel@localhost.localdomain>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      d3b8f889
    • T
      clocksource: Resolve cpu hotplug dead lock with TSC unstable · 7285dd7f
      Thomas Gleixner 提交于
      Martin Schwidefsky analyzed it:
      To register a clocksource the clocksource_mutex is acquired and if
      necessary timekeeping_notify is called to install the clocksource as
      the timekeeper clock. timekeeping_notify uses stop_machine which needs
      to take cpu_add_remove_lock mutex.
      Starting a new cpu is done with the cpu_add_remove_lock mutex held.
      native_cpu_up checks the tsc of the new cpu and if the tsc is no good
      clocksource_change_rating is called. Which needs the clocksource_mutex
      and the deadlock is complete.
      
      The solution is to replace the TSC via the clocksource watchdog
      mechanism. Mark the TSC as unstable and schedule the watchdog work so
      it gets removed in the watchdog thread context.
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      LKML-Reference: <new-submission>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: John Stultz <johnstul@us.ibm.com>
      7285dd7f
  23. 15 8月, 2009 1 次提交
  24. 11 8月, 2009 1 次提交
    • L
      x86: Fix serialization in pit_expect_msb() · b6e61eef
      Linus Torvalds 提交于
      Wei Chong Tan reported a fast-PIT-calibration corner-case:
      
      | pit_expect_msb() is vulnerable to SMI disturbance corner case
      | in some platforms which causes /proc/cpuinfo to show wrong
      | CPU MHz value when quick_pit_calibrate() jumps to success
      | section.
      
      I think that the real issue isn't even an SMI - but the fact
      that in the very last iteration of the loop, there's no
      serializing instruction _after_ the last 'rdtsc'. So even in
      the absense of SMI's, we do have a situation where the cycle
      counter was read without proper serialization.
      
      The last check should be done outside the outer loop, since
      _inside_ the outer loop, we'll be testing that the PIT has
      the right MSB value has the right value in the next iteration.
      
      So only the _last_ iteration is special, because that's the one
      that will not check the PIT MSB value any more, and because the
      final 'get_cycles()' isn't serialized.
      
      In other words:
      
       - I'd like to move the PIT MSB check to after the last
         iteration, rather than in every iteration
      
       - I think we should comment on the fact that it's also a
         serializing instruction and so 'fences in' the TSC read.
      
      Here's a suggested replacement.
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      Reported-by: N"Tan, Wei Chong" <wei.chong.tan@intel.com>
      Tested-by: N"Tan, Wei Chong" <wei.chong.tan@intel.com>
      LKML-Reference: <B28277FD4E0F9247A3D55704C440A140D5D683F3@pgsmsx504.gar.corp.intel.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      b6e61eef
  25. 17 6月, 2009 2 次提交
    • P
      sched, x86: Fix cpufreq + sched_clock() TSC scaling · 84599f8a
      Peter Zijlstra 提交于
      For freqency dependent TSCs we only scale the cycles, we do not account
      for the discrepancy in absolute value.
      
      Our current formula is: time = cycles * mult
      
      (where mult is a function of the cpu-speed on variable tsc machines)
      
      Suppose our current cycle count is 10, and we have a multiplier of 5,
      then our time value would end up being 50.
      
      Now cpufreq comes along and changes the multiplier to say 3 or 7,
      which would result in our time being resp. 30 or 70.
      
      That means that we can observe random jumps in the time value due to
      frequency changes in both fwd and bwd direction.
      
      So what this patch does is change the formula to:
      
        time = cycles * frequency + offset
      
      And we calculate offset so that time_before == time_after, thereby
      ridding us of these jumps in time.
      
      [ Impact: fix/reduce sched_clock() jumps across frequency changing events ]
      Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      LKML-Reference: <new-submission>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      Chucked-on-by: NThomas Gleixner <tglx@linutronix.de>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      84599f8a
    • A
      time: move PIT_TICK_RATE to linux/timex.h · 08604bd9
      Arnd Bergmann 提交于
      PIT_TICK_RATE is currently defined in four architectures, but in three
      different places.  While linux/timex.h is not the perfect place for it, it
      is still a reasonable replacement for those drivers that traditionally use
      asm/timex.h to get CLOCK_TICK_RATE and expect it to be the PIT frequency.
      
      Note that for Alpha, the actual value changed from 1193182UL to 1193180UL.
       This is unlikely to make a difference, and probably can only improve
      accuracy.  There was a discussion on the correct value of CLOCK_TICK_RATE
      a few years ago, after which every existing instance was getting changed
      to 1193182.  According to the specification, it should be
      1193181.818181...
      Signed-off-by: NArnd Bergmann <arnd@arndb.de>
      Cc: Richard Henderson <rth@twiddle.net>
      Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Len Brown <lenb@kernel.org>
      Cc: john stultz <johnstul@us.ibm.com>
      Cc: Dmitry Torokhov <dtor@mail.ru>
      Cc: Takashi Iwai <tiwai@suse.de>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      08604bd9
  26. 15 6月, 2009 1 次提交
    • D
      [CPUFREQ] Clean up convoluted code in arch/x86/kernel/tsc.c:time_cpufreq_notifier() · 931db6a3
      Dave Jones 提交于
      Christoph Hellwig noticed the following potential uninitialised use:
      
       > arch/x86/kernel/tsc.c: In function 'time_cpufreq_notifier':
       > arch/x86/kernel/tsc.c:634: warning: 'dummy' may be used uninitialized in this function
       >
       > where we do have CONFIG_SMP set, freq->flags & CPUFREQ_CONST_LOOPS is
       > true and ref_freq is false.
      
      It seems plausable, though the circumstances for hitting it are really low.
      Nearly all SMP capable cpufreq drivers set CPUFREQ_CONST_LOOPS.
      powernow-k8 is really the only exception. The older CPUs were typically
      only ever UP. (powernow-k7 never supported SMP for eg)
      
      It's worth fixing regardless, as it cleans up the code.
      
      Fix possible uninitialized use of dummy, by just removing it,
      and making the setting of lpj more obvious.
      Signed-off-by: NDave Jones <davej@redhat.com>
      931db6a3
  27. 28 5月, 2009 1 次提交
  28. 22 4月, 2009 1 次提交
  29. 12 4月, 2009 1 次提交
    • J
      x86: clean up declarations and variables · 2c1b284e
      Jaswinder Singh Rajput 提交于
      Impact: cleanup, no code changed
      
       - syscalls.h       update declarations due to unifications
       - irq.c            declare smp_generic_interrupt() before it gets used
       - process.c        declare sys_fork() and sys_vfork() before they get used
       - tsc.c            rename tsc_khz shadowed variable
       - apic/probe_32.c  declare apic_default before it gets used
       - apic/nmi.c       prev_nmi_count should be unsigned
       - apic/io_apic.c   declare smp_irq_move_cleanup_interrupt() before it gets used
       - mm/init.c        declare direct_gbpages and free_initrd_mem before they get used
      Signed-off-by: NJaswinder Singh Rajput <jaswinder@kernel.org>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      2c1b284e
  30. 17 3月, 2009 2 次提交
    • L
      Fast TSC calibration: calculate proper frequency error bounds · 9e8912e0
      Linus Torvalds 提交于
      In order for ntpd to correctly synchronize the clocks, the frequency of
      the system clock must not be off by more than 500 ppm (or, put another
      way, 1:2000), or ntpd will end up giving up on trying to synchronize
      properly, and ends up reseting the clock in jumps instead.
      
      The fast TSC PIT calibration sometimes failed this test - it was
      assuming that the PIT reads always took about one microsecond each (2us
      for the two reads to get a 16-bit timer), and that calibrating TSC to
      the PIT over 15ms should thus be sufficient to get much closer than
      500ppm (max 2us error on both sides giving 4us over 15ms: a 270 ppm
      error value).
      
      However, that assumption does not always hold: apparently some hardware
      is either very much slower at reading the PIT registers, or there was
      other noise causing at least one machine to get 700+ ppm errors.
      
      So instead of using a fixed 15ms timing loop, this changes the fast PIT
      calibration to read the TSC delta over the individual PIT timer reads,
      and use the result to calculate the error bars on the PIT read timing
      properly.  We then successfully calibrate the TSC only if the maximum
      error bars fall below 500ppm.
      
      In the process, we also relax the timing to allow up to 25ms for the
      calibration, although it can happen much faster depending on hardware.
      Reported-and-tested-by: NJesper Krogh <jesper@krogh.cc>
      Cc: john stultz <johnstul@us.ibm.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Acked-by: NIngo Molnar <mingo@elte.hu>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      9e8912e0
    • L
      Fix potential fast PIT TSC calibration startup glitch · a6a80e1d
      Linus Torvalds 提交于
      During bootup, when we reprogram the PIT (programmable interval timer)
      to start counting down from 0xffff in order to use it for the fast TSC
      calibration, we should also make sure to delay a bit afterwards to allow
      the PIT hardware to actually start counting with the new value.
      
      That will happens at the next CLK pulse (1.193182 MHz), so the easiest
      way to do that is to just wait at least one microsecond after
      programming the new PIT counter value.  We do that by just reading the
      counter value back once - which will take about 2us on PC hardware.
      Reported-and-tested-by: Njohn stultz <johnstul@us.ibm.com>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      a6a80e1d
  31. 11 3月, 2009 1 次提交
    • I
      x86, sched_clock(): mark variables read-mostly · f24ade3a
      Ingo Molnar 提交于
      Impact: micro-optimization
      
      There's a number of variables in the sched_clock() path that are
      in .data/.bss - but not marked __read_mostly. This creates the
      danger of accidental false cacheline sharing with some other,
      write-often variable.
      
      So mark them __read_mostly.
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      f24ade3a