提交 · b0f969009f647cd473c5e559aeec9c4229d12f87 · openanolis / cloud-kernel

19 10月, 2010 1 次提交

由 Venkatesh Pallipadi 提交于 10月 04, 2010

This patch adds IRQ_TIME_ACCOUNTING option on x86 and runtime enables it
when TSC is enabled.

This change just enables fine grained irq time accounting, isn't used yet.
Following patches use it for different purposes.
Signed-off-by: NVenkatesh Pallipadi <venki@google.com>
Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <1286237003-12406-6-git-send-email-venki@google.com>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

e82b8e4e

11 9月, 2010 2 次提交

x86, tsc: Fix a preemption leak in restore_sched_clock_state() · 55496c89

由 Peter Zijlstra 提交于 9月 10, 2010

Doh, a real life genuine preemption leak..

This caused a suspend failure.

Reported-bisected-and-tested-by-the-invaluable: Jeff Chua <jeff.chua.linux@gmail.com>
Acked-by: NSuresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Rafael J. Wysocki <rjw@sisk.pl>
Cc: Nico Schottelius <nico-linux-20100709@schottelius.org>
Cc: Jesse Barnes <jbarnes@virtuousgeek.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Florian Pritz <flo@xssn.at>
Cc: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: Len Brown <lenb@kernel.org>
Cc: <stable@kernel.org> # Greg, please apply after: cd7240c0 ("x86, tsc, sched: Recompute cyc2ns_offset's during resume from")
sleep states
LKML-Reference: <1284150773.402.122.camel@laptop>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

55496c89

x86, tsc: Fix a preemption leak in restore_sched_clock_state() · 5ee5e97e

由 Peter Zijlstra 提交于 9月 10, 2010

A real life genuine preemption leak..
Reported-and-tested-by: NJeff Chua <jeff.chua.linux@gmail.com>
Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: NSuresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

5ee5e97e

26 8月, 2010 1 次提交

x86, tsc: Remove CPU frequency calibration on AMD · acf01734

由 Borislav Petkov 提交于 8月 25, 2010

6b37f5a2 introduced the CPU frequency
calibration code for AMD CPUs whose TSCs didn't increment with the
core's P0 frequency. From F10h, revB onward, however, the TSC increment
rate is denoted by MSRC001_0015[24] and when this bit is set (which
should be done by the BIOS) the TSC increments with the P0 frequency
so the calibration is not needed and booting can be a couple of mcecs
faster on those machines.

Besides, there should be virtually no machines out there which don't
have this bit set, therefore this calibration can be safely removed. It
is a shaky hack anyway since it assumes implicitly that the core is in
P0 when BIOS hands off to the OS, which might not always be the case.
Signed-off-by: NBorislav Petkov <borislav.petkov@amd.com>
LKML-Reference: <20100825162823.GE26438@aftab>
Signed-off-by: NH. Peter Anvin <hpa@linux.intel.com>

acf01734

20 8月, 2010 1 次提交

x86, tsc, sched: Recompute cyc2ns_offset's during resume from sleep states · cd7240c0

由 Suresh Siddha 提交于 8月 19, 2010

TSC's get reset after suspend/resume (even on cpu's with invariant TSC
which runs at a constant rate across ACPI P-, C- and T-states). And in
some systems BIOS seem to reinit TSC to arbitrary large value (still
sync'd across cpu's) during resume.

This leads to a scenario of scheduler rq->clock (sched_clock_cpu()) less
than rq->age_stamp (introduced in 2.6.32). This leads to a big value
returned by scale_rt_power() and the resulting big group power set by the
update_group_power() is causing improper load balancing between busy and
idle cpu's after suspend/resume.

This resulted in multi-threaded workloads (like kernel-compilation) go
slower after suspend/resume cycle on core i5 laptops.

Fix this by recomputing cyc2ns_offset's during resume, so that
sched_clock() continues from the point where it was left off during
suspend.
Reported-by: NFlorian Pritz <flo@xssn.at>
Signed-off-by: NSuresh Siddha <suresh.b.siddha@intel.com>
Cc: <stable@kernel.org> # [v2.6.32+]
Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <1282262618.2675.24.camel@sbsiddha-MOBL3.sc.intel.com>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

cd7240c0

27 7月, 2010 1 次提交

x86: Convert common clocksources to use clocksource_register_hz/khz · f12a15be

由 John Stultz 提交于 7月 13, 2010

This converts the most common of the x86 clocksources over to use
clocksource_register_hz/khz.
Signed-off-by: NJohn Stultz <johnstul@us.ibm.com>
LKML-Reference: <1279068988-21864-11-git-send-email-johnstul@us.ibm.com>
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>

f12a15be

09 2月, 2010 1 次提交

tree-wide: Assorted spelling fixes · 3ad2f3fb

由 Daniel Mack 提交于 2月 03, 2010

In particular, several occurances of funny versions of 'success',
'unknown', 'therefore', 'acknowledge', 'argument', 'achieve', 'address',
'beginning', 'desirable', 'separate' and 'necessary' are fixed.
Signed-off-by: NDaniel Mack <daniel@caiaq.de>
Cc: Joe Perches <joe@perches.com>
Cc: Junio C Hamano <gitster@pobox.com>
Signed-off-by: NJiri Kosina <jkosina@suse.cz>

3ad2f3fb

05 2月, 2010 1 次提交

clocksource: add argument to resume callback · 17622339

由 Magnus Damm 提交于 2月 02, 2010

Pass the clocksource as an argument to the clocksource resume callback. 
Needed so we can point out which CMT channel the sh_cmt.c driver shall
resume.
Signed-off-by: NMagnus Damm <damm@opensource.se>
Cc: john stultz <johnstul@us.ibm.com>
Cc: Paul Mundt <lethal@linux-sh.org>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>

17622339

18 1月, 2010 1 次提交

x86, trivial: Fix grammo in tsc comment about Geode TSC reliability · 00097c4f

由 Thadeu Lima de Souza Cascardo 提交于 1月 17, 2010

Signed-off-by: NThadeu Lima de Souza Cascardo <cascardo@holoscopio.com>
Cc: marcelo@kvack.org
Cc: dilinger@collabora.co.uk
Cc: trivial@kernel.org
LKML-Reference: <1263764685-9871-1-git-send-email-cascardo@holoscopio.com>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

00097c4f

18 12月, 2009 1 次提交

x86: Reenable TSC sync check at boot, even with NONSTOP_TSC · 6c56ccec

由 Pallipadi, Venkatesh 提交于 12月 17, 2009

Commit 83ce4009 did the following change
If the TSC is constant and non-stop, also set it reliable.

But, there seems to be few systems that will end up with TSC warp across
sockets, depending on how the cpus come out of reset. Skipping TSC sync
test on such systems may result in time inconsistency later.

So, reenable TSC sync test even on constant and non-stop TSC systems.
Set, sched_clock_stable to 1 by default and reset it in
mark_tsc_unstable, if TSC sync fails.

This change still gives perf benefit mentioned in 83ce4009 for systems
where TSC is reliable.
Signed-off-by: NVenkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Acked-by: NSuresh Siddha <suresh.b.siddha@intel.com>
LKML-Reference: <20091217202702.GA18015@linux-os.sc.intel.com>
Signed-off-by: NH. Peter Anvin <hpa@zytor.com>

6c56ccec

21 9月, 2009 1 次提交

x86: Trivial whitespace cleanups · 878f4f53

由 Felipe Contreras 提交于 9月 17, 2009

Signed-off-by: NFelipe Contreras <felipe.contreras@gmail.com>
Cc: Vegard Nossum <vegardno@ifi.uio.no>
Cc: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Alok N Kataria <akataria@vmware.com>
Cc: "Tan Wei Chong" <wei.chong.tan@intel.com>
Cc: Len Brown <len.brown@intel.com>
Cc: Lin Ming <ming.m.lin@intel.com>
Cc: Bob Moore <robert.moore@intel.com>
LKML-Reference: <1253137123-18047-2-git-send-email-felipe.contreras@gmail.com>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

878f4f53

31 8月, 2009 3 次提交

x86: Move tsc_calibration to x86_init_ops · 2d826404

由 Thomas Gleixner 提交于 8月 20, 2009

TSC calibration is modified by the vmware hypervisor and paravirt by
separate means. Moorestown wants to add its own calibration routine as
well. So make calibrate_tsc a proper x86_init_ops function and
override it by paravirt or by the early setup of the vmware
hypervisor.
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>

2d826404

x86: Move calibrate_cpu to tsc.c · 08047c4f

由 Thomas Gleixner 提交于 8月 20, 2009

Move the code where it's only user is. Also we need to look whether
this hardwired hackery might interfere with perfcounters.
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>

08047c4f

x86: Add timer_init to x86_init_ops · 845b3944

由 Thomas Gleixner 提交于 8月 19, 2009

The timer init code is convoluted with several quirks and the paravirt
timer chooser. Figuring out which code path is actually taken is not
for the faint hearted.

Move the numaq TSC quirk to tsc_pre_init x86_init_ops function and
replace the paravirt time chooser and the remaining x86 quirk with a
simple x86_init_ops function.
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>

845b3944

29 8月, 2009 2 次提交

x86: Make tsc=reliable override boot time stability checks · d3b8f889

由 john stultz 提交于 8月 17, 2009

This patch makes the tsc=reliable option disable the boot time
stability checks. Currently the option only disables the runtime
watchdog checks. This change allows folks who want to override the
boot time TSC stability checks and use the TSC when the system would
otherwise disqualify it.

There still are some situations that the TSC will be disqualified,
such as cpufreq scaling. But these are situations where the box will
hang if allowed.

Patch also includes a fix for an issue found by Thomas Gleixner, where
the TSC disqualification message wouldn't be printed after a call to
unsynchronized_tsc().
Signed-off-by: NJohn Stultz <johnstul@us.ibm.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: akataria@vmware.com
Cc: Stephen Hemminger <shemminger@vyatta.com>
LKML-Reference: <1250552447.7212.92.camel@localhost.localdomain>
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>

d3b8f889

clocksource: Resolve cpu hotplug dead lock with TSC unstable · 7285dd7f

由 Thomas Gleixner 提交于 8月 28, 2009

Martin Schwidefsky analyzed it:
To register a clocksource the clocksource_mutex is acquired and if
necessary timekeeping_notify is called to install the clocksource as
the timekeeper clock. timekeeping_notify uses stop_machine which needs
to take cpu_add_remove_lock mutex.
Starting a new cpu is done with the cpu_add_remove_lock mutex held.
native_cpu_up checks the tsc of the new cpu and if the tsc is no good
clocksource_change_rating is called. Which needs the clocksource_mutex
and the deadlock is complete.

The solution is to replace the TSC via the clocksource watchdog
mechanism. Mark the TSC as unstable and schedule the watchdog work so
it gets removed in the watchdog thread context.
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
LKML-Reference: <new-submission>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: John Stultz <johnstul@us.ibm.com>

7285dd7f

15 8月, 2009 1 次提交

timekeeping: Move reset of cycle_last for tsc clocksource to tsc · 1be39679

由 Martin Schwidefsky 提交于 8月 14, 2009

change_clocksource resets the cycle_last value to zero then sets it to
a value read from the clocksource. The reset to zero is required only
for the TSC clocksource to make the read_tsc function work after a
resume. The reason is that the TSC read function uses cycle_last to
detect backwards going TSCs. In the resume case cycle_last contains
the TSC value from the last update before the suspend. On resume the
TSC starts counting from 0 again and would trip over the cycle_last
comparison.

This is subtle and surprising. Move the reset to a resume function in
the tsc code.
Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
Acked-by: NThomas Gleixner <tglx@linutronix.de>
Acked-by: NJohn Stultz <johnstul@us.ibm.com>
Cc: Daniel Walker <dwalker@fifo99.com>
LKML-Reference: <20090814134808.142191175@de.ibm.com>
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>

1be39679

11 8月, 2009 1 次提交

x86: Fix serialization in pit_expect_msb() · b6e61eef

由 Linus Torvalds 提交于 7月 31, 2009

Wei Chong Tan reported a fast-PIT-calibration corner-case:

| pit_expect_msb() is vulnerable to SMI disturbance corner case
| in some platforms which causes /proc/cpuinfo to show wrong
| CPU MHz value when quick_pit_calibrate() jumps to success
| section.

I think that the real issue isn't even an SMI - but the fact
that in the very last iteration of the loop, there's no
serializing instruction _after_ the last 'rdtsc'. So even in
the absense of SMI's, we do have a situation where the cycle
counter was read without proper serialization.

The last check should be done outside the outer loop, since
_inside_ the outer loop, we'll be testing that the PIT has
the right MSB value has the right value in the next iteration.

So only the _last_ iteration is special, because that's the one
that will not check the PIT MSB value any more, and because the
final 'get_cycles()' isn't serialized.

In other words:

 - I'd like to move the PIT MSB check to after the last
   iteration, rather than in every iteration

 - I think we should comment on the fact that it's also a
   serializing instruction and so 'fences in' the TSC read.

Here's a suggested replacement.
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
Reported-by: N"Tan, Wei Chong" <wei.chong.tan@intel.com>
Tested-by: N"Tan, Wei Chong" <wei.chong.tan@intel.com>
LKML-Reference: <B28277FD4E0F9247A3D55704C440A140D5D683F3@pgsmsx504.gar.corp.intel.com>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

b6e61eef

17 6月, 2009 2 次提交

sched, x86: Fix cpufreq + sched_clock() TSC scaling · 84599f8a

由 Peter Zijlstra 提交于 6月 16, 2009

For freqency dependent TSCs we only scale the cycles, we do not account
for the discrepancy in absolute value.

Our current formula is: time = cycles * mult

(where mult is a function of the cpu-speed on variable tsc machines)

Suppose our current cycle count is 10, and we have a multiplier of 5,
then our time value would end up being 50.

Now cpufreq comes along and changes the multiplier to say 3 or 7,
which would result in our time being resp. 30 or 70.

That means that we can observe random jumps in the time value due to
frequency changes in both fwd and bwd direction.

So what this patch does is change the formula to:

  time = cycles * frequency + offset

And we calculate offset so that time_before == time_after, thereby
ridding us of these jumps in time.

[ Impact: fix/reduce sched_clock() jumps across frequency changing events ]
Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <new-submission>
Signed-off-by: NIngo Molnar <mingo@elte.hu>
Chucked-on-by: NThomas Gleixner <tglx@linutronix.de>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>

84599f8a

time: move PIT_TICK_RATE to linux/timex.h · 08604bd9

由 Arnd Bergmann 提交于 6月 16, 2009

PIT_TICK_RATE is currently defined in four architectures, but in three
different places.  While linux/timex.h is not the perfect place for it, it
is still a reasonable replacement for those drivers that traditionally use
asm/timex.h to get CLOCK_TICK_RATE and expect it to be the PIT frequency.

Note that for Alpha, the actual value changed from 1193182UL to 1193180UL.
 This is unlikely to make a difference, and probably can only improve
accuracy.  There was a discussion on the correct value of CLOCK_TICK_RATE
a few years ago, after which every existing instance was getting changed
to 1193182.  According to the specification, it should be
1193181.818181...
Signed-off-by: NArnd Bergmann <arnd@arndb.de>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Len Brown <lenb@kernel.org>
Cc: john stultz <johnstul@us.ibm.com>
Cc: Dmitry Torokhov <dtor@mail.ru>
Cc: Takashi Iwai <tiwai@suse.de>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

08604bd9

15 6月, 2009 1 次提交

[CPUFREQ] Clean up convoluted code in arch/x86/kernel/tsc.c:time_cpufreq_notifier() · 931db6a3

由 Dave Jones 提交于 6月 01, 2009

Christoph Hellwig noticed the following potential uninitialised use:

 > arch/x86/kernel/tsc.c: In function 'time_cpufreq_notifier':
 > arch/x86/kernel/tsc.c:634: warning: 'dummy' may be used uninitialized in this function
 >
 > where we do have CONFIG_SMP set, freq->flags & CPUFREQ_CONST_LOOPS is
 > true and ref_freq is false.

It seems plausable, though the circumstances for hitting it are really low.
Nearly all SMP capable cpufreq drivers set CPUFREQ_CONST_LOOPS.
powernow-k8 is really the only exception. The older CPUs were typically
only ever UP. (powernow-k7 never supported SMP for eg)

It's worth fixing regardless, as it cleans up the code.

Fix possible uninitialized use of dummy, by just removing it,
and making the setting of lpj more obvious.
Signed-off-by: NDave Jones <davej@redhat.com>

931db6a3

28 5月, 2009 1 次提交

x86: move rdtsc_barrier() into the TSC vread method · 7d96fd41

由 Petr Tesarik 提交于 5月 25, 2009

The *fence instructions were moved to vsyscall_64.c by commit
cb9e35dc.  But this breaks the
vDSO, because vread methods are also called from there.

Besides, the synchronization might be unnecessary for other
time sources than TSC.

[ Impact: fix potential time warp in VDSO ]
Signed-off-by: NPetr Tesarik <ptesarik@suse.cz>
LKML-Reference: <9d0ea9ea0f866bdc1f4d76831221ae117f11ea67.1243241859.git.ptesarik@suse.cz>
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
Cc: <stable@kernel.org>

7d96fd41

22 4月, 2009 1 次提交

clocksource: pass clocksource to read() callback · 8e19608e

由 Magnus Damm 提交于 4月 21, 2009

Pass clocksource pointer to the read() callback for clocksources.  This
allows us to share the callback between multiple instances.

[hugh@veritas.com: fix powerpc build of clocksource pass clocksource mods]
[akpm@linux-foundation.org: cleanup]
Signed-off-by: NMagnus Damm <damm@igel.co.jp>
Acked-by: NJohn Stultz <johnstul@us.ibm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: NHugh Dickins <hugh@veritas.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

8e19608e

12 4月, 2009 1 次提交

x86: clean up declarations and variables · 2c1b284e

由 Jaswinder Singh Rajput 提交于 4月 11, 2009

Impact: cleanup, no code changed

 - syscalls.h       update declarations due to unifications
 - irq.c            declare smp_generic_interrupt() before it gets used
 - process.c        declare sys_fork() and sys_vfork() before they get used
 - tsc.c            rename tsc_khz shadowed variable
 - apic/probe_32.c  declare apic_default before it gets used
 - apic/nmi.c       prev_nmi_count should be unsigned
 - apic/io_apic.c   declare smp_irq_move_cleanup_interrupt() before it gets used
 - mm/init.c        declare direct_gbpages and free_initrd_mem before they get used
Signed-off-by: NJaswinder Singh Rajput <jaswinder@kernel.org>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

2c1b284e

17 3月, 2009 2 次提交

Fast TSC calibration: calculate proper frequency error bounds · 9e8912e0

由 Linus Torvalds 提交于 3月 17, 2009

In order for ntpd to correctly synchronize the clocks, the frequency of
the system clock must not be off by more than 500 ppm (or, put another
way, 1:2000), or ntpd will end up giving up on trying to synchronize
properly, and ends up reseting the clock in jumps instead.

The fast TSC PIT calibration sometimes failed this test - it was
assuming that the PIT reads always took about one microsecond each (2us
for the two reads to get a 16-bit timer), and that calibrating TSC to
the PIT over 15ms should thus be sufficient to get much closer than
500ppm (max 2us error on both sides giving 4us over 15ms: a 270 ppm
error value).

However, that assumption does not always hold: apparently some hardware
is either very much slower at reading the PIT registers, or there was
other noise causing at least one machine to get 700+ ppm errors.

So instead of using a fixed 15ms timing loop, this changes the fast PIT
calibration to read the TSC delta over the individual PIT timer reads,
and use the result to calculate the error bars on the PIT read timing
properly.  We then successfully calibrate the TSC only if the maximum
error bars fall below 500ppm.

In the process, we also relax the timing to allow up to 25ms for the
calibration, although it can happen much faster depending on hardware.
Reported-and-tested-by: NJesper Krogh <jesper@krogh.cc>
Cc: john stultz <johnstul@us.ibm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Acked-by: NIngo Molnar <mingo@elte.hu>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

9e8912e0

Fix potential fast PIT TSC calibration startup glitch · a6a80e1d

由 Linus Torvalds 提交于 3月 17, 2009

During bootup, when we reprogram the PIT (programmable interval timer)
to start counting down from 0xffff in order to use it for the fast TSC
calibration, we should also make sure to delay a bit afterwards to allow
the PIT hardware to actually start counting with the new value.

That will happens at the next CLK pulse (1.193182 MHz), so the easiest
way to do that is to just wait at least one microsecond after
programming the new PIT counter value. We do that by just reading the
counter value back once - which will take about 2us on PC hardware.
Reported-and-tested-by: Njohn stultz <johnstul@us.ibm.com>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

a6a80e1d

11 3月, 2009 1 次提交

x86, sched_clock(): mark variables read-mostly · f24ade3a

由 Ingo Molnar 提交于 3月 10, 2009

Impact: micro-optimization

There's a number of variables in the sched_clock() path that are
in .data/.bss - but not marked __read_mostly. This creates the
danger of accidental false cacheline sharing with some other,
write-often variable.

So mark them __read_mostly.
Signed-off-by: NIngo Molnar <mingo@elte.hu>

f24ade3a

25 2月, 2009 1 次提交

[CPUFREQ] p4-clockmod reports wrong frequency. · 199785ea

由 Matthias-Christian Ott 提交于 2月 20, 2009

http://bugzilla.kernel.org/show_bug.cgi?id=10968

[ Updated for current tree, and fixed compile failure
  when p4-clockmod was built modular -- davej]

From: Matthias-Christian Ott <ott@mirix.org>
Signed-off-by: NDominik Brodowski <linux@brodo.de>
Signed-off-by: NDave Jones <davej@redhat.com>

199785ea

29 1月, 2009 1 次提交

x86: replace CONFIG_X86_SMP with CONFIG_SMP · 3e5095d1

由 Ingo Molnar 提交于 1月 27, 2009

The x86/Voyager subarch used to have this distinction between
 'x86 SMP support' and 'Voyager SMP support':

 config X86_SMP
	bool
	depends on SMP && ((X86_32 && !X86_VOYAGER) || X86_64)

This is a pointless distinction - Voyager can (and already does) use
smp_ops to implement various SMP quirks it has - and it can be extended
more to cover all the specialities of Voyager.

So remove this complication in the Kconfig space.
Signed-off-by: NIngo Molnar <mingo@elte.hu>

3e5095d1

09 11月, 2008 1 次提交

sched: optimize sched_clock() a bit · 7cbaef9c

由 Ingo Molnar 提交于 11月 08, 2008

sched_clock() uses cycles_2_ns() needlessly - which is an irq-disabling
variant of __cycles_2_ns().

Most of the time sched_clock() is called with irqs disabled already.
The few places that call it with irqs enabled need to be updated.
Signed-off-by: NIngo Molnar <mingo@elte.hu>

7cbaef9c

04 11月, 2008 1 次提交

x86: don't use tsc_khz to calculate lpj if notsc is passed · 70de9a97

由 Alok Kataria 提交于 11月 03, 2008

Impact: fix udelay when "notsc" boot parameter is passed

With notsc passed on commandline, tsc may not be used for
udelays, make sure that we do not use tsc_khz to calculate
the lpj value in such cases.
Reported-by: NBartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Signed-off-by: NAlok N Kataria <akataria@vmware.com>
Cc: <stable@kernel.org>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

70de9a97

02 11月, 2008 2 次提交

x86: Skip verification by the watchdog for TSC clocksource. · 395628ef

由 Alok Kataria 提交于 10月 24, 2008

Impact: Changes timekeeping on Vmware (or with tsc=reliable).

This is achieved by resetting the CLOCKSOURCE_MUST_VERIFY flag.

We add a tsc=reliable commandline option to enable this.
This enables legacy hardware without HPET, LAPIC, or ACPI timers
to enter high-resolution timer mode.

Along with that have extended this to be used in virtualization environement
too. Now we also set this flag if the X86_FEATURE_TSC_RELIABLE bit is set.

This is important since there is a wrap-around problem with the acpi_pm timer.
The acpi_pm counter is just 24bits and this can overflow in ~4 seconds. With
the NO_HZ kernels in virtualized environment, there can be situations when
the guest is descheduled for longer duration, as a result we may miss the wrap
of the acpi counter. When TSC is used as a clocksource and acpi_pm timer is
being used as the watchdog clocksource this error in acpi_pm results in TSC
being marked as unstable, and essentially results in time dropping in chunks
of 4 seconds whenever this wrap is missed. Since the virtualized TSC is
reliable on VMware, we should always use the TSCs clocksource on VMware, so
we skip the verfication at runtime, by checking for the feature bit.

Since we reset the flag for mgeode systems too, i have combined
the mgeode case with the feature bit check.
Signed-off-by: NJeff Hansen <jhansen@cardaccess-inc.com>
Signed-off-by: NAlok N Kataria <akataria@vmware.com>
Signed-off-by: NDan Hecht <dhecht@vmware.com>
Signed-off-by: NH. Peter Anvin <hpa@zytor.com>

395628ef

x86: Hypervisor detection and get tsc_freq from hypervisor · 88b094fb

由 Alok Kataria 提交于 10月 27, 2008

Impact: Changes timebase calibration on Vmware.

v3->v2 : Abstract the hypervisor detection and feature (tsc_freq) request
	 behind a hypervisor.c file
v2->v1 : Add a x86_hyper_vendor field to the cpuinfo_x86 structure.
	 This avoids multiple calls to the hypervisor detection function.

This patch adds function to detect if we are running under VMware.
The current way to check if we are on VMware is following,
#  check if "hypervisor present bit" is set, if so read the 0x40000000
   cpuid leaf and check for "VMwareVMware" signature.
#  if the above fails, check the DMI vendors name for "VMware" string
   if we find one we query the VMware hypervisor port to check if we are
   under VMware.

The DMI + "VMware hypervisor port check" is needed for older VMware products,
which don't implement the hypervisor signature cpuid leaf.
Also note that since we are checking for the DMI signature the hypervisor
port should never be accessed on native hardware.

This patch also adds a hypervisor_get_tsc_freq function, instead of
calibrating the frequency which can be error prone in virtualized
environment, we ask the hypervisor for it. We get the frequency from
the hypervisor by accessing the hypervisor port if we are running on VMware.
Other hypervisors too can add code to the generic routine to get frequency on
their platform.
Signed-off-by: NAlok N Kataria <akataria@vmware.com>
Signed-off-by: NDan Hecht <dhecht@vmware.com>
Signed-off-by: NH. Peter Anvin <hpa@zytor.com>

88b094fb

31 10月, 2008 1 次提交

x86: use CONFIG_X86_SMP instead of CONFIG_SMP · 017d9d20

由 James Bottomley 提交于 10月 30, 2008

Impact: fix x86/Voyager boot

CONFIG_SMP is used for features which work on *all* x86 boxes.
CONFIG_X86_SMP is used for standard PC like x86 boxes (for things like
multi core and apics)
Signed-off-by: NJames Bottomley <James.Bottomley@HansenPartnership.com>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

017d9d20

07 9月, 2008 1 次提交
- I
  x86, tsc calibration: fix · 5df45515
  由 Ingo Molnar 提交于 9月 06, 2008
```
my brown paperbag day ...
Signed-off-by: NIngo Molnar <mingo@elte.hu>
```
  5df45515
05 9月, 2008 2 次提交

x86: quick TSC calibration, improve · 4156e9a8

由 Ingo Molnar 提交于 9月 04, 2008

- make sure the final TSC timestamp is reliable too
Signed-off-by: NIngo Molnar <mingo@elte.hu>

4156e9a8

x86: quick TSC calibration · 6ac40ed0

由 Linus Torvalds 提交于 9月 04, 2008

Introduce a fast TSC-calibration method on sane hardware.

It only uses 17920 PIT timer ticks to calibrate the TSC, plus 256 ticks on
each side to make sure the TSC values were very close to the tick, so the
whole calibration takes 15ms. Yet, despite only takign 15ms,
we can actually give pretty stringent guarantees of accuracy:

 - the code requires that we hit each 256-counter block at least 50 times,
   so the TSC error is basically at *MOST* just a few PIT cycles off in
   any direction. In practice, it's going to be about one microseconds
   off (which is how long it takes to read the counter)

 - so over 17920 PIT cycles, we can pretty much guarantee that the
   calibration error is less than one half of a percent.

My testing bears this out: on my machine, the quick-calibration reports
2934.085kHz, while the slow one reports 2933.415.

Yes, the slower calibration is still more precise. For me, the slow
calibration is stable to within about one hundreth of a percent, so it's
(at a guess) roughly an order-and-a-half of magnitude more precise. The
longer you wait, the more precise you can be.

However, the nice thing about the fast TSC PIT synchronization is that
it's pretty much _guaranteed_ to give that 0.5% precision, and fail
gracefully (and very quickly) if it doesn't get it. And it really is
fairly simple (even if there's a lot of _details_ there, and I didn't get
all of those right ont he first try or even the second ;)

The patch says "110 insertions", but 63 of those new lines are actually
comments.
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: NIngo Molnar <mingo@elte.hu>
---
 arch/x86/kernel/tsc.c |  111 ++++++++++++++++++++++++++++++++++++++++++++++++-
 1 files changed, 110 insertions(+), 1 deletions(-)

6ac40ed0

04 9月, 2008 3 次提交

x86: TSC make the calibration loop smarter · a977c400

由 Thomas Gleixner 提交于 9月 04, 2008

The last changes made the calibration loop 250ms long which is far
too much. Try to do that more clever.

Experiments have shown that using a 10ms delay for the PIT based calibration
gives us a good enough value. If we have a reference (HPET/PMTIMER) and the
result of the PIT and the reference is close enough, then we can break out of
the calibration loop on a match right away and use the reference value.

Otherwise we just loop 3 times and decide then, which value to take.

One caveat is that for virtualized environments the PIT calibration often does
not work at all and I found out that 10us is a bit too short as well for the
reference to give a sane result. The solution here is to make the last loop
longer when the first two PIT calibrations failed.
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

a977c400

T
x86: TSC: use one set of reference variables · 827014be
由 Thomas Gleixner 提交于 9月 04, 2008
```
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
Signed-off-by: NIngo Molnar <mingo@elte.hu>
```
827014be
T
x86: TSC: separate hpet/pmtimer calculation out · d683ef7a
由 Thomas Gleixner 提交于 9月 04, 2008
```
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
Signed-off-by: NIngo Molnar <mingo@elte.hu>
```
d683ef7a

openanolis / cloud-kernel 1 年多 前同步成功

openanolis / cloud-kernel
1 年多前同步成功