1. 03 May, 2007 2 commits
  2. 26 April, 2007 3 commits
  3. 09 April, 2007 1 commit
  4. 02 April, 2007 1 commit
    • A
      [PATCH] x86-64: Disable local APIC timer use on AMD systems with C1E · 3556ddfa
      Andi Kleen committed
      AMD dual core laptops with C1E do not run the APIC timer correctly
      when they go idle. Previously the code assumed this only happened
      on C2 or deeper.  But not all of these systems report support for C2.
      
      Use an AMD-supplied snippet to detect C1E being enabled and then disable
      local apic timer use.
      
      This supersedes an earlier workaround using DMI detection of specific systems.
      
      Thanks to Mark Langsdorf for the detection snippet.
      Signed-off-by: Andi Kleen <ak@suse.de>
      3556ddfa
  5. 28 March, 2007 1 commit
  6. 24 March, 2007 2 commits
    • R
      [PATCH] i386: clear segment register padding in core dumps · 6ea65ff7
      Roland McGrath committed
      The segment register slots in struct pt_regs are padded to 32 bits.
      Some of these are stored with instructions like "pushl %es", which
      leaves the high 16 bits as they were.  So the high bits of these
      fields in struct pt_regs contain kernel stack garbage.  These bits are
      ignored by everything and never leak to user space, except in core
      dumps.  The user struct pt_regs is always at the base of the thread's
      kernel stack and so it seems unlikely the information that leaks from
      here is ever worthwhile so as to be a security concern, but I'm not
      sure about that.  It has been this way for ages; userland consumers of
      core dumps all mask off these high bits themselves.  So it is not urgent.
      
      This change masks off the padding bits of the segment register slots
      in core dumps.  ptrace already masks off these high bits, so this
      makes the values in core dumps consistent with what ptrace would
      report just before the process died.
      
      As I read the processor manuals, the cs and ss values will always be
      padded with zero bits rather than stack garbage.  But unlike "pushl %es",
      this is not simple to test with a userland program.  So I added the two
      instructions rather than wonder if they are really never necessary.
      
      I think that x86_64 does not have this problem (for either 32-bit or
      64-bit processes).  It only uses "mov" instructions from segment
      registers, which zero-extend.
      Signed-off-by: Roland McGrath <roland@redhat.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      6ea65ff7
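The masking described in this commit can be illustrated with a minimal user-space sketch. It is not the kernel patch itself: `struct seg_slots`, `mask_selector` and `clear_segment_padding` are hypothetical names, and the struct only stands in for the segment register slots of struct pt_regs, which hold a 16-bit selector in a 32-bit field.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the segment register slots of struct pt_regs:
 * each 16-bit selector lives in a 32-bit slot, so the upper half is padding. */
struct seg_slots {
    uint32_t ds, es, fs, gs, cs, ss;
};

/* Keep only the low 16 bits, which hold the actual selector value. */
static uint32_t mask_selector(uint32_t slot)
{
    return slot & 0xffffu;
}

/* Clear the padding bits of every segment slot, as one would do before
 * copying the registers into a core dump. */
static void clear_segment_padding(struct seg_slots *regs)
{
    assert(regs != NULL);
    regs->ds = mask_selector(regs->ds);
    regs->es = mask_selector(regs->es);
    regs->fs = mask_selector(regs->fs);
    regs->gs = mask_selector(regs->gs);
    regs->cs = mask_selector(regs->cs);
    regs->ss = mask_selector(regs->ss);
}
```

Masking cs and ss as well mirrors the commit's choice to add the two extra instructions rather than rely on the processor always zero-padding those values.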
    • T
      [PATCH] i386: add command line option "local_apic_timer_c2_ok" · e585bef8
      Thomas Gleixner committed
      It turned out that it is almost impossible to trust ACPI, BIOS & Co.
      regarding the C states. This was the reason to switch the local apic
      timer off in C2 state already. OTOH there are sane and well behaving
      systems, which get punished by that decision.
      
      Allow the user to confirm that the local apic timer is trustworthy in C2
      state. This keeps the default behaviour on the safe side.
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Acked-by: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      e585bef8
  7. 17 March, 2007 1 commit
  8. 15 March, 2007 2 commits
  9. 13 March, 2007 1 commit
  10. 07 March, 2007 2 commits
  11. 06 March, 2007 1 commit
    • I
      [PATCH] disable NMI watchdog by default · 6ebf622b
      Ingo Molnar committed
      there's a new NMI watchdog related problem: KVM crashes on certain
      bzImages because ... we enable the NMI watchdog by default (even if the
      user does not ask for it), and no other OS on this planet does that so
      KVM doesn't have emulation for that yet. So KVM injects a #GP, which
      crashes the Linux guest:
      
       general protection fault: 0000 [#1]
       PREEMPT SMP
       Modules linked in:
       CPU:    0
       EIP:    0060:[<c011a8ae>]    Not tainted VLI
       EFLAGS: 00000246   (2.6.20-rc5-rt0 #3)
       EIP is at setup_apic_nmi_watchdog+0x26d/0x3d3
      
      and no, i did /not/ request an nmi_watchdog on the boot command line!
      
      Solution: turn off that darn thing! It's a debug tool, not a 'make life
      harder' tool!!
      
      with this patch the KVM guest boots up just fine.
      
      And with this my laptop (Lenovo T60) also stopped its sporadic hard
      hanging (sometimes in acpi_init(), sometimes later during bootup,
      sometimes much later during actual use) as well. It hung with both
      nmi_watchdog=1 and nmi_watchdog=2, so it's generally the fact of NMI
      injection that is causing problems, not the NMI watchdog variant, nor
      any particular bootup code.
      
      [ NMI breaks on some systems, esp in combination with SMM -Arjan ]
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      Acked-by: Arjan van de Ven <arjan@linux.intel.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      6ebf622b
  12. 05 March, 2007 8 commits
    • Z
      [PATCH] vmi: apic ops · 772205f6
      Zachary Amsden committed
      Use para_fill instead of directly setting the APIC ops to the result of the
      vmi_get_function call - this allows one to implement a VMI ROM without
      implementing APIC functions, just using the native APIC functions.
      
      While doing this, I realized that there is a lot more cleanup that should have
      been done.  Basically, we should never assume that the ROM implements a
      specific set of functions, and always allow fallback to the native
      implementation.
      
      This is critical for future compatibility.
      Signed-off-by: Anthony Liguori <anthony@codemonkey.ws>
      Signed-off-by: Zachary Amsden <zach@vmware.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      772205f6
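The fallback idea in this commit (install a ROM-provided function only when the ROM actually implements it, and otherwise keep the native one) might look roughly like the sketch below. This is not the real para_fill macro: `fill_op`, `struct apic_ops`, `native_read` and `rom_read` are hypothetical names, and a NULL pointer stands in for "the ROM does not implement this call".

```c
#include <assert.h>
#include <stddef.h>

typedef unsigned long (*apic_read_fn)(unsigned long reg);

/* Hypothetical ops table; the slot is pre-populated with the native function. */
struct apic_ops {
    apic_read_fn read;
};

/* Placeholder implementations that return distinguishable values. */
static unsigned long native_read(unsigned long reg) { return 0x100 + reg; }
static unsigned long rom_read(unsigned long reg)    { return 0x200 + reg; }

/* Override the slot only if the ROM supplied an implementation.
 * A NULL rom_fn leaves the native function in place. */
static void fill_op(apic_read_fn *slot, apic_read_fn rom_fn)
{
    assert(slot != NULL && *slot != NULL);
    if (rom_fn != NULL)
        *slot = rom_fn;
}
```

Because the table always starts out with working native entries, a ROM that implements only part of the interface still yields a fully usable set of operations.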
    • Z
      [PATCH] vmi: pit override · e30fab3a
      Zachary Amsden committed
      The time_init_hook in paravirt-ops no longer functions in the correct manner
      after the integration of the hrtimers code.  The problem is that now the call
      path for time initialization is:
      
        time_init :
             late_time_init = hpet_time_init;
      
        late_time_init -> hpet_time_init:
             setup_pit_timer (BAD)
             do_time_init --> (via paravirt.h)
                time_init_hook --> (via arch_hooks.h)
                    time_init_hook (in SUBARCH/setup.c)
      
      If this isn't confusing enough, the paravirt case goes through an indirect
      function pointer in the paravirt-ops table.  The problem is, by the time the
      paravirt hook is called, the pit timer is already enabled.
      
      But paravirt guests have their own timer, and don't want to use the PIT.
      Rather than intensify the struggle for power going on here, just make it all
      nice and simple and just unconditionally do all timer setup in the
      late_time_init hook.  This also has the advantage of enabling timers in the
      same place in all code paths, so everyone has the same bugs and we don't have
      outliers who break other code because they turn on timer too early or too
      late.
      
      So the paravirt-ops time init function is now by default hpet_time_init, which
      is the time init function used for native hardware.  Paravirt guests have the
      chance to override this when they setup the paravirt-ops table, and should
      need no change.
      Signed-off-by: Zachary Amsden <zach@vmware.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      e30fab3a
    • Z
      [PATCH] vmi: paravirt drop udelay op · eda08b1b
      Zachary Amsden committed
      Not respecting udelay causes problems with any virtual hardware that is passed
      through to real hardware.  This can be noticed by any device that interacts
      with the real world in real time - like AP startup, which takes real time.  Or
      keyboard LEDs, which should blink in real-time.  Or floppy drives, but only
      when passed through to a real floppy controller on OSes which can't
      sufficiently buffer the floppy commands to emulate a zero latency floppy.  Or
      IDE drives, when connecting to a physical CDROM.
      
      This was mostly a hack to get the kernel to boot faster, but it introduced a
      number of misvirtualization bugs, and Alan and Pavel argued pretty strongly
      against it.  We were the only client, and now want to clean up this cruft.
      Signed-off-by: Zachary Amsden <zach@vmware.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      eda08b1b
    • Z
      [PATCH] vmi: fix highpte · 9a1c13e9
      Zachary Amsden committed
      Provide a PT map hook for HIGHPTE kernels to designate where they are mapping
      page tables.  This information is required so the physical address of PTE
      updates can be determined; otherwise, the mm layer would have to carry the
      physical address all the way to each PTE modification callsite, which is even
      more hideous than the macros required to provide the proper hooks.
      
      So let's not mess up arch neutral code to achieve this, but keep the horror in
      an #ifdef HIGHPTE in include/asm-i386/pgtable.h.  I had to use macros here
      because some types are not yet defined in all the include paths for this
      header.
      
      This patch is absolutely required for HIGHPTE kernels to operate properly with
      VMI.
      Signed-off-by: Zachary Amsden <zach@vmware.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      9a1c13e9
    • Z
      [PATCH] vmi: cpu cycles fix · 1182d852
      Zachary Amsden committed
      In order to share the common code in tsc.c which does CPU Khz calibration, we
      need to make an accurate value of CPU speed available to the tsc.c code.  This
      value loses a lot of precision in a VM because of the timing differences with
      real hardware, but we need it to be as precise as possible so the guest can
      make accurate time calculations with the cycle counters.
      Signed-off-by: Zachary Amsden <zach@vmware.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      1182d852
    • Z
      [PATCH] vmi: sched clock paravirt op fix · 6cb9a835
      Zachary Amsden committed
      The custom_sched_clock hook is broken.  The result from sched_clock needs to
      be in nanoseconds, not in CPU cycles.  The TSC is insufficient for this
      purpose, because TSC is poorly defined in a virtual environment, and mostly
      represents real world time instead of scheduled process time (which can be
      interrupted without notice when a virtual machine is descheduled).
      
      To make the scheduler consistent, we must expose a different nature of time,
      that is scheduled time.  So deprecate this custom_sched_clock hack and turn it
      into a paravirt-op, as it should have been all along.  This allows the tsc.c
      code which converts cycles to nanoseconds to be shared by all paravirt-ops
      backends.
      
      It is unfortunate to add a new paravirt-op, but this is a very distinct
      abstraction which is clearly different for all virtual machine
      implementations, and it gets rid of an ugly indirect function which I
      ashamedly admit I hacked in to try to get this to work earlier, and then even
      got in the wrong units.
      Signed-off-by: Zachary Amsden <zach@vmware.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      6cb9a835
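The unit mismatch this commit fixes (sched_clock must return nanoseconds, not CPU cycles) comes down to a cycles-to-nanoseconds conversion. A minimal fixed-point sketch of such a conversion follows. The names `cyc2ns_scale` and `cycles_to_ns` and the shift of 10 are illustrative assumptions, not taken from the commit text, and the code is a user-space approximation rather than the tsc.c implementation.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed fixed-point precision: the scale factor carries 10 fractional bits. */
#define SCALE_SHIFT 10

/* Precompute nanoseconds per cycle as a fixed-point factor.
 * One cycle lasts 10^6 / cpu_khz nanoseconds. */
static uint64_t cyc2ns_scale(uint32_t cpu_khz)
{
    assert(cpu_khz != 0);
    return ((uint64_t)1000000 << SCALE_SHIFT) / cpu_khz;
}

/* Convert a cycle count to nanoseconds with a multiply and a shift,
 * avoiding a division on every call. The multiplication can overflow
 * for very large cycle counts; this sketch does not guard against that. */
static uint64_t cycles_to_ns(uint64_t cycles, uint64_t scale)
{
    return (cycles * scale) >> SCALE_SHIFT;
}
```

For a 1 GHz clock (cpu_khz = 1000000) the scale is 1024, so one cycle maps to exactly one nanosecond. A paravirt backend that counts scheduled time in its own units would supply its own frequency and reuse the same arithmetic.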
    • C
      [PATCH] sched: remove SMT nice · 69f7c0a1
      Con Kolivas committed
      Remove the SMT-nice feature which idles sibling cpus on SMT cpus to
      facilitate nice working properly where cpu power is shared.  The idling of
      cpus in the presence of runnable tasks is considered too fragile, easy to
      break with outside code, and the complexity of managing this system if an
      architecture comes along with many logical cores sharing cpu power will be
      unworkable.
      
      Remove the associated per_cpu_gain variable in sched_domains used only by
      this code.
      
      Also:
      
        The reason is that with dynticks enabled, this code breaks without yet
        further tweaks so dynticks brought on the rapid demise of this code.  So
        either we tweak this code or kill it off entirely.  It was Ingo's preference
        to kill it off.  Either way this needs to happen for 2.6.21 since dynticks
        has gone in.
      Signed-off-by: Con Kolivas <kernel@kolivas.org>
      Acked-by: Ingo Molnar <mingo@elte.hu>
      Cc: Nick Piggin <nickpiggin@yahoo.com.au>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      69f7c0a1
    • J
      [PATCH] io_apic.h needs apicdef.h · 58a53b24
      Jean Delvare committed
      A -mm patch caused:
      
      In file included from drivers/pci/quirks.c:532:
      include/asm/io_apic.h:61: error: "MAX_IO_APICS" undeclared here (not in a function)
      
      So let's include the needed header.
      Signed-off-by: Jean Delvare <khali@linux-fr.org>
      Cc: Andi Kleen <ak@suse.de>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      58a53b24
  13. 27 February, 2007 1 commit
    • L
      Revert "[PATCH] i386: add idle notifier" · ea3d5226
      Linus Torvalds committed
      This reverts commit 2ff2d3d7.
      
      Uwe Bugla reports that he cannot mount a floppy drive any more, and Jiri
      Slaby bisected it down to this commit.
      
      Benjamin LaHaise also points out that this is a big hot-path, and that
      interrupt delivery while idle is very common and should not go through
      all these expensive gyrations.
      
      Fix up conflicts in arch/i386/kernel/apic.c and arch/i386/kernel/irq.c
      due to other unrelated irq changes.
      
      Cc: Stephane Eranian <eranian@hpl.hp.com>
      Cc: Andi Kleen <ak@suse.de>
      Cc: Andrew Morton <akpm@osdl.org>
      Cc: Uwe Bugla <uwe.bugla@gmx.de>
      Cc: Jiri Slaby <jirislaby@gmail.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      ea3d5226
  14. 21 February, 2007 2 commits
  15. 17 February, 2007 5 commits
    • T
      [PATCH] i386 rework local apic timer calibration · d36b49b9
      Thomas Gleixner committed
      The local apic timer calibration has two problem cases:
      
      1.  The calibration is based on readout of the PIT/HPET timer to detect the
         wrap of the periodic tick.  It happens that a box gets stuck in the
         calibration loop due to a PIT with a broken readout function.
      
      2.  CoreDuo boxen show a sporadic "PIT runs too slow" defect, which results
         in a wrong lapic calibration.  The PIT goes back to normal operation once
         the lapic timer is switched to periodic mode.
      
      Both are existing and unfixed problems in the current upstream kernel and
      prevent certain laptops and other systems from booting Linux.
      
      Rework the code to address both problems:
      
      - Make the calibration interrupt driven.  This removes the wait_timer_tick
        magic hackery from lapic.c and time_hpet.c.  The clockevents framework
        allows easy substitution of the global tick event handler for the
        calibration.  This is more accurate than monitoring jiffies.  At this point
        of the boot process, nothing disturbs the interrupt delivery, so the
        results are very accurate.
      
      - Verify the calibration against the PM timer, when available by using the
        early access function.  When the measured calibration period is outside of
        a one percent window, then the lapic timer calibration is adjusted to the
        pm timer result.
      
      - Verify the calibration by running the lapic timer with the calibration
        handler.  Disable lapic timer in case of deviation.
      
      This also removes the "synchronization" of the local apic timer to the global
      tick.  This synchronization never worked, as there is no way to synchronize
      PIT(HPET) and local APIC timer.  The synchronization by waiting for the tick
      just aligns the local APIC timer for the first events, but later the events
      drift away due to the different clocks.  Removing the "sync" is just
      randomizing the asynchronous behaviour at setup time.
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      Cc: Zachary Amsden <zach@vmware.com>
      Cc: "Eric W. Biederman" <ebiederm@xmission.com>
      Cc: Rohit Seth <rohitseth@google.com>
      Cc: Andi Kleen <ak@suse.de>
      Cc: john stultz <johnstul@us.ibm.com>
      Cc: Roman Zippel <zippel@linux-m68k.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      d36b49b9
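The PM timer cross-check described in this commit can be sketched as follows. The ACPI PM timer runs at 3.579545 MHz, so a nominal 100 ms calibration window should span about 357954 PM timer ticks. Everything else here is an assumption for illustration: the function name `verify_lapic_delta`, its parameters, and the way the correction is computed are not taken from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* ACPI PM timer frequency in Hz. */
#define PM_TICKS_PER_SEC 3579545u

/* Expected PM timer ticks in a 100 ms calibration window. */
#define PM_TICKS_100MS (PM_TICKS_PER_SEC / 10)

/* lapic_delta: lapic timer ticks counted during the calibration window.
 * pm_delta:    PM timer ticks counted during the same window.
 * Returns the lapic tick count to use for a true 100 ms period. */
static uint64_t verify_lapic_delta(uint64_t lapic_delta, uint64_t pm_delta)
{
    assert(pm_delta != 0);

    uint64_t expected = PM_TICKS_100MS;
    uint64_t diff = pm_delta > expected ? pm_delta - expected
                                        : expected - pm_delta;

    /* Within the one percent window: trust the measurement as is. */
    if (diff * 100 <= expected)
        return lapic_delta;

    /* Outside the window: the reference tick was off, so rescale the
     * lapic count to what it would have been over a real 100 ms. */
    return lapic_delta * expected / pm_delta;
}
```

If the PM timer shows that only half of the nominal window really elapsed (for example because the PIT ran at the wrong speed), the lapic count is doubled so that it again corresponds to 100 ms.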
    • T
      [PATCH] clockevents: i386 drivers · e9e2cdb4
      Thomas Gleixner committed
      Add clockevent drivers for i386: lapic (local) and PIT/HPET (global).  Update
      the timer IRQ to call into the PIT/HPET driver's event handler and the
      lapic-timer IRQ to call into the lapic clockevent driver.  The assignment of
      timer functionality is delegated to the core framework code and replaces the
      compile and runtime evaluation in do_timer_interrupt_hook().
      
      Use the clockevents broadcast support and implement the lapic_broadcast
      function for ACPI.
      
      No changes to existing functionality.
      
      [ kdump fix from Vivek Goyal <vgoyal@in.ibm.com> ]
      [ fixes based on review feedback from Arjan van de Ven <arjan@infradead.org> ]
      Cleanups-from: Adrian Bunk <bunk@stusta.de>
      Build-fixes-from: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      Cc: john stultz <johnstul@us.ibm.com>
      Cc: Roman Zippel <zippel@linux-m68k.org>
      Cc: Andi Kleen <ak@suse.de>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      e9e2cdb4
    • T
      [PATCH] i386, apic: clean up the APIC code · e05d723f
      Thomas Gleixner committed
      The apic code is quite unstructured and missing a lot of comments.
      
      - Restructure the code into helper functions, timer, setup/shutdown,
        interrupt and power management blocks.
      - Fixup comments.
      - Namespace fixups
      - Inline helpers for version and is_integrated
      - Combine the ack_bad_irq functions
      
      No functional changes.
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      Cc: Zachary Amsden <zach@vmware.com>
      Cc: "Eric W. Biederman" <ebiederm@xmission.com>
      Cc: Rohit Seth <rohitseth@google.com>
      Cc: Andi Kleen <ak@suse.de>
      Cc: john stultz <johnstul@us.ibm.com>
      Cc: Roman Zippel <zippel@linux-m68k.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      e05d723f
    • M
      [PATCH] Mark TSC on GeodeLX reliable · 07190a08
      Marcelo Tosatti committed
      The Geode can safely use the TSC for highres, since:
      
      1) Does not support frequency scaling,
      
      2) The TSC _does_ count when the CPU is halted.  Furthermore, the Geode
         supports a mode called "suspension on halt", where Suspend mode (which
         interacts with the power management states) is entered.  TSC counting
         during suspend mode is controlled by bit 8 of the Bus Controller
         Configuration Register #0 (thanks Tom!).
      
      3) no SMP :)
      
      Check if "RTSC counts during suspension" and remove the requirement for
      verification, so the clocksource code can safely select it as a timesource
      for the highres timers subsystem.
      Signed-off-by: Marcelo Tosatti <marcelo@kvack.org>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      Cc: john stultz <johnstul@us.ibm.com>
      Cc: Roman Zippel <zippel@linux-m68k.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      07190a08
    • I
      [PATCH] x86: rewrite SMP TSC sync code · 95492e46
      Ingo Molnar committed
      make the TSC synchronization code more robust, and unify it between x86_64 and
      i386.
      
      The biggest change is the removal of the 'fix up TSCs' code on x86_64 and
      i386, in some rare cases it was /causing/ time-warps on SMP systems.
      
      The new code only checks for TSC asynchronity - and if it can prove a
      time-warp (if it can observe the TSC going backwards when going from one CPU
      to another within a critical section), then the TSC clock-source is turned
      off.
      
      The TSC synchronization-checking code also got moved into a separate file.
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: john stultz <johnstul@us.ibm.com>
      Cc: Roman Zippel <zippel@linux-m68k.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      95492e46
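The check-only approach of this commit (prove a time-warp by observing the TSC going backwards between CPUs, then turn the clocksource off) can be illustrated with a simplified, single-threaded sketch. The real code takes its readings on several CPUs inside a critical section. Here that is modelled as an array of samples already in the order they were taken, and `struct tsc_sample`, `max_tsc_warp` and `tsc_usable` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One TSC reading, tagged with the CPU it was taken on. The samples are
 * assumed to be listed in the order in which they were taken. */
struct tsc_sample {
    int cpu;
    uint64_t tsc;
};

/* Return the largest backwards jump seen between consecutive readings,
 * or 0 if the TSC never appeared to run backwards. */
static uint64_t max_tsc_warp(const struct tsc_sample *s, size_t n)
{
    assert(s != NULL || n == 0);

    uint64_t last = 0, max_warp = 0;
    for (size_t i = 0; i < n; i++) {
        if (i > 0 && s[i].tsc < last) {
            uint64_t warp = last - s[i].tsc;
            if (warp > max_warp)
                max_warp = warp;
        }
        last = s[i].tsc;
    }
    return max_warp;
}

/* The TSC stays usable as a clocksource only if no warp was observed. */
static int tsc_usable(const struct tsc_sample *s, size_t n)
{
    return max_tsc_warp(s, n) == 0;
}
```

Note that the sketch never tries to correct the counters. It only reports whether a warp was seen, matching the commit's removal of the 'fix up TSCs' code.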
  16. 13 February, 2007 7 commits