1. 24 6月, 2008 1 次提交
  2. 19 6月, 2008 4 次提交
    • J
      x86, geode: add a VSA2 ID for General Software · ffe6e1da
      Jordan Crouse 提交于
      General Software writes their own VSA2 module for their version
      of the Geode BIOS, which returns a different ID then the standard
      VSA2.  This was causing the framebuffer driver to break for most
      GSW boards.
      Signed-off-by: NJordan Crouse <jordan.crouse@amd.com>
      Cc: tglx@linutronix.de
      Cc: linux-geode@lists.infradead.org
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      ffe6e1da
    • B
      x86: use BOOTMEM_EXCLUSIVE on 32-bit · d3942cff
      Bernhard Walle 提交于
      This patch uses the BOOTMEM_EXCLUSIVE for crashkernel reservation also for
      i386 and prints a error message on failure.
      
      The patch is still for 2.6.26 since it is only bug fixing. The unification
      of reserve_crashkernel() between i386 and x86_64 should be done for 2.6.27.
      Signed-off-by: NBernhard Walle <bwalle@suse.de>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      Cc: <stable@kernel.org>
      d3942cff
    • M
      x86, 32-bit: fix boot failure on TSC-less processors · df17b1d9
      Mikael Pettersson 提交于
      Booting 2.6.26-rc6 on my 486 DX/4 fails with a "BUG: Int 6"
      (invalid opcode) and a kernel halt immediately after the
      kernel has been uncompressed. The BUG shows EIP pointing
      to an rdtsc instruction in native_read_tsc(), invoked from
      native_sched_clock().
      
      (This error occurs so early that not even the serial console
      can capture it.)
      
      A bisection showed that this bug first occurs in 2.6.26-rc3-git7,
      via commit 9ccc906c:
      
      >x86: distangle user disabled TSC from unstable
      >
      >tsc_enabled is set to 0 from the command line switch "notsc" and from
      >the mark_tsc_unstable code. Seperate those functionalities and replace
      >tsc_enable with tsc_disable. This makes also the native_sched_clock()
      >decision when to use TSC understandable.
      >
      >Preparatory patch to solve the sched_clock() issue on 32 bit.
      >
      >Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      
      The core reason for this bug is that native_sched_clock() gets
      called before tsc_init().
      
      Before the commit above, tsc_32.c used a "tsc_enabled" variable
      which defaulted to 0 == disabled, and which only got enabled late
      in tsc_init(). Thus early calls to native_sched_clock() would skip
      the TSC and use jiffies instead.
      
      After the commit above, tsc_32.c uses a "tsc_disabled" variable
      which defaults to 0, meaning that the TSC is Ok to use. Early calls
      to native_sched_clock() now erroneously try to use the TSC on
      !cpu_has_tsc processors, leading to invalid opcode exceptions.
      
      My proposed fix is to initialise tsc_disabled to a "soft disabled"
      state distinct from the hard disabled state set up by the "notsc"
      kernel option. This fixes the native_sched_clock() problem. It also
      allows tsc_init() to be simplified: instead of setting tsc_disabled = 1
      on every error return, we just set tsc_disabled = 0 once when all
      checks have succeeded.
      
      I've verified that this lets my 486 boot again. I've also verified
      that a Core2 machine still uses the TSC as clocksource after the patch.
      Signed-off-by: NMikael Pettersson <mikpe@it.uu.se>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      df17b1d9
    • S
      x86: fix NULL pointer deref in __switch_to · 75118a82
      Suresh Siddha 提交于
      Patrick McHardy reported a crash:
      
      > > I get this oops once a day, its apparently triggered by something
      > > run by cron, but the process is a different one each time.
      > >
      > > Kernel is -git from yesterday shortly before the -rc6 release
      > > (last commit is the usb-2.6 merge, the x86 patches are missing),
      > > .config is attached.
      > >
      > > I'll retry with current -git, but the patches that have gone in
      > > since I last updated don't look related.
      > >
      > > [62060.043009] BUG: unable to handle kernel NULL pointer dereference at
      > > 000001ff
      > > [62060.043009] IP: [<c0102a9b>] __switch_to+0x2f/0x118
      > > [62060.043009] *pde = 00000000
      > > [62060.043009] Oops: 0002 [#1] PREEMPT
      
      Vegard Nossum analyzed it:
      
      > This decodes to
      >
      >    0:   0f ae 00                fxsave (%eax)
      >
      > so it's related to the floating-point context. This is the exact
      > location of the crash:
      >
      > $ addr2line -e arch/x86/kernel/process_32.o -i ab0
      > include/asm/i387.h:232
      > include/asm/i387.h:262
      > arch/x86/kernel/process_32.c:595
      >
      > ...so it looks like prev_task->thread.xstate->fxsave has become NULL.
      > Or maybe it never had any other value.
      
      Somehow (as described below) TS_USEDFPU is set but the fpu is not
      allocated or freed.
      
      Another possible FPU pre-emption issue with the sleazy FPU optimization
      which was benign before but not so anymore, with the dynamic FPU allocation
      patch.
      
      New task is getting exec'd and it is prempted at the below point.
      
      flush_thread() {
      	...
      	/*
      	* Forget coprocessor state..
      	*/
      	clear_fpu(tsk);
      		<----- Preemption point
      	clear_used_math();
      	...
      }
      
      Now when it context switches in again, as the used_math() is still set
      and fpu_counter can be > 5, we will do a math_state_restore() which sets
      the task's TS_USEDFPU. After it continues from the above preemption point
      it does clear_used_math() and much later free_thread_xstate().
      
      Now, at the next context switch, it is quite possible that xstate is
      null, used_math() is not set and TS_USEDFPU is still set. This will
      trigger unlazy_fpu() causing kernel oops.
      
      Fix this  by clearing tsk's fpu_counter before clearing task's fpu.
      Reported-by: NPatrick McHardy <kaber@trash.net>
      Signed-off-by: NSuresh Siddha <suresh.b.siddha@intel.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      75118a82
  3. 18 6月, 2008 1 次提交
    • L
      x86-64: Fix "bytes left to copy" return value for copy_from_user() · 42a886af
      Linus Torvalds 提交于
      Most users by far do not care about the exact return value (they only
      really care about whether the copy succeeded in its entirety or not),
      but a few special core routines actually care deeply about exactly how
      many bytes were copied from user space.
      
      And the unrolled versions of the x86-64 user copy routines would
      sometimes report that it had copied more bytes than it actually had.
      
      Very few uses actually have partial copies to begin with, but to make
      this bug even harder to trigger, most x86 CPU's use the "rep string"
      instructions for normal user copies, and that version didn't have this
      issue.
      
      To make it even harder to hit, the one user of this that really cared
      about the return value (and used the uncached version of the copy that
      doesn't use the "rep string" instructions) was the generic write
      routine, which pre-populated its source, once more hiding the problem by
      avoiding the exception case that triggers the bug.
      
      In other words, very special thanks to Bron Gondwana who not only
      triggered this, but created a test-program to show it, and bisected the
      behavior down to commit 08291429 ("mm:
      fix pagecache write deadlocks") which changed the access pattern just
      enough that you can now trigger it with 'writev()' with multiple
      iovec's.
      
      That commit itself was not the cause of the bug, it just allowed all the
      stars to align just right that you could trigger the problem.
      
      [ Side note: this is just the minimal fix to make the copy routines
        (with __copy_from_user_inatomic_nocache as the particular version that
        was involved in showing this) have the right return values.
      
        We really should improve on the exceptional case further - to make the
        copy do a byte-accurate copy up to the exact page limit that causes it
        to fail.  As it is, the callers have to do extra work to handle the
        limit case gracefully. ]
      Reported-by: NBron Gondwana <brong@fastmail.fm>
      Cc: Nick Piggin <npiggin@suse.de>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: Al Viro <viro@ZenIV.linux.org.uk>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      
       (which didn't have this problem), and since
      most users that do the carethis was very hard to trigger, but
      42a886af
  4. 17 6月, 2008 1 次提交
  5. 13 6月, 2008 10 次提交
    • S
      provide rtc_cmos platform device · 1da2e3d6
      Stas Sergeev 提交于
      Recently (around 2.6.25) I've noticed that RTC no longer works for me.  It
      turned out this is because I use pnpacpi=off kernel option to work around
      the parport_pc bugs.  I always did so, but RTC used to work fine in the
      past, and now it have regressed.
      
      The patch fixes the problem by creating the platform device for the RTC
      when PNP is disabled.  This may also help running the PNP-enabled kernel
      on an older PCs.
      Signed-off-by: NStas Sergeev <stsp@aknet.ru>
      Cc: David Brownell <david-b@pacbell.net>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Bjorn Helgaas <bjorn.helgaas@hp.com>
      Cc: Adam Belay <ambx1@neo.rr.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      1da2e3d6
    • K
      x86: fix pointer type warning in arch/x86/mm/init_64.c:early_memtest · f8a45704
      Kevin Winchester 提交于
      Changed the call to find_e820_area_size to pass u64 instead of unsigned long.
      Signed-off-by: NKevin Winchester <kjwinchester@gmail.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      f8a45704
    • V
      x86, lockdep: fix "WARNING: at kernel/lockdep.c:2658 check_flags+0x4c/0x128()" · 4461145e
      Vegard Nossum 提交于
      Alessandro Suardi reported:
      > Recently upgraded my FC6 desktop to Fedora 9; with the
      >  latest nautilus RPM updates my VNC session went nuts
      >  with nautilus pegging the CPU for everything that breathed.
      >
      > I now reverted to an earlier nautilus package, but during
      >  the peak CPU period my kernel spat this:
      >
      > [314185.623294] ------------[ cut here ]------------
      > [314185.623414] WARNING: at kernel/lockdep.c:2658 check_flags+0x4c/0x128()
      > [314185.623514] Modules linked in: iptable_filter ip_tables x_tables
      > sunrpc ipv6 fuse snd_via82xx snd_ac97_codec ac97_bus snd_mpu401_uart
      > snd_rawmidi via686a hwmon parport_pc sg parport uhci_hcd ehci_hcd
      > [314185.623924] Pid: 12314, comm: nautilus Not tainted 2.6.26-rc5-git2 #4
      > [314185.624021]  [<c0115b95>] warn_on_slowpath+0x41/0x7b
      > [314185.624021]  [<c010de70>] ? do_page_fault+0x2c1/0x5fd
      > [314185.624021]  [<c0128396>] ? up_read+0x16/0x28
      > [314185.624021]  [<c010de70>] ? do_page_fault+0x2c1/0x5fd
      > [314185.624021]  [<c012fa33>] ? __lock_acquire+0xbb4/0xbc3
      > [314185.624021]  [<c012d0a0>] check_flags+0x4c/0x128
      > [314185.624021]  [<c012fa73>] lock_acquire+0x31/0x7d
      > [314185.624021]  [<c0128cf6>] __atomic_notifier_call_chain+0x30/0x80
      > [314185.624021]  [<c0128cc6>] ? __atomic_notifier_call_chain+0x0/0x80
      > [314185.624021]  [<c0128d52>] atomic_notifier_call_chain+0xc/0xe
      > [314185.624021]  [<c0128d81>] notify_die+0x2d/0x2f
      > [314185.624021]  [<c01043b0>] do_int3+0x1f/0x4d
      > [314185.624021]  [<c02f2d3b>] int3+0x27/0x2c
      > [314185.624021]  =======================
      > [314185.624021] ---[ end trace 1923f65a2d7bb246 ]---
      > [314185.624021] possible reason: unannotated irqs-off.
      > [314185.624021] irq event stamp: 488879
      > [314185.624021] hardirqs last  enabled at (488879): [<c0102d67>]
      > restore_nocheck+0x12/0x15
      > [314185.624021] hardirqs last disabled at (488878): [<c0102dca>]
      > work_resched+0x19/0x30
      > [314185.624021] softirqs last  enabled at (488876): [<c011a1ba>]
      > __do_softirq+0xa6/0xac
      > [314185.624021] softirqs last disabled at (488865): [<c010476e>]
      > do_softirq+0x57/0xa6
      >
      > I didn't seem to find it with some googling, so here it is.
      >
      > I was incidentally ltracing that process to try and find out
      >  what was gulping down that much CPU (sorry, no idea
      >  whether ltrace and the WARNING happened at the same
      >  time or which came first) and:
      
      Yeah, this is extremely likely to be the source of the warning.
      
      The warning should be harmless, however.
      
      > Box is my trusty noname K7-800, 512MB RAM; if there's
      >  anything else useful I might be able to provide, just ask.
      
      It would be interesting to see where the int3 comes from.  Too bad,
      lockdep doesn't provide the register dump. The stacktrace also doesn't
      go further than the int3(), I wonder if this int3 came from userspace?
      The ltrace readme says "software breakpoints, like gdb", so I guess
      this is the case. Yep, seems like it.
      
      This looks relevant:
      
      | commit fb1dac90
      | Author: Peter Zijlstra <a.p.zijlstra@chello.nl>
      | Date:   Wed Jan 16 09:51:59 2008 +0100
      |
      |     lockdep: more hardirq annotations for notify_die()
      
      I'm attaching a similarly-looking patch for this case (DO_VM86_ERROR),
      though I suspect it might be missing for the other cases
      (DO_ERROR/DO_ERROR_INFO) as well.
      Reported-by: NAlessandro Suardi <alessandro.suardi@gmail.com>
      Signed-off-by: NVegard Nossum <vegard.nossum@gmail.com>
      Acked-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      4461145e
    • D
      x86: fix an incompatible pointer type warning on 64-bit compilations · eb53e9f3
      David Howells 提交于
      Fix an incompatible pointer type warning on x86_64 compilations.
      early_memtest() is passing a u64* to find_e820_area_size() which is expecting
      an unsigned long.  Change t_start and t_size to unsigned long as those are
      also 64-bit types on x88_64.
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      eb53e9f3
    • P
      x86: fix lockdep warning during suspend-to-ram · e32e58a9
      Peter Zijlstra 提交于
      Andrew Morton wrote:
      
      > I've been seeing the below for a long time during suspend-to-ram on the Vaio.
      >
      >
      > PM: Syncing filesystems ... done.
      > PM: Preparing system for mem sleep
      > Freezing user space processes ... <4>------------[ cut here ]------------
      > WARNING: at kernel/lockdep.c:2658 check_flags+0x4c/0x127()
      > Modules linked in: i915 drm ipw2200 sonypi ipv6 autofs4 hidp l2cap bluetooth sunrpc nf_conntrack_netbios_ns ipt_REJECT nf_conntrack_ipv4 xt_state nf_conntrack xt_tcpudp iptable_filter ip_tables x_tables acpi_cpufreq nvram ohci1394 ieee1394 ehci_hcd uhci_hcd sg joydev snd_hda_intel snd_seq_dummy sr_mod snd_seq_oss cdrom snd_seq_midi_event snd_seq snd_seq_device snd_pcm_oss snd_mixer_oss ieee80211 pcspkr ieee80211_crypt snd_pcm i2c_i801 snd_timer i2c_core ide_pci_generic piix snd soundcore snd_page_alloc button ext3 jbd ide_disk ide_core [last unloaded: ipw2200]
      > Pid: 3250, comm: zsh Not tainted 2.6.26-rc5 #1
      >  [<c011c5f5>] warn_on_slowpath+0x41/0x6d
      >  [<c01080e6>] ? native_sched_clock+0x82/0x96
      >  [<c013789c>] ? mark_held_locks+0x41/0x5c
      >  [<c0315688>] ? _spin_unlock_irqrestore+0x36/0x58
      >  [<c0137a29>] ? trace_hardirqs_on+0xe6/0x10d
      >  [<c0138637>] ? __lock_acquire+0xae3/0xb2b
      >  [<c0313413>] ? schedule+0x39b/0x3b4
      >  [<c0135596>] check_flags+0x4c/0x127
      >  [<c01386b9>] lock_acquire+0x3a/0x86
      >  [<c0315075>] _spin_lock+0x26/0x53
      >  [<c0140660>] ? refrigerator+0x13/0xc3
      >  [<c0140660>] refrigerator+0x13/0xc3
      >  [<c012684a>] get_signal_to_deliver+0x3c/0x31e
      >  [<c0102fe7>] do_notify_resume+0x91/0x6ee
      >  [<c01359fd>] ? lock_release_holdtime+0x50/0x56
      >  [<c0315688>] ? _spin_unlock_irqrestore+0x36/0x58
      >  [<c0235d24>] ? read_chan+0x0/0x58c
      >  [<c0137a29>] ? trace_hardirqs_on+0xe6/0x10d
      >  [<c0315694>] ? _spin_unlock_irqrestore+0x42/0x58
      >  [<c0230afa>] ? tty_ldisc_deref+0x5c/0x63
      >  [<c0233104>] ? tty_read+0x66/0x98
      >  [<c014b3f0>] ? audit_syscall_exit+0x2aa/0x2c5
      >  [<c0109430>] ? do_syscall_trace+0x6b/0x16f
      >  [<c0103a9c>] work_notifysig+0x13/0x1b
      >  =======================
      > ---[ end trace 25b49fe59a25afa5 ]---
      > possible reason: unannotated irqs-off.
      > irq event stamp: 58919
      > hardirqs last  enabled at (58919): [<c0103afd>] syscall_exit_work+0x11/0x26
      
      Joy - I so love entry.S
      
      Best I can make of it:
      
      syscall_exit_work
        resume_userspace
          DISABLE_INTERRUPTS
          (no TRACE_IRQS_OFF)
            work_pending
              work_notifysig
                do_notify_resume()
                  do_signal()
                    get_signal_to_deliver()
                      try_to_freeze()
                        refrigerator()
                          task_lock() -> check_flags() -> BANG
      
      The normal path is:
      
      syscall_exit_work
        resume_userspace
          DISABLE_INTERRUPTS
          restore_all
            TRACE_IRQS_IRET
            iret
      
      No idea why that would not warn..
      Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      e32e58a9
    • M
      x86: fix unused variable 'loops' warning in arch/x86/boot/a20.c · 52aaa12f
      Manish Katiyar 提交于
      Following patch fixes the below warning message :
      arch/x86/boot/a20.c:118: warning: unused variable 'loops'
      
      Signed-off-by : Manish Katiyar <mkatiyar@gmail.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      52aaa12f
    • I
      Revert "x86: fix ioapic bug again" · 0b6a39f7
      Ingo Molnar 提交于
      This reverts commit 6e908947.
      
      Németh Márton reported:
      
      | there is a problem in 2.6.26-rc3 which was not there in case of
      | 2.6.25: the CPU wakes up ~90,000 times per sec instead of ~60 per sec.
      |
      | I also "git bisected" the problem, the result is:
      |
      | 6e908947 is first bad commit
      | commit 6e908947
      | Author: Ingo Molnar <mingo@elte.hu>
      | Date:   Fri Mar 21 14:32:36 2008 +0100
      |
      |     x86: fix ioapic bug again
      
      the original problem is fixed by Maciej W. Rozycki in the tip/x86/apic
      branch (confirmed by Márton), but those changes are too intrusive for
      v2.6.26 so we'll go for the less intrusive (repeated) revert now.
      Reported-and-bisected-by: NNémeth Márton <nm127@freemail.hu>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      0b6a39f7
    • J
      x86: fix asm warning in head_32.S · 86b2b70e
      Joe Korty 提交于
      On Mon, May 19, 2008 at 04:10:02PM -0700, Linus Torvalds wrote:
      > It also causes these warnings on 32-bit PAE:
      >
      > 	  AS      arch/x86/kernel/head_32.o
      > 	arch/x86/kernel/head_32.S: Assembler messages:
      > 	arch/x86/kernel/head_32.S:225: Warning: left operand is a bignum; integer 0 assumed
      > 	arch/x86/kernel/head_32.S:609: Warning: left operand is a bignum; integer 0 assumed
      >
      > and I do not see why (the end result seems to be identical).
      
      Fix head_32.S gcc bignum warnings when CONFIG_PAE=y.
      
          arch/x86/kernel/head_32.S: Assembler messages:
          arch/x86/kernel/head_32.S:225: Warning: left operand is a bignum; integer 0 assumed
          arch/x86/kernel/head_32.S:609: Warning: left operand is a bignum; integer 0 assumed
      
      The assembler was stumbling over the 64-bit constant 0x100000000 in the
      KPMDS #define.
      
      Testing: a cmp(1) on head_32.o before and after shows the binary is unchanged.
      
      Signed-off-by: Joe Korty <joe.korty@ccur.com
      Cc: Hugh Dickins <hugh@veritas.com>
      Cc: Theodore Tso <tytso@mit.edu>
      Cc: Gabriel C <nix.or.die@googlemail.com>
      Cc: Keith Packard <keithp@keithp.com>
      Cc: "Pallipadi Venkatesh" <venkatesh.pallipadi@intel.com>
      Cc: Eric Anholt <eric@anholt.net>
      Cc: "Siddha Suresh B" <suresh.b.siddha@intel.com>
      Cc: bugme-daemon@bugzilla.kernel.org
      Cc: airlied@linux.ie
      Cc: "Barnes Jesse" <jesse.barnes@intel.com>
      Cc: Jeremy Fitzhardinge <jeremy@goop.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      86b2b70e
    • H
      x86: fix endless page faults in mount_block_root for Linux 2.6 · b29c701d
      Henry Nestler 提交于
      Page faults in kernel address space between PAGE_OFFSET up to
      VMALLOC_START should not try to map as vmalloc.
      
      Fix rarely endless page faults inside mount_block_root for root
      filesystem at boot time.
      
      All 32bit kernels up to 2.6.25 can fail into this hole.
      I can not present this under native linux kernel. I see, that the 64bit
      has fixed the problem. I copied the same lines into 32bit part.
      
      Recorded debugs are from coLinux kernel 2.6.22.18 (virtualisation):
      http://www.henrynestler.com/colinux/testing/pfn-check-0.7.3/20080410-antinx/bug16-recursive-page-fault-endless.txt
      The physicaly memory was trimmed down to 192MB to better catch the bug.
      More memory gets the bug more rarely.
      
      Details, how every x86 32bit system can fail:
      
      Start from "mount_block_root",
      http://lxr.linux.no/linux/init/do_mounts.c#L297
      There the variable "fs_names" got one memory page with 4096 bytes.
      Variable "p" walks through the existing file system types. The first
      string is no problem.
      But, with the second loop in mount_block_root the offset of "p" is not
      at beginning of page, the offset is for example +9, if "reiserfs" is the
      first in list.
      Than calls do_mount_root, and lands in sys_mount.
      Remember: Variable "type_page" contains now "fs_type+9" and not contains
      a full page.
      The sys_mount copies 4096 bytes with function "exact_copy_from_user()":
      http://lxr.linux.no/linux/fs/namespace.c#L1540
      
      Mostly exist pages after the buffer "fs_names+4096+9" and the page fault
      handler was not called. No problem.
      
      In the case, if the page after "fs_names+4096" is not mapped, the page
      fault handler was called from http://lxr.linux.no/linux/fs/namespace.c#L1320
      
      The do_page_fault gots an address 0xc03b4000.
      It's kernel address, address >= TASK_SIZE, but not from vmalloc! It's
      from "__getname()" alias "kmem_cache_alloc".
      The "error_code" is 0. "vmalloc_fault" will be call:
      http://lxr.linux.no/linux/arch/i386/mm/fault.c#L332
      
      "vmalloc_fault" tryed to find the physical page for a non existing
      virtual memory area. The macro "pte_present" in vmalloc_fault()
      got a next page fault for 0xc0000ed0 at:
      http://lxr.linux.no/linux/arch/i386/mm/fault.c#L282
      
      No PTE exist for such virtual address. The page fault handler was trying
      to sync the physical page for the PTE lockup.
      
      This called vmalloc_fault() again for address 0xc000000, and that also
      was not existing. The endless began...
      
      In normal case the cpu would still loop with disabled interrrupts. Under
      coLinux this was catched by a stack overflow inside printk debugs.
      Signed-off-by: NHenry Nestler <henry.nestler@gmail.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      b29c701d
    • I
      geode: fix modular build · 3703f399
      Ingo Molnar 提交于
      -tip testing found this build bug:
      
       MODPOST 331 modules
       ERROR: "geode_mfgpt_toggle_event" [drivers/watchdog/geodewdt.ko] undefined!
       ERROR: "geode_mfgpt_alloc_timer" [drivers/watchdog/geodewdt.ko] undefined!
       make[1]: *** [__modpost] Error 1
       make: *** [modules] Error 2
      
      with this config:
      
        http://redhat.com/~mingo/misc/config-Wed_Jun__4_18_01_59_CEST_2008.bad
      
      export those symbols.
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      3703f399
  6. 12 6月, 2008 2 次提交
    • J
      common implementation of iterative div/mod · f595ec96
      Jeremy Fitzhardinge 提交于
      We have a few instances of the open-coded iterative div/mod loop, used
      when we don't expcet the dividend to be much bigger than the divisor.
      Unfortunately modern gcc's have the tendency to strength "reduce" this
      into a full mod operation, which isn't necessarily any faster, and
      even if it were, doesn't exist if gcc implements it in libgcc.
      
      The workaround is to put a dummy asm statement in the loop to prevent
      gcc from performing the transformation.
      
      This patch creates a single implementation of this loop, and uses it
      to replace the open-coded versions I know about.
      Signed-off-by: NJeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: john stultz <johnstul@us.ibm.com>
      Cc: Segher Boessenkool <segher@kernel.crashing.org>
      Cc: Christian Kujau <lists@nerdbynature.de>
      Cc: Robert Hancock <hancockr@shaw.ca>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      f595ec96
    • F
      ACPI: handle invalid ACPI SLIT table · 39b8931b
      Fenghua Yu 提交于
      This is a SLIT sanity checking patch.  It moves slit_valid() function to
      generic ACPI code and does sanity checking for both x86 and ia64.  It sets up
      node_distance with LOCAL_DISTANCE and REMOTE_DISTANCE when hitting invalid
      SLIT table on ia64.  It also cleans up unused variable localities in
      acpi_parse_slit() on x86.
      Signed-off-by: NFenghua Yu <fenghua.yu@intel.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLen Brown <len.brown@intel.com>
      39b8931b
  7. 10 6月, 2008 2 次提交
    • M
      x86, pci-dma.c: don't always add __GFP_NORETRY to gfp · b7f09ae5
      Miquel van Smoorenburg 提交于
      Currently arch/x86/kernel/pci-dma.c always adds __GFP_NORETRY
      to the allocation flags, because it wants to be reasonably
      sure not to deadlock when calling alloc_pages().
      
      But really that should only be done in two cases:
      - when allocating memory in the lower 16 MB DMA zone.
        If there's no free memory there, waiting or OOM killing is of no use
      - when optimistically trying an allocation in the DMA32 zone
        when dma_mask < DMA_32BIT_MASK hoping that the allocation
        happens to fall within the limits of the dma_mask
      
      Also blindly adding __GFP_NORETRY to the the gfp variable might
      not be a good idea since we then also use it when calling
      dma_ops->alloc_coherent(). Clearing it might also not be a
      good idea, dma_alloc_coherent()'s caller might have set it
      on purpose. The gfp variable should not be clobbered.
      
      [ mingo@elte.hu: converted to delta patch ontop of previous version. ]
      Signed-off-by: NMiquel van Smoorenburg <miquels@cistron.nl>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      b7f09ae5
    • A
      ftrace: remove ftrace_ip_converted() · 1d74f2a0
      Abhishek Sagar 提交于
      Remove the unneeded function ftrace_ip_converted().
      Signed-off-by: NAbhishek Sagar <sagar.abhishek@gmail.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      1d74f2a0
  8. 07 6月, 2008 6 次提交
  9. 06 6月, 2008 3 次提交
    • B
      x86/PCI: add workaround for bug in ASUS A7V600 BIOS (rev 1005) · 9f67fd5d
      Bertram Felgenhauer 提交于
      This BIOS claims the VIA 8237 south bridge to be compatible with VIA 586,
      which it is not.
      
      Without this patch, I get the following warning while booting,
      among others,
      
      | PCI: Using IRQ router VIA [1106/3227] at 0000:00:11.0
      | ------------[ cut here ]------------
      | WARNING: at arch/x86/pci/irq.c:265 pirq_via586_get+0x4a/0x60()
      | Modules linked in:
      | Pid: 1, comm: swapper Not tainted 2.6.26-rc4-00015-g1ec7d99c #1
      |  [<c0119fd4>] warn_on_slowpath+0x54/0x70
      |  [<c02246e0>] ? vt_console_print+0x210/0x2b0
      |  [<c02244d0>] ? vt_console_print+0x0/0x2b0
      |  [<c011a413>] ? __call_console_drivers+0x43/0x60
      |  [<c011a482>] ? _call_console_drivers+0x52/0x80
      |  [<c011aa89>] ? release_console_sem+0x1c9/0x200
      |  [<c0291d21>] ? raw_pci_read+0x41/0x70
      |  [<c0291e8f>] ? pci_read+0x2f/0x40
      |  [<c029151a>] pirq_via586_get+0x4a/0x60
      |  [<c02914d0>] ? pirq_via586_get+0x0/0x60
      |  [<c029178d>] pcibios_lookup_irq+0x15d/0x430
      |  [<c03b895a>] pcibios_irq_init+0x17a/0x3e0
      |  [<c03a66f0>] ? kernel_init+0x0/0x250
      |  [<c03a6763>] kernel_init+0x73/0x250
      |  [<c03b87e0>] ? pcibios_irq_init+0x0/0x3e0
      |  [<c0114d00>] ? schedule_tail+0x10/0x40
      |  [<c0102dee>] ? ret_from_fork+0x6/0x1c
      |  [<c03a66f0>] ? kernel_init+0x0/0x250
      |  [<c03a66f0>] ? kernel_init+0x0/0x250
      |  [<c010324b>] kernel_thread_helper+0x7/0x1c
      |  =======================
      | ---[ end trace 4eaa2a86a8e2da22 ]---
      
      and IRQ trouble later,
      
      | irq 10: nobody cared (try booting with the "irqpoll" option)
      
      Now that's an VIA 8237 chip, so pirq_via586_get shouldn't be called
      at all; adding this workaround to via_router_probe() fixes the
      problem for me.
      
      Amazingly I have a 2.6.23.8 kernel that somehow works fine ... I'll
      never understand why.
      Signed-off-by: NBertram Felgenhauer <int-e@gmx.de>
      Cc: Jesse Barnes <jbarnes@virtuousgeek.org>
      Acked-by: NAlan Cox <alan@redhat.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NJesse Barnes <jbarnes@virtuousgeek.org>
      9f67fd5d
    • A
      PCI/x86: fix up PCI stuff so that PCI_GOANY supports OLPC · 2bdd1b03
      Andres Salomon 提交于
      Previously, one would have to specifically choose CONFIG_OLPC and
      CONFIG_PCI_GOOLPC in order to enable PCI_OLPC.  That doesn't really work
      for distro kernels, so this patch allows one to choose CONFIG_OLPC and
      CONFIG_PCI_GOANY in order to build in OLPC support in a generic kernel (as
      requested by Robert Millan).
      
      This also moves GOOLPC before GOANY in the menuconfig list.
      
      Finally, make pci_access_init return early if we detect OLPC hardware.
      There's no need to continue probing stuff, and pci_pcbios_init
      specifically trashes our settings (we didn't run into that before because
      PCI_GOANY wasn't supported).
      Signed-off-by: NAndres Salomon <dilinger@debian.org>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NJesse Barnes <jbarnes@virtuousgeek.org>
      2bdd1b03
    • S
      x86: fix CONFIG_NONPROMISC_DEVMEM prompt and help text · 16104b55
      Stefan Richter 提交于
      Here is an attempt to translate the prompt and help text into something
      which is legible and, as a bonus, correct.
      Signed-off-by: NStefan Richter <stefanr@s5r6.in-berlin.de>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      16104b55
  10. 04 6月, 2008 10 次提交
    • S
      x86, fpu: fix CONFIG_PREEMPT=y corruption of application's FPU stack · 870568b3
      Suresh Siddha 提交于
      Jürgen Mell reported an FPU state corruption bug under CONFIG_PREEMPT,
      and bisected it to commit v2.6.19-1363-gacc20761, "i386: add sleazy FPU
      optimization".
      
      Add tsk_used_math() checks to prevent calling math_state_restore()
      which can sleep in the case of !tsk_used_math(). This prevents
      making a blocking call in __switch_to().
      
      Apparently "fpu_counter > 5" check is not enough, as in some signal handling
      and fork/exec scenarios, fpu_counter > 5 and !tsk_used_math() is possible.
      
      It's a side effect though. This is the failing scenario:
      
      process 'A' in save_i387_ia32() just after clear_used_math()
      
      Got an interrupt and pre-empted out.
      
      At the next context switch to process 'A' again, kernel tries to restore
      the math state proactively and sees a fpu_counter > 0 and !tsk_used_math()
      
      This results in init_fpu() during the __switch_to()'s math_state_restore()
      
      And resulting in fpu corruption which will be saved/restored
      (save_i387_fxsave and restore_i387_fxsave) during the remaining
      part of the signal handling after the context switch.
      Bisected-by: NJürgen Mell <j.mell@t-online.de>
      Signed-off-by: NSuresh Siddha <suresh.b.siddha@intel.com>
      Tested-by: NJürgen Mell <j.mell@t-online.de>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Cc: stable@kernel.org
      870568b3
    • P
      suspend-vs-iommu: prevent suspend if we could not resume · cd76374e
      Pavel Machek 提交于
      iommu/gart support misses suspend/resume code, which can do bad stuff,
      including memory corruption on resume.  Prevent system suspend in case we
      would be unable to resume.
      Signed-off-by: NPavel Machek <pavel@suse.cz>
      Tested-by: NPatrick <ragamuffin@datacomm.ch>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      cd76374e
    • A
      x86: section mismatch fix · be524fb9
      Andrew Morton 提交于
      Fix this:
      
       WARNING: vmlinux.o(.text+0x114bb): Section mismatch in reference from
       the function nopat() to the function .cpuinit.text:pat_disable()
       The function nopat() references
       the function __cpuinit pat_disable().
       This is often because nopat lacks a __cpuinit
       annotation or the annotation of pat_disable is wrong.
      Reported-by: N"Fabio Comolli" <fabio.comolli@gmail.com>
      Cc: Sam Ravnborg <sam@ravnborg.org>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      be524fb9
    • V
      x86: fix Xorg crash with xf86MapVidMem error · 282c454c
      Venki Pallipadi 提交于
      Clarify the usage of mtrr_lookup() in PAT code, and to make PAT code
      resilient to mtrr lookup problems.
      
      Specifically, pat_x_mtrr_type() is restructured to highlight, under what
      conditions we look for mtrr hint. pat_x_mtrr_type() uses a default type
      when there are any errors in mtrr lookup (still maintaining the pat
      consistency). And, reserve_memtype() highlights its usage ot mtrr_lookup
      for request type of '-1' and also defaults in a sane way on any mtrr
      lookup failure.
      
      pat.c looks at mtrr type of a range to get a hint on what mapping type
      to request when user/API: (1) hasn't specified any type (/dev/mem
      mapping) and we do not want to take performance hit by always mapping
      UC_MINUS. This will be the case for /dev/mem mappings used to map BIOS
      area or ACPI region which are WB'able. In this case, as long as MTRR is
      not WB, PAT will request UC_MINUS for such mappings.
      
      (2) user/API requests WB mapping while in reality MTRR may have UC or
      WC. In this case, PAT can map as WB (without checking MTRR) and still
      effective type will be UC or WC. But, a subsequent request to map same
      region as UC or WC may fail, as the region will get trackked as WB in
      PAT list. Looking at MTRR hint helps us to track based on effective type
      rather than what user requested. Again, here mtrr_lookup is only used as
      hint and we fallback to WB mapping (as requested by user) as default.
      
      In both cases, after using the mtrr hint, we still go through the
      memtype list to make sure there are no inconsistencies among multiple
      users.
      Signed-off-by: NVenkatesh Pallipadi <venkatesh.pallipadi@intel.com>
      Signed-off-by: NSuresh Siddha <suresh.b.siddha@intel.com>
      Tested-by: NRufus &amp; Azrael <rufus-azrael@numericable.fr>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      282c454c
    • K
      x86: fix pointer type warning in arch/x86/mm/init_64.c:early_memtest · 51163101
      Kevin Winchester 提交于
      Changed the call to find_e820_area_size to pass u64 instead of unsigned long.
      Signed-off-by: NKevin Winchester <kjwinchester@gmail.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      51163101
    • H
      x86: fix bad pmd ffff810000207xxx(9090909090909090) · 2884f110
      Hugh Dickins 提交于
      OGAWA Hirofumi and Fede have reported rare pmd_ERROR messages:
      mm/memory.c:127: bad pmd ffff810000207xxx(9090909090909090).
      
      Initialization's cleanup_highmap was leaving alignment filler
      behind in the pmd for MODULES_VADDR: when vmalloc's guard page
      would occupy a new page table, it's not allocated, and then
      module unload's vfree hits the bad 9090 pmd entry left over.
      Signed-off-by: NHugh Dickins <hugh@veritas.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      2884f110
    • I
      x86: ioremap fix failing nesting check · 226e9a93
      Ingo Molnar 提交于
      Mika Kukkonen noticed that the nesting check in early_iounmap() is not
      actually done.
      Reported-by: NMika Kukkonen <mikukkon@srv1-m700-lanp.koti>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      Cc: torvalds@linux-foundation.org
      Cc: arjan@linux.intel.com
      Cc: mikukkon@iki.fi
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      226e9a93
    • S
      x86: fix broken math-emu with lazy allocation of fpu area · e8a496ac
      Suresh Siddha 提交于
      Fix the math emulation that got broken with the recent lazy allocation of FPU
      area. init_fpu() need to be added for the math-emulation path aswell
      for the FPU area allocation.
      
      math emulation enabled kernel booted fine with this, in the presence
      of "no387 nofxsr" boot param.
      Signed-off-by: NSuresh Siddha <suresh.b.siddha@intel.com>
      Cc: hpa@zytor.com
      Cc: mingo@elte.hu
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      e8a496ac
    • S
      x86: enable preemption in delay · 5c1ea082
      Steven Rostedt 提交于
      The RT team has been searching for a nasty latency. This latency shows
      up out of the blue and has been seen to be as big as 5ms!
      
      Using ftrace I found the cause of the latency.
      
         pcscd-2995  3dNh1 52360300us : irq_exit (smp_apic_timer_interrupt)
         pcscd-2995  3dN.2 52360301us : idle_cpu (irq_exit)
         pcscd-2995  3dN.2 52360301us : rcu_irq_exit (irq_exit)
         pcscd-2995  3dN.1 52360771us : smp_apic_timer_interrupt (apic_timer_interrupt
      )
         pcscd-2995  3dN.1 52360771us : exit_idle (smp_apic_timer_interrupt)
      
      Here's an example of a 400 us latency. pcscd took a timer interrupt and
      returned with "need resched" enabled, but did not reschedule until after
      the next interrupt came in at 52360771us 400us later!
      
      At first I thought we somehow missed a preemption check in entry.S. But
      I also noticed that this always seemed to happen during a __delay call.
      
         pcscd-2995  3dN.2 52360836us : rcu_irq_exit (irq_exit)
         pcscd-2995  3.N.. 52361265us : preempt_schedule (__delay)
      
      Looking at the x86 delay, I found my problem.
      
      In git commit 35d5d08a, Andrew Morton
      placed preempt_disable around the entire delay due to TSC's not working
      nicely on SMP.  Unfortunately for those that care about latencies this
      is devastating! Especially when we have callers to mdelay(8).
      
      Here I enable preemption during the loop and account for anytime the task
      migrates to a new CPU. The delay asked for may be extended a bit by
      the migration, but delay only guarantees that it will delay for that minimum
      time. Delaying longer should not be an issue.
      
      [
        Thanks to Thomas Gleixner for spotting that cpu wasn't updated,
          and to place the rep_nop between preempt_enabled/disable.
      ]
      Signed-off-by: NSteven Rostedt <srostedt@redhat.com>
      Cc: akpm@osdl.org
      Cc: Clark Williams <clark.williams@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: "Luis Claudio R. Goncalves" <lclaudio@uudg.org>
      Cc: Gregory Haskins <ghaskins@novell.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Andi Kleen <andi-suse@firstfloor.org>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      5c1ea082
    • I
      x86: disable preemption in native_smp_prepare_cpus · deef3250
      Ingo Molnar 提交于
      Priit Laes reported the following warning:
      
      Call Trace:
       [<ffffffff8022f1e1>] warn_on_slowpath+0x51/0x63
       [<ffffffff80282e48>] sys_ioctl+0x2d/0x5d
       [<ffffffff805185ff>] _spin_lock+0xe/0x24
       [<ffffffff80227459>] task_rq_lock+0x3d/0x73
       [<ffffffff805133c3>] set_cpu_sibling_map+0x336/0x350
       [<ffffffff8021c1b8>] read_apic_id+0x30/0x62
       [<ffffffff806d921d>] verify_local_APIC+0x90/0x138
       [<ffffffff806d84b5>] native_smp_prepare_cpus+0x1f9/0x305
       [<ffffffff806ce7b1>] kernel_init+0x59/0x2d9
       [<ffffffff80518a26>] _spin_unlock_irq+0x11/0x2b
       [<ffffffff8020bf48>] child_rip+0xa/0x12
       [<ffffffff806ce758>] kernel_init+0x0/0x2d9
       [<ffffffff8020bf3e>] child_rip+0x0/0x12
      
      fix this by generally disabling preemption in native_smp_prepare_cpus().
      Reported-and-bisected-by: NPriit Laes <plaes@plaes.org>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      deef3250