提交 · 707a18a51d83d9180a63b3cbaad8eda7764a8689 · openeuler / raspberrypi-kernel

25 3月, 2008 5 次提交

KVM: VMX: convert init_rmode_tss() to slots_lock · 707a18a5

由 Marcelo Tosatti 提交于 3月 18, 2008

init_rmode_tss was forgotten during the conversion from mmap_sem to
slots_lock.

INFO: task qemu-system-x86:3748 blocked for more than 120 seconds.
Call Trace:
 [<ffffffff8053d100>] __down_read+0x86/0x9e
 [<ffffffff8053fb43>] do_page_fault+0x346/0x78e
 [<ffffffff8053d235>] trace_hardirqs_on_thunk+0x35/0x3a
 [<ffffffff8053dcad>] error_exit+0x0/0xa9
 [<ffffffff8035a7a7>] copy_user_generic_string+0x17/0x40
 [<ffffffff88099a8a>] :kvm:kvm_write_guest_page+0x3e/0x5f
 [<ffffffff880b661a>] :kvm_intel:init_rmode_tss+0xa7/0xf9
 [<ffffffff880b7d7e>] :kvm_intel:vmx_vcpu_reset+0x10/0x38a
 [<ffffffff8809b9a5>] :kvm:kvm_arch_vcpu_setup+0x20/0x53
 [<ffffffff8809a1e4>] :kvm:kvm_vm_ioctl+0xad/0x1cf
 [<ffffffff80249dea>] __lock_acquire+0x4f7/0xc28
 [<ffffffff8028fad9>] vfs_ioctl+0x21/0x6b
 [<ffffffff8028fd75>] do_vfs_ioctl+0x252/0x26b
 [<ffffffff8028fdca>] sys_ioctl+0x3c/0x5e
 [<ffffffff8020b01b>] system_call_after_swapgs+0x7b/0x80
Signed-off-by: NMarcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: NAvi Kivity <avi@qumranet.com>

707a18a5

KVM: MMU: handle page removal with shadow mapping · 15aaa819

由 Marcelo Tosatti 提交于 3月 17, 2008

Do not assume that a shadow mapping will always point to the same host
frame number.  Fixes crash with madvise(MADV_DONTNEED).

[avi: move after first printk(), add another printk()]
Signed-off-by: NMarcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: NAvi Kivity <avi@qumranet.com>

15aaa819

KVM: MMU: Fix is_rmap_pte() with io ptes · 4b1a80fa

由 Avi Kivity 提交于 3月 23, 2008

is_rmap_pte() doesn't take into account io ptes, which have the avail bit set.
Signed-off-by: NAvi Kivity <avi@qumranet.com>

4b1a80fa

KVM: VMX: Restore tss even on x86_64 · 5dc83262

由 Avi Kivity 提交于 3月 16, 2008

The vmx hardware state restore restores the tss selector and base address, but
not its length. Usually, this does not matter since most of the tss contents
is within the default length of 0x67. However, if a process is using ioperm()
to grant itself I/O port permissions, an additional bitmap within the tss,
but outside the default length is consulted. The effect is that the process
will receive a SIGSEGV instead of transparently accessing the port.

Fix by restoring the tss length. Note that i386 had this working already.

Closes bugzilla 10246.
Signed-off-by: NAvi Kivity <avi@qumranet.com>

5dc83262

x86-32: Pass the full resource data to ioremap() · b9e76a00

由 Linus Torvalds 提交于 3月 24, 2008

It appears that 64-bit PCI resources cannot possibly ever have worked on
x86-32 even when the RESOURCES_64BIT config option was set, because any
driver that tried to [pci_]ioremap() the resource would have been unable
to do so because the high 32 bits would have been silently dropped on
the floor by the ioremap() routines that only used "unsigned long".

Change them to use "resource_size_t" instead, which properly encodes the
whole 64-bit resource data if RESOURCES_64BIT is enabled.
Acked-by: NH. Peter Anvin <hpa@kernel.org>
Acked-by: NStefan Richter <stefanr@s5r6.in-berlin.de>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

b9e76a00

23 3月, 2008 1 次提交

x86: revert: reserve dma32 early for gart · 9e963048

由 Thomas Gleixner 提交于 3月 22, 2008

Revert

commit f62f1fc9
Author: Yinghai Lu <yhlu.kernel@gmail.com>
Date:   Fri Mar 7 15:02:50 2008 -0800

    x86: reserve dma32 early for gart

The patch has a dependency on bootmem modifications which are not .25
material that late in the -rc cycle. The problem which is addressed by
the patch is limited to machines with 256G and more memory booted with
NUMA disabled. This is not a .25 regression and the audience which is
affected by this problem is very limited, so it's safer to do the
revert than pulling in intrusive bootmem changes right now.
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>

9e963048

22 3月, 2008 11 次提交

x86_64: free_bootmem should take phys · 37bff62e

由 Yinghai Lu 提交于 3月 18, 2008

so use nodedata_phys directly.
Signed-off-by: NYinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: NIngo Molnar <mingo@elte.hu>
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>

37bff62e

x86: trim mtrr don't close gap for resource allocation. · 5dca6a1b

由 Yinghai Lu 提交于 3月 18, 2008

fix the bug reported here:

	http://bugzilla.kernel.org/show_bug.cgi?id=10232

use update_memory_range() instead of add_memory_range() directly
to avoid closing the gap.

( the new code only affects and runs on systems where the MTRR
  workaround triggers. )
Signed-off-by: NYinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: NIngo Molnar <mingo@elte.hu>
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>

5dca6a1b

x86: fix reboot problem with Dell Optiplex 745, 0KW626 board · fc1c8925

由 Heinz-Ado Arnolds 提交于 3月 12, 2008

we have seen a little problem in rebooting Dell Optiplex 745 with the
0KW626 board. Here is a small patch enabling reboot with this board,
which forces the default reboot path it into the BIOS reboot mode.
Signed-off-by: NIngo Molnar <mingo@elte.hu>
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>

fc1c8925

x86: fix fault_msg nul termination · e215f3c2

由 Jiri Slaby 提交于 3月 12, 2008

The fault_msg text is not explictly nul terminated now in startup
assembly. Do so by converting .ascii to .asciz.
Signed-off-by: NJiri Slaby <jirislaby@gmail.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: NIngo Molnar <mingo@elte.hu>
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>

e215f3c2

x86: fix long standing bug with usb after hibernation with 4GB ram · 2050d45d

由 Pavel Machek 提交于 3月 13, 2008

aperture_64.c takes a piece of memory and makes it into iommu
window... but such window may not be saved by swsusp -- that leads to
oops during hibernation.
Signed-off-by: NPavel Machek <pavel@suse.cz>
Acked-by: N"Rafael J. Wysocki" <rjw@sisk.pl>
Signed-off-by: NIngo Molnar <mingo@elte.hu>
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>

2050d45d

x86: hpet clock enable quirk on nVidia nForce 430 · 96bcf458

由 Zbigniew Luszpinski 提交于 3月 19, 2008

this patch allows hpet=force on nVidia nForce 430 southbridge.
This patch was tested by me on my old Asus A8N-VM CSM (where bios does not
support hpet and does not advertise it via acpi entry). My nForce430 version:
lspci -nn | grep LPC
00:0a.0 ISA bridge [0601]: nVidia Corporation MCP51 LPC Bridge [10de:0260]
(rev a2)

Kernel 2.6.24.3 after patching and using hpet=force reports this:
dmesg | grep -i hpet
Kernel command line: root=/dev/sda8 ro vga=773 video=vesafb:mtrr:4,ywrap
vt.default_utf8=0 hpet=force
Force enabled HPET at base address 0xfed00000
hpet clockevent registered
Time: hpet clocksource has been installed.

grep -i hpet /proc/timer_list
Clock Event Device: hpet
 set_next_event: hpet_legacy_next_event
 set_mode:       hpet_legacy_set_mode

grep Clock /proc/timer_list (before patching)
Clock Event Device: pit
Clock Event Device: lapic

grep Clock /proc/timer_list (after patching)
Clock Event Device: hpet
Clock Event Device: lapic
Signed-off-by: NIngo Molnar <mingo@elte.hu>
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>

96bcf458

x86: reserve dma32 early for gart · f62f1fc9

由 Yinghai Lu 提交于 3月 07, 2008

a system with 256 GB of RAM, when NUMA is disabled crashes the
following way:

Your BIOS doesn't leave a aperture memory hole
Please enable the IOMMU option in the BIOS setup
This costs you 64 MB of RAM
Cannot allocate aperture memory hole (ffff8101c0000000,65536K)
Kernel panic - not syncing: Not enough memory for aperture
Pid: 0, comm: swapper Not tainted 2.6.25-rc4-x86-latest.git #33

Call Trace:
 [<ffffffff84037c62>] panic+0xb2/0x190
 [<ffffffff840381fc>] ? release_console_sem+0x7c/0x250
 [<ffffffff847b1628>] ? __alloc_bootmem_nopanic+0x48/0x90
 [<ffffffff847b0ac9>] ? free_bootmem+0x29/0x50
 [<ffffffff847ac1f7>] gart_iommu_hole_init+0x5e7/0x680
 [<ffffffff847b255b>] ? alloc_large_system_hash+0x16b/0x310
 [<ffffffff84506a2f>] ? _etext+0x0/0x1
 [<ffffffff847a2e8c>] pci_iommu_alloc+0x1c/0x40
 [<ffffffff847ac795>] mem_init+0x45/0x1a0
 [<ffffffff8479ff35>] start_kernel+0x295/0x380
 [<ffffffff8479f1c2>] _sinittext+0x1c2/0x230

the root cause is : memmap PMD is too big,
[ffffe200e0600000-ffffe200e07fffff] PMD ->ffff81383c000000 on node 0
almost near 4G..., and vmemmap_alloc_block will use up the ram under 4G.

solution will be:
1. make memmap allocation get memory above 4G...
2. reserve some dma32 range early before we try to set up memmap for all.
and release that before pci_iommu_alloc, so gart or swiotlb could get some
range under 4g limit for sure.

the patch is using method 2.
because method1 may need more code to handle SPARSEMEM and SPASEMEM_VMEMMAP

will get
Your BIOS doesn't leave a aperture memory hole
Please enable the IOMMU option in the BIOS setup
This costs you 64 MB of RAM
Mapping aperture over 65536 KB of RAM @ 4000000
Memory: 264245736k/268959744k available (8484k kernel code, 4187464k reserved, 4004k data, 724k init)
Signed-off-by: NYinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: NIngo Molnar <mingo@elte.hu>
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>

f62f1fc9

x86: add the DFF (Desktop Form Factor) Dell Optiplex 745 to the reboot errata list · fc115bf1

由 Coleman Kane 提交于 3月 04, 2008

We recently got some of the "Desktop Form Factor" Optiplex 745's in.  I
noticed that there's an entry for the SFF one's, but the BIOS model number
of the DFF differs from that of the SFF.  We have been reliably
experiencing the same (as far as I can tell) reboot bug as the SFF boxes.

Cc: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NIngo Molnar <mingo@elte.hu>
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>

fc115bf1

x86/visws: fix printk format warnings · 31780715

由 Randy Dunlap 提交于 3月 04, 2008

Fix visws printk format warnings:

/local/linsrc/linux-2.6.24-git15/arch/x86/mach-visws/traps.c:50: warning: format '%#lx' expects type 'long unsigned int', but argument 2 has type 'u32'
/local/linsrc/linux-2.6.24-git15/arch/x86/mach-visws/traps.c:50: warning: format '%#lx' expects type 'long unsigned int', but argument 3 has type 'u32'
Signed-off-by: NRandy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NIngo Molnar <mingo@elte.hu>
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>

31780715

x86: tight online check in setup_per_cpu_areas · 7d2de137

由 Yinghai Lu 提交于 3月 06, 2008

when numa disabled I got this compile warning:

arch/x86/kernel/setup64.c: In function setup_per_cpu_areas:
arch/x86/kernel/setup64.c:147: warning: the address of
                      contig_page_data will always evaluate as true

it seems we missed checking if the node is online before we try to refer
NODE_DATA. Fix it.
Signed-off-by: NYinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: NIngo Molnar <mingo@elte.hu>
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>

7d2de137

x86: fix dma_alloc_pages · 6721fc0a

由 Yinghai Lu 提交于 2月 19, 2008

memory-less node support:

this patch uses updated dev_to_node, because dev_to_node already makes sure
it returns an online node.
Signed-off-by: NYinghai Lu <yinghai.lu@sun.com>
Signed-off-by: NIngo Molnar <mingo@elte.hu>
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>

6721fc0a

13 3月, 2008 1 次提交

documentation: Move power-related files to Documentation/power/ · 53471121

由 Randy Dunlap 提交于 3月 12, 2008

Move 00-INDEX entries to power/00-INDEX (and add entry for
pm_qos_interface.txt).

Update references to moved filenames.

Fix some trailing whitespace.
Signed-off-by: NRandy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: NLen Brown <len.brown@intel.com>

53471121

12 3月, 2008 3 次提交

x86: remove quicklists · 985a34bd

由 Thomas Gleixner 提交于 3月 09, 2008

quicklists cause a serious memory leak on 32-bit x86,
as documented at:

  http://bugzilla.kernel.org/show_bug.cgi?id=9991

the reason is that the quicklist pool is a special-purpose
cache that grows out of proportion. It is not accounted for
anywhere and users have no way to even realize that it's
the quicklists that are causing RAM usage spikes. It was
supposed to be a relatively small pool, but as demonstrated
by KOSAKI Motohiro, they can grow as large as:

  Quicklists:    1194304 kB

given how much trouble this code has caused historically,
and given that Andrew objected to its introduction on x86
(years ago), the best option at this point is to remove them.

[ any performance benefits of caching constructed pgds should
  be implemented in a more generic way (possibly within the page
  allocator), while still allowing constructed pages to be
  allocated by other workloads. ]
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

985a34bd

x86: ia32 syscall restart fix · 40f0933d

由 Roland McGrath 提交于 2月 28, 2008

The code to restart syscalls after signals depends on checking for a
negative orig_ax, and for particular negative -ERESTART* values in ax.
These fields are 64 bits and for a 32-bit task they get zero-extended.
The syscall restart behavior is lost, a regression from a native 32-bit
kernel and from 64-bit tasks' behavior.

This patch fixes the problem by doing sign-extension where it matters.

For orig_ax, the only time the value should be -1 but winds up as
0x0ffffffff is via a 32-bit ptrace call. So the patch changes ptrace to
sign-extend the 32-bit orig_eax value when it's stored; it doesn't
change the checks on orig_ax, though it uses the new current_syscall()
inline to better document the subtle importance of the used of
signedness there.

The ax value is stored a lot of ways and it seems hard to get them all
sign-extended at their origins. So for that, we use the
current_syscall_ret() to sign-extend it only for 32-bit tasks at the
time of the -ERESTART* comparisons.
Signed-off-by: NRoland McGrath <roland@redhat.com>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

40f0933d

I
x86: ioremap, remove WARN_ON() · 9a46d7e5
由 Ingo Molnar 提交于 2月 26, 2008
```
Signed-off-by: NIngo Molnar <mingo@elte.hu>
```
9a46d7e5

11 3月, 2008 3 次提交

fix BIOS PCI config cycle buglet causing ACPI boot regression · f5dbb55b

由 Ingo Molnar 提交于 3月 10, 2008

I figured out another ACPI related regression today.

randconfig testing triggered an early boot-time hang on a laptop of mine
(32-bit x86, config attached) - the screen was scrolling ACPI AML
exceptions [with no serial port and no early debugging available].

v2.6.24 works fine on that laptop with the same .config, so after a few
hours of bisection (had to restart it 3 times - other regressions
interacted), it honed in on this commit:

| 10270d48 is first bad commit
|
| Author: Linus Torvalds <torvalds@woody.linux-foundation.org>
| Date:   Wed Feb 13 09:56:14 2008 -0800
|
|     acpi: fix acpi_os_read_pci_configuration() misuse of raw_pci_read()

reverting this commit ontop of -rc5 gave a correctly booting kernel.

But this commit fixes a real bug so the real question is, why did it
break the bootup?

After quite some head-scratching, the following change stood out:

-                               pci_id->bus = tu8;
+                               pci_id->bus = val;

pci_id->bus is defined as u16:

   struct acpi_pci_id {
           u16 segment;
           u16 bus;
   ...

and 'tu8' changed from u8 to u32. So previously we'd unconditionally
mask the return value of acpi_os_read_pci_configuration()
(raw_pci_read()) to 8 bits, but now we just trust whatever comes back
from the PCI access routines and only crop it to 16 bits.

But if the high 8 bits of that result contains any noise then we'll
write that into ACPI's PCI ID descriptor and confuse the heck out of the
rest of ACPI.

So lets check the PCI-BIOS code on that theory. We have this codepath
for 8-bit accesses (arch/x86/pci/pcbios.c:pci_bios_read()):

        switch (len) {
        case 1:
                __asm__("lcall *(%%esi); cld\n\t"
                        "jc 1f\n\t"
                        "xor %%ah, %%ah\n"
                        "1:"
                        : "=c" (*value),
                          "=a" (result)
                        : "1" (PCIBIOS_READ_CONFIG_BYTE),
                          "b" (bx),
                          "D" ((long)reg),
                          "S" (&pci_indirect));

Aha! The "=a" output constraint puts the full 32 bits of EAX into
*value. But if the BIOS's routines set any of the high bits to nonzero,
we'll return a value with more set in it than intended.

The other, more common PCI access methods (v1 and v2 PCI reads) clear
out the high bits already, for example pci_conf1_read() does:

        switch (len) {
        case 1:
                *value = inb(0xCFC + (reg & 3));

which explicitly converts the return byte up to 32 bits and zero-extends
it.

So zero-extending the result in the PCI-BIOS read routine fixes the
regression on my laptop. ( It might fix some other long-standing issues
we had with PCI-BIOS during the past decade ... ) Both 8-bit and 16-bit
accesses were buggy.
Signed-off-by: NIngo Molnar <mingo@elte.hu>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

f5dbb55b

lguest: Revert , fix real problem. · 4357bd94

由 Rusty Russell 提交于 3月 11, 2008

Ahmed managed to crash the Host in release_pgd(), which cannot be a Guest
bug, and indeed it wasn't.

The bug was that handing a 0 as the address of the toplevel page table
being manipulated can cause the lookup code in find_pgdir() to return
an uninitialized cache entry (we shadow up to 4 top level page tables
for each Guest).

Commit 37cc8d7f introduced this
behaviour in the Guest, uncovering the bug.

The patch which he submitted (which removed the /4 from the index
calculation) simply ensured that these high-indexed entries hit the
early exit path of guest_set_pmd().  But you get lots of segfaults in
guest userspace as the PMDs aren't being updated.
Signed-off-by: NRusty Russell <rusty@rustcorp.com.au>

4357bd94

lguest: Sanitize the lguest clock. · 3fabc55f

由 Rusty Russell 提交于 3月 11, 2008

Now the TSC code handles a zero return from calculate_cpu_khz(),
lguest can simply pass through the value it gets from the Host: if
non-zero, all the normal TSC code applies.

Otherwise (or if the Host really doesn't support TSC), the clocksource
code will fall back to the slower but reasonable lguest clock.
Signed-off-by: NRusty Russell <rusty@rustcorp.com.au>

3fabc55f

08 3月, 2008 1 次提交

x86_64: make ptrace always sign-extend orig_ax to 64 bits · 84c6f604

由 Roland McGrath 提交于 3月 07, 2008

This makes 64-bit ptrace calls setting the 64-bit orig_ax field for a
32-bit task sign-extend the low 32 bits up to 64.  This matches what a
64-bit debugger expects when tracing a 32-bit task.

This follows on my "x86_64 ia32 syscall restart fix".  This didn't
matter until that was fixed.

The debugger ignores or zeros the high half of every register slot it
sets (including the orig_rax pseudo-register) uniformly.  It expects
that the setting of the low 32 bits always has the same meaning as a
32-bit debugger setting those same 32 bits with native 32-bit
facilities.

This never arose before because the syscall restart check never
matched any -ERESTART* values due to lack of sign extension.  Before
that fix, even 32-bit ptrace setting orig_eax to -1 failed to trigger
the restart check anyway.  So this was never noticed as a regression
of 64-bit debuggers vs 32-bit debuggers on the same 64-bit kernel.
Signed-off-by: NRoland McGrath <roland@redhat.com>
[ Changed to just do the sign-extension unconditionally on x86-64,
  since orig_ax is always just a small integer and doesn't need
  the full 64-bit range ]
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

84c6f604

07 3月, 2008 5 次提交

x86-boot: don't request VBE2 information · 1722770f

由 Peter Korsgaard 提交于 3月 06, 2008

The new x86 setup code (4fd06960) broke booting on an old P3/500MHz
with an onboard Voodoo3 of mine. After debugging it, it turned out
to be caused by the fact that the vesa probing now asks for VBE2 data.

Disassembing the video BIOS shows that it overflows the vesa_general_info
structure when VBE2 data is requested because the source addresses for the
information strings which get strcpy'ed to the buffer lie outside the 32K
BIOS code (and hence contain long sequences of 0xff's).

E.G.:

get_vbe_controller_info:
00002A9C  60                pushaw
00002A9D  1E                push ds
00002A9E  0E                push cs
00002A9F  1F                pop ds
00002AA0  2BC9              sub cx,cx
00002AA2  6626813D56424532  cmp dword [es:di],0x32454256 ; "VBE2"
00002AAA  7501              jnz .1
00002AAC  41                inc cx
.1:
00002AAD  51                push cx
00002AAE  B91400            mov cx,0x14
00002AB1  BED47F            mov si, controller_header
00002AB4  57                push di
00002AB5  F3A4              rep movsb ; copy vbe1.2 header

00002AB7  B9EC00            mov cx,0xec
00002ABA  2AC0              sub al,al
00002ABC  F3AA              rep stosb ; zero pad remainder

00002ABE  5F                pop di
00002ABF  E8EB0D            call word get_memory
00002AC2  C1E002            shl ax,0x2
00002AC5  26894512          mov [es:di+0x12],ax ; total memory
00002AC9  26C745040003      mov word [es:di+0x4],0x300 ; VBE version
00002ACF  268C4D08          mov [es:di+0x8],cs
00002AD3  268C4D10          mov [es:di+0x10],cs
00002AD7  59                pop cx
00002AD8  E361              jcxz .done ; VBE2 requested?
00002ADA  8D9D0001          lea bx,[di+0x100]
00002ADE  53                push bx
00002ADF  87DF              xchg bx,di ; di now points to 2nd half
00002AE1  26C747140001      mov word [es:bx+0x14],0x100 ; sw rev

00002AE7  26897F06          mov [es:bx+0x6],di		; oem string
00002AEB  268C4708          mov [es:bx+0x8],es
00002AEF  BE5280            mov si,0x8052 ; oem string
00002AF2  E87A1B            call word strcpy

00002AF5  26897F0E          mov [es:bx+0xe],di ; video mode list
00002AF9  268C4710          mov [es:bx+0x10],es
00002AFD  B91E00            mov cx,0x1e
00002B00  BEE87F            mov si,vidmodes
00002B03  F3A5              rep movsw

00002B05  26897F16          mov [es:bx+0x16],di ; oem vendor
00002B09  268C4718          mov [es:bx+0x18],es
00002B0D  BE2480            mov si,0x8024 ; oem vendor
00002B10  E85C1B            call word strcpy

00002B13  26897F1A          mov [es:bx+0x1a],di ; oem product
00002B17  268C471C          mov [es:bx+0x1c],es
00002B1B  BE3880            mov si,0x8038 ; oem product
00002B1E  E84E1B            call word strcpy

00002B21  26897F1E          mov [es:bx+0x1e],di ; oem product rev
00002B25  268C4720          mov [es:bx+0x20],es
00002B29  BE4580            mov si,0x8045 ; oem product rev
00002B2C  E8401B            call word strcpy

00002B2F  58                pop ax
00002B30  B90001            mov cx,0x100
00002B33  2BCF              sub cx,di
00002B35  03C8              add cx,ax
00002B37  2AC0              sub al,al
00002B39  F3AA              rep stosb ; zero pad
.done:
00002B3B  1F                pop ds
00002B3C  61                popaw
00002B3D  B84F00            mov ax,0x4f
00002B40  C3                ret

(The full BIOS can be found at http://peter.korsgaard.com/vgabios.bin
if interested).

The old setup code didn't ask for VBE2 info, and the new code doesn't
actually do anything with the extra information, so the fix is to simply
not request it. Other BIOS'es might have the same problem.
Signed-off-by: NPeter Korsgaard <jacmet@sunsite.dk>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

1722770f

x86: re-add reboot fixups · 7432d149

由 Ingo Molnar 提交于 3月 06, 2008

Jan Beulich noticed that the reboot fixups went missing during
reboot.c unification.

(commit 4d022e35)

Geode and a few other rare boards with special reboot quirks are
affected.
Reported-by: NJan Beulich <jbeulich@novell.com>
Signed-off-by: NJan Beulich <jbeulich@novell.com>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

7432d149

x86: fix typo in step.c · d032b31a

由 Jan Beulich 提交于 3月 05, 2008

TIF_DEBUGCTLMSR has no meaning in the actual MSR...
Signed-off-by: NJan Beulich <jbeulich@novell.com>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

d032b31a

x86: fix merge mistake in i387.c · 609b5297

由 Jan Beulich 提交于 3月 05, 2008

convert_fxsr_to_user() in 2.6.24's i387_32.c did this, and
convert_to_fxsr() also does the inverse, so I assume it's an oversight
that it is no longer being done.

[ mingo@elte.hu:

  we encode it this way because there's no space for the 'FPU Last
  Instruction Opcode' (->fop) field in the legacy user_i387_ia32_struct
  that PTRACE_GETFPREGS/PTRACE_SETFPREGS uses.

  it's probably pure legacy - i'd be surprised if any user-space relied on
  the FPU Last Opcode in any way. But indeed we used to do it previously
  so the most conservative thing is to preserve that piece of information.
]
Signed-off-by: NJan Beulich <jbeulich@novell.com>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

609b5297

x86: clear DF before calling signal handler · e40cd10c

由 Aurelien Jarno 提交于 3月 05, 2008

The Linux kernel currently does not clear the direction flag before
calling a signal handler, whereas the x86/x86-64 ABI requires that.

Linux had this behavior/bug forever, but this becomes a real problem
with gcc version 4.3, which assumes that the direction flag is
correctly cleared at the entry of a function.

This patches changes the setup_frame() functions to clear the
direction before entering the signal handler.
Signed-off-by: NAurelien Jarno <aurelien@aurel32.net>
Signed-off-by: NIngo Molnar <mingo@elte.hu>
Acked-by: NH. Peter Anvin <hpa@zytor.com>

e40cd10c

06 3月, 2008 1 次提交

[CPUFREQ] Remove debugging message from e_powersaver · 0e5aa8d6

由 Dave Jones 提交于 2月 15, 2008

We don't need to printk a message every time we transition.
Leave the code there, but ifdef'd out, as it's useful when
adding support for new processors.
Reported-by: NPetr Titěra <P.Titera@century.cz>
Signed-off-by: NDave Jones <davej@redhat.com>

0e5aa8d6

05 3月, 2008 5 次提交

Kprobes: indicate kretprobe support in Kconfig · 9edddaa2

由 Ananth N Mavinakayanahalli 提交于 3月 04, 2008

Add CONFIG_HAVE_KRETPROBES to the arch/<arch>/Kconfig file for relevant
architectures with kprobes support.  This facilitates easy handling of
in-kernel modules (like samples/kprobes/kretprobe_example.c) that depend on
kretprobes being present in the kernel.

Thanks to Sam Ravnborg for helping make the patch more lean.

Per Mathieu's suggestion, added CONFIG_KRETPROBES and fixed up dependencies.
Signed-off-by: NAnanth N Mavinakayanahalli <ananth@in.ibm.com>
Acked-by: NMathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
Acked-by: NIngo Molnar <mingo@elte.hu>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

9edddaa2

x86: a P4 is a P6 not an i486 · fcab59a3

由 Hugh Dickins 提交于 3月 04, 2008

P4 has been coming out as CPU_FAMILY=4 instead of 6: fix MPENTIUM4 typo.
Signed-off-by: NHugh Dickins <hugh@veritas.com>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

fcab59a3

x86/xen: fix DomU boot problem · 87d034f3

由 Ian Campbell 提交于 2月 28, 2008

Construct Xen guest e820 map with a hole between 640K-1M.

It's pure luck that Xen kernels have gotten away with it in the past.

The patch below seems like the right thing to do. It certainly boots in
a domU without the DMI problem (without any of the other related patches
such as Alexander's).
Signed-off-by: NIan Campbell <ijc@hellion.org.uk>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Tested-by: NMark McLoughlin <markmc@redhat.com>
Acked-by: NMark McLoughlin <markmc@redhat.com>
Signed-off-by: NIngo Molnar <mingo@elte.hu>
Acked-by: NThomas Gleixner <tglx@linutronix.de>

87d034f3

x86: not set node to cpu_to_node if the node is not online · 7c9e92b6

由 Yinghai Lu 提交于 2月 19, 2008

resolve boot problem reported by Mel Gorman:

   http://lkml.org/lkml/2008/2/13/404

init_cpu_to_node will use cpu->apic (from MADT or mptable) and
apic->node(from SRAT or AMD config space with k8_bus_64.c) to have
cpu->node mapping, and later identify_cpu will overwrite them
again...(with nearby_node...)

this patch checks if the node is online, otherwise it will not
update cpu_node map. so keep cpu_node map to online node before
identify_cpu..., to prevent possible error.
Signed-off-by: NYinghai Lu <yinghai.lu@sun.com>
Signed-off-by: NIngo Molnar <mingo@elte.hu>
Acked-by: NThomas Gleixner <tglx@linutronix.de>

7c9e92b6

x86, i387: fix ptrace leakage using init_fpu() · 18a86221

由 Suresh Siddha 提交于 3月 03, 2008

This bug got introduced by the recent i387 merge:

  commit 44210111
  Author: Roland McGrath <roland@redhat.com>
  Date:   Wed Jan 30 13:31:50 2008 +0100

      x86: x86 i387 user_regset

Current usage of unlazy_fpu() in ptrace specific routines is wrong.
unlazy_fpu() will not init fpu if the task never used math. So the
ptrace calls can expose the parent tasks FPU data in some cases.

Replace it with the init_fpu() which will init the math state, if the
task never used math before.
Signed-off-by: NSuresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: NIngo Molnar <mingo@elte.hu>
Acked-by: NThomas Gleixner <tglx@linutronix.de>

18a86221

04 3月, 2008 4 次提交

x86: disable KVM for Voyager and friends · 1a4e3f89

由 Randy Dunlap 提交于 2月 20, 2008

Most classic Pentiums don't have hardware virtualization extension,
and building kvm with Voyager, Visual Workstation, or NUMAQ
generates spurious failures.
Signed-off-by: NAvi Kivity <avi@qumranet.com>
Signed-off-by: NRandy Dunlap <randy.dunlap@oracle.com>

1a4e3f89

KVM: VMX: Avoid rearranging switched guest msrs while they are loaded · 33f9c505

由 Avi Kivity 提交于 2月 27, 2008

KVM tries to run as much as possible with the guest msrs loaded instead of
host msrs, since switching msrs is very expensive.  It also tries to minimize
the number of msrs switched according to the guest mode; for example,
MSR_LSTAR is needed only by long mode guests.  This optimization is done by
setup_msrs().

However, we must not change which msrs are switched while we are running with
guest msr state:

 - switch to guest msr state
 - call setup_msrs(), removing some msrs from the list
 - switch to host msr state, leaving a few guest msrs loaded

An easy way to trigger this is to kexec an x86_64 linux guest.  Early during
setup, the guest will switch EFER to not include SCE.  KVM will stop saving
MSR_LSTAR, and on the next msr switch it will leave the guest LSTAR loaded.
The next host syscall will end up in a random location in the kernel.

Fix by reloading the host msrs before changing the msr list.
Signed-off-by: NAvi Kivity <avi@qumranet.com>

33f9c505

KVM: MMU: Fix race when instantiating a shadow pte · f7d9c7b7

由 Avi Kivity 提交于 2月 26, 2008

For improved concurrency, the guest walk is performed concurrently with other
vcpus.  This means that we need to revalidate the guest ptes once we have
write-protected the guest page tables, at which point they can no longer be
modified.

The current code attempts to avoid this check if the shadow page table is not
new, on the assumption that if it has existed before, the guest could not have
modified the pte without the shadow lock.  However the assumption is incorrect,
as the racing vcpu could have modified the pte, then instantiated the shadow
page, before our vcpu regains control:

  vcpu0        vcpu1

  fault
  walk pte

               modify pte
               fault in same pagetable
               instantiate shadow page

  lookup shadow page
  conclude it is old
  instantiate spte based on stale guest pte

We could do something clever with generation counters, but a test run by
Marcelo suggests this is unnecessary and we can just do the revalidation
unconditionally.  The pte will be in the processor cache and the check can
be quite fast.
Signed-off-by: NAvi Kivity <avi@qumranet.com>

f7d9c7b7

KVM: Avoid infinite-frequency local apic timer · 0b975a3c

由 Avi Kivity 提交于 2月 24, 2008

If the local apic initial count is zero, don't start a an hrtimer with infinite
frequency, locking up the host.
Signed-off-by: NAvi Kivity <avi@qumranet.com>

0b975a3c