Commits · c10abbb26513f4ccff89c4d80912cb4d36fcd3e8 · openeuler / raspberrypi-kernel

18 Dec, 2010 5 commits

x86: avoid high BIOS area when allocating address space · a2c606d5

Authored by Bjorn Helgaas on Dec 16, 2010

This prevents allocation of the last 2MB before 4GB.

The experiment described here shows Windows 7 ignoring the last 1MB:
https://bugzilla.kernel.org/show_bug.cgi?id=23542#c27

This patch ignores the top 2MB instead of just 1MB because H. Peter Anvin
says "There will be ROM at the top of the 32-bit address space; it's a fact
of the architecture, and on at least older systems it was common to have a
shadow 1 MiB below."
Acked-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>

x86: avoid E820 regions when allocating address space · 4dc2287c

Authored by Bjorn Helgaas on Dec 16, 2010

When we allocate address space, e.g., to assign it to a PCI device, don't
allocate anything mentioned in the BIOS E820 memory map.

On recent machines (2008 and newer), we assign PCI resources from the
windows described by the ACPI PCI host bridge _CRS. On many Dell
machines, these windows overlap some E820 reserved areas, e.g.,

BIOS-e820: 00000000bfe4dc00 - 00000000c0000000 (reserved)
pci_root PNP0A03:00: host bridge window [mem 0xbff00000-0xdfffffff]

If we put devices at 0xbff00000, they don't work, probably because
that's really RAM, not I/O memory. This patch prevents that by removing
the 0xbfe4dc00-0xbfffffff area from the "available" resource.

I'm not very happy with this solution because Windows solves the problem
differently (it seems to ignore E820 reserved areas and it allocates
top-down instead of bottom-up; details at comment 45 of the bugzilla
below). That means we're vulnerable to BIOS defects that Windows would not
trip over. For example, if BIOS described a device in ACPI but didn't
mention it in E820, Windows would work fine but Linux would fail.

Reference: https://bugzilla.kernel.org/show_bug.cgi?id=16228
Acked-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
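
The clipping described in this commit message can be sketched in standalone C. This is only an illustration under stated assumptions, not the kernel code: `struct range`, `clip_range()`, and `remove_reserved()` are hypothetical names, and the policy of keeping the larger remainder when a reserved area splits the range is an assumption of the sketch.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical inclusive address range; not the kernel's struct resource. */
struct range {
	uint64_t start;
	uint64_t end;
};

/*
 * Remove [start, end] from *r. If the excluded area splits the range,
 * keep the larger remainder (an assumption of this sketch). If nothing
 * remains, *r ends up empty (r->start > r->end).
 */
void clip_range(struct range *r, uint64_t start, uint64_t end)
{
	uint64_t low = 0, high = 0;

	if (r->end < start || r->start > end)
		return;                   /* no overlap, nothing to do */
	if (r->start < start)
		low = start - r->start;   /* bytes left below the excluded area */
	if (r->end > end)
		high = r->end - end;      /* bytes left above the excluded area */
	if (low > high)
		r->end = start - 1;
	else
		r->start = end + 1;
}

/* Remove every listed reserved area from the available range. */
void remove_reserved(struct range *avail, const struct range *reserved,
		     size_t n)
{
	for (size_t i = 0; i < n; i++)
		clip_range(avail, reserved[i].start, reserved[i].end);
}
```

With the Dell example above, clipping the reserved area 0xbfe4dc00-0xbfffffff out of the host bridge window 0xbff00000-0xdfffffff leaves 0xc0000000-0xdfffffff available.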

x86: avoid low BIOS area when allocating address space · 30919b0b

Authored by Bjorn Helgaas on Dec 16, 2010

This implements arch_remove_reservations() so allocate_resource() can
avoid any arch-specific reserved areas.  This currently just avoids the
BIOS area (the first 1MB), but could be used for E820 reserved areas if
that turns out to be necessary.

We previously avoided this area in pcibios_align_resource().  This patch
moves the test from that PCI-specific path to a generic path, so *all*
resource allocations will avoid this area.
Acked-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>

Revert "x86/PCI: allocate space from the end of a region, not the beginning" · d14125ec

Authored by Bjorn Helgaas on Dec 16, 2010

This reverts commit dc9887dc.
Acked-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>

Revert "x86: allocate space within a region top-down" · 5e52f1c5

Authored by Bjorn Helgaas on Dec 16, 2010

This reverts commit 1af3c2e4.
Acked-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>

17 Dec, 2010 1 commit

x86-32: Make sure we can map all of lowmem if we need to · 147dd561

Authored by H. Peter Anvin on Dec 16, 2010

A relocatable kernel can be anywhere in lowmem -- and in the case of a
kdump kernel, is likely to be fairly high.  Since the early page
tables map everything from address zero up we need to make sure we
allocate enough brk that we can map all of lowmem if we need to.
Reported-by: Stanislaw Gruszka <sgruszka@redhat.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Tested-by: Yinghai Lu <yinghai@kernel.org>
LKML-Reference: <4D0AD3ED.8070607@kernel.org>
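
To give a feel for how much early page-table space "all of lowmem" can require, here is a rough sizing sketch. It assumes 4 KiB pages and classic 32-bit non-PAE paging with 1024 entries per page table; the function names and constants are hypothetical and are not taken from the patch.

```c
#include <stdint.h>

#define SKETCH_PAGE_SIZE      4096u  /* 4 KiB pages (assumption) */
#define SKETCH_PTES_PER_TABLE 1024u  /* non-PAE: 4-byte entries per 4 KiB table */

/* Number of page-table pages needed to map [0, lowmem_bytes). */
uint32_t page_tables_needed(uint64_t lowmem_bytes)
{
	uint64_t pages = (lowmem_bytes + SKETCH_PAGE_SIZE - 1) / SKETCH_PAGE_SIZE;

	return (uint32_t)((pages + SKETCH_PTES_PER_TABLE - 1) /
			  SKETCH_PTES_PER_TABLE);
}

/* Bytes of early allocation space those page tables would occupy. */
uint64_t brk_bytes_needed(uint64_t lowmem_bytes)
{
	return (uint64_t)page_tables_needed(lowmem_bytes) * SKETCH_PAGE_SIZE;
}
```

Under these assumptions, mapping 896 MiB of lowmem takes 224 page-table pages, or 896 KiB of space.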

16 Dec, 2010 4 commits

KVM: Fix preemption counter leak in kvm_timer_init() · 3e26f230

Authored by Avi Kivity on Dec 16, 2010

Based on a patch from Thomas Meyer.
Signed-off-by: Avi Kivity <avi@redhat.com>

lguest: populate initial_page_table · da32dac1

Authored by Rusty Russell on Dec 16, 2010

Two x86 patches broke lguest:
1) v2.6.35-492-g72d7c3b3, which changed x86 to use the memblock allocator.

In lguest, the host places linear page tables at the top of mem, which
used to be enough to get us up to the swapper_pg_dir page tables.  With
the first patch, the direct mapping tables used that memory:

Before: kernel direct mapping tables up to 4000000 @ 7000-1a000
After: kernel direct mapping tables up to 4000000 @ 3fed000-4000000

I initially fixed this by lying about the amount of memory we had, so
the kernel wouldn't blatt the lguest boot pagetables (yuk!), but then...

2) v2.6.36-rc8-54-gb40827fa, which made x86 boot use initial_page_table.

This was initialized in a part of head_32.S which isn't executed by
lguest; it is then copied into swapper_pg_dir.  So we have to initialize
it; and anyway we switch to it before we blatt the old tables, so that
fixes the previous damage as well.

For the moment, I cut & pasted the code into lguest's boot code, but
next merge window I will merge them.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Cc: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
To: x86@kernel.org

lguest: restore boot speed · bb4093de

Authored by Rusty Russell on Dec 16, 2010

lguest is dumb and drops *all* the pagetables for set_pte (which is
only used for kernel mapping manipulation, so it's OK without highmem).

But it's used a lot in boot, too.  As a guest optimization, we
suppressed this flushing until the first page switch.  Now we have
initial_page_table, that happens much earlier, so extend the heuristic
to wait until we switch to something other than the swapper_pg_dir or
initial_page_table.

As measured on my laptop under kvm, this dropped the time-to-mount-root
from 48 seconds to 4.3 seconds.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>

lguest: fix crash lguest_time_init · bb6f1d9a

Authored by Rusty Russell on Dec 16, 2010

fe25c7fc "x86: lguest: Convert to new irq chip functions" converted
enable_lguest_irq() to take a struct irq_data *, but didn't fix the one
internal caller.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
To: x86@kernel.org

15 Dec, 2010 1 commit

crypto: ghash-intel - ghash-clmulni-intel_glue needs err.h · 52f6c5ad

Authored by Randy Dunlap on Dec 15, 2010

Add missing header file:

arch/x86/crypto/ghash-clmulni-intel_glue.c:256: error: implicit declaration of function 'IS_ERR'
arch/x86/crypto/ghash-clmulni-intel_glue.c:257: error: implicit declaration of function 'PTR_ERR'
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
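
The two helpers named in the build errors implement the error-pointer convention: a small negative errno value is carried inside a pointer return value. The userspace sketch below shows the idea only. The `sketch_` names and `SKETCH_MAX_ERRNO` are hypothetical stand-ins, not the kernel definitions from err.h.

```c
#include <stdint.h>

#define SKETCH_MAX_ERRNO 4095  /* assumed upper bound on errno values */

/* Encode a negative errno value as a pointer. */
void *sketch_err_ptr(long error)
{
	return (void *)(intptr_t)error;
}

/* True if ptr lies in the topmost SKETCH_MAX_ERRNO addresses. */
int sketch_is_err(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)(-SKETCH_MAX_ERRNO);
}

/* Recover the errno value from an error pointer. */
long sketch_ptr_err(const void *ptr)
{
	return (long)(intptr_t)ptr;
}
```

A caller would test the returned pointer with `sketch_is_err()` first and only then decode it with `sketch_ptr_err()`.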

14 Dec, 2010 4 commits

x86: Enable the intr-remap fault handling after local APIC setup · 7f7fbf45

Authored by Kenji Kaneshige on Nov 30, 2010

Interrupt remapping gets enabled very early in boot, as it determines the
APIC mode that the processor can use. The current code enables the VT-d
fault handling before setup_local_APIC(), so the APIC LDR registers and the
in-memory data structures may not be initialized yet. As a result, VT-d
fault handling in logical xapic/x2apic modes was broken.

Fix this by enabling the VT-d fault handling in end_local_APIC_setup().

A cleaner fix of enabling fault handling while enabling intr-remapping
will be addressed for v2.6.38. [ Enabling intr-remapping determines the
usage of x2apic mode and the apic mode determines the fault-handling
configuration. ]
Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
LKML-Reference: <20101201062244.541996375@intel.com>
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: stable@kernel.org [v2.6.32+]
Acked-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>

x86, vt-d: Fix the vt-d fault handling irq migration in the x2apic mode · 086e8ced

Authored by Kenji Kaneshige on Dec 01, 2010

In x2apic mode, we need to set the upper address register of the fault
handling interrupt register of the vt-d hardware. Without this,
irq migration of the vt-d fault handling interrupt is broken.
Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
LKML-Reference: <1291225233.2648.39.camel@sbsiddha-MOBL3>
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: stable@kernel.org [v2.6.32+]
Acked-by: Chris Wright <chrisw@sous-sol.org>
Tested-by: Takao Indoh <indou.takao@jp.fujitsu.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>

x86, xsave: Use alloc_bootmem_align() instead of alloc_bootmem() · 10340ae1

Authored by Suresh Siddha on Nov 16, 2010

Alignment of alloc_bootmem() depends on the value of
L1_CACHE_SHIFT. What we need here, however, is 64 byte alignment.  Use
alloc_bootmem_align() and explicitly specify the alignment instead.

This fixes a kernel boot crash reported by Jody when the cpu in .config
is set to MPENTIUMII but the kernel is booted on a xsave-capable CPU.
Reported-by: Jody Bruchon <jody@nctritech.com>
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
LKML-Reference: <20101116212442.059967454@sbsiddha-MOBL3.sc.intel.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Cc: <stable@kernel.org>
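
The difference between an implicit and an explicit alignment can be sketched with a toy bump allocator. This is a minimal illustration, not the bootmem allocator: `align_up()` and `bump_alloc_aligned()` are hypothetical names, and the 32-byte figure used in the note below is only an example of what a smaller cache-line setting might yield.

```c
#include <stdint.h>

/* Round addr up to the next multiple of align; align must be a power of two. */
uintptr_t align_up(uintptr_t addr, uintptr_t align)
{
	return (addr + align - 1) & ~(align - 1);
}

/*
 * Carve size bytes, aligned to align, out of a simple bump allocator.
 * *cursor is the next free address and is advanced past the allocation.
 */
uintptr_t bump_alloc_aligned(uintptr_t *cursor, uintptr_t size,
			     uintptr_t align)
{
	uintptr_t p = align_up(*cursor, align);

	*cursor = p + size;
	return p;
}
```

Starting from a cursor of 0x1020, a 32-byte alignment returns 0x1020, which is not 64-byte aligned, while passing 64 explicitly returns 0x1040.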

x86, gcc-4.6: Use gcc -m options when building vdso · de2a8cf9

Authored by H. Peter Anvin on Dec 13, 2010

The vdso Makefile passes linker-style -m options not to the linker but
to gcc.  This happens to work with earlier gcc, but fails with gcc
4.6.  Pass gcc-style -m options, instead.

Note: all currently supported versions of gcc support -m32, so there
is no reason to conditionalize it any more.
Reported-by: H. J. Lu <hjl.tools@gmail.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
LKML-Reference: <tip-*@git.kernel.org>
Cc: <stable@kernel.org>

13 Dec, 2010 1 commit

x86: HPET: Chose a paranoid safe value for the ETIME check · f1c18071

Authored by Thomas Gleixner on Dec 13, 2010

commit 995bd3bb (x86: Hpet: Avoid the comparator readback penalty)
chose 8 HPET cycles as a safe value for the ETIME check, as we had the
confirmation that the posted write to the comparator register is
delayed by two HPET clock cycles on Intel chipsets which showed
readback problems.

After that patch hit mainline we got reports from machines with newer
AMD chipsets which seem to have an even longer delay. See
http://thread.gmane.org/gmane.linux.kernel/1054283 and
http://thread.gmane.org/gmane.linux.kernel/1069458 for further
information.

Boris tried to come up with an ACPI based selection of the minimum
HPET cycles, but this failed on a couple of test machines. And of
course we did not get any useful information from the hardware folks.

For now our only option is to choose a paranoid high and safe value for
the minimum HPET cycles used by the ETIME check. Adjust the minimum ns
value for the HPET clockevent accordingly.
Reported-Bisected-and-Tested-by: Markus Trippelsdorf <markus@trippelsdorf.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
LKML-Reference: <alpine.LFD.2.00.1012131222420.2653@localhost6.localdomain6>
Cc: Simon Kirby <sim@hostway.ca>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Andreas Herrmann <Andreas.Herrmann3@amd.com>
Cc: John Stultz <johnstul@us.ibm.com>
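
The kind of check discussed here can be sketched as follows. This is a simplified illustration, not the HPET driver code: `etime_check()` is a hypothetical name, the counter is assumed to be 32 bits wide, and the threshold is left as a parameter because the commit message does not state the value that was chosen.

```c
#include <errno.h>
#include <stdint.h>

/*
 * Return 0 if the programmed comparator value cmp is still at least
 * min_cycles ahead of the current counter value now, otherwise -ETIME.
 * The difference is computed modulo 2^32 and then read as a signed
 * number, so a counter wraparound between now and cmp is handled.
 */
int etime_check(uint32_t cmp, uint32_t now, int32_t min_cycles)
{
	int32_t delta = (int32_t)(cmp - now);

	return delta < min_cycles ? -ETIME : 0;
}
```

A larger `min_cycles` makes the check more conservative: an event that is only a few cycles away is reported as already missed, so the caller can reprogram it further into the future.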

10 Dec, 2010 1 commit

x86: io_apic: Avoid unused variable warning when CONFIG_GENERIC_PENDING_IRQ=n · 4720dd1b

Authored by Thomas Gleixner on Dec 09, 2010

arch/x86/kernel/apic/io_apic.c: In function 'ack_apic_level':
arch/x86/kernel/apic/io_apic.c:2433: warning: unused variable 'desc'
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
LKML-Reference: <201010272107.o9RL7rse018212@imap1.linux-foundation.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

08 Dec, 2010 3 commits

KVM: enlarge number of possible CPUID leaves · 73c1160c

Authored by Andre Przywara on Dec 01, 2010

Currently the number of CPUID leaves KVM handles is limited to 40.
My desktop machine (AthlonII) already has 35 and future CPUs will
expand this well beyond the limit. Extend the limit to 80 to make
room for future processors.

KVM-Stable-Tag.
Signed-off-by: Andre Przywara <andre.przywara@amd.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: SVM: Do not report xsave in supported cpuid · 24d1b15f

Authored by Joerg Roedel on Dec 07, 2010

To support xsave properly for the guest, the SVM module needs
software support for it. As long as this is not present, do
not report xsave as a supported feature in cpuid.
As a side effect, this patch moves the bit() helper function
into the x86.h file so that it can be used in svm.c too.

KVM-Stable-Tag.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: Fix OSXSAVE after migration · 3ea3aa8c

Authored by Sheng Yang on Dec 08, 2010

CPUID's OSXSAVE is a mirror of the CR4.OSXSAVE bit. We need to update the CPUID
after migration.

KVM-Stable-Tag.
Signed-off-by: Sheng Yang <sheng@linux.intel.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
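
The mirroring relationship can be sketched in a few lines. The bit positions (CR4 bit 18, CPUID leaf 1 ECX bit 27) follow the x86 architecture definition. The function name `sync_osxsave()` is hypothetical and this is not the KVM code.

```c
#include <stdint.h>

#define SKETCH_CR4_OSXSAVE   (1u << 18)  /* CR4.OSXSAVE */
#define SKETCH_CPUID_OSXSAVE (1u << 27)  /* CPUID.01H:ECX.OSXSAVE */

/* Return ecx with its OSXSAVE bit set or cleared to match cr4. */
uint32_t sync_osxsave(uint32_t ecx, uint64_t cr4)
{
	ecx &= ~SKETCH_CPUID_OSXSAVE;
	if (cr4 & SKETCH_CR4_OSXSAVE)
		ecx |= SKETCH_CPUID_OSXSAVE;
	return ecx;
}
```

After a migration restores CR4, the cached CPUID leaf would be passed through a helper like this one so that the guest-visible bit matches the restored register.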

03 Dec, 2010 1 commit

vmalloc: eagerly clear ptes on vunmap · 64141da5

Authored by Jeremy Fitzhardinge on Dec 02, 2010

On stock 2.6.37-rc4, running:

  # mount lilith:/export /mnt/lilith
  # find  /mnt/lilith/ -type f -print0 | xargs -0 file

crashes the machine fairly quickly under Xen.  Often it results in oops
messages, but the couple of times I tried just now, it just hung quietly
and made Xen print some rude messages:

    (XEN) mm.c:2389:d80 Bad type (saw 7400000000000001 != exp
    3000000000000000) for mfn 1d7058 (pfn 18fa7)
    (XEN) mm.c:964:d80 Attempt to create linear p.t. with write perms
    (XEN) mm.c:2389:d80 Bad type (saw 7400000000000010 != exp
    1000000000000000) for mfn 1d2e04 (pfn 1d1fb)
    (XEN) mm.c:2965:d80 Error while pinning mfn 1d2e04

Which means the domain tried to map a pagetable page RW, which would
allow it to map arbitrary memory, so Xen stopped it.  This is because
vm_unmap_ram() left some pages mapped in the vmalloc area after NFS had
finished with them, and those pages got recycled as pagetable pages
while still having these RW aliases.

Removing those mappings immediately removes the Xen-visible aliases, and
so it has no problem with those pages being reused as pagetable pages.
Deferring the TLB flush doesn't upset Xen because it can flush the TLB
itself as needed to maintain its invariants.

When unmapping a region in the vmalloc space, clear the ptes
immediately.  There's no point in deferring this because there's no
amortization benefit.

The TLBs are left dirty, and they are flushed lazily to amortize the
cost of the IPIs.

The specific motivation for this patch is an oops-causing regression
since 2.6.36 when using NFS under Xen, triggered by the NFS client's use
of vm_map_ram() introduced in 56e4ebf8 ("NFS: readdir with vmapped
pages").  XFS also uses vm_map_ram() and could cause similar problems.
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: Nick Piggin <npiggin@kernel.dk>
Cc: Bryan Schumaker <bjschuma@netapp.com>
Cc: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: Alex Elder <aelder@sgi.com>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

02 Dec, 2010 2 commits

xen: unplug the emulated devices at resume time · 512b109e

Authored by Stefano Stabellini on Dec 01, 2010

Early after being resumed, we need to unplug the emulated devices again.
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>

xen: fix MSI setup and teardown for PV on HVM guests · af42b8d1

Authored by Stefano Stabellini on Dec 01, 2010

When remapping MSIs into pirqs for PV on HVM guests, qemu is responsible
for doing the actual mapping and unmapping.
We only give qemu the desired pirq number when we ask for the mapping
the first time; after that we should read back the pirq number
from qemu every time we want to re-enable the MSI.

This fixes a bug in xen_hvm_setup_msi_irqs that manifests itself when
trying to enable the same MSI for the second time: the old MSI to pirq
mapping is still valid at this point but xen_hvm_setup_msi_irqs would
try to assign a new pirq anyway.
A simple way to reproduce this bug is to assign an MSI-capable network
card to a PV on HVM guest. If the user brings the corresponding
ethernet interface down and up again, Linux fails to enable MSIs on the
device.
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>

30 Nov, 2010 2 commits

xen: x86/32: perform initial startup on initial_page_table · 805e3f49

Authored by Ian Campbell on Nov 03, 2010

Only make swapper_pg_dir readonly and pinned when generic x86 architecture code
(which also starts on initial_page_table) switches to it.  This helps ensure
that the generic setup paths work on Xen unmodified. In particular
clone_pgd_range writes directly to the destination pgd entries and is used to
initialise swapper_pg_dir so we need to ensure that it remains writeable until
the last possible moment during bring up.

This is complicated slightly by the need to avoid sharing kernel PMD entries
when running under Xen, therefore the Xen implementation must make a copy of
the kernel PMD (which is otherwise referred to by both initial_page_table and
swapper_pg_dir) before switching to swapper_pg_dir.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: H. Peter Anvin <hpa@linux.intel.com>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>

xen: don't bother to stop other cpus on shutdown/reboot · 31e323cc

Authored by Jeremy Fitzhardinge on Nov 29, 2010

Xen will shoot all the VCPUs when we do a shutdown hypercall, so there's
no need to do it manually.

In any case it will fail because all the IPI irqs have been pulled
down by this point, so the cross-CPU calls will simply hang forever.

Until change 76fac077 the function calls
were not synchronously waited for, so this wasn't apparent.  However after
that change the calls became synchronous leading to a hang on shutdown
on multi-VCPU guests.
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: Stable Kernel <stable@kernel.org>
Cc: Alok Kataria <akataria@vmware.com>

28 Nov, 2010 1 commit

x86/pvclock: Zero last_value on resume · e7a3481c

Authored by Jeremy Fitzhardinge on Oct 25, 2010

If the guest domain has been suspended/resumed or migrated, then the
system clock backing the pvclock clocksource may revert to a smaller
value (i.e., it can be non-monotonic across the migration/save-restore).

Make sure we zero last_value in that case so that the domain
continues to see clock updates.
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
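
Why a stale last_value stops the clock can be seen in a small single-threaded model. This is a sketch under simplifying assumptions: `clock_read_monotonic()` and `clock_resume()` are hypothetical names, and a real implementation shared between CPUs would need atomic updates, which are omitted here.

```c
#include <stdint.h>

static uint64_t last_value;  /* highest value handed out so far */

/* Return a never-decreasing clock value based on the raw reading. */
uint64_t clock_read_monotonic(uint64_t raw)
{
	if (raw < last_value)
		return last_value;  /* clock appears frozen until raw catches up */
	last_value = raw;
	return raw;
}

/* Called on resume: forget the high-water mark from before the suspend. */
void clock_resume(void)
{
	last_value = 0;
}
```

If the backing clock restarts from a small value after a restore and `clock_resume()` is not called, every read keeps returning the old high-water mark until the new clock has caught up with it.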

26 Nov, 2010 2 commits

perf, x86: Fixup Kconfig deps · cc2067a5

Authored by Peter Zijlstra on Nov 16, 2010

This leads to a Kconfig dep inversion, x86 selects PERF_EVENT (due to
a hw_breakpoint dep) but doesn't unconditionally provide
HAVE_PERF_EVENT.

(This can cause build failures on M386/M486 kernel .config's.)
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <20101117222055.982965150@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

x86, perf, nmi: Disable perf if counters are not accessible · 33c6d6a7

Authored by Don Zickus on Nov 22, 2010

In kvm virt guests, the perf counters are not emulated.  Instead they
return zero on a rdmsrl. The perf nmi handler uses the fact that crossing
a zero means the counter overflowed (for those counters that do not have
specific interrupt bits). Therefore on kvm guests, perf will swallow all
NMIs thinking the counters overflowed.

This causes problems for subsystems like kgdb which needs NMIs to do its
magic. This problem was discovered by running kgdb tests.

The solution is to write garbage into a perf counter during
initialization and check that the same number is read back.  On kvm
guests, the value will be read back as zero and we disable perf as
a result.
Reported-by: Jason Wessel <jason.wessel@windriver.com>
Patch-inspired-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Don Zickus <dzickus@redhat.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
LKML-Reference: <1290462923-30734-1-git-send-email-dzickus@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
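
The write-then-read-back probe can be sketched with function pointers standing in for the MSR accessors. Everything here is hypothetical scaffolding for illustration (the typedefs, `counter_is_accessible()`, the test pattern, and the two toy backends); it is not the perf initialization code.

```c
#include <stdint.h>

/* Hypothetical accessors standing in for an MSR write and read. */
typedef void (*counter_write_fn)(uint64_t value);
typedef uint64_t (*counter_read_fn)(void);

/*
 * Write a test pattern and read it back. Returns 1 if the counter kept
 * the value, 0 if it did not (for example, an unemulated counter that
 * always reads as zero).
 */
int counter_is_accessible(counter_write_fn wr, counter_read_fn rd)
{
	const uint64_t pattern = 0xabcdULL;  /* arbitrary non-zero value */

	wr(pattern);
	return rd() == pattern;
}

/* Toy backend that behaves like a working counter register. */
static uint64_t real_reg;
void real_write(uint64_t v) { real_reg = v; }
uint64_t real_read(void) { return real_reg; }

/* Toy backend that ignores writes and always reads as zero. */
void stub_write(uint64_t v) { (void)v; }
uint64_t stub_read(void) { return 0; }
```

Here `counter_is_accessible(real_write, real_read)` reports the counter as usable, while the stub pair makes the probe fail, which is the case where perf would be disabled.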

25 Nov, 2010 3 commits

arch/x86/include/asm/fixmap.h: mark __set_fixmap_offset as __always_inline · 91d95fda

Authored by Andrew Morton on Nov 24, 2010

When compiling arch/x86/kernel/early_printk_mrst.c with i386
allmodconfig, gcc-4.1.0 generates an out-of-line copy of
__set_fixmap_offset() which contains a reference to
__this_fixmap_does_not_exist which the compiler cannot elide.

Marking __set_fixmap_offset() as __always_inline prevents this.

Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Feng Tang <feng.tang@intel.com>
Acked-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

xen: remove duplicated #include · e6d4a76d

Authored by Huang Weiyi on Nov 20, 2010

Remove duplicated #include('s) in
  arch/x86/xen/setup.c
Signed-off-by: Huang Weiyi <weiyi.huang@gmail.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>

xen: x86/32: perform initial startup on initial_page_table · 5b5c1af1

Authored by Ian Campbell on Nov 24, 2010

Only make swapper_pg_dir readonly and pinned when generic x86 architecture code
(which also starts on initial_page_table) switches to it.  This helps ensure
that the generic setup paths work on Xen unmodified. In particular
clone_pgd_range writes directly to the destination pgd entries and is used to
initialise swapper_pg_dir so we need to ensure that it remains writeable until
the last possible moment during bring up.

This is complicated slightly by the need to avoid sharing kernel PMD entries
when running under Xen, therefore the Xen implementation must make a copy of
the kernel PMD (which is otherwise referred to by both initial_page_table and
swapper_pg_dir) before switching to swapper_pg_dir.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Tested-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: H. Peter Anvin <hpa@linux.intel.com>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>

23 Nov, 2010 3 commits

xen: use default_idle · bc15fde7

Authored by Jeremy Fitzhardinge on Nov 22, 2010

We just need the idle loop to drop into safe_halt, which default_idle()
is perfectly capable of doing. There's no need to duplicate it.
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>

xen: clean up "extra" memory handling some more · c2d08791

Authored by Jeremy Fitzhardinge on Nov 22, 2010

Make sure that extra_pages is added for all E820_RAM regions beyond
mem_end - completely excluded regions as well as the remains of partially
included regions.

Also makes sure the extra region is not unnecessarily high, and simplifies
the logic to decide which regions should be added.
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>

xen: set IO permission early (before early_cpu_init()) · ec35a69c

Authored by Konrad Rzeszutek Wilk on Nov 16, 2010

This patch is based off "xen dom0: Set up basic IO permissions for dom0."
by Juan Quintela <quintela@redhat.com>.

On AMD machines when we boot the kernel as Domain 0 we get this nasty:

mapping kernel into physical memory
Xen: setup ISA identity maps
about to get started...
(XEN) traps.c:475:d0 Unhandled general protection fault fault/trap [#13] on VCPU 0 [ec=0000]
(XEN) domain_crash_sync called from entry.S
(XEN) Domain 0 (vcpu#0) crashed on cpu#0:
(XEN) ----[ Xen-4.1-101116  x86_64  debug=y  Not tainted ]----
(XEN) CPU:    0
(XEN) RIP:    e033:[<ffffffff8130271b>]
(XEN) RFLAGS: 0000000000000282   EM: 1   CONTEXT: pv guest
(XEN) rax: 000000008000c068   rbx: ffffffff8186c680   rcx: 0000000000000068
(XEN) rdx: 0000000000000cf8   rsi: 000000000000c000   rdi: 0000000000000000
(XEN) rbp: ffffffff81801e98   rsp: ffffffff81801e50   r8:  ffffffff81801eac
(XEN) r9:  ffffffff81801ea8   r10: ffffffff81801eb4   r11: 00000000ffffffff
(XEN) r12: ffffffff8186c694   r13: ffffffff81801f90   r14: ffffffffffffffff
(XEN) r15: 0000000000000000   cr0: 000000008005003b   cr4: 00000000000006f0
(XEN) cr3: 0000000221803000   cr2: 0000000000000000
(XEN) ds: 0000   es: 0000   fs: 0000   gs: 0000   ss: e02b   cs: e033
(XEN) Guest stack trace from rsp=ffffffff81801e50:

RIP points to read_pci_config() function.

The issue is that we don't set IO permissions for the Linux kernel early enough.

The call sequence used to be:

    xen_start_kernel()
	x86_init.oem.arch_setup = xen_setup_arch;
        setup_arch:
           - early_cpu_init
               - early_init_amd
                  - read_pci_config
           - x86_init.oem.arch_setup [ xen_arch_setup ]
               - set IO permissions.

We need to set the IO permissions earlier on, which this patch does.
Acked-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>

20 Nov, 2010 1 commit

xen: re-enable boot-time ballooning · d2a81713

Authored by Jeremy Fitzhardinge on Nov 19, 2010

Now that the balloon driver doesn't stumble over non-RAM pages, we
can enable the extra space for ballooning.
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>

18 Nov, 2010 5 commits

x86-64: Fix and clean up AMD Fam10 MMCONF enabling · 37db6c8f

Authored by Jan Beulich on Nov 16, 2010

Candidate memory ranges were not calculated properly (start
addresses got needlessly rounded down, and end addresses didn't
get rounded up at all), address comparison for secondary CPUs
was done on only part of the address, and disabled status wasn't
tracked properly.
Signed-off-by: Jan Beulich <jbeulich@novell.com>
Acked-by: Yinghai Lu <yinghai@kernel.org>
Acked-by: Andreas Herrmann <andreas.herrmann3@amd.com>
LKML-Reference: <4CE24DF40200007800022737@vpn.id2.novell.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

x86/kprobes: Prevent kprobes to probe on save_args() · de31ec8a

Authored by Masami Hiramatsu on Nov 18, 2010

Prevent kprobes from probing save_args(), since this function
is called from the breakpoint exception handler. Probing it would
cause an infinite loop in breakpoint handling.
Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: 2nddept-manager@sdl.hitachi.co.jp
Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
LKML-Reference: <20101118101655.2779.2816.stgit@ltc236.sdl.hitachi.co.jp>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

x86: UV: Address interrupt/IO port operation conflict · 8191c9f6

Authored by Dimitri Sivanich on Nov 16, 2010

This patch for SGI UV systems addresses a problem whereby
interrupt transactions being looped back from a local IOH,
through the hub to a local CPU can (erroneously) conflict with
IO port operations and other transactions.

To work around this we set a high bit in the APIC IDs used for
interrupts. This bit appears to be ignored by the sockets, but
it avoids the conflict in the hub.
Signed-off-by: Dimitri Sivanich <sivanich@sgi.com>
LKML-Reference: <20101116222352.GA8155@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
---

 arch/x86/include/asm/uv/uv_hub.h   |    4 ++++
 arch/x86/include/asm/uv/uv_mmrs.h  |   19 ++++++++++++++++++-
 arch/x86/kernel/apic/x2apic_uv_x.c |   25 +++++++++++++++++++++++--
 arch/x86/platform/uv/tlb_uv.c      |    2 +-
 arch/x86/platform/uv/uv_time.c     |    4 +++-
 5 files changed, 49 insertions(+), 5 deletions(-)


x86: Use online node real index in calulate_tbl_offset() · 9223081f

Authored by Yinghai Lu on Nov 13, 2010

Found a NUMA system that doesn't have RAM installed at the first
socket which hangs while executing init scripts.

bisected it to:

 | commit 93296720
 | Author: Shaohua Li <shaohua.li@intel.com>
 | Date:   Wed Oct 20 11:07:03 2010 +0800
 |
 |     x86: Spread tlb flush vector between nodes

It turns out that when the first socket is not online, cpus on node1
could have tlb_offset set to a value bigger than NUM_INVALIDATE_TLB_VECTORS.

That could also affect, e.g., a 4-socket system where socket 2 doesn't
have RAM installed: socket 3 will get a tlb_offset that is too big.

Need to use real online node idx.
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Acked-by: Shaohua Li <shaohua.li@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
LKML-Reference: <4CDEDE59.40603@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
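
The difference between a raw node id and the index among online nodes can be shown with a simplified model. The layout assumed here (each online node gets an equal block of vectors) and all names are assumptions made for illustration; the commit message does not show the real function.

```c
/*
 * Simplified model: the num_vectors vectors are divided evenly between
 * the online nodes. online[i] is non-zero if node i is online.
 * Returns the first vector offset for node, or -1 if node is invalid
 * or offline.
 */
int node_vector_offset(const int *online, int nr_nodes, int node,
		       int num_vectors)
{
	int nr_online = 0, idx = 0, per_node, i;

	if (node < 0 || node >= nr_nodes || !online[node])
		return -1;

	for (i = 0; i < nr_nodes; i++) {
		if (!online[i])
			continue;
		if (i < node)
			idx++;  /* position of node among the online nodes */
		nr_online++;
	}

	per_node = nr_online > num_vectors ? 1 : num_vectors / nr_online;
	return (idx % num_vectors) * per_node;
}
```

In this model, with only node 1 online and 8 vectors, the online index is 0 and the offset is 0. Multiplying the raw node id 1 by the per-node block of 8 would instead give 8, which is outside the valid range 0..7.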

x86, asm: Fix binutils 2.15 build failure · 96e612ff

Authored by Tetsuo Handa on Nov 16, 2010

Add parentheses around one pushl_cfi argument.

Commit df5d1874 "x86: Use {push,pop}{l,q}_cfi in more places"
caused GNU assembler 2.15 (Debian Sarge) to fail. It is still
failing as of commit 07bd8516 "x86, asm: Restore parentheses
around one pushl_cfi argument". This patch solves build failure
with GNU assembler 2.15.
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Acked-by: Jan Beulich <jbeulich@novell.com>
Cc: heukelum@fastmail.fm
Cc: hpa@linux.intel.com
LKML-Reference: <201011160445.oAG4jGif079860@www262.sakura.ne.jp>
Signed-off-by: Ingo Molnar <mingo@elte.hu>