Commits · d7526f271f2111684211fc7d27814e86a36336c9 · openeuler / Kernel

05 May, 2010 1 commit

Fix the x86_64 implementation of call_rwsem_wait() · a66f6375

Committed by David Howells on May 04, 2010

The x86_64 call_rwsem_wait() treats the active state counter part of the
R/W semaphore state as being 16-bit when it's actually 32-bit (it's half
of the 64-bit state). It should do "decl %edx" not "decw %dx".

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
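
The width mismatch can be illustrated in plain C. This is a minimal sketch, not the kernel's assembly: it assumes the decrement is used to test whether the active count has reached zero, and the function names and sample values are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Models "decw %dx": only bits 0..15 take part in the decrement, so the
 * zero test ignores bits 16..31 of the active count. */
static bool zero_after_dec16(uint64_t state)
{
    return (uint16_t)((uint16_t)state - 1) == 0;
}

/* Models "decl %edx": the full 32-bit active count (the low half of the
 * 64-bit state) takes part in the decrement. */
static bool zero_after_dec32(uint64_t state)
{
    return (uint32_t)((uint32_t)state - 1) == 0;
}
```

With an active count of 0x10001, the 16-bit variant reports zero after the decrement although the 32-bit count is still 0x10000.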

a66f6375

03 May, 2010 2 commits

powernow-k8: Fix frequency reporting · b810e94c

Committed by Mark Langsdorf on Mar 31, 2010

With F10, model 10, all valid frequencies are in the ACPI _PST table.

Cc: <stable@kernel.org> # 33.x 32.x
Signed-off-by: Mark Langsdorf <mark.langsdorf@amd.com>
LKML-Reference: <1270065406-1814-6-git-send-email-bp@amd64.org>
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
Reviewed-by: Thomas Renninger <trenn@suse.de>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

b810e94c

x86: Fix parse_reservetop() build failure on certain configs · 56f0e74c

Committed by Ingo Molnar on May 03, 2010

Commit e67a807f ("x86: Fix 'reservetop=' functionality") added a
fixup_early_ioremap() call to parse_reservetop() and declared it
in io.h.

But asm/io.h was only included indirectly - and on some configs
not at all, causing a build failure on those configs.

Cc: Liang Li <liang.li@windriver.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: Wang Chen <wangchen@cn.fujitsu.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
LKML-Reference: <1272621711-8683-1-git-send-email-liang.li@windriver.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

56f0e74c

01 May, 2010 1 commit

x86: Fix NULL pointer access in irq_force_complete_move() for Xen guests · bbd391a1

Committed by Prarit Bhargava on Apr 27, 2010

Upstream PV guests fail to boot because of a NULL pointer in
irq_force_complete_move().  It is possible that xen guests have
irq_desc->chip_data = NULL.

Test for NULL chip_data pointer before attempting to complete an irq move.

Signed-off-by: Prarit Bhargava <prarit@redhat.com>
LKML-Reference: <20100427152434.16193.49104.sendpatchset@prarit.bos.redhat.com>
Acked-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Cc: <stable@kernel.org> [2.6.33]

bbd391a1

30 Apr, 2010 1 commit

x86: Fix 'reservetop=' functionality · e67a807f

Committed by Liang Li on Apr 30, 2010

When the 'reservetop=0xbadc0de' kernel parameter is specified,
the kernel stops booting due to an early_ioremap bug that
relates to commit 8827247f.

The root cause of the boot failure is that the value of
'slot_virt[i]' is initialized in setup_arch->early_ioremap_init(),
but later in setup_arch the function 'parse_early_param' modifies
'FIXADDR_TOP' when 'reservetop=0xbadc0de' is specified.

The simplest fix might be to use __fix_to_virt(idx0) to get the updated
value of 'FIXADDR_TOP' in '__early_ioremap' instead of referencing the
old value from slot_virt[slot] directly.

Changelog since v0:

-v1: When reservetop is handled, FIXADDR_TOP gets
     adjusted. Hence check prev_map, then re-initialize slot_virt and
     PMD based on the new FIXADDR_TOP.

-v2: Place fixup_early_ioremap, and hence the call to early_ioremap_init,
     in reserve_top_address to re-initialize slot_virt and the
     corresponding PMD in parse_reservetop.

-v3: Move fixup_early_ioremap out of reserve_top_address to make
     sure other clients of reserve_top_address like xen/lguest won't
     break.

Signed-off-by: Liang Li <liang.li@windriver.com>
Tested-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Acked-by: Yinghai Lu <yinghai@kernel.org>
Acked-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: Wang Chen <wangchen@cn.fujitsu.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
LKML-Reference: <1272621711-8683-1-git-send-email-liang.li@windriver.com>
[ fixed three small cleanliness details in fixup_early_ioremap() ]
Signed-off-by: Ingo Molnar <mingo@elte.hu>

e67a807f

29 Apr, 2010 1 commit

x86/PCI: compute Address Space length rather than using _LEN · 48728e07

Committed by Bjorn Helgaas on Apr 27, 2010

ACPI _CRS Address Space Descriptors have _MIN, _MAX, and _LEN. Linux has
been computing Address Spaces as [_MIN to _MIN + _LEN - 1]. Based on the
tests in the bug reports below, Windows apparently uses [_MIN to _MAX].

Per spec (ACPI 4.0, Table 6-40), for _CRS fixed-size, fixed location
descriptors, "_LEN must be (_MAX - _MIN + 1)", and when that's true, it
doesn't matter which way we compute the end. But of course, there are
BIOSes that don't follow this rule, and we're better off if Linux handles
those exceptions the same way as Windows.

This patch makes Linux use [_MIN to _MAX], as Windows seems to do. This
effectively reverts d558b483 and 03db42ad and replaces them with
simpler code.

https://bugzilla.kernel.org/show_bug.cgi?id=14337 (round)
https://bugzilla.kernel.org/show_bug.cgi?id=15480 (truncate)

Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
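
The two ways of computing the end of an address space can be compared with a small sketch. The struct and function names below are hypothetical stand-ins, not the kernel's ACPI resource types, and the sample values are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the three fields of an ACPI _CRS
 * Address Space Descriptor. */
struct addr_space {
    uint64_t min;  /* _MIN */
    uint64_t max;  /* _MAX */
    uint64_t len;  /* _LEN */
};

/* Previous behaviour described above: end = _MIN + _LEN - 1. */
static uint64_t end_from_len(const struct addr_space *a)
{
    return a->min + a->len - 1;
}

/* Behaviour after this patch: end = _MAX. */
static uint64_t end_from_max(const struct addr_space *a)
{
    return a->max;
}
```

When _LEN equals _MAX - _MIN + 1 both functions agree; they differ only for descriptors that break that rule, which is the case the patch changes.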

48728e07

27 Apr, 2010 1 commit

x86/PCI: never allocate PCI MMIO resources below BIOS_END · 55051feb

Committed by Bjorn Helgaas on Apr 23, 2010

When we move a PCI device or assign resources to a device not configured
by the BIOS, we want to avoid the BIOS region below 1MB. Note that if the
BIOS places devices below 1MB, we leave them there.

See https://bugzilla.kernel.org/show_bug.cgi?id=15744
and https://bugzilla.kernel.org/show_bug.cgi?id=15841

Tested-by: Andy Isaacson <adi@hexapodia.org>
Tested-by: Andy Bailey <bailey@akamai.com>
Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>

55051feb

25 Apr, 2010 1 commit

VMware Balloon driver · 453dc659

Committed by Dmitry Torokhov on Apr 23, 2010

This is a standalone version of the VMware Balloon driver.  Ballooning is a
technique that allows the hypervisor to dynamically limit the amount of
memory available to the guest (with guest cooperation).  In the overcommit
scenario, when the hypervisor detects that it needs to shuffle some
memory, it instructs the driver to allocate a certain number of pages, and
the underlying memory gets returned to the hypervisor.  Later the hypervisor
may return memory to the guest by reattaching memory to the pageframes and
instructing the driver to "deflate" the balloon.

We are submitting a standalone driver because KVM maintainer (Avi Kivity)
expressed opinion (rightly) that our transport does not fit well into
virtqueue paradigm and thus it does not make much sense to integrate with
virtio.

There were also some concerns whether current ballooning technique is the
right thing.  If there appears a better framework to achieve this we are
prepared to evaluate and switch to using it, but in the meantime we'd like
to get this driver upstream.

We want to get the driver accepted in distributions so that users do not
have to deal with an out-of-tree module and many distributions have
"upstream first" requirement.

The driver has been shipping for a number of years and users running on
VMware platform will have it installed as part of VMware Tools even if it
will not come from a distribution, thus there should not be additional
risk in pulling the driver into mainline.  The driver will only activate
if host is VMware so everyone else should not be affected at all.

Signed-off-by: Dmitry Torokhov <dtor@vmware.com>
Cc: Avi Kivity <avi@redhat.com>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

453dc659

24 Apr, 2010 2 commits

x86: Disable large pages on CPUs with Atom erratum AAE44 · 7a0fc404

Committed by H. Peter Anvin on Apr 13, 2010

Atom erratum AAE44/AAF40/AAG38/AAH41:

"If software clears the PS (page size) bit in a present PDE (page
directory entry), that will cause linear addresses mapped through this
PDE to use 4-KByte pages instead of using a large page after old TLB
entries are invalidated. Due to this erratum, if a code fetch uses
this PDE before the TLB entry for the large page is invalidated then
it may fetch from a different physical address than specified by
either the old large page translation or the new 4-KByte page
translation. This erratum may also cause speculative code fetches from
incorrect addresses."

[http://download.intel.com/design/processor/specupdt/319536.pdf]

Whereas commit 211b3d03 seems to
work around erratum AAH41 (mixed 4K TLBs), it only reduces the window of
opportunity for the bug to occur and does not totally remove it.  This
patch disables mixed 4K/4MB page tables entirely, avoiding the page
splitting and not tripping this processor issue.

This is based on an original patch by Colin King.

Originally-by: Colin Ian King <colin.king@canonical.com>
Cc: Colin Ian King <colin.king@canonical.com>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
LKML-Reference: <1269271251-19775-1-git-send-email-colin.king@canonical.com>
Cc: <stable@kernel.org>

7a0fc404

x86-64: Clear a 64-bit FS/GS base on fork if selector is nonzero · 7ce5a2b9

Committed by H. Peter Anvin on Apr 23, 2010

When we do a thread switch, we clear the outgoing FS/GS base if the
corresponding selector is nonzero.  This is taken by __switch_to() as
an entry invariant; it does not verify that it is true on entry.
However, copy_thread() doesn't enforce this constraint, which can
result in inconsistent results after fork().

Make copy_thread() match the behavior of __switch_to().

Reported-and-tested-by: Samuel Thibault <samuel.thibault@inria.fr>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
LKML-Reference: <4BD1E061.8030605@zytor.com>
Cc: <stable@kernel.org>

7ce5a2b9

23 Apr, 2010 1 commit

x86/PCI: parse additional host bridge window resource types · 66528fdd

Committed by Bjorn Helgaas on Apr 20, 2010

This adds support for Memory24, Memory32, and Memory32Fixed descriptors in
PCI host bridge _CRS.

I experimentally determined that Windows (2008 R2) accepts these descriptors
and treats them as windows that are forwarded to the PCI bus, e.g., if
it finds any PCI devices with BARs outside the windows, it moves them into
the windows.

I don't know whether any machines actually use these descriptors in PCI
host bridge _CRS methods, but if any exist and they're new enough that we
automatically turn on "pci=use_crs", they will work with Windows but not
with Linux.

Here are the details: https://bugzilla.kernel.org/show_bug.cgi?id=15817

Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>

66528fdd

21 Apr, 2010 3 commits

KVM: x86: Fix TSS size check for 16-bit tasks · e8861cfe

Committed by Jan Kiszka on Apr 14, 2010

A 16-bit TSS is only 44 bytes long. So make sure to test for the correct
size on task switch.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
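
The 44-byte figure follows from the 16-bit TSS layout, which consists of 22 two-byte fields. The sketch below spells that out; the struct and the helper min_tss_limit() are hypothetical illustrations rather than KVM's own definitions, and the 104-byte size of a 32-bit TSS is the architectural minimum.

```c
#include <assert.h>
#include <stdint.h>

/* Layout of a 16-bit TSS: 22 fields of two bytes each. */
struct tss16 {
    uint16_t prev_task_link;
    uint16_t sp0, ss0, sp1, ss1, sp2, ss2;
    uint16_t ip, flags;
    uint16_t ax, cx, dx, bx, sp, bp, si, di;
    uint16_t es, cs, ss, ds;
    uint16_t ldt;
};

/* Hypothetical helper: smallest acceptable segment limit (size - 1) for
 * a TSS descriptor. is_32bit selects the 104-byte 32-bit TSS. */
static uint32_t min_tss_limit(int is_32bit)
{
    return is_32bit ? 104u - 1 : (uint32_t)sizeof(struct tss16) - 1;
}
```

A size check that applies the 32-bit minimum (limit 0x67) to every task would wrongly reject a valid 16-bit TSS, whose minimum limit is 0x2b.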

e8861cfe

x86, mrst: Conditionally register cpu hotplug notifier for apbt · ae7c9b70

Committed by Jacob Pan on Apr 19, 2010

APB timer is used on Moorestown platforms but not on a standard PC.
If APB timer code is compiled in but not initialized at run-time due
to lack of FW reported SFI table, kernel would panic when the non-boot
CPUs are offlined and notifier is called.

https://bugzilla.kernel.org/show_bug.cgi?id=15786

This patch ensures CPU hotplug notifier for APB timer is only registered
when the APBT timer block is initialized.

Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
LKML-Reference: <1271701423-1162-1-git-send-email-jacob.jun.pan@linux.intel.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>

ae7c9b70

x86: correctly wire up the newuname system call · 4cecd935

Committed by Christoph Hellwig on Apr 20, 2010

Before commit e28cbf22 ("improve
sys_newuname() for compat architectures") 64-bit x86 had a private
implementation of sys_uname which was just called sys_uname, which other
architectures used for the old uname.

Due to some merge issues with the uname refactoring patches we ended up
calling the old uname version for both the old and new system call
slots, which led to the domainname field never being set, which caused
failures with libnss_nis.

Reported-and-tested-by: Andy Isaacson <adi@hexapodia.org>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

4cecd935

20 Apr, 2010 7 commits

KVM: fix the handling of dirty bitmaps to avoid overflows · 87bf6e7d

Committed by Takuya Yoshikawa on Apr 12, 2010

Int is not long enough to store the size of a dirty bitmap.

This patch fixes this problem with the introduction of a wrapper
function to calculate the sizes of dirty bitmaps.

Note: in mark_page_dirty(), we have to consider the fact that
  __set_bit() takes the offset as int, not long.

Signed-off-by: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
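
A size wrapper of the kind described above might look like the following sketch. The name dirty_bitmap_bytes() and its exact rounding are assumptions made for illustration, not the function added by the patch; the point is only that the whole computation stays in unsigned long.

```c
#include <assert.h>
#include <limits.h>

#define BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/* Hypothetical wrapper: bytes needed for one dirty bit per page, rounded
 * up to a whole number of longs. Every intermediate value is an
 * unsigned long, so the result is never narrowed to int. */
static unsigned long dirty_bitmap_bytes(unsigned long npages)
{
    unsigned long longs = npages / BITS_PER_LONG
                        + (npages % BITS_PER_LONG != 0);

    return longs * sizeof(unsigned long);
}
```

Callers would then keep the returned value in an unsigned long as well, rather than assigning it to an int.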

87bf6e7d

KVM: MMU: fix kvm_mmu_zap_page() and its calling path · 77662e00

Committed by Xiao Guangrong on Apr 16, 2010

This patch fixes:

- calculate the zapped page number properly in mmu_zap_unsync_children()
- calculate the freed page number properly in kvm_mmu_change_mmu_pages()
- if a child page is zapped, hlist walking should restart

KVM-Stable-Tag.

Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

77662e00

KVM: VMX: Save/restore rflags.vm correctly in real mode · 78ac8b47

Committed by Avi Kivity on Apr 08, 2010

Currently we set eflags.vm unconditionally when entering real mode emulation
through virtual-8086 mode, and clear it unconditionally when we enter protected
mode.  This means that the following sequence

  KVM_SET_REGS  (rflags.vm=1)
  KVM_SET_SREGS (cr0.pe=1)

Ends up with rflags.vm clear due to KVM_SET_SREGS triggering enter_pmode().

Fix by shadowing rflags.vm (and rflags.iopl) correctly while in real mode:
reads and writes to those bits access a shadow register instead of the actual
register.

Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

78ac8b47

KVM: allow bit 10 to be cleared in MSR_IA32_MC4_CTL · 114be429

Committed by Andre Przywara on Mar 24, 2010

There is a quirk for AMD K8 CPUs in many Linux kernels (see
arch/x86/kernel/cpu/mcheck/mce.c:__mcheck_cpu_apply_quirks()) that
clears bit 10 in that MCE related MSR. KVM can only cope with all
zeros or all ones, so it will inject a #GP into the guest, which
will let it panic.
So let's add a quirk to the quirk and ignore this single cleared bit.
This fixes -cpu kvm64 on all machines and -cpu host on K8 machines
with some guest Linux kernels.

Signed-off-by: Andre Przywara <andre.przywara@amd.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
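
The acceptance rule described above can be written as a single predicate. This is a sketch under the assumption that "all zeros or all ones" refers to the full 64-bit MSR value; mc_ctl_value_ok() is a hypothetical name, not a KVM function.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Accept all zeros or all ones, and additionally tolerate bit 10 being
 * the only cleared bit (the K8 quirk described above). */
static bool mc_ctl_value_ok(uint64_t data)
{
    const uint64_t quirk_bit = UINT64_C(1) << 10;

    return data == 0 || (data | quirk_bit) == ~UINT64_C(0);
}
```

Forcing bit 10 on before comparing against all ones keeps the check to one comparison and still rejects values with any other bit cleared.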

114be429

KVM: Don't spam kernel log when injecting exceptions due to bad cr writes · d6a23895

Committed by Avi Kivity on Mar 11, 2010

These are guest-triggerable.

Signed-off-by: Avi Kivity <avi@redhat.com>

d6a23895

KVM: SVM: Fix memory leaks that happen when svm_create_vcpu() fails · b7af4043

Committed by Takuya Yoshikawa on Mar 09, 2010

svm_create_vcpu() does not free the pages allocated during the creation
when it fails to complete the allocations. This patch fixes it.

Signed-off-by: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp>
Signed-off-by: Avi Kivity <avi@redhat.com>

b7af4043

KVM: take srcu lock before call to complete_pio() · 7567cae1

Committed by Gleb Natapov on Mar 09, 2010

complete_pio() may use slot table which is protected by srcu.

Signed-off-by: Gleb Natapov <gleb@redhat.com>
Cc: stable@kernel.org
Signed-off-by: Avi Kivity <avi@redhat.com>

7567cae1

14 Apr, 2010 1 commit

lguest: stop using KVM hypercall mechanism · 091ebf07

Committed by Rusty Russell on Apr 14, 2010

This is a partial revert of 4cd8b5e2 "lguest: use KVM hypercalls";
we revert to using (just as questionable but more reliable) int $15 for
hypercalls.  I didn't revert the register mapping, so we still use the
same calling convention as kvm.

KVM in more recent incarnations stopped injecting a fault when a guest
tried to use the VMCALL instruction from ring 1, so lguest under kvm
fails to make hypercalls.  It was nice to share code with our KVM
cousins, but this was overreach.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Cc: Matias Zabaljauregui <zabaljauregui@gmail.com>
Cc: Avi Kivity <avi@redhat.com>

091ebf07

09 Apr, 2010 2 commits

perf: Fix unsafe frame rewinding with hot regs fetching · ab285f2b

Committed by Frederic Weisbecker on Apr 08, 2010

When we fetch the hot regs and rewind to the nth caller, it
might happen that we dereference a frame pointer outside the
kernel stack boundaries, like in this example:

        perf_trace_sched_switch+0xd5/0x120
        schedule+0x6b5/0x860
        retint_careful+0xd/0x21

Since we directly dereference a userspace frame pointer here while
rewinding behind retint_careful, this may end up in a crash.

Fix this by simply using probe_kernel_address() when we rewind the
frame pointer.

This issue will have a much more proper fix in the next version of the
perf_arch_fetch_caller_regs() API that will only need to rewind to the
first caller.

Reported-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Tested-by: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: David Miller <davem@davemloft.net>
Cc: Archs <linux-arch@vger.kernel.org>

ab285f2b

x86/PCI: ignore Consumer/Producer bit in ACPI window descriptions · 73a0e614

Committed by Bjorn Helgaas on Apr 06, 2010

ACPI Address Space Descriptors (used in _CRS) have a Consumer/Producer
bit that is supposed to distinguish regions that are consumed directly
by a device from those that are forwarded ("produced") by a bridge.
But BIOSes have apparently not used this consistently, and Windows
seems to ignore it, so I think Linux should ignore it as well.

I can't point to any of these supposed broken BIOSes, but since we
now rely on _CRS by default, I think it's safer to ignore this bit
from the start.

Here are details of my experiments with how Windows handles it:
https://bugzilla.kernel.org/show_bug.cgi?id=15701

Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>

73a0e614

07 Apr, 2010 5 commits

x86/gart: Disable GART explicitly before initialization · 4b83873d

Committed by Joerg Roedel on Apr 07, 2010

If we boot into a crash-kernel the gart might still be
enabled and its caches might be dirty. This can result in
undefined behavior later. Fix it by explicitly disabling the
gart hardware before initialization and flushing the caches
after enablement.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>

4b83873d

x86/amd-iommu: use for_each_pci_dev · d18c69d3

Committed by Chris Wright on Apr 02, 2010

Replace open coded version with for_each_pci_dev

Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>

d18c69d3

Revert "x86: disable IOMMUs on kernel crash" · 8f9f55e8

Committed by Chris Wright on Apr 02, 2010

This effectively reverts commit 61d047be.

Disabling the IOMMU can potentially allow DMA transactions to
complete without being translated.  Leave it enabled, and allow
crash kernel to do the IOMMU reinitialization properly.

Cc: stable@kernel.org
Cc: Joerg Roedel <joerg.roedel@amd.com>
Cc: Eric Biederman <ebiederm@xmission.com>
Cc: Neil Horman <nhorman@tuxdriver.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>

8f9f55e8

x86/amd-iommu: warn when issuing command to uninitialized cmd buffer · 549c90dc

Committed by Chris Wright on Apr 02, 2010

To catch future potential issues we can add a warning whenever we issue
a command before the command buffer is fully initialized.

Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>

549c90dc

x86/amd-iommu: enable iommu before attaching devices · 75f66533

Committed by Chris Wright on Apr 02, 2010

Hit another kdump problem as reported by Neil Horman.  When initializing
the IOMMU, we attach devices to their domains before the IOMMU is
fully (re)initialized.  Attaching a device will issue some important
invalidations.  In the context of the newly kexec'd kdump kernel, the
IOMMU may have stale cached data from the original kernel.  Because we
do the attach too early, the invalidation commands are placed in the new
command buffer before the IOMMU is updated w/ that buffer.  This leaves
the stale entries in the kdump context and can render the device unusable.
Simply enable the IOMMU before we do the attach.

Cc: stable@kernel.org
Cc: Neil Horman <nhorman@tuxdriver.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>

75f66533

06 Apr, 2010 1 commit

perf, x86: Enable Nehalem-EX support · 134fbadf

Committed by Vince Weaver on Apr 06, 2010

According to Intel Software Devel Manual Volume 3B, the
Nehalem-EX PMU is just like regular Nehalem (except for the
uncore support, which is completely different).

Signed-off-by: Vince Weaver <vweaver1@eecs.utk.edu>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: Lin Ming <ming.m.lin@intel.com>
LKML-Reference: <alpine.DEB.2.00.1004060956580.1417@cl320.eecs.utk.edu>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

134fbadf

03 Apr, 2010 5 commits

x86: Fix double enable_IR_x2apic() call on SMP kernel on !SMP boards · 472a474c

Committed by Suresh Siddha on Mar 31, 2010

Jan Grossmann reported kernel boot panic while booting SMP
kernel on his system with a single core cpu. SMP kernels call
enable_IR_x2apic() from native_smp_prepare_cpus() and on
platforms where the kernel doesn't find SMP configuration we
ended up again calling enable_IR_x2apic() from the
APIC_init_uniprocessor() call in the smp_sanity_check(). Thus
leading to kernel panic.

Don't call enable_IR_x2apic() and default_setup_apic_routing()
from APIC_init_uniprocessor() in CONFIG_SMP case.

NOTE: this kind of non-idempotent and asymmetric initialization
sequence is rather fragile and unclean, we'll clean that up
in v2.6.35. This is the minimal fix for v2.6.34.

Reported-by: Jan.Grossmann@kielnet.net
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: <jbarnes@virtuousgeek.org>
Cc: <david.woodhouse@intel.com>
Cc: <weidong.han@intel.com>
Cc: <youquan.song@intel.com>
Cc: <Jan.Grossmann@kielnet.net>
Cc: <stable@kernel.org> # [v2.6.32.x, v2.6.33.x]
LKML-Reference: <1270083887.7835.78.camel@sbs-t61.sc.intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

472a474c

perf, x86: Fix callgraphs of 32-bit processes on 64-bit kernels · 257ef9d2

Committed by Torok Edwin on Mar 17, 2010

When profiling a 32-bit process on a 64-bit kernel, callgraph tracing
stopped after the first function, because it has seen a garbage memory
address (tried to interpret the frame pointer, and return address as a
64-bit pointer).

Fix this by using a struct stack_frame with 32-bit pointers when the
TIF_IA32 flag is set.

Note that TIF_IA32 flag must be used, and not is_compat_task(), because
the latter is only set when the 32-bit process is executing a syscall,
which may not always be the case (when tracing page fault events for
example).

Signed-off-by: Török Edwin <edwintorok@gmail.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: x86@kernel.org
Cc: linux-kernel@vger.kernel.org
LKML-Reference: <1268820436-13145-1-git-send-email-edwintorok@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
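
The effect of reading a 32-bit frame record with 64-bit pointers can be reproduced in user space. The sketch below is an illustration only: the struct names, the is_ia32 parameter (standing in for the TIF_IA32 test) and the sample addresses are assumptions, not the perf code itself.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Frame record as laid out by a 32-bit process: two 4-byte fields. */
struct stack_frame_ia32 {
    uint32_t next_frame;
    uint32_t return_address;
};

/* Frame record as laid out by a 64-bit process: two 8-byte fields. */
struct stack_frame_64 {
    uint64_t next_frame;
    uint64_t return_address;
};

/* Read the return address of the frame stored at buf, using the layout
 * that matches the traced task. buf must hold at least 16 bytes. */
static uint64_t frame_return_address(const unsigned char *buf, int is_ia32)
{
    if (is_ia32) {
        struct stack_frame_ia32 f;

        memcpy(&f, buf, sizeof(f));
        return f.return_address;
    } else {
        struct stack_frame_64 f;

        memcpy(&f, buf, sizeof(f));
        return f.return_address;
    }
}
```

Reading the same bytes with the 64-bit layout picks up data from the neighbouring frame record, which is the garbage address that stopped the callgraph walk.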

257ef9d2

perf, x86: Fix AMD hotplug & constraint initialization · b38b24ea

Committed by Peter Zijlstra on Mar 23, 2010

Commit 3f6da390 ("perf: Rework and fix the arch CPU-hotplug hooks") moved
the amd northbridge allocation from CPUS_ONLINE to CPUS_PREPARE_UP
however amd_nb_id() doesn't work yet on prepare so it would simply bail
basically reverting to a state where we do not properly track node wide
constraints - causing weird perf results.

Fix up the AMD NorthBridge initialization code by allocating from
CPU_UP_PREPARE and installing it from CPU_STARTING once we have the
proper nb_id. It also properly deals with the allocation failing.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
[ robustify using amd_has_nb() ]
Signed-off-by: Stephane Eranian <eranian@google.com>
LKML-Reference: <1269353485.5109.48.camel@twins>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

b38b24ea

x86: Move notify_cpu_starting() callback to a later stage · 85257024

Committed by Peter Zijlstra on Mar 23, 2010

Because we need to have cpu identification things done by the time we run
CPU_STARTING notifiers.

( This init ordering will be relied on by the next fix. )

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <1269353485.5109.48.camel@twins>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

85257024

x86: Increase CONFIG_NODES_SHIFT max to 10 · 51591e31

Committed by David Rientjes on Mar 25, 2010

Some larger systems require more than 512 nodes, so increase the
maximum CONFIG_NODES_SHIFT to 10 for a new max of 1024 nodes.

This was tested with numa=fake=64M on systems with more than
64GB of RAM. A total of 1022 nodes were initialized.

Successfully builds with no additional warnings on x86_64
allyesconfig.

( No effect on any existing config. Newly enabled CONFIG_MAXSMP=y
  will see the new default. )

Signed-off-by: David Rientjes <rientjes@google.com>
LKML-Reference: <alpine.DEB.2.00.1003251538060.8589@chino.kir.corp.google.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

51591e31

02 Apr, 2010 4 commits

ibft, x86: Change reserve_ibft_region() to find_ibft_region() · 042be38e

Committed by Yinghai Lu on Apr 01, 2010

This allows arch code to decide the way to reserve the ibft.

And we should reserve ibft as early as possible, instead of BOOTMEM
stage, in case the table is in RAM range and is not reserved by BIOS
(this will often be the case.)

Move to just after find_smp_config().

Also when CONFIG_NO_BOOTMEM=y, We will not have reserve_bootmem() anymore.

-v2: fix typo about ibft pointed out by Konrad Rzeszutek Wilk <konrad@darnok.org>

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
LKML-Reference: <4BB510FB.80601@kernel.org>
Cc: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: Peter Jones <pjones@redhat.com>
Cc: Konrad Rzeszutek Wilk <konrad@kernel.org>
CC: Jan Beulich <jbeulich@novell.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>

042be38e

x86, hpet: Fix bug in RTC emulation · b4a5e8a1

Committed by Alok Kataria on Mar 11, 2010

We think there exists a bug in the HPET code that emulates the RTC.

In the normal case, when the RTC frequency is set, the rtc driver tells
the hpet code about it here:

int hpet_set_periodic_freq(unsigned long freq)
{
        uint64_t clc;

        if (!is_hpet_enabled())
                return 0;

        if (freq <= DEFAULT_RTC_INT_FREQ)
                hpet_pie_limit = DEFAULT_RTC_INT_FREQ / freq;
        else {
                clc = (uint64_t) hpet_clockevent.mult * NSEC_PER_SEC;
                do_div(clc, freq);
                clc >>= hpet_clockevent.shift;
                hpet_pie_delta = (unsigned long) clc;
        }
        return 1;
}

If freq is set to 64Hz (DEFAULT_RTC_INT_FREQ) or lower, then
hpet_pie_limit (a static) is set to non-zero.  Then, on every one-shot
HPET interrupt, hpet_rtc_timer_reinit is called to compute the next
timeout.  Well, that function has this logic:

        if (!(hpet_rtc_flags & RTC_PIE) || hpet_pie_limit)
                delta = hpet_default_delta;
        else
                delta = hpet_pie_delta;

Since hpet_pie_limit is not 0, hpet_default_delta is used.  That
corresponds to 64Hz.

Now, if you set a different rtc frequency, you'll take the else path
through hpet_set_periodic_freq, but unfortunately no one resets
hpet_pie_limit back to 0.

Boom....now you are stuck with 64Hz RTC interrupts forever.

The patch below just resets the hpet_pie_limit value when requested freq
is greater than DEFAULT_RTC_INT_FREQ, which we think fixes this problem.

Signed-off-by: Alok N Kataria <akataria@vmware.com>
LKML-Reference: <201003112200.o2BM0Hre012875@imap1.linux-foundation.org>
Signed-off-by: Daniel Hecht <dhecht@vmware.com>
Cc: Venkatesh Pallipadi <venkatesh.pallipadi@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
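
The stuck state and the effect of the reset can be reproduced with a simplified user-space model. This is a sketch, not the kernel code: the with_fix parameter and uses_default_delta() are hypothetical, the period is kept in nanoseconds rather than HPET clock cycles, and RTC_PIE is assumed to be set.

```c
#include <assert.h>

#define DEFAULT_RTC_INT_FREQ 64UL

static unsigned long hpet_pie_limit;  /* non-zero: run at the default rate */
static unsigned long hpet_pie_delta;  /* period used for faster rates */

/* Simplified model of hpet_set_periodic_freq(). freq must be non-zero. */
static void set_periodic_freq(unsigned long freq, int with_fix)
{
    if (freq <= DEFAULT_RTC_INT_FREQ) {
        hpet_pie_limit = DEFAULT_RTC_INT_FREQ / freq;
    } else {
        hpet_pie_delta = 1000000000UL / freq;
        if (with_fix)
            hpet_pie_limit = 0;  /* the reset described in the patch */
    }
}

/* Mirrors the delta selection in hpet_rtc_timer_reinit() with RTC_PIE
 * set: returns 1 when the default 64Hz delta would be chosen. */
static int uses_default_delta(void)
{
    return hpet_pie_limit != 0;
}
```

Setting 64Hz and then 1024Hz without the reset leaves uses_default_delta() returning 1; with the reset it returns 0 and the 1024Hz period is used.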

b4a5e8a1

x86, hpet: Erratum workaround for read after write of HPET comparator · 8da854cb

Committed by Pallipadi, Venkatesh on Feb 25, 2010

On Wed, Feb 24, 2010 at 03:37:04PM -0800, Justin Piszcz wrote:
> Hello,
>
> Again, on the Intel DP55KG board:
>
> # uname -a
> Linux host 2.6.33 #1 SMP Wed Feb 24 18:31:00 EST 2010 x86_64 GNU/Linux
>
> [    1.237600] ------------[ cut here ]------------
> [    1.237890] WARNING: at arch/x86/kernel/hpet.c:404 hpet_next_event+0x70/0x80()
> [    1.238221] Hardware name:
> [    1.238504] hpet: compare register read back failed.
> [    1.238793] Modules linked in:
> [    1.239315] Pid: 0, comm: swapper Not tainted 2.6.33 #1
> [    1.239605] Call Trace:
> [    1.239886]  <IRQ>  [<ffffffff81056c13>] ? warn_slowpath_common+0x73/0xb0
> [    1.240409]  [<ffffffff81079608>] ? tick_dev_program_event+0x38/0xc0
> [    1.240699]  [<ffffffff81056cb0>] ? warn_slowpath_fmt+0x40/0x50
> [    1.240992]  [<ffffffff81079608>] ? tick_dev_program_event+0x38/0xc0
> [    1.241281]  [<ffffffff81041ad0>] ? hpet_next_event+0x70/0x80
> [    1.241573]  [<ffffffff81079608>] ? tick_dev_program_event+0x38/0xc0
> [    1.241859]  [<ffffffff81078e32>] ? tick_handle_oneshot_broadcast+0xe2/0x100
> [    1.246533]  [<ffffffff8102a67a>] ? timer_interrupt+0x1a/0x30
> [    1.246826]  [<ffffffff81085499>] ? handle_IRQ_event+0x39/0xd0
> [    1.247118]  [<ffffffff81087368>] ? handle_edge_irq+0xb8/0x160
> [    1.247407]  [<ffffffff81029f55>] ? handle_irq+0x15/0x20
> [    1.247689]  [<ffffffff810294a2>] ? do_IRQ+0x62/0xe0
> [    1.247976]  [<ffffffff8146be53>] ? ret_from_intr+0x0/0xa
> [    1.248262]  <EOI>  [<ffffffff8102f277>] ? mwait_idle+0x57/0x80
> [    1.248796]  [<ffffffff8102645c>] ? cpu_idle+0x5c/0xb0
> [    1.249080] ---[ end trace db7f668fb6fef4e1 ]---
>
> Is this something Intel has to fix or is it a bug in the kernel?

This is a chipset erratum.

Thomas: You mentioned we can retain this check only for known-buggy and
hpet debug kind of options. But here is the simple workaround patch for
this particular erratum.

Some chipsets have an erratum due to which a read immediately following a
write of the HPET comparator returns the old comparator value instead of
the most recently written value.

Erratum 15 in
"Intel I/O Controller Hub 9 (ICH9) Family Specification Update"
(http://www.intel.com/assets/pdf/specupdate/316973.pdf)

The workaround for the erratum is to read the comparator twice if the
first read fails.

Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
LKML-Reference: <20100225185348.GA9674@linux-os.sc.intel.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Cc: Venkatesh Pallipadi <venkatesh.pallipadi@gmail.com>
Cc: <stable@kernel.org>
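
The read-twice workaround can be sketched against a simulated comparator. Everything below is hypothetical (struct fake_hpet, cmp_write(), cmp_read(), program_comparator()); it models the erratum as a single stale read after each write and is not the hpet.c code.

```c
#include <assert.h>
#include <stdint.h>

/* Simulated comparator with the erratum: the first read after a write
 * returns the previous value. */
struct fake_hpet {
    uint32_t current;
    uint32_t previous;
    int stale_reads;   /* reads that still return the previous value */
};

static void cmp_write(struct fake_hpet *h, uint32_t v)
{
    h->previous = h->current;
    h->current = v;
    h->stale_reads = 1;
}

static uint32_t cmp_read(struct fake_hpet *h)
{
    if (h->stale_reads > 0) {
        h->stale_reads--;
        return h->previous;
    }
    return h->current;
}

/* Write the comparator and verify it, reading a second time only if the
 * first read-back does not match. Returns 0 on success, -1 on failure. */
static int program_comparator(struct fake_hpet *h, uint32_t v)
{
    cmp_write(h, v);
    if (cmp_read(h) != v && cmp_read(h) != v)
        return -1;
    return 0;
}
```

On hardware without the erratum the first read-back matches and the second read is never issued, so the extra cost is paid only when the stale value is observed.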

8da854cb

x86: Handle overlapping mptables · 909fc87b

Committed by Andi Kleen on Mar 29, 2010

We found a system where the MP table MPC and MPF structures overlap.

That doesn't really matter because the mptable is not used anyways with ACPI,
but it leads to a panic in the early allocator due to the overlapping
reservations in 2.6.33.

Earlier kernels handled this without problems.

Simply change these reservations to reserve_early_overlap_ok to avoid
the panic.

Reported-by: Thomas Renninger <trenn@suse.de>
Tested-by: Thomas Renninger <trenn@suse.de>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
LKML-Reference: <20100329074111.GA22821@basil.fritz.box>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Cc: <stable@kernel.org>

909fc87b

01 Apr, 2010 1 commit

x86,kgdb: Always initialize the hw breakpoint attribute · ab310b5e

Committed by Jason Wessel on Mar 30, 2010

It is required to call hw_breakpoint_init() on an attr before using it
in any other calls.  This fixes the problem where kgdb will sometimes
fail to initialize on x86_64.

Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: 2.6.33 <stable@kernel.org>
LKML-Reference: <1269975907-27602-1-git-send-email-jason.wessel@windriver.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>

ab310b5e
