提交 · 3fb82d56ad003e804923185316236f26b30dfdd5 · openanolis / cloud-kernel

14 12月, 2010 1 次提交

x86, suspend: Avoid unnecessary smp alternatives switch during suspend/resume · 3fb82d56

由 Suresh Siddha 提交于 11月 23, 2010

During suspend, we disable all the non boot cpus. And during resume we bring
them all back again. So no need to do alternatives_smp_switch() in between.

On my core 2 based laptop, this speeds up the suspend path by 15msec and the
resume path by 5 msec (suspend/resume speed up differences can be attributed
to the different P-states that the cpu is in during suspend/resume).
Signed-off-by: NSuresh Siddha <suresh.b.siddha@intel.com>
LKML-Reference: <1290557500.4946.8.camel@sbsiddha-MOBL3.sc.intel.com>
Cc: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: NH. Peter Anvin <hpa@linux.intel.com>

3fb82d56

03 12月, 2010 1 次提交

vmalloc: eagerly clear ptes on vunmap · 64141da5

由 Jeremy Fitzhardinge 提交于 12月 02, 2010

On stock 2.6.37-rc4, running:

  # mount lilith:/export /mnt/lilith
  # find  /mnt/lilith/ -type f -print0 | xargs -0 file

crashes the machine fairly quickly under Xen.  Often it results in oops
messages, but the couple of times I tried just now, it just hung quietly
and made Xen print some rude messages:

    (XEN) mm.c:2389:d80 Bad type (saw 7400000000000001 != exp
    3000000000000000) for mfn 1d7058 (pfn 18fa7)
    (XEN) mm.c:964:d80 Attempt to create linear p.t. with write perms
    (XEN) mm.c:2389:d80 Bad type (saw 7400000000000010 != exp
    1000000000000000) for mfn 1d2e04 (pfn 1d1fb)
    (XEN) mm.c:2965:d80 Error while pinning mfn 1d2e04

Which means the domain tried to map a pagetable page RW, which would
allow it to map arbitrary memory, so Xen stopped it.  This is because
vm_unmap_ram() left some pages mapped in the vmalloc area after NFS had
finished with them, and those pages got recycled as pagetable pages
while still having these RW aliases.

Removing those mappings immediately removes the Xen-visible aliases, and
so it has no problem with those pages being reused as pagetable pages.
Deferring the TLB flush doesn't upset Xen because it can flush the TLB
itself as needed to maintain its invariants.

When unmapping a region in the vmalloc space, clear the ptes
immediately.  There's no point in deferring this because there's no
amortization benefit.

The TLBs are left dirty, and they are flushed lazily to amortize the
cost of the IPIs.

This specific motivation for this patch is an oops-causing regression
since 2.6.36 when using NFS under Xen, triggered by the NFS client's use
of vm_map_ram() introduced in 56e4ebf8 ("NFS: readdir with vmapped
pages") .  XFS also uses vm_map_ram() and could cause similar problems.
Signed-off-by: NJeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: Nick Piggin <npiggin@kernel.dk>
Cc: Bryan Schumaker <bjschuma@netapp.com>
Cc: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: Alex Elder <aelder@sgi.com>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

64141da5

02 12月, 2010 2 次提交

xen: unplug the emulated devices at resume time · 512b109e

由 Stefano Stabellini 提交于 12月 01, 2010

Early after being resumed we need to unplug again the emulated devices.
Signed-off-by: NStefano Stabellini <stefano.stabellini@eu.citrix.com>

512b109e

xen: fix MSI setup and teardown for PV on HVM guests · af42b8d1

由 Stefano Stabellini 提交于 12月 01, 2010

When remapping MSIs into pirqs for PV on HVM guests, qemu is responsible
for doing the actual mapping and unmapping.
We only give qemu the desired pirq number when we ask to do the mapping
the first time, after that we should be reading back the pirq number
from qemu every time we want to re-enable the MSI.

This fixes a bug in xen_hvm_setup_msi_irqs that manifests itself when
trying to enable the same MSI for the second time: the old MSI to pirq
mapping is still valid at this point but xen_hvm_setup_msi_irqs would
try to assign a new pirq anyway.
A simple way to reproduce this bug is to assign an MSI capable network
card to a PV on HVM guest, if the user brings down the corresponding
ethernet interface and up again, Linux would fail to enable MSIs on the
device.
Signed-off-by: NStefano Stabellini <stefano.stabellini@eu.citrix.com>

af42b8d1

30 11月, 2010 2 次提交

xen: x86/32: perform initial startup on initial_page_table · 805e3f49

由 Ian Campbell 提交于 11月 03, 2010

Only make swapper_pg_dir readonly and pinned when generic x86 architecture code
(which also starts on initial_page_table) switches to it.  This helps ensure
that the generic setup paths work on Xen unmodified. In particular
clone_pgd_range writes directly to the destination pgd entries and is used to
initialise swapper_pg_dir so we need to ensure that it remains writeable until
the last possible moment during bring up.

This is complicated slightly by the need to avoid sharing kernel PMD entries
when running under Xen, therefore the Xen implementation must make a copy of
the kernel PMD (which is otherwise referred to by both intial_page_table and
swapper_pg_dir) before switching to swapper_pg_dir.
Signed-off-by: NIan Campbell <ian.campbell@citrix.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: H. Peter Anvin <hpa@linux.intel.com>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Signed-off-by: NJeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>

805e3f49

xen: don't bother to stop other cpus on shutdown/reboot · 31e323cc

由 Jeremy Fitzhardinge 提交于 11月 29, 2010

Xen will shoot all the VCPUs when we do a shutdown hypercall, so there's
no need to do it manually.

In any case it will fail because all the IPI irqs have been pulled
down by this point, so the cross-CPU calls will simply hang forever.

Until change 76fac077 the function calls
were not synchronously waited for, so this wasn't apparent.  However after
that change the calls became synchronous leading to a hang on shutdown
on multi-VCPU guests.
Signed-off-by: NJeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: Stable Kernel <stable@kernel.org>
Cc: Alok Kataria <akataria@vmware.com>

31e323cc

26 11月, 2010 2 次提交

perf, x86: Fixup Kconfig deps · cc2067a5

由 Peter Zijlstra 提交于 11月 16, 2010

This leads to a Kconfig dep inversion, x86 selects PERF_EVENT (due to
a hw_breakpoint dep) but doesn't unconditionally provide
HAVE_PERF_EVENT.

(This can cause build failures on M386/M486 kernel .config's.)
Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <20101117222055.982965150@chello.nl>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

cc2067a5

x86, perf, nmi: Disable perf if counters are not accessible · 33c6d6a7

由 Don Zickus 提交于 11月 22, 2010

In a kvm virt guests, the perf counters are not emulated.  Instead they
return zero on a rdmsrl. The perf nmi handler uses the fact that crossing
a zero means the counter overflowed (for those counters that do not have
specific interrupt bits). Therefore on kvm guests, perf will swallow all
NMIs thinking the counters overflowed.

This causes problems for subsystems like kgdb which needs NMIs to do its
magic. This problem was discovered by running kgdb tests.

The solution is to write garbage into a perf counter during the
initialization and hopefully reading back the same number.  On kvm
guests, the value will be read back as zero and we disable perf as
a result.
Reported-by: NJason Wessel <jason.wessel@windriver.com>
Patch-inspired-by: NPeter Zijlstra <peterz@infradead.org>
Signed-off-by: NDon Zickus <dzickus@redhat.com>
Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
LKML-Reference: <1290462923-30734-1-git-send-email-dzickus@redhat.com>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

33c6d6a7

25 11月, 2010 3 次提交

arch/x86/include/asm/fixmap.h: mark __set_fixmap_offset as __always_inline · 91d95fda

由 Andrew Morton 提交于 11月 24, 2010

When compiling arch/x86/kernel/early_printk_mrst.c with i386
allmodconfig, gcc-4.1.0 generates an out-of-line copy of
__set_fixmap_offset() which contains a reference to
__this_fixmap_does_not_exist which the compiler cannot elide.

Marking __set_fixmap_offset() as __always_inline prevents this.

Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Feng Tang <feng.tang@intel.com>
Acked-by: NRandy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

91d95fda

xen: remove duplicated #include · e6d4a76d

由 Huang Weiyi 提交于 11月 20, 2010

Remove duplicated #include('s) in
  arch/x86/xen/setup.c
Signed-off-by: NHuang Weiyi <weiyi.huang@gmail.com>
Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>

e6d4a76d

xen: x86/32: perform initial startup on initial_page_table · 5b5c1af1

由 Ian Campbell 提交于 11月 24, 2010

Only make swapper_pg_dir readonly and pinned when generic x86 architecture code
(which also starts on initial_page_table) switches to it.  This helps ensure
that the generic setup paths work on Xen unmodified. In particular
clone_pgd_range writes directly to the destination pgd entries and is used to
initialise swapper_pg_dir so we need to ensure that it remains writeable until
the last possible moment during bring up.

This is complicated slightly by the need to avoid sharing kernel PMD entries
when running under Xen, therefore the Xen implementation must make a copy of
the kernel PMD (which is otherwise referred to by both intial_page_table and
swapper_pg_dir) before switching to swapper_pg_dir.
Signed-off-by: NIan Campbell <ian.campbell@citrix.com>
Tested-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: H. Peter Anvin <hpa@linux.intel.com>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>

5b5c1af1

23 11月, 2010 3 次提交

xen: use default_idle · bc15fde7

由 Jeremy Fitzhardinge 提交于 11月 22, 2010

We just need the idle loop to drop into safe_halt, which default_idle()
is perfectly capable of doing. There's no need to duplicate it.
Signed-off-by: NJeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>

bc15fde7

xen: clean up "extra" memory handling some more · c2d08791

由 Jeremy Fitzhardinge 提交于 11月 22, 2010

Make sure that extra_pages is added for all E820_RAM regions beyond
mem_end - completely excluded regions as well as the remains of partially
included regions.

Also makes sure the extra region is not unnecessarily high, and simplifies
the logic to decide which regions should be added.
Signed-off-by: NJeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>

c2d08791

xen: set IO permission early (before early_cpu_init()) · ec35a69c

由 Konrad Rzeszutek Wilk 提交于 11月 16, 2010

This patch is based off "xen dom0: Set up basic IO permissions for dom0."
by Juan Quintela <quintela@redhat.com>.

On AMD machines when we boot the kernel as Domain 0 we get this nasty:

mapping kernel into physical memory
Xen: setup ISA identity maps
about to get started...
(XEN) traps.c:475:d0 Unhandled general protection fault fault/trap [#13] on VCPU 0 [ec=0000]
(XEN) domain_crash_sync called from entry.S
(XEN) Domain 0 (vcpu#0) crashed on cpu#0:
(XEN) ----[ Xen-4.1-101116  x86_64  debug=y  Not tainted ]----
(XEN) CPU:    0
(XEN) RIP:    e033:[<ffffffff8130271b>]
(XEN) RFLAGS: 0000000000000282   EM: 1   CONTEXT: pv guest
(XEN) rax: 000000008000c068   rbx: ffffffff8186c680   rcx: 0000000000000068
(XEN) rdx: 0000000000000cf8   rsi: 000000000000c000   rdi: 0000000000000000
(XEN) rbp: ffffffff81801e98   rsp: ffffffff81801e50   r8:  ffffffff81801eac
(XEN) r9:  ffffffff81801ea8   r10: ffffffff81801eb4   r11: 00000000ffffffff
(XEN) r12: ffffffff8186c694   r13: ffffffff81801f90   r14: ffffffffffffffff
(XEN) r15: 0000000000000000   cr0: 000000008005003b   cr4: 00000000000006f0
(XEN) cr3: 0000000221803000   cr2: 0000000000000000
(XEN) ds: 0000   es: 0000   fs: 0000   gs: 0000   ss: e02b   cs: e033
(XEN) Guest stack trace from rsp=ffffffff81801e50:

RIP points to read_pci_config() function.

The issue is that we don't set IO permissions for the Linux kernel early enough.

The call sequence used to be:

    xen_start_kernel()
	x86_init.oem.arch_setup = xen_setup_arch;
        setup_arch:
           - early_cpu_init
               - early_init_amd
                  - read_pci_config
           - x86_init.oem.arch_setup [ xen_arch_setup ]
               - set IO permissions.

We need to set the IO permissions earlier on, which this patch does.
Acked-by: NJeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>

ec35a69c

20 11月, 2010 1 次提交

xen: re-enable boot-time ballooning · d2a81713

由 Jeremy Fitzhardinge 提交于 11月 19, 2010

Now that the balloon driver doesn't stumble over non-RAM pages, we
can enable the extra space for ballooning.
Signed-off-by: NJeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>

d2a81713

18 11月, 2010 10 次提交

x86-64: Fix and clean up AMD Fam10 MMCONF enabling · 37db6c8f

由 Jan Beulich 提交于 11月 16, 2010

Candidate memory ranges were not calculated properly (start
addresses got needlessly rounded down, and end addresses didn't
get rounded up at all), address comparison for secondary CPUs
was done on only part of the address, and disabled status wasn't
tracked properly.
Signed-off-by: NJan Beulich <jbeulich@novell.com>
Acked-by: NYinghai Lu <yinghai@kernel.org>
Acked-by: NAndreas Herrmann <andreas.herrmann3@amd.com>
LKML-Reference: <4CE24DF40200007800022737@vpn.id2.novell.com>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

37db6c8f

x86/kprobes: Prevent kprobes to probe on save_args() · de31ec8a

由 Masami Hiramatsu 提交于 11月 18, 2010

Prevent kprobes to probe on save_args() since this function
will be called from breakpoint exception handler. That will
cause infinit loop on breakpoint handling.
Signed-off-by: NMasami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: 2nddept-manager@sdl.hitachi.co.jp
Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
LKML-Reference: <20101118101655.2779.2816.stgit@ltc236.sdl.hitachi.co.jp>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

de31ec8a

x86: UV: Address interrupt/IO port operation conflict · 8191c9f6

由 Dimitri Sivanich 提交于 11月 16, 2010

This patch for SGI UV systems addresses a problem whereby
interrupt transactions being looped back from a local IOH,
through the hub to a local CPU can (erroneously) conflict with
IO port operations and other transactions.

To workaound this we set a high bit in the APIC IDs used for
interrupts. This bit appears to be ignored by the sockets, but
it avoids the conflict in the hub.
Signed-off-by: NDimitri Sivanich <sivanich@sgi.com>
LKML-Reference: <20101116222352.GA8155@sgi.com>
Signed-off-by: NIngo Molnar <mingo@elte.hu>
___

 arch/x86/include/asm/uv/uv_hub.h   |    4 ++++
 arch/x86/include/asm/uv/uv_mmrs.h  |   19 ++++++++++++++++++-
 arch/x86/kernel/apic/x2apic_uv_x.c |   25 +++++++++++++++++++++++--
 arch/x86/platform/uv/tlb_uv.c      |    2 +-
 arch/x86/platform/uv/uv_time.c     |    4 +++-
 5 files changed, 49 insertions(+), 5 deletions(-)

8191c9f6

x86: Use online node real index in calulate_tbl_offset() · 9223081f

由 Yinghai Lu 提交于 11月 13, 2010

Found a NUMA system that doesn't have RAM installed at the first
socket which hangs while executing init scripts.

bisected it to:

 | commit 93296720
 | Author: Shaohua Li <shaohua.li@intel.com>
 | Date:   Wed Oct 20 11:07:03 2010 +0800
 |
 |     x86: Spread tlb flush vector between nodes

It turns out when first socket is not online it could have cpus on
node1 tlb_offset set to bigger than NUM_INVALIDATE_TLB_VECTORS.

That could affect systems like 4 sockets, but socket 2 doesn't
have installed, sockets 3 will get too big tlb_offset.

Need to use real online node idx.
Signed-off-by: NYinghai Lu <yinghai@kernel.org>
Acked-by: NShaohua Li <shaohua.li@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
LKML-Reference: <4CDEDE59.40603@kernel.org>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

9223081f

x86, asm: Fix binutils 2.15 build failure · 96e612ff

由 Tetsuo Handa 提交于 11月 16, 2010

Add parentheses around one pushl_cfi argument.

Commit df5d1874 "x86: Use {push,pop}{l,q}_cfi in more places"
caused GNU assembler 2.15 (Debian Sarge) to fail. It is still
failing as of commit 07bd8516 "x86, asm: Restore parentheses
around one pushl_cfi argument". This patch solves build failure
with GNU assembler 2.15.
Signed-off-by: NTetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Acked-by: NJan Beulich <jbeulich@novell.com>
Cc: heukelum@fastmail.fm
Cc: hpa@linux.intel.com
LKML-Reference: <201011160445.oAG4jGif079860@www262.sakura.ne.jp>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

96e612ff

x86, hw_nmi: Move backtrace_mask declaration under ARCH_HAS_NMI_WATCHDOG · 0e2af2a9

由 Rakib Mullick 提交于 11月 12, 2010

backtrace_mask has been used under the code context of
ARCH_HAS_NMI_WATCHDOG. So put it into that context.
We were warned by the following warning:

  arch/x86/kernel/apic/hw_nmi.c:21: warning: ‘backtrace_mask’ defined but not used
Signed-off-by: NRakib Mullick <rakib.mullick@gmail.com>
Signed-off-by: NDon Zickus <dzickus@redhat.com>
LKML-Reference: <1289573455-3410-2-git-send-email-dzickus@redhat.com>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

0e2af2a9

KVM: VMX: Fix host userspace gsbase corruption · c8770e7b

由 Avi Kivity 提交于 11月 11, 2010

We now use load_gs_index() to load gs safely; unfortunately this also
changes MSR_KERNEL_GS_BASE, which we managed separately.  This resulted
in confusion and breakage running 32-bit host userspace on a 64-bit kernel.

Fix by
- saving guest MSR_KERNEL_GS_BASE before we we reload the host's gs
- doing the host save/load unconditionally, instead of only when in guest
  long mode

Things can be cleaned up further, but this is the minmal fix for now.
Signed-off-by: NAvi Kivity <avi@redhat.com>
Signed-off-by: NMarcelo Tosatti <mtosatti@redhat.com>

c8770e7b

KVM: Correct ordering of ldt reload wrt fs/gs reload · 0a77fe4c

由 Avi Kivity 提交于 10月 19, 2010

If fs or gs refer to the ldt, they must be reloaded after the ldt.  Reorder
the code to that effect.

Userspace code that uses the ldt with kvm is nonexistent, so this doesn't fix
a user-visible bug.
Signed-off-by: NAvi Kivity <avi@redhat.com>
Signed-off-by: NMarcelo Tosatti <mtosatti@redhat.com>

0a77fe4c

kgdb,x86: fix regression in detach handling · 10a6e676

由 Jason Wessel 提交于 11月 15, 2010

The fix from ba773f7c
(x86,kgdb: Fix hw breakpoint regression) was not entirely complete.

The kgdb_remove_all_hw_break() function also needs to call the
hw_break_release_slot() or else a breakpoint can get activated again
after the debugger has detached.

The kgdb test suite exposes the behavior in the form of either a hang
or repetitive failure.  The kernel config that exposes the problem
contains all of the following:

CONFIG_DEBUG_RODATA=y
CONFIG_KGDB_TESTS=y
CONFIG_KGDB_TESTS_ON_BOOT=y
CONFIG_KGDB_TESTS_BOOT_STRING="V1F100"
Reported-by: NFrederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: NJason Wessel <jason.wessel@windriver.com>
Tested-by: NFrederic Weisbecker <fweisbec@gmail.com>

10a6e676

BKL: remove extraneous #include <smp_lock.h> · 451a3c24

由 Arnd Bergmann 提交于 11月 17, 2010

The big kernel lock has been removed from all these files at some point,
leaving only the #include.

Remove this too as a cleanup.
Signed-off-by: NArnd Bergmann <arnd@arndb.de>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

451a3c24

13 11月, 2010 1 次提交

xen: implement XENMEM_machphys_mapping · 7e77506a

由 Ian Campbell 提交于 9月 30, 2010

This hypercall allows Xen to specify a non-default location for the
machine to physical mapping. This capability is used when running a 32
bit domain 0 on a 64 bit hypervisor to shrink the hypervisor hole to
exactly the size required.

[ Impact: add Xen hypercall definitions ]
Signed-off-by: NIan Campbell <ian.campbell@citrix.com>
Signed-off-by: NJeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: NStefano Stabellini <stefano.stabellini@eu.citrix.com>

7e77506a

12 11月, 2010 3 次提交

x86: Ignore trap bits on single step exceptions · 6c0aca28

由 Frederic Weisbecker 提交于 11月 11, 2010

When a single step exception fires, the trap bits, used to
signal hardware breakpoints, are in a random state.

These trap bits might be set if another exception will follow,
like a breakpoint in the next instruction, or a watchpoint in the
previous one. Or there can be any junk there.

So if we handle these trap bits during the single step exception,
we are going to handle an exception twice, or we are going to
handle junk.

Just ignore them in this case.

This fixes https://bugzilla.kernel.org/show_bug.cgi?id=21332Reported-by: NMichael Stefaniuc <mstefani@redhat.com>
Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
Cc: Rafael J. Wysocki <rjw@sisk.pl>
Cc: Maciej Rutecki <maciej.rutecki@gmail.com>
Cc: Alexandre Julliard <julliard@winehq.org>
Cc: Jason Wessel <jason.wessel@windriver.com>
Cc: All since 2.6.33.x <stable@kernel.org>

6c0aca28

xen: set vma flag VM_PFNMAP in the privcmd mmap file_op · e060e7af

由 Stefano Stabellini 提交于 11月 11, 2010

Set VM_PFNMAP in the privcmd mmap file_op, rather than later in
xen_remap_domain_mfn_range when it is too late because
vma_wants_writenotify has already been called and vm_page_prot has
already been modified.
Signed-off-by: NStefano Stabellini <stefano.stabellini@eu.citrix.com>
Signed-off-by: NJeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>

e060e7af

x86/PCI: coalesce overlapping host bridge windows · 4723d0f2

由 Bjorn Helgaas 提交于 9月 22, 2010

Some BIOSes provide PCI host bridge windows that overlap, e.g.,

pci_root PNP0A03:00: host bridge window [mem 0xb0000000-0xffffffff]
pci_root PNP0A03:00: host bridge window [mem 0xafffffff-0xdfffffff]
pci_root PNP0A03:00: host bridge window [mem 0xf0000000-0xffffffff]

If we simply insert these as children of iomem_resource, the second window
fails because it conflicts with the first, and the third is inserted as a
child of the first, i.e.,

b0000000-ffffffff PCI Bus 0000:00
f0000000-ffffffff PCI Bus 0000:00

When we claim PCI device resources, this can cause collisions like this
if we put them in the first window:

pci 0000:00:01.0: address space collision: [mem 0xff300000-0xff4fffff] conflicts with PCI Bus 0000:00 [mem 0xf0000000-0xffffffff]

Host bridge windows are top-level resources by definition, so it doesn't
make sense to make the third window a child of the first. This patch
coalesces any host bridge windows that overlap. For the example above,
the result is this single window:

pci_root PNP0A03:00: host bridge window [mem 0xafffffff-0xffffffff]

This fixes a 2.6.34 regression.

Reference: https://bugzilla.kernel.org/show_bug.cgi?id=17011Reported-and-tested-by: NAnisse Astier <anisse@astier.eu>
Reported-and-tested-by: NPramod Dematagoda <pmd.lotr.gandalf@gmail.com>
Signed-off-by: NBjorn Helgaas <bjorn.helgaas@hp.com>
Signed-off-by: NJesse Barnes <jbarnes@virtuousgeek.org>

4723d0f2

11 11月, 2010 3 次提交

tracing: Force arch_local_irq_* notrace for paravirt · b5908548

由 Steven Rostedt 提交于 11月 10, 2010

When running ktest.pl randconfig tests, I would sometimes trigger
a lockdep annotation bug (possible reason: unannotated irqs-on).

This triggering happened right after function tracer self test was
executed. After doing a config bisect I found that this was caused with
having function tracer, paravirt guest, prove locking, and rcu torture
all enabled.

The rcu torture just enhanced the likelyhood of triggering the bug.
Prove locking was needed, since it was the thing that was bugging.
Function tracer would trace and disable interrupts in all sorts
of funny places.
paravirt guest would turn arch_local_irq_* into functions that would
be traced.

Besides the fact that tracing arch_local_irq_* is just a bad idea,
this is what is happening.

The bug happened simply in the local_irq_restore() code:

		if (raw_irqs_disabled_flags(flags)) {	\
			raw_local_irq_restore(flags);	\
			trace_hardirqs_off();		\
		} else {				\
			trace_hardirqs_on();		\
			raw_local_irq_restore(flags);	\
		}					\

The raw_local_irq_restore() was defined as arch_local_irq_restore().

Now imagine, we are about to enable interrupts. We go into the else
case and call trace_hardirqs_on() which tells lockdep that we are enabling
interrupts, so it sets the current->hardirqs_enabled = 1.

Then we call raw_local_irq_restore() which calls arch_local_irq_restore()
which gets traced!

Now in the function tracer we disable interrupts with local_irq_save().
This is fine, but flags is stored that we have interrupts disabled.

When the function tracer calls local_irq_restore() it does it, but this
time with flags set as disabled, so we go into the if () path.
This keeps interrupts disabled and calls trace_hardirqs_off() which
sets current->hardirqs_enabled = 0.

When the tracer is finished and proceeds with the original code,
we enable interrupts but leave current->hardirqs_enabled as 0. Which
now breaks lockdeps internal processing.

Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>

b5908548

xen: do not release any memory under 1M in domain 0 · 9ec23a7f

由 Ian Campbell 提交于 10月 28, 2010

We already deliberately setup a 1-1 P2M for the region up to 1M in
order to allow code which assumes this region is already mapped to
work without having to convert everything to ioremap.

Domain 0 should not return any apparently unused memory regions
(reserved or otherwise) in this region to Xen since the e820 may not
accurately reflect what the BIOS has stashed in this region.
Signed-off-by: NIan Campbell <ian.campbell@citrix.com>
Signed-off-by: NJeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>

9ec23a7f

perf, amd: Use kmalloc_node(,__GFP_ZERO) for northbridge structure allocation · 034c6efa

由 Peter Zijlstra 提交于 11月 01, 2010

Jasper suggested we use the zeroing capability of the allocators
instead of calling memset ourselves. Add node affinity while we're at
it.
Reported-by: NJesper Juhl <jj@chaosbits.net>
Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <new-submission>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

034c6efa

10 11月, 2010 6 次提交

x86, pvclock: Remove leftover scale_delta() function · 1f523bf3

由 Kusanagi Kouichi 提交于 11月 05, 2010

Commit 92580d64e16402762e2acc3022f065397c780425
("x86: pvclock: Move scale_delta into common header")
forgot to remove scale_delta.
Signed-off-by: NKusanagi Kouichi <slash@ac.auone-net.jp>
Cc: Zachary Amsden <zamsden@redhat.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Glauber Costa <glommer@redhat.com>
LKML-Reference: <20101105110444.BAF6D6FC03B@msa105.auone-net.jp>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

1f523bf3

x86, apic: Remove double #include · 2a8dcbd6

由 Jesper Juhl 提交于 11月 07, 2010

Remove the second <asm/atomic.h> inclusion.
Signed-off-by: NJesper Juhl <jj@chaosbits.net>
LKML-Reference: <alpine.LNX.2.00.1011072253360.26247@swampdragon.chaosbits.net>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

2a8dcbd6

x86: Adjust section annotations in AMD Fam10 MMCONF enabling code · 2f62bf7d

由 Jan Beulich 提交于 11月 04, 2010

check_enable_amd_mmconf_dmi() gets called only for the BSP,
hence everything hanging off of it can be __init*.
Signed-off-by: NJan Beulich <jbeulich@novell.com>
Acked-by: NYinghai Lu <yinghai@kernel.org>
LKML-Reference: <4CD2DE1E0200007800020990@vpn.id2.novell.com>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

2f62bf7d

x86, UV: Update node controller MMRs · 62b0cfc2

由 Jack Steiner 提交于 11月 06, 2010

A new version of the SGI UV hub node controller is being
developed. A few of the MMRs (control registers) that exist on
the current hub no longer exist on the new hub. Fortunately,
there are alternate MMRs that are are functionally equivalent
and that exist on both hubs.

This patch changes the UV code to use MMRs that exist in BOTH
versions of the hub node controller.
Signed-off-by: NJack Steiner <steiner@sgi.com>
LKML-Reference: <20101106204056.GA27584@sgi.com>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

62b0cfc2

x86: Remove unnecessary casts of void ptr returning alloc function return values · 8e5e9521

由 Jesper Juhl 提交于 11月 09, 2010

The [vk][cmz]alloc(_node) family of functions return void
pointers which it's completely unnecessary/pointless to cast to
other pointer types since that happens implicitly.

This patch removes such casts from arch/x86.
Signed-off-by: NJesper Juhl <jj@chaosbits.net>
Cc: trivial@kernel.org
Cc: amd64-microcode@amd64.org
Cc: Andreas Herrmann <andreas.herrmann3@amd.com>
LKML-Reference: <alpine.LNX.2.00.1011082310220.23697@swampdragon.chaosbits.net>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

8e5e9521

x86: Address gcc4.6 "set but not used" warnings in apic.h · 0059b243

由 Andi Kleen 提交于 11月 08, 2010

native_apic_msr_read() and x2apic_enabled() use rdmsr(msr, low, high),
but only use the low part.

gcc4.6 complains about this:
.../apic.h:144:11: warning: variable 'high' set but not used [-Wunused-but-set-variable]

rdmsr() is just a wrapper around rdmsrl() which splits the 64bit value
into low and high, so using rdmsrl() directly solves this.

[tglx: Changed the variables to u64 as suggested by Cyrill. It's less
       confusing and has no code impact as this is 64bit only anyway.
       Massaged changelog as well. ]
Signed-off-by: NAndi Kleen <ak@linux.intel.com>
Cc: x86@kernel.org
Cc: Cyrill Gorcunov <gorcunov@gmail.com>
LKML-Reference: <1289251229-19589-1-git-send-email-andi@firstfloor.org>
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>

0059b243

09 11月, 2010 1 次提交

xen: fix memory leak in Xen PCI MSI/MSI-X allocator. · 07cf2a64

由 Jiri Slaby 提交于 11月 06, 2010

Stanse found that xen_setup_msi_irqs leaks memory when
xen_allocate_pirq fails. Free the memory in that fail path.
Signed-off-by: NJiri Slaby <jslaby@suse.cz>
Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: xen-devel@lists.xensource.com
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: x86@kernel.org

07cf2a64

06 11月, 2010 1 次提交

KVM: x86: Issue smp_call_function_many with preemption disabled · 453d9c57

由 Jan Kiszka 提交于 11月 01, 2010

smp_call_function_many is specified to be called only with preemption
disabled. Fulfill this requirement.
Signed-off-by: NJan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: NMarcelo Tosatti <mtosatti@redhat.com>

453d9c57

openanolis / cloud-kernel 大约 1 年 前同步成功

openanolis / cloud-kernel
大约 1 年前同步成功