提交 · b92f885f31db782608c8b73942111238020b1f4a · openeuler / raspberrypi-kernel

05 9月, 2012 3 次提交

x86/Kconfig: Disable CONFIG_CRC_T10DIF in defconfigs · b92f885f

由 Josh Triplett 提交于 9月 02, 2012

CONFIG_CRC_T10DIF explicitly states that it exists only for use
by out-of-tree modules; anything in-kernel that needs it selects
it.  Thus, compile it out by default.
Signed-off-by: NJosh Triplett <josh@joshtriplett.org>
Cc: Randy Dunlap <rdunlap@xenotime.net>
Cc: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/3aaff7a0af1320427952d411a21b8ded29747a1f.1346649518.git.josh@joshtriplett.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>

b92f885f

x86/Kconfig: Switch to ext4 in defconfigs · 3fe2cb8f

由 Josh Triplett 提交于 9月 02, 2012

The current x86 and x86-64 defconfigs do not enable ext4, which
most current distributions default to.  Switch the defconfigs to
ext4, so they will boot on current systems without additional
configuration.
Signed-off-by: NJosh Triplett <josh@joshtriplett.org>
Cc: Randy Dunlap <rdunlap@xenotime.net>
Cc: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/bd8a359506b7e1287c680823de16d67608ec52fe.1346649518.git.josh@joshtriplett.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>

3fe2cb8f

x86/Kconfig: Update defconfigs to current results of "make savedefconfig" · 406eae5d

由 Josh Triplett 提交于 9月 02, 2012

The x86 defconfigs have become somewhat out of date compared to
the current result of "make savedefconfig".  Update them to the
current output, as a prelude to further defconfig changes, to
avoid unrelated noise in those further changes.
Signed-off-by: NJosh Triplett <josh@joshtriplett.org>
Cc: Randy Dunlap <rdunlap@xenotime.net>
Cc: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/80c8a5fbeaf6cdb72fb78a016013427efee52668.1346649518.git.josh@joshtriplett.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>

406eae5d

28 8月, 2012 1 次提交

KVM: x86: fix KVM_GET_MSR for PV EOI · 1d92128f

由 Michael S. Tsirkin 提交于 8月 26, 2012

KVM_GET_MSR was missing support for PV EOI,
which is needed for migration.
Signed-off-by: NMichael S. Tsirkin <mst@redhat.com>
Signed-off-by: NMarcelo Tosatti <mtosatti@redhat.com>

1d92128f

23 8月, 2012 3 次提交

xen/setup: Fix one-off error when adding for-balloon PFNs to the P2M. · c96aae1f

由 Konrad Rzeszutek Wilk 提交于 8月 17, 2012

When we are finished with return PFNs to the hypervisor, then
populate it back, and also mark the E820 MMIO and E820 gaps
as IDENTITY_FRAMEs, we then call P2M to set areas that can
be used for ballooning. We were off by one, and ended up
over-writting a P2M entry that most likely was an IDENTITY_FRAME.
For example:

1-1 mapping on 40000->40200
1-1 mapping on bc558->bc5ac
1-1 mapping on bc5b4->bc8c5
1-1 mapping on bc8c6->bcb7c
1-1 mapping on bcd00->100000
Released 614 pages of unused memory
Set 277889 page(s) to 1-1 mapping
Populating 40200-40466 pfn range: 614 pages added

=> here we set from 40466 up to bc559 P2M tree to be
INVALID_P2M_ENTRY. We should have done it up to bc558.

The end result is that if anybody is trying to construct
a PTE for PFN bc558 they end up with ~PAGE_PRESENT.

CC: stable@vger.kernel.org
Reported-by-and-Tested-by: NAndre Przywara <andre.przywara@amd.com>
Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>

c96aae1f

x86, microcode, AMD: Fix broken ucode patch size check · 36bf50d7

由 Andreas Herrmann 提交于 7月 31, 2012

This issue was recently observed on an AMD C-50 CPU where a patch of
maximum size was applied.

Commit be62adb4 ("x86, microcode, AMD: Simplify ucode verification")
added current_size in get_matching_microcode(). This is calculated as
size of the ucode patch + 8 (ie. size of the header). Later this is
compared against the maximum possible ucode patch size for a CPU family.
And of course this fails if the patch has already maximum size.

Cc: <stable@vger.kernel.org> [3.3+]
Signed-off-by: NAndreas Herrmann <andreas.herrmann3@amd.com>
Signed-off-by: NBorislav Petkov <borislav.petkov@amd.com>
Link: http://lkml.kernel.org/r/1344361461-10076-1-git-send-email-bp@amd64.orgSigned-off-by: NH. Peter Anvin <hpa@linux.intel.com>

36bf50d7

KVM: x86 emulator: use stack size attribute to mask rsp in stack ops · 5ad105e5

由 Avi Kivity 提交于 8月 19, 2012

The sub-register used to access the stack (sp, esp, or rsp) is not
determined by the address size attribute like other memory references,
but by the stack segment's B bit (if not in x86_64 mode).

Fix by using the existing stack_mask() to figure out the correct mask.

This long-existing bug was exposed by a combination of a27685c3
(emulate invalid guest state by default), which causes many more
instructions to be emulated, and a seabios change (possibly a bug) which
causes the high 16 bits of esp to become polluted across calls to real
mode software interrupts.
Signed-off-by: NAvi Kivity <avi@redhat.com>
Signed-off-by: NMarcelo Tosatti <mtosatti@redhat.com>

5ad105e5

22 8月, 2012 5 次提交

KVM: MMU: Fix mmu_shrink() so that it can free mmu pages as intended · 35f2d16b

由 Takuya Yoshikawa 提交于 8月 20, 2012

Although the possible race described in

  commit 85b70591
  KVM: MMU: fix shrinking page from the empty mmu

was correct, the real cause of that issue was a more trivial bug of
mmu_shrink() introduced by

  commit 19526396
  KVM: MMU: do not iterate over all VMs in mmu_shrink()

Here is the bug:

	if (kvm->arch.n_used_mmu_pages > 0) {
		if (!nr_to_scan--)
			break;
		continue;
	}

We skip VMs whose n_used_mmu_pages is not zero and try to shrink others:
in other words we try to shrink empty ones by mistake.

This patch reverses the logic so that mmu_shrink() can free pages from
the first VM whose n_used_mmu_pages is not zero.  Note that we also add
comments explaining the role of nr_to_scan which is not practically
important now, hoping this will be improved in the future.
Signed-off-by: NTakuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp>
Cc: Gleb Natapov <gleb@redhat.com>
Signed-off-by: NAvi Kivity <avi@redhat.com>

35f2d16b

x86/alternatives: Fix p6 nops on non-modular kernels · cb09cad4

由 Avi Kivity 提交于 8月 22, 2012

Probably a leftover from the early days of self-patching, p6nops
are marked __initconst_or_module, which causes them to be
discarded in a non-modular kernel.  If something later triggers
patching, it will overwrite kernel code with garbage.
Reported-by: NTomas Racek <tracek@redhat.com>
Signed-off-by: NAvi Kivity <avi@redhat.com>
Cc: Michael Tokarev <mjt@tls.msk.ru>
Cc: Borislav Petkov <borislav.petkov@amd.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: qemu-devel@nongnu.org
Cc: Anthony Liguori <anthony@codemonkey.ws>
Cc: H. Peter Anvin <hpa@linux.intel.com>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Cc: Alan Cox <alan@linux.intel.com>
Link: http://lkml.kernel.org/r/5034AE84.90708@redhat.comSigned-off-by: NIngo Molnar <mingo@kernel.org>

cb09cad4

x86/fixup_irq: Use cpu_online_mask instead of cpu_all_mask · 2530cd4f

由 Liu, Chuansheng 提交于 8月 14, 2012

When one CPU is going down and this CPU is the last one in irq
affinity, current code is setting cpu_all_mask as the new
affinity for that irq.

But for some systems (such as in Medfield Android mobile) the
firmware sends the interrupt to each CPU in the irq affinity
mask, averaged, and cpu_all_mask includes all potential CPUs,
i.e. offline ones as well.

So replace cpu_all_mask with cpu_online_mask.
Signed-off-by: Nliu chuansheng <chuansheng.liu@intel.com>
Acked-by: NYanmin Zhang <yanmin_zhang@linux.intel.com>
Acked-by: NThomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/27240C0AC20F114CBF8149A2696CBE4A137286@SHSMSX101.ccr.corp.intel.comSigned-off-by: NIngo Molnar <mingo@kernel.org>

2530cd4f

x86/spinlocks: Fix comment in spinlock.h · 83be4ffa

由 Richard Weinberger 提交于 8月 14, 2012

This comment is no longer true.  We support up to 2^16 CPUs
because __ticket_t is an u16 if NR_CPUS is larger than 256.
Signed-off-by: NRichard Weinberger <richard@nod.at>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NIngo Molnar <mingo@kernel.org>

83be4ffa

mm: hugetlbfs: correctly populate shared pmd · eb48c071

由 Michal Hocko 提交于 8月 21, 2012

Each page mapped in a process's address space must be correctly
accounted for in _mapcount.  Normally the rules for this are
straightforward but hugetlbfs page table sharing is different.  The page
table pages at the PMD level are reference counted while the mapcount
remains the same.

If this accounting is wrong, it causes bugs like this one reported by
Larry Woodman:

  kernel BUG at mm/filemap.c:135!
  invalid opcode: 0000 [#1] SMP
  CPU 22
  Modules linked in: bridge stp llc sunrpc binfmt_misc dcdbas microcode pcspkr acpi_pad acpi]
  Pid: 18001, comm: mpitest Tainted: G        W    3.3.0+ #4 Dell Inc. PowerEdge R620/07NDJ2
  RIP: 0010:[<ffffffff8112cfed>]  [<ffffffff8112cfed>] __delete_from_page_cache+0x15d/0x170
  Process mpitest (pid: 18001, threadinfo ffff880428972000, task ffff880428b5cc20)
  Call Trace:
    delete_from_page_cache+0x40/0x80
    truncate_hugepages+0x115/0x1f0
    hugetlbfs_evict_inode+0x18/0x30
    evict+0x9f/0x1b0
    iput_final+0xe3/0x1e0
    iput+0x3e/0x50
    d_kill+0xf8/0x110
    dput+0xe2/0x1b0
    __fput+0x162/0x240

During fork(), copy_hugetlb_page_range() detects if huge_pte_alloc()
shared page tables with the check dst_pte == src_pte.  The logic is if
the PMD page is the same, they must be shared.  This assumes that the
sharing is between the parent and child.  However, if the sharing is
with a different process entirely then this check fails as in this
diagram:

  parent
    |
    ------------>pmd
                 src_pte----------> data page
                                        ^
  other--------->pmd--------------------|
                  ^
  child-----------|
                 dst_pte

For this situation to occur, it must be possible for Parent and Other to
have faulted and failed to share page tables with each other.  This is
possible due to the following style of race.

  PROC A                                          PROC B
  copy_hugetlb_page_range                         copy_hugetlb_page_range
    src_pte == huge_pte_offset                      src_pte == huge_pte_offset
    !src_pte so no sharing                          !src_pte so no sharing

  (time passes)

  hugetlb_fault                                   hugetlb_fault
    huge_pte_alloc                                  huge_pte_alloc
      huge_pmd_share                                 huge_pmd_share
        LOCK(i_mmap_mutex)
        find nothing, no sharing
        UNLOCK(i_mmap_mutex)
                                                      LOCK(i_mmap_mutex)
                                                      find nothing, no sharing
                                                      UNLOCK(i_mmap_mutex)
      pmd_alloc                                       pmd_alloc
      LOCK(instantiation_mutex)
      fault
      UNLOCK(instantiation_mutex)
                                                  LOCK(instantiation_mutex)
                                                  fault
                                                  UNLOCK(instantiation_mutex)

These two processes are not poing to the same data page but are not
sharing page tables because the opportunity was missed.  When either
process later forks, the src_pte == dst pte is potentially insufficient.
As the check falls through, the wrong PTE information is copied in
(harmless but wrong) and the mapcount is bumped for a page mapped by a
shared page table leading to the BUG_ON.

This patch addresses the issue by moving pmd_alloc into huge_pmd_share
which guarantees that the shared pud is populated in the same critical
section as pmd.  This also means that huge_pte_offset test in
huge_pmd_share is serialized correctly now which in turn means that the
success of the sharing will be higher as the racing tasks see the pud
and pmd populated together.

Race identified and changelog written mostly by Mel Gorman.

{akpm@linux-foundation.org: attempt to make the huge_pmd_share() comment comprehensible, clean up coding style]
Reported-by: NLarry Woodman <lwoodman@redhat.com>
Tested-by: NLarry Woodman <lwoodman@redhat.com>
Reviewed-by: NMel Gorman <mgorman@suse.de>
Signed-off-by: NMichal Hocko <mhocko@suse.cz>
Reviewed-by: NRik van Riel <riel@redhat.com>
Cc: David Gibson <david@gibson.dropbear.id.au>
Cc: Ken Chen <kenchen@google.com>
Cc: Cong Wang <xiyou.wangcong@gmail.com>
Cc: Hillf Danton <dhillf@gmail.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

eb48c071

19 8月, 2012 1 次提交

x32: Use compat shims for {g,s}etsockopt · 515c7af8

由 Mike Frysinger 提交于 8月 18, 2012

Some of the arguments to {g,s}etsockopt are passed in userland pointers.
If we try to use the 64bit entry point, we end up sometimes failing.

For example, dhcpcd doesn't run in x32:
	# dhcpcd eth0
	dhcpcd[1979]: version 5.5.6 starting
	dhcpcd[1979]: eth0: broadcasting for a lease
	dhcpcd[1979]: eth0: open_socket: Invalid argument
	dhcpcd[1979]: eth0: send_raw_packet: Bad file descriptor

The code in particular is getting back EINVAL when doing:
	struct sock_fprog pf;
	setsockopt(s, SOL_SOCKET, SO_ATTACH_FILTER, &pf, sizeof(pf));

Diving into the kernel code, we can see:
include/linux/filter.h:
	struct sock_fprog {
		unsigned short len;
		struct sock_filter __user *filter;
	};

net/core/sock.c:
	case SO_ATTACH_FILTER:
		ret = -EINVAL;
		if (optlen == sizeof(struct sock_fprog)) {
			struct sock_fprog fprog;

			ret = -EFAULT;
			if (copy_from_user(&fprog, optval, sizeof(fprog)))
				break;

			ret = sk_attach_filter(&fprog, sk);
		}
		break;

arch/x86/syscalls/syscall_64.tbl:
	54 common setsockopt sys_setsockopt
	55 common getsockopt sys_getsockopt

So for x64, sizeof(sock_fprog) is 16 bytes.  For x86/x32, it's 8 bytes.
This comes down to the pointer being 32bit for x32, which means we need
to do structure size translation.  But since x32 comes in directly to
sys_setsockopt, it doesn't get translated like x86.

After changing the syscall table and rebuilding glibc with the new kernel
headers, dhcp runs fine in an x32 userland.

Oddly, it seems like Linus noted the same thing during the initial port,
but I guess that was missed/lost along the way:
	https://lkml.org/lkml/2011/8/26/452

[ hpa: tagging for -stable since this is an ABI fix. ]

Bugzilla: https://bugs.gentoo.org/423649Reported-by: NMads <mads@ab3.no>
Signed-off-by: NMike Frysinger <vapier@gentoo.org>
Link: http://lkml.kernel.org/r/1345320697-15713-1-git-send-email-vapier@gentoo.org
Cc: H. J. Lu <hjl.tools@gmail.com>
Cc: <stable@vger.kernel.org> v3.4..v3.5
Signed-off-by: NH. Peter Anvin <hpa@zytor.com>

515c7af8

17 8月, 2012 2 次提交

xen/p2m: Reuse existing P2M leafs if they are filled with 1:1 PFNs or INVALID. · 250a41e0

由 Konrad Rzeszutek Wilk 提交于 8月 17, 2012

If P2M leaf is completly packed with INVALID_P2M_ENTRY or with
1:1 PFNs (so IDENTITY_FRAME type PFNs), we can swap the P2M leaf
with either a p2m_missing or p2m_identity respectively. The old
page (which was created via extend_brk or was grafted on from the
mfn_list) can be re-used for setting new PFNs.

This also means we can remove git commit:
5bc6f988
xen/p2m: Reserve 8MB of _brk space for P2M leafs when populating back
which tried to fix this.

and make the amount that is required to be reserved much smaller.

CC: stable@vger.kernel.org # for 3.5 only.
Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>

250a41e0

Revert "xen PVonHVM: move shared_info to MMIO before kexec" · ca08649e

由 Konrad Rzeszutek Wilk 提交于 8月 16, 2012

This reverts commit 00e37bdb.

During shutdown of PVHVM guests with more than 2VCPUs on certain
machines we can hit the race where the replaced shared_info is not
replaced fast enough and the PV time clock retries reading the same
area over and over without any any success and is stuck in an
infinite loop.
Acked-by: NOlaf Hering <olaf@aepfle.de>
Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>

ca08649e

15 8月, 2012 2 次提交

Revert "x86-64/efi: Use EFI to deal with platform wall clock" · f026cfa8

由 H. Peter Anvin 提交于 8月 14, 2012

This reverts commit bacef661.

This commit has been found to cause serious regressions on a number of
ASUS machines at the least.  We probably need to provide a 1:1 map in
addition to the EFI virtual memory map in order for this to work.
Signed-off-by: NH. Peter Anvin <hpa@zytor.com>
Reported-and-bisected-by: NJérôme Carretero <cJ-ko@zougloub.eu>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Matt Fleming <matt.fleming@intel.com>
Cc: Matthew Garrett <mjg@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20120805172903.5f8bb24c@zougloub.eu

f026cfa8

x86, apic: fix broken legacy interrupts in the logical apic mode · f1c63001

由 Suresh Siddha 提交于 8月 08, 2012

Recent commit 332afa65 cleaned up
a workaround that updates irq_cfg domain for legacy irq's that
are handled by the IO-APIC. This was assuming that the recent
changes in assign_irq_vector() were sufficient to remove the workaround.

But this broke couple of AMD platforms. One of them seems to be
sending interrupts to the offline cpu's, resulting in spurious
"No irq handler for vector xx (irq -1)" messages when those cpu's come online.
And the other platform seems to always send the interrupt to the last logical
CPU (cpu-7). Recent changes had an unintended side effect of using only logical
cpu-0 in the IO-APIC RTE (during boot for the legacy interrupts) and this
broke the legacy interrupts not getting routed to the cpu-7 on the AMD
platform, resulting in a boot hang.

For now, reintroduce the removed workaround, (essentially not allowing the
vector to change for legacy irq's when io-apic starts to handle the irq. Which
also addressed the uninteded sife effect of just specifying cpu-0 in the
IO-APIC RTE for those irq's during boot).
Reported-and-tested-by: NRobert Richter <robert.richter@amd.com>
Reported-and-tested-by: NBorislav Petkov <bp@amd64.org>
Signed-off-by: NSuresh Siddha <suresh.b.siddha@intel.com>
Link: http://lkml.kernel.org/r/1344453412.29170.5.camel@sbsiddha-desk.sc.intel.comSigned-off-by: NH. Peter Anvin <hpa@zytor.com>

f1c63001

14 8月, 2012 4 次提交

perf/x86: disable PEBS on a guest entry. · 26a4f3c0

由 Gleb Natapov 提交于 8月 09, 2012

If PMU counter has PEBS enabled it is not enough to disable counter
on a guest entry since PEBS memory write can overshoot guest entry
and corrupt guest memory. Disabling PEBS during guest entry solves
the problem.
Tested-by: NDavid Ahern <dsahern@gmail.com>
Signed-off-by: NGleb Natapov <gleb@redhat.com>
Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20120809085234.GI3341@redhat.comSigned-off-by: NThomas Gleixner <tglx@linutronix.de>

26a4f3c0

perf/x86: Add Intel Westmere-EX uncore support · cb37af77

由 Yan, Zheng 提交于 8月 06, 2012

The Westmere-EX uncore is similar to the Nehalem-EX uncore. The
differences are:
 - Westmere-EX uncore has 10 instances of Cbox. The MSRs for Cbox8
   and Cbox9 in the Westmere-EX aren't contiguous with Cbox 0~7.
 - The fvid field in the ZDP_CTL_FVC register in the Mbox is
   different. It's 5 bits in the Nehalem-EX, 6 bits in the
   Westmere-EX.
Signed-off-by: NYan, Zheng <zheng.z.yan@intel.com>
Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1344229882-3907-3-git-send-email-zheng.z.yan@intel.comSigned-off-by: NThomas Gleixner <tglx@linutronix.de>

cb37af77

perf/x86: Fixes for Nehalem-EX uncore driver · ebb6cc03

由 Yan, Zheng 提交于 8月 06, 2012

This patch includes following fixes and update:
 - Only some events in the Sbox and Mbox can use the match/mask
   registers, add code to check this.
 - The format definitions for xbr_mm_cfg and xbr_match registers
   in the Rbox are wrong, xbr_mm_cfg should use 32 bits, xbr_match
   should use 64 bits.
 - Cleanup the Rbox code. Compute the addresses extra registers in
   the enable_event function instead of the hw_config function.
   This simplifies the code in nhmex_rbox_alter_er().
Signed-off-by: NYan, Zheng <zheng.z.yan@intel.com>
Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1344229882-3907-2-git-send-email-zheng.z.yan@intel.comSigned-off-by: NThomas Gleixner <tglx@linutronix.de>

ebb6cc03

perf, x86: Fix uncore_types_exit section mismatch · cffa59ba

由 Borislav Petkov 提交于 8月 02, 2012

Fix the following section mismatch:

WARNING: arch/x86/kernel/cpu/built-in.o(.text+0x7ad9): Section mismatch in reference from the function uncore_types_exit() to the function .init.text:uncore_type_exit()

The function uncore_types_exit() references the function __init
uncore_type_exit().  This is often because uncore_types_exit lacks a
__init annotation or the annotation of uncore_type_exit is wrong.

caused by 14371cce ("perf: Add generic PCI uncore PMU device
support").

Cc: Zheng Yan <zheng.z.yan@intel.com>
Cc: Ingo Molnar <mingo@kernel.org>
Signed-off-by: NBorislav Petkov <borislav.petkov@amd.com>
Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1339741902-8449-8-git-send-email-zheng.z.yan@intel.comSigned-off-by: NThomas Gleixner <tglx@linutronix.de>

cffa59ba

11 8月, 2012 1 次提交

x86, build: Globally set -fno-pic · 484d90ee

由 Andrew Boie 提交于 8月 10, 2012

GCC built with nonstandard options can enable -fpic by default.
We never want this for 32-bit kernels and it will break the build.

[ hpa: Notably the Android toolchain apparently does this. ]

Change-Id: Iaab7d66e598b1c65ac4a4f0229eca2cd3d0d2898
Signed-off-by: NAndrew Boie <andrew.p.boie@intel.com>
Link: http://lkml.kernel.org/r/1344624546-29691-1-git-send-email-andrew.p.boie@intel.comSigned-off-by: NH. Peter Anvin <hpa@linux.intel.com>

484d90ee

09 8月, 2012 1 次提交

x86, avx: don't use avx instructions with "noxsave" boot param · c6fd893d

由 Suresh Siddha 提交于 7月 31, 2012

Clear AVX, AVX2 features along with clearing XSAVE feature bits,
as part of the parsing "noxsave" parameter.

Fixes the kernel boot panic with "noxsave" boot parameter.

We could have checked cpu_has_osxsave along with cpu_has_avx etc, but Peter
mentioned clearing the feature bits will be better for uses like
static_cpu_has() etc.
Signed-off-by: NSuresh Siddha <suresh.b.siddha@intel.com>
Link: http://lkml.kernel.org/r/1343755754.2041.2.camel@sbsiddha-desk.sc.intel.com
Cc: <stable@vger.kernel.org>	# v3.5
Signed-off-by: NH. Peter Anvin <hpa@zytor.com>

c6fd893d

05 8月, 2012 1 次提交

KVM: x86: update KVM_SAVE_MSRS_BEGIN to correct value · 439793d4

由 Gleb Natapov 提交于 8月 01, 2012

When MSR_KVM_PV_EOI_EN was added to msrs_to_save array
KVM_SAVE_MSRS_BEGIN was not updated accordingly.
Signed-off-by: NGleb Natapov <gleb@redhat.com>
Signed-off-by: NMarcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: NAvi Kivity <avi@redhat.com>

439793d4

03 8月, 2012 1 次提交

ACPI: Only count valid srat memory structures · 095adbb6

由 Thomas Renninger 提交于 7月 31, 2012

Otherwise you could run into:
WARN_ON in numa_register_memblks(), because node_possible_map is zero

References: https://bugzilla.novell.com/show_bug.cgi?id=757888

On this machine (ProLiant ML570 G3) the SRAT table contains:
  - No processor affinities
  - One memory affinity structure (which is set disabled)

CC: Per Jessen <per@opensuse.org>
CC: Andi Kleen <andi@firstfloor.org>
Signed-off-by: NThomas Renninger <trenn@suse.de>
Signed-off-by: NLen Brown <len.brown@intel.com>

095adbb6

02 8月, 2012 5 次提交

xen/p2m: Reserve 8MB of _brk space for P2M leafs when populating back. · 5bc6f988

由 Konrad Rzeszutek Wilk 提交于 7月 30, 2012

When we release pages back during bootup:

Freeing  9d-100 pfn range: 99 pages freed
Freeing  9cf36-9d0d2 pfn range: 412 pages freed
Freeing  9f6bd-9f6bf pfn range: 2 pages freed
Freeing  9f714-9f7bf pfn range: 171 pages freed
Freeing  9f7e0-9f7ff pfn range: 31 pages freed
Freeing  9f800-100000 pfn range: 395264 pages freed
Released 395979 pages of unused memory

We then try to populate those pages back. In the P2M tree however
the space for those leafs must be reserved - as such we use extend_brk.
We reserve 8MB of _brk space, which means we can fit over
1048576 PFNs - which is more than we should ever need.

Without this, on certain compilation of the kernel we would hit:

(XEN) domain_crash_sync called from entry.S
(XEN) CPU:    0
(XEN) RIP:    e033:[<ffffffff818aad3b>]
(XEN) RFLAGS: 0000000000000206   EM: 1   CONTEXT: pv guest
(XEN) rax: ffffffff81a7c000   rbx: 000000000000003d   rcx: 0000000000001000
(XEN) rdx: ffffffff81a7b000   rsi: 0000000000001000   rdi: 0000000000001000
(XEN) rbp: ffffffff81801cd8   rsp: ffffffff81801c98   r8:  0000000000100000
(XEN) r9:  ffffffff81a7a000   r10: 0000000000000001   r11: 0000000000000003
(XEN) r12: 0000000000000004   r13: 0000000000000004   r14: 000000000000003d
(XEN) r15: 00000000000001e8   cr0: 000000008005003b   cr4: 00000000000006f0
(XEN) cr3: 0000000125803000   cr2: 0000000000000000
(XEN) ds: 0000   es: 0000   fs: 0000   gs: 0000   ss: e02b   cs: e033
(XEN) Guest stack trace from rsp=ffffffff81801c98:

.. which is extend_brk hitting a BUG_ON.

Interestingly enough, most of the time we are not going to hit this
b/c the _brk space is quite large (v3.5):
 ffffffff81a25000 B __brk_base
 ffffffff81e43000 B __brk_limit
= ~4MB.

vs earlier kernels (with this back-ported), the space is smaller:
 ffffffff81a25000 B __brk_base
 ffffffff81a7b000 B __brk_limit
= 344 kBytes.

where we would certainly hit this and hit extend_brk.

Note that git commit c3d93f88
(xen: populate correct number of pages when across mem boundary (v2))
exposed this bug).

[v1: Made it 8MB of _brk space instead of 4MB per Jan's suggestion]

CC: stable@vger.kernel.org #only for 3.5
Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>

5bc6f988

KVM: VMX: Fix ds/es corruption on i386 with preemption · aa67f609

由 Avi Kivity 提交于 8月 01, 2012

Commit b2da15ac ("KVM: VMX: Optimize %ds, %es reload") broke i386
in the following scenario:

  vcpu_load
  ...
  vmx_save_host_state
  vmx_vcpu_run
  (ds.rpl, es.rpl cleared by hardware)

  interrupt
    push ds, es  # pushes bad ds, es
    schedule
      vmx_vcpu_put
        vmx_load_host_state
          reload ds, es (with __USER_DS)
    pop ds, es  # of other thread's stack
    iret
  # other thread runs
  interrupt
    push ds, es
    schedule  # back in vcpu thread
    pop ds, es  # now with rpl=0
    iret
  ...
  vcpu_put
  resume_userspace
  iret  # clears ds, es due to mismatched rpl

(instead of resume_userspace, we might return with SYSEXIT and then
take an exception; when the exception IRETs we end up with cleared
ds, es)

Fix by avoiding the optimization on i386 and reloading ds, es on the
lightweight exit path.
Reported-by: NChris Clayron <chris2553@googlemail.com>
Signed-off-by: NAvi Kivity <avi@redhat.com>
Signed-off-by: NMarcelo Tosatti <mtosatti@redhat.com>

aa67f609

x86-64, kcmp: The kcmp system call can be common · eaf4ce6c

由 H. Peter Anvin 提交于 8月 01, 2012

We already use the same system call handler for i386 and x86-64, there
is absolutely no reason x32 can't use the same system call, too.
Signed-off-by: NH. Peter Anvin <hpa@linux.intel.com>
Cc: H.J. Lu <hjl.tools@gmail.com>
Cc: Cyrill Gorcunov <gorcunov@openvz.org>
Cc: <stable@vger.kernel.org> v3.5
Link: http://lkml.kernel.org/n/tip-vwzk3qbcr3yjyxjg2j38vgy9@git.kernel.org

eaf4ce6c

A
um: switch UPT_SET_RETURN_VALUE and regs_return_value to pt_regs · a3170d2e
由 Al Viro 提交于 5月 22, 2012
```
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
Signed-off-by: NRichard Weinberger <richard@nod.at>
```
a3170d2e

KVM: x86: apply kvmclock offset to guest wall clock time · 4b648665

由 Bruce Rogers 提交于 7月 20, 2012

When a guest migrates to a new host, the system time difference from the
previous host is used in the updates to the kvmclock system time visible
to the guest, resulting in a continuation of correct kvmclock based guest
timekeeping.

The wall clock component of the kvmclock provided time is currently not
updated with this same time offset. Since the Linux guest caches the
wall clock based time, this discrepency is not noticed until the guest is
rebooted. After reboot the guest's time calculations are off.

This patch adjusts the wall clock by the kvmclock_offset, resulting in
correct guest time after a reboot.

Cc: Zachary Amsden <zamsden@gmail.com>
Signed-off-by: NBruce Rogers <brogers@suse.com>
Signed-off-by: NMarcelo Tosatti <mtosatti@redhat.com>

4b648665

01 8月, 2012 5 次提交

x86: OLPC: move s/r-related EC cmds to EC driver · 1fcfd08b

由 Andres Salomon 提交于 7月 17, 2012

The new EC driver calls platform-specific suspend and resume hooks; run
XO-1-specific EC commands from there, rather than deep in s/r code.  If we
attempt to run EC commands after the new EC driver has suspended, it is
refused by the ec->suspended checks.
Signed-off-by: NAndres Salomon <dilinger@queued.net>
Acked-by: NPaul Fox <pgf@laptop.org>
Reviewed-by: NThomas Gleixner <tglx@linutronix.de>

1fcfd08b

Platform: OLPC: move debugfs support from x86 EC driver · 6cca83d4

由 Andres Salomon 提交于 7月 12, 2012

There's nothing about the debugfs interface for the EC driver that is
architecture-specific, so move it into the arch-independent driver.

The code is mostly unchanged with the exception of renamed variables, coding
style changes, and API updates.
Signed-off-by: NAndres Salomon <dilinger@queued.net>
Acked-by: NPaul Fox <pgf@laptop.org>
Reviewed-by: NThomas Gleixner <tglx@linutronix.de>

6cca83d4

x86: OLPC: switch over to using new EC driver on x86 · 85f90cf6

由 Andres Salomon 提交于 7月 12, 2012

This uses the new EC driver framework in drivers/platform/olpc.  The
XO-1 and XO-1.5-specific code is still in arch/x86, but the generic stuff
(including a new workqueue; no more running EC commands with IRQs disabled!)
can be shared with other architectures.
Signed-off-by: NAndres Salomon <dilinger@queued.net>
Acked-by: NPaul Fox <pgf@laptop.org>
Reviewed-by: NThomas Gleixner <tglx@linutronix.de>

85f90cf6

drivers: OLPC: update various drivers to include olpc-ec.h · 3bf9428f

由 Andres Salomon 提交于 7月 11, 2012

Switch over to using olpc-ec.h in multiple steps, so as not to break builds.
This covers every driver that calls olpc_ec_cmd().
Signed-off-by: NAndres Salomon <dilinger@queued.net>
Acked-by: NPaul Fox <pgf@laptop.org>
Reviewed-by: NThomas Gleixner <tglx@linutronix.de>

3bf9428f

Platform: OLPC: add a stub to drivers/platform/ for the OLPC EC driver · 392a325c

由 Andres Salomon 提交于 7月 10, 2012

The OLPC EC driver has outgrown arch/x86/platform/.  It's time to both
share common code amongst different architectures, as well as move it out
of arch/x86/.  The XO-1.75 is ARM-based, and the EC driver shares a lot of
code with the x86 code.
Signed-off-by: NAndres Salomon <dilinger@queued.net>
Acked-by: NPaul Fox <pgf@laptop.org>
Reviewed-by: NThomas Gleixner <tglx@linutronix.de>

392a325c

31 7月, 2012 5 次提交

perf/x86: Fix USER/KERNEL tagging of samples properly · d07bdfd3

由 Peter Zijlstra 提交于 7月 10, 2012

Some PMUs don't provide a full register set for their sample,
specifically 'advanced' PMUs like AMD IBS and Intel PEBS which provide
'better' than regular interrupt accuracy.

In this case we use the interrupt regs as basis and over-write some
fields (typically IP) with different information.

The perf core however uses user_mode() to distinguish user/kernel
samples, user_mode() relies on regs->cs. If the interrupt skid pushed
us over a boundary the new IP might not be in the same domain as the
interrupt.

Commit ce5c1fe9 ("perf/x86: Fix USER/KERNEL tagging of samples")
tried to fix this by making the perf core use kernel_ip(). This
however is wrong (TM), as pointed out by Linus, since it doesn't allow
for VM86 and non-zero based segments in IA32 mode.

Therefore, provide a new helper to set the regs->ip field,
set_linear_ip(), which massages the regs into a suitable state
assuming the provided IP is in fact a linear address.

Also modify perf_instruction_pointer() and perf_callchain_user() to
deal with segments base offsets.
Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1341910954.3462.102.camel@twinsSigned-off-by: NIngo Molnar <mingo@kernel.org>

d07bdfd3

perf/x86/intel/uncore: Make UNCORE_PMU_HRTIMER_INTERVAL 64-bit · 7740dfc0

由 Andrew Morton 提交于 7月 21, 2012

i386 allmodconfig:

arch/x86/kernel/cpu/perf_event_intel_uncore.c: In function 'uncore_pmu_hrtimer':
arch/x86/kernel/cpu/perf_event_intel_uncore.c:728: warning: integer overflow in expression
arch/x86/kernel/cpu/perf_event_intel_uncore.c: In function 'uncore_pmu_start_hrtimer':
arch/x86/kernel/cpu/perf_event_intel_uncore.c:735: warning: integer overflow in expression
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Cc: Zheng Yan <zheng.z.yan@intel.com>
Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/n/tip-h84qlqj02zrojmxxybzmy9hi@git.kernel.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>

7740dfc0

ACPI/x86: revert 'x86, acpi: Call acpi_enter_sleep_state via an asmlinkage C... · 3b6961ba

由 Len Brown 提交于 7月 26, 2012

ACPI/x86: revert 'x86, acpi: Call acpi_enter_sleep_state via an asmlinkage C function from assembler'

cd74257b
patched up GTS/BFS -- a feature we want to remove.
So revert it (by hand, due to conflict in sleep.h)
to prepare for GTS/BFS removal.
Signed-off-by: NLen Brown <len.brown@intel.com>
Acked-by: NIngo Molnar <mingo@kernel.org>
Acked-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>

3b6961ba

ipc: use Kconfig options for __ARCH_WANT_[COMPAT_]IPC_PARSE_VERSION · c1d7e01d

由 Will Deacon 提交于 7月 30, 2012

Rather than #define the options manually in the architecture code, add
Kconfig options for them and select them there instead.  This also allows
us to select the compat IPC version parsing automatically for platforms
using the old compat IPC interface.
Reported-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NWill Deacon <will.deacon@arm.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Chris Metcalf <cmetcalf@tilera.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

c1d7e01d

firmware_map: make firmware_map_add_early() argument consistent with firmware_map_add_hotplug() · 4ed940d4

由 Yasuaki Ishimatsu 提交于 7月 30, 2012

There are two ways to create /sys/firmware/memmap/X sysfs:

  - firmware_map_add_early
    When the system starts, it is calledd from e820_reserve_resources()
  - firmware_map_add_hotplug
    When the memory is hot plugged, it is called from add_memory()

But these functions are called without unifying value of end argument as
below:

  - end argument of firmware_map_add_early()   : start + size - 1
  - end argument of firmware_map_add_hogplug() : start + size

The patch unifies them to "start + size".  Even if applying the patch,
/sys/firmware/memmap/X/end file content does not change.

[akpm@linux-foundation.org: clarify comments]
Signed-off-by: NYasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Reviewed-by: NDave Hansen <dave@linux.vnet.ibm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Tejun Heo <tj@kernel.org>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

4ed940d4