Commits · 7a7546b377bdaa25ac77f33d9433c59f259b9688 · openanolis / cloud-kernel

25 Jan, 2012 1 commit

x86: xen: size struct xen_spinlock to always fit in arch_spinlock_t · 7a7546b3

Authored by David Vrabel on Jan 23, 2012

If NR_CPUS < 256 then arch_spinlock_t is only 16 bits wide but struct
xen_spinlock is 32 bits.  When a spin lock is contended and
xl->spinners is modified the two bytes immediately after the spin lock
would be corrupted.

This is a regression caused by 84eb950d
(x86, ticketlock: Clean up types and accessors) which reduced the size
of arch_spinlock_t.

Fix this by making xl->spinners a u8 if NR_CPUS < 256.  A
BUILD_BUG_ON() is also added to check the sizes of the two structures
are compatible.

In many cases this was not noticeable as there would often be padding
bytes after the lock (e.g., if any of CONFIG_GENERIC_LOCKBREAK,
CONFIG_DEBUG_SPINLOCK, or CONFIG_DEBUG_LOCK_ALLOC were enabled).

The bnx2 driver is affected. In struct bnx2, phy_lock and
indirect_lock may have no padding after them.  Contention on phy_lock
would corrupt indirect_lock making it appear locked and the driver
would deadlock.
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Signed-off-by: Jeremy Fitzhardinge <jeremy@goop.org>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
CC: stable@kernel.org #only 3.2
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>

7a7546b3
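
A minimal sketch of the size constraint this fix enforces, written as standalone C rather than kernel code. All `demo_` names and `DEMO_NR_CPUS` are hypothetical stand-ins, the struct layouts are simplified, and C11 `_Static_assert` plays the role of `BUILD_BUG_ON()`.

```c
#include <stdint.h>

/* Hypothetical stand-in for the kernel's NR_CPUS setting. */
#define DEMO_NR_CPUS 64

#if DEMO_NR_CPUS < 256
typedef uint8_t demo_ticket_t;     /* 2 x 8 bits: the lock is 16 bits wide */
typedef uint8_t demo_spinners_t;   /* the spinner count has to shrink too */
#else
typedef uint16_t demo_ticket_t;    /* 2 x 16 bits: the lock is 32 bits wide */
typedef uint16_t demo_spinners_t;
#endif

/* Simplified model of the ticket lock. */
typedef struct {
    demo_ticket_t head;
    demo_ticket_t tail;
} demo_arch_spinlock_t;

/* Simplified model of the Xen view overlaid on the same memory. */
struct demo_xen_spinlock {
    unsigned char lock;          /* 0 = free, 1 = held */
    demo_spinners_t spinners;    /* number of CPUs waiting */
};

/* Compile-time check, standing in for BUILD_BUG_ON(). */
_Static_assert(sizeof(struct demo_xen_spinlock) <= sizeof(demo_arch_spinlock_t),
               "Xen spinlock view must fit inside the arch spinlock");

/* Number of bytes by which the Xen view overruns the arch lock (<= 0 is safe). */
long demo_overrun_bytes(void)
{
    return (long)sizeof(struct demo_xen_spinlock) -
           (long)sizeof(demo_arch_spinlock_t);
}
```

With `DEMO_NR_CPUS` below 256 both types are two bytes wide. If `demo_spinners_t` were left at 16 bits in that configuration, the Xen view would grow to four bytes and the static assertion would fail at build time, which corresponds to the two-byte overrun the commit message describes.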

10 Jan, 2012 1 commit

xen/mmu: Fix compile errors introduced by x86/memblock mismerge. · dc6821e0

Authored by Konrad Rzeszutek Wilk on Jan 07, 2012

The git commit d4bbf7e7
"Merge branch 'master' into x86/memblock" mismerged the 32-bit
section causing:

arch/x86/xen/mmu.c: In function ‘xen_setup_kernel_pagetable’:
arch/x86/xen/mmu.c:1855: error: expected ‘;’ before ‘)’ token
arch/x86/xen/mmu.c:1855: error: expected statement before ‘)’ token
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>

dc6821e0

22 Nov, 2011 1 commit

fix braino in um patchset (mea culpa) · cc11f9ed

Authored by Al Viro on Nov 21, 2011

wrong register returned...
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

cc11f9ed

20 Nov, 2011 1 commit

KVM guest: prevent tracing recursion with kvmclock · 95ef1e52

Authored by Avi Kivity on Nov 15, 2011

Prevent tracing of preempt_disable() in get_cpu_var() in
kvm_clock_read(). When CONFIG_DEBUG_PREEMPT is enabled,
preempt_disable/enable() are traced and this causes the function_graph
tracer to go into an infinite recursion. By open coding the
preempt_disable() around the get_cpu_var(), we can use the notrace
version which prevents preempt_disable/enable() from being traced and
prevents the recursion.

Based on a similar patch for Xen from Jeremy Fitzhardinge.
Tested-by: Gleb Natapov <gleb@redhat.com>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Avi Kivity <avi@redhat.com>

95ef1e52

17 Nov, 2011 5 commits

KVM: VMX: Check for automatic switch msr table overflow · e7fc6f93

Authored by Gleb Natapov on Oct 05, 2011

Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

e7fc6f93

KVM: VMX: Add support for guest/host-only profiling · d7cd9796

Authored by Gleb Natapov on Oct 05, 2011

Support guest/host-only profiling by switching perf MSRs on
guest entry if needed.
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

d7cd9796

KVM: VMX: add support for switching of PERF_GLOBAL_CTRL · 8bf00a52

Authored by Gleb Natapov on Oct 05, 2011

Some CPUs have special support for switching the PERF_GLOBAL_CTRL MSR.
Add logic to detect if such support exists and works properly, and extend
the MSR switching code to use it if available. Also extend the number of
generic MSR switching entries to 8.
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

8bf00a52

xen:pvhvm: enable PVHVM VCPU placement when using more than 32 CPUs. · 90d4f553

Authored by Zhenzhong Duan on Oct 27, 2011

A PVHVM guest running with more than 32 VCPUs and pv_irq/pv_time enabled
needs VCPU placement to work, or else it will soft-lockup.

CC: stable@kernel.org
Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@oracle.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>

90d4f553

xen: map foreign pages for shared rings by updating the PTEs directly · cd12909c

Authored by David Vrabel on Sep 29, 2011

When mapping a foreign page with xenbus_map_ring_valloc() with the
GNTTABOP_map_grant_ref hypercall, set the GNTMAP_contains_pte flag and
pass a pointer to the PTE (in init_mm).

After the page is mapped, the usual fault mechanism can be used to
update additional MMs.  This allows the vmalloc_sync_all() to be
removed from alloc_vm_area().
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Acked-by: Andrew Morton <akpm@linux-foundation.org>
[v1: Squashed fix by Michal for no-mmu case]
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Michal Simek <monstr@monstr.eu>

cd12909c

14 Nov, 2011 1 commit

x86: Call stop_machine_text_poke() on all CPUs · 78345d2e

Authored by Rabin Vincent on Oct 27, 2011

It appears that stop_machine_text_poke() wants to be called on all CPUs,
like it's done from text_poke_smp().  Fix text_poke_smp_batch() to do
this.
Signed-off-by: Rabin Vincent <rabin@rab.in>
Acked-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Jason Baron <jbaron@redhat.com>
Link: http://lkml.kernel.org/r/1319702072-32676-1-git-send-email-rabin@rab.in
Signed-off-by: Ingo Molnar <mingo@elte.hu>

78345d2e

12 Nov, 2011 3 commits

bma023: Add SFI translation for this device · 9f80d8b6

Authored by William Douglas on Nov 10, 2011

This needed the sfi IRQ 0xFF fix to go in first. It simply plumbs in the
bma023 driver with the firmware naming of it.
Signed-off-by: William Douglas <william.douglas@intel.com>
Signed-off-by: Alan Cox <alan@linux.intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

9f80d8b6

vrtc: change its year offset from 1960 to 1972 · 57e6319d

Authored by Feng Tang on Nov 10, 2011

Real world year equals the value in vrtc YEAR register plus an offset.
We used 1960 as the offset to make leap year consistent, but for a
device's first use, its YEAR register is 0 and the system year will
be parsed as 1960 which is not a valid UNIX time and will cause many
applications to fail mysteriously. So we use 1972 instead to fix this
issue.

Updated patch which adds a sanity check suggested by Mathias

This isn't a change in behaviour for systems, because 1972 is the one we
actually use. It's the old version in upstream which is out of sync with
all devices.
Signed-off-by: Feng Tang <feng.tang@intel.com>
Signed-off-by: Alan Cox <alan@linux.intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

57e6319d
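
The arithmetic behind the offset change can be sketched as follows. The function names and the 1970 validity test are illustrative assumptions; the sanity check added by the patch is not shown in this log and is not reproduced here.

```c
#include <stdbool.h>

#define DEMO_OLD_YEAR_OFFSET 1960   /* offset before the change */
#define DEMO_NEW_YEAR_OFFSET 1972   /* offset after the change */

/* Real-world year = value of the vrtc YEAR register + offset. */
int demo_vrtc_year(unsigned int year_reg, int offset)
{
    return (int)year_reg + offset;
}

/* UNIX time starts at 1970-01-01, so earlier years are not representable
 * as a non-negative timestamp. */
bool demo_year_is_valid_unix(int year)
{
    return year >= 1970;
}

/* Gregorian leap-year rule, to show both offsets keep leap years aligned. */
bool demo_is_leap(int year)
{
    return (year % 4 == 0 && year % 100 != 0) || year % 400 == 0;
}
```

On first use the YEAR register reads 0, so `demo_vrtc_year(0, DEMO_OLD_YEAR_OFFSET)` gives 1960, which fails the validity test, while `demo_vrtc_year(0, DEMO_NEW_YEAR_OFFSET)` gives 1972, which passes. Both 1960 and 1972 are leap years, so the register's four-year leap cycle stays aligned with the calendar under either offset.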

ce4100: fix a build error · f2ee4421

Authored by Zhang Rui on Nov 10, 2011

Fix a build error. Building CE4100 without serial support fails because
the alternate function is only a prototype, not a null function as intended.
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Alan Cox <alan@linux.intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

f2ee4421

11 Nov, 2011 1 commit

x86, ioapic: Only print ioapic debug information for IRQs belonging to an ioapic chip · 6fd36ba0

Authored by Mathias Nyman on Nov 10, 2011

With "apic=verbose" the print_IO_APIC() function tries to print
IRQ to pin mappings for every active irq. It assumes chip_data
is of type irq_cfg and may cause an oops if not.

As the print_IO_APIC() is called from a late_initcall other
chained irq chips may already be registered with custom
chip_data information, causing an oops. This is the case with
Intel MID SoC devices with gpio demuxers registered as irq_chips.
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Signed-off-by: Alan Cox <alan@linux.intel.com>
[ -v2: fixed build failure ]
Signed-off-by: Ingo Molnar <mingo@elte.hu>

6fd36ba0
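
A sketch of the guard this fix describes: only interpret `chip_data` as IO-APIC data when the IRQ actually belongs to the IO-APIC chip. The `demo_` types are hypothetical, simplified stand-ins and do not reproduce the kernel's IRQ descriptor structures.

```c
#include <stddef.h>

struct demo_irq_chip { const char *name; };

struct demo_irq_cfg { int vector; };         /* IO-APIC private data */
struct demo_gpio_data { char label[4]; };    /* some other chip's private data */

struct demo_irq_desc {
    const struct demo_irq_chip *chip;
    void *chip_data;    /* actual type depends on which chip owns the IRQ */
};

const struct demo_irq_chip demo_ioapic_chip = { "IO-APIC" };
const struct demo_irq_chip demo_gpio_chip   = { "GPIO-demux" };

/* Return the IO-APIC config for an IRQ, or NULL if another chip owns it. */
const struct demo_irq_cfg *demo_ioapic_cfg(const struct demo_irq_desc *desc)
{
    if (desc->chip != &demo_ioapic_chip)
        return NULL;    /* chip_data is not a demo_irq_cfg for this IRQ */
    return desc->chip_data;
}
```

A debug printer built on `demo_ioapic_cfg()` would skip any IRQ for which it returns `NULL` instead of dereferencing data of the wrong type.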

10 Nov, 2011 5 commits

x86/mrst: Avoid reporting wrong nmi status · 064a59b6

Authored by Jacob Pan on Nov 10, 2011

Moorestown/Medfield platform does not have port 0x61 to report
NMI status, nor does it have external NMI sources. The only NMI
sources are from lapic, as results of perf counter overflow or
IPI, e.g. NMI watchdog or spin lock debug.

Reading port 0x61 on Moorestown returns 0xff, which misleads NMI
handlers into reporting false critical errors such as memory parity
errors. The subsequent ioport access for NMI handling can also cause
undefined behavior on Moorestown.

This patch allows the kernel to process NMIs due to the watchdog or a
backtrace dump without unnecessary hangs.
Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
[hand applied]
Signed-off-by: Alan Cox <alan@linux.intel.com>

064a59b6

x86/mrst: Add support for Penwell clock calibration · 0a915326

Authored by Dirk Brandewie on Nov 10, 2011

Signed-off-by: Dirk Brandewie <dirk.brandewie@gmail.com>
Signed-off-by: Alan Cox <alan@linux.intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

0a915326

x86/apic: Allow use of lapic timer early calibration result · 1ade93ef

Authored by Jacob Pan on Nov 10, 2011

lapic timer calibration can be combined with tsc in platform
specific calibration functions. If such a calibration result is
obtained early, we can skip the redundant calibration loops.
Signed-off-by: Jacob Pan <jacob.jun.pan@intel.com>
Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
Signed-off-by: Alan Cox <alan@linux.intel.com>
Signed-off-by: Dirk Brandewie <dirk.brandewie@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

1ade93ef

x86/apic: Do not clear nr_irqs_gsi if no legacy irqs · bb84ac2d

Authored by Jacob Pan on Nov 10, 2011

nr_legacy_irqs is set in probe_nr_irqs_gsi, we should not clear
it after that. Otherwise, the result is that MSI irqs will be
allocated from the wrong range for the systems without legacy
PIC.
Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
Signed-off-by: Dirk Brandewie <dirk.brandewie@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

bb84ac2d

x86/platform: Add a wallclock_init func to x86_platforms ops · cf8ff6b6

Authored by Feng Tang on Nov 10, 2011

Some wall clock devices use MMIO based HW registers. This new
function gives them a chance to do some initialization work
before their get/set_time services get called.
Signed-off-by: Feng Tang <feng.tang@intel.com>
Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
Signed-off-by: Alan Cox <alan@linux.intel.com>
Signed-off-by: Dirk Brandewie <dirk.brandewie@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

cf8ff6b6

08 Nov, 2011 1 commit

x86/mce: Make mce_chrdev_ops 'static const' · 66f5ddf3

Authored by Luck, Tony on Nov 03, 2011

Arjan would like to make struct file_operations const, but
mce-inject directly writes to the mce_chrdev_ops to install its
write handler. In an ideal world mce-inject would have its own
character device, but we have a sizable legacy of test scripts
that hardwire "/dev/mcelog", so it would be painful to switch to
a separate device now. Instead, this patch switches to a stub
function in the mce code, with a registration helper that
mce-inject can call when it is loaded.

Note that this would also allow for a sane process to allow
mce-inject to be unloaded again (with an unregister function,
and appropriate module_{get,put}() calls), but that is left for
potential future patches.
Reported-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Link: http://lkml.kernel.org/r/4eb2e1971326651a3b@agluck-desktop.sc.intel.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>

66f5ddf3
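
The stub-plus-registration pattern described above might look roughly like this in standalone C. The `demo_` names are hypothetical, the write signature is simplified, and returning `-EINVAL` when no handler is registered is an assumption rather than something this log states.

```c
#include <errno.h>
#include <stddef.h>

typedef long (*demo_write_fn)(const char *buf, size_t len);

struct demo_file_operations {
    demo_write_fn write;
};

/* Handler installed at run time; NULL until the injector registers one. */
static demo_write_fn demo_injected_write;

/* Registration helper the injector module would call when it is loaded. */
void demo_register_write_cb(demo_write_fn fn)
{
    demo_injected_write = fn;
}

/* Stub: forward to the registered handler, otherwise reject the write. */
long demo_chrdev_write(const char *buf, size_t len)
{
    if (demo_injected_write)
        return demo_injected_write(buf, len);
    return -EINVAL;    /* assumed error code for "no handler registered" */
}

/* The table itself never changes after build time, so it can be const. */
const struct demo_file_operations demo_chrdev_ops = {
    .write = demo_chrdev_write,
};

/* Example handler an injector might register: reports all bytes consumed. */
long demo_inject_write(const char *buf, size_t len)
{
    (void)buf;
    return (long)len;
}
```

The mutable state moves out of the operations table into a single function pointer, so the table can live in read-only data while the injector still hooks writes to the same device node.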

07 Nov, 2011 1 commit

mrst pmu: update comment · 22f4521d

Authored by Len Brown on Aug 12, 2011

The comment referenced MeeGo in particular, but really means Linux in general.
Signed-off-by: Len Brown <len.brown@intel.com>

22f4521d

03 Nov, 2011 2 commits

thp: share get_huge_page_tail() · b35a35b5

Authored by Andrea Arcangeli on Nov 02, 2011

This avoids duplicating the function in every arch gup_fast.
Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Hugh Dickins <hughd@google.com>
Cc: Johannes Weiner <jweiner@redhat.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: David Gibson <david@gibson.dropbear.id.au>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: David Miller <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

b35a35b5

mm: thp: tail page refcounting fix · 70b50f94

Authored by Andrea Arcangeli on Nov 02, 2011

Michel, while working on the working set estimation code, noticed that
calling get_page_unless_zero() on a random pfn_to_page(random_pfn)
wasn't safe, if the pfn ended up being a tail page of a transparent
hugepage under splitting by __split_huge_page_refcount().

He then found the problem could also theoretically materialize with
page_cache_get_speculative() during the speculative radix tree lookups
that uses get_page_unless_zero() in SMP if the radix tree page is freed
and reallocated and get_user_pages is called on it before
page_cache_get_speculative has a chance to call get_page_unless_zero().

So the best way to fix the problem is to keep page_tail->_count zero at
all times.  This will guarantee that get_page_unless_zero() can never
succeed on any tail page.  page_tail->_mapcount is guaranteed zero and
is unused for all tail pages of a compound page, so we can simply
account the tail page references there and transfer them to
tail_page->_count in __split_huge_page_refcount() (in addition to the
head_page->_mapcount).

While debugging this s/_count/_mapcount/ change I also noticed get_page is
called by direct-io.c on pages returned by get_user_pages.  That wasn't
entirely safe because the two atomic_inc calls in get_page weren't atomic
as a pair.  By contrast, other get_user_pages users, like the secondary-MMU
page fault that establishes the shadow pagetables, would never call any
superfluous get_page after get_user_pages returns.  It's safer to make
get_page universally safe for tail pages and to use get_page_foll() within
follow_page (inside get_user_pages()).  get_page_foll() is safe to do the
refcounting for tail pages without taking any locks because it is run
within PT lock protected critical sections (PT lock for pte and
page_table_lock for pmd_trans_huge).

The standard get_page() as invoked by direct-io instead will now take
the compound_lock but still only for tail pages.  The direct-io paths
are usually I/O bound and the compound_lock is per THP so very
fine-grained, so there's no risk of scalability issues with it.  A simple
direct-io benchmark with all lockdep prove locking and spinlock
debugging infrastructure enabled shows identical performance and no
overhead.  So it's worth it.  Ideally direct-io should stop calling
get_page() on pages returned by get_user_pages().  The spinlock in
get_page() is already optimized away for no-THP builds but doing
get_page() on tail pages returned by GUP is generally a rare operation
and usually only run in I/O paths.

This new refcounting on page_tail->_mapcount in addition to avoiding new
RCU critical sections will also allow the working set estimation code to
work without any further complexity associated to the tail page
refcounting with THP.
Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
Reported-by: Michel Lespinasse <walken@google.com>
Reviewed-by: Michel Lespinasse <walken@google.com>
Reviewed-by: Minchan Kim <minchan.kim@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Hugh Dickins <hughd@google.com>
Cc: Johannes Weiner <jweiner@redhat.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: David Gibson <david@gibson.dropbear.id.au>
Cc: <stable@kernel.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

70b50f94
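
A heavily simplified, single-threaded model of the accounting scheme described above. It assumes plain integers instead of atomics, ignores the real bias of `_mapcount`, and models only pin references; the `demo_` names and the `tail_refs` field (standing in for the tail page's reuse of `_mapcount`) are hypothetical.

```c
#include <stdbool.h>

struct demo_page {
    int count;        /* models page->_count */
    int tail_refs;    /* models references accounted in the tail's _mapcount */
    bool is_tail;
};

/* Take a reference only if the count is already non-zero. */
bool demo_get_page_unless_zero(struct demo_page *p)
{
    if (p->count == 0)
        return false;
    p->count++;
    return true;
}

/* Pin a tail page: the tail's own count stays zero; the reference is
 * recorded on the head and in the tail's side counter. */
void demo_get_tail_page(struct demo_page *head, struct demo_page *tail)
{
    head->count++;
    tail->tail_refs++;
}

/* On split, transfer the accumulated tail references to the tail's count. */
void demo_split(struct demo_page *head, struct demo_page *tail)
{
    head->count -= tail->tail_refs;
    tail->count += tail->tail_refs;
    tail->tail_refs = 0;
    tail->is_tail = false;
}
```

Because a tail page's `count` is zero for as long as it is a tail, `demo_get_page_unless_zero()` can never succeed on it, which is the guarantee the commit message relies on. Only after `demo_split()` does the former tail page carry its own non-zero count.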

02 Nov, 2011 17 commits

um: Fix kmalloc argument order in um/vdso/vma.c · 0d65ede0

Authored by Dave Jones on Oct 24, 2011

kmalloc's size is the first argument, not the second.
Signed-off-by: Dave Jones <davej@redhat.com>
Signed-off-by: Richard Weinberger <richard@nod.at>

Cc: <stable@kernel.org> # 3.0.x
[richard@nod.at: on 3.0 the to be patched file is
arch/um/sys-x86_64/vdso/vma.c]

0d65ede0
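
A userspace stand-in showing why the argument order matters. `demo_kmalloc()`, `demo_gfp_t`, and the value of `DEMO_GFP_KERNEL` are hypothetical; only the order (size first, flags second) mirrors `kmalloc()`. Both parameters are integer types, so a swapped call compiles without complaint and silently requests the wrong number of bytes.

```c
#include <stddef.h>
#include <stdlib.h>

typedef unsigned int demo_gfp_t;
#define DEMO_GFP_KERNEL 0xd0u    /* arbitrary illustrative flag value */

/* Records the size most recently requested, for inspection. */
size_t demo_last_size;

/* Userspace stand-in using kmalloc()'s argument order: size, then flags. */
void *demo_kmalloc(size_t size, demo_gfp_t flags)
{
    (void)flags;                 /* flags have no meaning in this model */
    demo_last_size = size;
    return malloc(size);
}
```

Calling `demo_kmalloc(sizeof(void *) * 4, DEMO_GFP_KERNEL)` requests the intended size, whereas `demo_kmalloc(DEMO_GFP_KERNEL, sizeof(void *) * 4)` requests `DEMO_GFP_KERNEL` bytes and passes the size as flags.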

um: we need sys/user.h only on i386 · 38b64aed

Authored by Richard Weinberger on Aug 18, 2011

Signed-off-by: Richard Weinberger <richard@nod.at>

38b64aed

um: merge delay_{32,64}.c · d0af6cbf

Authored by Richard Weinberger on Aug 18, 2011

Signed-off-by: Richard Weinberger <richard@nod.at>

d0af6cbf

um: kill system-um.h · a34978cb

Authored by Al Viro on Aug 18, 2011

most of it belonged in irqflags.h, actually
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Richard Weinberger <richard@nod.at>

a34978cb

um: segment.h is x86-only and needed only there · 46ecca8a

Authored by Al Viro on Aug 18, 2011

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Richard Weinberger <richard@nod.at>

46ecca8a

um: unify ptrace_user.h · 966e803a

Authored by Al Viro on Aug 18, 2011

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Richard Weinberger <richard@nod.at>

966e803a

um: unify KSTK_... · a10c95d8

Authored by Al Viro on Aug 18, 2011

... and switch get_thread_register() to HOST_... for register numbers
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Richard Weinberger <richard@nod.at>

a10c95d8

um: fix gcov build breakage · 4d211093

Authored by Al Viro on Aug 18, 2011

a) exports in gmon_syms.c duplicate kernel/gcov/* ones
b) excluding -pg in vdso compile is not enough - -fprofile-arcs
and -ftest-coverage also need to be excluded
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Richard Weinberger <richard@nod.at>

4d211093

um: irq_vectors.h just shadows x86 one · 3fb77d72

Authored by Al Viro on Aug 18, 2011

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Richard Weinberger <richard@nod.at>

3fb77d72

um: required-features.h is there only to shadow x86 one... · ff9586e9

Authored by Al Viro on Aug 18, 2011

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Richard Weinberger <richard@nod.at>

ff9586e9

um: asm/apic.h is there only to shadow the x86 one... · 8807c1d5

Authored by Al Viro on Aug 18, 2011

... so take it to arch/um/x86/asm.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Richard Weinberger <richard@nod.at>

8807c1d5

um: take ldt.h to arch/x86/um/asm/mm_context.h · b3ee571e

Authored by Al Viro on Aug 18, 2011

it's x86-only and we have no business playing with it in asm/mmu.h; make
the latter have
	struct uml_arch_mm_context arch;
instead of
	struct uml_ldt ldt;
and let arch/<subarch>/um/asm/mm_context.h decide what'll be in there.
While we are at it, kill host_ldt.h - it's not needed in part of places
that include it (we want asm/ldt.h in those) and it can be trivially
expanded into the single remaining one.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Richard Weinberger <richard@nod.at>

b3ee571e

um: merge signal_{32,64}.c · f67aa2ff

Authored by Al Viro on Aug 18, 2011

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Richard Weinberger <richard@nod.at>

f67aa2ff

um: no need to play with save_sp in signal frame setup anymore · fbe98686

Authored by Al Viro on Aug 18, 2011

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Richard Weinberger <richard@nod.at>

fbe98686

um: increase stack growth cushion in pagefault · c7ea591c

Authored by Al Viro on Aug 18, 2011

analog of [PATCH] i386: let usermode execute the "enter" instruction from
circa 2006.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Richard Weinberger <richard@nod.at>

c7ea591c

um: merge HOST_... of registers common on i386 and amd64 · 3579a389

Authored by Al Viro on Aug 18, 2011

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Richard Weinberger <richard@nod.at>

3579a389

um: sanitize paths in sys_call_table* includes · 8edc4147

Authored by Al Viro on Aug 18, 2011

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Richard Weinberger <richard@nod.at>

8edc4147
