1. May 2, 2011 (15 commits)
    • x86-32, NUMA: implement temporary NUMA init shims · b0d31080
      Tejun Heo committed
      To help the transition to common NUMA init, implement temporary 32bit
      shims for numa_add_memblk() and numa_set_distance().
      numa_add_memblk() registers the memblk and adjusts
      node_start/end_pfn[].  numa_set_distance() is a noop.
      
      These shims will allow using the 64bit NUMA init functions on 32bit
      and a gradual transition to the common NUMA init path.

      For a detailed description, please read the descriptions of the
      commits which make use of the shim functions.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
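
      As an illustration of the shims described above, a minimal sketch
      (hypothetical and simplified -- the bodies are assumptions; only
      numa_add_memblk()/numa_set_distance() and node_start/end_pfn[] come
      from the commit text):

          /* Register a memblk for @nid and widen the node's pfn range.
           * Assumes node_start_pfn[] was pre-initialized to "empty". */
          int __init numa_add_memblk(int nid, u64 start, u64 end)
          {
                  unsigned long start_pfn = start >> PAGE_SHIFT;
                  unsigned long end_pfn = end >> PAGE_SHIFT;

                  if (start_pfn < node_start_pfn[nid])
                          node_start_pfn[nid] = start_pfn;
                  if (end_pfn > node_end_pfn[nid])
                          node_end_pfn[nid] = end_pfn;
                  return 0;
          }

          /* 32bit has no use for NUMA distances yet, hence a noop. */
          void __init numa_set_distance(int from, int to, int distance)
          {
          }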
    • x86, NUMA: Move numa_nodes_parsed to numa.[hc] · e6df595b
      Tejun Heo committed
      Move numa_nodes_parsed from numa_64.[hc] to numa.[hc] to prepare for
      NUMA init path unification.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
    • x86-32, NUMA: Move get_memcfg_numa() into numa_32.c · daf4f480
      Tejun Heo committed
      There's no reason for get_memcfg_numa() to be implemented inline in
      mmzone_32.h.  Move it to numa_32.c and also make
      get_memcfg_numa_flag() static.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
    • x86, NUMA: make srat.c 32bit safe · eca9ad31
      Tejun Heo committed
      Make srat.c 32bit safe by removing the assumption that unsigned long
      is 64bit.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
    • x86, NUMA: rename srat_64.c to srat.c · 7b2600f8
      Tejun Heo committed
      Rename srat_64.c to srat.c.  This is to prepare for unification of
      NUMA init paths between 32 and 64bit.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
    • x86, NUMA: trivial cleanups · 1201e10a
      Tejun Heo committed
      * Kill no longer used struct bootnode.
      
      * Kill dangling declaration of pxm_to_nid() in numa_32.h.
      
      * Make setup_node_bootmem() static.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
    • x86-32, NUMA: use sparse_memory_present_with_active_regions() · 797390d8
      Tejun Heo committed
      Instead of calling memory_present() for each region from NUMA init,
      call sparse_memory_present_with_active_regions() from paging_init()
      similarly to x86-64.
      
      For flat and numaq, this results in exactly the same memory_present()
      calls.  For srat, if there are multiple memory chunks for a node,
      after this change, memory_present() will be called separately for each
      chunk instead of being called once to encompass the whole range, which
      doesn't cause any harm and actually is the better behavior.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
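
      For reference, a sketch of the call site this describes, assuming the
      function keeps its usual signature (a node id, with MAX_NUMNODES
      meaning "all nodes"):

          void __init paging_init(void)
          {
                  /* ... zone sizing ... */

                  /* Replaces the per-region memory_present() calls that
                   * used to be made from the NUMA init paths: */
                  sparse_memory_present_with_active_regions(MAX_NUMNODES);

                  /* ... */
          }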
    • x86-32, NUMA: Make apic->x86_32_numa_cpu_node() optional · 84914ed0
      Tejun Heo committed
      NUMAQ is the only meaningful user of this callback and
      setup_local_APIC() the only callsite.  Stop torturing everyone else by
      making the callback optional and removing all the boilerplate
      implementations and assignments.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
    • x86, NUMA: Unify 32/64bit numa_cpu_node() implementation · 6bd26273
      Tejun Heo committed
      Currently, the only meaningful user of apic->x86_32_numa_cpu_node() is
      NUMAQ which returns valid mapping only after CPU is initialized during
      SMP bringup; thus, the previous patch to set apicid -> node in
      setup_local_APIC() makes __apicid_to_node[] always contain the correct
      mapping whether custom apic->x86_32_numa_cpu_node() is used or not.
      
      So, there is no reason to keep separate 32bit implementation.  We can
      always consult __apicid_to_node[].  Move 64bit implementation from
      numa_64.c to numa.c and remove 32bit implementation from numa_32.c.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
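
      A sketch of the unified implementation this describes (the per-cpu
      apicid lookup is an assumption; __apicid_to_node[] is from the
      commit text):

          int numa_cpu_node(int cpu)
          {
                  int apicid = early_per_cpu(x86_cpu_to_apicid, cpu);

                  /* Both 32 and 64bit now consult the shared table. */
                  if (apicid != BAD_APICID)
                          return __apicid_to_node[apicid];
                  return NUMA_NO_NODE;
          }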
    • x86-32, NUMA: Automatically set apicid -> node in setup_local_APIC() · c4b90c11
      Tejun Heo committed
      Some x86-32 NUMA implementations (NUMAQ) don't initialize the
      apicid -> node mapping using set_apicid_to_node() during NUMA init
      but implement a custom apic->x86_32_numa_cpu_node() instead.

      This patch automatically initializes the default apicid -> node
      mapping table from apic->x86_32_numa_cpu_node() in setup_local_APIC()
      such that the mapping table is in sync with the actual mapping.
      
      As the table isn't used by custom implementations, this doesn't make
      any difference at this point.  This is in preparation for unifying
      numa_cpu_node() between x86-32 and 64.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
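
      Combined with the companion patch above that made the callback
      optional (84914ed0), the resulting hunk in setup_local_APIC() would
      be shaped roughly like this (a sketch, not the verbatim diff):

          #ifdef CONFIG_X86_32
                  /* NUMAQ learns the cpu -> node mapping only once the
                   * CPU is up; mirror it into the shared table so the
                   * generic numa_cpu_node() sees the same mapping. */
                  if (apic->x86_32_numa_cpu_node)
                          set_apicid_to_node(early_per_cpu(x86_cpu_to_apicid, cpu),
                                             apic->x86_32_numa_cpu_node(cpu));
          #endif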
    • x86-64, NUMA: simplify nodedata allocation · acd26d61
      Tejun Heo committed
      With top-down memblock allocation, the allocation range limits in
      early_node_mem() can be simplified - try node-local first, then any
      node, but in any case don't allocate below the DMA limit.
      
      Remove early_node_mem() and implement simplified allocation directly
      in setup_node_bootmem().
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
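
      The simplified policy could be sketched as follows (illustrative
      only; the memblock call and its failure convention are assumptions
      about the API of that era, while the node-local-first,
      never-below-the-DMA-limit logic is from the commit text):

          /* Place the node's pg_data_t: prefer memory on the node
           * itself, else fall back to any node, but never allocate
           * below the DMA limit. */
          nd_pa = memblock_find_in_range(start, end, nd_size, SMP_CACHE_BYTES);
          if (!nd_pa)
                  nd_pa = memblock_find_in_range(MAX_DMA_PFN << PAGE_SHIFT,
                                                 MEMBLOCK_ALLOC_ACCESSIBLE,
                                                 nd_size, SMP_CACHE_BYTES);
          if (!nd_pa)
                  panic("Cannot allocate node data for node %d\n", nid);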
    • x86-64, NUMA: trivial cleanups for setup_node_bootmem() · ebe685f2
      Tejun Heo committed
      Make the following trivial changes in preparation for further updates.
      
      * nodeid -> nid, nid -> tnid
      * use nd_ prefix for nodedata related variables
      * remove start/end_pfn and use start/end directly
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
    • x86-64, NUMA: Simplify hotadd memory handling · 9688678a
      Tejun Heo committed
      The only special handling NUMA needs to do for hotadd memory is
      determining the node for the hotadd memory given its address, and
      nothing about that is specific to the config method used.
      
      srat_64.c does somewhat elaborate error checking on
      ACPI_SRAT_MEM_HOT_PLUGGABLE regions, remembers them, and implements
      memory_add_physaddr_to_nid(), which determines the node for a given
      hotadd address.
      
      This is almost completely redundant.  All the information is already
      available to the generic NUMA code, which already performs all the
      sanity checking and merging.  All that's necessary is to stop marking
      numa_meminfo __initdata and to provide a function which uses it to
      map an address to a node.
      
      Drop the specific implementation from srat_64.c and add generic
      memory_add_physaddr_to_nid() in numa_64.c, which is enabled if
      CONFIG_MEMORY_HOTPLUG is set.  Other than dropping the code, srat_64.c
      doesn't need any change as it already calls numa_add_memblk() for hot
      pluggable regions which is enough.
      
      While at it, change CONFIG_MEMORY_HOTPLUG_SPARSE in srat_64.c to
      CONFIG_MEMORY_HOTPLUG; for NUMA on x86-64, the two are always the
      same.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
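
      A generic memory_add_physaddr_to_nid() built on numa_meminfo can be
      as simple as this sketch (field names assume the start/end/nid
      per-memblk layout of struct numa_meminfo):

          #ifdef CONFIG_MEMORY_HOTPLUG
          int memory_add_physaddr_to_nid(u64 start)
          {
                  struct numa_meminfo *mi = &numa_meminfo;
                  int i;

                  /* Walk the parsed memblks; return the owning node. */
                  for (i = 0; i < mi->nr_blks; i++)
                          if (mi->blk[i].start <= start &&
                              start < mi->blk[i].end)
                                  return mi->blk[i].nid;
                  return 0;       /* default to node 0 if nothing matches */
          }
          #endif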
    • x86, NUMA: Fix empty memblk detection in numa_cleanup_meminfo() · 2be19102
      Yinghai Lu committed
      numa_cleanup_meminfo() trims each memblk between low (0) and
      high (max_pfn) limits and discards empty ones.  However, the
      emptiness detection incorrectly used an equality test.  If the
      start of a memblk is higher than max_pfn, the memblk is empty but
      fails the equality test and doesn't get discarded.
      
      The condition triggers when max_pfn is lower than start of a
      NUMA node and results in memory misconfiguration - leading to
      WARN_ON()s and other funnies.  The bug was discovered in devel
      branch where 32bit too uses this code path for NUMA init.  If a
      node is above the addressing limit, max_pfn ends up lower than
      the node triggering this problem.
      
      The failure hasn't been observed on x86-64 but is still possible
      with broken hardware e820/NUMA info.  As the fix is very low
      risk, it would be better to apply it even for 64bit.
      
      Fix it by using >= instead of ==.
      Signed-off-by: Yinghai Lu <yinghai@kernel.org>
      [ Extracted the actual fix from the original patch and rewrote the patch description. ]
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Link: http://lkml.kernel.org/r/20110501171204.GO29280@htj.dyndns.org
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
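
      The fix itself is a one-character change to the emptiness check; an
      illustrative before/after (the surrounding loop and helper are
      sketched from context):

          /* before: only an exactly-empty range was discarded */
          if (bi->start == bi->end)
                  numa_remove_memblk_from(i--, mi);

          /* after: a range clamped to start >= end is empty too */
          if (bi->start >= bi->end)
                  numa_remove_memblk_from(i--, mi);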
    • x86, AMD: Fix APIC timer erratum 400 affecting K8 Rev.A-E processors · e20a2d20
      Boris Ostrovsky committed
      Older AMD K8 processors (Revisions A-E) are affected by erratum
      400 (APIC timer interrupts don't occur in C-states greater than
      C1). This, for example, means that the X86_FEATURE_ARAT flag should
      not be set for these parts.

      This addresses a regression introduced by commit
      b87cf80a ("x86, AMD: Set ARAT
      feature on AMD processors"), where the system may become
      unresponsive until an external interrupt (such as keyboard input)
      occurs. This results, for example, in time not being reported
      correctly, lack of progress on the system, and other lockups.
      Reported-by: Joerg-Volker Peetz <jvpeetz@web.de>
      Tested-by: Joerg-Volker Peetz <jvpeetz@web.de>
      Acked-by: Borislav Petkov <borislav.petkov@amd.com>
      Signed-off-by: Boris Ostrovsky <Boris.Ostrovsky@amd.com>
      Cc: stable@kernel.org
      Link: http://lkml.kernel.org/r/1304113663-6586-1-git-send-email-ostr@amd64.org
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
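
      The shape of such a fix is to gate the feature flag on an erratum
      check; a hedged sketch (cpu_has_amd_erratum()/amd_erratum_400 were
      kernel helpers of this era, but the exact usage here is an
      assumption):

          /* Only advertise an always-running APIC timer when erratum
           * 400 (APIC timer dead in C-states > C1) does not apply. */
          if (!cpu_has_amd_erratum(amd_erratum_400))
                  set_cpu_cap(c, X86_FEATURE_ARAT);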
  2. Apr 29, 2011 (4 commits)
  3. Apr 28, 2011 (2 commits)
  4. Apr 27, 2011 (4 commits)
    • perf, x86, nmi: Move LVT un-masking into irq handlers · 2bce5dac
      Don Zickus committed
      It was noticed that P4 machines were generating double NMIs for
      each perf event.  These extra NMIs lead to 'Dazed and confused'
      messages on the screen.
      
      I tracked this down to a P4 quirk that said the overflow bit had
      to be cleared before re-enabling the apic LVT mask.  My first
      attempt was to move the un-masking inside the perf nmi handler
      from before the chipset NMI handler to after.
      
      This broke Nehalem boxes that seem to like the unmasking before
      the counters themselves are re-enabled.
      
      In order to keep this change simple for 2.6.39, I decided to
      simply move the apic LVT un-masking to the beginning of all
      the chipset NMI handlers, with the exception of Pentium4's, to
      fix the double NMI issue.
      
      Later on we can move the un-masking to later in the handlers to
      save a number of 'extra' NMIs on those particular chipsets.
      
      I tested this change on a P4 machine, an AMD machine, a Nehalem
      box, and a core2quad box.  'perf top' worked correctly along
      with various other small 'perf record' runs.  Anything high
      stress breaks all the machines but that is a different problem.
      
      Thanks to various people for testing different versions of this
      patch.
      Reported-and-tested-by: Shaun Ruffell <sruffell@digium.com>
      Signed-off-by: Don Zickus <dzickus@redhat.com>
      Cc: Cyrill Gorcunov <gorcunov@gmail.com>
      Link: http://lkml.kernel.org/r/1303900353-10242-1-git-send-email-dzickus@redhat.com
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
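
      The mechanical pattern being described is small; a sketch (the
      handler name is one example of "the chipset NMI handlers", the
      apic_write() un-mask is the documented idiom):

          static int intel_pmu_handle_irq(struct pt_regs *regs)
          {
                  /* Un-mask the perf-counter LVT entry up front on
                   * non-P4 chipsets; P4 must clear the overflow bit
                   * first or it raises a second, spurious NMI. */
                  apic_write(APIC_LVTPC, APIC_DM_NMI);

                  /* ... handle counter overflows ... */
          }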
    • m68k/mm: Set all online nodes in N_NORMAL_MEMORY · 4aac0b48
      Michael Schmitz committed
      For m68k, N_NORMAL_MEMORY represents all nodes that have present
      memory, since it does not support HIGHMEM.  This patch sets the bit
      at the time node_present_pages has been set by free_area_init_node.
      If it were done when the node is brought online instead, the node
      state would have to be set unconditionally, since information about
      present memory has not yet been recorded at that point.
      
      If N_NORMAL_MEMORY is not accurate, slub may encounter errors since it
      uses this nodemask to setup per-cache kmem_cache_node data structures.
      
      This patch is an alternative to the one proposed by David Rientjes
      <rientjes@google.com>, which attempted to set the node state
      immediately when bringing the node online.
      Signed-off-by: Michael Schmitz <schmitz@debian.org>
      Tested-by: Thorsten Glaser <tg@debian.org>
      Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: stable@kernel.org
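
      The core of such a fix is a single node-state update once present
      pages are known; a sketch using the generic nodemask API (the exact
      call site inside the m68k init path is assumed):

          /* free_area_init_node() has filled in node_present_pages by
           * now, so the state can be set precisely: */
          if (node_present_pages(nid))
                  node_set_state(nid, N_NORMAL_MEMORY);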
    • Revert wrong fixes for common misspellings · e9c54999
      Lucas De Marchi committed
      These changes were incorrectly "fixed" by codespell.  They have now
      been corrected manually.
      Signed-off-by: Lucas De Marchi <lucas.demarchi@profusion.mobi>
    • perf events, x86: Work around the Nehalem AAJ80 erratum · ec75a716
      Ingo Molnar committed
      On Nehalem CPUs the retired branch-misses event can be completely bogus
      when there are no branch misses occurring. When there are a lot of
      branch misses then the count is pretty accurate. Still, this leaves us
      with an event that over-counts a lot.

      Detect this erratum and work around it by using BR_MISP_EXEC.ANY events
      instead. These also count speculated branches, but in practice they are
      still a lot more precise than the architectural event.
      Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Link: http://lkml.kernel.org/n/tip-yyfg0bxo9jsqxd6a0ovfny27@git.kernel.org
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  5. Apr 26, 2011 (7 commits)
  6. Apr 22, 2011 (4 commits)
    • perf, x86: Update/fix Intel Nehalem cache events · f4929bd3
      Peter Zijlstra committed
      Change the Nehalem cache events to use retired memory instruction
      counters (similar to Westmere); this greatly improves the provided
      stats.
      Using:
      
      int main(void)
      {
              int i;

              for (i = 0; i < 1000000000; i++) {
                      asm("mov (%%rsp), %%rbx;"
                          "mov %%rbx, (%%rsp);" : : : "rbx");
              }
              return 0;
      }
      
      We find:
      
       $ perf stat --repeat 10 -e instructions:u -e l1-dcache-loads:u -e l1-dcache-stores:u ./loop_1b_loads+stores
        Performance counter stats for './loop_1b_loads+stores' (10 runs):
            4,000,081,056 instructions:u           #      0.000 IPC ( +-   0.000% )
            4,999,502,846 l1-dcache-loads:u          ( +-   0.008% )
            1,000,034,832 l1-dcache-stores:u         ( +-   0.000% )
               1.565184942  seconds time elapsed   ( +-   0.005% )
      
      The 5b is surprising - we'd expect 1b:
      
       $ perf stat --repeat 10 -e instructions:u -e r10b:u -e l1-dcache-stores:u ./loop_1b_loads+stores
        Performance counter stats for './loop_1b_loads+stores' (10 runs):
            4,000,081,054 instructions:u           #      0.000 IPC ( +-   0.000% )
            1,000,021,961 r10b:u                     ( +-   0.000% )
            1,000,030,951 l1-dcache-stores:u         ( +-   0.000% )
               1.565055422  seconds time elapsed   ( +-   0.003% )
      
      Which this patch thus fixes.
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Lin Ming <ming.m.lin@intel.com>
      Cc: Cyrill Gorcunov <gorcunov@openvz.org>
      Link: http://lkml.kernel.org/n/tip-q9rtru7b7840tws75xzboapv@git.kernel.org
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • perf, x86: P4 PMU - Don't forget to clear cpuc->active_mask on overflow · 1ea5a6af
      Cyrill Gorcunov committed
      It's not enough to simply disable the event on overflow; the
      cpuc->active_mask bit should be cleared as well, otherwise the
      counter may stall as "active" even though in reality it is already
      disabled (which potentially means the user may not be able to use
      this counter further).
      
      Don pointed out that:
      
       " I also noticed this patch fixed some unknown NMIs
         on a P4 when I stressed the box".
      Tested-by: Lin Ming <ming.m.lin@intel.com>
      Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
      Acked-by: Don Zickus <dzickus@redhat.com>
      Signed-off-by: Don Zickus <dzickus@redhat.com>
      Cc: Cyrill Gorcunov <gorcunov@gmail.com>
      Link: http://lkml.kernel.org/r/1303398203-2918-3-git-send-email-dzickus@redhat.com
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
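
      One way to get both effects at once is to stop the event rather
      than merely disable it; a sketch of the overflow path (whether the
      actual fix used x86_pmu_stop() or cleared the bit directly is an
      assumption here):

          if (overflowed) {
                  handled++;
                  /* x86_pmu_stop() disables the counter *and* clears
                   * its bit in cpuc->active_mask, so the event is not
                   * left stalled in the "active" state: */
                  x86_pmu_stop(event, 0);
          }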
    • x86, perf event: Turn off unstructured raw event access to offcore registers · b52c55c6
      Ingo Molnar committed
      Andi Kleen pointed out that the Intel offcore support patches were merged
      without user-space tool support to the functionality:
      
       |
       | The offcore_msr perf kernel code was merged into 2.6.39-rc*, but the
       | user space bits were not. This made it impossible to set the extra mask
       | and actually do the OFFCORE profiling
       |
      
      Andi submitted a preliminary patch for user-space support, as an
      extension to perf's raw event syntax:
      
       |
       | Some raw events -- like the Intel OFFCORE events -- support additional
       | parameters. These can be appended after a ':'.
       |
       | For example on a multi socket Intel Nehalem:
       |
       |    perf stat -e r1b7:20ff -a sleep 1
       |
       | Profile the OFFCORE_RESPONSE.ANY_REQUEST with event mask REMOTE_DRAM_0
       | that measures any access to DRAM on another socket.
       |
      
      But this kind of usability is absolutely unacceptable - users should not
      be expected to type in magic, CPU and model specific incantations to get
      access to useful hardware functionality.
      
      The proper solution is to expose useful offcore functionality via
      generalized events - that way users do not have to care which specific
      CPU model they are using, they can use the conceptual event and not some
      model specific quirky hexa number.
      
      We already have such generalization in place for CPU cache events,
      and it's all very extensible.
      
      "Offcore" events measure general DRAM access patters along various
      parameters. They are particularly useful in NUMA systems.
      
      We want to support them via generalized DRAM events: either as the
      fourth level of cache (after the last-level cache), or as a separate
      generalization category.
      
      That way user-space support would be very obvious, memory access
      profiling could be done via self-explanatory commands like:
      
        perf record -e dram ./myapp
        perf record -e dram-remote ./myapp
      
      ... to measure DRAM accesses or more expensive cross-node NUMA DRAM
      accesses.
      
      These generalized events would work on all CPUs and architectures that
      have comparable PMU features.
      
      ( Note, these are just examples: the actual implementation could have
        more sophistication and more parameters - as long as they center
        around similarly simple usecases. )
      
      Now we do not want to revert *all* of the current offcore bits, as they
      are still somewhat useful for generic last-level-cache events, implemented
      in this commit:
      
        e994d7d2: perf: Fix LLC-* events on Intel Nehalem/Westmere
      
      But we definitely do not yet want to expose the unstructured raw events
      to user-space, until better generalization and usability is implemented
      for these hardware event features.
      
      ( Note: after generalization has been implemented raw offcore events can be
        supported as well: there can always be an odd event that is marginally
        useful but not useful enough to generalize. DRAM profiling is definitely
        *not* such a category so generalization must be done first. )
      
      Furthermore, PERF_TYPE_RAW access to these registers was not intended
      to go upstream without proper support - it was a side-effect of the above
      e994d7d2 commit, not mentioned in the changelog.
      
      As v2.6.39 is nearing release we go for the simplest approach: disable
      the PERF_TYPE_RAW offcore hack for now, before it escapes into a released
      kernel and becomes an ABI.
      
      Once proper structure is implemented for these hardware events and users
      are offered usable solutions we can revisit this issue.
      Reported-by: Andi Kleen <ak@linux.intel.com>
      Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Link: http://lkml.kernel.org/r/1302658203-4239-1-git-send-email-andi@firstfloor.org
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • perf: Support Xeon E7's via the Westmere PMU driver · b2508e82
      Andi Kleen committed
      There's a new public model number, 47, for Xeon E7 (aka Westmere EX).
      Signed-off-by: Andi Kleen <ak@linux.intel.com>
      Cc: a.p.zijlstra@chello.nl
      Link: http://lkml.kernel.org/r/1303429715-10202-1-git-send-email-andi@firstfloor.org
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
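
      Supporting a new model in an existing PMU driver is typically a
      one-line case added to the model switch in the Intel PMU init code;
      a sketch (the neighboring model number is an assumption, only
      47 = Westmere EX comes from the commit):

          switch (boot_cpu_data.x86_model) {
          case 44: /* Westmere-EP */
          case 47: /* Westmere EX, aka Xeon E7 -- the new model */
                  /* reuse the existing Westmere event tables */
                  break;
          /* ... */
          }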
  7. Apr 21, 2011 (4 commits)
    • [PARISC] set memory ranges in N_NORMAL_MEMORY when onlined · d9b41e0b
      David Rientjes committed
      When a DISCONTIGMEM memory range is brought online as a NUMA node, it
      also needs to have its bit set in N_NORMAL_MEMORY.  This is necessary
      for generic kernel code that utilizes N_NORMAL_MEMORY as a subset of
      N_ONLINE for memory savings.
      
      These types of hacks can hopefully be removed once DISCONTIGMEM is either
      removed or abstracted away from CONFIG_NUMA.
      
      Fixes a panic in the slub code which only initializes structures for
      N_NORMAL_MEMORY to save memory:
      
      	Backtrace:
      	 [<000000004021c938>] add_partial+0x28/0x98
      	 [<000000004021faa0>] __slab_free+0x1d0/0x1d8
      	 [<000000004021fd04>] kmem_cache_free+0xc4/0x128
      	 [<000000004033bf9c>] ida_get_new_above+0x21c/0x2c0
      	 [<00000000402a8980>] sysfs_new_dirent+0xd0/0x238
      	 [<00000000402a974c>] create_dir+0x5c/0x168
      	 [<00000000402a9ab0>] sysfs_create_dir+0x98/0x128
      	 [<000000004033d6c4>] kobject_add_internal+0x114/0x258
      	 [<000000004033d9ac>] kobject_add_varg+0x7c/0xa0
      	 [<000000004033df20>] kobject_add+0x50/0x90
      	 [<000000004033dfb4>] kobject_create_and_add+0x54/0xc8
      	 [<00000000407862a0>] cgroup_init+0x138/0x1f0
      	 [<000000004077ce50>] start_kernel+0x5a0/0x840
      	 [<000000004011fa3c>] start_parisc+0xa4/0xb8
      	 [<00000000404bb034>] packet_ioctl+0x16c/0x208
      	 [<000000004049ac30>] ip_mroute_setsockopt+0x260/0xf20
      Signed-off-by: David Rientjes <rientjes@google.com>
      Cc: stable@kernel.org
      Signed-off-by: James Bottomley <James.Bottomley@suse.de>
    • x86, numa: Fix cpu nodemasks for NUMA emulation and CONFIG_DEBUG_PER_CPU_MAPS · 7a6c6547
      David Rientjes committed
      The cpu<->node mappings under CONFIG_DEBUG_PER_CPU_MAPS=y
      are currently broken when NUMA emulation is enabled, because the
      code does not iterate through every emulated node and bind the cpus
      that have affinity to it.
      
      NUMA emulation should bind each cpu to every local node to
      accurately represent the true NUMA topology of the underlying
      machine.
      
      debug_cpumask_set_cpu() needs to be fixed at the same time so
      that the debugging information that it emits shows the new
      cpumask of the node being assigned when the cpu is being added
      or removed.
      
      It can now take responsibility for setting or clearing the cpu bit
      itself, which removes the need for duplicate code.

      Also change its last parameter, "enable", to have the correct bool
      type, since it can only be true or false.
      
       -v2: Fix the return statements, by Kosaki Motohiro
      Acked-and-Tested-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Signed-off-by: David Rientjes <rientjes@google.com>
      Cc: Andreas Herrmann <herrmann.der.user@googlemail.com>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Link: http://lkml.kernel.org/r/alpine.DEB.2.00.1104201918470.12634@chino.kir.corp.google.com
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
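
      A sketch of what debug_cpumask_set_cpu() looks like once it owns the
      set/clear step (the mask lookup and message are simplified
      assumptions):

          static void debug_cpumask_set_cpu(int cpu, int node, bool enable)
          {
                  struct cpumask *mask = node_to_cpumask_map[node];

                  /* The helper now flips the bit itself ... */
                  if (enable)
                          cpumask_set_cpu(cpu, mask);
                  else
                          cpumask_clear_cpu(cpu, mask);

                  /* ... so the debug output can show the node's *new*
                   * cpumask after the cpu was added or removed. */
                  pr_debug("cpu %d %s node %d\n", cpu,
                           enable ? "set in" : "cleared from", node);
          }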
    • Revert "x86, NUMA: Fix fakenuma boot failure" · 37f8527d
      David Rientjes committed
      Andreas Herrmann reported that 7d6b4670 ("x86, NUMA: Fix fakenuma
      boot failure") causes certain physical NUMA topologies (for example
      AMD Magny-Cours) to move sibling cpus to a single node when in reality
      they are in separate domains.
      
      This may result in some nodes being completely void of cpus, which
      doesn't accurately represent the correct topology. The system will
      boot, but will have suboptimal NUMA performance.
      
      This commit was intended as a fix for NUMA emulation, but should
      not cause a regression for real NUMA machines as a side effect.
      
      ( There will be a separate fix for the numa-debug code, which
        will not affect physical topologies. )
      Reported-by: Andreas Herrmann <herrmann.der.user@googlemail.com>
      Signed-off-by: David Rientjes <rientjes@google.com>
      Acked-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Link: http://lkml.kernel.org/r/alpine.DEB.2.00.1104201918110.12634@chino.kir.corp.google.com
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • OMAP2/3: hwmod: fix gpio-reset timeouts seen during bootup. · f95440ca
      Avinash.H.M committed
      The GPIO module expects the debounce clocks to be enabled during reset;
      it doesn't reset properly, and timeouts are seen, if these clocks
      aren't enabled.  Add the HWMOD_CONTROL_OPT_CLKS_IN_RESET flag to the
      GPIO hwmods, with which the debounce clocks are enabled during reset.
      
      Cc: Rajendra Nayak <rnayak@ti.com>
      Cc: Paul Walmsley <paul@pwsan.com>
      Cc: Benoit Cousson <b-cousson@ti.com>
      Cc: Kevin Hilman <khilman@ti.com>
      Signed-off-by: Avinash.H.M <avinashhm@ti.com>
      Signed-off-by: Paul Walmsley <paul@pwsan.com>
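
      In hwmod data, a fix like this is just a flag on the module record;
      a hedged sketch (the specific hwmod shown is illustrative, the flag
      name is from the commit):

          static struct omap_hwmod omap3xxx_gpio1_hwmod = {
                  .name   = "gpio1",
                  /* keep optional (debounce) clocks enabled while the
                   * module is being reset: */
                  .flags  = HWMOD_CONTROL_OPT_CLKS_IN_RESET,
                  /* ... clocks, addresses, etc. ... */
          };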