提交 · fb9a7d76da108d120efb2258ea83b18dbbb2ecdd · openanolis / cloud-kernel

01 4月, 2011 1 次提交

rcu: create new rcu_access_index() and use in mce · a4dd9925

由 Paul E. McKenney 提交于 4月 01, 2011

The MCE subsystem needs to sample an RCU-protected index outside of
any protection for that index.  If this was a pointer, we would use
rcu_access_pointer(), but there is no corresponding rcu_access_index().
This commit therefore creates an rcu_access_index() and applies it
to MCE.
Signed-off-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com>
Tested-by: NZdenek Kabelac <zkabelac@redhat.com>

a4dd9925

25 3月, 2011 2 次提交

perf, x86: Complain louder about BIOSen corrupting CPU/PMU state and continue · 45daae57

由 Ingo Molnar 提交于 3月 25, 2011

Eric Dumazet reported that hardware PMU events do not work on his
system, due to the BIOS corrupting PMU state:

    Performance Events: PEBS fmt0+, Core2 events, Broken BIOS detected, using software events only.
    [Firmware Bug]: the BIOS has corrupted hw-PMU resources (MSR 186 is 43003c)

Linus suggested that we continue in the face of such BIOS-induced CPU
state corruption:

   http://lkml.org/lkml/2011/3/24/608

Such BIOSes will have to be fixed - Linux developers rely on a working and
fully capable PMU and the BIOS interfering with the CPU's PMU state is simply
not acceptable.

So this patch changes perf to continue when it detects such BIOS
interaction, some hardware events may be unreliable due to the BIOS
writing and re-writing them - there's not much the kernel can do
about that but to detect the corruption and report it.
Reported-and-tested-by: NEric Dumazet <eric.dumazet@gmail.com>
Suggested-by: NLinus Torvalds <torvalds@linux-foundation.org>
Acked-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Steven Rostedt <rostedt@goodmis.org>
LKML-Reference: <new-submission>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

45daae57

perf, x86: P4 PMU - Read proper MSR register to catch unflagged overflows · 242214f9

由 Don Zickus 提交于 3月 24, 2011

The read of a proper MSR register was missed and instead of
counter the configration register was tested (it has
ARCH_P4_UNFLAGGED_BIT always cleared) leading to unknown NMI
hitting the system. As result the user may obtain "Dazed and
confused, but trying to continue" message. Fix it by reading a
proper MSR register.

When an NMI happens on a P4, the perf nmi handler checks the
configuration register to see if the overflow bit is set or not
before taking appropriate action.  Unfortunately, various P4
machines had a broken overflow bit, so a backup mechanism was
implemented.  This mechanism checked to see if the counter
rolled over or not.

A previous commit that implemented this backup mechanism was
broken. Instead of reading the counter register, it used the
configuration register to determine if the counter rolled over
or not. Reading that bit would give incorrect results.

This would lead to 'Dazed and confused' messages for the end
user when using the perf tool (or if the nmi watchdog is
running).

The fix is to read the counter register before determining if
the counter rolled over or not.
Signed-off-by: NDon Zickus <dzickus@redhat.com>
Signed-off-by: NCyrill Gorcunov <gorcunov@openvz.org>
Cc: Lin Ming <ming.m.lin@intel.com>
LKML-Reference: <4D8BAB49.3080701@openvz.org>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

242214f9

24 3月, 2011 1 次提交

x86: Use syscore_ops instead of sysdev classes and sysdevs · f3c6ea1b

由 Rafael J. Wysocki 提交于 3月 23, 2011

Some subsystems in the x86 tree need to carry out suspend/resume and
shutdown operations with one CPU on-line and interrupts disabled and
they define sysdev classes and sysdevs or sysdev drivers for this
purpose.  This leads to unnecessarily complicated code and excessive
memory usage, so switch them to using struct syscore_ops objects for
this purpose instead.

Generally, there are three categories of subsystems that use
sysdevs for implementing PM operations: (1) subsystems whose
suspend/resume callbacks ignore their arguments entirely (the
majority), (2) subsystems whose suspend/resume callbacks use their
struct sys_device argument, but don't really need to do that,
because they can be implemented differently in an arguably simpler
way (io_apic.c), and (3) subsystems whose suspend/resume callbacks
use their struct sys_device argument, but the value of that argument
is always the same and could be ignored (microcode_core.c).  In all
of these cases the subsystems in question may be readily converted to
using struct syscore_ops objects for power management and shutdown.
Signed-off-by: NRafael J. Wysocki <rjw@sisk.pl>
Reviewed-by: NThomas Gleixner <tglx@linutronix.de>
Acked-by: NIngo Molnar <mingo@elte.hu>

f3c6ea1b

22 3月, 2011 1 次提交

ACPI, APEI, Add ERST record ID cache · 885b976f

由 Huang Ying 提交于 2月 21, 2011

APEI ERST firmware interface and implementation has no multiple users
in mind.  For example, if there is four records in storage with ID: 1,
2, 3 and 4, if two ERST readers enumerate the records via
GET_NEXT_RECORD_ID as follow,

reader 1		reader 2
1
			2
3
			4
-1
			-1

where -1 signals there is no more record ID.

Reader 1 has no chance to check record 2 and 4, while reader 2 has no
chance to check record 1 and 3.  And any other GET_NEXT_RECORD_ID will
return -1, that is, other readers will has no chance to check any
record even they are not cleared by anyone.

This makes raw GET_NEXT_RECORD_ID not suitable for used by multiple
users.

To solve the issue, an in-memory ERST record ID cache is designed and
implemented.  When enumerating record ID, the ID returned by
GET_NEXT_RECORD_ID is added into cache in addition to be returned to
caller.  So other readers can check the cache to get all record ID
available.
Signed-off-by: NHuang Ying <ying.huang@intel.com>
Reviewed-by: NAndi Kleen <ak@linux.intel.com>
Signed-off-by: NLen Brown <len.brown@intel.com>

885b976f

20 3月, 2011 1 次提交

perf, x86: Fix Intel fixed counters base initialization · fc66c521

由 Stephane Eranian 提交于 3月 19, 2011

The following patch solves the problems introduced by Robert's
commit 41bf4989 and reported by Arun Sharma. This commit gets rid
of the base + index notation for reading and writing PMU msrs.

The problem is that for fixed counters, the new calculation for
the base did not take into account the fixed counter indexes,
thus all fixed counters were read/written from fixed counter 0.
Although all fixed counters share the same config MSR, they each
have their own counter register.

Without:

 $ task -e unhalted_core_cycles -e instructions_retired -e baclears noploop 1 noploop for 1 seconds

  242202299 unhalted_core_cycles (0.00% scaling, ena=1000790892, run=1000790892)
 2389685946 instructions_retired (0.00% scaling, ena=1000790892, run=1000790892)
      49473 baclears             (0.00% scaling, ena=1000790892, run=1000790892)

With:

 $ task -e unhalted_core_cycles -e instructions_retired -e baclears noploop 1 noploop for 1 seconds

 2392703238 unhalted_core_cycles (0.00% scaling, ena=1000840809, run=1000840809)
 2389793744 instructions_retired (0.00% scaling, ena=1000840809, run=1000840809)
      47863 baclears             (0.00% scaling, ena=1000840809, run=1000840809)
Signed-off-by: NStephane Eranian <eranian@google.com>
Cc: peterz@infradead.org
Cc: ming.m.lin@intel.com
Cc: robert.richter@amd.com
Cc: asharma@fb.com
Cc: perfmon2-devel@lists.sf.net
LKML-Reference: <20110319172005.GB4978@quad>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

fc66c521

18 3月, 2011 2 次提交

x86, dumpstack: Correct stack dump info when frame pointer is available · e8e999cf

由 Namhyung Kim 提交于 3月 18, 2011

Current stack dump code scans entire stack and check each entry
contains a pointer to kernel code. If CONFIG_FRAME_POINTER=y it
could mark whether the pointer is valid or not based on value of
the frame pointer. Invalid entries could be preceded by '?' sign.

However this was not going to happen because scan start point
was always higher than the frame pointer so that they could not
meet.

Commit 9c0729dc ("x86: Eliminate bp argument from the stack
tracing routines") delayed bp acquisition point, so the bp was
read in lower frame, thus all of the entries were marked
invalid.

This patch fixes this by reverting above commit while retaining
stack_frame() helper as suggested by Frederic Weisbecker.

End result looks like below:

before:

 [    3.508329] Call Trace:
 [    3.508551]  [<ffffffff814f35c9>] ? panic+0x91/0x199
 [    3.508662]  [<ffffffff814f3739>] ? printk+0x68/0x6a
 [    3.508770]  [<ffffffff81a981b2>] ? mount_block_root+0x257/0x26e
 [    3.508876]  [<ffffffff81a9821f>] ? mount_root+0x56/0x5a
 [    3.508975]  [<ffffffff81a98393>] ? prepare_namespace+0x170/0x1a9
 [    3.509216]  [<ffffffff81a9772b>] ? kernel_init+0x1d2/0x1e2
 [    3.509335]  [<ffffffff81003894>] ? kernel_thread_helper+0x4/0x10
 [    3.509442]  [<ffffffff814f6880>] ? restore_args+0x0/0x30
 [    3.509542]  [<ffffffff81a97559>] ? kernel_init+0x0/0x1e2
 [    3.509641]  [<ffffffff81003890>] ? kernel_thread_helper+0x0/0x10

after:

 [    3.522991] Call Trace:
 [    3.523351]  [<ffffffff814f35b9>] panic+0x91/0x199
 [    3.523468]  [<ffffffff814f3729>] ? printk+0x68/0x6a
 [    3.523576]  [<ffffffff81a981b2>] mount_block_root+0x257/0x26e
 [    3.523681]  [<ffffffff81a9821f>] mount_root+0x56/0x5a
 [    3.523780]  [<ffffffff81a98393>] prepare_namespace+0x170/0x1a9
 [    3.523885]  [<ffffffff81a9772b>] kernel_init+0x1d2/0x1e2
 [    3.523987]  [<ffffffff81003894>] kernel_thread_helper+0x4/0x10
 [    3.524228]  [<ffffffff814f6880>] ? restore_args+0x0/0x30
 [    3.524345]  [<ffffffff81a97559>] ? kernel_init+0x0/0x1e2
 [    3.524445]  [<ffffffff81003890>] ? kernel_thread_helper+0x0/0x10

 -v5:
   * fix build breakage with oprofile

 -v4:
   * use 0 instead of regs->bp
   * separate out printk changes

 -v3:
   * apply comment from Frederic
   * add a couple of printk fixes
Signed-off-by: NNamhyung Kim <namhyung@gmail.com>
Acked-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: NFrederic Weisbecker <fweisbec@gmail.com>
Cc: Soren Sandmann <ssp@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Robert Richter <robert.richter@amd.com>
LKML-Reference: <1300416006-3163-1-git-send-email-namhyung@gmail.com>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

e8e999cf

x86: Fix common misspellings · 0d2eb44f

由 Lucas De Marchi 提交于 3月 17, 2011

They were generated by 'codespell' and then manually reviewed.
Signed-off-by: NLucas De Marchi <lucas.demarchi@profusion.mobi>
Cc: trivial@kernel.org
LKML-Reference: <1300389856-1099-3-git-send-email-lucas.demarchi@profusion.mobi>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

0d2eb44f

17 3月, 2011 2 次提交

[CPUFREQ] pcc-cpufreq: remove duplicate statements · bdce2595

由 Chumbalkar, Nagananda 提交于 3月 16, 2011

Remove a couple of assigment statements that appear twice.
Signed-off-by: NNaga Chumbalkar <nagananda.chumbalkar@hp.com>
Signed-off-by: NDave Jones <davej@redhat.com>

bdce2595

[CPUFREQ] powernow-k8: The table index is not worth displaying · 9e918695

由 Thomas Renninger 提交于 3月 03, 2011

and it also is misleading due to another message above
which makes the index look like it is the CPU.

https://bugzilla.kernel.org/show_bug.cgi?id=24562Signed-off-by: NThomas Renninger <trenn@suse.de>
Signed-off-by: NDave Jones <davej@redhat.com>
CC: cpufreq@vger.kernel.org

9e918695

16 3月, 2011 3 次提交

perf, x86: Use INTEL_*_CONSTRAINT() for all PEBS event constraints · 7d5d02da

由 Lin Ming 提交于 3月 09, 2011

PEBS_EVENT_CONSTRAINT() is just a duplicate of INTEL_UEVENT_CONSTRAINT().
Remove it and use INTEL_UEVENT_CONSTRAINT() instead.
Signed-off-by: NLin Ming <ming.m.lin@intel.com>
Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <1299684089-22835-3-git-send-email-ming.m.lin@intel.com>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

7d5d02da

perf, x86: Clean up SandyBridge PEBS events · eefaaac4

由 Lin Ming 提交于 3月 09, 2011

Use INTEL_EVENT_CONSTRAINT() for the events where all umasks support PEBS.
Signed-off-by: NLin Ming <ming.m.lin@intel.com>
Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <1299684089-22835-2-git-send-email-ming.m.lin@intel.com>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

eefaaac4

x86, AMD: Set ARAT feature on AMD processors · b87cf80a

由 Boris Ostrovsky 提交于 3月 15, 2011

Support for Always Running APIC timer (ARAT) was introduced in
commit db954b58. This feature
allows us to avoid switching timers from LAPIC to something else
(e.g. HPET) and go into timer broadcasts when entering deep
C-states.

AMD processors don't provide a CPUID bit for that feature but
they also keep APIC timers running in deep C-states (except for
cases when the processor is affected by erratum 400). Therefore
we should set ARAT feature bit on AMD CPUs.
Tested-by: NBorislav Petkov <borislav.petkov@amd.com>
Acked-by: NAndreas Herrmann <andreas.herrmann3@amd.com>
Acked-by: NMark Langsdorf <mark.langsdorf@amd.com>
Acked-by: NThomas Gleixner <tglx@linutronix.de>
Signed-off-by: NBoris Ostrovsky <boris.ostrovsky@amd.com>
LKML-Reference: <1300205624-4813-1-git-send-email-ostr@amd64.org>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

b87cf80a

10 3月, 2011 1 次提交

[CPUFREQ] pcc-cpufreq: don't load driver if get_freq fails during init. · 1f858ef2

由 Naga Chumbalkar 提交于 3月 09, 2011

Return 0 on failure. This will cause the initialization of the driver
to fail and prevent the driver from loading if the BIOS cannot handle
the PCC interface command to "get frequency". Otherwise, the driver
will load and display a very high value like "4294967274" (which is
actually -EINVAL) for frequency:

# cat /sys/devices/system/cpu/cpu0/cpufreq/cpuinfo_cur_freq
4294967274
Signed-off-by: NNaga Chumbalkar <nagananda.chumbalkar@hp.com>
CC: stable@kernel.org
Signed-off-by: NDave Jones <davej@redhat.com>

1f858ef2

05 3月, 2011 2 次提交

x86: Really print supported CPUs if PROCESSOR_SELECT=y · ac23f253

由 Jan Beulich 提交于 3月 04, 2011

I'm sure it was a mere oversight that the CONFIG_ prefixes are
missing.
Signed-off-by: NJan Beulich <jbeulich@novell.com>
Cc: Dave Jones <davej@redhat.com>
LKML-Reference: <4D7118D30200007800034F79@vpn.id2.novell.com>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

ac23f253

perf: Avoid the percore allocations if the CPU is not HT capable · 69092624

由 Lin Ming 提交于 3月 03, 2011

Signed-off-by: NLin Ming <ming.m.lin@intel.com>
Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <1299119690-13991-5-git-send-email-ming.m.lin@intel.com>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

69092624

04 3月, 2011 3 次提交

perf: Fix LLC-* events on Intel Nehalem/Westmere · e994d7d2

由 Andi Kleen 提交于 3月 03, 2011

On Intel Nehalem and Westmere CPUs the generic perf LLC-* events count the
L2 caches, not the real L3 LLC - this was inconsistent with behavior on
other CPUs.

Fixing this requires the use of the special OFFCORE_RESPONSE
events which need a separate mask register.

This has been implemented by the previous patch, now use this infrastructure
to set correct events for the LLC-* on Nehalem and Westmere.
Signed-off-by: NAndi Kleen <ak@linux.intel.com>
Signed-off-by: NLin Ming <ming.m.lin@intel.com>
Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <1299119690-13991-3-git-send-email-ming.m.lin@intel.com>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

e994d7d2

perf: Add support for supplementary event registers · a7e3ed1e

由 Andi Kleen 提交于 3月 03, 2011

Change logs against Andi's original version:

- Extends perf_event_attr:config to config{,1,2} (Peter Zijlstra)
- Fixed a major event scheduling issue. There cannot be a ref++ on an
  event that has already done ref++ once and without calling
  put_constraint() in between. (Stephane Eranian)
- Use thread_cpumask for percore allocation. (Lin Ming)
- Use MSR names in the extra reg lists. (Lin Ming)
- Remove redundant "c = NULL" in intel_percore_constraints
- Fix comment of perf_event_attr::config1

Intel Nehalem/Westmere have a special OFFCORE_RESPONSE event
that can be used to monitor any offcore accesses from a core.
This is a very useful event for various tunings, and it's
also needed to implement the generic LLC-* events correctly.

Unfortunately this event requires programming a mask in a separate
register. And worse this separate register is per core, not per
CPU thread.

This patch:

- Teaches perf_events that OFFCORE_RESPONSE needs extra parameters.
  The extra parameters are passed by user space in the
  perf_event_attr::config1 field.

- Adds support to the Intel perf_event core to schedule per
  core resources. This adds fairly generic infrastructure that
  can be also used for other per core resources.
  The basic code has is patterned after the similar AMD northbridge
  constraints code.

Thanks to Stephane Eranian who pointed out some problems
in the original version and suggested improvements.
Signed-off-by: NAndi Kleen <ak@linux.intel.com>
Signed-off-by: NLin Ming <ming.m.lin@intel.com>
Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <1299119690-13991-2-git-send-email-ming.m.lin@intel.com>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

a7e3ed1e

perf_events: Update PEBS event constraints · 17e31629

由 Stephane Eranian 提交于 3月 02, 2011

This patch updates PEBS event constraints for Intel Atom, Nehalem, Westmere.

This patch also reorganizes the PEBS format/constraint detection code. It is
now based on processor model and not PEBS format. Two processors may use the
same PEBS format without have the same list of PEBS events.

In this second version, we simplified the initialization of the PEBS
constraints by leveraging the existing switch() statement in perf_event_intel.c.
We also renamed the constraint tables to be more consistent with regular
constraints.

In this 3rd version, we drop BR_INST_RETIRED.MISPRED from Intel Atom as it does
not seem to work. Use MISPREDICTED_BRANCH_RETIRED instead. Also add FP_ASSIST.*
o both Intel Nehalem and Westmere. I misssed those in the earlier patches.
Events were tested using libpfm4 perf_examples.
Signed-off-by: NStephane Eranian <eranian@google.com>
Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <4d6e6b02.815bdf0a.637b.07a7@mx.google.com>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

17e31629

02 3月, 2011 4 次提交

perf, x86: Add Intel SandyBridge CPU support · b06b3d49

由 Lin Ming 提交于 3月 02, 2011

This patch adds basic SandyBridge support, including hardware
cache events and PEBS events support.

It has been tested on SandyBridge CPUs with perf stat and also
with PEBS based profiling - both work fine.

The patch does not affect other models.

v2 -> v3:
 - fix PEBS event 0xd0 with right umask combinations
 - move snb pebs constraint assignment to intel_pmu_init

v1 -> v2:
 - add more raw and PEBS events constraints
 - use offcore events for LLC-* cache events
 - remove the call to Nehalem workaround enable_all function
Signed-off-by: NLin Ming <ming.m.lin@intel.com>
Acked-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Cc: Andi Kleen <andi@firstfloor.org>
LKML-Reference: <1299072424.2175.24.camel@localhost>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

b06b3d49

[CPUFREQ] p4-clockmod: print EST-capable warning message only once · 853cee26

由 Naga Chumbalkar 提交于 2月 15, 2011

Print the message only once. I see it 16 times on a 2P box with 16 logical CPUs.
Signed-off-by: NNaga Chumbalkar <nagananda.chumbalkar@hp.com>

853cee26

[CPUFREQ] Fix another notifier leak in powernow-k8. · a536b126

由 Dave Jones 提交于 11月 23, 2010

Do the notifier registration later, so we don't have to worry
about freeing it if we fail the msr allocation.
Signed-off-by: NDave Jones <davej@redhat.com>

a536b126

[CPUFREQ] Missing "unregister_cpu_notifier" in powernow-k8.c · ac818314

由 Neil Brown 提交于 11月 24, 2010

It appears that when powernow-k8 finds that

    No compatible ACPI _PSS objects found.

 and suggests

    Try again with latest BIOS.

 it fails the module load, but does not unregister the cpu_notifier that was
 registered in powernowk8_init

 This ends up leaving freed memory on the cpu notifier list for some other
 poor module (e.g. md/raid5) to come along and trip over.

 The following might be a partial fix, but I suspect there is probably other
 clean-up that is needed.

 ( https://bugzilla.novell.com/show_bug.cgi?id=655215 has full dmesg traces).
Signed-off-by: NDave Jones <davej@redhat.com>
Signed-off-by: NNeil Brown <neilb@suse.de>

ac818314

16 2月, 2011 6 次提交

perf, x86: Add support for AMD family 15h core counters · 4979d272

由 Robert Richter 提交于 2月 02, 2011

This patch adds support for AMD family 15h core counters. There are
major changes compared to family 10h. First, there is a new perfctr
msr range for up to 6 counters. Northbridge counters are separate
now. This patch only adds support for core counters. Second, certain
events may only be scheduled on certain counters. For this we need to
extend the event scheduling and constraints.

We use cpu feature flags to calculate family 15h msr address offsets.
This way we later can implement a faster ALTERNATIVE() version for
this.
Signed-off-by: NRobert Richter <robert.richter@amd.com>
Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <20110215135210.GB5874@erda.amd.com>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

4979d272

perf, x86: Store perfctr msr addresses in config_base/event_base · 73d6e522

由 Robert Richter 提交于 2月 02, 2011

Instead of storing the base addresses we can store the counter's msr
addresses directly in config_base/event_base of struct hw_perf_event.
This avoids recalculating the address with each msr access. The
addresses are configured one time. We also need this change to later
modify the address calculation.
Signed-off-by: NRobert Richter <robert.richter@amd.com>
Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <1296664860-10886-5-git-send-email-robert.richter@amd.com>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

73d6e522

perf, x86: Add new AMD family 15h msrs to perfctr reservation code · 69d8e1e8

由 Robert Richter 提交于 2月 02, 2011

This patch allows the reservation of perfctrs with new msr addresses
introduced for AMD cpu family 15h (0xc0010200/0xc0010201, etc).
Signed-off-by: NRobert Richter <robert.richter@amd.com>
Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <1296664860-10886-4-git-send-email-robert.richter@amd.com>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

69d8e1e8

perf, x86: Calculate perfctr msr addresses in helper functions · 41bf4989

由 Robert Richter 提交于 2月 02, 2011

This patch adds helper functions to calculate perfctr msr addresses.
We need this to later add support for AMD family 15h cpus. For this we
have to change the algorithms to generate the perfctr's msr addresses.
Signed-off-by: NRobert Richter <robert.richter@amd.com>
Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <1296664860-10886-3-git-send-email-robert.richter@amd.com>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

41bf4989

perf, x86: Use helper function in x86_pmu_enable_all() · d45dd923

由 Robert Richter 提交于 2月 02, 2011

Use helper function in x86_pmu_enable_all() to minimize access to
x86_pmu.eventsel in the fast path. The counter's msr address is now
calculated using struct hw_perf_event. Later we add code that
calculates the msr addresses with a table lookup which shouldn't be
done in the fast path.
Signed-off-by: NRobert Richter <robert.richter@amd.com>
Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <1296664860-10886-2-git-send-email-robert.richter@amd.com>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

d45dd923

perf, x86: P4 PMU: Fix spurious NMI messages · 7d44ec19

由 Cyrill Gorcunov 提交于 2月 16, 2011

Several people have reported spurious unknown NMI
messages on some P4 CPUs.

This patch fixes it by checking for an overflow (negative
counter values) directly, instead of relying on the
P4_CCCR_OVF bit.
Reported-by: NGeorge Spelvin <linux@horizon.com>
Reported-by: NMeelis Roos <mroos@linux.ee>
Reported-by: NDon Zickus <dzickus@redhat.com>
Reported-by: NDave Airlie <airlied@gmail.com>
Signed-off-by: NCyrill Gorcunov <gorcunov@openvz.org>
Cc: Lin Ming <ming.m.lin@intel.com>
Cc: Don Zickus <dzickus@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <AANLkTinfuTfCck_FfaOHrDqQZZehtRzkBum4SpFoO=KJ@mail.gmail.com>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

7d44ec19

15 2月, 2011 1 次提交

x86, amd: Initialize variable properly · 9e81509e

由 Borislav Petkov 提交于 2月 14, 2011

Commit d518573d ("x86, amd: Normalize compute unit IDs on
multi-node processors") introduced compute unit normalization
but causes a compiler warning:

 arch/x86/kernel/cpu/amd.c: In function 'amd_detect_cmp':
 arch/x86/kernel/cpu/amd.c:268: warning: 'cores_per_cu' may be used uninitialized in this function
 arch/x86/kernel/cpu/amd.c:268: note: 'cores_per_cu' was declared here

The compiler is right - initialize it with a proper value.

Also, fix up a comment while at it.
Reported-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NBorislav Petkov <borislav.petkov@amd.com>
Cc: Andreas Herrmann <andreas.herrmann3@amd.com>
LKML-Reference: <20110214171451.GB10076@kryptos.osrc.amd.com>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

9e81509e

08 2月, 2011 1 次提交

x86, amd: Support L3 Cache Partitioning on AMD family 0x15 CPUs · cabb5bd7

由 Hans Rosenfeld 提交于 2月 07, 2011

L3 Cache Partitioning allows selecting which of the 4 L3 subcaches can be used
for evictions by the L2 cache of each compute unit. By writing a 4-bit
hexadecimal mask into the the sysfs file
/sys/devices/system/cpu/cpuX/cache/index3/subcaches, the user can set the
enabled subcaches for a CPU.

The settings are directly read from and written to the hardware, so there is no
way to have contradicting settings for two CPUs belonging to the same compute
unit. Writing will always overwrite any previous setting for a compute unit.
Signed-off-by: NHans Rosenfeld <hans.rosenfeld@amd.com>
Cc: <Andreas.Herrmann3@amd.com>
LKML-Reference: <1297098639-431383-1-git-send-email-hans.rosenfeld@amd.com>
[ -v3: minor style fixes ]
Signed-off-by: NIngo Molnar <mingo@elte.hu>

cabb5bd7

03 2月, 2011 1 次提交

x86, mtrr: Avoid MTRR reprogramming on BP during boot on UP platforms · f7448548

由 Suresh Siddha 提交于 2月 02, 2011

Markus Kohn ran into a hard hang regression on an acer aspire
1310, when acpi is enabled. git bisect showed the following
commit as the bad one that introduced the boot regression.

	commit d0af9eed
	Author: Suresh Siddha <suresh.b.siddha@intel.com>
	Date:   Wed Aug 19 18:05:36 2009 -0700

	    x86, pat/mtrr: Rendezvous all the cpus for MTRR/PAT init

Because of the UP configuration of that platform,
native_smp_prepare_cpus() bailed out (in smp_sanity_check())
before doing the set_mtrr_aps_delayed_init()

Further down the boot path, native_smp_cpus_done() will call the
delayed MTRR initialization for the AP's (mtrr_aps_init()) with
mtrr_aps_delayed_init not set. This resulted in the boot
processor reprogramming its MTRR's to the values seen during the
start of the OS boot. While this is not needed ideally, this
shouldn't have caused any side-effects. This is because the
reprogramming of MTRR's (set_mtrr_state() that gets called via
set_mtrr()) will check if the live register contents are
different from what is being asked to write and will do the actual
write only if they are different.

BP's mtrr state is read during the start of the OS boot and
typically nothing would have changed when we ask to reprogram it
on BP again because of the above scenario on an UP platform. So
on a normal UP platform no reprogramming of BP MTRR MSR's
happens and all is well.

However, on this platform, bios seems to be modifying the fixed
mtrr range registers between the start of OS boot and when we
double check the live registers for reprogramming BP MTRR
registers. And as the live registers are modified, we end up
reprogramming the MTRR's to the state seen during the start of
the OS boot.

During ACPI initialization, something in the bios (probably smi
handler?) don't like this fact and results in a hard lockup.

We didn't see this boot hang issue on this platform before the
commit d0af9eed, because only
the AP's (if any) will program its MTRR's to the value that BP
had at the start of the OS boot.

Fix this issue by checking mtrr_aps_delayed_init before
continuing further in the mtrr_aps_init(). Now, only AP's (if
any) will program its MTRR's to the BP values during boot.

Addresses https://bugzilla.novell.com/show_bug.cgi?id=623393

  [ By the way, this behavior of the bios modifying MTRR's after the start
    of the OS boot is not common and the kernel is not prepared to
    handle this situation well. Irrespective of this issue, during
    suspend/resume, linux kernel will try to reprogram the BP's MTRR values
    to the values seen during the start of the OS boot. So suspend/resume might
    be already broken on this platform for all linux kernel versions. ]
Reported-and-bisected-by: NMarkus Kohn <jabber@gmx.org>
Tested-by: NMarkus Kohn <jabber@gmx.org>
Signed-off-by: NSuresh Siddha <suresh.b.siddha@intel.com>
Cc: Thomas Renninger <trenn@novell.com>
Cc: Rafael Wysocki <rjw@novell.com>
Cc: Venkatesh Pallipadi <venki@google.com>
Cc: stable@kernel.org # [v2.6.32+]
LKML-Reference: <1296694975.4418.402.camel@sbsiddha-MOBL3.sc.intel.com>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

f7448548

28 1月, 2011 5 次提交

x86: Unify node_to_cpumask_map handling between 32 and 64bit · de2d9445

由 Tejun Heo 提交于 1月 23, 2011

x86_32 has been managing node_to_cpumask_map explicitly from
map_cpu_to_node() and friends in a rather ugly way.  With
previous changes, it's now possible to share the code with
64bit.

* When CONFIG_NUMA_EMU is disabled, numa_add/remove_cpu() are
  implemented in numa.c and shared by 32 and 64bit.  CONFIG_NUMA_EMU
  versions still live in numa_64.c.

  NUMA_EMU's dependency on 64bit is planned to be removed and the
  above should go away together.

* identify_cpu() now calls numa_add_cpu() for 32bit too.  This
  makes the explicit mask management from map_cpu_to_node() unnecessary.

* The whole x86_32 specific map_cpu_to_node() chunk is no longer
  necessary.  Dropped.
Signed-off-by: NTejun Heo <tj@kernel.org>
Reviewed-by: NPekka Enberg <penberg@kernel.org>
Cc: eric.dumazet@gmail.com
Cc: yinghai@kernel.org
Cc: brgerst@gmail.com
Cc: gorcunov@gmail.com
Cc: shaohui.zheng@intel.com
Cc: rientjes@google.com
LKML-Reference: <1295789862-25482-16-git-send-email-tj@kernel.org>
Signed-off-by: NIngo Molnar <mingo@elte.hu>
Cc: David Rientjes <rientjes@google.com>
Cc: Shaohui Zheng <shaohui.zheng@intel.com>

de2d9445

x86: Unify CPU -> NUMA node mapping between 32 and 64bit · 645a7919

由 Tejun Heo 提交于 1月 23, 2011

Unlike 64bit, 32bit has been using its own cpu_to_node_map[] for
CPU -> NUMA node mapping.  Replace it with early_percpu variable
x86_cpu_to_node_map and share the mapping code with 64bit.

* USE_PERCPU_NUMA_NODE_ID is now enabled for 32bit too.

* x86_cpu_to_node_map and numa_set/clear_node() are moved from
  numa_64 to numa.  For now, on 32bit, x86_cpu_to_node_map is initialized
  with 0 instead of NUMA_NO_NODE.  This is to avoid introducing unexpected
  behavior change and will be updated once init path is unified.

* srat_detect_node() is now enabled for x86_32 too.  It calls
  numa_set_node() and initializes the mapping making explicit
  cpu_to_node_map[] updates from map/unmap_cpu_to_node() unnecessary.
Signed-off-by: NTejun Heo <tj@kernel.org>
Cc: eric.dumazet@gmail.com
Cc: yinghai@kernel.org
Cc: brgerst@gmail.com
Cc: gorcunov@gmail.com
Cc: penberg@kernel.org
Cc: shaohui.zheng@intel.com
Cc: rientjes@google.com
LKML-Reference: <1295789862-25482-15-git-send-email-tj@kernel.org>
Signed-off-by: NIngo Molnar <mingo@elte.hu>
Cc: David Rientjes <rientjes@google.com>

645a7919

x86: Unify cpu/apicid <-> NUMA node mapping between 32 and 64bit · bbc9e2f4

由 Tejun Heo 提交于 1月 23, 2011

The mapping between cpu/apicid and node is done via
apicid_to_node[] on 64bit and apicid_2_node[] +
apic->x86_32_numa_cpu_node() on 32bit. This difference makes it
difficult to further unify 32 and 64bit NUMA handling.

This patch unifies it by replacing both apicid_to_node[] and
apicid_2_node[] with __apicid_to_node[] array, which is accessed
by two accessors - set_apicid_to_node() and numa_cpu_node().  On
64bit, numa_cpu_node() always consults __apicid_to_node[]
directly while 32bit goes through apic->numa_cpu_node() method
to allow apic implementations to override it.

srat_detect_node() for amd cpus contains workaround for broken
NUMA configuration which assumes relationship between APIC ID,
HT node ID and NUMA topology.  Leave it to access
__apicid_to_node[] directly as mapping through CPU might result
in undesirable behavior change.  The comment is reformatted and
updated to note the ugliness.
Signed-off-by: NTejun Heo <tj@kernel.org>
Reviewed-by: NPekka Enberg <penberg@kernel.org>
Cc: eric.dumazet@gmail.com
Cc: yinghai@kernel.org
Cc: brgerst@gmail.com
Cc: gorcunov@gmail.com
Cc: shaohui.zheng@intel.com
Cc: rientjes@google.com
LKML-Reference: <1295789862-25482-14-git-send-email-tj@kernel.org>
Signed-off-by: NIngo Molnar <mingo@elte.hu>
Cc: David Rientjes <rientjes@google.com>

bbc9e2f4

x86, perf: Change two init functions to static · dda99116

由 Yinghai Lu 提交于 1月 21, 2011

init_hw_perf_events() is called via early_initcall now.
x86_pmu_event_init is x86_pmu member function.

So we can change them to static.
Signed-off-by: NYinghai Lu <yinghai@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
LKML-Reference: <4D3A16F9.109@kernel.org>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

dda99116

perf: Fix Pentium4 raw event validation · d038b12c

由 Stephane Eranian 提交于 1月 25, 2011

This patch fixes some issues with raw event validation on
Pentium 4 (Netburst) based processors.

As I was testing libpfm4 Netburst support, I ran into two
problems in the p4_validate_raw_event() function:

   - the shared field must be checked ONLY when HT is on
   - the binding to ESCR register was missing

The second item was causing raw events to not be encoded
correctly compared to generic PMU events.

With this patch, I can now pass Netburst events to libpfm4
examples and get meaningful results:

  $ task -e global_power_events:running:u  noploop 1
  noploop for 1 seconds
  3,206,304,898 global_power_events:running
Signed-off-by: NStephane Eranian <eranian@google.com>
Acked-by: NCyrill Gorcunov <gorcunov@openvz.org>
Cc: peterz@infradead.org
Cc: paulus@samba.org
Cc: davem@davemloft.net
Cc: fweisbec@gmail.com
Cc: perfmon2-devel@lists.sf.net
Cc: eranian@gmail.com
Cc: robert.richter@amd.com
Cc: acme@redhat.com
Cc: gorcunov@gmail.com
Cc: ming.m.lin@intel.com
LKML-Reference: <4d3efb2f.1252d80a.1a80.ffffc83f@mx.google.com>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

d038b12c

26 1月, 2011 2 次提交

x86: Move llc_shared_map out of cpu_info · b3d7336d

由 Yinghai Lu 提交于 1月 21, 2011

cpu_info is already with per_cpu, We can take llc_shared_map out
of cpu_info, and declare it as per_cpu variable directly.

So later referencing could be simple and directly instead of
diving to find cpu_info at first.

Also could make smp_store_cpu_info() much simple to avoid to do
save and restore trick.
Signed-off-by: NYinghai Lu <yinghai@kernel.org>
Cc: Hans Rosenfeld <hans.rosenfeld@amd.com>
Cc: Alok N Kataria <akataria@vmware.com>
Cc: Stephen Hemminger <shemminger@vyatta.com>
Cc: Hans J. Koch <hjk@linutronix.de>
Cc: Tejun Heo <tj@kernel.org>
Cc: Borislav Petkov <borislav.petkov@amd.com>
Cc: Andreas Herrmann <andreas.herrmann3@amd.com>
Cc: Robert Richter <robert.richter@amd.com>
Cc: Suresh Siddha <suresh.b.siddha@intel.com>
LKML-Reference: <4D3A16E8.5020608@kernel.org>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

b3d7336d

x86, amd: Normalize compute unit IDs on multi-node processors · d518573d

由 Andreas Herrmann 提交于 1月 24, 2011

On multi-node CPUs we don't need the socket wide compute unit ID
but the node-wide compute unit ID. Thus we need to normalize the
value. This is similar to what we do with cpu_core_id.

A compute unit is then identified by physical_package_id,
node_id, and compute_unit_id.
Signed-off-by: NAndreas Herrmann <andreas.herrmann3@amd.com>
LKML-Reference: <1295881543-572552-2-git-send-email-hans.rosenfeld@amd.com>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

d518573d

21 1月, 2011 1 次提交

x86, mcheck, therm_throt.c: Export symbol platform_thermal_notify to allow coretemp to handler intr · f21bbec9

由 Fenghua Yu 提交于 1月 20, 2011

In therm_throt.c, commit
9e76a97e patch doesn't export
the symbol platform_thermal_notify.

Other drivers (e.g. drivers/hwmon/coretemp.c) can not find the
symbol platform_thermal_notify when defining threshould
interrupt handler.

Please apply this patch to allow threshold interrupt handler in
coretemp.
Signed-off-by: NFenghua Yu <fenghua.yu@intel.com>
Cc: R Durgadoss <durgadoss.r@intel.com>
Cc: khali@linux-fr.org <khali@linux-fr.org>
Cc: lm-sensors@lm-sensors.org <lm-sensors@lm-sensors.org>
Cc: Guenter Roeck <guenter.roeck@ericsson.com>
LKML-Reference: <20110121041239.GB26954@linux-os.sc.intel.com>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

f21bbec9

openanolis / cloud-kernel 12 个月 前同步成功

openanolis / cloud-kernel
12 个月前同步成功