1. 04 12月, 2009 1 次提交
  2. 03 12月, 2009 2 次提交
    • M
      x86, apic: Enable lapic nmi watchdog on AMD Family 11h · 7d1849af
      Mikael Pettersson 提交于
      The x86 lapic nmi watchdog does not recognize AMD Family 11h,
      resulting in:
      
        NMI watchdog: CPU not supported
      
      As far as I can see from available documentation (the BKDM),
      family 11h looks identical to family 10h as far as the PMU
      is concerned.
      
      Extending the check to accept family 11h results in:
      
        Testing NMI watchdog ... OK.
      
      I've been running with this change on a Turion X2 Ultra ZM-82
      laptop for a couple of weeks now without problems.
      Signed-off-by: NMikael Pettersson <mikpe@it.uu.se>
      Cc: Andreas Herrmann <andreas.herrmann3@amd.com>
      Cc: Joerg Roedel <joerg.roedel@amd.com>
      Cc: <stable@kernel.org>
      LKML-Reference: <19223.53436.931768.278021@pilspetsen.it.uu.se>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      7d1849af
    • H
      x86, mce: don't restart timer if disabled · fe5ed91d
      Hidetoshi Seto 提交于
      Even it is in error path unlikely taken, add_timer_on() at
      CPU_DOWN_FAILED* needs to be skipped if mce_timer is disabled.
      Signed-off-by: NHidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: Huang Ying <ying.huang@intel.com>
      Cc: Jan Beulich <jbeulich@novell.com>
      Cc: <stable@kernel.org>
      Signed-off-by: NH. Peter Anvin <hpa@zytor.com>
      fe5ed91d
  3. 26 11月, 2009 1 次提交
  4. 25 11月, 2009 7 次提交
  5. 24 11月, 2009 1 次提交
  6. 23 11月, 2009 2 次提交
    • I
      perf events: Do not generate function trace entries in perf code · 6e3d8330
      Ingo Molnar 提交于
      Decreases perf overhead when function tracing is enabled,
      by about 50%.
      
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <new-submission>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      6e3d8330
    • Y
      x86, numa: Use near(er) online node instead of roundrobin for NUMA · d9c2d5ac
      Yinghai Lu 提交于
      CPU to node mapping is set via the following sequence:
      
       1. numa_init_array(): Set up roundrobin from cpu to online node
      
       2. init_cpu_to_node(): Set that according to apicid_to_node[]
      			according to srat only handle the node that
      			is online, and leave other cpu on node
      			without ram (aka not online) to still
      			roundrobin.
      
      3. later call srat_detect_node for Intel/AMD, will use first_online
         node or nearby node.
      
      Problem is that setup_per_cpu_areas() is not called between 2 and 3,
      the per_cpu for cpu on node with ram is on different node, and could
      put that on node with two hops away.
      
      So try to optimize this and add find_near_online_node() and call
      init_cpu_to_node().
      Signed-off-by: NYinghai Lu <yinghai@kernel.org>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      LKML-Reference: <4B07A739.3030104@kernel.org>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      d9c2d5ac
  7. 18 11月, 2009 5 次提交
  8. 17 11月, 2009 1 次提交
    • H
      x86, mm: Clean up and simplify NX enablement · 4763ed4d
      H. Peter Anvin 提交于
      The 32- and 64-bit code used very different mechanisms for enabling
      NX, but even the 32-bit code was enabling NX in head_32.S if it is
      available.  Furthermore, we had a bewildering collection of tests for
      the available of NX.
      
      This patch:
      
      a) merges the 32-bit set_nx() and the 64-bit check_efer() function
         into a single x86_configure_nx() function.  EFER control is left
         to the head code.
      
      b) eliminates the nx_enabled variable entirely.  Things that need to
         test for NX enablement can verify __supported_pte_mask directly,
         and cpu_has_nx gives the supported status of NX.
      Signed-off-by: NH. Peter Anvin <hpa@zytor.com>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: Pekka Enberg <penberg@cs.helsinki.fi>
      Cc: Vegard Nossum <vegardno@ifi.uio.no>
      Cc: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
      Cc: Chris Wright <chrisw@sous-sol.org>
      LKML-Reference: <1258154897-6770-5-git-send-email-hpa@zytor.com>
      Acked-by: NKees Cook <kees.cook@canonical.com>
      4763ed4d
  9. 14 11月, 2009 3 次提交
    • I
      x86: Fix cpu_devs[] initialization in early_cpu_init() · 31c997ca
      Ingo Molnar 提交于
      Yinghai Lu noticed that this commit:
      
        0388423d: x86: Minimise printk spew from per-vendor init code
      
      mistakenly left out the initialization of cpu_devs[] in the
      !PROCESSOR_SELECT case. Fix it.
      Reported-by: NYinghai Lu <yinghai@kernel.org>
      Cc: Dave Jones <davej@redhat.com>
      LKML-Reference: <20091113203000.GA19160@redhat.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      31c997ca
    • R
      x86: Remove CPU cache size output for non-Intel too · b01c845f
      Roland Dreier 提交于
      As Dave Jones said about the output in intel_cacheinfo.c: "They
      aren't useful, and pollute the dmesg output a lot (especially on
      machines with many cores).  Also the same information can be
      trivially found out from userspace."
      
      Give the generic display_cacheinfo() function the same treatment.
      Signed-off-by: NRoland Dreier <rolandd@cisco.com>
      Acked-by: NDave Jones <davej@redhat.com>
      Cc: Mike Travis <travis@sgi.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Randy Dunlap <rdunlap@xenotime.net>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Greg Kroah-Hartman <gregkh@suse.de>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
      Cc: Jack Steiner <steiner@sgi.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <adaocn6dp99.fsf_-_@roland-alpha.cisco.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      b01c845f
    • D
      x86: Minimise printk spew from per-vendor init code · 0388423d
      Dave Jones 提交于
      In the default case where the kernel supports all CPU vendors,
      we currently print out a bunch of not useful messages on every
      system.
      
      32-bit:
      KERNEL supported cpus:
        Intel GenuineIntel
        AMD AuthenticAMD
        NSC Geode by NSC
        Cyrix CyrixInstead
        Centaur CentaurHauls
        Transmeta GenuineTMx86
        Transmeta TransmetaCPU
        UMC UMC UMC UMC
      
      64-bit:
      KERNEL supported cpus:
        Intel GenuineIntel
        AMD AuthenticAMD
        Centaur CentaurHauls
      
      Given that "what CPUs does the kernel support" isn't useful for
      the "support everything" case, we can suppress these printk's.
      Signed-off-by: NDave Jones <davej@redhat.com>
      LKML-Reference: <20091113203000.GA19160@redhat.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      0388423d
  10. 13 11月, 2009 1 次提交
    • D
      x86: Remove the CPU cache size printk's · 15cd8812
      Dave Jones 提交于
      They aren't really useful, and they pollute the dmesg output a lot
      (especially on machines with many cores).
      
      Also the same information can be trivially found out from
      userspace.
      Reported-by: NMike Travis <travis@sgi.com>
      Signed-off-by: NDave Jones <davej@redhat.com>
      Acked-by: NH. Peter Anvin <hpa@zytor.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Roland Dreier <rdreier@cisco.com>
      Cc: Randy Dunlap <rdunlap@xenotime.net>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Greg Kroah-Hartman <gregkh@suse.de>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
      Cc: Jack Steiner <steiner@sgi.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <20091112231542.GA7129@redhat.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      15cd8812
  11. 12 11月, 2009 2 次提交
  12. 11 11月, 2009 3 次提交
  13. 10 11月, 2009 1 次提交
    • Y
      x86: Under BIOS control, restore AP's APIC_LVTTHMR to the BSP value · a2202aa2
      Yong Wang 提交于
      On platforms where the BIOS handles the thermal monitor interrupt,
      APIC_LVTTHMR on each logical CPU is programmed to generate a SMI
      and OS must not touch it.
      
      Unfortunately AP bringup sequence using INIT-SIPI-SIPI clears all
      the LVT entries except the mask bit. Essentially this results in
      all LVT entries including the thermal monitoring interrupt set
      to masked (clearing the bios programmed value for APIC_LVTTHMR).
      
      And this leads to kernel take over the thermal monitoring
      interrupt on AP's but not on BSP (leaving the bios programmed
      value only on BSP).
      
      As a result of this, we have seen system hangs when the thermal
      monitoring interrupt is generated.
      
      Fix this by reading the initial value of thermal LVT entry on
      BSP and if bios has taken over the control, then program the
      same value on all AP's and leave the thermal monitoring
      interrupt control on all the logical cpu's to the bios.
      Signed-off-by: NYong Wang <yong.y.wang@intel.com>
      Reviewed-by: NSuresh Siddha <suresh.b.siddha@intel.com>
      Cc: Borislav Petkov <borislav.petkov@amd.com>
      Cc: Arjan van de Ven <arjan@infradead.org>
      LKML-Reference: <20091110013824.GA24940@ywang-moblin2.bj.intel.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      Cc: stable@kernel.org
      a2202aa2
  14. 02 11月, 2009 1 次提交
  15. 16 10月, 2009 3 次提交
  16. 13 10月, 2009 2 次提交
    • H
      perf_event, x86, mce: Use TRACE_EVENT() for MCE logging · 8968f9d3
      Hidetoshi Seto 提交于
      This approach is the first baby step towards solving many of the
      structural problems the x86 MCE logging code is having today:
      
       - It has a private ring-buffer implementation that has a number
         of limitations and has been historically fragile and buggy.
      
       - It is using a quirky /dev/mcelog ioctl driven ABI that is MCE
         specific. /dev/mcelog is not part of any larger logging
         framework and hence has remained on the fringes for many years.
      
       - The MCE logging code is still very unclean partly due to its ABI
         limitations. Fields are being reused for multiple purposes, and
         the whole message structure is limited and x86 specific to begin
         with.
      
      All in one, the x86 tree would like to move away from this private
      implementation of an event logging facility to a broader framework.
      
      By using perf events we gain the following advantages:
      
       - Multiple user-space agents can access MCE events. We can have an
         mcelog daemon running but also a system-wide tracer capturing
         important events in flight-recorder mode.
      
       - Sampling support: the kernel and the user-space call-chain of MCE
         events can be stored and analyzed as well. This way actual patterns
         of bad behavior can be matched to precisely what kind of activity
         happened in the kernel (and/or in the app) around that moment in
         time.
      
       - Coupling with other hardware and software events: the PMU can track a
         number of other anomalies - monitoring software might chose to
         monitor those plus the MCE events as well - in one coherent stream of
         events.
      
       - Discovery of MCE sources - tracepoints are enumerated and tools can
         act upon the existence (or non-existence) of various channels of MCE
         information.
      
       - Filtering support: we just subscribe to and act upon the events we
         are interested in. Then even on a per event source basis there's
         in-kernel filter expressions available that can restrict the amount
         of data that hits the event channel.
      
       - Arbitrary deep per cpu buffering of events - we can buffer 32
         entries or we can buffer as much as we want, as long as we have
         the RAM.
      
       - An NMI-safe ring-buffer implementation - mappable to user-space.
      
       - Built-in support for timestamping of events, PID markers, CPU
         markers, etc.
      
       - A rich ABI accessible over system call interface. Per cpu, per task
         and per workload monitoring of MCE events can be done this way. The
         ABI itself has a nice, meaningful structure.
      
       - Extensible ABI: new fields can be added without breaking tooling.
         New tracepoints can be added as the hardware side evolves. There's
         various parsers that can be used.
      
       - Lots of scheduling/buffering/batching modes of operandi for MCE
         events. poll() support. mmap() support. read() support. You name it.
      
       - Rich tooling support: even without any MCE specific extensions added
         the 'perf' tool today offers various views of MCE data: perf report,
         perf stat, perf trace can all be used to view logged MCE events and
         perhaps correlate them to certain user-space usage patterns. But it
         can be used directly as well, for user-space agents and policy action
         in mcelog, etc.
      
      With this we hope to achieve significant code cleanup and feature
      improvements in the MCE code, and we hope to be able to drop the
      /dev/mcelog facility in the end.
      
      This patch is just a plain dumb dump of mce_log() records to
      the tracepoints / perf events framework - a first proof of
      concept step.
      Signed-off-by: NHidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
      Cc: Huang Ying <ying.huang@intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      LKML-Reference: <4AD42A0D.7050104@jp.fujitsu.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      8968f9d3
    • I
      perf_events, x86: Fix event constraints code · 7a693d3f
      Ingo Molnar 提交于
      There was namespace overlap due to a rename i did - this caused
      the following build warning, reported by Stephen Rothwell against
      linux-next x86_64 allmodconfig:
      
        arch/x86/kernel/cpu/perf_event.c: In function 'intel_get_event_idx':
        arch/x86/kernel/cpu/perf_event.c:1445: warning: 'event_constraint' is used uninitialized in this function
      
      This is a real bug not just a warning: fix it by renaming the
      global event-constraints table pointer to 'event_constraints'.
      Reported-by: NStephen Rothwell <sfr@canb.auug.org.au>
      Cc: Stephane Eranian <eranian@gmail.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <20091013144223.369d616d.sfr@canb.auug.org.au>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      7a693d3f
  17. 12 10月, 2009 2 次提交
  18. 09 10月, 2009 2 次提交
    • P
      perf, x86: Add simple group validation · fe9081cc
      Peter Zijlstra 提交于
      Refuse to add events when the group wouldn't fit onto the PMU
      anymore.
      
      Naive implementation.
      Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Stephane Eranian <eranian@gmail.com>
      LKML-Reference: <1254911461.26976.239.camel@twins>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      fe9081cc
    • S
      perf_events: Add event constraints support for Intel processors · b690081d
      Stephane Eranian 提交于
      On some Intel processors, not all events can be measured in all
      counters. Some events can only be measured in one particular
      counter, for instance. Assigning an event to the wrong counter does
      not crash the machine but this yields bogus counts, i.e., silent
      error.
      
      This patch changes the event to counter assignment logic to take
      into account event constraints for Intel P6, Core and Nehalem
      processors. There is no contraints on Intel Atom. There are
      constraints on Intel Yonah (Core Duo) but they are not provided in
      this patch given that this processor is not yet supported by
      perf_events.
      
      As a result of the constraints, it is possible for some event
      groups to never actually be loaded onto the PMU if they contain two
      events which can only be measured on a single counter. That
      situation can be detected with the scaling information extracted
      with read().
      Signed-off-by: NStephane Eranian <eranian@gmail.com>
      Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      LKML-Reference: <1254840129-6198-3-git-send-email-eranian@gmail.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      b690081d