1. 01 4月, 2011 1 次提交
  2. 25 3月, 2011 2 次提交
    • I
      perf, x86: Complain louder about BIOSen corrupting CPU/PMU state and continue · 45daae57
      Ingo Molnar 提交于
      Eric Dumazet reported that hardware PMU events do not work on his
      system, due to the BIOS corrupting PMU state:
      
          Performance Events: PEBS fmt0+, Core2 events, Broken BIOS detected, using software events only.
          [Firmware Bug]: the BIOS has corrupted hw-PMU resources (MSR 186 is 43003c)
      
      Linus suggested that we continue in the face of such BIOS-induced CPU
      state corruption:
      
         http://lkml.org/lkml/2011/3/24/608
      
      Such BIOSes will have to be fixed - Linux developers rely on a working and
      fully capable PMU and the BIOS interfering with the CPU's PMU state is simply
      not acceptable.
      
      So this patch changes perf to continue when it detects such BIOS
      interaction, some hardware events may be unreliable due to the BIOS
      writing and re-writing them - there's not much the kernel can do
      about that but to detect the corruption and report it.
      Reported-and-tested-by: NEric Dumazet <eric.dumazet@gmail.com>
      Suggested-by: NLinus Torvalds <torvalds@linux-foundation.org>
      Acked-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      LKML-Reference: <new-submission>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      45daae57
    • D
      perf, x86: P4 PMU - Read proper MSR register to catch unflagged overflows · 242214f9
      Don Zickus 提交于
      The read of a proper MSR register was missed and instead of
      counter the configration register was tested (it has
      ARCH_P4_UNFLAGGED_BIT always cleared) leading to unknown NMI
      hitting the system. As result the user may obtain "Dazed and
      confused, but trying to continue" message. Fix it by reading a
      proper MSR register.
      
      When an NMI happens on a P4, the perf nmi handler checks the
      configuration register to see if the overflow bit is set or not
      before taking appropriate action.  Unfortunately, various P4
      machines had a broken overflow bit, so a backup mechanism was
      implemented.  This mechanism checked to see if the counter
      rolled over or not.
      
      A previous commit that implemented this backup mechanism was
      broken. Instead of reading the counter register, it used the
      configuration register to determine if the counter rolled over
      or not. Reading that bit would give incorrect results.
      
      This would lead to 'Dazed and confused' messages for the end
      user when using the perf tool (or if the nmi watchdog is
      running).
      
      The fix is to read the counter register before determining if
      the counter rolled over or not.
      Signed-off-by: NDon Zickus <dzickus@redhat.com>
      Signed-off-by: NCyrill Gorcunov <gorcunov@openvz.org>
      Cc: Lin Ming <ming.m.lin@intel.com>
      LKML-Reference: <4D8BAB49.3080701@openvz.org>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      242214f9
  3. 24 3月, 2011 1 次提交
    • R
      x86: Use syscore_ops instead of sysdev classes and sysdevs · f3c6ea1b
      Rafael J. Wysocki 提交于
      Some subsystems in the x86 tree need to carry out suspend/resume and
      shutdown operations with one CPU on-line and interrupts disabled and
      they define sysdev classes and sysdevs or sysdev drivers for this
      purpose.  This leads to unnecessarily complicated code and excessive
      memory usage, so switch them to using struct syscore_ops objects for
      this purpose instead.
      
      Generally, there are three categories of subsystems that use
      sysdevs for implementing PM operations: (1) subsystems whose
      suspend/resume callbacks ignore their arguments entirely (the
      majority), (2) subsystems whose suspend/resume callbacks use their
      struct sys_device argument, but don't really need to do that,
      because they can be implemented differently in an arguably simpler
      way (io_apic.c), and (3) subsystems whose suspend/resume callbacks
      use their struct sys_device argument, but the value of that argument
      is always the same and could be ignored (microcode_core.c).  In all
      of these cases the subsystems in question may be readily converted to
      using struct syscore_ops objects for power management and shutdown.
      Signed-off-by: NRafael J. Wysocki <rjw@sisk.pl>
      Reviewed-by: NThomas Gleixner <tglx@linutronix.de>
      Acked-by: NIngo Molnar <mingo@elte.hu>
      f3c6ea1b
  4. 22 3月, 2011 1 次提交
    • H
      ACPI, APEI, Add ERST record ID cache · 885b976f
      Huang Ying 提交于
      APEI ERST firmware interface and implementation has no multiple users
      in mind.  For example, if there is four records in storage with ID: 1,
      2, 3 and 4, if two ERST readers enumerate the records via
      GET_NEXT_RECORD_ID as follow,
      
      reader 1		reader 2
      1
      			2
      3
      			4
      -1
      			-1
      
      where -1 signals there is no more record ID.
      
      Reader 1 has no chance to check record 2 and 4, while reader 2 has no
      chance to check record 1 and 3.  And any other GET_NEXT_RECORD_ID will
      return -1, that is, other readers will has no chance to check any
      record even they are not cleared by anyone.
      
      This makes raw GET_NEXT_RECORD_ID not suitable for used by multiple
      users.
      
      To solve the issue, an in-memory ERST record ID cache is designed and
      implemented.  When enumerating record ID, the ID returned by
      GET_NEXT_RECORD_ID is added into cache in addition to be returned to
      caller.  So other readers can check the cache to get all record ID
      available.
      Signed-off-by: NHuang Ying <ying.huang@intel.com>
      Reviewed-by: NAndi Kleen <ak@linux.intel.com>
      Signed-off-by: NLen Brown <len.brown@intel.com>
      885b976f
  5. 20 3月, 2011 1 次提交
    • S
      perf, x86: Fix Intel fixed counters base initialization · fc66c521
      Stephane Eranian 提交于
      The following patch solves the problems introduced by Robert's
      commit 41bf4989 and reported by Arun Sharma. This commit gets rid
      of the base + index notation for reading and writing PMU msrs.
      
      The problem is that for fixed counters, the new calculation for
      the base did not take into account the fixed counter indexes,
      thus all fixed counters were read/written from fixed counter 0.
      Although all fixed counters share the same config MSR, they each
      have their own counter register.
      
      Without:
      
       $ task -e unhalted_core_cycles -e instructions_retired -e baclears noploop 1 noploop for 1 seconds
      
        242202299 unhalted_core_cycles (0.00% scaling, ena=1000790892, run=1000790892)
       2389685946 instructions_retired (0.00% scaling, ena=1000790892, run=1000790892)
            49473 baclears             (0.00% scaling, ena=1000790892, run=1000790892)
      
      With:
      
       $ task -e unhalted_core_cycles -e instructions_retired -e baclears noploop 1 noploop for 1 seconds
      
       2392703238 unhalted_core_cycles (0.00% scaling, ena=1000840809, run=1000840809)
       2389793744 instructions_retired (0.00% scaling, ena=1000840809, run=1000840809)
            47863 baclears             (0.00% scaling, ena=1000840809, run=1000840809)
      Signed-off-by: NStephane Eranian <eranian@google.com>
      Cc: peterz@infradead.org
      Cc: ming.m.lin@intel.com
      Cc: robert.richter@amd.com
      Cc: asharma@fb.com
      Cc: perfmon2-devel@lists.sf.net
      LKML-Reference: <20110319172005.GB4978@quad>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      fc66c521
  6. 18 3月, 2011 2 次提交
    • N
      x86, dumpstack: Correct stack dump info when frame pointer is available · e8e999cf
      Namhyung Kim 提交于
      Current stack dump code scans entire stack and check each entry
      contains a pointer to kernel code. If CONFIG_FRAME_POINTER=y it
      could mark whether the pointer is valid or not based on value of
      the frame pointer. Invalid entries could be preceded by '?' sign.
      
      However this was not going to happen because scan start point
      was always higher than the frame pointer so that they could not
      meet.
      
      Commit 9c0729dc ("x86: Eliminate bp argument from the stack
      tracing routines") delayed bp acquisition point, so the bp was
      read in lower frame, thus all of the entries were marked
      invalid.
      
      This patch fixes this by reverting above commit while retaining
      stack_frame() helper as suggested by Frederic Weisbecker.
      
      End result looks like below:
      
      before:
      
       [    3.508329] Call Trace:
       [    3.508551]  [<ffffffff814f35c9>] ? panic+0x91/0x199
       [    3.508662]  [<ffffffff814f3739>] ? printk+0x68/0x6a
       [    3.508770]  [<ffffffff81a981b2>] ? mount_block_root+0x257/0x26e
       [    3.508876]  [<ffffffff81a9821f>] ? mount_root+0x56/0x5a
       [    3.508975]  [<ffffffff81a98393>] ? prepare_namespace+0x170/0x1a9
       [    3.509216]  [<ffffffff81a9772b>] ? kernel_init+0x1d2/0x1e2
       [    3.509335]  [<ffffffff81003894>] ? kernel_thread_helper+0x4/0x10
       [    3.509442]  [<ffffffff814f6880>] ? restore_args+0x0/0x30
       [    3.509542]  [<ffffffff81a97559>] ? kernel_init+0x0/0x1e2
       [    3.509641]  [<ffffffff81003890>] ? kernel_thread_helper+0x0/0x10
      
      after:
      
       [    3.522991] Call Trace:
       [    3.523351]  [<ffffffff814f35b9>] panic+0x91/0x199
       [    3.523468]  [<ffffffff814f3729>] ? printk+0x68/0x6a
       [    3.523576]  [<ffffffff81a981b2>] mount_block_root+0x257/0x26e
       [    3.523681]  [<ffffffff81a9821f>] mount_root+0x56/0x5a
       [    3.523780]  [<ffffffff81a98393>] prepare_namespace+0x170/0x1a9
       [    3.523885]  [<ffffffff81a9772b>] kernel_init+0x1d2/0x1e2
       [    3.523987]  [<ffffffff81003894>] kernel_thread_helper+0x4/0x10
       [    3.524228]  [<ffffffff814f6880>] ? restore_args+0x0/0x30
       [    3.524345]  [<ffffffff81a97559>] ? kernel_init+0x0/0x1e2
       [    3.524445]  [<ffffffff81003890>] ? kernel_thread_helper+0x0/0x10
      
       -v5:
         * fix build breakage with oprofile
      
       -v4:
         * use 0 instead of regs->bp
         * separate out printk changes
      
       -v3:
         * apply comment from Frederic
         * add a couple of printk fixes
      Signed-off-by: NNamhyung Kim <namhyung@gmail.com>
      Acked-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      Acked-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Soren Sandmann <ssp@redhat.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Robert Richter <robert.richter@amd.com>
      LKML-Reference: <1300416006-3163-1-git-send-email-namhyung@gmail.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      e8e999cf
    • L
      x86: Fix common misspellings · 0d2eb44f
      Lucas De Marchi 提交于
      They were generated by 'codespell' and then manually reviewed.
      Signed-off-by: NLucas De Marchi <lucas.demarchi@profusion.mobi>
      Cc: trivial@kernel.org
      LKML-Reference: <1300389856-1099-3-git-send-email-lucas.demarchi@profusion.mobi>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      0d2eb44f
  7. 17 3月, 2011 2 次提交
  8. 16 3月, 2011 3 次提交
  9. 10 3月, 2011 1 次提交
  10. 05 3月, 2011 2 次提交
  11. 04 3月, 2011 3 次提交
    • A
      perf: Fix LLC-* events on Intel Nehalem/Westmere · e994d7d2
      Andi Kleen 提交于
      On Intel Nehalem and Westmere CPUs the generic perf LLC-* events count the
      L2 caches, not the real L3 LLC - this was inconsistent with behavior on
      other CPUs.
      
      Fixing this requires the use of the special OFFCORE_RESPONSE
      events which need a separate mask register.
      
      This has been implemented by the previous patch, now use this infrastructure
      to set correct events for the LLC-* on Nehalem and Westmere.
      Signed-off-by: NAndi Kleen <ak@linux.intel.com>
      Signed-off-by: NLin Ming <ming.m.lin@intel.com>
      Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      LKML-Reference: <1299119690-13991-3-git-send-email-ming.m.lin@intel.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      e994d7d2
    • A
      perf: Add support for supplementary event registers · a7e3ed1e
      Andi Kleen 提交于
      Change logs against Andi's original version:
      
      - Extends perf_event_attr:config to config{,1,2} (Peter Zijlstra)
      - Fixed a major event scheduling issue. There cannot be a ref++ on an
        event that has already done ref++ once and without calling
        put_constraint() in between. (Stephane Eranian)
      - Use thread_cpumask for percore allocation. (Lin Ming)
      - Use MSR names in the extra reg lists. (Lin Ming)
      - Remove redundant "c = NULL" in intel_percore_constraints
      - Fix comment of perf_event_attr::config1
      
      Intel Nehalem/Westmere have a special OFFCORE_RESPONSE event
      that can be used to monitor any offcore accesses from a core.
      This is a very useful event for various tunings, and it's
      also needed to implement the generic LLC-* events correctly.
      
      Unfortunately this event requires programming a mask in a separate
      register. And worse this separate register is per core, not per
      CPU thread.
      
      This patch:
      
      - Teaches perf_events that OFFCORE_RESPONSE needs extra parameters.
        The extra parameters are passed by user space in the
        perf_event_attr::config1 field.
      
      - Adds support to the Intel perf_event core to schedule per
        core resources. This adds fairly generic infrastructure that
        can be also used for other per core resources.
        The basic code has is patterned after the similar AMD northbridge
        constraints code.
      
      Thanks to Stephane Eranian who pointed out some problems
      in the original version and suggested improvements.
      Signed-off-by: NAndi Kleen <ak@linux.intel.com>
      Signed-off-by: NLin Ming <ming.m.lin@intel.com>
      Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      LKML-Reference: <1299119690-13991-2-git-send-email-ming.m.lin@intel.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      a7e3ed1e
    • S
      perf_events: Update PEBS event constraints · 17e31629
      Stephane Eranian 提交于
      This patch updates PEBS event constraints for Intel Atom, Nehalem, Westmere.
      
      This patch also reorganizes the PEBS format/constraint detection code. It is
      now based on processor model and not PEBS format. Two processors may use the
      same PEBS format without have the same list of PEBS events.
      
      In this second version, we simplified the initialization of the PEBS
      constraints by leveraging the existing switch() statement in perf_event_intel.c.
      We also renamed the constraint tables to be more consistent with regular
      constraints.
      
      In this 3rd version, we drop BR_INST_RETIRED.MISPRED from Intel Atom as it does
      not seem to work. Use MISPREDICTED_BRANCH_RETIRED instead. Also add FP_ASSIST.*
      o both Intel Nehalem and Westmere. I misssed those in the earlier patches.
      Events were tested using libpfm4 perf_examples.
      Signed-off-by: NStephane Eranian <eranian@google.com>
      Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      LKML-Reference: <4d6e6b02.815bdf0a.637b.07a7@mx.google.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      17e31629
  12. 02 3月, 2011 4 次提交
  13. 16 2月, 2011 6 次提交
  14. 15 2月, 2011 1 次提交
    • B
      x86, amd: Initialize variable properly · 9e81509e
      Borislav Petkov 提交于
      Commit d518573d ("x86, amd: Normalize compute unit IDs on
      multi-node processors") introduced compute unit normalization
      but causes a compiler warning:
      
       arch/x86/kernel/cpu/amd.c: In function 'amd_detect_cmp':
       arch/x86/kernel/cpu/amd.c:268: warning: 'cores_per_cu' may be used uninitialized in this function
       arch/x86/kernel/cpu/amd.c:268: note: 'cores_per_cu' was declared here
      
      The compiler is right - initialize it with a proper value.
      
      Also, fix up a comment while at it.
      Reported-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NBorislav Petkov <borislav.petkov@amd.com>
      Cc: Andreas Herrmann <andreas.herrmann3@amd.com>
      LKML-Reference: <20110214171451.GB10076@kryptos.osrc.amd.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      9e81509e
  15. 08 2月, 2011 1 次提交
    • H
      x86, amd: Support L3 Cache Partitioning on AMD family 0x15 CPUs · cabb5bd7
      Hans Rosenfeld 提交于
      L3 Cache Partitioning allows selecting which of the 4 L3 subcaches can be used
      for evictions by the L2 cache of each compute unit. By writing a 4-bit
      hexadecimal mask into the the sysfs file
      /sys/devices/system/cpu/cpuX/cache/index3/subcaches, the user can set the
      enabled subcaches for a CPU.
      
      The settings are directly read from and written to the hardware, so there is no
      way to have contradicting settings for two CPUs belonging to the same compute
      unit. Writing will always overwrite any previous setting for a compute unit.
      Signed-off-by: NHans Rosenfeld <hans.rosenfeld@amd.com>
      Cc: <Andreas.Herrmann3@amd.com>
      LKML-Reference: <1297098639-431383-1-git-send-email-hans.rosenfeld@amd.com>
      [ -v3: minor style fixes ]
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      cabb5bd7
  16. 03 2月, 2011 1 次提交
    • S
      x86, mtrr: Avoid MTRR reprogramming on BP during boot on UP platforms · f7448548
      Suresh Siddha 提交于
      Markus Kohn ran into a hard hang regression on an acer aspire
      1310, when acpi is enabled. git bisect showed the following
      commit as the bad one that introduced the boot regression.
      
      	commit d0af9eed
      	Author: Suresh Siddha <suresh.b.siddha@intel.com>
      	Date:   Wed Aug 19 18:05:36 2009 -0700
      
      	    x86, pat/mtrr: Rendezvous all the cpus for MTRR/PAT init
      
      Because of the UP configuration of that platform,
      native_smp_prepare_cpus() bailed out (in smp_sanity_check())
      before doing the set_mtrr_aps_delayed_init()
      
      Further down the boot path, native_smp_cpus_done() will call the
      delayed MTRR initialization for the AP's (mtrr_aps_init()) with
      mtrr_aps_delayed_init not set. This resulted in the boot
      processor reprogramming its MTRR's to the values seen during the
      start of the OS boot. While this is not needed ideally, this
      shouldn't have caused any side-effects. This is because the
      reprogramming of MTRR's (set_mtrr_state() that gets called via
      set_mtrr()) will check if the live register contents are
      different from what is being asked to write and will do the actual
      write only if they are different.
      
      BP's mtrr state is read during the start of the OS boot and
      typically nothing would have changed when we ask to reprogram it
      on BP again because of the above scenario on an UP platform. So
      on a normal UP platform no reprogramming of BP MTRR MSR's
      happens and all is well.
      
      However, on this platform, bios seems to be modifying the fixed
      mtrr range registers between the start of OS boot and when we
      double check the live registers for reprogramming BP MTRR
      registers. And as the live registers are modified, we end up
      reprogramming the MTRR's to the state seen during the start of
      the OS boot.
      
      During ACPI initialization, something in the bios (probably smi
      handler?) don't like this fact and results in a hard lockup.
      
      We didn't see this boot hang issue on this platform before the
      commit d0af9eed, because only
      the AP's (if any) will program its MTRR's to the value that BP
      had at the start of the OS boot.
      
      Fix this issue by checking mtrr_aps_delayed_init before
      continuing further in the mtrr_aps_init(). Now, only AP's (if
      any) will program its MTRR's to the BP values during boot.
      
      Addresses https://bugzilla.novell.com/show_bug.cgi?id=623393
      
        [ By the way, this behavior of the bios modifying MTRR's after the start
          of the OS boot is not common and the kernel is not prepared to
          handle this situation well. Irrespective of this issue, during
          suspend/resume, linux kernel will try to reprogram the BP's MTRR values
          to the values seen during the start of the OS boot. So suspend/resume might
          be already broken on this platform for all linux kernel versions. ]
      Reported-and-bisected-by: NMarkus Kohn <jabber@gmx.org>
      Tested-by: NMarkus Kohn <jabber@gmx.org>
      Signed-off-by: NSuresh Siddha <suresh.b.siddha@intel.com>
      Cc: Thomas Renninger <trenn@novell.com>
      Cc: Rafael Wysocki <rjw@novell.com>
      Cc: Venkatesh Pallipadi <venki@google.com>
      Cc: stable@kernel.org # [v2.6.32+]
      LKML-Reference: <1296694975.4418.402.camel@sbsiddha-MOBL3.sc.intel.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      f7448548
  17. 28 1月, 2011 5 次提交
    • T
      x86: Unify node_to_cpumask_map handling between 32 and 64bit · de2d9445
      Tejun Heo 提交于
      x86_32 has been managing node_to_cpumask_map explicitly from
      map_cpu_to_node() and friends in a rather ugly way.  With
      previous changes, it's now possible to share the code with
      64bit.
      
      * When CONFIG_NUMA_EMU is disabled, numa_add/remove_cpu() are
        implemented in numa.c and shared by 32 and 64bit.  CONFIG_NUMA_EMU
        versions still live in numa_64.c.
      
        NUMA_EMU's dependency on 64bit is planned to be removed and the
        above should go away together.
      
      * identify_cpu() now calls numa_add_cpu() for 32bit too.  This
        makes the explicit mask management from map_cpu_to_node() unnecessary.
      
      * The whole x86_32 specific map_cpu_to_node() chunk is no longer
        necessary.  Dropped.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Reviewed-by: NPekka Enberg <penberg@kernel.org>
      Cc: eric.dumazet@gmail.com
      Cc: yinghai@kernel.org
      Cc: brgerst@gmail.com
      Cc: gorcunov@gmail.com
      Cc: shaohui.zheng@intel.com
      Cc: rientjes@google.com
      LKML-Reference: <1295789862-25482-16-git-send-email-tj@kernel.org>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Shaohui Zheng <shaohui.zheng@intel.com>
      de2d9445
    • T
      x86: Unify CPU -> NUMA node mapping between 32 and 64bit · 645a7919
      Tejun Heo 提交于
      Unlike 64bit, 32bit has been using its own cpu_to_node_map[] for
      CPU -> NUMA node mapping.  Replace it with early_percpu variable
      x86_cpu_to_node_map and share the mapping code with 64bit.
      
      * USE_PERCPU_NUMA_NODE_ID is now enabled for 32bit too.
      
      * x86_cpu_to_node_map and numa_set/clear_node() are moved from
        numa_64 to numa.  For now, on 32bit, x86_cpu_to_node_map is initialized
        with 0 instead of NUMA_NO_NODE.  This is to avoid introducing unexpected
        behavior change and will be updated once init path is unified.
      
      * srat_detect_node() is now enabled for x86_32 too.  It calls
        numa_set_node() and initializes the mapping making explicit
        cpu_to_node_map[] updates from map/unmap_cpu_to_node() unnecessary.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Cc: eric.dumazet@gmail.com
      Cc: yinghai@kernel.org
      Cc: brgerst@gmail.com
      Cc: gorcunov@gmail.com
      Cc: penberg@kernel.org
      Cc: shaohui.zheng@intel.com
      Cc: rientjes@google.com
      LKML-Reference: <1295789862-25482-15-git-send-email-tj@kernel.org>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      Cc: David Rientjes <rientjes@google.com>
      645a7919
    • T
      x86: Unify cpu/apicid <-> NUMA node mapping between 32 and 64bit · bbc9e2f4
      Tejun Heo 提交于
      The mapping between cpu/apicid and node is done via
      apicid_to_node[] on 64bit and apicid_2_node[] +
      apic->x86_32_numa_cpu_node() on 32bit. This difference makes it
      difficult to further unify 32 and 64bit NUMA handling.
      
      This patch unifies it by replacing both apicid_to_node[] and
      apicid_2_node[] with __apicid_to_node[] array, which is accessed
      by two accessors - set_apicid_to_node() and numa_cpu_node().  On
      64bit, numa_cpu_node() always consults __apicid_to_node[]
      directly while 32bit goes through apic->numa_cpu_node() method
      to allow apic implementations to override it.
      
      srat_detect_node() for amd cpus contains workaround for broken
      NUMA configuration which assumes relationship between APIC ID,
      HT node ID and NUMA topology.  Leave it to access
      __apicid_to_node[] directly as mapping through CPU might result
      in undesirable behavior change.  The comment is reformatted and
      updated to note the ugliness.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Reviewed-by: NPekka Enberg <penberg@kernel.org>
      Cc: eric.dumazet@gmail.com
      Cc: yinghai@kernel.org
      Cc: brgerst@gmail.com
      Cc: gorcunov@gmail.com
      Cc: shaohui.zheng@intel.com
      Cc: rientjes@google.com
      LKML-Reference: <1295789862-25482-14-git-send-email-tj@kernel.org>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      Cc: David Rientjes <rientjes@google.com>
      bbc9e2f4
    • Y
      x86, perf: Change two init functions to static · dda99116
      Yinghai Lu 提交于
      init_hw_perf_events() is called via early_initcall now.
      x86_pmu_event_init is x86_pmu member function.
      
      So we can change them to static.
      Signed-off-by: NYinghai Lu <yinghai@kernel.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
      LKML-Reference: <4D3A16F9.109@kernel.org>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      dda99116
    • S
      perf: Fix Pentium4 raw event validation · d038b12c
      Stephane Eranian 提交于
      This patch fixes some issues with raw event validation on
      Pentium 4 (Netburst) based processors.
      
      As I was testing libpfm4 Netburst support, I ran into two
      problems in the p4_validate_raw_event() function:
      
         - the shared field must be checked ONLY when HT is on
         - the binding to ESCR register was missing
      
      The second item was causing raw events to not be encoded
      correctly compared to generic PMU events.
      
      With this patch, I can now pass Netburst events to libpfm4
      examples and get meaningful results:
      
        $ task -e global_power_events:running:u  noploop 1
        noploop for 1 seconds
        3,206,304,898 global_power_events:running
      Signed-off-by: NStephane Eranian <eranian@google.com>
      Acked-by: NCyrill Gorcunov <gorcunov@openvz.org>
      Cc: peterz@infradead.org
      Cc: paulus@samba.org
      Cc: davem@davemloft.net
      Cc: fweisbec@gmail.com
      Cc: perfmon2-devel@lists.sf.net
      Cc: eranian@gmail.com
      Cc: robert.richter@amd.com
      Cc: acme@redhat.com
      Cc: gorcunov@gmail.com
      Cc: ming.m.lin@intel.com
      LKML-Reference: <4d3efb2f.1252d80a.1a80.ffffc83f@mx.google.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      d038b12c
  18. 26 1月, 2011 2 次提交
    • Y
      x86: Move llc_shared_map out of cpu_info · b3d7336d
      Yinghai Lu 提交于
      cpu_info is already with per_cpu, We can take llc_shared_map out
      of cpu_info, and declare it as per_cpu variable directly.
      
      So later referencing could be simple and directly instead of
      diving to find cpu_info at first.
      
      Also could make smp_store_cpu_info() much simple to avoid to do
      save and restore trick.
      Signed-off-by: NYinghai Lu <yinghai@kernel.org>
      Cc: Hans Rosenfeld <hans.rosenfeld@amd.com>
      Cc: Alok N Kataria <akataria@vmware.com>
      Cc: Stephen Hemminger <shemminger@vyatta.com>
      Cc: Hans J. Koch <hjk@linutronix.de>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Borislav Petkov <borislav.petkov@amd.com>
      Cc: Andreas Herrmann <andreas.herrmann3@amd.com>
      Cc: Robert Richter <robert.richter@amd.com>
      Cc: Suresh Siddha <suresh.b.siddha@intel.com>
      LKML-Reference: <4D3A16E8.5020608@kernel.org>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      b3d7336d
    • A
      x86, amd: Normalize compute unit IDs on multi-node processors · d518573d
      Andreas Herrmann 提交于
      On multi-node CPUs we don't need the socket wide compute unit ID
      but the node-wide compute unit ID. Thus we need to normalize the
      value. This is similar to what we do with cpu_core_id.
      
      A compute unit is then identified by physical_package_id,
      node_id, and compute_unit_id.
      Signed-off-by: NAndreas Herrmann <andreas.herrmann3@amd.com>
      LKML-Reference: <1295881543-572552-2-git-send-email-hans.rosenfeld@amd.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      d518573d
  19. 21 1月, 2011 1 次提交