1. 04 Mar 2011, 1 commit
    • perf: Add support for supplementary event registers · a7e3ed1e
      Authored by Andi Kleen
      Changes relative to Andi's original version:
      
      - Extends perf_event_attr:config to config{,1,2} (Peter Zijlstra)
      - Fixed a major event scheduling issue: an event must not take
        an extra reference (ref++) when it already holds one without
        calling put_constraint() in between. (Stephane Eranian)
      - Use thread_cpumask for percore allocation. (Lin Ming)
      - Use MSR names in the extra reg lists. (Lin Ming)
      - Remove redundant "c = NULL" in intel_percore_constraints
      - Fix comment of perf_event_attr::config1
      
      Intel Nehalem/Westmere have a special OFFCORE_RESPONSE event
      that can be used to monitor any offcore accesses from a core.
      This is a very useful event for various tunings, and it's
      also needed to implement the generic LLC-* events correctly.
      
      Unfortunately this event requires programming a mask in a separate
      register. Worse, this separate register is per core, not per
      CPU thread.
      
      This patch:
      
      - Teaches perf_events that OFFCORE_RESPONSE needs extra parameters.
        The extra parameters are passed by user space in the
        perf_event_attr::config1 field.
      
      - Adds support to the Intel perf_event core for scheduling
        per-core resources. This adds fairly generic infrastructure
        that can also be used for other per-core resources. The basic
        code is patterned after the similar AMD northbridge
        constraints code.
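      The extra-parameter path can be sketched from the user-space side as
      follows; the raw code (event 0xB7, umask 0x01) is the Nehalem
      OFFCORE_RESPONSE_0 encoding, while the mask value passed for config1
      is purely illustrative, not a real request/response encoding.

```python
PERF_TYPE_RAW = 4  # attr.type value for raw events, from <linux/perf_event.h>

def raw_config(event, umask):
    """Pack an event select and unit mask into perf_event_attr.config."""
    return (umask << 8) | event

def offcore_attr(extra_mask):
    """Fields a perf_event_open() caller would set for OFFCORE_RESPONSE_0:
    config selects the event, config1 carries the extra mask that the
    kernel programs into the per-core OFFCORE_RESPONSE MSR."""
    return {
        "type": PERF_TYPE_RAW,
        "config": raw_config(0xB7, 0x01),  # OFFCORE_RESPONSE_0
        "config1": extra_mask,             # illustrative mask value
    }
```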
      
      Thanks to Stephane Eranian who pointed out some problems
      in the original version and suggested improvements.
      Signed-off-by: Andi Kleen <ak@linux.intel.com>
      Signed-off-by: Lin Ming <ming.m.lin@intel.com>
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      LKML-Reference: <1299119690-13991-2-git-send-email-ming.m.lin@intel.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  2. 18 Feb 2011, 2 commits
  3. 04 Jan 2011, 1 commit
  4. 19 Dec 2010, 1 commit
  5. 18 Nov 2010, 1 commit
  6. 24 Oct 2010, 1 commit
  7. 15 Oct 2010, 1 commit
    • oprofile, x86: Add support for IBS branch target address reporting · 25da6950
      Authored by Robert Richter
      This patch adds support for IBS branch target address reporting. A new
      MSR (MSRC001_103B IBS Branch Target Address) has been added that
      provides the logical address in canonical form for the branch
      target. The size of the IBS sample that is transferred to the userland
      has been increased.
      
      For backward compatibility, the userland daemon must explicitly
      enable the feature by writing to the oprofilefs file
      
       ibs_op/branch_target
      
      After enabling branch target address reporting, the userland daemon
      must handle the extended size of the IBS sample.
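      From the daemon side the opt-in is a single write; a minimal sketch,
      assuming the usual /dev/oprofile mount point for oprofilefs (only the
      ibs_op/branch_target path comes from this patch):

```python
from pathlib import Path

def enable_ibs_branch_target(oprofilefs_root):
    """Opt in to IBS branch-target reporting by writing '1' to
    ibs_op/branch_target under the oprofilefs root (commonly mounted at
    /dev/oprofile). Afterwards the daemon must expect the larger IBS
    sample size."""
    path = Path(oprofilefs_root) / "ibs_op" / "branch_target"
    path.write_text("1")
    return path
```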
      Signed-off-by: Robert Richter <robert.richter@amd.com>
  8. 01 Aug 2010, 1 commit
  9. 31 Jul 2010, 1 commit
  10. 22 Jul 2010, 1 commit
  11. 17 Jun 2010, 1 commit
    • x86: Look for IA32_ENERGY_PERF_BIAS support · 23016bf0
      Authored by Venkatesh Pallipadi
      The new IA32_ENERGY_PERF_BIAS MSR allows system software to give
      hardware a hint whether OS policy favors more power saving,
      or more performance.  This allows the OS to have some influence
      on internal hardware power/performance tradeoffs where the OS
      has previously had no influence.
      
      The support for this feature is indicated by CPUID.06H.ECX.bit3,
      as documented in the Intel Architectures Software Developer's Manual.
      
      This patch discovers support of this feature and displays it
      as "epb" in /proc/cpuinfo.
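      The discovery itself is one bit test; a minimal sketch (the leaf and
      bit number are the documented ones, the sample values in the test are
      made up):

```python
def has_epb(ecx_leaf_06h):
    """True if CPUID.06H:ECX bit 3 is set, i.e. the CPU exposes the
    IA32_ENERGY_PERF_BIAS MSR."""
    return bool(ecx_leaf_06h & (1 << 3))
```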
      Signed-off-by: Venkatesh Pallipadi <venki@google.com>
      LKML-Reference: <alpine.LFD.2.00.1006032310160.6669@localhost.localdomain>
      Signed-off-by: Len Brown <len.brown@intel.com>
      Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
  12. 11 Jun 2010, 1 commit
  13. 09 Jun 2010, 1 commit
  14. 25 May 2010, 1 commit
  15. 19 May 2010, 1 commit
  16. 26 Mar 2010, 1 commit
  17. 20 Mar 2010, 1 commit
    • x86, amd: Restrict usage of c1e_idle() · 035a02c1
      Authored by Andreas Herrmann
      Currently c1e_idle returns true for all CPUs greater than or equal to
      family 0xf model 0x40. This covers too many CPUs.
      
      Meanwhile a respective erratum for the underlying problem was filed
      (#400). This patch adds the logic to check whether erratum #400
      applies to a given CPU.
      Especially for CPUs where SMI/HW triggered C1e is not supported,
      c1e_idle() doesn't need to be used. We can check this by looking at
      the respective OSVW bit for erratum #400.
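      The OSVW lookup can be sketched as follows; the fallback policy for
      ids beyond the reported length is an assumption based on my reading
      of the OSVW scheme, not something stated in this changelog.

```python
def osvw_erratum_applies(osvw_id, osvw_id_length, status_bits):
    """Check an AMD OSVW status bit for an erratum.

    If the erratum's OSVW id is within the range reported by
    OSVW_ID_Length (MSR C001_0140), the matching bit of the OSVW status
    MSRs (C001_0141 onward) says whether the erratum applies. Ids beyond
    the reported length are not covered by OSVW at all, so this sketch
    conservatively reports the erratum as applying (real code would fall
    back to a family/model range check).
    """
    if osvw_id >= osvw_id_length:
        return True  # not covered by OSVW: assume affected
    return bool(status_bits & (1 << osvw_id))
```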
      
      Cc: <stable@kernel.org> # .32.x .33.x
      Signed-off-by: Andreas Herrmann <andreas.herrmann3@amd.com>
      LKML-Reference: <20100319110922.GA19614@alberich.amd.com>
      Signed-off-by: H. Peter Anvin <hpa@zytor.com>
  18. 19 Mar 2010, 1 commit
  19. 17 Dec 2009, 1 commit
  20. 16 Dec 2009, 1 commit
  21. 10 Sep 2009, 1 commit
  22. 30 Jul 2009, 1 commit
  23. 10 Jul 2009, 1 commit
  24. 01 Jul 2009, 1 commit
  25. 29 May 2009, 1 commit
  26. 09 May 2009, 1 commit
  27. 24 Mar 2009, 2 commits
  28. 25 Feb 2009, 1 commit
  29. 22 Jan 2009, 1 commit
  30. 17 Dec 2008, 1 commit
  31. 23 Oct 2008, 2 commits
  32. 15 Oct 2008, 1 commit
  33. 10 Sep 2008, 1 commit
  34. 23 Jul 2008, 1 commit
    • x86: consolidate header guards · 77ef50a5
      Authored by Vegard Nossum
      This patch is the result of an automatic script that consolidates the
      format of all the headers in include/asm-x86/.
      
      The format:
      
      1. No leading underscore. Names with leading underscores are reserved.
      2. Pathname components are separated by two underscores. So we can
         distinguish between mm_types.h and mm/types.h.
      3. Everything except letters and numbers are turned into single
         underscores.
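      A hypothetical reimplementation of those naming rules (the actual
      script is not part of the commit):

```python
def guard_name(path):
    """Derive a header-guard macro from a path relative to include/:
    pathname components are joined with two underscores, every other
    non-alphanumeric character becomes a single underscore, and the
    result is uppercased with no leading underscore."""
    components = path.split("/")
    cleaned = ["".join(c if c.isalnum() else "_" for c in part)
               for part in components]
    return "__".join(cleaned).upper()
```

      With this, mm_types.h and mm/types.h map to distinct guards
      (ASM_X86__MM_TYPES_H vs ASM_X86__MM__TYPES_H).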
      Signed-off-by: Vegard Nossum <vegard.nossum@gmail.com>
  35. 10 Jun 2008, 1 commit
  36. 17 Apr 2008, 2 commits
    • x86: split large page mapping for AMD TSEG · 8346ea17
      Authored by Andi Kleen
      On AMD, SMM-protected memory is part of the address map but handled
      internally like an MTRR. That leads to large pages getting split
      internally, which has some performance implications. Check for the
      AMD TSEG MSR and split the large page mapping in that area
      explicitly if it is part of the direct mapping.
      
      There is also SMM ASEG, but it is in the first 1MB and already covered by
      the earlier split first page patch.
      
      The idea for this came from an earlier patch by Andreas Herrmann.
      
      On a RevF dual Socket Opteron system kernbench shows a clear
      improvement from this:
      (together with the earlier patches in this series, especially the
      split first 2MB patch)
      
      [lower is better]
                    no split stddev         split  stddev    delta
      Elapsed Time   87.146 (0.727516)     84.296 (1.09098)  -3.2%
      User Time     274.537 (4.05226)     273.692 (3.34344)  -0.3%
      System Time    34.907 (0.42492)      34.508 (0.26832)  -1.1%
      Percent CPU   322.5   (38.3007)     326.5   (44.5128)  +1.2%
      
      => About 3.2% improvement in elapsed time for kernbench.
      
      With GB pages on AMD Fam10h the impact of splitting is much higher
      of course, since it would split two full GB pages (together with
      the first 1MB split patch) instead of two 2MB pages. I could not
      benchmark a clear difference in kernbench with gbpages, so I kept
      it disabled for that case.
      
      That was only limited benchmarking of course, so if someone is
      interested in running more tests for the gbpages case, that could
      be revisited (contributions welcome).
      
      I didn't bother implementing this for 32-bit because it is very
      unlikely that the 32-bit lowmem mapping overlaps the TSEG near
      4GB, and the 2MB low split is already handled for both.
      
      [ mingo@elte.hu: do it on gbpages kernels too, there's no clear
                       reason why it shouldn't help there. ]
      Signed-off-by: Andi Kleen <ak@suse.de>
      Acked-by: andreas.herrmann3@amd.com
      Cc: mingo@elte.hu
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • x86: PAT infrastructure patch · 2e5d9c85
      Authored by venkatesh.pallipadi@intel.com
      Sets up the pat_init() infrastructure.
      
      The PAT MSR has the following settings:
      	PAT
      	|PCD
      	||PWT
      	|||
      	000 WB		_PAGE_CACHE_WB
      	001 WC		_PAGE_CACHE_WC
      	010 UC-		_PAGE_CACHE_UC_MINUS
      	011 UC		_PAGE_CACHE_UC
      
      We are effectively changing WT from the boot-time setting to WC.
      UC_MINUS is used to provide backward compatibility to existing
      /dev/mem users (X).
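      The resulting low PAT entries, as a lookup keyed by the
      (PAT, PCD, PWT) page-table bits (only the four entries listed in
      the table above):

```python
# The four PAT entries set up by pat_init(), per the table above.
PAT_ENTRY = {
    (0, 0, 0): "WB",   # _PAGE_CACHE_WB
    (0, 0, 1): "WC",   # _PAGE_CACHE_WC: the usual WT slot, repurposed
    (0, 1, 0): "UC-",  # _PAGE_CACHE_UC_MINUS, keeps /dev/mem users working
    (0, 1, 1): "UC",   # _PAGE_CACHE_UC
}

def memory_type(pat, pcd, pwt):
    """Effective memory type for the low PAT entries."""
    return PAT_ENTRY[(pat, pcd, pwt)]
```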
      
      reserve_memtype and free_memtype are new interfaces for maintaining
      alias-free mappings. They are currently implemented in a simple,
      unoptimized way with a linked list. reserve and free track the
      effective memory type resulting from the PAT and MTRR settings,
      rather than what is actually requested in PAT.
      
      pat_init piggybacks on mtrr_init, as the rules for setting both
      PAT and MTRR are the same.
      Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
      Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>