1. 27 April 2011, 2 commits
    • perf, x86, nmi: Move LVT un-masking into irq handlers · 2bce5dac
      Don Zickus authored
      It was noticed that P4 machines were generating double NMIs for
      each perf event.  These extra NMIs lead to 'Dazed and confused'
      messages on the screen.
      
      I tracked this down to a P4 quirk that said the overflow bit had
      to be cleared before re-enabling the apic LVT mask.  My first
      attempt was to move the un-masking inside the perf nmi handler
      from before the chipset NMI handler to after.
      
      This broke Nehalem boxes that seem to like the unmasking before
      the counters themselves are re-enabled.
      
      In order to keep this change simple for 2.6.39, I decided to
      simply move the apic LVT un-masking to the beginning of all
      the chipset NMI handlers, with the exception of the Pentium4's,
      to fix the double NMI issue.
      
      Later on we can move the un-masking to later in the handlers to
      save a number of 'extra' NMIs on those particular chipsets.
      
      I tested this change on a P4 machine, an AMD machine, a Nehalem
      box, and a core2quad box.  'perf top' worked correctly along
      with various other small 'perf record' runs.  Anything high
      stress breaks all the machines but that is a different problem.
      
      Thanks to various people for testing different versions of this
      patch.
      Reported-and-tested-by: Shaun Ruffell <sruffell@digium.com>
      Signed-off-by: Don Zickus <dzickus@redhat.com>
      Cc: Cyrill Gorcunov <gorcunov@gmail.com>
      Link: http://lkml.kernel.org/r/1303900353-10242-1-git-send-email-dzickus@redhat.com
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      2bce5dac
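      A minimal sketch of the resulting handler shape, assuming a simplified
      handler body (perf_event_overflow_work() is a hypothetical stand-in for
      the real per-chipset overflow handling; on P4 the unmask write instead
      stays after the overflow bit is cleared):

      static int nehalem_style_pmu_handle_irq(struct pt_regs *regs)
      {
              /*
               * Un-mask the performance counter LVT entry right away:
               * the APIC masks it when the counter NMI is delivered.
               */
              apic_write(APIC_LVTPC, APIC_DM_NMI);

              /* ... then handle overflowed counters and re-enable them: */
              return perf_event_overflow_work(regs);  /* hypothetical helper */
      }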
    • perf events, x86: Work around the Nehalem AAJ80 erratum · ec75a716
      Ingo Molnar authored
      On Nehalem CPUs the retired branch-misses event can be completely bogus
      when there are no branch misses occurring. When there are a lot of branch
      misses the count is pretty accurate. Still, this leaves us with an
      event that over-counts a lot.
      
      Detect this erratum and work around it by using BR_MISP_EXEC.ANY events.
      These also count speculated branches, but in practice they are still a
      lot more precise than the architectural event.
      Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Link: http://lkml.kernel.org/n/tip-yyfg0bxo9jsqxd6a0ovfny27@git.kernel.org
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      ec75a716
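      A sketch of the workaround as it would look in intel_pmu_init(), assuming
      the erratum has already been detected (the detection condition itself is
      not shown; 0x7f89 encodes umask 0x7f on event 0x89, i.e. BR_MISP_EXEC.ANY):

      if (nehalem_aaj80_detected) {   /* hypothetical detection flag */
              /*
               * Erratum AAJ80: the architectural branch-misses event
               * is unreliable; substitute BR_MISP_EXEC.ANY for it:
               */
              intel_perfmon_event_map[PERF_COUNT_HW_BRANCH_MISSES] = 0x7f89;
      }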
  2. 26 April 2011, 1 commit
  3. 22 April 2011, 4 commits
    • perf, x86: Update/fix Intel Nehalem cache events · f4929bd3
      Peter Zijlstra authored
      Change the Nehalem cache events to use retired memory instruction counters
      (similar to Westmere); this greatly improves the provided stats.
      
      Using:
      
      int main(void)
      {
              int i;

              for (i = 0; i < 1000000000; i++) {
                      asm("mov (%%rsp), %%rbx;"
                          "mov %%rbx, (%%rsp);" : : : "rbx");
              }
              return 0;
      }
      
      We find:
      
       $ perf stat --repeat 10 -e instructions:u -e l1-dcache-loads:u -e l1-dcache-stores:u ./loop_1b_loads+stores
        Performance counter stats for './loop_1b_loads+stores' (10 runs):
            4,000,081,056 instructions:u           #      0.000 IPC ( +-   0.000% )
            4,999,502,846 l1-dcache-loads:u          ( +-   0.008% )
            1,000,034,832 l1-dcache-stores:u         ( +-   0.000% )
               1.565184942  seconds time elapsed   ( +-   0.005% )
      
      The 5b is surprising - we'd expect 1b:
      
       $ perf stat --repeat 10 -e instructions:u -e r10b:u -e l1-dcache-stores:u ./loop_1b_loads+stores
        Performance counter stats for './loop_1b_loads+stores' (10 runs):
            4,000,081,054 instructions:u           #      0.000 IPC ( +-   0.000% )
            1,000,021,961 r10b:u                     ( +-   0.000% )
            1,000,030,951 l1-dcache-stores:u         ( +-   0.000% )
               1.565055422  seconds time elapsed   ( +-   0.003% )
      
      Which this patch thus fixes.
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Lin Ming <ming.m.lin@intel.com>
      Cc: Cyrill Gorcunov <gorcunov@openvz.org>
      Link: http://lkml.kernel.org/n/tip-q9rtru7b7840tws75xzboapv@git.kernel.org
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      f4929bd3
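      A fragment sketch of the corresponding hw_cache_event_ids table entries
      (event 0x0b is MEM_INST_RETIRED, matching the r10b raw event used above;
      the exact umask values are assumed from the commit's description):

      [ C(L1D) ] = {
              [ C(OP_READ) ] = {
                      [ C(RESULT_ACCESS) ] = 0x010b,  /* MEM_INST_RETIRED.LOADS  */
              },
              [ C(OP_WRITE) ] = {
                      [ C(RESULT_ACCESS) ] = 0x020b,  /* MEM_INST_RETIRED.STORES */
              },
      },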
    • perf, x86: P4 PMU - Don't forget to clear cpuc->active_mask on overflow · 1ea5a6af
      Cyrill Gorcunov authored
      It's not enough to simply disable the event on overflow: the
      corresponding bit in cpuc->active_mask should be cleared as well.
      Otherwise the counter may remain stuck in the "active" state even
      though it is in fact already disabled, which may prevent the user
      from using this counter further.
      
      Don pointed out that:
      
       " I also noticed this patch fixed some unknown NMIs
         on a P4 when I stressed the box".
      Tested-by: Lin Ming <ming.m.lin@intel.com>
      Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
      Acked-by: Don Zickus <dzickus@redhat.com>
      Signed-off-by: Don Zickus <dzickus@redhat.com>
      Cc: Cyrill Gorcunov <gorcunov@gmail.com>
      Link: http://lkml.kernel.org/r/1303398203-2918-3-git-send-email-dzickus@redhat.com
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      1ea5a6af
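      A sketch of the fix in the P4 NMI handler: stopping the event through
      x86_pmu_stop() clears the active_mask bit that a bare disable left set
      (loop context simplified):

      /* overflow handling, per overflowed counter: */
      if (perf_event_overflow(event, 1, &data, regs))
              x86_pmu_stop(event, 0);         /* disables the counter AND
                                                 clears cpuc->active_mask */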
    • x86, perf event: Turn off unstructured raw event access to offcore registers · b52c55c6
      Ingo Molnar authored
      Andi Kleen pointed out that the Intel offcore support patches were merged
      without user-space tool support for the functionality:
      
       |
       | The offcore_msr perf kernel code was merged into 2.6.39-rc*, but the
       | user space bits were not. This made it impossible to set the extra mask
       | and actually do the OFFCORE profiling
       |
      
      Andi submitted a preliminary patch for user-space support, as an
      extension to perf's raw event syntax:
      
       |
       | Some raw events -- like the Intel OFFCORE events -- support additional
       | parameters. These can be appended after a ':'.
       |
       | For example on a multi socket Intel Nehalem:
       |
       |    perf stat -e r1b7:20ff -a sleep 1
       |
       | Profile the OFFCORE_RESPONSE.ANY_REQUEST with event mask REMOTE_DRAM_0
       | that measures any access to DRAM on another socket.
       |
      
      But this kind of usability is absolutely unacceptable - users should not
      be expected to type in magic, CPU and model specific incantations to get
      access to useful hardware functionality.
      
      The proper solution is to expose useful offcore functionality via
      generalized events - that way users do not have to care which specific
      CPU model they are using, they can use the conceptual event and not some
      model specific quirky hexa number.
      
      We already have such generalization in place for CPU cache events,
      and it's all very extensible.
      
      "Offcore" events measure general DRAM access patters along various
      parameters. They are particularly useful in NUMA systems.
      
      We want to support them via generalized DRAM events: either as the
      fourth level of cache (after the last-level cache), or as a separate
      generalization category.
      
      That way user-space support would be very obvious, memory access
      profiling could be done via self-explanatory commands like:
      
        perf record -e dram ./myapp
        perf record -e dram-remote ./myapp
      
      ... to measure DRAM accesses or more expensive cross-node NUMA DRAM
      accesses.
      
      These generalized events would work on all CPUs and architectures that
      have comparable PMU features.
      
      ( Note, these are just examples: the actual implementation could have
        more sophistication and more parameters - as long as they center
        around similarly simple usecases. )
      
      Now we do not want to revert *all* of the current offcore bits, as they
      are still somewhat useful for generic last-level-cache events, implemented
      in this commit:
      
        e994d7d2: perf: Fix LLC-* events on Intel Nehalem/Westmere
      
      But we definitely do not yet want to expose the unstructured raw events
      to user-space, until better generalization and usability is implemented
      for these hardware event features.
      
      ( Note: after generalization has been implemented raw offcore events can be
        supported as well: there can always be an odd event that is marginally
        useful but not useful enough to generalize. DRAM profiling is definitely
        *not* such a category so generalization must be done first. )
      
      Furthermore, PERF_TYPE_RAW access to these registers was not intended
      to go upstream without proper support - it was a side-effect of the above
      e994d7d2 commit, not mentioned in the changelog.
      
      As v2.6.39 is nearing release we go for the simplest approach: disable
      the PERF_TYPE_RAW offcore hack for now, before it escapes into a released
      kernel and becomes an ABI.
      
      Once proper structure is implemented for these hardware events and users
      are offered usable solutions we can revisit this issue.
      Reported-by: Andi Kleen <ak@linux.intel.com>
      Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Link: http://lkml.kernel.org/r/1302658203-4239-1-git-send-email-andi@firstfloor.org
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      b52c55c6
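      A sketch of the kind of guard this implies in the offcore/extra-register
      setup path (the exact function placement is assumed):

      /*
       * Do not honor the extra mask (attr.config1) for unstructured
       * PERF_TYPE_RAW events - keep the raw offcore hack from becoming
       * ABI before a proper generalization exists:
       */
      if (event->attr.type == PERF_TYPE_RAW)
              return 0;       /* skip programming the OFFCORE_RESPONSE MSR */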
    • perf: Support Xeon E7's via the Westmere PMU driver · b2508e82
      Andi Kleen authored
      There's a new model number public, 47, for Xeon E7 (aka Westmere EX).
      Signed-off-by: Andi Kleen <ak@linux.intel.com>
      Cc: a.p.zijlstra@chello.nl
      Link: http://lkml.kernel.org/r/1303429715-10202-1-git-send-email-andi@firstfloor.org
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      b2508e82
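      The change amounts to one new case label in the model switch of
      intel_pmu_init(); a sketch:

      switch (boot_cpu_data.x86_model) {
      /* ... */
      case 37: /* 32nm Westmere, "Clarkdale" */
      case 44: /* 32nm Westmere, "Gulftown" */
      case 47: /* 32nm Westmere EX, Xeon E7 (this commit) */
              /* reuse the existing Westmere event tables/constraints */
              break;
      }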
  4. 19 April 2011, 2 commits
  5. 16 April 2011, 1 commit
  6. 01 April 2011, 1 commit
  7. 30 March 2011, 1 commit
    • x86, mtrr, pat: Fix one cpu getting out of sync during resume · 84ac7cdb
      Suresh Siddha authored
      On laptops with Core i5/i7, there were reports that after resume
      graphics workloads were performing poorly on a specific AP, while
      the other CPUs were ok. This was observed specifically on 32-bit
      kernels.
      
      Debugging showed that the PAT init was not happening on that AP
      during resume, hence contributing to the poor workload performance
      on that cpu.
      
      On this system, resume flow looked like this:
      
      1. BP starts the resume sequence and we reinit BP's MTRR's/PAT
         early on using mtrr_bp_restore()
      
      2. Resume sequence brings all AP's online
      
      3. Resume sequence now kicks off the MTRR reinit on all the AP's.
      
      4. For some reason, between point 2 and 3, we moved from BP
         to one of the AP's. My guess is that printk() during resume
         sequence is contributing to this. We don't see similar
         behavior with the 64bit kernel but there is no guarantee that
         at this point the remaining resume sequence (after AP's bringup)
         has to happen on BP.
      
      5. set_mtrr() was assuming that we are still on BP and skipped the
         MTRR/PAT init on that cpu (because of 1 above)
      
      6. But we were on an AP, and this led to PAT not being reprogrammed
         on that cpu, leading to bad performance.
      
      Fix this by doing an unconditional mtrr_if->set_all() in set_mtrr()
      during MTRR/PAT init. This might be unnecessary if we are still
      running on the BP, but it does no harm and guarantees that after
      resume all the CPUs will be in sync with respect to the MTRR/PAT
      registers.
      Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
      LKML-Reference: <1301438292-28370-1-git-send-email-eric@anholt.net>
      Signed-off-by: Eric Anholt <eric@anholt.net>
      Tested-by: Keith Packard <keithp@keithp.com>
      Cc: stable@kernel.org	[v2.6.32+]
      Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
      84ac7cdb
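      A sketch of the fix in the MTRR rendezvous handler (control flow
      simplified; smp_reg == ~0U is the existing "full init" marker):

      if (data->smp_reg == ~0U) {
              /*
               * MTRR/PAT init: reprogram this CPU unconditionally.
               * Redundant if we are still on the BP, but harmless, and
               * it guarantees the AP gets its PAT state after resume.
               */
              mtrr_if->set_all();
      } else {
              mtrr_if->set(data->smp_reg, data->smp_base,
                           data->smp_size, data->smp_type);
      }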
  8. 25 March 2011, 2 commits
    • perf, x86: Complain louder about BIOSen corrupting CPU/PMU state and continue · 45daae57
      Ingo Molnar authored
      Eric Dumazet reported that hardware PMU events do not work on his
      system, due to the BIOS corrupting PMU state:
      
          Performance Events: PEBS fmt0+, Core2 events, Broken BIOS detected, using software events only.
          [Firmware Bug]: the BIOS has corrupted hw-PMU resources (MSR 186 is 43003c)
      
      Linus suggested that we continue in the face of such BIOS-induced CPU
      state corruption:
      
         http://lkml.org/lkml/2011/3/24/608
      
      Such BIOSes will have to be fixed - Linux developers rely on a working and
      fully capable PMU and the BIOS interfering with the CPU's PMU state is simply
      not acceptable.
      
      So this patch changes perf to continue when it detects such BIOS
      interaction. Some hardware events may be unreliable due to the BIOS
      writing and re-writing them; there's not much the kernel can do
      about that except detect the corruption and report it.
      Reported-and-tested-by: Eric Dumazet <eric.dumazet@gmail.com>
      Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
      Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      LKML-Reference: <new-submission>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      45daae57
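      A sketch of the changed failure path in the PMU sanity check
      (check_hw_exists()-style code): report loudly, then return success
      so event scheduling continues on the hardware PMU:

      bios_fail:
              printk(KERN_CONT "Broken BIOS detected, complain to your "
                     "hardware vendor.\n");
              printk(KERN_ERR FW_BUG "the BIOS has corrupted hw-PMU "
                     "resources (MSR %x is %Lx)\n", reg, val);
              return true;    /* previously: fall back to software events */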
    • perf, x86: P4 PMU - Read proper MSR register to catch unflagged overflows · 242214f9
      Don Zickus authored
      The read of the proper MSR register was missed: instead of the
      counter, the configuration register was tested (it always has
      ARCH_P4_UNFLAGGED_BIT cleared), leading to unknown NMIs hitting
      the system. As a result the user may see "Dazed and confused,
      but trying to continue" messages. Fix it by reading the proper
      MSR register.
      
      When an NMI happens on a P4, the perf nmi handler checks the
      configuration register to see if the overflow bit is set or not
      before taking appropriate action.  Unfortunately, various P4
      machines had a broken overflow bit, so a backup mechanism was
      implemented.  This mechanism checked to see if the counter
      rolled over or not.
      
      A previous commit that implemented this backup mechanism was
      broken. Instead of reading the counter register, it used the
      configuration register to determine if the counter rolled over
      or not. Reading that bit would give incorrect results.
      
      This would lead to 'Dazed and confused' messages for the end
      user when using the perf tool (or if the nmi watchdog is
      running).
      
      The fix is to read the counter register before determining if
      the counter rolled over or not.
      Signed-off-by: Don Zickus <dzickus@redhat.com>
      Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
      Cc: Lin Ming <ming.m.lin@intel.com>
      LKML-Reference: <4D8BAB49.3080701@openvz.org>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      242214f9
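      A sketch of the corrected backup check: the counter MSR at
      hwc->event_base is read, where the broken version read the
      configuration MSR at hwc->config_base:

      u64 v;

      /* backup path for P4s whose CCCR overflow bit is unreliable: */
      rdmsrl(hwc->event_base, v);     /* was: rdmsrl(hwc->config_base, v) */
      if (!(v & ARCH_P4_UNFLAGGED_BIT))
              return 1;               /* counter wrapped: real overflow */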
  9. 24 3月, 2011 1 次提交
    • x86: Use syscore_ops instead of sysdev classes and sysdevs · f3c6ea1b
      Rafael J. Wysocki authored
      Some subsystems in the x86 tree need to carry out suspend/resume and
      shutdown operations with one CPU on-line and interrupts disabled and
      they define sysdev classes and sysdevs or sysdev drivers for this
      purpose.  This leads to unnecessarily complicated code and excessive
      memory usage, so switch them to using struct syscore_ops objects for
      this purpose instead.
      
      Generally, there are three categories of subsystems that use
      sysdevs for implementing PM operations: (1) subsystems whose
      suspend/resume callbacks ignore their arguments entirely (the
      majority), (2) subsystems whose suspend/resume callbacks use their
      struct sys_device argument, but don't really need to do that,
      because they can be implemented differently in an arguably simpler
      way (io_apic.c), and (3) subsystems whose suspend/resume callbacks
      use their struct sys_device argument, but the value of that argument
      is always the same and could be ignored (microcode_core.c).  In all
      of these cases the subsystems in question may be readily converted to
      using struct syscore_ops objects for power management and shutdown.
      Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
      Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
      Acked-by: Ingo Molnar <mingo@elte.hu>
      f3c6ea1b
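      A sketch of the conversion pattern for category (1) subsystems, with
      hypothetical example_* callbacks (syscore callbacks take no arguments
      and run with one CPU online and interrupts disabled):

      static int example_suspend(void)
      {
              /* save device/CPU state */
              return 0;
      }

      static void example_resume(void)
      {
              /* restore device/CPU state */
      }

      static struct syscore_ops example_syscore_ops = {
              .suspend = example_suspend,
              .resume  = example_resume,
      };

      /* replaces sysdev_class_register() + sysdev_register(): */
      register_syscore_ops(&example_syscore_ops);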
  10. 22 March 2011, 1 commit
    • ACPI, APEI, Add ERST record ID cache · 885b976f
      Huang Ying authored
      The APEI ERST firmware interface and implementation were not designed
      with multiple users in mind.  For example, if there are four records
      in storage with IDs 1, 2, 3 and 4, and two ERST readers enumerate the
      records via GET_NEXT_RECORD_ID as follows,
      
      reader 1		reader 2
      1
      			2
      3
      			4
      -1
      			-1
      
      where -1 signals there is no more record ID.
      
      Reader 1 has no chance to check records 2 and 4, while reader 2 has no
      chance to check records 1 and 3.  And any further GET_NEXT_RECORD_ID
      will return -1, that is, other readers will have no chance to check
      any record, even though they have not been cleared by anyone.
      
      This makes raw GET_NEXT_RECORD_ID unsuitable for use by multiple
      users.
      
      To solve the issue, an in-memory ERST record ID cache is designed and
      implemented.  When enumerating record IDs, the ID returned by
      GET_NEXT_RECORD_ID is added to the cache in addition to being returned
      to the caller.  So other readers can check the cache to get all
      available record IDs.
      Signed-off-by: Huang Ying <ying.huang@intel.com>
      Reviewed-by: Andi Kleen <ak@linux.intel.com>
      Signed-off-by: Len Brown <len.brown@intel.com>
      885b976f
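      A simplified sketch of the cache idea (the real code in
      drivers/acpi/apei/erst.c also handles locking, deduplication and
      dynamic growth; the names and fixed capacity here are hypothetical):

      static u64 erst_id_cache[64];   /* hypothetical fixed capacity */
      static int erst_id_count;

      /* remember every ID seen via GET_NEXT_RECORD_ID, so a second
       * reader can still enumerate records the first reader consumed: */
      static void erst_record_id_cache_add(u64 id)
      {
              if (erst_id_count < 64)
                      erst_id_cache[erst_id_count++] = id;
      }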
  11. 20 March 2011, 1 commit
    • perf, x86: Fix Intel fixed counters base initialization · fc66c521
      Stephane Eranian authored
      The following patch solves the problems introduced by Robert's
      commit 41bf4989 and reported by Arun Sharma. That commit got rid
      of the base + index notation for reading and writing PMU MSRs.
      
      The problem is that for fixed counters, the new calculation for
      the base did not take into account the fixed counter indexes,
      thus all fixed counters were read/written from fixed counter 0.
      Although all fixed counters share the same config MSR, they each
      have their own counter register.
      
      Without:
      
       $ task -e unhalted_core_cycles -e instructions_retired -e baclears noploop 1
       noploop for 1 seconds
      
        242202299 unhalted_core_cycles (0.00% scaling, ena=1000790892, run=1000790892)
       2389685946 instructions_retired (0.00% scaling, ena=1000790892, run=1000790892)
            49473 baclears             (0.00% scaling, ena=1000790892, run=1000790892)
      
      With:
      
       $ task -e unhalted_core_cycles -e instructions_retired -e baclears noploop 1
       noploop for 1 seconds
      
       2392703238 unhalted_core_cycles (0.00% scaling, ena=1000840809, run=1000840809)
       2389793744 instructions_retired (0.00% scaling, ena=1000840809, run=1000840809)
            47863 baclears             (0.00% scaling, ena=1000840809, run=1000840809)
      Signed-off-by: Stephane Eranian <eranian@google.com>
      Cc: peterz@infradead.org
      Cc: ming.m.lin@intel.com
      Cc: robert.richter@amd.com
      Cc: asharma@fb.com
      Cc: perfmon2-devel@lists.sf.net
      LKML-Reference: <20110319172005.GB4978@quad>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      fc66c521
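      A sketch of the fix where the counter bases are assigned
      (x86_assign_hw_event()-style code): the fixed counter index is added
      back into the base, so fixed counters 1..n stop aliasing fixed
      counter 0:

      } else if (hwc->idx >= X86_PMC_IDX_FIXED) {
              hwc->config_base = MSR_ARCH_PERFMON_FIXED_CTR_CTRL;
              hwc->event_base  = MSR_ARCH_PERFMON_FIXED_CTR0 +
                                 (hwc->idx - X86_PMC_IDX_FIXED);
      }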
  12. 18 March 2011, 2 commits
    • x86, dumpstack: Correct stack dump info when frame pointer is available · e8e999cf
      Namhyung Kim authored
      The current stack dump code scans the entire stack and checks whether
      each entry contains a pointer to kernel code. If CONFIG_FRAME_POINTER=y
      it can mark whether the pointer is valid based on the value of
      the frame pointer. Invalid entries are preceded by a '?' sign.
      
      However this was not happening, because the scan start point
      was always higher than the frame pointer, so the two could never
      meet.
      
      Commit 9c0729dc ("x86: Eliminate bp argument from the stack
      tracing routines") delayed the bp acquisition point, so bp was
      read in a lower frame, and thus all of the entries were marked
      invalid.
      
      This patch fixes this by reverting above commit while retaining
      stack_frame() helper as suggested by Frederic Weisbecker.
      
      End result looks like below:
      
      before:
      
       [    3.508329] Call Trace:
       [    3.508551]  [<ffffffff814f35c9>] ? panic+0x91/0x199
       [    3.508662]  [<ffffffff814f3739>] ? printk+0x68/0x6a
       [    3.508770]  [<ffffffff81a981b2>] ? mount_block_root+0x257/0x26e
       [    3.508876]  [<ffffffff81a9821f>] ? mount_root+0x56/0x5a
       [    3.508975]  [<ffffffff81a98393>] ? prepare_namespace+0x170/0x1a9
       [    3.509216]  [<ffffffff81a9772b>] ? kernel_init+0x1d2/0x1e2
       [    3.509335]  [<ffffffff81003894>] ? kernel_thread_helper+0x4/0x10
       [    3.509442]  [<ffffffff814f6880>] ? restore_args+0x0/0x30
       [    3.509542]  [<ffffffff81a97559>] ? kernel_init+0x0/0x1e2
       [    3.509641]  [<ffffffff81003890>] ? kernel_thread_helper+0x0/0x10
      
      after:
      
       [    3.522991] Call Trace:
       [    3.523351]  [<ffffffff814f35b9>] panic+0x91/0x199
       [    3.523468]  [<ffffffff814f3729>] ? printk+0x68/0x6a
       [    3.523576]  [<ffffffff81a981b2>] mount_block_root+0x257/0x26e
       [    3.523681]  [<ffffffff81a9821f>] mount_root+0x56/0x5a
       [    3.523780]  [<ffffffff81a98393>] prepare_namespace+0x170/0x1a9
       [    3.523885]  [<ffffffff81a9772b>] kernel_init+0x1d2/0x1e2
       [    3.523987]  [<ffffffff81003894>] kernel_thread_helper+0x4/0x10
       [    3.524228]  [<ffffffff814f6880>] ? restore_args+0x0/0x30
       [    3.524345]  [<ffffffff81a97559>] ? kernel_init+0x0/0x1e2
       [    3.524445]  [<ffffffff81003890>] ? kernel_thread_helper+0x0/0x10
      
       -v5:
         * fix build breakage with oprofile
      
       -v4:
         * use 0 instead of regs->bp
         * separate out printk changes
      
       -v3:
         * apply comment from Frederic
         * add a couple of printk fixes
      Signed-off-by: Namhyung Kim <namhyung@gmail.com>
      Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Soren Sandmann <ssp@redhat.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Robert Richter <robert.richter@amd.com>
      LKML-Reference: <1300416006-3163-1-git-send-email-namhyung@gmail.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      e8e999cf
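      For reference, the bp-based validity test this restores looks roughly
      like the following (print_context_stack()-style code); an address is
      printed without the '?' only when its stack slot sits directly on top
      of the saved frame pointer:

      if (__kernel_text_address(addr)) {
              if ((unsigned long)stack == bp + sizeof(long)) {
                      ops->address(data, addr, 1);    /* reliable */
                      frame = frame->next_frame;
                      bp = (unsigned long)frame;
              } else {
                      ops->address(data, addr, 0);    /* gets the '?' */
              }
      }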
    • x86: Fix common misspellings · 0d2eb44f
      Lucas De Marchi authored
      They were generated by 'codespell' and then manually reviewed.
      Signed-off-by: Lucas De Marchi <lucas.demarchi@profusion.mobi>
      Cc: trivial@kernel.org
      LKML-Reference: <1300389856-1099-3-git-send-email-lucas.demarchi@profusion.mobi>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      0d2eb44f
  13. 17 March 2011, 2 commits
  14. 16 March 2011, 3 commits
  15. 10 March 2011, 1 commit
  16. 05 March 2011, 2 commits
  17. 04 March 2011, 3 commits
    • perf: Fix LLC-* events on Intel Nehalem/Westmere · e994d7d2
      Andi Kleen authored
      On Intel Nehalem and Westmere CPUs the generic perf LLC-* events count the
      L2 caches, not the real L3 LLC - this was inconsistent with behavior on
      other CPUs.
      
      Fixing this requires the use of the special OFFCORE_RESPONSE
      events which need a separate mask register.
      
      This has been implemented by the previous patch, now use this infrastructure
      to set correct events for the LLC-* on Nehalem and Westmere.
      Signed-off-by: Andi Kleen <ak@linux.intel.com>
      Signed-off-by: Lin Ming <ming.m.lin@intel.com>
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      LKML-Reference: <1299119690-13991-3-git-send-email-ming.m.lin@intel.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      e994d7d2
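      A fragment sketch of the resulting Nehalem/Westmere cache table: the
      LLC entries now name the offcore events (0xb7/0xbb with umask 0x01),
      while the actual request/response mask lives in a companion
      extra-registers table:

      [ C(LL) ] = {
              [ C(OP_READ) ] = {
                      [ C(RESULT_ACCESS) ] = 0x01b7,  /* OFFCORE_RESPONSE_0 */
                      [ C(RESULT_MISS)   ] = 0x01bb,  /* OFFCORE_RESPONSE_1 */
              },
      },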
    • perf: Add support for supplementary event registers · a7e3ed1e
      Andi Kleen authored
      Change logs against Andi's original version:
      
      - Extends perf_event_attr:config to config{,1,2} (Peter Zijlstra)
      - Fixed a major event scheduling issue. There cannot be a ref++ on an
        event that has already done ref++ once and without calling
        put_constraint() in between. (Stephane Eranian)
      - Use thread_cpumask for percore allocation. (Lin Ming)
      - Use MSR names in the extra reg lists. (Lin Ming)
      - Remove redundant "c = NULL" in intel_percore_constraints
      - Fix comment of perf_event_attr::config1
      
      Intel Nehalem/Westmere have a special OFFCORE_RESPONSE event
      that can be used to monitor any offcore accesses from a core.
      This is a very useful event for various tunings, and it's
      also needed to implement the generic LLC-* events correctly.
      
      Unfortunately this event requires programming a mask in a separate
      register. And worse, this separate register is per core, not per
      CPU thread.
      
      This patch:
      
      - Teaches perf_events that OFFCORE_RESPONSE needs extra parameters.
        The extra parameters are passed by user space in the
        perf_event_attr::config1 field.
      
      - Adds support to the Intel perf_event core to schedule per
        core resources. This adds fairly generic infrastructure that
        can also be used for other per core resources.
        The basic code is patterned after the similar AMD northbridge
        constraints code.
      
      Thanks to Stephane Eranian who pointed out some problems
      in the original version and suggested improvements.
      Signed-off-by: Andi Kleen <ak@linux.intel.com>
      Signed-off-by: Lin Ming <ming.m.lin@intel.com>
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      LKML-Reference: <1299119690-13991-2-git-send-email-ming.m.lin@intel.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      a7e3ed1e
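      A sketch of how user space hands the extra mask over through the
      widened attribute (values reused from the r1b7:20ff Nehalem example
      quoted under commit b52c55c6 above, which later disabled this raw
      path pending generalization):

      struct perf_event_attr attr = {
              .type    = PERF_TYPE_RAW,
              .size    = sizeof(attr),
              .config  = 0x01b7,      /* OFFCORE_RESPONSE_0 */
              .config1 = 0x20ff,      /* extra mask, written to the
                                         separate per-core MSR */
      };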
    • perf_events: Update PEBS event constraints · 17e31629
      Stephane Eranian authored
      This patch updates PEBS event constraints for Intel Atom, Nehalem, Westmere.
      
      This patch also reorganizes the PEBS format/constraint detection code. It
      is now based on processor model and not on PEBS format. Two processors may
      use the same PEBS format without having the same list of PEBS events.
      
      In this second version, we simplified the initialization of the PEBS
      constraints by leveraging the existing switch() statement in perf_event_intel.c.
      We also renamed the constraint tables to be more consistent with regular
      constraints.
      
      In this 3rd version, we drop BR_INST_RETIRED.MISPRED from Intel Atom as it does
      not seem to work. Use MISPREDICTED_BRANCH_RETIRED instead. Also add FP_ASSIST.*
      to both Intel Nehalem and Westmere. I missed those in the earlier patches.
      Events were tested using libpfm4 perf_examples.
      Signed-off-by: Stephane Eranian <eranian@google.com>
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      LKML-Reference: <4d6e6b02.815bdf0a.637b.07a7@mx.google.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      17e31629
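      A fragment sketch of a per-model PEBS constraint table in the renamed
      style (event codes shown for the two event families mentioned above;
      the counter masks are assumed for illustration):

      static struct event_constraint intel_nehalem_pebs_event_constraints[] = {
              INTEL_EVENT_CONSTRAINT(0x0b, 0xf),      /* MEM_INST_RETIRED.* */
              INTEL_EVENT_CONSTRAINT(0xf7, 0xf),      /* FP_ASSIST.*        */
              EVENT_CONSTRAINT_END
      };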
  18. 02 March 2011, 4 commits
  19. 16 February 2011, 6 commits