1. 21 12月, 2011 4 次提交
  2. 07 12月, 2011 2 次提交
  3. 06 12月, 2011 3 次提交
    • P
      perf, x86: Prefer fixed-purpose counters when scheduling · 4defea85
      Peter Zijlstra 提交于
      This avoids a scheduling failure for cases like:
      
        cycles, cycles, instructions, instructions (on Core2)
      
      Which would end up being programmed like:
      
        PMC0, PMC1, FP-instructions, fail
      
      Because all events will have the same weight.
      Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      Link: http://lkml.kernel.org/n/tip-8tnwb92asqj7xajqqoty4gel@git.kernel.orgSigned-off-by: NIngo Molnar <mingo@elte.hu>
      4defea85
    • R
      perf, x86: Fix event scheduler for constraints with overlapping counters · bc1738f6
      Robert Richter 提交于
      The current x86 event scheduler fails to resolve scheduling problems
      of certain combinations of events and constraints. This happens if the
      counter mask of such an event is not a subset of any other counter
      mask of a constraint with an equal or higher weight, e.g. constraints
      of the AMD family 15h pmu:
      
                              counter mask    weight
      
       amd_f15_PMC30          0x09            2  <--- overlapping counters
       amd_f15_PMC20          0x07            3
       amd_f15_PMC53          0x38            3
      
      The scheduler does not find then an existing solution. Here is an
      example:
      
       event code     counter         failure         possible solution
      
       0x02E          PMC[3,0]        0               3
       0x043          PMC[2:0]        1               0
       0x045          PMC[2:0]        2               1
       0x046          PMC[2:0]        FAIL            2
      
      The event scheduler may not select the correct counter in the first
      cycle because it needs to know which subsequent events will be
      scheduled. It may fail to schedule the events then.
      
      To solve this, we now save the scheduler state of events with
      overlapping counter counstraints.  If we fail to schedule the events
      we rollback to those states and try to use another free counter.
      
      Constraints with overlapping counters are marked with a new introduced
      overlap flag. We set the overlap flag for such constraints to give the
      scheduler a hint which events to select for counter rescheduling. The
      EVENT_CONSTRAINT_OVERLAP() macro can be used for this.
      
      Care must be taken as the rescheduling algorithm is O(n!) which will
      increase scheduling cycles for an over-commited system dramatically.
      The number of such EVENT_CONSTRAINT_OVERLAP() macros and its counter
      masks must be kept at a minimum. Thus, the current stack is limited to
      2 states to limit the number of loops the algorithm takes in the worst
      case.
      
      On systems with no overlapping-counter constraints, this
      implementation does not increase the loop count compared to the
      previous algorithm.
      
      V2:
      * Renamed redo -> overlap.
      * Reimplementation using perf scheduling helper functions.
      
      V3:
      * Added WARN_ON_ONCE() if out of save states.
      * Changed function interface of perf_sched_restore_state() to use bool
        as return value.
      Signed-off-by: NRobert Richter <robert.richter@amd.com>
      Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/r/1321616122-1533-3-git-send-email-robert.richter@amd.comSigned-off-by: NIngo Molnar <mingo@elte.hu>
      bc1738f6
    • R
      perf, x86: Implement event scheduler helper functions · 1e2ad28f
      Robert Richter 提交于
      This patch introduces x86 perf scheduler code helper functions. We
      need this to later add more complex functionality to support
      overlapping counter constraints (next patch).
      
      The algorithm is modified so that the range of weight values is now
      generated from the constraints. There shouldn't be other functional
      changes.
      
      With the helper functions the scheduler is controlled. There are
      functions to initialize, traverse the event list, find unused counters
      etc. The scheduler keeps its own state.
      
      V3:
      * Added macro for_each_set_bit_cont().
      * Changed functions interfaces of perf_sched_find_counter() and
        perf_sched_next_event() to use bool as return value.
      * Added some comments to make code better understandable.
      
      V4:
      * Fix broken event assignment if weight of the first event is not
        wmin (perf_sched_init()).
      Signed-off-by: NRobert Richter <robert.richter@amd.com>
      Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/r/1321616122-1533-2-git-send-email-robert.richter@amd.comSigned-off-by: NIngo Molnar <mingo@elte.hu>
      1e2ad28f
  4. 14 11月, 2011 2 次提交
  5. 10 10月, 2011 1 次提交
  6. 26 9月, 2011 1 次提交
  7. 31 8月, 2011 1 次提交
    • A
      x86, perf: Check that current->mm is alive before getting user callchain · 20afc60f
      Andrey Vagin 提交于
      An event may occur when an mm is already released.
      
      I added an event in dequeue_entity() and caught a panic with
      the following backtrace:
      
      [  434.421110] BUG: unable to handle kernel NULL pointer dereference at 0000000000000050
      [  434.421258] IP: [<ffffffff810464ac>] __get_user_pages_fast+0x9c/0x120
      ...
      [  434.421258] Call Trace:
      [  434.421258]  [<ffffffff8101ae81>] copy_from_user_nmi+0x51/0xf0
      [  434.421258]  [<ffffffff8109a0d5>] ? sched_clock_local+0x25/0x90
      [  434.421258]  [<ffffffff8101b048>] perf_callchain_user+0x128/0x170
      [  434.421258]  [<ffffffff811154cd>] ? __perf_event_header__init_id+0xed/0x100
      [  434.421258]  [<ffffffff81116690>] perf_prepare_sample+0x200/0x280
      [  434.421258]  [<ffffffff81118da8>] __perf_event_overflow+0x1b8/0x290
      [  434.421258]  [<ffffffff81065240>] ? tg_shares_up+0x0/0x670
      [  434.421258]  [<ffffffff8104fe1a>] ? walk_tg_tree+0x6a/0xb0
      [  434.421258]  [<ffffffff81118f44>] perf_swevent_overflow+0xc4/0xf0
      [  434.421258]  [<ffffffff81119150>] do_perf_sw_event+0x1e0/0x250
      [  434.421258]  [<ffffffff81119204>] perf_tp_event+0x44/0x70
      [  434.421258]  [<ffffffff8105701f>] ftrace_profile_sched_block+0xdf/0x110
      [  434.421258]  [<ffffffff8106121d>] dequeue_entity+0x2ad/0x2d0
      [  434.421258]  [<ffffffff810614ec>] dequeue_task_fair+0x1c/0x60
      [  434.421258]  [<ffffffff8105818a>] dequeue_task+0x9a/0xb0
      [  434.421258]  [<ffffffff810581e2>] deactivate_task+0x42/0xe0
      [  434.421258]  [<ffffffff814bc019>] thread_return+0x191/0x808
      [  434.421258]  [<ffffffff81098a44>] ? switch_task_namespaces+0x24/0x60
      [  434.421258]  [<ffffffff8106f4c4>] do_exit+0x464/0x910
      [  434.421258]  [<ffffffff8106f9c8>] do_group_exit+0x58/0xd0
      [  434.421258]  [<ffffffff8106fa57>] sys_exit_group+0x17/0x20
      [  434.421258]  [<ffffffff8100b202>] system_call_fastpath+0x16/0x1b
      Signed-off-by: NAndrey Vagin <avagin@openvz.org>
      Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: stable@kernel.org
      Link: http://lkml.kernel.org/r/1314693156-24131-1-git-send-email-avagin@openvz.orgSigned-off-by: NIngo Molnar <mingo@elte.hu>
      20afc60f
  8. 14 8月, 2011 1 次提交
  9. 22 7月, 2011 1 次提交
  10. 15 7月, 2011 1 次提交
    • C
      perf, x86: P4 PMU - Introduce event alias feature · f9129870
      Cyrill Gorcunov 提交于
      Instead of hw_nmi_watchdog_set_attr() weak function
      and appropriate x86_pmu::hw_watchdog_set_attr() call
      we introduce even alias mechanism which allow us
      to drop this routines completely and isolate quirks
      of Netburst architecture inside P4 PMU code only.
      
      The main idea remains the same though -- to allow
      nmi-watchdog and perf top run simultaneously.
      
      Note the aliasing mechanism applies to generic
      PERF_COUNT_HW_CPU_CYCLES event only because arbitrary
      event (say passed as RAW initially) might have some
      additional bits set inside ESCR register changing
      the behaviour of event and we can't guarantee anymore
      that alias event will give the same result.
      
      P.S. Thanks a huge to Don and Steven for for testing
           and early review.
      Acked-by: NDon Zickus <dzickus@redhat.com>
      Tested-by: NSteven Rostedt <rostedt@goodmis.org>
      Signed-off-by: NCyrill Gorcunov <gorcunov@openvz.org>
      CC: Ingo Molnar <mingo@elte.hu>
      CC: Peter Zijlstra <a.p.zijlstra@chello.nl>
      CC: Stephane Eranian <eranian@google.com>
      CC: Lin Ming <ming.m.lin@intel.com>
      CC: Arnaldo Carvalho de Melo <acme@redhat.com>
      CC: Frederic Weisbecker <fweisbec@gmail.com>
      Link: http://lkml.kernel.org/r/20110708201712.GS23657@sunSigned-off-by: NSteven Rostedt <rostedt@goodmis.org>
      f9129870
  11. 01 7月, 2011 6 次提交
  12. 12 5月, 2011 1 次提交
  13. 27 4月, 2011 1 次提交
    • D
      perf, x86, nmi: Move LVT un-masking into irq handlers · 2bce5dac
      Don Zickus 提交于
      It was noticed that P4 machines were generating double NMIs for
      each perf event.  These extra NMIs lead to 'Dazed and confused'
      messages on the screen.
      
      I tracked this down to a P4 quirk that said the overflow bit had
      to be cleared before re-enabling the apic LVT mask.  My first
      attempt was to move the un-masking inside the perf nmi handler
      from before the chipset NMI handler to after.
      
      This broke Nehalem boxes that seem to like the unmasking before
      the counters themselves are re-enabled.
      
      In order to keep this change simple for 2.6.39, I decided to
      just simply move the apic LVT un-masking to the beginning of all
      the chipset NMI handlers, with the exception of Pentium4's to
      fix the double NMI issue.
      
      Later on we can move the un-masking to later in the handlers to
      save a number of 'extra' NMIs on those particular chipsets.
      
      I tested this change on a P4 machine, an AMD machine, a Nehalem
      box, and a core2quad box.  'perf top' worked correctly along
      with various other small 'perf record' runs.  Anything high
      stress breaks all the machines but that is a different problem.
      
      Thanks to various people for testing different versions of this
      patch.
      Reported-and-tested-by: NShaun Ruffell <sruffell@digium.com>
      Signed-off-by: NDon Zickus <dzickus@redhat.com>
      Cc: Cyrill Gorcunov <gorcunov@gmail.com>
      Link: http://lkml.kernel.org/r/1303900353-10242-1-git-send-email-dzickus@redhat.comSigned-off-by: NIngo Molnar <mingo@elte.hu>
      CC: Cyrill Gorcunov <gorcunov@gmail.com>
      2bce5dac
  14. 26 4月, 2011 1 次提交
  15. 22 4月, 2011 1 次提交
    • I
      x86, perf event: Turn off unstructured raw event access to offcore registers · b52c55c6
      Ingo Molnar 提交于
      Andi Kleen pointed out that the Intel offcore support patches were merged
      without user-space tool support to the functionality:
      
       |
       | The offcore_msr perf kernel code was merged into 2.6.39-rc*, but the
       | user space bits were not. This made it impossible to set the extra mask
       | and actually do the OFFCORE profiling
       |
      
      Andi submitted a preliminary patch for user-space support, as an
      extension to perf's raw event syntax:
      
       |
       | Some raw events -- like the Intel OFFCORE events -- support additional
       | parameters. These can be appended after a ':'.
       |
       | For example on a multi socket Intel Nehalem:
       |
       |    perf stat -e r1b7:20ff -a sleep 1
       |
       | Profile the OFFCORE_RESPONSE.ANY_REQUEST with event mask REMOTE_DRAM_0
       | that measures any access to DRAM on another socket.
       |
      
      But this kind of usability is absolutely unacceptable - users should not
      be expected to type in magic, CPU and model specific incantations to get
      access to useful hardware functionality.
      
      The proper solution is to expose useful offcore functionality via
      generalized events - that way users do not have to care which specific
      CPU model they are using, they can use the conceptual event and not some
      model specific quirky hexa number.
      
      We already have such generalization in place for CPU cache events,
      and it's all very extensible.
      
      "Offcore" events measure general DRAM access patters along various
      parameters. They are particularly useful in NUMA systems.
      
      We want to support them via generalized DRAM events: either as the
      fourth level of cache (after the last-level cache), or as a separate
      generalization category.
      
      That way user-space support would be very obvious, memory access
      profiling could be done via self-explanatory commands like:
      
        perf record -e dram ./myapp
        perf record -e dram-remote ./myapp
      
      ... to measure DRAM accesses or more expensive cross-node NUMA DRAM
      accesses.
      
      These generalized events would work on all CPUs and architectures that
      have comparable PMU features.
      
      ( Note, these are just examples: actual implementation could have more
        sophistication and more parameter - as long as they center around
        similarly simple usecases. )
      
      Now we do not want to revert *all* of the current offcore bits, as they
      are still somewhat useful for generic last-level-cache events, implemented
      in this commit:
      
        e994d7d2: perf: Fix LLC-* events on Intel Nehalem/Westmere
      
      But we definitely do not yet want to expose the unstructured raw events
      to user-space, until better generalization and usability is implemented
      for these hardware event features.
      
      ( Note: after generalization has been implemented raw offcore events can be
        supported as well: there can always be an odd event that is marginally
        useful but not useful enough to generalize. DRAM profiling is definitely
        *not* such a category so generalization must be done first. )
      
      Furthermore, PERF_TYPE_RAW access to these registers was not intended
      to go upstream without proper support - it was a side-effect of the above
      e994d7d2 commit, not mentioned in the changelog.
      
      As v2.6.39 is nearing release we go for the simplest approach: disable
      the PERF_TYPE_RAW offcore hack for now, before it escapes into a released
      kernel and becomes an ABI.
      
      Once proper structure is implemented for these hardware events and users
      are offered usable solutions we can revisit this issue.
      Reported-by: NAndi Kleen <ak@linux.intel.com>
      Acked-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Link: http://lkml.kernel.org/r/1302658203-4239-1-git-send-email-andi@firstfloor.orgSigned-off-by: NIngo Molnar <mingo@elte.hu>
      b52c55c6
  16. 19 4月, 2011 1 次提交
  17. 25 3月, 2011 1 次提交
    • I
      perf, x86: Complain louder about BIOSen corrupting CPU/PMU state and continue · 45daae57
      Ingo Molnar 提交于
      Eric Dumazet reported that hardware PMU events do not work on his
      system, due to the BIOS corrupting PMU state:
      
          Performance Events: PEBS fmt0+, Core2 events, Broken BIOS detected, using software events only.
          [Firmware Bug]: the BIOS has corrupted hw-PMU resources (MSR 186 is 43003c)
      
      Linus suggested that we continue in the face of such BIOS-induced CPU
      state corruption:
      
         http://lkml.org/lkml/2011/3/24/608
      
      Such BIOSes will have to be fixed - Linux developers rely on a working and
      fully capable PMU and the BIOS interfering with the CPU's PMU state is simply
      not acceptable.
      
      So this patch changes perf to continue when it detects such BIOS
      interaction, some hardware events may be unreliable due to the BIOS
      writing and re-writing them - there's not much the kernel can do
      about that but to detect the corruption and report it.
      Reported-and-tested-by: NEric Dumazet <eric.dumazet@gmail.com>
      Suggested-by: NLinus Torvalds <torvalds@linux-foundation.org>
      Acked-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      LKML-Reference: <new-submission>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      45daae57
  18. 20 3月, 2011 1 次提交
    • S
      perf, x86: Fix Intel fixed counters base initialization · fc66c521
      Stephane Eranian 提交于
      The following patch solves the problems introduced by Robert's
      commit 41bf4989 and reported by Arun Sharma. This commit gets rid
      of the base + index notation for reading and writing PMU msrs.
      
      The problem is that for fixed counters, the new calculation for
      the base did not take into account the fixed counter indexes,
      thus all fixed counters were read/written from fixed counter 0.
      Although all fixed counters share the same config MSR, they each
      have their own counter register.
      
      Without:
      
       $ task -e unhalted_core_cycles -e instructions_retired -e baclears noploop 1 noploop for 1 seconds
      
        242202299 unhalted_core_cycles (0.00% scaling, ena=1000790892, run=1000790892)
       2389685946 instructions_retired (0.00% scaling, ena=1000790892, run=1000790892)
            49473 baclears             (0.00% scaling, ena=1000790892, run=1000790892)
      
      With:
      
       $ task -e unhalted_core_cycles -e instructions_retired -e baclears noploop 1 noploop for 1 seconds
      
       2392703238 unhalted_core_cycles (0.00% scaling, ena=1000840809, run=1000840809)
       2389793744 instructions_retired (0.00% scaling, ena=1000840809, run=1000840809)
            47863 baclears             (0.00% scaling, ena=1000840809, run=1000840809)
      Signed-off-by: NStephane Eranian <eranian@google.com>
      Cc: peterz@infradead.org
      Cc: ming.m.lin@intel.com
      Cc: robert.richter@amd.com
      Cc: asharma@fb.com
      Cc: perfmon2-devel@lists.sf.net
      LKML-Reference: <20110319172005.GB4978@quad>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      fc66c521
  19. 18 3月, 2011 2 次提交
    • N
      x86, dumpstack: Correct stack dump info when frame pointer is available · e8e999cf
      Namhyung Kim 提交于
      Current stack dump code scans entire stack and check each entry
      contains a pointer to kernel code. If CONFIG_FRAME_POINTER=y it
      could mark whether the pointer is valid or not based on value of
      the frame pointer. Invalid entries could be preceded by '?' sign.
      
      However this was not going to happen because scan start point
      was always higher than the frame pointer so that they could not
      meet.
      
      Commit 9c0729dc ("x86: Eliminate bp argument from the stack
      tracing routines") delayed bp acquisition point, so the bp was
      read in lower frame, thus all of the entries were marked
      invalid.
      
      This patch fixes this by reverting above commit while retaining
      stack_frame() helper as suggested by Frederic Weisbecker.
      
      End result looks like below:
      
      before:
      
       [    3.508329] Call Trace:
       [    3.508551]  [<ffffffff814f35c9>] ? panic+0x91/0x199
       [    3.508662]  [<ffffffff814f3739>] ? printk+0x68/0x6a
       [    3.508770]  [<ffffffff81a981b2>] ? mount_block_root+0x257/0x26e
       [    3.508876]  [<ffffffff81a9821f>] ? mount_root+0x56/0x5a
       [    3.508975]  [<ffffffff81a98393>] ? prepare_namespace+0x170/0x1a9
       [    3.509216]  [<ffffffff81a9772b>] ? kernel_init+0x1d2/0x1e2
       [    3.509335]  [<ffffffff81003894>] ? kernel_thread_helper+0x4/0x10
       [    3.509442]  [<ffffffff814f6880>] ? restore_args+0x0/0x30
       [    3.509542]  [<ffffffff81a97559>] ? kernel_init+0x0/0x1e2
       [    3.509641]  [<ffffffff81003890>] ? kernel_thread_helper+0x0/0x10
      
      after:
      
       [    3.522991] Call Trace:
       [    3.523351]  [<ffffffff814f35b9>] panic+0x91/0x199
       [    3.523468]  [<ffffffff814f3729>] ? printk+0x68/0x6a
       [    3.523576]  [<ffffffff81a981b2>] mount_block_root+0x257/0x26e
       [    3.523681]  [<ffffffff81a9821f>] mount_root+0x56/0x5a
       [    3.523780]  [<ffffffff81a98393>] prepare_namespace+0x170/0x1a9
       [    3.523885]  [<ffffffff81a9772b>] kernel_init+0x1d2/0x1e2
       [    3.523987]  [<ffffffff81003894>] kernel_thread_helper+0x4/0x10
       [    3.524228]  [<ffffffff814f6880>] ? restore_args+0x0/0x30
       [    3.524345]  [<ffffffff81a97559>] ? kernel_init+0x0/0x1e2
       [    3.524445]  [<ffffffff81003890>] ? kernel_thread_helper+0x0/0x10
      
       -v5:
         * fix build breakage with oprofile
      
       -v4:
         * use 0 instead of regs->bp
         * separate out printk changes
      
       -v3:
         * apply comment from Frederic
         * add a couple of printk fixes
      Signed-off-by: NNamhyung Kim <namhyung@gmail.com>
      Acked-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      Acked-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Soren Sandmann <ssp@redhat.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Robert Richter <robert.richter@amd.com>
      LKML-Reference: <1300416006-3163-1-git-send-email-namhyung@gmail.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      e8e999cf
    • L
      x86: Fix common misspellings · 0d2eb44f
      Lucas De Marchi 提交于
      They were generated by 'codespell' and then manually reviewed.
      Signed-off-by: NLucas De Marchi <lucas.demarchi@profusion.mobi>
      Cc: trivial@kernel.org
      LKML-Reference: <1300389856-1099-3-git-send-email-lucas.demarchi@profusion.mobi>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      0d2eb44f
  20. 16 3月, 2011 1 次提交
  21. 05 3月, 2011 1 次提交
  22. 04 3月, 2011 2 次提交
    • A
      perf: Fix LLC-* events on Intel Nehalem/Westmere · e994d7d2
      Andi Kleen 提交于
      On Intel Nehalem and Westmere CPUs the generic perf LLC-* events count the
      L2 caches, not the real L3 LLC - this was inconsistent with behavior on
      other CPUs.
      
      Fixing this requires the use of the special OFFCORE_RESPONSE
      events which need a separate mask register.
      
      This has been implemented by the previous patch, now use this infrastructure
      to set correct events for the LLC-* on Nehalem and Westmere.
      Signed-off-by: NAndi Kleen <ak@linux.intel.com>
      Signed-off-by: NLin Ming <ming.m.lin@intel.com>
      Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      LKML-Reference: <1299119690-13991-3-git-send-email-ming.m.lin@intel.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      e994d7d2
    • A
      perf: Add support for supplementary event registers · a7e3ed1e
      Andi Kleen 提交于
      Change logs against Andi's original version:
      
      - Extends perf_event_attr:config to config{,1,2} (Peter Zijlstra)
      - Fixed a major event scheduling issue. There cannot be a ref++ on an
        event that has already done ref++ once and without calling
        put_constraint() in between. (Stephane Eranian)
      - Use thread_cpumask for percore allocation. (Lin Ming)
      - Use MSR names in the extra reg lists. (Lin Ming)
      - Remove redundant "c = NULL" in intel_percore_constraints
      - Fix comment of perf_event_attr::config1
      
      Intel Nehalem/Westmere have a special OFFCORE_RESPONSE event
      that can be used to monitor any offcore accesses from a core.
      This is a very useful event for various tunings, and it's
      also needed to implement the generic LLC-* events correctly.
      
      Unfortunately this event requires programming a mask in a separate
      register. And worse this separate register is per core, not per
      CPU thread.
      
      This patch:
      
      - Teaches perf_events that OFFCORE_RESPONSE needs extra parameters.
        The extra parameters are passed by user space in the
        perf_event_attr::config1 field.
      
      - Adds support to the Intel perf_event core to schedule per
        core resources. This adds fairly generic infrastructure that
        can be also used for other per core resources.
        The basic code has is patterned after the similar AMD northbridge
        constraints code.
      
      Thanks to Stephane Eranian who pointed out some problems
      in the original version and suggested improvements.
      Signed-off-by: NAndi Kleen <ak@linux.intel.com>
      Signed-off-by: NLin Ming <ming.m.lin@intel.com>
      Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      LKML-Reference: <1299119690-13991-2-git-send-email-ming.m.lin@intel.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      a7e3ed1e
  23. 02 3月, 2011 1 次提交
    • L
      perf, x86: Add Intel SandyBridge CPU support · b06b3d49
      Lin Ming 提交于
      This patch adds basic SandyBridge support, including hardware
      cache events and PEBS events support.
      
      It has been tested on SandyBridge CPUs with perf stat and also
      with PEBS based profiling - both work fine.
      
      The patch does not affect other models.
      
      v2 -> v3:
       - fix PEBS event 0xd0 with right umask combinations
       - move snb pebs constraint assignment to intel_pmu_init
      
      v1 -> v2:
       - add more raw and PEBS events constraints
       - use offcore events for LLC-* cache events
       - remove the call to Nehalem workaround enable_all function
      Signed-off-by: NLin Ming <ming.m.lin@intel.com>
      Acked-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Andi Kleen <andi@firstfloor.org>
      LKML-Reference: <1299072424.2175.24.camel@localhost>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      b06b3d49
  24. 16 2月, 2011 3 次提交