1. 15 9月, 2009 1 次提交
  2. 04 9月, 2009 1 次提交
    • I
      perf_counter: Fix output-sharing error path · dc86cabe
      Ingo Molnar 提交于
      We forget to release the fd in the PERF_FLAG_FD_OUTPUT
      error path.
      
      Reorganize the error flow here to be a clean fall-through
      logic.
      
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <new-submission>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      dc86cabe
  3. 03 9月, 2009 1 次提交
  4. 29 8月, 2009 1 次提交
    • P
      perf_counter: Fix /0 bug in swcounters · eced1dfc
      Peter Zijlstra 提交于
      We have a race in the swcounter stuff where we can start
      counting a counter that has never been enabled, this leads to a
      /0 situation.
      
      The below avoids the /0 but doesn't close the race, this would
      need a new counter state.
      
      The race is due to perf_swcounter_is_counting() which cannot
      discern between disabled due to scheduled out, and disabled for
      any other reason.
      
      Such a crash has been seen by Ingo:
      
      [  967.092372] divide error: 0000 [#1] SMP
      [  967.096499] last sysfs file: /sys/devices/system/cpu/cpu15/cache/index2/shared_cpu_map
      [  967.104846] CPU 5
      [  967.106965] Modules linked in:
      [  967.110169] Pid: 3351, comm: hackbench Not tainted 2.6.31-rc8-tip-01158-gd940a54-dirty #1568 X8DTN
      [  967.119456] RIP: 0010:[<ffffffff810c0aba>]  [<ffffffff810c0aba>] perf_swcounter_ctx_event+0x127/0x1af
      [  967.129137] RSP: 0018:ffff8801a95abd70  EFLAGS: 00010046
      [  967.134699] RAX: 0000000000000002 RBX: ffff8801bd645c00 RCX: 0000000000000002
      [  967.142162] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff8801bd645d40
      [  967.149584] RBP: ffff8801a95abdb0 R08: 0000000000000001 R09: ffff8801a95abe00
      [  967.157042] R10: 0000000000000037 R11: ffff8801aa1245f8 R12: ffff8801a95abe00
      [  967.164481] R13: ffff8801a95abe00 R14: ffff8801aa1c0e78 R15: 0000000000000001
      [  967.171953] FS:  0000000000000000(0000) GS:ffffc90000a00000(0063) knlGS:00000000f7f486c0
      [  967.180406] CS:  0010 DS: 002b ES: 002b CR0: 000000008005003b
      [  967.186374] CR2: 000000004822c0ac CR3: 00000001b19a2000 CR4: 00000000000006e0
      [  967.193770] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      [  967.201224] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
      [  967.208692] Process hackbench (pid: 3351, threadinfo ffff8801a95aa000, task ffff8801a96b0000)
      [  967.217607] Stack:
      [  967.219711]  0000000000000000 0000000000000037 0000000200000001 ffffc90000a1107c
      [  967.227296] <0> ffff8801a95abe00 0000000000000001 0000000000000001 0000000000000037
      [  967.235333] <0> ffff8801a95abdf0 ffffffff810c0c20 0000000200a14f30 ffff8801a95abe40
      [  967.243532] Call Trace:
      [  967.246103]  [<ffffffff810c0c20>] do_perf_swcounter_event+0xde/0xec
      [  967.252635]  [<ffffffff810c0ca7>] perf_tpcounter_event+0x79/0x7b
      [  967.258957]  [<ffffffff81037f73>] ftrace_profile_sched_switch+0xc0/0xcb
      [  967.265791]  [<ffffffff8155f22d>] schedule+0x429/0x4c4
      [  967.271156]  [<ffffffff8100c01e>] int_careful+0xd/0x14
      Reported-by: NIngo Molnar <mingo@elte.hu>
      Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      LKML-Reference: <1251472247.17617.74.camel@laptop>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      eced1dfc
  5. 28 8月, 2009 1 次提交
    • I
      perf_counters: Increase paranoia level · 6bb56347
      Ingo Molnar 提交于
      Per-cpu counters are an ASLR information leak as they show
      the execution other tasks do. Increase the paranoia level
      to 1, which disallows per-cpu counters. (they still allow
      counting/profiling of own tasks - and admin can profile
      everything.)
      Acked-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <new-submission>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      6bb56347
  6. 25 8月, 2009 2 次提交
    • P
      perf_counter: Allow sharing of output channels · a4be7c27
      Peter Zijlstra 提交于
      Provide the ability to configure a counter to send its output
      to another (already existing) counter's output stream.
      Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: stephane eranian <eranian@googlemail.com>
      Cc: Paul Mackerras <paulus@samba.org>
      LKML-Reference: <20090819092023.980284148@chello.nl>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      a4be7c27
    • P
      perf_counter: Start counting time enabled when group leader gets enabled · fa289bec
      Paul Mackerras 提交于
      Currently, if a group is created where the group leader is
      initially disabled but a non-leader member is initially
      enabled, and then the leader is subsequently enabled some time
      later, the time_enabled for the non-leader member will reflect
      the whole time since it was created, not just the time since
      the leader was enabled.
      
      This is incorrect, because all of the members are effectively
      disabled while the leader is disabled, since none of the
      members can go on the PMU if the leader can't.
      
      Thus we have to update the ->tstamp_enabled for all the enabled
      group members when a group leader is enabled, so that the
      time_enabled computation only counts the time since the leader
      was enabled.
      
      Similarly, when disabling a group leader we have to update the
      time_enabled and time_running for all of the group members.
      
      Also, in update_counter_times, we have to treat a counter whose
      group leader is disabled as being disabled.
      Reported-by: NStephane Eranian <eranian@googlemail.com>
      Signed-off-by: NPaul Mackerras <paulus@samba.org>
      Acked-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: <stable@kernel.org>
      LKML-Reference: <19091.29664.342227.445006@drongo.ozlabs.ibm.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      fa289bec
  7. 22 8月, 2009 1 次提交
    • P
      perf_counter: Fix typo in read() output generation · 4464fcaa
      Peter Zijlstra 提交于
      When you iterate a list, using the iterator is useful.
      
      Before:
      
         ID: 5
         ID: 5
         ID: 5
         ID: 5
         EVNT: 0x40088b scale: nan ID: 5 CNT: 1006252 ID: 6 CNT: 1011090 ID: 7 CNT: 1011196 ID: 8 CNT: 1011095
         EVNT: 0x40088c scale: 1.000000 ID: 5 CNT: 2003065 ID: 6 CNT: 2011671 ID: 7 CNT: 2012620 ID: 8 CNT: 2013479
         EVNT: 0x40088c scale: 1.000000 ID: 5 CNT: 3002390 ID: 6 CNT: 3015996 ID: 7 CNT: 3018019 ID: 8 CNT: 3020006
         EVNT: 0x40088b scale: 1.000000 ID: 5 CNT: 4002406 ID: 6 CNT: 4021120 ID: 7 CNT: 4024241 ID: 8 CNT: 4027059
      
      After:
      
         ID: 1
         ID: 2
         ID: 3
         ID: 4
         EVNT: 0x400889 scale: nan ID: 1 CNT: 1005270 ID: 2 CNT: 1009833 ID: 3 CNT: 1010065 ID: 4 CNT: 1010088
         EVNT: 0x400898 scale: nan ID: 1 CNT: 2001531 ID: 2 CNT: 2022309 ID: 3 CNT: 2022470 ID: 4 CNT: 2022627
         EVNT: 0x400888 scale: 0.489467 ID: 1 CNT: 3001261 ID: 2 CNT: 3027088 ID: 3 CNT: 3027941 ID: 4 CNT: 3028762
      Reported-by: Nstephane eranian <eranian@googlemail.com>
      Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Corey J Ashford <cjashfor@us.ibm.com>
      Cc: perfmon2-devel <perfmon2-devel@lists.sourceforge.net>
      LKML-Reference: <1250867976.7538.73.camel@twins>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      4464fcaa
  8. 18 8月, 2009 1 次提交
    • I
      perf_counter: Fix the PARISC build · f738eb1b
      Ingo Molnar 提交于
      PARISC does not build:
      
      /home/mingo/tip/kernel/perf_counter.c: In function 'perf_counter_index':
      /home/mingo/tip/kernel/perf_counter.c:2016: error: 'PERF_COUNTER_INDEX_OFFSET' undeclared (first use in this function)
      /home/mingo/tip/kernel/perf_counter.c:2016: error: (Each undeclared identifier is reported only once
      /home/mingo/tip/kernel/perf_counter.c:2016: error: for each function it appears in.)
      
      As PERF_COUNTER_INDEX_OFFSET is not defined.
      
      Now, we could define it in the architecture - but lets also provide
      a core default of 0 (which happens to be what all but one
      architecture uses at the moment).
      
      Architectures that need a different index offset should set this
      value in their asm/perf_counter.h files.
      
      Cc: Kyle McMartin <kyle@mcmartin.ca>
      Cc: Helge Deller <deller@gmx.de>
      Cc: linux-parisc@vger.kernel.org
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <new-submission>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      f738eb1b
  9. 17 8月, 2009 1 次提交
    • P
      perf_counter: Check task on counter read IPI · e1ac3614
      Paul Mackerras 提交于
      In general, code in perf_counter.c that is called through an
      IPI checks, for per-task counters, that the counter's task is
      still the current task.  This is to handle the race condition
      where the cpu switches from the task we want to another task in
      the interval between sending the IPI and the IPI arriving and
      being handled on the target CPU.
      
      For some reason, __perf_counter_read is missing this check, yet
      there is no reason why the race condition can't occur.  This
      adds a check that the current task is the one we want.  If it
      isn't, we just return.  In that case the counter->count value
      should be up to date, since it will have been updated when the
      counter was scheduled out, which must have happened since the
      IPI was sent.
      
      I don't have an example of an actual failure due to this race,
      but it seems obvious that it could occur and we need to guard
      against it.
      Signed-off-by: NPaul Mackerras <paulus@samba.org>
      Acked-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      LKML-Reference: <19076.63614.277861.368125@drongo.ozlabs.ibm.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      e1ac3614
  10. 13 8月, 2009 5 次提交
    • P
      perf_counter: Report the cloning task as parent on perf_counter_fork() · 94d5d1b2
      Peter Zijlstra 提交于
      A bug in (9f498cc5: perf_counter: Full task tracing) makes
      profiling multi-threaded apps it go belly up.
      
      [ output as: (PID:TID):(PPID:PTID) ]
      
       # ./perf report -D | grep FORK
      0x4b0 [0x18]: PERF_EVENT_FORK: (3237:3237):(3236:3236)
      0xa10 [0x18]: PERF_EVENT_FORK: (3237:3238):(3236:3236)
      0xa70 [0x18]: PERF_EVENT_FORK: (3237:3239):(3236:3236)
      0xad0 [0x18]: PERF_EVENT_FORK: (3237:3240):(3236:3236)
      0xb18 [0x18]: PERF_EVENT_FORK: (3237:3241):(3236:3236)
      
      Shows us that the test (27d028de perf report: Update for the new
      FORK/EXIT events) in builtin-report.c:
      
              /*
               * A thread clone will have the same PID for both
               * parent and child.
               */
              if (thread == parent)
                      return 0;
      
      Will clearly fail.
      
      The problem is that perf_counter_fork() reports the actual
      parent, instead of the cloning thread.
      
      Fixing that (with the below patch), yields:
      
       # ./perf report -D | grep FORK
      0x4c8 [0x18]: PERF_EVENT_FORK: (1590:1590):(1589:1589)
      0xbd8 [0x18]: PERF_EVENT_FORK: (1590:1591):(1590:1590)
      0xc80 [0x18]: PERF_EVENT_FORK: (1590:1592):(1590:1590)
      0x3338 [0x18]: PERF_EVENT_FORK: (1590:1593):(1590:1590)
      0x66b0 [0x18]: PERF_EVENT_FORK: (1590:1594):(1590:1590)
      
      Which both makes more sense and doesn't confuse perf report
      anymore.
      Reported-by: NPekka Enberg <penberg@cs.helsinki.fi>
      Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: paulus@samba.org
      Cc: Anton Blanchard <anton@samba.org>
      Cc: Arjan van de Ven <arjan@infradead.org>
      LKML-Reference: <1250172882.5241.62.camel@twins>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      94d5d1b2
    • P
      perf_counter: Fix an ipi-deadlock · 970892a9
      Peter Zijlstra 提交于
      perf_pending_counter() is called from IRQ context and will call
      perf_counter_disable(), however perf_counter_disable() uses
      smp_call_function_single() which doesn't fancy being used with
      IRQs disabled due to IPI deadlocks.
      
      Fix this by making it use the local __perf_counter_disable()
      call and teaching the counter_sched_out() code about pending
      disables as well.
      
      This should cover the case where a counter migrates before the
      pending queue gets processed.
      Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Corey J Ashford <cjashfor@us.ibm.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: stephane eranian <eranian@googlemail.com>
      LKML-Reference: <20090813103655.244097721@chello.nl>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      970892a9
    • P
      perf: Rework/fix the whole read vs group stuff · 3dab77fb
      Peter Zijlstra 提交于
      Replace PERF_SAMPLE_GROUP with PERF_SAMPLE_READ and introduce
      PERF_FORMAT_GROUP to deal with group reads in a more generic
      way.
      
      This allows you to get group reads out of read() as well.
      Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Corey J Ashford <cjashfor@us.ibm.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: stephane eranian <eranian@googlemail.com>
      LKML-Reference: <20090813103655.117411814@chello.nl>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      3dab77fb
    • P
      perf_counter: Fix swcounter context invariance · bcfc2602
      Peter Zijlstra 提交于
      perf_swcounter_is_counting() uses a lock, which means we cannot
      use swcounters from NMI or when holding that particular lock,
      this is unintended.
      
      The below removes the lock, this opens up race window, but not
      worse than the swcounters already experience due to RCU
      traversal of the context in perf_swcounter_ctx_event().
      
      This also fixes the hard lockups while opening a lockdep
      tracepoint counter.
      Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      Acked-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: stephane eranian <eranian@googlemail.com>
      Cc: Corey J Ashford <cjashfor@us.ibm.com>
      LKML-Reference: <1250149915.10001.66.camel@twins>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      bcfc2602
    • I
      perf_counter: Provide hw_perf_counter_setup_online() APIs · 28402971
      Ingo Molnar 提交于
      Provide weak aliases for hw_perf_counter_setup_online(). This is
      used by the BTS patches (for v2.6.32), but it interacts with
      fixes so propagate this upstream. (it has no effect as of yet)
      
      Also export perf_counter_output() to architecture code.
      
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <new-submission>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      28402971
  11. 10 8月, 2009 2 次提交
  12. 09 8月, 2009 7 次提交
    • M
      x86, perf_counter, bts: Add BTS support to perfcounters · 30dd568c
      Markus Metzger 提交于
      Implement a performance counter with:
      
          attr.type           = PERF_TYPE_HARDWARE
          attr.config         = PERF_COUNT_HW_BRANCH_INSTRUCTIONS
          attr.sample_period  = 1
      
      Using branch trace store (BTS) on x86 hardware, if available.
      
      The from and to address for each branch can be sampled using:
      
          PERF_SAMPLE_IP      for the from address
          PERF_SAMPLE_ADDR    for the to address
      
      [ v2: address review feedback, fix bugs ]
      Signed-off-by: NMarkus Metzger <markus.t.metzger@intel.com>
      Acked-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      30dd568c
    • P
      perf_counter: Fix a race on perf_counter_ctx · 3a80b4a3
      Peter Zijlstra 提交于
      While extending perfcounters with BTS hw-tracing, Markus
      Metzger managed to trigger this warning:
      
         [  995.557128] WARNING: at kernel/perf_counter.c:1191 __perf_counter_task_sched_out+0x48/0x6b()
      
      triggers because commit
      9f498cc5 (perf_counter: Full
      task tracing) removed clearing of tsk->perf_counter_ctxp out
      from under ctx->lock which introduced a race (against
      perf_lock_task_context).
      
      Move it back and deal with the exit notification by explicitly
      passing along the former task context.
      Reported-by: NMarkus T Metzger <markus.t.metzger@intel.com>
      Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      LKML-Reference: <1249667341.17467.5.camel@twins>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      3a80b4a3
    • F
      perf_counter: Fix tracepoint sampling to be part of generic sampling · 3a43ce68
      Frederic Weisbecker 提交于
      Based on Peter's comments, make tracepoint sampling generic
      just like all the other sampling bits are. This is a rename
      with no code changes:
      
      - PERF_SAMPLE_TP_RECORD to PERF_SAMPLE_RAW
      - struct perf_tracepoint_record to perf_raw_record
      
      We want the system in place that transport tracepoints raw
      samples events into the perf ring buffer to be generalized and
      usable by any type of counter.
      
      Reported-by; Peter Zijlstra <peterz@infradead.org>
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      LKML-Reference: <1249698400-5441-4-git-send-email-fweisbec@gmail.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      3a43ce68
    • F
      perf_counter: Work around gcc warning by initializing tracepoint record unconditionally · 10b8e306
      Frederic Weisbecker 提交于
      Despite that the tracepoint record is always present when the
      PERF_SAMPLE_TP_RECORD flag is set, gcc raises a warning,
      thinking it might not be initialized:
      
        kernel/perf_counter.c: In function ‘perf_counter_output’:
        kernel/perf_counter.c:2650: warning: ‘tp’ may be used uninitialized in this function
      
      Then, initialize it to NULL and always check if it's not NULL
      before dereference it.
      Reported-by: NIngo Molnar <mingo@elte.hu>
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      LKML-Reference: <1249698400-5441-2-git-send-email-fweisbec@gmail.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      10b8e306
    • P
      perf_counter: Fix software counters for fast moving event sources · 7b4b6658
      Peter Zijlstra 提交于
      Reimplement the software counters to deal with fast moving
      event sources (such as tracepoints). This means being able
      to generate multiple overflows from a single 'event' as well
      as support throttling.
      Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      7b4b6658
    • F
      perf_counter: Fix/complete ftrace event records sampling · f413cdb8
      Frederic Weisbecker 提交于
      This patch implements the kernel side support for ftrace event
      record sampling.
      
      A new counter sampling attribute is added:
      
         PERF_SAMPLE_TP_RECORD
      
      which requests ftrace events record sampling. In this case
      if a PERF_TYPE_TRACEPOINT counter is active and a tracepoint
      fires, we emit the tracepoint binary record to the
      perfcounter event buffer, as a sample.
      
      Result, after setting PERF_SAMPLE_TP_RECORD attribute from perf
      record:
      
       perf record -f -F 1 -a -e workqueue:workqueue_execution
       perf report -D
      
       0x21e18 [0x48]: event: 9
       .
       . ... raw event: size 72 bytes
       .  0000:  09 00 00 00 01 00 48 00 d0 c7 00 81 ff ff ff ff  ......H........
       .  0010:  0a 00 00 00 0a 00 00 00 21 00 00 00 00 00 00 00  ........!......
       .  0020:  2b 00 01 02 0a 00 00 00 0a 00 00 00 65 76 65 6e  +...........eve
       .  0030:  74 73 2f 31 00 00 00 00 00 00 00 00 0a 00 00 00  ts/1...........
       .  0040:  e0 b1 31 81 ff ff ff ff                          .......
      .
      0x21e18 [0x48]: PERF_EVENT_SAMPLE (IP, 1): 10: 0xffffffff8100c7d0 period: 33
      
      The raw ftrace binary record starts at offset 0020.
      
      Translation:
      
       struct trace_entry {
      	type		= 0x2b = 43;
      	flags		= 1;
      	preempt_count	= 2;
      	pid		= 0xa = 10;
      	tgid		= 0xa = 10;
       }
      
       thread_comm = "events/1"
       thread_pid  = 0xa = 10;
       func	    = 0xffffffff8131b1e0 = flush_to_ldisc()
      
      What will come next?
      
       - Userspace support ('perf trace'), 'flight data recorder' mode
         for perf trace, etc.
      
       - The unconditional copy from the profiling callback brings
         some costs however if someone wants no such sampling to
         occur, and needs to be fixed in the future. For that we need
         to have an instant access to the perf counter attribute.
         This is a matter of a flag to add in the struct ftrace_event.
      
       - Take care of the events recursivity! Don't ever try to record
         a lock event for example, it seems some locking is used in
         the profiling fast path and lead to a tracing recursivity.
         That will be fixed using raw spinlock or recursivity
         protection.
      
       - [...]
      
       - Profit! :-)
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Li Zefan <lizf@cn.fujitsu.com>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Pekka Enberg <penberg@cs.helsinki.fi>
      Cc: Gabriel Munteanu <eduard.munteanu@linux360.ro>
      Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      f413cdb8
    • P
      perf_counter, ftrace: Fix perf_counter integration · 3a659305
      Peter Zijlstra 提交于
      Adds possible second part to the assign argument of TP_EVENT().
      
        TP_perf_assign(
      	__perf_count(foo);
      	__perf_addr(bar);
        )
      
      Which, when specified make the swcounter increment with @foo instead
      of the usual 1, and report @bar for PERF_SAMPLE_ADDR (data address
      associated with the event) when this triggers a counter overflow.
      Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      Acked-by: NSteven Rostedt <rostedt@goodmis.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jason Baron <jbaron@redhat.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      3a659305
  13. 07 8月, 2009 1 次提交
    • P
      perf_counter: Fix double list iteration in per task precise stats · 1054598c
      Peter Zijlstra 提交于
      Brice Goglin reported this crash with per task precise stats:
      
      > I finally managed to test the threaded perfcounter statistics (thanks a
      > lot for implementing it). I am running 2.6.31-rc5 (with the AMD
      > magny-cours patches but I don't think they matter here). I am trying to
      > measure local/remote memory accesses per thread during the well-known
      > stream benchmark. It's compiled with OpenMP using 16 threads on a
      > quad-socket quad-core barcelona machine.
      >
      > Command line is:
      >  /mnt/scratch/bgoglin/cpunode/linux-2.6.31/tools/perf/perf record -f -s
      > -e r1000001e0 -e r1000002e0 -e r1000004e0 -e r1000008e0 ./stream
      >
      > It seems to work fine with a single -e <counter> on the command line
      > while it crashes when there are at least 2 of them.
      > It seems to work fine without -s as well.
      
      A silly copy-paste resulted in a messed up iteration which would
      cause the OOPS.
      Reported-by: NBrice Goglin <Brice.Goglin@inria.fr>
      Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      Tested-by: NBrice Goglin <Brice.Goglin@inria.fr>
      LKML-Reference: <1249574786.32113.550.camel@twins>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      1054598c
  14. 02 8月, 2009 2 次提交
  15. 23 7月, 2009 5 次提交
  16. 18 7月, 2009 1 次提交
    • A
      perf_counter: Make sure we dont leak kernel memory to userspace · 413ee3b4
      Anton Blanchard 提交于
      There are a few places we are leaking tiny amounts of kernel
      memory to userspace. This happens when writing out strings
      because we always align the end to 64 bits.
      
      To avoid this we should always use an appropriately sized
      temporary buffer and ensure it is zeroed.
      
      Since d_path assembles the string from the end of the buffer
      backwards, we need to add 64 bits after the buffer to allow for
      alignment.
      
      We also need to copy arch_vma_name to the temporary buffer,
      because if we use it directly we may end up copying to
      userspace a number of bytes after the end of the string
      constant.
      Signed-off-by: NAnton Blanchard <anton@samba.org>
      Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      LKML-Reference: <20090716104817.273972048@samba.org>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      413ee3b4
  17. 13 7月, 2009 1 次提交
    • C
      perf_counter: Fix the tracepoint channel to perfcounters · d4d7d0b9
      Chris Wilson 提交于
      Fix a missed rename in EVENT_PROFILE support so that it gets
      built and allows tracepoint tracing from the 'perf' tool.
      
      Fix a typo in the (never before built & enabled) portion in
      perf_counter.c as well, and update that code to the
      attr.config changes as well.
      Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
      Cc: Ben Gamari <bgamari.foss@gmail.com>
      Cc: Jason Baron <jbaron@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      LKML-Reference: <1246869094-21237-1-git-send-email-chris@chris-wilson.co.uk>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      d4d7d0b9
  18. 10 7月, 2009 1 次提交
  19. 07 7月, 2009 1 次提交
    • K
      Fix virt_to_phys() warnings · 5bfd7560
      Kevin Cernekee 提交于
      These warnings were observed on MIPS32 using 2.6.31-rc1 and gcc-4.2.0:
      
      mm/page_alloc.c: In function 'alloc_pages_exact':
      mm/page_alloc.c:1986: warning: passing argument 1 of 'virt_to_phys' makes pointer from integer without a cast
      
      drivers/usb/mon/mon_bin.c: In function 'mon_alloc_buff':
      drivers/usb/mon/mon_bin.c:1264: warning: passing argument 1 of 'virt_to_phys' makes pointer from integer without a cast
      
      [akpm@linux-foundation.org: fix kernel/perf_counter.c too]
      Signed-off-by: NKevin Cernekee <cernekee@gmail.com>
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      5bfd7560
  20. 30 6月, 2009 1 次提交
  21. 26 6月, 2009 3 次提交
    • P
      perf_counter: Complete counter swap · 19d2e755
      Peter Zijlstra 提交于
      Complete the counter swap by indeed switching the times too and
      updating the userpage after modifying the counter values.
      Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      LKML-Reference: <1246014623.31755.195.camel@twins>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      19d2e755
    • P
      perf_counter: Rework the sample ABI · e6e18ec7
      Peter Zijlstra 提交于
      The PERF_EVENT_READ implementation made me realize we don't
      actually need the sample_type int the output sample, since
      we already have that in the perf_counter_attr information.
      
      Therefore, remove the PERF_EVENT_MISC_OVERFLOW bit and the
      event->type overloading, and imply put counter overflow
      samples in a PERF_EVENT_SAMPLE type.
      
      This also fixes the issue that event->type was only 32-bit
      and sample_type had 64 usable bits.
      Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      LKML-Reference: <new-submission>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      e6e18ec7
    • P
      perf_counter: Implement more accurate per task statistics · bfbd3381
      Peter Zijlstra 提交于
      With the introduction of PERF_EVENT_READ we have the
      possibility to provide accurate counter values for
      individual tasks in a task hierarchy.
      
      However, due to the lazy context switching used for similar
      counter contexts our current per task counts are way off.
      
      In order to maintain some of the lazy switch benefits we
      don't disable it out-right, but simply iterate the active
      counters and flip the values between the contexts.
      
      This only reads the counters but does not need to reprogram
      the full PMU.
      Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      LKML-Reference: <new-submission>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      bfbd3381