1. 10 9月, 2010 8 次提交
    • P
      perf: Remove the sysfs bits · 15ac9a39
      Peter Zijlstra 提交于
      Neither the overcommit nor the reservation sysfs parameter were
      actually working, remove them as they'll only get in the way.
      Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: paulus <paulus@samba.org>
      LKML-Reference: <new-submission>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      15ac9a39
    • P
      perf: Rework the PMU methods · a4eaf7f1
      Peter Zijlstra 提交于
      Replace pmu::{enable,disable,start,stop,unthrottle} with
      pmu::{add,del,start,stop}, all of which take a flags argument.
      
      The new interface extends the capability to stop a counter while
      keeping it scheduled on the PMU. We replace the throttled state with
      the generic stopped state.
      
      This also allows us to efficiently stop/start counters over certain
      code paths (like IRQ handlers).
      
      It also allows scheduling a counter without it starting, allowing for
      a generic frozen state (useful for rotating stopped counters).
      
      The stopped state is implemented in two different ways, depending on
      how the architecture implemented the throttled state:
      
       1) We disable the counter:
          a) the pmu has per-counter enable bits, we flip that
          b) we program a NOP event, preserving the counter state
      
       2) We store the counter state and ignore all read/overflow events
      Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: paulus <paulus@samba.org>
      Cc: stephane eranian <eranian@googlemail.com>
      Cc: Robert Richter <robert.richter@amd.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Paul Mundt <lethal@linux-sh.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Cyrill Gorcunov <gorcunov@gmail.com>
      Cc: Lin Ming <ming.m.lin@intel.com>
      Cc: Yanmin <yanmin_zhang@linux.intel.com>
      Cc: Deng-Cheng Zhu <dengcheng.zhu@gmail.com>
      Cc: David Miller <davem@davemloft.net>
      Cc: Michael Cree <mcree@orcon.net.nz>
      LKML-Reference: <new-submission>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      a4eaf7f1
    • P
      perf: Shrink hw_perf_event · fa407f35
      Peter Zijlstra 提交于
      Use hw_perf_event::period_left instead of hw_perf_event::remaining
      and win back 8 bytes.
      Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: paulus <paulus@samba.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <new-submission>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      fa407f35
    • P
      perf: Default PMU ops · ad5133b7
      Peter Zijlstra 提交于
      Provide default implementations for the pmu txn methods, this
      allows us to remove some conditional code.
      Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: paulus <paulus@samba.org>
      Cc: stephane eranian <eranian@googlemail.com>
      Cc: Robert Richter <robert.richter@amd.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Paul Mundt <lethal@linux-sh.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Cyrill Gorcunov <gorcunov@gmail.com>
      Cc: Lin Ming <ming.m.lin@intel.com>
      Cc: Yanmin <yanmin_zhang@linux.intel.com>
      Cc: Deng-Cheng Zhu <dengcheng.zhu@gmail.com>
      Cc: David Miller <davem@davemloft.net>
      Cc: Michael Cree <mcree@orcon.net.nz>
      LKML-Reference: <new-submission>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      ad5133b7
    • P
      perf: Per PMU disable · 33696fc0
      Peter Zijlstra 提交于
      Changes perf_disable() into perf_pmu_disable().
      Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: paulus <paulus@samba.org>
      Cc: stephane eranian <eranian@googlemail.com>
      Cc: Robert Richter <robert.richter@amd.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Paul Mundt <lethal@linux-sh.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Cyrill Gorcunov <gorcunov@gmail.com>
      Cc: Lin Ming <ming.m.lin@intel.com>
      Cc: Yanmin <yanmin_zhang@linux.intel.com>
      Cc: Deng-Cheng Zhu <dengcheng.zhu@gmail.com>
      Cc: David Miller <davem@davemloft.net>
      Cc: Michael Cree <mcree@orcon.net.nz>
      LKML-Reference: <new-submission>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      33696fc0
    • P
      perf: Reduce perf_disable() usage · 24cd7f54
      Peter Zijlstra 提交于
      Since the current perf_disable() usage is only an optimization,
      remove it for now. This eases the removal of the __weak
      hw_perf_enable() interface.
      Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: paulus <paulus@samba.org>
      Cc: stephane eranian <eranian@googlemail.com>
      Cc: Robert Richter <robert.richter@amd.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Paul Mundt <lethal@linux-sh.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Cyrill Gorcunov <gorcunov@gmail.com>
      Cc: Lin Ming <ming.m.lin@intel.com>
      Cc: Yanmin <yanmin_zhang@linux.intel.com>
      Cc: Deng-Cheng Zhu <dengcheng.zhu@gmail.com>
      Cc: David Miller <davem@davemloft.net>
      Cc: Michael Cree <mcree@orcon.net.nz>
      LKML-Reference: <new-submission>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      24cd7f54
    • P
      perf: Register PMU implementations · b0a873eb
      Peter Zijlstra 提交于
      Simple registration interface for struct pmu, this provides the
      infrastructure for removing all the weak functions.
      Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: paulus <paulus@samba.org>
      Cc: stephane eranian <eranian@googlemail.com>
      Cc: Robert Richter <robert.richter@amd.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Paul Mundt <lethal@linux-sh.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Cyrill Gorcunov <gorcunov@gmail.com>
      Cc: Lin Ming <ming.m.lin@intel.com>
      Cc: Yanmin <yanmin_zhang@linux.intel.com>
      Cc: Deng-Cheng Zhu <dengcheng.zhu@gmail.com>
      Cc: David Miller <davem@davemloft.net>
      Cc: Michael Cree <mcree@orcon.net.nz>
      LKML-Reference: <new-submission>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      b0a873eb
    • P
      perf: Deconstify struct pmu · 51b0fe39
      Peter Zijlstra 提交于
      sed -ie 's/const struct pmu\>/struct pmu/g' `git grep -l "const struct pmu\>"`
      Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: paulus <paulus@samba.org>
      Cc: stephane eranian <eranian@googlemail.com>
      Cc: Robert Richter <robert.richter@amd.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Paul Mundt <lethal@linux-sh.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Cyrill Gorcunov <gorcunov@gmail.com>
      Cc: Lin Ming <ming.m.lin@intel.com>
      Cc: Yanmin <yanmin_zhang@linux.intel.com>
      Cc: Deng-Cheng Zhu <dengcheng.zhu@gmail.com>
      Cc: David Miller <davem@davemloft.net>
      Cc: Michael Cree <mcree@orcon.net.nz>
      LKML-Reference: <new-submission>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      51b0fe39
  2. 19 8月, 2010 4 次提交
    • F
      perf: Humanize the number of contexts · 7ae07ea3
      Frederic Weisbecker 提交于
      Instead of hardcoding the number of contexts for the recursions
      barriers, define a cpp constant to make the code more
      self-explanatory.
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Stephane Eranian <eranian@google.com>
      7ae07ea3
    • F
      perf: Fix race in callchains · 927c7a9e
      Frederic Weisbecker 提交于
      Now that software events don't have interrupt disabled anymore in
      the event path, callchains can nest on any context. So seperating
      nmi and others contexts in two buffers has become racy.
      
      Fix this by providing one buffer per nesting level. Given the size
      of the callchain entries (2040 bytes * 4), we now need to allocate
      them dynamically.
      
      v2: Fixed put_callchain_entry call after recursion.
          Fix the type of the recursion, it must be an array.
      
      v3: Use a manual pr cpu allocation (temporary solution until NMIs
          can safely access vmalloc'ed memory).
          Do a better separation between callchain reference tracking and
          allocation. Make the "put" path lockless for non-release cases.
      
      v4: Protect the callchain buffers with rcu.
      
      v5: Do the cpu buffers allocations node affine.
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Tested-by: NWill Deacon <will.deacon@arm.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Paul Mundt <lethal@linux-sh.org>
      Cc: David Miller <davem@davemloft.net>
      Cc: Borislav Petkov <bp@amd64.org>
      927c7a9e
    • F
      perf: Generalize some arch callchain code · 56962b44
      Frederic Weisbecker 提交于
      - Most archs use one callchain buffer per cpu, except x86 that needs
        to deal with NMIs. Provide a default perf_callchain_buffer()
        implementation that x86 overrides.
      
      - Centralize all the kernel/user regs handling and invoke new arch
        handlers from there: perf_callchain_user() / perf_callchain_kernel()
        That avoid all the user_mode(), current->mm checks and so...
      
      - Invert some parameters in perf_callchain_*() helpers: entry to the
        left, regs to the right, following the traditional (dst, src).
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Acked-by: NPaul Mackerras <paulus@samba.org>
      Tested-by: NWill Deacon <will.deacon@arm.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: David Miller <davem@davemloft.net>
      Cc: Paul Mundt <lethal@linux-sh.org>
      Cc: Borislav Petkov <bp@amd64.org>
      56962b44
    • F
      perf: Generalize callchain_store() · 70791ce9
      Frederic Weisbecker 提交于
      callchain_store() is the same on every archs, inline it in
      perf_event.h and rename it to perf_callchain_store() to avoid
      any collision.
      
      This removes repetitive code.
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Acked-by: NPaul Mackerras <paulus@samba.org>
      Tested-by: NWill Deacon <will.deacon@arm.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: David Miller <davem@davemloft.net>
      Cc: Paul Mundt <lethal@linux-sh.org>
      Cc: Borislav Petkov <bp@amd64.org>
      70791ce9
  3. 25 6月, 2010 2 次提交
    • N
      perf: Fix argument of perf_arch_fetch_caller_regs · 5cfaf214
      Nobuhiro Iwamatsu 提交于
      "struct regs" was set to argument of perf_arch_fetch_caller_regs
      off-case. It should be "struct pt_regs".
      
      This fixes various build errors in archs that have CONFIG_PERF_EVENTS=y
      but no overriden implementation of perf_arch_fetch_caller_regs.
      
      cc1: warnings being treated as errors
      In file included from include/linux/ftrace_event.h:8,
                       from include/trace/syscall.h:6,
                       from include/linux/syscalls.h:75,
                       from arch/sh/kernel/sys_sh32.c:9:
      include/linux/perf_event.h:937: error: 'struct regs' declared inside parameter list
      include/linux/perf_event.h:937: error: its scope is only this definition or declaration, which is probably not what you want
      include/linux/perf_event.h: In function 'perf_fetch_caller_regs':
      include/linux/perf_event.h:952: error: passing argument 1 of 'perf_arch_fetch_caller_regs' from incompatible pointer type
      Signed-off-by: NNobuhiro Iwamatsu <nobuhiro.iwamatsu.yj@renesas.com>
      Reported-by: NStephen Rothwell <sfr@canb.auug.org.au>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: David Miller <davem@davemloft.net>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      LKML-Reference: <AANLkTinKKFKEBQrZ3Hkj-XCaMwaTqulb-XnFzqEYiFRr@mail.gmail.com>
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      5cfaf214
    • F
      hw_breakpoints: Fix per task breakpoint tracking · 45a73372
      Frederic Weisbecker 提交于
      Freeing a perf event can happen in several ways. A task
      calls perf_event_exit_task() right before exiting. This helper
      will detach all the events from the task context and queue their
      removal through free_event() if they are child tasks. The task
      also loses its context reference there.
      
      Releasing the breakpoint slot from the constraint table is made
      from free_event() that calls release_bp_slot(). We count the number
      of breakpoints this task is running by looking at the task's
      perf_event_ctxp and iterating through its attached events.
      But at this time, the reference to this context has been cleaned up
      already.
      
      So looking at the event->ctx instead of task->perf_event_ctxp
      to count the remaining breakpoints should solve the problem.
      At least it would for child breakpoints, but not for parent ones.
      If the parent exits before the child, it will remove all its
      events from the context but free_event() will be called later,
      on fd release time. And checking the number of breakpoints the
      task has attached to its context at this time is unreliable as all
      events have been removed from the context.
      
      To solve this, we keep track of the list of per task breakpoints.
      On top of it, we maintain our array of numbers of breakpoints used
      by the tasks. We use the context address as a task id.
      
      So, instead of looking at the number of events attached to a context,
      we walk through our list of per task breakpoints and count the number
      of breakpoints that use the same ctx than the one to be reserved or
      released from the constraint table, and update the count on top of this
      result.
      
      In the meantime it solves a bad refcounting, it also solves a warning,
      reported by Paul.
      
      Badness at /home/paulus/kernel/perf/kernel/hw_breakpoint.c:114
      NIP: c0000000000cb470 LR: c0000000000cb46c CTR: c00000000032d9b8
      REGS: c000000118e7b570 TRAP: 0700   Not tainted  (2.6.35-rc3-perf-00008-g76b0f133
      )
      MSR: 9000000000029032 <EE,ME,CE,IR,DR>  CR: 44004424  XER: 000fffff
      TASK = c0000001187dcad0[3143] 'perf' THREAD: c000000118e78000 CPU: 1
      GPR00: c0000000000cb46c c000000118e7b7f0 c0000000009866a0 0000000000000020
      GPR04: 0000000000000000 000000000000001d 0000000000000000 0000000000000001
      GPR08: c0000000009bed68 c00000000086dff8 c000000000a5bf10 0000000000000001
      GPR12: 0000000024004422 c00000000ffff200 0000000000000000 0000000000000000
      GPR16: 0000000000000000 0000000000000000 0000000000000018 00000000101150f4
      GPR20: 0000000010206b40 0000000000000000 0000000000000000 00000000101150f4
      GPR24: c0000001199090c0 0000000000000001 0000000000000000 0000000000000001
      GPR28: 0000000000000000 0000000000000000 c0000000008ec290 0000000000000000
      NIP [c0000000000cb470] .task_bp_pinned+0x5c/0x12c
      LR [c0000000000cb46c] .task_bp_pinned+0x58/0x12c
      Call Trace:
      [c000000118e7b7f0] [c0000000000cb46c] .task_bp_pinned+0x58/0x12c (unreliable)
      [c000000118e7b8a0] [c0000000000cb584] .toggle_bp_task_slot+0x44/0xe4
      [c000000118e7b940] [c0000000000cb6c8] .toggle_bp_slot+0xa4/0x164
      [c000000118e7b9f0] [c0000000000cbafc] .release_bp_slot+0x44/0x6c
      [c000000118e7ba80] [c0000000000c4178] .bp_perf_event_destroy+0x10/0x24
      [c000000118e7bb00] [c0000000000c4aec] .free_event+0x180/0x1bc
      [c000000118e7bbc0] [c0000000000c54c4] .perf_event_release_kernel+0x14c/0x170
      Reported-by: NPaul Mackerras <paulus@samba.org>
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Prasad <prasad@linux.vnet.ibm.com>
      Cc: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Jason Wessel <jason.wessel@windriver.com>
      45a73372
  4. 09 6月, 2010 10 次提交
  5. 31 5月, 2010 2 次提交
  6. 21 5月, 2010 5 次提交
  7. 19 5月, 2010 4 次提交
  8. 11 5月, 2010 1 次提交
  9. 07 5月, 2010 3 次提交
    • L
      perf: Add group scheduling transactional APIs · 6bde9b6c
      Lin Ming 提交于
      Add group scheduling transactional APIs to struct pmu.
      These APIs will be implemented in arch code, based on Peter's idea as
      below.
      
      > the idea behind hw_perf_group_sched_in() is to not perform
      > schedulability tests on each event in the group, but to add the group
      > as a whole and then perform one test.
      >
      > Of course, when that test fails, you'll have to roll-back the whole
      > group again.
      >
      > So start_txn (or a better name) would simply toggle a flag in the pmu
      > implementation that will make pmu::enable() not perform the
      > schedulablilty test.
      >
      > Then commit_txn() will perform the schedulability test (so note the
      > method has to have a !void return value.
      >
      > This will allow us to use the regular
      > kernel/perf_event.c::group_sched_in() and all the rollback code.
      > Currently each hw_perf_group_sched_in() implementation duplicates all
      > the rolllback code (with various bugs).
      
      ->start_txn:
      Start group events scheduling transaction, set a flag to make
      pmu::enable() not perform the schedulability test, it will be performed
      at commit time.
      
      ->commit_txn:
      Commit group events scheduling transaction, perform the group
      schedulability as a whole
      
      ->cancel_txn:
      Stop group events scheduling transaction, clear the flag so
      pmu::enable() will perform the schedulability test.
      Reviewed-by: NStephane Eranian <eranian@google.com>
      Reviewed-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Signed-off-by: NLin Ming <ming.m.lin@intel.com>
      Cc: David Miller <davem@davemloft.net>
      Cc: Paul Mackerras <paulus@samba.org>
      Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      LKML-Reference: <1272002160.5707.60.camel@minggr.sh.intel.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      6bde9b6c
    • P
      perf, x86: Improve the PEBS ABI · ab608344
      Peter Zijlstra 提交于
      Rename perf_event_attr::precise to perf_event_attr::precise_ip and
      widen it to 2 bits. This new field describes the required precision of
      the PERF_SAMPLE_IP field:
      
        0 - SAMPLE_IP can have arbitrary skid
        1 - SAMPLE_IP must have constant skid
        2 - SAMPLE_IP requested to have 0 skid
        3 - SAMPLE_IP must have 0 skid
      
      And modify the Intel PEBS code accordingly. The PEBS implementation
      now supports up to precise_ip == 2, where we perform the IP fixup.
      
      Also s/PERF_RECORD_MISC_EXACT/&_IP/ to clarify its meaning, this bit
      should be set for each PERF_SAMPLE_IP field known to match the actual
      instruction triggering the event.
      
      This new scheme allows for a PEBS mode that uses the buffer for more
      than a single event.
      Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Stephane Eranian <eranian@google.com>
      LKML-Reference: <new-submission>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      ab608344
    • P
      perf: Fix exit() vs PERF_FORMAT_GROUP · 4fd38e45
      Peter Zijlstra 提交于
      Both Stephane and Corey reported that PERF_FORMAT_GROUP didn't work
      as expected if the task the counters were attached to quit before
      the read() call.
      
      The cause is that we unconditionally destroy the grouping when we
      remove counters from their context. Fix this by only doing this when
      we free the counter itself.
      Reported-by: NCorey Ashford <cjashfor@linux.vnet.ibm.com>
      Reported-by: NStephane Eranian <eranian@google.com>
      Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      LKML-Reference: <1273160566.5605.404.camel@twins>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      4fd38e45
  10. 20 4月, 2010 1 次提交
    • Z
      perf & kvm: Clean up some of the guest profiling callback API details · dcf46b94
      Zhang, Yanmin 提交于
      Fix some build bug and programming style issues:
      
       - use valid C
       - fix up various style details
      Signed-off-by: NZhang Yanmin <yanmin_zhang@linux.intel.com>
      Cc: Avi Kivity <avi@redhat.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Sheng Yang <sheng@linux.intel.com>
      Cc: Marcelo Tosatti <mtosatti@redhat.com>
      Cc: oerg Roedel <joro@8bytes.org>
      Cc: Jes Sorensen <Jes.Sorensen@redhat.com>
      Cc: Gleb Natapov <gleb@redhat.com>
      Cc: Zachary Amsden <zamsden@redhat.com>
      Cc: zhiteng.huang@intel.com
      Cc: tim.c.chen@intel.com
      Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
      LKML-Reference: <1271729638.2078.624.camel@ymzhang.sh.intel.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      dcf46b94