1. 19 8月, 2010 5 次提交
    • F
      perf: Fix race in callchains · 927c7a9e
      Frederic Weisbecker 提交于
      Now that software events don't have interrupt disabled anymore in
      the event path, callchains can nest on any context. So seperating
      nmi and others contexts in two buffers has become racy.
      
      Fix this by providing one buffer per nesting level. Given the size
      of the callchain entries (2040 bytes * 4), we now need to allocate
      them dynamically.
      
      v2: Fixed put_callchain_entry call after recursion.
          Fix the type of the recursion, it must be an array.
      
      v3: Use a manual pr cpu allocation (temporary solution until NMIs
          can safely access vmalloc'ed memory).
          Do a better separation between callchain reference tracking and
          allocation. Make the "put" path lockless for non-release cases.
      
      v4: Protect the callchain buffers with rcu.
      
      v5: Do the cpu buffers allocations node affine.
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Tested-by: NWill Deacon <will.deacon@arm.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Paul Mundt <lethal@linux-sh.org>
      Cc: David Miller <davem@davemloft.net>
      Cc: Borislav Petkov <bp@amd64.org>
      927c7a9e
    • F
      perf: Factorize callchain context handling · f72c1a93
      Frederic Weisbecker 提交于
      Store the kernel and user contexts from the generic layer instead
      of archs, this gathers some repetitive code.
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Acked-by: NPaul Mackerras <paulus@samba.org>
      Tested-by: NWill Deacon <will.deacon@arm.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: David Miller <davem@davemloft.net>
      Cc: Paul Mundt <lethal@linux-sh.org>
      Cc: Borislav Petkov <bp@amd64.org>
      f72c1a93
    • F
      perf: Generalize some arch callchain code · 56962b44
      Frederic Weisbecker 提交于
      - Most archs use one callchain buffer per cpu, except x86 that needs
        to deal with NMIs. Provide a default perf_callchain_buffer()
        implementation that x86 overrides.
      
      - Centralize all the kernel/user regs handling and invoke new arch
        handlers from there: perf_callchain_user() / perf_callchain_kernel()
        That avoid all the user_mode(), current->mm checks and so...
      
      - Invert some parameters in perf_callchain_*() helpers: entry to the
        left, regs to the right, following the traditional (dst, src).
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Acked-by: NPaul Mackerras <paulus@samba.org>
      Tested-by: NWill Deacon <will.deacon@arm.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: David Miller <davem@davemloft.net>
      Cc: Paul Mundt <lethal@linux-sh.org>
      Cc: Borislav Petkov <bp@amd64.org>
      56962b44
    • F
      perf: Generalize callchain_store() · 70791ce9
      Frederic Weisbecker 提交于
      callchain_store() is the same on every archs, inline it in
      perf_event.h and rename it to perf_callchain_store() to avoid
      any collision.
      
      This removes repetitive code.
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Acked-by: NPaul Mackerras <paulus@samba.org>
      Tested-by: NWill Deacon <will.deacon@arm.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: David Miller <davem@davemloft.net>
      Cc: Paul Mundt <lethal@linux-sh.org>
      Cc: Borislav Petkov <bp@amd64.org>
      70791ce9
    • F
      perf: Drop unappropriate tests on arch callchains · c1a65932
      Frederic Weisbecker 提交于
      Drop the TASK_RUNNING test on user tasks for callchains as
      this check doesn't seem to make any sense.
      
      Also remove the tests for !current that is not supposed to
      happen and current->pid as this should be handled at the
      generic level, with exclude_idle attribute.
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Tested-by: NWill Deacon <will.deacon@arm.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: David Miller <davem@davemloft.net>
      Cc: Paul Mundt <lethal@linux-sh.org>
      Cc: Borislav Petkov <bp@amd64.org>
      c1a65932
  2. 11 8月, 2010 13 次提交
  3. 07 8月, 2010 4 次提交
  4. 06 8月, 2010 11 次提交
  5. 05 8月, 2010 3 次提交
    • J
      oprofile: add support for Intel processor model 30 · a7c55cbe
      Josh Hunt 提交于
      Newer Intel processors identifying themselves as model 30 are not recognized by
      oprofile.
      
      <cpuinfo snippet>
      model           : 30
      model name      : Intel(R) Xeon(R) CPU           X3470  @ 2.93GHz
      </cpuinfo snippet>
      
      Running oprofile on these machines gives the following:
      + opcontrol --init
      + opcontrol --list-events
      oprofile: available events for CPU type "Intel Architectural Perfmon"
      
      See Intel 64 and IA-32 Architectures Software Developer's Manual
      Volume 3B (Document 253669) Chapter 18 for architectural perfmon events
      This is a limited set of fallback events because oprofile doesn't know your CPU
      CPU_CLK_UNHALTED: (counter: all)
              Clock cycles when not halted (min count: 6000)
      INST_RETIRED: (counter: all)
              number of instructions retired (min count: 6000)
      LLC_MISSES: (counter: all)
              Last level cache demand requests from this core that missed the LLC
      (min count: 6000)
              Unit masks (default 0x41)
              ----------
              0x41: No unit mask
      LLC_REFS: (counter: all)
              Last level cache demand requests from this core (min count: 6000)
              Unit masks (default 0x4f)
              ----------
              0x4f: No unit mask
      BR_MISS_PRED_RETIRED: (counter: all)
              number of mispredicted branches retired (precise) (min count: 500)
      + opcontrol --shutdown
      
      Tested using oprofile 0.9.6.
      Signed-off-by: NJosh Hunt <johunt@akamai.com>
      Reviewed-by: NAndi Kleen <ak@linux.intel.com>
      Signed-off-by: NRobert Richter <robert.richter@amd.com>
      a7c55cbe
    • I
      Merge branch 'perf/core' of... · fc9ea5a1
      Ingo Molnar 提交于
      Merge branch 'perf/core' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux-2.6 into perf/core
      fc9ea5a1
    • I
      Merge branch 'perf/nmi' into perf/core · 61be7fde
      Ingo Molnar 提交于
      Conflicts:
      	kernel/Makefile
      
      Merge reason: Add the now complete topic, fix the conflict.
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      61be7fde
  6. 04 8月, 2010 4 次提交