1. 09 Feb, 2013 7 commits
    • uprobes: Introduce uprobe->register_rwsem · e591c8d7
      Committed by Oleg Nesterov
      Introduce uprobe->register_rwsem. It is taken for writing around
      __uprobe_register/unregister.
      
      Change handler_chain() to use this sem rather than consumer_rwsem.
      
      The main reason for this change is that we have the nasty problem
      with mmap_sem/consumer_rwsem dependency. filter_chain() needs to
      protect uprobe->consumers like handler_chain(), but they can not
      use the same lock. filter_chain() can be called under ->mmap_sem
      (currently this is always true), but we want to allow ->handler()
      to play with the probed task's memory, and this needs ->mmap_sem.
      
      Alternatively we could use srcu, but synchronize_srcu() is very
      slow and ->register_rwsem allows us to do more. In particular, we
      can teach handler_chain() to do remove_breakpoint() if this bp is
      "nacked" by all consumers, we know that we can't race with the
      new consumer which does uprobe_register().
      
      See also the next patches. uprobes_mutex[] is almost ready to die.
      Signed-off-by: Oleg Nesterov <oleg@redhat.com>
      Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
    • uprobes: _register() should always do register_for_each_vma(true) · 9a98e03c
      Committed by Oleg Nesterov
      To support filtering, uprobe_register() should call
      register_for_each_vma(true) every time a new consumer arrives,
      because we need to install the previously nacked breakpoints.
      
      Note:
      	- uprobes_mutex[] should die, what it actually protects is
      	  alloc_uprobe().
      
      	- UPROBE_RUN_HANDLER should die too, obviously it can't work
      	  unless uprobe has a single consumer. The consumer should
      	  serialize with _register/_unregister itself. Or this flag
      	  should live in uprobe_consumer->state.
      
      	- Perhaps we can do some optimizations later. For example, if
      	  filter_chain() never returns false uprobe can record this
      	  fact and avoid the unnecessary register_for_each_vma().
      Signed-off-by: Oleg Nesterov <oleg@redhat.com>
      Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
    • uprobes: _unregister() should always do register_for_each_vma(false) · 04aab9b2
      Committed by Oleg Nesterov
      uprobe_unregister() removes the breakpoints only if the last consumer
      goes away. To support filtering it should do this every time: we
      want to remove the breakpoints which nobody else wants to keep.
      
      Note: given that filter_chain() is not actually implemented, this patch
      itself doesn't change the behaviour yet, register_for_each_vma(false)
      is a heavy "nop" unless there are no more consumers.
      Signed-off-by: Oleg Nesterov <oleg@redhat.com>
      Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
    • uprobes: Introduce filter_chain() · 63633cbf
      Committed by Oleg Nesterov
      Add the new helper filter_chain(). Currently it is only a placeholder;
      the comment explains what it should do. We will change it later to
      consult every consumer to decide whether we need to install the swbp.
      Until then it works as if any consumer returns true, which matches the
      current behavior.
      
      Change install_breakpoint() to call filter_chain() instead of checking
      uprobe->consumers != NULL. We obviously need this, and this equally
      closes the race with _unregister().
      
      Change remove_breakpoint() to call this helper too. Currently this is
      pointless because remove_breakpoint() is only called when the last
      consumer goes away, but we will change this.
      Signed-off-by: Oleg Nesterov <oleg@redhat.com>
      Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
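
The difference between the placeholder and the planned behaviour of filter_chain() can be sketched as a small user-space model. This is only an illustration, not kernel code: the Consumer and Uprobe classes and the wants_breakpoint flag are hypothetical stand-ins, since the commit says only that every consumer will be consulted later.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Consumer:
    """Hypothetical stand-in for struct uprobe_consumer."""
    name: str
    # Assumed per-consumer answer; the real callback is added by later patches.
    wants_breakpoint: bool = True


@dataclass
class Uprobe:
    """Hypothetical stand-in for struct uprobe; only the consumer list is modeled."""
    consumers: List[Consumer] = field(default_factory=list)


def filter_chain_placeholder(uprobe: Uprobe) -> bool:
    # Placeholder: acts as if every consumer answered "yes",
    # so it is true exactly when at least one consumer exists.
    return len(uprobe.consumers) > 0


def filter_chain_planned(uprobe: Uprobe) -> bool:
    # Planned behaviour: keep the breakpoint if any consumer wants it.
    return any(c.wants_breakpoint for c in uprobe.consumers)
```

With no consumers both variants return False. They differ only once a consumer declines the breakpoint, which is the case that the later register/unregister changes rely on.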
    • uprobes: Kill uprobe_consumer->filter() · fe20d71f
      Committed by Oleg Nesterov
      uprobe_consumer->filter() is pointless in its current form, kill it.
      
      We will add it back, but with the different signature/semantics. Perhaps
      we will even re-introduce the callsite in handler_chain(), but not to
      just skip uc->handler().
      Signed-off-by: Oleg Nesterov <oleg@redhat.com>
      Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
    • uprobes: Kill the pointless inode/uc checks in register/unregister · f0744af7
      Committed by Oleg Nesterov
      register/unregister verifies that inode/uc != NULL. For what?
      This really looks like "hide the potential problem", the caller
      should pass the valid data.
      
      register() also checks uc->next == NULL, probably to prevent the
      double-register but the caller can do other stupid/wrong things.
      If we do this check, then we should document that uc->next should
      be cleared before register() and add BUG_ON().
      
      Also add the small comment about the i_size_read() check.
      Signed-off-by: Oleg Nesterov <oleg@redhat.com>
      Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
    • uprobes: Move __set_bit(UPROBE_SKIP_SSTEP) into alloc_uprobe() · bbc33d05
      Committed by Oleg Nesterov
      Cosmetic. __set_bit(UPROBE_SKIP_SSTEP) is part of initialization;
      it is not clear why it is set in insert_uprobe().
      Signed-off-by: Oleg Nesterov <oleg@redhat.com>
      Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
  2. 07 Feb, 2013 23 commits
  3. 06 Feb, 2013 1 commit
  4. 03 Feb, 2013 1 commit
  5. 02 Feb, 2013 1 commit
    • tracing: Init current_trace to nop_trace and remove NULL checks · d840f718
      Committed by Steven Rostedt (Red Hat)
      On early boot up, when the ftrace ring buffer is initialized, the
      static variable current_trace is initialized to &nop_trace.
      Before this initialization, current_trace is NULL; afterwards it never
      becomes NULL again, because it is always reassigned to an ftrace tracer.
      
      Several places check if current_trace is NULL before using
      it, and this check is frivolous, because at the point in time
      when the checks are made the only way current_trace could be
      NULL is if ftrace failed its allocations at boot up, and the
      paths to these locations would probably not be possible.
      
      By initializing current_trace to &nop_trace where it is declared,
      current_trace will never be NULL, and we can remove all these
      checks of current_trace being NULL which never needed to be
      checked in the first place.
      
      Cc: Dan Carpenter <dan.carpenter@oracle.com>
      Cc: Hiraku Toyooka <hiraku.toyooka.gu@hitachi.com>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
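
The idea of initializing current_trace at its definition so that callers need no NULL checks can be sketched outside the kernel as follows. This is a toy model, not the ftrace code: the Tracer class, its reset() hook, and the helper functions are hypothetical names chosen for illustration.

```python
class Tracer:
    """Hypothetical stand-in for a tracer exposing one hook."""

    def __init__(self, name: str) -> None:
        self.name = name

    def reset(self) -> str:
        return f"{self.name}: reset"


class NopTracer(Tracer):
    """Does nothing, but is always safe to call."""

    def __init__(self) -> None:
        super().__init__("nop")

    def reset(self) -> str:
        return "nop: nothing to do"


nop_trace = NopTracer()
# Initialized where it is declared, so it is never None.
current_trace = nop_trace


def set_tracer(tracer: Tracer) -> None:
    # Always reassigned to a real tracer object, never to None.
    global current_trace
    current_trace = tracer


def reset_current() -> str:
    # No "is None" guard is needed before using current_trace.
    return current_trace.reset()
```

Because a do-nothing tracer is installed from the start, every call site can use current_trace unconditionally.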
  6. 01 Feb, 2013 7 commits
    • Merge tag 'perf-core-for-mingo' of... · 9c4c5fd9
      Committed by Ingo Molnar
      Merge tag 'perf-core-for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core
      
      Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo:
      
      . Make some POWER7 events available in sysfs, equivalent to
        what was done on x86, from Sukadev Bhattiprolu.
      
      . Add event group view, from Namhyung Kim:
      
        To use it, 'perf record' should group events when recording. And then perf
        report parses the saved group relation from file header and prints them
        together if --group option is provided.  You can use 'perf evlist' command to
        see event group information:
      
          $ perf record -e '{ref-cycles,cycles}' noploop 1
          [ perf record: Woken up 2 times to write data ]
          [ perf record: Captured and wrote 0.385 MB perf.data (~16807 samples) ]
      
          $ perf evlist --group
          {ref-cycles,cycles}
      
        With this example, default perf report will show you each event
        separately like this:
      
          $ perf report
          ...
          # group: {ref-cycles,cycles}
          # ========
          # Samples: 3K of event 'ref-cycles'
          # Event count (approx.): 3153797218
          #
          # Overhead  Command      Shared Object                      Symbol
          # ........  .......  .................  ..........................
              99.84%  noploop  noploop            [.] main
               0.07%  noploop  ld-2.15.so         [.] strcmp
               0.03%  noploop  [kernel.kallsyms]  [k] timerqueue_del
               0.03%  noploop  [kernel.kallsyms]  [k] sched_clock_cpu
               0.02%  noploop  [kernel.kallsyms]  [k] account_user_time
               0.01%  noploop  [kernel.kallsyms]  [k] __alloc_pages_nodemask
               0.00%  noploop  [kernel.kallsyms]  [k] native_write_msr_safe
      
          # Samples: 3K of event 'cycles'
          # Event count (approx.): 3722310525
          #
          # Overhead  Command      Shared Object                     Symbol
          # ........  .......  .................  .........................
              99.76%  noploop  noploop            [.] main
               0.11%  noploop  [kernel.kallsyms]  [k] _raw_spin_lock
               0.06%  noploop  [kernel.kallsyms]  [k] find_get_page
               0.03%  noploop  [kernel.kallsyms]  [k] sched_clock_cpu
               0.02%  noploop  [kernel.kallsyms]  [k] rcu_check_callbacks
               0.02%  noploop  [kernel.kallsyms]  [k] __current_kernel_time
               0.00%  noploop  [kernel.kallsyms]  [k] native_write_msr_safe
      
        In this case the event group information is shown at the end of the
        header area.  You can use the --group option to enable event group view.
      
          $ perf report --group
          ...
          # group: {ref-cycles,cycles}
          # ========
          # Samples: 7K of event 'anon group { ref-cycles, cycles }'
          # Event count (approx.): 6876107743
          #
          #         Overhead  Command      Shared Object                      Symbol
          # ................  .......  .................  ..........................
              99.84%  99.76%  noploop  noploop            [.] main
               0.07%   0.00%  noploop  ld-2.15.so         [.] strcmp
               0.03%   0.00%  noploop  [kernel.kallsyms]  [k] timerqueue_del
               0.03%   0.03%  noploop  [kernel.kallsyms]  [k] sched_clock_cpu
               0.02%   0.00%  noploop  [kernel.kallsyms]  [k] account_user_time
               0.01%   0.00%  noploop  [kernel.kallsyms]  [k] __alloc_pages_nodemask
               0.00%   0.00%  noploop  [kernel.kallsyms]  [k] native_write_msr_safe
               0.00%   0.11%  noploop  [kernel.kallsyms]  [k] _raw_spin_lock
               0.00%   0.06%  noploop  [kernel.kallsyms]  [k] find_get_page
               0.00%   0.02%  noploop  [kernel.kallsyms]  [k] rcu_check_callbacks
               0.00%   0.02%  noploop  [kernel.kallsyms]  [k] __current_kernel_time
      
        As you can see, the Overhead column now contains values for both
        ref-cycles and cycles, and the header line also shows the group
        information - 'anon group { ref-cycles, cycles }'.  The output is
        sorted by the period of the group leader first.
      
        If the perf.data file doesn't contain group information, the --group
        option does nothing.  If you want to enable event group view by
        default, you can set it in the ~/.perfconfig file:
      
          $ cat ~/.perfconfig
          [report]
          group = true
      
        It can be overridden with command line if you want:
      
          $ perf report --no-group
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
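
How a group view combines two per-event tables can be sketched as below. This is a simplified model, not perf's implementation: it works on the rounded percentages printed above rather than on raw periods, and the tie-breaking by member value and then by symbol name is an assumption made only to keep the output deterministic.

```python
from typing import Dict, List, Tuple


def group_view(leader: Dict[str, float],
               member: Dict[str, float]) -> List[Tuple[str, float, float]]:
    """Merge per-symbol overheads of a group leader and one member event."""
    symbols = set(leader) | set(member)
    # A symbol missing from one event is shown with 0.00% for that event.
    rows = [(s, leader.get(s, 0.0), member.get(s, 0.0)) for s in symbols]
    # Sort by the leader's value first (descending); ties are broken by the
    # member's value and then by name (assumed, not taken from perf).
    rows.sort(key=lambda r: (-r[1], -r[2], r[0]))
    return rows


# A few rows taken from the ref-cycles and cycles reports above.
ref_cycles = {"main": 99.84, "strcmp": 0.07, "sched_clock_cpu": 0.03}
cycles = {"main": 99.76, "_raw_spin_lock": 0.11, "sched_clock_cpu": 0.03}

for symbol, lead, memb in group_view(ref_cycles, cycles):
    print(f"{lead:6.2f}%  {memb:6.2f}%  {symbol}")
```

Symbols seen only by the member event end up at the bottom, because their leader value is zero.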
    • perf: Document the ABI of perf sysfs entries · 2ac3634a
      Committed by Sukadev Bhattiprolu
      This patchset adds two new sets of files to sysfs for the POWER architecture.
      
      	- perf event config format in /sys/devices/cpu/format/event
      	- generic and POWER-specific perf events in /sys/devices/cpu/events/
      
      The format of the first file is already documented in:
      
      	sysfs-bus-event_source-devices-format
      
      Document the format of the second set of files '/sys/devices/cpu/events/*'
      which would also become part of the ABI.
      
      Changelog[v4]:
      	[Jiri Olsa]: Mention that multiple event= like terms can be specified
      	in the 'events' file.
      	[Jiri Olsa]: Remove the documentation for the 'config format' file
      	as it is already documented in 'Documentation/ABI/testing/'.
      	[Jiri Olsa]: Move ABI documentation from 'stable/' to 'testing/'
      
      Changelog[v3]:
      	[Greg KH] Include ABI documentation.
      Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
      Acked-by: Jiri Olsa <jolsa@redhat.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Anton Blanchard <anton@au1.ibm.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Robert Richter <robert.richter@amd.com>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: linuxppc-dev@ozlabs.org
      Link: http://lkml.kernel.org/r/20130123062645.GG13720@us.ibm.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf/POWER7: Make some POWER7 events available in sysfs · 886c3b2d
      Committed by Sukadev Bhattiprolu
      Make some POWER7-specific perf events available in sysfs.
      
      	$ /bin/ls -1 /sys/bus/event_source/devices/cpu/events/
      	branch-instructions
      	branch-misses
      	cache-misses
      	cache-references
      	cpu-cycles
      	instructions
      	PM_BRU_FIN
      	PM_BRU_MPRED
      	PM_CMPLU_STALL
      	PM_CYC
      	PM_GCT_NOSLOT_CYC
      	PM_INST_CMPL
      	PM_LD_MISS_L1
      	PM_LD_REF_L1
      	stalled-cycles-backend
      	stalled-cycles-frontend
      
      where the 'PM_*' events are POWER specific and the others are the
      generic events.
      
      This will enable users to specify these events with their symbolic
      names rather than with their raw code.
      
      	perf stat -e 'cpu/PM_CYC/' ...
      Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Anton Blanchard <anton@au1.ibm.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Robert Richter <robert.richter@amd.com>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: linuxppc-dev@ozlabs.org
      Link: http://lkml.kernel.org/r/20130123062528.GE13720@us.ibm.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf/POWER7: Make generic event translations available in sysfs · 1c53a270
      Committed by Sukadev Bhattiprolu
      Make the generic perf events in POWER7 available via sysfs.
      
      	$ ls /sys/bus/event_source/devices/cpu/events
      	branch-instructions
      	branch-misses
      	cache-misses
      	cache-references
      	cpu-cycles
      	instructions
      	stalled-cycles-backend
      	stalled-cycles-frontend
      
      	$ cat /sys/bus/event_source/devices/cpu/events/cache-misses
      	event=0x400f0
      
      This patch is based on commits that implement this functionality on x86.
      Eg:
      	commit a4747393
      	Author: Jiri Olsa <jolsa@redhat.com>
      	Date:   Wed Oct 10 14:53:11 2012 +0200
      
      	    perf/x86: Make hardware event translations available in sysfs
      
      Changelog[v2]:
      	[Jiri Olsa] Drop EVENT_ID() macro since it is only used once.
      Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Anton Blanchard <anton@au1.ibm.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Robert Richter <robert.richter@amd.com>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: linuxppc-dev@ozlabs.org
      Link: http://lkml.kernel.org/r/20130123062454.GD13720@us.ibm.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
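
The contents of an events file such as cache-misses above ('event=0x400f0') can be turned into numeric terms with a few lines of code. This is a minimal sketch that assumes every term has the form key=value and that terms are comma-separated; parse_event_terms is a hypothetical helper, not part of perf.

```python
from typing import Dict


def parse_event_terms(text: str) -> Dict[str, int]:
    """Parse 'key=value[,key=value...]' as found in a sysfs events file."""
    terms: Dict[str, int] = {}
    for term in text.strip().split(","):
        key, sep, value = term.partition("=")
        if not sep:
            # Terms without '=' are outside the scope of this sketch.
            raise ValueError(f"unsupported term: {term!r}")
        # Base 0 lets int() accept both decimal and 0x-prefixed hex values.
        terms[key.strip()] = int(value.strip(), 0)
    return terms


print(parse_event_terms("event=0x400f0\n"))
```

Reading the text itself, for example from /sys/bus/event_source/devices/cpu/events/cache-misses, is left to the caller so that the parser can be exercised without a POWER7 machine.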
    • perf: Make EVENT_ATTR global · 2663960c
      Committed by Sukadev Bhattiprolu
      Rename EVENT_ATTR() to PMU_EVENT_ATTR() and make it global so it is
      available to all architectures.
      
      Further to allow architectures flexibility, have PMU_EVENT_ATTR() pass
      in the variable name as a parameter.
      
      Changelog[v2]
      	- [Jiri Olsa] No need to define PMU_EVENT_PTR()
      Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
      Acked-by: Jiri Olsa <jolsa@redhat.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Anton Blanchard <anton@au1.ibm.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Robert Richter <robert.richter@amd.com>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: linuxppc-dev@ozlabs.org
      Link: http://lkml.kernel.org/r/20130123062422.GC13720@us.ibm.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf/Power7: Use macros to identify perf events · bbdc7aa4
      Committed by Sukadev Bhattiprolu
      Define and use macros to identify perf event codes. This makes the
      code easier to maintain and more readable when these event codes need
      to be used in more than one place.
      Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
      Acked-by: Jiri Olsa <jolsa@redhat.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Anton Blanchard <anton@au1.ibm.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Robert Richter <robert.richter@amd.com>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: linuxppc-dev@ozlabs.org
      Link: http://lkml.kernel.org/r/20130123062353.GB13720@us.ibm.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf evlist: Add --group option · e6ab07d0
      Committed by Namhyung Kim
      Add '-g/--group' option for showing event groups.  For simplicity it is
      currently not compatible with other options.
      
        $ perf evlist --group
        {ref-cycles,cycles}
      
        $ perf evlist
        ref-cycles
        cycles
      Suggested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/r/1358845787-1350-20-git-send-email-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>