1. 10 8月, 2012 2 次提交
    • J
      perf: Add perf_output_skip function to skip bytes in sample · 5685e0ff
      Jiri Olsa 提交于
      Introducing perf_output_skip function to be able to skip data within the
      perf ring buffer.
      
      When writing data into perf ring buffer we first reserve needed place in
      ring buffer and then copy the actual data.
      
      There's a possibility we won't be able to fill all the reserved size
      with data, so we need a way to skip the remaining bytes.
      
      This is going to be useful when storing the user stack dump, where we
      might end up with less data than we originally requested.
      Signed-off-by: NJiri Olsa <jolsa@redhat.com>
      Acked-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Cc: "Frank Ch. Eigler" <fche@redhat.com>
      Cc: Arun Sharma <asharma@fb.com>
      Cc: Benjamin Redelings <benjamin.redelings@nescent.org>
      Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
      Cc: Cyrill Gorcunov <gorcunov@openvz.org>
      Cc: Frank Ch. Eigler <fche@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Robert Richter <robert.richter@amd.com>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      Cc: Ulrich Drepper <drepper@gmail.com>
      Link: http://lkml.kernel.org/r/1344345647-11536-5-git-send-email-jolsa@redhat.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      5685e0ff
    • F
      perf: Factor __output_copy to be usable with specific copy function · 91d7753a
      Frederic Weisbecker 提交于
      Adding a generic way to use __output_copy function with specific copy
      function via DEFINE_PERF_OUTPUT_COPY macro.
      
      Using this to add new __output_copy_user function, that provides output
      copy from user pointers. For x86 the copy_from_user_nmi function is used
      and __copy_from_user_inatomic for the rest of the architectures.
      
      This new function will be used in user stack dump on sample, coming in
      next patches.
      Signed-off-by: NJiri Olsa <jolsa@redhat.com>
      Cc: "Frank Ch. Eigler" <fche@redhat.com>
      Cc: Arun Sharma <asharma@fb.com>
      Cc: Benjamin Redelings <benjamin.redelings@nescent.org>
      Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
      Cc: Cyrill Gorcunov <gorcunov@openvz.org>
      Cc: Frank Ch. Eigler <fche@redhat.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Robert Richter <robert.richter@amd.com>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      Cc: Ulrich Drepper <drepper@gmail.com>
      Link: http://lkml.kernel.org/r/1344345647-11536-4-git-send-email-jolsa@redhat.comSigned-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      91d7753a
  2. 02 1月, 2012 1 次提交
  3. 05 12月, 2011 1 次提交
    • P
      perf: Fix loss of notification with multi-event · 10c6db11
      Peter Zijlstra 提交于
      When you do:
              $ perf record -e cycles,cycles,cycles noploop 10
      
      You expect about 10,000 samples for each event, i.e., 10s at
      1000samples/sec. However, this is not what's happening. You
      get much fewer samples, maybe 3700 samples/event:
      
      $ perf report -D | tail -15
      Aggregated stats:
                 TOTAL events:      10998
                  MMAP events:         66
                  COMM events:          2
                SAMPLE events:      10930
      cycles stats:
                 TOTAL events:       3644
                SAMPLE events:       3644
      cycles stats:
                 TOTAL events:       3642
                SAMPLE events:       3642
      cycles stats:
                 TOTAL events:       3644
                SAMPLE events:       3644
      
      On a Intel Nehalem or even AMD64, there are 4 counters capable
      of measuring cycles, so there is plenty of space to measure those
      events without multiplexing (even with the NMI watchdog active).
      And even with multiplexing, we'd expect roughly the same number
      of samples per event.
      
      The root of the problem was that when the event that caused the buffer
      to become full was not the first event passed on the cmdline, the user
      notification would get lost. The notification was sent to the file
      descriptor of the overflowed event but the perf tool was not polling
      on it.  The perf tool aggregates all samples into a single buffer,
      i.e., the buffer of the first event. Consequently, it assumes
      notifications for any event will come via that descriptor.
      
      The seemingly straight forward solution of moving the waitq into the
      ringbuffer object doesn't work because of life-time issues. One could
      perf_event_set_output() on a fd that you're also blocking on and cause
      the old rb object to be freed while its waitq would still be
      referenced by the blocked thread -> FAIL.
      
      Therefore link all events to the ringbuffer and broadcast the wakeup
      from the ringbuffer object to all possible events that could be waited
      upon. This is rather ugly, and we're open to better solutions but it
      works for now.
      Reported-by: NStephane Eranian <eranian@google.com>
      Finished-by: NStephane Eranian <eranian@google.com>
      Reviewed-by: NStephane Eranian <eranian@google.com>
      Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      Link: http://lkml.kernel.org/r/20111126014731.GA7030@quadSigned-off-by: NIngo Molnar <mingo@elte.hu>
      10c6db11
  4. 01 7月, 2011 3 次提交
    • P
      perf: Remove the perf_output_begin(.sample) argument · a7ac67ea
      Peter Zijlstra 提交于
      Since only samples call perf_output_sample() its much saner (and more
      correct) to put the sample logic in there than in the
      perf_output_begin()/perf_output_end() pair.
      
      Saves a useless argument, reduces conditionals and shrinks
      struct perf_output_handle, win!
      Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      Link: http://lkml.kernel.org/n/tip-2crpvsx3cqu67q3zqjbnlpsc@git.kernel.orgSigned-off-by: NIngo Molnar <mingo@elte.hu>
      a7ac67ea
    • P
      perf: Remove the nmi parameter from the swevent and overflow interface · a8b0ca17
      Peter Zijlstra 提交于
      The nmi parameter indicated if we could do wakeups from the current
      context, if not, we would set some state and self-IPI and let the
      resulting interrupt do the wakeup.
      
      For the various event classes:
      
        - hardware: nmi=0; PMI is in fact an NMI or we run irq_work_run from
          the PMI-tail (ARM etc.)
        - tracepoint: nmi=0; since tracepoint could be from NMI context.
        - software: nmi=[0,1]; some, like the schedule thing cannot
          perform wakeups, and hence need 0.
      
      As one can see, there is very little nmi=1 usage, and the down-side of
      not using it is that on some platforms some software events can have a
      jiffy delay in wakeup (when arch_irq_work_raise isn't implemented).
      
      The up-side however is that we can remove the nmi parameter and save a
      bunch of conditionals in fast paths.
      Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Michael Cree <mcree@orcon.net.nz>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Deng-Cheng Zhu <dengcheng.zhu@gmail.com>
      Cc: Anton Blanchard <anton@samba.org>
      Cc: Eric B Munson <emunson@mgebm.net>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Paul Mundt <lethal@linux-sh.org>
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jason Wessel <jason.wessel@windriver.com>
      Cc: Don Zickus <dzickus@redhat.com>
      Link: http://lkml.kernel.org/n/tip-agjev8eu666tvknpb3iaj0fg@git.kernel.orgSigned-off-by: NIngo Molnar <mingo@elte.hu>
      a8b0ca17
    • V
      perf_events: Fix perf buffer watermark setting · 4ec8363d
      Vince Weaver 提交于
      Since 2.6.36 (specifically commit d57e34fd ("perf: Simplify the
      ring-buffer logic: make perf_buffer_alloc() do everything needed"),
      the perf_buffer_init_code() has been mis-setting the buffer watermark
      if perf_event_attr.wakeup_events has a non-zero value.
      
      This is because perf_event_attr.wakeup_events is a union with
      perf_event_attr.wakeup_watermark.
      
      This commit re-enables the check for perf_event_attr.watermark being
      set before continuing with setting a non-default watermark.
      
      This bug is most noticable when you are trying to use PERF_IOC_REFRESH
      with a value larger than one and perf_event_attr.wakeup_events is set to
      one.  In this case the buffer watermark will be set to 1 and you will
      get extraneous POLL_IN overflows rather than POLL_HUP as expected.
      
      [ avoid using attr.wakeup_events when attr.watermark is set ]
      Signed-off-by: NVince Weaver <vweaver1@eecs.utk.edu>
      Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: <stable@kernel.org>
      Link: http://lkml.kernel.org/r/alpine.DEB.2.00.1106011506390.5384@cl320.eecs.utk.eduSigned-off-by: NIngo Molnar <mingo@elte.hu>
      4ec8363d
  5. 09 6月, 2011 1 次提交