1. 12 6月, 2014 7 次提交
    • J
      perf tools: Cache dso data file descriptor · c6580451
      Jiri Olsa 提交于
      Caching dso data file descriptors to avoid expensive re-opens
      especially during DWARF unwind.
      
      We keep dsos data file descriptors open until their count reaches
      the half of the current fd open limit (RLIMIT_NOFILE). In this case
      we close file descriptor of the first opened dso object.
      
      We've got overall speedup (~27% for my workload) of report:
       'perf report --stdio -i perf-test.data' (3 runs)
        (perf-test.data size was around 12GB)
      
        current code:
         545,640,944,228      cycles                     ( +-  0.53% )
         785,255,798,320      instructions               ( +-  0.03% )
      
           366.340910010 seconds time elapsed            ( +-  3.65% )
      
        after change:
         435,895,036,114      cycles                     ( +-  0.26% )
         636,790,271,176      instructions               ( +-  0.04% )
      
           266.481463387 seconds time elapsed            ( +-  0.13% )
      Acked-by: NNamhyung Kim <namhyung@kernel.org>
      Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
      Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Jean Pihet <jean.pihet@linaro.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Link: http://lkml.kernel.org/r/1401892622-30848-7-git-send-email-jolsa@kernel.orgSigned-off-by: NJiri Olsa <jolsa@kernel.org>
      c6580451
    • J
      perf tools: Add global count of opened dso objects · bda6ee4a
      Jiri Olsa 提交于
      Adding global count of opened dso objects so we could
      properly limit the number of opened dso data file
      descriptors.
      Acked-by: NNamhyung Kim <namhyung@kernel.org>
      Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
      Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Jean Pihet <jean.pihet@linaro.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Link: http://lkml.kernel.org/r/1401892622-30848-6-git-send-email-jolsa@kernel.orgSigned-off-by: NJiri Olsa <jolsa@kernel.org>
      bda6ee4a
    • J
      perf tools: Add global list of opened dso objects · eba5102d
      Jiri Olsa 提交于
      Adding global list of opened dso objects, so we can
      track them and use the list for caching dso data file
      descriptors.
      Acked-by: NNamhyung Kim <namhyung@kernel.org>
      Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
      Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Jean Pihet <jean.pihet@linaro.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Link: http://lkml.kernel.org/r/1401892622-30848-5-git-send-email-jolsa@kernel.orgSigned-off-by: NJiri Olsa <jolsa@kernel.org>
      eba5102d
    • J
      perf tools: Add data_fd into dso object · 53fa8eaa
      Jiri Olsa 提交于
      Adding data_fd into dso object so we could handle caching
      of opened dso file data descriptors coming int next patches.
      
      Adding dso__data_close interface to keep the data_fd updated
      when the descriptor is closed.
      Acked-by: NNamhyung Kim <namhyung@kernel.org>
      Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
      Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Jean Pihet <jean.pihet@linaro.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Link: http://lkml.kernel.org/r/1401892622-30848-4-git-send-email-jolsa@kernel.orgSigned-off-by: NJiri Olsa <jolsa@kernel.org>
      53fa8eaa
    • J
      perf tools: Separate dso data related variables · ca40e2af
      Jiri Olsa 提交于
      Add separated structure/namespace for data related
      variables. We are going to add mode of them, so this
      way they will be clearly separated.
      Acked-by: NNamhyung Kim <namhyung@kernel.org>
      Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
      Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Jean Pihet <jean.pihet@linaro.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Link: http://lkml.kernel.org/r/1401892622-30848-3-git-send-email-jolsa@kernel.orgSigned-off-by: NJiri Olsa <jolsa@kernel.org>
      ca40e2af
    • J
      perf tools: Cache register accesses for unwind processing · 0c4e774f
      Jiri Olsa 提交于
      Caching registers value into an array. Got about 4% speed up
      of perf_reg_value function for report command processing
      dwarf unwind stacks.
      
      Output from report over 1.5 GB data with DWARF unwind stacks:
      (TODO fix perf diff)
      
        current code:
         5.84%     perf  perf                       [.] perf_reg_value
        change:
         1.94%     perf  perf                       [.] perf_reg_value
      
      And little bit of overall speed up:
      (perf stat -r 5 -e '{cycles,instructions}:u' ...)
      
        current code:
         310,298,611,754      cycles                     ( +-  0.33% )
         439,669,689,341      instructions               ( +-  0.03% )
      
           188.656753166 seconds time elapsed            ( +-  0.82% )
      
        change:
         291,315,329,878      cycles                     ( +-  0.22% )
         391,763,485,304      instructions               ( +-  0.03%  )
      
           180.742249687 seconds time elapsed            ( +-  0.64% )
      Acked-by: NNamhyung Kim <namhyung@kernel.org>
      Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
      Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Jean Pihet <jean.pihet@linaro.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Link: http://lkml.kernel.org/r/1401892622-30848-2-git-send-email-jolsa@kernel.orgSigned-off-by: NJiri Olsa <jolsa@kernel.org>
      0c4e774f
    • N
      perf record: Fix to honor user freq/interval properly · 17314e23
      Namhyung Kim 提交于
      When configuring event perf checked a wrong condition that user
      specified both of freq (-F) and period (-c) or the event has no
      default value.  This worked because most of events don't have default
      value and only tracepoint events have default of 1 (and it's not
      desirable to change it for those events).
      
      However, Andi's downloadable event patch changes the situation so it
      cannot change the value for those events.  Fix it by allowing override
      the default value if user gives one of the options.
      
        $ perf record -a -e uops_retired.all -F 4000 sleep 1
        [ perf record: Woken up 1 times to write data ]
        [ perf record: Captured and wrote 0.325 MB perf.data (~14185 samples) ]
      
        $ perf evlist -F
        cpu/uops_retired.all/: sample_freq=4000
      Signed-off-by: NNamhyung Kim <namhyung@kernel.org>
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Link: http://lkml.kernel.org/r/1402292617-26278-1-git-send-email-namhyung@kernel.orgSigned-off-by: NJiri Olsa <jolsa@kernel.org>
      17314e23
  2. 10 6月, 2014 2 次提交
    • M
      perf probe: Improve error messages in --line option · 5ee05b88
      Masami Hiramatsu 提交于
      Improve error messages of 'perf probe --line' mode.
      
      Currently 'perf probe' shows the "Debuginfo analysis failed" message with
      an error code when the given symbol is not found:
      
        -----
        # perf probe -L page_cgroup_init_flatmem
        Debuginfo analysis failed. (-2)
          Error: Failed to show lines.
        -----
      
      But -2 (-ENOENT) means that the given source line or function was not
      found. With this patch, 'perf probe' shows the correct error message:
      
        -----
        # perf probe -L page_cgroup_init_flatmem
        Specified source line is not found.
          Error: Failed to show lines.
        -----
      
      There is also another debug error code is shown in the same function
      after get_real_path(). This removes that too.
      Signed-off-by: NMasami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Link: http://lkml.kernel.org/r/20140606071406.6788.47850.stgit@kbuild-fedora.novalocalSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      5ee05b88
    • M
      perf probe: Improve an error message of perf probe --vars mode · 69e96eaa
      Masami Hiramatsu 提交于
      Fix an error message when failed to find given address in --vars
      mode.
      
      Without this fix, perf probe -V doesn't show the final "Error"
      message if it fails to find given source line. Moreover, it
      tells it fails to find "variables" instead of the source line.
        -----
        # perf probe -V foo@bar
        Failed to find variables at foo@bar (0)
        -----
      The result also shows mysterious error code. Actually the error
      returns 0 or -ENOENT means that it just fails to find the address
      of given source line. (0 means there is no matching address,
      and -ENOENT means there is an entry(DIE) but it has no instance,
      e.g. an empty inlined function)
      
      This fixes it to show what happened and the final error message
      as below.
        -----
        # perf probe -V foo@bar
        Failed to find the address of foo@bar
          Error: Failed to show vars.
        -----
      Signed-off-by: NMasami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Link: http://lkml.kernel.org/r/20140606071359.6788.84716.stgit@kbuild-fedora.novalocalSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      69e96eaa
  3. 09 6月, 2014 7 次提交
  4. 08 6月, 2014 1 次提交
  5. 04 6月, 2014 4 次提交
    • M
      perf probe: Fix perf probe to find correct variable DIE · 082f96a9
      Masami Hiramatsu 提交于
      Fix perf probe to find correct variable DIE which has location or
      external instance by tracking down the lexical blocks.
      
      Current die_find_variable() expects that the all variable DIEs
      which has DW_TAG_variable have a location. However, since recent
      dwarf information may have declaration variable DIEs at the
      entry of function (subprogram), die_find_variable() returns it.
      
      To solve this problem, it must track down the DIE tree to find
      a DIE which has an actual location or a reference for external
      instance.
      
      e.g. finding a DIE which origin is <0xdc73>;
      
       <1><11496>: Abbrev Number: 95 (DW_TAG_subprogram)
          <11497>   DW_AT_abstract_origin: <0xdc42>
          <1149b>   DW_AT_low_pc      : 0x1850
      [...]
       <2><114cc>: Abbrev Number: 119 (DW_TAG_variable) <- this is a declaration
          <114cd>   DW_AT_abstract_origin: <0xdc73>
       <2><114d1>: Abbrev Number: 119 (DW_TAG_variable)
      [...]
       <3><115a7>: Abbrev Number: 105 (DW_TAG_lexical_block)
          <115a8>   DW_AT_ranges      : 0xaa0
       <4><115ac>: Abbrev Number: 96 (DW_TAG_variable) <- this has a location
          <115ad>   DW_AT_abstract_origin: <0xdc73>
          <115b1>   DW_AT_location    : 0x486c        (location list)
      Signed-off-by: NMasami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Tested-by: NArnaldo Carvalho de Melo <acme@kernel.org>
      Acked-by: NArnaldo Carvalho de Melo <acme@kernel.org>
      Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: http://lkml.kernel.org/r/20140529121930.30879.87092.stgit@ltc230.yrl.intra.hitachi.co.jpSigned-off-by: NJiri Olsa <jolsa@kernel.org>
      082f96a9
    • M
      perf probe: Fix a segfault if asked for variable it doesn't find · 0c188a07
      Masami Hiramatsu 提交于
      Fix a segfault bug by asking for variable it doesn't find.
      Since the convert_variable() didn't handle error code returned
      from convert_variable_location(), it just passed an incomplete
      variable field and then a segfault was occurred when formatting
      the field.
      
      This fixes that bug by handling success code correctly in
      convert_variable(). Other callers of convert_variable_location()
      are correctly checking the return code.
      
      This bug was introduced by following commit. But another hidden
      erroneous error handling has been there previously (-ENOMEM case).
      
       commit 3d918a12Signed-off-by: NMasami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Reported-by: NArnaldo Carvalho de Melo <acme@kernel.org>
      Tested-by: NArnaldo Carvalho de Melo <acme@kernel.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: http://lkml.kernel.org/r/20140529105232.28251.30447.stgit@ltc230.yrl.intra.hitachi.co.jpSigned-off-by: NJiri Olsa <jolsa@kernel.org>
      0c188a07
    • J
      perf tools: Move elide bool into perf_hpp_fmt struct · f2998422
      Jiri Olsa 提交于
      After output/sort fields refactoring, it's expensive
      to check the elide bool in its current location inside
      the 'struct sort_entry'.
      
      The perf_hpp__should_skip function gets highly noticable in
      workloads with high number of output/sort fields, like for:
      
        $ perf report -i perf-test.data -F overhead,sample,period,comm,pid,dso,symbol,cpu --stdio
      
      Performance report:
         9.70%  perf  [.] perf_hpp__should_skip
      
      Moving the elide bool into the 'struct perf_hpp_fmt', which
      makes the perf_hpp__should_skip just single struct read.
      
      Got speedup of around 22% for my test perf.data workload.
      The change should not harm any other workload types.
      
      Performance counter stats for (10 runs):
        before:
         358,319,732,626      cycles                    ( +-  0.55% )
         467,129,581,515      instructions              #    1.30  insns per cycle          ( +-  0.00% )
      
           150.943975206 seconds time elapsed           ( +-  0.62% )
      
        now:
         278,785,972,990      cycles                    ( +-  0.12% )
         370,146,797,640      instructions              #    1.33  insns per cycle          ( +-  0.00% )
      
           116.416670507 seconds time elapsed           ( +-  0.31% )
      Acked-by: NNamhyung Kim <namhyung@kernel.org>
      Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
      Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/r/20140601142622.GA9131@krava.brq.redhat.comSigned-off-by: NJiri Olsa <jolsa@kernel.org>
      f2998422
    • J
      perf tools: Remove elide setup for SORT_MODE__MEMORY mode · 2ec85c62
      Jiri Olsa 提交于
      There's no need to setup elide of sort_dso sort entry again
      with symbol_conf.dso_list list.
      
      The only difference were list names of memory mode data,
      which does not make much sense to me.
      Acked-by: NNamhyung Kim <namhyung@kernel.org>
      Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
      Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/r/1400858147-7155-2-git-send-email-jolsa@kernel.orgSigned-off-by: NJiri Olsa <jolsa@kernel.org>
      2ec85c62
  6. 01 6月, 2014 15 次提交
  7. 21 5月, 2014 4 次提交