1. 05 Jun, 2018 1 commit
  2. 04 Jun, 2018 1 commit
  3. 03 Apr, 2018 1 commit
  4. 08 Mar, 2018 1 commit
  5. 17 Feb, 2018 1 commit
    • perf report: Fix memory corruption in --branch-history mode --branch-history · e3ebaa46
      Committed by Jiri Olsa
      Jin Yao reported memory corruption in perf report with
      branch info used for stack trace:
      
        > Following command lines will cause perf crash.
      
        > perf record -j call -g -a <application>
        > perf report --branch-history
        >
        > *** Error in `perf': double free or corruption (!prev): 0x00000000104aa040 ***
        > ======= Backtrace: =========
        > /lib/x86_64-linux-gnu/libc.so.6(+0x77725)[0x7f6b37254725]
        > /lib/x86_64-linux-gnu/libc.so.6(+0x7ff4a)[0x7f6b3725cf4a]
        > /lib/x86_64-linux-gnu/libc.so.6(cfree+0x4c)[0x7f6b37260abc]
        > perf[0x51b914]
        > perf(hist_entry_iter__add+0x1e5)[0x51f305]
        > perf[0x43cf01]
        > perf[0x4fa3bf]
        > perf[0x4fa923]
        > perf[0x4fd396]
        > perf[0x4f9614]
        > perf(perf_session__process_events+0x89e)[0x4fc38e]
        > perf(cmd_report+0x15d2)[0x43f202]
        > perf[0x4a059f]
        > perf(main+0x631)[0x427b71]
        > /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf0)[0x7f6b371fd830]
        > perf(_start+0x29)[0x427d89]
      
      For the cumulative output, we allocate the he_cache array based on the
      --max-stack option value and populate it with data from 'callchain_cursor'.
      
      The --max-stack option value currently does not limit the number of
      callchain_cursor nodes, so the cumulative iter code allocates a smaller array
      than is actually needed, which causes the corruption above.
      
      I think the --max-stack limit does not apply here anyway, because we add
      callchain data as normal hist entries, while --max-stack controls the limit
      of single entry callchain depth.
      
      Use callchain_cursor.nr as the he_cache array count to fix this. Also
      remove struct hist_entry_iter::max_stack, because there is no longer any use
      for it.
      
      We need more fixes to ensure that the branch stack code properly follows the
      logic of --max-stack, which is not the case at the moment.
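
The sizing problem described above can be illustrated with a minimal sketch. The names `struct cursor`, `struct entry_cache`, `cache_init_from_cursor()`, `cache_put()` and `cache_exit()` are hypothetical and are not perf's actual code; the sketch only shows that an array filled with one slot per cursor node should be sized from the cursor's own node count rather than from an unrelated option value.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a callchain cursor: only the node count matters here. */
struct cursor {
	size_t nr;	/* number of nodes currently held by the cursor */
};

/* Hypothetical cache that receives one slot per cursor node. */
struct entry_cache {
	void **slots;
	size_t capacity;
};

/*
 * Size the cache from the cursor's own node count, so that filling it with
 * one entry per node cannot run past the end of the array.
 * Returns 0 on success, -1 on allocation failure.
 */
int cache_init_from_cursor(struct entry_cache *cache, const struct cursor *cur)
{
	size_t n;

	assert(cache != NULL && cur != NULL);
	/* calloc(0, ...) may return NULL, so always ask for at least one slot. */
	n = cur->nr ? cur->nr : 1;

	cache->slots = calloc(n, sizeof(*cache->slots));
	if (cache->slots == NULL)
		return -1;
	cache->capacity = n;
	return 0;
}

/* Store one entry; refuse an out-of-bounds index instead of corrupting the heap. */
int cache_put(struct entry_cache *cache, size_t idx, void *entry)
{
	if (idx >= cache->capacity)
		return -1;
	cache->slots[idx] = entry;
	return 0;
}

/* Release the array and reset the bookkeeping. */
void cache_exit(struct entry_cache *cache)
{
	free(cache->slots);
	cache->slots = NULL;
	cache->capacity = 0;
}
```

The bounds check in `cache_put()` is an extra safety net added for this sketch; the commit itself only describes changing the array count.
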
      Original-patch-by: Jin Yao <yao.jin@linux.intel.com>
      Signed-off-by: Jiri Olsa <jolsa@kernel.org>
      Reported-by: Jin Yao <yao.jin@linux.intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/20180216123619.GA9945@krava
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  6. 02 Nov, 2017 1 commit
    • License cleanup: add SPDX GPL-2.0 license identifier to files with no license · b2441318
      Committed by Greg Kroah-Hartman
      Many source files in the tree are missing licensing information, which
      makes it harder for compliance tools to determine the correct license.
      
      By default all files without license information are under the default
      license of the kernel, which is GPL version 2.
      
      Update the files which contain no license information with the 'GPL-2.0'
      SPDX license identifier.  The SPDX identifier is a legally binding
      shorthand, which can be used instead of the full boiler plate text.
      
      This patch is based on work done by Thomas Gleixner and Kate Stewart and
      Philippe Ombredanne.
      
      How this work was done:
      
      Patches were generated and checked against linux-4.14-rc6 for a subset of
      the use cases:
       - file had no licensing information in it,
       - file was a */uapi/* one with no licensing information in it,
       - file was a */uapi/* one with existing licensing information.
      
      Further patches will be generated in subsequent months to fix up cases
      where non-standard license headers were used, and references to license
      had to be inferred by heuristics based on keywords.
      
      The analysis to determine which SPDX License Identifier to be applied to
      a file was done in a spreadsheet of side by side results from the
      output of two independent scanners (ScanCode & Windriver) producing SPDX
      tag:value files created by Philippe Ombredanne.  Philippe prepared the
      base worksheet, and did an initial spot review of a few thousand files.
      
      The 4.13 kernel was the starting point of the analysis with 60,537 files
      assessed.  Kate Stewart did a file by file comparison of the scanner
      results in the spreadsheet to determine which SPDX license identifier(s)
      to be applied to the file. She confirmed any determination that was not
      immediately clear with lawyers working with the Linux Foundation.
      
      Criteria used to select files for SPDX license identifier tagging were:
       - Files considered eligible had to be source code files.
       - Make and config files were included as candidates if they contained >5
         lines of source
       - File already had some variant of a license header in it (even if <5
         lines).
      
      All documentation files were explicitly excluded.
      
      The following heuristics were used to determine which SPDX license
      identifiers to apply.
      
       - when both scanners couldn't find any license traces, file was
         considered to have no license information in it, and the top level
         COPYING file license applied.
      
         For non */uapi/* files that summary was:
      
         SPDX license identifier                            # files
         ---------------------------------------------------|-------
         GPL-2.0                                              11139
      
         and resulted in the first patch in this series.
      
         If that file was a */uapi/* path one, it was "GPL-2.0 WITH
         Linux-syscall-note" otherwise it was "GPL-2.0".  The results were:
      
         SPDX license identifier                            # files
         ---------------------------------------------------|-------
         GPL-2.0 WITH Linux-syscall-note                        930
      
         and resulted in the second patch in this series.
      
       - if a file had some form of licensing information in it, and was one
         of the */uapi/* ones, it was denoted with the Linux-syscall-note if
         any GPL family license was found in the file or had no licensing in
         it (per prior point).  Results summary:
      
         SPDX license identifier                            # files
         ---------------------------------------------------|------
         GPL-2.0 WITH Linux-syscall-note                       270
         GPL-2.0+ WITH Linux-syscall-note                      169
         ((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause)    21
         ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause)    17
         LGPL-2.1+ WITH Linux-syscall-note                      15
         GPL-1.0+ WITH Linux-syscall-note                       14
         ((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause)    5
         LGPL-2.0+ WITH Linux-syscall-note                       4
         LGPL-2.1 WITH Linux-syscall-note                        3
         ((GPL-2.0 WITH Linux-syscall-note) OR MIT)              3
         ((GPL-2.0 WITH Linux-syscall-note) AND MIT)             1
      
         and that resulted in the third patch in this series.
      
       - when the two scanners agreed on the detected license(s), that became
         the concluded license(s).
      
       - when there was disagreement between the two scanners (one detected a
         license but the other didn't, or they both detected different
         licenses) a manual inspection of the file occurred.
      
       - In most cases a manual inspection of the information in the file
         resulted in a clear resolution of the license that should apply (and
         which scanner probably needed to revisit its heuristics).
      
       - When it was not immediately clear, the license identifier was
         confirmed with lawyers working with the Linux Foundation.
      
       - If there was any question as to the appropriate license identifier,
         the file was flagged for further research and to be revisited later
         in time.
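
The first heuristic above (no license traces found by either scanner) can be sketched in C as follows. This is an illustration only, not the script used for the patches; the function name `default_spdx_id()` is hypothetical, and matching a uapi path with a plain substring search is a simplifying assumption.

```c
#include <assert.h>
#include <string.h>

/*
 * Pick the SPDX identifier for a file in which neither scanner found any
 * license text: uapi headers get the syscall-note exception, everything
 * else falls back to the license of the top-level COPYING file.
 */
const char *default_spdx_id(const char *path)
{
	assert(path != NULL);
	/* Simplified uapi match: any path that contains a "uapi" directory. */
	if (strstr(path, "/uapi/") != NULL)
		return "GPL-2.0 WITH Linux-syscall-note";
	return "GPL-2.0";
}
```

Files that already carry license text follow the other heuristics in the list and are deliberately out of scope for this sketch.
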
      
      In total, over 70 hours of logged manual review was done on the
      spreadsheet to determine the SPDX license identifiers to apply to the
      source files by Kate, Philippe, Thomas and, in some cases, confirmation
      by lawyers working with the Linux Foundation.
      
      Kate also obtained a third independent scan of the 4.13 code base from
      FOSSology, and compared selected files where the other two scanners
      disagreed against that SPDX file, to see if there were any new insights.  The
      Windriver scanner is based on an older version of FOSSology in part, so
      they are related.
      
      Thomas did random spot checks in about 500 files from the spreadsheets
      for the uapi headers and agreed with SPDX license identifier in the
      files he inspected. For the non-uapi files Thomas did random spot checks
      in about 15000 files.
      
      In the initial set of patches against 4.14-rc6, 3 files were found to have
      copy/paste license identifier errors, and have been fixed to reflect the
      correct identifier.
      
      Additionally Philippe spent 10 hours this week doing a detailed manual
      inspection and review of the 12,461 patched files from the initial patch
      version early this week with:
       - a full scancode scan run, collecting the matched texts, detected
         license ids and scores
       - reviewing anything where there was a license detected (about 500+
         files) to ensure that the applied SPDX license was correct
       - reviewing anything where there was no detection but the patch license
         was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied
         SPDX license was correct
      
      This produced a worksheet with 20 files needing minor correction.  This
      worksheet was then exported into 3 different .csv files for the
      different types of files to be modified.
      
      These .csv files were then reviewed by Greg.  Thomas wrote a script to
      parse the csv files and add the proper SPDX tag to the file, in the
      format that the file expected.  This script was further refined by Greg
      based on the output to detect more types of files automatically and to
      distinguish between header and source .c files (which need different
      comment types.)  Finally Greg ran the script using the .csv files to
      generate the patches.
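
The distinction between header and source .c files mentioned above can be sketched as follows. This is not the script written by Thomas and Greg; `has_suffix()` and `format_spdx_line()` are hypothetical names, and the comment styles follow the kernel's documented convention of a block comment for C headers and a line comment for C source files.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Return nonzero if 'path' ends with 'suffix'. */
int has_suffix(const char *path, const char *suffix)
{
	size_t plen = strlen(path), slen = strlen(suffix);

	return plen >= slen && strcmp(path + plen - slen, suffix) == 0;
}

/*
 * Write the SPDX tag line for 'path' into 'buf'.
 * Returns 0 on success, -1 if the file type is not handled here
 * or 'buf' is too small.
 */
int format_spdx_line(char *buf, size_t len, const char *path, const char *id)
{
	int n;

	assert(buf != NULL && path != NULL && id != NULL);
	if (has_suffix(path, ".h"))
		n = snprintf(buf, len, "/* SPDX-License-Identifier: %s */", id);
	else if (has_suffix(path, ".c"))
		n = snprintf(buf, len, "// SPDX-License-Identifier: %s", id);
	else
		return -1;
	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

Other file types (Makefiles, assembly, scripts) need their own comment syntax and are left out of this sketch.
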
      Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org>
      Reviewed-by: Philippe Ombredanne <pombredanne@nexb.com>
      Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
  7. 25 Oct, 2017 1 commit
    • perf report: Use srcline from callchain for hist entries · 1fb7d06a
      Committed by Milian Wolff
      This also removes the symbol name from the srcline column, more on this
      below.
      
      This ensures we use the correct srcline, which could originate from a
      potentially inlined function. The hist entries used to query for the
      srcline based purely on the IP, which leads to wrong results for inlined
      entries.
      
      Before:
      
      ~~~~~
        perf report --inline -s srcline -g none --stdio
        ...
        # Children      Self  Source:Line
        # ........  ........  ..................................................................................................................................
        #
            94.23%     0.00%  __libc_start_main+18446603487898210537
            94.23%     0.00%  _start+41
            44.58%     0.00%  main+100
            44.58%     0.00%  std::_Norm_helper<true>::_S_do_it<double>+100
            44.58%     0.00%  std::__complex_abs+100
            44.58%     0.00%  std::abs<double>+100
            44.58%     0.00%  std::norm<double>+100
            36.01%     0.00%  hypot+18446603487892193300
            25.81%     0.00%  main+41
            25.81%     0.00%  std::__detail::_Adaptor<std::linear_congruential_engine<unsigned long, 16807ul, 0ul, 2147483647ul>, double>::operator()+41
            25.81%     0.00%  std::uniform_real_distribution<double>::operator()<std::linear_congruential_engine<unsigned long, 16807ul, 0ul, 2147483647ul> >+41
            25.75%    25.75%  random.h:143
            18.39%     0.00%  main+57
            18.39%     0.00%  std::__detail::_Adaptor<std::linear_congruential_engine<unsigned long, 16807ul, 0ul, 2147483647ul>, double>::operator()+57
            18.39%     0.00%  std::uniform_real_distribution<double>::operator()<std::linear_congruential_engine<unsigned long, 16807ul, 0ul, 2147483647ul> >+57
            13.80%    13.80%  random.tcc:3330
             5.64%     0.00%  ??:0
             4.13%     4.13%  __hypot_finite+163
             4.13%     0.00%  __hypot_finite+18446603487892193443
      ...
      ~~~~~
      
      After:
      
      ~~~~~
        perf report --inline -s srcline -g none --stdio
        ...
        # Children      Self  Source:Line
        # ........  ........  ...........................................
        #
            94.30%     1.19%  main.cpp:39
            94.23%     0.00%  __libc_start_main+18446603487898210537
            94.23%     0.00%  _start+41
            48.44%     1.70%  random.h:1823
            48.44%     0.00%  random.h:1814
            46.74%     2.53%  random.h:185
            44.68%     0.10%  complex:589
            44.68%     0.00%  complex:597
            44.68%     0.00%  complex:654
            44.68%     0.00%  complex:664
            40.61%    13.80%  random.tcc:3330
            36.01%     0.00%  hypot+18446603487892193300
            26.81%     0.00%  random.h:151
            26.81%     0.00%  random.h:332
            25.75%    25.75%  random.h:143
             5.64%     0.00%  ??:0
             4.13%     4.13%  __hypot_finite+163
             4.13%     0.00%  __hypot_finite+18446603487892193443
      ...
      ~~~~~
      
      Note that this change removes the symbol from the source:line hist
      column. Users who want this information should query for it
      explicitly, i.e. run this command instead:
      
      ~~~~~
        perf report --inline -s sym,srcline -g none --stdio
        ...
        # To display the perf.data header info, please use --header/--header-only options.
        #
        #
        # Total Lost Samples: 0
        #
        # Samples: 1K of event 'cycles:uppp'
        # Event count (approx.): 1381229476
        #
        # Children      Self  Symbol                                                                                                                               Source:Line
        # ........  ........  ...................................................................................................................................  ...........................................
        #
            94.30%     1.19%  [.] main                                                                                                                             main.cpp:39
            94.23%     0.00%  [.] __libc_start_main                                                                                                                __libc_start_main+18446603487898210537
            94.23%     0.00%  [.] _start                                                                                                                           _start+41
            48.44%     0.00%  [.] std::uniform_real_distribution<double>::operator()<std::linear_congruential_engine<unsigned long, 16807ul, 0ul, 2147483647ul> > (inlined)  random.h:1814
            48.44%     0.00%  [.] std::uniform_real_distribution<double>::operator()<std::linear_congruential_engine<unsigned long, 16807ul, 0ul, 2147483647ul> > (inlined)  random.h:1823
            46.74%     0.00%  [.] std::__detail::_Adaptor<std::linear_congruential_engine<unsigned long, 16807ul, 0ul, 2147483647ul>, double>::operator() (inlined)  random.h:185
            44.68%     0.00%  [.] std::_Norm_helper<true>::_S_do_it<double> (inlined)                                                                              complex:654
            44.68%     0.00%  [.] std::__complex_abs (inlined)                                                                                                     complex:589
            44.68%     0.00%  [.] std::abs<double> (inlined)                                                                                                       complex:597
            44.68%     0.00%  [.] std::norm<double> (inlined)                                                                                                      complex:664
            39.80%    13.59%  [.] std::generate_canonical<double, 53ul, std::linear_congruential_engine<unsigned long, 16807ul, 0ul, 2147483647ul> >               random.tcc:3330
            36.01%     0.00%  [.] hypot                                                                                                                            hypot+18446603487892193300
            26.81%     0.00%  [.] std::__detail::__mod<unsigned long, 2147483647ul, 16807ul, 0ul> (inlined)                                                        random.h:151
            26.81%     0.00%  [.] std::linear_congruential_engine<unsigned long, 16807ul, 0ul, 2147483647ul>::operator() (inlined)                                 random.h:332
            25.75%     0.00%  [.] std::__detail::_Mod<unsigned long, 2147483647ul, 16807ul, 0ul, true, true>::__calc (inlined)                                     random.h:143
            25.19%    25.19%  [.] std::generate_canonical<double, 53ul, std::linear_congruential_engine<unsigned long, 16807ul, 0ul, 2147483647ul> >               random.h:143
             4.13%     4.13%  [.] __hypot_finite                                                                                                                   __hypot_finite+163
             4.13%     0.00%  [.] __hypot_finite                                                                                                                   __hypot_finite+18446603487892193443
      ...
      ~~~~~
      
      Compared to the old behavior, this reduces duplication in the output.
      Previously, the symbol name was printed in the srcline column even
      when the sym column was explicitly requested, i.e. the output was:
      
      ~~~~~
        perf report --inline -s sym,srcline -g none --stdio
        ...
        # To display the perf.data header info, please use --header/--header-only options.
        #
        #
        # Total Lost Samples: 0
        #
        # Samples: 1K of event 'cycles:uppp'
        # Event count (approx.): 1381229476
        #
        # Children      Self  Symbol                                                                                                                               Source:Line
        # ........  ........  ...................................................................................................................................  ..................................................................................................................................
        #
            94.23%     0.00%  [.] __libc_start_main                                                                                                                __libc_start_main+18446603487898210537
            94.23%     0.00%  [.] _start                                                                                                                           _start+41
            44.58%     0.00%  [.] main                                                                                                                             main+100
            44.58%     0.00%  [.] std::_Norm_helper<true>::_S_do_it<double> (inlined)                                                                              std::_Norm_helper<true>::_S_do_it<double>+100
            44.58%     0.00%  [.] std::__complex_abs (inlined)                                                                                                     std::__complex_abs+100
            44.58%     0.00%  [.] std::abs<double> (inlined)                                                                                                       std::abs<double>+100
            44.58%     0.00%  [.] std::norm<double> (inlined)                                                                                                      std::norm<double>+100
            36.01%     0.00%  [.] hypot                                                                                                                            hypot+18446603487892193300
            25.81%     0.00%  [.] main                                                                                                                             main+41
            25.81%     0.00%  [.] std::__detail::_Adaptor<std::linear_congruential_engine<unsigned long, 16807ul, 0ul, 2147483647ul>, double>::operator() (inlined)  std::__detail::_Adaptor<std::linear_congruential_engine<unsigned long, 16807ul, 0ul, 2147483647ul>, double>::operator()+41
            25.81%     0.00%  [.] std::uniform_real_distribution<double>::operator()<std::linear_congruential_engine<unsigned long, 16807ul, 0ul, 2147483647ul> > (inlined)  std::uniform_real_distribution<double>::operator()<std::linear_congruential_engine<unsigned long, 16807ul, 0ul, 2147483647ul> >+41
            25.69%    25.69%  [.] std::generate_canonical<double, 53ul, std::linear_congruential_engine<unsigned long, 16807ul, 0ul, 2147483647ul> >               random.h:143
            18.39%     0.00%  [.] main                                                                                                                             main+57
            18.39%     0.00%  [.] std::__detail::_Adaptor<std::linear_congruential_engine<unsigned long, 16807ul, 0ul, 2147483647ul>, double>::operator() (inlined)  std::__detail::_Adaptor<std::linear_congruential_engine<unsigned long, 16807ul, 0ul, 2147483647ul>, double>::operator()+57
            18.39%     0.00%  [.] std::uniform_real_distribution<double>::operator()<std::linear_congruential_engine<unsigned long, 16807ul, 0ul, 2147483647ul> > (inlined)  std::uniform_real_distribution<double>::operator()<std::linear_congruential_engine<unsigned long, 16807ul, 0ul, 2147483647ul> >+57
            13.80%    13.80%  [.] std::generate_canonical<double, 53ul, std::linear_congruential_engine<unsigned long, 16807ul, 0ul, 2147483647ul> >               random.tcc:3330
             4.13%     4.13%  [.] __hypot_finite                                                                                                                   __hypot_finite+163
             4.13%     0.00%  [.] __hypot_finite                                                                                                                   __hypot_finite+18446603487892193443
      ...
      ~~~~~
      Signed-off-by: Milian Wolff <milian.wolff@kdab.com>
      Reviewed-by: Andi Kleen <ak@linux.intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jin Yao <yao.jin@linux.intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/20171019113836.5548-5-milian.wolff@kdab.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  8. 24 Oct, 2017 1 commit
  9. 02 Sep, 2017 1 commit
  10. 26 Jul, 2017 1 commit
    • perf report: Make --branch-history work without callgraphs(-g) option in perf record · b49a821e
      Committed by Jin Yao
        perf record -b -g <command>
        perf report --branch-history
      
      This merges the LBRs with the callgraphs.
      
      However, it would be nice if it also worked without callgraphs (-g) set in
      perf record, so that only the LBRs are displayed.  Currently, perf
      report fails with an error in this case. For example,
      
        perf record -b <command>
        perf report --branch-history
      
        Error:
        Selected -g or --branch-history but no callchain data. Did
        you call 'perf record' without -g?
      
      This patch displays the LBRs only even if callgraphs(-g) is not enabled
      in perf record.
      
      Change log:
      
      v2: According to Milian Wolff's comment, change the obsolete error
      message. Now the error message is:
      
                       ┌─Error:─────────────────────────────────────┐
                       │Selected -g or --branch-history.            │
                       │But no callchain or branch data.            │
                       │Did you call 'perf record' without -g or -b?│
                       │                                            │
                       │                                            │
                       │Press any key...                            │
                       └────────────────────────────────────────────┘
      
      When passing the last parameter to hists__fprintf,
      change "|" to "||":
      
        hists__fprintf(hists, !quiet, 0, 0, rep->min_percent, stdout,
                       symbol_conf.use_callchain || symbol_conf.show_branchflag_count);
      Signed-off-by: Yao Jin <yao.jin@linux.intel.com>
      Reviewed-by: Andi Kleen <ak@linux.intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1494240182-28899-1-git-send-email-yao.jin@linux.intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  11. 19 Jul, 2017 1 commit
    • perf report: Show branch type statistics for stdio mode · 2d78b189
      Committed by Jin Yao
      Show the branch type statistics at the end of perf report --stdio.
      
      For example:
      
        perf report --stdio
      
        COND_FWD:  28.5%
        COND_BWD:   9.4%
        CROSS_4K:   0.7%
        CROSS_2M:  14.1%
            COND:  37.9%
          UNCOND:   0.2%
             IND:   6.7%
            CALL:  26.5%
             RET:  28.7%
          SYSRET:   0.0%
      
        The branch types are:
      
         COND_FWD: conditional forward
         COND_BWD: conditional backward
             COND: conditional branch
           UNCOND: unconditional branch
              IND: indirect
             CALL: function call
         IND_CALL: indirect function call
              RET: function return
          SYSCALL: syscall
           SYSRET: syscall return
        COND_CALL: conditional function call
         COND_RET: conditional function return
      
      CROSS_4K and CROSS_2M:
      
      These metrics count branches that cross a 4K or 2MB page boundary.
      The result is an approximation: we don't know whether the area is 4K
      or 2MB, so both are always computed.

      To keep the output simple, CROSS_4K is not incremented when a branch
      crosses a 2M area.
      
      Change log
      
      v7: Since the common branch type definitions are changed, some
          tags/strings are updated accordingly.
      
      v6: Remove branch_type_stat_display() since it's moved to branch.c.
      
      v5: Remove the unnecessary sort__mode checking in
          hist_iter__branch_callback().
      
      v4: Compared to the previous version, the major changes are:

      Add the computation of JCC forward/JCC backward and cross-page checking
      using the from and to addresses.
      Signed-off-by: Yao Jin <yao.jin@linux.intel.com>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1500379995-6449-7-git-send-email-yao.jin@linux.intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  12. 25 Apr, 2017 2 commits
  13. 20 Apr, 2017 2 commits
  14. 30 Mar, 2017 1 commit
  15. 27 Mar, 2017 1 commit
  16. 15 Mar, 2017 1 commit
    • perf tools: Add 'cgroup_id' sort order keyword · d890a98c
      Committed by Hari Bathini
      This patch introduces a cgroup identifier entry field in perf report to
      identify or distinguish data of different cgroups. It uses the device
      number and inode number of cgroup namespace, included in perf data with
      the new PERF_RECORD_NAMESPACES event, as cgroup identifier.
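
A minimal sketch of such an identifier, assuming a hypothetical `struct cgroup_id` with helpers `cgroup_id_cmp()` and `cgroup_id_snprintf()` that are not perf's actual sort code. The "dev/0xinode" rendering is an assumption inferred from the report column in this commit message.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical identifier: device and inode number of a cgroup namespace. */
struct cgroup_id {
	uint64_t dev;
	uint64_t ino;
};

/* Order by device first, then inode; returns <0, 0 or >0 like strcmp(). */
int cgroup_id_cmp(const struct cgroup_id *a, const struct cgroup_id *b)
{
	assert(a != NULL && b != NULL);
	if (a->dev != b->dev)
		return a->dev < b->dev ? -1 : 1;
	if (a->ino != b->ino)
		return a->ino < b->ino ? -1 : 1;
	return 0;
}

/*
 * Render the identifier as "dev/0xinode".
 * Returns 0 on success, -1 if 'buf' is too small.
 */
int cgroup_id_snprintf(const struct cgroup_id *id, char *buf, size_t len)
{
	int n;

	assert(id != NULL && buf != NULL);
	n = snprintf(buf, len, "%" PRIu64 "/0x%" PRIx64, id->dev, id->ino);
	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

Two samples belong to the same group in this sketch exactly when `cgroup_id_cmp()` returns 0 for their identifiers.
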
      
      Assuming that each container is created with its own cgroup
      namespace, this allows assessment/analysis of multiple containers at
      once.
      
      A simple test for this would be to clone a few processes, passing the
      SIGCHLD & CLONE_NEWCGROUP flags to each of them, execute a shell and run
      different workloads in each of those contexts, while running the perf
      record command with the --namespaces option.
      
      Shown below is the output of perf report, sorted with cgroup identifier,
      on perf.data generated with the above test scenario, clearly indicating
      one context's considerable use of kernel memory in comparison with
      others:
      
      	$ perf report -s cgroup_id,sample --stdio
      	#
      	# Total Lost Samples: 0
      	#
      	# Samples: 5K of event 'kmem:kmalloc'
      	# Event count (approx.): 5965
      	#
      	# Overhead  cgroup id (dev/inode)       Samples
      	# ........  .....................  ............
      	#
      	    81.27%  3/0xeffffffb                   4848
      	    16.24%  3/0xf00000d0                    969
      	     1.16%  3/0xf00000ce                     69
      	     0.82%  3/0xf00000cf                     49
      	     0.50%  0/0x0                            30
      
      While this is a start, there is scope for further improvement. For
      example, instead of the cgroup namespace's device and inode numbers, the
      device and inode numbers of some or all namespaces may be used to distinguish
      which processes are running in a given container context.

      Also, scripts to map device and inode info to containers sound
      plausible for better tracing of containers.
      Signed-off-by: Hari Bathini <hbathini@linux.vnet.ibm.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Alexei Starovoitov <ast@fb.com>
      Cc: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com>
      Cc: Aravinda Prasad <aravinda@linux.vnet.ibm.com>
      Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
      Cc: Daniel Borkmann <daniel@iogearbox.net>
      Cc: Eric Biederman <ebiederm@xmission.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Sargun Dhillon <sargun@sargun.me>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Link: http://lkml.kernel.org/r/148891933338.25309.756882900782042645.stgit@hbathini.in.ibm.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  17. 20 Feb, 2017 1 commit
  18. 02 Feb, 2017 1 commit
  19. 01 Feb, 2017 1 commit
  20. 27 Jan, 2017 1 commit
    • perf tools: Propagate perf_config() errors · ecc4c561
      Committed by Arnaldo Carvalho de Melo
      Previously these were being ignored, sometimes silently.
      
      Stop doing that, emitting debug messages and handling the errors.
      
      Testing it:
      
        $ cat ~/.perfconfig
        cat: /home/acme/.perfconfig: No such file or directory
        $ perf stat -e cycles usleep 1
      
         Performance counter stats for 'usleep 1':
      
                 938,996      cycles:u
      
             0.003813731 seconds time elapsed
      
        $ perf top --stdio
        Error:
        You may not have permission to collect system-wide stats.
      
        Consider tweaking /proc/sys/kernel/perf_event_paranoid,
        <SNIP>
        [ perf record: Captured and wrote 0.019 MB perf.data (7 samples) ]
        [acme@jouet linux]$ perf report --stdio
        # To display the perf.data header info, please use --header/--header-only options.
        # Overhead  Command  Shared Object      Symbol
        # ........  .......  .................  .........................
          71.77%  usleep   libc-2.24.so       [.] _dl_addr
          27.07%  usleep   ld-2.24.so         [.] _dl_next_ld_env_entry
           1.13%  usleep   [kernel.kallsyms]  [k] page_fault
        $
        $ touch ~/.perfconfig
        $ ls -la ~/.perfconfig
        -rw-rw-r--. 1 acme acme 0 Jan 27 12:14 /home/acme/.perfconfig
        $
        $ perf stat -e instructions usleep 1
      
         Performance counter stats for 'usleep 1':
      
                 244,610      instructions:u
      
             0.000805383 seconds time elapsed
      
        $
        [root@jouet ~]# chown acme.acme ~/.perfconfig
        [root@jouet ~]# perf stat -e cycles usleep 1
          Warning: File /root/.perfconfig not owned by current user or root, ignoring it.
      
         Performance counter stats for 'usleep 1':
      
                 937,615      cycles
      
             0.000836931 seconds time elapsed
        #
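
The general pattern of propagating a callback's error instead of ignoring it can be sketched as follows. This is a simplified Python illustration, not the perf implementation: `walk_config`, `reject_empty`, and the list-based debug log are hypothetical names introduced here.

```python
def walk_config(entries, callback, log):
    """Call callback(name, value) per entry; return the first negative result.

    Instead of discarding the callback's return value, record a debug
    message and hand the error back to the caller.
    """
    for name, value in entries:
        ret = callback(name, value)
        if ret < 0:
            log.append(f"config entry {name!r} failed with {ret}")
            return ret
    return 0


def reject_empty(name, value):
    """Hypothetical callback: treat an empty value as an error."""
    return -1 if value == "" else 0
```

A caller can then check the return value of `walk_config()` and decide whether to abort or continue with defaults.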
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/n/tip-j2rq96so6xdqlr8p8rd6a3jx@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  21. 09 Nov, 2016 1 commit
  22. 21 Oct, 2016 1 commit
  23. 20 Sep, 2016 1 commit
  24. 14 Sep, 2016 3 commits
  25. 03 Aug, 2016 1 commit
  26. 12 Jul, 2016 3 commits
  27. 23 Jun, 2016 1 commit
  28. 22 Jun, 2016 2 commits
  29. 15 Jun, 2016 1 commit
  30. 23 May, 2016 1 commit
    • A
      perf report: Add srcline_from/to branch sort keys · 508be0df
      Andi Kleen committed
      Add "srcline_from" and "srcline_to" branch sort keys that show the
      source lines of a branch.
      
      That makes it much easier to track down where particular branches happen
      in the program, for example to examine branch mispredictions, or to
      associate them with cycle counts:
      
        % perf record -b -e cycles:p ./tcall
        % perf report --sort srcline_from,srcline_to,mispredict
        ...
          15.10%  tcall.c:18       tcall.c:10       N
          14.83%  tcall.c:11       tcall.c:5        N
          14.12%  tcall.c:7        tcall.c:12       N
          14.04%  tcall.c:12       tcall.c:5        N
          12.42%  tcall.c:17       tcall.c:18       N
          12.39%  tcall.c:7        tcall.c:13       N
          12.27%  tcall.c:13       tcall.c:17       N
        ...
      
        % perf report --sort srcline_from,srcline_to,cycles
        ...
          17.12%  tcall.c:18       tcall.c:11       1
          17.01%  tcall.c:12       tcall.c:6        1
          16.98%  tcall.c:11       tcall.c:6        1
          15.91%  tcall.c:17       tcall.c:18       1
           6.38%  tcall.c:7        tcall.c:17       7
           4.80%  tcall.c:7        tcall.c:12       8
           4.21%  tcall.c:7        tcall.c:17       8
           2.67%  tcall.c:7        tcall.c:12       7
           2.62%  tcall.c:7        tcall.c:12       10
           2.10%  tcall.c:7        tcall.c:17       9
           1.58%  tcall.c:7        tcall.c:12       6
           1.44%  tcall.c:7        tcall.c:12       5
           1.38%  tcall.c:7        tcall.c:12       9
           1.06%  tcall.c:7        tcall.c:17       13
           1.05%  tcall.c:7        tcall.c:12       4
           1.01%  tcall.c:7        tcall.c:17       6
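
To make the grouping concrete, here is a small Python sketch of what sorting by a (from, to) source-line pair amounts to: counting branch records per pair and expressing each count as a share of the total. The record layout (dicts with "from" and "to" keys) and the function name are assumptions made for this illustration; perf's own data structures differ.

```python
from collections import Counter


def branch_histogram(branches):
    """Aggregate branch records by (from, to) source line.

    Returns a list of (percent, from_line, to_line), most frequent first.
    """
    counts = Counter((b["from"], b["to"]) for b in branches)
    total = sum(counts.values())
    return [
        (100.0 * n / total, src, dst)
        for (src, dst), n in counts.most_common()
    ]
```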
      
      Open issues:
      
      - Some kernel symbols get misresolved.

      Signed-off-by: Andi Kleen <ak@linux.intel.com>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Link: http://lkml.kernel.org/r/1463775308-32748-1-git-send-email-andi@firstfloor.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  31. 06 May, 2016 1 commit
  32. 26 Apr, 2016 1 commit
    • K
      perf hists: Clear dummy entry accumulated period · 09623d79
      Kan Liang committed
      The accumulated period for a dummy entry should also be 0. Otherwise,
      the total overhead could be overcounted.
      
        $ perf record -e '{LLC-load-misses,cpu/instructions/}' --call-graph=lbr ./tchain
        $ perf report --stdio
        # To display the perf.data header info, please use --header/--header-only options.
        #
        # Total Lost Samples: 0
        #
        # Samples: 21K of event 'anon group { LLC-load-misses, cpu/instructions/ }'
        # Event count (approx.): 16313667937
        #
        #         Children              Self  Command      Shared Object     Symbol
        # ................  ................  ...........  ................  ............................
        #
          4769.98%   0.01%     0.00%   0.01%  tchain_edit  [kernel.vmlinux]  [k] update_fast_timekeeper
          4356.18%   0.01%     0.00%   0.01%  tchain_edit  [kernel.vmlinux]  [k] trigger_load_balance
          3181.12%   0.01%     0.00%   0.01%  tchain_edit  [kernel.vmlinux]  [k] irq_work_tick
          1592.37%   0.00%     0.00%   0.00%  tchain_edit  [kernel.vmlinux]  [k] cpu_needs_another_gp
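
The arithmetic behind such inflated "Children" percentages can be illustrated with a small Python sketch. It models a dummy entry created from another event's entry: if the accumulated period is carried over rather than cleared, dividing it by this event's much smaller total produces values far above 100%. The classes, field names, and the `clear_accumulated` switch are hypothetical and only mirror the idea described in the commit message.

```python
from dataclasses import dataclass, field


@dataclass
class Stat:
    period: int = 0


@dataclass
class HistEntry:
    symbol: str
    stat: Stat = field(default_factory=Stat)      # "Self" period
    stat_acc: Stat = field(default_factory=Stat)  # accumulated "Children" period


def make_dummy(template, clear_accumulated=True):
    """Create a placeholder entry for an event that has no sample for the symbol."""
    dummy = HistEntry(symbol=template.symbol)  # self period starts at 0
    if not clear_accumulated:
        # Models the overcounting: the other event's accumulated period leaks in.
        dummy.stat_acc = Stat(template.stat_acc.period)
    return dummy


def children_percent(entry, total_period):
    """Accumulated period as a percentage of the event's total period."""
    return 100.0 * entry.stat_acc.period / total_period if total_period else 0.0
```

With the accumulated period cleared, the dummy entry contributes 0% to the event it was created for.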

      Signed-off-by: Kan Liang <kan.liang@intel.com>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: http://lkml.kernel.org/r/1461565689-5862-1-git-send-email-kan.liang@intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  33. 15 Apr, 2016 1 commit
    • A
      perf callchain: Start moving away from global per thread cursors · 91d7b2de
      Arnaldo Carvalho de Melo committed
      The recent perf_evsel__fprintf_callchain() move to evsel.c added several
      new symbol requirements to the python binding, for instance:
      
        # perf test -v python
        16: Try 'import perf' in python, checking link problems      :
        --- start ---
        test child forked, pid 18030
        Traceback (most recent call last):
          File "<stdin>", line 1, in <module>
        ImportError: /tmp/build/perf/python/perf.so: undefined symbol:
        callchain_cursor
        test child finished with -1
        ---- end ----
        Try 'import perf' in python, checking link problems: FAILED!
        #
      
      This would require linking against callchain.c to access the global
      callchain_cursor variable.
      
      Since lots of functions already receive a callchain_cursor struct
      pointer as a parameter, make that the case for some more functions so
      that we can start phasing out usage of yet another global variable.
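
The refactoring direction described here, passing the cursor in rather than reaching for a global, can be shown with a minimal Python sketch. The class and function names are invented for illustration and do not correspond to perf's C API.

```python
class CallchainCursor:
    """Hypothetical stand-in for a callchain cursor: an ordered list of addresses."""

    def __init__(self):
        self.nodes = []

    def append(self, ip):
        self.nodes.append(ip)


def format_callchain(cursor):
    """Format the callchain held by the cursor passed in by the caller.

    Because the cursor is a parameter, this function has no dependency on a
    module-level variable and can be reused wherever a cursor is available.
    """
    return [f"{ip:#x}" for ip in cursor.nodes]
```

A function written this way can be linked or imported without dragging in whichever module owns the global state.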
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/n/tip-djko3097eyg2rn66v2qcqfvn@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>