1. 18 9月, 2017 1 次提交
    • K
      perf machine: Use hashtable for machine threads · 91e467bc
      Kan Liang 提交于
      To process any events, it needs to find the thread in the machine first.
      The machine maintains a rb tree to store all threads. The rb tree is
      protected by a rw lock.
      
      It is not a problem for current perf which serially processing events.
      However, it will have scalability performance issue to process events in
      parallel, especially on a heavy load system which have many threads.
      
      Introduce a hashtable to divide the big rb tree into many samll rb tree
      for threads. The index is thread id % hashtable size. It can reduce the
      lock contention.
      
      Committer notes:
      
      Renamed some variables and function names to reduce semantic confusion:
      
        'struct threads' pointers: thread -> threads
        threads hastable index: tid -> hash_bucket
        struct threads *machine__thread() -> machine__threads()
        Cast tid to (unsigned int) to handle -1 in machine__threads() (Kan Liang)
      Signed-off-by: NKan Liang <kan.liang@intel.com>
      Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Lukasz Odzioba <lukasz.odzioba@intel.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1505096603-215017-2-git-send-email-kan.liang@intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      91e467bc
  2. 13 9月, 2017 33 次提交
  3. 12 9月, 2017 6 次提交
    • M
      perf stat: Wait for the correct child · dfc9eec7
      Milian Wolff 提交于
      When packaging the perf userland application into an AppImage, the
      wait() call in perf stat returned too early. It turned out that some
      other child process exited, but not the one perf stat launched:
      
        $ sudo strace -e fork,execve,clone,wait4 -f ./perf-x86_64.AppImage stat sleep 1
        execve("./perf-git.3a73b7f9-x86_64.AppImage", ["./perf-git.3a73b7f9-x86_64.AppIm"..., "stat", "sleep", "1"], 0x7ffec1bbf050 /* 18 vars */) = 0
        clone(child_stack=NULL, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7f6a6e7efe50) = 3912
        strace: Process 3912 attached
        [pid  3912] clone(child_stack=NULL, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7f6a6e7efe50) = 3914
        strace: Process 3914 attached
        [pid  3912] +++ exited with 0 +++
        [pid  3911] --- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=3912, si_uid=0, si_status=0, si_utime=0, si_stime=0} ---
        [pid  3914] clone(strace: Process 3915 attached
        child_stack=0x7f6a6d9fefb0, flags=CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_SIGHAND|CLONE_THREAD|CLONE_SYSVSEM|CLONE_SETTLS|CLONE_PARENT_SETTID|CLONE_CHILD_CLEARTID, parent_tidptr=0x7f6a6d9ff9d0, tls=0x7f6a6d9ff700, child_tidptr=0x7f6a6d9ff9d0) = 3915
        [pid  3911] execve("/tmp/.mount_perf-g6VYMpl/AppRun", ["./perf-git.3a73b7f9-x86_64.AppIm"..., "stat", "sleep", "1"], 0x14aab70 /* 21 vars */) = 0
        [pid  3911] clone(child_stack=NULL, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7f4ae113c4d0) = 3916
        strace: Process 3916 attached
        [pid  3911] wait4(-1, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], 0, NULL) = 3912
        [pid  3916] execve("/usr/libexec/perf-core/sleep", ["sleep", "1"], 0x27d3650 /* 22 vars */) = -1 ENOENT (No such file or directory)
        [pid  3916] execve("/tmp/./sleep", ["sleep", "1"], 0x27d3650 /* 22 vars */) = -1 ENOENT (No such file or directory)
        [pid  3916] execve("/home/milian/.bin/sleep", ["sleep", "1"], 0x27d3650 /* 22 vars */) = -1 ENOENT (No such file or directory)
        [pid  3916] execve("/usr/lib/icecream/libexec/icecc/bin/sleep", ["sleep", "1"], 0x27d3650 /* 22 vars */) = -1 ENOENT (No such file or directory)
        [pid  3916] execve("/ssd2/milian/projects/compiled/other/bin/sleep", ["sleep", "1"], 0x27d3650 /* 22 vars */) = -1 ENOENT (No such file or directory)
        [pid  3916] execve("/home/milian/.bin/kf5/sleep", ["sleep", "1"], 0x27d3650 /* 22 vars */) = -1 ENOENT (No such file or directory)
        [pid  3916] execve("/ssd2/milian/projects/compiled/kf5/bin/sleep", ["sleep", "1"], 0x27d3650 /* 22 vars */) = -1 ENOENT (No such file or directory)
        [pid  3916] execve("/home/milian/projects/compiled/other/bin/sleep", ["sleep", "1"], 0x27d3650 /* 22 vars */) = -1 ENOENT (No such file or directory)
        [pid  3916] execve("/home/milian/projects/compiled/kf5/bin/sleep", ["sleep", "1"], 0x27d3650 /* 22 vars */) = -1 ENOENT (No such file or directory)
        [pid  3916] execve("/usr/local/sbin/sleep", ["sleep", "1"], 0x27d3650 /* 22 vars */) = -1 ENOENT (No such file or directory)
        [pid  3916] execve("/usr/local/bin/sleep", ["sleep", "1"], 0x27d3650 /* 22 vars */) = -1 ENOENT (No such file or directory)
        [pid  3916] execve("/usr/bin/sleep", ["sleep", "1"], 0x27d3650 /* 22 vars */
         Performance counter stats for 'sleep 1':
      
             <not counted>	task-clock
             <not counted>	context-switches
             <not counted>	cpu-migrations
             <not counted>	page-faults
             <not counted>	cycles
             <not counted>	instructions
             <not counted>      branches
             <not counted>      branch-misses
      
               0.000047194 seconds time elapsed
      
        [pid  3916] --- SIGTERM {si_signo=SIGTERM, si_code=SI_USER, si_pid=3911, si_uid=0} ---
        [pid  3916] +++ killed by SIGTERM +++
        [pid  3911] --- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_KILLED, si_pid=3916, si_uid=0, si_status=SIGTERM, si_utime=0, si_stime=0} ---
        [pid  3915] --- SIGPIPE {si_signo=SIGPIPE, si_code=SI_USER, si_pid=3914, si_uid=0} ---
        [pid  3911] +++ exited with 0 +++
        [pid  3915] --- SIGHUP {si_signo=SIGHUP, si_code=SI_USER, si_pid=3914, si_uid=0} ---
        [pid  3915] +++ exited with 0 +++
        +++ exited with 0 +++
      
      This patch uses waitpid instead to ensure the call waits for the
      debuggee application launched by 'perf stat'. This fixes 'perf stat'
      when launched from an AppImage:
      
        $ ./perf-x86_64.AppImage stat sleep 1
      
         Performance counter stats for 'sleep 1':
      
                0.357235      task-clock (msec)         #    0.000 CPUs utilized
                       1      context-switches          #    0.003 M/sec
                       0      cpu-migrations            #    0.000 K/sec
                      50      page-faults               #    0.140 M/sec
                 1269602      cycles                    #    3.554 GHz
                  654278      instructions              #    0.52  insn per cycle
                  129963      branches                  #  363.803 M/sec
                    7082      branch-misses             #    5.45% of all branches
      
             1.000633420 seconds time elapsed
      Signed-off-by: NMilian Wolff <milian.wolff@kdab.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Link: http://lkml.kernel.org/r/20170912152523.4497-1-milian.wolff@kdab.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      dfc9eec7
    • M
      perf tools: Support running perf binaries with a dash in their name · 3192f1ed
      Milian Wolff 提交于
      Previously the part behind "perf-" was interpreted as an internal perf
      command. If the suffix could not be handled, the execution was stopped.
      This makes it impossible to launch perf binaries that got renamed to
      have the `perf-` prefix. This is e.g. the case for appimages (e.g.
      "perf-x86_64.AppImage"), but would also apply to all other scenarios
      where users symlink or rename perf themselves:
      
      Status quo with the broken behavior:
      
        $ ln -s ./perf ./perf-custom-suffix
        $ ./perf-custom-suffix list
        cannot handle custom-suffix internally$
      
      Also note the missing newline at the end of the error message.
      
      With this patch applied, the above works properly:
      
        $ ./perf-custom-suffix list
      
        List of pre-defined events (to be used in -e):
        ...
      Signed-off-by: NMilian Wolff <milian.wolff@kdab.com>
      Acked-by: NDavid Ahern <dsahern@gmail.com>
      Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Yao Jin <yao.jin@linux.intel.com>
      Link: http://lkml.kernel.org/r/20170911111422.31903-1-milian.wolff@kdab.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      3192f1ed
    • T
      perf config: Check not only section->from_system_config but also item's · cba225d6
      Taeung Song 提交于
      Currently section->from_system_config is being checked multiple times.
      item->from_system_config should be checked instead, when iterating thru
      the items in a section. Fix it.
      Signed-off-by: NTaeung Song <treeze.taeung@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: http://lkml.kernel.org/r/1504754325-9724-1-git-send-email-treeze.taeung@gmail.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      cba225d6
    • J
      perf ui progress: Fix progress update · a82bfd04
      Jiri Olsa 提交于
      We currently update the 'next' variable only with a single step value.
      But it's possible the 'adv' update is bigger than single 'step' value.
      This would leave 'next' value under counted and force unnecessary
      ui_progress__ops->update calls.
      
      Calculate the amount of steps we need for 'adv' update and increase the
      'next' with that amounts of steps.
      Signed-off-by: NJiri Olsa <jolsa@kernel.org>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Link: http://lkml.kernel.org/r/20170908120510.22515-3-jolsa@kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      a82bfd04
    • J
      perf ui progress: Make sure we always define step value · 4d286c89
      Jiri Olsa 提交于
      Unlikely, but we could have ui_progress__init being called with total <
      16, which would set the next and step variables to 0. That would force
      unnecessary ui_progress__ops->update calls because 'next' would never
      raise.
      
      Forcing the next and step values to be always > 0.
      Signed-off-by: NJiri Olsa <jolsa@kernel.org>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Link: http://lkml.kernel.org/r/20170908120510.22515-2-jolsa@kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      4d286c89
    • J
      perf tools: Open perf.data with O_CLOEXEC flag · cd6379eb
      Jiri Olsa 提交于
      Do not carry the perf.data file descriptor into the workload process and
      close it when perf executes the workload.
      Signed-off-by: NJiri Olsa <jolsa@kernel.org>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Link: http://lkml.kernel.org/r/20170908084621.31595-2-jolsa@kernel.org
      [ Add definitions for O_CLOEXEC for older systems ]
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      cd6379eb