1. 13 July 2016: 1 commit
    • tools: Introduce str_error_r() · c8b5f2c9
      Authored by Arnaldo Carvalho de Melo
      The tools so far have been using the GNU variant of strerror_r(), which
      returns a string that may be the buffer passed in or something else.
      
      But that, besides being tricky in cases where we expect that the
      function using strerror_r() returns the error formatted in a provided
      buffer (we have to check if it returned something else and copy that
      instead), breaks the build on systems not using glibc, like Alpine
      Linux, where musl libc is used.
      
      So, introduce yet another wrapper, str_error_r(), that has the GNU
      interface but uses the portable XSI variant of strerror_r(), so that
      users can rest assured that the provided buffer is used and is what is
      returned (a minimal sketch follows this entry).
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/n/tip-d4t42fnf48ytlk8rjxs822tf@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      c8b5f2c9
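
      A minimal sketch of such a wrapper, assuming the XSI variant of
      strerror_r() is the one selected at compile time (i.e. _GNU_SOURCE is
      not defined); this is illustrative, not the exact tools/lib code:

       #include <stdio.h>
       #include <string.h>

       /* GNU-style interface (returns a char *) on top of the XSI variant,
        * so the caller's buffer is always the one used and returned. */
       static char *str_error_r(int errnum, char *buf, size_t buflen)
       {
               int err = strerror_r(errnum, buf, buflen); /* XSI: 0 or an error number */

               if (err)
                       snprintf(buf, buflen,
                                "INTERNAL ERROR: strerror_r(%d, %p, %zu)=%d",
                                errnum, buf, buflen, err);
               return buf;
       }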
  2. 13 April 2016: 5 commits
  3. 18 December 2015: 1 commit
  4. 05 November 2015: 1 commit
  5. 27 October 2015: 1 commit
  6. 27 May 2015: 1 commit
    • perf sched: Add option to merge like comms to lat output · 2f80dd44
      Authored by Josef Bacik
      Sometimes when debugging large multi-threaded applications it is helpful
      to collate all of the latency numbers into one bulk record to get an
      idea of what is going on.
      
      This patch does this by merging any entries that belong to the same comm
      into one entry and then printing out those totals (a sketch of the merge
      step follows this entry).

      I've also slightly changed the output so you can see how many threads
      were merged in the processing.  Here is the new default output format:
      
       -----------------------------------------------------------------------------------------------------------
        Task                 | Runtime ms  | Switches | Average delay ms | Maximum delay ms | Maximum delay at    |
       -----------------------------------------------------------------------------------------------------------
        chrome:(23)          |  740.878 ms |     2612 | avg:    0.022 ms | max:    0.845 ms | max at: 7935.254223 s
        pulseaudio:1523      |   94.440 ms |      597 | avg:    0.027 ms | max:    0.110 ms | max at: 7934.668372 s
        threaded-ml:6042     |   72.554 ms |      386 | avg:    0.035 ms | max:    1.186 ms | max at: 7935.330911 s
        Chrome_IOThread:3832 |   52.388 ms |      456 | avg:    0.021 ms | max:    1.365 ms | max at: 7935.330602 s
        Chrome_ChildIOT:(7)  |   50.694 ms |      743 | avg:    0.021 ms | max:    1.448 ms | max at: 7935.256659 s
        Compositor:5510      |   30.012 ms |      192 | avg:    0.019 ms | max:    0.131 ms | max at: 7936.636815 s
        plugin_audio_th:6043 |   24.828 ms |      314 | avg:    0.018 ms | max:    0.143 ms | max at: 7936.205994 s
        CompositorTileW:(2)  |   14.099 ms |       45 | avg:    0.022 ms | max:    0.153 ms | max at: 7937.521800 s
      
      The (#) after the task is the number of threads merged; if no threads
      were merged it just shows the pid.  Here is the same trace file with the
      -p option to print the per-pid latency numbers:
      
       -----------------------------------------------------------------------------------------------------------
        Task                 | Runtime ms  | Switches | Average delay ms | Maximum delay ms | Maximum delay at    |
       -----------------------------------------------------------------------------------------------------------
        chrome:5500          |  386.872 ms |      387 | avg:    0.023 ms | max:    0.241 ms | max at: 7936.001694 s
        pulseaudio:1523      |   94.440 ms |      597 | avg:    0.027 ms | max:    0.110 ms | max at: 7934.668372 s
        threaded-ml:6042     |   72.554 ms |      386 | avg:    0.035 ms | max:    1.186 ms | max at: 7935.330911 s
        chrome:10226         |   69.710 ms |      251 | avg:    0.023 ms | max:    0.764 ms | max at: 7935.992305 s
        chrome:4267          |   64.551 ms |      418 | avg:    0.021 ms | max:    0.294 ms | max at: 7937.862427 s
        chrome:4827          |   62.268 ms |       54 | avg:    0.029 ms | max:    0.666 ms | max at: 7935.992813 s
        Chrome_IOThread:3832 |   52.388 ms |      456 | avg:    0.021 ms | max:    1.365 ms | max at: 7935.330602 s
        chrome:3776          |   46.150 ms |      349 | avg:    0.023 ms | max:    0.845 ms | max at: 7935.254223 s
      Signed-off-by: Josef Bacik <jbacik@fb.com>
      Acked-by: Ingo Molnar <mingo@kernel.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: kernel-team@fb.com
      Link: http://lkml.kernel.org/r/1432300720-30478-1-git-send-email-jbacik@fb.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      2f80dd44
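
      A sketch of the merge step described above, with illustrative names
      (lat_entry, num_merged) rather than the actual builtin-sched.c
      structures:

       #include <stdint.h>

       struct lat_entry {
               const char  *comm;
               uint64_t     runtime_ns;
               uint64_t     total_lat_ns;
               uint64_t     max_lat_ns;
               unsigned int nb_switches;
               unsigned int num_merged;   /* threads folded into this entry */
       };

       /* Fold one per-thread entry into the per-comm total. */
       static void lat_entry__merge(struct lat_entry *into,
                                    const struct lat_entry *from)
       {
               into->runtime_ns   += from->runtime_ns;
               into->total_lat_ns += from->total_lat_ns;
               into->nb_switches  += from->nb_switches;
               into->num_merged++;
               if (from->max_lat_ns > into->max_lat_ns)
                       into->max_lat_ns = from->max_lat_ns;
       }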
  7. 09 May 2015: 1 commit
    • perf machine: Protect the machine->threads with a rwlock · b91fc39f
      Authored by Arnaldo Carvalho de Melo
      In addition to using refcounts for the struct thread lifetime
      management, we need to protect access to machine->threads from
      concurrent access.
      
      That happens in 'perf top', where one thread processes events, inserting
      and deleting entries in that rb_tree, while another thread decays
      hist_entries, which ends up dropping references and ultimately deleting
      threads from the rb_tree and releasing their resources once no further
      hist_entry (or other data structure, like in 'perf sched') references
      them.
      
      So the rule is the same as for refcounts + protected trees in the kernel:
      take the tree lock, find the object, bump its refcount, drop the tree
      lock, return, use the object, then drop the refcount when it is no
      longer needed; keep the reference if the object is stored in some other
      data structure, and drop it when releasing that data structure (a sketch
      of this pattern follows this entry).
      
      I.e. pair "t = machine__find(new)_thread()" with a "thread__put(t)", and
      "perf_event__preprocess_sample(&al)" with "addr_location__put(&al)".
      
      The addr_location__put() one is because as we return references to
      several data structures, we may end up adding more reference counting
      for the other data structures and then we'll drop it at
      addr_location__put() time.
      Acked-by: David Ahern <dsahern@gmail.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Don Zickus <dzickus@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/n/tip-bs9rt4n0jw3hi9f3zxyy3xln@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      b91fc39f
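
      A sketch of the lock + refcount rule spelled out above; the structure
      layout and the list walk are illustrative stand-ins for the real rb_tree
      code in util/machine.c:

       #include <pthread.h>
       #include <stddef.h>
       #include <sys/types.h>

       struct thread {
               pid_t tid;
               int   refcnt;          /* the real code uses a proper refcount */
               struct thread *next;   /* stand-in for the rb_tree linkage     */
       };

       static struct thread *threads;  /* stand-in for machine->threads */
       static pthread_rwlock_t threads_lock = PTHREAD_RWLOCK_INITIALIZER;

       static struct thread *thread__get(struct thread *t)
       {
               if (t)
                       t->refcnt++;   /* called with threads_lock held here */
               return t;
       }

       static struct thread *machine__find_thread(pid_t tid)
       {
               struct thread *t, *found = NULL;

               pthread_rwlock_rdlock(&threads_lock);   /* take the tree lock */
               for (t = threads; t; t = t->next)       /* find the object    */
                       if (t->tid == tid) {
                               found = thread__get(t); /* bump the refcount  */
                               break;
                       }
               pthread_rwlock_unlock(&threads_lock);   /* drop the tree lock */
               return found;  /* caller uses it, then drops it with thread__put() */
       }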
  8. 08 April 2015: 9 commits
    • perf sched replay: Use replay_repeat to calculate the runavg of cpu usage instead of the default value 10 · ff5f3bbd
      Authored by Yunlong Song
      
      Since sched->replay_repeat is set to 10 as default, the sched->run_avg,
      sched->runavg_cpu_usage, and sched->runavg_parent_cpu_usage all use
      10 to calculate their value.
      
      However, replay_repeat can be changed to another value with the -r
      option, so the calculation above should use replay_repeat instead of the
      hard-coded 10 to get accurate results (the running average is sketched
      after this entry).
      Signed-off-by: Yunlong Song <yunlong.song@huawei.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/r/1427809596-29559-10-git-send-email-yunlong.song@huawei.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      ff5f3bbd
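
      A sketch of a running average with a replay_repeat window; the helper
      name update_runavg is illustrative, not the exact builtin-sched.c
      expression:

       /* Fold one repetition's sample into the running average, using the
        * configured replay_repeat instead of a hard-coded 10. */
       static unsigned long update_runavg(unsigned long avg, unsigned long sample,
                                          unsigned long replay_repeat)
       {
               return (avg * (replay_repeat - 1) + sample) / replay_repeat;
       }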
    • perf sched replay: Support using -f to override perf.data file ownership · f0dd330f
      Authored by Yunlong Song
      Enable using perf.data even when it is not owned by the current user or
      root (the ownership check that -f overrides is sketched after this entry).
      
      Example:
      
       $ ls -al perf.data
       -rw------- 1 Yunlong.Song Yunlong.Song 5321918 Mar 25 15:14 perf.data
       $ sudo id
       uid=0(root) gid=0(root) groups=0(root),64(pkcs11)
      
      Before this patch:
      
       $ sudo perf sched replay -f
       run measurement overhead: 98 nsecs
       sleep measurement overhead: 52909 nsecs
       the run test took 1000015 nsecs
       the sleep test took 1054253 nsecs
       File perf.data not owned by current user or root (use -f to override)
      
      As shown above, the -f option does not work at all.
      
      After this patch:
      
       $ sudo perf sched replay -f
       run measurement overhead: 221 nsecs
       sleep measurement overhead: 40514 nsecs
       the run test took 1000003 nsecs
       the sleep test took 1056098 nsecs
       nr_run_events:        10
       nr_sleep_events:      1562
       nr_wakeup_events:     5
       task      0 (                  :1:         1), nr_events: 1
       task      1 (                  :2:         2), nr_events: 1
       task      2 (                  :3:         3), nr_events: 1
       ...
       ...
       task   1549 (             :163132:    163132), nr_events: 1
       task   1550 (             :163540:    163540), nr_events: 1
       task   1551 (           <unknown>:         0), nr_events: 10
       ------------------------------------------------------------
       #1  : 50.198, ravg: 50.20, cpu: 2335.18 / 2335.18
       #2  : 219.099, ravg: 67.09, cpu: 2835.11 / 2385.17
       #3  : 238.626, ravg: 84.24, cpu: 3278.26 / 2474.48
       #4  : 200.364, ravg: 95.85, cpu: 2977.41 / 2524.77
       #5  : 176.882, ravg: 103.96, cpu: 2801.35 / 2552.43
       #6  : 191.093, ravg: 112.67, cpu: 2813.70 / 2578.56
       #7  : 189.448, ravg: 120.35, cpu: 2809.21 / 2601.62
       #8  : 200.637, ravg: 128.38, cpu: 2849.91 / 2626.45
       #9  : 248.338, ravg: 140.37, cpu: 4380.61 / 2801.87
       #10 : 511.139, ravg: 177.45, cpu: 3077.73 / 2829.45
      
      As shown above, the -f option really works now.
      
      Besides replay, the -f option also works for latency and map.
      Signed-off-by: Yunlong Song <yunlong.song@huawei.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/r/1427809596-29559-9-git-send-email-yunlong.song@huawei.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      f0dd330f
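
      A sketch of the ownership check that a -f/--force style flag bypasses;
      the function name and exact message handling are illustrative, not the
      actual perf data-file code:

       #include <stdbool.h>
       #include <stdio.h>
       #include <sys/stat.h>
       #include <unistd.h>

       static int check_data_file(const char *path, bool force)
       {
               struct stat st;

               if (stat(path, &st))
                       return -1;

               /* Refuse files owned by another (non-root) user unless forced. */
               if (!force && st.st_uid && st.st_uid != geteuid()) {
                       fprintf(stderr,
                               "File %s not owned by current user or root (use -f to override)\n",
                               path);
                       return -1;
               }
               return 0;
       }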
    • perf sched replay: Fix the EMFILE error caused by the limitation of the maximum open files · 939cda52
      Authored by Yunlong Song
      The soft maximum number of open files for a calling process is 1024,
      which is defined as INR_OPEN_CUR in include/uapi/linux/fs.h, and the
      hard maximum number of open files for a calling process is 4096, which
      is defined as INR_OPEN_MAX in include/uapi/linux/fs.h.
      
      Both INR_OPEN_CUR and INR_OPEN_MAX are used to limit the value of
      RLIMIT_NOFILE in include/asm-generic/resource.h.
      
      And the soft maximum number finally decides the limitation of the
      maximum files which are allowed to be opened.
      
      That is to say, a process can use at most 1024 file descriptors for its
      opened files, or an EMFILE error will happen.

      This error can be fixed by increasing the soft maximum, under the
      constraint that the soft maximum cannot exceed the hard maximum, or by
      increasing both the soft and the hard maximum simultaneously with
      sufficient privilege.
      
      For perf sched replay, it uses sys_perf_event_open to create the file
      descriptor for each of the tasks in order to handle information of perf
      events.
      
      That is to say, each task needs a unique file descriptor. On x86_64,
      there may be more than 1024 or 4096 tasks corresponding to the records
      in perf.data, so there are not enough file descriptors available.

      As a result, an EMFILE error happens and stops the replay process. To
      solve this problem, we adaptively increase the soft and hard maximum
      number of open files with a '-f' option (sketched after this entry).
      
      Example:
      
      Test environment: x86_64 with 160 cores
      
       $ cat /proc/sys/kernel/pid_max
       163840
       $ cat /proc/sys/fs/file-max
       6815744
       $ ulimit -Sn
       1024
       $ ulimit -Hn
       4096
      
      Before this patch:
      
       $ perf sched replay
       ...
       task   1549 (             :163132:    163132), nr_events: 1
       task   1550 (             :163540:    163540), nr_events: 1
       task   1551 (           <unknown>:         0), nr_events: 10
       Error: sys_perf_event_open() syscall returned with -1 (Too many open
       files)
      
      After this patch:
      
       $ perf sched replay
       ...
       task   1549 (             :163132:    163132), nr_events: 1
       task   1550 (             :163540:    163540), nr_events: 1
       task   1551 (           <unknown>:         0), nr_events: 10
       Error: sys_perf_event_open() syscall returned with -1 (Too many open
       files)
       Have a try with -f option
      
       $ perf sched replay -f
       ...
       task   1549 (             :163132:    163132), nr_events: 1
       task   1550 (             :163540:    163540), nr_events: 1
       task   1551 (           <unknown>:         0), nr_events: 10
       ------------------------------------------------------------
       #1  : 54.401, ravg: 54.40, cpu: 3285.21 / 3285.21
       #2  : 199.548, ravg: 68.92, cpu: 4999.65 / 3456.66
       #3  : 170.483, ravg: 79.07, cpu: 1349.94 / 3245.99
       #4  : 192.034, ravg: 90.37, cpu: 1322.88 / 3053.67
       #5  : 182.929, ravg: 99.62, cpu: 1406.51 / 2888.96
       #6  : 152.974, ravg: 104.96, cpu: 1167.54 / 2716.82
       #7  : 155.579, ravg: 110.02, cpu: 2992.53 / 2744.39
       #8  : 130.557, ravg: 112.08, cpu: 1126.43 / 2582.59
       #9  : 138.520, ravg: 114.72, cpu: 1253.22 / 2449.65
       #10 : 134.328, ravg: 116.68, cpu: 1587.95 / 2363.48
      Signed-off-by: Yunlong Song <yunlong.song@huawei.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/r/1427809596-29559-8-git-send-email-yunlong.song@huawei.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      939cda52
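
      A sketch of the adaptive RLIMIT_NOFILE bump described above; the helper
      name and the increment are illustrative, not the exact builtin-sched.c
      code:

       #include <stdbool.h>
       #include <sys/resource.h>

       static bool bump_nofile_limit(bool force)
       {
               struct rlimit limit;

               if (getrlimit(RLIMIT_NOFILE, &limit))
                       return false;

               if (limit.rlim_cur < limit.rlim_max) {
                       limit.rlim_cur = limit.rlim_max;  /* raise soft up to hard */
               } else if (force) {
                       limit.rlim_cur += 1000;           /* raise both; needs privilege */
                       limit.rlim_max += 1000;
               } else {
                       return false;                     /* caller suggests trying -f */
               }
               return setrlimit(RLIMIT_NOFILE, &limit) == 0;
       }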
    • perf sched replay: Handle the dead halt of sem_wait when create_tasks() fails for any task · 1aff59be
      Authored by Yunlong Song
      There is a sem_wait for each task in wait_for_tasks(), e.g.
      sem_wait(&task->work_done_sem).

      The sem_wait can continue only when work_done_sem is greater than 0;
      otherwise it blocks.
      
      For perf sched replay, one task may sem_post the work_done_sem of
      another task, which causes the work_done_sem of that task processed in a
      reasonable sequence, e.g. sem_post, sem_wait, sem_wait, sem_post...
      
      This sequence simulates the sched process of the running tasks at the
      time when perf sched record runs.
      
      As a result, all the tasks are required and their threads must be
      successfully created.
      
      If any one of the tasks (task A) fails to create its thread, then
      another task (task B), whose work_done_sem needs a sem_post from the
      failed task A, will likely block itself in sem_wait.

      And this is a dead halt, since task B's thread_func cannot continue at
      all.

      To solve this problem, perf sched replay should exit once any task fails
      to create its thread (a sketch follows this entry).
      
      Example:
      
      Test environment: x86_64 with 160 cores
      
      Before this patch:
      
       $ perf sched replay
       ...
       Error: sys_perf_event_open() syscall returned with -1 (Too many open
       files)
       ------------------------------------------------------------    <- dead halt
      
      After this patch:
      
       $ perf sched replay
       ...
       task   1551 (           <unknown>:         0), nr_events: 10
       Error: sys_perf_event_open() syscall returned with -1 (Too many open
       files)
       $
      
      As shown above, perf sched replay finishes the process after printing an
      error message and does not block itself.
      Signed-off-by: Yunlong Song <yunlong.song@huawei.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/r/1427809596-29559-7-git-send-email-yunlong.song@huawei.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      1aff59be
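
      A sketch of bailing out as soon as thread creation fails, so no other
      task is left waiting on a work_done_sem that will never be posted; the
      function and parameter names are illustrative, not the exact
      create_tasks() code:

       #include <pthread.h>
       #include <stdio.h>
       #include <stdlib.h>

       static void create_task_threads(pthread_t *threads, void *(*fn)(void *),
                                       void **args, unsigned long nr_tasks)
       {
               unsigned long i;

               for (i = 0; i < nr_tasks; i++) {
                       int err = pthread_create(&threads[i], NULL, fn, args[i]);

                       if (err) {
                               /* Exit instead of dead-halting in sem_wait later. */
                               fprintf(stderr, "pthread_create failed for task %lu: %d\n",
                                       i, err);
                               exit(EXIT_FAILURE);
                       }
               }
       }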
    • perf sched replay: Fix the segmentation fault problem caused by pr_err in threads · 08097abc
      Authored by Yunlong Song
      The pr_err in self_open_counters() prints its error message to stderr.
      Unlike stdout, stderr uses a memory buffer on the stack of each calling
      process.
      
      The pr_err in self_open_counters() works in a thread called thread_func
      created in function create_tasks, which concurrently creates
      sched->nr_tasks threads.
      
      If the error happens and pr_err prints the error message in each of
      these threads, the stack of the perf process (default is 8192 kbytes)
      will quickly run out and a segmentation fault follows.

      To solve this problem, the pr_err for self_open_counters() should be
      moved out of the newly created threads and into the main thread of the
      perf process. Then pr_err can work reliably, without the strange
      segmentation fault problem (a sketch of this split follows this entry).
      
      Example:
      
      Test environment: x86_64 with 160 cores
      
      Before this patch:
      
       $ perf sched replay
       ...
       task   1549 (             :163132:    163132), nr_events: 1
       task   1550 (             :163540:    163540), nr_events: 1
       task   1551 (           <unknown>:         0), nr_events: 10
       Segmentation fault
      
      After this patch:
      
       $ perf sched replay
       ...
       task   1549 (             :163132:    163132), nr_events: 1
       task   1550 (             :163540:    163540), nr_events: 1
       task   1551 (           <unknown>:         0), nr_events: 10
       ...
      
      As shown above, the result continues without any segmentation fault.
      Signed-off-by: Yunlong Song <yunlong.song@huawei.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/r/1427809596-29559-6-git-send-email-yunlong.song@huawei.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      08097abc
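
      A sketch of recording the failure in the worker thread and printing it
      from the main thread instead; the struct and function names are
      illustrative, not the exact builtin-sched.c change:

       #include <stdio.h>

       struct task_ctx {
               int open_err;   /* set by the worker instead of printing there */
       };

       static void *thread_func(void *arg)
       {
               struct task_ctx *ctx = arg;

               /* ... try to open the per-task counter ... */
               ctx->open_err = -1;   /* remember the failure, do not print here */
               return NULL;
       }

       static void report_task_error(struct task_ctx *ctx, unsigned long nr)
       {
               /* Runs in the main thread, where printing is safe. */
               if (ctx->open_err)
                       fprintf(stderr, "task %lu: sys_perf_event_open() failed\n", nr);
       }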
    • perf sched replay: Realloc the memory of pid_to_task stepwise to adapt to the different pid_max configurations · 3a423a5c
      Authored by Yunlong Song
      
      Although the memory of pid_to_task can be allocated via calloc according
      to the value of /proc/sys/kernel/pid_max, it cannot handle the case when
      pid_max is changed after 'perf sched record' has created its perf.data.
      
      If the pid_max configured when 'perf sched replay' runs is smaller than
      the pid_max configured when 'perf sched record' ran, it causes an
      assertion failure.

      To solve this problem, we realloc the memory of pid_to_task stepwise
      whenever the pid passed to register_pid() is larger than the current
      pid_max (a sketch follows this entry).
      
      Example:
      
      Test environment: x86_64 with 160 cores
      
       $ cat /proc/sys/kernel/pid_max
       163840
       $ perf sched record ls
       $ echo 5000 > /proc/sys/kernel/pid_max
       $ cat /proc/sys/kernel/pid_max
       5000
      
      Before this patch:
      
       $ perf sched replay
       run measurement overhead: 221 nsecs
       sleep measurement overhead: 55356 nsecs
       the run test took 1000011 nsecs
       the sleep test took 1060940 nsecs
       perf: builtin-sched.c:337: register_pid: Assertion `!(pid >= (unsigned
       long)pid_max)' failed.
       Aborted
      
      After this patch:
      
       $ perf sched replay
       run measurement overhead: 221 nsecs
       sleep measurement overhead: 55611 nsecs
       the run test took 1000026 nsecs
       the sleep test took 1060486 nsecs
       nr_run_events:        10
       nr_sleep_events:      1562
       nr_wakeup_events:     5
       task      0 (                  :1:         1), nr_events: 1
       task      1 (                  :2:         2), nr_events: 1
       task      2 (                  :3:         3), nr_events: 1
       task      3 (                  :5:         5), nr_events: 1
       ...
      Signed-off-by: Yunlong Song <yunlong.song@huawei.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/r/1427809596-29559-5-git-send-email-yunlong.song@huawei.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      3a423a5c
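
      A sketch of the stepwise grow on the register_pid() path; the step size
      and helper name are illustrative, not the exact builtin-sched.c code:

       #include <stdlib.h>
       #include <string.h>

       struct task_desc;

       static struct task_desc **pid_to_task;
       static unsigned long pid_max;

       static void grow_pid_to_task(unsigned long pid)
       {
               unsigned long new_max = pid_max;
               void *tmp;

               if (pid < pid_max)
                       return;

               while (pid >= new_max)
                       new_max += 10000;                 /* grow in steps */

               tmp = realloc(pid_to_task, new_max * sizeof(*pid_to_task));
               if (!tmp)
                       return;                           /* error handling elided */
               pid_to_task = tmp;
               memset(pid_to_task + pid_max, 0,          /* zero the new slots */
                      (new_max - pid_max) * sizeof(*pid_to_task));
               pid_max = new_max;
       }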
    • perf sched replay: Alloc the memory of pid_to_task dynamically to adapt to the unexpected change of pid_max · cb06ac25
      Authored by Yunlong Song
      
      The memory for struct task_desc *pid_to_task[MAX_PID] is currently
      allocated statically with a preset size, which has two problems:
      
      Problem 1: If the pid_max, which is the max number of pids in the
      system, is much smaller than MAX_PID (1024*1000), then it causes a waste
      of stack memory. This may happen in the case where the number of cpu
      cores is much smaller than 1000.
      
      Problem 2: If pid_max is changed from the default value to a value
      larger than MAX_PID, it causes an assertion failure. The maximum value
      of pid_max can be set to pid_max_max (see pidmap_init defined in
      kernel/pid.c), which equals PID_MAX_LIMIT. On x86_64, PID_MAX_LIMIT is
      4*1024*1024 (defined in include/linux/threads.h). This value is much
      larger than MAX_PID, and would take up 32768 Kbytes
      (4*1024*1024*8/1024) for the pid_to_task array, which is much larger
      than the default 8192 Kbyte stack size of the calling process.

      Due to these two problems, we use calloc to allocate the memory of
      pid_to_task dynamically (a sketch follows this entry).
      
      Example:
      
      Test environment: x86_64 with 160 cores
      
       $ cat /proc/sys/kernel/pid_max
       163840
       $ echo 1025000 > /proc/sys/kernel/pid_max
       $ cat /proc/sys/kernel/pid_max
       1025000
      
      Run some applications until the pid of some process is greater than
      the value of MAX_PID (1024*1000).
      
      Before this patch:
      
       $ perf sched replay
       run measurement overhead: 221 nsecs
       sleep measurement overhead: 55480 nsecs
       the run test took 1000008 nsecs
       the sleep test took 1063151 nsecs
       perf: builtin-sched.c:330: register_pid: Assertion `!(pid >= 1024000)'
       failed.
       Aborted
      
      After this patch:
      
       $ perf sched replay
       run measurement overhead: 221 nsecs
       sleep measurement overhead: 55435 nsecs
       the run test took 1000004 nsecs
       the sleep test took 1059312 nsecs
       nr_run_events:        10
       nr_sleep_events:      1562
       nr_wakeup_events:     5
       task      0 (                  :1:         1), nr_events: 1
       task      1 (                  :2:         2), nr_events: 1
       task      2 (                  :3:         3), nr_events: 1
       task      3 (                  :5:         5), nr_events: 1
       ...
      Signed-off-by: Yunlong Song <yunlong.song@huawei.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/r/1427809596-29559-4-git-send-email-yunlong.song@huawei.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      cb06ac25
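
      A sketch of sizing pid_to_task from /proc/sys/kernel/pid_max instead of
      a fixed MAX_PID; the fallback value and helper name are illustrative,
      not the exact builtin-sched.c code:

       #include <stdio.h>
       #include <stdlib.h>

       struct task_desc;

       static struct task_desc **alloc_pid_to_task(unsigned long *pid_max_out)
       {
               unsigned long pid_max = 0;
               FILE *f = fopen("/proc/sys/kernel/pid_max", "r");

               if (f) {
                       if (fscanf(f, "%lu", &pid_max) != 1)
                               pid_max = 0;
                       fclose(f);
               }
               if (!pid_max)
                       pid_max = 1024 * 1000;   /* fall back to the old MAX_PID */

               *pid_max_out = pid_max;
               return calloc(pid_max, sizeof(struct task_desc *));
       }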
    • perf sched replay: Increase the MAX_PID value to fix assertion failure problem · a35e27d0
      Authored by Yunlong Song
      The current MAX_PID is only 65536, which causes an assertion failure
      when there are more than 64 CPU cores on x86_64.
      
      This is because the pid_max value in x86_64 is at least
      PIDS_PER_CPU_DEFAULT * num_possible_cpus() (see function pidmap_init
      defined in kernel/pid.c), where PIDS_PER_CPU_DEFAULT is 1024 (defined in
      include/linux/threads.h).
      
      Thus for MAX_PID = 65536, the corresponding number of CPU cores is
      65536/1024=64.  This is obviously not enough for x86_64, and causes an
      assertion failure due to BUG_ON(pid >= MAX_PID) in the code.
      
      We increase MAX_PID value from 65536 to 1024*1000, which can be used in
      x86_64 with 1000 cores.
      
      This number was finally decided according to the stack size limit of the
      calling process.

      'ulimit -a' shows that the stack size of any process is 8192 Kbytes,
      which is defined in include/uapi/linux/resource.h (#define
      _STK_LIM (8*1024*1024)).

      Thus we choose a value for MAX_PID that is large enough yet still fits
      within the stack size limit, i.e., the perf process takes up a memory
      space just smaller than 8192 Kbytes (the arithmetic is sketched after
      this entry).

      We have calculated and tested that 1024*1000 is OK for MAX_PID.
      
      This means perf sched replay can now be used with at most 1000 cores in
      x86_64 without any assertion failure problem.
      
      Example:
      
      Test environment: x86_64 with 160 cores
      
       $ cat /proc/sys/kernel/pid_max
       163840
      
      Before this patch:
      
       $ perf sched replay
       run measurement overhead: 240 nsecs
       sleep measurement overhead: 55379 nsecs
       the run test took 1000004 nsecs
       the sleep test took 1059424 nsecs
       perf: builtin-sched.c:330: register_pid: Assertion `!(pid >= 65536)'
       failed.
       Aborted
      
      After this patch:
      
       $ perf sched replay
       run measurement overhead: 221 nsecs
       sleep measurement overhead: 55397 nsecs
       the run test took 999920 nsecs
       the sleep test took 1053313 nsecs
       nr_run_events:        10
       nr_sleep_events:      1562
       nr_wakeup_events:     5
       task      0 (                  :1:         1), nr_events: 1
       task      1 (                  :2:         2), nr_events: 1
       task      2 (                  :3:         3), nr_events: 1
       task      3 (                  :5:         5), nr_events: 1
       ...
      Signed-off-by: Yunlong Song <yunlong.song@huawei.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/r/1427809596-29559-3-git-send-email-yunlong.song@huawei.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      a35e27d0
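
      A back-of-the-envelope check of the numbers above (illustrative only):

       #include <stdio.h>

       int main(void)
       {
               unsigned long max_pid  = 1024UL * 1000;   /* new MAX_PID            */
               unsigned long ptr_size = 8;               /* pointer size on x86_64 */

               /* pid_to_task: 1024*1000 * 8 bytes = 8000 Kbytes, just under the
                * default 8192 Kbyte stack limit (_STK_LIM). */
               printf("pid_to_task size: %lu Kbytes\n", max_pid * ptr_size / 1024);

               /* With PIDS_PER_CPU_DEFAULT = 1024, the old 65536 limit covers
                * 65536/1024 = 64 cores; the new limit covers 1000. */
               printf("cores covered, old: %lu new: %lu\n",
                      65536UL / 1024, max_pid / 1024);
               return 0;
       }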
    • perf sched replay: Use struct task_desc instead of struct task_task for correct meaning · 0755bc4d
      Authored by Yunlong Song
      There is no struct task_task at all; it is a typo from the old commits.
      Fix it to what it should be in order to avoid unnecessary
      misunderstanding.
      Signed-off-by: Yunlong Song <yunlong.song@huawei.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/r/1427809596-29559-2-git-send-email-yunlong.song@huawei.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      0755bc4d
  9. 11 March 2015: 1 commit
  10. 03 March 2015: 2 commits
    • perf sched: No need to keep the session around · ae536acf
      Authored by Arnaldo Carvalho de Melo
      We were keeping the session around just because we kept pointers to
      struct thread instances, but now we reference count them, so there is no
      need to defer the perf_session__delete() call until after we traverse
      the work_list entries.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Don Zickus <dzickus@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/n/tip-9agtck6jdr3rebdp39z1lo0e@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      ae536acf
    • perf tools: Reference count struct thread · f3b623b8
      Authored by Arnaldo Carvalho de Melo
      We need to do that to stop accumulating entries in the dead_threads
      linked list, i.e. we were keeping references to threads in struct hists
      that continue to exist even after a thread exited and was removed from
      the machine threads rbtree.
      
      We still keep the dead_threads list, but just for debugging, allowing us
      to iterate at any given point over the threads that are still referenced
      by things like struct hist_entry (the get/put pair is sketched after
      this entry).
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Don Zickus <dzickus@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/n/tip-3ejvfyed0r7ue61dkurzjux4@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      f3b623b8
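
      A sketch of the get/put pair implied above; the refcount is shown as a
      plain int for brevity, while the real util/thread.c code uses a proper
      atomic refcount and also releases the thread's other resources:

       #include <stdlib.h>

       struct thread {
               int refcnt;
               /* ... tid, pid_, comm, maps ... */
       };

       static void thread__delete(struct thread *t)
       {
               free(t);
       }

       static struct thread *thread__get(struct thread *t)
       {
               if (t)
                       t->refcnt++;
               return t;
       }

       static void thread__put(struct thread *t)
       {
               if (t && --t->refcnt == 0)
                       thread__delete(t);
       }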
  11. 23 February 2015: 1 commit
  12. 09 October 2014: 1 commit
  13. 16 August 2014: 1 commit
  14. 14 August 2014: 2 commits
  15. 12 August 2014: 1 commit
  16. 18 July 2014: 1 commit
  17. 17 July 2014: 1 commit
  18. 01 June 2014: 1 commit
  19. 16 May 2014: 2 commits
  20. 13 May 2014: 1 commit
  21. 12 May 2014: 3 commits
  22. 16 April 2014: 1 commit
  23. 19 March 2014: 1 commit
    • perf sched: Fixup header alignment in 'latency' output · 80790e0b
      Authored by Ramkumar Ramachandra
      Before:
      
       ---------------------------------------------------------------------------------------------------------------
        Task                  |   Runtime ms  | Switches | Average delay ms | Maximum delay ms | Maximum delay at     |
       ---------------------------------------------------------------------------------------------------------------
        ...                   |               |          |                  |                  |
        git:24540             |    336.622 ms |       10 | avg:    0.032 ms | max:    0.062 ms | max at: 115610.111046 s
        git:24541             |      0.457 ms |        1 | avg:    0.000 ms | max:    0.000 ms | max at:  0.000000 s
       -----------------------------------------------------------------------------------------
        TOTAL:                |    396.542 ms |      353 |
       ---------------------------------------------------
      
      After:
      
       -----------------------------------------------------------------------------------------------------------------
        Task                  |   Runtime ms  | Switches | Average delay ms | Maximum delay ms | Maximum delay at       |
       -----------------------------------------------------------------------------------------------------------------
        ...                   |               |          |                  |                  |
        git:24540             |    336.622 ms |       10 | avg:    0.032 ms | max:    0.062 ms | max at: 115610.111046 s
        git:24541             |      0.457 ms |        1 | avg:    0.000 ms | max:    0.000 ms | max at:      0.000000 s
       -----------------------------------------------------------------------------------------------------------------
        TOTAL:                |    396.542 ms |      353 |
       ---------------------------------------------------
      Signed-off-by: Ramkumar Ramachandra <artagnon@gmail.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Link: http://lkml.kernel.org/r/1395065901-25740-1-git-send-email-artagnon@gmail.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      80790e0b