1. 18 Sep, 2015 3 commits
    • A
      tools build: Add test for presence of __get_cpuid() gcc builtin · b0063dbf
      Arnaldo Carvalho de Melo committed
      The auxtrace code needed by Intel PT uses the __get_cpuid() gcc builtin,
      which is not present on old systems, breaking the build.
      
      Add a test to check for that builtin and disable AUXTRACE in those
      systems.
      
        [acme@rhel5 linux]$  make NO_LIBPERL=1 -C tools/perf O=/tmp/build/perf install-bin
        make: Entering directory `/home/acme/git/linux/tools/perf'
          BUILD:   Doing 'make -j2' parallel build
      
        Auto-detecting system features:
        <SNIP>
        ...                          lzma: [ on  ]
        ...                     get_cpuid: [ OFF ]
        <SNIP>
        config/Makefile:630: Your gcc lacks the __get_cpuid() builtin, disables support for auxtrace/Intel PT, please install a newer gcc
          MKDIR    /tmp/build/perf/util/
        <SNIP>
      
      This fixes the build on old systems such as RHEL/CentOS 5.11.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: "Naveen N. Rao" <naveen.n.rao@linux.vnet.ibm.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Victor Kamensky <victor.kamensky@linaro.org>
      Cc: Vinson Lee <vlee@twopensource.com>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/n/tip-d4puslul0jltoodzpx9r4sje@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      b0063dbf
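A feature test of this kind can be very small. The sketch below is a hypothetical stand-alone probe, not the actual file under tools/build: the function name probe_get_cpuid() and the non-x86 fallback are assumptions made for illustration. The idea is that if a translation unit like this fails to compile because <cpuid.h> or __get_cpuid() is missing, the build can mark the feature as OFF and disable the code that depends on it.

```c
/* Hypothetical probe; names are illustrative, not taken from the patch. */
#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>

/* Returns 1 if CPUID leaf 0 could be queried, 0 otherwise. */
int probe_get_cpuid(void)
{
	unsigned int eax = 0, ebx = 0, ecx = 0, edx = 0;

	return __get_cpuid(0, &eax, &ebx, &ecx, &edx);
}
#else
/* Assumption: on non-x86 targets the feature is simply reported absent. */
int probe_get_cpuid(void)
{
	return 0;
}
#endif
```

In a real build the interesting signal is whether the file compiles at all, not the value it returns at run time.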
    • A
      tools build: Add test for presence of numa_num_possible_cpus() in libnuma · f8ac8606
      Arnaldo Carvalho de Melo committed
      The existing numa test checks only if numa.h and numa_available() are
      present, but that can be satisfied with an old libnuma that is not
      enough for the 'perf bench numa' entry, so add a test to check for that:
      
        [acme@rhel5 linux]$  make NO_AUXTRACE=1 NO_LIBPERL=1 -C tools/perf O=/tmp/build/perf install-bin
        make: Entering directory `/home/acme/git/linux/tools/perf'
          BUILD:   Doing 'make -j2' parallel build
      
        Auto-detecting system features:
        ...                        libelf: [ on  ]
        ...                       libnuma: [ on  ]
        ...        numa_num_possible_cpus: [ OFF ]
        ...                       libperl: [ on  ]
      
        <SNIP>
        config/Makefile:577: Old numa library found, disables 'perf bench numa mem' benchmark, please install numactl-devel/libnuma-devel/libnuma-dev >= 2.0.8
          INSTALL  binaries
        <SNIP>
      
      This fixes the build on old systems such as RHEL/CentOS 5.11.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: "Naveen N. Rao" <naveen.n.rao@linux.vnet.ibm.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Victor Kamensky <victor.kamensky@linaro.org>
      Cc: Vinson Lee <vlee@twopensource.com>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/n/tip-zqriqkezppi2de2iyjin1tnc@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      f8ac8606
    • A
      Revert "perf symbols: Fix mismatched declarations for elf_getphdrnum" · 179f36dd
      Arnaldo Carvalho de Melo committed
      This reverts commit f785f235.
      
      We have a test to check if elf_getphdrnum() is present, so, if it fails,
      we'll get:
      
        [acme@rhel5 linux]$ cat /tmp/build/perf/feature/test-libelf-getphdrnum.make.output
        cc1: warnings being treated as errors
        test-libelf-getphdrnum.c: In function ‘main’:
        test-libelf-getphdrnum.c:7: warning: implicit declaration of function ‘elf_getphdrnum’
        [acme@rhel5 linux]$
      
      And this block will not be compiled:
      
        #ifndef HAVE_ELF_GETPHDRNUM_SUPPORT
        static int elf_getphdrnum(Elf *elf, size_t *dst)
        ...
        #endif
      
      So, if elf_getphdrnum() is being defined somewhere, there is a problem
      with the test that is not detecting that function, go fix it.
      Reported-by: Vinson Lee <vlee@twopensource.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: "Naveen N. Rao" <naveen.n.rao@linux.vnet.ibm.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Victor Kamensky <victor.kamensky@linaro.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/n/tip-qn459fal6acvcvm50i8zxx9k@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      179f36dd
  2. 17 Sep, 2015 1 commit
    • S
      perf stat: Fix per-pkg event reporting bug · 02d8dabc
      Stephane Eranian committed
      Per-pkg events need to be captured once per processor socket. The code
      in check_per_pkg() ensures only one value per processor package is used.
      However there is a problem with this function in case the first CPU of
      the package does not measure anything for the per-pkg event, but other
      CPUs do.
      
      Consider the following:
      
        $ create cgroup FOO; echo $$ >FOO/tasks; taskset -c 1 noploop &
        $ perf stat -a -I 1000 -e intel_cqm/llc_occupancy/ -G FOO sleep 100
          1.00000 <not counted> Bytes intel_cqm/llc_occupancy/  FOO
      
      The reason for this is that CPU0 in the cgroup has nothing running on it.
      Yet check_per_pkg() will mark socket0 as processed and no other event
      value will be considered for the socket.
      
      This patch fixes the problem by having check_per_pkg() only consider
      events which actually ran.
      Signed-off-by: Stephane Eranian <eranian@google.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1441286620-10117-1-git-send-email-eranian@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      02d8dabc
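The rule described in this commit can be sketched as follows. This is a simplified illustration under stated assumptions, not the code from the patch: the struct layout, the pkg_seen[] array and the function name use_per_pkg_value() are hypothetical. It shows one way to make sure a CPU whose event never ran cannot claim its package.

```c
#include <stdbool.h>
#include <stdint.h>

/* Simplified stand-in for one per-CPU counter reading (assumed layout). */
struct counts_values {
	uint64_t val; /* raw counter value */
	uint64_t ena; /* time the event was enabled */
	uint64_t run; /* time the event was actually running */
};

/*
 * Returns true if the reading from a CPU in package 'pkg' should be
 * counted. pkg_seen[] remembers which packages already contributed.
 */
bool use_per_pkg_value(bool pkg_seen[], int nr_pkgs, int pkg,
		       const struct counts_values *vals)
{
	if (pkg < 0 || pkg >= nr_pkgs)
		return false;

	/*
	 * An event that never ran must not mark the package as used,
	 * otherwise an idle first CPU hides the other CPUs' readings.
	 */
	if (!(vals->run && vals->ena))
		return false;

	if (pkg_seen[pkg])
		return false;

	pkg_seen[pkg] = true;
	return true;
}
```

With this ordering of checks, an idle CPU0 leaves socket0 unclaimed, so the first CPU in that socket that actually measured something supplies the value.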
  3. 16 Sep, 2015 1 commit
    • I
      Merge tag 'perf-urgent-for-mingo' of... · f6cf87f7
      Ingo Molnar committed
      Merge tag 'perf-urgent-for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/urgent
      
      Pull perf/urgent fixes from Arnaldo Carvalho de Melo:
      
      - Fix segfault pressing -> in 'perf top' with no hist entries. (Wang Nan)
      
         E.g:
      	perf top -e page-faults --pid 11400 # 11400 generates no page-fault
      
      - Fix propagation of thread and cpu maps, which got broken by incomplete
        changes to better support events with a PMU cpu mask, causing Intel PT to
        fail with an error like:
      
          $ perf record -e intel_pt//u uname
          Error: The sys_perf_event_open() syscall returned with
                    22 (Invalid argument) for event (sched:sched_switch).
      
        Because intel_pt adds that sched:sched_switch evsel to the evlist after the
        thread/cpu maps were propagated to the evsels, fix it. (Adrian Hunter)
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      f6cf87f7
  4. 15 Sep, 2015 15 commits
  5. 14 Sep, 2015 1 commit
  6. 13 Sep, 2015 2 commits
    • A
      perf header: Fixup reading of HEADER_NRCPUS feature · caa47047
      Arnaldo Carvalho de Melo committed
      The original patch introducing this header wrote the number of CPUs available
      and online in one order and then swapped those values when reading, fix it.
      
      Before:
      
        # perf record usleep 1
        # perf report --header-only | grep 'nrcpus \(online\|avail\)'
        # nrcpus online : 4
        # nrcpus avail : 4
        # echo 0 > /sys/devices/system/cpu/cpu2/online
        # perf record usleep 1
        # perf report --header-only | grep 'nrcpus \(online\|avail\)'
        # nrcpus online : 4
        # nrcpus avail : 3
        # echo 0 > /sys/devices/system/cpu/cpu1/online
        # perf record usleep 1
        # perf report --header-only | grep 'nrcpus \(online\|avail\)'
        # nrcpus online : 4
        # nrcpus avail : 2
      
      After the fix, bringing back the CPUs online:
      
        # perf report --header-only | grep 'nrcpus \(online\|avail\)'
        # nrcpus online : 2
        # nrcpus avail : 4
        # echo 1 > /sys/devices/system/cpu/cpu2/online
        # perf record usleep 1
        # perf report --header-only | grep 'nrcpus \(online\|avail\)'
        # nrcpus online : 3
        # nrcpus avail : 4
        # echo 1 > /sys/devices/system/cpu/cpu1/online
        # perf record usleep 1
        # perf report --header-only | grep 'nrcpus \(online\|avail\)'
        # nrcpus online : 4
        # nrcpus avail : 4
      Acked-by: Namhyung Kim <namhyung@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Wang Nan <wangnan0@huawei.com>
      Fixes: fbe96f29 ("perf tools: Make perf.data more self-descriptive (v8)")
      Link: http://lkml.kernel.org/r/20150911153323.GP23511@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      caa47047
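The class of bug fixed here, a writer and a reader that disagree on field order, can be illustrated with a minimal sketch. Everything below is an assumption for illustration: the struct nrcpus type, the function names and the choice of "available first, then online" are not taken from the perf source. The point is only that both sides must use the same order.

```c
#include <stdint.h>
#include <string.h>

#define NRCPUS_RECORD_SIZE (2 * sizeof(uint32_t))

/* Hypothetical in-memory form of the two CPU counts. */
struct nrcpus {
	uint32_t avail;  /* CPUs available on the system */
	uint32_t online; /* CPUs online when the record was written */
};

/* Writer: 'avail' first, then 'online' (assumed order). */
void write_nrcpus(unsigned char *buf, const struct nrcpus *n)
{
	memcpy(buf, &n->avail, sizeof(n->avail));
	memcpy(buf + sizeof(n->avail), &n->online, sizeof(n->online));
}

/* Reader: must consume the fields in exactly the writer's order. */
void read_nrcpus(const unsigned char *buf, struct nrcpus *n)
{
	memcpy(&n->avail, buf, sizeof(n->avail));
	memcpy(&n->online, buf + sizeof(n->avail), sizeof(n->online));
}
```

Swapping the two memcpy() destinations in the reader reproduces the symptom shown in the "Before" output, where the online and available counts appear exchanged.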
    • P
      perf/x86/intel: Fix constraint access · ebfb4988
      Peter Zijlstra committed
      Sasha reported that we can get here with .idx==-1, and
      cpuc->event_constraints unallocated.
      Suggested-by: Stephane Eranian <eranian@google.com>
      Reported-by: Sasha Levin <sasha.levin@oracle.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: <stable@vger.kernel.org>
      Fixes: b371b594 ("perf/x86: Fix event/group validation")
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      ebfb4988
  7. 11 Sep, 2015 1 commit
  8. 04 Sep, 2015 1 commit
    • I
      Merge tag 'perf-urgent-for-mingo' of... · 21adf76e
      Ingo Molnar committed
      Merge tag 'perf-urgent-for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/urgent
      
      Pull perf/urgent fixes from Arnaldo Carvalho de Melo:
      
        - In some cases where perf_event.fork.{pid,tid} should be used we were instead
          using perf_event.comm.{pid,tid}, which is not a problem for the 'pid'
          case, that sits in the same place in these 'union perf_event' members, but
          comm.tid sits where fork.ppid is, oops.
      
          These cases were considered as (potentially) problematic:
      
           - 'perf script' with !sample_id_all, i.e. only on old kernels without
              perf_event_attr.sample_id_all.
      
           - intel_pt could be affected when decoding without timestamps, as the exit
             event is only used to flush out data which anyway gets flushed at the
             end of the session.
      
           - intel_bts also uses the exit event to flush data which would probably not
             cause errors as it would get flushed at the end of the session instead.
      
          Fix it. (Adrian Hunter)
      
        - Due to relaxing the compiler checks for bison generated files, we missed
          updating one parse_events_add_pmu() caller when this function had its
          prototype changed, fix it. (Jiri Olsa)
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      21adf76e
  9. 03 Sep, 2015 1 commit
    • A
      perf tools: Fix use of wrong event when processing exit events · 53ff6bc3
      Adrian Hunter committed
      In a couple of cases the 'comm' member of 'union event' has been used
      instead of the correct member ('fork') when processing exit events.
      
      In the cases where it has been used incorrectly, only the 'pid' and
      'tid' are affected.  The 'pid' value would be correct anyway because it
      is in the same position in 'comm' and 'fork' events, but the 'tid' would
      have been incorrectly assigned from 'ppid'.
      
      However, for exit events, the kernel puts the current task in the 'ppid'
      and 'ptid' which is the same as the exiting task.  That is 'ppid' ==
      'pid' and if the task is not multi-threaded, 'pid' == 'tid' i.e. the
      data goes wrong only when tracing multi-threaded programs.
      
      It is hard to find an example of how this would produce an error in
      practice.  There are 3 occurrences of the fix:
      
      1. perf script is only affected if !sample_id_all which only happens on
        old kernels.
      
      2. intel_pt is only affected when decoding without timestamps
         and would probably still decode correctly - the exit event is
         only used to flush out data which anyway gets flushed at the
         end of the session
      
      3. intel_bts also uses the exit event to flush data which
         would probably not cause errors as it would get flushed at
         the end of the session instead
      Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Link: http://lkml.kernel.org/r/1439888825-27708-1-git-send-email-adrian.hunter@intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      53ff6bc3
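The aliasing described in this commit can be made concrete with a small sketch. The structs below are simplified assumptions (the common event header and other fields are omitted, and the accessor names are hypothetical); they only keep the field order the message relies on, so that 'tid' in the comm layout overlaps 'ppid' in the fork layout.

```c
#include <stddef.h>
#include <stdint.h>

/* Simplified layouts; the real events also carry a header. */
struct comm_event {
	uint32_t pid, tid;
	char comm[16];
};

struct fork_event {
	uint32_t pid, ppid;
	uint32_t tid, ptid;
};

union event {
	struct comm_event comm;
	struct fork_event fork;
};

/* Wrong member for an exit event: this reads the slot holding fork.ppid. */
uint32_t exit_tid_via_comm(const union event *ev)
{
	return ev->comm.tid;
}

/* Correct member for an exit event. */
uint32_t exit_tid_via_fork(const union event *ev)
{
	return ev->fork.tid;
}
```

For an exit event from thread 101 of process 100, the kernel fills in pid 100, ppid 100, tid 101 and ptid 101. The wrong accessor then yields 100 while the correct one yields 101, which is why only multi-threaded programs are affected.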
  10. 02 Sep, 2015 4 commits
  11. 01 Sep, 2015 10 commits
    • W
      perf dwarf: Fix potential array out of bounds access · 3b27d139
      Wang Nan committed
      There is a problem in the dwarf-regs.c files for sh, sparc and x86 where
      it is possible to make an out-of-bounds array access when searching for
      register names.
      
      This patch fixes it by replacing '<=' with '<', so that when the register
      number equals XXX_MAX_REGS, get_arch_regstr() returns NULL.
      Signed-off-by: Wang Nan <wangnan0@huawei.com>
      Reviewed-by: Matt Fleming <matt@console-pimps.org>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Acked-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Zefan Li <lizefan@huawei.com>
      Cc: pi3orama@huawei.com
      Link: http://lkml.kernel.org/r/1441078184-105038-1-git-send-email-wangnan0@huawei.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      3b27d139
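The off-by-one described above follows a familiar pattern. The sketch below is illustrative only: DEMO_MAX_REGS, the table contents and get_demo_regstr() are hypothetical stand-ins for the per-architecture tables and get_arch_regstr(). It shows the corrected comparison, where an index equal to the table size is rejected instead of being used to read one element past the end.

```c
#include <stddef.h>

#define DEMO_MAX_REGS 4 /* hypothetical size; real tables are per arch */

static const char *const demo_regs_table[DEMO_MAX_REGS] = {
	"%ax", "%dx", "%cx", "%bx",
};

/*
 * Return the register name for DWARF register number n, or NULL if n
 * is out of range. Using '<=' here would allow n == DEMO_MAX_REGS and
 * read past the end of the table.
 */
const char *get_demo_regstr(unsigned int n)
{
	return (n < DEMO_MAX_REGS) ? demo_regs_table[n] : NULL;
}
```

Valid indices run from 0 to DEMO_MAX_REGS - 1, so the strict comparison is the one that matches the array bounds.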
    • I
      Merge tag 'perf-core-for-mingo' of... · 53202661
      Ingo Molnar committed
      Merge tag 'perf-core-for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/urgent
      
      Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo:
      
      User visible changes:
      
        - Add ability to select which registers to record,
          to reduce the size of perf.data files, and also allow printing
          the registers in 'perf script': (Stephane Eranian)
      
            # perf record --intr-regs=AX,SP usleep 1
            [ perf record: Woken up 1 times to write data ]
            [ perf record: Captured and wrote 0.016 MB perf.data (8 samples) ]
            # perf script -F ip,sym,iregs | tail -5
             ffffffff8105f42a native_write_msr_safe   AX:0xf    SP:0xffff8802629c3c00
             ffffffff8105f42a native_write_msr_safe   AX:0xf    SP:0xffff8802629c3c00
             ffffffff81761ac0 _raw_spin_lock   AX:0xffff8801bfcf8020    SP:0xffff8802629c3ce8
             ffffffff81202bf8 __vma_adjust_trans_huge   AX:0x7ffc75200000    SP:0xffff8802629c3b30
             ffffffff8122b089 dput   AX:0x101    SP:0xffff8802629c3c78
            #
      
      Infrastructure changes:
      
        - Open event on evsel cpus and threads. (Kan Liang)
      
        - Add new bpf API to get name from a BPF object. (Wang Nan)
      
      Build fixes:
      
        - Fix build on powerpc broken by pt/bts. (Adrian Hunter)
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      53202661
    • L
      Merge branch 'timers-nohz-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 65a99597
      Linus Torvalds committed
      Pull NOHZ updates from Ingo Molnar:
       "The main changes, mostly written by Frederic Weisbecker, include:
      
         - Fix some jiffies based cputime assumptions.  (No real harm because
           the concerned code isn't used by full dynticks.)
      
         - Simplify jiffies <-> usecs conversions.  Remove dead code.
      
         - Remove early hacks on nohz full code that avoided messing up idle
           nohz internals.  Now that nohz full and idle integrate well, such
           hacks are no longer needed.
      
         - Restart nohz full tick from irq exit.  (A simplification and a
           preparation for future optimization on scheduler kick to nohz
           full)
      
         - Code cleanups.
      
         - Tile driver isolation enhancement on top of nohz.  (Chris Metcalf)"
      
      * 'timers-nohz-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        nohz: Remove useless argument on tick_nohz_task_switch()
        nohz: Move tick_nohz_restart_sched_tick() above its users
        nohz: Restart nohz full tick from irq exit
        nohz: Remove idle task special case
        nohz: Prevent tilegx network driver interrupts
        alpha: Fix jiffies based cputime assumption
        apm32: Fix cputime == jiffies assumption
        jiffies: Remove HZ > USEC_PER_SEC special case
      65a99597
    • L
      Merge branch 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 418c2e1f
      Linus Torvalds committed
      Pull scheduler fix from Ingo Molnar:
       "This is a leftover scheduler fix from the v4.2 cycle"
      
      * 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        sched: Fix cpu_active_mask/cpu_online_mask race
      418c2e1f
    • L
      Merge branch 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · a1d85611
      Linus Torvalds committed
      Pull scheduler updates from Ingo Molnar:
       "The biggest change in this cycle is the rewrite of the main SMP load
        balancing metric: the CPU load/utilization.  The main goal was to make
        the metric more precise and more representative - see the changelog of
        this commit for the gory details:
      
          9d89c257 ("sched/fair: Rewrite runnable load and utilization average tracking")
      
        It is done in a way that significantly reduces complexity of the code:
      
          5 files changed, 249 insertions(+), 494 deletions(-)
      
        and the performance testing results are encouraging.  Nevertheless we
        need to keep an eye on potential regressions, since this potentially
        affects every SMP workload in existence.
      
        This work comes from Yuyang Du.
      
        Other changes:
      
         - SCHED_DL updates.  (Andrea Parri)
      
         - Simplify architecture callbacks by removing finish_arch_switch().
           (Peter Zijlstra et al)
      
         - cputime accounting: guarantee stime + utime == rtime.  (Peter
           Zijlstra)
      
         - optimize idle CPU wakeups some more - inspired by Facebook server
           loads.  (Mike Galbraith)
      
         - stop_machine fixes and updates.  (Oleg Nesterov)
      
         - Introduce the 'trace_sched_waking' tracepoint.  (Peter Zijlstra)
      
         - sched/numa tweaks.  (Srikar Dronamraju)
      
         - misc fixes and small cleanups"
      
      * 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (44 commits)
        sched/deadline: Fix comment in enqueue_task_dl()
        sched/deadline: Fix comment in push_dl_tasks()
        sched: Change the sched_class::set_cpus_allowed() calling context
        sched: Make sched_class::set_cpus_allowed() unconditional
        sched: Fix a race between __kthread_bind() and sched_setaffinity()
        sched: Ensure a task has a non-normalized vruntime when returning back to CFS
        sched/numa: Fix NUMA_DIRECT topology identification
        tile: Reorganize _switch_to()
        sched, sparc32: Update scheduler comments in copy_thread()
        sched: Remove finish_arch_switch()
        sched, tile: Remove finish_arch_switch
        sched, sh: Fold finish_arch_switch() into switch_to()
        sched, score: Remove finish_arch_switch()
        sched, avr32: Remove finish_arch_switch()
        sched, MIPS: Get rid of finish_arch_switch()
        sched, arm: Remove finish_arch_switch()
        sched/fair: Clean up load average references
        sched/fair: Provide runnable_load_avg back to cfs_rq
        sched/fair: Remove task and group entity load when they are dead
        sched/fair: Init cfs_rq's sched_entity load average
        ...
      a1d85611
    • L
      Merge branch 'ras-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 3959df1d
      Linus Torvalds committed
      Pull RAS updates from Ingo Molnar:
       "MCE handling updates, but also some generic drivers/edac/ changes to
        better organize the Kconfig space"
      
      * 'ras-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86/ras: Move AMD MCE injector to arch/x86/ras/
        x86/mce: Add a wrapper around mce_log() for injection
        x86/mce: Rename rcu_dereference_check_mce() to mce_log_get_idx_check()
        RAS: Add a menuconfig option with descriptive text
        x86/mce: Reenable CMCI banks when swiching back to interrupt mode
        x86/mce: Clear Local MCE opt-in before kexec
        x86/mce: Remove unused function declarations
        x86/mce: Kill drain_mcelog_buffer()
        x86/mce: Avoid potential deadlock due to printk() in MCE context
        x86/mce: Remove the MCE ring for Action Optional errors
        x86/mce: Don't use percpu workqueues
        x86/mce: Provide a lockless memory pool to save error records
        x86/mce: Reuse one of the u16 padding fields in 'struct mce'
      3959df1d
    • L
      Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 41d859a8
      Linus Torvalds committed
      Pull perf updates from Ingo Molnar:
       "Main perf kernel side changes:
      
         - uprobes updates/fixes.  (Oleg Nesterov)
      
         - Add PERF_RECORD_SWITCH to indicate context switches and use it in
           tooling.  (Adrian Hunter)
      
         - Support BPF programs attached to uprobes and first steps for BPF
           tooling support.  (Wang Nan)
      
         - x86 generic MSR-to-perf PMU driver.  (Andy Lutomirski)
      
         - x86 Intel PT, LBR and BTS updates.  (Alexander Shishkin)
      
         - x86 Intel Skylake support.  (Andi Kleen)
      
         - x86 Intel Knights Landing (KNL) RAPL support.  (Dasaratharaman
           Chandramouli)
      
         - x86 Intel Broadwell-DE uncore support.  (Kan Liang)
      
         - x86 hw breakpoints robustization (Andy Lutomirski)
      
        Main perf tooling side changes:
      
         - Support Intel PT in several tools, enabling the use of the
           processor trace feature introduced in Intel Broadwell processors:
           (Adrian Hunter)
      
             # dmesg | grep Performance
             # [0.188477] Performance Events: PEBS fmt2+, 16-deep LBR, Broadwell events, full-width counters, Intel PMU driver.
             # perf record -e intel_pt//u -a sleep 1
             [ perf record: Woken up 1 times to write data ]
             [ perf record: Captured and wrote 0.216 MB perf.data ]
             # perf script # then navigate in the tool output to some area, like this one:
             184 1030 dl_main (/usr/lib64/ld-2.17.so) => 7f21ba661440 dl_main (/usr/lib64/ld-2.17.so)
             185 1457 dl_main (/usr/lib64/ld-2.17.so) => 7f21ba669f10 _dl_new_object (/usr/lib64/ld-2.17.so)
             186 9f37 _dl_new_object (/usr/lib64/ld-2.17.so) => 7f21ba677b90 strlen (/usr/lib64/ld-2.17.so)
             187 7ba3 strlen (/usr/lib64/ld-2.17.so) => 7f21ba677c75 strlen (/usr/lib64/ld-2.17.so)
             188 7c78 strlen (/usr/lib64/ld-2.17.so) => 7f21ba669f3c _dl_new_object (/usr/lib64/ld-2.17.so)
             189 9f8a _dl_new_object (/usr/lib64/ld-2.17.so) => 7f21ba65fab0 calloc@plt (/usr/lib64/ld-2.17.so)
             190 fab0 calloc@plt (/usr/lib64/ld-2.17.so) => 7f21ba675e70 calloc (/usr/lib64/ld-2.17.so)
             191 5e87 calloc (/usr/lib64/ld-2.17.so) => 7f21ba65fa90 malloc@plt (/usr/lib64/ld-2.17.so)
             192 fa90 malloc@plt (/usr/lib64/ld-2.17.so) => 7f21ba675e60 malloc (/usr/lib64/ld-2.17.so)
             193 5e68 malloc (/usr/lib64/ld-2.17.so) => 7f21ba65fa80 __libc_memalign@plt (/usr/lib64/ld-2.17.so)
             194 fa80 __libc_memalign@plt (/usr/lib64/ld-2.17.so) => 7f21ba675d50 __libc_memalign (/usr/lib64/ld-2.17.so)
             195 5d63 __libc_memalign (/usr/lib64/ld-2.17.so) => 7f21ba675e20 __libc_memalign (/usr/lib64/ld-2.17.so)
             196 5e40 __libc_memalign (/usr/lib64/ld-2.17.so) => 7f21ba675d73 __libc_memalign (/usr/lib64/ld-2.17.so)
             197 5d97 __libc_memalign (/usr/lib64/ld-2.17.so) => 7f21ba675e18 __libc_memalign (/usr/lib64/ld-2.17.so)
             198 5e1e __libc_memalign (/usr/lib64/ld-2.17.so) => 7f21ba675df9 __libc_memalign (/usr/lib64/ld-2.17.so)
             199 5e10 __libc_memalign (/usr/lib64/ld-2.17.so) => 7f21ba669f8f _dl_new_object (/usr/lib64/ld-2.17.so)
             200 9fc2 _dl_new_object (/usr/lib64/ld-2.17.so) =>  7f21ba678e70 memcpy (/usr/lib64/ld-2.17.so)
             201 8e8c memcpy (/usr/lib64/ld-2.17.so) => 7f21ba678ea0 memcpy (/usr/lib64/ld-2.17.so)
      
         - Add support for using several Intel PT features (CYC, MTC packets),
           the relevant documentation was updated in:
               tools/perf/Documentation/intel-pt.txt
           briefly describing those packets, their purposes, how to configure
           them in the event config terms and relevant external documentation
           for further reading.  (Adrian Hunter)
      
         - Introduce support for probing at an absolute address, for user and
           kernel 'perf probe's, useful when one has the symbol maps on a
           developer machine but not on an embedded system.  (Wang Nan)
      
         - Add Intel BTS support, with a call-graph script to show it and PT
           in use in a GUI using 'perf script' python scripting with
           postgresql and Qt.  (Adrian Hunter)
      
         - Allow selecting the type of callchains per event, including
           disabling callchains in all but one entry in an event list, to save
           space, and also to ask for the callchains collected in one event to
           be used in other events.  (Kan Liang)
      
         - Beautify more syscall arguments in 'perf trace': (Arnaldo Carvalho
           de Melo)
             * A bunch more translate file/pathnames from pointers to strings.
             * Convert numbers to strings for the 'keyctl' syscall 'option'
               arg.
             * Add missing 'clockid' entries.
      
         - Introduce 'srcfile' sort key: (Andi Kleen)
      
             # perf record -F 10000 usleep 1
             # perf report --stdio --dsos '[kernel.vmlinux]' -s srcfile
             <SNIP>
             # Overhead  Source File
                26.49%  copy_page_64.S
                 5.49%  signal.c
                 0.51%  msr.h
             #
      
           It can be combined with other fields, for instance, experiment with
           '-s srcfile,symbol'.
      
           There are some oddities in some distros and with some specific
           DSOs, being investigated, so your mileage may vary.
      
         - Support per-event 'freq' term: (Namhyung Kim)
      
             $ perf record -e 'cpu/instructions,freq=1234/',cycles -c 1000 sleep 1
             $ perf evlist -F
             cpu/instructions,freq=1234/: sample_freq=1234
             cycles: sample_period=1000
             $
      
         - Deref sys_enter pointer args with contents from probe:vfs_getname,
           showing pathnames instead of pointers in many syscalls in 'perf
           trace'.  (Arnaldo Carvalho de Melo)
      
         - Stop collecting /proc/kallsyms in perf.data files, saving about
           4.5MB on a typical x86-64 system, use the symbol resolution
           routines used in all the other tools (report, top, etc) now that we
           can ask libtraceevent to use perf's symbol resolution code.
           (Arnaldo Carvalho de Melo)
      
         - Allow filtering out of perf's PID via 'perf record --exclude-perf'.
           (Wang Nan)
      
         - 'perf trace' now supports syscall groups, like strace, i.e:
      
             $ trace -e file touch file
      
           Will expand 'file' into multiple, file related, syscalls.  More
           work needed to add extra groups for other syscall groups, and also
           to complement what was added for the 'file' group, included as a
           proof of concept.  (Arnaldo Carvalho de Melo)
      
         - Add lock_pi stresser to 'perf bench futex', to test the kernel code
           related to FUTEX_(UN)LOCK_PI.  (Davidlohr Bueso)
      
         - Let user have timestamps with per-thread recording in 'perf record'
           (Adrian Hunter)
      
         - ... and tons of other changes, see the shortlog and the Git log for
           details"
      
      * 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (240 commits)
        perf evlist: Add backpointer for perf_env to evlist
        perf tools: Rename perf_session_env to perf_env
        perf tools: Do not change lib/api/fs/debugfs directly
        perf tools: Add tracing_path and remove unneeded functions
        perf buildid: Introduce sysfs/filename__sprintf_build_id
        perf evsel: Add a backpointer to the evlist a evsel is in
        perf trace: Add header with copyright and background info
        perf scripts python: Add new compaction-times script
        perf stat: Get correct cpu id for print_aggr
        tools lib traceeveent: Allow for negative numbers in print format
        perf script: Add --[no-]-demangle/--[no-]-demangle-kernel
        tracing/uprobes: Do not print '0x (null)' when offset is 0
        perf probe: Support probing at absolute address
        perf probe: Fix error reported when offset without function
        perf probe: Fix list result when address is zero
        perf probe: Fix list result when symbol can't be found
        tools build: Allow duplicate objects in the object list
        perf tools: Remove export.h from MANIFEST
        perf probe: Prevent segfault when reading probe point with absolute address
        perf tools: Update Intel PT documentation
        ...
      41d859a8
    • L
      Merge branch 'mm-kasan-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 46580009
      Linus Torvalds committed
      Pull x86/kasan changes from Ingo Molnar:
       "These are two KASAN changes that factor out (and generalize) x86
        specific KASAN code from x86 to mm"
      
      * 'mm-kasan-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86/kasan, mm: Introduce generic kasan_populate_zero_shadow()
        x86/kasan: Define KASAN_SHADOW_OFFSET per architecture
      46580009
    • L
      Merge branch 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · e10994ff
      Linus Torvalds committed
      Pull liblockdep fixes from Ingo Molnar:
       "Three liblockdep fixes left over from the v4.2 cycle"
      
      * 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        tools/liblockdep: Use the rbtree header provided by common tools headers
        tools/liblockdep: Correct macro for WARN
        tools: Restore export.h
      e10994ff
    • L
      Merge branch 'core-types-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 5757bd61
      Linus Torvalds committed
      Pull inlining tuning from Ingo Molnar:
       "A handful of inlining optimizations inspired by x86 work but
        applicable in general"
      
      * 'core-types-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        jiffies: Force inlining of {m,u}msecs_to_jiffies()
        x86/hweight: Force inlining of __arch_hweight{32,64}()
        linux/bitmap: Force inlining of bitmap weight functions
      5757bd61