1. 12 Mar, 2019 2 commits
  2. 11 Mar, 2019 13 commits
  3. 10 Mar, 2019 1 commit
    • Merge tag 'perf-core-for-mingo-5.1-20190307' of... · b339da48
      Committed by Ingo Molnar
      Merge tag 'perf-core-for-mingo-5.1-20190307' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/urgent
      
      Pull perf/core changes from Arnaldo Carvalho de Melo:
      
      perf bpf:
      
        Arnaldo Carvalho de Melo:
      
        - Automatically add BTF ELF markers to 'perf trace' BPF programs, so that
          tools such as 'bpftool map dump' can pretty print map keys and values.
      
      perf c2c:
      
        Jiri Olsa:
      
        - Fix report for empty NUMA node.
      
      perf diff:
      
        Jin Yao:
      
        - Support --time, --cpu, --pid and --tid filter options.
      
      perf probe:
      
        Arnaldo Carvalho de Melo:
      
        - Clarify error message about not finding kernel modules debuginfo.
      
      perf record:
      
        Jiri Olsa:
      
        - Fixup probing for max attr.precise_ip.
      
      perf trace:
      
        Arnaldo Carvalho de Melo:
      
        - Add missing %s lost in the 'msg_flags' recvmmsg arg when adding prefix suppression logic.
      
      perf annotate:
      
        Arnaldo Carvalho de Melo:
      
        - Calculate the max instruction name, align column to that, removing the
          hardcoded max 6 chars and cope with instructions with names longer than that,
          such as vpmovmskb, vpcmpeqb, etc.
      
      kernel:
      
        Song Liu:
      
        - Consider events with attr.bpf_event set as side-band.
      
        Gustavo A. R. Silva:
      
        - Mark expected switch fall-through in perf_event_parse_addr_filter().
      
      Libraries:
      
        Jiri Olsa:
      
        - Fix leaks and double frees on error paths.
      
      libtraceevent:
      
        Tony Jones:
      
        - Fix buffer overflow in arg_eval().
      
      python scripting:
      
        Tony Jones:
      
        - More python3 fixes.
      
      Trivial:
      
        Yang Wei:
      
        - Remove needless extra semicolon in clang C++ glue code.
      
      Intel PT/BTS:
      
        Adrian Hunter:
      
        - Improve auxtrace address filter error message when there is no DSO.
      
        - Fix divide by zero when TSC is not available.
      
        - Further improvements to the export to sqlite/postgresql python scripts
          and to the GUI sqlviewer, exporting 'parent_id' so that we can enable
          the creation of call trees.
      
        Andi Kleen:
      
        - Generalize function to copy from thread addr space from intel-bts code.
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  4. 09 Mar, 2019 3 commits
    • perf/core: Mark expected switch fall-through · 43aa378b
      Committed by Gustavo A. R. Silva
      In preparation to enabling -Wimplicit-fallthrough, mark switch cases
      where we are expecting to fall through.
      
      This patch fixes the following warning:
      
        kernel/events/core.c: In function ‘perf_event_parse_addr_filter’:
        kernel/events/core.c:9154:11: warning: this statement may fall through [-Wimplicit-fallthrough=]
            kernel = 1;
            ~~~~~~~^~~
        kernel/events/core.c:9156:3: note: here
           case IF_SRC_FILEADDR:
           ^~~~
      
      Warning level 3 was used: -Wimplicit-fallthrough=3
      
      This patch is part of the ongoing efforts to enable -Wimplicit-fallthrough.
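
The kind of annotation this patch adds can be sketched in a few lines of C. This is a minimal illustration, not the code from kernel/events/core.c: the enum, struct, and function names below are hypothetical. With -Wimplicit-fallthrough=3, GCC accepts a `/* fall through */` comment placed directly before the next case label as a marker that the fall-through is intended.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical token kinds, loosely modelled on an address-filter parser. */
enum filter_token { TOK_KERNEL_ADDR, TOK_FILE_ADDR, TOK_OTHER };

/* Hypothetical parser state. */
struct filter_state {
	int kernel;   /* set when the address refers to the kernel image */
	int has_addr; /* set when an address was parsed */
};

static int apply_token(enum filter_token tok, struct filter_state *st)
{
	assert(st != NULL);

	switch (tok) {
	case TOK_KERNEL_ADDR:
		st->kernel = 1;
		/* fall through */
	case TOK_FILE_ADDR:
		/* Shared handling for both address kinds. */
		st->has_addr = 1;
		return 0;
	default:
		return -1;
	}
}
```

Without the comment, building this with -Wimplicit-fallthrough=3 would be expected to produce a warning of the same form as the one quoted above, because the first case runs into the second without a break.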
      Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rik van Riel <riel@surriel.com>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Link: https://lkml.kernel.org/r/20190212205430.GA8446@embeddedor
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • perf/x86/intel/uncore: Fix client IMC events return huge result · 8041ffd3
      Committed by Kan Liang
      The client IMC bandwidth events currently return very large values:
      
        $ perf stat -e uncore_imc/data_reads/ -e uncore_imc/data_writes/ -I 10000 -a
      
        10.000117222 34,788.76 MiB uncore_imc/data_reads/
        10.000117222 8.26 MiB uncore_imc/data_writes/
        20.000374584 34,842.89 MiB uncore_imc/data_reads/
        20.000374584 10.45 MiB uncore_imc/data_writes/
        30.000633299 37,965.29 MiB uncore_imc/data_reads/
        30.000633299 323.62 MiB uncore_imc/data_writes/
        40.000891548 41,012.88 MiB uncore_imc/data_reads/
        40.000891548 6.98 MiB uncore_imc/data_writes/
        50.001142480 1,125,899,906,621,494.75 MiB uncore_imc/data_reads/
        50.001142480 6.97 MiB uncore_imc/data_writes/
      
      The client IMC events are freerunning counters. They still use the
      old event encoding format (0x1 for data_reads and 0x2 for data_writes).
      The counter bit width is calculated by common code, which assumes that
      the standard encoding format is used for the freerunning counters.
      As a result, the wrong bit width is calculated for these events.
      
      This patch converts the old client IMC event encoding to the
      standard encoding format.

      The common code currently uses event->attr.config, which is copied
      directly from user space. We should not implicitly modify it for a
      converted event, so the common code now uses event->hw.config in
      place of event->attr.config.
      
      For client IMC events, the custom event_init() uses
      event->attr.config to calculate a converted event in the standard
      encoding format and stores it in event->hw.config.
      Other freerunning counter events already use the standard encoding
      format, so the common event_init() assigns the value of
      event->attr.config to event->hw.config unchanged.
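
Why a wrong bit width yields such a large reading can be sketched as follows. This is a stand-alone illustration, not the uncore driver code: the function name is hypothetical, and the 32-bit counter width in the usage note is an assumption made for the example. The delta between two raw readings is computed by shifting both up to bit 63, so the unsigned subtraction wraps at the counter's real width rather than at 64 bits.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Delta between two raw readings of a free-running counter that is
 * `bits` wide (hypothetical helper for illustration).
 *
 * Shifting both readings left by (64 - bits) discards anything above
 * the counter width, so a wrapped counter still gives a small positive
 * delta after shifting back down.
 */
static uint64_t counter_delta(uint64_t prev, uint64_t now, unsigned int bits)
{
	unsigned int shift;

	assert(bits >= 1 && bits <= 64);
	shift = 64 - bits;

	return ((now << shift) - (prev << shift)) >> shift;
}
```

Suppose a counter assumed to be 32 bits wide wraps from 0xFFFFFF00 to 0x10 between two reads. `counter_delta(0xFFFFFF00, 0x10, 32)` gives 0x110, the real number of events. If the width is wrongly taken to be 64, the same readings give 0xFFFFFFFF00000110, which is close to 2^64. That is consistent with the outlier in the output above: about 2^50 MiB is about 2^64 counts if each count stands for 64 bytes (also an assumption here).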
      Reported-by: Jin Yao <yao.jin@linux.intel.com>
      Tested-by: Jin Yao <yao.jin@linux.intel.com>
      Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rik van Riel <riel@surriel.com>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Cc: stable@kernel.org # v4.18+
      Fixes: 9aae1780 ("perf/x86/intel/uncore: Clean up client IMC uncore")
      Link: https://lkml.kernel.org/r/20190227165729.1861-1-kan.liang@linux.intel.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • perf/ring_buffer: Use high order allocations for AUX buffers optimistically · 5768402f
      Committed by Alexander Shishkin
      Currently, the AUX buffer allocator will use high-order allocations
      for PMUs that don't support hardware scatter-gather chaining to ensure
      large contiguous blocks of pages, and always use an array of single
      pages otherwise.
      
      There is, however, a tangible performance benefit in using larger chunks
      of contiguous memory even in the latter case, that comes from not having
      to fetch the next page's address at every page boundary. In particular,
      a task running under Intel PT on an Atom CPU shows 1.5%-2% less runtime
      penalty with a single multi-page output region in snapshot mode (no PMI)
      than with multiple single-page output regions, from ~6% down to ~4%. For
      the snapshot mode it does make a difference as it is intended to run over
      long periods of time.
      
      For this reason, change the allocation policy to always optimistically
      start with the highest possible order when allocating pages for the AUX
      buffer, descending until the allocation succeeds or the order-zero allocation
      fails.
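
The allocation policy described above can be sketched in user-space C. This is a simplified illustration for a single chunk, not the ring buffer code: the function names, the callback type, and the 4096-byte page size are all assumptions made for the example, and `malloc()` stands in for the kernel page allocator.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define SKETCH_PAGE_SIZE 4096u /* assumed page size for the sketch */

/* Hypothetical allocator hook: returns 2^order contiguous pages or NULL. */
typedef void *(*chunk_alloc_fn)(unsigned int order);

/*
 * Try the highest order first and descend until an allocation succeeds
 * or the order-zero attempt fails. On success, *got_order holds the
 * order that was obtained; on failure it is set to -1.
 */
static void *alloc_chunk_descending(unsigned int max_order,
				    chunk_alloc_fn try_alloc, int *got_order)
{
	assert(try_alloc != NULL && got_order != NULL);

	for (int order = (int)max_order; order >= 0; order--) {
		void *chunk = try_alloc((unsigned int)order);

		if (chunk) {
			*got_order = order;
			return chunk;
		}
	}

	*got_order = -1;
	return NULL;
}

/* Demo allocator that pretends nothing above order 2 is available. */
static void *demo_alloc_max_order2(unsigned int order)
{
	if (order > 2)
		return NULL;
	return malloc((size_t)SKETCH_PAGE_SIZE << order);
}

/* Demo allocator that always fails, to exercise the order-zero exit. */
static void *demo_alloc_never(unsigned int order)
{
	(void)order;
	return NULL;
}
```

Calling `alloc_chunk_descending(5, demo_alloc_max_order2, &got)` tries orders 5, 4 and 3 without success and then returns an order-2 chunk. With `demo_alloc_never` the loop runs down to order zero and returns NULL.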
      Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rik van Riel <riel@surriel.com>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Link: https://lkml.kernel.org/r/20190215114727.62648-2-alexander.shishkin@linux.intel.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  5. 07 Mar, 2019 19 commits
  6. 06 Mar, 2019 2 commits
    • Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 203b6609
      Committed by Linus Torvalds
      Pull perf updates from Ingo Molnar:
       "Lots of tooling updates - too many to list, here's a few highlights:
      
         - Various subcommand updates to 'perf trace', 'perf report', 'perf
           record', 'perf annotate', 'perf script', 'perf test', etc.
      
         - CPU and NUMA topology and affinity handling improvements,
      
         - HW tracing and HW support updates:
            - Intel PT updates
            - ARM CoreSight updates
            - vendor HW event updates
      
         - BPF updates
      
         - Tons of infrastructure updates, both on the build system and the
           library support side
      
         - Documentation updates.
      
         - ... and lots of other changes, see the changelog for details.
      
        Kernel side updates:
      
         - Tighten up kprobes blacklist handling, reduce the number of places
           where developers can install a kprobe and hang/crash the system.
      
         - Fix/enhance vma address filter handling.
      
         - Various PMU driver updates, small fixes and additions.
      
         - refcount_t conversions
      
         - BPF updates
      
         - error code propagation enhancements
      
         - misc other changes"
      
      * 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (238 commits)
        perf script python: Add Python3 support to syscall-counts-by-pid.py
        perf script python: Add Python3 support to syscall-counts.py
        perf script python: Add Python3 support to stat-cpi.py
        perf script python: Add Python3 support to stackcollapse.py
        perf script python: Add Python3 support to sctop.py
        perf script python: Add Python3 support to powerpc-hcalls.py
        perf script python: Add Python3 support to net_dropmonitor.py
        perf script python: Add Python3 support to mem-phys-addr.py
        perf script python: Add Python3 support to failed-syscalls-by-pid.py
        perf script python: Add Python3 support to netdev-times.py
        perf tools: Add perf_exe() helper to find perf binary
        perf script: Handle missing fields with -F +..
        perf data: Add perf_data__open_dir_data function
        perf data: Add perf_data__(create_dir|close_dir) functions
        perf data: Fail check_backup in case of error
        perf data: Make check_backup work over directories
        perf tools: Add rm_rf_perf_data function
        perf tools: Add pattern name checking to rm_rf
        perf tools: Add depth checking to rm_rf
        perf data: Add global path holder
        ...
    • Merge branch 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 3478588b
      Committed by Linus Torvalds
      Pull locking updates from Ingo Molnar:
       "The biggest part of this tree is the new auto-generated atomics API
        wrappers by Mark Rutland.
      
        The primary motivation was to allow instrumentation without uglifying
        the primary source code.
      
        The linecount increase comes from adding the auto-generated files to
        the Git space as well:
      
          include/asm-generic/atomic-instrumented.h     | 1689 ++++++++++++++++--
          include/asm-generic/atomic-long.h             | 1174 ++++++++++---
          include/linux/atomic-fallback.h               | 2295 +++++++++++++++++++++++++
          include/linux/atomic.h                        | 1241 +------------
      
        I preferred this approach, so that the full call stack of the (already
        complex) locking APIs is still fully visible in 'git grep'.
      
        But if this is excessive we could certainly hide them.
      
        There's a separate build-time mechanism to determine whether the
        headers are out of date (they should never be stale if we do our job
        right).
      
        Anyway, nothing from this should be visible to regular kernel
        developers.
      
        Other changes:
      
         - Add support for dynamic keys, which removes a source of false
           positives in the workqueue code, among other things (Bart Van
           Assche)
      
         - Updates to tools/memory-model (Andrea Parri, Paul E. McKenney)
      
         - qspinlock, wake_q and lockdep micro-optimizations (Waiman Long)
      
         - misc other updates and enhancements"
      
      * 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (48 commits)
        locking/lockdep: Shrink struct lock_class_key
        locking/lockdep: Add module_param to enable consistency checks
        lockdep/lib/tests: Test dynamic key registration
        lockdep/lib/tests: Fix run_tests.sh
        kernel/workqueue: Use dynamic lockdep keys for workqueues
        locking/lockdep: Add support for dynamic keys
        locking/lockdep: Verify whether lock objects are small enough to be used as class keys
        locking/lockdep: Check data structure consistency
        locking/lockdep: Reuse lock chains that have been freed
        locking/lockdep: Fix a comment in add_chain_cache()
        locking/lockdep: Introduce lockdep_next_lockchain() and lock_chain_count()
        locking/lockdep: Reuse list entries that are no longer in use
        locking/lockdep: Free lock classes that are no longer in use
        locking/lockdep: Update two outdated comments
        locking/lockdep: Make it easy to detect whether or not inside a selftest
        locking/lockdep: Split lockdep_free_key_range() and lockdep_reset_lock()
        locking/lockdep: Initialize the locks_before and locks_after lists earlier
        locking/lockdep: Make zap_class() remove all matching lock order entries
        locking/lockdep: Reorder struct lock_class members
        locking/lockdep: Avoid that add_chain_cache() adds an invalid chain to the cache
        ...